From: Albert Chu
Subject: Re: [Freeipmi-devel] ipmi_monitoring_sensor_readings_by_record_id: internal error
Date: Tue, 18 Jul 2017 12:15:47 -0700

There are clearly some communication problems with the motherboard,
leading to the "internal IPMI errors".  Many times we send a request and
never see a response.  In at least one earlier case, the response
wasn't even a fully formed packet.

But this made me realize what the problem might be.

When you run IPMI commands (e.g. ipmi-sensors), are you using one of the
kernel device drivers (e.g. Linux defaults to /dev/ipmi0) as your
communication driver?

The default ipmimonitoring-sensors example happens to use the KCS
driver, which is separate from and unrelated to the kernel one.  It may
be conflicting with the kernel device driver.  Effectively, they are both
communicating with the BMC without sharing a lock.

If you are using /dev/ipmi0, changing the ipmimonitoring example to use
the IPMI_MONITORING_DRIVER_TYPE_OPENIPMI driver will probably make
things work.
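
To check whether the kernel driver is in play on your machine, a minimal
C sketch along these lines might help.  It assumes the Linux default
device path /dev/ipmi0 mentioned above.  The helper names
kernel_ipmi_device_present and suggested_driver_type are hypothetical
and not part of FreeIPMI, and the returned strings are only the names of
the libipmimonitoring driver-type constants, not calls into the library.

```c
#include <sys/stat.h>

/* Returns 1 if `path` exists and is a character device, 0 otherwise.
 * A present /dev/ipmi0 suggests the kernel IPMI driver is loaded. */
int kernel_ipmi_device_present(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return 0;               /* missing or inaccessible */
    return S_ISCHR(st.st_mode) ? 1 : 0;
}

/* Hypothetical helper: name the driver type worth trying first.
 * If the kernel device exists, going through it avoids two
 * unsynchronized parties talking to the BMC at the same time. */
const char *suggested_driver_type(const char *path)
{
    return kernel_ipmi_device_present(path)
        ? "IPMI_MONITORING_DRIVER_TYPE_OPENIPMI"
        : "IPMI_MONITORING_DRIVER_TYPE_KCS";
}
```

Printing suggested_driver_type("/dev/ipmi0") on the affected machine
would indicate which driver type to try in the example's in-band
configuration.  It is only a heuristic: it does not tell you whether
another process is actually using the device at that moment.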

Al

On Tue, 2017-07-18 at 11:43 -0700, Sohan Chowdary Kollu wrote:
> I am using 1.5.5 version.
> 
> Below are the packet details along with errors. Except for the 3rd
> scenario all other errors are very frequent
> 
> 
> 1)
> 
> Failed right away (first sdr request in the trace)
> 
> 
>  Get SDR Repository Info Request
> 
> =====================================================
> 
> KCS Header:
> 
> ------------
> 
> [               0h] = lun[ 2b]
> 
> [               Ah] = net_fn[ 6b]
> 
> IPMI Command Data:
> 
> ------------------
> 
> [              20h] = cmd[ 8b]
> 
> (ipmi_monitoring_sdr_cache.c, ipmi_monitoring_sdr_cache_load, 314):
> ipmi_sdr_cache_open: internal IPMI error
> 
> ipmi_monitoring_sensor_readings_by_record_id: internal error
> 
> 
> 2)
> 
> a) Failed right away (first sdr request in the trace)
> 
>  =====================================================
> 
> Get SDR Repository Info Request
> 
> =====================================================
> 
> KCS Header:
> 
> ------------
> 
> [               0h] = lun[ 2b]
> 
> [               Ah] = net_fn[ 6b]
> 
> IPMI Command Data:
> 
> ------------------
> 
> [              20h] = cmd[ 8b]
> 
> (ipmi_monitoring_sdr_cache.c, _ipmi_monitoring_sdr_cache_retrieve,
> 223): ipmi_sdr_cache_create: internal IPMI error
> 
> ipmi_monitoring_sensor_readings_by_record_id: internal error
> 
> 
> b) Failed after going through some SDR requests
> 
> =====================================================
> 
> Get SDR Request
> 
> =====================================================
> 
> KCS Header:
> 
> ------------
> 
> [               0h] = lun[ 2b]
> 
> [               Ah] = net_fn[ 6b]
> 
> IPMI Command Data:
> 
> ------------------
> 
> [              23h] = cmd[ 8b]
> 
> [            8820h] = reservation_id[16b]
> 
> [              82h] = record_id[16b]
> 
> [              25h] = offset_into_record[ 8b]
> 
> [              10h] = bytes_to_read[ 8b]
> 
> (ipmi_monitoring_sdr_cache.c, _ipmi_monitoring_sdr_cache_retrieve,
> 223): ipmi_sdr_cache_create: internal IPMI error
> 
> ipmi_monitoring_sensor_readings_by_record_id: internal error
> 
> 
> 3)
> 
> Failed right away (first sdr request in the trace). Seen this only
> twice
> 
> 
> =====================================================
> 
> Get SDR Repository Info Request
> 
> =====================================================
> 
> KCS Header:
> 
> ------------
> 
> [               0h] = lun[ 2b]
> 
> [               Ah] = net_fn[ 6b]
> 
> IPMI Command Data:
> 
> ------------------
> 
> [              20h] = cmd[ 8b]
> 
> (ipmi_monitoring_sdr_cache.c, ipmi_monitoring_sdr_cache_load, 336):
> ipmi_sdr_cache_open: internal IPMI error
> 
> ipmi_monitoring_sensor_readings_by_record_id: internal error
> 
> 
> 4)
> 
> a) Failed at Reading Request
> 
> =====================================================
> 
> Get Sensor Reading Request
> 
> =====================================================
> 
> KCS Header:
> 
> ------------
> 
> [               0h] = lun[ 2b]
> 
> [               4h] = net_fn[ 6b]
> 
> IPMI Command Data:
> 
> ------------------
> 
> [              2Dh] = cmd[ 8b]
> 
> [              B0h] = sensor_number[ 8b]
> 
> (ipmi_monitoring_sensor_reading.c, _get_sensor_reading, 356):
> ipmi_sensor_read: internal IPMI error
> 
> (ipmi_monitoring.c, _ipmi_monitoring_sensor_readings_by_record_id,
> 1449): ipmi_sdr_cache_iterate: error returned in callback
> 
> ipmi_monitoring_sensor_readings_by_record_id: internal error
> 
> 
> b) Failed at Reading Response
> 
> =====================================================
> 
> Get Sensor Reading Request
> 
> =====================================================
> 
> KCS Header:
> 
> ------------
> 
> [               0h] = lun[ 2b]
> 
> [               4h] = net_fn[ 6b]
> 
> IPMI Command Data:
> 
> ------------------
> 
> [              2Dh] = cmd[ 8b]
> 
> [              90h] = sensor_number[ 8b]
> 
> =====================================================
> 
> Get Sensor Reading Response
> 
> =====================================================
> 
> KCS Header:
> 
> ------------
> 
> [               0h] = lun[ 2b]
> 
> [               5h] = net_fn[ 6b]
> 
> IPMI Command Data:
> 
> ------------------
> 
> [               0h] = cmd[ 8b]
> 
> (ipmi_monitoring_sensor_reading.c, _get_sensor_reading, 356):
> ipmi_sensor_read: internal IPMI error
> 
> (ipmi_monitoring.c, _ipmi_monitoring_sensor_readings_by_record_id,
> 1449): ipmi_sdr_cache_iterate: error returned in callback
> 
> ipmi_monitoring_sensor_readings_by_record_id: internal error
> 
> 
> Thanks
> 
> 
> 
> On Mon, Jul 17, 2017 at 11:46 PM, Albert Chu <address@hidden>
> wrote:
>         Hi,
>         
>         
>         What version of FreeIPMI are you using?  The line numbers
>         don't quite line up with the master branch.
>         
>         
>         Also, could you set IPMI_MONITORING_FLAGS_DEBUG_IPMI_PACKETS
>         and show the IPMI packet that occurs right before the error
>         line?
>         
>         
>         Thanks,
>         
>         
>         
>         Al
>         
>         
>         On Mon, Jul 17, 2017 at 4:28 PM, Sohan Chowdary Kollu
>         <address@hidden> wrote:
>                 Hi Albert,
>                 
>                 Thanks for the quick response. I have set the flags
>                 for debugging and found it failing at one of the three
>                 instances below in different runs.
>                 
>                 1) (ipmi_monitoring_sensor_reading.c,
>                 _get_sensor_reading, 356): ipmi_sensor_read: internal
>                 system error
>                 (ipmi_monitoring.c,
>                 _ipmi_monitoring_sensor_readings_by_record_id, 1449):
>                 ipmi_sdr_cache_iterate: error returned in callback
>                 ipmi_monitoring_sensor_readings_by_record_id: internal
>                 error
>                 
>                 2) (ipmi_monitoring_sdr_cache.c,
>                 ipmi_monitoring_sdr_cache_load, 314):
>                 ipmi_sdr_cache_open: internal IPMI error
>                 ipmi_monitoring_sensor_readings_by_record_id: internal
>                 error
>                 
>                 3) (ipmi_monitoring_sdr_cache.c,
>                 _ipmi_monitoring_sdr_cache_retrieve, 223):
>                 ipmi_sdr_cache_create: internal IPMI error
>                 ipmi_monitoring_sensor_readings_by_record_id: internal
>                 error
>                 
>                 
>                 
>                 Thanks
>                 
>                 
>                 
>                 On Mon, Jul 17, 2017 at 2:34 PM, Albert Chu
>                 <address@hidden> wrote:
>                         The "internal error" indicates some logical
>                         error that the library doesn't know how to
>                         handle.  Given that it's coming from
>                         ipmi_monitoring_sensor_readings_by_record_id
>                         and it occurs when you run the program back
>                         to back, I would bet there is some internal
>                         IPMI issue on your system.  Perhaps it's a
>                         new error code or something like that which
>                         I do not handle gracefully.
>                         
>                         To try and debug, could you set the flags
>                         "IPMI_MONITORING_FLAGS_DEBUG |
>                         IPMI_MONITORING_FLAGS_DEBUG_IPMI_PACKETS" when
>                         calling ipmi_monitoring_init() in the example
>                         code?  Hopefully that'll be enough to figure
>                         out the issue.
>                         
>                         Al
>                         
>                         On Mon, 2017-07-17 at 13:03 -0700, Sohan
>                         Chowdary Kollu wrote:
>                         > Hi,
>                         >
>                         > I am executing the ipmimonitoring-sensors.c
>                         > example provided in the freeipmi library. It
>                         > sometimes throws an internal error. The issue
>                         > is reproducible when I execute the program
>                         > back to back a couple of times. I need to
>                         > wait approximately 30 seconds or more after
>                         > the last execution for the program to run
>                         > properly.
>                         >
>                         > This is the error:
>                         > ipmi_monitoring_sensor_readings_by_record_id:
>                         > internal error
>                         >
>                         > I ran some of the commands in a terminal back
>                         > to back, including ipmi-sensors with the group
>                         > option, ipmimonitoring, etc. None of them
>                         > threw any errors. The error occurs only when
>                         > I use the API.
>                         >
>                         > Has anyone faced this issue before? If yes,
>                         > can you tell me how to avoid it?
>                         >
>                         >
>                         >
>                         >
>                         > Thanks,
>                         > Sohan
>                         
>                         >
>                         _______________________________________________
>                         > Freeipmi-devel mailing list
>                         > address@hidden
>                         >
>                         https://lists.gnu.org/mailman/listinfo/freeipmi-devel
>                         
>                         --
>                         Albert Chu
>                         address@hidden
>                         Computer Scientist
>                         High Performance Systems Division
>                         Lawrence Livermore National Laboratory
>                         
>                         
>                 
>                 
>                 
>                 
>                 -- 
>                 Thanks,
>                 Sohan
>                 
>                 
>         
>         
> 
> 
> 
> 
> -- 
> Thanks,
> Sohan

-- 
Albert Chu
address@hidden
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory