freeipmi-devel
Re: [Freeipmi-devel] ipmi_monitoring_sensor_readings_by_record_id: inter


From: Albert Chu
Subject: Re: [Freeipmi-devel] ipmi_monitoring_sensor_readings_by_record_id: internal error
Date: Tue, 18 Jul 2017 17:24:39 -0700

Awesome.  If you could, perhaps update the Stack Overflow post with the
solution.

Al

On Tue, 2017-07-18 at 17:16 -0700, Sohan Chowdary Kollu wrote:
> I set the driver type to -1 to use the default. It worked like a charm.
> Thanks a lot.
> 
> On Tue, Jul 18, 2017 at 12:15 PM, Albert Chu <address@hidden> wrote:
>         There are clearly some communication problems with the
>         motherboard, leading to the "internal IPMI errors".  Many times
>         we send a request and don't even see a response.  In at least
>         one case, the response wasn't even a fully formed packet.
>         
>         But this made me realize what the problem might be.
>         
>         When you run IPMI commands (e.g. ipmi-sensors), are you using
>         one of the kernel device drivers (e.g. Linux defaults to
>         /dev/ipmi0) as your communication driver?
>         
>         The default ipmimonitoring-sensors example happens to use the
>         KCS driver, which is separate from and not related to the kernel
>         one.  It may be conflicting with the kernel device driver.
>         Effectively they are both doing communication to the BMC but not
>         sharing a lock.
>         
>         If you are using /dev/ipmi0 and you change the ipmimonitoring
>         example to use the IPMI_MONITORING_DRIVER_TYPE_OPENIPMI driver,
>         things will probably work out.
>         
>         Al
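
A minimal sketch of the selection logic suggested above, written as
standalone C so it does not depend on the FreeIPMI headers. The names
`driver_choice`, `is_char_device`, and `choose_driver` are hypothetical and
not part of FreeIPMI. Only /dev/ipmi0 is named in this thread; any other
device path passed in is an assumption about the system. In real code the
result would be mapped to IPMI_MONITORING_DRIVER_TYPE_OPENIPMI, or to -1 to
let the library use its default, before the driver type is handed to the
library.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical stand-in for the driver type passed to the library.
 * OPENIPMI would map to IPMI_MONITORING_DRIVER_TYPE_OPENIPMI and
 * LIBRARY_DEFAULT to -1 in the real example program. */
enum driver_choice {
  DRIVER_CHOICE_LIBRARY_DEFAULT,
  DRIVER_CHOICE_OPENIPMI
};

/* Returns 1 if path exists and is a character device, 0 otherwise. */
static int is_char_device(const char *path)
{
  struct stat st;

  assert(path != NULL);
  if (stat(path, &st) != 0)
    return 0;
  return S_ISCHR(st.st_mode) ? 1 : 0;
}

/* If any candidate kernel device node is present, prefer going through
 * the kernel driver so that only one party talks to the BMC; otherwise
 * fall back to whatever the library picks by default. */
static enum driver_choice choose_driver(const char *const *candidates,
                                        size_t count)
{
  size_t i;

  for (i = 0; i < count; i++)
    if (is_char_device(candidates[i]))
      return DRIVER_CHOICE_OPENIPMI;
  return DRIVER_CHOICE_LIBRARY_DEFAULT;
}
```

A caller would pass a list such as { "/dev/ipmi0" } and translate the
returned value into the driver type used by the example program. The check
only detects that a device node exists; it does not prove that the kernel
driver is actually the one talking to the BMC.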
>         
>         On Tue, 2017-07-18 at 11:43 -0700, Sohan Chowdary Kollu wrote:
>         > I am using version 1.5.5.
>         >
>         > Below are the packet details along with the errors. Except for
>         > the 3rd scenario, all the other errors are very frequent.
>         >
>         >
>         > 1)
>         >
>         > Failed right away (first sdr request in the trace)
>         >
>         > Get SDR Repository Info Request
>         > =====================================================
>         > KCS Header:
>         > ------------
>         > [               0h] = lun[ 2b]
>         > [               Ah] = net_fn[ 6b]
>         > IPMI Command Data:
>         > ------------------
>         > [              20h] = cmd[ 8b]
>         >
>         > (ipmi_monitoring_sdr_cache.c, ipmi_monitoring_sdr_cache_load, 314):
>         > ipmi_sdr_cache_open: internal IPMI error
>         >
>         > ipmi_monitoring_sensor_readings_by_record_id: internal error
>         >
>         >
>         > 2)
>         >
>         > a) Failed right away (first sdr request in the trace)
>         >
>         > =====================================================
>         > Get SDR Repository Info Request
>         > =====================================================
>         > KCS Header:
>         > ------------
>         > [               0h] = lun[ 2b]
>         > [               Ah] = net_fn[ 6b]
>         > IPMI Command Data:
>         > ------------------
>         > [              20h] = cmd[ 8b]
>         >
>         > (ipmi_monitoring_sdr_cache.c, _ipmi_monitoring_sdr_cache_retrieve, 223):
>         > ipmi_sdr_cache_create: internal IPMI error
>         >
>         > ipmi_monitoring_sensor_readings_by_record_id: internal error
>         >
>         >
>         > b) Failed after going through some sdr requests
>         >
>         > =====================================================
>         > Get SDR Request
>         > =====================================================
>         > KCS Header:
>         > ------------
>         > [               0h] = lun[ 2b]
>         > [               Ah] = net_fn[ 6b]
>         > IPMI Command Data:
>         > ------------------
>         > [              23h] = cmd[ 8b]
>         > [            8820h] = reservation_id[16b]
>         > [              82h] = record_id[16b]
>         > [              25h] = offset_into_record[ 8b]
>         > [              10h] = bytes_to_read[ 8b]
>         >
>         > (ipmi_monitoring_sdr_cache.c, _ipmi_monitoring_sdr_cache_retrieve, 223):
>         > ipmi_sdr_cache_create: internal IPMI error
>         >
>         > ipmi_monitoring_sensor_readings_by_record_id: internal error
>         >
>         >
>         > 3)
>         >
>         > Failed right away (first sdr request in the trace). Seen this
>         > only twice.
>         >
>         > =====================================================
>         > Get SDR Repository Info Request
>         > =====================================================
>         > KCS Header:
>         > ------------
>         > [               0h] = lun[ 2b]
>         > [               Ah] = net_fn[ 6b]
>         > IPMI Command Data:
>         > ------------------
>         > [              20h] = cmd[ 8b]
>         >
>         > (ipmi_monitoring_sdr_cache.c, ipmi_monitoring_sdr_cache_load, 336):
>         > ipmi_sdr_cache_open: internal IPMI error
>         >
>         > ipmi_monitoring_sensor_readings_by_record_id: internal error
>         >
>         >
>         > 4)
>         >
>         > a) Failed at Reading Request
>         >
>         > =====================================================
>         > Get Sensor Reading Request
>         > =====================================================
>         > KCS Header:
>         > ------------
>         > [               0h] = lun[ 2b]
>         > [               4h] = net_fn[ 6b]
>         > IPMI Command Data:
>         > ------------------
>         > [              2Dh] = cmd[ 8b]
>         > [              B0h] = sensor_number[ 8b]
>         >
>         > (ipmi_monitoring_sensor_reading.c, _get_sensor_reading, 356):
>         > ipmi_sensor_read: internal IPMI error
>         >
>         > (ipmi_monitoring.c, _ipmi_monitoring_sensor_readings_by_record_id, 1449):
>         > ipmi_sdr_cache_iterate: error returned in callback
>         >
>         > ipmi_monitoring_sensor_readings_by_record_id: internal error
>         >
>         >
>         > b) Failed at Reading Response
>         >
>         > =====================================================
>         > Get Sensor Reading Request
>         > =====================================================
>         > KCS Header:
>         > ------------
>         > [               0h] = lun[ 2b]
>         > [               4h] = net_fn[ 6b]
>         > IPMI Command Data:
>         > ------------------
>         > [              2Dh] = cmd[ 8b]
>         > [              90h] = sensor_number[ 8b]
>         >
>         > =====================================================
>         > Get Sensor Reading Response
>         > =====================================================
>         > KCS Header:
>         > ------------
>         > [               0h] = lun[ 2b]
>         > [               5h] = net_fn[ 6b]
>         > IPMI Command Data:
>         > ------------------
>         > [               0h] = cmd[ 8b]
>         >
>         > (ipmi_monitoring_sensor_reading.c, _get_sensor_reading, 356):
>         > ipmi_sensor_read: internal IPMI error
>         >
>         > (ipmi_monitoring.c, _ipmi_monitoring_sensor_readings_by_record_id, 1449):
>         > ipmi_sdr_cache_iterate: error returned in callback
>         >
>         > ipmi_monitoring_sensor_readings_by_record_id: internal error
>         >
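
For readers following the traces above, here is a small sketch that names
the (net_fn, cmd) pairs appearing in them and checks whether a response
pairs with its request. It relies on the IPMI convention that a response
uses the request's network function plus one and echoes the command byte.
The function names `ipmi_cmd_name`, `is_response`, and `response_matches`
are hypothetical helpers for illustration, not FreeIPMI API. Under that
convention, the response in trace 4b (net_fn 5h, cmd 0h) does not echo the
request's cmd 2Dh, which appears consistent with the remark elsewhere in
this thread about a response that was not a fully formed packet.

```c
#include <stdint.h>

/* Names only the (net_fn, cmd) pairs that show up in this thread.
 * 0Ah/0Bh is the Storage network function (request/response) and
 * 04h/05h is Sensor/Event. Anything else is reported as "unknown". */
static const char *ipmi_cmd_name(uint8_t net_fn, uint8_t cmd)
{
  uint8_t request_fn = (uint8_t)(net_fn & 0xFE); /* clear the response bit */

  if (request_fn == 0x0A) {
    if (cmd == 0x20)
      return "Get SDR Repository Info";
    if (cmd == 0x23)
      return "Get SDR";
  } else if (request_fn == 0x04) {
    if (cmd == 0x2D)
      return "Get Sensor Reading";
  }
  return "unknown";
}

/* Odd network function codes are responses, even ones are requests. */
static int is_response(uint8_t net_fn)
{
  return net_fn & 1;
}

/* A well-formed response uses request net_fn + 1 and echoes the cmd. */
static int response_matches(uint8_t req_fn, uint8_t req_cmd,
                            uint8_t rsp_fn, uint8_t rsp_cmd)
{
  return rsp_fn == (uint8_t)(req_fn + 1) && rsp_cmd == req_cmd;
}
```

Calling response_matches(0x04, 0x2D, 0x05, 0x00) with the values from trace
4b reports a mismatch, while a response carrying cmd 2Dh would pass. This is
only a reading aid for the dumps; it does not diagnose why the BMC returned
what it did.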
>         >
>         > Thanks
>         >
>         >
>         >
>         > On Mon, Jul 17, 2017 at 11:46 PM, Albert Chu
>         <address@hidden>
>         > wrote:
>         >         Hi,
>         >
>         >
>         >         What version of FreeIPMI are you using?  The line numbers
>         >         don't quite line up with the master branch.
>         >
>         >         Also, could you set IPMI_MONITORING_FLAGS_DEBUG_IPMI_PACKETS
>         >         and show the IPMI packet that occurs right before the error
>         >         line?
>         >
>         >
>         >         Thanks,
>         >
>         >
>         >
>         >         Al
>         >
>         >
>         >         On Mon, Jul 17, 2017 at 4:28 PM, Sohan Chowdary
>         Kollu
>         >         <address@hidden> wrote:
>         >                 Hi Albert,
>         >
>         >                 Thanks for the quick response. I set the flags for
>         >                 debugging and found it failing at one of the three
>         >                 places below in different runs.
>         >
>         >                 1) (ipmi_monitoring_sensor_reading.c,
>         >                 _get_sensor_reading, 356): ipmi_sensor_read:
>         >                 internal system error
>         >                 (ipmi_monitoring.c,
>         >                 _ipmi_monitoring_sensor_readings_by_record_id, 1449):
>         >                 ipmi_sdr_cache_iterate: error returned in callback
>         >                 ipmi_monitoring_sensor_readings_by_record_id:
>         >                 internal error
>         >
>         >                 2) (ipmi_monitoring_sdr_cache.c,
>         >                 ipmi_monitoring_sdr_cache_load, 314):
>         >                 ipmi_sdr_cache_open: internal IPMI error
>         >                 ipmi_monitoring_sensor_readings_by_record_id:
>         >                 internal error
>         >
>         >                 3) (ipmi_monitoring_sdr_cache.c,
>         >                 _ipmi_monitoring_sdr_cache_retrieve, 223):
>         >                 ipmi_sdr_cache_create: internal IPMI error
>         >                 ipmi_monitoring_sensor_readings_by_record_id:
>         >                 internal error
>         >
>         >
>         >
>         >                 Thanks
>         >
>         >
>         >
>         >                 On Mon, Jul 17, 2017 at 2:34 PM, Albert Chu
>         >                 <address@hidden> wrote:
>         >                         The "internal error" indicates some logical
>         >                         error that the library doesn't know how to
>         >                         handle.  Given it's coming from
>         >                         ipmi_monitoring_sensor_readings_by_record_id
>         >                         and it occurs when you run the program back
>         >                         to back, I would bet there is some internal
>         >                         IPMI issue on your system.  Perhaps it's a
>         >                         new error code or something like that, which
>         >                         I do not handle gracefully.
>         >
>         >                         To try and debug, could you set the flags
>         >                         "IPMI_MONITORING_FLAGS_DEBUG |
>         >                         IPMI_MONITORING_FLAGS_DEBUG_IPMI_PACKETS"
>         >                         when calling ipmi_monitoring_init() in the
>         >                         example code?  Hopefully that'll be enough
>         >                         to figure out the issue.
>         >
>         >                         Al
>         >
>         >                         On Mon, 2017-07-17 at 13:03 -0700,
>         Sohan
>         >                         Chowdary Kollu wrote:
>         >                         > Hi,
>         >                         >
>         >                         > I am executing the ipmimonitoring-sensors.c
>         >                         > example provided in the freeipmi library.
>         >                         > It sometimes throws an internal error. The
>         >                         > issue is reproducible when I execute the
>         >                         > program back to back a couple of times. I
>         >                         > need to wait approximately 30 seconds or
>         >                         > more after the last execution for the
>         >                         > program to run properly.
>         >                         >
>         >                         > This is the error:
>         >                         > ipmi_monitoring_sensor_readings_by_record_id:
>         >                         > internal error
>         >                         >
>         >                         > I ran some of the commands in a terminal
>         >                         > back to back, including ipmi-sensors with
>         >                         > the group option, ipmimonitoring, etc. None
>         >                         > of them threw any errors. The error occurs
>         >                         > only when I use the API.
>         >                         >
>         >                         > Has anyone faced this issue before? If yes,
>         >                         > can you tell me how to avoid it?
>         >                         >
>         >                         >
>         >                         >
>         >                         >
>         >                         > Thanks,
>         >                         > Sohan
>         >
>         >                         >
>         >
>          _______________________________________________
>         >                         > Freeipmi-devel mailing list
>         >                         > address@hidden
>         >                         >
>         >
>          https://lists.gnu.org/mailman/listinfo/freeipmi-devel
>         >
>         >                         --
>         >                         Albert Chu
>         >                         address@hidden
>         >                         Computer Scientist
>         >                         High Performance Systems Division
>         >                         Lawrence Livermore National
>         Laboratory
>         >
>         >
>         >
>         >
>         >
>         >
>         >                 --
>         >                 Thanks,
>         >                 Sohan
>         >
>         >
>         >
>         >
>         >
>         >
>         >
>         >
>         >
>         > --
>         > Thanks,
>         > Sohan
>         
>         --
>         Albert Chu
>         address@hidden
>         Computer Scientist
>         High Performance Systems Division
>         Lawrence Livermore National Laboratory
>         
>         
>         
> 
> 
> 
> 
> -- 
> Thanks,
> Sohan

-- 
Albert Chu
address@hidden
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory




