From: Al Chu
Subject: Re: [Freeipmi-users] Re: FreeIPMI SDR caching problem.
Date: Fri, 12 Feb 2010 11:26:13 -0800
Thanks Peter. I'll release 0.8.4 a little bit later on. I don't think
this is critical because most people use IPMI 1.5 for the tools. The only
tool that requires IPMI 2.0 is ipmiconsole, and the bug did not exist
there.
Thanks,
Al
On Fri, 2010-02-12 at 11:17 -0800, Peter Bisroev wrote:
> Hi Al,
>
> Just tested out the beta. Worked perfectly on PowerEdge R300 (DRAC5), R610
> (iDRAC) and R710 (iDRAC).
>
> Thank you for a quick patch. I will let you know if I see any more
> problems.
>
> Regards,
> Peter
>
> On Fri, 12 Feb 2010 09:41:57 -0800, Al Chu <address@hidden> wrote:
> > Hey Peter,
> >
> > I've put up a beta here:
> >
> > http://*ftp.gluster.com/pub/freeipmi/qa-release/freeipmi-0.8.4.beta0.tar.gz
> >
> > PLMK if it works for you.
> >
> > Al
> >
> > On Fri, 2010-02-12 at 07:14 -0800, Peter Bisroev wrote:
> >> Hey Al,
> >>
> >> Thank you for a quick response and a patch. As soon as the beta is up
> >> I'll test it out and let you know.
> >>
> >> Thanks!
> >>
> >> -- peter
> >>
> >> On Thu, 11 Feb 2010 17:21:11 -0800, Al Chu <address@hidden> wrote:
> >> > Hey Peter,
> >> >
> >> > I think I figured it out, I introduced a sequence number wrap around
> >> > bug in 0.8.1 when I was "cleaning up" some code. Most motherboards
> >> > don't have anywhere near the number of sensors that the Dell ones do,
> >> > so they never reach that point. I'll put up a beta tar.gz tomorrow.
> >> >
> >> > Al
> >> >
> >> > On Thu, 2010-02-11 at 16:44 -0800, Al Chu wrote:
> >> >> Hi Peter,
> >> >>
> >> >> > First of all let me thank you for the great work that you and other
> >> >> > developers are doing, this is truly a very helpful tool.
> >> >>
> >> >> You're definitely welcome.
> >> >>
> >> >> > I was thinking of submitting this problem through the GNU FreeIPMI
> >> >> > project bugs page but did not see any activity there so decided to
> >> >> > mail you directly. I hope that that is OK.
> >> >>
> >> >> No problem, but in the future it's best to e-mail the
> >> >> address@hidden list (I'm CCing it now).
> >> >>
> >> >> > Now to the problem. When running ipmi-* tools that can generate SDR
> >> >> > cache such as ipmimonitoring and ipmi-sensors with driver type
> >> >> > (-D LAN_2_0), the SDR cache generation freezes after some record and
> >> >> > then gives the following error:
> >> >> > 'ipmi_sdr_cache_create: internal IPMI error'.
> >> >>
> >> >> That's definitely not good.
> >> >>
> >> >> > After the error I can rerun the command with driver type LAN, and
> >> >> > everything succeeds. From that point on I can use driver type LAN_2_0
> >> >> > again and everything will work fine. So it looks like only SDR cache
> >> >> > generation does not seem to work with the LAN_2_0 driver. Just for
> >> >> > testing I ran a similar command to read the SDR data using ipmitool
> >> >> > and using ipmitool's equivalent of LAN_2_0 and everything seemed to
> >> >> > work fine.
> >> >> >
> >> >> > I have observed this behavior on Dell PowerEdge R300, R610 and R710.
> >> >> > And at every run the error will be generated after the same SDR
> >> >> > record for each platform. Below are the errors for each one of those
> >> >> > server types:
> >> >>
> >> >> Luckily, I have a PowerEdge R610 and I've reproduced the bug. I need
> >> >> to figure this out, it's definitely odd. It does appear to be specific
> >> >> to the Dell motherboards. I'm not sure what the issue is.
> >> >>
> >> >> Thanks for bringing it to my attention. I'll look into it and get
> >> >> back to you.
> >> >>
> >> >> Al
> >> >>
> >> >> > R300 (DRAC5):
> >> >> > --------------------------------
> >> >> > Caching SDR record 61 of 80 (current record ID 61)
> >> >> > ipmi_sdr_cache_create: internal IPMI error
> >> >> > --------------------------------
> >> >> >
> >> >> > R610 (iDRAC):
> >> >> > --------------------------------
> >> >> > Caching SDR record 58 of 124 (current record ID 58)
> >> >> > ipmi_sdr_cache_create: internal IPMI error
> >> >> > --------------------------------
> >> >> >
> >> >> > R710 (iDRAC):
> >> >> > --------------------------------
> >> >> > Caching SDR record 59 of 115 (current record ID 59)
> >> >> > ipmi_sdr_cache_create: internal IPMI error
> >> >> > --------------------------------
> >> >> >
> >> >> > If you have the time and opportunity to look at this problem, is
> >> >> > there anything I can provide you with that will make this task
> >> >> > easier?
> >> >>
> >> >> On Thu, 2010-02-11 at 15:47 -0800, Peter Bisroev wrote:
> >> >> > Hello Albert,
> >> >> >
> >> >> > My name is Peter Bisroev and I am using the FreeIPMI package to
> >> >> > monitor a number of DELL servers from a scripted environment.
> >> >> >
> >> >> > First of all let me thank you for the great work that you and other
> >> >> > developers are doing, this is truly a very helpful tool. I was
> >> >> > thinking of submitting this problem through the GNU FreeIPMI project
> >> >> > bugs page but did not see any activity there so decided to mail you
> >> >> > directly. I hope that that is OK.
> >> >> >
> >> >> > Now to the problem. When running ipmi-* tools that can generate SDR
> >> >> > cache such as ipmimonitoring and ipmi-sensors with driver type
> >> >> > (-D LAN_2_0), the SDR cache generation freezes after some record and
> >> >> > then gives the following error:
> >> >> > 'ipmi_sdr_cache_create: internal IPMI error'.
> >> >> >
> >> >> > After the error I can rerun the command with driver type LAN, and
> >> >> > everything succeeds. From that point on I can use driver type LAN_2_0
> >> >> > again and everything will work fine. So it looks like only SDR cache
> >> >> > generation does not seem to work with the LAN_2_0 driver. Just for
> >> >> > testing I ran a similar command to read the SDR data using ipmitool
> >> >> > and using ipmitool's equivalent of LAN_2_0 and everything seemed to
> >> >> > work fine.
> >> >> >
> >> >> > I have observed this behavior on Dell PowerEdge R300, R610 and R710.
> >> >> > And at every run the error will be generated after the same SDR
> >> >> > record for each platform. Below are the errors for each one of those
> >> >> > server types:
> >> >> >
> >> >> > R300 (DRAC5):
> >> >> > --------------------------------
> >> >> > Caching SDR record 61 of 80 (current record ID 61)
> >> >> > ipmi_sdr_cache_create: internal IPMI error
> >> >> > --------------------------------
> >> >> >
> >> >> > R610 (iDRAC):
> >> >> > --------------------------------
> >> >> > Caching SDR record 58 of 124 (current record ID 58)
> >> >> > ipmi_sdr_cache_create: internal IPMI error
> >> >> > --------------------------------
> >> >> >
> >> >> > R710 (iDRAC):
> >> >> > --------------------------------
> >> >> > Caching SDR record 59 of 115 (current record ID 59)
> >> >> > ipmi_sdr_cache_create: internal IPMI error
> >> >> > --------------------------------
> >> >> >
> >> >> > If you have the time and opportunity to look at this problem, is
> >> >> > there anything I can provide you with that will make this task
> >> >> > easier?
> >> >> >
> >> >> > PS: I would be more than happy to submit a patch for the problem but
> >> >> > my knowledge of the IPMI protocol is almost nonexistent so that is
> >> >> > not an option for the time being.
> >> >> >
> >> >> > Thank you.
> >> >> >
> >> >> > Best Regards,
> >> >> > Peter
--
Albert Chu
address@hidden
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
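
The kind of bug Al describes in this thread, a sequence number wrap-around that only shows up once enough requests have been sent, can be illustrated with a small Python sketch. This is not FreeIPMI's code. The thread does not say which sequence number was involved or how wide it is; the 6-bit width below is an assumption borrowed from the requester sequence field of an IPMI LAN message, and all names are hypothetical.

```python
SEQ_BITS = 6                    # assumption: 6-bit field, like IPMI's rqSeq
SEQ_MAX = (1 << SEQ_BITS) - 1   # largest value that fits in the field (63)


def next_seq_buggy(seq):
    """Increment without wrapping: eventually exceeds the field width."""
    return seq + 1


def next_seq_wrapped(seq):
    """Increment modulo the field width, so SEQ_MAX is followed by 0."""
    return (seq + 1) & SEQ_MAX


def first_overflow(next_seq, requests):
    """Return the 1-based index of the first request whose sequence number
    no longer fits in the field, or None if every request fits."""
    seq = 0
    for i in range(1, requests + 1):
        if seq > SEQ_MAX:
            return i
        seq = next_seq(seq)
    return None


if __name__ == "__main__":
    # A short exchange never reaches the wrap point, a long one does.
    print(first_overflow(next_seq_buggy, 50))
    print(first_overflow(next_seq_buggy, 200))
    print(first_overflow(next_seq_wrapped, 200))
```

Under these assumptions, a short exchange of 50 requests never overflows, while 200 requests with the non-wrapping increment fail at request 65. That is in the spirit of Al's remark that boards with fewer sensors never reach the wrap point.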