Re: [Freeipmi-users] ipmimonitoring: correctable memory error vs. ipmitool: Memory 0 error
Wed, 09 Oct 2013 08:50:00 -0700
I totally forgot yesterday. There is a workaround called
"discretereading" in ipmi-sensors specifically for HP systems. You can
specify it w/ the -W option. I wrote a small blurb about it here.
Short answer, b/c of the kookiness in HP systems, both FreeIPMI and
ipmitool (by default) output the wrong thing. I cannot speak to whether
there is a workaround for this in ipmitool.
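
For anyone scripting this, here is a rough Python sketch of how the
workaround could be passed along with the connection options from the
ipmimonitoring call quoted below. The helper name
build_ipmi_sensors_cmd is made up for illustration, and the host, user,
and password values are just the placeholders from this thread. It only
builds the argument list; actually running it needs FreeIPMI installed
and a reachable BMC.

```python
import shlex


def build_ipmi_sensors_cmd(host, user, password, driver="LAN_2_0",
                           workarounds=("discretereading",)):
    """Return an argv list for ipmi-sensors (hypothetical helper)."""
    argv = ["ipmi-sensors", "-h", host, "-u", user, "-p", password,
            "-D", driver]
    if workarounds:
        # -W accepts one or more workaround names separated by commas.
        argv += ["-W", ",".join(workarounds)]
    return argv


# Placeholder values taken from the commands quoted in this thread.
cmd = build_ipmi_sensors_cmd("host.ilo", "Administrator", "xxxxx")
print(shlex.join(cmd))

# To run it for real (assumes FreeIPMI is installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids
shell quoting problems if the password contains special characters.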
As for your original question, I'm not sure how to clear the
"Correctable Memory Error". It appears to be a bug in HP's firmware.
You'll notice that it reports "Correctable Memory Error" and "Presence
Detected". The event bit notifying the correctable memory error appears
to be stuck (the bit for Presence Detected is uhh "good"). I suppose
it's possible a hard reset (bmc-device --hard-reset or a hard button
push) could clear it??
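
To illustrate what I mean by a stuck event bit, here is a small Python
sketch that decodes a discrete Memory sensor bitmask into event names.
The offsets are a subset of the sensor-specific offsets the IPMI spec
lists for the Memory sensor type; the bitmask values are made up for
illustration and were not read from your system.

```python
# Subset of the IPMI sensor-specific event offsets for the Memory
# sensor type (0Ch). Only the offsets relevant here are listed.
MEMORY_EVENTS = {
    0: "Correctable memory error",
    1: "Uncorrectable memory error",
    2: "Parity",
    6: "Presence detected",
}


def decode_memory_events(bitmask):
    """Return the names of all known events whose bit is set."""
    return [name for bit, name in sorted(MEMORY_EVENTS.items())
            if bitmask & (1 << bit)]


# Hypothetical values: bit 6 alone is the healthy case, while bits 0
# and 6 together match the two events reported in this thread.
print(decode_memory_events(0x40))
print(decode_memory_events(0x41))
```

If bit 0 never drops back to zero after the error condition goes away,
every poll will keep reporting the correctable memory error, which is
what seems to be happening here.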
Sorry, wish there was better news.
On Wed, 2013-10-09 at 07:31 -0400, Dan Mann wrote:
> Hi Al,
> Output from ipmitool -H host.ilo -U Administrator -P xxxxx -I lanplus
> sdr :
> Memory | 0 error | nc
> Output from ipmimonitoring -h host.ilo -u Administrator -p xxxxx -D
> LAN_2_0 :
> 45 | Memory | Memory | Warning | N/A | 'Correctable memory error'
> 'Presence detected'
> On Tue, Oct 8, 2013 at 9:50 PM, Al Chu <address@hidden> wrote:
> Hi Dan,
> Could you show me the ipmitool command you ran and the
> output? It's not
> clear to me what you are referring to with "0 error".
> On Tue, 2013-10-08 at 15:31 -0400, Dan Mann wrote:
> > Hello!
> > I have some systems that I monitor with freeIPMI. Some of
> those systems,
> > specifically a number of HP systems, report "Correctable
> Memory Error" in
> > the ipmi-sensors or ipmimonitoring output. I understand
> that typically
> > "Correctable Memory Errors" are not in and of themselves an
> > issue. I've spoken to HP and looked in the iLO IML and we
> see no memory
> > alerts mentioned.
> > My concern is that our monitoring systems detect this error
> condition and
> > report on it, but there are no actionable steps for me to
> take to correct
> > these issues because:
> > 1. I don't know when the issue occurred
> > 2. I don't know how to clear the issue
> > 3. Since I don't know 1 or 2 I'm not sure if this is a new
> issue or an
> > artifact from another event.
> > During the course of troubleshooting I noticed that
> ipmitool -H reports
> > the memory condition as "0 error".
> > I tried with:
> > ipmi-sensors - 1.3.2
> > and
> > ipmi-sensors - 0.7.16.beta1
> > Both returned the correctable memory error.
> > Is this possibly a bug?
> > Dan
> > _______________________________________________
> > Freeipmi-users mailing list
> > address@hidden
> > https://lists.gnu.org/mailman/listinfo/freeipmi-users
> Albert Chu
> Computer Scientist
> High Performance Systems Division
> Lawrence Livermore National Laboratory
High Performance Systems Division
Lawrence Livermore National Laboratory