freeipmi-users

Re: [Freeipmi-users] ipmimonitoring: correctable memory error vs. ipmitool: Memory 0 error

From: Al Chu
Subject: Re: [Freeipmi-users] ipmimonitoring: correctable memory error vs. ipmitool: Memory 0 error
Date: Wed, 09 Oct 2013 08:50:00 -0700

I totally forgot to mention this yesterday.  There is a workaround called
"discretereading" in ipmi-sensors specifically for HP systems.  You can
specify it w/ the -W option.  I wrote a small blurb about it here:

http://www.gnu.org/software/freeipmi/freeipmi-faq.html#Why-is-the-output-from-FreeIPMI-different-than-another-software_003f

Short answer, b/c of the kookiness in HP systems, both FreeIPMI and
ipmitool (by default) output the wrong thing.  I cannot speak to whether
there is a workaround for this in ipmitool.
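
As a rough illustration (not something taken from the FAQ), the
ipmi-sensors invocation with the workaround could be assembled like this
in Python.  The connection options mirror the ipmimonitoring command
quoted further down in this thread; the helper name
build_ipmi_sensors_cmd is made up for the example.

```python
def build_ipmi_sensors_cmd(host, user, password, workarounds=("discretereading",)):
    """Return an argv list for ipmi-sensors (hypothetical helper)."""
    # -h/-u/-p/-D LAN_2_0 are the same connection options used with
    # ipmimonitoring elsewhere in this thread.
    cmd = ["ipmi-sensors", "-h", host, "-u", user, "-p", password, "-D", "LAN_2_0"]
    if workarounds:
        # Multiple workaround flags are passed as one comma-separated value.
        cmd += ["-W", ",".join(workarounds)]
    return cmd


print(" ".join(build_ipmi_sensors_cmd("host.ilo", "Administrator", "xxxxx")))
```

The list form can be handed to subprocess.run() as-is, which avoids any
shell quoting issues with the password.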

As for your original question, I'm not sure how to clear the
"Correctable Memory Error".  It appears to be a bug in HP's firmware.
You'll notice that it reports "Correctable Memory Error" and "Presence
Detected".  The event bit notifying the correctable memory error appears
to be stuck (the bit for Presence Detected is uhh "good").  I suppose
it's possible a hard reset (bmc-device --cold-reset or a hard button
push) could clear it.
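
To make the two event strings easier to handle in a monitoring script,
here is a minimal sketch that splits one ipmimonitoring row into fields.
It assumes the pipe-separated column layout seen in Dan's output quoted
below (record ID, name, type, state, reading, events); parse_sensor_row
is a hypothetical name, not part of FreeIPMI.

```python
import re


def parse_sensor_row(line):
    """Split one ipmimonitoring output row into a dict (assumed 6 columns)."""
    fields = [f.strip() for f in line.split("|")]
    if len(fields) != 6:
        raise ValueError("expected 6 pipe-separated columns, got %d" % len(fields))
    record_id, name, sensor_type, state, reading, events = fields
    return {
        "id": int(record_id),
        "name": name,
        "type": sensor_type,
        "state": state,
        "reading": reading,
        # Each asserted event is printed as a single-quoted string.
        "events": re.findall(r"'([^']*)'", events),
    }


row = parse_sensor_row(
    "45 | Memory | Memory | Warning | N/A | "
    "'Correctable memory error' 'Presence detected'"
)
print(row["state"], row["events"])
```

A check could then look at the individual events rather than only the
overall Warning state, which is what makes the stuck bit visible.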

Sorry, wish there was better news.

Al

On Wed, 2013-10-09 at 07:31 -0400, Dan Mann wrote:
> 
> Hi Al,
> 
> 
> 
> Output from ipmitool -H host.ilo -U Administrator -P xxxxx -I lanplus sdr:
> ...
> Memory           | 0 error           | nc
> 
> 
> 
> Output from ipmimonitoring -h host.ilo -u Administrator -p xxxxx -D LAN_2_0:
> ...
> 45 | Memory | Memory | Warning | N/A | 'Correctable memory error' 'Presence detected'
> 
> 
> 
> 
> Dan
> 
> 
> 
> 
> On Tue, Oct 8, 2013 at 9:50 PM, Al Chu <address@hidden> wrote:
>         Hi Dan,
>         
>         Could you show me the ipmitool command you ran and the output?
>         It's not clear to me what you are referring to with "0 error".
>         
>         Al
>         
>         On Tue, 2013-10-08 at 15:31 -0400, Dan Mann wrote:
>         > Hello!
>         >
>         > I have some systems that I monitor with FreeIPMI.  Some of those
>         > systems, specifically a number of HP systems, report "Correctable
>         > Memory Error" in the ipmi-sensors or ipmimonitoring output.  I
>         > understand that typically "Correctable Memory Errors" are not in
>         > and of themselves critical issues.  I've spoken to HP and looked
>         > in the iLO IML and we see no memory alerts mentioned.
>         >
>         > My concern is that our monitoring systems detect this error
>         > condition and report on it, but there are no actionable steps for
>         > me to take to correct these issues because:
>         >
>         > 1.  I don't know when the issue occurred
>         > 2.  I don't know how to clear the issue
>         > 3.  Since I don't know 1 or 2, I'm not sure if this is a new issue
>         > or an artifact from another event.
>         >
>         > During the course of troubleshooting I noticed that ipmitool -H
>         > reports the memory condition as "0 error".
>         >
>         >
>         > I tried with:
>         >
>         > ipmi-sensors - 1.3.2
>         > and
>         > ipmi-sensors - 0.7.16.beta1
>         >
>         > Both returned the correctable memory error.
>         >
>         >
>         > Is this possibly a bug?
>         >
>         >
>         > Dan
>         
>         > _______________________________________________
>         > Freeipmi-users mailing list
>         > address@hidden
>         > https://lists.gnu.org/mailman/listinfo/freeipmi-users
>         --
>         Albert Chu
>         address@hidden
>         Computer Scientist
>         High Performance Systems Division
>         Lawrence Livermore National Laboratory
>         
> 
> 
-- 
Albert Chu
address@hidden
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory