freeipmi-users

Re: [Freeipmi-users] Decoding ram errors on supermicro


From: Tom Hetmer
Subject: Re: [Freeipmi-users] Decoding ram errors on supermicro
Date: Tue, 04 Dec 2018 11:39:03 +0100

Sure. It seems there's a similar ticket already: 
https://github.com/chu11/freeipmi-mirror/issues/19
Yep, that's the code. ipmitool and a few others decode it too.


We have a *lot* of Supermicros, so I can help with testing if needed, but we 
don't get that many CRC errors :)
So I guess we'd have to wait until one pops up. I hope the 'ver 2' method 
from ipmiutil works fine.
We used ipmitool in our monitoring before; it was accurate but slow, which is 
why I rewrote it all to use freeipmi.
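
The 'ver 2' formulas quoted from ipmiutil further down in this thread can be sketched in Python as follows. This is only an illustration, not freeipmi or ipmiutil code: the function names `parse_oem_bytes` and `decode_ver2` are hypothetical, the regular expression assumes the SEL line format shown in the original report, and the `P<cpu>_DIMM<pair><dimm>` label layout is inferred from the single example in the comment (2A 80 = P1_DIMMB1).

```python
import re

# Assumed format of the OEM bytes in freeipmi's SEL output, based on the one
# line quoted in this thread:
#   "... OEM Event Data2 code = 3Ah ; OEM Event Data3 code = 81h"
_OEM_BYTES = re.compile(
    r"OEM Event Data2 code = ([0-9A-Fa-f]{2})h\s*;\s*"
    r"OEM Event Data3 code = ([0-9A-Fa-f]{2})h"
)


def parse_oem_bytes(sel_line):
    """Return (data2, data3) as integers, or None if no OEM bytes are found."""
    match = _OEM_BYTES.search(sel_line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


def decode_ver2(data2, data3):
    """Apply the three formulas from the quoted ipmiutil comment as written."""
    cpu = (data3 & 0x03) + 1
    pair = chr((data2 >> 4) + 0x40 + (data3 & 0x03) * 3)
    dimm = chr((data2 & 0x0F) + 0x27)
    # Label layout is an assumption modelled on "P1_DIMMB1".
    return "P{}_DIMM{}{}".format(cpu, pair, dimm)


if __name__ == "__main__":
    print(decode_ver2(0x2A, 0x80))
```

With the bytes from the comment (2A 80) this reproduces P1_DIMMB1. For the 3Ah/81h event from the original report, the same formulas give P2_DIMMF1, while the web interface showed DIMMG1(CPU2), so the mapping probably needs checking per board before it is relied on.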


Thanks!


Best,
Tom Hetmer


CDN77 Operations
address@hidden / +44 (0) 20 3514 2399 / www.cdn77.com

----- Original message -----
> From: "Albert Chu" <address@hidden>
> To: "Tom Hetmer" <address@hidden>, address@hidden
> Date: 12/03/18 21:06
> Subject: Re: [Freeipmi-users] Decoding ram errors on supermicro
> 
> Hi Tom,
> 
> Thanks for the pointer to ipmiutil's code.  I assume you found this
> comment:
> 
> ---
>       /* ver 2 method: 2A 80 = P1_DIMMB1 */
>           /* SuperMicro says:
>            *  pair: %c (data2 >> 4) + 0x40 + (data3 & 0x3) * 3, (='B')
>            *  dimm: %c (data2 & 0xf) + 0x27,
>            *  cpu:  %x (data3 & 0x03) + 1);
>            */
> ---
> 
> I can definitely add it to my todo list.
> 
> Would you mind writing up an issue on github here?
> 
> https://github.com/chu11/freeipmi-mirror
> 
> Al
> 
> On Mon, 2018-12-03 at 17:55 +0100, Tom Hetmer wrote:
> > Hi, 
> > 
> > it'd be good if freeipmi supported decoding the supermicro ECC
> > errors.
> > 
> > 
> > Manufacturer: Supermicro
> > Product Name: X10DRH LN4
> > e.g.
> > freeipmi
> > 1,Dec-01-2018,06:37:53,Sensor #0,Memory,Critical,Uncorrectable memory
> > error ; OEM Event Data2 code = 3Ah ; OEM Event Data3 code = 81h
> > 
> > 
> > web interface
> > 1 | 12/01/2018 | 06:37:53 | Memory | Uncorrectable ECC
> > (@DIMMG1(CPU2)) | Asserted
> > 
> > 
> > something like this worked for me (stolen from ipmiutil)
> > 
> > 
> > // $data3 is a hex string such as "81", so convert it before masking
> > $cpu = (hexdec($data3) & 0x03) + 1;
> > 
> > 
> > $NPAIRS = 26;
> > $rgpairs = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
> > 
> > 
> > $bdata = "0x".$data2.$data3;
> > $bdata = hexdec($bdata);
> > $pair = (($bdata & 0xF0) >> 4) - 1;
> > 
> > 
> > if ($pair < 0) $pair = 0;
> > if ($pair > $NPAIRS) $pair = $NPAIRS - 1;
> > 
> > 
> > $pair = $rgpairs[$pair - 1];
> > 
> > 
> > $dimm = $bdata & 0x0F;
> > 
> > 
> > $dimm may be incorrect: the original code subtracts 9, but that gave the
> > wrong result on this board, so I changed it to get the right one. We'll
> > see whether it keeps producing the right values.
> > 
> > Best,
> > Tom Hetmer
> > 
> > _______________________________________________
> > Freeipmi-users mailing list
> > address@hidden
> > https://lists.gnu.org/mailman/listinfo/freeipmi-users
> -- 
> Albert Chu
> address@hidden
> Computer Scientist
> High Performance Systems Division
> Lawrence Livermore National Laboratory


