Re: [Freeipmi-users] Can't get freeipmi_interpret_sensor.conf to work


From: Albert Chu
Subject: Re: [Freeipmi-users] Can't get freeipmi_interpret_sensor.conf to work
Date: Wed, 21 Sep 2011 14:43:47 -0700

Hi Goran,

I added the X8SIE motherboard to this beta; can you give it a shot?

http://download.gluster.com/pub/freeipmi/qa-release/freeipmi-1.0.7.beta5.tar.gz

> Checked on all available hosts and it works OK, with the comments 
> below.  Should I see any difference with --output-sensor-state? It 
> looks the same to me. And I didn't get any errors.

Oh, you're using ipmimonitoring.  ipmimonitoring enables
--output-sensor-state by default, so in your case they should look the
same.
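
For comparing the State column across hosts, here is a minimal sketch (not part of FreeIPMI) that parses pipe-delimited output of the kind quoted later in this message and lists every sensor whose State is not Nominal. The function names are hypothetical, and the sample rows are copied from the X7SBi-LN4 5593:4404 output below with the wrapped Event column rejoined.

```python
# Sample rows taken from the ipmimonitoring output quoted in this thread.
SAMPLE = """\
ID | Name         | Type              | State    | Reading    | Units | Event
11 | FAN1         | Fan               | Nominal  | 10125.00   | RPM   | 'OK'
12 | FAN2         | Fan               | Nominal  | 0.00       | RPM   | 'OK'
18 | Power Supply | Power Supply      | Critical | N/A        | N/A   | 'Power Supply Failure detected'
"""


def parse_sensor_table(text):
    """Parse pipe-delimited sensor output into a list of dicts keyed by header."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = [col.strip() for col in lines[0].split("|")]
    rows = []
    for ln in lines[1:]:
        # Limit the split so a '|' inside the Event text would not add columns.
        cells = [c.strip() for c in ln.split("|", len(header) - 1)]
        if len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


def non_nominal(rows):
    """Return (ID, Name, State, Event) for each sensor whose State is not Nominal."""
    return [(r["ID"], r["Name"], r["State"], r["Event"])
            for r in rows if r["State"] != "Nominal"]


rows = parse_sensor_table(SAMPLE)
flagged = non_nominal(rows)
for item in flagged:
    print(" | ".join(item))
```

Note that the sketch assumes each sensor sits on a single line; output that a mail client has wrapped would need to be rejoined first.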

> I have attached screenshots from IPMIView. It seems like that
> application understands that some of the fans are not connected,
> as deselecting the 'Hide inactive item' only shows more fans. More
> interesting is that for the 10437:4 I see nothing of failing event 21
> and in the 5593:4404 there is no fail on event 18.

I wouldn't be surprised if your software ignores a number of sensors.
The Module/Board sensor is a particularly uncommon sensor, so I'd bet
they just ignore it.  In addition, it's a State Asserted vs. Deasserted
sensor, so it is possible the "State" interpretation is wrong.  From the
freeipmi_interpret_sensor.conf manpage:

"Most default interpretations can be determined quite easily and can
meet the needs of most users. For example, a reading of
"Performance_Met" is normally better than "Performance_Lags".  However,
some sensors can be ambiguous and depend completely on the manufacturer.
For example, "State_Asserted" vs. "State_Deasserted" are completely at
the interpretation of the vendor.  Users are advised to adjust the
interpretations below as needed for their machines."

So it's certainly possible that on your motherboard "Asserted" is OK.
It is odd that it would ignore a Power Supply sensor, though.
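
To illustrate the idea of a vendor-dependent interpretation, here is a rough sketch. It is not FreeIPMI's configuration syntax or code; the table names, the `interpret` function, and the reversed mapping for board 10437:4 are all hypothetical and only show how a per-board override could flip the meaning of "State Asserted".

```python
# Hypothetical default interpretation: treat "State Asserted" as a problem.
DEFAULT_INTERPRETATION = {
    "State Asserted": "Critical",
    "State Deasserted": "Nominal",
}

# Hypothetical per-board overrides keyed by "manufacturer_id:product_id".
# The board ID appears in this thread; the reversed mapping is an assumption
# used purely to illustrate a vendor for which "Asserted" is the healthy state.
BOARD_OVERRIDES = {
    "10437:4": {"State Asserted": "Nominal", "State Deasserted": "Critical"},
}


def interpret(event, board_id=None):
    """Map an event string to a state, applying any override for the board."""
    table = dict(DEFAULT_INTERPRETATION)
    table.update(BOARD_OVERRIDES.get(board_id, {}))
    return table.get(event, "Unknown")


print(interpret("State Asserted"))             # default interpretation
print(interpret("State Asserted", "10437:4"))  # with the illustrative override
```

Whether "Asserted" really means OK on a given board has to be confirmed against the vendor's documentation or the board's observed behavior.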

Al

On Wed, 2011-09-21 at 09:38 -0700, Goran Lowkrantz wrote:
> Hi Albert,
> 
> Checked on all available hosts and it works OK, with the comments below.
> Should I see any difference with --output-sensor-state? It looks the same
> to me. And I didn't get any errors.
> 
> I found another system lurking in our racks with the same type of CPU temp
> sensor but not yet identified. Output from dmidecode and bmc-info gave this:
> Manufacturer: Supermicro
> Product Name: X8SIE
> Manufacturer ID       : Super Micro Computer Inc. (47488)
> Product ID            : 1037
> 
> Also, I found a difference between the two X7SBi-LN4 boards. The Magnum
> version has a CPU Temp meter that is outputting a normal C reading, so that
> version does not need the OEM fix. Sorry for missing that last time.
> 
> Supermicro X7SBi-LN4 5593:4404
> # ipmimonitoring --interpret-oem-data --output-sensor-state --ignore-unrecognized-events
> ID | Name         | Type              | State    | Reading    | Units | Event
> 1  | CPU          | Temperature       | Nominal  | 36.00      | C     | 'OK'
> 2  | System       | Temperature       | Nominal  | 33.00      | C     | 'OK'
> 3  | CPU Core     | Voltage           | Nominal  | 1.26       | V     | 'OK'
> 4  | DIMM         | Voltage           | Nominal  | 1.82       | V     | 'OK'
> 5  | 3.3V         | Voltage           | Nominal  | 3.33       | V     | 'OK'
> 6  | 5V           | Voltage           | Nominal  | 4.82       | V     | 'OK'
> 7  | 5VSB         | Voltage           | Nominal  | 4.92       | V     | 'OK'
> 8  | 12V          | Voltage           | Nominal  | 11.71      | V     | 'OK'
> 9  | -12V         | Voltage           | Nominal  | -12.23     | V     | 'OK'
> 10 | Battery      | Voltage           | Nominal  | 3.26       | V     | 'OK'
> 11 | FAN1         | Fan               | Nominal  | 10125.00   | RPM   | 'OK'
> 12 | FAN2         | Fan               | Nominal  | 0.00       | RPM   | 'OK'
> 13 | FAN3         | Fan               | Nominal  | 9585.00    | RPM   | 'OK'
> 14 | FAN4         | Fan               | Nominal  | 0.00       | RPM   | 'OK'
> 15 | FAN5         | Fan               | Nominal  | 0.00       | RPM   | 'OK'
> 16 | FAN6/CPU     | Fan               | Nominal  | 0.00       | RPM   | 'OK'
> 17 | Intrusion    | Physical Security | Nominal  | N/A        | N/A   | 'OK'
> 18 | Power Supply | Power Supply      | Critical | N/A        | N/A   | 'Power Supply Failure detected'
> 
> Supermicro X7SBi-LN4 10437:4
> # ipmimonitoring --interpret-oem-data --output-sensor-state --ignore-unrecognized-events
> ID | Name            | Type                     | State    | Reading    | Units | Event
> 4  | CPU Temp        | OEM Reserved             | Nominal  | N/A        | N/A   | 'Low'
> 5  | Sys Temp        | Temperature              | Nominal  | 36.00      | C     | 'OK'
> 6  | CPU Vcore       | Voltage                  | Nominal  | 1.22       | V     | 'OK'
> 7  | DIMM Volt       | Voltage                  | Nominal  | 1.82       | V     | 'OK'
> 8  | 3.3V            | Voltage                  | Nominal  | 3.31       | V     | 'OK'
> 9  | 5V              | Voltage                  | Nominal  | 4.82       | V     | 'OK'
> 10 | 12V             | Voltage                  | Nominal  | 11.81      | V     | 'OK'
> 11 | -12V            | Voltage                  | Nominal  | -12.10     | V     | 'OK'
> 12 | 5VSB            | Voltage                  | Nominal  | 4.92       | V     | 'OK'
> 13 | VBAT            | Voltage                  | Nominal  | 3.25       | V     | 'OK'
> 14 | Fan1            | Fan                      | Nominal  | 10200.00   | RPM   | 'OK'
> 15 | Fan2            | Fan                      | Nominal  | 9500.00    | RPM   | 'OK'
> 16 | Fan3            | Fan                      | Critical | 0.00       | RPM   | 'At or Below (<=) Lower Non-Recoverable Threshold'
> 17 | Fan4            | Fan                      | Critical | 0.00       | RPM   | 'At or Below (<=) Lower Non-Recoverable Threshold'
> 18 | Fan5            | Fan                      | Critical | 0.00       | RPM   | 'At or Below (<=) Lower Non-Recoverable Threshold'
> 19 | Fan6/CPU        | Fan                      | Critical | 0.00       | RPM   | 'At or Below (<=) Lower Non-Recoverable Threshold'
> 20 | Power Supply    | Power Supply             | Nominal  | N/A        | N/A   | 'OK'
> 21 | CPU Internal Er | Module/Board             | Critical | N/A        | N/A   | 'State Asserted'
> 22 | System Overheat | Module/Board             | Nominal  | N/A        | N/A   | 'OK'
> 23 | Thermal Trip    | Module/Board             | Nominal  | N/A        | N/A   | 'OK'
> 
> 
> I have attached screenshots from IPMIView. It seems like that application
> understands that some of the fans are not connected, as deselecting the
> 'Hide inactive item' only shows more fans. More interesting is that for the
> 10437:4 I see nothing of failing event 21 and in the 5593:4404 there is no
> fail on event 18.
> 
> /glz
> 
> 
> --On September 20, 2011 16:40:19 -0700 Albert Chu <address@hidden> wrote:
> 
> > Hi Goran,
> >
> > No problem, I just went ahead and added the support under the Supermicro
> > "banner".  I've added them to this beta.  Could you LMK if it seems to
> > work with all the Supermicro boards you have?
> >
> > http://download.gluster.com/pub/freeipmi/qa-release/freeipmi-1.0.7.beta4.tar.gz
> >
> > If you could run w/ --output-sensor-state and see if the event
> > interpretations are working too, I'd appreciate it.
> >
> > Thanks,
> >
> > Al
> >
> > On Tue, 2011-09-20 at 10:34 -0700, Goran Lowkrantz wrote:
> >> Hi Albert,
> >>
> >> Yes, they all have the CPU sensor.
> >>
> >> Re the Magnum thing, I have no idea why they have a different manufacturer
> >> code. I looked at the two X8DTL and X7SBi-LN4 boards and could see no
> >> visual difference between them. They have different codes in a few places
> >> but I don't know if it's date codes or something else. But I am almost
> >> 100% sure we got the Magnum systems first, something like 8 to 12 weeks
> >> between the two pairs. But then we are in the Swedish north so I have no
> >> idea how long it takes for Super Micro stock to move this far,
> >> and thus the delivery times may be irrelevant.
> >>
> >> /glz
> >>
> >> --On Tuesday, September 20, 2011 10:04 AM -0700 Albert Chu
> >> <address@hidden> wrote:
> >>
> >> > Hey Goran,
> >> >
> >> > Thanks for the list.  I can go ahead and add these motherboards too.  I
> >> > assume they all have the same Supermicro OEM CPU Temp sensors?
> >> >
> >> > Do the Magnum Technologies Inc. motherboards have a name/product name
> >> > associated with them?  The way I document/organize the code, it would
> >> > be nicer to associate those motherboards w/ Magnum instead of Supermicro.
> >> >
> >> > Al
> >> >
> >> > On Tue, 2011-09-20 at 08:54 -0700, Goran Lowkrantz wrote:
> >> >> Hi Albert,
> >> >>
> >> >> Works perfectly.
> >> >>
> >> >> Here is a list of our other Supermicro boards:
> >> >>
> >> >> Supermicro X7DB8     10437:4
> >> >> Supermicro X8DTN     10437:4
> >> >> Supermicro X8DTL     47488:6
> >> >> Supermicro X9SCL/X9SCM       47488:1572
> >> >> Supermicro X7SBi-LN4 5593:4404
> >> >> Supermicro X7SBi-LN4 10437:4
> >> >> Supermicro X8DTL     5593:6
> >> >> Supermicro X8DTN+-F  47488:1551
> >> >>
> >> >>
> >> >> 10437: Peppercon AG
> >> >> 47488: Super Micro Computer Inc.
> >> >> 5593: Magnum Technologies Inc.
> >> >>
> >> >> /glz
> >> >>
> >> >>
> >> >> --On September 19, 2011 11:14:48 -0700 Albert Chu <address@hidden>
> >> >> wrote:
> >> >>
> >> >> > Hi Goran,
> >> >> >
> >> >> > Unfortunately that's not how freeipmi_interpret_sensor.conf works.
> >> >> > That conf file is for use with the --output-sensor-state option.
> >> >> > (In hindsight, I now see the overloading of the word 'interpret'
> >> >> > may be confusing).
> >> >> >
> >> >> > I went ahead and added your motherboard into this beta of FreeIPMI,
> >> >> > can you see if it works?  You'll still need to specify
> >> >> > --interpret-oem-data. You may also want to run --output-sensor-state
> >> >> > to see if that brings up an interpretation for those events too.
> >> >> >
> >> >> > http://download.gluster.com/pub/freeipmi/qa-release/freeipmi-1.0.7.beta3.tar.gz
> >> >> >
> >> >> > In addition, you may be interested in the
> >> >> > --ignore-unrecognized-events option to eliminate all those
> >> >> > unrecognized events.
> >> >> >
> >> >> > Thanks,
> >> >> > Al
> >> >> >
> >> >> > On Mon, 2011-09-19 at 02:44 -0700, Goran Lowkrantz wrote:
> >> >> >> I am trying to add the CPU Temp sensors output from our Supermicro
> >> >> >> servers to freeipmi_interpret_sensor.conf but I can't get it
> >> >> >> working. As we have quite a few, I would like to get them working
> >> >> >> so I started with this.
> >> >> >>
> >> >> >> The server I am testing on has a X8DTN+-F motherboard, below is the
> >> >> >> dmidecode output. We are running FreeBSD amd64 8.2-STABLE with
> >> >> >> FreeIPMI 1.0.6. All files are in the default location
> >> >> >> /usr/local/etc/freeipmi
> >> >> >>
> >> >> >> Handle 0x0001, DMI type 1, 27 bytes
> >> >> >> System Information
> >> >> >>         Manufacturer: Supermicro
> >> >> >>         Product Name: X8DTN+-F
> >> >> >>         Version: 1234567890
> >> >> >>         Serial Number: 1234567890
> >> >> >>         UUID: 25091011-3C00-C25E-8922-003048F7A292
> >> >> >>         Wake-up Type: Power Switch
> >> >> >>         SKU Number: 1234567890
> >> >> >>         Family: Server
> >> >> >>
> >> >> >> Using bmc-info gives the following output:
> >> >> >> # bmc-info
> >> >> >> Device ID             : 32
> >> >> >> Device Revision       : 1
> >> >> >> Device SDRs           : unsupported
> >> >> >> Firmware Revision     : 2.04
> >> >> >> Device Available      : yes (normal operation)
> >> >> >> IPMI Version          : 2.0
> >> >> >> Sensor Device         : supported
> >> >> >> SDR Repository Device : supported
> >> >> >> SEL Device            : supported
> >> >> >> FRU Inventory Device  : supported
> >> >> >> IPMB Event Receiver   : supported
> >> >> >> IPMB Event Generator  : supported
> >> >> >> Bridge                : unsupported
> >> >> >> Chassis Device        : supported
> >> >> >> Manufacturer ID       : Super Micro Computer Inc. (47488)
> >> >> >> Product ID            : 1551
> >> >> >>
> >> >> >> GUID : 00000000-0000-0000-0000-000000000000
> >> >> >>
> >> >> >> Channel Information
> >> >> >>
> >> >> >> Channel Number       : 0
> >> >> >> Medium Type          : IPMB (I2C)
> >> >> >> Protocol Type        : IPMB-1.0
> >> >> >> Active Session Count : 0
> >> >> >> Session Support      : session-less
> >> >> >> Vendor ID            : Intelligent Platform Management Interface forum (7154)
> >> >> >>
> >> >> >> Channel Number       : 1
> >> >> >> Medium Type          : 802.3 LAN
> >> >> >> Protocol Type        : IPMB-1.0
> >> >> >> Active Session Count : 0
> >> >> >> Session Support      : multi-session
> >> >> >> Vendor ID            : Intelligent Platform Management Interface forum (7154)
> >> >> >>
> >> >> >> Channel Number       : 3
> >> >> >> Medium Type          : Asynch. Serial/Modem (RS-232)
> >> >> >> Protocol Type        : IPMB-1.0
> >> >> >> Active Session Count : 0
> >> >> >> Session Support      : single-session
> >> >> >> Vendor ID            : Intelligent Platform Management Interface forum (7154)
> >> >> >>
> >> >> >> Channel Number       : 5
> >> >> >> Medium Type          : IPMB (I2C)
> >> >> >> Protocol Type        : IPMB-1.0
> >> >> >> Active Session Count : 0
> >> >> >> Session Support      : session-less
> >> >> >> Vendor ID            : Intelligent Platform Management Interface forum (7154)
> >> >> >>
> >> >> >> So I would expect the id to be 47488:1551
> >> >> >>
> >> >> >> Verbose output of ipmi-sensors for the two sensors:
> >> >> >> Record ID: 1277
> >> >> >> ID String: CPU1 Temp
> >> >> >> Sensor Type: OEM Reserved (C0h)
> >> >> >> Sensor Number: 82
> >> >> >> IPMB Slave Address: 10h
> >> >> >> Sensor Owner ID: 20h
> >> >> >> Sensor Owner LUN: 0h
> >> >> >> Channel Number: 0h
> >> >> >> Entity ID: system board (7)
> >> >> >> Entity Instance: 1
> >> >> >> Entity Instance Type: Physical Entity
> >> >> >> Event/Reading Type Code: 70h
> >> >> >> Sensor Event: 'OEM Event = 0000h'
> >> >> >>
> >> >> >> Record ID: 1344
> >> >> >> ID String: CPU2 Temp
> >> >> >> Sensor Type: OEM Reserved (C0h)
> >> >> >> Sensor Number: 81
> >> >> >> IPMB Slave Address: 10h
> >> >> >> Sensor Owner ID: 20h
> >> >> >> Sensor Owner LUN: 0h
> >> >> >> Channel Number: 0h
> >> >> >> Entity ID: system board (7)
> >> >> >> Entity Instance: 2
> >> >> >> Entity Instance Type: Physical Entity
> >> >> >> Event/Reading Type Code: 70h
> >> >> >> Sensor Event: 'OEM Event = 0000h'
> >> >> >>
> >> >> >> This shows the event code as 0x70 and the sensor type as 0xc0.
> >> >> >>
> >> >> >> Here are the lines I have added for the test, and these are the only
> >> >> >> lines not commented in the file:
> >> >> >>
> >> >> >> IPMI_OEM_Value 47488:1551 0x70 0xC0 0x0000 Nominal
> >> >> >> IPMI_OEM_Value 47488:1551 0x70 0xC0 0x0001 Warning
> >> >> >> IPMI_OEM_Value 47488:1551 0x70 0xC0 0x0002 Warning
> >> >> >> IPMI_OEM_Value 47488:1551 0x70 0xC0 0x0004 Critical
> >> >> >> IPMI_OEM_Value 47488:1551 0x70 0xC0 0x0007 Warning
> >> >> >>
> >> >> >> But I still get this:
> >> >> >> # ipmi-sensors --interpret-oem-data
> >> >> >> ID   | Name          | Type              | Reading    | Units | Event
> >> >> >> 4    | FAN 1         | Fan               | N/A        | RPM   | N/A
> >> >> >> 71   | FAN 2         | Fan               | 4096.00    | RPM   | 'OK'
> >> >> >> 138  | FAN 3         | Fan               | 4356.00    | RPM   | 'OK'
> >> >> >> 205  | FAN 4         | Fan               | 4356.00    | RPM   | 'OK'
> >> >> >> 272  | FAN 5         | Fan               | N/A        | RPM   | N/A
> >> >> >> 339  | FAN 6         | Fan               | N/A        | RPM   | N/A
> >> >> >> 406  | FAN 7         | Fan               | N/A        | RPM   | N/A
> >> >> >> 473  | FAN 8         | Fan               | N/A        | RPM   | N/A
> >> >> >> 540  | CPU1 Vcore    | Voltage           | 1.01       | V     | 'OK'
> >> >> >> 607  | CPU2 Vcore    | Voltage           | 1.10       | V     | 'OK'
> >> >> >> 674  | +1.5 V        | Voltage           | 1.52       | V     | 'OK'
> >> >> >> 741  | +5 V          | Voltage           | 5.09       | V     | 'OK'
> >> >> >> 808  | +5VSB         | Voltage           | 5.06       | V     | 'OK'
> >> >> >> 875  | +12 V         | Voltage           | 12.19      | V     | 'OK'
> >> >> >> 942  | CPU1 DIMM     | Voltage           | 1.54       | V     | 'OK'
> >> >> >> 1009 | CPU2 DIMM     | Voltage           | 1.54       | V     | 'OK'
> >> >> >> 1076 | +3.3VCC       | Voltage           | 3.26       | V     | 'OK'
> >> >> >> 1143 | +3.3VSB       | Voltage           | 3.22       | V     | 'OK'
> >> >> >> 1210 | VBAT          | Voltage           | 3.19       | V     | 'OK'
> >> >> >> 1277 | CPU1 Temp     | OEM Reserved      | N/A        | N/A   | 'OEM Event = 0000h'
> >> >> >> 1344 | CPU2 Temp     | OEM Reserved      | N/A        | N/A   | 'OEM Event = 0000h'
> >> >> >> 1411 | System Temp   | Temperature       | 21.00      | C     | 'OK'
> >> >> >> 1478 | P1-DIMM1A     | Temperature       | 33.00      | C     | 'OK'
> >> >> >> 1545 | P1-DIMM1B     | Temperature       | 32.00      | C     | 'OK'
> >> >> >> 1612 | P1-DIMM1C     | Temperature       | N/A        | C     | N/A
> >> >> >> 1679 | P1-DIMM2A     | Temperature       | 36.00      | C     | 'OK'
> >> >> >> 1746 | P1-DIMM2B     | Temperature       | 34.00      | C     | 'OK'
> >> >> >> 1813 | P1-DIMM2C     | Temperature       | N/A        | C     | N/A
> >> >> >> 1880 | P1-DIMM3A     | Temperature       | 36.00      | C     | 'OK'
> >> >> >> 1947 | P1-DIMM3B     | Temperature       | 36.00      | C     | 'OK'
> >> >> >> 2014 | P1-DIMM3C     | Temperature       | N/A        | C     | N/A
> >> >> >> 2081 | P2-DIMM1A     | Temperature       | 30.00      | C     | 'OK'
> >> >> >> 2148 | P2-DIMM1B     | Temperature       | 28.00      | C     | 'OK'
> >> >> >> 2215 | P2-DIMM1C     | Temperature       | N/A        | C     | N/A
> >> >> >> 2282 | P2-DIMM2A     | Temperature       | 27.00      | C     | 'OK'
> >> >> >> 2349 | P2-DIMM2B     | Temperature       | 27.00      | C     | 'OK'
> >> >> >> 2416 | P2-DIMM2C     | Temperature       | N/A        | C     | N/A
> >> >> >> 2483 | P2-DIMM3A     | Temperature       | 27.00      | C     | 'OK'
> >> >> >> 2550 | P2-DIMM3B     | Temperature       | 28.00      | C     | 'OK'
> >> >> >> 2617 | P2-DIMM3C     | Temperature       | N/A        | C     | N/A
> >> >> >> 2684 | Chassis Intru | Physical Security | N/A        | N/A   | 'OK'
> >> >> >> 2751 | PS Status     | Power Supply      | N/A        | N/A   | 'Presence detected' 'Unrecognized Event = 0100h' 'Unrecognized Event = 0200h' 'Unrecognized Event = 0400h' 'Unrecognized Event = 0800h' 'Unrecognized Event = 1000h' 'Unrecognized Event = 2000h' 'Unrecognized Event = 4000h'
> >> >> >>
> >> >> >> Is there any trick that I have just missed to get the config file
> >> >> >> active?
> >> >> >>
> >> >> >> /glz
> >> >> >>
> >> >> >>
> >> >> >> ................................................... the future isMobile
> >> >> >>
> >> >> >>   Goran Lowkrantz <address@hidden>
> >> >> >>   System Architect, isMobile AB
> >> >> >>   Sandviksgatan 81, PO Box 58, S-971 03 Luleå, Sweden
> >> >> >>   Mobile: +46(0)70-587 87 82
> >> >> >> http://www.ismobile.com
> >> >> >> ...............................................
> >> >> >>
-- 
Albert Chu
address@hidden
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory




