Re: [Freeipmi-devel] can't read tempeature info
From: Albert Chu
Subject: Re: [Freeipmi-devel] can't read tempeature info
Date: Wed, 24 Aug 2011 11:45:45 -0700

Hi Franco,

A few additional comments/thoughts below.

On Wed, 2011-08-24 at 10:42 -0700, Albert Chu wrote:
> Hi Franco,
>
> Well, the answer to your problem is surprisingly simple. Your
> motherboard is reporting that the sensors are disabled, thus
> ipmi-sensors doesn't output anything as a result. One example:
>
> pc-xyz: IPMI Command Data:
> pc-xyz: ------------------
> pc-xyz: [ 2Dh] = cmd[ 8b]
> pc-xyz: [ 0h] = comp_code[ 8b]
> pc-xyz: [ 42h] = sensor_reading[ 8b]
> pc-xyz: [ 0h] = reserved1[ 5b]
> pc-xyz: [ 0h] = reading_state[ 1b]
> pc-xyz: [ 0h] = sensor_scanning[ 1b]
> pc-xyz: [ 0h] = all_event_messages[ 1b]
> pc-xyz: [ C0h] = sensor_event_bitmask1[ 8b]
> pc-xyz: IPMI Trailer:
> pc-xyz: --------------
> pc-xyz: [ 49h] = checksum2[ 8b]
> Sensor reading/event bitmask not available: sensor scanning disabled
> 2 | Temp | Temperature | N/A | C | N/A
>
> So the question is why is your ipmitool working (or at least appears to
> be working)? Because the ipmitool I see from sourceforge checks for
> this bit and will not output if it is detected.
>
> } else if (!(rsp->data[1] & SCANNING_DISABLED)) {
> validread = 0;
>
> Is it possible your ipmitool is a version supplied by Dell, and Dell has
> hacked it to not check for this to get around this hardware issue?
>
> On the Dell PowerEdge node I have access to, via freeipmi I get:
>
> # ipmi-sensors
> ID | Name | Type | Reading | Units | Event
> 1 | Temp | Temperature | N/A | C | N/A
> 2 | Temp | Temperature | N/A | C | N/A
> 3 | Temp | Temperature | N/A | C | N/A
> 4 | Temp | Temperature | N/A | C | N/A
> 5 | Ambient Temp | Temperature | 17.00 | C | 'OK'
> <snip>
>
> and via ipmitool
>
> # ipmitool -I free sensor list
> Temp | na | degrees C | na | na | na | na | 85.000 | 90.000 | na
> Temp | na | degrees C | na | na | na | na | 85.000 | 90.000 | na
> Temp | na | degrees C | na | na | na | na | na | na | na
> Temp | na | degrees C | na | na | na | na | na | na | na
> Ambient Temp | 17.000 | degrees C | ok | na | 3.000 | 8.000 | 42.000 | 47.000 | na
> <snip>
>
> So the exact same output on my system. Let's hack them both to NOT check
> for the sensor-scanning disabled bit.
>
> (turning on --entity-sensor-names)
>
> ID | Name | Type | Reading | Units | Event
> 1 | Processor 1 Temp | Temperature | -69.00 | C | 'OK'
> 2 | Processor 2 Temp | Temperature | -65.00 | C | 'OK'
> 3 | Power Supply 1 Temp | Temperature | 40.00 | C | 'OK'
> 4 | Power Supply 2 Temp | Temperature | 40.00 | C | 'OK'
> 5 | System Board Ambient Temp | Temperature | 17.00 | C | 'OK'
>
> Temp | -69.000 | degrees C | ok | na | na | na | 85.000 | 90.000 | na
> Temp | -65.000 | degrees C | ok | na | na | na | 85.000 | 90.000 | na
> Temp | 40.000 | degrees C | ok | na | na | na | na | na | na
> Temp | 40.000 | degrees C | ok | na | na | na | na | na | na
> Ambient Temp | 17.000 | degrees C | ok | na | 3.000 | 8.000 | 42.000 | 47.000 | na
>
> Aha! It seems that your version of ipmitool has been altered to act
> differently than the normal ipmitool.
>
> I can easily put in a workaround flag (e.g. -W ignorescanningbit or
> something) to deal with this for Dell motherboards. However, before
> doing that there is an additional question to be answered. Are the
> temperatures of -69, -65, 40 & 40 correct or incorrect? At a
> minimum, the temperatures of -69 & -65 seem highly implausible for
> processor temperatures (as Andy said earlier, it's possible they are
> margin sensors, but nothing indicates that).
>
> I am wondering, does Dell supply any software that you might be able to
> try out to see what "Dell approved" readings are? That way we know what
> should be done.

I just noticed something. On the right side of the ipmitool output,
you'll notice 85.0C and 90.0C are the upper non-critical/critical
thresholds. That suggests to me that these sensors are in fact not
margin sensors and shouldn't be negative.
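
For anyone following along, here is a rough Python sketch of how I am
reading those threshold columns. The column order (lnr, lcr, lnc, unc,
ucr, unr) is my understanding of ipmitool's 'sensor list' layout, and
the function and field names are made up for illustration. The sample
line is the first Temp line from the output above with the mail wrap
undone.

```python
# Assumed column order of one 'ipmitool sensor list' line.
COLUMNS = ("name", "reading", "units", "status",
           "lnr", "lcr", "lnc", "unc", "ucr", "unr")


def parse_sensor_line(line):
    """Split one unwrapped 'sensor list' line into a dict keyed by COLUMNS."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != len(COLUMNS):
        raise ValueError("expected %d fields, got %d" % (len(COLUMNS), len(fields)))
    return dict(zip(COLUMNS, fields))


def upper_thresholds(row):
    """Return only the upper thresholds that are actually set, as floats."""
    return {key: float(row[key])
            for key in ("unc", "ucr", "unr")
            if row[key] != "na"}


line = "Temp | na | degrees C | na | na | na | na | 85.000 | 90.000 | na"
print(upper_thresholds(parse_sensor_line(line)))
```

Under that reading, 85 is the upper non-critical limit and 90 the upper
critical limit, with no non-recoverable limit set.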

I messed around with some of the sensor data repository (SDR) records,
and noticed that the 'B' component of the record was -128 (if you run
ipmi-sensors -vv -s 1, you'll see what I'm talking about). -128 is
quite odd (not seen on other motherboards I had), so I did a hack to
change it to 0. And I got:

ID | Name | Type | Reading | Units | Event
1 | Processor 1 Temp | Temperature | 59.00 | C | 'OK'
2 | Processor 2 Temp | Temperature | 64.00 | C | 'OK'
3 | Power Supply 1 Temp | Temperature | 168.00 | C | 'OK'
4 | Power Supply 2 Temp | Temperature | 168.00 | C | 'OK'
5 | System Board Ambient Temp | Temperature | 146.00 | C | 'OK'

So the first 2 temperatures look far more reasonable, but naturally the
adjustment also shifted the other 3 up to implausible values.
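
To show why changing B moves every reading by the same amount, here is
a small Python sketch of the linear sensor reading conversion from the
IPMI spec, y = (M*x + B*10^K1) * 10^K2. I am assuming M = 1 and both
exponents are 0 for these sensors, which is a guess from the 128 degree
shift between the two runs rather than something confirmed from the SDR.
The raw values 59 and 168 are inferred the same way, and the function
name is made up.

```python
def convert_reading(raw, m, b, b_exp=0, r_exp=0):
    """Linear IPMI conversion: (M * raw + B * 10^K1) * 10^K2.

    b_exp is K1 (the B exponent) and r_exp is K2 (the result exponent).
    Only the linear case is handled; non-linear sensors need an extra
    linearization step that is not shown here.
    """
    return (m * raw + b * 10.0 ** b_exp) * 10.0 ** r_exp


# Assumed raw values, inferred from the readings in the two runs.
for raw in (59, 168):
    with_sdr_b = convert_reading(raw, m=1, b=-128)  # B as found in the SDR
    with_zero_b = convert_reading(raw, m=1, b=0)    # B hacked to 0
    print(raw, with_sdr_b, with_zero_b)
```

With these assumptions a raw 59 comes out as -69 with B = -128 and as 59
with B = 0, while a raw 168 goes from 40 to 168, which matches the
pattern in the two tables.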

I googled around a little bit and it seems many have noticed the
negative temperatures. Overall, the feeling seems to be that these
sensors are not correct. Given that these sensors are marked
"disabled", I am inclined to believe that these temperature sensors are
in fact invalid and therefore should not be output. That or Dell has a
mistake in their firmware that has gone unnoticed for some time.
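
For reference, the "disabled" marking comes from the byte that follows
the sensor reading in the Get Sensor Reading response. Below is a rough
Python sketch of the decoding, with bit positions as I read them from
the IPMI spec (bit 7 event messages, bit 6 sensor scanning, bit 5
reading/state unavailable). The constant and function names are made up
for illustration, and this is not FreeIPMI's actual code.

```python
EVENT_MESSAGES_ENABLED = 0x80   # bit 7: 0 = all event messages disabled
SENSOR_SCANNING_ENABLED = 0x40  # bit 6: 0 = sensor scanning disabled
READING_UNAVAILABLE = 0x20      # bit 5: 1 = reading/state unavailable


def decode_status(status):
    """Decode the status byte that follows the raw sensor reading."""
    return {
        "event_messages_enabled": bool(status & EVENT_MESSAGES_ENABLED),
        "scanning_enabled": bool(status & SENSOR_SCANNING_ENABLED),
        "reading_unavailable": bool(status & READING_UNAVAILABLE),
    }


def reading_is_usable(status):
    """Treat a reading as usable only if scanning is on and data is available."""
    flags = decode_status(status)
    return flags["scanning_enabled"] and not flags["reading_unavailable"]


# In the debug dump all three fields are 0, so the status byte is 0x00.
print(reading_is_usable(0x00))
# A sensor with event messages and scanning both enabled.
print(reading_is_usable(0xC0))
```

A workaround flag would amount to skipping the scanning check in
something like reading_is_usable(), which is why I want to know first
whether the resulting readings can be trusted.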

I'll let others chime in with their thoughts, since I could be wrong,
and I'll see if I should add a workaround.

Al

> Al
>
>
> On Wed, 2011-08-24 at 01:35 -0700, Franco Brasolin wrote:
> > Thank you, Albert, for your quick response.
> > All the answers are below.
> > ciao
> > Franco
> >
> > On 23/08/2011 19:00, Albert Chu wrote:
> > > Hi Franco,
> > >
> > > The fact that you're getting some negative degrees in ipmitool means
> > > something is probably wrong with the IPMI firmware on your mobo.
> > > Something is definitely not right.
> > >
> > > The first thing to try is to run ipmi-sensors w/ --bridge-sensors. It's
> > > possible the sensors aren't on the main IPMI bus, so they need to be
> > > bridged to other devices on the motherboard.
> >
> > ipmi-sensors -h pc-xyz -u user -p passw --bridge-sensors doesn't help
> > >
> > > As a second shot, this seems similar to a bug I saw on a HP machine.
> > > Could you try running with the "-W discretereading" flag w/
> > > ipmi-sensors. Maybe that will fix the problem. It would also be
> > > interesting to compare FreeIPMI's ipmi-sensors output to ipmitool's
> > > 'sensor list' output (ipmitool's assumptions are different in that code
> > > path).
> >
> > ipmi-sensors -h pc-xyz -u user -p passw -W discretereading
> > doesn't help either
> >
> >
> > Attached are the ipmi-sensors & ipmitool sensor list outputs.
> >
> >
> > > If that doesn't help, could you send me the --debug output from
> > > ipmi-sensors? I'd have to look in detail at what is actually going on
> > > with this motherboard.
> >
> > The debug output is also attached.
> > >
> > > As a side note, you may be interested in the --entity-sensor-names
> > > option for ipmi-sensors. It may make your output better for your
> > > motherboard.
> > >
> > > Al
> > >
> > > On Tue, 2011-08-23 at 06:47 -0700, Franco Brasolin wrote:
> > >> Hi all,
> > >> I need some help to read Temperature sensors on a Dell PowerEdge R410
> > >> model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
> > >> # uname -a
> > >> Linux pc-xyz 2.6.18-238.1.1.el5 #1 SMP Wed Jan 19 11:06:36 CET 2011
> > >> x86_64 x86_64 x86_64 GNU/Linux
> > >>
> > >> If I try (from another host with freeipmi 1.0.6.beta0 installed) the
> > >> following command:
> > >>
> > >> # ipmi-sensors -V
> > >> ipmi-sensors - 1.0.6.beta0
> > >>
> > >> # ipmi-sensors -h pc-xyz -u user -p passw | grep -i temp
> > >> 1 | Temp | Temperature | N/A | C | N/A
> > >> 2 | Temp | Temperature | N/A | C | N/A
> > >> 3 | Temp | Temperature | N/A | C | N/A
> > >> 4 | Temp | Temperature | N/A | C | N/A
> > >> 5 | Ambient Temp | Temperature | N/A | C | N/A
> > >> 6 | Ambient Temp | Temperature | N/A | C | N/A
> > >> 7 | Temp | Temperature | N/A | C | N/A
> > >> 8 | Temp | Temperature | N/A | C | N/A
> > >> 9 | Temp | Temperature | N/A | C | N/A
> > >> 10 | Ambient Temp | Temperature | 21.00 | C | 'OK'
> > >> 11 | Planar Temp | Temperature | N/A | C | N/A
> > >> 65 | CPU Temp Interf | Temperature | N/A | N/A | N/A
> > >> 110 | Mem Overtemp | Memory | N/A | N/A | N/A
> > >>
> > >> i.e. only one ambient temperature reading is available, whereas if I use ipmitool:
> > >>
> > >>
> > >> # ipmitool -H pc-xyz -U user -P passw sdr type temperature
> > >> Temp | 01h | ok | 3.1 | -57 degrees C
> > >> Temp | 02h | ok | 3.2 | -63 degrees C
> > >> Temp | 05h | ok | 10.1 | 19 degrees C
> > >> Ambient Temp | 07h | ok | 10.1 | 24 degrees C
> > >> Temp | 06h | ok | 10.2 | 30 degrees C
> > >> Ambient Temp | 08h | ok | 10.2 | 27 degrees C
> > >> Ambient Temp | 0Eh | ok | 7.1 | 18 degrees C
> > >> Planar Temp | 0Fh | ok | 7.1 | 35 degrees C
> > >> IOH THERMTRIP | 5Dh | ns | 7.1 | Disabled
> > >> CPU Temp Interf | 76h | ns | 7.1 | Disabled
> > >> Temp | 0Ah | ok | 8.1 | 26 degrees C
> > >> Temp | 0Bh | ok | 8.1 | 23 degrees C
> > >> Temp | 0Ch | unc | 8.1 | 44 degrees C
> > >>
> > >> I obtain much more info.
> > >> What am I doing wrong?
> > >>
> > >> thank you very much for your help!
> > >> ciao
> > >> Franco
--
Albert Chu
address@hidden
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory