Re: [Freeipmi-devel] can't read tempeature info
From: Albert Chu
Subject: Re: [Freeipmi-devel] can't read tempeature info
Date: Thu, 25 Aug 2011 09:55:01 -0700
Hi Franco,
On Thu, 2011-08-25 at 02:26 -0700, Franco Brasolin wrote:
> Hi Albert,
>
> On 24/08/2011 19:42, Albert Chu wrote:
> > Hi Franco,
> >
> > Well, the answer to your problem is surprisingly simple. Your
> > motherboard is reporting that the sensors are disabled, thus
> > ipmi-sensors doesn't output anything as a result. One example:
> >
> > pc-xyz: IPMI Command Data:
> > pc-xyz: ------------------
> > pc-xyz: [ 2Dh] = cmd[ 8b]
> > pc-xyz: [ 0h] = comp_code[ 8b]
> > pc-xyz: [ 42h] = sensor_reading[ 8b]
> > pc-xyz: [ 0h] = reserved1[ 5b]
> > pc-xyz: [ 0h] = reading_state[ 1b]
> > pc-xyz: [ 0h] = sensor_scanning[ 1b]
> > pc-xyz: [ 0h] = all_event_messages[ 1b]
> > pc-xyz: [ C0h] = sensor_event_bitmask1[ 8b]
> > pc-xyz: IPMI Trailer:
> > pc-xyz: --------------
> > pc-xyz: [ 49h] = checksum2[ 8b]
> > Sensor reading/event bitmask not available: sensor scanning disabled
> > 2 | Temp | Temperature | N/A | C | N/A
> >
> > So the question is why is your ipmitool working (or at least appears to
> > be working)? Because the ipmitool I see from sourceforge checks for
> > this bit and will not output if it is detected.
> >
> > } else if (!(rsp->data[1] & SCANNING_DISABLED)) {
> > validread = 0;
> >
> > Is it possible your ipmitool is a version supplied by Dell, and Dell has
> > hacked it to not check for this to get around this hardware issue?
> >
> > On the Dell Poweredge node I have access to, via freeipmi I get:
> >
> > # ipmi-sensors
> > ID | Name | Type | Reading | Units | Event
> > 1 | Temp | Temperature | N/A | C | N/A
> > 2 | Temp | Temperature | N/A | C | N/A
> > 3 | Temp | Temperature | N/A | C | N/A
> > 4 | Temp | Temperature | N/A | C | N/A
> > 5 | Ambient Temp | Temperature | 17.00 | C | 'OK'
> > <snip>
> >
> > and via ipmitool
> >
> > # ipmitool -I free sensor list
> > Temp | na | degrees C | na | na | na | na | 85.000 | 90.000 | na
> > Temp | na | degrees C | na | na | na | na | 85.000 | 90.000 | na
> > Temp | na | degrees C | na | na | na | na | na | na | na
> > Temp | na | degrees C | na | na | na | na | na | na | na
> > Ambient Temp | 17.000 | degrees C | ok | na | 3.000 | 8.000 | 42.000 | 47.000 | na
> > <snip>
> >
> > So the exact same output on my system. Let's hack them both to NOT check
> > for the sensor-scanning disabled bit.
> >
> > (turning on --entity-sensor-names)
> >
> > ID | Name | Type | Reading | Units | Event
> > 1 | Processor 1 Temp | Temperature | -69.00 | C | 'OK'
> > 2 | Processor 2 Temp | Temperature | -65.00 | C | 'OK'
> > 3 | Power Supply 1 Temp | Temperature | 40.00 | C | 'OK'
> > 4 | Power Supply 2 Temp | Temperature | 40.00 | C | 'OK'
> > 5 | System Board Ambient Temp | Temperature | 17.00 | C | 'OK'
> >
> > Temp | -69.000 | degrees C | ok | na | na | na | 85.000 | 90.000 | na
> > Temp | -65.000 | degrees C | ok | na | na | na | 85.000 | 90.000 | na
> > Temp | 40.000 | degrees C | ok | na | na | na | na | na | na
> > Temp | 40.000 | degrees C | ok | na | na | na | na | na | na
> > Ambient Temp | 17.000 | degrees C | ok | na | 3.000 | 8.000 | 42.000 | 47.000 | na
> >
> > Aha! It seems that your version of ipmitool has been altered to act
> > differently than the normal ipmitool.
>
> yes, you're right
> ipmitool was already installed (vers. 1.8.8 ), and I don't know how it
> was installed and/or if it's a special modified version.
> Now I have just installed the latest version 1.8.11 and the behaviour is
> changed, ie: only "Ambient temp" is displayed.
Ahh. I did some checking, and it seems the SCANNING_DISABLED fix was
added post-ipmitool-1.8.8. It seems your version was just old (ipmitool
1.8.8 was released mid-2006).
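For anyone following along, here is a minimal Python sketch of how the three flag bits in the debug dump above (reading_state, sensor_scanning, all_event_messages) could be decoded from the status byte that follows the raw reading. The bit positions reflect my reading of the IPMI Get Sensor Reading response layout; the constant and function names are made up for illustration and are not FreeIPMI's or ipmitool's.

```python
# Bit masks for the status byte that follows the raw sensor reading.
# Names are illustrative only; positions are assumed from the IPMI spec.
EVENT_MESSAGES_ENABLED = 0x80  # bit 7: all_event_messages (1 = enabled)
SCANNING_ENABLED = 0x40        # bit 6: sensor_scanning (0 = scanning disabled)
READING_UNAVAILABLE = 0x20     # bit 5: reading_state (1 = reading unavailable)


def decode_status(status_byte):
    """Split the status byte into the three flags shown in the debug dump."""
    return {
        "all_event_messages": bool(status_byte & EVENT_MESSAGES_ENABLED),
        "sensor_scanning": bool(status_byte & SCANNING_ENABLED),
        "reading_unavailable": bool(status_byte & READING_UNAVAILABLE),
    }


def reading_is_valid(status_byte):
    """Treat a reading as valid only if scanning is on and a reading exists."""
    flags = decode_status(status_byte)
    return flags["sensor_scanning"] and not flags["reading_unavailable"]


if __name__ == "__main__":
    # In the dump above all three flag fields are 0h, so the byte is 0x00.
    print(decode_status(0x00))
    print(reading_is_valid(0x00))
```

With the all-zero status byte from the dump, the sensor_scanning flag is clear, which is why a tool that checks this bit reports N/A even though a raw reading (42h) is present.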
>
> > I can easily put in a workaround flag (e.g. -W ignorescanningbit or
> > something) to deal with this for Dell motherboards. However, before
> > doing that there is an additional question to be answered. Are the
> > temperatures of -69, -65, 40 & 40 correct or incorrect?? At the
> > minimum, the temperatures of -69 & -65 seem highly incorrect for
> > processor temperatures (but as Andy said earlier, it's possible they are
> > margin sensors, but nothing indicates that).
>
> I think Andy is right, please have a look to this page:
>
> http://comments.gmane.org/gmane.linux.hardware.dell.poweredge/25491
>
> and
>
> http://www.intel.com/design/xeon/datashts/313355.htm section 6.3.1.1
>
Ahhh, interesting. I have a different motherboard w/ margin temperature
sensors, but they report thresholds of about +5 degrees (i.e. the margin
has gone "too positive"). In contrast, it seems Dell gives (what I
assume to be the) normal threshold of 85/90C.
> > I am wondering, does Dell supply any software that you might be able to
> > try out to see what "Dell approved" readings are? That way we know what
> > should be done.
>
> there's OMSA (OpenManage Server Administrator) available
> at the DELL site, but I didn't try it
>
>
> So now: is it right to modify the code to not test disabled sensors,
> as you wrote before:
>
> ========================================================================
> > So the question is why is your ipmitool working (or at least appears to
> > be working)? Because the ipmitool I see from sourceforge checks for
> > this bit and will not output if it is detected.
> >
> > } else if (!(rsp->data[1] & SCANNING_DISABLED)) {
> > validread = 0;
> ========================================================================
>
> or not ???
> and if yes, will this modification be available in one of next freeipmi
> release or do I have to do in my own version ?
Now it's clear to me that these sensors are "for real". The fact that
Dell's motherboard reports "scanning disabled" is a bug in their
firmware. I need to add a workaround to deal with the scanning disabled
bit. I'll try to get you a beta tar.gz sometime today (or perhaps
it'll be morning for you when you get into work :P).
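To make the idea of the workaround concrete, here is a rough Python sketch of how an opt-in switch could bypass the scanning-disabled check. This is only an illustration of the logic, not the actual FreeIPMI change: the `ignore_scanning_bit` parameter and both function names are hypothetical, and the bit position is assumed from the IPMI spec.

```python
SCANNING_ENABLED = 0x40  # bit 6 of the status byte; 0 means scanning disabled


def should_report(status_byte, ignore_scanning_bit=False):
    """Decide whether a reading is shown or replaced by N/A.

    ignore_scanning_bit models a hypothetical workaround switch; it is
    not the name of a real FreeIPMI option.
    """
    if ignore_scanning_bit:
        return True
    return bool(status_byte & SCANNING_ENABLED)


def format_reading(value, status_byte, ignore_scanning_bit=False):
    """Render a converted reading the way a sensor listing might."""
    if should_report(status_byte, ignore_scanning_bit):
        return f"{value:.2f}"
    return "N/A"


if __name__ == "__main__":
    print(format_reading(-69.0, 0x00))
    print(format_reading(-69.0, 0x00, ignore_scanning_bit=True))
```

Keeping the default behaviour unchanged means boards that correctly report disabled sensors still show N/A; only users who opt in get the readings from firmware that sets the bit wrongly. A real workaround would probably still want to honor the separate "reading unavailable" bit.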
Al
>
> thank you!
>
> ciao
> Franco
>
>
>
>
> >
> > Al
> >
> >
> > On Wed, 2011-08-24 at 01:35 -0700, Franco Brasolin wrote:
> >> thank you Albert for your quick response,
> >> below all the answers.
> >> ciao
> >> Franco
> >>
> >> On 23/08/2011 19:00, Albert Chu wrote:
> >>> Hi Franco,
> >>>
> >>> The fact that you're getting some negative degrees in ipmitool means
> >>> something is probably wrong with the IPMI firmware on your mobo.
> >>> Something is definitely not right.
> >>>
> >>> The first thing to try is to run ipmi-sensors w/ --bridge-sensors. It's
> >>> possible the sensors aren't on the main IPMI bus, so they need to be
> >>> bridged to other devices on the motherboard.
> >>
> >> ipmi-sensor -h pc-xyz -u user -p passw --bridge-sensors doesn't help
> >>>
> >>> As a second shot, this seems similar to a bug I saw on a HP machine.
> >>> Could you try running with the "-W discretereading" flag w/
> >>> ipmi-sensors. Maybe that will fix the problem. It would also be
> >>> interesting to compare FreeIPMI's ipmi-sensors output to ipmitool's
> >>> 'sensor list' output (ipmitool's assumptions are different in that code
> >>> path).
> >>
> >> ipmi-sensor -h pc-xyz -u user -p passw --W discretereading
> >> doesn't help either
> >>
> >>
> >> In attachment the ipmi-sensor & ipmitool sensor list output.
> >>
> >>
> >>> If that doesn't help, could you send me the --debug output from
> >>> ipmi-sensors. I'd have to look into detail what is actually going on on
> >>> this motherboard.
> >>
> >> In attachment also the debug output
> >>>
> >>> As a side note, you may be interested in the --entity-sensor-names
> >>> option for ipmi-sensors. It may make your output better for your
> >>> motherboard.
> >>>
> >>> Al
> >>>
> >>> On Tue, 2011-08-23 at 06:47 -0700, Franco Brasolin wrote:
> >>>> Hi all,
> >>>> I need some help to read Temperature sensors on a Dell PowerEdge R410
> >>>> model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
> >>>> # uname -a
> >>>> Linux pc-xyz 2.6.18-238.1.1.el5 #1 SMP Wed Jan 19 11:06:36 CET 2011
> >>>> x86_64 x86_64 x86_64 GNU/Linux
> >>>>
> >>>> If I try (from another host with freeipmi 1.0.6.beta0 installed) the
> >>>> following command:
> >>>>
> >>>> # ipmi-sensors -V
> >>>> ipmi-sensors - 1.0.6.beta0
> >>>>
> >>>> # ipmi-sensors -h pc-xyz -u user -p passw | grep -i temp
> >>>> 1 | Temp | Temperature | N/A | C | N/A
> >>>> 2 | Temp | Temperature | N/A | C | N/A
> >>>> 3 | Temp | Temperature | N/A | C | N/A
> >>>> 4 | Temp | Temperature | N/A | C | N/A
> >>>> 5 | Ambient Temp | Temperature | N/A | C | N/A
> >>>> 6 | Ambient Temp | Temperature | N/A | C | N/A
> >>>> 7 | Temp | Temperature | N/A | C | N/A
> >>>> 8 | Temp | Temperature | N/A | C | N/A
> >>>> 9 | Temp | Temperature | N/A | C | N/A
> >>>> 10 | Ambient Temp | Temperature | 21.00 | C | 'OK'
> >>>> 11 | Planar Temp | Temperature | N/A | C | N/A
> >>>> 65 | CPU Temp Interf | Temperature | N/A | N/A | N/A
> >>>> 110 | Mem Overtemp | Memory | N/A | N/A | N/A
> >>>>
> >>>> ie: only ambient temperature is available, while if I use ipmitool:
> >>>>
> >>>>
> >>>> # ipmitool -H pc-xyz -U user -P passw sdr type temperature
> >>>> Temp | 01h | ok | 3.1 | -57 degrees C
> >>>> Temp | 02h | ok | 3.2 | -63 degrees C
> >>>> Temp | 05h | ok | 10.1 | 19 degrees C
> >>>> Ambient Temp | 07h | ok | 10.1 | 24 degrees C
> >>>> Temp | 06h | ok | 10.2 | 30 degrees C
> >>>> Ambient Temp | 08h | ok | 10.2 | 27 degrees C
> >>>> Ambient Temp | 0Eh | ok | 7.1 | 18 degrees C
> >>>> Planar Temp | 0Fh | ok | 7.1 | 35 degrees C
> >>>> IOH THERMTRIP | 5Dh | ns | 7.1 | Disabled
> >>>> CPU Temp Interf | 76h | ns | 7.1 | Disabled
> >>>> Temp | 0Ah | ok | 8.1 | 26 degrees C
> >>>> Temp | 0Bh | ok | 8.1 | 23 degrees C
> >>>> Temp | 0Ch | unc | 8.1 | 44 degrees C
> >>>>
> >>>> I obtain much more info.
> >>>> What am I doing wrong ??
> >>>>
> >>>> thank you very much for your help!
> >>>> ciao
> >>>> Franco
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> Freeipmi-devel mailing list
> >>>> address@hidden
> >>>> https://lists.gnu.org/mailman/listinfo/freeipmi-devel
> >>
>
--
Albert Chu
address@hidden
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory