freeipmi-devel

Re: [Freeipmi-devel] can't read tempeature info


From: Franco Brasolin
Subject: Re: [Freeipmi-devel] can't read tempeature info
Date: Thu, 25 Aug 2011 11:26:59 +0200
User-agent: Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20110812 Thunderbird/6.0

Hi Albert,

On 24/08/2011 19:42, Albert Chu wrote:
Hi Franco,

Well, the answer to your problem is surprisingly simple.  Your
motherboard is reporting that the sensors are disabled, thus
ipmi-sensors doesn't output anything as a result.  One example:

pc-xyz: IPMI Command Data:
pc-xyz: ------------------
pc-xyz: [              2Dh] = cmd[ 8b]
pc-xyz: [               0h] = comp_code[ 8b]
pc-xyz: [              42h] = sensor_reading[ 8b]
pc-xyz: [               0h] = reserved1[ 5b]
pc-xyz: [               0h] = reading_state[ 1b]
pc-xyz: [               0h] = sensor_scanning[ 1b]
pc-xyz: [               0h] = all_event_messages[ 1b]
pc-xyz: [              C0h] = sensor_event_bitmask1[ 8b]
pc-xyz: IPMI Trailer:
pc-xyz: --------------
pc-xyz: [              49h] = checksum2[ 8b]
Sensor reading/event bitmask not available: sensor scanning disabled
2   | Temp             | Temperature              | N/A        | C     | N/A

So the question is why is your ipmitool working (or at least appears to
be working)?  Because the ipmitool I see from sourceforge checks for
this bit and will not output if it is detected.

         } else if (!(rsp->data[1] & SCANNING_DISABLED)) {
                 validread = 0;
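
The check described above can be sketched in Python as follows. This is only a minimal illustration, not the actual ipmitool or FreeIPMI code: the bit positions follow the IPMI Get Sensor Reading response layout (the same fields the debug dump lists), while the function names and the `ignore_scanning_bit` switch are hypothetical.

```python
# Flag bits of the status byte that follows the sensor reading in a
# Get Sensor Reading response (the debug dump prints them LSB-first).
READING_UNAVAILABLE = 0x20  # bit 5: 1 = reading/state unavailable
SCANNING_ENABLED = 0x40     # bit 6: 0 = sensor scanning disabled
EVENT_MSGS_ENABLED = 0x80   # bit 7: 0 = all event messages disabled


def decode_status(status_byte):
    """Split the status byte into named boolean flags."""
    return {
        "reading_unavailable": bool(status_byte & READING_UNAVAILABLE),
        "scanning_enabled": bool(status_byte & SCANNING_ENABLED),
        "event_messages_enabled": bool(status_byte & EVENT_MSGS_ENABLED),
    }


def reading_is_valid(status_byte, ignore_scanning_bit=False):
    """Decide whether a reading should be shown.

    ignore_scanning_bit is a hypothetical switch that skips the
    scanning-disabled check.
    """
    flags = decode_status(status_byte)
    if flags["reading_unavailable"]:
        return False
    if not flags["scanning_enabled"] and not ignore_scanning_bit:
        return False
    return True


# In the dump above all three flag fields are 0h, so the status byte is 0x00.
print(reading_is_valid(0x00))
print(reading_is_valid(0x00, ignore_scanning_bit=True))
```

With the status byte from the dump, the reading is rejected unless the scanning check is skipped, which matches the N/A that ipmi-sensors prints.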

Is it possible your ipmitool is a version supplied by Dell, and Dell has
hacked it to not check for this to get around this hardware issue?

On the Dell Poweredge node I have access to, via freeipmi I get:

# ipmi-sensors
ID | Name         | Type        | Reading    | Units | Event
1  | Temp         | Temperature | N/A        | C     | N/A
2  | Temp         | Temperature | N/A        | C     | N/A
3  | Temp         | Temperature | N/A        | C     | N/A
4  | Temp         | Temperature | N/A        | C     | N/A
5  | Ambient Temp | Temperature | 17.00      | C     | 'OK'
<snip>

and via ipmitool

# ipmitool -I free sensor list
Temp             | na         | degrees C  | na    | na        | na        | na        | 85.000    | 90.000    | na
Temp             | na         | degrees C  | na    | na        | na        | na        | 85.000    | 90.000    | na
Temp             | na         | degrees C  | na    | na        | na        | na        | na        | na        | na
Temp             | na         | degrees C  | na    | na        | na        | na        | na        | na        | na
Ambient Temp     | 17.000     | degrees C  | ok    | na        | 3.000     | 8.000     | 42.000    | 47.000    | na
<snip>

So the exact same output on my system.  Let's hack them both to NOT check
for the sensor-scanning disabled bit.

(turning on --entity-sensor-names)

ID | Name                             | Type                     | Reading    | Units | Event
1  | Processor 1 Temp                 | Temperature              | -69.00     | C     | 'OK'
2  | Processor 2 Temp                 | Temperature              | -65.00     | C     | 'OK'
3  | Power Supply 1 Temp              | Temperature              | 40.00      | C     | 'OK'
4  | Power Supply 2 Temp              | Temperature              | 40.00      | C     | 'OK'
5  | System Board Ambient Temp        | Temperature              | 17.00      | C     | 'OK'

Temp             | -69.000    | degrees C  | ok    | na        | na        | na        | 85.000    | 90.000    | na
Temp             | -65.000    | degrees C  | ok    | na        | na        | na        | 85.000    | 90.000    | na
Temp             | 40.000     | degrees C  | ok    | na        | na        | na        | na        | na        | na
Temp             | 40.000     | degrees C  | ok    | na        | na        | na        | na        | na        | na
Ambient Temp     | 17.000     | degrees C  | ok    | na        | 3.000     | 8.000     | 42.000    | 47.000    | na

Aha!  It seems that your version of ipmitool has been altered to act
differently than the normal ipmitool.

Yes, you're right.
ipmitool was already installed (version 1.8.8), and I don't know how it was installed or whether it is a specially modified build. I have now installed the latest version, 1.8.11, and the behaviour has changed: only "Ambient Temp" is displayed.


I can easily put in a workaround flag (e.g. -W ignorescanningbit or
something) to deal with this for Dell motherboards.  However, before
doing that there is an additional question to be answered.  Are the
temperatures of -69, -65, 40 & 40 correct or incorrect??  At the
minimum, the temperatures of -69 & -65 seem highly incorrect for
processor temperatures (but as Andy said earlier, it's possible they are
margin sensors, but nothing indicates that).

I think Andy is right; please have a look at these pages:

http://comments.gmane.org/gmane.linux.hardware.dell.poweredge/25491

and

http://www.intel.com/design/xeon/datashts/313355.htm  section 6.3.1.1
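
If the two processor sensors really are margin sensors, as suggested earlier in the thread, a negative reading would be an offset below some reference temperature rather than an absolute value. The sketch below only illustrates that arithmetic. The function name is hypothetical and the reference value is made up for the example; it is not taken from the Dell firmware or the Intel datasheet.

```python
def margin_to_absolute(margin_c, reference_c):
    """Absolute temperature implied by a margin reading.

    margin_c is the (usually negative) offset reported by the sensor,
    reference_c is the temperature the margin is measured against.
    """
    return reference_c + margin_c


# Purely illustrative reference temperature, NOT a real specification value.
HYPOTHETICAL_REFERENCE_C = 95.0

for margin in (-69.0, -65.0):
    absolute = margin_to_absolute(margin, HYPOTHETICAL_REFERENCE_C)
    print(f"margin {margin:+.1f} C -> {absolute:.1f} C")
```

Under that assumption, readings of -69 and -65 would correspond to ordinary processor temperatures rather than impossible ones.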


I am wondering, does Dell supply any software that you might be able to
try out to see what "Dell approved" readings are?  That way we know what
should be done.

There is OMSA (OpenManage Server Administrator), available from the
Dell site, but I haven't tried it.


So now: is it right to modify the code so that it doesn't test for
disabled sensors, given what you wrote before:

========================================================================
> So the question is why is your ipmitool working (or at least appears to
> be working)?  Because the ipmitool I see from sourceforge checks for
> this bit and will not output if it is detected.
>
>          } else if (!(rsp->data[1] & SCANNING_DISABLED)) {
>                  validread = 0;
========================================================================

or not?
And if yes, will this modification be available in one of the next FreeIPMI releases, or do I have to make it in my own version?


thank you!

ciao
Franco





Al


On Wed, 2011-08-24 at 01:35 -0700, Franco Brasolin wrote:
Thank you, Albert, for your quick response.
All the answers are below.
ciao
Franco

On 23/08/2011 19:00, Albert Chu wrote:
Hi Franco,

The fact that you're getting some negative degrees in ipmitool means
something is probably wrong with the IPMI firmware on your mobo.
Something is definitely not right.

The first thing to try is to run ipmi-sensors w/ --bridge-sensors.  It's
possible the sensors aren't on the main IPMI bus, so they need to be
bridged to other devices on the motherboard.

ipmi-sensors -h pc-xyz -u user -p passw --bridge-sensors doesn't help.

As a second shot, this seems similar to a bug I saw on a HP machine.
Could you try running ipmi-sensors with the "-W discretereading" flag?
Maybe that will fix the problem.  It would also be
interesting to compare FreeIPMI's ipmi-sensors output to ipmitool's
'sensor list' output (ipmitool's assumptions are different in that code
path).

ipmi-sensors -h pc-xyz -u user -p passw -W discretereading
doesn't help either.


Attached is the output of ipmi-sensors and of ipmitool sensor list.


If that doesn't help, could you send me the --debug output from
ipmi-sensors?  I'd have to look in detail at what is actually going on on
this motherboard.

The debug output is attached as well.

As a side note, you may be interested in the --entity-sensor-names
option for ipmi-sensors.  It may make your output better for your
motherboard.

Al

On Tue, 2011-08-23 at 06:47 -0700, Franco Brasolin wrote:
Hi all,
I need some help reading the temperature sensors on a Dell PowerEdge R410
model name      : Intel(R) Xeon(R) CPU           E5620  @ 2.40GHz
# uname -a
Linux pc-xyz 2.6.18-238.1.1.el5 #1 SMP Wed Jan 19 11:06:36 CET 2011
x86_64 x86_64 x86_64 GNU/Linux

If I try (from another host with FreeIPMI 1.0.6.beta0 installed) the
following command:

# ipmi-sensors -V
ipmi-sensors - 1.0.6.beta0

# ipmi-sensors -h pc-xyz  -u user -p passw   | grep -i temp
1   | Temp             | Temperature              | N/A        | C     | N/A
2   | Temp             | Temperature              | N/A        | C     | N/A
3   | Temp             | Temperature              | N/A        | C     | N/A
4   | Temp             | Temperature              | N/A        | C     | N/A
5   | Ambient Temp     | Temperature              | N/A        | C     | N/A
6   | Ambient Temp     | Temperature              | N/A        | C     | N/A
7   | Temp             | Temperature              | N/A        | C     | N/A
8   | Temp             | Temperature              | N/A        | C     | N/A
9   | Temp             | Temperature              | N/A        | C     | N/A
10  | Ambient Temp     | Temperature              | 21.00      | C     | 'OK'
11  | Planar Temp      | Temperature              | N/A        | C     | N/A
65  | CPU Temp Interf  | Temperature              | N/A        | N/A   | N/A
110 | Mem Overtemp     | Memory                   | N/A        | N/A   | N/A

i.e. only the ambient temperature is available, whereas if I use ipmitool:


#  ipmitool -H pc-xyz -U user -P passw  sdr type temperature
Temp             | 01h | ok  |  3.1 | -57 degrees C
Temp             | 02h | ok  |  3.2 | -63 degrees C
Temp             | 05h | ok  | 10.1 | 19 degrees C
Ambient Temp     | 07h | ok  | 10.1 | 24 degrees C
Temp             | 06h | ok  | 10.2 | 30 degrees C
Ambient Temp     | 08h | ok  | 10.2 | 27 degrees C
Ambient Temp     | 0Eh | ok  |  7.1 | 18 degrees C
Planar Temp      | 0Fh | ok  |  7.1 | 35 degrees C
IOH THERMTRIP    | 5Dh | ns  |  7.1 | Disabled
CPU Temp Interf  | 76h | ns  |  7.1 | Disabled
Temp             | 0Ah | ok  |  8.1 | 26 degrees C
Temp             | 0Bh | ok  |  8.1 | 23 degrees C
Temp             | 0Ch | unc |  8.1 | 44 degrees C

I get much more information.
What am I doing wrong?
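
To compare the two tools without reading the tables by eye, the pipe-delimited ipmi-sensors rows can be filtered with a few lines of Python. This is a rough sketch that assumes the six-column layout shown above (ID, name, type, reading, units, event); the helper names are hypothetical and the sample text is a short excerpt of the output in this message.

```python
SAMPLE = """\
1   | Temp             | Temperature              | N/A        | C     | N/A
10  | Ambient Temp     | Temperature              | 21.00      | C     | 'OK'
110 | Mem Overtemp     | Memory                   | N/A        | N/A   | N/A
"""


def parse_sensor_rows(text):
    """Turn ipmi-sensors style rows into dictionaries."""
    rows = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 6 or not fields[0].isdigit():
            continue  # skip headers, wrapped lines and anything unexpected
        sensor_id, name, sensor_type, reading, units, event = fields
        rows.append({
            "id": int(sensor_id),
            "name": name,
            "type": sensor_type,
            "reading": reading,
            "units": units,
            "event": event,
        })
    return rows


def readable_temperatures(rows):
    """Return (id, name, value) for temperature sensors with a real reading."""
    return [
        (row["id"], row["name"], float(row["reading"]))
        for row in rows
        if row["type"] == "Temperature" and row["reading"] != "N/A"
    ]


print(readable_temperatures(parse_sensor_rows(SAMPLE)))
```

On the excerpt above only sensor 10 survives the filter, which is the same picture as the full listing: every other temperature sensor is reported as N/A.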

thank you very much for your help!
ciao
Franco





_______________________________________________
Freeipmi-devel mailing list
address@hidden
https://lists.gnu.org/mailman/listinfo/freeipmi-devel
