freeipmi-users
Re: [Freeipmi-users] Disabled temp sensors


From: Eric Pooch
Subject: Re: [Freeipmi-users] Disabled temp sensors
Date: Tue, 4 May 2010 19:16:36 -0700


Al,
see below:
On May 4, 2010, at 10:48 AM, Al Chu wrote:


Hey Eric,

Ahhhh.  That would do it.  The slave address is probably wrong in the
SDR.  When you run w/ bridging, the sensor read attempts to bridge to
an address that is probably non-functional/illegal.


Yep.

LMK what your final patch looks like.  I can work it into a workaround
of some sort for ipmi-sensors.  (e.g.
--workaround-flags=assumebmcslaveaddr).


I can work on it, but I wanted to make sure that I don't just have some errors in the SDR that are causing the problem. If I issue the Clear SDR Repository command, will this cause me to lose information, or will the SDR repository get rebuilt fresh on its own?


What does the ipmi-sel output look like? I'm wondering whether SEL
events are reporting right/wrong slave addresses and whether the
sensor-related outputs are correct or not.


sudo ipmi-sel
ipmi_sel_parse: internal IPMI error

My fans still look messed up.


It certainly depends on whether the SDR is correct or not.  From the
output below, it looks as though the Fans are "transition" fans.  They
only report the transition state of the fan instead of an RPM.  If they
aren't "transition" fans, then the SDR might be wrong, which is leading
to this kind of output.

There is also a valid sensor reading, but it doesn't look like the library supports that.


BTW, you forgot the debug output from your previous e-mail.


I did send it as an attachment, but I think it got filtered out.

Thanks a lot
--Eric

Al

On Mon, 2010-05-03 at 21:32 -0700, Eric Pooch wrote:

OK, I think I found the problem in my computer's implementation of IPMI.
I edited:
/freeipmi-0.8.5/libfreeipmi/src/sensor-read/ipmi-sensor-read.c

   /* if (slave_address == IPMI_SLAVE_ADDRESS_BMC) */
   if (slave_address != IPMI_SLAVE_ADDRESS_BMC)
And received what looks like good data:

1  | Fan 1            | Fan                      | N/A        | N/A | 'transition to Off Line'
2  | Fan 2            | Fan                      | N/A        | N/A | 'transition to Running' 'transition to On Line'
3  | Fan 3            | Fan                      | N/A        | N/A | 'transition to Running' 'transition to On Line'
4  | Fan 4            | Fan                      | N/A        | N/A | 'transition to Running' 'transition to On Line'
5  | PCI Fan          | Fan                      | N/A        | N/A | 'transition to Off Line'
6  | Memory           | Memory                   | N/A        | N/A | 'OK'
7  | CPU 1            | Processor                | N/A        | N/A | 'Processor Presence detected'
8  | CPU 2            | Processor                | N/A        | N/A | 'Processor Presence detected'
9  | VRM              | Voltage                  | N/A        | N/A | 'OK'
10 | CPU1 Temperature | Temperature              | 35.00      | C   | 'OK'
11 | CPU2 Temperature | Temperature              | 33.00      | C   | 'OK'
12 | Thermal Trip     | Temperature              | N/A        | N/A | 'OK'
13 | Sys Temperature  | Temperature              | 31.00      | C   | 'OK'
14 | DDR 1.25V        | Voltage                  | 1.25       | V   | 'OK'
15 | Sys 3.3V         | Voltage                  | 3.25       | V   | 'OK'
16 | Sys 5V           | Voltage                  | 5.00       | V   | 'OK'
17 | CIOBE 1.2V       | Voltage                  | 1.21       | V   | 'OK'
18 | CIOBE 2.5V       | Voltage                  | 2.52       | V   | 'OK'
19 | BIOS Progress    | System Firmware Progress | N/A        | N/A | N/A
20 | Watchdog         | Watchdog 2               | N/A        | N/A | N/A

This is much better, and I get info for almost all of the sensors
that just showed N/A before. My fans still look messed up.  I will
figure out more details, clean the change up a bit, and send a patch
for users of this flawed IPMI implementation.
Thanks
--Eric
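
For readers following the thread, the idea behind the one-line change above can be sketched in Python. This is only an illustration under stated assumptions, not the FreeIPMI source: the function name, the `bridge_enabled` and `assume_bmc_owner` parameters, and the three return labels are hypothetical. The only value taken from the IPMI specification is the BMC slave address, 0x20.

```python
# Hypothetical sketch of the read-path decision discussed in this thread.
# Names are illustrative; they are not FreeIPMI identifiers except for
# the constant, which mirrors the BMC slave address (0x20) from the spec.
IPMI_SLAVE_ADDRESS_BMC = 0x20


def choose_read_path(sdr_slave_address, bridge_enabled=False,
                     assume_bmc_owner=False):
    """Return how a sensor would be read, given the owner address in its SDR.

    "direct"  -> send Get Sensor Reading straight to the BMC
    "bridged" -> forward the request to another controller
    "skip"    -> do not read (sensor reported as N/A)
    """
    # Assumed workaround: ignore a suspect SDR address and treat the
    # BMC as the sensor owner.
    if assume_bmc_owner or sdr_slave_address == IPMI_SLAVE_ADDRESS_BMC:
        return "direct"
    if bridge_enabled:
        return "bridged"
    return "skip"
```

With a wrong owner address in the SDR, the default path would skip the sensor and the bridged path would target an address that may not exist, which matches the symptoms reported here. An opt-in flag along the lines of `assume_bmc_owner` would keep the default behavior unchanged for boards with correct SDRs.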

On May 3, 2010, at 8:42 PM, Eric Pooch wrote:


Ok, I updated to 0.8.5 and attached an archive of the debug log from:
$ sudo ipmi-sensors --debug
see below:
On May 3, 2010, at 5:23 PM, Al Chu wrote:


Hey Eric,


Also, bmc-info returns IPMI version 1.0, which is probably not
supported by FreeIPMI, but ipmi-locate returns "IPMI Version: 1.5"
for all of the devices.


Doing a quick online search, this machine appears to be pretty old.
It is possible that it does not support IPMI 1.5.  The output from
ipmi-locate you're seeing may be the defaults and not actual outputs
from the machine (this is confusing many people so I'm changing this
output for the next 0.9.1 release). If it is only IPMI 1.0, there's
probably not much I can do to help you, since many of the IPMI
commands will just not be supported on your motherboard.



Ok.  I understand

First, all my sensor values come back as [NA] even though most work
properly under ipmitool.


I assume you're using FreeIPMI 0.7.X b/c the newest one (0.8.X line)
does not have "[NA]" output.  There have certainly been fixes since
then, so you may wish to upgrade. My initial guess was bridging, but
you seem to have tried that.

I've noticed on some motherboards that there are issues b/c I find
errors/problems in other parts of IPMI that ipmitool doesn't, thus I
output errors and they don't.  We need to dig into the core of the
errors on your board to figure out what they/I are doing
wrong/differently.  Can you provide --debug output?


So, I think maybe there is something that is disabling the Temp sensor
at another level.  I noticed on the HP lightsout user guide that
they
have a setting "o PEF Control—Enables or disables the sensor. "


Based on some of the error messages you posted from ipmitool (BTW, in
the future could you indicate what tools the error messages came
from?  I thought you were indicating FreeIPMI errors and couldn't find
them at first),

Sorry, I thought I was listing FreeIPMI errors, but I guess I
posted errors from the wrong log.

my guess is bridging is not supported on your motherboard and/or
there is a firmware issue w/ bridging, so the temp sensors can't be
reached.

I would agree, except that the standard IPMI raw "get sensor
reading" command works fine.  It is almost like ipmi-sensors and
ipmitool are finding something they don't like in the sdr and not
trying to read the sensor at all.

$ sudo ipmi-raw 0 04 2d 0A
rcvd: 2D 00 23 C0 00 00

0x23 = 35 degrees Celsius, which seems right for my processor
temp.  As I mentioned before, it varies proportionally with server
load and seems like the value I need, and this is the correct command
as far as I can tell from the IPMI v1.5 spec.
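
As a reference for reading the reply above, here is a minimal Python sketch that decodes an ipmi-raw "rcvd:" line for Get Sensor Reading (command 0x2D). It assumes the layout seen in this thread (command byte, completion code, raw reading, then a flags byte) and the flag bits as described in the IPMI specification; the function name and dictionary keys are made up for illustration. It also assumes nothing about units: the raw byte only equals degrees Celsius when the SDR conversion factors happen to be 1:1, as they appear to be on this board.

```python
def parse_get_sensor_reading(rcvd_line):
    """Decode an ipmi-raw 'rcvd:' line for Get Sensor Reading (0x2D).

    Hypothetical helper: name and return keys are illustrative only.
    """
    # Drop the "rcvd:" prefix and convert each hex token to an integer.
    tokens = rcvd_line.split(":", 1)[1].split()
    data = [int(tok, 16) for tok in tokens]
    if len(data) < 4:
        raise ValueError("response too short for Get Sensor Reading")

    cmd, completion_code, raw_reading, flags = data[:4]
    if cmd != 0x2D:
        raise ValueError("not a Get Sensor Reading response")

    return {
        "completion_code": completion_code,
        "raw_reading": raw_reading,                   # before SDR conversion
        "event_messages_enabled": bool(flags & 0x80),  # bit 7
        "sensor_scanning_enabled": bool(flags & 0x40), # bit 6
        "reading_unavailable": bool(flags & 0x20),     # bit 5
    }


reply = parse_get_sensor_reading("rcvd: 2D 00 23 C0 00 00")
print(reply["raw_reading"], reply["sensor_scanning_enabled"])
```

If that layout holds for this firmware, the 0xC0 flags byte would mean event messages and sensor scanning are both enabled and the reading is available, which would argue against the sensor itself being disabled.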


It's hard to say.  If you can provide me --debug output from
ipmi-sensors, I can maybe analyze it deeper.



$ sudo ipmi-sensors --debug
see attachment

$ sudo ipmi-sensors --bridge
ipmi_sensor_read: internal IPMI error


Does any HP-specific software work for you for all these sensors? If
their software does, and ipmitool/FreeIPMI does not, it indicates
there is something kooky on your motherboard.


I don't know, I don't have access to Windows.  If it won't work
with FreeIPMI, I understand that my motherboard is old, but it just
seems strange that I can get the sensor reading using ipmi-raw, but
not ipmi-sensors.

Thanks a lot for your help.
--Eric


Al

On Sun, 2010-05-02 at 10:10 -0700, Eric Pooch wrote:

I am having several problems on my HP ProLiant DL140 G1.

First, all my sensor values come back as [NA] even though most work
properly under ipmitool.
I get the debug errors from ipmi-sensors:

Error reading event status for sensor #09: Invalid command
...
Error reading event enable for sensor #09: Invalid command

When I try ipmi-raw to send those commands, I also get the same
error, so I think the commands are not supported on the sensors.
The sensors are returning the proper information when I send a raw
command to get their readings. (see below)

However, none of my temp sensors work properly in either freeipmi or
ipmitool and I get a debug error:
Error reading sensor CPU1 Temperature (#0a): Destination unavailable
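
For context, the two error strings quoted so far correspond to generic completion codes in the IPMI specification. The small Python lookup below is a sketch for interpreting the second byte of an ipmi-raw reply; it assumes the tools print the standard text for each code, and the table and function names are hypothetical. Only a few well-known codes are listed.

```python
# A few generic IPMI completion codes (second byte of an ipmi-raw reply).
# The table is deliberately partial; names here are illustrative.
COMPLETION_CODES = {
    0x00: "Command completed normally",
    0xC1: "Invalid command",
    0xD3: "Destination unavailable",
}


def describe_completion_code(code):
    """Return a readable description for a completion code byte."""
    return COMPLETION_CODES.get(code, "Unrecognized (0x%02X)" % code)


print(describe_completion_code(0xD3))
```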

I get the same "destination unavailable" message from event status
and event enable. However, when I enter the raw ipmi command to read
the temp sensor:
sudo ipmi-raw 0 04 2d 0A

it responds correctly:
rcvd: 2D 00 1B C0 00 00

The 1B is the correct temperature in Celsius that rises with
processor load.  It is definitely the correct temperature.
I have tried the bridge mode but I get an error also.
It seems like the sensor is responding correctly, but is disabled as
far as the SDR is concerned?  I can't enable it through a raw command
because none of the sensors respond to the "event status" or "event
enable" commands.  So, I think maybe there is something that is
disabling the Temp sensor at another level.  I noticed on the HP
lightsout user guide that they have a setting "o PEF Control— Enables
or disables the sensor. "
I am not really sure how to make a change that would cause the
sensor to be enabled.

Also, bmc-info returns IPMI version 1.0, which is probably not
supported by FreeIPMI, but ipmi-locate returns "IPMI Version: 1.5"
for all of the devices.

Thanks for any help!
_______________________________________________
Freeipmi-users mailing list
address@hidden
http://lists.gnu.org/mailman/listinfo/freeipmi-users


--
Albert Chu
address@hidden
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
