freeipmi-users

Re: [Freeipmi-users] Troubleshooting inconsistent results from server


From: Brian LaFlamme
Subject: Re: [Freeipmi-users] Troubleshooting inconsistent results from server
Date: Fri, 8 Jan 2016 09:22:14 -0500

For the over-LAN issue, I cannot ipmiping node1, but I can ipmiping node2.
Both nodes are on the same flat network, with no firewalls or other devices
in between.  Also, for both nodes, the web system management console works
just fine.

address@hidden:/var/log/munin# ipmiping node1ipmi
ipmiping node1ipmi (192.168.1.66)
response timed out: rq_seq=42
response timed out: rq_seq=43
^C--- ipmiping node1ipmi statistics ---
3 requests transmitted, 0 responses received in time, 100.0% packet loss
You have new mail in /var/mail/root
address@hidden:/var/log/munin# ipmiping node2ipmi
ipmiping node2ipmi (192.168.1.88)
response received from 192.168.1.88: rq_seq=17
response received from 192.168.1.88: rq_seq=18
response received from 192.168.1.88: rq_seq=19
^C--- ipmiping node2ipmi statistics ---
3 requests transmitted, 3 responses received in time, 0.0% packet loss
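
To compare the two nodes repeatedly rather than by eye, the summary line that
ipmiping prints on exit can be parsed.  This is only a minimal sketch: the
function names are made up for illustration, and it assumes the summary line
always has the wording shown in the output above.

```python
import re

# Matches the statistics line ipmiping prints on exit, as seen in the output above.
SUMMARY_RE = re.compile(
    r"(\d+) requests transmitted, (\d+) responses received in time, "
    r"([\d.]+)% packet loss"
)


def parse_ipmiping_summary(output):
    """Return (sent, received, loss_percent), or None if no summary line is found."""
    match = SUMMARY_RE.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


def bmc_reachable(output):
    """Treat the BMC as reachable if at least one response arrived in time."""
    summary = parse_ipmiping_summary(output)
    return summary is not None and summary[1] > 0
```

Feeding it the captured text of each run gives a simple yes/no per node, which
can be logged over time to see whether node1 ever answers.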



For the inband issue, it appears that I do have local kernel support on
node1:

address@hidden:/etc# ps aux | grep ipmi
root         237  0.1  0.0      0     0 ?        SN   Jan06   4:41 [kipmi0]

And this appears to be built directly into the kernel.  I found no modules
associated with ipmi.  From the link you sent, it looks like adding the
following should work, right?

ipmi_si.force_kipmid=0
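
Since the driver is built in, a setting of this form would have to reach the
kernel on its boot command line.  After a reboot, one way to confirm it was
picked up is to look for it in /proc/cmdline.  The sketch below is only an
illustration: the helper names are hypothetical, and how the option gets onto
the command line (boot loader configuration) depends on the distribution.

```python
def ipmi_si_params(cmdline):
    """Collect ipmi_si.<name>=<value> options from a kernel command line string."""
    params = {}
    prefix = "ipmi_si."
    for token in cmdline.split():
        if token.startswith(prefix) and "=" in token:
            key, value = token[len(prefix):].split("=", 1)
            params[key] = value
    return params


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read()
```

On the node itself, `ipmi_si_params(read_cmdline()).get("force_kipmid")`
returns the value as a string if the option was passed at boot, or None if it
was not.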

On Thu, Jan 7, 2016 at 8:09 PM, Albert Chu <address@hidden> wrote:

> Hi Brian,
>
> I think the over-LAN and inband are two separate issues.  The over-LAN
> is likely some configuration/networking issue.  Can you at least ipmiping
> the node?
>
> As for the inband issue, it sounds very much like this issue:
>
>
> http://www.gnu.org/software/freeipmi/freeipmi-faq.html#Why-am-I-seeing-so-many-_0027internal-IPMI-error_0027-or-_0027driver-busy_0027-messages_003f
>
> Al
>
> On Thu, 2016-01-07 at 19:43 -0500, Brian LaFlamme wrote:
> > I have a Dell C6100 blade server with 2 identical nodes, and I'm trying to
> > troubleshoot some odd behavior with ipmi.  One node (node2) works perfectly
> > over LAN and over its local web interface.  The other node (node1) doesn't
> > work well at all.
> >
> > Locally, I get very inconsistent results.  E.g., running 'bmc-config
> > --checkout' on node1 usually ends prematurely without any error message,
> > resulting in an incomplete config file.  Sometimes it completes.  In
> > contrast, I always get a complete config file on node2.
> >
> > Here is an attempt to run a simple command locally on node1 a few times to
> > demonstrate the inconsistency.  The first time it runs to completion, the
> > next few times it dies with an error.  I paused for 20+ seconds between
> > each command to make sure I wasn't overloading anything.
> >
> > address@hidden:~# ipmi-sensors
> > ID | Name             | Type                                | Reading | Units | Event
> > 2  | FCB FAN1         | Fan                                 | 5500.00 | RPM   | 'OK'
> > 3  | FCB FAN2         | Fan                                 | 5500.00 | RPM   | 'OK'
> > 4  | FCB FAN3         | Fan                                 | 5500.00 | RPM   | 'OK'
> > 5  | FCB FAN4         | Fan                                 | 5500.00 | RPM   | 'OK'
> > 6  | PEF Action       | System Event                        | N/A     | N/A   | 'OK'
> > 7  | WatchDog2        | Watchdog 2                          | N/A     | N/A   | 'OK'
> > 8  | AC Pwr On        | Power Unit                          | N/A     | N/A   | 'OK'
> > 9  | ACPI Pwr State   | System ACPI Power State             | N/A     | N/A   | 'Legacy ON state'
> > 10 | FCB Ambient1     | Temperature                         | 20.00   | C     | 'OK'
> > 11 | FCB Ambient2     | Temperature                         | 21.00   | C     | 'OK'
> > 12 | CPU1Status       | Processor                           | N/A     | N/A   | 'OK'
> > 13 | CPU2Status       | Processor                           | N/A     | N/A   | 'OK'
> > 14 | PS 12V           | Voltage                             | 12.09   | V     | 'OK'
> > 15 | PS 5V            | Voltage                             | 5.10    | V     | 'OK'
> > 16 | MLB TEMP 2       | Temperature                         | 63.00   | C     | 'OK'
> > 17 | MLB TEMP 3       | Temperature                         | 52.00   | C     | 'OK'
> > 18 | Processor 1 Temp | Temperature                         | 60.00   | C     | 'OK'
> > 19 | MLB TEMP 1       | Temperature                         | 62.00   | C     | 'OK'
> > 20 | Processor 2 Temp | Temperature                         | 66.00   | C     | 'OK'
> > 21 | STBY 3.3V        | Voltage                             | 3.35    | V     | 'OK'
> > 22 | PS Current       | Current                             | 38.00   | A     | 'OK'
> > 23 | SEL Fullness     | Event Logging Disabled              | N/A     | N/A   | 'Log Area Reset/Cleared'
> > 24 | PCI BUS          | Critical Interrupt                  | N/A     | N/A   | 'OK'
> > 25 | Memory           | Memory                              | N/A     | N/A   | 'OK'
> > 26 | VCORE 1          | Voltage                             | 1.04    | V     | 'OK'
> > 27 | VCORE 2          | Voltage                             | 0.87    | V     | 'OK'
> > 30 | NM Capability    | OEM Reserved                        | N/A     | N/A   | N/A
> > 33 | Security         | Platform Security Violation Attempt | N/A     | N/A   | 'OK'
> > 34 | PSU 1 AC Status  | Power Unit                          | N/A     | N/A   | N/A
> > 35 | PSU 2 AC Status  | Power Unit                          | N/A     | N/A   | N/A
> > 36 | PSU 1 Present    | Power Supply                        | N/A     | N/A   | N/A
> > 37 | PSU 2 Present    | Power Supply                        | N/A     | N/A   | N/A
> > 38 | PSU 2 POUT       | Current                             | N/A     | A     | N/A
> > 39 | PSU 1 POUT       | Current                             | N/A     | A     | N/A
> > address@hidden:~# ipmi-sensors
> > ID | Name             | Type                                | Reading | Units | Event
> > 2  | FCB FAN1         | Fan                                 | 5500.00 | RPM   | 'OK'
> > 3  | FCB FAN2         | Fan                                 | 5500.00 | RPM   | 'OK'
> > 4  | FCB FAN3         | Fan                                 | 5500.00 | RPM   | 'OK'
> > 5  | FCB FAN4         | Fan                                 | 5500.00 | RPM   | 'OK'
> > 6  | PEF Action       | System Event                        | N/A     | N/A   | 'OK'
> > 7  | WatchDog2        | Watchdog 2                          | N/A     | N/A   | 'OK'
> > 8  | AC Pwr On        | Power Unit                          | N/A     | N/A   | 'OK'
> > ipmi_sensor_read: internal IPMI error
> > address@hidden:~# ipmi-sensors
> > ID | Name             | Type                                | Reading | Units | Event
> > 2  | FCB FAN1         | Fan                                 | 5500.00 | RPM   | 'OK'
> > ipmi_sensor_read: internal IPMI error
> > address@hidden:~# ipmi-sensors
> > ID | Name             | Type                                | Reading | Units | Event
> > 2  | FCB FAN1         | Fan                                 | 5500.00 | RPM   | 'OK'
> > 3  | FCB FAN2         | Fan                                 | 5500.00 | RPM   | 'OK'
> > 4  | FCB FAN3         | Fan                                 | 5500.00 | RPM   | 'OK'
> > 5  | FCB FAN4         | Fan                                 | 5500.00 | RPM   | 'OK'
> > 6  | PEF Action       | System Event                        | N/A     | N/A   | 'OK'
> > 7  | WatchDog2        | Watchdog 2                          | N/A     | N/A   | 'OK'
> > 8  | AC Pwr On        | Power Unit                          | N/A     | N/A   | 'OK'
> > 9  | ACPI Pwr State   | System ACPI Power State             | N/A     | N/A   | 'Legacy ON state'
> > ipmi_sensor_read: internal IPMI error
> >
> > Also, I get no response from node1 over LAN, whereas node2 works perfectly
> > (not shown).
> >
> > address@hidden:~# ipmi-sensors -h node1ipmi -u root -p XXX --debug
> > node1ipmi: =====================================================
> > node1ipmi: IPMI 1.5 Get Channel Authentication Capabilities Request
> > node1ipmi: =====================================================
> > node1ipmi: RMCP Header:
> > node1ipmi: ------------
> > node1ipmi: [               6h] = version[ 8b]
> > node1ipmi: [               0h] = reserved[ 8b]
> > node1ipmi: [              FFh] = sequence_number[ 8b]
> > node1ipmi: [               7h] = message_class.class[ 5b]
> > node1ipmi: [               0h] = message_class.reserved[ 2b]
> > node1ipmi: [               0h] = message_class.ack[ 1b]
> > node1ipmi: IPMI Session Header:
> > node1ipmi: --------------------
> > node1ipmi: [               0h] = authentication_type[ 8b]
> > node1ipmi: [               0h] = session_sequence_number[32b]
> > node1ipmi: [               0h] = session_id[32b]
> > node1ipmi: [               9h] = ipmi_msg_len[ 8b]
> > node1ipmi: IPMI Message Header:
> > node1ipmi: --------------------
> > node1ipmi: [              20h] = rs_addr[ 8b]
> > node1ipmi: [               0h] = rs_lun[ 2b]
> > node1ipmi: [               6h] = net_fn[ 6b]
> > node1ipmi: [              C8h] = checksum1[ 8b]
> > node1ipmi: [              81h] = rq_addr[ 8b]
> > node1ipmi: [               0h] = rq_lun[ 2b]
> > node1ipmi: [              23h] = rq_seq[ 6b]
> > node1ipmi: IPMI Command Data:
> > node1ipmi: ------------------
> > node1ipmi: [              38h] = cmd[ 8b]
> > node1ipmi: [               Eh] = channel_number[ 4b]
> > node1ipmi: [               0h] = reserved1[ 3b]
> > node1ipmi: [               0h] = get_ipmi_v2.0_extended_data[ 1b]
> > node1ipmi: [               3h] = maximum_privilege_level[ 4b]
> > node1ipmi: [               0h] = reserved2[ 4b]
> > node1ipmi: IPMI Trailer:
> > node1ipmi: --------------
> > node1ipmi: [              AAh] = checksum2[ 8b]
> >
> >
> > Additional details (this info is identical to the working node)
> >
> >
> > address@hidden:~# bmc-info
> > Device ID             : 37
> > Device Revision       : 1
> > Device SDRs           : unsupported
> > Firmware Revision     : 1.30
> > Device Available      : yes (normal operation)
> > IPMI Version          : 2.0
> > Sensor Device         : supported
> > SDR Repository Device : supported
> > SEL Device            : supported
> > FRU Inventory Device  : supported
> > IPMB Event Receiver   : supported
> > IPMB Event Generator  : supported
> > Bridge                : unsupported
> > Chassis Device        : supported
> > Manufacturer ID       : Inventec Enterprise System Corp. (20569)
> > Product ID            : 52
> > Auxiliary Firmware Revision Information : 6D6E0001h
> >
> > GUID : f790edd1-a000-0061-756d-502032343435
> >
> > System Firmware Version       : 5442A170
> > System Name                   :
> > Primary Operating System Name :
> > Operating System Name         :
> >
> > Channel Information
> >
> > Channel Number       : 0
> > Medium Type          : IPMB (I2C)
> > Protocol Type        : IPMB-1.0
> > Active Session Count : 0
> > Session Support      : session-less
> > Vendor ID            : Intelligent Platform Management Interface forum (7154)
> >
> > Channel Number       : 1
> > Medium Type          : 802.3 LAN
> > Protocol Type        : IPMB-1.0
> > Active Session Count : 0
> > Session Support      : multi-session
> > Vendor ID            : Intelligent Platform Management Interface forum (7154)
> >
> > Channel Number       : 6
> > Medium Type          : IPMB (I2C)
> > Protocol Type        : IPMB-1.0
> > Active Session Count : 0
> > Session Support      : session-less
> > Vendor ID            : Intelligent Platform Management Interface forum (7154)
> --
> Albert Chu
> address@hidden
> Computer Scientist
> High Performance Systems Division
> Lawrence Livermore National Laboratory
>
>
>

