
Re: [Freeipmi-users] ipmiconsole BMC Implementation with x38ml


From: ktaka
Subject: Re: [Freeipmi-users] ipmiconsole BMC Implementation with x38ml
Date: Sat, 27 Sep 2008 01:08:08 +0900
User-agent: Thunderbird 2.0.0.9 (X11/20071031)

Hi Al,

> Great to hear.  Could you give me details on your motherboard, or point
> me to a webpage with it?  That way I can update the workarounds
> documentation to include this motherboard.

Sure.
The system I'm using is Intel's SR1520ML (a twin-motherboard server):
http://www.intel.com/products/server/systems/sr1520ml/sr1520ml-overview.htm

The motherboard in it is the X38ML:
http://www.intel.com/Products/Server/Motherboards/X38ML/X38ML-specifications.htm

This board has an onboard BMC with IPMI 2.0 support, and a shared NIC
for the outbound IPMI connection.

>> Sol connection seems silently disconnected from the target host through
>> the booting process.
> 
> I don't think this is that uncommon.  

Yep. A few years ago, when I was using Supermicro servers, I often saw
the connection drop.
But nowadays that doesn't happen so often with them. It's stable. Your
tool works like a charm and almost never disconnects, unless I have a
bad connection or misconfigured bonding where the IPMI MAC address hops
from eth0 to eth1.

Maybe I should have thanked you for developing such a great tool before
I asked the question.
I just wish FreeIPMI worked with this new M/B the way it does with my
many Supermicro servers.

> Do you happen to be diskless
> booting?  One manufacturer told me that there are lines of ethernet
> cards that run out of memory while diskless booting.  So all IPMI
> traffic stops during that time.

Yes, I'm PXE booting. That could be one reason.

> I can imagine other scenarios where the ethernet is gone during boot.
> Tools that were connected via SOL will either hang or eventually
> timeout.
>
>> However do you have any idea on what's causing this, and how to
>> correct this behavior?
>
> I don't know of a way to get around it.  From the 'ipmiconsole' side,
> all I can do is read/send SOL packets.  When too many packets get lost,
> or I get errors from the motherboard, etc. eventually I need to give up.
> On one motherboard I tested with, during boot (it seemed) a large number
> of SOL packets were dropped by the motherboard, and when SOL was alive
> again, the sequence numbers of the newly sent packets were incremented
> by a large number.  Eventually I need to give up b/c all the sequence
> numbers are out of whack.

Well, I think I'm getting closer now. Thank you.
After a bunch of experiments, I came to this conclusion:
This M/B (or BMC) seems to stop sending SOL data when the NIC link goes
down while it is sending SOL data, and never sends data again even after
the NIC link comes back.

Quick tests I did while I was writing this email:

1. Boot Linux without the "console=ttyS0,19200n8" option, so that no
kernel messages are displayed on the SOL connection.

(1) Disconnect/reconnect the LAN cable while watching "vmstat 1" through
SOL. -> The SOL stopped.
(2) Disconnect/reconnect the LAN cable while SOL is connected but
nothing is being displayed continuously. -> The SOL continued to work
after reconnecting the cable.

2. Boot Linux with the "console=ttyS0,19200n8" option, so that I can
see the kernel messages through SOL.

(3) Disconnect/reconnect the LAN cable. -> The SOL stopped.
(4) "modprobe -r igb", "modprobe igb", i.e. unloading/loading the NIC
driver (from tty1). The NIC link goes down for a couple of seconds. ->
The SOL stopped.
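
Test (4) could be scripted roughly as follows. This is only a sketch:
wait_for_link is a hypothetical helper (not part of FreeIPMI), and the
interface name eth0 and the sysfs carrier path in the usage comment are
assumptions about the target host.

```shell
#!/bin/sh
# Sketch only: wait until a sysfs carrier file reports link up ("1").
# wait_for_link is a hypothetical helper, not part of FreeIPMI.
wait_for_link() {
    carrier_file=$1      # e.g. /sys/class/net/eth0/carrier (interface name assumed)
    timeout=${2:-10}     # seconds to wait before giving up
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # Reading carrier can fail while the interface is down, so ignore errors.
        state=$(cat "$carrier_file" 2>/dev/null)
        if [ "$state" = "1" ]; then
            echo "link up after ${elapsed}s"
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "link still down after ${timeout}s"
    return 1
}

# Usage for test (4), run as root on the target host:
#   modprobe -r igb && modprobe igb && wait_for_link /sys/class/net/eth0/carrier 30
```

Knowing when the link is back makes it easier to tell whether the SOL
stays silent even after the NIC has recovered.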

Here are the rules of thumb for this motherboard:
1. Try not to PXE boot from the first NIC, which is shared with the IPMI
connection. (While the PXE ROM message is displayed, the link goes down
for a second.)
2. Try not to use the serial console to view kernel messages, i.e.
avoid "console=ttyS0,19200n8" on the kernel command line. (When the NIC
is initialized during boot, the link goes down for a moment.)

This way I think I can now avoid the "seemingly left disconnected"
situation on this motherboard most of the time.
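
The second rule of thumb can be checked on a running host with a small
script. This is a minimal sketch: has_serial_console is a hypothetical
helper name, and it only looks for a "console=ttyS..." argument in the
string it is given.

```shell
#!/bin/sh
# Sketch only: report whether a kernel command line asks for a serial console.
# has_serial_console is a hypothetical helper name.
has_serial_console() {
    # Pad with spaces so the argument matches at the start of the line too.
    case " $1 " in
        *" console=ttyS"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on the target host:
#   if has_serial_console "$(cat /proc/cmdline)"; then
#       echo "warning: kernel messages go to the serial port"
#   fi
```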

Do you think I'm on the right track?

> Are you seeing 'ipmiconsole' hang forever?  Or does it eventually get an
> error or timeout?  It at least shouldn't hang forever.  If it hangs
> forever, do you think you could give me a --debug output of that
> particular situation.  (Send as an attachment, since I'm sure the
> --debug output will be very long.)

The "hang" seems to last forever. It seems to me that ipmiconsole
"thinks" it's still connected and keeps sending whatever I type in the
terminal, while the target host never sends any data back.

Here is the --debug output, captured while I did the following:
1. Make a SOL connection to an already logged-in console.
2. Issue the date command to see if it's alive.
3. On tty0, issue "modprobe igb" to make the SOL "hang".
4. Issue "sleep 1000" followed by Ctrl+C.
5. Disconnect the SOL by typing "&.".

Here is the stdout output:

x60:~# /usr/ccmp/sbin/ipmiconsole -W authcap,solpayloadsize -u rt -p rt
-h 192.168.20.116 --debug 2>/tmp/debug
[SOL established]
date
Sat Sep 27 00:52:24 JST 2008
usb:~# Intel(R) Gigabit Ethernet Network Driver - version 1.0.8-k2
Copyright (c) 2007 Intel Corporatio
[closing the connection]

Attached is the gzipped --debug output.

I hope this is what you want.
Otherwise, please let me know.
Thank you.

-- 
Best regards,
Kimitoshi Takahashi
