freeipmi-users


From: Albert Chu
Subject: Re: [Freeipmi-users] ASROCK MT-C224 + ipmiconsole: [error received]: excess errors received
Date: Mon, 30 Jan 2017 11:10:52 -0800

> I have learned that there are two ways to run IPMI on the ASROCK
> MT-C224. IPMI can share the LAN1 port and/or IPMI can use a dedicated
> IPMI_LAN port. The earlier grep (still attached below) of --debug for
> the linux boot was made sharing the LAN1 port.

Ahh, this is a common feature of many boards.  However, I have not heard
of a system that has had the "jumping sequence numbers" issue you saw.
I haven't seen it on any machines at my company, but we use dedicated
mode almost exclusively.

It's likely a bug on the ASROCK.  I can add it to the "bugs and
workarounds" doc
(https://www.gnu.org/software/freeipmi/freeipmi-bugs-issues-and-workarounds.txt).
 
> I switched to the dedicated IPMI_LAN port and the situation is improved
> as shown by the grep of --debug output for the linux boot below ...
> 
> address@hidden cat power-up-debug.txt | grep -a failed
> (ipmiconsole_checks.c, ipmiconsole_check_requester_sequence_number, 389):
> hostname=e3bIPMI; protocol_state=Ah: requester sequence number check failed;
> p = 21; req_seq = 3Eh; expected_req_seq = 3Fh
> (ipmiconsole_checks.c, ipmiconsole_check_command, 353): hostname=e3bIPMI;
> protocol_state=Bh: command check failed; p = 23; cmd = 49h; expected_cmd = 3Ch

This appears to be a single lost packet, which can happen once in a
while.
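
For anyone who wants to tally these check failures from a saved --debug
capture rather than reading them by eye, here is a minimal sketch in Python.
It assumes the log line format shown in this thread and that long lines may
have been wrapped or quoted by a mail client. The names session_gaps and
steps are illustrative helpers, not part of FreeIPMI.

```python
import re

# Pattern for the failed session-sequence checks printed by
# `ipmiconsole --debug`, based only on the log lines quoted in this thread.
SESSION_RE = re.compile(
    r"session_sequence_number = (\d+); "
    r"highest_received_sequence_number = (\d+)"
)

# Leading mail-quote markers such as "> >> " at the start of a line.
QUOTE_RE = re.compile(r"^(?:>[ \t]?)+", re.MULTILINE)


def session_gaps(text):
    """Return (received, highest, gap) for each failed session-sequence check."""
    unquoted = QUOTE_RE.sub("", text)
    # Wrapped log lines are rejoined by collapsing all whitespace.
    flat = " ".join(unquoted.split())
    return [(int(s), int(h), int(s) - int(h))
            for s, h in SESSION_RE.findall(flat)]


def steps(values):
    """Differences between consecutive values, e.g. to spot jumps by two."""
    return [b - a for a, b in zip(values, values[1:])]


# Two entries copied from the earlier capture in this thread.
sample = "\n".join([
    "> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186):",
    "> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed;",
    "> >> p = 17; session_sequence_number = 329; highest_received_sequence_number =",
    "> >> 317",
    "> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186):",
    "> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed;",
    "> >> p = 17; session_sequence_number = 331; highest_received_sequence_number =",
    "> >> 317",
])

gaps = session_gaps(sample)
received = [g[0] for g in gaps]
print(gaps)
print(steps(received))
```

A step of 2 between consecutive failures matches the "jumping by two"
pattern discussed further down, while the third field shows how far each
packet was ahead of the highest sequence number ipmiconsole had accepted.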

Al

On Fri, 2017-01-27 at 23:05 -0500, myglc2 wrote:
> On 01/27/2017 at 00:52 Albert Chu writes:
> 
> > It's also worth mentioning, I don't know why the sequence numbers are
> > jumping by two most times.  That shouldn't be the case, and it also
> > suggests lost messages.
> >
> > If the BMC has a bug where sequence numbers are jumping by two somewhat
> > randomly, that could explain the issue further.  It means that, by
> > default, ipmiconsole can tolerate only half as many dropped messages as
> > it normally would.
> >
> > Al
> 
> Hi Al,
> 
> I have learned that there are two ways to run IPMI on the ASROCK
> MT-C224. IPMI can share the LAN1 port and/or IPMI can use a dedicated
> IPMI_LAN port. The earlier grep (still attached below) of --debug for
> the linux boot was made sharing the LAN1 port.
> 
> I switched to the dedicated IPMI_LAN port and the situation is improved
> as shown by the grep of --debug output for the linux boot below ...
> 
> address@hidden cat power-up-debug.txt | grep -a failed
> (ipmiconsole_checks.c, ipmiconsole_check_requester_sequence_number, 389): 
> hostname=e3bIPMI; protocol_state=Ah: requester sequence number check failed; 
> p = 21; req_seq = 3Eh; expected_req_seq = 3Fh
> (ipmiconsole_checks.c, ipmiconsole_check_command, 353): hostname=e3bIPMI; 
> protocol_state=Bh: command check failed; p = 23; cmd = 49h; expected_cmd = 3Ch
> 
> ... and freeipmi doesn't drop the connection. So, at this point I have a
> usable system and I am a happy user ;-)
> 
> I still see 5 to 10 control-@ 's in the linux log, which are errors that
> typically involve the loss of a few characters. Do you think turning
> on flow control would help with this?
> 
> Many thanks for your help!
> 
> - George
> 
> >
> > On Wed, 2017-01-25 at 20:31 -0500, myglc2 wrote:
> >> Hi Albert, Thank you for the quick response.
> >>
> >> On 01/25/2017 at 20:00 Albert Chu writes:
> >>
> >> > Hi,
> >> >
> >> > This seems to be an error caused by a simple sequence number issue.
> >> > Enough messages from the remote service processor have gotten lost, so
> >> > ipmiconsole gives up at some point.  I don't know if your log output
> >> > below is showing consecutive
> >> > "ipmiconsole_check_outbound_sequence_number" errors, but there is
> >> > at least that one big jump from #398 to #429, indicating lots of lost
> >> > messages.
> >>
> >> Sorry, I think my cut-and-paste left a bit to be desired ;-) Here is a
> >> more informative (hopefully) grep of a session that bags out about 1/2
> >> way thru a linux boot ...
> >>
> >> address@hidden /root/con/06$ cat freeipmi.debug.txt | grep -a -E '(failed;|excessive)'
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed; 
> >> p = 17; session_sequence_number = 329; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed; 
> >> p = 17; session_sequence_number = 331; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed; 
> >> p = 17; session_sequence_number = 333; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed; 
> >> p = 17; session_sequence_number = 335; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed; 
> >> p = 17; session_sequence_number = 337; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed; 
> >> p = 17; session_sequence_number = 339; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed; 
> >> p = 17; session_sequence_number = 341; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed; 
> >> p = 17; session_sequence_number = 343; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed; 
> >> p = 17; session_sequence_number = 345; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed; 
> >> p = 17; session_sequence_number = 347; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed; 
> >> p = 17; session_sequence_number = 349; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed; 
> >> p = 17; session_sequence_number = 351; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed; 
> >> p = 17; session_sequence_number = 353; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed; 
> >> p = 17; session_sequence_number = 355; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed; 
> >> p = 19; session_sequence_number = 356; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed; 
> >> p = 17; session_sequence_number = 358; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed; 
> >> p = 17; session_sequence_number = 360; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=9h: session sequence number check failed; 
> >> p = 17; session_sequence_number = 362; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_processing.c, _process_ctx, 4077): hostname=e3bIPMI; 
> >> protocol_state=9h: closing with excessive errors
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=Ah: session sequence number check failed; 
> >> p = 21; session_sequence_number = 363; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_checks.c, ipmiconsole_check_requester_sequence_number, 389): 
> >> hostname=e3bIPMI; protocol_state=Ah: requester sequence number check 
> >> failed; p = 21; req_seq = 2Eh; expected_req_seq = 2Fh
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=Bh: session sequence number check failed; 
> >> p = 23; session_sequence_number = 366; highest_received_sequence_number = 
> >> 317
> >> (ipmiconsole_checks.c, ipmiconsole_check_command, 353): hostname=e3bIPMI; 
> >> protocol_state=Bh: command check failed; p = 23; cmd = 49h; expected_cmd = 
> >> 3Ch
> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 186): 
> >> hostname=e3bIPMI; protocol_state=Bh: session sequence number check failed; 
> >> p = 23; session_sequence_number = 367; highest_received_sequence_number = 
> >> 317
> >> address@hidden /root/con/06$
> >>
> >> > You may wish to check network connections and such for errors, lost
> >> > packets, etc.
> >>
> >> I don't think so, the two machines are the only ones on a switch.
> >>
> >> > If you believe this not to be the case, there is at least one other
> >> > situation where I know this occurs.  It happens when the server is
> >> > being rebooted (or something similar) and the internal serial UART
> >> > chip is reset, which leads to communication problems between it and
> >> > the internal service processor and suddenly to huge jumps in
> >> > sequence numbers.  Unfortunately, there is no solution for this other
> >> > than to restart.
> >>
> >> I tried rebooting/repowering and it does not affect this behavior.
> >>
> >> FWIW, ipmitool does not report any errors and handles a full linux boot
> >> without bagging out.  However, I do see 5 to 10 control-@ 's in the log,
> >> which are clearly errors.
> >>
> >> So... I believe the problem is with the ASROCK MT-C224. I will follow up
> >> with them.
> >>
> >> Many thanks!
> >>
> >> - George
> >>
> >> > Al
> >> >
> >> > On Wed, 2017-01-25 at 14:38 -0500, myglc2 wrote:
> >> >> NOTE: Please pardon if duplicate, I also posted via gmane by mistake.
> >> >>
> >> >> Hi,
> >> >>
> >> >> Using freeipmi ipmiconsole SOL to connect to ASROCK MT-C224 everything
> >> >> is looking good until ...
> >> >>
> >> >>     [...]
> >> >>     [error received]: excess errors received
> >> >>     [closing the connection]
> >> >>
> >> >> So I tried ...
> >> >>
> >> >> ipmiconsole -h e3bIPMI -u admin -p admin --debug
> >> >>
> >> >> ... which showed me ....
> >> >>
> >> >> [...]
> >> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 
> >> >> 186): hostname=e3bIPMI; protocol_state=9h: session sequence number 
> >> >> check failed; p = 17; session_sequence_number = 396; 
> >> >> highest_received_sequence_number = 384
> >> >> [...]
> >> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 
> >> >> 186): hostname=e3bIPMI; protocol_state=9h: session sequence number 
> >> >> check failed; p = 17; session_sequence_number = 398; 
> >> >> highest_received_sequence_number = 384
> >> >> [...]
> >> >> (ipmiconsole_checks.c, ipmiconsole_check_outbound_sequence_number, 
> >> >> 186): hostname=e3bIPMI; protocol_state=9h: session sequence number 
> >> >> check failed; p = 17; session_sequence_number = 429; 
> >> >> highest_received_sequence_number = 384
> >> >> (ipmiconsole_processing.c, _process_ctx, 4077): hostname=e3bIPMI;
> >> >> protocol_state=9h: closing with excessive errors
> >> >>
> >> >> Looks like the BMC is stuck at # 186, EH? So I tried ...
> >> >>
> >> >> ipmiconsole -h e3bIPMI -u admin -p admin  -W solpacketseq --debug
> >> >>
> >> >> ... and ...
> >> >>
> >> >> ipmiconsole -h e3bIPMI -u admin -p admin  -W solpacketseq
> >> >>
> >> >> ... neither of which helped. Suggestions would be most welcome.
> >> >>
> >> >> Thanks in advance - George
> >> >>
> >> >> VERSIONS:
> >> >>
> >> >> ipmiconsole --version
> >> >> ipmiconsole - 1.4.5
> >> >>
> >> >> ASROCK
> >> >> BIOS 3.20        7/17/2015
> >> >> BMC  04.04.00    9/3/2014
> >> >>
> >> >> _______________________________________________
> >> >> Freeipmi-users mailing list
> >> >> address@hidden
> >> >> https://lists.gnu.org/mailman/listinfo/freeipmi-users

-- 
Albert Chu
address@hidden
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory




