Re: [Freeipmi-devel] Inverted IPMI responses


From: GIRARD, MARC
Subject: Re: [Freeipmi-devel] Inverted IPMI responses
Date: Mon, 21 Jan 2019 15:15:06 +0000

Hi Albert

Great fix: your trace message appears.

Please find a side-by-side analysis attached.

I have one minor remark: the decoding of the OEM Intelnm Get Node Manager
Statistics response no longer appears in the debug trace (manufacturer ID,
current, min, max, average values, and so on).

> I'd be interested in knowing how often you hit this race as well.  Is it very 
> often?  Rare?
It is very strange.
I am testing on a cluster with 128 Intel PCSD nodes: 2 racks with 16 chassis
(H2000G) each. Each chassis has 4 nodes (S2600BP).
I have no problem with one of the racks.
The problem occurs only with the first node of each chassis of the second rack
(except one). I suppose the first node of a chassis has a special role.
I also suspect our lab admins have chained 5 Ethernet switches in series between
my monitoring host and the BMCs of the second rack (3 Cisco 2960 and 2 Cisco 3570).
The Ethernet topology looks like a comb instead of a tree, and my monitoring
host is at the end of the comb.
Perhaps these two points explain why the problem has never been seen before.
In my case the problem occurs very often: about 50% of commands fail.

I have another email thread with our Intel support; they reproduced the problem
in their laboratory.

Kind regards / Cordialement

Marc Girard
Power Efficiency team
Atos
-----Original Message-----
From: Albert Chu <address@hidden> 
Sent: Saturday, January 19, 2019 12:53 AM
To: GIRARD, MARC <address@hidden>
Cc: address@hidden
Subject: Re: [Freeipmi-devel] Inverted IPMI responses

Hey Marc,

Ok, I think I got a fix.  In my github mirror:

https://github.com/chu11/freeipmi-mirror

I have a branch:

ipmb_network_reorder_race

Just do the normal thing to build and add some tracing to make sure.

./autogen.sh
./configure --enable-debug --enable-trace
make
cd ipmi-sensors
./ipmi-sensors --bridge-sensors -h host ... etc.

If things are being worked around correctly, hopefully you'll see a trace 
message like:

api/ipmi-lan-session-common.c: 1299: _ipmi_check_ipmb_out_of_order:
error 'reversed obj_cmd responses' (0)

I'd be interested in knowing how often you hit this race as well.  Is it very 
often?  Rare?
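
One rough way to put a number on this, assuming the debug/trace output of a
run has been saved to a file (the file name trace.log and the helper name
count_reorder_traces below are hypothetical, not part of FreeIPMI), is to
count how many times the workaround trace fired:

```shell
# Hypothetical helper: count the lines in a saved trace log that mention the
# out-of-order check. The log path is passed as the first argument.
count_reorder_traces() {
    # grep -c prints the number of matching lines; "|| true" keeps the
    # function from returning failure when the count is 0.
    grep -c '_ipmi_check_ipmb_out_of_order' "$1" || true
}

# Example (assumes trace.log exists in the current directory):
#   count_reorder_traces trace.log
```

Comparing that count against the number of bridged commands issued in the
same run would give an approximate hit rate.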

Al

--
Albert Chu
address@hidden
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory

Attachment: Trace analysis - 2.pdf
Description: Trace analysis - 2.pdf

