Re: [Freeipmi-devel] Inverted IPMI responses
From: GIRARD, MARC
Subject: Re: [Freeipmi-devel] Inverted IPMI responses
Date: Mon, 21 Jan 2019 15:15:06 +0000
Hi Albert,
Great fix: your trace message appears.
Please find a side-by-side analysis attached.
I have one minor remark: the decoding of the OEM Intelnm Get Node Manager Statistics
response no longer appears in the debug trace (manufacturer ID, current, min, max,
average values, and so on).
> I'd be interested in knowing how often you hit this race as well. Is it very
> often? Rare?
It is very strange.
I am testing on a cluster with 128 Intel PCSD nodes: 2 racks with 16 chassis
(H2000G) each. Each chassis has 4 nodes (S2600BP).
I have no problem with one rack.
The problem occurs only with the first node of each chassis of the second rack
(except one). I suppose the first node of a chassis has a special role.
I also suspect that our lab admins have chained 5 Ethernet switches in series between
my monitoring host and the BMCs of the second rack (3 Cisco 2960 and 2 Cisco 3570).
The Ethernet topology looks like a comb instead of a tree, and my monitoring host is
at the end of the comb.
Perhaps these two points explain why the problem has never been seen before.
In my case the problem is very frequent: about 50% of commands fail.
I have a separate email thread with our Intel support; they reproduced the problem
in their laboratory.
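A rough way to put a number on "how often" is to repeat the same command and count the failures. The sketch below is only an illustration: the helper name failure_rate is hypothetical, and it assumes the command under test exits non-zero when it fails, which has not been checked for ipmi-sensors in this thread.

```shell
# failure_rate RUNS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND RUNS times (RUNS must be > 0) and prints
# the integer percentage of runs that exited with a non-zero status.
failure_rate() {
  runs=$1
  shift
  failed=0
  i=0
  while [ "$i" -lt "$runs" ]; do
    # Discard output; only the exit status is counted here.
    "$@" >/dev/null 2>&1 || failed=$((failed + 1))
    i=$((i + 1))
  done
  echo "$((failed * 100 / runs))"
}
```

It could be called as failure_rate 100 followed by the ipmi-sensors command line used for one of the affected nodes.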
Kind regards / Cordialement
Marc Girard
Power Efficiency team
Atos
-----Original Message-----
From: Albert Chu <address@hidden>
Sent: Saturday, January 19, 2019 12:53 AM
To: GIRARD, MARC <address@hidden>
Cc: address@hidden
Subject: Re: [Freeipmi-devel] Inverted IPMI responses
Hey Marc,
Ok, I think I got a fix. In my github mirror:
https://github.com/chu11/freeipmi-mirror
I have a branch:
ipmb_network_reorder_race
Just do the normal thing to build and add some tracing to make sure.
./autogen.sh
./configure --enable-debug --enable-trace
make
cd ipmi-sensors
./ipmi-sensors --bridge-sensors -h host ... etc.
If things are being worked around correctly, hopefully you'll see a trace
message like:
api/ipmi-lan-session-common.c: 1299: _ipmi_check_ipmb_out_of_order:
error 'reversed obj_cmd responses' (0)
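To see how often the workaround triggers, one option is to save the trace output to a file and count the lines containing the message above. This is a minimal sketch: the helper name count_reorder_traces and the log file name are hypothetical, and capturing both stdout and stderr is an assumption, since where the trace is written has not been confirmed here.

```shell
# count_reorder_traces LOGFILE
# Hypothetical helper: prints the number of lines in LOGFILE that contain
# the 'reversed obj_cmd responses' trace message.
count_reorder_traces() {
  # grep -c prints 0 and exits 1 when nothing matches; "|| true" keeps
  # the helper from failing in that case.
  grep -c -- "reversed obj_cmd responses" "$1" || true
}

# Example (assumed capture of both output streams into trace.log):
#   ./ipmi-sensors --bridge-sensors -h host ... > trace.log 2>&1
#   count_reorder_traces trace.log
```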
I'd be interested in knowing how often you hit this race as well. Is it very
often? Rare?
Al
--
Albert Chu
address@hidden
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
Attachment: Trace analysis - 2.pdf