Re: [Freeipmi-devel] lib: raw command and threads


From: Thomas Cadeau
Subject: Re: [Freeipmi-devel] lib: raw command and threads
Date: Mon, 2 Dec 2013 18:16:24 +0100
User-agent: Mozilla/5.0 (X11; Linux i686; rv:24.0) Gecko/20100101 Thunderbird/24.1.1

Hi all,

I am back with the same result on "clean" nodes.

Using a simple program, I get an answer in 3 ms with the kernel driver loaded:
$ lsmod |grep ipmi
   ipmi_devintf            8145  2
   ipmi_si                42497  1
And an answer in 17 ms without the driver.
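
Timings like the ones above can be taken with a monotonic clock around the call. This is only a minimal sketch, assuming POSIX clock_gettime(); elapsed_ms and time_call_ms are hypothetical helper names, and the callback stands in for whatever code issues the raw command:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Milliseconds between two CLOCK_MONOTONIC samples. */
double elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec) * 1000.0
         + (double)(end->tv_nsec - start->tv_nsec) / 1.0e6;
}

/* Time one invocation of fn(arg). fn is a stand-in for the code that
   issues the raw command. Returns the duration in milliseconds, or
   -1.0 if the clock could not be read. */
double time_call_ms(void (*fn)(void *), void *arg)
{
    struct timespec t0, t1;

    assert(fn != NULL);
    if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
        return -1.0;
    fn(arg);
    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0)
        return -1.0;
    return elapsed_ms(&t0, &t1);
}
```

CLOCK_MONOTONIC is used rather than wall-clock time so that clock adjustments during the measurement do not distort a result of a few milliseconds.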

Inside our project, we have no problem when the drivers are loaded, but without them we always hit the memory issue described below.
Note that this will be a real problem with the new RHEL kernel.
We really need to fix this.

I will come back tomorrow once I have seen which part of our code we can share for the moment. The project is quite big, but the only difference I see from the simple program is that the call is made inside a thread.

Thomas

On 22/11/2013 16:51, Thomas Cadeau wrote:
Thanks a lot for your answer.

The approach you propose does not fit what we want to do.
I re-ran on "safe" CPUs without any trouble.

Once I have a real pool of CPUs without any other trouble, I will let you know whether the problem occurs again.

Thomas

On 21/11/2013 20:23, Albert Chu wrote:
Hi Thomas,

I did a quick sanity test on my system and it worked (of course, it may
not have been exactly the way you did things).

The trace indicates the segfault is here:

#0  0x00007f4e278c89a9 in inb (ctx=0x7f4e28001770) at /usr/include/sys/io.h:48

Which is during memory mapped i/o.  I suppose a segfault could happen if
the in/out call was going to a bad part of memory.  It might suggest
some corruption is happening.  Is it possible you're corrupting some
data structure somewhere?  The close/destroy/re-create works b/c it
fixes the corruption?

In all of FreeIPMI (especially the multi-ranged host access in the
tools), we create a context per thread for communication, e.g.

launch_thread
    ctx = ipmi_ctx_create();
    ipmi_ctx_find_inband(ctx, ...);
    loop
       ipmi_cmd_raw

Have you considered doing it this way?
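
The per-thread pattern above can be sketched with POSIX threads. This is only an illustration under stated assumptions: struct worker_ctx, ctx_open, ctx_send and ctx_close are hypothetical stand-ins (they do not talk to a BMC), marking the places where real code would call ipmi_ctx_create() and ipmi_ctx_find_inband(), ipmi_cmd_raw(), and ipmi_ctx_close() and ipmi_ctx_destroy():

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Stand-in for ipmi_ctx_t; real code would hold the FreeIPMI context. */
struct worker_ctx {
    int id;
    int commands_sent;
};

struct worker_args {
    int id;
    int iterations;
    int result;          /* commands sent, or -1 on failure */
};

/* Stand-in for ipmi_ctx_create() followed by ipmi_ctx_find_inband(). */
static struct worker_ctx *ctx_open(int id)
{
    struct worker_ctx *ctx = malloc(sizeof *ctx);
    if (ctx != NULL) {
        ctx->id = id;
        ctx->commands_sent = 0;
    }
    return ctx;
}

/* Stand-in for ipmi_cmd_raw(). */
static int ctx_send(struct worker_ctx *ctx)
{
    return ++ctx->commands_sent;
}

/* Stand-in for ipmi_ctx_close() followed by ipmi_ctx_destroy(). */
static void ctx_close(struct worker_ctx *ctx)
{
    free(ctx);
}

/* The context is created, used and destroyed by the same thread. */
static void *worker(void *p)
{
    struct worker_args *args = p;
    struct worker_ctx *ctx = ctx_open(args->id);

    if (ctx == NULL) {
        args->result = -1;
        return NULL;
    }
    for (int i = 0; i < args->iterations; i++)
        ctx_send(ctx);
    args->result = ctx->commands_sent;
    ctx_close(ctx);
    return NULL;
}

/* Launch n workers, each with its own context. Returns the total
   number of commands sent, or -1 if any worker failed. */
int run_workers(int n, int iterations)
{
    assert(n > 0 && iterations >= 0);
    pthread_t *tids = malloc((size_t)n * sizeof *tids);
    struct worker_args *args = malloc((size_t)n * sizeof *args);
    int started = 0, total = 0;

    if (tids == NULL || args == NULL) {
        total = -1;
    } else {
        for (; started < n; started++) {
            args[started].id = started;
            args[started].iterations = iterations;
            args[started].result = -1;
            if (pthread_create(&tids[started], NULL, worker, &args[started]) != 0)
                break;
        }
        for (int i = 0; i < started; i++) {
            pthread_join(tids[i], NULL);
            if (total >= 0)
                total = (args[i].result < 0) ? -1 : total + args[i].result;
        }
        if (started < n)
            total = -1;
    }
    free(tids);
    free(args);
    return total;
}
```

No context is ever touched by more than one thread here, so no locking is needed around the send calls.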

Al


On Thu, 2013-11-21 at 17:00 +0100, Thomas Cadeau wrote:
Hi all,


I am currently trying to call a raw command several times.
Here are the functions I call:

ctx = ipmi_ctx_create();

ipmi_ctx_find_inband(ctx,
                     NULL,               // &driver_type
                     0,                  // disable_auto_probe
                     0,                  // driver_address
                     0,                  // register_spacing
                     0,                  // driver_device
                     0,                  // workaround_flags
                     IPMI_FLAGS_DEFAULT  // 0
                     );

ipmi_cmd_raw(ctx,
             0x00,              // lun (logical unit number)
             0x3A,              // IPMI_NET_FN_SENSOR_EVENT_RQ
             bytes_rq,          // request data (const void *)
             2,                 // length (in bytes)
             bytes_rs,          // response buffer (void *)
             IPMI_RAW_MAX_ARGS  // max response length
             );
I check all return codes.
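
For a raw command, checking the result involves two levels: the value returned by the library call and the completion code carried in the response itself. The sketch below assumes that the response buffer starts with the command byte followed by the completion code, and that a completion code of 0x00 means the command completed normally. raw_response_ok and COMP_CODE_OK are hypothetical names for illustration, not part of FreeIPMI:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical local name; 0x00 is "command completed normally". */
#define COMP_CODE_OK 0x00

/* rs_len is the value returned by the raw call: negative on error,
   otherwise the number of response bytes stored in rs.
   Returns 1 if the response looks valid and successful, else 0. */
int raw_response_ok(int rs_len, const uint8_t *rs, uint8_t expected_cmd)
{
    if (rs == NULL || rs_len < 2)
        return 0;                 /* call failed or response too short */
    if (rs[0] != expected_cmd)
        return 0;                 /* response is for a different command */
    return rs[1] == COMP_CODE_OK; /* completion code reported by the BMC */
}
```

A caller would pass the return value of ipmi_cmd_raw(...) as rs_len and the first byte of the request data as expected_cmd.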

If I write a simple example with a loop, I have no problem:
ctx = ipmi_ctx_create()
ipmi_ctx_find_inband ( ...  )
for (...){
    ipmi_cmd_raw(...)
    //use result
}
I then tried the same thing inside an internal project. During
initialization I call the 3 functions; after that, each time I want to
update and call ipmi_cmd_raw(...), a thread is created to do all the work.

ctx = ipmi_ctx_create()
ipmi_ctx_find_inband ( ...  )
  ipmi_cmd_raw(...)
  //use result
...
//with fixed frequency:
launch thread
        > ipmi_cmd_raw(...)
        > //use result
In this case, I have no problem on some CPUs, but on others I get a
segfault (core dump):
#0  0x00007f4e278c89a9 in inb (ctx=0x7f4e28001770) at /usr/include/sys/io.h:48
#1  _ipmi_kcs_get_status (ctx=0x7f4e28001770) at driver/ipmi-kcs-driver.c:533
#2  0x00007f4e278c8e50 in _ipmi_kcs_wait_for_ibf_clear (ctx=0x7f4e28001770) at driver/ipmi-kcs-driver.c:656
#3  0x00007f4e278c91d6 in ipmi_kcs_write (ctx=0x7f4e28001770, buf=0x7f4e28003420, buf_len=3) at driver/ipmi-kcs-driver.c:845
#4  0x00007f4e27898bc1 in _kcs_cmd_write (ctx=0x7f4e28005190, obj_cmd_rq=<value optimized out>, obj_cmd_rs=0x7f4e28001ae0) at api/ipmi-kcs-driver-api.c:255
#5  api_kcs_cmd (ctx=0x7f4e28005190, obj_cmd_rq=<value optimized out>, obj_cmd_rs=0x7f4e28001ae0) at api/ipmi-kcs-driver-api.c:398
#6  0x00007f4e27899091 in api_kcs_cmd_raw (ctx=0x7f4e28005190, buf_rq=0x7f4e2e390a60, buf_rq_len=2, buf_rs=0x7f4e2e38f8c0, buf_rs_len=4512) at api/ipmi-kcs-driver-api.c:750
#7  0x00007f4e2788f9a9 in ipmi_cmd_raw (ctx=0x7f4e28005190, lun=<value optimized out>, net_fn=<value optimized out>, buf_rq=0x7f4e2e390a60, buf_rq_len=2, buf_rs=0x7f4e2e38f8c0, buf_rs_len=4512) at api/ipmi-api.c:1983
If I force a reconnect each time, I have no problem. But this workaround
is not a good solution:
ctx = ipmi_ctx_create()
ipmi_ctx_find_inband ( ...  )
ipmi_cmd_raw(...)
//use result
...
//with fixed frequency:
launch thread
        > ipmi_ctx_close(ctx)
        > ipmi_ctx_destroy(ctx)
        > ctx = ipmi_ctx_create()
        > ipmi_ctx_find_inband ( ...  )
        > ipmi_cmd_raw(...)
        > //use result
Note that I checked the BMC version on each node, and I use
freeipmi-1.2.1.
I also have a safeguard to ensure that only one user of ctx can be active at a time.
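
One common way to build such a safeguard is a mutex held around every use of the shared context. The sketch below only illustrates that idea and is not the project's actual code: struct guarded_ctx, guarded_init, guarded_call and stress_guarded are hypothetical names, the void * member stands in for ipmi_ctx_t, and noop_cmd stands in for the function that would call ipmi_cmd_raw():

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* A shared context together with the lock that guards it. */
struct guarded_ctx {
    pthread_mutex_t lock;
    void *ctx;           /* stand-in for ipmi_ctx_t */
    int calls;           /* completed calls, updated under the lock */
};

int guarded_init(struct guarded_ctx *g, void *ctx)
{
    g->ctx = ctx;
    g->calls = 0;
    return pthread_mutex_init(&g->lock, NULL);
}

/* Run fn(ctx, arg) while holding the lock, so at most one thread
   uses the context at a time. Returns fn's result, or -1 if the
   lock could not be taken. */
int guarded_call(struct guarded_ctx *g, int (*fn)(void *ctx, void *arg), void *arg)
{
    int rv;

    assert(g != NULL && fn != NULL);
    if (pthread_mutex_lock(&g->lock) != 0)
        return -1;
    rv = fn(g->ctx, arg);
    g->calls++;
    pthread_mutex_unlock(&g->lock);
    return rv;
}

/* Stand-in for a function that would issue the raw command. */
static int noop_cmd(void *ctx, void *arg)
{
    (void)ctx;
    (void)arg;
    return 0;
}

struct stress_args {
    struct guarded_ctx *g;
    int calls;
};

static void *stress_thread(void *p)
{
    struct stress_args *a = p;
    for (int i = 0; i < a->calls; i++)
        guarded_call(a->g, noop_cmd, NULL);
    return NULL;
}

/* Use one guarded context from several threads. Returns the total
   number of completed calls, or -1 if setup failed. */
int stress_guarded(int threads, int calls_each)
{
    enum { MAX_THREADS = 16 };
    pthread_t tids[MAX_THREADS];
    struct guarded_ctx g;
    struct stress_args a;
    int started = 0, result;

    if (threads < 1 || threads > MAX_THREADS)
        return -1;
    if (guarded_init(&g, NULL) != 0)
        return -1;
    a.g = &g;
    a.calls = calls_each;
    while (started < threads
           && pthread_create(&tids[started], NULL, stress_thread, &a) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    result = (started == threads) ? g.calls : -1;
    pthread_mutex_destroy(&g.lock);
    return result;
}
```

A lock of this kind serializes access to the context; it does not by itself say anything about which thread created the context.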

Do you have any idea what is happening, and whether I am doing something
wrong? Is there a function to check that the connection is still open, or
whether I need to reopen it?

Thank you for your help.

Thomas Cadeau

_______________________________________________
Freeipmi-devel mailing list
address@hidden
https://lists.gnu.org/mailman/listinfo/freeipmi-devel




