freeipmi-users




From: Albert Chu
Subject: Re: [Freeipmi-users] ipmiconsole: "could not set errnum: current = 31, desired = 25"
Date: Tue, 12 Aug 2014 10:40:11 -0700

Responses below

On Tue, 2014-08-12 at 10:11 +1000, David O'Shea wrote:
> Hi Al,
> 
> Thanks for the response, please see below:
> 
> > Subject: Re: [Freeipmi-users] ipmiconsole: "could not set errnum: current = 31, desired = 25"
> > From: address@hidden
> > To: address@hidden
> > CC: address@hidden
> > Date: Mon, 11 Aug 2014 16:24:00 -0700
> > 
> > Hi David,
> > 
> [...]
> > > (ipmiconsole_processing.c, _sol_bmc_to_remote_console_packet, 2801):
> > > hostname=silicon-bmc.adl.quantum.com; protocol_state=9h: scbuf_write:
> > > dropped data: dropped=14
> > > (ipmiconsole_processing.c, _process_ctx, 4005):
> > > hostname=silicon-bmc.adl.quantum.com; protocol_state=Ah: closing
> > > session due to session timeout
> > > (ipmiconsole_ctx.c, ipmiconsole_ctx_set_errnum, 1396): could not set
> > > errnum: current = 31, desired = 25
> > 
> > Interesting, it appears that the buffer (scbuf_write) overflowed its
> > max, which defaults to 16K. That's what likely caused the internal
> > error. I guess there was a ton of data coming out. It's possible if
> > the buffer were increased you may not see the problem anymore. The max
> > was picked a long time ago and perhaps BMCs are simply able to pump out
> > more data than they used to.
> 
> I assume this is the CONSOLE_BMC_TO_REMOTE_CONSOLE_BUF_MAX definition,
> and I'd have to recompile to override this?

Yeah, although I'd say go ahead and double all the buffers there.  It
won't make a huge difference anyways.  Heck, even quadrupling it to 64K
probably would be fine too.
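
To illustrate why a slow reader leads to "dropped data", here is a minimal sketch of a fixed-capacity buffer that discards whatever does not fit. This is not libipmiconsole's actual implementation; SKETCH_BUF_MAX, struct bounded_buf and both functions are hypothetical names, and the real library may use a different drop policy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical capacity, chosen to mirror the 16K default mentioned above. */
#define SKETCH_BUF_MAX (16 * 1024)

struct bounded_buf {
    char   data[SKETCH_BUF_MAX];
    size_t used;                 /* bytes currently queued */
};

/* Producer side: append up to len bytes.
 * Returns how many bytes had to be dropped because the buffer was full. */
size_t bounded_buf_write(struct bounded_buf *b, const char *src, size_t len)
{
    size_t room = SKETCH_BUF_MAX - b->used;
    size_t take = len < room ? len : room;

    memcpy(b->data + b->used, src, take);
    b->used += take;
    return len - take;
}

/* Consumer side: remove up to cap bytes from the front of the buffer.
 * Returns the number of bytes copied into dst. */
size_t bounded_buf_read(struct bounded_buf *b, char *dst, size_t cap)
{
    size_t take = b->used < cap ? b->used : cap;

    memcpy(dst, b->data, take);
    memmove(b->data, b->data + take, b->used - take);
    b->used -= take;
    return take;
}
```

In this sketch, data is lost only while the consumer lags behind the producer. A larger capacity tolerates longer stalls, but it does not help if the reader is consistently slower than the BMC.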

> Could the problem be that I'm not reading from the file descriptor
> fast enough?  

Actually, that is a very likely reason.

> I have the FD set to non-blocking mode and I have a thread that
> basically just reads up to 2K from the FD. If it times out it
> sleeps for ~10ms and tries again; otherwise it does a bit of
> processing and then writes the data to a pipe to another thread. I
> suppose the write to the pipe could block if the other thread is
> too slow or something.

Why not use select() or poll()?  That way you can just read from the FD
whenever there is data.  The writer of Conman effectively does this and
he's never had this issue.
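
A minimal sketch of the poll() approach, assuming only that you have a readable file descriptor (for example the one libipmiconsole hands you). The helper name read_when_ready and the READ_TIMEOUT code are made up for illustration and are not part of libipmiconsole.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical return code for "nothing arrived within timeout_ms". */
enum { READ_TIMEOUT = -2 };

/* Block in poll() until fd is readable (or timeout_ms elapses), then read
 * whatever is available, up to cap bytes.
 * Returns: >0 bytes read, 0 on EOF, -1 on error (errno set),
 *          READ_TIMEOUT if nothing arrived in time. */
ssize_t read_when_ready(int fd, char *buf, size_t cap, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc;

    /* Retry if a signal interrupts the wait. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;
    if (rc == 0)
        return READ_TIMEOUT;

    /* The fd reported POLLIN, POLLHUP or POLLERR; read() tells us which:
     * data (>0), end of file (0) or an error (-1). */
    ssize_t n;
    do {
        n = read(fd, buf, cap);
    } while (n < 0 && errno == EINTR);
    return n;
}
```

Compared with reading and then sleeping ~10ms on a timeout, this wakes up as soon as data arrives, so the reader spends less time idle while data is waiting. A return of 0 is the EOF case discussed elsewhere in this thread; on a non-blocking FD a -1 with errno set to EAGAIN just means "try again".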

> 
> > > Am I using libipmiconsole incorrectly if I'm just reading from the
> > > file descriptor waiting for EOF to indicate an error?
> > 
> > Nope, that's correct usage. Your situation is admittedly unique. Is
> > this something that can't be programmed around? Or is this more just
> > an "interesting thing that happened"?
> 
> I can't really program around it reliably - I could add some logic to
> reconnect if I lose the connection, but if the message I'm waiting for
> happened to appear while I was disconnected, that would be bad.

Yeah, unfortunately there's not much we can do here.  IPMI being UDP
based, sometimes timeouts can just happen.  Other issues independent of
IPMI also happen.  We found some motherboards will reset the internal
IPMI chip during a reboot, so the connection goes away during a reboot
anyways :P  The writer of Conman deals with these errors appropriately
whenever they are encountered.

> > The output you show above is the kind of debugging output you should
> > see. The INTERNAL_ERROR was likely caused by the
> > _sol_bmc_to_remote_console_packet message.
> 
> Ahh ok thanks.
> 
> Perhaps I should mention some other details now in case they are
> relevant: my second thread acts a bit like 'expect' - waits for a
> message, writes a response/command to the libipmiconsole file
> descriptor, waits for the next message, etc.  I found that if I tried
> to do 4K reads from the FD in the first thread, whilst the write calls
> in the second thread did (if I recall correctly) return, I didn't get
> the expected responses coming back from the iDRAC in the read thread.
> I didn't dig any deeper since things generally seem to be okay with 2K
> reads.
> 
> Thanks!
> David
> 
-- 
Albert Chu
address@hidden
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory




