freeipmi-users

Re: [Freeipmi-users] pstdout_launch: unknown internal error encountered


From: Al Chu
Subject: Re: [Freeipmi-users] pstdout_launch: unknown internal error encountered with 319 hosts and --consolidate-output --quiet-readings
Date: Wed, 14 Oct 2009 16:39:58 -0700

Hey Chris,

I went ahead and put a beta up.

freeipmi-0.7.14.beta0.tar.gz

at

http://ftp.gluster.com/pub/freeipmi/qa-release/

Want to give it a shot?

Al

On Wed, 2009-10-14 at 16:00 -0700, Al Chu wrote:
> Hey Chris,
> 
> Just spoke to the maintainer of the internal "hostlist" library.  Short
> term, I can build you a beta that can get around the problem.  However,
> you will not get a nice
> 
> xxxx[0001-319]-lom
> 
> output, you would instead get
> 
> xxxx0001-lom,xxxx0002-lom,xxxx0003-lom,...
> 
> The issue was there was a buffer overflow.  My buffer was 4096 chars,
> which sure enough is overflowed after about 316 nodes in your format (13
> chars * 316 > 4096).
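
The arithmetic above can be checked with a minimal Python sketch. It assumes the unconsolidated output is a plain comma-separated list of names shaped like xxxx0001-lom (12 characters each, plus one comma between names) and that the limit is the 4096-character buffer mentioned above. The function names are hypothetical and are not taken from FreeIPMI or the hostlist library.

```python
BUFFER_SIZE = 4096  # buffer size quoted in the message


def flat_list_length(count, prefix="xxxx", width=4, suffix="-lom"):
    """Length of 'xxxx0001-lom,xxxx0002-lom,...' for `count` hosts."""
    names = [f"{prefix}{i:0{width}d}{suffix}" for i in range(1, count + 1)]
    return len(",".join(names))


def first_overflowing_count(buffer_size=BUFFER_SIZE):
    """Smallest host count whose flat list no longer fits in the buffer."""
    count = 1
    while flat_list_length(count) <= buffer_size:
        count += 1
    return count


if __name__ == "__main__":
    print(flat_list_length(315))      # 4094, still fits
    print(flat_list_length(316))      # 4107, too long
    print(first_overflowing_count())  # 316
```

Under these assumptions the list stops fitting at 316 hosts, which matches the estimate in the message. The exact failure point in practice depends on how hosts are grouped when output is consolidated.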
> 
> Now why was there a buffer overflow?  The hostrange library currently
> can't deal with hostrange "building" (which is what is done when outputs
> are being consolidated) when the host has a suffix (e.g. "-lom").
> However, I spoke to the author and if there is only 1 "numeric range",
> such as in your case, perhaps that can be handled as a special case,
> since there is no ambiguity of how to build up the hostrange.  The
> suffix situation is a unique situation, most commonly seen with a format
> like:
> 
> node1-eth2
> 
> In the above, it is impossible to know whether the '1' or the '2' is the
> hostrange part (although normal users can easily guess that it's the '1'
> and not the '2', code-wise you really never know).
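
The ambiguity described above can be illustrated with a short Python sketch. It simply counts the runs of digits in a hostname: with exactly one run there is only one place a range could go, with two or more the code cannot tell which one is the range. This is an illustration only, not how the hostlist library decides; the function names are hypothetical.

```python
import re


def numeric_fields(hostname):
    """Return (offset, digits) for every run of digits in the hostname."""
    return [(m.start(), m.group()) for m in re.finditer(r"\d+", hostname)]


def is_unambiguous(hostname):
    """True when there is exactly one candidate for the hostrange part."""
    return len(numeric_fields(hostname)) == 1


if __name__ == "__main__":
    for name in ("xxxx0001-lom", "node1-eth2"):
        print(name, numeric_fields(name), is_unambiguous(name))
```

A name like xxxx0001-lom has a single digit run, so it could be handled as the special case mentioned above, while node1-eth2 has two candidates.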
> 
> On your end, a short term way to deal with this problem and have a clean
> output is to perhaps come up with a different host alias?  Here at LLNL,
> we prefix all IPMI addresses with a unique prefix.
> 
> Hope that helps short term and hopefully we can get a fix longer term.
> 
> Al
> 
> On Wed, 2009-10-14 at 10:41 -0700, Al Chu wrote: 
> > Hey Chris,
> > 
> > I've reproduced this problem in the underlying hostlist library.  I'm
> > working with the maintainer of the library to figure out if there is a
> > bug or if there is a hostrange assumption issue.  I noticed your range
> > input was:
> > 
> > 0001-319
> > 
> > which internally in hostlist will lead to
> > 
> > 0001-0319
> > 
> > Is your intent for xxxx[0001-319] to lead to xxxx0318, xxxx0319, etc.?
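
To make the question concrete, here is a minimal Python sketch of expanding a single bracketed range. It assumes the zero-padding width is taken from the lower bound, which is consistent with 0001-319 being treated as 0001-0319 as described above. It is not the hostlist library's implementation, handles only one [low-high] range, and the function name is hypothetical.

```python
import re


def expand_hostrange(pattern):
    """Expand 'prefix[low-high]suffix' into a list of hostnames.

    Assumption: every number is zero-padded to the width of the lower bound.
    """
    m = re.fullmatch(r"([^\[\]]*)\[(\d+)-(\d+)\]([^\[\]]*)", pattern)
    if m is None:
        raise ValueError(f"unsupported hostrange: {pattern!r}")
    prefix, low, high, suffix = m.groups()
    width = len(low)
    return [
        f"{prefix}{i:0{width}d}{suffix}"
        for i in range(int(low), int(high) + 1)
    ]


if __name__ == "__main__":
    hosts = expand_hostrange("xxxx[0001-319]-lom")
    print(len(hosts), hosts[0], hosts[-1])
```

Under that padding assumption, xxxx[0001-319]-lom expands to 319 names, from xxxx0001-lom through xxxx0319-lom.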
> > 
> > Inputting the latter also seems to cause an error, so there probably is a
> > bug somewhere, be it an input checking bug or an output bug.
> > 
> > Al
> > 
> > On Wed, 2009-10-14 at 09:43 -0700, Al Chu wrote:
> > > Hey Chris,
> > > 
> > > On Wed, 2009-10-14 at 11:53 -0400, Chris Harwell wrote:
> > > > Greetings freeipmi users,
> > > > 
> > > > I've really enjoyed using freeipmi - it is a great tool. I
> > > > particularly like how the host range syntax works and simplifies
> > > > certain tasks.
> > > 
> > > Thanks.
> > > 
> > > > I've recently run into a case where freeipmi fails and hope you can
> > > > offer some help or advice.
> > > > 
> > > > This fails:
> > > > >  ipmi-sensors -g Fan -h xxxx[0001-319]-lom --consolidate-output --quiet-readings
> > > > also where the second number is 319 fails.
> > > > 
> > > > These invocations work:
> > > >   ipmi-sensors -g Fan -h xxxx[0001-319]-lom --consolidate-output
> > > >   ipmi-sensors -g Fan -h xxxx[0001-318]-lom --consolidate-output
> > > > --quiet-readings
> > > >   ipmi-sensors -g Fan -h xxxx[0001-319]-lom
> > > >   ipmi-sensors -g Fan -h xxxx[0001-318]-lom --consolidate-output
> > > >   ipmi-sensors -g Fan -h xxxx[0001-319]-lom --consolidate-output
> > > >   ipmi-sensors -g Fan -h xxxx[0001-319]-lom --quiet-readings
> > > > 
> > > > > When it fails, the output looks like this:
> > > > > $  ipmi-sensors -g Fan -h xxxx[0001-319]-lom --consolidate-output --quiet-readings
> > > > pstdout_launch: unknown internal error
> > > > 
> > > > > I encounter this in the several versions I could check quickly
> > > > > (0.6.5, 0.7.12 and 0.7.13):
> > > > :bin$ ipmi-sensors -V
> > > > ipmi-sensors - 0.7.13
> > > > Copyright (C) 2003-2008 FreeIPMI Core Team
> > > > > This program is free software; you may redistribute it under the terms of
> > > > > the GNU General Public License.  This program has absolutely no warranty.
> > > > drdenws02:bin$  ipmi-sensors -g Fan -h drdb[0001-319]-lom -u ADMIN -p
> > > > ADMIN --consolidate-output --quiet-readings
> > > > pstdout_launch: unknown internal error
> > > > 
> > > > debug output is copious, the last bit looks like this:
> > > > xxxx0317-lom: IPMI Command Data:
> > > > xxxx0317-lom: ------------------
> > > > xxxx0317-lom: [              3Ch] = cmd[ 8b]
> > > > xxxx0317-lom: [               0h] = comp_code[ 8b]
> > > > xxxx0317-lom: IPMI Trailer:
> > > > xxxx0317-lom: --------------
> > > > xxxx0317-lom: [              23h] = checksum2[ 8b]
> > > > pstdout_launch: unknown internal error
> > > > 
> > > > Please advise  - am I running into a known limitation or just using
> > > > this wrong? Is there other information I ought to provide?
> > > 
> > > In all likelihood there is some corner case in the hostrange parsing.
> > > I'll take a look into it and get back to you if I need any more info.
> > > 
> > > Thanks,
> > > Al
> > > 
> > > > Thanks in advance,
> > > > Chris Harwell
> > > > 
> > > > 
> > > > 
-- 
Albert Chu
address@hidden
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory




