Re: cfagent hangs
Luke A. Kanies
Thu, 4 Dec 2003 12:26:47 -0600 (CST)
On Mon, 24 Nov 2003, Jeff Wasilko wrote:
> I've been having problems with cfagent hanging for multiple days.
> It's usually started by some sort of network problem (we've had a
> bit of instability here that we've traced down to a failing gigE
> cfagent is started by cfexecd. Is there any way to get cfexecd to
> kill the wedged cfagent?
> lexx 7 ># ps -ef | grep cfagent
> root 17435 375 0 Nov 22 ? 0:04
> lexx 8 ># truss -p 17435
> recv(8, 0xFFBF2618, 8, 0) (sleeping...)
> It seems to be hung in a copy of a big tree (pushing out our
> /usr/local equivalent):
> This is the mail I got from cfengine when I killed the hung
> cfengine:lexx: Received signal 15 (SIGTERM) while doing
> cfengine:lexx: Logical start time Sat Nov 22 16:20:34 2003
> cfengine:lexx: This sub-task started really at Sat Nov 22 16:20:34 2003
[obviously, I'm catching up on email]
I had a problem similar to this. It was somehow related to a bad compile
of cfengine and BerkeleyDB; I don't know what went wrong, but eventually
cfagent would hang forever trying to create locks in the lock_db file.
And I mean forever; I'm talking fork bomb.
It would be nice if cfexecd were configurable to kill child processes
after a certain amount of time; I would settle for a hard-coded value, but
a configurable one would be best. I think an hour is reasonable, but four
might be better for the general case.
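
A minimal sketch of that kind of watchdog, written in Python and not taken
from cfexecd itself (which, per the above, does not do this). The function
name, the grace period, and the idea of wrapping the agent in a separate
script are all assumptions for illustration:

```python
import subprocess


def run_with_timeout(cmd, timeout_seconds, grace_seconds=5):
    """Run cmd and kill it if it runs longer than timeout_seconds.

    Returns (exit_code, timed_out). This is a hypothetical wrapper,
    not part of cfengine.
    """
    proc = subprocess.Popen(cmd)
    try:
        # Normal case: the child finishes within the limit.
        return proc.wait(timeout=timeout_seconds), False
    except subprocess.TimeoutExpired:
        # Ask politely first (SIGTERM on Unix) so the child can clean up.
        proc.terminate()
        try:
            proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            # Still wedged: force it (SIGKILL on Unix) and reap it.
            proc.kill()
            proc.wait()
        return proc.returncode, True
```

Something like run_with_timeout(["cfagent"], 3600) would give the one-hour
limit; the bare "cfagent" command line here is only a placeholder for
however the agent is actually invoked on a given site.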
This was also version 2.0.8p1, but like I said, it was a bad compile. We
recompiled against BerkeleyDB 4.0.14 or something and it worked fine. And
this was only on AIX. I also had to go back and delete every db file on
every machine with this problem, as they were all irretrievably corrupt.
"But these [serious NT security flaws] are not inherent flaws in the
operating system -- they don't happen by accident. They are the result
of deliberate and well-thought-out efforts." --Mike Nash, Microsoft.
The _flaws_ are deliberate?