help-cfengine

RE: Bogus file content on copies


From: Ferguson, Steve
Subject: RE: Bogus file content on copies
Date: Thu, 17 Jul 2003 10:34:27 -0400

Another related concern: I read that the default Timeout for network
connections is 10 seconds.  Yet, as I see this problem occurring, it
gradually eats up all the child processes I have configured in cfrun.hosts
and ends up blocking the entire cfrun from completing.  Some of the
cfrun-triggered cfagent processes on the clients have stayed around for up
to 10 minutes before I've killed them by hand.

Either nothing is occurring to terminate the network connection, the cfagent
network connection is dying and cfagent itself is hanging internally, or
something is blocking so that the cfagent can't die (though a standard
SIGTERM works by hand).
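
For illustration only, the difference between a bounded and an unbounded
socket read can be shown outside cfengine. This is a minimal Python sketch,
not cfengine's code: the helper name recv_with_timeout and the 0.2-second
value are hypothetical. It only shows that a read with a timeout set hands
control back to the caller when the peer stays silent, whereas a read with no
timeout would sleep indefinitely, much like the recv() in the truss output
quoted below.

```python
import socket

def recv_with_timeout(sock, nbytes, timeout_s):
    """Return the bytes received, or None if nothing arrives in timeout_s seconds.

    Hypothetical helper for illustration; not part of cfengine.
    """
    sock.settimeout(timeout_s)  # without this, recv() blocks until data or EOF
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        return None

if __name__ == "__main__":
    # A connected pair of local sockets stands in for client and server.
    client, server = socket.socketpair()
    # The "server" stays silent, so the read gives up after the timeout.
    print(recv_with_timeout(client, 1024, 0.2))
    # Once the "server" sends something, the same call returns the data.
    server.sendall(b"ok")
    print(recv_with_timeout(client, 1024, 0.2))
    client.close()
    server.close()
```

If the agent's reads were bounded this way, a stalled server would cost each
client at most the timeout rather than holding a cfrun child open.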

Steve

> -----Original Message-----
> From: Ferguson, Steve 
> Sent: Thursday, July 17, 2003 9:29 AM
> To: 'address@hidden'; address@hidden
> Cc: Ferguson, Steve; address@hidden
> Subject: RE: Bogus file content on copies
> 
> 
> This morning I came in and ran cfrun with no arguments, to 
> hit all of the servers (over 130).  I had 45 hangs before I 
> gave up and interrupted cfrun, all with this same file left 
> in /var/cfengine/inputs/cfagent.conf.cfnew.  As a first step, 
> I ran truss on the hung cfagent process on several of the 
> boxes.  They were all hung here:
> 
> 13945:  recv(5, 0x00108528, 1397, 0)    (sleeping...)
> 13945:  signotifywait()                 (sleeping...)
> 13945:  door_return(0x00000000, 0, 0x00000000, 0) (sleeping...)
> 13945:  lwp_cond_wait(0xFEED5548, 0xFEED5558, 0xFEECEDB0) (sleeping...)
> 
> I don't know if that's of any help.
> 
> After running cfrun a second time, it went through every 
> machine cleanly.  I've been able to do a clean cfrun several 
> times since then this morning.  I'm going to leave it alone 
> for an hour and try again, to see if there's some sort of 
> "first time in" condition that's causing a problem.  I'm 
> starting to suspect an issue with the central server rather 
> than any of the individual clients.
> 
> Steve
> 
> > -----Original Message-----
> > From: address@hidden [mailto:address@hidden
> > Sent: Thursday, July 17, 2003 3:59 AM
> > To: address@hidden
> > Cc: address@hidden; address@hidden;
> > address@hidden
> > Subject: Re: Bogus file content on copies
> > 
> > 
> > 
> > I don't even understand how this *could happen*, so any
> > details you can find out would be useful,
> > 
> > thanks
> > M
> > 
> > On 16 Jul, Jeremy 'Circ' Charles wrote:
> > > On Wed, 2003-07-16 at 12:13, address@hidden wrote:
> > >> I have not seen this for many years. It might be something to do
> > >> with threading libraries. Please try to reproduce this by running it
> > >> in a debugger. Only a stack error could cause something like this.
> > > 
> > > I'd be curious to know what platform Steve encountered this problem on.
> > > 
> > > I have cfagent doing a TON of work as part of my RedHat 9 installation
> > > procedure and in maintaining the machines thereafter.  Yesterday I had
> > > one cfagent run hang after dropping content just like what Steve pointed
> > > out in a file:
> > > 
> > >> address@hidden:inputs# more cfservd.conf
> > >> t 2048BAD: Host authentication failed. Did you forget the domain name?
> > > 
> > > In my case, it was a different file, but my recollection of the bogus
> > > content is just like the above.
> > > 
> > > It has only happened once that I'm aware of.  I wrote it off as a fluke,
> > > blew away the goobered file on the target machine and started over.  All
> > > was well after that.  :-)
> > > 
> > 
> > 
> > 
> > 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > Work: +47 22453272            Email:  address@hidden
> > Fax : +47 22453205            WWW  :  http://www.iu.hio.no/~mark
> > 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > 
> 



