help-cfengine

Re: Bogus file content on copies


From: Mark . Burgess
Subject: Re: Bogus file content on copies
Date: Thu, 17 Jul 2003 20:43:23 +0200 (MEST)

Access denied can sometimes happen if there is a DNS timeout. We see this
occasionally on Solaris and have no explanation for it.
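
One rough way to check whether lookups are intermittently slow or failing is to time repeated resolutions of the client name. The following is a minimal Python sketch, not part of cfengine; the function names, the attempt count and the one-second threshold are all assumptions chosen for illustration. The resolver is passed in as a parameter so a different lookup function can be substituted.

```python
import socket
import time


def timed_lookup(hostname, resolver=socket.gethostbyname):
    """Resolve hostname once; return (address_or_None, seconds_elapsed)."""
    start = time.monotonic()
    try:
        address = resolver(hostname)
    except OSError:
        # socket.gaierror and socket.herror are subclasses of OSError
        address = None
    return address, time.monotonic() - start


def slow_or_failed(hostname, attempts=5, threshold=1.0,
                   resolver=socket.gethostbyname):
    """Return (attempt_index, address, elapsed) for every lookup that
    failed or took longer than threshold seconds."""
    problems = []
    for i in range(attempts):
        address, elapsed = timed_lookup(hostname, resolver)
        if address is None or elapsed > threshold:
            problems.append((i, address, elapsed))
    return problems
```

Calling slow_or_failed("client.my.domain.com") on the server at the times the rejections appear in the log would show whether any single lookup stalls, even when most of them are answered from the nscd cache.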

These problems with the file contents are a pathology that I
cannot explain, and no debugging tool I have tried finds a
fault. The only way this can happen is if file descriptors get
confused, but that should not be possible unless there is some
weird corruption of the process. Since no corruption has been
detected, I am at a loss.
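
Separately from the file-content problem, the truss output quoted below shows the hung cfagent sleeping in recv(). The following is a minimal Python sketch, not cfengine's actual code, of how a receive can be bounded by a timeout so that a silent peer cannot block the reader forever. The function name recv_with_timeout is hypothetical; the 10-second default mirrors the Timeout value Steve mentions, and the 1397-byte read size is taken from the trace.

```python
import socket


def recv_with_timeout(sock, nbytes=1397, timeout=10.0):
    """Read up to nbytes from sock.

    Returns the bytes received (b"" if the peer closed the connection),
    or None if nothing arrived within timeout seconds.
    """
    sock.settimeout(timeout)  # applies to subsequent blocking calls on sock
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        # No data arrived in time; the caller can close the socket and
        # give up instead of sleeping in recv() indefinitely.
        return None
```

A caller that gets None back can log the stall and close the connection, which would let a cfrun-style parent reclaim the child instead of waiting on it.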

M

On 17 Jul, Ferguson, Steve wrote:
> I'm seeing these log entries from cfservd on my server machine, which seem
> to correspond with the file copying problems.
> 
> Jul 17 12:05:19 bigbox cfservd[4069]: [ID 702911 daemon.notice] Host
> authorization/authentication failed or access denied
> Jul 17 12:05:19 bigbox cfservd[4069]: [ID 702911 daemon.notice] From
> (host=client.my.domain.com,user=root,ip=::ffff:xx.yy.zz.48)
> 
> Is there any reason a host would be authorized one minute, then rejected in
> the next minute?  DNS lookups seem to be consistent.  We don't have any
> round-robin hosts in the batch and I have yet to see a lookup fail.  nscd is
> running and caching most lookups anyway.
> 
> Steve
> 
>> -----Original Message-----
>> From: Ferguson, Steve 
>> Sent: Thursday, July 17, 2003 10:34 AM
>> To: 'address@hidden'
>> Subject: RE: Bogus file content on copies
>> 
>> 
>> Another related concern: I read that the default Timeout for 
>> network connections is 10 seconds.  Yet, as I see this 
>> problem occurring, it gradually eats up all the child 
>> processes I have configured in cfrun.hosts and ends up 
>> blocking the entire cfrun from completing.  Some of the 
>> cfrun-triggered cfagent processes on the clients have stayed 
>> around for up to 10 minutes before I've killed them by hand.
>> 
>> Either nothing is occurring to terminate the network 
>> connection, the cfagent network connection is dying and 
>> cfagent itself is hanging internally, or something is 
>> blocking so that the cfagent can't die (though a standard 
>> SIGTERM works by hand).
>> 
>> Steve
>> 
>> > -----Original Message-----
>> > From: Ferguson, Steve 
>> > Sent: Thursday, July 17, 2003 9:29 AM
>> > To: 'address@hidden'; address@hidden
>> > Cc: Ferguson, Steve; address@hidden
>> > Subject: RE: Bogus file content on copies
>> > 
>> > 
>> > This morning I came in and ran cfrun with no arguments, to 
>> > hit all of the servers (over 130).  I had 45 hangs before I 
>> > gave up and interrupted cfrun, all with this same file left 
>> > in /var/cfengine/inputs/cfagent.conf.cfnew.  As a first step, 
>> > I ran truss on the hung cfagent process on several of the 
>> > boxes.  They were all hung here:
>> > 
>> > 13945:  recv(5, 0x00108528, 1397, 0)    (sleeping...)
>> > 13945:  signotifywait()                 (sleeping...)
>> > 13945:  door_return(0x00000000, 0, 0x00000000, 0) (sleeping...)
>> > 13945:  lwp_cond_wait(0xFEED5548, 0xFEED5558, 0xFEECEDB0) (sleeping...)
>> > 
>> > I don't know if that's of any help.
>> > 
>> > After running cfrun a second time, it went through every 
>> > machine cleanly.  I've been able to do a clean cfrun several 
>> > times since then this morning.  I'm going to leave it alone 
>> > for an hour and try again, to see if there's some sort of 
>> > "first time in" condition that's causing a problem.  I'm 
>> > starting to suspect an issue with the central server rather 
>> > than any of the individual clients.
>> > 
>> > Steve
>> > 
>> > > -----Original Message-----
>> > > From: address@hidden [mailto:address@hidden]
>> > > Sent: Thursday, July 17, 2003 3:59 AM
>> > > To: address@hidden
>> > > Cc: address@hidden; address@hidden;
>> > > address@hidden
>> > > Subject: Re: Bogus file content on copies
>> > > 
>> > > 
>> > > 
>> > > I don't even understand how this *could happen*, so any
>> > > details you can find out would be useful,
>> > > 
>> > > thanks
>> > > M
>> > > 
>> > > On 16 Jul, Jeremy 'Circ' Charles wrote:
>> > > > On Wed, 2003-07-16 at 12:13, address@hidden wrote:
>> > > >> I have not seen this for many years: Might be something to do with
>> > > >> threading libraries. Please try to reproduce this running it in
>> > > >> a debugger. Only a stack error could cause something like this.
>> > > > 
>> > > > I'd be curious to know what platform Steve encountered this problem on.
>> > > > 
>> > > > I have cfagent doing a TON of work as part of my RedHat 9 installation
>> > > > procedure and in maintaining the machines thereafter.  Yesterday I had
>> > > > one cfagent run hang after dropping content just like what Steve pointed
>> > > > out in a file:
>> > > > 
>> > > >> address@hidden:inputs# more cfservd.conf
>> > > >> t 2048BAD: Host authentication failed. Did you forget the domain name?
>> > > > 
>> > > > In my case, it was a different file, but my recollection of the bogus
>> > > > content is just like the above.
>> > > > 
>> > > > It has only happened once that I'm aware of.  I wrote it off as a fluke,
>> > > > blew away the goobered file on the target machine and started over.  All
>> > > > was well after that.  :-)
>> > > > 
> 
> 
> _______________________________________________
> Help-cfengine mailing list
> address@hidden
> http://mail.gnu.org/mailman/listinfo/help-cfengine



~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Work: +47 22453272            Email:  address@hidden
Fax : +47 22453205            WWW  :  http://www.iu.hio.no/~mark
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~




