help-cfengine

cfengine works unreliably


From: Akop Pogosian
Subject: cfengine works unreliably
Date: Fri, 10 Oct 2003 20:41:20 -0700
User-agent: Mutt/1.5.4i

I have been observing this problem for a while now, but I have lost
track of whether it was also happening before we upgraded to
cfengine 2.0.8p1. Once in a while, machines where cfagent used to
work fine get denied by cfservd in the middle of a cfagent run, after
cfagent has already authenticated and even downloaded some files
from the server. This results in corrupted file transfers,
inconsistent configurations, etc. I can't reproduce this problem
easily, but it seems to happen about every fourth time I try to
"localize" a vanilla machine with cfengine. When this happens,
cfagent produces messages that look like this:

Checking copy from 
serverhost.berkeley.edu:/home/install/cfengine/publicfiles/depot-solaris-9/openssh-3.7.1p2
 to /opt/local/depot/openssh-3.7.1p2

cfengine:clienthost: Transmission refused or failed statting 
/home/install/cfengine/publicfiles/depot-solaris-9/openssh-3.7.1p2
Got:  cfengine:clienthost: Can't stat 
/home/install/cfengine/publicfiles/depot-solaris-9/openssh-3.7.1p2 in copy

At the same time, the following is produced by cfservd:

Oct  8 19:24:42 serverhost cfservd[446]: [ID 702911 daemon.notice] Host
authorization/authentication failed or access denied
Oct  8 19:24:42 serverhost cfservd[446]: [ID 702911 daemon.notice] From
(host=clienthost,user=root,ip=::ffff:xxx.xxx.xx.xxx)

This looks very similar to the problem reported by Brian Seppanen
yesterday except that we're running cfengine 2.0.8p1 on Solaris 9.

Usually, cfagent works fine on the next run. Fortunately, we haven't
run into a situation where cfagent corrupts its own executables or
configuration files. However, the workstations sometimes fail to
"heal" themselves fully after such failures because, for lots of
files, we're using mtime to determine whether they need to be copied,
and the corrupted files on the local disk have a new mtime. Because of
this, I am considering switching to checksums for all file
comparisons.
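
To illustrate why the mtime comparison fails to heal a truncated file
while a checksum comparison would catch it, here is a minimal Python
sketch. It is not cfengine code: the function names, the file names and
the use of SHA-256 are assumptions made for the example, and the mtime
rule shown (copy only when the source is newer than the destination) is
my understanding of the behaviour rather than something taken from the
cfengine source.

```python
import hashlib
import os
import tempfile


def needs_copy_mtime(src, dst):
    """Assumed mtime rule: copy only if dst is missing or older than src."""
    if not os.path.exists(dst):
        return True
    return os.path.getmtime(src) > os.path.getmtime(dst)


def file_digest(path):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def needs_copy_checksum(src, dst):
    """Copy if dst is missing or its contents differ from src."""
    if not os.path.exists(dst):
        return True
    return file_digest(src) != file_digest(dst)


def demo():
    """Simulate an interrupted transfer and report both decisions."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "source.bin")  # hypothetical file names
        dst = os.path.join(tmp, "dest.bin")
        with open(src, "wb") as f:
            f.write(b"complete file contents")
        with open(dst, "wb") as f:
            f.write(b"complete fi")  # truncated, as after a refused transfer
        # The source is old; the corrupted local copy carries a newer mtime.
        os.utime(src, (1_000_000_000, 1_000_000_000))
        os.utime(dst, (1_100_000_000, 1_100_000_000))
        return needs_copy_mtime(src, dst), needs_copy_checksum(src, dst)


if __name__ == "__main__":
    by_mtime, by_checksum = demo()
    print("mtime comparison says copy:", by_mtime)
    print("checksum comparison says copy:", by_checksum)
```

In this sketch the mtime check decides the local copy is up to date
because its timestamp is newer, while the checksum check flags it for
recopying. The trade-off is that checksumming reads every file on both
ends on each run, which costs more than a stat.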


-akop



