help-cfengine

Re: Bootstrapping


From: Luke A. Kanies
Subject: Re: Bootstrapping
Date: Thu, 19 Feb 2004 11:05:19 -0600 (CST)

On Thu, 19 Feb 2004 address@hidden wrote:

> Can you give an example?  I thought the idea of convergence is that
> there is no need for such things. Maybe if you just think in the
> "right" way you can already do this..

How do I know if all of my machines are updating themselves?  How do I
know if all of my machines have successfully copied all of their files,
started all of their processes, made all of their changes?  There are many
potential problems (issues with the cfengine keys, configurations for the
processes, nonexistent files) and although it's straightforward to have
cfengine email me, it's less straightforward to turn those emails into
meaningful information.

Convergent processes can only do so much; you also need mechanisms for
dealing with errors, and cfengine is largely lacking those.  It is
fault-tolerant to an extent, but it doesn't seem to deal much with
faults within the system or the configuration.

Most companies already have some kind of monitoring system, and those
monitoring systems usually have a red/yellow/green classification system
for all hosts.  I would consider the cfengine service on a host to be red
if the host is not updating itself at all, and yellow if there are other
problems.  I'd actually like greater granularity -- maybe a count of the
number of errors encountered -- but I'd accept the red/yellow/green system
to start.
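
As a rough sketch of that classification (not something cfengine provides),
the red/yellow/green rule could be written as below. The HostReport type and
its two fields are hypothetical; they only stand in for whatever per-host
status a monitor manages to collect.

```python
from dataclasses import dataclass


@dataclass
class HostReport:
    # Both fields are hypothetical; cfengine does not report them in this form.
    updating: bool    # did the host manage to update itself at all?
    error_count: int  # number of other errors seen during the run


def classify(report: HostReport) -> str:
    """Map a host report onto the red/yellow/green scheme."""
    if not report.updating:
        return "red"      # host is not updating itself at all
    if report.error_count > 0:
        return "yellow"   # updating, but with other problems
    return "green"
```

Keeping error_count alongside the color preserves the finer granularity
mentioned above for any NMS that can store a number as well as a state.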

How would one go about getting this information into an NMS?  At this
point, I have to have something monitoring the syslog files and/or
email, and this monitor has to differentiate between different types of
failures (key exchange failure, cfexecd failure, minor error, sshd
failure, etc), and then upload that to the NMS.
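
A minimal sketch of such a monitor's sorting step might look like the
following. It assumes the input lines have already been filtered down to
cfengine-related errors, and the regular expressions are placeholders: the
actual wording of the syslog messages is not given here, so they would need
to be replaced with patterns taken from real logs. The upload to the NMS is
left out entirely.

```python
import re
from collections import Counter

# Hypothetical patterns; replace with the real message text from your logs.
FAILURE_PATTERNS = [
    ("key_exchange", re.compile(r"key exchange", re.IGNORECASE)),
    ("cfexecd", re.compile(r"cfexecd", re.IGNORECASE)),
    ("sshd", re.compile(r"sshd", re.IGNORECASE)),
]


def classify_line(line: str) -> str:
    """Return the first failure type whose pattern matches the line."""
    for name, pattern in FAILURE_PATTERNS:
        if pattern.search(line):
            return name
    # Anything unrecognized is counted as a minor error.
    return "minor"


def summarize(lines):
    """Count failures by type across an iterable of error lines."""
    return Counter(classify_line(line) for line in lines)
```

The per-type counts returned by summarize are what would then be pushed to
the NMS, by whatever interface it offers.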

If cfengine had a better notion of failure, then this would be much easier.
Failure of some kind is absolutely inevitable, which makes it imperative
that we deal well with it.  See my earlier post about how I monitor syslog
output to get system status into LDAP.

Luke

-- 
I think that all good, right thinking people in this country are sick
and tired of being told that all good, right thinking people in this
country are fed up with being told that all good, right thinking people
in this country are fed up with being sick and tired.  I'm certainly
not, and I'm sick and tired of being told that I am.
                -- Monty Python
