Re: Puzzler: Can cfengine replace make?

From: Ted Zlatanov
Subject: Re: Puzzler: Can cfengine replace make?
Date: Thu, 08 Nov 2001 07:49:16 -0500
User-agent: Gnus/5.090004 (Oort Gnus v0.04) Emacs/21.1 (i386-redhat-linux-gnu)

Steve Traugott <address@hidden> writes:

> Steve's (or was it Joel's?) First Law of Infrastructure(TM) is:
>   Never log into a machine to manage it.  Always make that change in a
>   central repository and let it propagate out.

That's wonderful when things work well, but it can backfire quickly if
you propagate a bad configuration that damages systems.  I prefer to
synchronize the *configuration* automatically, but then I split it
into the automatic and the manual pieces (cfengine, specifically).

The automatic piece runs every minute at our site, and does
indispensable things like monitoring processes.  Extreme care is taken
when changing the automatic (cf.minute) configuration.

The manual piece is run about once a day, and does everything else.
Instead of cfd, we ssh into each machine individually and run that
manual piece (we automate the ssh process, but it's still initiated
and monitored by a person).  Though it may seem wrong to do so, using
ssh actually slows down the process enough that you can see
immediately if something is wrong and abort the process.  This
particular slowdown is probably something not everybody will like, but
it suits my mentality very well (better to go slow than to make a lot
of changes quickly).
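
A minimal sketch of that kind of supervised, one-host-at-a-time run,
written in Python purely for illustration.  The function name, the host
list, and the remote command are hypothetical; the only point is that
the hosts are visited in sequence and the run stops at the first
failure, so the person watching can step in.

```python
import subprocess


def run_manual_pass(hosts, command, runner=subprocess.run):
    """Run `command` over ssh on each host in turn, stopping at the first failure.

    Returns the list of hosts that completed successfully.
    `runner` is injectable so the loop can be exercised without real ssh.
    """
    done = []
    for host in hosts:
        print(f"== {host}")
        # Output is not captured, so the operator sees it as it happens.
        result = runner(["ssh", host, command])
        if result.returncode != 0:
            print(f"{host} failed (exit {result.returncode}); stopping")
            break
        done.append(host)
    return done
```

Calling run_manual_pass(["host1", "host2"], "some-remote-command") goes
through the hosts strictly in order; an operator can also interrupt it
by hand at any point.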

>   Treat your infrastructure as one big enterprise-wide cluster, or
>   "virtual machine".  Stop thinking of it as individual hosts.  Manage
>   the virtual machine.

That's an interesting thought.  It's harder to do this when machines
are completely different in purpose, but I see your point and agree
with it.  cfengine makes this easy with classes - a machine can be in
any number of classes (possibly overlapping with other machines), and
actions are taken on the basis of classes, not individual machines.

> If you use a management tool to manage your virtual machine for you,
> and then you manually go in there and undercut the management tool by
> touching individual hosts, then you are, yourself, inducing
> divergence.  You are failing to tell your management tool what your
> intent is.  Either your tool or your infrastructure will bite you
> later (followed shortly by your users).

In an emergency, this can't be helped.  I prefer to let cfengine drift
machines to a known state, rather than expect machines to be pristine.

> By bypassing your management tool, you are also failing to build
> reproducibility -- how did the virtual machine get to where it is
> today, and how can I do it again?  The configuration files that you
> feed into your management tool can be your best documentation.

Agreed, but again, you can never go back in time, and you can never
reproduce things exactly.  It's a fact of life.  Disk drives fail,
networks fail, processors fail, users do stupid things.  The cfengine
configuration should both build a machine reproducibly, *and* drift a
machine to a known state.

>   If you're doing things right, you should be able to grab a random
>   machine that you aren't backing up, and throw it out of a 10th-floor
>   window, without losing sysadmin work or impacting users for more
>   than 10 minutes.
> Very few infrastructures can pass this test.  User data can be
> mirrored across SAN, NBD, or other interhost mirroring devices.  HA
> clusters can handle failover.  

For a lot of money, yes.  On a budget, a lot of wonderful technologies
are impractical (and this includes open source software that needs a
lot of configuration and attention).

>> Not with cfengine as I know it.  cfengine, at its core, doesn't trust
>> anything that it doesn't directly examine.  It needs to have some kind
>> of "is this done" flag, and the ability to "tokenize" arbitrary
>> stanzas into atomic actions (so they could have the flag attached).
>> That implies a brand new storage paradigm for cfengine; some kind of
>> persistent trusted state storage.  It currently doesn't have anything
>> like that.  Mark -- having described it, this sounds pretty easy...? :)
> I think that's a good description -- persistent storage of class
> state.  It would need to be infinitely persistent, lasting as long as
> the /var filesystem does.  I think I saw something like long-term
> persistence in the Cfengine 2 doc, but it looked like it had a max
> lifetime on it -- 90 days, I think?

Even if this is not built into cfengine, it's almost trivial to do
with a cfengine module and a little Perl.

Pass a class name to the module.  The first time it's run, the class
name is defined back for cfengine, and inserted in a file/database for
quick access the next time.  Then every time afterwards, the module
will just exit silently if the class is already defined.  You could
even go nuts and pass the persistence file/database to the module, and
define the expiration by something other than number of runs,
e.g. time elapsed, system state, etc.

I can write this, if anyone is interested.
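
As a rough sketch of the idea (in Python rather than Perl, and not a
finished module): the function below records each class name in a
plain-text state file and reports it only the first time.  It assumes
the cfengine module convention that an output line beginning with "+"
defines a class.  The state file path and all names here are made up
for illustration.

```python
from pathlib import Path

# Hypothetical location for the persistent state; any path on /var would do.
DEFAULT_STATE = "/var/cfengine/state/once_classes"


def define_once(class_name, state_file=DEFAULT_STATE):
    """Return the lines a module should print for cfengine.

    First call for a class: record it and return ["+<class>"].
    Later calls for the same class: return [] (stay silent).
    """
    state_file = Path(state_file)
    seen = set()
    if state_file.exists():
        seen = set(state_file.read_text().split())
    if class_name in seen:
        return []
    state_file.parent.mkdir(parents=True, exist_ok=True)
    with state_file.open("a") as fh:
        fh.write(class_name + "\n")
    # Assumed module syntax: a leading "+" asks cfengine to define the class.
    return ["+" + class_name]


def main(argv):
    """Entry point for a module script: argv is [CLASSNAME] or [CLASSNAME, STATEFILE]."""
    for line in define_once(*argv[:2]):
        print(line)
```

A real module script would call main(sys.argv[1:]).  Expiration by
elapsed time or system state would mean storing more than the bare
class name, for example a timestamp next to each entry.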

