Re: Puzzler: Can cfengine replace make?

From: Steve Traugott
Subject: Re: Puzzler: Can cfengine replace make?
Date: Wed, 7 Nov 2001 12:14:15 -0800
User-agent: Mutt/1.2.5i

On Tue, Nov 06, 2001 at 09:45:18PM -0600, address@hidden wrote:
> Yes -- the cfengine version requires more effort to specify.  So does
> your (make based) version, compared to the manual one.  How many times
> do you install HACMP?  Has it paid off in writing make stanzas over
> doing it manually, or doing it via a simple stupid shell script that
> you keep in /:/install_hacks/do_hacmp_install?  It's all a matter of
> ROI. :) (DFS fans? :)

Ah HAH!  Now I see.  We are talking about two totally different
management paradigms.  Allow me to dissertate...

Steve's (or was it Joel's?) First Law of Infrastructure(TM) is:

  Never log into a machine to manage it.  Always make that change in a
  central repository and let it propagate out.

I can't say that enough.  Everything I do is driven by that law.  It
leads directly into:

  Treat your infrastructure as one big enterprise-wide cluster, or
  "virtual machine".  Stop thinking of it as individual hosts.  Manage
  the virtual machine.

If you use a management tool to manage your virtual machine for you,
and then you manually go in there and undercut the management tool by
touching individual hosts, then you are, yourself, inducing
divergence.  You are failing to tell your management tool what your
intent is.  Either your tool or your infrastructure will bite you
later (followed shortly by your users).

By bypassing your management tool, you are also failing to build
reproducibility -- how did the virtual machine get to where it is
today, and how can I do it again?  The configuration files that you
feed into your management tool can be your best documentation.

Here's the test:

  If you're doing things right, you should be able to grab a random
  machine that you aren't backing up, and throw it out of a 10th-floor
  window, without losing sysadmin work or impacting users for more
  than 10 minutes.

Very few infrastructures can pass this test.  User data can be
mirrored across SAN, NBD, or other interhost mirroring devices.  HA
clusters can handle failover.  

But we as an industry are far from being able to preserve the
sysadmin's work.  I compare the sysadmin industry to the auto industry
of the early 1900s -- each machine is lovingly handcrafted by a
master, and each machine is different.  

The auto industry eventually woke up to the wonders of mass
production, and later mass customization using standard tooling --
we're not there yet.  

But if you use your management tool to install, build, and maintain
your machines, then you by definition have that mass customization
tooling in place.

By the way, automating HACMP installs has paid off handsomely -- a job
that normally takes IBM PSO 40 hours to do by hand now takes 2 hours.
I'm able to install and re-install cluster nodes at will, test to
destruction during production readiness proveouts, change a parameter
in a central config file and install hands-off again without pain.  I
was able to pass this on to someone else who had never done an HACMP
install, and his first cluster worked.  IBM PSO had quoted the client
their normal outrageous amount to do *one* unreproducible install, by
hand.  My family's business got the contract instead.

This HACMP code isn't a standalone set of scripts -- it's part of the
same environment that manages all of the other AIX machines for the
client.  Toggling a machine from one role to another is a one-line
change in a configuration file.
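
To make the "one-line change" concrete, here is a minimal sketch in
Python.  It is not the actual environment described above: the
"host role" file format, the role names, and the step names are all
hypothetical, invented for illustration.

```python
# Hypothetical mapping from role name to the ordered build steps for
# that role.  These names are illustrative assumptions only.
ROLE_STEPS = {
    "hacmp_node": ["base_os", "hacmp"],
    "web": ["base_os", "httpd"],
}


def parse_roles(text):
    """Parse lines of the assumed form 'hostname role', ignoring
    blank lines and anything after a '#'."""
    roles = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        host, role = line.split()
        roles[host] = role
    return roles


def steps_for(host, roles):
    """Return the build steps implied by the host's current role."""
    return ROLE_STEPS[roles[host]]
```

With a central file containing the line "node1 web", steps_for("node1",
roles) yields the web steps; editing that single line to
"node1 hacmp_node" is all it takes to change what the host is built as
on the next run.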

> Not with cfengine as I know it.  cfengine, at it's core, doesn't trust
> anything that it doesn't directly examine.  It needs to have some kind
> of "is this done" flag, and the ability to "tokenize" arbitrary
> stanzas into atomic actions (so they could have the flag attached).
> That implies a brand new storage paradigm for cfengine; some kind of
> persistant trusted state storage.  It currently doesn't have anything
> like that.  Mark -- having described it, this sounds pretty easy...? :)

I think that's a good description -- persistent storage of class
state.  It would need to be infinitely persistent, lasting as long as
the /var filesystem does.  I think I saw something like long-term
persistence in the Cfengine 2 doc, but it looked like it had a max
lifetime on it -- 90 days, I think?
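
For what it's worth, the kind of "is this done" flag being discussed
can be sketched roughly as below.  This is Python rather than cfengine
or make, and it is only an illustration under stated assumptions: the
DoneFlags class, the ".done" stamp-file naming, and the state
directory are hypothetical, and step names are assumed to be plain
filenames with no slashes.  The stamps have no expiry; they last as
long as the directory that holds them.

```python
import os


class DoneFlags:
    """Hypothetical persistent 'is this done' store: one empty stamp
    file per completed step, kept in a state directory."""

    def __init__(self, state_dir):
        self.state_dir = state_dir
        os.makedirs(state_dir, exist_ok=True)

    def _path(self, step):
        # Assumes 'step' is a simple name usable as a filename.
        return os.path.join(self.state_dir, step + ".done")

    def is_done(self, step):
        return os.path.exists(self._path(step))

    def run_once(self, step, action):
        """Run action() only if the step has no stamp yet.  The stamp
        is written after action() returns, so a step that raises is
        retried on the next run.  Returns True if the action ran."""
        if self.is_done(step):
            return False
        action()
        with open(self._path(step), "w"):
            pass  # an empty file is the whole flag
        return True
```

This is roughly what a make target with a touched stamp file gives
you: the flag survives across runs, and removing the stamp file forces
the step to run again.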

> I'm coming off in this note as a big cfengine fan.    While that's
> true, it's not the cat's meow and I see a real issue here -- it
> *would* be nice to get something elegant to fix this problem.  Or
> complement it... 

And I'm in no way dedicated to 'make' -- otherwise I wouldn't have
started this thread.  The combination of make and other components is
just something I keep going back to, because it helps me do what I
need to do.

                        .       .    `   *    
Steve Traugott   ` .  *  +                       
Infrastructure Architect            + `     
address@hidden    '   *  .   '  +`   *
