help-cfengine

[Fwd: [Fwd: Re: convergence and undoing changes]]


From: Mark Burgess
Subject: [Fwd: [Fwd: Re: convergence and undoing changes]]
Date: Fri, 18 Nov 2005 20:40:45 +0100

-------- Forwarded Message --------
From: Alva Couch <address@hidden>
To: Mark Burgess <address@hidden>
Subject: [Fwd: Re: convergence and undoing changes]
Date: Fri, 18 Nov 2005 14:36:52 -0500

Alas, my post to the list bounced. Perhaps you could forward it?

email message attachment (Re: convergence and undoing changes)
-------- Forwarded Message --------
From: address@hidden
To: address@hidden
Subject: Re: convergence and undoing changes
Date: Fri, 18 Nov 2005 14:05:11 -0500

You are not allowed to post to this mailing list, and your message has
been automatically rejected.  If you think that your messages are
being rejected in error, contact the mailing list owner at
address@hidden

email message attachment
-------- Forwarded Message --------
From: Alva Couch <address@hidden>
To: Mark Burgess <address@hidden>
Cc: christian pearce <address@hidden>, Viraj Alankar
<address@hidden>, address@hidden
Subject: Re: convergence and undoing changes
Date: Fri, 18 Nov 2005 14:03:38 -0500

Mark Burgess wrote:

>>Alva Couch gave a presentation about this:
>>http://homepages.informatics.ed.ac.uk/group/lssconf/config2005e/Slides/cfengine.pdf
> 
> I have not seen this before. In fact it annoys me a little because it is
> factually incorrect and seems just to be an unnecessary slur on
> cfengine.

It is not intended as a slur, but perhaps I have been too bold in
my statements. I apologise for the offense. I am trying, instead,
to focus upon an aspect of cfengine that many users have found
problematic, and to propose an effective solution that uses
cfengine together with things from outside the cfengine framework.

I think that the key to any progress is to identify the problem.
In that talk, I identified the problem but didn't focus so much
upon the solution. Let me make some amends by trying to focus upon
the solution here in a constructive way.

No tool can be effectively utilized without appropriate practice.
I have been working in detail on the practice cfengine demands.
The effective practice for utilizing cfengine has a drawback that I call
the "observability" quandary. As with other incremental approaches, one
is at the mercy of what one decides to control, in perpetuity.
Once one has asserted a state for something, one cannot stop
controlling that state unless one knows that the network has
converged as a whole to the desired state; without that assurance,
further commands must presume the co-existence of perhaps *both*
formerly desired and currently desired states. Since this network-wide
convergence is almost always impractical to assure, this means
that cfengine configurations tend to "ratchet" up in size and
complexity and never decrease in size.
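
The retirement condition described above can be sketched in a few
lines. This is only an illustration under stated assumptions: the
host names, the state labels, and the function can_retire are
hypothetical and are not part of cfengine. A host whose state has
not been observed is recorded as None.

```python
def can_retire(desired_state, observed_states):
    """Return True only if every host is known to be in the desired state.

    observed_states maps a host name to its last observed state, or to
    None when the host has not been observed (hypothetical data model).
    """
    return all(state == desired_state for state in observed_states.values())


# Hypothetical network: one host has not reported since the change.
observed = {"host-a": "new", "host-b": "new", "host-c": None}

# The rule asserting "new" must stay, because host-c may still be "old".
retirable = can_retire("new", observed)
```

As long as even one host is unobserved, the rule cannot be removed,
which is why configurations tend to grow rather than shrink.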

My conclusions from this are controversial. I do not consider the 
typical cfengine configuration file "specific enough" by nature to
constitute a "configuration"; some things are left the way they are
in a pseudo-default state. This is both a strength and a weakness.
The lack of specificity leads cfengine to be extremely effective in
legacy environments where we need to leave most everything alone. It is
also appropriate in an environment where major upgrades will not occur
for the operating system itself.

The talk also contains the claim that "cfengine is not a configuration 
management system". It is instead a configuration enforcer; from a
typical configuration file of cfengine alone, one cannot determine the
whole configuration or its intent. It is best to think of the cfengine
configuration as a "differential template" between what we start with
and what we want. It is thus particularly hard to describe the intent of
a random cfengine configuration file; more structure needs to be imposed
from outside the configuration file in order to be able to
make that interpretation. I think it is more accurate to think of the
actual configuration as something external to the cfengine 
configuration, that is a combination of cfengine and pre-existing state.
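
The "differential template" view can be made concrete with a small
sketch. It assumes, purely for illustration, that a configuration
can be modelled as a flat mapping of settings; the setting names
and the function effective_config are hypothetical.

```python
def effective_config(preexisting, deltas):
    """Combine pre-existing state with the changes an enforcer asserts.

    Settings not mentioned in deltas keep whatever value the machine
    already had, i.e. the pseudo-default state.
    """
    result = dict(preexisting)
    result.update(deltas)
    return result


# The same (hypothetical) set of enforced changes...
deltas = {"sshd.permit_root_login": "no"}

# ...applied over two different pre-existing states.
machine_a = {"sshd.permit_root_login": "yes", "syslog.target": "local"}
machine_b = {"sshd.permit_root_login": "yes", "syslog.target": "remote"}

config_a = effective_config(machine_a, deltas)
config_b = effective_config(machine_b, deltas)
```

The two results differ in a setting the deltas never mention, so the
deltas alone do not determine the whole configuration.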

The "partial" nature of the cfengine configuration is problematic
when, e.g., a major upgrade of the operating system occurs. The
configuration file describes incremental changes to "one" baseline and
we now have another baseline. Because changes are incremental in nature,
latent variables can crop up easily and cause things to break. We are
thus forced to completely re-validate the cfengine configuration's
correctness, one statement at a time. This approach does not scale,
and there is no way to "port" the old configuration to the new OS.

Comparing this to a "generative" solution for configuration management,
a re-implementation of the generators fixes *all* porting problems. So
there is a scalability problem: in one case, one fix fixes "all" porting
problems, whereas for a cfengine configuration file, every instance must
be fixed and revalidated separately.
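
To illustrate the generative idea, here is a minimal sketch. It
assumes a single high-level intent and one generator function per
platform; the platform names "os-v1" and "os-v2", the output
formats, and all function names are invented for the example and
do not describe any real tool.

```python
# Hypothetical high-level intent, independent of any operating system.
INTENT = {"permit_root_login": False, "log_host": "loghost.example.org"}


def render_os_v1(intent):
    """Generate the complete settings text for the (hypothetical) old OS."""
    root = "yes" if intent["permit_root_login"] else "no"
    return f"PermitRootLogin {root}\nLogHost {intent['log_host']}\n"


def render_os_v2(intent):
    """Generate the same intent in the (hypothetical) new OS's format."""
    root = "true" if intent["permit_root_login"] else "false"
    return f"permit_root_login={root}\nlog_host={intent['log_host']}\n"


# Porting to a new baseline means writing or fixing one generator here.
GENERATORS = {"os-v1": render_os_v1, "os-v2": render_os_v2}


def generate(platform, intent):
    """Produce a complete configuration for one platform from the intent."""
    return GENERATORS[platform](intent)
```

The intent is written once; a fix to a generator applies to every
host on that platform at the next generation.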

> This network level divergence example is highly misleading.
> Of course there can be periods of divergence, but the talk seems to
> imply that some other tool could improve on the problem. In fact I would
> suggest that no other tool provides more consistency than cfengine
> today.

The claim was not that cfengine fails to provide long-term consistency,
but that a particular practice must be followed to achieve that
consistency, and that practice is external to cfengine itself.
Just about every user I know has fallen into the trap of
creating such a divergence:
a) "ad-hoc editing" of a cfengine configuration file can produce latent
    effects on a network.
b) thus "change management" is required in a cfengine environment in
    order to avoid producing latent variables and effects.
c) cfengine doesn't support that change management itself.
d) something needs to manage, therefore, the *input* to cfengine.

The solution to the problem under discussion is easy. One must
enter a command into the configuration that performs the rollback,
probably by copying the appropriate file from the original build of
the machine. That command must stay in the configuration until every
host has converged back to the original state. A systematic approach
to rollback is to keep a "gold server" around that reflects the
initial build state, or otherwise snapshot an initial build for
rollback purposes.
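
A rollback of this kind is itself a convergent operation: it may be
run repeatedly and only changes a file that differs from the
snapshot. The following is a rough sketch, not cfengine syntax; the
function converge_to_gold and the file names are hypothetical, and
a temporary directory stands in for the gold snapshot and the host.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path


def converge_to_gold(gold: Path, target: Path) -> bool:
    """Restore target from the gold copy; return True if a change was made."""
    if target.exists() and filecmp.cmp(gold, target, shallow=False):
        return False  # already in the original state, nothing to do
    shutil.copy2(gold, target)
    return True


# Demonstration with throwaway files standing in for real paths.
with tempfile.TemporaryDirectory() as tmp:
    gold = Path(tmp) / "gold.conf"
    target = Path(tmp) / "host.conf"
    gold.write_text("setting = original\n")
    target.write_text("setting = changed by an earlier rule\n")

    first = converge_to_gold(gold, target)   # repairs the file
    second = converge_to_gold(gold, target)  # finds nothing to repair
    restored = target.read_text()
```

The second run makes no change, which is the behaviour one wants
from a command that stays in the configuration for a long time.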

Note that I am not advocating that people stop using cfengine to enforce
configurations. I think, instead, that they should *start* thinking
about generating the cfengine configuration from a higher-level 
configuration, iteratively, so that the incidental divergence
and latent variable problems disappear. The ideal tool is one that
would interpret an *existing* cfengine configuration, and a new
set of "intents", and produce a configuration free of latent effects
in producing those intents. This is a problem that has proven
particularly difficult for humans (as my experience with other
cfengine admins has shown) but might be easier for a computer program.

In short, I am not saying that one should stop using cfengine, but
that one should realize its limitations and look for solutions to
those limitations. It is still the best solution for enforcing
configurations. It needs something else, however, which is the ability
to generate a "complete" configuration consistent with a particular
"intent" and combine that with the changes it has already made to a
system. Ideally, this generation and combination is portable; i.e., it
can be constructed so that the intent is independent of the target
architecture.

As an aside, however, one does not need rollback in any such system.
Baselining, consistent observability and control, and carry-forward
are enough. It is not a history of changes, but a history of
configurations, that is needed. To roll back a change, rebaseline
the affected files and replay changes up to the point before the
crucial change.
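
The rebaseline-and-replay idea can be sketched as follows, again
assuming configurations can be modelled as simple mappings. The
change names and the function replay are hypothetical.

```python
def replay(baseline, changes, stop_before=None):
    """Rebuild a configuration from the baseline by replaying changes in order.

    changes is an ordered list of (name, delta) pairs. Replay stops just
    before the change named stop_before, so that change and all later
    ones are left out.
    """
    config = dict(baseline)
    for name, delta in changes:
        if name == stop_before:
            break
        config.update(delta)
    return config


baseline = {"ntp.server": "ntp1", "mail.relay": "mx1"}
changes = [
    ("c1", {"ntp.server": "ntp2"}),
    ("c2", {"mail.relay": "mx2"}),   # the change we want to undo
    ("c3", {"dns.search": "example.org"}),
]

# Configuration as it stood just before c2 was introduced.
before_c2 = replay(baseline, changes, stop_before="c2")
```

No inverse of c2 is ever computed; the earlier configuration is
simply reconstructed from the baseline.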

I certainly do not mean to attack the cfengine community. I think, 
however, that we must remain realistic about what our tools do and
do not do. That is the spirit of my presentation and my response
here.






