RE: convergence and undoing changes

From: Moore, Joe
Subject: RE: convergence and undoing changes
Date: Mon, 21 Nov 2005 07:57:43 -0800

Alva Couch wrote:
> My experience is that users are all too cavalier about the way they 
> modify cfagent.conf. I think a specific discipline -- unknown to many
> users -- is the key. We can either document that discipline or 
> encapsulate it in some kind of transaction engine. I propose 
> to do both.

IMO, more documentation (preferably in the cfengine space) about what
convergence is (and isn't) and how to think convergently is needed.

> My examples using editfiles are a matter of public record. But the 
> problem can even happen when one utilizes purely convergent actions.
> Here's a "typical" example of user thinking.
> - user asserts contents of a file F. Say it is a service startup
>    in /etc/xinetd.d and the intent is to customize some service.

So at this point, all systems converge to F -- for example
/etc/ssh/sshd_config: PermitRootLogin without-password

> - then, some time after F is stable, the user changes the assertion
>    to revert F to its original state.

At this point, the user changes the convergence goal to F' (which may be
identical to F.orig, or it might have some other set of properties).

/etc/ssh/sshd_config: #PermitRootLogin no
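
To make "converging to F" and then "converging to F'" concrete, here is a
minimal Python sketch of a convergent edit. It is not cfengine syntax; the
converge() helper and the stock file contents are assumptions made up for
illustration. The point is only that the goal is stated as an end state, so
applying it a second time changes nothing.

```python
import re

# Matches any PermitRootLogin line, whether commented out or not.
DIRECTIVE = re.compile(r"^\s*#?\s*PermitRootLogin\b")

def converge(lines, goal_line):
    """Return lines holding exactly one PermitRootLogin line, equal to goal_line.

    Hypothetical helper: it describes the desired end state rather than
    an edit script, so repeated application is a no-op.
    """
    result = []
    placed = False
    for line in lines:
        if DIRECTIVE.match(line):
            if not placed:
                result.append(goal_line)
                placed = True
            # Any further PermitRootLogin lines are dropped as duplicates.
        else:
            result.append(line)
    if not placed:
        result.append(goal_line)
    return result

F = "PermitRootLogin without-password"
F_PRIME = "#PermitRootLogin no"

stock = ["Port 22", "#PermitRootLogin yes"]   # assumed starting contents
at_f = converge(stock, F)                     # goal state F
at_f_prime = converge(at_f, F_PRIME)          # goal changed to F'
```

Note that F' is still a managed goal here: something keeps asserting it.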

> - unbeknownst to the user, some different set of stations are down
>    while F is reverting to the original state.

Some systems do not immediately converge.  The network is in some
indeterminate mixture of unmanaged, F, and F'.

> - then, satisfied that the file is reverted, the user takes the
>    reversion assertion out of the script, considering work to be done.

At this point, the user mistakenly decides that "unmanaged" is the
correct goal state (rather than F or F').

> - time passes and the unreverted machines come back up. There is
>    no reversion to affect them. So they stay with the new version.

After time passes without management, configuration drift occurs.

> - At this point, there are two classes of machines: those with
>    the original version of F and those with the new version. If the
>    new version has a security hole, congratulations, you didn't manage
>    to plug it.

Configuration drift of unmanaged states results in indeterminate mixtures
of configuration state.

Some systems will allow root logins (without-password) and some won't.
If a local administrator edits sshd_config and sets "PermitRootLogin yes",
that's a third class of machine.
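
The whole sequence can be played out in a toy Python simulation. This is
only a sketch under stated assumptions: the host names, the run_policy()
helper, and the "stock" and "local edit" labels are invented for
illustration, and a goal of None stands in for "assertion removed from the
config".

```python
def run_policy(hosts, goal, up):
    """Apply goal to every host that is up; goal None means unmanaged (no-op)."""
    for name in hosts:
        if goal is not None and name in up:
            hosts[name] = goal

hosts = {name: "stock" for name in ["a", "b", "c", "d"]}
everyone = set(hosts)

run_policy(hosts, "F", everyone)            # all systems converge to F
run_policy(hosts, "F'", everyone - {"d"})   # d is down during the reversion
run_policy(hosts, None, everyone)           # assertion removed; d comes back up
hosts["a"] = "local edit"                   # drift: a local admin edits the file

# Distinct configuration states now present on the network.
classes = sorted(set(hosts.values()))

# Had the F' assertion stayed in the policy, the next run would repair all of them.
repaired = dict(hosts)
run_policy(repaired, "F'", everyone)
```

After the unmanaged run, classes holds three different states, and only
the run that still asserts F' brings every host back to one.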

> The key here is that for reversions to be effective, they must stay
> in the configuration until it is absolutely sure that all stations
> have applied them. In a very large network, one is likely never
> sure, so one can *never* remove the reversions from the config file.

The key here is that if you have a goal state, you must define that

> This is the principle of observability:
>    Once one manages a thing, one must continue to manage that
>    thing in perpetuity.
> In my experience this kind of "reversion botch" is very common.

Actually, it's "Once one realizes that a thing needs management, one
must continue to manage that thing until configuration drift is

This kind of "reversion botch" is based on the mistaken assumption that
the original configuration was magically convergent.

