RE: cfengine failover on copy does not seem to work as documented

From: Mark Burgess
Subject: RE: cfengine failover on copy does not seem to work as documented
Date: Mon, 07 Nov 2005 16:52:06 +0100

You should think more about how this could be optimized.
What I envisage in cfengine 3 is that you will be able to attach
a kind of "filter"-like stanza to a command to provide exception
handling. We can talk about this at LISA. So the question is what
kinds of things should be in those stanzas - and you have already given
some ideas.


On Mon, 2005-11-07 at 14:26 +0000, Chip Seraphine wrote:
> > Failover, in my world, is generally used to talk about network services
> > and that is the meaning it carries in cfengine today.
> Mine too,  :)  In that sense I am not using the mechanism for what it was
> intended for.
> > I can agree that
> > there is a general need for some kind of exception handling for other
> > cases too. Then what you are really asking for is another kind of class
> >
> > filenotfound=myclass
> >
> > to complement failover=
> Something like that, yes.
> We probably don't want to get into a situation where we have numerous
> error-handling warts growing on commands, such as "filenotfound=foo" and
> "unreadable=bar".  If we are going to revisit this, I'd consider allowing
> the user to define an error "base class" (which is essentially a label for
> that particular copy operation) and a specific error token could be
> appended.  (So the base class would indicate a failure, and an extended
> class would give specific errors from a defined list.)  Something like
> this:
> #Overly simplified unrealistic code ensues!
> copy:
>  any::
>   /foo/bar  dest=/bar/foo server=$(policyhost) label=barcopy
>  barcopy_serverdown::
>   /alt/cfenginepath/foo/bar  dest=/bar/foo server=$(backuphost) label=barcopy2
> shellcommands:
>  barcopy_filenotfound::
>    "/bin/echo Check $(policyhost) to make sure /foo/bar is present | mail admingroup"
>  barcopy_filenotreadable::
>    "/bin/echo Permissions bad at /foo/bar on $(policyhost) | mail cfenginemaintainer"
> #barcopy_failed is defined on ANY failure of barcopy, in addition to specific classes
>  barcopy_failed.barcopy2_failed::
>    "/bin/echo Warning, /bar/foo is out of date on $(host) | /usr/local/sbin/nagioswrapper"
> > This can be patched into the current cfengine without any real
> > difficulty. And it should be better designed in cfengine 3.
> If we have labels, we can also set dynamic variables (${barcopy_errmsg})....
> > Is it not true that, if the file is unreadable, you get an error message
> > anyway?
> Of course.   And that is fine for me, the cfengine maintainer, since the
> cfexecd output goes into my inbox.  But given a few hundred machines, that
> is a *lot* of noise (I have cfservd and cfexecd crashes-and-restarts every
> few minutes, each of which generates mail) and nobody but the cfengine guy
> wants to see that.
> So, I have a 'complaint' tool that allows me to shoot messages to the NMS,
> to specific admins or developers who "own" certain resources, etc. 
> (Example:  The presence of a sendmail corefile in /var/core makes me want
> to shoot an email to the mail guy, not to the whole sysadmin team.)
> Also, I often simply do not have time to read all the normal output-- it
> is very cluttered with 'routine' messages (SIGPIPEs from trying to copy
> from downed machines, stuck cfagents being killed, cfexecd or cfservd
> segfaulting), and it is useful to have an external mechanism tell me
> when-and-where something really needs my attention, so that I know when to
> dive into those outputs.
> > Mark
> ~Chip
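
The label scheme proposed in the quoted message (a base class for any failure, plus an extended class naming the specific error) can be sketched outside cfengine. This is only an illustration in Python, not cfengine code: the function names and the token list are hypothetical, the "_failed" suffix is taken from the class expression in the example, and no existing cfengine release is assumed to provide label= or these classes.

```python
# Hypothetical sketch of the proposed "label + error token" class naming.
# Nothing here is an existing cfengine feature; names are for illustration.

# Assumed fixed list of error tokens, following the examples in the message.
ERROR_TOKENS = ("serverdown", "filenotfound", "filenotreadable")


def failure_classes(label, error_token):
    """Return the classes a failed copy labelled `label` would define."""
    if error_token not in ERROR_TOKENS:
        raise ValueError("unknown error token: " + error_token)
    # The base class signals any failure; the extended class names the cause.
    return {label + "_failed", label + "_" + error_token}


def expression_holds(expression, defined):
    """Evaluate a dot-separated AND expression such as 'a_failed.b_failed'."""
    return all(name in defined for name in expression.split("."))


# Primary copy fails because the server is down, backup copy fails because
# the file is missing: both base classes end up defined.
defined = failure_classes("barcopy", "serverdown")
defined |= failure_classes("barcopy2", "filenotfound")
print(sorted(defined))
print(expression_holds("barcopy_failed.barcopy2_failed", defined))
```

With both copies failing, the combined expression barcopy_failed.barcopy2_failed holds, which is the case where the example sends its "out of date" warning. Keeping a single base class per label means a policy can react to any failure without listing every error token.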
