Solution: Re: monitoring cfagent(s)

From: Dmitry Sazonov
Subject: Solution: Re: monitoring cfagent(s)
Date: Mon, 30 Jan 2006 12:24:45 -0500

Thanks go to Matthew for the idea.

Included is my implementation. 

It monitors whether cfengine runs periodically on every host that is
monitored by Nagios and has a cfengine service defined for it.
If cfagent has to take any action (that is, there is output from cfexecd),
Nagios turns the cfengine service light yellow (WARNING).
If cfagent has not run within the freshness-check interval (1 hr in the
example), or the email is not arriving, the Nagios freshness check turns
the cfengine light orange (UNKNOWN).

Please feel free to comment/improve!


1. Add one line to cfengine that executes on every run, like:
    any::   "/bin/echo host=$(host) date=$(date)"
2. Make sure Inform is ON.
3. Configure the email address that receives all the email from cfengine,
for example: address@hidden

Nagios host:

1. create a local account - cfengine (that account needs to be in the
group that owns Nagios cmd file - in my case nagioscmd)
2. (Linux) make sure local sendmail accepts SMTP from remotes (test
3. create ~cfengine/.forward
4. put that in  /etc/smrsh/ (Linux)

5. - something trivial like this:


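use Getopt::Std;
getopts('d');   # -d turns on debug output ($opt_d)

# $submit must hold the path to the script from step 6, and $_ must hold
# the mail headers at this point.
($host) = /^From: \w+\@(\w+)/m;

my $nomatch=0;
while (<>) {
 next if /^$/; # skip blank lines
 next if /^cfengine:\w+: (Executing|Finished) script \/bin\/echo/;
 next if /^cfengine:\w+:\/bin\/echo host=:/;
 print "no match: ",$_ if $opt_d;
 $nomatch=1;   # any other line means cfagent had something to do
}

if ($nomatch) {
 print "WARNING: $host\n" if $opt_d;
 system "$submit $host cfengine 1 \"unexpected output\"";
} else {
 print "OK: $host\n" if $opt_d;
 system "$submit $host cfengine 0 \"expected output\"";
}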
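
The filtering rule in step 5 can be tried out on its own before wiring it into the mail path. This is a minimal shell sketch, assuming the mail body arrives on stdin; classify_mail and the sample lines are hypothetical and only mirror the patterns used in the script above:

```shell
#!/bin/sh
# classify_mail is a hypothetical helper: it reads a cfexecd mail body on
# stdin and prints the Nagios return code (0 = OK, 1 = WARNING).
classify_mail() {
    # Count the lines that are neither blank nor produced by the
    # always-run /bin/echo rule; "|| true" absorbs grep's exit status 1
    # when the count is zero.
    unexpected=$(grep -Evc \
        -e '^$' \
        -e '^cfengine:[A-Za-z0-9_]+: (Executing|Finished) script /bin/echo' \
        -e '^cfengine:[A-Za-z0-9_]+:/bin/echo host=:' || true)
    if [ "$unexpected" -gt 0 ]; then
        echo 1
    else
        echo 0
    fi
}

# Made-up sample input that contains only the expected echo output.
printf '%s\n' \
    'cfengine:web1: Executing script /bin/echo host=web1 date=Mon' \
    'cfengine:web1:/bin/echo host=: host=web1 date=Mon' \
    'cfengine:web1: Finished script /bin/echo host=web1 date=Mon' \
    | classify_mail
```

Feeding it any other cfengine line should print 1, which corresponds to the WARNING branch.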

6. write submit_check_results (see Nagios docs on passive checks)



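#!/bin/sh
# Arguments: $1 = host, $2 = service description, $3 = return code, $4 = output

echocmd="/bin/echo"
# Nagios external command file; this path is an assumption, use the
# command_file value from your nagios.cfg
CommandFile="/usr/local/nagios/var/rw/nagios.cmd"

# get the current date/time in seconds since UNIX epoch
datetime=`date +%s`

# create the command line to add to the command file
cmdline="[$datetime] PROCESS_SERVICE_CHECK_RESULT;$1;$2;$3;$4"

# append the command to the end of the command file
$echocmd "$cmdline" >> "$CommandFile"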
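
To see what ends up in the command file without touching a live Nagios, the line can be built and printed instead of appended. A small sketch follows; format_check_result is a hypothetical helper name, and the host and message values are made up:

```shell
#!/bin/sh
# format_check_result is a hypothetical helper that only builds the
# external-command line; it does not write to the Nagios command file.
format_check_result() {
    # $1 = host, $2 = service description, $3 = return code, $4 = output
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$(date +%s)" "$1" "$2" "$3" "$4"
}

# Example call with made-up values; the timestamp is the current time.
format_check_result web1 cfengine 1 "unexpected output"
```

The service description in the second field has to match the service defined in Nagios (cfengine here), otherwise the result is not attached to the service.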

7. Enable passive checks in Nagios (nagios.cfg).
8. Create a matching service: cfengine
  (only the important lines are left in the example):
define service {
                name                            cfengine
                active_checks_enabled           0
                passive_checks_enabled          1
                check_freshness                 1
                freshness_threshold             3600
                check_command                   fake_check!CRITICAL:!freshness check!2!
}

9. The check command needs to exist (see the Nagios docs on freshness checks):


#!/bin/sh
# $1 = state label, $2 = message, $3 = exit code reported to Nagios
echo "$1" "$2"

exit "$3"
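
The effect of the arguments passed in the service definition can be checked by hand. The sketch below wraps the same two lines in a shell function (fake_check as a function is an assumption made for illustration; the real check is a standalone script) and calls it with the arguments from step 8:

```shell
#!/bin/sh
# Function form of the check script, for illustration only.
fake_check() {
    # $1 = state label, $2 = message, $3 = exit code
    echo "$1" "$2"
    return "$3"
}

# Same arguments as in the service definition: label, message, exit code 2.
status=0
fake_check "CRITICAL:" "freshness check" 2 || status=$?
echo "exit status: $status"
```

Nagios reads the state from the exit code, so changing the third argument is what changes the state reported when the freshness threshold expires.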

Dmitry Sazonov
UNIX sysadmin, AAMC
Office: 202-862-6168

>>> Matthew Palmer <address@hidden> 1/5/2006 5:24:28 PM >>>
On Thu, Jan 05, 2006 at 04:58:16PM -0500, Dmitry Sazonov wrote:
> Is there any cfengine class that will tell me that all rules were
> processed but no actions were necessary, meaning that the host state is
> as desired (convergent)?
> If "corrective" actions were required, I'd like to know that too;
> if the actions are required on every run, there must be something
> wrong.
> Based on this class(?) I could fire a syslog message (submit a
> check) to Nagios.

Create a passive check that fires from an e-mail received, and times out if
no relevant e-mail is received for a period of time, so the check can go
critical if no e-mail is received (to say "cfagent doesn't appear to be
running") or it can also go critical (or warning) when an e-mail is received
that contains evidence of work being done (I tend to use cfagent -I, so I
would probably check to see if any output was printed in the e-mail).

No cfengine hacking required!

- Matt

"For once, Microsoft wasn't exaggerating when they named it the 'Jet
Engine' -- your data's the seagull."
        -- Chris Adams
