monit-general
[monit] Fine Tuning a monit configuration


From: Andres Tarallo
Subject: [monit] Fine Tuning a monit configuration
Date: Tue, 24 Mar 2009 16:30:09 -0300

I have a number of CentOS 5.2 servers running Apache, and I've installed monit 4.9 on them (RPMs from the DAG repository). These servers are heavily loaded most of the day (the 1-minute load average stays above 20 for many hours a day). I keep getting the following messages in my mailbox:

** Subject httpd Timeout - httpd unmonitor on XXXXX: 'httpd' service timed out and will not be checked anymore.
** Subject httpd Connection failed - httpd restart on XXXXXX: 'httpd' failed protocol test [HTTP] at INET[WW.WW.WWW.ZZZ:80] via TCP.
** httpd Does not exist - httpd restart on XXXXXX: 'httpd' process is not running.

The last one really puzzles me, because Apache is actually running!

My monit configuration file:

set daemon  180
set logfile syslog facility log_daemon
set mailserver mail.company.net
set mail-format { from: address@hidden
subject: $SERVICE $EVENT
message: $SERVICE $ACTION on $HOST: $DESCRIPTION.}
set httpd port 2812 and
     use address localhost  # only accept connection from localhost
     allow localhost        # allow localhost to connect to the server and
check system XXXXXX
    if loadavg (1min) > 50 then alert
    if loadavg (5min) > 75 then alert
    if memory usage > 90% then alert
    if cpu usage (user) > 99% then alert
    if cpu usage (system) > 99% then alert
    if cpu usage (wait) > 99% then alert
    alert address@hidden

check file apache_bin with path /usr/sbin/httpd
   group apache
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor

check process httpd with pidfile /var/run/httpd.pid
    start program = "/etc/init.d/httpd start"  as uid 0 as gid 0
    stop program  = "/etc/init.d/httpd stop" as uid 0 as gid 0
    if cpu > 99% for 5 cycles then restart
    if loadavg(5min) greater than 45 for 3 cycles then restart
    if failed host WW.WW.WWW.ZZZ port 80 protocol HTTP request "/site/page.php" timeout 15 seconds 10 cycles then restart
    if 6 restarts within 10 cycles then timeout
    alert address@hidden
    depends on apache_bin
    group apache
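
For reference, the cycle counts in the httpd check above can be put in wall-clock terms. This is only a rough sketch: it assumes every cycle lasts exactly the 180 seconds given by `set daemon 180` (in practice the time spent running the checks adds to that), and `cycles_to_minutes` is a made-up helper name, not part of monit.

```python
# Poll interval taken from "set daemon 180" in the configuration above.
CYCLE_SECONDS = 180


def cycles_to_minutes(cycles, cycle_seconds=CYCLE_SECONDS):
    """Convert a number of monit cycles to minutes (hypothetical helper)."""
    return cycles * cycle_seconds / 60


# Cycle counts copied from the "check process httpd" block.
rules = {
    "cpu > 99% for 5 cycles": 5,
    "loadavg(5min) > 45 for 3 cycles": 3,
    "6 restarts within 10 cycles": 10,
}

for rule, cycles in rules.items():
    print(f"{rule}: about {cycles_to_minutes(cycles):.0f} minutes")
```

Under that assumption, the CPU rule needs about 15 minutes of sustained load before it fires, the load-average rule about 9 minutes, and the timeout rule looks at a window of roughly 30 minutes.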

I'm pulling my hair out: it doesn't work flawlessly. I receive many alerts, even when the servers are working. Thanks.
