[monit-dev] Assertion failure in 4.10.1
From: Brian Candler
Subject: [monit-dev] Assertion failure in 4.10.1
Date: Tue, 25 Mar 2008 15:14:15 +0000
User-agent: Mutt/1.5.11
monit 4.10.1 just died with an assertion failure:
...
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen.conf' checksum was changed
for /etc/pen.d/testapp.conf
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen.conf' trying to restart
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen' stop: /bin/bash
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen' start: /usr/bin/pen
Mar 25 14:54:34 localhost monit[17903]: AssertException: s at xmalloc.c:110
aborting..
System details:
* CentOS 4.5
Linux localhost.localdomain 2.6.9-55.0.2.plus.c4 #1 Fri Jul 6 05:04:29 EDT
2007 i686 i686 i386 GNU/Linux
* monit 4.10.1 built as an RPM, within a chroot environment (mach) on
another host.
Spec file taken from http://dag.wieers.com/rpm/packages/monit/monit.spec
(just changed 4.9 to 4.10.1)
What I was doing: I had set up a dependency between a config file
(/etc/pen.d/testapp.conf) and a process, then I modified the config file
by adding a blank line, to see if monit would restart the process. It
appears that it started to do so, then died :-(
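For anyone wanting to see why a blank line is enough to trip the
"if changed checksum" test, here is a minimal Python sketch. It is only an
illustration: it works on a temporary stand-in file rather than the real
/etc/pen.d/testapp.conf, the helper names are made up for this example, and
MD5 is an assumption about the hash in use (any content hash behaves the
same way for this purpose).

```python
import hashlib
import os
import tempfile


def file_md5(path):
    """Return the hex MD5 digest of the file at `path` (MD5 is an assumption)."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


def append_blank_line(path):
    """Append a single newline, mirroring the edit described above."""
    with open(path, "a") as f:
        f.write("\n")


def checksum_changes_after_blank_line(initial_text):
    """Hypothetical helper: True if appending a blank line alters the digest."""
    # Use a temporary stand-in file, never the real config file.
    fd, path = tempfile.mkstemp(suffix=".conf")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(initial_text)
        before = file_md5(path)
        append_blank_line(path)
        after = file_md5(path)
        return before != after
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(checksum_changes_after_blank_line("# placeholder config contents\n"))
```

The extra byte changes the digest even though the edit has no effect on the
configuration itself, which is what makes it a convenient way to trigger the
file check.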
My full configs are attached below - in particular see
/etc/monit.d/testapp.monitrc
I'm not sure that what I was doing was valid (having a 'restart' action
within a file check, and then a process check dependent on the file check).
So it's possible this is a case of operator error. However, I still wouldn't
have expected monit to die.
In case it's relevant, I should add that the checks testapp_mongrel_1 and
testapp_mongrel_2 are intentionally failing, because the processes which
they are trying to start have not yet been installed on the target box. Here
is a fuller log extract:
...
Mar 25 14:54:03 localhost monit[17903]: 'testapp_mongrel_1' process is not
running
Mar 25 14:54:03 localhost monit[17903]: 'testapp_mongrel_1' trying to restart
Mar 25 14:54:03 localhost monit[17903]: 'testapp_mongrel_1' start:
/usr/bin/mongrel_rails
Mar 25 14:54:03 localhost monit[17903]: 'testapp_mongrel_2' process is not
running
Mar 25 14:54:03 localhost monit[17903]: 'testapp_mongrel_2' trying to restart
Mar 25 14:54:03 localhost monit[17903]: 'testapp_mongrel_2' start:
/usr/bin/mongrel_rails
Mar 25 14:54:04 localhost monit[17903]: 'testapp_mongrel_1' failed to start
Mar 25 14:54:04 localhost monit[17903]: 'testapp_mongrel_2' failed to start
Mar 25 14:54:34 localhost monit[17903]: 'testapp_mongrel_1' failed to start
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen.conf' checksum was changed
for /etc/pen.d/testapp.conf
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen.conf' trying to restart
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen' stop: /bin/bash
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen' start: /usr/bin/pen
Mar 25 14:54:34 localhost monit[17903]: AssertException: s at xmalloc.c:110
aborting..
The bug appears to be repeatable - I tried restarting monit and changing
that config file again, and got the same crash.
Regards,
Brian Candler.
# cat /etc/monit.conf
set daemon 30
set logfile syslog facility log_daemon
set mailserver localhost
set mail-format { from: address@hidden }
set alert address@hidden only on { timeout, nonexist }
set httpd port 2812
allow localhost
allow X.X.X.0/255.255.252.0
include /etc/monit.d/*
# head -100 /etc/monit.d/*
==> /etc/monit.d/apache.monitrc <==
check process apache
with pidfile "/var/run/httpd.pid"
start program = "/etc/init.d/httpd start"
stop program = "/etc/init.d/httpd stop"
if 2 restarts within 3 cycles then timeout
if totalmem > 100 Mb then alert
if children > 255 for 5 cycles then stop
if cpu usage > 95% for 3 cycles then restart
#if failed port 80 protocol http then restart
group server
depends on httpd.conf, httpd.conf.d
check file httpd.conf
with path /etc/httpd/conf/httpd.conf
# Reload apache if the httpd.conf file was changed
if changed checksum
then exec "/etc/init.d/httpd graceful"
check directory httpd.conf.d
with path /etc/httpd/conf.d
if changed timestamp
then exec "/etc/init.d/httpd graceful"
==> /etc/monit.d/memcached.monitrc <==
check process memcached
with pidfile /var/run/memcached/memcached.pid
start program = "/etc/init.d/memcached start"
stop program = "/etc/init.d/memcached stop"
if cpu is greater than 80% for 4 cycles then restart
==> /etc/monit.d/testapp.monitrc <==
check process testapp_pen
with pidfile /var/run/pen/testapp.pid
start program = "/usr/bin/pen -F /etc/pen.d/testapp.conf -u nobody
-p /var/run/pen/testapp.pid
-C 127.0.0.1:9999 127.0.0.1:10000"
stop program = "/bin/bash -c 'kill -s SIGTERM `cat
/var/run/pen/testapp.pid`'"
if totalmem is greater than 10.0 MB for 2 cycles then restart
if cpu is greater than 50% for 2 cycles then restart
if 2 restarts within 3 cycles then timeout
depends on testapp_pen.conf
group testapp
check file testapp_pen.conf
with path /etc/pen.d/testapp.conf
if changed checksum
then restart
check process testapp_mongrel_1
with pidfile /u/apps/testapp/shared/tmp/pids/mongrel.10001.pid
start program = "/usr/bin/mongrel_rails cluster::start --clean
-C /u/apps/testapp/current/config/mongrel_cluster.yml
--only 10001"
stop program = "/usr/bin/mongrel_rails cluster::stop
-C /u/apps/testapp/current/config/mongrel_cluster.yml
--only 10001"
if totalmem is greater than 110.0 MB for 4 cycles then restart
if cpu is greater than 80% for 4 cycles then restart
if 10 restarts within 10 cycles then timeout
group testapp
check process testapp_mongrel_2
with pidfile /u/apps/testapp/shared/tmp/pids/mongrel.10002.pid
start program = "/usr/bin/mongrel_rails cluster::start --clean
-C /u/apps/testapp/current/config/mongrel_cluster.yml
--only 10002"
stop program = "/usr/bin/mongrel_rails cluster::stop
-C /u/apps/testapp/current/config/mongrel_cluster.yml
--only 10002"
if totalmem is greater than 110.0 MB for 4 cycles then restart
if cpu is greater than 80% for 4 cycles then restart
if 10 restarts within 10 cycles then timeout
group testapp