I think I have found the issue, and I may have to walk back my earlier statements about this affecting a particular version or distro.
I have 8 monitored services in an "Execution failed" state. None of the services have a timeout defined.
The timeout apparently defaults to EXEC_TIMEOUT (30 seconds). Monit waits the full 30 seconds for each start attempt to fail before moving on to the next service, which is also in an "Execution failed" state. Here is the log for one of the services:
[EDT Aug 5 13:03:58] error : 'service_name' process is not running
[EDT Aug 5 13:03:58] info : 'service_name' trying to restart
[EDT Aug 5 13:03:58] info : 'service_name' start: /etc/init.d/service_name
[EDT Aug 5 13:03:58] info : Sleeping for 100 ms (src/control.c:127)
[EDT Aug 5 13:03:58] info : Sleeping for 100 ms (src/control.c:127)
[EDT Aug 5 13:03:58] info : Sleeping for 50000 ms (src/control.c:159)
[EDT Aug 5 13:03:58] info : Sleeping for 100000 ms (src/control.c:159)
[EDT Aug 5 13:03:58] info : Sleeping for 200000 ms (src/control.c:159)
[EDT Aug 5 13:03:58] info : Sleeping for 400000 ms (src/control.c:159)
[EDT Aug 5 13:03:59] info : Sleeping for 800000 ms (src/control.c:159)
[EDT Aug 5 13:04:00] info : Sleeping for 1600000 ms (src/control.c:159)
[EDT Aug 5 13:04:01] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:02] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:03] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:04] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:05] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:06] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:07] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:08] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:09] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:10] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:11] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:12] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:13] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:14] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:15] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:16] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:17] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:18] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:19] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:20] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:21] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:22] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:23] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:24] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:25] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:26] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:27] info : Sleeping for 1000000 ms (src/control.c:159)
[EDT Aug 5 13:04:28] error : 'service_name' failed to start (exit status 1) -- /etc/init.d/service_name: Shutting down service_name: [ OK ]
Starting service_name: [ OK ]^M[FAILED]
8 services at 30 seconds each = 240 seconds, which means the sleep(Run.polltime) in monit.c only gets called about once every 4 minutes, even though the daemon interval is set to 10 seconds. Notice the gap of a little over 4 minutes between each occurrence:
# grep 'src/monit.c' /var/log/monit
[EDT Aug 5 12:56:33] info : Sleeping for 10 seconds (src/monit.c:561)
[EDT Aug 5 13:00:46] info : Sleeping for 10 seconds (src/monit.c:561)
[EDT Aug 5 13:05:00] info : Sleeping for 10 seconds (src/monit.c:561)
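As a sanity check on that arithmetic, here is a minimal Python sketch that compares the expected blocking time with the gaps between the three timestamps above. The timestamps are copied from the grep output; the helper name gaps_in_seconds and the constant names are just mine for illustration, and the 30 second figure is the default I am assuming from EXEC_TIMEOUT.

```python
from datetime import datetime

# Assumptions taken from the post: 8 failing services, each blocking
# for the default EXEC_TIMEOUT of 30 seconds before monit moves on.
SERVICES_FAILING = 8
EXEC_TIMEOUT_S = 30

# Timestamps copied from the 'src/monit.c' lines in /var/log/monit.
stamps = ["12:56:33", "13:00:46", "13:05:00"]


def gaps_in_seconds(times):
    """Return the gap in seconds between each consecutive pair of HH:MM:SS times."""
    parsed = [datetime.strptime(t, "%H:%M:%S") for t in times]
    return [int((b - a).total_seconds()) for a, b in zip(parsed, parsed[1:])]


expected_blocking = SERVICES_FAILING * EXEC_TIMEOUT_S
observed = gaps_in_seconds(stamps)

print(expected_blocking)  # 240
print(observed)           # [253, 254]
```

The observed gaps are 253 and 254 seconds. That is the 240 seconds of serial start timeouts plus the 10 second poll sleep, with the last few seconds presumably going to the remaining checks in the cycle.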
So how can I control the execTimeout without having monit give up on trying to start that service?
Thanks,