On Fri, 26 Sep 2003, Martin Pala wrote:
Christian Hopp wrote:
On Fri, 26 Sep 2003, Martin Pala wrote:
2) A watchdog thread... (I think I mentioned this, hidden somewhere
in a mail some days ago)
We detach the validation as one thread; we already have the http
thread, plus whatever else might come. The main thread becomes the
watchdog. After each validation and after each accept (w/ or w/o
timeout), each thread sets a semaphore which is checked by the
watchdog. If one of them has not done so, that thread is restarted or
monit is restarted.
I'm not sure whether it is necessary to dedicate the main thread to the
watchdog. It could be useful to watch the httpd thread, but I think that
could be implemented inside the main thread. As a workaround in case
monit httpd availability is an issue, it can be monitored with the
present monit version from the main thread by:
  check host monit-httpd with address 127.0.0.1
    if failed port 2812 protocol http
      then exec "/usr/bin/monit -c /etc/monitrc reload"
This is not exactly the same thing as an httpd thread watchdog, but it
can solve the issue too. If monit's httpd is not accessible, monit will
reload itself, which will cause the failed thread to start again.
But this only works if you start the http support. If you think of
an application where you don't need/want the http support, you have
no chance of finding out whether monit is running or not.
The main thread can be watched by init: if it dies, init will respawn
it => the critical functionality is kept regardless of http. If we
implement a watchdog thread, it will still need to be watched by init
to make sure that it works.
The watchdog serves more the purpose of finding deadlocks or other
situations where the threads might get stuck. Init just protects against
unexpected QUIT/TERM/... situations. And the watchdog is nothing else
but:
  clean_sems()
  start_validate()
  start_httpd()
  while TRUE
    sleep (cycletime)   (or select, cycletime > max_expected_cycle)
    if (! sem_validate )
      restart_validate()
    sem_validate = FALSE
    if (run_httpd && ! sem_httpd )
      restart_httpd()
    sem_httpd = FALSE
  end
If it is as simple as this, it does not require additional supervision.
IMO it does not require LOCKs. In any case, LOCKs could lock up the
watchdog, and that should not happen.
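
For illustration only, the pseudocode loop above could be sketched in
Python roughly as follows. This is not monit code: the names Worker,
beat, check_and_clear, watchdog_pass and run_watchdog are made up for
the example, the restart callback stands in for restart_validate() /
restart_httpd(), and a threading.Event plays the role of sem_validate /
sem_httpd. The watchdog itself takes no explicit lock, in line with the
point above.

```python
import threading


class Worker:
    """Pairs a heartbeat flag with a restart callback (hypothetical names)."""

    def __init__(self, name, restart):
        self.name = name
        self.restart = restart            # stands in for restart_validate() etc.
        self._beat = threading.Event()    # plays the role of sem_validate / sem_httpd

    def beat(self):
        """Called by the worker thread after each validation cycle or accept()."""
        self._beat.set()

    def check_and_clear(self):
        """Return True if the worker reported in since the last check, then reset."""
        alive = self._beat.is_set()
        self._beat.clear()
        return alive


def watchdog_pass(workers):
    """One iteration of the watchdog loop; returns names of restarted workers."""
    restarted = []
    for w in workers:
        if not w.check_and_clear():
            w.restart()
            restarted.append(w.name)
    return restarted


def run_watchdog(workers, cycle_time, stop):
    """Main-thread loop: wait one cycle, then check every worker.

    cycle_time must exceed the longest expected worker cycle, otherwise a
    healthy worker would be restarted. `stop` is a threading.Event that
    ends the loop when set.
    """
    while not stop.wait(cycle_time):
        watchdog_pass(workers)
```

A worker that calls beat() at least once per cycle is left alone; one
that stays silent for a whole cycle gets its restart callback invoked on
the next pass. If a beat lands between the read and the reset of the
flag it is lost, but a healthy worker beats again before the next check,
so nothing is restarted wrongly as long as cycle_time is long enough.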
CHopp