Re: Features...
From: Jan-Henrik Haukeland
Subject: Re: Features...
Date: Fri, 26 Sep 2003 17:14:59 +0200
User-agent: Gnus/5.1002 (Gnus v5.10.2) XEmacs/21.4 (Reasonable Discussion, linux)
Christian Hopp <address@hidden> writes:
> clean_sems()
> start_validate()
> start_httpd()
>
> while TRUE
>   sleep (cycletime) (or select, cycletime>max_expected_cycle)
>   if (! sem_validate )
>     restart_validate()
>   sem_validate= FALSE
>
>   if (run_httpd && ! sem_httpd )
>     restart_httpd()
>   sem_httpd= FALSE
> end
Okay, this clarified a bit what you had in mind. It's not a bad idea
on paper, by all means, but still there are fundamental problems here.
1) First, I still think that this is an attempt to work around
something that is wrong and should be fixed in the code itself.
2) You cannot use sleep (cycletime) in the watchdog thread. This is
too uncertain; the validate thread can have a lot of work to do and
it can take time before it can set the semaphore. Blindly
restarting the validate thread without knowing whether it's really
running or not is not a good solution. Another and much bigger
problem: since the validate thread can run fast or slow depending
on how much work it has to do, the time at which it sets the
semaphore will vary. So it will set the semaphore at cycletime+X,
where X is the time it takes the validate thread to run. Think of
this as timezones: the watchdog thread is located at GMT and says
that it should take the validate thread 80 days (cycletime) to
travel around the world. The problem is that the validate thread
must sometimes go by boat, sometimes by airplane, and sometimes it
must simply wait for a connection. In other words, in one cycle it
will use 60 days to travel around the world and in another cycle it
will use 120 days. See the problem? Because the validate thread
cannot and will not run in constant time, you *cannot* use
constant time to check if it's running.
3) If the validate thread and the http thread are hung in a mutex
deadlock, simply calling restart_xxx() will hang as well, and the
watchdog thread will just fill up the call stack with
restart_xxx() calls until it gets a SIGSEGV.
4) There are many more situations that can go wrong, just give me time
to think them up if you are not convinced already :-)
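
To make point 2 concrete, here is a minimal sketch in Python. It is not
monit code: the function name false_restarts and the numbers are
assumptions made up for illustration, and it uses simulated time rather
than real threads. It counts how often a watchdog that wakes at a
constant period would restart a validate thread that is perfectly
healthy but needs a varying amount of time per cycle.

```python
def false_restarts(cycle_durations, watchdog_period):
    """Count watchdog checks that find the flag unset although the
    worker is healthy (hypothetical helper, simulated time only).

    The worker sets the flag at the end of each of its cycles. The
    watchdog wakes every watchdog_period, tests the flag, restarts
    the worker if the flag is unset, and then clears the flag.
    """
    # Absolute times at which the healthy worker sets the flag.
    set_times = []
    t = 0
    for duration in cycle_durations:
        t += duration
        set_times.append(t)

    restarts = 0
    next_set = 0                 # index of the next flag-set event
    check = watchdog_period      # time of the next watchdog wake-up
    while check <= set_times[-1]:
        flag = False
        # Consume every flag-set event since the previous check.
        while next_set < len(set_times) and set_times[next_set] <= check:
            flag = True
            next_set += 1
        if not flag:
            restarts += 1        # worker was busy, not dead
        check += watchdog_period
    return restarts


# Worker alternates between 60-day and 120-day trips, watchdog expects 80.
print(false_restarts([60, 120, 60, 120], 80))   # -> 2
# Worker always needs 80, watchdog waits 100: no false alarm.
print(false_restarts([80, 80, 80, 80], 100))    # -> 0
```

In the first call the worker never hangs, yet the watchdog would have
restarted it twice, simply because a cycle was still in progress when
the check happened.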
--
Jan-Henrik Haukeland