Re: Monit believes process failed when it didn't

Hi,

Because the PID file is handled by the background process, a timeout gives that process time to finish starting up and create the PID file. Usually the init script or a wrapper process waits for initialization to finish before daemonizing itself. Some scripts use a loop with sleep and an upper bound on the number of tries, analogous to your idea (see the JBoss startup script). Others use asynchronous messages (see the systemd service type descriptions for some ideas).
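To make the "loop with sleep and an upper bound" idea concrete, here is a rough sketch in plain sh (it should also run under busybox ash). The function name wait_for_pidfile and its arguments are made up for illustration, and it assumes the daemon writes its own PID file once it has finished initializing. I have not tested it on OpenWrt, so treat it as a starting point rather than a confirmed fix.

```shell
#!/bin/sh
# Hypothetical helper (not part of monit): poll until PIDFILE exists and
# names a live process, giving up after MAX_TRIES one-second attempts.
# Usage: wait_for_pidfile PIDFILE MAX_TRIES
wait_for_pidfile() {
    wfp_pidfile=$1
    wfp_max_tries=$2
    wfp_tries=0
    while [ "$wfp_tries" -lt "$wfp_max_tries" ]; do
        # -s: the file exists and is not empty.
        # kill -0 sends no signal; it only checks that the PID can be signalled.
        if [ -s "$wfp_pidfile" ] && kill -0 "$(cat "$wfp_pidfile")" 2>/dev/null; then
            return 0
        fi
        wfp_tries=$((wfp_tries + 1))
        sleep 1
    done
    return 1
}
```

In the init script, start() could launch /usr/bin/myprocess.sh in the background and then call something like "wait_for_pidfile /tmp/myprocess.pid 10" before returning, so the script only exits once the process is visible or the bound is reached. The value 10 is an arbitrary example and should stay below whatever start timeout monit is configured with.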

On Thu, Dec 3, 2020, 10:47 PM Eric Montellese <eric@emeforward.com> wrote:

Got an odd one for ya...

I have a (legacy) shell script that I need to call from monit. This shell script runs an infinite loop. The platform is a busybox-based openwrt platform (so, the script is running 'ash').

On this platform, it appears that the timing of background processes is not quite as expected. I'd like to understand the expected methodology. The method from https://mmonit.com/wiki/Monit/FAQ#pidfile would seem to be foolproof to avoid the issue I'm seeing (below). However, this method fails outright (see below).

I'm currently running Monit version 5.26.0

The monit config is pretty simple:

check process myprocess with pidfile /tmp/myprocess.pid
    start program = "/etc/monit.rc/myprocess.init start"
    stop program = "/etc/monit.rc/myprocess.init stop"
    depends on other_process

myprocess.init is also quite simple (just showing the 'start' method). Here are three different things I've tried:

1. Following the example in the monit docs:
start() {
    echo $$ > /tmp/myprocess.pid
    exec /usr/bin/myprocess.sh
}

In this case, monit says that the "process never returned" and tries to restart it. Of course the process didn't return, so why is this the documented method? Is this a difference in versions of monit (vs the documentation I'm using)?

2. Jam that sucker into the background
start() {
    /usr/bin/myprocess.sh &
    echo $! > /tmp/myprocess.pid
}

Surprisingly, this also does not work. In this case, the pid file is created as expected, but monit does *not* think that the process is running.

3. Try something silly?
start() {
    /usr/bin/myprocess.sh &
    echo $! > /tmp/myprocess.pid
    sleep 1
}

Adding a 'sleep' fixes the issue... but why?

For debugging, I've also tried replacing the 'sleep' with:
'ps | grep myprocess > /tmp/output'

In this case, I *do* see the process listed in /tmp/output, but monit also reports the process as running. (So it's a heisenbug.)

Questions:
1. What is the "normal" way to do this?
2. Anyone seen this sort of behavior on an embedded system?

Best Regards,
Eric

From:	Luca Cazzaniga
Subject:	Re: Monit believes process failed when it didn't
Date:	Fri, 4 Dec 2020 11:05:16 +0100