[monit 3.2 release plan]
From: Jan-Henrik Haukeland
Subject: [monit 3.2 release plan]
Date: 10 Feb 2003 18:11:00 +0100
User-agent: Gnus/5.0808 (Gnus v5.8.8) XEmacs/21.4 (Civil Service)
To summarize the latest discussion, here is the list of remaining
tasks before we can make a monit 3.2 release.
1. Reload monit on SIGHUP only, as recently discussed on this list
Responsible: Jan-Henrik
2. Fix race conditions issues with monit restart
Responsible: Martin
3. Solve the SSLv2/SSLv3/TLS detection problem as reported by Mark F.
Responsible: Christian
4. In a "restart" situation the resource data "ProcInfo_T" is not
correctly reinitialized (zeroed), e.g. it reports -6324% CPU load
Responsible: Christian
5. Add Oliver Jehle's documentation on how to use monit with Heartbeat.
Responsible: Jan-Henrik
I have enclosed Oliver's doc. below if you want to check it out
first. (It's a patch for monit.pod -- we read patch files like others
read newspapers, do we not?)
--- monit.pod Sat Jan 11 03:14:01 2003
+++ /home/oj/monit.pod Tue Jan 28 08:09:07 2003
@@ -285,21 +285,143 @@ clusters. For instance, using the I<hear
(http://linux-ha.org/) to watch the health of nodes and in the
case of one machine failure start services on a secondary node.
-Appropriate scripts that can call monit to start/stop specific
-services are needed on both nodes - typical usage:
-
- FILE DESCRIPTION
- -----------------------------------
- /etc/inittab starts monit
- /etc/rcS.d/S41heartbeat execute "monit start heartbeat"
- /etc/init.d/monit-node1 execute "monit -g node1 start"
- /etc/init.d/monit-node2 execute "monit -g node2 start"
-
-This way hearbeat can easily control the cluster state and if one
-node fails, hearbeat will start monit-xxxxx on the running node
-and monit is instructed to start the services of the failing node
-and monitor them...
+=head2 Monit with heartbeat
+The first thing you have to do is install and configure
+I<heartbeat> (http://www.linux-ha.org/downloads).
+The Getting Started Guide is very useful for this task
+(http://www.linux-ha.org/download/GettingStarted.html).
+
+B<Starting up a Node>
+
+This is the normal start sequence for a cluster node.
+With this sequence, there should be no error case that is not
+handled by either heartbeat or monit. For example, if monit
+dies, initd restarts it. If heartbeat dies, monit restarts it. If
+the node dies, heartbeat on the other node detects it and restarts
+the services there.
+
+ 1) initd starts monit with group local
+ 2) monit starts heartbeat in local group
+ 3) heartbeat requests monit to start the node group
+ 4) monit starts the node group
+
+B<Monit F</etc/monitrc>>
+
+This sample describes a cluster with two nodes.
+Services running on Node 1 are in group I<node1>, Node 2 services
+are in group I<node2>.
+
+The local group entries use mode I<active>; the node group
+entries use mode I<manual> and are controlled by heartbeat.
+
+ #
+ # local services on every host
+ #
+ #
+ check heartbeat with pidfile /var/run/heartbeat.pid
+ start program = "/etc/init.d/heartbeat start"
+ stop program = "/etc/init.d/heartbeat stop"
+ mode active
+ alert address@hidden
+ group local
+ #
+ #
+ check postfix with pidfile /var/spool/postfix/pid/master.pid
+ start program = "/etc/init.d/postfix start"
+ stop program = "/etc/init.d/postfix stop"
+ mode active
+ alert address@hidden
+ group local
+ #
+ # node1 services
+ #
+ check apache with pidfile /var/apache/logs/httpd.pid
+ start program = "/etc/init.d/apache start"
+ stop program = "/etc/init.d/apache stop"
+ depends named
+ alert address@hidden
+ mode manual
+ group node1
+ #
+ #
+ check named with pidfile /var/tmp/named.pid
+ start program = "/etc/init.d/named start"
+ stop program = "/etc/init.d/named stop"
+ alert address@hidden
+ mode manual
+ group node1
+ #
+ # node2 services
+ #
+ check named-slave with pidfile /var/tmp/named-slave.pid
+ start program = "/etc/init.d/named-slave start"
+ stop program = "/etc/init.d/named-slave stop"
+ mode manual
+ alert address@hidden
+ group node2
+ #
+ #
+ check squid with pidfile /var/squid/logs/squid.pid
+ start program = "/etc/init.d/squid start"
+ stop program = "/etc/init.d/squid stop"
+ depends named-slave
+ alert address@hidden
+ mode manual
+ group node2
+
+B<initd F</etc/inittab>>
+
+Monit is started on both nodes by initd. You have to add an
+entry in F</etc/inittab> that starts monit with the local group,
+of which heartbeat is a member.
+
+ #/etc/inittab
+ mo:2345:respawn:/usr/local/bin/monit -i -d 10 -c /etc/monitrc -g local
+
+B<heartbeat F</etc/ha.d/haresources>>
+
+When heartbeat starts, it looks up the node entry and
+starts the script F</etc/init.d/monit-node1> or
+F</etc/init.d/monit-node2>. The script calls monit
+to start the node-specific group.
+
+ # /etc/ha.d/haresources
+ node1 IPaddr::172.16.100.1 monit-node1
+ node2 IPaddr::172.16.100.2 monit-node2
+
+
+B<F</etc/init.d/monit-node1>>
+
+ #!/bin/bash
+ #
+ # sample script for starting/stopping all services for node1
+ #
+ prog="/usr/local/bin/monit -g node1"
+ start()
+ {
+ echo -n $"Starting $prog:"
+ $prog start
+ echo
+ }
+
+ stop()
+ {
+ echo -n $"Stopping $prog:"
+ $prog stop
+ echo
+ }
+
+ case "$1" in
+ start)
+ start;;
+ stop)
+ stop;;
+ *)
+ echo $"Usage: $0 {start|stop}"
+ RETVAL=1
+ esac
+ exit $RETVAL
=head1 ALERT MESSAGES
--
Jan-Henrik Haukeland