monit-dev
Re: plans


From: Martin Pala
Subject: Re: plans
Date: Wed, 25 Jun 2003 01:01:32 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3.1) Gecko/20030618 Debian/1.3.1-3

The proposal below isn't very "user-friendly": some chunks are not compact. For example, I used the 'cpu' event in the example for point 1.) and then switched to the 'timeout' event in points 2.) - 6.), so the example is not consistent.

Some facts are consequences of the described syntax and the present behavior. For example, I didn't mention the 'mode' statement - it should be:
1.) mode active: event->method->alert
2.) mode passive: event->alert
3.) mode manual: enters active mode as soon as instructed to start (manually by the user)
=> passive mode will skip method execution.
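
A minimal Python sketch of how the three modes could select the steps taken for an event. It is only an illustration: the function name 'steps_for_event' and the 'started_by_user' flag are hypothetical, and the choice to do nothing for a manual-mode service that has not yet been started is an assumption, since the proposal does not say what happens before the start.

```python
def steps_for_event(mode, started_by_user=False):
    """Return the ordered steps taken when an event fires in the given mode."""
    if mode == "manual":
        if not started_by_user:
            # Assumption: a manual service is ignored until the user starts it.
            return []
        # Once started by the user, manual mode behaves like active mode.
        mode = "active"
    if mode == "active":
        return ["method", "alert"]
    if mode == "passive":
        # Passive mode skips method execution and only alerts.
        return ["alert"]
    raise ValueError("unknown mode: %r" % (mode,))
```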

In general, the newly proposed syntax stays simple for most cases (without overhead in comparison with the present syntax), and in addition it allows nearly any complex situation to be handled gracefully - it maps events, methods and alerts in a standard but flexible way.

What about it? Maybe it would be useful to prepare a better proposal, describing the syntax, its relationship to the present scheme and the internal flow.

Martin

Martin Pala wrote:

Jan-Henrik Haukeland wrote:

... and the current language
does start to get messy.

I agree - it's not much fun.

I think we should maybe purge obsolete configuration statements in the next major release to clean up the syntax and documentation. It will break backward compatibility (which is neither good nor popular), but it could ease usage. For old-style configs it is possible to supply an upgrade script + upgrade documentation (which Christian has already prepared). I think new (and old) users will appreciate a simple common syntax.

What is your opinion?

Christian is also right about that a global
method section would be useful, because most of the methods (except
start/stop) may be reused by many different check-blocks.  Maybe the
cleanest is to only leave the start/stop commands in a check-block and
define other methods globally?
Most of the methods could be reusable (global), but when the user needs to define different methods for the same event in different services, he will be forced to choose non-conflicting names for those global methods. On the other hand, it will be easier to support just global methods from the implementation's point of view (no problem with the mentioned global vs. local lists). The price is a restriction of the syntax.

I personally prefer support for both global and local method definitions. I think it is a more flexible and universal syntax - the user can choose whether to use just global methods, just local methods, or both, whenever suitable.

 As in

# Global method definition section
 method reload "kill -HUP $MONIT_PROGRAM_PID"
 method shutdown "/sbin/shutdown -t 1 bye, bye"
 method foo "/bin/xzz -option 1 ... -option N  argument1.. argumentN"
method bar "/bin/xyz -option 1 ... -option N argument1.. argumentN" exec on {

For the global list I prefer the prefix 'set', as for other global statements (such as "set logfile ...", etc.), to keep the syntax consistent:

# Global method definition section
 set method reload "kill -HUP $MONIT_PROGRAM_PID"
 set method shutdown "/sbin/shutdown -t 1 bye, bye"
 set method foo "/bin/xyz -option 1 ... -option N  argument1.. argumentN"


# check section
check ...
 start program = "/usr/local/apache/bin/http start"
 stop program = "/usr/local/apache/bin/http stop"
 # tests definition section
 if timestamp /usr/local/apache/conf/httpd.conf was changed then exec reload
 if cpu usage > 80% then exec foo
 if cpu usage > 99% then exec shutdown

But what should we do with these statements?
 checksum /usr/local/apache/bin/http
 timeout if 2 restarts within 3 cycles
 group server
 every 2 cycles
 depends sybase
 alert address@hidden on { checksum }
 host shop.sol.no port 443 type TCPSSL protocol http

They are not if-tests per se. Should we leave them as is or modernize
them as well?

This is a difficult one.

Some of them produce events indirectly:

1.) group server

This probably won't be a big problem - the group is used for start/stop/restart/status externally from the service's point of view => there are no bindings to other events


2.) depends sybase

What Monit should do when it executes some method for the "master" service is hard to decide. It requires classifying methods:

- either the method triggers an event/action in its dependants
- or the method is really local (doesn't involve referrals)

Maybe syntax like this can solve it:

check process foo ...
 method start "/etc/init.d/foo start" type trigger
 method stop "/etc/init.d/foo stop" type trigger
 method reload "kill -HUP $MONIT_PROGRAM_PID" [type private]
 ...

check process bar ...
 method start "/etc/init.d/bar start" [type private]
 method stop "/etc/init.d/bar stop" [type private]
 method reload "kill -HUP $MONIT_PROGRAM_PID" [type private]
 depends on foo
 ...


Where:

TYPE {TRIGGER|PRIVATE} ... defines whether the method will be propagated to dependants. If the method has the 'trigger' flag, a method with the same name is expected in the dependant's configuration. If a global method has the 'trigger' flag set, it is inherited by all services, and when one link of the chain runs this method, all its dependants will run it too and forward it to their children.

'TYPE PRIVATE' is the default when the optional TYPE argument isn't specified at all - but the logic could be reversed => TYPE TRIGGER could be the default and you would have to specify 'TYPE PRIVATE' to make an exception (which is maybe closer to the present behavior).
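
The propagation rule can be sketched in Python as follows. This is only a rough illustration under stated assumptions: the data layout ('services', 'dependants') and the function 'run_method' are hypothetical, 'private' is taken as the default, each service decides from its own flag whether to forward further, and a dependant that lacks a method of the same name is simply skipped.

```python
# Hypothetical configuration mirroring the foo/bar example above.
services = {
    "foo": {
        "start": {"command": "/etc/init.d/foo start", "type": "trigger"},
        "reload": {"command": "kill -HUP $MONIT_PROGRAM_PID"},
    },
    "bar": {
        "start": {"command": "/etc/init.d/bar start"},
        "reload": {"command": "kill -HUP $MONIT_PROGRAM_PID"},
    },
}
# bar depends on foo, so bar is a dependant of foo.
dependants = {"foo": ["bar"]}


def run_method(name, service, services, dependants, ran=None):
    """Collect (service, command) pairs that running `name` on `service` causes."""
    if ran is None:
        ran = []
    method = services.get(service, {}).get(name)
    # Skip services without this method, and services already visited.
    if method is None or any(s == service for s, _ in ran):
        return ran
    ran.append((service, method["command"]))
    # Only a 'trigger' method is forwarded to the dependants.
    if method.get("type", "private") == "trigger":
        for child in dependants.get(service, []):
            run_method(name, child, services, dependants, ran)
    return ran
```

With this data, running 'start' on foo also starts bar, while 'reload' on foo stays local because it is private by default.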

---

To generally consolidate the syntax, monit can produce events named after the tests:

checksum
cpu
mem
timestamp
timeout
host
unavailable (special - if the service doesn't exist any more)
available (special - if the service appeared)
stop (special - if the service was stopped)
start (special - if the service was started)
etc.

If the event occurs, Monit will:

1.) look whether this test has a methods list defined, such as:

if cpu usage > 80% then exec foo, bar

The order of searching for the 'foo' and 'bar' methods: first look for a local method, and if there is none, then for a global method.



2.) if a list of methods is not defined in the test statement, monit will look whether there is a local method registered for this event in its definition, such as:

method foo "/bin/xyz" on { timeout }


3.) if not, monit will look whether there is a local method with the same name as the event, such as (for example for the timeout event):

method timeout "/bin/xyz"


4.) if not, monit will look whether there is a global method registered for this event in its definition, such as:

set method foo "/bin/xyz" on { timeout }


5.) if not, monit will look whether there is a global method with the same name as the event, such as (for example for the timeout event):

set method timeout "/bin/xyz"

6.) after method processing (it doesn't matter whether some method was found and executed), continue to alert list processing. Alerts are sent to all users (in the local and global definitions) who have this event registered, such as:

# global alerts section
set alert address@hidden   # all events are to be sent
set alert address@hidden on { timeout checksum }

check process foo ...
 alert address@hidden on { timestamp }

=> in the case of a timestamp event of the process foo, the alert is sent to address@hidden and address@hidden
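
Steps 1.) - 6.) can be sketched in Python roughly as follows. This is a sketch under assumptions, not an implementation: the function names, the dictionary layout and the example.org addresses are all hypothetical, and when several methods are registered for the same event the first one found is used, which the proposal does not specify.

```python
def resolve_methods(event, test_methods, local, global_):
    """Return the commands to run for `event`, following steps 1.) - 5.)."""
    # Step 1: the test names its methods; look up each one locally, then globally.
    if test_methods:
        found = []
        for name in test_methods:
            method = local.get(name) or global_.get(name)
            if method:
                found.append(method["command"])
        return found
    # Steps 2-5: local scope before global scope; within a scope, a method
    # registered "on { event }" wins over a method named after the event.
    for scope in (local, global_):
        for method in scope.values():
            if event in method.get("on", ()):
                return [method["command"]]
        if event in scope:
            return [scope[event]["command"]]
    return []


def alert_recipients(event, local_alerts, global_alerts):
    """Step 6: every alert entry without an event list, or listing `event`."""
    return [a["to"] for a in global_alerts + local_alerts
            if not a.get("on") or event in a["on"]]


# Hypothetical data modelled on the examples in this message.
local = {"checksum": {"command": "/xyz"}}
global_ = {
    "reload": {"command": "kill -HUP $MONIT_PROGRAM_PID"},
    "foo": {"command": "/bin/xyz", "on": ["timeout"]},
}
global_alerts = [
    {"to": "all@example.org"},
    {"to": "ops@example.org", "on": ["timeout", "checksum"]},
]
local_alerts = [{"to": "web@example.org", "on": ["timestamp"]}]
```

Here a timeout event with no explicit method list falls through to the global 'foo' method, and a timestamp event alerts the catch-all address plus the local one.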



If monit finds a method as described above, it will run it. You can register methods to handle events globally and locally (local definitions override the default behavior).

Example syntax:

# global methods section
set method reload "kill -HUP $MONIT_PROGRAM_PID"
set method shutdown "/sbin/shutdown -t 1 bye, bye"
set method foo "/bin/xyz -option 1 ... -option N argument1.. argumentN" on { timeout }

# global alerts section
set alert address@hidden   # all events are to be sent
set alert address@hidden on { timeout checksum }

check process foo ...
 method start "/etc/init.d/foo start" type trigger
 method stop "/etc/init.d/foo stop" type trigger
 method checksum "/xyz"
 if host localhost port 80 then stop, start # will do restart of the whole chain
 if timestamp ... changed then exec reload # will run global 'reload' method
 checksum "/testfile" # will run local method checksum
 timeout(5,5) # will run global foo method
 alert address@hidden
 ...

check process bar ...
 method start "/etc/init.d/bar start"
 method stop "/etc/init.d/bar stop"
 method restart "/etc/init.d/bar restart"
 method reload "/etc/init.d/bar reload"
 if timestamp ... changed then exec reload # will run local 'reload' method - 'then exec reload' part is not needed to exec it
 depends on foo
 ...


What about it? The example isn't the cleanest syntax - I tried to show the possible usage. In most cases the syntax could be very simple. There are plenty of areas which need to be discussed/cleaned up.

The general flow is: event -> optional_methods -> optional_alerts

If this theory looks too bad, just put it in the "circle folder" (basket :)


Martin




Martin Pala <address@hidden> writes:

I think maybe we can save some configuration file size and make it
clearer if we use syntax like the following:

check ...
 # method definition section
 method start "/etc/init.d/apache start" exec on { start }
 method stop "/etc/init.d/apache stop" exec on { stop }
 method reload "kill -HUP $MONIT_PROGRAM_PID"
 method shutdown "/sbin/shutdown -t 1 bye, bye"
 method foo "/bin/xzz -option 1 ... -option N  argument1.. argumentN"
 method bar "/bin/xyz -option 1 ... -option N argument1.. argumentN" exec on {
  timeout, restart, checksum, resource }
 # tests definition section
 if timestamp /usr/local/apache/conf/httpd.conf was changed then exec reload
 if cpu usage > 80% then exec foo
 if cpu usage > 99% then exec shutdown


It separates method definitions and allows different methods to be used
for different condition "levels" of an event (see the last two lines of
the above example). You can then take different actions for the same
event, appropriate to the situation.

It could be useful, for example, in device checks - if the space
usage is growing and is above, say, an 80% watermark, you can
call a method to purge some temporary or unimportant files (cores,
etc.) or to compress/move them (this can be done on a regular basis by
cron, but monit could be used as a fast reaction system for unusual
situations, such as when the filesystem fills up and the cron job is
scheduled for midnight). However, if the first method didn't help and space
usage keeps growing, the stop method could be called once it
exceeds 99%, to gracefully stop services and prevent a catastrophic
situation. This is just an example for device checks - other cases
are useful for other monit tests (timestamp, etc.)
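
The watermark idea can be sketched in Python like this. It is a rough illustration only: the function 'actions_for_usage', the rule list and the method name 'purge_temp_files' are hypothetical, and running every method whose threshold is exceeded (rather than only the highest one) is an assumption.

```python
def actions_for_usage(usage_percent, rules):
    """Return the methods whose threshold is exceeded, lowest threshold first."""
    return [method for threshold, method in sorted(rules)
            if usage_percent > threshold]


# Hypothetical rules: purge above the 80% watermark, stop services above 99%.
rules = [(80, "purge_temp_files"), (99, "stop")]
```

At 85% usage only the purge method is selected; above 99% the stop method is selected as well.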

I haven't thought about the implementation, so it could probably be a lot of
work (redesigning the present start/stop program system). I think it
isn't necessary in that case to change the present alert statement (exec
could be a standalone list).

What about it?

Cheers,
Martin












