Re: event engine patch update

monit-dev

From:	Martin Pala
Subject:	Re: event engine patch update
Date:	Sun, 28 Mar 2004 18:00:34 +0200
User-agent:	Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040312 Debian/1.6-3

Jan-Henrik Haukeland wrote:

Martin Pala <address@hidden> writes:

Yes :)

Attached is a version for the current CVS sources. In addition to the
last mentioned patch, it fixes a compile-time warning about the
deprecated use of a cast expression as an lvalue in net.c (not related
to the event engine refactoring).

I don't know of any refactoring-related bug at the moment - it seems to
work well (but it still needs further testing). I think the patch is
ready for check-in.



Well, that was a big patch! I have just browsed through it and
concentrated the browsing around the data structure changes in
monitor.h and read event.c/h and validate.c. The first impression is
that this looks like an improvement and refactors away the weaknesses
in the original implementation.

As far as I can see, you refactor out the internal event handling in
validate.c and centralize it in event.c, this is good. Event handling
in validate.c is basically reduced to posting an event for any and all
tests in validate, which is also good since you now have one single
unified interface into the event machinery from validate, i.e.
Event_post() and not the ugly internal flag settings we had before.

Events posted via Event_post() are put on an event list and handled
based on certain conditions in event.c:handle_event().

Did I get it right?

Yes


I think this patch is an improvement and I'm +1 for checking this into
CVS. But first I have a few comments and questions

1) Have you run monit with this patch through valgrind? To me it looks
like events are added to the list but not dequeued and that there is a
massive memory leak here. Event_free() is not called at all in the code.

The event queue is per service - each service has its own related
events list.

Monit needs to know whether the state changed. The state depends on the
result of a testing rule, so it is directly tied to that particular
rule. Thus we need to keep the result of each testing rule and compare
it with the new value in the next cycle. This is solved via the events
list.

The list is empty until a FAILED event occurs. In that case, monit adds
the appropriate event to the list if it doesn't exist already. PASSED
events are ignored until the first FAILED event occurs (this way monit
will not flood you with tens of "up" events on its startup) => we start
to watch state changes after the first failure.

Events are kept in the list until monit is stopped or reloaded - they
must not be deleted, because we need to handle states where the error
ratio exceeds the timeout limit (too many
failed->passed->failed->passed->... state changes in a given
timeframe). This also makes it easy to implement triggers in the
future - you can have several error levels and take a custom action
depending on the error ratio.

So, let's say some event (FAILED or PASSED) of a given type
(EVENT_NONEXIST/EVENT_SPACE/EVENT_CONNECTION/...) occurred. Monit will
try to find an event with the same origin (i.e. produced by the same
testing rule) and the same type identification.

The rule which produced the event is uniquely identified by the
'action' parameter. 'action' points to the address of an EventAction
object, which defines the per-rule custom actions to take when a FAILED
or PASSED event occurs. Thus when monit posts the event, the event
inherits and shares the EventAction via its memory address.
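
To illustrate the identification by address, here is a minimal sketch.
The struct layouts, the ActionKind enum and same_origin() are
hypothetical and simplified for illustration; the real definitions are
in monitor.h and differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types; not the real monitor.h definitions. */
typedef enum { ACTION_ALERT, ACTION_RESTART } ActionKind;

typedef struct {
    ActionKind failed;  /* what to do when the rule fails */
    ActionKind passed;  /* what to do when the rule passes again */
} EventAction;

typedef enum { EVENT_NONEXIST, EVENT_SPACE, EVENT_CONNECTION } EventType;

typedef struct Event {
    EventType type;
    const EventAction *action;  /* shared with the rule, never copied */
    struct Event *next;
} Event;

/* An event has the same origin as a posted one when both the type and
 * the EventAction address match. Two rules with identical settings are
 * still distinct, because their EventAction objects live at different
 * addresses. */
static int same_origin(const Event *e, EventType type,
                       const EventAction *action) {
    return e->type == type && e->action == action;
}
```

Comparing pointers rather than contents is what lets one rule's history
stay separate from another rule that happens to be configured the same
way.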

If monit finds the same event, it compares the state. When the state
has changed, Event_post sets the new state and the state_changed flag.
If the event was not found and it is a FAILED event, monit adds it to
the list. Control is then passed to handle_event.
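
The lookup-and-update steps above can be sketched roughly as follows.
This is not the code from the patch: post_event_sketch(), the State
enum and the Event layout are hypothetical, and setting state_changed
on a newly added FAILED event is my assumption, made so that the first
failure also counts as a change.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { STATE_PASSED, STATE_FAILED } State;

/* Hypothetical, simplified event record. */
typedef struct Event {
    int type;            /* stands in for EVENT_SPACE and friends */
    const void *action;  /* address of the rule's action object */
    State state;
    int state_changed;
    struct Event *next;
} Event;

/* Returns the matching or newly added event, or NULL when the post is
 * ignored (PASSED before any failure) or allocation fails. */
static Event *post_event_sketch(Event **list, int type, State state,
                                const void *action) {
    for (Event *e = *list; e != NULL; e = e->next) {
        if (e->type == type && e->action == action) {
            if (e->state != state) {       /* state flipped */
                e->state = state;
                e->state_changed = 1;
            }
            return e;
        }
    }
    if (state != STATE_FAILED)
        return NULL;                       /* nothing to track yet */
    Event *e = calloc(1, sizeof *e);
    if (e == NULL)
        return NULL;
    e->type = type;
    e->action = action;
    e->state = STATE_FAILED;
    e->state_changed = 1;                  /* assumption, see above */
    e->next = *list;                       /* push onto the service list */
    *list = e;
    return e;
}
```

Note that the sketch never removes an entry, which matches the point
that events stay on the list until monit stops or reloads.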

handle_event calls handle_action with the per-rule failed or passed
state handler (based on the event's polarity - FAILED or PASSED). The
state_changed flag is reset after the event has been handled, thus we
can compare the result of the next cycle and decide whether it is
different or equal.

handle_action calls handle_alert regardless of event type, but on state
change only => the alert is delivered only once per state change. Thus
as soon as the service fails, you will receive a FAILED alert, and as
soon as the service has recovered, you will receive a PASSED alert
(the event description is type dependent of course :)

All failed event occurrences are handled; of the passed events, only
the first occurrence is handled. The service can change state
failed->passed->failed->passed->... and monit keeps monitoring and
handling events all the time, until the timeout occurs (if one is
defined).

handle_action takes care of all state handler actions
(ACTION_ALERT/ACTION_EXEC/ACTION_RESTART/ACTION_START/ACTION_STOP/ACTION_UNMONITOR).
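
As a rough model of the handling rules (every failed occurrence is
handled, only the first passed one, and alerts go out on state change
only), consider the sketch below. handle_event_sketch() and the
Counters struct are hypothetical; the real handle_action dispatches on
the ACTION_* type instead of counting.

```c
#include <assert.h>

typedef enum { STATE_PASSED, STATE_FAILED } State;

/* Hypothetical, reduced event: only what the handling decision needs. */
typedef struct {
    State state;
    int state_changed;
} Event;

typedef struct {
    int alerts;   /* alerts that would have been sent */
    int actions;  /* times the rule's state handler would have run */
} Counters;

static void handle_event_sketch(Event *e, Counters *c) {
    /* FAILED is handled every cycle; PASSED only when it is new. */
    if (e->state == STATE_FAILED || e->state_changed) {
        c->actions++;
        if (e->state_changed)
            c->alerts++;        /* one alert per state change */
    }
    e->state_changed = 0;       /* reset so the next cycle can compare */
}
```

With this model, a service that fails for two cycles and then recovers
runs its handler three times but sends two alerts: one for the failure
and one for the recovery.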


Attached is a simplified picture of the new monit event engine.


2) In validate.c, the report parameter has been bugging me for a while.
Can you please get rid of it by calling Event_post() in the lower
functions? :-) E.g. check_uid() and check_gid() can just call
Event_post() directly. This makes the code simpler as well.


Good suggestion :)

3) Please use the "one-true bracket style" in the code i.e.

         if(x) {
             blabla
         }

   and also break lines at 79 column width. Apropos I found out that
   it is possible to use 'C-c .' to set the C indentation style (to
   gnu) in (X)emacs, which is the style we use.

OK. In event.c I used the style which I prefer; I hope it will not
interfere too much.


4) It would be nice (but not required) if you could give a short
"formalized" description of the new event machinery and data
structure. For example a drawing and/or text. After all this is a
pretty important part of the code and it would be good to have an
overview :)

The new data structure encapsulates the objects and is based on object
relationships and inheritance/sharing. I hope the text above and the
picture will help. The picture doesn't show all objects and their names
(for example, an Action_T object optionally contains a Command_T object
in the case of the ACTION_EXEC type) - details can be found in
monitor.h.



Cheers,
Martin
