From: David Fletcher
Subject: Re: Monitoring Apache mod_status (was Apache, rotatelogs & chroot environment)
Date: Fri, 10 Dec 2004 22:09:55 +0000
Hi,
Reading your responses has been useful; there are some good ideas.
> BTW, I see a shortcoming in the protocol test interface. I think there
> should be a way to kickback error reports to validate.c so it can be
> included in the alert. Now lots of interesting errors can only be
> logged. We can change the signature of a protocol test to: int
> check_foobar(Socket_T s, char **errors); Where the protocol-test can
> allocate an error string upon errors and assign it to the errors
> parameter. validate.c will use this error string in the alert if it is
> non-null and validate.c is also responsible for freeing the
> errors-buffer. What do you think?
I agree that giving more information in the alert is a good idea. Currently
the information goes only to the logs, while the alerts are generic to the
protocol. I think this change needs input from everyone, not just me!
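To make the proposal concrete, here is a minimal sketch of how the extended
signature might be used. It is not taken from the Monit source: Socket_T is
stubbed as an opaque pointer, and check_foobar's body, run_protocol_test and
the message text are all hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for Monit's Socket_T; assumed to be an opaque pointer here. */
typedef struct Socket_S *Socket_T;

/* Hypothetical protocol test using the proposed signature.
 * Returns 1 on success, 0 on failure. On failure it allocates a
 * message and hands it back to the caller through *errors. */
int check_foobar(Socket_T s, char **errors) {
    if (s == NULL) {
        const char *msg = "FOOBAR: no socket to test";
        char *copy = malloc(strlen(msg) + 1);
        if (copy != NULL)
            strcpy(copy, msg);
        *errors = copy;   /* stays NULL if the allocation failed */
        return 0;
    }
    return 1;
}

/* Caller side, playing the role of validate.c: it puts the error
 * string into the alert text if one was set, then frees the buffer. */
int run_protocol_test(Socket_T s, char *alert, size_t alert_len) {
    char *errors = NULL;
    int ok = check_foobar(s, &errors);
    if (!ok)
        snprintf(alert, alert_len, "%s",
                 errors != NULL ? errors : "protocol test failed");
    free(errors);         /* free(NULL) is a no-op */
    return ok;
}
```

The caller owns the buffer in this sketch, which matches the suggestion that
validate.c is responsible for freeing it.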
> 1.) the limits in the patch are defined as percentage, but it is not
> obvious at first sight. Currently '%' character is used in monit control
> file for other tests where percentage limit is supported (cpu, memory,
This sounds like a good idea. I used percentages since they cope with
changes in the total number of Apache children, but making it clearer would be
good.
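As an illustration of why a percentage copes with a changing number of
children, it can be derived from the mod_status scoreboard string, where each
character is one slot ('_' waiting, 'W' sending a reply, 'L' logging, '.' an
open slot, and so on). This is only a sketch and not the code from the patch:
state_percent is a hypothetical helper, and leaving open slots out of the
total is an assumption.

```c
#include <stddef.h>

/* Percentage of scoreboard slots in a given state, e.g. 'L' for
 * logging or '_' for waiting. Open slots ('.') are left out of the
 * total; that is an assumption of this sketch. */
double state_percent(const char *scoreboard, char state) {
    size_t total = 0, matches = 0;
    for (const char *p = scoreboard; *p != '\0'; p++) {
        if (*p == '.')
            continue;     /* open slot: no child process here */
        total++;
        if (*p == state)
            matches++;
    }
    return total == 0 ? 0.0 : 100.0 * (double)matches / (double)total;
}
```

With the scoreboard "_WWL...." there are four children, so 'W' comes out at
50% whatever the number of open slots.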
> 2.) it could be good to support comparison operators as well, so it
> will be possible to use various combinations. It will be more consistent
> with other tests syntax too (such as in the case of 'space' example
> above). We can then check for example that there are always 10% child
> processes waiting for connection (i.e. ready to serve requests immediately):
>
> if failed port 80
> protocol apache-status waitlimit < 10% then alert
>
> This will allow to stack the actions too based on various error levels:
>
> if failed port 80
> protocol apache-status loglimit > 50% then alert
> if failed port 80
> protocol apache-status loglimit > 90% then restart
>
These comparisons are already there, but they are implicit rather than
written in the control file.
For all the monitored limits, except waitlimit, an action is taken when the
measured quantity exceeds the limit. For waitlimit, an action is triggered
when the measured quantity is below the waitlimit.
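The rule just described can be sketched as follows. The function name
limit_triggered is hypothetical, and the use of strict comparisons is an
assumption; it is an illustration, not the code in the patch.

```c
#include <string.h>

/* Returns 1 when the measured percentage should trigger the action.
 * Every limit fires when the measurement exceeds it, except
 * "waitlimit", which fires when the measurement drops below it. */
int limit_triggered(const char *name, double measured, double limit) {
    if (strcmp(name, "waitlimit") == 0)
        return measured < limit;
    return measured > limit;
}
```

For example, a loglimit of 50 triggers at a measured 60% but not at 40%,
while a waitlimit of 10 triggers at 5% but not at 20%.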
The 'escalation' approach of alert and restart is a good idea, but I haven't
tried it yet. I agree that some other name for the *limits would be better in
this case, perhaps *trigger or *level.
I have other work to do until after Christmas at the earliest, so I won't
make any changes now. It has taken quite a lot of work to get the patch this
far!! What do people want to do? I am happy if other people want to make
changes, and integrate the patch better with the rest of Monit.
Regards,
David.
--
-------------------------------------------------
Email: address@hidden
-------------------------------------------------