[time] Monitor spike?

sjm +timekeepers+sjm-lists+1621b94ba5.timekeepers#fortytwo.ch
Wed Apr 16 20:41:11 UTC 2008

> That's effectively what it does in the scoring mechanism; it just  
> gives more weight to measurements that didn't go well and it shows the  
> nitty gritty of each measurement.
> As always: Remember that the goal of the monitoring system is to kick  
> out systems that give bad time....

While I understand the above (and my server usually doesn't fall below 
the magic +5 score), it isn't strictly true.

The current monitoring algorithm gives the same weight to a *bad* time 
as to no response.  On my servers I frequently see a response or two 
lost (network congestion?).  Each lost response drops the score by 5, 
and each accurate response after that only slowly adds back to the 
score.  As I said, this normally hasn't hurt me or the pool at all, 
since the score doesn't fall below the magic score of five.
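
To make that concrete, here is a rough sketch of the behaviour I am 
describing.  It is not the pool's actual monitoring code: the drop of 5 
and the threshold of 5 are the numbers from above, while the +1 
recovery step, the cap of 20 and the names are my own assumptions.

```python
# Hypothetical model of the scoring described above; NOT the pool's real code.
IN_POOL_THRESHOLD = 5  # the "magic" score mentioned in this post
PENALTY = 5            # same drop for a bad time and for no response
GOOD_STEP = 1          # assumption: size of the slow recovery per good response
MAX_SCORE = 20         # assumption: upper cap on the score


def update_score(score, result):
    """Return the new score after one check.

    result is one of 'good', 'bad_time' or 'no_response'.
    """
    if result == "good":
        return min(MAX_SCORE, score + GOOD_STEP)
    if result in ("bad_time", "no_response"):
        # Both failure kinds are penalized identically.
        return score - PENALTY
    raise ValueError(f"unknown result: {result!r}")


if __name__ == "__main__":
    score = MAX_SCORE
    for result in ["good", "no_response", "good", "no_response", "good"]:
        score = update_score(score, result)
        print(f"{result:12s} -> {score}")
```

With an occasional lost response the score in this sketch dips but 
stays well above the threshold, which matches what I normally see.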

We did have a network outage yesterday that quickly put my server at a 
score of -80 or so.  Now that I am back online, it will take X times as 
long to get back into the pool, even though my server is as accurate as 
it ever was.  This really hasn't hurt me much, but it will have hurt 
the pool (however little) by keeping my server out of the rotation for 
some time.

It just makes me wonder if a non-response should be penalized as much as 
a bad response.
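
As a back-of-the-envelope illustration of the difference, the sketch 
below compares recovery time after an outage for two non-response 
penalties.  Everything in it is assumed for illustration: a starting 
score of 20, 20 missed checks (picked so the outage lands at -80), a +1 
step per good response, and "back in the pool" meaning a score of 5 or 
more.  The lighter penalty of 1 is only an example value, not a 
proposal for a specific number.

```python
import math


def score_after_outage(start, missed_checks, no_response_penalty):
    """Score after a run of consecutive missed checks (hypothetical model)."""
    return start - missed_checks * no_response_penalty


def checks_to_rejoin(score, threshold=5, good_step=1):
    """Good responses needed before the score is at or above the threshold."""
    if score >= threshold:
        return 0
    return math.ceil((threshold - score) / good_step)


if __name__ == "__main__":
    for penalty in (5, 1):
        score = score_after_outage(start=20, missed_checks=20,
                                   no_response_penalty=penalty)
        print(f"penalty {penalty}: score {score}, "
              f"{checks_to_rejoin(score)} good checks to rejoin")
```

Under these assumptions the penalty of 5 leaves the server needing 85 
good checks to rejoin, against 5 with the lighter penalty.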

