[ntp:questions] Re: Drift handling....
Arnold
arnold at hacktic.nl
Wed Jan 4 22:17:01 UTC 2006
Martin,
Thank you for your clear explanation of drift!
The remaining point is the 'built up' confidence in the calculated drift
value.
You never know the absolute time, as there is always an error margin.
Let's say the error margin on the absolute time is +/-0.1s. Now the
accuracy of the drift calculation depends on the period of observation.
Assuming a drift of 5s a day, you get about 0.2s/hour. If you determine
drift over a one hour period, both the first and the last measurement
can be off by 0.1s, so you can end up with a calculated drift from 0.0s
to 0.4s/hour, or from roughly 0 to 10s/day. If you measure during a full
day, then you will determine a drift in the range 4.8 to 5.2s/day (it's
not really correct to take the worst case, but I do so for simplicity).
So, the longer your measurements last, the more accurate your
determination of the average drift will be. Many ntp-servers run for a
long time, so it is possible to have a very accurate average.
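To make the arithmetic above concrete, here is a rough Python sketch. The function name and its parameters are my own invention for illustration, and it uses the same worst-case simplification (both end points off by the full margin, in opposite directions).

```python
def drift_range(true_drift_s_per_day, window_hours, margin_s=0.1):
    """Worst-case range (low, high) of the calculated drift in s/day.

    Assumes each of the two time measurements (start and end of the
    window) can be wrong by up to margin_s, in opposite directions.
    """
    window_days = window_hours / 24.0
    accumulated = true_drift_s_per_day * window_days  # real offset gained
    worst = 2 * margin_s                              # both ends off
    low = (accumulated - worst) / window_days
    high = (accumulated + worst) / window_days
    return low, high


if __name__ == "__main__":
    for hours in (1, 24, 24 * 7):
        low, high = drift_range(5.0, hours)
        print(f"{hours:4d} h window: {low:.2f} .. {high:.2f} s/day")
```

Without rounding 5s/day to 0.2s/hour, the one hour window gives about 0.2 to 9.8s/day; the full day window gives 4.8 to 5.2s/day.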
However, the drift is not constant: it varies with temperature and
other factors. So, the average drift over six (cold) months is not very
useful on a hot day. Therefore, you want to know the actual (current)
drift at a given moment. To do this, you measure over a very short
period. This gives you an inaccurate, but current drift value. You can
filter this a bit, assuming that drift does not 'jump', but varies slowly.
With statistics you can improve the accuracy of (many) inaccurate drift
values. For example, when you determine the drift over 1 hour periods,
for several days, then you can average the maximum and minimum values
for each day. Together with the variance, this gives a good indication
of the range of the drift value. The more days you use in your
statistics, the better you know the range for your drift. Of course you
can do the same for the drift over a day or week (or max/min per day
over the last n days).
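A minimal sketch of the per-day max/min statistics, in Python. The function name and the input layout (one list of hourly drift values per day) are assumptions for illustration only. I report the standard deviation (the square root of the variance) so the spread has the same unit as the drift.

```python
from statistics import mean, pstdev


def drift_band(hourly_drift_by_day):
    """Summarise hourly drift values, grouped per day, into a 'normal range'.

    hourly_drift_by_day: list of days, each a non-empty list of drift
    values (any consistent unit, e.g. s/day).
    """
    daily_min = [min(day) for day in hourly_drift_by_day]
    daily_max = [max(day) for day in hourly_drift_by_day]
    return {
        "low": mean(daily_min),            # average of the daily minima
        "high": mean(daily_max),           # average of the daily maxima
        "low_spread": pstdev(daily_min),   # std. deviation of the minima
        "high_spread": pstdev(daily_max),  # std. deviation of the maxima
        "days": len(hourly_drift_by_day),
    }


if __name__ == "__main__":
    history = [[4.0, 5.0, 6.0], [4.5, 5.0, 6.5]]  # made-up values
    print(drift_band(history))
```

The more days go into the input, the more the low/high values and their spread can be trusted.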
With the statistical info, you can have confidence in the actual drift
value that you calculate at a given moment. If the measured 'actual'
drift is outside the range, and you do have *a lot* of historical info
used in your statistics, then you know something somewhere is very
wrong! Just adjusting the drift and assuming you can trust your actual
drift calculation is not a very smart thing to do. However, this seems
to be the way ntpd works: it ignores single outliers, it filters a bit,
and a bit more after running for some time, but otherwise it forgets
about its history and uses the most recent data as the truth.
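The kind of sanity check I have in mind could look roughly like this in Python. This is not how ntpd does it; the function, the minimum of 30 days of history and the factor 3 on the spread are all made-up assumptions to show the idea.

```python
def drift_is_suspect(measured, low, high, spread,
                     days_of_history, min_days=30, k=3.0):
    """Return True if a freshly measured drift falls outside the range
    known from history.

    low/high/spread come from whatever historical statistics are kept.
    min_days and k are arbitrary illustration values: with too little
    history we cannot judge, so the measurement is not flagged.
    """
    if days_of_history < min_days:
        return False
    return measured < low - k * spread or measured > high + k * spread


if __name__ == "__main__":
    # Historical range 4.25 .. 6.25 s/day, spread 0.25, 90 days of data.
    print(drift_is_suspect(12.0, 4.25, 6.25, 0.25, 90))
    print(drift_is_suspect(5.0, 4.25, 6.25, 0.25, 90))
```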
So, with statistics of historic info, an ntp-server can determine that
something is wrong. What to do next?
First of all, signal to clients that something strange happened! The
protocol is ready for it: set both leap bits (leap indicator 3, the
alarm condition) to indicate the time from this server is not reliable.
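For reference, the leap indicator is the top two bits of the first octet of an NTP packet, followed by three bits of version and three bits of mode. A small Python sketch of setting it (the helper name is hypothetical, this is not ntpd code):

```python
LI_NO_WARNING = 0b00
LI_ALARM = 0b11  # both leap bits set: clock not synchronized


def set_leap_indicator(first_octet, li):
    """Return the first NTP header octet with its leap indicator replaced.

    Octet layout: LI (2 bits) | version (3 bits) | mode (3 bits).
    """
    return ((li & 0b11) << 6) | (first_octet & 0x3F)


if __name__ == "__main__":
    normal = 0x24  # LI=0, version 4, mode 4 (server)
    print(hex(set_leap_indicator(normal, LI_ALARM)))
```

A client that sees both bits set should not use the time from that server.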
Next, the ntpd-server should have some strategy to deal with the
situation. A few suggestions:
a) recalculate the 'actual' drift against other servers, and reconsider
the server selection in case other servers result in a 'sane' drift
value, or use a 'weighted average offset', weighting the 'good' servers
more than the others;
b) recalculate the drift over a longer period (until it is within, or
limited by the 'normal range') and use that one (1s additional offset in
a 20s period results in a much larger calculated drift than 1s
additional offset over the past few hours or day);
c) don't adjust your drift at all, but assume that a 'glitch' caused the
additional offset; correct for the additional offset, and continue with
the same drift value (or adjust it a very little bit);
d) ...
...
z) Call the administrator for help!!! :-)
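To illustrate suggestion b), a rough Python sketch: widen the observation window until the calculated drift falls inside the normal range, and clamp to that range if it never does. Everything here (names, the doubling of the window, the 20s starting window) is an assumption for illustration. Samples are (time, offset) pairs in seconds, sorted by time; drift is in seconds per second.

```python
def drift_over(samples, window_s):
    """Drift between the newest sample and the oldest one within window_s."""
    t_new, off_new = samples[-1]
    inside = [(t, o) for t, o in samples if t_new - t <= window_s]
    t_old, off_old = inside[0]
    if t_new == t_old:
        return None  # only one sample in the window
    return (off_new - off_old) / (t_new - t_old)


def sane_drift(samples, low, high, start_window_s=20.0):
    """Widen the window until the drift is within [low, high]; else clamp."""
    span = samples[-1][0] - samples[0][0]
    window = start_window_s
    while True:
        d = drift_over(samples, window)
        if d is not None and low <= d <= high:
            return d
        if window >= span:
            break  # all history used, still not within range
        window *= 2
    if d is None:
        return None
    return min(max(d, low), high)  # limited by the normal range


if __name__ == "__main__":
    day = 86400.0
    # About 5 s/day of drift, plus a 1s glitch in the last 20 seconds.
    samples = [(0.0, 0.0), (day - 20, 4.9988), (day, 6.0)]
    d = sane_drift(samples, low=4.0 / day, high=7.0 / day)
    print(f"{d * day:.2f} s/day")
```

Over the last 20s the glitch looks like a drift of more than an hour per day; over the whole day it only adds 1s/day.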
General note: after switching servers, one can expect a change in
offset (that's the main reason to select a different server, isn't it?).
Therefore, that (relatively large) change in offset over a short period
should not be used to calculate drift compensation. Or, if it is, then
the period over which the new drift value is calculated should be long
enough that the jump in offset only results in a small change in drift.
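One simple way to keep the jump out of the drift calculation, sketched in Python: only use intervals where both measurements come from the same server. The sample layout (time, offset, server) and the function name are hypothetical.

```python
def drift_ignoring_switches(samples):
    """Drift in s/s, skipping every interval that spans a server switch.

    samples: list of (time_s, offset_s, server_name), sorted by time.
    """
    total_dt = 0.0
    total_doff = 0.0
    for (t0, o0, s0), (t1, o1, s1) in zip(samples, samples[1:]):
        if s0 != s1:
            continue  # offset jump caused by the switch, not by drift
        total_dt += t1 - t0
        total_doff += o1 - o0
    return total_doff / total_dt if total_dt > 0 else None


if __name__ == "__main__":
    samples = [
        (0, 0.00, "A"), (100, 0.01, "A"),
        (200, 0.52, "B"), (300, 0.53, "B"),  # 0.5s jump at the switch
    ]
    print(drift_ignoring_switches(samples))
```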
Arnold &:-)