[ntp:questions] Server offset included in served time?
david at ex.djwhome.demon.co.uk.invalid
Mon Sep 15 20:41:46 UTC 2008
Martin Burnicki wrote:
> But what about the behaviour shortly after startup? The NTP daemon tries to
> determine the initial time offset from its upstream sources. Unless that
> initial offset exceeds the 128 ms limit it starts to slew its system time
> *very* slowly until the frequency drift has been compensated and the
> estimated time offset has been minimized.
I've had some thoughts about this. As I see it the problems are:
- ntpd doesn't have any persistent history of jitter, so it has to start
by assuming that the jitter is of the same order of magnitude as the
offset (what people looking at the offset often forget is that they
have the benefit of hindsight).
- ntpd is already at the shortest permitted time constant, and going
lower would require faster polling, or compromising the level of
oversampling, or the length of the initial best-measurement filter. It
is this lower bound on the time constant that means that ntpd can get
into a position where it should know that the time is wrong, but cannot
fix it.
- the step limit is fixed at configuration time.
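The fixed step limit interacts with the slewing behaviour Martin
describes. As a rough sketch (not ntpd's actual code; the 128 ms figure
is the documented default, the function and names are invented for
illustration):

```python
# Illustrative sketch of ntpd's step-vs-slew decision, NOT the real
# implementation. 0.128 s is ntpd's default step threshold; the
# function and variable names are invented for this example.

STEP_THRESHOLD = 0.128  # seconds; default step limit, fixed at config time

def discipline(offset: float) -> str:
    """Decide how to correct a measured clock offset (in seconds)."""
    if abs(offset) > STEP_THRESHOLD:
        return "step"   # jump the clock immediately
    return "slew"       # adjust gradually via the discipline loop

# A 120 ms offset stays below the threshold, so it is slewed very
# slowly, and in the meantime the served time is still ~120 ms off.
```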
One could deal with the first by making the smoothed jitter persistent
across restarts. That way ntpd can detect whether its offsets exceed
the jitter that is reasonable for the system, before it has accumulated
enough measurements in the current session to estimate the jitter afresh.
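Persisting the jitter estimate could work much like the existing drift
file. A minimal sketch, in which the file format, fallback default and
smoothing constant are all assumptions of mine, not anything ntpd does:

```python
# Hypothetical sketch of persisting an exponentially smoothed jitter
# estimate across restarts, analogous to ntpd's drift file. The file
# format, default value and ALPHA weight are invented for illustration.

ALPHA = 0.125  # smoothing weight, chosen arbitrarily

def load_jitter(path: str, default: float = 0.001) -> float:
    """Read the last saved jitter estimate, or fall back to a default."""
    try:
        with open(path) as f:
            return float(f.read().strip())
    except (OSError, ValueError):
        return default

def update_jitter(current: float, sample: float) -> float:
    """Exponentially smooth a new per-sample jitter measurement."""
    return (1 - ALPHA) * current + ALPHA * sample

def save_jitter(path: str, value: float) -> None:
    """Write the estimate back so the next startup can reuse it."""
    with open(path, "w") as f:
        f.write(f"{value:.9f}\n")
```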
Once one knows that offsets are high compared with the jitter, one can
address the time constant issue. Normally jitter << offset would tend
to force the time constant down, but it has nowhere to go. Maybe what
is needed is to allow the degree of oversampling to be compromised
until one first begins to get offsets of the same order as the jitter.
Maybe also use fewer than 8 filter slots.
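That decision could be driven directly by the offset/jitter ratio. A
sketch of the idea, in which the reduced slot count and the ratio
threshold are pure assumptions (only the normal depth of 8 comes from
ntpd's clock filter):

```python
# Sketch of the proposal above: while measured offsets are large
# compared with the (persisted) jitter, shorten the effective clock
# filter and accept less oversampling; once offsets fall to the order
# of the jitter, return to the normal 8-slot filter. The reduced depth
# and the 10x ratio threshold are invented for illustration.

NORMAL_SLOTS = 8   # ntpd's standard clock-filter depth
FAST_SLOTS = 3     # reduced depth during the catch-up phase (assumed)

def filter_depth(offset: float, jitter: float) -> int:
    """Pick a clock-filter depth from the offset/jitter ratio."""
    if jitter > 0 and abs(offset) > 10 * jitter:
        return FAST_SLOTS   # offset dominates jitter: react faster
    return NORMAL_SLOTS     # offset comparable to jitter: full filter
```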
This may compromise the stability of downstream systems, so it may be
necessary to stay in an alarm state until this stage of the process is
complete. This may be a problem for people who want a whole network to
power up at the same time, and quickly.
If there were also a persistent record of a high percentile figure for
the offset, one could also use that to set the step threshold during the
startup phase, maybe reverting to the standard value later, to give
better tolerance of major network problems.
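Setting the startup step threshold from a persisted percentile might
look something like this; the margin factor and the choice of the 99th
percentile are my assumptions, and only the 0.128 s default comes from
ntpd:

```python
# Sketch of using a persisted high-percentile offset figure to choose
# a tighter step threshold during the startup phase, reverting to the
# standard value afterwards. The margin and percentile are assumptions.

DEFAULT_STEP = 0.128  # seconds, ntpd's standard step threshold

def startup_step_threshold(p99_offset: float, margin: float = 2.0) -> float:
    """Step threshold for startup, derived from historical offsets."""
    # Step anything well outside what this host has historically seen,
    # but never loosen beyond the standard threshold.
    return min(margin * p99_offset, DEFAULT_STEP)
```

A host whose offsets historically stayed under 10 ms would then step
anything beyond 20 ms at startup, instead of slewing it for hours.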
> While the system time is being slewed it may be e.g. 120 ms off, and when
> the daemon sends the system time to its clients then it will serve a time
> which is 120 ms off.
To some extent, the fact that systems are already experiencing this
suggests, to me, that one might not need to alarm the time during a
temporary short loop-time-constant phase at startup.