[ntp:hackers] Does ntpd need to whine more ?
David L. Mills
mills at udel.edu
Tue Oct 4 16:00:30 UTC 2005
The intended order of status reporting goes something like this. If the
leap indicator is set to anything other than unsynchronized, the other header fields
have meaning depending on the state at the latest measurement, which
gives the maximum and estimated error at that time. Only the dispersion
component increases after that to show the maximum error as it grows at
the dispersion rate. It doesn't matter whether the upstream server is
reachable or not, the results are the same. The distance threshold
represents the limit in acceptable maximum error. A user can set that
lower or higher; it currently defaults to one second.
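
To make the arithmetic concrete, here is a rough sketch of how that maximum error grows between measurements and when it crosses the threshold. It is a simplification, not the ntpd code: the function names are made up for illustration, the 15 ppm dispersion rate is the usual NTP frequency tolerance and is assumed here, and the peer delay, peer dispersion and jitter terms that the daemon also folds in are left out.

```python
# Simplified root-distance sketch; names are hypothetical, not ntpd identifiers.
PHI = 15e-6          # assumed dispersion rate in s/s (15 ppm frequency tolerance)
MAX_DISTANCE = 1.0   # distance threshold in seconds (the default mentioned above)


def root_distance(root_delay, root_dispersion, seconds_since_update, phi=PHI):
    """Maximum error bound: half the root delay, plus the root dispersion,
    plus the dispersion accumulated since the latest measurement."""
    return root_delay / 2.0 + root_dispersion + phi * seconds_since_update


def acceptable(root_delay, root_dispersion, seconds_since_update,
               max_distance=MAX_DISTANCE):
    """True while the maximum error is still below the distance threshold."""
    distance = root_distance(root_delay, root_dispersion, seconds_since_update)
    return distance < max_distance


def seconds_until_rejected(root_delay, root_dispersion,
                           max_distance=MAX_DISTANCE, phi=PHI):
    """How long a source can go without a fresh measurement before its
    maximum error reaches the threshold (0 if it is already over)."""
    headroom = max_distance - (root_delay / 2.0 + root_dispersion)
    return max(headroom, 0.0) / phi


if __name__ == "__main__":
    # Example values: 20 ms root delay, 10 ms root dispersion.
    for elapsed in (0, 3600, 86400):
        d = root_distance(0.020, 0.010, elapsed)
        print(f"{elapsed:6d} s since update: distance {d:.3f} s, "
              f"acceptable={acceptable(0.020, 0.010, elapsed)}")
    hours = seconds_until_rejected(0.020, 0.010) / 3600
    print(f"threshold reached after about {hours:.1f} hours")
```

With these example numbers the source stays acceptable for roughly 18 hours after the last good measurement, whether or not the upstream server is reachable in the meantime, which is the point of the paragraph above.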
I don't think it wise to tinker with the stratum, leap or other header
fields to provide additional and probably suspect information. There is
of course ntpq to watch more closely.
Martin Burnicki wrote:
>Sorry for the late response; we had a holiday here in Germany on Monday.
>David L. Mills wrote:
>>I say again, an intermediate server cannot directly determine for a
>>downstream client whether an upstream server has or has not "lost"
>>synchronization; it can only reveal in the root distance how much the
>>maximum error has accumulated. Only the dependent client can judge from
>>this statistic whether or not to believe the time.
>>The present behavior is not a bug; it is at the definitive core of the
>>design. It is necessary in order to support very long poll intervals as
>>with the modem driver.
>I'm afraid there's again some misunderstanding here. You always mention
>that this is necessary in order to support very long poll intervals.
>What I mean, however, has nothing to do with poll intervals. If ntpd
>polls an upstream server or refclock and does not get a good response
>back, then it should indicate this to its clients, and also to the users
>who check the ntpd status to see if their time sync network really is in
>No matter whether ntpd polls its upstream source after a long or short
>interval: if it does not receive any response at all (e.g. because the
>server is down), or the response packet has the leap bits set (e.g.
>because the refclock is not synchronized) then ntpd should not pretend
>that everything is OK.
>Should the normal user compare the root dispersion to some limit to find
>out that his timesync is broken and the upstream server needs interaction?
>BTW, the root dispersion is initialized to 0 when ntpd starts up, then
>switches to higher values during initial synchronization with an
>upstream source, and then converges to a smaller value as the PLL
>converges. If root dispersion is a quality/selection criterion, shouldn't
>it be initialized to a large value in order to reflect that time is not
>very accurate while the leap bits are initially set?
>Anyway, here's a quote from your own email to hackers on the 1st of October,
>concerning ghetto mode (of which I've never heard before that):
>>If all base servers or their sources fail, all machines enter ghetto
>>mode. The stratum is forced to the ghetto stratum ...
>This is exactly what I mean: at least drop the stratum to indicate that
>something is wrong. This should be the default behaviour for any ntpd node.