[ntp:questions] How should an NTP server fail?
David L. Mills
mills at udel.edu
Wed Jun 9 23:41:38 UTC 2010
The number of samples in the clock filter has nothing to do with the
selection process, nor with whether the peer is the system peer. The
selection algorithm doesn't even know how many samples are in the
filter, only that the filter candidate that is used has the least delay. The
selection metric includes that and the dispersion at the measurement
time, plus the dispersion increment since then. When two or more servers
are configured at substantially the same delay, the client may
occasionally hop from one to the other depending on these factors,
although there is an anti-hop scheme that discourages this unless there
is a substantial difference.
What bugs me when these issues appear on the bugs list is that they are
often not bugs but design issues, which should not be so narrowly confined.
I often get two or more messages about the same issue from different
folks and I wind up replying to each one separately. The strongest
advice I can give is to view the architecture briefing on the NTP
project page before submitting reports like that.
>On Jun 5, 4:11 pm, David Mills <mi... at udel.edu> wrote:
>>This issue is widely misunderstood; yours is the second such message to
>>me today. So, please spread the word.
>>When a server loses all sources it does not necessarily become
>>unsuitable for downstream clients. Ordinarily, it inherits error
>>statistics from upstream servers and provides them to downstream
>>clients. Servers and clients use these statistics to calculate the
>>maximum error statistic which represents the maximum clock error
>>relative to the primary reference clock. See the error budget called out
>>in the specification. Once determined, the maximum error increases at a
>>rate (15 PPM) determined as the maximum disciplined clock frequency
>>error of the server clock. This increase continues indefinitely or until
>>the sources are again found.
>Since you have requested that this be discussed on the newsgroup rather
>than in bug 1554, I am replying here.
>In bug 1554, the reporter claims that what you describe above is what
>he sees happen if the clock filter contains 4 to 7 samples. However,
>he says that if the clock filter is full with 8 samples, then the
>system peer is unselected and a no_sys_peer event is posted. This is
>in contradiction of what you keep describing as the correct behavior,
>but then you keep saying that the reported behavior is correct. Since
>the behavior he is reporting is not the same as the correct behavior
>you keep describing, we have continued to treat this as a bug.
>So, you need to either confirm that this change in behavior at 8
>samples is correct and amend your description, or confirm that your
>description is correct and admit that the reported behavior is a bug.
>Or deny the reported behavior happens (I tend to favor this at this
>point. I suspect user error right now.) In either of these last two
>cases, we should probably still discuss this in the bug report.
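
The growth of the maximum error after a server loses all its sources, as described in the quoted June 5 message, can be sketched as simple arithmetic. This is an illustration only: the function name max_error and its parameters are hypothetical, and the linear form is the plain reading of "increases at a rate (15 PPM)", not code taken from ntpd.

```python
PHI = 15e-6  # 15 PPM, expressed as seconds of error per second elapsed


def max_error(error_at_loss, seconds_since_loss):
    """Maximum error (seconds) some time after the sources were lost.

    error_at_loss is the maximum error inherited from upstream at the
    moment the last source disappeared; it then grows linearly at PHI.
    """
    return error_at_loss + PHI * seconds_since_loss


# A server that had 10 ms of maximum error when it lost its sources,
# checked one hour and one day later.
for elapsed in (3600, 86400):
    print(elapsed, max_error(0.010, elapsed))
```

At 15 PPM the increment is about 54 ms per hour and about 1.3 s per day, so a downstream client sees the server's error statistics degrade gradually rather than the server becoming unsuitable at once.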