[ntp:questions] PSYCHO PC clock is advancing at 2 HR per second

unruh unruh at invalid.ca
Thu Mar 22 16:04:02 UTC 2012

On 2012-03-22, Dennis Ferguson <dennis.c.ferguson at gmail.com> wrote:
> On 21 Mar, 2012, at 11:36 , unruh wrote:
>> On 2012-03-21, Ron Frazier (NTP) <timekeepingntplist at c3energy.com> wrote:
>>> I noticed that Dave Hart later posted this reply to your question.  I'll 
>>> reference that below.
>>> NTP's jitter is root mean squares of offsets from the clock filter
>>> register (last 8 responses, more or less).
>> Strange, because ntp then takes that entry of those 8 with the shortest
>> roundtrip time and uses only it to drive the ntp algorithm. Thus on the
>> one hand it is using it as a measure of jitter and on the other hand
>> saying it does not trust most of those values, with a distrust so deep
>> it throws them away. Why would you be reporting anything for a set of
>> data you distrust so deeply.
> I see you keep pointing this out in various ways, but I really don't
> understand the point.  If you are measuring data with non-gaussian,
> non-zero-mean noise superimposed you need to find a statistic which is
> appropriate for the noise to produce the best noise-free estimate of the
> quantity you are interested in measuring.  If someone takes `n' samples
> with a (slightly different) non-gaussian noise distribution and finds the
> median of the `n' (which is an individual sample) to use for further
> processing would you really call that "throwing away 'n-1' of the samples"
> rather than just computing the measure of central tendency which is most

No, I would not. But that is not what ntpd does. It really does throw
away 7 of the samples and never uses them. The whole question is what
is the best statistic to use. I do not believe that the "shortest
roundtrip time" sample is that best statistic. If you could convince me
it is, I would be more than happy to have ntpd use it.
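To make the disagreement concrete, here is a minimal sketch (Python; the
function names and sample values are mine, not ntpd's source) of the
behaviour being discussed: the clock filter keeps only the offset from
the minimum-delay sample, while jitter is an RMS over the whole
register (computed here relative to the selected offset, which is my
reading of Dave Hart's description):

```python
import math

# Sketch of the clock-filter selection described above (my naming, not
# ntpd source): from the last 8 (offset, roundtrip_delay) samples, only
# the offset belonging to the smallest roundtrip delay is used.
def clock_filter(samples):
    """samples: list of (offset, roundtrip_delay) pairs."""
    best_offset, _ = min(samples, key=lambda s: s[1])
    return best_offset

# Jitter, per Dave Hart's description, is an RMS over the same register;
# taking deviations from the selected offset is an assumption on my part.
def jitter(samples):
    best = clock_filter(samples)
    return math.sqrt(sum((o - best) ** 2 for o, _ in samples) / len(samples))
```

So the seven discarded offsets never influence the steering of the
clock, yet they do appear in the reported jitter -- which is exactly the
asymmetry being complained about here.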

IF the roundtrip times were to vary by factors of 2 from one instance
to the next, I might be persuaded that it was the best statistic. But
in almost all cases where ntpd is used it does not; it varies by a few
percent (with maybe an occasional blip with larger delays). I have huge
reams of data to support my statement.

Note that in the refclock drivers, some fraction of the events are
thrown away to make up for "popcorn" events (events believed to be
taken from a distribution with a far wider spread than normal -- i.e.,
the model is that the distribution is of the form P1 rho1(x) + P2 rho2(x),
where P1 >> P2 and rho2 is a distribution with a much larger deviation
than rho1), and the "throwing away" is an attempt to get rid of events
from that second component. But ntpd's handling of the round-trip
statistic strikes me as an extremely ad hoc, unthought-through attempt
to answer the valid question of how to handle "popcorn" events in the
one-way trip time of the data. There are instances where it might even
be the right thing to do, but those are, I believe, rare, not the run
of the mill.
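The two-component noise model described above can be sketched like this
(Python; the parameter values and the MAD-based cutoff are my
illustration, not the refclock drivers' actual code):

```python
import random

# The mixture model P1*rho1(x) + P2*rho2(x): with small probability p2
# a sample is drawn from a much wider distribution (a "popcorn" spike),
# otherwise from the narrow one. All parameters are illustrative.
def sample_mixture(n, p2=0.02, sigma1=1e-6, sigma2=1e-3, seed=1):
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma2 if rng.random() < p2 else sigma1)
            for _ in range(n)]

# A simple popcorn filter (my sketch, not ntpd's exact code): discard
# samples whose deviation from the median exceeds k times the median
# absolute deviation, i.e. samples that look like rho2 draws.
def popcorn_filter(xs, k=5.0):
    s = sorted(xs)
    med = s[len(s) // 2]
    mad = sorted(abs(x - med) for x in xs)[len(xs) // 2]
    return [x for x in xs if abs(x - med) <= k * mad]
```

The point is that a deviation-based trim is a targeted answer to a
stated noise model; discarding everything but the minimum-delay sample
is not.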

The current handling throws away valuable data -- data which could
instead be used to increase the sensitivity of ntpd to changes in the
clock rate, e.g. due to temperature changes, a sensitivity which is
currently poor in ntpd. It could be used to decrease the errors in
ntpd's clock handling. Instead it is thrown away in an attempt to solve
a problem that is in many cases non-existent.
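As one illustration of what the discarded samples could buy (my sketch
of the idea, not anything present in ntpd): fitting offset against time
across all the samples gives a direct least-squares estimate of the
frequency error, which is exactly the quantity that drifts with
temperature:

```python
# Least-squares slope of offset vs. time over all retained samples
# (hypothetical helper, not ntpd code). The slope is the frequency
# error in seconds of offset per second of elapsed time.
def estimate_frequency(times, offsets):
    n = len(times)
    mt = sum(times) / n
    mo = sum(offsets) / n
    num = sum((t - mt) * (o - mo) for t, o in zip(times, offsets))
    den = sum((t - mt) ** 2 for t in times)
    return num / den
```

With 8 samples instead of 1, the variance of such an estimate drops
accordingly, which is the "error reduction due to averaging" referred
to below.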

> appropriate for the noise distribution?  And if he went back over the data
> to compute a measure of variability (perhaps computing the median square
> deviation would be more appropriate) would that really be reporting something
> "for a set of data you distrust so deeply"?  Unless I'm missing something
> this seems like a rather bizarre point of view.
> If you are looking for something to complain about in this particular bit
> of machinery in ntpd I think there are much more interesting aspects you
> might consider.  The best might be that this code causes ntpd to sometimes
> process offsets which are quite old, in that they were measured at a time
> well into the past.  In theory PLLs and other feedback control mechanisms
> are unconditionally destabilized if there is any delay in the feedback path.
> This is why ntpd makes no use of knowledge of when an offset was measured;
> it is feeding those offsets to a PLL and a PLL has no way to deal with data
> measured at any time other than "right now".  In practice (as opposed to
> theory) approximating stability while using data which is significantly
> delayed requires making the time constant of the PLL large enough that
> the delay in the feedback path can be assumed to be approximately zero
> in comparison.  The time constant of ntpd's PLL, and hence the stately
> pace at which it responds to errors, is hence directly related to the
> worst case delays between the measurement and the processing offset data
> which is caused by this filter.

Yes, that is another problem. It directly leads to ntpd's glacial
response to true changes -- like changes in the rate due to temperature
changes, or changes in the rate at startup, etc.

Are those costs worth the benefits? Sometimes they might be. Sometimes
the one-way paths to the server are so variable that the noise
introduced that way would swamp any advantage of faster response, or of
error reduction due to averaging. But that, I would contend, is rare.
In most circumstances, the cost is far greater than the benefit.

Data is a precious commodity, one should be very careful before throwing
it away.

> There are so many things to complain about that actually make sense
> (in reality everything can be justified in terms of tradeoffs, but people
> can differ about which tradeoffs produce the most attractive result) that
> I don't see why you keep harping on something which seems more like
> nonsense.

Well, you might regard a slowdown of the response of ntpd by a factor
of about 8 as nonsense. Or an increase in the clock errors by a factor
of about 3 as nonsense. I doubt many others would.

> Dennis Ferguson
