[ntp:hackers] What to do when the offset is WAYTOOBIG
jlevine at boulder.nist.gov
Thu Apr 19 12:33:06 PDT 2007
>I am watching five clocks. Three of them say 1200, two say 1300 and
>my clock says 1400.
>Since the majority of clocks I watch say 1200, I conclude the real
>time is 1220, but that is beyond my panic limit of one hour.

I would have looked at this differently. I would have evaluated the
error of my clock from the times of the remote clocks I was monitoring,
using the sigma of my clock as a metric -- the average
prediction/correction error over some previous time interval. Since the
sigma of a typical system is on the order of milliseconds, I would have
concluded that something is really broken here -- the prediction errors
are hours, not milliseconds. I would not have concluded that the time
was 1220, because I trust my local clock to be within some small
multiple of its historical prediction error. That might not be correct,
but it is my first-order working hypothesis. Based on the evidence at
hand, I have no way of deciding who is right, except that something is
clearly broken. So I set my clock to unhealthy and do not adjust it. If
the problem really is in the remote clocks, then this strategy is
optimum. If the problem is in my clock, then I have limited the damage
by telling my customers not to use it. (The act of setting the clock
unhealthy triggers a pager alarm in the NIST servers, but that is
outside the scope of NTP.)
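
The sigma test described above can be sketched roughly as follows. This
is only an illustration, not the code that runs on the NIST servers: the
function names, the use of a mean absolute error for sigma, and the
multiplier k = 10 (standing in for "some small multiple") are all
assumptions made for the example.

```python
from statistics import mean


def historical_sigma(past_errors_s):
    """Average magnitude of past prediction/correction errors, in seconds."""
    return mean(abs(e) for e in past_errors_s)


def evaluate_offset(offset_s, sigma_s, k=10.0):
    """Compare a measured offset against k times the historical sigma.

    k is a hypothetical tuning value. Returns "accept" when the offset is
    plausible for this clock, otherwise "unhealthy" (do not adjust the
    clock, and tell clients not to use it).
    """
    if abs(offset_s) <= k * sigma_s:
        return "accept"
    return "unhealthy"


# A clock whose past errors were a few milliseconds.
sigma = historical_sigma([0.002, -0.001, 0.003])

# Remote clocks say 1200 while the local clock says 1400: two hours off.
print(evaluate_offset(-7200.0, sigma))

# A few milliseconds off is within the historical behavior.
print(evaluate_offset(0.004, sigma))
```

With these numbers the two-hour offset is rejected and the clock is
marked unhealthy, while the 4 ms offset is accepted.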

Since my strategy uses the historical prediction error of the local
clock as a way of evaluating the responses of the remote systems, I only
need to query a single external server. I accept its response if its
time difference is within some reasonable value of what my historical
sigma has been. My system would query a second server if this test
fails, but that might not help here, since none of the queries would
pass this test. The fact that a number of external servers agreed would
not by itself override my sigma test. As I mentioned above, this
situation would trigger an alarm.
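
A minimal sketch of that query order, under stated assumptions: each
entry in `queries` is a hypothetical zero-argument callable that returns
the measured offset to one external server in seconds, and `k` and
`max_queries` are illustrative values, not parameters of any real NTP
implementation.

```python
def check_against_servers(queries, sigma_s, k=10.0, max_queries=2):
    """Query servers one at a time and stop at the first plausible answer.

    Returns (status, offset_s, servers_asked). Agreement among the
    servers is deliberately not considered: each response is judged only
    against the local clock's historical sigma.
    """
    asked = 0
    for query in queries[:max_queries]:
        offset_s = query()
        asked += 1
        if abs(offset_s) <= k * sigma_s:
            return ("healthy", offset_s, asked)
    # No response passed the sigma test: mark unhealthy and raise an alarm.
    return ("unhealthy", None, asked)


# Normal case: the first server passes, so the second is never asked.
print(check_against_servers([lambda: 0.003, lambda: 0.001], sigma_s=0.002))

# Both servers agree on a two-hour offset; the test still fails.
print(check_against_servers([lambda: -7200.0, lambda: -7200.0], sigma_s=0.002))
```

In the second call the two servers agree with each other, yet the result
is still "unhealthy", because agreement alone does not override the
sigma test.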

The weakness with my algorithm comes when the servers disagree by
something on the order of my prediction sigma. That is sticky because I
can't say for sure whether it is a glitch or a conforming event.
Depending on the details, I can follow the wrong pied piper here.
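
To make the awkward middle region concrete, here is a rough sketch that
sorts the disagreement between two servers into three bands relative to
sigma. The band edges `low` and `high` are hypothetical; the message
does not say where the boundaries lie, only that disagreements
comparable to sigma cannot be classified with confidence.

```python
def classify_disagreement(offset_a_s, offset_b_s, sigma_s, low=1.0, high=10.0):
    """Classify how far two server offsets differ, in units of sigma.

    low and high are illustrative thresholds, not values from the post.
    """
    ratio = abs(offset_a_s - offset_b_s) / sigma_s
    if ratio < low:
        return "consistent"   # well inside the clock's normal noise
    if ratio > high:
        return "broken"       # far outside anything the clock normally does
    return "ambiguous"        # on the order of sigma: cannot tell which is right


print(classify_disagreement(0.0001, 0.0003, sigma_s=0.002))
print(classify_disagreement(0.001, 0.005, sigma_s=0.002))
print(classify_disagreement(0.0, 3600.0, sigma_s=0.002))
```

The middle case, a disagreement of about two sigma, is the one where
either server could be the one to follow.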
Time and Frequency Division