[ntp:questions] Bizarre half second disagreement between ntp hosts

Unruh unruh-spam at physics.ubc.ca
Wed Jan 2 17:39:08 UTC 2008

Brian Utterback <brian.utterback at sun.com> writes:

>Unruh wrote:
>> I have a very weird situation. I am running a GPS PPS (Garmin GPS18LVM)
>> with a few machines as a backup/initialization. 
>> Suddenly, for about half an hour, my GPS failed for some reason (I still do
>> not know what was wrong, since it had come back on air by the time I noticed
>> something wrong). Every hour I run ntpq -p just to check that my GPS is
>> on air. I got this report.
>>      remote           refid      st t when poll reach   delay   offset  jitter
>> ==============================================================================
>> xtick.usask.ca   .GPS.            1 u 1003 1024  377   44.954    0.213 480.149
>> +sanrail.com                      2 u  993 1024  377    1.486  -479.03 479.917
>> +raptor.tera-byt                  2 u  322 1024  377   17.295  -480.35   0.766
>> *zeus.yocum.org                   2 u  390 1024  377   70.415  -481.02   1.230
>>  SHM(0)          .PPS.            0 l 1415   16    0    0.000   -0.002   0.001
>> Now I believe the tick.usask.ca result, since all of the machines which use
>> mine as a source suddenly noticed a .48 second jump when my GPS failed. But
>> why in the world would three systems all suddenly be out by .48 sec? 
>> Doing a peers on them, one has a GPS as its source, one a .WWVB. and one an
>> .ACTS. Why should all three suddenly be out by half a second?

>Why do you believe tick.usask.ca? Look at the jitter, it recently had
>a jump of 480ms. It seems much more likely that the one server is off
>than three independent servers.

>The sudden jump is likely due to the way PPS is handled. Remember that
>the offset of the clock is determined by the time between the PPS
>firing and the nearest second tick of the system clock. The definition
>of "nearest" is the one for which the offset is less than .5 seconds,
>either in the future or the past. So, at some particular threshold of
>phase offset, the clock offset determined by the PPS will switch sign,
>and jump from .5 to -.5 or vice versa.

I believe it because when my system lost its own GPS lock, it suddenly
found itself .480 sec out when compared with those other three clocks, as
did each of the machines that use it as a source. I.e., usask agreed with my
local freewheeling time, which had been locked to GPS, rather than with those
three other sources. And .480 is not .5, so it was not a simple "which
second is it" (which should have produced a 1.0 second offset anyway).
Now maybe GPS in Canada (I and usask are both in Canada) suffered a glitch.
(Those are very smart satellites -- they can give a different time to the
US and to Canada :-)
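
To make the "nearest second" rule from the quoted text concrete, here is a
minimal sketch in Python. It is only a toy model of the rule as described
above, not ntpd's actual code; the function name pps_offset and the sample
errors are made up for illustration. It assumes the PPS pulse carries no
second number, so the measured offset is the true clock error folded into
the range [-0.5, 0.5).

```python
import math

def pps_offset(clock_error):
    """Offset (seconds) a PPS-only source would report for a clock whose
    true error is clock_error seconds, measured to the nearest second tick.
    Hypothetical helper, for illustration only."""
    # Fold the error into [-0.5, 0.5): the pulse marks a second boundary
    # but does not say which second it is.
    return clock_error - math.floor(clock_error + 0.5)

# Illustrative errors on either side of the half-second threshold.
for err in (0.10, 0.48, 0.52, 1.00):
    print(f"true error {err:+.2f} s -> PPS-only offset {pps_offset(err):+.2f} s")
```

In this model an error just under half a second is reported as is, an error
just over it flips sign, and a whole-second error is invisible to the PPS
alone, which is why something else has to number the seconds.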

I guess since no one else noticed anything like this, I will have to put it
down to cosmic weirdness.

What I noticed is that at 2007-12-30 15:06:31 UTC all of the machines which
used string as the ntp source found themselves .48 sec out. Unfortunately I
do not have proper logs in string running ntp-- only once per hour-- and
that log is the once per hour log from the ntp host.
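
For anyone wanting a finer-grained record than an hourly look at ntpq -p,
the billboard can be parsed and logged by a script. The following is a
rough sketch only: parse_peers and refclock_silent are hypothetical helper
names, and the sample rows are copied from the report quoted above (one of
them has no refid there, so the parser reads the numeric columns from the
right). It assumes the tally code is always the first character of a row.

```python
SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
xtick.usask.ca   .GPS.            1 u 1003 1024  377   44.954    0.213 480.149
*zeus.yocum.org                   2 u  390 1024  377   70.415  -481.02   1.230
 SHM(0)          .PPS.            0 l 1415   16    0    0.000   -0.002   0.001
"""

def parse_peers(billboard):
    """Turn `ntpq -p` billboard text into a list of dicts (hypothetical helper)."""
    peers = []
    for line in billboard.splitlines():
        fields = line[1:].split()          # column 1 is the tally code
        if len(fields) < 9:
            continue                       # the ===== rule or a blank line
        try:
            reach = int(fields[-4], 8)     # reach is printed in octal
            delay, offset, jitter = (float(f) for f in fields[-3:])
        except ValueError:
            continue                       # the header row
        peers.append({"tally": line[0], "remote": fields[0], "reach": reach,
                      "delay": delay, "offset": offset, "jitter": jitter})
    return peers

def refclock_silent(peers, name="SHM(0)"):
    """True if the named refclock has answered none of the last 8 polls."""
    return any(p["remote"] == name and p["reach"] == 0 for p in peers)

peers = parse_peers(SAMPLE)
for p in peers:
    print(p["tally"], p["remote"], p["reach"], p["offset"], p["jitter"])
print("refclock silent:", refclock_silent(peers))
```

Run from cron every minute against the real ntpq -p output instead of
SAMPLE, and appended to a file with a timestamp, something like this would
show when the refclock dropped out and when the offsets jumped.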
