[ntp:questions] [Slightly OT] Unexplained clockdrift?

Sander Smeenk sander at bit.nl
Mon Nov 27 19:14:15 UTC 2017


Quoting Mike Cook (michael.cook at sfr.fr):

> > It is most probably hardware related
> unlikely that 2 systems would be similarly impacted with a hardware
> issue unless they have really crappy clocks, especially when going
> back to 4.8 « corrects » it.

I do agree. By 'hardware related' I don't mean that both systems have a
defect. I think some other component in the system is interfering with
interrupts or clock operation in some way.

I'm gonna pull cards (it has an Intel 4x10Gbit NIC) tomorrow...


> > | *ntp4.bit.nl .PPS.           1 u 1    64   1     0.329 -112.21 12.197
> > After a while the offset increases to several hundred:
> How long is a while? > 64s ?

Yes, I'm aware that ntpd "needs time to settle down". ;)


> > | *ntp4.bit.nl .PPS.          1 u 19   64   1     1.168 644.105 408.358
> Your reach value is not updating normally for this poll value.

I don't think this is a routing problem.
We're an ISP and we convinced ourselves that we know what we are doing. ;)
Those are the worst, I know... Firewalls have been checked too...

In fact the reach value goes up normally after a restart: 1, 3, 7, 17,
37, 77... Then ntpd drops all selected sources (*, +, etc.) and
"restarts" itself, although the ntpd process stays the same. That is
when reach seems "stuck" at 1.
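
For readers unfamiliar with the reach column: it is an 8-bit shift
register printed in octal, with a 1 shifted in for every poll that got a
reply. The sketch below only illustrates that bookkeeping; the function
name reach_after is made up for this example and is not part of ntpd or
ntpq.

```python
def reach_after(polls):
    """Simulate an 8-bit reachability shift register.

    polls: iterable of booleans, True if a reply arrived for that poll.
    Returns the register after each poll as an octal string, the same
    notation the 'reach' column of `ntpq -p` uses.
    """
    reg = 0
    history = []
    for ok in polls:
        # Shift left, append the newest result, keep only the low 8 bits.
        reg = ((reg << 1) | (1 if ok else 0)) & 0xFF
        history.append(format(reg, "o"))
    return history


if __name__ == "__main__":
    # Eight successful polls in a row after a restart.
    print(reach_after([True] * 8))
    # ['1', '3', '7', '17', '37', '77', '177', '377']
```

So a sequence of 1, 3, 7, 17, 37, 77 is what six consecutive answered
polls look like, and a value that falls back to 1 means the register was
cleared and has seen one reply since.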

All this while the offset and jitter keep bouncing up and down.
Mostly up.

When it 'resets' itself, offsets drop down from hundreds to tens.

[ .. time passes .. ]

I assumed the 'reset' happens because the offset got too large, so I
just started ntpd with -g and -x. I now see reach going up to 377 and
staying there. The offsets are insane though: well over 3000 ms for all
sources. Jitter goes bonkers too, in the 300-500 ms range.

Ntpd has been running for ~45 minutes now:
|      remote           refid      st t when poll reach   delay   offset  jitter
| ==============================================================================
| *ntp1.dmz.bit.nl 193.0.0.229      2 u   44   64  377    0.363  3734.94 368.463
| +ntp2.dmz.bit.nl 193.67.79.202    2 u   30   64  377    0.382  3499.85 320.456
| +ntp3.dmz.bit.nl 193.79.237.14    2 u   49   64  377    0.514  3095.87 561.627
(I've switched sources to DMZ IPs, but really, it is not network related)
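
To put those numbers in perspective, here is a rough back-of-the-envelope
sketch. It assumes (and this is only an assumption) that the offset built
up more or less linearly from zero over the ~45 minutes ntpd has been
running, and compares the resulting rate with the 500 ppm maximum slew
rate of ntpd. The helper name apparent_drift_ppm is made up for this
example.

```python
def apparent_drift_ppm(offset_ms, elapsed_s):
    """Average drift in parts per million, ASSUMING the offset grew
    linearly from zero over elapsed_s seconds."""
    return (offset_ms / 1000.0) / elapsed_s * 1e6


MAX_SLEW_PPM = 500.0   # maximum rate at which ntpd slews the clock
ELAPSED_S = 45 * 60    # "~45 minutes" from the text, so approximate

# Offsets in milliseconds, copied from the ntpq output above.
offsets_ms = {
    "ntp1.dmz.bit.nl": 3734.94,
    "ntp2.dmz.bit.nl": 3499.85,
    "ntp3.dmz.bit.nl": 3095.87,
}

if __name__ == "__main__":
    for name, offset in offsets_ms.items():
        ppm = apparent_drift_ppm(offset, ELAPSED_S)
        verdict = "exceeds" if ppm > MAX_SLEW_PPM else "is within"
        print(f"{name}: ~{ppm:.0f} ppm, {verdict} the slew limit")
```

Under that assumption all three sources imply a drift well above what
slewing alone can correct, which would fit with the offsets only getting
larger while running with -x.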


> Is a virtualization being used wrt NTP that is not being used in the
> 4.8 kernel config?

No. This machine is meant to become a hypervisor, but it is completely
idle at the moment, as I have to fix this timekeeping issue before we can
go into production with it. ;)


Rgds,
Sndr.
-- 
| BIT - https://www.bit.nl/ - KvK#09090351 - 0318 648688 - info at bit.nl
| 4096R/20CC6CD2 - 6D40 1A20 B9AA 87D4 84C7   FBD6 F3A9 9442 20CC 6CD2

