[ntp:questions] Do logs indicate a config problem?
Ulrich Windl
Ulrich.Windl at RZ.Uni-Regensburg.DE
Fri Jul 16 07:23:18 UTC 2010
David Lord <snews at lordynet.org> writes:
> Do logs indicate a config problem?
>
> system server1: MSFa, PPSa peer=server2 and remote servers
> system server2: peer=server1 and remote servers
> system server3: GPSb, PPSb, server2 and server1
>
>
> This is total logged over period Feb 7 to Feb 12:
> ntp.log.server1:
> 7 Feb 19:22:39 ntpd[26820]: clock SHM(0) event 'clk_noreply' (0x01)
> 7 Feb 19:22:40 ntpd[26820]: clock PPS(0) event 'clk_noreply' (0x01)
> 7 Feb 19:27:01 ntpd[26820]: synchronized to SHM(0), stratum 0
> 7 Feb 19:27:01 ntpd[26820]: kernel time sync status change 2001
> 7 Feb 19:28:01 ntpd[26820]: synchronized to PPS(0), stratum 0
> 12 Feb 10:13:44 ntpd[26820]: sendto(xxx.xxx.xx.xx) (fd=27): Host is down
> 12 Feb 10:20:41 ntpd[26820]: ntpd exiting on signal 15
>
>
> This is tiny section of log on server1 with similar entries
> repeating continuously over whole period Feb 12 to Feb 14. There
> were too many reboots and restarts over period for peerstats to
> be useful. I swapped kernels attempting to get frequency offset
> down from just over 50ppm but the autoconfig still gave 50ppm
> (on NetBSD-3.1 I could set "options TIMER_FREQ=" to get < 10ppm
> without problem).
>
> ntp.log.server1:
> 13 Feb 07:07:48 ntpd[22483]: kernel time sync status change 2901
> 13 Feb 07:08:53 ntpd[22483]: kernel time sync status change 2101
> 13 Feb 07:09:59 ntpd[22483]: kernel time sync status change 2301
> 13 Feb 07:17:28 ntpd[22483]: kernel time sync status change 2501
> 13 Feb 07:18:32 ntpd[22483]: kernel time sync status change 2301
> 13 Feb 07:19:38 ntpd[22483]: kernel time sync status change 2901
> 13 Feb 07:20:41 ntpd[22483]: kernel time sync status change 2301
> 13 Feb 07:21:47 ntpd[22483]: kernel time sync status change 2101
> 13 Feb 07:22:50 ntpd[22483]: kernel time sync status change 2501
> 13 Feb 07:23:54 ntpd[22483]: kernel time sync status change 2101
>
> from timex.h:
> 0x0001 enable pll updates
> 0x0002 enable pps freq discipline
> 0x0004 enable pps time discipline
> 0x0100 pps signal present
> 0x0200 pps signal jitter exceeded
> 0x0400 pps signal wander exceeded
> 0x0800 pps signal calibration error
Hi!
I'd say all 0x800 and 0x400 should never happen unless there is a severe
problem. 0x200 should also happen extremely rarely under normal
conditions. (based on my NTP/Linux experience several years ago)
Regards,
Ulrich
>
>
> This is total logged over period Feb 14 to Feb 19:
> ntp.log.server1
> 14 Feb 10:49:34 ntpd[169]: clock SHM(0) event 'clk_noreply' (0x01)
> 14 Feb 10:52:53 ntpd[169]: synchronized to xx.xxx.xx.xxx, stratum 2
> 14 Feb 10:52:53 ntpd[169]: kernel time sync status change 2001
> 14 Feb 10:58:05 ntpd[169]: synchronized to SHM(0), stratum 0
> 15 Feb 10:32:40 ntpd[169]: ntpd exiting on signal 15
> 15 Feb 10:34:00 ntpd[169]: clock SHM(0) event 'clk_noreply' (0x01)
> 15 Feb 10:40:27 ntpd[169]: synchronized to xx.xxx.xx.xxx, stratum 2
> 15 Feb 10:40:27 ntpd[169]: kernel time sync status change 2001
> 15 Feb 10:41:47 ntpd[169]: synchronized to xx.xxx.xx.xxx, stratum 2
> 15 Feb 10:42:36 ntpd[169]: synchronized to SHM(0), stratum 0
>
>
> This is tiny section of log on server3 with similar entries
> repeating continuously over whole period Feb 7 to Feb 19.
> ntp.log.server3:
> 19 Feb 13:08:24 ntpd[6745]: kernel time sync error \
>
> 0x2307<PLL,PPSFREQ,PPSTIME,PPSSIGNAL,PPSJITTER,NANO,MODE=0x0=PLL,CLK=0x0=A>
> 19 Feb 13:08:39 ntpd[6745]: kernel time sync status change \
> 0x2107<PLL,PPSFREQ,PPSTIME,PPSSIGNAL,NANO,MODE=0x0=PLL,CLK=0x0=A>
> 19 Feb 13:13:31 ntpd[6745]: kernel time sync error \
>
> 0x2307<PLL,PPSFREQ,PPSTIME,PPSSIGNAL,PPSJITTER,NANO,MODE=0x0=PLL,CLK=0x0=A>
> 19 Feb 13:13:46 ntpd[6745]: kernel time sync status change \
> 0x2107<PLL,PPSFREQ,PPSTIME,PPSSIGNAL,NANO,MODE=0x0=PLL,CLK=0x0=A>
>
>
>
> Otherwise timekeeping is good enough for me:
>
> peerstats
> ident mean rms max
> ==========================================================================
> ..... MSF using serial dsr with shm and pps to serial dcd with atom
> server1 20100208-11 127.127.22.0 0.045 1.038 7.100 PPSa
> server3 20100208-11 127.127.22.0 -0.043 0.288 4.544 PPSb
> server3 20100208-11 server1 0.031 0.416 3.485
> server3 20100208-11 server2 0.422 0.577 5.196
> ..... MSF config changed to use only serial dcd with shm driver
> server1 20100215-18 127.127.28.0 0.197 0.427 2.723 MSFa
> server3 20100215-18 127.127.22.0 0.000 -0.002 0.021 PPSb
> server3 20100215-18 server1 1.371 0.412 1.139
> server3 20100215-18 server2 1.452 0.504 1.222
>
>
> Offset from MSF receiver varies significantly with temperature,
> around 1ms/deg based on offset of server1 seen from server3 and
> corresponding to similar offset variation vs remote servers.
>
> GPSb/PPSb on server3 isn't usable from current location of aerial
> in bad weather eg. on 8th 9th when signal lost for short periods
> significantly skewing the stats above:
>
>
> David
More information about the questions
mailing list