[ntp:questions] Re: Synchronisation lost with swisstime.ethz.ch

mike michael.no.spam.cook at wanadoo.fr
Thu Aug 11 09:29:48 UTC 2005


frank.olsen at steria.com wrote:
> Hi all,
> 
> For the past week we've had serious problems with our NTP
> configuration. There are regular _steps_ of, say, +2 seconds, or -2, or
> +4, etc. This causes jobs on our ERP to fail, because they use
> timestamps and the ERP finds timestamps dated in the
> past...
> 
> The configuration is basically one of the servers (all running HP-UX
> 11.00) that synchronises to an external server (swisstime.ethz.ch). This
> server broadcasts on the local subnet. In addition, each of the
> other servers also broadcasts and has the others defined as peers.
> 
> A short extract from a typical part of the syslog for the server that
> does the external synchronisation:
> 
> Aug  3 08:50:58 xntpd[1078]: synchronisation lost
> Aug  3 09:47:30 xntpd[1078]: synchronized to 129.132.2.21, stratum=1
> Aug  4 12:14:05 xntpd[1078]: synchronisation lost
> Aug  4 14:13:33 xntpd[1078]: synchronized to 129.132.2.21, stratum=1
> Aug  4 14:13:31 xntpd[1078]: time reset (step) -2.266052 s
> Aug  4 14:13:31 xntpd[1078]: synchronisation lost
> Aug  4 14:19:42 xntpd[1078]: synchronized to 129.132.2.21, stratum=13
> Aug  4 14:19:42 xntpd[1078]: time reset (step) 0.269439 s
> Aug  4 14:19:42 xntpd[1078]: synchronisation lost
> Aug  4 14:25:02 xntpd[1078]: synchronized to 129.132.2.21, stratum=13
> Aug  4 14:31:06 xntpd[1078]: synchronisation lost
> Aug  4 14:32:10 xntpd[1078]: synchronized to 129.132.2.21, stratum=13
> Aug  4 14:39:57 xntpd[1078]: synchronisation lost
> Aug  4 14:41:01 xntpd[1078]: synchronized to 129.132.2.21, stratum=13
> Aug  4 14:46:05 xntpd[1078]: time reset (step) -3.999183 s
> Aug  4 14:46:05 xntpd[1078]: synchronisation lost
> Aug  4 14:51:25 xntpd[1078]: synchronized to 129.132.2.21, stratum=13
> Aug  4 15:01:34 xntpd[1078]: synchronisation lost
> Aug  4 15:07:04 xntpd[1078]: synchronized to 129.132.2.21, stratum=1
> Aug  4 15:15:42 xntpd[1078]: time reset (step) 6.000553 s
> Aug  4 15:15:42 xntpd[1078]: synchronisation lost
> 
> Up until this month we never had steps of more than around +/- 0.2
> seconds.
> 
> We do have regular connection problems on the external network.
> 
> In trying to debug this problem I've been using ntpdate. I've also
> temporarily deactivated the "external" synchronisation (by commenting
> out server 129.132.2.21 in ntp.conf and restarting xntpd). Currently,
> ntpdate -d gives:
> 
> # ntpdate -d 129.132.2.21
> 11 Aug 10:55:39 ntpdate[24036]: ntpdate version 3.5f: Fri Dec 10
> 18:32:44 GMT 1999 PHNE_19711
> transmit(129.132.2.21)
> receive(129.132.2.21)
> transmit(129.132.2.21)
> receive(129.132.2.21)
> transmit(129.132.2.21)
> receive(129.132.2.21)
> transmit(129.132.2.21)
> receive(129.132.2.21)
> transmit(129.132.2.21)
> server 129.132.2.21, port 123
> stratum 13, precision -18, leap 00, trust 000
> refid [127.127.1.0], delay 0.03419, dispersion 0.00050
> transmitted 4, in filter 4
> reference time:      c6a58f28.f605681e  Thu, Aug 11 2005 10:47:36.961
> originate timestamp: c6a59109.aec1094a  Thu, Aug 11 2005 10:55:37.682
> transmit timestamp:  c6a5910b.c1fca000  Thu, Aug 11 2005 10:55:39.757
> filter delay:  0.03535  0.03725  0.03494  0.03419
>                0.00000  0.00000  0.00000  0.00000
> filter offset: -2.08221 -2.08119 -2.08258 -2.08288
>                0.000000 0.000000 0.000000 0.000000
> delay 0.03419, dispersion 0.00050
> offset -2.082881
> 
> 11 Aug 10:55:39 ntpdate[24036]: step time server 129.132.2.21 offset
> -2.082881 sec
> 
> What is strange is that only 10 minutes ago the offset was 0.02
> seconds... (In the meantime I was disconnected from the server and had
> to reconnect.) For a while I also got the message "no server suitable
> for synchronization found" from 129.132.2.21, like just now:
> 
> # ntpdate -d 129.132.2.21
> 11 Aug 11:04:35 ntpdate[26125]: ntpdate version 3.5f: Fri Dec 10
> 18:32:44 GMT 1999 PHNE_19711
> transmit(129.132.2.21)
> receive(129.132.2.21)
> transmit(129.132.2.21)
> receive(129.132.2.21)
> transmit(129.132.2.21)
> receive(129.132.2.21)
> transmit(129.132.2.21)
> receive(129.132.2.21)
> transmit(129.132.2.21)
> server 129.132.2.21, port 123
> stratum 16, precision -18, leap 00, trust 000
> refid [83.84.69.80], delay 0.03444, dispersion 0.00031
> transmitted 4, in filter 4
> reference time:      c6a58f28.f605681e  Thu, Aug 11 2005 10:47:36.961
> originate timestamp: c6a59323.fb280f12  Thu, Aug 11 2005 11:04:35.981
> transmit timestamp:  c6a59324.05e53000  Thu, Aug 11 2005 11:04:36.023
> filter delay:  0.03520  0.03461  0.03444  0.03746
>                0.00000  0.00000  0.00000  0.00000
> filter offset: -0.05084 -0.05105 -0.05113 -0.04967
>                0.000000 0.000000 0.000000 0.000000
> delay 0.03444, dispersion 0.00031
> offset -0.051134
> 
> 11 Aug 11:04:36 ntpdate[26125]: no server suitable for synchronization
> found
> 
> Why has the offset gone to -0.051134?
> 
> Can it be that the "stratum=13" appears whenever I lose the connection
> completely to 129.132.2.21? As the syslog showed, the server is
> normally stratum=1, but sometimes it jumps to stratum=13.
> 
> Thanks in advance for any help!
> 
> Best regards,
> 
> Frank Olsen
> 
> P.S. You guessed it, I'm no NTP expert...
> 

It has gone out to stratum 16 in the above.

I have just checked for myself. That source is showing stratum 13 for me
as well. They must have some problem, so keep it out of your server list.
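
The refid fields in the two ntpdate outputs above say a bit more about
what the server was doing. Here is a minimal Python sketch for reading
them; the function name describe_refid is my own and not part of any NTP
tool. It assumes the usual conventions: 127.127.t.u is the pseudo-address
xntpd uses for a reference clock driver (type 1 being the undisciplined
local clock), and at stratum 0, 1 or 16 the four refid bytes are an ASCII
code rather than an IP address.

```python
def describe_refid(stratum: int, refid: str) -> str:
    """Interpret a dotted-quad refid as printed by 'ntpdate -d'.

    Illustrative helper only; the name and return strings are made up.
    """
    octets = [int(part) for part in refid.split(".")]
    if len(octets) != 4 or not all(0 <= o <= 255 for o in octets):
        raise ValueError(f"not a dotted-quad refid: {refid!r}")

    # 127.127.<type>.<unit> is the pseudo-address of a reference clock driver.
    if octets[0] == 127 and octets[1] == 127:
        return f"reference clock driver type {octets[2]}, unit {octets[3]}"

    # At stratum 0, 1 or 16 the refid carries a short ASCII code.
    if stratum in (0, 1, 16):
        text = bytes(octets).rstrip(b"\x00").decode("ascii", errors="replace")
        return f"ASCII code {text!r}"

    # Otherwise it is normally the address of the server's own upstream source.
    return f"upstream server {refid}"


print(describe_refid(13, "127.127.1.0"))
print(describe_refid(16, "83.84.69.80"))
```

On the values quoted above, the first call reports driver type 1, unit 0
(the local clock), and the second decodes to the ASCII code 'STEP'. That
would be consistent with the server having fallen back to its own local
clock at stratum 13 and then stepping its time, though I can only guess
at what happened at their end.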

If your local server is losing sync, then I think you do not have enough
sources in your config for NTP to fall back to another, more stable
source.
Try using the ntp.org pool, for example:
server 0.europe.pool.ntp.org
server 1.europe.pool.ntp.org
server 2.europe.pool.ntp.org
If you are not in Europe, there are other regional pools available.
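
If you want to look at a candidate source before adding it, something
along these lines may help. It is only a sketch in Python, assuming plain
unauthenticated NTP over UDP port 123; the function names and the
max_stratum threshold of 4 are my own choices, not anything NTP
prescribes. It sends a client request and reads the leap indicator,
stratum and refid from the reply header.

```python
import socket
import struct

NTP_PORT = 123


def build_request() -> bytes:
    """48-byte client request: LI=0, VN=3, Mode=3 (client), rest zero."""
    return bytes([0x1B]) + bytes(47)


def parse_header(packet: bytes) -> dict:
    """Extract the fields of interest from an NTP reply header."""
    if len(packet) < 48:
        raise ValueError("NTP packet must be at least 48 bytes")
    flags, stratum = struct.unpack("!BB", packet[:2])
    return {
        "leap": flags >> 6,              # 3 means clock not synchronised
        "version": (flags >> 3) & 0x7,
        "mode": flags & 0x7,             # 4 means server reply
        "stratum": stratum,
        "refid": ".".join(str(b) for b in packet[12:16]),
    }


def looks_usable(header: dict, max_stratum: int = 4) -> bool:
    """Rough filter; max_stratum is an arbitrary threshold for illustration."""
    return header["leap"] != 3 and 1 <= header["stratum"] <= max_stratum


def query(host: str, timeout: float = 2.0) -> dict:
    """Send one request to host and return the parsed reply header."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (host, NTP_PORT))
        packet, _ = sock.recvfrom(512)
    return parse_header(packet)
```

For example, looks_usable(query("0.europe.pool.ntp.org")) would give a
quick yes or no for one pool member. A server answering at stratum 13, as
in the ntpdate output above, would be rejected by this filter.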

If you have a firewall that lets NTP traffic through only from
129.132.2.21, it may need reconfiguring.



