[ntp:questions] Re: strange sign change in clock drift with linux 2.6.15.7

mike michael.no.spam.cook at wanadoo.fr
Tue Apr 18 22:18:27 UTC 2006


  I had seen that default change. It might have an effect, but I believe
the previous default was 100 Hz on my architecture. I will give it a
try anyway. It might decrease jitter, but I cannot see how it could
cause the sign switch.

  I was wondering what else might have an effect and I thought that it 
could be related to the SHM refclock (DCF77) that I have. So I 
unconfigured it to check.
MISTAKE!!!

  That threw ntp into a tizzy that it has taken 12hr to get out of, and
even then the status is pretty horrible:

http://quark.stratum1.d2g.com/tmp/clockdrift20060418-23:56.png

   The first tick on the graph is a reboot; the second and the rest are
the result of unconfiguring the DCF ref clock. NTP was not restarted.

Eventually the roller-coaster stopped and the drift came back to the
+ve value of before:
Tue Apr 18 23:51:25 CEST 2006
[mike@quark mike]$ cat /etc/ntp/drift
23.716

What is even more disturbing is that it was not just the perceived
local clock drift that was affected. NTP could not keep the system clock
in sync, resetting at least twice. Log follows.

boot then restart ntpd 11:03:55 (at 0:0:0+39853s)
Apr 18 11:03:55 quark ntpdate[5941]: step time server 129.6.15.28 offset 0.055073 sec
Apr 18 11:03:55 quark ntpd[6022]: ntpd 4.2.0@1.1161-r Sat Aug 13 11:39:48 CEST 2005 (5)
Apr 18 11:03:55 quark ntpd[6022]: precision = 1.000 usec
Apr 18 11:03:55 quark ntpd: ntpd startup succeeded
Apr 18 11:03:55 quark ntpd[6022]: no IPv6 interfaces found
Apr 18 11:03:55 quark ntpd[6022]: kernel time sync status 0040
Apr 18 11:03:56 quark ntpd[6022]: frequency initialized 20.754 PPM from /etc/ntp/drift
Apr 18 11:04:06 quark ntpd[6022]: synchronized to 80.96.120.252, stratum=1
Apr 18 11:04:06 quark ntpd[6022]: kernel time sync disabled 0041
Apr 18 11:06:21 quark ntpd[6022]: kernel time sync enabled 0001
Apr 18 11:08:19 quark ntpd[6022]: synchronized to SHM(0), stratum=0
Apr 18 12:27:33 quark ntpd[6022]: synchronized to 80.96.120.252, stratum=1
Apr 18 12:36:12 quark ntpd[6022]: synchronized to SHM(0), stratum=0

at 13:56 - unconfigured dcf77 ref  (0+50162s)

Tue Apr 18 13:55:27 CEST 2006
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     LOCAL(0)        10 l   31   64  377    0.000    0.000   0.001
*127.127.28.0    .DCF.            0 l   33   64  377    0.000   -0.840   2.335
+212.159.16.190  134.226.81.3     2 u   12   64  377   61.108   -4.326   1.966
+80.96.120.252   .PPS.            1 u   15   64  377   91.674   -4.181   3.274
+80.53.57.158    192.53.103.103   2 u   56   64  377   78.202   -1.189  41.091
 192.168.1.255   .BCST.          16 u    -   64    0    0.000    0.000 4000.00
Tue Apr 18 13:56:32 CEST 2006
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     LOCAL(0)        10 l   30   64  377    0.000    0.000   0.001
+212.159.16.190  134.226.81.3     2 u   13   64  377   61.108   -4.326   2.184
*80.96.120.252   .PPS.            1 u   14   64  377   91.674   -4.181   3.300
+80.53.57.158    192.53.103.103   2 u   54   64  377   78.202   -1.189  41.201
 192.168.1.255   .BCST.          16 u    -   64    0    0.000    0.000 4000.00

   then all hell breaks loose!!

Apr 18 13:56:02 quark ntpd[6022]: synchronized to 80.96.120.252, stratum=1

/etc/ntp/drift(14:03) shows 73.153
Apr 18 14:23:15 quark ntpd[6022]: synchronized to 80.53.57.158, stratum=2

Apr 18 14:24:41 quark ntpd[6022]: synchronized to 80.96.120.252, stratum=1

    Although the server delay stays about the same, the calculated offsets
   gradually drift out past 500 ms. THEN
Tue Apr 18 14:48:55 CEST 2006
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     LOCAL(0)        10 l   10   64  377    0.000    0.000   0.001
+212.159.16.190  134.226.81.3     2 u   74  128   31   61.138  691.321 191.009
*80.96.120.252   .PPS.            1 u  127  128  377   91.075  533.681 238.461
+80.53.57.158    192.53.103.103   2 u   39  128  377   76.842  516.193 293.196
 192.168.1.255   .BCST.          16 u    -   64    0    0.000    0.000 4000.00
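
Peer listings like the one above can be checked mechanically. What follows is only a rough Python sketch, not anything ntpd or ntpq provides: the names `parse_peers`, `beyond_step` and `STEP_THRESHOLD_MS` are made up for illustration, the 128 ms limit is assumed to be ntpd's default step threshold, and the parser assumes one peer per line with numeric remote addresses.

```python
from typing import List, NamedTuple

STEP_THRESHOLD_MS = 128.0  # assumption: ntpd's default step threshold


class Peer(NamedTuple):
    tally: str        # '*' system peer, '+' candidate, ' ' not selected
    remote: str
    refid: str
    stratum: int
    delay_ms: float
    offset_ms: float
    jitter_ms: float


def parse_peers(text: str) -> List[Peer]:
    """Parse `ntpq -p` style output (one peer per line, numeric remotes)."""
    peers = []
    for line in text.splitlines():
        fields = line.split()
        # Peer rows have exactly 10 columns; skip the header and the ==== rule.
        if len(fields) != 10 or fields[0] == "remote":
            continue
        first = fields[0]
        # The tally character, if present, is glued to the remote address.
        if first[0] in "*+-#ox.":
            tally, remote = first[0], first[1:]
        else:
            tally, remote = " ", first
        peers.append(Peer(tally, remote, fields[1], int(fields[2]),
                          float(fields[7]), float(fields[8]), float(fields[9])))
    return peers


def beyond_step(peers, limit_ms=STEP_THRESHOLD_MS):
    """Selected peers ('*' or '+') whose absolute offset exceeds limit_ms."""
    return [p for p in peers if p.tally in "*+" and abs(p.offset_ms) > limit_ms]


# Listing taken from the 14:48:55 sample in this message.
SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     LOCAL(0)        10 l   10   64  377    0.000    0.000   0.001
+212.159.16.190  134.226.81.3     2 u   74  128   31   61.138  691.321 191.009
*80.96.120.252   .PPS.            1 u  127  128  377   91.075  533.681 238.461
+80.53.57.158    192.53.103.103   2 u   39  128  377   76.842  516.193 293.196
 192.168.1.255   .BCST.          16 u    -   64    0    0.000    0.000 4000.00
"""

if __name__ == "__main__":
    for p in beyond_step(parse_peers(SAMPLE)):
        print(f"{p.remote}: offset {p.offset_ms:+.3f} ms, jitter {p.jitter_ms:.3f} ms")
```

On this sample all three network servers are flagged, which fits the time reset logged a minute later.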


Apr 18 14:49:48 quark ntpd[6022]: time reset +0.510980 s
Tue Apr 18 14:49:59 CEST 2006
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     LOCAL(0)        10 l   11   64    0    0.000    0.000 4000.00
 212.159.16.190  .STEP.          16 u 1055   64    0    0.000    0.000 4000.00
 80.96.120.252   .STEP.          16 u   79   64    0    0.000    0.000 4000.00
 80.53.57.158    .STEP.          16 u 1014   64    0    0.000    0.000 4000.00
 192.168.1.255   .BCST.          16 u    -   64    0    0.000    0.000 4000.00

Apr 18 14:49:48 quark kernel: adjtime: ntpd used obsolete ADJ_OFFSET_SINGLESHOT instead of ADJ_ADJTIME
Apr 18 14:51:00 quark ntpd[6022]: synchronized to 80.96.120.252, stratum=1
   Maybe I need to upgrade.
   However, despite the reset and catch-up step, the offsets are showing
the same high delta as before:
Tue Apr 18 14:51:04 CEST 2006
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     LOCAL(0)        10 l   13   64    1    0.000    0.000   0.001
+212.159.16.190  134.226.81.3     2 u    2   64    1   60.799  523.476   0.791
*80.96.120.252   .PPS.            1 u    2   64    1   94.854  527.978   2.280
+80.53.57.158    192.53.103.103   2 u    2   64    1   76.570  527.433   0.850
 192.168.1.255   .BCST.          16 u    -   64    0    0.000    0.000 4000.00
Tue Apr 18 14:52:08 CEST 2006


   now the drift is showing -ve in a very big way?
[mike@quark mike]$ ls -l /etc/ntp/drift
-rw-r--r--  1 root root 9 Apr 18 15:03 /etc/ntp/drift
[mike@quark mike]$ cat /etc/ntp/drift
-196.699  !!!!!!!
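
To put that figure in perspective, a frequency offset in PPM converts directly into accumulated time error. A small Python sketch of the arithmetic, assuming the drift file value is a plain frequency offset in parts per million (the function name is only illustrative):

```python
SECONDS_PER_DAY = 86_400


def ppm_to_seconds_per_day(ppm: float) -> float:
    """Time error built up in one day by an uncorrected clock that is off by `ppm`."""
    return ppm * 1e-6 * SECONDS_PER_DAY


if __name__ == "__main__":
    # Drift file readings quoted in this message, plus the 500 ppm limit.
    for ppm in (23.716, -196.699, 500.0):
        print(f"{ppm:+9.3f} ppm -> {ppm_to_seconds_per_day(ppm):+8.3f} s/day")
```

So -196.699 ppm corresponds to roughly 17 seconds per day, against about 2 seconds per day for the usual 23.716 ppm.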

    ntp is lost in never-never land. Resets again.
Apr 18 15:05:02 quark ntpd[6022]: time reset +0.638121 s
Apr 18 15:05:02 quark kernel: adjtime: ntpd used obsolete ADJ_OFFSET_SINGLESHOT instead of ADJ_ADJTIME

this time the initial offsets are ok:
Tue Apr 18 15:07:07 CEST 2006
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     LOCAL(0)        10 l   62   64    1    0.000    0.000   0.001
+212.159.16.190  134.226.81.3     2 u   45   64    1   58.386   35.613   4.595
*80.96.120.252   .PPS.            1 u   46   64    1   92.243   39.273   6.470
+80.53.57.158    192.53.103.103   2 u   46   64    1   77.332   46.271   4.028
 192.168.1.255   .BCST.          16 u    -   64    0    0.000    0.000 4000.00

   but offset and jitter soon drift out.

Tue Apr 18 15:15:40 CEST 2006
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     LOCAL(0)        10 l   61   64  377    0.000    0.000   0.001
+212.159.16.190  134.226.81.3     2 u   46   64  377   60.614   10.450  91.605
*80.96.120.252   .PPS.            1 u   42   64  377   90.877  -111.79  70.616
+80.53.57.158    192.53.103.103   2 u   43   64  377   78.108  -46.792  48.605
 192.168.1.255   .BCST.          16 u    -   64    0    0.000    0.000 4000.00

   leaving it to see what happens -

   no change, cycling through resets:
Apr 18 17:19:19 quark ntpd[6022]: time reset +0.133473 s
Apr 18 17:19:19 quark kernel: adjtime: ntpd used obsolete ADJ_OFFSET_SINGLESHOT instead of ADJ_ADJTIME
Apr 18 17:20:30 quark ntpd[6022]: synchronized to 80.53.57.158, stratum=2
Apr 18 17:20:31 quark ntpd[6022]: synchronized to 80.96.120.252, stratum=1

  crappy figures still:

Tue Apr 18 17:31:25 CEST 2006
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     LOCAL(0)        10 l   23   64  377    0.000    0.000   0.001
+212.159.16.190  134.226.81.3     2 u   64   64  377   60.463    4.256  15.580
*80.96.120.252   .PPS.            1 u    4   64  377   92.510    3.472  20.856
+80.53.57.158    192.53.103.103   2 u    4   64  377   78.340   10.384  12.726
 192.168.1.255   .BCST.          16 u    -   64    0    0.000    0.000 4000.00

   drift is down, AND -ve.

[mike@quark mike]$ cat /etc/ntp/drift
-38.397

[mike@quark mike]$ date
Tue Apr 18 18:52:56 CEST 2006
[mike@quark mike]$ cat /etc/ntp/drift
11.101

   gone positive again!!!!
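
Checking the drift file by hand like this could be scripted. A minimal Python sketch, assuming the drift file holds the frequency offset in PPM as its first token; `read_drift_ppm` and `sign_changes` are illustrative names, not part of any NTP tool:

```python
from pathlib import Path


def read_drift_ppm(path: str = "/etc/ntp/drift") -> float:
    """Return the first number in the drift file (frequency offset in PPM)."""
    return float(Path(path).read_text().split()[0])


def sign_changes(samples) -> int:
    """Count how often consecutive drift readings switch sign."""
    flips = 0
    for prev, cur in zip(samples, samples[1:]):
        if prev * cur < 0:
            flips += 1
    return flips


# Readings quoted in this message, in chronological order.
READINGS = [20.754, 73.153, -196.699, -38.397, 11.101, 23.716]
```

Calling `read_drift_ppm()` from cron once an hour and appending the result to a file would give a history to feed into `sign_changes`; for the readings above it counts two sign flips.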

  eventually gets stable, but not good.
Tue Apr 18 23:40:06 CEST 2006
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     LOCAL(0)        10 l   27   64  377    0.000    0.000   0.001
+212.159.16.190  134.226.81.3     3 u   71 2048  117   63.087   49.896   2.019
*80.96.120.252   .PPS.            1 u   17 2048  377   91.967   52.951  13.380
+80.53.57.158    192.53.103.103   2 u   83 2048  377   79.007   55.809 342.806
 192.168.1.255   .BCST.          16 u    -   64    0    0.000    0.000 4000.00

Tue Apr 18 23:51:25 CEST 2006
[mike@quark mike]$ cat /etc/ntp/drift
23.716


  This shows, to my mind, an underlying weakness in NTP. Local clock
quality can't be relied on (otherwise we wouldn't need NTP), BUT it is
stable, good or bad. Crystals may drift, even outside the 500 ppm
allowed, but they don't skip and jump. So I think that NTP ought to take
some measure of the historical drift (over more than 1 hr) into account.
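
To make the suggestion concrete, here is a rough Python sketch of one way it might look. It is not how ntpd works; the class name, the window of 24 samples (imagined as hourly readings) and the 5 ppm limit are all hypothetical. Each new frequency estimate is held to within a fixed distance of the median of the recent history:

```python
from collections import deque
from statistics import median


class DriftHistory:
    """Keep recent frequency estimates and limit how far a new one may jump."""

    def __init__(self, window: int = 24, max_step_ppm: float = 5.0):
        self.samples = deque(maxlen=window)   # e.g. one sample per hour (assumed)
        self.max_step_ppm = max_step_ppm      # hypothetical limit per update

    def constrain(self, new_ppm: float) -> float:
        """Clamp new_ppm to median(history) +/- max_step_ppm, then record it."""
        if not self.samples:
            self.samples.append(new_ppm)
            return new_ppm
        centre = median(self.samples)
        low, high = centre - self.max_step_ppm, centre + self.max_step_ppm
        limited = min(max(new_ppm, low), high)
        self.samples.append(limited)
        return limited


if __name__ == "__main__":
    history = DriftHistory()
    for reading in (20.754, 73.153, -196.699):
        print(f"{reading:+9.3f} ppm -> {history.constrain(reading):+8.3f} ppm")
```

The trade-off is that a genuine change in the clock frequency, such as the one after the kernel upgrade, would only be followed a few ppm at a time.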

Mike

Markus Rehbach wrote:
> mike wrote:
> 
> 
>>Yesterday I upgraded my linux kernel from 2.6.8.1 to 2.6.15.7 in order
>>to take advantage of the PPSKit. I have changed neither the NTP version
>>nor the config. After rebooting and checking that all
>>was working, I noticed that the calculated local drift had switched
>>from its former stable -32 ppm to a positive value of +27 ppm.
> 
> 
> Perhaps the change of the default linux kernel HZ value (from 1000HZ to
> 250HZ in newer versions) is the reason why? 
> 



