[ntp:questions] NTP not syncing

Bruce Evans bde at besplex.bde.org
Fri Dec 6 05:31:10 UTC 2013


In article <Zw6ou.573976$276.424934 at fx07.iad>, unruh  <unruh at invalid.ca> wrote:
>On 2013-12-05, antonio.marcheselli at gmail.com
><antonio.marcheselli at gmail.com> wrote:
>>
>> I understand and I apologise but I'm with a small netbook at the
>> moment and I can't do better. I'm trying to clean the posts before
>> posting them back.
>>
>> I'll see if I can find an alternative. Thanks for your understanding
>> in the meantime.
>>> 
>>> -258 IS pretty high. On what kind of machine? What operating system?
>>
>> Same machine that was before! Supermicro motherboard, debian 5.0.7
>>> 
>>> The computer has to calibrate the timer interrupts on bootup. That
>>> calibration can be variable. For older versions of the Linux kernel that
>>> calibration was very variable, of the order that  you are seeing. That
>>> seems to have been fixed on the more recent kernels. 
>>
>> But I only restarted the ntp service, I did not reboot the server.
>>
>> As it's been said, I'm concerned that if the calibration can drift by
>> 200ppm I may end up over the 500ppm boundary.
>>
>> But I understand that the drift file will change between reboots, but
>> it will then stabilise. Now the power saving is disabled I don't see the
>> drift changing over time, which should be good?
>
>Yes, power saving is definitely a problem if your system clock is using
>tsc as the clock. The number of instruction cycles per second changes
>under powersaving and thus the system clock rate changes by huge
>amounts. (The powersaving could cause the cpu to go half as fast.) The
>only thing to do is to use some other counter as the system clock (HPET
>should be better, shouldn't it? -- I do not know how it behaves under
>powersaving).

For the last 5-10 years, many x86 systems have had a TSC that isn't
affected by power saving.  I don't have such a system, but the only
problems that I know of on such systems are:
- the hardware that implements such invariance makes reading the TSC much
  slower (up from about 10 cycles on old Athlons to 50-70 cycles).  This
  is still several times faster than reading an HPET and almost 100 times
  faster than reading the ACPI timer.
- There may be problems with keeping the TSCs of several cores in sync.
  The hardware doesn't do this so transparently, and if the software
  handles it then it makes reading the TSC even slower.
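
For the Debian machine discussed in the quoted text, one way to check
whether the TSC is of the power-saving-invariant kind is to look at the
CPU flags.  The following is only a sketch: has_invariant_tsc is a
hypothetical helper, and it assumes that the Linux flags constant_tsc
and nonstop_tsc in /proc/cpuinfo are the right indicator.  Other systems
report this differently.

```shell
#!/bin/sh
# has_invariant_tsc [CPUINFO_FILE]
# Succeeds (status 0) if the first "flags" line of a Linux cpuinfo file
# lists both constant_tsc and nonstop_tsc.  Returns 1 if either flag is
# missing and 2 if no "flags" line is found.
# Assumption: these two flags are taken to mean "TSC rate is not changed
# by power saving"; this is a Linux convention, not something from the post.
has_invariant_tsc() {
	cpuinfo=${1:-/proc/cpuinfo}
	# Take only the first flags line; all cores normally report the same.
	flags=$(sed -n '/^flags/{p;q;}' "$cpuinfo" 2>/dev/null)
	[ -n "$flags" ] || return 2
	# Pad with spaces so that every flag is delimited on both sides.
	case " $flags " in
	*" constant_tsc "*) ;;
	*) return 1 ;;
	esac
	case " $flags " in
	*" nonstop_tsc "*) return 0 ;;
	*) return 1 ;;
	esac
}
```

Calling has_invariant_tsc with no argument reads /proc/cpuinfo; passing
a file name makes it easy to check a saved copy from another machine.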

To handle the calibration varying across reboots, under FreeBSD I just
blow away the system calibration using a sysctl in an etc/rc file.
FreeBSD never had large variance in TSC calibration across reboots,
but I found the variations that it does have annoying.  Most versions have a jitter
of only a couple of parts per million (ppm), but some have a fixed
error of about 10 ppm due to a sloppy calibration algorithm.  When
switching to a test or reference version with worse or just different
calibration, ntpd takes noticeably longer to sync, and syncing messes
up the driftfile for switching back.

I use a complicated rc file to support the following variations:
- switching the system version
- switching the frequency in the BIOS
- switching the CPU.  I have a system with 2 CPUs running at different
  frequencies.  I usually used only 1 (back when this old system was in use).
  The one used is random and switching it gives the same calibration
  changes as switching the frequency in the BIOS.

On most of my systems, the TSC frequency is a rational multiple of the
i8254 frequency.  Apparently, there is some master clock driving both
through a PLL to get multiples.  It is possible to determine the exact
(relative) multiple by looking at a continued fraction expansion of the
ratio of the frequencies, after determining this ratio very accurately.
On 1 system:

% # Multipliers 1783+5/13 @ 2127.9MHz, 1863+0 @ 2222.9MHz (Barton 2800)
% # Goal: 1193182 -> 1193199, 2127902422 -> 2127932819, 2222898066 -> 2222929602
%  	freq=$(sysctl -n machdep.tsc_freq)
%  	if test $freq -ge 2127500000 -a $freq -lt 2128500000; then
%  		multiplier=1783.384615385
%  	fi
%  	if test $freq -ge 2222500000 -a $freq -lt 2223500000; then
%  		multiplier=1863
%  	fi
%  	scale=1.000014187

Each choice of nominal frequency (in the BIOS configuration) gives a
different multiplier.  Only 2 are supported here.  The initial 'freq'
variable is the kernel calibration.  It may have a large error, but the
error only has to be less than several thousand ppm for the above to map
it correctly to the exact multiplier.
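
The continued fraction step mentioned earlier can be sketched roughly as
follows.  This is an illustration only, not code from the rc file:
best_ratio is a hypothetical helper, the cutoff of 1000 for what counts
as a "large" partial quotient is an arbitrary assumption, and the input
ratio must already have been measured accurately.

```shell
#!/bin/sh
# best_ratio RATIO [MAXQUOT]
# Expand RATIO as a continued fraction and print the last convergent
# "P/Q" reached before a partial quotient larger than MAXQUOT (default
# 1000, an arbitrary choice).  A very large partial quotient means that
# the previous convergent already matches the input to within its noise.
best_ratio() {
	awk -v x="$1" -v maxq="${2:-1000}" 'BEGIN {
		h1 = 1; h0 = 0;		# numerators of the last two convergents
		k1 = 0; k0 = 1;		# denominators of the last two convergents
		for (i = 0; i < 20; i++) {
			a = int(x);	# next partial quotient
			if (i > 0 && a > maxq)
				break;
			# Standard recurrence for continued fraction convergents.
			h = a * h1 + h0; k = a * k1 + k0;
			h0 = h1; h1 = h; k0 = k1; k1 = k;
			frac = x - a;
			if (frac < 1e-12)	# ratio is (numerically) exact
				break;
			x = 1 / frac;
		}
		printf "%d/%d\n", h1, k1;
	}'
}
```

For the ratio 1783.384615385 this prints 23184/13, which is 1783+5/13,
the multiplier given in the comment at the top of the rc file excerpt.
For 1863 it prints 1863/1.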

Then there is a scale factor to map the nominal frequency
(multiplier * 1193182) to the actual frequency.  1193182 is the nominal
frequency of the i8254 in Hz.
Here the scale factor gives an adjustment by 14 ppm.  It is an average
determined over years of running this system.  The correct factor varies
by a few ppm with the season (due to variation of the average temperature),
and I used to sometimes adjust it with the season.  I never got around to
doing temperature adjustments in real time.
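
As a worked example of the arithmetic described above, the corrected
frequency is multiplier * 1193182 * scale.  The sketch below is not the
remainder of the rc file (which is not shown in this message); tsc_freq
is a hypothetical helper and awk is used only because sh has no
floating point.

```shell
#!/bin/sh
# Nominal i8254 frequency in Hz, as quoted in the post.
I8254_NOMINAL=1193182

# tsc_freq MULTIPLIER SCALE
# Print MULTIPLIER * I8254_NOMINAL * SCALE rounded to the nearest Hz.
tsc_freq() {
	awk -v m="$1" -v s="$2" -v base="$I8254_NOMINAL" \
	    'BEGIN { printf "%.0f\n", m * base * s }'
}

# Hypothetical use on FreeBSD (needs root, so it is left commented out):
# sysctl machdep.tsc_freq="$(tsc_freq 1863 1.000014187)"
```

With multiplier 1863 and scale 1.000014187 this gives 2222929602, the
goal value in the rc file comment for the 2222.9MHz setting.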

Most configurations don't need this complexity.  The adjustment in 'scale'
is almost the same as the one in the driftfile, and one way to determine
it is to wait for ntpd to sync and copy it from there.  This was more
useful when I had some systems not connected to the internet all the time.
By setting the scale factor to the daily average of the correct factor,
it was possible to reduce the daily drift to about 1 ppm.  Setting the
scale factor here instead of in the driftfile mainly makes ntpd loopstats
logs easier to read -- the drift should be close to 0 and have an average
of 0 on all systems.

Most FreeBSD x86 systems using the TSC just need:

% 	sysctl machdep.tsc_freq=$newfreq

to blow away the boot-time calibration, after determining the correct
frequency.  The correct frequency can be just the initial boot-time
calibrated one, after letting ntpd run for a while to adjust the
driftfile.

To initialize this without waiting for ntpd to sync (but with more
hands-on work for me), I sometimes watch the drift in the offset (while
ntpd is not running and the kernel is not applying any previous
ntpd adjustments).  It is easy to see and compensate for drifts of more
than about 1/10 ppm when you know that the drift is caused by a
calibration error.
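
Watching the drift by hand comes down to taking two offset readings some
time apart and dividing the change by the elapsed time.  A minimal
sketch of that arithmetic follows.  drift_ppm is a hypothetical helper,
and how the offsets are sampled is left open (ntpdate -q against a
trusted server is one option).  Which sign means "too fast" depends on
how the tool defines the offset, so check that before applying a
correction to the frequency.

```shell
#!/bin/sh
# drift_ppm OFFSET1 T1 OFFSET2 T2
# OFFSET1 and OFFSET2 are clock offsets in seconds, sampled at times T1
# and T2 (also in seconds).  Prints the drift rate in parts per million.
drift_ppm() {
	awk -v o1="$1" -v t1="$2" -v o2="$3" -v t2="$4" 'BEGIN {
		if (t2 == t1)
			exit 1;		# no elapsed time, rate is undefined
		printf "%.3f\n", (o2 - o1) / (t2 - t1) * 1e6;
	}'
}
```

For example, an offset that grows from 0.010 s to 0.046 s over 3600 s
gives 10.000 ppm.  A longer interval between the two readings makes the
estimate less sensitive to noise in each individual offset.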

Bruce


