[ntp:hackers] ntp 4.2.4

David L. Mills mills at udel.edu
Wed Oct 8 17:33:03 UTC 2008


Simon,

I can't help you with the Linux kernel. The code you cite appears to 
have different semantics than the code that left here 16 years ago. The 
kernel FLL code is highly suboptimal with respect to the current ntpd 
design for large poll intervals. Disable the kernel discipline if you 
expect the poll interval to be greater than 1024 s, especially with 
large numbers of potential servers.

In any case, the code that left here and appears in several kernels 
should not enable the FLL unless explicitly requested. That is a bug 
that should be reported to the kernelmongers.

Dave
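
Whether the kernel currently has FLL mode selected can be checked by
reading the status word. The following is a minimal sketch, assuming a
Linux system whose <sys/timex.h> provides adjtimex() and STA_FLL; the
helper names fll_selected() and read_clock_status() are hypothetical and
are not part of ntpd or the kernel.

```c
#include <sys/timex.h>

/* True when the FLL mode bit is set in a kernel status word. */
static int fll_selected(int status)
{
    return (status & STA_FLL) != 0;
}

/* Read the kernel clock status without changing anything: leaving
 * tx.modes at 0 makes adjtimex() a read-only query.
 * Returns 0 and stores the status word on success, -1 on failure. */
static int read_clock_status(int *status)
{
    struct timex tx = {0};

    if (adjtimex(&tx) == -1)
        return -1;
    *status = tx.status;
    return 0;
}
```

Calling read_clock_status() and passing the result to fll_selected()
while ntpd is running at a large poll interval would show whether the
FLL bit has been set on that system.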

Simon Hughes wrote:

> Hi
>
> By way of introduction, I've been working with ntpd 4.2.4 on Linux for
> about 6 months now. It's an interesting project and I'd like to thank
> you for the resources that you've put into the public domain,
> particularly the reference implementation and the red book. I've found
> these to be essential reading in order to understand the open source
> implementation.
>
> From the data I've collected I can see there is a problem with ntpd
> 4.2.4 running on Linux 2.6.18, which is the kernel I'm using. My
> questions concern the best way to fix it. Let me state the problem as I
> understand it:
>
> In <kernel_src_root>/kernel/time.c from 2.6.18:
>
>         if (time_status & STA_FREQHOLD || time_reftime == 0)
>             time_reftime = xtime.tv_sec;
>         mtemp = xtime.tv_sec - time_reftime;
>         time_reftime = xtime.tv_sec;
>         if (time_status & STA_FLL) {
>             if (mtemp >= MINSEC) {
>                 ltemp = (time_offset / mtemp) << (SHIFT_USEC -
>                                                   SHIFT_UPDATE);
>                 time_freq += shift_right(ltemp, SHIFT_KH);
>             } else /* calibration interval too short (p. 12) */
>                 result = TIME_ERROR;
>         } else { /* PLL mode */
>             if (mtemp < MAXSEC) {
>                 ltemp *= mtemp;
>                 time_freq += shift_right(ltemp,(time_constant +
>                                          time_constant +
>                                          SHIFT_KF - SHIFT_USEC));
>             } else /* calibration interval too long (p. 12) */
>                 result = TIME_ERROR;
>         }
>         time_freq = min(time_freq, time_tolerance);
>         time_freq = max(time_freq, -time_tolerance);
>     } /* STA_PLL */
> } /* txc->modes & ADJ_OFFSET */
>
> Under the following conditions:
>
> - Using ntpd 4.2.4 on Linux 2.6.18.
>
> - When sys_poll > 10, so that STA_FLL is set (ntp_loopfilter.c).
>
> - From test data, I observe that ntp_loopfilter.c::local_clock() is
> occasionally called with small values of mu (mu~45) when the polling
> exponent is, for example, sys_poll~13. Let's assume that such a call
> has already taken place, so that time_reftime in the kernel has just
> been reset (this point is referred to as *A* below).
>
> - A second call to ntp_loopfilter.c::local_clock(), and from there to
> ntp_adjtime(), is received with mu~2^sys_poll.
>
> Then:
>
> - In the kernel, time.c::mtemp is computed to be very small, as
> the kernel remembers the last time it was called (in time_reftime) to be
> very recent (due to *A*).
>
> - ltemp = (time_offset / mtemp) << (SHIFT_USEC - SHIFT_UPDATE) is
> very large.
>
> - time_freq += shift_right(ltemp, SHIFT_KH) produces a ~1000 ppb
> change in time_freq.
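
The sensitivity of the FLL update to mtemp can be checked with a little
arithmetic. The following is a minimal sketch in C, not the kernel code
itself: fll_freq_delta() and to_ppb() are hypothetical helpers, the
shift constants are assumed to be 12, 16 and 2 (check them against
include/linux/timex.h in your tree), and the 180 us offset is an assumed
figure chosen for illustration. Clamping to MAXPHASE and the MINSEC test
are left out.

```c
/* Assumed scaling constants; verify against include/linux/timex.h. */
#define SHIFT_UPDATE 12   /* time_offset is held as usec << SHIFT_UPDATE */
#define SHIFT_USEC   16   /* time_freq is held as ppm << SHIFT_USEC */
#define SHIFT_KH      2   /* FLL frequency gain, as a shift */

/* One FLL frequency update, following the two quoted kernel lines.
 * offset_usec: phase offset in microseconds.
 * mtemp: seconds since the previous update (must be nonzero).
 * Returns the change in time_freq in scaled units (ppm << SHIFT_USEC).
 * Multiplication and division stand in for the shifts so that negative
 * offsets stay well defined in portable C. */
static long long fll_freq_delta(long long offset_usec, long long mtemp)
{
    long long time_offset = offset_usec * (1LL << SHIFT_UPDATE);
    long long ltemp = (time_offset / mtemp)
                      * (1LL << (SHIFT_USEC - SHIFT_UPDATE));

    return ltemp / (1LL << SHIFT_KH);
}

/* Convert a scaled frequency value to parts per billion (truncating). */
static long long to_ppb(long long scaled)
{
    return scaled * 1000 / (1LL << SHIFT_USEC);
}
```

Under these assumptions, a 180 us offset with mtemp = 45 s gives a step
of 1000 ppb, while the same offset with mtemp = 8192 s gives about 5 ppb.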
>
> I am aware of public postings regarding the problems with ntpd on the
> Linux kernel. However, I find the discussions confusing and inconclusive
> in relation to this problem.
>
> There appear to be a number of ways to fix this problem, but
> I'm not sure which is most appropriate or advantageous:
>
> - Make MINSEC larger, ~2^sys_poll.
>
> - Bail out of ntp_loopfilter.c::local_clock() if mu <<
> 2^sys_poll in FLL mode. Such updates must be spurious.
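
Both alternatives come down to the same comparison between the update
interval and the poll interval. The following is a minimal sketch of
such a guard, assuming a hypothetical helper fll_interval_too_short()
and a hypothetical tuning parameter slack_shift; neither exists in ntpd
or the kernel, and the thread does not settle on either fix.

```c
#include <stdbool.h>

/* Returns true when mu (seconds since the last update) is much shorter
 * than the poll interval 2^sys_poll, so that an FLL frequency update
 * based on it should be skipped.
 * slack_shift sets what "much shorter" means: the update is rejected
 * when mu < 2^sys_poll / 2^slack_shift. It is an assumed knob and must
 * not exceed sys_poll. */
static bool fll_interval_too_short(long mu, int sys_poll, int slack_shift)
{
    long poll_interval = 1L << sys_poll;

    return mu < (poll_interval >> slack_shift);
}
```

With sys_poll = 13 and slack_shift = 2, the threshold is 2048 s: an
update with mu = 45 would be rejected and one with mu = 8192 accepted.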
>
> I don't really understand (yet) how mu can be correct (i.e. large) for so
> many calls to local_clock() and then a small value is computed:
>
> mu = peer->epoch - sys_clocktime;
>
> How can this happen?
>
> Any guidance as to the preferred way to fix this problem would be
> greatly appreciated.
>
>
>
>
>
> Thanks
>
> Simon
>
> ip.access Ltd, registration number 3400157, Building 2020,
> Cambourne Business Park, Cambourne, Cambridge CB23 6DW, United Kingdom


