[ntp:questions] Frequency Error on Sun4v

David L. Mills mills at udel.edu
Sat Jun 21 14:41:59 UTC 2008


Naughty Sun. Cheaper to wiggle the clock rather than shield the box. I 
can understand the need to do this for the CPU clock, but I thought the 
timer interrupt was driven by a different oscillator. The problem is 
that the phase jitter would interfere with the TSC (or equivalent) timer.

Reporting the average implies some averaging interval. That adds an 
extra pole to the impulse response and could result in ringing or 
overshoot. If the averaging interval is short, like one second, no 
problem. If much longer, there could be a problem. Spread spectrum is a 
good idea, and the frequency modulation resulting in phase jitter should 
be filtered out by the NTP mitigation and discipline algorithms, as long 
as the resulting phase jitter samples are independent and identically 
distributed.
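
To make the averaging concern concrete, here is a minimal Python sketch. 
It is not taken from ntpd or from the Sun firmware; the names 
boxcar_average and ramp_lag and the window lengths are assumptions 
chosen for illustration. It feeds a steadily ramping value through a 
boxcar average and measures how far the reported average trails the 
true value, which is the extra delay a discipline loop would see.

```python
from collections import deque


def boxcar_average(samples, window):
    """Return the running mean over the last `window` samples.

    Output starts only once the window is full.
    """
    buf = deque(maxlen=window)
    out = []
    total = 0.0
    for x in samples:
        if len(buf) == window:
            total -= buf[0]  # oldest value, evicted by the append below
        buf.append(x)
        total += x
        if len(buf) == window:
            out.append(total / window)
    return out


def ramp_lag(window, n=200):
    """Lag (in samples) of the averaged output behind a unit ramp."""
    ramp = [float(k) for k in range(n)]
    avg = boxcar_average(ramp, window)
    return ramp[-1] - avg[-1]


if __name__ == "__main__":
    # Hypothetical window lengths, in samples.
    for window in (1, 16, 64):
        print(window, ramp_lag(window))
```

For a ramp input the lag works out to (window - 1) / 2 samples, so a 
short window is nearly transparent while a long one puts a noticeable 
delay inside the loop.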

Problems with Suns configured with slew-only and relatively large 
offsets have been reported previously. This suggests the Solaris 
adjtime() syscall has been modified from the original Unix design, 
which is a linear slew. In general, people seem to just put up with it. 
This problem should not occur with the kernel ntp_adjtime() call, but it 
is not used when slew-only is configured.
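
As a rough illustration of why large offsets are painful under 
slew-only operation, the Python sketch below computes how long a 
constant-rate (linear) slew takes to remove an offset. The function name 
slew_duration is hypothetical, and the 500 ppm default is an assumption 
borrowed from the ntpd limit mentioned in the quoted message below; the 
rate an actual adjtime() implementation uses depends on the system.

```python
def slew_duration(offset_s, rate_ppm=500.0):
    """Seconds needed to amortize `offset_s` at a constant slew rate.

    rate_ppm is an assumed slew rate in parts per million; it is not
    read from any real kernel.
    """
    if rate_ppm <= 0:
        raise ValueError("rate_ppm must be positive")
    return abs(offset_s) * 1e6 / rate_ppm


if __name__ == "__main__":
    # Hypothetical offsets, in seconds.
    for offset in (0.1, 1.0, 10.0):
        print(offset, slew_duration(offset))
```

At the assumed 500 ppm, each second of offset takes 2000 seconds to 
slew out, so a multi-second offset can take hours to correct.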


Brian Utterback wrote:
> Hal Murray wrote:
>> [drift > 500 ppm]
>>> It's almost certainly a hardware problem.  Ntpd is telling you that 
>>> the clock is gaining, or losing, more than about 43 seconds (500 
>>> parts per million) per day.  500 PPM is the maximum that ntpd can 
>>> handle.
>> It could easily be a software screwup.
>> Unless you know it works on that particular type of hardware,
>> I'd give software equal probability.
> Do not even think about using NTP on a T1000, T2000, or T5120 until you 
> have the latest firmware patch installed (nah-nah, wasn't hardware  OR 
> software. Or maybe both?).  There is a bug in all three that causes the 
> firmware to report an incorrect clock frequency to the kernel on boot 
> up. Interestingly, the bug in the T1000 and T2000 is different from the 
> bug in the T5120. See my blog post at 
> http://blogs.sun.com/blu/entry/spread_spectrum_emi_and_the for an 
> explanation of the T5120 issue.
> Brian Utterback
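
The "about 43 seconds per day" figure in the quoted text follows 
directly from the 500 ppm limit. A small Python check, with a 
hypothetical function name:

```python
SECONDS_PER_DAY = 86_400


def drift_per_day(ppm):
    """Seconds gained or lost per day at a frequency error of `ppm`."""
    return ppm * SECONDS_PER_DAY / 1e6


if __name__ == "__main__":
    print(drift_per_day(500))
```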
