[ntp:hackers] ntp p110 and setting frequency and offset still off.

David L. Mills mills at udel.edu
Fri Feb 15 18:21:30 UTC 2008


Brian,

I do welcome serious discussion, or I would not still be prolonging it. 
However, I am having a hard time with your proposal because I don't 
understand the context.

If I read you correctly, the kernel is not an issue; it is the initial 
treatment of offset and frequency. Here is the context.

1. If there is a frequency file and the initial offset is less than 128 
ms, the PLL operates normally to discipline the offset and frequency. At 
startup only the frequency is initialized from the file and the offset 
is initialized to zero, since there is no offset available at this 
time. Only when the distance is less than the threshold (usually four 
packets) does the discipline adjust the offset and frequency.

2. If there is no frequency file, the discipline starts in NSET state. 
At the first offset less than 128 ms the discipline enters FREQ state 
and initializes the offset. During FREQ state the offset is not further 
disciplined. It just exponentially slews to the initial offset. The 
problem is to separate the intrinsic frequency error from the offset 
error. Doing both at the same time is really hard. Since the FREQ state 
occurs only when started for the first time, I concluded that the 
additional complexity was not worth the hazard.

3. When leaving FREQ state after 900 s the frequency is initialized 
directly and the PLL takes over. However, the first offset when leaving 
FREQ state disciplines only the frequency, not the offset. The normal 
PLL takes over at the next update. It could be at this time the offset 
is greater than 128 ms, in which case a step occurs. In either case 
operation continues in SYNC state.
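
The three cases above can be sketched as a small state machine. This is
only an illustration of the description in this message, not the ntpd
source: the class, function names and action labels are hypothetical,
the frequency estimate on leaving FREQ state is a placeholder
(assumption: measured offset divided by elapsed time), and the PLL
itself is not modeled.

```python
from dataclasses import dataclass

STEP_THRESHOLD = 0.128  # s; the 128 ms limit from the text
FREQ_INTERVAL = 900.0   # s; time spent in FREQ state, from the text


@dataclass
class Discipline:
    state: str            # "NSET", "FREQ" or "SYNC"
    freq: float = 0.0     # frequency correction in s/s
    offset: float = 0.0   # offset being slewed out, in s
    entered: float = 0.0  # time FREQ state was entered, in s


def start(file_freq=None):
    """Case 1 starts in SYNC with the file frequency; case 2 in NSET."""
    if file_freq is None:
        return Discipline(state="NSET")
    return Discipline(state="SYNC", freq=file_freq)  # offset stays zero


def update(d, now, offset):
    """Apply one offset sample and return the action taken (hypothetical labels)."""
    if d.state == "NSET":
        if abs(offset) >= STEP_THRESHOLD:
            return "wait"  # the text does not cover this case
        d.state, d.offset, d.entered = "FREQ", offset, now
        return "init-offset"
    if d.state == "FREQ":
        elapsed = now - d.entered
        if elapsed < FREQ_INTERVAL:
            return "hold"  # offset is not disciplined during FREQ
        # Placeholder estimate (assumption): residual offset over elapsed time.
        d.freq = offset / elapsed
        d.state = "SYNC"
        return "set-freq"  # frequency only; d.offset is left untouched
    # SYNC: step on a large offset, otherwise hand the sample to the PLL.
    if abs(offset) > STEP_THRESHOLD:
        d.offset = 0.0
        return "step"
    d.offset = offset  # PLL internals are not modeled here
    return "pll"
```

In this sketch no branch assigns both d.freq and d.offset in the same
update, which is the property stated in the note that follows.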

Note there never is a case where the frequency is set directly and the 
offset is disciplined at the same time.

There are two constraints in these operations. First, the offset is not 
disciplined after entering FREQ state because that would be a hideous 
complication involving very delicate residuals. Second, the FREQ 
interval must be long enough so that the frequency can be determined 
within 1 PPM with an offset uncertainty of 1 ms.
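
As a rough check on the second constraint, the arithmetic can be written
out. This assumes the simplest possible model, which is not stated in
this message: the frequency uncertainty is the offset uncertainty
divided by the measurement interval. The function names are
hypothetical.

```python
def freq_uncertainty_ppm(offset_uncertainty_s, interval_s):
    """Frequency uncertainty in PPM under the simple ratio model (assumption)."""
    return offset_uncertainty_s / interval_s * 1e6


def min_interval_s(offset_uncertainty_s, target_ppm):
    """Shortest interval that reaches target_ppm under the same model."""
    return offset_uncertainty_s / (target_ppm * 1e-6)


# 1 ms of offset uncertainty over the 900 s FREQ interval
print(freq_uncertainty_ppm(0.001, 900.0))

# interval needed for 1 PPM with 1 ms of offset uncertainty
print(min_interval_s(0.001, 1.0))
```

Under that model 1 ms over 900 s comes to roughly 1.1 PPM, and about
1000 s would be needed to reach 1 PPM, so the 900 s interval is of the
right order.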

The same should hold true whether or not a (correct) kernel is in use.

With this understanding in mind, can you state precisely your plan?

Dave

Brian Utterback wrote:

>
>
> David L. Mills wrote:
>
>> Brian,
>>
>> I might not be stating the obvious strongly enough. No matter what 
>> changes are made to the initial protocol it WILL NOT WORK until the 
>> basic underlying problem of the Solaris loop is fixed. The frequency 
>> gain is broken. It makes no sense to patch around it. It has to be 
>> fixed.
>>
>> Dave
>
>
> More talking past each other. I know that. I stated several
> messages back that this has nothing to do with that. You
> have stated many times that you welcome serious discussion
> about improvements.
>
> The setting of the frequency without setting the offset in
> the FREQ state may be a bug, since they both used to get
> set together. In which case, there is nothing to discuss, it
> just needs to get fixed. Otherwise, it is something to
> consider. It is not a radical change.
>
> Likewise, setting the frequency simultaneously with the first
> offset when there is a drift file is likewise not a radical
> change, but is a departure from current behavior.
>
> I think that there is no arguing that the initial offset at
> startup should not be a component of the frequency calculations.
> Starting ntpd with no offset and a drift file leaves the frequency
> at the correct value. Starting it with an offset greater than
> the step threshold likewise results in the correct value because
> the step does not recalculate the frequency and we are back to
> the zero offset case after the step. The initial offset value
> has nothing to do with the PLL, it has to do with how long
> the clock was undisciplined.
>
> However, any offset in between zero and the step threshold results
> in a recalculation and perturbs the frequency which must then
> slowly be brought back in line over time. My proposal eliminates
> this problem. Please, if I am in error, help me to find the
> error of my ways. But I believe that this change results
> in performance as good or better than the current code
> in all use cases.
>
> It is fine to argue LAN vs. Flaky WAN vs. Solar System when
> discussing algorithm changes, but that is not the case here.
> If it is an easy change, and always performs better or as
> good, then why would we not want to make the change?
>


