[ntp:hackers] ntp p110 and setting frequency and offset still off.
David L. Mills
mills at udel.edu
Fri Feb 15 18:21:30 UTC 2008
Brian,
I do welcome serious discussion, or I would not still be prolonging it.
However, I am having a hard time with your proposal because I don't
understand the context.
If I read you correctly, the kernel is not an issue; it is the initial
treatment of offset and frequency. Here is the context.
1. If there is a frequency file and the initial offset is less than 128
ms, the PLL operates normally to discipline the offset and frequency. At
startup only the frequency is initialized from the file and the offset
is initialized to zero, since there is no offset available at this
time. Only when the distance is less than the threshold (usually four
packets) does the discipline adjust the offset and frequency.
2. If there is no frequency file, the discipline starts in NSET state.
At the first offset less than 128 ms the discipline enters FREQ state
and initializes the offset. During FREQ state the offset is not further
disciplined. It just exponentially slews to the initial offset. The
problem is to separate the intrinsic frequency error from the offset
error. Doing both at the same time is really hard. Since the FREQ state
occurs only when started for the first time, I concluded that the
additional complexity was not worth the hazard.
3. When leaving FREQ state after 900 s the frequency is initialized
directly and the PLL takes over. However, the first offset when leaving
FREQ state disciplines only the frequency, not the offset. The normal
PLL takes over at the next update. It could be at this time the offset
is greater than 128 ms, in which case a step occurs. In either case
operation continues in SYNC state.
Note there never is a case where the frequency is set directly and the
offset is disciplined at the same time.
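The three cases above can be sketched as a small state machine. This is only an illustration of the description in this message, not the ntpd source: the names Discipline, start and update are hypothetical, the frequency estimate (offset change divided by elapsed time) is a simplifying assumption, and the behavior in NSET for an offset of 128 ms or more is not covered here and is left unspecified.

```python
from dataclasses import dataclass
from enum import Enum, auto

STEP_THRESHOLD_S = 0.128  # 128 ms, from the text
FREQ_INTERVAL_S = 900.0   # 900 s, from the text


class State(Enum):
    NSET = auto()
    FREQ = auto()
    SYNC = auto()


@dataclass
class Discipline:  # hypothetical container, not an ntpd structure
    state: State
    freq_ppm: float = 0.0
    start_offset_s: float = 0.0
    start_time_s: float = 0.0


def start(drift_file_ppm=None):
    """Case 1: a frequency file sets only the frequency; the PLL then runs normally."""
    if drift_file_ppm is not None:
        return Discipline(State.SYNC, freq_ppm=drift_file_ppm)
    return Discipline(State.NSET)  # case 2: no frequency file


def update(d, now_s, offset_s):
    """Apply one offset sample and return a label for the action taken."""
    if d.state is State.NSET:
        if abs(offset_s) < STEP_THRESHOLD_S:
            d.state = State.FREQ
            d.start_offset_s = offset_s
            d.start_time_s = now_s
            return "set offset"
        return "unspecified"  # not described in the message
    if d.state is State.FREQ:
        elapsed = now_s - d.start_time_s
        if elapsed < FREQ_INTERVAL_S:
            return "hold"  # the offset is not further disciplined in FREQ
        # Assumption: frequency taken as offset change over elapsed time.
        d.freq_ppm = (offset_s - d.start_offset_s) / elapsed * 1e6
        d.state = State.SYNC
        return "set frequency"  # case 3: frequency only, offset untouched
    # SYNC: the normal PLL, or a step if the offset exceeds the threshold.
    return "step" if abs(offset_s) > STEP_THRESHOLD_S else "pll"
```

In this sketch no single call to update both sets the frequency directly and disciplines the offset, which is the property noted above.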
There are two constraints in these operations. First, the offset is not
disciplined after entering FREQ state because that would be a hideous
complication involving very delicate residuals. Second, the FREQ
interval must be long enough so that the frequency can be determined
within 1 PPM with an offset uncertainty of 1 ms.
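As a rough check of the second constraint, assume the frequency is estimated as an offset difference divided by the measurement interval, so that an offset uncertainty maps to a frequency uncertainty of roughly uncertainty / interval. The function names below are hypothetical and the model is a simplification, not the ntpd calculation.

```python
def freq_resolution_ppm(offset_uncertainty_s, interval_s):
    """Frequency uncertainty in PPM for a given offset uncertainty and interval."""
    return offset_uncertainty_s / interval_s * 1e6


def min_interval_s(offset_uncertainty_s, target_ppm):
    """Interval needed to reach a target frequency uncertainty in PPM."""
    return offset_uncertainty_s / (target_ppm * 1e-6)


# Values from the text: 1 ms offset uncertainty, 900 s FREQ interval.
print(round(freq_resolution_ppm(0.001, 900.0), 2))  # about 1.11 PPM
print(round(min_interval_s(0.001, 1.0)))            # about 1000 s
```

Under this simple model the 900 s interval gives about 1.1 PPM, the same order as the 1 PPM figure, and a much shorter interval would not reach it.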
The same should hold true whether or not a (correct) kernel is in use.
With this understanding in mind, can you state precisely your plan?
Dave
Brian Utterback wrote:
>
>
> David L. Mills wrote:
>
>> Brian,
>>
>> I might not be stating the obvious strongly enough. No matter what
>> changes are made to the initial protocol it WILL NOT WORK until the
>> basic underlying problem of the Solaris loop is fixed. The frequency
>> gain is broken. It makes no sense to patch around it. It has to be
>> fixed.
>>
>> Dave
>
>
> More talking past each other. I know that. I stated several
> messages back that this has nothing to do with that. You
> have stated many times that you welcome serious discussion
> about improvements.
>
> The setting of the frequency without setting the offset in
> the FREQ state may be a bug, since they both used to get
> set together. In which case, there is nothing to discuss, it
> just needs to get fixed. Otherwise, it is something to
> consider. It is not a radical change.
> Likewise, setting the frequency simultaneously with the first
> offset when there is a drift file is not a radical
> change, but is a departure from current behavior.
>
> I think that there is no arguing that the initial offset at
> startup should not be a component of the frequency calculations.
> Starting ntpd with no offset and a drift file leaves the frequency
> at the correct value. Starting it with an offset greater than
> the step threshold likewise results in the correct value because
> the step does not recalculate the frequency and we are back to
> the zero offset case after the step. The initial offset value
> has nothing to do with the PLL, it has to do with how long
> the clock was undisciplined.
>
> However, any offset in between zero and the step threshold results
> in a recalculation and perturbs the frequency which must then
> slowly be brought back in line over time. My proposal eliminates
> this problem. Please, if I am in error, help me to find the
> error of my ways. But I believe that this change results
> in performance as good or better than the current code
> in all use cases.
>
> It is fine to argue LAN vs. Flaky WAN vs. Solar System when
> discussing algorithm changes, but that is not the case here.
> If it is an easy change, and always performs better or as
> good, then why would we not want to make the change?
>