[ntp:questions] Idea to improve ntpd accuracy

Weber wsdl at osengr.org
Sun Feb 28 03:07:56 UTC 2016


Holy cow!

I thought of that too but rejected it as way too machine-specific. 
Because the clock PLL does not track frequency ramps, feed-forward could 
be very effective (e.g. an order of magnitude). To make this work, however, 
you first need to discover which of the many potential temperature 
sensors on the MB is most closely correlated with frequency, then 
measure a rough linear gain from temperature to PPM. Seemed too 
hardware-instance-specific to me, but perhaps there's a way...
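
A rough sketch of that calibration step, just to make it concrete. It
assumes you have already logged each sensor's temperature alongside the
frequency offset in PPM at the same instants. The sensor names, the sample
numbers, and the helper names (pearson, fit_gain, best_sensor) are all made
up for illustration; nothing here is part of ntpd.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series.

    Raises ZeroDivisionError if either series is constant.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_gain(temps, ppm):
    """Least-squares fit of ppm ~= gain * temp + offset."""
    n = len(temps)
    mt, mp = sum(temps) / n, sum(ppm) / n
    sxy = sum((t - mt) * (p - mp) for t, p in zip(temps, ppm))
    sxx = sum((t - mt) ** 2 for t in temps)
    gain = sxy / sxx
    return gain, mp - gain * mt


def best_sensor(sensors, ppm):
    """Name of the sensor whose readings correlate most strongly with ppm."""
    return max(sensors, key=lambda name: abs(pearson(sensors[name], ppm)))


# Hypothetical log: degrees C per sensor, and ppm at the same sample times.
sensors = {
    "cpu": [40.0, 45.0, 50.0, 55.0, 60.0],
    "board": [30.0, 31.0, 33.0, 34.0, 36.0],
    "disk": [35.0, 33.0, 36.0, 34.0, 35.0],
}
ppm = [10.0, 10.3, 10.9, 11.2, 11.8]

name = best_sensor(sensors, ppm)
gain, offset = fit_gain(sensors[name], ppm)


def feed_forward_ppm(temp):
    """Predicted frequency offset for a new reading of the chosen sensor."""
    return gain * temp + offset
```

The gain and offset you get are only good for the one machine they were
measured on, which is exactly the instance-specific part that bothers me.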


On 2/27/2016 5:23 AM, Charles Elliott wrote:
> Here are two other ideas you might want to consider to improve accuracy:
> Feed forward (PID) control on system temperature and significant changes in
> system load.  Let me describe the situation.  Right now I have only 3
> computers on a home LAN.  Two, one of which is my main system for email and
> dictating research notes, process BOINC/Einstein at Home work units and one
> serves only as an external-facing NTP client that distributes time to the
> other two.  Due to environmental pressures and the need to conserve energy,
> the two computers processing Einstein at Home WUs stop that at midnight and
> resume at 5 AM automatically.  Ambient temperature affects all 3 computers
> equally, though its effect is most noticeable on the external-facing NTP
> client.
>
> When the two computers shed the load at midnight, the offsets immediately
> decrease to between -1 and -2 ms; then, both machines spend all night
> recovering to zero offset at about 5 AM, whereupon the offsets jump up to
> between +0.5 and +1 ms when the load is resumed.  At about 8 AM both
> machines are back to zero offset.
>
> All during the night, the ambient temperature is falling because
> BOINC/Einstein at Home consumes between 450 and 500 watts of GPU and CPU power
> on each machine on which it runs.  Consequently, the frequency offset (in
> ppm) declines all night, reaching a minimum at 5 AM, and then increases
> steadily on all three machines until it reaches a steady state at about 10
> AM.  It hovers at a fairly steady state until midnight, whereupon the fall
> and rise pattern is repeated.
>
> PID control is proportional + integral + derivative.  Normally, a control
> signal is proportional to an error signal, such as offset error, to the
> sum of (offset) errors (integral), and to the change in (offset) error
> (derivative, here jitter). However, for temperature, there is no
> concept of error; it is what it is.  For control as a function of
> temperature (T), there clearly is a direct functional relationship between
> ppm and T, such as ppm = Kp T.  Jitter lags both temperature and system load
> changes by about 2 hours, so I am not sure derivative control is wise or
> even possible.  Integral control is what yanks the elevator up to floor
> level when the brakes are slipping.  It can be hard to get right because of
> the yo-yo effect.  Obviously, more thought has to be put into all this.  But
> I have been watching these relationships for several weeks with NTP Plotter,
> and they are obvious and fairly consistent.
>
> What started me off on this is that CPUID (www.cpuid.com), makers of CPU-z,
> HardwareMonitor, and HardwareMonitorPro, the latter two of which do an
> excellent job of monitoring all manner of system and peripheral
> temperatures, is offering a "System Monitoring SDK" for evaluation.  The
> idea would be to record the relationships between various temperature
> measurements and ppm and then try to vary the PPM adjustment with
> temperature to see if any improvement in offsets results.  However, CPUID
> never replied to my email.
>
> Charles Elliott
>
> -----Original Message-----
> From: questions
> [mailto:questions-bounces+elliott.ch=comcast.net at lists.ntp.org] On Behalf Of
> Weber
> Sent: Thursday, February 25, 2016 4:52 PM
> To: questions at lists.ntp.org
> Subject: [ntp:questions] Idea to improve ntpd accuracy
>
> This may or may not be worthwhile, but I thought I'd throw it out there and
> see what happens:
>
> Recent work testing some microsecond-accurate NTP servers led me to an idea
> that could improve accuracy of measurements made by ntpd. These NTP servers
> have hardware timestamps on receive but that's not possible on transmit w/o
> a custom NIC. I've seen this issue discussed before.
>
> The next best thing is to generate the transmit timestamp based on a guess
> as to how long it takes the NIC to get on the wire and send the packet. That
> works pretty well as long as there's no other network traffic. In this
> situation, it is possible to make use of microsecond accuracy in an NTP
> server.
>
> Now, add some typical network traffic and the time it takes the NIC to get
> on the wire becomes unpredictable to the tune of 200us or more (for
> 100 base-T Ethernet). The server's microsecond accuracy is largely lost in
> the noise.
>
> The NIC generates an interrupt after the packet is sent which can result in
> a fairly accurate trailing hardstamp. The problem is...the packet is already
> gone and has the wrong transmit timestamp.
>
> Here's my idea:
>
> What if the poll response packet contained a flag or indication of some sort
> which means "this is an approximate transmit timestamp". That packet would
> then be immediately followed by a second response packet with a more
> accurate transmit time. The second packet could be otherwise identical to
> the first, or it could be a new flavor of packet that contained only the
> transmit time (that would save on network bandwidth).
>
> The ntpd process would need to use the receive time of the first packet (the
> one with an approximate tx timestamp) and merge in the following accurate tx
> timestamp before performing the normal processing associated with a poll
> response.
>
> Here are the pros and cons I can think of:
>
> Pros
>
> * Possible accuracy improvement of 1-2 orders of magnitude. I know ntpd
> already does some work to try and filter out network delay variation so the
> improvement might not be a full 2 orders of magnitude.
> * Could potentially be made backwards compatible with ntp 3/4
> protocols
>
> Cons
>
> * Increased network traffic
> * Improvement to that level of accuracy might not be of interest to anyone
> * Could be a fair bit of work for at least a couple of folks
> * I may have (or probably have) missed some stuff regarding network behavior that
> would reduce the level of improvement that could be realized.
> * Perhaps this is less of an issue on G-bit Ethernet?
>
> Wondering if anyone thinks this idea is worth pursuing further...?
>
> _______________________________________________
> questions mailing list
> questions at lists.ntp.org
> http://lists.ntp.org/listinfo/questions
>
>
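
For what it's worth, the client side of the two-packet idea quoted above
could be sketched like this. It is only an illustration: the Response type,
the "approximate" flag, and merge_follow_up are hypothetical and do not
exist in ntpd or in the NTP packet format. The offset and delay formulas
are the usual four-timestamp ones (T1 client send, T2 server receive, T3
server send, T4 client receive). The sample times are invented and are in
seconds.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Response:
    t1: float          # client transmit time (origin timestamp)
    t2: float          # server receive time
    t3: float          # server transmit time (a guess if approximate is set)
    approximate: bool  # hypothetical "approximate transmit timestamp" flag


def merge_follow_up(first, accurate_t3):
    """Swap the guessed transmit time for the one from the follow-up packet."""
    if not first.approximate:
        raise ValueError("first response was not flagged as approximate")
    return replace(first, t3=accurate_t3, approximate=False)


def offset_and_delay(resp, t4):
    """Standard NTP clock offset and round-trip delay.

    t4 is the client receive time of the FIRST packet; the arrival time
    of the follow-up packet is never used.
    """
    offset = ((resp.t2 - resp.t1) + (resp.t3 - t4)) / 2
    delay = (t4 - resp.t1) - (resp.t3 - resp.t2)
    return offset, delay


# Invented example: server clock is 100 us ahead, one-way delay is 500 us,
# and the NIC took 200 us longer to get on the wire than the server guessed.
first = Response(t1=0.000000, t2=0.000600, t3=0.000650, approximate=True)
t4 = 0.001250

rough_offset, rough_delay = offset_and_delay(first, t4)
merged = merge_follow_up(first, accurate_t3=0.000850)
offset, delay = offset_and_delay(merged, t4)
```

In this made-up case the guessed timestamp hides the 100 us offset
entirely, while the merged timestamps recover it along with the 1 ms
round-trip delay.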

