[ntp:questions] Sub-millisecond NTP synchronization for local network

Unruh unruh-spam at physics.ubc.ca
Thu Dec 4 19:13:20 UTC 2008


You do not tell us what operating system your robots run. ntpd is not
designed for rapid convergence. It was designed for long-term stability
and conceptual simplicity. It can take up to 10 hours to compensate for a
change in drift rate.
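
To put rough numbers on why a slow response to a drift change matters: a
frequency error of 1 ppm is 1 microsecond of offset per second. The sketch
below is only arithmetic. The 1 ppm figure is an assumed example value, not
a measurement from your machines, and the function names are made up for
illustration.

```python
# Back-of-the-envelope drift arithmetic (illustrative values only).

PPM = 1e-6  # 1 part per million = 1 microsecond of error per second


def seconds_until_offset(limit_s, freq_error_ppm):
    """Time for an uncorrected frequency error to accumulate limit_s of offset."""
    return limit_s / (abs(freq_error_ppm) * PPM)


def accumulated_offset(freq_error_ppm, elapsed_s):
    """Offset in seconds built up by a constant frequency error over elapsed_s."""
    return freq_error_ppm * PPM * elapsed_s


if __name__ == "__main__":
    # Assumed example: the drift rate shifts by 1 ppm and is left uncorrected.
    print(seconds_until_offset(1e-3, 1.0))     # time to use up a 1 ms budget
    print(accumulated_offset(1.0, 10 * 3600))  # offset after 10 hours
```

With that assumed 1 ppm, a 1 ms budget is used up in roughly 1000 seconds,
and ten hours left entirely uncorrected would amount to about 36 ms.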

If you are running Linux and you have the adjtimex system call on your
system, then chrony might well be a far better option for you. Its
convergence is far faster, and its discipline of the local clock is about a
factor of 2-3 better than ntpd's (although your requirement of 1 ms is not
very stringent, so that difference would not matter here).

If you are not running Linux, or you do not have the adjtimex system call on
your Linux system, then chrony will not work.


One question: when your network becomes swamped, is the delay primarily
one way (e.g. outgoing packets are delayed while incoming ones are not)?
That is almost impossible for any clock algorithm to handle well.
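
The reason shows up in the standard NTP on-wire calculation, which has to
assume the outgoing and return delays are equal. The sketch below simulates
one client/server exchange and applies the usual offset and delay formulas.
The function names and the delay values are assumptions chosen for
illustration (loosely in the 100 microsecond to 1 ms range you quote), not
measurements.

```python
# Effect of one-way (asymmetric) delay on the NTP offset estimate.

def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP estimate from the four timestamps of one exchange.

    t1: client send time (client clock)   t2: server receive time (server clock)
    t3: server send time (server clock)   t4: client receive time (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def simulate_exchange(true_offset, delay_out, delay_back, t1=0.0, server_hold=0.0):
    """Build the four timestamps for a server clock ahead by true_offset."""
    t2 = t1 + delay_out + true_offset      # request arrives, read on server clock
    t3 = t2 + server_hold                  # server replies
    t4 = (t3 - true_offset) + delay_back   # reply arrives, read on client clock
    return t1, t2, t3, t4


if __name__ == "__main__":
    # Clocks actually agree (true offset 0) in both cases.
    symmetric = ntp_offset_and_delay(*simulate_exchange(0.0, 0.0005, 0.0005))
    one_sided = ntp_offset_and_delay(*simulate_exchange(0.0, 0.001, 0.0001))
    print("symmetric path : offset=%.6f delay=%.6f" % symmetric)
    print("congested one way: offset=%.6f delay=%.6f" % one_sided)
```

The estimated offset is wrong by half the difference between the two
one-way delays, and nothing in the four timestamps reveals which direction
was slow. All a client can do is distrust samples with an unusually large
round-trip delay.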


leibs at willowgarage.com (Jeremy Leibs) writes:

>Willow Garage is designing a robotic research platform and completely
>open-source robotic software framework.  We are attempting to use NTP to
>handle the task of maintaining synchronization of the clocks within our
>system.  Unfortunately, we are having an extremely difficult time finding an
>appropriate configuration.  We are looking for someone to help us figure out
>the correct NTP configuration for our use case, or determine if NTP is even
>capable of doing what we want.

>Our configuration is 4 machines connected on a local gigabit network located
>on a mobile robotic base.  These machines are subject to frequently being
>powered down or restarted.  In order to use the robot, the clocks on these
>machines must be self-synchronized to less than 1 millisecond.  Ping times
>between machines on this local network vary between 100 microseconds, and
>1ms depending on saturation of the network by sensor data streams.

>The 4 machines are connected to the rest of the world through a wireless
>link.  The delay time on the wireless link is much more variable: in the
>range of 2ms to 300ms depending on the quality of the link and the amount of
>data going over the wire.  We don't care nearly as much about
>synchronization between the robot and the outside world, though it would be
>nice to avoid unbounded drift.  A synchronization in the range of 10's of ms
>would be acceptable.

>Our present configuration is made up of 1 machine syncing to an external
>server over the wireless link and acting as a local server for the robot.
>The remaining 3 machines then sync to this local server.

>Operating under "stable" conditions, this configuration seems to work well
>and eventually converges to our sub-millisecond criteria.  However, we have
>2 large problems.

>1) When the operating conditions suddenly change, the system diverges
>dramatically, and sometimes becomes unstable/divergent.  In particular, a
>pathological case we have seen is when the wireless link is near saturation
>for an extended period of time such as when copying over multi-gigabyte log
>files over the course of several hours.  Once the transfer completes and the
>wireless link opens up again, the delay time across the wireless link
>plummets, the local server immediately diverges from the external server by
>around 30 ms.  After this initial divergence, the local server stops
>qualifying as a good source of time, and the remaining 3 machines start
>drifting apart in independent directions.

>2) When the system is in a non-converged state, such as after diverging in
>case 1, or on boot, the time it takes for the system to converge is
>unacceptably long.  If I disable NTP, and run ntpdate on each of the client
>machines, I can synchronize them to within 1 ms, but as soon as I start NTP
>again, all of the clocks begin to diverge, often taking hours to re-converge
>back to steady state.

>We are looking for a way of configuring the system to be robust to sudden
>changes in otherwise stable network latency, and additionally looking for a
>way to get the local system to converge to sub-ms offsets on the order of
>minutes instead of hours.

>Does anyone have suggestions for best practices in configuring an NTP
>network for these conditions?

>Thanks,
>--Jeremy Leibs



