[ntp:questions] Re: Advice on sync clock between cluster of linux v2.6 to +-1us

Richard B. Gilbert rgilbert88 at comcast.net
Wed Jan 12 06:00:19 UTC 2005


Nick wrote:

> Hi
>
> I've spent the morning doing some research on synchronising a cluster 
> of computers running linux. I would appreciate any advice on this 
> matter and whether it is feasible.
>
> Is it possible to synchronise a group of computers interconnected by a 
> fast ethernet LAN and be able to keep their clocks accurate to +- ~1us 
> of each other using available ntp software? Can the 2.6 kernel handle 
> this high precision?
>
> I don't require the computers to have the correct actual time, only 
> the time between them needs to be accurate.
>
> On the information I have found, the drift of the clock on a modern 
> i386 motherboard would not provide this accuracy so I would expect I 
> would need some external clock source connected to one of the 
> computers with ntp software or equivalent distributing to the others. 
> I have seen references to TCXO ICs but have not found any standalone products.
>
> I don't believe I need (the expense of) a GPS/Radio receiver since I 
> don't require the actual time, but a lot of these receivers have a 
> TCXO (or alternative) module to handle failover due to signal failure.
>
> Am I being unrealistic synchronising a cluster of standard PCs to a 
> 1us precision?
>
> Let me have your thoughts.
>
> Many Thanks
> Nick

Synchronization to within 1 microsecond may be achievable but only with 
a very stable time source and a great deal of effort.  The typical 
computer clock is not stable enough to synchronize much of anything; you 
would be in a position analogous to trying to shoot at a randomly moving 
target while standing on a randomly moving platform.

You can use an external reference to discipline the computer clock to 
the point where it becomes stable enough to synchronize a whole herd of 
other machines.  That reference might be a cesium frequency standard, a 
rubidium frequency standard, a GPS receiver, or a high quality quartz 
crystal oscillator, either oven controlled (OCXO) or temperature 
compensated (TCXO).  Cesium and rubidium standards are expensive to buy 
and expensive to maintain.  Cesium is the best there is.  Rubidium is 
second best.  GPS is ultimately referenced to cesium clocks and is very 
good indeed.  Any of the foregoing can provide both the stability and 
the accuracy needed.  The very best quartz, OCXO or TCXO, provides good 
but not great stability.  Quartz oscillators can vary a great deal in 
quality, even within the same make and model.
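
To put rough numbers on why oscillator stability matters at this scale, 
here is a minimal Python sketch.  The frequency errors in it are 
illustrative assumptions, not measurements of any particular part, and 
the function name is made up for the example.  It computes how long a 
free-running clock with a constant fractional frequency error takes to 
accumulate 1 microsecond of time error.

```python
def seconds_to_drift(budget_s, freq_error_ppm):
    """Seconds until a clock with a constant frequency error
    (in parts per million) accumulates budget_s of time error."""
    return budget_s / (freq_error_ppm * 1e-6)

# Assumed, order-of-magnitude frequency errors (hypothetical values).
ASSUMED_PPM = {
    "uncompensated PC crystal": 50.0,
    "TCXO": 1.0,
    "OCXO": 0.01,
}

for name, ppm in ASSUMED_PPM.items():
    t = seconds_to_drift(1e-6, ppm)
    print(f"{name}: 1 us of drift after about {t:.3g} s")
```

Under these assumptions an undisciplined 50 ppm clock is off by a 
microsecond after about 20 milliseconds, so the clock has to be 
corrected continuously rather than set once.  A constant error can be 
measured and removed; it is the wander around that constant that limits 
how well the correction works.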

If you can site an antenna with an unobstructed view of the entire sky, 
GPS is probably the best reference!  Quality tends to be good and the 
cost is not excessive.  $300 -- $500 US would be a rough estimate for a 
minimal GPS setup.

Your O/S can be a problem.  Some O/Ss, Linux and Windows among them, 
have a reputation for losing clock interrupts (ticks) when they get 
busy.  If yours loses clock ticks, forget about synchronizing within one 
microsecond.    If your O/S is well behaved in this respect you might be 
able to get within a microsecond.

The round trip packet delay bounds the accuracy you can achieve; 
assuming that the server is correct, the error in transmitting the time 
cannot be worse than the round trip delay (strictly, half of it for the 
usual NTP offset calculation).  The error is probably a great deal less 
than that but there's no way to be sure!  A 100 Mb/s full duplex 
switched LAN with a cheap switch (Linksys - my home network) can have 
round trip delays of 400 to 500 microseconds as reported by ntpq.  I'm 
talking about a maximum of twenty or thirty feet of cable and a switch 
between two Sun Ultra 10 workstations.  Spending fifty times as much 
money on the switch (high performance Cisco instead of Linksys) might 
cut this by a factor of ten or more (or it might not; I don't have 
$2000-2500 to spend on a Cisco switch).
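
The way the round trip delay bounds the error can be sketched with the 
standard NTP on-wire calculation from four timestamps.  This is a 
minimal illustration, not ntpd's code; the function name and the 
timestamps (in microseconds) are made up for the example.  t1 is the 
client's send time, t2 the server's receive time, t3 the server's send 
time, and t4 the client's receive time.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP estimate of clock offset and round trip delay.

    The offset estimate is exact only if the outbound and return
    paths take equal time; otherwise it is wrong by at most delay / 2.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay

# Hypothetical exchange: both clocks actually agree (true offset 0),
# the server takes 50 us to reply, and the round trip is 450 us.

# Symmetric path: 225 us out, 225 us back.
print(ntp_offset_and_delay(0, 225, 275, 500))

# Asymmetric path: 400 us out, 50 us back.  Same round trip delay,
# but the offset estimate is now wrong by 175 us.
print(ntp_offset_and_delay(0, 400, 450, 500))
```

In both cases the measured delay is 450 microseconds, and the client 
cannot tell the two cases apart.  All it knows is that the offset error 
is no worse than half the delay, which with a 400 to 500 microsecond 
round trip is a long way from 1 microsecond.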

Network load can affect the quality of your timing.   A busy network 
will have longer delays and the delays will most likely be random! A 
faster network should have lower delays but will cost more!
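
As a rough illustration of both effects, the sketch below computes how 
long a frame takes just to be clocked onto the wire.  The frame sizes 
are assumptions: 94 bytes for an NTP packet (48 bytes of NTP plus UDP, 
IPv4 and Ethernet headers and the frame check sequence) and 1518 bytes 
for a full-size Ethernet frame queued ahead of it.  The function name 
is made up for the example, and preamble and inter-frame gap are 
ignored.

```python
def serialization_us(frame_bytes, link_bps):
    """Microseconds needed to clock frame_bytes onto a link of link_bps."""
    return frame_bytes * 8 / link_bps * 1e6

NTP_FRAME = 94     # assumed: 48 NTP + 8 UDP + 20 IPv4 + 18 Ethernet
FULL_FRAME = 1518  # assumed: maximum standard Ethernet frame

for name, bps in (("100 Mb/s", 100e6), ("1 Gb/s", 1e9)):
    print(f"{name}: NTP frame {serialization_us(NTP_FRAME, bps):.2f} us, "
          f"wait behind one full frame {serialization_us(FULL_FRAME, bps):.2f} us")
```

On a 100 Mb/s link, waiting behind a single full-size frame costs about 
121 microseconds, and whether that happens to any given NTP packet 
depends on whatever else the network is doing at that instant.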

Unless you are prepared to spend a great deal of money and do a great 
deal of work, I think you should reduce your expectations to something 
in the 10  microsecond to 1 millisecond range.  It's hard to say what's 
possible in your environment without actually trying it but I'd hate to 
promise anyone that I could synchronize a network to within 1 
microsecond without actually having done it with the hardware and 
software in question.  I'd feel a LOT better about promising 1 
millisecond!!!!!!!


