[ntp:questions] Re: Advice on sync clock between cluster of linux v2.6 to +-1us
Richard B. Gilbert
rgilbert88 at comcast.net
Wed Jan 12 06:00:19 UTC 2005
> I've spent the morning doing some research on synchronising a cluster
> of computers running linux. I would appreciate any advice on this
> matter and whether it is feasible.
> Is it possible to synchronise a group of computers interconnected by a
> fast ethernet LAN and be able to keep their clocks accurate to +- ~1us
> of each other using available ntp software? Can the 2.6 kernel handle
> this high precision?
> I don't require the computers to have the correct actual time, only
> the time between them needs to be accurate.
> On the information I have found, the drift of the clock on a modern
> i386 motherboard would not provide this accuracy so I would expect I
> would need some external clock source connected to one of the
> computers with ntp software or equivalent distributing to the others.
> I have seen references to TCXO ICs but have not found any standalone products.
> I don't believe I need (the expense of) a GPS/Radio receiver since I
> don't require the actual time, but a lot of these receivers have a
> TCXO (or alternative) module to handle failover due to signal failure.
> Am I being unrealistic synchronising a cluster of standard PCs to a
> 1us precision?
> Let me have your thoughts.
> Many Thanks
Synchronization to within 1 microsecond may be achievable but only with
a very stable time source and a great deal of effort. The typical
computer clock is not stable enough to synchronize much of anything; you
would be in a position analogous to trying to shoot at a randomly moving
target while standing on a randomly moving platform.
You can use an external reference to discipline the computer clock to
the point where it becomes stable enough to synchronize a whole herd of
other machines. That reference might be a cesium frequency standard, a
rubidium frequency standard, a GPS receiver or a high quality quartz
crystal oscillator (oven controlled, OCXO, or temperature compensated,
TCXO). Cesium and rubidium standards are expensive to buy and expensive
to maintain. Cesium is the best there is. Rubidium is second best.
GPS is ultimately referenced to cesium clocks and is very good indeed.
Any of the foregoing can provide both the stability and the accuracy
needed. The very best quartz, OCXO or TCXO, provides good but not
great stability. Quartz oscillators can vary a great deal in quality,
even within the same make and model.
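To put numbers on why oscillator stability matters at this level, here is a rough sketch of how quickly a residual frequency error eats a 1 microsecond budget. The ppm figures are round numbers chosen for illustration, not measurements of any particular oscillator, and the function names are made up for this example. The 64 second interval is ntpd's default minimum poll interval.

```python
def accumulated_error_us(residual_ppm, seconds):
    """Time error (microseconds) built up by an uncorrected frequency error.

    1 ppm of frequency error accumulates 1 microsecond per second.
    """
    return residual_ppm * seconds


def seconds_until_error(residual_ppm, budget_us=1.0):
    """How long until the accumulated error reaches budget_us."""
    return budget_us / residual_ppm


POLL_S = 64  # ntpd default minimum poll interval, in seconds

# Illustrative residual frequency errors in ppm (assumed values).
for ppm in (10.0, 1.0, 0.01):
    print(
        f"{ppm:>6} ppm: {accumulated_error_us(ppm, POLL_S):8.2f} us per poll, "
        f"1 us reached after {seconds_until_error(ppm):8.2f} s"
    )
```

The point is only the scale: whatever frequency error the discipline loop has not removed must stay in the neighbourhood of 0.01 ppm between corrections if the clock is to hold 1 microsecond over a single poll interval.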
If you can site an antenna with an unobstructed view of the entire sky,
GPS is probably the best reference! Quality tends to be good and the
cost is not excessive. $300 -- $500 US would be a rough estimate for a
minimal GPS setup.
Your O/S can be a problem. Some O/Ss, Linux and Windows among them,
have a reputation for losing clock interrupts (ticks) when they get
busy. If yours loses clock ticks, forget about synchronizing within one
microsecond. If your O/S is well behaved in this respect you might be
able to get within a microsecond.
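A quick sketch of the arithmetic behind that warning, assuming a simple tick-driven clock in which each timer interrupt advances the time by 1/HZ seconds. The HZ values below are assumptions for illustration; check what your own kernel is built with.

```python
def tick_period_us(hz):
    """Length of one clock tick in microseconds for a given timer frequency."""
    return 1_000_000 / hz


def error_from_lost_ticks_us(hz, lost_ticks):
    """Time the clock falls behind if lost_ticks interrupts are never counted."""
    return lost_ticks * tick_period_us(hz)


# Assumed timer frequencies; a single lost tick at each.
for hz in (100, 1000):
    print(f"HZ={hz}: one lost tick = {error_from_lost_ticks_us(hz, 1):.0f} us")
```

Under that assumption even a single lost tick is a step of hundreds to thousands of microseconds, which is why lost ticks and a 1 microsecond target do not mix.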
The round trip packet delay bounds the accuracy you can achieve;
assuming that the server is correct, the error in transmitting the time
cannot be worse than the round trip delay. The error is probably a
great deal less than that but there's no way to be sure! A 100 Mb/s full
duplex switched LAN with a cheap switch (Linksys - my home network) can
have round trip delays of 400 to 500 microseconds as reported by ntpq.
I'm talking about a maximum of twenty or thirty feet of cable and a
switch between two Sun Ultra 10 workstations. Spending fifty times as
much money on the switch (high performance Cisco instead of Linksys)
might cut this by a factor of ten or more (or it might not; I don't
have $2000-2500 to spend on a Cisco Switch).
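For reference, the standard NTP on-wire calculation shows where that bound comes from. The client records when it sent the request (t1) and received the reply (t4); the server records when it received the request (t2) and sent the reply (t3). The computed offset is exact only if the outbound and return paths take equal time, and when they do not, the error is at most half the round trip delay. The timestamps below are invented for illustration (in microseconds), and the function name is made up for this sketch.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP offset/delay estimate from the four packet timestamps.

    t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive.
    Returns (offset, delay, max_error), all in the units of the inputs.
    """
    delay = (t4 - t1) - (t3 - t2)            # time spent on the wire, both ways
    offset = ((t2 - t1) + (t3 - t4)) / 2.0   # assumes symmetric path delays
    max_error = delay / 2.0                  # worst case: all delay in one direction
    return offset, delay, max_error


# Invented example: the two clocks actually agree (true offset 0), but the
# request takes 400 us, the server holds the packet 10 us, the reply takes 50 us.
offset, delay, max_error = ntp_offset_and_delay(t1=0, t2=400, t3=410, t4=460)
print(f"delay={delay} us, estimated offset={offset} us, bound=+/-{max_error} us")
```

In this made-up case the asymmetry alone produces an apparent offset of well over 100 microseconds between two clocks that agree perfectly, so with round trips of 400 to 500 microseconds a 1 microsecond result depends entirely on the paths being very nearly symmetric and repeatable.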
Network load can affect the quality of your timing. A busy network
will have longer delays and the delays will most likely be random! A
faster network should have lower delays but will cost more!
Unless you are prepared to spend a great deal of money and do a great
deal of work, I think you should reduce your expectations to something
in the 10 microsecond to 1 millisecond range. It's hard to say what's
possible in your environment without actually trying it but I'd hate to
promise anyone that I could synchronize a network to within 1
microsecond without actually having done it with the hardware and
software in question. I'd feel a LOT better about promising 1