[ntp:questions] Re: NTP seems unsuitable for this application... what do you think?

Richard B. Gilbert rgilbert88 at comcast.net
Thu Dec 2 01:15:26 UTC 2004


John Seal wrote:

> We have a networked system with the following characteristics:
>
> - Not on the internet.
> - Several dozen hosts, mostly Solaris 2.6 and Solaris 9.
> - Normally OFF, turned ON only during use for 8-10 hours at a time.
> - Hosts booted in random order as needed, sometimes not all of them.
> - Clock batteries cannot be easily replaced, so they're often dead.
> - Time (but not date) is available, ultimately from GPS, I think.
>
> The last two points bear a little elaboration.  It is very difficult 
> to replace the clock batteries in this system, so many of them are 
> dead at any given time.  Some hosts boot with approximately the 
> correct time; some with the time they were last shut down, and some at 
> the epoch.  Two of the hosts are special in that they have a 
> connection to a source of GPS data, and there is a program that reads 
> the time (but not the date) and keeps the local processor clock synced 
> to it.  The GPS connection is not any standard connection that could 
> be used directly by NTP.
>
> As part of the login process, the user is offered a date/time window 
> which can be used to accept or modify the current date and time.  So, 
> if one of the special GPS hosts comes up at the epoch, for example, 
> then GPS will set the right time but the date will be 1/1/1970.  When 
> the user logs in, the date and time will be set to whatever 
> "wristwatch time" the user enters, on both the local host and the one 
> special GPS host.  We assume the user will enter the correct date; if 
> the time is a little off, GPS will quickly correct that, and the one 
> special GPS host will now have the correct date and time.
>
> The local host where the user logged in then does a one-time sync to 
> the one special GPS host, so now the local date and time are correct 
> as well.  No additional syncing takes place; the processor clocks 
> drift from that point on.  This happens for each host that logs in.
>
> Bottom line: there is a complex boot/login dance using rdate and 
> custom programs that ensures that hosts start out synced to the one 
> special GPS host at the beginning, but then they are free to drift 
> until the system is shut down.
>
> There is another special GPS host, but its local clock is not normally 
> kept synced to GPS time.  It is for backup use, and the GPS sync 
> functionality must be manually started when required.
>
> We thought NTP might help.  It was already installed, and was fairly 
> easy to configure, but didn't really work like we expected.  I mainly 
> referred to the Sun Blueprint publication "Using NTP to Control and 
> Synchronize System Clocks - Part II: Basic NTP Administration and
> Architecture" by David Deeths and Glenn Brunette.
>
> We configured both of the special GPS hosts as servers, with their 
> local processor clocks as the reference (127.127.1.0), as peers of 
> each other, and using authentication keys.  Even though ntptrace and 
> ntpq showed pretty much what I expected, I consistently saw 3-5 
> seconds difference in their times.  The clients were configured to use 
> both servers, but to prefer the main one.  Interestingly, it didn't 
> seem to matter whether the clients were configured to use 
> authentication keys or not.
>
> I say ntpq showed "pretty much" what I expected, because sometimes it 
> showed a peer as unreachable even though it should have been reachable, or a 
> peer's clock as "insane" when it seemed reasonable.  The clients did 
> switch as servers were stopped and started, but it took a while, like 
> on the order of 5 minutes.  That was one of our concerns, that NTP 
> seemed to take a long time to do anything, and I couldn't find firm 
> answers to how it handled large initial differences, when it stepped 
> vs. slewed ("ntpdate -b" excepted), and how long a slewing correction 
> took.
>

A peer will be viewed as "insane" if its clock differs from the local 
host by more than a very few tens of milliseconds.  If only two systems 
are peered, they will declare each other insane if their clocks differ 
by very much!  There is no way to tell who is right.  Is it not 
written that a man with one watch knows what time it is but a man with 
two can never be certain?

> So, here are a few questions I have:
>
> - Why weren't the two peer servers synced closer than a few seconds?
>
If two peers differ by a few seconds, how is NTP supposed to correct the 
time?  How is NTP supposed to know what the correct time is?   The 
answer is that it can't!
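
To illustrate the "two watches" problem, here is a minimal Python sketch.  
It is NOT ntpd's actual selection algorithm, only a toy majority vote; the 
host names, the readings, and the 50 ms tolerance are all assumptions made 
up for the example.

```python
def agreeing_majority(offsets, tolerance):
    """Return the names of a strict majority of clocks that agree with each
    other within `tolerance` seconds, or None if no such majority exists.

    `offsets` maps a (hypothetical) host name to its clock reading in
    seconds, measured against some arbitrary common reference.
    """
    names = sorted(offsets, key=offsets.get)
    best = []
    start = 0
    # Slide a window over the sorted readings and remember the largest
    # group whose spread (max - min) fits inside the tolerance.
    for end in range(len(names)):
        while offsets[names[end]] - offsets[names[start]] > tolerance:
            start += 1
        if end - start + 1 > len(best):
            best = names[start:end + 1]
    # A group only counts if it is more than half of all clocks.
    return best if len(best) * 2 > len(names) else None


TOLERANCE = 0.05  # assumed: "a very few tens of milliseconds"

# Two peers a few seconds apart: neither can outvote the other.
two_peers = {"gps1": 0.0, "gps2": 4.0}
print(agreeing_majority(two_peers, TOLERANCE))

# Add a third clock that sides with gps1 and a majority appears.
three_peers = {"gps1": 0.0, "gps2": 4.0, "gps3": 0.02}
print(agreeing_majority(three_peers, TOLERANCE))
```

With two clocks a few seconds apart the function finds no majority; with a 
third clock close to one of them, that pair wins.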

> - If a server started out at the epoch, then changed to the right time 
> on the wrong date, then changed to the right date and time, how would 
> NTP react?  In other words...
>
> - Starting from a situation with the servers and clients in sync, 
> would large time changes on the server be propagated to the clients?
>
NTP cannot make large changes!  It can slew the clock at a maximum 
rate of 500 parts per million, which works out to half a millisecond 
per second, or about thirty-three minutes to correct a one-second 
error.  You don't even want to think about how long it would take 
to correct an error of thirty years!
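
For anyone who wants to check the slew arithmetic, a rough Python sketch 
follows.  It assumes a constant slew at the 500 ppm maximum and, for the 
thirty-year case, 365.25-day years; the function name is just something I 
made up for the illustration.

```python
MAX_SLEW_PPM = 500  # maximum slew rate quoted above, in parts per million


def slew_seconds(error_seconds, slew_ppm=MAX_SLEW_PPM):
    """Wall-clock seconds needed to slew away an error at a constant rate."""
    return abs(error_seconds) / (slew_ppm * 1e-6)


# A one-second error.
one_second = slew_seconds(1.0)
print(f"1 s error: {one_second:.0f} s, about {one_second / 60:.1f} minutes")

# A thirty-year error (assumed 365.25-day years, for illustration only).
SECONDS_PER_YEAR = 365.25 * 86400
thirty_years = 30 * SECONDS_PER_YEAR
years_needed = slew_seconds(thirty_years) / SECONDS_PER_YEAR
print(f"30-year error: about {years_needed:,.0f} years")
```

That comes to 2000 seconds for the one-second case and on the order of 
sixty thousand years for the thirty-year case.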

> - Why didn't it seem to make any difference whether the clients used 
> authentication keys or not?
>
> Our decision, as of this morning, is that NTP really isn't suitable to 
> a system like this that's not ON for long periods of time, not on the 
> internet, has hosts that boot with wildly different local times, and 
> lacks direct connection to a GPS.  What do you think?

I think you are right!  It sounds as if you don't really care what time 
it is and really don't need to keep those systems synchronized.

The clock batteries in those Solaris systems (I'm assuming SPARC 
architecture) are embedded in the NVRAM chip with the clock hardware.  
A replacement chip costs something like $25 from Sun if you install it 
yourself.  If you keep the systems powered on, the batteries should last 
for at least four or five years and probably longer, since no battery 
power is drawn while the system is powered up.

If you really care what time it is, leave one of the GPS systems powered 
up to act as a server.  When synched to GPS it should know to within a 
few hundred microseconds what time it is and will share its knowledge 
with anyone who boots up and asks.  If you give an NTP client a stable 
and accurate source of time, it will synch up and stay synchronized as 
long as it's powered on, IFF you give it enough time to do so.  It can 
take twelve to twenty-four hours to achieve tight synchronization from a 
cold start!  By "tight synchronization" I mean within five milliseconds 
or less; I've seen Solaris/SPARC systems synchronize with offsets in the 
low microsecond range.


