[ntp:questions] question regarding NTP configuration for clusters, and "cluster time" stability

Unruh unruh-spam at physics.ubc.ca
Tue Sep 29 20:12:55 UTC 2009


"rotordyn at yahoo.com" <rotordyn1 at gmail.com> writes:

>On Sep 29, 2:37 pm, Harlan Stenn <st... at ntp.org> wrote:

>> You've read assoc.html#orphan, right?

>Yes. And I think I understand how it would work in common scenarios,
>but I have to account for the corner cases as well. In particular, the
>documentation at that link says:

>    "If no UTC sources are available to any core server, one of them
>    can provide a simulated UTC source for all other hosts in the
>    subnet. However, only one core server can simulate the UTC
>    source and all direct dependents, called orphan children, must
>    select the same one, called the orphan parent."

>So I'm not sure what happens if some core servers lose access
>to their UTC sources, while the remainder do not. I had hoped
>that one core server switching to orphan mode would somehow
>trigger the others, but I don't see that it does in the code.

>> If you "choose unwisely" and select a poor master to take over in your
>> failure case, you'll see a time jump when you regain internet access.

>True. I think I can survive that, as long as all the nodes stay close
>enough to each other. If it has to, the cluster can signal that it is
>too far from its external UTC reference to resynchronize. Then
>the internal cluster time would just continue to drift until there was
>a service action to fix it. What would be worse is for nodes in the
>cluster to diverge from each other.

>> If correct time is really important, why not run an inexpensive S1 device
>> locally?

>Oddly enough, "correct time" isn't all that important, at least not to
>the accuracy often discussed in the context of NTP. And I think that's
>the root of my issues: A consistent "cluster time" among the nodes
>is much higher priority than accurately following UTC. Prioritizing
>a common cluster time absolutely over UTC in a redundant fashion
>seems to be difficult, since the peers that provide redundancy can
>diverge from each other if given different inputs. (Different in that
>one could lose its external UTC reference while another does not.)

Oddly enough, making sure that all of the computers are synchronized to
UTC is probably the best way to ensure that they are all synchronized to
each other. For example, you could run a PPS line to each computer's
parallel port and use the interrupt from that to sync that computer to
that PPS. Or you could sync groups of them to their own PPS sources
(e.g. several GPS receivers).

If the peer loses its external UTC reference, you could always have it
raise its level (stratum), so the rest would stop using it as a reference.
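
To illustrate the idea of a peer raising its stratum when it loses UTC, here
is a minimal Python sketch. It is a toy model only: real ntpd selection also
weighs delay, dispersion, and agreement among sources. The names Server,
fallback_stratum, and pick_reference, and the fallback value 10, are
assumptions made for illustration, not anything taken from ntpd.

```python
from dataclasses import dataclass

# Toy model only; not ntpd's actual selection algorithm.
UNSYNCHRONIZED = 16  # NTP treats stratum 16 as "unsynchronized"


@dataclass
class Server:
    name: str
    base_stratum: int
    has_utc_reference: bool = True
    fallback_stratum: int = 10  # assumed stratum advertised after losing UTC

    def advertised_stratum(self) -> int:
        # A server that lost its UTC source advertises a worse (higher) stratum.
        if self.has_utc_reference:
            return self.base_stratum
        return self.fallback_stratum


def pick_reference(servers):
    """Return the usable server with the lowest advertised stratum.

    Ties are broken by name, so every client that sees the same set of
    servers picks the same one. Returns None if nothing is usable.
    """
    usable = [s for s in servers if s.advertised_stratum() < UNSYNCHRONIZED]
    if not usable:
        return None
    return min(usable, key=lambda s: (s.advertised_stratum(), s.name))
```

In this model, with two stratum-2 servers, the one that loses its UTC source
drops out of favour as soon as its advertised stratum rises, and if both lose
UTC every client still settles on the same server because of the tie-break.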


>The current implementation reflects this, since we use NTP internally
>with no outside references. But that means that over time we can drift
>pretty far away from UTC, and my goal is just to limit that. I think I've
>said it already, but roughly speaking I need the nodes to agree with
>each other to less than a second, and even within a minute or even
>an hour to UTC is enough.
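
The two tolerances stated above (nodes within a second of each other, and
within roughly an hour of UTC) can be written down as a small check. This is
a sketch only: the function name check_cluster, its parameters, and the idea
of collecting one measured offset per node are assumptions for illustration.

```python
def check_cluster(offsets_from_utc, max_spread=1.0, max_utc_error=3600.0):
    """Check a cluster against the two tolerances.

    offsets_from_utc maps a node name to that node's clock offset from UTC
    in seconds (positive means the node is ahead). How these offsets are
    measured is outside the scope of this sketch.
    """
    values = list(offsets_from_utc.values())
    spread = max(values) - min(values)      # worst node-to-node disagreement
    worst = max(abs(v) for v in values)     # worst distance from UTC
    return {
        "spread": spread,
        "worst_utc_error": worst,
        "nodes_agree": spread < max_spread,
        "utc_ok": worst <= max_utc_error,
    }
```

For example, three nodes that are all about two minutes ahead of UTC but
within 0.6 s of each other pass both checks, while two nodes 1.5 s apart fail
the first check even if one of them matches UTC exactly.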

>> If it's *really* important, you can build a very high quality S1 server
>> with Rb and GPS and/or modem for under US$2k.

>Adding hardware isn't an option. This product exists in the field.

One of the advantages of GPS is that it too exists in the field. And on
housetops, and in the canyons (well, at least on the rooftops) of the
concrete jungle.

>As always, thanks for the guidance. And if anyone with the right
>level of experience wants a consulting gig, send me an email. :)

>tim



