[ntp:hackers] What to do when the offset is WAYTOOBIG
brian.utterback at sun.com
Wed Apr 18 06:09:39 PDT 2007
I have had users complain because the xntpd that ships with Solaris
exits when the offset is too big, even though there were servers that
did not have offsets that were too big. In other words, it was
counterintuitive to them that the election would go to a candidate that was
ineligible to serve. It just makes sense that the elimination should be
before the election when the candidates are filing (or this year, it
seems the first step is to form an exploratory committee). Wouldn't the
best solution be to add a test in peer_unfit?
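
To make the first idea concrete, here is a minimal sketch of dropping
out-of-range candidates before the election. It is not the ntpd source:
struct candidate, candidate_unfit(), filter_candidates() and the 1000 s
PANIC_THRESHOLD are all hypothetical stand-ins, assumed only for
illustration of where such a test might sit.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed limit in seconds for this sketch, not read from any config. */
#define PANIC_THRESHOLD 1000.0

/* Hypothetical, stripped-down stand-in for a peer record. */
struct candidate {
    const char *name;
    double offset;      /* measured offset from this server, seconds */
    int reachable;      /* nonzero if the server is answering */
};

/* Return nonzero if the candidate must not take part in the election. */
static int candidate_unfit(const struct candidate *c)
{
    double mag = c->offset < 0 ? -c->offset : c->offset;

    if (!c->reachable)
        return 1;
    /* The proposed extra test: an offset that would trigger the panic
     * exit disqualifies the server up front. */
    if (mag > PANIC_THRESHOLD)
        return 1;
    return 0;
}

/* Store pointers to the fit candidates in out[]; return how many. */
static int filter_candidates(const struct candidate *in, int n,
                             const struct candidate **out)
{
    int kept = 0;

    assert(in != NULL && out != NULL);
    for (int i = 0; i < n; i++) {
        if (!candidate_unfit(&in[i]))
            out[kept++] = &in[i];
    }
    return kept;
}
```

With this arrangement only the survivors reach the selection step, so a
single wildly wrong server cannot win and then force an exit.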
On the other hand, another simple solution is to treat the panic exit
in local_clock the same way as spikes or popcorn filtees: simply return
0 and let the code ignore the update.
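
A rough sketch of the second idea, again not the real local_clock: the
names apply_offset(), struct clock_state, the UPDATE_* codes and the
1000 s PANIC_THRESHOLD are assumptions made up for this example. The only
point is that an oversized offset is counted and dropped instead of
ending the process.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed limit in seconds for this sketch. */
#define PANIC_THRESHOLD 1000.0

/* Hypothetical result codes; 0 means "ignore this update". */
enum update_result { UPDATE_IGNORED = 0, UPDATE_APPLIED = 1 };

/* Hypothetical clock discipline state. */
struct clock_state {
    double last_offset;   /* last offset that was accepted, seconds */
    unsigned ignored;     /* number of samples dropped as too big */
};

/* Accept a normal offset; drop an oversized one instead of exiting. */
static enum update_result apply_offset(struct clock_state *st, double offset)
{
    double mag = offset < 0 ? -offset : offset;

    assert(st != NULL);
    if (mag > PANIC_THRESHOLD) {
        /* Where the panic exit would be: record the event and return 0
         * so the caller ignores the update. */
        st->ignored++;
        return UPDATE_IGNORED;
    }
    st->last_offset = offset;
    return UPDATE_APPLIED;
}
```

The state is left untouched by a rejected sample, so the clock keeps
free running on its last good correction until a sane offset arrives.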
Both of these allow ntpd to "wait for better times". The first is more
intuitive but might have an effect on the detection of falsetickers. The
second is probably safer, but will leave the system free running in more
cases, since it will not fall back on the other servers.
Judah Levine wrote:
> I agree that having the software exit on this condition is probably
> the wrong thing to do. The NIST time servers use my own LOCKCLOCK
> algorithm for synchronizing the clock, and that algorithm partitions
> this condition into 3 possibilities:
> 1. A failure of the network or the remote host or the
> measurement process.
> 2. A time step of the local clock
> 3. A frequency step of the local clock
> The first action is to immediately initiate a query to another time source,
> if one is available. Unlike the standard NTP, my algorithm queries only
> one time source on each calibration cycle and evaluates the response
> based on the size of the correction that it implies for the local clock.
> This checks possibility 1. Possibilities 2 and 3 are distinguished by
> delaying for a short period of time and initiating a second query. I
> can talk about the details if anyone is interested or you can find this
> stuff in my papers. However, the point is that the software never exits.
> If the program is unable to decide what to do then it sets itself to
> unhealthy and waits for better times. (This is also a difference with
> the standard version of NTP. My understanding is that the standard
> version will never set itself to "clock unsynchronized" once it is up
> and running.)
> Best wishes,
> Judah Levine
> NIST and University of Colorado
> Judah Levine
> Time and Frequency Division
> NIST Boulder
"Remember 'A Thousand Points of Light'? With a network, we now have
a thousand points of failure."
Brian Utterback - Solaris RPE, Sun Microsystems, Inc.