[ntp:hackers] First sample after 15 minutes used. Really?

David L. Mills mills at udel.edu
Tue Sep 11 14:17:38 UTC 2007


Rob,

I never actually saw a 9037 Sysplex Timer, but my consulting clients 
have them and asked me to look into how they could be used to support 
network-wide synchronization. A report on that is on the NTP Project
Page. In spite of their huge cost ($100K each and you need at least two 
per installation), they are somewhat awesome. Hooked together by fiber 
up to 40 km, they can synchronize a multiple processor, multiple 
mainframe system to TAI. They can synchronize either via ACTS modem or 
directly to a radio clock that is no longer manufactured. However, if 
running on local time when switching from daylight time, the operator is 
expected to close all applications and wait for up to 24 hours while the 
local time stabilizes around the world.

Once upon a time when the world was ruled by International Bull Moose,
the world was like that. Perhaps the most historically interesting fact
was that the System/360 product line, which still lives, had a memory
address bus of only 24 bits. See the adventures at
http://www.eecis.udel.edu/~mills.

Dave

Rob Neal wrote:

> The IBM Mainframe 9037 Sysplex Timer is dead, long live the new regime:
>
> http://www.redbooks.ibm.com/abstracts/sg247280.html?Open
>
> Maybe someday it will even learn to listen to NTP...
>
>
> Rob
>
> On Sun, 9 Sep 2007, David L. Mills wrote:
>
>> Brian,
>>
>> You mean you actually read the documentation? Darn, not many people do
>> that.
>>
>> Who do you love, the TOY chip or a neighborhood server? There is code in the
>> ntp_util.c source, but it is not clear it works with all machines. There
>> are different opinions on this. Some set the TOY once an hour when
>> disciplined, some set it on shutdown and so forth. All I can say is that
>> if the TOY is not within several minutes at startup, somebody should
>> know about it. You can of course use the -g option to set the time even
>> if outside step or panic; that's what cisco routers do. The question is
>> whether to set the TOY on a step as well.
>>
>> You don't ever want to set the TOY equivalent on an IBM mainframe. The
>> operator would need to shut all applications down first and then have a
>> tug of war with the 9037 Sysplex Timer. Also for grins, there is in fact
>> a SHARE program that operates as an NTP server, but no program available
>> that operates as an NTP client. I cut my teeth on the IBM 7090 and later
>> the IBM System/360 and so am not surprised at all about that.
>>
>> Dave
>>
>> Brian Utterback wrote:
>>
>>> Yes, that is what I was saying, that the standard mitigation
>>> algorithms are still in effect. I thought that the wording in the
>>> docs made it sound like this was not the case. I read the doc as
>>> saying that the first sample after 15 minutes (i.e. from the packet
>>> currently being processed) would be accepted, not the offset
>>> calculated after the mitigation and selection algorithms.
>>>
>>> Another thing, in the html man page for ntpd, it says that if the TOY
>>> chip is not working, then ntpd exits. Here is the text:
>>>
>>> "In case there is no TOY chip or for some reason its time is more than
>>> 1000 s"..."exit with a panic message to the system log"
>>>
>>> Now, of course we are all aware of the panic threshold for the
>>> calculation of the offset, but I am not sure what is meant here about
>>> the TOY chip. If there is no TOY chip ntpd exits? That doesn't seem
>>> right.
>>>
>>> David L. Mills wrote:
>>>
>>>> Brian,
>>>>
>>>> You might misunderstand the purpose of the scheme. First, the
>>>> mitigation is done before the threshold is checked, so presumably we
>>>> have the best candidates to wiggle the clock. Although a rogue sample
>>>> or two might exceed the step threshold, it's highly likely a good
>>>> sample will come along and reset the stepout counter.
>>>
>>> [snip]
>>>
>>>> Dave
>>>>
>>>> Brian Utterback wrote:
>>>>
>>>>> To answer my own question, the wording isn't exactly right. What
>>>>> happens is that any offset greater than 128 ms is ignored for 15
>>>>> minutes. If it has been over 15 minutes since the last clock update,
>>>>> then the offset (still calculated by the normal method) is allowed.
>>>>> So, the standard clustering and combining are still in effect.
>>>>>
>>>>>
>>>>> Brian Utterback wrote:
>>>>>
>>>>>> (Resending, including Dave this time.)
>>>>>>
>>>>>>
>>>>>> In the documentation for ntpd, it says:
>>>>>>
>>>>>>> The ntpd algorithms discard sample
>>>>>>> offsets exceeding 128 ms, unless there is no sample offset
>>>>>>> of less than 128 ms for 15 minutes. The first sample after
>>>>>>> that, no matter what the offset, steps the clock to the
>>>>>>> indicated time.
>>>>>>>
>>>>>> Is this really true? The first sample after 15 minutes is used,
>>>>>> even if
>>>>>> it is a lone wolf? What if you have 5 servers and 4 of them give an
>>>>>> offset of 2 seconds and one has an offset of 10 seconds. If the 10
>>>>>> second guy happens to be the first one after 15 minutes of such
>>>>>> samples, he gets used? This seems wrong to me.
>>>>>>
>>>>>> Brian Utterback
>>>>>>
>>>>>> _______________________________________________
>>>>>> hackers mailing list
>>>>>> hackers at lists.ntp.org
>>>>>> https://lists.ntp.org/mailman/listinfo/hackers
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
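
The step-threshold behavior discussed in the thread can be sketched roughly as follows. This is a toy model for illustration, not ntpd's actual code: the names StepoutModel, decide and mitigated_offset are hypothetical, and a plain median stands in for ntpd's real selection, clustering and combining algorithms. Only the 128 ms, 15 minute and 1000 s figures come from the quoted documentation. The point it tries to show is that the threshold is checked against the offset produced after mitigation, so a lone outlier among five servers does not by itself decide the step.

```python
from dataclasses import dataclass
from statistics import median

STEP_THRESHOLD = 0.128    # seconds; the 128 ms figure quoted from the ntpd docs
STEPOUT = 900.0           # seconds; the 15 minute interval quoted from the docs
PANIC_THRESHOLD = 1000.0  # seconds; the 1000 s figure quoted from the man page


def mitigated_offset(offsets):
    """Stand-in for selection, clustering and combining (an assumption).

    A median is used here only because it ignores a single outlier;
    ntpd's real algorithms are considerably more involved.
    """
    return median(offsets)


@dataclass
class StepoutModel:
    """Hypothetical model of the step/stepout decision described above."""
    last_update: float = 0.0  # time (s) of the last accepted clock update

    def decide(self, now, offset):
        """Classify an already-mitigated offset at time `now` (seconds)."""
        if abs(offset) > PANIC_THRESHOLD:
            return "panic"            # beyond the panic threshold
        if abs(offset) <= STEP_THRESHOLD:
            self.last_update = now    # a good sample restarts the interval
            return "slew"
        if now - self.last_update <= STEPOUT:
            return "ignore"           # large offset, but not yet 15 minutes
        self.last_update = now
        return "step"                 # large offset persisted past 15 minutes


if __name__ == "__main__":
    model = StepoutModel()
    # Brian's example: four servers at 2 s and one at 10 s.
    offset = mitigated_offset([2.0, 2.0, 2.0, 2.0, 10.0])
    for now in (300.0, 600.0, 901.0):
        print(now, offset, model.decide(now, offset))
```

In this sketch the five-server example yields a mitigated offset of 2 s, which is ignored at 300 s and 600 s and stepped at 901 s; the 10 s server never drives the decision on its own. A sample under 128 ms arriving in between would reset last_update and restart the 15 minute interval.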


