[ntp:questions] A proposal to use NIC launch time support to improve NTP

Ulf Samuelsson ulf at invalid.com
Wed Dec 19 15:12:48 UTC 2012

On 2012-12-19 14:48, Brian Utterback wrote:
> On 12/18/2012 7:05 PM, Ulf Samuelsson wrote:
>> Brian Utterback <brian.utterback at oracle.com> wrote:
>>> On 12/13/2012 5:00 AM, Jonatan Walck wrote:
>>>>>> This is going to be very hard to get it to be useful. Looking at
>>>>>> the specs for the card, the timestamp you give is relative to a
>>>>>> clock that is internal to the controller, and is only accurate to
>>>>>> the nearest second. That is, it is like the PPS in that it is
>>>>>> assumed that the clock is in sync to within .5 seconds to avoid
>>>>>> aliasing the timestamp.
>>>>>> Brian.
>>>> The internal clock of the network controller is the PHC for IEEE1588,
>>>> it has a 1 ns resolution, and can be steered with a 32 bit fractional
>>>> of 1 ns. see SYSTIML and TIMINCA in the I210 datasheet.
>>>> // jwalck
>>> I know that. The problem is that there is going to be jitter introduced
>>> when you set the clock from the kernel. That is generally the problem
>>> with IEEE 1588, getting the time from the controller to the kernel and
>>> vice versa. If you have to go across a PCI bus for instance that will
>>> introduce jitter.
>>> Brian
>> No, you capture the time for the 1 PPS pulse in the network controller.
>> Then you tweak the count rate of the timestamp counter up or down.
>> Eventually you will have synchronized the timestamp counter with the
>> 1 PPS pulse.
>> If you run the network controller from a clock source derived from the
>> cesium clock, you should get within a few ns of the pulse.
>> It is an inconvenient approach, since you could easily sync your
>> timestamp with a 1 PPS pulse using a few counters if you understand
>> the issue.
>> Unfortunately, those counters are not available in the Intel chips, so
>> we have to use it the way they want us to use it.
>> Since the launch time will be the arrival time of the client request
>> (through H/W timestamping) + a constant delay, any delays in the PCIe
>> bus will not affect the precision.
>> You just have to make sure that the constant delay added is longer than
>> the processing time inside the server.
> No, you are missing the point. You have two clocks in this scenario, the
> kernel clock and the network controller clock. If one gets a good time
> then you have to set the other from it. This means that this time will
> have to travel over the PCI bus which will introduce jitter.

No, all timestamps in our approach come from the network controller,
so the system time can be way off without anyone caring.
Since we are a Stratum 1 server, we just need to get the "seconds"
counter right at init.
Until that is done, we are unsynchronized.
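
To put the PPS discipline from the quoted text in concrete terms, here
is a rough Python simulation, not driver code. It assumes the controller
latches its timestamp counter on each PPS edge and that the count rate
can be nudged in parts per billion. The PI gains (0.7, 0.3), the 20 ppm
oscillator error and the 300 us initial phase are made-up illustration
values, and the mapping of adj_ppb onto a real rate register such as
TIMINCA is not shown.

```python
NS_PER_SEC = 1_000_000_000


def pps_offset_ns(latched_ns):
    """Offset of the latched counter from the nearest whole second.

    The result is wrapped into (-0.5 s, +0.5 s], which is why the
    "seconds" part has to be set correctly by other means at init.
    """
    frac = latched_ns % NS_PER_SEC
    return frac - NS_PER_SEC if frac > NS_PER_SEC / 2 else frac


class PiServo:
    """Minimal PI controller; the gains are assumed, not taken from any driver."""

    def __init__(self, kp=0.7, ki=0.3):
        self.kp = kp
        self.ki = ki
        self.integral = 0.0

    def update(self, offset_ns):
        """Return a rate correction in ppb (ns per second) for the next second."""
        self.integral += offset_ns
        return -(self.kp * offset_ns + self.ki * self.integral)


def simulate(seconds=60, freq_error_ppb=20_000.0, phase_ns=300_000.0):
    """Run the counter for `seconds` PPS edges and return the offset at each edge."""
    counter = phase_ns          # hypothetical NIC counter value, in ns
    servo = PiServo()
    adj_ppb = 0.0
    offsets = []
    for _ in range(seconds):
        # One true second passes; the counter runs fast or slow by
        # the oscillator error plus the correction currently applied.
        counter += NS_PER_SEC + freq_error_ppb + adj_ppb
        offset = pps_offset_ns(counter)
        offsets.append(offset)
        adj_ppb = servo.update(offset)
    return offsets


if __name__ == "__main__":
    for second, offset in enumerate(simulate(), start=1):
        print(f"{second:3d} s  offset {offset:14.3f} ns")
```

With these assumed numbers the residual offset drops below 1 ns well
within a minute of simulated PPS edges. Real hardware adds capture
jitter, which this model ignores.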

> Now, if you have a PPS signal available and can provide it to both the
> network controller and the kernel, then you don't have this problem
> since the PPS signal will sync the time to an accuracy better than the
> jitter that was introduced.
> Even without the PPS signal to the kernel, your system might be usable,
> since the only timestamp used in the kernel will be for the
> originate/transmit timestamp, and this timestamp will be in sync with
> the network controller timestamp by virtue of the use of launchtime.
> But you will have to be sure that the kernel clock is always a little
> ahead of the network controller clock, enough so that the actual delay
> in the stack doesn't cause the packet to reach the controller after the
> designated launchtime, but not so far ahead that the timestamp wraps
> (i.e. .5 second). Also, not so far ahead that you get too large a
> backlog in the controller of packets waiting to be sent.

The desired launchtime is compared to the network controller timestamp
counter in H/W, so again there is no need to synchronize with the system
time.

> Of course, this also all depends on the controller providing receive
> timestamps as well. Otherwise you will replace jitter with systematic
> asymmetric delays, which are worse.

Yes, the launch time will be the H/W receive timestamp + a constant delay.
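
As a small illustration of why PCIe and stack delays drop out, here is a
Python model of the scheme, not real driver code. The names
(schedule_reply, departure_ns, turnaround_ns) are hypothetical, all times
are in ns on the controller's own counter, and the model assumes that a
frame handed over after its launch time simply leaves late; what a given
chip actually does in that case would need to be checked in its
datasheet.

```python
from dataclasses import dataclass


@dataclass
class Reply:
    rx_hw_ns: int    # H/W receive timestamp of the client request
    launch_ns: int   # controller time at which the reply should leave


def schedule_reply(rx_hw_ns, turnaround_ns):
    """Launch time = H/W receive timestamp + a constant delay."""
    return Reply(rx_hw_ns=rx_hw_ns, launch_ns=rx_hw_ns + turnaround_ns)


def departure_ns(reply, handoff_ns):
    """Controller time at which the frame leaves the wire in this model.

    handoff_ns is when the frame reaches the controller after processing
    and bus transfer. The controller holds it until launch_ns; a late
    frame (assumed behaviour) goes out as soon as it arrives.
    """
    return max(reply.launch_ns, handoff_ns)


if __name__ == "__main__":
    reply = schedule_reply(rx_hw_ns=1_000_000, turnaround_ns=200_000)
    # Different processing/bus delays, all shorter than the constant delay.
    for delay in (30_000, 80_000, 150_000):
        print(delay, departure_ns(reply, reply.rx_hw_ns + delay))
    # A delay longer than the constant delay misses the launch time.
    print(250_000, departure_ns(reply, reply.rx_hw_ns + 250_000))
```

As long as the constant delay exceeds the worst-case processing time,
every reply in this model leaves exactly at launch_ns, so the transmit
timestamp written into the packet can be the launch time itself.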

> Brian
