[ntp:questions] Testing throughput in NTP servers

Terje Mathisen "terje.mathisen at tmsw.no" at ntp.org
Fri Sep 14 19:44:50 UTC 2012


Ulf Samuelsson wrote:
> On 2012-09-12 21:24, Richard B. Gilbert wrote:
>> On 9/12/2012 2:34 PM, unruh wrote:
>>> On 2012-09-12, Ulf Samuelsson <ulf at invalid.com> wrote:
>>>> Does anyone know if there is any Linux-based S/W available to test the
>>>> throughput of NTP servers?
>>>> I.e.:
>>>>
>>>>     packets per second?
>>>>     % of lost packets
>>>>     etc?
>>>>
>>>> Best Regards
>>>> Ulf Samuelsson
>>>
>>> I hope not. I can just see someone deciding to test one of the stratum 1
>>> main servers (e.g. at the USNO). Why in the world would you want this?
>>>
>>
>> Sigh!  I'm sure it has happened and will happen again!  I'm sure that
>> there are people complaining to the National Bureau of Standards or
>> the Naval Observatory that their time is incorrect! ;-)
>>
>> If you really want time with better than microsecond accuracy, consider
>> getting a GPS timing receiver. The one I bought several years
>> ago claimed to be within plus or minus 50 nanoseconds of the correct time.
>
> The NTP server we will be testing will be connected to a Cesium clock
> providing a 1pps pulse so that is really not my problem.
> I want to check if this system can handle DDoS attacks, and bad packets.
> This will be done in a lab environment, possibly point-to-point from
> the test machine, to the server, or maybe
>
> In order to test DDoS, some FPGA H/W is probably needed to generate good
> packets, and the S/W stuff is there to generate bad packets and
> check how the server reacts to those.

Do you really need that?

It seems to me that by modifying an ethernet card driver to do ntp 
processing in kernel mode, you should be able to handle at least the 
same number of ntp requests as you can do ping replies.

Way back when, around 1992, Drew Major managed to get a NetWare 386 
server to handle a read request in 300 clock cycles. This was from 
receipt of the packet and included parsing, access control checks, 
locating the requested data somewhere in the memory cache, constructing 
the response packet and handing it back to the NIC.

Assuming we can get the actual ntp standard request code processing down 
to the absolute minimum (read the RDTSC counter (or a similar 
low-latency clock source) and the latest OS tick value/RDTSC count, 
scale the offset count by a fixed factor, then add to the OS clock 
value) we should be able to get the entire processing down to ~100 clock 
cycles or so. I.e. moving packet data in/out of the NIC buffers is going 
to take comparable time.
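
The timestamp arithmetic described above can be sketched as follows. This is only an illustration of the idea (last OS tick value plus a scaled counter offset), written in Python for readability rather than as kernel code; the names, the 32-bit fixed-point shift and the 2 GHz counter rate are all assumptions, not anything taken from an existing driver.

```python
# Sketch: interpolate the current time between OS ticks from a free-running
# cycle counter (e.g. the value RDTSC would return). All names are hypothetical.

SHIFT = 32  # fraction bits in the fixed-point scale factor (assumed)


def make_scale(counter_hz: int) -> int:
    """Precompute nanoseconds per counter increment as a fixed-point integer."""
    return (1_000_000_000 << SHIFT) // counter_hz


def interpolate_ns(tick_time_ns: int, tick_count: int, now_count: int, scale: int) -> int:
    """OS clock value at the last tick, plus the scaled counter offset since then."""
    offset_counts = now_count - tick_count
    # One multiply and one shift; integer truncation can lose up to 1 ns.
    return tick_time_ns + ((offset_counts * scale) >> SHIFT)


scale = make_scale(2_000_000_000)  # assumed 2 GHz counter: 0.5 ns per count
now_ns = interpolate_ns(tick_time_ns=1_000_000_000, tick_count=10_000,
                        now_count=11_000, scale=scale)
print(now_ns)
```

The scale factor is computed once, so the per-request work is a subtract, a multiply, a shift and an add, which is the kind of cost the ~100 cycle estimate relies on.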

(Any other kind of request is handled as today, i.e. queued for ntpd 
processing, unless DDoS-level packet rates cause the queue to exceed some 
very low size limit, at which point we discard the requests.)

Any packet which fails some minimum sanity checks can be discarded 
quickly; this is less overhead than handing it over to the regular 
user-level ntpd process.
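
A minimal sketch of that three-way split (fast path, queue for ntpd, discard) might look like this. The 48-byte header length and client mode 3 come from the NTP packet format; the specific sanity checks chosen, the `QUEUE_LIMIT` value and the `classify` function itself are assumptions made for illustration.

```python
NTP_HEADER_LEN = 48  # size of the basic NTP header in bytes
MODE_CLIENT = 3      # mode field value of a standard client request
QUEUE_LIMIT = 8      # hypothetical "very low" limit on the ntpd queue


def classify(packet: bytes, queued: int) -> str:
    """Return 'drop', 'fast' or 'queue' for one received UDP payload."""
    if len(packet) < NTP_HEADER_LEN:
        return "drop"                  # too short to be an NTP packet
    first = packet[0]                  # LI (2 bits), version (3), mode (3)
    version = (first >> 3) & 0x7
    mode = first & 0x7
    if version < 1 or version > 4:
        return "drop"                  # version outside the assumed 1..4 range
    if mode == MODE_CLIENT and len(packet) == NTP_HEADER_LEN:
        return "fast"                  # plain client request: answer in the driver
    if queued >= QUEUE_LIMIT:
        return "drop"                  # slow-path queue is full, shed load
    return "queue"                     # everything else goes to ntpd


client_request = bytes([(4 << 3) | MODE_CLIENT]) + bytes(47)
print(classify(client_request, queued=0))
```

Every check is a length comparison or a mask on the first byte, so a bad packet is rejected before any state is touched.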
>
> Recording the packets will be done with FPGA H/W as well.

So a network sniffer won't be fast enough?

You're talking 10 GigE wire speed, right?

That's more than 100 M requests/second!
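
For a sense of scale, the line rate can be estimated from frame sizes. The sketch below assumes a plain 48-byte NTP request over UDP/IPv4/Ethernet with no VLAN tag or extension fields; those framing choices and the function name are assumptions, not something stated in this thread. Under them a single 10 GigE link carries roughly 11 M requests/second (about 15 M/second for minimum-size frames), so 100 M is best read as a generous ceiling for the sizing that follows.

```python
def max_packets_per_second(link_bps: int, udp_payload_bytes: int) -> float:
    """Upper bound on packets/second for one direction of an Ethernet link."""
    UDP, IPV4, ETH_HEADER, FCS = 8, 20, 14, 4  # per-packet header bytes
    PREAMBLE, INTERFRAME_GAP = 8, 12           # per-frame overhead on the wire
    MIN_FRAME = 64                             # minimum Ethernet frame size
    frame = max(MIN_FRAME, udp_payload_bytes + UDP + IPV4 + ETH_HEADER + FCS)
    wire_bytes = frame + PREAMBLE + INTERFRAME_GAP
    return link_bps / (wire_bytes * 8)


rate = max_packets_per_second(10_000_000_000, 48)  # 48-byte NTP request
print(f"{rate / 1e6:.1f} M packets/s")
```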

Taking a pessimistic view (1K clock cycles/request) would give just 3M 
packets/core/second, so a 32-core (4x8) machine would suffice.

Getting closer to my 100-cycle target (for chained processing of a bunch 
of consecutive request packets) drops the CPU requirements down to a 
regular quad-core single-CPU machine, but at this point the bus probably 
won't be able to keep up with the NIC.
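
The core counts above follow from simple arithmetic, sketched here. The 3 GHz clock is an assumption (it is what 1K cycles/request giving 3M packets/core/second implies), and `cores_needed` is a hypothetical helper; it also ignores bus and NIC limits entirely.

```python
import math


def cores_needed(requests_per_s: float, cycles_per_request: int,
                 clock_hz: float = 3e9) -> int:
    """Cores required if each request costs a fixed number of clock cycles."""
    per_core = clock_hz / cycles_per_request  # requests one core can serve per second
    return math.ceil(requests_per_s / per_core)


pessimistic = cores_needed(100e6, 1000)  # 1K cycles/request
target = cores_needed(100e6, 100)        # 100 cycles/request
print(pessimistic, target)
```

With these inputs the pessimistic case comes out in the low thirties, in line with the 32-core estimate, and the 100-cycle case fits in a quad core.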

Terje

-- 
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"


