[ntp:questions] NTP vs RADclock?
rick.jones2 at hp.com
Sun Jun 10 22:23:09 UTC 2012
unruh <unruh at invalid.ca> wrote:
> On 2012-06-08, Rick Jones <rick.jones2 at hp.com> wrote:
> > I would suggest then trying disabling of the interrupt coalescing
> > via ethtool on the 1GbE NIC of your server and a few select
> > clients and see what that does. If things start to look cleaner
> > then you know it is an implementation-specific detail of one or
> > more GbE NICs.
> It looks to me that interrupt coalescing is not enabled according to
I'd like to see the full output of ethtool, ethtool -i and ethtool -c
for your interfaces if I may. Feel free to send as direct email if
> > If it is possible to connect a client "back-to-back" to your server at
> > the same time (via a second port) - still with interrupt coalescing
> > disabled at both ends that would be an excellent addition. That will
> > help evaluate the switch.
> > I trust there were no OS changes when going from 100BT to GbE? Though
> > even if not, there is still the prospect of the drivers for the 100BT
> > cards not doing what linux calls "napi" and the drivers for the GbE
> > cards doing it, which may introduce some timing changes.
> What is napi?
NAPI is a mechanism whereby interrupts on a NIC get disabled, and the
driver instead polls for packets for a certain length of time.
> >> So yes, I think it is the Gb technology that is causing trouble.
> > I split what may seem a hair between Gb technology being the IEEE
> > specification and Gb implementation being what specific NIC vendors
> > do. So, to me, interrupt coalescing is implementation not technology.
> For me, I do not care which it is, it is all Gb.
I suspect that my caring about Gb technology/specification vs Gb
implementation may be not all that far from a timekeeper's desire to
distinguish between accuracy and precision, even when laypeople start
to mix the two :)
> Note that on one of the clients, there are two separate clusters of
> roundtrip delays, one from .15 to about .4ms, and the other from
> about 1.3 to 1.6 ms. The slope within each cluster is as above but
> the slope between the clusters is the opposite. Ie, within the
> cluster, the client to server is being delayed, while the clusters
> are due to a huge delay in the server to client. (if I have the
> signs right)
> In http://www.theory.physics.ubc.ca/scatter/scatter.html I have the
> scatter plots (offset vs return time) for two clients to two
> different servers. One of the servers is a Gb server, while the
> other is a 100Mb server. Both servers are disciplined by a GPS PPS
> device. The offset fluctuations on both servers are about 4 us, so
> none of the offset fluctuations come from the server clocks
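
For anyone following the slope argument in the quoted text, here is a
minimal sketch of why extra delay in one direction tilts the offset vs
round-trip scatter one way and extra delay in the other direction tilts
it the opposite way. It uses the standard NTP four-timestamp offset and
delay formulas. The path delays (0.1, 0.3 and 1.3 ms) and the helper
names are made up for illustration and are not taken from the plots.
The sign convention here is server minus client, which may be the
reverse of the one used in the plots, so only the fact that the two
slopes have opposite signs should be relied upon.

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation.

    t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive (all in the same units).
    Returns (offset, round_trip_delay).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def exchange(out_ms, back_ms, server_proc_ms=0.01):
    """Hypothetical exchange between two perfectly synchronized clocks.

    out_ms is the client-to-server path delay and back_ms the
    server-to-client path delay; both are illustrative values.
    """
    t1 = 0.0
    t2 = t1 + out_ms
    t3 = t2 + server_proc_ms
    t4 = t3 + back_ms
    return ntp_offset_delay(t1, t2, t3, t4)


def slope(a, b):
    """Slope of offset vs round-trip delay between two (offset, delay) points."""
    return (b[0] - a[0]) / (b[1] - a[1])


base = exchange(0.1, 0.1)        # symmetric path, apparent offset near zero
slow_out = exchange(0.3, 0.1)    # extra delay client -> server only
slow_back = exchange(0.1, 1.3)   # extra delay server -> client only

print(slope(base, slow_out))     # about +0.5
print(slope(base, slow_back))    # about -0.5
```

Since the clocks in this sketch agree exactly, every nonzero offset it
reports is purely an artifact of path asymmetry: half of any one-way
excess delay shows up as apparent offset, with the sign set by the
direction that was delayed.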
It would be good to include the specific card name and driver rev etc
in subsequent writeups. Over the years there have been several Intel
gigabit cards and 100BT cards. I believe just about all the Intel GbE
cards have had support for interrupt coalescing in some form or
another. At least those which have crossed my path.
lspci -v can help if you don't already know the card name(s).
It is not a question of half full or empty - the glass has a leak.
The real question is "Can it be patched?"
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...