[ntp:questions] Re: ntpq times out if NMEA refclock configured?

Richard B. Gilbert rgilbert88 at comcast.net
Mon May 15 15:02:29 UTC 2006


R Jenkins wrote:

> "Richard B. Gilbert" <rgilbert88 at comcast.net> wrote in message
> news:9sqdnZb36qXzoPrZRVn-sg at comcast.com...
> 
>>R Jenkins wrote:
>>
>>>"Richard B. Gilbert" <rgilbert88 at comcast.net> wrote in message
>>>news:L7udnQ5X_9grBvvZRVn-vA at comcast.com...
>>>
>>>
>>>>R Jenkins wrote:
>>>>
>>>>
>>>>
>>>>>Hi,
>>>>>
>>>>>I'm trying to add a GPS refclock to my server.
>>>>>After total failure with a basic Trimble TSIP output GPS plus the parse
>>>>>clock, I'm now using a Garmin GPS25 and the NMEA refclock.
>>>>>
>>
>><big snip>
>>
>>>>After rereading a little more carefully, I notice that your frequency
>>>>correction of -495.9 PPM is on the ragged edge of the 500 PPM limit.  It is
>>>>unusual for a clock to have a frequency error this large; most are below
>>>>50 PPM in absolute value.
>>>>
>>>>Does your system have a kernel parameter called "HZ"?  Is it set to a
>>>>value greater than 100?  I believe I have seen references to values of
>>>>both 250 and 1000; neither value works well with NTPD.  The system seems
>>>>to lose clock interrupts when HZ is greater than 100.  YMMV but if you
>>>>are not using 100, give it a try.
>>>>
>>>
>>>Hi,
>>>thanks for the replies.
>>>
>>>The -495.9 ppm seems to be a symptom of the refclock problem. Without the
>>>NMEA refclock it was -60 after a few minutes, long before it had settled
>>>properly.
>>>I think it does have a fast Hz setting (I've seen it somewhere but I
>>>can't remember where or what it was set to..) However, it's a 3.2GHz
>>>processor so I don't think it should struggle too much.
>>>
>>>
>>>I have the PPS pulse set to 200mS.
>>>The PC does not normally have a display, I use telnet (well, SSH) from my
>>>desk.
>>>Running minicom at 4800 Baud with NTPD stopped shows the GPS serial data
>>>is present:
>>>$GPRMC,073153,A,5319.0516,N,00106.9355,W,000.0,000.0,140506,004.0,W*76
>>>$GPRMC,073154,A,5319.0516,N,00106.9355,W,000.0,000.0,140506,004.0,W*71
>>>$GPRMC,073155,A,5319.0516,N,00106.9355,W,000.0,000.0,140506,004.0,W*70
>>>...
>>>I'm not sure how to remotely monitor the DCD line.
>>>
>>>
>>>Simply having the 'server 127.127.20.0 prefer' line in causes the ntpq
>>>hang.
>>>
>>>I've just got around to checking the log immediately after starting ntpd:
>>>
>>>May 14 08:19:51 gate2 ntpd[28723]: ntpd 4.2.0a@1.1190-r Sat May 13 10:39:48 BST 2006 (1)
>>>May 14 08:19:51 gate2 ntpd[28723]: precision = 1.000 usec
>>>May 14 08:19:51 gate2 ntpd[28723]: Listening on interface wildcard, 0.0.0.0#123
>>>May 14 08:19:51 gate2 ntpd[28723]: Listening on interface wildcard, ::#123
>>>May 14 08:19:51 gate2 ntpd[28723]: Listening on interface lo, 127.0.0.1#123
>>>May 14 08:19:51 gate2 ntpd[28723]: Listening on interface eth0, 192.168.0.43#123
>>><Other interfaces trimmed>
>>>May 14 08:19:51 gate2 ntpd[28723]: kernel time sync status 0040
>>>May 14 08:19:51 gate2 ntpd[28723]: refclock_nmea: time_pps_kcbind failed: Invalid argument
>>>May 14 08:19:52 gate2 ntpd[28723]: too many recvbufs allocated (40)
>>>
>>>It looks like there is some problem with the kernel PPS interface, but I
>>>have no idea what...
>>>I used this patch:
>>>PPSkit-light-alpha-3328m-2.6.15.1.diff
>>>on a clean download of kernel 2.6.16.9 - there were a couple of rejects,
>>>but they seemed to be pretty obvious & went in easily by hand..
>>>
>>>I'm happy to try another (recent) 2.6 kernel if there is one with a known
>>>working patch?
>>>
>>>Another test: Leaving the 'flag 3 1' out stops the refclock error line in
>>>the log.
>>>The 'too many recvbufs allocated (40)' line seems to be triggered by the
>>>NMEA refclock regardless of any other settings; it does not appear when
>>>the NMEA clock is commented out in ntp.conf
>>>
>>>Robert Jenkins.
>>>
>>>
>>
>>
>>If the HZ setting is causing the problem, it has little to do with
>>processor speed!!   The problem seems to be that various device drivers
>>mask or disable interrupts for a period covering two or more clock
>>interrupts causing one or more to be lost with each occurrence.
>>
>>The messages about "too many recvbufs allocated (40)" were associated with
>>a bug in ntpd that I believe was fixed more than a year ago.  You might
>>want to try the latest version of ntpd.  You can download it from
>>http://ntp.isc.org/bin/view/Main/SoftwareDownloads
> 
> 
> Hi,
> I can understand the Hz setting messing up the accuracy, but I don't see it
> would stop things running altogether?
> It was at 250Hz, I'm presently compiling a kernel with it at 100Hz to see 
> what effect this has.
> 
> I thought I had tried the latest dev release (as per my last post), but it 
> turns out I had two copies of ntpd in different locations.
> Using the ./configure options from the Redhat source does not put ntpd into 
> the /usr/sbin directory as with their build, they must be patching the paths 
> somewhere as well.
> 
> Having properly cleared the old files & rebuilt again, I am getting slightly 
> better results, but it's still not locking to the GPS.
> 
> After around an hour:
> # ntpq -c peers
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> xGPS_NMEA(0)     .GPS.            0 l   18   64  377    0.000  -482.37 160.720
> +gate.jrw.intra  130.159.196.118  3 u    6   16  377    0.196  131.924  19.370
> *mail.alsys.ro   .GPS.            1 u   49   64  377   67.099   25.813  69.507
> +cronos.cenam.mx .GPS.            1 u   24   64  377  217.026  -10.000  92.241
> 
> ntptime
> ntp_gettime() returns code 0 (OK)
>   time c8121481.1f3f8000  Sun, May 14 2006 21:41:37.122, (.122063),
>   maximum error 603208 us, estimated error 39319 us
> ntp_adjtime() returns code 0 (OK)
>   modes 0x0 (),
>   offset 30763.000 us, frequency -432.622 ppm, interval 4 s,
>   maximum error 603208 us, estimated error 39319 us,
>   status 0x1 (PLL),
>   time constant 2, precision 1.000 us, tolerance 496 ppm,
>   pps frequency -1.251 ppm, stability 0.000 ppm, jitter 0.000 us,
>   intervals 0, jitter exceeded 0, stability exceeded 0, errors 0.
> 
> Initially ntptime was giving code 5 (error) & showing frequency 0 ppm.
> At this, the offsets stayed reasonably constant. I had deleted the drift 
> file before starting so it would not be upset by the previous problems.
> 
> After it changed to code 0 & status 0x1, it started drifting badly and did a 
> 'jump' of half a second at about 40 minutes.
> 
> 
> Robert Jenkins.
> 
> 
> 
> 
> 

It's more than just "not locking to the GPS".  ntpd has rejected the GPS 
clock as a "falseticker", meaning its time disagrees with the other 
sources; that's the meaning of the "x" in the first column.  The ntpq 
display looks odd in a few other ways.

Are you actually using a server in Romania (.ro) and another in Mexico? 
Why?  Your servers should, ideally, be close to you in net space.  Net 
space is not the same as geographical space; distances in net space can 
be longer but are seldom shorter than in geographical space.  I'm having 
difficulty thinking of a place that is close to both Mexico and Romania.

Are you setting your clock as closely as possible to the correct time 
before starting ntpd?  Or, better yet, starting ntpd with the -g switch 
so that ntpd will set the clock once at startup, however large the offset?

It appears that you have set MINPOLL for server gate.jrw.intra to a 
value of four.  Why?  The default values of MINPOLL and MAXPOLL are 
correct for most situations.
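
For reference, the "poll" column in the ntpq output is in seconds, while 
MINPOLL and MAXPOLL in ntp.conf are exponents of two.  A poll of 16 
therefore corresponds to an exponent of 4, and the default MINPOLL of 6 
gives 64 seconds.  A minimal sketch of the arithmetic in Python (the 
function name poll_exponent is only illustrative, not part of any NTP 
tool):

```python
def poll_exponent(seconds: int) -> int:
    """Return the minpoll/maxpoll exponent for a poll interval in seconds.

    Raises ValueError if the interval is not a positive power of two.
    """
    exp = seconds.bit_length() - 1
    if seconds <= 0 or (1 << exp) != seconds:
        raise ValueError(f"{seconds} s is not a power-of-two poll interval")
    return exp


if __name__ == "__main__":
    # Poll values taken from the "poll" column of the ntpq output above.
    for poll in (16, 64):
        print(f"poll {poll:>4} s -> exponent {poll_exponent(poll)}")
```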

The servers appear to have wildly differing opinions as to what time it 
is: the offsets range over well over 100 milliseconds.  This is not 
good.  If the servers don't have a clue, your clock probably won't 
either.  Add servers until you find at least four that show reasonably 
close agreement.
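
To put a number on the disagreement, something along these lines could 
be run over the ntpq output.  This is only a rough sketch in Python: it 
assumes the usual ten-column "ntpq -c peers" layout with the tally 
character in the first column, and the helper names (parse_peers, 
offset_spread) are made up for illustration.

```python
# Peer billboard as quoted earlier in this thread (offsets in milliseconds).
PEERS = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
xGPS_NMEA(0)     .GPS.            0 l   18   64  377    0.000  -482.37 160.720
+gate.jrw.intra  130.159.196.118  3 u    6   16  377    0.196  131.924  19.370
*mail.alsys.ro   .GPS.            1 u   49   64  377   67.099   25.813  69.507
+cronos.cenam.mx .GPS.            1 u   24   64  377  217.026  -10.000  92.241
"""


def parse_peers(text):
    """Parse ntpq peer lines into dicts; header and separator lines are skipped."""
    peers = []
    for line in text.splitlines():
        if not line:
            continue
        tally, fields = line[0], line[1:].split()
        if len(fields) != 10:
            continue  # separator or unexpected layout
        try:
            delay, offset, jitter = (float(f) for f in fields[7:10])
        except ValueError:
            continue  # header line: the columns are names, not numbers
        peers.append({
            "tally": tally,
            "remote": fields[0],
            "type": fields[3],
            "delay": delay,
            "offset": offset,
            "jitter": jitter,
        })
    return peers


def offset_spread(peers, peer_type="u"):
    """Largest minus smallest offset (ms) among peers of the given type."""
    offsets = [p["offset"] for p in peers if p["type"] == peer_type]
    return max(offsets) - min(offsets) if offsets else 0.0


if __name__ == "__main__":
    peers = parse_peers(PEERS)
    for p in peers:
        print(f"{p['tally']} {p['remote']:<18} offset {p['offset']:9.3f} ms")
    print(f"spread among network servers: {offset_spread(peers):.3f} ms")
```

With the figures quoted above, the three network servers alone are 
spread over roughly 142 ms, before the GPS is even considered.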

I would look for servers closer to home, wherever that might be. 
Ideally you should have values of delay less than 20 milliseconds.  I 
would suggest a minimum of four servers in addition to your GPS, at 
least until you get things working.

Finally, do you have software that will monitor the Garmin GPS receiver 
and report things like the number of satellites it sees, the signal 
strength for each, the position in the sky for each, etc?  Such software 
can tell you many useful things and give you an idea of how well things 
are working.
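
Short of a full monitoring program, even a small script can confirm 
that the receiver's sentences are intact and that it reports a valid 
fix.  The following is a minimal sketch in Python, not a substitute for 
proper GPS software: it checks the NMEA checksum (the XOR of the 
characters between "$" and "*") and reads the time, date and status 
flag from a $GPRMC sentence such as the ones quoted earlier.  It says 
nothing about satellite count or signal strength, which $GPRMC does not 
carry.  The function names are illustrative only.

```python
from datetime import datetime, timezone


def nmea_checksum_ok(sentence: str) -> bool:
    """True if the two hex digits after '*' match the XOR of the sentence body."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    body, _, given = sentence[1:].partition("*")
    calc = 0
    for ch in body:
        calc ^= ord(ch)
    try:
        return calc == int(given[:2], 16)
    except ValueError:
        return False


def parse_rmc(sentence: str):
    """Return (utc_datetime, fix_valid) for a $GPRMC sentence, or None if unusable."""
    if not nmea_checksum_ok(sentence):
        return None
    fields = sentence.strip()[1:].partition("*")[0].split(",")
    if fields[0] != "GPRMC" or len(fields) < 10:
        return None
    try:
        # Field 9 is ddmmyy, field 1 is hhmmss (fractional seconds ignored).
        stamp = datetime.strptime(fields[9] + fields[1][:6], "%d%m%y%H%M%S")
    except ValueError:
        return None
    # Field 2 is the status flag: "A" means a valid fix, "V" is a warning.
    return stamp.replace(tzinfo=timezone.utc), fields[2] == "A"


if __name__ == "__main__":
    # First sentence from the minicom capture quoted earlier in this thread.
    line = "$GPRMC,073153,A,5319.0516,N,00106.9355,W,000.0,000.0,140506,004.0,W*76"
    print(parse_rmc(line))
```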



