[ntp:questions] Re: ntpq times out if NMEA refclock configured?

R Jenkins not at pub.lished
Mon May 15 17:57:28 UTC 2006


"Richard B. Gilbert" <rgilbert88 at comcast.net> wrote in message 
news:d72dnXEvvMabBfXZ4p2dnA at comcast.com...
>R Jenkins wrote:
>
>> "Richard B. Gilbert" <rgilbert88 at comcast.net> wrote in message
>> news:9sqdnZb36qXzoPrZRVn-sg at comcast.com...
>>
>>>R Jenkins wrote:
>>>
>>>>"Richard B. Gilbert" <rgilbert88 at comcast.net> wrote in message
>>>>news:L7udnQ5X_9grBvvZRVn-vA at comcast.com...
>>>>
>>>>
>>>>>R Jenkins wrote:
>>>>>
>>>>>
>>>>>
>>>>>>Hi,
>>>>>>
>>>>>>I'm trying to add a GPS refclock to my server.
>>>>>>After total failure with a basic Trimble TSIP output GPS plus the
>>>>>>parse clock, I'm now using a Garmin GPS25 and the NMEA refclock.
>>>>>>
>>>
>>><big snip>
>>>
>>>>>After rereading a little more carefully, I notice that your frequency
>>>>>correction of -495.9 PPM is on the ragged edge of the 500 PPM limit.
>>>>>It is unusual for a clock to have a frequency error this large; most
>>>>>are below 50 PPM in absolute value.
>>>>>
>>>>>Does your system have a kernel parameter called "HZ"?  Is it set to a
>>>>>value greater than 100?  I believe I have seen references to values of
>>>>>both 250 and 1000; neither value works well with NTPD.  The system
>>>>>seems to lose clock interrupts when HZ is greater than 100.  YMMV but
>>>>>if you are not using 100, give it a try.
>>>>>
>>>>
>>>>Hi,
>>>>thanks for the replies.
>>>>
>>>>The -495.9 ppm seems to be a symptom of the refclock problem. Without
>>>>the NMEA refclock it was -60 after a few minutes, long before it had
>>>>settled properly.
>>>>I think it does have a fast Hz setting (I've seen it somewhere but I
>>>>can't remember where or what it was set to..) However, it's a 3.2GHz
>>>>processor so I don't think it should struggle too much.
>>>>
>>>>
>>>>I have the PPS pulse set to 200 ms.
>>>>The PC does not normally have a display; I use telnet (well, SSH) from
>>>>my desk.
>>>>Running minicom at 4800 Baud with NTPD stopped shows the GPS serial data
>>>>is present:
>>>>$GPRMC,073153,A,5319.0516,N,00106.9355,W,000.0,000.0,140506,004.0,W*76
>>>>$GPRMC,073154,A,5319.0516,N,00106.9355,W,000.0,000.0,140506,004.0,W*71
>>>>$GPRMC,073155,A,5319.0516,N,00106.9355,W,000.0,000.0,140506,004.0,W*70
>>>>...
>>>>I'm not sure how to remotely monitor the DCD line.
>>>>
>>>>
>>>>Simply having the 'server 127.127.20.0 prefer' line in causes the ntpq
>>>>hang.
>>>>
>>>>I've just got around to checking the log immediately after starting ntpd:
>>>>
>>>>May 14 08:19:51 gate2 ntpd[28723]: ntpd 4.2.0a at 1.1190-r Sat May 13 10:39:48 BST 2006 (1)
>>>>May 14 08:19:51 gate2 ntpd[28723]: precision = 1.000 usec
>>>>May 14 08:19:51 gate2 ntpd[28723]: Listening on interface wildcard, 0.0.0.0#123
>>>>May 14 08:19:51 gate2 ntpd[28723]: Listening on interface wildcard, ::#123
>>>>May 14 08:19:51 gate2 ntpd[28723]: Listening on interface lo, 127.0.0.1#123
>>>>May 14 08:19:51 gate2 ntpd[28723]: Listening on interface eth0, 192.168.0.43#123
>>>><Other interfaces trimmed>
>>>>May 14 08:19:51 gate2 ntpd[28723]: kernel time sync status 0040
>>>>May 14 08:19:51 gate2 ntpd[28723]: refclock_nmea: time_pps_kcbind failed: Invalid argument
>>>>May 14 08:19:52 gate2 ntpd[28723]: too many recvbufs allocated (40)
>>>>
>>>>It looks like there is some problem with the kernel PPS interface, but I
>>>>have no idea what...
>>>>I used this patch:
>>>>PPSkit-light-alpha-3328m-2.6.15.1.diff
>>>>on a clean download of kernel 2.6.16.9 - there were a couple of rejects,
>>>>but they seemed to be pretty obvious & went in easily by hand..
>>>>
>>>>I'm happy to try another (recent) 2.6 kernel if there is one with a
>>>>known working patch?
>>>>
>>>>Another test: Leaving the 'flag 3 1' out stops the refclock error line
>>>>in the log.
>>>>The 'too many recvbufs allocated (40)' line seems to be triggered by the
>>>>NMEA refclock regardless of any other settings; it does not appear when
>>>>the NMEA clock is commented out in ntp.conf.
>>>>
>>>>Robert Jenkins.
>>>>
>>>>
>>>
>>>
>>>If the HZ setting is causing the problem, it has little to do with
>>>processor speed!!   The problem seems to be that various device drivers
>>>mask or disable interrupts for a period covering two or more clock
>>>interrupts causing one or more to be lost with each occurrence.
>>>
>>>The messages about "too many recvbufs allocated (40)" were associated
>>>with a bug in ntpd that I believe was fixed more than a year ago.  You
>>>might want to try the latest version of ntpd.  You can download it from
>>>http://ntp.isc.org/bin/view/Main/SoftwareDownloads
>>
>>
>> Hi,
>> I can understand the HZ setting messing up the accuracy, but I don't see
>> why it would stop things running altogether.
>> It was at 250 Hz; I'm presently compiling a kernel with it at 100 Hz to
>> see what effect this has.
>>
>> I thought I had tried the latest dev release (as per my last post), but 
>> it turns out I had two copies of ntpd in different locations.
>> Using the ./configure options from the Redhat source does not put ntpd 
>> into the /usr/sbin directory as with their build, they must be patching 
>> the paths somewhere as well.
>>
>> Having properly cleared the old files & rebuilt again, I am getting 
>> slightly better results, but it's still not locking to the GPS.
>>
>> After around an hour:
>> # ntpq -c peers
>>      remote           refid      st t when poll reach   delay   offset  jitter
>> ==============================================================================
>> xGPS_NMEA(0)     .GPS.            0 l   18   64  377    0.000  -482.37 160.720
>> +gate.jrw.intra  130.159.196.118  3 u    6   16  377    0.196  131.924  19.370
>> *mail.alsys.ro   .GPS.            1 u   49   64  377   67.099   25.813  69.507
>> +cronos.cenam.mx .GPS.            1 u   24   64  377  217.026  -10.000  92.241
>>
>> ntptime
>> ntp_gettime() returns code 0 (OK)
>>   time c8121481.1f3f8000  Sun, May 14 2006 21:41:37.122, (.122063),
>>   maximum error 603208 us, estimated error 39319 us
>> ntp_adjtime() returns code 0 (OK)
>>   modes 0x0 (),
>>   offset 30763.000 us, frequency -432.622 ppm, interval 4 s,
>>   maximum error 603208 us, estimated error 39319 us,
>>   status 0x1 (PLL),
>>   time constant 2, precision 1.000 us, tolerance 496 ppm,
>>   pps frequency -1.251 ppm, stability 0.000 ppm, jitter 0.000 us,
>>   intervals 0, jitter exceeded 0, stability exceeded 0, errors 0.
>>
>> Initially ntptime was giving code 5 (error) & showing frequency 0 ppm.
>> At this, the offsets stayed reasonably constant. I had deleted the drift 
>> file before starting so it would not be upset by the previous problems.
>>
>> After it changed to code 0 & status 0x1, it started drifting badly and 
>> did a 'jump' of half a second at about 40 minutes.
>>
>>
>> Robert Jenkins.
>>
>>
>>
>>
>>
>
> It's more than just "not locking to the GPS".  It thinks the GPS clock is 
> "insane"; that's the meaning of the "x" in the first column.  The ntpq 
> display looks odd in a few other ways.
>
> Are you actually using a server in Romania (.ro) and another in Mexico? 
> Why?  Your servers should, ideally, be close to you in net space.  Net 
> space is not the same as geographical space; distances in net space can be 
> longer but are seldom shorter than in geographical space.  I'm having 
> difficulty thinking of a place that is close to both Mexico and Romania.
>
> Are you setting your clock as closely as possible to the correct time before
> starting ntpd?  Or, better yet, starting ntpd with the -g switch so that 
> ntpd will set the clock on a one time basis?
>
> It appears that you have set MINPOLL for server gate.jrw.intra to a value 
> of four.  Why?  The default values of MINPOLL and MAXPOLL are correct for 
> most situations.
>
> The servers appear to have wildly differing opinions as to what time it 
> is.  This is not good.  If the servers don't have a clue, your clock 
> probably won't either.  Add servers until you find at least four that show 
> something like close agreement.
>
> I would look for servers closer to home, wherever that might be. Ideally 
> you should have values of delay less than 20 milliseconds.  I would 
> suggest a minimum of four servers in addition to your GPS, at least until 
> you get things working.
>
> Finally, do you have software that will monitor the Garmin GPS receiver 
> and report things like the number of satellites it sees, the signal 
> strength for each, the position in the sky for each, etc?  Such software 
> can tell you many useful things and give you an idea of how well things 
> are working.

Hi Richard,

The main problem turned out to be not having timex.h linked to where the ntp
source files were expecting it.

The results now are:
 ntpq -c peers
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*GPS_NMEA(0)     .PPS.            1 l   15   64  377    0.000   42.053   0.018
-gate.jrw.intra  130.88.200.98    3 u    1   16  377    0.209   -1.937   0.039
+tigger.lentil.o 195.66.241.3     2 u   28   64  377   38.095    1.135   1.006
+mozart.musicbox 129.6.15.29      2 u   56   64  377   36.672    2.784 101.162
-sites.urchin.ea 80.253.108.112   3 u    8   64  377   22.293    0.692   1.225
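
For anyone reading the peers listing: the first character of each row is the
tally code, and "reach" is an octal shift register covering the last eight
polls. A minimal sketch of decoding one row, assuming the row has been joined
back onto a single line. The helper name parse_peer and the wording of the
TALLY descriptions are illustrative (paraphrased from the ntpq documentation),
not part of ntpq itself:

```python
# Tally codes shown in the first column of "ntpq -c peers".
# Descriptions are paraphrased; see the ntpq documentation for exact wording.
TALLY = {
    " ": "rejected (not valid)",
    "x": "falseticker, discarded by the intersection algorithm",
    ".": "discarded (excess candidate)",
    "-": "outlier, discarded by the cluster algorithm",
    "+": "candidate, included by the combine algorithm",
    "#": "selected but kept as a backup",
    "*": "system peer",
    "o": "PPS peer",
}


def parse_peer(row):
    """Split one peers row (already joined onto a single line) into fields."""
    tally, rest = row[0], row[1:].split()
    remote, refid, st, kind, when, poll, reach, delay, offset, jitter = rest
    return {
        "tally": tally,
        "meaning": TALLY.get(tally, "unknown"),
        "remote": remote,
        "refid": refid,
        "stratum": int(st),
        "poll_s": int(poll),
        # reach is printed in octal; 377 means the last eight polls all succeeded
        "reach_bits": format(int(reach, 8), "08b"),
        "delay_ms": float(delay),
        "offset_ms": float(offset),
        "jitter_ms": float(jitter),
    }


if __name__ == "__main__":
    row = "*GPS_NMEA(0)     .PPS.            1 l   15   64  377    0.000   42.053   0.018"
    peer = parse_peer(row)
    print(peer["remote"], "->", peer["meaning"], "reach", peer["reach_bits"])
```

Run against the earlier listing, the same function reports the "x" row as a
falseticker, which is what Richard meant by the GPS clock being "insane".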

ntptime
ntp_gettime() returns code 0 (OK)
  time c8133c6c.3c2b0000  Mon, May 15 2006 18:44:12.235, (.235031),
  maximum error 44020 us, estimated error 1 us
ntp_adjtime() returns code 0 (OK)
  modes 0x0 (),
  offset 51.000 us, frequency -0.312 ppm, interval 4 s,
  maximum error 44020 us, estimated error 1 us,
  status 0x107 (PLL,PPSFREQ,PPSTIME,PPSSIGNAL),
  time constant 2, precision 1.000 us, tolerance 496 ppm,
  pps frequency -0.312 ppm, stability 0.152 ppm, jitter 1.000 us,
  intervals 2995, jitter exceeded 84, stability exceeded 16, errors 1989.
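
The status word in the ntptime output can be decoded by hand. A minimal
sketch, assuming the STA_* bit values from the kernel's timex.h; decode_status
is an illustrative helper, not part of ntptime, and bits above 0x1000 are left
out:

```python
# Status bits from <sys/timex.h> (STA_* constants, prefix dropped).
# Only the bits up to CLOCKERR are listed here.
STA_BITS = [
    (0x0001, "PLL"),
    (0x0002, "PPSFREQ"),
    (0x0004, "PPSTIME"),
    (0x0008, "FLL"),
    (0x0010, "INS"),
    (0x0020, "DEL"),
    (0x0040, "UNSYNC"),
    (0x0080, "FREQHOLD"),
    (0x0100, "PPSSIGNAL"),
    (0x0200, "PPSJITTER"),
    (0x0400, "PPSWANDER"),
    (0x0800, "PPSERROR"),
    (0x1000, "CLOCKERR"),
]


def decode_status(status):
    """Return the names of all known bits set in a kernel status word."""
    return [name for bit, name in STA_BITS if status & bit]


if __name__ == "__main__":
    # 0x107 is the working PPS run, 0x1 the earlier run without PPS,
    # 0x40 the value in the "kernel time sync status 0040" log line
    # (assuming ntpd prints that value in hex).
    for value in (0x107, 0x1, 0x40):
        print(hex(value), ",".join(decode_status(value)))
```

The useful comparison is 0x1 (PLL only) against 0x107: the three extra bits
show the kernel now sees the PPS signal and is using it for both time and
frequency.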

The servers were (other than the local one) just random pool entries.
I've already switched it to some uk.pool servers.

The other local machine (gate.jrw.intra) is running on good, fixed sources 
with permission from the owners.
It's reasonably accurate so having a small maxpoll gets things locked rather 
quicker, though this will probably not be needed if GPS/PPS setup proves OK.
This new setup will replace the other eventually, and I will then use the 
servers presently configured on that.
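
On the poll column: minpoll and maxpoll are given in ntp.conf as powers of two
in seconds, so the 16 shown for gate.jrw.intra corresponds to an exponent of
4, against ntpd's defaults of 6 (64 s) and 10 (1024 s). A small sketch of the
arithmetic; the function names are just for illustration:

```python
def poll_seconds(exponent):
    """Poll interval in seconds for a minpoll/maxpoll exponent."""
    return 2 ** exponent


def poll_exponent(seconds):
    """Inverse: the exponent for a poll interval that is a power of two."""
    if seconds <= 0 or seconds & (seconds - 1):
        raise ValueError("poll interval must be a positive power of two")
    return seconds.bit_length() - 1


if __name__ == "__main__":
    print("poll 16 s  -> exponent", poll_exponent(16))
    print("default minpoll 6  ->", poll_seconds(6), "s")
    print("default maxpoll 10 ->", poll_seconds(10), "s")
```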

The GPS aerial is in a window at present; if everything looks to be working
well I will change to a roof-mounted one.
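
Separately from the aerial position, the serial data itself can be
sanity-checked. Each NMEA sentence ends in a checksum: the XOR of every
character between the '$' and the '*', written as two hex digits. A minimal
sketch using the $GPRMC lines captured with minicom earlier in the thread;
nmea_checksum and nmea_is_valid are illustrative names, not taken from any
NMEA library or from the ntpd driver:

```python
def nmea_checksum(body):
    """XOR of all characters between '$' and '*' (both excluded)."""
    value = 0
    for ch in body:
        value ^= ord(ch)
    return value


def nmea_is_valid(sentence):
    """True if the sentence looks like $<body>*<hh> and hh matches the body."""
    sentence = sentence.strip()
    body, star, given = sentence.partition("*")
    if not body.startswith("$") or not star or len(given) < 2:
        return False
    try:
        return int(given[:2], 16) == nmea_checksum(body[1:])
    except ValueError:
        # The two characters after '*' were not hex digits.
        return False


if __name__ == "__main__":
    captured = [
        "$GPRMC,073153,A,5319.0516,N,00106.9355,W,000.0,000.0,140506,004.0,W*76",
        "$GPRMC,073154,A,5319.0516,N,00106.9355,W,000.0,000.0,140506,004.0,W*71",
        "$GPRMC,073155,A,5319.0516,N,00106.9355,W,000.0,000.0,140506,004.0,W*70",
    ]
    for line in captured:
        print(line[:13], "ok" if nmea_is_valid(line) else "BAD")
```

A run of bad checksums would point at the serial link (baud rate, cabling)
rather than at ntpd or the PPS patch.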

Regards,
Robert Jenkins.




