[ntp:questions] Linux NTP Kernel unsync flag remains long after NTP&Kernel have PLL sync

Darryl Miles darryl-mailinglists at netbauds.net
Thu Aug 28 15:36:31 UTC 2008


Steve Kostecke wrote:
> On 2008-08-26, Darryl Miles <darryl-mailinglists at netbauds.net> wrote:
> 
>> Unruh wrote:
>>
>>> A far better idea is to monitor the offset from the ntp servers to
>>> let you know if there is a clock problem.
>> I'd appreciate a tool for that. "/usr/sbin/ntpdc -check
>> 0.0000:0000:0000 -print"
> 
> ntpq is the preferred monitoring tool.
> 
> ntpq -c"rv 0 offset" will tell you the current offset of your ntpd.

So it is the "offset" that I should look at! Thanks, I wasn't sure.

I guess an offset of 0.0000 is perfect?

Now how do I tell the difference between an offset reported as 0.0000
because there is no sync and an offset reported as 0.0000 because the
sync is perfect?
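
One possible way to separate the two cases is to read the leap indicator
and stratum alongside the offset, since an unsynchronised ntpd normally
reports leap=11 and stratum=16. The following is a minimal sketch only:
the sample text is illustrative rather than captured from a real system,
the helper names parse_rv and is_synchronized are hypothetical, and it
assumes that "ntpq -c rv" prints comma-separated name=value pairs with
leap shown as two binary digits.

```python
import re

# Illustrative text shaped like 'ntpq -c "rv 0"' output (an assumption,
# not captured from a real system).
SAMPLE_UNSYNCED = "leap=11, stratum=16, offset=0.000, frequency=0.000"
SAMPLE_SYNCED = 'version="ntpd example", leap=00, stratum=3, offset=0.000'


def parse_rv(text):
    """Return the name=value pairs found in the text as a dict of strings."""
    return dict(re.findall(r'(\w+)=("[^"]*"|[^,\s]+)', text))


def is_synchronized(rv):
    """True when the leap indicator and stratum both indicate a synced clock.

    Missing variables are treated as "not synchronized".
    """
    leap = int(rv.get("leap", "11"), 2)      # two binary digits; 11 = alarm
    stratum = int(rv.get("stratum", "16"))   # 16 = unsynchronized
    return leap != 3 and 0 < stratum < 16


for sample in (SAMPLE_UNSYNCED, SAMPLE_SYNCED):
    rv = parse_rv(sample)
    print(rv["offset"], "synced" if is_synchronized(rv) else "NOT synced")
```

Both samples carry offset=0.000, so the offset alone cannot tell them
apart; only the leap and stratum values do.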

I'm trying to establish that whoever creates such a
tool/script/whatever, one that accepts my simple bound requirements,
has taken into account all the failure scenarios that I, or the
community, can think of.

The next step is to make it really easy for an administrator to ask the
NTP subsystem to report on its own well-being with respect to its
primary function.


How about implementing a new ntpq/ntpdc command, "summary", with simple
"key: value" output and simple "WARNING:", "ERROR:" and "FATAL:"
reporting of concerns, ending with a line such as
"Overall Status: GOOD"?

Another new command, verify/check, could test accuracy against a bounds
specification and report what is inside that bound and what is outside
it.
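
To make the idea concrete, here is a rough sketch of what such a summary
and bounds check might produce. Nothing like this exists in ntpq or
ntpdc; the function name summarize, the thresholds, the key names and
the statuses other than GOOD are all hypothetical. The offset is assumed
to be in milliseconds, as ntpq reports it.

```python
def summarize(offset_ms, stratum, leap, warn_ms=10.0, error_ms=100.0):
    """Return 'key: value' report lines for one reading.

    warn_ms and error_ms are a hypothetical bounds specification:
    |offset| above warn_ms is a WARNING, above error_ms an ERROR.
    leap == 3 or stratum >= 16 is taken to mean "not synchronized".
    """
    lines = [
        "leap: %d" % leap,
        "stratum: %d" % stratum,
        "offset_ms: %.3f" % offset_ms,
    ]
    overall = "GOOD"
    if leap == 3 or stratum >= 16:
        lines.append("FATAL: clock is not synchronized")
        overall = "BAD"
    elif abs(offset_ms) > error_ms:
        lines.append("ERROR: offset exceeds %.3f ms" % error_ms)
        overall = "BAD"
    elif abs(offset_ms) > warn_ms:
        lines.append("WARNING: offset exceeds %.3f ms" % warn_ms)
        overall = "DEGRADED"
    lines.append("Overall Status: %s" % overall)
    return lines


print("\n".join(summarize(offset_ms=0.250, stratum=2, leap=0)))
```

Checking the sync state before the offset means that an unsynchronised
clock reporting an offset of 0.000 is flagged as FATAL rather than
passing as GOOD.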


Then all that is left is to publish a paragraph in a man page with a
few examples of possible "bound requirement specifications" and what
they might mean to a system in real life.

A system admin wanting to monitor their NTP and kernel clock state
(as judged by NTP) would then only need to consult the documentation
and copy'n'paste from an example. That would be ideal.


>> that takes various parameters for your acceptable accuracy and returns
>> with zero/non-zero exit status. That might also dump data like
>> adjtimex -print and indicate items of concern to the administrator.
> 
> Collecting information from all those sources is the job for a script.

No problem with the mechanism, but it's a job for an NTP grokker, and
maybe something to be shipped as part of the NTP suite, i.e. not
something a system administrator wants to make up on the spot and so
easily get wrong.
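
For illustration, the kind of zero/non-zero exit status wrapper being
discussed could look roughly like the sketch below. It is an assumption
laden example, not a vetted tool: the function names, the exit code
convention (0 = within bound, 1 = offset out of bound, 2 = not
synchronized or ntpd could not be queried) and the parsing of
"ntpq -c rv" output as comma-separated name=value pairs are all
hypothetical choices. The "run" parameter exists only so the check can
be exercised without a live ntpd.

```python
import re
import subprocess


def read_system_variables(run=subprocess.run):
    """Run 'ntpq -c "rv 0"' and return its name=value pairs as a dict."""
    done = run(["ntpq", "-c", "rv 0"],
               capture_output=True, text=True, timeout=10)
    return dict(re.findall(r'(\w+)=("[^"]*"|[^,\s]+)', done.stdout))


def check(max_offset_ms, run=subprocess.run):
    """Return 0, 1 or 2 following the hypothetical convention above."""
    try:
        rv = read_system_variables(run)
        leap = int(rv["leap"], 2)        # assumed to be two binary digits
        stratum = int(rv["stratum"])
        offset_ms = float(rv["offset"])  # ntpq reports milliseconds
    except (OSError, subprocess.SubprocessError, KeyError, ValueError):
        return 2                         # no usable answer from ntpq
    if leap == 3 or stratum >= 16:
        return 2                         # not synchronized
    if abs(offset_ms) > max_offset_ms:
        return 1                         # synchronized but outside the bound
    return 0
```

A cron job or monitoring agent could then call something like
sys.exit(check(50.0)) and act on the exit status alone.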


Darryl


