[ntp:questions] Re: Quasi-On_topic: System (kernel) time jumps 3600 seconds at random times. Stumped.

Richard B. Gilbert rgilbert88 at comcast.net
Wed May 18 20:31:37 UTC 2005


elickd at one.net wrote:

>I know this really isn't on topic for the NTP newsgroup, but I believe
>the people who frequent this group have a better understanding of
>system time on Unix boxes than anyone else I've encountered thus far.
>
>I've been battling a problem for months now where the system time on
>random SCO 5.02/5.0.5/5.0.7 machines jumps + or - 3600 seconds from
>true GMT.
>
>Here are the clues I've been working with
>
>1) Cron runs "cat -s /dev/clock >/dev/null 2>&1 || exit
>0;/etc/setclock `date +\%m\%d\%H\%M\%y`" every day at 1 AM and 3 AM. As
>best as I can tell, the above command is a very ugly way to check for
>the existence of /dev/clock and then set the RTC to the system time,
>corrected for DST when applicable.
>
>
>2)  Cron runs a script every morning at 3:15 AM that based on the
>machine name, once a week checks to see that the system time isn't off
>by more than 5 minutes using "ntpdate" (and includes logic to error out
>the script if it is) and then sets the system time with (another)
>"ntpdate -s ${SERVER} >/dev/null 2>&1", where ${SERVER} is the name of
>a single NTP server that's been verified alive, out of a pool of
>several.  ***I didn't write this and disagree with the whole
>philosophy of setting system time only once a week using "ntpdate", but
>this is the way it is right now.***
>
>2.5) Our timeserver "chain" isn't particularly stable.
>
>3) At 7 a.m., cron runs a small script that checks the system time
>against our time servers, logs the difference and sends the results to
>a master server for proactive monitoring purposes.
>
>4) There's anecdotal evidence that said 1 hour errors increase in
>frequency following either DST change, but not on the exact day.
>
>5) The 3600 second jumps seem to occur more frequently on the day (but
>not at the exact same moment) that system time is updated by the script
>described in #2. Sometimes the error is caught by our support dept.
>after the 7 a.m. log (#3) and never shows up in ANY logs.  An example is
>a location that noticed their time was off by an hour around 10 p.m.
>(their time) and had things corrected by support shortly thereafter.
>
>6) Sometimes the RTC agrees with the incorrect system clock and other
>times it displays the correct time for that Zone and a 3600 second
>error between itself and system time.
>
>7) The TZ variable is verified correct (for their location) on every
>machine I've had a problem with.
>
>For a while, I had the idea that the "system" that calculates the DST
>time change has a bug in it, but the GMT time the kernel keeps is being
>"whacked", not just what "date" reports.
>
>The ntp daemon does a wonderful job of keeping trouble systems
>in check, but currently I am not in a position to implement it on the
>2500 odd machines I'm responsible for.
>
>Is there any simple way I could log when the system time is adjusted by
>an hour (+/- 120 seconds or so) to determine what is causing my
>problems?  Or is there a simple way I can detect what processes are
>attempting to adjust system time?
>
>Right now my back is against the wall; any ideas are welcome,
>
>Doug
>
>  
>
One more thought. . . .   On every O/S that I'm familiar with, you need 
root/administrative privileges to set the clock.   So, it's unlikely 
that Joe User or one of his applications is doing it.   It's something 
in the O/S itself, and almost certainly running as root.

Ok, I lied... two more thoughts.  Can you patch the low-level
service that sets the time?  (I've never worked with SCO and have no
idea what it is!)  If you can, install a patched version on one or
more of your troublesome machines that logs the PID, UID and GID of
every process that calls it, along with the time of the call and the
operation requested.  That should give you a clue as to the identity
of the guilty party!
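
As a rough illustration of what such a log entry might contain, here is a
minimal sketch in Python.  It is not SCO-specific: the names audit_record
and log_call, the log path, and the "op" label are all hypothetical, and
it assumes a wrapper can be placed in front of whatever actually sets the
time.  It records only the fields mentioned above, plus the parent PID
and the size of the requested change.

```python
import os
import time

def audit_record(operation, requested_epoch=None, now=None):
    """Build one log line describing who asked to change the clock.

    operation       -- free-form label for the request (hypothetical)
    requested_epoch -- the time the caller wants to set, if known
    now             -- current epoch time; defaults to time.time()
    """
    if now is None:
        now = time.time()
    fields = [
        time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now)),
        "pid=%d" % os.getpid(),
        "ppid=%d" % os.getppid(),   # the parent is often the real culprit
        "uid=%d" % os.getuid(),
        "gid=%d" % os.getgid(),
        "op=%s" % operation,
    ]
    if requested_epoch is not None:
        # Signed size of the requested step, in whole seconds.
        fields.append("delta=%+d" % round(requested_epoch - now))
    return " ".join(fields)

def log_call(path, operation, requested_epoch=None):
    """Append one audit line to the log file at `path`."""
    with open(path, "a") as log:
        log.write(audit_record(operation, requested_epoch) + "\n")
```

A wrapper would call log_call() before handing the request on to the real
time-setting program, so the entry is written even if the call that
follows misbehaves.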


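If patching the service is not practical, the original question (logging
when the time moves by an hour, +/- 120 seconds or so) could also be
approached from the outside.  The sketch below is Python, and it assumes
an interpreter and a monotonic clock are available on the machine, which
may well not be true on an older SCO box.  It compares wall-clock time
against a clock that cannot be stepped and reports when the two diverge
by roughly an hour.  The names classify_jump and watch, and the
10-second polling interval, are illustrative choices only.

```python
import time

HOUR = 3600        # size of the jump being hunted, in seconds
TOLERANCE = 120    # the "+/- 120 seconds or so" from the question

def classify_jump(wall_delta, mono_delta, hour=HOUR, tolerance=TOLERANCE):
    """Return the clock step in seconds if it is about +/- one hour.

    wall_delta -- change in wall-clock time between two samples
    mono_delta -- change in monotonic time between the same samples
    Returns None when the step is not within `tolerance` of one hour.
    """
    step = wall_delta - mono_delta
    if abs(abs(step) - hour) <= tolerance:
        return step
    return None

def watch(interval=10, log=print):
    """Poll forever, logging any step of roughly one hour."""
    last_wall, last_mono = time.time(), time.monotonic()
    while True:
        time.sleep(interval)
        wall, mono = time.time(), time.monotonic()
        step = classify_jump(wall - last_wall, mono - last_mono)
        if step is not None:
            # The timestamp is the wall clock as read *after* the step.
            stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(wall))
            log("%s clock stepped by %+.0f s" % (stamp, step))
        last_wall, last_mono = wall, mono
```

This only narrows down when the jump happened, to within one polling
interval.  It does not say which process caused it, so it complements the
logging idea rather than replacing it.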
