[ntp:questions] Re: Clock drift problems

Ulrich Windl Ulrich.Windl at RZ.Uni-Regensburg.DE
Tue Feb 3 11:45:08 UTC 2004


Christopher Browne <cbbrowne at acm.org> writes:

> Oops! allancady at yahoo.com (Allan Cady) was seen spray-painting on a wall:
> > Disclaimer:  I'm asking this question about a Linux box, which I use
> > only as a client of a couple of web-based applications and as a file
> > server.  I know almost nothing about Linux, or Unix.  I'm a Windows
> > guy.  So please go easy on me!
> 
> [Grumble, grumble, let's get out the pots of boiling oil...  :-).]
> 
> > The problem is, on this Linux machine (Red Hat, I don't know what
> > version), the real-time clock has gotten 7 hours slow.  The guys

I suspect the RTC is set to local time, as DOS-ish operating systems
expect, while Linux assumes it holds UTC and then applies the local-time
offset on top of it. "date -u" should show what Linux believes UTC to be.
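
A rough way to test that guess, sketched in Python: compare the size of
the observed error with the machine's offset from UTC. The function
names and the 15-minute tolerance are my own assumptions for
illustration, not anything taken from ntp or Red Hat.

```python
from datetime import datetime, timezone


def utc_offset_hours(when=None):
    """Return this machine's configured offset from UTC, in hours."""
    when = when or datetime.now(timezone.utc)
    # astimezone() with no argument converts to the system's local zone.
    return when.astimezone().utcoffset().total_seconds() / 3600


def matches_zone_offset(clock_error_hours, zone_offset_hours,
                        tolerance_hours=0.25):
    """True if the clock error is about as large as the zone offset."""
    gap = abs(abs(clock_error_hours) - abs(zone_offset_hours))
    return gap <= tolerance_hours


if __name__ == "__main__":
    # -7.0 is the "7 hours slow" figure reported in the question.
    print(matches_zone_offset(-7.0, utc_offset_hours()))
```

If this prints True on the affected box, an RTC kept in local time but
read as UTC is a more likely explanation for the 7 hours than drift.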

Ulrich

> > here who administer this machine seem to be stumped about what to do
> > about it.  (This boggles my mind.)  There are two problems: how to
> > get it back where it belongs, and how to prevent it from getting out
> > of sync in the future.
> 
> > The explanation I'm given about why they can't just "set the clock"
> > is that there are applications that would wig out if the clock all
> > of a sudden changes by 7 hours.  I can understand that, but things
> > like NTP are supposed to be able to deal with this kind of thing by
> > adjusting the clock slowly over a period of time.  I don't know the
> > details, and I certainly don't know how to set this up on Linux.
> > And I don't know if it's capable of handling such a gross
> > correction.
> 
> NTP is NOT capable of coping with such a gross correction over time;
> it "gives up" if it finds things more than 1000 seconds off.  The
> problem is that if it goes with small incremental adjustments, it
> could readily take WEEKS to adjust by 7 hours.
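
To put a number on the slow-adjustment case, here is a
back-of-the-envelope sketch. It assumes ntpd slews at no more than
500 ppm (its documented ceiling) and that this full rate is sustained
for the whole correction, which is optimistic. The function name is
made up for illustration.

```python
MAX_SLEW_PPM = 500  # assumed ceiling: 500 microseconds corrected per second


def slew_duration_seconds(offset_seconds, slew_ppm=MAX_SLEW_PPM):
    """Seconds of wall time needed to slew away the given offset."""
    return abs(offset_seconds) * 1_000_000 / slew_ppm


if __name__ == "__main__":
    offset = 7 * 3600  # the 7-hour error from the question
    days = slew_duration_seconds(offset) / 86400
    print(f"about {days:.0f} days of continuous slewing")
```

Under those assumptions the 7-hour offset takes far longer than weeks
to slew away, so stepping the clock is the only practical route.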
> 
> There is going to have to be some form of "outage" on the machine, as
> a result.  The simplest answer may be to get appropriate NTP hosts
> into /etc/ntp.conf and /etc/ntp/step-tickers (the latter is needed in
> order to get the initial sync that overcomes the "off by 7h" problem),
> and see about rebooting the system.
> 
> Ideally, that shouldn't be necessary; if they shut down things like
> database applications, that may suffice to prevent apps from "wigging
> out."  Shut down the "at risk" applications and services, restart ntp
> (the command is "/etc/init.d/ntpd restart"), and restart the other
> apps.
> 
> > The explanation I'm given about why the clock is losing time so
> > badly in the first place (about 15 minutes a week), is that it
> > happens when we do our weekly backups to DVD-ROM; something is
> > locking out the hardware interrupt that makes the clock work.  Is
> > this "normal"?  They claim it's nothing to do with Linux, that it
> > would happen with Windows too.  I've never seen anything like this
> > happen on Windows... DOS maybe, but that was 15 years ago.  This is
> > a Dell PowerEdge 1600 machine, less than a year old.
> 
> Yeah, this is something of a "known issue."  When the system bus gets
> taken over by DMA, that certainly can block clocks' access.  Various
> Unixes have suffered from similar things over the years, although it
> is usually just that the clock gets jittery, not that it outright
> dies.
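
For scale, the reported loss can be turned into a frequency error and
compared with what ntpd can absorb by slewing. This is only a sketch:
it treats the loss as if it were spread evenly over the week (the
question says it happens during backups), and the 500 ppm limit and
the function name are assumptions stated here, not measurements.

```python
MAX_SLEW_PPM = 500  # assumed ntpd slew ceiling, in parts per million


def drift_ppm(lost_seconds, interval_seconds):
    """Average clock drift over the interval, in parts per million."""
    return lost_seconds * 1_000_000 / interval_seconds


if __name__ == "__main__":
    weekly = drift_ppm(15 * 60, 7 * 86400)  # 15 minutes lost per week
    print(f"average drift: {weekly:.0f} ppm")
    print("within slew range" if weekly <= MAX_SLEW_PPM
          else "beyond slew range; expect ntpd to step the clock")
```

An average near 1500 ppm is roughly three times the assumed ceiling,
so NTP alone would not hide lost interrupts of this size; the cause of
the lost ticks would still need fixing.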
> 
> > Given my ignorance of Linux, it's hard for me to ask specific "how do
> > you do this" questions... for starters, I'm mostly looking for a
> > general opinion of whether this problem is really as confounding as my
> > buddies think it is.  It may get to questions of "how can we configure
> > NTP to do the gross correction without breaking applications", and "is
> > there any way to fix the system so that the clock doesn't drift when
> > we're doing backups", but to start with, I'd just like to know if I
> > can believe the guys who are telling me there's nothing we can do
> > about it.  Or if maybe I can point them somewhere... tell them, "read
> > the man on ...".
> >
> > Or maybe we should be asking Dell for help with this?  
> 
> You'd have to find someone at Dell that knows about hardware clocks as
> well as NTP, which probably isn't anyone you'll be able to speak with
> :-(.
> 
> Configuring and running NTP is the right answer.  The thorny bit will
> be finding an opportunity to get the system outage.
> 
> Of course, if any of the applications care what time it is, they're
> broken, and so they _need_ the outage, like it or not...
> 
> You should probably visit comp.protocols.time.ntp; there may be
> further insights there...
> -- 
> output = ("aa454" "@" "freenet.carleton.ca")
> http://www.ntlug.org/~cbbrowne/ntp.html
> :FATAL ERROR -- VECTOR OUT OF HILBERT SPACE


