[ntp:questions] Strange NTP problem on AMD Geode LX cards.
David Hawkins
david.j.hawkins at btinternet.com
Sat Oct 3 00:58:03 UTC 2009
Hi
I'm using a number of XTX form factor AMD Geode LX (500 MHz) cards at work.
(I cannot get to news at work, and have left the memory stick with the
details at work, so apologies for any missing info!)
They are running SuSE Linux from a read-only flash drive, and are all
identical clones apart from host names and IP addresses.
Most of the time ntp runs with no problems and will lock to a local server
with less than 5 ms offset, and the drift file comes out at between about
-20 and -40 ppm.
But now and again a system will not get a stable lock, and on investigation
the drift file is at the limit of -500 ppm.
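
To put those drift values in perspective, here is a minimal Python sketch
that converts a drift file value in ppm into the clock error it would cause
per day if left uncorrected, and flags a value sitting at ntpd's 500 ppm
limit. The function names, the 1 ppm margin and the sample strings are
illustrative assumptions, not taken from the systems described here.

```python
# ntpd will not correct frequency errors larger than 500 ppm.
MAX_DRIFT_PPM = 500.0


def parse_drift(text):
    """Return the first number found in drift-file text, in ppm."""
    return float(text.split()[0])


def seconds_per_day(ppm):
    """Clock error accumulated over one day at the given frequency error."""
    return ppm * 1e-6 * 86400


def is_pinned(ppm, margin=1.0):
    """True if the value is within `margin` ppm of the 500 ppm limit.

    The margin is an arbitrary choice for this sketch.
    """
    return abs(ppm) >= MAX_DRIFT_PPM - margin


# Sample values only; on a real system the text would be read from the
# drift file itself.
for sample in ("-28.000", "-500.000"):
    ppm = parse_drift(sample)
    print(f"{ppm:+.3f} ppm = {seconds_per_day(ppm):+.2f} s/day, "
          f"pinned={is_pinned(ppm)}")
```

A value of 500 ppm corresponds to about 43 seconds per day, so a drift file
pinned there suggests the real frequency error may be larger than ntpd is
able to correct.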
When I first encountered this I assumed it was a hardware problem with the
processor card, just a one-off, but I have now seen this on around 10 of
the 30 or so systems I have tested.
When a system shows this fault, power cycling the unit will almost always
solve it: the unit synchronises to the server after a couple of hours with
a drift file value of -20 to -40, like the others.
I'm more of a hardware engineer than software, but have now run out of
things to look at to solve this problem.
I have considered / done the following:
* The drift file is stored in the RAM drive /dev/shm, so it always starts at
00.000 when the system is started.
* On a system that is not locking, stopping ntp, setting the drift file to
-28 and restarting results in the drift going back to -400 over a couple of
hours, so it is not some odd start-up state that confuses the control loop.
* The processor card uses a PCI clock generator capable of spread spectrum
output. This is always enabled and not controllable from the BIOS; the chip
has two settings, off and on with a -0.5% spread. I have verified with a
spectrum analyser that cards with good lock and cards with bad lock both
have the spread spectrum option enabled.
* The cards seem to be more likely to exhibit the problem when they have
been turned off for a day or so.
* Power saving modes of the processor are enabled, but I understand that the
timing is done using the counter timer in the Geode companion chip, which
runs at a constant 14.13 MHz regardless of the power state. Also, as all the
systems are running exactly the same code, why would some have problems and
not others?
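
One way to test the assumption about which timer is in use is to ask the
kernel directly. The sketch below reads the clocksource entries that Linux
exposes under sysfs. Whether these files exist depends on the kernel
version, so treat the path as an assumption for these particular systems;
the function name is made up for illustration.

```python
from pathlib import Path

# Standard sysfs location on kernels that expose clocksource information.
# Assumed, not confirmed, to be present on the systems discussed here.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksource(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksource names.

    Returns (None, []) if the sysfs files cannot be read.
    """
    try:
        current = (base / "current_clocksource").read_text().strip()
        available = (base / "available_clocksource").read_text().split()
    except OSError:
        return None, []
    return current, available


if __name__ == "__main__":
    current, available = read_clocksource()
    print("current:", current)
    print("available:", " ".join(available))
```

Running this on a unit that locks and on one that does not, straight after
boot, would show whether both really use the same timer even though the
software image is identical.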
Sorry for the rather random thoughts, but I have now run out of things to
look at. Have you ever seen a problem like this and, even better, found a
solution?
Dave