[ntp:questions] move_fd() causing bad behavior on AIX5.3

Harlan Stenn stenn at ntp.org
Tue Nov 20 22:17:53 UTC 2007


>>> In article <5cec609c-87a9-4477-8b70-6d9aaa18bf8b at l22g2000hsc.googlegroups.com>, brandon.phillips at lmco.com writes:

brandon> Oops, 614 is what I meant.  I did see that it is closed, but the
brandon> comments on it left me with some doubt about whether it had been
brandon> verified on AIX5 specifically.

VERIFIED means the person who opened the issue has verified that the fix we
committed works for them.

Most of the code in NTP is general code, so if we fix something in that area
of the code it works everywhere.

If there is a problem in an OS-specific area, when we get a bugfix and that
problem is marked as VERIFIED, it means that the fix worked.

There have been (rare) cases where OS-specific issues differ between
releases of an OS.

Unless we can actually work on these cases ourselves (and often we need to
work with the kernel or system-library engineers to do so) we really can't
make much progress.  When this happens we just wait, as either the vendor
will fix things or we'll get a patch from somebody who is in a better
position to work on the problem than we are.

brandon> ...  We were interested in NTPv4 for the
brandon> ability to really always slew (tinker step 0) since our software is
brandon> allergic to time stepping.  We may abandon the idea though, due to
brandon> lack of confidence in the maturity of NTPv4 on AIX5.

The general NTPv4 code is very well tested.  There are definitely
AIX-specific areas of the code that we cannot easily test, but we would only
know about problems if somebody opened a ticket on them, and the odds are
that a fix will appear sooner if somebody is able to dig into the problem
and find the fix.
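
For reference, the always-slew setup mentioned above is a one-line ntp.conf
fragment.  This is a minimal sketch, not a tested AIX configuration; the
comments reflect the documented behavior of the tinker command as I
understand it, so check them against the documentation for the release you
build.

```
# ntp.conf fragment (sketch): never step the clock, always slew.
# A step threshold of 0 means step adjustments never occur.
# Assumption to verify for your release: the kernel time discipline
# is disabled when the step threshold is set to 0.
tinker step 0
```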

Please note that there are cases where we can work around kernel problems.

An example would be in configure.ac, near line 3843, where we avoid a kernel
FLL bug in certain patch levels of Solaris 2.6 and 2.7.

>> And if you can shed any light on bugs 135, 309, 598, or 716, that would
>> be swell, too.

brandon> 135, 716: still exist.  I had to explicitly compile away IPv6
brandon> support to resolve the issue.

At least this lets me know that the "disable IPv6" code is working well
enough!

brandon> 309: I don't think our setup would hit this so can't comment.

brandon> 598: This is interesting since it also complains about the xntpd
brandon> IBM ships (which we currently use as well).  We have had some
brandon> issues with the clocks jumping back to 0:00...1970 after reboots; I
brandon> believe we finally convinced IBM there was a problem.  The other
brandon> issues discussed in this bug I am not sure about; we'll have to
brandon> investigate and see if they are contributing to time-related
brandon> peculiarities.

I'd appreciate any help you can offer in resolving any of these issues, and
I'm happy to help in any way I can as well.

H



