[ntp:questions] Architecture / best practice for small/medium company setups
Joachim Schrod
jschrod at acm.org
Thu Jun 29 10:58:10 UTC 2006
Hello,
I would like to pose a few questions on architecture / best practice
on NTP setups for small and medium companies. I have read the
documentation and the Wiki and also googled, but didn't find a
satisfying answer. I'm willing to update the NTP Wiki with a HOW-TO
text that results from this discussion.
The situation:
-- Let's assume a company with 10 to max. 100 computer systems.
Furthermore, let's just talk about Unix systems, for now, and
let's assume that at least five of them are available.
-- There is only one site. All computers are on the same LAN.
-- The company has only one Internet connection, with a typical SLA of
97.5% availability over the year. (For example, that's the SLA of
T-Com for their basic CompanyConnect product here in Germany.) The
worst-case outage of 9 days can be ignored, but we have to cope
with outages lasting several hours.
-- The company has no requirements for extremely accurate
time-synchronization. They run the usual bunch of applications:
Databases, ERP systems, Office systems, and other applications
that access time with a granularity of one second.
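
As a quick check of the 9-day figure: a small sketch of the arithmetic, assuming the SLA is measured over a 365-day year and that all permitted downtime falls into a single outage. The function name is only illustrative.

```python
def max_outage_days(availability, days_per_year=365):
    """Worst-case total downtime per year permitted by an availability SLA."""
    return (1.0 - availability) * days_per_year


days = max_outage_days(0.975)
# Print the downtime budget in days and in hours.
print(f"{days:.3f} days, or {days * 24:.0f} hours")
```

With 97.5% availability this comes to roughly 9 days per year, which is where the worst-case number above comes from.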
I think this is a context that is common to many installations. (Well,
I have seen many such environments. :-) My personal experience is only
in big installations with 1000s of systems and redundant Internet
connections; but I have been asked about such situations a few times
in the past. (The last inquiry was a few days ago, and triggered this
posting.)
The first few questions are about selection of time servers:
How many, and what is their peer structure?
-- I assume that the company should use the NTP server pool, as it's
not a large company with 1000s of computers.
-- How many timeservers on the LAN that are accessed by clients?
Looking at the available documentation, I would recommend four
servers. (This might mean that many of the Unix systems suddenly
are timeservers.) Or would three be sufficient? One server is
surely not sufficient, as an outage of that server would endanger
the whole time synchronization.
I.e., is peering between three servers sufficient to handle the outage
of one server until the repair is done, or does one need four servers
to do that properly?
(An answer may depend on the connection of the timeservers to the
pool, as asked in the next question.) The Wiki recommends four
servers, but I have seen several places where three servers are
deemed sufficient. What's best practice?
-- Connection to the NTP pool:
-- Either all company timeservers access the pool,
-- or one of the timeservers accesses the pool, and the others
synchronize to it,
-- or there is an additional timeserver that accesses the pool
and the company timeservers synchronize to this special server.
The clients don't use this special server.
Since there is only one Internet connection, and since there are
no separate network paths to the pool servers, I have to ask if
it's still reasonable to have several timeservers synchronizing to
the pool. OTOH, if there is only one pool-connected system, what
to do in case of an outage of that system? (Probably promote one of
the other servers to be the Internet-facing system.)
I have no idea about further advantages or disadvantages of these
three design possibilities. I assume that this has to be answered
in conjunction with my next question, on peering. (I bar firewall
and DMZ considerations for the moment, that might recommend the
third solution.)
-- Peering: Which servers peer to each other?
-- If all company timeservers access the pool, I think they are
all peers.
-- But if only one system accesses the pool, does this system
also peer with the others that synchronize to it? That hasn't
been clear from the documentation. The Wiki says that one
should peer all timeservers; but does that include ones that
are at different levels of the stratum hierarchy?
-- Internet connection outages: Just let them happen, or use the
undisciplined local clock at stratum 10 as a backup on the
timeservers?
AFAIK, undisciplined local clocks can cause havoc when the time
strays too far away from the reference time source. Googling that
question got several potential answers, therefore: is it best
practice to use 127.127.1.0 as a backup for the case that no outside
source of synchronized time is available?
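
To make clear which configuration I am asking about, here is a rough sketch that prints the relevant ntp.conf lines for one timeserver. The pool hostnames and the helper name are only illustrative assumptions; the two local-clock lines use the usual `server` / `fudge` form for the 127.127.1.0 driver. This is the setup in question, not a recommendation.

```python
# Sketch only: hostnames and the helper name are illustrative assumptions.
POOL_SERVERS = ["0.pool.ntp.org", "1.pool.ntp.org", "2.pool.ntp.org"]
LOCAL_CLOCK = "127.127.1.0"  # pseudo-address of the undisciplined local clock driver


def render_server_conf(pool_servers, local_stratum=10):
    """Return ntp.conf text: pool sources plus the local clock as a last resort."""
    lines = [f"server {host} iburst" for host in pool_servers]
    # The local clock is listed as one more source, fudged to a high stratum
    # number so that any reachable real source is preferred over it.
    lines.append(f"server {LOCAL_CLOCK}")
    lines.append(f"fudge {LOCAL_CLOCK} stratum {local_stratum}")
    return "\n".join(lines) + "\n"


print(render_server_conf(POOL_SERVERS), end="")
```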
Is there a design decision for the server setup that I missed?
-- Client configuration: Specific servers, or multicast?
Now we have a bunch of timeservers in the company. What is best
practice: That clients are configured to use these servers
specifically, or that multicast mode is used?
Or should one try a manycast configuration?
If one uses multicast or manycast, does this imply that one needs
to establish key-based authentication between servers and clients?
Such small companies usually have no PKI in place, so would this
mean distributing shared secret keys during setup?
Or use Autokey, as explained in the Wiki?
Sorry for the long post. I hope to get some answers, and maybe we can add those
answers to http://ntp.isc.org/bin/view/Support/DesigningYourNTPNetwork. (Or make
a different page, with a specific step-by-step explanation for small/medium company
setups.) I think that page is already very good, but would be further improved
with such information.
Best,
Joachim
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Joachim Schrod Email: jschrod at acm.org
Roedermark, Germany