[ntp:hackers] Improved monlist scheme - a bit off topic
Michael.Wouters at csiro.au
Mon Sep 15 16:33:53 PDT 2003
Dean,
Our NTP servers, like U Delaware's, were (and still are) being
pounded (6000 NTP requests/s per NTP server when blocked) by a router
manufactured by SMC Networks. Like U Delaware,
the addresses of our NTP servers were hard-coded with no way to change
them. The manual for the routers did not even mention this "feature".
SMC Networks have fixed this in their firmware but the problem is of course
convincing all those home users to upgrade.
Regarding the decreasing number of public servers, we too have taken our
servers off the public list but they are still available to the public by
prior arrangement. Nearly all of our users are now hosts servicing
large networks, the way things were meant to work.
Back to monlist ...
We've always monitored traffic on our NTP servers very closely, with
daily stats reports etc. Because of the limitations of monlist,
I ended up writing my own software to collect
stats via a packet sniffer and then do offline analysis on another computer.
This worked fine at traffic levels of up to 500 packets/s on a P266 but
thrashed at a few thousand per second.
Cheers
Michael Wouters.
-------------------------------------------
Dr Michael Wouters
Time and Frequency Section
National Measurement Laboratory
CSIRO Division of Telecommunications and Industrial Physics
PO Box 218, Lindfield NSW 2070
Sydney Australia
(street address: Bradfield Rd, Lindfield NSW 2070)
Ph 61 2 9413 7268
Fax 61 2 9413 7202
--------------------------------------------
> -----Original Message-----
> From: Dean Gibson (NTP Administrator) [mailto:ntp at ultimeth.com]
> Sent: Tuesday, 16 September 2003 3:23 AM
> To: David L. Mills; hackers at ntp.org
> Subject: Re: [ntp:hackers] Improved monlist scheme
>
>
> So, as I understand this:
>
> 1. With a low "discard monitor <integer>" value, the ratio will be
> high, and the behavior will be more similar to the present LRU.
>
> 2. With a high "discard monitor <integer>" value, the ratio will be
> low, and the more likely it is that the first 600 clients are retained.
>
> 3. With a given "discard monitor <integer>" value, the busier the
> server, the lower the ratio will be, and the more likely it is that
> the first 600 clients are retained.
>
> 4. A "high impact" client that is not presently on the monlist will
> have a higher probability of getting on the list.
>
> Sounds ok. However, I believe the present monlist scheme records
> clients _before_ they are passed through the restriction list. After
> reading your eMail below, I initially thought that this order would be
> a bad idea, because it means that blocking abusive clients needs to be
> done externally (otherwise the monlist just fills up with the bad
> actors). However, upon reflection, I think the present order is fine,
> because it allows NTP detection of unauthorized clients not on the
> restriction list, without analyzing a packet filter log (which on our
> system is running over 50MB/week from virus-compromised hosts).
>
> You may recall that a couple years ago I suggested a
> (non-probabilistic) threshold scheme, which simply rejected polls from
> clients that polled more often than the threshold. Of course, my
> scheme required a monlist big enough to hold all the abusers; yours is
> much more adaptable to heavy usage and/or abuse.
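
A minimal sketch of a fixed-threshold scheme of the kind described above, for illustration only. The class name, the `accept` method and the 64-s default are hypothetical and are not taken from ntpd or from the original proposal. The unbounded dictionary shows why such a scheme needs a table large enough to hold every abuser.

```python
class ThresholdLimiter:
    """Reject polls from any address that polls more often than a fixed interval.

    Hypothetical illustration; not ntpd code.
    """

    def __init__(self, min_interval_s=64.0):
        # Assumed default; the original message does not give a value.
        self.min_interval_s = min_interval_s
        # One entry per source address, never evicted: the table must be
        # big enough to hold every client, abusers included.
        self.last_seen = {}

    def accept(self, address, now):
        """Return True if this poll should be served, False if rejected."""
        previous = self.last_seen.get(address)
        self.last_seen[address] = now
        if previous is None:
            return True  # first poll from this address
        return (now - previous) >= self.min_interval_s


limiter = ThresholdLimiter(min_interval_s=10)
first = limiter.accept("192.0.2.1", now=0)    # first poll is served
second = limiter.accept("192.0.2.1", now=5)   # 5 s later: too soon
third = limiter.accept("192.0.2.1", now=20)   # 15 s after the previous poll
```

The decision here is deterministic per client, which is the contrast with the probabilistic scheme in the quoted message below.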
>
> On a related note, you may recall that our NTP server was originally
> "time.ultimeth.net", but we dropped that hostname (and removed it from
> DNS) three years ago. However, we found that DNS queries for that
> hostname were _increasing_ over time. Further, some clients were
> querying DNS as frequently as once per second, even when that hostname
> was mapped to 127.0.0.1. We found that our abandoned hostname (along
> with a bunch of others from the ntp.org web page listing) was
> permanently ensconced in a Windows freeware program named "NetTime"
> (which also allowed the user to set the NTP polling interval as low as
> one second). So, we sent out warning notices to Internet friends two
> months ago, and effective Aug 30, we removed all DNS servers from the
> "ultimeth.net" domain, effectively dropping the domain from all
> Internet usage. Or so we thought, as the TTL for all hosts was 8 hours
> (the domain expiry field for secondary servers was 7 days).
>
> Here we are, fifteen days later, and our DNS servers are still getting
> numerous (unanswered) DNS queries for "time.ultimeth.net". I believe,
> but cannot prove, that this same "NetTime" program, which was written
> without _any_ understanding of NTP polling practices, was also written
> without any understanding of DNS practices (eg, caching) as well. I
> believe that these hosts that are still making DNS queries for an
> abandoned domain name, will drop off the radar as they are rebooted
> (being Windows clients, that shouldn't take too long). My opinion is
> that this "NetTime" client is the main culprit in the heavy usage of
> most NTP public servers (not to mention DNS), and is probably the
> culprit in the massive abuse of Peter Fisk's NTP servers in Australia.
>
> When I first added our NTP server to the list of public NTP servers
> four years ago, I thought that with the explosion of the Internet,
> eventually hundreds of sites would volunteer to be stratum-1 and -2
> servers. Sadly, I now note that this has not happened, and so the
> existing set of servers is required to carry more and more of the
> load. Once Washington State University dropped off the list, that left
> our NTP server as the only stratum-2 server in the Pacific Northwest.
> In my opinion, the NTP network cannot sustain the current combination
> of few NTP servers, ignorant client implementations, and the
> increasing hordes of naive users. Already the Internet community is
> increasingly blocking significant IP address blocks in the third world
> due to spam; soon, the same thing will happen to NTP port 123 as well.
>
> -- Dean
>
> David L. Mills wrote on 2003-09-15 07:49:
> >Guys,
> >
> >Long message. Intended mostly for the diehard detail fanatic.
> >
> >The monlist function available with NTPv4 ntpd and ntpdc has proved a
> >valuable tool to investigate deviant usage patterns, especially
> >accidental or purposeful flooding attacks found at NIST, USNO and U
> >Wisconsin. Unfortunately, the flux of rascals has been way too much
> >for the monlist scheme to capture. At NIST, for example, the 600-entry
> >list fills up in 7 s and at rackety.udel.edu it fills up in 200 s. At
> >least with the latter it was hoped to capture usage patterns to at
> >least 1024 s to evaluate the effectiveness of the poll-adjust
> >algorithm.
> >
> >A simplistic analysis assumes all clients poll at 1024-s intervals. If
> >there are 200 packets/s arriving at NIST, we conclude there are
> >204,800 distinct addresses pounding on time.nist.org, more of course
> >if some clients are more thirsty than others. If the 600-entry monlist
> >list fills up in 7 s, the list represents only a small fraction of
> >this population and will catch only those who drink more often than
> >every 7 s.
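
The arithmetic in the paragraph above can be checked with a short sketch. The function names are hypothetical; the only assumption is the one stated in the message, namely that every client polls exactly once per 1024-s interval.

```python
def estimated_clients(packets_per_second, poll_interval_s=1024):
    """Distinct clients implied by an arrival rate if each polls once per interval."""
    return packets_per_second * poll_interval_s


def list_coverage(list_size, clients):
    """Fraction of the client population that fits in the monitor list at once."""
    return list_size / clients


nist_clients = estimated_clients(200)            # 200 packets/s at 1024-s polls
nist_coverage = list_coverage(600, nist_clients)  # share held by a 600-entry list

print(nist_clients)             # 204800
print(f"{nist_coverage:.2%}")   # 0.29%
```

Under that assumption the 600-entry list holds well under one percent of the population at any moment.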
> >
> >The monlist list sorts distinct source addresses in order of
> >increasing age since the most recent reference, in other words a
> >classic least-recently-used (LRU) queue. Upon each reference an
> >existing duplicate entry is removed and placed first in the list. Upon
> >first occurrence, a new entry is inserted first in the list.
> >Mathematically, this is a classic stack algorithm. In the original
> >implementation, if the list fills up, the oldest entry is removed and
> >the new entry inserted first in the list. This of course violates the
> >stack assumption.
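
The original LRU behaviour described above might be sketched like this. It is an illustration, not ntpd's implementation: the class name and method are hypothetical, and the list is modelled with Python's `OrderedDict`, with the first item being the most recently seen address.

```python
from collections import OrderedDict


class LruMonlist:
    """Fixed-size list of source addresses, most recently seen first.

    Hypothetical illustration of the behaviour described in the message.
    """

    def __init__(self, capacity=600):
        self.capacity = capacity
        self.entries = OrderedDict()  # address -> time of last reference

    def record(self, address, now):
        if address in self.entries:
            # Existing entry: refresh it and place it first in the list.
            self.entries[address] = now
            self.entries.move_to_end(address, last=False)
            return
        if len(self.entries) >= self.capacity:
            # List full: remove the oldest (last) entry to make room.
            self.entries.popitem(last=True)
        # New entry is inserted first in the list.
        self.entries[address] = now
        self.entries.move_to_end(address, last=False)


monlist = LruMonlist(capacity=3)
for t, addr in enumerate(["a", "b", "c", "a", "d"]):
    monlist.record(addr, now=t)

print(list(monlist.entries))  # ['d', 'a', 'c']
```

In the small run above, "b" is the least recently referenced address when "d" arrives, so it is the one removed.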
> >
> >The problem of course is that, if there are too many different source
> >addresses found in too short a time, the oldest entry can be removed
> >long before a new duplicate is received. The behavior is similar to
> >the classic thrashing behavior seen in early virtual memory operating
> >systems which I had intimate experience with.
> >
> >I tried a number of deterministic and stochastic algorithms to deal
> >with this problem until experiencing a brainfizz recalling Peter
> >Denning's Working Set Principle used with effect in modern virtual
> >memory systems. The idea is to estimate for each job the minimum
> >number of pages to keep in memory and avoid running the job unless
> >that many empty pages are in fact available. Implementations I am
> >familiar with use a table-driven scheduler designed to separate the
> >elephants from the mice and run only the elephants that fit for
> >multiple timeslices to avoid paging overhead.
> >
> >I felt a full implementation of this would be something like a
> >ten-ton flyswatter and that a probabilistic approach would be
> >sufficient. So, the scheme that evolved operates like this. If the LRU
> >list has not yet filled up, do the LRU thing as before. If the list
> >has filled up, roll the dice on the unit interval and compare with a
> >discard threshold. Below the threshold continue as before and clobber
> >the oldest entry; otherwise, simply discard the new entry. Come to
> >think of it, this might be an interesting scheme to try on a real
> >operating system. Grad students, you may line up now.
> >
> >The trick is how to set the discard threshold without intricate
> >per-instance analysis. There is now a new state variable in ntpd
> >called the monitor discard parameter set by the
> >"discard monitor <integer>" command, with current default 3000. The
> >discard threshold is computed as the ratio of the current oldest entry
> >age to the monitor discard parameter. In order to more effectively
> >evaluate the scheme, the ntpdc monlist format was changed slightly so
> >that the last column now represents the age of each entry, in seconds.
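
The replacement rule and the threshold ratio described above could be sketched as follows. This is a rough illustration under stated assumptions, not ntpd's code: the class and function names are hypothetical, timestamps are plain seconds, and the random source is injectable only to make the sketch easy to exercise.

```python
import random
from collections import OrderedDict


def discard_threshold(oldest_age_s, monitor_discard=3000):
    """Ratio of the oldest entry's age to the monitor discard parameter."""
    return oldest_age_s / monitor_discard


class ProbabilisticMonlist:
    """LRU list that, once full, admits a new address only with some probability.

    Hypothetical illustration of the scheme described in the message.
    """

    def __init__(self, capacity=600, monitor_discard=3000, rng=None):
        self.capacity = capacity
        self.monitor_discard = monitor_discard
        self.rng = rng if rng is not None else random.Random()
        self.entries = OrderedDict()  # address -> last-seen time, newest first

    def record(self, address, now):
        """Return True if the address is in the list after this call."""
        if address in self.entries:
            # Known address: plain LRU refresh.
            self.entries[address] = now
            self.entries.move_to_end(address, last=False)
            return True
        if len(self.entries) >= self.capacity:
            oldest = next(reversed(self.entries))
            age = now - self.entries[oldest]
            # Roll the dice on the unit interval: below the threshold the
            # oldest entry is clobbered, otherwise the new entry is discarded.
            if self.rng.random() >= discard_threshold(age, self.monitor_discard):
                return False
            del self.entries[oldest]
        self.entries[address] = now
        self.entries.move_to_end(address, last=False)
        return True


m = ProbabilisticMonlist(capacity=2, monitor_discard=100)
m.record("a", now=0)
m.record("b", now=0)
rejected = m.record("c", now=0)    # oldest age 0: threshold 0, never admitted
admitted = m.record("c", now=100)  # oldest age 100: threshold 1, always admitted
```

As a worked value, with the default parameter of 3000 and an oldest entry aged 600 s the threshold is 0.2, so on average about one new address in five would displace the oldest entry.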
> >
> >I tried this scheme on two primary (stratum 1) servers, public rackety
> >and private pogo. Pogo serves as control; it currently receives 2.9
> >packets/s, but never fills up the LRU list. Rackety currently receives
> >12.9 packets/s and, with a discard parameter at 3000, keeps about
> >600 s of history in the LRU list. If indeed most of these rascals were
> >operating at 1024 s, this represents some 13,200 spectators. The new
> >scheme captures the most frisky 600 of them operating at poll
> >intervals less than 600 s. This is more than enough to spot nasty
> >pollsters and old NTP implementations that do not effectively elevate
> >the poll interval.
> >
> >I was curious about the good guys operating at 1024 s with rackety, so
> >increased the discard parameter to 7000, just enough to capture the
> >good guys, and something peculiar happened. As expected, there were
> >far more of these than could fit in the 600-entry LRU list, so classic
> >page thrashing was observed. A bunch of 1024-s pollsters was captured
> >and started to get old. When the bunch reached the age of 1024 s, the
> >next few pollsters tossed the bunch off the island and the phenom
> >repeated.
> >
> >This has been a most interesting and revealing exercise. With this
> >scheme it is possible to automatically detect abusers and report to
> >the log for possible blacklisting. Along with the recent call-gap
> >scheme and kiss-o'-death packet, it represents another in the arsenal
> >of defensive schemes designed to protect widely distributed,
> >ubiquitous network services.
> >
> >With the Backroom machines temporarily out of service, I managed to
> >park a tarball ntp-4.1.80-rc1a.tar.gz on
> >ftp.udel.edu/pub/ntp/software. It would be interesting to see how it
> >performs in other very busy servers.
> >
> >Dave
> >_______________________________________________
> >hackers mailing list
> >hackers at ntp.org
> >http://mailman.ntp.org/mailman/listinfo/hackers