[ntpwg] Re: [ntp:hackers] Re: peer flash bits

David L. Mills mills at udel.edu
Sun Mar 13 20:14:15 UTC 2005


Danny,

Herewith the draft test plan. Actually, it was cobbled together from 
notes as I checked and rechecked for correct operation and recovery from 
various error scenarios. Eventually, somebody else should red-team the 
plan to check for my own fumblefingers.

The current flow charts are on the NTP project page under the briefings 
(PowerPoint) header. I'm not going to write anything until the logic has 
been verified and aligned with the code and vice versa. The last time I 
did this with NTPv3 it took a year to align things up and somebody else 
was doing the actual coding. This beast with all the modes, 
cryptographic means and hazards is really a very large, very complex 
real-time machine. As I did over twenty years ago with EGP (rfc904), I 
have had to explore every state, every event and every action of the 
three state machines in the beast. The hardest has been finding and 
fixing as best I can scenarios where a determined hacker could hijack 
the time, jam or demobilize a legitimate association or find a potential 
security loophole in the specification.

The most critical review needed now is the Autokey document at 
http://www.eecis.udel.edu/~mills/database/reports/stime/stime.pdf. I 
would welcome suggestions, corrections and objections.

Dave

mayer at gis.net wrote:

>Dave,
>
>I'd like to get beyond this. I'm not convinced that the flash bits are
>important
>since they are only visible externally from the server if you
>specifically ask
>for them for that specific association. In any case it has more to do
>with
>internal state than a protocol issue.
>
>I'd love to start looking at the spec, but we, at least I, need pointers
>to the specific documentation. Can you provide a set of pointers of docs
>that need review and need to be forwarded into the IETF WG as work
>items?
>I haven't looked at the test plan and I think you only sent it to a few
>people and not the mailing lists.
>
>Danny
>----- Original Message Follows -----
>
>>Harlan,
>>
>>The ntpd doesn't return anything except raw status bits and raw 
>>vegetables. It does no interpret the bits or provide flash tool tips. 
>>That's the function of ntpq in whatever form. See the control and 
>>monitoring protocol in the appendix to rfc1305. That extends to the 
>>traps, status bits and other data defined in that appendix. The NTPv4 
>>spec is unchanged from the NTPv3 spec in that area, at least for the 
>>present. The present might morph with no apology.
>>
>>Can we go beyond the flasher issue, which is not a specification
>>issue?  Can we get on with the spec itself? Has anybody looked at the 
>>flowcharts? Are there comments on the weekly reports I have made to
>>the  hackers group? Has anybody comment on the test suite draft? Has
>>anybody  comment on the algorithms, protocol and architecture? Is
>>there any  progress on the Autokey specification?
>>
>>Dave
>>

-------------- next part --------------
NTP Test Plan - preliminary

Basic Functionality Tests


This assumes you know how to build and configure current NTPv4. You will need two test machines A and B plus an NTP server S already synchronized to another source. All should be on the same fast test network, such as a 10-Mb Ethernet. You will not need to change anything on server S.

1. First verify client/server mode works and the client does back off the poll interval in both reachable and unreachable conditions. Create configuration files for A and B containing only the lines

disable kernel
driftfile /etc/ntp.drift
server S 

If your kernel includes the precision clock routines, use the ntptime program to set the frequency to zero:

ntptime -f 0

Start A and B and verify both synchronize to S within five minutes. Confirm that the initial poll interval is 6 (64 s) and after some hours increases to 10 (1024 s). Verify the stratum of A and B is one higher than S.
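
The poll interval is reported as a power-of-two exponent, so 6 means 64 s and 10 means 1024 s. The following is a minimal Python sketch of that conversion, handy when reading the peer variables. The function name is illustrative only and is not part of NTP.

```python
def poll_seconds(exponent: int) -> int:
    """Convert an NTP poll exponent to an interval in seconds (2**exponent)."""
    if exponent < 0:
        raise ValueError("poll exponent must be non-negative")
    return 1 << exponent


# The back-off this step expects to observe: 6 (64 s) up to 10 (1024 s).
for exponent in range(6, 11):
    print(exponent, poll_seconds(exponent))
```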

Once each hour during this test the intrinsic frequency offset is written to /etc/ntp.drift. At the conclusion of the test, copy and rename this file for use later.

Stop A and watch the peer variables in B. The clock filter should shift samples out of the filter for each poll sent and the dispersion should increase to 16. Verify that after about ten minutes B declares A unreachable and over the next hour backs off the poll interval in stages to 10 (1024 s). Start A again and verify B again synchronizes to A within about ten minutes.
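
The "about ten minutes" figure can be sanity-checked with a small sketch. It assumes the 8-bit reachability shift register described in rfc1305, shifted once per poll sent, with the peer declared unreachable when the register reads zero. The function names are made up for illustration, and the real daemon does more than this.

```python
def shift_reach(reach: int, got_reply: bool) -> int:
    """Shift the 8-bit reachability register left by one poll."""
    return ((reach << 1) | (1 if got_reply else 0)) & 0xFF


def polls_until_unreachable(reach: int) -> int:
    """Count unanswered polls needed before the register reads zero."""
    polls = 0
    while reach != 0:
        reach = shift_reach(reach, got_reply=False)
        polls += 1
    return polls


reach = 0xFF  # octal 377: the last eight polls were all answered
polls = polls_until_unreachable(reach)
print(polls, "polls,", polls * 64, "s at poll interval 6")
```

At a poll interval of 64 s, eight unanswered polls come to a little under nine minutes, which is in line with the figure quoted in this step.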

Note the values for the leap and stratum variables before B synchronizes to A, after B synchronizes to A for the first time and again at the second time. Before the first time the leap should be 11 and stratum 16. After synchronization leap should be 00 and stratum 2. Neither variable should change after that; only the dispersion should increase after A is stopped and then resume the values it had during the first period of synchronization.
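
If the variables are captured as text, the leap and stratum check can be automated. The sketch below assumes the readout is a list of name=value pairs separated by commas, which is how ntpq usually prints them. The parser is deliberately naive (it does not handle quoted values containing commas) and both function names are hypothetical.

```python
def parse_vars(text: str) -> dict:
    """Parse 'name=value' pairs separated by commas or newlines."""
    pairs = {}
    for item in text.replace("\n", ",").split(","):
        name, sep, value = item.strip().partition("=")
        if sep:
            pairs[name] = value.strip()
    return pairs


def sync_state(variables: dict) -> str:
    """Classify a host as synchronized or not from leap and stratum."""
    leap = int(variables["leap"], 2)    # leap is shown as two binary digits
    stratum = int(variables["stratum"])
    if leap == 0b11 or stratum >= 16:
        return "unsynchronized"
    return "synchronized"


before = parse_vars("leap=11, stratum=16")
after = parse_vars("leap=00, stratum=2")
print(sync_state(before), sync_state(after))
```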

2. Next verify symmetric key cryptography works, which is necessary in broadcast and multicast modes. Roll symmetric keys using the ntp-keygen program. Install a copy as /etc/ntp.keys in both A and B. Add the following lines to the configuration file for both A and B:

keys /etc/ntp.keys
trustedkey 5

On B, edit the server command in the configuration file so that B uses A as its server, authenticated with key 5:

server A key 5

Start A and B at the same time and verify A synchronizes to S within five minutes and that during this time B does not synchronize to A. Verify that after ten minutes B does synchronize to A.

Change the trusted key 5 in the B configuration file to something other than 5. Stop and restart B. Verify B does not synchronize to A. When the test is done, restore the trusted key 5.
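
To see why an untrusted key ID makes B reject A, here is a rough sketch of a symmetric-key check. It assumes the keyed MD5 message digest used by the NTPv4 reference implementation, where the MAC is a 4-byte key ID followed by the MD5 digest of the key concatenated with the packet header. The key material, the all-zero header and the function names are made up for illustration.

```python
import hashlib
import hmac
import struct


def make_mac(key_id: int, key: bytes, header: bytes) -> bytes:
    """Return key ID (4 bytes) plus MD5(key || header) (16 bytes)."""
    digest = hashlib.md5(key + header).digest()
    return struct.pack("!I", key_id) + digest


def check_mac(trusted: dict, header: bytes, mac: bytes) -> bool:
    """Accept the packet only if its key ID is trusted and the digest matches."""
    (key_id,) = struct.unpack("!I", mac[:4])
    key = trusted.get(key_id)
    if key is None:
        return False  # key ID is not on the trusted list
    expected = hashlib.md5(key + header).digest()
    return hmac.compare_digest(expected, mac[4:])


header = bytes(48)                   # placeholder 48-byte NTP header
keys = {5: b"hypothetical-key"}      # made-up key material
mac = make_mac(5, keys[5], header)
print(check_mac(keys, header, mac))            # key 5 is trusted
print(check_mac({6: keys[5]}, header, mac))    # key 5 is not trusted
```

The second call models the last part of this step: the packet is signed correctly with key 5, but the receiver no longer trusts that key ID, so the packet is refused.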

3. Next, verify that the iburst and burst protocol works. Add the iburst keyword after the server address in the A and B server commands. Restart both A and B and verify A synchronizes to S within 20 s and B synchronizes to A within 20 s after that. Verify that after B synchronizes to A the B poll interval changes to 6 (64 s) and backs off to 10 (1024 s) as in (1) above.

4. Next verify broadcast mode works. Add the following line to the A configuration, where BCST is the broadcast address for the local Ethernet.

broadcast BCST key 5

Remove the server A line in the B configuration and add a line containing only the word broadcastclient. Restart both A and B and wait until A synchronizes to S. This requires a brief volley with the server to measure the propagation delay. Verify the interval between messages sent by B is not less than two seconds. Verify A sends broadcast messages at intervals of 64 s. Now start B and verify it synchronizes to A within 20 s after receiving a broadcast message from A. Verify the stratum of B is one greater than A.
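
The volley is an ordinary timestamp exchange. As a reminder of what it measures, the sketch below uses the standard NTP on-wire calculation of clock offset and round-trip delay from the four timestamps of one exchange. The function name and the sample numbers are illustrative only.

```python
def offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    """Compute clock offset and round-trip delay from one exchange.

    t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive (all in seconds).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Made-up example: the client clock is 0.5 s behind the server and the
# one-way propagation delay is 0.25 s in each direction.
offset, delay = offset_and_delay(t1=10.0, t2=10.75, t3=10.75, t4=10.5)
print(offset, delay)
```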

Stop and restart ntpd in A. Verify B eventually recovers once A has resynchronized to S.

Change the broadcastclient line to:

broadcastclient novolley

Stop and restart B. Verify B synchronizes to A within five minutes, but does not send anything to A.

Stop A, but leave B running. Remove the key 5 entry. Restart A, which is now sending unauthenticated broadcasts. Verify these broadcasts do not affect B. This behavior is designed to deflect bait-and-switch attacks.

Stop both A and B. Remove the key 5 entry in the broadcast command. Restart both machines. Verify that B does not synchronize to A.

Stop B. Insert the line

disable auth

in the configuration file. Restart B. Verify that B does synchronize to A.

Stop A. Replace the key 5 entry in the broadcast line. Restart A. Verify that B continues to operate, but now in authenticated mode. You may wish to verify B continues to work if B switches back to unauthenticated mode.



Change the trusted key 5 in the B configuration file to something other than 5. Stop and restart B. Verify B does not synchronize to A. When the test is done, restore the trusted key 5.

5. Next verify symmetric modes work. Remove the broadcast line from the A configuration file. Replace the broadcastclient line in the B configuration file with the line

peer A key 5

Verify that A mobilizes an association for B within 64 s and B synchronizes to A at a stratum one greater than A within five minutes. Verify that A does not include B in the survivor list, as this would form a timing loop. Verify the poll interval increases from 6 (64 s) to 10 (1024 s) in both A and B, but this may take a while.

Verify the peer protocol recovers after a peer restart. With the protocol stabilized and the clock filters full, rudely restart B and verify that eventually the associations are demobilized and correctly recover. Then do the same thing but restart A instead of B. The clank and bang will be something like a pinball machine, but the protocol should eventually recover.

6. Next verify the clock discipline operation. Configure A and B as in step (1) and ensure the machines are synchronized to S within a few milliseconds. Be sure to start ntpd with the -g option. Using the Unix date command, set the A clock 20 years in the future and the B clock 20 years in the past. After about 17 minutes (stepout interval), each machine should take a step correction to the current time, purge the clock filter and resynchronize within five minutes.
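
The behavior expected here can be pictured with a greatly simplified decision sketch. The threshold values below are assumptions based on commonly documented ntpd defaults, not figures from this plan, and the observed time before the step may be somewhat longer than the stepout value because a step can only happen at a poll update. How and when the -g allowance is consumed is not modeled, and the function and parameter names are hypothetical.

```python
STEP_THRESHOLD = 0.128     # seconds; assumed default
STEPOUT = 900.0            # seconds; assumed default
PANIC_THRESHOLD = 1000.0   # seconds; assumed default


def discipline_action(offset: float, seconds_over_step: float,
                      allow_panic: bool) -> str:
    """Pick the action for one clock update (simplified model)."""
    magnitude = abs(offset)
    if magnitude > PANIC_THRESHOLD and not allow_panic:
        return "exit"   # offset too large and no -g allowance
    if magnitude <= STEP_THRESHOLD:
        return "slew"   # small offsets are amortized gradually
    if seconds_over_step < STEPOUT:
        return "wait"   # large offset, but stepout has not expired
    return "step"       # large offset persisted: step the clock


twenty_years = 20 * 365.25 * 86400   # seconds
print(discipline_action(twenty_years, 0.0, allow_panic=True))
print(discipline_action(-twenty_years, 1000.0, allow_panic=True))
print(discipline_action(twenty_years, 0.0, allow_panic=False))
```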

Again configure A and B as in step (1), but add the keyword

minpoll 6 maxpoll 6

to the server configuration line in both A and B. Ensure the machines are synchronized to S within a few milliseconds and the frequency has settled down. Using the Unix date command set the time ahead on A and behind on B by 100 ms. Watch the clock offset as it is amortized by the clock discipline algorithm. It should decrease to and cross zero in about 3000 s, then overshoot about 5 percent and return to zero in several hours. Verify that each doubling of the poll interval doubles the zero crossing time.
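
The doubling rule gives a quick table of zero-crossing times to compare against. This sketch takes the approximate 3000 s figure at poll interval 6 from this step and scales it by powers of two. The results are rough expectations, not exact predictions, and the function name is illustrative only.

```python
BASE_POLL = 6            # poll exponent used in this step (64 s)
BASE_CROSSING = 3000.0   # seconds; approximate figure quoted in this step


def expected_zero_crossing(poll_exponent: int) -> float:
    """Scale the zero-crossing time by 2 for each doubling of the poll interval."""
    return BASE_CROSSING * 2 ** (poll_exponent - BASE_POLL)


for exponent in range(6, 11):
    print(exponent, expected_zero_crossing(exponent))
```

To test other poll intervals, set minpoll and maxpoll to the same value, as was done above for 6.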

Configure A and B as in step (2) so A is synchronized to S and B to A. Let the test run until A and B are synchronized within a few milliseconds, the frequency has settled down and the poll intervals have backed off to 10. Before doing this, restore the original values in the A and B /etc/ntp.drift files. This will dramatically shorten test times.

Stop A and edit the /etc/ntp.drift file to contain a value 20 higher or lower than the value in the original file. This simulates the computer clock frequency change, which might be due to a sudden temperature change, for example. Then restart A and ensure it synchronizes to S within a few milliseconds. Watch the poll interval on B. It should notice after a few polls that the server experienced a frequency surge and drop the poll interval in order to better track it. Verify this does happen and, when the transient dies out, the poll intervals return to 10.
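
The drift file edit is easy to script. The sketch below assumes the first field of the file is the frequency offset as a decimal number and returns the new file contents as a string. Reading and writing /etc/ntp.drift is left to the tester, and the function name is hypothetical.

```python
def perturb_drift(text: str, delta: float) -> str:
    """Return new drift-file contents with the frequency shifted by delta."""
    fields = text.split()
    value = float(fields[0]) + delta
    return f"{value:.3f}\n"


original = "12.345\n"                  # made-up drift file contents
print(perturb_drift(original, 20.0), end="")
print(perturb_drift(original, -20.0), end="")
```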

7. Next verify recovery after extreme time and frequency transients. Configure A and B as in step (1) and ensure the machines are synchronized to S within a few milliseconds. These tests may take some time, so if you have additional test hosts, now is the time to use them. Put +500 in the ntp.drift file on A and -500 in the ntp.drift file on B. Start A and B and watch while the offset and frequency are corrected. It may take some time and a couple of step adjustments for the machines to converge again.

Extreme Operation Envelope Tests

8. These use the simulator. Verify the clock discipline correctly stabilizes under extreme conditions, such as when the panic, step and stepout thresholds are set to extreme values. Verify that operation is stable, but with a constant offset, if the oscillator frequency is set above 500 PPM. Verify the discipline is stable with extreme values of jitter and wander. Verify the state machine operates correctly with respect to the transition table.

Details TBD

Additional Authentication Tests

Mitigation Algorithms Tests

TBD

Reference clock tests

Configure a reference clock as the only source. Disable the kernel discipline with the command

disable kernel

in the configuration file. Verify the server comes up with leap bits set 00 and stratum 1. Disconnect the antenna so that the radio loses synchronization. Verify the driver recognizes this and sets the dispersion accordingly. Verify the leap bits do not change and that the stratum eventually changes to 16. Reconnect the antenna and verify the radio comes back online. Power down or unplug the radio. Verify that the association eventually times out and clears, with the result that the leap bits are set to 11 and the stratum to 16. Reconnect the radio and verify the association remobilizes as before.

PPS tests

Configure a reference clock as one source and the atom driver (type 22) as another source. Include a prefer keyword in the server command of the reference clock. Disable the kernel discipline with the command

disable kernel

Verify first that the reference clock association comes up (* as the tally character) and then that the PPS comes up (o as the tally character for the PPS and + for the reference clock). Verify the server stratum is one greater than the reference clock.

Disconnect the PPS signal. Verify the PPS association drops off (blank tally) and the reference clock comes back (* tally).
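
When logging the peer billboard during these tests, the tally character in the first column can be decoded with a small lookup. The sketch covers only the four codes mentioned in this plan. The sample lines are made up and show just the start of a billboard line.

```python
TALLY = {
    "*": "system peer",
    "o": "PPS peer",
    "+": "candidate",
    " ": "not selected",
}


def tally_of(billboard_line: str) -> str:
    """Return the meaning of the tally character in column one."""
    code = billboard_line[:1] or " "
    return TALLY.get(code, "other")


# Hypothetical line prefixes: PPS connected, then PPS disconnected.
for line in ("oPPS(0)", "+REFCLOCK(0)", " PPS(0)", "*REFCLOCK(0)"):
    print(repr(line[:1]), tally_of(line))
```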

Reconnect the PPS signal, wait for it to stabilize, then disconnect the reference clock. Verify the reference clock association times out, followed by the PPS association.

Include the line

fudge 127.127.22.0 stratum 0

in the configuration file and restart the daemon. Once the server is synchronized to the reference clock and PPS signal, verify the server shows stratum 1.

Disconnect the PPS signal. Verify the server reverts to the reference clock and shows stratum one greater than the reference clock.
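
The address in the fudge line follows the reference clock convention 127.127.t.u, where t is the driver type and u is the unit number, so 127.127.22.0 is unit 0 of the type 22 atom driver. A small sketch for checking such addresses in a test script follows. The function name is hypothetical.

```python
def parse_refclock(address: str):
    """Split a 127.127.t.u pseudo-address into (driver type, unit)."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or octets[:2] != [127, 127]:
        raise ValueError(f"not a reference clock address: {address}")
    return octets[2], octets[3]


driver_type, unit = parse_refclock("127.127.22.0")
print(driver_type, unit)
```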

Autokey tests

Configure A and B as in (2). Roll keys and certificates for A and B. Configure the OpenSSL library and the .rnd seed file.  Include the line

crypto randfile <file>

in the configuration file. Specify the location of the .rnd file as necessary. Run the same tests as in the symmetric key tests. Note, however, that when A is restarted the first message sent by B causes A to send a crypto_NAK and B to restart the protocol.

