
I am having periodic ntp synchronisation problems. ntp doesn't directly log anything, but I am using nagios3 to track its synchronisation and I get problems several times a day.

I can't figure out what is causing the loss of synchronisation. Perhaps a burst of dropped ntp packets? But wouldn't ntp log something about this?

Any suggestions on logging more info to find the cause, or deeper insights into what is going wrong, would be appreciated.

Andrew

Here's the output of commands when things are bad:

    config(0)# check_ntp_peer -H 127.0.0.1 -w 1.0 -c 2.0
    NTP WARNING: Server has the LI_ALARM bit set, Offset 0.210925 secs|offset=0.210925s;1.000000;2.000000;

LI_ALARM apparently means not in sync???

    config(1)# ntpq -c rl
    associd=0 status=c618 leap_alarm, sync_ntp, 1 event, no_sys_peer,
    version="ntpd 4.2.6p2@1.2194-o Sun Oct 17 13:35:13 UTC 2010 (1)",
    processor="x86_64", system="Linux/2.6.32-5-amd64", leap=11, stratum=3,
    precision=-23, rootdelay=95.696, rootdisp=263.117, refid=192.189.54.33,
    reftime=d2bccf2f.4854b34f  Sun, Jan 15 2012 15:06:07.282,
    clock=d2bcd1d6.e9ffea4c  Sun, Jan 15 2012 15:17:26.914, peer=16519, tc=10,
    mintc=3, offset=0.000, frequency=500.000, sys_jitter=35.804,
    clk_jitter=0.000, clk_wander=91.828

It thinks it's in error by 16s???

    config(0)# ntpdc -c kerninfo
    pll offset:           0 s
    pll frequency:        500.000 ppm
    maximum error:        16 s
    estimated error:      16 s
    status:               4041  pll unsync mode=fll
    pll time constant:    10
    precision:            1e-06 s
    frequency tolerance:  500 ppm

ntptime gives the same info:

    config(0)# ntptime
    ntp_gettime() returns code 5 (ERROR)
      time d2bcd2be.08d3a000  Sun, Jan 15 2012 15:21:18.034, (.034479),
      maximum error 16000000 us, estimated error 16000000 us
    ntp_adjtime() returns code 5 (ERROR)
      modes 0x0 (),
      offset 0.000 us, frequency 500.000 ppm, interval 1 s,
      maximum error 16000000 us, estimated error 16000000 us,
      status 0x4041 (PLL,UNSYNC,MODE),
      time constant 10, precision 1.000 us, tolerance 500 ppm,

Then mysteriously everything is okay:

    config(0)# ntpdc -c kerninfo
    pll offset:           0.00998 s
    pll frequency:        500.000 ppm
    maximum error:        1.6291 s
    estimated error:      0.004771 s
    status:               0001  pll
    pll time constant:    10
    precision:            1e-06 s
    frequency tolerance:  500 ppm

My leap becomes none (no leap_alarm) and things are ok?

    config(0)# ntpq -c rl
    associd=0 status=0618 leap_none, sync_ntp, 1 event, no_sys_peer,
    version="ntpd 4.2.6p2@1.2194-o Sun Oct 17 13:35:13 UTC 2010 (1)",
    processor="x86_64", system="Linux/2.6.32-5-amd64", leap=00, stratum=3,
    precision=-23, rootdelay=95.272, rootdisp=983.007, refid=192.189.54.33,
    reftime=d2bcd33b.bbc10580  Sun, Jan 15 2012 15:23:23.733,
    clock=d2bcd852.ec6be1b9  Sun, Jan 15 2012 15:45:06.923, peer=16519, tc=10,
    mintc=3, offset=13.497, frequency=500.000, sys_jitter=7.251,
    clk_jitter=4.772, clk_wander=151.809
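
For anyone comparing the two states, the hex status words in the output can be decoded by hand. Below is a minimal Python sketch, assuming the usual layout of the ntpq system status word (2 bits leap indicator, 6 bits sync source, 4 bits event count, 4 bits last event code) and the `STA_*` flag values from Linux `<sys/timex.h>`. The function names `decode_system_status` and `decode_kernel_status` are made up for illustration and are not part of any ntp tool.

```python
# Leap indicator names as printed by ntpq.
LEAP_NAMES = {0: "leap_none", 1: "leap_add_sec", 2: "leap_del_sec", 3: "leap_alarm"}

# Subset of the kernel timex status flags (values assumed from <sys/timex.h>).
STA_FLAGS = {
    0x0001: "PLL",
    0x0008: "FLL",
    0x0010: "INS",
    0x0020: "DEL",
    0x0040: "UNSYNC",
    0x0080: "FREQHOLD",
    0x2000: "NANO",
    0x4000: "MODE",
}


def decode_system_status(word):
    """Split the 16-bit status word from `ntpq -c rl` into its fields."""
    return {
        "leap": LEAP_NAMES[(word >> 14) & 0x3],  # top 2 bits
        "source": (word >> 8) & 0x3F,            # 6 = sync_ntp
        "event_count": (word >> 4) & 0xF,
        "event_code": word & 0xF,                # 8 = no_sys_peer
    }


def decode_kernel_status(status):
    """List the names of the known kernel status flags set in `status`."""
    return [name for bit, name in sorted(STA_FLAGS.items()) if status & bit]


if __name__ == "__main__":
    for word in (0xC618, 0x0618):    # bad state, good state
        print(hex(word), decode_system_status(word))
    for status in (0x4041, 0x0001):  # bad state, good state
        print(hex(status), decode_kernel_status(status))
```

Under these assumptions, the only field that differs between `c618` and `0618` is the leap indicator (alarm versus none), and the only kernel flags that differ between `4041` and `0001` are `UNSYNC` and `MODE`. The last event is `no_sys_peer` in both states.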