systemd - dissatisfaction guaranteed


  • Mon 23 September 2019
  • misc

Hating on systemd is widely considered good sport. That said, the rants are rarely backed up with concrete observations about why it's unsatisfactory; mostly they are complaints that it's different from sysvinit.

I've written before about making systemd do what I want, and have had decent success bending it to my will. This is entirely unsurprising since I've got extensive experience with Solaris' SMF and some experience with Apple's launchd.

One of the most widely repeated complaints about systemd is that it pulls things into the init daemon family that have no business being there, directly at odds with the Unix philosophy of having separate, well-developed tools for separate jobs. Rarely, though, does the issue of substandard implementation get called out.

I'm in the process of setting up a bare-metal environment for packet capture, since I'm trying to figure out what's making a certain router sick. The machine is built with Ubuntu 18.04 LTS Server (with a fresh apt-get update && apt-get upgrade). Since I'm going to try to correlate captured traffic with badness in the router's logs, I thought it would be good to make sure time synchronization was working properly. I took a look at systemd-timesyncd on this freshly installed and upgraded machine at home, and here's what I found:

root@bigbrother:~# systemctl status systemd-timesyncd.service 
● systemd-timesyncd.service - Network Time Synchronization
   Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2019-09-22 21:35:18 UTC; 7min ago
     Docs: man:systemd-timesyncd.service(8)
 Main PID: 4434 (systemd-timesyn)
   Status: "Synchronized to time server 172.30.250.126:123 (172.30.250.126)."
    Tasks: 2 (limit: 4915)
   CGroup: /system.slice/systemd-timesyncd.service
           └─4434 /lib/systemd/systemd-timesyncd

Sep 22 21:35:17 bigbrother systemd[1]: Starting Network Time Synchronization...
Sep 22 21:35:18 bigbrother systemd[1]: Started Network Time Synchronization.
Sep 22 21:35:18 bigbrother systemd-timesyncd[4434]: Synchronized to time server 172.30.250.126:123 (172.30.250.126).
Sep 22 21:37:46 bigbrother systemd-timesyncd[4434]: Network configuration changed, trying to establish connection.
Sep 22 21:37:46 bigbrother systemd-timesyncd[4434]: Synchronized to time server 172.30.250.126:123 (172.30.250.126).
Sep 22 21:40:16 bigbrother systemd-timesyncd[4434]: Network configuration changed, trying to establish connection.
Sep 22 21:40:16 bigbrother systemd-timesyncd[4434]: Synchronized to time server 172.30.250.126:123 (172.30.250.126).
Sep 22 21:42:45 bigbrother systemd-timesyncd[4434]: Network configuration changed, trying to establish connection.
Sep 22 21:42:45 bigbrother systemd-timesyncd[4434]: Synchronized to time server 172.30.250.126:123 (172.30.250.126).
root@bigbrother:~# 

Well. That's pretty odd... the network configuration "changed" about every 150 seconds. Wonder what could be... oh drat...
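A quick way to sanity-check that interval is to diff the timestamps on the "Network configuration changed" lines. The sketch below is only an illustration: the function name `change_intervals` and the hard-coded year are assumptions, and the log lines are pasted from the status output above rather than read from the journal.

```python
from datetime import datetime

# The three "changed" lines, copied from the systemctl status output above.
LOG = """\
Sep 22 21:37:46 bigbrother systemd-timesyncd[4434]: Network configuration changed, trying to establish connection.
Sep 22 21:40:16 bigbrother systemd-timesyncd[4434]: Network configuration changed, trying to establish connection.
Sep 22 21:42:45 bigbrother systemd-timesyncd[4434]: Network configuration changed, trying to establish connection.
"""


def change_intervals(log_text, year=2019):
    """Return the seconds between successive 'Network configuration changed' lines."""
    stamps = []
    for line in log_text.splitlines():
        if "Network configuration changed" not in line:
            continue
        # The first 15 characters are the syslog-style timestamp ("Sep 22 21:37:46").
        # The log omits the year, so one is supplied by the caller.
        stamps.append(datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S"))
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


print(change_intervals(LOG))  # [150, 149]
```

The month name is parsed with `%b`, so this assumes an English or C locale.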

pi@raspi-woodburn-dhcpdns-ups:~ $ grep ^default-lease-time /etc/dhcp/dhcpd.conf
default-lease-time 300;
pi@raspi-woodburn-dhcpdns-ups:~ $ 

Yep, looks like the "renew at half the expiration time" standard behavior of DHCP clients: half of a 300-second lease is 150 seconds. We run a short DHCP lease everywhere at ClueTrust for reasons that date back to Windows 98's DHCP client. Most clients handle it with aplomb and have no difficulty with the fact that they are renewing fairly frequently. In truth, with well-behaved clients the only problem with running a short DHCP lease period is load on the DHCP server, and our device population here is orders of magnitude too small for that to be a worry, even with a Raspberry Pi as the server.
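The arithmetic behind that renewal interval can be sketched as follows. Per RFC 2131, the renewal timer (T1) defaults to half the lease and the rebinding timer (T2) to 0.875 of it; a server can override both, and clients add a little random fuzz, so treat the numbers as nominal. `dhcp_timers` is a hypothetical helper, not part of any DHCP client.

```python
def dhcp_timers(lease_seconds):
    """Nominal RFC 2131 default timers for a lease, in seconds (no fuzz applied)."""
    t1 = lease_seconds * 0.5    # client starts trying to renew with its server
    t2 = lease_seconds * 0.875  # client falls back to rebinding with any server
    return t1, t2


t1, t2 = dhcp_timers(300)        # default-lease-time from the dhcpd.conf above
renewals_per_day = 86400 / t1    # per client, if every renewal succeeds at T1
print(t1, t2, renewals_per_day)  # 150.0 262.5 576.0
```

A few hundred renewals per client per day is trivial for a small network, which is why the short lease only bites when a client treats each renewal as news.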

Apparently, systemd-timesyncd knows too little about the state of the network to tell whether the network configuration actually changed when a DHCP renewal happens.

One could argue that this is no great loss, given that systemd-timesyncd is an SNTP implementation, not a full-on NTP implementation such as chrony or the reference NTP implementation and its forks. As such, it isn't capable of particularly good accuracy to begin with, and it has essentially no long-term trending or PLLs for gently steering the system's clock. One can think of it as being scarcely better than running ntpdate out of a cron job.
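To make the SNTP-versus-NTP distinction a bit more concrete: a simple client can estimate its clock error from a single request/response exchange using the standard NTP on-wire calculation, sketched below. This is not systemd-timesyncd's code; `offset_and_delay` is a hypothetical helper and the timestamps are invented. A full NTP implementation takes many such samples from several servers and feeds them through filtering and a clock-discipline loop instead of acting on one measurement at a time.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Clock offset and round-trip delay from one request/response exchange.

    t1: request sent      (client clock)
    t2: request received  (server clock)
    t3: response sent     (server clock)
    t4: response received (client clock)
    """
    # Assumes the network delay is the same in both directions.
    offset = ((t2 - t1) + (t3 - t4)) / 2
    # Total time on the wire, excluding the server's processing time.
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Invented timestamps, in seconds: the client's clock is 1.375 s behind the
# server's, and each direction takes 0.125 s on the wire.
print(offset_and_delay(0.0, 1.5, 1.75, 0.5))  # (1.375, 0.25)
```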

For my production stuff I tend to disable systemd-timesyncd and run standard ntpd, installed by an Ansible role I include. I would never have noticed this misbehavior if I hadn't been doing a hand-built one-off, but then again, I wouldn't have been potentially affected by it either.