Docker daemon network address space clash

Oops!

Ran into a weird problem yesterday - I had to manually update my GitLab Let’s Encrypt auto SSL tool to use the newest dehydrated script. I dropped in the new script and ran docker build:

docker build -t gitlab-cf-le-autossl .

And it blew up on apk update… What the heck? 😕 That’s weird, I thought.

A little bit of duckduck-fu 🦆 ensued, and all I could find were network fetch errors during docker build when behind a proxy… I’m at home and don’t have a proxy, so that’s not it. Time to dig in - let’s fire up a temp container:

docker run -it --rm alpine:latest /bin/sh
# nslookup dl-cdn.alpinelinux.org
<failed>

# huh?? - let's check nameservers
# cat /etc/resolv.conf
...
nameserver 172.16.1.2 (1)

(1) That’s my home DNS, running on Pi-hole in a different VLAN.
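If you want to script that check instead of eyeballing the file, here is a minimal Python sketch that pulls the nameserver entries out of resolv.conf-style text. The function name `nameservers` and the sample text are my own illustration, not part of any Docker tooling; inside a container you would feed it the contents of /etc/resolv.conf.

```python
from __future__ import annotations


def nameservers(resolv_conf_text: str) -> list[str]:
    """Return the addresses on 'nameserver' lines of resolv.conf-style text."""
    servers = []
    for raw_line in resolv_conf_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines (resolv.conf uses '#' or ';').
        if not line or line.startswith(("#", ";")):
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


# Hypothetical sample; only the nameserver line comes from the post.
SAMPLE = "# sample resolv.conf contents\nnameserver 172.16.1.2\n"

if __name__ == "__main__":
    print(nameservers(SAMPLE))  # ['172.16.1.2']
```

Any address that comes back from this and sits in a private range is worth comparing against the host's routing table.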

The mystery thickens… I haven’t actually ever changed any docker desktop setting. Why would it stop working today? In any case, I need to get shit done and get this container built. Let me just give it an external DNS. You can specify the DNS servers that containers use in /etc/docker/daemon.json:

# edit /etc/docker/daemon.json
{
    "dns" : ["8.8.8.8", "1.1.1.1"]
}

# restart the daemon
sudo systemctl restart docker
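One thing to watch when editing daemon.json by hand is clobbering settings that are already there. As a rough sketch (the helper name `merge_daemon_settings` and the pre-existing "log-driver" entry are assumptions for illustration), the merge could look like this in Python:

```python
import json


def merge_daemon_settings(existing_text, new_settings):
    """Merge new top-level keys into daemon.json text and return the new text.

    An empty or whitespace-only file is treated as an empty config.
    Keys in new_settings overwrite existing keys with the same name.
    """
    current = json.loads(existing_text) if existing_text.strip() else {}
    current.update(new_settings)
    return json.dumps(current, indent=4)


if __name__ == "__main__":
    # Assumption: a daemon.json that already holds one unrelated setting.
    existing = '{"log-driver": "json-file"}'
    print(merge_daemon_settings(existing, {"dns": ["8.8.8.8", "1.1.1.1"]}))
```

The sketch only builds the text; writing it back to /etc/docker/daemon.json needs root, and the daemon restart is still required afterwards.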

At this point, the docker container build worked since it was able to resolve the alpine package repo. However, it’s now time to figure out what’s going on.

Chasing it down

First order of business - is my DNS server down? I issued a ping 172.16.1.2 and got a "no route to host". What’s going on? DNS can’t be down, since I’m able to browse websites and pretty much everything on my home network is working.

And then, realization strikes… routing.

ip route
...
172.16.0.0/16  ..... dev docker0
172.17.0.0/16 ...... dev docker0

Ok - so traffic for 172.16.x.x is being routed to the docker network… No wonder it can’t reach my Pi-hole at 172.16.1.2. Delete the route with sudo ip route del 172.16.0.0/16 and then let’s try again.

There’s the address space clash at the root of all this… My home network VLANs live in the 172.16.x.x range (172.16.0.x and 172.16.1.x). The Docker daemon by default creates its networks in the same 172.x private range (the 172.16.0.0/16 and 172.17.0.0/16 routes above), and those routes are added when the daemon starts.

% ping 172.16.1.2
PING 172.16.1.2 (172.16.1.2) 56(84) bytes of data.
64 bytes from 172.16.1.2: icmp_seq=1 ttl=63 time=0.660 ms
64 bytes from 172.16.1.2: icmp_seq=2 ttl=63 time=0.613 ms
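To see why the kernel preferred docker0 for the Pi-hole address, here is a small Python sketch of longest-prefix matching using the standard `ipaddress` module. The two docker0 routes come from the ip route output; the default route via `eth0` is an assumption, since the post elides the rest of the table. Real route selection also considers metrics and policy rules, which this ignores.

```python
import ipaddress

ROUTES = [
    ("0.0.0.0/0", "eth0"),         # assumption: default route, not shown in the post
    ("172.16.0.0/16", "docker0"),  # from the ip route output
    ("172.17.0.0/16", "docker0"),  # from the ip route output
]


def pick_route(address, routes):
    """Return (network, device) for the most specific route containing address."""
    addr = ipaddress.ip_address(address)
    matches = []
    for cidr, dev in routes:
        net = ipaddress.ip_network(cidr)
        if addr in net:
            matches.append((net, dev))
    if not matches:
        return None
    # Longest prefix wins: a /16 beats the /0 default route.
    net, dev = max(matches, key=lambda match: match[0].prefixlen)
    return str(net), dev


if __name__ == "__main__":
    print(pick_route("172.16.1.2", ROUTES))  # ('172.16.0.0/16', 'docker0')
    without_clash = [r for r in ROUTES if r[0] != "172.16.0.0/16"]
    print(pick_route("172.16.1.2", without_clash))  # ('0.0.0.0/0', 'eth0')
```

With the 172.16.0.0/16 entry removed, the same lookup falls through to the default route, which matches what the ping shows.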

Let’s make it permanent

Given that this is the second time I’m dealing with this problem, let’s fix it once and for all by making sure Docker’s bridge networks use a different address space altogether. Add this to /etc/docker/daemon.json and restart the daemon:

# in /etc/docker/daemon.json, alongside any existing keys
{
    "bip": "10.200.0.1/24",
    "default-address-pools": [
        {"base": "10.201.0.0/16", "size": 24},
        {"base": "10.202.0.0/16", "size": 24}
    ]
}
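As a sanity check on those values, here is a hedged Python sketch that confirms none of the new Docker ranges overlap the home range, and counts how many /24 networks each pool can hand out. `HOME_RANGE` is my assumption that 172.16.0.0/16 covers the home VLANs; the helper names are illustrative only.

```python
import ipaddress

# Assumption: this range covers the home VLANs (172.16.0.x and 172.16.1.x).
HOME_RANGE = ipaddress.ip_network("172.16.0.0/16")

DOCKER_CONFIG = {
    "bip": "10.200.0.1/24",
    "default-address-pools": [
        {"base": "10.201.0.0/16", "size": 24},
        {"base": "10.202.0.0/16", "size": 24},
    ],
}


def docker_networks(config):
    """Collect the networks implied by 'bip' and 'default-address-pools'."""
    nets = []
    if "bip" in config:
        # bip is a host address with a prefix, so take its enclosing network.
        nets.append(ipaddress.ip_interface(config["bip"]).network)
    for pool in config.get("default-address-pools", []):
        nets.append(ipaddress.ip_network(pool["base"]))
    return nets


def clashes(config, reserved):
    """Return the Docker networks that overlap the reserved range."""
    return [net for net in docker_networks(config) if net.overlaps(reserved)]


def subnets_per_pool(pool):
    """Number of networks of prefix length 'size' that fit in the pool's base."""
    base = ipaddress.ip_network(pool["base"])
    return 2 ** (pool["size"] - base.prefixlen)


if __name__ == "__main__":
    print(clashes(DOCKER_CONFIG, HOME_RANGE))  # []
    print(subnets_per_pool(DOCKER_CONFIG["default-address-pools"][0]))  # 256
```

Each /16 pool split into /24s gives 256 networks, so the two pools leave plenty of room before Docker runs out of address space.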

Goodies

Turns out, I did get a few things sorted out of this:

  1. Back when I started hosting this blog on GitLab Pages, I’d put together my own script to update SSL certs from Let’s Encrypt. GitLab now does this natively, so it’s time to wave my script bye-bye.

  2. Container networking is hard - checking things from first principles helps. Overall, I find container networking, and networking in general, a little intimidating (and this is after maintaining a non-trivial, isolated, built-for-privacy home network setup for years). What helps even more is to write it down when the damned thing happens 😄