EDIT (WIP): The failures described below come down to me not bringing up the host TAP interfaces at the right time. If I let QEMU handle the creation of the tap devices, everything works as expected. I will investigate the failure in more detail and provide a clearer explanation of the problem when I have it. Thank you @anx for the tips!
Goal: Run a dnsmasq inside a QEMU VM on the host that services netboots from another QEMU VM running on the same host.
I would like the dnsmasq VM to act as a gateway. One NIC is the
upstream WAN interface, served by an upstream DHCP server. The other
NIC is a private LAN interface, to which other VMs will be "plugged"
so that they netboot from the dnsmasq listening on this private LAN
interface.
First, to allow the VMs to talk to one another, I create my own bridge
on the host,
ip link add name vivianbr0 type bridge
ip link set vivianbr0 up
For the VMs to talk to each other via the host bridge, I will need two
tap devices, one for the private LAN interface on the gateway VM, and
another for the private VM's single network interface,
ip tuntap add mode tap tap0 user cturner
ip tuntap add mode tap tap1 user cturner
ip link set tap0 up
ip link set tap1 up
ip link set tap0 master vivianbr0
ip link set tap1 master vivianbr0
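Since the timing of bringing up the tap devices turned out to matter, it can help to check each tap's state before and after starting QEMU. The following is a minimal sketch, not a definitive diagnostic: `tap_status` is a hypothetical helper name, and the optional second argument exists only so the function can be pointed at a fake directory tree. My understanding is that a pre-created tap reports no carrier until a process such as QEMU opens it, so `operstate` may read `down` here even after `ip link set tapN up`.

```shell
# Print the operational state and bridge master of one interface.
# tap_status is a hypothetical helper; $2 overrides the sysfs root.
tap_status() {
    dev="$1"
    root="${2:-/sys/class/net}"
    if [ ! -d "$root/$dev" ]; then
        echo "$dev: missing"
        return 1
    fi
    state=$(cat "$root/$dev/operstate")
    # "master" is a symlink to the bridge when the device is enslaved.
    if [ -L "$root/$dev/master" ]; then
        master=$(basename "$(readlink "$root/$dev/master")")
    else
        master="none"
    fi
    echo "$dev: operstate=$state master=$master"
}
```

Running `tap_status tap0` and `tap_status tap1` once before and once after launching the VMs shows whether the devices exist, whether they are attached to vivianbr0, and whether their state changed.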
For the gateway VM, I am using an Arch Linux ISO for testing purposes,
the VM is booted with two NICs, thusly,
qemu-system-x86_64 \
-drive file=arch-disk.qcow2,if=none,id=nvm \
-device nvme,serial=deadbeef,drive=nvm \
-cdrom archlinux-2021.09.01-x86_64.iso \
-boot d \
-device virtio-net-pci,romfile=,netdev=net0,mac="DE:AD:BE:EF:00:11" \
-device virtio-net-pci,romfile=,netdev=net1,mac="DE:AD:BE:EF:00:12" \
`# Simulate the plugged in "upstream" cable with user-mode networking` \
-netdev user,id=net0,hostfwd=tcp::60022-:22,hostfwd=tcp::8080-:80,hostfwd=tcp::8081-:8000,hostfwd=tcp::2375-:2375 \
`# And now the unplugged one, with TAP networking` \
-netdev tap,id=net1,ifname=tap0,script=no,downscript=no \
-net bridge,br=vivianbr0 \
-m 4G \
-enable-kvm
Once this machine has booted, I see the following in the bridge configuration,
brctl show vivianbr0
bridge name     bridge id               STP enabled     interfaces
vivianbr0       8000.46954a1ad851       no              tap0
                                                        tap1
                                                        tap2
I assume tap2 was created by QEMU...
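One way to test that assumption is to list the bridge's ports before and after starting the VM and compare the two lists. This is only a sketch: `bridge_ports` is a hypothetical helper that reads the `brif` directory in sysfs, and its optional second argument exists only to allow pointing it at a fake tree.

```shell
# List the ports enslaved to a bridge, one per line, sorted.
# bridge_ports is a hypothetical helper; $2 overrides the sysfs root.
bridge_ports() {
    br="$1"
    root="${2:-/sys/class/net}"
    # Each port appears as an entry under <bridge>/brif.
    ls "$root/$br/brif" 2>/dev/null | sort
}
```

For example, save `bridge_ports vivianbr0 > before.txt`, start the VM, save `bridge_ports vivianbr0 > after.txt`, and then `comm -13 before.txt after.txt` prints only the ports that appeared in between.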
Inside this VM, there are two interfaces: ens4 with MAC DE:AD:BE:EF:00:11, and ens5 with MAC DE:AD:BE:EF:00:12. Inside this VM, I start dnsmasq,
ip addr add 10.42.0.1/24 dev ens5
dnsmasq -d --dhcp-range=10.42.0.10,10.42.0.100 --dhcp-script=/bin/echo --enable-tftp=ens5 --interface=ens5
This starts without error.
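The same options can also be kept in a configuration file, which is easier to extend once a boot file is served. The sketch below only mirrors the command line above: `gen_dnsmasq_conf` is a hypothetical helper, and the commented-out `tftp-root` and `dhcp-boot` values are placeholders, not settings taken from my setup.

```shell
# Emit a dnsmasq configuration equivalent to the command line above.
# gen_dnsmasq_conf is a hypothetical helper: <iface> <range-start> <range-end>
gen_dnsmasq_conf() {
    iface="$1"
    range_start="$2"
    range_end="$3"
    cat <<EOF
interface=$iface
dhcp-range=$range_start,$range_end
dhcp-script=/bin/echo
enable-tftp=$iface
# Placeholders (assumptions), needed before a client can fetch a boot file:
# tftp-root=/srv/tftp
# dhcp-boot=pxelinux.0
EOF
}
```

For example, `gen_dnsmasq_conf ens5 10.42.0.10 10.42.0.100 > /tmp/netboot.conf` followed by `dnsmasq -d --conf-file=/tmp/netboot.conf` should behave like the one-line invocation.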
Now I try to netboot another VM, started on the host like this,
qemu-system-x86_64 \
-machine pc-q35-6.0,accel=kvm \
-m 1024 -smp 2,sockets=2,cores=1,threads=1 \
-netdev tap,id=net0,ifname=tap1,script=no,downscript=no \
-device virtio-net-pci,netdev=net0,bootindex=1,mac=DE:AD:BE:EF:00:13 \
-net bridge,br=vivianbr0 \
-enable-kvm \
-vga virtio
But it fails to boot. I monitor vivianbr0 using tcpdump and can see the DHCP requests, but there are no responses; nothing reaches the dnsmasq running inside the first VM,
tcpdump -i vivianbr0 -nN
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on vivianbr0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
12:21:39.585229 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from de:ad:be:ef:00:13, length 397
12:21:40.587741 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from de:ad:be:ef:00:13, length 397
12:21:40.700038 IP6 fe80::6ce2:2aff:fe94:ba48.5353 > ff02::fb.5353: 0 [7q] PTR (QM)? _nfs._tcp.local. PTR (QM)? _ftp._tcp.local. PTR (QM)? _webdav._tcp.local. PTR (QM)? _webdavs._tcp.local. PTR (QM)? _sftp-ssh._tcp.local. PTR (QM)? _smb._tcp.local. PTR (QM)? _afpovertcp._tcp.local. (118)
12:21:42.619968 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from de:ad:be:ef:00:13, length 397
12:21:46.684448 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from de:ad:be:ef:00:13, length 397
12:22:30.609555 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from de:ad:be:ef:00:12, length 289
12:23:33.796148 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from de:ad:be:ef:00:12, length 289
12:24:38.673364 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from de:ad:be:ef:00:12, length 289
Oddly, I see BOOTP requests from de:ad:be:ef:00:13 (the netbooting VM's MAC address) and from de:ad:be:ef:00:12 (the gateway VM's private NIC), indicating something is badly misconfigured.
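To see at a glance which clients are sending requests, the capture can be summarized per MAC address. This is a rough sketch: `count_dhcp_requests` is a hypothetical helper, and it assumes the one-line text format that tcpdump printed above.

```shell
# Count BOOTP/DHCP requests per client MAC in tcpdump text output.
# count_dhcp_requests is a hypothetical helper; it reads stdin and
# prints "<count> <mac>" lines sorted by MAC.
count_dhcp_requests() {
    awk '/BOOTP\/DHCP, Request from/ {
        for (i = 1; i < NF; i++) {
            if ($i == "from") {
                mac = $(i + 1)
                sub(/,$/, "", mac)   # drop the trailing comma
                n[mac]++
            }
        }
    }
    END { for (m in n) print n[m], m }' | sort -k2
}
```

For example, `tcpdump -i vivianbr0 -nN -l | count_dhcp_requests` prints the totals once the capture is interrupted. Run against the lines above, it reports three requests from de:ad:be:ef:00:12 and four from de:ad:be:ef:00:13.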
How can I make this work?