Linux bridging with tun/tap and loopback (dummy) devices, libvirt's use case with virbr0-nic tap interface
I was not aware of the internals of libvirt’s approach to building virtual networks with bridges until I started looking at the sources of libvirt, the Linux kernel and OpenVPN. Here is what I found.
If a bridge is created via brctl addbr <br_name> and no interfaces are connected to it, it gets a randomly generated MAC address. As soon as an interface is connected, the bridge takes that interface’s MAC address for its bridge ID (if multiple interfaces are connected, the one with the lowest MAC is picked). This is why libvirt has code (src/network/bridge_driver.c) that creates a tap device (drivers/net/tun.c) to keep the bridge ID fixed (though nothing guarantees that this holds if an interface with a lower MAC address is connected later).
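The selection rule described above can be sketched in a few lines of Python. This is only a model of the rule as stated in this post (the lowest port MAC wins, a random address is used when there are no ports); it is not kernel code, and pick_bridge_mac, the other helper names and the sample addresses are made up for illustration.

```python
import random


def mac_to_int(mac: str) -> int:
    """Convert 'aa:bb:cc:dd:ee:ff' to an integer so MACs can be compared."""
    return int(mac.replace(":", ""), 16)


def random_local_mac() -> str:
    """Generate a random locally administered, unicast MAC address."""
    octets = [random.randint(0, 255) for _ in range(6)]
    # Set the 'locally administered' bit and clear the 'multicast' bit.
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{o:02x}" for o in octets)


def pick_bridge_mac(port_macs):
    """Hypothetical helper: model of the bridge MAC rule described in the text."""
    if not port_macs:
        # No ports attached: the bridge gets a random address.
        return random_local_mac()
    # Ports attached: the numerically lowest MAC is used.
    return min(port_macs, key=mac_to_int)
```

With a port list such as ["fe:54:00:aa:bb:cc", "52:54:00:12:34:56"] the helper returns the second address, which is why a permanently attached port with a low MAC keeps the bridge ID stable while ports with higher MACs come and go.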
In other words, libvirt creates a dummy tap device which it keeps in the ‘Up’ state (the file descriptor is kept open, see the tun/tap carrier discussion below) until Duplicate Address Detection (DAD) for IPv6 finishes. Then the tap interface is set to the ‘Down’ state.
In general, there is a difference between the administrative state and the operating state of a network device (RFC 2863, section 3.1.13; kernel.org: Documentation/networking/operstates.txt): ip link set <dev_name> up can be issued to bring an interface up administratively, but it may not come up operationally for various reasons (e.g. a cable is not plugged in). Both states can be checked by looking at a netlink message represented by struct ifinfomsg: ifinfomsg::ifi_flags & IFF_UP for the administrative state and ifinfomsg::ifi_flags & IFF_RUNNING for the operating state (the message is received either by polling or by subscribing to the related messages).
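As a rough illustration of the two flag checks, here is a minimal Python sketch. For brevity it assumes the SIOCGIFFLAGS ioctl instead of a netlink socket; the IFF_UP and IFF_RUNNING bits it decodes are the same ones discussed above. The function names get_flags and decode_link_state are made up for this example.

```python
import fcntl
import socket
import struct

SIOCGIFFLAGS = 0x8913  # ioctl request: get interface flags
IFF_UP = 0x1           # administrative state
IFF_RUNNING = 0x40     # operating state


def decode_link_state(flags: int) -> dict:
    """Split an interface flags word into the two states discussed above."""
    return {
        "admin_up": bool(flags & IFF_UP),
        "oper_up": bool(flags & IFF_RUNNING),
    }


def get_flags(dev: str) -> int:
    """Ask the kernel for the flags of a device (Linux only, no root needed)."""
    # struct ifreq: 16-byte name, 16-bit flags, padding up to 40 bytes.
    ifreq = struct.pack("16sH22x", dev.encode(), 0)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        result = fcntl.ioctl(sock.fileno(), SIOCGIFFLAGS, ifreq)
    return struct.unpack("16sH22x", result)[1]
```

Calling decode_link_state(get_flags("virbr0")) on a host where the bridge has no active ports would be expected to report admin_up as true and oper_up as false, assuming an interface with that name exists.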
In the case of a tun/tap device, a user space process must hold an open handle to the device to keep it operationally up; otherwise the NO-CARRIER state is shown. Note that when a process is killed, all of its open file descriptors are closed. Therefore, whether a process closes the file descriptor itself, exits or is killed, the result is a loss of carrier for the tun/tap device. If it was the only device connected to a bridge, the bridge itself loses carrier as well. This can be seen in the tun/tap source code, where three functions, tun_chr_open, tun_chr_ioctl and tun_chr_close, do the required operations:
tun_chr_open allocates the required data structures;
tun_chr_ioctl handles user space commands issued via the ioctl interface. Depending on the command, tun_set_iff might be called, followed by netif_carrier_on for a specific device, depending on the branching in the code;
tun_chr_close calls tun_detach, which in turn calls __tun_detach and eventually netif_carrier_off (as the name suggests, this leads to a NO-CARRIER state).
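From user space, the open/ioctl/close sequence described above looks roughly like the following Python sketch. It assumes a Linux host with /dev/net/tun and CAP_NET_ADMIN; the device name passed in and the helper names build_ifreq and open_tap are made up for illustration, and the comments only indicate which kernel entry point each call is expected to reach.

```python
import fcntl
import os
import struct

TUNSETIFF = 0x400454CA  # _IOW('T', 202, int) from <linux/if_tun.h>
IFF_TAP = 0x0002        # Ethernet-level (tap) device
IFF_NO_PI = 0x1000      # no extra packet information header


def build_ifreq(name: str, flags: int) -> bytes:
    """Pack a struct ifreq: 16-byte name, 16-bit flags, padded to 40 bytes."""
    return struct.pack("16sH22x", name.encode(), flags)


def open_tap(name: str) -> int:
    """Create or attach to a tap device and return the open file descriptor.

    The device keeps its carrier only while this descriptor stays open.
    """
    fd = os.open("/dev/net/tun", os.O_RDWR)  # reaches tun_chr_open
    try:
        # reaches tun_chr_ioctl, which calls tun_set_iff for TUNSETIFF
        fcntl.ioctl(fd, TUNSETIFF, build_ifreq(name, IFF_TAP | IFF_NO_PI))
    except OSError:
        os.close(fd)
        raise
    return fd
```

Running fd = open_tap("tap-demo") as root and later os.close(fd) (which reaches tun_chr_close) lets you watch the effect with ip link show. Note that a device created this way is not persistent, so it is removed entirely when the descriptor is closed; a persistent tap device stays around and shows NO-CARRIER instead.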
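In drivers/net/tun.c, tun_chr_open, tun_chr_ioctl and tun_chr_close are registered as the open, unlocked_ioctl and release handlers of the tun character device (struct file_operations tun_fops).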
As a result, if you want a ‘completely virtual’ bridge that has a carrier all the time without depending on any user space process, you need something other than a tun/tap device; for the purposes of libvirt, however, the developers decided to use tun/tap. The reason is probably that the loopback interface already covers local communication; beyond that, it only makes sense to keep a bridge up if there are devices attached that are able to transmit. With QEMU/KVM, the VMs are processes with tap interfaces connected to bridges, so in the host-only case your virbr[x] interface has no carrier if all tap interfaces are down.
An alternative to tun/tap devices is the loopback device, but there is only a single loopback device (more precisely, one per network namespace). A workaround in this case is to use dummy devices (drivers/net/dummy.c): once brought up administratively, they have a carrier at all times, without any user space process, until they are set administratively down.
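A bridge backed by a dummy device could be set up along these lines. This is a hedged sketch that only assembles standard iproute2 commands and hands them to a runner; the function names and the interface names used in the usage note are placeholders, and actually applying the commands requires root.

```python
import subprocess


def dummy_bridge_commands(bridge: str, dummy: str) -> list:
    """Build the iproute2 commands for a bridge with one dummy port."""
    return [
        ["ip", "link", "add", "name", bridge, "type", "bridge"],
        ["ip", "link", "add", "name", dummy, "type", "dummy"],
        # Attach the dummy device to the bridge as a port.
        ["ip", "link", "set", "dev", dummy, "master", bridge],
        # Dummy devices are created administratively down; bring both up.
        ["ip", "link", "set", "dev", dummy, "up"],
        ["ip", "link", "set", "dev", bridge, "up"],
    ]


def apply_commands(commands, runner=subprocess.check_call):
    """Run each command in order; the runner can be swapped out for a dry run."""
    for cmd in commands:
        runner(cmd)
```

For a dry run, apply_commands(dummy_bridge_commands("br-demo", "dummy-demo"), runner=print) prints the commands instead of executing them.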