This document provides technical context and analysis to support a proposal to develop a new Linux device driver, destined for mainline Linux and for use by OpenXT, that introduces Hypervisor-Mediated data eXchange (HMX) into the data transport of the popular VirtIO suite of Linux virtual device drivers by leveraging Argo in Xen. Daniel Smith proposed this idea, which has been supported in discussions by Christopher Clark and Rich Persaud of MBO/BAE Systems, and Eric Chanudet and Nick Krasnoff of AIS. Christopher Clark is the primary author of this version of the document.

– August 2020

Introduction to VirtIO

VirtIO is a virtual device driver standard developed originally for the Linux kernel, drawing upon the lessons learned during the development of paravirtualized device drivers for Xen, KVM and other hypervisors. It aimed to become a “de-facto standard for virtual I/O devices”, and to some extent has succeeded in doing so. VirtIO is now widely implemented in both software and hardware; it is commonly the first choice for virtual driver implementation in new virtualization technologies, and the specification is now maintained under the governance of the OASIS open standards organization.

VirtIO’s system architecture abstracts device-specific and device-class-specific interfaces and functionality from the transport mechanisms that move data and issue notifications within the kernel and across virtual machine boundaries. It is attractive to developers seeking to implement new drivers for a virtual device because VirtIO provides documented, specified interfaces together with a well-designed, efficient and maintained common core implementation that can be leveraged to significantly reduce the amount of work required for a new virtual device driver.

VirtIO follows the Xen PV driver model of split-device drivers, where a front-end device driver runs within the guest virtual machine to provide the device abstraction to the guest kernel, and a back-end driver runs outside the VM, in platform-provided software - eg. within a QEMU device emulator - to communicate with the front-end driver and provide mediated access to physical device resources.

A critical property of the current common VirtIO implementations is that they prevent enforcement of strong isolation between the front-end and back-end virtual machines, since the back-end VirtIO device driver must be able to obtain direct access to the memory owned by the virtual machine running the front-end virtio device driver. ie. The VM hosting the back-end driver has significant privilege over any VM running a front-end driver.

Xen’s PV drivers use the grant-table mechanism to confine shared memory access to specific memory pages, and permission to access those pages is explicitly granted by the driver in the VM that owns the memory. Argo goes further and achieves stronger isolation than this, since it requires no memory sharing between communicating virtual machines.
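
To make the contrast concrete, the following is a minimal sketch of how a Xen PV front-end driver typically shares a single page with its back-end domain using the Linux in-kernel grant-table API. The helper names are from the mainline kernel (exact signatures have varied across kernel versions) and error handling is elided.

    /*
     * Sketch: a Xen PV front-end granting its back-end access to one page.
     * Only the grant reference - not a memory address - crosses the VM
     * boundary; the back-end maps it through its own grant-mapping interface.
     */
    #include <xen/grant_table.h>
    #include <xen/page.h>

    static int share_page_with_backend(domid_t backend_domid,
                                       struct page *page,
                                       grant_ref_t *out_ref)
    {
        int ref;

        /* Grant read/write access to this single frame only. */
        ref = gnttab_grant_foreign_access(backend_domid,
                                          xen_page_to_gfn(page),
                                          0 /* not read-only */);
        if (ref < 0)
            return ref;

        *out_ref = ref;
        /* When I/O completes, the front-end revokes the grant again
         * (gnttab_end_foreign_access(); its signature varies by kernel). */
        return 0;
    }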

In contrast to Xen’s current driver transport options, the current implementations of virtio transports pass memory addresses directly across the VM boundary, under the assumption of shared memory access, and thereby require the back-end to have sufficient privilege to directly access any memory that the front-end driver refers to. This has presented a challenge for the suitability of virtio drivers in Xen deployments where isolation is a requirement. Fortunately, a path exists to integrate the Argo transport into VirtIO, which can address this and enable use of the existing body of virtio device drivers with isolation maintained and mandatory access control enforced; this significantly differentiates the resulting system architecture from other options for virtual devices.
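
To illustrate the point above: in the split virtqueue layout used by the common transports, each descriptor published by the front-end carries a guest-physical address, which the back-end must be able to translate and map in order to perform the I/O. The layout below is as defined by the VirtIO specification (see include/uapi/linux/virtio_ring.h in Linux).

    /* Split-virtqueue descriptor: what actually crosses the VM boundary. */
    struct vring_desc {
        __virtio64 addr;   /* guest-physical address of the buffer */
        __virtio32 len;    /* buffer length in bytes */
        __virtio16 flags;  /* VRING_DESC_F_NEXT / _WRITE / _INDIRECT */
        __virtio16 next;   /* index of the next descriptor in a chain */
    };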

VirtIO Architecture Overview

In addition to the front-end / back-end split device driver model, there are further standard elements of VirtIO system architecture.

For detailed reference, VirtIO is described in the “VirtIO 1.1 specification” OASIS standards document.

The front-end device driver architecture imposes tighter constraints on implementation direction, and so is more important to understand in detail, since it is what is already implemented in the wide body of existing VirtIO device drivers that we aim to enable. The back-end software is implemented in the platform-provided software - ie. the hypervisor, toolstack, a platform-provided VM or a device emulator, etc. - where we have more flexibility in implementation options, and the interface is determined by both the host virtualization platform and the new transport driver that we intend to create.

VirtIO front-end driver classes

There are multiple classes of virtio device driver within the Linux kernel; these include the general class of front-end virtio device drivers, which provide function-specific logic to implement virtual devices - eg. a virtual block device driver for storage - and the transport virtio device drivers, which are responsible for device discovery with the platform and provision of data transport across the VM boundary between the front-end drivers and the corresponding remote back-end driver running outside the virtual machine.

VirtIO transport drivers

There are several implementations of virtio transport device drivers in Linux, each implementing a common interface within the kernel; they are designed to be interchangeable and compatible with the VirtIO front-end drivers, so the same front-end driver can use different transports on different systems. Transports can coexist: different virtual devices can use different transports within the same virtual machine at the same time.
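
As a rough illustration of the common interface involved, the sketch below shows the shape of what a new transport provides to the VirtIO core in Linux: a struct virtio_config_ops and a registered struct virtio_device. The virtio_argo_* names are hypothetical and the function bodies are placeholders, not an implementation.

    /*
     * Hedged sketch of the transport-side interface a "virtio-argo" driver
     * would fill in; member names are from include/linux/virtio_config.h,
     * bodies are placeholders.
     */
    #include <linux/errno.h>
    #include <linux/virtio.h>
    #include <linux/virtio_config.h>

    static u64 virtio_argo_get_features(struct virtio_device *vdev)
    {
        return 0;       /* placeholder: feature bits negotiated over Argo */
    }

    static int virtio_argo_find_vqs(struct virtio_device *vdev,
                                    unsigned int nvqs,
                                    struct virtqueue *vqs[],
                                    vq_callback_t *callbacks[],
                                    const char * const names[],
                                    const bool *ctx,
                                    struct irq_affinity *desc)
    {
        return -ENOSYS; /* placeholder: virtqueues backed by Argo rings */
    }

    static const struct virtio_config_ops virtio_argo_config_ops = {
        .get_features = virtio_argo_get_features,
        .find_vqs     = virtio_argo_find_vqs,
        /* .get, .set, .get_status, .set_status, .reset, .del_vqs, ... */
    };

    /*
     * A device instance sets vdev->config = &virtio_argo_config_ops and is
     * handed to register_virtio_device(vdev), after which existing front-end
     * drivers (virtio-blk, virtio-net, ...) bind to it unchanged.
     */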

Access to virtual device buffers for I/O operations with VirtIO-PCI transport

The back-end domain requires sufficient privilege with the hypervisor to be able to map the memory of any buffers used for I/O with the device by the guest VM.
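
As a concrete (hedged) illustration of that privilege: with the virtio-pci transport on Xen, a back-end such as QEMU uses the foreign memory mapping interface to map whichever guest frames the virtqueue descriptors name, which the hypervisor only permits for suitably privileged domains.

    /* Sketch: mapping one guest frame from a privileged back-end domain,
     * using libxenforeignmemory (handle obtained via xenforeignmemory_open). */
    #include <sys/mman.h>
    #include <xenforeignmemory.h>

    void *map_guest_buffer(xenforeignmemory_handle *fmem,
                           uint32_t guest_domid, xen_pfn_t gfn)
    {
        int err = 0;

        /* Succeeds only if this domain is privileged to map the guest's
         * memory - exactly the property the Argo transport removes. */
        return xenforeignmemory_map(fmem, guest_domid,
                                    PROT_READ | PROT_WRITE,
                                    1 /* pages */, &gfn, &err);
    }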

VirtIO with Argo transport

Enabling Virtio to use the Argo interdomain communication mechanism for data transport across the VM boundary will address three critical requirements:

The proposal is to implement a new VirtIO transport driver for Linux that utilizes Argo. It will be used within guest virtual machines and be compatible with the existing VirtIO front-end device drivers. It will be paired with a corresponding new VirtIO-Argo back-end running within the QEMU device emulator, in the same fashion as the existing VirtIO transport back-ends; the back-end will use libargo and the (non-VirtIO) Argo Linux driver.
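
A minimal sketch of the intended data path follows, to show why no memory sharing is needed: the transport asks Xen to copy data into the destination's registered Argo ring via a sendv operation, and the hypervisor applies XSM/MAC policy on each call. Field names follow Xen's public Argo interface (public/argo.h), but treat the exact layouts here as approximations, and note that the HYPERVISOR_argo_op() wrapper shown is an assumption: mainline Linux does not define one today (the out-of-tree OpenXT Argo driver does).

    /* Sketch of an Argo sendv call as the proposed transport's data path. */
    #include <linux/types.h>
    #include <xen/interface/xen.h>

    typedef uint32_t xen_argo_port_t;

    struct xen_argo_addr {
        xen_argo_port_t aport;     /* Argo port number */
        domid_t domain_id;         /* domain identifier */
        uint16_t pad;
    };

    struct xen_argo_send_addr {
        struct xen_argo_addr src;
        struct xen_argo_addr dst;
    };

    struct xen_argo_iov {
        uint64_t iov_hnd;          /* guest pointer to the data to copy */
        uint32_t iov_len;
        uint32_t pad;
    };

    /* Assumed hypercall wrapper; XEN_ARGO_OP_sendv is defined by Xen.
     * The hypervisor performs the copy and enforces MAC policy, so the
     * back-end never needs to map the front-end's memory. */
    static long virtio_argo_sendv(const struct xen_argo_send_addr *addrs,
                                  const struct xen_argo_iov *iovs,
                                  unsigned long niov, uint32_t message_type)
    {
        return HYPERVISOR_argo_op(XEN_ARGO_OP_sendv, (void *)addrs,
                                  (void *)iovs, niov, message_type);
    }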

Using VirtIO

Front-end, in-guest software

VirtIO device drivers are included in the mainline Linux kernel and enabled in most modern Linux distributions. There is a menu for VirtIO drivers in the kernel Kconfig to enable inclusion of drivers as required. Once the VirtIO-Argo transport driver has been reviewed upstream and accepted into the mainline Linux kernel, it should propagate into the Linux distributions, enabling seamless deployment of guest VMs on VirtIO-Argo hypervisor platforms with no further in-guest drivers required.
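
For reference, a typical guest kernel configuration fragment looks like the following; the CONFIG_VIRTIO_ARGO option named here is hypothetical, standing in for whatever Kconfig symbol the new transport driver eventually uses.

    CONFIG_VIRTIO=y
    CONFIG_VIRTIO_PCI=y
    CONFIG_VIRTIO_BLK=y
    CONFIG_VIRTIO_NET=y
    # Hypothetical option for the proposed Argo transport driver:
    # CONFIG_VIRTIO_ARGO=m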

Prior to the VirtIO-Argo device driver being made available via the Linux distributions, installation will require a Linux kernel module to be installed for the VirtIO-Argo driver, which will then enable compatibility with the other existing VirtIO device drivers in the guest. A method of switching devices over from their prior driver to the newly activated VirtIO-Argo driver will need to be designed; this is the same issue that the existing Xen PV drivers handle when taking over responsibility for devices from their emulated counterparts.

Open Source VirtIO drivers for Windows are available, with some Linux distributions, eg. Ubuntu and Fedora, including WHQL Certified drivers. These enable Windows guest VMs to run with virtual devices provided by the VirtIO backends in QEMU. It has not yet been ascertained whether the Windows VirtIO implementation is suitable for introduction of a VirtIO-Argo transport driver in the same way as proposed here for Linux.

Host platform software

QEMU

The QEMU device emulator implements the back-end side of the VirtIO transports that the front-end connects to. The current QEMU release, version 5.0, implements both the virtio-pci and virtio-mmio common transports.

Linux Argo driver

For QEMU to be able to use Argo, it will need an Argo Linux kernel device driver, with similar functionality to the existing Argo Linux driver.

Toolstack

The toolstack of the hypervisor is responsible for configuring and establishing the back-end devices according to the virtual machine configuration. It will need to be aware of the VirtIO-Argo transport and initialize the back-ends for each VM with a suitable configuration.

Optional: Userspace drivers

The DPDK userspace device driver software implements an alternative set of VirtIO virtual device back-ends to QEMU, which can also be used to support VMs with VirtIO virtual devices. Please note that the DPDK implementation uses a “packed virtqueue” data structure across the VM boundary, as opposed to the original default “split virtqueue”, and this is not targeted for compatibility in the initial version of the VirtIO-Argo transport.
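
For reference, the packed-virtqueue descriptor differs from the split-virtqueue descriptor shown earlier; the layout below paraphrases the VirtIO 1.1 specification (all fields little-endian). The differences in ring layout, index handling and notification rules are why packed-virtqueue support needs explicit work and is out of scope for the initial transport.

    /* Packed-virtqueue descriptor ("pvirtq_desc" in the VirtIO 1.1 spec);
     * compare with the split-virtqueue descriptor shown earlier, which has
     * separate flags/next fields and distinct available/used rings. */
    struct pvirtq_desc {
        uint64_t addr;    /* guest-physical buffer address (little-endian) */
        uint32_t len;     /* buffer length in bytes */
        uint16_t id;      /* buffer identifier */
        uint16_t flags;   /* includes wrap-counter based AVAIL/USED flags */
    };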

VirtIO hardware

Hardware implementations of VirtIO interfaces will be unaffected by the VirtIO-Argo system.

Using the existing Argo and Xen software in OpenXT: without VirtIO

OpenXT currently uses the Xen PV drivers for virtual devices.

The Xen PV drivers use the grant tables so that the front-end domain can grant the back-end domain access to specific buffers, establishing the shared memory access needed for physical I/O device operations.

In addition to the Xen PV drivers, OpenXT also uses Argo, with an Argo Linux device driver, for inter-domain communication. In contrast to the grant-table data path used by the Xen PV drivers, which establishes and tears down shared memory regions between communicating VMs, the Argo data path is hypervisor-mediated, with Mandatory Access Control enforced on every data movement operation.

The challenges with using Argo with its own Linux device driver installed in the guest are:

Using VirtIO with Argo in OpenXT

Adding Argo as a transport for VirtIO will retain Argo’s MAC policy checks on all data movement, while allowing use of the VirtIO virtual device drivers and device implementations.

With the VirtIO virtual device drivers using the VirtIO-Argo transport driver, OpenXT can retire use of the Xen PV drivers within platform VMs and guest workloads. This removes shared memory from the data path of the device drivers, allows for some hypervisor functionality, such as the grant tables, to be disabled, and makes the virtual device driver data path HMX-compliant.

In addition, as new virtual device classes in Linux gain VirtIO drivers, these should transparently gain Mandatory Access Control enforcement via the existing virtio-argo transport driver, potentially without further effort required – although please note that for some cases (eg. graphics) optimizing performance characteristics may require additional work.

VirtIO with Argo: architecture and design

TODO:

Needed

References

Notes from Virtio Argo planning meeting, July 2020

Call on Friday 17th July

Attendees:

Actions:

Topic: Argo as a transport for Virtio

Background: the Virtio implementation in the Linux kernel is stacked:

KVM uses Qemu to provide backend implementations for devices.
To implement Argo as a Virtio transport, Qemu will need to be made Argo-aware.

Typically a PCI device is emulated by Qemu, on a bus provided to the guest.
The Bus:Device:Function (bdf) triple maps to Qemu; the guest loads the PCI
virtio transport when the PCI device with that bdf is detected.
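
For context, discovery in the existing virtio-pci transport amounts to a PCI ID match (paraphrasing drivers/virtio/virtio_pci_common.c): the guest binds the transport to any device with the virtio vendor ID, and the front-end drivers then bind by virtio device ID. PVH and PV guests have no such PCI device to match, hence the discussion of alternatives below.

    /* The PCI ID table that causes the guest to load the virtio-pci
     * transport; PCI_VENDOR_ID_REDHAT_QUMRANET is 0x1af4, the virtio
     * vendor ID. */
    #include <linux/pci.h>

    static const struct pci_device_id virtio_pci_id_table[] = {
        { PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, PCI_ANY_ID) },
        { 0 }
    };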

For Xen, this is OK for HVM guests, but since the platform PCI device is not
exposed to PVH (or PV) guests, it is not a sufficient method for those cases.

Discussion notes

The existing main Xen PCI code is small, used for discovery.
vPCI is an experimental full emulation of PCI within Xen -- Daniel does not
favour this direction.

Paul Durrant has a draft VFIO implementation plan, previously circulated.
Paul is working on emulating PCI devices directly via an IOReq server (see: "demu").

Daniel is interested in the Linux kernel packet mmap, to avoid one world-switch
when processing data.

Christopher raised the Cambridge Argo port-connection design for external
connection of Argo ports between VMs.

Eric raised the vsock-argo implementation. The discussion mentioned the Argo development wiki page, which describes the requirements and plan previously discussed and agreed for future Argo Linux driver development; that plan needs to include a non-vsock interface for administrative actions such as runtime firewall configuration.

Daniel is positively inclined towards using ACPI tables for discovery
and notes the executable ACPI table feature, via AML.

Naming: an Argo transport for virtio in the Linux kernel would, to match the
existing naming scheme, be: virtio-argo.c

Group inclined not to require XenStore.

Topic: connecting a frontend to a matching backend

An Argo (domain identifier + port) tuple is needed for each end.

Design topic: surfacing bootstrap/init connection info to guests

Options:

  1. PCI device, with emulation provided by Xen

  2. ACPI tables

  3. External connection of Argo ports, performed by a toolstack

  4. Use a wildcard ring on the frontend side for the driver to autoconnect

Note:
ACPI plan is good for compatibility with DomB work, where ACPI tables are
already being populated to enable PVH guest launch.

Plan:

For a first cut: just claim a predefined port and use that to avoid the need
for interacting with an ACPI table.

To build:

Wiki page to include:

License of this Document

Copyright (c) 2020 BAE Systems.
Document author: Christopher Clark.
This work is licensed under the Creative Commons Attribution Share-Alike 4.0 International License.
To view a copy of this license, visit https://creativecommons.org/licenses/by-sa/4.0/.