Copyright 2013 by Citrix Systems, Inc. This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
Introduction
This page is intended to document the OpenXT toolstack. Please feel free to contribute heavily.
...
```
# nothing can be done by default
deny all
# allow stubdoms to talk to surfman,xenmgr,dbus
allow stubdom true destination com.citrix.xenclient.surfman
allow stubdom true destination com.citrix.xenclient.xenmgr
allow stubdom true destination org.freedesktop.DBus interface org.freedesktop.DBus
# allow guests to call 'gather' on diagnostics interface (required by xc-diag)
allow destination com.citrix.xenclient.xenmgr interface com.citrix.xenclient.xenmgr.diag member gather
# allow anybody to do some vm queries required for switcher bar
allow destination com.citrix.xenclient.xenmgr interface org.freedesktop.DBus.Properties member Get
allow destination com.citrix.xenclient.xenmgr interface com.citrix.xenclient.xenmgr member list_vms
allow destination com.citrix.xenclient.xenmgr interface com.citrix.xenclient.xenmgr.vm member get_db_key
allow destination com.citrix.xenclient.xenmgr interface com.citrix.xenclient.xenmgr.vm member read_icon
allow destination com.citrix.xenclient.xenmgr interface com.citrix.xenclient.xenmgr.vm member switch
allow destination com.citrix.xenclient.input interface com.citrix.xenclient.input member get_focus_domid
allow destination com.citrix.xenclient.xenmgr interface com.citrix.xenclient.xenmgr member find_vm_by_domid
# allow guest to do some requests
allow destination com.citrix.xenclient.xenmgr interface com.citrix.xenclient.xenmgr.guestreq member request_attention
# allow conditional domstore (private db space) access
allow destination com.citrix.xenclient.db interface com.citrix.xenclient.db member read if-boolean domstore-read-access true
allow destination com.citrix.xenclient.db interface com.citrix.xenclient.db member read_binary if-boolean domstore-read-access true
allow destination com.citrix.xenclient.db interface com.citrix.xenclient.db member list if-boolean domstore-read-access true
allow destination com.citrix.xenclient.db interface com.citrix.xenclient.db member exists if-boolean domstore-read-access true
#
allow destination com.citrix.xenclient.db interface com.citrix.xenclient.db member write if-boolean domstore-write-access true
allow destination com.citrix.xenclient.db interface com.citrix.xenclient.db member rm if-boolean domstore-write-access true
```
...
Xenops is the lower-layer part of xenvm, responsible for low-level management of Xen domains (via domain ids). It is both used internally by xenvm and exposed to the user via the "xenops" Dom0 utility.
- xenops/balloon.ml - memory ballooning utilities, not used much in OpenXT
- xenops/device.ml and device_common.ml - important files which are responsible for initialization of PV (or PCI passthrough) device backends (via xenstore)
- xenops/dm.ml - config construction for qemu
- xenops/dmagent.ml - communication with dmagent, which is a program used to fork/configure qemu instances (which can be running in another domain)
- xenops/domain[_common].ml - domain management functions
- xenops/hotplug.ml - a few utility functions to wait for devices being created by the PV backend, etc.
- xenops/memory.ml - crazy arithmetic to figure out how much memory is required to boot a VM
- xenops/netman.ml - a couple of network helper functions
- xenops/watch.ml - xenstore watch helper functions
- xenops/xal.ml - low level loop waiting and parsing PV device events
Xenvm
- xenvm/misc.ml - miscellaneous helpers, as the name says
- xenvm/tasks.ml - list of rpc tasks xenvm supports
- xenvm/vmact.ml - high level implementation of vm operations (start/stop etc)
- xenvm/vmconfig.ml - parsing xenvm config files (in OpenXT, placed in /tmp/xenmgr-xenvm-*)
- xenvm/vmstate.ml - VM state struct
- xenvm/xenops.ml - xenops Dom0 utility entry point
- xenvm/xenvm.ml - daemon entry point
xenmgr
...
Unlike incoming calls, incoming notifications are processed serially on a separate IO thread. This is because the ordering of notifications is important (and guaranteed by dbus, so we have to process them serially to keep that guarantee). As a consequence, long-running notification handlers should fork off a thread so as not to block the queue.
Exported DBus entities
xenmgr exports the following dbus objects:
- / - root object, contains global configuration and operations
- /vm/$VM_UUID - per VM dbus object, provides access to VM configuration
- /vm/$VM_UUID/$DISK_ID - provides access to VM disk configuration
- /vm/$VM_UUID/$NIC_ID - provides access to VM network interface configuration
- /host - provides access to host specific configuration and host information
Each of these objects exports one or more dbus interfaces, as defined in the IDL repository (idl.git). The following IDL files are used by xenmgr:
- xenmgr.xml
- xenmgr_vm.xml
- xenmgr_host.xml
- vm_nic.xml
- vm_disk.xml
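For orientation, here is a minimal sketch of calling one of these interfaces by hand from Dom0. The bus name, object path and member come from the object list and the policy snippet above; the exact dbus-send flags may differ on a given build:
```
# list the VM object paths exposed by xenmgr (root object "/", interface com.citrix.xenclient.xenmgr)
dbus-send --system --print-reply \
    --dest=com.citrix.xenclient.xenmgr / \
    com.citrix.xenclient.xenmgr.list_vms
```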
DBus interface boilerplate generation
Most of the dbus boilerplate code, such as the hooks to implement exported functions as well as the stubs for calling other daemons, is generated from the files in idl.git by a custom-written program, "rpcgen". It takes dbus XML files as input and produces bindings/hooks in a variety of languages as output.
In the case of xenmgr, the boilerplate is generated from its BB recipe in the configure step:
```
# generate rpc stubs
mkdir -p Rpc/Autogen
# Server objects
xc-rpcgen --haskell -s -o Rpc/Autogen --module-prefix=Rpc.Autogen ${STAGING_DATADIR}/idl/xenmgr.xml
xc-rpcgen --haskell -s -o Rpc/Autogen --module-prefix=Rpc.Autogen ${STAGING_DATADIR}/idl/xenmgr_vm.xml
```
Hence modification of the IDL usually consists of the following steps:
- modify IDL files
- re-stage IDL files (for example by ./bb -cforce_rebuild xenclient-idl && ./bb xenclient-idl)
- regenerate boilerplate (./bb -cconfigure xenmgr)
- compile xenmgr (./bb -ccompile xenmgr), fix errors, add implementation for new methods etc
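Put together, a typical iteration loop looks roughly like this (the ./bb invocations are exactly the ones listed in the steps above):
```
# edit the xml files in idl.git, then:
./bb -cforce_rebuild xenclient-idl && ./bb xenclient-idl   # re-stage the IDL files
./bb -cconfigure xenmgr                                    # regenerate the boilerplate
./bb -ccompile xenmgr                                      # compile, fix errors, implement new methods
```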
VM templates
There's a bunch of JSON templates in the "templates" subfolder; on the device they are installed to /usr/share/xenmgr-1.0/templates. There are basically two types of templates:
- service VM templates, distinguished by the "service-" prefix in the file name. On each boot the toolstack will create a VM instance based on these (visible via the xec-vm control utility). If the same instance was already present, its config is overwritten on each boot, which generally makes for easy upgrades of service VM config files (default NDVM or UIVM).
- all other templates. VM instances based on these are only created on manual request (either from the UIVM creation wizard, or via one of the xec create-vm-* functions).
It's sometimes handy to turn off the automatic overwriting of service VM templates on each xenmgr start, so that the service VM can be reconfigured via manual xec-vm invocations. This can be achieved via
$ db-write /xenmgr/overwrite-<vmtype>-settings false
for example for NDVM: db-write /xenmgr/overwrite-ndvm-settings false
VM configuration
JSON vm configuration files are stored in /config/vms/, though xenmgr doesn't access them directly but rather through dbd's dbus API. Low-level configuration files consumed by xenvm (which are output by xenmgr) reside in /tmp/xenmgr-xenvm-$VM_UUID.
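As a hedged illustration only (assuming the db-ls helper that accompanies the db-write tool used elsewhere on this page, and assuming VM config lives under a /vm/<uuid> key, as the get_db_key method suggests), the config tree can be inspected through dbd rather than by reading /config/vms directly:
```
# hypothetical: list a VM's config keys via dbd
db-ls /vm/$VM_UUID
```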
VM properties
VM properties are declared in idl.git, in interfaces/xenmgr_vm.xml. Most of them are documented. Any additional documentation should go into the IDL files, since that's also the place from which the SDK docs are generated.
Dodgy VM properties
There are some VM properties which have non trivial repercussions and warrant additional clarification:
- xci-cpuid-signature
Setting it to true changes the CPUID signature reported by Xen to be xenclient-specific, effectively hiding Xen's presence from the guest. We use it on Linux VMs to prevent the default PV drivers in upstream kernels from activating (so that we can use custom ones). Toggling it off enables the use of the upstream drivers, if needed.
- flask-label
This specifies the SELinux security context under which the VM runs; if mismatched, it can prevent the VM from being able to issue the hypercalls required for its normal function.
- provides-network-backend
Tells xenmgr to treat this VM as a network VM; xenmgr will send notifications about VMs and VM state to the network daemon.
- greedy-pciback-bind
This makes the pciback driver greedily seize, at OpenXT boot, any device which is configured for PCI passthrough by any VM. This behavior differs from the upstream one, but it is useful to prevent Dom0 drivers from claiming devices which might later be used for passthrough.
The audio driver is a good example; if the audio card is not seized by the pciback driver at boot, the alsa drivers in Dom0 will take it over, which might prevent passthrough from working correctly later when the PCI passthrough VM is started.
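These are regular VM properties, so they can be flipped with the same xec-vm set pattern used elsewhere on this page, for example (hypothetical VM name):
```
# make pciback grab this VM's passthrough devices at boot, before Dom0 drivers can claim them
xec-vm -n winvm set greedy-pciback-bind true
```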
VM hooks
xenmgr supports offloading some of its functionality to either a script or another dbus daemon (possibly running in a different VM). This is done via the following VM hooks (implemented as regular VM properties):
- run-post-create
- run-pre-delete
- run-pre-boot
- run-insteadof-start
- run-on-state-change
- run-on-acpi-state-change
Each hook can be either a script name, for example: xec-vm -n foo set run-insteadof-start /bin/customstart
or a string specifying the dbus rpc call to make, example:
$ xec-vm -n foo set run-insteadof-start rpc:vm=$SOME_UUID,destination=com.citrix.some.service,interface=com.citrix.iface,member=com.citrix.some.method
Both the script and the rpc call get passed the affected VM's uuid as their sole argument.
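A minimal sketch of a script hook, assuming only what is stated above (the hook receives the VM's uuid as its sole argument); the script path and logger message are made up for illustration:
```
#!/bin/sh
# hypothetical hook script, installed e.g. as /usr/local/bin/my-state-hook
VM_UUID="$1"   # xenmgr passes the affected VM's uuid as the sole argument
logger "xenmgr hook fired for vm $VM_UUID"
```
It would then be wired up just like the example above, e.g. xec-vm -n foo set run-on-state-change /usr/local/bin/my-state-hook.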
PCI passthrough rules
VM passthrough rules are specified by a set of matchers, which are evaluated when the VM starts to find the actual PCI devices that need to be passed through.
VM PCI passthrough rules are managed by the following dbus methods:
- add_pt_rule - adds a new passthrough rule. Each of the matchers can take either an ID or the word "any", which matches all devices. Example:
  xec -n somevm add-pt-rule 0x680 any any
- add_pt_rule_bdf - adds a new passthrough rule using BDF notation. Example:
  xec -n somevm add-pt-rule-bdf 0000:00:16.0
- delete_pt_rule, delete_pt_rule_bdf - removal of passthrough rules
- list_pt_rules - lists current passthrough rules
- list_pt_pci_devices - evaluates all vm passthrough rules and outputs the list of matching pci devices on the host
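Assuming the usual underscore-to-dash mapping between the dbus method names and the CLI (as with add-pt-rule above), inspecting a VM's rules would look like:
```
# show the configured rules and the host PCI devices they currently match
xec -n somevm list-pt-rules
xec -n somevm list-pt-pci-devices
```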
V4V firewall rules
V4V firewall rules are managed by a dbus API similarly to PCI passthrough rules and are evaluated when the VM starts. They usually result in a series of calls to the "viptables" command-line program to erect the firewall. On VM shutdown, the entries added during VM start are torn down.
V4V firewall rules can be modified via the add_v4v_firewall_rule / delete_v4v_firewall_rule methods. These take a string argument with a rule definition. The format of the rule string is as follows:
<source> -> <destination>
the format of each of the source/destination endpoints is
( * | my-stubdom | myself | dom-type=<domaintype> | dom-name=<domainname> ) : <v4v port>
examples:
- open a connection from the VM to domain 0's port 5555: myself -> 0:5555
- open a connection from the VM to all domains of type "ndvm", port 5555: myself -> dom-type=ndvm:5555
- open a connection from the VM's stubdom to domain 0, port 333: my-stubdom -> 0:333
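As a sketch, assuming the same underscore-to-dash CLI mapping as for the PCI passthrough methods above, adding one of the example rules would look like:
```
# let the VM reach any "ndvm"-type domain on v4v port 5555
xec-vm -n somevm add-v4v-firewall-rule "myself -> dom-type=ndvm:5555"
```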
VM measurement
If the "measured" property in the VM config tree is set, the first disk (ID 0) will is evaluated for checksum consistency and its hash is stored in the VM config tree. This is done by mounting the filesystem and computing a sha256 hash of the filesystem (as opposed to doing it on VHD file; because even readonly VHD files are modified due some bookkeeping information such as access timestamp).
If a hash inconsistency is detected, a measurement failure action is invoked. It defaults to shutting down the system and can be overridden via
$ db-write /xenmgr/measure-fail-action <powermanagementaction>
where <powermanagementaction> can be one of the following:
- sleep
- hibernate
- shutdown
- forced-shutdown
- reboot
- nothing
and, as mentioned before, defaults to forced-shutdown if not specified.
VM dependencies
xenmgr supports a simple form of dependency tracking between VMs. If a VM is configured such that its network backend is not in Dom0, xenmgr will internally track that dependency. It's possible to list all of a VM's dependencies via
$ xec-vm -n somevm list-dependencies
By default, when a VM is started, xenmgr will ensure all its dependencies are started first. This can be toggled off by setting the "track-dependencies" VM property to false.
There's no automatic shutdown of dependent VMs.
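For example (VM name hypothetical, property name as given above):
```
# show what this VM depends on, then stop xenmgr from auto-starting those dependencies
xec-vm -n somevm list-dependencies
xec-vm -n somevm set track-dependencies false
```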
Power Management
Xenmgr supports some configuration of power-related operations:
- vm property "s3-mode"
This configures how the toolstack handles requests to put a VM into S3. Note that this doesn't affect requests made from within the guest, only requests originating from the UI, closing the laptop lid, etc. It can be one of the following:
- ignore - the request to put the VM into S3 will be ignored; the toolstack will proceed to the next VM on the list
- PV - the toolstack will ask the PV driver within the guest to put it into S3 via xenstore. This is the default for most VMs
- restart - the toolstack will shut down the VM and start it again after S3 resume. This is useful for the NDVM (see the sketch at the end of this section)
- vm property "s4-mode"
Accepts exactly the same values as "s3-mode", but specifies the action to be taken when the toolstack is supposed to put the VM into hibernation.
- vm property "control-platform-power-state"
This is useful for some single-vm scenarios. The toolstack will track the guest power state and try to put the host into the same state, so if the guest goes to S3, the toolstack will put the host into S3 as well.
- host methods "set_ac_lid_close_action", "set_battery_lid_close_action"
Configure the power action performed when the laptop lid closes, either on battery or on AC power. Can be one of the following:
- sleep
- hibernate
- shutdown
- forced-shutdown
- reboot
- nothing
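A sketch of configuring the per-VM S3 behaviour, assuming "s3-mode" is set like any other VM property (the VM name is hypothetical):
```
# restart the network VM around host S3 instead of trying to suspend it
xec-vm -n ndvm set s3-mode restart
```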
Code layout
Xenmgr is split into a small library (xenmgr-core) and the main daemon (xenmgr). xenmgr-core currently contains very little, primarily a bit of v4v firewall rule parsing code. The intent was to move code useful for other projects (such as the OVF import tool) out of the xenmgr daemon and into the library.
Custom Libraries Used
There's a bunch of small haskell libraries we wrote which are in use by toolstack components. They reside in xclibs.git.
- udbus - vincent's small dbus library
- xchv4v - v4v access library
- xch-rpc - higher level rpc library based on udbus, also with support for v4v rpc tunnels
- xchdb - access to database files in /config, through the database daemon
- xchutils - various random useful utility code
- xchwebsocket - websocket library used by rpc-proxy
xenmgr-core
- Vm/Uuid.hs - a few functions related to handling UUIDs
- Vm/ProductProperty.hs - handling OVF product properties (not used much at the moment, aside from (theoretically) VMs defined by OVF XML)
- Vm/V4VFirewall.hs - parsing of v4v firewall rules
xenmgr
- Rpc/Autogen - folder containing all rpc access stubs generated by rpcgen (invoked from bb recipe)
- Tools/* - various utility code
Vm subtree
- Vm/Types.hs - definition of important types, such as definition of vm config and vm states
- Vm/Monad.hs - definition of "Vm" monad which is a simple reader-based monad that implicitly holds context for single vm
- Vm/Config.hs - definition of database location for all vm config properties and code to create lower-level xenvm config files
- Vm/State.hs - vm state conversions from/to string, as well as the concept of an internal vm state, which is a vm state private to xenmgr and more detailed than the generic state exposed by xenvm
- Vm/Dm.hs, DmTypes.hs - types relating to device model, such as disks, nics, xenstore device states, handling backend/frontend interactions
- Vm/DomainCore.hs - handling domids/lookup of domains and their stubdoms
- Vm/Balloon.hs - ballooning service vm memory down (if allowed by vm config, which it usually isn't)
- Vm/Pci*.hs - PCI passthrough - parsing pci passthrough rules, handling of binding drivers to pciback
- Vm/DepGraph.hs - a few graph utilities to solve the dependency graph between vms (even though it usually boils down to 2-node graphs...)
- Vm/Policies.hs - storage and query of vm policy settings
- Vm/Monitor.hs - code for monitoring vm events from various sources (xenvm and xenstore) and registering/invoking internal handlers
- Vm/React.hs - event handlers for vm events coming from Monitor.hs
- Vm/Templates.hs - finding, categorizing, reading vm and service vm templates (in /usr/share/xenmgr-1.0/templates)
- Vm/Queries.hs - many passive functions to query vm state/config. Of particular interest is the "getVmConfig" function, which creates an in-memory config representation based on the database
- Vm/Actions.hs - active functions for changing vm config as well as doing some runtime state manipulation (such as relocating network backend to different ndvm), starting/stopping vms etc
XenMgr subtree
- XenMgr/Connect/* - wrappers to access other daemons in the system
- XenMgr/Expose/* - entry points for all xenmgr's dbus server rpcs
- XenMgr/CdLock.hs - relatively new code for handling the AFRL request cd drive lock model
- XenMgr/Config.hs - global xenmgr config storage/query
- XenMgr/Diagnostics.hs - gathering status reports from vms + other diagnostics
- XenMgr/Diskmgr.hs - vhd creation
- XenMgr/Errors.hs - definition of numbered errors reported to the UI
- XenMgr/Host.hs - lots of host-level query functions (eth0 mac addresses, bios versions, xc versions, update state etc)
- XenMgr/HostOps.hs - host shutdown/sleep/hibernate/reboot entry points
- XenMgr/PowerManagement.hs - actual implementation of host shutdown/sleep/hibernate/reboot etc plus code to handle lid state changes
- XenMgr/Notify.hs - wrappers for easier generation of various dbus signals
- XenMgr/Rpc.hs - definition of Rpc monad used in xenmgr for dbus access
- XenMgr/XM.hs - definition of the XM monad, based on a reader monad containing context for all vms. Useful for doing some cross-vm interactions which require locking / synchronization
Interactions with other daemons
xenmgr interacts with the following daemons on the system:
- database daemon (dbd) for config storage. Via DBUS only.
- xenvm for all low level vm operations / domain creation / shutdown etc. Via DBUS and textual vm config files in /tmp/xenmgr-xenvm-$UUID
- input daemon for handling on-boot authentication screen, vm focus switching, screen lock, seamless mouse configuration. Via DBUS.
- network daemon, to notify it about the state of vms marked with the "provides-network-backend" property. Via DBUS (over v4v into the network domain)
- surfman to query it for list of passthrough GPU devices if using PVMs ("get_vgpu_mode" surfman RPC). Via DBUS.
apptool
Apptool is an OVF (Open Virtualization Format) import tool. More info at https://en.wikipedia.org/wiki/Open_Virtualization_Format and https://github.com/OpenXT/manager/tree/master/apptool.
XenVM cont.
There is some more documentation buried in the toolstack repository. I added it here for convenience.
Xenvm is a single-VM monitor: a single process that follows the life of a VM and opens a command socket to which you can send commands, just like xm does to xend.
Xenvm exposes the uuid of the domain to the outside world, so instead of having to refer to the domid of the domain, all outside commands use the uuid. The uuid being stable, the same uuid keeps working across reboot/suspend/resume cycles.
Config file
You can put the following parameters in your config file:
(common options)
key name | type | description |
---|---|---|
hvm | b | specify whether the guest is HVM (true) or PV (false) |
startup | s | specify what to do with the domain at startup; possible values: started, paused, shutdown or restore |
debug | b | log all operations to /tmp/xenvm-debug-%uuid |
uuid | s | specify the domain uuid (defaults to autogeneration) |
on_crash | s | specify the action to be taken when the domain crashes; possible values: preserve, reboot, destroy |
on_halt | s | specify the action to be taken when the domain halts; possible values: preserve, reboot, destroy |
on_reboot | s | specify the action to be taken when the domain reboots; possible values: preserve, reboot, destroy |
kernel | s | specify where to find the kernel to boot (can be empty for hvm; defaults to hvmloader) |
memory | i | specify the memory given to the guest in megabytes |
vcpus | i | number of vcpus available to the guest |
disk | s | add a virtual disk. Format: physpath:phystype:virtpath:mode:devtype[:k=v...], where physpath is the path to the disk image, raw device, ...; phystype is phy or file; virtpath is hd(a-d); mode is r or w; devtype is disk or cdrom; extra k=v arguments: cipher, key-size, key-file |
nic | s | add a virtual nic. Format: key=value,key=value,... (can be empty); supported keys: bridge, mac, id. Examples: "nic = bridge=xen-br0,mac=ab:ef:fe:dc:ba:ab", "nic = mac=ab:ef:fe:dc:ba:ab", "nic = " |
pci | s | add a pci device. Format: devid,bind,domain:bus:device.function |
serial | s | redirect serial to a device or to the network (tcp:ip:port), e.g. "pty" or "tcp:1.2.3.4:1234" |
display | s | details the type of display available for the guest. Format: type:key[=value],key[=value],...; possible values for type: none, vnc (keys allowed: use-port-unused, keymap, port), sdl, intel |
(the following are only useful for PV)
key name | type | description |
---|---|---|
cmdline | s | command line given to the kernel |
initrd | s | specify the initrd to use; leave empty for none |
(the following are only useful for HVM)
key name | type | description |
---|---|---|
pae | b | specify that the guest is using PAE |
acpi | b | specify that the guest is using ACPI |
apic | b | specify that the guest is using APIC |
nx | b | specify that the guest is using NX |
smbios-pt | b | specify that the guest is using smbios pass-through |
smbios-oem-type-pt | i | table numbers to pass through |
acpi-pt | b | specify that the guest is using ACPI pass-through |
diskinfo-pt | b | specify that the guest is using SCSI diskinfo pass-through |
boot | s | specify the qemu boot string |
extra-hvm | k=v | specify extra arguments passed through to qemu as -k v |
power-management | i | specify the power management pass-through mode: 1 = pass-through mode (limited scope), 2 = non pass-through mode (if in doubt use this) |
oem-features | i | specify whether or not to pass through oem features. Note: at the moment any integer value can be passed, but this is likely to change in future, especially if we decide to pass through a subset of oem features and let the user configure that subset |
timer-mode | i | specify the timer mode used. |
timeoffset | s | specify the time offset (i.e. timezone) used. |
pci-msitranslate | i | specify whether to use MSI-INTx translation for guest. |
pci-power-management | i | specify whether or not to enable Dx power management for passthrough devices |
inject-sci | i | specify whether or not to inject SCIs, like lid close or power button press, to the guest (default: no injection) |
Sending command to the monitor
Xenvm binds a unix socket to the uuid specified/generated. You can easily send commands to the monitor with xenvm-cmd using the simple syntax:
xenvm-cmd <uuid> <command>
The following commands are supported:
- help
- pause
- unpause
- destroy
- start
- suspend file=<file>
- suspend file=<file> live=true
- restore file=<file>
- checkpoint
- shutdown
- restart
- quit (quit the monitor leaving the vm untouched)
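For example, with the uuid from the sample configs below (command names taken from the list above; exact argument syntax may vary):
```
xenvm-cmd 00000000-0000-0000-0000-000000000001 pause
xenvm-cmd 00000000-0000-0000-0000-000000000001 unpause
xenvm-cmd 00000000-0000-0000-0000-000000000001 suspend file=/tmp/vm1.suspend
```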
Running with xen-unstable
It's possible to run with xen-unstable directly, but you need the qemu-dm-wrapper script and xenguest binary in /usr/bin/. They are available in xenguest/xenguest and scripts/qemu-dm-wrapper in the toolstack.git repository.
You also need to replace your udev rules with the ones available in scripts/xen-backend.rules and scripts/xen-frontend.rules, and add scripts/tap, scripts/block, scripts/block-front and scripts/vif into /etc/xensource/scripts/.
Also note that since xen-unstable doesn't have the dm-ready patch, an HVM domain unfortunately takes a substantial time (around 20s) to start.
Example of config
PV config with one LVM disk called 'test' and one VIF:
```
uuid = 00000000-0000-0000-0000-000000000001
hvm = false
kernel = /boot/vmlinuz-2.6.18-xenU
cmdline = root=/dev/sda1 ro
memory = 64
disk = /dev/vg/test:phy:sda:w:disk
vif =
```
HVM config for installing a Windows 2k3 from an iso onto an LVM disk called 'testhvm':
```
uuid = 00000000-0000-0000-0000-000000000002
hvm = true
memory = 256
disk = /dev/vg/testhvm:phy:hda:w:disk
disk = /var/opt/xen/iso_import/w2k3eesp2.iso:file:hdd:r:cdrom
boot = dc
pae = true
acpi = true
apic = true
```