VM Memory Layout
Memory Model
The rundown here focuses on HVM guests, though PV guests are not all that different. The following is the memory layout from xenops/memory.ml:
(* === Domain memory breakdown ============================================== *)
(*          ╤  ╔══════════╗                                      ╤            *)
(*          │  ║ shadow   ║                                      │            *)
(*          │  ╠══════════╣                                      │            *)
(* overhead │  ║ extra    ║                                      │            *)
(*          │  ║ external ║                                      │            *)
(*          │  ╠══════════╣                          ╤           │            *)
(*          │  ║ extra    ║                          │           │            *)
(*          │  ║ internal ║                          │           │            *)
(*          ╪  ╠══════════╣               ╤          │           │  footprint *)
(*          │  ║ video    ║               │          │           │            *)
(*          │  ╠══════════╣  ╤  ╤         │  actual  │  xen      │            *)
(*          │  ║          ║  │  │         │  /       │  maximum  │            *)
(*          │  ║          ║  │  │         │  target  │           │            *)
(*          │  ║ guest    ║  │  │  build  │  /       │           │            *)
(*          │  ║          ║  │  │  start  │  total   │           │            *)
(*   static │  ║          ║  │  │         │          │           │            *)
(*  maximum │  ╟──────────╢  │  ╧         ╧          ╧           ╧            *)
(*          │  ║          ║  │                                                *)
(*          │  ║          ║  │                                                *)
(*          │  ║ balloon  ║  │  build                                         *)
(*          │  ║          ║  │  maximum                                       *)
(*          │  ║          ║  │                                                *)
(*          ╧  ╚══════════╝  ╧                                                *)
The blocks marked build maximum and video are passed to xenvm via the input configuration. The balloon area is the memory available to be ballooned up using the xenops balloon option when Populate on Demand (PoD) is used. The extra internal block is an extra 1 MiB of total memory given to the guest.
TODO: It is not totally clear what extra external and shadow are, but they are external to the guest's memory. Presumably they are for use by xenvm.
The memory model is defined in xenops/memory.ml. The extra blocks and the actual memory model structures are found here:
module HVM_memory_model_data : MEMORY_MODEL_DATA = struct
	let extra_internal_mib = 1L (* The extra 1 MiB *)
	let extra_external_mib = 1L
end

...

module Memory_model (D : MEMORY_MODEL_DATA) = struct
	(* memory block definitions generated here *)
	...
end
The VM building code mainly resides in xenops/domain_control.ml. Starting in build_hvm, which is called from xenvm/vmact.ml, all the memory values are fetched:
- static_max_mib comes from the static_max_kib passed to the function. It is build_max_mib + video_mib.
- xen_max_mib is calculated in the memory model as static_max_mib + D.extra_internal_mib.
- build_max_mib and build_start_mib end up being the same for HVM guests in OpenXT because static_max_mib and target_mib are the same. They are calculated in the memory model by subtracting video_mib. build_max_mib and build_start_mib may differ for guests when PoD mode is desired. A worked example of this arithmetic is sketched below.
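As a concrete illustration of that arithmetic, here is a small sketch for a hypothetical HVM guest with a 2048 MiB static maximum and 16 MiB of video RAM. The numbers and the standalone program are assumptions for illustration only; they are not values taken from an OpenXT configuration.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical example of the memory model arithmetic described above. */
int main(void)
{
    uint64_t static_max_mib     = 2048;  /* from static_max_kib in the VM config (assumed) */
    uint64_t target_mib         = 2048;  /* same as the static max for OpenXT HVM guests   */
    uint64_t video_mib          = 16;    /* OpenXT default video RAM                       */
    uint64_t extra_internal_mib = 1;     /* D.extra_internal_mib                           */

    uint64_t xen_max_mib     = static_max_mib + extra_internal_mib;  /* 2049 */
    uint64_t build_max_mib   = static_max_mib - video_mib;           /* 2032 */
    uint64_t build_start_mib = target_mib - video_mib;               /* 2032 */

    /* With static max == target, build_max == build_start and PoD is not used. */
    printf("xen_max=%llu build_max=%llu build_start=%llu (MiB)\n",
           (unsigned long long)xen_max_mib,
           (unsigned long long)build_max_mib,
           (unsigned long long)build_start_mib);
    return 0;
}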
With the values above, the first step is to set the overall maximum memory for the guest. This is done in build_pre using xen_max_mib. After that step build_hvm calls Xg.hvm_build (which eventually calls libxc's xc_hvm_build). The build_start_mib and build_max_mib values are passed in and are used to physmap the initial memory for the guest. Note that this does not include video_mib or D.extra_internal_mib; more on those later. This is the relevant code, annotated:
let build_pre ~xc ~xs ~vcpus ~xen_max_mib ~shadow_mib ~required_host_free_mib domid =
	...
	(* This is the call to set the overall maximum for the guest. Note that none of this
	   memory is mapped yet. This is the maximum possible memory *)
	Xc.domain_setmaxmem xc domid (Memory.kib_of_mib xen_max_mib);
	...

let build_hvm ~xc ~xs ~static_max_kib ~target_kib ~video_mib ~shadow_multiplier ~vcpus
              ~kernel ~timeoffset ~xci_cpuid_signature domid =
	...
	(* Call build_pre first to set the overall memory maximum, xen_max_mib *)
	let store_port, console_port = build_pre ~xc ~xs ~xen_max_mib ~shadow_mib
		~required_host_free_mib ~vcpus domid in
	...
	(* Call the Xg xenguest helper library to do the actual domain building via libxc.
	   The build_start_mib and build_max_mib values are used to do the physmapping at
	   this time. *)
	let store_mfn, console_mfn = Xg.hvm_build xgh domid (Int64.to_int build_max_mib)
		(Int64.to_int build_start_mib) kernel platformflags store_port console_port in
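For readers more comfortable on the C side, the two calls above condense to a setmaxmem followed by the HVM build. The sketch below is a rough approximation of that flow against the classic libxc/xenguest interface; the sizes, the firmware path handling, and the helper itself are illustrative assumptions and not code from xenvm.

#include <stdint.h>
#include <xenctrl.h>    /* xc_domain_setmaxmem, xc_hvm_build */
#include <xenguest.h>   /* struct xc_hvm_build_args (classic libxc API, assumed available) */

/* Illustrative sketch only: mirrors build_pre + Xg.hvm_build for a non-PoD HVM guest. */
static int sketch_build_hvm(xc_interface *xch, uint32_t domid, const char *firmware)
{
    uint64_t static_max_mib     = 2048;   /* from the VM config (assumed)  */
    uint64_t video_mib          = 16;     /* OpenXT default video RAM      */
    uint64_t extra_internal_mib = 1;      /* D.extra_internal_mib          */

    uint64_t xen_max_mib     = static_max_mib + extra_internal_mib;
    uint64_t build_max_mib   = static_max_mib - video_mib;
    uint64_t build_start_mib = build_max_mib;   /* no PoD: start == max */

    /* build_pre: cap the domain before any memory is physmapped (argument is in KiB) */
    if (xc_domain_setmaxmem(xch, domid, xen_max_mib << 10))
        return -1;

    /* Xg.hvm_build -> xc_hvm_build: physmap build_start_mib of guest RAM */
    struct xc_hvm_build_args args = { 0 };
    args.mem_size        = build_max_mib   << 20;   /* bytes */
    args.mem_target      = build_start_mib << 20;   /* bytes; smaller than mem_size would enable PoD */
    args.image_file_name = firmware;                /* hvmloader path (illustrative) */

    return xc_hvm_build(xch, domid, &args);
}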
Build start and max memory
A note on the memory values used in Xg.hvm_build. In the actual call to libxc, the values are mapped as follows:
CAMLprim value stub_xc_hvm_build_native(value xc_handle, value domid, value mem_max_mib,
                                        value mem_start_mib, value image_name,
                                        value platformflags, value store_evtchn,
                                        value console_evtchn)
	...
	args.mem_size = (uint64_t) Int_val(mem_max_mib) << 20;
	args.mem_target = (uint64_t) Int_val(mem_start_mib) << 20;
	...
	/* In the libxc code, if mem_target < mem_size, Populate on Demand mode is set for the VM
	 * and during the physmap process, less than mem_size will get mapped initially. */
	r = xc_hvm_build(xch, _D(domid), &args);
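To make the PoD condition concrete, here is a small hedged illustration of the unit handling and the comparison libxc effectively makes. The numbers are made up and the program is not part of xenvm or libxc.

#include <stdint.h>
#include <stdio.h>

/* mem_max_mib / mem_start_mib correspond to build_max_mib / build_start_mib in xenvm. */
int main(void)
{
    uint64_t mem_max_mib   = 2032;   /* e.g. 2048 MiB static max minus 16 MiB video (assumed) */
    uint64_t mem_start_mib = 1008;   /* e.g. 1024 MiB target minus 16 MiB video (assumed)     */

    uint64_t mem_size   = mem_max_mib   << 20;   /* MiB -> bytes, as in the stub above */
    uint64_t mem_target = mem_start_mib << 20;

    /* PoD is enabled when less memory is mapped initially than the build maximum */
    int pod = (mem_target < mem_size);

    printf("mem_size=%llu bytes, mem_target=%llu bytes, PoD=%s\n",
           (unsigned long long)mem_size, (unsigned long long)mem_target,
           pod ? "enabled" : "disabled");
    return 0;
}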
Video and extra internal memory
As noted above, the overall maximum amount of memory a guest is allowed includes these values, but they are not physmapped by the domain builder code. They are in fact physmapped by QEMU. In OpenXT the video memory is 16 MiB; it is mapped in xen-all.c:xen_ram_alloc. Any additional devices such as xenmou can also physmap more memory in the extra internal region, which as noted is 1 MiB.
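For a sense of what that QEMU-side mapping involves, the sketch below populates a range of guest frames using libxc. It is a simplified stand-in for what xen_ram_alloc ends up doing; the base PFN, the region size, and the helper are assumptions, and QEMU's real placement logic and flags differ.

#include <stdint.h>
#include <stdlib.h>
#include <xenctrl.h>   /* xc_domain_populate_physmap_exact */

/* Illustrative sketch: populate `size_mib` MiB of guest RAM starting at `base_pfn`.
 * This approximates how extra regions (e.g. video RAM) get physmapped after the
 * domain builder has run. */
static int populate_region(xc_interface *xch, uint32_t domid,
                           unsigned long base_pfn, unsigned long size_mib)
{
    unsigned long nr_pages = size_mib << 8;          /* 1 MiB = 256 pages of 4 KiB */
    xen_pfn_t *pfns = malloc(nr_pages * sizeof(*pfns));
    if (!pfns)
        return -1;

    for (unsigned long i = 0; i < nr_pages; i++)
        pfns[i] = base_pfn + i;

    /* Ask Xen to back these guest frames with real pages (order-0 extents) */
    int rc = xc_domain_populate_physmap_exact(xch, domid, nr_pages, 0, 0, pfns);
    free(pfns);
    return rc;
}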
xenvm vs. libxl
The good news is that all of the above machinery is more or less the same for libxl:
/* Defined in libxl_internal.h, this is the extra internal memory in KiB */
#define LIBXL_MAXMEM_CONSTANT 1024

int libxl__build_pre(libxl__gc *gc, uint32_t domid,
                     libxl_domain_config *d_config, libxl__domain_build_state *state)
	...
	/* Target memory, like max memory in the info struct, includes the video memory.
	 * The extra internal memory is added during the call to set the overall max memory
	 * for the guest. Note the values are in KiB in this case. */
	xc_domain_setmaxmem(ctx->xch, domid, info->target_memkb + LIBXL_MAXMEM_CONSTANT);
	...

int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
                     libxl_domain_build_info *info, libxl__domain_build_state *state)
	...
	/* As before, the video memory is removed from mem_size and mem_target before
	 * being sent to the libxc domain builder to get physmapped. The shift by 10 is
	 * due to some weirdness in the values in libxl, see the comment in the code. */
	args.mem_size = (uint64_t)(info->max_memkb - info->video_memkb) << 10;
	args.mem_target = (uint64_t)(info->target_memkb - info->video_memkb) << 10;
	...
	ret = xc_hvm_build(ctx->xch, domid, &args);
TODO: xenvm uses the max memory value when doing the xc_domain_setmaxmem call, whereas libxl uses the target memory, which can be smaller when using PoD. This needs more investigation, but the difference is most likely handled during the ballooning process. That probably means xenvm is really doing it wrong.
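A minimal sketch of that difference, assuming a hypothetical PoD guest with a 2048 MiB static maximum and a 1024 MiB target; the numbers and the comparison program are illustrative and are not taken from either code base.

#include <stdint.h>
#include <stdio.h>

#define EXTRA_INTERNAL_KIB 1024   /* LIBXL_MAXMEM_CONSTANT / xenvm's extra 1 MiB */

/* Compare the setmaxmem value each toolstack would use for the same PoD guest. */
int main(void)
{
    uint64_t static_max_kib = 2048 << 10;   /* 2048 MiB static maximum (assumed) */
    uint64_t target_kib     = 1024 << 10;   /* 1024 MiB PoD target (assumed)     */

    /* xenvm: cap at the static maximum plus the extra internal memory */
    uint64_t xenvm_maxmem_kib = static_max_kib + EXTRA_INTERNAL_KIB;

    /* libxl: cap at the current target plus the extra internal memory; presumably
     * the cap is raised later as the guest balloons up (see the TODO above) */
    uint64_t libxl_maxmem_kib = target_kib + EXTRA_INTERNAL_KIB;

    printf("xenvm setmaxmem: %llu KiB\n", (unsigned long long)xenvm_maxmem_kib);
    printf("libxl setmaxmem: %llu KiB\n", (unsigned long long)libxl_maxmem_kib);
    return 0;
}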