* [PATCH PROTOTYPE 0/6] virtio-mem: vfio support
@ 2020-09-24 16:04 David Hildenbrand
  2020-09-24 16:04 ` [PATCH PROTOTYPE 1/6] memory: Introduce sparse RAM handler for memory regions David Hildenbrand
                   ` (7 more replies)
  0 siblings, 8 replies; 27+ messages in thread
From: David Hildenbrand @ 2020-09-24 16:04 UTC (permalink / raw)
  To: qemu-devel
  Cc: Pankaj Gupta, David Hildenbrand, Michael S. Tsirkin,
	Dr. David Alan Gilbert, Peter Xu, Luiz Capitulino, Auger Eric,
	Alex Williamson, Wei Yang, Paolo Bonzini, Igor Mammedov

This is a quick and dirty (1.5 days of hacking) prototype to make
vfio and virtio-mem play together. The basic idea was the result of Alex
brainstorming with me on how to tackle this.

A virtio-mem device manages a memory region in guest physical address
space, represented as a single (currently large) memory region in QEMU.
Before the guest is allowed to use memory blocks, it must coordinate with
the hypervisor (plug blocks). After a reboot, all memory is usually
unplugged - when the guest comes up, it detects the virtio-mem device and
selects memory blocks to plug (based on requests from the hypervisor).

Memory hot(un)plug consists of (un)plugging memory blocks via a virtio-mem
device (triggered by the guest). When unplugging blocks, we discard the
memory. In contrast to memory ballooning, we always know which memory
blocks a guest may use - especially during a reboot, after a crash, or
after kexec.

The issue with vfio is that it cannot deal with random discards - for this
reason, virtio-mem and vfio can currently only run mutually exclusively.
In particular, vfio would currently map the whole memory region (with possibly
only few/no plugged blocks), resulting in all pages getting pinned and
therefore in a higher memory consumption than expected (rendering
virtio-mem basically useless in these environments).

To make vfio work nicely with virtio-mem, we have to map only the plugged
blocks, and map/unmap properly when plugging/unplugging blocks (including
discarding of RAM when unplugging). We achieve that by using a new notifier
mechanism that communicates changes.
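
As a rough sketch (simplified from patch 3 below; rollback and error
handling omitted, the local helper names are made up), the vfio side
essentially does:

    static int my_notify_map(SparseRAMNotifier *n, const MemoryRegion *mr,
                             uint64_t mr_offset, uint64_t size)
    {
        /* vfio_dma_map() the now-plugged piece, in block-size granularity */
        return 0;
    }

    static void my_notify_unmap(SparseRAMNotifier *n, const MemoryRegion *mr,
                                uint64_t mr_offset, uint64_t size)
    {
        /* vfio_dma_unmap() the now-unplugged piece with a single call */
    }

    /* when the sparse region shows up in the vfio memory listener */
    sparse_ram_notifier_init(&sram->notifier, my_notify_map, my_notify_unmap);
    memory_region_register_sparse_ram_notifier(section->mr, &sram->notifier);
    /* catch up on everything that is already plugged */
    memory_region_sparse_ram_replay_mapped(section->mr, &sram->notifier);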

It's important to map memory at the granularity at which we could see
unmaps again (-> the virtio-mem block size) - so when, e.g., plugging 100 MB
of consecutive memory with a block size of 2 MB, we need 50 mappings. When
unmapping, we can use a single vfio_unmap call for the applicable range.
We expect that the block size of virtio-mem devices will be fairly large
when used with vfio in the future (e.g., 128 MB, 1 GB, ...), configured by
the user, to not run out of mappings and to improve hot(un)plug
performance - Linux guests will still have to be optimized for that.
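
Just to illustrate the math (hypothetical helper, not part of this series):

    /* How many vfio mappings does a fully plugged range need? */
    static uint64_t sparse_ram_nb_mappings(uint64_t plugged_size,
                                           uint64_t block_size)
    {
        return plugged_size / block_size; /* e.g., 100 MB / 2 MB -> 50 */
    }

With 256 GB and an 8 MB block size that's already 32768 mappings - if I
remember correctly, close to the default DMA mapping limit (64k entries)
of vfio_iommu_type1.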

We try to handle errors when plugging memory (mapping in VFIO) gracefully
- especially to cope with too many mappings in VFIO.


As I basically have no experience with vfio, all I did for testing was
pass through a secondary GPU (NVIDIA GK208B) via vfio-pci to my guest
and see it pop up in dmesg. I did *not* actually try to use it (I know
...), so there might still be plenty of BUGs regarding the actual mappings
in the code. When I resize virtio-mem devices (resulting in
memory hot(un)plug), I can spot the memory consumption of my host adjusting
accordingly - in contrast to before, when my machine would always
consume the maximum size of my VM, as if all memory provided by
virtio-mem devices were fully plugged.

I even tested it with 2MB huge pages (sadly for the first time with
virtio-mem ever) - and it worked like a charm on the hypervisor side as
well. The number of free hugepages adjusted accordingly. (again, did not
properly test the device in the guest ...).

If anybody wants to play with it and needs some guidance, please feel
free to ask. I might add some vfio-related documentation to
https://virtio-mem.gitlab.io/ (but it really isn't that special - only
the block size limitations have to be considered).
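
For reference, roughly the kind of command line I'd expect to use (untested
in this exact form; the exact device/property names might differ slightly):

    qemu-system-x86_64 ... \
      -m 4G,maxmem=260G \
      -object memory-backend-ram,id=mem0,size=256G \
      -device virtio-mem-pci,id=vmem0,memdev=mem0,requested-size=0,block-size=128M \
      -device vfio-pci,host=0000:01:00.0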

David Hildenbrand (6):
  memory: Introduce sparse RAM handler for memory regions
  virtio-mem: Implement SparseRAMHandler interface
  vfio: Implement support for sparse RAM memory regions
  memory: Extend ram_block_discard_(require|disable) by two discard
    types
  virtio-mem: Require only RAM_BLOCK_DISCARD_T_COORDINATED discards
  vfio: Disable only RAM_BLOCK_DISCARD_T_UNCOORDINATED discards

 exec.c                         | 109 +++++++++++++++++----
 hw/vfio/common.c               | 169 ++++++++++++++++++++++++++++++++-
 hw/virtio/virtio-mem.c         | 164 +++++++++++++++++++++++++++++++-
 include/exec/memory.h          | 151 ++++++++++++++++++++++++++++-
 include/hw/vfio/vfio-common.h  |  12 +++
 include/hw/virtio/virtio-mem.h |   3 +
 softmmu/memory.c               |   7 ++
 7 files changed, 583 insertions(+), 32 deletions(-)

-- 
2.26.2




* [PATCH PROTOTYPE 1/6] memory: Introduce sparse RAM handler for memory regions
  2020-09-24 16:04 [PATCH PROTOTYPE 0/6] virtio-mem: vfio support David Hildenbrand
@ 2020-09-24 16:04 ` David Hildenbrand
  2020-10-20 19:24   ` Peter Xu
  2020-09-24 16:04 ` [PATCH PROTOTYPE 2/6] virtio-mem: Implement SparseRAMHandler interface David Hildenbrand
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 27+ messages in thread
From: David Hildenbrand @ 2020-09-24 16:04 UTC (permalink / raw)
  To: qemu-devel
  Cc: Pankaj Gupta, David Hildenbrand, Michael S. Tsirkin,
	Dr . David Alan Gilbert, Peter Xu, Luiz Capitulino, Auger Eric,
	Alex Williamson, Wei Yang, Igor Mammedov, Paolo Bonzini

We have some special RAM memory regions (managed by paravirtualized memory
devices - virtio-mem), for which the guest agreed to only use selected memory
ranges. This results in "sparse" mmaps, "sparse" RAMBlocks and "sparse"
RAM memory regions.

In most cases, we don't currently care about that - e.g., in KVM, we simply
have a single KVM memory slot (and as the number is fairly limited, we'll
have to keep it like that). However, in the case of vfio, registering the
whole region with the kernel results in all pages getting pinned, and
therefore in an unexpectedly high memory consumption. This is the main
reason why vfio is incompatible with memory ballooning.

Let's introduce a way to communicate the actually accessible/mapped (meaning,
not discarded) pieces of such a sparse memory region, and to get notified on
changes (e.g., a virtio-mem device plugging/unplugging memory).

We expect that the SparseRAMHandler is set for a memory region before it
is mapped into guest physical address space (so before any memory
listeners get notified about the addition), and that the SparseRAMHandler
isn't unset before the memory region has been unmapped from guest physical
address space (so after any memory listener got notified about the removal).

This is somewhat similar to the iommu memory region notifier mechanism.
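
In other words, a minimal sketch of the expected lifecycle (mirroring what
the virtio-mem patch in this series does):

    /* in the device realize function, before the region gets mapped */
    memory_region_set_sparse_ram_handler(&vmem->memdev->mr,
                                         SPARSE_RAM_HANDLER(vmem));

    /* in the unrealize function, after the region was unmapped */
    memory_region_set_sparse_ram_handler(&vmem->memdev->mr, NULL);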

TODO:
- Better documentation.
- Better naming?
- Handle it on RAMBlocks?
- SPAPR special handling required (virtio-mem only supports x86-64 for now)?
- Catch mapping errors during hotplug in a nice way
- Fail early when a certain number of mappings would be exceeded
  (instead of eventually consuming too many, leaving none for others)
- Resizeable memory region handling (future).
- Callback to check the state of a block.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/exec/memory.h | 115 ++++++++++++++++++++++++++++++++++++++++++
 softmmu/memory.c      |   7 +++
 2 files changed, 122 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index f1bb2a7df5..2931ead730 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -42,6 +42,12 @@ typedef struct IOMMUMemoryRegionClass IOMMUMemoryRegionClass;
 DECLARE_OBJ_CHECKERS(IOMMUMemoryRegion, IOMMUMemoryRegionClass,
                      IOMMU_MEMORY_REGION, TYPE_IOMMU_MEMORY_REGION)
 
+#define TYPE_SPARSE_RAM_HANDLER "sparse-ram-handler"
+typedef struct SparseRAMHandlerClass SparseRAMHandlerClass;
+typedef struct SparseRAMHandler SparseRAMHandler;
+DECLARE_OBJ_CHECKERS(SparseRAMHandler, SparseRAMHandlerClass,
+                     SPARSE_RAM_HANDLER, TYPE_SPARSE_RAM_HANDLER)
+
 extern bool global_dirty_log;
 
 typedef struct MemoryRegionOps MemoryRegionOps;
@@ -136,6 +142,28 @@ static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
     n->iommu_idx = iommu_idx;
 }
 
+struct SparseRAMNotifier;
+typedef int (*SparseRAMNotifyMap)(struct SparseRAMNotifier *notifier,
+                                  const MemoryRegion *mr, uint64_t mr_offset,
+                                  uint64_t size);
+typedef void (*SparseRAMNotifyUnmap)(struct SparseRAMNotifier *notifier,
+                                     const MemoryRegion *mr, uint64_t mr_offset,
+                                     uint64_t size);
+
+typedef struct SparseRAMNotifier {
+    SparseRAMNotifyMap notify_map;
+    SparseRAMNotifyUnmap notify_unmap;
+    QLIST_ENTRY(SparseRAMNotifier) next;
+} SparseRAMNotifier;
+
+static inline void sparse_ram_notifier_init(SparseRAMNotifier *notifier,
+                                            SparseRAMNotifyMap map_fn,
+                                            SparseRAMNotifyUnmap unmap_fn)
+{
+    notifier->notify_map = map_fn;
+    notifier->notify_unmap = unmap_fn;
+}
+
 /*
  * Memory region callbacks
  */
@@ -352,6 +380,36 @@ struct IOMMUMemoryRegionClass {
     int (*num_indexes)(IOMMUMemoryRegion *iommu);
 };
 
+struct SparseRAMHandlerClass {
+    /* private */
+    InterfaceClass parent_class;
+
+    /*
+     * Returns the minimum granularity at which (granularity-aligned) pieces
+     * within the memory region can become either mapped or unmapped.
+     */
+    uint64_t (*get_granularity)(const SparseRAMHandler *srh,
+                                const MemoryRegion *mr);
+
+    /*
+     * Register a listener for mapping changes.
+     */
+    void (*register_listener)(SparseRAMHandler *srh, const MemoryRegion *mr,
+                              SparseRAMNotifier *notifier);
+
+    /*
+     * Unregister a listener for mapping changes.
+     */
+    void (*unregister_listener)(SparseRAMHandler *srh, const MemoryRegion *mr,
+                                SparseRAMNotifier *notifier);
+
+    /*
+     * Replay notifications for mapped RAM.
+     */
+    int (*replay_mapped)(SparseRAMHandler *srh, const MemoryRegion *mr,
+                         SparseRAMNotifier *notifier);
+};
+
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
 typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
 
@@ -399,6 +457,7 @@ struct MemoryRegion {
     const char *name;
     unsigned ioeventfd_nb;
     MemoryRegionIoeventfd *ioeventfds;
+    SparseRAMHandler *srh; /* For RAM only */
 };
 
 struct IOMMUMemoryRegion {
@@ -1889,6 +1948,62 @@ bool memory_region_present(MemoryRegion *container, hwaddr addr);
  */
 bool memory_region_is_mapped(MemoryRegion *mr);
 
+
+static inline SparseRAMHandler* memory_region_get_sparse_ram_handler(
+                                                               MemoryRegion *mr)
+{
+    return mr->srh;
+}
+
+static inline bool memory_region_is_sparse_ram(MemoryRegion *mr)
+{
+    return memory_region_get_sparse_ram_handler(mr) != NULL;
+}
+
+static inline void memory_region_set_sparse_ram_handler(MemoryRegion *mr,
+                                                        SparseRAMHandler *srh)
+{
+    g_assert(memory_region_is_ram(mr));
+    mr->srh = srh;
+}
+
+static inline void memory_region_register_sparse_ram_notifier(MemoryRegion *mr,
+                                                           SparseRAMNotifier *n)
+{
+    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
+    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
+
+    srhc->register_listener(srh, mr, n);
+}
+
+static inline void memory_region_unregister_sparse_ram_notifier(
+                                                               MemoryRegion *mr,
+                                                           SparseRAMNotifier *n)
+{
+    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
+    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
+
+    srhc->unregister_listener(srh, mr, n);
+}
+
+static inline uint64_t memory_region_sparse_ram_get_granularity(
+                                                               MemoryRegion *mr)
+{
+    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
+    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
+
+    return srhc->get_granularity(srh, mr);
+}
+
+static inline int memory_region_sparse_ram_replay_mapped(MemoryRegion *mr,
+                                                         SparseRAMNotifier *n)
+{
+    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
+    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
+
+    return srhc->replay_mapped(srh, mr, n);
+}
+
 /**
  * memory_region_find: translate an address/size relative to a
  * MemoryRegion into a #MemoryRegionSection.
diff --git a/softmmu/memory.c b/softmmu/memory.c
index d030eb6f7c..89649f52f7 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -3241,10 +3241,17 @@ static const TypeInfo iommu_memory_region_info = {
     .abstract           = true,
 };
 
+static const TypeInfo sparse_ram_handler_info = {
+    .parent             = TYPE_INTERFACE,
+    .name               = TYPE_SPARSE_RAM_HANDLER,
+    .class_size         = sizeof(SparseRAMHandlerClass),
+};
+
 static void memory_register_types(void)
 {
     type_register_static(&memory_region_info);
     type_register_static(&iommu_memory_region_info);
+    type_register_static(&sparse_ram_handler_info);
 }
 
 type_init(memory_register_types)
-- 
2.26.2




* [PATCH PROTOTYPE 2/6] virtio-mem: Implement SparseRAMHandler interface
  2020-09-24 16:04 [PATCH PROTOTYPE 0/6] virtio-mem: vfio support David Hildenbrand
  2020-09-24 16:04 ` [PATCH PROTOTYPE 1/6] memory: Introduce sparse RAM handler for memory regions David Hildenbrand
@ 2020-09-24 16:04 ` David Hildenbrand
  2020-09-24 16:04 ` [PATCH PROTOTYPE 3/6] vfio: Implement support for sparse RAM memory regions David Hildenbrand
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2020-09-24 16:04 UTC (permalink / raw)
  To: qemu-devel
  Cc: Pankaj Gupta, David Hildenbrand, Michael S. Tsirkin,
	Dr . David Alan Gilbert, Peter Xu, Luiz Capitulino, Auger Eric,
	Alex Williamson, Wei Yang, Igor Mammedov, Paolo Bonzini

Let's properly notify when (un)plugging blocks. Handle errors from
notifiers gracefully when mapping, rolling back the change and telling
the guest that the VM is busy.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-mem.c         | 158 ++++++++++++++++++++++++++++++++-
 include/hw/virtio/virtio-mem.h |   3 +
 2 files changed, 160 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index 8fbec77ccc..e23969eaed 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -72,6 +72,64 @@ static bool virtio_mem_is_busy(void)
     return migration_in_incoming_postcopy() || !migration_is_idle();
 }
 
+static void virtio_mem_srh_notify_unmap(VirtIOMEM *vmem, uint64_t offset,
+                                        uint64_t size)
+{
+    SparseRAMNotifier *notifier;
+
+    QLIST_FOREACH(notifier, &vmem->sram_notify, next) {
+        notifier->notify_unmap(notifier, &vmem->memdev->mr, offset, size);
+    }
+}
+
+static int virtio_mem_srh_notify_map(VirtIOMEM *vmem, uint64_t offset,
+                                     uint64_t size)
+{
+    SparseRAMNotifier *notifier, *notifier2;
+    int ret = 0;
+
+    QLIST_FOREACH(notifier, &vmem->sram_notify, next) {
+        ret = notifier->notify_map(notifier, &vmem->memdev->mr, offset, size);
+        if (ret) {
+            break;
+        }
+    }
+
+    /* In case any notifier failed, undo the whole operation. */
+    if (ret) {
+        QLIST_FOREACH(notifier2, &vmem->sram_notify, next) {
+            if (notifier2 == notifier) {
+                break;
+            }
+            notifier2->notify_unmap(notifier2, &vmem->memdev->mr, offset, size);
+        }
+    }
+    return ret;
+}
+
+/*
+ * TODO: Maybe we could notify directly that everything is unmapped/discarded;
+ * at least vfio should be able to deal with that.
+ */
+static void virtio_mem_srh_notify_unplug_all(VirtIOMEM *vmem)
+{
+    unsigned long first_zero_bit, last_zero_bit;
+    uint64_t offset, length;
+
+    /* Find consecutive unplugged blocks and notify */
+    first_zero_bit = find_first_zero_bit(vmem->bitmap, vmem->bitmap_size);
+    while (first_zero_bit < vmem->bitmap_size) {
+        offset = first_zero_bit * vmem->block_size;
+        last_zero_bit = find_next_bit(vmem->bitmap, vmem->bitmap_size,
+                                      first_zero_bit + 1) - 1;
+        length = (last_zero_bit - first_zero_bit + 1) * vmem->block_size;
+
+        virtio_mem_srh_notify_unmap(vmem, offset, length);
+        first_zero_bit = find_next_zero_bit(vmem->bitmap, vmem->bitmap_size,
+                                            last_zero_bit + 2);
+    }
+}
+
 static bool virtio_mem_test_bitmap(VirtIOMEM *vmem, uint64_t start_gpa,
                                    uint64_t size, bool plugged)
 {
@@ -146,7 +204,7 @@ static int virtio_mem_set_block_state(VirtIOMEM *vmem, uint64_t start_gpa,
                                       uint64_t size, bool plug)
 {
     const uint64_t offset = start_gpa - vmem->addr;
-    int ret;
+    int ret, ret2;
 
     if (virtio_mem_is_busy()) {
         return -EBUSY;
@@ -159,6 +217,23 @@ static int virtio_mem_set_block_state(VirtIOMEM *vmem, uint64_t start_gpa,
                          strerror(-ret));
             return -EBUSY;
         }
+        /*
+         * We'll notify *after* discarding succeeded, because we might not be
+         * able to map again ...
+         */
+        virtio_mem_srh_notify_unmap(vmem, offset, size);
+    } else if (virtio_mem_srh_notify_map(vmem, offset, size)) {
+        /*
+         * Could be that a previous mapping attempt already resulted in
+         * memory getting populated.
+         */
+        ret2 = ram_block_discard_range(vmem->memdev->mr.ram_block, offset,
+                                       size);
+        if (ret2) {
+            error_report("Unexpected error discarding RAM: %s",
+                         strerror(-ret2));
+        }
+        return -EBUSY;
     }
     virtio_mem_set_bitmap(vmem, start_gpa, size, plug);
     return 0;
@@ -253,6 +328,8 @@ static int virtio_mem_unplug_all(VirtIOMEM *vmem)
         error_report("Unexpected error discarding RAM: %s", strerror(-ret));
         return -EBUSY;
     }
+    virtio_mem_srh_notify_unplug_all(vmem);
+
     bitmap_clear(vmem->bitmap, 0, vmem->bitmap_size);
     if (vmem->size) {
         vmem->size = 0;
@@ -480,6 +557,13 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
     vmstate_register_ram(&vmem->memdev->mr, DEVICE(vmem));
     qemu_register_reset(virtio_mem_system_reset, vmem);
     precopy_add_notifier(&vmem->precopy_notifier);
+
+    /*
+     * Set it to sparse, so everybody is aware of it before the plug handler
+     * exposes the region to the system.
+     */
+    memory_region_set_sparse_ram_handler(&vmem->memdev->mr,
+                                         SPARSE_RAM_HANDLER(vmem));
 }
 
 static void virtio_mem_device_unrealize(DeviceState *dev)
@@ -487,6 +571,7 @@ static void virtio_mem_device_unrealize(DeviceState *dev)
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
     VirtIOMEM *vmem = VIRTIO_MEM(dev);
 
+    memory_region_set_sparse_ram_handler(&vmem->memdev->mr, NULL);
     precopy_remove_notifier(&vmem->precopy_notifier);
     qemu_unregister_reset(virtio_mem_system_reset, vmem);
     vmstate_unregister_ram(&vmem->memdev->mr, DEVICE(vmem));
@@ -813,6 +898,7 @@ static void virtio_mem_instance_init(Object *obj)
     vmem->block_size = VIRTIO_MEM_MIN_BLOCK_SIZE;
     notifier_list_init(&vmem->size_change_notifiers);
     vmem->precopy_notifier.notify = virtio_mem_precopy_notify;
+    QLIST_INIT(&vmem->sram_notify);
 
     object_property_add(obj, VIRTIO_MEM_SIZE_PROP, "size", virtio_mem_get_size,
                         NULL, NULL, NULL);
@@ -832,11 +918,72 @@ static Property virtio_mem_properties[] = {
     DEFINE_PROP_END_OF_LIST(),
 };
 
+static uint64_t virtio_mem_srh_get_granularity(const SparseRAMHandler *srh,
+                                               const MemoryRegion *mr)
+{
+    const VirtIOMEM *vmem = VIRTIO_MEM(srh);
+
+    g_assert(mr == &vmem->memdev->mr);
+    return vmem->block_size;
+}
+
+static void virtio_mem_srh_register_listener(SparseRAMHandler *srh,
+                                             const MemoryRegion *mr,
+                                             SparseRAMNotifier *notifier)
+{
+    VirtIOMEM *vmem = VIRTIO_MEM(srh);
+
+    g_assert(mr == &vmem->memdev->mr);
+    QLIST_INSERT_HEAD(&vmem->sram_notify, notifier, next);
+}
+
+static void virtio_mem_srh_unregister_listener(SparseRAMHandler *srh,
+                                               const MemoryRegion *mr,
+                                               SparseRAMNotifier *notifier)
+{
+    VirtIOMEM *vmem = VIRTIO_MEM(srh);
+
+    g_assert(mr == &vmem->memdev->mr);
+    QLIST_REMOVE(notifier, next);
+}
+
+static int virtio_mem_srh_replay_mapped(SparseRAMHandler *srh,
+                                        const MemoryRegion *mr,
+                                        SparseRAMNotifier *notifier)
+{
+    VirtIOMEM *vmem = VIRTIO_MEM(srh);
+    unsigned long first_bit, last_bit;
+    uint64_t offset, length;
+    int ret = 0;
+
+    g_assert(mr == &vmem->memdev->mr);
+
+    /* Find consecutive plugged blocks and notify */
+    first_bit = find_first_bit(vmem->bitmap, vmem->bitmap_size);
+    while (first_bit < vmem->bitmap_size) {
+        offset = first_bit * vmem->block_size;
+        last_bit = find_next_zero_bit(vmem->bitmap, vmem->bitmap_size,
+                                      first_bit + 1) - 1;
+        length = (last_bit - first_bit + 1) * vmem->block_size;
+
+        ret = notifier->notify_map(notifier, mr, offset, length);
+        if (ret) {
+            break;
+        }
+        first_bit = find_next_bit(vmem->bitmap, vmem->bitmap_size,
+                                  last_bit + 2);
+    }
+
+    /* TODO: cleanup on error if necessary. */
+    return ret;
+}
+
 static void virtio_mem_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
     VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
     VirtIOMEMClass *vmc = VIRTIO_MEM_CLASS(klass);
+    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_CLASS(klass);
 
     device_class_set_props(dc, virtio_mem_properties);
     dc->vmsd = &vmstate_virtio_mem;
@@ -852,6 +999,11 @@ static void virtio_mem_class_init(ObjectClass *klass, void *data)
     vmc->get_memory_region = virtio_mem_get_memory_region;
     vmc->add_size_change_notifier = virtio_mem_add_size_change_notifier;
     vmc->remove_size_change_notifier = virtio_mem_remove_size_change_notifier;
+
+    srhc->get_granularity = virtio_mem_srh_get_granularity;
+    srhc->register_listener = virtio_mem_srh_register_listener;
+    srhc->unregister_listener = virtio_mem_srh_unregister_listener;
+    srhc->replay_mapped = virtio_mem_srh_replay_mapped;
 }
 
 static const TypeInfo virtio_mem_info = {
@@ -861,6 +1013,10 @@ static const TypeInfo virtio_mem_info = {
     .instance_init = virtio_mem_instance_init,
     .class_init = virtio_mem_class_init,
     .class_size = sizeof(VirtIOMEMClass),
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_SPARSE_RAM_HANDLER },
+        { }
+    },
 };
 
 static void virtio_register_types(void)
diff --git a/include/hw/virtio/virtio-mem.h b/include/hw/virtio/virtio-mem.h
index 4eeb82d5dd..91d9b48ba0 100644
--- a/include/hw/virtio/virtio-mem.h
+++ b/include/hw/virtio/virtio-mem.h
@@ -67,6 +67,9 @@ struct VirtIOMEM {
 
     /* don't migrate unplugged memory */
     NotifierWithReturn precopy_notifier;
+
+    /* SparseRAMNotifier list to be notified on plug/unplug events. */
+    QLIST_HEAD(, SparseRAMNotifier) sram_notify;
 };
 
 struct VirtIOMEMClass {
-- 
2.26.2




* [PATCH PROTOTYPE 3/6] vfio: Implement support for sparse RAM memory regions
  2020-09-24 16:04 [PATCH PROTOTYPE 0/6] virtio-mem: vfio support David Hildenbrand
  2020-09-24 16:04 ` [PATCH PROTOTYPE 1/6] memory: Introduce sparse RAM handler for memory regions David Hildenbrand
  2020-09-24 16:04 ` [PATCH PROTOTYPE 2/6] virtio-mem: Implement SparseRAMHandler interface David Hildenbrand
@ 2020-09-24 16:04 ` David Hildenbrand
  2020-10-20 19:44   ` Peter Xu
  2020-09-24 16:04 ` [PATCH PROTOTYPE 4/6] memory: Extend ram_block_discard_(require|disable) by two discard types David Hildenbrand
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 27+ messages in thread
From: David Hildenbrand @ 2020-09-24 16:04 UTC (permalink / raw)
  To: qemu-devel
  Cc: Pankaj Gupta, David Hildenbrand, Michael S. Tsirkin,
	Dr . David Alan Gilbert, Peter Xu, Luiz Capitulino, Auger Eric,
	Alex Williamson, Wei Yang, Igor Mammedov, Paolo Bonzini

Implement support for sparse RAM, to be used by virtio-mem. Handling
is somewhat similar to memory_region_is_iommu() handling, which also
notifies on changes.

Instead of mapping the whole region, we only map selected pieces (and
unmap previously selected pieces) when notified by the SparseRAMHandler.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/vfio/common.c              | 155 ++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-common.h |  12 +++
 2 files changed, 167 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 13471ae294..a3aaf70dd8 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -37,6 +37,7 @@
 #include "sysemu/reset.h"
 #include "trace.h"
 #include "qapi/error.h"
+#include "qemu/units.h"
 
 VFIOGroupList vfio_group_list =
     QLIST_HEAD_INITIALIZER(vfio_group_list);
@@ -498,6 +499,143 @@ out:
     rcu_read_unlock();
 }
 
+static int vfio_sparse_ram_notify(SparseRAMNotifier *n, const MemoryRegion *mr,
+                                  uint64_t mr_offset, uint64_t size,
+                                  bool map)
+{
+    VFIOSparseRAM *sram = container_of(n, VFIOSparseRAM, notifier);
+    const hwaddr mr_start = MAX(mr_offset, sram->offset_within_region);
+    const hwaddr mr_end = MIN(mr_offset + size,
+                              sram->offset_within_region + sram->size);
+    const hwaddr iova_start = mr_start + sram->offset_within_address_space;
+    const hwaddr iova_end = mr_end + sram->offset_within_address_space;
+    hwaddr mr_cur, iova_cur, mr_next;
+    void *vaddr;
+    int ret, ret2;
+
+    g_assert(mr == sram->mr);
+
+    /* We get notified about everything, ignore ranges we don't care about. */
+    if (mr_start >= mr_end) {
+        return 0;
+    }
+
+    /* Unmap everything with a single call. */
+    if (!map) {
+        ret = vfio_dma_unmap(sram->container, iova_start,
+                             iova_end - iova_start);
+        if (ret) {
+            error_report("%s: vfio_dma_unmap() failed: %s", __func__,
+                         strerror(-ret));
+        }
+        return 0;
+    }
+
+    /* TODO: fail early if we would exceed a specified number of mappings. */
+
+    /* Map in (aligned within MR) granularity, so we can unmap later. */
+    for (mr_cur = mr_start; mr_cur < mr_end; mr_cur = mr_next) {
+        iova_cur = mr_cur + sram->offset_within_address_space;
+        mr_next = QEMU_ALIGN_UP(mr_cur + 1, sram->granularity);
+        mr_next = MIN(mr_next, mr_end);
+
+        vaddr = memory_region_get_ram_ptr(sram->mr) + mr_cur;
+        ret = vfio_dma_map(sram->container, iova_cur, mr_next - mr_cur,
+                           vaddr, mr->readonly);
+        if (ret) {
+            /* Rollback in case of error. */
+            if (mr_cur != mr_start) {
+                ret2 = vfio_dma_unmap(sram->container, iova_start,
+                                      iova_end - iova_start);
+                if (ret2) {
+                    error_report("%s: vfio_dma_unmap() failed: %s", __func__,
+                                  strerror(-ret2));
+                }
+            }
+            return ret;
+        }
+    }
+    return 0;
+}
+
+static int vfio_sparse_ram_notify_map(SparseRAMNotifier *n,
+                                      const MemoryRegion *mr,
+                                      uint64_t mr_offset, uint64_t size)
+{
+    return vfio_sparse_ram_notify(n, mr, mr_offset, size, true);
+}
+
+static void vfio_sparse_ram_notify_unmap(SparseRAMNotifier *n,
+                                         const MemoryRegion *mr,
+                                         uint64_t mr_offset, uint64_t size)
+{
+    vfio_sparse_ram_notify(n, mr, mr_offset, size, false);
+}
+
+static void vfio_register_sparse_ram(VFIOContainer *container,
+                                     MemoryRegionSection *section)
+{
+    VFIOSparseRAM *sram;
+    int ret;
+
+    sram = g_new0(VFIOSparseRAM, 1);
+    sram->container = container;
+    sram->mr = section->mr;
+    sram->offset_within_region = section->offset_within_region;
+    sram->offset_within_address_space = section->offset_within_address_space;
+    sram->size = int128_get64(section->size);
+    sram->granularity = memory_region_sparse_ram_get_granularity(section->mr);
+
+    /*
+     * TODO: We usually want a bigger granularity (for a lot of added memory,
+     * as we need quite a lot of mappings) - however, this has to be configured
+     * by the user.
+     */
+    g_assert(sram->granularity >= 1 * MiB &&
+             is_power_of_2(sram->granularity));
+
+    /* Register the notifier */
+    sparse_ram_notifier_init(&sram->notifier, vfio_sparse_ram_notify_map,
+                             vfio_sparse_ram_notify_unmap);
+    memory_region_register_sparse_ram_notifier(section->mr, &sram->notifier);
+    QLIST_INSERT_HEAD(&container->sram_list, sram, next);
+    /*
+     * Replay mapped blocks - if anything goes wrong (only when hotplugging
+     * vfio devices), report the error for now.
+     *
+     * TODO: Can we catch this earlier?
+     */
+    ret = memory_region_sparse_ram_replay_mapped(section->mr, &sram->notifier);
+    if (ret) {
+        error_report("%s: failed to replay mappings: %s", __func__,
+                     strerror(-ret));
+    }
+}
+
+static void vfio_unregister_sparse_ram(VFIOContainer *container,
+                                       MemoryRegionSection *section)
+{
+    VFIOSparseRAM *sram = NULL;
+
+    QLIST_FOREACH(sram, &container->sram_list, next) {
+        if (sram->mr == section->mr &&
+            sram->offset_within_region == section->offset_within_region &&
+            sram->offset_within_address_space ==
+            section->offset_within_address_space) {
+            break;
+        }
+    }
+
+    if (!sram) {
+        hw_error("vfio: Trying to unregister non-existent sparse RAM");
+    }
+
+    memory_region_unregister_sparse_ram_notifier(section->mr, &sram->notifier);
+    QLIST_REMOVE(sram, next);
+    g_free(sram);
+    /* The caller is expected to vfio_dma_unmap(). */
+}
+
 static void vfio_listener_region_add(MemoryListener *listener,
                                      MemoryRegionSection *section)
 {
@@ -650,6 +788,15 @@ static void vfio_listener_region_add(MemoryListener *listener,
 
     /* Here we assume that memory_region_is_ram(section->mr)==true */
 
+    /*
+     * For sparse RAM, we only want to register the actually mapped
+     * pieces - and update the mapping whenever we're notified about changes.
+     */
+    if (memory_region_is_sparse_ram(section->mr)) {
+        vfio_register_sparse_ram(container, section);
+        return;
+    }
+
     vaddr = memory_region_get_ram_ptr(section->mr) +
             section->offset_within_region +
             (iova - section->offset_within_address_space);
@@ -786,6 +933,13 @@ static void vfio_listener_region_del(MemoryListener *listener,
 
         pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1;
         try_unmap = !((iova & pgmask) || (int128_get64(llsize) & pgmask));
+    } else if (memory_region_is_sparse_ram(section->mr)) {
+        vfio_unregister_sparse_ram(container, section);
+        /*
+         * We rely on a single vfio_dma_unmap() call below to clean the whole
+         * region.
+         */
+        try_unmap = true;
     }
 
     if (try_unmap) {
@@ -1275,6 +1429,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
     container->error = NULL;
     QLIST_INIT(&container->giommu_list);
     QLIST_INIT(&container->hostwin_list);
+    QLIST_INIT(&container->sram_list);
 
     ret = vfio_init_container(container, group->fd, errp);
     if (ret) {
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index c78f3ff559..dfa18dbd8e 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -77,6 +77,7 @@ typedef struct VFIOContainer {
     QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
     QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
     QLIST_HEAD(, VFIOGroup) group_list;
+    QLIST_HEAD(, VFIOSparseRAM) sram_list;
     QLIST_ENTRY(VFIOContainer) next;
 } VFIOContainer;
 
@@ -88,6 +89,17 @@ typedef struct VFIOGuestIOMMU {
     QLIST_ENTRY(VFIOGuestIOMMU) giommu_next;
 } VFIOGuestIOMMU;
 
+typedef struct VFIOSparseRAM {
+    VFIOContainer *container;
+    MemoryRegion *mr;
+    hwaddr offset_within_region;
+    hwaddr offset_within_address_space;
+    hwaddr size;
+    uint64_t granularity;
+    SparseRAMNotifier notifier;
+    QLIST_ENTRY(VFIOSparseRAM) next;
+} VFIOSparseRAM;
+
 typedef struct VFIOHostDMAWindow {
     hwaddr min_iova;
     hwaddr max_iova;
-- 
2.26.2




* [PATCH PROTOTYPE 4/6] memory: Extend ram_block_discard_(require|disable) by two discard types
  2020-09-24 16:04 [PATCH PROTOTYPE 0/6] virtio-mem: vfio support David Hildenbrand
                   ` (2 preceding siblings ...)
  2020-09-24 16:04 ` [PATCH PROTOTYPE 3/6] vfio: Implement support for sparse RAM memory regions David Hildenbrand
@ 2020-09-24 16:04 ` David Hildenbrand
  2020-10-20 19:17   ` Peter Xu
  2020-09-24 16:04 ` [PATCH PROTOTYPE 5/6] virtio-mem: Require only RAM_BLOCK_DISCARD_T_COORDINATED discards David Hildenbrand
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 27+ messages in thread
From: David Hildenbrand @ 2020-09-24 16:04 UTC (permalink / raw)
  To: qemu-devel
  Cc: Pankaj Gupta, David Hildenbrand, Michael S. Tsirkin,
	Dr . David Alan Gilbert, Peter Xu, Luiz Capitulino, Auger Eric,
	Alex Williamson, Wei Yang, Igor Mammedov, Paolo Bonzini

We want to separate two cases:
- ballooning drivers do random discards on random guest memory (e.g.,
  virtio-balloon) - uncoordinated discards
- paravirtualized memory devices do discards at a well-known granularity,
  and always know which blocks are currently accessible or inaccessible by
  the guest - coordinated discards

This will be required to get virtio-mem + vfio running - vfio still
wants to block random memory ballooning.
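
Roughly, with the new interface (see the last two patches of this series),
the two sides will then do:

    /* virtio-mem: we only rely on coordinated discards to work */
    if (ram_block_discard_type_require(RAM_BLOCK_DISCARD_T_COORDINATED, true)) {
        error_setg(errp, "Discarding RAM is disabled");
        return;
    }

    /* vfio: keep blocking only uncoordinated discards (e.g., virtio-balloon) */
    ret = ram_block_discard_type_disable(RAM_BLOCK_DISCARD_T_UNCOORDINATED,
                                         true);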

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 exec.c                | 109 ++++++++++++++++++++++++++++++++++--------
 include/exec/memory.h |  36 ++++++++++++--
 2 files changed, 121 insertions(+), 24 deletions(-)

diff --git a/exec.c b/exec.c
index e34b602bdf..83098e9230 100644
--- a/exec.c
+++ b/exec.c
@@ -4098,52 +4098,121 @@ void mtree_print_dispatch(AddressSpaceDispatch *d, MemoryRegion *root)
  * If positive, discarding RAM is disabled. If negative, discarding RAM is
  * required to work and cannot be disabled.
  */
-static int ram_block_discard_disabled;
+static int uncoordinated_discard_disabled;
+static int coordinated_discard_disabled;
 
-int ram_block_discard_disable(bool state)
+static int __ram_block_discard_disable(int *counter)
 {
     int old;
 
-    if (!state) {
-        atomic_dec(&ram_block_discard_disabled);
-        return 0;
-    }
-
     do {
-        old = atomic_read(&ram_block_discard_disabled);
+        old = atomic_read(counter);
         if (old < 0) {
             return -EBUSY;
         }
-    } while (atomic_cmpxchg(&ram_block_discard_disabled, old, old + 1) != old);
+    } while (atomic_cmpxchg(counter, old, old + 1) != old);
+
     return 0;
 }
 
-int ram_block_discard_require(bool state)
+int ram_block_discard_type_disable(RamBlockDiscardType type, bool state)
 {
-    int old;
+    int ret;
 
-    if (!state) {
-        atomic_inc(&ram_block_discard_disabled);
-        return 0;
+    if (type & RAM_BLOCK_DISCARD_T_UNCOORDINATED) {
+        if (!state) {
+            atomic_dec(&uncoordinated_discard_disabled);
+        } else {
+            ret = __ram_block_discard_disable(&uncoordinated_discard_disabled);
+            if (ret) {
+                return ret;
+            }
+        }
     }
+    if (type & RAM_BLOCK_DISCARD_T_COORDINATED) {
+        if (!state) {
+            atomic_dec(&coordinated_discard_disabled);
+        } else {
+            ret = __ram_block_discard_disable(&coordinated_discard_disabled);
+            if (ret) {
+                /* Rollback the previous change. */
+                if (type & RAM_BLOCK_DISCARD_T_UNCOORDINATED) {
+                    atomic_dec(&uncoordinated_discard_disabled);
+                }
+                return ret;
+            }
+        }
+    }
+    return 0;
+}
+
+static int __ram_block_discard_require(int *counter)
+{
+    int old;
 
     do {
-        old = atomic_read(&ram_block_discard_disabled);
+        old = atomic_read(counter);
         if (old > 0) {
             return -EBUSY;
         }
-    } while (atomic_cmpxchg(&ram_block_discard_disabled, old, old - 1) != old);
+    } while (atomic_cmpxchg(counter, old, old - 1) != old);
+
+    return 0;
+}
+
+int ram_block_discard_type_require(RamBlockDiscardType type, bool state)
+{
+    int ret;
+
+    if (type & RAM_BLOCK_DISCARD_T_UNCOORDINATED) {
+        if (!state) {
+            atomic_inc(&uncoordinated_discard_disabled);
+        } else {
+            ret = __ram_block_discard_require(&uncoordinated_discard_disabled);
+            if (ret) {
+                return ret;
+            }
+        }
+    }
+    if (type & RAM_BLOCK_DISCARD_T_COORDINATED) {
+        if (!state) {
+            atomic_inc(&coordinated_discard_disabled);
+        } else {
+            ret = __ram_block_discard_require(&coordinated_discard_disabled);
+            if (ret) {
+                /* Rollback the previous change. */
+                if (type & RAM_BLOCK_DISCARD_T_UNCOORDINATED) {
+                    atomic_inc(&uncoordinated_discard_disabled);
+                }
+                return ret;
+            }
+        }
+    }
     return 0;
 }
 
-bool ram_block_discard_is_disabled(void)
+bool ram_block_discard_type_is_disabled(RamBlockDiscardType type)
 {
-    return atomic_read(&ram_block_discard_disabled) > 0;
+    if (type & RAM_BLOCK_DISCARD_T_UNCOORDINATED &&
+        atomic_read(&uncoordinated_discard_disabled) > 0) {
+        return true;
+    } else if (type & RAM_BLOCK_DISCARD_T_COORDINATED &&
+               atomic_read(&coordinated_discard_disabled) > 0) {
+        return true;
+    }
+    return false;
 }
 
-bool ram_block_discard_is_required(void)
+bool ram_block_discard_type_is_required(RamBlockDiscardType type)
 {
-    return atomic_read(&ram_block_discard_disabled) < 0;
+    if (type & RAM_BLOCK_DISCARD_T_UNCOORDINATED &&
+        atomic_read(&uncoordinated_discard_disabled) < 0) {
+        return true;
+    } else if (type & RAM_BLOCK_DISCARD_T_COORDINATED &&
+               atomic_read(&coordinated_discard_disabled) < 0) {
+        return true;
+    }
+    return false;
 }
 
 #endif
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 2931ead730..3169ebc3d9 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -2588,6 +2588,18 @@ static inline MemOp devend_memop(enum device_endian end)
 }
 #endif
 
+typedef enum RamBlockDiscardType {
+    /* Uncoordinated discards (e.g., virtio-balloon) */
+    RAM_BLOCK_DISCARD_T_UNCOORDINATED = 1,
+    /*
+     * Coordinated discards on selected memory regions (e.g., virtio-mem via
+     * SparseRAMNotifier).
+     */
+    RAM_BLOCK_DISCARD_T_COORDINATED =   2,
+    /* Any type of discards */
+    RAM_BLOCK_DISCARD_T_ANY =           3,
+} RamBlockDiscardType;
+
 /*
  * Inhibit technologies that require discarding of pages in RAM blocks, e.g.,
  * to manage the actual amount of memory consumed by the VM (then, the memory
@@ -2609,7 +2621,11 @@ static inline MemOp devend_memop(enum device_endian end)
  * Returns 0 if successful. Returns -EBUSY if a technology that relies on
  * discards to work reliably is active.
  */
-int ram_block_discard_disable(bool state);
+int ram_block_discard_type_disable(RamBlockDiscardType type, bool state);
+static inline int ram_block_discard_disable(bool state)
+{
+    return ram_block_discard_type_disable(RAM_BLOCK_DISCARD_T_ANY, state);
+}
 
 /*
  * Inhibit technologies that disable discarding of pages in RAM blocks.
@@ -2617,17 +2633,29 @@ int ram_block_discard_disable(bool state);
  * Returns 0 if successful. Returns -EBUSY if discards are already set to
  * broken.
  */
-int ram_block_discard_require(bool state);
+int ram_block_discard_type_require(RamBlockDiscardType type, bool state);
+static inline int ram_block_discard_require(bool state)
+{
+    return ram_block_discard_type_require(RAM_BLOCK_DISCARD_T_ANY, state);
+}
 
 /*
  * Test if discarding of memory in ram blocks is disabled.
  */
-bool ram_block_discard_is_disabled(void);
+bool ram_block_discard_type_is_disabled(RamBlockDiscardType type);
+static inline bool ram_block_discard_is_disabled(void)
+{
+    return ram_block_discard_type_is_disabled(RAM_BLOCK_DISCARD_T_ANY);
+}
 
 /*
  * Test if discarding of memory in ram blocks is required to work reliably.
  */
-bool ram_block_discard_is_required(void);
+bool ram_block_discard_type_is_required(RamBlockDiscardType type);
+static inline bool ram_block_discard_is_required(void)
+{
+    return ram_block_discard_type_is_required(RAM_BLOCK_DISCARD_T_ANY);
+}
 
 #endif
 
-- 
2.26.2




* [PATCH PROTOTYPE 5/6] virtio-mem: Require only RAM_BLOCK_DISCARD_T_COORDINATED discards
  2020-09-24 16:04 [PATCH PROTOTYPE 0/6] virtio-mem: vfio support David Hildenbrand
                   ` (3 preceding siblings ...)
  2020-09-24 16:04 ` [PATCH PROTOTYPE 4/6] memory: Extend ram_block_discard_(require|disable) by two discard types David Hildenbrand
@ 2020-09-24 16:04 ` David Hildenbrand
  2020-09-24 16:04 ` [PATCH PROTOTYPE 6/6] vfio: Disable only RAM_BLOCK_DISCARD_T_UNCOORDINATED discards David Hildenbrand
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2020-09-24 16:04 UTC (permalink / raw)
  To: qemu-devel
  Cc: Pankaj Gupta, David Hildenbrand, Michael S. Tsirkin,
	Dr . David Alan Gilbert, Peter Xu, Luiz Capitulino, Auger Eric,
	Alex Williamson, Wei Yang, Igor Mammedov, Paolo Bonzini

We implement the SparseRAMHandler interface and properly communicate
changes by notifying listeners - especially in all scenarios:
- when memory becomes usable by the guest
- when memory becomes unusable by the guest (and we discard the memory)

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-mem.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index e23969eaed..efeff7c64c 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -531,7 +531,7 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
         return;
     }
 
-    if (ram_block_discard_require(true)) {
+    if (ram_block_discard_type_require(RAM_BLOCK_DISCARD_T_COORDINATED, true)) {
         error_setg(errp, "Discarding RAM is disabled");
         return;
     }
@@ -539,7 +539,7 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
     ret = ram_block_discard_range(rb, 0, qemu_ram_get_used_length(rb));
     if (ret) {
         error_setg_errno(errp, -ret, "Unexpected error discarding RAM");
-        ram_block_discard_require(false);
+        ram_block_discard_type_require(RAM_BLOCK_DISCARD_T_COORDINATED, false);
         return;
     }
 
@@ -579,7 +579,7 @@ static void virtio_mem_device_unrealize(DeviceState *dev)
     virtio_del_queue(vdev, 0);
     virtio_cleanup(vdev);
     g_free(vmem->bitmap);
-    ram_block_discard_require(false);
+    ram_block_discard_type_require(RAM_BLOCK_DISCARD_T_COORDINATED, false);
 }
 
 static int virtio_mem_restore_unplugged(VirtIOMEM *vmem)
-- 
2.26.2




* [PATCH PROTOTYPE 6/6] vfio: Disable only RAM_BLOCK_DISCARD_T_UNCOORDINATED discards
  2020-09-24 16:04 [PATCH PROTOTYPE 0/6] virtio-mem: vfio support David Hildenbrand
                   ` (4 preceding siblings ...)
  2020-09-24 16:04 ` [PATCH PROTOTYPE 5/6] virtio-mem: Require only RAM_BLOCK_DISCARD_T_COORDINATED discards David Hildenbrand
@ 2020-09-24 16:04 ` David Hildenbrand
  2020-09-24 19:30 ` [PATCH PROTOTYPE 0/6] virtio-mem: vfio support no-reply
  2020-09-29 17:02 ` Dr. David Alan Gilbert
  7 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2020-09-24 16:04 UTC (permalink / raw)
  To: qemu-devel
  Cc: Pankaj Gupta, David Hildenbrand, Michael S. Tsirkin,
	Dr . David Alan Gilbert, Peter Xu, Luiz Capitulino, Auger Eric,
	Alex Williamson, Wei Yang, Igor Mammedov, Paolo Bonzini

This unlocks virtio-mem with vfio. A virtio-mem device properly notifies
about all accessible/mapped blocks inside a managed memory region -
whenever blocks become accessible and whenever blocks become inaccessible.

Note: The block size of a virtio-mem device has to be set to a sane size,
depending on the maximum hotplug size, to not run out of vfio mappings.
The default virtio-mem block size is usually in the range of a couple of
MBs. Linux kernels (x86-64) don't support block sizes > 128MB
with an initial memory size of < 64 MB - and above that, only 2GB in some
cases. The larger the blocks, the less likely a lot of memory can get
unplugged again. The smaller the blocks, the slower memory hot(un)plug
will be.

Assume you want to hotplug 256GB - the block size would have to be at
least 8 MB (resulting in 32768 distinct mappings).

It's expected that the block size will be comparatively large when
virtio-mem is used with vfio in the future (e.g., 128MB, 1G, 2G) -
something Linux guests will have to be optimized for.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/vfio/common.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index a3aaf70dd8..4d82296967 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1392,8 +1392,12 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
      * new memory, it will not yet set ram_block_discard_set_required() and
      * therefore, neither stops us here or deals with the sudden memory
      * consumption of inflated memory.
+     *
+     * We do support discarding for memory regions where accessible pieces
+     * are coordinated via the SparseRAMNotifier.
      */
-    ret = ram_block_discard_disable(true);
+    ret = ram_block_discard_type_disable(RAM_BLOCK_DISCARD_T_UNCOORDINATED,
+                                         true);
     if (ret) {
         error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken");
         return ret;
@@ -1564,7 +1568,7 @@ close_fd_exit:
     close(fd);
 
 put_space_exit:
-    ram_block_discard_disable(false);
+    ram_block_discard_type_disable(RAM_BLOCK_DISCARD_T_UNCOORDINATED, false);
     vfio_put_address_space(space);
 
     return ret;
@@ -1686,7 +1690,8 @@ void vfio_put_group(VFIOGroup *group)
     }
 
     if (!group->ram_block_discard_allowed) {
-        ram_block_discard_disable(false);
+        ram_block_discard_type_disable(RAM_BLOCK_DISCARD_T_UNCOORDINATED,
+                                       false);
     }
     vfio_kvm_device_del_group(group);
     vfio_disconnect_container(group);
@@ -1740,7 +1745,8 @@ int vfio_get_device(VFIOGroup *group, const char *name,
 
         if (!group->ram_block_discard_allowed) {
             group->ram_block_discard_allowed = true;
-            ram_block_discard_disable(false);
+            ram_block_discard_type_disable(RAM_BLOCK_DISCARD_T_UNCOORDINATED,
+                                           false);
         }
     }
 
-- 
2.26.2




* Re: [PATCH PROTOTYPE 0/6] virtio-mem: vfio support
  2020-09-24 16:04 [PATCH PROTOTYPE 0/6] virtio-mem: vfio support David Hildenbrand
                   ` (5 preceding siblings ...)
  2020-09-24 16:04 ` [PATCH PROTOTYPE 6/6] vfio: Disable only RAM_BLOCK_DISCARD_T_UNCOORDINATED discards David Hildenbrand
@ 2020-09-24 19:30 ` no-reply
  2020-09-29 17:02 ` Dr. David Alan Gilbert
  7 siblings, 0 replies; 27+ messages in thread
From: no-reply @ 2020-09-24 19:30 UTC (permalink / raw)
  To: david
  Cc: pankaj.gupta.linux, mst, david, qemu-devel, peterx, dgilbert,
	eric.auger, alex.williamson, richardw.yang, imammedo, pbonzini,
	lcapitulino

Patchew URL: https://patchew.org/QEMU/20200924160423.106747-1-david@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20200924160423.106747-1-david@redhat.com
Subject: [PATCH PROTOTYPE 0/6] virtio-mem: vfio support

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 - [tag update]      patchew/20200922210101.4081073-1-jsnow@redhat.com -> patchew/20200922210101.4081073-1-jsnow@redhat.com
 - [tag update]      patchew/20200924185414.28642-1-vsementsov@virtuozzo.com -> patchew/20200924185414.28642-1-vsementsov@virtuozzo.com
Switched to a new branch 'test'
8afd1df vfio: Disable only RAM_BLOCK_DISCARD_T_UNCOORDINATED discards
d676b32 virtio-mem: Require only RAM_BLOCK_DISCARD_T_COORDINATED discards
9492f67 memory: Extend ram_block_discard_(require|disable) by two discard types
9eeec69 vfio: Implement support for sparse RAM memory regions
3e21d3f virtio-mem: Implement SparseRAMHandler interface
2cfc417 memory: Introduce sparse RAM handler for memory regions

=== OUTPUT BEGIN ===
1/6 Checking commit 2cfc4176fbf5 (memory: Introduce sparse RAM handler for memory regions)
ERROR: "foo* bar" should be "foo *bar"
#149: FILE: include/exec/memory.h:1952:
+static inline SparseRAMHandler* memory_region_get_sparse_ram_handler(

total: 1 errors, 0 warnings, 162 lines checked

Patch 1/6 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

2/6 Checking commit 3e21d3f59244 (virtio-mem: Implement SparseRAMHandler interface)
3/6 Checking commit 9eeec69031b0 (vfio: Implement support for sparse RAM memory regions)
4/6 Checking commit 9492f6715512 (memory: Extend ram_block_discard_(require|disable) by two discard types)
5/6 Checking commit d676b32336b5 (virtio-mem: Require only RAM_BLOCK_DISCARD_T_COORDINATED discards)
6/6 Checking commit 8afd1df27b99 (vfio: Disable only RAM_BLOCK_DISCARD_T_UNCOORDINATED discards)
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20200924160423.106747-1-david@redhat.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com


* Re: [PATCH PROTOTYPE 0/6] virtio-mem: vfio support
  2020-09-24 16:04 [PATCH PROTOTYPE 0/6] virtio-mem: vfio support David Hildenbrand
                   ` (6 preceding siblings ...)
  2020-09-24 19:30 ` [PATCH PROTOTYPE 0/6] virtio-mem: vfio support no-reply
@ 2020-09-29 17:02 ` Dr. David Alan Gilbert
  2020-09-29 17:05   ` David Hildenbrand
  7 siblings, 1 reply; 27+ messages in thread
From: Dr. David Alan Gilbert @ 2020-09-29 17:02 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Pankaj Gupta, Michael S. Tsirkin, qemu-devel, Peter Xu,
	Luiz Capitulino, Auger Eric, Alex Williamson, Wei Yang,
	Paolo Bonzini, Igor Mammedov

* David Hildenbrand (david@redhat.com) wrote:
> This is a quick and dirty (1.5 days of hacking) prototype to make
> vfio and virtio-mem play together. The basic idea was the result of Alex
> brainstorming with me on how to tackle this.
> 
> A virtio-mem device manages a memory region in guest physical address
> space, represented as a single (currently large) memory region in QEMU.
> Before the guest is allowed to use memory blocks, it must coordinate with
> the hypervisor (plug blocks). After a reboot, all memory is usually
> unplugged - when the guest comes up, it detects the virtio-mem device and
> selects memory blocks to plug (based on requests from the hypervisor).
> 
> Memory hot(un)plug consists of (un)plugging memory blocks via a virtio-mem
> device (triggered by the guest). When unplugging blocks, we discard the
> memory. In contrast to memory ballooning, we always know which memory
> blocks a guest may use - especially during a reboot, after a crash, or
> after kexec.
> 
> The issue with vfio is that it cannot deal with random discards - for this
> reason, virtio-mem and vfio can currently only run mutually exclusively.
> In particular, vfio would currently map the whole memory region (with possibly
> only few/no plugged blocks), resulting in all pages getting pinned and
> therefore in a higher memory consumption than expected (rendering
> virtio-mem basically useless in these environments).
> 
> To make vfio work nicely with virtio-mem, we have to map only the plugged
> blocks, and map/unmap properly when plugging/unplugging blocks (including
> discarding of RAM when unplugging). We achieve that by using a new notifier
> mechanism that communicates changes.
> 
> It's important to map memory in the granularity in which we could see
> unmaps again (-> virtio-mem block size) - so when e.g., plugging
> consecutive 100 MB with a block size of 2MB, we need 50 mappings. When
> unmapping, we can use a single vfio_unmap call for the applicable range.
> We expect that the block size of virtio-mem devices will be fairly large
> in the future (to not run out of mappings and to improve hot(un)plug
> performance), configured by the user, when used with vfio (e.g., 128MB,
> 1G, ...) - Linux guests will still have to be optimized for that.

This seems pretty painful for those few TB mappings.
Also the calls seem pretty painful; maybe it'll be possible to have
calls that are optimised for making multiple consecutive mappings.
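Just as a rough idea of the numbers: at a 2 MiB block size, a single TiB of
plugged memory already needs 524288 mappings, while at 128 MiB it drops to
8192; IIRC the vfio type1 backend defaults to 65535 user mappings per
container (dma_entry_limit), so the block size really has to be scaled up
with the device size.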

Dave

> We try to handle errors when plugging memory (mapping in VFIO) gracefully
> - especially to cope with too many mappings in VFIO.
> 
> 
> As I basically have no experience with vfio, all I did for testing is
> passthrough a secondary GPU (NVIDIA GK208B) via vfio-pci to my guest
> and saw it pop up in dmesg. I did *not* actually try to use it (I know
> ...), so there might still be plenty of BUGs regarding the actual mappings
> in the code. When I resize virtio-mem devices (resulting in
> memory hot(un)plug), I can spot the memory consumption of my host adjusting
> accordingly - in contrast to before, wehreby my machine would always
> consume the maximum size of my VM, as if all memory provided by
> virtio-mem devices were fully plugged.
> 
> I even tested it with 2MB huge pages (sadly for the first time with
> virtio-mem ever) - and it worked like a charm on the hypervisor side as
> well. The number of free hugepages adjusted accordingly. (again, did not
> properly test the device in the guest ...).
> 
> If anybody wants to play with it and needs some guidance, please feel
> free to ask. I might add some vfio-related documentation to
> https://virtio-mem.gitlab.io/ (but it really isn't that special - only
> the block size limitations have to be considered).
> 
> David Hildenbrand (6):
>   memory: Introduce sparse RAM handler for memory regions
>   virtio-mem: Implement SparseRAMHandler interface
>   vfio: Implement support for sparse RAM memory regions
>   memory: Extend ram_block_discard_(require|disable) by two discard
>     types
>   virtio-mem: Require only RAM_BLOCK_DISCARD_T_COORDINATED discards
>   vfio: Disable only RAM_BLOCK_DISCARD_T_UNCOORDINATED discards
> 
>  exec.c                         | 109 +++++++++++++++++----
>  hw/vfio/common.c               | 169 ++++++++++++++++++++++++++++++++-
>  hw/virtio/virtio-mem.c         | 164 +++++++++++++++++++++++++++++++-
>  include/exec/memory.h          | 151 ++++++++++++++++++++++++++++-
>  include/hw/vfio/vfio-common.h  |  12 +++
>  include/hw/virtio/virtio-mem.h |   3 +
>  softmmu/memory.c               |   7 ++
>  7 files changed, 583 insertions(+), 32 deletions(-)
> 
> -- 
> 2.26.2
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: [PATCH PROTOTYPE 0/6] virtio-mem: vfio support
  2020-09-29 17:02 ` Dr. David Alan Gilbert
@ 2020-09-29 17:05   ` David Hildenbrand
  0 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2020-09-29 17:05 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Pankaj Gupta, Michael S. Tsirkin, qemu-devel, Peter Xu,
	Luiz Capitulino, Auger Eric, Alex Williamson, Wei Yang,
	Paolo Bonzini, Igor Mammedov

On 29.09.20 19:02, Dr. David Alan Gilbert wrote:
> * David Hildenbrand (david@redhat.com) wrote:
>> This is a quick and dirty (1.5 days of hacking) prototype to make
>> vfio and virtio-mem play together. The basic idea was the result of Alex
>> brainstorming with me on how to tackle this.
>>
>> A virtio-mem device manages a memory region in guest physical address
>> space, represented as a single (currently large) memory region in QEMU.
>> Before the guest is allowed to use memory blocks, it must coordinate with
>> the hypervisor (plug blocks). After a reboot, all memory is usually
>> unplugged - when the guest comes up, it detects the virtio-mem device and
>> selects memory blocks to plug (based on requests from the hypervisor).
>>
>> Memory hot(un)plug consists of (un)plugging memory blocks via a virtio-mem
>> device (triggered by the guest). When unplugging blocks, we discard the
>> memory. In contrast to memory ballooning, we always know which memory
>> blocks a guest may use - especially during a reboot, after a crash, or
>> after kexec.
>>
>> The issue with vfio is, that it cannot deal with random discards - for this
>> reason, virtio-mem and vfio can currently only run mutually exclusive.
>> Especially, vfio would currently map the whole memory region (with possible
>> only little/no plugged blocks), resulting in all pages getting pinned and
>> therefore resulting in a higher memory consumption than expected (turning
>> virtio-mem basically useless in these environments).
>>
>> To make vfio work nicely with virtio-mem, we have to map only the plugged
>> blocks, and map/unmap properly when plugging/unplugging blocks (including
>> discarding of RAM when unplugging). We achieve that by using a new notifier
>> mechanism that communicates changes.
>>
>> It's important to map memory in the granularity in which we could see
>> unmaps again (-> virtio-mem block size) - so when e.g., plugging
>> consecutive 100 MB with a block size of 2MB, we need 50 mappings. When
>> unmapping, we can use a single vfio_unmap call for the applicable range.
>> We expect that the block size of virtio-mem devices will be fairly large
>> in the future (to not run out of mappings and to improve hot(un)plug
>> performance), configured by the user, when used with vfio (e.g., 128MB,
>> 1G, ...) - Linux guests will still have to be optimized for that.
> 
> This seems pretty painful for those few TB mappings.
> Also the calls seem pretty painful; maybe it'll be possible to have
> calls that are optimised for making multiple consecutive mappings.

Exactly the future I imagine. This patchset comes with no kernel interface
additions - once we have an optimized interface that understands
consecutive mappings (and the granularity), we can use that instead. The
prototype is already prepared for that by notifying about consecutive ranges.

Thanks!

-- 
Thanks,

David / dhildenb




* Re: [PATCH PROTOTYPE 4/6] memory: Extend ram_block_discard_(require|disable) by two discard types
  2020-09-24 16:04 ` [PATCH PROTOTYPE 4/6] memory: Extend ram_block_discard_(require|disable) by two discard types David Hildenbrand
@ 2020-10-20 19:17   ` Peter Xu
  2020-10-20 19:58     ` David Hildenbrand
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Xu @ 2020-10-20 19:17 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Pankaj Gupta, Michael S. Tsirkin, qemu-devel, Luiz Capitulino,
	Auger Eric, Alex Williamson, Wei Yang, Igor Mammedov,
	Paolo Bonzini, Dr . David Alan Gilbert

On Thu, Sep 24, 2020 at 06:04:21PM +0200, David Hildenbrand wrote:
> We want to separate the two cases whereby
> - ballooning drivers do random discards on random guest memory (e.g.,
>   virtio-balloon) - uncoordinated discards
> - paravirtualized memory devices do discards in well-known granularity,
>   and always know which block is currently accessible or inaccessible by
>   a guest. - coordinated discards
> 
> This will be required to get virtio_mem + vfio running - vfio still
> wants to block random memory ballooning.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Wei Yang <richardw.yang@linux.intel.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Peter Xu <peterx@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  exec.c                | 109 ++++++++++++++++++++++++++++++++++--------
>  include/exec/memory.h |  36 ++++++++++++--
>  2 files changed, 121 insertions(+), 24 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index e34b602bdf..83098e9230 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -4098,52 +4098,121 @@ void mtree_print_dispatch(AddressSpaceDispatch *d, MemoryRegion *root)
>   * If positive, discarding RAM is disabled. If negative, discarding RAM is
>   * required to work and cannot be disabled.
>   */
> -static int ram_block_discard_disabled;
> +static int uncoordinated_discard_disabled;
> +static int coordinated_discard_disabled;

Instead of duplicating the code, how about starting to make it an array?

Btw, iiuc these flags do not need atomic operations at all, because all callers
should hold the BQL and are mostly called during machine start/reset.  Even if
not, I think we could also use a mutex; maybe it would make things simpler.  No
strong opinion, though.
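
Roughly what I have in mind (completely untested; RAM_BLOCK_DISCARD_T_MAX and
the helper name below are just placeholders, not taken from your patch):

/* one counter per discard type; assumes all callers hold the BQL */
static int ram_block_discard_disabled[RAM_BLOCK_DISCARD_T_MAX];

int ram_block_discard_type_disable(unsigned int type, bool state)
{
    if (!state) {
        ram_block_discard_disabled[type]--;
        return 0;
    }
    if (ram_block_discard_disabled[type] < 0) {
        /* this discard type is currently required by some device */
        return -EBUSY;
    }
    ram_block_discard_disabled[type]++;
    return 0;
}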

-- 
Peter Xu




* Re: [PATCH PROTOTYPE 1/6] memory: Introduce sparse RAM handler for memory regions
  2020-09-24 16:04 ` [PATCH PROTOTYPE 1/6] memory: Introduce sparse RAM handler for memory regions David Hildenbrand
@ 2020-10-20 19:24   ` Peter Xu
  2020-10-20 20:13     ` David Hildenbrand
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Xu @ 2020-10-20 19:24 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Pankaj Gupta, Michael S. Tsirkin, qemu-devel, Luiz Capitulino,
	Auger Eric, Alex Williamson, Wei Yang, Igor Mammedov,
	Paolo Bonzini, Dr . David Alan Gilbert

On Thu, Sep 24, 2020 at 06:04:18PM +0200, David Hildenbrand wrote:
> +static inline void memory_region_set_sparse_ram_handler(MemoryRegion *mr,
> +                                                        SparseRAMHandler *srh)
> +{
> +    g_assert(memory_region_is_ram(mr));

Nit: Maybe assert mr->srh==NULL here?  If sparse ram handler is exclusive,
which afaiu, is a yes.

> +    mr->srh = srh;
> +}
> +
> +static inline void memory_region_register_sparse_ram_notifier(MemoryRegion *mr,
> +                                                           SparseRAMNotifier *n)
> +{
> +    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
> +    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
> +
> +    srhc->register_listener(srh, mr, n);

I feel like you need to check srhc != NULL first, or vfio may start crashing
without virtio-mem...  Same question for the other ones (at least unregister).
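
E.g. (just a sketch of what I mean, reusing the names from your patch):

static inline void memory_region_register_sparse_ram_notifier(MemoryRegion *mr,
                                                               SparseRAMNotifier *n)
{
    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
    SparseRAMHandlerClass *srhc;

    if (!srh) {
        /* no sparse RAM handler set for this region (e.g., no virtio-mem) */
        return;
    }
    srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
    srhc->register_listener(srh, mr, n);
}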

-- 
Peter Xu




* Re: [PATCH PROTOTYPE 3/6] vfio: Implement support for sparse RAM memory regions
  2020-09-24 16:04 ` [PATCH PROTOTYPE 3/6] vfio: Implement support for sparse RAM memory regions David Hildenbrand
@ 2020-10-20 19:44   ` Peter Xu
  2020-10-20 20:01     ` David Hildenbrand
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Xu @ 2020-10-20 19:44 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Pankaj Gupta, Michael S. Tsirkin, qemu-devel, Luiz Capitulino,
	Auger Eric, Alex Williamson, Wei Yang, Igor Mammedov,
	Paolo Bonzini, Dr . David Alan Gilbert

On Thu, Sep 24, 2020 at 06:04:20PM +0200, David Hildenbrand wrote:
> Implement support for sparse RAM, to be used by virtio-mem. Handling
> is somewhat-similar to memory_region_is_iommu() handling, which also
> notifies on changes.
> 
> Instead of mapping the whole region, we only map selected pieces (and
> unmap previously selected pieces) when notified by the SparseRAMHandler.

It works with vIOMMU too, right? :) As long as vfio_get_vaddr() works as
expected with virtio-mem plugging new memories, then I think the answer should
be yes.

If it's true, maybe worth mention it somewhere either in the commit message or
in the code comment, because it seems not that obvious.

So if you have plan to do some real IOs in your future tests, may also worth
try with the "-device intel-iommu" and intel_iommu=on in the guest against the
same test.

Thanks,

-- 
Peter Xu




* Re: [PATCH PROTOTYPE 4/6] memory: Extend ram_block_discard_(require|disable) by two discard types
  2020-10-20 19:17   ` Peter Xu
@ 2020-10-20 19:58     ` David Hildenbrand
  2020-10-20 20:49       ` Peter Xu
  0 siblings, 1 reply; 27+ messages in thread
From: David Hildenbrand @ 2020-10-20 19:58 UTC (permalink / raw)
  To: Peter Xu
  Cc: Pankaj Gupta, Michael S. Tsirkin, qemu-devel, Luiz Capitulino,
	Auger Eric, Alex Williamson, Wei Yang, Igor Mammedov,
	Paolo Bonzini, Dr . David Alan Gilbert

On 20.10.20 21:17, Peter Xu wrote:
> On Thu, Sep 24, 2020 at 06:04:21PM +0200, David Hildenbrand wrote:
>> We want to separate the two cases whereby
>> - ballooning drivers do random discards on random guest memory (e.g.,
>>   virtio-balloon) - uncoordinated discards
>> - paravirtualized memory devices do discards in well-known granularity,
>>   and always know which block is currently accessible or inaccessible by
>>   a guest. - coordinated discards
>>
>> This will be required to get virtio_mem + vfio running - vfio still
>> wants to block random memory ballooning.
>>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: "Michael S. Tsirkin" <mst@redhat.com>
>> Cc: Alex Williamson <alex.williamson@redhat.com>
>> Cc: Wei Yang <richardw.yang@linux.intel.com>
>> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
>> Cc: Peter Xu <peterx@redhat.com>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  exec.c                | 109 ++++++++++++++++++++++++++++++++++--------
>>  include/exec/memory.h |  36 ++++++++++++--
>>  2 files changed, 121 insertions(+), 24 deletions(-)
>>
>> diff --git a/exec.c b/exec.c
>> index e34b602bdf..83098e9230 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -4098,52 +4098,121 @@ void mtree_print_dispatch(AddressSpaceDispatch *d, MemoryRegion *root)
>>   * If positive, discarding RAM is disabled. If negative, discarding RAM is
>>   * required to work and cannot be disabled.
>>   */
>> -static int ram_block_discard_disabled;
>> +static int uncoordinated_discard_disabled;
>> +static int coordinated_discard_disabled;
> 
> Instead of duplicating the code, how about starting to make it an array?
> 
> Btw, iiuc these flags do not need atomic operations at all, because all callers
> should hold the BQL and are mostly called during machine start/reset.  Even if
> not, I think we could also use a mutex; maybe it would make things simpler.  No
> strong opinion, though.
> 

I remember there were some !BQL users (but I might be confusing it with
postcopy code that once used to inhibit the balloon without BQL). Will
double-check. Simplifying it is certainly a good idea.

(we want to be able to check from virtio-balloon code repeatedly without
taking a mutex over and over again :) )

Thanks!

-- 
Thanks,

David / dhildenb




* Re: [PATCH PROTOTYPE 3/6] vfio: Implement support for sparse RAM memory regions
  2020-10-20 19:44   ` Peter Xu
@ 2020-10-20 20:01     ` David Hildenbrand
  2020-10-20 20:44       ` Peter Xu
  0 siblings, 1 reply; 27+ messages in thread
From: David Hildenbrand @ 2020-10-20 20:01 UTC (permalink / raw)
  To: Peter Xu
  Cc: Pankaj Gupta, Michael S. Tsirkin, qemu-devel, Luiz Capitulino,
	Auger Eric, Alex Williamson, Wei Yang, Igor Mammedov,
	Paolo Bonzini, Dr . David Alan Gilbert

On 20.10.20 21:44, Peter Xu wrote:
> On Thu, Sep 24, 2020 at 06:04:20PM +0200, David Hildenbrand wrote:
>> Implement support for sparse RAM, to be used by virtio-mem. Handling
>> is somewhat-similar to memory_region_is_iommu() handling, which also
>> notifies on changes.
>>
>> Instead of mapping the whole region, we only map selected pieces (and
>> unmap previously selected pieces) when notified by the SparseRAMHandler.
> 
> It works with vIOMMU too, right? :) As long as vfio_get_vaddr() works as
> expected with virtio-mem plugging new memories, then I think the answer should
> be yes.
> 

I haven't tried, but I guess as it's simply mapped into
&address_space_memory, it should work just fine.

> If it's true, maybe worth mention it somewhere either in the commit message or
> in the code comment, because it seems not that obvious.

I will test and add, thanks for the hint!

> 
> So if you have plan to do some real IOs in your future tests, may also worth
> try with the "-device intel-iommu" and intel_iommu=on in the guest against the
> same test.

Thanks ... but I have an AMD system. Will try to find out how to get
that running with AMD :)

-- 
Thanks,

David / dhildenb




* Re: [PATCH PROTOTYPE 1/6] memory: Introduce sparse RAM handler for memory regions
  2020-10-20 19:24   ` Peter Xu
@ 2020-10-20 20:13     ` David Hildenbrand
  0 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2020-10-20 20:13 UTC (permalink / raw)
  To: Peter Xu
  Cc: Pankaj Gupta, Michael S. Tsirkin, qemu-devel, Luiz Capitulino,
	Auger Eric, Alex Williamson, Wei Yang, Igor Mammedov,
	Paolo Bonzini, Dr . David Alan Gilbert

On 20.10.20 21:24, Peter Xu wrote:
> On Thu, Sep 24, 2020 at 06:04:18PM +0200, David Hildenbrand wrote:
>> +static inline void memory_region_set_sparse_ram_handler(MemoryRegion *mr,
>> +                                                        SparseRAMHandler *srh)
>> +{
>> +    g_assert(memory_region_is_ram(mr));
> 
> Nit: Maybe assert mr->srh==NULL here?  If sparse ram handler is exclusive,
> which afaiu, is a yes.

Indeed. The owner of the memory region.

> 
>> +    mr->srh = srh;
>> +}
>> +
>> +static inline void memory_region_register_sparse_ram_notifier(MemoryRegion *mr,
>> +                                                           SparseRAMNotifier *n)
>> +{
>> +    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
>> +    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
>> +
>> +    srhc->register_listener(srh, mr, n);
> 
> I feel like you need to check srhc != NULL first, or vfio may start crashing
> without virtio-mem...  Same question for the other ones (at least unregister).

I think nobody should be calling this function unless
memory_region_is_sparse_ram() returns true.

I'm still considering moving it down to the RAMBlock level. Feels more
natural, and would allow other RAMBlock users to figure out what's going
on. If only my list of TODO items weren't that long ... :)

Thanks Peter!

-- 
Thanks,

David / dhildenb




* Re: [PATCH PROTOTYPE 3/6] vfio: Implement support for sparse RAM memory regions
  2020-10-20 20:01     ` David Hildenbrand
@ 2020-10-20 20:44       ` Peter Xu
  2020-11-12 10:11         ` David Hildenbrand
  2020-11-18 13:04         ` David Hildenbrand
  0 siblings, 2 replies; 27+ messages in thread
From: Peter Xu @ 2020-10-20 20:44 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Pankaj Gupta, Michael S. Tsirkin, qemu-devel, Luiz Capitulino,
	Auger Eric, Alex Williamson, Wei Yang, Igor Mammedov,
	Paolo Bonzini, Dr . David Alan Gilbert

On Tue, Oct 20, 2020 at 10:01:12PM +0200, David Hildenbrand wrote:
> Thanks ... but I have an AMD system. Will try to find out how to get
> that running with AMD :)

May still start with trying intel-iommu first. :) I think it should work for
amd hosts too.

Just another FYI - Wei is working on amd-iommu for vfio [1], but it's still
during review.

[1] https://lore.kernel.org/qemu-devel/20201002145907.1294353-1-wei.huang2@amd.com/

-- 
Peter Xu




* Re: [PATCH PROTOTYPE 4/6] memory: Extend ram_block_discard_(require|disable) by two discard types
  2020-10-20 19:58     ` David Hildenbrand
@ 2020-10-20 20:49       ` Peter Xu
  2020-10-20 21:30         ` Peter Xu
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Xu @ 2020-10-20 20:49 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Pankaj Gupta, Michael S. Tsirkin, qemu-devel, Luiz Capitulino,
	Auger Eric, Alex Williamson, Wei Yang, Igor Mammedov,
	Paolo Bonzini, Dr . David Alan Gilbert

On Tue, Oct 20, 2020 at 09:58:34PM +0200, David Hildenbrand wrote:
> I remember there were some !BQL users (but I might be confusing it with
> postcopy code that once used to inhibit the balloon without BQL). Will
> double-check. Simplifying it is certainly a good idea.
> 
> (we want to be able to check from virtio-balloon code repeatedly without
> taking a mutex over and over again :) )

Right.  Again I've no strong opinion, so feel free to keep the code as you wish.
However, if we'd go the other way (either BQL or another mutex), IMHO we can
simply read that flag directly without taking the mutex.  IMHO the mutex here
only needs to protect write concurrency, and that should be enough. Considering
that this flag should rarely change and should never flip (e.g., positive would
never go negative, and vice versa), a READ_ONCE() would be safe enough on the
read side (e.g., balloon).
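
I.e., roughly something like this (sketch only; qatomic_read() standing in for
READ_ONCE(), and the mutex would of course need to be initialized somewhere):

static QemuMutex discard_lock;              /* serializes writers only */
static int uncoordinated_discard_disabled;  /* > 0 disabled, < 0 required */

int ram_block_discard_disable(bool state)
{
    int ret = 0;

    qemu_mutex_lock(&discard_lock);
    if (state && uncoordinated_discard_disabled < 0) {
        ret = -EBUSY;
    } else {
        uncoordinated_discard_disabled += state ? 1 : -1;
    }
    qemu_mutex_unlock(&discard_lock);
    return ret;
}

bool ram_block_discard_is_disabled(void)
{
    /* lockless reader (e.g., virtio-balloon); the counter never flips sign */
    return qatomic_read(&uncoordinated_discard_disabled) > 0;
}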

Thanks,

-- 
Peter Xu




* Re: [PATCH PROTOTYPE 4/6] memory: Extend ram_block_discard_(require|disable) by two discard types
  2020-10-20 20:49       ` Peter Xu
@ 2020-10-20 21:30         ` Peter Xu
  0 siblings, 0 replies; 27+ messages in thread
From: Peter Xu @ 2020-10-20 21:30 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Pankaj Gupta, Michael S. Tsirkin, qemu-devel, Luiz Capitulino,
	Auger Eric, Alex Williamson, Wei Yang, Igor Mammedov,
	Paolo Bonzini, Dr . David Alan Gilbert

On Tue, Oct 20, 2020 at 04:49:29PM -0400, Peter Xu wrote:
> On Tue, Oct 20, 2020 at 09:58:34PM +0200, David Hildenbrand wrote:
> > I remember there were some !BQL users (but I might be confusing it with
> > postcopy code that once used to inhibit the balloon without BQL). Will
> > double-check. Simplifying it is certainly a good idea.
> > 
> > (we want to be able to check from virtio-balloon code repeatedly without
> > taking a mutex over and over again :) )
> 
> Right.  Again I've no strong opinion, so feel free to keep the code as you wish.
> However, if we'd go the other way (either BQL or another mutex), IMHO we can
> simply read that flag directly without taking the mutex.  IMHO the mutex here
> only needs to protect write concurrency, and that should be enough. Considering
> that this flag should rarely change and should never flip (e.g., positive would
> never go negative, and vice versa), a READ_ONCE() would be safe enough on the
> read side (e.g., balloon).

Btw, what we've discussed is all about serialization of the flag.  I'm also
thinking about whether we can make the flag clearer on what it means.  Frankly
speaking, at first glance it confused me quite a bit...

IMHO what we may want is not some complicated counter on "disablement", but
some simple counters on "someone that provides the capability to discard pages"
and "someone that won't work if we _might_ discard pages".  I'm thinking: what
if we split the "disable" counter into two:

  - ram_discard_providers ("Who allows ram discard"): balloon, virtio-mem

  - ram_discard_opposers ("Who forbids ram discard"): vfio, rdma, ...

The major benefit is that the counters should always be >= 0, as a usual
counter would be.  Each device/caller should only/mostly increase the counter.
Also, we wouldn't need the cmpxchg() loops anymore, since the check is easier -
just read the other side of the counters to know whether we should fail now.

So after this patch to introduce "coordinated discards", the counters would also
grow into four (we can still define some arrays):

        |---------------------------+------------+-----------|
        | counters                  | provider   | opposer   |
        |---------------------------+------------+-----------|
        | RAM_DISCARD_COORDINATED   | virtio-mem | rdma, ... |
        | RAM_DISCARD_UNCOORDINATED | balloon    | vfio      |
        |---------------------------+------------+-----------|

Some examples: vfio only needs to make sure there are no UNCOORDINATED
providers.  rdma needs to make sure there are no COORDINATED/UNCOORDINATED
providers.  The check helper could simply be done using a similar ANY bitmask
as introduced in the current patch.
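
As a sketch (only the two counter arrays correspond to what I described above;
the enum and function names here are made up):

typedef enum RamDiscardType {
    RAM_DISCARD_COORDINATED = 0,   /* provider: virtio-mem */
    RAM_DISCARD_UNCOORDINATED,     /* provider: balloon */
    RAM_DISCARD_TYPE_MAX,
} RamDiscardType;

static int ram_discard_providers[RAM_DISCARD_TYPE_MAX];
static int ram_discard_opposers[RAM_DISCARD_TYPE_MAX];

/* a provider (virtio-mem, balloon, ...) registers the one type it implements */
int ram_discard_provider_register(RamDiscardType type)
{
    if (ram_discard_opposers[type]) {
        return -EBUSY;    /* someone (vfio, rdma, ...) forbids this type */
    }
    ram_discard_providers[type]++;
    return 0;
}

/* an opposer (vfio: UNCOORDINATED only; rdma: both) passes a mask of types */
int ram_discard_opposer_register(unsigned int type_mask)
{
    RamDiscardType t;

    for (t = 0; t < RAM_DISCARD_TYPE_MAX; t++) {
        if ((type_mask & (1u << t)) && ram_discard_providers[t]) {
            return -EBUSY;    /* an active provider of this type exists */
        }
    }
    for (t = 0; t < RAM_DISCARD_TYPE_MAX; t++) {
        if (type_mask & (1u << t)) {
            ram_discard_opposers[t]++;
        }
    }
    return 0;
}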

Not sure whether this may help.

Thanks,

-- 
Peter Xu




* Re: [PATCH PROTOTYPE 3/6] vfio: Implement support for sparse RAM memory regions
  2020-10-20 20:44       ` Peter Xu
@ 2020-11-12 10:11         ` David Hildenbrand
  2020-11-18 13:04         ` David Hildenbrand
  1 sibling, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2020-11-12 10:11 UTC (permalink / raw)
  To: Peter Xu
  Cc: Pankaj Gupta, Michael S. Tsirkin, qemu-devel, Luiz Capitulino,
	Auger Eric, Alex Williamson, Wei Yang, Igor Mammedov,
	Paolo Bonzini, Dr . David Alan Gilbert

On 20.10.20 22:44, Peter Xu wrote:
> On Tue, Oct 20, 2020 at 10:01:12PM +0200, David Hildenbrand wrote:
>> Thanks ... but I have an AMD system. Will try to find out how to get
>> that running with AMD :)

I just did some more testing with the oldish GPU I have for that
purpose. Seems to work, at least I get video output that keeps
on working - did not try advanced things yet.

I use
-device vfio-pci,host=05:00.0,x-vga=on
-device vfio-pci,host=05:00.1

when adding "-device intel-iommu", I got

"qemu-system-x86_64: -device vfio-pci,host=05:00.1: vfio 0000:05:00.1: group 23 used in multiple address spaces"

... so I poked around the internet a bit and got it running with

-device intel-iommu,caching-mode=on \
-device pcie-pci-bridge,addr=1e.0,id=pci.1 \
-device vfio-pci,host=05:00.0,x-vga=on,bus=pci.1,addr=1.0,multifunction=on \
-device vfio-pci,host=05:00.1,bus=pci.1,addr=1.1 \

Things still seem to be working, so I assume it works
(I guess ?!).

-- 
Thanks,

David / dhildenb




* Re: [PATCH PROTOTYPE 3/6] vfio: Implement support for sparse RAM memory regions
  2020-10-20 20:44       ` Peter Xu
  2020-11-12 10:11         ` David Hildenbrand
@ 2020-11-18 13:04         ` David Hildenbrand
  2020-11-18 15:23           ` Peter Xu
  1 sibling, 1 reply; 27+ messages in thread
From: David Hildenbrand @ 2020-11-18 13:04 UTC (permalink / raw)
  To: Peter Xu
  Cc: Le Tan, Pankaj Gupta, Michael S. Tsirkin, wei.huang2, qemu-devel,
	Luiz Capitulino, Auger Eric, Alex Williamson, Wei Yang,
	Igor Mammedov, Paolo Bonzini, Dr . David Alan Gilbert

On 20.10.20 22:44, Peter Xu wrote:
> On Tue, Oct 20, 2020 at 10:01:12PM +0200, David Hildenbrand wrote:
>> Thanks ... but I have an AMD system. Will try to find out how to get
>> that running with AMD :)
> 
> May still start with trying intel-iommu first. :) I think it should work for
> amd hosts too.
> 
> Just another FYI - Wei is working on amd-iommu for vfio [1], but it's still
> during review.
> 
> [1] https://lore.kernel.org/qemu-devel/20201002145907.1294353-1-wei.huang2@amd.com/
> 

I'm trying to get an iommu setup running (without virtio-mem!),
but it's a big mess.

Essential parts of my QEMU cmdline are:

sudo build/qemu-system-x86_64 \
    -accel kvm,kernel-irqchip=split \
    ...
    -device pcie-pci-bridge,addr=1e.0,id=pci.1 \
    -device vfio-pci,host=0c:00.0,x-vga=on,bus=pci.1,addr=1.0,multifunction=on \
    -device vfio-pci,host=0c:00.1,bus=pci.1,addr=1.1 \
    -device intel-iommu,caching-mode=on,intremap=on \

I am running upstream QEMU + Linux -next kernel inside the
guest on an AMD Ryzen 9 3900X 12-Core Processor.
I am using SeaBios.

I tried faking an Intel CPU without luck.
("-cpu Skylake-Client,kvm=off,vendor=GenuineIntel")

As soon as I enable "intel_iommu=on" in my guest kernel, graphics
stop working (random mess on graphics output) and I get
  vfio-pci 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0023 address=0xff924000 flags=0x0000]
in the hypervisor, along with other nice messages.

I can spot no vfio DMA mappings coming from an iommu, just as if the
guest wouldn't even try to setup the iommu.

I tried with
1. AMD Radeon RX Vega 56
2. Nvidia GT220
resulting in similar issues.

I also tried with "-device amd-iommu", with other issues
(guest won't even boot up). Are my graphics cards missing some support, or
is there a fundamental flaw in my setup?

Any clues appreciated.


-- 
Thanks,

David / dhildenb




* Re: [PATCH PROTOTYPE 3/6] vfio: Implement support for sparse RAM memory regions
  2020-11-18 13:04         ` David Hildenbrand
@ 2020-11-18 15:23           ` Peter Xu
  2020-11-18 16:14             ` David Hildenbrand
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Xu @ 2020-11-18 15:23 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Le Tan, Pankaj Gupta, Michael S. Tsirkin, wei.huang2, qemu-devel,
	Luiz Capitulino, Auger Eric, Alex Williamson, Wei Yang,
	Igor Mammedov, Paolo Bonzini, Dr . David Alan Gilbert

David,

On Wed, Nov 18, 2020 at 02:04:00PM +0100, David Hildenbrand wrote:
> On 20.10.20 22:44, Peter Xu wrote:
> > On Tue, Oct 20, 2020 at 10:01:12PM +0200, David Hildenbrand wrote:
> >> Thanks ... but I have an AMD system. Will try to find out how to get
> >> that running with AMD :)
> > 
> > May still start with trying intel-iommu first. :) I think it should work for
> > amd hosts too.
> > 
> > Just another FYI - Wei is working on amd-iommu for vfio [1], but it's still
> > during review.
> > 
> > [1] https://lore.kernel.org/qemu-devel/20201002145907.1294353-1-wei.huang2@amd.com/
> > 
> 
> I'm trying to get an iommu setup running (without virtio-mem!),
> but it's a big mess.
> 
> Essential parts of my QEMU cmdline are:
> 
> sudo build/qemu-system-x86_64 \
>     -accel kvm,kernel-irqchip=split \
>     ...
>     -device pcie-pci-bridge,addr=1e.0,id=pci.1 \
>     -device vfio-pci,host=0c:00.0,x-vga=on,bus=pci.1,addr=1.0,multifunction=on \
>     -device vfio-pci,host=0c:00.1,bus=pci.1,addr=1.1 \
>     -device intel-iommu,caching-mode=on,intremap=on \

The intel-iommu device needs to be created before the rest of the devices.  I
forgot the reason behind it; it should be related to how the device address
spaces are created.  This rule should apply to all the other vIOMMUs, afaiu.

Libvirt guarantees that ordering when VT-d is enabled, though when using the
qemu cmdline that's indeed hard to spot at first glance... iirc we tried to
fix this, but I forgot the details; it's just not trivial.

I noticed that this ordering constraint was also missing in the qemu wiki page
for VT-d, so I updated it there too, hopefully...

https://wiki.qemu.org/Features/VT-d#Command_Line_Example
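
I.e., for the command line above, that would be (only the ordering of the
-device options changes, everything else stays the same):

sudo build/qemu-system-x86_64 \
    -accel kvm,kernel-irqchip=split \
    -device intel-iommu,caching-mode=on,intremap=on \
    ...
    -device pcie-pci-bridge,addr=1e.0,id=pci.1 \
    -device vfio-pci,host=0c:00.0,x-vga=on,bus=pci.1,addr=1.0,multifunction=on \
    -device vfio-pci,host=0c:00.1,bus=pci.1,addr=1.1 \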

> 
> I am running upstream QEMU + Linux -next kernel inside the
> guest on an AMD Ryzen 9 3900X 12-Core Processor.
> I am using SeaBios.
> 
> I tried faking an Intel CPU without luck.
> ("-cpu Skylake-Client,kvm=off,vendor=GenuineIntel")
> 
> As soon as I enable "intel_iommu=on" in my guest kernel, graphics
> stop working (random mess on graphics output) and I get
>   vfio-pci 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0023 address=0xff924000 flags=0x0000]
> in the hypervisor, along with other nice messages.
> 
> I can spot no vfio DMA mappings coming from an iommu, just as if the
> guest wouldn't even try to setup the iommu.
> 
> I tried with
> 1. AMD Radeon RX Vega 56
> 2. Nvidia GT220
> resulting in similar issues.
> 
> I also tried with "-device amd-iommu" with other issues
> (guest won't even boot up). Are my graphics card missing some support or
> is there a fundamental flaw in my setup?

I guess amd-iommu won't work without Wei Huang's series applied.

> 
> Any clues appreciated.

Please try with above and see whether it works.  Thanks,

-- 
Peter Xu




* Re: [PATCH PROTOTYPE 3/6] vfio: Implement support for sparse RAM memory regions
  2020-11-18 15:23           ` Peter Xu
@ 2020-11-18 16:14             ` David Hildenbrand
  2020-11-18 17:01               ` Peter Xu
  0 siblings, 1 reply; 27+ messages in thread
From: David Hildenbrand @ 2020-11-18 16:14 UTC (permalink / raw)
  To: Peter Xu
  Cc: Le Tan, Pankaj Gupta, Michael S. Tsirkin, wei.huang2, qemu-devel,
	Luiz Capitulino, Auger Eric, Alex Williamson, Wei Yang,
	Igor Mammedov, Paolo Bonzini, Dr . David Alan Gilbert

On 18.11.20 16:23, Peter Xu wrote:
> David,
> 
> On Wed, Nov 18, 2020 at 02:04:00PM +0100, David Hildenbrand wrote:
>> On 20.10.20 22:44, Peter Xu wrote:
>>> On Tue, Oct 20, 2020 at 10:01:12PM +0200, David Hildenbrand wrote:
>>>> Thanks ... but I have an AMD system. Will try to find out how to get
>>>> that running with AMD :)
>>>
>>> May still start with trying intel-iommu first. :) I think it should work for
>>> amd hosts too.
>>>
>>> Just another FYI - Wei is working on amd-iommu for vfio [1], but it's still
>>> during review.
>>>
>>> [1] https://lore.kernel.org/qemu-devel/20201002145907.1294353-1-wei.huang2@amd.com/
>>>
>>
>> I'm trying to get an iommu setup running (without virtio-mem!),
>> but it's a big mess.
>>
>> Essential parts of my QEMU cmdline are:
>>
>> sudo build/qemu-system-x86_64 \
>>      -accel kvm,kernel-irqchip=split \
>>      ...
>>       device pcie-pci-bridge,addr=1e.0,id=pci.1 \
>>      -device vfio-pci,host=0c:00.0,x-vga=on,bus=pci.1,addr=1.0,multifunction=on \
>>      -device vfio-pci,host=0c:00.1,bus=pci.1,addr=1.1 \
>>      -device intel-iommu,caching-mode=on,intremap=on \
> 
> The intel-iommu device needs to be created before the rest of the devices.  I
> forgot the reason behind it; it should be related to how the device address
> spaces are created.  This rule should apply to all the other vIOMMUs, afaiu.
> 
> Libvirt guarantees that ordering when VT-d is enabled, though when using the
> qemu cmdline that's indeed hard to spot at first glance... iirc we tried to
> fix this, but I forgot the details; it's just not trivial.
> 
> I noticed that this ordering constraint was also missing in the qemu wiki page
> for VT-d, so I updated it there too, hopefully...
> 
> https://wiki.qemu.org/Features/VT-d#Command_Line_Example
> 

That did the trick! Thanks!!!

virtio-mem + vfio + iommu seems to work. More testing to be done.

However, malicious guests can play nasty tricks like

a) Unplugging plugged virtio-mem blocks while they are mapped via an
    IOMMU

1. Guest: map memory location X located on a virtio-mem device inside a
    plugged block into the IOMMU
    -> QEMU IOMMU notifier: create vfio DMA mapping
    -> VFIO pins memory of unplugged blocks (populating memory)
2. Guest: Request to unplug memory location X via virtio-mem device
    -> QEMU virtio-mem: discards the memory.
    -> VFIO still has the memory pinned

We consume more memory than intended. In case the virtio-mem memory would get 
replugged and used, we would have an inconsistency. An IOMMU device reset fixes 
it (whereby all VFIO mappings are removed via the IOMMU notifier).


b) Mapping unplugged virtio-mem blocks via an IOMMU

1. Guest: map memory location X located on a virtio-mem device inside an
    unplugged block
    -> QEMU IOMMU notifier: create vfio DMA mapping
    -> VFIO pins memory of unplugged blocks (populating memory)

Memory that's supposed to be discarded now consumes memory. This is 
similar to a malicious guest simply writing to unplugged memory blocks 
(to be tackled with "protection of unplugged memory" in the future) - 
however memory will also get pinned.


To prohibit b) from happening, we would have to disallow creating the 
VFIO mapping (fairly easy).

To prohibit a), there would have to be some notification to IOMMU 
implementations to unmap/refresh whenever an IOMMU entry still points at 
memory that is getting discarded (and the VM is doing something it's not 
supposed to do).


>> As soon as I enable "intel_iommu=on" in my guest kernel, graphics
>> stop working (random mess on graphics output) and I get
>>    vfio-pci 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0023 address=0xff924000 flags=0x0000]
>> in the hypervisor, along with other nice messages.
>>
>> I can spot no vfio DMA mappings coming from an iommu, just as if the
>> guest wouldn't even try to setup the iommu.
>>
>> I tried with
>> 1. AMD Radeon RX Vega 56
>> 2. Nvidia GT220
>> resulting in similar issues.
>>
>> I also tried with "-device amd-iommu" with other issues
>> (guest won't even boot up). Are my graphics card missing some support or
>> is there a fundamental flaw in my setup?
> 
> I guess amd-iommu won't work if without Wei Huang's series applied.

Oh, okay - I spotted it in QEMU and thought this was already working :)

-- 
Thanks,

David / dhildenb




* Re: [PATCH PROTOTYPE 3/6] vfio: Implement support for sparse RAM memory regions
  2020-11-18 16:14             ` David Hildenbrand
@ 2020-11-18 17:01               ` Peter Xu
  2020-11-18 17:37                 ` David Hildenbrand
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Xu @ 2020-11-18 17:01 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Le Tan, Pankaj Gupta, Michael S. Tsirkin, wei.huang2, qemu-devel,
	Luiz Capitulino, Auger Eric, Alex Williamson, Wei Yang,
	Igor Mammedov, Paolo Bonzini, Dr . David Alan Gilbert

On Wed, Nov 18, 2020 at 05:14:22PM +0100, David Hildenbrand wrote:
> That did the trick! Thanks!!!

Great!  In the meantime, I've a few questions below, mostly about memory
unplugging, which could be naive - I know little about that, so please bear
with me... :)

> 
> virtio-mem + vfio + iommu seems to work. More testing to be done.
> 
> However, malicious guests can play nasty tricks like
> 
> a) Unplugging plugged virtio-mem blocks while they are mapped via an
>    IOMMU
> 
> 1. Guest: map memory location X located on a virtio-mem device inside a
>    plugged block into the IOMMU
>    -> QEMU IOMMU notifier: create vfio DMA mapping
>    -> VFIO pins memory of unplugged blocks (populating memory)
> 2. Guest: Request to unplug memory location X via virtio-mem device
>    -> QEMU virtio-mem: discards the memory.
>    -> VFIO still has the memory pinned

When unplugging some memory, does the user need to first do something to notify
the guest kernel that "this memory is going to be unplugged soon" (assuming an
echo "offline" to some dev file)?  Then the kernel should be responsible for
preparing for that before it really happens, e.g., migrating anonymous pages out
of this memory block.  I don't know what would happen if some pages on the
memblock were used for DMA like this and we want to unplug it.  Ideally I'd
expect it to fail the "echo offline" operation with something like EBUSY if it
can't notify the device driver about this, or if that's hard to do.

IMHO this question is not really related to vIOMMU, but is a general question
about unplugging. Say, what would happen if we unplug some memory with DMA
buffers without vIOMMU at all?  The buffer will be invalid right after
unplugging, so the guest kernel should either fail the operation trying to
unplug, or at least tell the device drivers about this somehow?

> 
> We consume more memory than intended. In case virtio-memory would get
> replugged and used, we would have an inconsistency. IOMMU device resets/ fix
> it (whereby all VFIO mappings are removed via the IOMMU notifier).
> 
> 
> b) Mapping unplugged virtio-mem blocks via an IOMMU
> 
> 1. Guest: map memory location X located on a virtio-mem device inside an
>    unplugged block
>    -> QEMU IOMMU notifier: create vfio DMA mapping
>    -> VFIO pins memory of unplugged blocks (populating memory)

For this case, I would expect vfio_get_xlat_addr() to fail directly if the
guest driver forces a mapping of some IOVA onto an invalid range of the
virtio-mem device.  Even before that - since the guest should know that this
region of virtio-mem is not valid, as it is unplugged - shouldn't the guest
kernel directly fail the dma_map() on such a region even before the mapping
request reaches QEMU?

Thanks,

> 
> Memory that's supposed to be discarded now consumes memory. This is similar
> to a malicious guest simply writing to unplugged memory blocks (to be
> tackled with "protection of unplugged memory" in the future) - however
> memory will also get pinned.
> 
> 
> To prohibit b) from happening, we would have to disallow creating the VFIO
> mapping (fairly easy).
> 
> To prohibit a), there would have to be some notification to IOMMU
> implementations to unmap/refresh whenever an IOMMU entry still points at
> memory that is getting discarded (and the VM is doing something it's not
> supposed to do).

-- 
Peter Xu




* Re: [PATCH PROTOTYPE 3/6] vfio: Implement support for sparse RAM memory regions
  2020-11-18 17:01               ` Peter Xu
@ 2020-11-18 17:37                 ` David Hildenbrand
  2020-11-18 19:05                   ` Peter Xu
  0 siblings, 1 reply; 27+ messages in thread
From: David Hildenbrand @ 2020-11-18 17:37 UTC (permalink / raw)
  To: Peter Xu
  Cc: Le Tan, Pankaj Gupta, Michael S. Tsirkin, wei.huang2, qemu-devel,
	Luiz Capitulino, Auger Eric, Alex Williamson, Wei Yang,
	Igor Mammedov, Paolo Bonzini, Dr . David Alan Gilbert

>> virtio-mem + vfio + iommu seems to work. More testing to be done.
>>
>> However, malicious guests can play nasty tricks like
>>
>> a) Unplugging plugged virtio-mem blocks while they are mapped via an
>>     IOMMU
>>
>> 1. Guest: map memory location X located on a virtio-mem device inside a
>>     plugged block into the IOMMU
>>     -> QEMU IOMMU notifier: create vfio DMA mapping
>>     -> VFIO pins memory of unplugged blocks (populating memory)
>> 2. Guest: Request to unplug memory location X via virtio-mem device
>>     -> QEMU virtio-mem: discards the memory.
>>     -> VFIO still has the memory pinned
> 
> When unplug some memory, does the user need to first do something to notify the
> guest kernel that "this memory is going to be unplugged soon" (assuming echo
> "offline" to some dev file)?  Then the kernel should be responsible to prepare
> for that before it really happens, e.g., migrate anonymous pages out from this
> memory block.  I don't know what would happen if some pages on the memblock
> were used for DMA like this and we want to unplug it.  Ideally I thought it
> should fail the "echo offline" operation with something like EBUSY if it can't
> notify the device driver about this, or it's hard to.

In the very simple case (without resizable RAMBlocks / allocations / memory 
regions) as implemented right now, a virtio-mem device really just 
consists of a static RAM memory region that's mapped into guest physical 
address space. The size of that region corresponds to the "maximum" size 
a virtio-mem device can provide.

How much memory the VM should consume via such a device is expressed via 
a "requested size". So, for the hypervisor to request that the VM consume 
less/more memory, it adjusts the "requested size".

It is up to the device driver in the guest to plug/unplug memory blocks 
(e.g., 4 MiB granularity), in order to reach the requested size. The 
device driver selects memory blocks within the device-assigned memory 
region and requests the hypervisor to (un)plug them - think of it as 
something similar (but different) to memory ballooning.

When requested to unplug memory by the hypervisor, the device driver in 
Linux will try to find memory blocks (e.g., 4 MiB) within the 
device-assigned memory region it can free up. This involves migrating 
pages away etc. Once that succeeded - nobody in the guest is using that 
memory anymore; the guest requests the hypervisor to unplug that block, 
resulting in QEMU discarding the memory. The guest agreed to not touch 
that memory anymore before officially requesting to "plug" it via the 
virtio-mem device.

There is no further action inside the guest required. A sane guest will 
never request to unplug memory that is still in use (similar to memory 
ballooning, where we don't inflate memory that is still in use).

But of course, a malicious guest could try doing that just to cause 
trouble.

> 
> IMHO this question not really related to vIOMMU, but a general question for
> unplugging. Say, what would happen if we unplug some memory with DMA buffers
> without vIOMMU at all?  The buffer will be invalid right after unplugging, so
> the guest kernel should either fail the operation trying to unplug, or at least
> tell the device drivers about this somehow?

A sane guest will never do that. The way memory is removed from Linux 
makes sure that there are no remaining users, otherwise it would be a BUG.

> 
>>
>> We consume more memory than intended. In case virtio-memory would get
>> replugged and used, we would have an inconsistency. IOMMU device resets/ fix
>> it (whereby all VFIO mappings are removed via the IOMMU notifier).
>>
>>
>> b) Mapping unplugged virtio-mem blocks via an IOMMU
>>
>> 1. Guest: map memory location X located on a virtio-mem device inside an
>>     unplugged block
>>     -> QEMU IOMMU notifier: create vfio DMA mapping
>>     -> VFIO pins memory of unplugged blocks (populating memory)
> 
> For this case, I would expect vfio_get_xlat_addr() to fail directly if the
> guest driver force to map some IOVA onto an invalid range of the virtio-mem
> device.  Even before that, since the guest should know that this region of
> virtio-mem is not valid since unplugged, so shouldn't the guest kernel directly
> fail the dma_map() upon such a region even before the mapping message reaching
> QEMU?

Again, sane guests will never do that, for the very reason you mentioned 
"the guest should know that this region of virtio-mem is not valid since 
unplugged,". But a malicious guest could try doing that to cause trouble :)

The memory region managed by a virtio-mem device is always fully mapped 
into the system address space: one reason being, that fragmenting it in 
2 MiB granularity or similar would not be feasible (e.g., KVM memory 
slots limit, migration, ...), but there are other reasons. (Again, 
similar to how memory ballooning works).

vfio_get_xlat_addr() only checks whether that mapping exists. It would be 
easy to ask the virtio-mem device (similar to what is done in this prototype) 
whether that part of the identified memory region may be mapped by VFIO right now.
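
Something like this (sketch only; the is_plugged() callback is made up here
just to illustrate the idea):

/* to be called from vfio_get_xlat_addr() before creating the DMA mapping */
static bool vfio_sparse_ram_section_is_plugged(MemoryRegion *mr, hwaddr offset,
                                               hwaddr size)
{
    SparseRAMHandler *srh;
    SparseRAMHandlerClass *srhc;

    if (!memory_region_is_sparse_ram(mr)) {
        return true;    /* ordinary RAM, nothing to check */
    }
    srh = memory_region_get_sparse_ram_handler(mr);
    srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
    /* made-up hook: is [offset, offset + size) currently plugged? */
    return srhc->is_plugged(srh, mr, offset, size);
}

If that returns false, vfio_get_xlat_addr() would simply refuse the
translation instead of ending up pinning discarded memory.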

-- 
Thanks,

David / dhildenb




* Re: [PATCH PROTOTYPE 3/6] vfio: Implement support for sparse RAM memory regions
  2020-11-18 17:37                 ` David Hildenbrand
@ 2020-11-18 19:05                   ` Peter Xu
  2020-11-18 19:20                     ` David Hildenbrand
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Xu @ 2020-11-18 19:05 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Le Tan, Pankaj Gupta, Michael S. Tsirkin, wei.huang2, qemu-devel,
	Luiz Capitulino, Auger Eric, Alex Williamson, Wei Yang,
	Igor Mammedov, Paolo Bonzini, Dr . David Alan Gilbert

On Wed, Nov 18, 2020 at 06:37:42PM +0100, David Hildenbrand wrote:
> > > a) Unplugging plugged virtio-mem blocks while they are mapped via an
> > >     IOMMU
> > > 
> > > 1. Guest: map memory location X located on a virtio-mem device inside a
> > >     plugged block into the IOMMU
> > >     -> QEMU IOMMU notifier: create vfio DMA mapping
> > >     -> VFIO pins memory of unplugged blocks (populating memory)
> > > 2. Guest: Request to unplug memory location X via virtio-mem device
> > >     -> QEMU virtio-mem: discards the memory.
> > >     -> VFIO still has the memory pinned

[...]

> > > b) Mapping unplugged virtio-mem blocks via an IOMMU
> > > 
> > > 1. Guest: map memory location X located on a virtio-mem device inside an
> > >     unplugged block
> > >     -> QEMU IOMMU notifier: create vfio DMA mapping
> > >     -> VFIO pins memory of unplugged blocks (populating memory)

[...]

> Again, sane guests will never do that, for the very reason you mentioned
> "the guest should know that this region of virtio-mem is not valid since
> unplugged,". But a malicious guest could try doing that to cause trouble :)

Oh I think I see your point now. :) And thanks for the write-up about how
virtio-mem works.

So it's about the malicious guests.

I agree with you that we can try to prevent the above from happening, e.g. by
teaching vfio_get_xlat_addr() to fail when it's going to map some unplugged
range of a virtio-mem device.

Or... imho, we may not even need to worry too much about those misuses of
virtio-mem? As long as the issue is self-contained within the buggy VM/process.
E.g., the worst case of such a malicious guest is to fiddle around the maximum
allowed memory size granted to the virtio-mem device to either have pages
incorrectly pinned, or some strange IOVA mapped to unplugged pages within that
range.  As long as it won't affect other VMs and the host, and qemu won't crash
with that, then it seems ok.

-- 
Peter Xu




* Re: [PATCH PROTOTYPE 3/6] vfio: Implement support for sparse RAM memory regions
  2020-11-18 19:05                   ` Peter Xu
@ 2020-11-18 19:20                     ` David Hildenbrand
  0 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2020-11-18 19:20 UTC (permalink / raw)
  To: Peter Xu
  Cc: Le Tan, Pankaj Gupta, Michael S. Tsirkin, wei.huang2, qemu-devel,
	Luiz Capitulino, Auger Eric, Alex Williamson, Wei Yang,
	Igor Mammedov, Paolo Bonzini, Dr . David Alan Gilbert

> So it's about the malicious guests.
> 
> I agree with you that we can try to limit above from happening, e.g. by
> teaching vfio_get_xlat_addr() to fail when it's going to map some unplugged
> range of virtio-mem device.

Exactly.

> 
> Or... imho, we may not even need to worry too much on those misuses of
> virtio-mem? As long as the issue is self-contained within the buggy VM/process.
> E.g., the worst case of such a malicious guest is to fiddle around the maximum
> allowed memory size granted to the virtio-mem device to either have pages
> incorrectly pinned, or some strange IOVA mapped to unplugged pages within that
> range.  As long as it won't affect other VMs and the host, and qemu won't crash
> with that, then it seems ok.

Indeed, I have the same thoughts.

The only "ugly" thing is that our VM might not only consume more memory 
than intended, but also pin that memory (unmovable, unswappable ...). So 
I'm thinking about at least doing a warn_report_once() until we have 
some approach to handle that - for environments that might care about this.

Thanks for looking into this!

-- 
Thanks,

David / dhildenb



