[RFC PATCH v2 00/21] QEMU gmem implemention

* [RFC PATCH v2 00/21] QEMU gmem implemention
@ 2023-09-14  3:50 Xiaoyao Li
  2023-09-14  3:50 ` [RFC PATCH v2 01/21] *** HACK *** linux-headers: Update headers to pull in gmem APIs Xiaoyao Li
                   ` (21 more replies)
  0 siblings, 22 replies; 52+ messages in thread
From: Xiaoyao Li @ 2023-09-14  3:50 UTC (permalink / raw)
  To: Paolo Bonzini, David Hildenbrand, Igor Mammedov,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Peter Xu, Philippe Mathieu-Daudé,
	Cornelia Huck, Daniel P. Berrangé,
	Eric Blake, Markus Armbruster, Marcelo Tosatti
  Cc: qemu-devel, kvm, xiaoyao.li, Michael Roth, isaku.yamahata,
	Sean Christopherson, Claudio Fontana

It's the v2 RFC of enabling KVM gmem[1] as the backend for private
memory.

For confidential-computing, KVM provides gmem/guest_mem interfaces for
userspace, like QEMU, to allocate user-unaccesible private memory. This
series aims to add gmem support in QEMU's RAMBlock so that each RAM can
have both hva-based shared memory and gmem_fd based private memory. QEMU
does the shared-private conversion on KVM_MEMORY_EXIT and discards the
memory.

It chooses the design that adds "private" property to hostmeory backend.
If "private" property is set, QEMU will allocate/create KVM gmem when
initialize the RAMbloch of the memory backend. 

This sereis also introduces the first user of kvm gmem,
KVM_X86_SW_PROTECTED_VM. A KVM_X86_SW_PROTECTED_VM with private KVM gmem
can be created with 

  $qemu -object sw-protected-vm,id=sp-vm0 \
	-object memory-backend-ram,id=mem0,size=1G,private=on \
	-machine q35,kernel_irqchip=split,confidential-guest-support=sp-vm0,memory-backend=mem0 \
	...

Unfortunately this patch series fails the boot of OVMF at very early
stage due to triple fault, because KVM doesn't support emulating string IO
to private memory.

This version still leave some opens to be discussed:
1. whether we need "private" propery to be user-settable?

   It seems unnecessary because vm-type is determined. If the VM is
   confidential-guest, then the RAM of the guest must be able to be
   mapped as private, i.e., have kvm gmem backend. So QEMU can
   determine the value of "private" property automatiacally based on vm
   type.

   This also aligns with the board internal MemoryRegion that needs to
   have kvm gmem backend, e.g., TDX requires OVMF to act as private
   memory so bios memory region needs to have kvm gmem fd associated.
   QEMU no doubt will do it internally automatically.

2. hugepage support.

   KVM gmem can be allocated from hugetlbfs. How does QEMU determine
   when to allocate KVM gmem with KVM_GUEST_MEMFD_ALLOW_HUGEPAGE. The
   easiest solution is create KVM gmem with KVM_GUEST_MEMFD_ALLOW_HUGEPAGE
   only when memory backend is HostMemoryBackendFile of hugetlbfs.

3. What is KVM_X86_SW_PROTECTED_VM going to look like? and do we need it?

   This series implements KVM_X86_SW_PROTECTED_VM because it's introduced
   with gmem together on KVM side and it's supposed to be the first user
   who requires KVM gmem. However the implementation is incomplete and
   there lacks the definition of how KVM_X86_SW_PROTECTED_VM works.

Any other idea/open/question is welcomed.

Beside, TDX QEMU implemetation is based on this series to provide
private gmem for TD private memory, which can be found at [2].
And it can work corresponding KVM [3] to boot TDX guest. 

[1] https://lore.kernel.org/all/20230718234512.1690985-1-seanjc@google.com/
[2] https://github.com/intel/qemu-tdx/tree/tdx-qemu-upstream
[3] https://github.com/intel/tdx/tree/kvm-upstream-2023.07.27-v6.5-rc2-workaround

===
Changes since rfc v1:
- Implement KVM_X86_SW_PROTECTED_VM with confidential-guest-support
interface;
- rename memory_region_can_be_private() to memory_region_has_gmem_fd();
- allocate kvm gmem fd when creating/initializing the memory backend by
introducing the RAM_KVM_GMEM flag;

Chao Peng (3):
  RAMBlock: Add support of KVM private gmem
  kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot
  kvm: handle KVM_EXIT_MEMORY_FAULT

Isaku Yamahata (4):
  HostMem: Add private property and associate it with RAM_KVM_GMEM
  trace/kvm: Add trace for page convertion between shared and private
  pci-host/q35: Move PAM initialization above SMRAM initialization
  q35: Introduce smm_ranges property for q35-pci-host

Xiaoyao Li (14):
  *** HACK *** linux-headers: Update headers to pull in gmem APIs
  memory: Introduce memory_region_has_gmem_fd()
  i386: Add support for sw-protected-vm object
  i386/pc: Drop pc_machine_kvm_type()
  target/i386: Implement mc->kvm_type() to get VM type
  target/i386: Introduce kvm_confidential_guest_init()
  i386/kvm: Implement kvm_sw_protected_vm_init() for sw-protcted-vm
    specific functions
  kvm: Introduce support for memory_attributes
  kvm/memory: Introduce the infrastructure to set the default
    shared/private value
  i386/kvm: Set memory to default private for KVM_X86_SW_PROTECTED_VM
  physmem: replace function name with __func__ in
    ram_block_discard_range()
  physmem: extract ram_block_discard_range_fd() from
    ram_block_discard_range()
  physmem: Introduce ram_block_convert_range()
  i386: Disable SMM mode for X86_SW_PROTECTED_VM

 accel/kvm/kvm-all.c               | 180 ++++++++++++++++++++-
 accel/kvm/trace-events            |   4 +-
 backends/hostmem-file.c           |   1 +
 backends/hostmem-memfd.c          |   1 +
 backends/hostmem-ram.c            |   1 +
 backends/hostmem.c                |  18 +++
 hw/i386/pc.c                      |   5 -
 hw/i386/pc_q35.c                  |   3 +-
 hw/i386/x86.c                     |  12 ++
 hw/pci-host/q35.c                 |  61 ++++---
 include/exec/cpu-common.h         |   2 +
 include/exec/memory.h             |  20 +++
 include/exec/ramblock.h           |   1 +
 include/hw/i386/pc.h              |   4 +-
 include/hw/i386/x86.h             |   1 +
 include/hw/pci-host/q35.h         |   1 +
 include/sysemu/hostmem.h          |   2 +-
 include/sysemu/kvm.h              |   5 +
 include/sysemu/kvm_int.h          |   2 +
 linux-headers/asm-x86/kvm.h       |   3 +
 linux-headers/linux/kvm.h         |  50 ++++++
 qapi/qom.json                     |   5 +
 softmmu/memory.c                  |  18 +++
 softmmu/physmem.c                 | 256 ++++++++++++++++++------------
 target/i386/kvm/kvm.c             |  43 ++++-
 target/i386/kvm/kvm_i386.h        |   1 +
 target/i386/kvm/meson.build       |   1 +
 target/i386/kvm/sw-protected-vm.c |  71 +++++++++
 target/i386/kvm/sw-protected-vm.h |  19 +++
 target/i386/sev.c                 |   1 -
 target/i386/sev.h                 |   2 +
 31 files changed, 648 insertions(+), 146 deletions(-)
 create mode 100644 target/i386/kvm/sw-protected-vm.c
 create mode 100644 target/i386/kvm/sw-protected-vm.h

-- 
2.34.1

^ permalink raw reply	[flat|nested] 52+ messages in thread