* [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
@ 2020-08-07 14:12 Vitaly Kuznetsov
  2020-08-07 14:12 ` [PATCH v2 1/3] KVM: x86: move kvm_vcpu_gfn_to_memslot() out of try_async_pf() Vitaly Kuznetsov
                   ` (3 more replies)
  0 siblings, 4 replies; 31+ messages in thread
From: Vitaly Kuznetsov @ 2020-08-07 14:12 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Peter Xu,
	Michael Tsirkin, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

Changes since v1:
- Better KVM_SET_USER_MEMORY_REGION flags description, minor tweaks to
  the code [Drew Jones]
- BUG_ON() condition in __gfn_to_hva_memslot() adjusted.

This is a continuation of "[PATCH RFC 0/5] KVM: x86: KVM_MEM_ALLONES
memory" work: 
https://lore.kernel.org/kvm/20200514180540.52407-1-vkuznets@redhat.com/
and pairs with Julia's "x86/PCI: Use MMCONFIG by default for KVM guests":
https://lore.kernel.org/linux-pci/20200722001513.298315-1-jusual@redhat.com/

PCIe config space can (depending on the configuration) be quite big but
is usually sparsely populated. A guest may scan it by accessing each
individual device's page which, when the device is missing, is supposed
to have 'pci hole' semantics: reads return '0xff' and writes get discarded.

When testing Linux kernel boot with a QEMU q35 VM and direct kernel boot
I observed 8193 accesses to PCI hole memory. When such an access is handled
in KVM without exiting to userspace, it takes roughly 0.000001 sec.
Handling the same access in userspace is six times slower (0.000006 sec),
so the overall difference is 8193 * 0.000005 = ~0.04 sec. This may be
significant for 'microvm' ideas.

Note, the same speed can already be achieved by using KVM_MEM_READONLY
but doing so would require allocating real memory for all missing
devices: e.g. 8192 pages at 4 KiB each is 32 MB. This would have to be
allocated for each guest separately, which for 'microvm' use-cases is
likely a no-go.

Introduce a special KVM_MEM_PCI_HOLE memory type: userspace doesn't need
to back it with real memory, and all reads from it are handled inside KVM,
returning '0xff'. Writes still go to userspace, but these should be
extremely rare.
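
From the VMM side this is a single KVM_SET_USER_MEMORY_REGION call with no
backing memory. A minimal usage sketch (the slot id, ECAM base address and
size below are made-up examples; the #define fallback assumes headers that
don't yet carry the new flag):

  #include <linux/kvm.h>
  #include <sys/ioctl.h>

  #ifndef KVM_MEM_PCI_HOLE
  #define KVM_MEM_PCI_HOLE (1UL << 2)  /* value added by this series */
  #endif

  /* Register a 256 MiB 'pci hole' slot covering a hypothetical ECAM range. */
  static int add_pci_hole_slot(int vm_fd)
  {
          struct kvm_userspace_memory_region region = {
                  .slot            = 10,          /* any free slot id */
                  .flags           = KVM_MEM_PCI_HOLE,
                  .guest_phys_addr = 0xe0000000,  /* example ECAM base */
                  .memory_size     = 256 << 20,
                  .userspace_addr  = 0,           /* must be NULL for PCI_HOLE slots */
          };

          return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
  }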

The original 'KVM_MEM_ALLONES' idea had additional optimizations: KVM
was mapping all 'PCI hole' pages to a single read-only page stuffed with
0xff. This is omitted in this submission as the benefits are unclear:
KVM will have to allocate SPTEs (either on demand or aggressively) and
this also consumes time/memory. We can always take a look at possible
optimizations later.

Vitaly Kuznetsov (3):
  KVM: x86: move kvm_vcpu_gfn_to_memslot() out of try_async_pf()
  KVM: x86: introduce KVM_MEM_PCI_HOLE memory
  KVM: selftests: add KVM_MEM_PCI_HOLE test

 Documentation/virt/kvm/api.rst                |  18 ++-
 arch/x86/include/uapi/asm/kvm.h               |   1 +
 arch/x86/kvm/mmu/mmu.c                        |  19 +--
 arch/x86/kvm/mmu/paging_tmpl.h                |  10 +-
 arch/x86/kvm/x86.c                            |  10 +-
 include/linux/kvm_host.h                      |   3 +
 include/uapi/linux/kvm.h                      |   2 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../testing/selftests/kvm/include/kvm_util.h  |   1 +
 tools/testing/selftests/kvm/lib/kvm_util.c    |  81 +++++++------
 .../kvm/x86_64/memory_slot_pci_hole.c         | 112 ++++++++++++++++++
 virt/kvm/kvm_main.c                           |  39 ++++--
 12 files changed, 239 insertions(+), 58 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/memory_slot_pci_hole.c

-- 
2.25.4



* [PATCH v2 1/3] KVM: x86: move kvm_vcpu_gfn_to_memslot() out of try_async_pf()
  2020-08-07 14:12 [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory Vitaly Kuznetsov
@ 2020-08-07 14:12 ` Vitaly Kuznetsov
  2020-08-14  1:40   ` Sean Christopherson
  2020-08-07 14:12 ` [PATCH v2 2/3] KVM: x86: introduce KVM_MEM_PCI_HOLE memory Vitaly Kuznetsov
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 31+ messages in thread
From: Vitaly Kuznetsov @ 2020-08-07 14:12 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Peter Xu,
	Michael Tsirkin, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

No functional change intended. Slot flags will need to be analyzed
prior to try_async_pf() when KVM_MEM_PCI_HOLE is implemented.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c         | 14 ++++++++------
 arch/x86/kvm/mmu/paging_tmpl.h |  7 +++++--
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 862bf418214e..fef6956393f7 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4042,11 +4042,10 @@ static bool kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 				  kvm_vcpu_gfn_to_hva(vcpu, gfn), &arch);
 }
 
-static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn,
-			 gpa_t cr2_or_gpa, kvm_pfn_t *pfn, bool write,
-			 bool *writable)
+static bool try_async_pf(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
+			 bool prefault, gfn_t gfn, gpa_t cr2_or_gpa,
+			 kvm_pfn_t *pfn, bool write, bool *writable)
 {
-	struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
 	bool async;
 
 	/* Don't expose private memslots to L2. */
@@ -4082,7 +4081,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
 	bool exec = error_code & PFERR_FETCH_MASK;
 	bool lpage_disallowed = exec && is_nx_huge_page_enabled();
 	bool map_writable;
-
+	struct kvm_memory_slot *slot;
 	gfn_t gfn = gpa >> PAGE_SHIFT;
 	unsigned long mmu_seq;
 	kvm_pfn_t pfn;
@@ -4104,7 +4103,10 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
 	mmu_seq = vcpu->kvm->mmu_notifier_seq;
 	smp_rmb();
 
-	if (try_async_pf(vcpu, prefault, gfn, gpa, &pfn, write, &map_writable))
+	slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
+
+	if (try_async_pf(vcpu, slot, prefault, gfn, gpa, &pfn, write,
+			 &map_writable))
 		return RET_PF_RETRY;
 
 	if (handle_abnormal_pfn(vcpu, is_tdp ? 0 : gpa, gfn, pfn, ACC_ALL, &r))
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 0172a949f6a7..5c6a895f67c3 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -779,6 +779,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code,
 	int write_fault = error_code & PFERR_WRITE_MASK;
 	int user_fault = error_code & PFERR_USER_MASK;
 	struct guest_walker walker;
+	struct kvm_memory_slot *slot;
 	int r;
 	kvm_pfn_t pfn;
 	unsigned long mmu_seq;
@@ -833,8 +834,10 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code,
 	mmu_seq = vcpu->kvm->mmu_notifier_seq;
 	smp_rmb();
 
-	if (try_async_pf(vcpu, prefault, walker.gfn, addr, &pfn, write_fault,
-			 &map_writable))
+	slot = kvm_vcpu_gfn_to_memslot(vcpu, walker.gfn);
+
+	if (try_async_pf(vcpu, slot, prefault, walker.gfn, addr, &pfn,
+			 write_fault, &map_writable))
 		return RET_PF_RETRY;
 
 	if (handle_abnormal_pfn(vcpu, addr, walker.gfn, pfn, walker.pte_access, &r))
-- 
2.25.4



* [PATCH v2 2/3] KVM: x86: introduce KVM_MEM_PCI_HOLE memory
  2020-08-07 14:12 [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory Vitaly Kuznetsov
  2020-08-07 14:12 ` [PATCH v2 1/3] KVM: x86: move kvm_vcpu_gfn_to_memslot() out of try_async_pf() Vitaly Kuznetsov
@ 2020-08-07 14:12 ` Vitaly Kuznetsov
  2020-08-14  2:31   ` Sean Christopherson
  2020-08-07 14:12 ` [PATCH v2 3/3] KVM: selftests: add KVM_MEM_PCI_HOLE test Vitaly Kuznetsov
  2020-08-25 21:25 ` [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory Peter Xu
  3 siblings, 1 reply; 31+ messages in thread
From: Vitaly Kuznetsov @ 2020-08-07 14:12 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Peter Xu,
	Michael Tsirkin, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

PCIe config space can (depending on the configuration) be quite big but
usually is sparsely populated. Guest may scan it by accessing individual
device's page which, when device is missing, is supposed to have 'pci
hole' semantics: reads return '0xff' and writes get discarded. Compared
to the already existing KVM_MEM_READONLY, VMM doesn't need to allocate
real memory and stuff it with '0xff'.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 Documentation/virt/kvm/api.rst  | 18 ++++++++++-----
 arch/x86/include/uapi/asm/kvm.h |  1 +
 arch/x86/kvm/mmu/mmu.c          |  5 ++++-
 arch/x86/kvm/mmu/paging_tmpl.h  |  3 +++
 arch/x86/kvm/x86.c              | 10 ++++++---
 include/linux/kvm_host.h        |  3 +++
 include/uapi/linux/kvm.h        |  2 ++
 virt/kvm/kvm_main.c             | 39 +++++++++++++++++++++++++++------
 8 files changed, 64 insertions(+), 17 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 644e5326aa50..dc4172352635 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1241,6 +1241,7 @@ yet and must be cleared on entry.
   /* for kvm_memory_region::flags */
   #define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
   #define KVM_MEM_READONLY	(1UL << 1)
+  #define KVM_MEM_PCI_HOLE		(1UL << 2)
 
 This ioctl allows the user to create, modify or delete a guest physical
 memory slot.  Bits 0-15 of "slot" specify the slot id and this value
@@ -1268,12 +1269,17 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
 be identical.  This allows large pages in the guest to be backed by large
 pages in the host.
 
-The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
-KVM_MEM_READONLY.  The former can be set to instruct KVM to keep track of
-writes to memory within the slot.  See KVM_GET_DIRTY_LOG ioctl to know how to
-use it.  The latter can be set, if KVM_CAP_READONLY_MEM capability allows it,
-to make a new slot read-only.  In this case, writes to this memory will be
-posted to userspace as KVM_EXIT_MMIO exits.
+The flags field supports the following flags: KVM_MEM_LOG_DIRTY_PAGES,
+KVM_MEM_READONLY, KVM_MEM_PCI_HOLE:
+- KVM_MEM_LOG_DIRTY_PAGES: log writes.  Use KVM_GET_DIRTY_LOG to retrieve
+  the log.
+- KVM_MEM_READONLY: exit to userspace with KVM_EXIT_MMIO on writes.  Only
+  available when KVM_CAP_READONLY_MEM is present.
+- KVM_MEM_PCI_HOLE: always return 0xff on reads, exit to userspace with
+  KVM_EXIT_MMIO on writes.  Only available when KVM_CAP_PCI_HOLE_MEM is
+  present.  When setting the memory region 'userspace_addr' must be NULL.
+  This flag is mutually exclusive with KVM_MEM_LOG_DIRTY_PAGES and with
+  KVM_MEM_READONLY.
 
 When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
 the memory region are automatically reflected into the guest.  For example, an
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 17c5a038f42d..cf80a26d74f5 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -48,6 +48,7 @@
 #define __KVM_HAVE_XSAVE
 #define __KVM_HAVE_XCRS
 #define __KVM_HAVE_READONLY_MEM
+#define __KVM_HAVE_PCI_HOLE_MEM
 
 /* Architectural interrupt line count. */
 #define KVM_NR_INTERRUPTS 256
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index fef6956393f7..4a2a7fface1e 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3254,7 +3254,7 @@ static int kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, gfn_t gfn,
 		return PG_LEVEL_4K;
 
 	slot = gfn_to_memslot_dirty_bitmap(vcpu, gfn, true);
-	if (!slot)
+	if (!slot || (slot->flags & KVM_MEM_PCI_HOLE))
 		return PG_LEVEL_4K;
 
 	max_level = min(max_level, max_huge_page_level);
@@ -4105,6 +4105,9 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
 
 	slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
 
+	if (!write && slot && (slot->flags & KVM_MEM_PCI_HOLE))
+		return RET_PF_EMULATE;
+
 	if (try_async_pf(vcpu, slot, prefault, gfn, gpa, &pfn, write,
 			 &map_writable))
 		return RET_PF_RETRY;
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 5c6a895f67c3..27abd69e69f6 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -836,6 +836,9 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code,
 
 	slot = kvm_vcpu_gfn_to_memslot(vcpu, walker.gfn);
 
+	if (!write_fault && slot && (slot->flags & KVM_MEM_PCI_HOLE))
+		return RET_PF_EMULATE;
+
 	if (try_async_pf(vcpu, slot, prefault, walker.gfn, addr, &pfn,
 			 write_fault, &map_writable))
 		return RET_PF_RETRY;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index dc4370394ab8..538bc58a22db 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3515,6 +3515,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_EXCEPTION_PAYLOAD:
 	case KVM_CAP_SET_GUEST_DEBUG:
 	case KVM_CAP_LAST_CPU:
+	case KVM_CAP_PCI_HOLE_MEM:
 		r = 1;
 		break;
 	case KVM_CAP_SYNC_REGS:
@@ -10114,9 +10115,11 @@ static int kvm_alloc_memslot_metadata(struct kvm_memory_slot *slot,
 		ugfn = slot->userspace_addr >> PAGE_SHIFT;
 		/*
 		 * If the gfn and userspace address are not aligned wrt each
-		 * other, disable large page support for this slot.
+		 * other, disable large page support for this slot. Also,
+		 * disable large page support for KVM_MEM_PCI_HOLE slots.
 		 */
-		if ((slot->base_gfn ^ ugfn) & (KVM_PAGES_PER_HPAGE(level) - 1)) {
+		if ((slot->flags & KVM_MEM_PCI_HOLE) || ((slot->base_gfn ^ ugfn) &
+				      (KVM_PAGES_PER_HPAGE(level) - 1))) {
 			unsigned long j;
 
 			for (j = 0; j < lpages; ++j)
@@ -10178,7 +10181,8 @@ static void kvm_mmu_slot_apply_flags(struct kvm *kvm,
 	 * Nothing to do for RO slots or CREATE/MOVE/DELETE of a slot.
 	 * See comments below.
 	 */
-	if ((change != KVM_MR_FLAGS_ONLY) || (new->flags & KVM_MEM_READONLY))
+	if ((change != KVM_MR_FLAGS_ONLY) ||
+	    (new->flags & (KVM_MEM_READONLY | KVM_MEM_PCI_HOLE)))
 		return;
 
 	/*
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 989afcbe642f..de1faa64a8ef 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1081,6 +1081,9 @@ __gfn_to_memslot(struct kvm_memslots *slots, gfn_t gfn)
 static inline unsigned long
 __gfn_to_hva_memslot(struct kvm_memory_slot *slot, gfn_t gfn)
 {
+	/* Should never be called with a KVM_MEM_PCI_HOLE slot */
+	BUG_ON(!slot->userspace_addr);
+
 	return slot->userspace_addr + (gfn - slot->base_gfn) * PAGE_SIZE;
 }
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index f6d86033c4fa..b56137f4f96f 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -109,6 +109,7 @@ struct kvm_userspace_memory_region {
  */
 #define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
 #define KVM_MEM_READONLY	(1UL << 1)
+#define KVM_MEM_PCI_HOLE		(1UL << 2)
 
 /* for KVM_IRQ_LINE */
 struct kvm_irq_level {
@@ -1035,6 +1036,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_LAST_CPU 184
 #define KVM_CAP_SMALLER_MAXPHYADDR 185
 #define KVM_CAP_S390_DIAG318 186
+#define KVM_CAP_PCI_HOLE_MEM 187
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2c2c0254c2d8..3f69ae711021 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1107,6 +1107,10 @@ static int check_memory_region_flags(const struct kvm_userspace_memory_region *m
 	valid_flags |= KVM_MEM_READONLY;
 #endif
 
+#ifdef __KVM_HAVE_PCI_HOLE_MEM
+	valid_flags |= KVM_MEM_PCI_HOLE;
+#endif
+
 	if (mem->flags & ~valid_flags)
 		return -EINVAL;
 
@@ -1284,11 +1288,26 @@ int __kvm_set_memory_region(struct kvm *kvm,
 		return -EINVAL;
 	if (mem->guest_phys_addr & (PAGE_SIZE - 1))
 		return -EINVAL;
-	/* We can read the guest memory with __xxx_user() later on. */
-	if ((mem->userspace_addr & (PAGE_SIZE - 1)) ||
-	     !access_ok((void __user *)(unsigned long)mem->userspace_addr,
-			mem->memory_size))
+
+	/*
+	 * KVM_MEM_PCI_HOLE is mutually exclusive with KVM_MEM_READONLY/
+	 * KVM_MEM_LOG_DIRTY_PAGES.
+	 */
+	if ((mem->flags & KVM_MEM_PCI_HOLE) &&
+	    (mem->flags & (KVM_MEM_READONLY | KVM_MEM_LOG_DIRTY_PAGES)))
 		return -EINVAL;
+
+	if (!(mem->flags & KVM_MEM_PCI_HOLE)) {
+		/* We can read the guest memory with __xxx_user() later on. */
+		if ((mem->userspace_addr & (PAGE_SIZE - 1)) ||
+		    !access_ok((void __user *)(unsigned long)mem->userspace_addr,
+			       mem->memory_size))
+			return -EINVAL;
+	} else {
+		if (mem->userspace_addr)
+			return -EINVAL;
+	}
+
 	if (as_id >= KVM_ADDRESS_SPACE_NUM || id >= KVM_MEM_SLOTS_NUM)
 		return -EINVAL;
 	if (mem->guest_phys_addr + mem->memory_size < mem->guest_phys_addr)
@@ -1328,7 +1347,8 @@ int __kvm_set_memory_region(struct kvm *kvm,
 	} else { /* Modify an existing slot. */
 		if ((new.userspace_addr != old.userspace_addr) ||
 		    (new.npages != old.npages) ||
-		    ((new.flags ^ old.flags) & KVM_MEM_READONLY))
+		    ((new.flags ^ old.flags) & KVM_MEM_READONLY) ||
+		    ((new.flags ^ old.flags) & KVM_MEM_PCI_HOLE))
 			return -EINVAL;
 
 		if (new.base_gfn != old.base_gfn)
@@ -1715,13 +1735,13 @@ unsigned long kvm_host_page_size(struct kvm_vcpu *vcpu, gfn_t gfn)
 
 static bool memslot_is_readonly(struct kvm_memory_slot *slot)
 {
-	return slot->flags & KVM_MEM_READONLY;
+	return slot->flags & (KVM_MEM_READONLY | KVM_MEM_PCI_HOLE);
 }
 
 static unsigned long __gfn_to_hva_many(struct kvm_memory_slot *slot, gfn_t gfn,
 				       gfn_t *nr_pages, bool write)
 {
-	if (!slot || slot->flags & KVM_MEMSLOT_INVALID)
+	if (!slot || (slot->flags & (KVM_MEMSLOT_INVALID | KVM_MEM_PCI_HOLE)))
 		return KVM_HVA_ERR_BAD;
 
 	if (memslot_is_readonly(slot) && write)
@@ -2318,6 +2338,11 @@ static int __kvm_read_guest_page(struct kvm_memory_slot *slot, gfn_t gfn,
 	int r;
 	unsigned long addr;
 
+	if (unlikely(slot && (slot->flags & KVM_MEM_PCI_HOLE))) {
+		memset(data, 0xff, len);
+		return 0;
+	}
+
 	addr = gfn_to_hva_memslot_prot(slot, gfn, NULL);
 	if (kvm_is_error_hva(addr))
 		return -EFAULT;
-- 
2.25.4



* [PATCH v2 3/3] KVM: selftests: add KVM_MEM_PCI_HOLE test
  2020-08-07 14:12 [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory Vitaly Kuznetsov
  2020-08-07 14:12 ` [PATCH v2 1/3] KVM: x86: move kvm_vcpu_gfn_to_memslot() out of try_async_pf() Vitaly Kuznetsov
  2020-08-07 14:12 ` [PATCH v2 2/3] KVM: x86: introduce KVM_MEM_PCI_HOLE memory Vitaly Kuznetsov
@ 2020-08-07 14:12 ` Vitaly Kuznetsov
  2020-08-25 21:25 ` [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory Peter Xu
  3 siblings, 0 replies; 31+ messages in thread
From: Vitaly Kuznetsov @ 2020-08-07 14:12 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Peter Xu,
	Michael Tsirkin, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

Test the newly introduced KVM_MEM_PCI_HOLE memslots:
- Reads from all pages return '0xff'
- Writes to all pages cause KVM_EXIT_MMIO

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../testing/selftests/kvm/include/kvm_util.h  |   1 +
 tools/testing/selftests/kvm/lib/kvm_util.c    |  81 +++++++------
 .../kvm/x86_64/memory_slot_pci_hole.c         | 112 ++++++++++++++++++
 4 files changed, 162 insertions(+), 33 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/memory_slot_pci_hole.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 4a166588d99f..a6fe303fbf6a 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -41,6 +41,7 @@ LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c
 TEST_GEN_PROGS_x86_64 = x86_64/cr4_cpuid_sync_test
 TEST_GEN_PROGS_x86_64 += x86_64/evmcs_test
 TEST_GEN_PROGS_x86_64 += x86_64/hyperv_cpuid
+TEST_GEN_PROGS_x86_64 += x86_64/memory_slot_pci_hole
 TEST_GEN_PROGS_x86_64 += x86_64/mmio_warning_test
 TEST_GEN_PROGS_x86_64 += x86_64/platform_info_test
 TEST_GEN_PROGS_x86_64 += x86_64/set_sregs_test
diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index 919e161dd289..8e7bec7bd287 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -59,6 +59,7 @@ enum vm_mem_backing_src_type {
 	VM_MEM_SRC_ANONYMOUS,
 	VM_MEM_SRC_ANONYMOUS_THP,
 	VM_MEM_SRC_ANONYMOUS_HUGETLB,
+	VM_MEM_SRC_NONE,
 };
 
 int kvm_check_cap(long cap);
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 74776ee228f2..46bb28ea34ec 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -453,8 +453,11 @@ static void __vm_mem_region_delete(struct kvm_vm *vm,
 		    "rc: %i errno: %i", ret, errno);
 
 	sparsebit_free(&region->unused_phy_pages);
-	ret = munmap(region->mmap_start, region->mmap_size);
-	TEST_ASSERT(ret == 0, "munmap failed, rc: %i errno: %i", ret, errno);
+	if (region->mmap_start) {
+		ret = munmap(region->mmap_start, region->mmap_size);
+		TEST_ASSERT(ret == 0, "munmap failed, rc: %i errno: %i", ret,
+			    errno);
+	}
 
 	free(region);
 }
@@ -643,34 +646,42 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 	alignment = 1;
 #endif
 
-	if (src_type == VM_MEM_SRC_ANONYMOUS_THP)
-		alignment = max(huge_page_size, alignment);
-
-	/* Add enough memory to align up if necessary */
-	if (alignment > 1)
-		region->mmap_size += alignment;
-
-	region->mmap_start = mmap(NULL, region->mmap_size,
-				  PROT_READ | PROT_WRITE,
-				  MAP_PRIVATE | MAP_ANONYMOUS
-				  | (src_type == VM_MEM_SRC_ANONYMOUS_HUGETLB ? MAP_HUGETLB : 0),
-				  -1, 0);
-	TEST_ASSERT(region->mmap_start != MAP_FAILED,
-		    "test_malloc failed, mmap_start: %p errno: %i",
-		    region->mmap_start, errno);
-
-	/* Align host address */
-	region->host_mem = align(region->mmap_start, alignment);
-
-	/* As needed perform madvise */
-	if (src_type == VM_MEM_SRC_ANONYMOUS || src_type == VM_MEM_SRC_ANONYMOUS_THP) {
-		ret = madvise(region->host_mem, npages * vm->page_size,
-			     src_type == VM_MEM_SRC_ANONYMOUS ? MADV_NOHUGEPAGE : MADV_HUGEPAGE);
-		TEST_ASSERT(ret == 0, "madvise failed,\n"
-			    "  addr: %p\n"
-			    "  length: 0x%lx\n"
-			    "  src_type: %x",
-			    region->host_mem, npages * vm->page_size, src_type);
+	if (src_type != VM_MEM_SRC_NONE) {
+		if (src_type == VM_MEM_SRC_ANONYMOUS_THP)
+			alignment = max(huge_page_size, alignment);
+
+		/* Add enough memory to align up if necessary */
+		if (alignment > 1)
+			region->mmap_size += alignment;
+
+		region->mmap_start = mmap(NULL, region->mmap_size,
+			  PROT_READ | PROT_WRITE,
+			  MAP_PRIVATE | MAP_ANONYMOUS
+			  | (src_type == VM_MEM_SRC_ANONYMOUS_HUGETLB ?
+			     MAP_HUGETLB : 0), -1, 0);
+		TEST_ASSERT(region->mmap_start != MAP_FAILED,
+			    "test_malloc failed, mmap_start: %p errno: %i",
+			    region->mmap_start, errno);
+
+		/* Align host address */
+		region->host_mem = align(region->mmap_start, alignment);
+
+		/* As needed perform madvise */
+		if (src_type == VM_MEM_SRC_ANONYMOUS ||
+		    src_type == VM_MEM_SRC_ANONYMOUS_THP) {
+			ret = madvise(region->host_mem, npages * vm->page_size,
+				      src_type == VM_MEM_SRC_ANONYMOUS ?
+				      MADV_NOHUGEPAGE : MADV_HUGEPAGE);
+			TEST_ASSERT(ret == 0, "madvise failed,\n"
+				    "  addr: %p\n"
+				    "  length: 0x%lx\n"
+				    "  src_type: %x",
+				    region->host_mem, npages * vm->page_size,
+				    src_type);
+		}
+	} else {
+		region->mmap_start = NULL;
+		region->host_mem = NULL;
 	}
 
 	region->unused_phy_pages = sparsebit_alloc();
@@ -1076,9 +1087,13 @@ void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa)
 	list_for_each_entry(region, &vm->userspace_mem_regions, list) {
 		if ((gpa >= region->region.guest_phys_addr)
 			&& (gpa <= (region->region.guest_phys_addr
-				+ region->region.memory_size - 1)))
-			return (void *) ((uintptr_t) region->host_mem
-				+ (gpa - region->region.guest_phys_addr));
+				+ region->region.memory_size - 1))) {
+			if (region->host_mem)
+				return (void *) ((uintptr_t) region->host_mem
+				 + (gpa - region->region.guest_phys_addr));
+			else
+				return NULL;
+		}
 	}
 
 	TEST_FAIL("No vm physical memory at 0x%lx", gpa);
diff --git a/tools/testing/selftests/kvm/x86_64/memory_slot_pci_hole.c b/tools/testing/selftests/kvm/x86_64/memory_slot_pci_hole.c
new file mode 100644
index 000000000000..f5fa80dfcba7
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/memory_slot_pci_hole.c
@@ -0,0 +1,112 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE /* for program_invocation_short_name */
+#include <fcntl.h>
+#include <pthread.h>
+#include <sched.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/ioctl.h>
+
+#include <linux/compiler.h>
+
+#include <test_util.h>
+#include <kvm_util.h>
+#include <processor.h>
+
+#define VCPU_ID 0
+
+#define MEM_REGION_GPA		0xc0000000
+#define MEM_REGION_SIZE		0x4000
+#define MEM_REGION_SLOT		10
+
+static void guest_code(void)
+{
+	uint8_t val;
+
+	/* First byte in the first page */
+	val = READ_ONCE(*((uint8_t *)MEM_REGION_GPA));
+	GUEST_ASSERT(val == 0xff);
+
+	GUEST_SYNC(1);
+
+	/* Random byte in the second page */
+	val = READ_ONCE(*((uint8_t *)MEM_REGION_GPA + 5000));
+	GUEST_ASSERT(val == 0xff);
+
+	GUEST_SYNC(2);
+
+	/* Write to the first page */
+	WRITE_ONCE(*((uint64_t *)MEM_REGION_GPA + 1024/8), 0xdeafbeef);
+
+	GUEST_SYNC(3);
+
+	/* Write to the second page */
+	WRITE_ONCE(*((uint64_t *)MEM_REGION_GPA + 8000/8), 0xdeafbeef);
+
+	GUEST_SYNC(4);
+
+	GUEST_DONE();
+}
+
+int main(int argc, char *argv[])
+{
+	struct kvm_vm *vm;
+	struct kvm_run *run;
+	struct ucall uc;
+	int stage, rv;
+
+	rv = kvm_check_cap(KVM_CAP_PCI_HOLE_MEM);
+	if (!rv) {
+		print_skip("KVM_CAP_PCI_HOLE_MEM not supported");
+		exit(KSFT_SKIP);
+	}
+
+	vm = vm_create_default(VCPU_ID, 0, guest_code);
+
+	run = vcpu_state(vm, VCPU_ID);
+
+	vcpu_set_cpuid(vm, VCPU_ID, kvm_get_supported_cpuid());
+
+	vm_userspace_mem_region_add(vm, VM_MEM_SRC_NONE,
+				    MEM_REGION_GPA, MEM_REGION_SLOT,
+				    MEM_REGION_SIZE / getpagesize(),
+				    KVM_MEM_PCI_HOLE);
+
+	virt_map(vm, MEM_REGION_GPA, MEM_REGION_GPA,
+		 MEM_REGION_SIZE / getpagesize(), 0);
+
+	for (stage = 1;; stage++) {
+		_vcpu_run(vm, VCPU_ID);
+
+		if (stage == 3 || stage == 5) {
+			TEST_ASSERT(run->exit_reason == KVM_EXIT_MMIO,
+			   "Write to PCI_HOLE page should cause KVM_EXIT_MMIO");
+			continue;
+		}
+
+		TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+			    "Stage %d: unexpected exit reason: %u (%s),\n",
+			    stage, run->exit_reason,
+			    exit_reason_str(run->exit_reason));
+
+		switch (get_ucall(vm, VCPU_ID, &uc)) {
+		case UCALL_ABORT:
+			TEST_FAIL("%s at %s:%ld", (const char *)uc.args[0],
+				  __FILE__, uc.args[1]);
+			/* NOT REACHED */
+		case UCALL_SYNC:
+			break;
+		case UCALL_DONE:
+			goto done;
+		default:
+			TEST_FAIL("Unknown ucall %lu", uc.cmd);
+		}
+	}
+
+done:
+	kvm_vm_free(vm);
+
+	return 0;
+}
-- 
2.25.4



* Re: [PATCH v2 1/3] KVM: x86: move kvm_vcpu_gfn_to_memslot() out of try_async_pf()
  2020-08-07 14:12 ` [PATCH v2 1/3] KVM: x86: move kvm_vcpu_gfn_to_memslot() out of try_async_pf() Vitaly Kuznetsov
@ 2020-08-14  1:40   ` Sean Christopherson
  2020-09-01 14:15     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 31+ messages in thread
From: Sean Christopherson @ 2020-08-14  1:40 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Peter Xu,
	Michael Tsirkin, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

On Fri, Aug 07, 2020 at 04:12:30PM +0200, Vitaly Kuznetsov wrote:
> No functional change intended. Slot flags will need to be analyzed
> prior to try_async_pf() when KVM_MEM_PCI_HOLE is implemented.

Why?  Wouldn't it be just as easy, and arguably more appropriate, to add
KVM_PFN_ERR_PCI_HOLE and update handle_abnormal_pfn() accordingly?

> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/mmu/mmu.c         | 14 ++++++++------
>  arch/x86/kvm/mmu/paging_tmpl.h |  7 +++++--
>  2 files changed, 13 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 862bf418214e..fef6956393f7 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4042,11 +4042,10 @@ static bool kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
>  				  kvm_vcpu_gfn_to_hva(vcpu, gfn), &arch);
>  }
>  
> -static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn,
> -			 gpa_t cr2_or_gpa, kvm_pfn_t *pfn, bool write,
> -			 bool *writable)
> +static bool try_async_pf(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
> +			 bool prefault, gfn_t gfn, gpa_t cr2_or_gpa,
> +			 kvm_pfn_t *pfn, bool write, bool *writable)
>  {
> -	struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
>  	bool async;
>  
>  	/* Don't expose private memslots to L2. */
> @@ -4082,7 +4081,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
>  	bool exec = error_code & PFERR_FETCH_MASK;
>  	bool lpage_disallowed = exec && is_nx_huge_page_enabled();
>  	bool map_writable;
> -
> +	struct kvm_memory_slot *slot;
>  	gfn_t gfn = gpa >> PAGE_SHIFT;
>  	unsigned long mmu_seq;
>  	kvm_pfn_t pfn;
> @@ -4104,7 +4103,10 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
>  	mmu_seq = vcpu->kvm->mmu_notifier_seq;
>  	smp_rmb();
>  
> -	if (try_async_pf(vcpu, prefault, gfn, gpa, &pfn, write, &map_writable))
> +	slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
> +
> +	if (try_async_pf(vcpu, slot, prefault, gfn, gpa, &pfn, write,
> +			 &map_writable))
>  		return RET_PF_RETRY;
>  
>  	if (handle_abnormal_pfn(vcpu, is_tdp ? 0 : gpa, gfn, pfn, ACC_ALL, &r))
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index 0172a949f6a7..5c6a895f67c3 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -779,6 +779,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code,
>  	int write_fault = error_code & PFERR_WRITE_MASK;
>  	int user_fault = error_code & PFERR_USER_MASK;
>  	struct guest_walker walker;
> +	struct kvm_memory_slot *slot;
>  	int r;
>  	kvm_pfn_t pfn;
>  	unsigned long mmu_seq;
> @@ -833,8 +834,10 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code,
>  	mmu_seq = vcpu->kvm->mmu_notifier_seq;
>  	smp_rmb();
>  
> -	if (try_async_pf(vcpu, prefault, walker.gfn, addr, &pfn, write_fault,
> -			 &map_writable))
> +	slot = kvm_vcpu_gfn_to_memslot(vcpu, walker.gfn);
> +
> +	if (try_async_pf(vcpu, slot, prefault, walker.gfn, addr, &pfn,
> +			 write_fault, &map_writable))
>  		return RET_PF_RETRY;
>  
>  	if (handle_abnormal_pfn(vcpu, addr, walker.gfn, pfn, walker.pte_access, &r))
> -- 
> 2.25.4
> 


* Re: [PATCH v2 2/3] KVM: x86: introduce KVM_MEM_PCI_HOLE memory
  2020-08-07 14:12 ` [PATCH v2 2/3] KVM: x86: introduce KVM_MEM_PCI_HOLE memory Vitaly Kuznetsov
@ 2020-08-14  2:31   ` Sean Christopherson
  2020-08-14 14:30     ` Michael S. Tsirkin
  2020-09-01 14:39     ` Vitaly Kuznetsov
  0 siblings, 2 replies; 31+ messages in thread
From: Sean Christopherson @ 2020-08-14  2:31 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Peter Xu,
	Michael Tsirkin, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

On Fri, Aug 07, 2020 at 04:12:31PM +0200, Vitaly Kuznetsov wrote:
> PCIe config space can (depending on the configuration) be quite big but
> usually is sparsely populated. Guest may scan it by accessing individual
> device's page which, when device is missing, is supposed to have 'pci
> hole' semantics: reads return '0xff' and writes get discarded. Compared
> to the already existing KVM_MEM_READONLY, VMM doesn't need to allocate
> real memory and stuff it with '0xff'.
> 
> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  Documentation/virt/kvm/api.rst  | 18 ++++++++++-----
>  arch/x86/include/uapi/asm/kvm.h |  1 +
>  arch/x86/kvm/mmu/mmu.c          |  5 ++++-
>  arch/x86/kvm/mmu/paging_tmpl.h  |  3 +++
>  arch/x86/kvm/x86.c              | 10 ++++++---
>  include/linux/kvm_host.h        |  3 +++
>  include/uapi/linux/kvm.h        |  2 ++
>  virt/kvm/kvm_main.c             | 39 +++++++++++++++++++++++++++------
>  8 files changed, 64 insertions(+), 17 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 644e5326aa50..dc4172352635 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -1241,6 +1241,7 @@ yet and must be cleared on entry.
>    /* for kvm_memory_region::flags */
>    #define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
>    #define KVM_MEM_READONLY	(1UL << 1)
> +  #define KVM_MEM_PCI_HOLE		(1UL << 2)
>  
>  This ioctl allows the user to create, modify or delete a guest physical
>  memory slot.  Bits 0-15 of "slot" specify the slot id and this value
> @@ -1268,12 +1269,17 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
>  be identical.  This allows large pages in the guest to be backed by large
>  pages in the host.
>  
> -The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
> -KVM_MEM_READONLY.  The former can be set to instruct KVM to keep track of
> -writes to memory within the slot.  See KVM_GET_DIRTY_LOG ioctl to know how to
> -use it.  The latter can be set, if KVM_CAP_READONLY_MEM capability allows it,
> -to make a new slot read-only.  In this case, writes to this memory will be
> -posted to userspace as KVM_EXIT_MMIO exits.
> +The flags field supports the following flags: KVM_MEM_LOG_DIRTY_PAGES,
> +KVM_MEM_READONLY, KVM_MEM_PCI_HOLE:
> +- KVM_MEM_LOG_DIRTY_PAGES: log writes.  Use KVM_GET_DIRTY_LOG to retrieve
> +  the log.
> +- KVM_MEM_READONLY: exit to userspace with KVM_EXIT_MMIO on writes.  Only
> +  available when KVM_CAP_READONLY_MEM is present.
> +- KVM_MEM_PCI_HOLE: always return 0xff on reads, exit to userspace with
> +  KVM_EXIT_MMIO on writes.  Only available when KVM_CAP_PCI_HOLE_MEM is
> +  present.  When setting the memory region 'userspace_addr' must be NULL.
> +  This flag is mutually exclusive with KVM_MEM_LOG_DIRTY_PAGES and with
> +  KVM_MEM_READONLY.
>  
>  When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
>  the memory region are automatically reflected into the guest.  For example, an
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index 17c5a038f42d..cf80a26d74f5 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -48,6 +48,7 @@
>  #define __KVM_HAVE_XSAVE
>  #define __KVM_HAVE_XCRS
>  #define __KVM_HAVE_READONLY_MEM
> +#define __KVM_HAVE_PCI_HOLE_MEM
>  
>  /* Architectural interrupt line count. */
>  #define KVM_NR_INTERRUPTS 256
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index fef6956393f7..4a2a7fface1e 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3254,7 +3254,7 @@ static int kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, gfn_t gfn,
>  		return PG_LEVEL_4K;
>  
>  	slot = gfn_to_memslot_dirty_bitmap(vcpu, gfn, true);
> -	if (!slot)
> +	if (!slot || (slot->flags & KVM_MEM_PCI_HOLE))

This is unnecessary since you're setting disallow_lpage in
kvm_alloc_memslot_metadata().

>  		return PG_LEVEL_4K;
>  
>  	max_level = min(max_level, max_huge_page_level);
> @@ -4105,6 +4105,9 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
>  
>  	slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
>  
> +	if (!write && slot && (slot->flags & KVM_MEM_PCI_HOLE))

I'm confused.  Why does this short circuit reads but not writes?

> +		return RET_PF_EMULATE;
> +
>  	if (try_async_pf(vcpu, slot, prefault, gfn, gpa, &pfn, write,
>  			 &map_writable))
>  		return RET_PF_RETRY;
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index 5c6a895f67c3..27abd69e69f6 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -836,6 +836,9 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code,
>  
>  	slot = kvm_vcpu_gfn_to_memslot(vcpu, walker.gfn);
>  
> +	if (!write_fault && slot && (slot->flags & KVM_MEM_PCI_HOLE))
> +		return RET_PF_EMULATE;
> +
>  	if (try_async_pf(vcpu, slot, prefault, walker.gfn, addr, &pfn,
>  			 write_fault, &map_writable))
>  		return RET_PF_RETRY;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index dc4370394ab8..538bc58a22db 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3515,6 +3515,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_EXCEPTION_PAYLOAD:
>  	case KVM_CAP_SET_GUEST_DEBUG:
>  	case KVM_CAP_LAST_CPU:
> +	case KVM_CAP_PCI_HOLE_MEM:
>  		r = 1;
>  		break;
>  	case KVM_CAP_SYNC_REGS:
> @@ -10114,9 +10115,11 @@ static int kvm_alloc_memslot_metadata(struct kvm_memory_slot *slot,
>  		ugfn = slot->userspace_addr >> PAGE_SHIFT;
>  		/*
>  		 * If the gfn and userspace address are not aligned wrt each
> -		 * other, disable large page support for this slot.
> +		 * other, disable large page support for this slot. Also,
> +		 * disable large page support for KVM_MEM_PCI_HOLE slots.
>  		 */
> -		if ((slot->base_gfn ^ ugfn) & (KVM_PAGES_PER_HPAGE(level) - 1)) {
> +		if ((slot->flags & KVM_MEM_PCI_HOLE) || ((slot->base_gfn ^ ugfn) &
> +				      (KVM_PAGES_PER_HPAGE(level) - 1))) {
>  			unsigned long j;
>  
>  			for (j = 0; j < lpages; ++j)
> @@ -10178,7 +10181,8 @@ static void kvm_mmu_slot_apply_flags(struct kvm *kvm,
>  	 * Nothing to do for RO slots or CREATE/MOVE/DELETE of a slot.
>  	 * See comments below.
>  	 */
> -	if ((change != KVM_MR_FLAGS_ONLY) || (new->flags & KVM_MEM_READONLY))
> +	if ((change != KVM_MR_FLAGS_ONLY) ||
> +	    (new->flags & (KVM_MEM_READONLY | KVM_MEM_PCI_HOLE)))
>  		return;
>  
>  	/*
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 989afcbe642f..de1faa64a8ef 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -1081,6 +1081,9 @@ __gfn_to_memslot(struct kvm_memslots *slots, gfn_t gfn)
>  static inline unsigned long
>  __gfn_to_hva_memslot(struct kvm_memory_slot *slot, gfn_t gfn)
>  {
> +	/* Should never be called with a KVM_MEM_PCI_HOLE slot */
> +	BUG_ON(!slot->userspace_addr);

So _technically_, userspace can hit this by allowing virtual address 0,
which is very much non-standard, but theoretically legal.  It'd probably be
better to use a value that can't possibly be a valid userspace_addr, e.g. a
non-canonical value.

> +
>  	return slot->userspace_addr + (gfn - slot->base_gfn) * PAGE_SIZE;
>  }
>  

...

> @@ -2318,6 +2338,11 @@ static int __kvm_read_guest_page(struct kvm_memory_slot *slot, gfn_t gfn,
>  	int r;
>  	unsigned long addr;
>  
> +	if (unlikely(slot && (slot->flags & KVM_MEM_PCI_HOLE))) {
> +		memset(data, 0xff, len);
> +		return 0;
> +	}

This feels wrong, shouldn't we be treating PCI_HOLE as MMIO?  Given that
this is performance oriented, I would think we'd want to leverage the
GPA from the VMCS instead of doing a full translation.

That brings up a potential alternative to adding a memslot flag.  What if
we instead add a KVM_MMIO_BUS device similar to coalesced MMIO?  I think
it'd be about the same amount of KVM code, and it would provide userspace
with more flexibility, e.g. I assume it would allow handling even writes
wholly within the kernel for certain ranges and/or use cases, and it'd
allow stuffing a value other than 0xff (though I have no idea if there is
a use case for this).
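
E.g. a rough, untested sketch of such a device, reusing the existing
kvm_io_device plumbing from include/kvm/iodev.h; the pci_hole_* names and
the registration helper are made up, and the ioctl that would reach it is
omitted:

  #include <linux/kvm_host.h>
  #include <linux/slab.h>
  #include <linux/string.h>
  #include <kvm/iodev.h>

  struct pci_hole_dev {
          struct kvm_io_device dev;
  };

  /* Reads are answered in-kernel with all-ones. */
  static int pci_hole_read(struct kvm_vcpu *vcpu, struct kvm_io_device *dev,
                           gpa_t addr, int len, void *val)
  {
          memset(val, 0xff, len);
          return 0;
  }

  /* Let writes fall through so they reach userspace as KVM_EXIT_MMIO. */
  static int pci_hole_write(struct kvm_vcpu *vcpu, struct kvm_io_device *dev,
                            gpa_t addr, int len, const void *val)
  {
          return -EOPNOTSUPP;
  }

  static const struct kvm_io_device_ops pci_hole_ops = {
          .read  = pci_hole_read,
          .write = pci_hole_write,
  };

  /* Hypothetical registration path, e.g. called from a new VM ioctl. */
  static int kvm_register_pci_hole(struct kvm *kvm, gpa_t addr, int len)
  {
          struct pci_hole_dev *hole = kzalloc(sizeof(*hole), GFP_KERNEL);
          int ret;

          if (!hole)
                  return -ENOMEM;

          kvm_iodevice_init(&hole->dev, &pci_hole_ops);

          mutex_lock(&kvm->slots_lock);
          ret = kvm_io_bus_register_dev(kvm, KVM_MMIO_BUS, addr, len, &hole->dev);
          mutex_unlock(&kvm->slots_lock);
          if (ret)
                  kfree(hole);

          return ret;
  }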

Speaking of which, why do writes go to userspace in this series?

> +
>  	addr = gfn_to_hva_memslot_prot(slot, gfn, NULL);
>  	if (kvm_is_error_hva(addr))
>  		return -EFAULT;
> -- 
> 2.25.4
> 


* Re: [PATCH v2 2/3] KVM: x86: introduce KVM_MEM_PCI_HOLE memory
  2020-08-14  2:31   ` Sean Christopherson
@ 2020-08-14 14:30     ` Michael S. Tsirkin
  2020-08-17 16:32       ` Sean Christopherson
  2020-09-01 14:39     ` Vitaly Kuznetsov
  1 sibling, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2020-08-14 14:30 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson,
	Peter Xu, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

On Thu, Aug 13, 2020 at 07:31:39PM -0700, Sean Christopherson wrote:
> On Fri, Aug 07, 2020 at 04:12:31PM +0200, Vitaly Kuznetsov wrote:
> > PCIe config space can (depending on the configuration) be quite big but
> > usually is sparsely populated. Guest may scan it by accessing individual
> > device's page which, when device is missing, is supposed to have 'pci
> > hole' semantics: reads return '0xff' and writes get discarded. Compared
> > to the already existing KVM_MEM_READONLY, VMM doesn't need to allocate
> > real memory and stuff it with '0xff'.
> > 
> > Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> > Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > ---
> >  Documentation/virt/kvm/api.rst  | 18 ++++++++++-----
> >  arch/x86/include/uapi/asm/kvm.h |  1 +
> >  arch/x86/kvm/mmu/mmu.c          |  5 ++++-
> >  arch/x86/kvm/mmu/paging_tmpl.h  |  3 +++
> >  arch/x86/kvm/x86.c              | 10 ++++++---
> >  include/linux/kvm_host.h        |  3 +++
> >  include/uapi/linux/kvm.h        |  2 ++
> >  virt/kvm/kvm_main.c             | 39 +++++++++++++++++++++++++++------
> >  8 files changed, 64 insertions(+), 17 deletions(-)
> > 
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index 644e5326aa50..dc4172352635 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -1241,6 +1241,7 @@ yet and must be cleared on entry.
> >    /* for kvm_memory_region::flags */
> >    #define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
> >    #define KVM_MEM_READONLY	(1UL << 1)
> > +  #define KVM_MEM_PCI_HOLE		(1UL << 2)
> >  
> >  This ioctl allows the user to create, modify or delete a guest physical
> >  memory slot.  Bits 0-15 of "slot" specify the slot id and this value
> > @@ -1268,12 +1269,17 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
> >  be identical.  This allows large pages in the guest to be backed by large
> >  pages in the host.
> >  
> > -The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
> > -KVM_MEM_READONLY.  The former can be set to instruct KVM to keep track of
> > -writes to memory within the slot.  See KVM_GET_DIRTY_LOG ioctl to know how to
> > -use it.  The latter can be set, if KVM_CAP_READONLY_MEM capability allows it,
> > -to make a new slot read-only.  In this case, writes to this memory will be
> > -posted to userspace as KVM_EXIT_MMIO exits.
> > +The flags field supports the following flags: KVM_MEM_LOG_DIRTY_PAGES,
> > +KVM_MEM_READONLY, KVM_MEM_PCI_HOLE:
> > +- KVM_MEM_LOG_DIRTY_PAGES: log writes.  Use KVM_GET_DIRTY_LOG to retrieve
> > +  the log.
> > +- KVM_MEM_READONLY: exit to userspace with KVM_EXIT_MMIO on writes.  Only
> > +  available when KVM_CAP_READONLY_MEM is present.
> > +- KVM_MEM_PCI_HOLE: always return 0xff on reads, exit to userspace with
> > +  KVM_EXIT_MMIO on writes.  Only available when KVM_CAP_PCI_HOLE_MEM is
> > +  present.  When setting the memory region 'userspace_addr' must be NULL.
> > +  This flag is mutually exclusive with KVM_MEM_LOG_DIRTY_PAGES and with
> > +  KVM_MEM_READONLY.
> >  
> >  When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
> >  the memory region are automatically reflected into the guest.  For example, an
> > diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> > index 17c5a038f42d..cf80a26d74f5 100644
> > --- a/arch/x86/include/uapi/asm/kvm.h
> > +++ b/arch/x86/include/uapi/asm/kvm.h
> > @@ -48,6 +48,7 @@
> >  #define __KVM_HAVE_XSAVE
> >  #define __KVM_HAVE_XCRS
> >  #define __KVM_HAVE_READONLY_MEM
> > +#define __KVM_HAVE_PCI_HOLE_MEM
> >  
> >  /* Architectural interrupt line count. */
> >  #define KVM_NR_INTERRUPTS 256
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index fef6956393f7..4a2a7fface1e 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -3254,7 +3254,7 @@ static int kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, gfn_t gfn,
> >  		return PG_LEVEL_4K;
> >  
> >  	slot = gfn_to_memslot_dirty_bitmap(vcpu, gfn, true);
> > -	if (!slot)
> > +	if (!slot || (slot->flags & KVM_MEM_PCI_HOLE))
> 
> This is unnecessary since you're setting disallow_lpage in
> kvm_alloc_memslot_metadata().
> 
> >  		return PG_LEVEL_4K;
> >  
> >  	max_level = min(max_level, max_huge_page_level);
> > @@ -4105,6 +4105,9 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
> >  
> >  	slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
> >  
> > +	if (!write && slot && (slot->flags & KVM_MEM_PCI_HOLE))
> 
> I'm confused.  Why does this short circuit reads but not writes?
> 
> > +		return RET_PF_EMULATE;
> > +
> >  	if (try_async_pf(vcpu, slot, prefault, gfn, gpa, &pfn, write,
> >  			 &map_writable))
> >  		return RET_PF_RETRY;
> > diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> > index 5c6a895f67c3..27abd69e69f6 100644
> > --- a/arch/x86/kvm/mmu/paging_tmpl.h
> > +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> > @@ -836,6 +836,9 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code,
> >  
> >  	slot = kvm_vcpu_gfn_to_memslot(vcpu, walker.gfn);
> >  
> > +	if (!write_fault && slot && (slot->flags & KVM_MEM_PCI_HOLE))
> > +		return RET_PF_EMULATE;
> > +
> >  	if (try_async_pf(vcpu, slot, prefault, walker.gfn, addr, &pfn,
> >  			 write_fault, &map_writable))
> >  		return RET_PF_RETRY;
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index dc4370394ab8..538bc58a22db 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -3515,6 +3515,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> >  	case KVM_CAP_EXCEPTION_PAYLOAD:
> >  	case KVM_CAP_SET_GUEST_DEBUG:
> >  	case KVM_CAP_LAST_CPU:
> > +	case KVM_CAP_PCI_HOLE_MEM:
> >  		r = 1;
> >  		break;
> >  	case KVM_CAP_SYNC_REGS:
> > @@ -10114,9 +10115,11 @@ static int kvm_alloc_memslot_metadata(struct kvm_memory_slot *slot,
> >  		ugfn = slot->userspace_addr >> PAGE_SHIFT;
> >  		/*
> >  		 * If the gfn and userspace address are not aligned wrt each
> > -		 * other, disable large page support for this slot.
> > +		 * other, disable large page support for this slot. Also,
> > +		 * disable large page support for KVM_MEM_PCI_HOLE slots.
> >  		 */
> > -		if ((slot->base_gfn ^ ugfn) & (KVM_PAGES_PER_HPAGE(level) - 1)) {
> > +		if ((slot->flags & KVM_MEM_PCI_HOLE) || ((slot->base_gfn ^ ugfn) &
> > +				      (KVM_PAGES_PER_HPAGE(level) - 1))) {
> >  			unsigned long j;
> >  
> >  			for (j = 0; j < lpages; ++j)
> > @@ -10178,7 +10181,8 @@ static void kvm_mmu_slot_apply_flags(struct kvm *kvm,
> >  	 * Nothing to do for RO slots or CREATE/MOVE/DELETE of a slot.
> >  	 * See comments below.
> >  	 */
> > -	if ((change != KVM_MR_FLAGS_ONLY) || (new->flags & KVM_MEM_READONLY))
> > +	if ((change != KVM_MR_FLAGS_ONLY) ||
> > +	    (new->flags & (KVM_MEM_READONLY | KVM_MEM_PCI_HOLE)))
> >  		return;
> >  
> >  	/*
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index 989afcbe642f..de1faa64a8ef 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -1081,6 +1081,9 @@ __gfn_to_memslot(struct kvm_memslots *slots, gfn_t gfn)
> >  static inline unsigned long
> >  __gfn_to_hva_memslot(struct kvm_memory_slot *slot, gfn_t gfn)
> >  {
> > +	/* Should never be called with a KVM_MEM_PCI_HOLE slot */
> > +	BUG_ON(!slot->userspace_addr);
> 
> So _technically_, userspace can hit this by allowing virtual address 0,
> which is very much non-standard, but theoretically legal.  It'd probably be
> better to use a value that can't possibly be a valid userspace_addr, e.g. a
> non-canonical value.
> 
> > +
> >  	return slot->userspace_addr + (gfn - slot->base_gfn) * PAGE_SIZE;
> >  }
> >  
> 
> ...
> 
> > @@ -2318,6 +2338,11 @@ static int __kvm_read_guest_page(struct kvm_memory_slot *slot, gfn_t gfn,
> >  	int r;
> >  	unsigned long addr;
> >  
> > +	if (unlikely(slot && (slot->flags & KVM_MEM_PCI_HOLE))) {
> > +		memset(data, 0xff, len);
> > +		return 0;
> > +	}
> 
> This feels wrong, shouldn't we be treating PCI_HOLE as MMIO?  Given that
> this is performance oriented, I would think we'd want to leverage the
> GPA from the VMCS instead of doing a full translation.
> 
> That brings up a potential alternative to adding a memslot flag.  What if
> we instead add a KVM_MMIO_BUS device similar to coalesced MMIO?  I think
> it'd be about the same amount of KVM code, and it would provide userspace
> with more flexibility, e.g. I assume it would allow handling even writes
> wholly within the kernel for certain ranges and/or use cases, and it'd
> allow stuffing a value other than 0xff (though I have no idea if there is
> a use case for this).

I still think down the road the way to go is to map
valid RO page full of 0xff to avoid exit on read.
I don't think a KVM_MMIO_BUS device will allow this, will it?


> Speaking of which, why do writes go to userspace in this series?
> 
> > +
> >  	addr = gfn_to_hva_memslot_prot(slot, gfn, NULL);
> >  	if (kvm_is_error_hva(addr))
> >  		return -EFAULT;
> > -- 
> > 2.25.4
> > 



* Re: [PATCH v2 2/3] KVM: x86: introduce KVM_MEM_PCI_HOLE memory
  2020-08-14 14:30     ` Michael S. Tsirkin
@ 2020-08-17 16:32       ` Sean Christopherson
  2020-08-21  1:46         ` Michael S. Tsirkin
  0 siblings, 1 reply; 31+ messages in thread
From: Sean Christopherson @ 2020-08-17 16:32 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Vitaly Kuznetsov, kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson,
	Peter Xu, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

On Fri, Aug 14, 2020 at 10:30:14AM -0400, Michael S. Tsirkin wrote:
> On Thu, Aug 13, 2020 at 07:31:39PM -0700, Sean Christopherson wrote:
> > > @@ -2318,6 +2338,11 @@ static int __kvm_read_guest_page(struct kvm_memory_slot *slot, gfn_t gfn,
> > >  	int r;
> > >  	unsigned long addr;
> > >  
> > > +	if (unlikely(slot && (slot->flags & KVM_MEM_PCI_HOLE))) {
> > > +		memset(data, 0xff, len);
> > > +		return 0;
> > > +	}
> > 
> > This feels wrong, shouldn't we be treating PCI_HOLE as MMIO?  Given that
> > this is performance oriented, I would think we'd want to leverage the
> > GPA from the VMCS instead of doing a full translation.
> > 
> > That brings up a potential alternative to adding a memslot flag.  What if
> > we instead add a KVM_MMIO_BUS device similar to coalesced MMIO?  I think
> > it'd be about the same amount of KVM code, and it would provide userspace
> > with more flexibility, e.g. I assume it would allow handling even writes
> > wholly within the kernel for certain ranges and/or use cases, and it'd
> > allow stuffing a value other than 0xff (though I have no idea if there is
> > a use case for this).
> 
> I still think down the road the way to go is to map
> valid RO page full of 0xff to avoid exit on read.
> I don't think a KVM_MMIO_BUS device will allow this, will it?

No, it would not, but adding KVM_MEM_PCI_HOLE doesn't get us any closer to
solving that problem either.

What if we add a flag to allow routing all GFNs in a memslot to a single
HVA?  At a glance, it doesn't seem to heinous.  It would have several of the
same touchpoints as this series, e.g. __kvm_set_memory_region() and
kvm_alloc_memslot_metadata().

The functional changes (for x86) would be a few lines in
__gfn_to_hva_memslot() and some new logic in kvm_handle_hva_range().  The
biggest concern is probably the fragility of such an implementation, as KVM
has a habit of open coding operations on memslots.

The new flags could then be paired with KVM_MEM_READONLY to yield the desired
behavior of reading out 0xff for an arbitrary range without requiring copious
memslots and/or host pages.

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 852fc8274bdd..875243a0ab36 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1103,6 +1103,9 @@ __gfn_to_memslot(struct kvm_memslots *slots, gfn_t gfn)
 static inline unsigned long
 __gfn_to_hva_memslot(struct kvm_memory_slot *slot, gfn_t gfn)
 {
+       if (unlikely(slot->flags & KVM_MEM_SINGLE_HVA))
+               return slot->userspace_addr;
+
        return slot->userspace_addr + (gfn - slot->base_gfn) * PAGE_SIZE;
 }



* Re: [PATCH v2 2/3] KVM: x86: introduce KVM_MEM_PCI_HOLE memory
  2020-08-17 16:32       ` Sean Christopherson
@ 2020-08-21  1:46         ` Michael S. Tsirkin
  2020-08-22  3:19           ` Sean Christopherson
  0 siblings, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2020-08-21  1:46 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson,
	Peter Xu, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

On Mon, Aug 17, 2020 at 09:32:07AM -0700, Sean Christopherson wrote:
> On Fri, Aug 14, 2020 at 10:30:14AM -0400, Michael S. Tsirkin wrote:
> > On Thu, Aug 13, 2020 at 07:31:39PM -0700, Sean Christopherson wrote:
> > > > @@ -2318,6 +2338,11 @@ static int __kvm_read_guest_page(struct kvm_memory_slot *slot, gfn_t gfn,
> > > >  	int r;
> > > >  	unsigned long addr;
> > > >  
> > > > +	if (unlikely(slot && (slot->flags & KVM_MEM_PCI_HOLE))) {
> > > > +		memset(data, 0xff, len);
> > > > +		return 0;
> > > > +	}
> > > 
> > > This feels wrong, shouldn't we be treating PCI_HOLE as MMIO?  Given that
> > > this is performance oriented, I would think we'd want to leverage the
> > > GPA from the VMCS instead of doing a full translation.
> > > 
> > > That brings up a potential alternative to adding a memslot flag.  What if
> > > we instead add a KVM_MMIO_BUS device similar to coalesced MMIO?  I think
> > > it'd be about the same amount of KVM code, and it would provide userspace
> > > with more flexibility, e.g. I assume it would allow handling even writes
> > > wholly within the kernel for certain ranges and/or use cases, and it'd
> > > allow stuffing a value other than 0xff (though I have no idea if there is
> > > a use case for this).
> > 
> > I still think down the road the way to go is to map
> > valid RO page full of 0xff to avoid exit on read.
> > I don't think a KVM_MMIO_BUS device will allow this, will it?
> 
> No, it would not, but adding KVM_MEM_PCI_HOLE doesn't get us any closer to
> solving that problem either.

I'm not sure why. Care to elaborate?

> What if we add a flag to allow routing all GFNs in a memslot to a single
> HVA?

An issue here would be this breaks attempts to use a hugepage for this.


>  At a glance, it doesn't seem to heinous.  It would have several of the
> same touchpoints as this series, e.g. __kvm_set_memory_region() and
> kvm_alloc_memslot_metadata().
> 
> The functional changes (for x86) would be a few lines in
> __gfn_to_hva_memslot() and some new logic in kvm_handle_hva_range().  The
> biggest concern is probably the fragility of such an implementation, as KVM
> has a habit of open coding operations on memslots.
> 
> The new flags could then be paired with KVM_MEM_READONLY to yield the desired
> behavior of reading out 0xff for an arbitrary range without requiring copious
> memslots and/or host pages.
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 852fc8274bdd..875243a0ab36 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -1103,6 +1103,9 @@ __gfn_to_memslot(struct kvm_memslots *slots, gfn_t gfn)
>  static inline unsigned long
>  __gfn_to_hva_memslot(struct kvm_memory_slot *slot, gfn_t gfn)
>  {
> +       if (unlikely(slot->flags & KVM_MEM_SINGLE_HVA))
> +               return slot->userspace_addr;
> +
>         return slot->userspace_addr + (gfn - slot->base_gfn) * PAGE_SIZE;
>  }



* Re: [PATCH v2 2/3] KVM: x86: introduce KVM_MEM_PCI_HOLE memory
  2020-08-21  1:46         ` Michael S. Tsirkin
@ 2020-08-22  3:19           ` Sean Christopherson
  0 siblings, 0 replies; 31+ messages in thread
From: Sean Christopherson @ 2020-08-22  3:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Vitaly Kuznetsov, kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson,
	Peter Xu, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

On Thu, Aug 20, 2020 at 09:46:25PM -0400, Michael S. Tsirkin wrote:
> On Mon, Aug 17, 2020 at 09:32:07AM -0700, Sean Christopherson wrote:
> > On Fri, Aug 14, 2020 at 10:30:14AM -0400, Michael S. Tsirkin wrote:
> > > On Thu, Aug 13, 2020 at 07:31:39PM -0700, Sean Christopherson wrote:
> > > > > @@ -2318,6 +2338,11 @@ static int __kvm_read_guest_page(struct kvm_memory_slot *slot, gfn_t gfn,
> > > > >  	int r;
> > > > >  	unsigned long addr;
> > > > >  
> > > > > +	if (unlikely(slot && (slot->flags & KVM_MEM_PCI_HOLE))) {
> > > > > +		memset(data, 0xff, len);
> > > > > +		return 0;
> > > > > +	}
> > > > 
> > > > This feels wrong, shouldn't we be treating PCI_HOLE as MMIO?  Given that
> > > > this is performance oriented, I would think we'd want to leverage the
> > > > GPA from the VMCS instead of doing a full translation.
> > > > 
> > > > That brings up a potential alternative to adding a memslot flag.  What if
> > > > we instead add a KVM_MMIO_BUS device similar to coalesced MMIO?  I think
> > > > it'd be about the same amount of KVM code, and it would provide userspace
> > > > with more flexibility, e.g. I assume it would allow handling even writes
> > > > wholly within the kernel for certain ranges and/or use cases, and it'd
> > > > allow stuffing a value other than 0xff (though I have no idea if there is
> > > > a use case for this).
> > > 
> > > I still think down the road the way to go is to map
> > > valid RO page full of 0xff to avoid exit on read.
> > > I don't think a KVM_MMIO_BUS device will allow this, will it?
> > 
> > No, it would not, but adding KVM_MEM_PCI_HOLE doesn't get us any closer to
> > solving that problem either.
> 
> I'm not sure why. Care to elaborate?

The bulk of the code in this series would get thrown away if KVM_MEM_PCI_HOLE
were reworked to be backed by a physical page.  If we really want a physical
page, then let's use a physical page from the get-go.

I realize I suggested the specialized MMIO idea, but that's when I thought the
primary motivation was memory, not performance.

> > What if we add a flag to allow routing all GFNs in a memslot to a single
> > HVA?
> 
> An issue here would be this breaks attempts to use a hugepage for this.

What are the performance numbers of hugepage vs. aggressively prefetching
SPTEs?  Note, the unbounded prefetching from the original RFC won't fly,
but prefetching 2mb ranges might be reasonable.

Reraising an earlier unanswered question, is enlightening the guest an
option for this use case?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
  2020-08-07 14:12 [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory Vitaly Kuznetsov
                   ` (2 preceding siblings ...)
  2020-08-07 14:12 ` [PATCH v2 3/3] KVM: selftests: add KVM_MEM_PCI_HOLE test Vitaly Kuznetsov
@ 2020-08-25 21:25 ` Peter Xu
  2020-09-01 14:43   ` Vitaly Kuznetsov
  2020-09-04  7:20   ` Gerd Hoffmann
  3 siblings, 2 replies; 31+ messages in thread
From: Peter Xu @ 2020-08-25 21:25 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Sean Christopherson, Wanpeng Li, Jim Mattson,
	Michael Tsirkin, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

On Fri, Aug 07, 2020 at 04:12:29PM +0200, Vitaly Kuznetsov wrote:
> When testing Linux kernel boot with QEMU q35 VM and direct kernel boot
> I observed 8193 accesses to PCI hole memory. When such exit is handled
> in KVM without exiting to userspace, it takes roughly 0.000001 sec.
> Handling the same exit in userspace is six times slower (0.000006 sec) so
the overall difference is 0.04 sec. This may be significant for 'microvm'
> ideas.

Sorry to comment so late, but just curious... have you looked at what those
8000+ accesses to PCI holes are and what they're used for?  What I can think
of are some port IO reads (e.g. of the vendor ID field) while the BIOS scans
the attached devices.  Though those should be far fewer than 8000+, and they
should also be pio rather than mmio.

If this is only an overhead for virt (since baremetal mmios should be fast),
I'm also wondering whether we can do even better and skip those pci hole
reads altogether.  Since we know we're virt, that also gives us the
possibility of providing that information in a better way than having the
guest read PCI holes?

Thanks,

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 1/3] KVM: x86: move kvm_vcpu_gfn_to_memslot() out of try_async_pf()
  2020-08-14  1:40   ` Sean Christopherson
@ 2020-09-01 14:15     ` Vitaly Kuznetsov
  2020-09-04  3:47       ` Sean Christopherson
  0 siblings, 1 reply; 31+ messages in thread
From: Vitaly Kuznetsov @ 2020-09-01 14:15 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Peter Xu,
	Michael Tsirkin, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

Sean Christopherson <sean.j.christopherson@intel.com> writes:

> On Fri, Aug 07, 2020 at 04:12:30PM +0200, Vitaly Kuznetsov wrote:
>> No functional change intended. Slot flags will need to be analyzed
>> prior to try_async_pf() when KVM_MEM_PCI_HOLE is implemented.
>

(Sorry it took me so long to reply. No, I wasn't hoping for Paolo's
magical "queued, thanks", I just tried to not read my email while on
vacation).

> Why?  Wouldn't it be just as easy, and arguably more appropriate, to add
> KVM_PFN_ERR_PCI_HOLE and update handle_abnormal_pfn() accordingly?
>

Yes, we can do that, but what I don't quite like here is that
try_async_pf() does much more than 'trying async PF'. In particular, it
extracts 'pfn' and this is far from being obvious. Maybe we can rename
try_async_pf() somewhat smartly (e.g. 'try_handle_pf()')? Your
suggestion will make perfect sense to me then.
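
For concreteness, a rough sketch of what that combination could look like;
KVM_PFN_ERR_PCI_HOLE doesn't exist yet, the signature just mirrors the call
sites quoted below, and the existing handle_abnormal_pfn() body is elided, so
treat this as an illustration only:

static bool handle_abnormal_pfn(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn,
				kvm_pfn_t pfn, unsigned int access,
				int *ret_val)
{
	if (unlikely(pfn == KVM_PFN_ERR_PCI_HOLE)) {
		/*
		 * Hand the access to the emulator, which then reads back
		 * all-ones via the __kvm_read_guest_page() hunk in patch 2.
		 */
		*ret_val = RET_PF_EMULATE;
		return true;
	}

	/* ... existing error-pfn and MMIO-caching handling ... */
	return false;
}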

>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  arch/x86/kvm/mmu/mmu.c         | 14 ++++++++------
>>  arch/x86/kvm/mmu/paging_tmpl.h |  7 +++++--
>>  2 files changed, 13 insertions(+), 8 deletions(-)
>> 
>> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
>> index 862bf418214e..fef6956393f7 100644
>> --- a/arch/x86/kvm/mmu/mmu.c
>> +++ b/arch/x86/kvm/mmu/mmu.c
>> @@ -4042,11 +4042,10 @@ static bool kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
>>  				  kvm_vcpu_gfn_to_hva(vcpu, gfn), &arch);
>>  }
>>  
>> -static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn,
>> -			 gpa_t cr2_or_gpa, kvm_pfn_t *pfn, bool write,
>> -			 bool *writable)
>> +static bool try_async_pf(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
>> +			 bool prefault, gfn_t gfn, gpa_t cr2_or_gpa,
>> +			 kvm_pfn_t *pfn, bool write, bool *writable)
>>  {
>> -	struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
>>  	bool async;
>>  
>>  	/* Don't expose private memslots to L2. */
>> @@ -4082,7 +4081,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
>>  	bool exec = error_code & PFERR_FETCH_MASK;
>>  	bool lpage_disallowed = exec && is_nx_huge_page_enabled();
>>  	bool map_writable;
>> -
>> +	struct kvm_memory_slot *slot;
>>  	gfn_t gfn = gpa >> PAGE_SHIFT;
>>  	unsigned long mmu_seq;
>>  	kvm_pfn_t pfn;
>> @@ -4104,7 +4103,10 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
>>  	mmu_seq = vcpu->kvm->mmu_notifier_seq;
>>  	smp_rmb();
>>  
>> -	if (try_async_pf(vcpu, prefault, gfn, gpa, &pfn, write, &map_writable))
>> +	slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
>> +
>> +	if (try_async_pf(vcpu, slot, prefault, gfn, gpa, &pfn, write,
>> +			 &map_writable))
>>  		return RET_PF_RETRY;
>>  
>>  	if (handle_abnormal_pfn(vcpu, is_tdp ? 0 : gpa, gfn, pfn, ACC_ALL, &r))
>> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
>> index 0172a949f6a7..5c6a895f67c3 100644
>> --- a/arch/x86/kvm/mmu/paging_tmpl.h
>> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
>> @@ -779,6 +779,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code,
>>  	int write_fault = error_code & PFERR_WRITE_MASK;
>>  	int user_fault = error_code & PFERR_USER_MASK;
>>  	struct guest_walker walker;
>> +	struct kvm_memory_slot *slot;
>>  	int r;
>>  	kvm_pfn_t pfn;
>>  	unsigned long mmu_seq;
>> @@ -833,8 +834,10 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code,
>>  	mmu_seq = vcpu->kvm->mmu_notifier_seq;
>>  	smp_rmb();
>>  
>> -	if (try_async_pf(vcpu, prefault, walker.gfn, addr, &pfn, write_fault,
>> -			 &map_writable))
>> +	slot = kvm_vcpu_gfn_to_memslot(vcpu, walker.gfn);
>> +
>> +	if (try_async_pf(vcpu, slot, prefault, walker.gfn, addr, &pfn,
>> +			 write_fault, &map_writable))
>>  		return RET_PF_RETRY;
>>  
>>  	if (handle_abnormal_pfn(vcpu, addr, walker.gfn, pfn, walker.pte_access, &r))
>> -- 
>> 2.25.4
>> 
>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/3] KVM: x86: introduce KVM_MEM_PCI_HOLE memory
  2020-08-14  2:31   ` Sean Christopherson
  2020-08-14 14:30     ` Michael S. Tsirkin
@ 2020-09-01 14:39     ` Vitaly Kuznetsov
  2020-09-03  9:35       ` Michael S. Tsirkin
  1 sibling, 1 reply; 31+ messages in thread
From: Vitaly Kuznetsov @ 2020-09-01 14:39 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Peter Xu,
	Michael Tsirkin, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

Sean Christopherson <sean.j.christopherson@intel.com> writes:

> On Fri, Aug 07, 2020 at 04:12:31PM +0200, Vitaly Kuznetsov wrote:
>> PCIe config space can (depending on the configuration) be quite big but
>> usually is sparsely populated. Guest may scan it by accessing individual
>> device's page which, when device is missing, is supposed to have 'pci
>> hole' semantics: reads return '0xff' and writes get discarded. Compared
>> to the already existing KVM_MEM_READONLY, VMM doesn't need to allocate
>> real memory and stuff it with '0xff'.
>> 
>> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  Documentation/virt/kvm/api.rst  | 18 ++++++++++-----
>>  arch/x86/include/uapi/asm/kvm.h |  1 +
>>  arch/x86/kvm/mmu/mmu.c          |  5 ++++-
>>  arch/x86/kvm/mmu/paging_tmpl.h  |  3 +++
>>  arch/x86/kvm/x86.c              | 10 ++++++---
>>  include/linux/kvm_host.h        |  3 +++
>>  include/uapi/linux/kvm.h        |  2 ++
>>  virt/kvm/kvm_main.c             | 39 +++++++++++++++++++++++++++------
>>  8 files changed, 64 insertions(+), 17 deletions(-)
>> 
>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>> index 644e5326aa50..dc4172352635 100644
>> --- a/Documentation/virt/kvm/api.rst
>> +++ b/Documentation/virt/kvm/api.rst
>> @@ -1241,6 +1241,7 @@ yet and must be cleared on entry.
>>    /* for kvm_memory_region::flags */
>>    #define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
>>    #define KVM_MEM_READONLY	(1UL << 1)
>> +  #define KVM_MEM_PCI_HOLE		(1UL << 2)
>>  
>>  This ioctl allows the user to create, modify or delete a guest physical
>>  memory slot.  Bits 0-15 of "slot" specify the slot id and this value
>> @@ -1268,12 +1269,17 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
>>  be identical.  This allows large pages in the guest to be backed by large
>>  pages in the host.
>>  
>> -The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
>> -KVM_MEM_READONLY.  The former can be set to instruct KVM to keep track of
>> -writes to memory within the slot.  See KVM_GET_DIRTY_LOG ioctl to know how to
>> -use it.  The latter can be set, if KVM_CAP_READONLY_MEM capability allows it,
>> -to make a new slot read-only.  In this case, writes to this memory will be
>> -posted to userspace as KVM_EXIT_MMIO exits.
>> +The flags field supports the following flags: KVM_MEM_LOG_DIRTY_PAGES,
>> +KVM_MEM_READONLY, KVM_MEM_PCI_HOLE:
>> +- KVM_MEM_LOG_DIRTY_PAGES: log writes.  Use KVM_GET_DIRTY_LOG to retrieve
>> +  the log.
>> +- KVM_MEM_READONLY: exit to userspace with KVM_EXIT_MMIO on writes.  Only
>> +  available when KVM_CAP_READONLY_MEM is present.
>> +- KVM_MEM_PCI_HOLE: always return 0xff on reads, exit to userspace with
>> +  KVM_EXIT_MMIO on writes.  Only available when KVM_CAP_PCI_HOLE_MEM is
>> +  present.  When setting the memory region 'userspace_addr' must be NULL.
>> +  This flag is mutually exclusive with KVM_MEM_LOG_DIRTY_PAGES and with
>> +  KVM_MEM_READONLY.
>>  
>>  When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
>>  the memory region are automatically reflected into the guest.  For example, an
>> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
>> index 17c5a038f42d..cf80a26d74f5 100644
>> --- a/arch/x86/include/uapi/asm/kvm.h
>> +++ b/arch/x86/include/uapi/asm/kvm.h
>> @@ -48,6 +48,7 @@
>>  #define __KVM_HAVE_XSAVE
>>  #define __KVM_HAVE_XCRS
>>  #define __KVM_HAVE_READONLY_MEM
>> +#define __KVM_HAVE_PCI_HOLE_MEM
>>  
>>  /* Architectural interrupt line count. */
>>  #define KVM_NR_INTERRUPTS 256
>> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
>> index fef6956393f7..4a2a7fface1e 100644
>> --- a/arch/x86/kvm/mmu/mmu.c
>> +++ b/arch/x86/kvm/mmu/mmu.c
>> @@ -3254,7 +3254,7 @@ static int kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, gfn_t gfn,
>>  		return PG_LEVEL_4K;
>>  
>>  	slot = gfn_to_memslot_dirty_bitmap(vcpu, gfn, true);
>> -	if (!slot)
>> +	if (!slot || (slot->flags & KVM_MEM_PCI_HOLE))
>
> This is unnecessary since you're setting disallow_lpage in
> kvm_alloc_memslot_metadata().
>

Yea, redundant precaution, can be dropped.

>>  		return PG_LEVEL_4K;
>>  
>>  	max_level = min(max_level, max_huge_page_level);
>> @@ -4105,6 +4105,9 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
>>  
>>  	slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
>>  
>> +	if (!write && slot && (slot->flags & KVM_MEM_PCI_HOLE))
>
> I'm confused.  Why does this short circuit reads but not writes?
>

The idea was that guests shouldn't normally write to these regions and
we may want to catch them if they do. We can short circuit writes too by
simply ignoring them.

>> +		return RET_PF_EMULATE;
>> +
>>  	if (try_async_pf(vcpu, slot, prefault, gfn, gpa, &pfn, write,
>>  			 &map_writable))
>>  		return RET_PF_RETRY;
>> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
>> index 5c6a895f67c3..27abd69e69f6 100644
>> --- a/arch/x86/kvm/mmu/paging_tmpl.h
>> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
>> @@ -836,6 +836,9 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code,
>>  
>>  	slot = kvm_vcpu_gfn_to_memslot(vcpu, walker.gfn);
>>  
>> +	if (!write_fault && slot && (slot->flags & KVM_MEM_PCI_HOLE))
>> +		return RET_PF_EMULATE;
>> +
>>  	if (try_async_pf(vcpu, slot, prefault, walker.gfn, addr, &pfn,
>>  			 write_fault, &map_writable))
>>  		return RET_PF_RETRY;
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index dc4370394ab8..538bc58a22db 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -3515,6 +3515,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>>  	case KVM_CAP_EXCEPTION_PAYLOAD:
>>  	case KVM_CAP_SET_GUEST_DEBUG:
>>  	case KVM_CAP_LAST_CPU:
>> +	case KVM_CAP_PCI_HOLE_MEM:
>>  		r = 1;
>>  		break;
>>  	case KVM_CAP_SYNC_REGS:
>> @@ -10114,9 +10115,11 @@ static int kvm_alloc_memslot_metadata(struct kvm_memory_slot *slot,
>>  		ugfn = slot->userspace_addr >> PAGE_SHIFT;
>>  		/*
>>  		 * If the gfn and userspace address are not aligned wrt each
>> -		 * other, disable large page support for this slot.
>> +		 * other, disable large page support for this slot. Also,
>> +		 * disable large page support for KVM_MEM_PCI_HOLE slots.
>>  		 */
>> -		if ((slot->base_gfn ^ ugfn) & (KVM_PAGES_PER_HPAGE(level) - 1)) {
>> +		if ((slot->flags & KVM_MEM_PCI_HOLE) || ((slot->base_gfn ^ ugfn) &
>> +				      (KVM_PAGES_PER_HPAGE(level) - 1))) {
>>  			unsigned long j;
>>  
>>  			for (j = 0; j < lpages; ++j)
>> @@ -10178,7 +10181,8 @@ static void kvm_mmu_slot_apply_flags(struct kvm *kvm,
>>  	 * Nothing to do for RO slots or CREATE/MOVE/DELETE of a slot.
>>  	 * See comments below.
>>  	 */
>> -	if ((change != KVM_MR_FLAGS_ONLY) || (new->flags & KVM_MEM_READONLY))
>> +	if ((change != KVM_MR_FLAGS_ONLY) ||
>> +	    (new->flags & (KVM_MEM_READONLY | KVM_MEM_PCI_HOLE)))
>>  		return;
>>  
>>  	/*
>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>> index 989afcbe642f..de1faa64a8ef 100644
>> --- a/include/linux/kvm_host.h
>> +++ b/include/linux/kvm_host.h
>> @@ -1081,6 +1081,9 @@ __gfn_to_memslot(struct kvm_memslots *slots, gfn_t gfn)
>>  static inline unsigned long
>>  __gfn_to_hva_memslot(struct kvm_memory_slot *slot, gfn_t gfn)
>>  {
>> +	/* Should never be called with a KVM_MEM_PCI_HOLE slot */
>> +	BUG_ON(!slot->userspace_addr);
>
> So _technically_, userspace can hit this by allowing virtual address 0,
> which is very much non-standard, but theoretically legal.  It'd probably be
> better to use a value that can't possibly be a valid userspace_addr, e.g. a
> non-canonical value.
>

I think I had a '!(slot->flags & KVM_MEM_PCI_HOLE)' check in a previous
version; we can restore it (if needed) or drop the thing completely.

>> +
>>  	return slot->userspace_addr + (gfn - slot->base_gfn) * PAGE_SIZE;
>>  }
>>  
>
> ...
>
>> @@ -2318,6 +2338,11 @@ static int __kvm_read_guest_page(struct kvm_memory_slot *slot, gfn_t gfn,
>>  	int r;
>>  	unsigned long addr;
>>  
>> +	if (unlikely(slot && (slot->flags & KVM_MEM_PCI_HOLE))) {
>> +		memset(data, 0xff, len);
>> +		return 0;
>> +	}
>
> This feels wrong, shouldn't we be treating PCI_HOLE as MMIO?  Given that
> this is performance oriented, I would think we'd want to leverage the
> GPA from the VMCS instead of doing a full translation.
>
> That brings up a potential alternative to adding a memslot flag.  What if
> we instead add a KVM_MMIO_BUS device similar to coalesced MMIO?  I think
> it'd be about the same amount of KVM code, and it would provide userspace
> with more flexibility, e.g. I assume it would allow handling even writes
> wholly within the kernel for certain ranges and/or use cases, and it'd
> allow stuffing a value other than 0xff (though I have no idea if there is
> a use case for this).

I was thinking about making this a bit more generic, like 'immutable'
memory with a userspace-supplied value, e.g. userspace would provide
a region (e.g. a single page which all pages of the slot would map to),
but then I failed to find a use-case for that. The PCI hole semantics
seem to be the only ones we actually need in real life.

Overall, making these PCI holes 'special' memory regions (slots) and
sticking to KVM_SET_USER_MEMORY_REGION feels natural to me. I also think
it would be much easier to consume from the QEMU side as we won't need to
use a 'special' API when things change (e.g. a device gets added and we
need to [un]punch a hole in the 'PCI hole' space).
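
For reference, consuming the flag from userspace is then just an ordinary
memslot update. A minimal sketch, assuming the flag proposed by this series
and made-up slot/address numbers:

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Register a 256 MiB chunk of MMCONFIG space as a 'PCI hole' slot. */
static int set_pci_hole_slot(int vm_fd)
{
	struct kvm_userspace_memory_region region = {
		.slot = 10,			/* example slot id */
		.flags = KVM_MEM_PCI_HOLE,
		.guest_phys_addr = 0xb0000000,	/* example MMCONFIG base */
		.memory_size = 256ULL << 20,
		.userspace_addr = 0,		/* must be NULL per the patch */
	};

	return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}

Punching a real device into the hole later is just more
KVM_SET_USER_MEMORY_REGION calls (delete/re-add the affected slots), no new
API involved.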

>
> Speaking of which, why do writes go to userspace in this series?
>

No particular reason, actually; if there is no need to catch such
(stray?) writes we can simply short-circuit them to a nop.
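
Something along these lines, mirroring the read-side hunk (sketch only, not
part of the posted series; the existing function body is elided):

static int __kvm_write_guest_page(struct kvm_memory_slot *memslot, gfn_t gfn,
				  const void *data, int offset, int len)
{
	/*
	 * Hypothetical: silently discard writes to a KVM_MEM_PCI_HOLE slot
	 * instead of exiting to userspace with KVM_EXIT_MMIO.
	 */
	if (unlikely(memslot && (memslot->flags & KVM_MEM_PCI_HOLE)))
		return 0;

	/* ... existing gfn_to_hva_memslot() + __copy_to_user() path ... */
	return 0;
}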

>> +
>>  	addr = gfn_to_hva_memslot_prot(slot, gfn, NULL);
>>  	if (kvm_is_error_hva(addr))
>>  		return -EFAULT;
>> -- 
>> 2.25.4
>> 
>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
  2020-08-25 21:25 ` [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory Peter Xu
@ 2020-09-01 14:43   ` Vitaly Kuznetsov
  2020-09-01 20:00     ` Peter Xu
  2020-09-04  7:20   ` Gerd Hoffmann
  1 sibling, 1 reply; 31+ messages in thread
From: Vitaly Kuznetsov @ 2020-09-01 14:43 UTC (permalink / raw)
  To: Peter Xu
  Cc: kvm, Paolo Bonzini, Sean Christopherson, Wanpeng Li, Jim Mattson,
	Michael Tsirkin, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

Peter Xu <peterx@redhat.com> writes:

> On Fri, Aug 07, 2020 at 04:12:29PM +0200, Vitaly Kuznetsov wrote:
>> When testing Linux kernel boot with QEMU q35 VM and direct kernel boot
>> I observed 8193 accesses to PCI hole memory. When such exit is handled
>> in KVM without exiting to userspace, it takes roughly 0.000001 sec.
>> Handling the same exit in userspace is six times slower (0.000006 sec) so
> the overall difference is 0.04 sec. This may be significant for 'microvm'
>> ideas.
>
> Sorry to comment so late, but just curious... have you looked at what's those
> 8000+ accesses to PCI holes and what they're used for?  What I can think of are
> some port IO reads (e.g. upon vendor ID field) during BIOS to scan the devices
> attached.  Though those should be far less than 8000+, and those should also be
> pio rather than mmio.

And sorry for replying late)

We explicitly want MMIO instead of PIO to speed things up, afaiu PIO
requires two exits per device (and we exit all the way to
QEMU). Julia/Michael know better about the size of the space.
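
For context, a rough illustration (not from the series) of the legacy
mechanism being avoided: every config dword read through the 0xCF8/0xCFC
ports costs two trapped port accesses, whereas the same register read via
MMCONFIG is a single MMIO access:

#include <stdint.h>
#include <sys/io.h>	/* outl()/inl(); needs ioperm() or iopl() to run */

/* Legacy PCI config read: select the register via port 0xCF8, then read
 * the data window at 0xCFC.  Under KVM each port access is a VM exit. */
static uint32_t pci_legacy_read32(uint8_t bus, uint8_t dev, uint8_t fn,
				  uint8_t reg)
{
	uint32_t addr = (1u << 31) | ((uint32_t)bus << 16) |
			((uint32_t)dev << 11) | ((uint32_t)fn << 8) |
			(reg & 0xfc);

	outl(addr, 0xcf8);	/* exit #1: address */
	return inl(0xcfc);	/* exit #2: data */
}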

>
> If this is only an overhead for virt (since baremetal mmios should be fast),
> I'm also thinking whether we can make it even better to skip those pci hole
> reads.  Because we know we're virt, so it also gives us possibility that we may
> provide those information in a better way than reading PCI holes in the guest?

This means inventing a PV interface, and if we decide to go down this
road, I'd even argue for abandoning PCI completely. E.g. we can do
something similar to Hyper-V's Vmbus.

-- 
Vitaly


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
  2020-09-01 14:43   ` Vitaly Kuznetsov
@ 2020-09-01 20:00     ` Peter Xu
  2020-09-02  8:59       ` Vitaly Kuznetsov
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Xu @ 2020-09-01 20:00 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Sean Christopherson, Wanpeng Li, Jim Mattson,
	Michael Tsirkin, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

On Tue, Sep 01, 2020 at 04:43:25PM +0200, Vitaly Kuznetsov wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Fri, Aug 07, 2020 at 04:12:29PM +0200, Vitaly Kuznetsov wrote:
> >> When testing Linux kernel boot with QEMU q35 VM and direct kernel boot
> >> I observed 8193 accesses to PCI hole memory. When such exit is handled
> >> in KVM without exiting to userspace, it takes roughly 0.000001 sec.
> >> Handling the same exit in userspace is six times slower (0.000006 sec) so
> >> the overall difference is 0.04 sec. This may be significant for 'microvm'
> >> ideas.
> >
> > Sorry to comment so late, but just curious... have you looked at what's those
> > 8000+ accesses to PCI holes and what they're used for?  What I can think of are
> > some port IO reads (e.g. upon vendor ID field) during BIOS to scan the devices
> > attached.  Though those should be far less than 8000+, and those should also be
> > pio rather than mmio.
> 
> And sorry for replying late)
> 
> We explicitly want MMIO instead of PIO to speed things up, afaiu PIO
> requires two exits per device (and we exit all the way to
> QEMU). Julia/Michael know better about the size of the space.
> 
> >
> > If this is only an overhead for virt (since baremetal mmios should be fast),
> > I'm also thinking whether we can make it even better to skip those pci hole
> > reads.  Because we know we're virt, so it also gives us possibility that we may
> > provide those information in a better way than reading PCI holes in the guest?
> 
> This means let's invent a PV interface and if we decide to go down this
> road, I'd even argue for abandoning PCI completely. E.g. we can do
> something similar to Hyper-V's Vmbus.

My whole point was more about trying to understand the problem behind it.
Providing a fast path for reading pci holes seems reasonable as is;
it's just that I'm confused about why there are so many reads of the pci
holes after all.  Another important question is how this series will
finally help the microvm use case.  I'm not sure I get the whole point
of it, but... if microvm is the major use case of this, it would be good to
provide some quick numbers on those if possible.

For example, IIUC microvm uses qboot (as a better alternative than seabios) for
fast boot, and qboot has:

https://github.com/bonzini/qboot/blob/master/pci.c#L20

I'm kind of curious whether qboot will still be used when this series is used
with microvm VMs?  Since those are still at least PIO based.

Thanks,

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
  2020-09-01 20:00     ` Peter Xu
@ 2020-09-02  8:59       ` Vitaly Kuznetsov
  2020-09-04  6:12         ` Sean Christopherson
  0 siblings, 1 reply; 31+ messages in thread
From: Vitaly Kuznetsov @ 2020-09-02  8:59 UTC (permalink / raw)
  To: Peter Xu
  Cc: kvm, Paolo Bonzini, Sean Christopherson, Wanpeng Li, Jim Mattson,
	Michael Tsirkin, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

Peter Xu <peterx@redhat.com> writes:

> On Tue, Sep 01, 2020 at 04:43:25PM +0200, Vitaly Kuznetsov wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > On Fri, Aug 07, 2020 at 04:12:29PM +0200, Vitaly Kuznetsov wrote:
>> >> When testing Linux kernel boot with QEMU q35 VM and direct kernel boot
>> >> I observed 8193 accesses to PCI hole memory. When such exit is handled
>> >> in KVM without exiting to userspace, it takes roughly 0.000001 sec.
>> >> Handling the same exit in userspace is six times slower (0.000006 sec) so
> >> >> the overall difference is 0.04 sec. This may be significant for 'microvm'
>> >> ideas.
>> >
>> > Sorry to comment so late, but just curious... have you looked at what's those
>> > 8000+ accesses to PCI holes and what they're used for?  What I can think of are
>> > some port IO reads (e.g. upon vendor ID field) during BIOS to scan the devices
>> > attached.  Though those should be far less than 8000+, and those should also be
>> > pio rather than mmio.
>> 
>> And sorry for replying late)
>> 
>> We explicitly want MMIO instead of PIO to speed things up, afaiu PIO
>> requires two exits per device (and we exit all the way to
>> QEMU). Julia/Michael know better about the size of the space.
>> 
>> >
>> > If this is only an overhead for virt (since baremetal mmios should be fast),
>> > I'm also thinking whether we can make it even better to skip those pci hole
>> > reads.  Because we know we're virt, so it also gives us possibility that we may
>> > provide those information in a better way than reading PCI holes in the guest?
>> 
>> This means let's invent a PV interface and if we decide to go down this
>> road, I'd even argue for abandoning PCI completely. E.g. we can do
>> something similar to Hyper-V's Vmbus.
>
> My whole point was more about trying to understand the problem behind.
> Providing a fast path for reading pci holes seems to be reasonable as is,
> however it's just that I'm confused on why there're so many reads on the pci
> holes after all.  Another important question is I'm wondering how this series
> will finally help the use case of microvm.  I'm not sure I get the whole point
> of it, but... if microvm is the major use case of this, it would be good to
> provide some quick numbers on those if possible.
>
> For example, IIUC microvm uses qboot (as a better alternative than seabios) for
> fast boot, and qboot has:
>
> https://github.com/bonzini/qboot/blob/master/pci.c#L20
>
> I'm kind of curious whether qboot will still be used when this series is used
> with microvm VMs?  Since those are still at least PIO based.

I'm afraid there is no 'grand plan' for everything at this moment :-(
For traditional VMs 0.04 sec per boot is negligible and definitely not
worth adding a feature for; memory requirements are also very
different. When it comes to microvm-style usage things change.

The '8193' PCI hole accesses I mention in the PATCH0 blurb are just from
Linux as I was doing a direct kernel boot; we can't get better than that
(if PCI is in the game, of course). Firmware (qboot, seabios,...) can
only add more. I *think* the plan is to eventually switch them all to
MMCFG, at least for KVM guests, by default, but we need something to put
in the advertisement.

We can, in theory, short circuit PIO in KVM instead but:
- We will need a completely different API
- We will never be able to reach the speed of the exit-less 'single 0xff
page' solution (see my RFC).

-- 
Vitaly


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/3] KVM: x86: introduce KVM_MEM_PCI_HOLE memory
  2020-09-01 14:39     ` Vitaly Kuznetsov
@ 2020-09-03  9:35       ` Michael S. Tsirkin
  0 siblings, 0 replies; 31+ messages in thread
From: Michael S. Tsirkin @ 2020-09-03  9:35 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Sean Christopherson, kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson,
	Peter Xu, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

On Tue, Sep 01, 2020 at 04:39:22PM +0200, Vitaly Kuznetsov wrote:
> Sean Christopherson <sean.j.christopherson@intel.com> writes:
> 
> > On Fri, Aug 07, 2020 at 04:12:31PM +0200, Vitaly Kuznetsov wrote:
> >> PCIe config space can (depending on the configuration) be quite big but
> >> usually is sparsely populated. Guest may scan it by accessing individual
> >> device's page which, when device is missing, is supposed to have 'pci
> >> hole' semantics: reads return '0xff' and writes get discarded. Compared
> >> to the already existing KVM_MEM_READONLY, VMM doesn't need to allocate
> >> real memory and stuff it with '0xff'.
> >> 
> >> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> >> ---
> >>  Documentation/virt/kvm/api.rst  | 18 ++++++++++-----
> >>  arch/x86/include/uapi/asm/kvm.h |  1 +
> >>  arch/x86/kvm/mmu/mmu.c          |  5 ++++-
> >>  arch/x86/kvm/mmu/paging_tmpl.h  |  3 +++
> >>  arch/x86/kvm/x86.c              | 10 ++++++---
> >>  include/linux/kvm_host.h        |  3 +++
> >>  include/uapi/linux/kvm.h        |  2 ++
> >>  virt/kvm/kvm_main.c             | 39 +++++++++++++++++++++++++++------
> >>  8 files changed, 64 insertions(+), 17 deletions(-)
> >> 
> >> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> >> index 644e5326aa50..dc4172352635 100644
> >> --- a/Documentation/virt/kvm/api.rst
> >> +++ b/Documentation/virt/kvm/api.rst
> >> @@ -1241,6 +1241,7 @@ yet and must be cleared on entry.
> >>    /* for kvm_memory_region::flags */
> >>    #define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
> >>    #define KVM_MEM_READONLY	(1UL << 1)
> >> +  #define KVM_MEM_PCI_HOLE		(1UL << 2)
> >>  
> >>  This ioctl allows the user to create, modify or delete a guest physical
> >>  memory slot.  Bits 0-15 of "slot" specify the slot id and this value
> >> @@ -1268,12 +1269,17 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
> >>  be identical.  This allows large pages in the guest to be backed by large
> >>  pages in the host.
> >>  
> >> -The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
> >> -KVM_MEM_READONLY.  The former can be set to instruct KVM to keep track of
> >> -writes to memory within the slot.  See KVM_GET_DIRTY_LOG ioctl to know how to
> >> -use it.  The latter can be set, if KVM_CAP_READONLY_MEM capability allows it,
> >> -to make a new slot read-only.  In this case, writes to this memory will be
> >> -posted to userspace as KVM_EXIT_MMIO exits.
> >> +The flags field supports the following flags: KVM_MEM_LOG_DIRTY_PAGES,
> >> +KVM_MEM_READONLY, KVM_MEM_PCI_HOLE:
> >> +- KVM_MEM_LOG_DIRTY_PAGES: log writes.  Use KVM_GET_DIRTY_LOG to retrieve
> >> +  the log.
> >> +- KVM_MEM_READONLY: exit to userspace with KVM_EXIT_MMIO on writes.  Only
> >> +  available when KVM_CAP_READONLY_MEM is present.
> >> +- KVM_MEM_PCI_HOLE: always return 0xff on reads, exit to userspace with
> >> +  KVM_EXIT_MMIO on writes.  Only available when KVM_CAP_PCI_HOLE_MEM is
> >> +  present.  When setting the memory region 'userspace_addr' must be NULL.
> >> +  This flag is mutually exclusive with KVM_MEM_LOG_DIRTY_PAGES and with
> >> +  KVM_MEM_READONLY.
> >>  
> >>  When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
> >>  the memory region are automatically reflected into the guest.  For example, an
> >> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> >> index 17c5a038f42d..cf80a26d74f5 100644
> >> --- a/arch/x86/include/uapi/asm/kvm.h
> >> +++ b/arch/x86/include/uapi/asm/kvm.h
> >> @@ -48,6 +48,7 @@
> >>  #define __KVM_HAVE_XSAVE
> >>  #define __KVM_HAVE_XCRS
> >>  #define __KVM_HAVE_READONLY_MEM
> >> +#define __KVM_HAVE_PCI_HOLE_MEM
> >>  
> >>  /* Architectural interrupt line count. */
> >>  #define KVM_NR_INTERRUPTS 256
> >> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> >> index fef6956393f7..4a2a7fface1e 100644
> >> --- a/arch/x86/kvm/mmu/mmu.c
> >> +++ b/arch/x86/kvm/mmu/mmu.c
> >> @@ -3254,7 +3254,7 @@ static int kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, gfn_t gfn,
> >>  		return PG_LEVEL_4K;
> >>  
> >>  	slot = gfn_to_memslot_dirty_bitmap(vcpu, gfn, true);
> >> -	if (!slot)
> >> +	if (!slot || (slot->flags & KVM_MEM_PCI_HOLE))
> >
> > This is unnecessary since you're setting disallow_lpage in
> > kvm_alloc_memslot_metadata().
> >
> 
> Yea, redundant precaution, can be dropped.
> 
> >>  		return PG_LEVEL_4K;
> >>  
> >>  	max_level = min(max_level, max_huge_page_level);
> >> @@ -4105,6 +4105,9 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
> >>  
> >>  	slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
> >>  
> >> +	if (!write && slot && (slot->flags & KVM_MEM_PCI_HOLE))
> >
> > I'm confused.  Why does this short circuit reads but not writes?
> >
> 
> The idea was that guests shouldn't normally write to these regions and
> we may want to catch them if they do. We can short circuit writes too by
> simply ignoring them.

Another point is that writes by guests might need to set bits
in the root port error reporting cap.

> >> +		return RET_PF_EMULATE;
> >> +
> >>  	if (try_async_pf(vcpu, slot, prefault, gfn, gpa, &pfn, write,
> >>  			 &map_writable))
> >>  		return RET_PF_RETRY;
> >> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> >> index 5c6a895f67c3..27abd69e69f6 100644
> >> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> >> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> >> @@ -836,6 +836,9 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code,
> >>  
> >>  	slot = kvm_vcpu_gfn_to_memslot(vcpu, walker.gfn);
> >>  
> >> +	if (!write_fault && slot && (slot->flags & KVM_MEM_PCI_HOLE))
> >> +		return RET_PF_EMULATE;
> >> +
> >>  	if (try_async_pf(vcpu, slot, prefault, walker.gfn, addr, &pfn,
> >>  			 write_fault, &map_writable))
> >>  		return RET_PF_RETRY;
> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >> index dc4370394ab8..538bc58a22db 100644
> >> --- a/arch/x86/kvm/x86.c
> >> +++ b/arch/x86/kvm/x86.c
> >> @@ -3515,6 +3515,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> >>  	case KVM_CAP_EXCEPTION_PAYLOAD:
> >>  	case KVM_CAP_SET_GUEST_DEBUG:
> >>  	case KVM_CAP_LAST_CPU:
> >> +	case KVM_CAP_PCI_HOLE_MEM:
> >>  		r = 1;
> >>  		break;
> >>  	case KVM_CAP_SYNC_REGS:
> >> @@ -10114,9 +10115,11 @@ static int kvm_alloc_memslot_metadata(struct kvm_memory_slot *slot,
> >>  		ugfn = slot->userspace_addr >> PAGE_SHIFT;
> >>  		/*
> >>  		 * If the gfn and userspace address are not aligned wrt each
> >> -		 * other, disable large page support for this slot.
> >> +		 * other, disable large page support for this slot. Also,
> >> +		 * disable large page support for KVM_MEM_PCI_HOLE slots.
> >>  		 */
> >> -		if ((slot->base_gfn ^ ugfn) & (KVM_PAGES_PER_HPAGE(level) - 1)) {
> >> +		if ((slot->flags & KVM_MEM_PCI_HOLE) || ((slot->base_gfn ^ ugfn) &
> >> +				      (KVM_PAGES_PER_HPAGE(level) - 1))) {
> >>  			unsigned long j;
> >>  
> >>  			for (j = 0; j < lpages; ++j)
> >> @@ -10178,7 +10181,8 @@ static void kvm_mmu_slot_apply_flags(struct kvm *kvm,
> >>  	 * Nothing to do for RO slots or CREATE/MOVE/DELETE of a slot.
> >>  	 * See comments below.
> >>  	 */
> >> -	if ((change != KVM_MR_FLAGS_ONLY) || (new->flags & KVM_MEM_READONLY))
> >> +	if ((change != KVM_MR_FLAGS_ONLY) ||
> >> +	    (new->flags & (KVM_MEM_READONLY | KVM_MEM_PCI_HOLE)))
> >>  		return;
> >>  
> >>  	/*
> >> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> >> index 989afcbe642f..de1faa64a8ef 100644
> >> --- a/include/linux/kvm_host.h
> >> +++ b/include/linux/kvm_host.h
> >> @@ -1081,6 +1081,9 @@ __gfn_to_memslot(struct kvm_memslots *slots, gfn_t gfn)
> >>  static inline unsigned long
> >>  __gfn_to_hva_memslot(struct kvm_memory_slot *slot, gfn_t gfn)
> >>  {
> >> +	/* Should never be called with a KVM_MEM_PCI_HOLE slot */
> >> +	BUG_ON(!slot->userspace_addr);
> >
> > So _technically_, userspace can hit this by allowing virtual address 0,
> > which is very much non-standard, but theoretically legal.  It'd probably be
> > better to use a value that can't possibly be a valid userspace_addr, e.g. a
> > non-canonical value.
> >
> 
> I think I had '!(slot->flags & KVM_MEM_PCI_HOLE)' check in a previous
> version, we can restore it (if needed) or drop the thing completely.
> 
> >> +
> >>  	return slot->userspace_addr + (gfn - slot->base_gfn) * PAGE_SIZE;
> >>  }
> >>  
> >
> > ...
> >
> >> @@ -2318,6 +2338,11 @@ static int __kvm_read_guest_page(struct kvm_memory_slot *slot, gfn_t gfn,
> >>  	int r;
> >>  	unsigned long addr;
> >>  
> >> +	if (unlikely(slot && (slot->flags & KVM_MEM_PCI_HOLE))) {
> >> +		memset(data, 0xff, len);
> >> +		return 0;
> >> +	}
> >
> > This feels wrong, shouldn't we be treating PCI_HOLE as MMIO?  Given that
> > this is performance oriented, I would think we'd want to leverage the
> > GPA from the VMCS instead of doing a full translation.
> >
> > That brings up a potential alternative to adding a memslot flag.  What if
> > we instead add a KVM_MMIO_BUS device similar to coalesced MMIO?  I think
> > it'd be about the same amount of KVM code, and it would provide userspace
> > with more flexibility, e.g. I assume it would allow handling even writes
> > wholly within the kernel for certain ranges and/or use cases, and it'd
> > allow stuffing a value other than 0xff (though I have no idea if there is
> > a use case for this).
> 
> I was thinking about making this a bit more generic, like 'immutable'
> memory with a userspace-supplied values, e.g. userspace would be
> providing a region (e.g. a single page which will be mapped to by all
> pages of the slot) but then I failed to find a use-case for that. The
> PCI hole semantics seems to be the only one we actually need in the real
> life.
> 
> Overall, making these PCI holes 'special' memory regions (slots) and
> sticking to KVM_SET_USER_MEMORY_REGION feels natural to me. I also think
> it would be much easier to consume from QEMU side as we won't need to
> use a 'special' API when things change (e.g. a device gets added and we
> need to [un]punch a hole in the 'PCI hole' space).
> 
> >
> > Speaking of which, why do writes go to userspace in this series?
> >
> 
> No particular reason actually, if there is no need to catch such
> (stray?) writes we can simply short-circuit them to nop.  

With AER we might want to handle them I think.

> >> +
> >>  	addr = gfn_to_hva_memslot_prot(slot, gfn, NULL);
> >>  	if (kvm_is_error_hva(addr))
> >>  		return -EFAULT;
> >> -- 
> >> 2.25.4
> >> 
> >
> 
> -- 
> Vitaly


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 1/3] KVM: x86: move kvm_vcpu_gfn_to_memslot() out of try_async_pf()
  2020-09-01 14:15     ` Vitaly Kuznetsov
@ 2020-09-04  3:47       ` Sean Christopherson
  0 siblings, 0 replies; 31+ messages in thread
From: Sean Christopherson @ 2020-09-04  3:47 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Peter Xu,
	Michael Tsirkin, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

On Tue, Sep 01, 2020 at 04:15:07PM +0200, Vitaly Kuznetsov wrote:
> Sean Christopherson <sean.j.christopherson@intel.com> writes:
> 
> > On Fri, Aug 07, 2020 at 04:12:30PM +0200, Vitaly Kuznetsov wrote:
> >> No functional change intended. Slot flags will need to be analyzed
> >> prior to try_async_pf() when KVM_MEM_PCI_HOLE is implemented.
> >
> 
> (Sorry it took me so long to reply. No, I wasn't hoping for Paolo's
> magical "queued, thanks", I just tried to not read my email while on
> vacation).
> 
> > Why?  Wouldn't it be just as easy, and arguably more appropriate, to add
> > KVM_PFN_ERR_PCI_HOLE and update handle_abnormal_pfn() accordingly?
> >
> 
> Yes, we can do that, but what I don't quite like here is that
> try_async_pf() does much more than 'trying async PF'. In particular, it
> extracts 'pfn' and this is far from being obvious. Maybe we can rename
> try_async_pf() somewhat smartly (e.g. 'try_handle_pf()')? Your
> suggestion will make perfect sense to me then.

Ya, try_async_pf() is a horrible name.  try_handle_pf() isn't bad, but it's
not technically handling the fault.  Maybe try_get_pfn() with an inverted
return?

	if (!try_get_pfn(...))
		return RET_PF_RETRY;

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
  2020-09-02  8:59       ` Vitaly Kuznetsov
@ 2020-09-04  6:12         ` Sean Christopherson
  2020-09-04  7:29           ` Gerd Hoffmann
                             ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Sean Christopherson @ 2020-09-04  6:12 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Peter Xu, kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson,
	Michael Tsirkin, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

On Wed, Sep 02, 2020 at 10:59:20AM +0200, Vitaly Kuznetsov wrote:
> Peter Xu <peterx@redhat.com> writes:
> > My whole point was more about trying to understand the problem behind.
> > Providing a fast path for reading pci holes seems to be reasonable as is,
> > however it's just that I'm confused on why there're so many reads on the pci
> > holes after all.  Another important question is I'm wondering how this series
> > will finally help the use case of microvm.  I'm not sure I get the whole point
> > of it, but... if microvm is the major use case of this, it would be good to
> > provide some quick numbers on those if possible.
> >
> > For example, IIUC microvm uses qboot (as a better alternative than seabios) for
> > fast boot, and qboot has:
> >
> > https://github.com/bonzini/qboot/blob/master/pci.c#L20
> >
> > I'm kind of curious whether qboot will still be used when this series is used
> > with microvm VMs?  Since those are still at least PIO based.
> 
> I'm afraid there is no 'grand plan' for everything at this moment :-(
> For traditional VMs 0.04 sec per boot is negligible and definitely not
> worth adding a feature, memory requirements are also very
> different. When it comes to microvm-style usage things change.
> 
> '8193' PCI hole accesses I mention in the PATCH0 blurb are just from
> Linux as I was doing direct kernel boot, we can't get better than that
> (if PCI is in the game of course). Firmware (qboot, seabios,...) can
> only add more. I *think* the plan is to eventually switch them all to
> MMCFG, at least for KVM guests, by default but we need something to put
> to the advertisement. 

I see a similar ~8k PCI hole reads with a -kernel boot w/ OVMF.  All but 60
of those are from pcibios_fixup_peer_bridges(), and all are from the kernel.
My understanding is that pcibios_fixup_peer_bridges() is useful if and only
if there are multiple root buses.  And AFAICT, when running under QEMU, the only
way for there to be multiple buses is if there is an explicit bridge
created ("pxb" or "pxb-pcie").  Based on the cover letter from those[*], the
main reason for creating a bridge is to handle pinned CPUs on a NUMA system
with pass-through devices.  That use case seems highly unlikely to cross
paths with micro VMs, i.e. micro VMs will only ever have a single bus.
Unless I'm mistaken, microvm doesn't even support PCI, does it?

If all of the above is true, this can be handled by adding "pci=lastbus=0"
as a guest kernel param to override its scanning of buses.  And couldn't
that be done by QEMU's microvm_fix_kernel_cmdline() to make it transparent
to the end user?

[*] https://www.redhat.com/archives/libvir-list/2016-March/msg01213.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
  2020-08-25 21:25 ` [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory Peter Xu
  2020-09-01 14:43   ` Vitaly Kuznetsov
@ 2020-09-04  7:20   ` Gerd Hoffmann
  1 sibling, 0 replies; 31+ messages in thread
From: Gerd Hoffmann @ 2020-09-04  7:20 UTC (permalink / raw)
  To: Peter Xu
  Cc: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson,
	Wanpeng Li, Jim Mattson, Michael Tsirkin, Julia Suvorova,
	Andy Lutomirski, Andrew Jones, linux-kernel

  Hi,

> attached.  Though those should be far less than 8000+, and those should also be
> pio rather than mmio.

Well, seabios 1.14 (qemu 5.1) added mmio support (to speed up boot a
little bit; mmio is one vmexit and pio is two).

Depends on q35 obviously, and a few pio accesses remain b/c seabios has
to first setup mmconfig.

take care,
  Gerd


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
  2020-09-04  6:12         ` Sean Christopherson
@ 2020-09-04  7:29           ` Gerd Hoffmann
  2020-09-04 16:00             ` Sean Christopherson
  2020-09-07 10:52             ` Michael S. Tsirkin
  2020-09-07 10:54           ` Michael S. Tsirkin
  2020-09-18  8:49           ` Gerd Hoffmann
  2 siblings, 2 replies; 31+ messages in thread
From: Gerd Hoffmann @ 2020-09-04  7:29 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Peter Xu, kvm, Paolo Bonzini, Wanpeng Li,
	Jim Mattson, Michael Tsirkin, Julia Suvorova, Andy Lutomirski,
	Andrew Jones, linux-kernel

  Hi,

> Unless I'm mistaken, microvm doesn't even support PCI, does it?

Correct, no pci support right now.

We could probably wire up ecam (arm/virt style) for pcie support, once
the acpi support for microvm finally landed (we need acpi for that
because otherwise the kernel wouldn't find the pcie bus).

Question is whether there is a good reason to do so.  Why would someone
prefer microvm with pcie support over q35?

> If all of the above is true, this can be handled by adding "pci=lastbus=0"
> as a guest kernel param to override its scanning of buses.  And couldn't
> that be done by QEMU's microvm_fix_kernel_cmdline() to make it transparent
> to the end user?

microvm_fix_kernel_cmdline() is a hack, not a solution.

Beside that I doubt this has much of an effect on microvm because
it doesn't support pcie in the first place.

take care,
  Gerd


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
  2020-09-04  7:29           ` Gerd Hoffmann
@ 2020-09-04 16:00             ` Sean Christopherson
  2020-09-07  8:37               ` Vitaly Kuznetsov
  2020-09-07 10:52             ` Michael S. Tsirkin
  1 sibling, 1 reply; 31+ messages in thread
From: Sean Christopherson @ 2020-09-04 16:00 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Vitaly Kuznetsov, Peter Xu, kvm, Paolo Bonzini, Wanpeng Li,
	Jim Mattson, Michael Tsirkin, Julia Suvorova, Andy Lutomirski,
	Andrew Jones, linux-kernel

On Fri, Sep 04, 2020 at 09:29:05AM +0200, Gerd Hoffmann wrote:
>   Hi,
> 
> > Unless I'm mistaken, microvm doesn't even support PCI, does it?
> 
> Correct, no pci support right now.
> 
> We could probably wire up ecam (arm/virt style) for pcie support, once
> the acpi support for mictovm finally landed (we need acpi for that
> because otherwise the kernel wouldn't find the pcie bus).
> 
> Question is whether there is a good reason to do so.  Why would someone
> prefer microvm with pcie support over q35?
> 
> > If all of the above is true, this can be handled by adding "pci=lastbus=0"
> > as a guest kernel param to override its scanning of buses.  And couldn't
> > that be done by QEMU's microvm_fix_kernel_cmdline() to make it transparent
> > to the end user?
> 
> microvm_fix_kernel_cmdline() is a hack, not a solution.
> 
> Beside that I doubt this has much of an effect on microvm because
> it doesn't support pcie in the first place.

I am so confused.  Vitaly, can you clarify exactly what QEMU VM type this
series is intended to help?  If this is for microvm, then why is the guest
doing PCI scanning in the first place?  If it's for q35, why is the
justification for microvm-like workloads?

Either way, I think it makes sense to explore other options before throwing
something into KVM, e.g. modifying guest command line, adding a KVM hint,
"fixing" QEMU, etc... 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
  2020-09-04 16:00             ` Sean Christopherson
@ 2020-09-07  8:37               ` Vitaly Kuznetsov
  2020-09-07 11:32                 ` Michael S. Tsirkin
  0 siblings, 1 reply; 31+ messages in thread
From: Vitaly Kuznetsov @ 2020-09-07  8:37 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Peter Xu, kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson,
	Michael Tsirkin, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel, Gerd Hoffmann

Sean Christopherson <sean.j.christopherson@intel.com> writes:

> On Fri, Sep 04, 2020 at 09:29:05AM +0200, Gerd Hoffmann wrote:
>>   Hi,
>> 
>> > Unless I'm mistaken, microvm doesn't even support PCI, does it?
>> 
>> Correct, no pci support right now.
>> 
>> We could probably wire up ecam (arm/virt style) for pcie support, once
>> the acpi support for microvm finally landed (we need acpi for that
>> because otherwise the kernel wouldn't find the pcie bus).
>> 
>> Question is whether there is a good reason to do so.  Why would someone
>> prefer microvm with pcie support over q35?
>> 
>> > If all of the above is true, this can be handled by adding "pci=lastbus=0"
>> > as a guest kernel param to override its scanning of buses.  And couldn't
>> > that be done by QEMU's microvm_fix_kernel_cmdline() to make it transparent
>> > to the end user?
>> 
>> microvm_fix_kernel_cmdline() is a hack, not a solution.
>> 
>> Beside that I doubt this has much of an effect on microvm because
>> it doesn't support pcie in the first place.
>
> I am so confused.  Vitaly, can you clarify exactly what QEMU VM type this
> series is intended to help?  If this is for microvm, then why is the guest
> doing PCI scanning in the first place?  If it's for q35, why is the
> justification for microvm-like workloads?

I'm not exactly sure about the plans for particular machine types; the
intention was to use this for pcie in QEMU in general, so whatever
machine type uses pcie will benefit.

Now, it seems that we have a more sophisticated landscape. The
optimization will only make sense to speed up boot, so all 'traditional'
VM types with 'traditional' firmware are out of the
question. 'Container-like' VMs seem to avoid PCI for now; I'm not sure
if it's because they're in the early stages of their development, because
they can get away without PCI or, actually, because of slowness at boot
(which we're trying to tackle with this feature). I'd definitely like to
hear what people think about this.

>
> Either way, I think it makes sense explore other options before throwing
> something into KVM, e.g. modifying guest command line, adding a KVM hint,
> "fixing" QEMU, etc... 
>

Initially, this feature looked like a small and straightforward
(micro-)optimization to me: memory regions with 'PCI hole' semantics do
exist and we can speed up access to them. Ideally, I'd like to find
other 'constant memory' regions requiring fast access and come up with
an interface to create them in KVM but so far nothing interesting came
up...

-- 
Vitaly


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
  2020-09-04  7:29           ` Gerd Hoffmann
  2020-09-04 16:00             ` Sean Christopherson
@ 2020-09-07 10:52             ` Michael S. Tsirkin
  2020-09-18  9:33               ` Gerd Hoffmann
  1 sibling, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2020-09-07 10:52 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Sean Christopherson, Vitaly Kuznetsov, Peter Xu, kvm,
	Paolo Bonzini, Wanpeng Li, Jim Mattson, Julia Suvorova,
	Andy Lutomirski, Andrew Jones, linux-kernel

On Fri, Sep 04, 2020 at 09:29:05AM +0200, Gerd Hoffmann wrote:
>   Hi,
> 
> > Unless I'm mistaken, microvm doesn't even support PCI, does it?
> 
> Correct, no pci support right now.
> 
> We could probably wire up ecam (arm/virt style) for pcie support, once
> the acpi support for mictovm finally landed (we need acpi for that
> because otherwise the kernel wouldn't find the pcie bus).
> 
> Question is whether there is a good reason to do so.  Why would someone
> prefer microvm with pcie support over q35?

The usual reasons to use pcie apply to microvm just the same.
E.g.: pass through of pcie devices?

> > If all of the above is true, this can be handled by adding "pci=lastbus=0"
> > as a guest kernel param to override its scanning of buses.  And couldn't
> > that be done by QEMU's microvm_fix_kernel_cmdline() to make it transparent
> > to the end user?
> 
> microvm_fix_kernel_cmdline() is a hack, not a solution.
> 
> Beside that I doubt this has much of an effect on microvm because
> it doesn't support pcie in the first place.
> 
> take care,
>   Gerd


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
  2020-09-04  6:12         ` Sean Christopherson
  2020-09-04  7:29           ` Gerd Hoffmann
@ 2020-09-07 10:54           ` Michael S. Tsirkin
  2020-09-18  8:49           ` Gerd Hoffmann
  2 siblings, 0 replies; 31+ messages in thread
From: Michael S. Tsirkin @ 2020-09-07 10:54 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Peter Xu, kvm, Paolo Bonzini, Wanpeng Li,
	Jim Mattson, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel

On Thu, Sep 03, 2020 at 11:12:12PM -0700, Sean Christopherson wrote:
> On Wed, Sep 02, 2020 at 10:59:20AM +0200, Vitaly Kuznetsov wrote:
> > Peter Xu <peterx@redhat.com> writes:
> > > My whole point was more about trying to understand the problem behind.
> > > Providing a fast path for reading pci holes seems to be reasonable as is,
> > > however it's just that I'm confused on why there're so many reads on the pci
> > > holes after all.  Another important question is I'm wondering how this series
> > > will finally help the use case of microvm.  I'm not sure I get the whole point
> > > of it, but... if microvm is the major use case of this, it would be good to
> > > provide some quick numbers on those if possible.
> > >
> > > For example, IIUC microvm uses qboot (as a better alternative than seabios) for
> > > fast boot, and qboot has:
> > >
> > > https://github.com/bonzini/qboot/blob/master/pci.c#L20
> > >
> > > I'm kind of curious whether qboot will still be used when this series is used
> > > with microvm VMs?  Since those are still at least PIO based.
> > 
> > I'm afraid there is no 'grand plan' for everything at this moment :-(
> > For traditional VMs 0.04 sec per boot is negligible and definitely not
> > worth adding a feature, memory requirements are also very
> > different. When it comes to microvm-style usage things change.
> > 
> > '8193' PCI hole accesses I mention in the PATCH0 blurb are just from
> > Linux as I was doing direct kernel boot, we can't get better than that
> > (if PCI is in the game of course). Firmware (qboot, seabios,...) can
> > only add more. I *think* the plan is to eventually switch them all to
> > MMCFG, at least for KVM guests, by default but we need something to put
> > to the advertisement. 
> 
> I see a similar ~8k PCI hole reads with a -kernel boot w/ OVMF.  All but 60
> of those are from pcibios_fixup_peer_bridges(), and all are from the kernel.
> My understanding is that pcibios_fixup_peer_bridges() is useful if and only
> if there are multiple root buses.  And AFAICT, when running under QEMU, the only
> way for there to be multiple buses is if there is an explicit bridge
> created ("pxb" or "pxb-pcie").  Based on the cover letter from those[*], the
> main reason for creating a bridge is to handle pinned CPUs on a NUMA system
> with pass-through devices.  That use case seems highly unlikely to cross
> paths with micro VMs, i.e. micro VMs will only ever have a single bus.

My position is that it's not all black and white; workloads do not
cleanly partition into those that care about boot speed and those
that don't. So IMHO we care about boot speed with pcie even if
microvm does not use it at the moment.


> Unless I'm mistaken, microvm doesn't even support PCI, does it?
> 
> If all of the above is true, this can be handled by adding "pci=lastbus=0"
> as a guest kernel param to override its scanning of buses.  And couldn't
> that be done by QEMU's microvm_fix_kernel_cmdline() to make it transparent
> to the end user?
> 
> [*] https://www.redhat.com/archives/libvir-list/2016-March/msg01213.html
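
For illustration, a minimal sketch of what that override would look like on a
direct-kernel-boot command line (the binary, the image paths and the rest of
the -append string are placeholders; only pci=lastbus=0 comes from the
suggestion quoted above):

    qemu-system-x86_64 -machine microvm -enable-kvm -m 512 \
        -kernel bzImage \
        -append "console=ttyS0 pci=lastbus=0"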


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
  2020-09-07  8:37               ` Vitaly Kuznetsov
@ 2020-09-07 11:32                 ` Michael S. Tsirkin
  2020-09-11 17:00                   ` Sean Christopherson
  0 siblings, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2020-09-07 11:32 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Sean Christopherson, Peter Xu, kvm, Paolo Bonzini, Wanpeng Li,
	Jim Mattson, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel, Gerd Hoffmann

On Mon, Sep 07, 2020 at 10:37:39AM +0200, Vitaly Kuznetsov wrote:
> Sean Christopherson <sean.j.christopherson@intel.com> writes:
> 
> > On Fri, Sep 04, 2020 at 09:29:05AM +0200, Gerd Hoffmann wrote:
> >>   Hi,
> >> 
> >> > Unless I'm mistaken, microvm doesn't even support PCI, does it?
> >> 
> >> Correct, no pci support right now.
> >> 
> >> We could probably wire up ecam (arm/virt style) for pcie support, once
> >> the acpi support for microvm has finally landed (we need acpi for that
> >> because otherwise the kernel wouldn't find the pcie bus).
> >> 
> >> Question is whether there is a good reason to do so.  Why would someone
> >> prefer microvm with pcie support over q35?
> >> 
> >> > If all of the above is true, this can be handled by adding "pci=lastbus=0"
> >> > as a guest kernel param to override its scanning of buses.  And couldn't
> >> > that be done by QEMU's microvm_fix_kernel_cmdline() to make it transparent
> >> > to the end user?
> >> 
> >> microvm_fix_kernel_cmdline() is a hack, not a solution.
> >> 
> >> Besides that, I doubt this has much of an effect on microvm because
> >> it doesn't support pcie in the first place.
> >
> > I am so confused.  Vitaly, can you clarify exactly what QEMU VM type this
> > series is intended to help?  If this is for microvm, then why is the guest
> > doing PCI scanning in the first place?  If it's for q35, why is the
> > justification based on microvm-like workloads?
> 
> I'm not exactly sure about the plans for particular machine types; the
> intention was to use this for pcie in QEMU in general, so whatever
> machine type uses pcie will benefit. 
> 
> Now, it seems that we have a more complicated landscape. The
> optimization only makes sense as a boot speedup, so all 'traditional'
> VM types with 'traditional' firmware are out of the
> question. 'Container-like' VMs seem to avoid PCI for now; I'm not sure
> if it's because they're in early stages of their development, because
> they can get away without PCI or, actually, because of slowness at boot
> (which we're trying to tackle with this feature). I'd definitely like to
> hear more about what people think about this.

I suspect microvms will need pci eventually. I would much rather KVM
had an exit-less discovery mechanism in place by then because,
learning from history, if it doesn't they will do some kind of
hack on the kernel command line, and everyone will be stuck
supporting that for years ...

> >
> > Either way, I think it makes sense explore other options before throwing
> > something into KVM, e.g. modifying guest command line, adding a KVM hint,
> > "fixing" QEMU, etc... 
> >
> 
> Initially, this feature looked like a small and straightforward
> (micro-)optimization to me: memory regions with 'PCI hole' semantics do
> exist and we can speed up access to them. Ideally, I'd like to find
> other 'constant memory' regions requiring fast access and come up with
> an interface to create them in KVM, but so far nothing interesting has
> come up...

True, me neither.
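
As a concrete illustration of those 'PCI hole' semantics, here is a minimal
sketch of how userspace might register such a slot with the flag this series
proposes.  The flag's numeric value below is a placeholder and the helper is
hypothetical; only KVM_SET_USER_MEMORY_REGION and its struct are existing
KVM UAPI:

    #include <linux/kvm.h>
    #include <sys/ioctl.h>

    #ifndef KVM_MEM_PCI_HOLE
    #define KVM_MEM_PCI_HOLE (1UL << 2)   /* placeholder value, for illustration only */
    #endif

    /* Mark a guest-physical range as a 'PCI hole': no host memory backs it,
     * reads are handled inside KVM and return 0xff, writes still exit to
     * userspace. */
    static int add_pci_hole_slot(int vm_fd, __u32 slot, __u64 gpa, __u64 size)
    {
        struct kvm_userspace_memory_region region = {
            .slot            = slot,
            .flags           = KVM_MEM_PCI_HOLE,
            .guest_phys_addr = gpa,    /* e.g. ECAM pages of absent devices */
            .memory_size     = size,
            .userspace_addr  = 0,      /* no backing memory needed */
        };

        return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
    }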

> -- 
> Vitaly


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
  2020-09-07 11:32                 ` Michael S. Tsirkin
@ 2020-09-11 17:00                   ` Sean Christopherson
  2020-09-18 12:34                     ` Michael S. Tsirkin
  0 siblings, 1 reply; 31+ messages in thread
From: Sean Christopherson @ 2020-09-11 17:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Vitaly Kuznetsov, Peter Xu, kvm, Paolo Bonzini, Wanpeng Li,
	Jim Mattson, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel, Gerd Hoffmann

On Mon, Sep 07, 2020 at 07:32:23AM -0400, Michael S. Tsirkin wrote:
> On Mon, Sep 07, 2020 at 10:37:39AM +0200, Vitaly Kuznetsov wrote:
> > Sean Christopherson <sean.j.christopherson@intel.com> writes:
> > 
> > > On Fri, Sep 04, 2020 at 09:29:05AM +0200, Gerd Hoffmann wrote:
> > >>   Hi,
> > >> 
> > >> > Unless I'm mistaken, microvm doesn't even support PCI, does it?
> > >> 
> > >> Correct, no pci support right now.
> > >> 
> > >> We could probably wire up ecam (arm/virt style) for pcie support, once
> > >> the acpi support for microvm has finally landed (we need acpi for that
> > >> because otherwise the kernel wouldn't find the pcie bus).
> > >> 
> > >> Question is whether there is a good reason to do so.  Why would someone
> > >> prefer microvm with pcie support over q35?
> > >> 
> > >> > If all of the above is true, this can be handled by adding "pci=lastbus=0"
> > >> > as a guest kernel param to override its scanning of buses.  And couldn't
> > >> > that be done by QEMU's microvm_fix_kernel_cmdline() to make it transparent
> > >> > to the end user?
> > >> 
> > >> microvm_fix_kernel_cmdline() is a hack, not a solution.
> > >> 
> > >> Besides that, I doubt this has much of an effect on microvm because
> > >> it doesn't support pcie in the first place.
> > >
> > > I am so confused.  Vitaly, can you clarify exactly what QEMU VM type this
> > > series is intended to help?  If this is for microvm, then why is the guest
> > > doing PCI scanning in the first place?  If it's for q35, why is the
> > > justification based on microvm-like workloads?
> > 
> > I'm not exactly sure about the plans for particular machine types; the
> > intention was to use this for pcie in QEMU in general, so whatever
> > machine type uses pcie will benefit. 
> > 
> > Now, it seems that we have a more complicated landscape. The
> > optimization only makes sense as a boot speedup, so all 'traditional'
> > VM types with 'traditional' firmware are out of the
> > question. 'Container-like' VMs seem to avoid PCI for now; I'm not sure
> > if it's because they're in early stages of their development, because
> > they can get away without PCI or, actually, because of slowness at boot
> > (which we're trying to tackle with this feature). I'd definitely like to
> > hear more about what people think about this.
> 
> I suspect microvms will need pci eventually. I would much rather KVM
> had an exit-less discovery mechanism in place by then because,
> learning from history, if it doesn't they will do some kind of
> hack on the kernel command line, and everyone will be stuck
> supporting that for years ...

Is it not an option for the VMM to "accurately" enumerate the number of buses?
E.g. if the VMM has devices on only bus 0, then enumerate that there is one
bus so that the guest doesn't try to probe devices that can't possibly exist.
Or is that completely nonsensical, and/or does it violate the PCIe spec?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
  2020-09-04  6:12         ` Sean Christopherson
  2020-09-04  7:29           ` Gerd Hoffmann
  2020-09-07 10:54           ` Michael S. Tsirkin
@ 2020-09-18  8:49           ` Gerd Hoffmann
  2 siblings, 0 replies; 31+ messages in thread
From: Gerd Hoffmann @ 2020-09-18  8:49 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Peter Xu, kvm, Paolo Bonzini, Wanpeng Li,
	Jim Mattson, Michael Tsirkin, Julia Suvorova, Andy Lutomirski,
	Andrew Jones, linux-kernel

  Hi,

> I see similar numbers: ~8k PCI hole reads with a -kernel boot w/ OVMF.  All but 60
> of those are from pcibios_fixup_peer_bridges(), and all are from the kernel.

pcibios_fixup_peer_bridges() looks at pcibios_last_bus, and that in turn
seems to be set according to the mmconfig size (in
arch/x86/pci/mmconfig-shared.c).

So, maybe we just need to declare a smaller mmconfig window in the acpi
tables, depending on the number of pci busses actually used ...

> If all of the above is true, this can be handled by adding "pci=lastbus=0"

... so we don't need manual quirks like this?
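
To make that idea concrete, a minimal sketch of an MCFG allocation entry sized
to the buses actually used (the struct layout follows the PCI Firmware spec;
the helper and its names are illustrative, not QEMU's actual table builder):

    #include <stdint.h>

    /* One MCFG 'configuration space base address allocation' entry. */
    struct mcfg_allocation {
        uint64_t base_address;       /* ECAM/MMCONFIG base */
        uint16_t pci_segment;        /* segment group, usually 0 */
        uint8_t  start_bus_number;   /* first bus decoded by this window */
        uint8_t  end_bus_number;     /* last bus decoded by this window */
        uint32_t reserved;
    } __attribute__((packed));

    /* Size the window to the buses the VMM actually populated, so that
     * pcibios_last_bus stays small and the guest never scans config space
     * for buses that cannot exist. */
    static struct mcfg_allocation make_mcfg_entry(uint64_t ecam_base,
                                                  uint8_t last_used_bus)
    {
        struct mcfg_allocation entry = {
            .base_address     = ecam_base,
            .pci_segment      = 0,
            .start_bus_number = 0,
            .end_bus_number   = last_used_bus,  /* 0 => only bus 0 */
        };

        return entry;
    }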

take care,
  Gerd


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
  2020-09-07 10:52             ` Michael S. Tsirkin
@ 2020-09-18  9:33               ` Gerd Hoffmann
  0 siblings, 0 replies; 31+ messages in thread
From: Gerd Hoffmann @ 2020-09-18  9:33 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Sean Christopherson, Vitaly Kuznetsov, Peter Xu, kvm,
	Paolo Bonzini, Wanpeng Li, Jim Mattson, Julia Suvorova,
	Andy Lutomirski, Andrew Jones, linux-kernel

  Hi,

> > We could probably wire up ecam (arm/virt style) for pcie support, once
> > the acpi support for microvm has finally landed (we need acpi for that
> > because otherwise the kernel wouldn't find the pcie bus).
> > 
> > Question is whether there is a good reason to do so.  Why would someone
> > prefer microvm with pcie support over q35?
> 
> The usual reasons to use pcie apply to microvm just the same.
> E.g.: pass-through of pcie devices?

Playground:
  https://git.kraxel.org/cgit/qemu/log/?h=sirius/microvm-usb

Adds support for usb and pcie (use -machine microvm,usb=on,pcie=on
to enable).  Reuses the gpex used on arm/aarch64.  Seems to work ok
on a quick test.
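
For reference, a minimal invocation against that branch might look roughly
like the following (kernel and disk image paths are placeholders, and whether
a given pcie device works on the experimental branch is untested here; only
the -machine options come from the description above):

    qemu-system-x86_64 -machine microvm,usb=on,pcie=on \
        -enable-kvm -cpu host -m 512 \
        -kernel bzImage -append "console=ttyS0" \
        -drive file=rootfs.img,if=none,id=d0 \
        -device virtio-blk-pci,drive=d0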

Not fully sure how to deal correctly with ioports.  The gpex device
has an mmio window for the io address space.  Will that approach work
on x86 too?  Anyway, just not having an ioport range seems to be a
valid configuration, so I've just disabled them for now ...

take care,
  Gerd


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
  2020-09-11 17:00                   ` Sean Christopherson
@ 2020-09-18 12:34                     ` Michael S. Tsirkin
  2020-09-21 17:21                       ` Sean Christopherson
  0 siblings, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2020-09-18 12:34 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Peter Xu, kvm, Paolo Bonzini, Wanpeng Li,
	Jim Mattson, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel, Gerd Hoffmann

On Fri, Sep 11, 2020 at 10:00:31AM -0700, Sean Christopherson wrote:
> On Mon, Sep 07, 2020 at 07:32:23AM -0400, Michael S. Tsirkin wrote:
> > On Mon, Sep 07, 2020 at 10:37:39AM +0200, Vitaly Kuznetsov wrote:
> > > Sean Christopherson <sean.j.christopherson@intel.com> writes:
> > > 
> > > > On Fri, Sep 04, 2020 at 09:29:05AM +0200, Gerd Hoffmann wrote:
> > > >>   Hi,
> > > >> 
> > > >> > Unless I'm mistaken, microvm doesn't even support PCI, does it?
> > > >> 
> > > >> Correct, no pci support right now.
> > > >> 
> > > >> We could probably wire up ecam (arm/virt style) for pcie support, once
> > > >> the acpi support for microvm has finally landed (we need acpi for that
> > > >> because otherwise the kernel wouldn't find the pcie bus).
> > > >> 
> > > >> Question is whether there is a good reason to do so.  Why would someone
> > > >> prefer microvm with pcie support over q35?
> > > >> 
> > > >> > If all of the above is true, this can be handled by adding "pci=lastbus=0"
> > > >> > as a guest kernel param to override its scanning of buses.  And couldn't
> > > >> > that be done by QEMU's microvm_fix_kernel_cmdline() to make it transparent
> > > >> > to the end user?
> > > >> 
> > > >> microvm_fix_kernel_cmdline() is a hack, not a solution.
> > > >> 
> > > >> Besides that, I doubt this has much of an effect on microvm because
> > > >> it doesn't support pcie in the first place.
> > > >
> > > > I am so confused.  Vitaly, can you clarify exactly what QEMU VM type this
> > > > series is intended to help?  If this is for microvm, then why is the guest
> > > > doing PCI scanning in the first place?  If it's for q35, why is the
> > > > justification based on microvm-like workloads?
> > > 
> > > I'm not exactly sure about the plans for particular machine types; the
> > > intention was to use this for pcie in QEMU in general, so whatever
> > > machine type uses pcie will benefit. 
> > > 
> > > Now, it seems that we have a more complicated landscape. The
> > > optimization only makes sense as a boot speedup, so all 'traditional'
> > > VM types with 'traditional' firmware are out of the
> > > question. 'Container-like' VMs seem to avoid PCI for now; I'm not sure
> > > if it's because they're in early stages of their development, because
> > > they can get away without PCI or, actually, because of slowness at boot
> > > (which we're trying to tackle with this feature). I'd definitely like to
> > > hear more about what people think about this.
> > 
> > I suspect microvms will need pci eventually. I would much rather KVM
> > had an exit-less discovery mechanism in place by then because,
> > learning from history, if it doesn't they will do some kind of
> > hack on the kernel command line, and everyone will be stuck
> > supporting that for years ...
> 
> Is it not an option for the VMM to "accurately" enumerate the number of buses?
> E.g. if the VMM has devices on only bus 0, then enumerate that there is one
> bus so that the guest doesn't try to probe devices that can't possibly exist.
> Or is that completely nonsensical, and/or does it violate the PCIe spec?


There is some tension here, in that one way to make guest boot faster
is to defer hotplug of devices until after it has booted.
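
As an illustration of that trade-off, a VMM can start the guest with only the
devices needed to boot and attach the rest once it is up, e.g. via the QEMU
monitor (backend and id names here are placeholders):

    (qemu) netdev_add user,id=net0
    (qemu) device_add virtio-net-pci,netdev=net0,id=nic0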

-- 
MST


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
  2020-09-18 12:34                     ` Michael S. Tsirkin
@ 2020-09-21 17:21                       ` Sean Christopherson
  0 siblings, 0 replies; 31+ messages in thread
From: Sean Christopherson @ 2020-09-21 17:21 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Vitaly Kuznetsov, Peter Xu, kvm, Paolo Bonzini, Wanpeng Li,
	Jim Mattson, Julia Suvorova, Andy Lutomirski, Andrew Jones,
	linux-kernel, Gerd Hoffmann

On Fri, Sep 18, 2020 at 08:34:37AM -0400, Michael S. Tsirkin wrote:
> On Fri, Sep 11, 2020 at 10:00:31AM -0700, Sean Christopherson wrote:
> > On Mon, Sep 07, 2020 at 07:32:23AM -0400, Michael S. Tsirkin wrote:
> > > I suspect microvms will need pci eventually. I would much rather KVM
> > > had an exit-less discovery mechanism in place by then because,
> > > learning from history, if it doesn't they will do some kind of
> > > hack on the kernel command line, and everyone will be stuck
> > > supporting that for years ...
> > 
> > Is it not an option for the VMM to "accurately" enumerate the number of buses?
> > E.g. if the VMM has devices on only bus 0, then enumerate that there is one
> > bus so that the guest doesn't try to probe devices that can't possibly exist.
> > Or is that completely nonsensical, and/or does it violate the PCIe spec?
> 
> 
> There is some tension here, in that one way to make guest boot faster
> is to defer hotplug of devices until after it booted.

Sorry, I didn't follow that, probably because my PCI knowledge is lacking.
What does device hotplug have to do with the number of buses enumerated to
the guest?

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2020-09-21 17:21 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-07 14:12 [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory Vitaly Kuznetsov
2020-08-07 14:12 ` [PATCH v2 1/3] KVM: x86: move kvm_vcpu_gfn_to_memslot() out of try_async_pf() Vitaly Kuznetsov
2020-08-14  1:40   ` Sean Christopherson
2020-09-01 14:15     ` Vitaly Kuznetsov
2020-09-04  3:47       ` Sean Christopherson
2020-08-07 14:12 ` [PATCH v2 2/3] KVM: x86: introduce KVM_MEM_PCI_HOLE memory Vitaly Kuznetsov
2020-08-14  2:31   ` Sean Christopherson
2020-08-14 14:30     ` Michael S. Tsirkin
2020-08-17 16:32       ` Sean Christopherson
2020-08-21  1:46         ` Michael S. Tsirkin
2020-08-22  3:19           ` Sean Christopherson
2020-09-01 14:39     ` Vitaly Kuznetsov
2020-09-03  9:35       ` Michael S. Tsirkin
2020-08-07 14:12 ` [PATCH v2 3/3] KVM: selftests: add KVM_MEM_PCI_HOLE test Vitaly Kuznetsov
2020-08-25 21:25 ` [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory Peter Xu
2020-09-01 14:43   ` Vitaly Kuznetsov
2020-09-01 20:00     ` Peter Xu
2020-09-02  8:59       ` Vitaly Kuznetsov
2020-09-04  6:12         ` Sean Christopherson
2020-09-04  7:29           ` Gerd Hoffmann
2020-09-04 16:00             ` Sean Christopherson
2020-09-07  8:37               ` Vitaly Kuznetsov
2020-09-07 11:32                 ` Michael S. Tsirkin
2020-09-11 17:00                   ` Sean Christopherson
2020-09-18 12:34                     ` Michael S. Tsirkin
2020-09-21 17:21                       ` Sean Christopherson
2020-09-07 10:52             ` Michael S. Tsirkin
2020-09-18  9:33               ` Gerd Hoffmann
2020-09-07 10:54           ` Michael S. Tsirkin
2020-09-18  8:49           ` Gerd Hoffmann
2020-09-04  7:20   ` Gerd Hoffmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).