kvm.vger.kernel.org archive mirror
* [PATCH v6 00/17] KVM: s390: pv: implement lazy destroy for reboot
@ 2021-12-03 16:57 Claudio Imbrenda
From: Claudio Imbrenda @ 2021-12-03 16:57 UTC
  To: kvm
  Cc: cohuck, borntraeger, frankja, thuth, pasic, david, linux-s390,
	linux-kernel

Previously, when a protected VM was rebooted or when it was shut down,
its memory was made unprotected, and then the protected VM itself was
destroyed. Looping over the whole address space can take some time,
considering the overhead of the various Ultravisor Calls (UVCs). This
means that a reboot or a shutdown would take a potentially long amount
of time, depending on the amount of used memory.

This patch series implements a deferred destroy mechanism for protected
guests. When a protected guest is destroyed, its memory can be cleared
in the background, allowing the guest to restart or terminate
significantly faster than before.

There are 2 possibilities when a protected VM is torn down:
* it still has an address space associated (reboot case)
* it does not have an address space anymore (shutdown case)

For the reboot case, two new commands are available for the
KVM_S390_PV_COMMAND:

KVM_PV_ASYNC_DISABLE_PREPARE: prepares the current protected VM for
asynchronous teardown. The current VM will then continue immediately
as non-protected. If a protected VM had already been set aside without
starting the teardown process, this call will fail. In this case the
userspace process should issue a normal KVM_PV_DISABLE.

KVM_PV_ASYNC_DISABLE: tears down the protected VM previously set aside
for asynchronous teardown. This PV command should ideally be issued by
userspace from a separate thread. If a fatal signal is received (or
the process terminates naturally), the command will terminate
immediately without completing.

The idea is that userspace should first issue the
KVM_PV_ASYNC_DISABLE_PREPARE command, and in case of success, create a
new thread and issue KVM_PV_ASYNC_DISABLE from there. This also allows
for proper accounting of the CPU time needed for the asynchronous
teardown.
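
The intended userspace flow can be sketched roughly as follows. This is
only an illustration against a toy in-process model, not the real
interface: struct pv_model, pv_command() and the PV_* enum values are
hypothetical stand-ins for the VM file descriptor and for
ioctl(vm_fd, KVM_S390_PV_COMMAND, ...), and the -EBUSY / -EINVAL return
values are assumptions, since the cover letter only says that the call
"will fail". The synchronous fallback when thread creation fails is
likewise an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command ids, named after the commands in this cover letter. */
enum pv_cmd {
	PV_DISABLE,
	PV_ASYNC_DISABLE_PREPARE,
	PV_ASYNC_DISABLE,
};

/* Toy model of the kernel-side state; NOT the real KVM data structures. */
struct pv_model {
	pthread_mutex_t lock;
	bool protected_vm;	/* the current VM is protected */
	bool set_aside;		/* a protected VM is waiting for teardown */
	int torn_down;		/* protected VMs torn down so far */
};

#define PV_MODEL_INIT { PTHREAD_MUTEX_INITIALIZER, true, false, 0 }

/* Stand-in for the PV command ioctl. Returns 0 or a negative errno (assumed). */
static int pv_command(struct pv_model *m, enum pv_cmd cmd)
{
	int ret = 0;

	assert(m != NULL);
	pthread_mutex_lock(&m->lock);
	switch (cmd) {
	case PV_ASYNC_DISABLE_PREPARE:
		if (!m->protected_vm) {
			ret = -EINVAL;
		} else if (m->set_aside) {
			ret = -EBUSY;	/* a VM is already set aside */
		} else {
			m->set_aside = true;
			m->protected_vm = false; /* continues as non-protected */
		}
		break;
	case PV_ASYNC_DISABLE:
		if (!m->set_aside) {
			ret = -EINVAL;
		} else {
			m->set_aside = false;
			m->torn_down++;
		}
		break;
	case PV_DISABLE:
		if (!m->protected_vm) {
			ret = -EINVAL;
		} else {
			m->protected_vm = false;
			m->torn_down++;
		}
		break;
	}
	pthread_mutex_unlock(&m->lock);
	return ret;
}

/* Runs in its own thread, so the teardown CPU time is charged to it. */
static void *teardown_thread(void *arg)
{
	pv_command(arg, PV_ASYNC_DISABLE);
	return NULL;
}

/*
 * Reboot path: try the asynchronous teardown first and fall back to a
 * normal disable if the prepare step fails. *tid is valid only when
 * *async is set to true.
 */
static int pv_reboot_teardown(struct pv_model *m, pthread_t *tid, bool *async)
{
	*async = false;
	if (pv_command(m, PV_ASYNC_DISABLE_PREPARE) == 0) {
		if (pthread_create(tid, NULL, teardown_thread, m) == 0) {
			*async = true;
			return 0;
		}
		/* No thread available: tear down synchronously instead. */
		return pv_command(m, PV_ASYNC_DISABLE);
	}
	return pv_command(m, PV_DISABLE);
}
```

A caller would invoke pv_reboot_teardown() at reboot time and continue
with the new guest right away. If *async was set, it joins the thread
later, or simply lets it be killed together with the process.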

This means that the same address space can contain memory belonging to
more than one protected guest, although only one of them will be
running; the others will not even have any CPUs.

The shutdown case should be dealt with in userspace (e.g. using
clone(CLONE_VM)).

A module parameter is also provided to disable the new functionality,
which is otherwise enabled by default. This should not be an issue
since the new functionality is opt-in anyway. The parameter is mainly
intended to aid debugging.

v5->v6
* completely reworked the series
* removed kernel thread for asynchronous teardown
* added new commands to KVM_S390_PV_COMMAND ioctl

v4->v5
* fixed and improved some patch descriptions
* added some comments to better explain what's going on
* use vma_lookup instead of find_vma
* rename is_protected to protected_count since now it's used as a counter

v3->v4
* added patch 2
* split patch 3
* removed the shutdown part -- will be a separate patch series
* moved the patch introducing the module parameter

v2->v3
* added definitions for CC return codes for the UVC instruction
* improved make_secure_pte:
  - renamed rc to cc
  - added comments to explain why returning -EAGAIN is ok
* fixed kvm_s390_pv_replace_asce and kvm_s390_pv_remove_old_asce:
  - renamed
  - added locking
  - moved to gmap.c
* do proper error management in do_secure_storage_access instead of
  trying again hoping to get a different exception
* fix outdated patch descriptions

v1->v2
* rebased on a more recent kernel
* improved/expanded some patch descriptions
* improved/expanded some comments
* added patch 1, which prevents stall notification when the system is
  under heavy load.
* rename some members of struct deferred_priv to improve readability
* avoid a use-after-free of the struct mm in case of shutdown
* add missing return when lazy destroy is disabled
* add support for OOM notifier

Claudio Imbrenda (17):
  KVM: s390: pv: leak the topmost page table when destroy fails
  KVM: s390: pv: handle secure storage violations for protected guests
  KVM: s390: pv: handle secure storage exceptions for normal guests
  KVM: s390: pv: refactor s390_reset_acc
  KVM: s390: pv: usage counter instead of flag
  KVM: s390: pv: add export before import
  KVM: s390: pv: module parameter to fence lazy destroy
  KVM: s390: pv: make kvm_s390_cpus_from_pv global
  KVM: s390: pv: clear the state without memset
  KVM: s390: pv: add mmu_notifier
  s390/mm: KVM: pv: when tearing down, try to destroy protected pages
  KVM: s390: pv: refactoring of kvm_s390_pv_deinit_vm
  KVM: s390: pv: cleanup leftover protected VMs if needed
  KVM: s390: pv: asynchronous destroy for reboot
  KVM: s390: pv: api documentation for asynchronous destroy
  KVM: s390: pv: add KVM_CAP_S390_PROT_REBOOT_ASYNC
  KVM: s390: pv: avoid export before import if possible

 Documentation/virt/kvm/api.rst      |  21 ++-
 arch/s390/include/asm/gmap.h        |  38 +++-
 arch/s390/include/asm/kvm_host.h    |   3 +
 arch/s390/include/asm/mmu.h         |   2 +-
 arch/s390/include/asm/mmu_context.h |   2 +-
 arch/s390/include/asm/pgtable.h     |  11 +-
 arch/s390/include/asm/uv.h          |   1 +
 arch/s390/kernel/uv.c               |  64 +++++++
 arch/s390/kvm/kvm-s390.c            |  59 ++++++-
 arch/s390/kvm/kvm-s390.h            |   3 +
 arch/s390/kvm/pv.c                  | 259 ++++++++++++++++++++++++++--
 arch/s390/mm/fault.c                |  20 ++-
 arch/s390/mm/gmap.c                 | 152 +++++++++++++---
 include/uapi/linux/kvm.h            |   3 +
 14 files changed, 591 insertions(+), 47 deletions(-)

-- 
2.31.1



Thread overview: 23+ messages
2021-12-03 16:57 [PATCH v6 00/17] KVM: s390: pv: implement lazy destroy for reboot Claudio Imbrenda
2021-12-03 16:57 ` [PATCH v6 01/17] KVM: s390: pv: leak the topmost page table when destroy fails Claudio Imbrenda
2021-12-03 16:57 ` [PATCH v6 02/17] KVM: s390: pv: handle secure storage violations for protected guests Claudio Imbrenda
2022-01-13  9:54   ` Janosch Frank
2021-12-03 16:58 ` [PATCH v6 03/17] KVM: s390: pv: handle secure storage exceptions for normal guests Claudio Imbrenda
2022-01-13  9:58   ` Janosch Frank
2021-12-03 16:58 ` [PATCH v6 04/17] KVM: s390: pv: refactor s390_reset_acc Claudio Imbrenda
2021-12-03 16:58 ` [PATCH v6 05/17] KVM: s390: pv: usage counter instead of flag Claudio Imbrenda
2021-12-03 16:58 ` [PATCH v6 06/17] KVM: s390: pv: add export before import Claudio Imbrenda
2021-12-03 16:58 ` [PATCH v6 07/17] KVM: s390: pv: module parameter to fence lazy destroy Claudio Imbrenda
2021-12-03 16:58 ` [PATCH v6 08/17] KVM: s390: pv: make kvm_s390_cpus_from_pv global Claudio Imbrenda
2021-12-03 16:58 ` [PATCH v6 09/17] KVM: s390: pv: clear the state without memset Claudio Imbrenda
2022-01-13 10:30   ` Janosch Frank
2021-12-03 16:58 ` [PATCH v6 10/17] KVM: s390: pv: add mmu_notifier Claudio Imbrenda
2021-12-04  2:32   ` kernel test robot
2021-12-03 16:58 ` [PATCH v6 11/17] s390/mm: KVM: pv: when tearing down, try to destroy protected pages Claudio Imbrenda
2022-01-13 10:38   ` Janosch Frank
2021-12-03 16:58 ` [PATCH v6 12/17] KVM: s390: pv: refactoring of kvm_s390_pv_deinit_vm Claudio Imbrenda
2021-12-03 16:58 ` [PATCH v6 13/17] KVM: s390: pv: cleanup leftover protected VMs if needed Claudio Imbrenda
2021-12-03 16:58 ` [PATCH v6 14/17] KVM: s390: pv: asynchronous destroy for reboot Claudio Imbrenda
2021-12-03 16:58 ` [PATCH v6 15/17] KVM: s390: pv: api documentation for asynchronous destroy Claudio Imbrenda
2021-12-03 16:58 ` [PATCH v6 16/17] KVM: s390: pv: add KVM_CAP_S390_PROT_REBOOT_ASYNC Claudio Imbrenda
2021-12-03 16:58 ` [PATCH v6 17/17] KVM: s390: pv: avoid export before import if possible Claudio Imbrenda
