All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paul Mackerras <paulus@ozlabs.org>
To: kvm@vger.kernel.org, kvm-ppc@vger.kernel.org
Cc: linuxppc-dev@ozlabs.org, David Gibson <david@gibson.dropbear.id.au>
Subject: [PATCH v5 00/33] KVM: PPC: Book3S HV: Nested HV virtualization
Date: Mon,  8 Oct 2018 16:30:46 +1100	[thread overview]
Message-ID: <1538976679-1363-1-git-send-email-paulus@ozlabs.org> (raw)

This patch series implements nested virtualization in the KVM-HV
module for radix guests on POWER9 systems.  Unlike PR KVM, nested
guests are able to run in supervisor mode, meaning that performance is
much better than with PR KVM, and is very close to the performance of
a non-nested guests for most things.

The way this works is that each nested guest is also a guest of the
real hypervisor, also known as the level 0 or L0 hypervisor, which
runs in the CPU's hypervisor mode.  Its guests are at level 1, and
when a L1 system wants to run a nested guest, it performs hypercalls
to L0 to set up a virtual partition table in its (L1's) memory and to
enter the L2 guest.  The L0 hypervisor maintains a shadow
partition-scoped page table for the L2 guest and demand-faults entries
into it by translating the L1 real addresses in the partition-scoped
page table in L1 memory into L0 real addresses and puts them in the
shadow partition-scoped page table for L2.

Essentially what this is doing is providing L1 with the ability to do
(some) hypervisor functions using paravirtualization; optionally,
TLB invalidations can be done through emulation of the tlbie
instruction rather than a hypercall.

Along the way, this implements a new guest entry/exit path for radix
guests on POWER9 systems which is written almost entirely in C and
does not do any of the inter-thread coordination that the existing
entry/exit path does.  It is only used for radix guests and when
indep_threads_mode=Y (the default).

The limitations of this scheme are:

- Host and all nested hypervisors and their guests must be in radix
  mode.

- Nested hypervisors cannot use indep_threads_mode=N.

- If the host (i.e. the L0 hypervisor) has indep_threads_mode=N then
  only one nested vcpu can be run on any core at any given time; the
  secondary threads will do nothing.

- A nested hypervisor can't use a smaller page size than the base page
  size of the hypervisor(s) above it.

- A nested hypervisor is limited to having at most 1023 guests below
  it, each of which can have at most NR_CPUS virtual CPUs (and the
  virtual CPU ids have to be < NR_CPUS as well).

This patch series is against the kvm tree's next branch.

Changes in this version since version 4:

- Added KVM_PPC_NO_HASH to flags field of struct kvm_ppc_smmu_info rather
  than disabling the KVM_PPC_GET_SMMU_INFO ioctl entirely.

- Make sure the hypercalls for controlling nested guests will fail if
  the guest is in HPT mode.

- Made the KVM_CAP_PPC_MMU_HASH_V3 capability report false in a nested
  hypervisor.

- Fixed a bug causing HPT guests to fail to execute real-mode
  hypercalls correctly.

- Fix crashes seen on VM exit or when switching from HPT to radix,
  due to leftover rmap values.

- Removed enable/disable flag from KVM_ENABLE_CAP on the
  KVM_CAP_PPC_NESTED_HV capability; it always enables.

- Fixed a bug causing memory corruption on nested guest startup,
  and a bug where we were never clearing bits in the cpu_in_guest
  cpumask.

- Simplified some code in kvmhv_run_single_vcpu.

Paul.

 Documentation/virtual/kvm/api.txt                  |   19 +
 arch/powerpc/include/asm/asm-prototypes.h          |   21 +
 arch/powerpc/include/asm/book3s/64/mmu-hash.h      |   12 +
 .../powerpc/include/asm/book3s/64/tlbflush-radix.h |    1 +
 arch/powerpc/include/asm/hvcall.h                  |   41 +
 arch/powerpc/include/asm/kvm_asm.h                 |    4 +-
 arch/powerpc/include/asm/kvm_book3s.h              |   45 +-
 arch/powerpc/include/asm/kvm_book3s_64.h           |  118 +-
 arch/powerpc/include/asm/kvm_book3s_asm.h          |    3 +
 arch/powerpc/include/asm/kvm_booke.h               |    4 +-
 arch/powerpc/include/asm/kvm_host.h                |   16 +-
 arch/powerpc/include/asm/kvm_ppc.h                 |    4 +
 arch/powerpc/include/asm/ppc-opcode.h              |    1 +
 arch/powerpc/include/asm/reg.h                     |    2 +
 arch/powerpc/include/uapi/asm/kvm.h                |    1 +
 arch/powerpc/kernel/asm-offsets.c                  |    5 +-
 arch/powerpc/kernel/cpu_setup_power.S              |    4 +-
 arch/powerpc/kvm/Makefile                          |    3 +-
 arch/powerpc/kvm/book3s.c                          |   43 +-
 arch/powerpc/kvm/book3s_64_mmu_hv.c                |    7 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c             |  718 ++++++++---
 arch/powerpc/kvm/book3s_emulate.c                  |   13 +-
 arch/powerpc/kvm/book3s_hv.c                       |  852 ++++++++++++-
 arch/powerpc/kvm/book3s_hv_builtin.c               |   92 +-
 arch/powerpc/kvm/book3s_hv_interrupts.S            |   95 +-
 arch/powerpc/kvm/book3s_hv_nested.c                | 1291 ++++++++++++++++++++
 arch/powerpc/kvm/book3s_hv_ras.c                   |   10 +
 arch/powerpc/kvm/book3s_hv_rm_xics.c               |   13 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S            |  823 +++++++------
 arch/powerpc/kvm/book3s_hv_tm.c                    |    6 +-
 arch/powerpc/kvm/book3s_hv_tm_builtin.c            |    5 +-
 arch/powerpc/kvm/book3s_pr.c                       |    5 +-
 arch/powerpc/kvm/book3s_xics.c                     |   14 +-
 arch/powerpc/kvm/book3s_xive.c                     |   63 +
 arch/powerpc/kvm/book3s_xive_template.c            |    8 -
 arch/powerpc/kvm/bookehv_interrupts.S              |    8 +-
 arch/powerpc/kvm/emulate_loadstore.c               |    1 -
 arch/powerpc/kvm/powerpc.c                         |   15 +-
 arch/powerpc/kvm/tm.S                              |  250 ++--
 arch/powerpc/kvm/trace_book3s.h                    |    1 -
 arch/powerpc/mm/tlb-radix.c                        |    9 +
 include/uapi/linux/kvm.h                           |    2 +
 tools/perf/arch/powerpc/util/book3s_hv_exits.h     |    1 -
 43 files changed, 3806 insertions(+), 843 deletions(-)

WARNING: multiple messages have this Message-ID (diff)
From: Paul Mackerras <paulus@ozlabs.org>
To: kvm@vger.kernel.org, kvm-ppc@vger.kernel.org
Cc: linuxppc-dev@ozlabs.org, David Gibson <david@gibson.dropbear.id.au>
Subject: [PATCH v5 00/33] KVM: PPC: Book3S HV: Nested HV virtualization
Date: Mon, 08 Oct 2018 05:30:46 +0000	[thread overview]
Message-ID: <1538976679-1363-1-git-send-email-paulus@ozlabs.org> (raw)

This patch series implements nested virtualization in the KVM-HV
module for radix guests on POWER9 systems.  Unlike PR KVM, nested
guests are able to run in supervisor mode, meaning that performance is
much better than with PR KVM, and is very close to the performance of
a non-nested guests for most things.

The way this works is that each nested guest is also a guest of the
real hypervisor, also known as the level 0 or L0 hypervisor, which
runs in the CPU's hypervisor mode.  Its guests are at level 1, and
when a L1 system wants to run a nested guest, it performs hypercalls
to L0 to set up a virtual partition table in its (L1's) memory and to
enter the L2 guest.  The L0 hypervisor maintains a shadow
partition-scoped page table for the L2 guest and demand-faults entries
into it by translating the L1 real addresses in the partition-scoped
page table in L1 memory into L0 real addresses and puts them in the
shadow partition-scoped page table for L2.

Essentially what this is doing is providing L1 with the ability to do
(some) hypervisor functions using paravirtualization; optionally,
TLB invalidations can be done through emulation of the tlbie
instruction rather than a hypercall.

Along the way, this implements a new guest entry/exit path for radix
guests on POWER9 systems which is written almost entirely in C and
does not do any of the inter-thread coordination that the existing
entry/exit path does.  It is only used for radix guests and when
indep_threads_mode=Y (the default).

The limitations of this scheme are:

- Host and all nested hypervisors and their guests must be in radix
  mode.

- Nested hypervisors cannot use indep_threads_mode=N.

- If the host (i.e. the L0 hypervisor) has indep_threads_mode=N then
  only one nested vcpu can be run on any core at any given time; the
  secondary threads will do nothing.

- A nested hypervisor can't use a smaller page size than the base page
  size of the hypervisor(s) above it.

- A nested hypervisor is limited to having at most 1023 guests below
  it, each of which can have at most NR_CPUS virtual CPUs (and the
  virtual CPU ids have to be < NR_CPUS as well).

This patch series is against the kvm tree's next branch.

Changes in this version since version 4:

- Added KVM_PPC_NO_HASH to flags field of struct kvm_ppc_smmu_info rather
  than disabling the KVM_PPC_GET_SMMU_INFO ioctl entirely.

- Make sure the hypercalls for controlling nested guests will fail if
  the guest is in HPT mode.

- Made the KVM_CAP_PPC_MMU_HASH_V3 capability report false in a nested
  hypervisor.

- Fixed a bug causing HPT guests to fail to execute real-mode
  hypercalls correctly.

- Fix crashes seen on VM exit or when switching from HPT to radix,
  due to leftover rmap values.

- Removed enable/disable flag from KVM_ENABLE_CAP on the
  KVM_CAP_PPC_NESTED_HV capability; it always enables.

- Fixed a bug causing memory corruption on nested guest startup,
  and a bug where we were never clearing bits in the cpu_in_guest
  cpumask.

- Simplified some code in kvmhv_run_single_vcpu.

Paul.

 Documentation/virtual/kvm/api.txt                  |   19 +
 arch/powerpc/include/asm/asm-prototypes.h          |   21 +
 arch/powerpc/include/asm/book3s/64/mmu-hash.h      |   12 +
 .../powerpc/include/asm/book3s/64/tlbflush-radix.h |    1 +
 arch/powerpc/include/asm/hvcall.h                  |   41 +
 arch/powerpc/include/asm/kvm_asm.h                 |    4 +-
 arch/powerpc/include/asm/kvm_book3s.h              |   45 +-
 arch/powerpc/include/asm/kvm_book3s_64.h           |  118 +-
 arch/powerpc/include/asm/kvm_book3s_asm.h          |    3 +
 arch/powerpc/include/asm/kvm_booke.h               |    4 +-
 arch/powerpc/include/asm/kvm_host.h                |   16 +-
 arch/powerpc/include/asm/kvm_ppc.h                 |    4 +
 arch/powerpc/include/asm/ppc-opcode.h              |    1 +
 arch/powerpc/include/asm/reg.h                     |    2 +
 arch/powerpc/include/uapi/asm/kvm.h                |    1 +
 arch/powerpc/kernel/asm-offsets.c                  |    5 +-
 arch/powerpc/kernel/cpu_setup_power.S              |    4 +-
 arch/powerpc/kvm/Makefile                          |    3 +-
 arch/powerpc/kvm/book3s.c                          |   43 +-
 arch/powerpc/kvm/book3s_64_mmu_hv.c                |    7 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c             |  718 ++++++++---
 arch/powerpc/kvm/book3s_emulate.c                  |   13 +-
 arch/powerpc/kvm/book3s_hv.c                       |  852 ++++++++++++-
 arch/powerpc/kvm/book3s_hv_builtin.c               |   92 +-
 arch/powerpc/kvm/book3s_hv_interrupts.S            |   95 +-
 arch/powerpc/kvm/book3s_hv_nested.c                | 1291 ++++++++++++++++++++
 arch/powerpc/kvm/book3s_hv_ras.c                   |   10 +
 arch/powerpc/kvm/book3s_hv_rm_xics.c               |   13 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S            |  823 +++++++------
 arch/powerpc/kvm/book3s_hv_tm.c                    |    6 +-
 arch/powerpc/kvm/book3s_hv_tm_builtin.c            |    5 +-
 arch/powerpc/kvm/book3s_pr.c                       |    5 +-
 arch/powerpc/kvm/book3s_xics.c                     |   14 +-
 arch/powerpc/kvm/book3s_xive.c                     |   63 +
 arch/powerpc/kvm/book3s_xive_template.c            |    8 -
 arch/powerpc/kvm/bookehv_interrupts.S              |    8 +-
 arch/powerpc/kvm/emulate_loadstore.c               |    1 -
 arch/powerpc/kvm/powerpc.c                         |   15 +-
 arch/powerpc/kvm/tm.S                              |  250 ++--
 arch/powerpc/kvm/trace_book3s.h                    |    1 -
 arch/powerpc/mm/tlb-radix.c                        |    9 +
 include/uapi/linux/kvm.h                           |    2 +
 tools/perf/arch/powerpc/util/book3s_hv_exits.h     |    1 -
 43 files changed, 3806 insertions(+), 843 deletions(-)

             reply	other threads:[~2018-10-08  5:30 UTC|newest]

Thread overview: 83+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-08  5:30 Paul Mackerras [this message]
2018-10-08  5:30 ` [PATCH v5 00/33] KVM: PPC: Book3S HV: Nested HV virtualization Paul Mackerras
2018-10-08  5:30 ` [PATCH v5 01/33] powerpc: Turn off CPU_FTR_P9_TM_HV_ASSIST in non-hypervisor mode Paul Mackerras
2018-10-08  5:30   ` Paul Mackerras
2018-10-08  5:30 ` [PATCH v5 02/33] KVM: PPC: Book3S: Simplify external interrupt handling Paul Mackerras
2018-10-08  5:30   ` Paul Mackerras
2018-10-08  5:30 ` [PATCH v5 03/33] KVM: PPC: Book3S HV: Remove left-over code in XICS-on-XIVE emulation Paul Mackerras
2018-10-08  5:30   ` Paul Mackerras
2018-10-08  5:30 ` [PATCH v5 04/33] KVM: PPC: Book3S HV: Move interrupt delivery on guest entry to C code Paul Mackerras
2018-10-08  5:30   ` Paul Mackerras
2018-10-08  5:30 ` [PATCH v5 05/33] KVM: PPC: Book3S HV: Extract PMU save/restore operations as C-callable functions Paul Mackerras
2018-10-08  5:30   ` Paul Mackerras
2018-10-08  8:16   ` Madhavan Srinivasan
2018-10-08  5:30 ` [PATCH v5 06/33] KVM: PPC: Book3S HV: Simplify real-mode interrupt handling Paul Mackerras
2018-10-08  5:30   ` Paul Mackerras
2018-10-09  0:05   ` David Gibson
2018-10-09  0:05     ` David Gibson
2018-10-08  5:30 ` [PATCH v5 07/33] KVM: PPC: Book3S: Rework TM save/restore code and make it C-callable Paul Mackerras
2018-10-08  5:30   ` Paul Mackerras
2018-10-08  5:30 ` [PATCH v5 08/33] KVM: PPC: Book3S HV: Call kvmppc_handle_exit_hv() with vcore unlocked Paul Mackerras
2018-10-08  5:30   ` Paul Mackerras
2018-10-08  5:30 ` [PATCH v5 09/33] KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests Paul Mackerras
2018-10-08  5:30   ` Paul Mackerras
2018-10-09  0:20   ` David Gibson
2018-10-09  0:20     ` David Gibson
2018-10-08  5:30 ` [PATCH v5 10/33] KVM: PPC: Book3S HV: Handle hypervisor instruction faults better Paul Mackerras
2018-10-08  5:30   ` Paul Mackerras
2018-10-08  5:30 ` [PATCH v5 11/33] KVM: PPC: Book3S HV: Add a debugfs file to dump radix mappings Paul Mackerras
2018-10-08  5:30   ` Paul Mackerras
2018-10-08  5:30 ` [PATCH v5 12/33] KVM: PPC: Use ccr field in pt_regs struct embedded in vcpu struct Paul Mackerras
2018-10-08  5:30   ` Paul Mackerras
2018-10-08  5:30 ` [PATCH v5 13/33] KVM: PPC: Book3S HV: Clear partition table entry on vm teardown Paul Mackerras
2018-10-08  5:30   ` Paul Mackerras
2018-10-08  5:31 ` [PATCH v5 14/33] KVM: PPC: Book3S HV: Make kvmppc_mmu_radix_xlate process/partition table agnostic Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-08  5:31 ` [PATCH v5 15/33] KVM: PPC: Book3S HV: Refactor radix page fault handler Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-08  5:31 ` [PATCH v5 16/33] KVM: PPC: Book3S HV: Use kvmppc_unmap_pte() in kvm_unmap_radix() Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-08  5:31 ` [PATCH v5 17/33] KVM: PPC: Book3S HV: Framework and hcall stubs for nested virtualization Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-08 23:30   ` David Gibson
2018-10-08 23:30     ` David Gibson
2018-10-08  5:31 ` [PATCH v5 18/33] KVM: PPC: Book3S HV: Nested guest entry via hypercall Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-08  5:31 ` [PATCH v5 19/33] KVM: PPC: Book3S HV: Use XICS hypercalls when running as a nested hypervisor Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-08  5:31 ` [PATCH v5 20/33] KVM: PPC: Book3S HV: Handle hypercalls correctly when nested Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-08  5:31 ` [PATCH v5 21/33] KVM: PPC: Book3S HV: Handle page fault for a nested guest Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-08  5:31 ` [PATCH v5 22/33] KVM: PPC: Book3S HV: Introduce rmap to track nested guest mappings Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-09  0:26   ` David Gibson
2018-10-09  0:26     ` David Gibson
2018-10-08  5:31 ` [PATCH v5 23/33] KVM: PPC: Book3S HV: Implement H_TLB_INVALIDATE hcall Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-08  5:31 ` [PATCH v5 24/33] KVM: PPC: Book3S HV: Use hypercalls for TLB invalidation when nested Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-08  5:31 ` [PATCH v5 25/33] KVM: PPC: Book3S HV: Invalidate TLB when nested vcpu moves physical cpu Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-08  5:31 ` [PATCH v5 26/33] KVM: PPC: Book3S HV: Don't access HFSCR, LPIDR or LPCR when running nested Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-08  5:31 ` [PATCH v5 27/33] KVM: PPC: Book3S HV: Add one-reg interface to virtual PTCR register Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-08  5:31 ` [PATCH v5 28/33] KVM: PPC: Book3S HV: Sanitise hv_regs on nested guest entry Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-08  5:31 ` [PATCH v5 29/33] KVM: PPC: Book3S HV: Handle differing endianness for H_ENTER_NESTED Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-08  5:31 ` [PATCH v5 30/33] KVM: PPC: Book3S HV: Allow HV module to load without hypervisor mode Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-08 23:31   ` David Gibson
2018-10-08 23:31     ` David Gibson
2018-10-08  5:31 ` [PATCH v5 31/33] KVM: PPC: Book3S HV: Add nested shadow page tables to debugfs Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-08  5:31 ` [PATCH v5 32/33] KVM: PPC: Book3S HV: Add a VM capability to enable nested virtualization Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-08 23:34   ` David Gibson
2018-10-08 23:34     ` David Gibson
2018-10-08  5:31 ` [PATCH v5 33/33] KVM: PPC: Book3S HV: Add NO_HASH flag to GET_SMMU_INFO ioctl result Paul Mackerras
2018-10-08  5:31   ` Paul Mackerras
2018-10-08 23:34   ` David Gibson
2018-10-08 23:34     ` David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1538976679-1363-1-git-send-email-paulus@ozlabs.org \
    --to=paulus@ozlabs.org \
    --cc=david@gibson.dropbear.id.au \
    --cc=kvm-ppc@vger.kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linuxppc-dev@ozlabs.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.