* [RFC PATCH 00/27] KVM/arm64: A stage 2 for the host
@ 2020-11-17 18:15 Quentin Perret
From: Quentin Perret @ 2020-11-17 18:15 UTC
  To: Catalin Marinas, Will Deacon, Marc Zyngier, James Morse,
	Julien Thierry, Suzuki K Poulose, Rob Herring, Frank Rowand
  Cc: moderated list:ARM64 PORT (AARCH64 ARCHITECTURE),
	open list, open list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64),
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, kernel-team,
	android-kvm, Quentin Perret

Hi all,

This RFC series provides the infrastructure needed to wrap the host
kernel with a stage 2 when running KVM in nVHE. This can be useful for
several use-cases, but the primary motivation is to (eventually) be able
to protect guest memory from the host kernel. More details about the
overall idea, design, and motivations can be found in Will's talk at KVM
Forum 2020 [1], or the pKVM talk at the Android uconf during LPC 2020
[2].

This series essentially gets us to a point where the 'VM' bit is set
in the host's HCR_EL2 when running in nVHE and if 'kvm-arm.protected'
is set on the kernel command line. The EL2 object directly handles
memory aborts from the host and entirely manages the host's stage 2 page
table.
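
As a rough illustration of the condition described above (not code from
the series), the decision to set the 'VM' bit can be sketched as below.
HCR_EL2.VM is bit 0 of the register; the helper name host_hcr_el2() and
its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* HCR_EL2.VM is bit 0: it enables stage 2 translation for EL1/EL0. */
#define HCR_VM	(UINT64_C(1) << 0)

/*
 * Hypothetical helper (not from the series): compute the HCR_EL2 value
 * used while the host runs, from a set of base flags and the boot
 * configuration.
 */
uint64_t host_hcr_el2(uint64_t base_flags, bool vhe, bool protected_mode)
{
	uint64_t hcr = base_flags;

	/* Only nVHE with 'kvm-arm.protected' wraps the host with a stage 2. */
	if (!vhe && protected_mode)
		hcr |= HCR_VM;

	return hcr;
}
```

With this sketch, a VHE boot or a boot without 'kvm-arm.protected'
leaves the base flags untouched.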

However, this series does _not_ provide any real user for this (yet)
and simply idmaps everything into the host stage 2 as RWX cacheable.
This is all about the infrastructure for now, so clearly not ready for
inclusion upstream yet (hence the RFC tag), but the foundation is there and
I thought it'd be useful to start a discussion with the community early
as this is a rather intrusive change. So, here goes.
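
To make "idmaps everything as RWX cacheable" concrete, here is a toy
sketch of an identity mapping, where every input address translates to
the same output address. It assumes a flat, single-level table indexed
by page frame number; the struct, function names and attribute bits are
all hypothetical and are not the architectural stage 2 encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PG_SHIFT	12
#define PG_SIZE		(UINT64_C(1) << PG_SHIFT)
#define PG_MASK		(PG_SIZE - 1)

/* Illustrative attribute bits, NOT the architectural stage 2 encodings. */
#define S2_VALID	(UINT64_C(1) << 0)
#define S2_R		(UINT64_C(1) << 1)
#define S2_W		(UINT64_C(1) << 2)
#define S2_X		(UINT64_C(1) << 3)
#define S2_CACHEABLE	(UINT64_C(1) << 4)
#define S2_RWX_CACHED	(S2_VALID | S2_R | S2_W | S2_X | S2_CACHEABLE)

/* Hypothetical flat table: one entry per page, indexed by page frame number. */
struct toy_s2 {
	uint64_t *entries;
	size_t nr_entries;
};

/* Identity-map [start, end): each page maps to itself, RWX and cacheable. */
int toy_s2_idmap(struct toy_s2 *s2, uint64_t start, uint64_t end)
{
	uint64_t addr;

	assert(s2 && s2->entries);
	if (((start | end) & PG_MASK) || start > end)
		return -1;
	if ((end >> PG_SHIFT) > s2->nr_entries)
		return -1;

	for (addr = start; addr < end; addr += PG_SIZE)
		s2->entries[addr >> PG_SHIFT] = addr | S2_RWX_CACHED;

	return 0;
}

/* Translate an input address; returns 0 and fills *out on success. */
int toy_s2_translate(const struct toy_s2 *s2, uint64_t in, uint64_t *out)
{
	uint64_t idx = in >> PG_SHIFT;
	uint64_t pte;

	if (idx >= s2->nr_entries)
		return -1;

	pte = s2->entries[idx];
	if (!(pte & S2_VALID))
		return -1;

	*out = (pte & ~PG_MASK) | (in & PG_MASK);
	return 0;
}
```

A real stage 2 is a multi-level tree and would typically use block
mappings where possible; the flat table only shows the idmap property.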

One of the interesting requirements that comes with the series is that
managing page-tables requires some sort of memory allocator at EL2 to
allocate, refcount and free memory pages. Clearly, none of that is
currently possible in nVHE, so a significant chunk of the series is
dedicated to solving that problem. The proposed EL2 memory allocator
mimics Linux's buddy system in principle, and re-uses some of the arm64
mm design choices. Specifically, it uses a vmemmap at EL2 which contains
a set of struct hyp_page entries to hold page metadata. To support
this, I extended the EL2 object to make it manage its own stage 1
page-table in addition to host stage 2. This simplifies the hyp_vmemmap
creation and was going to be required anyway for the protected VM
use-case -- the threat model implies the host cannot be trusted after
boot, and it will thus be crucial to ensure it cannot map arbitrary code
at EL2.
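
For readers unfamiliar with buddy allocators, the following is a minimal
sketch of the general technique, with per-page metadata kept in an array
that plays the role of a vmemmap. It is not the allocator from the
series: struct toy_page, its fields, and all function names are
hypothetical, and free blocks are found by scanning rather than through
per-order free lists to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ORDER	4u			/* largest block: 2^4 pages */
#define TOY_NR_PAGES	(1u << TOY_MAX_ORDER)
#define TOY_NO_PAGE	((size_t)-1)

/* Hypothetical per-page metadata, one entry per page in the pool. */
struct toy_page {
	bool free_head;		/* first page of a free block */
	unsigned int order;	/* block size is 2^order pages (head only) */
	unsigned int refcount;	/* users of an allocated block (head only) */
};

struct toy_page toy_vmemmap[TOY_NR_PAGES];

void toy_pool_init(void)
{
	for (size_t i = 0; i < TOY_NR_PAGES; i++)
		toy_vmemmap[i] = (struct toy_page){ 0 };
	/* The whole pool starts out as one maximal free block. */
	toy_vmemmap[0].free_head = true;
	toy_vmemmap[0].order = TOY_MAX_ORDER;
}

/* The buddy of a 2^order block differs from it only in bit 'order'. */
size_t toy_buddy(size_t idx, unsigned int order)
{
	return idx ^ ((size_t)1 << order);
}

size_t toy_alloc(unsigned int order)
{
	unsigned int o;
	size_t idx;

	/* Find the smallest free block that is large enough. */
	for (o = order; o <= TOY_MAX_ORDER; o++)
		for (idx = 0; idx < TOY_NR_PAGES; idx += (size_t)1 << o)
			if (toy_vmemmap[idx].free_head && toy_vmemmap[idx].order == o)
				goto found;
	return TOY_NO_PAGE;

found:
	toy_vmemmap[idx].free_head = false;
	/* Split down to the requested size, releasing each upper half. */
	while (o > order) {
		size_t buddy = toy_buddy(idx, --o);

		toy_vmemmap[buddy].free_head = true;
		toy_vmemmap[buddy].order = o;
	}
	toy_vmemmap[idx].order = order;
	toy_vmemmap[idx].refcount = 1;
	return idx;
}

void toy_get(size_t idx)
{
	toy_vmemmap[idx].refcount++;
}

void toy_put(size_t idx)
{
	unsigned int order = toy_vmemmap[idx].order;

	assert(toy_vmemmap[idx].refcount > 0);
	if (--toy_vmemmap[idx].refcount)
		return;

	/* Coalesce while the buddy is a free block of the same size. */
	while (order < TOY_MAX_ORDER) {
		size_t buddy = toy_buddy(idx, order);

		if (!toy_vmemmap[buddy].free_head || toy_vmemmap[buddy].order != order)
			break;
		toy_vmemmap[buddy].free_head = false;
		if (buddy < idx)
			idx = buddy;
		order++;
	}
	toy_vmemmap[idx].free_head = true;
	toy_vmemmap[idx].order = order;
}
```

The XOR in toy_buddy() is what makes the metadata array convenient:
finding and inspecting a block's buddy is a constant-time index
computation, with no search.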

The pool of memory pages used by the EL2 allocator is reserved by the
host early during boot (while it is still trusted) using the memblock
API, and is donated to EL2 during KVM init. The current assumption is
that the host reserves enough pages to allow the EL2 object to map all
of memory at page granularity for both hyp stage 1 and host stage 2,
plus some extra pages for device mappings.
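
To give a feel for the size of such a reservation, the arithmetic can be
sketched as follows. This is an estimate under stated assumptions, not
the computation used in the series: 4 KiB pages with 512 entries per
table, a contiguous and table-aligned memory range, and the same number
of levels for both hyp stage 1 and host stage 2. The function names and
the device_extra parameter are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages and 8-byte descriptors, so 512 entries per table. */
#define TOY_PTRS_PER_TABLE	512u

/*
 * Table pages needed to map nr_pages contiguous, table-aligned pages at
 * page granularity with a tree of 'levels' levels. Holes and unaligned
 * regions need more, so this is a lower bound for real memory layouts.
 */
uint64_t toy_pgtable_pages(uint64_t nr_pages, unsigned int levels)
{
	uint64_t total = 0;
	unsigned int i;

	assert(levels > 0);
	for (i = 0; i < levels; i++) {
		/* Each table covers up to 512 entries of the level below. */
		nr_pages = (nr_pages + TOY_PTRS_PER_TABLE - 1) / TOY_PTRS_PER_TABLE;
		total += nr_pages;
	}
	return total;
}

/* Hyp stage 1 plus host stage 2, plus a caller-chosen margin for devices. */
uint64_t toy_hyp_reserve_pages(uint64_t nr_pages, unsigned int levels,
			       uint64_t device_extra)
{
	return 2 * toy_pgtable_pages(nr_pages, levels) + device_extra;
}
```

Under these assumptions, 1 GiB of memory (262144 pages) with 4 levels
needs 512 + 1 + 1 + 1 = 515 table pages per page table, i.e. roughly
2 MiB of tables per GiB mapped.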

On top of that the series introduces a few smaller features that are
needed along the way, but hopefully all of those are detailed properly
in the relevant commit messages.

And as a last note, I'd like to point out that, at this point, there are
still trivial ways for the host to circumvent its stage 2 protection. It
still owns the guests' stage 2, for example, meaning that nothing would
prevent a malicious host from using a guest as a proxy to access
protected memory, _yet_. This series lays the groundwork for future work
to address these things, which will clearly require a stage 2 over the
host at some point, so I just wanted to set expectations correctly.

With all that in mind, the series is organized as follows:

 - patches 01-03 provide EL2 with some utility libraries needed for
   memory management and synchronization;

 - patches 04-09 mostly refactor small portions of the code to ease the
   EL2 memory management;

 - patches 10-17 add the actual EL2 memory management code, as well as
   the setup/bootstrap code on the KVM init path;

 - patches 18-24 refactor the existing stage 2 management code to make
   it re-usable from the EL2 object;

 - and finally patches 25-27 introduce the host stage 2 and the trap
   handling logic at EL2.
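
As background for the synchronization utilities in patches 01-03, which
include a standalone ticket spinlock, here is a minimal sketch of the
ticket lock technique using C11 atomics. It is not the implementation
from the series; the struct and function names are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical ticket lock: waiters are served in arrival order. */
struct toy_ticket_lock {
	atomic_uint next;	/* next ticket to hand out */
	atomic_uint owner;	/* ticket currently being served */
};

void toy_lock_init(struct toy_ticket_lock *l)
{
	atomic_init(&l->next, 0);
	atomic_init(&l->owner, 0);
}

void toy_lock(struct toy_ticket_lock *l)
{
	/* Take a ticket; the fetch-and-add makes each ticket unique. */
	unsigned int ticket = atomic_fetch_add_explicit(&l->next, 1,
							memory_order_relaxed);

	/* Spin until our ticket is served; acquire pairs with the unlock. */
	while (atomic_load_explicit(&l->owner, memory_order_acquire) != ticket)
		;
}

void toy_unlock(struct toy_ticket_lock *l)
{
	/* Only the lock holder writes 'owner', so a relaxed load is enough. */
	unsigned int owner = atomic_load_explicit(&l->owner,
						  memory_order_relaxed);

	/* Release publishes the critical section to the next ticket holder. */
	atomic_store_explicit(&l->owner, owner + 1, memory_order_release);
}
```

The appeal of a ticket lock in this kind of environment is fairness:
each CPU is served strictly in the order it took its ticket.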

This work is based on the latest kvmarm/queue (which includes Marc's
host EL2 entry rework [3], as well as Will's guest vector refactoring
[4]) + David's PSCI proxying series [5].

And if you'd like a branch that has all the bits and pieces:

    https://android-kvm.googlesource.com/linux qperret/host-stage2

Boot-tested (host and guest) using qemu in VHE and nVHE, and on real
hardware on an AML-S905X-CC (Le Potato).

Thanks,
Quentin

[1] https://kvmforum2020.sched.com/event/eE24/virtualization-for-the-masses-exposing-kvm-on-android-will-deacon-google
[2] https://youtu.be/54q6RzS9BpQ?t=10859
[3] https://lore.kernel.org/kvmarm/20201109175923.445945-1-maz@kernel.org/
[4] https://lore.kernel.org/kvmarm/20201113113847.21619-1-will@kernel.org/
[5] https://lore.kernel.org/kvmarm/20201116204318.63987-1-dbrazdil@google.com/


Quentin Perret (24):
  KVM: arm64: Initialize kvm_nvhe_init_params early
  KVM: arm64: Avoid free_page() in page-table allocator
  KVM: arm64: Factor memory allocation out of pgtable.c
  KVM: arm64: Introduce a BSS section for use at Hyp
  KVM: arm64: Make kvm_call_hyp() a function call at Hyp
  KVM: arm64: Allow using kvm_nvhe_sym() in hyp code
  KVM: arm64: Introduce an early Hyp page allocator
  KVM: arm64: Stub CONFIG_DEBUG_LIST at Hyp
  KVM: arm64: Introduce a Hyp buddy page allocator
  KVM: arm64: Enable access to sanitized CPU features at EL2
  KVM: arm64: Factor out vector address calculation
  of/fdt: Introduce early_init_dt_add_memory_hyp()
  KVM: arm64: Prepare Hyp memory protection
  KVM: arm64: Elevate Hyp mappings creation at EL2
  KVM: arm64: Use kvm_arch for stage 2 pgtable
  KVM: arm64: Use kvm_arch in kvm_s2_mmu
  KVM: arm64: Set host stage 2 using kvm_nvhe_init_params
  KVM: arm64: Refactor kvm_arm_setup_stage2()
  KVM: arm64: Refactor __load_guest_stage2()
  KVM: arm64: Refactor __populate_fault_info()
  KVM: arm64: Make memcache anonymous in pgtable allocator
  KVM: arm64: Reserve memory for host stage 2
  KVM: arm64: Sort the memblock regions list
  KVM: arm64: Wrap the host with a stage 2

Will Deacon (3):
  arm64: lib: Annotate {clear,copy}_page() as position-independent
  KVM: arm64: Link position-independent string routines into .hyp.text
  KVM: arm64: Add standalone ticket spinlock implementation for use at
    hyp

 arch/arm64/include/asm/cpufeature.h           |   1 +
 arch/arm64/include/asm/hyp_image.h            |   4 +
 arch/arm64/include/asm/kvm_asm.h              |  13 +-
 arch/arm64/include/asm/kvm_cpufeature.h       |  19 ++
 arch/arm64/include/asm/kvm_host.h             |  17 +-
 arch/arm64/include/asm/kvm_hyp.h              |   8 +
 arch/arm64/include/asm/kvm_mmu.h              |  69 +++++-
 arch/arm64/include/asm/kvm_pgtable.h          |  41 +++-
 arch/arm64/include/asm/sections.h             |   1 +
 arch/arm64/kernel/asm-offsets.c               |   3 +
 arch/arm64/kernel/cpufeature.c                |  14 +-
 arch/arm64/kernel/image-vars.h                |  35 +++
 arch/arm64/kernel/vmlinux.lds.S               |   7 +
 arch/arm64/kvm/arm.c                          | 136 +++++++++--
 arch/arm64/kvm/hyp/Makefile                   |   2 +-
 arch/arm64/kvm/hyp/include/hyp/switch.h       |  36 +--
 arch/arm64/kvm/hyp/include/nvhe/early_alloc.h |  14 ++
 arch/arm64/kvm/hyp/include/nvhe/gfp.h         |  32 +++
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  33 +++
 arch/arm64/kvm/hyp/include/nvhe/memory.h      |  55 +++++
 arch/arm64/kvm/hyp/include/nvhe/mm.h          | 107 +++++++++
 arch/arm64/kvm/hyp/include/nvhe/spinlock.h    |  95 ++++++++
 arch/arm64/kvm/hyp/include/nvhe/util.h        |  25 ++
 arch/arm64/kvm/hyp/nvhe/Makefile              |   9 +-
 arch/arm64/kvm/hyp/nvhe/cache.S               |  13 ++
 arch/arm64/kvm/hyp/nvhe/cpufeature.c          |   8 +
 arch/arm64/kvm/hyp/nvhe/early_alloc.c         |  60 +++++
 arch/arm64/kvm/hyp/nvhe/hyp-init.S            |  39 ++++
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            |  50 ++++
 arch/arm64/kvm/hyp/nvhe/hyp.lds.S             |   1 +
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 191 ++++++++++++++++
 arch/arm64/kvm/hyp/nvhe/mm.c                  | 175 ++++++++++++++
 arch/arm64/kvm/hyp/nvhe/page_alloc.c          | 185 +++++++++++++++
 arch/arm64/kvm/hyp/nvhe/psci-relay.c          |   7 +-
 arch/arm64/kvm/hyp/nvhe/setup.c               | 214 ++++++++++++++++++
 arch/arm64/kvm/hyp/nvhe/stub.c                |  22 ++
 arch/arm64/kvm/hyp/nvhe/switch.c              |  12 +-
 arch/arm64/kvm/hyp/nvhe/tlb.c                 |   4 +-
 arch/arm64/kvm/hyp/pgtable.c                  |  98 ++++----
 arch/arm64/kvm/hyp/reserved_mem.c             |  95 ++++++++
 arch/arm64/kvm/mmu.c                          | 114 +++++++++-
 arch/arm64/kvm/reset.c                        |  42 +---
 arch/arm64/lib/clear_page.S                   |   4 +-
 arch/arm64/lib/copy_page.S                    |   4 +-
 arch/arm64/mm/init.c                          |   3 +
 drivers/of/fdt.c                              |   5 +
 46 files changed, 1971 insertions(+), 151 deletions(-)
 create mode 100644 arch/arm64/include/asm/kvm_cpufeature.h
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/early_alloc.h
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/gfp.h
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/memory.h
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/mm.h
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/spinlock.h
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/util.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/cache.S
 create mode 100644 arch/arm64/kvm/hyp/nvhe/cpufeature.c
 create mode 100644 arch/arm64/kvm/hyp/nvhe/early_alloc.c
 create mode 100644 arch/arm64/kvm/hyp/nvhe/mem_protect.c
 create mode 100644 arch/arm64/kvm/hyp/nvhe/mm.c
 create mode 100644 arch/arm64/kvm/hyp/nvhe/page_alloc.c
 create mode 100644 arch/arm64/kvm/hyp/nvhe/setup.c
 create mode 100644 arch/arm64/kvm/hyp/nvhe/stub.c
 create mode 100644 arch/arm64/kvm/hyp/reserved_mem.c

-- 
2.29.2.299.gdc1121823c-goog


Thread overview: 54+ messages
2020-11-17 18:15 [RFC PATCH 00/27] KVM/arm64: A stage 2 for the host Quentin Perret
2020-11-17 18:15 ` [RFC PATCH 01/27] arm64: lib: Annotate {clear,copy}_page() as position-independent Quentin Perret
2020-11-17 18:15 ` [RFC PATCH 02/27] KVM: arm64: Link position-independent string routines into .hyp.text Quentin Perret
2020-11-23 12:34   ` David Brazdil
2020-11-23 14:06     ` Quentin Perret
2020-11-17 18:15 ` [RFC PATCH 03/27] KVM: arm64: Add standalone ticket spinlock implementation for use at hyp Quentin Perret
2020-11-17 18:15 ` [RFC PATCH 04/27] KVM: arm64: Initialize kvm_nvhe_init_params early Quentin Perret
2020-11-17 18:15 ` [RFC PATCH 05/27] KVM: arm64: Avoid free_page() in page-table allocator Quentin Perret
2020-11-17 18:15 ` [RFC PATCH 06/27] KVM: arm64: Factor memory allocation out of pgtable.c Quentin Perret
2020-11-17 18:15 ` [RFC PATCH 07/27] KVM: arm64: Introduce a BSS section for use at Hyp Quentin Perret
2020-11-17 18:15 ` [RFC PATCH 08/27] KVM: arm64: Make kvm_call_hyp() a function call " Quentin Perret
2020-11-23 12:51   ` David Brazdil
2020-11-17 18:15 ` [RFC PATCH 09/27] KVM: arm64: Allow using kvm_nvhe_sym() in hyp code Quentin Perret
2020-11-23 12:57   ` David Brazdil
2020-11-23 14:02     ` Quentin Perret
2020-11-23 14:54       ` David Brazdil
2020-11-17 18:15 ` [RFC PATCH 10/27] KVM: arm64: Introduce an early Hyp page allocator Quentin Perret
2020-11-17 18:15 ` [RFC PATCH 11/27] KVM: arm64: Stub CONFIG_DEBUG_LIST at Hyp Quentin Perret
2020-11-17 18:15 ` [RFC PATCH 12/27] KVM: arm64: Introduce a Hyp buddy page allocator Quentin Perret
2020-11-17 18:15 ` [RFC PATCH 13/27] KVM: arm64: Enable access to sanitized CPU features at EL2 Quentin Perret
2020-11-23 10:55   ` Fuad Tabba
2020-11-23 13:51     ` Quentin Perret
2020-11-23 13:22   ` David Brazdil
2020-11-23 14:39     ` Quentin Perret
2020-11-17 18:15 ` [RFC PATCH 14/27] KVM: arm64: Factor out vector address calculation Quentin Perret
2020-11-17 18:15 ` [RFC PATCH 15/27] of/fdt: Introduce early_init_dt_add_memory_hyp() Quentin Perret
2020-11-17 19:44   ` Rob Herring
2020-11-18  9:25     ` Quentin Perret
2020-11-18 14:31       ` Quentin Perret
2020-11-17 18:15 ` [RFC PATCH 16/27] KVM: arm64: Prepare Hyp memory protection Quentin Perret
2020-12-03 12:57   ` Fuad Tabba
2020-12-04 18:01     ` Quentin Perret
2020-12-07 10:20       ` Will Deacon
2020-12-07 11:05         ` Mark Rutland
2020-12-07 11:10           ` Will Deacon
2020-12-07 11:14           ` Fuad Tabba
2020-12-07 11:16       ` Fuad Tabba
2020-12-07 11:58         ` Quentin Perret
2020-12-07 13:54           ` Marc Zyngier
2020-12-07 14:17             ` Quentin Perret
2020-12-07 13:40   ` Will Deacon
2020-12-07 14:11     ` Quentin Perret
2020-12-08  9:40       ` Will Deacon
2020-11-17 18:15 ` [RFC PATCH 17/27] KVM: arm64: Elevate Hyp mappings creation at EL2 Quentin Perret
2020-11-17 18:15 ` [RFC PATCH 18/27] KVM: arm64: Use kvm_arch for stage 2 pgtable Quentin Perret
2020-11-17 18:15 ` [RFC PATCH 19/27] KVM: arm64: Use kvm_arch in kvm_s2_mmu Quentin Perret
2020-11-17 18:16 ` [RFC PATCH 20/27] KVM: arm64: Set host stage 2 using kvm_nvhe_init_params Quentin Perret
2020-11-17 18:16 ` [RFC PATCH 21/27] KVM: arm64: Refactor kvm_arm_setup_stage2() Quentin Perret
2020-11-17 18:16 ` [RFC PATCH 22/27] KVM: arm64: Refactor __load_guest_stage2() Quentin Perret
2020-11-17 18:16 ` [RFC PATCH 23/27] KVM: arm64: Refactor __populate_fault_info() Quentin Perret
2020-11-17 18:16 ` [RFC PATCH 24/27] KVM: arm64: Make memcache anonymous in pgtable allocator Quentin Perret
2020-11-17 18:16 ` [RFC PATCH 25/27] KVM: arm64: Reserve memory for host stage 2 Quentin Perret
2020-11-17 18:16 ` [RFC PATCH 26/27] KVM: arm64: Sort the memblock regions list Quentin Perret
2020-11-17 18:16 ` [RFC PATCH 27/27] KVM: arm64: Wrap the host with a stage 2 Quentin Perret
