* [PATCH 00/16] KVM: arm64: Implement unshare hypercall for pkvm
From: Quentin Perret @ 2021-10-13 15:58 UTC
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil
  Cc: linux-arm-kernel, kvmarm, linux-kernel, kernel-team

Hi all,

This series implements an unshare hypercall at EL2 in nVHE protected
mode, and makes use of it to unmap guest-specific data structures from
EL2 stage-1 during guest tear-down. Crucially, the share and unshare
hypercalls maintain page refcounts at EL2 to avoid accidentally
unmapping data structures that overlap a common page.
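
To illustrate the idea, here is a minimal sketch of refcounted
share/unshare; hyp_map_page() and hyp_unmap_page() are hypothetical
stand-ins for the series' actual mapping primitives, and the refcount
field is assumed to live in the hyp vmemmap's struct hyp_page:

	/*
	 * Sketch: only map on the first share and unmap on the last
	 * unshare, so two structures sharing a page never unmap each
	 * other's half.
	 */
	static int hyp_share_page(phys_addr_t phys)
	{
		struct hyp_page *p = hyp_phys_to_page(phys);
		int ret = 0;

		if (p->refcount == 0)
			ret = hyp_map_page(phys);	/* hypothetical */
		if (!ret)
			p->refcount++;

		return ret;
	}

	static int hyp_unshare_page(phys_addr_t phys)
	{
		struct hyp_page *p = hyp_phys_to_page(phys);

		if (p->refcount == 0)
			return -EINVAL;
		if (--p->refcount == 0)
			return hyp_unmap_page(phys);	/* hypothetical */

		return 0;
	}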

This series has two main benefits. Firstly, it allows EL2 to track the
state of shared pages cleanly, as they can now transition from SHARED
back to OWNED. This will simplify permission checks once e.g. pkvm
implements a donation hcall to provide memory to protected guests, as
there should then be no reason for the host to donate a page that is
currently marked shared. Secondly, it avoids leaving dangling mappings
in the hypervisor's stage-1, which is desirable from a security
perspective as the hypervisor runs with elevated privileges. It is
perhaps also worth noting that this refactors the EL2 page-tracking
checks in a more scalable way, which should make it much easier to
implement other memory transitions (host donating memory to a guest, a
guest sharing back with the host, ...) in the future.
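
For reference, the per-page state machine this enables, using the state
names introduced later in the series:

	Initiator:  OWNED   --share-->  SHARED_OWNED     --unshare-->  OWNED
	Completer:  NOPAGE  --share-->  SHARED_BORROWED  --unshare-->  NOPAGE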

The series is organized as follows:

 - patches 01-05 refactor the implementation of the existing share
   hypercall;

 - patches 06-10 introduce infrastructure to allow unmapping pages from
   EL2 stage-1;

 - patches 11-14 allow pages that are shared more than once with EL2
   to be refcounted;

 - patches 15-16 add the unshare hypercall, and make use of it when
   tearing down guests.

This has been lightly tested under QEMU, by spawning and powering off a
guest 50 times.

Feedback welcome :) !

Thanks,
Quentin

Quentin Perret (11):
  KVM: arm64: Avoid remapping the SVE state in the hyp stage-1
  KVM: arm64: Introduce kvm_share_hyp()
  KVM: arm64: Accept page ranges in pkvm share hypercall
  KVM: arm64: Provide {get,put}_page() stubs for early hyp allocator
  KVM: arm64: Refcount hyp stage-1 pgtable pages
  KVM: arm64: Fixup hyp stage-1 refcount
  KVM: arm64: Back hyp_vmemmap for all of memory
  KVM: arm64: Move hyp refcount helpers to header files
  KVM: arm64: Refcount shared pages at EL2
  KVM: arm64: pkvm: Introduce an unshare hypercall
  KVM: arm64: pkvm: Unshare guest structs during teardown

Will Deacon (5):
  KVM: arm64: Introduce do_share() helper for memory sharing between
    components
  KVM: arm64: Implement __pkvm_host_share_hyp() using do_share()
  KVM: arm64: Hook up ->page_count() for hypervisor stage-1 page-table
  KVM: arm64: Implement kvm_pgtable_hyp_unmap() at EL2
  KVM: arm64: Move double-sharing logic into hyp-specific function

 arch/arm64/include/asm/kvm_asm.h              |   1 +
 arch/arm64/include/asm/kvm_host.h             |   2 +
 arch/arm64/include/asm/kvm_mmu.h              |   2 +
 arch/arm64/include/asm/kvm_pgtable.h          |  21 +
 arch/arm64/kvm/arm.c                          |  17 +-
 arch/arm64/kvm/fpsimd.c                       |  25 +-
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |   8 +-
 arch/arm64/kvm/hyp/include/nvhe/memory.h      |  18 +
 arch/arm64/kvm/hyp/include/nvhe/mm.h          |  29 +-
 arch/arm64/kvm/hyp/nvhe/early_alloc.c         |   5 +
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            |  12 +-
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 596 ++++++++++++++++--
 arch/arm64/kvm/hyp/nvhe/mm.c                  |  31 +-
 arch/arm64/kvm/hyp/nvhe/page_alloc.c          |  22 +-
 arch/arm64/kvm/hyp/nvhe/setup.c               |  39 +-
 arch/arm64/kvm/hyp/pgtable.c                  |  80 ++-
 arch/arm64/kvm/hyp/reserved_mem.c             |  17 +-
 arch/arm64/kvm/mmu.c                          |  48 +-
 arch/arm64/kvm/reset.c                        |  13 +-
 19 files changed, 814 insertions(+), 172 deletions(-)

-- 
2.33.0.882.g93a45727a2-goog


* [PATCH 01/16] KVM: arm64: Introduce do_share() helper for memory sharing between components
From: Quentin Perret @ 2021-10-13 15:58 UTC
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil
  Cc: linux-arm-kernel, kvmarm, linux-kernel, kernel-team

From: Will Deacon <will@kernel.org>

In preparation for extending memory sharing to include the guest as well
as the hypervisor and the host, introduce a high-level do_share() helper
which allows memory to be shared between these components without
duplication of validity checks.
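
In outline, do_share() walks the page range twice: a first pass that
only validates the transition, and a second pass that commits it. The
structure, condensed from the diff below (error handling elided, not
standalone code):

	/* Pass 1: validate every page before touching anything. */
	for (idx = 0; idx < share->tx.nr_pages; ++idx) {
		ret = request_share(&req, share, idx) ?:
		      ack_share(&ack, &req, share) ?:
		      check_share(&req, &ack, share);
		if (ret)
			return ret;
	}

	/* Pass 2: nothing can fail now; update both page-tables. */
	for (idx = 0; idx < share->tx.nr_pages; ++idx) {
		request_share(&req, share, idx);
		if (req.initiator.state == PKVM_PAGE_SHARED_OWNED)
			continue;	/* double-share: already mapped */
		initiate_share(&req, share);
		complete_share(&req, share);
	}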

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
---
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |   5 +
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 315 ++++++++++++++++++
 2 files changed, 320 insertions(+)

diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
index b58c910babaf..56445586c755 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
@@ -24,6 +24,11 @@ enum pkvm_page_state {
 	PKVM_PAGE_OWNED			= 0ULL,
 	PKVM_PAGE_SHARED_OWNED		= KVM_PGTABLE_PROT_SW0,
 	PKVM_PAGE_SHARED_BORROWED	= KVM_PGTABLE_PROT_SW1,
+	__PKVM_PAGE_RESERVED		= KVM_PGTABLE_PROT_SW0 |
+					  KVM_PGTABLE_PROT_SW1,
+
+	/* Meta-states which aren't encoded directly in the PTE's SW bits */
+	PKVM_NOPAGE,
 };
 
 #define PKVM_PAGE_STATE_PROT_MASK	(KVM_PGTABLE_PROT_SW0 | KVM_PGTABLE_PROT_SW1)
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index bacd493a4eac..53e503501044 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -443,3 +443,318 @@ void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt)
 	ret = host_stage2_idmap(addr);
 	BUG_ON(ret && ret != -EAGAIN);
 }
+
+/* This corresponds to locking order */
+enum pkvm_component_id {
+	PKVM_ID_HOST,
+	PKVM_ID_HYP,
+};
+
+struct pkvm_mem_transition {
+	u64				nr_pages;
+
+	struct {
+		enum pkvm_component_id	id;
+		u64			addr;
+
+		union {
+			struct {
+				u64	completer_addr;
+			} host;
+		};
+	} initiator;
+
+	struct {
+		enum pkvm_component_id	id;
+	} completer;
+};
+
+struct pkvm_mem_share {
+	struct pkvm_mem_transition	tx;
+	enum kvm_pgtable_prot		prot;
+};
+
+struct pkvm_page_req {
+	struct {
+		enum pkvm_page_state	state;
+		u64			addr;
+	} initiator;
+
+	struct {
+		u64			addr;
+	} completer;
+
+	phys_addr_t			phys;
+};
+
+struct pkvm_page_share_ack {
+	struct {
+		enum pkvm_page_state	state;
+		phys_addr_t		phys;
+		enum kvm_pgtable_prot	prot;
+	} completer;
+};
+
+static void host_lock_component(void)
+{
+	hyp_spin_lock(&host_kvm.lock);
+}
+
+static void host_unlock_component(void)
+{
+	hyp_spin_unlock(&host_kvm.lock);
+}
+
+static void hyp_lock_component(void)
+{
+	hyp_spin_lock(&pkvm_pgd_lock);
+}
+
+static void hyp_unlock_component(void)
+{
+	hyp_spin_unlock(&pkvm_pgd_lock);
+}
+
+static int host_request_share(struct pkvm_page_req *req,
+			      struct pkvm_mem_transition *tx,
+			      u64 idx)
+{
+	u64 offset = idx * PAGE_SIZE;
+	enum kvm_pgtable_prot prot;
+	u64 host_addr;
+	kvm_pte_t pte;
+	int err;
+
+	hyp_assert_lock_held(&host_kvm.lock);
+
+	host_addr = tx->initiator.addr + offset;
+	err = kvm_pgtable_get_leaf(&host_kvm.pgt, host_addr, &pte, NULL);
+	if (err)
+		return err;
+
+	if (!kvm_pte_valid(pte) && pte)
+		return -EPERM;
+
+	prot = kvm_pgtable_stage2_pte_prot(pte);
+	*req = (struct pkvm_page_req) {
+		.initiator	= {
+			.state	= pkvm_getstate(prot),
+			.addr	= host_addr,
+		},
+		.completer	= {
+			.addr	= tx->initiator.host.completer_addr + offset,
+		},
+		.phys		= host_addr,
+	};
+
+	return 0;
+}
+
+/*
+ * Populate the page-sharing request (@req) based on the share transition
+ * information from the initiator and its current page state.
+ */
+static int request_share(struct pkvm_page_req *req,
+			 struct pkvm_mem_share *share,
+			 u64 idx)
+{
+	struct pkvm_mem_transition *tx = &share->tx;
+
+	switch (tx->initiator.id) {
+	case PKVM_ID_HOST:
+		return host_request_share(req, tx, idx);
+	default:
+		return -EINVAL;
+	}
+}
+
+static int hyp_ack_share(struct pkvm_page_share_ack *ack,
+			 struct pkvm_page_req *req,
+			 enum kvm_pgtable_prot perms)
+{
+	enum pkvm_page_state state = PKVM_NOPAGE;
+	enum kvm_pgtable_prot prot = 0;
+	phys_addr_t phys = 0;
+	kvm_pte_t pte;
+	u64 hyp_addr;
+	int err;
+
+	hyp_assert_lock_held(&pkvm_pgd_lock);
+
+	if (perms != PAGE_HYP)
+		return -EPERM;
+
+	hyp_addr = req->completer.addr;
+	err = kvm_pgtable_get_leaf(&pkvm_pgtable, hyp_addr, &pte, NULL);
+	if (err)
+		return err;
+
+	if (kvm_pte_valid(pte)) {
+		state	= pkvm_getstate(kvm_pgtable_hyp_pte_prot(pte));
+		phys	= kvm_pte_to_phys(pte);
+		prot	= kvm_pgtable_hyp_pte_prot(pte) & KVM_PGTABLE_PROT_RWX;
+	}
+
+	*ack = (struct pkvm_page_share_ack) {
+		.completer	= {
+			.state	= state,
+			.phys	= phys,
+			.prot	= prot,
+		},
+	};
+
+	return 0;
+}
+
+/*
+ * Populate the page-sharing acknowledgment (@ack) based on the sharing request
+ * from the initiator and the current page state in the completer.
+ */
+static int ack_share(struct pkvm_page_share_ack *ack,
+		     struct pkvm_page_req *req,
+		     struct pkvm_mem_share *share)
+{
+	struct pkvm_mem_transition *tx = &share->tx;
+
+	switch (tx->completer.id) {
+	case PKVM_ID_HYP:
+		return hyp_ack_share(ack, req, share->prot);
+	default:
+		return -EINVAL;
+	}
+}
+
+/*
+ * Check that the page states in the initiator and the completer are compatible
+ * for the requested page-sharing operation to go ahead.
+ */
+static int check_share(struct pkvm_page_req *req,
+		       struct pkvm_page_share_ack *ack,
+		       struct pkvm_mem_share *share)
+{
+	if (!addr_is_memory(req->phys))
+		return -EINVAL;
+
+	if (req->initiator.state == PKVM_PAGE_OWNED &&
+	    ack->completer.state == PKVM_NOPAGE) {
+		return 0;
+	}
+
+	if (req->initiator.state != PKVM_PAGE_SHARED_OWNED)
+		return -EPERM;
+
+	if (ack->completer.state != PKVM_PAGE_SHARED_BORROWED)
+		return -EPERM;
+
+	if (ack->completer.phys != req->phys)
+		return -EPERM;
+
+	if (ack->completer.prot != share->prot)
+		return -EPERM;
+
+	return 0;
+}
+
+static int host_initiate_share(struct pkvm_page_req *req)
+{
+	enum kvm_pgtable_prot prot;
+
+	prot = pkvm_mkstate(PKVM_HOST_MEM_PROT, PKVM_PAGE_SHARED_OWNED);
+	return host_stage2_idmap_locked(req->initiator.addr, PAGE_SIZE, prot);
+}
+
+/* Update the initiator's page-table for the page-sharing request */
+static int initiate_share(struct pkvm_page_req *req,
+			  struct pkvm_mem_share *share)
+{
+	struct pkvm_mem_transition *tx = &share->tx;
+
+	switch (tx->initiator.id) {
+	case PKVM_ID_HOST:
+		return host_initiate_share(req);
+	default:
+		return -EINVAL;
+	}
+}
+
+static int hyp_complete_share(struct pkvm_page_req *req,
+			      enum kvm_pgtable_prot perms)
+{
+	void *start = (void *)req->completer.addr, *end = start + PAGE_SIZE;
+	enum kvm_pgtable_prot prot;
+
+	prot = pkvm_mkstate(perms, PKVM_PAGE_SHARED_BORROWED);
+	return pkvm_create_mappings_locked(start, end, prot);
+}
+
+/* Update the completer's page-table for the page-sharing request */
+static int complete_share(struct pkvm_page_req *req,
+			  struct pkvm_mem_share *share)
+{
+	struct pkvm_mem_transition *tx = &share->tx;
+
+	switch (tx->completer.id) {
+	case PKVM_ID_HYP:
+		return hyp_complete_share(req, share->prot);
+	default:
+		return -EINVAL;
+	}
+}
+
+/*
+ * do_share():
+ *
+ * The page owner grants access to another component with a given set
+ * of permissions.
+ *
+ * Initiator: OWNED	=> SHARED_OWNED
+ * Completer: NOPAGE	=> SHARED_BORROWED
+ *
+ * Note that we permit the same share operation to be repeated from the
+ * host to the hypervisor, as this removes the need for excessive
+ * book-keeping of shared KVM data structures at EL1.
+ */
+static int do_share(struct pkvm_mem_share *share)
+{
+	struct pkvm_page_req req;
+	int ret = 0;
+	u64 idx;
+
+	for (idx = 0; idx < share->tx.nr_pages; ++idx) {
+		struct pkvm_page_share_ack ack;
+
+		ret = request_share(&req, share, idx);
+		if (ret)
+			goto out;
+
+		ret = ack_share(&ack, &req, share);
+		if (ret)
+			goto out;
+
+		ret = check_share(&req, &ack, share);
+		if (ret)
+			goto out;
+	}
+
+	for (idx = 0; idx < share->tx.nr_pages; ++idx) {
+		ret = request_share(&req, share, idx);
+		if (ret)
+			break;
+
+		/* Allow double-sharing by skipping over the page */
+		if (req.initiator.state == PKVM_PAGE_SHARED_OWNED)
+			continue;
+
+		ret = initiate_share(&req, share);
+		if (ret)
+			break;
+
+		ret = complete_share(&req, share);
+		if (ret)
+			break;
+	}
+
+	WARN_ON(ret);
+out:
+	return ret;
+}
-- 
2.33.0.882.g93a45727a2-goog


* [PATCH 02/16] KVM: arm64: Implement __pkvm_host_share_hyp() using do_share()
From: Quentin Perret @ 2021-10-13 15:58 UTC
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil
  Cc: linux-arm-kernel, kvmarm, linux-kernel, kernel-team

From: Will Deacon <will@kernel.org>

__pkvm_host_share_hyp() shares memory between the host and the hypervisor,
so implement it as an invocation of the new do_share() mechanism.

Note that the new semantics are slightly stricter than before, as we
now validate the physical address when double-sharing a page. However,
this makes no functional difference as long as no other transitions are
supported and the host can only share pages by pfn.
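
For context, the host reaches this function through the nVHE hypercall
interface; a minimal sketch of the calling side (the exact call site in
the host's mmu code may differ):

	/* Host EL1: share the page backing 'addr' with the hypervisor. */
	phys_addr_t phys = __pa(addr);
	int ret = kvm_call_hyp_nvhe(__pkvm_host_share_hyp,
				    __phys_to_pfn(phys));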

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
---
 arch/arm64/kvm/hyp/nvhe/mem_protect.c | 121 +++++++-------------------
 1 file changed, 33 insertions(+), 88 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 53e503501044..6983b83f799f 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -342,94 +342,6 @@ static int host_stage2_idmap(u64 addr)
 	return ret;
 }
 
-static inline bool check_prot(enum kvm_pgtable_prot prot,
-			      enum kvm_pgtable_prot required,
-			      enum kvm_pgtable_prot denied)
-{
-	return (prot & (required | denied)) == required;
-}
-
-int __pkvm_host_share_hyp(u64 pfn)
-{
-	phys_addr_t addr = hyp_pfn_to_phys(pfn);
-	enum kvm_pgtable_prot prot, cur;
-	void *virt = __hyp_va(addr);
-	enum pkvm_page_state state;
-	kvm_pte_t pte;
-	int ret;
-
-	if (!addr_is_memory(addr))
-		return -EINVAL;
-
-	hyp_spin_lock(&host_kvm.lock);
-	hyp_spin_lock(&pkvm_pgd_lock);
-
-	ret = kvm_pgtable_get_leaf(&host_kvm.pgt, addr, &pte, NULL);
-	if (ret)
-		goto unlock;
-	if (!pte)
-		goto map_shared;
-
-	/*
-	 * Check attributes in the host stage-2 PTE. We need the page to be:
-	 *  - mapped RWX as we're sharing memory;
-	 *  - not borrowed, as that implies absence of ownership.
-	 * Otherwise, we can't let it got through
-	 */
-	cur = kvm_pgtable_stage2_pte_prot(pte);
-	prot = pkvm_mkstate(0, PKVM_PAGE_SHARED_BORROWED);
-	if (!check_prot(cur, PKVM_HOST_MEM_PROT, prot)) {
-		ret = -EPERM;
-		goto unlock;
-	}
-
-	state = pkvm_getstate(cur);
-	if (state == PKVM_PAGE_OWNED)
-		goto map_shared;
-
-	/*
-	 * Tolerate double-sharing the same page, but this requires
-	 * cross-checking the hypervisor stage-1.
-	 */
-	if (state != PKVM_PAGE_SHARED_OWNED) {
-		ret = -EPERM;
-		goto unlock;
-	}
-
-	ret = kvm_pgtable_get_leaf(&pkvm_pgtable, (u64)virt, &pte, NULL);
-	if (ret)
-		goto unlock;
-
-	/*
-	 * If the page has been shared with the hypervisor, it must be
-	 * already mapped as SHARED_BORROWED in its stage-1.
-	 */
-	cur = kvm_pgtable_hyp_pte_prot(pte);
-	prot = pkvm_mkstate(PAGE_HYP, PKVM_PAGE_SHARED_BORROWED);
-	if (!check_prot(cur, prot, ~prot))
-		ret = -EPERM;
-	goto unlock;
-
-map_shared:
-	/*
-	 * If the page is not yet shared, adjust mappings in both page-tables
-	 * while both locks are held.
-	 */
-	prot = pkvm_mkstate(PAGE_HYP, PKVM_PAGE_SHARED_BORROWED);
-	ret = pkvm_create_mappings_locked(virt, virt + PAGE_SIZE, prot);
-	BUG_ON(ret);
-
-	prot = pkvm_mkstate(PKVM_HOST_MEM_PROT, PKVM_PAGE_SHARED_OWNED);
-	ret = host_stage2_idmap_locked(addr, PAGE_SIZE, prot);
-	BUG_ON(ret);
-
-unlock:
-	hyp_spin_unlock(&pkvm_pgd_lock);
-	hyp_spin_unlock(&host_kvm.lock);
-
-	return ret;
-}
-
 void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt)
 {
 	struct kvm_vcpu_fault_info fault;
@@ -758,3 +670,36 @@ static int do_share(struct pkvm_mem_share *share)
 out:
 	return ret;
 }
+
+int __pkvm_host_share_hyp(u64 pfn)
+{
+	int ret;
+	u64 host_addr = hyp_pfn_to_phys(pfn);
+	u64 hyp_addr = (u64)__hyp_va(host_addr);
+	struct pkvm_mem_share share = {
+		.tx	= {
+			.nr_pages	= 1,
+			.initiator	= {
+				.id	= PKVM_ID_HOST,
+				.addr	= host_addr,
+				.host	= {
+					.completer_addr = hyp_addr,
+				},
+			},
+			.completer	= {
+				.id	= PKVM_ID_HYP,
+			},
+		},
+		.prot	= PAGE_HYP,
+	};
+
+	host_lock_component();
+	hyp_lock_component();
+
+	ret = do_share(&share);
+
+	hyp_unlock_component();
+	host_unlock_component();
+
+	return ret;
+}
-- 
2.33.0.882.g93a45727a2-goog


* [PATCH 03/16] KVM: arm64: Avoid remapping the SVE state in the hyp stage-1
From: Quentin Perret @ 2021-10-13 15:58 UTC
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil
  Cc: linux-arm-kernel, kvmarm, linux-kernel, kernel-team

We currently map the SVE state into the hypervisor stage-1 on VCPU_RUN,
when the vCPU thread's PID has changed. However, this only needs to be
done during the first VCPU_RUN as the SVE state doesn't depend on
thread-specific data, so move the create_hyp_mappings() call to
kvm_vcpu_first_run_init().

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
---
 arch/arm64/kvm/arm.c    | 12 ++++++++++++
 arch/arm64/kvm/fpsimd.c | 11 -----------
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index fe102cd2e518..c33d8c073820 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -618,6 +618,18 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
 	if (ret)
 		return ret;
 
+	if (vcpu->arch.sve_state) {
+		void *sve_end;
+
+		sve_end = vcpu->arch.sve_state + vcpu_sve_state_size(vcpu);
+
+		ret = create_hyp_mappings(vcpu->arch.sve_state, sve_end,
+					  PAGE_HYP);
+		if (ret)
+			return ret;
+	}
+
+
 	ret = kvm_arm_pmu_v3_enable(vcpu);
 
 	return ret;
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 5621020b28de..62c0d78da7be 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -43,17 +43,6 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
 	if (ret)
 		goto error;
 
-	if (vcpu->arch.sve_state) {
-		void *sve_end;
-
-		sve_end = vcpu->arch.sve_state + vcpu_sve_state_size(vcpu);
-
-		ret = create_hyp_mappings(vcpu->arch.sve_state, sve_end,
-					  PAGE_HYP);
-		if (ret)
-			goto error;
-	}
-
 	vcpu->arch.host_thread_info = kern_hyp_va(ti);
 	vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);
 error:
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 04/16] KVM: arm64: Introduce kvm_share_hyp()
  2021-10-13 15:58 ` Quentin Perret
@ 2021-10-13 15:58   ` Quentin Perret
  0 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-13 15:58 UTC (permalink / raw)
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil
  Cc: linux-arm-kernel, kvmarm, linux-kernel, kernel-team

The create_hyp_mappings() function can currently be called at any point
in time. However, its behaviour in protected mode varies widely
depending on when it is called. Prior to KVM init, it is used to
create the temporary page-table used to bring up the hypervisor, and
later on it is transparently turned into a 'share' hypercall when the
kernel has lost control over the hypervisor stage-1. In order to prepare
the ground for also unsharing pages with the hypervisor during guest
teardown, introduce a kvm_share_hyp() function to make it clear in which
places a share hypercall should be expected, as we will soon need a
matching unshare hypercall in all those places.
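
As a usage sketch (struct my_obj and the caller are hypothetical, but
the pattern mirrors the call-sites converted below), sharing a kernel
object with EL2 then becomes:

	struct my_obj *obj = ...;		/* hypothetical object to share */
	int ret;

	ret = kvm_share_hyp(obj, obj + 1);	/* share [obj, obj + 1) */
	if (ret)
		return ret;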

Signed-off-by: Quentin Perret <qperret@google.com>
---
 arch/arm64/include/asm/kvm_mmu.h |  1 +
 arch/arm64/kvm/arm.c             |  7 +++----
 arch/arm64/kvm/fpsimd.c          |  4 ++--
 arch/arm64/kvm/mmu.c             | 19 +++++++++++++------
 4 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 02d378887743..185d0f62b724 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -150,6 +150,7 @@ static __always_inline unsigned long __kern_hyp_va(unsigned long v)
 #include <asm/kvm_pgtable.h>
 #include <asm/stage2_pgtable.h>
 
+int kvm_share_hyp(void *from, void *to);
 int create_hyp_mappings(void *from, void *to, enum kvm_pgtable_prot prot);
 int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
 			   void __iomem **kaddr,
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index c33d8c073820..f2e74635332b 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -146,7 +146,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	if (ret)
 		return ret;
 
-	ret = create_hyp_mappings(kvm, kvm + 1, PAGE_HYP);
+	ret = kvm_share_hyp(kvm, kvm + 1);
 	if (ret)
 		goto out_free_stage2_pgd;
 
@@ -341,7 +341,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 	if (err)
 		return err;
 
-	return create_hyp_mappings(vcpu, vcpu + 1, PAGE_HYP);
+	return kvm_share_hyp(vcpu, vcpu + 1);
 }
 
 void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
@@ -623,8 +623,7 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
 
 		sve_end = vcpu->arch.sve_state + vcpu_sve_state_size(vcpu);
 
-		ret = create_hyp_mappings(vcpu->arch.sve_state, sve_end,
-					  PAGE_HYP);
+		ret = kvm_share_hyp(vcpu->arch.sve_state, sve_end);
 		if (ret)
 			return ret;
 	}
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 62c0d78da7be..2fe1128d9f3d 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -35,11 +35,11 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
 	 * Make sure the host task thread flags and fpsimd state are
 	 * visible to hyp:
 	 */
-	ret = create_hyp_mappings(ti, ti + 1, PAGE_HYP);
+	ret = kvm_share_hyp(ti, ti + 1);
 	if (ret)
 		goto error;
 
-	ret = create_hyp_mappings(fpsimd, fpsimd + 1, PAGE_HYP);
+	ret = kvm_share_hyp(fpsimd, fpsimd + 1);
 	if (ret)
 		goto error;
 
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 1a94a7ca48f2..f80673e863ac 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -296,6 +296,17 @@ static int pkvm_share_hyp(phys_addr_t start, phys_addr_t end)
 	return 0;
 }
 
+int kvm_share_hyp(void *from, void *to)
+{
+	if (is_kernel_in_hyp_mode())
+		return 0;
+
+	if (kvm_host_owns_hyp_mappings())
+		return create_hyp_mappings(from, to, PAGE_HYP);
+
+	return pkvm_share_hyp(kvm_kaddr_to_phys(from), kvm_kaddr_to_phys(to));
+}
+
 /**
  * create_hyp_mappings - duplicate a kernel virtual address range in Hyp mode
  * @from:	The virtual kernel start address of the range
@@ -316,12 +327,8 @@ int create_hyp_mappings(void *from, void *to, enum kvm_pgtable_prot prot)
 	if (is_kernel_in_hyp_mode())
 		return 0;
 
-	if (!kvm_host_owns_hyp_mappings()) {
-		if (WARN_ON(prot != PAGE_HYP))
-			return -EPERM;
-		return pkvm_share_hyp(kvm_kaddr_to_phys(from),
-				      kvm_kaddr_to_phys(to));
-	}
+	if (WARN_ON(!kvm_host_owns_hyp_mappings()))
+		return -EPERM;
 
 	start = start & PAGE_MASK;
 	end = PAGE_ALIGN(end);
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 05/16] KVM: arm64: Accept page ranges in pkvm share hypercall
  2021-10-13 15:58 ` Quentin Perret
@ 2021-10-13 15:58   ` Quentin Perret
  0 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-13 15:58 UTC (permalink / raw)
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil
  Cc: linux-arm-kernel, kvmarm, linux-kernel, kernel-team

The recently reworked do_share() infrastructure for the nVHE protected
mode allows the state of a range of pages to be transitioned
'atomically'. This is preferable to single-page sharing when e.g.
mapping guest vCPUs in the hypervisor stage-1, as the permission checks
and page-table modifications for the entire range are done in a single
critical section. This means there is no need for the host to handle
e.g. only half of a vCPU being successfully shared with the hypervisor.

So, make use of that feature in the __pkvm_host_share_hyp() hypercall by
allowing a pfn range to be specified.
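
As a rough standalone illustration (ordinary userspace C, not part of
the patch; the addresses and 4KiB page size are made up for the
example), the range computation rounds the object out to page
boundaries and counts every page it touches:

	#include <stdio.h>

	#define PAGE_SHIFT	12
	#define PAGE_SIZE	(1UL << PAGE_SHIFT)
	#define ALIGN_DOWN(x, a)	((x) & ~((a) - 1))
	#define PAGE_ALIGN(x)	(((x) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

	int main(void)
	{
		unsigned long from = 0x40001f00UL;	/* made-up object start */
		unsigned long to   = 0x40002300UL;	/* made-up object end */
		unsigned long start = ALIGN_DOWN(from, PAGE_SIZE);
		unsigned long end = PAGE_ALIGN(to);
		unsigned long nr_pages = (end - start) >> PAGE_SHIFT;

		/* The object straddles a page boundary: one call, two pages. */
		printf("pfn 0x%lx, %lu pages\n", start >> PAGE_SHIFT, nr_pages);
		return 0;
	}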

Signed-off-by: Quentin Perret <qperret@google.com>
---
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  2 +-
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            |  3 ++-
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         |  4 +--
 arch/arm64/kvm/mmu.c                          | 25 +++++++------------
 4 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
index 56445586c755..9c02abe92e0a 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
@@ -54,7 +54,7 @@ extern struct host_kvm host_kvm;
 extern const u8 pkvm_hyp_id;
 
 int __pkvm_prot_finalize(void);
-int __pkvm_host_share_hyp(u64 pfn);
+int __pkvm_host_share_hyp(u64 pfn, u64 nr_pages);
 
 bool addr_is_memory(phys_addr_t phys);
 int host_stage2_idmap_locked(phys_addr_t addr, u64 size, enum kvm_pgtable_prot prot);
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 2da6aa8da868..f78bec2b9dd4 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -143,8 +143,9 @@ static void handle___pkvm_cpu_set_vector(struct kvm_cpu_context *host_ctxt)
 static void handle___pkvm_host_share_hyp(struct kvm_cpu_context *host_ctxt)
 {
 	DECLARE_REG(u64, pfn, host_ctxt, 1);
+	DECLARE_REG(u64, nr_pages, host_ctxt, 2);
 
-	cpu_reg(host_ctxt, 1) = __pkvm_host_share_hyp(pfn);
+	cpu_reg(host_ctxt, 1) = __pkvm_host_share_hyp(pfn, nr_pages);
 }
 
 static void handle___pkvm_create_private_mapping(struct kvm_cpu_context *host_ctxt)
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 6983b83f799f..909e60f71b06 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -671,14 +671,14 @@ static int do_share(struct pkvm_mem_share *share)
 	return ret;
 }
 
-int __pkvm_host_share_hyp(u64 pfn)
+int __pkvm_host_share_hyp(u64 pfn, u64 nr_pages)
 {
 	int ret;
 	u64 host_addr = hyp_pfn_to_phys(pfn);
 	u64 hyp_addr = (u64)__hyp_va(host_addr);
 	struct pkvm_mem_share share = {
 		.tx	= {
-			.nr_pages	= 1,
+			.nr_pages	= nr_pages,
 			.initiator	= {
 				.id	= PKVM_ID_HOST,
 				.addr	= host_addr,
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index f80673e863ac..bc9865a8c988 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -281,30 +281,23 @@ static phys_addr_t kvm_kaddr_to_phys(void *kaddr)
 	}
 }
 
-static int pkvm_share_hyp(phys_addr_t start, phys_addr_t end)
-{
-	phys_addr_t addr;
-	int ret;
-
-	for (addr = ALIGN_DOWN(start, PAGE_SIZE); addr < end; addr += PAGE_SIZE) {
-		ret = kvm_call_hyp_nvhe(__pkvm_host_share_hyp,
-					__phys_to_pfn(addr));
-		if (ret)
-			return ret;
-	}
-
-	return 0;
-}
-
 int kvm_share_hyp(void *from, void *to)
 {
+	phys_addr_t start, end;
+	u64 nr_pages;
+
 	if (is_kernel_in_hyp_mode())
 		return 0;
 
 	if (kvm_host_owns_hyp_mappings())
 		return create_hyp_mappings(from, to, PAGE_HYP);
 
-	return pkvm_share_hyp(kvm_kaddr_to_phys(from), kvm_kaddr_to_phys(to));
+	start = ALIGN_DOWN(kvm_kaddr_to_phys(from), PAGE_SIZE);
+	end = PAGE_ALIGN(kvm_kaddr_to_phys(to));
+	nr_pages = (end - start) >> PAGE_SHIFT;
+
+	return kvm_call_hyp_nvhe(__pkvm_host_share_hyp, __phys_to_pfn(start),
+				 nr_pages);
 }
 
 /**
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 06/16] KVM: arm64: Provide {get,put}_page() stubs for early hyp allocator
  2021-10-13 15:58 ` Quentin Perret
@ 2021-10-13 15:58   ` Quentin Perret
  0 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-13 15:58 UTC (permalink / raw)
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil
  Cc: linux-arm-kernel, kvmarm, linux-kernel, kernel-team

In nVHE protected mode, the EL2 code uses a temporary allocator during
boot while re-creating its stage-1 page-table. Unfortunately, the
hyp_vmemmap is not ready to use at this stage, so refcounting pages
is not possible. That is not currently a problem because hyp stage-1
mappings are never removed, which implies refcounting of page-table
pages is unnecessary.

In preparation for allowing hypervisor stage-1 mappings to be removed,
provide stub implementations for {get,put}_page() in the early allocator.

Signed-off-by: Quentin Perret <qperret@google.com>
---
 arch/arm64/kvm/hyp/nvhe/early_alloc.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm64/kvm/hyp/nvhe/early_alloc.c b/arch/arm64/kvm/hyp/nvhe/early_alloc.c
index 1306c430ab87..00de04153cc6 100644
--- a/arch/arm64/kvm/hyp/nvhe/early_alloc.c
+++ b/arch/arm64/kvm/hyp/nvhe/early_alloc.c
@@ -43,6 +43,9 @@ void *hyp_early_alloc_page(void *arg)
 	return hyp_early_alloc_contig(1);
 }
 
+static void hyp_early_alloc_get_page(void *addr) { }
+static void hyp_early_alloc_put_page(void *addr) { }
+
 void hyp_early_alloc_init(void *virt, unsigned long size)
 {
 	base = cur = (unsigned long)virt;
@@ -51,4 +54,6 @@ void hyp_early_alloc_init(void *virt, unsigned long size)
 	hyp_early_alloc_mm_ops.zalloc_page = hyp_early_alloc_page;
 	hyp_early_alloc_mm_ops.phys_to_virt = hyp_phys_to_virt;
 	hyp_early_alloc_mm_ops.virt_to_phys = hyp_virt_to_phys;
+	hyp_early_alloc_mm_ops.get_page = hyp_early_alloc_get_page;
+	hyp_early_alloc_mm_ops.put_page = hyp_early_alloc_put_page;
 }
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 07/16] KVM: arm64: Refcount hyp stage-1 pgtable pages
  2021-10-13 15:58 ` Quentin Perret
@ 2021-10-13 15:58   ` Quentin Perret
  0 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-13 15:58 UTC (permalink / raw)
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil
  Cc: linux-arm-kernel, kvmarm, linux-kernel, kernel-team

To prepare the ground for allowing hyp stage-1 mappings to be removed at
run-time, update the KVM page-table code to maintain a correct refcount
using the ->{get,put}_page() function callbacks.
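
As a rough standalone model of the invariant being introduced (ordinary
userspace C, not kernel code; the tiny table size is made up): a
page-table page holds one reference per valid entry installed in it, so
teardown can free the page once every entry has been put:

	#include <stdio.h>

	#define PTRS_PER_TABLE	4	/* made-up table size */

	struct page { int refcount; };

	static void get_page(struct page *p) { p->refcount++; }
	static int put_page(struct page *p) { return --p->refcount == 0; }

	int main(void)
	{
		struct page table = { 0 };
		int i;

		/* map: each valid PTE written takes a reference */
		for (i = 0; i < PTRS_PER_TABLE; i++)
			get_page(&table);

		/* unmap: each valid PTE cleared drops one */
		for (i = 0; i < PTRS_PER_TABLE; i++) {
			if (put_page(&table))
				printf("refcount hit zero: page can be freed\n");
		}

		return 0;
	}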

Signed-off-by: Quentin Perret <qperret@google.com>
---
 arch/arm64/kvm/hyp/pgtable.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index f8ceebe4982e..768a58835153 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -408,8 +408,10 @@ static bool hyp_map_walker_try_leaf(u64 addr, u64 end, u32 level,
 		return false;
 
 	new = kvm_init_valid_leaf_pte(phys, data->attr, level);
-	if (hyp_pte_needs_update(old, new))
+	if (hyp_pte_needs_update(old, new)) {
 		smp_store_release(ptep, new);
+		data->mm_ops->get_page(ptep);
+	}
 
 	data->phys += granule;
 	return true;
@@ -433,6 +435,7 @@ static int hyp_map_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
 		return -ENOMEM;
 
 	kvm_set_table_pte(ptep, childp, mm_ops);
+	mm_ops->get_page(ptep);
 	return 0;
 }
 
@@ -482,8 +485,16 @@ static int hyp_free_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
 			   enum kvm_pgtable_walk_flags flag, void * const arg)
 {
 	struct kvm_pgtable_mm_ops *mm_ops = arg;
+	kvm_pte_t pte = *ptep;
+
+	if (!kvm_pte_valid(pte))
+		return 0;
+
+	mm_ops->put_page(ptep);
+
+	if (kvm_pte_table(pte, level))
+		mm_ops->put_page(kvm_pte_follow(pte, mm_ops));
 
-	mm_ops->put_page((void *)kvm_pte_follow(*ptep, mm_ops));
 	return 0;
 }
 
@@ -491,7 +502,7 @@ void kvm_pgtable_hyp_destroy(struct kvm_pgtable *pgt)
 {
 	struct kvm_pgtable_walker walker = {
 		.cb	= hyp_free_walker,
-		.flags	= KVM_PGTABLE_WALK_TABLE_POST,
+		.flags	= KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST,
 		.arg	= pgt->mm_ops,
 	};
 
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 08/16] KVM: arm64: Fixup hyp stage-1 refcount
  2021-10-13 15:58 ` Quentin Perret
@ 2021-10-13 15:58   ` Quentin Perret
  0 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-13 15:58 UTC (permalink / raw)
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil
  Cc: linux-arm-kernel, kvmarm, linux-kernel, kernel-team

In nVHE-protected mode, the hyp stage-1 page-table refcount is broken
due to the lack of refcount support in the early allocator. Fix up the
refcount in the finalize walker, once the 'hyp_vmemmap' is up and running.

Signed-off-by: Quentin Perret <qperret@google.com>
---
 arch/arm64/kvm/hyp/nvhe/setup.c | 31 +++++++++++++++++++++----------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
index 57c27846320f..ad89801dfed7 100644
--- a/arch/arm64/kvm/hyp/nvhe/setup.c
+++ b/arch/arm64/kvm/hyp/nvhe/setup.c
@@ -166,12 +166,22 @@ static int finalize_host_mappings_walker(u64 addr, u64 end, u32 level,
 {
 	enum kvm_pgtable_prot prot;
 	enum pkvm_page_state state;
+	struct kvm_pgtable_mm_ops *mm_ops = arg;
 	kvm_pte_t pte = *ptep;
 	phys_addr_t phys;
 
 	if (!kvm_pte_valid(pte))
 		return 0;
 
+	/*
+	 * Fix-up the refcount for the page-table pages as the early allocator
+	 * was unable to access the hyp_vmemmap and so the buddy allocator has
+	 * initialised the refcount to '1'.
+	 */
+	mm_ops->get_page(ptep);
+	if (flag != KVM_PGTABLE_WALK_LEAF)
+		return 0;
+
 	if (level != (KVM_PGTABLE_MAX_LEVELS - 1))
 		return -EINVAL;
 
@@ -204,7 +214,8 @@ static int finalize_host_mappings(void)
 {
 	struct kvm_pgtable_walker walker = {
 		.cb	= finalize_host_mappings_walker,
-		.flags	= KVM_PGTABLE_WALK_LEAF,
+		.flags	= KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST,
+		.arg	= pkvm_pgtable.mm_ops,
 	};
 
 	return kvm_pgtable_walk(&pkvm_pgtable, 0, BIT(pkvm_pgtable.ia_bits), &walker);
@@ -229,19 +240,19 @@ void __noreturn __pkvm_init_finalise(void)
 	if (ret)
 		goto out;
 
-	ret = finalize_host_mappings();
-	if (ret)
-		goto out;
-
 	pkvm_pgtable_mm_ops = (struct kvm_pgtable_mm_ops) {
-		.zalloc_page = hyp_zalloc_hyp_page,
-		.phys_to_virt = hyp_phys_to_virt,
-		.virt_to_phys = hyp_virt_to_phys,
-		.get_page = hpool_get_page,
-		.put_page = hpool_put_page,
+		.zalloc_page	= hyp_zalloc_hyp_page,
+		.phys_to_virt	= hyp_phys_to_virt,
+		.virt_to_phys	= hyp_virt_to_phys,
+		.get_page	= hpool_get_page,
+		.put_page	= hpool_put_page,
 	};
 	pkvm_pgtable.mm_ops = &pkvm_pgtable_mm_ops;
 
+	ret = finalize_host_mappings();
+	if (ret)
+		goto out;
+
 out:
 	/*
 	 * We tail-called to here from handle___pkvm_init() and will not return,
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 09/16] KVM: arm64: Hook up ->page_count() for hypervisor stage-1 page-table
  2021-10-13 15:58 ` Quentin Perret
@ 2021-10-13 15:58   ` Quentin Perret
  0 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-13 15:58 UTC (permalink / raw)
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil
  Cc: linux-arm-kernel, kvmarm, linux-kernel, kernel-team

From: Will Deacon <will@kernel.org>

kvm_pgtable_hyp_unmap() relies on the ->page_count() function callback
being provided by the memory-management operations for the page-table.

Wire up this callback for the hypervisor stage-1 page-table.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
---
 arch/arm64/kvm/hyp/nvhe/setup.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
index ad89801dfed7..98b39facae04 100644
--- a/arch/arm64/kvm/hyp/nvhe/setup.c
+++ b/arch/arm64/kvm/hyp/nvhe/setup.c
@@ -246,6 +246,7 @@ void __noreturn __pkvm_init_finalise(void)
 		.virt_to_phys	= hyp_virt_to_phys,
 		.get_page	= hpool_get_page,
 		.put_page	= hpool_put_page,
+		.page_count	= hyp_page_count,
 	};
 	pkvm_pgtable.mm_ops = &pkvm_pgtable_mm_ops;
 
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 10/16] KVM: arm64: Implement kvm_pgtable_hyp_unmap() at EL2
@ 2021-10-13 15:58   ` Quentin Perret
  0 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-13 15:58 UTC (permalink / raw)
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil
  Cc: linux-arm-kernel, kvmarm, linux-kernel, kernel-team

From: Will Deacon <will@kernel.org>

Implement kvm_pgtable_hyp_unmap() which can be used to remove hypervisor
stage-1 mappings at EL2.
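
For illustration, a caller tearing down a single-page private mapping
might use it roughly as follows (a sketch assuming the EL2 pkvm_pgtable
instance; 'addr' is a hyp VA previously mapped with
kvm_pgtable_hyp_map()):

	u64 unmapped;

	unmapped = kvm_pgtable_hyp_unmap(&pkvm_pgtable, addr, PAGE_SIZE);
	if (unmapped != PAGE_SIZE)
		return -EINVAL;	/* the walk stopped early */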

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
---
 arch/arm64/include/asm/kvm_pgtable.h | 21 ++++++++++
 arch/arm64/kvm/hyp/pgtable.c         | 63 ++++++++++++++++++++++++++++
 2 files changed, 84 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index 027783829584..9d076f36401d 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -251,6 +251,27 @@ void kvm_pgtable_hyp_destroy(struct kvm_pgtable *pgt);
 int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 addr, u64 size, u64 phys,
 			enum kvm_pgtable_prot prot);
 
+/**
+ * kvm_pgtable_hyp_unmap() - Remove a mapping from a hypervisor stage-1 page-table.
+ * @pgt:	Page-table structure initialised by kvm_pgtable_hyp_init().
+ * @addr:	Virtual address from which to remove the mapping.
+ * @size:	Size of the mapping.
+ *
+ * The offset of @addr within a page is ignored and @size is rounded-up to
+ * the next page boundary; unlike kvm_pgtable_hyp_map(), there is no @phys
+ * argument to round down.
+ *
+ * TLB invalidation is performed for each page-table entry cleared during the
+ * unmapping operation and the reference count for the page-table page
+ * containing the cleared entry is decremented, with unreferenced pages being
+ * freed. The unmapping operation will stop early if it encounters either an
+ * invalid page-table entry or a valid block mapping which maps beyond the range
+ * being unmapped.
+ *
+ * Return: Number of bytes unmapped, which may be 0.
+ */
+u64 kvm_pgtable_hyp_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size);
+
 /**
  * kvm_get_vtcr() - Helper to construct VTCR_EL2
  * @mmfr0:	Sanitized value of SYS_ID_AA64MMFR0_EL1 register.
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 768a58835153..6ad4cb2d6947 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -463,6 +463,69 @@ int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 addr, u64 size, u64 phys,
 	return ret;
 }
 
+struct hyp_unmap_data {
+	u64				unmapped;
+	struct kvm_pgtable_mm_ops	*mm_ops;
+};
+
+static int hyp_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
+			    enum kvm_pgtable_walk_flags flag, void * const arg)
+{
+	kvm_pte_t pte = *ptep, *childp = NULL;
+	u64 granule = kvm_granule_size(level);
+	struct hyp_unmap_data *data = arg;
+	struct kvm_pgtable_mm_ops *mm_ops = data->mm_ops;
+
+	if (!kvm_pte_valid(pte))
+		return -EINVAL;
+
+	if (kvm_pte_table(pte, level)) {
+		childp = kvm_pte_follow(pte, mm_ops);
+
+		if (mm_ops->page_count(childp) != 1)
+			return 0;
+
+		kvm_clear_pte(ptep);
+		dsb(ishst);
+		__tlbi_level(vae2is, __TLBI_VADDR(addr, 0), level);
+	} else {
+		if (end - addr < granule)
+			return -EINVAL;
+
+		kvm_clear_pte(ptep);
+		dsb(ishst);
+		__tlbi_level(vale2is, __TLBI_VADDR(addr, 0), level);
+		data->unmapped += granule;
+	}
+
+	dsb(ish);
+	isb();
+	mm_ops->put_page(ptep);
+
+	if (childp)
+		mm_ops->put_page(childp);
+
+	return 0;
+}
+
+u64 kvm_pgtable_hyp_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size)
+{
+	struct hyp_unmap_data unmap_data = {
+		.mm_ops	= pgt->mm_ops,
+	};
+	struct kvm_pgtable_walker walker = {
+		.cb	= hyp_unmap_walker,
+		.arg	= &unmap_data,
+		.flags	= KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST,
+	};
+
+	if (!pgt->mm_ops->page_count)
+		return 0;
+
+	kvm_pgtable_walk(pgt, addr, size, &walker);
+	return unmap_data.unmapped;
+}
+
 int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits,
 			 struct kvm_pgtable_mm_ops *mm_ops)
 {
-- 
2.33.0.882.g93a45727a2-goog


* [PATCH 11/16] KVM: arm64: Back hyp_vmemmap for all of memory
@ 2021-10-13 15:58   ` Quentin Perret
  0 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-13 15:58 UTC (permalink / raw)
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil
  Cc: linux-arm-kernel, kvmarm, linux-kernel, kernel-team

The EL2 vmemmap in nVHE Protected mode is currently very sparse: only
memory pages owned by the hypervisor itself have a matching struct
hyp_page. But since the size of these structs has been reduced
significantly, it appears that we can now afford backing the vmemmap for
all of memory.

This will simplify memory tracking considerably, as the hypervisor will
have a place to store metadata (e.g. refcounts) that wouldn't otherwise
fit in the 4 SW bits available in the host stage-2 page-table entries.
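
As a ballpark figure, assuming 4KiB pages and the 4-byte struct hyp_page
this series ends up with (two unsigned shorts), backing the vmemmap for
all of memory costs 4B * (1GiB / 4KiB) = 1MiB per GiB of memory covered,
i.e. roughly 0.1% overhead.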

Signed-off-by: Quentin Perret <qperret@google.com>
---
 arch/arm64/kvm/hyp/include/nvhe/mm.h | 29 ++++++++++++++++++--------
 arch/arm64/kvm/hyp/nvhe/mm.c         | 31 ++++++++++++++++++++++++----
 arch/arm64/kvm/hyp/nvhe/page_alloc.c |  4 +---
 arch/arm64/kvm/hyp/nvhe/setup.c      |  7 +++----
 arch/arm64/kvm/hyp/reserved_mem.c    | 17 ++-------------
 5 files changed, 53 insertions(+), 35 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/nvhe/mm.h b/arch/arm64/kvm/hyp/include/nvhe/mm.h
index c9a8f535212e..f5e8582252c3 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/mm.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/mm.h
@@ -20,23 +20,34 @@ extern u64 __io_map_base;
 
 int hyp_create_idmap(u32 hyp_va_bits);
 int hyp_map_vectors(void);
-int hyp_back_vmemmap(phys_addr_t phys, unsigned long size, phys_addr_t back);
+int hyp_back_vmemmap(phys_addr_t back);
 int pkvm_cpu_set_vector(enum arm64_hyp_spectre_vector slot);
 int pkvm_create_mappings(void *from, void *to, enum kvm_pgtable_prot prot);
 int pkvm_create_mappings_locked(void *from, void *to, enum kvm_pgtable_prot prot);
 unsigned long __pkvm_create_private_mapping(phys_addr_t phys, size_t size,
 					    enum kvm_pgtable_prot prot);
 
-static inline void hyp_vmemmap_range(phys_addr_t phys, unsigned long size,
-				     unsigned long *start, unsigned long *end)
+static inline unsigned long hyp_vmemmap_memblock_size(struct memblock_region *reg)
 {
-	unsigned long nr_pages = size >> PAGE_SHIFT;
-	struct hyp_page *p = hyp_phys_to_page(phys);
+	unsigned long nr_pages = reg->size >> PAGE_SHIFT;
+	unsigned long start, end;
 
-	*start = (unsigned long)p;
-	*end = *start + nr_pages * sizeof(struct hyp_page);
-	*start = ALIGN_DOWN(*start, PAGE_SIZE);
-	*end = ALIGN(*end, PAGE_SIZE);
+	start = hyp_phys_to_pfn(reg->base) * sizeof(struct hyp_page);
+	end = start + nr_pages * sizeof(struct hyp_page);
+	start = ALIGN_DOWN(start, PAGE_SIZE);
+	end = ALIGN(end, PAGE_SIZE);
+
+	return end - start;
+}
+
+static inline unsigned long hyp_vmemmap_pages(void)
+{
+	unsigned long res = 0, i;
+
+	for (i = 0; i < kvm_nvhe_sym(hyp_memblock_nr); i++)
+		res += hyp_vmemmap_memblock_size(&kvm_nvhe_sym(hyp_memory)[i]);
+
+	return res >> PAGE_SHIFT;
 }
 
 static inline unsigned long __hyp_pgtable_max_pages(unsigned long nr_pages)
diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c
index 2fabeceb889a..65b948cbc0f5 100644
--- a/arch/arm64/kvm/hyp/nvhe/mm.c
+++ b/arch/arm64/kvm/hyp/nvhe/mm.c
@@ -103,13 +103,36 @@ int pkvm_create_mappings(void *from, void *to, enum kvm_pgtable_prot prot)
 	return ret;
 }
 
-int hyp_back_vmemmap(phys_addr_t phys, unsigned long size, phys_addr_t back)
+int hyp_back_vmemmap(phys_addr_t back)
 {
-	unsigned long start, end;
+	unsigned long i, start, size, end = 0;
+	int ret;
 
-	hyp_vmemmap_range(phys, size, &start, &end);
+	for (i = 0; i < hyp_memblock_nr; i++) {
+		start = hyp_memory[i].base;
+		start = ALIGN_DOWN((u64)hyp_phys_to_page(start), PAGE_SIZE);
+		/*
+		 * The beginning of the hyp_vmemmap region for the current
+		 * memblock may already be backed by the page backing the end
+		 * of the previous region, so avoid mapping it twice.
+		 */
+		start = max(start, end);
+
+		end = hyp_memory[i].base + hyp_memory[i].size;
+		end = PAGE_ALIGN((u64)hyp_phys_to_page(end));
+		if (start >= end)
+			continue;
+
+		size = end - start;
+		ret = __pkvm_create_mappings(start, size, back, PAGE_HYP);
+		if (ret)
+			return ret;
+
+		memset(hyp_phys_to_virt(back), 0, size);
+		back += size;
+	}
 
-	return __pkvm_create_mappings(start, end - start, back, PAGE_HYP);
+	return 0;
 }
 
 static void *__hyp_bp_vect_base;
diff --git a/arch/arm64/kvm/hyp/nvhe/page_alloc.c b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
index 41fc25bdfb34..38accc2e23e3 100644
--- a/arch/arm64/kvm/hyp/nvhe/page_alloc.c
+++ b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
@@ -234,10 +234,8 @@ int hyp_pool_init(struct hyp_pool *pool, u64 pfn, unsigned int nr_pages,
 
 	/* Init the vmemmap portion */
 	p = hyp_phys_to_page(phys);
-	for (i = 0; i < nr_pages; i++) {
-		p[i].order = 0;
+	for (i = 0; i < nr_pages; i++)
 		hyp_set_page_refcounted(&p[i]);
-	}
 
 	/* Attach the unused pages to the buddy tree */
 	for (i = reserved_pages; i < nr_pages; i++)
diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
index 98b39facae04..98441e4039b9 100644
--- a/arch/arm64/kvm/hyp/nvhe/setup.c
+++ b/arch/arm64/kvm/hyp/nvhe/setup.c
@@ -29,12 +29,11 @@ static struct kvm_pgtable_mm_ops pkvm_pgtable_mm_ops;
 
 static int divide_memory_pool(void *virt, unsigned long size)
 {
-	unsigned long vstart, vend, nr_pages;
+	unsigned long nr_pages;
 
 	hyp_early_alloc_init(virt, size);
 
-	hyp_vmemmap_range(__hyp_pa(virt), size, &vstart, &vend);
-	nr_pages = (vend - vstart) >> PAGE_SHIFT;
+	nr_pages = hyp_vmemmap_pages();
 	vmemmap_base = hyp_early_alloc_contig(nr_pages);
 	if (!vmemmap_base)
 		return -ENOMEM;
@@ -76,7 +75,7 @@ static int recreate_hyp_mappings(phys_addr_t phys, unsigned long size,
 	if (ret)
 		return ret;
 
-	ret = hyp_back_vmemmap(phys, size, hyp_virt_to_phys(vmemmap_base));
+	ret = hyp_back_vmemmap(hyp_virt_to_phys(vmemmap_base));
 	if (ret)
 		return ret;
 
diff --git a/arch/arm64/kvm/hyp/reserved_mem.c b/arch/arm64/kvm/hyp/reserved_mem.c
index 578670e3f608..81db85bfdbad 100644
--- a/arch/arm64/kvm/hyp/reserved_mem.c
+++ b/arch/arm64/kvm/hyp/reserved_mem.c
@@ -54,7 +54,7 @@ static int __init register_memblock_regions(void)
 
 void __init kvm_hyp_reserve(void)
 {
-	u64 nr_pages, prev, hyp_mem_pages = 0;
+	u64 hyp_mem_pages = 0;
 	int ret;
 
 	if (!is_hyp_mode_available() || is_kernel_in_hyp_mode())
@@ -72,20 +72,7 @@ void __init kvm_hyp_reserve(void)
 
 	hyp_mem_pages += hyp_s1_pgtable_pages();
 	hyp_mem_pages += host_s2_pgtable_pages();
-
-	/*
-	 * The hyp_vmemmap needs to be backed by pages, but these pages
-	 * themselves need to be present in the vmemmap, so compute the number
-	 * of pages needed by looking for a fixed point.
-	 */
-	nr_pages = 0;
-	do {
-		prev = nr_pages;
-		nr_pages = hyp_mem_pages + prev;
-		nr_pages = DIV_ROUND_UP(nr_pages * sizeof(struct hyp_page), PAGE_SIZE);
-		nr_pages += __hyp_pgtable_max_pages(nr_pages);
-	} while (nr_pages != prev);
-	hyp_mem_pages += nr_pages;
+	hyp_mem_pages += hyp_vmemmap_pages();
 
 	/*
 	 * Try to allocate a PMD-aligned region to reduce TLB pressure once
-- 
2.33.0.882.g93a45727a2-goog


* [PATCH 12/16] KVM: arm64: Move hyp refcount helpers to header files
@ 2021-10-13 15:58   ` Quentin Perret
  0 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-13 15:58 UTC (permalink / raw)
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil
  Cc: linux-arm-kernel, kvmarm, linux-kernel, kernel-team

We will soon need to touch the hyp_page refcount from outside
page_alloc.c in nVHE protected mode, so move the relevant helpers into a
header file.
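
For instance, with the helpers exposed in a header, the EL2 page-sharing
code can take an extra reference on a physical page along these lines (a
sketch of the intended usage, not part of this patch):

	struct hyp_page *page = hyp_phys_to_page(phys);

	hyp_page_ref_inc(page);	/* one more outstanding share */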

Signed-off-by: Quentin Perret <qperret@google.com>
---
 arch/arm64/kvm/hyp/include/nvhe/memory.h | 18 ++++++++++++++++++
 arch/arm64/kvm/hyp/nvhe/page_alloc.c     | 18 ------------------
 2 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/nvhe/memory.h b/arch/arm64/kvm/hyp/include/nvhe/memory.h
index 592b7edb3edb..e77783be0f3f 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/memory.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/memory.h
@@ -12,6 +12,24 @@ struct hyp_page {
 	unsigned short order;
 };
 
+static inline void hyp_page_ref_inc(struct hyp_page *p)
+{
+	BUG_ON(p->refcount == USHRT_MAX);
+	p->refcount++;
+}
+
+static inline int hyp_page_ref_dec_and_test(struct hyp_page *p)
+{
+	p->refcount--;
+	return (p->refcount == 0);
+}
+
+static inline void hyp_set_page_refcounted(struct hyp_page *p)
+{
+	BUG_ON(p->refcount);
+	p->refcount = 1;
+}
+
 extern u64 __hyp_vmemmap;
 #define hyp_vmemmap ((struct hyp_page *)__hyp_vmemmap)
 
diff --git a/arch/arm64/kvm/hyp/nvhe/page_alloc.c b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
index 38accc2e23e3..0d977169ed08 100644
--- a/arch/arm64/kvm/hyp/nvhe/page_alloc.c
+++ b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
@@ -144,24 +144,6 @@ static struct hyp_page *__hyp_extract_page(struct hyp_pool *pool,
 	return p;
 }
 
-static inline void hyp_page_ref_inc(struct hyp_page *p)
-{
-	BUG_ON(p->refcount == USHRT_MAX);
-	p->refcount++;
-}
-
-static inline int hyp_page_ref_dec_and_test(struct hyp_page *p)
-{
-	p->refcount--;
-	return (p->refcount == 0);
-}
-
-static inline void hyp_set_page_refcounted(struct hyp_page *p)
-{
-	BUG_ON(p->refcount);
-	p->refcount = 1;
-}
-
 static void __hyp_put_page(struct hyp_pool *pool, struct hyp_page *p)
 {
 	if (hyp_page_ref_dec_and_test(p))
-- 
2.33.0.882.g93a45727a2-goog


* [PATCH 13/16] KVM: arm64: Move double-sharing logic into hyp-specific function
@ 2021-10-13 15:58   ` Quentin Perret
  0 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-13 15:58 UTC (permalink / raw)
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil
  Cc: linux-arm-kernel, kvmarm, linux-kernel, kernel-team

From: Will Deacon <will@kernel.org>

Strictly speaking, double-sharing a page is an invalid transition and
should be rejected. However, we allow it in order to simplify the
book-keeping when KVM metadata (such as vcpu structures) co-exists in
the same page.

Given that double-sharing is only required for pages shared with the
hypervisor by the host, move the handling into a hyp-specific function
that checks incoming shares, thereby preventing double-sharing outside
of this particular transition.
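
As a concrete example: if two vcpu structures live in the same page, the
host ends up sharing the same PFN with EL2 twice. The second share must
succeed, but only if the page is already in the exact
SHARED_OWNED/SHARED_BORROWED state with matching physical address and
permissions, which is what hyp_check_incoming_share() enforces below.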

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
---
 arch/arm64/kvm/hyp/nvhe/mem_protect.c | 57 +++++++++++++++++++--------
 1 file changed, 41 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 909e60f71b06..3378117d010c 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -536,6 +536,33 @@ static int ack_share(struct pkvm_page_share_ack *ack,
 	}
 }
 
+static int hyp_check_incoming_share(struct pkvm_page_req *req,
+				    struct pkvm_page_share_ack *ack,
+				    enum pkvm_component_id initiator,
+				    enum kvm_pgtable_prot prot)
+{
+	/*
+	 * We allow the host to share the same page twice, but that means we
+	 * have to check that the states really do match exactly.
+	 */
+	if (initiator != PKVM_ID_HOST)
+		return -EPERM;
+
+	if (req->initiator.state != PKVM_PAGE_SHARED_OWNED)
+		return -EPERM;
+
+	if (ack->completer.state != PKVM_PAGE_SHARED_BORROWED)
+		return -EPERM;
+
+	if (ack->completer.phys != req->phys)
+		return -EPERM;
+
+	if (ack->completer.prot != prot)
+		return -EPERM;
+
+	return 0;
+}
+
 /*
  * Check that the page states in the initiator and the completer are compatible
  * for the requested page-sharing operation to go ahead.
@@ -544,6 +571,8 @@ static int check_share(struct pkvm_page_req *req,
 		       struct pkvm_page_share_ack *ack,
 		       struct pkvm_mem_share *share)
 {
+	struct pkvm_mem_transition *tx = &share->tx;
+
 	if (!addr_is_memory(req->phys))
 		return -EINVAL;
 
@@ -552,25 +581,22 @@ static int check_share(struct pkvm_page_req *req,
 		return 0;
 	}
 
-	if (req->initiator.state != PKVM_PAGE_SHARED_OWNED)
-		return -EPERM;
-
-	if (ack->completer.state != PKVM_PAGE_SHARED_BORROWED)
-		return -EPERM;
-
-	if (ack->completer.phys != req->phys)
-		return -EPERM;
-
-	if (ack->completer.prot != share->prot)
+	switch (tx->completer.id) {
+	case PKVM_ID_HYP:
+		return hyp_check_incoming_share(req, ack, tx->initiator.id,
+						share->prot);
+	default:
 		return -EPERM;
-
-	return 0;
+	}
 }
 
 static int host_initiate_share(struct pkvm_page_req *req)
 {
 	enum kvm_pgtable_prot prot;
 
+	if (req->initiator.state == PKVM_PAGE_SHARED_OWNED)
+		return 0;
+
 	prot = pkvm_mkstate(PKVM_HOST_MEM_PROT, PKVM_PAGE_SHARED_OWNED);
 	return host_stage2_idmap_locked(req->initiator.addr, PAGE_SIZE, prot);
 }
@@ -595,6 +621,9 @@ static int hyp_complete_share(struct pkvm_page_req *req,
 	void *start = (void *)req->completer.addr, *end = start + PAGE_SIZE;
 	enum kvm_pgtable_prot prot;
 
+	if (req->initiator.state == PKVM_PAGE_SHARED_OWNED)
+		return 0;
+
 	prot = pkvm_mkstate(perms, PKVM_PAGE_SHARED_BORROWED);
 	return pkvm_create_mappings_locked(start, end, prot);
 }
@@ -653,10 +682,6 @@ static int do_share(struct pkvm_mem_share *share)
 		if (ret)
 			break;
 
-		/* Allow double-sharing by skipping over the page */
-		if (req.initiator.state == PKVM_PAGE_SHARED_OWNED)
-			continue;
-
 		ret = initiate_share(&req, share);
 		if (ret)
 			break;
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 13/16] KVM: arm64: Move double-sharing logic into hyp-specific function
@ 2021-10-13 15:58   ` Quentin Perret
  0 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-13 15:58 UTC (permalink / raw)
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil
  Cc: kernel-team, kvmarm, linux-arm-kernel, linux-kernel

From: Will Deacon <will@kernel.org>

Strictly speaking, double-sharing a page is an invalid transition and
should be rejected, however we allow this in order to simplify the
book-keeping when KVM metadata (such as vcpu structures) co-exists in
the same page.

Given that double-sharing is only required for pages shared with the
hypervisor by the host, move the handling into a hyp-specific function
to check incoming shares, therefore preventing double-sharing outside
of this particular transition.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
---
 arch/arm64/kvm/hyp/nvhe/mem_protect.c | 57 +++++++++++++++++++--------
 1 file changed, 41 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 909e60f71b06..3378117d010c 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -536,6 +536,33 @@ static int ack_share(struct pkvm_page_share_ack *ack,
 	}
 }
 
+static int hyp_check_incoming_share(struct pkvm_page_req *req,
+				    struct pkvm_page_share_ack *ack,
+				    enum pkvm_component_id initiator,
+				    enum kvm_pgtable_prot prot)
+{
+	/*
+	 * We allow the host to share the same page twice, but that means we
+	 * have to check that the states really do match exactly.
+	 */
+	if (initiator != PKVM_ID_HOST)
+		return -EPERM;
+
+	if (req->initiator.state != PKVM_PAGE_SHARED_OWNED)
+		return -EPERM;
+
+	if (ack->completer.state != PKVM_PAGE_SHARED_BORROWED)
+		return -EPERM;
+
+	if (ack->completer.phys != req->phys)
+		return -EPERM;
+
+	if (ack->completer.prot != prot)
+		return -EPERM;
+
+	return 0;
+}
+
 /*
  * Check that the page states in the initiator and the completer are compatible
  * for the requested page-sharing operation to go ahead.
@@ -544,6 +571,8 @@ static int check_share(struct pkvm_page_req *req,
 		       struct pkvm_page_share_ack *ack,
 		       struct pkvm_mem_share *share)
 {
+	struct pkvm_mem_transition *tx = &share->tx;
+
 	if (!addr_is_memory(req->phys))
 		return -EINVAL;
 
@@ -552,25 +581,22 @@ static int check_share(struct pkvm_page_req *req,
 		return 0;
 	}
 
-	if (req->initiator.state != PKVM_PAGE_SHARED_OWNED)
-		return -EPERM;
-
-	if (ack->completer.state != PKVM_PAGE_SHARED_BORROWED)
-		return -EPERM;
-
-	if (ack->completer.phys != req->phys)
-		return -EPERM;
-
-	if (ack->completer.prot != share->prot)
+	switch (tx->completer.id) {
+	case PKVM_ID_HYP:
+		return hyp_check_incoming_share(req, ack, tx->initiator.id,
+						share->prot);
+	default:
 		return -EPERM;
-
-	return 0;
+	}
 }
 
 static int host_initiate_share(struct pkvm_page_req *req)
 {
 	enum kvm_pgtable_prot prot;
 
+	if (req->initiator.state == PKVM_PAGE_SHARED_OWNED)
+		return 0;
+
 	prot = pkvm_mkstate(PKVM_HOST_MEM_PROT, PKVM_PAGE_SHARED_OWNED);
 	return host_stage2_idmap_locked(req->initiator.addr, PAGE_SIZE, prot);
 }
@@ -595,6 +621,9 @@ static int hyp_complete_share(struct pkvm_page_req *req,
 	void *start = (void *)req->completer.addr, *end = start + PAGE_SIZE;
 	enum kvm_pgtable_prot prot;
 
+	if (req->initiator.state == PKVM_PAGE_SHARED_OWNED)
+		return 0;
+
 	prot = pkvm_mkstate(perms, PKVM_PAGE_SHARED_BORROWED);
 	return pkvm_create_mappings_locked(start, end, prot);
 }
@@ -653,10 +682,6 @@ static int do_share(struct pkvm_mem_share *share)
 		if (ret)
 			break;
 
-		/* Allow double-sharing by skipping over the page */
-		if (req.initiator.state == PKVM_PAGE_SHARED_OWNED)
-			continue;
-
 		ret = initiate_share(&req, share);
 		if (ret)
 			break;
-- 
2.33.0.882.g93a45727a2-goog

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 13/16] KVM: arm64: Move double-sharing logic into hyp-specific function
@ 2021-10-13 15:58   ` Quentin Perret
  0 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-13 15:58 UTC (permalink / raw)
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil
  Cc: linux-arm-kernel, kvmarm, linux-kernel, kernel-team

From: Will Deacon <will@kernel.org>

Strictly speaking, double-sharing a page is an invalid transition and
should be rejected, however we allow this in order to simplify the
book-keeping when KVM metadata (such as vcpu structures) co-exists in
the same page.

Given that double-sharing is only required for pages shared with the
hypervisor by the host, move the handling into a hyp-specific function
to check incoming shares, therefore preventing double-sharing outside
of this particular transition.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
---
 arch/arm64/kvm/hyp/nvhe/mem_protect.c | 57 +++++++++++++++++++--------
 1 file changed, 41 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 909e60f71b06..3378117d010c 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -536,6 +536,33 @@ static int ack_share(struct pkvm_page_share_ack *ack,
 	}
 }
 
+static int hyp_check_incoming_share(struct pkvm_page_req *req,
+				    struct pkvm_page_share_ack *ack,
+				    enum pkvm_component_id initiator,
+				    enum kvm_pgtable_prot prot)
+{
+	/*
+	 * We allow the host to share the same page twice, but that means we
+	 * have to check that the states really do match exactly.
+	 */
+	if (initiator != PKVM_ID_HOST)
+		return -EPERM;
+
+	if (req->initiator.state != PKVM_PAGE_SHARED_OWNED)
+		return -EPERM;
+
+	if (ack->completer.state != PKVM_PAGE_SHARED_BORROWED)
+		return -EPERM;
+
+	if (ack->completer.phys != req->phys)
+		return -EPERM;
+
+	if (ack->completer.prot != prot)
+		return -EPERM;
+
+	return 0;
+}
+
 /*
  * Check that the page states in the initiator and the completer are compatible
  * for the requested page-sharing operation to go ahead.
@@ -544,6 +571,8 @@ static int check_share(struct pkvm_page_req *req,
 		       struct pkvm_page_share_ack *ack,
 		       struct pkvm_mem_share *share)
 {
+	struct pkvm_mem_transition *tx = &share->tx;
+
 	if (!addr_is_memory(req->phys))
 		return -EINVAL;
 
@@ -552,25 +581,22 @@ static int check_share(struct pkvm_page_req *req,
 		return 0;
 	}
 
-	if (req->initiator.state != PKVM_PAGE_SHARED_OWNED)
-		return -EPERM;
-
-	if (ack->completer.state != PKVM_PAGE_SHARED_BORROWED)
-		return -EPERM;
-
-	if (ack->completer.phys != req->phys)
-		return -EPERM;
-
-	if (ack->completer.prot != share->prot)
+	switch (tx->completer.id) {
+	case PKVM_ID_HYP:
+		return hyp_check_incoming_share(req, ack, tx->initiator.id,
+						share->prot);
+	default:
 		return -EPERM;
-
-	return 0;
+	}
 }
 
 static int host_initiate_share(struct pkvm_page_req *req)
 {
 	enum kvm_pgtable_prot prot;
 
+	if (req->initiator.state == PKVM_PAGE_SHARED_OWNED)
+		return 0;
+
 	prot = pkvm_mkstate(PKVM_HOST_MEM_PROT, PKVM_PAGE_SHARED_OWNED);
 	return host_stage2_idmap_locked(req->initiator.addr, PAGE_SIZE, prot);
 }
@@ -595,6 +621,9 @@ static int hyp_complete_share(struct pkvm_page_req *req,
 	void *start = (void *)req->completer.addr, *end = start + PAGE_SIZE;
 	enum kvm_pgtable_prot prot;
 
+	if (req->initiator.state == PKVM_PAGE_SHARED_OWNED)
+		return 0;
+
 	prot = pkvm_mkstate(perms, PKVM_PAGE_SHARED_BORROWED);
 	return pkvm_create_mappings_locked(start, end, prot);
 }
@@ -653,10 +682,6 @@ static int do_share(struct pkvm_mem_share *share)
 		if (ret)
 			break;
 
-		/* Allow double-sharing by skipping over the page */
-		if (req.initiator.state == PKVM_PAGE_SHARED_OWNED)
-			continue;
-
 		ret = initiate_share(&req, share);
 		if (ret)
 			break;
-- 
2.33.0.882.g93a45727a2-goog


* [PATCH 14/16] KVM: arm64: Refcount shared pages at EL2
  2021-10-13 15:58 ` Quentin Perret
@ 2021-10-13 15:58   ` Quentin Perret
  -1 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-13 15:58 UTC (permalink / raw)
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil
  Cc: kernel-team, kvmarm, linux-arm-kernel, linux-kernel

We currently allow the host to share the same page with the hypervisor
more than once, but don't track how many times each page is shared. To
prepare for the introduction of an unshare operation in the hypervisor,
refcount the physical pages which the host shares more than once.

Signed-off-by: Quentin Perret <qperret@google.com>
---
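To make the intended semantics concrete, below is a standalone model
(illustration only; hyp_page_model is a stand-in type, not a kernel
structure) of what hyp_complete_share() does after this patch. It
compiles with any C compiler:

#include <assert.h>

struct hyp_page_model {		/* stand-in for struct hyp_page */
	int refcount;
	int mapped;		/* present in the EL2 stage-1? */
};

static void model_complete_share(struct hyp_page_model *p, int already_shared)
{
	if (already_shared) {	/* initiator was PKVM_PAGE_SHARED_OWNED */
		p->refcount++;	/* models hyp_page_ref_inc() */
		return;
	}
	p->mapped = 1;		/* models pkvm_create_mappings_locked() */
	p->refcount = 1;	/* models hyp_set_page_refcounted() */
}

int main(void)
{
	struct hyp_page_model page = { 0, 0 };

	model_complete_share(&page, 0);	/* first object in the page */
	model_complete_share(&page, 1);	/* second object, same page */
	assert(page.refcount == 2 && page.mapped == 1);
	return 0;
}
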
 arch/arm64/kvm/hyp/nvhe/mem_protect.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 3378117d010c..cad76bc68e53 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -560,6 +560,9 @@ static int hyp_check_incoming_share(struct pkvm_page_req *req,
 	if (ack->completer.prot != prot)
 		return -EPERM;
 
+	if (WARN_ON(!hyp_phys_to_page(req->phys)->refcount))
+		return -EINVAL;
+
 	return 0;
 }
 
@@ -619,13 +622,22 @@ static int hyp_complete_share(struct pkvm_page_req *req,
 			      enum kvm_pgtable_prot perms)
 {
 	void *start = (void *)req->completer.addr, *end = start + PAGE_SIZE;
+	struct hyp_page *page = hyp_phys_to_page(req->phys);
 	enum kvm_pgtable_prot prot;
+	int ret = 0;
 
-	if (req->initiator.state == PKVM_PAGE_SHARED_OWNED)
+	if (req->initiator.state == PKVM_PAGE_SHARED_OWNED) {
+		hyp_page_ref_inc(page);
 		return 0;
+	}
 
 	prot = pkvm_mkstate(perms, PKVM_PAGE_SHARED_BORROWED);
-	return pkvm_create_mappings_locked(start, end, prot);
+	ret = pkvm_create_mappings_locked(start, end, prot);
+
+	if (!ret)
+		hyp_set_page_refcounted(page);
+
+	return ret;
 }
 
 /* Update the completer's page-table for the page-sharing request */
-- 
2.33.0.882.g93a45727a2-goog

* [PATCH 15/16] KVM: arm64: pkvm: Introduce an unshare hypercall
  2021-10-13 15:58 ` Quentin Perret
@ 2021-10-13 15:58   ` Quentin Perret
  -1 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-13 15:58 UTC (permalink / raw)
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil
  Cc: kernel-team, kvmarm, linux-arm-kernel, linux-kernel

Introduce an unshare hypercall which can be used to unmap pages from
the hypervisor stage-1 in nVHE protected mode. This will be used to
update the EL2 ownership state of pages during guest teardown, and
avoids leaving dangling mappings to unreferenced portions of memory.

Signed-off-by: Quentin Perret <qperret@google.com>
---
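To illustrate the resulting state machine, a standalone model
(illustration only; not kernel code) of a page shared twice with EL2
and then unshared twice, mirroring host_initiate_unshare() and
hyp_complete_unshare() below:

#include <assert.h>

struct page_model {
	int refcount;		/* hyp_page refcount */
	int hyp_mapped;		/* mapped in the EL2 stage-1? */
	int host_shared;	/* host state: SHARED_OWNED (1) or OWNED (0) */
};

static void model_unshare(struct page_model *p)
{
	/*
	 * host_initiate_unshare(): only the last unshare flips the host
	 * back to OWNED; with refcount > 1 it stays SHARED_OWNED.
	 */
	if (p->refcount == 1)
		p->host_shared = 0;

	/*
	 * hyp_complete_unshare(): unmap on the final reference only,
	 * mirroring hyp_page_ref_dec_and_test().
	 */
	if (--p->refcount == 0)
		p->hyp_mapped = 0;	/* models kvm_pgtable_hyp_unmap() */
}

int main(void)
{
	struct page_model page = { 2, 1, 1 };	/* shared twice */

	model_unshare(&page);
	assert(page.refcount == 1 && page.hyp_mapped && page.host_shared);
	model_unshare(&page);
	assert(page.refcount == 0 && !page.hyp_mapped && !page.host_shared);
	return 0;
}
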
 arch/arm64/include/asm/kvm_asm.h              |   1 +
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |   1 +
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            |   9 +
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 159 ++++++++++++++++++
 4 files changed, 170 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index e86045ac43ba..98ba11b7b192 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -64,6 +64,7 @@
 #define __KVM_HOST_SMCCC_FUNC___pkvm_cpu_set_vector		18
 #define __KVM_HOST_SMCCC_FUNC___pkvm_prot_finalize		19
 #define __KVM_HOST_SMCCC_FUNC___kvm_adjust_pc			20
+#define __KVM_HOST_SMCCC_FUNC___pkvm_host_unshare_hyp		21
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
index 9c02abe92e0a..88e1607a94fe 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
@@ -55,6 +55,7 @@ extern const u8 pkvm_hyp_id;
 
 int __pkvm_prot_finalize(void);
 int __pkvm_host_share_hyp(u64 pfn, u64 nr_pages);
+int __pkvm_host_unshare_hyp(u64 pfn, u64 nr_pages);
 
 bool addr_is_memory(phys_addr_t phys);
 int host_stage2_idmap_locked(phys_addr_t addr, u64 size, enum kvm_pgtable_prot prot);
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index f78bec2b9dd4..7070ed9ead9b 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -148,6 +148,14 @@ static void handle___pkvm_host_share_hyp(struct kvm_cpu_context *host_ctxt)
 	cpu_reg(host_ctxt, 1) = __pkvm_host_share_hyp(pfn, nr_pages);
 }
 
+static void handle___pkvm_host_unshare_hyp(struct kvm_cpu_context *host_ctxt)
+{
+	DECLARE_REG(u64, pfn, host_ctxt, 1);
+	DECLARE_REG(u64, nr_pages, host_ctxt, 2);
+
+	cpu_reg(host_ctxt, 1) = __pkvm_host_unshare_hyp(pfn, nr_pages);
+}
+
 static void handle___pkvm_create_private_mapping(struct kvm_cpu_context *host_ctxt)
 {
 	DECLARE_REG(phys_addr_t, phys, host_ctxt, 1);
@@ -184,6 +192,7 @@ static const hcall_t host_hcall[] = {
 	HANDLE_FUNC(__pkvm_init),
 	HANDLE_FUNC(__pkvm_cpu_set_vector),
 	HANDLE_FUNC(__pkvm_host_share_hyp),
+	HANDLE_FUNC(__pkvm_host_unshare_hyp),
 	HANDLE_FUNC(__pkvm_create_private_mapping),
 	HANDLE_FUNC(__pkvm_prot_finalize),
 };
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index cad76bc68e53..3b724ab62e9f 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -740,3 +740,162 @@ int __pkvm_host_share_hyp(u64 pfn, u64 nr_pages)
 
 	return ret;
 }
+
+static int host_initiate_unshare(struct pkvm_page_req *req)
+{
+	struct hyp_page *page = hyp_phys_to_page(req->phys);
+	enum kvm_pgtable_prot prot;
+
+	if (page->refcount > 1)
+		return 0;
+
+	prot = pkvm_mkstate(PKVM_HOST_MEM_PROT, PKVM_PAGE_OWNED);
+	return host_stage2_idmap_locked(req->initiator.addr, PAGE_SIZE, prot);
+}
+
+static int initiate_unshare(struct pkvm_page_req *req,
+			    struct pkvm_mem_share *share)
+{
+	struct pkvm_mem_transition *tx = &share->tx;
+
+	switch (tx->initiator.id) {
+	case PKVM_ID_HOST:
+		return host_initiate_unshare(req);
+	default:
+		return -EINVAL;
+	}
+}
+
+static int hyp_complete_unshare(struct pkvm_page_req *req)
+{
+	struct hyp_page *page = hyp_phys_to_page(req->phys);
+	void *addr = (void *)req->completer.addr;
+	int ret = 0;
+
+	if (hyp_page_ref_dec_and_test(page)) {
+		ret = kvm_pgtable_hyp_unmap(&pkvm_pgtable, (u64)addr, PAGE_SIZE);
+		ret = (ret == PAGE_SIZE) ? 0 : -EINVAL;
+	}
+
+	return ret;
+}
+
+static int complete_unshare(struct pkvm_page_req *req,
+			    struct pkvm_mem_share *share)
+{
+	struct pkvm_mem_transition *tx = &share->tx;
+
+	switch (tx->completer.id) {
+	case PKVM_ID_HYP:
+		return hyp_complete_unshare(req);
+	default:
+		return -EINVAL;
+	}
+}
+
+static int check_unshare(struct pkvm_page_req *req,
+			 struct pkvm_page_share_ack *ack,
+			 struct pkvm_mem_share *share)
+{
+	struct pkvm_mem_transition *tx = &share->tx;
+
+	if (!addr_is_memory(req->phys))
+		return -EINVAL;
+
+	switch (tx->completer.id) {
+	case PKVM_ID_HYP:
+		return hyp_check_incoming_share(req, ack, tx->initiator.id,
+						share->prot);
+	default:
+		return -EPERM;
+	}
+}
+
+/*
+ * do_unshare():
+ *
+ * The page owner revokes access from another component for a range of
+ * pages which were previously shared using do_share().
+ *
+ * Initiator: SHARED_OWNED	=> OWNED
+ * Completer: SHARED_BORROWED	=> NOPAGE
+ */
+static int do_unshare(struct pkvm_mem_share *share)
+{
+	struct pkvm_page_req req;
+	int ret = 0;
+	u64 idx;
+
+	for (idx = 0; idx < share->tx.nr_pages; ++idx) {
+		struct pkvm_page_share_ack ack;
+
+		/*
+		 * Use the request_share() and ack_share() from the normal share
+		 * path as they implement all the checks we need here. But
+		 * check_unshare() needs to differ -- PKVM_PAGE_OWNED is illegal
+		 * for the initiator.
+		 */
+		ret = request_share(&req, share, idx);
+		if (ret)
+			goto out;
+
+		ret = ack_share(&ack, &req, share);
+		if (ret)
+			goto out;
+
+		ret = check_unshare(&req, &ack, share);
+		if (ret)
+			goto out;
+	}
+
+	for (idx = 0; idx < share->tx.nr_pages; ++idx) {
+		ret = request_share(&req, share, idx);
+		if (ret)
+			break;
+
+		ret = initiate_unshare(&req, share);
+		if (ret)
+			break;
+
+		ret = complete_unshare(&req, share);
+		if (ret)
+			break;
+	}
+
+	WARN_ON(ret);
+out:
+	return ret;
+}
+
+int __pkvm_host_unshare_hyp(u64 pfn, u64 nr_pages)
+{
+	int ret;
+	u64 host_addr = hyp_pfn_to_phys(pfn);
+	u64 hyp_addr = (u64)__hyp_va(host_addr);
+	struct pkvm_mem_share share = {
+		.tx	= {
+			.nr_pages	= nr_pages,
+			.initiator	= {
+				.id	= PKVM_ID_HOST,
+				.addr	= host_addr,
+				.host	= {
+					.completer_addr = hyp_addr,
+				},
+			},
+			.completer	= {
+				.id	= PKVM_ID_HYP,
+			},
+		},
+		.prot	= PAGE_HYP,
+	};
+
+	host_lock_component();
+	hyp_lock_component();
+
+	ret = do_unshare(&share);
+
+	hyp_unlock_component();
+	host_unlock_component();
+
+	return ret;
+}
-- 
2.33.0.882.g93a45727a2-goog

* [PATCH 16/16] KVM: arm64: pkvm: Unshare guest structs during teardown
  2021-10-13 15:58 ` Quentin Perret
@ 2021-10-13 15:58   ` Quentin Perret
  -1 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-13 15:58 UTC (permalink / raw)
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil
  Cc: kernel-team, kvmarm, linux-arm-kernel, linux-kernel

Make use of the newly introduced unshare hypercall during guest teardown
to unmap guest-related data structures from the hyp stage-1.

Signed-off-by: Quentin Perret <qperret@google.com>
---
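For illustration, a hedged sketch of the share/unshare pairing this
enables over an object's lifetime (struct my_obj and its callers are
hypothetical; the pattern matches the vcpu, thread_info, fpsimd and
sve_state handling in the hunks below):

struct my_obj { unsigned long data; };	/* hypothetical */

static int my_obj_map_hyp(struct my_obj *obj)
{
	/* Map [obj, obj + 1) into the hyp stage-1. */
	return kvm_share_hyp(obj, obj + 1);
}

static void my_obj_teardown(struct my_obj *obj)
{
	/*
	 * Drop the hyp mapping before freeing, so EL2 keeps no dangling
	 * mapping once the page is recycled by the host allocator.
	 */
	kvm_unshare_hyp(obj, obj + 1);
	kfree(obj);
}
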
 arch/arm64/include/asm/kvm_host.h |  2 ++
 arch/arm64/include/asm/kvm_mmu.h  |  1 +
 arch/arm64/kvm/arm.c              |  2 ++
 arch/arm64/kvm/fpsimd.c           | 10 ++++++++--
 arch/arm64/kvm/mmu.c              | 16 ++++++++++++++++
 arch/arm64/kvm/reset.c            | 13 ++++++++++++-
 6 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index f8be56d5342b..8b61cdcd1b29 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -322,6 +322,8 @@ struct kvm_vcpu_arch {
 
 	struct thread_info *host_thread_info;	/* hyp VA */
 	struct user_fpsimd_state *host_fpsimd_state;	/* hyp VA */
+	struct thread_info *kern_thread_info;
+	struct user_fpsimd_state *kern_fpsimd_state;
 
 	struct {
 		/* {Break,watch}point registers */
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 185d0f62b724..81839e9a8a24 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -151,6 +151,7 @@ static __always_inline unsigned long __kern_hyp_va(unsigned long v)
 #include <asm/stage2_pgtable.h>
 
 int kvm_share_hyp(void *from, void *to);
+void kvm_unshare_hyp(void *from, void *to);
 int create_hyp_mappings(void *from, void *to, enum kvm_pgtable_prot prot);
 int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
 			   void __iomem **kaddr,
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index f2e74635332b..f11c51db6fe6 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -188,6 +188,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 		}
 	}
 	atomic_set(&kvm->online_vcpus, 0);
+
+	kvm_unshare_hyp(kvm, kvm + 1);
 }
 
 int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 2fe1128d9f3d..67059daf4d26 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -28,23 +28,29 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
 {
 	int ret;
 
-	struct thread_info *ti = &current->thread_info;
-	struct user_fpsimd_state *fpsimd = &current->thread.uw.fpsimd_state;
+	struct thread_info *ti = vcpu->arch.kern_thread_info;
+	struct user_fpsimd_state *fpsimd = vcpu->arch.kern_fpsimd_state;
 
 	/*
 	 * Make sure the host task thread flags and fpsimd state are
 	 * visible to hyp:
 	 */
+	kvm_unshare_hyp(ti, ti + 1);
+	ti = &current->thread_info;
 	ret = kvm_share_hyp(ti, ti + 1);
 	if (ret)
 		goto error;
 
+	kvm_unshare_hyp(fpsimd, fpsimd + 1);
+	fpsimd = &current->thread.uw.fpsimd_state;
 	ret = kvm_share_hyp(fpsimd, fpsimd + 1);
 	if (ret)
 		goto error;
 
 	vcpu->arch.host_thread_info = kern_hyp_va(ti);
 	vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);
+	vcpu->arch.kern_thread_info = ti;
+	vcpu->arch.kern_fpsimd_state = fpsimd;
 error:
 	return ret;
 }
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index bc9865a8c988..f01b0e49e262 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -300,6 +300,22 @@ int kvm_share_hyp(void *from, void *to)
 				 nr_pages);
 }
 
+void kvm_unshare_hyp(void *from, void *to)
+{
+	phys_addr_t start, end;
+	u64 nr_pages;
+
+	if (is_kernel_in_hyp_mode() || kvm_host_owns_hyp_mappings() || !from)
+		return;
+
+	start = ALIGN_DOWN(kvm_kaddr_to_phys(from), PAGE_SIZE);
+	end = PAGE_ALIGN(kvm_kaddr_to_phys(to));
+	nr_pages = (end - start) >> PAGE_SHIFT;
+
+	WARN_ON(kvm_call_hyp_nvhe(__pkvm_host_unshare_hyp, __phys_to_pfn(start),
+				  nr_pages));
+}
+
 /**
  * create_hyp_mappings - duplicate a kernel virtual address range in Hyp mode
  * @from:	The virtual kernel start address of the range
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 5ce36b0a3343..e3e9c9e1f1c8 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -141,7 +141,18 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu)
 
 void kvm_arm_vcpu_destroy(struct kvm_vcpu *vcpu)
 {
-	kfree(vcpu->arch.sve_state);
+	struct user_fpsimd_state *fpsimd = vcpu->arch.kern_fpsimd_state;
+	struct thread_info *ti = vcpu->arch.kern_thread_info;
+	void *sve_state = vcpu->arch.sve_state;
+
+	kvm_unshare_hyp(vcpu, vcpu + 1);
+	if (ti)
+		kvm_unshare_hyp(ti, ti + 1);
+	if (fpsimd)
+		kvm_unshare_hyp(fpsimd, fpsimd + 1);
+	if (sve_state && vcpu->arch.has_run_once)
+		kvm_unshare_hyp(sve_state, sve_state + vcpu_sve_state_size(vcpu));
+	kfree(sve_state);
 }
 
 static void kvm_vcpu_reset_sve(struct kvm_vcpu *vcpu)
-- 
2.33.0.882.g93a45727a2-goog


* Re: [PATCH 01/16] KVM: arm64: Introduce do_share() helper for memory sharing between components
  2021-10-13 15:58   ` Quentin Perret
@ 2021-10-15 15:11     ` Andrew Walbran
  -1 siblings, 0 replies; 84+ messages in thread
From: Andrew Walbran @ 2021-10-15 15:11 UTC (permalink / raw)
  To: Quentin Perret
  Cc: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil,
	linux-arm-kernel, kvmarm, linux-kernel, Android Kernel Team

On Wed, 13 Oct 2021 at 16:58, 'Quentin Perret' via kernel-team
<kernel-team@android.com> wrote:
>
> From: Will Deacon <will@kernel.org>
>
> In preparation for extending memory sharing to include the guest as well
> as the hypervisor and the host, introduce a high-level do_share() helper
> which allows memory to be shared between these components without
> duplication of validity checks.
>
> Signed-off-by: Will Deacon <will@kernel.org>
> Signed-off-by: Quentin Perret <qperret@google.com>
> ---
>  arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |   5 +
>  arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 315 ++++++++++++++++++
>  2 files changed, 320 insertions(+)
>
> diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
> index b58c910babaf..56445586c755 100644
> --- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
> +++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
> @@ -24,6 +24,11 @@ enum pkvm_page_state {
>         PKVM_PAGE_OWNED                 = 0ULL,
>         PKVM_PAGE_SHARED_OWNED          = KVM_PGTABLE_PROT_SW0,
>         PKVM_PAGE_SHARED_BORROWED       = KVM_PGTABLE_PROT_SW1,
> +       __PKVM_PAGE_RESERVED            = KVM_PGTABLE_PROT_SW0 |
> +                                         KVM_PGTABLE_PROT_SW1,
> +
> +       /* Meta-states which aren't encoded directly in the PTE's SW bits */
> +       PKVM_NOPAGE,
>  };
>
>  #define PKVM_PAGE_STATE_PROT_MASK      (KVM_PGTABLE_PROT_SW0 | KVM_PGTABLE_PROT_SW1)
> diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> index bacd493a4eac..53e503501044 100644
> --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> @@ -443,3 +443,318 @@ void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt)
>         ret = host_stage2_idmap(addr);
>         BUG_ON(ret && ret != -EAGAIN);
>  }
> +
> +/* This corresponds to locking order */
> +enum pkvm_component_id {
> +       PKVM_ID_HOST,
> +       PKVM_ID_HYP,
> +};
> +
> +struct pkvm_mem_transition {
> +       u64                             nr_pages;
> +
> +       struct {
> +               enum pkvm_component_id  id;
> +               u64                     addr;
Is this the physical address or the IPA of the initiator? It would be
good to have a comment explaining which.

> +
> +               union {
> +                       struct {
> +                               u64     completer_addr;
> +                       } host;
> +               };
> +       } initiator;
> +
> +       struct {
> +               enum pkvm_component_id  id;
> +       } completer;
> +};
> +
> +struct pkvm_mem_share {
> +       struct pkvm_mem_transition      tx;
> +       enum kvm_pgtable_prot           prot;
> +};
> +
> +struct pkvm_page_req {
> +       struct {
> +               enum pkvm_page_state    state;
> +               u64                     addr;
> +       } initiator;
> +
> +       struct {
> +               u64                     addr;
> +       } completer;
> +
> +       phys_addr_t                     phys;
> +};
> +
> +struct pkvm_page_share_ack {
> +       struct {
> +               enum pkvm_page_state    state;
> +               phys_addr_t             phys;
> +               enum kvm_pgtable_prot   prot;
> +       } completer;
> +};
> +
> +static void host_lock_component(void)
> +{
> +       hyp_spin_lock(&host_kvm.lock);
> +}
> +
> +static void host_unlock_component(void)
> +{
> +       hyp_spin_unlock(&host_kvm.lock);
> +}
> +
> +static void hyp_lock_component(void)
> +{
> +       hyp_spin_lock(&pkvm_pgd_lock);
> +}
> +
> +static void hyp_unlock_component(void)
> +{
> +       hyp_spin_unlock(&pkvm_pgd_lock);
> +}
> +
> +static int host_request_share(struct pkvm_page_req *req,
> +                             struct pkvm_mem_transition *tx,
> +                             u64 idx)
> +{
> +       u64 offset = idx * PAGE_SIZE;
> +       enum kvm_pgtable_prot prot;
> +       u64 host_addr;
> +       kvm_pte_t pte;
> +       int err;
> +
> +       hyp_assert_lock_held(&host_kvm.lock);
> +
> +       host_addr = tx->initiator.addr + offset;
> +       err = kvm_pgtable_get_leaf(&host_kvm.pgt, host_addr, &pte, NULL);
> +       if (err)
> +               return err;
> +
> +       if (!kvm_pte_valid(pte) && pte)
> +               return -EPERM;
> +
> +       prot = kvm_pgtable_stage2_pte_prot(pte);
> +       *req = (struct pkvm_page_req) {
> +               .initiator      = {
> +                       .state  = pkvm_getstate(prot),
> +                       .addr   = host_addr,
> +               },
> +               .completer      = {
> +                       .addr   = tx->initiator.host.completer_addr + offset,
> +               },
> +               .phys           = host_addr,
> +       };
> +
> +       return 0;
> +}
> +
> +/*
> + * Populate the page-sharing request (@req) based on the share transition
> + * information from the initiator and its current page state.
> + */
> +static int request_share(struct pkvm_page_req *req,
> +                        struct pkvm_mem_share *share,
> +                        u64 idx)
> +{
> +       struct pkvm_mem_transition *tx = &share->tx;
> +
> +       switch (tx->initiator.id) {
> +       case PKVM_ID_HOST:
> +               return host_request_share(req, tx, idx);
> +       default:
> +               return -EINVAL;
> +       }
> +}
> +
> +static int hyp_ack_share(struct pkvm_page_share_ack *ack,
> +                        struct pkvm_page_req *req,
> +                        enum kvm_pgtable_prot perms)
> +{
> +       enum pkvm_page_state state = PKVM_NOPAGE;
> +       enum kvm_pgtable_prot prot = 0;
> +       phys_addr_t phys = 0;
> +       kvm_pte_t pte;
> +       u64 hyp_addr;
> +       int err;
> +
> +       hyp_assert_lock_held(&pkvm_pgd_lock);
> +
> +       if (perms != PAGE_HYP)
> +               return -EPERM;
> +
> +       hyp_addr = req->completer.addr;
> +       err = kvm_pgtable_get_leaf(&pkvm_pgtable, hyp_addr, &pte, NULL);
> +       if (err)
> +               return err;
> +
> +       if (kvm_pte_valid(pte)) {
> +               state   = pkvm_getstate(kvm_pgtable_hyp_pte_prot(pte));
> +               phys    = kvm_pte_to_phys(pte);
> +               prot    = kvm_pgtable_hyp_pte_prot(pte) & KVM_PGTABLE_PROT_RWX;
> +       }
> +
> +       *ack = (struct pkvm_page_share_ack) {
> +               .completer      = {
> +                       .state  = state,
> +                       .phys   = phys,
> +                       .prot   = prot,
> +               },
> +       };
> +
> +       return 0;
> +}
> +
> +/*
> + * Populate the page-sharing acknowledgment (@ack) based on the sharing request
> + * from the initiator and the current page state in the completer.
> + */
> +static int ack_share(struct pkvm_page_share_ack *ack,
> +                    struct pkvm_page_req *req,
> +                    struct pkvm_mem_share *share)
> +{
> +       struct pkvm_mem_transition *tx = &share->tx;
> +
> +       switch (tx->completer.id) {
> +       case PKVM_ID_HYP:
> +               return hyp_ack_share(ack, req, share->prot);
> +       default:
> +               return -EINVAL;
> +       }
> +}
> +
> +/*
> + * Check that the page states in the initiator and the completer are compatible
> + * for the requested page-sharing operation to go ahead.
> + */
> +static int check_share(struct pkvm_page_req *req,
> +                      struct pkvm_page_share_ack *ack,
> +                      struct pkvm_mem_share *share)
> +{
> +       if (!addr_is_memory(req->phys))
> +               return -EINVAL;
> +
> +       if (req->initiator.state == PKVM_PAGE_OWNED &&
> +           ack->completer.state == PKVM_NOPAGE) {
> +               return 0;
> +       }
> +
> +       if (req->initiator.state != PKVM_PAGE_SHARED_OWNED)
> +               return -EPERM;
> +
> +       if (ack->completer.state != PKVM_PAGE_SHARED_BORROWED)
> +               return -EPERM;
> +
> +       if (ack->completer.phys != req->phys)
> +               return -EPERM;
> +
> +       if (ack->completer.prot != share->prot)
> +               return -EPERM;
I guess this is the workaround you mentioned for the fact that the
host can share the same page twice? It might be worth adding a comment
to explain that this is what's going on.

> +
> +       return 0;
> +}
> +
> +static int host_initiate_share(struct pkvm_page_req *req)
> +{
> +       enum kvm_pgtable_prot prot;
> +
> +       prot = pkvm_mkstate(PKVM_HOST_MEM_PROT, PKVM_PAGE_SHARED_OWNED);
> +       return host_stage2_idmap_locked(req->initiator.addr, PAGE_SIZE, prot);
> +}
> +
> +/* Update the initiator's page-table for the page-sharing request */
> +static int initiate_share(struct pkvm_page_req *req,
> +                         struct pkvm_mem_share *share)
> +{
> +       struct pkvm_mem_transition *tx = &share->tx;
> +
> +       switch (tx->initiator.id) {
> +       case PKVM_ID_HOST:
> +               return host_initiate_share(req);
> +       default:
> +               return -EINVAL;
> +       }
> +}
> +
> +static int hyp_complete_share(struct pkvm_page_req *req,
> +                             enum kvm_pgtable_prot perms)
> +{
> +       void *start = (void *)req->completer.addr, *end = start + PAGE_SIZE;
> +       enum kvm_pgtable_prot prot;
> +
> +       prot = pkvm_mkstate(perms, PKVM_PAGE_SHARED_BORROWED);
> +       return pkvm_create_mappings_locked(start, end, prot);
> +}
> +
> +/* Update the completer's page-table for the page-sharing request */
> +static int complete_share(struct pkvm_page_req *req,
> +                         struct pkvm_mem_share *share)
> +{
> +       struct pkvm_mem_transition *tx = &share->tx;
> +
> +       switch (tx->completer.id) {
> +       case PKVM_ID_HYP:
> +               return hyp_complete_share(req, share->prot);
> +       default:
> +               return -EINVAL;
> +       }
> +}
> +
> +/*
> + * do_share():
> + *
> + * The page owner grants access to another component with a given set
> + * of permissions.
> + *
> + * Initiator: OWNED    => SHARED_OWNED
> + * Completer: NOPAGE   => SHARED_BORROWED
> + *
> + * Note that we permit the same share operation to be repeated from the
> + * host to the hypervisor, as this removes the need for excessive
> + * book-keeping of shared KVM data structures at EL1.
> + */
> +static int do_share(struct pkvm_mem_share *share)
> +{
> +       struct pkvm_page_req req;
> +       int ret = 0;
> +       u64 idx;
> +
> +       for (idx = 0; idx < share->tx.nr_pages; ++idx) {
> +               struct pkvm_page_share_ack ack;
> +
> +               ret = request_share(&req, share, idx);
> +               if (ret)
> +                       goto out;
> +
> +               ret = ack_share(&ack, &req, share);
> +               if (ret)
> +                       goto out;
> +
> +               ret = check_share(&req, &ack, share);
> +               if (ret)
> +                       goto out;
> +       }
> +
> +       for (idx = 0; idx < share->tx.nr_pages; ++idx) {
> +               ret = request_share(&req, share, idx);
> +               if (ret)
> +                       break;
> +
> +               /* Allow double-sharing by skipping over the page */
> +               if (req.initiator.state == PKVM_PAGE_SHARED_OWNED)
> +                       continue;
> +
> +               ret = initiate_share(&req, share);
> +               if (ret)
> +                       break;
> +
> +               ret = complete_share(&req, share);
> +               if (ret)
> +                       break;
> +       }
> +
> +       WARN_ON(ret);
> +out:
> +       return ret;
> +}
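
As a rough sketch of how this helper ends up being driven (the real
caller lands in a later patch of the series; the shape below is an
approximation built from the structures above, not the exact hunk):

	int __pkvm_host_share_hyp(u64 pfn)
	{
		u64 host_addr = hyp_pfn_to_phys(pfn);
		u64 hyp_addr = (u64)__hyp_va(host_addr);
		struct pkvm_mem_share share = {
			.tx	= {
				.nr_pages	= 1,
				.initiator	= {
					.id	= PKVM_ID_HOST,
					.addr	= host_addr,
					.host	= {
						.completer_addr = hyp_addr,
					},
				},
				.completer	= {
					.id	= PKVM_ID_HYP,
				},
			},
			.prot	= PAGE_HYP,
		};
		int ret;

		/* Take both component locks in the documented order. */
		host_lock_component();
		hyp_lock_component();

		ret = do_share(&share);

		hyp_unlock_component();
		host_unlock_component();

		return ret;
	}
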
> --
> 2.33.0.882.g93a45727a2-goog
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 3998 bytes --]

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 03/16] KVM: arm64: Avoid remapping the SVE state in the hyp stage-1
  2021-10-13 15:58   ` Quentin Perret
@ 2021-10-16 11:04     ` Marc Zyngier
  0 siblings, 0 replies; 84+ messages in thread
From: Marc Zyngier @ 2021-10-16 11:04 UTC (permalink / raw)
  To: Quentin Perret
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Catalin Marinas,
	Will Deacon, Fuad Tabba, David Brazdil, linux-arm-kernel, kvmarm,
	linux-kernel, kernel-team

On Wed, 13 Oct 2021 16:58:18 +0100,
Quentin Perret <qperret@google.com> wrote:
> 
> We currently map the SVE state into the hypervisor stage-1 on VCPU_RUN,
> when the vCPU thread's PID has changed. However, this only needs to be
> done during the first VCPU_RUN as the SVE state doesn't depend on
> thread-specific data, so move the create_hyp_mapping() call to
> kvm_vcpu_first_run_init().
> 
> Suggested-by: Marc Zyngier <maz@kernel.org>
> Signed-off-by: Quentin Perret <qperret@google.com>
> ---
>  arch/arm64/kvm/arm.c    | 12 ++++++++++++
>  arch/arm64/kvm/fpsimd.c | 11 -----------
>  2 files changed, 12 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index fe102cd2e518..c33d8c073820 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -618,6 +618,18 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
>  	if (ret)
>  		return ret;
>  
> +	if (vcpu->arch.sve_state) {
> +		void *sve_end;
> +
> +		sve_end = vcpu->arch.sve_state + vcpu_sve_state_size(vcpu);
> +
> +		ret = create_hyp_mappings(vcpu->arch.sve_state, sve_end,
> +					  PAGE_HYP);
> +		if (ret)
> +			return ret;
> +	}
> +
> +
>  	ret = kvm_arm_pmu_v3_enable(vcpu);
>  
>  	return ret;
> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> index 5621020b28de..62c0d78da7be 100644
> --- a/arch/arm64/kvm/fpsimd.c
> +++ b/arch/arm64/kvm/fpsimd.c
> @@ -43,17 +43,6 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
>  	if (ret)
>  		goto error;
>  
> -	if (vcpu->arch.sve_state) {
> -		void *sve_end;
> -
> -		sve_end = vcpu->arch.sve_state + vcpu_sve_state_size(vcpu);
> -
> -		ret = create_hyp_mappings(vcpu->arch.sve_state, sve_end,
> -					  PAGE_HYP);
> -		if (ret)
> -			goto error;
> -	}
> -
>  	vcpu->arch.host_thread_info = kern_hyp_va(ti);
>  	vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);
>  error:

I actually ended up implementing a similar patch as part of my 'first
run' series[1], though I moved the mapping to the point where we
finalise the vcpu as that's where the allocation takes place.

Do you see any potential issue with that approach?
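
For reference, the finalisation-time variant looks roughly like this
(paraphrased from [1], not the exact hunk), with the mapping created in
kvm_vcpu_finalize_sve() as soon as the buffer has been allocated:

	reg_sz = vcpu_sve_state_size(vcpu);
	buf = kzalloc(reg_sz, GFP_KERNEL_ACCOUNT);
	if (!buf)
		return -ENOMEM;

	ret = create_hyp_mappings(buf, buf + reg_sz, PAGE_HYP);
	if (ret) {
		kfree(buf);
		return ret;
	}

	vcpu->arch.sve_state = buf;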

Thanks,

	M.

[1] https://lore.kernel.org/r/20211015090822.2994920-2-maz@kernel.org

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 16/16] KVM: arm64: pkvm: Unshare guest structs during teardown
  2021-10-13 15:58   ` Quentin Perret
@ 2021-10-16 12:25     ` Marc Zyngier
  0 siblings, 0 replies; 84+ messages in thread
From: Marc Zyngier @ 2021-10-16 12:25 UTC (permalink / raw)
  To: Quentin Perret
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Catalin Marinas,
	Will Deacon, Fuad Tabba, David Brazdil, linux-arm-kernel, kvmarm,
	linux-kernel, kernel-team

On Wed, 13 Oct 2021 16:58:31 +0100,
Quentin Perret <qperret@google.com> wrote:
> 
> Make use of the newly introduced unshare hypercall during guest teardown
> to unmap guest-related data structures from the hyp stage-1.
> 
> Signed-off-by: Quentin Perret <qperret@google.com>
> ---
>  arch/arm64/include/asm/kvm_host.h |  2 ++
>  arch/arm64/include/asm/kvm_mmu.h  |  1 +
>  arch/arm64/kvm/arm.c              |  2 ++
>  arch/arm64/kvm/fpsimd.c           | 10 ++++++++--
>  arch/arm64/kvm/mmu.c              | 16 ++++++++++++++++
>  arch/arm64/kvm/reset.c            | 13 ++++++++++++-
>  6 files changed, 41 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index f8be56d5342b..8b61cdcd1b29 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -322,6 +322,8 @@ struct kvm_vcpu_arch {
>  
>  	struct thread_info *host_thread_info;	/* hyp VA */
>  	struct user_fpsimd_state *host_fpsimd_state;	/* hyp VA */
> +	struct thread_info *kern_thread_info;
> +	struct user_fpsimd_state *kern_fpsimd_state;
>  
>  	struct {
>  		/* {Break,watch}point registers */
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 185d0f62b724..81839e9a8a24 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -151,6 +151,7 @@ static __always_inline unsigned long __kern_hyp_va(unsigned long v)
>  #include <asm/stage2_pgtable.h>
>  
>  int kvm_share_hyp(void *from, void *to);
> +void kvm_unshare_hyp(void *from, void *to);
>  int create_hyp_mappings(void *from, void *to, enum kvm_pgtable_prot prot);
>  int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
>  			   void __iomem **kaddr,
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index f2e74635332b..f11c51db6fe6 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -188,6 +188,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
>  		}
>  	}
>  	atomic_set(&kvm->online_vcpus, 0);
> +
> +	kvm_unshare_hyp(kvm, kvm + 1);
>  }
>  
>  int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> index 2fe1128d9f3d..67059daf4d26 100644
> --- a/arch/arm64/kvm/fpsimd.c
> +++ b/arch/arm64/kvm/fpsimd.c
> @@ -28,23 +28,29 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
>  {
>  	int ret;
>  
> -	struct thread_info *ti = &current->thread_info;
> -	struct user_fpsimd_state *fpsimd = &current->thread.uw.fpsimd_state;
> +	struct thread_info *ti = vcpu->arch.kern_thread_info;
> +	struct user_fpsimd_state *fpsimd = vcpu->arch.kern_fpsimd_state;
>  
>  	/*
>  	 * Make sure the host task thread flags and fpsimd state are
>  	 * visible to hyp:
>  	 */
> +	kvm_unshare_hyp(ti, ti + 1);

At this stage, the old thread may have been destroyed and the memory
recycled. What happens if, in the interval, that memory gets shared
again in another context? My gut feeling is that either the sharing
fails, or the unshare above breaks something else (the refcounting
doesn't save us if the sharing is not with HYP).

In any case, I wonder whether we need a notification from the core
code that a thread for which we have a HYP mapping is gone so that we
can mop up the mapping at that point. That's similar to what we have
for MMU notifiers and S2 PTs.

This is doable by hooking into fpsimd_release_task() and extending
thread_struct to track the sharing with HYP.
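
A very rough sketch of that idea (the flag and its placement are made
up here purely for illustration; kvm_unshare_hyp() is the unshare
helper introduced by this series):

	/* arch/arm64/include/asm/processor.h */
	struct thread_struct {
		/* ... existing fields ... */
		unsigned int		shared_with_hyp : 1;
	};

	/* arch/arm64/kernel/fpsimd.c */
	void fpsimd_release_task(struct task_struct *dead_task)
	{
		__sve_free(dead_task);

		/* Mop up any EL2 mapping of this task's state. */
		if (dead_task->thread.shared_with_hyp) {
			struct thread_info *ti = &dead_task->thread_info;
			struct user_fpsimd_state *fp =
				&dead_task->thread.uw.fpsimd_state;

			kvm_unshare_hyp(ti, ti + 1);
			kvm_unshare_hyp(fp, fp + 1);
		}
	}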

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 04/16] KVM: arm64: Introduce kvm_share_hyp()
  2021-10-13 15:58   ` Quentin Perret
@ 2021-10-17 10:41     ` Marc Zyngier
  0 siblings, 0 replies; 84+ messages in thread
From: Marc Zyngier @ 2021-10-17 10:41 UTC (permalink / raw)
  To: Quentin Perret
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Catalin Marinas,
	Will Deacon, Fuad Tabba, David Brazdil, linux-arm-kernel, kvmarm,
	linux-kernel, kernel-team

On Wed, 13 Oct 2021 16:58:19 +0100,
Quentin Perret <qperret@google.com> wrote:
> 
> The create_hyp_mappings() function can currently be called at any point
> in time. However, its behaviour in protected mode changes widely
> depending on when it is being called. Prior to KVM init, it is used to
> create the temporary page-table used to bring-up the hypervisor, and
> later on it is transparently turned into a 'share' hypercall when the
> kernel has lost control over the hypervisor stage-1. In order to prepare
> the ground for also unsharing pages with the hypervisor during guest
> teardown, introduce a kvm_share_hyp() function to make it clear in which
> places a share hypercall should be expected, as we will soon need a
> matching unshare hypercall in all those places.
> 
> Signed-off-by: Quentin Perret <qperret@google.com>
> ---
>  arch/arm64/include/asm/kvm_mmu.h |  1 +
>  arch/arm64/kvm/arm.c             |  7 +++----
>  arch/arm64/kvm/fpsimd.c          |  4 ++--
>  arch/arm64/kvm/mmu.c             | 19 +++++++++++++------
>  4 files changed, 19 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 02d378887743..185d0f62b724 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -150,6 +150,7 @@ static __always_inline unsigned long __kern_hyp_va(unsigned long v)
>  #include <asm/kvm_pgtable.h>
>  #include <asm/stage2_pgtable.h>
>  
> +int kvm_share_hyp(void *from, void *to);
>  int create_hyp_mappings(void *from, void *to, enum kvm_pgtable_prot prot);
>  int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
>  			   void __iomem **kaddr,
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index c33d8c073820..f2e74635332b 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -146,7 +146,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  	if (ret)
>  		return ret;
>  
> -	ret = create_hyp_mappings(kvm, kvm + 1, PAGE_HYP);
> +	ret = kvm_share_hyp(kvm, kvm + 1);
>  	if (ret)
>  		goto out_free_stage2_pgd;
>  
> @@ -341,7 +341,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>  	if (err)
>  		return err;
>  
> -	return create_hyp_mappings(vcpu, vcpu + 1, PAGE_HYP);
> +	return kvm_share_hyp(vcpu, vcpu + 1);
>  }
>  
>  void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
> @@ -623,8 +623,7 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
>  
>  		sve_end = vcpu->arch.sve_state + vcpu_sve_state_size(vcpu);
>  
> -		ret = create_hyp_mappings(vcpu->arch.sve_state, sve_end,
> -					  PAGE_HYP);
> +		ret = kvm_share_hyp(vcpu->arch.sve_state, sve_end);
>  		if (ret)
>  			return ret;
>  	}
> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> index 62c0d78da7be..2fe1128d9f3d 100644
> --- a/arch/arm64/kvm/fpsimd.c
> +++ b/arch/arm64/kvm/fpsimd.c
> @@ -35,11 +35,11 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
>  	 * Make sure the host task thread flags and fpsimd state are
>  	 * visible to hyp:
>  	 */
> -	ret = create_hyp_mappings(ti, ti + 1, PAGE_HYP);
> +	ret = kvm_share_hyp(ti, ti + 1);
>  	if (ret)
>  		goto error;
>  
> -	ret = create_hyp_mappings(fpsimd, fpsimd + 1, PAGE_HYP);
> +	ret = kvm_share_hyp(fpsimd, fpsimd + 1);
>  	if (ret)
>  		goto error;
>  
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 1a94a7ca48f2..f80673e863ac 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -296,6 +296,17 @@ static int pkvm_share_hyp(phys_addr_t start, phys_addr_t end)
>  	return 0;
>  }
>  
> +int kvm_share_hyp(void *from, void *to)
> +{
> +	if (is_kernel_in_hyp_mode())
> +		return 0;
> +
> +	if (kvm_host_owns_hyp_mappings())
> +		return create_hyp_mappings(from, to, PAGE_HYP);

Not directly related to this code, but it looks to me that
kvm_host_owns_hyp_mappings() really ought to check for
is_kernel_in_hyp_mode() on its own. VHE really deals with its own
mappings, and create_hyp_mappings() already has a check to do nothing
on VHE.
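
i.e. something like this (a sketch, assuming the existing
kvm_protected_mode_initialized static key remains the other half of
the test):

	static bool kvm_host_owns_hyp_mappings(void)
	{
		/* VHE: the kernel owns its EL2 mappings directly */
		if (is_kernel_in_hyp_mode())
			return false;

		/* Protected mode: EL2 owns its own stage-1 */
		if (static_branch_likely(&kvm_protected_mode_initialized))
			return false;

		return true;
	}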

> +
> +	return pkvm_share_hyp(kvm_kaddr_to_phys(from), kvm_kaddr_to_phys(to));
> +}
> +
>  /**
>   * create_hyp_mappings - duplicate a kernel virtual address range in Hyp mode
>   * @from:	The virtual kernel start address of the range
> @@ -316,12 +327,8 @@ int create_hyp_mappings(void *from, void *to, enum kvm_pgtable_prot prot)
>  	if (is_kernel_in_hyp_mode())
>  		return 0;
>  
> -	if (!kvm_host_owns_hyp_mappings()) {
> -		if (WARN_ON(prot != PAGE_HYP))
> -			return -EPERM;
> -		return pkvm_share_hyp(kvm_kaddr_to_phys(from),
> -				      kvm_kaddr_to_phys(to));
> -	}
> +	if (WARN_ON(!kvm_host_owns_hyp_mappings()))
> +		return -EPERM;

Do we really need this? Can't we just verify that all the code paths
to create_hyp_mappings() check for kvm_host_owns_hyp_mappings()?

At the very least, make this a VM_BUG_ON() so that this is limited to
debug. Yes, I'm quickly developing a WARN_ON()-phobia.
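
Something like this would keep the check loud in CONFIG_DEBUG_VM
builds without carrying a WARN in production (sketch; the condition is
simply re-tested so that non-debug builds still fail gracefully):

	VM_BUG_ON(!kvm_host_owns_hyp_mappings());

	if (!kvm_host_owns_hyp_mappings())
		return -EPERM;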

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 04/16] KVM: arm64: Introduce kvm_share_hyp()
@ 2021-10-17 10:41     ` Marc Zyngier
  0 siblings, 0 replies; 84+ messages in thread
From: Marc Zyngier @ 2021-10-17 10:41 UTC (permalink / raw)
  To: Quentin Perret
  Cc: kernel-team, Will Deacon, Catalin Marinas, linux-kernel,
	linux-arm-kernel, kvmarm

On Wed, 13 Oct 2021 16:58:19 +0100,
Quentin Perret <qperret@google.com> wrote:
> 
> The create_hyp_mappings() function can currently be called at any point
> in time. However, its behaviour in protected mode changes widely
> depending on when it is being called. Prior to KVM init, it is used to
> create the temporary page-table used to bring-up the hypervisor, and
> later on it is transparently turned into a 'share' hypercall when the
> kernel has lost control over the hypervisor stage-1. In order to prepare
> the ground for also unsharing pages with the hypervisor during guest
> teardown, introduce a kvm_share_hyp() function to make it clear in which
> places a share hypercall should be expected, as we will soon need a
> matching unshare hypercall in all those places.
> 
> Signed-off-by: Quentin Perret <qperret@google.com>
> ---
>  arch/arm64/include/asm/kvm_mmu.h |  1 +
>  arch/arm64/kvm/arm.c             |  7 +++----
>  arch/arm64/kvm/fpsimd.c          |  4 ++--
>  arch/arm64/kvm/mmu.c             | 19 +++++++++++++------
>  4 files changed, 19 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 02d378887743..185d0f62b724 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -150,6 +150,7 @@ static __always_inline unsigned long __kern_hyp_va(unsigned long v)
>  #include <asm/kvm_pgtable.h>
>  #include <asm/stage2_pgtable.h>
>  
> +int kvm_share_hyp(void *from, void *to);
>  int create_hyp_mappings(void *from, void *to, enum kvm_pgtable_prot prot);
>  int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
>  			   void __iomem **kaddr,
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index c33d8c073820..f2e74635332b 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -146,7 +146,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  	if (ret)
>  		return ret;
>  
> -	ret = create_hyp_mappings(kvm, kvm + 1, PAGE_HYP);
> +	ret = kvm_share_hyp(kvm, kvm + 1);
>  	if (ret)
>  		goto out_free_stage2_pgd;
>  
> @@ -341,7 +341,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>  	if (err)
>  		return err;
>  
> -	return create_hyp_mappings(vcpu, vcpu + 1, PAGE_HYP);
> +	return kvm_share_hyp(vcpu, vcpu + 1);
>  }
>  
>  void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
> @@ -623,8 +623,7 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
>  
>  		sve_end = vcpu->arch.sve_state + vcpu_sve_state_size(vcpu);
>  
> -		ret = create_hyp_mappings(vcpu->arch.sve_state, sve_end,
> -					  PAGE_HYP);
> +		ret = kvm_share_hyp(vcpu->arch.sve_state, sve_end);
>  		if (ret)
>  			return ret;
>  	}
> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> index 62c0d78da7be..2fe1128d9f3d 100644
> --- a/arch/arm64/kvm/fpsimd.c
> +++ b/arch/arm64/kvm/fpsimd.c
> @@ -35,11 +35,11 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
>  	 * Make sure the host task thread flags and fpsimd state are
>  	 * visible to hyp:
>  	 */
> -	ret = create_hyp_mappings(ti, ti + 1, PAGE_HYP);
> +	ret = kvm_share_hyp(ti, ti + 1);
>  	if (ret)
>  		goto error;
>  
> -	ret = create_hyp_mappings(fpsimd, fpsimd + 1, PAGE_HYP);
> +	ret = kvm_share_hyp(fpsimd, fpsimd + 1);
>  	if (ret)
>  		goto error;
>  
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 1a94a7ca48f2..f80673e863ac 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -296,6 +296,17 @@ static int pkvm_share_hyp(phys_addr_t start, phys_addr_t end)
>  	return 0;
>  }
>  
> +int kvm_share_hyp(void *from, void *to)
> +{
> +	if (is_kernel_in_hyp_mode())
> +		return 0;
> +
> +	if (kvm_host_owns_hyp_mappings())
> +		return create_hyp_mappings(from, to, PAGE_HYP);

Not directly related to this code, but it looks to me that
kvm_host_owns_hyp_mappings() really ought to check for
is_kernel_in_hyp_mode() on its own. VHE really deals with its own
mappings, and create_hyp_mappings() already has a check to do nothing
on VHE.

> +
> +	return pkvm_share_hyp(kvm_kaddr_to_phys(from), kvm_kaddr_to_phys(to));
> +}
> +
>  /**
>   * create_hyp_mappings - duplicate a kernel virtual address range in Hyp mode
>   * @from:	The virtual kernel start address of the range
> @@ -316,12 +327,8 @@ int create_hyp_mappings(void *from, void *to, enum kvm_pgtable_prot prot)
>  	if (is_kernel_in_hyp_mode())
>  		return 0;
>  
> -	if (!kvm_host_owns_hyp_mappings()) {
> -		if (WARN_ON(prot != PAGE_HYP))
> -			return -EPERM;
> -		return pkvm_share_hyp(kvm_kaddr_to_phys(from),
> -				      kvm_kaddr_to_phys(to));
> -	}
> +	if (WARN_ON(!kvm_host_owns_hyp_mappings()))
> +		return -EPERM;

Do we really need this? Can't we just verify that all the code paths
to create_hyp_mappings() check for kvm_host_owns_hyp_mappings()?

At the very least, make this a VM_BUG_ON() so that this is limited to
debug. Yes, I'm quickly developing a WARN_ON()-phobia.
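
I.e. something along these lines (sketch; VM_BUG_ON() evaluates to
nothing at runtime without CONFIG_DEBUG_VM, hence the separate error
return):

	VM_BUG_ON(!kvm_host_owns_hyp_mappings());
	if (!kvm_host_owns_hyp_mappings())
		return -EPERM;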

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 16/16] KVM: arm64: pkvm: Unshare guest structs during teardown
  2021-10-16 12:25     ` Marc Zyngier
  (?)
@ 2021-10-18 10:32       ` Quentin Perret
  -1 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-18 10:32 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Catalin Marinas,
	Will Deacon, Fuad Tabba, David Brazdil, linux-arm-kernel, kvmarm,
	linux-kernel, kernel-team

On Saturday 16 Oct 2021 at 13:25:45 (+0100), Marc Zyngier wrote:
> At this stage, the old thread may have been destroyed and the memory
> recycled. What happens if, in the interval, that memory gets shared
> again in another context? My gut feeling is that either the sharing
> fails, or the unshare above breaks something else (the refcounting
> doesn't save us if the sharing is not with HYP).

Aha, yes, that's a good point. The problematic scenario would be: a vcpu
runs in the context of task A, then blocks. Then task A is destroyed,
but the vcpu isn't (possibly because userspace intends to run it in the
context of another task, or something along those lines). But the
thread_info and fpsimd_state of task A remain shared with the hypervisor
until the next vcpu run, even though the memory has been freed by the
host, and possibly reallocated to another guest in the meantime.

So yes, at this point sharing/donating this memory range with a new
guest will fail, and confuse the host massively :/

> In any case, I wonder whether we need a notification from the core
> code that a thread for which we have a HYP mapping is gone so that we
> can mop up the mapping at that point. That's similar to what we have
> for MMU notifiers and S2 PTs.
> 
> This is doable by hooking into fpsimd_release_task() and extending
> thread_struct to track the sharing with HYP.

I've been looking into this, but struggled to find a good way to avoid
all the races. Specifically, the case where a vcpu and the task which
last ran it get destroyed at the same time isn't easy to handle: you end
up having to maintain pointers from the task to the vcpu and vice versa,
but I see no obvious way to update them both in a non-racy fashion
(other than with one big lock serializing all those changes, but that
would mean serializing all task destructions, which really doesn't
scale).

Another option is to take a refcount on 'current' from
kvm_arch_vcpu_run_map_fp() before sharing thread-specific structs with
the hyp and release the refcount of the previous task after unsharing.
But that means we'll have to also drop the refcount when the vcpu
gets destroyed, as well as explicitly unshare at that point. Shouldn't
be too bad I think. Thoughts?
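
Roughly along these lines (sketch only; the parent_task bookkeeping
field doesn't exist yet, the name is made up):

	/* On vcpu run, before sharing current's structs with EL2: */
	get_task_struct(current);
	vcpu->arch.parent_task = current;

	/* Later, once the previous task's structs are unshared: */
	put_task_struct(vcpu->arch.parent_task);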

Thanks,
Quentin

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 03/16] KVM: arm64: Avoid remapping the SVE state in the hyp stage-1
  2021-10-16 11:04     ` Marc Zyngier
  (?)
@ 2021-10-18 10:36       ` Quentin Perret
  -1 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-18 10:36 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Catalin Marinas,
	Will Deacon, Fuad Tabba, David Brazdil, linux-arm-kernel, kvmarm,
	linux-kernel, kernel-team

On Saturday 16 Oct 2021 at 12:04:15 (+0100), Marc Zyngier wrote:
> I actually ended up implementing a similar patch as part of my 'first
> run' series[1], though I moved the mapping to the point where we
> finalise the vcpu as that's where the allocation takes place.
> 
> Do you see any potential issue with that approach?

Nope, and in fact I think your patch should allow me to simplify
kvm_arm_vcpu_destroy() a bit, as checking arch.sve_state != NULL would
now be sufficient to know whether the sve state has been shared --
checking arch.has_run_once is no longer needed.
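
That is, teardown could boil down to something like this (sketch, using
the unshare helper this series introduces):

	if (vcpu->arch.sve_state)
		kvm_unshare_hyp(vcpu->arch.sve_state,
				vcpu->arch.sve_state +
				vcpu_sve_state_size(vcpu));
	kfree(vcpu->arch.sve_state);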

I'll base v2 on this.

Thanks!
Quentin

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 04/16] KVM: arm64: Introduce kvm_share_hyp()
  2021-10-17 10:41     ` Marc Zyngier
  (?)
@ 2021-10-18 10:51       ` Quentin Perret
  -1 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-18 10:51 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Catalin Marinas,
	Will Deacon, Fuad Tabba, David Brazdil, linux-arm-kernel, kvmarm,
	linux-kernel, kernel-team

On Sunday 17 Oct 2021 at 11:41:21 (+0100), Marc Zyngier wrote:
> Not directly related to this code, but it looks to me that
> kvm_host_owns_hyp_mappings() really ought to check for
> is_kernel_in_hyp_mode() on its own. VHE really deals with its own
> mappings, and create_hyp_mappings() already has a check to do nothing
> on VHE.

Sure, I'll stick a patch at the beginning of the series.

> 
> > +
> > +	return pkvm_share_hyp(kvm_kaddr_to_phys(from), kvm_kaddr_to_phys(to));
> > +}
> > +
> >  /**
> >   * create_hyp_mappings - duplicate a kernel virtual address range in Hyp mode
> >   * @from:	The virtual kernel start address of the range
> > @@ -316,12 +327,8 @@ int create_hyp_mappings(void *from, void *to, enum kvm_pgtable_prot prot)
> >  	if (is_kernel_in_hyp_mode())
> >  		return 0;
> >  
> > -	if (!kvm_host_owns_hyp_mappings()) {
> > -		if (WARN_ON(prot != PAGE_HYP))
> > -			return -EPERM;
> > -		return pkvm_share_hyp(kvm_kaddr_to_phys(from),
> > -				      kvm_kaddr_to_phys(to));
> > -	}
> > +	if (WARN_ON(!kvm_host_owns_hyp_mappings()))
> > +		return -EPERM;
> 
> Do we really need this? Can't we just verify that all the code paths
> to create_hyp_mappings() check for kvm_host_owns_hyp_mappings()?
> 
> At the very least, make this a VM_BUG_ON() so that this is limited to
> debug. Yes, I'm quickly developing a WARN_ON()-phobia.

Right, that _is_ purely debug. It's just that folks are used to being
able to call create_hyp_mappings() for anything, so I wanted to make
sure we have an easy way to catch future changes that would unknowingly
break pKVM, but no objection to making this a VM_BUG_ON().

Cheers,
Quentin

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 16/16] KVM: arm64: pkvm: Unshare guest structs during teardown
  2021-10-18 10:32       ` Quentin Perret
  (?)
@ 2021-10-18 14:03         ` Quentin Perret
  -1 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-18 14:03 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kernel-team, Will Deacon, Catalin Marinas, linux-kernel,
	linux-arm-kernel, kvmarm

On Monday 18 Oct 2021 at 11:32:13 (+0100), Quentin Perret wrote:
> Another option is to take a refcount on 'current' from
> kvm_arch_vcpu_run_map_fp() before sharing thread-specific structs with
> the hyp and release the refcount of the previous task after unsharing.
> But that means we'll have to also drop the refcount when the vcpu
> gets destroyed, as well as explicitly unshare at that point. Shouldn't
> be too bad I think. Thoughts?

Something like the below seems to work OK on my setup, including
SIGKILL'ing the guest and such. How much do you hate it?

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index f8be56d5342b..50598d704c71 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -322,6 +322,7 @@ struct kvm_vcpu_arch {
 
 	struct thread_info *host_thread_info;	/* hyp VA */
 	struct user_fpsimd_state *host_fpsimd_state;	/* hyp VA */
+	struct task_struct *parent_task;
 
 	struct {
 		/* {Break,watch}point registers */
@@ -738,6 +739,7 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
+void kvm_vcpu_unshare_task_fp(struct kvm_vcpu *vcpu);
 
 static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
 {
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 2fe1128d9f3d..27afeebbe1cb 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -15,6 +15,22 @@
 #include <asm/kvm_mmu.h>
 #include <asm/sysreg.h>
 
+void kvm_vcpu_unshare_task_fp(struct kvm_vcpu *vcpu)
+{
+	struct task_struct *p = vcpu->arch.parent_task;
+	struct user_fpsimd_state *fpsimd;
+	struct thread_info *ti;
+
+	if (!static_branch_likely(&kvm_protected_mode_initialized) || !p)
+		return;
+
+	ti = &p->thread_info;
+	kvm_unshare_hyp(ti, ti + 1);
+	fpsimd = &p->thread.uw.fpsimd_state;
+	kvm_unshare_hyp(fpsimd, fpsimd + 1);
+	put_task_struct(p);
+}
+
 /*
  * Called on entry to KVM_RUN unless this vcpu previously ran at least
  * once and the most recent prior KVM_RUN for this vcpu was called from
@@ -31,6 +47,8 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
 	struct thread_info *ti = &current->thread_info;
 	struct user_fpsimd_state *fpsimd = &current->thread.uw.fpsimd_state;
 
+	kvm_vcpu_unshare_task_fp(vcpu);
+
 	/*
 	 * Make sure the host task thread flags and fpsimd state are
 	 * visible to hyp:
@@ -45,6 +63,17 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
 
 	vcpu->arch.host_thread_info = kern_hyp_va(ti);
 	vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);
+
+	/*
+	 * We need to keep current's task_struct pinned until its data has been
+	 * unshared with the hypervisor to make sure it is not re-used by the
+	 * kernel and donated to someone else while already shared -- see
+	 * kvm_vcpu_unshare_task_fp() for the matching put_task_struct().
+	 */
+	if (static_branch_likely(&kvm_protected_mode_initialized)) {
+		get_task_struct(current);
+		vcpu->arch.parent_task = current;
+	}
 error:
 	return ret;
 }
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 5ce36b0a3343..c2a2cd7d5748 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -141,7 +141,13 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu)
 
 void kvm_arm_vcpu_destroy(struct kvm_vcpu *vcpu)
 {
-	kfree(vcpu->arch.sve_state);
+	void *sve_state = vcpu->arch.sve_state;
+
+	kvm_vcpu_unshare_task_fp(vcpu);
+	kvm_unshare_hyp(vcpu, vcpu + 1);
+	if (sve_state && vcpu->arch.has_run_once)
+		kvm_unshare_hyp(sve_state, sve_state + vcpu_sve_state_size(vcpu));
+	kfree(sve_state);
 }
 
 static void kvm_vcpu_reset_sve(struct kvm_vcpu *vcpu)

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* Re: [PATCH 16/16] KVM: arm64: pkvm: Unshare guest structs during teardown
  2021-10-18 14:03         ` Quentin Perret
  (?)
@ 2021-10-18 17:12           ` Marc Zyngier
  -1 siblings, 0 replies; 84+ messages in thread
From: Marc Zyngier @ 2021-10-18 17:12 UTC (permalink / raw)
  To: Quentin Perret
  Cc: kernel-team, Will Deacon, Catalin Marinas, linux-kernel,
	linux-arm-kernel, kvmarm

On 2021-10-18 15:03, Quentin Perret wrote:
> On Monday 18 Oct 2021 at 11:32:13 (+0100), Quentin Perret wrote:
>> Another option is to take a refcount on 'current' from
>> kvm_arch_vcpu_run_map_fp() before sharing thread-specific structs with
>> the hyp and release the refcount of the previous task after unsharing.
>> But that means we'll have to also drop the refcount when the vcpu
>> gets destroyed, as well as explicitly unshare at that point. Shouldn't
>> be too bad I think. Thoughts?
> 
> Something like the below seems to work OK on my setup, including
> SIGKILL'ing the guest and such. How much do you hate it?

It is annoyingly elegant! Small nitpick below.

> 
> diff --git a/arch/arm64/include/asm/kvm_host.h
> b/arch/arm64/include/asm/kvm_host.h
> index f8be56d5342b..50598d704c71 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -322,6 +322,7 @@ struct kvm_vcpu_arch {
> 
>  	struct thread_info *host_thread_info;	/* hyp VA */
>  	struct user_fpsimd_state *host_fpsimd_state;	/* hyp VA */
> +	struct task_struct *parent_task;
> 
>  	struct {
>  		/* {Break,watch}point registers */
> @@ -738,6 +739,7 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu 
> *vcpu);
>  void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
>  void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu);
>  void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
> +void kvm_vcpu_unshare_task_fp(struct kvm_vcpu *vcpu);
> 
>  static inline bool kvm_pmu_counter_deferred(struct perf_event_attr 
> *attr)
>  {
> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> index 2fe1128d9f3d..27afeebbe1cb 100644
> --- a/arch/arm64/kvm/fpsimd.c
> +++ b/arch/arm64/kvm/fpsimd.c
> @@ -15,6 +15,22 @@
>  #include <asm/kvm_mmu.h>
>  #include <asm/sysreg.h>
> 
> +void kvm_vcpu_unshare_task_fp(struct kvm_vcpu *vcpu)
> +{
> +	struct task_struct *p = vcpu->arch.parent_task;
> +	struct user_fpsimd_state *fpsimd;
> +	struct thread_info *ti;
> +
> +	if (!static_branch_likely(&kvm_protected_mode_initialized) || !p)

Shouldn't this be a check on is_protected_kvm_enabled() instead?
The two should be equivalent outside of the initialisation code...
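
I.e. the check would become (sketch):

	if (!is_protected_kvm_enabled() || !p)
		return;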

Otherwise, ship it.

         M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 16/16] KVM: arm64: pkvm: Unshare guest structs during teardown
  2021-10-18 17:12           ` Marc Zyngier
  (?)
@ 2021-10-19  9:40             ` Quentin Perret
  -1 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-19  9:40 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Catalin Marinas,
	Will Deacon, Fuad Tabba, David Brazdil, linux-arm-kernel, kvmarm,
	linux-kernel, kernel-team

On Monday 18 Oct 2021 at 18:12:22 (+0100), Marc Zyngier wrote:
> On 2021-10-18 15:03, Quentin Perret wrote:
> > On Monday 18 Oct 2021 at 11:32:13 (+0100), Quentin Perret wrote:
> > > Another option is to take a refcount on 'current' from
> > > kvm_arch_vcpu_run_map_fp() before sharing thread-specific structs with
> > > the hyp and release the refcount of the previous task after unsharing.
> > > But that means we'll have to also drop the refcount when the vcpu
> > > gets destroyed, as well as explicitly unshare at that point. Shouldn't
> > > be too bad I think. Thoughts?
> > 
> > Something like the below seems to work OK on my setup, including
> > SIGKILL'ing the guest and such. How much do you hate it?
> 
> It is annoyingly elegant! Small nitpick below.
> 
> > 
> > diff --git a/arch/arm64/include/asm/kvm_host.h
> > b/arch/arm64/include/asm/kvm_host.h
> > index f8be56d5342b..50598d704c71 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -322,6 +322,7 @@ struct kvm_vcpu_arch {
> > 
> >  	struct thread_info *host_thread_info;	/* hyp VA */
> >  	struct user_fpsimd_state *host_fpsimd_state;	/* hyp VA */
> > +	struct task_struct *parent_task;
> > 
> >  	struct {
> >  		/* {Break,watch}point registers */
> > @@ -738,6 +739,7 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
> >  void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
> >  void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu);
> >  void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
> > +void kvm_vcpu_unshare_task_fp(struct kvm_vcpu *vcpu);
> > 
> >  static inline bool kvm_pmu_counter_deferred(struct perf_event_attr
> > *attr)
> >  {
> > diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> > index 2fe1128d9f3d..27afeebbe1cb 100644
> > --- a/arch/arm64/kvm/fpsimd.c
> > +++ b/arch/arm64/kvm/fpsimd.c
> > @@ -15,6 +15,22 @@
> >  #include <asm/kvm_mmu.h>
> >  #include <asm/sysreg.h>
> > 
> > +void kvm_vcpu_unshare_task_fp(struct kvm_vcpu *vcpu)
> > +{
> > +	struct task_struct *p = vcpu->arch.parent_task;
> > +	struct user_fpsimd_state *fpsimd;
> > +	struct thread_info *ti;
> > +
> > +	if (!static_branch_likely(&kvm_protected_mode_initialized) || !p)
> 
> Shouldn't this be a check on is_protected_kvm_enabled() instead?
> The two should be equivalent outside of the initialisation code...

Yup, it'd be nice to do checks on kvm_protected_mode_initialized only
when they're strictly necessary, and that's not the case here. I'll fold
that change into v2.

Cheers
Quentin

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 01/16] KVM: arm64: Introduce do_share() helper for memory sharing between components
  2021-10-15 15:11     ` Andrew Walbran
  (?)
@ 2021-10-19 10:37       ` Quentin Perret
  -1 siblings, 0 replies; 84+ messages in thread
From: Quentin Perret @ 2021-10-19 10:37 UTC (permalink / raw)
  To: Andrew Walbran
  Cc: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Fuad Tabba, David Brazdil,
	linux-arm-kernel, kvmarm, linux-kernel, Android Kernel Team

Hi Andrew,

On Friday 15 Oct 2021 at 16:11:49 (+0100), Andrew Walbran wrote:
> On Wed, 13 Oct 2021 at 16:58, 'Quentin Perret' via kernel-team
> > +struct pkvm_mem_transition {
> > +       u64                             nr_pages;
> > +
> > +       struct {
> > +               enum pkvm_component_id  id;
> > +               u64                     addr;
> Is this the physical address or the IPA of the initiator? It would be
> good to have a comment explaining.

That's the address in the initiator's address space. For the host and
guests that'll be an IPA (which also happens to be the same as the PA
for the host) and for the hypervisor that'll be an EL2 VA.

But yes, a comment won't hurt, so I'll add something.
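
A sketch of how that comment could read (illustrative wording only, not
the actual v2 patch; the `initiator' field name is inferred from the
req->initiator accesses in the v1 code below):

struct pkvm_mem_transition {
	u64				nr_pages;

	struct {
		enum pkvm_component_id	id;
		/*
		 * Address in the initiator's address space: an IPA for
		 * the host and for guests (the host stage-2 is an
		 * identity map, so host IPAs equal PAs), or an EL2 VA
		 * when the initiator is the hypervisor.
		 */
		u64			addr;
	} initiator;

	/* ... */
};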

<snip>
> > +static int check_share(struct pkvm_page_req *req,
> > +                      struct pkvm_page_share_ack *ack,
> > +                      struct pkvm_mem_share *share)
> > +{
> > +       if (!addr_is_memory(req->phys))
> > +               return -EINVAL;
> > +
> > +       if (req->initiator.state == PKVM_PAGE_OWNED &&
> > +           ack->completer.state == PKVM_NOPAGE) {
> > +               return 0;
> > +       }
> > +
> > +       if (req->initiator.state != PKVM_PAGE_SHARED_OWNED)
> > +               return -EPERM;
> > +
> > +       if (ack->completer.state != PKVM_PAGE_SHARED_BORROWED)
> > +               return -EPERM;
> > +
> > +       if (ack->completer.phys != req->phys)
> > +               return -EPERM;
> > +
> > +       if (ack->completer.prot != share->prot)
> > +               return -EPERM;
> I guess this is the workaround you mentioned for the fact that the
> host can share the same page twice? It might be worth adding a comment
> to explain that that's what's going on.

Yep, that's what is going on here. But FWIW I'm currently reworking the
way we refcount pages for v2: that will now become the host's
responsibility. It should simplify things quite a bit at EL2, and
make all of the above go away :-)
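
In the meantime, the v1 checks could carry a comment along these lines
(a sketch only; the rework above will supersede it):

	/*
	 * The host may share the same page with EL2 more than once.
	 * Tolerate the repeated transition only when it is consistent
	 * with the existing share: both sides must already be in the
	 * SHARED state, and the physical address and permissions must
	 * match the original share exactly.
	 */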

Cheers,
Quentin

^ permalink raw reply	[flat|nested] 84+ messages in thread

end of thread, other threads:[~2021-10-19 10:38 UTC | newest]

Thread overview: 84+ messages
-- links below jump to the message on this page --
2021-10-13 15:58 [PATCH 00/16] KVM: arm64: Implement unshare hypercall for pkvm Quentin Perret
2021-10-13 15:58 ` [PATCH 01/16] KVM: arm64: Introduce do_share() helper for memory sharing between components Quentin Perret
2021-10-15 15:11   ` Andrew Walbran
2021-10-19 10:37     ` Quentin Perret
2021-10-13 15:58 ` [PATCH 02/16] KVM: arm64: Implement __pkvm_host_share_hyp() using do_share() Quentin Perret
2021-10-13 15:58 ` [PATCH 03/16] KVM: arm64: Avoid remapping the SVE state in the hyp stage-1 Quentin Perret
2021-10-16 11:04   ` Marc Zyngier
2021-10-18 10:36     ` Quentin Perret
2021-10-13 15:58 ` [PATCH 04/16] KVM: arm64: Introduce kvm_share_hyp() Quentin Perret
2021-10-17 10:41   ` Marc Zyngier
2021-10-18 10:51     ` Quentin Perret
2021-10-13 15:58 ` [PATCH 05/16] KVM: arm64: Accept page ranges in pkvm share hypercall Quentin Perret
2021-10-13 15:58 ` [PATCH 06/16] KVM: arm64: Provide {get,put}_page() stubs for early hyp allocator Quentin Perret
2021-10-13 15:58 ` [PATCH 07/16] KVM: arm64: Refcount hyp stage-1 pgtable pages Quentin Perret
2021-10-13 15:58 ` [PATCH 08/16] KVM: arm64: Fixup hyp stage-1 refcount Quentin Perret
2021-10-13 15:58 ` [PATCH 09/16] KVM: arm64: Hook up ->page_count() for hypervisor stage-1 page-table Quentin Perret
2021-10-13 15:58 ` [PATCH 10/16] KVM: arm64: Implement kvm_pgtable_hyp_unmap() at EL2 Quentin Perret
2021-10-13 15:58 ` [PATCH 11/16] KVM: arm64: Back hyp_vmemmap for all of memory Quentin Perret
2021-10-13 15:58 ` [PATCH 12/16] KVM: arm64: Move hyp refcount helpers to header files Quentin Perret
2021-10-13 15:58 ` [PATCH 13/16] KVM: arm64: Move double-sharing logic into hyp-specific function Quentin Perret
2021-10-13 15:58 ` [PATCH 14/16] KVM: arm64: Refcount shared pages at EL2 Quentin Perret
2021-10-13 15:58 ` [PATCH 15/16] KVM: arm64: pkvm: Introduce an unshare hypercall Quentin Perret
2021-10-13 15:58 ` [PATCH 16/16] KVM: arm64: pkvm: Unshare guest structs during teardown Quentin Perret
2021-10-16 12:25   ` Marc Zyngier
2021-10-18 10:32     ` Quentin Perret
2021-10-18 14:03       ` Quentin Perret
2021-10-18 17:12         ` Marc Zyngier
2021-10-19  9:40           ` Quentin Perret
