* [PATCH 0/3] KVM: arm64: support MTE in protected VMs
From: Peter Collingbourne @ 2022-06-23  2:19 UTC
  To: kvmarm
  Cc: Peter Collingbourne, Marc Zyngier, kvm, Andy Lutomirski,
	linux-arm-kernel, Michael Roth, Catalin Marinas, Chao Peng,
	Will Deacon, Evgenii Stepanov

Hi,

This patch series contains a proposed extension to pKVM that allows MTE
to be exposed to protected guests. It is based on the base pKVM series
previously sent to the list [1], which was later rebased to 5.19-rc2
and uploaded to [2].

This series takes precautions against host compromise of the guests
via direct access to their tag storage, by using the stage 2 page
tables to prevent the host from accessing that storage. The device
tree must describe the physical address of the tag storage, if any,
and the memory nodes must declare that the tag storage location is
described. Otherwise, the MTE feature is disabled in protected guests.
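
For illustration, here is a sketch of a reserved-memory node
describing the tag storage. The "arm,mte-tag-storage" compatible
string is what this series looks for; the node name, address, size,
and the no-map flag are assumptions made for the example (the precise
binding is defined by the proposal in [3]):

	reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;
		ranges;

		/*
		 * Tag storage carved out of physical memory. The
		 * compatible string lets pKVM find this region and
		 * unmap it from the host's stage 2 page tables.
		 */
		tags@8f8000000 {
			compatible = "arm,mte-tag-storage";
			reg = <0x8 0xf8000000 0x0 0x08000000>;
			no-map;
		};
	};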

Now that we can easily do so, we also prevent the host from accessing
any no-map reserved-memory regions that lack a driver, as the host has
no business accessing that memory.

A proposed extension to the devicetree specification is available at
[3], a patched version of QEMU that produces the required device tree
nodes is available at [4], and a patched version of the crosvm
hypervisor that enables MTE is available at [5].

[1] https://lore.kernel.org/all/20220519134204.5379-1-will@kernel.org/
[2] https://android-kvm.googlesource.com/linux/ for-upstream/pkvm-base-v2
[3] https://github.com/pcc/devicetree-specification mte-alloc
[4] https://github.com/pcc/qemu mte-shared-alloc
[5] https://chromium-review.googlesource.com/c/chromiumos/platform/crosvm/+/3719324

Peter Collingbourne (3):
  KVM: arm64: add a hypercall for disowning pages
  KVM: arm64: disown unused reserved-memory regions
  KVM: arm64: allow MTE in protected VMs if the tag storage is known

 arch/arm64/include/asm/kvm_asm.h              |  1 +
 arch/arm64/include/asm/kvm_host.h             |  6 ++
 arch/arm64/include/asm/kvm_pkvm.h             |  4 +-
 arch/arm64/kernel/image-vars.h                |  3 +
 arch/arm64/kvm/arm.c                          | 83 ++++++++++++++++++-
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  1 +
 arch/arm64/kvm/hyp/include/nvhe/pkvm.h        |  1 +
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            |  9 ++
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 11 +++
 arch/arm64/kvm/hyp/nvhe/pkvm.c                |  8 +-
 arch/arm64/kvm/hyp/pgtable.c                  |  5 +-
 arch/arm64/kvm/mmu.c                          |  4 +-
 12 files changed, 126 insertions(+), 10 deletions(-)

-- 
2.37.0.rc0.104.g0611611a94-goog


* [PATCH 1/3] KVM: arm64: add a hypercall for disowning pages
From: Peter Collingbourne @ 2022-06-23  2:19 UTC
  To: kvmarm
  Cc: Peter Collingbourne, Marc Zyngier, kvm, Andy Lutomirski,
	linux-arm-kernel, Michael Roth, Catalin Marinas, Chao Peng,
	Will Deacon, Evgenii Stepanov

Currently we only deny the host access to hyp and guest pages. However,
there may be other pages that could potentially be used to indirectly
compromise the hypervisor or other guests. Therefore, introduce a
__pkvm_disown_pages hypercall that the host kernel can use to give up
its own access to such pages before deprivileging itself.

Signed-off-by: Peter Collingbourne <pcc@google.com>
---
 arch/arm64/include/asm/kvm_asm.h              |  1 +
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  1 +
 arch/arm64/kvm/hyp/include/nvhe/pkvm.h        |  1 +
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            |  9 +++++++++
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 11 +++++++++++
 arch/arm64/kvm/hyp/pgtable.c                  |  5 +++--
 6 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 411cfbe3ebbd..1a177d9ed517 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -63,6 +63,7 @@ enum __kvm_host_smccc_func {
 	__KVM_HOST_SMCCC_FUNC___kvm_tlb_flush_vmid_ipa,
 	__KVM_HOST_SMCCC_FUNC___kvm_tlb_flush_vmid,
 	__KVM_HOST_SMCCC_FUNC___kvm_flush_cpu_context,
+	__KVM_HOST_SMCCC_FUNC___pkvm_disown_pages,
 	__KVM_HOST_SMCCC_FUNC___pkvm_prot_finalize,
 
 	/* Hypercalls available after pKVM finalisation */
diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
index e0bbb1726fa3..e88a9dab9cd5 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
@@ -58,6 +58,7 @@ enum pkvm_component_id {
 	PKVM_ID_HOST,
 	PKVM_ID_HYP,
 	PKVM_ID_GUEST,
+	PKVM_ID_NOBODY,
 };
 
 extern unsigned long hyp_nr_cpus;
diff --git a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
index c1987115b217..fbd991a46ab3 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
@@ -98,6 +98,7 @@ int __pkvm_init_shadow(struct kvm *kvm,
 		       unsigned long pgd_hva,
 		       unsigned long last_ran_hva, size_t last_ran_size);
 int __pkvm_teardown_shadow(unsigned int shadow_handle);
+int __pkvm_disown_pages(phys_addr_t phys, size_t size);
 
 struct kvm_shadow_vcpu_state *
 pkvm_load_shadow_vcpu_state(unsigned int shadow_handle, unsigned int vcpu_idx);
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index ddb36d172b60..b81908ef13e2 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -1055,6 +1055,14 @@ static void handle___pkvm_teardown_shadow(struct kvm_cpu_context *host_ctxt)
 	cpu_reg(host_ctxt, 1) = __pkvm_teardown_shadow(shadow_handle);
 }
 
+static void handle___pkvm_disown_pages(struct kvm_cpu_context *host_ctxt)
+{
+	DECLARE_REG(phys_addr_t, phys, host_ctxt, 1);
+	DECLARE_REG(size_t, size, host_ctxt, 2);
+
+	cpu_reg(host_ctxt, 1) = __pkvm_disown_pages(phys, size);
+}
+
 typedef void (*hcall_t)(struct kvm_cpu_context *);
 
 #define HANDLE_FUNC(x)	[__KVM_HOST_SMCCC_FUNC_##x] = (hcall_t)handle_##x
@@ -1072,6 +1080,7 @@ static const hcall_t host_hcall[] = {
 	HANDLE_FUNC(__kvm_tlb_flush_vmid_ipa),
 	HANDLE_FUNC(__kvm_tlb_flush_vmid),
 	HANDLE_FUNC(__kvm_flush_cpu_context),
+	HANDLE_FUNC(__pkvm_disown_pages),
 	HANDLE_FUNC(__pkvm_prot_finalize),
 
 	HANDLE_FUNC(__pkvm_host_share_hyp),
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index d839bb573b49..b3a2ad8454cc 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -1756,3 +1756,14 @@ int __pkvm_host_reclaim_page(u64 pfn)
 
 	return ret;
 }
+
+int __pkvm_disown_pages(phys_addr_t phys, size_t size)
+{
+	int ret;
+
+	host_lock_component();
+	ret = host_stage2_set_owner_locked(phys, size, PKVM_ID_NOBODY);
+	host_unlock_component();
+
+	return ret;
+}
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 756bbb15c1f3..e1ecddd43885 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -10,6 +10,7 @@
 #include <linux/bitfield.h>
 #include <asm/kvm_pgtable.h>
 #include <asm/stage2_pgtable.h>
+#include <nvhe/mem_protect.h>
 
 
 #define KVM_PTE_TYPE			BIT(1)
@@ -677,9 +678,9 @@ static bool stage2_pte_is_counted(kvm_pte_t pte)
 	/*
 	 * The refcount tracks valid entries as well as invalid entries if they
 	 * encode ownership of a page to another entity than the page-table
-	 * owner, whose id is 0.
+	 * owner, whose id is 0, or NOBODY, which does not correspond to a page-table.
 	 */
-	return !!pte;
+	return !!pte && pte != kvm_init_invalid_leaf_owner(PKVM_ID_NOBODY);
 }
 
 static void stage2_put_pte(kvm_pte_t *ptep, struct kvm_s2_mmu *mmu, u64 addr,
-- 
2.37.0.rc0.104.g0611611a94-goog


* [PATCH 2/3] KVM: arm64: disown unused reserved-memory regions
From: Peter Collingbourne @ 2022-06-23  2:19 UTC
  To: kvmarm
  Cc: Peter Collingbourne, Marc Zyngier, kvm, Andy Lutomirski,
	linux-arm-kernel, Michael Roth, Catalin Marinas, Chao Peng,
	Will Deacon, Evgenii Stepanov

The meaning of no-map on a reserved-memory node is as follows:

      Indicates the operating system must not create a virtual mapping
      of the region as part of its standard mapping of system memory,
      nor permit speculative access to it under any circumstances other
      than under the control of the device driver using the region.

If there is no compatible property, there is no device driver, so the
host kernel has no business accessing the reserved-memory region. Since
these regions may represent a route through which the host kernel
can gain additional privileges, disown any such memory regions before
deprivileging ourselves.
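
As an illustration, the kind of node this patch disowns is a no-map
carveout with no compatible property, meaning no driver can claim the
region. A minimal sketch (the node name and reg value are invented):

	reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;
		ranges;

		/*
		 * No compatible property, so no device driver uses
		 * this region, and no-map, so the OS must not map it:
		 * the host is disowned from it before deprivileging.
		 */
		secmon@a0000000 {
			reg = <0x0 0xa0000000 0x0 0x00200000>;
			no-map;
		};
	};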

Signed-off-by: Peter Collingbourne <pcc@google.com>
---
 arch/arm64/kvm/arm.c | 46 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index db7cbca6ace4..38f0900b7ddb 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -4,6 +4,7 @@
  * Author: Christoffer Dall <c.dall@virtualopensystems.com>
  */
 
+#include <linux/acpi.h>
 #include <linux/bug.h>
 #include <linux/cpu_pm.h>
 #include <linux/entry-kvm.h>
@@ -12,6 +13,7 @@
 #include <linux/kvm_host.h>
 #include <linux/list.h>
 #include <linux/module.h>
+#include <linux/of.h>
 #include <linux/vmalloc.h>
 #include <linux/fs.h>
 #include <linux/mman.h>
@@ -1907,6 +1909,48 @@ static bool init_psci_relay(void)
 	return true;
 }
 
+static void disown_reserved_memory(struct device_node *node)
+{
+	int addr_cells = of_n_addr_cells(node);
+	int size_cells = of_n_size_cells(node);
+	const __be32 *reg, *end;
+	int len;
+
+	reg = of_get_property(node, "reg", &len);
+	if (len % (4 * (addr_cells + size_cells)))
+		return;
+
+	end = reg + (len / 4);
+	while (reg != end) {
+		u64 addr, size;
+
+		addr = of_read_number(reg, addr_cells);
+		reg += addr_cells;
+		size = of_read_number(reg, size_cells);
+		reg += size_cells;
+
+		kvm_call_hyp_nvhe(__pkvm_disown_pages, addr, size);
+	}
+}
+
+static void kvm_reserved_memory_init(void)
+{
+	struct device_node *parent, *node;
+
+	if (!acpi_disabled || !is_protected_kvm_enabled())
+		return;
+
+	parent = of_find_node_by_path("/reserved-memory");
+	if (!parent)
+		return;
+
+	for_each_child_of_node(parent, node) {
+		if (!of_get_property(node, "compatible", NULL) &&
+		    of_get_property(node, "no-map", NULL))
+			disown_reserved_memory(node);
+	}
+}
+
 static int init_subsystems(void)
 {
 	int err = 0;
@@ -1947,6 +1991,8 @@ static int init_subsystems(void)
 
 	kvm_register_perf_callbacks(NULL);
 
+	kvm_reserved_memory_init();
+
 out:
 	if (err || !is_protected_kvm_enabled())
 		on_each_cpu(_kvm_arch_hardware_disable, NULL, 1);
-- 
2.37.0.rc0.104.g0611611a94-goog


* [PATCH 3/3] KVM: arm64: allow MTE in protected VMs if the tag storage is known
From: Peter Collingbourne @ 2022-06-23  2:19 UTC
  To: kvmarm
  Cc: Peter Collingbourne, Marc Zyngier, kvm, Andy Lutomirski,
	linux-arm-kernel, Michael Roth, Catalin Marinas, Chao Peng,
	Will Deacon, Evgenii Stepanov

Because the host may corrupt a protected guest's tag storage unless
that storage is protected by stage 2 page tables, we can't expose MTE
to protected guests if the location of the tag storage is not known.

Therefore, only allow protected guests to use MTE if the location of
the tag storage is described in the device tree, and only after
disowning any tag storage regions that are accessible in physical
memory.

To avoid exposing stale MTE tags from the host to protected VMs,
sanitize the tags of each page before donating it.

Signed-off-by: Peter Collingbourne <pcc@google.com>
---
 arch/arm64/include/asm/kvm_host.h |  6 +++++
 arch/arm64/include/asm/kvm_pkvm.h |  4 +++-
 arch/arm64/kernel/image-vars.h    |  3 +++
 arch/arm64/kvm/arm.c              | 37 ++++++++++++++++++++++++++++---
 arch/arm64/kvm/hyp/nvhe/pkvm.c    |  8 ++++---
 arch/arm64/kvm/mmu.c              |  4 +++-
 6 files changed, 54 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 0fd184b816b3..973ab0378d01 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -930,6 +930,12 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
 #define kvm_arm_vcpu_sve_finalized(vcpu) \
 	((vcpu)->arch.flags & KVM_ARM64_VCPU_SVE_FINALIZED)
 
+DECLARE_STATIC_KEY_FALSE(pkvm_mte_supported);
+
+#define kvm_supports_mte(kvm)                                                  \
+	(system_supports_mte() &&                                              \
+	 (!kvm_vm_is_protected(kvm) ||                                         \
+	  static_branch_unlikely(&pkvm_mte_supported)))
 #define kvm_has_mte(kvm)					\
 	(system_supports_mte() &&				\
 	 test_bit(KVM_ARCH_FLAG_MTE_ENABLED, &(kvm)->arch.flags))
diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h
index cd56438a34be..ef5d4870c043 100644
--- a/arch/arm64/include/asm/kvm_pkvm.h
+++ b/arch/arm64/include/asm/kvm_pkvm.h
@@ -73,10 +73,12 @@ void kvm_shadow_destroy(struct kvm *kvm);
  * Allow for protected VMs:
  * - Branch Target Identification
  * - Speculative Store Bypassing
+ * - Memory Tagging Extension
  */
 #define PVM_ID_AA64PFR1_ALLOW (\
 	ARM64_FEATURE_MASK(ID_AA64PFR1_BT) | \
-	ARM64_FEATURE_MASK(ID_AA64PFR1_SSBS) \
+	ARM64_FEATURE_MASK(ID_AA64PFR1_SSBS) | \
+	ARM64_FEATURE_MASK(ID_AA64PFR1_MTE) \
 	)
 
 /*
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 2d4d6836ff47..26a9b31478aa 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -84,6 +84,9 @@ KVM_NVHE_ALIAS(__hyp_stub_vectors);
 KVM_NVHE_ALIAS(arm64_const_caps_ready);
 KVM_NVHE_ALIAS(cpu_hwcap_keys);
 
+/* Kernel symbol needed for kvm_supports_mte() check. */
+KVM_NVHE_ALIAS(pkvm_mte_supported);
+
 /* Static keys which are set if a vGIC trap should be handled in hyp. */
 KVM_NVHE_ALIAS(vgic_v2_cpuif_trap);
 KVM_NVHE_ALIAS(vgic_v3_cpuif_trap);
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 38f0900b7ddb..5a780a0f59e3 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -60,6 +60,7 @@ static bool vgic_present;
 
 static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled);
 DEFINE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
+DEFINE_STATIC_KEY_FALSE(pkvm_mte_supported);
 
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
 {
@@ -96,9 +97,7 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		break;
 	case KVM_CAP_ARM_MTE:
 		mutex_lock(&kvm->lock);
-		if (!system_supports_mte() ||
-		    kvm_vm_is_protected(kvm) ||
-		    kvm->created_vcpus) {
+		if (!kvm_supports_mte(kvm) || kvm->created_vcpus) {
 			r = -EINVAL;
 		} else {
 			r = 0;
@@ -334,6 +333,9 @@ static int pkvm_check_extension(struct kvm *kvm, long ext, int kvm_cap)
 	case KVM_CAP_ARM_VM_IPA_SIZE:
 		r = kvm_cap;
 		break;
+	case KVM_CAP_ARM_MTE:
+		r = kvm_cap && static_branch_unlikely(&pkvm_mte_supported);
+		break;
 	case KVM_CAP_GUEST_DEBUG_HW_BPS:
 		r = min(kvm_cap, pkvm_get_max_brps());
 		break;
@@ -1948,9 +1950,36 @@ static void kvm_reserved_memory_init(void)
 		if (!of_get_property(node, "compatible", NULL) &&
 		    of_get_property(node, "no-map", NULL))
 			disown_reserved_memory(node);
+
+		if (of_device_is_compatible(node, "arm,mte-tag-storage"))
+			disown_reserved_memory(node);
 	}
 }
 
+static void kvm_mte_init(void)
+{
+	struct device_node *memory;
+
+	if (!system_supports_mte() || !acpi_disabled ||
+	    !is_protected_kvm_enabled())
+		return;
+
+	/*
+	 * It is only safe to turn on MTE for protected VMs if we can protect
+	 * the guests from host accesses to their tag storage. If every memory
+	 * region has an arm,mte-alloc property we know that all tag storage
+	 * regions exposed to physical memory, if any, are described by a
+	 * reserved-memory compatible with arm,mte-tag-storage. We can use these
+	 * descriptions to unmap these regions from the host's stage 2 page
+	 * tables (see kvm_reserved_memory_init).
+	 */
+	for_each_node_by_type(memory, "memory")
+		if (!of_get_property(memory, "arm,mte-alloc", NULL))
+			return;
+
+	static_branch_enable(&pkvm_mte_supported);
+}
+
 static int init_subsystems(void)
 {
 	int err = 0;
@@ -1993,6 +2022,8 @@ static int init_subsystems(void)
 
 	kvm_reserved_memory_init();
 
+	kvm_mte_init();
+
 out:
 	if (err || !is_protected_kvm_enabled())
 		on_each_cpu(_kvm_arch_hardware_disable, NULL, 1);
diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
index e81a6dd676d0..dcc3084161b3 100644
--- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
+++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
@@ -88,7 +88,7 @@ static void pvm_init_traps_aa64pfr1(struct kvm_vcpu *vcpu)
 	/* Memory Tagging: Trap and Treat as Untagged if not supported. */
 	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR1_MTE), feature_ids)) {
 		hcr_set |= HCR_TID5;
-		hcr_clear |= HCR_DCT | HCR_ATA;
+		hcr_clear |= HCR_ATA;
 	}
 
 	vcpu->arch.hcr_el2 |= hcr_set;
@@ -179,8 +179,8 @@ static void pvm_init_trap_regs(struct kvm_vcpu *vcpu)
 	 * - Feature id registers: to control features exposed to guests
 	 * - Implementation-defined features
 	 */
-	vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS |
-			     HCR_TID3 | HCR_TACR | HCR_TIDCP | HCR_TID1;
+	vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS | HCR_TID3 | HCR_TACR | HCR_TIDCP |
+			     HCR_TID1 | HCR_ATA;
 
 	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) {
 		/* route synchronous external abort exceptions to EL2 */
@@ -459,6 +459,8 @@ static int init_shadow_structs(struct kvm *kvm, struct kvm_shadow_vm *vm,
 	vm->host_kvm = kvm;
 	vm->kvm.created_vcpus = nr_vcpus;
 	vm->kvm.arch.vtcr = host_kvm.arch.vtcr;
+	if (kvm_supports_mte(kvm) && test_bit(KVM_ARCH_FLAG_MTE_ENABLED, &kvm->arch.flags))
+		set_bit(KVM_ARCH_FLAG_MTE_ENABLED, &vm->kvm.arch.flags);
 	vm->kvm.arch.pkvm.enabled = READ_ONCE(kvm->arch.pkvm.enabled);
 	vm->kvm.arch.mmu.last_vcpu_ran = last_ran;
 	vm->last_ran_size = last_ran_size;
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index bca90b7354b9..5e079daf2d8e 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1228,8 +1228,10 @@ static int pkvm_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		goto dec_account;
 	}
 
-	write_lock(&kvm->mmu_lock);
 	pfn = page_to_pfn(page);
+	sanitise_mte_tags(kvm, pfn, PAGE_SIZE);
+
+	write_lock(&kvm->mmu_lock);
 	ret = pkvm_host_map_guest(pfn, fault_ipa >> PAGE_SHIFT);
 	if (ret) {
 		if (ret == -EAGAIN)
-- 
2.37.0.rc0.104.g0611611a94-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/3] KVM: arm64: allow MTE in protected VMs if the tag storage is known
@ 2022-06-23  2:19   ` Peter Collingbourne
  0 siblings, 0 replies; 18+ messages in thread
From: Peter Collingbourne @ 2022-06-23  2:19 UTC (permalink / raw)
  To: kvmarm
  Cc: Peter Collingbourne, Marc Zyngier, kvm, Andy Lutomirski,
	linux-arm-kernel, Michael Roth, Catalin Marinas, Chao Peng,
	Will Deacon, Evgenii Stepanov

Because the host may corrupt a protected guest's tag storage unless
protected by stage 2 page tables, we can't expose MTE to protected guests
if the location of the tag storage is not known.

Therefore, only allow protected VM guests to use MTE if the location of
the tag storage is described in the device tree, and only after disowning
any physical memory accessible tag storage regions.

To avoid exposing MTE tags from the host to protected VMs, sanitize
tags before donating pages.

Signed-off-by: Peter Collingbourne <pcc@google.com>
---
 arch/arm64/include/asm/kvm_host.h |  6 +++++
 arch/arm64/include/asm/kvm_pkvm.h |  4 +++-
 arch/arm64/kernel/image-vars.h    |  3 +++
 arch/arm64/kvm/arm.c              | 37 ++++++++++++++++++++++++++++---
 arch/arm64/kvm/hyp/nvhe/pkvm.c    |  8 ++++---
 arch/arm64/kvm/mmu.c              |  4 +++-
 6 files changed, 54 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 0fd184b816b3..973ab0378d01 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -930,6 +930,12 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
 #define kvm_arm_vcpu_sve_finalized(vcpu) \
 	((vcpu)->arch.flags & KVM_ARM64_VCPU_SVE_FINALIZED)
 
+DECLARE_STATIC_KEY_FALSE(pkvm_mte_supported);
+
+#define kvm_supports_mte(kvm)                                                  \
+	(system_supports_mte() &&                                              \
+	 (!kvm_vm_is_protected(kvm) ||                                         \
+	  static_branch_unlikely(&pkvm_mte_supported)))
 #define kvm_has_mte(kvm)					\
 	(system_supports_mte() &&				\
 	 test_bit(KVM_ARCH_FLAG_MTE_ENABLED, &(kvm)->arch.flags))
diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h
index cd56438a34be..ef5d4870c043 100644
--- a/arch/arm64/include/asm/kvm_pkvm.h
+++ b/arch/arm64/include/asm/kvm_pkvm.h
@@ -73,10 +73,12 @@ void kvm_shadow_destroy(struct kvm *kvm);
  * Allow for protected VMs:
  * - Branch Target Identification
  * - Speculative Store Bypassing
+ * - Memory Tagging Extension
  */
 #define PVM_ID_AA64PFR1_ALLOW (\
 	ARM64_FEATURE_MASK(ID_AA64PFR1_BT) | \
-	ARM64_FEATURE_MASK(ID_AA64PFR1_SSBS) \
+	ARM64_FEATURE_MASK(ID_AA64PFR1_SSBS) | \
+	ARM64_FEATURE_MASK(ID_AA64PFR1_MTE) \
 	)
 
 /*
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 2d4d6836ff47..26a9b31478aa 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -84,6 +84,9 @@ KVM_NVHE_ALIAS(__hyp_stub_vectors);
 KVM_NVHE_ALIAS(arm64_const_caps_ready);
 KVM_NVHE_ALIAS(cpu_hwcap_keys);
 
+/* Kernel symbol needed for kvm_supports_mte() check. */
+KVM_NVHE_ALIAS(pkvm_mte_supported);
+
 /* Static keys which are set if a vGIC trap should be handled in hyp. */
 KVM_NVHE_ALIAS(vgic_v2_cpuif_trap);
 KVM_NVHE_ALIAS(vgic_v3_cpuif_trap);
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 38f0900b7ddb..5a780a0f59e3 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -60,6 +60,7 @@ static bool vgic_present;
 
 static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled);
 DEFINE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
+DEFINE_STATIC_KEY_FALSE(pkvm_mte_supported);
 
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
 {
@@ -96,9 +97,7 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		break;
 	case KVM_CAP_ARM_MTE:
 		mutex_lock(&kvm->lock);
-		if (!system_supports_mte() ||
-		    kvm_vm_is_protected(kvm) ||
-		    kvm->created_vcpus) {
+		if (!kvm_supports_mte(kvm) || kvm->created_vcpus) {
 			r = -EINVAL;
 		} else {
 			r = 0;
@@ -334,6 +333,9 @@ static int pkvm_check_extension(struct kvm *kvm, long ext, int kvm_cap)
 	case KVM_CAP_ARM_VM_IPA_SIZE:
 		r = kvm_cap;
 		break;
+	case KVM_CAP_ARM_MTE:
+		r = kvm_cap && static_branch_unlikely(&pkvm_mte_supported);
+		break;
 	case KVM_CAP_GUEST_DEBUG_HW_BPS:
 		r = min(kvm_cap, pkvm_get_max_brps());
 		break;
@@ -1948,9 +1950,36 @@ static void kvm_reserved_memory_init(void)
 		if (!of_get_property(node, "compatible", NULL) &&
 		    of_get_property(node, "no-map", NULL))
 			disown_reserved_memory(node);
+
+		if (of_device_is_compatible(node, "arm,mte-tag-storage"))
+			disown_reserved_memory(node);
 	}
 }
 
+static void kvm_mte_init(void)
+{
+	struct device_node *memory;
+
+	if (!system_supports_mte() || !acpi_disabled ||
+	    !is_protected_kvm_enabled())
+		return;
+
+	/*
+	 * It is only safe to turn on MTE for protected VMs if we can protect
+	 * the guests from host accesses to their tag storage. If every memory
+	 * region has an arm,mte-alloc property we know that all tag storage
+	 * regions exposed to physical memory, if any, are described by a
+	 * reserved-memory compatible with arm,mte-tag-storage. We can use these
+	 * descriptions to unmap these regions from the host's stage 2 page
+	 * tables (see kvm_reserved_memory_init).
+	 */
+	for_each_node_by_type(memory, "memory")
+		if (!of_get_property(memory, "arm,mte-alloc", NULL))
+			return;
+
+	static_branch_enable(&pkvm_mte_supported);
+}
+
 static int init_subsystems(void)
 {
 	int err = 0;
@@ -1993,6 +2022,8 @@ static int init_subsystems(void)
 
 	kvm_reserved_memory_init();
 
+	kvm_mte_init();
+
 out:
 	if (err || !is_protected_kvm_enabled())
 		on_each_cpu(_kvm_arch_hardware_disable, NULL, 1);
diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
index e81a6dd676d0..dcc3084161b3 100644
--- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
+++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
@@ -88,7 +88,7 @@ static void pvm_init_traps_aa64pfr1(struct kvm_vcpu *vcpu)
 	/* Memory Tagging: Trap and Treat as Untagged if not supported. */
 	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR1_MTE), feature_ids)) {
 		hcr_set |= HCR_TID5;
-		hcr_clear |= HCR_DCT | HCR_ATA;
+		hcr_clear |= HCR_ATA;
 	}
 
 	vcpu->arch.hcr_el2 |= hcr_set;
@@ -179,8 +179,8 @@ static void pvm_init_trap_regs(struct kvm_vcpu *vcpu)
 	 * - Feature id registers: to control features exposed to guests
 	 * - Implementation-defined features
 	 */
-	vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS |
-			     HCR_TID3 | HCR_TACR | HCR_TIDCP | HCR_TID1;
+	vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS | HCR_TID3 | HCR_TACR | HCR_TIDCP |
+			     HCR_TID1 | HCR_ATA;
 
 	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) {
 		/* route synchronous external abort exceptions to EL2 */
@@ -459,6 +459,8 @@ static int init_shadow_structs(struct kvm *kvm, struct kvm_shadow_vm *vm,
 	vm->host_kvm = kvm;
 	vm->kvm.created_vcpus = nr_vcpus;
 	vm->kvm.arch.vtcr = host_kvm.arch.vtcr;
+	if (kvm_supports_mte(kvm) && test_bit(KVM_ARCH_FLAG_MTE_ENABLED, &kvm->arch.flags))
+		set_bit(KVM_ARCH_FLAG_MTE_ENABLED, &vm->kvm.arch.flags);
 	vm->kvm.arch.pkvm.enabled = READ_ONCE(kvm->arch.pkvm.enabled);
 	vm->kvm.arch.mmu.last_vcpu_ran = last_ran;
 	vm->last_ran_size = last_ran_size;
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index bca90b7354b9..5e079daf2d8e 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1228,8 +1228,10 @@ static int pkvm_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		goto dec_account;
 	}
 
-	write_lock(&kvm->mmu_lock);
 	pfn = page_to_pfn(page);
+	sanitise_mte_tags(kvm, pfn, PAGE_SIZE);
+
+	write_lock(&kvm->mmu_lock);
 	ret = pkvm_host_map_guest(pfn, fault_ipa >> PAGE_SHIFT);
 	if (ret) {
 		if (ret == -EAGAIN)
-- 
2.37.0.rc0.104.g0611611a94-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/3] KVM: arm64: allow MTE in protected VMs if the tag storage is known
@ 2022-06-23  2:19   ` Peter Collingbourne
  0 siblings, 0 replies; 18+ messages in thread
From: Peter Collingbourne @ 2022-06-23  2:19 UTC (permalink / raw)
  To: kvmarm
  Cc: Catalin Marinas, kvm, Marc Zyngier, Andy Lutomirski, Will Deacon,
	Evgenii Stepanov, Michael Roth, Chao Peng, Peter Collingbourne,
	linux-arm-kernel

Because the host may corrupt a protected guest's tag storage unless
protected by stage 2 page tables, we can't expose MTE to protected guests
if the location of the tag storage is not known.

Therefore, only allow protected VM guests to use MTE if the location of
the tag storage is described in the device tree, and only after disowning
any physical memory accessible tag storage regions.

To avoid exposing MTE tags from the host to protected VMs, sanitize
tags before donating pages.

Signed-off-by: Peter Collingbourne <pcc@google.com>
---
 arch/arm64/include/asm/kvm_host.h |  6 +++++
 arch/arm64/include/asm/kvm_pkvm.h |  4 +++-
 arch/arm64/kernel/image-vars.h    |  3 +++
 arch/arm64/kvm/arm.c              | 37 ++++++++++++++++++++++++++++---
 arch/arm64/kvm/hyp/nvhe/pkvm.c    |  8 ++++---
 arch/arm64/kvm/mmu.c              |  4 +++-
 6 files changed, 54 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 0fd184b816b3..973ab0378d01 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -930,6 +930,12 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
 #define kvm_arm_vcpu_sve_finalized(vcpu) \
 	((vcpu)->arch.flags & KVM_ARM64_VCPU_SVE_FINALIZED)
 
+DECLARE_STATIC_KEY_FALSE(pkvm_mte_supported);
+
+#define kvm_supports_mte(kvm)                                                  \
+	(system_supports_mte() &&                                              \
+	 (!kvm_vm_is_protected(kvm) ||                                         \
+	  static_branch_unlikely(&pkvm_mte_supported)))
 #define kvm_has_mte(kvm)					\
 	(system_supports_mte() &&				\
 	 test_bit(KVM_ARCH_FLAG_MTE_ENABLED, &(kvm)->arch.flags))
diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h
index cd56438a34be..ef5d4870c043 100644
--- a/arch/arm64/include/asm/kvm_pkvm.h
+++ b/arch/arm64/include/asm/kvm_pkvm.h
@@ -73,10 +73,12 @@ void kvm_shadow_destroy(struct kvm *kvm);
  * Allow for protected VMs:
  * - Branch Target Identification
  * - Speculative Store Bypassing
+ * - Memory Tagging Extension
  */
 #define PVM_ID_AA64PFR1_ALLOW (\
 	ARM64_FEATURE_MASK(ID_AA64PFR1_BT) | \
-	ARM64_FEATURE_MASK(ID_AA64PFR1_SSBS) \
+	ARM64_FEATURE_MASK(ID_AA64PFR1_SSBS) | \
+	ARM64_FEATURE_MASK(ID_AA64PFR1_MTE) \
 	)
 
 /*
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 2d4d6836ff47..26a9b31478aa 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -84,6 +84,9 @@ KVM_NVHE_ALIAS(__hyp_stub_vectors);
 KVM_NVHE_ALIAS(arm64_const_caps_ready);
 KVM_NVHE_ALIAS(cpu_hwcap_keys);
 
+/* Kernel symbol needed for kvm_supports_mte() check. */
+KVM_NVHE_ALIAS(pkvm_mte_supported);
+
 /* Static keys which are set if a vGIC trap should be handled in hyp. */
 KVM_NVHE_ALIAS(vgic_v2_cpuif_trap);
 KVM_NVHE_ALIAS(vgic_v3_cpuif_trap);
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 38f0900b7ddb..5a780a0f59e3 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -60,6 +60,7 @@ static bool vgic_present;
 
 static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled);
 DEFINE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
+DEFINE_STATIC_KEY_FALSE(pkvm_mte_supported);
 
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
 {
@@ -96,9 +97,7 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		break;
 	case KVM_CAP_ARM_MTE:
 		mutex_lock(&kvm->lock);
-		if (!system_supports_mte() ||
-		    kvm_vm_is_protected(kvm) ||
-		    kvm->created_vcpus) {
+		if (!kvm_supports_mte(kvm) || kvm->created_vcpus) {
 			r = -EINVAL;
 		} else {
 			r = 0;
@@ -334,6 +333,9 @@ static int pkvm_check_extension(struct kvm *kvm, long ext, int kvm_cap)
 	case KVM_CAP_ARM_VM_IPA_SIZE:
 		r = kvm_cap;
 		break;
+	case KVM_CAP_ARM_MTE:
+		r = kvm_cap && static_branch_unlikely(&pkvm_mte_supported);
+		break;
 	case KVM_CAP_GUEST_DEBUG_HW_BPS:
 		r = min(kvm_cap, pkvm_get_max_brps());
 		break;
@@ -1948,9 +1950,36 @@ static void kvm_reserved_memory_init(void)
 		if (!of_get_property(node, "compatible", NULL) &&
 		    of_get_property(node, "no-map", NULL))
 			disown_reserved_memory(node);
+
+		if (of_device_is_compatible(node, "arm,mte-tag-storage"))
+			disown_reserved_memory(node);
 	}
 }
 
+static void kvm_mte_init(void)
+{
+	struct device_node *memory;
+
+	if (!system_supports_mte() || !acpi_disabled ||
+	    !is_protected_kvm_enabled())
+		return;
+
+	/*
+	 * It is only safe to turn on MTE for protected VMs if we can protect
+	 * the guests from host accesses to their tag storage. If every memory
+	 * region has an arm,mte-alloc property we know that all tag storage
+	 * regions exposed to physical memory, if any, are described by a
+	 * reserved-memory compatible with arm,mte-tag-storage. We can use these
+	 * descriptions to unmap these regions from the host's stage 2 page
+	 * tables (see kvm_reserved_memory_init).
+	 */
+	for_each_node_by_type(memory, "memory")
+		if (!of_get_property(memory, "arm,mte-alloc", NULL))
+			return;
+
+	static_branch_enable(&pkvm_mte_supported);
+}
+
 static int init_subsystems(void)
 {
 	int err = 0;
@@ -1993,6 +2022,8 @@ static int init_subsystems(void)
 
 	kvm_reserved_memory_init();
 
+	kvm_mte_init();
+
 out:
 	if (err || !is_protected_kvm_enabled())
 		on_each_cpu(_kvm_arch_hardware_disable, NULL, 1);
diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
index e81a6dd676d0..dcc3084161b3 100644
--- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
+++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
@@ -88,7 +88,7 @@ static void pvm_init_traps_aa64pfr1(struct kvm_vcpu *vcpu)
 	/* Memory Tagging: Trap and Treat as Untagged if not supported. */
 	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR1_MTE), feature_ids)) {
 		hcr_set |= HCR_TID5;
-		hcr_clear |= HCR_DCT | HCR_ATA;
+		hcr_clear |= HCR_ATA;
 	}
 
 	vcpu->arch.hcr_el2 |= hcr_set;
@@ -179,8 +179,8 @@ static void pvm_init_trap_regs(struct kvm_vcpu *vcpu)
 	 * - Feature id registers: to control features exposed to guests
 	 * - Implementation-defined features
 	 */
-	vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS |
-			     HCR_TID3 | HCR_TACR | HCR_TIDCP | HCR_TID1;
+	vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS | HCR_TID3 | HCR_TACR | HCR_TIDCP |
+			     HCR_TID1 | HCR_ATA;
 
 	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) {
 		/* route synchronous external abort exceptions to EL2 */
@@ -459,6 +459,8 @@ static int init_shadow_structs(struct kvm *kvm, struct kvm_shadow_vm *vm,
 	vm->host_kvm = kvm;
 	vm->kvm.created_vcpus = nr_vcpus;
 	vm->kvm.arch.vtcr = host_kvm.arch.vtcr;
+	if (kvm_supports_mte(kvm) && test_bit(KVM_ARCH_FLAG_MTE_ENABLED, &kvm->arch.flags))
+		set_bit(KVM_ARCH_FLAG_MTE_ENABLED, &vm->kvm.arch.flags);
 	vm->kvm.arch.pkvm.enabled = READ_ONCE(kvm->arch.pkvm.enabled);
 	vm->kvm.arch.mmu.last_vcpu_ran = last_ran;
 	vm->last_ran_size = last_ran_size;
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index bca90b7354b9..5e079daf2d8e 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1228,8 +1228,10 @@ static int pkvm_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		goto dec_account;
 	}
 
-	write_lock(&kvm->mmu_lock);
 	pfn = page_to_pfn(page);
+	sanitise_mte_tags(kvm, pfn, PAGE_SIZE);
+
+	write_lock(&kvm->mmu_lock);
 	ret = pkvm_host_map_guest(pfn, fault_ipa >> PAGE_SHIFT);
 	if (ret) {
 		if (ret == -EAGAIN)
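The reordering matters for pKVM: tags have to be sanitised while the
host still owns the page, because pkvm_host_map_guest() donates it to
the guest, after which its tag storage is no longer writable by the
host. A simplified view of the resulting fault path (error handling
elided; not a verbatim copy of the final function):

	pfn = page_to_pfn(page);
	/* Clear stale tags while the page is still host-owned. */
	sanitise_mte_tags(kvm, pfn, PAGE_SIZE);

	write_lock(&kvm->mmu_lock);
	/* Ownership transfers here; the page is unmapped from the
	 * host's stage 2 tables. */
	ret = pkvm_host_map_guest(pfn, fault_ipa >> PAGE_SHIFT);
	write_unlock(&kvm->mmu_lock);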
-- 
2.37.0.rc0.104.g0611611a94-goog

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/3] KVM: arm64: add a hypercall for disowning pages
  2022-06-23  2:19   ` Peter Collingbourne
@ 2022-06-23 13:11     ` Quentin Perret
  0 siblings, 0 replies; 18+ messages in thread
From: Quentin Perret @ 2022-06-23 13:11 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: kvmarm, Marc Zyngier, kvm, Andy Lutomirski, linux-arm-kernel,
	Michael Roth, Catalin Marinas, Chao Peng, Will Deacon,
	Evgenii Stepanov

Hi Peter,

On Wednesday 22 Jun 2022 at 19:19:24 (-0700), Peter Collingbourne wrote:
> @@ -677,9 +678,9 @@ static bool stage2_pte_is_counted(kvm_pte_t pte)
>  	/*
>  	 * The refcount tracks valid entries as well as invalid entries if they
>  	 * encode ownership of a page to another entity than the page-table
> -	 * owner, whose id is 0.
> +	 * owner, whose id is 0, or NOBODY, which does not correspond to a page-table.
>  	 */
> -	return !!pte;
> +	return !!pte && pte != kvm_init_invalid_leaf_owner(PKVM_ID_NOBODY);
>  }

I'm not sure I understand this part. By not refcounting the PTEs that
are annotated with PKVM_ID_NOBODY, the page-table page that contains
them may be freed at some point. And when that happens, I don't see how
the hypervisor will remember to block host accesses to the disowned
pages.

Cheers,
Quentin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/3] KVM: arm64: add a hypercall for disowning pages
  2022-06-23 13:11     ` Quentin Perret
@ 2022-06-23 18:12       ` Peter Collingbourne
  0 siblings, 0 replies; 18+ messages in thread
From: Peter Collingbourne @ 2022-06-23 18:12 UTC (permalink / raw)
  To: Quentin Perret
  Cc: kvmarm, Marc Zyngier, kvm, Andy Lutomirski, Linux ARM,
	Michael Roth, Catalin Marinas, Chao Peng, Will Deacon,
	Evgenii Stepanov

On Thu, Jun 23, 2022 at 6:12 AM Quentin Perret <qperret@google.com> wrote:
>
> Hi Peter,
>
> On Wednesday 22 Jun 2022 at 19:19:24 (-0700), Peter Collingbourne wrote:
> > @@ -677,9 +678,9 @@ static bool stage2_pte_is_counted(kvm_pte_t pte)
> >       /*
> >        * The refcount tracks valid entries as well as invalid entries if they
> >        * encode ownership of a page to another entity than the page-table
> > -      * owner, whose id is 0.
> > +      * owner, whose id is 0, or NOBODY, which does not correspond to a page-table.
> >        */
> > -     return !!pte;
> > +     return !!pte && pte != kvm_init_invalid_leaf_owner(PKVM_ID_NOBODY);
> >  }
>
> I'm not sure to understand this part? By not refcounting the PTEs that
> are annotated with PKVM_ID_NOBODY, the page-table page that contains
> them may be freed at some point. And when that happens, I don't see how
> the hypervisor will remember to block host accesses to the disowned
> pages.

This was because I misunderstood the code: I thought the refcount was
maintaining a count from PTEs to the pages that they reference (which
would make the refcounting unnecessary for pages owned by nobody).
Reading the code more carefully, my understanding is now that the
refcount instead tracks the number of non-zero PTEs in a page-table
page, so that the hypervisor knows it can free the page once it becomes
all zeros, i.e. once it no longer contains any information that needs
to be preserved (the "references" being counted are implicit in the
location of each PTE). So PTEs owned by NOBODY do need to stay counted
here. I'll drop this part in v2.
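
To make the invariant concrete, here is a simplified model of it
(illustrative only, not the actual pgtable code; the names are made
up):

	/* Each page-table page counts its non-zero entries; once the
	 * count reaches zero the page encodes nothing and may be
	 * reclaimed. */
	typedef unsigned long kvm_pte_t;

	struct table_page {
		kvm_pte_t ptes[512];
		unsigned int refcount;	/* number of counted entries */
	};

	static bool pte_is_counted(kvm_pte_t pte)
	{
		return pte != 0;	/* any non-zero PTE carries information */
	}

	static void set_pte(struct table_page *tp, unsigned int idx,
			    kvm_pte_t new)
	{
		if (pte_is_counted(tp->ptes[idx]))
			tp->refcount--;
		if (pte_is_counted(new))
			tp->refcount++;
		tp->ptes[idx] = new;
		/* Once refcount hits 0 the table page can be freed,
		 * which is exactly why PKVM_ID_NOBODY annotations must
		 * remain counted: freeing the table would forget that
		 * the pages it covered were disowned. */
	}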

Peter

^ permalink raw reply	[flat|nested] 18+ messages in thread
