linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v17 0/6] MTE support for KVM guest
@ 2021-06-21 11:17 Steven Price
  2021-06-21 11:17 ` [PATCH v17 1/6] arm64: mte: Sync tags for pages where PTE is untagged Steven Price
                   ` (6 more replies)
  0 siblings, 7 replies; 23+ messages in thread
From: Steven Price @ 2021-06-21 11:17 UTC (permalink / raw)
  To: Catalin Marinas, Marc Zyngier, Will Deacon
  Cc: Steven Price, James Morse, Julien Thierry, Suzuki K Poulose,
	kvmarm, linux-arm-kernel, linux-kernel, Dave Martin,
	Mark Rutland, Thomas Gleixner, qemu-devel, Juan Quintela,
	Dr. David Alan Gilbert, Richard Henderson, Peter Maydell,
	Andrew Jones

This series adds support for using the Arm Memory Tagging Extensions
(MTE) in a KVM guest.

Changes since v16[1]:

 - Dropped the first patch ("Handle race when synchronising tags") as
   it's not KVM specific, and with MAP_SHARED restricted in KVM there
   is no longer a dependency on it.

 - Changed the return code to -EFAULT when a stage 2 fault hits a
   VM_SHARED VMA in an MTE-enabled VM (and correctly jump to out_unlock
   in this error case).

 - Clarified documentation thanks to Catalin.

 - Rebased onto v5.13-rc4.

 - Added Reviewed-by tags from Catalin - thanks!

[1] https://lore.kernel.org/r/20210618132826.54670-1-steven.price%40arm.com

Steven Price (6):
  arm64: mte: Sync tags for pages where PTE is untagged
  KVM: arm64: Introduce MTE VM feature
  KVM: arm64: Save/restore MTE registers
  KVM: arm64: Expose KVM_ARM_CAP_MTE
  KVM: arm64: ioctl to fetch/store tags in a guest
  KVM: arm64: Document MTE capability and ioctl

 Documentation/virt/kvm/api.rst             | 61 ++++++++++++++++
 arch/arm64/include/asm/kvm_arm.h           |  3 +-
 arch/arm64/include/asm/kvm_emulate.h       |  3 +
 arch/arm64/include/asm/kvm_host.h          | 12 ++++
 arch/arm64/include/asm/kvm_mte.h           | 66 +++++++++++++++++
 arch/arm64/include/asm/mte-def.h           |  1 +
 arch/arm64/include/asm/mte.h               |  4 +-
 arch/arm64/include/asm/pgtable.h           | 22 +++++-
 arch/arm64/include/asm/sysreg.h            |  3 +-
 arch/arm64/include/uapi/asm/kvm.h          | 11 +++
 arch/arm64/kernel/asm-offsets.c            |  2 +
 arch/arm64/kernel/mte.c                    | 18 +++--
 arch/arm64/kvm/arm.c                       | 16 +++++
 arch/arm64/kvm/guest.c                     | 82 ++++++++++++++++++++++
 arch/arm64/kvm/hyp/entry.S                 |  7 ++
 arch/arm64/kvm/hyp/exception.c             |  3 +-
 arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 21 ++++++
 arch/arm64/kvm/mmu.c                       | 64 ++++++++++++++++-
 arch/arm64/kvm/reset.c                     |  4 ++
 arch/arm64/kvm/sys_regs.c                  | 32 +++++++--
 include/uapi/linux/kvm.h                   |  2 +
 21 files changed, 419 insertions(+), 18 deletions(-)
 create mode 100644 arch/arm64/include/asm/kvm_mte.h

-- 
2.20.1



* [PATCH v17 1/6] arm64: mte: Sync tags for pages where PTE is untagged
  2021-06-21 11:17 [PATCH v17 0/6] MTE support for KVM guest Steven Price
@ 2021-06-21 11:17 ` Steven Price
  2021-06-21 11:17 ` [PATCH v17 2/6] KVM: arm64: Introduce MTE VM feature Steven Price
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 23+ messages in thread
From: Steven Price @ 2021-06-21 11:17 UTC (permalink / raw)
  To: Catalin Marinas, Marc Zyngier, Will Deacon
  Cc: Steven Price, James Morse, Julien Thierry, Suzuki K Poulose,
	kvmarm, linux-arm-kernel, linux-kernel, Dave Martin,
	Mark Rutland, Thomas Gleixner, qemu-devel, Juan Quintela,
	Dr. David Alan Gilbert, Richard Henderson, Peter Maydell,
	Andrew Jones

A KVM guest could store tags in a page even if the VMM hasn't mapped
the page with PROT_MTE. So when restoring pages from swap we will
need to check whether there are any saved tags, even if !pte_tagged().

However, don't check pages for which pte_access_permitted() returns
false, as these will not have been swapped out.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/mte.h     |  4 ++--
 arch/arm64/include/asm/pgtable.h | 22 +++++++++++++++++++---
 arch/arm64/kernel/mte.c          | 18 +++++++++++++-----
 3 files changed, 34 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index bc88a1ced0d7..347ef38a35f7 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -37,7 +37,7 @@ void mte_free_tag_storage(char *storage);
 /* track which pages have valid allocation tags */
 #define PG_mte_tagged	PG_arch_2
 
-void mte_sync_tags(pte_t *ptep, pte_t pte);
+void mte_sync_tags(pte_t old_pte, pte_t pte);
 void mte_copy_page_tags(void *kto, const void *kfrom);
 void mte_thread_init_user(void);
 void mte_thread_switch(struct task_struct *next);
@@ -53,7 +53,7 @@ int mte_ptrace_copy_tags(struct task_struct *child, long request,
 /* unused if !CONFIG_ARM64_MTE, silence the compiler */
 #define PG_mte_tagged	0
 
-static inline void mte_sync_tags(pte_t *ptep, pte_t pte)
+static inline void mte_sync_tags(pte_t old_pte, pte_t pte)
 {
 }
 static inline void mte_copy_page_tags(void *kto, const void *kfrom)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 0b10204e72fc..db5402168841 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -314,9 +314,25 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
 	if (pte_present(pte) && pte_user_exec(pte) && !pte_special(pte))
 		__sync_icache_dcache(pte);
 
-	if (system_supports_mte() &&
-	    pte_present(pte) && pte_tagged(pte) && !pte_special(pte))
-		mte_sync_tags(ptep, pte);
+	/*
+	 * If the PTE would provide user space access to the tags associated
+	 * with it then ensure that the MTE tags are synchronised.  Although
+	 * pte_access_permitted() returns false for exec only mappings, they
+	 * don't expose tags (instruction fetches don't check tags).
+	 */
+	if (system_supports_mte() && pte_access_permitted(pte, false) &&
+	    !pte_special(pte)) {
+		pte_t old_pte = READ_ONCE(*ptep);
+		/*
+		 * We only need to synchronise if the new PTE has tags enabled
+		 * or if swapping in (in which case another mapping may have
+		 * set tags in the past even if this PTE isn't tagged).
+		 * (!pte_none() && !pte_present()) is an open coded version of
+		 * is_swap_pte()
+		 */
+		if (pte_tagged(pte) || (!pte_none(old_pte) && !pte_present(old_pte)))
+			mte_sync_tags(old_pte, pte);
+	}
 
 	__check_racy_pte_update(mm, ptep, pte);
 
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 125a10e413e9..69b3fde8759e 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -32,10 +32,9 @@ DEFINE_STATIC_KEY_FALSE(mte_async_mode);
 EXPORT_SYMBOL_GPL(mte_async_mode);
 #endif
 
-static void mte_sync_page_tags(struct page *page, pte_t *ptep, bool check_swap)
+static void mte_sync_page_tags(struct page *page, pte_t old_pte,
+			       bool check_swap, bool pte_is_tagged)
 {
-	pte_t old_pte = READ_ONCE(*ptep);
-
 	if (check_swap && is_swap_pte(old_pte)) {
 		swp_entry_t entry = pte_to_swp_entry(old_pte);
 
@@ -43,6 +42,9 @@ static void mte_sync_page_tags(struct page *page, pte_t *ptep, bool check_swap)
 			return;
 	}
 
+	if (!pte_is_tagged)
+		return;
+
 	page_kasan_tag_reset(page);
 	/*
 	 * We need smp_wmb() in between setting the flags and clearing the
@@ -55,16 +57,22 @@ static void mte_sync_page_tags(struct page *page, pte_t *ptep, bool check_swap)
 	mte_clear_page_tags(page_address(page));
 }
 
-void mte_sync_tags(pte_t *ptep, pte_t pte)
+void mte_sync_tags(pte_t old_pte, pte_t pte)
 {
 	struct page *page = pte_page(pte);
 	long i, nr_pages = compound_nr(page);
 	bool check_swap = nr_pages == 1;
+	bool pte_is_tagged = pte_tagged(pte);
+
+	/* Early out if there's nothing to do */
+	if (!check_swap && !pte_is_tagged)
+		return;
 
 	/* if PG_mte_tagged is set, tags have already been initialised */
 	for (i = 0; i < nr_pages; i++, page++) {
 		if (!test_and_set_bit(PG_mte_tagged, &page->flags))
-			mte_sync_page_tags(page, ptep, check_swap);
+			mte_sync_page_tags(page, old_pte, check_swap,
+					   pte_is_tagged);
 	}
 }
 
-- 
2.20.1



* [PATCH v17 2/6] KVM: arm64: Introduce MTE VM feature
  2021-06-21 11:17 [PATCH v17 0/6] MTE support for KVM guest Steven Price
  2021-06-21 11:17 ` [PATCH v17 1/6] arm64: mte: Sync tags for pages where PTE is untagged Steven Price
@ 2021-06-21 11:17 ` Steven Price
  2021-06-21 17:00   ` Fuad Tabba
  2021-06-21 11:17 ` [PATCH v17 3/6] KVM: arm64: Save/restore MTE registers Steven Price
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 23+ messages in thread
From: Steven Price @ 2021-06-21 11:17 UTC (permalink / raw)
  To: Catalin Marinas, Marc Zyngier, Will Deacon
  Cc: Steven Price, James Morse, Julien Thierry, Suzuki K Poulose,
	kvmarm, linux-arm-kernel, linux-kernel, Dave Martin,
	Mark Rutland, Thomas Gleixner, qemu-devel, Juan Quintela,
	Dr. David Alan Gilbert, Richard Henderson, Peter Maydell,
	Andrew Jones

Add a new VM feature 'KVM_CAP_ARM_MTE' which enables memory tagging
for a VM. This will expose the feature to the guest and automatically
mark memory pages touched by the VM as PG_mte_tagged (and clear the
tag storage) to ensure that the guest cannot see stale tags, and so
that the tags are correctly saved/restored across swap.

Actually exposing the new capability to user space happens in a later
patch.
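
(For illustration only, not part of this patch: guest RAM for an
MTE-enabled VM must not be backed by a MAP_SHARED mapping, so a VMM
might set up a memslot along the lines of the sketch below. vm_fd and
the slot/GPA values are assumptions.)

  #include <sys/mman.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  static int add_guest_ram(int vm_fd, __u64 gpa, size_t size)
  {
          struct kvm_userspace_memory_region region;
          void *mem;

          /* MAP_SHARED backing is rejected once MTE is enabled */
          mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
          if (mem == MAP_FAILED)
                  return -1;

          region.slot = 0;
          region.flags = 0;
          region.guest_phys_addr = gpa;
          region.memory_size = size;
          region.userspace_addr = (unsigned long)mem;

          return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
  }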

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/kvm_emulate.h |  3 ++
 arch/arm64/include/asm/kvm_host.h    |  3 ++
 arch/arm64/kvm/hyp/exception.c       |  3 +-
 arch/arm64/kvm/mmu.c                 | 64 +++++++++++++++++++++++++++-
 arch/arm64/kvm/sys_regs.c            |  7 +++
 include/uapi/linux/kvm.h             |  1 +
 6 files changed, 79 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 01b9857757f2..fd418955e31e 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -84,6 +84,9 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
 	if (cpus_have_const_cap(ARM64_MISMATCHED_CACHE_TYPE) ||
 	    vcpu_el1_is_32bit(vcpu))
 		vcpu->arch.hcr_el2 |= HCR_TID2;
+
+	if (kvm_has_mte(vcpu->kvm))
+		vcpu->arch.hcr_el2 |= HCR_ATA;
 }
 
 static inline unsigned long *vcpu_hcr(struct kvm_vcpu *vcpu)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 7cd7d5c8c4bc..afaa5333f0e4 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -132,6 +132,8 @@ struct kvm_arch {
 
 	u8 pfr0_csv2;
 	u8 pfr0_csv3;
+	/* Memory Tagging Extension enabled for the guest */
+	bool mte_enabled;
 };
 
 struct kvm_vcpu_fault_info {
@@ -769,6 +771,7 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
 #define kvm_arm_vcpu_sve_finalized(vcpu) \
 	((vcpu)->arch.flags & KVM_ARM64_VCPU_SVE_FINALIZED)
 
+#define kvm_has_mte(kvm) (system_supports_mte() && (kvm)->arch.mte_enabled)
 #define kvm_vcpu_has_pmu(vcpu)					\
 	(test_bit(KVM_ARM_VCPU_PMU_V3, (vcpu)->arch.features))
 
diff --git a/arch/arm64/kvm/hyp/exception.c b/arch/arm64/kvm/hyp/exception.c
index 11541b94b328..0418399e0a20 100644
--- a/arch/arm64/kvm/hyp/exception.c
+++ b/arch/arm64/kvm/hyp/exception.c
@@ -112,7 +112,8 @@ static void enter_exception64(struct kvm_vcpu *vcpu, unsigned long target_mode,
 	new |= (old & PSR_C_BIT);
 	new |= (old & PSR_V_BIT);
 
-	// TODO: TCO (if/when ARMv8.5-MemTag is exposed to guests)
+	if (kvm_has_mte(vcpu->kvm))
+		new |= PSR_TCO_BIT;
 
 	new |= (old & PSR_DIT_BIT);
 
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index c10207fed2f3..52326b739357 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -822,6 +822,45 @@ transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
 	return PAGE_SIZE;
 }
 
+/*
+ * The page will be mapped in stage 2 as Normal Cacheable, so the VM will be
+ * able to see the page's tags and therefore they must be initialised first. If
+ * PG_mte_tagged is set, tags have already been initialised.
+ *
+ * The race in the test/set of the PG_mte_tagged flag is handled by:
+ * - preventing VM_SHARED mappings in a memslot with MTE, preventing two VMs
+ *   racing to sanitise the same page
+ * - mmap_lock protects between a VM faulting a page in and the VMM performing
+ *   an mprotect() to add VM_MTE
+ */
+static int sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
+			     unsigned long size)
+{
+	unsigned long i, nr_pages = size >> PAGE_SHIFT;
+	struct page *page;
+
+	if (!kvm_has_mte(kvm))
+		return 0;
+
+	/*
+	 * pfn_to_online_page() is used to reject ZONE_DEVICE pages
+	 * that may not support tags.
+	 */
+	page = pfn_to_online_page(pfn);
+
+	if (!page)
+		return -EFAULT;
+
+	for (i = 0; i < nr_pages; i++, page++) {
+		if (!test_bit(PG_mte_tagged, &page->flags)) {
+			mte_clear_page_tags(page_address(page));
+			set_bit(PG_mte_tagged, &page->flags);
+		}
+	}
+
+	return 0;
+}
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
 			  unsigned long fault_status)
@@ -971,8 +1010,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	if (writable)
 		prot |= KVM_PGTABLE_PROT_W;
 
-	if (fault_status != FSC_PERM && !device)
+	if (fault_status != FSC_PERM && !device) {
+		/* Check the VMM hasn't introduced a new VM_SHARED VMA */
+		if (kvm_has_mte(kvm) && vma->vm_flags & VM_SHARED) {
+			ret = -EFAULT;
+			goto out_unlock;
+		}
+		ret = sanitise_mte_tags(kvm, pfn, vma_pagesize);
+		if (ret)
+			goto out_unlock;
+
 		clean_dcache_guest_page(pfn, vma_pagesize);
+	}
 
 	if (exec_fault) {
 		prot |= KVM_PGTABLE_PROT_X;
@@ -1168,12 +1217,17 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
 bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 {
 	kvm_pfn_t pfn = pte_pfn(range->pte);
+	int ret;
 
 	if (!kvm->arch.mmu.pgt)
 		return false;
 
 	WARN_ON(range->end - range->start != 1);
 
+	ret = sanitise_mte_tags(kvm, pfn, PAGE_SIZE);
+	if (ret)
+		return false;
+
 	/*
 	 * We've moved a page around, probably through CoW, so let's treat it
 	 * just like a translation fault and clean the cache to the PoC.
@@ -1381,6 +1435,14 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 		if (!vma)
 			break;
 
+		/*
+		 * VM_SHARED mappings are not allowed with MTE to avoid races
+		 * when updating the PG_mte_tagged page flag, see
+		 * sanitise_mte_tags for more details.
+		 */
+		if (kvm_has_mte(kvm) && vma->vm_flags & VM_SHARED)
+			return -EINVAL;
+
 		/*
 		 * Take the intersection of this VMA with the memory region
 		 */
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 1a7968ad078c..36f67f8deae1 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1047,6 +1047,13 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu,
 		break;
 	case SYS_ID_AA64PFR1_EL1:
 		val &= ~FEATURE(ID_AA64PFR1_MTE);
+		if (kvm_has_mte(vcpu->kvm)) {
+			u64 pfr, mte;
+
+			pfr = read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1);
+			mte = cpuid_feature_extract_unsigned_field(pfr, ID_AA64PFR1_MTE_SHIFT);
+			val |= FIELD_PREP(FEATURE(ID_AA64PFR1_MTE), mte);
+		}
 		break;
 	case SYS_ID_AA64ISAR1_EL1:
 		if (!vcpu_has_ptrauth(vcpu))
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 79d9c44d1ad7..d4da58ddcad7 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1083,6 +1083,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_SGX_ATTRIBUTE 196
 #define KVM_CAP_VM_COPY_ENC_CONTEXT_FROM 197
 #define KVM_CAP_PTP_KVM 198
+#define KVM_CAP_ARM_MTE 199
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.20.1



* [PATCH v17 3/6] KVM: arm64: Save/restore MTE registers
  2021-06-21 11:17 [PATCH v17 0/6] MTE support for KVM guest Steven Price
  2021-06-21 11:17 ` [PATCH v17 1/6] arm64: mte: Sync tags for pages where PTE is untagged Steven Price
  2021-06-21 11:17 ` [PATCH v17 2/6] KVM: arm64: Introduce MTE VM feature Steven Price
@ 2021-06-21 11:17 ` Steven Price
  2021-06-22  9:46   ` Fuad Tabba
  2021-06-21 11:17 ` [PATCH v17 4/6] KVM: arm64: Expose KVM_ARM_CAP_MTE Steven Price
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 23+ messages in thread
From: Steven Price @ 2021-06-21 11:17 UTC (permalink / raw)
  To: Catalin Marinas, Marc Zyngier, Will Deacon
  Cc: Steven Price, James Morse, Julien Thierry, Suzuki K Poulose,
	kvmarm, linux-arm-kernel, linux-kernel, Dave Martin,
	Mark Rutland, Thomas Gleixner, qemu-devel, Juan Quintela,
	Dr. David Alan Gilbert, Richard Henderson, Peter Maydell,
	Andrew Jones

Define the new system registers that MTE introduces and context switch
them. The MTE feature is still hidden from the ID register as it isn't
supported in a VM yet.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/kvm_arm.h           |  3 +-
 arch/arm64/include/asm/kvm_host.h          |  6 ++
 arch/arm64/include/asm/kvm_mte.h           | 66 ++++++++++++++++++++++
 arch/arm64/include/asm/sysreg.h            |  3 +-
 arch/arm64/kernel/asm-offsets.c            |  2 +
 arch/arm64/kvm/hyp/entry.S                 |  7 +++
 arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 21 +++++++
 arch/arm64/kvm/sys_regs.c                  | 22 ++++++--
 8 files changed, 124 insertions(+), 6 deletions(-)
 create mode 100644 arch/arm64/include/asm/kvm_mte.h

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 692c9049befa..d436831dd706 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -12,7 +12,8 @@
 #include <asm/types.h>
 
 /* Hyp Configuration Register (HCR) bits */
-#define HCR_ATA		(UL(1) << 56)
+#define HCR_ATA_SHIFT	56
+#define HCR_ATA		(UL(1) << HCR_ATA_SHIFT)
 #define HCR_FWB		(UL(1) << 46)
 #define HCR_API		(UL(1) << 41)
 #define HCR_APK		(UL(1) << 40)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index afaa5333f0e4..309e36cc1b42 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -208,6 +208,12 @@ enum vcpu_sysreg {
 	CNTP_CVAL_EL0,
 	CNTP_CTL_EL0,
 
+	/* Memory Tagging Extension registers */
+	RGSR_EL1,	/* Random Allocation Tag Seed Register */
+	GCR_EL1,	/* Tag Control Register */
+	TFSR_EL1,	/* Tag Fault Status Register (EL1) */
+	TFSRE0_EL1,	/* Tag Fault Status Register (EL0) */
+
 	/* 32bit specific registers. Keep them at the end of the range */
 	DACR32_EL2,	/* Domain Access Control Register */
 	IFSR32_EL2,	/* Instruction Fault Status Register */
diff --git a/arch/arm64/include/asm/kvm_mte.h b/arch/arm64/include/asm/kvm_mte.h
new file mode 100644
index 000000000000..88dd1199670b
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_mte.h
@@ -0,0 +1,66 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2021 ARM Ltd.
+ */
+#ifndef __ASM_KVM_MTE_H
+#define __ASM_KVM_MTE_H
+
+#ifdef __ASSEMBLY__
+
+#include <asm/sysreg.h>
+
+#ifdef CONFIG_ARM64_MTE
+
+.macro mte_switch_to_guest g_ctxt, h_ctxt, reg1
+alternative_if_not ARM64_MTE
+	b	.L__skip_switch\@
+alternative_else_nop_endif
+	mrs	\reg1, hcr_el2
+	tbz	\reg1, #(HCR_ATA_SHIFT), .L__skip_switch\@
+
+	mrs_s	\reg1, SYS_RGSR_EL1
+	str	\reg1, [\h_ctxt, #CPU_RGSR_EL1]
+	mrs_s	\reg1, SYS_GCR_EL1
+	str	\reg1, [\h_ctxt, #CPU_GCR_EL1]
+
+	ldr	\reg1, [\g_ctxt, #CPU_RGSR_EL1]
+	msr_s	SYS_RGSR_EL1, \reg1
+	ldr	\reg1, [\g_ctxt, #CPU_GCR_EL1]
+	msr_s	SYS_GCR_EL1, \reg1
+
+.L__skip_switch\@:
+.endm
+
+.macro mte_switch_to_hyp g_ctxt, h_ctxt, reg1
+alternative_if_not ARM64_MTE
+	b	.L__skip_switch\@
+alternative_else_nop_endif
+	mrs	\reg1, hcr_el2
+	tbz	\reg1, #(HCR_ATA_SHIFT), .L__skip_switch\@
+
+	mrs_s	\reg1, SYS_RGSR_EL1
+	str	\reg1, [\g_ctxt, #CPU_RGSR_EL1]
+	mrs_s	\reg1, SYS_GCR_EL1
+	str	\reg1, [\g_ctxt, #CPU_GCR_EL1]
+
+	ldr	\reg1, [\h_ctxt, #CPU_RGSR_EL1]
+	msr_s	SYS_RGSR_EL1, \reg1
+	ldr	\reg1, [\h_ctxt, #CPU_GCR_EL1]
+	msr_s	SYS_GCR_EL1, \reg1
+
+	isb
+
+.L__skip_switch\@:
+.endm
+
+#else /* CONFIG_ARM64_MTE */
+
+.macro mte_switch_to_guest g_ctxt, h_ctxt, reg1
+.endm
+
+.macro mte_switch_to_hyp g_ctxt, h_ctxt, reg1
+.endm
+
+#endif /* CONFIG_ARM64_MTE */
+#endif /* __ASSEMBLY__ */
+#endif /* __ASM_KVM_MTE_H */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 65d15700a168..347ccac2341e 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -651,7 +651,8 @@
 
 #define INIT_SCTLR_EL2_MMU_ON						\
 	(SCTLR_ELx_M  | SCTLR_ELx_C | SCTLR_ELx_SA | SCTLR_ELx_I |	\
-	 SCTLR_ELx_IESB | SCTLR_ELx_WXN | ENDIAN_SET_EL2 | SCTLR_EL2_RES1)
+	 SCTLR_ELx_IESB | SCTLR_ELx_WXN | ENDIAN_SET_EL2 |		\
+	 SCTLR_ELx_ITFSB | SCTLR_EL2_RES1)
 
 #define INIT_SCTLR_EL2_MMU_OFF \
 	(SCTLR_EL2_RES1 | ENDIAN_SET_EL2)
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 0cb34ccb6e73..6f0044cb233e 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -111,6 +111,8 @@ int main(void)
   DEFINE(VCPU_WORKAROUND_FLAGS,	offsetof(struct kvm_vcpu, arch.workaround_flags));
   DEFINE(VCPU_HCR_EL2,		offsetof(struct kvm_vcpu, arch.hcr_el2));
   DEFINE(CPU_USER_PT_REGS,	offsetof(struct kvm_cpu_context, regs));
+  DEFINE(CPU_RGSR_EL1,		offsetof(struct kvm_cpu_context, sys_regs[RGSR_EL1]));
+  DEFINE(CPU_GCR_EL1,		offsetof(struct kvm_cpu_context, sys_regs[GCR_EL1]));
   DEFINE(CPU_APIAKEYLO_EL1,	offsetof(struct kvm_cpu_context, sys_regs[APIAKEYLO_EL1]));
   DEFINE(CPU_APIBKEYLO_EL1,	offsetof(struct kvm_cpu_context, sys_regs[APIBKEYLO_EL1]));
   DEFINE(CPU_APDAKEYLO_EL1,	offsetof(struct kvm_cpu_context, sys_regs[APDAKEYLO_EL1]));
diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
index e831d3dfd50d..435346ea1504 100644
--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -13,6 +13,7 @@
 #include <asm/kvm_arm.h>
 #include <asm/kvm_asm.h>
 #include <asm/kvm_mmu.h>
+#include <asm/kvm_mte.h>
 #include <asm/kvm_ptrauth.h>
 
 	.text
@@ -51,6 +52,9 @@ alternative_else_nop_endif
 
 	add	x29, x0, #VCPU_CONTEXT
 
+	// mte_switch_to_guest(g_ctxt, h_ctxt, tmp1)
+	mte_switch_to_guest x29, x1, x2
+
 	// Macro ptrauth_switch_to_guest format:
 	// 	ptrauth_switch_to_guest(guest cxt, tmp1, tmp2, tmp3)
 	// The below macro to restore guest keys is not implemented in C code
@@ -142,6 +146,9 @@ SYM_INNER_LABEL(__guest_exit, SYM_L_GLOBAL)
 	// when this feature is enabled for kernel code.
 	ptrauth_switch_to_hyp x1, x2, x3, x4, x5
 
+	// mte_switch_to_hyp(g_ctxt, h_ctxt, reg1)
+	mte_switch_to_hyp x1, x2, x3
+
 	// Restore hyp's sp_el0
 	restore_sp_el0 x2, x3
 
diff --git a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
index cce43bfe158f..de7e14c862e6 100644
--- a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
+++ b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
@@ -14,6 +14,7 @@
 #include <asm/kvm_asm.h>
 #include <asm/kvm_emulate.h>
 #include <asm/kvm_hyp.h>
+#include <asm/kvm_mmu.h>
 
 static inline void __sysreg_save_common_state(struct kvm_cpu_context *ctxt)
 {
@@ -26,6 +27,16 @@ static inline void __sysreg_save_user_state(struct kvm_cpu_context *ctxt)
 	ctxt_sys_reg(ctxt, TPIDRRO_EL0)	= read_sysreg(tpidrro_el0);
 }
 
+static inline bool ctxt_has_mte(struct kvm_cpu_context *ctxt)
+{
+	struct kvm_vcpu *vcpu = ctxt->__hyp_running_vcpu;
+
+	if (!vcpu)
+		vcpu = container_of(ctxt, struct kvm_vcpu, arch.ctxt);
+
+	return kvm_has_mte(kern_hyp_va(vcpu->kvm));
+}
+
 static inline void __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
 {
 	ctxt_sys_reg(ctxt, CSSELR_EL1)	= read_sysreg(csselr_el1);
@@ -46,6 +57,11 @@ static inline void __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
 	ctxt_sys_reg(ctxt, PAR_EL1)	= read_sysreg_par();
 	ctxt_sys_reg(ctxt, TPIDR_EL1)	= read_sysreg(tpidr_el1);
 
+	if (ctxt_has_mte(ctxt)) {
+		ctxt_sys_reg(ctxt, TFSR_EL1) = read_sysreg_el1(SYS_TFSR);
+		ctxt_sys_reg(ctxt, TFSRE0_EL1) = read_sysreg_s(SYS_TFSRE0_EL1);
+	}
+
 	ctxt_sys_reg(ctxt, SP_EL1)	= read_sysreg(sp_el1);
 	ctxt_sys_reg(ctxt, ELR_EL1)	= read_sysreg_el1(SYS_ELR);
 	ctxt_sys_reg(ctxt, SPSR_EL1)	= read_sysreg_el1(SYS_SPSR);
@@ -107,6 +123,11 @@ static inline void __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
 	write_sysreg(ctxt_sys_reg(ctxt, PAR_EL1),	par_el1);
 	write_sysreg(ctxt_sys_reg(ctxt, TPIDR_EL1),	tpidr_el1);
 
+	if (ctxt_has_mte(ctxt)) {
+		write_sysreg_el1(ctxt_sys_reg(ctxt, TFSR_EL1), SYS_TFSR);
+		write_sysreg_s(ctxt_sys_reg(ctxt, TFSRE0_EL1), SYS_TFSRE0_EL1);
+	}
+
 	if (!has_vhe() &&
 	    cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT) &&
 	    ctxt->__hyp_running_vcpu) {
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 36f67f8deae1..5c75b24eae21 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1309,6 +1309,20 @@ static bool access_ccsidr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
 	return true;
 }
 
+static unsigned int mte_visibility(const struct kvm_vcpu *vcpu,
+				   const struct sys_reg_desc *rd)
+{
+	return REG_HIDDEN;
+}
+
+#define MTE_REG(name) {				\
+	SYS_DESC(SYS_##name),			\
+	.access = undef_access,			\
+	.reset = reset_unknown,			\
+	.reg = name,				\
+	.visibility = mte_visibility,		\
+}
+
 /* sys_reg_desc initialiser for known cpufeature ID registers */
 #define ID_SANITISED(name) {			\
 	SYS_DESC(SYS_##name),			\
@@ -1477,8 +1491,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	{ SYS_DESC(SYS_ACTLR_EL1), access_actlr, reset_actlr, ACTLR_EL1 },
 	{ SYS_DESC(SYS_CPACR_EL1), NULL, reset_val, CPACR_EL1, 0 },
 
-	{ SYS_DESC(SYS_RGSR_EL1), undef_access },
-	{ SYS_DESC(SYS_GCR_EL1), undef_access },
+	MTE_REG(RGSR_EL1),
+	MTE_REG(GCR_EL1),
 
 	{ SYS_DESC(SYS_ZCR_EL1), NULL, reset_val, ZCR_EL1, 0, .visibility = sve_visibility },
 	{ SYS_DESC(SYS_TRFCR_EL1), undef_access },
@@ -1505,8 +1519,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	{ SYS_DESC(SYS_ERXMISC0_EL1), trap_raz_wi },
 	{ SYS_DESC(SYS_ERXMISC1_EL1), trap_raz_wi },
 
-	{ SYS_DESC(SYS_TFSR_EL1), undef_access },
-	{ SYS_DESC(SYS_TFSRE0_EL1), undef_access },
+	MTE_REG(TFSR_EL1),
+	MTE_REG(TFSRE0_EL1),
 
 	{ SYS_DESC(SYS_FAR_EL1), access_vm_reg, reset_unknown, FAR_EL1 },
 	{ SYS_DESC(SYS_PAR_EL1), NULL, reset_unknown, PAR_EL1 },
-- 
2.20.1



* [PATCH v17 4/6] KVM: arm64: Expose KVM_ARM_CAP_MTE
  2021-06-21 11:17 [PATCH v17 0/6] MTE support for KVM guest Steven Price
                   ` (2 preceding siblings ...)
  2021-06-21 11:17 ` [PATCH v17 3/6] KVM: arm64: Save/restore MTE registers Steven Price
@ 2021-06-21 11:17 ` Steven Price
  2021-06-22  8:07   ` Fuad Tabba
  2021-06-21 11:17 ` [PATCH v17 5/6] KVM: arm64: ioctl to fetch/store tags in a guest Steven Price
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 23+ messages in thread
From: Steven Price @ 2021-06-21 11:17 UTC (permalink / raw)
  To: Catalin Marinas, Marc Zyngier, Will Deacon
  Cc: Steven Price, James Morse, Julien Thierry, Suzuki K Poulose,
	kvmarm, linux-arm-kernel, linux-kernel, Dave Martin,
	Mark Rutland, Thomas Gleixner, qemu-devel, Juan Quintela,
	Dr. David Alan Gilbert, Richard Henderson, Peter Maydell,
	Andrew Jones

It's now safe for the VMM to enable MTE in a guest, so expose the
capability to user space.
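
(For illustration, a minimal sketch of how a VMM might probe and
enable the capability; vm_fd is assumed to be an existing VM file
descriptor, and this must be done before any VCPUs are created.)

  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  static int enable_mte(int vm_fd)
  {
          struct kvm_enable_cap cap = { .cap = KVM_CAP_ARM_MTE };

          /* Check the VM-level capability before enabling it */
          if (ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_ARM_MTE) <= 0)
                  return -1;

          return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
  }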

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/kvm/arm.c      | 9 +++++++++
 arch/arm64/kvm/reset.c    | 4 ++++
 arch/arm64/kvm/sys_regs.c | 3 +++
 3 files changed, 16 insertions(+)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e720148232a0..28ce26a68f09 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -93,6 +93,12 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		r = 0;
 		kvm->arch.return_nisv_io_abort_to_user = true;
 		break;
+	case KVM_CAP_ARM_MTE:
+		if (!system_supports_mte() || kvm->created_vcpus)
+			return -EINVAL;
+		r = 0;
+		kvm->arch.mte_enabled = true;
+		break;
 	default:
 		r = -EINVAL;
 		break;
@@ -237,6 +243,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		 */
 		r = 1;
 		break;
+	case KVM_CAP_ARM_MTE:
+		r = system_supports_mte();
+		break;
 	case KVM_CAP_STEAL_TIME:
 		r = kvm_arm_pvtime_supported();
 		break;
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index d37ebee085cf..9e6922b9503a 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -244,6 +244,10 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 	switch (vcpu->arch.target) {
 	default:
 		if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) {
+			if (vcpu->kvm->arch.mte_enabled) {
+				ret = -EINVAL;
+				goto out;
+			}
 			pstate = VCPU_RESET_PSTATE_SVC;
 		} else {
 			pstate = VCPU_RESET_PSTATE_EL1;
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 5c75b24eae21..f6f126eb6ac1 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1312,6 +1312,9 @@ static bool access_ccsidr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
 static unsigned int mte_visibility(const struct kvm_vcpu *vcpu,
 				   const struct sys_reg_desc *rd)
 {
+	if (kvm_has_mte(vcpu->kvm))
+		return 0;
+
 	return REG_HIDDEN;
 }
 
-- 
2.20.1



* [PATCH v17 5/6] KVM: arm64: ioctl to fetch/store tags in a guest
  2021-06-21 11:17 [PATCH v17 0/6] MTE support for KVM guest Steven Price
                   ` (3 preceding siblings ...)
  2021-06-21 11:17 ` [PATCH v17 4/6] KVM: arm64: Expose KVM_ARM_CAP_MTE Steven Price
@ 2021-06-21 11:17 ` Steven Price
  2021-06-22  8:56   ` Fuad Tabba
  2021-06-24 13:35   ` Marc Zyngier
  2021-06-21 11:17 ` [PATCH v17 6/6] KVM: arm64: Document MTE capability and ioctl Steven Price
  2021-06-22 14:21 ` [PATCH v17 0/6] MTE support for KVM guest Marc Zyngier
  6 siblings, 2 replies; 23+ messages in thread
From: Steven Price @ 2021-06-21 11:17 UTC (permalink / raw)
  To: Catalin Marinas, Marc Zyngier, Will Deacon
  Cc: Steven Price, James Morse, Julien Thierry, Suzuki K Poulose,
	kvmarm, linux-arm-kernel, linux-kernel, Dave Martin,
	Mark Rutland, Thomas Gleixner, qemu-devel, Juan Quintela,
	Dr. David Alan Gilbert, Richard Henderson, Peter Maydell,
	Andrew Jones

The VMM may not wish to have its own mapping of guest memory mapped
with PROT_MTE because this causes problems if the VMM has tag checking
enabled (the guest controls the tags in physical RAM and it's unlikely
the tags are correct for the VMM).

Instead, add a new ioctl which allows the VMM to easily read/write the
tags from guest memory, allowing the VMM's mapping to be non-PROT_MTE
while the VMM can still read/write the tags for the purpose of
migration.
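
(For illustration, a minimal sketch of the save side of a migration;
vm_fd and the buffer management are assumptions. The tag buffer needs
length/16 bytes: one tag byte per 16-byte granule.)

  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  static int save_tags(int vm_fd, __u64 ipa, __u64 length, void *tag_buf)
  {
          struct kvm_arm_copy_mte_tags copy = {
                  .guest_ipa = ipa,       /* must be PAGE_SIZE aligned */
                  .length = length,       /* must be PAGE_SIZE aligned */
                  .addr = tag_buf,
                  .flags = KVM_ARM_TAGS_FROM_GUEST,
          };

          /* Returns length on success, bytes copied on a partial copy */
          return ioctl(vm_fd, KVM_ARM_MTE_COPY_TAGS, &copy);
  }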

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/kvm_host.h |  3 ++
 arch/arm64/include/asm/mte-def.h  |  1 +
 arch/arm64/include/uapi/asm/kvm.h | 11 +++++
 arch/arm64/kvm/arm.c              |  7 +++
 arch/arm64/kvm/guest.c            | 82 +++++++++++++++++++++++++++++++
 include/uapi/linux/kvm.h          |  1 +
 6 files changed, 105 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 309e36cc1b42..6a2ac4636d42 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -729,6 +729,9 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
 int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
 			       struct kvm_device_attr *attr);
 
+long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
+				struct kvm_arm_copy_mte_tags *copy_tags);
+
 /* Guest/host FPSIMD coordination helpers */
 int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/include/asm/mte-def.h b/arch/arm64/include/asm/mte-def.h
index cf241b0f0a42..626d359b396e 100644
--- a/arch/arm64/include/asm/mte-def.h
+++ b/arch/arm64/include/asm/mte-def.h
@@ -7,6 +7,7 @@
 
 #define MTE_GRANULE_SIZE	UL(16)
 #define MTE_GRANULE_MASK	(~(MTE_GRANULE_SIZE - 1))
+#define MTE_GRANULES_PER_PAGE	(PAGE_SIZE / MTE_GRANULE_SIZE)
 #define MTE_TAG_SHIFT		56
 #define MTE_TAG_SIZE		4
 #define MTE_TAG_MASK		GENMASK((MTE_TAG_SHIFT + (MTE_TAG_SIZE - 1)), MTE_TAG_SHIFT)
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 24223adae150..b3edde68bc3e 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -184,6 +184,17 @@ struct kvm_vcpu_events {
 	__u32 reserved[12];
 };
 
+struct kvm_arm_copy_mte_tags {
+	__u64 guest_ipa;
+	__u64 length;
+	void __user *addr;
+	__u64 flags;
+	__u64 reserved[2];
+};
+
+#define KVM_ARM_TAGS_TO_GUEST		0
+#define KVM_ARM_TAGS_FROM_GUEST		1
+
 /* If you need to interpret the index values, here is the key: */
 #define KVM_REG_ARM_COPROC_MASK		0x000000000FFF0000
 #define KVM_REG_ARM_COPROC_SHIFT	16
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 28ce26a68f09..511f3716fe33 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1359,6 +1359,13 @@ long kvm_arch_vm_ioctl(struct file *filp,
 
 		return 0;
 	}
+	case KVM_ARM_MTE_COPY_TAGS: {
+		struct kvm_arm_copy_mte_tags copy_tags;
+
+		if (copy_from_user(&copy_tags, argp, sizeof(copy_tags)))
+			return -EFAULT;
+		return kvm_vm_ioctl_mte_copy_tags(kvm, &copy_tags);
+	}
 	default:
 		return -EINVAL;
 	}
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 5cb4a1cd5603..4ddb20017b2f 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -995,3 +995,85 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
 
 	return ret;
 }
+
+long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
+				struct kvm_arm_copy_mte_tags *copy_tags)
+{
+	gpa_t guest_ipa = copy_tags->guest_ipa;
+	size_t length = copy_tags->length;
+	void __user *tags = copy_tags->addr;
+	gpa_t gfn;
+	bool write = !(copy_tags->flags & KVM_ARM_TAGS_FROM_GUEST);
+	int ret = 0;
+
+	if (!kvm_has_mte(kvm))
+		return -EINVAL;
+
+	if (copy_tags->reserved[0] || copy_tags->reserved[1])
+		return -EINVAL;
+
+	if (copy_tags->flags & ~KVM_ARM_TAGS_FROM_GUEST)
+		return -EINVAL;
+
+	if (length & ~PAGE_MASK || guest_ipa & ~PAGE_MASK)
+		return -EINVAL;
+
+	gfn = gpa_to_gfn(guest_ipa);
+
+	mutex_lock(&kvm->slots_lock);
+
+	while (length > 0) {
+		kvm_pfn_t pfn = gfn_to_pfn_prot(kvm, gfn, write, NULL);
+		void *maddr;
+		unsigned long num_tags;
+		struct page *page;
+
+		if (is_error_noslot_pfn(pfn)) {
+			ret = -EFAULT;
+			goto out;
+		}
+
+		page = pfn_to_online_page(pfn);
+		if (!page) {
+			/* Reject ZONE_DEVICE memory */
+			ret = -EFAULT;
+			goto out;
+		}
+		maddr = page_address(page);
+
+		if (!write) {
+			if (test_bit(PG_mte_tagged, &page->flags))
+				num_tags = mte_copy_tags_to_user(tags, maddr,
+							MTE_GRANULES_PER_PAGE);
+			else
+				/* No tags in memory, so write zeros */
+				num_tags = MTE_GRANULES_PER_PAGE -
+					clear_user(tags, MTE_GRANULES_PER_PAGE);
+			kvm_release_pfn_clean(pfn);
+		} else {
+			num_tags = mte_copy_tags_from_user(maddr, tags,
+							MTE_GRANULES_PER_PAGE);
+			kvm_release_pfn_dirty(pfn);
+		}
+
+		if (num_tags != MTE_GRANULES_PER_PAGE) {
+			ret = -EFAULT;
+			goto out;
+		}
+
+		/* Set the flag after checking the write completed fully */
+		if (write)
+			set_bit(PG_mte_tagged, &page->flags);
+
+		gfn++;
+		tags += num_tags;
+		length -= PAGE_SIZE;
+	}
+
+out:
+	mutex_unlock(&kvm->slots_lock);
+	/* If some data has been copied report the number of bytes copied */
+	if (length != copy_tags->length)
+		return copy_tags->length - length;
+	return ret;
+}
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index d4da58ddcad7..da1edd2b4046 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1429,6 +1429,7 @@ struct kvm_s390_ucas_mapping {
 /* Available with KVM_CAP_PMU_EVENT_FILTER */
 #define KVM_SET_PMU_EVENT_FILTER  _IOW(KVMIO,  0xb2, struct kvm_pmu_event_filter)
 #define KVM_PPC_SVM_OFF		  _IO(KVMIO,  0xb3)
+#define KVM_ARM_MTE_COPY_TAGS	  _IOR(KVMIO,  0xb4, struct kvm_arm_copy_mte_tags)
 
 /* ioctl for vm fd */
 #define KVM_CREATE_DEVICE	  _IOWR(KVMIO,  0xe0, struct kvm_create_device)
-- 
2.20.1



* [PATCH v17 6/6] KVM: arm64: Document MTE capability and ioctl
  2021-06-21 11:17 [PATCH v17 0/6] MTE support for KVM guest Steven Price
                   ` (4 preceding siblings ...)
  2021-06-21 11:17 ` [PATCH v17 5/6] KVM: arm64: ioctl to fetch/store tags in a guest Steven Price
@ 2021-06-21 11:17 ` Steven Price
  2021-06-22  9:42   ` Fuad Tabba
  2021-06-22 14:21 ` [PATCH v17 0/6] MTE support for KVM guest Marc Zyngier
  6 siblings, 1 reply; 23+ messages in thread
From: Steven Price @ 2021-06-21 11:17 UTC (permalink / raw)
  To: Catalin Marinas, Marc Zyngier, Will Deacon
  Cc: Steven Price, James Morse, Julien Thierry, Suzuki K Poulose,
	kvmarm, linux-arm-kernel, linux-kernel, Dave Martin,
	Mark Rutland, Thomas Gleixner, qemu-devel, Juan Quintela,
	Dr. David Alan Gilbert, Richard Henderson, Peter Maydell,
	Andrew Jones

A new capability (KVM_CAP_ARM_MTE) identifies that the kernel supports
granting a guest access to the tags, and provides a mechanism for the
VMM to enable it.

A new ioctl (KVM_ARM_MTE_COPY_TAGS) provides a simple way for a VMM to
access the tags of a guest without having to maintain a PROT_MTE mapping
in userspace. The above capability gates access to the ioctl.
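
(For illustration, the restore side of a migration might look like the
sketch below, mirroring a KVM_ARM_TAGS_FROM_GUEST copy done on the
source; vm_fd and tag_buf are assumptions.)

  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  static int restore_tags(int vm_fd, __u64 ipa, __u64 length, void *tag_buf)
  {
          struct kvm_arm_copy_mte_tags copy = {
                  .guest_ipa = ipa,
                  .length = length,
                  .addr = tag_buf,        /* length/16 tag bytes from the source */
                  .flags = KVM_ARM_TAGS_TO_GUEST,
          };

          return ioctl(vm_fd, KVM_ARM_MTE_COPY_TAGS, &copy);
  }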

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 Documentation/virt/kvm/api.rst | 61 ++++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 7fcb2fd38f42..97661a97943f 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -5034,6 +5034,43 @@ see KVM_XEN_VCPU_SET_ATTR above.
 The KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST type may not be used
 with the KVM_XEN_VCPU_GET_ATTR ioctl.
 
+4.130 KVM_ARM_MTE_COPY_TAGS
+---------------------------
+
+:Capability: KVM_CAP_ARM_MTE
+:Architectures: arm64
+:Type: vm ioctl
+:Parameters: struct kvm_arm_copy_mte_tags
+:Returns: number of bytes copied, < 0 on error (-EINVAL for incorrect
+          arguments, -EFAULT if memory cannot be accessed).
+
+::
+
+  struct kvm_arm_copy_mte_tags {
+	__u64 guest_ipa;
+	__u64 length;
+	void __user *addr;
+	__u64 flags;
+	__u64 reserved[2];
+  };
+
+Copies Memory Tagging Extension (MTE) tags to/from guest tag memory. The
+``guest_ipa`` and ``length`` fields must be ``PAGE_SIZE`` aligned. The ``addr``
+field must point to a buffer which the tags will be copied to or from.
+
+``flags`` specifies the direction of copy, either ``KVM_ARM_TAGS_TO_GUEST`` or
+``KVM_ARM_TAGS_FROM_GUEST``.
+
+The size of the buffer to store the tags is ``(length / 16)`` bytes
+(granules in MTE are 16 bytes long). Each byte contains a single tag
+value. This matches the format of ``PTRACE_PEEKMTETAGS`` and
+``PTRACE_POKEMTETAGS``.
+
+If an error occurs before any data is copied then a negative error code is
+returned. If some tags have been copied before an error occurs then the number
+of bytes successfully copied is returned. If the call completes successfully
+then ``length`` is returned.
+
 5. The kvm_run structure
 ========================
 
@@ -6362,6 +6399,30 @@ default.
 
 See Documentation/x86/sgx/2.Kernel-internals.rst for more details.
 
+7.26 KVM_CAP_ARM_MTE
+--------------------
+
+:Architectures: arm64
+:Parameters: none
+
+This capability indicates that KVM (and the hardware) supports exposing the
+Memory Tagging Extensions (MTE) to the guest. It must also be enabled by the
+VMM before creating any VCPUs to allow the guest access. Note that MTE is only
+available to a guest running in AArch64 mode and enabling this capability will
+cause attempts to create AArch32 VCPUs to fail.
+
+When enabled the guest is able to access tags associated with any memory given
+to the guest. KVM will ensure that the tags are maintained during swap or
+hibernation of the host; however the VMM needs to manually save/restore the
+tags as appropriate if the VM is migrated.
+
+When this capability is enabled all memory in memslots must be mapped as
+not-shareable (no MAP_SHARED); attempts to create a memslot with a
+MAP_SHARED mmap will result in an -EINVAL return.
+
+When enabled the VMM may make use of the ``KVM_ARM_MTE_COPY_TAGS`` ioctl to
+perform a bulk copy of tags to/from the guest.
+
 8. Other capabilities.
 ======================
 
-- 
2.20.1



* Re: [PATCH v17 2/6] KVM: arm64: Introduce MTE VM feature
  2021-06-21 11:17 ` [PATCH v17 2/6] KVM: arm64: Introduce MTE VM feature Steven Price
@ 2021-06-21 17:00   ` Fuad Tabba
  2021-06-22 11:29     ` Marc Zyngier
  0 siblings, 1 reply; 23+ messages in thread
From: Fuad Tabba @ 2021-06-21 17:00 UTC (permalink / raw)
  To: Steven Price
  Cc: Catalin Marinas, Marc Zyngier, Will Deacon,
	Dr. David Alan Gilbert, qemu-devel, Dave Martin, Juan Quintela,
	Richard Henderson, linux-kernel, Thomas Gleixner, kvmarm,
	linux-arm-kernel

Hi,

On Mon, Jun 21, 2021 at 12:18 PM Steven Price <steven.price@arm.com> wrote:
>
> Add a new VM feature 'KVM_CAP_ARM_MTE' which enables memory tagging
> for a VM. This will expose the feature to the guest and automatically
> mark memory pages touched by the VM as PG_mte_tagged (and clear the
> tag storage) to ensure that the guest cannot see stale tags, and so
> that the tags are correctly saved/restored across swap.
>
> Actually exposing the new capability to user space happens in a later
> patch.
>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/include/asm/kvm_emulate.h |  3 ++
>  arch/arm64/include/asm/kvm_host.h    |  3 ++
>  arch/arm64/kvm/hyp/exception.c       |  3 +-
>  arch/arm64/kvm/mmu.c                 | 64 +++++++++++++++++++++++++++-
>  arch/arm64/kvm/sys_regs.c            |  7 +++
>  include/uapi/linux/kvm.h             |  1 +
>  6 files changed, 79 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index 01b9857757f2..fd418955e31e 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -84,6 +84,9 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
>         if (cpus_have_const_cap(ARM64_MISMATCHED_CACHE_TYPE) ||
>             vcpu_el1_is_32bit(vcpu))
>                 vcpu->arch.hcr_el2 |= HCR_TID2;
> +
> +       if (kvm_has_mte(vcpu->kvm))
> +               vcpu->arch.hcr_el2 |= HCR_ATA;
>  }
>
>  static inline unsigned long *vcpu_hcr(struct kvm_vcpu *vcpu)
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 7cd7d5c8c4bc..afaa5333f0e4 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -132,6 +132,8 @@ struct kvm_arch {
>
>         u8 pfr0_csv2;
>         u8 pfr0_csv3;
> +       /* Memory Tagging Extension enabled for the guest */
> +       bool mte_enabled;
>  };

nit: newline before the comment/new member

>
>  struct kvm_vcpu_fault_info {
> @@ -769,6 +771,7 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
>  #define kvm_arm_vcpu_sve_finalized(vcpu) \
>         ((vcpu)->arch.flags & KVM_ARM64_VCPU_SVE_FINALIZED)
>
> +#define kvm_has_mte(kvm) (system_supports_mte() && (kvm)->arch.mte_enabled)
>  #define kvm_vcpu_has_pmu(vcpu)                                 \
>         (test_bit(KVM_ARM_VCPU_PMU_V3, (vcpu)->arch.features))
>
> diff --git a/arch/arm64/kvm/hyp/exception.c b/arch/arm64/kvm/hyp/exception.c
> index 11541b94b328..0418399e0a20 100644
> --- a/arch/arm64/kvm/hyp/exception.c
> +++ b/arch/arm64/kvm/hyp/exception.c
> @@ -112,7 +112,8 @@ static void enter_exception64(struct kvm_vcpu *vcpu, unsigned long target_mode,
>         new |= (old & PSR_C_BIT);
>         new |= (old & PSR_V_BIT);
>
> -       // TODO: TCO (if/when ARMv8.5-MemTag is exposed to guests)
> +       if (kvm_has_mte(vcpu->kvm))
> +               new |= PSR_TCO_BIT;
>
>         new |= (old & PSR_DIT_BIT);
>
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index c10207fed2f3..52326b739357 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -822,6 +822,45 @@ transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
>         return PAGE_SIZE;
>  }
>
> +/*
> + * The page will be mapped in stage 2 as Normal Cacheable, so the VM will be
> + * able to see the page's tags and therefore they must be initialised first. If
> + * PG_mte_tagged is set, tags have already been initialised.
> + *
> + * The race in the test/set of the PG_mte_tagged flag is handled by:
> + * - preventing VM_SHARED mappings in a memslot with MTE, preventing two VMs
> + *   racing to sanitise the same page
> + * - mmap_lock protects between a VM faulting a page in and the VMM performing
> + *   an mprotect() to add VM_MTE
> + */
> +static int sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
> +                            unsigned long size)
> +{
> +       unsigned long i, nr_pages = size >> PAGE_SHIFT;
> +       struct page *page;
> +
> +       if (!kvm_has_mte(kvm))
> +               return 0;
> +
> +       /*
> +        * pfn_to_online_page() is used to reject ZONE_DEVICE pages
> +        * that may not support tags.
> +        */
> +       page = pfn_to_online_page(pfn);
> +
> +       if (!page)
> +               return -EFAULT;
> +
> +       for (i = 0; i < nr_pages; i++, page++) {
> +               if (!test_bit(PG_mte_tagged, &page->flags)) {
> +                       mte_clear_page_tags(page_address(page));
> +                       set_bit(PG_mte_tagged, &page->flags);
> +               }
> +       }
> +
> +       return 0;
> +}
> +
>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>                           struct kvm_memory_slot *memslot, unsigned long hva,
>                           unsigned long fault_status)
> @@ -971,8 +1010,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>         if (writable)
>                 prot |= KVM_PGTABLE_PROT_W;
>
> -       if (fault_status != FSC_PERM && !device)
> +       if (fault_status != FSC_PERM && !device) {
> +               /* Check the VMM hasn't introduced a new VM_SHARED VMA */
> +               if (kvm_has_mte(kvm) && vma->vm_flags & VM_SHARED) {
> +                       ret = -EFAULT;
> +                       goto out_unlock;
> +               }
> +               ret = sanitise_mte_tags(kvm, pfn, vma_pagesize);
> +               if (ret)
> +                       goto out_unlock;
> +

nit: Would it make sense to bring the sanitise_mte_tags() call under
the kvm_has_mte() check? I know that the check is done in
sanitise_mte_tags() as well, but since you're already checking here,
it might make the code a bit clearer.

>                 clean_dcache_guest_page(pfn, vma_pagesize);
> +       }
>
>         if (exec_fault) {
>                 prot |= KVM_PGTABLE_PROT_X;
> @@ -1168,12 +1217,17 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
>  bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
>  {
>         kvm_pfn_t pfn = pte_pfn(range->pte);
> +       int ret;
>
>         if (!kvm->arch.mmu.pgt)
>                 return false;
>
>         WARN_ON(range->end - range->start != 1);
>
> +       ret = sanitise_mte_tags(kvm, pfn, PAGE_SIZE);
> +       if (ret)
> +               return false;
> +
>         /*
>          * We've moved a page around, probably through CoW, so let's treat it
>          * just like a translation fault and clean the cache to the PoC.
> @@ -1381,6 +1435,14 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
>                 if (!vma)
>                         break;
>
> +               /*
> +                * VM_SHARED mappings are not allowed with MTE to avoid races
> +                * when updating the PG_mte_tagged page flag, see
> +                * sanitise_mte_tags for more details.
> +                */
> +               if (kvm_has_mte(kvm) && vma->vm_flags & VM_SHARED)
> +                       return -EINVAL;
> +
>                 /*
>                  * Take the intersection of this VMA with the memory region
>                  */
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 1a7968ad078c..36f67f8deae1 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -1047,6 +1047,13 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu,
>                 break;
>         case SYS_ID_AA64PFR1_EL1:
>                 val &= ~FEATURE(ID_AA64PFR1_MTE);
> +               if (kvm_has_mte(vcpu->kvm)) {
> +                       u64 pfr, mte;
> +
> +                       pfr = read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1);

nit: Why reread the sanitised register? Wouldn't it be clearer to
rework the masking of val and the kvm_has_mte() check?

Cheers,
/fuad



> +                       mte = cpuid_feature_extract_unsigned_field(pfr, ID_AA64PFR1_MTE_SHIFT);
> +                       val |= FIELD_PREP(FEATURE(ID_AA64PFR1_MTE), mte);
> +               }
>                 break;
>         case SYS_ID_AA64ISAR1_EL1:
>                 if (!vcpu_has_ptrauth(vcpu))
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 79d9c44d1ad7..d4da58ddcad7 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1083,6 +1083,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_SGX_ATTRIBUTE 196
>  #define KVM_CAP_VM_COPY_ENC_CONTEXT_FROM 197
>  #define KVM_CAP_PTP_KVM 198
> +#define KVM_CAP_ARM_MTE 199
>
>  #ifdef KVM_CAP_IRQ_ROUTING
>
> --
> 2.20.1
>

* Re: [PATCH v17 4/6] KVM: arm64: Expose KVM_ARM_CAP_MTE
  2021-06-21 11:17 ` [PATCH v17 4/6] KVM: arm64: Expose KVM_ARM_CAP_MTE Steven Price
@ 2021-06-22  8:07   ` Fuad Tabba
  2021-06-22  8:48     ` Marc Zyngier
  0 siblings, 1 reply; 23+ messages in thread
From: Fuad Tabba @ 2021-06-22  8:07 UTC (permalink / raw)
  To: Steven Price
  Cc: Catalin Marinas, Marc Zyngier, Will Deacon,
	Dr. David Alan Gilbert, qemu-devel, Dave Martin, Juan Quintela,
	Richard Henderson, linux-kernel, Thomas Gleixner, kvmarm,
	linux-arm-kernel

Hi,

On Mon, Jun 21, 2021 at 12:18 PM Steven Price <steven.price@arm.com> wrote:
>
> It's now safe for the VMM to enable MTE in a guest, so expose the
> capability to user space.
>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/kvm/arm.c      | 9 +++++++++
>  arch/arm64/kvm/reset.c    | 4 ++++
>  arch/arm64/kvm/sys_regs.c | 3 +++
>  3 files changed, 16 insertions(+)
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index e720148232a0..28ce26a68f09 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -93,6 +93,12 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>                 r = 0;
>                 kvm->arch.return_nisv_io_abort_to_user = true;
>                 break;
> +       case KVM_CAP_ARM_MTE:
> +               if (!system_supports_mte() || kvm->created_vcpus)
> +                       return -EINVAL;
> +               r = 0;
> +               kvm->arch.mte_enabled = true;
> +               break;
>         default:
>                 r = -EINVAL;
>                 break;
> @@ -237,6 +243,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>                  */
>                 r = 1;
>                 break;
> +       case KVM_CAP_ARM_MTE:
> +               r = system_supports_mte();
> +               break;
>         case KVM_CAP_STEAL_TIME:
>                 r = kvm_arm_pvtime_supported();
>                 break;
> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> index d37ebee085cf..9e6922b9503a 100644
> --- a/arch/arm64/kvm/reset.c
> +++ b/arch/arm64/kvm/reset.c
> @@ -244,6 +244,10 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>         switch (vcpu->arch.target) {
>         default:
>                 if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) {
> +                       if (vcpu->kvm->arch.mte_enabled) {
> +                               ret = -EINVAL;
> +                               goto out;
> +                       }
>                         pstate = VCPU_RESET_PSTATE_SVC;
>                 } else {
>                         pstate = VCPU_RESET_PSTATE_EL1;

nit: I was wondering whether this check would be better suited in
kvm_vcpu_set_target, rather than here (kvm_reset_vcpu). kvm_reset_vcpu
is called by kvm_vcpu_set_target, but kvm_vcpu_set_target is where
checking for supported features happens. It might be better to group
all such checks together. I don't think that there is any risk of this
feature being toggled by the other call path to kvm_reset_vcpu (via
check_vcpu_requests).

Cheers,
/fuad

> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 5c75b24eae21..f6f126eb6ac1 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -1312,6 +1312,9 @@ static bool access_ccsidr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
>  static unsigned int mte_visibility(const struct kvm_vcpu *vcpu,
>                                    const struct sys_reg_desc *rd)
>  {
> +       if (kvm_has_mte(vcpu->kvm))
> +               return 0;
> +
>         return REG_HIDDEN;
>  }
>
> --
> 2.20.1
>

* Re: [PATCH v17 4/6] KVM: arm64: Expose KVM_ARM_CAP_MTE
  2021-06-22  8:07   ` Fuad Tabba
@ 2021-06-22  8:48     ` Marc Zyngier
  0 siblings, 0 replies; 23+ messages in thread
From: Marc Zyngier @ 2021-06-22  8:48 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: Steven Price, Catalin Marinas, Will Deacon,
	Dr. David Alan Gilbert, qemu-devel, Dave Martin, Juan Quintela,
	Richard Henderson, linux-kernel, Thomas Gleixner, kvmarm,
	linux-arm-kernel

On Tue, 22 Jun 2021 09:07:51 +0100,
Fuad Tabba <tabba@google.com> wrote:
> 
> Hi,
> 
> On Mon, Jun 21, 2021 at 12:18 PM Steven Price <steven.price@arm.com> wrote:
> >
> > It's now safe for the VMM to enable MTE in a guest, so expose the
> > capability to user space.
> >
> > Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> > Signed-off-by: Steven Price <steven.price@arm.com>
> > ---
> >  arch/arm64/kvm/arm.c      | 9 +++++++++
> >  arch/arm64/kvm/reset.c    | 4 ++++
> >  arch/arm64/kvm/sys_regs.c | 3 +++
> >  3 files changed, 16 insertions(+)
> >
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index e720148232a0..28ce26a68f09 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -93,6 +93,12 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> >                 r = 0;
> >                 kvm->arch.return_nisv_io_abort_to_user = true;
> >                 break;
> > +       case KVM_CAP_ARM_MTE:
> > +               if (!system_supports_mte() || kvm->created_vcpus)
> > +                       return -EINVAL;
> > +               r = 0;
> > +               kvm->arch.mte_enabled = true;
> > +               break;
> >         default:
> >                 r = -EINVAL;
> >                 break;
> > @@ -237,6 +243,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> >                  */
> >                 r = 1;
> >                 break;
> > +       case KVM_CAP_ARM_MTE:
> > +               r = system_supports_mte();
> > +               break;
> >         case KVM_CAP_STEAL_TIME:
> >                 r = kvm_arm_pvtime_supported();
> >                 break;
> > diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> > index d37ebee085cf..9e6922b9503a 100644
> > --- a/arch/arm64/kvm/reset.c
> > +++ b/arch/arm64/kvm/reset.c
> > @@ -244,6 +244,10 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
> >         switch (vcpu->arch.target) {
> >         default:
> >                 if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) {
> > +                       if (vcpu->kvm->arch.mte_enabled) {
> > +                               ret = -EINVAL;
> > +                               goto out;
> > +                       }
> >                         pstate = VCPU_RESET_PSTATE_SVC;
> >                 } else {
> >                         pstate = VCPU_RESET_PSTATE_EL1;
> 
> nit: I was wondering whether this check would be better suited in
> kvm_vcpu_set_target, rather than here (kvm_reset_vcpu). kvm_reset_vcpu
> is called by kvm_vcpu_set_target, but kvm_vcpu_set_target is where
> checking for supported features happens. It might be better to group
> all such checks together. I don't think that there is any risk of this
> feature being toggled by the other call path to kvm_reset_vcpu (via
> check_vcpu_requests).

We already group the 32bit related compatibility checks in
vcpu_allowed_register_width(), and this is where I think this should
move to. I've provisionally added the change below.

	M.

diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 9e6922b9503a..cba7872d69a8 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -176,6 +176,10 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu)
 	if (!cpus_have_const_cap(ARM64_HAS_32BIT_EL1) && is32bit)
 		return false;
 
+	/* MTE is incompatible with AArch32 */
+	if (kvm_has_mte(vcpu->kvm) && is32bit)
+		return false;
+
 	/* Check that the vcpus are either all 32bit or all 64bit */
 	kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
 		if (vcpu_has_feature(tmp, KVM_ARM_VCPU_EL1_32BIT) != is32bit)
@@ -244,10 +248,6 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 	switch (vcpu->arch.target) {
 	default:
 		if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) {
-			if (vcpu->kvm->arch.mte_enabled) {
-				ret = -EINVAL;
-				goto out;
-			}
 			pstate = VCPU_RESET_PSTATE_SVC;
 		} else {
 			pstate = VCPU_RESET_PSTATE_EL1;


-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH v17 5/6] KVM: arm64: ioctl to fetch/store tags in a guest
  2021-06-21 11:17 ` [PATCH v17 5/6] KVM: arm64: ioctl to fetch/store tags in a guest Steven Price
@ 2021-06-22  8:56   ` Fuad Tabba
  2021-06-22 10:25     ` Marc Zyngier
  2021-06-24 13:35   ` Marc Zyngier
  1 sibling, 1 reply; 23+ messages in thread
From: Fuad Tabba @ 2021-06-22  8:56 UTC (permalink / raw)
  To: Steven Price
  Cc: Catalin Marinas, Marc Zyngier, Will Deacon,
	Dr. David Alan Gilbert, qemu-devel, Dave Martin, Juan Quintela,
	Richard Henderson, linux-kernel, Thomas Gleixner, kvmarm,
	linux-arm-kernel

Hi,


On Mon, Jun 21, 2021 at 12:18 PM Steven Price <steven.price@arm.com> wrote:
>
> The VMM may not wish to have its own mapping of guest memory mapped
> with PROT_MTE because this causes problems if the VMM has tag checking
> enabled (the guest controls the tags in physical RAM and it's unlikely
> the tags are correct for the VMM).
>
> Instead add a new ioctl which allows the VMM to easily read/write the
> tags from guest memory, allowing the VMM's mapping to be non-PROT_MTE
> while the VMM can still read/write the tags for the purpose of
> migration.
>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/include/asm/kvm_host.h |  3 ++
>  arch/arm64/include/asm/mte-def.h  |  1 +
>  arch/arm64/include/uapi/asm/kvm.h | 11 +++++
>  arch/arm64/kvm/arm.c              |  7 +++
>  arch/arm64/kvm/guest.c            | 82 +++++++++++++++++++++++++++++++
>  include/uapi/linux/kvm.h          |  1 +
>  6 files changed, 105 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 309e36cc1b42..6a2ac4636d42 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -729,6 +729,9 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
>  int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
>                                struct kvm_device_attr *attr);
>
> +long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
> +                               struct kvm_arm_copy_mte_tags *copy_tags);
> +
>  /* Guest/host FPSIMD coordination helpers */
>  int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
>  void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
> diff --git a/arch/arm64/include/asm/mte-def.h b/arch/arm64/include/asm/mte-def.h
> index cf241b0f0a42..626d359b396e 100644
> --- a/arch/arm64/include/asm/mte-def.h
> +++ b/arch/arm64/include/asm/mte-def.h
> @@ -7,6 +7,7 @@
>
>  #define MTE_GRANULE_SIZE       UL(16)
>  #define MTE_GRANULE_MASK       (~(MTE_GRANULE_SIZE - 1))
> +#define MTE_GRANULES_PER_PAGE  (PAGE_SIZE / MTE_GRANULE_SIZE)
>  #define MTE_TAG_SHIFT          56
>  #define MTE_TAG_SIZE           4
>  #define MTE_TAG_MASK           GENMASK((MTE_TAG_SHIFT + (MTE_TAG_SIZE - 1)), MTE_TAG_SHIFT)
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 24223adae150..b3edde68bc3e 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -184,6 +184,17 @@ struct kvm_vcpu_events {
>         __u32 reserved[12];
>  };
>
> +struct kvm_arm_copy_mte_tags {
> +       __u64 guest_ipa;
> +       __u64 length;
> +       void __user *addr;
> +       __u64 flags;
> +       __u64 reserved[2];
> +};
> +
> +#define KVM_ARM_TAGS_TO_GUEST          0
> +#define KVM_ARM_TAGS_FROM_GUEST                1
> +
>  /* If you need to interpret the index values, here is the key: */
>  #define KVM_REG_ARM_COPROC_MASK                0x000000000FFF0000
>  #define KVM_REG_ARM_COPROC_SHIFT       16
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 28ce26a68f09..511f3716fe33 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1359,6 +1359,13 @@ long kvm_arch_vm_ioctl(struct file *filp,
>
>                 return 0;
>         }
> +       case KVM_ARM_MTE_COPY_TAGS: {
> +               struct kvm_arm_copy_mte_tags copy_tags;
> +
> +               if (copy_from_user(&copy_tags, argp, sizeof(copy_tags)))
> +                       return -EFAULT;
> +               return kvm_vm_ioctl_mte_copy_tags(kvm, &copy_tags);
> +       }
>         default:
>                 return -EINVAL;
>         }
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 5cb4a1cd5603..4ddb20017b2f 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -995,3 +995,85 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
>
>         return ret;
>  }
> +
> +long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
> +                               struct kvm_arm_copy_mte_tags *copy_tags)
> +{
> +       gpa_t guest_ipa = copy_tags->guest_ipa;
> +       size_t length = copy_tags->length;
> +       void __user *tags = copy_tags->addr;
> +       gpa_t gfn;
> +       bool write = !(copy_tags->flags & KVM_ARM_TAGS_FROM_GUEST);
> +       int ret = 0;
> +
> +       if (!kvm_has_mte(kvm))
> +               return -EINVAL;
> +
> +       if (copy_tags->reserved[0] || copy_tags->reserved[1])
> +               return -EINVAL;
> +
> +       if (copy_tags->flags & ~KVM_ARM_TAGS_FROM_GUEST)
> +               return -EINVAL;
> +
> +       if (length & ~PAGE_MASK || guest_ipa & ~PAGE_MASK)
> +               return -EINVAL;
> +
> +       gfn = gpa_to_gfn(guest_ipa);
> +
> +       mutex_lock(&kvm->slots_lock);
> +
> +       while (length > 0) {
> +               kvm_pfn_t pfn = gfn_to_pfn_prot(kvm, gfn, write, NULL);
> +               void *maddr;
> +               unsigned long num_tags;
> +               struct page *page;
> +
> +               if (is_error_noslot_pfn(pfn)) {
> +                       ret = -EFAULT;
> +                       goto out;
> +               }
> +
> +               page = pfn_to_online_page(pfn);
> +               if (!page) {
> +                       /* Reject ZONE_DEVICE memory */
> +                       ret = -EFAULT;
> +                       goto out;
> +               }
> +               maddr = page_address(page);
> +
> +               if (!write) {
> +                       if (test_bit(PG_mte_tagged, &page->flags))
> +                               num_tags = mte_copy_tags_to_user(tags, maddr,
> +                                                       MTE_GRANULES_PER_PAGE);
> +                       else
> +                               /* No tags in memory, so write zeros */
> +                               num_tags = MTE_GRANULES_PER_PAGE -
> +                                       clear_user(tags, MTE_GRANULES_PER_PAGE);
> +                       kvm_release_pfn_clean(pfn);
> +               } else {
> +                       num_tags = mte_copy_tags_from_user(maddr, tags,
> +                                                       MTE_GRANULES_PER_PAGE);
> +                       kvm_release_pfn_dirty(pfn);
> +               }
> +
> +               if (num_tags != MTE_GRANULES_PER_PAGE) {
> +                       ret = -EFAULT;
> +                       goto out;
> +               }
> +
> +               /* Set the flag after checking the write completed fully */
> +               if (write)
> +                       set_bit(PG_mte_tagged, &page->flags);
> +
> +               gfn++;
> +               tags += num_tags;
> +               length -= PAGE_SIZE;
> +       }
> +
> +out:
> +       mutex_unlock(&kvm->slots_lock);
> +       /* If some data has been copied report the number of bytes copied */
> +       if (length != copy_tags->length)
> +               return copy_tags->length - length;

I'm not sure if this is actually an issue, but a couple of comments on
the return value if there is an error after a partial copy has been
done. If mte_copy_tags_to_user or mte_copy_tags_from_user don't return
MTE_GRANULES_PER_PAGE, then the check for num_tags would fail, but
some of the tags would have been copied, which wouldn't be reflected
in length. That said, on a write the tagged bit wouldn't be set, and
on a read the return value would be conservative, but not
incorrect.

Also, even though it is described that way in the documentation
(rather deep in the description though), it might be confusing to
return a non-negative value on an error. The other kvm ioctl I could
find that does something similar, KVM_S390_GET_IRQ_STATE, seems to
always return a -ERROR on error, rather than the number of bytes
copied.

Cheers,
/fuad

> +       return ret;
> +}
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index d4da58ddcad7..da1edd2b4046 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1429,6 +1429,7 @@ struct kvm_s390_ucas_mapping {
>  /* Available with KVM_CAP_PMU_EVENT_FILTER */
>  #define KVM_SET_PMU_EVENT_FILTER  _IOW(KVMIO,  0xb2, struct kvm_pmu_event_filter)
>  #define KVM_PPC_SVM_OFF                  _IO(KVMIO,  0xb3)
> +#define KVM_ARM_MTE_COPY_TAGS    _IOR(KVMIO,  0xb4, struct kvm_arm_copy_mte_tags)
>
>  /* ioctl for vm fd */
>  #define KVM_CREATE_DEVICE        _IOWR(KVMIO,  0xe0, struct kvm_create_device)
> --
> 2.20.1
>
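
As a concrete illustration of the granule arithmetic in the quoted
code: a 4K page holds 4096/16 = 256 granules, so each page of guest
memory needs 256 tag bytes in the user buffer. A minimal sketch of the
VMM side, assuming a 4K page size (the helper name is illustrative):

	#include <stdint.h>
	#include <sys/ioctl.h>
	#include <linux/kvm.h>

	#define TAGS_PER_4K_PAGE	(4096 / 16)	/* 256 tag bytes */

	/* Sketch: read the tags backing one 4K page of guest memory. */
	static long fetch_page_tags(int vm_fd, uint64_t ipa,
				    uint8_t tags[TAGS_PER_4K_PAGE])
	{
		struct kvm_arm_copy_mte_tags ct = {
			.guest_ipa = ipa,	/* must be PAGE_SIZE aligned */
			.length    = 4096,	/* must be PAGE_SIZE aligned */
			.addr      = tags,
			.flags     = KVM_ARM_TAGS_FROM_GUEST,
		};

		/* Returns guest bytes covered (4096 here), < 0 on error. */
		return ioctl(vm_fd, KVM_ARM_MTE_COPY_TAGS, &ct);
	}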

* Re: [PATCH v17 6/6] KVM: arm64: Document MTE capability and ioctl
  2021-06-21 11:17 ` [PATCH v17 6/6] KVM: arm64: Document MTE capability and ioctl Steven Price
@ 2021-06-22  9:42   ` Fuad Tabba
  2021-06-22 10:35     ` Marc Zyngier
  0 siblings, 1 reply; 23+ messages in thread
From: Fuad Tabba @ 2021-06-22  9:42 UTC (permalink / raw)
  To: Steven Price
  Cc: Catalin Marinas, Marc Zyngier, Will Deacon,
	Dr. David Alan Gilbert, qemu-devel, Dave Martin, Juan Quintela,
	Richard Henderson, linux-kernel, Thomas Gleixner, kvmarm,
	linux-arm-kernel

Hi,


On Mon, Jun 21, 2021 at 12:18 PM Steven Price <steven.price@arm.com> wrote:
>
> A new capability (KVM_CAP_ARM_MTE) identifies that the kernel supports
> granting a guest access to the tags, and provides a mechanism for the
> VMM to enable it.
>
> A new ioctl (KVM_ARM_MTE_COPY_TAGS) provides a simple way for a VMM to
> access the tags of a guest without having to maintain a PROT_MTE mapping
> in userspace. The above capability gates access to the ioctl.
>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  Documentation/virt/kvm/api.rst | 61 ++++++++++++++++++++++++++++++++++
>  1 file changed, 61 insertions(+)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 7fcb2fd38f42..97661a97943f 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -5034,6 +5034,43 @@ see KVM_XEN_VCPU_SET_ATTR above.
>  The KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST type may not be used
>  with the KVM_XEN_VCPU_GET_ATTR ioctl.
>
> +4.130 KVM_ARM_MTE_COPY_TAGS
> +---------------------------
> +
> +:Capability: KVM_CAP_ARM_MTE
> +:Architectures: arm64
> +:Type: vm ioctl
> +:Parameters: struct kvm_arm_copy_mte_tags
> +:Returns: number of bytes copied, < 0 on error (-EINVAL for incorrect
> +          arguments, -EFAULT if memory cannot be accessed).
> +
> +::
> +
> +  struct kvm_arm_copy_mte_tags {
> +       __u64 guest_ipa;
> +       __u64 length;
> +       void __user *addr;
> +       __u64 flags;
> +       __u64 reserved[2];
> +  };
> +
> +Copies Memory Tagging Extension (MTE) tags to/from guest tag memory. The
> +``guest_ipa`` and ``length`` fields must be ``PAGE_SIZE`` aligned. The ``addr``
> +field must point to a buffer which the tags will be copied to or from.
> +
> +``flags`` specifies the direction of copy, either ``KVM_ARM_TAGS_TO_GUEST`` or
> +``KVM_ARM_TAGS_FROM_GUEST``.
> +
> +The size of the buffer to store the tags is ``(length / 16)`` bytes
> +(granules in MTE are 16 bytes long). Each byte contains a single tag
> +value. This matches the format of ``PTRACE_PEEKMTETAGS`` and
> +``PTRACE_POKEMTETAGS``.
> +
> +If an error occurs before any data is copied then a negative error code is
> +returned. If some tags have been copied before an error occurs then the number
> +of bytes successfully copied is returned. If the call completes successfully
> +then ``length`` is returned.
> +
>  5. The kvm_run structure
>  ========================
>
> @@ -6362,6 +6399,30 @@ default.
>
>  See Documentation/x86/sgx/2.Kernel-internals.rst for more details.
>
> +7.26 KVM_CAP_ARM_MTE
> +--------------------
> +
> +:Architectures: arm64
> +:Parameters: none
> +
> +This capability indicates that KVM (and the hardware) supports exposing the
> +Memory Tagging Extensions (MTE) to the guest. It must also be enabled by the
> +VMM before creating any VCPUs to allow the guest access. Note that MTE is only
> +available to a guest running in AArch64 mode and enabling this capability will
> +cause attempts to create AArch32 VCPUs to fail.

I was wondering if there might be an issue with AArch32 at EL0 and
MTE, because I think that even if AArch64 at EL1 is disallowed, the
guest can still run AArch32 at EL0.

Thanks,
/fuad

> +
> +When enabled the guest is able to access tags associated with any memory given
> +to the guest. KVM will ensure that the tags are maintained during swap or
> +hibernation of the host; however the VMM needs to manually save/restore the
> +tags as appropriate if the VM is migrated.
> +
> +When this capability is enabled all memory in memslots must be mapped as
> +not-shareable (no MAP_SHARED), attempts to create a memslot with a
> +MAP_SHARED mmap will result in an -EINVAL return.
> +
> +When enabled the VMM may make use of the ``KVM_ARM_MTE_COPY_TAGS`` ioctl to
> +perform a bulk copy of tags to/from the guest.
> +
>  8. Other capabilities.
>  ======================
>
> --
> 2.20.1
>
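
One practical consequence of the MAP_SHARED restriction above is that
guest RAM for an MTE-enabled VM has to come from a private mapping. A
hedged sketch of a conforming memslot backing (slot number, sizes and
names are illustrative):

	#include <stdint.h>
	#include <sys/ioctl.h>
	#include <sys/mman.h>
	#include <linux/kvm.h>

	/* Sketch: back a memslot with a private (non-MAP_SHARED) mapping. */
	static int add_private_memslot(int vm_fd, uint64_t gpa, size_t size)
	{
		void *mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		struct kvm_userspace_memory_region region = {
			.slot            = 0,
			.guest_phys_addr = gpa,
			.memory_size     = size,
			.userspace_addr  = (uintptr_t)mem,
		};

		if (mem == MAP_FAILED)
			return -1;

		/* With KVM_CAP_ARM_MTE enabled, a MAP_SHARED backing here
		 * would make this ioctl fail. */
		return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
	}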

* Re: [PATCH v17 3/6] KVM: arm64: Save/restore MTE registers
  2021-06-21 11:17 ` [PATCH v17 3/6] KVM: arm64: Save/restore MTE registers Steven Price
@ 2021-06-22  9:46   ` Fuad Tabba
  0 siblings, 0 replies; 23+ messages in thread
From: Fuad Tabba @ 2021-06-22  9:46 UTC (permalink / raw)
  To: Steven Price
  Cc: Catalin Marinas, Marc Zyngier, Will Deacon,
	Dr. David Alan Gilbert, qemu-devel, Dave Martin, Juan Quintela,
	Richard Henderson, linux-kernel, Thomas Gleixner, kvmarm,
	linux-arm-kernel

Hi,

> diff --git a/arch/arm64/include/asm/kvm_mte.h b/arch/arm64/include/asm/kvm_mte.h
> new file mode 100644
> index 000000000000..88dd1199670b
> --- /dev/null
> +++ b/arch/arm64/include/asm/kvm_mte.h
> @@ -0,0 +1,66 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2020-2021 ARM Ltd.
> + */
> +#ifndef __ASM_KVM_MTE_H
> +#define __ASM_KVM_MTE_H
> +
> +#ifdef __ASSEMBLY__
> +
> +#include <asm/sysreg.h>
> +
> +#ifdef CONFIG_ARM64_MTE
> +
> +.macro mte_switch_to_guest g_ctxt, h_ctxt, reg1
> +alternative_if_not ARM64_MTE
> +       b       .L__skip_switch\@
> +alternative_else_nop_endif
> +       mrs     \reg1, hcr_el2
> +       tbz     \reg1, #(HCR_ATA_SHIFT), .L__skip_switch\@
> +
> +       mrs_s   \reg1, SYS_RGSR_EL1
> +       str     \reg1, [\h_ctxt, #CPU_RGSR_EL1]
> +       mrs_s   \reg1, SYS_GCR_EL1
> +       str     \reg1, [\h_ctxt, #CPU_GCR_EL1]
> +
> +       ldr     \reg1, [\g_ctxt, #CPU_RGSR_EL1]
> +       msr_s   SYS_RGSR_EL1, \reg1
> +       ldr     \reg1, [\g_ctxt, #CPU_GCR_EL1]
> +       msr_s   SYS_GCR_EL1, \reg1
> +
> +.L__skip_switch\@:
> +.endm
> +
> +.macro mte_switch_to_hyp g_ctxt, h_ctxt, reg1
> +alternative_if_not ARM64_MTE
> +       b       .L__skip_switch\@
> +alternative_else_nop_endif
> +       mrs     \reg1, hcr_el2
> +       tbz     \reg1, #(HCR_ATA_SHIFT), .L__skip_switch\@
> +
> +       mrs_s   \reg1, SYS_RGSR_EL1
> +       str     \reg1, [\g_ctxt, #CPU_RGSR_EL1]
> +       mrs_s   \reg1, SYS_GCR_EL1
> +       str     \reg1, [\g_ctxt, #CPU_GCR_EL1]
> +
> +       ldr     \reg1, [\h_ctxt, #CPU_RGSR_EL1]
> +       msr_s   SYS_RGSR_EL1, \reg1
> +       ldr     \reg1, [\h_ctxt, #CPU_GCR_EL1]
> +       msr_s   SYS_GCR_EL1, \reg1
> +
> +       isb
> +
> +.L__skip_switch\@:
> +.endm
> +
> +#else /* CONFIG_ARM64_MTE */

nit: !CONFIG_ARM64_MTE (clearer and matches the style in kvm_ptrauth.h)
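
i.e., something like this (illustrative):

	#else /* !CONFIG_ARM64_MTE */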

Cheers,
/fuad


> +
> +.macro mte_switch_to_guest g_ctxt, h_ctxt, reg1
> +.endm
> +
> +.macro mte_switch_to_hyp g_ctxt, h_ctxt, reg1
> +.endm
> +
> +#endif /* CONFIG_ARM64_MTE */
> +#endif /* __ASSEMBLY__ */
> +#endif /* __ASM_KVM_MTE_H */
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 65d15700a168..347ccac2341e 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -651,7 +651,8 @@
>
>  #define INIT_SCTLR_EL2_MMU_ON                                          \
>         (SCTLR_ELx_M  | SCTLR_ELx_C | SCTLR_ELx_SA | SCTLR_ELx_I |      \
> -        SCTLR_ELx_IESB | SCTLR_ELx_WXN | ENDIAN_SET_EL2 | SCTLR_EL2_RES1)
> +        SCTLR_ELx_IESB | SCTLR_ELx_WXN | ENDIAN_SET_EL2 |              \
> +        SCTLR_ELx_ITFSB | SCTLR_EL2_RES1)
>
>  #define INIT_SCTLR_EL2_MMU_OFF \
>         (SCTLR_EL2_RES1 | ENDIAN_SET_EL2)
> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> index 0cb34ccb6e73..6f0044cb233e 100644
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -111,6 +111,8 @@ int main(void)
>    DEFINE(VCPU_WORKAROUND_FLAGS,        offsetof(struct kvm_vcpu, arch.workaround_flags));
>    DEFINE(VCPU_HCR_EL2,         offsetof(struct kvm_vcpu, arch.hcr_el2));
>    DEFINE(CPU_USER_PT_REGS,     offsetof(struct kvm_cpu_context, regs));
> +  DEFINE(CPU_RGSR_EL1,         offsetof(struct kvm_cpu_context, sys_regs[RGSR_EL1]));
> +  DEFINE(CPU_GCR_EL1,          offsetof(struct kvm_cpu_context, sys_regs[GCR_EL1]));
>    DEFINE(CPU_APIAKEYLO_EL1,    offsetof(struct kvm_cpu_context, sys_regs[APIAKEYLO_EL1]));
>    DEFINE(CPU_APIBKEYLO_EL1,    offsetof(struct kvm_cpu_context, sys_regs[APIBKEYLO_EL1]));
>    DEFINE(CPU_APDAKEYLO_EL1,    offsetof(struct kvm_cpu_context, sys_regs[APDAKEYLO_EL1]));
> diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
> index e831d3dfd50d..435346ea1504 100644
> --- a/arch/arm64/kvm/hyp/entry.S
> +++ b/arch/arm64/kvm/hyp/entry.S
> @@ -13,6 +13,7 @@
>  #include <asm/kvm_arm.h>
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_mmu.h>
> +#include <asm/kvm_mte.h>
>  #include <asm/kvm_ptrauth.h>
>
>         .text
> @@ -51,6 +52,9 @@ alternative_else_nop_endif
>
>         add     x29, x0, #VCPU_CONTEXT
>
> +       // mte_switch_to_guest(g_ctxt, h_ctxt, tmp1)
> +       mte_switch_to_guest x29, x1, x2
> +
>         // Macro ptrauth_switch_to_guest format:
>         //      ptrauth_switch_to_guest(guest cxt, tmp1, tmp2, tmp3)
>         // The below macro to restore guest keys is not implemented in C code
> @@ -142,6 +146,9 @@ SYM_INNER_LABEL(__guest_exit, SYM_L_GLOBAL)
>         // when this feature is enabled for kernel code.
>         ptrauth_switch_to_hyp x1, x2, x3, x4, x5
>
> +       // mte_switch_to_hyp(g_ctxt, h_ctxt, reg1)
> +       mte_switch_to_hyp x1, x2, x3
> +
>         // Restore hyp's sp_el0
>         restore_sp_el0 x2, x3
>
> diff --git a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
> index cce43bfe158f..de7e14c862e6 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
> @@ -14,6 +14,7 @@
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_emulate.h>
>  #include <asm/kvm_hyp.h>
> +#include <asm/kvm_mmu.h>
>
>  static inline void __sysreg_save_common_state(struct kvm_cpu_context *ctxt)
>  {
> @@ -26,6 +27,16 @@ static inline void __sysreg_save_user_state(struct kvm_cpu_context *ctxt)
>         ctxt_sys_reg(ctxt, TPIDRRO_EL0) = read_sysreg(tpidrro_el0);
>  }
>
> +static inline bool ctxt_has_mte(struct kvm_cpu_context *ctxt)
> +{
> +       struct kvm_vcpu *vcpu = ctxt->__hyp_running_vcpu;
> +
> +       if (!vcpu)
> +               vcpu = container_of(ctxt, struct kvm_vcpu, arch.ctxt);
> +
> +       return kvm_has_mte(kern_hyp_va(vcpu->kvm));
> +}
> +
>  static inline void __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
>  {
>         ctxt_sys_reg(ctxt, CSSELR_EL1)  = read_sysreg(csselr_el1);
> @@ -46,6 +57,11 @@ static inline void __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
>         ctxt_sys_reg(ctxt, PAR_EL1)     = read_sysreg_par();
>         ctxt_sys_reg(ctxt, TPIDR_EL1)   = read_sysreg(tpidr_el1);
>
> +       if (ctxt_has_mte(ctxt)) {
> +               ctxt_sys_reg(ctxt, TFSR_EL1) = read_sysreg_el1(SYS_TFSR);
> +               ctxt_sys_reg(ctxt, TFSRE0_EL1) = read_sysreg_s(SYS_TFSRE0_EL1);
> +       }
> +
>         ctxt_sys_reg(ctxt, SP_EL1)      = read_sysreg(sp_el1);
>         ctxt_sys_reg(ctxt, ELR_EL1)     = read_sysreg_el1(SYS_ELR);
>         ctxt_sys_reg(ctxt, SPSR_EL1)    = read_sysreg_el1(SYS_SPSR);
> @@ -107,6 +123,11 @@ static inline void __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
>         write_sysreg(ctxt_sys_reg(ctxt, PAR_EL1),       par_el1);
>         write_sysreg(ctxt_sys_reg(ctxt, TPIDR_EL1),     tpidr_el1);
>
> +       if (ctxt_has_mte(ctxt)) {
> +               write_sysreg_el1(ctxt_sys_reg(ctxt, TFSR_EL1), SYS_TFSR);
> +               write_sysreg_s(ctxt_sys_reg(ctxt, TFSRE0_EL1), SYS_TFSRE0_EL1);
> +       }
> +
>         if (!has_vhe() &&
>             cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT) &&
>             ctxt->__hyp_running_vcpu) {
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 36f67f8deae1..5c75b24eae21 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -1309,6 +1309,20 @@ static bool access_ccsidr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
>         return true;
>  }
>
> +static unsigned int mte_visibility(const struct kvm_vcpu *vcpu,
> +                                  const struct sys_reg_desc *rd)
> +{
> +       return REG_HIDDEN;
> +}
> +
> +#define MTE_REG(name) {                                \
> +       SYS_DESC(SYS_##name),                   \
> +       .access = undef_access,                 \
> +       .reset = reset_unknown,                 \
> +       .reg = name,                            \
> +       .visibility = mte_visibility,           \
> +}
> +
>  /* sys_reg_desc initialiser for known cpufeature ID registers */
>  #define ID_SANITISED(name) {                   \
>         SYS_DESC(SYS_##name),                   \
> @@ -1477,8 +1491,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>         { SYS_DESC(SYS_ACTLR_EL1), access_actlr, reset_actlr, ACTLR_EL1 },
>         { SYS_DESC(SYS_CPACR_EL1), NULL, reset_val, CPACR_EL1, 0 },
>
> -       { SYS_DESC(SYS_RGSR_EL1), undef_access },
> -       { SYS_DESC(SYS_GCR_EL1), undef_access },
> +       MTE_REG(RGSR_EL1),
> +       MTE_REG(GCR_EL1),
>
>         { SYS_DESC(SYS_ZCR_EL1), NULL, reset_val, ZCR_EL1, 0, .visibility = sve_visibility },
>         { SYS_DESC(SYS_TRFCR_EL1), undef_access },
> @@ -1505,8 +1519,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>         { SYS_DESC(SYS_ERXMISC0_EL1), trap_raz_wi },
>         { SYS_DESC(SYS_ERXMISC1_EL1), trap_raz_wi },
>
> -       { SYS_DESC(SYS_TFSR_EL1), undef_access },
> -       { SYS_DESC(SYS_TFSRE0_EL1), undef_access },
> +       MTE_REG(TFSR_EL1),
> +       MTE_REG(TFSRE0_EL1),
>
>         { SYS_DESC(SYS_FAR_EL1), access_vm_reg, reset_unknown, FAR_EL1 },
>         { SYS_DESC(SYS_PAR_EL1), NULL, reset_unknown, PAR_EL1 },
> --
> 2.20.1
>

* Re: [PATCH v17 5/6] KVM: arm64: ioctl to fetch/store tags in a guest
  2021-06-22  8:56   ` Fuad Tabba
@ 2021-06-22 10:25     ` Marc Zyngier
  2021-06-22 10:56       ` Fuad Tabba
  0 siblings, 1 reply; 23+ messages in thread
From: Marc Zyngier @ 2021-06-22 10:25 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: Steven Price, Catalin Marinas, Will Deacon,
	Dr. David Alan Gilbert, qemu-devel, Dave Martin, Juan Quintela,
	Richard Henderson, linux-kernel, Thomas Gleixner, kvmarm,
	linux-arm-kernel

Hi Fuad,

On Tue, 22 Jun 2021 09:56:22 +0100,
Fuad Tabba <tabba@google.com> wrote:
> 
> Hi,
> 
> 
> On Mon, Jun 21, 2021 at 12:18 PM Steven Price <steven.price@arm.com> wrote:
> >
> > The VMM may not wish to have its own mapping of guest memory mapped
> > with PROT_MTE because this causes problems if the VMM has tag checking
> > enabled (the guest controls the tags in physical RAM and it's unlikely
> > the tags are correct for the VMM).
> >
> > Instead add a new ioctl which allows the VMM to easily read/write the
> > tags from guest memory, allowing the VMM's mapping to be non-PROT_MTE
> > while the VMM can still read/write the tags for the purpose of
> > migration.
> >
> > Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> > Signed-off-by: Steven Price <steven.price@arm.com>
> > ---
> >  arch/arm64/include/asm/kvm_host.h |  3 ++
> >  arch/arm64/include/asm/mte-def.h  |  1 +
> >  arch/arm64/include/uapi/asm/kvm.h | 11 +++++
> >  arch/arm64/kvm/arm.c              |  7 +++
> >  arch/arm64/kvm/guest.c            | 82 +++++++++++++++++++++++++++++++
> >  include/uapi/linux/kvm.h          |  1 +
> >  6 files changed, 105 insertions(+)
> >
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 309e36cc1b42..6a2ac4636d42 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -729,6 +729,9 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
> >  int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
> >                                struct kvm_device_attr *attr);
> >
> > +long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
> > +                               struct kvm_arm_copy_mte_tags *copy_tags);
> > +
> >  /* Guest/host FPSIMD coordination helpers */
> >  int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
> >  void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
> > diff --git a/arch/arm64/include/asm/mte-def.h b/arch/arm64/include/asm/mte-def.h
> > index cf241b0f0a42..626d359b396e 100644
> > --- a/arch/arm64/include/asm/mte-def.h
> > +++ b/arch/arm64/include/asm/mte-def.h
> > @@ -7,6 +7,7 @@
> >
> >  #define MTE_GRANULE_SIZE       UL(16)
> >  #define MTE_GRANULE_MASK       (~(MTE_GRANULE_SIZE - 1))
> > +#define MTE_GRANULES_PER_PAGE  (PAGE_SIZE / MTE_GRANULE_SIZE)
> >  #define MTE_TAG_SHIFT          56
> >  #define MTE_TAG_SIZE           4
> >  #define MTE_TAG_MASK           GENMASK((MTE_TAG_SHIFT + (MTE_TAG_SIZE - 1)), MTE_TAG_SHIFT)
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > index 24223adae150..b3edde68bc3e 100644
> > --- a/arch/arm64/include/uapi/asm/kvm.h
> > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > @@ -184,6 +184,17 @@ struct kvm_vcpu_events {
> >         __u32 reserved[12];
> >  };
> >
> > +struct kvm_arm_copy_mte_tags {
> > +       __u64 guest_ipa;
> > +       __u64 length;
> > +       void __user *addr;
> > +       __u64 flags;
> > +       __u64 reserved[2];
> > +};
> > +
> > +#define KVM_ARM_TAGS_TO_GUEST          0
> > +#define KVM_ARM_TAGS_FROM_GUEST                1
> > +
> >  /* If you need to interpret the index values, here is the key: */
> >  #define KVM_REG_ARM_COPROC_MASK                0x000000000FFF0000
> >  #define KVM_REG_ARM_COPROC_SHIFT       16
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index 28ce26a68f09..511f3716fe33 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -1359,6 +1359,13 @@ long kvm_arch_vm_ioctl(struct file *filp,
> >
> >                 return 0;
> >         }
> > +       case KVM_ARM_MTE_COPY_TAGS: {
> > +               struct kvm_arm_copy_mte_tags copy_tags;
> > +
> > +               if (copy_from_user(&copy_tags, argp, sizeof(copy_tags)))
> > +                       return -EFAULT;
> > +               return kvm_vm_ioctl_mte_copy_tags(kvm, &copy_tags);
> > +       }
> >         default:
> >                 return -EINVAL;
> >         }
> > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > index 5cb4a1cd5603..4ddb20017b2f 100644
> > --- a/arch/arm64/kvm/guest.c
> > +++ b/arch/arm64/kvm/guest.c
> > @@ -995,3 +995,85 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
> >
> >         return ret;
> >  }
> > +
> > +long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
> > +                               struct kvm_arm_copy_mte_tags *copy_tags)
> > +{
> > +       gpa_t guest_ipa = copy_tags->guest_ipa;
> > +       size_t length = copy_tags->length;
> > +       void __user *tags = copy_tags->addr;
> > +       gpa_t gfn;
> > +       bool write = !(copy_tags->flags & KVM_ARM_TAGS_FROM_GUEST);
> > +       int ret = 0;
> > +
> > +       if (!kvm_has_mte(kvm))
> > +               return -EINVAL;
> > +
> > +       if (copy_tags->reserved[0] || copy_tags->reserved[1])
> > +               return -EINVAL;
> > +
> > +       if (copy_tags->flags & ~KVM_ARM_TAGS_FROM_GUEST)
> > +               return -EINVAL;
> > +
> > +       if (length & ~PAGE_MASK || guest_ipa & ~PAGE_MASK)
> > +               return -EINVAL;
> > +
> > +       gfn = gpa_to_gfn(guest_ipa);
> > +
> > +       mutex_lock(&kvm->slots_lock);
> > +
> > +       while (length > 0) {
> > +               kvm_pfn_t pfn = gfn_to_pfn_prot(kvm, gfn, write, NULL);
> > +               void *maddr;
> > +               unsigned long num_tags;
> > +               struct page *page;
> > +
> > +               if (is_error_noslot_pfn(pfn)) {
> > +                       ret = -EFAULT;
> > +                       goto out;
> > +               }
> > +
> > +               page = pfn_to_online_page(pfn);
> > +               if (!page) {
> > +                       /* Reject ZONE_DEVICE memory */
> > +                       ret = -EFAULT;
> > +                       goto out;
> > +               }
> > +               maddr = page_address(page);
> > +
> > +               if (!write) {
> > +                       if (test_bit(PG_mte_tagged, &page->flags))
> > +                               num_tags = mte_copy_tags_to_user(tags, maddr,
> > +                                                       MTE_GRANULES_PER_PAGE);
> > +                       else
> > +                               /* No tags in memory, so write zeros */
> > +                               num_tags = MTE_GRANULES_PER_PAGE -
> > +                                       clear_user(tags, MTE_GRANULES_PER_PAGE);
> > +                       kvm_release_pfn_clean(pfn);
> > +               } else {
> > +                       num_tags = mte_copy_tags_from_user(maddr, tags,
> > +                                                       MTE_GRANULES_PER_PAGE);
> > +                       kvm_release_pfn_dirty(pfn);
> > +               }
> > +
> > +               if (num_tags != MTE_GRANULES_PER_PAGE) {
> > +                       ret = -EFAULT;
> > +                       goto out;
> > +               }
> > +
> > +               /* Set the flag after checking the write completed fully */
> > +               if (write)
> > +                       set_bit(PG_mte_tagged, &page->flags);
> > +
> > +               gfn++;
> > +               tags += num_tags;
> > +               length -= PAGE_SIZE;
> > +       }
> > +
> > +out:
> > +       mutex_unlock(&kvm->slots_lock);
> > +       /* If some data has been copied report the number of bytes copied */
> > +       if (length != copy_tags->length)
> > +               return copy_tags->length - length;
> 
> I'm not sure if this is actually an issue, but a couple of comments on
> the return value if there is an error after a partial copy has been
> done. If mte_copy_tags_to_user or mte_copy_tags_from_user don't return
> MTE_GRANULES_PER_PAGE, then the check for num_tags would fail, but
> some of the tags would have been copied, which wouldn't be reflected
> in length. That said, on a write the tagged bit wouldn't be set, and
> on a read the return value would be conservative, but not
> incorrect.
>
> Also, even though it is described that way in the documentation
> (rather deep in the description though), it might be confusing to
> return a non-negative value on an error. The other kvm ioctl I could
> find that does something similar, KVM_S390_GET_IRQ_STATE, seems to
> always return a -ERROR on error, rather than the number of bytes
> copied.

My mental analogy for this ioctl is the read()/write() syscalls, which
return the number of bytes that have been transferred in either
direction.

I agree that there are some corner cases (a tag copy that fails
because of a faulty page adjacent to a valid page will still report
some degree of success), but it is also important to report what has
actually been done in either direction.
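
Concretely, that leaves the caller with three cases to distinguish; a
sketch of the interpretation (the surrounding plumbing is illustrative):

	/* Sketch: classify the result of KVM_ARM_MTE_COPY_TAGS. The
	 * return value counts bytes of guest memory covered; the
	 * matching number of tag bytes is that value divided by 16. */
	long n = ioctl(vm_fd, KVM_ARM_MTE_COPY_TAGS, &ct);

	if (n < 0) {
		/* Error before anything was copied (-EINVAL/-EFAULT). */
	} else if ((__u64)n < ct.length) {
		/* Fault part way through: tags for the first n bytes of
		 * guest memory (n / 16 tag bytes) were transferred. */
	} else {
		/* Full success: n == ct.length. */
	}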

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH v17 6/6] KVM: arm64: Document MTE capability and ioctl
  2021-06-22  9:42   ` Fuad Tabba
@ 2021-06-22 10:35     ` Marc Zyngier
  2021-06-22 10:41       ` Fuad Tabba
  0 siblings, 1 reply; 23+ messages in thread
From: Marc Zyngier @ 2021-06-22 10:35 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: Steven Price, Catalin Marinas, Will Deacon,
	Dr. David Alan Gilbert, qemu-devel, Dave Martin, Juan Quintela,
	Richard Henderson, linux-kernel, Thomas Gleixner, kvmarm,
	linux-arm-kernel

On Tue, 22 Jun 2021 10:42:42 +0100,
Fuad Tabba <tabba@google.com> wrote:
> 
> Hi,
> 
> 
> On Mon, Jun 21, 2021 at 12:18 PM Steven Price <steven.price@arm.com> wrote:
> >
> > A new capability (KVM_CAP_ARM_MTE) identifies that the kernel supports
> > granting a guest access to the tags, and provides a mechanism for the
> > VMM to enable it.
> >
> > A new ioctl (KVM_ARM_MTE_COPY_TAGS) provides a simple way for a VMM to
> > access the tags of a guest without having to maintain a PROT_MTE mapping
> > in userspace. The above capability gates access to the ioctl.
> >
> > Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> > Signed-off-by: Steven Price <steven.price@arm.com>
> > ---
> >  Documentation/virt/kvm/api.rst | 61 ++++++++++++++++++++++++++++++++++
> >  1 file changed, 61 insertions(+)
> >
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index 7fcb2fd38f42..97661a97943f 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -5034,6 +5034,43 @@ see KVM_XEN_VCPU_SET_ATTR above.
> >  The KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST type may not be used
> >  with the KVM_XEN_VCPU_GET_ATTR ioctl.
> >
> > +4.130 KVM_ARM_MTE_COPY_TAGS
> > +---------------------------
> > +
> > +:Capability: KVM_CAP_ARM_MTE
> > +:Architectures: arm64
> > +:Type: vm ioctl
> > +:Parameters: struct kvm_arm_copy_mte_tags
> > +:Returns: number of bytes copied, < 0 on error (-EINVAL for incorrect
> > +          arguments, -EFAULT if memory cannot be accessed).
> > +
> > +::
> > +
> > +  struct kvm_arm_copy_mte_tags {
> > +       __u64 guest_ipa;
> > +       __u64 length;
> > +       void __user *addr;
> > +       __u64 flags;
> > +       __u64 reserved[2];
> > +  };
> > +
> > +Copies Memory Tagging Extension (MTE) tags to/from guest tag memory. The
> > +``guest_ipa`` and ``length`` fields must be ``PAGE_SIZE`` aligned. The ``addr``
> > +field must point to a buffer which the tags will be copied to or from.
> > +
> > +``flags`` specifies the direction of copy, either ``KVM_ARM_TAGS_TO_GUEST`` or
> > +``KVM_ARM_TAGS_FROM_GUEST``.
> > +
> > +The size of the buffer to store the tags is ``(length / 16)`` bytes
> > +(granules in MTE are 16 bytes long). Each byte contains a single tag
> > +value. This matches the format of ``PTRACE_PEEKMTETAGS`` and
> > +``PTRACE_POKEMTETAGS``.
> > +
> > +If an error occurs before any data is copied then a negative error code is
> > +returned. If some tags have been copied before an error occurs then the number
> > +of bytes successfully copied is returned. If the call completes successfully
> > +then ``length`` is returned.
> > +
> >  5. The kvm_run structure
> >  ========================
> >
> > @@ -6362,6 +6399,30 @@ default.
> >
> >  See Documentation/x86/sgx/2.Kernel-internals.rst for more details.
> >
> > +7.26 KVM_CAP_ARM_MTE
> > +--------------------
> > +
> > +:Architectures: arm64
> > +:Parameters: none
> > +
> > +This capability indicates that KVM (and the hardware) supports exposing the
> > +Memory Tagging Extensions (MTE) to the guest. It must also be enabled by the
> > +VMM before creating any VCPUs to allow the guest access. Note that MTE is only
> > +available to a guest running in AArch64 mode and enabling this capability will
> > +cause attempts to create AArch32 VCPUs to fail.
> 
> I was wondering if there might be an issue with AArch32 at EL0 and
> MTE, because I think that even if AArch64 at EL1 is disallowed, the

Did you mean AArch32 here?

> guest can still run AArch32 at EL0.

I don't get your question:

- If the guest is AArch32 at EL1, there is no MTE whatsoever (where
  would you place the tag?)

- If the guest is AArch64, it can have MTE enabled or not,
  irrespective of the EL. If this guest decides to run an AArch32 EL0,
  the architecture rules still apply, and it cannot expose MTE to its
  own 32bit userspace. Nothing that KVM needs to do about this.

What KVM enforces is that at the point where the guest is in charge,
we have a consistent architectural behaviour.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH v17 6/6] KVM: arm64: Document MTE capability and ioctl
  2021-06-22 10:35     ` Marc Zyngier
@ 2021-06-22 10:41       ` Fuad Tabba
  0 siblings, 0 replies; 23+ messages in thread
From: Fuad Tabba @ 2021-06-22 10:41 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Steven Price, Catalin Marinas, Will Deacon,
	Dr. David Alan Gilbert, qemu-devel, Dave Martin, Juan Quintela,
	Richard Henderson, linux-kernel, Thomas Gleixner, kvmarm,
	linux-arm-kernel

Hi Marc,

On Tue, Jun 22, 2021 at 11:35 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Tue, 22 Jun 2021 10:42:42 +0100,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > Hi,
> >
> >
> > On Mon, Jun 21, 2021 at 12:18 PM Steven Price <steven.price@arm.com> wrote:
> > >
> > > A new capability (KVM_CAP_ARM_MTE) identifies that the kernel supports
> > > granting a guest access to the tags, and provides a mechanism for the
> > > VMM to enable it.
> > >
> > > A new ioctl (KVM_ARM_MTE_COPY_TAGS) provides a simple way for a VMM to
> > > access the tags of a guest without having to maintain a PROT_MTE mapping
> > > in userspace. The above capability gates access to the ioctl.
> > >
> > > Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> > > Signed-off-by: Steven Price <steven.price@arm.com>
> > > ---
> > >  Documentation/virt/kvm/api.rst | 61 ++++++++++++++++++++++++++++++++++
> > >  1 file changed, 61 insertions(+)
> > >
> > > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > > index 7fcb2fd38f42..97661a97943f 100644
> > > --- a/Documentation/virt/kvm/api.rst
> > > +++ b/Documentation/virt/kvm/api.rst
> > > @@ -5034,6 +5034,43 @@ see KVM_XEN_VCPU_SET_ATTR above.
> > >  The KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST type may not be used
> > >  with the KVM_XEN_VCPU_GET_ATTR ioctl.
> > >
> > > +4.130 KVM_ARM_MTE_COPY_TAGS
> > > +---------------------------
> > > +
> > > +:Capability: KVM_CAP_ARM_MTE
> > > +:Architectures: arm64
> > > +:Type: vm ioctl
> > > +:Parameters: struct kvm_arm_copy_mte_tags
> > > +:Returns: number of bytes copied, < 0 on error (-EINVAL for incorrect
> > > +          arguments, -EFAULT if memory cannot be accessed).
> > > +
> > > +::
> > > +
> > > +  struct kvm_arm_copy_mte_tags {
> > > +       __u64 guest_ipa;
> > > +       __u64 length;
> > > +       void __user *addr;
> > > +       __u64 flags;
> > > +       __u64 reserved[2];
> > > +  };
> > > +
> > > +Copies Memory Tagging Extension (MTE) tags to/from guest tag memory. The
> > > +``guest_ipa`` and ``length`` fields must be ``PAGE_SIZE`` aligned. The ``addr``
> > > +field must point to a buffer which the tags will be copied to or from.
> > > +
> > > +``flags`` specifies the direction of copy, either ``KVM_ARM_TAGS_TO_GUEST`` or
> > > +``KVM_ARM_TAGS_FROM_GUEST``.
> > > +
> > > +The size of the buffer to store the tags is ``(length / 16)`` bytes
> > > +(granules in MTE are 16 bytes long). Each byte contains a single tag
> > > +value. This matches the format of ``PTRACE_PEEKMTETAGS`` and
> > > +``PTRACE_POKEMTETAGS``.
> > > +
> > > +If an error occurs before any data is copied then a negative error code is
> > > +returned. If some tags have been copied before an error occurs then the number
> > > +of bytes successfully copied is returned. If the call completes successfully
> > > +then ``length`` is returned.
> > > +
> > >  5. The kvm_run structure
> > >  ========================
> > >
> > > @@ -6362,6 +6399,30 @@ default.
> > >
> > >  See Documentation/x86/sgx/2.Kernel-internals.rst for more details.
> > >
> > > +7.26 KVM_CAP_ARM_MTE
> > > +--------------------
> > > +
> > > +:Architectures: arm64
> > > +:Parameters: none
> > > +
> > > +This capability indicates that KVM (and the hardware) supports exposing the
> > > +Memory Tagging Extensions (MTE) to the guest. It must also be enabled by the
> > > +VMM before creating any VCPUs to allow the guest access. Note that MTE is only
> > > +available to a guest running in AArch64 mode and enabling this capability will
> > > +cause attempts to create AArch32 VCPUs to fail.
> >
> > I was wondering if there might be an issue with AArch32 at EL0 and
> > MTE, because I think that even if AArch64 at EL1 is disallowed, the
>
> Did you mean AArch32 here?

Yes.

> > guest can still run AArch32 at EL0.
>
> I don't get your question:
>
> - If the guest is AArch32 at EL1, there is no MTE whatsoever (where
>   would you place the tag?)
>
> - If the guest is AArch64, it can have MTE enabled or not,
>   irrespective of the EL. If this guest decides to run an AArch32 EL0,
>   the architecture rules still apply, and it cannot expose MTE to its
>   own 32bit userspace. Nothing that KVM needs to do about this.
>
> What KVM enforces is that at the point where the guest is in charge,
> we have a consistent architectural behaviour.

This answers my question. I was wondering whether we should be
concerned with the case where the guest decides to run an AArch32 EL0.

Thanks,
/fuad

>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.


* Re: [PATCH v17 5/6] KVM: arm64: ioctl to fetch/store tags in a guest
  2021-06-22 10:25     ` Marc Zyngier
@ 2021-06-22 10:56       ` Fuad Tabba
  2021-06-23 14:07         ` Steven Price
  0 siblings, 1 reply; 23+ messages in thread
From: Fuad Tabba @ 2021-06-22 10:56 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Steven Price, Catalin Marinas, Will Deacon,
	Dr. David Alan Gilbert, qemu-devel, Dave Martin, Juan Quintela,
	Richard Henderson, linux-kernel, Thomas Gleixner, kvmarm,
	linux-arm-kernel

Hi Marc,

On Tue, Jun 22, 2021 at 11:25 AM Marc Zyngier <maz@kernel.org> wrote:
>
> Hi Fuad,
>
> On Tue, 22 Jun 2021 09:56:22 +0100,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > Hi,
> >
> >
> > On Mon, Jun 21, 2021 at 12:18 PM Steven Price <steven.price@arm.com> wrote:
> > >
> > > The VMM may not wish to have its own mapping of guest memory mapped
> > > with PROT_MTE because this causes problems if the VMM has tag checking
> > > enabled (the guest controls the tags in physical RAM and it's unlikely
> > > the tags are correct for the VMM).
> > >
> > > Instead add a new ioctl which allows the VMM to easily read/write the
> > > tags from guest memory, allowing the VMM's mapping to be non-PROT_MTE
> > > while the VMM can still read/write the tags for the purpose of
> > > migration.
> > >
> > > Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> > > Signed-off-by: Steven Price <steven.price@arm.com>
> > > ---
> > >  arch/arm64/include/asm/kvm_host.h |  3 ++
> > >  arch/arm64/include/asm/mte-def.h  |  1 +
> > >  arch/arm64/include/uapi/asm/kvm.h | 11 +++++
> > >  arch/arm64/kvm/arm.c              |  7 +++
> > >  arch/arm64/kvm/guest.c            | 82 +++++++++++++++++++++++++++++++
> > >  include/uapi/linux/kvm.h          |  1 +
> > >  6 files changed, 105 insertions(+)
> > >
> > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > > index 309e36cc1b42..6a2ac4636d42 100644
> > > --- a/arch/arm64/include/asm/kvm_host.h
> > > +++ b/arch/arm64/include/asm/kvm_host.h
> > > @@ -729,6 +729,9 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
> > >  int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
> > >                                struct kvm_device_attr *attr);
> > >
> > > +long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
> > > +                               struct kvm_arm_copy_mte_tags *copy_tags);
> > > +
> > >  /* Guest/host FPSIMD coordination helpers */
> > >  int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
> > >  void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
> > > diff --git a/arch/arm64/include/asm/mte-def.h b/arch/arm64/include/asm/mte-def.h
> > > index cf241b0f0a42..626d359b396e 100644
> > > --- a/arch/arm64/include/asm/mte-def.h
> > > +++ b/arch/arm64/include/asm/mte-def.h
> > > @@ -7,6 +7,7 @@
> > >
> > >  #define MTE_GRANULE_SIZE       UL(16)
> > >  #define MTE_GRANULE_MASK       (~(MTE_GRANULE_SIZE - 1))
> > > +#define MTE_GRANULES_PER_PAGE  (PAGE_SIZE / MTE_GRANULE_SIZE)
> > >  #define MTE_TAG_SHIFT          56
> > >  #define MTE_TAG_SIZE           4
> > >  #define MTE_TAG_MASK           GENMASK((MTE_TAG_SHIFT + (MTE_TAG_SIZE - 1)), MTE_TAG_SHIFT)
> > > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > > index 24223adae150..b3edde68bc3e 100644
> > > --- a/arch/arm64/include/uapi/asm/kvm.h
> > > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > > @@ -184,6 +184,17 @@ struct kvm_vcpu_events {
> > >         __u32 reserved[12];
> > >  };
> > >
> > > +struct kvm_arm_copy_mte_tags {
> > > +       __u64 guest_ipa;
> > > +       __u64 length;
> > > +       void __user *addr;
> > > +       __u64 flags;
> > > +       __u64 reserved[2];
> > > +};
> > > +
> > > +#define KVM_ARM_TAGS_TO_GUEST          0
> > > +#define KVM_ARM_TAGS_FROM_GUEST                1
> > > +
> > >  /* If you need to interpret the index values, here is the key: */
> > >  #define KVM_REG_ARM_COPROC_MASK                0x000000000FFF0000
> > >  #define KVM_REG_ARM_COPROC_SHIFT       16
> > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > > index 28ce26a68f09..511f3716fe33 100644
> > > --- a/arch/arm64/kvm/arm.c
> > > +++ b/arch/arm64/kvm/arm.c
> > > @@ -1359,6 +1359,13 @@ long kvm_arch_vm_ioctl(struct file *filp,
> > >
> > >                 return 0;
> > >         }
> > > +       case KVM_ARM_MTE_COPY_TAGS: {
> > > +               struct kvm_arm_copy_mte_tags copy_tags;
> > > +
> > > +               if (copy_from_user(&copy_tags, argp, sizeof(copy_tags)))
> > > +                       return -EFAULT;
> > > +               return kvm_vm_ioctl_mte_copy_tags(kvm, &copy_tags);
> > > +       }
> > >         default:
> > >                 return -EINVAL;
> > >         }
> > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > > index 5cb4a1cd5603..4ddb20017b2f 100644
> > > --- a/arch/arm64/kvm/guest.c
> > > +++ b/arch/arm64/kvm/guest.c
> > > @@ -995,3 +995,85 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
> > >
> > >         return ret;
> > >  }
> > > +
> > > +long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
> > > +                               struct kvm_arm_copy_mte_tags *copy_tags)
> > > +{
> > > +       gpa_t guest_ipa = copy_tags->guest_ipa;
> > > +       size_t length = copy_tags->length;
> > > +       void __user *tags = copy_tags->addr;
> > > +       gpa_t gfn;
> > > +       bool write = !(copy_tags->flags & KVM_ARM_TAGS_FROM_GUEST);
> > > +       int ret = 0;
> > > +
> > > +       if (!kvm_has_mte(kvm))
> > > +               return -EINVAL;
> > > +
> > > +       if (copy_tags->reserved[0] || copy_tags->reserved[1])
> > > +               return -EINVAL;
> > > +
> > > +       if (copy_tags->flags & ~KVM_ARM_TAGS_FROM_GUEST)
> > > +               return -EINVAL;
> > > +
> > > +       if (length & ~PAGE_MASK || guest_ipa & ~PAGE_MASK)
> > > +               return -EINVAL;
> > > +
> > > +       gfn = gpa_to_gfn(guest_ipa);
> > > +
> > > +       mutex_lock(&kvm->slots_lock);
> > > +
> > > +       while (length > 0) {
> > > +               kvm_pfn_t pfn = gfn_to_pfn_prot(kvm, gfn, write, NULL);
> > > +               void *maddr;
> > > +               unsigned long num_tags;
> > > +               struct page *page;
> > > +
> > > +               if (is_error_noslot_pfn(pfn)) {
> > > +                       ret = -EFAULT;
> > > +                       goto out;
> > > +               }
> > > +
> > > +               page = pfn_to_online_page(pfn);
> > > +               if (!page) {
> > > +                       /* Reject ZONE_DEVICE memory */
> > > +                       ret = -EFAULT;
> > > +                       goto out;
> > > +               }
> > > +               maddr = page_address(page);
> > > +
> > > +               if (!write) {
> > > +                       if (test_bit(PG_mte_tagged, &page->flags))
> > > +                               num_tags = mte_copy_tags_to_user(tags, maddr,
> > > +                                                       MTE_GRANULES_PER_PAGE);
> > > +                       else
> > > +                               /* No tags in memory, so write zeros */
> > > +                               num_tags = MTE_GRANULES_PER_PAGE -
> > > +                                       clear_user(tags, MTE_GRANULES_PER_PAGE);
> > > +                       kvm_release_pfn_clean(pfn);
> > > +               } else {
> > > +                       num_tags = mte_copy_tags_from_user(maddr, tags,
> > > +                                                       MTE_GRANULES_PER_PAGE);
> > > +                       kvm_release_pfn_dirty(pfn);
> > > +               }
> > > +
> > > +               if (num_tags != MTE_GRANULES_PER_PAGE) {
> > > +                       ret = -EFAULT;
> > > +                       goto out;
> > > +               }
> > > +
> > > +               /* Set the flag after checking the write completed fully */
> > > +               if (write)
> > > +                       set_bit(PG_mte_tagged, &page->flags);
> > > +
> > > +               gfn++;
> > > +               tags += num_tags;
> > > +               length -= PAGE_SIZE;
> > > +       }
> > > +
> > > +out:
> > > +       mutex_unlock(&kvm->slots_lock);
> > > +       /* If some data has been copied report the number of bytes copied */
> > > +       if (length != copy_tags->length)
> > > +               return copy_tags->length - length;
> >
> > I'm not sure if this is actually an issue, but a couple of comments on
> > the return value if there is an error after a partial copy has been
> > done. If mte_copy_tags_to_user or mte_copy_tags_from_user don't return
> > MTE_GRANULES_PER_PAGE, then the check for num_tags would fail, but
> > some of the tags would have been copied, which wouldn't be reflected
> > in length. That said, on a write the tagged bit wouldn't be set, and
> > on a read the return value would be conservative, but not
> > incorrect.
> >
> > Also, even though it is described that way in the documentation
> > (rather deep in the description though), it might be confusing to
> > return a non-negative value on an error. The other kvm ioctl I could
> > find that does something similar, KVM_S390_GET_IRQ_STATE, seems to
> > always return a -ERROR on error, rather than the number of bytes
> > copied.
>
> My mental analogy for this ioctl is the read()/write() syscalls, which
> return the number of bytes that have been transferred in either
> direction.
>
> I agree that there are some corner cases (a tag copy that fails
> because of a faulty page adjacent to a valid page will still report
> some degree of success), but it is also important to report what has
> actually been done in either direction.

I believe read()/write() return an error (-1), and not the amount
copied, if there is an actual error:

https://man7.org/linux/man-pages/man2/read.2.html

> It is not an error if this number is smaller than the number of
> bytes requested; this may happen for example because fewer bytes
> are actually available right now (maybe because we were close to
> end-of-file, or because we are reading from a pipe, or from a
> terminal), or because read() was interrupted by a signal.
>
> On error, -1 is returned, and errno is set to indicate the error.
> In this case, it is left unspecified whether the file position
> (if any) changes.

I think that, keeping the current return value, it would be good for
the documentation in patch 6/6 to be more explicit. There it says:

> :Returns: number of bytes copied, < 0 on error (-EINVAL for incorrect
>           arguments, -EFAULT if memory cannot be accessed).

Later on it does state that if an error happens after some copying has
been done, it returns the number copied. But that's at the end of the
section. I think it would be less confusing to have it in the summary
(with the "Returns").
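
Something along these lines, perhaps (suggested wording only, not text
taken from the actual patch):

  :Returns: number of bytes copied, < 0 on error (-EINVAL for incorrect
            arguments, -EFAULT if memory cannot be accessed). If an
            error occurs after some bytes have been copied, the number
            of bytes successfully copied is returned instead of the
            error.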

Thanks,
/fuad

> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v17 2/6] KVM: arm64: Introduce MTE VM feature
  2021-06-21 17:00   ` Fuad Tabba
@ 2021-06-22 11:29     ` Marc Zyngier
  0 siblings, 0 replies; 23+ messages in thread
From: Marc Zyngier @ 2021-06-22 11:29 UTC (permalink / raw)
  To: Fuad Tabba, Steven Price
  Cc: Steven Price, Catalin Marinas, Will Deacon,
	Dr. David Alan Gilbert, qemu-devel, Dave Martin, Juan Quintela,
	Richard Henderson, linux-kernel, Thomas Gleixner, kvmarm,
	linux-arm-kernel

On Mon, 21 Jun 2021 18:00:20 +0100,
Fuad Tabba <tabba@google.com> wrote:
> 
> Hi,
> 
> On Mon, Jun 21, 2021 at 12:18 PM Steven Price <steven.price@arm.com> wrote:
> >
> > Add a new VM feature 'KVM_ARM_CAP_MTE' which enables memory tagging
> > for a VM. This will expose the feature to the guest and automatically
> > tag memory pages touched by the VM as PG_mte_tagged (and clear the tag
> > storage) to ensure that the guest cannot see stale tags, and so that
> > the tags are correctly saved/restored across swap.
> >
> > Actually exposing the new capability to user space happens in a later
> > patch.
> >
> > Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> > Signed-off-by: Steven Price <steven.price@arm.com>
> > ---
> >  arch/arm64/include/asm/kvm_emulate.h |  3 ++
> >  arch/arm64/include/asm/kvm_host.h    |  3 ++
> >  arch/arm64/kvm/hyp/exception.c       |  3 +-
> >  arch/arm64/kvm/mmu.c                 | 64 +++++++++++++++++++++++++++-
> >  arch/arm64/kvm/sys_regs.c            |  7 +++
> >  include/uapi/linux/kvm.h             |  1 +
> >  6 files changed, 79 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> > index 01b9857757f2..fd418955e31e 100644
> > --- a/arch/arm64/include/asm/kvm_emulate.h
> > +++ b/arch/arm64/include/asm/kvm_emulate.h
> > @@ -84,6 +84,9 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
> >         if (cpus_have_const_cap(ARM64_MISMATCHED_CACHE_TYPE) ||
> >             vcpu_el1_is_32bit(vcpu))
> >                 vcpu->arch.hcr_el2 |= HCR_TID2;
> > +
> > +       if (kvm_has_mte(vcpu->kvm))
> > +               vcpu->arch.hcr_el2 |= HCR_ATA;
> >  }
> >
> >  static inline unsigned long *vcpu_hcr(struct kvm_vcpu *vcpu)
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 7cd7d5c8c4bc..afaa5333f0e4 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -132,6 +132,8 @@ struct kvm_arch {
> >
> >         u8 pfr0_csv2;
> >         u8 pfr0_csv3;
> > +       /* Memory Tagging Extension enabled for the guest */
> > +       bool mte_enabled;
> >  };
> 
> nit: newline before the comment/new member
> 
> >
> >  struct kvm_vcpu_fault_info {
> > @@ -769,6 +771,7 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
> >  #define kvm_arm_vcpu_sve_finalized(vcpu) \
> >         ((vcpu)->arch.flags & KVM_ARM64_VCPU_SVE_FINALIZED)
> >
> > +#define kvm_has_mte(kvm) (system_supports_mte() && (kvm)->arch.mte_enabled)
> >  #define kvm_vcpu_has_pmu(vcpu)                                 \
> >         (test_bit(KVM_ARM_VCPU_PMU_V3, (vcpu)->arch.features))
> >
> > diff --git a/arch/arm64/kvm/hyp/exception.c b/arch/arm64/kvm/hyp/exception.c
> > index 11541b94b328..0418399e0a20 100644
> > --- a/arch/arm64/kvm/hyp/exception.c
> > +++ b/arch/arm64/kvm/hyp/exception.c
> > @@ -112,7 +112,8 @@ static void enter_exception64(struct kvm_vcpu *vcpu, unsigned long target_mode,
> >         new |= (old & PSR_C_BIT);
> >         new |= (old & PSR_V_BIT);
> >
> > -       // TODO: TCO (if/when ARMv8.5-MemTag is exposed to guests)
> > +       if (kvm_has_mte(vcpu->kvm))
> > +               new |= PSR_TCO_BIT;
> >
> >         new |= (old & PSR_DIT_BIT);
> >
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > index c10207fed2f3..52326b739357 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -822,6 +822,45 @@ transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
> >         return PAGE_SIZE;
> >  }
> >
> > +/*
> > + * The page will be mapped in stage 2 as Normal Cacheable, so the VM will be
> > + * able to see the page's tags and therefore they must be initialised first. If
> > + * PG_mte_tagged is set, tags have already been initialised.
> > + *
> > + * The race in the test/set of the PG_mte_tagged flag is handled by:
> > + * - preventing VM_SHARED mappings in a memslot with MTE preventing two VMs
> > + *   racing to santise the same page
> > + * - mmap_lock protects between a VM faulting a page in and the VMM performing
> > + *   an mprotect() to add VM_MTE
> > + */
> > +static int sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
> > +                            unsigned long size)
> > +{
> > +       unsigned long i, nr_pages = size >> PAGE_SHIFT;
> > +       struct page *page;
> > +
> > +       if (!kvm_has_mte(kvm))
> > +               return 0;
> > +
> > +       /*
> > +        * pfn_to_online_page() is used to reject ZONE_DEVICE pages
> > +        * that may not support tags.
> > +        */
> > +       page = pfn_to_online_page(pfn);
> > +
> > +       if (!page)
> > +               return -EFAULT;
> > +
> > +       for (i = 0; i < nr_pages; i++, page++) {
> > +               if (!test_bit(PG_mte_tagged, &page->flags)) {
> > +                       mte_clear_page_tags(page_address(page));
> > +                       set_bit(PG_mte_tagged, &page->flags);
> > +               }
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> >  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >                           struct kvm_memory_slot *memslot, unsigned long hva,
> >                           unsigned long fault_status)
> > @@ -971,8 +1010,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >         if (writable)
> >                 prot |= KVM_PGTABLE_PROT_W;
> >
> > -       if (fault_status != FSC_PERM && !device)
> > +       if (fault_status != FSC_PERM && !device) {
> > +               /* Check the VMM hasn't introduced a new VM_SHARED VMA */
> > +               if (kvm_has_mte(kvm) && vma->vm_flags & VM_SHARED) {
> > +                       ret = -EFAULT;
> > +                       goto out_unlock;
> > +               }
> > +               ret = sanitise_mte_tags(kvm, pfn, vma_pagesize);
> > +               if (ret)
> > +                       goto out_unlock;
> > +
> 
> nit: Would it make sense to bring sanitise_mte_tags under the
> kvm_has_mte check? I know that a check is done in kvm_has_mte as well,
> but since you're already checking, it might make the code a bit clearer.

I think it makes more sense once merged with -next, as the CMO has
been moved into the PT code. I came up with the following resolution:

	if (vma_pagesize == PAGE_SIZE && !(force_pte || device))
		vma_pagesize = transparent_hugepage_adjust(memslot, hva,
							   &pfn, &fault_ipa);

	if (fault_status != FSC_PERM && !device && kvm_has_mte(kvm)) {
		/* Check the VMM hasn't introduced a new VM_SHARED VMA */
		if (!(vma->vm_flags & VM_SHARED))
			ret = sanitise_mte_tags(kvm, pfn, vma_pagesize);
		else
			ret = -EFAULT;
		if (ret)
			goto out_unlock;
	}

	if (writable)
		prot |= KVM_PGTABLE_PROT_W;


However, there is a more annoying issue here, which is that the vma is
accessed outside of the mm lock. I *think* we're safe because if an
unmap happens in parallel, the MMU notifier will kick and we will be
in one of two cases:

- the unmap occurs before we take the kvm->mmu_lock, and the mmu
  notifier seq_lock is what saves us (we will drop everything and take
  the fault again),

- it occurs once we hold the lock, and this blocks the unmap.

Either way, I'd be more confident if the shared state was sampled
inside the locked section.
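
For reference, a condensed sketch of the sequence that gives us this
guarantee in the fault path (simplified from user_mem_abort(), not the
literal code):

	/* Sample the notifier sequence before resolving the fault */
	mmu_seq = vcpu->kvm->mmu_notifier_seq;
	smp_rmb();	/* pairs with the smp_wmb() in the MMU notifiers */

	pfn = gfn_to_pfn_prot(kvm, gfn, write_fault, &writable);

	spin_lock(&kvm->mmu_lock);
	/* An invalidation ran since we sampled mmu_seq: retry the fault */
	if (mmu_notifier_retry(kvm, mmu_seq))
		goto out_unlock;

	/* ... install the mapping while holding mmu_lock ... */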

Thoughts?

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v17 0/6] MTE support for KVM guest
  2021-06-21 11:17 [PATCH v17 0/6] MTE support for KVM guest Steven Price
                   ` (5 preceding siblings ...)
  2021-06-21 11:17 ` [PATCH v17 6/6] KVM: arm64: Document MTE capability and ioctl Steven Price
@ 2021-06-22 14:21 ` Marc Zyngier
  2021-06-23 14:09   ` Steven Price
  6 siblings, 1 reply; 23+ messages in thread
From: Marc Zyngier @ 2021-06-22 14:21 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Steven Price
  Cc: linux-kernel, Dave Martin, Peter Maydell, Julien Thierry,
	Suzuki K Poulose, Juan Quintela, Dr. David Alan Gilbert,
	Andrew Jones, qemu-devel, Thomas Gleixner, James Morse,
	Richard Henderson, Mark Rutland, linux-arm-kernel, kvmarm

On Mon, 21 Jun 2021 12:17:10 +0100, Steven Price wrote:
> This series adds support for using the Arm Memory Tagging Extensions
> (MTE) in a KVM guest.
> 
> Changes since v16[1]:
> 
>  - Dropped the first patch ("Handle race when synchronising tags") as
>    it's not KVM specific and by restricting MAP_SHARED in KVM there is
>    no longer a dependency.
> 
> [...]

Applied to next, thanks!

[1/6] arm64: mte: Sync tags for pages where PTE is untagged
      commit: 69e3b846d8a753f9f279f29531ca56b0f7563ad0
[2/6] KVM: arm64: Introduce MTE VM feature
      commit: ea7fc1bb1cd1b92b42b1d9273ce7e231d3dc9321
[3/6] KVM: arm64: Save/restore MTE registers
      commit: e1f358b5046479d2897f23b1d5b092687c6e7a67
[4/6] KVM: arm64: Expose KVM_ARM_CAP_MTE
      commit: 673638f434ee4a00319e254ade338c57618d6f7e
[5/6] KVM: arm64: ioctl to fetch/store tags in a guest
      commit: f0376edb1ddcab19a473b4bf1fbd5b6bbed3705b
[6/6] KVM: arm64: Document MTE capability and ioctl
      commit: 04c02c201d7e8149ae336ead69fb64e4e6f94bc9

I performed a number of changes in user_mem_abort(), so please
have a look at the result. It is also pretty late in the merge
cycle, so if anything looks amiss, I'll just drop it.

Cheers,

	M.
-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v17 5/6] KVM: arm64: ioctl to fetch/store tags in a guest
  2021-06-22 10:56       ` Fuad Tabba
@ 2021-06-23 14:07         ` Steven Price
  0 siblings, 0 replies; 23+ messages in thread
From: Steven Price @ 2021-06-23 14:07 UTC (permalink / raw)
  To: Fuad Tabba, Marc Zyngier
  Cc: Catalin Marinas, Will Deacon, Dr. David Alan Gilbert, qemu-devel,
	Dave Martin, Juan Quintela, Richard Henderson, linux-kernel,
	Thomas Gleixner, kvmarm, linux-arm-kernel

On 22/06/2021 11:56, Fuad Tabba wrote:
> Hi Marc,
> 
> On Tue, Jun 22, 2021 at 11:25 AM Marc Zyngier <maz@kernel.org> wrote:
>>
>> Hi Fuad,
>>
>> On Tue, 22 Jun 2021 09:56:22 +0100,
>> Fuad Tabba <tabba@google.com> wrote:
>>>
>>> Hi,
>>>
>>>
>>> On Mon, Jun 21, 2021 at 12:18 PM Steven Price <steven.price@arm.com> wrote:
>>>>
>>>> The VMM may not wish to have its own mapping of guest memory mapped
>>>> with PROT_MTE because this causes problems if the VMM has tag checking
>>>> enabled (the guest controls the tags in physical RAM and it's unlikely
>>>> the tags are correct for the VMM).
>>>>
>>>> Instead add a new ioctl which allows the VMM to easily read/write the
>>>> tags from guest memory, allowing the VMM's mapping to be non-PROT_MTE
>>>> while the VMM can still read/write the tags for the purpose of
>>>> migration.
>>>>
>>>> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
>>>> Signed-off-by: Steven Price <steven.price@arm.com>
>>>> ---
>>>>  arch/arm64/include/asm/kvm_host.h |  3 ++
>>>>  arch/arm64/include/asm/mte-def.h  |  1 +
>>>>  arch/arm64/include/uapi/asm/kvm.h | 11 +++++
>>>>  arch/arm64/kvm/arm.c              |  7 +++
>>>>  arch/arm64/kvm/guest.c            | 82 +++++++++++++++++++++++++++++++
>>>>  include/uapi/linux/kvm.h          |  1 +
>>>>  6 files changed, 105 insertions(+)
>>>>
>>>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>>>> index 309e36cc1b42..6a2ac4636d42 100644
>>>> --- a/arch/arm64/include/asm/kvm_host.h
>>>> +++ b/arch/arm64/include/asm/kvm_host.h
>>>> @@ -729,6 +729,9 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
>>>>  int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
>>>>                                struct kvm_device_attr *attr);
>>>>
>>>> +long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
>>>> +                               struct kvm_arm_copy_mte_tags *copy_tags);
>>>> +
>>>>  /* Guest/host FPSIMD coordination helpers */
>>>>  int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
>>>>  void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
>>>> diff --git a/arch/arm64/include/asm/mte-def.h b/arch/arm64/include/asm/mte-def.h
>>>> index cf241b0f0a42..626d359b396e 100644
>>>> --- a/arch/arm64/include/asm/mte-def.h
>>>> +++ b/arch/arm64/include/asm/mte-def.h
>>>> @@ -7,6 +7,7 @@
>>>>
>>>>  #define MTE_GRANULE_SIZE       UL(16)
>>>>  #define MTE_GRANULE_MASK       (~(MTE_GRANULE_SIZE - 1))
>>>> +#define MTE_GRANULES_PER_PAGE  (PAGE_SIZE / MTE_GRANULE_SIZE)
>>>>  #define MTE_TAG_SHIFT          56
>>>>  #define MTE_TAG_SIZE           4
>>>>  #define MTE_TAG_MASK           GENMASK((MTE_TAG_SHIFT + (MTE_TAG_SIZE - 1)), MTE_TAG_SHIFT)
>>>> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
>>>> index 24223adae150..b3edde68bc3e 100644
>>>> --- a/arch/arm64/include/uapi/asm/kvm.h
>>>> +++ b/arch/arm64/include/uapi/asm/kvm.h
>>>> @@ -184,6 +184,17 @@ struct kvm_vcpu_events {
>>>>         __u32 reserved[12];
>>>>  };
>>>>
>>>> +struct kvm_arm_copy_mte_tags {
>>>> +       __u64 guest_ipa;
>>>> +       __u64 length;
>>>> +       void __user *addr;
>>>> +       __u64 flags;
>>>> +       __u64 reserved[2];
>>>> +};
>>>> +
>>>> +#define KVM_ARM_TAGS_TO_GUEST          0
>>>> +#define KVM_ARM_TAGS_FROM_GUEST                1
>>>> +
>>>>  /* If you need to interpret the index values, here is the key: */
>>>>  #define KVM_REG_ARM_COPROC_MASK                0x000000000FFF0000
>>>>  #define KVM_REG_ARM_COPROC_SHIFT       16
>>>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>>>> index 28ce26a68f09..511f3716fe33 100644
>>>> --- a/arch/arm64/kvm/arm.c
>>>> +++ b/arch/arm64/kvm/arm.c
>>>> @@ -1359,6 +1359,13 @@ long kvm_arch_vm_ioctl(struct file *filp,
>>>>
>>>>                 return 0;
>>>>         }
>>>> +       case KVM_ARM_MTE_COPY_TAGS: {
>>>> +               struct kvm_arm_copy_mte_tags copy_tags;
>>>> +
>>>> +               if (copy_from_user(&copy_tags, argp, sizeof(copy_tags)))
>>>> +                       return -EFAULT;
>>>> +               return kvm_vm_ioctl_mte_copy_tags(kvm, &copy_tags);
>>>> +       }
>>>>         default:
>>>>                 return -EINVAL;
>>>>         }
>>>> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
>>>> index 5cb4a1cd5603..4ddb20017b2f 100644
>>>> --- a/arch/arm64/kvm/guest.c
>>>> +++ b/arch/arm64/kvm/guest.c
>>>> @@ -995,3 +995,85 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
>>>>
>>>>         return ret;
>>>>  }
>>>> +
>>>> +long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
>>>> +                               struct kvm_arm_copy_mte_tags *copy_tags)
>>>> +{
>>>> +       gpa_t guest_ipa = copy_tags->guest_ipa;
>>>> +       size_t length = copy_tags->length;
>>>> +       void __user *tags = copy_tags->addr;
>>>> +       gpa_t gfn;
>>>> +       bool write = !(copy_tags->flags & KVM_ARM_TAGS_FROM_GUEST);
>>>> +       int ret = 0;
>>>> +
>>>> +       if (!kvm_has_mte(kvm))
>>>> +               return -EINVAL;
>>>> +
>>>> +       if (copy_tags->reserved[0] || copy_tags->reserved[1])
>>>> +               return -EINVAL;
>>>> +
>>>> +       if (copy_tags->flags & ~KVM_ARM_TAGS_FROM_GUEST)
>>>> +               return -EINVAL;
>>>> +
>>>> +       if (length & ~PAGE_MASK || guest_ipa & ~PAGE_MASK)
>>>> +               return -EINVAL;
>>>> +
>>>> +       gfn = gpa_to_gfn(guest_ipa);
>>>> +
>>>> +       mutex_lock(&kvm->slots_lock);
>>>> +
>>>> +       while (length > 0) {
>>>> +               kvm_pfn_t pfn = gfn_to_pfn_prot(kvm, gfn, write, NULL);
>>>> +               void *maddr;
>>>> +               unsigned long num_tags;
>>>> +               struct page *page;
>>>> +
>>>> +               if (is_error_noslot_pfn(pfn)) {
>>>> +                       ret = -EFAULT;
>>>> +                       goto out;
>>>> +               }
>>>> +
>>>> +               page = pfn_to_online_page(pfn);
>>>> +               if (!page) {
>>>> +                       /* Reject ZONE_DEVICE memory */
>>>> +                       ret = -EFAULT;
>>>> +                       goto out;
>>>> +               }
>>>> +               maddr = page_address(page);
>>>> +
>>>> +               if (!write) {
>>>> +                       if (test_bit(PG_mte_tagged, &page->flags))
>>>> +                               num_tags = mte_copy_tags_to_user(tags, maddr,
>>>> +                                                       MTE_GRANULES_PER_PAGE);
>>>> +                       else
>>>> +                               /* No tags in memory, so write zeros */
>>>> +                               num_tags = MTE_GRANULES_PER_PAGE -
>>>> +                                       clear_user(tags, MTE_GRANULES_PER_PAGE);
>>>> +                       kvm_release_pfn_clean(pfn);
>>>> +               } else {
>>>> +                       num_tags = mte_copy_tags_from_user(maddr, tags,
>>>> +                                                       MTE_GRANULES_PER_PAGE);
>>>> +                       kvm_release_pfn_dirty(pfn);
>>>> +               }
>>>> +
>>>> +               if (num_tags != MTE_GRANULES_PER_PAGE) {
>>>> +                       ret = -EFAULT;
>>>> +                       goto out;
>>>> +               }
>>>> +
>>>> +               /* Set the flag after checking the write completed fully */
>>>> +               if (write)
>>>> +                       set_bit(PG_mte_tagged, &page->flags);
>>>> +
>>>> +               gfn++;
>>>> +               tags += num_tags;
>>>> +               length -= PAGE_SIZE;
>>>> +       }
>>>> +
>>>> +out:
>>>> +       mutex_unlock(&kvm->slots_lock);
>>>> +       /* If some data has been copied report the number of bytes copied */
>>>> +       if (length != copy_tags->length)
>>>> +               return copy_tags->length - length;
>>>
>>> I'm not sure if this is actually an issue, but a couple of comments on
>>> the return value if there is an error after a partial copy has been
>>> done. If mte_copy_tags_to_user or mte_copy_tags_from_user don't return
>>> MTE_GRANULES_PER_PAGE, then the check for num_tags would fail, but
>>> some of the tags would have been copied, which wouldn't be reflected
>>> in length. That said, on a write the tagged bit wouldn't be set, and
>>> on read then the return value would be conservative, but not
>>> incorrect.
>>>
>>> That said, even though it is described that way in the documentation
>>> (rather deep in the description though), it might be confusing to
>>> return a non-negative value on an error. The other kvm ioctl I could
>>> find that does something similar, KVM_S390_GET_IRQ_STATE, seems to
>>> always return a -ERROR on error, rather than the number of bytes
>>> copied.
>>
>> My mental analogy for this ioctl is the read()/write() syscalls, which
>> return the number of bytes that have been transferred in either
>> direction.
>>
>> I agree that there are some corner cases (a tag copy that fails
>> because of a faulty page adjacent to a valid page will still report
>> some degree of success), but it is also important to report what has
>> actually been done in either direction.
> 
> read()/write() return an error (-1) and not the amount copied if there
> is an actual error I believe:
> 
> https://man7.org/linux/man-pages/man2/read.2.html
> 
>> It is not an error if this number is smaller than the number of
>> bytes requested; this may happen for example because fewer bytes
>> are actually available right now (maybe because we were close to
>> end-of-file, or because we are reading from a pipe, or from a
>> terminal), or because read() was interrupted by a signal.
>>
>> On error, -1 is returned, and errno is set to indicate the error.
>> In this case, it is left unspecified whether the file position
>> (if any) changes.
> 
> I think that for the current return value, then it would be good for
> the documentation in patch 6/6 to be more explicit. There it says:

read() isn't quite as interesting because (usually) there's no effect if
you abort a read part way through (you can easily roll back the file
position and return an error), although I believe it does have the same
behaviour as write() in the error cases where rollback doesn't happen.
write()[1] has the following text:

> Note that a successful write() may transfer fewer than count
> bytes.  Such partial writes can occur for various reasons; for
> example, because there was insufficient space on the disk device
> to write all of the requested bytes, or because a blocked write()
> to a socket, pipe, or similar was interrupted by a signal handler
> after it had transferred some, but before it had transferred all
> of the requested bytes.  In the event of a partial write, the
> caller can make another write() call to transfer the remaining
> bytes.  The subsequent call will either transfer further bytes or
> may result in an error (e.g., if the disk is now full).

which matches the behaviour here - if there's an error (e.g. disk full)
then the partial success is reported (and the error is effectively
ignored). The caller can then make another call to attempt the remaining
part, at which point the error will actually be reported.

[1] https://man7.org/linux/man-pages/man2/write.2.html
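
In other words, a VMM can treat a positive return as progress and simply
retry. A hypothetical userspace sketch (the helper name is made up; the
struct, flag and ioctl names are the ones from this series):

	#include <stdint.h>
	#include <sys/ioctl.h>
	#include <linux/kvm.h>

	/* Copy tags out of the guest, retrying after partial copies. */
	static long fetch_guest_tags(int vm_fd, uint64_t ipa, void *buf,
				     uint64_t len)
	{
		struct kvm_arm_copy_mte_tags copy = {
			.guest_ipa	= ipa,
			.addr		= buf,
			.length		= len,
			.flags		= KVM_ARM_TAGS_FROM_GUEST,
		};

		while (copy.length > 0) {
			/* Returns the number of guest bytes covered, or
			 * < 0 if nothing was copied this time round. */
			long ret = ioctl(vm_fd, KVM_ARM_MTE_COPY_TAGS, &copy);

			if (ret < 0)
				return ret;

			/* One tag byte per 16-byte granule was written */
			copy.guest_ipa += ret;
			copy.addr = (char *)copy.addr + ret / 16;
			copy.length -= ret;
		}

		return 0;
	}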

>> :Returns: number of bytes copied, < 0 on error (-EINVAL for incorrect
>>           arguments, -EFAULT if memory cannot be accessed).
> 
> Later on it does state that if an error happens after some copying has
> been done, it returns the number copied. But that's at the end of the
> section. I think it would be less confusing to have it in the summary
> (with the "Returns").

I tried to keep the Returns section reasonably small. It does state
"number of bytes copied", which I think gives the reader a hint that the
return could be different from the number of bytes requested.

Thanks,

Steve

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v17 0/6] MTE support for KVM guest
  2021-06-22 14:21 ` [PATCH v17 0/6] MTE support for KVM guest Marc Zyngier
@ 2021-06-23 14:09   ` Steven Price
  0 siblings, 0 replies; 23+ messages in thread
From: Steven Price @ 2021-06-23 14:09 UTC (permalink / raw)
  To: Marc Zyngier, Catalin Marinas, Will Deacon
  Cc: linux-kernel, Dave Martin, Peter Maydell, Julien Thierry,
	Suzuki K Poulose, Juan Quintela, Dr. David Alan Gilbert,
	Andrew Jones, qemu-devel, Thomas Gleixner, James Morse,
	Richard Henderson, Mark Rutland, linux-arm-kernel, kvmarm

On 22/06/2021 15:21, Marc Zyngier wrote:
> On Mon, 21 Jun 2021 12:17:10 +0100, Steven Price wrote:
>> This series adds support for using the Arm Memory Tagging Extensions
>> (MTE) in a KVM guest.
>>
>> Changes since v16[1]:
>>
>>  - Dropped the first patch ("Handle race when synchronising tags") as
>>    it's not KVM specific and by restricting MAP_SHARED in KVM there is
>>    no longer a dependency.
>>
>> [...]
> 
> Applied to next, thanks!
> 
> [1/6] arm64: mte: Sync tags for pages where PTE is untagged
>       commit: 69e3b846d8a753f9f279f29531ca56b0f7563ad0
> [2/6] KVM: arm64: Introduce MTE VM feature
>       commit: ea7fc1bb1cd1b92b42b1d9273ce7e231d3dc9321
> [3/6] KVM: arm64: Save/restore MTE registers
>       commit: e1f358b5046479d2897f23b1d5b092687c6e7a67
> [4/6] KVM: arm64: Expose KVM_ARM_CAP_MTE
>       commit: 673638f434ee4a00319e254ade338c57618d6f7e
> [5/6] KVM: arm64: ioctl to fetch/store tags in a guest
>       commit: f0376edb1ddcab19a473b4bf1fbd5b6bbed3705b
> [6/6] KVM: arm64: Document MTE capability and ioctl
>       commit: 04c02c201d7e8149ae336ead69fb64e4e6f94bc9
> 
> I performed a number of changes in user_mem_abort(), so please
> have a look at the result. It is also pretty late in the merge
> cycle, so if anything looks amiss, I'll just drop it.

It all looks good to me - thanks for making those changes.

Steve

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v17 5/6] KVM: arm64: ioctl to fetch/store tags in a guest
  2021-06-21 11:17 ` [PATCH v17 5/6] KVM: arm64: ioctl to fetch/store tags in a guest Steven Price
  2021-06-22  8:56   ` Fuad Tabba
@ 2021-06-24 13:35   ` Marc Zyngier
  2021-06-24 13:42     ` Steven Price
  1 sibling, 1 reply; 23+ messages in thread
From: Marc Zyngier @ 2021-06-24 13:35 UTC (permalink / raw)
  To: Steven Price
  Cc: Catalin Marinas, Will Deacon, James Morse, Julien Thierry,
	Suzuki K Poulose, kvmarm, linux-arm-kernel, linux-kernel,
	Dave Martin, Mark Rutland, Thomas Gleixner, qemu-devel,
	Juan Quintela, Dr. David Alan Gilbert, Richard Henderson,
	Peter Maydell, Andrew Jones

Hi Steven,

On Mon, 21 Jun 2021 12:17:15 +0100,
Steven Price <steven.price@arm.com> wrote:
> 
> The VMM may not wish to have its own mapping of guest memory mapped
> with PROT_MTE because this causes problems if the VMM has tag checking
> enabled (the guest controls the tags in physical RAM and it's unlikely
> the tags are correct for the VMM).
> 
> Instead add a new ioctl which allows the VMM to easily read/write the
> tags from guest memory, allowing the VMM's mapping to be non-PROT_MTE
> while the VMM can still read/write the tags for the purpose of
> migration.
> 
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/include/asm/kvm_host.h |  3 ++
>  arch/arm64/include/asm/mte-def.h  |  1 +
>  arch/arm64/include/uapi/asm/kvm.h | 11 +++++
>  arch/arm64/kvm/arm.c              |  7 +++
>  arch/arm64/kvm/guest.c            | 82 +++++++++++++++++++++++++++++++
>  include/uapi/linux/kvm.h          |  1 +
>  6 files changed, 105 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 309e36cc1b42..6a2ac4636d42 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -729,6 +729,9 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
>  int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
>  			       struct kvm_device_attr *attr);
>  
> +long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
> +				struct kvm_arm_copy_mte_tags *copy_tags);
> +
>  /* Guest/host FPSIMD coordination helpers */
>  int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
>  void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
> diff --git a/arch/arm64/include/asm/mte-def.h b/arch/arm64/include/asm/mte-def.h
> index cf241b0f0a42..626d359b396e 100644
> --- a/arch/arm64/include/asm/mte-def.h
> +++ b/arch/arm64/include/asm/mte-def.h
> @@ -7,6 +7,7 @@
>  
>  #define MTE_GRANULE_SIZE	UL(16)
>  #define MTE_GRANULE_MASK	(~(MTE_GRANULE_SIZE - 1))
> +#define MTE_GRANULES_PER_PAGE	(PAGE_SIZE / MTE_GRANULE_SIZE)
>  #define MTE_TAG_SHIFT		56
>  #define MTE_TAG_SIZE		4
>  #define MTE_TAG_MASK		GENMASK((MTE_TAG_SHIFT + (MTE_TAG_SIZE - 1)), MTE_TAG_SHIFT)
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 24223adae150..b3edde68bc3e 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -184,6 +184,17 @@ struct kvm_vcpu_events {
>  	__u32 reserved[12];
>  };
>  
> +struct kvm_arm_copy_mte_tags {
> +	__u64 guest_ipa;
> +	__u64 length;
> +	void __user *addr;
> +	__u64 flags;
> +	__u64 reserved[2];
> +};
> +
> +#define KVM_ARM_TAGS_TO_GUEST		0
> +#define KVM_ARM_TAGS_FROM_GUEST		1
> +
>  /* If you need to interpret the index values, here is the key: */
>  #define KVM_REG_ARM_COPROC_MASK		0x000000000FFF0000
>  #define KVM_REG_ARM_COPROC_SHIFT	16
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 28ce26a68f09..511f3716fe33 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1359,6 +1359,13 @@ long kvm_arch_vm_ioctl(struct file *filp,
>  
>  		return 0;
>  	}
> +	case KVM_ARM_MTE_COPY_TAGS: {
> +		struct kvm_arm_copy_mte_tags copy_tags;
> +
> +		if (copy_from_user(&copy_tags, argp, sizeof(copy_tags)))
> +			return -EFAULT;
> +		return kvm_vm_ioctl_mte_copy_tags(kvm, &copy_tags);
> +	}
>  	default:
>  		return -EINVAL;
>  	}
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 5cb4a1cd5603..4ddb20017b2f 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -995,3 +995,85 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
>  
>  	return ret;
>  }
> +
> +long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
> +				struct kvm_arm_copy_mte_tags *copy_tags)
> +{
> +	gpa_t guest_ipa = copy_tags->guest_ipa;
> +	size_t length = copy_tags->length;
> +	void __user *tags = copy_tags->addr;
> +	gpa_t gfn;
> +	bool write = !(copy_tags->flags & KVM_ARM_TAGS_FROM_GUEST);
> +	int ret = 0;
> +
> +	if (!kvm_has_mte(kvm))
> +		return -EINVAL;
> +
> +	if (copy_tags->reserved[0] || copy_tags->reserved[1])
> +		return -EINVAL;
> +
> +	if (copy_tags->flags & ~KVM_ARM_TAGS_FROM_GUEST)
> +		return -EINVAL;
> +
> +	if (length & ~PAGE_MASK || guest_ipa & ~PAGE_MASK)
> +		return -EINVAL;
> +
> +	gfn = gpa_to_gfn(guest_ipa);
> +
> +	mutex_lock(&kvm->slots_lock);
> +
> +	while (length > 0) {
> +		kvm_pfn_t pfn = gfn_to_pfn_prot(kvm, gfn, write, NULL);
> +		void *maddr;
> +		unsigned long num_tags;
> +		struct page *page;
> +
> +		if (is_error_noslot_pfn(pfn)) {
> +			ret = -EFAULT;
> +			goto out;
> +		}
> +
> +		page = pfn_to_online_page(pfn);
> +		if (!page) {
> +			/* Reject ZONE_DEVICE memory */
> +			ret = -EFAULT;
> +			goto out;
> +		}
> +		maddr = page_address(page);
> +
> +		if (!write) {
> +			if (test_bit(PG_mte_tagged, &page->flags))
> +				num_tags = mte_copy_tags_to_user(tags, maddr,
> +							MTE_GRANULES_PER_PAGE);
> +			else
> +				/* No tags in memory, so write zeros */
> +				num_tags = MTE_GRANULES_PER_PAGE -
> +					clear_user(tags, MTE_GRANULES_PER_PAGE);
> +			kvm_release_pfn_clean(pfn);
> +		} else {
> +			num_tags = mte_copy_tags_from_user(maddr, tags,
> +							MTE_GRANULES_PER_PAGE);
> +			kvm_release_pfn_dirty(pfn);
> +		}
> +
> +		if (num_tags != MTE_GRANULES_PER_PAGE) {
> +			ret = -EFAULT;
> +			goto out;
> +		}
> +
> +		/* Set the flag after checking the write completed fully */
> +		if (write)
> +			set_bit(PG_mte_tagged, &page->flags);

This ended up catching my eye as I was merging some other patches.

This set_bit() occurs *after* the page has been released, meaning it
could have been evicted and reused in the interval. I plan to fix it
as below. Please let me know if that works for you.

Thanks,

	M.

From a78d3206378a7101659fbc2a4bf01cb9376c4793 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz@kernel.org>
Date: Thu, 24 Jun 2021 14:21:05 +0100
Subject: [PATCH] KVM: arm64: Set the MTE tag bit before releasing the page

Setting a page flag without holding a reference to the page
is living dangerously. In the tag-writing path, we drop the
reference to the page by calling kvm_release_pfn_dirty(),
and only then set the PG_mte_tagged bit.

It would be safer to do it the other way round.

Fixes: f0376edb1ddca ("KVM: arm64: Add ioctl to fetch/store tags in a guest")
Cc: Steven Price <steven.price@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/kvm/guest.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 4ddb20017b2f..60815ae477cf 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -1053,6 +1053,14 @@ long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
 		} else {
 			num_tags = mte_copy_tags_from_user(maddr, tags,
 							MTE_GRANULES_PER_PAGE);
+
+			/*
+			 * Set the flag after checking the write
+			 * completed fully
+			 */
+			if (num_tags == MTE_GRANULES_PER_PAGE)
+				set_bit(PG_mte_tagged, &page->flags);
+
 			kvm_release_pfn_dirty(pfn);
 		}
 
@@ -1061,10 +1069,6 @@ long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
 			goto out;
 		}
 
-		/* Set the flag after checking the write completed fully */
-		if (write)
-			set_bit(PG_mte_tagged, &page->flags);
-
 		gfn++;
 		tags += num_tags;
 		length -= PAGE_SIZE;
-- 
2.30.2


-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v17 5/6] KVM: arm64: ioctl to fetch/store tags in a guest
  2021-06-24 13:35   ` Marc Zyngier
@ 2021-06-24 13:42     ` Steven Price
  0 siblings, 0 replies; 23+ messages in thread
From: Steven Price @ 2021-06-24 13:42 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Catalin Marinas, Will Deacon, James Morse, Julien Thierry,
	Suzuki K Poulose, kvmarm, linux-arm-kernel, linux-kernel,
	Dave Martin, Mark Rutland, Thomas Gleixner, qemu-devel,
	Juan Quintela, Dr. David Alan Gilbert, Richard Henderson,
	Peter Maydell, Andrew Jones

On 24/06/2021 14:35, Marc Zyngier wrote:
> Hi Steven,
> 
> On Mon, 21 Jun 2021 12:17:15 +0100,
> Steven Price <steven.price@arm.com> wrote:
>>
>> The VMM may not wish to have its own mapping of guest memory mapped
>> with PROT_MTE because this causes problems if the VMM has tag checking
>> enabled (the guest controls the tags in physical RAM and it's unlikely
>> the tags are correct for the VMM).
>>
>> Instead add a new ioctl which allows the VMM to easily read/write the
>> tags from guest memory, allowing the VMM's mapping to be non-PROT_MTE
>> while the VMM can still read/write the tags for the purpose of
>> migration.
>>
>> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>  arch/arm64/include/asm/kvm_host.h |  3 ++
>>  arch/arm64/include/asm/mte-def.h  |  1 +
>>  arch/arm64/include/uapi/asm/kvm.h | 11 +++++
>>  arch/arm64/kvm/arm.c              |  7 +++
>>  arch/arm64/kvm/guest.c            | 82 +++++++++++++++++++++++++++++++
>>  include/uapi/linux/kvm.h          |  1 +
>>  6 files changed, 105 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index 309e36cc1b42..6a2ac4636d42 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -729,6 +729,9 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
>>  int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
>>  			       struct kvm_device_attr *attr);
>>  
>> +long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
>> +				struct kvm_arm_copy_mte_tags *copy_tags);
>> +
>>  /* Guest/host FPSIMD coordination helpers */
>>  int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
>>  void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
>> diff --git a/arch/arm64/include/asm/mte-def.h b/arch/arm64/include/asm/mte-def.h
>> index cf241b0f0a42..626d359b396e 100644
>> --- a/arch/arm64/include/asm/mte-def.h
>> +++ b/arch/arm64/include/asm/mte-def.h
>> @@ -7,6 +7,7 @@
>>  
>>  #define MTE_GRANULE_SIZE	UL(16)
>>  #define MTE_GRANULE_MASK	(~(MTE_GRANULE_SIZE - 1))
>> +#define MTE_GRANULES_PER_PAGE	(PAGE_SIZE / MTE_GRANULE_SIZE)
>>  #define MTE_TAG_SHIFT		56
>>  #define MTE_TAG_SIZE		4
>>  #define MTE_TAG_MASK		GENMASK((MTE_TAG_SHIFT + (MTE_TAG_SIZE - 1)), MTE_TAG_SHIFT)
>> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
>> index 24223adae150..b3edde68bc3e 100644
>> --- a/arch/arm64/include/uapi/asm/kvm.h
>> +++ b/arch/arm64/include/uapi/asm/kvm.h
>> @@ -184,6 +184,17 @@ struct kvm_vcpu_events {
>>  	__u32 reserved[12];
>>  };
>>  
>> +struct kvm_arm_copy_mte_tags {
>> +	__u64 guest_ipa;
>> +	__u64 length;
>> +	void __user *addr;
>> +	__u64 flags;
>> +	__u64 reserved[2];
>> +};
>> +
>> +#define KVM_ARM_TAGS_TO_GUEST		0
>> +#define KVM_ARM_TAGS_FROM_GUEST		1
>> +
>>  /* If you need to interpret the index values, here is the key: */
>>  #define KVM_REG_ARM_COPROC_MASK		0x000000000FFF0000
>>  #define KVM_REG_ARM_COPROC_SHIFT	16
>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>> index 28ce26a68f09..511f3716fe33 100644
>> --- a/arch/arm64/kvm/arm.c
>> +++ b/arch/arm64/kvm/arm.c
>> @@ -1359,6 +1359,13 @@ long kvm_arch_vm_ioctl(struct file *filp,
>>  
>>  		return 0;
>>  	}
>> +	case KVM_ARM_MTE_COPY_TAGS: {
>> +		struct kvm_arm_copy_mte_tags copy_tags;
>> +
>> +		if (copy_from_user(&copy_tags, argp, sizeof(copy_tags)))
>> +			return -EFAULT;
>> +		return kvm_vm_ioctl_mte_copy_tags(kvm, &copy_tags);
>> +	}
>>  	default:
>>  		return -EINVAL;
>>  	}
>> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
>> index 5cb4a1cd5603..4ddb20017b2f 100644
>> --- a/arch/arm64/kvm/guest.c
>> +++ b/arch/arm64/kvm/guest.c
>> @@ -995,3 +995,85 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
>>  
>>  	return ret;
>>  }
>> +
>> +long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
>> +				struct kvm_arm_copy_mte_tags *copy_tags)
>> +{
>> +	gpa_t guest_ipa = copy_tags->guest_ipa;
>> +	size_t length = copy_tags->length;
>> +	void __user *tags = copy_tags->addr;
>> +	gpa_t gfn;
>> +	bool write = !(copy_tags->flags & KVM_ARM_TAGS_FROM_GUEST);
>> +	int ret = 0;
>> +
>> +	if (!kvm_has_mte(kvm))
>> +		return -EINVAL;
>> +
>> +	if (copy_tags->reserved[0] || copy_tags->reserved[1])
>> +		return -EINVAL;
>> +
>> +	if (copy_tags->flags & ~KVM_ARM_TAGS_FROM_GUEST)
>> +		return -EINVAL;
>> +
>> +	if (length & ~PAGE_MASK || guest_ipa & ~PAGE_MASK)
>> +		return -EINVAL;
>> +
>> +	gfn = gpa_to_gfn(guest_ipa);
>> +
>> +	mutex_lock(&kvm->slots_lock);
>> +
>> +	while (length > 0) {
>> +		kvm_pfn_t pfn = gfn_to_pfn_prot(kvm, gfn, write, NULL);
>> +		void *maddr;
>> +		unsigned long num_tags;
>> +		struct page *page;
>> +
>> +		if (is_error_noslot_pfn(pfn)) {
>> +			ret = -EFAULT;
>> +			goto out;
>> +		}
>> +
>> +		page = pfn_to_online_page(pfn);
>> +		if (!page) {
>> +			/* Reject ZONE_DEVICE memory */
>> +			ret = -EFAULT;
>> +			goto out;
>> +		}
>> +		maddr = page_address(page);
>> +
>> +		if (!write) {
>> +			if (test_bit(PG_mte_tagged, &page->flags))
>> +				num_tags = mte_copy_tags_to_user(tags, maddr,
>> +							MTE_GRANULES_PER_PAGE);
>> +			else
>> +				/* No tags in memory, so write zeros */
>> +				num_tags = MTE_GRANULES_PER_PAGE -
>> +					clear_user(tags, MTE_GRANULES_PER_PAGE);
>> +			kvm_release_pfn_clean(pfn);
>> +		} else {
>> +			num_tags = mte_copy_tags_from_user(maddr, tags,
>> +							MTE_GRANULES_PER_PAGE);
>> +			kvm_release_pfn_dirty(pfn);
>> +		}
>> +
>> +		if (num_tags != MTE_GRANULES_PER_PAGE) {
>> +			ret = -EFAULT;
>> +			goto out;
>> +		}
>> +
>> +		/* Set the flag after checking the write completed fully */
>> +		if (write)
>> +			set_bit(PG_mte_tagged, &page->flags);
> 
> This ended up catching my eye as I was merging some other patches.
> 
> This set_bit() occurs *after* the page has been released, meaning it
> could have been evicted and reused in the interval. I plan to fix it
> as below. Please let me know if that works for you.
> 
> Thanks,
> 
> 	M.
> 
> From a78d3206378a7101659fbc2a4bf01cb9376c4793 Mon Sep 17 00:00:00 2001
> From: Marc Zyngier <maz@kernel.org>
> Date: Thu, 24 Jun 2021 14:21:05 +0100
> Subject: [PATCH] KVM: arm64: Set the MTE tag bit before releasing the page
> 
> Setting a page flag without holding a reference to the page
> is living dangerously. In the tag-writing path, we drop the
> reference to the page by calling kvm_release_pfn_dirty(),
> and only then set the PG_mte_tagged bit.
> 
> It would be safer to do it the other way round.
> 
> Fixes: f0376edb1ddca ("KVM: arm64: Add ioctl to fetch/store tags in a guest")
> Cc: Steven Price <steven.price@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>

Well spotted - I'd originally had the set_bit there, but moved it down
when I realised it should only be set if mte_copy_tags_from_user()
completed successfully. Obviously I hadn't noticed that the page
reference had gone by that point.

Thanks for fixing it. FWIW:

Reviewed-by: Steven Price <steven.price@arm.com>

> ---
>  arch/arm64/kvm/guest.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 4ddb20017b2f..60815ae477cf 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -1053,6 +1053,14 @@ long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
>  		} else {
>  			num_tags = mte_copy_tags_from_user(maddr, tags,
>  							MTE_GRANULES_PER_PAGE);
> +
> +			/*
> +			 * Set the flag after checking the write
> +			 * completed fully
> +			 */
> +			if (num_tags == MTE_GRANULES_PER_PAGE)
> +				set_bit(PG_mte_tagged, &page->flags);
> +
>  			kvm_release_pfn_dirty(pfn);
>  		}
>  
> @@ -1061,10 +1069,6 @@ long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
>  			goto out;
>  		}
>  
> -		/* Set the flag after checking the write completed fully */
> -		if (write)
> -			set_bit(PG_mte_tagged, &page->flags);
> -
>  		gfn++;
>  		tags += num_tags;
>  		length -= PAGE_SIZE;
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2021-06-24 13:44 UTC | newest]

Thread overview: 23+ messages
2021-06-21 11:17 [PATCH v17 0/6] MTE support for KVM guest Steven Price
2021-06-21 11:17 ` [PATCH v17 1/6] arm64: mte: Sync tags for pages where PTE is untagged Steven Price
2021-06-21 11:17 ` [PATCH v17 2/6] KVM: arm64: Introduce MTE VM feature Steven Price
2021-06-21 17:00   ` Fuad Tabba
2021-06-22 11:29     ` Marc Zyngier
2021-06-21 11:17 ` [PATCH v17 3/6] KVM: arm64: Save/restore MTE registers Steven Price
2021-06-22  9:46   ` Fuad Tabba
2021-06-21 11:17 ` [PATCH v17 4/6] KVM: arm64: Expose KVM_ARM_CAP_MTE Steven Price
2021-06-22  8:07   ` Fuad Tabba
2021-06-22  8:48     ` Marc Zyngier
2021-06-21 11:17 ` [PATCH v17 5/6] KVM: arm64: ioctl to fetch/store tags in a guest Steven Price
2021-06-22  8:56   ` Fuad Tabba
2021-06-22 10:25     ` Marc Zyngier
2021-06-22 10:56       ` Fuad Tabba
2021-06-23 14:07         ` Steven Price
2021-06-24 13:35   ` Marc Zyngier
2021-06-24 13:42     ` Steven Price
2021-06-21 11:17 ` [PATCH v17 6/6] KVM: arm64: Document MTE capability and ioctl Steven Price
2021-06-22  9:42   ` Fuad Tabba
2021-06-22 10:35     ` Marc Zyngier
2021-06-22 10:41       ` Fuad Tabba
2021-06-22 14:21 ` [PATCH v17 0/6] MTE support for KVM guest Marc Zyngier
2021-06-23 14:09   ` Steven Price
