* [PATCH 0/4] arm64: mte: Fix racing on MTE tag initialisation
@ 2022-07-05 14:26 Catalin Marinas
  2022-07-05 14:26 ` [PATCH 1/4] arm64: mte: Fix/clarify the PG_mte_tagged semantics Catalin Marinas
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Catalin Marinas @ 2022-07-05 14:26 UTC (permalink / raw)
  To: Will Deacon, Marc Zyngier, Steven Price, Peter Collingbourne
  Cc: Vincenzo Frascino, linux-arm-kernel

This series aims to fix the races between initialising the tags on a
page and setting the PG_mte_tagged flag. Currently the flag is set
either before or after that tag initialisation and this can lead to CoW
copying stale tags. The first patch moves the flag setting after the
tags have been initialised, solving the CoW issue. However, concurrent
mprotect() on a shared mapping may (very rarely) lead to valid tags
being zeroed.
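
As an illustration, one problematic interleaving with the current code
(flag set before the tags are initialised) looks roughly like this,
using the function names touched by the patches below:

	CPU0: mprotect(PROT_MTE)		CPU1: CoW write fault
	mte_sync_tags()
	  set_bit(PG_mte_tagged, &page->flags)
						copy_highpage(to, from)
						  /* sees PG_mte_tagged on 'from' */
						  mte_copy_page_tags(kto, kfrom)
						  /* copies stale, uninitialised tags */
	  mte_clear_page_tags(page_address(page))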

The second patch skips the sanitise_mte_tags() call in kvm_set_spte_gfn(),
deferring it to user_mem_abort(). The outcome is that
sanitise_mte_tags() can be simplified to skip the pfn_to_online_page()
check and only rely on the VM_MTE_ALLOWED vma flag that can be checked
in user_mem_abort().

The third and fourth patches use PG_arch_3 as a lock for page tagging,
based on Peter Collingbourne's idea of a two-bit lock.

I think the first patch can be queued but the rest needs some in-depth
review and testing. With this series (if correct) we could allow
MAP_SHARED on KVM guest memory, but this is to be discussed separately
as there are some KVM ABI implications.

Thanks,

Catalin

Catalin Marinas (3):
  arm64: mte: Fix/clarify the PG_mte_tagged semantics
  KVM: arm64: Simplify the sanitise_mte_tags() logic
  arm64: mte: Lock a page for MTE tag initialisation

Peter Collingbourne (1):
  mm: Add PG_arch_3 page flag

 arch/arm64/include/asm/mte.h     | 62 ++++++++++++++++++++++++++++++++
 arch/arm64/include/asm/pgtable.h |  3 +-
 arch/arm64/kernel/cpufeature.c   |  4 ++-
 arch/arm64/kernel/elfcore.c      |  2 +-
 arch/arm64/kernel/hibernate.c    |  2 +-
 arch/arm64/kernel/mte.c          | 17 +++++----
 arch/arm64/kvm/guest.c           | 18 ++++++----
 arch/arm64/kvm/mmu.c             | 42 +++++++++-------------
 arch/arm64/mm/copypage.c         |  6 ++--
 arch/arm64/mm/fault.c            |  4 ++-
 arch/arm64/mm/mteswap.c          |  5 ++-
 fs/proc/page.c                   |  1 +
 include/linux/page-flags.h       |  1 +
 include/trace/events/mmflags.h   |  7 ++--
 mm/huge_memory.c                 |  1 +
 15 files changed, 125 insertions(+), 50 deletions(-)



* [PATCH 1/4] arm64: mte: Fix/clarify the PG_mte_tagged semantics
  2022-07-05 14:26 [PATCH 0/4] arm64: mte: Fix racing on MTE tag initialisation Catalin Marinas
@ 2022-07-05 14:26 ` Catalin Marinas
  2022-07-05 14:26 ` [PATCH 2/4] KVM: arm64: Simplify the sanitise_mte_tags() logic Catalin Marinas
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Catalin Marinas @ 2022-07-05 14:26 UTC (permalink / raw)
  To: Will Deacon, Marc Zyngier, Steven Price, Peter Collingbourne
  Cc: Vincenzo Frascino, linux-arm-kernel

Currently the PG_mte_tagged page flag mostly means the page contains
valid tags and it should be set after the tags have been cleared or
restored. However, in mte_sync_tags() it is set before setting the tags
to avoid, in theory, a race with concurrent mprotect(PROT_MTE) for
shared pages. As a result, a concurrent mprotect(PROT_MTE) with a copy
on write in another thread can cause the new page to have stale tags.
Similarly, tag reading via ptrace() can read stale tags if the
PG_mte_tagged flag is set before the tags are actually cleared/restored.

Fix the PG_mte_tagged semantics so that it is only set after the tags
have been cleared or restored. This is safe for swap restoring into a
MAP_SHARED or CoW page since the core code takes the page lock. Add two
functions to test and set the PG_mte_tagged flag with acquire and
release semantics. The downside is that concurrent mprotect(PROT_MTE) on
a MAP_SHARED page may cause tag loss. This is already the case for KVM
guests if a VMM changes the page protection while the guest triggers a
user_mem_abort().
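
As a sketch, the resulting call-site pattern (mirroring the hunks below)
becomes:

	if (!page_mte_tagged(page)) {		/* test_bit(), smp_rmb() when set */
		mte_clear_page_tags(page_address(page));
		set_page_mte_tagged(page);	/* smp_wmb() + set_bit() */
	}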

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Peter Collingbourne <pcc@google.com>
---
 arch/arm64/include/asm/mte.h     | 30 ++++++++++++++++++++++++++++++
 arch/arm64/include/asm/pgtable.h |  2 +-
 arch/arm64/kernel/cpufeature.c   |  4 +++-
 arch/arm64/kernel/elfcore.c      |  2 +-
 arch/arm64/kernel/hibernate.c    |  2 +-
 arch/arm64/kernel/mte.c          | 12 +++++++-----
 arch/arm64/kvm/guest.c           |  4 ++--
 arch/arm64/kvm/mmu.c             |  4 ++--
 arch/arm64/mm/copypage.c         |  4 ++--
 arch/arm64/mm/fault.c            |  2 +-
 arch/arm64/mm/mteswap.c          |  2 +-
 11 files changed, 51 insertions(+), 17 deletions(-)

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index aa523591a44e..c69218c56980 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -37,6 +37,29 @@ void mte_free_tag_storage(char *storage);
 /* track which pages have valid allocation tags */
 #define PG_mte_tagged	PG_arch_2
 
+static inline void set_page_mte_tagged(struct page *page)
+{
+	/*
+	 * Ensure that the tags written prior to this function are visible
+	 * before the page flags update.
+	 */
+	smp_wmb();
+	set_bit(PG_mte_tagged, &page->flags);
+}
+
+static inline bool page_mte_tagged(struct page *page)
+{
+	bool ret = test_bit(PG_mte_tagged, &page->flags);
+
+	/*
+	 * If the page is tagged, ensure ordering with a likely subsequent
+	 * read of the tags.
+	 */
+	if (ret)
+		smp_rmb();
+	return ret;
+}
+
 void mte_zero_clear_page_tags(void *addr);
 void mte_sync_tags(pte_t old_pte, pte_t pte);
 void mte_copy_page_tags(void *kto, const void *kfrom);
@@ -54,6 +77,13 @@ size_t mte_probe_user_range(const char __user *uaddr, size_t size);
 /* unused if !CONFIG_ARM64_MTE, silence the compiler */
 #define PG_mte_tagged	0
 
+static inline void set_page_mte_tagged(struct page *page)
+{
+}
+static inline bool page_mte_tagged(struct page *page)
+{
+	return false;
+}
 static inline void mte_zero_clear_page_tags(void *addr)
 {
 }
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 0b6632f18364..08823669db0a 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -1034,7 +1034,7 @@ static inline void arch_swap_invalidate_area(int type)
 static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio)
 {
 	if (system_supports_mte() && mte_restore_tags(entry, &folio->page))
-		set_bit(PG_mte_tagged, &folio->flags);
+		set_page_mte_tagged(&folio->page);
 }
 
 #endif /* CONFIG_ARM64_MTE */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 8d88433de81d..4478e5774580 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1964,8 +1964,10 @@ static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap)
 	 * Clear the tags in the zero page. This needs to be done via the
 	 * linear map which has the Tagged attribute.
 	 */
-	if (!test_and_set_bit(PG_mte_tagged, &ZERO_PAGE(0)->flags))
+	if (!page_mte_tagged(ZERO_PAGE(0))) {
 		mte_clear_page_tags(lm_alias(empty_zero_page));
+		set_page_mte_tagged(ZERO_PAGE(0));
+	}
 
 	kasan_init_hw_tags_cpu();
 }
diff --git a/arch/arm64/kernel/elfcore.c b/arch/arm64/kernel/elfcore.c
index 98d67444a5b6..f91bb1572d22 100644
--- a/arch/arm64/kernel/elfcore.c
+++ b/arch/arm64/kernel/elfcore.c
@@ -47,7 +47,7 @@ static int mte_dump_tag_range(struct coredump_params *cprm,
 		 * Pages mapped in user space as !pte_access_permitted() (e.g.
 		 * PROT_EXEC only) may not have the PG_mte_tagged flag set.
 		 */
-		if (!test_bit(PG_mte_tagged, &page->flags)) {
+		if (!page_mte_tagged(page)) {
 			put_page(page);
 			dump_skip(cprm, MTE_PAGE_TAG_STORAGE);
 			continue;
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index 2e248342476e..018ae7a25737 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -271,7 +271,7 @@ static int swsusp_mte_save_tags(void)
 			if (!page)
 				continue;
 
-			if (!test_bit(PG_mte_tagged, &page->flags))
+			if (!page_mte_tagged(page))
 				continue;
 
 			ret = save_tags(page, pfn);
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index f6b00743c399..67e82ce4c285 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -41,8 +41,10 @@ static void mte_sync_page_tags(struct page *page, pte_t old_pte,
 	if (check_swap && is_swap_pte(old_pte)) {
 		swp_entry_t entry = pte_to_swp_entry(old_pte);
 
-		if (!non_swap_entry(entry) && mte_restore_tags(entry, page))
+		if (!non_swap_entry(entry) && mte_restore_tags(entry, page)) {
+			set_page_mte_tagged(page);
 			return;
+		}
 	}
 
 	if (!pte_is_tagged)
@@ -58,6 +60,7 @@ static void mte_sync_page_tags(struct page *page, pte_t old_pte,
 	 */
 	smp_wmb();
 	mte_clear_page_tags(page_address(page));
+	set_page_mte_tagged(page);
 }
 
 void mte_sync_tags(pte_t old_pte, pte_t pte)
@@ -73,7 +76,7 @@ void mte_sync_tags(pte_t old_pte, pte_t pte)
 
 	/* if PG_mte_tagged is set, tags have already been initialised */
 	for (i = 0; i < nr_pages; i++, page++) {
-		if (!test_and_set_bit(PG_mte_tagged, &page->flags))
+		if (!page_mte_tagged(page))
 			mte_sync_page_tags(page, old_pte, check_swap,
 					   pte_is_tagged);
 	}
@@ -100,8 +103,7 @@ int memcmp_pages(struct page *page1, struct page *page2)
 	 * pages is tagged, set_pte_at() may zero or change the tags of the
 	 * other page via mte_sync_tags().
 	 */
-	if (test_bit(PG_mte_tagged, &page1->flags) ||
-	    test_bit(PG_mte_tagged, &page2->flags))
+	if (page_mte_tagged(page1) || page_mte_tagged(page2))
 		return addr1 != addr2;
 
 	return ret;
@@ -407,7 +409,7 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
 			put_page(page);
 			break;
 		}
-		WARN_ON_ONCE(!test_bit(PG_mte_tagged, &page->flags));
+		WARN_ON_ONCE(!page_mte_tagged(page));
 
 		/* limit access to the end of the page */
 		offset = offset_in_page(addr);
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 8c607199cad1..3b04e69006b4 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -1058,7 +1058,7 @@ long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
 		maddr = page_address(page);
 
 		if (!write) {
-			if (test_bit(PG_mte_tagged, &page->flags))
+			if (page_mte_tagged(page))
 				num_tags = mte_copy_tags_to_user(tags, maddr,
 							MTE_GRANULES_PER_PAGE);
 			else
@@ -1075,7 +1075,7 @@ long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
 			 * completed fully
 			 */
 			if (num_tags == MTE_GRANULES_PER_PAGE)
-				set_bit(PG_mte_tagged, &page->flags);
+				set_page_mte_tagged(page);
 
 			kvm_release_pfn_dirty(pfn);
 		}
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index f5651a05b6a8..9cfa516452e1 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1075,9 +1075,9 @@ static int sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
 		return -EFAULT;
 
 	for (i = 0; i < nr_pages; i++, page++) {
-		if (!test_bit(PG_mte_tagged, &page->flags)) {
+		if (!page_mte_tagged(page)) {
 			mte_clear_page_tags(page_address(page));
-			set_bit(PG_mte_tagged, &page->flags);
+			set_page_mte_tagged(page);
 		}
 	}
 
diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
index 0dea80bf6de4..f36d796f1bce 100644
--- a/arch/arm64/mm/copypage.c
+++ b/arch/arm64/mm/copypage.c
@@ -21,8 +21,7 @@ void copy_highpage(struct page *to, struct page *from)
 
 	copy_page(kto, kfrom);
 
-	if (system_supports_mte() && test_bit(PG_mte_tagged, &from->flags)) {
-		set_bit(PG_mte_tagged, &to->flags);
+	if (system_supports_mte() && page_mte_tagged(from)) {
 		page_kasan_tag_reset(to);
 		/*
 		 * We need smp_wmb() in between setting the flags and clearing the
@@ -33,6 +32,7 @@ void copy_highpage(struct page *to, struct page *from)
 		 */
 		smp_wmb();
 		mte_copy_page_tags(kto, kfrom);
+		set_page_mte_tagged(to);
 	}
 }
 EXPORT_SYMBOL(copy_highpage);
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index c5e11768e5c1..147fe28d3fbe 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -928,5 +928,5 @@ void tag_clear_highpage(struct page *page)
 {
 	mte_zero_clear_page_tags(page_address(page));
 	page_kasan_tag_reset(page);
-	set_bit(PG_mte_tagged, &page->flags);
+	set_page_mte_tagged(page);
 }
diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c
index a9e50e930484..9d3a8cf388fc 100644
--- a/arch/arm64/mm/mteswap.c
+++ b/arch/arm64/mm/mteswap.c
@@ -24,7 +24,7 @@ int mte_save_tags(struct page *page)
 {
 	void *tag_storage, *ret;
 
-	if (!test_bit(PG_mte_tagged, &page->flags))
+	if (!page_mte_tagged(page))
 		return 0;
 
 	tag_storage = mte_allocate_tag_storage();


* [PATCH 2/4] KVM: arm64: Simplify the sanitise_mte_tags() logic
  2022-07-05 14:26 [PATCH 0/4] arm64: mte: Fix racing on MTE tag initialisation Catalin Marinas
  2022-07-05 14:26 ` [PATCH 1/4] arm64: mte: Fix/clarify the PG_mte_tagged semantics Catalin Marinas
@ 2022-07-05 14:26 ` Catalin Marinas
  2022-07-08 23:00   ` Peter Collingbourne
  2022-07-05 14:26 ` [PATCH 3/4] mm: Add PG_arch_3 page flag Catalin Marinas
  2022-07-05 14:26 ` [PATCH 4/4] arm64: mte: Lock a page for MTE tag initialisation Catalin Marinas
  3 siblings, 1 reply; 9+ messages in thread
From: Catalin Marinas @ 2022-07-05 14:26 UTC (permalink / raw)
  To: Will Deacon, Marc Zyngier, Steven Price, Peter Collingbourne
  Cc: Vincenzo Frascino, linux-arm-kernel

Currently sanitise_mte_tags() checks whether the page is online before
attempting to sanitise the tags. Such detection should be done in the
caller via the VM_MTE_ALLOWED vma flag. Since kvm_set_spte_gfn() does
not have the vma, leave the page unmapped if not already tagged. Tag
initialisation will be done on a subsequent access fault in
user_mem_abort().
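
In sketch form, the resulting kvm_set_spte_gfn() check (as in the hunk
below) is:

	/* Leave the page unmapped if its tags are not yet initialised */
	if (kvm_has_mte(kvm) && !page_mte_tagged(pfn_to_page(pfn)))
		return false;	/* user_mem_abort() will sanitise the tags */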

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Peter Collingbourne <pcc@google.com>
---
 arch/arm64/kvm/mmu.c | 38 ++++++++++++++------------------------
 1 file changed, 14 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 9cfa516452e1..35850f17ae08 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1056,23 +1056,14 @@ static int get_vma_page_shift(struct vm_area_struct *vma, unsigned long hva)
  * - mmap_lock protects between a VM faulting a page in and the VMM performing
  *   an mprotect() to add VM_MTE
  */
-static int sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
-			     unsigned long size)
+static void sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
+			      unsigned long size)
 {
 	unsigned long i, nr_pages = size >> PAGE_SHIFT;
 	struct page *page;
 
 	if (!kvm_has_mte(kvm))
-		return 0;
-
-	/*
-	 * pfn_to_online_page() is used to reject ZONE_DEVICE pages
-	 * that may not support tags.
-	 */
-	page = pfn_to_online_page(pfn);
-
-	if (!page)
-		return -EFAULT;
+		return;
 
 	for (i = 0; i < nr_pages; i++, page++) {
 		if (!page_mte_tagged(page)) {
@@ -1080,8 +1071,6 @@ static int sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
 			set_page_mte_tagged(page);
 		}
 	}
-
-	return 0;
 }
 
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
@@ -1092,7 +1081,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	bool write_fault, writable, force_pte = false;
 	bool exec_fault;
 	bool device = false;
-	bool shared;
 	unsigned long mmu_seq;
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache;
@@ -1142,8 +1130,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		vma_shift = get_vma_page_shift(vma, hva);
 	}
 
-	shared = (vma->vm_flags & VM_SHARED);
-
 	switch (vma_shift) {
 #ifndef __PAGETABLE_PMD_FOLDED
 	case PUD_SHIFT:
@@ -1264,12 +1250,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 
 	if (fault_status != FSC_PERM && !device && kvm_has_mte(kvm)) {
 		/* Check the VMM hasn't introduced a new VM_SHARED VMA */
-		if (!shared)
-			ret = sanitise_mte_tags(kvm, pfn, vma_pagesize);
-		else
+		if ((vma->vm_flags & VM_MTE_ALLOWED) &&
+		    !(vma->vm_flags & VM_SHARED)) {
+			sanitise_mte_tags(kvm, pfn, vma_pagesize);
+		} else {
 			ret = -EFAULT;
-		if (ret)
 			goto out_unlock;
+		}
 	}
 
 	if (writable)
@@ -1491,15 +1478,18 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
 bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 {
 	kvm_pfn_t pfn = pte_pfn(range->pte);
-	int ret;
 
 	if (!kvm->arch.mmu.pgt)
 		return false;
 
 	WARN_ON(range->end - range->start != 1);
 
-	ret = sanitise_mte_tags(kvm, pfn, PAGE_SIZE);
-	if (ret)
+	/*
+	 * If the page isn't tagged, defer to user_mem_abort() for sanitising
+	 * the MTE tags. The S2 pte should have been unmapped by
+	 * mmu_notifier_invalidate_range_end().
+	 */
+	if (kvm_has_mte(kvm) && !page_mte_tagged(pfn_to_page(pfn)))
 		return false;
 
 	/*


* [PATCH 3/4] mm: Add PG_arch_3 page flag
  2022-07-05 14:26 [PATCH 0/4] arm64: mte: Fix racing on MTE tag initialisation Catalin Marinas
  2022-07-05 14:26 ` [PATCH 1/4] arm64: mte: Fix/clarify the PG_mte_tagged semantics Catalin Marinas
  2022-07-05 14:26 ` [PATCH 2/4] KVM: arm64: Simplify the sanitise_mte_tags() logic Catalin Marinas
@ 2022-07-05 14:26 ` Catalin Marinas
  2022-07-05 14:26 ` [PATCH 4/4] arm64: mte: Lock a page for MTE tag initialisation Catalin Marinas
  3 siblings, 0 replies; 9+ messages in thread
From: Catalin Marinas @ 2022-07-05 14:26 UTC (permalink / raw)
  To: Will Deacon, Marc Zyngier, Steven Price, Peter Collingbourne
  Cc: Vincenzo Frascino, linux-arm-kernel

From: Peter Collingbourne <pcc@google.com>

As with PG_arch_2, this flag is only allowed on 64-bit architectures due
to the shortage of bits available. It will be used by the arm64 MTE code
in subsequent patches.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steven Price <steven.price@arm.com>
[catalin.marinas@arm.com: added flag preserving in __split_huge_page_tail()]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 fs/proc/page.c                 | 1 +
 include/linux/page-flags.h     | 1 +
 include/trace/events/mmflags.h | 7 ++++---
 mm/huge_memory.c               | 1 +
 4 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/proc/page.c b/fs/proc/page.c
index a2873a617ae8..438b8aa7249d 100644
--- a/fs/proc/page.c
+++ b/fs/proc/page.c
@@ -220,6 +220,7 @@ u64 stable_page_flags(struct page *page)
 	u |= kpf_copy_bit(k, KPF_ARCH,		PG_arch_1);
 #ifdef CONFIG_64BIT
 	u |= kpf_copy_bit(k, KPF_ARCH_2,	PG_arch_2);
+	u |= kpf_copy_bit(k, KPF_ARCH_2,	PG_arch_3);
 #endif
 
 	return u;
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index e66f7aa3191d..447cdd4b1311 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -134,6 +134,7 @@ enum pageflags {
 #endif
 #ifdef CONFIG_64BIT
 	PG_arch_2,
+	PG_arch_3,
 #endif
 #ifdef CONFIG_KASAN_HW_TAGS
 	PG_skip_kasan_poison,
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index e87cb2b80ed3..ecf7ae0de3d8 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -92,9 +92,9 @@
 #endif
 
 #ifdef CONFIG_64BIT
-#define IF_HAVE_PG_ARCH_2(flag,string) ,{1UL << flag, string}
+#define IF_HAVE_PG_ARCH_2_3(flag,string) ,{1UL << flag, string}
 #else
-#define IF_HAVE_PG_ARCH_2(flag,string)
+#define IF_HAVE_PG_ARCH_2_3(flag,string)
 #endif
 
 #ifdef CONFIG_KASAN_HW_TAGS
@@ -130,7 +130,8 @@ IF_HAVE_PG_UNCACHED(PG_uncached,	"uncached"	)		\
 IF_HAVE_PG_HWPOISON(PG_hwpoison,	"hwpoison"	)		\
 IF_HAVE_PG_IDLE(PG_young,		"young"		)		\
 IF_HAVE_PG_IDLE(PG_idle,		"idle"		)		\
-IF_HAVE_PG_ARCH_2(PG_arch_2,		"arch_2"	)		\
+IF_HAVE_PG_ARCH_2_3(PG_arch_2,		"arch_2"	)		\
+IF_HAVE_PG_ARCH_2_3(PG_arch_3,		"arch_3"	)		\
 IF_HAVE_PG_SKIP_KASAN_POISON(PG_skip_kasan_poison, "skip_kasan_poison")
 
 #define show_page_flags(flags)						\
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 834f288b3769..e8fda2c28256 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2369,6 +2369,7 @@ static void __split_huge_page_tail(struct page *head, int tail,
 			 (1L << PG_unevictable) |
 #ifdef CONFIG_64BIT
 			 (1L << PG_arch_2) |
+			 (1L << PG_arch_3) |
 #endif
 			 (1L << PG_dirty)));
 


* [PATCH 4/4] arm64: mte: Lock a page for MTE tag initialisation
  2022-07-05 14:26 [PATCH 0/4] arm64: mte: Fix racing on MTE tag initialisation Catalin Marinas
                   ` (2 preceding siblings ...)
  2022-07-05 14:26 ` [PATCH 3/4] mm: Add PG_arch_3 page flag Catalin Marinas
@ 2022-07-05 14:26 ` Catalin Marinas
  2022-07-08 23:11   ` Peter Collingbourne
  3 siblings, 1 reply; 9+ messages in thread
From: Catalin Marinas @ 2022-07-05 14:26 UTC (permalink / raw)
  To: Will Deacon, Marc Zyngier, Steven Price, Peter Collingbourne
  Cc: Vincenzo Frascino, linux-arm-kernel

Initialising the tags and setting the PG_mte_tagged flag for a page can
race between multiple set_pte_at() calls on shared pages or with the
stage 2 pte being set via user_mem_abort(). Introduce a new PG_mte_lock
flag as PG_arch_3 and set it before attempting page initialisation.
Given that PG_mte_tagged is never cleared for a page, setting this flag
also means the page is unlocked; wait on this bit with acquire semantics
if the page is locked:

- try_page_mte_tagging() - lock the page for tagging, return true if it
  can be tagged, false if already tagged. No acquire semantics if it
  returns true (PG_mte_tagged not set) as there is no serialisation with
  a previous set_page_mte_tagged().

- set_page_mte_tagged() - set PG_mte_tagged with release semantics.

The two-bit locking is based on Peter Colingbourne's idea.
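
In sketch form, the tag initialisation pattern with the new lock (as in
the hunks below) becomes:

	if (try_page_mte_tagging(page)) {
		/* page locked and not yet tagged: initialise the tags */
		mte_clear_page_tags(page_address(page));
		set_page_mte_tagged(page);	/* release; page now "unlocked" */
	}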

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Peter Collingbourne <pcc@google.com>
---
 arch/arm64/include/asm/mte.h     | 32 ++++++++++++++++++++++++++++++++
 arch/arm64/include/asm/pgtable.h |  1 +
 arch/arm64/kernel/cpufeature.c   |  2 +-
 arch/arm64/kernel/mte.c          |  7 +++++--
 arch/arm64/kvm/guest.c           | 16 ++++++++++------
 arch/arm64/kvm/mmu.c             |  2 +-
 arch/arm64/mm/copypage.c         |  2 ++
 arch/arm64/mm/fault.c            |  2 ++
 arch/arm64/mm/mteswap.c          |  3 +++
 9 files changed, 57 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index c69218c56980..29712fc9df8c 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -36,6 +36,8 @@ void mte_free_tag_storage(char *storage);
 
 /* track which pages have valid allocation tags */
 #define PG_mte_tagged	PG_arch_2
+/* simple lock to avoid multiple threads tagging the same page */
+#define PG_mte_lock	PG_arch_3
 
 static inline void set_page_mte_tagged(struct page *page)
 {
@@ -60,6 +62,32 @@ static inline bool page_mte_tagged(struct page *page)
 	return ret;
 }
 
+/*
+ * Lock the page for tagging and return 'true' if the page can be tagged,
+ * 'false' if already tagged. PG_mte_tagged is never cleared and therefore the
+ * locking only happens once for page initialisation.
+ *
+ * The page MTE lock state:
+ *
+ *   Locked:	PG_mte_lock && !PG_mte_tagged
+ *   Unlocked:	!PG_mte_lock || PG_mte_tagged
+ *
+ * Acquire semantics only if the page is tagged (returning 'false').
+ */
+static inline bool try_page_mte_tagging(struct page *page)
+{
+	if (!test_and_set_bit(PG_mte_lock, &page->flags))
+		return !page_mte_tagged(page);
+
+	/*
+	 * The tags are being initialised, wait for the PG_mte_tagged flag to
+	 * be set.
+	 */
+	smp_cond_load_acquire(&page->flags, VAL & (1UL << PG_mte_tagged));
+
+	return false;
+}
+
 void mte_zero_clear_page_tags(void *addr);
 void mte_sync_tags(pte_t old_pte, pte_t pte);
 void mte_copy_page_tags(void *kto, const void *kfrom);
@@ -84,6 +112,10 @@ static inline bool page_mte_tagged(struct page *page)
 {
 	return false;
 }
+static inline bool try_page_mte_tagging(struct page *page)
+{
+	return false;
+}
 static inline void mte_zero_clear_page_tags(void *addr)
 {
 }
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 08823669db0a..ce2dc72f64f4 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -1033,6 +1033,7 @@ static inline void arch_swap_invalidate_area(int type)
 #define __HAVE_ARCH_SWAP_RESTORE
 static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio)
 {
+	/* mte_restore_tags() takes the PG_mte_lock */
 	if (system_supports_mte() && mte_restore_tags(entry, &folio->page))
 		set_page_mte_tagged(&folio->page);
 }
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 4478e5774580..c07dd7916517 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1964,7 +1964,7 @@ static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap)
 	 * Clear the tags in the zero page. This needs to be done via the
 	 * linear map which has the Tagged attribute.
 	 */
-	if (!page_mte_tagged(ZERO_PAGE(0))) {
+	if (try_page_mte_tagging(ZERO_PAGE(0))) {
 		mte_clear_page_tags(lm_alias(empty_zero_page));
 		set_page_mte_tagged(ZERO_PAGE(0));
 	}
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 67e82ce4c285..54d284a1e0a7 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -41,6 +41,7 @@ static void mte_sync_page_tags(struct page *page, pte_t old_pte,
 	if (check_swap && is_swap_pte(old_pte)) {
 		swp_entry_t entry = pte_to_swp_entry(old_pte);
 
+		/* mte_restore_tags() takes the PG_mte_lock */
 		if (!non_swap_entry(entry) && mte_restore_tags(entry, page)) {
 			set_page_mte_tagged(page);
 			return;
@@ -59,8 +60,10 @@ static void mte_sync_page_tags(struct page *page, pte_t old_pte,
 	 * the new page->flags are visible before the tags were updated.
 	 */
 	smp_wmb();
-	mte_clear_page_tags(page_address(page));
-	set_page_mte_tagged(page);
+	if (try_page_mte_tagging(page)) {
+		mte_clear_page_tags(page_address(page));
+		set_page_mte_tagged(page);
+	}
 }
 
 void mte_sync_tags(pte_t old_pte, pte_t pte)
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 3b04e69006b4..059b38e7a9e8 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -1067,15 +1067,19 @@ long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
 					clear_user(tags, MTE_GRANULES_PER_PAGE);
 			kvm_release_pfn_clean(pfn);
 		} else {
+			/*
+			 * Only locking to serialise with a concurrent
+			 * set_pte_at() in the VMM but still overriding the
+			 * tags, hence ignoring the return value.
+			 */
+			try_page_mte_tagging(page);
 			num_tags = mte_copy_tags_from_user(maddr, tags,
 							MTE_GRANULES_PER_PAGE);
 
-			/*
-			 * Set the flag after checking the write
-			 * completed fully
-			 */
-			if (num_tags == MTE_GRANULES_PER_PAGE)
-				set_page_mte_tagged(page);
+			/* uaccess failed, don't leave stale tags */
+			if (num_tags != MTE_GRANULES_PER_PAGE)
+				mte_clear_page_tags(page);
+			set_page_mte_tagged(page);
 
 			kvm_release_pfn_dirty(pfn);
 		}
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 35850f17ae08..fdd46089f260 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1066,7 +1066,7 @@ static void sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
 		return;
 
 	for (i = 0; i < nr_pages; i++, page++) {
-		if (!page_mte_tagged(page)) {
+		if (try_page_mte_tagging(page)) {
 			mte_clear_page_tags(page_address(page));
 			set_page_mte_tagged(page);
 		}
diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
index f36d796f1bce..9c73bc020894 100644
--- a/arch/arm64/mm/copypage.c
+++ b/arch/arm64/mm/copypage.c
@@ -31,6 +31,8 @@ void copy_highpage(struct page *to, struct page *from)
 		 * the new page->flags are visible before the tags were updated.
 		 */
 		smp_wmb();
+		/* It's a new page, shouldn't have been tagged yet */
+		WARN_ON_ONCE(!try_page_mte_tagging(to));
 		mte_copy_page_tags(kto, kfrom);
 		set_page_mte_tagged(to);
 	}
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 147fe28d3fbe..bd28d6bd9286 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -926,6 +926,8 @@ struct page *alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
 
 void tag_clear_highpage(struct page *page)
 {
+	/* Newly allocated page, shouldn't have been tagged yet */
+	WARN_ON_ONCE(!try_page_mte_tagging(page));
 	mte_zero_clear_page_tags(page_address(page));
 	page_kasan_tag_reset(page);
 	set_page_mte_tagged(page);
diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c
index 9d3a8cf388fc..aec76a4423e9 100644
--- a/arch/arm64/mm/mteswap.c
+++ b/arch/arm64/mm/mteswap.c
@@ -62,6 +62,9 @@ bool mte_restore_tags(swp_entry_t entry, struct page *page)
 	 * the new page->flags are visible before the tags were updated.
 	 */
 	smp_wmb();
+	/* racing tag restoring? */
+	if (!try_page_mte_tagging(page))
+		return false;
 	mte_restore_page_tags(page_address(page), tags);
 
 	return true;


* Re: [PATCH 2/4] KVM: arm64: Simplify the sanitise_mte_tags() logic
  2022-07-05 14:26 ` [PATCH 2/4] KVM: arm64: Simplify the sanitise_mte_tags() logic Catalin Marinas
@ 2022-07-08 23:00   ` Peter Collingbourne
  2022-09-01 10:42     ` Catalin Marinas
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Collingbourne @ 2022-07-08 23:00 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, Marc Zyngier, Steven Price, Vincenzo Frascino, Linux ARM

On Tue, Jul 5, 2022 at 7:26 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> Currently sanitise_mte_tags() checks if it's an online page before
> attempting to sanitise the tags. Such detection should be done in the
> caller via the VM_MTE_ALLOWED vma flag. Since kvm_set_spte_gfn() does
> not have the vma, leave the page unmapped if not already tagged. Tag
> initialisation will be done on a subsequent access fault in
> user_mem_abort().
>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Peter Collingbourne <pcc@google.com>
> ---
>  arch/arm64/kvm/mmu.c | 38 ++++++++++++++------------------------
>  1 file changed, 14 insertions(+), 24 deletions(-)
>
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 9cfa516452e1..35850f17ae08 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1056,23 +1056,14 @@ static int get_vma_page_shift(struct vm_area_struct *vma, unsigned long hva)
>   * - mmap_lock protects between a VM faulting a page in and the VMM performing
>   *   an mprotect() to add VM_MTE
>   */
> -static int sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
> -                            unsigned long size)
> +static void sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
> +                             unsigned long size)
>  {
>         unsigned long i, nr_pages = size >> PAGE_SHIFT;
>         struct page *page;

Did you intend to change this to "struct page *page =
pfn_to_page(pfn);"? As things are, I get a kernel panic if I try to
start a VM with MTE enabled. The VM boots after making my suggested
change though.

Peter

>
>         if (!kvm_has_mte(kvm))
> -               return 0;
> -
> -       /*
> -        * pfn_to_online_page() is used to reject ZONE_DEVICE pages
> -        * that may not support tags.
> -        */
> -       page = pfn_to_online_page(pfn);
> -
> -       if (!page)
> -               return -EFAULT;
> +               return;
>
>         for (i = 0; i < nr_pages; i++, page++) {
>                 if (!page_mte_tagged(page)) {
> @@ -1080,8 +1071,6 @@ static int sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
>                         set_page_mte_tagged(page);
>                 }
>         }
> -
> -       return 0;
>  }
>
>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> @@ -1092,7 +1081,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>         bool write_fault, writable, force_pte = false;
>         bool exec_fault;
>         bool device = false;
> -       bool shared;
>         unsigned long mmu_seq;
>         struct kvm *kvm = vcpu->kvm;
>         struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache;
> @@ -1142,8 +1130,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>                 vma_shift = get_vma_page_shift(vma, hva);
>         }
>
> -       shared = (vma->vm_flags & VM_SHARED);
> -
>         switch (vma_shift) {
>  #ifndef __PAGETABLE_PMD_FOLDED
>         case PUD_SHIFT:
> @@ -1264,12 +1250,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>
>         if (fault_status != FSC_PERM && !device && kvm_has_mte(kvm)) {
>                 /* Check the VMM hasn't introduced a new VM_SHARED VMA */
> -               if (!shared)
> -                       ret = sanitise_mte_tags(kvm, pfn, vma_pagesize);
> -               else
> +               if ((vma->vm_flags & VM_MTE_ALLOWED) &&
> +                   !(vma->vm_flags & VM_SHARED)) {
> +                       sanitise_mte_tags(kvm, pfn, vma_pagesize);
> +               } else {
>                         ret = -EFAULT;
> -               if (ret)
>                         goto out_unlock;
> +               }
>         }
>
>         if (writable)
> @@ -1491,15 +1478,18 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
>  bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
>  {
>         kvm_pfn_t pfn = pte_pfn(range->pte);
> -       int ret;
>
>         if (!kvm->arch.mmu.pgt)
>                 return false;
>
>         WARN_ON(range->end - range->start != 1);
>
> -       ret = sanitise_mte_tags(kvm, pfn, PAGE_SIZE);
> -       if (ret)
> +       /*
> +        * If the page isn't tagged, defer to user_mem_abort() for sanitising
> +        * the MTE tags. The S2 pte should have been unmapped by
> +        * mmu_notifier_invalidate_range_end().
> +        */
> +       if (kvm_has_mte(kvm) && !page_mte_tagged(pfn_to_page(pfn)))
>                 return false;
>
>         /*


* Re: [PATCH 4/4] arm64: mte: Lock a page for MTE tag initialisation
  2022-07-05 14:26 ` [PATCH 4/4] arm64: mte: Lock a page for MTE tag initialisation Catalin Marinas
@ 2022-07-08 23:11   ` Peter Collingbourne
  2022-09-01 12:15     ` Catalin Marinas
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Collingbourne @ 2022-07-08 23:11 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, Marc Zyngier, Steven Price, Vincenzo Frascino, Linux ARM

On Tue, Jul 5, 2022 at 7:26 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> Initialising the tags and setting PG_mte_tagged flag for a page can race
> between multiple set_pte_at() on shared pages or setting the stage 2 pte
> via user_mem_abort(). Introduce a new PG_mte_lock flag as PG_arch_3 and
> set it before attempting page initialisation. Given that PG_mte_tagged
> is never cleared for a page, consider setting this flag to mean page
> unlocked and wait on this bit with acquire semantics if the page is
> locked:
>
> - try_page_mte_tagging() - lock the page for tagging, return true if it
>   can be tagged, false if already tagged. No acquire semantics if it
>   returns true (PG_mte_tagged not set) as there is no serialisation with
>   a previous set_page_mte_tagged().
>
> - set_page_mte_tagged() - set PG_mte_tagged with release semantics.
>
> The two-bit locking is based on Peter Colingbourne's idea.

nit: Collingbourne (two l's).

>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Peter Collingbourne <pcc@google.com>
> ---
>  arch/arm64/include/asm/mte.h     | 32 ++++++++++++++++++++++++++++++++
>  arch/arm64/include/asm/pgtable.h |  1 +
>  arch/arm64/kernel/cpufeature.c   |  2 +-
>  arch/arm64/kernel/mte.c          |  7 +++++--
>  arch/arm64/kvm/guest.c           | 16 ++++++++++------
>  arch/arm64/kvm/mmu.c             |  2 +-
>  arch/arm64/mm/copypage.c         |  2 ++
>  arch/arm64/mm/fault.c            |  2 ++
>  arch/arm64/mm/mteswap.c          |  3 +++
>  9 files changed, 57 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
> index c69218c56980..29712fc9df8c 100644
> --- a/arch/arm64/include/asm/mte.h
> +++ b/arch/arm64/include/asm/mte.h
> @@ -36,6 +36,8 @@ void mte_free_tag_storage(char *storage);
>
>  /* track which pages have valid allocation tags */
>  #define PG_mte_tagged  PG_arch_2
> +/* simple lock to avoid multiple threads tagging the same page */
> +#define PG_mte_lock    PG_arch_3
>
>  static inline void set_page_mte_tagged(struct page *page)
>  {
> @@ -60,6 +62,32 @@ static inline bool page_mte_tagged(struct page *page)
>         return ret;
>  }
>
> +/*
> + * Lock the page for tagging and return 'true' if the page can be tagged,
> + * 'false' if already tagged. PG_mte_tagged is never cleared and therefore the
> + * locking only happens once for page initialisation.
> + *
> + * The page MTE lock state:
> + *
> + *   Locked:   PG_mte_lock && !PG_mte_tagged
> + *   Unlocked: !PG_mte_lock || PG_mte_tagged
> + *
> + * Acquire semantics only if the page is tagged (returning 'false').
> + */
> +static inline bool try_page_mte_tagging(struct page *page)
> +{
> +       if (!test_and_set_bit(PG_mte_lock, &page->flags))
> +               return !page_mte_tagged(page);

Since all callers of set_page_mte_tagged() are now dominated by a call
to try_page_mte_tagging() and PG_mte_lock is never cleared I think we
can't end up in the state where !PG_mte_lock && PG_mte_tagged. So I
think this can be simplified to "return true;". I can still boot VMs
with MTE enabled after making my suggested change.
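
As a sketch, the simplification suggested above (assuming PG_mte_lock is
indeed never cleared) would look like:

	static inline bool try_page_mte_tagging(struct page *page)
	{
		if (!test_and_set_bit(PG_mte_lock, &page->flags))
			return true;	/* never locked, so cannot be tagged yet */

		/* wait for the tag initialisation to complete */
		smp_cond_load_acquire(&page->flags, VAL & (1UL << PG_mte_tagged));

		return false;
	}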

> +
> +       /*
> +        * The tags are being initialised, wait for the PG_mte_tagged flag to

I think at this point the tags are either being initialized or have
already been initialized, so the comment isn't quite right.

Peter

> +        * be set.
> +        */
> +       smp_cond_load_acquire(&page->flags, VAL & (1UL << PG_mte_tagged));
> +
> +       return false;
> +}
> +
>  void mte_zero_clear_page_tags(void *addr);
>  void mte_sync_tags(pte_t old_pte, pte_t pte);
>  void mte_copy_page_tags(void *kto, const void *kfrom);
> @@ -84,6 +112,10 @@ static inline bool page_mte_tagged(struct page *page)
>  {
>         return false;
>  }
> +static inline bool try_page_mte_tagging(struct page *page)
> +{
> +       return false;
> +}
>  static inline void mte_zero_clear_page_tags(void *addr)
>  {
>  }
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 08823669db0a..ce2dc72f64f4 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -1033,6 +1033,7 @@ static inline void arch_swap_invalidate_area(int type)
>  #define __HAVE_ARCH_SWAP_RESTORE
>  static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio)
>  {
> +       /* mte_restore_tags() takes the PG_mte_lock */
>         if (system_supports_mte() && mte_restore_tags(entry, &folio->page))
>                 set_page_mte_tagged(&folio->page);
>  }
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 4478e5774580..c07dd7916517 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1964,7 +1964,7 @@ static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap)
>          * Clear the tags in the zero page. This needs to be done via the
>          * linear map which has the Tagged attribute.
>          */
> -       if (!page_mte_tagged(ZERO_PAGE(0))) {
> +       if (try_page_mte_tagging(ZERO_PAGE(0))) {
>                 mte_clear_page_tags(lm_alias(empty_zero_page));
>                 set_page_mte_tagged(ZERO_PAGE(0));
>         }
> diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> index 67e82ce4c285..54d284a1e0a7 100644
> --- a/arch/arm64/kernel/mte.c
> +++ b/arch/arm64/kernel/mte.c
> @@ -41,6 +41,7 @@ static void mte_sync_page_tags(struct page *page, pte_t old_pte,
>         if (check_swap && is_swap_pte(old_pte)) {
>                 swp_entry_t entry = pte_to_swp_entry(old_pte);
>
> +               /* mte_restore_tags() takes the PG_mte_lock */
>                 if (!non_swap_entry(entry) && mte_restore_tags(entry, page)) {
>                         set_page_mte_tagged(page);
>                         return;
> @@ -59,8 +60,10 @@ static void mte_sync_page_tags(struct page *page, pte_t old_pte,
>          * the new page->flags are visible before the tags were updated.
>          */
>         smp_wmb();
> -       mte_clear_page_tags(page_address(page));
> -       set_page_mte_tagged(page);
> +       if (try_page_mte_tagging(page)) {
> +               mte_clear_page_tags(page_address(page));
> +               set_page_mte_tagged(page);
> +       }
>  }
>
>  void mte_sync_tags(pte_t old_pte, pte_t pte)
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 3b04e69006b4..059b38e7a9e8 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -1067,15 +1067,19 @@ long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
>                                         clear_user(tags, MTE_GRANULES_PER_PAGE);
>                         kvm_release_pfn_clean(pfn);
>                 } else {
> +                       /*
> +                        * Only locking to serialise with a concurrent
> +                        * set_pte_at() in the VMM but still overriding the
> +                        * tags, hence ignoring the return value.
> +                        */
> +                       try_page_mte_tagging(page);
>                         num_tags = mte_copy_tags_from_user(maddr, tags,
>                                                         MTE_GRANULES_PER_PAGE);
>
> -                       /*
> -                        * Set the flag after checking the write
> -                        * completed fully
> -                        */
> -                       if (num_tags == MTE_GRANULES_PER_PAGE)
> -                               set_page_mte_tagged(page);
> +                       /* uaccess failed, don't leave stale tags */
> +                       if (num_tags != MTE_GRANULES_PER_PAGE)
> +                               mte_clear_page_tags(page);
> +                       set_page_mte_tagged(page);
>
>                         kvm_release_pfn_dirty(pfn);
>                 }
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 35850f17ae08..fdd46089f260 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1066,7 +1066,7 @@ static void sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
>                 return;
>
>         for (i = 0; i < nr_pages; i++, page++) {
> -               if (!page_mte_tagged(page)) {
> +               if (try_page_mte_tagging(page)) {
>                         mte_clear_page_tags(page_address(page));
>                         set_page_mte_tagged(page);
>                 }
> diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
> index f36d796f1bce..9c73bc020894 100644
> --- a/arch/arm64/mm/copypage.c
> +++ b/arch/arm64/mm/copypage.c
> @@ -31,6 +31,8 @@ void copy_highpage(struct page *to, struct page *from)
>                  * the new page->flags are visible before the tags were updated.
>                  */
>                 smp_wmb();
> +               /* It's a new page, shouldn't have been tagged yet */
> +               WARN_ON_ONCE(!try_page_mte_tagging(to));
>                 mte_copy_page_tags(kto, kfrom);
>                 set_page_mte_tagged(to);
>         }
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 147fe28d3fbe..bd28d6bd9286 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -926,6 +926,8 @@ struct page *alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
>
>  void tag_clear_highpage(struct page *page)
>  {
> +       /* Newly allocated page, shouldn't have been tagged yet */
> +       WARN_ON_ONCE(!try_page_mte_tagging(page));
>         mte_zero_clear_page_tags(page_address(page));
>         page_kasan_tag_reset(page);
>         set_page_mte_tagged(page);
> diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c
> index 9d3a8cf388fc..aec76a4423e9 100644
> --- a/arch/arm64/mm/mteswap.c
> +++ b/arch/arm64/mm/mteswap.c
> @@ -62,6 +62,9 @@ bool mte_restore_tags(swp_entry_t entry, struct page *page)
>          * the new page->flags are visible before the tags were updated.
>          */
>         smp_wmb();
> +       /* racing tag restoring? */
> +       if (!try_page_mte_tagging(page))
> +               return false;
>         mte_restore_page_tags(page_address(page), tags);
>
>         return true;


* Re: [PATCH 2/4] KVM: arm64: Simplify the sanitise_mte_tags() logic
  2022-07-08 23:00   ` Peter Collingbourne
@ 2022-09-01 10:42     ` Catalin Marinas
  0 siblings, 0 replies; 9+ messages in thread
From: Catalin Marinas @ 2022-09-01 10:42 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: Will Deacon, Marc Zyngier, Steven Price, Vincenzo Frascino, Linux ARM

On Fri, Jul 08, 2022 at 04:00:01PM -0700, Peter Collingbourne wrote:
> On Tue, Jul 5, 2022 at 7:26 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > index 9cfa516452e1..35850f17ae08 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -1056,23 +1056,14 @@ static int get_vma_page_shift(struct vm_area_struct *vma, unsigned long hva)
> >   * - mmap_lock protects between a VM faulting a page in and the VMM performing
> >   *   an mprotect() to add VM_MTE
> >   */
> > -static int sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
> > -                            unsigned long size)
> > +static void sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
> > +                             unsigned long size)
> >  {
> >         unsigned long i, nr_pages = size >> PAGE_SHIFT;
> >         struct page *page;
> 
> Did you intend to change this to "struct page *page =
> pfn_to_page(pfn);"? As things are, I get a kernel panic if I try to
> start a VM with MTE enabled. The VM boots after making my suggested
> change though.

Yes, indeed. I think you fixed it when reposting together with the other
patches.

Sorry for the delay, too much holiday this summer ;).

-- 
Catalin


* Re: [PATCH 4/4] arm64: mte: Lock a page for MTE tag initialisation
  2022-07-08 23:11   ` Peter Collingbourne
@ 2022-09-01 12:15     ` Catalin Marinas
  0 siblings, 0 replies; 9+ messages in thread
From: Catalin Marinas @ 2022-09-01 12:15 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: Will Deacon, Marc Zyngier, Steven Price, Vincenzo Frascino, Linux ARM

On Fri, Jul 08, 2022 at 04:11:59PM -0700, Peter Collingbourne wrote:
> On Tue, Jul 5, 2022 at 7:26 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > @@ -60,6 +62,32 @@ static inline bool page_mte_tagged(struct page *page)
> >         return ret;
> >  }
> >
> > +/*
> > + * Lock the page for tagging and return 'true' if the page can be tagged,
> > + * 'false' if already tagged. PG_mte_tagged is never cleared and therefore the
> > + * locking only happens once for page initialisation.
> > + *
> > + * The page MTE lock state:
> > + *
> > + *   Locked:   PG_mte_lock && !PG_mte_tagged
> > + *   Unlocked: !PG_mte_lock || PG_mte_tagged
> > + *
> > + * Acquire semantics only if the page is tagged (returning 'false').
> > + */
> > +static inline bool try_page_mte_tagging(struct page *page)
> > +{
> > +       if (!test_and_set_bit(PG_mte_lock, &page->flags))
> > +               return !page_mte_tagged(page);
> 
> Since all callers of set_page_mte_tagged() are now dominated by a call
> to try_page_mte_tagging() and PG_mte_lock is never cleared I think we
> can't end up in the state where !PG_mte_lock && PG_mte_tagged. So I
> think this can be simplified to "return true;". I can still boot VMs
> with MTE enabled after making my suggested change.

Correct. Not sure why I complicated this since the "Unlocked"
description above states that try_page_mte_tagging() should return
"unlocked" if !PG_mte_lock, so no need for the PG_mte_tagged check.

> > +
> > +       /*
> > +        * The tags are being initialised, wait for the PG_mte_tagged flag to
> 
> I think at this point the tags are either being initialized or have
> already been initialized, so the comment isn't quite right.

Yeah, they may have been initialised already by the time we got here and
smp_cond_load_acquire() would just return immediately. I was too lazy to
write all the use-cases here.

-- 
Catalin

