linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/8] Create 'old' ptes for faultaround mappings on arm64 with hardware access flag
@ 2021-01-20 17:36 Will Deacon
  2021-01-20 17:36 ` [PATCH v4 1/8] mm: Cleanup faultaround and finish_fault() codepaths Will Deacon
                   ` (8 more replies)
  0 siblings, 9 replies; 24+ messages in thread
From: Will Deacon @ 2021-01-20 17:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Linus Torvalds, Vinayak Menon, Hugh Dickins, Nick Desaulniers,
	kernel-team

Hi all,

This is version four of the patches I previously posted here:

  v1: https://lore.kernel.org/r/20201209163950.8494-1-will@kernel.org
  v2: https://lore.kernel.org/r/20210108171517.5290-1-will@kernel.org
  v3: https://lore.kernel.org/r/20210114175934.13070-1-will@kernel.org

The patches allow architectures to opt-in at runtime for faultaround
mappings to be created as 'old' instead of 'young'. Although there have
been previous attempts at this, they failed either because the decision
was deferred to userspace [1] or because it was done unconditionally and
shown to regress benchmarks for particular architectures [2].

The big change since v3 is that the immutable fields of 'struct vm_fault'
now live in a 'const' anonymous struct. Although Clang will silently
accept modifications to these fields [3], GCC emits an error. The
resulting diffstat is _considerably_ more manageable with this approach.

As before, I've also updated this branch:

  https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=faultaround

Cheers,

Will

[1] https://www.spinics.net/lists/linux-mm/msg143831.html
[2] 315d09bf30c2 ("Revert "mm: make faultaround produce old ptes"")
[3] https://bugs.llvm.org/show_bug.cgi?id=48755

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: <kernel-team@android.com>

--->8

Kirill A. Shutemov (1):
  mm: Cleanup faultaround and finish_fault() codepaths

Will Deacon (7):
  mm: Allow architectures to request 'old' entries when prefaulting
  arm64: mm: Implement arch_wants_old_prefaulted_pte()
  mm: Move immutable fields of 'struct vm_fault' into anonymous struct
  mm: Pass 'address' to map to do_set_pte() and drop FAULT_FLAG_PREFAULT
  mm: Avoid modifying vmf.address in __collapse_huge_page_swapin()
  mm: Use static initialisers for immutable fields of 'struct vm_fault'
  mm: Mark anonymous struct field of 'struct vm_fault' as 'const'

 arch/arm64/include/asm/pgtable.h |  12 +-
 fs/xfs/xfs_file.c                |   6 +-
 include/linux/mm.h               |  25 ++--
 include/linux/pgtable.h          |  11 ++
 mm/filemap.c                     | 178 ++++++++++++++++++------
 mm/khugepaged.c                  |  37 +++--
 mm/memory.c                      | 223 +++++++++++--------------------
 mm/shmem.c                       |   6 +-
 mm/swapfile.c                    |  11 +-
 9 files changed, 280 insertions(+), 229 deletions(-)

-- 
2.30.0.284.gd98b1dd5eaa7-goog



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v4 1/8] mm: Cleanup faultaround and finish_fault() codepaths
  2021-01-20 17:36 [PATCH v4 0/8] Create 'old' ptes for faultaround mappings on arm64 with hardware access flag Will Deacon
@ 2021-01-20 17:36 ` Will Deacon
  2021-01-20 17:36 ` [PATCH v4 2/8] mm: Allow architectures to request 'old' entries when prefaulting Will Deacon
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Will Deacon @ 2021-01-20 17:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Linus Torvalds, Vinayak Menon, Hugh Dickins, Nick Desaulniers,
	kernel-team

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

alloc_set_pte() has two users with different requirements: in the
faultaround code, it called from an atomic context and PTE page table
has to be preallocated. finish_fault() can sleep and allocate page table
as needed.

PTL locking rules are also strange, hard to follow and overkill for
finish_fault().

Let's untangle the mess. alloc_set_pte() has gone now. All locking is
explicit.

The price is some code duplication to handle huge pages in faultaround
path, but it should be fine, having overall improvement in readability.

Link: https://lore.kernel.org/r/20201229132819.najtavneutnf7ajp@box
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
[will: s/from from/from/ in comment; spotted by willy]
Signed-off-by: Will Deacon <will@kernel.org>
---
 fs/xfs/xfs_file.c       |   6 +-
 include/linux/mm.h      |  12 ++-
 include/linux/pgtable.h |  11 +++
 mm/filemap.c            | 177 ++++++++++++++++++++++++++---------
 mm/memory.c             | 199 ++++++++++++----------------------------
 5 files changed, 213 insertions(+), 192 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 5b0f93f73837..111fe73bb8a7 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1319,17 +1319,19 @@ xfs_filemap_pfn_mkwrite(
 	return __xfs_filemap_fault(vmf, PE_SIZE_PTE, true);
 }
 
-static void
+static vm_fault_t
 xfs_filemap_map_pages(
 	struct vm_fault		*vmf,
 	pgoff_t			start_pgoff,
 	pgoff_t			end_pgoff)
 {
 	struct inode		*inode = file_inode(vmf->vma->vm_file);
+	vm_fault_t ret;
 
 	xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
-	filemap_map_pages(vmf, start_pgoff, end_pgoff);
+	ret = filemap_map_pages(vmf, start_pgoff, end_pgoff);
 	xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
+	return ret;
 }
 
 static const struct vm_operations_struct xfs_file_vm_ops = {
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ecdf8a8cd6ae..4572a9bc5862 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -542,8 +542,8 @@ struct vm_fault {
 					 * is not NULL, otherwise pmd.
 					 */
 	pgtable_t prealloc_pte;		/* Pre-allocated pte page table.
-					 * vm_ops->map_pages() calls
-					 * alloc_set_pte() from atomic context.
+					 * vm_ops->map_pages() sets up a page
+					 * table from atomic context.
 					 * do_fault_around() pre-allocates
 					 * page table to avoid allocation from
 					 * atomic context.
@@ -578,7 +578,7 @@ struct vm_operations_struct {
 	vm_fault_t (*fault)(struct vm_fault *vmf);
 	vm_fault_t (*huge_fault)(struct vm_fault *vmf,
 			enum page_entry_size pe_size);
-	void (*map_pages)(struct vm_fault *vmf,
+	vm_fault_t (*map_pages)(struct vm_fault *vmf,
 			pgoff_t start_pgoff, pgoff_t end_pgoff);
 	unsigned long (*pagesize)(struct vm_area_struct * area);
 
@@ -988,7 +988,9 @@ static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
 	return pte;
 }
 
-vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct page *page);
+vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page);
+void do_set_pte(struct vm_fault *vmf, struct page *page);
+
 vm_fault_t finish_fault(struct vm_fault *vmf);
 vm_fault_t finish_mkwrite_fault(struct vm_fault *vmf);
 #endif
@@ -2622,7 +2624,7 @@ extern void truncate_inode_pages_final(struct address_space *);
 
 /* generic vm_area_ops exported for stackable file systems */
 extern vm_fault_t filemap_fault(struct vm_fault *vmf);
-extern void filemap_map_pages(struct vm_fault *vmf,
+extern vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 		pgoff_t start_pgoff, pgoff_t end_pgoff);
 extern vm_fault_t filemap_page_mkwrite(struct vm_fault *vmf);
 
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 8fcdfa52eb4b..36eb748f3c97 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1314,6 +1314,17 @@ static inline int pmd_trans_unstable(pmd_t *pmd)
 #endif
 }
 
+/*
+ * the ordering of these checks is important for pmds with _page_devmap set.
+ * if we check pmd_trans_unstable() first we will trip the bad_pmd() check
+ * inside of pmd_none_or_trans_huge_or_clear_bad(). this will end up correctly
+ * returning 1 but not before it spams dmesg with the pmd_clear_bad() output.
+ */
+static inline int pmd_devmap_trans_unstable(pmd_t *pmd)
+{
+	return pmd_devmap(*pmd) || pmd_trans_unstable(pmd);
+}
+
 #ifndef CONFIG_NUMA_BALANCING
 /*
  * Technically a PTE can be PROTNONE even when not doing NUMA balancing but
diff --git a/mm/filemap.c b/mm/filemap.c
index 5c9d564317a5..c1f2dc89b8a7 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -42,6 +42,7 @@
 #include <linux/psi.h>
 #include <linux/ramfs.h>
 #include <linux/page_idle.h>
+#include <asm/pgalloc.h>
 #include "internal.h"
 
 #define CREATE_TRACE_POINTS
@@ -2911,74 +2912,164 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 }
 EXPORT_SYMBOL(filemap_fault);
 
-void filemap_map_pages(struct vm_fault *vmf,
-		pgoff_t start_pgoff, pgoff_t end_pgoff)
+static bool filemap_map_pmd(struct vm_fault *vmf, struct page *page)
 {
-	struct file *file = vmf->vma->vm_file;
+	struct mm_struct *mm = vmf->vma->vm_mm;
+
+	/* Huge page is mapped? No need to proceed. */
+	if (pmd_trans_huge(*vmf->pmd)) {
+		unlock_page(page);
+		put_page(page);
+		return true;
+	}
+
+	if (pmd_none(*vmf->pmd) && PageTransHuge(page)) {
+	    vm_fault_t ret = do_set_pmd(vmf, page);
+	    if (!ret) {
+		    /* The page is mapped successfully, reference consumed. */
+		    unlock_page(page);
+		    return true;
+	    }
+	}
+
+	if (pmd_none(*vmf->pmd)) {
+		vmf->ptl = pmd_lock(mm, vmf->pmd);
+		if (likely(pmd_none(*vmf->pmd))) {
+			mm_inc_nr_ptes(mm);
+			pmd_populate(mm, vmf->pmd, vmf->prealloc_pte);
+			vmf->prealloc_pte = NULL;
+		}
+		spin_unlock(vmf->ptl);
+	}
+
+	/* See comment in handle_pte_fault() */
+	if (pmd_devmap_trans_unstable(vmf->pmd)) {
+		unlock_page(page);
+		put_page(page);
+		return true;
+	}
+
+	return false;
+}
+
+static struct page *next_uptodate_page(struct page *page,
+				       struct address_space *mapping,
+				       struct xa_state *xas, pgoff_t end_pgoff)
+{
+	unsigned long max_idx;
+
+	do {
+		if (!page)
+			return NULL;
+		if (xas_retry(xas, page))
+			continue;
+		if (xa_is_value(page))
+			continue;
+		if (PageLocked(page))
+			continue;
+		if (!page_cache_get_speculative(page))
+			continue;
+		/* Has the page moved or been split? */
+		if (unlikely(page != xas_reload(xas)))
+			goto skip;
+		if (!PageUptodate(page) || PageReadahead(page))
+			goto skip;
+		if (PageHWPoison(page))
+			goto skip;
+		if (!trylock_page(page))
+			goto skip;
+		if (page->mapping != mapping)
+			goto unlock;
+		if (!PageUptodate(page))
+			goto unlock;
+		max_idx = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE);
+		if (xas->xa_index >= max_idx)
+			goto unlock;
+		return page;
+unlock:
+		unlock_page(page);
+skip:
+		put_page(page);
+	} while ((page = xas_next_entry(xas, end_pgoff)) != NULL);
+
+	return NULL;
+}
+
+static inline struct page *first_map_page(struct address_space *mapping,
+					  struct xa_state *xas,
+					  pgoff_t end_pgoff)
+{
+	return next_uptodate_page(xas_find(xas, end_pgoff),
+				  mapping, xas, end_pgoff);
+}
+
+static inline struct page *next_map_page(struct address_space *mapping,
+					 struct xa_state *xas,
+					 pgoff_t end_pgoff)
+{
+	return next_uptodate_page(xas_next_entry(xas, end_pgoff),
+				  mapping, xas, end_pgoff);
+}
+
+vm_fault_t filemap_map_pages(struct vm_fault *vmf,
+			     pgoff_t start_pgoff, pgoff_t end_pgoff)
+{
+	struct vm_area_struct *vma = vmf->vma;
+	struct file *file = vma->vm_file;
 	struct address_space *mapping = file->f_mapping;
 	pgoff_t last_pgoff = start_pgoff;
-	unsigned long max_idx;
+	unsigned long address = vmf->address;
 	XA_STATE(xas, &mapping->i_pages, start_pgoff);
 	struct page *head, *page;
 	unsigned int mmap_miss = READ_ONCE(file->f_ra.mmap_miss);
+	vm_fault_t ret = 0;
 
 	rcu_read_lock();
-	xas_for_each(&xas, head, end_pgoff) {
-		if (xas_retry(&xas, head))
-			continue;
-		if (xa_is_value(head))
-			goto next;
+	head = first_map_page(mapping, &xas, end_pgoff);
+	if (!head)
+		goto out;
 
-		/*
-		 * Check for a locked page first, as a speculative
-		 * reference may adversely influence page migration.
-		 */
-		if (PageLocked(head))
-			goto next;
-		if (!page_cache_get_speculative(head))
-			goto next;
+	if (filemap_map_pmd(vmf, head)) {
+		ret = VM_FAULT_NOPAGE;
+		goto out;
+	}
 
-		/* Has the page moved or been split? */
-		if (unlikely(head != xas_reload(&xas)))
-			goto skip;
+	vmf->address = vma->vm_start + ((start_pgoff - vma->vm_pgoff) << PAGE_SHIFT);
+	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address, &vmf->ptl);
+	do {
 		page = find_subpage(head, xas.xa_index);
-
-		if (!PageUptodate(head) ||
-				PageReadahead(page) ||
-				PageHWPoison(page))
-			goto skip;
-		if (!trylock_page(head))
-			goto skip;
-
-		if (head->mapping != mapping || !PageUptodate(head))
-			goto unlock;
-
-		max_idx = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE);
-		if (xas.xa_index >= max_idx)
+		if (PageHWPoison(page))
 			goto unlock;
 
 		if (mmap_miss > 0)
 			mmap_miss--;
 
 		vmf->address += (xas.xa_index - last_pgoff) << PAGE_SHIFT;
-		if (vmf->pte)
-			vmf->pte += xas.xa_index - last_pgoff;
+		vmf->pte += xas.xa_index - last_pgoff;
 		last_pgoff = xas.xa_index;
-		if (alloc_set_pte(vmf, page))
+
+		if (!pte_none(*vmf->pte))
 			goto unlock;
+
+		do_set_pte(vmf, page);
+		/* no need to invalidate: a not-present page won't be cached */
+		update_mmu_cache(vma, vmf->address, vmf->pte);
 		unlock_page(head);
-		goto next;
+
+		/* The fault is handled */
+		if (vmf->address == address)
+			ret = VM_FAULT_NOPAGE;
+		continue;
 unlock:
 		unlock_page(head);
-skip:
 		put_page(head);
-next:
-		/* Huge page is mapped? No need to proceed. */
-		if (pmd_trans_huge(*vmf->pmd))
-			break;
-	}
+	} while ((head = next_map_page(mapping, &xas, end_pgoff)) != NULL);
+	pte_unmap_unlock(vmf->pte, vmf->ptl);
+out:
 	rcu_read_unlock();
+	vmf->address = address;
 	WRITE_ONCE(file->f_ra.mmap_miss, mmap_miss);
+	return ret;
 }
 EXPORT_SYMBOL(filemap_map_pages);
 
diff --git a/mm/memory.c b/mm/memory.c
index feff48e1465a..3e2fc2950ad7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3503,7 +3503,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	if (pte_alloc(vma->vm_mm, vmf->pmd))
 		return VM_FAULT_OOM;
 
-	/* See the comment in pte_alloc_one_map() */
+	/* See comment in handle_pte_fault() */
 	if (unlikely(pmd_trans_unstable(vmf->pmd)))
 		return 0;
 
@@ -3643,66 +3643,6 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
 	return ret;
 }
 
-/*
- * The ordering of these checks is important for pmds with _PAGE_DEVMAP set.
- * If we check pmd_trans_unstable() first we will trip the bad_pmd() check
- * inside of pmd_none_or_trans_huge_or_clear_bad(). This will end up correctly
- * returning 1 but not before it spams dmesg with the pmd_clear_bad() output.
- */
-static int pmd_devmap_trans_unstable(pmd_t *pmd)
-{
-	return pmd_devmap(*pmd) || pmd_trans_unstable(pmd);
-}
-
-static vm_fault_t pte_alloc_one_map(struct vm_fault *vmf)
-{
-	struct vm_area_struct *vma = vmf->vma;
-
-	if (!pmd_none(*vmf->pmd))
-		goto map_pte;
-	if (vmf->prealloc_pte) {
-		vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
-		if (unlikely(!pmd_none(*vmf->pmd))) {
-			spin_unlock(vmf->ptl);
-			goto map_pte;
-		}
-
-		mm_inc_nr_ptes(vma->vm_mm);
-		pmd_populate(vma->vm_mm, vmf->pmd, vmf->prealloc_pte);
-		spin_unlock(vmf->ptl);
-		vmf->prealloc_pte = NULL;
-	} else if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd))) {
-		return VM_FAULT_OOM;
-	}
-map_pte:
-	/*
-	 * If a huge pmd materialized under us just retry later.  Use
-	 * pmd_trans_unstable() via pmd_devmap_trans_unstable() instead of
-	 * pmd_trans_huge() to ensure the pmd didn't become pmd_trans_huge
-	 * under us and then back to pmd_none, as a result of MADV_DONTNEED
-	 * running immediately after a huge pmd fault in a different thread of
-	 * this mm, in turn leading to a misleading pmd_trans_huge() retval.
-	 * All we have to ensure is that it is a regular pmd that we can walk
-	 * with pte_offset_map() and we can do that through an atomic read in
-	 * C, which is what pmd_trans_unstable() provides.
-	 */
-	if (pmd_devmap_trans_unstable(vmf->pmd))
-		return VM_FAULT_NOPAGE;
-
-	/*
-	 * At this point we know that our vmf->pmd points to a page of ptes
-	 * and it cannot become pmd_none(), pmd_devmap() or pmd_trans_huge()
-	 * for the duration of the fault.  If a racing MADV_DONTNEED runs and
-	 * we zap the ptes pointed to by our vmf->pmd, the vmf->ptl will still
-	 * be valid and we will re-check to make sure the vmf->pte isn't
-	 * pte_none() under vmf->ptl protection when we return to
-	 * alloc_set_pte().
-	 */
-	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
-			&vmf->ptl);
-	return 0;
-}
-
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 static void deposit_prealloc_pte(struct vm_fault *vmf)
 {
@@ -3717,7 +3657,7 @@ static void deposit_prealloc_pte(struct vm_fault *vmf)
 	vmf->prealloc_pte = NULL;
 }
 
-static vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
+vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	bool write = vmf->flags & FAULT_FLAG_WRITE;
@@ -3775,52 +3715,17 @@ static vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
 	return ret;
 }
 #else
-static vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
+vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
 {
-	BUILD_BUG();
-	return 0;
+	return VM_FAULT_FALLBACK;
 }
 #endif
 
-/**
- * alloc_set_pte - setup new PTE entry for given page and add reverse page
- * mapping. If needed, the function allocates page table or use pre-allocated.
- *
- * @vmf: fault environment
- * @page: page to map
- *
- * Caller must take care of unlocking vmf->ptl, if vmf->pte is non-NULL on
- * return.
- *
- * Target users are page handler itself and implementations of
- * vm_ops->map_pages.
- *
- * Return: %0 on success, %VM_FAULT_ code in case of error.
- */
-vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct page *page)
+void do_set_pte(struct vm_fault *vmf, struct page *page)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	bool write = vmf->flags & FAULT_FLAG_WRITE;
 	pte_t entry;
-	vm_fault_t ret;
-
-	if (pmd_none(*vmf->pmd) && PageTransCompound(page)) {
-		ret = do_set_pmd(vmf, page);
-		if (ret != VM_FAULT_FALLBACK)
-			return ret;
-	}
-
-	if (!vmf->pte) {
-		ret = pte_alloc_one_map(vmf);
-		if (ret)
-			return ret;
-	}
-
-	/* Re-check under ptl */
-	if (unlikely(!pte_none(*vmf->pte))) {
-		update_mmu_tlb(vma, vmf->address, vmf->pte);
-		return VM_FAULT_NOPAGE;
-	}
 
 	flush_icache_page(vma, page);
 	entry = mk_pte(page, vma->vm_page_prot);
@@ -3837,14 +3742,8 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct page *page)
 		page_add_file_rmap(page, false);
 	}
 	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
-
-	/* no need to invalidate: a not-present page won't be cached */
-	update_mmu_cache(vma, vmf->address, vmf->pte);
-
-	return 0;
 }
 
-
 /**
  * finish_fault - finish page fault once we have prepared the page to fault
  *
@@ -3862,12 +3761,12 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct page *page)
  */
 vm_fault_t finish_fault(struct vm_fault *vmf)
 {
+	struct vm_area_struct *vma = vmf->vma;
 	struct page *page;
-	vm_fault_t ret = 0;
+	vm_fault_t ret;
 
 	/* Did we COW the page? */
-	if ((vmf->flags & FAULT_FLAG_WRITE) &&
-	    !(vmf->vma->vm_flags & VM_SHARED))
+	if ((vmf->flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED))
 		page = vmf->cow_page;
 	else
 		page = vmf->page;
@@ -3876,12 +3775,38 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
 	 * check even for read faults because we might have lost our CoWed
 	 * page
 	 */
-	if (!(vmf->vma->vm_flags & VM_SHARED))
-		ret = check_stable_address_space(vmf->vma->vm_mm);
-	if (!ret)
-		ret = alloc_set_pte(vmf, page);
-	if (vmf->pte)
-		pte_unmap_unlock(vmf->pte, vmf->ptl);
+	if (!(vma->vm_flags & VM_SHARED)) {
+		ret = check_stable_address_space(vma->vm_mm);
+		if (ret)
+			return ret;
+	}
+
+	if (pmd_none(*vmf->pmd)) {
+		if (PageTransCompound(page)) {
+			ret = do_set_pmd(vmf, page);
+			if (ret != VM_FAULT_FALLBACK)
+				return ret;
+		}
+
+		if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd)))
+			return VM_FAULT_OOM;
+	}
+
+	/* See comment in handle_pte_fault() */
+	if (pmd_devmap_trans_unstable(vmf->pmd))
+		return 0;
+
+	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
+				      vmf->address, &vmf->ptl);
+	ret = 0;
+	/* Re-check under ptl */
+	if (likely(pte_none(*vmf->pte)))
+		do_set_pte(vmf, page);
+	else
+		ret = VM_FAULT_NOPAGE;
+
+	update_mmu_tlb(vma, vmf->address, vmf->pte);
+	pte_unmap_unlock(vmf->pte, vmf->ptl);
 	return ret;
 }
 
@@ -3951,13 +3876,12 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf)
 	pgoff_t start_pgoff = vmf->pgoff;
 	pgoff_t end_pgoff;
 	int off;
-	vm_fault_t ret = 0;
 
 	nr_pages = READ_ONCE(fault_around_bytes) >> PAGE_SHIFT;
 	mask = ~(nr_pages * PAGE_SIZE - 1) & PAGE_MASK;
 
-	vmf->address = max(address & mask, vmf->vma->vm_start);
-	off = ((address - vmf->address) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
+	address = max(address & mask, vmf->vma->vm_start);
+	off = ((vmf->address - address) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
 	start_pgoff -= off;
 
 	/*
@@ -3965,7 +3889,7 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf)
 	 *  the vma or nr_pages from start_pgoff, depending what is nearest.
 	 */
 	end_pgoff = start_pgoff -
-		((vmf->address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)) +
+		((address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)) +
 		PTRS_PER_PTE - 1;
 	end_pgoff = min3(end_pgoff, vma_pages(vmf->vma) + vmf->vma->vm_pgoff - 1,
 			start_pgoff + nr_pages - 1);
@@ -3973,31 +3897,11 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf)
 	if (pmd_none(*vmf->pmd)) {
 		vmf->prealloc_pte = pte_alloc_one(vmf->vma->vm_mm);
 		if (!vmf->prealloc_pte)
-			goto out;
+			return VM_FAULT_OOM;
 		smp_wmb(); /* See comment in __pte_alloc() */
 	}
 
-	vmf->vma->vm_ops->map_pages(vmf, start_pgoff, end_pgoff);
-
-	/* Huge page is mapped? Page fault is solved */
-	if (pmd_trans_huge(*vmf->pmd)) {
-		ret = VM_FAULT_NOPAGE;
-		goto out;
-	}
-
-	/* ->map_pages() haven't done anything useful. Cold page cache? */
-	if (!vmf->pte)
-		goto out;
-
-	/* check if the page fault is solved */
-	vmf->pte -= (vmf->address >> PAGE_SHIFT) - (address >> PAGE_SHIFT);
-	if (!pte_none(*vmf->pte))
-		ret = VM_FAULT_NOPAGE;
-	pte_unmap_unlock(vmf->pte, vmf->ptl);
-out:
-	vmf->address = address;
-	vmf->pte = NULL;
-	return ret;
+	return vmf->vma->vm_ops->map_pages(vmf, start_pgoff, end_pgoff);
 }
 
 static vm_fault_t do_read_fault(struct vm_fault *vmf)
@@ -4353,7 +4257,18 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
 		 */
 		vmf->pte = NULL;
 	} else {
-		/* See comment in pte_alloc_one_map() */
+		/*
+		 * If a huge pmd materialized under us just retry later.  Use
+		 * pmd_trans_unstable() via pmd_devmap_trans_unstable() instead
+		 * of pmd_trans_huge() to ensure the pmd didn't become
+		 * pmd_trans_huge under us and then back to pmd_none, as a
+		 * result of MADV_DONTNEED running immediately after a huge pmd
+		 * fault in a different thread of this mm, in turn leading to a
+		 * misleading pmd_trans_huge() retval. All we have to ensure is
+		 * that it is a regular pmd that we can walk with
+		 * pte_offset_map() and we can do that through an atomic read
+		 * in C, which is what pmd_trans_unstable() provides.
+		 */
 		if (pmd_devmap_trans_unstable(vmf->pmd))
 			return 0;
 		/*
-- 
2.30.0.284.gd98b1dd5eaa7-goog



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 2/8] mm: Allow architectures to request 'old' entries when prefaulting
  2021-01-20 17:36 [PATCH v4 0/8] Create 'old' ptes for faultaround mappings on arm64 with hardware access flag Will Deacon
  2021-01-20 17:36 ` [PATCH v4 1/8] mm: Cleanup faultaround and finish_fault() codepaths Will Deacon
@ 2021-01-20 17:36 ` Will Deacon
  2021-01-20 17:36 ` [PATCH v4 3/8] arm64: mm: Implement arch_wants_old_prefaulted_pte() Will Deacon
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Will Deacon @ 2021-01-20 17:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Linus Torvalds, Vinayak Menon, Hugh Dickins, Nick Desaulniers,
	kernel-team

Commit 5c0a85fad949 ("mm: make faultaround produce old ptes") changed
the "faultaround" behaviour to initialise prefaulted PTEs as 'old',
since this avoids vmscan wrongly assuming that they are hot, despite
having never been explicitly accessed by userspace. The change has been
shown to benefit numerous arm64 micro-architectures (with hardware
access flag) running Android, where both application launch latency and
direct reclaim time are significantly reduced (by 10%+ and ~80%
respectively).

Unfortunately, commit 315d09bf30c2 ("Revert "mm: make faultaround
produce old ptes"") reverted the change due to it being identified as
the cause of a ~6% regression in unixbench on x86. Experiments on a
variety of recent arm64 micro-architectures indicate that unixbench is
not affected by the original commit, which appears to yield a 0-1%
performance improvement.

Since one size does not fit all for the initial state of prefaulted
PTEs, introduce arch_wants_old_prefaulted_pte(), which allows an
architecture to opt-in to 'old' prefaulted PTEs at runtime based on
whatever criteria it may have.

Cc: Jan Kara <jack@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/mm.h |  5 ++++-
 mm/filemap.c       | 14 ++++++++++----
 mm/memory.c        | 20 +++++++++++++++++++-
 3 files changed, 33 insertions(+), 6 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 4572a9bc5862..251a2339befb 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -434,6 +434,7 @@ extern pgprot_t protection_map[16];
  * @FAULT_FLAG_REMOTE: The fault is not for current task/mm.
  * @FAULT_FLAG_INSTRUCTION: The fault was during an instruction fetch.
  * @FAULT_FLAG_INTERRUPTIBLE: The fault can be interrupted by non-fatal signals.
+ * @FAULT_FLAG_PREFAULT: Fault was a prefault.
  *
  * About @FAULT_FLAG_ALLOW_RETRY and @FAULT_FLAG_TRIED: we can specify
  * whether we would allow page faults to retry by specifying these two
@@ -464,6 +465,7 @@ extern pgprot_t protection_map[16];
 #define FAULT_FLAG_REMOTE			0x80
 #define FAULT_FLAG_INSTRUCTION  		0x100
 #define FAULT_FLAG_INTERRUPTIBLE		0x200
+#define FAULT_FLAG_PREFAULT			0x400
 
 /*
  * The default fault flags that should be used by most of the
@@ -501,7 +503,8 @@ static inline bool fault_flag_allow_retry_first(unsigned int flags)
 	{ FAULT_FLAG_USER,		"USER" }, \
 	{ FAULT_FLAG_REMOTE,		"REMOTE" }, \
 	{ FAULT_FLAG_INSTRUCTION,	"INSTRUCTION" }, \
-	{ FAULT_FLAG_INTERRUPTIBLE,	"INTERRUPTIBLE" }
+	{ FAULT_FLAG_INTERRUPTIBLE,	"INTERRUPTIBLE" }, \
+	{ FAULT_FLAG_PREFAULT,		"PREFAULT" }
 
 /*
  * vm_fault is filled by the pagefault handler and passed to the vma's
diff --git a/mm/filemap.c b/mm/filemap.c
index c1f2dc89b8a7..a6dc97906c8e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3019,6 +3019,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 	struct address_space *mapping = file->f_mapping;
 	pgoff_t last_pgoff = start_pgoff;
 	unsigned long address = vmf->address;
+	unsigned long flags = vmf->flags;
 	XA_STATE(xas, &mapping->i_pages, start_pgoff);
 	struct page *head, *page;
 	unsigned int mmap_miss = READ_ONCE(file->f_ra.mmap_miss);
@@ -3051,14 +3052,18 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 		if (!pte_none(*vmf->pte))
 			goto unlock;
 
+		/* We're about to handle the fault */
+		if (vmf->address == address) {
+			vmf->flags &= ~FAULT_FLAG_PREFAULT;
+			ret = VM_FAULT_NOPAGE;
+		} else {
+			vmf->flags |= FAULT_FLAG_PREFAULT;
+		}
+
 		do_set_pte(vmf, page);
 		/* no need to invalidate: a not-present page won't be cached */
 		update_mmu_cache(vma, vmf->address, vmf->pte);
 		unlock_page(head);
-
-		/* The fault is handled */
-		if (vmf->address == address)
-			ret = VM_FAULT_NOPAGE;
 		continue;
 unlock:
 		unlock_page(head);
@@ -3067,6 +3072,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
 out:
 	rcu_read_unlock();
+	vmf->flags = flags;
 	vmf->address = address;
 	WRITE_ONCE(file->f_ra.mmap_miss, mmap_miss);
 	return ret;
diff --git a/mm/memory.c b/mm/memory.c
index 3e2fc2950ad7..f0e7c589ca9d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -134,6 +134,18 @@ static inline bool arch_faults_on_old_pte(void)
 }
 #endif
 
+#ifndef arch_wants_old_prefaulted_pte
+static inline bool arch_wants_old_prefaulted_pte(void)
+{
+	/*
+	 * Transitioning a PTE from 'old' to 'young' can be expensive on
+	 * some architectures, even if it's performed in hardware. By
+	 * default, "false" means prefaulted entries will be 'young'.
+	 */
+	return false;
+}
+#endif
+
 static int __init disable_randmaps(char *s)
 {
 	randomize_va_space = 0;
@@ -3725,11 +3737,17 @@ void do_set_pte(struct vm_fault *vmf, struct page *page)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	bool write = vmf->flags & FAULT_FLAG_WRITE;
+	bool prefault = vmf->flags & FAULT_FLAG_PREFAULT;
 	pte_t entry;
 
 	flush_icache_page(vma, page);
 	entry = mk_pte(page, vma->vm_page_prot);
-	entry = pte_sw_mkyoung(entry);
+
+	if (prefault && arch_wants_old_prefaulted_pte())
+		entry = pte_mkold(entry);
+	else
+		entry = pte_sw_mkyoung(entry);
+
 	if (write)
 		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 	/* copy-on-write page */
-- 
2.30.0.284.gd98b1dd5eaa7-goog



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 3/8] arm64: mm: Implement arch_wants_old_prefaulted_pte()
  2021-01-20 17:36 [PATCH v4 0/8] Create 'old' ptes for faultaround mappings on arm64 with hardware access flag Will Deacon
  2021-01-20 17:36 ` [PATCH v4 1/8] mm: Cleanup faultaround and finish_fault() codepaths Will Deacon
  2021-01-20 17:36 ` [PATCH v4 2/8] mm: Allow architectures to request 'old' entries when prefaulting Will Deacon
@ 2021-01-20 17:36 ` Will Deacon
  2021-01-20 17:36 ` [PATCH v4 4/8] mm: Move immutable fields of 'struct vm_fault' into anonymous struct Will Deacon
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Will Deacon @ 2021-01-20 17:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Linus Torvalds, Vinayak Menon, Hugh Dickins, Nick Desaulniers,
	kernel-team

On CPUs with hardware AF/DBM, initialising prefaulted PTEs as 'old'
improves vmscan behaviour and does not appear to introduce any overhead
elsewhere.

Implement arch_wants_old_prefaulted_pte() to return 'true' if we detect
hardware access flag support at runtime. This can be extended in future
based on MIDR matching if necessary.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/pgtable.h | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 501562793ce2..e17b96d0e4b5 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -980,7 +980,17 @@ static inline bool arch_faults_on_old_pte(void)
 
 	return !cpu_has_hw_af();
 }
-#define arch_faults_on_old_pte arch_faults_on_old_pte
+#define arch_faults_on_old_pte		arch_faults_on_old_pte
+
+/*
+ * Experimentally, it's cheap to set the access flag in hardware and we
+ * benefit from prefaulting mappings as 'old' to start with.
+ */
+static inline bool arch_wants_old_prefaulted_pte(void)
+{
+	return !arch_faults_on_old_pte();
+}
+#define arch_wants_old_prefaulted_pte	arch_wants_old_prefaulted_pte
 
 #endif /* !__ASSEMBLY__ */
 
-- 
2.30.0.284.gd98b1dd5eaa7-goog



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 4/8] mm: Move immutable fields of 'struct vm_fault' into anonymous struct
  2021-01-20 17:36 [PATCH v4 0/8] Create 'old' ptes for faultaround mappings on arm64 with hardware access flag Will Deacon
                   ` (2 preceding siblings ...)
  2021-01-20 17:36 ` [PATCH v4 3/8] arm64: mm: Implement arch_wants_old_prefaulted_pte() Will Deacon
@ 2021-01-20 17:36 ` Will Deacon
  2021-01-20 18:13   ` Nick Desaulniers
  2021-01-20 17:36 ` [PATCH v4 5/8] mm: Pass 'address' to map to do_set_pte() and drop FAULT_FLAG_PREFAULT Will Deacon
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Will Deacon @ 2021-01-20 17:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Linus Torvalds, Vinayak Menon, Hugh Dickins, Nick Desaulniers,
	kernel-team

'struct vm_fault' contains both information about the fault being
serviced alongside mutable fields contributing to the state of the
fault-handling logic. Unfortunately, the distinction between the two is
not clear-cut, and a number of callers end up manipulating the structure
temporarily before restoring it when returning.

Try to clean this up by moving the immutable fault information into an
anonymous struct, which will later be marked as 'const'. GCC will then
complain (with an error) about modification of these fields after they
have been initialised, although LLVM currently allows them without even
a warning:

https://bugs.llvm.org/show_bug.cgi?id=48755

Ideally, the 'flags' field would be part of the new structure too, but
it seems as though the ->page_mkwrite() path is not ready for this yet.

Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/CAHk-=whYs9XsO88iqJzN6NC=D-dp2m0oYXuOoZ=eWnvv=5OA+w@mail.gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/mm.h | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 251a2339befb..b4a5cb9bff7d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -517,11 +517,14 @@ static inline bool fault_flag_allow_retry_first(unsigned int flags)
  * pgoff should be used in favour of virtual_address, if possible.
  */
 struct vm_fault {
-	struct vm_area_struct *vma;	/* Target VMA */
-	unsigned int flags;		/* FAULT_FLAG_xxx flags */
-	gfp_t gfp_mask;			/* gfp mask to be used for allocations */
-	pgoff_t pgoff;			/* Logical page offset based on vma */
-	unsigned long address;		/* Faulting virtual address */
+	struct {
+		struct vm_area_struct *vma;	/* Target VMA */
+		gfp_t gfp_mask;			/* gfp mask to be used for allocations */
+		pgoff_t pgoff;			/* Logical page offset based on vma */
+		unsigned long address;		/* Faulting virtual address */
+	};
+	unsigned int flags;		/* FAULT_FLAG_xxx flags
+					 * XXX: should really be 'const' */
 	pmd_t *pmd;			/* Pointer to pmd entry matching
 					 * the 'address' */
 	pud_t *pud;			/* Pointer to pud entry matching
-- 
2.30.0.284.gd98b1dd5eaa7-goog



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 5/8] mm: Pass 'address' to map to do_set_pte() and drop FAULT_FLAG_PREFAULT
  2021-01-20 17:36 [PATCH v4 0/8] Create 'old' ptes for faultaround mappings on arm64 with hardware access flag Will Deacon
                   ` (3 preceding siblings ...)
  2021-01-20 17:36 ` [PATCH v4 4/8] mm: Move immutable fields of 'struct vm_fault' into anonymous struct Will Deacon
@ 2021-01-20 17:36 ` Will Deacon
  2021-01-20 17:36 ` [PATCH v4 6/8] mm: Avoid modifying vmf.address in __collapse_huge_page_swapin() Will Deacon
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Will Deacon @ 2021-01-20 17:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Linus Torvalds, Vinayak Menon, Hugh Dickins, Nick Desaulniers,
	kernel-team

Rather than modifying the 'address' field of the 'struct vm_fault'
passed to do_set_pte(), leave that to identify the real faulting address
and pass in the virtual address to be mapped by the new pte as a
separate argument.

This makes FAULT_FLAG_PREFAULT redundant, as a prefault entry can be
identified simply by comparing the new address parameter with the
faulting address, so remove the redundant flag at the same time.

Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/mm.h |  7 ++-----
 mm/filemap.c       | 21 +++++++--------------
 mm/memory.c        | 10 +++++-----
 3 files changed, 14 insertions(+), 24 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index b4a5cb9bff7d..e0f056753bef 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -434,7 +434,6 @@ extern pgprot_t protection_map[16];
  * @FAULT_FLAG_REMOTE: The fault is not for current task/mm.
  * @FAULT_FLAG_INSTRUCTION: The fault was during an instruction fetch.
  * @FAULT_FLAG_INTERRUPTIBLE: The fault can be interrupted by non-fatal signals.
- * @FAULT_FLAG_PREFAULT: Fault was a prefault.
  *
  * About @FAULT_FLAG_ALLOW_RETRY and @FAULT_FLAG_TRIED: we can specify
  * whether we would allow page faults to retry by specifying these two
@@ -465,7 +464,6 @@ extern pgprot_t protection_map[16];
 #define FAULT_FLAG_REMOTE			0x80
 #define FAULT_FLAG_INSTRUCTION  		0x100
 #define FAULT_FLAG_INTERRUPTIBLE		0x200
-#define FAULT_FLAG_PREFAULT			0x400
 
 /*
  * The default fault flags that should be used by most of the
@@ -503,8 +501,7 @@ static inline bool fault_flag_allow_retry_first(unsigned int flags)
 	{ FAULT_FLAG_USER,		"USER" }, \
 	{ FAULT_FLAG_REMOTE,		"REMOTE" }, \
 	{ FAULT_FLAG_INSTRUCTION,	"INSTRUCTION" }, \
-	{ FAULT_FLAG_INTERRUPTIBLE,	"INTERRUPTIBLE" }, \
-	{ FAULT_FLAG_PREFAULT,		"PREFAULT" }
+	{ FAULT_FLAG_INTERRUPTIBLE,	"INTERRUPTIBLE" }
 
 /*
  * vm_fault is filled by the pagefault handler and passed to the vma's
@@ -995,7 +992,7 @@ static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
 }
 
 vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page);
-void do_set_pte(struct vm_fault *vmf, struct page *page);
+void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr);
 
 vm_fault_t finish_fault(struct vm_fault *vmf);
 vm_fault_t finish_mkwrite_fault(struct vm_fault *vmf);
diff --git a/mm/filemap.c b/mm/filemap.c
index a6dc97906c8e..fb7a8d9b5603 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3018,8 +3018,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 	struct file *file = vma->vm_file;
 	struct address_space *mapping = file->f_mapping;
 	pgoff_t last_pgoff = start_pgoff;
-	unsigned long address = vmf->address;
-	unsigned long flags = vmf->flags;
+	unsigned long addr;
 	XA_STATE(xas, &mapping->i_pages, start_pgoff);
 	struct page *head, *page;
 	unsigned int mmap_miss = READ_ONCE(file->f_ra.mmap_miss);
@@ -3035,8 +3034,8 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 		goto out;
 	}
 
-	vmf->address = vma->vm_start + ((start_pgoff - vma->vm_pgoff) << PAGE_SHIFT);
-	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address, &vmf->ptl);
+	addr = vma->vm_start + ((start_pgoff - vma->vm_pgoff) << PAGE_SHIFT);
+	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, addr, &vmf->ptl);
 	do {
 		page = find_subpage(head, xas.xa_index);
 		if (PageHWPoison(page))
@@ -3045,7 +3044,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 		if (mmap_miss > 0)
 			mmap_miss--;
 
-		vmf->address += (xas.xa_index - last_pgoff) << PAGE_SHIFT;
+		addr += (xas.xa_index - last_pgoff) << PAGE_SHIFT;
 		vmf->pte += xas.xa_index - last_pgoff;
 		last_pgoff = xas.xa_index;
 
@@ -3053,16 +3052,12 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 			goto unlock;
 
 		/* We're about to handle the fault */
-		if (vmf->address == address) {
-			vmf->flags &= ~FAULT_FLAG_PREFAULT;
+		if (vmf->address == addr)
 			ret = VM_FAULT_NOPAGE;
-		} else {
-			vmf->flags |= FAULT_FLAG_PREFAULT;
-		}
 
-		do_set_pte(vmf, page);
+		do_set_pte(vmf, page, addr);
 		/* no need to invalidate: a not-present page won't be cached */
-		update_mmu_cache(vma, vmf->address, vmf->pte);
+		update_mmu_cache(vma, addr, vmf->pte);
 		unlock_page(head);
 		continue;
 unlock:
@@ -3072,8 +3067,6 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
 out:
 	rcu_read_unlock();
-	vmf->flags = flags;
-	vmf->address = address;
 	WRITE_ONCE(file->f_ra.mmap_miss, mmap_miss);
 	return ret;
 }
diff --git a/mm/memory.c b/mm/memory.c
index f0e7c589ca9d..7b1307873325 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3733,11 +3733,11 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
 }
 #endif
 
-void do_set_pte(struct vm_fault *vmf, struct page *page)
+void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	bool write = vmf->flags & FAULT_FLAG_WRITE;
-	bool prefault = vmf->flags & FAULT_FLAG_PREFAULT;
+	bool prefault = vmf->address != addr;
 	pte_t entry;
 
 	flush_icache_page(vma, page);
@@ -3753,13 +3753,13 @@ void do_set_pte(struct vm_fault *vmf, struct page *page)
 	/* copy-on-write page */
 	if (write && !(vma->vm_flags & VM_SHARED)) {
 		inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
-		page_add_new_anon_rmap(page, vma, vmf->address, false);
+		page_add_new_anon_rmap(page, vma, addr, false);
 		lru_cache_add_inactive_or_unevictable(page, vma);
 	} else {
 		inc_mm_counter_fast(vma->vm_mm, mm_counter_file(page));
 		page_add_file_rmap(page, false);
 	}
-	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
+	set_pte_at(vma->vm_mm, addr, vmf->pte, entry);
 }
 
 /**
@@ -3819,7 +3819,7 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
 	ret = 0;
 	/* Re-check under ptl */
 	if (likely(pte_none(*vmf->pte)))
-		do_set_pte(vmf, page);
+		do_set_pte(vmf, page, vmf->address);
 	else
 		ret = VM_FAULT_NOPAGE;
 
-- 
2.30.0.284.gd98b1dd5eaa7-goog



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 6/8] mm: Avoid modifying vmf.address in __collapse_huge_page_swapin()
  2021-01-20 17:36 [PATCH v4 0/8] Create 'old' ptes for faultaround mappings on arm64 with hardware access flag Will Deacon
                   ` (4 preceding siblings ...)
  2021-01-20 17:36 ` [PATCH v4 5/8] mm: Pass 'address' to map to do_set_pte() and drop FAULT_FLAG_PREFAULT Will Deacon
@ 2021-01-20 17:36 ` Will Deacon
  2021-01-20 17:36 ` [PATCH v4 7/8] mm: Use static initialisers for immutable fields of 'struct vm_fault' Will Deacon
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Will Deacon @ 2021-01-20 17:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Linus Torvalds, Vinayak Menon, Hugh Dickins, Nick Desaulniers,
	kernel-team

In preparation for const-ifying the anonymous struct field of
'struct vm_fault', rework __collapse_huge_page_swapin() to avoid
continuously updating vmf.address and instead populate a new
'struct vm_fault' on the stack for each page being processed.

Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 mm/khugepaged.c | 37 ++++++++++++++++++-------------------
 1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 67ab391a5373..fb0fdaec34d5 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -991,38 +991,41 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
 
 static bool __collapse_huge_page_swapin(struct mm_struct *mm,
 					struct vm_area_struct *vma,
-					unsigned long address, pmd_t *pmd,
+					unsigned long haddr, pmd_t *pmd,
 					int referenced)
 {
 	int swapped_in = 0;
 	vm_fault_t ret = 0;
-	struct vm_fault vmf = {
-		.vma = vma,
-		.address = address,
-		.flags = FAULT_FLAG_ALLOW_RETRY,
-		.pmd = pmd,
-		.pgoff = linear_page_index(vma, address),
-	};
-
-	vmf.pte = pte_offset_map(pmd, address);
-	for (; vmf.address < address + HPAGE_PMD_NR*PAGE_SIZE;
-			vmf.pte++, vmf.address += PAGE_SIZE) {
+	unsigned long address, end = haddr + (HPAGE_PMD_NR * PAGE_SIZE);
+
+	for (address = haddr; address < end; address += PAGE_SIZE) {
+		struct vm_fault vmf = {
+			.vma = vma,
+			.address = address,
+			.pgoff = linear_page_index(vma, haddr),
+			.flags = FAULT_FLAG_ALLOW_RETRY,
+			.pmd = pmd,
+		};
+
+		vmf.pte = pte_offset_map(pmd, address);
 		vmf.orig_pte = *vmf.pte;
-		if (!is_swap_pte(vmf.orig_pte))
+		if (!is_swap_pte(vmf.orig_pte)) {
+			pte_unmap(vmf.pte);
 			continue;
+		}
 		swapped_in++;
 		ret = do_swap_page(&vmf);
 
 		/* do_swap_page returns VM_FAULT_RETRY with released mmap_lock */
 		if (ret & VM_FAULT_RETRY) {
 			mmap_read_lock(mm);
-			if (hugepage_vma_revalidate(mm, address, &vmf.vma)) {
+			if (hugepage_vma_revalidate(mm, haddr, &vma)) {
 				/* vma is no longer available, don't continue to swapin */
 				trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
 				return false;
 			}
 			/* check if the pmd is still valid */
-			if (mm_find_pmd(mm, address) != pmd) {
+			if (mm_find_pmd(mm, haddr) != pmd) {
 				trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
 				return false;
 			}
@@ -1031,11 +1034,7 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
 			trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
 			return false;
 		}
-		/* pte is unmapped now, we need to map it */
-		vmf.pte = pte_offset_map(pmd, vmf.address);
 	}
-	vmf.pte--;
-	pte_unmap(vmf.pte);
 
 	/* Drain LRU add pagevec to remove extra pin on the swapped in pages */
 	if (swapped_in)
-- 
2.30.0.284.gd98b1dd5eaa7-goog



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 7/8] mm: Use static initialisers for immutable fields of 'struct vm_fault'
  2021-01-20 17:36 [PATCH v4 0/8] Create 'old' ptes for faultaround mappings on arm64 with hardware access flag Will Deacon
                   ` (5 preceding siblings ...)
  2021-01-20 17:36 ` [PATCH v4 6/8] mm: Avoid modifying vmf.address in __collapse_huge_page_swapin() Will Deacon
@ 2021-01-20 17:36 ` Will Deacon
  2021-01-20 18:21   ` Nick Desaulniers
  2021-01-20 17:36 ` [PATCH v4 8/8] mm: Mark anonymous struct field of 'struct vm_fault' as 'const' Will Deacon
  2021-01-26 23:08 ` [PATCH v4 0/8] Create 'old' ptes for faultaround mappings on arm64 with hardware access flag Will Deacon
  8 siblings, 1 reply; 24+ messages in thread
From: Will Deacon @ 2021-01-20 17:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Linus Torvalds, Vinayak Menon, Hugh Dickins, Nick Desaulniers,
	kernel-team

In preparation for const-ifying the anonymous struct field of
'struct vm_fault', ensure that it is initialised using static
initialisers.

Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 mm/shmem.c    |  6 +++---
 mm/swapfile.c | 11 ++++++-----
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 7c6b6d8f6c39..1b254fbfdf52 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1520,11 +1520,11 @@ static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp,
 {
 	struct vm_area_struct pvma;
 	struct page *page;
-	struct vm_fault vmf;
+	struct vm_fault vmf = {
+		.vma = &pvma,
+	};
 
 	shmem_pseudo_vma_init(&pvma, info, index);
-	vmf.vma = &pvma;
-	vmf.address = 0;
 	page = swap_cluster_readahead(swap, gfp, &vmf);
 	shmem_pseudo_vma_destroy(&pvma);
 
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 9fffc5af29d1..2b570a566276 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1951,8 +1951,6 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 	si = swap_info[type];
 	pte = pte_offset_map(pmd, addr);
 	do {
-		struct vm_fault vmf;
-
 		if (!is_swap_pte(*pte))
 			continue;
 
@@ -1968,9 +1966,12 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 		swap_map = &si->swap_map[offset];
 		page = lookup_swap_cache(entry, vma, addr);
 		if (!page) {
-			vmf.vma = vma;
-			vmf.address = addr;
-			vmf.pmd = pmd;
+			struct vm_fault vmf = {
+				.vma = vma,
+				.address = addr,
+				.pmd = pmd,
+			};
+
 			page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE,
 						&vmf);
 		}
-- 
2.30.0.284.gd98b1dd5eaa7-goog



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 8/8] mm: Mark anonymous struct field of 'struct vm_fault' as 'const'
  2021-01-20 17:36 [PATCH v4 0/8] Create 'old' ptes for faultaround mappings on arm64 with hardware access flag Will Deacon
                   ` (6 preceding siblings ...)
  2021-01-20 17:36 ` [PATCH v4 7/8] mm: Use static initialisers for immutable fields of 'struct vm_fault' Will Deacon
@ 2021-01-20 17:36 ` Will Deacon
  2021-01-20 18:27   ` Nick Desaulniers
  2021-01-26 23:08 ` [PATCH v4 0/8] Create 'old' ptes for faultaround mappings on arm64 with hardware access flag Will Deacon
  8 siblings, 1 reply; 24+ messages in thread
From: Will Deacon @ 2021-01-20 17:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Linus Torvalds, Vinayak Menon, Hugh Dickins, Nick Desaulniers,
	kernel-team

The fields of this struct are only ever read after being initialised, so
mark it 'const' before somebody tries to modify it again.

Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/mm.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index e0f056753bef..7ff3d9817d38 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -514,7 +514,7 @@ static inline bool fault_flag_allow_retry_first(unsigned int flags)
  * pgoff should be used in favour of virtual_address, if possible.
  */
 struct vm_fault {
-	struct {
+	const struct {
 		struct vm_area_struct *vma;	/* Target VMA */
 		gfp_t gfp_mask;			/* gfp mask to be used for allocations */
 		pgoff_t pgoff;			/* Logical page offset based on vma */
-- 
2.30.0.284.gd98b1dd5eaa7-goog



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 4/8] mm: Move immutable fields of 'struct vm_fault' into anonymous struct
  2021-01-20 17:36 ` [PATCH v4 4/8] mm: Move immutable fields of 'struct vm_fault' into anonymous struct Will Deacon
@ 2021-01-20 18:13   ` Nick Desaulniers
  2021-01-21 12:48     ` Will Deacon
  0 siblings, 1 reply; 24+ messages in thread
From: Nick Desaulniers @ 2021-01-20 18:13 UTC (permalink / raw)
  To: Will Deacon
  Cc: LKML, Linux Memory Management List, Linux ARM, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Linus Torvalds, Vinayak Menon, Hugh Dickins, kernel-team

On Wed, Jan 20, 2021 at 9:36 AM Will Deacon <will@kernel.org> wrote:
>
> 'struct vm_fault' contains both information about the fault being
> serviced alongside mutable fields contributing to the state of the
> fault-handling logic. Unfortunately, the distinction between the two is
> not clear-cut, and a number of callers end up manipulating the structure
> temporarily before restoring it when returning.
>
> Try to clean this up by moving the immutable fault information into an
> anonymous struct, which will later be marked as 'const'. GCC will then
> complain (with an error) about modification of these fields after they
> have been initialised, although LLVM currently allows them without even
> a warning:
>
> https://bugs.llvm.org/show_bug.cgi?id=48755

I think this paragraph+link would be better on patch 8/8.

>
> Ideally, the 'flags' field would be part of the new structure too, but
> it seems as though the ->page_mkwrite() path is not ready for this yet.
>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Link: https://lore.kernel.org/r/CAHk-=whYs9XsO88iqJzN6NC=D-dp2m0oYXuOoZ=eWnvv=5OA+w@mail.gmail.com
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  include/linux/mm.h | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 251a2339befb..b4a5cb9bff7d 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -517,11 +517,14 @@ static inline bool fault_flag_allow_retry_first(unsigned int flags)
>   * pgoff should be used in favour of virtual_address, if possible.
>   */
>  struct vm_fault {
> -       struct vm_area_struct *vma;     /* Target VMA */
> -       unsigned int flags;             /* FAULT_FLAG_xxx flags */
> -       gfp_t gfp_mask;                 /* gfp mask to be used for allocations */
> -       pgoff_t pgoff;                  /* Logical page offset based on vma */
> -       unsigned long address;          /* Faulting virtual address */
> +       struct {
> +               struct vm_area_struct *vma;     /* Target VMA */
> +               gfp_t gfp_mask;                 /* gfp mask to be used for allocations */
> +               pgoff_t pgoff;                  /* Logical page offset based on vma */
> +               unsigned long address;          /* Faulting virtual address */
> +       };
> +       unsigned int flags;             /* FAULT_FLAG_xxx flags
> +                                        * XXX: should really be 'const' */
>         pmd_t *pmd;                     /* Pointer to pmd entry matching
>                                          * the 'address' */
>         pud_t *pud;                     /* Pointer to pud entry matching
> --
> 2.30.0.284.gd98b1dd5eaa7-goog
>


-- 
Thanks,
~Nick Desaulniers


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 7/8] mm: Use static initialisers for immutable fields of 'struct vm_fault'
  2021-01-20 17:36 ` [PATCH v4 7/8] mm: Use static initialisers for immutable fields of 'struct vm_fault' Will Deacon
@ 2021-01-20 18:21   ` Nick Desaulniers
  2021-01-21 12:50     ` Will Deacon
  0 siblings, 1 reply; 24+ messages in thread
From: Nick Desaulniers @ 2021-01-20 18:21 UTC (permalink / raw)
  To: Will Deacon
  Cc: LKML, Linux Memory Management List, Linux ARM, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Linus Torvalds, Vinayak Menon, Hugh Dickins, kernel-team

On Wed, Jan 20, 2021 at 9:36 AM Will Deacon <will@kernel.org> wrote:
>
> In preparation for const-ifying the anonymous struct field of
> 'struct vm_fault', ensure that it is initialised using static
> initialisers.

FWIW these are known as "designated initializers", ie.

struct foo my_foo = {
  .bar = 42,
};

https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html

Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  mm/shmem.c    |  6 +++---
>  mm/swapfile.c | 11 ++++++-----
>  2 files changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 7c6b6d8f6c39..1b254fbfdf52 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1520,11 +1520,11 @@ static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp,
>  {
>         struct vm_area_struct pvma;
>         struct page *page;
> -       struct vm_fault vmf;
> +       struct vm_fault vmf = {
> +               .vma = &pvma,
> +       };
>
>         shmem_pseudo_vma_init(&pvma, info, index);
> -       vmf.vma = &pvma;
> -       vmf.address = 0;
>         page = swap_cluster_readahead(swap, gfp, &vmf);
>         shmem_pseudo_vma_destroy(&pvma);
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 9fffc5af29d1..2b570a566276 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1951,8 +1951,6 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
>         si = swap_info[type];
>         pte = pte_offset_map(pmd, addr);
>         do {
> -               struct vm_fault vmf;
> -
>                 if (!is_swap_pte(*pte))
>                         continue;
>
> @@ -1968,9 +1966,12 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
>                 swap_map = &si->swap_map[offset];
>                 page = lookup_swap_cache(entry, vma, addr);
>                 if (!page) {
> -                       vmf.vma = vma;
> -                       vmf.address = addr;
> -                       vmf.pmd = pmd;
> +                       struct vm_fault vmf = {
> +                               .vma = vma,
> +                               .address = addr,
> +                               .pmd = pmd,
> +                       };
> +
>                         page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE,
>                                                 &vmf);
>                 }
> --
> 2.30.0.284.gd98b1dd5eaa7-goog
>


--
Thanks,
~Nick Desaulniers


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 8/8] mm: Mark anonymous struct field of 'struct vm_fault' as 'const'
  2021-01-20 17:36 ` [PATCH v4 8/8] mm: Mark anonymous struct field of 'struct vm_fault' as 'const' Will Deacon
@ 2021-01-20 18:27   ` Nick Desaulniers
  2021-01-20 19:02     ` Linus Torvalds
  0 siblings, 1 reply; 24+ messages in thread
From: Nick Desaulniers @ 2021-01-20 18:27 UTC (permalink / raw)
  To: Will Deacon
  Cc: LKML, Linux Memory Management List, Linux ARM, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Linus Torvalds, Vinayak Menon, Hugh Dickins, kernel-team

On Wed, Jan 20, 2021 at 9:36 AM Will Deacon <will@kernel.org> wrote:
>
> The fields of this struct are only ever read after being initialised, so
> mark it 'const' before somebody tries to modify it again.
>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  include/linux/mm.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index e0f056753bef..7ff3d9817d38 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -514,7 +514,7 @@ static inline bool fault_flag_allow_retry_first(unsigned int flags)
>   * pgoff should be used in favour of virtual_address, if possible.
>   */
>  struct vm_fault {
> -       struct {
> +       const struct {
>                 struct vm_area_struct *vma;     /* Target VMA */
>                 gfp_t gfp_mask;                 /* gfp mask to be used for allocations */
>                 pgoff_t pgoff;                  /* Logical page offset based on vma */

Is there a difference between:

+       const struct {
+               struct vm_area_struct *vma;     /* Target VMA */
+               gfp_t gfp_mask;                 /* gfp mask to be used
for allocations */
+               pgoff_t pgoff;                  /* Logical page offset
based on vma */
+               unsigned long address;          /* Faulting virtual address */
+       };

vs

+       struct {
+               struct vm_area_struct * const vma;     /* Target VMA */
+               const gfp_t gfp_mask;                 /* gfp mask to
be used for allocations */
+               const pgoff_t pgoff;                  /* Logical page
offset based on vma */
+               const unsigned long address;          /* Faulting
virtual address */
+       };

ie. marking the members const vs the anonymous struct?  Does anything
need to change in the use of these structures?

> --
> 2.30.0.284.gd98b1dd5eaa7-goog
>


-- 
Thanks,
~Nick Desaulniers


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 8/8] mm: Mark anonymous struct field of 'struct vm_fault' as 'const'
  2021-01-20 18:27   ` Nick Desaulniers
@ 2021-01-20 19:02     ` Linus Torvalds
  2021-01-21 13:11       ` Will Deacon
  0 siblings, 1 reply; 24+ messages in thread
From: Linus Torvalds @ 2021-01-20 19:02 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Will Deacon, LKML, Linux Memory Management List, Linux ARM,
	Catalin Marinas, Jan Kara, Minchan Kim, Andrew Morton,
	Kirill A . Shutemov, Vinayak Menon, Hugh Dickins, kernel-team

On Wed, Jan 20, 2021 at 10:27 AM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> Is there a difference between: [ const unnamed struct and individual const members ]

Semantically? No.

Syntactically the "group the const members together" is a lot cleaner,
imho. Not just from a "just a single const" standpoint, but from a
"code as documentation" standpoint.

But I guess to avoid the clang issue, we could do the "mark individual
fields" thing.

(It turns out that sparse gets this wrong too, so it's not just clang).

           Linus


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 4/8] mm: Move immutable fields of 'struct vm_fault' into anonymous struct
  2021-01-20 18:13   ` Nick Desaulniers
@ 2021-01-21 12:48     ` Will Deacon
  0 siblings, 0 replies; 24+ messages in thread
From: Will Deacon @ 2021-01-21 12:48 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: LKML, Linux Memory Management List, Linux ARM, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Linus Torvalds, Vinayak Menon, Hugh Dickins, kernel-team

On Wed, Jan 20, 2021 at 10:13:37AM -0800, Nick Desaulniers wrote:
> On Wed, Jan 20, 2021 at 9:36 AM Will Deacon <will@kernel.org> wrote:
> >
> > 'struct vm_fault' contains both information about the fault being
> > serviced alongside mutable fields contributing to the state of the
> > fault-handling logic. Unfortunately, the distinction between the two is
> > not clear-cut, and a number of callers end up manipulating the structure
> > temporarily before restoring it when returning.
> >
> > Try to clean this up by moving the immutable fault information into an
> > anonymous struct, which will later be marked as 'const'. GCC will then
> > complain (with an error) about modification of these fields after they
> > have been initialised, although LLVM currently allows them without even
> > a warning:
> >
> > https://bugs.llvm.org/show_bug.cgi?id=48755
> 
> I think this paragraph+link would be better on patch 8/8.

Agreed, I'll move it.

Will


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 7/8] mm: Use static initialisers for immutable fields of 'struct vm_fault'
  2021-01-20 18:21   ` Nick Desaulniers
@ 2021-01-21 12:50     ` Will Deacon
  0 siblings, 0 replies; 24+ messages in thread
From: Will Deacon @ 2021-01-21 12:50 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: LKML, Linux Memory Management List, Linux ARM, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Linus Torvalds, Vinayak Menon, Hugh Dickins, kernel-team

On Wed, Jan 20, 2021 at 10:21:33AM -0800, Nick Desaulniers wrote:
> On Wed, Jan 20, 2021 at 9:36 AM Will Deacon <will@kernel.org> wrote:
> >
> > In preparation for const-ifying the anonymous struct field of
> > 'struct vm_fault', ensure that it is initialised using static
> > initialisers.
> 
> FWIW these are known as "designated initializers", ie.
> 
> struct foo my_foo = {
>   .bar = 42,
> };
> 
> https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html

Thanks, I'm useless with terminology. Will update.

Will


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 8/8] mm: Mark anonymous struct field of 'struct vm_fault' as 'const'
  2021-01-20 19:02     ` Linus Torvalds
@ 2021-01-21 13:11       ` Will Deacon
  2021-01-21 19:24         ` Nick Desaulniers
  2021-01-22 17:47         ` Linus Torvalds
  0 siblings, 2 replies; 24+ messages in thread
From: Will Deacon @ 2021-01-21 13:11 UTC (permalink / raw)
  To: Linus Torvalds, luc.vanoostenryck
  Cc: Nick Desaulniers, LKML, Linux Memory Management List, Linux ARM,
	Catalin Marinas, Jan Kara, Minchan Kim, Andrew Morton,
	Kirill A . Shutemov, Vinayak Menon, Hugh Dickins, kernel-team

On Wed, Jan 20, 2021 at 11:02:06AM -0800, Linus Torvalds wrote:
> On Wed, Jan 20, 2021 at 10:27 AM Nick Desaulniers
> <ndesaulniers@google.com> wrote:
> >
> > Is there a difference between: [ const unnamed struct and individual const members ]
> 
> Semantically? No.
> 
> Syntactically the "group the const members together" is a lot cleaner,
> imho. Not just from a "just a single const" standpoint, but from a
> "code as documentation" standpoint.
> 
> But I guess to avoid the clang issue, we could do the "mark individual
> fields" thing.

I'd prefer to wait until the bug against LLVM has been resolved before we
try to work around anything. Although I couldn't find any other examples
like this in the kernel, requiring all of the member fields to be marked as
'const' still feels pretty fragile to me; it's only a matter of time before
new non-const fields get added, at which point the temptation for developers
to remove 'const' from other fields when it gets in the way is pretty high.

None of this is bullet-proof, of course, but if clang ends up emitting a
warning (even if it's gated behind an option) then I think we're in a good
place.

> (It turns out that sparse gets this wrong too, so it's not just clang).

Adding Luc, as hopefully that's fixable.

Will


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 8/8] mm: Mark anonymous struct field of 'struct vm_fault' as 'const'
  2021-01-21 13:11       ` Will Deacon
@ 2021-01-21 19:24         ` Nick Desaulniers
  2021-01-21 21:28           ` Will Deacon
  2021-01-22 17:47         ` Linus Torvalds
  1 sibling, 1 reply; 24+ messages in thread
From: Nick Desaulniers @ 2021-01-21 19:24 UTC (permalink / raw)
  To: Will Deacon
  Cc: Linus Torvalds, Luc Van Oostenryck, LKML,
	Linux Memory Management List, Linux ARM, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Vinayak Menon, Hugh Dickins, kernel-team

On Thu, Jan 21, 2021 at 5:11 AM Will Deacon <will@kernel.org> wrote:
>
> On Wed, Jan 20, 2021 at 11:02:06AM -0800, Linus Torvalds wrote:
> > On Wed, Jan 20, 2021 at 10:27 AM Nick Desaulniers
> > <ndesaulniers@google.com> wrote:
> > >
> > > Is there a difference between: [ const unnamed struct and individual const members ]
> >
> > Semantically? No.
> >
> > Syntactically the "group the const members together" is a lot cleaner,
> > imho. Not just from a "just a single const" standpoint, but from a
> > "code as documentation" standpoint.
> >
> > But I guess to avoid the clang issue, we could do the "mark individual
> > fields" thing.
>
> I'd prefer to wait until the bug against LLVM has been resolved before we
> try to work around anything. Although I couldn't find any other examples
> like this in the kernel, requiring all of the member fields to be marked as
> 'const' still feels pretty fragile to me; it's only a matter of time before
> new non-const fields get added, at which point the temptation for developers
> to remove 'const' from other fields when it gets in the way is pretty high.

What's to stop a new non-const field from getting added outside the
const qualified anonymous struct?
What's to stop someone from removing const from the anonymous struct?
What's to stop a number of callers from manipulating the structure
temporarily before restoring it when returning by casting away the
const?

Code review.

Using a non-toolchain-portable solution certainly could be considered
more fragile.

It's always possible that the resolution is the C standards body goes
the C++ route, at which point GCC would be forced to address this and
potentially change behavior.  Kind of like how people avoid going to
court since things are never guaranteed to work out in their favor.

>
> None of this is bullet-proof, of course, but if clang ends up emitting a
> warning (even if it's gated behind an option) then I think we're in a good
> place.
>
> > (It turns out that sparse gets this wrong too, so it's not just clang).
>
> Adding Luc, as hopefully that's fixable.
>
> Will



--
Thanks,
~Nick Desaulniers


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 8/8] mm: Mark anonymous struct field of 'struct vm_fault' as 'const'
  2021-01-21 19:24         ` Nick Desaulniers
@ 2021-01-21 21:28           ` Will Deacon
  2021-01-22 19:10             ` Nick Desaulniers
  0 siblings, 1 reply; 24+ messages in thread
From: Will Deacon @ 2021-01-21 21:28 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Linus Torvalds, Luc Van Oostenryck, LKML,
	Linux Memory Management List, Linux ARM, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Vinayak Menon, Hugh Dickins, kernel-team

On Thu, Jan 21, 2021 at 11:24:36AM -0800, Nick Desaulniers wrote:
> On Thu, Jan 21, 2021 at 5:11 AM Will Deacon <will@kernel.org> wrote:
> >
> > On Wed, Jan 20, 2021 at 11:02:06AM -0800, Linus Torvalds wrote:
> > > On Wed, Jan 20, 2021 at 10:27 AM Nick Desaulniers
> > > <ndesaulniers@google.com> wrote:
> > > >
> > > > Is there a difference between: [ const unnamed struct and individual const members ]
> > >
> > > Semantically? No.
> > >
> > > Syntactically the "group the const members together" is a lot cleaner,
> > > imho. Not just from a "just a single const" standpoint, but from a
> > > "code as documentation" standpoint.
> > >
> > > But I guess to avoid the clang issue, we could do the "mark individual
> > > fields" thing.
> >
> > I'd prefer to wait until the bug against LLVM has been resolved before we
> > try to work around anything. Although I couldn't find any other examples
> > like this in the kernel, requiring all of the member fields to be marked as
> > 'const' still feels pretty fragile to me; it's only a matter of time before
> > new non-const fields get added, at which point the temptation for developers
> > to remove 'const' from other fields when it gets in the way is pretty high.
> 
> What's to stop a new non-const field from getting added outside the
> const qualified anonymous struct?
> What's to stop someone from removing const from the anonymous struct?
> What's to stop a number of callers from manipulating the structure
> temporarily before restoring it when returning by casting away the
> const?
> 
> Code review.

Sure, but here we are cleaning up this stuff, so I think review only gets
you so far. To me:

	const struct {
		int	foo;
		long	bar;
	};

clearly says "don't modify fields of this struct", whereas:

	struct {
		const int	foo;
		const long	bar;
	};

says "don't modify foo or bar, but add whatever you like on the end" and
that's the slippery slope. So then we end up with the eye-sore of:

	const struct {
		const int	foo;
		const long	bar;
	};

and maybe that's the right answer, but I'm just saying we should wait for
clang to make up its mind first. It's not like this is a functional problem,
and there are enough GCC users around that we're not exactly in a hurry.

Will


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 8/8] mm: Mark anonymous struct field of 'struct vm_fault' as 'const'
  2021-01-21 13:11       ` Will Deacon
  2021-01-21 19:24         ` Nick Desaulniers
@ 2021-01-22 17:47         ` Linus Torvalds
  1 sibling, 0 replies; 24+ messages in thread
From: Linus Torvalds @ 2021-01-22 17:47 UTC (permalink / raw)
  To: Will Deacon
  Cc: Luc Van Oostenryck, Nick Desaulniers, LKML,
	Linux Memory Management List, Linux ARM, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Vinayak Menon, Hugh Dickins, kernel-team

On Thu, Jan 21, 2021 at 5:11 AM Will Deacon <will@kernel.org> wrote:
>
> > (It turns out that sparse gets this wrong too, so it's not just clang).
>
> Adding Luc, as hopefully that's fixable.

I had talked to Luc about this earlier, and he just sent out a fix for
sparse. It's not in the git repo yet, but it's coming.

Having looked at what sparse does, I suspect the clang behavior stems
from a similar approach to looking up unnamed struct/union members.

And the sparse fix was fairly straightforward: changing the _lookup_
is painful, because that's late and trying to fix types after-the-fact
is just not great. But just (recursively) modifying the type modifiers
at type parsing time for anonymous struct/union members is quite
straightforward, probably in clang too.

So honestly - I think the clang push-back was by somebody who thought
it would be nasty to fix, and who was thus actively trying to mis-read
the standards language.

I'm not willing to argue with compiler people who do standards
language weasel-wording (I've seen it before, my life is too short to
deal with those kinds of people), but maybe Nick is more used to deal
with clang people.

Nick - I suspect that the sparse type system model is different enough
from the clang one that the sparse patch is not really helpful as a
"look, this is how it was done in sparse, maybe clang can do something
similar", but I'll bounce it to you separately anyway just in case.
Maybe your clang knowledge means that you go "yeah, in clang that
function is called 'xyz', and we can do something very similar".

            Linus


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 8/8] mm: Mark anonymous struct field of 'struct vm_fault' as 'const'
  2021-01-21 21:28           ` Will Deacon
@ 2021-01-22 19:10             ` Nick Desaulniers
  2021-01-22 19:27               ` Will Deacon
  0 siblings, 1 reply; 24+ messages in thread
From: Nick Desaulniers @ 2021-01-22 19:10 UTC (permalink / raw)
  To: Will Deacon
  Cc: Linus Torvalds, Luc Van Oostenryck, LKML,
	Linux Memory Management List, Linux ARM, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Vinayak Menon, Hugh Dickins, kernel-team

On Thu, Jan 21, 2021 at 1:28 PM Will Deacon <will@kernel.org> wrote:
>
> On Thu, Jan 21, 2021 at 11:24:36AM -0800, Nick Desaulniers wrote:
> > On Thu, Jan 21, 2021 at 5:11 AM Will Deacon <will@kernel.org> wrote:
> > >
> > > On Wed, Jan 20, 2021 at 11:02:06AM -0800, Linus Torvalds wrote:
> > > > On Wed, Jan 20, 2021 at 10:27 AM Nick Desaulniers
> > > > <ndesaulniers@google.com> wrote:
> > > > >
> > > > > Is there a difference between: [ const unnamed struct and individual const members ]
> > > >
> > > > Semantically? No.
> > > >
> > > > Syntactically the "group the const members together" is a lot cleaner,
> > > > imho. Not just from a "just a single const" standpoint, but from a
> > > > "code as documentation" standpoint.
> > > >
> > > > But I guess to avoid the clang issue, we could do the "mark individual
> > > > fields" thing.
> > >
> > > I'd prefer to wait until the bug against LLVM has been resolved before we
> > > try to work around anything. Although I couldn't find any other examples
> > > like this in the kernel, requiring all of the member fields to be marked as
> > > 'const' still feels pretty fragile to me; it's only a matter of time before
> > > new non-const fields get added, at which point the temptation for developers
> > > to remove 'const' from other fields when it gets in the way is pretty high.
> >
> > What's to stop a new non-const field from getting added outside the
> > const qualified anonymous struct?
> > What's to stop someone from removing const from the anonymous struct?
> > What's to stop a number of callers from manipulating the structure
> > temporarily before restoring it when returning by casting away the
> > const?
> >
> > Code review.
>
> Sure, but here we are cleaning up this stuff, so I think review only gets
> you so far. To me:
>
>         const struct {
>                 int     foo;
>                 long    bar;
>         };
>
> clearly says "don't modify fields of this struct", whereas:
>
>         struct {
>                 const int       foo;
>                 const long      bar;
>         };
>
> says "don't modify foo or bar, but add whatever you like on the end" and
> that's the slippery slope.

"but you could add additional non-const members on the end" for sure.
Though going back to

>> What's to stop a new non-const field from getting added outside the
> > const qualified anonymous struct?

my point with that is that the const anonymous struct is within a
non-const anonymous struct.

struct vm_fault {
  const {
    struct vm_area_struct *vma;
    gfp_t gfp_mask;
    pgoff_t pgoff;
    unsigned long address;
    // Your point is about new member additions here, IIUC
  };
  // My point: what's to stop someone from adding a new non-const member here?
  unsigned int flags;
  pmd_t *pmd;
  pud_t *pud;
  ...
  // or here?
};

The const anonymous struct will help prevent additions of non-const
members to the anonymous struct, sure; but if someone really wanted a
new non-const member in a `struct vm_fault` instance they're just
going to go around the const anonymous struct.  Or is there something
more I'm missing about the order of the members of struct vm_fault?

> So then we end up with the eye-sore of:
>
>         const struct {
>                 const int       foo;
>                 const long      bar;
>         };
>
> and maybe that's the right answer, but I'm just saying we should wait for
> clang to make up its mind first. It's not like this is a functional problem,
> and there are enough GCC users around that we're not exactly in a hurry.

Yeah, I mean I'm happy to whip something up for Clang, even though I
suspect it will get shot down in code review until there's more
guidance from standards bodies.  It doesn't hurt to try, and at least
have a patch "waiting in the wings" should we hear back from WG14 that
favors GCC's behavior.  Who knows, maybe the guidance will be that
WG21 should revisit this behavior for C++ to avoid divergence with C
(as g++ and gcc currently do).
-- 
Thanks,
~Nick Desaulniers


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 8/8] mm: Mark anonymous struct field of 'struct vm_fault' as 'const'
  2021-01-22 19:10             ` Nick Desaulniers
@ 2021-01-22 19:27               ` Will Deacon
  0 siblings, 0 replies; 24+ messages in thread
From: Will Deacon @ 2021-01-22 19:27 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Linus Torvalds, Luc Van Oostenryck, LKML,
	Linux Memory Management List, Linux ARM, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Vinayak Menon, Hugh Dickins, kernel-team

Hey Nick,

On Fri, Jan 22, 2021 at 11:10:50AM -0800, Nick Desaulniers wrote:
> On Thu, Jan 21, 2021 at 1:28 PM Will Deacon <will@kernel.org> wrote:
> > On Thu, Jan 21, 2021 at 11:24:36AM -0800, Nick Desaulniers wrote:
> > > On Thu, Jan 21, 2021 at 5:11 AM Will Deacon <will@kernel.org> wrote:
> > Sure, but here we are cleaning up this stuff, so I think review only gets
> > you so far. To me:
> >
> >         const struct {
> >                 int     foo;
> >                 long    bar;
> >         };
> >
> > clearly says "don't modify fields of this struct", whereas:
> >
> >         struct {
> >                 const int       foo;
> >                 const long      bar;
> >         };
> >
> > says "don't modify foo or bar, but add whatever you like on the end" and
> > that's the slippery slope.
> 
> "but you could add additional non-const members on the end" for sure.
> Though going back to
> 
> >> What's to stop a new non-const field from getting added outside the
> > > const qualified anonymous struct?
> 
> my point with that is that the const anonymous struct is within a
> non-const anonymous struct.
> 
> struct vm_fault {
>   const {
>     struct vm_area_struct *vma;
>     gfp_t gfp_mask;
>     pgoff_t pgoff;
>     unsigned long address;
>     // Your point is about new member additions here, IIUC
>   };
>   // My point: what's to stop someone from adding a new non-const member here?
>   unsigned int flags;
>   pmd_t *pmd;
>   pud_t *pud;
>   ...
>   // or here?
> };
> 
> The const anonymous struct will help prevent additions of non-const
> members to the anonymous struct, sure; but if someone really wanted a
> new non-const member in a `struct vm_fault` instance they're just
> going to go around the const anonymous struct.  Or is there something
> more I'm missing about the order of the members of struct vm_fault?

All I was trying to say is that, given a struct with a mixture of const and
non-const members, I would be less hesitant to remove 'const' from one of
the members if it helped me get something else done. Having the 'const' on
the struct itself, however, would deter me because at that point its clear
that you're not supposed to be modifying this stuff. That's the difference
between the "Your point" and "My point" lines above.

But honestly, I can't say I particularly enjoy arguing about these
idiosyncracies; I'd just rather wait for the dust to settle with clang
before we add code to deal with that outcome.

> > So then we end up with the eye-sore of:
> >
> >         const struct {
> >                 const int       foo;
> >                 const long      bar;
> >         };
> >
> > and maybe that's the right answer, but I'm just saying we should wait for
> > clang to make up its mind first. It's not like this is a functional problem,
> > and there are enough GCC users around that we're not exactly in a hurry.
> 
> Yeah, I mean I'm happy to whip something up for Clang, even though I
> suspect it will get shot down in code review until there's more
> guidance from standards bodies.  It doesn't hurt to try, and at least
> have a patch "waiting in the wings" should we hear back from WG14 that
> favors GCC's behavior.  Who knows, maybe the guidance will be that
> WG21 should revisit this behavior for C++ to avoid divergence with C
> (as g++ and gcc currently do).

I mean, a warning doesn't seem controversial to me, especially as opinions
certainly seem to be divided about what the right behaviour should be here.

Will


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 0/8] Create 'old' ptes for faultaround mappings on arm64 with hardware access flag
  2021-01-20 17:36 [PATCH v4 0/8] Create 'old' ptes for faultaround mappings on arm64 with hardware access flag Will Deacon
                   ` (7 preceding siblings ...)
  2021-01-20 17:36 ` [PATCH v4 8/8] mm: Mark anonymous struct field of 'struct vm_fault' as 'const' Will Deacon
@ 2021-01-26 23:08 ` Will Deacon
  2021-01-26 23:28   ` Hugh Dickins
  8 siblings, 1 reply; 24+ messages in thread
From: Will Deacon @ 2021-01-26 23:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-arm-kernel, Catalin Marinas, Jan Kara,
	Minchan Kim, Andrew Morton, Kirill A . Shutemov, Linus Torvalds,
	Vinayak Menon, Hugh Dickins, Nick Desaulniers, kernel-team

On Wed, Jan 20, 2021 at 05:36:04PM +0000, Will Deacon wrote:
> Hi all,
> 
> This is version four of the patches I previously posted here:
> 
>   v1: https://lore.kernel.org/r/20201209163950.8494-1-will@kernel.org
>   v2: https://lore.kernel.org/r/20210108171517.5290-1-will@kernel.org
>   v3: https://lore.kernel.org/r/20210114175934.13070-1-will@kernel.org
> 
> The patches allow architectures to opt-in at runtime for faultaround
> mappings to be created as 'old' instead of 'young'. Although there have
> been previous attempts at this, they failed either because the decision
> was deferred to userspace [1] or because it was done unconditionally and
> shown to regress benchmarks for particular architectures [2].
> 
> The big change since v3 is that the immutable fields of 'struct vm_fault'
> now live in a 'const' anonymous struct. Although Clang will silently
> accept modifications to these fields [3], GCC emits an error. The
> resulting diffstat is _considerably_ more manageable with this approach.

The only changes I have pending against this series are cosmetic (commit
logs). Can I go ahead and queue this in the arm64 tree so that it can sit
in linux-next for a bit? (positive or negative feedback appreciated!).

Thanks,

Will


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 0/8] Create 'old' ptes for faultaround mappings on arm64 with hardware access flag
  2021-01-26 23:08 ` [PATCH v4 0/8] Create 'old' ptes for faultaround mappings on arm64 with hardware access flag Will Deacon
@ 2021-01-26 23:28   ` Hugh Dickins
  2021-01-27 17:16     ` Will Deacon
  0 siblings, 1 reply; 24+ messages in thread
From: Hugh Dickins @ 2021-01-26 23:28 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, linux-mm, linux-arm-kernel, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Linus Torvalds, Vinayak Menon, Hugh Dickins, Nick Desaulniers,
	kernel-team

On Tue, 26 Jan 2021, Will Deacon wrote:
> On Wed, Jan 20, 2021 at 05:36:04PM +0000, Will Deacon wrote:
> > Hi all,
> > 
> > This is version four of the patches I previously posted here:
> > 
> >   v1: https://lore.kernel.org/r/20201209163950.8494-1-will@kernel.org
> >   v2: https://lore.kernel.org/r/20210108171517.5290-1-will@kernel.org
> >   v3: https://lore.kernel.org/r/20210114175934.13070-1-will@kernel.org
> > 
> > The patches allow architectures to opt-in at runtime for faultaround
> > mappings to be created as 'old' instead of 'young'. Although there have
> > been previous attempts at this, they failed either because the decision
> > was deferred to userspace [1] or because it was done unconditionally and
> > shown to regress benchmarks for particular architectures [2].
> > 
> > The big change since v3 is that the immutable fields of 'struct vm_fault'
> > now live in a 'const' anonymous struct. Although Clang will silently
> > accept modifications to these fields [3], GCC emits an error. The
> > resulting diffstat is _considerably_ more manageable with this approach.
> 
> The only changes I have pending against this series are cosmetic (commit
> logs). Can I go ahead and queue this in the arm64 tree so that it can sit
> in linux-next for a bit? (positive or negative feedback appreciated!).

That would be fine by me: I ran v3 on rc3, then the nicer smaller v4
on rc4, and saw no problems when running either of them (x86_64 only).

Hugh


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 0/8] Create 'old' ptes for faultaround mappings on arm64 with hardware access flag
  2021-01-26 23:28   ` Hugh Dickins
@ 2021-01-27 17:16     ` Will Deacon
  0 siblings, 0 replies; 24+ messages in thread
From: Will Deacon @ 2021-01-27 17:16 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: linux-kernel, linux-mm, linux-arm-kernel, Catalin Marinas,
	Jan Kara, Minchan Kim, Andrew Morton, Kirill A . Shutemov,
	Linus Torvalds, Vinayak Menon, Nick Desaulniers, kernel-team

On Tue, Jan 26, 2021 at 03:28:22PM -0800, Hugh Dickins wrote:
> On Tue, 26 Jan 2021, Will Deacon wrote:
> > On Wed, Jan 20, 2021 at 05:36:04PM +0000, Will Deacon wrote:
> > > Hi all,
> > > 
> > > This is version four of the patches I previously posted here:
> > > 
> > >   v1: https://lore.kernel.org/r/20201209163950.8494-1-will@kernel.org
> > >   v2: https://lore.kernel.org/r/20210108171517.5290-1-will@kernel.org
> > >   v3: https://lore.kernel.org/r/20210114175934.13070-1-will@kernel.org
> > > 
> > > The patches allow architectures to opt-in at runtime for faultaround
> > > mappings to be created as 'old' instead of 'young'. Although there have
> > > been previous attempts at this, they failed either because the decision
> > > was deferred to userspace [1] or because it was done unconditionally and
> > > shown to regress benchmarks for particular architectures [2].
> > > 
> > > The big change since v3 is that the immutable fields of 'struct vm_fault'
> > > now live in a 'const' anonymous struct. Although Clang will silently
> > > accept modifications to these fields [3], GCC emits an error. The
> > > resulting diffstat is _considerably_ more manageable with this approach.
> > 
> > The only changes I have pending against this series are cosmetic (commit
> > logs). Can I go ahead and queue this in the arm64 tree so that it can sit
> > in linux-next for a bit? (positive or negative feedback appreciated!).
> 
> That would be fine by me: I ran v3 on rc3, then the nicer smaller v4
> on rc4, and saw no problems when running either of them (x86_64 only).

Thanks, Hugh. I'll stick these into -next later today and we'll see how we
get on.

Will


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2021-01-27 17:16 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-20 17:36 [PATCH v4 0/8] Create 'old' ptes for faultaround mappings on arm64 with hardware access flag Will Deacon
2021-01-20 17:36 ` [PATCH v4 1/8] mm: Cleanup faultaround and finish_fault() codepaths Will Deacon
2021-01-20 17:36 ` [PATCH v4 2/8] mm: Allow architectures to request 'old' entries when prefaulting Will Deacon
2021-01-20 17:36 ` [PATCH v4 3/8] arm64: mm: Implement arch_wants_old_prefaulted_pte() Will Deacon
2021-01-20 17:36 ` [PATCH v4 4/8] mm: Move immutable fields of 'struct vm_fault' into anonymous struct Will Deacon
2021-01-20 18:13   ` Nick Desaulniers
2021-01-21 12:48     ` Will Deacon
2021-01-20 17:36 ` [PATCH v4 5/8] mm: Pass 'address' to map to do_set_pte() and drop FAULT_FLAG_PREFAULT Will Deacon
2021-01-20 17:36 ` [PATCH v4 6/8] mm: Avoid modifying vmf.address in __collapse_huge_page_swapin() Will Deacon
2021-01-20 17:36 ` [PATCH v4 7/8] mm: Use static initialisers for immutable fields of 'struct vm_fault' Will Deacon
2021-01-20 18:21   ` Nick Desaulniers
2021-01-21 12:50     ` Will Deacon
2021-01-20 17:36 ` [PATCH v4 8/8] mm: Mark anonymous struct field of 'struct vm_fault' as 'const' Will Deacon
2021-01-20 18:27   ` Nick Desaulniers
2021-01-20 19:02     ` Linus Torvalds
2021-01-21 13:11       ` Will Deacon
2021-01-21 19:24         ` Nick Desaulniers
2021-01-21 21:28           ` Will Deacon
2021-01-22 19:10             ` Nick Desaulniers
2021-01-22 19:27               ` Will Deacon
2021-01-22 17:47         ` Linus Torvalds
2021-01-26 23:08 ` [PATCH v4 0/8] Create 'old' ptes for faultaround mappings on arm64 with hardware access flag Will Deacon
2021-01-26 23:28   ` Hugh Dickins
2021-01-27 17:16     ` Will Deacon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).