* [PATCH v2 00/13] Simplify the external interface for GUP
From: Jason Gunthorpe @ 2023-01-24 20:34 UTC
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM)

It is quite a maze of EXPORTED symbols leading up to the three actual
worker functions of GUP. Simplify this by reorganizing some of the code so
the EXPORTED symbols directly call the correct internal function with
validated and consistent arguments.

Consolidate all the assertions into one place at the top of the call
chains.

Remove some dead code.

Move more things into the mm/internal.h header.
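
The end result is that each exported entry point validates its arguments
once and then calls the right worker directly. Roughly (a condensed
sketch of what patch 05 below arrives at, not verbatim):

  long get_user_pages(unsigned long start, unsigned long nr_pages,
		      unsigned int gup_flags, struct page **pages,
		      struct vm_area_struct **vmas)
  {
	/* one consolidated check point for the external invariants */
	if (!is_valid_gup_args(pages, vmas, NULL, &gup_flags, FOLL_TOUCH))
		return -EINVAL;
	/* then straight into the internal worker */
	return __get_user_pages_locked(current->mm, start, nr_pages, pages,
				       vmas, NULL, gup_flags);
  }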

v2:
 - Call the new flag FOLL_UNLOCKABLE
 - Revise comments around locked to reflect FOLL_UNLOCKABLE
 - s/lock_dropped/must_unlock/
 - Various grammar fixes
 - Add missing FOLL_UNLOCKABLE users populate_vma_page_range,
   faultin_vma_page_range
 - Make the mmap_lock assertion unconditional
 - Move internal FOLL_ flags to mm/internal.h
 - Rebase onto 41c457e96ba5 ("Merge branch 'mm-nonmm-unstable' into mm-everything")
v1: https://lore.kernel.org/r/0-v1-dd94f8f0d5ad+716-gup_tidy_jgg@nvidia.com

Jason Gunthorpe (13):
  mm/gup: have internal functions get the mmap_read_lock()
  mm/gup: remove obsolete FOLL_LONGTERM comment
  mm/gup: don't call __gup_longterm_locked() if FOLL_LONGTERM cannot be
    set
  mm/gup: move try_grab_page() to mm/internal.h
  mm/gup: simplify the external interface functions and consolidate
    invariants
  mm/gup: add an assertion that the mmap lock is locked
  mm/gup: remove locked being NULL from faultin_vma_page_range()
  mm/gup: add FOLL_UNLOCKABLE
  mm/gup: make locked never NULL in the internal GUP functions
  mm/gup: remove pin_user_pages_fast_only()
  mm/gup: make get_user_pages_fast_only() return the common return value
  mm/gup: move gup_must_unshare() to mm/internal.h
  mm/gup: move private gup FOLL_ flags to internal.h

 include/linux/mm.h       |  69 --------
 include/linux/mm_types.h |  62 ++++---
 mm/gup.c                 | 370 +++++++++++++++++++--------------------
 mm/huge_memory.c         |  10 --
 mm/internal.h            |  81 +++++++++
 5 files changed, 292 insertions(+), 300 deletions(-)


base-commit: 41c457e96ba51547c6e80b125dcba91aa6a00699
-- 
2.39.0




* [PATCH v2 01/13] mm/gup: have internal functions get the mmap_read_lock()
From: Jason Gunthorpe @ 2023-01-24 20:34 UTC
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM)

__get_user_pages_locked() and __gup_longterm_locked() both require the
mmap lock to be held. They have a slightly unusual locked parameter that
is used to allow these functions to unlock and relock the mmap lock and
convey that fact to the caller.

Several places wrap these functions with a simple mmap_read_lock() just so
they can follow the optimized locked protocol.

Consolidate this locking into the functions themselves. Allow internal
callers to set locked = 0 to make the functions acquire and release the
lock on their own.

Reorganize __gup_longterm_locked() to use the autolocking in
__get_user_pages_locked().

Replace all the places obtaining the mmap_read_lock() just to call
__get_user_pages_locked() with the new mechanism. Replace all the internal
callers of get_user_pages_unlocked() with direct calls to
__gup_longterm_locked() using the new mechanism.

A following patch will add assertions ensuring the external interface
continues to always pass in locked = 1.
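
In caller-side terms the change looks like this (condensed from the
get_dump_page() hunk below; not a complete function):

  /* Before: callers open-coded the lock dance around the *locked protocol */
  int locked = 1;

  if (mmap_read_lock_killable(mm))
	return NULL;
  ret = __get_user_pages_locked(mm, addr, 1, &page, NULL, &locked,
				FOLL_FORCE | FOLL_DUMP | FOLL_GET);
  if (locked)
	mmap_read_unlock(mm);

  /* After: locked = 0 asks GUP to acquire and release the lock itself */
  int locked = 0;

  ret = __get_user_pages_locked(mm, addr, 1, &page, NULL, &locked,
				FOLL_FORCE | FOLL_DUMP | FOLL_GET);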

Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 mm/gup.c | 113 ++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 65 insertions(+), 48 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 920ee4d85e70ba..7007b3afc4fda8 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1331,8 +1331,17 @@ static bool gup_signal_pending(unsigned int flags)
 }
 
 /*
- * Please note that this function, unlike __get_user_pages will not
- * return 0 for nr_pages > 0 without FOLL_NOWAIT
+ * Locking: (*locked == 1) means that the mmap_lock has already been acquired by
+ * the caller. This function may drop the mmap_lock. If it does so, then it will
+ * set (*locked = 0).
+ *
+ * (*locked == 0) means that the caller expects this function to acquire and
+ * drop the mmap_lock. Therefore, the value of *locked will still be zero when
+ * the function returns, even though it may have changed temporarily during
+ * function execution.
+ *
+ * Please note that this function, unlike __get_user_pages(), will not return 0
+ * for nr_pages > 0, unless FOLL_NOWAIT is used.
  */
 static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
 						unsigned long start,
@@ -1343,13 +1352,22 @@ static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
 						unsigned int flags)
 {
 	long ret, pages_done;
-	bool lock_dropped;
+	bool must_unlock = false;
 
 	if (locked) {
 		/* if VM_FAULT_RETRY can be returned, vmas become invalid */
 		BUG_ON(vmas);
-		/* check caller initialized locked */
-		BUG_ON(*locked != 1);
+	}
+
+	/*
+	 * The internal caller expects GUP to manage the lock internally and the
+	 * lock must be released when this returns.
+	 */
+	if (locked && !*locked) {
+		if (mmap_read_lock_killable(mm))
+			return -EAGAIN;
+		must_unlock = true;
+		*locked = 1;
 	}
 
 	if (flags & FOLL_PIN)
@@ -1368,7 +1386,6 @@ static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
 		flags |= FOLL_GET;
 
 	pages_done = 0;
-	lock_dropped = false;
 	for (;;) {
 		ret = __get_user_pages(mm, start, nr_pages, flags, pages,
 				       vmas, locked);
@@ -1404,7 +1421,9 @@ static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
 		if (likely(pages))
 			pages += ret;
 		start += ret << PAGE_SHIFT;
-		lock_dropped = true;
+
+		/* The lock was temporarily dropped, so we must unlock later */
+		must_unlock = true;
 
 retry:
 		/*
@@ -1451,10 +1470,11 @@ static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
 			pages++;
 		start += PAGE_SIZE;
 	}
-	if (lock_dropped && *locked) {
+	if (must_unlock && *locked) {
 		/*
-		 * We must let the caller know we temporarily dropped the lock
-		 * and so the critical section protected by it was lost.
+		 * We either temporarily dropped the lock, or the caller
+		 * requested that we both acquire and drop the lock. Either way,
+		 * we must now unlock, and notify the caller of that state.
 		 */
 		mmap_read_unlock(mm);
 		*locked = 0;
@@ -1659,9 +1679,24 @@ static long __get_user_pages_locked(struct mm_struct *mm, unsigned long start,
 		unsigned int foll_flags)
 {
 	struct vm_area_struct *vma;
+	bool must_unlock = false;
 	unsigned long vm_flags;
 	long i;
 
+	if (!nr_pages)
+		return 0;
+
+	/*
+	 * The internal caller expects GUP to manage the lock internally and the
+	 * lock must be released when this returns.
+	 */
+	if (locked && !*locked) {
+		if (mmap_read_lock_killable(mm))
+			return -EAGAIN;
+		must_unlock = true;
+		*locked = 1;
+	}
+
 	/* calculate required read or write permissions.
 	 * If FOLL_FORCE is set, we only require the "MAY" flags.
 	 */
@@ -1673,12 +1708,12 @@ static long __get_user_pages_locked(struct mm_struct *mm, unsigned long start,
 	for (i = 0; i < nr_pages; i++) {
 		vma = find_vma(mm, start);
 		if (!vma)
-			goto finish_or_fault;
+			break;
 
 		/* protect what we can, including chardevs */
 		if ((vma->vm_flags & (VM_IO | VM_PFNMAP)) ||
 		    !(vm_flags & vma->vm_flags))
-			goto finish_or_fault;
+			break;
 
 		if (pages) {
 			pages[i] = virt_to_page((void *)start);
@@ -1690,9 +1725,11 @@ static long __get_user_pages_locked(struct mm_struct *mm, unsigned long start,
 		start = (start + PAGE_SIZE) & PAGE_MASK;
 	}
 
-	return i;
+	if (must_unlock && *locked) {
+		mmap_read_unlock(mm);
+		*locked = 0;
+	}
 
-finish_or_fault:
 	return i ? : -EFAULT;
 }
 #endif /* !CONFIG_MMU */
@@ -1861,17 +1898,13 @@ EXPORT_SYMBOL(fault_in_readable);
 #ifdef CONFIG_ELF_CORE
 struct page *get_dump_page(unsigned long addr)
 {
-	struct mm_struct *mm = current->mm;
 	struct page *page;
-	int locked = 1;
+	int locked = 0;
 	int ret;
 
-	if (mmap_read_lock_killable(mm))
-		return NULL;
-	ret = __get_user_pages_locked(mm, addr, 1, &page, NULL, &locked,
+	ret = __get_user_pages_locked(current->mm, addr, 1, &page, NULL,
+				      &locked,
 				      FOLL_FORCE | FOLL_DUMP | FOLL_GET);
-	if (locked)
-		mmap_read_unlock(mm);
 	return (ret == 1) ? page : NULL;
 }
 #endif /* CONFIG_ELF_CORE */
@@ -2047,13 +2080,9 @@ static long __gup_longterm_locked(struct mm_struct *mm,
 				  int *locked,
 				  unsigned int gup_flags)
 {
-	bool must_unlock = false;
 	unsigned int flags;
 	long rc, nr_pinned_pages;
 
-	if (locked && WARN_ON_ONCE(!*locked))
-		return -EINVAL;
-
 	if (!(gup_flags & FOLL_LONGTERM))
 		return __get_user_pages_locked(mm, start, nr_pages, pages, vmas,
 					       locked, gup_flags);
@@ -2070,11 +2099,6 @@ static long __gup_longterm_locked(struct mm_struct *mm,
 		return -EINVAL;
 	flags = memalloc_pin_save();
 	do {
-		if (locked && !*locked) {
-			mmap_read_lock(mm);
-			must_unlock = true;
-			*locked = 1;
-		}
 		nr_pinned_pages = __get_user_pages_locked(mm, start, nr_pages,
 							  pages, vmas, locked,
 							  gup_flags);
@@ -2085,11 +2109,6 @@ static long __gup_longterm_locked(struct mm_struct *mm,
 		rc = check_and_migrate_movable_pages(nr_pinned_pages, pages);
 	} while (rc == -EAGAIN);
 	memalloc_pin_restore(flags);
-
-	if (locked && *locked && must_unlock) {
-		mmap_read_unlock(mm);
-		*locked = 0;
-	}
 	return rc ? rc : nr_pinned_pages;
 }
 
@@ -2242,16 +2261,10 @@ EXPORT_SYMBOL(get_user_pages);
 long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
 			     struct page **pages, unsigned int gup_flags)
 {
-	struct mm_struct *mm = current->mm;
-	int locked = 1;
-	long ret;
+	int locked = 0;
 
-	mmap_read_lock(mm);
-	ret = __gup_longterm_locked(mm, start, nr_pages, pages, NULL, &locked,
-				    gup_flags | FOLL_TOUCH);
-	if (locked)
-		mmap_read_unlock(mm);
-	return ret;
+	return __gup_longterm_locked(current->mm, start, nr_pages, pages, NULL,
+				     &locked, gup_flags | FOLL_TOUCH);
 }
 EXPORT_SYMBOL(get_user_pages_unlocked);
 
@@ -2904,6 +2917,7 @@ static int internal_get_user_pages_fast(unsigned long start,
 {
 	unsigned long len, end;
 	unsigned long nr_pinned;
+	int locked = 0;
 	int ret;
 
 	if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM |
@@ -2932,8 +2946,9 @@ static int internal_get_user_pages_fast(unsigned long start,
 	/* Slow path: try to get the remaining pages with get_user_pages */
 	start += nr_pinned << PAGE_SHIFT;
 	pages += nr_pinned;
-	ret = get_user_pages_unlocked(start, nr_pages - nr_pinned, pages,
-				      gup_flags);
+	ret = __gup_longterm_locked(current->mm, start, nr_pages - nr_pinned,
+				    pages, NULL, &locked,
+				    gup_flags | FOLL_TOUCH);
 	if (ret < 0) {
 		/*
 		 * The caller has to unpin the pages we already pinned so
@@ -3183,11 +3198,13 @@ long pin_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
 	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
 	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
 		return -EINVAL;
+	int locked = 0;
 
 	if (WARN_ON_ONCE(!pages))
 		return -EINVAL;
 
-	gup_flags |= FOLL_PIN;
-	return get_user_pages_unlocked(start, nr_pages, pages, gup_flags);
+	gup_flags |= FOLL_PIN | FOLL_TOUCH;
+	return __gup_longterm_locked(current->mm, start, nr_pages, pages, NULL,
+				     &locked, gup_flags);
 }
 EXPORT_SYMBOL(pin_user_pages_unlocked);
-- 
2.39.0




* [PATCH v2 02/13] mm/gup: remove obsolete FOLL_LONGTERM comment
From: Jason Gunthorpe @ 2023-01-24 20:34 UTC
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM)

These days FOLL_LONGTERM is not allowed at all on any of the
get_user_pages*() functions; it may only be used with pin_user_pages*(),
and it is now supported by all of the pin_user_pages*() functions.
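
A minimal usage sketch of the rule this comment predates (hypothetical
caller, error handling elided):

  /* Allowed: long-term pins go through the pin_user_pages*() API */
  long nr = pin_user_pages(start, nr_pages, FOLL_WRITE | FOLL_LONGTERM,
			   pages, NULL);
  if (nr > 0)
	unpin_user_pages(pages, nr);

  /* Not allowed: FOLL_LONGTERM on any get_user_pages*() variant */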

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 include/linux/mm_types.h | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index a689198caf7408..8971a40c120e38 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1052,12 +1052,6 @@ typedef unsigned int __bitwise zap_flags_t;
  * specifically failed.  Filesystem pages are still subject to bugs and use of
  * FOLL_LONGTERM should be avoided on those pages.
  *
- * FIXME: Also NOTE that FOLL_LONGTERM is not supported in every GUP call.
- * Currently only get_user_pages() and get_user_pages_fast() support this flag
- * and calls to get_user_pages_[un]locked are specifically not allowed.  This
- * is due to an incompatibility with the FS DAX check and
- * FAULT_FLAG_ALLOW_RETRY.
- *
  * In the CMA case: long term pins in a CMA region would unnecessarily fragment
  * that region.  And so, CMA attempts to migrate the page before pinning, when
  * FOLL_LONGTERM is specified.
-- 
2.39.0




* [PATCH v2 03/13] mm/gup: don't call __gup_longterm_locked() if FOLL_LONGTERM cannot be set
From: Jason Gunthorpe @ 2023-01-24 20:34 UTC
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM)

get_user_pages_remote(), get_user_pages_unlocked() and get_user_pages()
are never called with FOLL_LONGTERM, so have them call
__get_user_pages_locked() directly.

The next patch will add an assertion for this.

Suggested-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 mm/gup.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 7007b3afc4fda8..a6559d7243db92 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2200,8 +2200,8 @@ long get_user_pages_remote(struct mm_struct *mm,
 	if (!is_valid_gup_flags(gup_flags))
 		return -EINVAL;
 
-	return __gup_longterm_locked(mm, start, nr_pages, pages, vmas, locked,
-				     gup_flags | FOLL_TOUCH | FOLL_REMOTE);
+	return __get_user_pages_locked(mm, start, nr_pages, pages, vmas, locked,
+				       gup_flags | FOLL_TOUCH | FOLL_REMOTE);
 }
 EXPORT_SYMBOL(get_user_pages_remote);
 
@@ -2238,8 +2238,8 @@ long get_user_pages(unsigned long start, unsigned long nr_pages,
 	if (!is_valid_gup_flags(gup_flags))
 		return -EINVAL;
 
-	return __gup_longterm_locked(current->mm, start, nr_pages,
-				     pages, vmas, NULL, gup_flags | FOLL_TOUCH);
+	return __get_user_pages_locked(current->mm, start, nr_pages, pages,
+				       vmas, NULL, gup_flags | FOLL_TOUCH);
 }
 EXPORT_SYMBOL(get_user_pages);
 
@@ -2263,8 +2263,8 @@ long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
 {
 	int locked = 0;
 
-	return __gup_longterm_locked(current->mm, start, nr_pages, pages, NULL,
-				     &locked, gup_flags | FOLL_TOUCH);
+	return __get_user_pages_locked(current->mm, start, nr_pages, pages,
+				       NULL, &locked, gup_flags | FOLL_TOUCH);
 }
 EXPORT_SYMBOL(get_user_pages_unlocked);
 
-- 
2.39.0




* [PATCH v2 04/13] mm/gup: move try_grab_page() to mm/internal.h
From: Jason Gunthorpe @ 2023-01-24 20:34 UTC
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM)

This is part of the internal implementation of gup.c and is only
non-static so that the GUP-related parts of huge_memory.c and hugetlb.c
can call it.

Put it in internal.h beside the similarly purposed try_grab_folio().

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 include/linux/mm.h | 2 --
 mm/internal.h      | 1 +
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index c9db257f09b307..dfc2a88bc4a8ed 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1173,8 +1173,6 @@ static inline void get_page(struct page *page)
 	folio_get(page_folio(page));
 }
 
-int __must_check try_grab_page(struct page *page, unsigned int flags);
-
 static inline __must_check bool try_get_page(struct page *page)
 {
 	page = compound_head(page);
diff --git a/mm/internal.h b/mm/internal.h
index ce462bf145b441..0f035bcaf133f5 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -852,6 +852,7 @@ int migrate_device_coherent_page(struct page *page);
  * mm/gup.c
  */
 struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags);
+int __must_check try_grab_page(struct page *page, unsigned int flags);
 
 extern bool mirrored_kernelcore;
 
-- 
2.39.0




* [PATCH v2 05/13] mm/gup: simplify the external interface functions and consolidate invariants
From: Jason Gunthorpe @ 2023-01-24 20:34 UTC
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM)

The GUP family of functions has a complex, but fairly well defined, set
of invariants for its arguments. Currently these checks are sprinkled
about, sometimes in duplicate, through many functions.

Internally we don't follow all the invariants that the external interface
has to follow, so place these checks directly at the exported
interface. This ensures the internal functions are never called with a
violated invariant.

Remove the duplicated invariant checks.

The end result is to make these functions fully internal:
 __get_user_pages_locked()
 internal_get_user_pages_fast()
 __gup_longterm_locked()

And all the other functions call directly into one of these.
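
For example, flag combinations like these are now rejected by a single
WARN_ON_ONCE at the exported boundary rather than deep in the call chain
(illustrative calls, not from the patch):

  /* FOLL_PIN is internal-only: rejected */
  get_user_pages(start, 1, FOLL_PIN, pages, NULL);

  /* FOLL_LONGTERM without FOLL_PIN: rejected */
  get_user_pages(start, 1, FOLL_LONGTERM, pages, NULL);

  /* FOLL_GET and FOLL_PIN together: rejected */
  pin_user_pages(start, 1, FOLL_GET, pages, NULL);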

Suggested-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 mm/gup.c         | 153 +++++++++++++++++++++++------------------------
 mm/huge_memory.c |  10 ----
 2 files changed, 75 insertions(+), 88 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index a6559d7243db92..4c236fb83dcd3e 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -215,7 +215,6 @@ int __must_check try_grab_page(struct page *page, unsigned int flags)
 {
 	struct folio *folio = page_folio(page);
 
-	WARN_ON_ONCE((flags & (FOLL_GET | FOLL_PIN)) == (FOLL_GET | FOLL_PIN));
 	if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
 		return -ENOMEM;
 
@@ -818,7 +817,7 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
 	if (vma_is_secretmem(vma))
 		return NULL;
 
-	if (foll_flags & FOLL_PIN)
+	if (WARN_ON_ONCE(foll_flags & FOLL_PIN))
 		return NULL;
 
 	page = follow_page_mask(vma, address, foll_flags, &ctx);
@@ -975,9 +974,6 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
 	if ((gup_flags & FOLL_LONGTERM) && vma_is_fsdax(vma))
 		return -EOPNOTSUPP;
 
-	if ((gup_flags & FOLL_LONGTERM) && (gup_flags & FOLL_PCI_P2PDMA))
-		return -EOPNOTSUPP;
-
 	if (vma_is_secretmem(vma))
 		return -EFAULT;
 
@@ -1354,11 +1350,6 @@ static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
 	long ret, pages_done;
 	bool must_unlock = false;
 
-	if (locked) {
-		/* if VM_FAULT_RETRY can be returned, vmas become invalid */
-		BUG_ON(vmas);
-	}
-
 	/*
 	 * The internal caller expects GUP to manage the lock internally and the
 	 * lock must be released when this returns.
@@ -2087,16 +2078,6 @@ static long __gup_longterm_locked(struct mm_struct *mm,
 		return __get_user_pages_locked(mm, start, nr_pages, pages, vmas,
 					       locked, gup_flags);
 
-	/*
-	 * If we get to this point then FOLL_LONGTERM is set, and FOLL_LONGTERM
-	 * implies FOLL_PIN (although the reverse is not true). Therefore it is
-	 * correct to unconditionally call check_and_migrate_movable_pages()
-	 * which assumes pages have been pinned via FOLL_PIN.
-	 *
-	 * Enforce the above reasoning by asserting that FOLL_PIN is set.
-	 */
-	if (WARN_ON(!(gup_flags & FOLL_PIN)))
-		return -EINVAL;
 	flags = memalloc_pin_save();
 	do {
 		nr_pinned_pages = __get_user_pages_locked(mm, start, nr_pages,
@@ -2106,28 +2087,66 @@ static long __gup_longterm_locked(struct mm_struct *mm,
 			rc = nr_pinned_pages;
 			break;
 		}
+
+		/* FOLL_LONGTERM implies FOLL_PIN */
 		rc = check_and_migrate_movable_pages(nr_pinned_pages, pages);
 	} while (rc == -EAGAIN);
 	memalloc_pin_restore(flags);
 	return rc ? rc : nr_pinned_pages;
 }
 
-static bool is_valid_gup_flags(unsigned int gup_flags)
+/*
+ * Check that the given flags are valid for the exported gup/pup interface, and
+ * update them with the required flags that the caller must have set.
+ */
+static bool is_valid_gup_args(struct page **pages, struct vm_area_struct **vmas,
+			      int *locked, unsigned int *gup_flags_p,
+			      unsigned int to_set)
 {
+	unsigned int gup_flags = *gup_flags_p;
+
 	/*
-	 * FOLL_PIN must only be set internally by the pin_user_pages*() APIs,
-	 * never directly by the caller, so enforce that with an assertion:
+	 * These flags not allowed to be specified externally to the gup
+	 * interfaces:
+	 * - FOLL_PIN/FOLL_TRIED/FOLL_FAST_ONLY are internal only
+	 * - FOLL_REMOTE is internal only and used on follow_page()
 	 */
-	if (WARN_ON_ONCE(gup_flags & FOLL_PIN))
+	if (WARN_ON_ONCE(gup_flags & (FOLL_PIN | FOLL_TRIED |
+				      FOLL_REMOTE | FOLL_FAST_ONLY)))
+		return false;
+
+	gup_flags |= to_set;
+
+	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
+	if (WARN_ON_ONCE((gup_flags & (FOLL_PIN | FOLL_GET)) ==
+			 (FOLL_PIN | FOLL_GET)))
+		return false;
+
+	/* LONGTERM can only be specified when pinning */
+	if (WARN_ON_ONCE(!(gup_flags & FOLL_PIN) && (gup_flags & FOLL_LONGTERM)))
+		return false;
+
+	/* Pages input must be given if using GET/PIN */
+	if (WARN_ON_ONCE((gup_flags & (FOLL_GET | FOLL_PIN)) && !pages))
 		return false;
+
+	/* At the external interface locked must be set */
+	if (WARN_ON_ONCE(locked && *locked != 1))
+		return false;
+
+	/* We want to allow the pgmap to be hot-unplugged at all times */
+	if (WARN_ON_ONCE((gup_flags & FOLL_LONGTERM) &&
+			 (gup_flags & FOLL_PCI_P2PDMA)))
+		return false;
+
 	/*
-	 * FOLL_PIN is a prerequisite to FOLL_LONGTERM. Another way of saying
-	 * that is, FOLL_LONGTERM is a specific case, more restrictive case of
-	 * FOLL_PIN.
+	 * Can't use VMAs with locked, as locked allows GUP to unlock
+	 * which invalidates the vmas array
 	 */
-	if (WARN_ON_ONCE(gup_flags & FOLL_LONGTERM))
+	if (WARN_ON_ONCE(vmas && locked))
 		return false;
 
+	*gup_flags_p = gup_flags;
 	return true;
 }
 
@@ -2197,11 +2216,12 @@ long get_user_pages_remote(struct mm_struct *mm,
 		unsigned int gup_flags, struct page **pages,
 		struct vm_area_struct **vmas, int *locked)
 {
-	if (!is_valid_gup_flags(gup_flags))
+	if (!is_valid_gup_args(pages, vmas, locked, &gup_flags,
+			       FOLL_TOUCH | FOLL_REMOTE))
 		return -EINVAL;
 
 	return __get_user_pages_locked(mm, start, nr_pages, pages, vmas, locked,
-				       gup_flags | FOLL_TOUCH | FOLL_REMOTE);
+				       gup_flags);
 }
 EXPORT_SYMBOL(get_user_pages_remote);
 
@@ -2235,11 +2255,11 @@ long get_user_pages(unsigned long start, unsigned long nr_pages,
 		unsigned int gup_flags, struct page **pages,
 		struct vm_area_struct **vmas)
 {
-	if (!is_valid_gup_flags(gup_flags))
+	if (!is_valid_gup_args(pages, vmas, NULL, &gup_flags, FOLL_TOUCH))
 		return -EINVAL;
 
 	return __get_user_pages_locked(current->mm, start, nr_pages, pages,
-				       vmas, NULL, gup_flags | FOLL_TOUCH);
+				       vmas, NULL, gup_flags);
 }
 EXPORT_SYMBOL(get_user_pages);
 
@@ -2263,8 +2283,11 @@ long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
 {
 	int locked = 0;
 
+	if (!is_valid_gup_args(pages, NULL, NULL, &gup_flags, FOLL_TOUCH))
+		return -EINVAL;
+
 	return __get_user_pages_locked(current->mm, start, nr_pages, pages,
-				       NULL, &locked, gup_flags | FOLL_TOUCH);
+				       NULL, &locked, gup_flags);
 }
 EXPORT_SYMBOL(get_user_pages_unlocked);
 
@@ -2992,7 +3015,9 @@ int get_user_pages_fast_only(unsigned long start, int nr_pages,
 	 * FOLL_FAST_ONLY is required in order to match the API description of
 	 * this routine: no fall back to regular ("slow") GUP.
 	 */
-	gup_flags |= FOLL_GET | FOLL_FAST_ONLY;
+	if (!is_valid_gup_args(pages, NULL, NULL, &gup_flags,
+			       FOLL_GET | FOLL_FAST_ONLY))
+		return -EINVAL;
 
 	nr_pinned = internal_get_user_pages_fast(start, nr_pages, gup_flags,
 						 pages);
@@ -3029,16 +3054,14 @@ EXPORT_SYMBOL_GPL(get_user_pages_fast_only);
 int get_user_pages_fast(unsigned long start, int nr_pages,
 			unsigned int gup_flags, struct page **pages)
 {
-	if (!is_valid_gup_flags(gup_flags))
-		return -EINVAL;
-
 	/*
 	 * The caller may or may not have explicitly set FOLL_GET; either way is
 	 * OK. However, internally (within mm/gup.c), gup fast variants must set
 	 * FOLL_GET, because gup fast is always a "pin with a +1 page refcount"
 	 * request.
 	 */
-	gup_flags |= FOLL_GET;
+	if (!is_valid_gup_args(pages, NULL, NULL, &gup_flags, FOLL_GET))
+		return -EINVAL;
 	return internal_get_user_pages_fast(start, nr_pages, gup_flags, pages);
 }
 EXPORT_SYMBOL_GPL(get_user_pages_fast);
@@ -3062,14 +3085,8 @@ EXPORT_SYMBOL_GPL(get_user_pages_fast);
 int pin_user_pages_fast(unsigned long start, int nr_pages,
 			unsigned int gup_flags, struct page **pages)
 {
-	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
-	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
+	if (!is_valid_gup_args(pages, NULL, NULL, &gup_flags, FOLL_PIN))
 		return -EINVAL;
-
-	if (WARN_ON_ONCE(!pages))
-		return -EINVAL;
-
-	gup_flags |= FOLL_PIN;
 	return internal_get_user_pages_fast(start, nr_pages, gup_flags, pages);
 }
 EXPORT_SYMBOL_GPL(pin_user_pages_fast);
@@ -3085,20 +3102,14 @@ int pin_user_pages_fast_only(unsigned long start, int nr_pages,
 {
 	int nr_pinned;
 
-	/*
-	 * FOLL_GET and FOLL_PIN are mutually exclusive. Note that the API
-	 * rules require returning 0, rather than -errno:
-	 */
-	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
-		return 0;
-
-	if (WARN_ON_ONCE(!pages))
-		return 0;
 	/*
 	 * FOLL_FAST_ONLY is required in order to match the API description of
 	 * this routine: no fall back to regular ("slow") GUP.
 	 */
-	gup_flags |= (FOLL_PIN | FOLL_FAST_ONLY);
+	if (!is_valid_gup_args(pages, NULL, NULL, &gup_flags,
+			       FOLL_PIN | FOLL_FAST_ONLY))
+		return 0;
+
 	nr_pinned = internal_get_user_pages_fast(start, nr_pages, gup_flags,
 						 pages);
 	/*
@@ -3140,16 +3151,11 @@ long pin_user_pages_remote(struct mm_struct *mm,
 			   unsigned int gup_flags, struct page **pages,
 			   struct vm_area_struct **vmas, int *locked)
 {
-	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
-	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
-		return -EINVAL;
-
-	if (WARN_ON_ONCE(!pages))
-		return -EINVAL;
-
+	if (!is_valid_gup_args(pages, vmas, locked, &gup_flags,
+			       FOLL_PIN | FOLL_TOUCH | FOLL_REMOTE))
+		return 0;
 	return __gup_longterm_locked(mm, start, nr_pages, pages, vmas, locked,
-				     gup_flags | FOLL_PIN | FOLL_TOUCH |
-					     FOLL_REMOTE);
+				     gup_flags);
 }
 EXPORT_SYMBOL(pin_user_pages_remote);
 
@@ -3174,14 +3180,8 @@ long pin_user_pages(unsigned long start, unsigned long nr_pages,
 		    unsigned int gup_flags, struct page **pages,
 		    struct vm_area_struct **vmas)
 {
-	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
-	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
-		return -EINVAL;
-
-	if (WARN_ON_ONCE(!pages))
-		return -EINVAL;
-
-	gup_flags |= FOLL_PIN;
+	if (!is_valid_gup_args(pages, vmas, NULL, &gup_flags, FOLL_PIN))
+		return 0;
 	return __gup_longterm_locked(current->mm, start, nr_pages,
 				     pages, vmas, NULL, gup_flags);
 }
@@ -3195,15 +3195,12 @@ EXPORT_SYMBOL(pin_user_pages);
 long pin_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
 			     struct page **pages, unsigned int gup_flags)
 {
-	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
-	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
-		return -EINVAL;
 	int locked = 0;
 
-	if (WARN_ON_ONCE(!pages))
-		return -EINVAL;
+	if (!is_valid_gup_args(pages, NULL, NULL, &gup_flags,
+			       FOLL_PIN | FOLL_TOUCH))
+		return 0;
 
-	gup_flags |= FOLL_PIN | FOLL_TOUCH;
 	return __gup_longterm_locked(current->mm, start, nr_pages, pages, NULL,
 				     &locked, gup_flags);
 }
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 1d6977dc6b31ba..1343a7d88299be 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1042,11 +1042,6 @@ struct page *follow_devmap_pmd(struct vm_area_struct *vma, unsigned long addr,
 
 	assert_spin_locked(pmd_lockptr(mm, pmd));
 
-	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
-	if (WARN_ON_ONCE((flags & (FOLL_PIN | FOLL_GET)) ==
-			 (FOLL_PIN | FOLL_GET)))
-		return NULL;
-
 	if (flags & FOLL_WRITE && !pmd_write(*pmd))
 		return NULL;
 
@@ -1205,11 +1200,6 @@ struct page *follow_devmap_pud(struct vm_area_struct *vma, unsigned long addr,
 	if (flags & FOLL_WRITE && !pud_write(*pud))
 		return NULL;
 
-	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
-	if (WARN_ON_ONCE((flags & (FOLL_PIN | FOLL_GET)) ==
-			 (FOLL_PIN | FOLL_GET)))
-		return NULL;
-
 	if (pud_present(*pud) && pud_devmap(*pud))
 		/* pass */;
 	else
-- 
2.39.0




* [PATCH v2 06/13] mm/gup: add an assertion that the mmap lock is locked
From: Jason Gunthorpe @ 2023-01-24 20:34 UTC
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM)

Since commit 5b78ed24e8ec ("mm/pagemap: add mmap_assert_locked()
annotations to find_vma*()") we already have this assertion; it is just
buried in find_vma():

 __get_user_pages_locked()
  __get_user_pages()
   find_extend_vma()
    find_vma()

Also check it at the top of __get_user_pages_locked() as a form of
documentation.
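
From the caller's side the asserted contract is (a hypothetical external
caller):

  int locked = 1;

  mmap_read_lock(mm);
  ret = get_user_pages_remote(mm, addr, 1, FOLL_WRITE, &page, NULL,
			      &locked);
  if (locked)
	mmap_read_unlock(mm);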

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 mm/gup.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/gup.c b/mm/gup.c
index 4c236fb83dcd3e..de1a5c64fdfdcf 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1360,6 +1360,8 @@ static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
 		must_unlock = true;
 		*locked = 1;
 	}
+	else
+		mmap_assert_locked(mm);
 
 	if (flags & FOLL_PIN)
 		mm_set_has_pinned_flag(&mm->flags);
-- 
2.39.0




* [PATCH v2 07/13] mm/gup: remove locked being NULL from faultin_vma_page_range()
From: Jason Gunthorpe @ 2023-01-24 20:34 UTC
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM)

The only caller of this function always passes in a non-NULL locked,
so just remove this obsolete comment.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 mm/gup.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index de1a5c64fdfdcf..dfb315b3f2950d 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1558,12 +1558,7 @@ long populate_vma_page_range(struct vm_area_struct *vma,
  * code on error (see __get_user_pages()).
  *
  * vma->vm_mm->mmap_lock must be held. The range must be page-aligned and
- * covered by the VMA.
- *
- * If @locked is NULL, it may be held for read or write and will be unperturbed.
- *
- * If @locked is non-NULL, it must held for read only and may be released.  If
- * it's released, *@locked will be set to 0.
+ * covered by the VMA. If it's released, *@locked will be set to 0.
  */
 long faultin_vma_page_range(struct vm_area_struct *vma, unsigned long start,
 			    unsigned long end, bool write, int *locked)
-- 
2.39.0




* [PATCH v2 08/13] mm/gup: add FOLL_UNLOCKABLE
From: Jason Gunthorpe @ 2023-01-24 20:34 UTC
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM)

Setting FOLL_UNLOCKABLE allows GUP to lock/unlock the mmap lock on its
own. It is a more explicit replacement for locked != NULL. This clears
the way for passing in locked = 1, without implying that the lock may be
dropped.

Set the flag in all cases where it is used, e.g. when locked is present
in the external interface or when locked is used internally with locked
= 0.
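
In effect (condensed from the hunks below):

  /* at the external interfaces: a locked argument opts in */
  if (locked)
	gup_flags |= FOLL_UNLOCKABLE;

  /* in faultin_page(): only then may the fault handler drop the lock */
  if (*flags & FOLL_UNLOCKABLE)
	fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;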

Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 include/linux/mm_types.h |  1 +
 mm/gup.c                 | 36 +++++++++++++++++++++++-------------
 2 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 8971a40c120e38..518617431f8085 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1103,5 +1103,6 @@ typedef unsigned int __bitwise zap_flags_t;
 #define FOLL_FAST_ONLY	0x80000	/* gup_fast: prevent fall-back to slow gup */
 #define FOLL_PCI_P2PDMA	0x100000 /* allow returning PCI P2PDMA pages */
 #define FOLL_INTERRUPTIBLE  0x200000 /* allow interrupts from generic signals */
+#define FOLL_UNLOCKABLE	0x400000 /* allow unlocking the mmap lock (internal only) */
 
 #endif /* _LINUX_MM_TYPES_H */
diff --git a/mm/gup.c b/mm/gup.c
index dfb315b3f2950d..37849fcd962e9a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -896,7 +896,7 @@ static int faultin_page(struct vm_area_struct *vma,
 		fault_flags |= FAULT_FLAG_WRITE;
 	if (*flags & FOLL_REMOTE)
 		fault_flags |= FAULT_FLAG_REMOTE;
-	if (locked) {
+	if (*flags & FOLL_UNLOCKABLE) {
 		fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
 		/*
 		 * FAULT_FLAG_INTERRUPTIBLE is opt-in. GUP callers must set
@@ -1382,9 +1382,11 @@ static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
 	for (;;) {
 		ret = __get_user_pages(mm, start, nr_pages, flags, pages,
 				       vmas, locked);
-		if (!locked)
+		if (!(flags & FOLL_UNLOCKABLE)) {
 			/* VM_FAULT_RETRY couldn't trigger, bypass */
-			return ret;
+			pages_done = ret;
+			break;
+		}
 
 		/* VM_FAULT_RETRY or VM_FAULT_COMPLETED cannot return errors */
 		if (!*locked) {
@@ -1532,6 +1534,9 @@ long populate_vma_page_range(struct vm_area_struct *vma,
 	if (vma_is_accessible(vma))
 		gup_flags |= FOLL_FORCE;
 
+	if (locked)
+		gup_flags |= FOLL_UNLOCKABLE;
+
 	/*
 	 * We made sure addr is within a VMA, so the following will
 	 * not result in a stack expansion that recurses back here.
@@ -1583,7 +1588,7 @@ long faultin_vma_page_range(struct vm_area_struct *vma, unsigned long start,
 	 *		  a poisoned page.
 	 * !FOLL_FORCE: Require proper access permissions.
 	 */
-	gup_flags = FOLL_TOUCH | FOLL_HWPOISON;
+	gup_flags = FOLL_TOUCH | FOLL_HWPOISON | FOLL_UNLOCKABLE;
 	if (write)
 		gup_flags |= FOLL_WRITE;
 
@@ -2107,12 +2112,20 @@ static bool is_valid_gup_args(struct page **pages, struct vm_area_struct **vmas,
 	 * interfaces:
 	 * - FOLL_PIN/FOLL_TRIED/FOLL_FAST_ONLY are internal only
 	 * - FOLL_REMOTE is internal only and used on follow_page()
+	 * - FOLL_UNLOCKABLE is internal only and used if locked is !NULL
 	 */
-	if (WARN_ON_ONCE(gup_flags & (FOLL_PIN | FOLL_TRIED |
+	if (WARN_ON_ONCE(gup_flags & (FOLL_PIN | FOLL_TRIED | FOLL_UNLOCKABLE |
 				      FOLL_REMOTE | FOLL_FAST_ONLY)))
 		return false;
 
 	gup_flags |= to_set;
+	if (locked) {
+		/* At the external interface locked must be set */
+		if (WARN_ON_ONCE(*locked != 1))
+			return false;
+
+		gup_flags |= FOLL_UNLOCKABLE;
+	}
 
 	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
 	if (WARN_ON_ONCE((gup_flags & (FOLL_PIN | FOLL_GET)) ==
@@ -2127,10 +2140,6 @@ static bool is_valid_gup_args(struct page **pages, struct vm_area_struct **vmas,
 	if (WARN_ON_ONCE((gup_flags & (FOLL_GET | FOLL_PIN)) && !pages))
 		return false;
 
-	/* At the external interface locked must be set */
-	if (WARN_ON_ONCE(locked && *locked != 1))
-		return false;
-
 	/* We want to allow the pgmap to be hot-unplugged at all times */
 	if (WARN_ON_ONCE((gup_flags & FOLL_LONGTERM) &&
 			 (gup_flags & FOLL_PCI_P2PDMA)))
@@ -2140,7 +2149,7 @@ static bool is_valid_gup_args(struct page **pages, struct vm_area_struct **vmas,
 	 * Can't use VMAs with locked, as locked allows GUP to unlock
 	 * which invalidates the vmas array
 	 */
-	if (WARN_ON_ONCE(vmas && locked))
+	if (WARN_ON_ONCE(vmas && (gup_flags & FOLL_UNLOCKABLE)))
 		return false;
 
 	*gup_flags_p = gup_flags;
@@ -2280,7 +2289,8 @@ long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
 {
 	int locked = 0;
 
-	if (!is_valid_gup_args(pages, NULL, NULL, &gup_flags, FOLL_TOUCH))
+	if (!is_valid_gup_args(pages, NULL, NULL, &gup_flags,
+			       FOLL_TOUCH | FOLL_UNLOCKABLE))
 		return -EINVAL;
 
 	return __get_user_pages_locked(current->mm, start, nr_pages, pages,
@@ -2968,7 +2978,7 @@ static int internal_get_user_pages_fast(unsigned long start,
 	pages += nr_pinned;
 	ret = __gup_longterm_locked(current->mm, start, nr_pages - nr_pinned,
 				    pages, NULL, &locked,
-				    gup_flags | FOLL_TOUCH);
+				    gup_flags | FOLL_TOUCH | FOLL_UNLOCKABLE);
 	if (ret < 0) {
 		/*
 		 * The caller has to unpin the pages we already pinned so
@@ -3195,7 +3205,7 @@ long pin_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
 	int locked = 0;
 
 	if (!is_valid_gup_args(pages, NULL, NULL, &gup_flags,
-			       FOLL_PIN | FOLL_TOUCH))
+			       FOLL_PIN | FOLL_TOUCH | FOLL_UNLOCKABLE))
 		return 0;
 
 	return __gup_longterm_locked(current->mm, start, nr_pages, pages, NULL,
-- 
2.39.0




* [PATCH v2 09/13] mm/gup: make locked never NULL in the internal GUP functions
From: Jason Gunthorpe @ 2023-01-24 20:34 UTC
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM)

Now that a NULL locked doesn't have a special meaning, we can just make
it non-NULL in all cases and remove the special tests.

get_user_pages() and pin_user_pages() can safely pass in locked = 1.

get_user_pages_remote() and pin_user_pages_remote() can swap in a local
variable for locked if NULL is passed.

Remove all the NULL checks.
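
The substitution pattern, condensed from the get_user_pages_remote() hunk
below:

  long get_user_pages_remote(struct mm_struct *mm, ..., int *locked)
  {
	int local_locked = 1;
	...
	return __get_user_pages_locked(mm, start, nr_pages, pages, vmas,
				       locked ? locked : &local_locked,
				       gup_flags);
  }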

Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 mm/gup.c | 51 ++++++++++++++++++++++++++++++---------------------
 1 file changed, 30 insertions(+), 21 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 37849fcd962e9a..932a2339613c2f 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -879,9 +879,9 @@ static int get_gate_page(struct mm_struct *mm, unsigned long address,
 }
 
 /*
- * mmap_lock must be held on entry.  If @locked != NULL and *@flags
- * does not include FOLL_NOWAIT, the mmap_lock may be released.  If it
- * is, *@locked will be set to 0 and -EBUSY returned.
+ * mmap_lock must be held on entry.  If @flags has FOLL_UNLOCKABLE but not
+ * FOLL_NOWAIT, the mmap_lock may be released.  If it is, *@locked will be set
+ * to 0 and -EBUSY returned.
  */
 static int faultin_page(struct vm_area_struct *vma,
 		unsigned long address, unsigned int *flags, bool unshare,
@@ -930,8 +930,8 @@ static int faultin_page(struct vm_area_struct *vma,
 		 * mmap lock in the page fault handler. Sanity check this.
 		 */
 		WARN_ON_ONCE(fault_flags & FAULT_FLAG_RETRY_NOWAIT);
-		if (locked)
-			*locked = 0;
+		*locked = 0;
+
 		/*
 		 * We should do the same as VM_FAULT_RETRY, but let's not
 		 * return -EBUSY since that's not reflecting the reality of
@@ -951,7 +951,7 @@ static int faultin_page(struct vm_area_struct *vma,
 	}
 
 	if (ret & VM_FAULT_RETRY) {
-		if (locked && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
+		if (!(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
 			*locked = 0;
 		return -EBUSY;
 	}
@@ -1062,14 +1062,12 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
  * appropriate) must be called after the page is finished with, and
  * before put_page is called.
  *
- * If @locked != NULL, *@locked will be set to 0 when mmap_lock is
- * released by an up_read().  That can happen if @gup_flags does not
- * have FOLL_NOWAIT.
+ * If FOLL_UNLOCKABLE is set without FOLL_NOWAIT then the mmap_lock may
+ * be released. If this happens *@locked will be set to 0 on return.
  *
- * A caller using such a combination of @locked and @gup_flags
- * must therefore hold the mmap_lock for reading only, and recognize
- * when it's been released.  Otherwise, it must be held for either
- * reading or writing and will not be released.
+ * A caller using such a combination of @gup_flags must therefore hold the
+ * mmap_lock for reading only, and recognize when it's been released. Otherwise,
+ * it must be held for either reading or writing and will not be released.
  *
  * In most cases, get_user_pages or get_user_pages_fast should be used
  * instead of __get_user_pages. __get_user_pages should be used only if
@@ -1121,7 +1119,7 @@ static long __get_user_pages(struct mm_struct *mm,
 				i = follow_hugetlb_page(mm, vma, pages, vmas,
 						&start, &nr_pages, i,
 						gup_flags, locked);
-				if (locked && *locked == 0) {
+				if (!*locked) {
 					/*
 					 * We've got a VM_FAULT_RETRY
 					 * and we've lost mmap_lock.
@@ -1354,7 +1352,7 @@ static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
 	 * The internal caller expects GUP to manage the lock internally and the
 	 * lock must be released when this returns.
 	 */
-	if (locked && !*locked) {
+	if (!*locked) {
 		if (mmap_read_lock_killable(mm))
 			return -EAGAIN;
 		must_unlock = true;
@@ -1502,6 +1500,7 @@ long populate_vma_page_range(struct vm_area_struct *vma,
 {
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned long nr_pages = (end - start) / PAGE_SIZE;
+	int local_locked = 1;
 	int gup_flags;
 	long ret;
 
@@ -1542,7 +1541,7 @@ long populate_vma_page_range(struct vm_area_struct *vma,
 	 * not result in a stack expansion that recurses back here.
 	 */
 	ret = __get_user_pages(mm, start, nr_pages, gup_flags,
-				NULL, NULL, locked);
+				NULL, NULL, locked ? locked : &local_locked);
 	lru_add_drain();
 	return ret;
 }
@@ -1683,7 +1682,7 @@ static long __get_user_pages_locked(struct mm_struct *mm, unsigned long start,
 	 * The internal caller expects GUP to manage the lock internally and the
 	 * lock must be released when this returns.
 	 */
-	if (locked && !*locked) {
+	if (!*locked) {
 		if (mmap_read_lock_killable(mm))
 			return -EAGAIN;
 		must_unlock = true;
@@ -2222,11 +2221,14 @@ long get_user_pages_remote(struct mm_struct *mm,
 		unsigned int gup_flags, struct page **pages,
 		struct vm_area_struct **vmas, int *locked)
 {
+	int local_locked = 1;
+
 	if (!is_valid_gup_args(pages, vmas, locked, &gup_flags,
 			       FOLL_TOUCH | FOLL_REMOTE))
 		return -EINVAL;
 
-	return __get_user_pages_locked(mm, start, nr_pages, pages, vmas, locked,
+	return __get_user_pages_locked(mm, start, nr_pages, pages, vmas,
+				       locked ? locked : &local_locked,
 				       gup_flags);
 }
 EXPORT_SYMBOL(get_user_pages_remote);
@@ -2261,11 +2263,13 @@ long get_user_pages(unsigned long start, unsigned long nr_pages,
 		unsigned int gup_flags, struct page **pages,
 		struct vm_area_struct **vmas)
 {
+	int locked = 1;
+
 	if (!is_valid_gup_args(pages, vmas, NULL, &gup_flags, FOLL_TOUCH))
 		return -EINVAL;
 
 	return __get_user_pages_locked(current->mm, start, nr_pages, pages,
-				       vmas, NULL, gup_flags);
+				       vmas, &locked, gup_flags);
 }
 EXPORT_SYMBOL(get_user_pages);
 
@@ -3158,10 +3162,13 @@ long pin_user_pages_remote(struct mm_struct *mm,
 			   unsigned int gup_flags, struct page **pages,
 			   struct vm_area_struct **vmas, int *locked)
 {
+	int local_locked = 1;
+
 	if (!is_valid_gup_args(pages, vmas, locked, &gup_flags,
 			       FOLL_PIN | FOLL_TOUCH | FOLL_REMOTE))
 		return 0;
-	return __gup_longterm_locked(mm, start, nr_pages, pages, vmas, locked,
+	return __gup_longterm_locked(mm, start, nr_pages, pages, vmas,
+				     locked ? locked : &local_locked,
 				     gup_flags);
 }
 EXPORT_SYMBOL(pin_user_pages_remote);
@@ -3187,10 +3194,12 @@ long pin_user_pages(unsigned long start, unsigned long nr_pages,
 		    unsigned int gup_flags, struct page **pages,
 		    struct vm_area_struct **vmas)
 {
+	int locked = 1;
+
 	if (!is_valid_gup_args(pages, vmas, NULL, &gup_flags, FOLL_PIN))
 		return 0;
 	return __gup_longterm_locked(current->mm, start, nr_pages,
-				     pages, vmas, NULL, gup_flags);
+				     pages, vmas, &locked, gup_flags);
 }
 EXPORT_SYMBOL(pin_user_pages);
 
-- 
2.39.0




* [PATCH v2 10/13] mm/gup: remove pin_user_pages_fast_only()
From: Jason Gunthorpe @ 2023-01-24 20:34 UTC
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM)

Commit ed29c2691188 ("drm/i915: Fix userptr so we do not have to worry
about obj->mm.lock, v7.") removed the only caller; remove this dead code
too.

Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 include/linux/mm.h |  2 --
 mm/gup.c           | 33 ---------------------------------
 2 files changed, 35 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index dfc2a88bc4a8ed..a47a6e8a9c78be 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2166,8 +2166,6 @@ extern int mprotect_fixup(struct mmu_gather *tlb, struct vm_area_struct *vma,
  */
 int get_user_pages_fast_only(unsigned long start, int nr_pages,
 			     unsigned int gup_flags, struct page **pages);
-int pin_user_pages_fast_only(unsigned long start, int nr_pages,
-			     unsigned int gup_flags, struct page **pages);
 
 static inline bool get_user_page_fast_only(unsigned long addr,
 			unsigned int gup_flags, struct page **pagep)
diff --git a/mm/gup.c b/mm/gup.c
index 932a2339613c2f..8edfa66277fe20 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -3102,39 +3102,6 @@ int pin_user_pages_fast(unsigned long start, int nr_pages,
 }
 EXPORT_SYMBOL_GPL(pin_user_pages_fast);
 
-/*
- * This is the FOLL_PIN equivalent of get_user_pages_fast_only(). Behavior
- * is the same, except that this one sets FOLL_PIN instead of FOLL_GET.
- *
- * The API rules are the same, too: no negative values may be returned.
- */
-int pin_user_pages_fast_only(unsigned long start, int nr_pages,
-			     unsigned int gup_flags, struct page **pages)
-{
-	int nr_pinned;
-
-	/*
-	 * FOLL_FAST_ONLY is required in order to match the API description of
-	 * this routine: no fall back to regular ("slow") GUP.
-	 */
-	if (!is_valid_gup_args(pages, NULL, NULL, &gup_flags,
-			       FOLL_PIN | FOLL_FAST_ONLY))
-		return 0;
-
-	nr_pinned = internal_get_user_pages_fast(start, nr_pages, gup_flags,
-						 pages);
-	/*
-	 * This routine is not allowed to return negative values. However,
-	 * internal_get_user_pages_fast() *can* return -errno. Therefore,
-	 * correct for that here:
-	 */
-	if (nr_pinned < 0)
-		nr_pinned = 0;
-
-	return nr_pinned;
-}
-EXPORT_SYMBOL_GPL(pin_user_pages_fast_only);
-
 /**
  * pin_user_pages_remote() - pin pages of a remote process
  *
-- 
2.39.0




* [PATCH v2 11/13] mm/gup: make get_user_pages_fast_only() return the common return value
From: Jason Gunthorpe @ 2023-01-24 20:34 UTC
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM)

There are only two callers, and both can handle the common return value:

- get_user_page_fast_only() checks == 1

- gfn_to_page_many_atomic() already returns -1, and the only caller
  checks for negative return values

Remove the restriction against returning negative values.
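
For reference, the first caller is roughly (per its static inline
definition in include/linux/mm.h):

  static inline bool get_user_page_fast_only(unsigned long addr,
			unsigned int gup_flags, struct page **pagep)
  {
	return get_user_pages_fast_only(addr, 1, gup_flags, pagep) == 1;
  }

A negative return simply fails the == 1 test, so no clamping to zero is
needed.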

Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 mm/gup.c | 17 +----------------
 1 file changed, 1 insertion(+), 16 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 8edfa66277fe20..05ca9b0a06c8c5 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -3005,8 +3005,6 @@ static int internal_get_user_pages_fast(unsigned long start,
  *
  * Like get_user_pages_fast() except it's IRQ-safe in that it won't fall back to
  * the regular GUP.
- * Note a difference with get_user_pages_fast: this always returns the
- * number of pages pinned, 0 if no pages were pinned.
  *
  * If the architecture does not support this function, simply return with no
  * pages pinned.
@@ -3018,7 +3016,6 @@ static int internal_get_user_pages_fast(unsigned long start,
 int get_user_pages_fast_only(unsigned long start, int nr_pages,
 			     unsigned int gup_flags, struct page **pages)
 {
-	int nr_pinned;
 	/*
 	 * Internally (within mm/gup.c), gup fast variants must set FOLL_GET,
 	 * because gup fast is always a "pin with a +1 page refcount" request.
@@ -3030,19 +3027,7 @@ int get_user_pages_fast_only(unsigned long start, int nr_pages,
 			       FOLL_GET | FOLL_FAST_ONLY))
 		return -EINVAL;
 
-	nr_pinned = internal_get_user_pages_fast(start, nr_pages, gup_flags,
-						 pages);
-
-	/*
-	 * As specified in the API description above, this routine is not
-	 * allowed to return negative values. However, the common core
-	 * routine internal_get_user_pages_fast() *can* return -errno.
-	 * Therefore, correct for that here:
-	 */
-	if (nr_pinned < 0)
-		nr_pinned = 0;
-
-	return nr_pinned;
+	return internal_get_user_pages_fast(start, nr_pages, gup_flags, pages);
 }
 EXPORT_SYMBOL_GPL(get_user_pages_fast_only);
 
-- 
2.39.0




* [PATCH v2 12/13] mm/gup: move gup_must_unshare() to mm/internal.h
From: Jason Gunthorpe @ 2023-01-24 20:34 UTC
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM)

This function is only used in gup.c and closely related code. It touches
FOLL_PIN, so it must be moved before the next patch.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 include/linux/mm.h | 65 ----------------------------------------------
 mm/internal.h      | 65 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+), 65 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a47a6e8a9c78be..e0bacf9f2c5ebe 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3087,71 +3087,6 @@ static inline int vm_fault_to_errno(vm_fault_t vm_fault, int foll_flags)
 	return 0;
 }
 
-/*
- * Indicates for which pages that are write-protected in the page table,
- * whether GUP has to trigger unsharing via FAULT_FLAG_UNSHARE such that the
- * GUP pin will remain consistent with the pages mapped into the page tables
- * of the MM.
- *
- * Temporary unmapping of PageAnonExclusive() pages or clearing of
- * PageAnonExclusive() has to protect against concurrent GUP:
- * * Ordinary GUP: Using the PT lock
- * * GUP-fast and fork(): mm->write_protect_seq
- * * GUP-fast and KSM or temporary unmapping (swap, migration): see
- *    page_try_share_anon_rmap()
- *
- * Must be called with the (sub)page that's actually referenced via the
- * page table entry, which might not necessarily be the head page for a
- * PTE-mapped THP.
- *
- * If the vma is NULL, we're coming from the GUP-fast path and might have
- * to fallback to the slow path just to lookup the vma.
- */
-static inline bool gup_must_unshare(struct vm_area_struct *vma,
-				    unsigned int flags, struct page *page)
-{
-	/*
-	 * FOLL_WRITE is implicitly handled correctly as the page table entry
-	 * has to be writable -- and if it references (part of) an anonymous
-	 * folio, that part is required to be marked exclusive.
-	 */
-	if ((flags & (FOLL_WRITE | FOLL_PIN)) != FOLL_PIN)
-		return false;
-	/*
-	 * Note: PageAnon(page) is stable until the page is actually getting
-	 * freed.
-	 */
-	if (!PageAnon(page)) {
-		/*
-		 * We only care about R/O long-term pining: R/O short-term
-		 * pinning does not have the semantics to observe successive
-		 * changes through the process page tables.
-		 */
-		if (!(flags & FOLL_LONGTERM))
-			return false;
-
-		/* We really need the vma ... */
-		if (!vma)
-			return true;
-
-		/*
-		 * ... because we only care about writable private ("COW")
-		 * mappings where we have to break COW early.
-		 */
-		return is_cow_mapping(vma->vm_flags);
-	}
-
-	/* Paired with a memory barrier in page_try_share_anon_rmap(). */
-	if (IS_ENABLED(CONFIG_HAVE_FAST_GUP))
-		smp_rmb();
-
-	/*
-	 * Note that PageKsm() pages cannot be exclusive, and consequently,
-	 * cannot get pinned.
-	 */
-	return !PageAnonExclusive(page);
-}
-
 /*
  * Indicates whether GUP can follow a PROT_NONE mapped page, or whether
  * a (NUMA hinting) fault is required.
diff --git a/mm/internal.h b/mm/internal.h
index 0f035bcaf133f5..5c1310b98db64d 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -854,6 +854,71 @@ int migrate_device_coherent_page(struct page *page);
 struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags);
 int __must_check try_grab_page(struct page *page, unsigned int flags);
 
+/*
+ * Indicates for which pages that are write-protected in the page table,
+ * whether GUP has to trigger unsharing via FAULT_FLAG_UNSHARE such that the
+ * GUP pin will remain consistent with the pages mapped into the page tables
+ * of the MM.
+ *
+ * Temporary unmapping of PageAnonExclusive() pages or clearing of
+ * PageAnonExclusive() has to protect against concurrent GUP:
+ * * Ordinary GUP: Using the PT lock
+ * * GUP-fast and fork(): mm->write_protect_seq
+ * * GUP-fast and KSM or temporary unmapping (swap, migration): see
+ *    page_try_share_anon_rmap()
+ *
+ * Must be called with the (sub)page that's actually referenced via the
+ * page table entry, which might not necessarily be the head page for a
+ * PTE-mapped THP.
+ *
+ * If the vma is NULL, we're coming from the GUP-fast path and might have
+ * to fallback to the slow path just to lookup the vma.
+ */
+static inline bool gup_must_unshare(struct vm_area_struct *vma,
+				    unsigned int flags, struct page *page)
+{
+	/*
+	 * FOLL_WRITE is implicitly handled correctly as the page table entry
+	 * has to be writable -- and if it references (part of) an anonymous
+	 * folio, that part is required to be marked exclusive.
+	 */
+	if ((flags & (FOLL_WRITE | FOLL_PIN)) != FOLL_PIN)
+		return false;
+	/*
+	 * Note: PageAnon(page) is stable until the page is actually getting
+	 * freed.
+	 */
+	if (!PageAnon(page)) {
+		/*
+		 * We only care about R/O long-term pinning: R/O short-term
+		 * pinning does not have the semantics to observe successive
+		 * changes through the process page tables.
+		 */
+		if (!(flags & FOLL_LONGTERM))
+			return false;
+
+		/* We really need the vma ... */
+		if (!vma)
+			return true;
+
+		/*
+		 * ... because we only care about writable private ("COW")
+		 * mappings where we have to break COW early.
+		 */
+		return is_cow_mapping(vma->vm_flags);
+	}
+
+	/* Paired with a memory barrier in page_try_share_anon_rmap(). */
+	if (IS_ENABLED(CONFIG_HAVE_FAST_GUP))
+		smp_rmb();
+
+	/*
+	 * Note that PageKsm() pages cannot be exclusive, and consequently,
+	 * cannot get pinned.
+	 */
+	return !PageAnonExclusive(page);
+}
+
 extern bool mirrored_kernelcore;
 
 static inline bool vma_soft_dirty_enabled(struct vm_area_struct *vma)
-- 
2.39.0




* [PATCH v2 13/13] mm/gup: move private gup FOLL_ flags to internal.h
  2023-01-24 20:34 [PATCH v2 00/13] Simplify the external interface for GUP Jason Gunthorpe
                   ` (11 preceding siblings ...)
  2023-01-24 20:34 ` [PATCH v2 12/13] mm/gup: move gup_must_unshare() to mm/internal.h Jason Gunthorpe
@ 2023-01-24 20:34 ` Jason Gunthorpe
  2023-01-25  2:44   ` John Hubbard
  2023-01-26 12:48   ` David Hildenbrand
  2023-02-06 23:46 ` [PATCH v2 00/13] Simplify the external interface for GUP Jason Gunthorpe
  13 siblings, 2 replies; 44+ messages in thread
From: Jason Gunthorpe @ 2023-01-24 20:34 UTC (permalink / raw)
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM)

Move the flags that should not be (or are not) used outside gup.c and
related code into mm/internal.h to discourage driver abuse.

To make this more maintainable going forward, compact the two FOLL ranges
with new bit numbers from 0 to 11 and 16 to 21, using shifts so it is
explicit.

Switch to an enum so the whole thing is easier to read.
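
As an illustration of the effect (a hypothetical driver snippet, not
from this series): code outside mm/ that reached for one of the
now-private flags simply stops compiling, e.g.:

	/* hypothetical out-of-tree driver code */
	static long drv_pin_page(unsigned long start, struct page **pages)
	{
		/*
		 * FOLL_PIN now only exists in mm/internal.h, so this no
		 * longer builds; pin_user_pages() sets FOLL_PIN itself.
		 */
		return pin_user_pages(start, 1, FOLL_PIN | FOLL_LONGTERM,
				      pages, NULL);
	}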

Cc: David Howells <dhowells@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 include/linux/mm_types.h | 57 ++++++++++++++++++++++++----------------
 mm/internal.h            | 15 +++++++++++
 2 files changed, 50 insertions(+), 22 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 518617431f8085..dbfc3a7e9559b4 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1039,9 +1039,6 @@ typedef unsigned int __bitwise zap_flags_t;
  * FOLL_PIN and FOLL_LONGTERM may be used in various combinations with each
  * other. Here is what they mean, and how to use them:
  *
- * FOLL_LONGTERM indicates that the page will be held for an indefinite time
- * period _often_ under userspace control.  This is in contrast to
- * iov_iter_get_pages(), whose usages are transient.
  *
  * FIXME: For pages which are part of a filesystem, mappings are subject to the
  * lifetime enforced by the filesystem and we need guarantees that longterm
@@ -1085,24 +1082,40 @@ typedef unsigned int __bitwise zap_flags_t;
  * Please see Documentation/core-api/pin_user_pages.rst for more information.
  */
 
-#define FOLL_WRITE	0x01	/* check pte is writable */
-#define FOLL_TOUCH	0x02	/* mark page accessed */
-#define FOLL_GET	0x04	/* do get_page on page */
-#define FOLL_DUMP	0x08	/* give error on hole if it would be zero */
-#define FOLL_FORCE	0x10	/* get_user_pages read/write w/o permission */
-#define FOLL_NOWAIT	0x20	/* if a disk transfer is needed, start the IO
-				 * and return without waiting upon it */
-#define FOLL_NOFAULT	0x80	/* do not fault in pages */
-#define FOLL_HWPOISON	0x100	/* check page is hwpoisoned */
-#define FOLL_TRIED	0x800	/* a retry, previous pass started an IO */
-#define FOLL_REMOTE	0x2000	/* we are working on non-current tsk/mm */
-#define FOLL_ANON	0x8000	/* don't do file mappings */
-#define FOLL_LONGTERM	0x10000	/* mapping lifetime is indefinite: see below */
-#define FOLL_SPLIT_PMD	0x20000	/* split huge pmd before returning */
-#define FOLL_PIN	0x40000	/* pages must be released via unpin_user_page */
-#define FOLL_FAST_ONLY	0x80000	/* gup_fast: prevent fall-back to slow gup */
-#define FOLL_PCI_P2PDMA	0x100000 /* allow returning PCI P2PDMA pages */
-#define FOLL_INTERRUPTIBLE  0x200000 /* allow interrupts from generic signals */
-#define FOLL_UNLOCKABLE	0x400000 /* allow unlocking the mmap lock (internal only) */
+enum {
+	/* check pte is writable */
+	FOLL_WRITE = 1 << 0,
+	/* do get_page on page */
+	FOLL_GET = 1 << 1,
+	/* give error on hole if it would be zero */
+	FOLL_DUMP = 1 << 2,
+	/* get_user_pages read/write w/o permission */
+	FOLL_FORCE = 1 << 3,
+	/*
+	 * if a disk transfer is needed, start the IO and return without waiting
+	 * upon it
+	 */
+	FOLL_NOWAIT = 1 << 4,
+	/* do not fault in pages */
+	FOLL_NOFAULT = 1 << 5,
+	/* check page is hwpoisoned */
+	FOLL_HWPOISON = 1 << 6,
+	/* don't do file mappings */
+	FOLL_ANON = 1 << 7,
+	/*
+	 * FOLL_LONGTERM indicates that the page will be held for an indefinite
+	 * time period _often_ under userspace control.  This is in contrast to
+	 * iov_iter_get_pages(), whose usages are transient.
+	 */
+	FOLL_LONGTERM = 1 << 8,
+	/* split huge pmd before returning */
+	FOLL_SPLIT_PMD = 1 << 9,
+	/* allow returning PCI P2PDMA pages */
+	FOLL_PCI_P2PDMA = 1 << 10,
+	/* allow interrupts from generic signals */
+	FOLL_INTERRUPTIBLE = 1 << 11,
+
+	/* See also internal only FOLL flags in mm/internal.h */
+};
 
 #endif /* _LINUX_MM_TYPES_H */
diff --git a/mm/internal.h b/mm/internal.h
index 5c1310b98db64d..673b4b1d5f4c79 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -854,6 +854,21 @@ int migrate_device_coherent_page(struct page *page);
 struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags);
 int __must_check try_grab_page(struct page *page, unsigned int flags);
 
+enum {
+	/* mark page accessed */
+	FOLL_TOUCH = 1 << 16,
+	/* a retry, previous pass started an IO */
+	FOLL_TRIED = 1 << 17,
+	/* we are working on non-current tsk/mm */
+	FOLL_REMOTE = 1 << 18,
+	/* pages must be released via unpin_user_page */
+	FOLL_PIN = 1 << 19,
+	/* gup_fast: prevent fall-back to slow gup */
+	FOLL_FAST_ONLY = 1 << 20,
+	/* allow unlocking the mmap lock */
+	FOLL_UNLOCKABLE = 1 << 21,
+};
+
 /*
  * Indicates for which pages that are write-protected in the page table,
  * whether GUP has to trigger unsharing via FAULT_FLAG_UNSHARE such that the
-- 
2.39.0




* Re: [PATCH v2 01/13] mm/gup: have internal functions get the mmap_read_lock()
  2023-01-24 20:34 ` [PATCH v2 01/13] mm/gup: have internal functions get the mmap_read_lock() Jason Gunthorpe
@ 2023-01-25  2:11   ` John Hubbard
  2023-01-25  2:52     ` John Hubbard
  2023-01-25 16:38     ` Jason Gunthorpe
  0 siblings, 2 replies; 44+ messages in thread
From: John Hubbard @ 2023-01-25  2:11 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, linux-mm, Mike Rapoport (IBM)

On 1/24/23 12:34, Jason Gunthorpe wrote:
...
> @@ -3183,11 +3198,13 @@ long pin_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
>   	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
>   	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
>   		return -EINVAL;
> +	int locked = 0;
>   
>   	if (WARN_ON_ONCE(!pages))
>   		return -EINVAL;
>   
> -	gup_flags |= FOLL_PIN;
> -	return get_user_pages_unlocked(start, nr_pages, pages, gup_flags);
> +	gup_flags |= FOLL_PIN | FOLL_TOUCH;

I missed this on my review of v1 of this series: the FOLL_TOUCH change
looks like a mistake, yes? It should just be left as-is:

	gup_flags |= FOLL_PIN;

Everything else looks good.

thanks,
-- 
John Hubbard
NVIDIA



* Re: [PATCH v2 02/13] mm/gup: remove obsolete FOLL_LONGTERM comment
  2023-01-24 20:34 ` [PATCH v2 02/13] mm/gup: remove obsolete FOLL_LONGTERM comment Jason Gunthorpe
@ 2023-01-25  2:13   ` John Hubbard
  2023-02-08 14:25   ` David Hildenbrand
  1 sibling, 0 replies; 44+ messages in thread
From: John Hubbard @ 2023-01-25  2:13 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, linux-mm, Mike Rapoport (IBM)

On 1/24/23 12:34, Jason Gunthorpe wrote:
> These days FOLL_LONGTERM is not allowed at all on any get_user_pages*()
> functions; it must only be used with pin_user_pages*(), plus it now has
> universal support for all the pin_user_pages*() functions.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   include/linux/mm_types.h | 6 ------
>   1 file changed, 6 deletions(-)

Yes! Progress has happened.

Reviewed-by: John Hubbard <jhubbard@nvidia.com>


thanks,
-- 
John Hubbard
NVIDIA

> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index a689198caf7408..8971a40c120e38 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -1052,12 +1052,6 @@ typedef unsigned int __bitwise zap_flags_t;
>    * specifically failed.  Filesystem pages are still subject to bugs and use of
>    * FOLL_LONGTERM should be avoided on those pages.
>    *
> - * FIXME: Also NOTE that FOLL_LONGTERM is not supported in every GUP call.
> - * Currently only get_user_pages() and get_user_pages_fast() support this flag
> - * and calls to get_user_pages_[un]locked are specifically not allowed.  This
> - * is due to an incompatibility with the FS DAX check and
> - * FAULT_FLAG_ALLOW_RETRY.
> - *
>    * In the CMA case: long term pins in a CMA region would unnecessarily fragment
>    * that region.  And so, CMA attempts to migrate the page before pinning, when
>    * FOLL_LONGTERM is specified.




* Re: [PATCH v2 04/13] mm/gup: move try_grab_page() to mm/internal.h
  2023-01-24 20:34 ` [PATCH v2 04/13] mm/gup: move try_grab_page() to mm/internal.h Jason Gunthorpe
@ 2023-01-25  2:15   ` John Hubbard
  2023-02-08 14:26   ` David Hildenbrand
  1 sibling, 0 replies; 44+ messages in thread
From: John Hubbard @ 2023-01-25  2:15 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, linux-mm, Mike Rapoport (IBM)

On 1/24/23 12:34, Jason Gunthorpe wrote:
> This is part of the internal machinery of gup.c and is only non-static
> so that the parts of gup.c that live in huge_memory.c and hugetlb.c can
> call it.
> 
> Put it in internal.h beside the similarly purposed try_grab_folio().
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   include/linux/mm.h | 2 --
>   mm/internal.h      | 1 +
>   2 files changed, 1 insertion(+), 2 deletions(-)

Reviewed-by: John Hubbard <jhubbard@nvidia.com>

thanks,
-- 
John Hubbard
NVIDIA

> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index c9db257f09b307..dfc2a88bc4a8ed 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1173,8 +1173,6 @@ static inline void get_page(struct page *page)
>   	folio_get(page_folio(page));
>   }
>   
> -int __must_check try_grab_page(struct page *page, unsigned int flags);
> -
>   static inline __must_check bool try_get_page(struct page *page)
>   {
>   	page = compound_head(page);
> diff --git a/mm/internal.h b/mm/internal.h
> index ce462bf145b441..0f035bcaf133f5 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -852,6 +852,7 @@ int migrate_device_coherent_page(struct page *page);
>    * mm/gup.c
>    */
>   struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags);
> +int __must_check try_grab_page(struct page *page, unsigned int flags);
>   
>   extern bool mirrored_kernelcore;
>   





* Re: [PATCH v2 05/13] mm/gup: simplify the external interface functions and consolidate invariants
  2023-01-24 20:34 ` [PATCH v2 05/13] mm/gup: simplify the external interface functions and consolidate invariants Jason Gunthorpe
@ 2023-01-25  2:30   ` John Hubbard
  0 siblings, 0 replies; 44+ messages in thread
From: John Hubbard @ 2023-01-25  2:30 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, linux-mm, Mike Rapoport (IBM)

On 1/24/23 12:34, Jason Gunthorpe wrote:
> The GUP family of functions has a complex, but fairly well-defined, set
> of invariants for its arguments. Currently these are sprinkled about,
> sometimes duplicated, across many functions.
> 
> Internally we don't follow all the invariants that the external interface
> has to follow, so place these checks directly at the exported
> interface. This ensures the internal functions never reach a violated
> invariant.
> 
> Remove the duplicated invariant checks.
> 
> The end result is to make these functions fully internal:
>   __get_user_pages_locked()
>   internal_get_user_pages_fast()
>   __gup_longterm_locked()
> 
> And all the other functions call directly into one of these.
> 
> Suggested-by: John Hubbard <jhubbard@nvidia.com>
> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   mm/gup.c         | 153 +++++++++++++++++++++++------------------------
>   mm/huge_memory.c |  10 ----
>   2 files changed, 75 insertions(+), 88 deletions(-)

Reviewed-by: John Hubbard <jhubbard@nvidia.com>

thanks,
-- 
John Hubbard
NVIDIA

> 
> diff --git a/mm/gup.c b/mm/gup.c
> index a6559d7243db92..4c236fb83dcd3e 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -215,7 +215,6 @@ int __must_check try_grab_page(struct page *page, unsigned int flags)
>   {
>   	struct folio *folio = page_folio(page);
>   
> -	WARN_ON_ONCE((flags & (FOLL_GET | FOLL_PIN)) == (FOLL_GET | FOLL_PIN));
>   	if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
>   		return -ENOMEM;
>   
> @@ -818,7 +817,7 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
>   	if (vma_is_secretmem(vma))
>   		return NULL;
>   
> -	if (foll_flags & FOLL_PIN)
> +	if (WARN_ON_ONCE(foll_flags & FOLL_PIN))
>   		return NULL;
>   
>   	page = follow_page_mask(vma, address, foll_flags, &ctx);
> @@ -975,9 +974,6 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
>   	if ((gup_flags & FOLL_LONGTERM) && vma_is_fsdax(vma))
>   		return -EOPNOTSUPP;
>   
> -	if ((gup_flags & FOLL_LONGTERM) && (gup_flags & FOLL_PCI_P2PDMA))
> -		return -EOPNOTSUPP;
> -
>   	if (vma_is_secretmem(vma))
>   		return -EFAULT;
>   
> @@ -1354,11 +1350,6 @@ static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
>   	long ret, pages_done;
>   	bool must_unlock = false;
>   
> -	if (locked) {
> -		/* if VM_FAULT_RETRY can be returned, vmas become invalid */
> -		BUG_ON(vmas);
> -	}
> -
>   	/*
>   	 * The internal caller expects GUP to manage the lock internally and the
>   	 * lock must be released when this returns.
> @@ -2087,16 +2078,6 @@ static long __gup_longterm_locked(struct mm_struct *mm,
>   		return __get_user_pages_locked(mm, start, nr_pages, pages, vmas,
>   					       locked, gup_flags);
>   
> -	/*
> -	 * If we get to this point then FOLL_LONGTERM is set, and FOLL_LONGTERM
> -	 * implies FOLL_PIN (although the reverse is not true). Therefore it is
> -	 * correct to unconditionally call check_and_migrate_movable_pages()
> -	 * which assumes pages have been pinned via FOLL_PIN.
> -	 *
> -	 * Enforce the above reasoning by asserting that FOLL_PIN is set.
> -	 */
> -	if (WARN_ON(!(gup_flags & FOLL_PIN)))
> -		return -EINVAL;
>   	flags = memalloc_pin_save();
>   	do {
>   		nr_pinned_pages = __get_user_pages_locked(mm, start, nr_pages,
> @@ -2106,28 +2087,66 @@ static long __gup_longterm_locked(struct mm_struct *mm,
>   			rc = nr_pinned_pages;
>   			break;
>   		}
> +
> +		/* FOLL_LONGTERM implies FOLL_PIN */
>   		rc = check_and_migrate_movable_pages(nr_pinned_pages, pages);
>   	} while (rc == -EAGAIN);
>   	memalloc_pin_restore(flags);
>   	return rc ? rc : nr_pinned_pages;
>   }
>   
> -static bool is_valid_gup_flags(unsigned int gup_flags)
> +/*
> + * Check that the given flags are valid for the exported gup/pup interface, and
> + * update them with the required flags that the caller must have set.
> + */
> +static bool is_valid_gup_args(struct page **pages, struct vm_area_struct **vmas,
> +			      int *locked, unsigned int *gup_flags_p,
> +			      unsigned int to_set)
>   {
> +	unsigned int gup_flags = *gup_flags_p;
> +
>   	/*
> -	 * FOLL_PIN must only be set internally by the pin_user_pages*() APIs,
> -	 * never directly by the caller, so enforce that with an assertion:
> +	 * These flags are not allowed to be specified externally to the gup
> +	 * interfaces:
> +	 * - FOLL_PIN/FOLL_TRIED/FOLL_FAST_ONLY are internal only
> +	 * - FOLL_REMOTE is internal only and used on follow_page()
>   	 */
> -	if (WARN_ON_ONCE(gup_flags & FOLL_PIN))
> +	if (WARN_ON_ONCE(gup_flags & (FOLL_PIN | FOLL_TRIED |
> +				      FOLL_REMOTE | FOLL_FAST_ONLY)))
> +		return false;
> +
> +	gup_flags |= to_set;
> +
> +	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
> +	if (WARN_ON_ONCE((gup_flags & (FOLL_PIN | FOLL_GET)) ==
> +			 (FOLL_PIN | FOLL_GET)))
> +		return false;
> +
> +	/* LONGTERM can only be specified when pinning */
> +	if (WARN_ON_ONCE(!(gup_flags & FOLL_PIN) && (gup_flags & FOLL_LONGTERM)))
> +		return false;
> +
> +	/* Pages input must be given if using GET/PIN */
> +	if (WARN_ON_ONCE((gup_flags & (FOLL_GET | FOLL_PIN)) && !pages))
>   		return false;
> +
> +	/* At the external interface locked must be set */
> +	if (WARN_ON_ONCE(locked && *locked != 1))
> +		return false;
> +
> +	/* We want to allow the pgmap to be hot-unplugged at all times */
> +	if (WARN_ON_ONCE((gup_flags & FOLL_LONGTERM) &&
> +			 (gup_flags & FOLL_PCI_P2PDMA)))
> +		return false;
> +
>   	/*
> -	 * FOLL_PIN is a prerequisite to FOLL_LONGTERM. Another way of saying
> -	 * that is, FOLL_LONGTERM is a specific case, more restrictive case of
> -	 * FOLL_PIN.
> +	 * Can't use VMAs with locked, as locked allows GUP to unlock
> +	 * which invalidates the vmas array
>   	 */
> -	if (WARN_ON_ONCE(gup_flags & FOLL_LONGTERM))
> +	if (WARN_ON_ONCE(vmas && locked))
>   		return false;
>   
> +	*gup_flags_p = gup_flags;
>   	return true;
>   }
>   
> @@ -2197,11 +2216,12 @@ long get_user_pages_remote(struct mm_struct *mm,
>   		unsigned int gup_flags, struct page **pages,
>   		struct vm_area_struct **vmas, int *locked)
>   {
> -	if (!is_valid_gup_flags(gup_flags))
> +	if (!is_valid_gup_args(pages, vmas, locked, &gup_flags,
> +			       FOLL_TOUCH | FOLL_REMOTE))
>   		return -EINVAL;
>   
>   	return __get_user_pages_locked(mm, start, nr_pages, pages, vmas, locked,
> -				       gup_flags | FOLL_TOUCH | FOLL_REMOTE);
> +				       gup_flags);
>   }
>   EXPORT_SYMBOL(get_user_pages_remote);
>   
> @@ -2235,11 +2255,11 @@ long get_user_pages(unsigned long start, unsigned long nr_pages,
>   		unsigned int gup_flags, struct page **pages,
>   		struct vm_area_struct **vmas)
>   {
> -	if (!is_valid_gup_flags(gup_flags))
> +	if (!is_valid_gup_args(pages, vmas, NULL, &gup_flags, FOLL_TOUCH))
>   		return -EINVAL;
>   
>   	return __get_user_pages_locked(current->mm, start, nr_pages, pages,
> -				       vmas, NULL, gup_flags | FOLL_TOUCH);
> +				       vmas, NULL, gup_flags);
>   }
>   EXPORT_SYMBOL(get_user_pages);
>   
> @@ -2263,8 +2283,11 @@ long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
>   {
>   	int locked = 0;
>   
> +	if (!is_valid_gup_args(pages, NULL, NULL, &gup_flags, FOLL_TOUCH))
> +		return -EINVAL;
> +
>   	return __get_user_pages_locked(current->mm, start, nr_pages, pages,
> -				       NULL, &locked, gup_flags | FOLL_TOUCH);
> +				       NULL, &locked, gup_flags);
>   }
>   EXPORT_SYMBOL(get_user_pages_unlocked);
>   
> @@ -2992,7 +3015,9 @@ int get_user_pages_fast_only(unsigned long start, int nr_pages,
>   	 * FOLL_FAST_ONLY is required in order to match the API description of
>   	 * this routine: no fall back to regular ("slow") GUP.
>   	 */
> -	gup_flags |= FOLL_GET | FOLL_FAST_ONLY;
> +	if (!is_valid_gup_args(pages, NULL, NULL, &gup_flags,
> +			       FOLL_GET | FOLL_FAST_ONLY))
> +		return -EINVAL;
>   
>   	nr_pinned = internal_get_user_pages_fast(start, nr_pages, gup_flags,
>   						 pages);
> @@ -3029,16 +3054,14 @@ EXPORT_SYMBOL_GPL(get_user_pages_fast_only);
>   int get_user_pages_fast(unsigned long start, int nr_pages,
>   			unsigned int gup_flags, struct page **pages)
>   {
> -	if (!is_valid_gup_flags(gup_flags))
> -		return -EINVAL;
> -
>   	/*
>   	 * The caller may or may not have explicitly set FOLL_GET; either way is
>   	 * OK. However, internally (within mm/gup.c), gup fast variants must set
>   	 * FOLL_GET, because gup fast is always a "pin with a +1 page refcount"
>   	 * request.
>   	 */
> -	gup_flags |= FOLL_GET;
> +	if (!is_valid_gup_args(pages, NULL, NULL, &gup_flags, FOLL_GET))
> +		return -EINVAL;
>   	return internal_get_user_pages_fast(start, nr_pages, gup_flags, pages);
>   }
>   EXPORT_SYMBOL_GPL(get_user_pages_fast);
> @@ -3062,14 +3085,8 @@ EXPORT_SYMBOL_GPL(get_user_pages_fast);
>   int pin_user_pages_fast(unsigned long start, int nr_pages,
>   			unsigned int gup_flags, struct page **pages)
>   {
> -	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
> -	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> +	if (!is_valid_gup_args(pages, NULL, NULL, &gup_flags, FOLL_PIN))
>   		return -EINVAL;
> -
> -	if (WARN_ON_ONCE(!pages))
> -		return -EINVAL;
> -
> -	gup_flags |= FOLL_PIN;
>   	return internal_get_user_pages_fast(start, nr_pages, gup_flags, pages);
>   }
>   EXPORT_SYMBOL_GPL(pin_user_pages_fast);
> @@ -3085,20 +3102,14 @@ int pin_user_pages_fast_only(unsigned long start, int nr_pages,
>   {
>   	int nr_pinned;
>   
> -	/*
> -	 * FOLL_GET and FOLL_PIN are mutually exclusive. Note that the API
> -	 * rules require returning 0, rather than -errno:
> -	 */
> -	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> -		return 0;
> -
> -	if (WARN_ON_ONCE(!pages))
> -		return 0;
>   	/*
>   	 * FOLL_FAST_ONLY is required in order to match the API description of
>   	 * this routine: no fall back to regular ("slow") GUP.
>   	 */
> -	gup_flags |= (FOLL_PIN | FOLL_FAST_ONLY);
> +	if (!is_valid_gup_args(pages, NULL, NULL, &gup_flags,
> +			       FOLL_PIN | FOLL_FAST_ONLY))
> +		return 0;
> +
>   	nr_pinned = internal_get_user_pages_fast(start, nr_pages, gup_flags,
>   						 pages);
>   	/*
> @@ -3140,16 +3151,11 @@ long pin_user_pages_remote(struct mm_struct *mm,
>   			   unsigned int gup_flags, struct page **pages,
>   			   struct vm_area_struct **vmas, int *locked)
>   {
> -	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
> -	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> -		return -EINVAL;
> -
> -	if (WARN_ON_ONCE(!pages))
> -		return -EINVAL;
> -
> +	if (!is_valid_gup_args(pages, vmas, locked, &gup_flags,
> +			       FOLL_PIN | FOLL_TOUCH | FOLL_REMOTE))
> +		return 0;
>   	return __gup_longterm_locked(mm, start, nr_pages, pages, vmas, locked,
> -				     gup_flags | FOLL_PIN | FOLL_TOUCH |
> -					     FOLL_REMOTE);
> +				     gup_flags);
>   }
>   EXPORT_SYMBOL(pin_user_pages_remote);
>   
> @@ -3174,14 +3180,8 @@ long pin_user_pages(unsigned long start, unsigned long nr_pages,
>   		    unsigned int gup_flags, struct page **pages,
>   		    struct vm_area_struct **vmas)
>   {
> -	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
> -	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> -		return -EINVAL;
> -
> -	if (WARN_ON_ONCE(!pages))
> -		return -EINVAL;
> -
> -	gup_flags |= FOLL_PIN;
> +	if (!is_valid_gup_args(pages, vmas, NULL, &gup_flags, FOLL_PIN))
> +		return 0;
>   	return __gup_longterm_locked(current->mm, start, nr_pages,
>   				     pages, vmas, NULL, gup_flags);
>   }
> @@ -3195,15 +3195,12 @@ EXPORT_SYMBOL(pin_user_pages);
>   long pin_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
>   			     struct page **pages, unsigned int gup_flags)
>   {
> -	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
> -	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> -		return -EINVAL;
>   	int locked = 0;
>   
> -	if (WARN_ON_ONCE(!pages))
> -		return -EINVAL;
> +	if (!is_valid_gup_args(pages, NULL, NULL, &gup_flags,
> +			       FOLL_PIN | FOLL_TOUCH))
> +		return 0;
>   
> -	gup_flags |= FOLL_PIN | FOLL_TOUCH;
>   	return __gup_longterm_locked(current->mm, start, nr_pages, pages, NULL,
>   				     &locked, gup_flags);
>   }
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 1d6977dc6b31ba..1343a7d88299be 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1042,11 +1042,6 @@ struct page *follow_devmap_pmd(struct vm_area_struct *vma, unsigned long addr,
>   
>   	assert_spin_locked(pmd_lockptr(mm, pmd));
>   
> -	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
> -	if (WARN_ON_ONCE((flags & (FOLL_PIN | FOLL_GET)) ==
> -			 (FOLL_PIN | FOLL_GET)))
> -		return NULL;
> -
>   	if (flags & FOLL_WRITE && !pmd_write(*pmd))
>   		return NULL;
>   
> @@ -1205,11 +1200,6 @@ struct page *follow_devmap_pud(struct vm_area_struct *vma, unsigned long addr,
>   	if (flags & FOLL_WRITE && !pud_write(*pud))
>   		return NULL;
>   
> -	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
> -	if (WARN_ON_ONCE((flags & (FOLL_PIN | FOLL_GET)) ==
> -			 (FOLL_PIN | FOLL_GET)))
> -		return NULL;
> -
>   	if (pud_present(*pud) && pud_devmap(*pud))
>   		/* pass */;
>   	else




* Re: [PATCH v2 06/13] mm/gup: add an assertion that the mmap lock is locked
  2023-01-24 20:34 ` [PATCH v2 06/13] mm/gup: add an assertion that the mmap lock is locked Jason Gunthorpe
@ 2023-01-25  2:34   ` John Hubbard
  0 siblings, 0 replies; 44+ messages in thread
From: John Hubbard @ 2023-01-25  2:34 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, linux-mm, Mike Rapoport (IBM)

On 1/24/23 12:34, Jason Gunthorpe wrote:
> Since commit 5b78ed24e8ec ("mm/pagemap: add mmap_assert_locked()
> annotations to find_vma*()") we already have this assertion; it is just
> buried in find_vma():
> 
>   __get_user_pages_locked()
>    __get_user_pages()
>    find_extend_vma()
>     find_vma()
> 
> Also check it at the top of __get_user_pages_locked() as a form of
> documentation.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   mm/gup.c | 2 ++
>   1 file changed, 2 insertions(+)

Reviewed-by: John Hubbard <jhubbard@nvidia.com>

thanks,
-- 
John Hubbard
NVIDIA

> 
> diff --git a/mm/gup.c b/mm/gup.c
> index 4c236fb83dcd3e..de1a5c64fdfdcf 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -1360,6 +1360,8 @@ static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
>   		must_unlock = true;
>   		*locked = 1;
>   	}
> +	else
> +		mmap_assert_locked(mm);
>   
>   	if (flags & FOLL_PIN)
>   		mm_set_has_pinned_flag(&mm->flags);




* Re: [PATCH v2 07/13] mm/gup: remove locked being NULL from faultin_vma_page_range()
  2023-01-24 20:34 ` [PATCH v2 07/13] mm/gup: remove locked being NULL from faultin_vma_page_range() Jason Gunthorpe
@ 2023-01-25  2:38   ` John Hubbard
  0 siblings, 0 replies; 44+ messages in thread
From: John Hubbard @ 2023-01-25  2:38 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, linux-mm, Mike Rapoport (IBM)

On 1/24/23 12:34, Jason Gunthorpe wrote:
> The only caller of this function always passes in a non-NULL locked,
> so just remove this obsolete comment.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   mm/gup.c | 7 +------
>   1 file changed, 1 insertion(+), 6 deletions(-)

Reviewed-by: John Hubbard <jhubbard@nvidia.com>


thanks,
-- 
John Hubbard
NVIDIA

> 
> diff --git a/mm/gup.c b/mm/gup.c
> index de1a5c64fdfdcf..dfb315b3f2950d 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -1558,12 +1558,7 @@ long populate_vma_page_range(struct vm_area_struct *vma,
>    * code on error (see __get_user_pages()).
>    *
>    * vma->vm_mm->mmap_lock must be held. The range must be page-aligned and
> - * covered by the VMA.
> - *
> - * If @locked is NULL, it may be held for read or write and will be unperturbed.
> - *
> - * If @locked is non-NULL, it must held for read only and may be released.  If
> - * it's released, *@locked will be set to 0.
> + * covered by the VMA. If it's released, *@locked will be set to 0.
>    */
>   long faultin_vma_page_range(struct vm_area_struct *vma, unsigned long start,
>   			    unsigned long end, bool write, int *locked)




* Re: [PATCH v2 12/13] mm/gup: move gup_must_unshare() to mm/internal.h
  2023-01-24 20:34 ` [PATCH v2 12/13] mm/gup: move gup_must_unshare() to mm/internal.h Jason Gunthorpe
@ 2023-01-25  2:41   ` John Hubbard
  2023-01-26 11:29   ` David Hildenbrand
  1 sibling, 0 replies; 44+ messages in thread
From: John Hubbard @ 2023-01-25  2:41 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, linux-mm, Mike Rapoport (IBM)

On 1/24/23 12:34, Jason Gunthorpe wrote:
> This function is only used in gup.c and closely related code. It touches
> FOLL_PIN, so it must be moved before the next patch.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   include/linux/mm.h | 65 ----------------------------------------------
>   mm/internal.h      | 65 ++++++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 65 insertions(+), 65 deletions(-)

Reviewed-by: John Hubbard <jhubbard@nvidia.com>

thanks,
-- 
John Hubbard
NVIDIA

> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index a47a6e8a9c78be..e0bacf9f2c5ebe 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -3087,71 +3087,6 @@ static inline int vm_fault_to_errno(vm_fault_t vm_fault, int foll_flags)
>   	return 0;
>   }
>   
> -/*
> - * Indicates for which pages that are write-protected in the page table,
> - * whether GUP has to trigger unsharing via FAULT_FLAG_UNSHARE such that the
> - * GUP pin will remain consistent with the pages mapped into the page tables
> - * of the MM.
> - *
> - * Temporary unmapping of PageAnonExclusive() pages or clearing of
> - * PageAnonExclusive() has to protect against concurrent GUP:
> - * * Ordinary GUP: Using the PT lock
> - * * GUP-fast and fork(): mm->write_protect_seq
> - * * GUP-fast and KSM or temporary unmapping (swap, migration): see
> - *    page_try_share_anon_rmap()
> - *
> - * Must be called with the (sub)page that's actually referenced via the
> - * page table entry, which might not necessarily be the head page for a
> - * PTE-mapped THP.
> - *
> - * If the vma is NULL, we're coming from the GUP-fast path and might have
> - * to fallback to the slow path just to lookup the vma.
> - */
> -static inline bool gup_must_unshare(struct vm_area_struct *vma,
> -				    unsigned int flags, struct page *page)
> -{
> -	/*
> -	 * FOLL_WRITE is implicitly handled correctly as the page table entry
> -	 * has to be writable -- and if it references (part of) an anonymous
> -	 * folio, that part is required to be marked exclusive.
> -	 */
> -	if ((flags & (FOLL_WRITE | FOLL_PIN)) != FOLL_PIN)
> -		return false;
> -	/*
> -	 * Note: PageAnon(page) is stable until the page is actually getting
> -	 * freed.
> -	 */
> -	if (!PageAnon(page)) {
> -		/*
> -		 * We only care about R/O long-term pining: R/O short-term
> -		 * pinning does not have the semantics to observe successive
> -		 * changes through the process page tables.
> -		 */
> -		if (!(flags & FOLL_LONGTERM))
> -			return false;
> -
> -		/* We really need the vma ... */
> -		if (!vma)
> -			return true;
> -
> -		/*
> -		 * ... because we only care about writable private ("COW")
> -		 * mappings where we have to break COW early.
> -		 */
> -		return is_cow_mapping(vma->vm_flags);
> -	}
> -
> -	/* Paired with a memory barrier in page_try_share_anon_rmap(). */
> -	if (IS_ENABLED(CONFIG_HAVE_FAST_GUP))
> -		smp_rmb();
> -
> -	/*
> -	 * Note that PageKsm() pages cannot be exclusive, and consequently,
> -	 * cannot get pinned.
> -	 */
> -	return !PageAnonExclusive(page);
> -}
> -
>   /*
>    * Indicates whether GUP can follow a PROT_NONE mapped page, or whether
>    * a (NUMA hinting) fault is required.
> diff --git a/mm/internal.h b/mm/internal.h
> index 0f035bcaf133f5..5c1310b98db64d 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -854,6 +854,71 @@ int migrate_device_coherent_page(struct page *page);
>   struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags);
>   int __must_check try_grab_page(struct page *page, unsigned int flags);
>   
> +/*
> + * Indicates for which pages that are write-protected in the page table,
> + * whether GUP has to trigger unsharing via FAULT_FLAG_UNSHARE such that the
> + * GUP pin will remain consistent with the pages mapped into the page tables
> + * of the MM.
> + *
> + * Temporary unmapping of PageAnonExclusive() pages or clearing of
> + * PageAnonExclusive() has to protect against concurrent GUP:
> + * * Ordinary GUP: Using the PT lock
> + * * GUP-fast and fork(): mm->write_protect_seq
> + * * GUP-fast and KSM or temporary unmapping (swap, migration): see
> + *    page_try_share_anon_rmap()
> + *
> + * Must be called with the (sub)page that's actually referenced via the
> + * page table entry, which might not necessarily be the head page for a
> + * PTE-mapped THP.
> + *
> + * If the vma is NULL, we're coming from the GUP-fast path and might have
> + * to fallback to the slow path just to lookup the vma.
> + */
> +static inline bool gup_must_unshare(struct vm_area_struct *vma,
> +				    unsigned int flags, struct page *page)
> +{
> +	/*
> +	 * FOLL_WRITE is implicitly handled correctly as the page table entry
> +	 * has to be writable -- and if it references (part of) an anonymous
> +	 * folio, that part is required to be marked exclusive.
> +	 */
> +	if ((flags & (FOLL_WRITE | FOLL_PIN)) != FOLL_PIN)
> +		return false;
> +	/*
> +	 * Note: PageAnon(page) is stable until the page is actually getting
> +	 * freed.
> +	 */
> +	if (!PageAnon(page)) {
> +		/*
> +		 * We only care about R/O long-term pinning: R/O short-term
> +		 * pinning does not have the semantics to observe successive
> +		 * changes through the process page tables.
> +		 */
> +		if (!(flags & FOLL_LONGTERM))
> +			return false;
> +
> +		/* We really need the vma ... */
> +		if (!vma)
> +			return true;
> +
> +		/*
> +		 * ... because we only care about writable private ("COW")
> +		 * mappings where we have to break COW early.
> +		 */
> +		return is_cow_mapping(vma->vm_flags);
> +	}
> +
> +	/* Paired with a memory barrier in page_try_share_anon_rmap(). */
> +	if (IS_ENABLED(CONFIG_HAVE_FAST_GUP))
> +		smp_rmb();
> +
> +	/*
> +	 * Note that PageKsm() pages cannot be exclusive, and consequently,
> +	 * cannot get pinned.
> +	 */
> +	return !PageAnonExclusive(page);
> +}
> +
>   extern bool mirrored_kernelcore;
>   
>   static inline bool vma_soft_dirty_enabled(struct vm_area_struct *vma)




* Re: [PATCH v2 13/13] mm/gup: move private gup FOLL_ flags to internal.h
  2023-01-24 20:34 ` [PATCH v2 13/13] mm/gup: move private gup FOLL_ flags to internal.h Jason Gunthorpe
@ 2023-01-25  2:44   ` John Hubbard
  2023-01-26 12:48   ` David Hildenbrand
  1 sibling, 0 replies; 44+ messages in thread
From: John Hubbard @ 2023-01-25  2:44 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, linux-mm, Mike Rapoport (IBM)

On 1/24/23 12:34, Jason Gunthorpe wrote:
> Move the flags that should not be (or are not) used outside gup.c and
> related code into mm/internal.h to discourage driver abuse.
> 
> To make this more maintainable going forward, compact the two FOLL ranges
> with new bit numbers from 0 to 11 and 16 to 21, using shifts so it is
> explicit.
> 
> Switch to an enum so the whole thing is easier to read.
> 
> Cc: David Howells <dhowells@redhat.com>
> Cc: Christoph Hellwig <hch@infradead.org>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   include/linux/mm_types.h | 57 ++++++++++++++++++++++++----------------
>   mm/internal.h            | 15 +++++++++++
>   2 files changed, 50 insertions(+), 22 deletions(-)

Reviewed-by: John Hubbard <jhubbard@nvidia.com>


thanks,
-- 
John Hubbard
NVIDIA

> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 518617431f8085..dbfc3a7e9559b4 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -1039,9 +1039,6 @@ typedef unsigned int __bitwise zap_flags_t;
>    * FOLL_PIN and FOLL_LONGTERM may be used in various combinations with each
>    * other. Here is what they mean, and how to use them:
>    *
> - * FOLL_LONGTERM indicates that the page will be held for an indefinite time
> - * period _often_ under userspace control.  This is in contrast to
> - * iov_iter_get_pages(), whose usages are transient.
>    *
>    * FIXME: For pages which are part of a filesystem, mappings are subject to the
>    * lifetime enforced by the filesystem and we need guarantees that longterm
> @@ -1085,24 +1082,40 @@ typedef unsigned int __bitwise zap_flags_t;
>    * Please see Documentation/core-api/pin_user_pages.rst for more information.
>    */
>   
> -#define FOLL_WRITE	0x01	/* check pte is writable */
> -#define FOLL_TOUCH	0x02	/* mark page accessed */
> -#define FOLL_GET	0x04	/* do get_page on page */
> -#define FOLL_DUMP	0x08	/* give error on hole if it would be zero */
> -#define FOLL_FORCE	0x10	/* get_user_pages read/write w/o permission */
> -#define FOLL_NOWAIT	0x20	/* if a disk transfer is needed, start the IO
> -				 * and return without waiting upon it */
> -#define FOLL_NOFAULT	0x80	/* do not fault in pages */
> -#define FOLL_HWPOISON	0x100	/* check page is hwpoisoned */
> -#define FOLL_TRIED	0x800	/* a retry, previous pass started an IO */
> -#define FOLL_REMOTE	0x2000	/* we are working on non-current tsk/mm */
> -#define FOLL_ANON	0x8000	/* don't do file mappings */
> -#define FOLL_LONGTERM	0x10000	/* mapping lifetime is indefinite: see below */
> -#define FOLL_SPLIT_PMD	0x20000	/* split huge pmd before returning */
> -#define FOLL_PIN	0x40000	/* pages must be released via unpin_user_page */
> -#define FOLL_FAST_ONLY	0x80000	/* gup_fast: prevent fall-back to slow gup */
> -#define FOLL_PCI_P2PDMA	0x100000 /* allow returning PCI P2PDMA pages */
> -#define FOLL_INTERRUPTIBLE  0x200000 /* allow interrupts from generic signals */
> -#define FOLL_UNLOCKABLE	0x400000 /* allow unlocking the mmap lock (internal only) */
> +enum {
> +	/* check pte is writable */
> +	FOLL_WRITE = 1 << 0,
> +	/* do get_page on page */
> +	FOLL_GET = 1 << 1,
> +	/* give error on hole if it would be zero */
> +	FOLL_DUMP = 1 << 2,
> +	/* get_user_pages read/write w/o permission */
> +	FOLL_FORCE = 1 << 3,
> +	/*
> +	 * if a disk transfer is needed, start the IO and return without waiting
> +	 * upon it
> +	 */
> +	FOLL_NOWAIT = 1 << 4,
> +	/* do not fault in pages */
> +	FOLL_NOFAULT = 1 << 5,
> +	/* check page is hwpoisoned */
> +	FOLL_HWPOISON = 1 << 6,
> +	/* don't do file mappings */
> +	FOLL_ANON = 1 << 7,
> +	/*
> +	 * FOLL_LONGTERM indicates that the page will be held for an indefinite
> +	 * time period _often_ under userspace control.  This is in contrast to
> +	 * iov_iter_get_pages(), whose usages are transient.
> +	 */
> +	FOLL_LONGTERM = 1 << 8,
> +	/* split huge pmd before returning */
> +	FOLL_SPLIT_PMD = 1 << 9,
> +	/* allow returning PCI P2PDMA pages */
> +	FOLL_PCI_P2PDMA = 1 << 10,
> +	/* allow interrupts from generic signals */
> +	FOLL_INTERRUPTIBLE = 1 << 11,
> +
> +	/* See also internal only FOLL flags in mm/internal.h */
> +};
>   
>   #endif /* _LINUX_MM_TYPES_H */
> diff --git a/mm/internal.h b/mm/internal.h
> index 5c1310b98db64d..673b4b1d5f4c79 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -854,6 +854,21 @@ int migrate_device_coherent_page(struct page *page);
>   struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags);
>   int __must_check try_grab_page(struct page *page, unsigned int flags);
>   
> +enum {
> +	/* mark page accessed */
> +	FOLL_TOUCH = 1 << 16,
> +	/* a retry, previous pass started an IO */
> +	FOLL_TRIED = 1 << 17,
> +	/* we are working on non-current tsk/mm */
> +	FOLL_REMOTE = 1 << 18,
> +	/* pages must be released via unpin_user_page */
> +	FOLL_PIN = 1 << 19,
> +	/* gup_fast: prevent fall-back to slow gup */
> +	FOLL_FAST_ONLY = 1 << 20,
> +	/* allow unlocking the mmap lock */
> +	FOLL_UNLOCKABLE = 1 << 21,
> +};
> +
>   /*
>    * Indicates for which pages that are write-protected in the page table,
>    * whether GUP has to trigger unsharing via FAULT_FLAG_UNSHARE such that the



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 01/13] mm/gup: have internal functions get the mmap_read_lock()
  2023-01-25  2:11   ` John Hubbard
@ 2023-01-25  2:52     ` John Hubbard
  2023-01-25 16:38     ` Jason Gunthorpe
  1 sibling, 0 replies; 44+ messages in thread
From: John Hubbard @ 2023-01-25  2:52 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, linux-mm, Mike Rapoport (IBM)

On 1/24/23 18:11, John Hubbard wrote:
> On 1/24/23 12:34, Jason Gunthorpe wrote:
> ...
>> @@ -3183,11 +3198,13 @@ long pin_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
>>       /* FOLL_GET and FOLL_PIN are mutually exclusive. */
>>       if (WARN_ON_ONCE(gup_flags & FOLL_GET))
>>           return -EINVAL;
>> +    int locked = 0;
>>       if (WARN_ON_ONCE(!pages))
>>           return -EINVAL;
>> -    gup_flags |= FOLL_PIN;
>> -    return get_user_pages_unlocked(start, nr_pages, pages, gup_flags);
>> +    gup_flags |= FOLL_PIN | FOLL_TOUCH;
> 
> I missed this on my review of v1 of this series: the FOLL_TOUCH change
> looks like a mistake, yes? It should just be left as-is:
> 
>      gup_flags |= FOLL_PIN;
> 

...and I think the same thing happened in internal_get_user_pages_fast().

If you think that FOLL_TOUCH should be added, then it should really be a
separate patch, because it's unrelated to what this patch is doing.

Looking through the code, I'm on the fence about that. On one hand,
there is an argument that pages that are going to be pinned should get
FOLL_TOUCH. But should pin_user_pages() do that for the caller? I'm
not sure.
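
For context, what FOLL_TOUCH buys you in the slow path is roughly this
(paraphrased from follow_page_pte(), trimmed for brevity):

	if (flags & FOLL_TOUCH) {
		if ((flags & FOLL_WRITE) &&
		    !pte_dirty(pte) && !PageDirty(page))
			set_page_dirty(page);
		/* easier than an atomic pte_mkyoung() dance */
		mark_page_accessed(page);
	}

so the question is really whether pinning should imply that aging hint.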

thanks,
-- 
John Hubbard
NVIDIA



* Re: [PATCH v2 09/13] mm/gup: make locked never NULL in the internal GUP functions
  2023-01-24 20:34 ` [PATCH v2 09/13] mm/gup: make locked never NULL in the internal GUP functions Jason Gunthorpe
@ 2023-01-25  3:00   ` John Hubbard
  0 siblings, 0 replies; 44+ messages in thread
From: John Hubbard @ 2023-01-25  3:00 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, linux-mm, Mike Rapoport (IBM)

On 1/24/23 12:34, Jason Gunthorpe wrote:
> Now that NULL locked doesn't have a special meaning, we can just make it
> non-NULL in all cases and remove the special tests.
> 
> get_user_pages() and pin_user_pages() can safely pass in a locked = 1
> 
> get_user_pages_remote() and pin_user_pages_remote() can swap in a local
> variable for locked if NULL is passed.
> 
> Remove all the NULL checks.
> 
> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   mm/gup.c | 51 ++++++++++++++++++++++++++++++---------------------
>   1 file changed, 30 insertions(+), 21 deletions(-)


Reviewed-by: John Hubbard <jhubbard@nvidia.com>

thanks,
-- 
John Hubbard
NVIDIA

> 
> diff --git a/mm/gup.c b/mm/gup.c
> index 37849fcd962e9a..932a2339613c2f 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -879,9 +879,9 @@ static int get_gate_page(struct mm_struct *mm, unsigned long address,
>   }
>   
>   /*
> - * mmap_lock must be held on entry.  If @locked != NULL and *@flags
> - * does not include FOLL_NOWAIT, the mmap_lock may be released.  If it
> - * is, *@locked will be set to 0 and -EBUSY returned.
> + * mmap_lock must be held on entry.  If @flags has FOLL_UNLOCKABLE but not
> + * FOLL_NOWAIT, the mmap_lock may be released.  If it is, *@locked will be set
> + * to 0 and -EBUSY returned.
>    */
>   static int faultin_page(struct vm_area_struct *vma,
>   		unsigned long address, unsigned int *flags, bool unshare,
> @@ -930,8 +930,8 @@ static int faultin_page(struct vm_area_struct *vma,
>   		 * mmap lock in the page fault handler. Sanity check this.
>   		 */
>   		WARN_ON_ONCE(fault_flags & FAULT_FLAG_RETRY_NOWAIT);
> -		if (locked)
> -			*locked = 0;
> +		*locked = 0;
> +
>   		/*
>   		 * We should do the same as VM_FAULT_RETRY, but let's not
>   		 * return -EBUSY since that's not reflecting the reality of
> @@ -951,7 +951,7 @@ static int faultin_page(struct vm_area_struct *vma,
>   	}
>   
>   	if (ret & VM_FAULT_RETRY) {
> -		if (locked && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
> +		if (!(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
>   			*locked = 0;
>   		return -EBUSY;
>   	}
> @@ -1062,14 +1062,12 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
>    * appropriate) must be called after the page is finished with, and
>    * before put_page is called.
>    *
> - * If @locked != NULL, *@locked will be set to 0 when mmap_lock is
> - * released by an up_read().  That can happen if @gup_flags does not
> - * have FOLL_NOWAIT.
> + * If FOLL_UNLOCKABLE is set without FOLL_NOWAIT then the mmap_lock may
> + * be released. If this happens *@locked will be set to 0 on return.
>    *
> - * A caller using such a combination of @locked and @gup_flags
> - * must therefore hold the mmap_lock for reading only, and recognize
> - * when it's been released.  Otherwise, it must be held for either
> - * reading or writing and will not be released.
> + * A caller using such a combination of @gup_flags must therefore hold the
> + * mmap_lock for reading only, and recognize when it's been released. Otherwise,
> + * it must be held for either reading or writing and will not be released.
>    *
>    * In most cases, get_user_pages or get_user_pages_fast should be used
>    * instead of __get_user_pages. __get_user_pages should be used only if
> @@ -1121,7 +1119,7 @@ static long __get_user_pages(struct mm_struct *mm,
>   				i = follow_hugetlb_page(mm, vma, pages, vmas,
>   						&start, &nr_pages, i,
>   						gup_flags, locked);
> -				if (locked && *locked == 0) {
> +				if (!*locked) {
>   					/*
>   					 * We've got a VM_FAULT_RETRY
>   					 * and we've lost mmap_lock.
> @@ -1354,7 +1352,7 @@ static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
>   	 * The internal caller expects GUP to manage the lock internally and the
>   	 * lock must be released when this returns.
>   	 */
> -	if (locked && !*locked) {
> +	if (!*locked) {
>   		if (mmap_read_lock_killable(mm))
>   			return -EAGAIN;
>   		must_unlock = true;
> @@ -1502,6 +1500,7 @@ long populate_vma_page_range(struct vm_area_struct *vma,
>   {
>   	struct mm_struct *mm = vma->vm_mm;
>   	unsigned long nr_pages = (end - start) / PAGE_SIZE;
> +	int local_locked = 1;
>   	int gup_flags;
>   	long ret;
>   
> @@ -1542,7 +1541,7 @@ long populate_vma_page_range(struct vm_area_struct *vma,
>   	 * not result in a stack expansion that recurses back here.
>   	 */
>   	ret = __get_user_pages(mm, start, nr_pages, gup_flags,
> -				NULL, NULL, locked);
> +				NULL, NULL, locked ? locked : &local_locked);
>   	lru_add_drain();
>   	return ret;
>   }
> @@ -1683,7 +1682,7 @@ static long __get_user_pages_locked(struct mm_struct *mm, unsigned long start,
>   	 * The internal caller expects GUP to manage the lock internally and the
>   	 * lock must be released when this returns.
>   	 */
> -	if (locked && !*locked) {
> +	if (!*locked) {
>   		if (mmap_read_lock_killable(mm))
>   			return -EAGAIN;
>   		must_unlock = true;
> @@ -2222,11 +2221,14 @@ long get_user_pages_remote(struct mm_struct *mm,
>   		unsigned int gup_flags, struct page **pages,
>   		struct vm_area_struct **vmas, int *locked)
>   {
> +	int local_locked = 1;
> +
>   	if (!is_valid_gup_args(pages, vmas, locked, &gup_flags,
>   			       FOLL_TOUCH | FOLL_REMOTE))
>   		return -EINVAL;
>   
> -	return __get_user_pages_locked(mm, start, nr_pages, pages, vmas, locked,
> +	return __get_user_pages_locked(mm, start, nr_pages, pages, vmas,
> +				       locked ? locked : &local_locked,
>   				       gup_flags);
>   }
>   EXPORT_SYMBOL(get_user_pages_remote);
> @@ -2261,11 +2263,13 @@ long get_user_pages(unsigned long start, unsigned long nr_pages,
>   		unsigned int gup_flags, struct page **pages,
>   		struct vm_area_struct **vmas)
>   {
> +	int locked = 1;
> +
>   	if (!is_valid_gup_args(pages, vmas, NULL, &gup_flags, FOLL_TOUCH))
>   		return -EINVAL;
>   
>   	return __get_user_pages_locked(current->mm, start, nr_pages, pages,
> -				       vmas, NULL, gup_flags);
> +				       vmas, &locked, gup_flags);
>   }
>   EXPORT_SYMBOL(get_user_pages);
>   
> @@ -3158,10 +3162,13 @@ long pin_user_pages_remote(struct mm_struct *mm,
>   			   unsigned int gup_flags, struct page **pages,
>   			   struct vm_area_struct **vmas, int *locked)
>   {
> +	int local_locked = 1;
> +
>   	if (!is_valid_gup_args(pages, vmas, locked, &gup_flags,
>   			       FOLL_PIN | FOLL_TOUCH | FOLL_REMOTE))
>   		return 0;
> -	return __gup_longterm_locked(mm, start, nr_pages, pages, vmas, locked,
> +	return __gup_longterm_locked(mm, start, nr_pages, pages, vmas,
> +				     locked ? locked : &local_locked,
>   				     gup_flags);
>   }
>   EXPORT_SYMBOL(pin_user_pages_remote);
> @@ -3187,10 +3194,12 @@ long pin_user_pages(unsigned long start, unsigned long nr_pages,
>   		    unsigned int gup_flags, struct page **pages,
>   		    struct vm_area_struct **vmas)
>   {
> +	int locked = 1;
> +
>   	if (!is_valid_gup_args(pages, vmas, NULL, &gup_flags, FOLL_PIN))
>   		return 0;
>   	return __gup_longterm_locked(current->mm, start, nr_pages,
> -				     pages, vmas, NULL, gup_flags);
> +				     pages, vmas, &locked, gup_flags);
>   }
>   EXPORT_SYMBOL(pin_user_pages);
>   




* Re: [PATCH v2 01/13] mm/gup: have internal functions get the mmap_read_lock()
  2023-01-25  2:11   ` John Hubbard
  2023-01-25  2:52     ` John Hubbard
@ 2023-01-25 16:38     ` Jason Gunthorpe
  2023-01-25 18:48       ` John Hubbard
  1 sibling, 1 reply; 44+ messages in thread
From: Jason Gunthorpe @ 2023-01-25 16:38 UTC (permalink / raw)
  To: John Hubbard
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, linux-mm, Mike Rapoport (IBM)

On Tue, Jan 24, 2023 at 06:11:36PM -0800, John Hubbard wrote:
> On 1/24/23 12:34, Jason Gunthorpe wrote:
> ...
> > @@ -3183,11 +3198,13 @@ long pin_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
> >   	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
> >   	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> >   		return -EINVAL;
> > +	int locked = 0;
> >   	if (WARN_ON_ONCE(!pages))
> >   		return -EINVAL;
> > -	gup_flags |= FOLL_PIN;
> > -	return get_user_pages_unlocked(start, nr_pages, pages, gup_flags);
> > +	gup_flags |= FOLL_PIN | FOLL_TOUCH;
> 
> I missed this on my review of v1 of this series: the FOLL_TOUCH change
> looks like a mistake, yes? It should just be left as-is:
> 
> 	gup_flags |= FOLL_PIN;
>

No, not a mistake.

The transformation is this:

-	return get_user_pages_unlocked(start, nr_pages, pages, gup_flags);
+	gup_flags |= FOLL_PIN | FOLL_TOUCH;
+	return __gup_longterm_locked(current->mm, start, nr_pages, pages, NULL,
+				     &locked, gup_flags);

I.e., we switch get_user_pages_unlocked() for __gup_longterm_locked().

However, get_user_pages_unlocked() was adding FOLL_TOUCH before
calling __gup_longterm_locked():

-	ret = __gup_longterm_locked(mm, start, nr_pages, pages, NULL, &locked,
-				    gup_flags | FOLL_TOUCH);

Thus we have to lift the FOLL_TOUCH when we do the function call
replacement.

Jason



* Re: [PATCH v2 01/13] mm/gup: have internal functions get the mmap_read_lock()
  2023-01-25 16:38     ` Jason Gunthorpe
@ 2023-01-25 18:48       ` John Hubbard
  0 siblings, 0 replies; 44+ messages in thread
From: John Hubbard @ 2023-01-25 18:48 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alistair Popple, David Hildenbrand, David Howells,
	Christoph Hellwig, linux-mm, Mike Rapoport (IBM)

On 1/25/23 08:38, Jason Gunthorpe wrote:
...
>> I missed this in my review of v1 of this series: the FOLL_TOUCH change
>> looks like a mistake, yes? It should just be left as-is:
>>
>> 	gup_flags |= FOLL_PIN;
>>
> 
> No, not a mistake.
> 
> The transformation is this:
> 
> -	return get_user_pages_unlocked(start, nr_pages, pages, gup_flags);
> +	gup_flags |= FOLL_PIN | FOLL_TOUCH;
> +	return __gup_longterm_locked(current->mm, start, nr_pages, pages, NULL,
> +				     &locked, gup_flags);
> 
> I.e. we switch get_user_pages_unlocked() for __gup_longterm_locked()
> 
> However, get_user_pages_unlocked() was adding the FOLL_TOUCH before
> calling __gup_longterm_locked():
> 
> -	ret = __gup_longterm_locked(mm, start, nr_pages, pages, NULL, &locked,
> -				    gup_flags | FOLL_TOUCH);
> 
> Thus we have to lift the FOLL_TOUCH when we do the function call
> replacement.
> 

OK good. Sorry about the moment of panic, haha. :)


Reviewed-by: John Hubbard <jhubbard@nvidia.com>

thanks,
-- 
John Hubbard
NVIDIA


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 12/13] mm/gup: move gup_must_unshare() to mm/internal.h
  2023-01-24 20:34 ` [PATCH v2 12/13] mm/gup: move gup_must_unshare() to mm/internal.h Jason Gunthorpe
  2023-01-25  2:41   ` John Hubbard
@ 2023-01-26 11:29   ` David Hildenbrand
  1 sibling, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2023-01-26 11:29 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alistair Popple, David Howells, Christoph Hellwig, John Hubbard,
	linux-mm, Mike Rapoport (IBM)

On 24.01.23 21:34, Jason Gunthorpe wrote:
> This function is only used in gup.c and closely related code. It touches
> FOLL_PIN so it must be moved before the next patch.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 13/13] mm/gup: move private gup FOLL_ flags to internal.h
  2023-01-24 20:34 ` [PATCH v2 13/13] mm/gup: move private gup FOLL_ flags to internal.h Jason Gunthorpe
  2023-01-25  2:44   ` John Hubbard
@ 2023-01-26 12:48   ` David Hildenbrand
  2023-01-26 12:55     ` Jason Gunthorpe
  1 sibling, 1 reply; 44+ messages in thread
From: David Hildenbrand @ 2023-01-26 12:48 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alistair Popple, David Howells, Christoph Hellwig, John Hubbard,
	linux-mm, Mike Rapoport (IBM)

On 24.01.23 21:34, Jason Gunthorpe wrote:
> Move the flags that should not/are not used outside gup.c and related into
> mm/internal.h to discourage driver abuse.
> 
> To make this more maintainable going forward, compact the two FOLL ranges
> with new bit numbers from 0 to 11 and 16 to 21, using shifts so it is
> explicit.
> 
> Switch to an enum so the whole thing is easier to read.

Using a __bitwise type would be even better, but that requires quite 
some adjustments ...
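
(For illustration, it would follow the gfp_t pattern; a hypothetical
sketch, type name assumed:

	typedef unsigned int __bitwise foll_flags_t;

	#define FOLL_WRITE	((__force foll_flags_t)(1 << 0))
	#define FOLL_GET	((__force foll_flags_t)(1 << 1))

so sparse would flag any accidental mixing with plain integer flags.)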

The primary leftover for FOLL_GET seems to be follow_page(). IIRC, there 
is only one caller that doesn't pass FOLL_GET (s390). We could either 
add a new function to "probe" that anything is mapped (IIRC that's the 
use case), or simply ref+unref.

So we might just be able to make FOLL_GET internal fairly easily, too ...
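
The ref+unref variant could be as small as this sketch, assuming
follow_page() keeps its current signature:

	/* probe whether anything is mapped at addr, then drop the reference */
	page = follow_page(vma, addr, FOLL_WRITE | FOLL_GET);
	if (IS_ERR_OR_NULL(page))
		return -ENXIO;
	put_page(page);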

Anyhow

Acked-by: David Hildenbrand <david@redhat.com>


-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 13/13] mm/gup: move private gup FOLL_ flags to internal.h
  2023-01-26 12:48   ` David Hildenbrand
@ 2023-01-26 12:55     ` Jason Gunthorpe
  2023-01-26 13:06       ` David Hildenbrand
  2023-01-26 14:41       ` Claudio Imbrenda
  0 siblings, 2 replies; 44+ messages in thread
From: Jason Gunthorpe @ 2023-01-26 12:55 UTC (permalink / raw)
  To: David Hildenbrand, Claudio Imbrenda
  Cc: Alistair Popple, David Howells, Christoph Hellwig, John Hubbard,
	linux-mm, Mike Rapoport (IBM),
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda

On Thu, Jan 26, 2023 at 01:48:46PM +0100, David Hildenbrand wrote:
> On 24.01.23 21:34, Jason Gunthorpe wrote:
> > Move the flags that should not/are not used outside gup.c and related into
> > mm/internal.h to discourage driver abuse.
> > 
> > To make this more maintainable going forward, compact the two FOLL ranges
> > with new bit numbers from 0 to 11 and 16 to 21, using shifts so it is
> > explicit.
> > 
> > Switch to an enum so the whole thing is easier to read.
> 
> Using a __bitwise type would be even better, but that requires quite some
> adjustments ...
> 
> The primary leftover for FOLL_GET seems to be follow_page(). IIRC, there is
> only one caller that doesn't pass FOLL_GET (s390). We could either add a new
> function to "probe" that anything is mapped (IIRC that's the use case), or
> simply ref+unref.

Is that code even safe as written? I don't really understand how it
can safely call lock_page() on something it doesn't have a reference
to?

So adding the FOLL_GET and put_page seems like a good idea to me? At a
minimum this should get a comment to explain why it is OK.

S390 people?

int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
{
[..]
        rc = -ENXIO;
        page = follow_page(vma, uaddr, FOLL_WRITE);
        if (IS_ERR_OR_NULL(page))
                goto out;

        lock_page(page);
        ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);
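
For comparison, a sketch of the suggested FOLL_GET variant, which would
hold a reference across the page lock (expected_page_refs() would then
need to account for that extra reference):

	page = follow_page(vma, uaddr, FOLL_WRITE | FOLL_GET);
	if (IS_ERR_OR_NULL(page))
		goto out;

	lock_page(page);	/* safe now: we hold a reference */
	ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);
	/* ... make the page secure ... */
	pte_unmap_unlock(ptep, ptelock);
	unlock_page(page);
	put_page(page);		/* drop the follow_page() reference */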

Jason


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 13/13] mm/gup: move private gup FOLL_ flags to internal.h
  2023-01-26 12:55     ` Jason Gunthorpe
@ 2023-01-26 13:06       ` David Hildenbrand
  2023-01-26 14:41       ` Claudio Imbrenda
  1 sibling, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2023-01-26 13:06 UTC (permalink / raw)
  To: Jason Gunthorpe, Claudio Imbrenda
  Cc: Alistair Popple, David Howells, Christoph Hellwig, John Hubbard,
	linux-mm, Mike Rapoport (IBM),
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda

On 26.01.23 13:55, Jason Gunthorpe wrote:
> On Thu, Jan 26, 2023 at 01:48:46PM +0100, David Hildenbrand wrote:
>> On 24.01.23 21:34, Jason Gunthorpe wrote:
>>> Move the flags that should not/are not used outside gup.c and related into
>>> mm/internal.h to discourage driver abuse.
>>>
>>> To make this more maintainable going forward, compact the two FOLL ranges
>>> with new bit numbers from 0 to 11 and 16 to 21, using shifts so it is
>>> explicit.
>>>
>>> Switch to an enum so the whole thing is easier to read.
>>
>> Using a __bitwise type would be even better, but that requires quite some
>> adjustments ...
>>
>> The primary leftover for FOLL_GET seems to be follow_page(). IIRC, there is
>> only one caller that doesn't pass FOLL_GET (s390). We could either add a new
>> function to "probe" that anything is mapped (IIRC that's the use case), or
>> simply ref+unref.
> 
> Is that code even safe as written? I don't really understand how it
> can safely call lock_page() on something it doesn't have a reference
> to?

Let me look into the details ... I remember reviewing that before I got 
to study the beauty of GUP in more detail.

CCing Claudio

> 
> So adding the FOLL_GET and put_page seems like a good idea to me? At a
> minimum this should get a comment to explain why it is OK.
> 

At first sight, it really feels like the right thing to do. Maybe 
another good reason to handle FOLL_GET vs. FOLL_PIN completely internally.
Harder to abuse.


> S390 people?
> 
> int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
> {
> [..]
>          rc = -ENXIO;
>          page = follow_page(vma, uaddr, FOLL_WRITE);
>          if (IS_ERR_OR_NULL(page))
>                  goto out;
> 
>          lock_page(page);
>          ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);
> 
> Jason
> 

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 13/13] mm/gup: move private gup FOLL_ flags to internal.h
  2023-01-26 12:55     ` Jason Gunthorpe
  2023-01-26 13:06       ` David Hildenbrand
@ 2023-01-26 14:41       ` Claudio Imbrenda
  2023-01-26 14:46         ` David Hildenbrand
  1 sibling, 1 reply; 44+ messages in thread
From: Claudio Imbrenda @ 2023-01-26 14:41 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: David Hildenbrand, Alistair Popple, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM),
	Christian Borntraeger, Janosch Frank

On Thu, 26 Jan 2023 08:55:27 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Thu, Jan 26, 2023 at 01:48:46PM +0100, David Hildenbrand wrote:
> > On 24.01.23 21:34, Jason Gunthorpe wrote:  
> > > Move the flags that should not/are not used outside gup.c and related into
> > > mm/internal.h to discourage driver abuse.
> > > 
> > > To make this more maintainable going forward, compact the two FOLL ranges
> > > with new bit numbers from 0 to 11 and 16 to 21, using shifts so it is
> > > explicit.
> > > 
> > > Switch to an enum so the whole thing is easier to read.  
> > 
> > Using a __bitwise type would be even better, but that requires quite some
> > adjustments ...
> > 
> > The primary leftover for FOLL_GET seems to be follow_page(). IIRC, there is
> > only one caller that doesn't pass FOLL_GET (s390). We could either add a new
> > function to "probe" that anything is mapped (IIRC that's the use case), or
> > simply ref+unref.  
> 
> Is that code even safe as written? I don't really understand how it

yes (surprisingly) it is

> can safely call lock_page() on something it doesn't have a reference
> to?

the code between lock_page and unlock_page will behave "properly" and
do nothing or at worst cause a tiny performance issue in the rare case
something changes between the follow_page and the page_lock, i.e. if
things are done on the wrong page.

make_secure_pte does some very hacky stuff involving page_ref_freeze,
with expected_page_refs counting how many references are expected.

the code is how it is because, when it was written, FOLL_PIN did
not exist, and FOLL_GET caused the page to be converted to unsecure.

I guess maybe we can make it work with FOLL_GET if expected_page_refs
takes into account the extra reference? (and then maybe we could add a
put_page after unlock_page)
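
A sketch of that accounting, assuming make_secure_pte() keeps its
page_ref_freeze() approach:

	/* the caller now holds one extra reference via FOLL_GET */
	expected = expected_page_refs(page) + 1;
	if (!page_ref_freeze(page, expected))
		return -EBUSY;
	/* ... issue the Ultravisor call that makes the page secure ... */
	page_ref_unfreeze(page, expected);
	/* the caller then does put_page(page) after unlock_page(page) */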

> 
> So adding the FOLL_GET and put_page seems like a good idea to me? At a
> minimum this should get a comment to explain why it is OK.
> 
> S390 people?
> 
> int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
> {
> [..]
>         rc = -ENXIO;
>         page = follow_page(vma, uaddr, FOLL_WRITE);
>         if (IS_ERR_OR_NULL(page))
>                 goto out;
> 
>         lock_page(page);
>         ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);
> 
> Jason



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 13/13] mm/gup: move private gup FOLL_ flags to internal.h
  2023-01-26 14:41       ` Claudio Imbrenda
@ 2023-01-26 14:46         ` David Hildenbrand
  2023-01-26 15:05           ` Jason Gunthorpe
  0 siblings, 1 reply; 44+ messages in thread
From: David Hildenbrand @ 2023-01-26 14:46 UTC (permalink / raw)
  To: Claudio Imbrenda, Jason Gunthorpe
  Cc: Alistair Popple, David Howells, Christoph Hellwig, John Hubbard,
	linux-mm, Mike Rapoport (IBM),
	Christian Borntraeger, Janosch Frank

On 26.01.23 15:41, Claudio Imbrenda wrote:
> On Thu, 26 Jan 2023 08:55:27 -0400
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
>> On Thu, Jan 26, 2023 at 01:48:46PM +0100, David Hildenbrand wrote:
>>> On 24.01.23 21:34, Jason Gunthorpe wrote:
>>>> Move the flags that should not/are not used outside gup.c and related into
>>>> mm/internal.h to discourage driver abuse.
>>>>
>>>> To make this more maintainable going forward, compact the two FOLL ranges
>>>> with new bit numbers from 0 to 11 and 16 to 21, using shifts so it is
>>>> explicit.
>>>>
>>>> Switch to an enum so the whole thing is easier to read.
>>>
>>> Using a __bitwise type would be even better, but that requires quite some
>>> adjustments ...
>>>
>>> The primary leftover for FOLL_GET seems to be follow_page(). IIRC, there is
>>> only one caller that doesn't pass FOLL_GET (s390). We could either add a new
>>> function to "probe" that anything is mapped (IIRC that's the use case), or
>>> simply ref+unref.
>>
>> Is that code even safe as written? I don't really understand how it
> 
> yes (surprisingly) it is
> 
>> can safely call lock_page() on something it doesn't have a reference
>> to?
> 
> the code between lock_page and unlock_page will behave "properly" and
> do nothing or at worst cause a tiny performance issue in the rare case
> something changes between the follow_page and the page_lock, i.e. if
> things are done on the wrong page.

What prevents the page from getting unmapped (MADV_DONTNEED), freed, and
reallocated as a larger folio, such that the unlock_page() would target
the wrong bit? I think even while freeing a locked page we might run into
trouble ...

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 13/13] mm/gup: move private gup FOLL_ flags to internal.h
  2023-01-26 14:46         ` David Hildenbrand
@ 2023-01-26 15:05           ` Jason Gunthorpe
  2023-01-26 15:39             ` Claudio Imbrenda
  0 siblings, 1 reply; 44+ messages in thread
From: Jason Gunthorpe @ 2023-01-26 15:05 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Claudio Imbrenda, Alistair Popple, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM),
	Christian Borntraeger, Janosch Frank

On Thu, Jan 26, 2023 at 03:46:09PM +0100, David Hildenbrand wrote:
> On 26.01.23 15:41, Claudio Imbrenda wrote:
> > On Thu, 26 Jan 2023 08:55:27 -0400
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> > 
> > > On Thu, Jan 26, 2023 at 01:48:46PM +0100, David Hildenbrand wrote:
> > > > On 24.01.23 21:34, Jason Gunthorpe wrote:
> > > > > Move the flags that should not/are not used outside gup.c and related into
> > > > > mm/internal.h to discourage driver abuse.
> > > > > 
> > > > To make this more maintainable going forward, compact the two FOLL ranges
> > > > with new bit numbers from 0 to 11 and 16 to 21, using shifts so it is
> > > > explicit.
> > > > > 
> > > > > Switch to an enum so the whole thing is easier to read.
> > > > 
> > > > Using a __bitwise type would be even better, but that requires quite some
> > > > adjustments ...
> > > > 
> > > > The primary leftover for FOLL_GET seems to be follow_page(). IIRC, there is
> > > > only one caller that doesn't pass FOLL_GET (s390). We could either add a new
> > > > function to "probe" that anything is mapped (IIRC that's the use case), or
> > > > simply ref+unref.
> > > 
> > > Is that code even safe as written? I don't really understand how it
> > 
> > yes (surprisingly) it is
> > 
> > > can safely call lock_page() on something it doesn't have a reference
> > > to?
> > 
> > the code between lock_page and unlock_page will behave "properly" and
> > do nothing or at worst cause a tiny performance issue in the rare case
> > something changes between the follow_page and the page_lock, i.e. if
> > things are done on the wrong page.
> 
> What prevents the page from getting unmapped (MADV_DONTNEED), freed, and
> reallocated as a larger folio, such that the unlock_page() would target
> the wrong bit? I think even while freeing a locked page we might run into
> trouble ...

Yep. 

The issue is you can't call lock_page() on something you don't have a
ref to.

The worst case would be that the memory got unmapped from the VMA and the
entire memory space was hot-unplugged, e.g. it was DAX or something. Now
the page pointer will oops if you call lock_page().

Why not just use get_locked_pte() exclusively and do -EAGAIN or
-EBUSY if folio_trylock() fails, under the PTL? This already happens
for the PageWriteback case.

Jason


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 13/13] mm/gup: move private gup FOLL_ flags to internal.h
  2023-01-26 15:05           ` Jason Gunthorpe
@ 2023-01-26 15:39             ` Claudio Imbrenda
  2023-01-26 16:35               ` Jason Gunthorpe
  0 siblings, 1 reply; 44+ messages in thread
From: Claudio Imbrenda @ 2023-01-26 15:39 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: David Hildenbrand, Alistair Popple, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM),
	Christian Borntraeger, Janosch Frank

On Thu, 26 Jan 2023 11:05:27 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Thu, Jan 26, 2023 at 03:46:09PM +0100, David Hildenbrand wrote:
> > On 26.01.23 15:41, Claudio Imbrenda wrote:  
> > > On Thu, 26 Jan 2023 08:55:27 -0400
> > > Jason Gunthorpe <jgg@nvidia.com> wrote:
> > >   
> > > > On Thu, Jan 26, 2023 at 01:48:46PM +0100, David Hildenbrand wrote:  
> > > > > On 24.01.23 21:34, Jason Gunthorpe wrote:  
> > > > > > Move the flags that should not/are not used outside gup.c and related into
> > > > > > mm/internal.h to discourage driver abuse.
> > > > > > 
> > > > > > To make this more maintainable going forward, compact the two FOLL ranges
> > > > > > with new bit numbers from 0 to 11 and 16 to 21, using shifts so it is
> > > > > > explicit.
> > > > > > 
> > > > > > Switch to an enum so the whole thing is easier to read.  
> > > > > 
> > > > > Using a __bitwise type would be even better, but that requires quite some
> > > > > adjustments ...
> > > > > 
> > > > > The primary leftover for FOLL_GET seems to be follow_page(). IIRC, there is
> > > > > only one caller that doesn't pass FOLL_GET (s390). We could either add a new
> > > > > function to "probe" that anything is mapped (IIRC that's the use case), or
> > > > > simply ref+unref.  
> > > > 
> > > > Is that code even safe as written? I don't really understand how it  
> > > 
> > > yes (surprisingly) it is
> > >   
> > > > can safely call lock_page() on something it doesn't have a reference
> > > > to?
> > > 
> > > the code between lock_page and unlock_page will behave "properly" and
> > > do nothing or at worst cause a tiny performance issue in the rare case
> > > something changes between the follow_page and the page_lock, i.e. if
> > > things are done on the wrong page.  
> > 
> > What prevents the page from getting unmapped (MADV_DONTNEED), freed, and
> > reallocated as a larger folio, such that the unlock_page() would target
> > the wrong bit? I think even while freeing a locked page we might run into
> > trouble ...
> 
> Yep. 
> 
> The issue is you can't call lock_page() on something you don't have a
> ref to.

so we have been doing this wrong the whole time? oops

> 
> The worst case would be that the memory got unmapped from the VMA and the
> entire memory space was hot-unplugged, e.g. it was DAX or something. Now
> the page pointer will oops if you call lock_page().

we do not have memory-mapped devices or anything, so this scenario is
highly unlikely (at least this one)

> 
> Why not just use get_locked_pte() exclusively and do -EAGAIN or
> -EBUSY if folio_trylock() fails, under the PTL? This already happens
> for the PageWriteback case.

I think I will need some time to process this sentence


I can tell you that the original goal of that function is to make sure
that there are no extra references. In particular, we want to prevent
I/O of any kind from being ongoing while the page becomes secure. (The I/O
will fail and, depending on which device it was, the whole system might
end up in a rather unhappy state.)

transitioning from secure to non-secure instead is much easier

> 
> Jason



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 13/13] mm/gup: move private gup FOLL_ flags to internal.h
  2023-01-26 15:39             ` Claudio Imbrenda
@ 2023-01-26 16:35               ` Jason Gunthorpe
  2023-01-26 17:24                 ` Claudio Imbrenda
  2023-01-30 18:21                 ` Claudio Imbrenda
  0 siblings, 2 replies; 44+ messages in thread
From: Jason Gunthorpe @ 2023-01-26 16:35 UTC (permalink / raw)
  To: Claudio Imbrenda
  Cc: David Hildenbrand, Alistair Popple, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM),
	Christian Borntraeger, Janosch Frank

On Thu, Jan 26, 2023 at 04:39:02PM +0100, Claudio Imbrenda wrote:

> I can tell you that the original goal of that function is to make sure
> that there are no extra references. In particular, we want to prevent
> I/O of any kind from being ongoing while the page becomes secure. (The I/O
> will fail and, depending on which device it was, the whole system might
> end up in a rather unhappy state.)

Sure, but if there is concurrent IO you just try again right? It
doesn't wait for refs to drop for instance.

So make the lock_page work the same way:

diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
index 9f18a4af9c1319..847ee50b8672c6 100644
--- a/arch/s390/kernel/uv.c
+++ b/arch/s390/kernel/uv.c
@@ -193,20 +193,11 @@ static int expected_page_refs(struct page *page)
 }
 
 static int make_secure_pte(pte_t *ptep, unsigned long addr,
-			   struct page *exp_page, struct uv_cb_header *uvcb)
+			   struct page *page, struct uv_cb_header *uvcb)
 {
 	pte_t entry = READ_ONCE(*ptep);
-	struct page *page;
 	int expected, cc = 0;
 
-	if (!pte_present(entry))
-		return -ENXIO;
-	if (pte_val(entry) & _PAGE_INVALID)
-		return -ENXIO;
-
-	page = pte_page(entry);
-	if (page != exp_page)
-		return -ENXIO;
 	if (PageWriteback(page))
 		return -EAGAIN;
 	expected = expected_page_refs(page);
@@ -304,17 +295,25 @@ int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
 		goto out;
 
 	rc = -ENXIO;
-	page = follow_page(vma, uaddr, FOLL_WRITE);
-	if (IS_ERR_OR_NULL(page))
-		goto out;
-
-	lock_page(page);
 	ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);
+
+	if (!pte_present(entry))
+		goto out_unlock_pte;
+	if (pte_val(entry) & _PAGE_INVALID)
+		goto out_unlock_pte;
+	page = pte_page(entry);
+
+	if (!trylock_page(page)) {
+		rc = -EAGAIN;
+		goto out_unlock_pte;
+	}
+
 	if (should_export_before_import(uvcb, gmap->mm))
 		uv_convert_from_secure(page_to_phys(page));
 	rc = make_secure_pte(ptep, uaddr, page, uvcb);
-	pte_unmap_unlock(ptep, ptelock);
 	unlock_page(page);
+out_unlock_pte:
+	pte_unmap_unlock(ptep, ptelock);
 out:
 	mmap_read_unlock(gmap->mm);
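
Note that as posted the pte_present()/_PAGE_INVALID tests above still read
`entry`, which nothing in gmap_make_secure() initializes after
get_locked_pte(). A minimal corrected fragment, reading through ptep under
the PTL (the !ptep check is an added assumption, for the allocation-failure
case):

	ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);
	if (!ptep)
		goto out;
	if (!pte_present(*ptep) || (pte_val(*ptep) & _PAGE_INVALID))
		goto out_unlock_pte;
	page = pte_page(*ptep);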
 


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 13/13] mm/gup: move private gup FOLL_ flags to internal.h
  2023-01-26 16:35               ` Jason Gunthorpe
@ 2023-01-26 17:24                 ` Claudio Imbrenda
  2023-01-30 18:21                 ` Claudio Imbrenda
  1 sibling, 0 replies; 44+ messages in thread
From: Claudio Imbrenda @ 2023-01-26 17:24 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: David Hildenbrand, Alistair Popple, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM),
	Christian Borntraeger, Janosch Frank

On Thu, 26 Jan 2023 12:35:37 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Thu, Jan 26, 2023 at 04:39:02PM +0100, Claudio Imbrenda wrote:
> 
> > I can tell you that the original goal of that function is to make sure
> > that there are no extra references. In particular, we want to prevent
> > I/O of any kind from being ongoing while the page becomes secure. (The I/O
> > will fail and, depending on which device it was, the whole system might
> > end up in a rather unhappy state.)
> 
> Sure, but if there is concurrent IO you just try again right? It
> doesn't wait for refs to drop for instance.
> 
> So make the lock_page work the same way:

I'll need to think carefully about this. I remember that there were good
reasons to do things the way we did, so I want to make sure we don't
break something by changing things around.

> 
> diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
> index 9f18a4af9c1319..847ee50b8672c6 100644
> --- a/arch/s390/kernel/uv.c
> +++ b/arch/s390/kernel/uv.c
> @@ -193,20 +193,11 @@ static int expected_page_refs(struct page *page)
>  }
>  
>  static int make_secure_pte(pte_t *ptep, unsigned long addr,
> -			   struct page *exp_page, struct uv_cb_header *uvcb)
> +			   struct page *page, struct uv_cb_header *uvcb)
>  {
>  	pte_t entry = READ_ONCE(*ptep);
> -	struct page *page;
>  	int expected, cc = 0;
>  
> -	if (!pte_present(entry))
> -		return -ENXIO;
> -	if (pte_val(entry) & _PAGE_INVALID)
> -		return -ENXIO;
> -
> -	page = pte_page(entry);
> -	if (page != exp_page)
> -		return -ENXIO;
>  	if (PageWriteback(page))
>  		return -EAGAIN;
>  	expected = expected_page_refs(page);
> @@ -304,17 +295,25 @@ int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
>  		goto out;
>  
>  	rc = -ENXIO;
> -	page = follow_page(vma, uaddr, FOLL_WRITE);
> -	if (IS_ERR_OR_NULL(page))
> -		goto out;
> -
> -	lock_page(page);
>  	ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);
> +
> +	if (!pte_present(entry))
> +		goto out_unlock_pte;
> +	if (pte_val(entry) & _PAGE_INVALID)
> +		goto out_unlock_pte;
> +	page = pte_page(entry);
> +
> +	if (!trylock_page(page)) {
> +		rc = -EAGAIN;
> +		goto out_unlock_pte;
> +	}
> +
>  	if (should_export_before_import(uvcb, gmap->mm))
>  		uv_convert_from_secure(page_to_phys(page));
>  	rc = make_secure_pte(ptep, uaddr, page, uvcb);
> -	pte_unmap_unlock(ptep, ptelock);
>  	unlock_page(page);
> +out_unlock_pte:
> +	pte_unmap_unlock(ptep, ptelock);
>  out:
>  	mmap_read_unlock(gmap->mm);
>  



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 13/13] mm/gup: move private gup FOLL_ flags to internal.h
  2023-01-26 16:35               ` Jason Gunthorpe
  2023-01-26 17:24                 ` Claudio Imbrenda
@ 2023-01-30 18:21                 ` Claudio Imbrenda
  2023-01-30 18:24                   ` Jason Gunthorpe
  1 sibling, 1 reply; 44+ messages in thread
From: Claudio Imbrenda @ 2023-01-30 18:21 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: David Hildenbrand, Alistair Popple, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM),
	Christian Borntraeger, Janosch Frank

On Thu, 26 Jan 2023 12:35:37 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Thu, Jan 26, 2023 at 04:39:02PM +0100, Claudio Imbrenda wrote:
> 
> > I can tell you that the original goal of that function is to make sure
> > that there are no extra references. In particular, we want to prevent
> > I/O of any kind from being ongoing while the page becomes secure. (The I/O
> > will fail and, depending on which device it was, the whole system might
> > end up in a rather unhappy state.)
> 
> Sure, but if there is concurrent IO you just try again right? It
> doesn't wait for refs to drop for instance.
> 
> So make the lock_page work the same way:

the more I look at this, the less I understand why I wrote the code
like that. I do have a comment, though

> 
> diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
> index 9f18a4af9c1319..847ee50b8672c6 100644
> --- a/arch/s390/kernel/uv.c
> +++ b/arch/s390/kernel/uv.c
> @@ -193,20 +193,11 @@ static int expected_page_refs(struct page *page)
>  }
>  
>  static int make_secure_pte(pte_t *ptep, unsigned long addr,
> -			   struct page *exp_page, struct uv_cb_header *uvcb)
> +			   struct page *page, struct uv_cb_header *uvcb)
>  {
>  	pte_t entry = READ_ONCE(*ptep);
> -	struct page *page;
>  	int expected, cc = 0;
>  
> -	if (!pte_present(entry))
> -		return -ENXIO;
> -	if (pte_val(entry) & _PAGE_INVALID)
> -		return -ENXIO;
> -
> -	page = pte_page(entry);
> -	if (page != exp_page)
> -		return -ENXIO;
>  	if (PageWriteback(page))
>  		return -EAGAIN;
>  	expected = expected_page_refs(page);
> @@ -304,17 +295,25 @@ int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
>  		goto out;
>  
>  	rc = -ENXIO;
> -	page = follow_page(vma, uaddr, FOLL_WRITE);
> -	if (IS_ERR_OR_NULL(page))
> -		goto out;
> -
> -	lock_page(page);
>  	ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);
> +
> +	if (!pte_present(entry))
> +		goto out_unlock_pte;
> +	if (pte_val(entry) & _PAGE_INVALID)
> +		goto out_unlock_pte;

I guess we also need to make sure the page was writable?
FOLL_WRITE made sure of that

so I guess something like:

if (!pte_write(entry))
	goto out_unlock_pte;

> +	page = pte_page(entry);
> +
> +	if (!trylock_page(page)) {
> +		rc = -EAGAIN;
> +		goto out_unlock_pte;
> +	}
> +
>  	if (should_export_before_import(uvcb, gmap->mm))
>  		uv_convert_from_secure(page_to_phys(page));
>  	rc = make_secure_pte(ptep, uaddr, page, uvcb);
> -	pte_unmap_unlock(ptep, ptelock);
>  	unlock_page(page);
> +out_unlock_pte:
> +	pte_unmap_unlock(ptep, ptelock);
>  out:
>  	mmap_read_unlock(gmap->mm);
>  



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 13/13] mm/gup: move private gup FOLL_ flags to internal.h
  2023-01-30 18:21                 ` Claudio Imbrenda
@ 2023-01-30 18:24                   ` Jason Gunthorpe
  2023-02-07 11:31                     ` Claudio Imbrenda
  0 siblings, 1 reply; 44+ messages in thread
From: Jason Gunthorpe @ 2023-01-30 18:24 UTC (permalink / raw)
  To: Claudio Imbrenda
  Cc: David Hildenbrand, Alistair Popple, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM),
	Christian Borntraeger, Janosch Frank

On Mon, Jan 30, 2023 at 07:21:04PM +0100, Claudio Imbrenda wrote:
> On Thu, 26 Jan 2023 12:35:37 -0400
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Thu, Jan 26, 2023 at 04:39:02PM +0100, Claudio Imbrenda wrote:
> > 
> > > I can tell you that the original goal of that function is to make sure
> > > that there are no extra references. In particular, we want to prevent
> > > I/O of any kind from being ongoing while the page becomes secure. (The I/O
> > > will fail and, depending on which device it was, the whole system might
> > > end up in a rather unhappy state.)
> > 
> > Sure, but if there is concurrent IO you just try again right? It
> > doesn't wait for refs to drop for instance.
> > 
> > So make the lock_page work the same way:
> 
> the more I look at this, the less I understand why I wrote the code
> like that. I do have a comment, though
> 
> > 
> > diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
> > index 9f18a4af9c1319..847ee50b8672c6 100644
> > --- a/arch/s390/kernel/uv.c
> > +++ b/arch/s390/kernel/uv.c
> > @@ -193,20 +193,11 @@ static int expected_page_refs(struct page *page)
> >  }
> >  
> >  static int make_secure_pte(pte_t *ptep, unsigned long addr,
> > -			   struct page *exp_page, struct uv_cb_header *uvcb)
> > +			   struct page *page, struct uv_cb_header *uvcb)
> >  {
> >  	pte_t entry = READ_ONCE(*ptep);
> > -	struct page *page;
> >  	int expected, cc = 0;
> >  
> > -	if (!pte_present(entry))
> > -		return -ENXIO;
> > -	if (pte_val(entry) & _PAGE_INVALID)
> > -		return -ENXIO;
> > -
> > -	page = pte_page(entry);
> > -	if (page != exp_page)
> > -		return -ENXIO;
> >  	if (PageWriteback(page))
> >  		return -EAGAIN;
> >  	expected = expected_page_refs(page);
> > @@ -304,17 +295,25 @@ int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
> >  		goto out;
> >  
> >  	rc = -ENXIO;
> > -	page = follow_page(vma, uaddr, FOLL_WRITE);
> > -	if (IS_ERR_OR_NULL(page))
> > -		goto out;
> > -
> > -	lock_page(page);
> >  	ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);
> > +
> > +	if (!pte_present(entry))
> > +		goto out_unlock_pte;
> > +	if (pte_val(entry) & _PAGE_INVALID)
> > +		goto out_unlock_pte;
> 
> I guess we also need to make sure the page was writable?
> FOLL_WRITE made sure of that
> 
> so I guess something like:
> 
> if (!pte_write(entry))
> 	goto out_unlock_pte;

Probably, that looks like an existing race, in that it wasn't re-checked :\

Jason


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 00/13] Simplify the external interface for GUP
  2023-01-24 20:34 [PATCH v2 00/13] Simplify the external interface for GUP Jason Gunthorpe
                   ` (12 preceding siblings ...)
  2023-01-24 20:34 ` [PATCH v2 13/13] mm/gup: move private gup FOLL_ flags to internal.h Jason Gunthorpe
@ 2023-02-06 23:46 ` Jason Gunthorpe
  13 siblings, 0 replies; 44+ messages in thread
From: Jason Gunthorpe @ 2023-02-06 23:46 UTC (permalink / raw)
  To: Alistair Popple, Andrew Morton, David Hildenbrand, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM)

On Tue, Jan 24, 2023 at 04:34:21PM -0400, Jason Gunthorpe wrote:
> It is quite a maze of EXPORTED symbols leading up to the three actual
> worker functions of GUP. Simplify this by reorganizing some of the code so
> the EXPORTED symbols directly call the correct internal function with
> validated and consistent arguments.
> 
> Consolidate all the assertions into one place at the top of the call
> chains.
> 
> Remove some dead code.
> 
> Move more things into the mm/internal.h header

Hi Andrew,

It has been about two weeks and I don't have a reason to respin it, so
please consider it for this cycle.

Thanks,
Jason


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 13/13] mm/gup: move private gup FOLL_ flags to internal.h
  2023-01-30 18:24                   ` Jason Gunthorpe
@ 2023-02-07 11:31                     ` Claudio Imbrenda
  2023-02-07 12:40                       ` Jason Gunthorpe
  0 siblings, 1 reply; 44+ messages in thread
From: Claudio Imbrenda @ 2023-02-07 11:31 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: David Hildenbrand, Alistair Popple, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM),
	Christian Borntraeger, Janosch Frank

On Mon, 30 Jan 2023 14:24:40 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Mon, Jan 30, 2023 at 07:21:04PM +0100, Claudio Imbrenda wrote:
> > On Thu, 26 Jan 2023 12:35:37 -0400
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >   
> > > On Thu, Jan 26, 2023 at 04:39:02PM +0100, Claudio Imbrenda wrote:
> > >   
> > > > I can tell you that the original goal of that function is to make sure
> > > > that there are no extra references. In particular, we want to prevent
> > > > I/O of any kind from being ongoing while the page becomes secure. (The I/O
> > > > will fail and, depending on which device it was, the whole system might
> > > > end up in a rather unhappy state.)
> > > 
> > > Sure, but if there is concurrent IO you just try again right? It
> > > doesn't wait for refs to drop for instance.
> > > 
> > > So make the lock_page work the same way:  
> > 
> > the more I look at this, the less I understand why I wrote the code
> > like that. I do have a comment, though
> >   
> > > 
> > > diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
> > > index 9f18a4af9c1319..847ee50b8672c6 100644
> > > --- a/arch/s390/kernel/uv.c
> > > +++ b/arch/s390/kernel/uv.c
> > > @@ -193,20 +193,11 @@ static int expected_page_refs(struct page *page)
> > >  }
> > >  
> > >  static int make_secure_pte(pte_t *ptep, unsigned long addr,
> > > -			   struct page *exp_page, struct uv_cb_header *uvcb)
> > > +			   struct page *page, struct uv_cb_header *uvcb)
> > >  {
> > >  	pte_t entry = READ_ONCE(*ptep);
> > > -	struct page *page;
> > >  	int expected, cc = 0;
> > >  
> > > -	if (!pte_present(entry))
> > > -		return -ENXIO;
> > > -	if (pte_val(entry) & _PAGE_INVALID)
> > > -		return -ENXIO;
> > > -
> > > -	page = pte_page(entry);
> > > -	if (page != exp_page)
> > > -		return -ENXIO;
> > >  	if (PageWriteback(page))
> > >  		return -EAGAIN;
> > >  	expected = expected_page_refs(page);
> > > @@ -304,17 +295,25 @@ int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
> > >  		goto out;
> > >  
> > >  	rc = -ENXIO;
> > > -	page = follow_page(vma, uaddr, FOLL_WRITE);
> > > -	if (IS_ERR_OR_NULL(page))
> > > -		goto out;
> > > -
> > > -	lock_page(page);
> > >  	ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);
> > > +
> > > +	if (!pte_present(entry))
> > > +		goto out_unlock_pte;
> > > +	if (pte_val(entry) & _PAGE_INVALID)
> > > +		goto out_unlock_pte;  
> > 
> > I guess we also need to make sure the page was writable?
> > FOLL_WRITE made sure of that
> > 
> > so I guess something like:
> > 
> > if (!pte_write(entry))
> > 	goto out_unlock_pte;  
> 
> Probably, that looks like an existing race, in that it wasn't re-checked :\
> 
> Jason

how do we want to proceed with this?

I can write a patch based on the above and see if anything breaks, but
it will take some thinking and testing and reviewing before I'll be
comfortable with it.


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 13/13] mm/gup: move private gup FOLL_ flags to internal.h
  2023-02-07 11:31                     ` Claudio Imbrenda
@ 2023-02-07 12:40                       ` Jason Gunthorpe
  0 siblings, 0 replies; 44+ messages in thread
From: Jason Gunthorpe @ 2023-02-07 12:40 UTC (permalink / raw)
  To: Claudio Imbrenda
  Cc: David Hildenbrand, Alistair Popple, David Howells,
	Christoph Hellwig, John Hubbard, linux-mm, Mike Rapoport (IBM),
	Christian Borntraeger, Janosch Frank

On Tue, Feb 07, 2023 at 12:31:12PM +0100, Claudio Imbrenda wrote:
> On Mon, 30 Jan 2023 14:24:40 -0400
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Mon, Jan 30, 2023 at 07:21:04PM +0100, Claudio Imbrenda wrote:
> > > On Thu, 26 Jan 2023 12:35:37 -0400
> > > Jason Gunthorpe <jgg@nvidia.com> wrote:
> > >   
> > > > On Thu, Jan 26, 2023 at 04:39:02PM +0100, Claudio Imbrenda wrote:
> > > >   
> > > > > I can tell you that the original goal of that function is to make sure
> > > > > that there are no extra references. In particular, we want to prevent
> > > > > I/O of any kind from being ongoing while the page becomes secure. (The I/O
> > > > > will fail and, depending on which device it was, the whole system might
> > > > > end up in a rather unhappy state.)
> > > > 
> > > > Sure, but if there is concurrent IO you just try again right? It
> > > > doesn't wait for refs to drop for instance.
> > > > 
> > > > So make the lock_page work the same way:  
> > > 
> > > the more I look at this, the less I understand why I wrote the code
> > > like that. I do have a comment, though
> > >   
> > > > 
> > > > diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
> > > > index 9f18a4af9c1319..847ee50b8672c6 100644
> > > > --- a/arch/s390/kernel/uv.c
> > > > +++ b/arch/s390/kernel/uv.c
> > > > @@ -193,20 +193,11 @@ static int expected_page_refs(struct page *page)
> > > >  }
> > > >  
> > > >  static int make_secure_pte(pte_t *ptep, unsigned long addr,
> > > > -			   struct page *exp_page, struct uv_cb_header *uvcb)
> > > > +			   struct page *page, struct uv_cb_header *uvcb)
> > > >  {
> > > >  	pte_t entry = READ_ONCE(*ptep);
> > > > -	struct page *page;
> > > >  	int expected, cc = 0;
> > > >  
> > > > -	if (!pte_present(entry))
> > > > -		return -ENXIO;
> > > > -	if (pte_val(entry) & _PAGE_INVALID)
> > > > -		return -ENXIO;
> > > > -
> > > > -	page = pte_page(entry);
> > > > -	if (page != exp_page)
> > > > -		return -ENXIO;
> > > >  	if (PageWriteback(page))
> > > >  		return -EAGAIN;
> > > >  	expected = expected_page_refs(page);
> > > > @@ -304,17 +295,25 @@ int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
> > > >  		goto out;
> > > >  
> > > >  	rc = -ENXIO;
> > > > -	page = follow_page(vma, uaddr, FOLL_WRITE);
> > > > -	if (IS_ERR_OR_NULL(page))
> > > > -		goto out;
> > > > -
> > > > -	lock_page(page);
> > > >  	ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);
> > > > +
> > > > +	if (!pte_present(entry))
> > > > +		goto out_unlock_pte;
> > > > +	if (pte_val(entry) & _PAGE_INVALID)
> > > > +		goto out_unlock_pte;  
> > > 
> > > I guess we also need to make sure the page was writable?
> > > FOLL_WRITE made sure of that
> > > 
> > > so I guess something like:
> > > 
> > > if (!pte_write(entry))
> > > 	goto out_unlock_pte;  
> > 
> > Probably, that looks like an existing race, in that it wasn't re-checked :\
> > 
> > Jason
> 
> how do we want to proceed with this?
> 
> I can write a patch based on the above and see if anything breaks, but
> it will take some thinking and testing and reviewing before I'll be
> comfortable with it.

I would be happy if you did that

I think the code is clearly racy as written now

Jason


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 02/13] mm/gup: remove obsolete FOLL_LONGTERM comment
  2023-01-24 20:34 ` [PATCH v2 02/13] mm/gup: remove obsolete FOLL_LONGTERM comment Jason Gunthorpe
  2023-01-25  2:13   ` John Hubbard
@ 2023-02-08 14:25   ` David Hildenbrand
  1 sibling, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2023-02-08 14:25 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alistair Popple, David Howells, Christoph Hellwig, John Hubbard,
	linux-mm, Mike Rapoport (IBM)

On 24.01.23 21:34, Jason Gunthorpe wrote:
> These days FOLL_LONGTERM is not allowed at all on any get_user_pages*()
> functions; it must only be used with pin_user_pages*(), plus it now has
> universal support for all the pin_user_pages*() functions.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   include/linux/mm_types.h | 6 ------
>   1 file changed, 6 deletions(-)
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index a689198caf7408..8971a40c120e38 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -1052,12 +1052,6 @@ typedef unsigned int __bitwise zap_flags_t;
>    * specifically failed.  Filesystem pages are still subject to bugs and use of
>    * FOLL_LONGTERM should be avoided on those pages.
>    *
> - * FIXME: Also NOTE that FOLL_LONGTERM is not supported in every GUP call.
> - * Currently only get_user_pages() and get_user_pages_fast() support this flag
> - * and calls to get_user_pages_[un]locked are specifically not allowed.  This
> - * is due to an incompatibility with the FS DAX check and
> - * FAULT_FLAG_ALLOW_RETRY.
> - *
>    * In the CMA case: long term pins in a CMA region would unnecessarily fragment
>    * that region.  And so, CMA attempts to migrate the page before pinning, when
>    * FOLL_LONGTERM is specified.

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 03/13] mm/gup: don't call __gup_longterm_locked() if FOLL_LONGTERM cannot be set
  2023-01-24 20:34 ` [PATCH v2 03/13] mm/gup: don't call __gup_longterm_locked() if FOLL_LONGTERM cannot be set Jason Gunthorpe
@ 2023-02-08 14:26   ` David Hildenbrand
  0 siblings, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2023-02-08 14:26 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alistair Popple, David Howells, Christoph Hellwig, John Hubbard,
	linux-mm, Mike Rapoport (IBM)

On 24.01.23 21:34, Jason Gunthorpe wrote:
> get_user_pages_remote(), get_user_pages_unlocked() and get_user_pages()
> are never called with FOLL_LONGTERM, so directly call
> __get_user_pages_locked()
> 
> The next patch will add an assertion for this.
> 
> Suggested-by: John Hubbard <jhubbard@nvidia.com>
> Reviewed-by: John Hubbard <jhubbard@nvidia.com>
> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   mm/gup.c | 12 ++++++------
>   1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/gup.c b/mm/gup.c
> index 7007b3afc4fda8..a6559d7243db92 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2200,8 +2200,8 @@ long get_user_pages_remote(struct mm_struct *mm,
>   	if (!is_valid_gup_flags(gup_flags))
>   		return -EINVAL;
>   
> -	return __gup_longterm_locked(mm, start, nr_pages, pages, vmas, locked,
> -				     gup_flags | FOLL_TOUCH | FOLL_REMOTE);
> +	return __get_user_pages_locked(mm, start, nr_pages, pages, vmas, locked,
> +				       gup_flags | FOLL_TOUCH | FOLL_REMOTE);
>   }
>   EXPORT_SYMBOL(get_user_pages_remote);
>   
> @@ -2238,8 +2238,8 @@ long get_user_pages(unsigned long start, unsigned long nr_pages,
>   	if (!is_valid_gup_flags(gup_flags))
>   		return -EINVAL;
>   
> -	return __gup_longterm_locked(current->mm, start, nr_pages,
> -				     pages, vmas, NULL, gup_flags | FOLL_TOUCH);
> +	return __get_user_pages_locked(current->mm, start, nr_pages, pages,
> +				       vmas, NULL, gup_flags | FOLL_TOUCH);
>   }
>   EXPORT_SYMBOL(get_user_pages);
>   
> @@ -2263,8 +2263,8 @@ long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
>   {
>   	int locked = 0;
>   
> -	return __gup_longterm_locked(current->mm, start, nr_pages, pages, NULL,
> -				     &locked, gup_flags | FOLL_TOUCH);
> +	return __get_user_pages_locked(current->mm, start, nr_pages, pages,
> +				       NULL, &locked, gup_flags | FOLL_TOUCH);
>   }
>   EXPORT_SYMBOL(get_user_pages_unlocked);
>   

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 04/13] mm/gup: move try_grab_page() to mm/internal.h
  2023-01-24 20:34 ` [PATCH v2 04/13] mm/gup: move try_grab_page() to mm/internal.h Jason Gunthorpe
  2023-01-25  2:15   ` John Hubbard
@ 2023-02-08 14:26   ` David Hildenbrand
  1 sibling, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2023-02-08 14:26 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alistair Popple, David Howells, Christoph Hellwig, John Hubbard,
	linux-mm, Mike Rapoport (IBM)

On 24.01.23 21:34, Jason Gunthorpe wrote:
> This is part of the internal functionality of gup.c and is only non-static
> so that the parts of gup.c in huge_memory.c and hugetlb.c can call it.
> 
> Put it in internal.h beside the similarly purposed try_grab_folio()
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   include/linux/mm.h | 2 --
>   mm/internal.h      | 1 +
>   2 files changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index c9db257f09b307..dfc2a88bc4a8ed 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1173,8 +1173,6 @@ static inline void get_page(struct page *page)
>   	folio_get(page_folio(page));
>   }
>   
> -int __must_check try_grab_page(struct page *page, unsigned int flags);
> -
>   static inline __must_check bool try_get_page(struct page *page)
>   {
>   	page = compound_head(page);
> diff --git a/mm/internal.h b/mm/internal.h
> index ce462bf145b441..0f035bcaf133f5 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -852,6 +852,7 @@ int migrate_device_coherent_page(struct page *page);
>    * mm/gup.c
>    */
>   struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags);
> +int __must_check try_grab_page(struct page *page, unsigned int flags);
>   
>   extern bool mirrored_kernelcore;
>   

Reviewed-by: David Hildenbrand <david@redhat.com>
-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2023-02-08 14:26 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-24 20:34 [PATCH v2 00/13] Simplify the external interface for GUP Jason Gunthorpe
2023-01-24 20:34 ` [PATCH v2 01/13] mm/gup: have internal functions get the mmap_read_lock() Jason Gunthorpe
2023-01-25  2:11   ` John Hubbard
2023-01-25  2:52     ` John Hubbard
2023-01-25 16:38     ` Jason Gunthorpe
2023-01-25 18:48       ` John Hubbard
2023-01-24 20:34 ` [PATCH v2 02/13] mm/gup: remove obsolete FOLL_LONGTERM comment Jason Gunthorpe
2023-01-25  2:13   ` John Hubbard
2023-02-08 14:25   ` David Hildenbrand
2023-01-24 20:34 ` [PATCH v2 03/13] mm/gup: don't call __gup_longterm_locked() if FOLL_LONGTERM cannot be set Jason Gunthorpe
2023-02-08 14:26   ` David Hildenbrand
2023-01-24 20:34 ` [PATCH v2 04/13] mm/gup: move try_grab_page() to mm/internal.h Jason Gunthorpe
2023-01-25  2:15   ` John Hubbard
2023-02-08 14:26   ` David Hildenbrand
2023-01-24 20:34 ` [PATCH v2 05/13] mm/gup: simplify the external interface functions and consolidate invariants Jason Gunthorpe
2023-01-25  2:30   ` John Hubbard
2023-01-24 20:34 ` [PATCH v2 06/13] mm/gup: add an assertion that the mmap lock is locked Jason Gunthorpe
2023-01-25  2:34   ` John Hubbard
2023-01-24 20:34 ` [PATCH v2 07/13] mm/gup: remove locked being NULL from faultin_vma_page_range() Jason Gunthorpe
2023-01-25  2:38   ` John Hubbard
2023-01-24 20:34 ` [PATCH v2 08/13] mm/gup: add FOLL_UNLOCKABLE Jason Gunthorpe
2023-01-24 20:34 ` [PATCH v2 09/13] mm/gup: make locked never NULL in the internal GUP functions Jason Gunthorpe
2023-01-25  3:00   ` John Hubbard
2023-01-24 20:34 ` [PATCH v2 10/13] mm/gup: remove pin_user_pages_fast_only() Jason Gunthorpe
2023-01-24 20:34 ` [PATCH v2 11/13] mm/gup: make get_user_pages_fast_only() return the common return value Jason Gunthorpe
2023-01-24 20:34 ` [PATCH v2 12/13] mm/gup: move gup_must_unshare() to mm/internal.h Jason Gunthorpe
2023-01-25  2:41   ` John Hubbard
2023-01-26 11:29   ` David Hildenbrand
2023-01-24 20:34 ` [PATCH v2 13/13] mm/gup: move private gup FOLL_ flags to internal.h Jason Gunthorpe
2023-01-25  2:44   ` John Hubbard
2023-01-26 12:48   ` David Hildenbrand
2023-01-26 12:55     ` Jason Gunthorpe
2023-01-26 13:06       ` David Hildenbrand
2023-01-26 14:41       ` Claudio Imbrenda
2023-01-26 14:46         ` David Hildenbrand
2023-01-26 15:05           ` Jason Gunthorpe
2023-01-26 15:39             ` Claudio Imbrenda
2023-01-26 16:35               ` Jason Gunthorpe
2023-01-26 17:24                 ` Claudio Imbrenda
2023-01-30 18:21                 ` Claudio Imbrenda
2023-01-30 18:24                   ` Jason Gunthorpe
2023-02-07 11:31                     ` Claudio Imbrenda
2023-02-07 12:40                       ` Jason Gunthorpe
2023-02-06 23:46 ` [PATCH v2 00/13] Simplify the external interface for GUP Jason Gunthorpe
