linux-mm.kvack.org archive mirror
* [PATCH v3 0/3] bio: Direct IO: convert to pin_user_pages_fast()
@ 2020-08-31  7:14 John Hubbard
  2020-08-31  7:14 ` [PATCH v3 1/3] mm/gup: introduce pin_page() John Hubbard
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: John Hubbard @ 2020-08-31  7:14 UTC
  To: Andrew Morton
  Cc: Alexander Viro, Christoph Hellwig, Ilya Dryomov, Jens Axboe,
	linux-xfs, linux-fsdevel, linux-block, linux-mm, LKML,
	John Hubbard

bio: convert get_user_pages_fast() --> pin_user_pages_fast()

Change the generic block/bio Direct IO routines to acquire FOLL_PIN user
pages via the recently added routines:

    iov_iter_pin_pages()
    iov_iter_pin_pages_alloc()
    pin_page()

This effectively converts several file systems (ext4, for example) that
use the common Direct IO routines.

Change the corresponding page release calls from put_page() to
unpin_user_page().

Change bio_release_pages() to handle FOLL_PIN pages. In fact, after this
series, that is the *only* type of page that bio_release_pages()
handles.
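
For reference, the converted release loop in bio_release_pages() (shown
in full in patch 3 below) becomes:

    bio_for_each_segment_all(bvec, bio, iter_all) {
        if (mark_dirty && !PageCompound(bvec->bv_page))
            set_page_dirty_lock(bvec->bv_page);
        unpin_user_page(bvec->bv_page);
    }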

Design notes
============

Quite a few approaches have been considered over the years. This one is
inspired by Christoph Hellwig's July 2019 observation that there are
only 5 ITER_ types, and we can simplify handling of them for Direct IO
[1]. Accordingly, this series implements the following pseudocode:

Direct IO behavior:

    ITER_IOVEC:
        pin_user_pages_fast();
        break;

    ITER_PIPE:
        for each page:
             pin_page();
        break;

    ITER_KVEC:    // already elevated page refcount, leave alone
    ITER_BVEC:    // already elevated page refcount, leave alone
    ITER_DISCARD: // discard
        return -EFAULT or -EINVAL;

...which works for callers that have already sorted out which case they
are in, such as Direct IO in the block/bio layers.

Note that this does leave ITER_KVEC and ITER_BVEC unconverted, for now.
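
In code, the type dispatch at the top of iov_iter_pin_pages() (patch 2,
below) follows that pseudocode directly; ITER_KVEC and ITER_BVEC land in
the WARN_ON_ONCE() case:

    if (unlikely(iov_iter_is_pipe(i)))
        return pipe_get_pages(i, pages, maxsize, maxpages, start, true);
    if (unlikely(iov_iter_is_discard(i)))
        return -EFAULT;
    if (WARN_ON_ONCE(!iter_is_iovec(i)))
        return -EFAULT;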

Page acquisition: the iov_iter_get_pages*() routines are at just the
right level in the call stack: the callers already know which system to
use, so it's a small change to drop in the replacement routines. They
are also a fan-in/fan-out point: block/bio call sites for Direct IO
funnel their page acquisitions through the iov_iter_get_pages*()
routines, which have many other callers as well. We can't convert all of
those callers at once--too many subsystems are involved, and the patch
would be too large and too risky.

Page release: separate release routines already exist (put_page() vs.
unpin_user_page()), so no new infrastructure is needed on the release
side; call sites simply switch to the matching routine.
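
To make the pairing explicit, here is a sketch (not taken verbatim from
any one call site) of the two acquire/release systems:

    /* FOLL_GET pages: */
    res = get_user_pages_fast(addr, n, flags, pages);
    ...
    put_page(pages[i]);

    /* FOLL_PIN pages, after this series: */
    res = pin_user_pages_fast(addr, n, flags, pages);
    ...
    unpin_user_page(pages[i]);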

[1] https://lore.kernel.org/kvm/20190724061750.GA19397@infradead.org/

[2] "Explicit pinning of user-space pages":
    https://lwn.net/Articles/807108/



John Hubbard (3):
  mm/gup: introduce pin_page()
  iov_iter: introduce iov_iter_pin_pages*() routines
  bio: convert get_user_pages_fast() --> pin_user_pages_fast()

 block/bio.c          |  24 ++++-----
 block/blk-map.c      |   6 +--
 fs/direct-io.c       |  28 +++++------
 fs/iomap/direct-io.c |   2 +-
 include/linux/mm.h   |   2 +
 include/linux/uio.h  |   5 ++
 lib/iov_iter.c       | 113 ++++++++++++++++++++++++++++++++++++++++---
 mm/gup.c             |  33 +++++++++++++
 8 files changed, 175 insertions(+), 38 deletions(-)

-- 
2.28.0




* [PATCH v3 1/3] mm/gup: introduce pin_page()
  2020-08-31  7:14 [PATCH v3 0/3] bio: Direct IO: convert to pin_user_pages_fast() John Hubbard
@ 2020-08-31  7:14 ` John Hubbard
  2020-08-31  7:14 ` [PATCH v3 2/3] iov_iter: introduce iov_iter_pin_pages*() routines John Hubbard
  2020-08-31  7:14 ` [PATCH v3 3/3] bio: convert get_user_pages_fast() --> pin_user_pages_fast() John Hubbard
  2 siblings, 0 replies; 6+ messages in thread
From: John Hubbard @ 2020-08-31  7:14 UTC
  To: Andrew Morton
  Cc: Alexander Viro, Christoph Hellwig, Ilya Dryomov, Jens Axboe,
	linux-xfs, linux-fsdevel, linux-block, linux-mm, LKML,
	John Hubbard

pin_page() is the FOLL_PIN equivalent of get_page().

This has always been a missing piece of the pin/unpin API (early
reviewers of pin_user_pages() asked about it, in fact), but until now it
just wasn't needed. Now that the Direct IO pieces in block/bio are about
to be converted to use FOLL_PIN, however, it turns out that some call
sites use both get_page() and get_user_pages_fast(). Converting those
sites requires a drop-in replacement for get_page(), which this patch
supplies.
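
For example, patch 3 below applies this in fs/direct-io.c's
dio_refill_pages(), whose write-error fallback page previously took a
get_page() reference alongside pages acquired via get_user_pages_fast():

    if (dio->page_errors == 0)
        dio->page_errors = ret;
    pin_page(page);        /* was: get_page(page) */
    dio->pages[0] = page;
    sdio->head = 0;
    sdio->tail = 1;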

[1] and [2] provide some background about the overall effort to convert
things to pin_user_page*() and unpin_user_page*().

[1] "Explicit pinning of user-space pages":
    https://lwn.net/Articles/807108/

[2] Documentation/core-api/pin_user_pages.rst

Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 include/linux/mm.h |  2 ++
 mm/gup.c           | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ca6e6a81576b..24240cf66c44 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1154,6 +1154,8 @@ static inline void get_page(struct page *page)
 	page_ref_inc(page);
 }
 
+void pin_page(struct page *page);
+
 bool __must_check try_grab_page(struct page *page, unsigned int flags);
 
 static inline __must_check bool try_get_page(struct page *page)
diff --git a/mm/gup.c b/mm/gup.c
index ae096ea7583f..a3a4bfae224a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -123,6 +123,39 @@ static __maybe_unused struct page *try_grab_compound_head(struct page *page,
 	return NULL;
 }
 
+/*
+ * pin_page() - elevate the page refcount and mark the page as FOLL_PIN
+ *
+ * This is the FOLL_PIN equivalent of get_page(). It is intended for use when the
+ * page will be released via unpin_user_page().
+ *
+ * Unlike pin_user_page*(), pin_page() may be used on nearly any page, not just
+ * userspace-allocated pages.
+ */
+void pin_page(struct page *page)
+{
+	int refs = 1;
+
+	page = compound_head(page);
+
+	VM_BUG_ON_PAGE(page_ref_count(page) <= 0, page);
+
+	if (hpage_pincount_available(page))
+		hpage_pincount_add(page, 1);
+	else
+		refs = GUP_PIN_COUNTING_BIAS;
+
+	/*
+	 * Similar to try_grab_compound_head(): even if using the
+	 * hpage_pincount_add/_sub() routines, be sure to
+	 * *also* increment the normal page refcount field at least
+	 * once, so that the page really is pinned.
+	 */
+	page_ref_add(page, refs);
+
+	mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_ACQUIRED, 1);
+}
+
 /**
  * try_grab_page() - elevate a page's refcount by a flag-dependent amount
  *
-- 
2.28.0




* [PATCH v3 2/3] iov_iter: introduce iov_iter_pin_pages*() routines
  2020-08-31  7:14 [PATCH v3 0/3] bio: Direct IO: convert to pin_user_pages_fast() John Hubbard
  2020-08-31  7:14 ` [PATCH v3 1/3] mm/gup: introduce pin_page() John Hubbard
@ 2020-08-31  7:14 ` John Hubbard
  2020-08-31  7:14 ` [PATCH v3 3/3] bio: convert get_user_pages_fast() --> pin_user_pages_fast() John Hubbard
  2 siblings, 0 replies; 6+ messages in thread
From: John Hubbard @ 2020-08-31  7:14 UTC
  To: Andrew Morton
  Cc: Alexander Viro, Christoph Hellwig, Ilya Dryomov, Jens Axboe,
	linux-xfs, linux-fsdevel, linux-block, linux-mm, LKML,
	John Hubbard

The new routines are:
    iov_iter_pin_pages()
    iov_iter_pin_pages_alloc()

and those correspond to these pre-existing routines:
    iov_iter_get_pages()
    iov_iter_get_pages_alloc()

Also, pipe_get_pages() and related routines are changed to pass down a
"use_pup" bool argument (use pin_page() instead of get_page()).

Unlike the iov_iter_get_pages*() routines, the iov_iter_pin_pages*()
routines assert that only ITER_IOVEC or ITER_PIPE items are passed in.
They then call pin_user_pages_fast() or pin_page(), instead of
get_user_pages_fast() or get_page().

Why: in order to incrementally change Direct IO callers from
get_user_pages_fast() and put_page() over to pin_user_pages_fast() and
unpin_user_page(), there need to be mid-level routines that specifically
call one system or the other, for both page acquisition and page
release.
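
A sketch of the resulting caller-side conversion (the real call sites
are in patch 3):

    /* Acquire: was iov_iter_get_pages(...): */
    size = iov_iter_pin_pages(iter, pages, LONG_MAX, nr_pages, &offset);
    if (unlikely(size <= 0))
        return size ? size : -EFAULT;
    ...
    /* Release: was put_page(page): */
    unpin_user_page(page);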

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 include/linux/uio.h |   5 ++
 lib/iov_iter.c      | 113 ++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 110 insertions(+), 8 deletions(-)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 3835a8a8e9ea..e44eed12afdf 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -229,6 +229,11 @@ int iov_iter_npages(const struct iov_iter *i, int maxpages);
 
 const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags);
 
+ssize_t iov_iter_pin_pages(struct iov_iter *i, struct page **pages,
+			size_t maxsize, unsigned int maxpages, size_t *start);
+ssize_t iov_iter_pin_pages_alloc(struct iov_iter *i, struct page ***pages,
+			size_t maxsize, size_t *start);
+
 static inline size_t iov_iter_count(const struct iov_iter *i)
 {
 	return i->count;
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 5e40786c8f12..2dc1f4812fa9 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1269,7 +1269,8 @@ static inline ssize_t __pipe_get_pages(struct iov_iter *i,
 				size_t maxsize,
 				struct page **pages,
 				int iter_head,
-				size_t *start)
+				size_t *start,
+				bool use_pup)
 {
 	struct pipe_inode_info *pipe = i->pipe;
 	unsigned int p_mask = pipe->ring_size - 1;
@@ -1280,7 +1281,14 @@ static inline ssize_t __pipe_get_pages(struct iov_iter *i,
 	maxsize = n;
 	n += *start;
 	while (n > 0) {
-		get_page(*pages++ = pipe->bufs[iter_head & p_mask].page);
+		struct page *page = pipe->bufs[iter_head & p_mask].page;
+
+		if (use_pup)
+			pin_page(page);
+		else
+			get_page(page);
+
+		*pages++ = page;
 		iter_head++;
 		n -= PAGE_SIZE;
 	}
@@ -1290,7 +1298,7 @@ static inline ssize_t __pipe_get_pages(struct iov_iter *i,
 
 static ssize_t pipe_get_pages(struct iov_iter *i,
 		   struct page **pages, size_t maxsize, unsigned maxpages,
-		   size_t *start)
+		   size_t *start, bool use_pup)
 {
 	unsigned int iter_head, npages;
 	size_t capacity;
@@ -1306,9 +1314,52 @@ static ssize_t pipe_get_pages(struct iov_iter *i,
 	npages = pipe_space_for_user(iter_head, i->pipe->tail, i->pipe);
 	capacity = min(npages, maxpages) * PAGE_SIZE - *start;
 
-	return __pipe_get_pages(i, min(maxsize, capacity), pages, iter_head, start);
+	return __pipe_get_pages(i, min(maxsize, capacity), pages, iter_head,
+				start, use_pup);
 }
 
+ssize_t iov_iter_pin_pages(struct iov_iter *i,
+		   struct page **pages, size_t maxsize, unsigned int maxpages,
+		   size_t *start)
+{
+	size_t skip = i->iov_offset;
+	const struct iovec *iov;
+	struct iovec v;
+
+	if (unlikely(iov_iter_is_pipe(i)))
+		return pipe_get_pages(i, pages, maxsize, maxpages, start, true);
+	if (unlikely(iov_iter_is_discard(i)))
+		return -EFAULT;
+	if (WARN_ON_ONCE(!iter_is_iovec(i)))
+		return -EFAULT;
+
+	if (unlikely(!maxsize))
+		return 0;
+	maxsize = min(maxsize, i->count);
+
+	iterate_iovec(i, maxsize, v, iov, skip, ({
+		unsigned long addr = (unsigned long)v.iov_base;
+		size_t len = v.iov_len + (*start = addr & (PAGE_SIZE - 1));
+		int n;
+		int res;
+
+		if (len > maxpages * PAGE_SIZE)
+			len = maxpages * PAGE_SIZE;
+		addr &= ~(PAGE_SIZE - 1);
+		n = DIV_ROUND_UP(len, PAGE_SIZE);
+
+		res = pin_user_pages_fast(addr, n,
+				iov_iter_rw(i) != WRITE ?  FOLL_WRITE : 0,
+				pages);
+		if (unlikely(res < 0))
+			return res;
+		return (res == n ? len : res * PAGE_SIZE) - *start;
+		0;
+	}))
+	return 0;
+}
+EXPORT_SYMBOL(iov_iter_pin_pages);
+
 ssize_t iov_iter_get_pages(struct iov_iter *i,
 		   struct page **pages, size_t maxsize, unsigned maxpages,
 		   size_t *start)
@@ -1317,7 +1368,7 @@ ssize_t iov_iter_get_pages(struct iov_iter *i,
 		maxsize = i->count;
 
 	if (unlikely(iov_iter_is_pipe(i)))
-		return pipe_get_pages(i, pages, maxsize, maxpages, start);
+		return pipe_get_pages(i, pages, maxsize, maxpages, start, false);
 	if (unlikely(iov_iter_is_discard(i)))
 		return -EFAULT;
 
@@ -1357,7 +1408,7 @@ static struct page **get_pages_array(size_t n)
 
 static ssize_t pipe_get_pages_alloc(struct iov_iter *i,
 		   struct page ***pages, size_t maxsize,
-		   size_t *start)
+		   size_t *start, bool use_pup)
 {
 	struct page **p;
 	unsigned int iter_head, npages;
@@ -1380,7 +1431,7 @@ static ssize_t pipe_get_pages_alloc(struct iov_iter *i,
 	p = get_pages_array(npages);
 	if (!p)
 		return -ENOMEM;
-	n = __pipe_get_pages(i, maxsize, p, iter_head, start);
+	n = __pipe_get_pages(i, maxsize, p, iter_head, start, use_pup);
 	if (n > 0)
 		*pages = p;
 	else
@@ -1388,6 +1439,52 @@ static ssize_t pipe_get_pages_alloc(struct iov_iter *i,
 	return n;
 }
 
+ssize_t iov_iter_pin_pages_alloc(struct iov_iter *i,
+		   struct page ***pages, size_t maxsize,
+		   size_t *start)
+{
+	struct page **p;
+	size_t skip = i->iov_offset;
+	const struct iovec *iov;
+	struct iovec v;
+
+	if (unlikely(iov_iter_is_pipe(i)))
+		return pipe_get_pages_alloc(i, pages, maxsize, start, true);
+	if (unlikely(iov_iter_is_discard(i)))
+		return -EFAULT;
+	if (WARN_ON_ONCE(!iter_is_iovec(i)))
+		return -EFAULT;
+
+	if (unlikely(!maxsize))
+		return 0;
+	maxsize = min(maxsize, i->count);
+
+	iterate_iovec(i, maxsize, v, iov, skip, ({
+		unsigned long addr = (unsigned long)v.iov_base;
+		size_t len = v.iov_len + (*start = addr & (PAGE_SIZE - 1));
+		int n;
+		int res;
+
+		addr &= ~(PAGE_SIZE - 1);
+		n = DIV_ROUND_UP(len, PAGE_SIZE);
+		p = get_pages_array(n);
+		if (!p)
+			return -ENOMEM;
+
+		res = pin_user_pages_fast(addr, n,
+				iov_iter_rw(i) != WRITE ?  FOLL_WRITE : 0, p);
+		if (unlikely(res < 0)) {
+			kvfree(p);
+			return res;
+		}
+		*pages = p;
+		return (res == n ? len : res * PAGE_SIZE) - *start;
+		0;
+	}))
+	return 0;
+}
+EXPORT_SYMBOL(iov_iter_pin_pages_alloc);
+
 ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
 		   struct page ***pages, size_t maxsize,
 		   size_t *start)
@@ -1398,7 +1495,7 @@ ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
 		maxsize = i->count;
 
 	if (unlikely(iov_iter_is_pipe(i)))
-		return pipe_get_pages_alloc(i, pages, maxsize, start);
+		return pipe_get_pages_alloc(i, pages, maxsize, start, false);
 	if (unlikely(iov_iter_is_discard(i)))
 		return -EFAULT;
 
-- 
2.28.0




* [PATCH v3 3/3] bio: convert get_user_pages_fast() --> pin_user_pages_fast()
  2020-08-31  7:14 [PATCH v3 0/3] bio: Direct IO: convert to pin_user_pages_fast() John Hubbard
  2020-08-31  7:14 ` [PATCH v3 1/3] mm/gup: introduce pin_page() John Hubbard
  2020-08-31  7:14 ` [PATCH v3 2/3] iov_iter: introduce iov_iter_pin_pages*() routines John Hubbard
@ 2020-08-31  7:14 ` John Hubbard
  2020-08-31 16:52   ` Ira Weiny
  2 siblings, 1 reply; 6+ messages in thread
From: John Hubbard @ 2020-08-31  7:14 UTC
  To: Andrew Morton
  Cc: Alexander Viro, Christoph Hellwig, Ilya Dryomov, Jens Axboe,
	linux-xfs, linux-fsdevel, linux-block, linux-mm, LKML,
	John Hubbard

Change the generic block/bio Direct IO routines to acquire FOLL_PIN user
pages via the recently added routines:

    iov_iter_pin_pages()
    iov_iter_pin_pages_alloc()
    pin_page()

This effectively converts several file systems (ext4, for example) that
use the common Direct IO routines.

Change the corresponding page release calls from put_page() to
unpin_user_page().

Change bio_release_pages() to handle FOLL_PIN pages. In fact, after this
patch, that is the *only* type of page that bio_release_pages()
handles.

Design notes
============

Quite a few approaches have been considered over the years. This one is
inspired by Christoph Hellwig's July 2019 observation that there are
only 5 ITER_ types, and we can simplify handling of them for Direct IO
[1]. Accordingly, this patch implements the following pseudocode:

Direct IO behavior:

    ITER_IOVEC:
        pin_user_pages_fast();
        break;

    ITER_PIPE:
        for each page:
             pin_page();
        break;

    ITER_KVEC:    // already elevated page refcount, leave alone
    ITER_BVEC:    // already elevated page refcount, leave alone
    ITER_DISCARD: // discard
        return -EFAULT or -EINVAL;

...which works for callers that have already sorted out which case they
are in, such as Direct IO in the block/bio layers.

Note that this does leave ITER_KVEC and ITER_BVEC unconverted, for now.

Page acquisition: the iov_iter_get_pages*() routines are at just the
right level in the call stack: the callers already know which system to
use, so it's a small change to drop in the replacement routines. They
are also a fan-in/fan-out point: block/bio call sites for Direct IO
funnel their page acquisitions through the iov_iter_get_pages*()
routines, which have many other callers as well. We can't convert all of
those callers at once--too many subsystems are involved, and the patch
would be too large and too risky.

Page release: separate release routines already exist (put_page() vs.
unpin_user_page()), so no new infrastructure is needed on the release
side; call sites simply switch to the matching routine.

[1] https://lore.kernel.org/kvm/20190724061750.GA19397@infradead.org/

[2] "Explicit pinning of user-space pages":
    https://lwn.net/Articles/807108/

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 block/bio.c          | 24 ++++++++++++------------
 block/blk-map.c      |  6 +++---
 fs/direct-io.c       | 28 ++++++++++++++--------------
 fs/iomap/direct-io.c |  2 +-
 4 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index a9931f23d933..76c5843f6050 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -955,7 +955,7 @@ void bio_release_pages(struct bio *bio, bool mark_dirty)
 	bio_for_each_segment_all(bvec, bio, iter_all) {
 		if (mark_dirty && !PageCompound(bvec->bv_page))
 			set_page_dirty_lock(bvec->bv_page);
-		put_page(bvec->bv_page);
+		unpin_user_page(bvec->bv_page);
 	}
 }
 EXPORT_SYMBOL_GPL(bio_release_pages);
@@ -986,9 +986,9 @@ static int __bio_iov_bvec_add_pages(struct bio *bio, struct iov_iter *iter)
  * @iter: iov iterator describing the region to be mapped
  *
  * Pins pages from *iter and appends them to @bio's bvec array. The
- * pages will have to be released using put_page() when done.
- * For multi-segment *iter, this function only adds pages from the
- * next non-empty segment of the iov iterator.
+ * pages will have to be released using put_page() or unpin_user_page() when
+ * done. For multi-segment *iter, this function only adds pages from the next
+ * non-empty segment of the iov iterator.
  */
 static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 {
@@ -1009,7 +1009,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2);
 	pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
 
-	size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset);
+	size = iov_iter_pin_pages(iter, pages, LONG_MAX, nr_pages, &offset);
 	if (unlikely(size <= 0))
 		return size ? size : -EFAULT;
 
@@ -1020,7 +1020,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 
 		if (__bio_try_merge_page(bio, page, len, offset, &same_page)) {
 			if (same_page)
-				put_page(page);
+				unpin_user_page(page);
 		} else {
 			if (WARN_ON_ONCE(bio_full(bio, len)))
                                 return -EINVAL;
@@ -1056,7 +1056,7 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
 	BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2);
 	pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
 
-	size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset);
+	size = iov_iter_pin_pages(iter, pages, LONG_MAX, nr_pages, &offset);
 	if (unlikely(size <= 0))
 		return size ? size : -EFAULT;
 
@@ -1069,7 +1069,7 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
 				max_append_sectors, &same_page) != len)
 			return -EINVAL;
 		if (same_page)
-			put_page(page);
+			unpin_user_page(page);
 		offset = 0;
 	}
 
@@ -1113,8 +1113,8 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 		} else {
 			if (is_bvec)
 				ret = __bio_iov_bvec_add_pages(bio, iter);
-			else
-				ret = __bio_iov_iter_get_pages(bio, iter);
+		else
+			ret = __bio_iov_iter_get_pages(bio, iter);
 		}
 	} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
 
@@ -1326,8 +1326,8 @@ void bio_set_pages_dirty(struct bio *bio)
  * the BIO and re-dirty the pages in process context.
  *
  * It is expected that bio_check_pages_dirty() will wholly own the BIO from
- * here on.  It will run one put_page() against each page and will run one
- * bio_put() against the BIO.
+ * here on. It will run one unpin_user_page() against each page, and will run
+ * one bio_put() against the BIO.
  */
 
 static void bio_dirty_fn(struct work_struct *work);
diff --git a/block/blk-map.c b/block/blk-map.c
index 6e804892d5ec..2057a8e5b4bc 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -275,7 +275,7 @@ static struct bio *bio_map_user_iov(struct request_queue *q,
 		size_t offs, added = 0;
 		int npages;
 
-		bytes = iov_iter_get_pages_alloc(iter, &pages, LONG_MAX, &offs);
+		bytes = iov_iter_pin_pages_alloc(iter, &pages, LONG_MAX, &offs);
 		if (unlikely(bytes <= 0)) {
 			ret = bytes ? bytes : -EFAULT;
 			goto out_unmap;
@@ -298,7 +298,7 @@ static struct bio *bio_map_user_iov(struct request_queue *q,
 				if (!bio_add_hw_page(q, bio, page, n, offs,
 						     max_sectors, &same_page)) {
 					if (same_page)
-						put_page(page);
+						unpin_user_page(page);
 					break;
 				}
 
@@ -312,7 +312,7 @@ static struct bio *bio_map_user_iov(struct request_queue *q,
 		 * release the pages we didn't map into the bio, if any
 		 */
 		while (j < npages)
-			put_page(pages[j++]);
+			unpin_user_page(pages[j++]);
 		kvfree(pages);
 		/* couldn't stuff something into bio? */
 		if (bytes)
diff --git a/fs/direct-io.c b/fs/direct-io.c
index 183299892465..5130ba32ae91 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -170,7 +170,7 @@ static inline int dio_refill_pages(struct dio *dio, struct dio_submit *sdio)
 {
 	ssize_t ret;
 
-	ret = iov_iter_get_pages(sdio->iter, dio->pages, LONG_MAX, DIO_PAGES,
+	ret = iov_iter_pin_pages(sdio->iter, dio->pages, LONG_MAX, DIO_PAGES,
 				&sdio->from);
 
 	if (ret < 0 && sdio->blocks_available && (dio->op == REQ_OP_WRITE)) {
@@ -182,7 +182,7 @@ static inline int dio_refill_pages(struct dio *dio, struct dio_submit *sdio)
 		 */
 		if (dio->page_errors == 0)
 			dio->page_errors = ret;
-		get_page(page);
+		pin_page(page);
 		dio->pages[0] = page;
 		sdio->head = 0;
 		sdio->tail = 1;
@@ -472,7 +472,7 @@ static inline void dio_bio_submit(struct dio *dio, struct dio_submit *sdio)
 static inline void dio_cleanup(struct dio *dio, struct dio_submit *sdio)
 {
 	while (sdio->head < sdio->tail)
-		put_page(dio->pages[sdio->head++]);
+		unpin_user_page(dio->pages[sdio->head++]);
 }
 
 /*
@@ -739,7 +739,7 @@ static inline int dio_bio_add_page(struct dio_submit *sdio)
 		 */
 		if ((sdio->cur_page_len + sdio->cur_page_offset) == PAGE_SIZE)
 			sdio->pages_in_io--;
-		get_page(sdio->cur_page);
+		pin_page(sdio->cur_page);
 		sdio->final_block_in_bio = sdio->cur_page_block +
 			(sdio->cur_page_len >> sdio->blkbits);
 		ret = 0;
@@ -853,13 +853,13 @@ submit_page_section(struct dio *dio, struct dio_submit *sdio, struct page *page,
 	 */
 	if (sdio->cur_page) {
 		ret = dio_send_cur_page(dio, sdio, map_bh);
-		put_page(sdio->cur_page);
+		unpin_user_page(sdio->cur_page);
 		sdio->cur_page = NULL;
 		if (ret)
 			return ret;
 	}
 
-	get_page(page);		/* It is in dio */
+	pin_page(page);		/* It is in dio */
 	sdio->cur_page = page;
 	sdio->cur_page_offset = offset;
 	sdio->cur_page_len = len;
@@ -874,7 +874,7 @@ submit_page_section(struct dio *dio, struct dio_submit *sdio, struct page *page,
 		ret = dio_send_cur_page(dio, sdio, map_bh);
 		if (sdio->bio)
 			dio_bio_submit(dio, sdio);
-		put_page(sdio->cur_page);
+		unpin_user_page(sdio->cur_page);
 		sdio->cur_page = NULL;
 	}
 	return ret;
@@ -974,7 +974,7 @@ static int do_direct_IO(struct dio *dio, struct dio_submit *sdio,
 
 				ret = get_more_blocks(dio, sdio, map_bh);
 				if (ret) {
-					put_page(page);
+					unpin_user_page(page);
 					goto out;
 				}
 				if (!buffer_mapped(map_bh))
@@ -1019,7 +1019,7 @@ static int do_direct_IO(struct dio *dio, struct dio_submit *sdio,
 
 				/* AKPM: eargh, -ENOTBLK is a hack */
 				if (dio->op == REQ_OP_WRITE) {
-					put_page(page);
+					unpin_user_page(page);
 					return -ENOTBLK;
 				}
 
@@ -1032,7 +1032,7 @@ static int do_direct_IO(struct dio *dio, struct dio_submit *sdio,
 				if (sdio->block_in_file >=
 						i_size_aligned >> blkbits) {
 					/* We hit eof */
-					put_page(page);
+					unpin_user_page(page);
 					goto out;
 				}
 				zero_user(page, from, 1 << blkbits);
@@ -1072,7 +1072,7 @@ static int do_direct_IO(struct dio *dio, struct dio_submit *sdio,
 						  sdio->next_block_for_io,
 						  map_bh);
 			if (ret) {
-				put_page(page);
+				unpin_user_page(page);
 				goto out;
 			}
 			sdio->next_block_for_io += this_chunk_blocks;
@@ -1087,8 +1087,8 @@ static int do_direct_IO(struct dio *dio, struct dio_submit *sdio,
 				break;
 		}
 
-		/* Drop the ref which was taken in get_user_pages() */
-		put_page(page);
+		/* Drop the ref which was taken in pin_user_pages() */
+		unpin_user_page(page);
 	}
 out:
 	return ret;
@@ -1327,7 +1327,7 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
 		ret2 = dio_send_cur_page(dio, &sdio, &map_bh);
 		if (retval == 0)
 			retval = ret2;
-		put_page(sdio.cur_page);
+		unpin_user_page(sdio.cur_page);
 		sdio.cur_page = NULL;
 	}
 	if (sdio.bio)
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index c1aafb2ab990..ea29f0892a8c 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -194,7 +194,7 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos,
 	bio->bi_private = dio;
 	bio->bi_end_io = iomap_dio_bio_end_io;
 
-	get_page(page);
+	pin_page(page);
 	__bio_add_page(bio, page, len, 0);
 	bio_set_op_attrs(bio, REQ_OP_WRITE, flags);
 	iomap_dio_submit_bio(dio, iomap, bio, pos);
-- 
2.28.0




* Re: [PATCH v3 3/3] bio: convert get_user_pages_fast() --> pin_user_pages_fast()
  2020-08-31  7:14 ` [PATCH v3 3/3] bio: convert get_user_pages_fast() --> pin_user_pages_fast() John Hubbard
@ 2020-08-31 16:52   ` Ira Weiny
  2020-08-31 19:09     ` John Hubbard
  0 siblings, 1 reply; 6+ messages in thread
From: Ira Weiny @ 2020-08-31 16:52 UTC
  To: John Hubbard
  Cc: Andrew Morton, Alexander Viro, Christoph Hellwig, Ilya Dryomov,
	Jens Axboe, linux-xfs, linux-fsdevel, linux-block, linux-mm,
	LKML

On Mon, Aug 31, 2020 at 12:14:39AM -0700, John Hubbard wrote:
> @@ -1113,8 +1113,8 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>  		} else {
>  			if (is_bvec)
>  				ret = __bio_iov_bvec_add_pages(bio, iter);
> -			else
> -				ret = __bio_iov_iter_get_pages(bio, iter);
> +		else
> +			ret = __bio_iov_iter_get_pages(bio, iter);

Why the white space change?

Ira




* Re: [PATCH v3 3/3] bio: convert get_user_pages_fast() --> pin_user_pages_fast()
  2020-08-31 16:52   ` Ira Weiny
@ 2020-08-31 19:09     ` John Hubbard
  0 siblings, 0 replies; 6+ messages in thread
From: John Hubbard @ 2020-08-31 19:09 UTC
  To: Ira Weiny
  Cc: Andrew Morton, Alexander Viro, Christoph Hellwig, Ilya Dryomov,
	Jens Axboe, linux-xfs, linux-fsdevel, linux-block, linux-mm,
	LKML

On 8/31/20 9:52 AM, Ira Weiny wrote:
> On Mon, Aug 31, 2020 at 12:14:39AM -0700, John Hubbard wrote:
>> @@ -1113,8 +1113,8 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>>   		} else {
>>   			if (is_bvec)
>>   				ret = __bio_iov_bvec_add_pages(bio, iter);
>> -			else
>> -				ret = __bio_iov_iter_get_pages(bio, iter);
>> +		else
>> +			ret = __bio_iov_iter_get_pages(bio, iter);
> 
> Why the white space change?
> 

Yikes, that's an oversight, and...yes, it goes all the way back to v1.
Thanks for spotting it!

There is not supposed to be any diff at all, in that region of code.
I'll restore the hunk to its rightful, undisturbed state, in v4.


thanks,
-- 
John Hubbard
NVIDIA


