linux-block.vger.kernel.org archive mirror
* [PATCH V2 0/2] block: fix page leak by merging to same page
@ 2019-06-10  4:18 Ming Lei
  2019-06-10  4:18 ` [PATCH V2 1/2] block: introduce 'enum bvec_merge_flags' for __bio_try_merge_page Ming Lei
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Ming Lei @ 2019-06-10  4:18 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Ming Lei, David Gibson, Darrick J. Wong, linux-xfs,
	Alexander Viro, Christoph Hellwig

Hi,

The 'pages' array retrieved by __bio_iov_iter_get_pages() may contain
several entries that point to the same page. bio_add_page() can then
merge them into a single bvec, and the page is leaked because
bio_release_pages() only drops the page ref once.

Fix this issue by dropping the extra page ref.
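
To make the accounting concrete, here is a minimal user-space sketch of
the leak. It is not kernel code: the toy_* types and helpers are
hypothetical stand-ins for struct page, struct bio_vec,
iov_iter_get_pages(), bio_add_page() and bio_release_pages(), and the
model assumes every bvec covers a single page.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for struct page and struct bio_vec; not the kernel types. */
struct toy_page { int refcount; };
struct toy_bvec { struct toy_page *page; unsigned int offset, len; };

/* Models iov_iter_get_pages(): one reference per returned array slot. */
static void toy_get_pages(struct toy_page **pages, size_t n)
{
	for (size_t i = 0; i < n; i++)
		pages[i]->refcount++;
}

/*
 * Models bio_add_page() before the fix: data that continues the last bvec
 * inside the same page is merged, and the reference taken for it is kept.
 * Returns the new number of bvecs.
 */
static size_t toy_add_page(struct toy_bvec *vecs, size_t cnt,
			   struct toy_page *page, unsigned int off,
			   unsigned int len)
{
	if (cnt > 0) {
		struct toy_bvec *last = &vecs[cnt - 1];

		if (last->page == page && last->offset + last->len == off) {
			last->len += len;	/* merged: no new bvec */
			return cnt;
		}
	}
	vecs[cnt] = (struct toy_bvec){ .page = page, .offset = off, .len = len };
	return cnt + 1;
}

/*
 * Models bio_release_pages(): one put per page covered by each bvec.
 * In this toy every bvec covers exactly one page, so one put per bvec.
 */
static void toy_release_pages(struct toy_bvec *vecs, size_t cnt)
{
	for (size_t i = 0; i < cnt; i++)
		vecs[i].page->refcount--;
}

/* Two iovecs covering adjacent 512-byte ranges of one page. */
static int toy_leaked_refs(void)
{
	struct toy_page page = { .refcount = 0 };
	struct toy_page *pages[2] = { &page, &page };
	struct toy_bvec vecs[2];
	size_t cnt = 0;

	toy_get_pages(pages, 2);	/* two references taken */
	cnt = toy_add_page(vecs, cnt, pages[0], 0, 512);
	cnt = toy_add_page(vecs, cnt, pages[1], 512, 512);	/* merged */
	toy_release_pages(vecs, cnt);	/* only one bvec, so only one put */
	return page.refcount;		/* references never dropped */
}
```

In this model toy_leaked_refs() reports one outstanding reference: two
were taken, but only one bvec exists at release time.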

V2:
	- V1 broke multi-page merging; fix that and only put the page ref
	if the added page is really the 'same page'


Ming Lei (2):
  block: introduce 'enum bvec_merge_flags' for __bio_try_merge_page
  block: fix page leak in case of merging to same page

 block/bio.c         | 32 ++++++++++++++++++++++----------
 fs/iomap.c          |  3 ++-
 fs/xfs/xfs_aops.c   |  3 ++-
 include/linux/bio.h |  9 ++++++++-
 4 files changed, 34 insertions(+), 13 deletions(-)

Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
-- 
2.20.1



* [PATCH V2 1/2] block: introduce 'enum bvec_merge_flags' for __bio_try_merge_page
  2019-06-10  4:18 [PATCH V2 0/2] block: fix page leak by merging to same page Ming Lei
@ 2019-06-10  4:18 ` Ming Lei
  2019-06-10  4:18 ` [PATCH V2 2/2] block: fix page leak in case of merging to same page Ming Lei
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2019-06-10  4:18 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Ming Lei, Darrick J. Wong, linux-xfs,
	Alexander Viro, Christoph Hellwig, David Gibson

Introduce 'enum bvec_merge_flags' and pass it to __bio_try_merge_page().
Callers need different handling of the page reference when a page is
merged into the same page of a bio (bvec), such as:

1) only merge to the same page without putting the reference of that
page, such as iomap & xfs

2) merge to the same page and put the reference of that page, such as
__bio_iov_iter_get_pages()
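
As a rough illustration of how a flags argument can select the merge
policy, here is a user-space sketch, not the kernel implementation. The
toy_* names are hypothetical, a page is represented only by its
page-aligned physical address, and the sketch therefore skips the
struct page adjacency check that page_is_mergeable() performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u
#define TOY_PAGE_MASK (~(uint64_t)(TOY_PAGE_SIZE - 1))

/* Hypothetical mirror of the two values this patch introduces. */
enum toy_merge_flags {
	TOY_MERGE_DEFAULT      = 0,
	TOY_MERGE_TO_SAME_PAGE = 1u << 0,
};

/* A bvec described by the physical address of its first page. */
struct toy_bvec { uint64_t page_phys; unsigned int offset, len; };

/*
 * Can data at (page_phys + off) be appended to @bv?
 * The data must start right after the last byte of @bv. With
 * TOY_MERGE_TO_SAME_PAGE set, it must also lie in the page @bv ends in.
 */
static bool toy_is_mergeable(const struct toy_bvec *bv, uint64_t page_phys,
			     unsigned int off, unsigned int flags)
{
	uint64_t vec_end = bv->page_phys + bv->offset + bv->len - 1;

	if (vec_end + 1 != page_phys + off)
		return false;	/* not physically contiguous */
	if ((vec_end & TOY_PAGE_MASK) != page_phys &&
	    (flags & TOY_MERGE_TO_SAME_PAGE))
		return false;	/* caller only accepts same-page merges */
	return true;
}
```

A filesystem-style caller would pass TOY_MERGE_TO_SAME_PAGE and get a
refusal for data that starts a new page, while a bio_add_page()-style
caller would pass TOY_MERGE_DEFAULT and accept any contiguous data.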

Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/bio.c         | 20 ++++++++++++--------
 fs/iomap.c          |  3 ++-
 fs/xfs/xfs_aops.c   |  3 ++-
 include/linux/bio.h |  8 +++++++-
 4 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 683cbb40f051..39e3b931dc3b 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -636,7 +636,7 @@ EXPORT_SYMBOL(bio_clone_fast);
 
 static inline bool page_is_mergeable(const struct bio_vec *bv,
 		struct page *page, unsigned int len, unsigned int off,
-		bool same_page)
+		enum bvec_merge_flags flags)
 {
 	phys_addr_t vec_end_addr = page_to_phys(bv->bv_page) +
 		bv->bv_offset + bv->bv_len - 1;
@@ -648,13 +648,14 @@ static inline bool page_is_mergeable(const struct bio_vec *bv,
 		return false;
 
 	if ((vec_end_addr & PAGE_MASK) != page_addr) {
-		if (same_page)
+		if (flags & BVEC_MERGE_TO_SAME_PAGE)
 			return false;
 		if (pfn_to_page(PFN_DOWN(vec_end_addr)) + 1 != page)
 			return false;
 	}
 
-	WARN_ON_ONCE(same_page && (len + off) > PAGE_SIZE);
+	WARN_ON_ONCE((flags & BVEC_MERGE_TO_SAME_PAGE) &&
+			(len + off) > PAGE_SIZE);
 
 	return true;
 }
@@ -729,8 +730,9 @@ static int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
 		if (bvec_gap_to_prev(q, bvec, offset))
 			return 0;
 
-		if (page_is_mergeable(bvec, page, len, offset, false) &&
-		    can_add_page_to_seg(q, bvec, page, len, offset)) {
+		if (page_is_mergeable(bvec, page, len, offset,
+		    BVEC_MERGE_DEFAULT) && can_add_page_to_seg(q, bvec,
+		    page, len, offset)) {
 			bvec->bv_len += len;
 			goto done;
 		}
@@ -779,7 +781,8 @@ EXPORT_SYMBOL(bio_add_pc_page);
  * Return %true on success or %false on failure.
  */
 bool __bio_try_merge_page(struct bio *bio, struct page *page,
-		unsigned int len, unsigned int off, bool same_page)
+		unsigned int len, unsigned int off,
+		enum bvec_merge_flags flags)
 {
 	if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
 		return false;
@@ -787,7 +790,7 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page,
 	if (bio->bi_vcnt > 0) {
 		struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
 
-		if (page_is_mergeable(bv, page, len, off, same_page)) {
+		if (page_is_mergeable(bv, page, len, off, flags)) {
 			bv->bv_len += len;
 			bio->bi_iter.bi_size += len;
 			return true;
@@ -837,7 +840,8 @@ EXPORT_SYMBOL_GPL(__bio_add_page);
 int bio_add_page(struct bio *bio, struct page *page,
 		 unsigned int len, unsigned int offset)
 {
-	if (!__bio_try_merge_page(bio, page, len, offset, false)) {
+	if (!__bio_try_merge_page(bio, page, len, offset,
+				  BVEC_MERGE_DEFAULT)) {
 		if (bio_full(bio))
 			return 0;
 		__bio_add_page(bio, page, len, offset);
diff --git a/fs/iomap.c b/fs/iomap.c
index 23ef63fd1669..e04652bbf92a 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -316,7 +316,8 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 	 */
 	sector = iomap_sector(iomap, pos);
 	if (ctx->bio && bio_end_sector(ctx->bio) == sector) {
-		if (__bio_try_merge_page(ctx->bio, page, plen, poff, true))
+		if (__bio_try_merge_page(ctx->bio, page, plen, poff,
+					 BVEC_MERGE_TO_SAME_PAGE))
 			goto done;
 		is_contig = true;
 	}
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index a6f0f4761a37..7e7385bc3b9e 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -774,7 +774,8 @@ xfs_add_to_ioend(
 				wpc->imap.br_state, offset, bdev, sector);
 	}
 
-	if (!__bio_try_merge_page(wpc->ioend->io_bio, page, len, poff, true)) {
+	if (!__bio_try_merge_page(wpc->ioend->io_bio, page, len, poff,
+				BVEC_MERGE_TO_SAME_PAGE)) {
 		if (iop)
 			atomic_inc(&iop->write_count);
 		if (bio_full(wpc->ioend->io_bio))
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 0f23b5682640..48a95bca1703 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -419,11 +419,17 @@ extern void bio_uninit(struct bio *);
 extern void bio_reset(struct bio *);
 void bio_chain(struct bio *, struct bio *);
 
+enum bvec_merge_flags {
+	BVEC_MERGE_DEFAULT,
+	BVEC_MERGE_TO_SAME_PAGE = BIT(0),
+};
+
 extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
 extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
 			   unsigned int, unsigned int);
 bool __bio_try_merge_page(struct bio *bio, struct page *page,
-		unsigned int len, unsigned int off, bool same_page);
+		unsigned int len, unsigned int off,
+		enum bvec_merge_flags flags);
 void __bio_add_page(struct bio *bio, struct page *page,
 		unsigned int len, unsigned int off);
 int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter);
-- 
2.20.1



* [PATCH V2 2/2] block: fix page leak in case of merging to same page
  2019-06-10  4:18 [PATCH V2 0/2] block: fix page leak by merging to same page Ming Lei
  2019-06-10  4:18 ` [PATCH V2 1/2] block: introduce 'enum bvec_merge_flags' for __bio_try_merge_page Ming Lei
@ 2019-06-10  4:18 ` Ming Lei
  2019-06-10  8:37 ` [PATCH V2 0/2] block: fix page leak by " Ming Lei
  2019-06-10 13:34 ` Christoph Hellwig
  3 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2019-06-10  4:18 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Ming Lei, David Gibson, Darrick J. Wong, linux-xfs,
	Alexander Viro, Christoph Hellwig

Different iovecs may use the same page, so the 'pages' array filled
by iov_iter_get_pages() may hold several references to that page.
If some elements of 'pages' are merged into the same page of one
bvec by bio_add_page(), bio_release_pages() only drops the page's
reference once.

This causes the page leak reported by David Gibson.

This issue can be triggered since 576ed913 ("block: use bio_add_page in
bio_iov_iter_get_pages").

Fix the issue by putting the page's ref if it is merged to the same page.
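
The effect of the fix on the reference count can be sketched in user
space as follows. This is an illustrative model under stated
assumptions, not the patch itself: the toy_* names are hypothetical,
each bvec covers a single page, and release drops one reference per
bvec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for struct page and struct bio_vec; not the kernel types. */
struct toy_page { int refcount; };
struct toy_bvec { struct toy_page *page; unsigned int offset, len; };

/*
 * Try to merge into the last bvec. When the data lands in the page that
 * bvec already holds, the bvec's existing reference is enough, so the
 * caller's extra reference is dropped here (the behaviour requested by
 * BVEC_MERGE_PUT_SAME_PAGE in the description above).
 */
static bool toy_try_merge_put(struct toy_bvec *last, struct toy_page *page,
			      unsigned int off, unsigned int len)
{
	if (last->page != page || last->offset + last->len != off)
		return false;
	page->refcount--;	/* drop the now redundant reference */
	last->len += len;
	return true;
}

/*
 * One page referenced twice: the first 512 bytes are already in a bvec,
 * the second chunk starts at @second_off. Returns the reference count
 * left after the release step.
 */
static int toy_refs_after_io(unsigned int second_off)
{
	struct toy_page page = { .refcount = 2 };
	struct toy_bvec vecs[2] = { { &page, 0, 512 } };
	size_t cnt = 1;

	/* Fall back to a new bvec when merging is not possible. */
	if (!toy_try_merge_put(&vecs[0], &page, second_off, 512))
		vecs[cnt++] = (struct toy_bvec){ &page, second_off, 512 };

	for (size_t i = 0; i < cnt; i++)	/* release: one put per bvec */
		vecs[i].page->refcount--;
	return page.refcount;
}
```

Both paths balance in this model: a merged chunk drops its reference at
merge time, and an unmerged chunk drops it at release time.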

Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Link: https://lkml.org/lkml/2019/4/23/64
Fixes: 576ed913 ("block: use bio_add_page in bio_iov_iter_get_pages")
Reported-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/bio.c         | 12 ++++++++++--
 include/linux/bio.h |  1 +
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 39e3b931dc3b..07a15abc3d11 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -652,6 +652,9 @@ static inline bool page_is_mergeable(const struct bio_vec *bv,
 			return false;
 		if (pfn_to_page(PFN_DOWN(vec_end_addr)) + 1 != page)
 			return false;
+	/* drop page ref if the page has been added and user asks to do that */
+	} else if (flags & BVEC_MERGE_PUT_SAME_PAGE) {
+		put_page(page);
 	}
 
 	WARN_ON_ONCE((flags & BVEC_MERGE_TO_SAME_PAGE) &&
@@ -924,8 +927,13 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 		struct page *page = pages[i];
 
 		len = min_t(size_t, PAGE_SIZE - offset, left);
-		if (WARN_ON_ONCE(bio_add_page(bio, page, len, offset) != len))
-			return -EINVAL;
+
+		if (!__bio_try_merge_page(bio, page, len, offset,
+					BVEC_MERGE_PUT_SAME_PAGE)) {
+			if (WARN_ON_ONCE(bio_add_page(bio, page, len, offset)
+						!= len))
+				return -EINVAL;
+		}
 		offset = 0;
 	}
 
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 48a95bca1703..dec6cf683d8e 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -422,6 +422,7 @@ void bio_chain(struct bio *, struct bio *);
 enum bvec_merge_flags {
 	BVEC_MERGE_DEFAULT,
 	BVEC_MERGE_TO_SAME_PAGE = BIT(0),
+	BVEC_MERGE_PUT_SAME_PAGE = BIT(1),
 };
 
 extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
-- 
2.20.1



* Re: [PATCH V2 0/2] block: fix page leak by merging to same page
  2019-06-10  4:18 [PATCH V2 0/2] block: fix page leak by merging to same page Ming Lei
  2019-06-10  4:18 ` [PATCH V2 1/2] block: introduce 'enum bvec_merge_flags' for __bio_try_merge_page Ming Lei
  2019-06-10  4:18 ` [PATCH V2 2/2] block: fix page leak in case of merging to same page Ming Lei
@ 2019-06-10  8:37 ` Ming Lei
  2019-06-10 13:34 ` Christoph Hellwig
  3 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2019-06-10  8:37 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, David Gibson, Darrick J. Wong, linux-xfs,
	Alexander Viro, Christoph Hellwig

On Mon, Jun 10, 2019 at 12:18:17PM +0800, Ming Lei wrote:
> Hi,
> 
> The 'pages' array retrieved by __bio_iov_iter_get_pages() may contain
> several entries that point to the same page. bio_add_page() can then
> merge them into a single bvec, and the page is leaked because
> bio_release_pages() only drops the page ref once.
> 
> Fix this issue by dropping the extra page ref.
> 
> V2:
> 	- V1 broke multi-page merging; fix that and only put the page ref
> 	if the added page is really the 'same page'
> 
> 
> Ming Lei (2):
>   block: introduce 'enum bvec_merge_flags' for __bio_try_merge_page
>   block: fix page leak in case of merging to same page
> 
>  block/bio.c         | 32 ++++++++++++++++++++++----------
>  fs/iomap.c          |  3 ++-
>  fs/xfs/xfs_aops.c   |  3 ++-
>  include/linux/bio.h |  9 ++++++++-
>  4 files changed, 34 insertions(+), 13 deletions(-)
> 
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
> Cc: linux-xfs@vger.kernel.org
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Christoph Hellwig <hch@infradead.org>
> -- 
> 2.20.1
> 

Please ignore V2; I will improve it a bit and post V3.


Thanks
Ming


* Re: [PATCH V2 0/2] block: fix page leak by merging to same page
  2019-06-10  4:18 [PATCH V2 0/2] block: fix page leak by merging to same page Ming Lei
                   ` (2 preceding siblings ...)
  2019-06-10  8:37 ` [PATCH V2 0/2] block: fix page leak by " Ming Lei
@ 2019-06-10 13:34 ` Christoph Hellwig
  2019-06-10 15:09   ` Ming Lei
  3 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2019-06-10 13:34 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, David Gibson, Darrick J. Wong,
	linux-xfs, Alexander Viro, Christoph Hellwig

I don't really like the magic enum types.  I'd rather go back to my
initial idea to turn the same_page argument into an output parameter,
so that the callers can act upon it.  Untested patch below:


diff --git a/block/bio.c b/block/bio.c
index 683cbb40f051..d4999ef3b1fb 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -636,7 +636,7 @@ EXPORT_SYMBOL(bio_clone_fast);
 
 static inline bool page_is_mergeable(const struct bio_vec *bv,
 		struct page *page, unsigned int len, unsigned int off,
-		bool same_page)
+		bool *same_page)
 {
 	phys_addr_t vec_end_addr = page_to_phys(bv->bv_page) +
 		bv->bv_offset + bv->bv_len - 1;
@@ -647,26 +647,17 @@ static inline bool page_is_mergeable(const struct bio_vec *bv,
 	if (xen_domain() && !xen_biovec_phys_mergeable(bv, page))
 		return false;
 
-	if ((vec_end_addr & PAGE_MASK) != page_addr) {
-		if (same_page)
-			return false;
-		if (pfn_to_page(PFN_DOWN(vec_end_addr)) + 1 != page)
-			return false;
-	}
-
-	WARN_ON_ONCE(same_page && (len + off) > PAGE_SIZE);
-
+	*same_page = ((vec_end_addr & PAGE_MASK) == page_addr);
+	if (!*same_page && pfn_to_page(PFN_DOWN(vec_end_addr)) + 1 != page)
+		return false;
 	return true;
 }
 
-/*
- * Check if the @page can be added to the current segment(@bv), and make
- * sure to call it only if page_is_mergeable(@bv, @page) is true
- */
-static bool can_add_page_to_seg(struct request_queue *q,
-		struct bio_vec *bv, struct page *page, unsigned len,
-		unsigned offset)
+static bool bio_try_merge_pc_page(struct request_queue *q, struct bio *bio,
+		struct page *page, unsigned len, unsigned offset,
+		bool *same_page)
 {
+	struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
 	unsigned long mask = queue_segment_boundary(q);
 	phys_addr_t addr1 = page_to_phys(bv->bv_page) + bv->bv_offset;
 	phys_addr_t addr2 = page_to_phys(page) + offset + len - 1;
@@ -677,7 +668,13 @@ static bool can_add_page_to_seg(struct request_queue *q,
 	if (bv->bv_len + len > queue_max_segment_size(q))
 		return false;
 
-	return true;
+	/*
+	 * If the queue doesn't support SG gaps and adding this
+	 * offset would create a gap, disallow it.
+	 */
+	if (bvec_gap_to_prev(q, bv, offset))
+		return false;
+	return __bio_try_merge_page(bio, page, len, offset, same_page);
 }
 
 /**
@@ -701,6 +698,7 @@ static int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
 		bool put_same_page)
 {
 	struct bio_vec *bvec;
+	bool same_page = false;
 
 	/*
 	 * cloned bio must not modify vec list
@@ -711,29 +709,11 @@ static int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
 	if (((bio->bi_iter.bi_size + len) >> 9) > queue_max_hw_sectors(q))
 		return 0;
 
-	if (bio->bi_vcnt > 0) {
-		bvec = &bio->bi_io_vec[bio->bi_vcnt - 1];
-
-		if (page == bvec->bv_page &&
-		    offset == bvec->bv_offset + bvec->bv_len) {
-			if (put_same_page)
-				put_page(page);
-			bvec->bv_len += len;
-			goto done;
-		}
-
-		/*
-		 * If the queue doesn't support SG gaps and adding this
-		 * offset would create a gap, disallow it.
-		 */
-		if (bvec_gap_to_prev(q, bvec, offset))
-			return 0;
-
-		if (page_is_mergeable(bvec, page, len, offset, false) &&
-		    can_add_page_to_seg(q, bvec, page, len, offset)) {
-			bvec->bv_len += len;
-			goto done;
-		}
+	if (bio->bi_vcnt > 0 &&
+	    bio_try_merge_pc_page(q, bio, page, len, offset, &same_page)) {
+		if (put_same_page && same_page)
+			put_page(page);
+		goto done;
 	}
 
 	if (bio_full(bio))
@@ -747,8 +727,8 @@ static int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
 	bvec->bv_len = len;
 	bvec->bv_offset = offset;
 	bio->bi_vcnt++;
- done:
 	bio->bi_iter.bi_size += len;
+ done:
 	bio->bi_phys_segments = bio->bi_vcnt;
 	bio_set_flag(bio, BIO_SEG_VALID);
 	return len;
@@ -767,8 +747,7 @@ EXPORT_SYMBOL(bio_add_pc_page);
  * @page: start page to add
  * @len: length of the data to add
  * @off: offset of the data relative to @page
- * @same_page: if %true only merge if the new data is in the same physical
- *		page as the last segment of the bio.
+ * @same_page: return if the segment has been merged inside the same page
  *
  * Try to add the data at @page + @off to the last bvec of @bio.  This is a
  * a useful optimisation for file systems with a block size smaller than the
@@ -779,7 +758,7 @@ EXPORT_SYMBOL(bio_add_pc_page);
  * Return %true on success or %false on failure.
  */
 bool __bio_try_merge_page(struct bio *bio, struct page *page,
-		unsigned int len, unsigned int off, bool same_page)
+		unsigned int len, unsigned int off, bool *same_page)
 {
 	if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
 		return false;
@@ -837,7 +816,9 @@ EXPORT_SYMBOL_GPL(__bio_add_page);
 int bio_add_page(struct bio *bio, struct page *page,
 		 unsigned int len, unsigned int offset)
 {
-	if (!__bio_try_merge_page(bio, page, len, offset, false)) {
+	bool same_page = false;
+
+	if (!__bio_try_merge_page(bio, page, len, offset, &same_page)) {
 		if (bio_full(bio))
 			return 0;
 		__bio_add_page(bio, page, len, offset);
@@ -900,6 +881,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt;
 	struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;
 	struct page **pages = (struct page **)bv;
+	bool same_page = false;
 	ssize_t size, left;
 	unsigned len, i;
 	size_t offset;
@@ -920,8 +902,15 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 		struct page *page = pages[i];
 
 		len = min_t(size_t, PAGE_SIZE - offset, left);
-		if (WARN_ON_ONCE(bio_add_page(bio, page, len, offset) != len))
-			return -EINVAL;
+
+		if (__bio_try_merge_page(bio, page, len, offset, &same_page)) {
+			if (same_page)
+				put_page(page);
+		} else {
+			if (WARN_ON_ONCE(bio_full(bio)))
+				return -EINVAL;
+			__bio_add_page(bio, page, len, offset);
+		}
 		offset = 0;
 	}
 
diff --git a/fs/iomap.c b/fs/iomap.c
index 23ef63fd1669..12654c2e78f8 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -287,7 +287,7 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 	struct iomap_readpage_ctx *ctx = data;
 	struct page *page = ctx->cur_page;
 	struct iomap_page *iop = iomap_page_create(inode, page);
-	bool is_contig = false;
+	bool same_page = false, is_contig = false;
 	loff_t orig_pos = pos;
 	unsigned poff, plen;
 	sector_t sector;
@@ -315,10 +315,14 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 	 * Try to merge into a previous segment if we can.
 	 */
 	sector = iomap_sector(iomap, pos);
-	if (ctx->bio && bio_end_sector(ctx->bio) == sector) {
-		if (__bio_try_merge_page(ctx->bio, page, plen, poff, true))
-			goto done;
+	if (ctx->bio && bio_end_sector(ctx->bio) == sector)
 		is_contig = true;
+
+	if (is_contig &&
+	    __bio_try_merge_page(ctx->bio, page, plen, poff, &same_page)) {
+		if (!same_page && iop)
+			atomic_inc(&iop->read_count);
+		goto done;
 	}
 
 	/*
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index a6f0f4761a37..8da5e6637771 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -758,6 +758,7 @@ xfs_add_to_ioend(
 	struct block_device	*bdev = xfs_find_bdev_for_inode(inode);
 	unsigned		len = i_blocksize(inode);
 	unsigned		poff = offset & (PAGE_SIZE - 1);
+	bool			merged, same_page = false;
 	sector_t		sector;
 
 	sector = xfs_fsb_to_db(ip, wpc->imap.br_startblock) +
@@ -774,9 +775,13 @@ xfs_add_to_ioend(
 				wpc->imap.br_state, offset, bdev, sector);
 	}
 
-	if (!__bio_try_merge_page(wpc->ioend->io_bio, page, len, poff, true)) {
-		if (iop)
-			atomic_inc(&iop->write_count);
+	merged = __bio_try_merge_page(wpc->ioend->io_bio, page, len, poff,
+			&same_page);
+
+	if (iop && !same_page)
+		atomic_inc(&iop->write_count);
+
+	if (!merged) {
 		if (bio_full(wpc->ioend->io_bio))
 			xfs_chain_bio(wpc->ioend, wbc, bdev, sector);
 		bio_add_page(wpc->ioend->io_bio, page, len, poff);
diff --git a/include/linux/bio.h b/include/linux/bio.h
index ea73df36529a..3df3b127b394 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -423,7 +423,7 @@ extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
 extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
 			   unsigned int, unsigned int);
 bool __bio_try_merge_page(struct bio *bio, struct page *page,
-		unsigned int len, unsigned int off, bool same_page);
+		unsigned int len, unsigned int off, bool *same_page);
 void __bio_add_page(struct bio *bio, struct page *page,
 		unsigned int len, unsigned int off);
 int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter);
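
The output-parameter idea can be sketched in user space as follows. This
is a hedged illustration, not the patch above: the toy_* names are
hypothetical, a page is represented only by its page-aligned physical
address, and the contiguity test is assumed to match the one in
page_is_mergeable().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u
#define TOY_PAGE_MASK (~(uint64_t)(TOY_PAGE_SIZE - 1))

/* A bvec described by the physical address of its first page. */
struct toy_bvec { uint64_t page_phys; unsigned int offset, len; };

/*
 * Merge if the new data directly follows @bv. The callee no longer
 * applies a policy; it only reports through @same_page whether the data
 * landed in the page @bv ended in, and the caller decides what to do.
 */
static bool toy_try_merge(struct toy_bvec *bv, uint64_t page_phys,
			  unsigned int off, unsigned int len, bool *same_page)
{
	uint64_t vec_end = bv->page_phys + bv->offset + bv->len - 1;

	if (vec_end + 1 != page_phys + off)
		return false;	/* not contiguous: *same_page is untouched */
	*same_page = (vec_end & TOY_PAGE_MASK) == page_phys;
	bv->len += len;
	return true;
}

/*
 * Caller policy in the style of __bio_iov_iter_get_pages(): returns how
 * many page references the caller should drop after the attempt.
 */
static int toy_refs_to_put(struct toy_bvec *bv, uint64_t page_phys,
			   unsigned int off, unsigned int len)
{
	bool same_page = false;

	if (toy_try_merge(bv, page_phys, off, len, &same_page) && same_page)
		return 1;
	return 0;
}
```

The design choice is that one helper serves every caller: a caller that
holds an extra reference drops it when same_page is set, and a caller
that counts per-page I/O skips its increment in that case.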


* Re: [PATCH V2 0/2] block: fix page leak by merging to same page
  2019-06-10 13:34 ` Christoph Hellwig
@ 2019-06-10 15:09   ` Ming Lei
  2019-06-11  7:45     ` Christoph Hellwig
  0 siblings, 1 reply; 8+ messages in thread
From: Ming Lei @ 2019-06-10 15:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, David Gibson, Darrick J. Wong,
	linux-xfs, Alexander Viro

On Mon, Jun 10, 2019 at 06:34:46AM -0700, Christoph Hellwig wrote:
> I don't really like the magic enum types.  I'd rather go back to my
> initial idea to turn the same_page argument into an output parameter,
> so that the callers can act upon it.  Untested patch below:
> 
> 
> diff --git a/block/bio.c b/block/bio.c
> index 683cbb40f051..d4999ef3b1fb 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -636,7 +636,7 @@ EXPORT_SYMBOL(bio_clone_fast);
>  
>  static inline bool page_is_mergeable(const struct bio_vec *bv,
>  		struct page *page, unsigned int len, unsigned int off,
> -		bool same_page)
> +		bool *same_page)
>  {
>  	phys_addr_t vec_end_addr = page_to_phys(bv->bv_page) +
>  		bv->bv_offset + bv->bv_len - 1;
> @@ -647,26 +647,17 @@ static inline bool page_is_mergeable(const struct bio_vec *bv,
>  	if (xen_domain() && !xen_biovec_phys_mergeable(bv, page))
>  		return false;
>  
> -	if ((vec_end_addr & PAGE_MASK) != page_addr) {
> -		if (same_page)
> -			return false;
> -		if (pfn_to_page(PFN_DOWN(vec_end_addr)) + 1 != page)
> -			return false;
> -	}
> -
> -	WARN_ON_ONCE(same_page && (len + off) > PAGE_SIZE);
> -
> +	*same_page = ((vec_end_addr & PAGE_MASK) == page_addr);
> +	if (!*same_page && pfn_to_page(PFN_DOWN(vec_end_addr)) + 1 != page)
> +		return false;
>  	return true;
>  }
>  
> -/*
> - * Check if the @page can be added to the current segment(@bv), and make
> - * sure to call it only if page_is_mergeable(@bv, @page) is true
> - */
> -static bool can_add_page_to_seg(struct request_queue *q,
> -		struct bio_vec *bv, struct page *page, unsigned len,
> -		unsigned offset)
> +static bool bio_try_merge_pc_page(struct request_queue *q, struct bio *bio,
> +		struct page *page, unsigned len, unsigned offset,
> +		bool *same_page)
>  {
> +	struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
>  	unsigned long mask = queue_segment_boundary(q);
>  	phys_addr_t addr1 = page_to_phys(bv->bv_page) + bv->bv_offset;
>  	phys_addr_t addr2 = page_to_phys(page) + offset + len - 1;
> @@ -677,7 +668,13 @@ static bool can_add_page_to_seg(struct request_queue *q,
>  	if (bv->bv_len + len > queue_max_segment_size(q))
>  		return false;
>  
> -	return true;
> +	/*
> +	 * If the queue doesn't support SG gaps and adding this
> +	 * offset would create a gap, disallow it.
> +	 */
> +	if (bvec_gap_to_prev(q, bv, offset))
> +		return false;
> +	return __bio_try_merge_page(bio, page, len, offset, same_page);
>  }
>  
>  /**
> @@ -701,6 +698,7 @@ static int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
>  		bool put_same_page)
>  {
>  	struct bio_vec *bvec;
> +	bool same_page = false;
>  
>  	/*
>  	 * cloned bio must not modify vec list
> @@ -711,29 +709,11 @@ static int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
>  	if (((bio->bi_iter.bi_size + len) >> 9) > queue_max_hw_sectors(q))
>  		return 0;
>  
> -	if (bio->bi_vcnt > 0) {
> -		bvec = &bio->bi_io_vec[bio->bi_vcnt - 1];
> -
> -		if (page == bvec->bv_page &&
> -		    offset == bvec->bv_offset + bvec->bv_len) {
> -			if (put_same_page)
> -				put_page(page);
> -			bvec->bv_len += len;
> -			goto done;
> -		}
> -
> -		/*
> -		 * If the queue doesn't support SG gaps and adding this
> -		 * offset would create a gap, disallow it.
> -		 */
> -		if (bvec_gap_to_prev(q, bvec, offset))
> -			return 0;
> -
> -		if (page_is_mergeable(bvec, page, len, offset, false) &&
> -		    can_add_page_to_seg(q, bvec, page, len, offset)) {
> -			bvec->bv_len += len;
> -			goto done;
> -		}
> +	if (bio->bi_vcnt > 0 &&
> +	    bio_try_merge_pc_page(q, bio, page, len, offset, &same_page)) {
> +		if (put_same_page && same_page)
> +			put_page(page);
> +		goto done;
>  	}
>  
>  	if (bio_full(bio))
> @@ -747,8 +727,8 @@ static int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
>  	bvec->bv_len = len;
>  	bvec->bv_offset = offset;
>  	bio->bi_vcnt++;
> - done:
>  	bio->bi_iter.bi_size += len;
> + done:
>  	bio->bi_phys_segments = bio->bi_vcnt;
>  	bio_set_flag(bio, BIO_SEG_VALID);
>  	return len;
> @@ -767,8 +747,7 @@ EXPORT_SYMBOL(bio_add_pc_page);
>   * @page: start page to add
>   * @len: length of the data to add
>   * @off: offset of the data relative to @page
> - * @same_page: if %true only merge if the new data is in the same physical
> - *		page as the last segment of the bio.
> + * @same_page: return if the segment has been merged inside the same page
>   *
>   * Try to add the data at @page + @off to the last bvec of @bio.  This is a
>   * a useful optimisation for file systems with a block size smaller than the
> @@ -779,7 +758,7 @@ EXPORT_SYMBOL(bio_add_pc_page);
>   * Return %true on success or %false on failure.
>   */
>  bool __bio_try_merge_page(struct bio *bio, struct page *page,
> -		unsigned int len, unsigned int off, bool same_page)
> +		unsigned int len, unsigned int off, bool *same_page)
>  {
>  	if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
>  		return false;
> @@ -837,7 +816,9 @@ EXPORT_SYMBOL_GPL(__bio_add_page);
>  int bio_add_page(struct bio *bio, struct page *page,
>  		 unsigned int len, unsigned int offset)
>  {
> -	if (!__bio_try_merge_page(bio, page, len, offset, false)) {
> +	bool same_page = false;
> +
> +	if (!__bio_try_merge_page(bio, page, len, offset, &same_page)) {
>  		if (bio_full(bio))
>  			return 0;
>  		__bio_add_page(bio, page, len, offset);
> @@ -900,6 +881,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>  	unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt;
>  	struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;
>  	struct page **pages = (struct page **)bv;
> +	bool same_page = false;
>  	ssize_t size, left;
>  	unsigned len, i;
>  	size_t offset;
> @@ -920,8 +902,15 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>  		struct page *page = pages[i];
>  
>  		len = min_t(size_t, PAGE_SIZE - offset, left);
> -		if (WARN_ON_ONCE(bio_add_page(bio, page, len, offset) != len))
> -			return -EINVAL;
> +
> +		if (__bio_try_merge_page(bio, page, len, offset, &same_page)) {
> +			if (same_page)
> +				put_page(page);
> +		} else {
> +			if (WARN_ON_ONCE(bio_full(bio)))
> +				return -EINVAL;
> +			__bio_add_page(bio, page, len, offset);
> +		}
>  		offset = 0;
>  	}
>  
> diff --git a/fs/iomap.c b/fs/iomap.c
> index 23ef63fd1669..12654c2e78f8 100644
> --- a/fs/iomap.c
> +++ b/fs/iomap.c
> @@ -287,7 +287,7 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
>  	struct iomap_readpage_ctx *ctx = data;
>  	struct page *page = ctx->cur_page;
>  	struct iomap_page *iop = iomap_page_create(inode, page);
> -	bool is_contig = false;
> +	bool same_page = false, is_contig = false;
>  	loff_t orig_pos = pos;
>  	unsigned poff, plen;
>  	sector_t sector;
> @@ -315,10 +315,14 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
>  	 * Try to merge into a previous segment if we can.
>  	 */
>  	sector = iomap_sector(iomap, pos);
> -	if (ctx->bio && bio_end_sector(ctx->bio) == sector) {
> -		if (__bio_try_merge_page(ctx->bio, page, plen, poff, true))
> -			goto done;
> +	if (ctx->bio && bio_end_sector(ctx->bio) == sector)
>  		is_contig = true;
> +
> +	if (is_contig &&
> +	    __bio_try_merge_page(ctx->bio, page, plen, poff, &same_page)) {
> +		if (!same_page && iop)
> +			atomic_inc(&iop->read_count);
> +		goto done;
>  	}
>  
>  	/*
> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index a6f0f4761a37..8da5e6637771 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -758,6 +758,7 @@ xfs_add_to_ioend(
>  	struct block_device	*bdev = xfs_find_bdev_for_inode(inode);
>  	unsigned		len = i_blocksize(inode);
>  	unsigned		poff = offset & (PAGE_SIZE - 1);
> +	bool			merged, same_page = false;
>  	sector_t		sector;
>  
>  	sector = xfs_fsb_to_db(ip, wpc->imap.br_startblock) +
> @@ -774,9 +775,13 @@ xfs_add_to_ioend(
>  				wpc->imap.br_state, offset, bdev, sector);
>  	}
>  
> -	if (!__bio_try_merge_page(wpc->ioend->io_bio, page, len, poff, true)) {
> -		if (iop)
> -			atomic_inc(&iop->write_count);
> +	merged = __bio_try_merge_page(wpc->ioend->io_bio, page, len, poff,
> +			&same_page);
> +
> +	if (iop && !same_page)
> +		atomic_inc(&iop->write_count);
> +
> +	if (!merged) {
>  		if (bio_full(wpc->ioend->io_bio))
>  			xfs_chain_bio(wpc->ioend, wbc, bdev, sector);
>  		bio_add_page(wpc->ioend->io_bio, page, len, poff);
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index ea73df36529a..3df3b127b394 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -423,7 +423,7 @@ extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
>  extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
>  			   unsigned int, unsigned int);
>  bool __bio_try_merge_page(struct bio *bio, struct page *page,
> -		unsigned int len, unsigned int off, bool same_page);
> +		unsigned int len, unsigned int off, bool *same_page);
>  void __bio_add_page(struct bio *bio, struct page *page,
>  		unsigned int len, unsigned int off);
>  int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter);

I'd suggest taking a look at V3, in which each flag is documented well
enough, and which is much simpler than this one.

Also, other callers may need to pass BVEC_MERGE_PUT_SAME_PAGE.

Thanks,
Ming


* Re: [PATCH V2 0/2] block: fix page leak by merging to same page
  2019-06-10 15:09   ` Ming Lei
@ 2019-06-11  7:45     ` Christoph Hellwig
  2019-06-11  7:57       ` Ming Lei
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2019-06-11  7:45 UTC (permalink / raw)
  To: Ming Lei
  Cc: Christoph Hellwig, Jens Axboe, linux-block, David Gibson,
	Darrick J. Wong, linux-xfs, Alexander Viro

Can you please trim your replies?  I've scrolled two patches before
giving up trying to read your mail.


* Re: [PATCH V2 0/2] block: fix page leak by merging to same page
  2019-06-11  7:45     ` Christoph Hellwig
@ 2019-06-11  7:57       ` Ming Lei
  0 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2019-06-11  7:57 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, David Gibson, Darrick J. Wong,
	linux-xfs, Alexander Viro

On Tue, Jun 11, 2019 at 12:45:41AM -0700, Christoph Hellwig wrote:
> Can you please trim your replies?  I've scrolled two patches before
> giving up trying to read your mail.

OK, I will trim the reply next time.

Thanks,
Ming

