* [PATCH V2 0/2] block: fix page leak by merging to same page
@ 2019-06-10 4:18 Ming Lei
2019-06-10 4:18 ` [PATCH V2 1/2] block: introduce 'enum bvec_merge_flags' for __bio_try_merge_page Ming Lei
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Ming Lei @ 2019-06-10 4:18 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Ming Lei, David Gibson, Darrick J. Wong, linux-xfs,
Alexander Viro, Christoph Hellwig
Hi,
'pages' retrived by __bio_iov_iter_get_pages() may point to same page,
and finally they can be merged to the same page in bio_add_page(), then
page leak can be caused because bio_release_pages() only drops the page
ref once.
Fixes this issue by dropping the extra page ref.
V2:
- V1 breaks multi-page merge, and fix it and only put the page ref
if the added page is really the 'same page'
Ming Lei (2):
block: introduce 'enum bvec_merge_flags' for __bio_try_merge_page
block: fix page leak in case of merging to same page
block/bio.c | 32 ++++++++++++++++++++++----------
fs/iomap.c | 3 ++-
fs/xfs/xfs_aops.c | 3 ++-
include/linux/bio.h | 9 ++++++++-
4 files changed, 34 insertions(+), 13 deletions(-)
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
--
2.20.1
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH V2 1/2] block: introduce 'enum bvec_merge_flags' for __bio_try_merge_page
2019-06-10 4:18 [PATCH V2 0/2] block: fix page leak by merging to same page Ming Lei
@ 2019-06-10 4:18 ` Ming Lei
2019-06-10 4:18 ` [PATCH V2 2/2] block: fix page leak in case of merging to same page Ming Lei
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2019-06-10 4:18 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Ming Lei, Darrick J. Wong, linux-xfs,
Alexander Viro, Christoph Hellwig, David Gibson
Introduce 'enum bvec_merge_flags' and pass it to __bio_try_merge_page,
we have to deal with several cases related with page reference when
merging same page to bio(bvec), such as:
1) only merge to same page without putting reference of the same page,
such as iomap & xfs
2) merge to same page and put reference of the same page, such as
__bio_iov_iter_get_pages()
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
block/bio.c | 20 ++++++++++++--------
fs/iomap.c | 3 ++-
fs/xfs/xfs_aops.c | 3 ++-
include/linux/bio.h | 8 +++++++-
4 files changed, 23 insertions(+), 11 deletions(-)
diff --git a/block/bio.c b/block/bio.c
index 683cbb40f051..39e3b931dc3b 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -636,7 +636,7 @@ EXPORT_SYMBOL(bio_clone_fast);
static inline bool page_is_mergeable(const struct bio_vec *bv,
struct page *page, unsigned int len, unsigned int off,
- bool same_page)
+ enum bvec_merge_flags flags)
{
phys_addr_t vec_end_addr = page_to_phys(bv->bv_page) +
bv->bv_offset + bv->bv_len - 1;
@@ -648,13 +648,14 @@ static inline bool page_is_mergeable(const struct bio_vec *bv,
return false;
if ((vec_end_addr & PAGE_MASK) != page_addr) {
- if (same_page)
+ if (flags & BVEC_MERGE_TO_SAME_PAGE)
return false;
if (pfn_to_page(PFN_DOWN(vec_end_addr)) + 1 != page)
return false;
}
- WARN_ON_ONCE(same_page && (len + off) > PAGE_SIZE);
+ WARN_ON_ONCE((flags & BVEC_MERGE_TO_SAME_PAGE) &&
+ (len + off) > PAGE_SIZE);
return true;
}
@@ -729,8 +730,9 @@ static int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
if (bvec_gap_to_prev(q, bvec, offset))
return 0;
- if (page_is_mergeable(bvec, page, len, offset, false) &&
- can_add_page_to_seg(q, bvec, page, len, offset)) {
+ if (page_is_mergeable(bvec, page, len, offset,
+ BVEC_MERGE_DEFAULT) && can_add_page_to_seg(q, bvec,
+ page, len, offset)) {
bvec->bv_len += len;
goto done;
}
@@ -779,7 +781,8 @@ EXPORT_SYMBOL(bio_add_pc_page);
* Return %true on success or %false on failure.
*/
bool __bio_try_merge_page(struct bio *bio, struct page *page,
- unsigned int len, unsigned int off, bool same_page)
+ unsigned int len, unsigned int off,
+ enum bvec_merge_flags flags)
{
if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
return false;
@@ -787,7 +790,7 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page,
if (bio->bi_vcnt > 0) {
struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
- if (page_is_mergeable(bv, page, len, off, same_page)) {
+ if (page_is_mergeable(bv, page, len, off, flags)) {
bv->bv_len += len;
bio->bi_iter.bi_size += len;
return true;
@@ -837,7 +840,8 @@ EXPORT_SYMBOL_GPL(__bio_add_page);
int bio_add_page(struct bio *bio, struct page *page,
unsigned int len, unsigned int offset)
{
- if (!__bio_try_merge_page(bio, page, len, offset, false)) {
+ if (!__bio_try_merge_page(bio, page, len, offset,
+ BVEC_MERGE_DEFAULT)) {
if (bio_full(bio))
return 0;
__bio_add_page(bio, page, len, offset);
diff --git a/fs/iomap.c b/fs/iomap.c
index 23ef63fd1669..e04652bbf92a 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -316,7 +316,8 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
*/
sector = iomap_sector(iomap, pos);
if (ctx->bio && bio_end_sector(ctx->bio) == sector) {
- if (__bio_try_merge_page(ctx->bio, page, plen, poff, true))
+ if (__bio_try_merge_page(ctx->bio, page, plen, poff,
+ BVEC_MERGE_TO_SAME_PAGE))
goto done;
is_contig = true;
}
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index a6f0f4761a37..7e7385bc3b9e 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -774,7 +774,8 @@ xfs_add_to_ioend(
wpc->imap.br_state, offset, bdev, sector);
}
- if (!__bio_try_merge_page(wpc->ioend->io_bio, page, len, poff, true)) {
+ if (!__bio_try_merge_page(wpc->ioend->io_bio, page, len, poff,
+ BVEC_MERGE_TO_SAME_PAGE)) {
if (iop)
atomic_inc(&iop->write_count);
if (bio_full(wpc->ioend->io_bio))
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 0f23b5682640..48a95bca1703 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -419,11 +419,17 @@ extern void bio_uninit(struct bio *);
extern void bio_reset(struct bio *);
void bio_chain(struct bio *, struct bio *);
+enum bvec_merge_flags {
+ BVEC_MERGE_DEFAULT,
+ BVEC_MERGE_TO_SAME_PAGE = BIT(0),
+};
+
extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
unsigned int, unsigned int);
bool __bio_try_merge_page(struct bio *bio, struct page *page,
- unsigned int len, unsigned int off, bool same_page);
+ unsigned int len, unsigned int off,
+ enum bvec_merge_flags flags);
void __bio_add_page(struct bio *bio, struct page *page,
unsigned int len, unsigned int off);
int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter);
--
2.20.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH V2 2/2] block: fix page leak in case of merging to same page
2019-06-10 4:18 [PATCH V2 0/2] block: fix page leak by merging to same page Ming Lei
2019-06-10 4:18 ` [PATCH V2 1/2] block: introduce 'enum bvec_merge_flags' for __bio_try_merge_page Ming Lei
@ 2019-06-10 4:18 ` Ming Lei
2019-06-10 8:37 ` [PATCH V2 0/2] block: fix page leak by " Ming Lei
2019-06-10 13:34 ` Christoph Hellwig
3 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2019-06-10 4:18 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Ming Lei, David Gibson, Darrick J. Wong, linux-xfs,
Alexander Viro, Christoph Hellwig
Different iovec may use one same page, then 'pages' array filled
by iov_iter_get_pages() may get reference of the same page several
times. If some elements in 'pages' can be merged to same page in
one bvec by bio_add_page(), bio_release_pages() only drops the
page's reference once.
This way causes page leak reported by David Gibson.
This issue can be triggered since 576ed913 ("block: use bio_add_page in
bio_iov_iter_get_pages").
Fixes the issue by putting the page's ref if it is merged to same page.
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Link: https://lkml.org/lkml/2019/4/23/64
Fixes: 576ed913 ("block: use bio_add_page in bio_iov_iter_get_pages")
Reported-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
block/bio.c | 12 ++++++++++--
include/linux/bio.h | 1 +
2 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/block/bio.c b/block/bio.c
index 39e3b931dc3b..07a15abc3d11 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -652,6 +652,9 @@ static inline bool page_is_mergeable(const struct bio_vec *bv,
return false;
if (pfn_to_page(PFN_DOWN(vec_end_addr)) + 1 != page)
return false;
+ /* drop page ref if the page has been added and user asks to do that */
+ } else if (flags & BVEC_MERGE_PUT_SAME_PAGE) {
+ put_page(page);
}
WARN_ON_ONCE((flags & BVEC_MERGE_TO_SAME_PAGE) &&
@@ -924,8 +927,13 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
struct page *page = pages[i];
len = min_t(size_t, PAGE_SIZE - offset, left);
- if (WARN_ON_ONCE(bio_add_page(bio, page, len, offset) != len))
- return -EINVAL;
+
+ if (!__bio_try_merge_page(bio, page, len, offset,
+ BVEC_MERGE_PUT_SAME_PAGE)) {
+ if (WARN_ON_ONCE(bio_add_page(bio, page, len, offset)
+ != len))
+ return -EINVAL;
+ }
offset = 0;
}
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 48a95bca1703..dec6cf683d8e 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -422,6 +422,7 @@ void bio_chain(struct bio *, struct bio *);
enum bvec_merge_flags {
BVEC_MERGE_DEFAULT,
BVEC_MERGE_TO_SAME_PAGE = BIT(0),
+ BVEC_MERGE_PUT_SAME_PAGE = BIT(1),
};
extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
--
2.20.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH V2 0/2] block: fix page leak by merging to same page
2019-06-10 4:18 [PATCH V2 0/2] block: fix page leak by merging to same page Ming Lei
2019-06-10 4:18 ` [PATCH V2 1/2] block: introduce 'enum bvec_merge_flags' for __bio_try_merge_page Ming Lei
2019-06-10 4:18 ` [PATCH V2 2/2] block: fix page leak in case of merging to same page Ming Lei
@ 2019-06-10 8:37 ` Ming Lei
2019-06-10 13:34 ` Christoph Hellwig
3 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2019-06-10 8:37 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, David Gibson, Darrick J. Wong, linux-xfs,
Alexander Viro, Christoph Hellwig
On Mon, Jun 10, 2019 at 12:18:17PM +0800, Ming Lei wrote:
> Hi,
>
> 'pages' retrived by __bio_iov_iter_get_pages() may point to same page,
> and finally they can be merged to the same page in bio_add_page(), then
> page leak can be caused because bio_release_pages() only drops the page
> ref once.
>
> Fixes this issue by dropping the extra page ref.
>
> V2:
> - V1 breaks multi-page merge, and fix it and only put the page ref
> if the added page is really the 'same page'
>
>
> Ming Lei (2):
> block: introduce 'enum bvec_merge_flags' for __bio_try_merge_page
> block: fix page leak in case of merging to same page
>
> block/bio.c | 32 ++++++++++++++++++++++----------
> fs/iomap.c | 3 ++-
> fs/xfs/xfs_aops.c | 3 ++-
> include/linux/bio.h | 9 ++++++++-
> 4 files changed, 34 insertions(+), 13 deletions(-)
>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
> Cc: linux-xfs@vger.kernel.org
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Christoph Hellwig <hch@infradead.org>
> --
> 2.20.1
>
Please ignore V2, I will improve it a bit and post out V3.
Thanks
Ming
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V2 0/2] block: fix page leak by merging to same page
2019-06-10 4:18 [PATCH V2 0/2] block: fix page leak by merging to same page Ming Lei
` (2 preceding siblings ...)
2019-06-10 8:37 ` [PATCH V2 0/2] block: fix page leak by " Ming Lei
@ 2019-06-10 13:34 ` Christoph Hellwig
2019-06-10 15:09 ` Ming Lei
3 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2019-06-10 13:34 UTC (permalink / raw)
To: Ming Lei
Cc: Jens Axboe, linux-block, David Gibson, Darrick J. Wong,
linux-xfs, Alexander Viro, Christoph Hellwig
I don't really like the magic enum types. I'd rather go back to my
initial idea to turn the same_page argument into an output parameter,
so that the callers can act upon it. Untested patch below:
diff --git a/block/bio.c b/block/bio.c
index 683cbb40f051..d4999ef3b1fb 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -636,7 +636,7 @@ EXPORT_SYMBOL(bio_clone_fast);
static inline bool page_is_mergeable(const struct bio_vec *bv,
struct page *page, unsigned int len, unsigned int off,
- bool same_page)
+ bool *same_page)
{
phys_addr_t vec_end_addr = page_to_phys(bv->bv_page) +
bv->bv_offset + bv->bv_len - 1;
@@ -647,26 +647,17 @@ static inline bool page_is_mergeable(const struct bio_vec *bv,
if (xen_domain() && !xen_biovec_phys_mergeable(bv, page))
return false;
- if ((vec_end_addr & PAGE_MASK) != page_addr) {
- if (same_page)
- return false;
- if (pfn_to_page(PFN_DOWN(vec_end_addr)) + 1 != page)
- return false;
- }
-
- WARN_ON_ONCE(same_page && (len + off) > PAGE_SIZE);
-
+ *same_page = ((vec_end_addr & PAGE_MASK) == page_addr);
+ if (!*same_page && pfn_to_page(PFN_DOWN(vec_end_addr)) + 1 != page)
+ return false;
return true;
}
-/*
- * Check if the @page can be added to the current segment(@bv), and make
- * sure to call it only if page_is_mergeable(@bv, @page) is true
- */
-static bool can_add_page_to_seg(struct request_queue *q,
- struct bio_vec *bv, struct page *page, unsigned len,
- unsigned offset)
+static bool bio_try_merge_pc_page(struct request_queue *q, struct bio *bio,
+ struct page *page, unsigned len, unsigned offset,
+ bool *same_page)
{
+ struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
unsigned long mask = queue_segment_boundary(q);
phys_addr_t addr1 = page_to_phys(bv->bv_page) + bv->bv_offset;
phys_addr_t addr2 = page_to_phys(page) + offset + len - 1;
@@ -677,7 +668,13 @@ static bool can_add_page_to_seg(struct request_queue *q,
if (bv->bv_len + len > queue_max_segment_size(q))
return false;
- return true;
+ /*
+ * If the queue doesn't support SG gaps and adding this
+ * offset would create a gap, disallow it.
+ */
+ if (bvec_gap_to_prev(q, bv, offset))
+ return false;
+ return __bio_try_merge_page(bio, page, len, offset, same_page);
}
/**
@@ -701,6 +698,7 @@ static int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
bool put_same_page)
{
struct bio_vec *bvec;
+ bool same_page = false;
/*
* cloned bio must not modify vec list
@@ -711,29 +709,11 @@ static int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
if (((bio->bi_iter.bi_size + len) >> 9) > queue_max_hw_sectors(q))
return 0;
- if (bio->bi_vcnt > 0) {
- bvec = &bio->bi_io_vec[bio->bi_vcnt - 1];
-
- if (page == bvec->bv_page &&
- offset == bvec->bv_offset + bvec->bv_len) {
- if (put_same_page)
- put_page(page);
- bvec->bv_len += len;
- goto done;
- }
-
- /*
- * If the queue doesn't support SG gaps and adding this
- * offset would create a gap, disallow it.
- */
- if (bvec_gap_to_prev(q, bvec, offset))
- return 0;
-
- if (page_is_mergeable(bvec, page, len, offset, false) &&
- can_add_page_to_seg(q, bvec, page, len, offset)) {
- bvec->bv_len += len;
- goto done;
- }
+ if (bio->bi_vcnt > 0 &&
+ bio_try_merge_pc_page(q, bio, page, len, offset, &same_page)) {
+ if (put_same_page && same_page)
+ put_page(page);
+ goto done;
}
if (bio_full(bio))
@@ -747,8 +727,8 @@ static int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
bvec->bv_len = len;
bvec->bv_offset = offset;
bio->bi_vcnt++;
- done:
bio->bi_iter.bi_size += len;
+ done:
bio->bi_phys_segments = bio->bi_vcnt;
bio_set_flag(bio, BIO_SEG_VALID);
return len;
@@ -767,8 +747,7 @@ EXPORT_SYMBOL(bio_add_pc_page);
* @page: start page to add
* @len: length of the data to add
* @off: offset of the data relative to @page
- * @same_page: if %true only merge if the new data is in the same physical
- * page as the last segment of the bio.
+ * @same_page: return if the segment has been merged inside the same page
*
* Try to add the data at @page + @off to the last bvec of @bio. This is a
* a useful optimisation for file systems with a block size smaller than the
@@ -779,7 +758,7 @@ EXPORT_SYMBOL(bio_add_pc_page);
* Return %true on success or %false on failure.
*/
bool __bio_try_merge_page(struct bio *bio, struct page *page,
- unsigned int len, unsigned int off, bool same_page)
+ unsigned int len, unsigned int off, bool *same_page)
{
if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
return false;
@@ -837,7 +816,9 @@ EXPORT_SYMBOL_GPL(__bio_add_page);
int bio_add_page(struct bio *bio, struct page *page,
unsigned int len, unsigned int offset)
{
- if (!__bio_try_merge_page(bio, page, len, offset, false)) {
+ bool same_page = false;
+
+ if (!__bio_try_merge_page(bio, page, len, offset, &same_page)) {
if (bio_full(bio))
return 0;
__bio_add_page(bio, page, len, offset);
@@ -900,6 +881,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt;
struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;
struct page **pages = (struct page **)bv;
+ bool same_page = false;
ssize_t size, left;
unsigned len, i;
size_t offset;
@@ -920,8 +902,15 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
struct page *page = pages[i];
len = min_t(size_t, PAGE_SIZE - offset, left);
- if (WARN_ON_ONCE(bio_add_page(bio, page, len, offset) != len))
- return -EINVAL;
+
+ if (__bio_try_merge_page(bio, page, len, offset, &same_page)) {
+ if (same_page)
+ put_page(page);
+ } else {
+ if (WARN_ON_ONCE(bio_full(bio)))
+ return -EINVAL;
+ __bio_add_page(bio, page, len, offset);
+ }
offset = 0;
}
diff --git a/fs/iomap.c b/fs/iomap.c
index 23ef63fd1669..12654c2e78f8 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -287,7 +287,7 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
struct iomap_readpage_ctx *ctx = data;
struct page *page = ctx->cur_page;
struct iomap_page *iop = iomap_page_create(inode, page);
- bool is_contig = false;
+ bool same_page = false, is_contig = false;
loff_t orig_pos = pos;
unsigned poff, plen;
sector_t sector;
@@ -315,10 +315,14 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
* Try to merge into a previous segment if we can.
*/
sector = iomap_sector(iomap, pos);
- if (ctx->bio && bio_end_sector(ctx->bio) == sector) {
- if (__bio_try_merge_page(ctx->bio, page, plen, poff, true))
- goto done;
+ if (ctx->bio && bio_end_sector(ctx->bio) == sector)
is_contig = true;
+
+ if (is_contig &&
+ __bio_try_merge_page(ctx->bio, page, plen, poff, &same_page)) {
+ if (!same_page && iop)
+ atomic_inc(&iop->read_count);
+ goto done;
}
/*
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index a6f0f4761a37..8da5e6637771 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -758,6 +758,7 @@ xfs_add_to_ioend(
struct block_device *bdev = xfs_find_bdev_for_inode(inode);
unsigned len = i_blocksize(inode);
unsigned poff = offset & (PAGE_SIZE - 1);
+ bool merged, same_page = false;
sector_t sector;
sector = xfs_fsb_to_db(ip, wpc->imap.br_startblock) +
@@ -774,9 +775,13 @@ xfs_add_to_ioend(
wpc->imap.br_state, offset, bdev, sector);
}
- if (!__bio_try_merge_page(wpc->ioend->io_bio, page, len, poff, true)) {
- if (iop)
- atomic_inc(&iop->write_count);
+ merged = __bio_try_merge_page(wpc->ioend->io_bio, page, len, poff,
+ &same_page);
+
+ if (iop && !same_page)
+ atomic_inc(&iop->write_count);
+
+ if (!merged) {
if (bio_full(wpc->ioend->io_bio))
xfs_chain_bio(wpc->ioend, wbc, bdev, sector);
bio_add_page(wpc->ioend->io_bio, page, len, poff);
diff --git a/include/linux/bio.h b/include/linux/bio.h
index ea73df36529a..3df3b127b394 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -423,7 +423,7 @@ extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
unsigned int, unsigned int);
bool __bio_try_merge_page(struct bio *bio, struct page *page,
- unsigned int len, unsigned int off, bool same_page);
+ unsigned int len, unsigned int off, bool *same_page);
void __bio_add_page(struct bio *bio, struct page *page,
unsigned int len, unsigned int off);
int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter);
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH V2 0/2] block: fix page leak by merging to same page
2019-06-10 13:34 ` Christoph Hellwig
@ 2019-06-10 15:09 ` Ming Lei
2019-06-11 7:45 ` Christoph Hellwig
0 siblings, 1 reply; 8+ messages in thread
From: Ming Lei @ 2019-06-10 15:09 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, linux-block, David Gibson, Darrick J. Wong,
linux-xfs, Alexander Viro
On Mon, Jun 10, 2019 at 06:34:46AM -0700, Christoph Hellwig wrote:
> I don't really like the magic enum types. I'd rather go back to my
> initial idea to turn the same_page argument into an output parameter,
> so that the callers can act upon it. Untested patch below:
>
>
> diff --git a/block/bio.c b/block/bio.c
> index 683cbb40f051..d4999ef3b1fb 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -636,7 +636,7 @@ EXPORT_SYMBOL(bio_clone_fast);
>
> static inline bool page_is_mergeable(const struct bio_vec *bv,
> struct page *page, unsigned int len, unsigned int off,
> - bool same_page)
> + bool *same_page)
> {
> phys_addr_t vec_end_addr = page_to_phys(bv->bv_page) +
> bv->bv_offset + bv->bv_len - 1;
> @@ -647,26 +647,17 @@ static inline bool page_is_mergeable(const struct bio_vec *bv,
> if (xen_domain() && !xen_biovec_phys_mergeable(bv, page))
> return false;
>
> - if ((vec_end_addr & PAGE_MASK) != page_addr) {
> - if (same_page)
> - return false;
> - if (pfn_to_page(PFN_DOWN(vec_end_addr)) + 1 != page)
> - return false;
> - }
> -
> - WARN_ON_ONCE(same_page && (len + off) > PAGE_SIZE);
> -
> + *same_page = ((vec_end_addr & PAGE_MASK) == page_addr);
> + if (!*same_page && pfn_to_page(PFN_DOWN(vec_end_addr)) + 1 != page)
> + return false;
> return true;
> }
>
> -/*
> - * Check if the @page can be added to the current segment(@bv), and make
> - * sure to call it only if page_is_mergeable(@bv, @page) is true
> - */
> -static bool can_add_page_to_seg(struct request_queue *q,
> - struct bio_vec *bv, struct page *page, unsigned len,
> - unsigned offset)
> +static bool bio_try_merge_pc_page(struct request_queue *q, struct bio *bio,
> + struct page *page, unsigned len, unsigned offset,
> + bool *same_page)
> {
> + struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
> unsigned long mask = queue_segment_boundary(q);
> phys_addr_t addr1 = page_to_phys(bv->bv_page) + bv->bv_offset;
> phys_addr_t addr2 = page_to_phys(page) + offset + len - 1;
> @@ -677,7 +668,13 @@ static bool can_add_page_to_seg(struct request_queue *q,
> if (bv->bv_len + len > queue_max_segment_size(q))
> return false;
>
> - return true;
> + /*
> + * If the queue doesn't support SG gaps and adding this
> + * offset would create a gap, disallow it.
> + */
> + if (bvec_gap_to_prev(q, bv, offset))
> + return false;
> + return __bio_try_merge_page(bio, page, len, offset, same_page);
> }
>
> /**
> @@ -701,6 +698,7 @@ static int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
> bool put_same_page)
> {
> struct bio_vec *bvec;
> + bool same_page = false;
>
> /*
> * cloned bio must not modify vec list
> @@ -711,29 +709,11 @@ static int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
> if (((bio->bi_iter.bi_size + len) >> 9) > queue_max_hw_sectors(q))
> return 0;
>
> - if (bio->bi_vcnt > 0) {
> - bvec = &bio->bi_io_vec[bio->bi_vcnt - 1];
> -
> - if (page == bvec->bv_page &&
> - offset == bvec->bv_offset + bvec->bv_len) {
> - if (put_same_page)
> - put_page(page);
> - bvec->bv_len += len;
> - goto done;
> - }
> -
> - /*
> - * If the queue doesn't support SG gaps and adding this
> - * offset would create a gap, disallow it.
> - */
> - if (bvec_gap_to_prev(q, bvec, offset))
> - return 0;
> -
> - if (page_is_mergeable(bvec, page, len, offset, false) &&
> - can_add_page_to_seg(q, bvec, page, len, offset)) {
> - bvec->bv_len += len;
> - goto done;
> - }
> + if (bio->bi_vcnt > 0 &&
> + bio_try_merge_pc_page(q, bio, page, len, offset, &same_page)) {
> + if (put_same_page && same_page)
> + put_page(page);
> + goto done;
> }
>
> if (bio_full(bio))
> @@ -747,8 +727,8 @@ static int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
> bvec->bv_len = len;
> bvec->bv_offset = offset;
> bio->bi_vcnt++;
> - done:
> bio->bi_iter.bi_size += len;
> + done:
> bio->bi_phys_segments = bio->bi_vcnt;
> bio_set_flag(bio, BIO_SEG_VALID);
> return len;
> @@ -767,8 +747,7 @@ EXPORT_SYMBOL(bio_add_pc_page);
> * @page: start page to add
> * @len: length of the data to add
> * @off: offset of the data relative to @page
> - * @same_page: if %true only merge if the new data is in the same physical
> - * page as the last segment of the bio.
> + * @same_page: return if the segment has been merged inside the same page
> *
> * Try to add the data at @page + @off to the last bvec of @bio. This is a
> * a useful optimisation for file systems with a block size smaller than the
> @@ -779,7 +758,7 @@ EXPORT_SYMBOL(bio_add_pc_page);
> * Return %true on success or %false on failure.
> */
> bool __bio_try_merge_page(struct bio *bio, struct page *page,
> - unsigned int len, unsigned int off, bool same_page)
> + unsigned int len, unsigned int off, bool *same_page)
> {
> if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
> return false;
> @@ -837,7 +816,9 @@ EXPORT_SYMBOL_GPL(__bio_add_page);
> int bio_add_page(struct bio *bio, struct page *page,
> unsigned int len, unsigned int offset)
> {
> - if (!__bio_try_merge_page(bio, page, len, offset, false)) {
> + bool same_page = false;
> +
> + if (!__bio_try_merge_page(bio, page, len, offset, &same_page)) {
> if (bio_full(bio))
> return 0;
> __bio_add_page(bio, page, len, offset);
> @@ -900,6 +881,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
> unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt;
> struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;
> struct page **pages = (struct page **)bv;
> + bool same_page = false;
> ssize_t size, left;
> unsigned len, i;
> size_t offset;
> @@ -920,8 +902,15 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
> struct page *page = pages[i];
>
> len = min_t(size_t, PAGE_SIZE - offset, left);
> - if (WARN_ON_ONCE(bio_add_page(bio, page, len, offset) != len))
> - return -EINVAL;
> +
> + if (__bio_try_merge_page(bio, page, len, offset, &same_page)) {
> + if (same_page)
> + put_page(page);
> + } else {
> + if (WARN_ON_ONCE(bio_full(bio)))
> + return -EINVAL;
> + __bio_add_page(bio, page, len, offset);
> + }
> offset = 0;
> }
>
> diff --git a/fs/iomap.c b/fs/iomap.c
> index 23ef63fd1669..12654c2e78f8 100644
> --- a/fs/iomap.c
> +++ b/fs/iomap.c
> @@ -287,7 +287,7 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
> struct iomap_readpage_ctx *ctx = data;
> struct page *page = ctx->cur_page;
> struct iomap_page *iop = iomap_page_create(inode, page);
> - bool is_contig = false;
> + bool same_page = false, is_contig = false;
> loff_t orig_pos = pos;
> unsigned poff, plen;
> sector_t sector;
> @@ -315,10 +315,14 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
> * Try to merge into a previous segment if we can.
> */
> sector = iomap_sector(iomap, pos);
> - if (ctx->bio && bio_end_sector(ctx->bio) == sector) {
> - if (__bio_try_merge_page(ctx->bio, page, plen, poff, true))
> - goto done;
> + if (ctx->bio && bio_end_sector(ctx->bio) == sector)
> is_contig = true;
> +
> + if (is_contig &&
> + __bio_try_merge_page(ctx->bio, page, plen, poff, &same_page)) {
> + if (!same_page && iop)
> + atomic_inc(&iop->read_count);
> + goto done;
> }
>
> /*
> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index a6f0f4761a37..8da5e6637771 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -758,6 +758,7 @@ xfs_add_to_ioend(
> struct block_device *bdev = xfs_find_bdev_for_inode(inode);
> unsigned len = i_blocksize(inode);
> unsigned poff = offset & (PAGE_SIZE - 1);
> + bool merged, same_page = false;
> sector_t sector;
>
> sector = xfs_fsb_to_db(ip, wpc->imap.br_startblock) +
> @@ -774,9 +775,13 @@ xfs_add_to_ioend(
> wpc->imap.br_state, offset, bdev, sector);
> }
>
> - if (!__bio_try_merge_page(wpc->ioend->io_bio, page, len, poff, true)) {
> - if (iop)
> - atomic_inc(&iop->write_count);
> + merged = __bio_try_merge_page(wpc->ioend->io_bio, page, len, poff,
> + &same_page);
> +
> + if (iop && !same_page)
> + atomic_inc(&iop->write_count);
> +
> + if (!merged) {
> if (bio_full(wpc->ioend->io_bio))
> xfs_chain_bio(wpc->ioend, wbc, bdev, sector);
> bio_add_page(wpc->ioend->io_bio, page, len, poff);
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index ea73df36529a..3df3b127b394 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -423,7 +423,7 @@ extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
> extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
> unsigned int, unsigned int);
> bool __bio_try_merge_page(struct bio *bio, struct page *page,
> - unsigned int len, unsigned int off, bool same_page);
> + unsigned int len, unsigned int off, bool *same_page);
> void __bio_add_page(struct bio *bio, struct page *page,
> unsigned int len, unsigned int off);
> int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter);
I'd suggest to take a look at V3, in which each flag is documented well
enough, and it is much more simpler than this one.
Also maybe other callers need to pass BVEC_MERGE_PUT_SAME_PAGE.
Thanks,
Ming
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V2 0/2] block: fix page leak by merging to same page
2019-06-10 15:09 ` Ming Lei
@ 2019-06-11 7:45 ` Christoph Hellwig
2019-06-11 7:57 ` Ming Lei
0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2019-06-11 7:45 UTC (permalink / raw)
To: Ming Lei
Cc: Christoph Hellwig, Jens Axboe, linux-block, David Gibson,
Darrick J. Wong, linux-xfs, Alexander Viro
Can you please trim your replies? I've scrolled two patches before
giving up trying to read your mail.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V2 0/2] block: fix page leak by merging to same page
2019-06-11 7:45 ` Christoph Hellwig
@ 2019-06-11 7:57 ` Ming Lei
0 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2019-06-11 7:57 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, linux-block, David Gibson, Darrick J. Wong,
linux-xfs, Alexander Viro
On Tue, Jun 11, 2019 at 12:45:41AM -0700, Christoph Hellwig wrote:
> Can you please trim your replies? I've scrolled two patches before
> giving up trying to read your mail.
OK, next time will trim the reply.
Thanks,
Ming
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2019-06-11 7:57 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-10 4:18 [PATCH V2 0/2] block: fix page leak by merging to same page Ming Lei
2019-06-10 4:18 ` [PATCH V2 1/2] block: introduce 'enum bvec_merge_flags' for __bio_try_merge_page Ming Lei
2019-06-10 4:18 ` [PATCH V2 2/2] block: fix page leak in case of merging to same page Ming Lei
2019-06-10 8:37 ` [PATCH V2 0/2] block: fix page leak by " Ming Lei
2019-06-10 13:34 ` Christoph Hellwig
2019-06-10 15:09 ` Ming Lei
2019-06-11 7:45 ` Christoph Hellwig
2019-06-11 7:57 ` Ming Lei
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.