* [PATCH v2] Btrfs: fix eb memory leak due to readpage failure
@ 2016-06-03 19:08 Liu Bo
2016-06-13 15:36 ` David Sterba
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Liu Bo @ 2016-06-03 19:08 UTC (permalink / raw)
To: linux-btrfs; +Cc: Josef Bacik
eb->io_pages is set in read_extent_buffer_pages().
In case of readpage failure, pages that have already been added to a bio
go through bio_endio, and readpage_io_failed_hook() later does the cleanup.
However, when one of this eb's pages (it cannot be the 1st page) fails to add
itself to the bio because merge_bio() fails, eb->io_pages is never decreased
via bio_endio, and the eb eventually ends up being leaked.
This lets __do_readpage propagate errors to callers and adds the missing
'atomic_dec(&eb->io_pages)'.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
v2:
- Move 'dec io_pages' to the caller so that we're consistent with
write_one_eb()
fs/btrfs/extent_io.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index d247fc0..0309388 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2869,6 +2869,7 @@ __get_extent_map(struct inode *inode, struct page *page, size_t pg_offset,
  * into the tree that are removed when the IO is done (by the end_io
  * handlers)
  * XXX JDM: This needs looking at to ensure proper page locking
+ * return 0 on success, otherwise return error
  */
 static int __do_readpage(struct extent_io_tree *tree,
 			 struct page *page,
@@ -2890,7 +2891,7 @@ static int __do_readpage(struct extent_io_tree *tree,
 	sector_t sector;
 	struct extent_map *em;
 	struct block_device *bdev;
-	int ret;
+	int ret = 0;
 	int nr = 0;
 	size_t pg_offset = 0;
 	size_t iosize;
@@ -3081,7 +3082,7 @@ out:
 		SetPageUptodate(page);
 		unlock_page(page);
 	}
-	return 0;
+	return ret;
 }

 static inline void __do_contiguous_readpages(struct extent_io_tree *tree,
@@ -5204,8 +5205,17 @@ int read_extent_buffer_pages(struct extent_io_tree *tree,
 						      get_extent, &bio,
 						      mirror_num, &bio_flags,
 						      READ | REQ_META);
-			if (err)
+			if (err) {
 				ret = err;
+				/*
+				 * We use &bio in above __extent_read_full_page,
+				 * so we ensure that if it returns error, the
+				 * current page fails to add itself to bio.
+				 *
+				 * We must dec io_pages by ourselves.
+				 */
+				atomic_dec(&eb->io_pages);
+			}
 		} else {
 			unlock_page(page);
 		}
--
2.5.5
* Re: [PATCH v2] Btrfs: fix eb memory leak due to readpage failure
2016-06-03 19:08 [PATCH v2] Btrfs: fix eb memory leak due to readpage failure Liu Bo
@ 2016-06-13 15:36 ` David Sterba
2016-07-08 16:01 ` David Sterba
2016-07-11 17:39 ` [PATCH v3] " Liu Bo
2 siblings, 0 replies; 10+ messages in thread
From: David Sterba @ 2016-06-13 15:36 UTC (permalink / raw)
To: Liu Bo; +Cc: linux-btrfs, Josef Bacik
On Fri, Jun 03, 2016 at 12:08:38PM -0700, Liu Bo wrote:
> eb->io_pages is set in read_extent_buffer_pages().
>
> In case of readpage failure, for pages that have been added to bio,
> it calls bio_endio and later readpage_io_failed_hook() does the work.
>
> When this eb's page (couldn't be the 1st page) fails to add itself to bio
> due to failure in merge_bio(), it cannot decrease eb->io_pages via bio_endio,
> and ends up with a memory leak eventually.
>
> This lets __do_readpage propagate errors to callers and adds the
> 'atomic_dec(&eb->io_pages)'.
>
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
I'm adding this to for-next, but a review is needed if this is supposed
to go to 4.7.
* Re: [PATCH v2] Btrfs: fix eb memory leak due to readpage failure
2016-06-03 19:08 [PATCH v2] Btrfs: fix eb memory leak due to readpage failure Liu Bo
2016-06-13 15:36 ` David Sterba
@ 2016-07-08 16:01 ` David Sterba
2016-07-08 21:23 ` Liu Bo
2016-07-11 17:39 ` [PATCH v3] " Liu Bo
2 siblings, 1 reply; 10+ messages in thread
From: David Sterba @ 2016-07-08 16:01 UTC (permalink / raw)
To: Liu Bo; +Cc: linux-btrfs, Josef Bacik
On Fri, Jun 03, 2016 at 12:08:38PM -0700, Liu Bo wrote:
> eb->io_pages is set in read_extent_buffer_pages().
>
> In case of readpage failure, for pages that have been added to bio,
> it calls bio_endio and later readpage_io_failed_hook() does the work.
>
> When this eb's page (couldn't be the 1st page) fails to add itself to bio
> due to failure in merge_bio(), it cannot decrease eb->io_pages via bio_endio,
> and ends up with a memory leak eventually.
>
> This lets __do_readpage propagate errors to callers and adds the
> 'atomic_dec(&eb->io_pages)'.
I'm not sure, but could we lose some error values from __do_readpage?
I.e. return 0 even if there was an error in a page that's in the middle
(not the first, not the last).
The loop in __do_readpage iterates while (cur <= end), and ret is only
set by submit_extent_page, but the loop does not exit immediately. So
we can detect an error and set the page error state bit, but the next
iteration will overwrite ret with 0 (if that submission was ok).
Then we still don't decrement the io_pages as needed.
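
To make the concern above concrete, here is a minimal userspace sketch of
the pattern, not the kernel code itself. The function read_chunks_v2() and
the results[] array are hypothetical stand-ins: results[i] plays the role
of the return value of submit_extent_page() for the i-th iteration, and
page_error models the effect of SetPageError().

```c
#include <assert.h>
#include <errno.h>	/* callers would pass values such as -EIO */
#include <stddef.h>

/*
 * Toy model of the v2 loop: ret is reassigned on every iteration and the
 * loop keeps going after a failure, so only the result of the last
 * iteration survives as the return value.
 */
static int read_chunks_v2(const int *results, size_t n, int *page_error)
{
	int ret = 0;
	size_t i;

	assert(results != NULL && page_error != NULL);
	*page_error = 0;
	for (i = 0; i < n; i++) {
		ret = results[i];	/* models ret = submit_extent_page(...) */
		if (ret)
			*page_error = 1;	/* models SetPageError(page) */
	}
	return ret;
}
```

With results {0, -EIO, 0} the page error flag ends up set, yet the
function returns 0, so a caller that only looks at the return value would
not know it has to drop its io_pages reference.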
* Re: [PATCH v2] Btrfs: fix eb memory leak due to readpage failure
2016-07-08 16:01 ` David Sterba
@ 2016-07-08 21:23 ` Liu Bo
0 siblings, 0 replies; 10+ messages in thread
From: Liu Bo @ 2016-07-08 21:23 UTC (permalink / raw)
To: dsterba; +Cc: linux-btrfs, Josef Bacik
On Fri, Jul 08, 2016 at 06:01:49PM +0200, David Sterba wrote:
> On Fri, Jun 03, 2016 at 12:08:38PM -0700, Liu Bo wrote:
> > eb->io_pages is set in read_extent_buffer_pages().
> >
> > In case of readpage failure, for pages that have been added to bio,
> > it calls bio_endio and later readpage_io_failed_hook() does the work.
> >
> > When this eb's page (couldn't be the 1st page) fails to add itself to bio
> > due to failure in merge_bio(), it cannot decrease eb->io_pages via bio_endio,
> > and ends up with a memory leak eventually.
> >
> > This lets __do_readpage propagate errors to callers and adds the
> > 'atomic_dec(&eb->io_pages)'.
>
> I'm not sure, but could we lose some error values from __do_readpage?
> I.e. return 0 even if there was an error in a page that's in the middle
> (not the first, not the last).
>
> The loop in __do_readpage iterates while (cur <= end), and ret is only
> set by submit_extent_page, but the loop does not exit immediately. So
> we can detect an error and set the page error state bit, but the next
> iteration will overwrite ret with 0 (if that submission was ok).
>
> Then we still don't decrement the io_pages as needed.
Right, it still has that problem. One possible way I can see is to break
out of the while (cur <= end) loop when submit_extent_page() fails and
pass the error up to the caller, so that we can do the remaining
eb->io_pages cleanup work in read_extent_buffer_pages(), just like we do
in write_one_eb() (this was already suggested by Josef, but it seems I
was off the right track).
This also assumes that if one page fails in submit_extent_page(), the
remaining pages are likely to fail as well.
What do you think?
Thanks,
-liubo
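
The proposal above can be sketched in plain userspace C as follows. This is
only an illustration under stated assumptions: read_chunks_bail() and the
results[] array are hypothetical, with results[i] standing in for the
return value of submit_extent_page() on the i-th iteration.

```c
#include <assert.h>
#include <errno.h>	/* callers would pass values such as -EIO */
#include <stddef.h>

/*
 * Toy model of the proposed loop: stop at the first failed submission so
 * that the error is what the caller sees. *submitted counts the
 * iterations that succeeded before the loop stopped.
 */
static int read_chunks_bail(const int *results, size_t n, size_t *submitted)
{
	int ret = 0;
	size_t i;

	assert(results != NULL && submitted != NULL);
	*submitted = 0;
	for (i = 0; i < n; i++) {
		ret = results[i];	/* models ret = submit_extent_page(...) */
		if (ret)
			break;		/* models bailing out of the loop */
		(*submitted)++;
	}
	return ret;
}
```

With results {0, -EIO, 0} the function returns -EIO after one successful
submission, so the caller can do the counter cleanup itself.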
* [PATCH v3] Btrfs: fix eb memory leak due to readpage failure
2016-06-03 19:08 [PATCH v2] Btrfs: fix eb memory leak due to readpage failure Liu Bo
2016-06-13 15:36 ` David Sterba
2016-07-08 16:01 ` David Sterba
@ 2016-07-11 17:39 ` Liu Bo
2016-07-11 18:27 ` Chris Mason
2016-07-12 17:30 ` David Sterba
2 siblings, 2 replies; 10+ messages in thread
From: Liu Bo @ 2016-07-11 17:39 UTC (permalink / raw)
To: linux-btrfs; +Cc: David Sterba
eb->io_pages is set in read_extent_buffer_pages().
In case of readpage failure, pages that have already been added to a bio
go through bio_endio, and readpage_io_failed_hook() later does the cleanup.
However, when one of this eb's pages (it cannot be the 1st page) fails to add
itself to the bio because merge_bio() fails, eb->io_pages is never decreased
via bio_endio, and the eb eventually ends up being leaked.
This lets __do_readpage propagate errors to callers and adds the missing
'atomic_dec(&eb->io_pages)'.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
v2: - Move 'dec io_pages' to the caller so that we're consistent with
write_one_eb()
v3: - Bail out once we fail to read a page and do the cleanup work
for eb->io_pages
fs/btrfs/extent_io.c | 25 ++++++++++++++++++++++---
1 file changed, 22 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index ac1a696..7303e5a 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2878,6 +2878,7 @@ __get_extent_map(struct inode *inode, struct page *page, size_t pg_offset,
  * into the tree that are removed when the IO is done (by the end_io
  * handlers)
  * XXX JDM: This needs looking at to ensure proper page locking
+ * return 0 on success, otherwise return error
  */
 static int __do_readpage(struct extent_io_tree *tree,
 			 struct page *page,
@@ -2899,7 +2900,7 @@ static int __do_readpage(struct extent_io_tree *tree,
 	sector_t sector;
 	struct extent_map *em;
 	struct block_device *bdev;
-	int ret;
+	int ret = 0;
 	int nr = 0;
 	size_t pg_offset = 0;
 	size_t iosize;
@@ -3080,6 +3081,7 @@ static int __do_readpage(struct extent_io_tree *tree,
 		} else {
 			SetPageError(page);
 			unlock_extent(tree, cur, cur + iosize - 1);
+			goto out;
 		}
 		cur = cur + iosize;
 		pg_offset += iosize;
@@ -3090,7 +3092,7 @@ out:
 		SetPageUptodate(page);
 		unlock_page(page);
 	}
-	return 0;
+	return ret;
 }

 static inline void __do_contiguous_readpages(struct extent_io_tree *tree,
@@ -5230,14 +5232,31 @@ int read_extent_buffer_pages(struct extent_io_tree *tree,
 	atomic_set(&eb->io_pages, num_reads);
 	for (i = start_i; i < num_pages; i++) {
 		page = eb->pages[i];
+
 		if (!PageUptodate(page)) {
+			if (ret) {
+				atomic_dec(&eb->io_pages);
+				unlock_page(page);
+				continue;
+			}
+
 			ClearPageError(page);
 			err = __extent_read_full_page(tree, page,
 						      get_extent, &bio,
 						      mirror_num, &bio_flags,
 						      READ | REQ_META);
-			if (err)
+			if (err) {
 				ret = err;
+				/*
+				 * We use &bio in above __extent_read_full_page,
+				 * so we ensure that if it returns error, the
+				 * current page fails to add itself to bio and
+				 * it's been unlocked.
+				 *
+				 * We must dec io_pages by ourselves.
+				 */
+				atomic_dec(&eb->io_pages);
+			}
 		} else {
 			unlock_page(page);
 		}
--
2.5.5
* Re: [PATCH v3] Btrfs: fix eb memory leak due to readpage failure
2016-07-11 17:39 ` [PATCH v3] " Liu Bo
@ 2016-07-11 18:27 ` Chris Mason
2016-07-11 22:48 ` Liu Bo
2016-07-12 17:30 ` David Sterba
1 sibling, 1 reply; 10+ messages in thread
From: Chris Mason @ 2016-07-11 18:27 UTC (permalink / raw)
To: Liu Bo, linux-btrfs; +Cc: David Sterba
On 07/11/2016 01:39 PM, Liu Bo wrote:
> eb->io_pages is set in read_extent_buffer_pages().
>
> In case of readpage failure, for pages that have been added to bio,
> it calls bio_endio and later readpage_io_failed_hook() does the work.
>
> When this eb's page (couldn't be the 1st page) fails to add itself to bio
> due to failure in merge_bio(), it cannot decrease eb->io_pages via bio_endio,
> and ends up with a memory leak eventually.
>
> This lets __do_readpage propagate errors to callers and adds the
> 'atomic_dec(&eb->io_pages)'.
Thanks for looking at this Liu, how is it currently being tested?
-chris
* Re: [PATCH v3] Btrfs: fix eb memory leak due to readpage failure
2016-07-11 18:27 ` Chris Mason
@ 2016-07-11 22:48 ` Liu Bo
2016-07-11 22:54 ` Chris Mason
0 siblings, 1 reply; 10+ messages in thread
From: Liu Bo @ 2016-07-11 22:48 UTC (permalink / raw)
To: Chris Mason; +Cc: linux-btrfs, David Sterba
On Mon, Jul 11, 2016 at 02:27:39PM -0400, Chris Mason wrote:
>
>
> On 07/11/2016 01:39 PM, Liu Bo wrote:
> > eb->io_pages is set in read_extent_buffer_pages().
> >
> > In case of readpage failure, for pages that have been added to bio,
> > it calls bio_endio and later readpage_io_failed_hook() does the work.
> >
> > When this eb's page (couldn't be the 1st page) fails to add itself to bio
> > due to failure in merge_bio(), it cannot decrease eb->io_pages via bio_endio,
> > and ends up with a memory leak eventually.
> >
> > This lets __do_readpage propagate errors to callers and adds the
> > 'atomic_dec(&eb->io_pages)'.
>
> Thanks for looking at this Liu, how is it currently being tested?
I have a btrfs disk image which was corrupted with the btrfs-corrupt-block
tool. In that image, the chunk tree's content has been removed while the
chunk node can still be read successfully, so we get -EIO when
trying to read the tree root's node, since __btrfs_map_block() fails to
find the right item in the chunk mapping_tree. Thus, we can test our error
handling path in read_extent_buffer_pages().
Thanks,
-liubo
>
> -chris
* Re: [PATCH v3] Btrfs: fix eb memory leak due to readpage failure
2016-07-11 22:48 ` Liu Bo
@ 2016-07-11 22:54 ` Chris Mason
2016-07-11 23:04 ` Liu Bo
0 siblings, 1 reply; 10+ messages in thread
From: Chris Mason @ 2016-07-11 22:54 UTC (permalink / raw)
To: Liu Bo; +Cc: linux-btrfs, David Sterba
On Mon, Jul 11, 2016 at 03:48:38PM -0700, Liu Bo wrote:
>On Mon, Jul 11, 2016 at 02:27:39PM -0400, Chris Mason wrote:
>>
>>
>> On 07/11/2016 01:39 PM, Liu Bo wrote:
>> > eb->io_pages is set in read_extent_buffer_pages().
>> >
>> > In case of readpage failure, for pages that have been added to bio,
>> > it calls bio_endio and later readpage_io_failed_hook() does the work.
>> >
>> > When this eb's page (couldn't be the 1st page) fails to add itself to bio
>> > due to failure in merge_bio(), it cannot decrease eb->io_pages via bio_endio,
>> > and ends up with a memory leak eventually.
>> >
>> > This lets __do_readpage propagate errors to callers and adds the
>> > 'atomic_dec(&eb->io_pages)'.
>>
>> Thanks for looking at this Liu, how is it currently being tested?
>
>I have a btrfs disk image which was corrupted with the btrfs-corrupt-block
>tool. In that image, the chunk tree's content has been removed while the
>chunk node can still be read successfully, so we get -EIO when
>trying to read the tree root's node, since __btrfs_map_block() fails to
>find the right item in the chunk mapping_tree. Thus, we can test our error
>handling path in read_extent_buffer_pages().
Fantastic. Can you please make this an xfstest, maybe along with dm-flakey
as a second phase?
-chris
* Re: [PATCH v3] Btrfs: fix eb memory leak due to readpage failure
2016-07-11 22:54 ` Chris Mason
@ 2016-07-11 23:04 ` Liu Bo
0 siblings, 0 replies; 10+ messages in thread
From: Liu Bo @ 2016-07-11 23:04 UTC (permalink / raw)
To: Chris Mason, linux-btrfs, David Sterba
On Mon, Jul 11, 2016 at 06:54:02PM -0400, Chris Mason wrote:
> On Mon, Jul 11, 2016 at 03:48:38PM -0700, Liu Bo wrote:
> > On Mon, Jul 11, 2016 at 02:27:39PM -0400, Chris Mason wrote:
> > >
> > >
> > > On 07/11/2016 01:39 PM, Liu Bo wrote:
> > > > eb->io_pages is set in read_extent_buffer_pages().
> > > >
> > > > In case of readpage failure, for pages that have been added to bio,
> > > > it calls bio_endio and later readpage_io_failed_hook() does the work.
> > > >
> > > > When this eb's page (couldn't be the 1st page) fails to add itself to bio
> > > > due to failure in merge_bio(), it cannot decrease eb->io_pages via bio_endio,
> > > > and ends up with a memory leak eventually.
> > > >
> > > > This lets __do_readpage propagate errors to callers and adds the
> > > > 'atomic_dec(&eb->io_pages)'.
> > >
> > > Thanks for looking at this Liu, how is it currently being tested?
> >
> > I have a btrfs disk image which was corrupted with the btrfs-corrupt-block
> > tool. In that image, the chunk tree's content has been removed while the
> > chunk node can still be read successfully, so we get -EIO when
> > trying to read the tree root's node, since __btrfs_map_block() fails to
> > find the right item in the chunk mapping_tree. Thus, we can test our error
> > handling path in read_extent_buffer_pages().
>
> Fantastic. Can you please make this an xfstest, maybe along with dm-flakey
> as a second phase?
Sure. This depends on a btrfs-corrupt-block patch which I've not sent
out yet, but I'll try to work out an xfstests case :)
Btw, I'm also planning to add this into our fuzz images of btrfs-progs.
Thanks,
-liubo
* Re: [PATCH v3] Btrfs: fix eb memory leak due to readpage failure
2016-07-11 17:39 ` [PATCH v3] " Liu Bo
2016-07-11 18:27 ` Chris Mason
@ 2016-07-12 17:30 ` David Sterba
1 sibling, 0 replies; 10+ messages in thread
From: David Sterba @ 2016-07-12 17:30 UTC (permalink / raw)
To: Liu Bo; +Cc: linux-btrfs, David Sterba
On Mon, Jul 11, 2016 at 10:39:07AM -0700, Liu Bo wrote:
> eb->io_pages is set in read_extent_buffer_pages().
>
> In case of readpage failure, for pages that have been added to bio,
> it calls bio_endio and later readpage_io_failed_hook() does the work.
>
> When this eb's page (couldn't be the 1st page) fails to add itself to bio
> due to failure in merge_bio(), it cannot decrease eb->io_pages via bio_endio,
> and ends up with a memory leak eventually.
>
> This lets __do_readpage propagate errors to callers and adds the
> 'atomic_dec(&eb->io_pages)'.
>
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
>  		if (!PageUptodate(page)) {
> +			if (ret) {
> +				atomic_dec(&eb->io_pages);
> +				unlock_page(page);
> +				continue;
> +			}
This changes the behaviour to "fail early", which could be positive: for a
sequence of unreadable blocks, we will not try to reread all of them with
the timeouts and retries.
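
As a rough illustration of the "fail early" accounting, here is a userspace
sketch, not the kernel code. All names are hypothetical: read_results[i]
stands in for the return value of __extent_read_full_page() on page i, and
a plain int models the atomic eb->io_pages counter. A page whose read
returns 0 is assumed to be queued, so its completion handler drops the
count later; the sketch only performs the decrements the caller is
responsible for.

```c
#include <assert.h>
#include <errno.h>	/* callers would pass values such as -EIO */
#include <stddef.h>

/*
 * Toy model of the v3 caller loop. Returns the first error seen.
 * *pending is the number of pages still waiting for I/O completion when
 * the loop ends; *attempted is the number of reads actually tried.
 */
static int read_pages_fail_early(const int *read_results, size_t num_pages,
				 int *pending, size_t *attempted)
{
	int ret = 0;
	size_t i;

	assert(read_results != NULL && pending != NULL && attempted != NULL);
	*pending = (int)num_pages;	/* models atomic_set(&eb->io_pages, ...) */
	*attempted = 0;
	for (i = 0; i < num_pages; i++) {
		int err;

		if (ret) {
			(*pending)--;	/* skipped page: nobody else drops it */
			continue;
		}
		(*attempted)++;
		err = read_results[i];
		if (err) {
			ret = err;
			(*pending)--;	/* failed page never reached a bio */
		}
	}
	return ret;
}
```

With read_results {0, -EIO, 0, 0} only two reads are attempted, and the
count left pending is 1, which is exactly the one page that was queued
before the failure.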