* [PATCH v2] Btrfs: fix eb memory leak due to readpage failure
@ 2016-06-03 19:08 Liu Bo
  2016-06-13 15:36 ` David Sterba
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Liu Bo @ 2016-06-03 19:08 UTC
  To: linux-btrfs; +Cc: Josef Bacik

eb->io_pages is set in read_extent_buffer_pages().

In case of readpage failure, for pages that have been added to bio,
it calls bio_endio and later readpage_io_failed_hook() does the work.

When this eb's page (couldn't be the 1st page) fails to add itself to bio
due to failure in merge_bio(), it cannot decrease eb->io_pages via bio_endio,
 and ends up with a memory leak eventually.

This lets __do_readpage propagate errors to callers and adds the
 'atomic_dec(&eb->io_pages)'.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
v2:
  - Move 'dec io_pages' to the caller so that we're consistent with
    write_one_eb()

 fs/btrfs/extent_io.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index d247fc0..0309388 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2869,6 +2869,7 @@ __get_extent_map(struct inode *inode, struct page *page, size_t pg_offset,
  * into the tree that are removed when the IO is done (by the end_io
  * handlers)
  * XXX JDM: This needs looking at to ensure proper page locking
+ * return 0 on success, otherwise return error
  */
 static int __do_readpage(struct extent_io_tree *tree,
 			 struct page *page,
@@ -2890,7 +2891,7 @@ static int __do_readpage(struct extent_io_tree *tree,
 	sector_t sector;
 	struct extent_map *em;
 	struct block_device *bdev;
-	int ret;
+	int ret = 0;
 	int nr = 0;
 	size_t pg_offset = 0;
 	size_t iosize;
@@ -3081,7 +3082,7 @@ out:
 			SetPageUptodate(page);
 		unlock_page(page);
 	}
-	return 0;
+	return ret;
 }
 
 static inline void __do_contiguous_readpages(struct extent_io_tree *tree,
@@ -5204,8 +5205,17 @@ int read_extent_buffer_pages(struct extent_io_tree *tree,
 						      get_extent, &bio,
 						      mirror_num, &bio_flags,
 						      READ | REQ_META);
-			if (err)
+			if (err) {
 				ret = err;
+				/*
+				 * We use &bio in above __extent_read_full_page,
+				 * so we ensure that if it returns error, the
+				 * current page fails to add itself to bio.
+				 *
+				 * We must dec io_pages by ourselves.
+				 */
+				atomic_dec(&eb->io_pages);
+			}
 		} else {
 			unlock_page(page);
 		}
-- 
2.5.5



* Re: [PATCH v2] Btrfs: fix eb memory leak due to readpage failure
  2016-06-03 19:08 [PATCH v2] Btrfs: fix eb memory leak due to readpage failure Liu Bo
@ 2016-06-13 15:36 ` David Sterba
  2016-07-08 16:01 ` David Sterba
  2016-07-11 17:39 ` [PATCH v3] " Liu Bo
  2 siblings, 0 replies; 10+ messages in thread
From: David Sterba @ 2016-06-13 15:36 UTC
  To: Liu Bo; +Cc: linux-btrfs, Josef Bacik

On Fri, Jun 03, 2016 at 12:08:38PM -0700, Liu Bo wrote:
> eb->io_pages is set in read_extent_buffer_pages().
> 
> In case of readpage failure, for pages that have been added to bio,
> it calls bio_endio and later readpage_io_failed_hook() does the work.
> 
> When this eb's page (couldn't be the 1st page) fails to add itself to bio
> due to failure in merge_bio(), it cannot decrease eb->io_pages via bio_endio,
>  and ends up with a memory leak eventually.
> 
> This lets __do_readpage propagate errors to callers and adds the
>  'atomic_dec(&eb->io_pages)'.
> 
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

I'm adding this to for-next, but a review is needed if this is supposed
to go to 4.7.


* Re: [PATCH v2] Btrfs: fix eb memory leak due to readpage failure
  2016-06-03 19:08 [PATCH v2] Btrfs: fix eb memory leak due to readpage failure Liu Bo
  2016-06-13 15:36 ` David Sterba
@ 2016-07-08 16:01 ` David Sterba
  2016-07-08 21:23   ` Liu Bo
  2016-07-11 17:39 ` [PATCH v3] " Liu Bo
  2 siblings, 1 reply; 10+ messages in thread
From: David Sterba @ 2016-07-08 16:01 UTC
  To: Liu Bo; +Cc: linux-btrfs, Josef Bacik

On Fri, Jun 03, 2016 at 12:08:38PM -0700, Liu Bo wrote:
> eb->io_pages is set in read_extent_buffer_pages().
> 
> In case of readpage failure, for pages that have been added to bio,
> it calls bio_endio and later readpage_io_failed_hook() does the work.
> 
> When this eb's page (couldn't be the 1st page) fails to add itself to bio
> due to failure in merge_bio(), it cannot decrease eb->io_pages via bio_endio,
>  and ends up with a memory leak eventually.
> 
> This lets __do_readpage propagate errors to callers and adds the
>  'atomic_dec(&eb->io_pages)'.

I'm not sure, but could we lose some error values from __do_readpage?
I.e. return 0 even if there was an error in a page that's in the middle
(not the first, not the last).

The loop in __do_readpage iterates while (cur <= end), and ret is only
set by submit_extent_page, but the loop does not exit immediately. So
we can detect an error, set the page error state bit, but the next
iteration will overwrite ret with 0 (if the page submission was ok).

Then we still don't decrement the io_pages as needed.
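
To make the concern concrete, here is a minimal user-space sketch of the
overwrite pattern. It is not the kernel code: submit_one() is a
hypothetical stand-in for submit_extent_page(), and its failure pattern
(only index 1 fails) is an assumption chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for submit_extent_page(): only index 1 fails. */
static int submit_one(size_t idx)
{
	return idx == 1 ? -EIO : 0;
}

/*
 * Mimics the shape of the loop under discussion: ret is reassigned on
 * every iteration and the loop does not stop at the first failure.
 */
int read_all_chunks(size_t nr_chunks)
{
	int ret = 0;
	size_t i;

	assert(nr_chunks > 0);
	for (i = 0; i < nr_chunks; i++)
		ret = submit_one(i);	/* a later success overwrites -EIO */

	return ret;
}
```

With three iterations, the -EIO from the middle one is overwritten by
the success of the last one, so read_all_chunks(3) returns 0 and the
caller never learns that it has to fix up io_pages.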


* Re: [PATCH v2] Btrfs: fix eb memory leak due to readpage failure
  2016-07-08 16:01 ` David Sterba
@ 2016-07-08 21:23   ` Liu Bo
  0 siblings, 0 replies; 10+ messages in thread
From: Liu Bo @ 2016-07-08 21:23 UTC
  To: dsterba; +Cc: linux-btrfs, Josef Bacik

On Fri, Jul 08, 2016 at 06:01:49PM +0200, David Sterba wrote:
> On Fri, Jun 03, 2016 at 12:08:38PM -0700, Liu Bo wrote:
> > eb->io_pages is set in read_extent_buffer_pages().
> > 
> > In case of readpage failure, for pages that have been added to bio,
> > it calls bio_endio and later readpage_io_failed_hook() does the work.
> > 
> > When this eb's page (couldn't be the 1st page) fails to add itself to bio
> > due to failure in merge_bio(), it cannot decrease eb->io_pages via bio_endio,
> >  and ends up with a memory leak eventually.
> > 
> > This lets __do_readpage propagate errors to callers and adds the
> >  'atomic_dec(&eb->io_pages)'.
> 
> I'm not sure, but could we lose some error values from __do_readpage?
> I.e. return 0 even if there was an error in a page that's in the middle
> (not the first, not the last).
> 
> The loop in __do_readpage iterates while (cur <= end), and ret is only
> set by submit_extent_page, but the loop does not exit immediately. So
> we can detect an error, set the page error state bit, but the next
> iteration will overwrite ret with 0 (if the page submission was ok).
> 
> Then we still don't decrement the io_pages as needed.

Right, it still has that problem.  The way I can see to fix it is to
break out of the while (cur <= end) loop when submit_extent_page() fails
and pass the error up to the caller, so that the remaining eb->io_pages
cleanup can be done in read_extent_buffer_pages(), just like we do in
write_one_eb() (Josef already suggested this, but it seems I was off the
right track).

This also assumes that if one page fails in submit_extent_page(), the
rest of the pages are likely to fail as well.
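
As a rough user-space sketch of that idea (not the actual patch):
submit_one() is again a hypothetical stand-in for submit_extent_page()
that fails only for index 1, and the submitted counter exists only so
the effect can be observed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for submit_extent_page(): only index 1 fails. */
static int submit_one(size_t idx)
{
	return idx == 1 ? -EIO : 0;
}

/*
 * Stops at the first failure so the error reaches the caller, and
 * reports how many chunks were submitted before that point.
 */
int read_chunks_bail_out(size_t nr_chunks, size_t *submitted)
{
	int ret = 0;
	size_t i;

	assert(submitted != NULL);
	*submitted = 0;
	for (i = 0; i < nr_chunks; i++) {
		ret = submit_one(i);
		if (ret)
			break;		/* keep the first error, skip the rest */
		(*submitted)++;
	}
	return ret;
}
```

Because the loop exits on the first error, ret can no longer be reset
to 0 by a later successful submission, and the caller can do the
cleanup based on the returned value.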

What do you think?

Thanks,

-liubo


* [PATCH v3] Btrfs: fix eb memory leak due to readpage failure
  2016-06-03 19:08 [PATCH v2] Btrfs: fix eb memory leak due to readpage failure Liu Bo
  2016-06-13 15:36 ` David Sterba
  2016-07-08 16:01 ` David Sterba
@ 2016-07-11 17:39 ` Liu Bo
  2016-07-11 18:27   ` Chris Mason
  2016-07-12 17:30   ` David Sterba
  2 siblings, 2 replies; 10+ messages in thread
From: Liu Bo @ 2016-07-11 17:39 UTC
  To: linux-btrfs; +Cc: David Sterba

eb->io_pages is set in read_extent_buffer_pages().

In case of readpage failure, for pages that have been added to bio,
it calls bio_endio and later readpage_io_failed_hook() does the work.

When this eb's page (couldn't be the 1st page) fails to add itself to bio
due to failure in merge_bio(), it cannot decrease eb->io_pages via bio_endio,
 and ends up with a memory leak eventually.

This lets __do_readpage propagate errors to callers and adds the
 'atomic_dec(&eb->io_pages)'.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
v2: - Move 'dec io_pages' to the caller so that we're consistent with
      write_one_eb()
v3: - Bail out once we fail to read a page and do the cleanup work
      for eb->io_pages

 fs/btrfs/extent_io.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index ac1a696..7303e5a 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2878,6 +2878,7 @@ __get_extent_map(struct inode *inode, struct page *page, size_t pg_offset,
  * into the tree that are removed when the IO is done (by the end_io
  * handlers)
  * XXX JDM: This needs looking at to ensure proper page locking
+ * return 0 on success, otherwise return error
  */
 static int __do_readpage(struct extent_io_tree *tree,
 			 struct page *page,
@@ -2899,7 +2900,7 @@ static int __do_readpage(struct extent_io_tree *tree,
 	sector_t sector;
 	struct extent_map *em;
 	struct block_device *bdev;
-	int ret;
+	int ret = 0;
 	int nr = 0;
 	size_t pg_offset = 0;
 	size_t iosize;
@@ -3080,6 +3081,7 @@ static int __do_readpage(struct extent_io_tree *tree,
 		} else {
 			SetPageError(page);
 			unlock_extent(tree, cur, cur + iosize - 1);
+			goto out;
 		}
 		cur = cur + iosize;
 		pg_offset += iosize;
@@ -3090,7 +3092,7 @@ out:
 			SetPageUptodate(page);
 		unlock_page(page);
 	}
-	return 0;
+	return ret;
 }
 
 static inline void __do_contiguous_readpages(struct extent_io_tree *tree,
@@ -5230,14 +5232,31 @@ int read_extent_buffer_pages(struct extent_io_tree *tree,
 	atomic_set(&eb->io_pages, num_reads);
 	for (i = start_i; i < num_pages; i++) {
 		page = eb->pages[i];
+
 		if (!PageUptodate(page)) {
+			if (ret) {
+				atomic_dec(&eb->io_pages);
+				unlock_page(page);
+				continue;
+			}
+
 			ClearPageError(page);
 			err = __extent_read_full_page(tree, page,
 						      get_extent, &bio,
 						      mirror_num, &bio_flags,
 						      READ | REQ_META);
-			if (err)
+			if (err) {
 				ret = err;
+				/*
+				 * We use &bio in above __extent_read_full_page,
+				 * so we ensure that if it returns error, the
+				 * current page fails to add itself to bio and
+				 * it's been unlocked.
+				 *
+				 * We must dec io_pages by ourselves.
+				 */
+				atomic_dec(&eb->io_pages);
+			}
 		} else {
 			unlock_page(page);
 		}
-- 
2.5.5



* Re: [PATCH v3] Btrfs: fix eb memory leak due to readpage failure
  2016-07-11 17:39 ` [PATCH v3] " Liu Bo
@ 2016-07-11 18:27   ` Chris Mason
  2016-07-11 22:48     ` Liu Bo
  2016-07-12 17:30   ` David Sterba
  1 sibling, 1 reply; 10+ messages in thread
From: Chris Mason @ 2016-07-11 18:27 UTC
  To: Liu Bo, linux-btrfs; +Cc: David Sterba



On 07/11/2016 01:39 PM, Liu Bo wrote:
> eb->io_pages is set in read_extent_buffer_pages().
>
> In case of readpage failure, for pages that have been added to bio,
> it calls bio_endio and later readpage_io_failed_hook() does the work.
>
> When this eb's page (couldn't be the 1st page) fails to add itself to bio
> due to failure in merge_bio(), it cannot decrease eb->io_pages via bio_endio,
>  and ends up with a memory leak eventually.
>
> This lets __do_readpage propagate errors to callers and adds the
>  'atomic_dec(&eb->io_pages)'.

Thanks for looking at this Liu, how is it currently being tested?

-chris


* Re: [PATCH v3] Btrfs: fix eb memory leak due to readpage failure
  2016-07-11 18:27   ` Chris Mason
@ 2016-07-11 22:48     ` Liu Bo
  2016-07-11 22:54       ` Chris Mason
  0 siblings, 1 reply; 10+ messages in thread
From: Liu Bo @ 2016-07-11 22:48 UTC
  To: Chris Mason; +Cc: linux-btrfs, David Sterba

On Mon, Jul 11, 2016 at 02:27:39PM -0400, Chris Mason wrote:
> 
> 
> On 07/11/2016 01:39 PM, Liu Bo wrote:
> > eb->io_pages is set in read_extent_buffer_pages().
> > 
> > In case of readpage failure, for pages that have been added to bio,
> > it calls bio_endio and later readpage_io_failed_hook() does the work.
> > 
> > When this eb's page (couldn't be the 1st page) fails to add itself to bio
> > due to failure in merge_bio(), it cannot decrease eb->io_pages via bio_endio,
> >  and ends up with a memory leak eventually.
> > 
> > This lets __do_readpage propagate errors to callers and adds the
> >  'atomic_dec(&eb->io_pages)'.
> 
> Thanks for looking at this Liu, how is it currently being tested?

I have a btrfs disk image that was corrupted with the btrfs-corrupt-block
tool.  In that image, the chunk tree's content has been removed while the
chunk node can still be read successfully, so we get -EIO when trying to
read the tree root's node, since __btrfs_map_block() fails to find the
right item in the chunk mapping_tree.  Thus, we can test our error
handling path in read_extent_buffer_pages().

Thanks,

-liubo

> 
> -chris


* Re: [PATCH v3] Btrfs: fix eb memory leak due to readpage failure
  2016-07-11 22:48     ` Liu Bo
@ 2016-07-11 22:54       ` Chris Mason
  2016-07-11 23:04         ` Liu Bo
  0 siblings, 1 reply; 10+ messages in thread
From: Chris Mason @ 2016-07-11 22:54 UTC
  To: Liu Bo; +Cc: linux-btrfs, David Sterba

On Mon, Jul 11, 2016 at 03:48:38PM -0700, Liu Bo wrote:
>On Mon, Jul 11, 2016 at 02:27:39PM -0400, Chris Mason wrote:
>>
>>
>> On 07/11/2016 01:39 PM, Liu Bo wrote:
>> > eb->io_pages is set in read_extent_buffer_pages().
>> >
>> > In case of readpage failure, for pages that have been added to bio,
>> > it calls bio_endio and later readpage_io_failed_hook() does the work.
>> >
>> > When this eb's page (couldn't be the 1st page) fails to add itself to bio
>> > due to failure in merge_bio(), it cannot decrease eb->io_pages via bio_endio,
>> >  and ends up with a memory leak eventually.
>> >
>> > This lets __do_readpage propagate errors to callers and adds the
>> >  'atomic_dec(&eb->io_pages)'.
>>
>> Thanks for looking at this Liu, how is it currently being tested?
>
>I have a btrfs disk image that was corrupted with the btrfs-corrupt-block
>tool.  In that image, the chunk tree's content has been removed while the
>chunk node can still be read successfully, so we get -EIO when trying to
>read the tree root's node, since __btrfs_map_block() fails to find the
>right item in the chunk mapping_tree.  Thus, we can test our error
>handling path in read_extent_buffer_pages().

Fantastic.  Can you please make this an xfstest, maybe along with
dm-flakey as the second phase?

-chris



* Re: [PATCH v3] Btrfs: fix eb memory leak due to readpage failure
  2016-07-11 22:54       ` Chris Mason
@ 2016-07-11 23:04         ` Liu Bo
  0 siblings, 0 replies; 10+ messages in thread
From: Liu Bo @ 2016-07-11 23:04 UTC
  To: Chris Mason, linux-btrfs, David Sterba

On Mon, Jul 11, 2016 at 06:54:02PM -0400, Chris Mason wrote:
> On Mon, Jul 11, 2016 at 03:48:38PM -0700, Liu Bo wrote:
> > On Mon, Jul 11, 2016 at 02:27:39PM -0400, Chris Mason wrote:
> > > 
> > > 
> > > On 07/11/2016 01:39 PM, Liu Bo wrote:
> > > > eb->io_pages is set in read_extent_buffer_pages().
> > > >
> > > > In case of readpage failure, for pages that have been added to bio,
> > > > it calls bio_endio and later readpage_io_failed_hook() does the work.
> > > >
> > > > When this eb's page (couldn't be the 1st page) fails to add itself to bio
> > > > due to failure in merge_bio(), it cannot decrease eb->io_pages via bio_endio,
> > > >  and ends up with a memory leak eventually.
> > > >
> > > > This lets __do_readpage propagate errors to callers and adds the
> > > >  'atomic_dec(&eb->io_pages)'.
> > > 
> > > Thanks for looking at this Liu, how is it currently being tested?
> > 
> > I have a btrfs disk image that was corrupted with the btrfs-corrupt-block
> > tool.  In that image, the chunk tree's content has been removed while the
> > chunk node can still be read successfully, so we get -EIO when trying to
> > read the tree root's node, since __btrfs_map_block() fails to find the
> > right item in the chunk mapping_tree.  Thus, we can test our error
> > handling path in read_extent_buffer_pages().
> 
> Fantastic.  Can you please make this an xfstest, maybe along with
> dm-flakey as the second phase?

Sure.  This depends on a btrfs-corrupt-block patch that I haven't sent
out yet; I'll try to work out an xfstests case :)

Btw, I'm also planning to add this to the fuzz images in btrfs-progs.

Thanks,

-liubo


* Re: [PATCH v3] Btrfs: fix eb memory leak due to readpage failure
  2016-07-11 17:39 ` [PATCH v3] " Liu Bo
  2016-07-11 18:27   ` Chris Mason
@ 2016-07-12 17:30   ` David Sterba
  1 sibling, 0 replies; 10+ messages in thread
From: David Sterba @ 2016-07-12 17:30 UTC
  To: Liu Bo; +Cc: linux-btrfs, David Sterba

On Mon, Jul 11, 2016 at 10:39:07AM -0700, Liu Bo wrote:
> eb->io_pages is set in read_extent_buffer_pages().
> 
> In case of readpage failure, for pages that have been added to bio,
> it calls bio_endio and later readpage_io_failed_hook() does the work.
> 
> When this eb's page (couldn't be the 1st page) fails to add itself to bio
> due to failure in merge_bio(), it cannot decrease eb->io_pages via bio_endio,
>  and ends up with a memory leak eventually.
> 
> This lets __do_readpage propagate errors to callers and adds the
>  'atomic_dec(&eb->io_pages)'.
> 
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Reviewed-by: David Sterba <dsterba@suse.com>

>  		if (!PageUptodate(page)) {
> +			if (ret) {
> +				atomic_dec(&eb->io_pages);
> +				unlock_page(page);
> +				continue;
> +			}

This changes the behaviour to "fail early", which could be positive:
for a sequence of unreadable blocks, we will not try to read every one
of them and go through the timeouts and retries each time.
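
A simplified user-space model of the io_pages accounting may help to
show why the extra decrements matter. This is only a sketch under
stated assumptions: read_page() is a hypothetical stand-in for
__extent_read_full_page() in which page 1 never reaches the bio, a
plain int replaces the atomic counter, and every submitted page is
assumed to be decremented exactly once by the end_io path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NR_PAGES 4

/* Hypothetical stand-in for __extent_read_full_page(): page 1 fails. */
static int read_page(size_t idx)
{
	return idx == 1 ? -EIO : 0;
}

/*
 * Returns the counter value left once all submitted pages have
 * "completed".  fail_early != 0 models the v3 behaviour, 0 models a
 * loop that never decrements for pages that did not reach the bio.
 */
int pending_after_read(int fail_early)
{
	int io_pages = NR_PAGES;	/* like atomic_set(&eb->io_pages, ...) */
	int submitted = 0;
	int ret = 0;
	size_t i;

	for (i = 0; i < NR_PAGES; i++) {
		int err;

		if (fail_early && ret) {
			io_pages--;	/* skipped page: drop its count here */
			continue;
		}
		err = read_page(i);
		if (err) {
			ret = err;
			if (fail_early)
				io_pages--;	/* page never reached the bio */
			continue;
		}
		submitted++;
	}

	io_pages -= submitted;		/* the end_io path, one per page */
	assert(io_pages >= 0);
	return io_pages;
}
```

In this model pending_after_read(0) leaves the counter at 1, which is
the stuck count that keeps the eb from being released, while
pending_after_read(1) brings it back to 0 after reading only the first
page.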
