All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Btrfs: fix unexpected balance crash due to BUG_ON
@ 2016-05-02 23:01 Liu Bo
  2016-05-03 23:14 ` Holger Hoffstätte
  2016-05-04 14:59 ` David Sterba
  0 siblings, 2 replies; 6+ messages in thread
From: Liu Bo @ 2016-05-02 23:01 UTC (permalink / raw)
  To: linux-btrfs; +Cc: master.b.at.raven

Mounting a btrfs can resume previous balance operations asynchronously.
An user got a crash when one drive has some corrupt sectors.

Since balance can cancel itself in case of any error, we can gracefully
return errors to upper layers and let balance do the cancel job.

Reported-by: sash <master.b.at.raven@chefmail.de>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/volumes.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index bd0f45f..5aed2e2 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3418,13 +3418,25 @@ static int __btrfs_balance(struct btrfs_fs_info *fs_info)
 		ret = btrfs_shrink_device(device, old_size - size_to_free);
 		if (ret == -ENOSPC)
 			break;
-		BUG_ON(ret);
+		if (ret) {
+			/* btrfs_shrink_device never returns ret > 0 */
+			WARN_ON_ONCE(ret > 0);
+			goto error;
+		}
 
 		trans = btrfs_start_transaction(dev_root, 0);
-		BUG_ON(IS_ERR(trans));
+		if (IS_ERR(trans)) {
+			ret = PTR_ERR(trans);
+			goto error;
+		}
 
 		ret = btrfs_grow_device(trans, device, old_size);
-		BUG_ON(ret);
+		if (ret) {
+			btrfs_end_transaction(trans, dev_root);
+			/* btrfs_grow_device never returns ret > 0 */
+			WARN_ON_ONCE(ret > 0);
+			goto error;
+		}
 
 		btrfs_end_transaction(trans, dev_root);
 	}
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] Btrfs: fix unexpected balance crash due to BUG_ON
  2016-05-02 23:01 [PATCH] Btrfs: fix unexpected balance crash due to BUG_ON Liu Bo
@ 2016-05-03 23:14 ` Holger Hoffstätte
  2016-05-03 23:30   ` Liu Bo
  2016-05-04 14:59 ` David Sterba
  1 sibling, 1 reply; 6+ messages in thread
From: Holger Hoffstätte @ 2016-05-03 23:14 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs, master.b.at.raven

On Tue, May 3, 2016 at 1:01 AM, Liu Bo <bo.li.liu@oracle.com> wrote:
> Mounting a btrfs can resume previous balance operations asynchronously.
> An user got a crash when one drive has some corrupt sectors.
>
> Since balance can cancel itself in case of any error, we can gracefully
> return errors to upper layers and let balance do the cancel job.
>
> Reported-by: sash <master.b.at.raven@chefmail.de>
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> ---
>  fs/btrfs/volumes.c | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index bd0f45f..5aed2e2 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -3418,13 +3418,25 @@ static int __btrfs_balance(struct btrfs_fs_info *fs_info)
>                 ret = btrfs_shrink_device(device, old_size - size_to_free);
>                 if (ret == -ENOSPC)
>                         break;
> -               BUG_ON(ret);
> +               if (ret) {
> +                       /* btrfs_shrink_device never returns ret > 0 */
> +                       WARN_ON_ONCE(ret > 0);
> +                       goto error;
> +               }
>
>                 trans = btrfs_start_transaction(dev_root, 0);
> -               BUG_ON(IS_ERR(trans));
> +               if (IS_ERR(trans)) {
> +                       ret = PTR_ERR(trans);
> +                       goto error;
> +               }
>
>                 ret = btrfs_grow_device(trans, device, old_size);
> -               BUG_ON(ret);
> +               if (ret) {
> +                       btrfs_end_transaction(trans, dev_root);
> +                       /* btrfs_grow_device never returns ret > 0 */
> +                       WARN_ON_ONCE(ret > 0);
> +                       goto error;
> +               }
>
>                 btrfs_end_transaction(trans, dev_root);
>         }

Just a heads up that this seems to introduce a valid warning, since it now
can goto error before the first initializing use of path:

fs/btrfs/volumes.c: In function 'btrfs_balance':
fs/btrfs/volumes.c:3601:2: warning: 'path' may be used uninitialized
in this function [-Wmaybe-uninitialized]
  btrfs_free_path(path);
  ^
fs/btrfs/volumes.c:3385:21: note: 'path' was declared here
  struct btrfs_path *path;
                     ^
(it's really in __btrfs_balance which got inlined, so gcc thinks it's
at the call site).

Simply setting path = NULL at the beginning of __btrfs_balance fixes it, since
btrfs_free_path allows NULL values.

cheers
Holger

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Btrfs: fix unexpected balance crash due to BUG_ON
  2016-05-03 23:14 ` Holger Hoffstätte
@ 2016-05-03 23:30   ` Liu Bo
  2016-07-08 16:05     ` David Sterba
  0 siblings, 1 reply; 6+ messages in thread
From: Liu Bo @ 2016-05-03 23:30 UTC (permalink / raw)
  To: Holger Hoffstätte; +Cc: linux-btrfs, master.b.at.raven

On Wed, May 04, 2016 at 01:14:27AM +0200, Holger Hoffstätte wrote:
> On Tue, May 3, 2016 at 1:01 AM, Liu Bo <bo.li.liu@oracle.com> wrote:
> > Mounting a btrfs can resume previous balance operations asynchronously.
> > An user got a crash when one drive has some corrupt sectors.
> >
> > Since balance can cancel itself in case of any error, we can gracefully
> > return errors to upper layers and let balance do the cancel job.
> >
> > Reported-by: sash <master.b.at.raven@chefmail.de>
> > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> > ---
> >  fs/btrfs/volumes.c | 18 +++++++++++++++---
> >  1 file changed, 15 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > index bd0f45f..5aed2e2 100644
> > --- a/fs/btrfs/volumes.c
> > +++ b/fs/btrfs/volumes.c
> > @@ -3418,13 +3418,25 @@ static int __btrfs_balance(struct btrfs_fs_info *fs_info)
> >                 ret = btrfs_shrink_device(device, old_size - size_to_free);
> >                 if (ret == -ENOSPC)
> >                         break;
> > -               BUG_ON(ret);
> > +               if (ret) {
> > +                       /* btrfs_shrink_device never returns ret > 0 */
> > +                       WARN_ON_ONCE(ret > 0);
> > +                       goto error;
> > +               }
> >
> >                 trans = btrfs_start_transaction(dev_root, 0);
> > -               BUG_ON(IS_ERR(trans));
> > +               if (IS_ERR(trans)) {
> > +                       ret = PTR_ERR(trans);
> > +                       goto error;
> > +               }
> >
> >                 ret = btrfs_grow_device(trans, device, old_size);
> > -               BUG_ON(ret);
> > +               if (ret) {
> > +                       btrfs_end_transaction(trans, dev_root);
> > +                       /* btrfs_grow_device never returns ret > 0 */
> > +                       WARN_ON_ONCE(ret > 0);
> > +                       goto error;
> > +               }
> >
> >                 btrfs_end_transaction(trans, dev_root);
> >         }
> 
> Just a heads up that this seems to introduce a valid warning, since it now
> can goto error before the first initializing use of path:
> 
> fs/btrfs/volumes.c: In function 'btrfs_balance':
> fs/btrfs/volumes.c:3601:2: warning: 'path' may be used uninitialized
> in this function [-Wmaybe-uninitialized]
>   btrfs_free_path(path);
>   ^
> fs/btrfs/volumes.c:3385:21: note: 'path' was declared here
>   struct btrfs_path *path;
>                      ^
> (it's really in __btrfs_balance which got inlined, so gcc thinks it's
> at the call site).
> 
> Simply setting path = NULL at the beginning of __btrfs_balance fixes it, since
> btrfs_free_path allows NULL values.

That's right, it's weird that I didn't get this warning while testing it.

Thanks for catching it, Holger.

Thanks,

-liubo

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Btrfs: fix unexpected balance crash due to BUG_ON
  2016-05-02 23:01 [PATCH] Btrfs: fix unexpected balance crash due to BUG_ON Liu Bo
  2016-05-03 23:14 ` Holger Hoffstätte
@ 2016-05-04 14:59 ` David Sterba
  1 sibling, 0 replies; 6+ messages in thread
From: David Sterba @ 2016-05-04 14:59 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs, master.b.at.raven

On Mon, May 02, 2016 at 04:01:02PM -0700, Liu Bo wrote:
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -3418,13 +3418,25 @@ static int __btrfs_balance(struct btrfs_fs_info *fs_info)
>  		ret = btrfs_shrink_device(device, old_size - size_to_free);
>  		if (ret == -ENOSPC)
>  			break;
> -		BUG_ON(ret);
> +		if (ret) {
> +			/* btrfs_shrink_device never returns ret > 0 */
> +			WARN_ON_ONCE(ret > 0);
> +			goto error;
> +		}
>  
>  		trans = btrfs_start_transaction(dev_root, 0);
> -		BUG_ON(IS_ERR(trans));
> +		if (IS_ERR(trans)) {
> +			ret = PTR_ERR(trans);
> +			goto error;
> +		}
>  
>  		ret = btrfs_grow_device(trans, device, old_size);
> -		BUG_ON(ret);
> +		if (ret) {
> +			btrfs_end_transaction(trans, dev_root);
> +			/* btrfs_grow_device never returns ret > 0 */
> +			WARN_ON_ONCE(ret > 0);
> +			goto error;
> +		}

The "shrink then grow" trick seems to be necessary to make the workspace
for balance. I'm thinking what could be the intermediate result when it
succeeds only partially that we should worry about. (This also means
partial success on several devices only.)

If just shrink succeeds, then there is a smaller device that the user
did not ask for. This is effectively no change from the current
behaviour, now we fail more gracefully.

My idea is to print the at least original size of the device if
transaction start or grow phase fails.

Otherwise patch looks ok.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Btrfs: fix unexpected balance crash due to BUG_ON
  2016-05-03 23:30   ` Liu Bo
@ 2016-07-08 16:05     ` David Sterba
  2016-07-08 21:26       ` Liu Bo
  0 siblings, 1 reply; 6+ messages in thread
From: David Sterba @ 2016-07-08 16:05 UTC (permalink / raw)
  To: Liu Bo; +Cc: Holger Hoffstätte, linux-btrfs, master.b.at.raven

On Tue, May 03, 2016 at 04:30:54PM -0700, Liu Bo wrote:
> > Just a heads up that this seems to introduce a valid warning, since it now
> > can goto error before the first initializing use of path:
> > 
> > fs/btrfs/volumes.c: In function 'btrfs_balance':
> > fs/btrfs/volumes.c:3601:2: warning: 'path' may be used uninitialized
> > in this function [-Wmaybe-uninitialized]
> >   btrfs_free_path(path);
> >   ^
> > fs/btrfs/volumes.c:3385:21: note: 'path' was declared here
> >   struct btrfs_path *path;
> >                      ^
> > (it's really in __btrfs_balance which got inlined, so gcc thinks it's
> > at the call site).
> > 
> > Simply setting path = NULL at the beginning of __btrfs_balance fixes it, since
> > btrfs_free_path allows NULL values.
> 
> That's right, it's weird that I didn't get this warning while testing it.
> 
> Thanks for catching it, Holger.

Please send a v2, the patch is desiable.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Btrfs: fix unexpected balance crash due to BUG_ON
  2016-07-08 16:05     ` David Sterba
@ 2016-07-08 21:26       ` Liu Bo
  0 siblings, 0 replies; 6+ messages in thread
From: Liu Bo @ 2016-07-08 21:26 UTC (permalink / raw)
  To: dsterba; +Cc: Holger Hoffstätte, linux-btrfs, master.b.at.raven

On Fri, Jul 08, 2016 at 06:05:16PM +0200, David Sterba wrote:
> On Tue, May 03, 2016 at 04:30:54PM -0700, Liu Bo wrote:
> > > Just a heads up that this seems to introduce a valid warning, since it now
> > > can goto error before the first initializing use of path:
> > > 
> > > fs/btrfs/volumes.c: In function 'btrfs_balance':
> > > fs/btrfs/volumes.c:3601:2: warning: 'path' may be used uninitialized
> > > in this function [-Wmaybe-uninitialized]
> > >   btrfs_free_path(path);
> > >   ^
> > > fs/btrfs/volumes.c:3385:21: note: 'path' was declared here
> > >   struct btrfs_path *path;
> > >                      ^
> > > (it's really in __btrfs_balance which got inlined, so gcc thinks it's
> > > at the call site).
> > > 
> > > Simply setting path = NULL at the beginning of __btrfs_balance fixes it, since
> > > btrfs_free_path allows NULL values.
> > 
> > That's right, it's weird that I didn't get this warning while testing it.
> > 
> > Thanks for catching it, Holger.
> 
> Please send a v2, the patch is desiable.

Oh, I almost forgot this one, thanks for the reminder.

Thanks,

-liubo

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-07-08 21:27 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-02 23:01 [PATCH] Btrfs: fix unexpected balance crash due to BUG_ON Liu Bo
2016-05-03 23:14 ` Holger Hoffstätte
2016-05-03 23:30   ` Liu Bo
2016-07-08 16:05     ` David Sterba
2016-07-08 21:26       ` Liu Bo
2016-05-04 14:59 ` David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.