All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] btrfs: fix BUG trying to resume balance without resume flag
@ 2018-04-30  9:48 Anand Jain
  2018-04-30 14:11 ` kbuild test robot
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Anand Jain @ 2018-04-30  9:48 UTC (permalink / raw)
  To: linux-btrfs

We set the BTRFS_BALANCE_RESUME flag in the btrfs_recover_balance(),
which is not called during the remount. So when resuming from the
paused balance we hit BUG.

 kernel: kernel BUG at fs/btrfs/volumes.c:3890!
 ::
 kernel:  balance_kthread+0x51/0x60 [btrfs]
 kernel:  kthread+0x111/0x130
 ::
 kernel: RIP: btrfs_balance+0x12e1/0x1570 [btrfs] RSP: ffffba7d0090bde8

Reproducer:
  On a mounted BTRFS.

  btrfs balance start --full-balance /btrfs
  btrfs balance pause /btrfs
  mount -o remount,ro /dev/sdb /btrfs
  mount -o remount,rw /dev/sdb /btrfs

To fix this set the BTRFS_BALANCE_RESUME flag in btrfs_resume_balance_async()
instead of btrfs_recover_balance().

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
v1->v2: btrfs_resume_balance_async() can be called only from remount or
mount, we don't need to hold fs_info->balance_lock.

Strictly speaking we should rather keep the balance at the paused state
unless it is resumed by the user again, that means neither mount nor
remount-rw should resume the balance automatically, former case needs
writing balance status to the disk. Which needs compatibility
verification.  So for now just avoid BUG. 

 fs/btrfs/volumes.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 3e6983a169c4..64bcaf25908b 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4115,6 +4115,8 @@ int btrfs_resume_balance_async(struct btrfs_fs_info *fs_info)
 		return 0;
 	}
 
+	fs_info->balance_ctl->b_flags |= BTRFS_BALANCE_RESUME;
+
 	tsk = kthread_run(balance_kthread, fs_info, "btrfs-balance");
 	return PTR_ERR_OR_ZERO(tsk);
 }
@@ -4156,7 +4158,6 @@ int btrfs_recover_balance(struct btrfs_fs_info *fs_info)
 
 	bctl->fs_info = fs_info;
 	bctl->flags = btrfs_balance_flags(leaf, item);
-	bctl->flags |= BTRFS_BALANCE_RESUME;
 
 	btrfs_balance_data(leaf, item, &disk_bargs);
 	btrfs_disk_balance_args_to_cpu(&bctl->data, &disk_bargs);
-- 
2.7.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] btrfs: fix BUG trying to resume balance without resume flag
  2018-04-30  9:48 [PATCH v2] btrfs: fix BUG trying to resume balance without resume flag Anand Jain
@ 2018-04-30 14:11 ` kbuild test robot
  2018-04-30 14:12 ` kbuild test robot
  2018-05-16 14:35 ` David Sterba
  2 siblings, 0 replies; 5+ messages in thread
From: kbuild test robot @ 2018-04-30 14:11 UTC (permalink / raw)
  To: Anand Jain; +Cc: kbuild-all, linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 1817 bytes --]

Hi Anand,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on btrfs/next]
[also build test ERROR on v4.17-rc3 next-20180430]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Anand-Jain/btrfs-fix-BUG-trying-to-resume-balance-without-resume-flag/20180430-192532
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git next
config: i386-randconfig-a1-201817 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   fs/btrfs/volumes.c: In function 'btrfs_resume_balance_async':
>> fs/btrfs/volumes.c:3971:22: error: 'struct btrfs_balance_control' has no member named 'b_flags'
     fs_info->balance_ctl->b_flags |= BTRFS_BALANCE_RESUME;
                         ^

vim +3971 fs/btrfs/volumes.c

  3954	
  3955	int btrfs_resume_balance_async(struct btrfs_fs_info *fs_info)
  3956	{
  3957		struct task_struct *tsk;
  3958	
  3959		spin_lock(&fs_info->balance_lock);
  3960		if (!fs_info->balance_ctl) {
  3961			spin_unlock(&fs_info->balance_lock);
  3962			return 0;
  3963		}
  3964		spin_unlock(&fs_info->balance_lock);
  3965	
  3966		if (btrfs_test_opt(fs_info, SKIP_BALANCE)) {
  3967			btrfs_info(fs_info, "force skipping balance");
  3968			return 0;
  3969		}
  3970	
> 3971		fs_info->balance_ctl->b_flags |= BTRFS_BALANCE_RESUME;
  3972	
  3973		tsk = kthread_run(balance_kthread, fs_info, "btrfs-balance");
  3974		return PTR_ERR_OR_ZERO(tsk);
  3975	}
  3976	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 32886 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] btrfs: fix BUG trying to resume balance without resume flag
  2018-04-30  9:48 [PATCH v2] btrfs: fix BUG trying to resume balance without resume flag Anand Jain
  2018-04-30 14:11 ` kbuild test robot
@ 2018-04-30 14:12 ` kbuild test robot
  2018-05-16 14:35 ` David Sterba
  2 siblings, 0 replies; 5+ messages in thread
From: kbuild test robot @ 2018-04-30 14:12 UTC (permalink / raw)
  To: Anand Jain; +Cc: kbuild-all, linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 1885 bytes --]

Hi Anand,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on btrfs/next]
[also build test ERROR on v4.17-rc3 next-20180430]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Anand-Jain/btrfs-fix-BUG-trying-to-resume-balance-without-resume-flag/20180430-192532
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git next
config: i386-randconfig-x000-201817 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   fs//btrfs/volumes.c: In function 'btrfs_resume_balance_async':
>> fs//btrfs/volumes.c:3971:24: error: 'struct btrfs_balance_control' has no member named 'b_flags'; did you mean 'flags'?
     fs_info->balance_ctl->b_flags |= BTRFS_BALANCE_RESUME;
                           ^~~~~~~
                           flags

vim +3971 fs//btrfs/volumes.c

  3954	
  3955	int btrfs_resume_balance_async(struct btrfs_fs_info *fs_info)
  3956	{
  3957		struct task_struct *tsk;
  3958	
  3959		spin_lock(&fs_info->balance_lock);
  3960		if (!fs_info->balance_ctl) {
  3961			spin_unlock(&fs_info->balance_lock);
  3962			return 0;
  3963		}
  3964		spin_unlock(&fs_info->balance_lock);
  3965	
  3966		if (btrfs_test_opt(fs_info, SKIP_BALANCE)) {
  3967			btrfs_info(fs_info, "force skipping balance");
  3968			return 0;
  3969		}
  3970	
> 3971		fs_info->balance_ctl->b_flags |= BTRFS_BALANCE_RESUME;
  3972	
  3973		tsk = kthread_run(balance_kthread, fs_info, "btrfs-balance");
  3974		return PTR_ERR_OR_ZERO(tsk);
  3975	}
  3976	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 31372 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] btrfs: fix BUG trying to resume balance without resume flag
  2018-04-30  9:48 [PATCH v2] btrfs: fix BUG trying to resume balance without resume flag Anand Jain
  2018-04-30 14:11 ` kbuild test robot
  2018-04-30 14:12 ` kbuild test robot
@ 2018-05-16 14:35 ` David Sterba
  2018-05-17  7:16   ` Anand Jain
  2 siblings, 1 reply; 5+ messages in thread
From: David Sterba @ 2018-05-16 14:35 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

On Mon, Apr 30, 2018 at 05:48:45PM +0800, Anand Jain wrote:
> We set the BTRFS_BALANCE_RESUME flag in the btrfs_recover_balance(),
> which is not called during the remount. So when resuming from the
> paused balance we hit BUG.
> 
>  kernel: kernel BUG at fs/btrfs/volumes.c:3890!
>  ::
>  kernel:  balance_kthread+0x51/0x60 [btrfs]
>  kernel:  kthread+0x111/0x130
>  ::
>  kernel: RIP: btrfs_balance+0x12e1/0x1570 [btrfs] RSP: ffffba7d0090bde8
> 
> Reproducer:
>   On a mounted BTRFS.
> 
>   btrfs balance start --full-balance /btrfs
>   btrfs balance pause /btrfs
>   mount -o remount,ro /dev/sdb /btrfs
>   mount -o remount,rw /dev/sdb /btrfs
> 
> To fix this set the BTRFS_BALANCE_RESUME flag in btrfs_resume_balance_async()
> instead of btrfs_recover_balance().
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> v1->v2: btrfs_resume_balance_async() can be called only from remount or
> mount, we don't need to hold fs_info->balance_lock.

For mount it's ok, there's nothing that can run in parallel, but remount
ro->rw that resumes the balance can be run with the ioctl that checks
balance status in parallel, at any time. As the fs_info::balance_ctl is
set up, btrfs_ioctl_balance_progress will proceed and return the current
status.


> Strictly speaking we should rather keep the balance at the paused state
> unless it is resumed by the user again, that means neither mount nor
> remount-rw should resume the balance automatically, former case needs
> writing balance status to the disk. Which needs compatibility
> verification.  So for now just avoid BUG. 
> 
>  fs/btrfs/volumes.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 3e6983a169c4..64bcaf25908b 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -4115,6 +4115,8 @@ int btrfs_resume_balance_async(struct btrfs_fs_info *fs_info)
>  		return 0;
>  	}
>  
> +	fs_info->balance_ctl->b_flags |= BTRFS_BALANCE_RESUME;

Though there's no update operation on flags, I think it's still better
to add the locks that set the resume status. And a comment why it's
here.

> +
>  	tsk = kthread_run(balance_kthread, fs_info, "btrfs-balance");
>  	return PTR_ERR_OR_ZERO(tsk);
>  }
> @@ -4156,7 +4158,6 @@ int btrfs_recover_balance(struct btrfs_fs_info *fs_info)
>  
>  	bctl->fs_info = fs_info;
>  	bctl->flags = btrfs_balance_flags(leaf, item);
> -	bctl->flags |= BTRFS_BALANCE_RESUME;

Also, it does not hurt to leave this here, as it logically matches what
the function does: we found the balance item, thus we set the status
accordingly.

Please update and resend, I'd like to push this to 4.17-rc as it's a
crash fix with a reproducer. Thanks.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] btrfs: fix BUG trying to resume balance without resume flag
  2018-05-16 14:35 ` David Sterba
@ 2018-05-17  7:16   ` Anand Jain
  0 siblings, 0 replies; 5+ messages in thread
From: Anand Jain @ 2018-05-17  7:16 UTC (permalink / raw)
  To: dsterba, linux-btrfs



On 05/16/2018 10:35 PM, David Sterba wrote:
> On Mon, Apr 30, 2018 at 05:48:45PM +0800, Anand Jain wrote:
>> We set the BTRFS_BALANCE_RESUME flag in the btrfs_recover_balance(),
>> which is not called during the remount. So when resuming from the
>> paused balance we hit BUG.
>>
>>   kernel: kernel BUG at fs/btrfs/volumes.c:3890!
>>   ::
>>   kernel:  balance_kthread+0x51/0x60 [btrfs]
>>   kernel:  kthread+0x111/0x130
>>   ::
>>   kernel: RIP: btrfs_balance+0x12e1/0x1570 [btrfs] RSP: ffffba7d0090bde8
>>
>> Reproducer:
>>    On a mounted BTRFS.
>>
>>    btrfs balance start --full-balance /btrfs
>>    btrfs balance pause /btrfs
>>    mount -o remount,ro /dev/sdb /btrfs
>>    mount -o remount,rw /dev/sdb /btrfs
>>
>> To fix this set the BTRFS_BALANCE_RESUME flag in btrfs_resume_balance_async()
>> instead of btrfs_recover_balance().
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>> v1->v2: btrfs_resume_balance_async() can be called only from remount or
>> mount, we don't need to hold fs_info->balance_lock.
> 
> For mount it's ok, there's nothing that can run in parallel, but remount
> ro->rw that resumes the balance can be run with the ioctl that checks
> balance status in parallel, at any time. As the fs_info::balance_ctl is
> set up, btrfs_ioctl_balance_progress will proceed and return the current
> status.

  You are right.

>> Strictly speaking we should rather keep the balance at the paused state
>> unless it is resumed by the user again, that means neither mount nor
>> remount-rw should resume the balance automatically, former case needs
>> writing balance status to the disk. Which needs compatibility
>> verification.  So for now just avoid BUG.
>>
>>   fs/btrfs/volumes.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 3e6983a169c4..64bcaf25908b 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -4115,6 +4115,8 @@ int btrfs_resume_balance_async(struct btrfs_fs_info *fs_info)
>>   		return 0;
>>   	}
>>   
>> +	fs_info->balance_ctl->b_flags |= BTRFS_BALANCE_RESUME;
> 
> Though there's no update operation on flags, I think it's still better
> to add the locks that set the resume status. And a comment why it's
> here.

ok.

>> +
>>   	tsk = kthread_run(balance_kthread, fs_info, "btrfs-balance");
>>   	return PTR_ERR_OR_ZERO(tsk);
>>   }
>> @@ -4156,7 +4158,6 @@ int btrfs_recover_balance(struct btrfs_fs_info *fs_info)
>>   
>>   	bctl->fs_info = fs_info;
>>   	bctl->flags = btrfs_balance_flags(leaf, item);
>> -	bctl->flags |= BTRFS_BALANCE_RESUME;
> 
> Also, it does not hurt to leave this here, as it logically matches what
> the function does: we found the balance item, thus we set the status
> accordingly.

ok.

> Please update and resend, I'd like to push this to 4.17-rc as it's a
> crash fix with a reproducer. Thanks.

Sent v3 with the above changes.

Thanks, Anand
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-05-17  7:14 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-30  9:48 [PATCH v2] btrfs: fix BUG trying to resume balance without resume flag Anand Jain
2018-04-30 14:11 ` kbuild test robot
2018-04-30 14:12 ` kbuild test robot
2018-05-16 14:35 ` David Sterba
2018-05-17  7:16   ` Anand Jain

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.