All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
@ 2015-01-19  7:42 Qu Wenruo
  2015-01-19 14:06 ` David Sterba
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Qu Wenruo @ 2015-01-19  7:42 UTC (permalink / raw)
  To: linux-btrfs; +Cc: David Sterba

Commit 6b5fe46dfa52 (btrfs: do commit in sync_fs if there are pending
changes) will call btrfs_start_transaction() in sync_fs(), to handle
some operations needed to be done in next transaction.

However this can cause deadlock if the filesystem is frozen, with the
following sys_r+w output:
[  143.255932] Call Trace:
[  143.255936]  [<ffffffff816c0e09>] schedule+0x29/0x70
[  143.255939]  [<ffffffff811cb7f3>] __sb_start_write+0xb3/0x100
[  143.255971]  [<ffffffffa040ec06>] start_transaction+0x2e6/0x5a0
[btrfs]
[  143.255992]  [<ffffffffa040f1eb>] btrfs_start_transaction+0x1b/0x20
[btrfs]
[  143.256003]  [<ffffffffa03dc0ba>] btrfs_sync_fs+0xca/0xd0 [btrfs]
[  143.256007]  [<ffffffff811f7be0>] sync_fs_one_sb+0x20/0x30
[  143.256011]  [<ffffffff811cbd01>] iterate_supers+0xe1/0xf0
[  143.256014]  [<ffffffff811f7d75>] sys_sync+0x55/0x90
[  143.256017]  [<ffffffff816c49d2>] system_call_fastpath+0x12/0x17
[  143.256111] Call Trace:
[  143.256114]  [<ffffffff816c0e09>] schedule+0x29/0x70
[  143.256119]  [<ffffffff816c3405>] rwsem_down_write_failed+0x1c5/0x2d0
[  143.256123]  [<ffffffff8133f013>] call_rwsem_down_write_failed+0x13/0x20
[  143.256131]  [<ffffffff811caae8>] thaw_super+0x28/0xc0
[  143.256135]  [<ffffffff811db3e5>] do_vfs_ioctl+0x3f5/0x540
[  143.256187]  [<ffffffff811db5c1>] SyS_ioctl+0x91/0xb0
[  143.256213]  [<ffffffff816c49d2>] system_call_fastpath+0x12/0x17

The reason is like the following:
(Holding s_umount)
VFS sync_fs staff:
|- btrfs_sync_fs()
   |- btrfs_start_transaction()
      |- sb_start_intwrite()
      (Waiting thaw_fs to unfreeze)
					VFS thaw_fs staff:
					thaw_fs()
					(Waiting sync_fs to release
					 s_umount)

So deadlock happens.
This can be easily triggered by fstest/generic/068 with inode_cache
mount option.

The fix is to check if the fs is frozen, if the fs is frozen, just
return and waiting for the next transaction.

Cc: David Sterba <dsterba@suse.cz>
Reported-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/super.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 60f7cbe..1d9f1e6 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
 			 */
 			if (fs_info->pending_changes == 0)
 				return 0;
+			/*
+			 * Test if the fs is frozen, or start_trasaction
+			 * will deadlock on itself.
+			 */
+			if (__sb_start_write(sb, SB_FREEZE_FS, false))
+				__sb_end_write(sb, SB_FREEZE_FS);
+			else
+				return 0;
 			trans = btrfs_start_transaction(root, 0);
 		} else {
 			return PTR_ERR(trans);
-- 
2.2.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-19  7:42 [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock Qu Wenruo
@ 2015-01-19 14:06 ` David Sterba
  2015-01-20  2:51   ` Qu Wenruo
  2015-01-20  0:19 ` Miao Xie
  2015-01-20 17:13 ` David Sterba
  2 siblings, 1 reply; 25+ messages in thread
From: David Sterba @ 2015-01-19 14:06 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, David Sterba

On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote:
> The fix is to check if the fs is frozen, if the fs is frozen, just
> return and waiting for the next transaction.
> 
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
>  			 */
>  			if (fs_info->pending_changes == 0)
>  				return 0;
> +			/*
> +			 * Test if the fs is frozen, or start_trasaction
> +			 * will deadlock on itself.
> +			 */
> +			if (__sb_start_write(sb, SB_FREEZE_FS, false))
> +				__sb_end_write(sb, SB_FREEZE_FS);
> +			else
> +				return 0;

I'm not sure this is the right fix. We should use either
mnt_want_write_file or sb_start_write around the start/commit functions.
The fs may be frozen already, but we also have to catch transition to
that state, or RO remount.

Also, returning 0 is not right, the ioctl actually skipped the expected
work.

>  			trans = btrfs_start_transaction(root, 0);
>  		} else {
>  			return PTR_ERR(trans);

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-19  7:42 [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock Qu Wenruo
  2015-01-19 14:06 ` David Sterba
@ 2015-01-20  0:19 ` Miao Xie
  2015-01-20  0:26   ` Qu Wenruo
  2015-01-20 17:13 ` David Sterba
  2 siblings, 1 reply; 25+ messages in thread
From: Miao Xie @ 2015-01-20  0:19 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs; +Cc: David Sterba

On Mon, 19 Jan 2015 15:42:41 +0800, Qu Wenruo wrote:
> Commit 6b5fe46dfa52 (btrfs: do commit in sync_fs if there are pending
> changes) will call btrfs_start_transaction() in sync_fs(), to handle
> some operations needed to be done in next transaction.
> 
> However this can cause deadlock if the filesystem is frozen, with the
> following sys_r+w output:
> [  143.255932] Call Trace:
> [  143.255936]  [<ffffffff816c0e09>] schedule+0x29/0x70
> [  143.255939]  [<ffffffff811cb7f3>] __sb_start_write+0xb3/0x100
> [  143.255971]  [<ffffffffa040ec06>] start_transaction+0x2e6/0x5a0
> [btrfs]
> [  143.255992]  [<ffffffffa040f1eb>] btrfs_start_transaction+0x1b/0x20
> [btrfs]
> [  143.256003]  [<ffffffffa03dc0ba>] btrfs_sync_fs+0xca/0xd0 [btrfs]
> [  143.256007]  [<ffffffff811f7be0>] sync_fs_one_sb+0x20/0x30
> [  143.256011]  [<ffffffff811cbd01>] iterate_supers+0xe1/0xf0
> [  143.256014]  [<ffffffff811f7d75>] sys_sync+0x55/0x90
> [  143.256017]  [<ffffffff816c49d2>] system_call_fastpath+0x12/0x17
> [  143.256111] Call Trace:
> [  143.256114]  [<ffffffff816c0e09>] schedule+0x29/0x70
> [  143.256119]  [<ffffffff816c3405>] rwsem_down_write_failed+0x1c5/0x2d0
> [  143.256123]  [<ffffffff8133f013>] call_rwsem_down_write_failed+0x13/0x20
> [  143.256131]  [<ffffffff811caae8>] thaw_super+0x28/0xc0
> [  143.256135]  [<ffffffff811db3e5>] do_vfs_ioctl+0x3f5/0x540
> [  143.256187]  [<ffffffff811db5c1>] SyS_ioctl+0x91/0xb0
> [  143.256213]  [<ffffffff816c49d2>] system_call_fastpath+0x12/0x17
> 
> The reason is like the following:
> (Holding s_umount)
> VFS sync_fs staff:
> |- btrfs_sync_fs()
>    |- btrfs_start_transaction()
>       |- sb_start_intwrite()
>       (Waiting thaw_fs to unfreeze)
> 					VFS thaw_fs staff:
> 					thaw_fs()
> 					(Waiting sync_fs to release
> 					 s_umount)
> 
> So deadlock happens.
> This can be easily triggered by fstest/generic/068 with inode_cache
> mount option.
> 
> The fix is to check if the fs is frozen, if the fs is frozen, just
> return and waiting for the next transaction.
> 
> Cc: David Sterba <dsterba@suse.cz>
> Reported-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
> ---
>  fs/btrfs/super.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 60f7cbe..1d9f1e6 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
>  			 */
>  			if (fs_info->pending_changes == 0)
>  				return 0;


I think the problem is here -- why ->pending_changes is not 0 when the
filesystem is frozen? so I think the reason of this problem is btrfs_freeze
forget to deal with the pending changes, and the correct fix is to correct
the behavior of btrfs_freeze().

Thanks
Miao

> +			/*
> +			 * Test if the fs is frozen, or start_trasaction
> +			 * will deadlock on itself.
> +			 */
> +			if (__sb_start_write(sb, SB_FREEZE_FS, false))
> +				__sb_end_write(sb, SB_FREEZE_FS);
> +			else
> +				return 0;
>  			trans = btrfs_start_transaction(root, 0);
>  		} else {
>  			return PTR_ERR(trans);
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-20  0:19 ` Miao Xie
@ 2015-01-20  0:26   ` Qu Wenruo
  0 siblings, 0 replies; 25+ messages in thread
From: Qu Wenruo @ 2015-01-20  0:26 UTC (permalink / raw)
  To: Miao Xie, linux-btrfs; +Cc: David Sterba


-------- Original Message --------
Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on 
frozen fs to avoid deadlock.
From: Miao Xie <miaoxie@huawei.com>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>, <linux-btrfs@vger.kernel.org>
Date: 2015年01月20日 08:19
> On Mon, 19 Jan 2015 15:42:41 +0800, Qu Wenruo wrote:
>> Commit 6b5fe46dfa52 (btrfs: do commit in sync_fs if there are pending
>> changes) will call btrfs_start_transaction() in sync_fs(), to handle
>> some operations needed to be done in next transaction.
>>
>> However this can cause deadlock if the filesystem is frozen, with the
>> following sys_r+w output:
>> [  143.255932] Call Trace:
>> [  143.255936]  [<ffffffff816c0e09>] schedule+0x29/0x70
>> [  143.255939]  [<ffffffff811cb7f3>] __sb_start_write+0xb3/0x100
>> [  143.255971]  [<ffffffffa040ec06>] start_transaction+0x2e6/0x5a0
>> [btrfs]
>> [  143.255992]  [<ffffffffa040f1eb>] btrfs_start_transaction+0x1b/0x20
>> [btrfs]
>> [  143.256003]  [<ffffffffa03dc0ba>] btrfs_sync_fs+0xca/0xd0 [btrfs]
>> [  143.256007]  [<ffffffff811f7be0>] sync_fs_one_sb+0x20/0x30
>> [  143.256011]  [<ffffffff811cbd01>] iterate_supers+0xe1/0xf0
>> [  143.256014]  [<ffffffff811f7d75>] sys_sync+0x55/0x90
>> [  143.256017]  [<ffffffff816c49d2>] system_call_fastpath+0x12/0x17
>> [  143.256111] Call Trace:
>> [  143.256114]  [<ffffffff816c0e09>] schedule+0x29/0x70
>> [  143.256119]  [<ffffffff816c3405>] rwsem_down_write_failed+0x1c5/0x2d0
>> [  143.256123]  [<ffffffff8133f013>] call_rwsem_down_write_failed+0x13/0x20
>> [  143.256131]  [<ffffffff811caae8>] thaw_super+0x28/0xc0
>> [  143.256135]  [<ffffffff811db3e5>] do_vfs_ioctl+0x3f5/0x540
>> [  143.256187]  [<ffffffff811db5c1>] SyS_ioctl+0x91/0xb0
>> [  143.256213]  [<ffffffff816c49d2>] system_call_fastpath+0x12/0x17
>>
>> The reason is like the following:
>> (Holding s_umount)
>> VFS sync_fs staff:
>> |- btrfs_sync_fs()
>>     |- btrfs_start_transaction()
>>        |- sb_start_intwrite()
>>        (Waiting thaw_fs to unfreeze)
>> 					VFS thaw_fs staff:
>> 					thaw_fs()
>> 					(Waiting sync_fs to release
>> 					 s_umount)
>>
>> So deadlock happens.
>> This can be easily triggered by fstest/generic/068 with inode_cache
>> mount option.
>>
>> The fix is to check if the fs is frozen, if the fs is frozen, just
>> return and waiting for the next transaction.
>>
>> Cc: David Sterba <dsterba@suse.cz>
>> Reported-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
>> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
>> ---
>>   fs/btrfs/super.c | 8 ++++++++
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
>> index 60f7cbe..1d9f1e6 100644
>> --- a/fs/btrfs/super.c
>> +++ b/fs/btrfs/super.c
>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
>>   			 */
>>   			if (fs_info->pending_changes == 0)
>>   				return 0;
>
> I think the problem is here -- why ->pending_changes is not 0 when the
> filesystem is frozen?
This happens when already no transaction is running but some one set 
inode_cache or things needs pending.
And the freeze follows.
> so I think the reason of this problem is btrfs_freeze
> forget to deal with the pending changes, and the correct fix is to correct
> the behavior of btrfs_freeze().
Great! Thanks for pointing this!
Starting a transaction in btrfs_freeze() seems to be the silver bullet 
for such case.

Thanks
Qu
>
> Thanks
> Miao
>
>> +			/*
>> +			 * Test if the fs is frozen, or start_trasaction
>> +			 * will deadlock on itself.
>> +			 */
>> +			if (__sb_start_write(sb, SB_FREEZE_FS, false))
>> +				__sb_end_write(sb, SB_FREEZE_FS);
>> +			else
>> +				return 0;
>>   			trans = btrfs_start_transaction(root, 0);
>>   		} else {
>>   			return PTR_ERR(trans);
>>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-19 14:06 ` David Sterba
@ 2015-01-20  2:51   ` Qu Wenruo
  2015-01-20  2:53     ` Qu Wenruo
  0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2015-01-20  2:51 UTC (permalink / raw)
  To: dsterba, linux-btrfs, Miao Xie


-------- Original Message --------
Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on 
frozen fs to avoid deadlock.
From: David Sterba <dsterba@suse.cz>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>
Date: 2015年01月19日 22:06
> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote:
>> The fix is to check if the fs is frozen, if the fs is frozen, just
>> return and waiting for the next transaction.
>>
>> --- a/fs/btrfs/super.c
>> +++ b/fs/btrfs/super.c
>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
>>   			 */
>>   			if (fs_info->pending_changes == 0)
>>   				return 0;
>> +			/*
>> +			 * Test if the fs is frozen, or start_trasaction
>> +			 * will deadlock on itself.
>> +			 */
>> +			if (__sb_start_write(sb, SB_FREEZE_FS, false))
>> +				__sb_end_write(sb, SB_FREEZE_FS);
>> +			else
>> +				return 0;
> I'm not sure this is the right fix. We should use either
> mnt_want_write_file or sb_start_write around the start/commit functions.
> The fs may be frozen already, but we also have to catch transition to
> that state, or RO remount.
But the deadlock between s_umount and frozen level is a larger problem...

Even Miao mentioned that we can start a transaction in btrfs_freeze(), 
but there is still possibility that
we try to change the feature of the frozen btrfs and do sync, again the 
deadlock will happen.
Although handling in btrfs_freeze() is also needed, but can't resolve 
all the problem.

IMHO the fix is still needed, or at least as a workaround until we find 
a real root solution for it
(If nobody want to revert the patchset)

BTW, what about put the pending changes to a workqueue? If we don't 
start transaction under
s_umount context like sync_fs()

Thanks,
Qu
>
> Also, returning 0 is not right, the ioctl actually skipped the expected
> work.
>
>>   			trans = btrfs_start_transaction(root, 0);
>>   		} else {
>>   			return PTR_ERR(trans);


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-20  2:51   ` Qu Wenruo
@ 2015-01-20  2:53     ` Qu Wenruo
  2015-01-20  3:06       ` Miao Xie
  0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2015-01-20  2:53 UTC (permalink / raw)
  To: dsterba, linux-btrfs, miaoxie@huawei.com >> Miao Xie

Add CC to Miao Xie <miaoxie@huawei.com>

-------- Original Message --------
Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on 
frozen fs to avoid deadlock.
From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: dsterba@suse.cz, linux-btrfs@vger.kernel.org, Miao Xie 
<miaox@cn.fujitsu.com>
Date: 2015年01月20日 10:51
>
> -------- Original Message --------
> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on 
> frozen fs to avoid deadlock.
> From: David Sterba <dsterba@suse.cz>
> To: Qu Wenruo <quwenruo@cn.fujitsu.com>
> Date: 2015年01月19日 22:06
>> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote:
>>> The fix is to check if the fs is frozen, if the fs is frozen, just
>>> return and waiting for the next transaction.
>>>
>>> --- a/fs/btrfs/super.c
>>> +++ b/fs/btrfs/super.c
>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int 
>>> wait)
>>>                */
>>>               if (fs_info->pending_changes == 0)
>>>                   return 0;
>>> +            /*
>>> +             * Test if the fs is frozen, or start_trasaction
>>> +             * will deadlock on itself.
>>> +             */
>>> +            if (__sb_start_write(sb, SB_FREEZE_FS, false))
>>> +                __sb_end_write(sb, SB_FREEZE_FS);
>>> +            else
>>> +                return 0;
>> I'm not sure this is the right fix. We should use either
>> mnt_want_write_file or sb_start_write around the start/commit functions.
>> The fs may be frozen already, but we also have to catch transition to
>> that state, or RO remount.
> But the deadlock between s_umount and frozen level is a larger problem...
>
> Even Miao mentioned that we can start a transaction in btrfs_freeze(), 
> but there is still possibility that
> we try to change the feature of the frozen btrfs and do sync, again 
> the deadlock will happen.
> Although handling in btrfs_freeze() is also needed, but can't resolve 
> all the problem.
>
> IMHO the fix is still needed, or at least as a workaround until we 
> find a real root solution for it
> (If nobody want to revert the patchset)
>
> BTW, what about put the pending changes to a workqueue? If we don't 
> start transaction under
> s_umount context like sync_fs()
>
> Thanks,
> Qu
>>
>> Also, returning 0 is not right, the ioctl actually skipped the expected
>> work.
>>
>>>               trans = btrfs_start_transaction(root, 0);
>>>           } else {
>>>               return PTR_ERR(trans);
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-20  2:53     ` Qu Wenruo
@ 2015-01-20  3:06       ` Miao Xie
  2015-01-20  3:17         ` Qu Wenruo
  0 siblings, 1 reply; 25+ messages in thread
From: Miao Xie @ 2015-01-20  3:06 UTC (permalink / raw)
  To: Qu Wenruo, dsterba, linux-btrfs

On Tue, 20 Jan 2015 10:53:05 +0800, Qu Wenruo wrote:
> Add CC to Miao Xie <miaoxie@huawei.com>
> 
> -------- Original Message --------
> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to
> avoid deadlock.
> From: Qu Wenruo <quwenruo@cn.fujitsu.com>
> To: dsterba@suse.cz, linux-btrfs@vger.kernel.org, Miao Xie <miaox@cn.fujitsu.com>
> Date: 2015年01月20日 10:51
>>
>> -------- Original Message --------
>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs
>> to avoid deadlock.
>> From: David Sterba <dsterba@suse.cz>
>> To: Qu Wenruo <quwenruo@cn.fujitsu.com>
>> Date: 2015年01月19日 22:06
>>> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote:
>>>> The fix is to check if the fs is frozen, if the fs is frozen, just
>>>> return and waiting for the next transaction.
>>>>
>>>> --- a/fs/btrfs/super.c
>>>> +++ b/fs/btrfs/super.c
>>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
>>>>                */
>>>>               if (fs_info->pending_changes == 0)
>>>>                   return 0;
>>>> +            /*
>>>> +             * Test if the fs is frozen, or start_trasaction
>>>> +             * will deadlock on itself.
>>>> +             */
>>>> +            if (__sb_start_write(sb, SB_FREEZE_FS, false))
>>>> +                __sb_end_write(sb, SB_FREEZE_FS);
>>>> +            else
>>>> +                return 0;
>>> I'm not sure this is the right fix. We should use either
>>> mnt_want_write_file or sb_start_write around the start/commit functions.
>>> The fs may be frozen already, but we also have to catch transition to
>>> that state, or RO remount.
>> But the deadlock between s_umount and frozen level is a larger problem...
>>
>> Even Miao mentioned that we can start a transaction in btrfs_freeze(), but
>> there is still possibility that
>> we try to change the feature of the frozen btrfs and do sync, again the
>> deadlock will happen.
>> Although handling in btrfs_freeze() is also needed, but can't resolve all the
>> problem.
>>
>> IMHO the fix is still needed, or at least as a workaround until we find a real
>> root solution for it
>> (If nobody want to revert the patchset)
>>
>> BTW, what about put the pending changes to a workqueue? If we don't start
>> transaction under
>> s_umount context like sync_fs()

I don't like this fix.
I think we should deal with those pending changes when we freeze a filesystem.
or we break the rule of fs freeze.

Thanks
Miao

>>
>> Thanks,
>> Qu
>>>
>>> Also, returning 0 is not right, the ioctl actually skipped the expected
>>> work.
>>>
>>>>               trans = btrfs_start_transaction(root, 0);
>>>>           } else {
>>>>               return PTR_ERR(trans);
>>
> 
> .
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-20  3:06       ` Miao Xie
@ 2015-01-20  3:17         ` Qu Wenruo
  2015-01-20  8:16           ` Miao Xie
  0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2015-01-20  3:17 UTC (permalink / raw)
  To: Miao Xie, dsterba, linux-btrfs


-------- Original Message --------
Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on 
frozen fs to avoid deadlock.
From: Miao Xie <miaoxie@huawei.com>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>, <dsterba@suse.cz>, 
<linux-btrfs@vger.kernel.org>
Date: 2015年01月20日 11:06
> On Tue, 20 Jan 2015 10:53:05 +0800, Qu Wenruo wrote:
>> Add CC to Miao Xie <miaoxie@huawei.com>
>>
>> -------- Original Message --------
>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to
>> avoid deadlock.
>> From: Qu Wenruo <quwenruo@cn.fujitsu.com>
>> To: dsterba@suse.cz, linux-btrfs@vger.kernel.org, Miao Xie <miaox@cn.fujitsu.com>
>> Date: 2015年01月20日 10:51
>>> -------- Original Message --------
>>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs
>>> to avoid deadlock.
>>> From: David Sterba <dsterba@suse.cz>
>>> To: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>> Date: 2015年01月19日 22:06
>>>> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote:
>>>>> The fix is to check if the fs is frozen, if the fs is frozen, just
>>>>> return and waiting for the next transaction.
>>>>>
>>>>> --- a/fs/btrfs/super.c
>>>>> +++ b/fs/btrfs/super.c
>>>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
>>>>>                 */
>>>>>                if (fs_info->pending_changes == 0)
>>>>>                    return 0;
>>>>> +            /*
>>>>> +             * Test if the fs is frozen, or start_trasaction
>>>>> +             * will deadlock on itself.
>>>>> +             */
>>>>> +            if (__sb_start_write(sb, SB_FREEZE_FS, false))
>>>>> +                __sb_end_write(sb, SB_FREEZE_FS);
>>>>> +            else
>>>>> +                return 0;
>>>> I'm not sure this is the right fix. We should use either
>>>> mnt_want_write_file or sb_start_write around the start/commit functions.
>>>> The fs may be frozen already, but we also have to catch transition to
>>>> that state, or RO remount.
>>> But the deadlock between s_umount and frozen level is a larger problem...
>>>
>>> Even Miao mentioned that we can start a transaction in btrfs_freeze(), but
>>> there is still possibility that
>>> we try to change the feature of the frozen btrfs and do sync, again the
>>> deadlock will happen.
>>> Although handling in btrfs_freeze() is also needed, but can't resolve all the
>>> problem.
>>>
>>> IMHO the fix is still needed, or at least as a workaround until we find a real
>>> root solution for it
>>> (If nobody want to revert the patchset)
>>>
>>> BTW, what about put the pending changes to a workqueue? If we don't start
>>> transaction under
>>> s_umount context like sync_fs()
> I don't like this fix.
> I think we should deal with those pending changes when we freeze a filesystem.
> or we break the rule of fs freeze.
I am afraid handling it in btrfs_freeze() won't help.
Case like freeze() -> change_feature -> sync() -> unfreeze() will still 
deadlock in sync().

Even cleared the pending changes in freeze(), it can still be set 
through sysfs interface even the fs is frozen.

And in fact, if we put the things like attach/create a transaction into 
a workqueue, we will not break
the freeze rule.
Since if the fs is frozen, there is no running transaction and we need 
to create a new one,
that will call sb_start_intwrite(), which will sleep until the fs is 
unfreeze.

Thanks,
Qu
>
> Thanks
> Miao
>
>>> Thanks,
>>> Qu
>>>> Also, returning 0 is not right, the ioctl actually skipped the expected
>>>> work.
>>>>
>>>>>                trans = btrfs_start_transaction(root, 0);
>>>>>            } else {
>>>>>                return PTR_ERR(trans);
>> .
>>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-20  3:17         ` Qu Wenruo
@ 2015-01-20  8:16           ` Miao Xie
  0 siblings, 0 replies; 25+ messages in thread
From: Miao Xie @ 2015-01-20  8:16 UTC (permalink / raw)
  To: Qu Wenruo, dsterba, linux-btrfs

On Tue, 20 Jan 2015 11:17:07 +0800, Qu Wenruo wrote:
>>>>>> --- a/fs/btrfs/super.c
>>>>>> +++ b/fs/btrfs/super.c
>>>>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
>>>>>>                 */
>>>>>>                if (fs_info->pending_changes == 0)
>>>>>>                    return 0;
>>>>>> +            /*
>>>>>> +             * Test if the fs is frozen, or start_trasaction
>>>>>> +             * will deadlock on itself.
>>>>>> +             */
>>>>>> +            if (__sb_start_write(sb, SB_FREEZE_FS, false))
>>>>>> +                __sb_end_write(sb, SB_FREEZE_FS);
>>>>>> +            else
>>>>>> +                return 0;
>>>>> I'm not sure this is the right fix. We should use either
>>>>> mnt_want_write_file or sb_start_write around the start/commit functions.
>>>>> The fs may be frozen already, but we also have to catch transition to
>>>>> that state, or RO remount.
>>>> But the deadlock between s_umount and frozen level is a larger problem...
>>>>
>>>> Even Miao mentioned that we can start a transaction in btrfs_freeze(), but
>>>> there is still possibility that
>>>> we try to change the feature of the frozen btrfs and do sync, again the
>>>> deadlock will happen.
>>>> Although handling in btrfs_freeze() is also needed, but can't resolve all the
>>>> problem.
>>>>
>>>> IMHO the fix is still needed, or at least as a workaround until we find a real
>>>> root solution for it
>>>> (If nobody want to revert the patchset)
>>>>
>>>> BTW, what about put the pending changes to a workqueue? If we don't start
>>>> transaction under
>>>> s_umount context like sync_fs()
>> I don't like this fix.
>> I think we should deal with those pending changes when we freeze a filesystem.
>> or we break the rule of fs freeze.
> I am afraid handling it in btrfs_freeze() won't help.
> Case like freeze() -> change_feature -> sync() -> unfreeze() will still deadlock
> in sync().

We should not change feature after the fs is freezed.

> Even cleared the pending changes in freeze(), it can still be set through sysfs
> interface even the fs is frozen.
> 
> And in fact, if we put the things like attach/create a transaction into a
> workqueue, we will not break
> the freeze rule.
> Since if the fs is frozen, there is no running transaction and we need to create
> a new one,
> that will call sb_start_intwrite(), which will sleep until the fs is unfreeze.

I read the pending change code just now, and I found the pending change is just
used for changing the mount option now, so I think as a work-around fix we
needn't start a new transaction to handle the pending flags which are set after
the current transaction is committed, because the data on the disk is
integrated.

Thanks
Miao


> 
> Thanks,
> Qu
>>
>> Thanks
>> Miao
>>
>>>> Thanks,
>>>> Qu
>>>>> Also, returning 0 is not right, the ioctl actually skipped the expected
>>>>> work.
>>>>>
>>>>>>                trans = btrfs_start_transaction(root, 0);
>>>>>>            } else {
>>>>>>                return PTR_ERR(trans);
>>> .
>>>
> 
> .
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-19  7:42 [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock Qu Wenruo
  2015-01-19 14:06 ` David Sterba
  2015-01-20  0:19 ` Miao Xie
@ 2015-01-20 17:13 ` David Sterba
  2015-01-21  0:58   ` Qu Wenruo
  2 siblings, 1 reply; 25+ messages in thread
From: David Sterba @ 2015-01-20 17:13 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, David Sterba, miaoxie, clm

On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote:
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
>  			 */
>  			if (fs_info->pending_changes == 0)
>  				return 0;
> +			/*
> +			 * Test if the fs is frozen, or start_trasaction
> +			 * will deadlock on itself.
> +			 */
> +			if (__sb_start_write(sb, SB_FREEZE_FS, false))
> +				__sb_end_write(sb, SB_FREEZE_FS);
> +			else
> +				return 0;

The more I look into that the more I think that the first fix is the
right one.

Has been pointed out in this thread, it is ok to skip processing the
pending changes if the filesystem is frozen.

The pending changes have to flushed from sync (by design), we cannot use
mnt_want_write or the sb_start* protections that.

The btrfs_freeze callback can safely do the last commit, that's under
s_umount held by vfs::freeze_super. Then any other new transaction would
block. Any other call to btrfs_sync_fs will not find any active
transaction and with this patch will not start one. Sounds safe to me.

I think the right level to check is SB_FREEZE_WRITE though, to stop any
potential writes as soon as possible and when the s_umount lock is still
held in vfs::freeze_super.

I'll collect the relevant patches and will send it for review.


>  			trans = btrfs_start_transaction(root, 0);
>  		} else {
>  			return PTR_ERR(trans);

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-20 17:13 ` David Sterba
@ 2015-01-21  0:58   ` Qu Wenruo
  2015-01-21  1:05     ` Chris Mason
  0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2015-01-21  0:58 UTC (permalink / raw)
  To: dsterba, linux-btrfs, miaoxie, clm


-------- Original Message --------
Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on 
frozen fs to avoid deadlock.
From: David Sterba <dsterba@suse.cz>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>
Date: 2015年01月21日 01:13
> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote:
>> --- a/fs/btrfs/super.c
>> +++ b/fs/btrfs/super.c
>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
>>   			 */
>>   			if (fs_info->pending_changes == 0)
>>   				return 0;
>> +			/*
>> +			 * Test if the fs is frozen, or start_trasaction
>> +			 * will deadlock on itself.
>> +			 */
>> +			if (__sb_start_write(sb, SB_FREEZE_FS, false))
>> +				__sb_end_write(sb, SB_FREEZE_FS);
>> +			else
>> +				return 0;
> The more I look into that the more I think that the first fix is the
> right one.
>
> Has been pointed out in this thread, it is ok to skip processing the
> pending changes if the filesystem is frozen.
That's good, for me, either this patch or the patch 2~5 in the patchset 
will solve the sync_fs() problem
on frozen fs. Just different timing to start the new transaction.

But the patchset one has the problem, which needs to deal with the sysfs 
interface changes, or sync_fs()
will still cause deadlock.
So I tried to revert the sysfs related patches, but it seems overkilled, 
needing extra btrfs_start_transaction*
things.

As you already picked this one, I'm completely OK with this.
>
> The pending changes have to flushed from sync (by design), we cannot use
> mnt_want_write or the sb_start* protections that.
>
> The btrfs_freeze callback can safely do the last commit, that's under
> s_umount held by vfs::freeze_super. Then any other new transaction would
> block. Any other call to btrfs_sync_fs will not find any active
> transaction and with this patch will not start one. Sounds safe to me.
>
> I think the right level to check is SB_FREEZE_WRITE though, to stop any
> potential writes as soon as possible and when the s_umount lock is still
> held in vfs::freeze_super.
SB_FREEZE_WRITE seems good for me.
But I didn't catch the difference between 
SB_FREEZE_FS(WRITE/PAGEFAULT/COMPLETE),
since freeze() conflicts with sync_fs(), when we comes to btrfs_sync_fs(),
the fs is either totally frozen or unfrozen and frozen level won't 
change during the protection of s_umount.

Although SB_FREEZE_WRITE seems better in its meaning and makes it more 
readable.
>
> I'll collect the relevant patches and will send it for review.
Thanks for collecting them and sending them out.

Thanks,
Qu
>
>
>>   			trans = btrfs_start_transaction(root, 0);
>>   		} else {
>>   			return PTR_ERR(trans);


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-21  0:58   ` Qu Wenruo
@ 2015-01-21  1:05     ` Chris Mason
  2015-01-21  1:09       ` Qu Wenruo
  0 siblings, 1 reply; 25+ messages in thread
From: Chris Mason @ 2015-01-21  1:05 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: dsterba, linux-btrfs, miaoxie



On Tue, Jan 20, 2015 at 7:58 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> 
wrote:
> 
> -------- Original Message --------
> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on 
> frozen fs to avoid deadlock.
> From: David Sterba <dsterba@suse.cz>
> To: Qu Wenruo <quwenruo@cn.fujitsu.com>
> Date: 2015年01月21日 01:13
>> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote:
>>> --- a/fs/btrfs/super.c
>>> +++ b/fs/btrfs/super.c
>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, 
>>> int wait)
>>>   			 */
>>>   			if (fs_info->pending_changes == 0)
>>>   				return 0;
>>> +			/*
>>> +			 * Test if the fs is frozen, or start_trasaction
>>> +			 * will deadlock on itself.
>>> +			 */
>>> +			if (__sb_start_write(sb, SB_FREEZE_FS, false))
>>> +				__sb_end_write(sb, SB_FREEZE_FS);
>>> +			else
>>> +				return 0;

But what if someone freezes the FS after __sb_end_write() and before 
btrfs_start_transaction()?   I don't see what keeps new freezers from 
coming in.

-chris



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-21  1:05     ` Chris Mason
@ 2015-01-21  1:09       ` Qu Wenruo
  2015-01-21  1:10         ` Chris Mason
  0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2015-01-21  1:09 UTC (permalink / raw)
  To: Chris Mason; +Cc: dsterba, linux-btrfs, miaoxie


-------- Original Message --------
Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on 
frozen fs to avoid deadlock.
From: Chris Mason <clm@fb.com>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>
Date: 2015年01月21日 09:05
>
>
> On Tue, Jan 20, 2015 at 7:58 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> 
> wrote:
>>
>> -------- Original Message --------
>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on 
>> frozen fs to avoid deadlock.
>> From: David Sterba <dsterba@suse.cz>
>> To: Qu Wenruo <quwenruo@cn.fujitsu.com>
>> Date: 2015年01月21日 01:13
>>> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote:
>>>> --- a/fs/btrfs/super.c
>>>> +++ b/fs/btrfs/super.c
>>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, 
>>>> int wait)
>>>>                */
>>>>               if (fs_info->pending_changes == 0)
>>>>                   return 0;
>>>> +            /*
>>>> +             * Test if the fs is frozen, or start_trasaction
>>>> +             * will deadlock on itself.
>>>> +             */
>>>> +            if (__sb_start_write(sb, SB_FREEZE_FS, false))
>>>> +                __sb_end_write(sb, SB_FREEZE_FS);
>>>> +            else
>>>> +                return 0;
>
> But what if someone freezes the FS after __sb_end_write() and before 
> btrfs_start_transaction()?   I don't see what keeps new freezers from 
> coming in.
>
> -chris
Either VFS::freeze_super() and VFS::syncfs() will hold the s_umount 
mutex, so freeze will not happen
during sync.

Thanks,
Qu
>
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-21  1:09       ` Qu Wenruo
@ 2015-01-21  1:10         ` Chris Mason
  2015-01-21  3:10           ` Miao Xie
  0 siblings, 1 reply; 25+ messages in thread
From: Chris Mason @ 2015-01-21  1:10 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: dsterba, linux-btrfs, miaoxie

On Tue, Jan 20, 2015 at 8:09 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> 
wrote:
> 
> -------- Original Message --------
> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on 
> frozen fs to avoid deadlock.
> From: Chris Mason <clm@fb.com>
> To: Qu Wenruo <quwenruo@cn.fujitsu.com>
> Date: 2015年01月21日 09:05
>> 
>> 
>> On Tue, Jan 20, 2015 at 7:58 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> 
>> wrote:
>>> 
>>> -------- Original Message --------
>>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on 
>>> frozen fs to avoid deadlock.
>>> From: David Sterba <dsterba@suse.cz>
>>> To: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>> Date: 2015年01月21日 01:13
>>>> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote:
>>>>> --- a/fs/btrfs/super.c
>>>>> +++ b/fs/btrfs/super.c
>>>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, 
>>>>> int wait)
>>>>>                */
>>>>>               if (fs_info->pending_changes == 0)
>>>>>                   return 0;
>>>>> +            /*
>>>>> +             * Test if the fs is frozen, or start_trasaction
>>>>> +             * will deadlock on itself.
>>>>> +             */
>>>>> +            if (__sb_start_write(sb, SB_FREEZE_FS, false))
>>>>> +                __sb_end_write(sb, SB_FREEZE_FS);
>>>>> +            else
>>>>> +                return 0;
>> 
>> But what if someone freezes the FS after __sb_end_write() and before 
>> btrfs_start_transaction()?   I don't see what keeps new freezers 
>> from coming in.
>> 
>> -chris
> Either VFS::freeze_super() and VFS::syncfs() will hold the s_umount 
> mutex, so freeze will not happen
> during sync.

You're right.  I was worried about the sync ioctl, but the mutex won't 
be held there to deadlock against.  We'll be fine.

-chris




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-21  1:10         ` Chris Mason
@ 2015-01-21  3:10           ` Miao Xie
  2015-01-21  3:15             ` Qu Wenruo
  0 siblings, 1 reply; 25+ messages in thread
From: Miao Xie @ 2015-01-21  3:10 UTC (permalink / raw)
  To: Chris Mason, Qu Wenruo; +Cc: dsterba, linux-btrfs

On Tue, 20 Jan 2015 20:10:56 -0500, Chris Mason wrote:
> On Tue, Jan 20, 2015 at 8:09 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote:
>>
>> -------- Original Message --------
>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs
>> to avoid deadlock.
>> From: Chris Mason <clm@fb.com>
>> To: Qu Wenruo <quwenruo@cn.fujitsu.com>
>> Date: 2015年01月21日 09:05
>>>
>>>
>>> On Tue, Jan 20, 2015 at 7:58 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote:
>>>>
>>>> -------- Original Message --------
>>>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen
>>>> fs to avoid deadlock.
>>>> From: David Sterba <dsterba@suse.cz>
>>>> To: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>>> Date: 2015年01月21日 01:13
>>>>> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote:
>>>>>> --- a/fs/btrfs/super.c
>>>>>> +++ b/fs/btrfs/super.c
>>>>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
>>>>>>                */
>>>>>>               if (fs_info->pending_changes == 0)
>>>>>>                   return 0;
>>>>>> +            /*
>>>>>> +             * Test if the fs is frozen, or start_trasaction
>>>>>> +             * will deadlock on itself.
>>>>>> +             */
>>>>>> +            if (__sb_start_write(sb, SB_FREEZE_FS, false))
>>>>>> +                __sb_end_write(sb, SB_FREEZE_FS);
>>>>>> +            else
>>>>>> +                return 0;
>>>
>>> But what if someone freezes the FS after __sb_end_write() and before
>>> btrfs_start_transaction()?   I don't see what keeps new freezers from coming in.
>>>
>>> -chris
>> Either VFS::freeze_super() and VFS::syncfs() will hold the s_umount mutex, so
>> freeze will not happen
>> during sync.
> 
> You're right.  I was worried about the sync ioctl, but the mutex won't be held
> there to deadlock against.  We'll be fine.

There is another problem which is introduced by pending change. That is we will
start and commmit a transaction by changing pending mount option after we set
the fs to be R/O.

I think it is better that we don't start a new transaction for pending changes
which are set after the transaction is committed, just make them be handled by
the next transaction, the reason is:
- Make the behavior of the fs be consistent(both freezed fs and unfreezed fs)
- Data on the disk is right and integrated


Thanks
Miao

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-21  3:10           ` Miao Xie
@ 2015-01-21  3:15             ` Qu Wenruo
  2015-01-21  3:26               ` Miao Xie
  0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2015-01-21  3:15 UTC (permalink / raw)
  To: Miao Xie, Chris Mason; +Cc: dsterba, linux-btrfs


-------- Original Message --------
Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on 
frozen fs to avoid deadlock.
From: Miao Xie <miaoxie@huawei.com>
To: Chris Mason <clm@fb.com>, Qu Wenruo <quwenruo@cn.fujitsu.com>
Date: 2015年01月21日 11:10
> On Tue, 20 Jan 2015 20:10:56 -0500, Chris Mason wrote:
>> On Tue, Jan 20, 2015 at 8:09 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote:
>>> -------- Original Message --------
>>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs
>>> to avoid deadlock.
>>> From: Chris Mason <clm@fb.com>
>>> To: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>> Date: 2015年01月21日 09:05
>>>>
>>>> On Tue, Jan 20, 2015 at 7:58 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote:
>>>>> -------- Original Message --------
>>>>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen
>>>>> fs to avoid deadlock.
>>>>> From: David Sterba <dsterba@suse.cz>
>>>>> To: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>>>> Date: 2015年01月21日 01:13
>>>>>> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote:
>>>>>>> --- a/fs/btrfs/super.c
>>>>>>> +++ b/fs/btrfs/super.c
>>>>>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
>>>>>>>                 */
>>>>>>>                if (fs_info->pending_changes == 0)
>>>>>>>                    return 0;
>>>>>>> +            /*
>>>>>>> +             * Test if the fs is frozen, or start_trasaction
>>>>>>> +             * will deadlock on itself.
>>>>>>> +             */
>>>>>>> +            if (__sb_start_write(sb, SB_FREEZE_FS, false))
>>>>>>> +                __sb_end_write(sb, SB_FREEZE_FS);
>>>>>>> +            else
>>>>>>> +                return 0;
>>>> But what if someone freezes the FS after __sb_end_write() and before
>>>> btrfs_start_transaction()?   I don't see what keeps new freezers from coming in.
>>>>
>>>> -chris
>>> Either VFS::freeze_super() and VFS::syncfs() will hold the s_umount mutex, so
>>> freeze will not happen
>>> during sync.
>> You're right.  I was worried about the sync ioctl, but the mutex won't be held
>> there to deadlock against.  We'll be fine.
> There is another problem which is introduced by pending change. That is we will
> start and commmit a transaction by changing pending mount option after we set
> the fs to be R/O.
Oh, I missed this problem.
>
> I think it is better that we don't start a new transaction for pending changes
> which are set after the transaction is committed, just make them be handled by
> the next transaction,
This will cause another problem, nobody can ensure there will be next 
transaction and the change may
never to written into disk.

For example, if we change the features/label through sysfs, and then 
umount the fs,
since there is no write, there is no running transaction and if we don't 
start a new transaction,
it won't be flushed to disk.

Thanks,
Qu
> the reason is:
> - Make the behavior of the fs be consistent(both freezed fs and unfreezed fs)
> - Data on the disk is right and integrated
>
>
> Thanks
> Miao


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-21  3:15             ` Qu Wenruo
@ 2015-01-21  3:26               ` Miao Xie
  2015-01-21  3:53                 ` Qu Wenruo
  0 siblings, 1 reply; 25+ messages in thread
From: Miao Xie @ 2015-01-21  3:26 UTC (permalink / raw)
  To: Qu Wenruo, Chris Mason; +Cc: dsterba, linux-btrfs

On Wed, 21 Jan 2015 11:15:41 +0800, Qu Wenruo wrote:
> 
> -------- Original Message --------
> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to
> avoid deadlock.
> From: Miao Xie <miaoxie@huawei.com>
> To: Chris Mason <clm@fb.com>, Qu Wenruo <quwenruo@cn.fujitsu.com>
> Date: 2015年01月21日 11:10
>> On Tue, 20 Jan 2015 20:10:56 -0500, Chris Mason wrote:
>>> On Tue, Jan 20, 2015 at 8:09 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote:
>>>> -------- Original Message --------
>>>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs
>>>> to avoid deadlock.
>>>> From: Chris Mason <clm@fb.com>
>>>> To: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>>> Date: 2015年01月21日 09:05
>>>>>
>>>>> On Tue, Jan 20, 2015 at 7:58 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote:
>>>>>> -------- Original Message --------
>>>>>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen
>>>>>> fs to avoid deadlock.
>>>>>> From: David Sterba <dsterba@suse.cz>
>>>>>> To: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>>>>> Date: 2015年01月21日 01:13
>>>>>>> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote:
>>>>>>>> --- a/fs/btrfs/super.c
>>>>>>>> +++ b/fs/btrfs/super.c
>>>>>>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
>>>>>>>>                 */
>>>>>>>>                if (fs_info->pending_changes == 0)
>>>>>>>>                    return 0;
>>>>>>>> +            /*
>>>>>>>> +             * Test if the fs is frozen, or start_trasaction
>>>>>>>> +             * will deadlock on itself.
>>>>>>>> +             */
>>>>>>>> +            if (__sb_start_write(sb, SB_FREEZE_FS, false))
>>>>>>>> +                __sb_end_write(sb, SB_FREEZE_FS);
>>>>>>>> +            else
>>>>>>>> +                return 0;
>>>>> But what if someone freezes the FS after __sb_end_write() and before
>>>>> btrfs_start_transaction()?   I don't see what keeps new freezers from
>>>>> coming in.
>>>>>
>>>>> -chris
>>>> Either VFS::freeze_super() and VFS::syncfs() will hold the s_umount mutex, so
>>>> freeze will not happen
>>>> during sync.
>>> You're right.  I was worried about the sync ioctl, but the mutex won't be held
>>> there to deadlock against.  We'll be fine.
>> There is another problem which is introduced by pending change. That is we will
>> start and commmit a transaction by changing pending mount option after we set
>> the fs to be R/O.
> Oh, I missed this problem.
>>
>> I think it is better that we don't start a new transaction for pending changes
>> which are set after the transaction is committed, just make them be handled by
>> the next transaction,
> This will cause another problem, nobody can ensure there will be next
> transaction and the change may
> never to written into disk.

First, the pending changes is mount option, that is in-memory data.
Second, the same problem would happen after you freeze fs.

> 
> For example, if we change the features/label through sysfs, and then umount the fs,

It is different from pending change.
If you want to change features/label,  you should get write permission and make
sure the fs is not be freezed because those are on-disk data. So the problem
doesn't exist, or there is a bug.

Thanks
Miao

> since there is no write, there is no running transaction and if we don't start a
> new transaction,
> it won't be flushed to disk.
> 
> Thanks,
> Qu
>> the reason is:
>> - Make the behavior of the fs be consistent(both freezed fs and unfreezed fs)
>> - Data on the disk is right and integrated
>>
>>
>> Thanks
>> Miao
> 
> .
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-21  3:26               ` Miao Xie
@ 2015-01-21  3:53                 ` Qu Wenruo
  2015-01-21  7:04                   ` Miao Xie
  0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2015-01-21  3:53 UTC (permalink / raw)
  To: Miao Xie, Chris Mason; +Cc: dsterba, linux-btrfs


-------- Original Message --------
Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on 
frozen fs to avoid deadlock.
From: Miao Xie <miaoxie@huawei.com>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>, Chris Mason <clm@fb.com>
Date: 2015年01月21日 11:26
> On Wed, 21 Jan 2015 11:15:41 +0800, Qu Wenruo wrote:
>> -------- Original Message --------
>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to
>> avoid deadlock.
>> From: Miao Xie <miaoxie@huawei.com>
>> To: Chris Mason <clm@fb.com>, Qu Wenruo <quwenruo@cn.fujitsu.com>
>> Date: 2015年01月21日 11:10
>>> On Tue, 20 Jan 2015 20:10:56 -0500, Chris Mason wrote:
>>>> On Tue, Jan 20, 2015 at 8:09 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote:
>>>>> -------- Original Message --------
>>>>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs
>>>>> to avoid deadlock.
>>>>> From: Chris Mason <clm@fb.com>
>>>>> To: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>>>> Date: 2015年01月21日 09:05
>>>>>> On Tue, Jan 20, 2015 at 7:58 PM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote:
>>>>>>> -------- Original Message --------
>>>>>>> Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen
>>>>>>> fs to avoid deadlock.
>>>>>>> From: David Sterba <dsterba@suse.cz>
>>>>>>> To: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>>>>>> Date: 2015年01月21日 01:13
>>>>>>>> On Mon, Jan 19, 2015 at 03:42:41PM +0800, Qu Wenruo wrote:
>>>>>>>>> --- a/fs/btrfs/super.c
>>>>>>>>> +++ b/fs/btrfs/super.c
>>>>>>>>> @@ -1000,6 +1000,14 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
>>>>>>>>>                  */
>>>>>>>>>                 if (fs_info->pending_changes == 0)
>>>>>>>>>                     return 0;
>>>>>>>>> +            /*
>>>>>>>>> +             * Test if the fs is frozen, or start_trasaction
>>>>>>>>> +             * will deadlock on itself.
>>>>>>>>> +             */
>>>>>>>>> +            if (__sb_start_write(sb, SB_FREEZE_FS, false))
>>>>>>>>> +                __sb_end_write(sb, SB_FREEZE_FS);
>>>>>>>>> +            else
>>>>>>>>> +                return 0;
>>>>>> But what if someone freezes the FS after __sb_end_write() and before
>>>>>> btrfs_start_transaction()?   I don't see what keeps new freezers from
>>>>>> coming in.
>>>>>>
>>>>>> -chris
>>>>> Either VFS::freeze_super() and VFS::syncfs() will hold the s_umount mutex, so
>>>>> freeze will not happen
>>>>> during sync.
>>>> You're right.  I was worried about the sync ioctl, but the mutex won't be held
>>>> there to deadlock against.  We'll be fine.
>>> There is another problem which is introduced by pending change. That is we will
>>> start and commmit a transaction by changing pending mount option after we set
>>> the fs to be R/O.
>> Oh, I missed this problem.
>>> I think it is better that we don't start a new transaction for pending changes
>>> which are set after the transaction is committed, just make them be handled by
>>> the next transaction,
>> This will cause another problem, nobody can ensure there will be next
>> transaction and the change may
>> never to written into disk.
> First, the pending changes is mount option, that is in-memory data.
> Second, the same problem would happen after you freeze fs.
Pending changes are *not* only mount options. Feature change and label 
change are also pending changes if using sysfs.
Normal ioctl label changing is not affected.

For freeze, it's not the same problem since the fs will be unfreeze 
sooner or later and transaction will be initiated.
>
>> For example, if we change the features/label through sysfs, and then umount the fs,
> It is different from pending change.
No, now features/label changing using sysfs both use pending changes to 
do the commit.
See BTRFS_PENDING_COMMIT bit.
So freeze -> change features/label -> sync will still cause the deadlock 
in the same way,
and you can try it yourself.

Thanks,
Qu

> If you want to change features/label,  you should get write permission and make
> sure the fs is not be freezed because those are on-disk data. So the problem
> doesn't exist, or there is a bug.
>
> Thanks
> Miao
>
>> since there is no write, there is no running transaction and if we don't start a
>> new transaction,
>> it won't be flushed to disk.
>>
>> Thanks,
>> Qu
>>> the reason is:
>>> - Make the behavior of the fs be consistent(both freezed fs and unfreezed fs)
>>> - Data on the disk is right and integrated
>>>
>>>
>>> Thanks
>>> Miao
>> .
>>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-21  3:53                 ` Qu Wenruo
@ 2015-01-21  7:04                   ` Miao Xie
  2015-01-21  7:47                     ` Qu Wenruo
  2015-01-23 16:59                     ` David Sterba
  0 siblings, 2 replies; 25+ messages in thread
From: Miao Xie @ 2015-01-21  7:04 UTC (permalink / raw)
  To: Qu Wenruo, Chris Mason; +Cc: dsterba, linux-btrfs

On Wed, 21 Jan 2015 11:53:34 +0800, Qu Wenruo wrote:
>>>>>>>>>> +            /*
>>>>>>>>>> +             * Test if the fs is frozen, or start_trasaction
>>>>>>>>>> +             * will deadlock on itself.
>>>>>>>>>> +             */
>>>>>>>>>> +            if (__sb_start_write(sb, SB_FREEZE_FS, false))
>>>>>>>>>> +                __sb_end_write(sb, SB_FREEZE_FS);
>>>>>>>>>> +            else
>>>>>>>>>> +                return 0;
>>>>>>> But what if someone freezes the FS after __sb_end_write() and before
>>>>>>> btrfs_start_transaction()?   I don't see what keeps new freezers from
>>>>>>> coming in.
>>>>>>>
>>>>>>> -chris
>>>>>> Either VFS::freeze_super() and VFS::syncfs() will hold the s_umount mutex, so
>>>>>> freeze will not happen
>>>>>> during sync.
>>>>> You're right.  I was worried about the sync ioctl, but the mutex won't be held
>>>>> there to deadlock against.  We'll be fine.
>>>> There is another problem which is introduced by pending change. That is we will
>>>> start and commmit a transaction by changing pending mount option after we set
>>>> the fs to be R/O.
>>> Oh, I missed this problem.
>>>> I think it is better that we don't start a new transaction for pending changes
>>>> which are set after the transaction is committed, just make them be handled by
>>>> the next transaction,
>>> This will cause another problem, nobody can ensure there will be next
>>> transaction and the change may
>>> never to written into disk.
>> First, the pending changes is mount option, that is in-memory data.
>> Second, the same problem would happen after you freeze fs.
> Pending changes are *not* only mount options. Feature change and label change
> are also pending changes if using sysfs.

My miss, I don't notice feature and label change by sysfs.

But the implementation of feature and label change by sysfs is wrong, we can
not change them without write permission.

> Normal ioctl label changing is not affected.
> 
> For freeze, it's not the same problem since the fs will be unfreeze sooner or
> later and transaction will be initiated.

You can not assume the operations of the users, they might freeze the fs and
then shutdown the machine.

>>
>>> For example, if we change the features/label through sysfs, and then umount
>>> the fs,
>> It is different from pending change.
> No, now features/label changing using sysfs both use pending changes to do the
> commit.
> See BTRFS_PENDING_COMMIT bit.
> So freeze -> change features/label -> sync will still cause the deadlock in the
> same way,
> and you can try it yourself.

As I said above, the implementation of sysfs feature and label change is wrong,
it is better to separate them from the pending mount option change, make the
sysfs feature and label change be done in the context of transaction after
getting the write permission. If so, we needn't do anything special when sync
the fs.

In short, changing the sysfs feature and label change implementation and
removing the unnecessary btrfs_start_transaction in sync_fs can fix the
deadlock.

Thanks
Miao

> 
> Thanks,
> Qu
> 
>> If you want to change features/label,  you should get write permission and make
>> sure the fs is not be freezed because those are on-disk data. So the problem
>> doesn't exist, or there is a bug.
>>
>> Thanks
>> Miao
>>
>>> since there is no write, there is no running transaction and if we don't start a
>>> new transaction,
>>> it won't be flushed to disk.
>>>
>>> Thanks,
>>> Qu
>>>> the reason is:
>>>> - Make the behavior of the fs be consistent(both freezed fs and unfreezed fs)
>>>> - Data on the disk is right and integrated
>>>>
>>>>
>>>> Thanks
>>>> Miao
>>> .
>>>
> 
> .
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-21  7:04                   ` Miao Xie
@ 2015-01-21  7:47                     ` Qu Wenruo
  2015-01-21  8:46                       ` Miao Xie
  2015-01-23 17:39                       ` David Sterba
  2015-01-23 16:59                     ` David Sterba
  1 sibling, 2 replies; 25+ messages in thread
From: Qu Wenruo @ 2015-01-21  7:47 UTC (permalink / raw)
  To: Miao Xie, Chris Mason, dsterba; +Cc: linux-btrfs


-------- Original Message --------
Subject: Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on 
frozen fs to avoid deadlock.
From: Miao Xie <miaoxie@huawei.com>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>, Chris Mason <clm@fb.com>
Date: 2015年01月21日 15:04
> On Wed, 21 Jan 2015 11:53:34 +0800, Qu Wenruo wrote:
>>>> [snipped]
>>>> This will cause another problem, nobody can ensure there will be next
>>>> transaction and the change may
>>>> never to written into disk.
>>> First, the pending changes is mount option, that is in-memory data.
>>> Second, the same problem would happen after you freeze fs.
>> Pending changes are *not* only mount options. Feature change and label change
>> are also pending changes if using sysfs.
> My miss, I don't notice feature and label change by sysfs.
>
> But the implementation of feature and label change by sysfs is wrong, we can
> not change them without write permission.
>
>> Normal ioctl label changing is not affected.
>>
>> For freeze, it's not the same problem since the fs will be unfreeze sooner or
>> later and transaction will be initiated.
> You can not assume the operations of the users, they might freeze the fs and
> then shutdown the machine.
>
>>>> For example, if we change the features/label through sysfs, and then umount
>>>> the fs,
>>> It is different from pending change.
>> No, now features/label changing using sysfs both use pending changes to do the
>> commit.
>> See BTRFS_PENDING_COMMIT bit.
>> So freeze -> change features/label -> sync will still cause the deadlock in the
>> same way,
>> and you can try it yourself.
> As I said above, the implementation of sysfs feature and label change is wrong,
> it is better to separate them from the pending mount option change, make the
> sysfs feature and label change be done in the context of transaction after
> getting the write permission. If so, we needn't do anything special when sync
> the fs.
>
> In short, changing the sysfs feature and label change implementation and
> removing the unnecessary btrfs_start_transaction in sync_fs can fix the
> deadlock.
Your method will only fix the deadlock, but will introduce the risk like 
pending inode_cache will never
be written to disk as I already explained. (If still using the 
fs_info->pending_changes mechanism)
To ensure pending changes written to disk sync_fs() should start a 
transaction if needed,
or there will be chance that no transaction can handle it.

But I don't see the necessity to pending current work(inode_cache, 
feature/label changes) to next transaction.

To David:
I'm a little curious about why inode_cache needs to be delayed to next 
transaction.
In btrfs_remount() we have s_umount mutex, and we synced the whole 
filesystem already,
so there should be no running transaction and we can just set any mount 
option into fs_info.

Or even in worst case, there is a racing window, we can still start a 
transaction and do the commit,
a little overhead in such minor case won't impact the overall performance.

For sysfs change, I prefer attach or start transaction method, and for 
mount option change, and
such sysfs tuning is also minor case for a filesystem.

What do you think about reverting the whole patchset and rework the 
sysfs interface?

Thanks,
Qu
>
> Thanks
> Miao
>
>> Thanks,
>> Qu
>>
>>> If you want to change features/label,  you should get write permission and make
>>> sure the fs is not be freezed because those are on-disk data. So the problem
>>> doesn't exist, or there is a bug.
>>>
>>> Thanks
>>> Miao
>>>
>>>> since there is no write, there is no running transaction and if we don't start a
>>>> new transaction,
>>>> it won't be flushed to disk.
>>>>
>>>> Thanks,
>>>> Qu
>>>>> the reason is:
>>>>> - Make the behavior of the fs be consistent(both freezed fs and unfreezed fs)
>>>>> - Data on the disk is right and integrated
>>>>>
>>>>>
>>>>> Thanks
>>>>> Miao
>>>> .
>>>>
>> .
>>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-21  7:47                     ` Qu Wenruo
@ 2015-01-21  8:46                       ` Miao Xie
  2015-01-23 17:39                       ` David Sterba
  1 sibling, 0 replies; 25+ messages in thread
From: Miao Xie @ 2015-01-21  8:46 UTC (permalink / raw)
  To: Qu Wenruo, Chris Mason, dsterba; +Cc: linux-btrfs

On Wed, 21 Jan 2015 15:47:54 +0800, Qu Wenruo wrote:
>> On Wed, 21 Jan 2015 11:53:34 +0800, Qu Wenruo wrote:
>>>>> [snipped]
>>>>> This will cause another problem, nobody can ensure there will be next
>>>>> transaction and the change may
>>>>> never to written into disk.
>>>> First, the pending changes is mount option, that is in-memory data.
>>>> Second, the same problem would happen after you freeze fs.
>>> Pending changes are *not* only mount options. Feature change and label change
>>> are also pending changes if using sysfs.
>> My miss, I don't notice feature and label change by sysfs.
>>
>> But the implementation of feature and label change by sysfs is wrong, we can
>> not change them without write permission.
>>
>>> Normal ioctl label changing is not affected.
>>>
>>> For freeze, it's not the same problem since the fs will be unfreeze sooner or
>>> later and transaction will be initiated.
>> You can not assume the operations of the users, they might freeze the fs and
>> then shutdown the machine.
>>
>>>>> For example, if we change the features/label through sysfs, and then umount
>>>>> the fs,
>>>> It is different from pending change.
>>> No, now features/label changing using sysfs both use pending changes to do the
>>> commit.
>>> See BTRFS_PENDING_COMMIT bit.
>>> So freeze -> change features/label -> sync will still cause the deadlock in the
>>> same way,
>>> and you can try it yourself.
>> As I said above, the implementation of sysfs feature and label change is wrong,
>> it is better to separate them from the pending mount option change, make the
>> sysfs feature and label change be done in the context of transaction after
>> getting the write permission. If so, we needn't do anything special when sync
>> the fs.
>>
>> In short, changing the sysfs feature and label change implementation and
>> removing the unnecessary btrfs_start_transaction in sync_fs can fix the
>> deadlock.
> Your method will only fix the deadlock, but will introduce the risk like pending
> inode_cache will never
> be written to disk as I already explained. (If still using the
> fs_info->pending_changes mechanism)
> To ensure pending changes written to disk sync_fs() should start a transaction
> if needed,
> or there will be chance that no transaction can handle it.

We are sure that writting down the inode cache need transaction. But INODE_CACHE
is not a forcible flag. Sometimes though you set it, it is very likely that the
inode cache files are not created and the data is not written down because the
fs might still be reading inode usage information, and this operation might span
several transactions. So I think what you worried is not a problem.

Thanks
Miao

> 
> But I don't see the necessity to pending current work(inode_cache, feature/label
> changes) to next transaction.
> 
> To David:
> I'm a little curious about why inode_cache needs to be delayed to next transaction.
> In btrfs_remount() we have s_umount mutex, and we synced the whole filesystem
> already,
> so there should be no running transaction and we can just set any mount option
> into fs_info.
> 
> Or even in worst case, there is a racing window, we can still start a
> transaction and do the commit,
> a little overhead in such minor case won't impact the overall performance.
> 
> For sysfs change, I prefer attach or start transaction method, and for mount
> option change, and
> such sysfs tuning is also minor case for a filesystem.
> 
> What do you think about reverting the whole patchset and rework the sysfs
> interface?
> 
> Thanks,
> Qu
>>
>> Thanks
>> Miao
>>
>>> Thanks,
>>> Qu
>>>
>>>> If you want to change features/label,  you should get write permission and make
>>>> sure the fs is not be freezed because those are on-disk data. So the problem
>>>> doesn't exist, or there is a bug.
>>>>
>>>> Thanks
>>>> Miao
>>>>
>>>>> since there is no write, there is no running transaction and if we don't
>>>>> start a
>>>>> new transaction,
>>>>> it won't be flushed to disk.
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>> the reason is:
>>>>>> - Make the behavior of the fs be consistent(both freezed fs and unfreezed fs)
>>>>>> - Data on the disk is right and integrated
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Miao
>>>>> .
>>>>>
>>> .
>>>
> 
> .
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-21  7:04                   ` Miao Xie
  2015-01-21  7:47                     ` Qu Wenruo
@ 2015-01-23 16:59                     ` David Sterba
  2015-01-26  0:31                       ` Miao Xie
  1 sibling, 1 reply; 25+ messages in thread
From: David Sterba @ 2015-01-23 16:59 UTC (permalink / raw)
  To: Miao Xie; +Cc: Qu Wenruo, Chris Mason, dsterba, linux-btrfs

On Wed, Jan 21, 2015 at 03:04:02PM +0800, Miao Xie wrote:
> > Pending changes are *not* only mount options. Feature change and label change
> > are also pending changes if using sysfs.
> 
> My miss, I don't notice feature and label change by sysfs.
> 
> But the implementation of feature and label change by sysfs is wrong, we can
> not change them without write permission.

Label change does not happen if the fs is readonly. If the filesystem is
RW and label is changed through sysfs, then remount to RO will sync the
filesystem and the new label will be saved.

The sysfs features write handler is missing that protection though, I'll
send a patch.

> > For freeze, it's not the same problem since the fs will be unfreeze sooner or
> > later and transaction will be initiated.
> 
> You can not assume the operations of the users, they might freeze the fs and
> then shutdown the machine.

The semantics of freezing should make the on-device image consistent,
but still keep some changes in memory.

> >>> For example, if we change the features/label through sysfs, and then umount
> >>> the fs,
> >> It is different from pending change.
> > No, now features/label changing using sysfs both use pending changes to do the
> > commit.
> > See BTRFS_PENDING_COMMIT bit.
> > So freeze -> change features/label -> sync will still cause the deadlock in the
> > same way,
> > and you can try it yourself.
> 
> As I said above, the implementation of sysfs feature and label change is wrong,
> it is better to separate them from the pending mount option change, make the
> sysfs feature and label change be done in the context of transaction after
> getting the write permission. If so, we needn't do anything special when sync
> the fs.

That would mean to drop the write support of sysfs files that change
global filesystem state (label and features right now). This would leave
only the ioctl way to do that. I'd like to keep the sysfs write support
though for ease of use from scripts and languages not ioctl-friendly.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-21  7:47                     ` Qu Wenruo
  2015-01-21  8:46                       ` Miao Xie
@ 2015-01-23 17:39                       ` David Sterba
  2015-01-23 18:21                         ` Chris Mason
  1 sibling, 1 reply; 25+ messages in thread
From: David Sterba @ 2015-01-23 17:39 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Miao Xie, Chris Mason, dsterba, linux-btrfs

On Wed, Jan 21, 2015 at 03:47:54PM +0800, Qu Wenruo wrote:
> To David:
> I'm a little curious about why inode_cache needs to be delayed to next 
> transaction.
> In btrfs_remount() we have s_umount mutex, and we synced the whole 
> filesystem already,
> so there should be no running transaction and we can just set any mount 
> option into fs_info.

See our discussion under the noinode_cache option:

http://www.mail-archive.com/linux-btrfs%40vger.kernel.org/msg30075.html
http://www.mail-archive.com/linux-btrfs%40vger.kernel.org/msg30414.html

> What do you think about reverting the whole patchset and rework the 
> sysfs interface?

IMO reverting should be the last option, we have a minimal fix to the
sync deadlock and you've proposed the per-trasaction mount options to
replace the pending inode_change.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-23 17:39                       ` David Sterba
@ 2015-01-23 18:21                         ` Chris Mason
  0 siblings, 0 replies; 25+ messages in thread
From: Chris Mason @ 2015-01-23 18:21 UTC (permalink / raw)
  To: dsterba; +Cc: Qu Wenruo, Miao Xie, dsterba, linux-btrfs



On Fri, Jan 23, 2015 at 12:39 PM, David Sterba <dsterba@suse.cz> wrote:
> On Wed, Jan 21, 2015 at 03:47:54PM +0800, Qu Wenruo wrote:
>>  To David:
>>  I'm a little curious about why inode_cache needs to be delayed to 
>> next
>>  transaction.
>>  In btrfs_remount() we have s_umount mutex, and we synced the whole
>>  filesystem already,
>>  so there should be no running transaction and we can just set any 
>> mount
>>  option into fs_info.
> 
> See our discussion under the noinode_cache option:
> 
> https://urldefense.proofpoint.com/v1/url?u=http://www.mail-archive.com/linux-btrfs%2540vger.kernel.org/msg30075.html&k=ZVNjlDMF0FElm4dQtryO4A%3D%3D%0A&r=6%2FL0lzzDhu0Y1hL9xm%2BQyA%3D%3D%0A&m=sv%2BL93W9i7vNsbS3ozpylY3o%2F3wpA4TZTQTtFh3mUXg%3D%0A&s=2d678af317413a7452f047aa9ed07bc7e5424d4bae831ac15fae5f23a2acd080
> https://urldefense.proofpoint.com/v1/url?u=http://www.mail-archive.com/linux-btrfs%2540vger.kernel.org/msg30414.html&k=ZVNjlDMF0FElm4dQtryO4A%3D%3D%0A&r=6%2FL0lzzDhu0Y1hL9xm%2BQyA%3D%3D%0A&m=sv%2BL93W9i7vNsbS3ozpylY3o%2F3wpA4TZTQTtFh3mUXg%3D%0A&s=2fab711b3d70ab27c008694249bc62596f37e41af84dfc21077629930b4fe854
> 
>>  What do you think about reverting the whole patchset and rework the
>>  sysfs interface?
> 
> IMO reverting should be the last option, we have a minimal fix to the
> sync deadlock and you've proposed the per-trasaction mount options to
> replace the pending inode_change.

I agree, I'd rather build on top of what we have than use reverts at 
this point.

-chris




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  2015-01-23 16:59                     ` David Sterba
@ 2015-01-26  0:31                       ` Miao Xie
  0 siblings, 0 replies; 25+ messages in thread
From: Miao Xie @ 2015-01-26  0:31 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, Chris Mason, linux-btrfs

On Fri, 23 Jan 2015 17:59:49 +0100, David Sterba wrote:
> On Wed, Jan 21, 2015 at 03:04:02PM +0800, Miao Xie wrote:
>>> Pending changes are *not* only mount options. Feature change and label change
>>> are also pending changes if using sysfs.
>>
>> My miss, I don't notice feature and label change by sysfs.
>>
>> But the implementation of feature and label change by sysfs is wrong, we can
>> not change them without write permission.
> 
> Label change does not happen if the fs is readonly. If the filesystem is
> RW and label is changed through sysfs, then remount to RO will sync the
> filesystem and the new label will be saved.
> 
> The sysfs features write handler is missing that protection though, I'll
> send a patch.

First, the R/O protection is so cheap, there is a race between R/O remount and
label/feature change, please consider the following case:
	Remount R/O task		Label/Attr Change Task
					Check R/O
	remount ro R/O
					change Label/feature

Second, it forgets to handle the freezing event.

> 
>>> For freeze, it's not the same problem since the fs will be unfreeze sooner or
>>> later and transaction will be initiated.
>>
>> You can not assume the operations of the users, they might freeze the fs and
>> then shutdown the machine.
> 
> The semantics of freezing should make the on-device image consistent,
> but still keep some changes in memory.
> 
>>>>> For example, if we change the features/label through sysfs, and then umount
>>>>> the fs,
>>>> It is different from pending change.
>>> No, now features/label changing using sysfs both use pending changes to do the
>>> commit.
>>> See BTRFS_PENDING_COMMIT bit.
>>> So freeze -> change features/label -> sync will still cause the deadlock in the
>>> same way,
>>> and you can try it yourself.
>>
>> As I said above, the implementation of sysfs feature and label change is wrong,
>> it is better to separate them from the pending mount option change, make the
>> sysfs feature and label change be done in the context of transaction after
>> getting the write permission. If so, we needn't do anything special when sync
>> the fs.
> 
> That would mean to drop the write support of sysfs files that change
> global filesystem state (label and features right now). This would leave
> only the ioctl way to do that. I'd like to keep the sysfs write support
> though for ease of use from scripts and languages not ioctl-friendly.
> .

not drop the write support of sysfs, just fix the bug and make it change the
label and features under the writable context.

Thanks
Miao

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2015-01-26  0:37 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-19  7:42 [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock Qu Wenruo
2015-01-19 14:06 ` David Sterba
2015-01-20  2:51   ` Qu Wenruo
2015-01-20  2:53     ` Qu Wenruo
2015-01-20  3:06       ` Miao Xie
2015-01-20  3:17         ` Qu Wenruo
2015-01-20  8:16           ` Miao Xie
2015-01-20  0:19 ` Miao Xie
2015-01-20  0:26   ` Qu Wenruo
2015-01-20 17:13 ` David Sterba
2015-01-21  0:58   ` Qu Wenruo
2015-01-21  1:05     ` Chris Mason
2015-01-21  1:09       ` Qu Wenruo
2015-01-21  1:10         ` Chris Mason
2015-01-21  3:10           ` Miao Xie
2015-01-21  3:15             ` Qu Wenruo
2015-01-21  3:26               ` Miao Xie
2015-01-21  3:53                 ` Qu Wenruo
2015-01-21  7:04                   ` Miao Xie
2015-01-21  7:47                     ` Qu Wenruo
2015-01-21  8:46                       ` Miao Xie
2015-01-23 17:39                       ` David Sterba
2015-01-23 18:21                         ` Chris Mason
2015-01-23 16:59                     ` David Sterba
2015-01-26  0:31                       ` Miao Xie

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.