LKML Archive on lore.kernel.org
 help / Atom feed
* [PATCH V3] ocfs2: fix dead lock caused by ocfs2_defrag_extent
@ 2018-11-01  7:14 Larry Chen
  2018-11-01  8:58 ` [Ocfs2-devel] " Joseph Qi
  2018-11-02  0:53 ` Changwei Ge
  0 siblings, 2 replies; 9+ messages in thread
From: Larry Chen @ 2018-11-01  7:14 UTC (permalink / raw)
  To: mark, jlbec; +Cc: linux-kernel, ocfs2-devel, akpm, ge.changwei

ocfs2_defrag_extent may fall into deadlock.

ocfs2_ioctl_move_extents
    ocfs2_ioctl_move_extents
      ocfs2_move_extents
        ocfs2_defrag_extent
          ocfs2_lock_allocators_move_extents

            ocfs2_reserve_clusters
              inode_lock GLOBAL_BITMAP_SYSTEM_INODE

	  __ocfs2_flush_truncate_log
              inode_lock GLOBAL_BITMAP_SYSTEM_INODE

As backtrace shows above, ocfs2_reserve_clusters() will call inode_lock
against the global bitmap if local allocator has not sufficient cluters.
Once global bitmap could meet the demand, ocfs2_reserve_cluster will
return success with global bitmap locked.

After ocfs2_reserve_cluster(), if truncate log is full,
__ocfs2_flush_truncate_log() will definitely fall into deadlock because it
needs to inode_lock global bitmap, which has already been locked.

To fix this bug, we could remove from ocfs2_lock_allocators_move_extents()
the code which intends to lock global allocator, and put the removed code
after __ocfs2_flush_truncate_log().

ocfs2_lock_allocators_move_extents() is referred by 2 places, one is here,
the other does not need the data allocator context, which means this patch
does not affect the caller so far.

Change log:
1. Correct the function comment.
2. Remove unused argument from ocfs2_lock_meta_allocator_move_extents.

Signed-off-by: Larry Chen <lchen@suse.com>
---
 fs/ocfs2/move_extents.c | 47 ++++++++++++++++++++++++++---------------------
 1 file changed, 26 insertions(+), 21 deletions(-)

diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
index 7eb3b0a6347e..f55f82ca3425 100644
--- a/fs/ocfs2/move_extents.c
+++ b/fs/ocfs2/move_extents.c
@@ -156,18 +156,14 @@ static int __ocfs2_move_extent(handle_t *handle,
 }
 
 /*
- * lock allocators, and reserving appropriate number of bits for
- * meta blocks and data clusters.
- *
- * in some cases, we don't need to reserve clusters, just let data_ac
- * be NULL.
+ * lock allocator, and reserve appropriate number of bits for
+ * meta blocks.
  */
-static int ocfs2_lock_allocators_move_extents(struct inode *inode,
+static int ocfs2_lock_meta_allocator_move_extents(struct inode *inode,
 					struct ocfs2_extent_tree *et,
 					u32 clusters_to_move,
 					u32 extents_to_split,
 					struct ocfs2_alloc_context **meta_ac,
-					struct ocfs2_alloc_context **data_ac,
 					int extra_blocks,
 					int *credits)
 {
@@ -192,13 +188,6 @@ static int ocfs2_lock_allocators_move_extents(struct inode *inode,
 		goto out;
 	}
 
-	if (data_ac) {
-		ret = ocfs2_reserve_clusters(osb, clusters_to_move, data_ac);
-		if (ret) {
-			mlog_errno(ret);
-			goto out;
-		}
-	}
 
 	*credits += ocfs2_calc_extend_credits(osb->sb, et->et_root_el);
 
@@ -257,10 +246,10 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
 		}
 	}
 
-	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, *len, 1,
-						 &context->meta_ac,
-						 &context->data_ac,
-						 extra_blocks, &credits);
+	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
+						*len, 1,
+						&context->meta_ac,
+						extra_blocks, &credits);
 	if (ret) {
 		mlog_errno(ret);
 		goto out;
@@ -283,6 +272,21 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
 		}
 	}
 
+	/*
+	 * Make sure ocfs2_reserve_cluster is called after
+	 * __ocfs2_flush_truncate_log, otherwise, dead lock may happen.
+	 *
+	 * If ocfs2_reserve_cluster is called
+	 * before __ocfs2_flush_truncate_log, dead lock on global bitmap
+	 * may happen.
+	 *
+	 */
+	ret = ocfs2_reserve_clusters(osb, *len, &context->data_ac);
+	if (ret) {
+		mlog_errno(ret);
+		goto out_unlock_mutex;
+	}
+
 	handle = ocfs2_start_trans(osb, credits);
 	if (IS_ERR(handle)) {
 		ret = PTR_ERR(handle);
@@ -600,9 +604,10 @@ static int ocfs2_move_extent(struct ocfs2_move_extents_context *context,
 		}
 	}
 
-	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, len, 1,
-						 &context->meta_ac,
-						 NULL, extra_blocks, &credits);
+	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
+						len, 1,
+						&context->meta_ac,
+						extra_blocks, &credits);
 	if (ret) {
 		mlog_errno(ret);
 		goto out;
-- 
2.16.4


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Ocfs2-devel] [PATCH V3] ocfs2: fix dead lock caused by ocfs2_defrag_extent
  2018-11-01  7:14 [PATCH V3] ocfs2: fix dead lock caused by ocfs2_defrag_extent Larry Chen
@ 2018-11-01  8:58 ` " Joseph Qi
  2018-11-01 11:52   ` Changwei Ge
  2018-11-02  0:53 ` Changwei Ge
  1 sibling, 1 reply; 9+ messages in thread
From: Joseph Qi @ 2018-11-01  8:58 UTC (permalink / raw)
  To: Larry Chen, mark, jlbec; +Cc: linux-kernel, ocfs2-devel

Hi Larry,

On 18/11/1 15:14, Larry Chen wrote:
> ocfs2_defrag_extent may fall into deadlock.
> 
> ocfs2_ioctl_move_extents
>     ocfs2_ioctl_move_extents
>       ocfs2_move_extents
>         ocfs2_defrag_extent
>           ocfs2_lock_allocators_move_extents
> 
>             ocfs2_reserve_clusters
>               inode_lock GLOBAL_BITMAP_SYSTEM_INODE
> 
> 	  __ocfs2_flush_truncate_log
>               inode_lock GLOBAL_BITMAP_SYSTEM_INODE
> 
> As backtrace shows above, ocfs2_reserve_clusters() will call inode_lock
> against the global bitmap if local allocator has not sufficient cluters.
> Once global bitmap could meet the demand, ocfs2_reserve_cluster will
> return success with global bitmap locked.
> 
> After ocfs2_reserve_cluster(), if truncate log is full,
> __ocfs2_flush_truncate_log() will definitely fall into deadlock because it
> needs to inode_lock global bitmap, which has already been locked.
> 
> To fix this bug, we could remove from ocfs2_lock_allocators_move_extents()
> the code which intends to lock global allocator, and put the removed code
> after __ocfs2_flush_truncate_log().
> 
> ocfs2_lock_allocators_move_extents() is referred by 2 places, one is here,
> the other does not need the data allocator context, which means this patch
> does not affect the caller so far.
> 
> Change log:
> 1. Correct the function comment.
> 2. Remove unused argument from ocfs2_lock_meta_allocator_move_extents.
> 
> Signed-off-by: Larry Chen <lchen@suse.com>
> ---
>  fs/ocfs2/move_extents.c | 47 ++++++++++++++++++++++++++---------------------
>  1 file changed, 26 insertions(+), 21 deletions(-)
> 
> diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
> index 7eb3b0a6347e..f55f82ca3425 100644
> --- a/fs/ocfs2/move_extents.c
> +++ b/fs/ocfs2/move_extents.c
> @@ -156,18 +156,14 @@ static int __ocfs2_move_extent(handle_t *handle,
>  }
>  
>  /*
> - * lock allocators, and reserving appropriate number of bits for
> - * meta blocks and data clusters.
> - *
> - * in some cases, we don't need to reserve clusters, just let data_ac
> - * be NULL.
> + * lock allocator, and reserve appropriate number of bits for
> + * meta blocks.
>   */
> -static int ocfs2_lock_allocators_move_extents(struct inode *inode,
> +static int ocfs2_lock_meta_allocator_move_extents(struct inode *inode,
>  					struct ocfs2_extent_tree *et,
>  					u32 clusters_to_move,

IMO, here clusters_to_move is only for data_ac, since we change this
function to only handle meta_ac, I'm afraid clusters_to_move related
logic has to been changed correspondingly.

Thanks,
Joseph
>  					u32 extents_to_split,
>  					struct ocfs2_alloc_context **meta_ac,
> -					struct ocfs2_alloc_context **data_ac,
>  					int extra_blocks,
>  					int *credits)
>  {
> @@ -192,13 +188,6 @@ static int ocfs2_lock_allocators_move_extents(struct inode *inode,
>  		goto out;
>  	}
>  
> -	if (data_ac) {
> -		ret = ocfs2_reserve_clusters(osb, clusters_to_move, data_ac);
> -		if (ret) {
> -			mlog_errno(ret);
> -			goto out;
> -		}
> -	}
>  
>  	*credits += ocfs2_calc_extend_credits(osb->sb, et->et_root_el);
>  
> @@ -257,10 +246,10 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
>  		}
>  	}
>  
> -	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, *len, 1,
> -						 &context->meta_ac,
> -						 &context->data_ac,
> -						 extra_blocks, &credits);
> +	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
> +						*len, 1,
> +						&context->meta_ac,
> +						extra_blocks, &credits);
>  	if (ret) {
>  		mlog_errno(ret);
>  		goto out;
> @@ -283,6 +272,21 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
>  		}
>  	}
>  
> +	/*
> +	 * Make sure ocfs2_reserve_cluster is called after
> +	 * __ocfs2_flush_truncate_log, otherwise, dead lock may happen.
> +	 *
> +	 * If ocfs2_reserve_cluster is called
> +	 * before __ocfs2_flush_truncate_log, dead lock on global bitmap
> +	 * may happen.
> +	 *
> +	 */
> +	ret = ocfs2_reserve_clusters(osb, *len, &context->data_ac);
> +	if (ret) {
> +		mlog_errno(ret);
> +		goto out_unlock_mutex;
> +	}
> +
>  	handle = ocfs2_start_trans(osb, credits);
>  	if (IS_ERR(handle)) {
>  		ret = PTR_ERR(handle);
> @@ -600,9 +604,10 @@ static int ocfs2_move_extent(struct ocfs2_move_extents_context *context,
>  		}
>  	}
>  
> -	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, len, 1,
> -						 &context->meta_ac,
> -						 NULL, extra_blocks, &credits);
> +	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
> +						len, 1,
> +						&context->meta_ac,
> +						extra_blocks, &credits);
>  	if (ret) {
>  		mlog_errno(ret);
>  		goto out;
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Ocfs2-devel] [PATCH V3] ocfs2: fix dead lock caused by ocfs2_defrag_extent
  2018-11-01  8:58 ` [Ocfs2-devel] " Joseph Qi
@ 2018-11-01 11:52   ` Changwei Ge
  2018-11-01 12:15     ` Joseph Qi
  2018-11-01 12:39     ` Larry Chen
  0 siblings, 2 replies; 9+ messages in thread
From: Changwei Ge @ 2018-11-01 11:52 UTC (permalink / raw)
  To: Joseph Qi, Larry Chen, mark, jlbec
  Cc: linux-kernel, ocfs2-devel, Andrew Morton

Hello Joseph,

On 2018/11/1 17:01, Joseph Qi wrote:
> Hi Larry,
> 
> On 18/11/1 15:14, Larry Chen wrote:
>> ocfs2_defrag_extent may fall into deadlock.
>>
>> ocfs2_ioctl_move_extents
>>      ocfs2_ioctl_move_extents
>>        ocfs2_move_extents
>>          ocfs2_defrag_extent
>>            ocfs2_lock_allocators_move_extents
>>
>>              ocfs2_reserve_clusters
>>                inode_lock GLOBAL_BITMAP_SYSTEM_INODE
>>
>> 	  __ocfs2_flush_truncate_log
>>                inode_lock GLOBAL_BITMAP_SYSTEM_INODE
>>
>> As backtrace shows above, ocfs2_reserve_clusters() will call inode_lock
>> against the global bitmap if local allocator has not sufficient cluters.
>> Once global bitmap could meet the demand, ocfs2_reserve_cluster will
>> return success with global bitmap locked.
>>
>> After ocfs2_reserve_cluster(), if truncate log is full,
>> __ocfs2_flush_truncate_log() will definitely fall into deadlock because it
>> needs to inode_lock global bitmap, which has already been locked.
>>
>> To fix this bug, we could remove from ocfs2_lock_allocators_move_extents()
>> the code which intends to lock global allocator, and put the removed code
>> after __ocfs2_flush_truncate_log().
>>
>> ocfs2_lock_allocators_move_extents() is referred by 2 places, one is here,
>> the other does not need the data allocator context, which means this patch
>> does not affect the caller so far.
>>
>> Change log:
>> 1. Correct the function comment.
>> 2. Remove unused argument from ocfs2_lock_meta_allocator_move_extents.
>>
>> Signed-off-by: Larry Chen <lchen@suse.com>
>> ---
>>   fs/ocfs2/move_extents.c | 47 ++++++++++++++++++++++++++---------------------
>>   1 file changed, 26 insertions(+), 21 deletions(-)
>>

> IMO, here clusters_to_move is only for data_ac, since we change this
> function to only handle meta_ac, I'm afraid clusters_to_move related
> logic has to been changed correspondingly.

I think we can't remove *clusters_to_move* from here as clusters can be reserved latter outsides this function, but we
still have to reserve metadata(extents) in advance.
So we need that argument.

Thanks,
Changwei

> 
> Thanks,
> Joseph
>>   					u32 extents_to_split,
>>   					struct ocfs2_alloc_context **meta_ac,
>> -					struct ocfs2_alloc_context **data_ac,
>>   					int extra_blocks,
>>   					int *credits)
>>   {
>> @@ -192,13 +188,6 @@ static int ocfs2_lock_allocators_move_extents(struct inode *inode,
>>   		goto out;
>>   	}
>>   
>> -	if (data_ac) {
>> -		ret = ocfs2_reserve_clusters(osb, clusters_to_move, data_ac);
>> -		if (ret) {
>> -			mlog_errno(ret);
>> -			goto out;
>> -		}
>> -	}
>>   
>>   	*credits += ocfs2_calc_extend_credits(osb->sb, et->et_root_el);
>>   
>> @@ -257,10 +246,10 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
>>   		}
>>   	}
>>   
>> -	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, *len, 1,
>> -						 &context->meta_ac,
>> -						 &context->data_ac,
>> -						 extra_blocks, &credits);
>> +	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
>> +						*len, 1,
>> +						&context->meta_ac,
>> +						extra_blocks, &credits);
>>   	if (ret) {
>>   		mlog_errno(ret);
>>   		goto out;
>> @@ -283,6 +272,21 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
>>   		}
>>   	}
>>   
>> +	/*
>> +	 * Make sure ocfs2_reserve_cluster is called after
>> +	 * __ocfs2_flush_truncate_log, otherwise, dead lock may happen.
>> +	 *
>> +	 * If ocfs2_reserve_cluster is called
>> +	 * before __ocfs2_flush_truncate_log, dead lock on global bitmap
>> +	 * may happen.
>> +	 *
>> +	 */
>> +	ret = ocfs2_reserve_clusters(osb, *len, &context->data_ac);
>> +	if (ret) {
>> +		mlog_errno(ret);
>> +		goto out_unlock_mutex;
>> +	}
>> +
>>   	handle = ocfs2_start_trans(osb, credits);
>>   	if (IS_ERR(handle)) {
>>   		ret = PTR_ERR(handle);
>> @@ -600,9 +604,10 @@ static int ocfs2_move_extent(struct ocfs2_move_extents_context *context,
>>   		}
>>   	}
>>   
>> -	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, len, 1,
>> -						 &context->meta_ac,
>> -						 NULL, extra_blocks, &credits);
>> +	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
>> +						len, 1,
>> +						&context->meta_ac,
>> +						extra_blocks, &credits);
>>   	if (ret) {
>>   		mlog_errno(ret);
>>   		goto out;
>>
> 
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel@oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Ocfs2-devel] [PATCH V3] ocfs2: fix dead lock caused by ocfs2_defrag_extent
  2018-11-01 11:52   ` Changwei Ge
@ 2018-11-01 12:15     ` Joseph Qi
  2018-11-01 12:34       ` Changwei Ge
  2018-11-01 12:39     ` Larry Chen
  1 sibling, 1 reply; 9+ messages in thread
From: Joseph Qi @ 2018-11-01 12:15 UTC (permalink / raw)
  To: Changwei Ge, Larry Chen, mark, jlbec
  Cc: linux-kernel, ocfs2-devel, Andrew Morton



On 18/11/1 19:52, Changwei Ge wrote:
> Hello Joseph,
> 
> On 2018/11/1 17:01, Joseph Qi wrote:
>> Hi Larry,
>>
>> On 18/11/1 15:14, Larry Chen wrote:
>>> ocfs2_defrag_extent may fall into deadlock.
>>>
>>> ocfs2_ioctl_move_extents
>>>      ocfs2_ioctl_move_extents
>>>        ocfs2_move_extents
>>>          ocfs2_defrag_extent
>>>            ocfs2_lock_allocators_move_extents
>>>
>>>              ocfs2_reserve_clusters
>>>                inode_lock GLOBAL_BITMAP_SYSTEM_INODE
>>>
>>> 	  __ocfs2_flush_truncate_log
>>>                inode_lock GLOBAL_BITMAP_SYSTEM_INODE
>>>
>>> As backtrace shows above, ocfs2_reserve_clusters() will call inode_lock
>>> against the global bitmap if local allocator has not sufficient cluters.
>>> Once global bitmap could meet the demand, ocfs2_reserve_cluster will
>>> return success with global bitmap locked.
>>>
>>> After ocfs2_reserve_cluster(), if truncate log is full,
>>> __ocfs2_flush_truncate_log() will definitely fall into deadlock because it
>>> needs to inode_lock global bitmap, which has already been locked.
>>>
>>> To fix this bug, we could remove from ocfs2_lock_allocators_move_extents()
>>> the code which intends to lock global allocator, and put the removed code
>>> after __ocfs2_flush_truncate_log().
>>>
>>> ocfs2_lock_allocators_move_extents() is referred by 2 places, one is here,
>>> the other does not need the data allocator context, which means this patch
>>> does not affect the caller so far.
>>>
>>> Change log:
>>> 1. Correct the function comment.
>>> 2. Remove unused argument from ocfs2_lock_meta_allocator_move_extents.
>>>
>>> Signed-off-by: Larry Chen <lchen@suse.com>
>>> ---
>>>   fs/ocfs2/move_extents.c | 47 ++++++++++++++++++++++++++---------------------
>>>   1 file changed, 26 insertions(+), 21 deletions(-)
>>>
> 
>> IMO, here clusters_to_move is only for data_ac, since we change this
>> function to only handle meta_ac, I'm afraid clusters_to_move related
>> logic has to been changed correspondingly.
> 
> I think we can't remove *clusters_to_move* from here as clusters can be reserved latter outsides this function, but we
> still have to reserve metadata(extents) in advance.
> So we need that argument.
> 
I was not saying just remove it.
IIUC, clusters_to_move is for reserving data clusters (for meta_ac, we
mostly talk about blocks). Since we have moved data cluster reserve
logic out of ocfs2_lock_allocators_move_extents() now, then left
clusters_to_move related logic here is odd.

>>>   					u32 extents_to_split,
>>>   					struct ocfs2_alloc_context **meta_ac,
>>> -					struct ocfs2_alloc_context **data_ac,
>>>   					int extra_blocks,
>>>   					int *credits)
>>>   {
>>> @@ -192,13 +188,6 @@ static int ocfs2_lock_allocators_move_extents(struct inode *inode,
>>>   		goto out;
>>>   	}
>>>   
>>> -	if (data_ac) {
>>> -		ret = ocfs2_reserve_clusters(osb, clusters_to_move, data_ac);
>>> -		if (ret) {
>>> -			mlog_errno(ret);
>>> -			goto out;
>>> -		}
>>> -	}
>>>   
>>>   	*credits += ocfs2_calc_extend_credits(osb->sb, et->et_root_el);
>>>   
>>> @@ -257,10 +246,10 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
>>>   		}
>>>   	}
>>>   
>>> -	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, *len, 1,
>>> -						 &context->meta_ac,
>>> -						 &context->data_ac,
>>> -						 extra_blocks, &credits);
>>> +	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
>>> +						*len, 1,
>>> +						&context->meta_ac,
>>> +						extra_blocks, &credits);
>>>   	if (ret) {
>>>   		mlog_errno(ret);
>>>   		goto out;
>>> @@ -283,6 +272,21 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
>>>   		}
>>>   	}
>>>   
>>> +	/*
>>> +	 * Make sure ocfs2_reserve_cluster is called after
>>> +	 * __ocfs2_flush_truncate_log, otherwise, dead lock may happen.
>>> +	 *
>>> +	 * If ocfs2_reserve_cluster is called
>>> +	 * before __ocfs2_flush_truncate_log, dead lock on global bitmap
>>> +	 * may happen.
>>> +	 *
>>> +	 */
>>> +	ret = ocfs2_reserve_clusters(osb, *len, &context->data_ac);
>>> +	if (ret) {
>>> +		mlog_errno(ret);
>>> +		goto out_unlock_mutex;
>>> +	}
>>> +
>>>   	handle = ocfs2_start_trans(osb, credits);
>>>   	if (IS_ERR(handle)) {
>>>   		ret = PTR_ERR(handle);
>>> @@ -600,9 +604,10 @@ static int ocfs2_move_extent(struct ocfs2_move_extents_context *context,
>>>   		}
>>>   	}
>>>   
>>> -	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, len, 1,
>>> -						 &context->meta_ac,
>>> -						 NULL, extra_blocks, &credits);
>>> +	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
>>> +						len, 1,
>>> +						&context->meta_ac,
>>> +						extra_blocks, &credits);
>>>   	if (ret) {
>>>   		mlog_errno(ret);
>>>   		goto out;
>>>
>>
>> _______________________________________________
>> Ocfs2-devel mailing list
>> Ocfs2-devel@oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Ocfs2-devel] [PATCH V3] ocfs2: fix dead lock caused by ocfs2_defrag_extent
  2018-11-01 12:15     ` Joseph Qi
@ 2018-11-01 12:34       ` Changwei Ge
  2018-11-02  1:18         ` Joseph Qi
  0 siblings, 1 reply; 9+ messages in thread
From: Changwei Ge @ 2018-11-01 12:34 UTC (permalink / raw)
  To: Joseph Qi, Larry Chen, mark, jlbec
  Cc: linux-kernel, ocfs2-devel, Andrew Morton

Hello Joseph,

On 2018/11/1 20:16, Joseph Qi wrote:
> 
> 
> On 18/11/1 19:52, Changwei Ge wrote:
>> Hello Joseph,
>>
>> On 2018/11/1 17:01, Joseph Qi wrote:
>>> Hi Larry,
>>>
>>> On 18/11/1 15:14, Larry Chen wrote:
>>>> ocfs2_defrag_extent may fall into deadlock.
>>>>
>>>> ocfs2_ioctl_move_extents
>>>>       ocfs2_ioctl_move_extents
>>>>         ocfs2_move_extents
>>>>           ocfs2_defrag_extent
>>>>             ocfs2_lock_allocators_move_extents
>>>>
>>>>               ocfs2_reserve_clusters
>>>>                 inode_lock GLOBAL_BITMAP_SYSTEM_INODE
>>>>
>>>> 	  __ocfs2_flush_truncate_log
>>>>                 inode_lock GLOBAL_BITMAP_SYSTEM_INODE
>>>>
>>>> As backtrace shows above, ocfs2_reserve_clusters() will call inode_lock
>>>> against the global bitmap if local allocator has not sufficient cluters.
>>>> Once global bitmap could meet the demand, ocfs2_reserve_cluster will
>>>> return success with global bitmap locked.
>>>>
>>>> After ocfs2_reserve_cluster(), if truncate log is full,
>>>> __ocfs2_flush_truncate_log() will definitely fall into deadlock because it
>>>> needs to inode_lock global bitmap, which has already been locked.
>>>>
>>>> To fix this bug, we could remove from ocfs2_lock_allocators_move_extents()
>>>> the code which intends to lock global allocator, and put the removed code
>>>> after __ocfs2_flush_truncate_log().
>>>>
>>>> ocfs2_lock_allocators_move_extents() is referred by 2 places, one is here,
>>>> the other does not need the data allocator context, which means this patch
>>>> does not affect the caller so far.
>>>>
>>>> Change log:
>>>> 1. Correct the function comment.
>>>> 2. Remove unused argument from ocfs2_lock_meta_allocator_move_extents.
>>>>
>>>> Signed-off-by: Larry Chen <lchen@suse.com>
>>>> ---
>>>>    fs/ocfs2/move_extents.c | 47 ++++++++++++++++++++++++++---------------------
>>>>    1 file changed, 26 insertions(+), 21 deletions(-)
>>>>
>>
>>> IMO, here clusters_to_move is only for data_ac, since we change this
>>> function to only handle meta_ac, I'm afraid clusters_to_move related
>>> logic has to been changed correspondingly.
>>
>> I think we can't remove *clusters_to_move* from here as clusters can be reserved latter outsides this function, but we
>> still have to reserve metadata(extents) in advance.
>> So we need that argument.
>>
> I was not saying just remove it.
> IIUC, clusters_to_move is for reserving data clusters (for meta_ac, we

Um...
*cluster_to_move* is not only used for reserving data clusters.
It is also an effecting input for calculating if existed extents still have enough free records for later
tree operation like attaching clusters to extents.

Please refer to below code:
  175         unsigned int max_recs_needed = 2 * extents_to_split + clusters_to_move;



> mostly talk about blocks). Since we have moved data cluster reserve
> logic out of ocfs2_lock_allocators_move_extents() now, then left
> clusters_to_move related logic here is odd.

Like my preceding elaboration, it is used for telling if we need more extents.
Anyway, I think we must keep *cluster_to_move* here as it used to. :-)

Thanks,
Changwei




> 
>>>>    					u32 extents_to_split,
>>>>    					struct ocfs2_alloc_context **meta_ac,
>>>> -					struct ocfs2_alloc_context **data_ac,
>>>>    					int extra_blocks,
>>>>    					int *credits)
>>>>    {
>>>> @@ -192,13 +188,6 @@ static int ocfs2_lock_allocators_move_extents(struct inode *inode,
>>>>    		goto out;
>>>>    	}
>>>>    
>>>> -	if (data_ac) {
>>>> -		ret = ocfs2_reserve_clusters(osb, clusters_to_move, data_ac);
>>>> -		if (ret) {
>>>> -			mlog_errno(ret);
>>>> -			goto out;
>>>> -		}
>>>> -	}
>>>>    
>>>>    	*credits += ocfs2_calc_extend_credits(osb->sb, et->et_root_el);
>>>>    
>>>> @@ -257,10 +246,10 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
>>>>    		}
>>>>    	}
>>>>    
>>>> -	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, *len, 1,
>>>> -						 &context->meta_ac,
>>>> -						 &context->data_ac,
>>>> -						 extra_blocks, &credits);
>>>> +	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
>>>> +						*len, 1,
>>>> +						&context->meta_ac,
>>>> +						extra_blocks, &credits);
>>>>    	if (ret) {
>>>>    		mlog_errno(ret);
>>>>    		goto out;
>>>> @@ -283,6 +272,21 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
>>>>    		}
>>>>    	}
>>>>    
>>>> +	/*
>>>> +	 * Make sure ocfs2_reserve_cluster is called after
>>>> +	 * __ocfs2_flush_truncate_log, otherwise, dead lock may happen.
>>>> +	 *
>>>> +	 * If ocfs2_reserve_cluster is called
>>>> +	 * before __ocfs2_flush_truncate_log, dead lock on global bitmap
>>>> +	 * may happen.
>>>> +	 *
>>>> +	 */
>>>> +	ret = ocfs2_reserve_clusters(osb, *len, &context->data_ac);
>>>> +	if (ret) {
>>>> +		mlog_errno(ret);
>>>> +		goto out_unlock_mutex;
>>>> +	}
>>>> +
>>>>    	handle = ocfs2_start_trans(osb, credits);
>>>>    	if (IS_ERR(handle)) {
>>>>    		ret = PTR_ERR(handle);
>>>> @@ -600,9 +604,10 @@ static int ocfs2_move_extent(struct ocfs2_move_extents_context *context,
>>>>    		}
>>>>    	}
>>>>    
>>>> -	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, len, 1,
>>>> -						 &context->meta_ac,
>>>> -						 NULL, extra_blocks, &credits);
>>>> +	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
>>>> +						len, 1,
>>>> +						&context->meta_ac,
>>>> +						extra_blocks, &credits);
>>>>    	if (ret) {
>>>>    		mlog_errno(ret);
>>>>    		goto out;
>>>>
>>>
>>> _______________________________________________
>>> Ocfs2-devel mailing list
>>> Ocfs2-devel@oss.oracle.com
>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>>
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Ocfs2-devel] [PATCH V3] ocfs2: fix dead lock caused by ocfs2_defrag_extent
  2018-11-01 11:52   ` Changwei Ge
  2018-11-01 12:15     ` Joseph Qi
@ 2018-11-01 12:39     ` Larry Chen
  2018-11-01 12:48       ` Changwei Ge
  1 sibling, 1 reply; 9+ messages in thread
From: Larry Chen @ 2018-11-01 12:39 UTC (permalink / raw)
  To: Changwei Ge, Joseph Qi; +Cc: linux-kernel, ocfs2-devel, Andrew Morton

Hi Joseph,

On 11/1/18 7:52 PM, Changwei Ge wrote:
> Hello Joseph,
> 
> On 2018/11/1 17:01, Joseph Qi wrote:
>> Hi Larry,
>>
>> On 18/11/1 15:14, Larry Chen wrote:
>>> ocfs2_defrag_extent may fall into deadlock.
>>>
>>> ocfs2_ioctl_move_extents
>>>       ocfs2_ioctl_move_extents
>>>         ocfs2_move_extents
>>>           ocfs2_defrag_extent
>>>             ocfs2_lock_allocators_move_extents
>>>
>>>               ocfs2_reserve_clusters
>>>                 inode_lock GLOBAL_BITMAP_SYSTEM_INODE
>>>
>>> 	  __ocfs2_flush_truncate_log
>>>                 inode_lock GLOBAL_BITMAP_SYSTEM_INODE
>>>
>>> As backtrace shows above, ocfs2_reserve_clusters() will call inode_lock
>>> against the global bitmap if local allocator has not sufficient cluters.
>>> Once global bitmap could meet the demand, ocfs2_reserve_cluster will
>>> return success with global bitmap locked.
>>>
>>> After ocfs2_reserve_cluster(), if truncate log is full,
>>> __ocfs2_flush_truncate_log() will definitely fall into deadlock because it
>>> needs to inode_lock global bitmap, which has already been locked.
>>>
>>> To fix this bug, we could remove from ocfs2_lock_allocators_move_extents()
>>> the code which intends to lock global allocator, and put the removed code
>>> after __ocfs2_flush_truncate_log().
>>>
>>> ocfs2_lock_allocators_move_extents() is referred by 2 places, one is here,
>>> the other does not need the data allocator context, which means this patch
>>> does not affect the caller so far.
>>>
>>> Change log:
>>> 1. Correct the function comment.
>>> 2. Remove unused argument from ocfs2_lock_meta_allocator_move_extents.
>>>
>>> Signed-off-by: Larry Chen <lchen@suse.com>
>>> ---
>>>    fs/ocfs2/move_extents.c | 47 ++++++++++++++++++++++++++---------------------
>>>    1 file changed, 26 insertions(+), 21 deletions(-)
>>>
> 
>> IMO, here clusters_to_move is only for data_ac, since we change this
>> function to only handle meta_ac, I'm afraid clusters_to_move related
>> logic has to been changed correspondingly.
> 
> I think we can't remove *clusters_to_move* from here as clusters can be reserved latter outsides this function, but we
> still have to reserve metadata(extents) in advance.
> So we need that argument.
> 

Yeah, I think clusters_to_move should be reserved, in order to keep the 
original logic as it was.

But I'm curious about why max_recs_needed should be equal to
2 * extents_to_split + cluster_to_move?
Does that mean that each cluster might form an extent??

Thanks,
Larry


> Thanks,
> Changwei
> 
>>
>> Thanks,
>> Joseph
>>>    					u32 extents_to_split,
>>>    					struct ocfs2_alloc_context **meta_ac,
>>> -					struct ocfs2_alloc_context **data_ac,
>>>    					int extra_blocks,
>>>    					int *credits)
>>>    {
>>> @@ -192,13 +188,6 @@ static int ocfs2_lock_allocators_move_extents(struct inode *inode,
>>>    		goto out;
>>>    	}
>>>    
>>> -	if (data_ac) {
>>> -		ret = ocfs2_reserve_clusters(osb, clusters_to_move, data_ac);
>>> -		if (ret) {
>>> -			mlog_errno(ret);
>>> -			goto out;
>>> -		}
>>> -	}
>>>    
>>>    	*credits += ocfs2_calc_extend_credits(osb->sb, et->et_root_el);
>>>    
>>> @@ -257,10 +246,10 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
>>>    		}
>>>    	}
>>>    
>>> -	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, *len, 1,
>>> -						 &context->meta_ac,
>>> -						 &context->data_ac,
>>> -						 extra_blocks, &credits);
>>> +	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
>>> +						*len, 1,
>>> +						&context->meta_ac,
>>> +						extra_blocks, &credits);
>>>    	if (ret) {
>>>    		mlog_errno(ret);
>>>    		goto out;
>>> @@ -283,6 +272,21 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
>>>    		}
>>>    	}
>>>    
>>> +	/*
>>> +	 * Make sure ocfs2_reserve_cluster is called after
>>> +	 * __ocfs2_flush_truncate_log, otherwise, dead lock may happen.
>>> +	 *
>>> +	 * If ocfs2_reserve_cluster is called
>>> +	 * before __ocfs2_flush_truncate_log, dead lock on global bitmap
>>> +	 * may happen.
>>> +	 *
>>> +	 */
>>> +	ret = ocfs2_reserve_clusters(osb, *len, &context->data_ac);
>>> +	if (ret) {
>>> +		mlog_errno(ret);
>>> +		goto out_unlock_mutex;
>>> +	}
>>> +
>>>    	handle = ocfs2_start_trans(osb, credits);
>>>    	if (IS_ERR(handle)) {
>>>    		ret = PTR_ERR(handle);
>>> @@ -600,9 +604,10 @@ static int ocfs2_move_extent(struct ocfs2_move_extents_context *context,
>>>    		}
>>>    	}
>>>    
>>> -	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, len, 1,
>>> -						 &context->meta_ac,
>>> -						 NULL, extra_blocks, &credits);
>>> +	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
>>> +						len, 1,
>>> +						&context->meta_ac,
>>> +						extra_blocks, &credits);
>>>    	if (ret) {
>>>    		mlog_errno(ret);
>>>    		goto out;
>>>
>>
>> _______________________________________________
>> Ocfs2-devel mailing list
>> Ocfs2-devel@oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Ocfs2-devel] [PATCH V3] ocfs2: fix dead lock caused by ocfs2_defrag_extent
  2018-11-01 12:39     ` Larry Chen
@ 2018-11-01 12:48       ` Changwei Ge
  0 siblings, 0 replies; 9+ messages in thread
From: Changwei Ge @ 2018-11-01 12:48 UTC (permalink / raw)
  To: Larry Chen, Joseph Qi; +Cc: linux-kernel, ocfs2-devel, Andrew Morton

Hi Larry,

On 2018/11/1 20:39, Larry Chen wrote:
> Hi Joseph,
> 
> On 11/1/18 7:52 PM, Changwei Ge wrote:
>> Hello Joseph,
>>
>> On 2018/11/1 17:01, Joseph Qi wrote:
>>> Hi Larry,
>>>
>>> On 18/11/1 15:14, Larry Chen wrote:
>>>> ocfs2_defrag_extent may fall into deadlock.
>>>>
>>>> ocfs2_ioctl_move_extents
>>>>        ocfs2_ioctl_move_extents
>>>>          ocfs2_move_extents
>>>>            ocfs2_defrag_extent
>>>>              ocfs2_lock_allocators_move_extents
>>>>
>>>>                ocfs2_reserve_clusters
>>>>                  inode_lock GLOBAL_BITMAP_SYSTEM_INODE
>>>>
>>>> 	  __ocfs2_flush_truncate_log
>>>>                  inode_lock GLOBAL_BITMAP_SYSTEM_INODE
>>>>
>>>> As backtrace shows above, ocfs2_reserve_clusters() will call inode_lock
>>>> against the global bitmap if local allocator has not sufficient cluters.
>>>> Once global bitmap could meet the demand, ocfs2_reserve_cluster will
>>>> return success with global bitmap locked.
>>>>
>>>> After ocfs2_reserve_cluster(), if truncate log is full,
>>>> __ocfs2_flush_truncate_log() will definitely fall into deadlock because it
>>>> needs to inode_lock global bitmap, which has already been locked.
>>>>
>>>> To fix this bug, we could remove from ocfs2_lock_allocators_move_extents()
>>>> the code which intends to lock global allocator, and put the removed code
>>>> after __ocfs2_flush_truncate_log().
>>>>
>>>> ocfs2_lock_allocators_move_extents() is referred by 2 places, one is here,
>>>> the other does not need the data allocator context, which means this patch
>>>> does not affect the caller so far.
>>>>
>>>> Change log:
>>>> 1. Correct the function comment.
>>>> 2. Remove unused argument from ocfs2_lock_meta_allocator_move_extents.
>>>>
>>>> Signed-off-by: Larry Chen <lchen@suse.com>
>>>> ---
>>>>     fs/ocfs2/move_extents.c | 47 ++++++++++++++++++++++++++---------------------
>>>>     1 file changed, 26 insertions(+), 21 deletions(-)
>>>>
>>
>>> IMO, here clusters_to_move is only for data_ac, since we change this
>>> function to only handle meta_ac, I'm afraid clusters_to_move related
>>> logic has to been changed correspondingly.
>>
>> I think we can't remove *clusters_to_move* from here as clusters can be reserved latter outsides this function, but we
>> still have to reserve metadata(extents) in advance.
>> So we need that argument.
>>
> 
> Yeah, I think clusters_to_move should be reserved, in order to keep the
> original logic as it was.
> 
> But I'm curious about why max_recs_needed should be equal to
> 2 * extents_to_split + cluster_to_move?
> Does that mean that each cluster might form an extent??

It's because in the worst case that the file is of great fragment and one extent record can only represent one cluster.
And we might encounter extents split during merging and rotation so we also need _2 * extents_to_split_.

Thanks,
Changwei

> 
> Thanks,
> Larry
> 
> 
>> Thanks,
>> Changwei
>>
>>>
>>> Thanks,
>>> Joseph
>>>>     					u32 extents_to_split,
>>>>     					struct ocfs2_alloc_context **meta_ac,
>>>> -					struct ocfs2_alloc_context **data_ac,
>>>>     					int extra_blocks,
>>>>     					int *credits)
>>>>     {
>>>> @@ -192,13 +188,6 @@ static int ocfs2_lock_allocators_move_extents(struct inode *inode,
>>>>     		goto out;
>>>>     	}
>>>>     
>>>> -	if (data_ac) {
>>>> -		ret = ocfs2_reserve_clusters(osb, clusters_to_move, data_ac);
>>>> -		if (ret) {
>>>> -			mlog_errno(ret);
>>>> -			goto out;
>>>> -		}
>>>> -	}
>>>>     
>>>>     	*credits += ocfs2_calc_extend_credits(osb->sb, et->et_root_el);
>>>>     
>>>> @@ -257,10 +246,10 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
>>>>     		}
>>>>     	}
>>>>     
>>>> -	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, *len, 1,
>>>> -						 &context->meta_ac,
>>>> -						 &context->data_ac,
>>>> -						 extra_blocks, &credits);
>>>> +	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
>>>> +						*len, 1,
>>>> +						&context->meta_ac,
>>>> +						extra_blocks, &credits);
>>>>     	if (ret) {
>>>>     		mlog_errno(ret);
>>>>     		goto out;
>>>> @@ -283,6 +272,21 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
>>>>     		}
>>>>     	}
>>>>     
>>>> +	/*
>>>> +	 * Make sure ocfs2_reserve_cluster is called after
>>>> +	 * __ocfs2_flush_truncate_log, otherwise, dead lock may happen.
>>>> +	 *
>>>> +	 * If ocfs2_reserve_cluster is called
>>>> +	 * before __ocfs2_flush_truncate_log, dead lock on global bitmap
>>>> +	 * may happen.
>>>> +	 *
>>>> +	 */
>>>> +	ret = ocfs2_reserve_clusters(osb, *len, &context->data_ac);
>>>> +	if (ret) {
>>>> +		mlog_errno(ret);
>>>> +		goto out_unlock_mutex;
>>>> +	}
>>>> +
>>>>     	handle = ocfs2_start_trans(osb, credits);
>>>>     	if (IS_ERR(handle)) {
>>>>     		ret = PTR_ERR(handle);
>>>> @@ -600,9 +604,10 @@ static int ocfs2_move_extent(struct ocfs2_move_extents_context *context,
>>>>     		}
>>>>     	}
>>>>     
>>>> -	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, len, 1,
>>>> -						 &context->meta_ac,
>>>> -						 NULL, extra_blocks, &credits);
>>>> +	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
>>>> +						len, 1,
>>>> +						&context->meta_ac,
>>>> +						extra_blocks, &credits);
>>>>     	if (ret) {
>>>>     		mlog_errno(ret);
>>>>     		goto out;
>>>>
>>>
>>> _______________________________________________
>>> Ocfs2-devel mailing list
>>> Ocfs2-devel@oss.oracle.com
>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>>
>>
> 
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH V3] ocfs2: fix dead lock caused by ocfs2_defrag_extent
  2018-11-01  7:14 [PATCH V3] ocfs2: fix dead lock caused by ocfs2_defrag_extent Larry Chen
  2018-11-01  8:58 ` [Ocfs2-devel] " Joseph Qi
@ 2018-11-02  0:53 ` Changwei Ge
  1 sibling, 0 replies; 9+ messages in thread
From: Changwei Ge @ 2018-11-02  0:53 UTC (permalink / raw)
  To: Larry Chen, mark, jlbec; +Cc: linux-kernel, ocfs2-devel, akpm

Hi Larry,

The change looks good to me.

On 2018/11/1 15:14, Larry Chen wrote:
> ocfs2_defrag_extent may fall into deadlock.
> 
> ocfs2_ioctl_move_extents
>      ocfs2_ioctl_move_extents
>        ocfs2_move_extents
>          ocfs2_defrag_extent
>            ocfs2_lock_allocators_move_extents
> 
>              ocfs2_reserve_clusters
>                inode_lock GLOBAL_BITMAP_SYSTEM_INODE
> 
> 	  __ocfs2_flush_truncate_log
>                inode_lock GLOBAL_BITMAP_SYSTEM_INODE
> 
> As backtrace shows above, ocfs2_reserve_clusters() will call inode_lock
> against the global bitmap if local allocator has not sufficient cluters.
> Once global bitmap could meet the demand, ocfs2_reserve_cluster will
> return success with global bitmap locked.
> 
> After ocfs2_reserve_cluster(), if truncate log is full,
> __ocfs2_flush_truncate_log() will definitely fall into deadlock because it
> needs to inode_lock global bitmap, which has already been locked.
> 
> To fix this bug, we could remove from ocfs2_lock_allocators_move_extents()
> the code which intends to lock global allocator, and put the removed code
> after __ocfs2_flush_truncate_log().
> 
> ocfs2_lock_allocators_move_extents() is referred by 2 places, one is here,
> the other does not need the data allocator context, which means this patch
> does not affect the caller so far.
> 
> Change log:
> 1. Correct the function comment.
> 2. Remove unused argument from ocfs2_lock_meta_allocator_move_extents.
> 
> Signed-off-by: Larry Chen <lchen@suse.com>

Reviewed-by: Changwei Ge <ge.changwei@h3c.com>

> ---
>   fs/ocfs2/move_extents.c | 47 ++++++++++++++++++++++++++---------------------
>   1 file changed, 26 insertions(+), 21 deletions(-)
> 
> diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
> index 7eb3b0a6347e..f55f82ca3425 100644
> --- a/fs/ocfs2/move_extents.c
> +++ b/fs/ocfs2/move_extents.c
> @@ -156,18 +156,14 @@ static int __ocfs2_move_extent(handle_t *handle,
>   }
>   
>   /*
> - * lock allocators, and reserving appropriate number of bits for
> - * meta blocks and data clusters.
> - *
> - * in some cases, we don't need to reserve clusters, just let data_ac
> - * be NULL.
> + * lock allocator, and reserve appropriate number of bits for
> + * meta blocks.
>    */
> -static int ocfs2_lock_allocators_move_extents(struct inode *inode,
> +static int ocfs2_lock_meta_allocator_move_extents(struct inode *inode,
>   					struct ocfs2_extent_tree *et,
>   					u32 clusters_to_move,
>   					u32 extents_to_split,
>   					struct ocfs2_alloc_context **meta_ac,
> -					struct ocfs2_alloc_context **data_ac,
>   					int extra_blocks,
>   					int *credits)
>   {
> @@ -192,13 +188,6 @@ static int ocfs2_lock_allocators_move_extents(struct inode *inode,
>   		goto out;
>   	}
>   
> -	if (data_ac) {
> -		ret = ocfs2_reserve_clusters(osb, clusters_to_move, data_ac);
> -		if (ret) {
> -			mlog_errno(ret);
> -			goto out;
> -		}
> -	}
>   
>   	*credits += ocfs2_calc_extend_credits(osb->sb, et->et_root_el);
>   
> @@ -257,10 +246,10 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
>   		}
>   	}
>   
> -	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, *len, 1,
> -						 &context->meta_ac,
> -						 &context->data_ac,
> -						 extra_blocks, &credits);
> +	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
> +						*len, 1,
> +						&context->meta_ac,
> +						extra_blocks, &credits);
>   	if (ret) {
>   		mlog_errno(ret);
>   		goto out;
> @@ -283,6 +272,21 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
>   		}
>   	}
>   
> +	/*
> +	 * Make sure ocfs2_reserve_cluster is called after
> +	 * __ocfs2_flush_truncate_log, otherwise, dead lock may happen.
> +	 *
> +	 * If ocfs2_reserve_cluster is called
> +	 * before __ocfs2_flush_truncate_log, dead lock on global bitmap
> +	 * may happen.
> +	 *
> +	 */
> +	ret = ocfs2_reserve_clusters(osb, *len, &context->data_ac);
> +	if (ret) {
> +		mlog_errno(ret);
> +		goto out_unlock_mutex;
> +	}
> +
>   	handle = ocfs2_start_trans(osb, credits);
>   	if (IS_ERR(handle)) {
>   		ret = PTR_ERR(handle);
> @@ -600,9 +604,10 @@ static int ocfs2_move_extent(struct ocfs2_move_extents_context *context,
>   		}
>   	}
>   
> -	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, len, 1,
> -						 &context->meta_ac,
> -						 NULL, extra_blocks, &credits);
> +	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
> +						len, 1,
> +						&context->meta_ac,
> +						extra_blocks, &credits);
>   	if (ret) {
>   		mlog_errno(ret);
>   		goto out;
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Ocfs2-devel] [PATCH V3] ocfs2: fix dead lock caused by ocfs2_defrag_extent
  2018-11-01 12:34       ` Changwei Ge
@ 2018-11-02  1:18         ` Joseph Qi
  0 siblings, 0 replies; 9+ messages in thread
From: Joseph Qi @ 2018-11-02  1:18 UTC (permalink / raw)
  To: Changwei Ge, Larry Chen, mark, jlbec
  Cc: linux-kernel, ocfs2-devel, Andrew Morton



On 18/11/1 20:34, Changwei Ge wrote:
> Hello Joseph,
> 
> On 2018/11/1 20:16, Joseph Qi wrote:
>>
>>
>> On 18/11/1 19:52, Changwei Ge wrote:
>>> Hello Joseph,
>>>
>>> On 2018/11/1 17:01, Joseph Qi wrote:
>>>> Hi Larry,
>>>>
>>>> On 18/11/1 15:14, Larry Chen wrote:
>>>>> ocfs2_defrag_extent may fall into deadlock.
>>>>>
>>>>> ocfs2_ioctl_move_extents
>>>>>       ocfs2_ioctl_move_extents
>>>>>         ocfs2_move_extents
>>>>>           ocfs2_defrag_extent
>>>>>             ocfs2_lock_allocators_move_extents
>>>>>
>>>>>               ocfs2_reserve_clusters
>>>>>                 inode_lock GLOBAL_BITMAP_SYSTEM_INODE
>>>>>
>>>>> 	  __ocfs2_flush_truncate_log
>>>>>                 inode_lock GLOBAL_BITMAP_SYSTEM_INODE
>>>>>
>>>>> As backtrace shows above, ocfs2_reserve_clusters() will call inode_lock
>>>>> against the global bitmap if local allocator has not sufficient cluters.
>>>>> Once global bitmap could meet the demand, ocfs2_reserve_cluster will
>>>>> return success with global bitmap locked.
>>>>>
>>>>> After ocfs2_reserve_cluster(), if truncate log is full,
>>>>> __ocfs2_flush_truncate_log() will definitely fall into deadlock because it
>>>>> needs to inode_lock global bitmap, which has already been locked.
>>>>>
>>>>> To fix this bug, we could remove from ocfs2_lock_allocators_move_extents()
>>>>> the code which intends to lock global allocator, and put the removed code
>>>>> after __ocfs2_flush_truncate_log().
>>>>>
>>>>> ocfs2_lock_allocators_move_extents() is referred by 2 places, one is here,
>>>>> the other does not need the data allocator context, which means this patch
>>>>> does not affect the caller so far.
>>>>>
>>>>> Change log:
>>>>> 1. Correct the function comment.
>>>>> 2. Remove unused argument from ocfs2_lock_meta_allocator_move_extents.
>>>>>
>>>>> Signed-off-by: Larry Chen <lchen@suse.com>
>>>>> ---
>>>>>    fs/ocfs2/move_extents.c | 47 ++++++++++++++++++++++++++---------------------
>>>>>    1 file changed, 26 insertions(+), 21 deletions(-)
>>>>>
>>>
>>>> IMO, here clusters_to_move is only for data_ac, since we change this
>>>> function to only handle meta_ac, I'm afraid clusters_to_move related
>>>> logic has to been changed correspondingly.
>>>
>>> I think we can't remove *clusters_to_move* from here as clusters can be reserved latter outsides this function, but we
>>> still have to reserve metadata(extents) in advance.
>>> So we need that argument.
>>>
>> I was not saying just remove it.
>> IIUC, clusters_to_move is for reserving data clusters (for meta_ac, we
> 
> Um...
> *cluster_to_move* is not only used for reserving data clusters.
> It is also an effecting input for calculating if existed extents still have enough free records for later
> tree operation like attaching clusters to extents.
> 
> Please refer to below code:
>   175         unsigned int max_recs_needed = 2 * extents_to_split + clusters_to_move;
>

IC. It is a bit odd that calculate it here but do the real reserve out.

> 
> 
>> mostly talk about blocks). Since we have moved data cluster reserve
>> logic out of ocfs2_lock_allocators_move_extents() now, then left
>> clusters_to_move related logic here is odd.
> 
> Like my preceding elaboration, it is used for telling if we need more extents.
> Anyway, I think we must keep *cluster_to_move* here as it used to. :-)
> 
> Thanks,
> Changwei
> 
> 
> 
> 
>>
>>>>>    					u32 extents_to_split,
>>>>>    					struct ocfs2_alloc_context **meta_ac,
>>>>> -					struct ocfs2_alloc_context **data_ac,
>>>>>    					int extra_blocks,
>>>>>    					int *credits)
>>>>>    {
>>>>> @@ -192,13 +188,6 @@ static int ocfs2_lock_allocators_move_extents(struct inode *inode,
>>>>>    		goto out;
>>>>>    	}
>>>>>    
>>>>> -	if (data_ac) {
>>>>> -		ret = ocfs2_reserve_clusters(osb, clusters_to_move, data_ac);
>>>>> -		if (ret) {
>>>>> -			mlog_errno(ret);
>>>>> -			goto out;
>>>>> -		}
>>>>> -	}
>>>>>    
>>>>>    	*credits += ocfs2_calc_extend_credits(osb->sb, et->et_root_el);
>>>>>    
>>>>> @@ -257,10 +246,10 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
>>>>>    		}
>>>>>    	}
>>>>>    
>>>>> -	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, *len, 1,
>>>>> -						 &context->meta_ac,
>>>>> -						 &context->data_ac,
>>>>> -						 extra_blocks, &credits);
>>>>> +	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
>>>>> +						*len, 1,
>>>>> +						&context->meta_ac,
>>>>> +						extra_blocks, &credits);
>>>>>    	if (ret) {
>>>>>    		mlog_errno(ret);
>>>>>    		goto out;
>>>>> @@ -283,6 +272,21 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
>>>>>    		}
>>>>>    	}
>>>>>    
>>>>> +	/*
>>>>> +	 * Make sure ocfs2_reserve_cluster is called after
>>>>> +	 * __ocfs2_flush_truncate_log, otherwise, dead lock may happen.
>>>>> +	 *
>>>>> +	 * If ocfs2_reserve_cluster is called
>>>>> +	 * before __ocfs2_flush_truncate_log, dead lock on global bitmap
>>>>> +	 * may happen.
>>>>> +	 *
>>>>> +	 */
>>>>> +	ret = ocfs2_reserve_clusters(osb, *len, &context->data_ac);
>>>>> +	if (ret) {
>>>>> +		mlog_errno(ret);
>>>>> +		goto out_unlock_mutex;
>>>>> +	}
>>>>> +
>>>>>    	handle = ocfs2_start_trans(osb, credits);
>>>>>    	if (IS_ERR(handle)) {
>>>>>    		ret = PTR_ERR(handle);
>>>>> @@ -600,9 +604,10 @@ static int ocfs2_move_extent(struct ocfs2_move_extents_context *context,
>>>>>    		}
>>>>>    	}
>>>>>    
>>>>> -	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, len, 1,
>>>>> -						 &context->meta_ac,
>>>>> -						 NULL, extra_blocks, &credits);
>>>>> +	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
>>>>> +						len, 1,
>>>>> +						&context->meta_ac,
>>>>> +						extra_blocks, &credits);
>>>>>    	if (ret) {
>>>>>    		mlog_errno(ret);
>>>>>    		goto out;
>>>>>
>>>>
>>>> _______________________________________________
>>>> Ocfs2-devel mailing list
>>>> Ocfs2-devel@oss.oracle.com
>>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>>>
>>

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, back to index

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-01  7:14 [PATCH V3] ocfs2: fix dead lock caused by ocfs2_defrag_extent Larry Chen
2018-11-01  8:58 ` [Ocfs2-devel] " Joseph Qi
2018-11-01 11:52   ` Changwei Ge
2018-11-01 12:15     ` Joseph Qi
2018-11-01 12:34       ` Changwei Ge
2018-11-02  1:18         ` Joseph Qi
2018-11-01 12:39     ` Larry Chen
2018-11-01 12:48       ` Changwei Ge
2018-11-02  0:53 ` Changwei Ge

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org linux-kernel@archiver.kernel.org
	public-inbox-index lkml


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox