All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] btrfs: fix deadlock when cloning inline extents and low on available space
@ 2021-05-25 10:05 fdmanana
  2021-05-25 16:11 ` David Sterba
  0 siblings, 1 reply; 2+ messages in thread
From: fdmanana @ 2021-05-25 10:05 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

There are a few cases where cloning an inline extent requires copying data
into a page of the destination inode. For these cases we are allocating
the required data and metadata space while holding a leaf locked. This can
result in a deadlock when we are low on available space because allocating
the space may flush delalloc and two deadlock scenarios can happen:

1) When starting writeback for an inode with a very small dirty range that
   fits in an inline extent, we deadlock during the writeback when trying
   to insert the inline extent, at cow_file_range_inline(), if the extent
   is going to be located in the leaf for which we are already holding a
   read lock;

2) After successfully starting writeback, for non-inline extent cases,
   the async reclaim thread will hang waiting for an ordered extent to
   complete if the ordered extent completion needs to modify the leaf
   for which the clone task is holding a read lock (for adding or
   replacing file extent items). So the cloning task will wait forever
   on the async reclaim thread to make progress, which in turn is
   waiting for the ordered extent completion which in turn is waiting
   to acquire a write lock on the same leaf.

So fix this by making sure we release the path (and therefore the leaf)
everytime we need to copy the inline extent's data into a page of the
destination inode, as by that time we do not need to have the leaf locked.

Fixes: 05a5a7621ce66c ("Btrfs: implement full reflink support for inline extents")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/reflink.c | 38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/fs/btrfs/reflink.c b/fs/btrfs/reflink.c
index 06682128d8fa..58ddc7ed9e84 100644
--- a/fs/btrfs/reflink.c
+++ b/fs/btrfs/reflink.c
@@ -207,10 +207,7 @@ static int clone_copy_inline_extent(struct inode *dst,
 			 * inline extent's data to the page.
 			 */
 			ASSERT(key.offset > 0);
-			ret = copy_inline_to_page(BTRFS_I(dst), new_key->offset,
-						  inline_data, size, datal,
-						  comp_type);
-			goto out;
+			goto copy_to_page;
 		}
 	} else if (i_size_read(dst) <= datal) {
 		struct btrfs_file_extent_item *ei;
@@ -226,13 +223,10 @@ static int clone_copy_inline_extent(struct inode *dst,
 		    BTRFS_FILE_EXTENT_INLINE)
 			goto copy_inline_extent;
 
-		ret = copy_inline_to_page(BTRFS_I(dst), new_key->offset,
-					  inline_data, size, datal, comp_type);
-		goto out;
+		goto copy_to_page;
 	}
 
 copy_inline_extent:
-	ret = 0;
 	/*
 	 * We have no extent items, or we have an extent at offset 0 which may
 	 * or may not be inlined. All these cases are dealt the same way.
@@ -244,11 +238,13 @@ static int clone_copy_inline_extent(struct inode *dst,
 		 * clone. Deal with all these cases by copying the inline extent
 		 * data into the respective page at the destination inode.
 		 */
-		ret = copy_inline_to_page(BTRFS_I(dst), new_key->offset,
-					  inline_data, size, datal, comp_type);
-		goto out;
+		goto copy_to_page;
 	}
 
+	/*
+	 * Release path before starting a new transaction so we don't hold locks
+	 * that would confuse lockdep.
+	 */
 	btrfs_release_path(path);
 	/*
 	 * If we end up here it means were copy the inline extent into a leaf
@@ -285,11 +281,6 @@ static int clone_copy_inline_extent(struct inode *dst,
 	ret = btrfs_inode_set_file_extent_range(BTRFS_I(dst), 0, aligned_end);
 out:
 	if (!ret && !trans) {
-		/*
-		 * Release path before starting a new transaction so we don't
-		 * hold locks that would confuse lockdep.
-		 */
-		btrfs_release_path(path);
 		/*
 		 * No transaction here means we copied the inline extent into a
 		 * page of the destination inode.
@@ -310,6 +301,21 @@ static int clone_copy_inline_extent(struct inode *dst,
 		*trans_out = trans;
 
 	return ret;
+
+copy_to_page:
+	/*
+	 * Release our path because we don't need it anymore and also because
+	 * copy_inline_to_page() needs to reserve data and metadata, which may
+	 * need to flush delalloc when we are low on available space and
+	 * therefore cause a deadlock if writeback of an inline extent needs to
+	 * write to the same leaf or an ordered extent completion needs to write
+	 * to the same leaf.
+	 */
+	btrfs_release_path(path);
+
+	ret = copy_inline_to_page(BTRFS_I(dst), new_key->offset,
+				  inline_data, size, datal, comp_type);
+	goto out;
 }
 
 /**
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] btrfs: fix deadlock when cloning inline extents and low on available space
  2021-05-25 10:05 [PATCH] btrfs: fix deadlock when cloning inline extents and low on available space fdmanana
@ 2021-05-25 16:11 ` David Sterba
  0 siblings, 0 replies; 2+ messages in thread
From: David Sterba @ 2021-05-25 16:11 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs

On Tue, May 25, 2021 at 11:05:28AM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> There are a few cases where cloning an inline extent requires copying data
> into a page of the destination inode. For these cases we are allocating
> the required data and metadata space while holding a leaf locked. This can
> result in a deadlock when we are low on available space because allocating
> the space may flush delalloc and two deadlock scenarios can happen:
> 
> 1) When starting writeback for an inode with a very small dirty range that
>    fits in an inline extent, we deadlock during the writeback when trying
>    to insert the inline extent, at cow_file_range_inline(), if the extent
>    is going to be located in the leaf for which we are already holding a
>    read lock;
> 
> 2) After successfully starting writeback, for non-inline extent cases,
>    the async reclaim thread will hang waiting for an ordered extent to
>    complete if the ordered extent completion needs to modify the leaf
>    for which the clone task is holding a read lock (for adding or
>    replacing file extent items). So the cloning task will wait forever
>    on the async reclaim thread to make progress, which in turn is
>    waiting for the ordered extent completion which in turn is waiting
>    to acquire a write lock on the same leaf.
> 
> So fix this by making sure we release the path (and therefore the leaf)
> everytime we need to copy the inline extent's data into a page of the
> destination inode, as by that time we do not need to have the leaf locked.
> 
> Fixes: 05a5a7621ce66c ("Btrfs: implement full reflink support for inline extents")
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Added to misc-next, thanks.

> ---
>  fs/btrfs/reflink.c | 38 ++++++++++++++++++++++----------------
>  1 file changed, 22 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/btrfs/reflink.c b/fs/btrfs/reflink.c
> index 06682128d8fa..58ddc7ed9e84 100644
> --- a/fs/btrfs/reflink.c
> +++ b/fs/btrfs/reflink.c
> @@ -207,10 +207,7 @@ static int clone_copy_inline_extent(struct inode *dst,
>  			 * inline extent's data to the page.
>  			 */
>  			ASSERT(key.offset > 0);
> -			ret = copy_inline_to_page(BTRFS_I(dst), new_key->offset,
> -						  inline_data, size, datal,
> -						  comp_type);
> -			goto out;
> +			goto copy_to_page;
>  		}
>  	} else if (i_size_read(dst) <= datal) {
>  		struct btrfs_file_extent_item *ei;
> @@ -226,13 +223,10 @@ static int clone_copy_inline_extent(struct inode *dst,
>  		    BTRFS_FILE_EXTENT_INLINE)
>  			goto copy_inline_extent;
>  
> -		ret = copy_inline_to_page(BTRFS_I(dst), new_key->offset,
> -					  inline_data, size, datal, comp_type);
> -		goto out;
> +		goto copy_to_page;
>  	}
>  
>  copy_inline_extent:
> -	ret = 0;
>  	/*
>  	 * We have no extent items, or we have an extent at offset 0 which may
>  	 * or may not be inlined. All these cases are dealt the same way.
> @@ -244,11 +238,13 @@ static int clone_copy_inline_extent(struct inode *dst,
>  		 * clone. Deal with all these cases by copying the inline extent
>  		 * data into the respective page at the destination inode.
>  		 */
> -		ret = copy_inline_to_page(BTRFS_I(dst), new_key->offset,
> -					  inline_data, size, datal, comp_type);
> -		goto out;
> +		goto copy_to_page;
>  	}
>  
> +	/*
> +	 * Release path before starting a new transaction so we don't hold locks
> +	 * that would confuse lockdep.
> +	 */
>  	btrfs_release_path(path);
>  	/*
>  	 * If we end up here it means were copy the inline extent into a leaf
> @@ -285,11 +281,6 @@ static int clone_copy_inline_extent(struct inode *dst,
>  	ret = btrfs_inode_set_file_extent_range(BTRFS_I(dst), 0, aligned_end);
>  out:
>  	if (!ret && !trans) {
> -		/*
> -		 * Release path before starting a new transaction so we don't
> -		 * hold locks that would confuse lockdep.
> -		 */
> -		btrfs_release_path(path);
>  		/*
>  		 * No transaction here means we copied the inline extent into a
>  		 * page of the destination inode.
> @@ -310,6 +301,21 @@ static int clone_copy_inline_extent(struct inode *dst,
>  		*trans_out = trans;
>  
>  	return ret;
> +
> +copy_to_page:
> +	/*
> +	 * Release our path because we don't need it anymore and also because
> +	 * copy_inline_to_page() needs to reserve data and metadata, which may
> +	 * need to flush delalloc when we are low on available space and
> +	 * therefore cause a deadlock if writeback of an inline extent needs to
> +	 * write to the same leaf or an ordered extent completion needs to write
> +	 * to the same leaf.
> +	 */
> +	btrfs_release_path(path);
> +
> +	ret = copy_inline_to_page(BTRFS_I(dst), new_key->offset,
> +				  inline_data, size, datal, comp_type);
> +	goto out;
>  }

I don't like to see the pattern with the chained labels but in this case
I don't see a better way, so ok.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-05-25 16:14 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-25 10:05 [PATCH] btrfs: fix deadlock when cloning inline extents and low on available space fdmanana
2021-05-25 16:11 ` David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.