All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Fix ext4 deadlock when running xfstests/269
@ 2013-12-10 10:13 Jan Kara
  2013-12-10 10:13 ` [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed Jan Kara
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Jan Kara @ 2013-12-10 10:13 UTC (permalink / raw)
  To: Ted Tso; +Cc: Akira Fujita, linux-ext4


  Hello,

  these three patches fix a deadlock Akira-san has been observing for
some time when running xfstest's test 269 for long enough. The first two
patches are preparatory and the third patch fixes the real problem -
calling ext4_should_retry_alloc() from a wrong locking context.

								Honza

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed
  2013-12-10 10:13 [PATCH 0/3] Fix ext4 deadlock when running xfstests/269 Jan Kara
@ 2013-12-10 10:13 ` Jan Kara
  2013-12-12  6:44   ` Zheng Liu
  2013-12-18  5:45   ` Theodore Ts'o
  2013-12-10 10:13 ` [PATCH 2/3] ext4: Standardize error handling in ext4_da_write_inline_data_begin() Jan Kara
  2013-12-10 10:14 ` [PATCH 3/3] ext4: Fix deadlock when writing in ENOSPC conditions Jan Kara
  2 siblings, 2 replies; 12+ messages in thread
From: Jan Kara @ 2013-12-10 10:13 UTC (permalink / raw)
  To: Ted Tso; +Cc: Akira Fujita, linux-ext4, Jan Kara

Similarly as other ->write_begin functions in ext4, also
ext4_da_write_inline_data_begin() should retry allocation if the
conversion failed because of ENOSPC. This avoids returning ENOSPC
prematurely because of uncommitted block deletions.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/inline.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index bae987549dc3..ed6e71fe5e9d 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -849,11 +849,13 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
 	handle_t *handle;
 	struct page *page;
 	struct ext4_iloc iloc;
+	int retries;
 
 	ret = ext4_get_inode_loc(inode, &iloc);
 	if (ret)
 		return ret;
 
+retry_journal:
 	handle = ext4_journal_start(inode, EXT4_HT_INODE, 1);
 	if (IS_ERR(handle)) {
 		ret = PTR_ERR(handle);
@@ -875,6 +877,11 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
 							    inode,
 							    flags,
 							    fsdata);
+		ext4_journal_stop(handle);
+		handle = NULL;
+		if (ret == -ENOSPC &&
+		    ext4_should_retry_alloc(inode->i_sb, &retries))
+			goto retry_journal;
 		goto out;
 	}
 
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/3] ext4: Standardize error handling in ext4_da_write_inline_data_begin()
  2013-12-10 10:13 [PATCH 0/3] Fix ext4 deadlock when running xfstests/269 Jan Kara
  2013-12-10 10:13 ` [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed Jan Kara
@ 2013-12-10 10:13 ` Jan Kara
  2013-12-18  5:45   ` Theodore Ts'o
  2013-12-10 10:14 ` [PATCH 3/3] ext4: Fix deadlock when writing in ENOSPC conditions Jan Kara
  2 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2013-12-10 10:13 UTC (permalink / raw)
  To: Ted Tso; +Cc: Akira Fujita, linux-ext4, Jan Kara

The function has a bit non-standard (for ext4) error recovery in that it
used a mix of 'out' labels and testing for 'handle' being NULL. There
isn't a good reason for that in the function so clean it up a bit.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/inline.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index ed6e71fe5e9d..c417e52d194e 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -859,7 +859,6 @@ retry_journal:
 	handle = ext4_journal_start(inode, EXT4_HT_INODE, 1);
 	if (IS_ERR(handle)) {
 		ret = PTR_ERR(handle);
-		handle = NULL;
 		goto out;
 	}
 
@@ -869,7 +868,7 @@ retry_journal:
 	if (inline_size >= pos + len) {
 		ret = ext4_prepare_inline_data(handle, inode, pos + len);
 		if (ret && ret != -ENOSPC)
-			goto out;
+			goto out_journal;
 	}
 
 	if (ret == -ENOSPC) {
@@ -878,7 +877,6 @@ retry_journal:
 							    flags,
 							    fsdata);
 		ext4_journal_stop(handle);
-		handle = NULL;
 		if (ret == -ENOSPC &&
 		    ext4_should_retry_alloc(inode->i_sb, &retries))
 			goto retry_journal;
@@ -894,7 +892,7 @@ retry_journal:
 	page = grab_cache_page_write_begin(mapping, 0, flags);
 	if (!page) {
 		ret = -ENOMEM;
-		goto out;
+		goto out_journal;
 	}
 
 	down_read(&EXT4_I(inode)->xattr_sem);
@@ -911,16 +909,15 @@ retry_journal:
 
 	up_read(&EXT4_I(inode)->xattr_sem);
 	*pagep = page;
-	handle = NULL;
 	brelse(iloc.bh);
 	return 1;
 out_release_page:
 	up_read(&EXT4_I(inode)->xattr_sem);
 	unlock_page(page);
 	page_cache_release(page);
+out_journal:
+	ext4_journal_stop(handle);
 out:
-	if (handle)
-		ext4_journal_stop(handle);
 	brelse(iloc.bh);
 	return ret;
 }
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/3] ext4: Fix deadlock when writing in ENOSPC conditions
  2013-12-10 10:13 [PATCH 0/3] Fix ext4 deadlock when running xfstests/269 Jan Kara
  2013-12-10 10:13 ` [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed Jan Kara
  2013-12-10 10:13 ` [PATCH 2/3] ext4: Standardize error handling in ext4_da_write_inline_data_begin() Jan Kara
@ 2013-12-10 10:14 ` Jan Kara
  2013-12-12  6:51   ` Zheng Liu
  2013-12-18  5:45   ` Theodore Ts'o
  2 siblings, 2 replies; 12+ messages in thread
From: Jan Kara @ 2013-12-10 10:14 UTC (permalink / raw)
  To: Ted Tso; +Cc: Akira Fujita, linux-ext4, Jan Kara

Akira-san has been reporting rare deadlocks of his machine when running
xfstests test 269 on ext4 filesystem. The problem turned out to be in
ext4_da_reserve_metadata() and ext4_da_reserve_space() which called
ext4_should_retry_alloc() while holding i_data_sem. Since
ext4_should_retry_alloc() can force a transaction commit, this is a
lock ordering violation and leads to deadlocks.

Fix the problem by just removing the retry loops. These functions should
just report ENOSPC to the caller (e.g. ext4_da_write_begin()) and that
function must take care of retrying after dropping all necessary locks.

Reported-and-tested-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/inode.c | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 075763474118..61d49ff22c81 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1206,7 +1206,6 @@ static int ext4_journalled_write_end(struct file *file,
  */
 static int ext4_da_reserve_metadata(struct inode *inode, ext4_lblk_t lblock)
 {
-	int retries = 0;
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	struct ext4_inode_info *ei = EXT4_I(inode);
 	unsigned int md_needed;
@@ -1218,7 +1217,6 @@ static int ext4_da_reserve_metadata(struct inode *inode, ext4_lblk_t lblock)
 	 * in order to allocate nrblocks
 	 * worse case is one extent per block
 	 */
-repeat:
 	spin_lock(&ei->i_block_reservation_lock);
 	/*
 	 * ext4_calc_metadata_amount() has side effects, which we have
@@ -1238,10 +1236,6 @@ repeat:
 		ei->i_da_metadata_calc_len = save_len;
 		ei->i_da_metadata_calc_last_lblock = save_last_lblock;
 		spin_unlock(&ei->i_block_reservation_lock);
-		if (ext4_should_retry_alloc(inode->i_sb, &retries)) {
-			cond_resched();
-			goto repeat;
-		}
 		return -ENOSPC;
 	}
 	ei->i_reserved_meta_blocks += md_needed;
@@ -1255,7 +1249,6 @@ repeat:
  */
 static int ext4_da_reserve_space(struct inode *inode, ext4_lblk_t lblock)
 {
-	int retries = 0;
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	struct ext4_inode_info *ei = EXT4_I(inode);
 	unsigned int md_needed;
@@ -1277,7 +1270,6 @@ static int ext4_da_reserve_space(struct inode *inode, ext4_lblk_t lblock)
 	 * in order to allocate nrblocks
 	 * worse case is one extent per block
 	 */
-repeat:
 	spin_lock(&ei->i_block_reservation_lock);
 	/*
 	 * ext4_calc_metadata_amount() has side effects, which we have
@@ -1297,10 +1289,6 @@ repeat:
 		ei->i_da_metadata_calc_len = save_len;
 		ei->i_da_metadata_calc_last_lblock = save_last_lblock;
 		spin_unlock(&ei->i_block_reservation_lock);
-		if (ext4_should_retry_alloc(inode->i_sb, &retries)) {
-			cond_resched();
-			goto repeat;
-		}
 		dquot_release_reservation_block(inode, EXT4_C2B(sbi, 1));
 		return -ENOSPC;
 	}
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed
  2013-12-10 10:13 ` [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed Jan Kara
@ 2013-12-12  6:44   ` Zheng Liu
  2013-12-12  9:39     ` Jan Kara
  2013-12-18  5:45   ` Theodore Ts'o
  1 sibling, 1 reply; 12+ messages in thread
From: Zheng Liu @ 2013-12-12  6:44 UTC (permalink / raw)
  To: Jan Kara; +Cc: Ted Tso, Akira Fujita, linux-ext4

Hi Jan,

On Tue, Dec 10, 2013 at 11:13:58AM +0100, Jan Kara wrote:
> Similarly as other ->write_begin functions in ext4, also
> ext4_da_write_inline_data_begin() should retry allocation if the
> conversion failed because of ENOSPC. This avoids returning ENOSPC
> prematurely because of uncommitted block deletions.

I don't see why we need to try again here.  If there is no enough space
in inline data, ext4_da_convert_inline_data_to_extent() just read the
inline data into a page and clear EXT4_STATE_MAY_INLINE_DATA flag.  If
we try again, we will encounter an ENOSPC error again.  So why do we
need to retry allocation?

Thanks,
                                                - Zheng

> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/ext4/inline.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
> index bae987549dc3..ed6e71fe5e9d 100644
> --- a/fs/ext4/inline.c
> +++ b/fs/ext4/inline.c
> @@ -849,11 +849,13 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
>  	handle_t *handle;
>  	struct page *page;
>  	struct ext4_iloc iloc;
> +	int retries;
>  
>  	ret = ext4_get_inode_loc(inode, &iloc);
>  	if (ret)
>  		return ret;
>  
> +retry_journal:
>  	handle = ext4_journal_start(inode, EXT4_HT_INODE, 1);
>  	if (IS_ERR(handle)) {
>  		ret = PTR_ERR(handle);
> @@ -875,6 +877,11 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
>  							    inode,
>  							    flags,
>  							    fsdata);
> +		ext4_journal_stop(handle);
> +		handle = NULL;
> +		if (ret == -ENOSPC &&
> +		    ext4_should_retry_alloc(inode->i_sb, &retries))
> +			goto retry_journal;
>  		goto out;
>  	}
>  
> -- 
> 1.8.1.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] ext4: Fix deadlock when writing in ENOSPC conditions
  2013-12-10 10:14 ` [PATCH 3/3] ext4: Fix deadlock when writing in ENOSPC conditions Jan Kara
@ 2013-12-12  6:51   ` Zheng Liu
  2013-12-12  9:41     ` Jan Kara
  2013-12-18  5:45   ` Theodore Ts'o
  1 sibling, 1 reply; 12+ messages in thread
From: Zheng Liu @ 2013-12-12  6:51 UTC (permalink / raw)
  To: Jan Kara; +Cc: Ted Tso, Akira Fujita, linux-ext4

On Tue, Dec 10, 2013 at 11:14:00AM +0100, Jan Kara wrote:
> Akira-san has been reporting rare deadlocks of his machine when running
> xfstests test 269 on ext4 filesystem. The problem turned out to be in
> ext4_da_reserve_metadata() and ext4_da_reserve_space() which called
> ext4_should_retry_alloc() while holding i_data_sem. Since
> ext4_should_retry_alloc() can force a transaction commit, this is a
> lock ordering violation and leads to deadlocks.
> 
> Fix the problem by just removing the retry loops. These functions should
> just report ENOSPC to the caller (e.g. ext4_da_write_begin()) and that
> function must take care of retrying after dropping all necessary locks.
> 
> Reported-and-tested-by: Akira Fujita <a-fujita@rs.jp.nec.com>
> Signed-off-by: Jan Kara <jack@suse.cz>

Thanks for fixing this.  The patch looks good to me.  You can add:
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>

BTW, I have met a deadlock which is caused by ext4_da_reserve_space()
in our product system.  The calltrace information looks like this.  So
I want to make sure it is the root cause.  But I couldn't reproduce the
problem with running xfstest #269.  Could you please tell me how to
reproduce the deadlock?

FWIW, I think we should backport this patch to stable kernel.

Thanks,
                                                - Zheng

> ---
>  fs/ext4/inode.c | 12 ------------
>  1 file changed, 12 deletions(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 075763474118..61d49ff22c81 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -1206,7 +1206,6 @@ static int ext4_journalled_write_end(struct file *file,
>   */
>  static int ext4_da_reserve_metadata(struct inode *inode, ext4_lblk_t lblock)
>  {
> -	int retries = 0;
>  	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
>  	struct ext4_inode_info *ei = EXT4_I(inode);
>  	unsigned int md_needed;
> @@ -1218,7 +1217,6 @@ static int ext4_da_reserve_metadata(struct inode *inode, ext4_lblk_t lblock)
>  	 * in order to allocate nrblocks
>  	 * worse case is one extent per block
>  	 */
> -repeat:
>  	spin_lock(&ei->i_block_reservation_lock);
>  	/*
>  	 * ext4_calc_metadata_amount() has side effects, which we have
> @@ -1238,10 +1236,6 @@ repeat:
>  		ei->i_da_metadata_calc_len = save_len;
>  		ei->i_da_metadata_calc_last_lblock = save_last_lblock;
>  		spin_unlock(&ei->i_block_reservation_lock);
> -		if (ext4_should_retry_alloc(inode->i_sb, &retries)) {
> -			cond_resched();
> -			goto repeat;
> -		}
>  		return -ENOSPC;
>  	}
>  	ei->i_reserved_meta_blocks += md_needed;
> @@ -1255,7 +1249,6 @@ repeat:
>   */
>  static int ext4_da_reserve_space(struct inode *inode, ext4_lblk_t lblock)
>  {
> -	int retries = 0;
>  	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
>  	struct ext4_inode_info *ei = EXT4_I(inode);
>  	unsigned int md_needed;
> @@ -1277,7 +1270,6 @@ static int ext4_da_reserve_space(struct inode *inode, ext4_lblk_t lblock)
>  	 * in order to allocate nrblocks
>  	 * worse case is one extent per block
>  	 */
> -repeat:
>  	spin_lock(&ei->i_block_reservation_lock);
>  	/*
>  	 * ext4_calc_metadata_amount() has side effects, which we have
> @@ -1297,10 +1289,6 @@ repeat:
>  		ei->i_da_metadata_calc_len = save_len;
>  		ei->i_da_metadata_calc_last_lblock = save_last_lblock;
>  		spin_unlock(&ei->i_block_reservation_lock);
> -		if (ext4_should_retry_alloc(inode->i_sb, &retries)) {
> -			cond_resched();
> -			goto repeat;
> -		}
>  		dquot_release_reservation_block(inode, EXT4_C2B(sbi, 1));
>  		return -ENOSPC;
>  	}
> -- 
> 1.8.1.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed
  2013-12-12  6:44   ` Zheng Liu
@ 2013-12-12  9:39     ` Jan Kara
  2013-12-12 10:30       ` Zheng Liu
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2013-12-12  9:39 UTC (permalink / raw)
  To: Zheng Liu; +Cc: Jan Kara, Ted Tso, Akira Fujita, linux-ext4

On Thu 12-12-13 14:44:22, Zheng Liu wrote:
> Hi Jan,
> 
> On Tue, Dec 10, 2013 at 11:13:58AM +0100, Jan Kara wrote:
> > Similarly as other ->write_begin functions in ext4, also
> > ext4_da_write_inline_data_begin() should retry allocation if the
> > conversion failed because of ENOSPC. This avoids returning ENOSPC
> > prematurely because of uncommitted block deletions.
> 
> I don't see why we need to try again here.  If there is no enough space
> in inline data, ext4_da_convert_inline_data_to_extent() just read the
> inline data into a page and clear EXT4_STATE_MAY_INLINE_DATA flag.  If
> we try again, we will encounter an ENOSPC error again.  So why do we
> need to retry allocation?
  Umm, I don't think so. ext4_da_convert_inline_data_to_extent() calls
ext4_da_get_block_prep() and that can falsely fail with ENOSPC if there are
blocks freed in currently running transaction. So we call
ext4_should_retry_alloc() which forces commit of the current transaction
and then we retry the allocation.

								Honza

> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  fs/ext4/inline.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
> > index bae987549dc3..ed6e71fe5e9d 100644
> > --- a/fs/ext4/inline.c
> > +++ b/fs/ext4/inline.c
> > @@ -849,11 +849,13 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
> >  	handle_t *handle;
> >  	struct page *page;
> >  	struct ext4_iloc iloc;
> > +	int retries;
> >  
> >  	ret = ext4_get_inode_loc(inode, &iloc);
> >  	if (ret)
> >  		return ret;
> >  
> > +retry_journal:
> >  	handle = ext4_journal_start(inode, EXT4_HT_INODE, 1);
> >  	if (IS_ERR(handle)) {
> >  		ret = PTR_ERR(handle);
> > @@ -875,6 +877,11 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
> >  							    inode,
> >  							    flags,
> >  							    fsdata);
> > +		ext4_journal_stop(handle);
> > +		handle = NULL;
> > +		if (ret == -ENOSPC &&
> > +		    ext4_should_retry_alloc(inode->i_sb, &retries))
> > +			goto retry_journal;
> >  		goto out;
> >  	}
> >  
> > -- 
> > 1.8.1.4
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] ext4: Fix deadlock when writing in ENOSPC conditions
  2013-12-12  6:51   ` Zheng Liu
@ 2013-12-12  9:41     ` Jan Kara
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Kara @ 2013-12-12  9:41 UTC (permalink / raw)
  To: Zheng Liu; +Cc: Jan Kara, Ted Tso, Akira Fujita, linux-ext4

On Thu 12-12-13 14:51:16, Zheng Liu wrote:
> On Tue, Dec 10, 2013 at 11:14:00AM +0100, Jan Kara wrote:
> > Akira-san has been reporting rare deadlocks of his machine when running
> > xfstests test 269 on ext4 filesystem. The problem turned out to be in
> > ext4_da_reserve_metadata() and ext4_da_reserve_space() which called
> > ext4_should_retry_alloc() while holding i_data_sem. Since
> > ext4_should_retry_alloc() can force a transaction commit, this is a
> > lock ordering violation and leads to deadlocks.
> > 
> > Fix the problem by just removing the retry loops. These functions should
> > just report ENOSPC to the caller (e.g. ext4_da_write_begin()) and that
> > function must take care of retrying after dropping all necessary locks.
> > 
> > Reported-and-tested-by: Akira Fujita <a-fujita@rs.jp.nec.com>
> > Signed-off-by: Jan Kara <jack@suse.cz>
> 
> Thanks for fixing this.  The patch looks good to me.  You can add:
> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
> 
> BTW, I have met a deadlock which is caused by ext4_da_reserve_space()
> in our product system.  The calltrace information looks like this.  So
> I want to make sure it is the root cause.  But I couldn't reproduce the
> problem with running xfstest #269.  Could you please tell me how to
> reproduce the deadlock?
  I couldn't reproduce it either but Akira was able to reproduce it (but it
took him a long time as well).

> FWIW, I think we should backport this patch to stable kernel.
  Agreed.

								Honza

> > ---
> >  fs/ext4/inode.c | 12 ------------
> >  1 file changed, 12 deletions(-)
> > 
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index 075763474118..61d49ff22c81 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -1206,7 +1206,6 @@ static int ext4_journalled_write_end(struct file *file,
> >   */
> >  static int ext4_da_reserve_metadata(struct inode *inode, ext4_lblk_t lblock)
> >  {
> > -	int retries = 0;
> >  	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> >  	struct ext4_inode_info *ei = EXT4_I(inode);
> >  	unsigned int md_needed;
> > @@ -1218,7 +1217,6 @@ static int ext4_da_reserve_metadata(struct inode *inode, ext4_lblk_t lblock)
> >  	 * in order to allocate nrblocks
> >  	 * worse case is one extent per block
> >  	 */
> > -repeat:
> >  	spin_lock(&ei->i_block_reservation_lock);
> >  	/*
> >  	 * ext4_calc_metadata_amount() has side effects, which we have
> > @@ -1238,10 +1236,6 @@ repeat:
> >  		ei->i_da_metadata_calc_len = save_len;
> >  		ei->i_da_metadata_calc_last_lblock = save_last_lblock;
> >  		spin_unlock(&ei->i_block_reservation_lock);
> > -		if (ext4_should_retry_alloc(inode->i_sb, &retries)) {
> > -			cond_resched();
> > -			goto repeat;
> > -		}
> >  		return -ENOSPC;
> >  	}
> >  	ei->i_reserved_meta_blocks += md_needed;
> > @@ -1255,7 +1249,6 @@ repeat:
> >   */
> >  static int ext4_da_reserve_space(struct inode *inode, ext4_lblk_t lblock)
> >  {
> > -	int retries = 0;
> >  	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> >  	struct ext4_inode_info *ei = EXT4_I(inode);
> >  	unsigned int md_needed;
> > @@ -1277,7 +1270,6 @@ static int ext4_da_reserve_space(struct inode *inode, ext4_lblk_t lblock)
> >  	 * in order to allocate nrblocks
> >  	 * worse case is one extent per block
> >  	 */
> > -repeat:
> >  	spin_lock(&ei->i_block_reservation_lock);
> >  	/*
> >  	 * ext4_calc_metadata_amount() has side effects, which we have
> > @@ -1297,10 +1289,6 @@ repeat:
> >  		ei->i_da_metadata_calc_len = save_len;
> >  		ei->i_da_metadata_calc_last_lblock = save_last_lblock;
> >  		spin_unlock(&ei->i_block_reservation_lock);
> > -		if (ext4_should_retry_alloc(inode->i_sb, &retries)) {
> > -			cond_resched();
> > -			goto repeat;
> > -		}
> >  		dquot_release_reservation_block(inode, EXT4_C2B(sbi, 1));
> >  		return -ENOSPC;
> >  	}
> > -- 
> > 1.8.1.4
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed
  2013-12-12  9:39     ` Jan Kara
@ 2013-12-12 10:30       ` Zheng Liu
  0 siblings, 0 replies; 12+ messages in thread
From: Zheng Liu @ 2013-12-12 10:30 UTC (permalink / raw)
  To: Jan Kara; +Cc: Ted Tso, Akira Fujita, linux-ext4

On Thu, Dec 12, 2013 at 10:39:36AM +0100, Jan Kara wrote:
> On Thu 12-12-13 14:44:22, Zheng Liu wrote:
> > Hi Jan,
> > 
> > On Tue, Dec 10, 2013 at 11:13:58AM +0100, Jan Kara wrote:
> > > Similarly as other ->write_begin functions in ext4, also
> > > ext4_da_write_inline_data_begin() should retry allocation if the
> > > conversion failed because of ENOSPC. This avoids returning ENOSPC
> > > prematurely because of uncommitted block deletions.
> > 
> > I don't see why we need to try again here.  If there is no enough space
> > in inline data, ext4_da_convert_inline_data_to_extent() just read the
> > inline data into a page and clear EXT4_STATE_MAY_INLINE_DATA flag.  If
> > we try again, we will encounter an ENOSPC error again.  So why do we
> > need to retry allocation?
>   Umm, I don't think so. ext4_da_convert_inline_data_to_extent() calls
> ext4_da_get_block_prep() and that can falsely fail with ENOSPC if there are
> blocks freed in currently running transaction. So we call
> ext4_should_retry_alloc() which forces commit of the current transaction
> and then we retry the allocation.

Ah, I see.  Thanks for your explanation.

                                                - Zheng

> 
> 								Honza
> 
> > > 
> > > Signed-off-by: Jan Kara <jack@suse.cz>
> > > ---
> > >  fs/ext4/inline.c | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > > 
> > > diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
> > > index bae987549dc3..ed6e71fe5e9d 100644
> > > --- a/fs/ext4/inline.c
> > > +++ b/fs/ext4/inline.c
> > > @@ -849,11 +849,13 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
> > >  	handle_t *handle;
> > >  	struct page *page;
> > >  	struct ext4_iloc iloc;
> > > +	int retries;
> > >  
> > >  	ret = ext4_get_inode_loc(inode, &iloc);
> > >  	if (ret)
> > >  		return ret;
> > >  
> > > +retry_journal:
> > >  	handle = ext4_journal_start(inode, EXT4_HT_INODE, 1);
> > >  	if (IS_ERR(handle)) {
> > >  		ret = PTR_ERR(handle);
> > > @@ -875,6 +877,11 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
> > >  							    inode,
> > >  							    flags,
> > >  							    fsdata);
> > > +		ext4_journal_stop(handle);
> > > +		handle = NULL;
> > > +		if (ret == -ENOSPC &&
> > > +		    ext4_should_retry_alloc(inode->i_sb, &retries))
> > > +			goto retry_journal;
> > >  		goto out;
> > >  	}
> > >  
> > > -- 
> > > 1.8.1.4
> > > 
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> -- 
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed
  2013-12-10 10:13 ` [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed Jan Kara
  2013-12-12  6:44   ` Zheng Liu
@ 2013-12-18  5:45   ` Theodore Ts'o
  1 sibling, 0 replies; 12+ messages in thread
From: Theodore Ts'o @ 2013-12-18  5:45 UTC (permalink / raw)
  To: Jan Kara; +Cc: Akira Fujita, linux-ext4

On Tue, Dec 10, 2013 at 11:13:58AM +0100, Jan Kara wrote:
> Similarly as other ->write_begin functions in ext4, also
> ext4_da_write_inline_data_begin() should retry allocation if the
> conversion failed because of ENOSPC. This avoids returning ENOSPC
> prematurely because of uncommitted block deletions.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Thanks, applied.

					- Ted

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] ext4: Standardize error handling in ext4_da_write_inline_data_begin()
  2013-12-10 10:13 ` [PATCH 2/3] ext4: Standardize error handling in ext4_da_write_inline_data_begin() Jan Kara
@ 2013-12-18  5:45   ` Theodore Ts'o
  0 siblings, 0 replies; 12+ messages in thread
From: Theodore Ts'o @ 2013-12-18  5:45 UTC (permalink / raw)
  To: Jan Kara; +Cc: Akira Fujita, linux-ext4

On Tue, Dec 10, 2013 at 11:13:59AM +0100, Jan Kara wrote:
> The function has a bit non-standard (for ext4) error recovery in that it
> used a mix of 'out' labels and testing for 'handle' being NULL. There
> isn't a good reason for that in the function so clean it up a bit.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Thanks, applied.

				- Ted

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] ext4: Fix deadlock when writing in ENOSPC conditions
  2013-12-10 10:14 ` [PATCH 3/3] ext4: Fix deadlock when writing in ENOSPC conditions Jan Kara
  2013-12-12  6:51   ` Zheng Liu
@ 2013-12-18  5:45   ` Theodore Ts'o
  1 sibling, 0 replies; 12+ messages in thread
From: Theodore Ts'o @ 2013-12-18  5:45 UTC (permalink / raw)
  To: Jan Kara; +Cc: Akira Fujita, linux-ext4

On Tue, Dec 10, 2013 at 11:14:00AM +0100, Jan Kara wrote:
> Akira-san has been reporting rare deadlocks of his machine when running
> xfstests test 269 on ext4 filesystem. The problem turned out to be in
> ext4_da_reserve_metadata() and ext4_da_reserve_space() which called
> ext4_should_retry_alloc() while holding i_data_sem. Since
> ext4_should_retry_alloc() can force a transaction commit, this is a
> lock ordering violation and leads to deadlocks.
> 
> Fix the problem by just removing the retry loops. These functions should
> just report ENOSPC to the caller (e.g. ext4_da_write_begin()) and that
> function must take care of retrying after dropping all necessary locks.
> 
> Reported-and-tested-by: Akira Fujita <a-fujita@rs.jp.nec.com>
> Signed-off-by: Jan Kara <jack@suse.cz>

Thanks, applied.

					- Ted

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2013-12-18  5:46 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-10 10:13 [PATCH 0/3] Fix ext4 deadlock when running xfstests/269 Jan Kara
2013-12-10 10:13 ` [PATCH 1/3] ext4: Retry allocation when inline->extent conversion failed Jan Kara
2013-12-12  6:44   ` Zheng Liu
2013-12-12  9:39     ` Jan Kara
2013-12-12 10:30       ` Zheng Liu
2013-12-18  5:45   ` Theodore Ts'o
2013-12-10 10:13 ` [PATCH 2/3] ext4: Standardize error handling in ext4_da_write_inline_data_begin() Jan Kara
2013-12-18  5:45   ` Theodore Ts'o
2013-12-10 10:14 ` [PATCH 3/3] ext4: Fix deadlock when writing in ENOSPC conditions Jan Kara
2013-12-12  6:51   ` Zheng Liu
2013-12-12  9:41     ` Jan Kara
2013-12-18  5:45   ` Theodore Ts'o

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.