linux-btrfs.vger.kernel.org archive mirror
From: liubo <liubo2009@cn.fujitsu.com>
To: Sage Weil <sage@newdream.net>
Cc: Linux Btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH 1/6] Btrfs: fix deadlock in btrfs_commit_transaction
Date: Tue, 26 Oct 2010 14:46:08 +0800
Message-ID: <4CC67930.4030406@cn.fujitsu.com>
In-Reply-To: <1288033662-21464-2-git-send-email-sage@newdream.net>

On 10/26/2010 03:07 AM, Sage Weil wrote:
> We calculate timeout (either 1 or MAX_SCHEDULE_TIMEOUT) based on whether
> num_writers > 1 or should_grow at the top of the loop.  Then, much much
> later, we wait for that timeout if either num_writers or should_grow is
> true.  However, it's possible for a racing process (calling
> btrfs_end_transaction()) to decrement num_writers such that we wait
> forever instead of for 1.
> 

IMO, there still exists a deadlock with your patch.
===
with your patch:

thread 1:                          thread 2:

btrfs_commit_transaction()
  if (num_writers > 1)
    timeout = MAX_TIMEOUT;
                        --------->
                                   __btrfs_end_transaction()
                                     num_writers--;
                                     if (wq)
                                       wake_up();
                        <---------
  smp_mb();
  prepare_wait();
  if (num_writers > 1)
    schedule_timeout(MAX);
  else if (should_grow)
    schedule_timeout(1);
===

Thread 2 also needs a memory barrier. Without one, on some CPUs the
"if (wq)" check may be executed before num_writers--, like this:
===
thread 1:                          thread 2:

btrfs_commit_transaction()
  if (num_writers > 1)
    timeout = MAX_TIMEOUT;
                        --------->
                                   __btrfs_end_transaction()
                                     if (wq)
                                       wake_up();
                        <---------
  smp_mb();
  prepare_wait();
  if (num_writers > 1)
    schedule_timeout(MAX);
  else if (should_grow)
    schedule_timeout(1);
                        ---------->
                                     num_writers--;
===
Then thread 1 may wait forever.

Since wake_up() itself provides an implied wmb and a wq active check,
it is better to drop "if (wq)" in __btrfs_end_transaction().
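
To make the ordering argument concrete, here is a small userspace toy
model in C (not kernel code). It replays every interleaving of the
committer's two steps (queue on the wait queue, then check num_writers)
with the writer's two steps, and counts the interleavings in which the
committer sleeps without a timeout and no wake-up ever arrives.
Assumptions: struct model, count_stuck() and the writer_checks_first
flag are made up for illustration; every step is treated as atomic, the
CPU reordering is modelled simply by swapping the writer's two steps,
and only a single pass of the commit loop is covered.

```c
#include <stdbool.h>

/* Toy state; the names are illustrative, not the kernel's. */
struct model {
    int  num_writers;    /* committer plus one other writer */
    bool waiter_queued;  /* committer has done prepare_to_wait() */
    bool woken;          /* a wake-up reached the queued committer */
};

/*
 * Count the interleavings in which the committer goes to sleep with
 * no timeout and is never woken.
 *
 * writer_checks_first == true models the reordered writer:
 *     if (wq) wake_up();   then   num_writers--;
 * writer_checks_first == false models the ordered writer:
 *     num_writers--;       then   wake-up check
 */
static int count_stuck(bool writer_checks_first)
{
    int stuck = 0;

    /* c1 and c2 are the slots (out of 4) used by the committer's steps. */
    for (int c1 = 0; c1 < 4; c1++) {
        for (int c2 = c1 + 1; c2 < 4; c2++) {
            struct model m = { 2, false, false };
            bool sleeps = false;
            int wstep = 0;  /* how many writer steps have run so far */

            for (int slot = 0; slot < 4; slot++) {
                if (slot == c1) {
                    /* Committer queues itself on the wait queue. */
                    m.waiter_queued = true;
                } else if (slot == c2) {
                    /* Committer checks the condition and may sleep. */
                    sleeps = m.num_writers > 1;
                } else {
                    bool is_check = writer_checks_first ? (wstep == 0)
                                                        : (wstep == 1);
                    if (is_check) {
                        /* Wake the committer only if it is queued. */
                        if (m.waiter_queued)
                            m.woken = true;
                    } else {
                        m.num_writers--;
                    }
                    wstep++;
                }
            }

            if (sleeps && !m.woken)
                stuck++;
        }
    }
    return stuck;
}
```

Under these assumptions, count_stuck(true) finds exactly one bad
interleaving, the one drawn in the second diagram, and
count_stuck(false) finds none.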


thanks,
liubo

> Fix this by deciding how long to wait when we wait.
> 
> Signed-off-by: Sage Weil <sage@newdream.net>
> ---
>  fs/btrfs/transaction.c |   12 ++++--------
>  1 files changed, 4 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index 66e4c66..bf399ea 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -992,7 +992,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
>  			     struct btrfs_root *root)
>  {
>  	unsigned long joined = 0;
> -	unsigned long timeout = 1;
>  	struct btrfs_transaction *cur_trans;
>  	struct btrfs_transaction *prev_trans = NULL;
>  	DEFINE_WAIT(wait);
> @@ -1063,11 +1062,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
>  			snap_pending = 1;
>  
>  		WARN_ON(cur_trans != trans->transaction);
> -		if (cur_trans->num_writers > 1)
> -			timeout = MAX_SCHEDULE_TIMEOUT;
> -		else if (should_grow)
> -			timeout = 1;
> -
>  		mutex_unlock(&root->fs_info->trans_mutex);
>  
>  		if (flush_on_commit || snap_pending) {
> @@ -1089,8 +1083,10 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
>  				TASK_UNINTERRUPTIBLE);
>  
>  		smp_mb();
> -		if (cur_trans->num_writers > 1 || should_grow)
> -			schedule_timeout(timeout);
> +		if (cur_trans->num_writers > 1)
> +			schedule_timeout(MAX_SCHEDULE_TIMEOUT);
> +		else if (should_grow)
> +			schedule_timeout(1);
>  
>  		mutex_lock(&root->fs_info->trans_mutex);
>  		finish_wait(&cur_trans->writer_wait, &wait);




