From: Sage Weil <sage@newdream.net>
To: liubo <liubo2009@cn.fujitsu.com>
Cc: Linux Btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH 1/6] Btrfs: fix deadlock in btrfs_commit_transaction
Date: Tue, 26 Oct 2010 09:36:26 -0700 (PDT)
Message-ID: <Pine.LNX.4.64.1010260931310.7660@cobra.newdream.net>
In-Reply-To: <4CC67930.4030406@cn.fujitsu.com>

On Tue, 26 Oct 2010, liubo wrote:
> On 10/26/2010 03:07 AM, Sage Weil wrote:
> > We calculate timeout (either 1 or MAX_SCHEDULE_TIMEOUT) based on whether
> > num_writers > 1 or should_grow at the top of the loop.  Then, much much
> > later, we wait for that timeout if either num_writers or should_grow is
> > true.  However, it's possible for a racing process (calling
> > btrfs_end_transaction()) to decrement num_writers such that we wait
> > forever instead of for 1.
> > 
> 
> IMO, there still exists a deadlock with your patch.
> ===
> with your patch:
> 
> thread 1:                          thread 2:                       
> 
> btrfs_commit_transaction()
>   if (num_writers > 1)
>     timeout = MAX_TIMEOUT;
(This bit goes away, btw.)
> 	                --------->
>                                    __btrfs_end_transaction()
>                                      num_writers--;
> 				     if (wq)  
>                                        wake_up();
> 			<---------
>   smp_mb();
>   prepare_wait();
>   if (num_writers > 1)
>     schedule_timeout(MAX);
>   else if (should_grow)
>     schedule_timeout(1);
> ===

What's the problem above?  The wake_up() doesn't get called, and thread1 
doesn't sleep.

> thread2 also needs a memory_barrier, for without memory_barrier, 
> on some CPUs, "if (wq)" may be executed before num_writers--, like
> ===
> thread 1:                          thread 2:                       
> 
> btrfs_commit_transaction()
>   if (num_writers > 1)
>     timeout = MAX_TIMEOUT;
(This bit is gone)

> 	                --------->
>                                    __btrfs_end_transaction()
>                                      if (wq)  
>                                        wake_up();
> 			<---------
>   smp_mb();
>   prepare_wait();
>   if (num_writers > 1)
>     schedule_timeout(MAX);
>   else if (should_grow)
>     schedule_timeout(1); 
> 			---------->
> 				      num_writers--;
> ===
> then, thread1 may wait forever.
> 
> Since wake_up() itself provides an implied wmb, and a wq active check, 
> it is better to drop "if (wq)" in __btrfs_end_transaction().

I see.  It could also be

        smp_mb();
        if (wq)
                wake_up();

but just calling wake_up() unconditionally is simpler, and means fewer 
barriers in the wake_up case.  I'm not attached to the if (wq); I just kept 
it because it was there already.
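
For anyone following along outside the kernel, here is a minimal userspace
sketch of the barrier-then-check variant. It assumes pthreads and C11 atomics
as rough stand-ins for the kernel wait queue and smp_mb(); struct txn,
waiter_active, txn_end(), txn_wait_for_writers() and run_demo() are
hypothetical names for illustration, not code from fs/btrfs/transaction.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for cur_trans; not the btrfs structure. */
struct txn {
	atomic_int num_writers;
	atomic_bool waiter_active;	/* rough analog of waitqueue_active() */
	pthread_mutex_t lock;
	pthread_cond_t writer_wait;
};

/* Waker side, loosely modeled on __btrfs_end_transaction(). */
static void txn_end(struct txn *t)
{
	atomic_fetch_sub_explicit(&t->num_writers, 1, memory_order_relaxed);
	/* Analog of smp_mb(): order the decrement before the waiter check. */
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load_explicit(&t->waiter_active, memory_order_relaxed)) {
		pthread_mutex_lock(&t->lock);
		pthread_cond_broadcast(&t->writer_wait);
		pthread_mutex_unlock(&t->lock);
	}
}

/* Waiter side, loosely modeled on the wait loop in btrfs_commit_transaction(). */
static void txn_wait_for_writers(struct txn *t)
{
	pthread_mutex_lock(&t->lock);
	for (;;) {
		/* Announce the waiter first, as prepare_to_wait() would. */
		atomic_store_explicit(&t->waiter_active, true, memory_order_relaxed);
		/* Full barrier, then decide whether to sleep at the point of sleeping. */
		atomic_thread_fence(memory_order_seq_cst);
		if (atomic_load_explicit(&t->num_writers, memory_order_relaxed) <= 1)
			break;
		/* Releases the lock atomically while sleeping; loop handles spurious wakeups. */
		pthread_cond_wait(&t->writer_wait, &t->lock);
	}
	atomic_store_explicit(&t->waiter_active, false, memory_order_relaxed);
	pthread_mutex_unlock(&t->lock);
}

static void *writer_thread(void *arg)
{
	txn_end(arg);
	return NULL;
}

/* Start with the committer plus two other writers; returns the final count. */
static int run_demo(void)
{
	struct txn t;
	pthread_t th[2];
	int i, rc;

	atomic_init(&t.num_writers, 3);
	atomic_init(&t.waiter_active, false);
	pthread_mutex_init(&t.lock, NULL);
	pthread_cond_init(&t.writer_wait, NULL);

	for (i = 0; i < 2; i++) {
		rc = pthread_create(&th[i], NULL, writer_thread, &t);
		assert(rc == 0);
		(void)rc;
	}
	txn_wait_for_writers(&t);
	for (i = 0; i < 2; i++)
		pthread_join(th[i], NULL);

	pthread_cond_destroy(&t.writer_wait);
	pthread_mutex_destroy(&t.lock);
	return atomic_load(&t.num_writers);
}
```

In this sketch, dropping the waiter_active test in txn_end() and always
taking the lock and broadcasting corresponds to the unconditional wake_up()
option: the mutex then supplies the ordering, so the explicit fence on the
waker side is no longer what the waiter relies on.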

Chris?

Thanks!
sage

> 
> 
> thanks,
> liubo
> 
> > Fix this by deciding how long to wait when we wait.
> > 
> > Signed-off-by: Sage Weil <sage@newdream.net>
> > ---
> >  fs/btrfs/transaction.c |   12 ++++--------
> >  1 files changed, 4 insertions(+), 8 deletions(-)
> > 
> > diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> > index 66e4c66..bf399ea 100644
> > --- a/fs/btrfs/transaction.c
> > +++ b/fs/btrfs/transaction.c
> > @@ -992,7 +992,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
> >  			     struct btrfs_root *root)
> >  {
> >  	unsigned long joined = 0;
> > -	unsigned long timeout = 1;
> >  	struct btrfs_transaction *cur_trans;
> >  	struct btrfs_transaction *prev_trans = NULL;
> >  	DEFINE_WAIT(wait);
> > @@ -1063,11 +1062,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
> >  			snap_pending = 1;
> >  
> >  		WARN_ON(cur_trans != trans->transaction);
> > -		if (cur_trans->num_writers > 1)
> > -			timeout = MAX_SCHEDULE_TIMEOUT;
> > -		else if (should_grow)
> > -			timeout = 1;
> > -
> >  		mutex_unlock(&root->fs_info->trans_mutex);
> >  
> >  		if (flush_on_commit || snap_pending) {
> > @@ -1089,8 +1083,10 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
> >  				TASK_UNINTERRUPTIBLE);
> >  
> >  		smp_mb();
> > -		if (cur_trans->num_writers > 1 || should_grow)
> > -			schedule_timeout(timeout);
> > +		if (cur_trans->num_writers > 1)
> > +			schedule_timeout(MAX_SCHEDULE_TIMEOUT);
> > +		else if (should_grow)
> > +			schedule_timeout(1);
> >  
> >  		mutex_lock(&root->fs_info->trans_mutex);
> >  		finish_wait(&cur_trans->writer_wait, &wait);
> 
