From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: linux-kernel@vger.kernel.org,
	Markus Trippelsdorf <markus@trippelsdorf.de>,
	Bruno Prémont <bonbons@linux-vserver.org>,
	xfs-masters@oss.sgi.com, xfs@oss.sgi.com,
	Alex Elder <aelder@sgi.com>, Dave Chinner <dchinner@redhat.com>
Subject: Re: 2.6.39-rc3, 2.6.39-rc4: XFS lockup - regression since 2.6.38
Date: Fri, 6 May 2011 11:49:06 +1000
Message-ID: <20110506014906.GF26837@dastard>
In-Reply-To: <20110505123959.GA21098@infradead.org>

On Thu, May 05, 2011 at 08:39:59AM -0400, Christoph Hellwig wrote:
> > The third problem is that updating the push target is not safe on 32
> > bit machines. We cannot copy a 64 bit LSN without the possibility of
> > corrupting the result when racing with another updating thread. We
> > have a function to do this update safely without needing to care about
> > 32/64 bit issues - xfs_trans_ail_copy_lsn() - so use that when
> > updating the AIL push target.
> 
> But reading xa_target without xa_lock isn't safe on 32-bit either, is it?

Not sure - I think it depends on the platform. I don't think we
protect LSN reads in any specific way on 32 bit platforms.

In this case, I don't think it matters so much on read, because if
we get a race with a write that mixes upper/lower words of the
target we will eventually hit the stop condition and we won't get a
match. That will trigger the requeue code and we'll start the push
again.

The problem with getting such a race on the target write is that we
could get a cycle/block pair that is beyond the current head of the
log and we'd never be able to push the AIL again as all push
thresholds are truncated to the current head LSN on disk...
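
To make the failure mode concrete, here is a minimal user-space sketch.
It is not the kernel code: the lsn_* helpers, struct ail and
ail_copy_lsn() are hypothetical stand-ins, a pthread mutex stands in for
the AIL spinlock, and the layout (cycle in the upper 32 bits, block in
the lower 32 bits) is assumed for illustration. It shows how a store
torn between the two words can yield a target beyond the head, and how
a copy done under the lock avoids that.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins, not the kernel definitions. */
typedef int64_t lsn_t;

/* Assumed layout: log cycle in the upper word, block in the lower word. */
static inline uint32_t lsn_cycle(lsn_t lsn)
{
	return (uint32_t)((uint64_t)lsn >> 32);
}

static inline uint32_t lsn_block(lsn_t lsn)
{
	return (uint32_t)lsn;
}

static inline lsn_t lsn_make(uint32_t cycle, uint32_t block)
{
	return (lsn_t)(((uint64_t)cycle << 32) | block);
}

/* Compare cycle first, then block. Returns <0, 0 or >0. */
static inline int lsn_cmp(lsn_t a, lsn_t b)
{
	if (lsn_cycle(a) != lsn_cycle(b))
		return lsn_cycle(a) < lsn_cycle(b) ? -1 : 1;
	if (lsn_block(a) != lsn_block(b))
		return lsn_block(a) < lsn_block(b) ? -1 : 1;
	return 0;
}

/*
 * Model of what a store split into two 32-bit writes could leave behind
 * if it is observed half way: upper word new, lower word still old.
 */
static inline lsn_t lsn_torn(lsn_t oldv, lsn_t newv)
{
	return lsn_make(lsn_cycle(newv), lsn_block(oldv));
}

struct ail {
	pthread_mutex_t	lock;	/* stands in for the AIL lock */
	lsn_t		target;	/* push target */
};

/* Copy a 64-bit LSN under the lock so both words always belong together. */
static void ail_copy_lsn(struct ail *ailp, lsn_t *dst, const lsn_t *src)
{
	assert(sizeof(lsn_t) == 8);
	pthread_mutex_lock(&ailp->lock);
	*dst = *src;
	pthread_mutex_unlock(&ailp->lock);
}
```

With an old target of cycle 5/block 900, a new target of cycle 6/block
100 and the head at cycle 6/block 200, the new target is behind the
head, but the torn mix (cycle 6/block 900) compares as beyond it.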

> For the first read it can trivially be moved into the critical
> section a few lines below, and the second one should probably use
> XFS_LSN_CMP.
> 
> > @@ -482,19 +481,24 @@ xfs_ail_worker(
> >  	/* assume we have more work to do in a short while */
> >  	tout = 10;
> >  	if (!count) {
> > +out_done:
> 
> Jumping into conditionals is really ugly.  By initializing count a bit
> earlier you can just jump in front of the if/else clauses.  And while
> you're there maybe moving the tout = 10; into an else clause would
> also make the code more readable.
> an uninitialized use of tout.

Ok, I'll rework that.
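
Roughly the shape being suggested, as a sketch only: worker_timeout()
and its arguments are hypothetical, and the idle value of 0 is a
placeholder rather than what xfs_ail_worker() actually uses. The point
is that count is initialised before any jump, the label sits in front
of the if/else, and tout is assigned on every path.

```c
#include <stdbool.h>

/* Hypothetical helper, not the real worker: returns the requeue delay. */
static int worker_timeout(bool nothing_to_push, int items_seen)
{
	int count = 0;	/* initialised before any jump to out_done */
	int tout;

	if (nothing_to_push)
		goto out_done;

	count = items_seen;

out_done:
	/* The label is in front of the conditional, not inside it. */
	if (!count)
		tout = 0;	/* placeholder: nothing was pushed */
	else
		tout = 10;	/* assume more work in a short while */

	return tout;
}
```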

> > +		if (ailp->xa_target == target ||
> > +		    (test_and_set_bit(XFS_AIL_PUSHING_BIT, &ailp->xa_flags)))
> 
> no need for braces around the test_and_set_bit call.

*nod*. Left over from developing the fix...

I'll split all these and post them to the xfs-list for review...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
