linux-kernel.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: Waiman Long <Waiman.Long@hp.com>,
	linux-kernel@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: [PATCH] xfs: call xfs_idestroy_fork() in xfs_ilock() critical section
Date: Fri, 24 Apr 2015 08:08:23 +1000
Message-ID: <20150423220823.GJ15810@dastard>
In-Reply-To: <20150423122149.GA13131@bfoster.bfoster>

On Thu, Apr 23, 2015 at 08:21:50AM -0400, Brian Foster wrote:
> On Thu, Apr 23, 2015 at 09:17:58AM +1000, Dave Chinner wrote:
> > @@ -410,11 +418,12 @@ xfs_attr_inactive(xfs_inode_t *dp)
> >  	 */
> >  	trans = xfs_trans_alloc(mp, XFS_TRANS_ATTRINVAL);
> >  	error = xfs_trans_reserve(trans, &M_RES(mp)->tr_attrinval, 0, 0);
> > -	if (error) {
> > -		xfs_trans_cancel(trans, 0);
> > -		return error;
> > -	}
> > -	xfs_ilock(dp, XFS_ILOCK_EXCL);
> > +	if (error)
> > +		goto out_cancel;
> > +
> 
> The error path expects a locked inode, but it isn't here.

Right, xfs/181 tripped that. I've fixed it in my current version ;)
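
The thread does not show the corrected version, so the following is only a
minimal userspace sketch of one common way to keep such an error path safe:
give the pre-lock failure its own label so it never reaches the unlock. All
names here (demo_ctx, demo_op, demo_unlock, out_cancel_unlocked) are
hypothetical and are not XFS functions.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping; none of these names are XFS APIs. */
struct demo_ctx {
	bool locked;      /* is the demo lock currently held? */
	int cancels;      /* how many times the operation was cancelled */
	int bad_unlocks;  /* unlocks attempted while not holding the lock */
};

/* Drop the demo lock, recording any attempt to unlock without holding it. */
static void demo_unlock(struct demo_ctx *c)
{
	if (!c->locked)
		c->bad_unlocks++;
	c->locked = false;
}

/* fail_at: 0 = succeed, 1 = fail before locking, 2 = fail after locking. */
static int demo_op(struct demo_ctx *c, int fail_at)
{
	int error = -1;

	/* Failure before the lock is taken must skip the unlock. */
	if (fail_at == 1)
		goto out_cancel_unlocked;

	c->locked = true;
	if (fail_at == 2)
		goto out_cancel;

	demo_unlock(c);
	return 0;

out_cancel:
	/* Reached only with the lock held. */
	c->cancels++;
	demo_unlock(c);
	return error;

out_cancel_unlocked:
	/* Reached only before the lock was taken. */
	c->cancels++;
	return error;
}
```

Calling demo_op() with fail_at set to 1 exercises the path Brian pointed out:
the operation is cancelled, but no unlock is attempted on a lock that was
never taken.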

> 
> > +	lock_mode = XFS_ILOCK_EXCL;
> > +	cancel_flags = XFS_TRANS_RELEASE_LOG_RES | XFS_TRANS_ABORT;
> > +	xfs_ilock(dp, lock_mode);
> >  
> >  	/*
> >  	 * No need to make quota reservations here. We expect to release some
> > @@ -423,28 +432,36 @@ xfs_attr_inactive(xfs_inode_t *dp)
> >  	xfs_trans_ijoin(trans, dp, 0);
> >  
> >  	/*
> > -	 * Decide on what work routines to call based on the inode size.
> > +	 * It's unlikely we've raced with an attribute fork creation, but check
> > +	 * anyway just in case.
> >  	 */
> > -	if (!xfs_inode_hasattr(dp) ||
> > -	    dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) {
> > -		error = 0;
> > -		goto out;
> > +	if (!XFS_IFORK_Q(dp))
> > +		goto out_cancel;
> 
> What about attribute fork creation would cause di_forkoff == 0 if that
> wasn't the case above? Do you mean to say a potential race with
> attribute fork destruction?

Attribute fork creation will never leave di_forkoff == 0. See
xfs_attr_shortform_bytesfit() as a guideline for the min/max fork
offset at attribute fork creation time.

The race I'm talking about is the fact we check for an attr fork,
then drop the lock, do the trans reserve and then grab the lock
again. The inode could have changed in that time, so we need to
check again. It's extremely unlikely that the inode has changed due
to the fact it is in the ->evict path and can't be referenced by the
VFS again until it's in a reclaimable state. Hence it is only
internal filesystem stuff that could modify it, which I don't think
can happen. So, leave the check, mark the race as unlikely to occur.
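
The check / unlock / reserve / relock / recheck sequence described above can
be sketched in userspace C roughly as follows. This is an illustration of the
general pattern only, assuming a pthread mutex in place of the inode lock;
demo_inode, demo_reserve and demo_inactive are hypothetical names, not kernel
code.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode; not an XFS structure. */
struct demo_inode {
	pthread_mutex_t lock;
	bool has_attr_fork;
};

/*
 * Stand-in for a blocking step that runs with the lock dropped. When
 * racing_removal is true it simulates another actor removing the fork
 * during the unlocked window.
 */
static int demo_reserve(struct demo_inode *ip, bool racing_removal)
{
	if (racing_removal) {
		pthread_mutex_lock(&ip->lock);
		ip->has_attr_fork = false;
		pthread_mutex_unlock(&ip->lock);
	}
	return 0;
}

/* Returns 1 if the fork was torn down here, 0 if there was nothing to do. */
static int demo_inactive(struct demo_inode *ip, bool racing_removal)
{
	int did_work = 0;

	/* First check, under the lock. */
	pthread_mutex_lock(&ip->lock);
	if (!ip->has_attr_fork) {
		pthread_mutex_unlock(&ip->lock);
		return 0;
	}
	pthread_mutex_unlock(&ip->lock);

	/* The lock is not held here, so the state may change. */
	demo_reserve(ip, racing_removal);

	/* Retake the lock and check again before acting on the state. */
	pthread_mutex_lock(&ip->lock);
	if (ip->has_attr_fork) {
		ip->has_attr_fork = false;
		did_work = 1;
	}
	pthread_mutex_unlock(&ip->lock);
	return did_work;
}
```

With racing_removal false the second check passes and the work is done; with
it true the second check sees the changed state and the function backs out,
which is the cheap insurance the recheck buys even when the race is
considered unlikely.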

> > +	/* invalidate and truncate the attribute fork extents */
> > +	if (dp->i_d.di_aformat != XFS_DINODE_FMT_LOCAL) {
> > +		error = xfs_attr3_root_inactive(&trans, dp);
> > +		if (error)
> > +			goto out_cancel;
> > +
> > +		error = xfs_itruncate_extents(&trans, dp, XFS_ATTR_FORK, 0);
> > +		if (error)
> > +			goto out_cancel;
> >  	}
> > -	error = xfs_attr3_root_inactive(&trans, dp);
> > -	if (error)
> > -		goto out;
> >  
> > -	error = xfs_itruncate_extents(&trans, dp, XFS_ATTR_FORK, 0);
> > -	if (error)
> > -		goto out;
> > +	/* Reset the attribute fork - this also destroys the in-core fork */
> > +	xfs_attr_fork_reset(dp, trans);
> >  
> >  	error = xfs_trans_commit(trans, XFS_TRANS_RELEASE_LOG_RES);
> > -	xfs_iunlock(dp, XFS_ILOCK_EXCL);
> > -
> > +	xfs_iunlock(dp, lock_mode);
> >  	return error;
> >  
> > -out:
> > -	xfs_trans_cancel(trans, XFS_TRANS_RELEASE_LOG_RES|XFS_TRANS_ABORT);
> > -	xfs_iunlock(dp, XFS_ILOCK_EXCL);
> > +out_cancel:
> > +	xfs_trans_cancel(trans, cancel_flags);
> > +out_destroy_fork:
> > +	/* kill the in-core attr fork before we drop the inode lock */
> > +	if (dp->i_afp)
> > +		xfs_idestroy_fork(dp, XFS_ATTR_FORK);
> > +	xfs_iunlock(dp, lock_mode);
> 
> I wonder if a warning or some kind of notification is appropriate here.
> If we get to this point, we're removing an inode potentially without
> having freed attr fork blocks and thus leaving them permanently
> unreferenced, yes?

We end up leaving the inode on the unlinked list because we abort
the inactivation on error. The in-core inode still gets reclaimed
properly, but it's now up to log recovery to re-run inactivation to
try to free the inode or xfs_repair to clean it up.  Either way, it's
safe just to leave the inode where it is on the unlinked list - it's
free and not getting in the way, so IMO warnings at this point don't
serve any useful purpose...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

Thread overview: 9+ messages
2015-04-22 17:33 [PATCH] xfs: call xfs_idestroy_fork() in xfs_ilock() critical section Waiman Long
2015-04-22 19:11 ` Brian Foster
2015-04-22 20:28   ` Waiman Long
2015-04-22 23:17 ` Dave Chinner
2015-04-23 12:21   ` Brian Foster
2015-04-23 22:08     ` Dave Chinner [this message]
2015-04-24 11:57       ` Brian Foster
2015-04-26 22:56         ` Dave Chinner
2015-04-23 17:14   ` Waiman Long
