linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "J . Bruce Fields" <bfields@fieldses.org>
To: Jeff Layton <jlayton@poochiereds.net>
Cc: Oleg Drokin <green@linuxhacker.ru>,
	linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org,
	Andrew W Elble <aweits@rit.edu>
Subject: Re: [PATCH] nfsd: Close a race between access checking/setting in nfs4_get_vfs_file
Date: Fri, 10 Jun 2016 16:55:45 -0400	[thread overview]
Message-ID: <20160610205545.GA13766@fieldses.org> (raw)
In-Reply-To: <1465555833.1425.15.camel@poochiereds.net>

On Fri, Jun 10, 2016 at 06:50:33AM -0400, Jeff Layton wrote:
> On Fri, 2016-06-10 at 00:18 -0400, Oleg Drokin wrote:
> > On Jun 9, 2016, at 5:01 PM, Oleg Drokin wrote:
> > 
> > > Currently there's an unprotected access mode check in
> > > nfs4_upgrade_open
> > > that then calls nfs4_get_vfs_file which in turn assumes whatever
> > > access mode was present in the state is still valid which is racy.
> > > Two nfs4_get_vfs_file van enter the same path as result and get two
> > > references to nfs4_file, but later drop would only happens once
> > > because
> > > access mode is only denoted by bits, so no refcounting.
> > > 
> > > The locking around access mode testing is introduced to avoid this
> > > race.
> > > 
> > > Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
> > > ---
> > > 
> > > This patch performs equally well to the st_rwsem -> mutex
> > > conversion,
> > > but is a bit ligher-weight I imagine.
> > > For one it seems to allow truncates in parallel if we ever want it.
> > > 
> > > fs/nfsd/nfs4state.c | 28 +++++++++++++++++++++++++---
> > > 1 file changed, 25 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > index f5f82e1..d4b9eba 100644
> > > --- a/fs/nfsd/nfs4state.c
> > > +++ b/fs/nfsd/nfs4state.c
> > > @@ -3958,6 +3958,11 @@ static __be32 nfs4_get_vfs_file(struct
> > > svc_rqst *rqstp, struct nfs4_file *fp,
> > > 
> > > 	spin_lock(&fp->fi_lock);
> > > 
> > > +	if (test_access(open->op_share_access, stp)) {
> > > +		spin_unlock(&fp->fi_lock);
> > > +		return nfserr_eagain;
> > > +	}
> > > +
> > > 	/*
> > > 	 * Are we trying to set a deny mode that would conflict with
> > > 	 * current access?
> > > @@ -4017,11 +4022,21 @@ nfs4_upgrade_open(struct svc_rqst *rqstp,
> > > struct nfs4_file *fp, struct svc_fh *c
> > > 	__be32 status;
> > > 	unsigned char old_deny_bmap = stp->st_deny_bmap;
> > > 
> > > -	if (!test_access(open->op_share_access, stp))
> > > -		return nfs4_get_vfs_file(rqstp, fp, cur_fh, stp,
> > > open);
> > > +again:
> > > +	spin_lock(&fp->fi_lock);
> > > +	if (!test_access(open->op_share_access, stp)) {
> > > +		spin_unlock(&fp->fi_lock);
> > > +		status = nfs4_get_vfs_file(rqstp, fp, cur_fh, stp,
> > > open);
> > > +		/*
> > > +		 * Somebody won the race for access while we did
> > > not hold
> > > +		 * the lock here
> > > +		 */
> > > +		if (status == nfserr_eagain)
> > > +			goto again;
> > > +		return status;
> > > +	}
> > > 
> > > 	/* test and set deny mode */
> > > -	spin_lock(&fp->fi_lock);
> > > 	status = nfs4_file_check_deny(fp, open->op_share_deny);
> > > 	if (status == nfs_ok) {
> > > 		set_deny(open->op_share_deny, stp);
> > > @@ -4361,6 +4376,13 @@ nfsd4_process_open2(struct svc_rqst *rqstp,
> > > struct svc_fh *current_fh, struct nf
> > > 		status = nfs4_get_vfs_file(rqstp, fp, current_fh, stp,
> > > open);
> > > 		if (status) {
> > > 			up_read(&stp->st_rwsem);
> > > +			/*
> > > +			 * EAGAIN is returned when there's a
> > > racing access,
> > > +			 * this should never happen as we are the
> > > only user
> > > +			 * of this new state, and since it's not
> > > yet hashed,
> > > +			 * nobody can find it
> > > +			 */
> > > +			WARN_ON(status == nfserr_eagain);
> > 
> > Ok, some more testing shows that this CAN happen.
> > So this patch is inferior to the mutex one after all.
> > 
> 
> Yeah, that can happen for all sorts of reasons. As Andrew pointed out,
> you can get this when there is a lease break in progress, and that may
> be occurring for a completely different stateid (or because of samba,
> etc...)
> 
> It may be possible to do something like this, but we'd need to audit
> all of the handling of st_access_bmap (and the deny bmap) to ensure
> that we get it right.
> 
> For now, I think just turning that rwsem into a mutex is the best
> solution. That is a per-stateid mutex so any contention is going to be
> due to the client sending racing OPEN calls for the same inode anyway.
> Allowing those to run in parallel again could be useful in some cases,
> but most use-cases won't be harmed by that serialization.

OK, so for now my plan is to take "nfsd: Always lock state exclusively"
for 4.7.  Thanks to both of you for your work on this....

--b.

  reply	other threads:[~2016-06-10 20:55 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-07 15:37 Files leak from nfsd in 4.7.1-rc1 (and more?) Oleg Drokin
2016-06-07 17:10 ` Jeff Layton
2016-06-07 17:30   ` Oleg Drokin
2016-06-07 20:04     ` Jeff Layton
2016-06-07 23:39       ` Oleg Drokin
2016-06-08  0:03         ` Jeff Layton
2016-06-08  0:46           ` Oleg Drokin
2016-06-08  2:22           ` Oleg Drokin
2016-06-08  3:55             ` Oleg Drokin
2016-06-08 10:58             ` Jeff Layton
2016-06-08 14:44               ` Oleg Drokin
2016-06-08 16:10               ` Oleg Drokin
2016-06-08 17:22                 ` Jeff Layton
2016-06-08 17:37                   ` Oleg Drokin
2016-06-09  2:55                   ` [PATCH] nfsd: Always lock state exclusively Oleg Drokin
2016-06-09 10:13                     ` Jeff Layton
2016-06-09 21:01                   ` [PATCH] nfsd: Close a race between access checking/setting in nfs4_get_vfs_file Oleg Drokin
2016-06-10  4:18                     ` Oleg Drokin
2016-06-10 10:50                       ` Jeff Layton
2016-06-10 20:55                         ` J . Bruce Fields [this message]
2016-06-11 15:41                           ` Oleg Drokin
2016-06-12  1:33                             ` Jeff Layton
2016-06-12  2:06                               ` Oleg Drokin
2016-06-12  2:50                                 ` Jeff Layton
2016-06-12  3:15                                   ` Oleg Drokin
2016-06-12 13:13                                     ` Jeff Layton
2016-06-13  1:26                                     ` [PATCH v2] nfsd: Always lock state exclusively Oleg Drokin
2016-06-14 15:38                                       ` J . Bruce Fields
2016-06-14 15:53                                         ` Oleg Drokin
2016-06-14 18:50                                           ` J . Bruce Fields
2016-06-14 22:52                                             ` Jeff Layton
2016-06-14 22:54                                               ` Oleg Drokin
2016-06-14 22:57                                                 ` Jeff Layton
2016-06-15  3:28                                                   ` [PATCH 0/3] nfsd state handling fixes Oleg Drokin
2016-06-15  3:28                                                     ` [PATCH 1/3] nfsd: Always lock state exclusively Oleg Drokin
2016-06-15  3:28                                                     ` [PATCH 2/3] nfsd: Extend the mutex holding region around in nfsd4_process_open2() Oleg Drokin
2016-06-15  3:28                                                     ` [PATCH 3/3] nfsd: Make init_open_stateid() a bit more whole Oleg Drokin
2016-06-16  1:54                                                     ` [PATCH 0/3] nfsd state handling fixes Oleg Drokin
2016-06-16  2:07                                                       ` J . Bruce Fields
2016-06-14 15:46                                       ` [PATCH v2] nfsd: Always lock state exclusively J . Bruce Fields
2016-06-14 15:56                                         ` Oleg Drokin
2016-06-14 18:46                                           ` J . Bruce Fields
2016-06-15  2:19                                             ` Oleg Drokin
2016-06-15 13:31                                               ` J . Bruce Fields
2016-06-09 12:13               ` Files leak from nfsd in 4.7.1-rc1 (and more?) Andrew W Elble

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160610205545.GA13766@fieldses.org \
    --to=bfields@fieldses.org \
    --cc=aweits@rit.edu \
    --cc=green@linuxhacker.ru \
    --cc=jlayton@poochiereds.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).