linux-kernel.vger.kernel.org archive mirror
From: Oleg Drokin <green@linuxhacker.ru>
To: "J . Bruce Fields" <bfields@fieldses.org>
Cc: Jeff Layton <jlayton@poochiereds.net>,
	linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] nfsd: Always lock state exclusively.
Date: Tue, 14 Jun 2016 22:19:49 -0400
Message-ID: <4B503C8E-0195-4016-96F2-C848F667218D@linuxhacker.ru>
In-Reply-To: <20160614184655.GI25973@fieldses.org>


On Jun 14, 2016, at 2:46 PM, J . Bruce Fields wrote:

> On Tue, Jun 14, 2016 at 11:56:20AM -0400, Oleg Drokin wrote:
>> 
>> On Jun 14, 2016, at 11:46 AM, J . Bruce Fields wrote:
>> 
>>> On Sun, Jun 12, 2016 at 09:26:27PM -0400, Oleg Drokin wrote:
>>>> It used to be the case that state had an rwlock that was locked for write
>>>> by downgrades, but for read for upgrades (opens). Well, the problem is
>>>> if there are two competing opens for the same state, they step on
>>>> each other toes potentially leading to leaking file descriptors
>>>> from the state structure, since access mode is a bitmap only set once.
>>>> 
>>>> Extend the holding region around in nfsd4_process_open2() to avoid
>>>> racing entry into nfs4_get_vfs_file().
>>>> Make init_open_stateid() return with locked stateid to be unlocked
>>>> by the caller.
>>>> 
>>>> Now this version held up pretty well in my testing for 24 hours.
>>>> It still does not address the situation if during one of the racing
>>>> nfs4_get_vfs_file() calls we are getting an error from one (first?)
>>>> of them. This is to be addressed in a separate patch after having a
>>>> solid reproducer (potentially using some fault injection).
>>>> 
>>>> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
>>>> ---
>>>> fs/nfsd/nfs4state.c | 47 +++++++++++++++++++++++++++--------------------
>>>> fs/nfsd/state.h     |  2 +-
>>>> 2 files changed, 28 insertions(+), 21 deletions(-)
>>>> 
>>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>>> index f5f82e1..fa5fb5a 100644
>>>> --- a/fs/nfsd/nfs4state.c
>>>> +++ b/fs/nfsd/nfs4state.c
>>>> @@ -3487,6 +3487,10 @@ init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,
>>>> 	struct nfs4_openowner *oo = open->op_openowner;
>>>> 	struct nfs4_ol_stateid *retstp = NULL;
>>>> 
>>>> +	/* We are moving these outside of the spinlocks to avoid the warnings */
>>>> +	mutex_init(&stp->st_mutex);
>>>> +	mutex_lock(&stp->st_mutex);
>>> 
>>> A mutex_init_locked() primitive might also be convenient here.
>> 
>> I know! I would be able to do it under spinlock then without moving this around too.
>> 
>> But alas, not only is there no such primitive, the mutex documentation states this is disallowed.
> 
> You're just talking about this comment?:
> 
> 	 * It is not allowed to initialize an already locked mutex.
> 
> That's a weird comment.  You're probably right that what they meant was
> something like "It is not allowed to initialize a mutex to locked
> state".  But, I don't know, taken literally that comment doesn't make
> sense (how could you even distinguish between an already-locked mutex
> and an uninitialized mutex?), so maybe it'd be worth asking.

I think this is because of the strict ownership tracking or something.
I guess I can ask.

>>> You could also take the two previous lines from the caller into this
>>> function instead of passing in stp, that might simplify the code.
>>> (Haven't checked.)
>> 
>> I am not really sure what you mean here.
>> These lines are moved from further away in this function (well, just the init, anyway).
>> 
>> Having half initialisation of stp here and half in the caller sounds kind of strange
>> to me.
> 
> I was thinking of something like the following--so init_open_stateid
> hides more of the details of the swapping.  Untested.  Does it look like
> an improvement to you?
> 
> There's got to be a way to make this code a little less convoluted....
> 
> --b.
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index fa5fb5aa4847..41b59854c40f 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -3480,13 +3480,15 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
> }
> 
> static struct nfs4_ol_stateid *
> -init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,
> -		struct nfsd4_open *open)
> +init_open_stateid(struct nfs4_file *fp, struct nfsd4_open *open)
> {
> 
> 	struct nfs4_openowner *oo = open->op_openowner;
> 	struct nfs4_ol_stateid *retstp = NULL;
> +	struct nfs4_ol_stateid *stp;
> 
> +	stp = open->op_stp;
> +	open->op_stp = NULL;
> 	/* We are moving these outside of the spinlocks to avoid the warnings */
> 	mutex_init(&stp->st_mutex);
> 	mutex_lock(&stp->st_mutex);
> @@ -3512,9 +3514,12 @@ init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,
> out_unlock:
> 	spin_unlock(&fp->fi_lock);
> 	spin_unlock(&oo->oo_owner.so_client->cl_lock);
> -	if (retstp)
> -		mutex_lock(&retstp->st_mutex);
> -	return retstp;
> +	if (retstp) {
> +		nfs4_put_stid(&stp->st_stid);

So as I am trying to integrate this into my patchset,
do we really need this nfs4_put_stid() here?
We would not if we took the other path and left the unused
stateid hanging off struct nfsd4_open (why do we need to
set op_stp to NULL before the search?). I imagine we'd then
save some free/realloc churn as well.

I assume struct nfsd4_open cannot be shared between threads?
Otherwise we have bigger problems at hand, such as one thread calling
mutex_init() on a mutex that another thread already holds.

I'll try this theory, I guess.
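
To make the alternative concrete, here is a rough userspace sketch using
pthreads rather than kernel code. All names in it (struct stateid,
struct open_req, find_or_init_locked() and so on) are made up for
illustration and are not the nfsd structures. It assumes the request
structure is private to one thread and that the caller always
preallocates a candidate. It ignores reference counting, which the real
code needs to keep a found stateid alive once the table lock is dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these are the nfsd types. */
struct stateid {
	pthread_mutex_t mutex;     /* plays the role of st_mutex */
	int owner;                 /* lookup key */
	struct stateid *next;
};

struct open_req {
	int owner;
	struct stateid *prealloc;  /* plays the role of open->op_stp */
};

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct stateid *table;

struct stateid *stateid_alloc(void)
{
	return calloc(1, sizeof(struct stateid));
}

/*
 * Return a locked stateid: either an existing match or the candidate
 * preallocated in req. An unused candidate stays in req->prealloc so
 * the caller's normal cleanup can free or reuse it.
 */
struct stateid *find_or_init_locked(struct open_req *req)
{
	struct stateid *st, *cand = req->prealloc;

	assert(cand != NULL);  /* assumption: caller always preallocates */

	/* Initialise and lock the candidate before anyone can see it. */
	pthread_mutex_init(&cand->mutex, NULL);
	pthread_mutex_lock(&cand->mutex);

	pthread_mutex_lock(&table_lock);
	for (st = table; st != NULL; st = st->next)
		if (st->owner == req->owner)
			break;
	if (st == NULL) {
		/* No match: publish the candidate, already locked. */
		cand->owner = req->owner;
		cand->next = table;
		table = cand;
		req->prealloc = NULL;  /* ownership moved to the table */
	}
	pthread_mutex_unlock(&table_lock);

	if (st == NULL)
		return cand;

	/* Lost the race: the candidate was never visible to anyone else. */
	pthread_mutex_unlock(&cand->mutex);
	pthread_mutex_destroy(&cand->mutex);
	/* Taken only after table_lock is dropped, since it may block. */
	pthread_mutex_lock(&st->mutex);
	return st;
}
```

In this sketch req->prealloc is cleared only when the candidate is
actually inserted, so the losing path needs no put at all. Whether that
maps cleanly onto op_stp is exactly what I want to test.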


> +		stp = retstp;
> +		mutex_lock(&stp->st_mutex);
> +	}
> +	return stp;
> }
> 
> /*
> @@ -4310,7 +4315,6 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
> 	struct nfs4_client *cl = open->op_openowner->oo_owner.so_client;
> 	struct nfs4_file *fp = NULL;
> 	struct nfs4_ol_stateid *stp = NULL;
> -	struct nfs4_ol_stateid *swapstp = NULL;
> 	struct nfs4_delegation *dp = NULL;
> 	__be32 status;
> 
> @@ -4347,16 +4351,9 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
> 			goto out;
> 		}
> 	} else {
> -		stp = open->op_stp;
> -		open->op_stp = NULL;
> -		/*
> -		 * init_open_stateid() either returns a locked stateid
> -		 * it found, or initializes and locks the new one we passed in
> -		 */
> -		swapstp = init_open_stateid(stp, fp, open);
> -		if (swapstp) {
> -			nfs4_put_stid(&stp->st_stid);
> -			stp = swapstp;
> +		/* stp is returned locked: */
> +		stp = init_open_stateid(fp, open);
> +		if (stp->st_access_bmap == 0) {
> 			status = nfs4_upgrade_open(rqstp, fp, current_fh,
> 						stp, open);
> 			if (status) {
