From: Oleg Drokin <green@linuxhacker.ru>
To: "J . Bruce Fields" <bfields@fieldses.org>
Cc: Jeff Layton <jlayton@poochiereds.net>,
	linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] nfsd: Always lock state exclusively.
Date: Tue, 14 Jun 2016 22:19:49 -0400
Message-ID: <4B503C8E-0195-4016-96F2-C848F667218D@linuxhacker.ru>
In-Reply-To: <20160614184655.GI25973@fieldses.org>


On Jun 14, 2016, at 2:46 PM, J . Bruce Fields wrote:

> On Tue, Jun 14, 2016 at 11:56:20AM -0400, Oleg Drokin wrote:
>> 
>> On Jun 14, 2016, at 11:46 AM, J . Bruce Fields wrote:
>> 
>>> On Sun, Jun 12, 2016 at 09:26:27PM -0400, Oleg Drokin wrote:
>>>> It used to be the case that state had an rwlock that was locked for write
>>>> by downgrades, but for read for upgrades (opens). Well, the problem is
>>>> if there are two competing opens for the same state, they step on
>>>> each other's toes, potentially leading to leaking file descriptors
>>>> from the state structure, since access mode is a bitmap only set once.
>>>> 
>>>> Extend the holding region around in nfsd4_process_open2() to avoid
>>>> racing entry into nfs4_get_vfs_file().
>>>> Make init_open_stateid() return with locked stateid to be unlocked
>>>> by the caller.
>>>> 
>>>> Now this version held up pretty well in my testing for 24 hours.
>>>> It still does not address the situation if during one of the racing
>>>> nfs4_get_vfs_file() calls we are getting an error from one (first?)
>>>> of them. This is to be addressed in a separate patch after having a
>>>> solid reproducer (potentially using some fault injection).
>>>> 
>>>> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
>>>> ---
>>>> fs/nfsd/nfs4state.c | 47 +++++++++++++++++++++++++++--------------------
>>>> fs/nfsd/state.h     |  2 +-
>>>> 2 files changed, 28 insertions(+), 21 deletions(-)
>>>> 
>>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>>> index f5f82e1..fa5fb5a 100644
>>>> --- a/fs/nfsd/nfs4state.c
>>>> +++ b/fs/nfsd/nfs4state.c
>>>> @@ -3487,6 +3487,10 @@ init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,
>>>> 	struct nfs4_openowner *oo = open->op_openowner;
>>>> 	struct nfs4_ol_stateid *retstp = NULL;
>>>> 
>>>> +	/* We are moving these outside of the spinlocks to avoid the warnings */
>>>> +	mutex_init(&stp->st_mutex);
>>>> +	mutex_lock(&stp->st_mutex);
>>> 
>>> A mutex_init_locked() primitive might also be convenient here.
>> 
>> I know! I would be able to do it under spinlock then without moving this around too.
>> 
>> But alas, not only is there not one, the mutex documentation states this is disallowed.
> 
> You're just talking about this comment?:
> 
> 	 * It is not allowed to initialize an already locked mutex.
> 
> That's a weird comment.  You're probably right that what they meant was
> something like "It is not allowed to initialize a mutex to locked
> state".  But, I don't know, taken literally that comment doesn't make
> sense (how could you even distinguish between an already-locked mutex
> and an uninitialized mutex?), so maybe it'd be worth asking.

I think this is because of the strict ownership tracking or something.
I guess I can ask.
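
For readers following the ordering being discussed, here is a minimal
userspace sketch of the "init and lock before publishing" pattern. It is
not the nfsd code: it uses pthreads, and struct stateid, table_lock,
table and init_stateid() are hypothetical stand-ins for
struct nfs4_ol_stateid, the cl_lock/fi_lock spinlocks, the per-file
stateid list and init_open_stateid(). The assumption is that the
candidate object is not yet visible to any other thread, so initializing
and locking its mutex first cannot race with anyone.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct nfs4_ol_stateid. */
struct stateid {
	pthread_mutex_t st_mutex;
	int key;
	struct stateid *next;
};

/* Hypothetical stand-ins for the spinlock-protected stateid list. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct stateid *table;

/*
 * Publish 'candidate' under 'key' unless an entry with that key already
 * exists. Either way, return the entry in use with its st_mutex held;
 * the caller is responsible for unlocking it.
 */
static struct stateid *init_stateid(struct stateid *candidate, int key)
{
	struct stateid *found = NULL, *it;

	/* Nobody else can see 'candidate' yet, so init + lock is race-free. */
	pthread_mutex_init(&candidate->st_mutex, NULL);
	pthread_mutex_lock(&candidate->st_mutex);
	candidate->key = key;

	pthread_mutex_lock(&table_lock);
	for (it = table; it; it = it->next) {
		if (it->key == key) {
			found = it;
			break;
		}
	}
	if (!found) {
		/* Published already locked: racing lookups will wait on it. */
		candidate->next = table;
		table = candidate;
	}
	pthread_mutex_unlock(&table_lock);

	if (found) {
		/* Lost the race: discard ours, then wait on the existing one
		 * only after the table lock has been dropped. */
		pthread_mutex_unlock(&candidate->st_mutex);
		pthread_mutex_destroy(&candidate->st_mutex);
		free(candidate);
		pthread_mutex_lock(&found->st_mutex);
		return found;
	}
	return candidate;
}
```

A caller would malloc() a candidate, call init_stateid(candidate, key),
work on the returned entry, and then unlock its st_mutex. In the kernel
the table lock is a spinlock, which is why the sleeping mutex_lock() on
the found entry has to happen after the spinlocks are released.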

>>> You could also take the two previous lines from the caller into this
>>> function instead of passing in stp, that might simplify the code.
>>> (Haven't checked.)
>> 
>> I am not really sure what you mean here.
>> These lines are moved from further away in this function (well, just the init, anyway).
>> 
>> Having half initialisation of stp here and half in the caller sounds kind of strange
>> to me.
> 
> I was thinking of something like the following--so init_open_stateid
> hides more of the details of the swapping.  Untested.  Does it look like
> an improvement to you?
> 
> There's got to be a way to make this code a little less convoluted....
> 
> --b.
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index fa5fb5aa4847..41b59854c40f 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -3480,13 +3480,15 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
> }
> 
> static struct nfs4_ol_stateid *
> -init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,
> -		struct nfsd4_open *open)
> +init_open_stateid(struct nfs4_file *fp, struct nfsd4_open *open)
> {
> 
> 	struct nfs4_openowner *oo = open->op_openowner;
> 	struct nfs4_ol_stateid *retstp = NULL;
> +	struct nfs4_ol_stateid *stp;
> 
> +	stp = open->op_stp;
> +	open->op_stp = NULL;
> 	/* We are moving these outside of the spinlocks to avoid the warnings */
> 	mutex_init(&stp->st_mutex);
> 	mutex_lock(&stp->st_mutex);
> @@ -3512,9 +3514,12 @@ init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,
> out_unlock:
> 	spin_unlock(&fp->fi_lock);
> 	spin_unlock(&oo->oo_owner.so_client->cl_lock);
> -	if (retstp)
> -		mutex_lock(&retstp->st_mutex);
> -	return retstp;
> +	if (retstp) {
> +		nfs4_put_stid(&stp->st_stid);

So as I am trying to integrate this into my patchset:
do we really need this nfs4_put_stid() here?
We would not if we took the other path and left the unused stateid
hanging off struct nfsd4_open (why do we need to
assign it NULL before the search?). I imagine then
we'd save some free/realloc churn as well.

I assume struct nfsd4_open cannot be shared between threads?
Otherwise we have bigger problems at hand, such as one thread running
mutex_init() on a mutex that another thread already holds locked.

I'll try this theory I guess.


> +		stp = retstp;
> +		mutex_lock(&stp->st_mutex);
> +	}
> +	return stp;
> }
> 
> /*
> @@ -4310,7 +4315,6 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
> 	struct nfs4_client *cl = open->op_openowner->oo_owner.so_client;
> 	struct nfs4_file *fp = NULL;
> 	struct nfs4_ol_stateid *stp = NULL;
> -	struct nfs4_ol_stateid *swapstp = NULL;
> 	struct nfs4_delegation *dp = NULL;
> 	__be32 status;
> 
> @@ -4347,16 +4351,9 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
> 			goto out;
> 		}
> 	} else {
> -		stp = open->op_stp;
> -		open->op_stp = NULL;
> -		/*
> -		 * init_open_stateid() either returns a locked stateid
> -		 * it found, or initializes and locks the new one we passed in
> -		 */
> -		swapstp = init_open_stateid(stp, fp, open);
> -		if (swapstp) {
> -			nfs4_put_stid(&stp->st_stid);
> -			stp = swapstp;
> +		/* stp is returned locked: */
> +		stp = init_open_stateid(fp, open);
> +		if (stp->st_access_bmap == 0) {
> 			status = nfs4_upgrade_open(rqstp, fp, current_fh,
> 						stp, open);
> 			if (status) {
