All of lore.kernel.org
 help / color / mirror / Atom feed
From: NeilBrown <neilb@suse.de>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Kinglong Mee <kinglongmee@gmail.com>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH RFC] NFSD: fix cannot umounting mount points under pseudo root
Date: Thu, 30 Apr 2015 07:52:25 +1000	[thread overview]
Message-ID: <20150430075225.21a71056@notabene.brown> (raw)
In-Reply-To: <20150429191934.GA23980@fieldses.org>

[-- Attachment #1: Type: text/plain, Size: 3061 bytes --]

On Wed, 29 Apr 2015 15:19:34 -0400 "J. Bruce Fields" <bfields@fieldses.org>
wrote:

> On Wed, Apr 29, 2015 at 12:57:28PM +1000, NeilBrown wrote:
> > In any case, holding a mutex while waiting for an upcall is probably a bad
> > idea.  We should try to find a way to work around that...
> 
> Yeah, it sounds fragile even if we could fix this particular issue in
> mountd.
> 
> > Any ideas Bruce ?-)
> 
> Looking at nfsd_buffered_readdir():
> 
> 	/*
> 	 * Various filldir functions may end up calling back into
> 	 * lookup_one_len() and the file system's ->lookup() method.
> 	 * These expect i_mutex to be held, as it would within readdir.
> 	 */
>         host_err = mutex_lock_killable(&dir_inode->i_mutex);
> 
> and as far as I can tell Kinglong's approach (adding an unlock and lock
> around nfsd_cross_mnt() calls might actually be OK.

Yes, I think now that it would work.  It would look odd though, as you
suggest.
There would need to be very clear comments explaining why the lock is being
managed that way.

> 
> Though in general I expect
> 
> 	lock()
> 	...code...
> 	unlock()
> 
> to mark a critical section and would be unpleasantly surprised to
> discover that a function somewhere in the middle had a buried
> unlock/lock pair.
> 
> Maybe drop the locking from nfsd_buffered_readdir and *just* take the
> i_mutex around lookup_one_len(), if that's the only place we need it?

I think it is the only place needed.  Certainly normal path lookup only takes
i_mutex very briefly around __lookup_hash().

One cost would be that we would take the mutex for every name instead of once
for a whole set of names.  That is generally frowned upon for performance
reasons.

An alternative might be to stop using lookup_one_len, and use kern_path()
instead.  Or kern_path_mountpoint if we don't want to cross a mountpoint.

They would only take the i_mutex if absolutely necessary.

The draw back is that they don't take a 'len' argument, so we would need to
make sure the name  is nul terminated (and maybe double-check that there is
no '/'??).  It would be fairly easy to nul-terminate names from readdir -
just use a bit more space  in the buffer in nfsd_buffered_filldir().

I'm less sure about the locking needed in nfsd_lookup_dentry().
The comments suggests that there is good reason to keep the lock for an
extended period.  But that reasoning might only apply to files, and
nfsd_mountpoint should  only be true for directories... I hope.

A different approach would be to pass NULL for the rqstp to nfsd_cross_mnt().
This will ensure it doesn't block, but it might fail incorrectly.
If it does fail, we drop the lock, retry with the real rqstp, retake the lock
and .... no, I think that gets a bit too messy.

I think I'm in favour of:
 - not taking the lock in readdir
 - maybe not taking it in lookup
 - using kern_path_mountpoint or kern_path
 - nul terminating then names, copying the nfsd_lookup name to
   __getname() if necessary.

Sound reasonable?

NeilBrown


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 811 bytes --]

  reply	other threads:[~2015-04-29 21:52 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-21 14:50 [PATCH RFC] NFSD: fix cannot umounting mount points under pseudo root Kinglong Mee
2015-04-21 21:54 ` J. Bruce Fields
2015-04-22  5:07   ` NeilBrown
2015-04-22 11:11   ` Kinglong Mee
2015-04-22 15:07     ` J. Bruce Fields
2015-04-22 23:44       ` NeilBrown
2015-04-23 12:52         ` Kinglong Mee
2015-04-24  3:00           ` NeilBrown
2015-04-27 12:11             ` Kinglong Mee
2015-04-29  2:57               ` NeilBrown
2015-04-29  8:45                 ` Kinglong Mee
2015-04-29 19:19                 ` J. Bruce Fields
2015-04-29 21:52                   ` NeilBrown [this message]
2015-04-30 21:36                     ` J. Bruce Fields
2015-05-01  1:53                       ` NeilBrown
2015-05-01  2:03                         ` Al Viro
2015-05-01  2:23                           ` NeilBrown
2015-05-01  2:29                             ` Al Viro
2015-05-01  3:08                               ` NeilBrown
2015-05-01 13:29                                 ` J. Bruce Fields
2015-05-02 23:16                                   ` NeilBrown
2015-05-03  0:37                                     ` J. Bruce Fields
2015-05-04  4:11                                       ` NeilBrown
2015-05-04 21:48                                     ` J. Bruce Fields
2015-05-05 22:27                                       ` NeilBrown
2015-05-04 22:01                         ` J. Bruce Fields
2015-05-05 13:54                           ` Kinglong Mee
2015-05-05 14:18                             ` J. Bruce Fields
2015-05-05 15:52                               ` J. Bruce Fields
2015-05-05 22:26                                 ` NeilBrown
2015-05-08 16:15                                   ` J. Bruce Fields
2015-05-08 20:01                                     ` [PATCH] nfsd: don't hold i_mutex over userspace upcalls J. Bruce Fields
2015-06-03 15:18                                       ` J. Bruce Fields
     [not found]                                         ` <20150603151819.GA8441-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2015-07-05 11:27                                           ` Kinglong Mee
2015-07-05 11:27                                             ` Kinglong Mee
2015-07-06 18:22                                             ` J. Bruce Fields
2015-08-18 19:10                                           ` J. Bruce Fields
2015-08-18 19:10                                             ` J. Bruce Fields
     [not found]                                             ` <20150818191028.GA3957-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2015-11-12 21:22                                               ` J. Bruce Fields
2015-11-12 21:22                                                 ` J. Bruce Fields
2015-05-07 15:31                                 ` [PATCH RFC] NFSD: fix cannot umounting mount points under pseudo root J. Bruce Fields
2015-05-07 22:42                                   ` NeilBrown
2015-05-08 14:10                                     ` J. Bruce Fields
2015-05-05  3:53                       ` Kinglong Mee
2015-05-05  4:19                         ` NeilBrown
2015-05-05  8:32                           ` Kinglong Mee
2015-05-05 13:52                             ` J. Bruce Fields
2015-06-26 23:14                             ` Kinglong Mee
2015-06-26 23:35                               ` NeilBrown
2015-07-02  9:42                                 ` Kinglong Mee
2015-05-01  1:55                     ` Al Viro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150430075225.21a71056@notabene.brown \
    --to=neilb@suse.de \
    --cc=bfields@fieldses.org \
    --cc=kinglongmee@gmail.com \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.