All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@redhat.com>
To: Benjamin Coddington <bcodding@redhat.com>
Cc: bfields@fieldses.org, Alexander Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/3] fs/locks: Remove fl_nspid
Date: Tue, 06 Jun 2017 19:05:31 -0400	[thread overview]
Message-ID: <1496790331.3362.6.camel@redhat.com> (raw)
In-Reply-To: <7ADF33E5-AD07-4D57-A38E-AB5FD81636FB@redhat.com>

On Tue, 2017-06-06 at 14:57 -0400, Benjamin Coddington wrote:
> On 6 Jun 2017, at 14:25, Jeff Layton wrote:
> 
> > On Tue, 2017-06-06 at 14:00 -0400, Jeff Layton wrote:
> > > On Tue, 2017-06-06 at 13:19 -0400, Benjamin Coddington wrote:
> > > > Since commit c69899a17ca4 "NFSv4: Update of VFS byte range lock must 
> > > > be
> > > > atomic with the stateid update", NFSv4 has been inserting locks in 
> > > > rpciod
> > > > worker context.  The result is that the file_lock's fl_nspid is the
> > > > kworker's pid instead of the original userspace pid.
> > > > 
> > > > The fl_nspid is only used to represent the namespaced virtual pid 
> > > > number
> > > > when displaying locks or returning from F_GETLK.  There's no reason 
> > > > to set
> > > > it for every inserted lock, since we can usually just look it up 
> > > > from
> > > > fl_pid.  So, instead of looking up and holding struct pid for every 
> > > > lock,
> > > > let's just look up the virtual pid number from fl_pid when it is 
> > > > needed.
> > > > That means we can remove fl_nspid entirely.
> > > > 
> > > 
> > > With this set, I think we ought to codify that the stored pid must be
> > > relative
> > 
> > ...to the init_pid_ns. Let's make that clear in the comments for
> > filesystem authors.
> 
> OK, but I think you mean fl_pid should always be current->tgid or the 
> pid as
> it is in init_pid_ns.  We translate that pid into the virtual pid of the
> process doing F_GETLK or reading /proc/locks.
> 
> > > > Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> > > > ---
> > > >  fs/locks.c         | 58 
> > > > ++++++++++++++++++++++++++++++++----------------------
> > > >  include/linux/fs.h |  1 -
> > > >  2 files changed, 35 insertions(+), 24 deletions(-)
> > > > 
> > > > diff --git a/fs/locks.c b/fs/locks.c
> > > > index d7daa6c8932f..104398ccc9b9 100644
> > > > --- a/fs/locks.c
> > > > +++ b/fs/locks.c
> > > > @@ -733,7 +733,6 @@ static void locks_wake_up_blocks(struct 
> > > > file_lock *blocker)
> > > >  static void
> > > >  locks_insert_lock_ctx(struct file_lock *fl, struct list_head 
> > > > *before)
> > > >  {
> > > > -	fl->fl_nspid = get_pid(task_tgid(current));
> > > >  	list_add_tail(&fl->fl_list, before);
> > > >  	locks_insert_global_locks(fl);
> > > >  }
> > > > @@ -743,10 +742,6 @@ locks_unlink_lock_ctx(struct file_lock *fl)
> > > >  {
> > > >  	locks_delete_global_locks(fl);
> > > >  	list_del_init(&fl->fl_list);
> > > > -	if (fl->fl_nspid) {
> > > > -		put_pid(fl->fl_nspid);
> > > > -		fl->fl_nspid = NULL;
> > > > -	}
> > > >  	locks_wake_up_blocks(fl);
> > > >  }
> > > > 
> > > > @@ -823,8 +818,6 @@ posix_test_lock(struct file *filp, struct 
> > > > file_lock *fl)
> > > >  	list_for_each_entry(cfl, &ctx->flc_posix, fl_list) {
> > > >  		if (posix_locks_conflict(fl, cfl)) {
> > > >  			locks_copy_conflock(fl, cfl);
> > > > -			if (cfl->fl_nspid)
> > > > -				fl->fl_pid = pid_vnr(cfl->fl_nspid);
> > > >  			goto out;
> > > >  		}
> > > >  	}
> > > > @@ -2048,6 +2041,31 @@ int vfs_test_lock(struct file *filp, struct 
> > > > file_lock *fl)
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(vfs_test_lock);
> > > > 
> > > > +/**
> > > > + * locks_translate_pid - translate a pid number into a namespace
> > > > + * @nr: The pid number in the init_pid_ns
> > > > + * @ns: The namespace into which the pid should be translated
> > > > + *
> > > > + * Used to tranlate a fl_pid into a namespace virtual pid number
> > > > + */
> > > > +static pid_t locks_translate_pid(int init_nr, struct pid_namespace 
> > > > *ns)
> > > > +{
> > > > +	pid_t vnr = 0;
> > > > +	struct task_struct *task;
> > > > +
> > > > +	rcu_read_lock();
> > > > +	task = find_task_by_pid_ns(init_nr, &init_pid_ns);
> > > > +	if (task)
> > > > +		get_task_struct(task);
> > > > +	rcu_read_unlock();
> > > 
> > > Is that safe? What prevents get_task_struct from doing a 0->1 
> > > transition
> > > there after the task usage count has already gone 1->0 and is on its 
> > > way
> > > to being freed?
> 
> Uh, no -- seems not safe.  I copied that directly from fs/proc/base.c, 
> and
> seems a problem there too.
> 
> Changing this to the below avoids the race with the struct task being
> released:
> 
> rcu_read_lock();
> struct pid = find_pid_ns(init_nr, &init_pid_ns)
> vnr = pid_vnr(pid);
> rcu_read_unlock();
> 

Actually now that I've looked a little more closely, I think it may be
ok. call_rcu callback is what ends up putting the last reference when
the task is released, so if you can find the thing via
find_task_by_pid_ns, then you know that the refcount is at least 1 until
the rcu_read_lock is dropped.

Still, I think doing the pid assignment under rcu_read_lock and not
bothering with the refcount is better in this case...

Thanks,
-- 
Jeff Layton <jlayton@redhat.com>

  parent reply	other threads:[~2017-06-06 23:05 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-06 17:19 [PATCH 0/3 v3] Fixups for l_pid Benjamin Coddington
2017-06-06 17:19 ` [PATCH 1/3] fs/locks: Use allocation rather than the stack in fcntl_getlk() Benjamin Coddington
2017-06-06 17:19 ` [PATCH 2/3] fs/locks: Remove fl_nspid Benjamin Coddington
2017-06-06 18:00   ` Jeff Layton
2017-06-06 18:25     ` Jeff Layton
2017-06-06 18:57       ` Benjamin Coddington
2017-06-06 20:41         ` Benjamin Coddington
2017-06-06 23:05         ` Jeff Layton [this message]
2017-06-08  6:50   ` kbuild test robot
2017-06-06 17:19 ` [PATCH 3/3] fs/locks: Use fs-specific l_pid for remote locks Benjamin Coddington
2017-06-06 20:45 [PATCH 0/3 v4] Fixups for l_pid Benjamin Coddington
2017-06-06 20:45 ` [PATCH 2/3] fs/locks: Remove fl_nspid Benjamin Coddington

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1496790331.3362.6.camel@redhat.com \
    --to=jlayton@redhat.com \
    --cc=bcodding@redhat.com \
    --cc=bfields@fieldses.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.