linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>,
	James Morris <jmorris@namei.org>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	linux-security-module@vger.kernel.org,
	Paul Moore <paul@paul-moore.com>,
	Stephen Smalley <stephen.smalley.work@gmail.com>,
	Eric Paris <eparis@parisplace.org>,
	selinux@vger.kernel.org, Casey Schaufler <casey@schaufler-ca.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH v2] proc: Allow pid_revalidate() during LOOKUP_RCU
Date: Tue, 15 Dec 2020 15:45:46 -0600	[thread overview]
Message-ID: <87pn3ag2ut.fsf@x220.int.ebiederm.org> (raw)
In-Reply-To: <87sg88tiex.fsf@stepbren-lnx.us.oracle.com> (Stephen Brennan's message of "Mon, 14 Dec 2020 09:19:02 -0800")

Stephen Brennan <stephen.s.brennan@oracle.com> writes:

> ebiederm@xmission.com (Eric W. Biederman) writes:
>
>> Stephen Brennan <stephen.s.brennan@oracle.com> writes:
>>
>>> The pid_revalidate() function requires dropping from RCU into REF lookup
>>> mode. When many threads are resolving paths within /proc in parallel,
>>> this can result in heavy spinlock contention as each thread tries to
>>> grab a reference to the /proc dentry lock (and drop it shortly
>>> thereafter).
>>
>> I am feeling dense at the moment.  Which lock specifically are you
>> referring to?  The only locks I can thinking of are sleeping locks,
>> not spinlocks.
>
> The lock in question is the d_lockref field (aliased as d_lock) of
> struct dentry. It is contended in this code path while processing the
> "/proc" dentry, switching from RCU to REF mode.
>
>     walk_component()
>       lookup_fast()
>         d_revalidate()
>           pid_revalidate() // returns -ECHILD
>         unlazy_child()
>           lockref_get_not_dead(&nd->path.dentry->d_lockref)
>
>>
>>> diff --git a/fs/proc/base.c b/fs/proc/base.c
>>> index ebea9501afb8..833d55a59e20 100644
>>> --- a/fs/proc/base.c
>>> +++ b/fs/proc/base.c
>>> @@ -1830,19 +1846,22 @@ static int pid_revalidate(struct dentry *dentry, unsigned int flags)
>>>  {
>>>  	struct inode *inode;
>>>  	struct task_struct *task;
>>> +	int rv = 0;
>>>  
>>> -	if (flags & LOOKUP_RCU)
>>> -		return -ECHILD;
>>> -
>>> -	inode = d_inode(dentry);
>>> -	task = get_proc_task(inode);
>>> -
>>> -	if (task) {
>>> -		pid_update_inode(task, inode);
>>> -		put_task_struct(task);
>>> -		return 1;
>>> +	if (flags & LOOKUP_RCU) {
>>
>> Why do we need to test flags here at all?
>> Why can't the code simply take an rcu_read_lock unconditionally and just
>> pass flags into do_pid_update_inode?
>>
>
> I don't have any good reason. If it is safe to update the inode without
> holding a reference to the task struct (or holding any other lock) then
> I can consolidate the whole conditional.


I am not certain if there is anything that makes it safe to change uid
and gid on the inode during lookup.  The current code might be buggy in
that respect.

However it is absoltely safe to read from the task_struct with just the
rcu_read_lock.  Which means it is only do_pid_update_inode that needs
locking to update the inode.

>>> +		inode = d_inode_rcu(dentry);
>>> +		task = pid_task(proc_pid(inode), PIDTYPE_PID);
>>> +		if (task)
>>> +			rv = do_pid_update_inode(task, inode, flags);
>>> +	} else {
>>> +		inode = d_inode(dentry);
>>> +		task = get_proc_task(inode);
>>> +		if (task) {
>>> +			rv = do_pid_update_inode(task, inode, flags);
>>> +			put_task_struct(task);
>>> +		}
>>
>>>  	}
>>> -	return 0;
>>> +	return rv;
>>>  }
>>>  
>>>  static inline bool proc_inode_is_dead(struct inode *inode)
>>
>> Eric

Eric

      reply	other threads:[~2020-12-15 21:48 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-04  0:02 [PATCH v2] proc: Allow pid_revalidate() during LOOKUP_RCU Stephen Brennan
2020-12-12 20:55 ` Matthew Wilcox
2020-12-13 14:22   ` Eric W. Biederman
2020-12-13 16:29     ` Matthew Wilcox
2020-12-13 23:00       ` Paul Moore
2020-12-15 18:09         ` Casey Schaufler
2020-12-15 22:04           ` Eric W. Biederman
2020-12-15 22:53             ` Casey Schaufler
2020-12-16  1:05               ` Stephen Brennan
2020-12-14 18:45       ` Casey Schaufler
2020-12-14 18:15     ` Stephen Brennan
2020-12-13 14:30 ` Eric W. Biederman
2020-12-13 16:32   ` Matthew Wilcox
2020-12-14 17:19   ` Stephen Brennan
2020-12-15 21:45     ` Eric W. Biederman [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87pn3ag2ut.fsf@x220.int.ebiederm.org \
    --to=ebiederm@xmission.com \
    --cc=adobriyan@gmail.com \
    --cc=casey@schaufler-ca.com \
    --cc=eparis@parisplace.org \
    --cc=jmorris@namei.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=paul@paul-moore.com \
    --cc=selinux@vger.kernel.org \
    --cc=serge@hallyn.com \
    --cc=stephen.s.brennan@oracle.com \
    --cc=stephen.smalley.work@gmail.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).