From: Pavel Emelyanov <xemul@virtuozzo.com>
To: Nikolay Borisov <kernel@kyup.com>,
	Jeff Layton <jlayton@poochiereds.net>, <bfields@fieldses.org>
Cc: <viro@zeniv.linux.org.uk>, <linux-kernel@vger.kernel.org>,
	<linux-fsdevel@vger.kernel.org>, <ebiederm@xmission.com>,
	<containers@lists.linux-foundation.org>,
	Andrey Vagin <avagin@openvz.org>
Subject: Re: [PATCH v2] locks: Filter /proc/locks output on proc pid ns
Date: Wed, 3 Aug 2016 17:54:54 +0300
Message-ID: <57A205BE.3070202@virtuozzo.com>
In-Reply-To: <57A1FCE5.3040206@kyup.com>

On 08/03/2016 05:17 PM, Nikolay Borisov wrote:
> 
> 
> On 08/03/2016 04:46 PM, Jeff Layton wrote:
>> On Wed, 2016-08-03 at 10:35 +0300, Nikolay Borisov wrote:
>>> On busy container servers reading /proc/locks shows all the locks
>>> created by all clients. This can cause large latency spikes. In my
>>> case I observed lsof taking up to 5-10 seconds while processing around
>>> 50k locks. Fix this by limiting the locks shown only to those created
>>> in the same pidns as the one the proc was mounted in. When reading
>>> /proc/locks from the init_pid_ns show everything.
>>>
>>> Signed-off-by: Nikolay Borisov <kernel@kyup.com>
>>> ---
>>>  fs/locks.c | 6 ++++++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/fs/locks.c b/fs/locks.c
>>> index ee1b15f6fc13..751673d7f7fc 100644
>>> --- a/fs/locks.c
>>> +++ b/fs/locks.c
>>> @@ -2648,9 +2648,15 @@ static int locks_show(struct seq_file *f, void *v)
>>>  {
>>>  	struct locks_iterator *iter = f->private;
>>>  	struct file_lock *fl, *bfl;
>>> +	struct pid_namespace *proc_pidns = file_inode(f->file)->i_sb->s_fs_info;
>>> +	struct pid_namespace *current_pidns = task_active_pid_ns(current);
>>>  
>>>  	fl = hlist_entry(v, struct file_lock, fl_link);
>>>  
>>> +	if ((current_pidns != &init_pid_ns) && fl->fl_nspid
>>
>> Ok, so when you read from a process that's in the init_pid_ns
>> namespace, then you'll get the whole pile of locks, even when reading
>> this from a filesystem that was mounted in a different pid_ns?
>>
>> That seems odd to me if so. Any reason not to just uniformly use the
>> proc_pidns here?
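
To make the difference between the two checks concrete, here is a small
userspace model in C. It is only a sketch: struct pidns, init_ns, show_v2()
and show_uniform() are hypothetical stand-ins invented for illustration, not
kernel types or functions, and show_uniform() is just one literal reading of
the suggestion above (drop the current_pidns test, keep the rest of the
condition as posted).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pid namespace; not the kernel structure. */
struct pidns { const char *name; };

/* Hypothetical stand-in for init_pid_ns. */
static struct pidns init_ns = { "init" };

/*
 * Model of the v2 condition as posted: a reader in the initial namespace
 * sees every lock; any other reader sees a lock only if the lock has no
 * recorded owner namespace (NULL) or the owner namespace is the one the
 * proc instance was mounted in.
 */
static bool show_v2(const struct pidns *reader_ns,
		    const struct pidns *proc_ns,
		    const struct pidns *lock_ns)
{
	if (reader_ns != &init_ns && lock_ns != NULL && proc_ns != lock_ns)
		return false;
	return true;
}

/*
 * Model of one literal reading of the suggestion: ignore who is reading
 * and decide on the proc mount's namespace alone.
 */
static bool show_uniform(const struct pidns *proc_ns,
			 const struct pidns *lock_ns)
{
	if (lock_ns != NULL && proc_ns != lock_ns)
		return false;
	return true;
}
```

In this model the two functions disagree only when the reader is in the
initial namespace and the lock's namespace differs from the proc mount's
namespace: show_v2() shows the lock, show_uniform() hides it.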
> 
> [CCing some people from openvz/CRIU]

Thanks :)

> My train of thought was "we should have means which would be the one
> universal truth about everything and this would be a process in the
> init_pid_ns". I don't have strong preference as long as I'm not breaking
> userspace. As I said before - I think the CRIU guys might be using that
> interface.

This particular change won't break us mostly because we've switched to
reading the /proc/pid/fdinfo/n files for locks.

-- Pavel

>>
>>> +	    && (proc_pidns != ns_of_pid(fl->fl_nspid)))
>>> +		return 0;
>>> +
>>>  	lock_get_status(f, fl, iter->li_pos, "");
>>>  
>>>  	list_for_each_entry(bfl, &fl->fl_block, fl_block)
>>
> .
> 
