From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932501AbcHCO24 (ORCPT );
	Wed, 3 Aug 2016 10:28:56 -0400
Received: from fieldses.org ([173.255.197.46]:50900 "EHLO fieldses.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S932153AbcHCO2w (ORCPT );
	Wed, 3 Aug 2016 10:28:52 -0400
Date: Wed, 3 Aug 2016 10:28:50 -0400
From: "J. Bruce Fields"
To: Nikolay Borisov
Cc: Jeff Layton , viro@zeniv.linux.org.uk, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, ebiederm@xmission.com,
	containers@lists.linux-foundation.org, Andrey Vagin ,
	xemul@virtuozzo.com
Subject: Re: [PATCH v2] locks: Filter /proc/locks output on proc pid ns
Message-ID: <20160803142850.GA27072@fieldses.org>
References: <1470148943-21835-1-git-send-email-kernel@kyup.com>
	<1470209710-30022-1-git-send-email-kernel@kyup.com>
	<1470232012.18285.4.camel@poochiereds.net>
	<57A1FCE5.3040206@kyup.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <57A1FCE5.3040206@kyup.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Aug 03, 2016 at 05:17:09PM +0300, Nikolay Borisov wrote:
> 
> 
> On 08/03/2016 04:46 PM, Jeff Layton wrote:
> > On Wed, 2016-08-03 at 10:35 +0300, Nikolay Borisov wrote:
> >> On busy container servers reading /proc/locks shows all the locks
> >> created by all clients. This can cause large latency spikes. In my
> >> case I observed lsof taking up to 5-10 seconds while processing around
> >> 50k locks. Fix this by limiting the locks shown only to those created
> >> in the same pidns as the one the proc was mounted in. When reading
> >> /proc/locks from the init_pid_ns show everything.
> >> 
> >> Signed-off-by: Nikolay Borisov
> >> ---
> >>  fs/locks.c | 6 ++++++
> >>  1 file changed, 6 insertions(+)
> >> 
> >> diff --git a/fs/locks.c b/fs/locks.c
> >> index ee1b15f6fc13..751673d7f7fc 100644
> >> --- a/fs/locks.c
> >> +++ b/fs/locks.c
> >> @@ -2648,9 +2648,15 @@ static int locks_show(struct seq_file *f, void *v)
> >>  {
> >>  	struct locks_iterator *iter = f->private;
> >>  	struct file_lock *fl, *bfl;
> >> +	struct pid_namespace *proc_pidns = file_inode(f->file)->i_sb->s_fs_info;
> >> +	struct pid_namespace *current_pidns = task_active_pid_ns(current);
> >> 
> >>  	fl = hlist_entry(v, struct file_lock, fl_link);
> >> 
> >> +	if ((current_pidns != &init_pid_ns) && fl->fl_nspid
> > 
> > Ok, so when you read from a process that's in the init_pid_ns
> > namespace, then you'll get the whole pile of locks, even when reading
> > this from a filesystem that was mounted in a different pid_ns?
> > 
> > That seems odd to me if so. Any reason not to just uniformly use the
> > proc_pidns here?
> 
> [CCing some people from openvz/CRIU]
> 
> My train of thought was "we should have means which would be the one
> universal truth about everything and this would be a process in the
> init_pid_ns".

OK, but why not make that means be "mount proc from the init_pid_ns and
read /proc/locks there".  So just replace current_pidns with proc_pidns
in the above.  I think that's all Jeff was suggesting.

--b.

> I don't have strong preference as long as I'm not breaking
> userspace. As I said before - I think the CRIU guys might be using that
> interface.
> 
> > 
> >> +	    && (proc_pidns != ns_of_pid(fl->fl_nspid)))
> >> +		return 0;
> >> +
> >>  	lock_get_status(f, fl, iter->li_pos, "");
> >> 
> >>  	list_for_each_entry(bfl, &fl->fl_block, fl_block)
> > 