From: Andy Lutomirski
Date: Mon, 13 Mar 2017 08:24:23 -0700
Subject: Re: [RFC] Add option to mount only a pids subset
To: Al Viro
Cc: Alexey Gladkov, Linux Kernel Mailing List, Linux API,
    "Kirill A. Shutemov", Vasiliy Kulikov, "Eric W. Biederman",
    Oleg Nesterov, Pavel Emelyanov, James Bottomley
In-Reply-To: <20170313132732.GR29622@ZenIV.linux.org.uk>
References: <20170221145746.GA31914@redhat.com>
    <20170306230515.GA3453@comp-core-i7-2640m-0182e6>
    <20170312015430.GO29622@ZenIV.linux.org.uk>
    <20170312021257.GP29622@ZenIV.linux.org.uk>
    <20170313132732.GR29622@ZenIV.linux.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Mar 13, 2017 at 6:27 AM, Al Viro wrote:
> On Sun, Mar 12, 2017 at 08:19:33PM -0700, Andy Lutomirski wrote:
>> On Sat, Mar 11, 2017 at 6:13 PM, Al Viro wrote:
>> > PS: AFAICS, simple mount --bind of your pid-only mount will suddenly
>> > expose the full thing. And as for the lifetimes making no sense...
>> > note that you are simply not freeing these structures of yours.
>> > Try to handle that and you'll get a serious PITA all over the
>> > place.
>> >
>> > What are you trying to achieve, anyway? Why not add a second vfsmount
>> > pointer per pid_namespace and make it initialized on demand, at the
>> > first attempt of no-pid mount? Just have a separate no-pid instance
>> > created for those namespaces where it had been asked for, with
>> > separate superblock and dentry tree not containing anything other
>> > than pid-only parts + self + thread-self...
>>
>> Can't we just make procfs work like most other filesystems and have
>> each mount have its own superblock? If we need to do something funky
>> to stat() output to keep existing userspace working, I think that's
>> okay.
>
> First of all, most of the filesystems do *NOT* guarantee anything of
> that sort. And what's the point of having more instances than
> necessary, anyway?

I mean that, if I do:

mount -t proc -o foobar none a
mount -t proc -o baz none b

then I think that the second mount should create a whole new proc
instance rather than just a new vfsmount. Then the options could
differ, which would solve a bunch of problems.

>
> Again, what for? It won't salvage that kludge... It's not as if it
> had been hard to have separate pid-only instance created when asked
> for (and reused every time when we are asked for pid-only). What's
> the point of ever having more than two instances per pidns? IDGI...

I can easily see procfs growing more than one interesting option.
Pid-only and hidepid come to mind, and that's already six possible
combinations. The current hidepid implementation is really awful.

>
> Folks, there is no one-to-one correspondence between mountpoints and
> superblocks. Not since 2000 or so. Just don't try to shove your
> per-superblock stuff into vfsmount; it simply won't work. If you
> want a separate instance for that thing, then just go ahead and
> have ->mount() decide which one to use (and whether to create a new
> one). All there is to it...

That's what I mean. I just don't see the point of going all-out in
trying to reuse superblocks.
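
To make the "->mount() decides which instance to use" idea concrete,
here is a rough userspace model, not kernel code. Every name in it
(proc_opts, proc_instance, proc_get_instance, the integer pidns_id
standing in for a pid_namespace pointer) is made up for illustration.
It keys instances on (pidns, options), so two mounts with different
options end up with different instances, and it assumes hidepid takes
the values 0, 1 and 2, which together with pid-only on/off gives the
six combinations mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MAX_INSTANCES 16

struct proc_opts {
	int pid_only;	/* 0 or 1 */
	int hidepid;	/* 0, 1 or 2: 2 * 3 = 6 combinations */
};

struct proc_instance {
	int pidns_id;		/* stand-in for a pid_namespace pointer */
	struct proc_opts opts;
	int refcount;		/* number of mounts sharing this instance */
};

static struct proc_instance instances[MAX_INSTANCES];
static size_t nr_instances;

static int same_opts(const struct proc_opts *a, const struct proc_opts *b)
{
	return a->pid_only == b->pid_only && a->hidepid == b->hidepid;
}

/*
 * Model of the decision ->mount() would make: reuse an instance with
 * the same pidns and options if one exists, otherwise create a new one.
 */
static struct proc_instance *proc_get_instance(int pidns_id,
					       struct proc_opts opts)
{
	struct proc_instance *p;
	size_t i;

	for (i = 0; i < nr_instances; i++) {
		p = &instances[i];
		if (p->pidns_id == pidns_id && same_opts(&p->opts, &opts)) {
			p->refcount++;
			return p;
		}
	}

	/* Fixed-size table keeps the sketch short; a real one would allocate. */
	assert(nr_instances < MAX_INSTANCES);
	p = &instances[nr_instances++];
	p->pidns_id = pidns_id;
	p->opts = opts;
	p->refcount = 1;
	return p;
}
```

Dropping the lookup loop gives the simpler "one instance per mount"
behaviour; the model says nothing about lifetimes or freeing, which
would still have to be handled.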