From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752321AbdCMPZM (ORCPT <rfc822;w@1wt.eu>);
        Mon, 13 Mar 2017 11:25:12 -0400
Received: from mail.kernel.org ([198.145.29.136]:50498 "EHLO mail.kernel.org"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751102AbdCMPZE (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
        Mon, 13 Mar 2017 11:25:04 -0400
MIME-Version: 1.0
In-Reply-To: <20170313132732.GR29622@ZenIV.linux.org.uk>
References: <20170221145746.GA31914@redhat.com> <20170306230515.GA3453@comp-core-i7-2640m-0182e6>
 <20170312015430.GO29622@ZenIV.linux.org.uk> <20170312021257.GP29622@ZenIV.linux.org.uk>
 <CALCETrVT5sfGhNomLKAephrSGj8fc81ZjGTN-Y6UwgAHngVRCA@mail.gmail.com> <20170313132732.GR29622@ZenIV.linux.org.uk>
From: Andy Lutomirski <luto@kernel.org>
Date: Mon, 13 Mar 2017 08:24:23 -0700
X-Gmail-Original-Message-ID: <CALCETrXqv8VUeO6MpKWDR6DFYBgmmT0nZVezBJsimtmmQgDksw@mail.gmail.com>
Message-ID: <CALCETrXqv8VUeO6MpKWDR6DFYBgmmT0nZVezBJsimtmmQgDksw@mail.gmail.com>
Subject: Re: [RFC] Add option to mount only a pids subset
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alexey Gladkov <gladkov.alexey@gmail.com>,
        Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
        Linux API <linux-api@vger.kernel.org>,
        "Kirill A. Shutemov" <kirill@shutemov.name>,
        Vasiliy Kulikov <segoon@openwall.com>,
        "Eric W. Biederman" <ebiederm@xmission.com>,
        Oleg Nesterov <oleg@redhat.com>, Pavel Emelyanov <xemul@parallels.com>,
        James Bottomley <James.Bottomley@hansenpartnership.com>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Mar 13, 2017 at 6:27 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Sun, Mar 12, 2017 at 08:19:33PM -0700, Andy Lutomirski wrote:
>> On Sat, Mar 11, 2017 at 6:13 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> > PS: AFAICS, simple mount --bind of your pid-only mount will suddenly
>> > expose the full thing.  And as for the lifetimes making no sense...
>> > note that you are simply not freeing these structures of yours.
>> > Try to handle that and you'll get a serious PITA all over the
>> > place.
>> >
>> > What are you trying to achieve, anyway?  Why not add a second vfsmount
>> > pointer per pid_namespace and make it initialized on demand, at the
>> > first attempt of no-pid mount?  Just have a separate no-pid instance
>> > created for those namespaces where it had been asked for, with
>> > separate superblock and dentry tree not containing anything other
>> > than pid-only parts + self + thread-self...
>>
>> Can't we just make procfs work like most other filesystems and have
>> each mount have its own superblock?  If we need to do something funky
>> to stat() output to keep existing userspace working, I think that's
>> okay.
>
> First of all, most of the filesystems do *NOT* guarantee anything of
> that sort.  And what's the point of having more instances than
> necessary, anyway?

I mean that, if I do:

mount -t proc -o foobar none a
mount -t proc -o baz none b

Then I think that the second mount should create a whole new proc
instance rather than just a new vfsmount.  Then the options could
differ, which would solve a bunch of problems.
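
To make the distinction concrete, here is a toy userspace model, not kernel code. Every name in it (ProcInstance, PidNamespace, mount_shared, mount_fresh) is hypothetical, and the "shared" case simply keeps the first mount's options as an assumption; it does not claim to reproduce what the kernel really does with the second mount's options.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class ProcInstance:
    """Stands in for a proc superblock: the mount options live here."""
    options: frozenset


@dataclass
class PidNamespace:
    """Stands in for a pid namespace holding at most one shared instance."""
    proc: Optional[ProcInstance] = None


def mount_shared(ns, options):
    # One instance per namespace: a later mount gets the existing instance,
    # so its options cannot differ from the first mount's (model assumption).
    if ns.proc is None:
        ns.proc = ProcInstance(frozenset(options))
    return ns.proc


def mount_fresh(ns, options):
    # What is proposed above: every mount gets a whole new instance.
    return ProcInstance(frozenset(options))


ns = PidNamespace()
a = mount_shared(ns, {"foobar"})
b = mount_shared(ns, {"baz"})      # same object as a; "baz" has nowhere to go

c = mount_fresh(ns, {"foobar"})
d = mount_fresh(ns, {"baz"})       # distinct object carrying its own options
```

In the shared case a and b are the same object, so there is only one place to store options. In the fresh case c and d are independent, which is the property I'm after.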

>
> Again, what for?  It won't salvage that kludge...  It's not as if it
> had been hard to have separate pid-only instance created when asked
> for (and reused every time when we are asked for pid-only).  What's
> the point of ever having more than two instances per pidns?  IDGI...

I can easily see procfs growing more than one interesting option.
Pid-only and hidepid come to mind, and that's already six possible
combinations.  The current hidepid implementation is really awful.
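
For the arithmetic: the six comes from a boolean pid-only flag times the three hidepid levels (0, 1, 2). A quick sketch, where the pid_only name is hypothetical since no such option exists yet:

```python
from itertools import product

PID_ONLY = (False, True)   # hypothetical option proposed in this thread
HIDEPID = (0, 1, 2)        # the three existing hidepid= levels

# Every distinct option set would need its own instance if options
# are stored per instance rather than per mount.
combos = list(product(PID_ONLY, HIDEPID))
for pid_only, hidepid in combos:
    print(f"pid_only={pid_only} hidepid={hidepid}")
```

len(combos) is 6, and each further independent option multiplies that count again.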

>
> Folks, there is no one-to-one correspondence between mountpoints and
> superblocks.  Not since 2000 or so.  Just don't try to shove your
> per-superblock stuff into vfsmount; it simply won't work.  If you
> want a separate instance for that thing, then just go ahead and
> have ->mount() decide which one to use (and whether to create a new
> one).  All there is to it...

That's what I mean.  I just don't see the point of going all-out in
trying to reuse superblocks.