Date: Wed, 9 Jun 2021 10:09:18 +0200
From: Christian Brauner
To: Hannes Reinecke
Cc: "Eric W.
Biederman" , gregkh@linuxfoundation.org, containers@lists.linux.dev, linux-kernel@vger.kernel.org, lkml@metux.net
Subject: Re: device namespaces
Message-ID: <20210609080918.ma2klvxkjad4pjrn@wittgenstein>
References: <9157affa-b27a-c0f4-f6ee-def4a991fd4e@suse.de> <20210608142911.ievp2rpuquxjuyus@wittgenstein> <877dj4ff9g.fsf@disp2133> <20210609063818.xnod4rzvti3ujkvn@wittgenstein> <20210609072108.ldhsxfnfql4pacqx@wittgenstein> <85a0d777-dea6-9574-8946-9fc8f912c1af@suse.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <85a0d777-dea6-9574-8946-9fc8f912c1af@suse.de>
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jun 09, 2021 at 09:54:05AM +0200, Hannes Reinecke wrote:
> On 6/9/21 9:21 AM, Christian Brauner wrote:
> > On Wed, Jun 09, 2021 at 09:02:36AM +0200, Hannes Reinecke wrote:
> >> On 6/9/21 8:38 AM, Christian Brauner wrote:
> >>> On Tue, Jun 08, 2021 at 12:16:43PM -0500, Eric W. Biederman wrote:
> >>>> Hannes Reinecke writes:
> >>>>
> >>>>> On 6/8/21 4:29 PM, Christian Brauner wrote:
> >>>>>> On Tue, Jun 08, 2021 at 04:10:08PM +0200, Hannes Reinecke wrote:
> >> [ .. ]
> >>>>> Granted, modifying sysfs layout is not something for the faint-hearted,
> >>>>> and one really has to look closely to ensure you end up with a
> >>>>> consistent layout afterwards.
> >>>>>
> >>>>> But let's see how things go; might well be that it turns out to be too
> >>>>> complex to consider. Can't tell yet.
> >>>>
> >>>> I would suggest aiming for something like devptsfs without the
> >>>> complication of /dev/ptmx.
> >>>>
> >>>> That is a pseudo filesystem that has a control node and virtual block
> >>>> devices that were created using that control node.
> >>>
> >>> Also see android/binder/binderfs.c
> >>>
> >> Ah. Will have a look.
> >
> > I implemented this a few years back and I think it should've made it
> > onto Android by default now. So that approach does indeed work well, it
> > seems:
> > https://chromium.googlesource.com/aosp/platform/system/core/+/master/rootdir/init.rc#257
> >
> > This should be easier to follow than the devpts case because you don't
> > need to wade through the {t,p}ty layer.
> >
> >>
> >>>>
> >>>> That is the cleanest solution I know and is not strictly limited to use
> >>>> with containers so it can also gain greater traction. The interaction
> >>>> with devtmpfs should be simply having devtmpfs create a mount point for
> >>>> that filesystem.
> >>>>
> >>>> This could be a new cleaner api for things like loopback devices.
> >>>
> >>> I sent a patchset that implemented this last year.
> >>>
> >> Do you have a pointer/commit hash for this?
> >
> > Yes, sure:
> > https://lore.kernel.org/linux-block/20200424162052.441452-1-christian.brauner@ubuntu.com/
> >
> > You can also just pull my branch. I think it's still based on v5.7 or sm:
> > https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=loopfs
> >
> > I'm happy to collaborate on this too.
> >
> How _very_ curious. 'kernfs: handle multiple namespace tags' and 'loop:
> preserve sysfs backwards compability' are essentially the same patches I
> did for my block namespaces prototyp; I named it 'KOBJ_NS_TYPE_BLK', not
> 'KOBJ_NS_TYPE_USER', though :-)
>
> Guess we really should cooperate.
>
> Speaking of which: why did you name it 'user' namespace?
> There already is a generic 'user_namespace' in
> include/linux/user_namespace.h, serving as a container for all
> namespaces; as such it probably should include this 'user' namespace,
> leading to quite some confusion.
>
> Or did I misunderstood something here?

Ah yes, you misunderstand. The KOBJ_NS_TYPE_* tags are namespace tags.
So KOBJ_NS_TYPE_NET is a network namespace tag. So KOBJ_NS_TYPE_USER is
a user namespace tag not a completely new namespace.
The idea, very roughly, is that devices such as loop devices are
ultimately filtered by user namespace, which is taken from the
s_user_ns that the loopfs instance is mounted in.

We should compare notes.

Christian
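
To make the tagging idea concrete, here is a rough userspace sketch. It is
not code from the loopfs patchset and not kernel code. Every toy_* name is
hypothetical, and the exact pointer match used for visibility is a
simplifying assumption. It only models "tag each device with the user
namespace of the mount that created it, then filter by that tag".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct user_namespace. */
struct toy_user_ns {
	const char *name;
};

/*
 * Hypothetical stand-in for a mounted loopfs instance. In the kernel the
 * owning namespace would come from the superblock's s_user_ns field.
 */
struct toy_loopfs_mount {
	const struct toy_user_ns *s_user_ns;
};

/* Hypothetical loop device carrying a user namespace tag. */
struct toy_loop_dev {
	int minor;
	const struct toy_user_ns *ns_tag;
};

/* Tag a device with the user namespace of the mount that created it. */
void toy_tag_device(struct toy_loop_dev *dev,
		    const struct toy_loopfs_mount *mnt)
{
	assert(dev != NULL && mnt != NULL);
	dev->ns_tag = mnt->s_user_ns;
}

/* Assumption: a device is visible only when the tag matches exactly. */
bool toy_device_visible(const struct toy_loop_dev *dev,
			const struct toy_user_ns *viewer)
{
	return dev->ns_tag == viewer;
}

/* Count the devices a viewer in the given user namespace would see. */
size_t toy_count_visible(const struct toy_loop_dev *devs, size_t n,
			 const struct toy_user_ns *viewer)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		if (toy_device_visible(&devs[i], viewer))
			count++;
	}
	return count;
}
```

With one device created through a host mount and two through a container
mount, a viewer in the container's user namespace would see only its own
two devices in this model. The tag identifies an existing user namespace;
it does not introduce a new namespace type.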