From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from out02.mta.xmission.com ([166.70.13.232]:44580 "EHLO
	out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750741AbcGXFEX (ORCPT );
	Sun, 24 Jul 2016 01:04:23 -0400
From: ebiederm@xmission.com (Eric W. Biederman)
To: "W. Trevor King"
Cc: James Bottomley, Andrey Vagin, Serge Hallyn,
	linux-api@vger.kernel.org, containers@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, Alexander Viro, criu@openvz.org,
	linux-fsdevel@vger.kernel.org, "Michael Kerrisk \(man-pages\)"
References: <1468520419-28220-1-git-send-email-avagin@openvz.org>
	<20160723211414.GA25371@odin.tremily.us>
	<1469309936.2332.35.camel@HansenPartnership.com>
	<20160723215802.GO24913@odin.tremily.us>
	<87mvl8nhlv.fsf@x220.int.ebiederm.org>
	<20160723223448.GP24913@odin.tremily.us>
Date: Sat, 23 Jul 2016 23:51:07 -0500
In-Reply-To: <20160723223448.GP24913@odin.tremily.us> (W. Trevor King's
	message of "Sat, 23 Jul 2016 15:34:48 -0700")
Message-ID: <877fcboczo.fsf@x220.int.ebiederm.org>
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

"W. Trevor King" writes:

> On Sat, Jul 23, 2016 at 04:56:44PM -0500, Eric W. Biederman wrote:
>> "W. Trevor King" writes:
>> > On Sat, Jul 23, 2016 at 02:38:56PM -0700, James Bottomley wrote:
>> >> On Sat, 2016-07-23 at 14:14 -0700, W. Trevor King wrote:
>> >> > namespaces(7) and clone(2) both have:
>> >> >
>> >> >   When a network namespace is freed (i.e., when the last
>> >> >   process in the namespace terminates), its physical network
>> >> >   devices are moved back to the initial network namespace (not
>> >> >   to the parent of the process).
>> >> >
>> >> > So the initial network namespace (the head of
>> >> > net_namespace_list?) is special [1].
>> >> > To understand how
>> >> > physical network devices will be handled, it seems like we want
>> >> > to treat network devices as a depth-1 tree, with all
>> >> > non-initial net namespaces as children of the initial net
>> >> > namespace.  Can we extend this series' NS_GET_PARENT to return:
>> >> >
>> >> > * EPERM for an unprivileged caller (like this series currently
>> >> >   does for PID namespaces),
>> >> > * ENOENT when called on net_namespace_list, and
>> >> > * net_namespace_list when called on any other net namespace.
>> >>
>> >> What's the practical application of this?  independent net
>> >> namespaces are managed by the ip netns command.  It pins them by
>> >> a bind mount in a flat fashion; if we make them hierarchical the
>> >> tool would probably need updating to reflect this, so we're going
>> >> to need a reason to give the network people.  Just having the
>> >> interfaces not go back to root when you do an ip netns delete
>> >> doesn't seem very compelling.
>> >
>> > I'm not suggesting we add support for deeper nesting, I'm suggesting
>> > we use NS_GET_PARENT to allow sufficiently privileged users to
>> > determine if a given net namespace is the initial net namespace.  You
>> > could do this already with something like:
>> >
>> > 1. Create a new net namespace.
>> > 2. Add a physical network device to that namespace.
>> > 3. Delete that namespace.
>> > 4. See if the physical network device shows up in your
>> >    initial-net-namespace candidate.
>> > 5. Delete the physical network device (hopefully it ended up
>> >    somewhere you can find it ;).
>> >
>> > But using an NS_GET_PARENT call seems much safer and easier.
>>
>> Have you had the problem in practice where you can't tell which
>> network namespace is the initial network namespace.  This all seems
>> like a theoretical problem rather than a real one.
>
> I haven't had any practical problems here, I'm just trying to wrap my
> head around namespace-relationship discovery.
> The special physical
> network device handling seems a lot like init re-parenting (with no
> PR_SET_CHILD_SUBREAPER analog in a 1-deep namespace tree), so calling
> the initial network namespace a parent (and all the other namespaces
> its direct children) seems natural enough.  If that doesn't sound
> convincing, I'm happy to punt this idea until someone runs into a
> practical problem ;).

Then let's punt this until someone runs into a practical problem.

For scaling and for sanity it is desirable to keep the connections
between namespaces to a minimum.  Further the initial instances of a
namespace always tend to be a little bit special.

Eric
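
Not part of the original message: a minimal sketch of how the "is this
the initial net namespace?" question from the thread might be
approximated without any new ioctl. It relies on the comparison
described in namespaces(7), where two /proc/[pid]/ns/net links refer to
the same namespace when their device ID and inode number match. The
helper name same_namespace is hypothetical, and treating PID 1's
namespace as the initial one is an assumption that does not hold
inside a container.

```python
import os


def same_namespace(path_a, path_b):
    """Return True if both paths resolve to the same namespace object.

    Hypothetical helper: compares the (st_dev, st_ino) pair that
    stat(2) reports for each path, as namespaces(7) describes for
    /proc/[pid]/ns/* links.
    """
    stat_a = os.stat(path_a)
    stat_b = os.stat(path_b)
    return (stat_a.st_dev, stat_a.st_ino) == (stat_b.st_dev, stat_b.st_ino)


if __name__ == "__main__":
    # Assumption: PID 1 runs in the initial network namespace.
    # Reading /proc/1/ns/net usually needs privilege, so failures
    # are reported rather than raised.
    try:
        print(same_namespace("/proc/self/ns/net", "/proc/1/ns/net"))
    except OSError as err:
        print("cannot inspect namespaces:", err)
```

This only answers "same namespace as PID 1?", which is weaker than the
NS_GET_PARENT extension proposed above, but it adds no new connections
between namespaces.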