From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S968503AbdEWQN2 (ORCPT ); Tue, 23 May 2017 12:13:28 -0400
Received: from mx1.redhat.com ([209.132.183.28]:57358 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1757989AbdEWQN0 (ORCPT ); Tue, 23 May 2017 12:13:26 -0400
DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 148D53D956
Authentication-Results: ext-mx06.extmail.prod.ext.phx2.redhat.com;
	dmarc=none (p=none dis=none) header.from=redhat.com
Authentication-Results: ext-mx06.extmail.prod.ext.phx2.redhat.com;
	spf=pass smtp.mailfrom=dhowells@redhat.com
DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 148D53D956
Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley
	Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United
	Kingdom. Registered in England and Wales under Company Registration
	No. 3798903
From: David Howells
In-Reply-To: <87lgpoww67.fsf@xmission.com>
References: <87lgpoww67.fsf@xmission.com>
	<149547014649.10599.12025037906646164347.stgit@warthog.procyon.org.uk>
To: ebiederm@xmission.com (Eric W. Biederman)
Cc: dhowells@redhat.com, trondmy@primarydata.com, mszeredi@redhat.com,
	linux-nfs@vger.kernel.org, jlayton@redhat.com,
	linux-kernel@vger.kernel.org, viro@zeniv.linux.org.uk,
	linux-fsdevel@vger.kernel.org, cgroups@vger.kernel.org
Subject: Re: [RFC][PATCH 0/9] Make containers kernel objects
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <3611.1495555998.1@warthog.procyon.org.uk>
Date: Tue, 23 May 2017 17:13:18 +0100
Message-ID: <3612.1495555998@warthog.procyon.org.uk>
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
	(mx1.redhat.com [10.5.110.30]); Tue, 23 May 2017 16:13:26 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Eric W. Biederman wrote:

> Let me suggest a concrete alternative:
>
> - At the time of mount observer the mounters user namespace.

Looking at sget(), I don't think a mounter can see a superblock outside of
their namespace.  There is something icky in there whereby all automounts are
currently transferred into the init_user_ns though (something to fix in my
mount-context series) :-/

> - Find the mounters pid namespace.
> - If the mounters pid namespace is owned by the mounters user namespace
>   walk up the pid namespace tree to the first pid namespace owned by
>   that user namespace.
> - If the mounters pid namespace is not owned by the mounters user
>   namespace fail the mount it is going to need to make upcalls as
>   will not be possible.

Take the following scenario:

 (1) Create a process with a new network namespace.  Set up the network to
     route out of ethernet port 1.

 (2) Create a child process with new network and user namespaces.  Set up the
     network to route out of ethernet port 2.

 (3) Mount an NFS volume in the process created in (2).

The mount in (3) will fail unconditionally.

> - Hold a reference to the pid namespace that was found.

Take the following scenario:

 (1) Create a process with new network and pid namespaces.  Set up the
     network to route out of ethernet port 1.

 (2) Create a child process with new network and pid namespaces.  Set up the
     network to route out of ethernet port 2.

 (3) Mount an NFS volume in the process created in (2).

 (4) Create another child process with new network and pid namespaces.  Set
     up the network to route out of ethernet port 3.

 (5) In the process created in (4), access the NFS volume created in (3).

The user namespace is the same all the way through.

Now you're holding a ref to the pid namespace created in (1) - but that is of
no use to you.  The upcall must take place in the network namespace that
routes out through port 2.

David
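
The two scenarios above can be walked through with a small toy model.  This
is only a sketch in Python, not kernel code: the UserNS, PidNS and Proc
classes and the pid_ns_for_upcalls() helper are hypothetical names made up
for illustration, network namespaces are reduced to a label, and the "walk
up" step is modelled as continuing to the topmost ancestor with the same
owner, which is one possible reading of the proposal.  It also assumes the
process in step (4) is a child of the process from step (1).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class UserNS:
    name: str


@dataclass(eq=False)
class PidNS:
    name: str
    owner: UserNS                      # user namespace that owns this pid namespace
    parent: Optional["PidNS"] = None   # None for the initial pid namespace


@dataclass(eq=False)
class Proc:
    user_ns: UserNS
    pid_ns: PidNS
    net_ns: str                        # only a label here, e.g. "port 2"


def pid_ns_for_upcalls(mounter: Proc) -> Optional[PidNS]:
    """Model of the quoted rule: return the pid namespace that would be
    pinned at mount time, or None if the mount would be refused."""
    ns = mounter.pid_ns
    if ns.owner is not mounter.user_ns:
        return None                    # pid ns not owned by mounter's user ns
    # Walk up while the parent is still owned by the mounter's user ns.
    while ns.parent is not None and ns.parent.owner is mounter.user_ns:
        ns = ns.parent
    return ns


init_user = UserNS("init_user_ns")
init_pid = PidNS("init_pid_ns", init_user)

# First scenario: new network and user namespaces, pid namespace unchanged.
user2 = UserNS("user2")
mounter_a = Proc(user_ns=user2, pid_ns=init_pid, net_ns="port 2")
result_a = pid_ns_for_upcalls(mounter_a)

# Second scenario: nested pid namespaces, same user namespace throughout.
pid1 = PidNS("pid1", init_user, init_pid)
pid2 = PidNS("pid2", init_user, pid1)
pid3 = PidNS("pid3", init_user, pid1)  # assumption: sibling of pid2
mounter_b = Proc(user_ns=init_user, pid_ns=pid2, net_ns="port 2")
accessor = Proc(user_ns=init_user, pid_ns=pid3, net_ns="port 3")
held = pid_ns_for_upcalls(mounter_b)

print("first scenario:", result_a)
print("second scenario holds:", held.name, "but mount was made via", mounter_b.net_ns)
```

In this model the first mount is refused, and in the second case the pinned
pid namespace is an ancestor shared by the mounter and the later accessor, so
by itself it does not identify the network namespace the mount was made in.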