From mboxrd@z Thu Jan 1 00:00:00 1970
From: ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org (Eric W. Biederman)
Subject: Re: [RFC][PATCH 0/9] Make containers kernel objects
Date: Tue, 23 May 2017 09:56:19 -0500
Message-ID: <87zie3mxkc.fsf__2244.0422130332$1495551787$gmane$org@xmission.com>
References: <149547014649.10599.12025037906646164347.stgit@warthog.procyon.org.uk>
 <1495472039.2757.19.camel@HansenPartnership.com>
 <2446.1495551216@warthog.procyon.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <2446.1495551216-S6HVgzuS8uM4Awkfq6JHfwNdhmdF6hFW@public.gmane.org>
 (David Howells's message of "Tue, 23 May 2017 15:53:36 +0100")
Sender: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Errors-To: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
To: David Howells
Cc: mszeredi-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
 linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
 Linux Containers,
 linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 James Bottomley,
 viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org,
 cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 trondmy-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org
List-Id: containers.vger.kernel.org

David Howells writes:

> Aleksa Sarai wrote:
>
>> >> The reason I think this is necessary is that the kernel has no idea
>> >> how to direct upcalls to what userspace considers to be a container -
>> >> current Linux practice appears to make a "container" just an
>> >> arbitrarily chosen junction of namespaces, control groups and files,
>> >> which may be changed individually within the "container".
>>
>> Just want to point out that if the kernel APIs for containers massively
>> change, then the OCI will have to completely rework how we describe containers
>> (and so will all existing runtimes).
>>
>> Not to mention that while I don't like how hard it is (from a runtime
>> perspective) to actually set up a container securely, there are undoubtedly
>> benefits to having namespaces split out. The network namespace being separate
>> means that in certain contexts you actually don't want to create a new network
>> namespace when creating a container.
>
> Yep, I quite agree.
>
> However, certain things need to be made per-net namespace that *aren't*. DNS
> results, for instance.
>
> As an example, I could set up a client machine with two ethernet ports, set up
> two DNS+NFS servers, each of which think they're called "foo.bar" and attach
> each server to a different port on the client machine. Then I could create a
> pair of containers on the client machine and route the network in each
> container to a different port. Now there's a problem because the names of the
> cached DNS records for each port overlap.

Please look at ip netns add. It does solve this in userspace rather
simply.

> Further, the NFS idmapper needs to be able to direct its calls to the
> appropriate network.

Eric
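
For readers unfamiliar with the ip netns convention referred to in the
reply: ip-netns(8) documents that per-namespace configuration files live
under /etc/netns/NAME/ and take precedence over the global copies in /etc/,
and that "ip netns exec" bind-mounts them into place for programs that are
not namespace-aware. The sketch below only illustrates that lookup order for
a namespace-aware program. It is not code from this thread; the function
name resolve_netns_config, the etc_root parameter, and the namespace names
"port0" and "port1" are hypothetical and chosen for illustration.

```python
import os
import tempfile


def resolve_netns_config(netns_name, filename, etc_root="/etc"):
    """Return the config path to read for a given named network namespace.

    Lookup order follows the ip-netns(8) convention:
      1. <etc_root>/netns/<netns_name>/<filename>  (per-namespace copy)
      2. <etc_root>/<filename>                     (global fallback)
    etc_root is a hypothetical parameter so the lookup can be tried
    against a scratch directory instead of the real /etc.
    """
    per_ns = os.path.join(etc_root, "netns", netns_name, filename)
    if os.path.exists(per_ns):
        return per_ns
    return os.path.join(etc_root, filename)


if __name__ == "__main__":
    # Build a scratch tree: a global resolv.conf plus one for "port0" only.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "netns", "port0"))
        for path in ("resolv.conf", os.path.join("netns", "port0", "resolv.conf")):
            with open(os.path.join(root, path), "w") as fh:
                fh.write("nameserver 192.0.2.1\n")  # documentation address

        # "port0" has its own file; "port1" falls back to the global one.
        print(resolve_netns_config("port0", "resolv.conf", etc_root=root))
        print(resolve_netns_config("port1", "resolv.conf", etc_root=root))
```

With one resolv.conf per namespace, the two "foo.bar" servers in the example
above would each be looked up through a different resolver configuration.
Whether that also covers kernel-side DNS caching is the point under
discussion in the thread, and the sketch does not address it.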