From mboxrd@z Thu Jan 1 00:00:00 1970
From: Colin Walters
Subject: Re: [PATCH v2 00/10] userns: sysctl limits for namespaces
Date: Fri, 22 Jul 2016 09:33:19 -0400
Message-ID: <1469194399.3817016.673814953.7581706C__32379.2670890272$1469194783$gmane$org@webmail.messagingengine.com>
References: <8737n5dscy.fsf@x220.int.ebiederm.org> <87d1m754jc.fsf@x220.int.ebiederm.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <87d1m754jc.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
Sender: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Errors-To: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
To: "Eric W. Biederman", Linux Containers
Cc: Kees Cook, netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Andy Lutomirski,
 Seth Forshee, Nikolay Borisov,
 linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Jann Horn
List-Id: containers.vger.kernel.org

On Thu, Jul 21, 2016, at 12:39 PM, Eric W. Biederman wrote:
>
> This patchset addresses two use cases:
> - Implement a sane upper bound on the number of namespaces.
> - Provide a way for sandboxes to limit the attack surface from
>   namespaces.

Perhaps this is obvious, but since you didn't quite explicitly state it;
do you see this as obsoleting the existing downstream patches
mentioned in:
https://lwn.net/Articles/673597/

It seems conceptually similar to Kees' original approach, right?

The high level makes sense to me...most interesting is per-userns
sysctls.

I'll note most current container managers mount /proc/sys read-only,
and Docker specifically drops CAP_SYS_RESOURCE by default, so they'd
likely need to learn how to undo that if one wanted to support
recursive container usage.
We'd probably need to evaluate the safety of having /proc/sys writable
generally.

(Also it's rather common to filter out CLONE_NEWUSER via seccomp, but
that's easy to undo)

But that's the flip side - if we're aiming primarily for an
upstreamable way to *limit* namespace usage, it seems sane to me.
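
For anyone wanting to check the two container defaults mentioned above
(a read-only /proc/sys and a dropped CAP_SYS_RESOURCE), here is a
minimal sketch. The function names are hypothetical and chosen only for
illustration. It assumes the text passed in comes from
/proc/self/mountinfo and /proc/self/status, and it only detects a
dedicated mount on /proc/sys, not a read-only /proc as a whole.

```python
# CAP_SYS_RESOURCE is capability number 24 in <linux/capability.h>.
CAP_SYS_RESOURCE = 24


def proc_sys_is_readonly(mountinfo_text):
    """Return True if the most recent mount on /proc/sys has the 'ro' option.

    Returns False when /proc/sys is not a separate mount or is mounted rw.
    """
    readonly = False
    for line in mountinfo_text.splitlines():
        fields = line.split()
        # mountinfo fields: id, parent id, major:minor, root,
        # mount point, per-mount options, ...
        if len(fields) >= 6 and fields[4] == "/proc/sys":
            # Later lines describe mounts stacked on top, so the last one wins.
            readonly = "ro" in fields[5].split(",")
    return readonly


def has_capability(status_text, cap_number):
    """Check one bit of the CapEff mask found in /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            mask = int(line.split()[1], 16)
            return bool(mask & (1 << cap_number))
    return False
```

Inside a container, reading /proc/self/mountinfo and /proc/self/status
and passing their contents to these two helpers gives a quick view of
what a manager would have to undo before nested namespace sysctls could
be written.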