In-Reply-To: <20171106221418.GA32543@mail.hallyn.com>
References: <20171103004436.40026-1-mahesh@bandewar.net>
 <20171104235346.GA17170@mail.hallyn.com>
 <20171106150302.GA26634@mail.hallyn.com>
 <1510003994.736.0.camel@gmail.com>
 <20171106221418.GA32543@mail.hallyn.com>
From: Boris Lukashev
Date: Mon, 6 Nov 2017 18:17:21 -0500
Subject: Re: [kernel-hardening] Re: [PATCH resend 2/2] userns: control capabilities of some user namespaces
To: "Serge E. Hallyn"
Cc: Daniel Micay, Mahesh Bandewar (महेश बंडेवार), Mahesh Bandewar, LKML,
 Netdev, Kernel-hardening, Linux API, Kees Cook, "Eric W . Biederman",
 Eric Dumazet, David Miller
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Nov 6, 2017 at 5:14 PM, Serge E. Hallyn wrote:
> Quoting Daniel Micay (danielmicay@gmail.com):
>> Substantial added attack surface will never go away as a problem. There
>> aren't a finite number of vulnerabilities to be found.
>
> There's varying levels of usefulness and quality. There is code which I
> want to be able to use in a container, and code which I can't ever see a
> reason for using there. The latter, especially if it's also in a
> staging driver, would be nice to have a toggle to disable.
>
> You're not advocating dropping the added attack surface, only adding a
> way of dealing with an 0day after the fact. Privilege raising 0days can
> exist anywhere, not just in code which only root in a user namespace can
> exercise. So from that point of view, ksplice seems a more complete
> solution. Why not just actually fix the bad code block when we know
> about it?
>
> Finally, it has been well argued that you can gain many new caps from
> having only a few others. Given that, how could you ever be sure that,
> if an 0day is found which allows root in a user ns to abuse
> CAP_NET_ADMIN against the host, just keeping CAP_NET_ADMIN from them
> would suffice? It seems to me that the existing control in
> /proc/sys/kernel/unprivileged_userns_clone might be the better duct tape
> in that case.
>
> -serge

This seems to be heading toward "we need full zones in Linux" with
their own procfs and sysfs namespace and a stricter isolation model
for resources and capabilities. So long as things can happen in a
namespace which have a privileged relationship with host resources,
this is going to be cat-and-mouse to one degree or another.

Containers and namespaces don't have a one-to-one relationship, so I'm
not sure that's the best term to use in the kernel security context
since there's a bunch of userspace and implementation delta across the
different systems (with their own security models and so forth).
Without accounting for what a specific implementation may or may not
do, and only looking at "how do we reduce privileged impact on parent
context from unprivileged namespaces," this patch does seem to provide
a logical way of reducing the privileges available in such a namespace
and often needed to mount escapes/impact parent context.

-Boris

--
Boris Lukashev
Systems Architect
Semper Victus