From: Mahesh Bandewar (महेश बंडेवार)
Date: Fri, 10 Nov 2017 14:05:51 +0900
Subject: Re: [PATCH resend 1/2] capability: introduce sysctl for controlled user-ns capability whitelist
To: "Serge E. Hallyn"
Cc: Mahesh Bandewar, LKML, Netdev, Kernel-hardening, Linux API, Kees Cook, "Eric W. Biederman", Eric Dumazet, David Miller
In-Reply-To: <20171110043010.GA3572@mail.hallyn.com>
References: <20171103004433.39954-1-mahesh@bandewar.net> <20171109172201.GA26229@mail.hallyn.com> <20171110043010.GA3572@mail.hallyn.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Nov 10, 2017 at 1:30 PM, Serge E. Hallyn wrote:
> Quoting Mahesh Bandewar (महेश बंडेवार) (maheshb@google.com):
> ...
>> >>
>> >> ==============================================================
>> >>
>> >> +controlled_userns_caps_whitelist
>> >> +
>> >> +Capability mask that is whitelisted for "controlled" user namespaces.
>> >> +Any capability that is missing from this mask will not be allowed to
>> >> +any process that is attached to a controlled-userns. e.g. if CAP_NET_RAW
>> >> +is not part of this mask, then processes running inside any controlled
>> >> +userns's will not be allowed to perform action that needs CAP_NET_RAW
>> >> +capability. However, processes that are attached to a parent user-ns
>> >> +hierarchy that is *not* controlled and has CAP_NET_RAW can continue
>> >> +performing those actions. User-namespaces are marked "controlled" at
>> >> +the time of their creation based on the capabilities of the creator.
>> >> +A process that does not have CAP_SYS_ADMIN will create user-namespaces
>> >> +that are controlled.
>> >
>> > Hm. I think that's fine (the way 'controlled' user namespaces are
>> > defined), but that is design decision in itself, and should perhaps be
>> > discussed.
>> >
>> > Did you consider other ways? What about using CAP_SETPCAP?
>> >
>> I did try other ways e.g. using another bounding-set etc. but
>> eventually settled with this approach because of main two properties -
>
> No, I meant did you try other ways of defining a controlled user
> namespace, other than one which is created by a task lacking
> CAP_SYS_ADMIN?
>
No. CAP_SYS_ADMIN is the capability that has been used to decide who
can or cannot create namespaces, so I didn't want to create another
model that may not be compatible with the current model, which is well
understood.

> ...
>
>> >> +The value is expressed as two comma separated hex words (u32). This
>> >
>> > Why comma separated? whitespace ok? Leading 0x ok? What is the
>> > default at boot? (Obviously the patch tells me, I'm asking for it
>> > to be spelled out in the doc)
>> >
>> I tried multiple ways including representing capabilities in
>> string/name form for better readability but didn't want to add
>> additional complexities of dealing with strings and possible
>> string-related-issues for this. Also didn't want to reinvent the new
>> form so settled with something that is widely used (cpu
>> bounding/affinity/irq mapping etc.)
>> and is capable of handling growing
>> bit set (currently 37 but possibly more later).
>
> Ok, thanks.
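
For readers following the thread, here is a small illustrative sketch of
the mask format and the whitelist check discussed above. It is not code
from the patch. It assumes the two u32 hex words are written
most-significant word first (as the cpumask-style bitmaps mentioned in
the reply are), and it takes the capability numbers from
linux/capability.h. The helper names parse_mask, format_mask and
cap_allowed are hypothetical, and the sketch does not model how the
kernel parser treats whitespace or a leading 0x.

```python
# Illustrative sketch only; not taken from the patch under discussion.
CAP_NET_RAW = 13     # value from linux/capability.h
CAP_SYS_ADMIN = 21   # value from linux/capability.h
NUM_CAPS = 37        # "currently 37" according to the thread


def parse_mask(text: str) -> int:
    """Parse 'hi,lo' (two hex u32 words, high word first: an assumption)."""
    words = text.strip().split(",")
    if len(words) != 2:
        raise ValueError("expected exactly two comma-separated hex words")
    mask = 0
    for word in words:
        value = int(word, 16)
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError(f"word out of u32 range: {word!r}")
        # Shift the words seen so far up by 32 bits and append this one.
        mask = (mask << 32) | value
    return mask


def format_mask(mask: int) -> str:
    """Format a capability mask as two comma-separated hex words."""
    hi = (mask >> 32) & 0xFFFFFFFF
    lo = mask & 0xFFFFFFFF
    return f"{hi:x},{lo:08x}"


def cap_allowed(mask: int, cap: int, controlled: bool) -> bool:
    """Model of the documented rule: in a controlled user namespace a
    capability is usable only if its bit is set in the whitelist;
    namespaces that are not controlled are unaffected by the mask."""
    if not controlled:
        return True
    return bool((mask >> cap) & 1)


if __name__ == "__main__":
    all_caps = (1 << NUM_CAPS) - 1
    no_net_raw = all_caps & ~(1 << CAP_NET_RAW)
    print(format_mask(all_caps))
    print(format_mask(no_net_raw))
    print(cap_allowed(no_net_raw, CAP_NET_RAW, controlled=True))
```

With 37 capabilities the low word covers bits 0 to 31 and the high word
carries the remaining five bits, which is why a mask with every
capability set needs two words.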