From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752434AbdGMRF6 (ORCPT <rfc822;w@1wt.eu>);
        Thu, 13 Jul 2017 13:05:58 -0400
Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:37949 "EHLO
        mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL)
        by vger.kernel.org with ESMTP id S1750881AbdGMRFy (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Thu, 13 Jul 2017 13:05:54 -0400
Subject: Re: [PATCH v2] xattr: Enable security.capability in user namespaces
To: "Theodore Ts'o" <tytso@mit.edu>,
        "Eric W. Biederman" <ebiederm@xmission.com>,
        "Serge E. Hallyn" <serge@hallyn.com>,
        containers@lists.linux-foundation.org, lkp@01.org,
        linux-kernel@vger.kernel.org, zohar@linux.vnet.ibm.com,
        tycho@docker.com, James.Bottomley@HansenPartnership.com,
        vgoyal@redhat.com, christian.brauner@mailbox.org, amir73il@gmail.com,
        linux-security-module@vger.kernel.org, casey@schaufler-ca.com
References: <1499785511-17192-1-git-send-email-stefanb@linux.vnet.ibm.com>
 <1499785511-17192-2-git-send-email-stefanb@linux.vnet.ibm.com>
 <87mv89iy7q.fsf@xmission.com> <20170712170346.GA17974@mail.hallyn.com>
 <877ezdgsey.fsf@xmission.com>
 <74664cc8-bc3e-75d6-5892-f8934404349f@linux.vnet.ibm.com>
 <20170713011554.xwmrgkzfwnibvgcu@thunk.org> <87y3rscz9j.fsf@xmission.com>
 <20170713164012.brj2flnkaaks2oci@thunk.org>
From: Stefan Berger <stefanb@linux.vnet.ibm.com>
Date: Thu, 13 Jul 2017 13:05:47 -0400
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.4.0
MIME-Version: 1.0
In-Reply-To: <20170713164012.brj2flnkaaks2oci@thunk.org>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-TM-AS-GCONF: 00
x-cbid: 17071317-0004-0000-0000-000012936BEA
X-IBM-SpamModules-Scores: 
X-IBM-SpamModules-Versions: BY=3.00007361; HX=3.00000241; KW=3.00000007;
 PH=3.00000004; SC=3.00000214; SDB=6.00887050; UDB=6.00442849; IPR=6.00667190;
 BA=6.00005470; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000;
 ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00016215; XFM=3.00000015;
 UTC=2017-07-13 17:05:52
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
x-cbparentid: 17071317-0005-0000-0000-000080378CA9
Message-Id: <29fdda5e-ed4a-bcda-e3cc-c06ab87973ce@linux.vnet.ibm.com>
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2017-07-13_09:,,
 signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0
 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam
 adjust=0 reason=mlx scancount=1 engine=8.0.1-1706020000
 definitions=main-1707130269
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/13/2017 12:40 PM, Theodore Ts'o wrote:
> On Thu, Jul 13, 2017 at 07:11:36AM -0500, Eric W. Biederman wrote:
>> The concise summary:
>>
>> Today we have the xattr security.capability that holds a set of
>> capabilities that an application gains when executed.  AKA setuid root exec
>> without actually being setuid root.
>>
>> User namespaces have the concept of capabilities that are not global but
>> are limited to their user namespace.  We do not currently have
>> filesystem support for this concept.
> So correct me if I am wrong; in general, there will only be one
> variant of the form:
>
>     security.foo@uid=15000
>
> It's not like there will be:
>
>     security.foo@uid=1000
>     security.foo@uid=2000

A file shared by 2 containers, one mapping root to uid=1000, the other 
mapping root to uid=2000, will show these two xattrs on the host 
(init_user_ns) once these containers set xattrs on that file.
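
As a rough illustration (a Python sketch, not kernel code; host_name and
the dict standing in for the file's xattrs are made up for this example),
the host-side view after both container root users set security.foo
could be modeled like this:

```python
def host_name(name: str, root_host_uid: int) -> str:
    """Hypothetical helper: host-side xattr name written by a container root.

    Assumes the naming discussed in this thread, where the host uid that
    the container's root maps to is appended as an "@uid=" suffix.
    """
    return f"{name}@uid={root_host_uid}"


# Stand-in for the xattrs stored on one shared file, as seen from init_user_ns.
file_xattrs = {}

# Two containers share the file; their root users map to host uid 1000 and 2000.
for root_host_uid in (1000, 2000):
    file_xattrs[host_name("security.foo", root_host_uid)] = b"caps"
```

Both entries end up on the same inode, so the host sees one xattr per
container that has written its own value.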

>
> Except.... if you have a distribution root directory which is shared
> by many containers, you would need to put the xattrs in the overlay
> inodes.  Worse, each time you launch a new container, with a new
> subuid allocation, you will have to iterate over all files with
> capabilities and do a copy-up operation on the xattrs in overlayfs.
> So that's actually a bit of a disaster.

Note that we do keep compatibility with existing behavior. The 
security.foo of the host is visible inside any container for as long as 
the container root user doesn't set its own security.foo on that file, 
which then hides it. Does that address this concern?
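
A minimal sketch of that lookup order, in Python rather than kernel C
(visible_value and the dict of xattrs are assumptions made for
illustration, not the patch's actual code):

```python
from typing import Dict, Optional


def visible_value(xattrs: Dict[str, bytes], name: str,
                  root_host_uid: int) -> Optional[bytes]:
    """Hypothetical model of which value a container sees for `name`.

    `xattrs` stands in for the host-side xattrs of one file, and
    `root_host_uid` is the host uid that the container's root maps to.
    """
    own = f"{name}@uid={root_host_uid}"
    if own in xattrs:
        # The container root set its own value, which hides the host's.
        return xattrs[own]
    # Otherwise fall back to the host's plain xattr, if there is one.
    return xattrs.get(name)
```

For a file carrying only the host's security.foo, every container gets
the host value; once a container root writes its own, only that
container's view changes.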


>
> So for distribution overlays, you will need to do things a different
> way, which is to map the distro subdirectory so you know that the
> capability with the global uid 0 should be used for the container
> "root" uid, right?
>
> So this hack of using security.foo@uid=1000 is *only* useful when the
> subcontainer root wants to create the privileged executable.  You
> still have to do things the other way.
>
> So can we make perhaps the assertion that *either*:
>
>     security.foo
>
> exists, *or*
>
>     security.foo@uid=BAR
>
> exists, but never both?  And there BAR is exclusive to only one
> instance?

In the current implementation BAR is visible inside any instance whose 
mapping range 'covers' this uid. In the example above, 
security.foo@uid=1000 appears as security.foo inside the container whose 
root maps to uid 1000 (@uid=0 is suppressed), but it also appears as 
security.foo@uid=100 inside a container whose root maps to uid 900 
(given a range of at least 101).
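
The arithmetic can be sketched in a few lines of Python. This is only a
model of the behavior described here, not the kernel implementation;
container_view, host_base and count are names invented for the example,
and a single contiguous uid mapping is assumed:

```python
import re
from typing import Optional

# Matches host-side names of the form "<base>@uid=<host uid>".
_SUFFIX = re.compile(r"^(?P<base>[^@]+)@uid=(?P<uid>\d+)$")


def container_view(host_name: str, host_base: int, count: int) -> Optional[str]:
    """Name seen inside a container, or None if the uid is not covered.

    The container's uid 0 is assumed to map to host uid `host_base`,
    with `count` consecutive uids in the mapping range.
    """
    m = _SUFFIX.match(host_name)
    if m is None:
        # Plain host xattr without a suffix; hiding is not modeled here.
        return host_name
    host_uid = int(m.group("uid"))
    if not (host_base <= host_uid < host_base + count):
        return None  # outside the mapping range, so not visible
    inner_uid = host_uid - host_base
    if inner_uid == 0:
        # "@uid=0" is suppressed: the container root sees the bare name.
        return m.group("base")
    return f"{m.group('base')}@uid={inner_uid}"
```

With a range of only 100 starting at uid 900, host uid 1000 falls just
outside the mapping and the xattr is not visible in that container at
all.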

>
> Otherwise, I suspect that the architecture is going to turn around and
> bite us in the *ss eventually, because someone will want to do
> something crazy and the solution will not be scalable.

Can you define what 'scalable' means for you in this context?
As far as I can see, sharing a filesystem between multiple containers 
doesn't 'scale well' for virtualizing the xattrs, primarily because of 
the size limits on xattrs per file.

      Stefan

>
> 	  	    		      	   -Ted
>
