* SELinux leads to soft lockup when pid 1 process reaps child
       [not found]   ` <b7f75f65-592a-5102-0ac5-4d3aa43f0b55@huawei.com>
@ 2017-01-09 10:51     ` yangshukui
  2017-01-09 18:12       ` Oleg Nesterov
  2017-03-09  9:03         ` yangshukui
  0 siblings, 2 replies; 22+ messages in thread
From: yangshukui @ 2017-01-09 10:51 UTC (permalink / raw)
  To: selinux, linux-security-module, linux-kernel
  Cc: Kefeng Wang, Guohanjun (Hanjun Guo), 'Qiang Huang',
	Lizefan, miaoxie (A),
	Zhangdianfang, paul, sds, eparis, james.l.morris, oleg, ebiederm,
	serge.hallyn

A pid 1 process with type init_t has the right to reap children on the host,
but the pid 1 process in a container (for example one running as spc_t, which
docker uses as the container's default type) may not have the right to reap
children. When that happens, it leads to a soft lockup. The following steps
reproduce it:

docker run -ti --rm -v /sys/fs/selinux:/sys/fs/selinux fedora:20 bash
[root@b755018fb526 /]# yum install selinux-policy-targeted selinux-policy-devel perl-Test-Harness gcc libselinux-devel net-tools netlabel_tools iptables git cpan
[root@b755018fb526 /]# git clone https://github.com/SELinuxProject/selinux-testsuite.git
[root@b755018fb526 /]# setenforce 0
[root@b755018fb526 /]# runcon -t unconfined_t bash
[root@b755018fb526 /]# genhomedircon
[root@b755018fb526 /]# restorecon -R /
[root@b755018fb526 /]# setenforce 1
[root@b755018fb526 /]# cd /root/selinux-testsuite/
[root@b755018fb526 selinux-testsuite]# make -C policy load
[root@b755018fb526 selinux-testsuite]# make -C tests test
[root@b755018fb526 selinux-testsuite]# exit  #this will lead to soft lockup

Before exiting the container, we can also see some zombies:
[root@b755018fb526 selinux-testsuite]# ps -eafZ
LABEL                           UID        PID  PPID  C STIME TTY          TIME CMD
...
unconfined_u:unconfined_r:test_fdreceive_server_t:s0 root 215 1  0 05:35 pts/0 00:00:00 [server] <defunct>
unconfined_u:unconfined_r:test_ptrace_traced_t:s0 root 291 1  0 05:35 pts/0 00:00:00 [wait] <defunct>
unconfined_u:unconfined_r:test_setnice_set_t:s0 root 374 1  0 05:35 pts/0 00:00:00 [child] <defunct>

In the kernel code:
zap_pid_ns_processes {
       ...
       /* Firstly reap the EXIT_ZOMBIE children we may have. */
       do {
           clear_thread_flag(TIF_SIGPENDING);
           rc = sys_wait4(-1, NULL, __WALL, NULL);
           /*
            * sys_wait4 -> do_wait -> wait_consider_task
            *   -> security_task_wait -> selinux_task_wait
            *   -> avc_has_perm_flags -> avc_has_perm_noaudit -> avc_denied
            *
            * The return value is -EACCES, so the loop never sees the
            * expected -ECHILD and spins forever.
            */
       } while (rc != -ECHILD);
}
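
To make the failure mode concrete, here is a minimal user-space sketch of the
loop above. It is not kernel code: fake_ns, fake_wait4() and
reap_loop_spins() are hypothetical names invented for illustration, and the
iteration cap exists only so the model terminates. It assumes, as described
above, that the wait call keeps returning -EACCES while a child that the hook
refuses to release is still present.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the zombies left in the namespace. */
struct fake_ns {
    int reapable;   /* zombies the security hook lets pid 1 reap */
    int denied;     /* zombies for which the hook returns -EACCES */
};

/* Toy model of sys_wait4(-1, ...) as seen by the exiting init. */
int fake_wait4(struct fake_ns *ns)
{
    if (ns->reapable > 0) {
        ns->reapable--;
        return 1;           /* reaped one child (a pid in the real call) */
    }
    if (ns->denied > 0)
        return -EACCES;     /* child exists but can never be reaped */
    return -ECHILD;         /* no children left: the only exit condition */
}

/*
 * Same shape as the do/while loop above, plus an iteration cap.
 * Returns true if the cap was hit without ever seeing -ECHILD,
 * which is the case where the real loop would spin forever.
 */
bool reap_loop_spins(struct fake_ns *ns, long max_iters)
{
    long i = 0;
    int rc;

    do {
        rc = fake_wait4(ns);
        if (++i >= max_iters)
            return rc != -ECHILD;
    } while (rc != -ECHILD);
    return false;
}
```

With two reapable zombies and none denied, reap_loop_spins() returns false
after three calls. With a single denied zombie in the mix, it only stops
because of the cap.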

I have a hack like this,
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 57a2020..c10c58c 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3596,6 +3596,9 @@ static int selinux_task_kill(struct task_struct *p, struct siginfo *info,

  static int selinux_task_wait(struct task_struct *p)
  {
+       if (pid_vnr(task_tgid(current)) == 1){
+                return 0;
+       }
         return task_has_perm(p, current, PROCESS__SIGCHLD);
  }
It works, but it lets the pid 1 process reap children without any SELinux
check. Is there a better way to handle this problem?


* Re: SELinux leads to soft lockup when pid 1 process reaps child
  2017-01-09 10:51     ` SELinux leads to soft lockup when pid 1 process reaps child yangshukui
@ 2017-01-09 18:12       ` Oleg Nesterov
  2017-01-09 18:29         ` Oleg Nesterov
  2017-03-09  9:03         ` yangshukui
  1 sibling, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2017-01-09 18:12 UTC (permalink / raw)
  To: yangshukui
  Cc: selinux, linux-security-module, linux-kernel, Kefeng Wang,
	Guohanjun (Hanjun Guo), 'Qiang Huang',
	Lizefan, miaoxie (A),
	Zhangdianfang, paul, sds, eparis, james.l.morris, ebiederm,
	serge.hallyn

On 01/09, yangshukui wrote:
>
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -3596,6 +3596,9 @@ static int selinux_task_kill(struct task_struct *p,
> struct siginfo *info,
>
>  static int selinux_task_wait(struct task_struct *p)
>  {
> +       if (pid_vnr(task_tgid(current)) == 1){
> +                return 0;

this check is not really correct, it can be a sub-thread... Doesn't matter,
please see below.

> +       }
>         return task_has_perm(p, current, PROCESS__SIGCHLD);
>  }
> It work but it permit pid 1 process to reap child without selinux check. Can
> we have a better way to handle this problem?

I never understood why security_task_wait() should refuse to let a parent reap
a child. But since it can, we probably want some explicit "the whole namespace
goes away" check. We could use, say, PIDNS_HASH_ADDING, but I'd suggest
something like the trivial change below for now.

Eric, what do you think?

Oleg.

diff --git a/security/security.c b/security/security.c
index f825304..1330b4e 100644
--- a/security/security.c
+++ b/security/security.c
@@ -1027,6 +1027,9 @@ int security_task_kill(struct task_struct *p, struct siginfo *info,
 
 int security_task_wait(struct task_struct *p)
 {
+	/* must be the exiting child reaper */
+	if (unlikely(current->flags & PF_EXITING))
+		return 0;
 	return call_int_hook(task_wait, 0, p);
 }
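
The effect of this change can be modelled in user space. The sketch below is
only an illustration under stated assumptions: fake_task, FAKE_PF_EXITING,
deny_all_hook() and fake_security_task_wait() are hypothetical names, and the
flag value is arbitrary. It shows that an exiting caller bypasses the hook
while any other caller still gets the hook's verdict.

```c
#include <errno.h>

/* Illustrative flag bit; stands in for the kernel's PF_EXITING. */
#define FAKE_PF_EXITING 0x4u

struct fake_task {
    unsigned int flags;
};

/* Hypothetical LSM hook that refuses every reap request. */
int deny_all_hook(const struct fake_task *child)
{
    (void)child;
    return -EACCES;
}

/*
 * Model of the patched wrapper: an exiting caller skips the hook
 * entirely, everyone else still goes through it.
 */
int fake_security_task_wait(const struct fake_task *caller,
                            const struct fake_task *child,
                            int (*hook)(const struct fake_task *))
{
    if (caller->flags & FAKE_PF_EXITING)
        return 0;
    return hook(child);
}
```

In this model a running parent is still denied, while the exiting child
reaper gets 0 and can drain its zombies until the wait call reports -ECHILD.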
 

* Re: SELinux leads to soft lockup when pid 1 process reaps child
  2017-01-09 18:12       ` Oleg Nesterov
@ 2017-01-09 18:29         ` Oleg Nesterov
  2017-01-09 18:43           ` Stephen Smalley
  0 siblings, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2017-01-09 18:29 UTC (permalink / raw)
  To: yangshukui
  Cc: selinux, linux-security-module, linux-kernel, Kefeng Wang,
	Guohanjun (Hanjun Guo), 'Qiang Huang',
	Lizefan, miaoxie (A),
	Zhangdianfang, paul, sds, eparis, james.l.morris, ebiederm,
	serge.hallyn

Seriously, could someone explain why we need the security_task_wait()
hook at all?


On 01/09, Oleg Nesterov wrote:
>
> On 01/09, yangshukui wrote:
> >
> > --- a/security/selinux/hooks.c
> > +++ b/security/selinux/hooks.c
> > @@ -3596,6 +3596,9 @@ static int selinux_task_kill(struct task_struct *p,
> > struct siginfo *info,
> >
> >  static int selinux_task_wait(struct task_struct *p)
> >  {
> > +       if (pid_vnr(task_tgid(current)) == 1){
> > +                return 0;
> 
> this check is not really correct, it can be a sub-thread... Doesn't matter,
> please see below.
> 
> > +       }
> >         return task_has_perm(p, current, PROCESS__SIGCHLD);
> >  }
> > It work but it permit pid 1 process to reap child without selinux check. Can
> > we have a better way to handle this problem?
> 
> I never understood why security_task_wait() should deny to reap a child. But
> since it can we probably want some explicit "the whole namespace goes away" check.
> We could use, say, PIDNS_HASH_ADDING but I'd suggest something like a trivial change
> below for now.
> 
> Eric, what do you think?
> 
> Oleg.
> 
> diff --git a/security/security.c b/security/security.c
> index f825304..1330b4e 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -1027,6 +1027,9 @@ int security_task_kill(struct task_struct *p, struct siginfo *info,
>  
>  int security_task_wait(struct task_struct *p)
>  {
> +	/* must be the exiting child reaper */
> +	if (unlikely(current->flags & PF_EXITING))
> +		return 0;
>  	return call_int_hook(task_wait, 0, p);
>  }
>  

* Re: SELinux leads to soft lockup when pid 1 process reaps child
  2017-01-09 18:29         ` Oleg Nesterov
@ 2017-01-09 18:43           ` Stephen Smalley
  2017-01-09 23:49             ` Paul Moore
  2017-01-10  0:26             ` Casey Schaufler
  0 siblings, 2 replies; 22+ messages in thread
From: Stephen Smalley @ 2017-01-09 18:43 UTC (permalink / raw)
  To: Oleg Nesterov, yangshukui
  Cc: selinux, linux-security-module, linux-kernel, Kefeng Wang,
	Guohanjun (Hanjun Guo), 'Qiang Huang',
	Lizefan, miaoxie (A),
	Zhangdianfang, paul, eparis, james.l.morris, ebiederm,
	serge.hallyn

On Mon, 2017-01-09 at 19:29 +0100, Oleg Nesterov wrote:
> Seriously, could someone explain why do we need the
> security_task_wait()
> hook at all?

I would be ok with killing it.
IIRC, the original motivation was to block an unauthorized data flow
from child to parent when the child context differs, but part of that
original design was also to reparent the child automatically, and that
was never implemented.  I don't think there is a real use case for it
in practice and it just breaks things, so let's get rid of it unless
someone objects.

> 
> 
> On 01/09, Oleg Nesterov wrote:
> > 
> > 
> > On 01/09, yangshukui wrote:
> > > 
> > > 
> > > --- a/security/selinux/hooks.c
> > > +++ b/security/selinux/hooks.c
> > > @@ -3596,6 +3596,9 @@ static int selinux_task_kill(struct
> > > task_struct *p,
> > > struct siginfo *info,
> > > 
> > >  static int selinux_task_wait(struct task_struct *p)
> > >  {
> > > +       if (pid_vnr(task_tgid(current)) == 1){
> > > +                return 0;
> > 
> > this check is not really correct, it can be a sub-thread... Doesn't
> > matter,
> > please see below.
> > 
> > > 
> > > +       }
> > >         return task_has_perm(p, current, PROCESS__SIGCHLD);
> > >  }
> > > It work but it permit pid 1 process to reap child without selinux
> > > check. Can
> > > we have a better way to handle this problem?
> > 
> > I never understood why security_task_wait() should deny to reap a
> > child. But
> > since it can we probably want some explicit "the whole namespace
> > goes away" check.
> > We could use, say, PIDNS_HASH_ADDING but I'd suggest something like
> > a trivial change
> > below for now.
> > 
> > Eric, what do you think?
> > 
> > Oleg.
> > 
> > diff --git a/security/security.c b/security/security.c
> > index f825304..1330b4e 100644
> > --- a/security/security.c
> > +++ b/security/security.c
> > @@ -1027,6 +1027,9 @@ int security_task_kill(struct task_struct *p,
> > struct siginfo *info,
> >  
> >  int security_task_wait(struct task_struct *p)
> >  {
> > +	/* must be the exiting child reaper */
> > +	if (unlikely(current->flags & PF_EXITING))
> > +		return 0;
> >  	return call_int_hook(task_wait, 0, p);
> >  }
> >  

* Re: SELinux leads to soft lockup when pid 1 process reaps child
  2017-01-09 18:43           ` Stephen Smalley
@ 2017-01-09 23:49             ` Paul Moore
  2017-01-10  0:26             ` Casey Schaufler
  1 sibling, 0 replies; 22+ messages in thread
From: Paul Moore @ 2017-01-09 23:49 UTC (permalink / raw)
  To: Stephen Smalley, yangshukui
  Cc: Oleg Nesterov, selinux, linux-security-module, linux-kernel,
	Kefeng Wang, Guohanjun (Hanjun Guo),
	Qiang Huang, Lizefan, miaoxie (A),
	Zhangdianfang, Eric Paris, James Morris, ebiederm, serge.hallyn

On Mon, Jan 9, 2017 at 1:43 PM, Stephen Smalley <sds@tycho.nsa.gov> wrote:
> On Mon, 2017-01-09 at 19:29 +0100, Oleg Nesterov wrote:
>> Seriously, could someone explain why do we need the
>> security_task_wait()
>> hook at all?
>
> I would be ok with killing it.
> IIRC, the original motivation was to block an unauthorized data flow
> from child to parent when the child context differs, but part of that
> original design was also to reparent the child automatically, and that
> was never implemented.  I don't think there is a real use case for it
> in practice and it just breaks things, so let's get rid of it unless
> someone objects.

Patches are always welcome, plenty of time to get things in for 4.11 :)

-- 
paul moore
www.paul-moore.com

* Re: SELinux leads to soft lockup when pid 1 process reaps child
  2017-01-09 18:43           ` Stephen Smalley
  2017-01-09 23:49             ` Paul Moore
@ 2017-01-10  0:26             ` Casey Schaufler
  1 sibling, 0 replies; 22+ messages in thread
From: Casey Schaufler @ 2017-01-10  0:26 UTC (permalink / raw)
  To: Stephen Smalley, Oleg Nesterov, yangshukui
  Cc: selinux, linux-security-module, linux-kernel, Kefeng Wang,
	Guohanjun (Hanjun Guo), 'Qiang Huang',
	Lizefan, miaoxie (A),
	Zhangdianfang, paul, eparis, james.l.morris, ebiederm,
	serge.hallyn

On 1/9/2017 10:43 AM, Stephen Smalley wrote:
> On Mon, 2017-01-09 at 19:29 +0100, Oleg Nesterov wrote:
>> Seriously, could someone explain why do we need the
>> security_task_wait()
>> hook at all?
> I would be ok with killing it.
> IIRC, the original motivation was to block an unauthorized data flow
> from child to parent when the child context differs, but part of that
> original design was also to reparent the child automatically, and that
> was never implemented.  I don't think there is a real use case for it
> in practice and it just breaks things, so let's get rid of it unless
> someone objects.

A strict Bell & LaPadula sensitivity model must prohibit a child
with a more sensitive label from signalling its parent. Except that
Bad Things happen when you try enforcing that on a real system.
I agree with Stephen and Oleg that this hook could go away and not
be missed. If someone *really* wants to implement a strict B&L
policy I believe that a reparenting solution is going to be necessary
anyway.

Regardless of the outcome, I notice that the Smack hook does not
do anything, and that's unnecessary overhead, so it's going to come
out.

>
>>
>> On 01/09, Oleg Nesterov wrote:
>>>
>>> On 01/09, yangshukui wrote:
>>>>
>>>> --- a/security/selinux/hooks.c
>>>> +++ b/security/selinux/hooks.c
>>>> @@ -3596,6 +3596,9 @@ static int selinux_task_kill(struct
>>>> task_struct *p,
>>>> struct siginfo *info,
>>>>
>>>>  static int selinux_task_wait(struct task_struct *p)
>>>>  {
>>>> +       if (pid_vnr(task_tgid(current)) == 1){
>>>> +                return 0;
>>> this check is not really correct, it can be a sub-thread... Doesn't
>>> matter,
>>> please see below.
>>>
>>>> +       }
>>>>         return task_has_perm(p, current, PROCESS__SIGCHLD);
>>>>  }
>>>> It work but it permit pid 1 process to reap child without selinux
>>>> check. Can
>>>> we have a better way to handle this problem?
>>> I never understood why security_task_wait() should deny to reap a
>>> child. But
>>> since it can we probably want some explicit "the whole namespace
>>> goes away" check.
>>> We could use, say, PIDNS_HASH_ADDING but I'd suggest something like
>>> a trivial change
>>> below for now.
>>>
>>> Eric, what do you think?
>>>
>>> Oleg.
>>>
>>> diff --git a/security/security.c b/security/security.c
>>> index f825304..1330b4e 100644
>>> --- a/security/security.c
>>> +++ b/security/security.c
>>> @@ -1027,6 +1027,9 @@ int security_task_kill(struct task_struct *p,
>>> struct siginfo *info,
>>>  
>>>  int security_task_wait(struct task_struct *p)
>>>  {
>>> +	/* must be the exiting child reaper */
>>> +	if (unlikely(current->flags & PF_EXITING))
>>> +		return 0;
>>>  	return call_int_hook(task_wait, 0, p);
>>>  }
>>>  

* isolate selinux_enforcing
  2017-01-09 10:51     ` SELinux leads to soft lockup when pid 1 process reaps child yangshukui
@ 2017-03-09  9:03         ` yangshukui
  2017-03-09  9:03         ` yangshukui
  1 sibling, 0 replies; 22+ messages in thread
From: yangshukui @ 2017-03-09  9:03 UTC (permalink / raw)
  To: selinux, linux-security-module
  Cc: Lizefan, paul, sds, eparis, james.l.morris, oleg, ebiederm, serge.hallyn

I want to use SELinux in a system container, and I only care about how it
behaves inside the container.
The system container runs in a VM, and every VM has only one system container.

How do I use it now?
docker run ... system-contaier /sbin/init
After init is running, the following service also runs:

# This is the part of the service file that runs in the container after
# the container starts.
...
semodule -R     #use the policy in container.
restorecon /     #if needed
...

This method seems to work if the host OS and the docker image use the same
content for the rootfs, but if the host uses redhat7 and the docker image uses
centos7, it denies many normal operations, and this stops some host services
from working.

If SELinux were permissive on the host and enforcing in the container, it
would resolve my problem. Unfortunately, there is no namespace for SELinux.

Isolating SELinux as a whole is difficult and takes a lot of work, but
isolating selinux_enforcing is easier.

What do you think?

Thank you very much.

* Re: isolate selinux_enforcing
  2017-03-09  9:03         ` yangshukui
@ 2017-03-09 15:28           ` Stephen Smalley
  -1 siblings, 0 replies; 22+ messages in thread
From: Stephen Smalley @ 2017-03-09 15:28 UTC (permalink / raw)
  To: yangshukui, selinux, linux-security-module
  Cc: serge.hallyn, oleg, Lizefan, james.l.morris, Kees Cook, Nick Kralevich

On Thu, 2017-03-09 at 17:03 +0800, yangshukui wrote:
> I want to use SELinux in system container and only concern the
> function 
> in the container.
> this system container run in vm and every vm has only one system
> container.
> 
> How do I use now?
> docker run ... system-contaier /sbin/init
> after init is running ,the following service is also running:
> 
> #this is the part of service file which will run in container after 
> starting the container.
> ...
> semodule -R     #use the policy in container.
> restorecon /     #if needed
> ...
> 
> this method seem to work if host os and the docker images use the
> same 
> content for rootfs, but if host use
> redhat7 and docker images use centos7, it will deny many normal 
> operations , and this let some host service not work.
> 
> If SELinux is permissive in host and enforcing in container ,it will 
> resolve my problem. Unfortunately,
> there is no namespace for SELinux.
> 
> Isolate SELinux is difficult and it has a lot of work to do, but is 
> easier to isolate selinux_enforcing.
> 
> What do you think ?

I'd rather see proper SELinux policy namespace support implemented.
Admittedly, that won't be straightforward.

FWIW, ChromiumOS appears to have done something similar to what you
suggest for supporting Android containers (i.e. SELinux enforcing for
the Android container, permissive for ChromiumOS processes outside the
container), but they never discussed it with upstream SELinux
developers AFAIK.  My only knowledge of what they have done comes from
their kernel repository [1]. It appears that they experimented with a
hack to narrow the scope of selinux_enforcing to a PID namespace [2],
then reverted that change later and just implemented an option to
suppress audit denials for permissive domains [3] (evidently they are
running the Chromium OS processes in a permissive domain; I haven't
seen their policy).  I wouldn't recommend either approach; the former
won't properly handle permission checks that occur outside of process
context or certain permission checks where the source context is not
the current task context (e.g. an inter-object relationship check),
while the latter requires leaving a permissive domain in the production
policy (which seemingly would violate CTS; not sure why that gets a
pass, and if that is ok, then why didn't they just create a domain
allowed all permissions and use that outside the container instead -
then they won't need to suppress audit at all?) and further requires
use of a separate kernel for policy development/debugging.  Note btw
that they could have silenced the permissive denials via dontaudit
rules instead (as Android does for its su domain) but chose not to do
so to avoid taking the slow path.
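
As a rough illustration of the two alternatives contrasted above (a
permissive domain versus an enforcing domain that is simply allowed every
permission), here is a small user-space model of the enforcement decision. It
is a sketch under stated assumptions, not the kernel's implementation:
fake_decision, FAKE_FLAG_PERMISSIVE and fake_has_perm() are hypothetical
names, and permissions are plain bit masks.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit marking a domain as permissive (hypothetical name). */
#define FAKE_FLAG_PERMISSIVE 0x1u

struct fake_decision {
    uint32_t allowed;  /* permissions granted by policy */
    uint32_t flags;    /* FAKE_FLAG_PERMISSIVE or 0 */
};

/* Returns 0 if the access goes ahead, -EACCES if it is blocked. */
int fake_has_perm(bool global_enforcing, const struct fake_decision *d,
                  uint32_t requested)
{
    uint32_t denied = requested & ~d->allowed;

    if (!denied)
        return 0;            /* policy allows it outright */
    if (global_enforcing && !(d->flags & FAKE_FLAG_PERMISSIVE))
        return -EACCES;      /* enforced denial */
    return 0;                /* denial is tolerated (and normally logged) */
}
```

In this model the permissive domain and the allow-all domain both end up
with 0 for every request. The difference is that the permissive domain still
produces denials that something has to log or suppress, whereas the allow-all
domain never produces a denial in the first place.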

[1] https://chromium.googlesource.com/chromiumos/third_party/kernel
[2] https://chromium-review.googlesource.com/c/361464/
[3] https://chromium-review.googlesource.com/c/424948/

* Re: isolate selinux_enforcing
  2017-03-09 15:28           ` Stephen Smalley
@ 2017-03-09 15:39             ` Stephen Smalley
  -1 siblings, 0 replies; 22+ messages in thread
From: Stephen Smalley @ 2017-03-09 15:39 UTC (permalink / raw)
  To: yangshukui, selinux, linux-security-module
  Cc: Kees Cook, serge.hallyn, oleg, Lizefan, james.l.morris

On Thu, 2017-03-09 at 10:28 -0500, Stephen Smalley wrote:
> On Thu, 2017-03-09 at 17:03 +0800, yangshukui wrote:
> > 
> > I want to use SELinux in system container and only concern the
> > function 
> > in the container.
> > this system container run in vm and every vm has only one system
> > container.
> > 
> > How do I use now?
> > docker run ... system-contaier /sbin/init
> > after init is running ,the following service is also running:
> > 
> > #this is the part of service file which will run in container
> > after 
> > starting the container.
> > ...
> > semodule -R     #use the policy in container.
> > restorecon /     #if needed
> > ...
> > 
> > this method seem to work if host os and the docker images use the
> > same 
> > content for rootfs, but if host use
> > redhat7 and docker images use centos7, it will deny many normal 
> > operations , and this let some host service not work.
> > 
> > If SELinux is permissive in host and enforcing in container ,it
> > will 
> > resolve my problem. Unfortunately,
> > there is no namespace for SELinux.
> > 
> > Isolate SELinux is difficult and it has a lot of work to do, but
> > is 
> > easier to isolate selinux_enforcing.
> > 
> > What do you think ?
> 
> I'd rather see proper SELinux policy namespace support implemented.
> Admittedly, that won't be straightforward.
> 
> FWIW, ChromiumOS appears to have done something similar to what you
> suggest for supporting Android containers (i.e. SELinux enforcing for
> the Android container, permissive for ChromiumOS processes outside
> the
> container), but they never discussed it with upstream SELinux
> developers AFAIK.  My only knowledge of what they have done comes
> from
> their kernel repository [1]. It appears that they experimented with a
> hack to narrow the scope of selinux_enforcing to a PID namespace [2],
> then reverted that change later and just implemented an option to
> suppress audit denials for permissive domains [3] (evidently they are
> running the Chromium OS processes in a permissive domain; I haven't
> seen their policy).  I wouldn't recommend either approach; the former
> won't properly handle permission checks that occur outside of process
> context or certain permission checks where the source context is not
> the current task context (e.g. an inter-object relationship check),
> while the latter requires leaving a permissive domain in the
> production
> policy (which seemingly would violate CTS; not sure why that gets a
> pass, and if that is ok, then why didn't they just create a domain
> allowed all permissions and use that outside the container instead -
> then they won't need to suppress audit at all?) and further requires
> use of a separate kernel for policy development/debugging.  Note btw
> that they could have silenced the permissive denials via dontaudit
> rules instead (as Android does for its su domain) but chose not to do
> so to avoid taking the slow path.

Sorry, should have looked more closely at their actual change - that
last part of their rationale is bogus; a dontaudit rule would have
prevented calling slow_avc_audit() at all, whereas their change merely
returns early from slow_avc_audit().  So I really don't understand why
they didn't just define dontaudit rules for all permissions (if using a
permissive domain) or allow rules for all permissions (if using an
enforcing, allow-all domain).  Neither one is especially hard to write,
and they could have just looked at the su domain in Android for an
example of the former.
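
The distinction between the two ways of silencing denials can be sketched in
user space as follows. This is a simplified model that covers only the denial
case, and every name in it (fake_avd, fake_avc_audit(), slow_path_entries,
suppress_in_slow_path) is hypothetical. It assumes that dontaudit rules clear
bits in the auditdeny mask, so the audit check bails out before the slow path
is entered, whereas an early return inside the slow path still pays for
entering it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy access vector decision; field names follow the discussion only. */
struct fake_avd {
    uint32_t allowed;    /* permissions granted by allow rules */
    uint32_t auditdeny;  /* denials worth logging; dontaudit clears bits */
};

/* Counts how often the expensive audit path is entered. */
int slow_path_entries;

/*
 * Returns true if an audit record would be emitted for this request.
 * suppress_in_slow_path models an early return placed inside the slow
 * path itself (hypothetical switch).
 */
bool fake_avc_audit(const struct fake_avd *avd, uint32_t requested,
                    bool suppress_in_slow_path)
{
    uint32_t denied = requested & ~avd->allowed;
    uint32_t audited = denied & avd->auditdeny;

    if (!audited)
        return false;        /* fast exit: slow path never entered */

    slow_path_entries++;     /* slow path entered from here on */
    if (suppress_in_slow_path)
        return false;        /* record dropped, but the cost was paid */
    return true;
}
```

With auditdeny set to 0 the counter never moves. With the default all-ones
mask and the suppress switch set, no record is emitted but the counter still
increments on every denial.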

> 
> [1] https://chromium.googlesource.com/chromiumos/third_party/kernel
> [2] https://chromium-review.googlesource.com/c/361464/
> [3] https://chromium-review.googlesource.com/c/424948/

* isolate selinux_enforcing
@ 2017-03-09 15:39             ` Stephen Smalley
  0 siblings, 0 replies; 22+ messages in thread
From: Stephen Smalley @ 2017-03-09 15:39 UTC (permalink / raw)
  To: linux-security-module

On Thu, 2017-03-09 at 10:28 -0500, Stephen Smalley wrote:
> On Thu, 2017-03-09 at 17:03 +0800, yangshukui wrote:
> > 
> > I want to use SELinux in system container and only concern the
> > function?
> > in the container.
> > this system container run in vm and every vm has only one system
> > container.
> > 
> > How do I use now?
> > docker run ... system-contaier /sbin/init
> > after init is running ,the following service is also running:
> > 
> > #this is the part of service file which will run in container
> > after?
> > starting the container.
> > ...
> > semodule -R?????#use the policy in container.
> > restorecon /?????#if needed
> > ...
> > 
> > this method seem to work if host os and the docker images use the
> > same?
> > content for rootfs, but if host use
> > redhat7 and docker images use centos7, it will deny many normal?
> > operations , and this let some host service not work.
> > 
> > If SELinux is permissive in host and enforcing in container ,it
> > will?
> > resolve my problem. Unfortunately,
> > there is no namespace for SELinux.
> > 
> > Isolate SELinux is difficult and it has a lot of work to do, but
> > is?
> > easier to isolate selinux_enforcing.
> > 
> > What do you think ?
> 
> I'd rather see proper SELinux policy namespace support implemented.
> Admittedly, that won't be straightforward.
> 
> FWIW, ChromiumOS appears to have done something similar to what you
> suggest for supporting Android containers (i.e. SELinux enforcing for
> the Android container, permissive for ChromiumOS processes outside the
> container), but they never discussed it with upstream SELinux
> developers AFAIK.  My only knowledge of what they have done comes from
> their kernel repository [1]. It appears that they experimented with a
> hack to narrow the scope of selinux_enforcing to a PID namespace [2],
> then reverted that change later and just implemented an option to
> suppress audit denials for permissive domains [3] (evidently they are
> running the Chromium OS processes in a permissive domain; I haven't
> seen their policy).  I wouldn't recommend either approach; the former
> won't properly handle permission checks that occur outside of process
> context or certain permission checks where the source context is not
> the current task context (e.g. an inter-object relationship check),
> while the latter requires leaving a permissive domain in the production
> policy (which seemingly would violate CTS; not sure why that gets a
> pass, and if that is ok, then why didn't they just create a domain
> allowed all permissions and use that outside the container instead -
> then they won't need to suppress audit at all?) and further requires
> use of a separate kernel for policy development/debugging.  Note btw
> that they could have silenced the permissive denials via dontaudit
> rules instead (as Android does for its su domain) but chose not to do
> so to avoid taking the slow path.

Sorry, should have looked more closely at their actual change - that
last part of their rationale is bogus; a dontaudit rule would have
prevented calling slow_avc_audit() at all, whereas their change merely
returns early from slow_avc_audit().  So I really don't understand why
they didn't just define dontaudit rules for all permissions (if using a
permissive domain) or allow rules for all permissions (if using an
enforcing, allow-all domain).  Neither one is especially hard to write,
and they could have just looked at the su domain in Android for an
example of the former.
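
The difference between the two approaches can be sketched with a
simplified user-space model of the audit decision. This is only an
illustration under stated assumptions, not the kernel code: the struct,
the function names and the call counter below are all hypothetical. The
point it shows is that a dontaudit rule clears the relevant bit in the
audit mask, so the modeled slow path is never entered for that
permission, even though the access is still recorded as denied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of an access vector decision (hypothetical names). */
struct av_decision_model {
    uint32_t allowed;   /* permissions granted by allow rules */
    uint32_t auditdeny; /* denied permissions that should be logged;
                           a dontaudit rule clears the matching bit */
};

/* Counts how often the modeled slow audit path is entered. */
static unsigned int slow_path_calls;

/* Stand-in for the expensive audit path; only reached with work to do. */
static void slow_audit_model(uint32_t audited)
{
    assert(audited != 0);
    slow_path_calls++;
}

/* Returns true if any requested permission is denied by the policy. */
static bool check_and_audit(const struct av_decision_model *avd,
                            uint32_t requested)
{
    uint32_t denied = requested & ~avd->allowed;
    uint32_t audited = denied & avd->auditdeny;

    /* With dontaudit covering every denied bit, audited is 0 and the
     * slow path is skipped entirely rather than entered and left early. */
    if (audited)
        slow_audit_model(audited);

    return denied != 0;
}
```

In this model, a domain whose auditdeny mask is all zeroes never reaches
slow_audit_model(), which is the behaviour a blanket dontaudit rule is
meant to give.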

> 
> [1] https://chromium.googlesource.com/chromiumos/third_party/kernel
> [2] https://chromium-review.googlesource.com/c/361464/
> [3] https://chromium-review.googlesource.com/c/424948/

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: isolate selinux_enforcing
  2017-03-09  9:03         ` yangshukui
@ 2017-03-09 16:39           ` Casey Schaufler
  -1 siblings, 0 replies; 22+ messages in thread
From: Casey Schaufler @ 2017-03-09 16:39 UTC (permalink / raw)
  To: yangshukui, selinux, linux-security-module
  Cc: Lizefan, paul, sds, eparis, james.l.morris, oleg, ebiederm,
	serge.hallyn, Casey Schaufler

On 3/9/2017 1:03 AM, yangshukui wrote:
> I want to use SELinux in system container and only concern the function in the container.
> this system container run in vm and every vm has only one system container.
>
> How do I use now?
> docker run ... system-contaier /sbin/init
> after init is running ,the following service is also running:
>
> #this is the part of service file which will run in container after starting the container.
> ..
> semodule -R     #use the policy in container.
> restorecon /     #if needed
> ..
>
> this method seem to work if host os and the docker images use the same content for rootfs, but if host use
> redhat7 and docker images use centos7, it will deny many normal operations , and this let some host service not work.
>
> If SELinux is permissive in host and enforcing in container ,it will resolve my problem. Unfortunately,
> there is no namespace for SELinux.

The LSM infrastructure is essentially a set of lists.
These lists are rooted globally, but there's no reason*
they couldn't be rooted in a namespace. That would give
each namespace the option of using whatever security
scheme was deemed appropriate. There are a number of
issues, such as namespacing policy, that would have to
be addressed, but the mechanism could work fine. I would
look at patches.

---
* Other than the sheer insanity of making security
  claims about such a system. I would not expect that
  minor issue to slow demand or deployment any more
  than it has in the past.
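
As a rough illustration of hook lists rooted in a namespace rather than
globally, here is a minimal user-space sketch. It is not the LSM
infrastructure itself: struct security_ns, the hook signature and the
two sample hooks are hypothetical names introduced only for this
example. Each namespace carries its own list, and a check walks only
the list of the namespace it is given.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical hook signature: 0 to allow, negative errno to deny. */
typedef int (*access_hook_fn)(int subject_id, int object_id);

/* One entry in a singly linked list of hooks. */
struct hook_entry {
    access_hook_fn fn;
    const struct hook_entry *next;
};

/* Hypothetical namespace object that owns its own hook list. */
struct security_ns {
    const struct hook_entry *access_hooks;
};

/* Walk the hooks of one namespace; the first denial wins. */
static int ns_check_access(const struct security_ns *ns,
                           int subject_id, int object_id)
{
    assert(ns != NULL);
    for (const struct hook_entry *e = ns->access_hooks; e != NULL; e = e->next) {
        int rc = e->fn(subject_id, object_id);
        if (rc != 0)
            return rc;
    }
    return 0; /* an empty list denies nothing */
}

/* Two trivial sample hooks standing in for different security schemes. */
static int hook_allow_all(int subject_id, int object_id)
{
    (void)subject_id;
    (void)object_id;
    return 0;
}

static int hook_deny_all(int subject_id, int object_id)
{
    (void)subject_id;
    (void)object_id;
    return -EACCES;
}
```

The sketch deliberately leaves out the hard parts mentioned above, such
as how policy itself would be namespaced.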

>
> Isolate SELinux is difficult and it has a lot of work to do, but is easier to isolate selinux_enforcing.
>
> What do you think ?
>
> Thank you very much.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: isolate selinux_enforcing
  2017-03-09 16:39           ` Casey Schaufler
@ 2017-03-09 20:49             ` Eric W. Biederman
  -1 siblings, 0 replies; 22+ messages in thread
From: Eric W. Biederman @ 2017-03-09 20:49 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: yangshukui, selinux, linux-security-module, Lizefan, paul, sds,
	eparis, james.l.morris, oleg, serge.hallyn

Casey Schaufler <casey@schaufler-ca.com> writes:

> On 3/9/2017 1:03 AM, yangshukui wrote:
>> I want to use SELinux in system container and only concern the function in the container.
>> this system container run in vm and every vm has only one system container.
>>
>> How do I use now?
>> docker run ... system-contaier /sbin/init
>> after init is running ,the following service is also running:
>>
>> #this is the part of service file which will run in container after starting the container.
>> ..
>> semodule -R     #use the policy in container.
>> restorecon /     #if needed
>> ..
>>
>> this method seem to work if host os and the docker images use the same content for rootfs, but if host use
>> redhat7 and docker images use centos7, it will deny many normal operations , and this let some host service not work.
>>
>> If SELinux is permissive in host and enforcing in container ,it will resolve my problem. Unfortunately,
>> there is no namespace for SELinux.

This is mostly a SELinux problem.

> The LSM infrastructure is essentially a set of lists.
> These lists are rooted globally, but there's no reason*
> they couldn't be rooted in a namespace. That would give
> each namespace the option of using whatever security
> scheme was deemed appropriate. There are a number of
> issues, such as namespacing policy, that would have to
> be addressed, but the mechanism could work fine. I would
> look at patches.

>
> ---
> * Other than the sheer insanity of making security
>   claims about such a system. I would not expect that
>   minor issue to slow demand or deployment any more
>   than it has in the past.

I would tend to insist that the container-local policy stacks inside the
global policy, so that at the very least the global security claims would
not be reduced.

My expectation is that a container would run as essentially all one
label from a global perspective.

Implementing this would require a revision of the SELinux label xattrs
so that they can be marked as being part of a container, while the
labels still look ordinary inside the container.

We almost have a patch that implements something like that for the
capability xattr.

Eric

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: isolate selinux_enforcing
  2017-03-09 20:49             ` Eric W. Biederman
@ 2017-03-10  0:05               ` Paul Moore
  -1 siblings, 0 replies; 22+ messages in thread
From: Paul Moore @ 2017-03-10  0:05 UTC (permalink / raw)
  To: Casey Schaufler, Eric W. Biederman
  Cc: yangshukui, selinux, linux-security-module, Lizefan,
	Stephen Smalley, Eric Paris, James Morris, oleg, serge.hallyn

On Thu, Mar 9, 2017 at 3:49 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:
> Casey Schaufler <casey@schaufler-ca.com> writes:
>
>> On 3/9/2017 1:03 AM, yangshukui wrote:
>>> I want to use SELinux in system container and only concern the function in the container.
>>> this system container run in vm and every vm has only one system container.
>>>
>>> How do I use now?
>>> docker run ... system-contaier /sbin/init
>>> after init is running ,the following service is also running:
>>>
>>> #this is the part of service file which will run in container after starting the container.
>>> ..
>>> semodule -R     #use the policy in container.
>>> restorecon /     #if needed
>>> ..
>>>
>>> this method seem to work if host os and the docker images use the same content for rootfs, but if host use
>>> redhat7 and docker images use centos7, it will deny many normal operations , and this let some host service not work.
>>>
>>> If SELinux is permissive in host and enforcing in container ,it will resolve my problem. Unfortunately,
>>> there is no namespace for SELinux.
>
> This is mostly a SELinux problem.
>
>> The LSM infrastructure is essentially a set of lists.
>> These lists are rooted globally, but there's no reason*
>> they couldn't be rooted in a namespace. That would give
>> each namespace the option of using whatever security
>> scheme was deemed appropriate. There are a number of
>> issues, such as namespacing policy, that would have to
>> be addressed, but the mechanism could work fine. I would
>> look at patches.
>
>>
>> ---
>> * Other than the sheer insanity of making security
>>   claims about such a system. I would not expect that
>>   minor issue to slow demand or deployment any more
>>   than it has in the past.
>
> I would tend to insist that the container local policy stacks inside the
> global policy.  So that at the least the global security claims would
> not be reduced.

My current thinking is that namespacing is best left to the individual
LSMs, as it is unlikely we will all want to solve it the same way.
With SELinux we already have some basic support for what Eric
describes via bounded domains, but that alone isn't likely to solve
SELinux inside containers in a sense that most would expect; for that
you will need what Stephen already described.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: isolate selinux_enforcing
  2017-03-09 20:49             ` Eric W. Biederman
@ 2017-03-13  7:06               ` James Morris
  -1 siblings, 0 replies; 22+ messages in thread
From: James Morris @ 2017-03-13  7:06 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Casey Schaufler, yangshukui, selinux, linux-security-module,
	Lizefan, paul, sds, eparis, james.l.morris, oleg, serge.hallyn

On Thu, 9 Mar 2017, Eric W. Biederman wrote:

> My expectation is that a container would run as essentially all one
> label from a global perspective.
> 

Keep in mind that different classes of objects may have distinct 
labeling in SELinux.  e.g. a process and a file typically have different 
labels (say, sshd_t vs. sshd_key_t).

Also, I think you will want to have the global namespace always use the 
original security labels.  If accessing an object from outside the 
container, the original global policy should always apply.  Really, this 
needs to be an invariant property.

I'd suggest implementing an orthogonal 2nd set of security labels which 
are only ever used within the container.
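
One way to picture this is an object that carries two labels, with the
accessor's position deciding which one is consulted. The following is
only a sketch: the struct, the function and the fallback to the global
label when no container label has been set are assumptions made for
illustration, not something specified in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object carrying two independent labels. */
struct object_labels {
    const char *global_label;    /* original label, always present */
    const char *container_label; /* second label, NULL if never set */
};

/*
 * Pick the label a permission check should use. Accesses from outside
 * the container always see the original global label (the invariant).
 * Falling back to the global label inside the container when no
 * container label exists is an assumption of this sketch.
 */
static const char *label_for_access(const struct object_labels *obj,
                                    int accessor_in_container)
{
    assert(obj != NULL && obj->global_label != NULL);
    if (accessor_in_container && obj->container_label != NULL)
        return obj->container_label;
    return obj->global_label;
}
```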


> To implement this would require a revision on the selinux labels xattrs
> so that they can be marked as being part of a container...  But having
> the labels look ordinary inside the container.
> 
> We almost have a patch that implements something like that for the
> capability xattr.

It'll be interesting to see.

-- 
James Morris
<jmorris@namei.org>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: isolate selinux_enforcing
  2017-03-13  7:06               ` James Morris
@ 2017-03-13 16:05                 ` Casey Schaufler
  -1 siblings, 0 replies; 22+ messages in thread
From: Casey Schaufler @ 2017-03-13 16:05 UTC (permalink / raw)
  To: James Morris, Eric W. Biederman
  Cc: yangshukui, selinux, linux-security-module, Lizefan, paul, sds,
	eparis, james.l.morris, oleg, serge.hallyn

On 3/13/2017 12:06 AM, James Morris wrote:
> On Thu, 9 Mar 2017, Eric W. Biederman wrote:
>
>> My expectation is that a container would run as essentially all one
>> label from a global perspective.
>>
> Keep in mind that different classes of objects may have distinct 
> labeling in SELinux.  e.g. a process and a file typically have different 
> labels (say, sshd_t vs. sshd_key_t).
>
> Also, I think you will want to have the global namespace always use the 
> original security labels.  If accessing an object from outside the 
> container, the original global policy should always apply.  Really, this 
> needs to be an invariant property.
>
> I'd suggest implementing an orthogonal 2nd set of security labels which 
> are only ever used within the container.

The work that's been done for Smack namespaces

	https://lwn.net/Articles/652320

may come in handy during your deliberations for
SELinux. Conceptually you can create aliases for your
base labels, and use those within the container. Very
much like the UID mapping of user namespaces. Labels that
don't have an alias can't be accessed within the namespace.
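
The aliasing idea can be sketched as a small lookup table, in the same
spirit as a UID map. This is a user-space illustration only, not the
Smack namespace implementation; the struct names, function names and
any labels passed in are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One mapping between a host-side label and its in-namespace alias. */
struct label_alias {
    const char *base;  /* label as seen by the host */
    const char *alias; /* label as seen inside the namespace */
};

/* Hypothetical per-namespace alias table. */
struct label_ns {
    const struct label_alias *map;
    size_t count;
};

/* Translate an in-namespace alias to its base label, NULL if unmapped. */
static const char *alias_to_base(const struct label_ns *ns, const char *alias)
{
    assert(ns != NULL && alias != NULL);
    for (size_t i = 0; i < ns->count; i++) {
        if (strcmp(ns->map[i].alias, alias) == 0)
            return ns->map[i].base;
    }
    return NULL;
}

/*
 * Translate a base label to its alias. A NULL result models the rule
 * that a label without an alias cannot be accessed in the namespace.
 */
static const char *base_to_alias(const struct label_ns *ns, const char *base)
{
    assert(ns != NULL && base != NULL);
    for (size_t i = 0; i < ns->count; i++) {
        if (strcmp(ns->map[i].base, base) == 0)
            return ns->map[i].alias;
    }
    return NULL;
}
```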

>> To implement this would require a revision on the selinux labels xattrs
>> so that they can be marked as being part of a container...  But having
>> the labels look ordinary inside the container.
>>
>> We almost have a patch that implements something like that for the
>> capability xattr.
> It'll be interesting to see.
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

Thread overview: 22+ messages
     [not found] <58732BCF.4090908@huawei.com>
     [not found] ` <58734284.1060504@huawei.com>
     [not found]   ` <b7f75f65-592a-5102-0ac5-4d3aa43f0b55@huawei.com>
2017-01-09 10:51     ` SELinux lead to soft lockup when pid 1 proceess reap child yangshukui
2017-01-09 18:12       ` Oleg Nesterov
2017-01-09 18:29         ` Oleg Nesterov
2017-01-09 18:43           ` Stephen Smalley
2017-01-09 23:49             ` Paul Moore
2017-01-10  0:26             ` Casey Schaufler
2017-03-09  9:03       ` isolate selinux_enforcing yangshukui
2017-03-09  9:03         ` yangshukui
2017-03-09 15:28         ` Stephen Smalley
2017-03-09 15:28           ` Stephen Smalley
2017-03-09 15:39           ` Stephen Smalley
2017-03-09 15:39             ` Stephen Smalley
2017-03-09 16:39         ` Casey Schaufler
2017-03-09 16:39           ` Casey Schaufler
2017-03-09 20:49           ` Eric W. Biederman
2017-03-09 20:49             ` Eric W. Biederman
2017-03-10  0:05             ` Paul Moore
2017-03-10  0:05               ` Paul Moore
2017-03-13  7:06             ` James Morris
2017-03-13  7:06               ` James Morris
2017-03-13 16:05               ` Casey Schaufler
2017-03-13 16:05                 ` Casey Schaufler