New Container vulnerability could potentially use an SELinux fix.

* New Container vulnerability could potentially use an SELinux fix.
@ 2019-06-07 15:42 Daniel Walsh
  2019-06-07 16:44 ` Stephen Smalley
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Walsh @ 2019-06-07 15:42 UTC (permalink / raw)
  To: Miloslav Trmac, selinux

We periodically see vulnerabilities where malicious container images
mount symbolic link attacks against the host.

One came out last week involving `podman cp`, which copies content
from the host into the container.  The issue was that a running
container could trick the process copying content into it into
following a symbolic link that points outside the container image.
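
As a rough illustration of the failure mode (not the actual podman
code or its fix), the Python sketch below resolves a destination path
under a container root and refuses it if a symlink would take it
outside.  The names resolve_in_root and EscapesRoot are made up for
this example.  It is still check-then-use: a running container can
swap a path component for a symlink after the check, so a userspace
check like this does not close the race by itself.

```python
import os


class EscapesRoot(Exception):
    """Raised when a path would resolve outside the container root."""


def resolve_in_root(root: str, untrusted: str) -> str:
    """Resolve `untrusted` under `root`; refuse results outside `root`.

    Hypothetical helper for illustration only.  The check and the later
    use of the returned path are not atomic.
    """
    root_real = os.path.realpath(root)
    # Treat the untrusted path as relative to the container root.
    joined = os.path.join(root_real, untrusted.lstrip("/"))
    # realpath() follows every symlink it meets along the way.
    candidate = os.path.realpath(joined)
    if os.path.commonpath([root_real, candidate]) != root_real:
        raise EscapesRoot(f"{untrusted!r} resolves to {candidate!r}")
    return candidate
```

With a root whose etc entry is a symlink to a host directory,
resolve_in_root(root, "/etc/passwd") raises EscapesRoot, while a path
that stays inside the root is returned unchanged.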

The question came up: is there a way to use SELinux to prevent this?
Sadly, the answer right now is no, because we have no way to know which
label the process updating the container file system is running with.
Usually it will be running as unconfined_t.

One idea would be to add a policy rule that restricts the following of
symbolic links to only those cases specified in policy.

Something like

SPECIALRESTRICTED TYPE container_file_t

allow container_file_t container_file_t:symlink follow;

Then if a process attempted to copy content through a symbolic link
labeled container_file_t that points to a non-container_file_t type,
the kernel would deny access.

Thoughts?


* Re: New Container vulnerability could potentially use an SELinux fix.
  2019-06-07 15:42 New Container vulnerability could potentially use an SELinux fix Daniel Walsh
@ 2019-06-07 16:44 ` Stephen Smalley
  2019-06-07 21:06   ` Daniel Walsh
  0 siblings, 1 reply; 10+ messages in thread
From: Stephen Smalley @ 2019-06-07 16:44 UTC (permalink / raw)
  To: dwalsh, Miloslav Trmac, selinux

On 6/7/19 11:42 AM, Daniel Walsh wrote:
> We have periodic vulnerablities around bad container images having
> symbolic link attacks against the host.
> 
> One came out last week about doing a `podman cp`
> 
> Which would copy content from the host into the container.  The issue
> was that if the container was running, it could trick the processes
> copying content into it to follow a symbolic link to external of the
> container image.
> 
> The question came up, is there a way to use SELinux to prevent this. And
> sadly the answer right now is no, because we have no way to know what
> the label of the process attempting to update the container file system
> is running as.  Usually it will be running as unconfined_t.
> 
> One idea would be to add a rule to policy that control the following of
> symbolic links to only those specified in policy.
> 
> 
> Something like
> 
> SPECIALRESTRICTED TYPE container_file_t
> 
> allow container_file_t container_file_t:symlink follow;
> 
> Then if a process attempted to copy content onto a symbolic link from
> container_file_t to a non container_file_t type, the kernel would deny
> access.
> 
> Thoughts?

SELinux would prevent it if you didn't allow unconfined_t (or other 
privileged domains) to follow untrustworthy symlinks (e.g. don't allow 
unconfined_t container_file_t:lnk_file read; in the first place). 
That's the right way to prevent it.

Trying to apply a check between symlink and its target as you suggest is 
problematic; we don't generally have them both at the same point.  If we 
are allowed to follow the symlink, we read its contents and perform a 
path walk on that, and that could be a multi-component pathname lookup 
that itself spans further symlinks, mount points, etc.  I think that 
would be challenging to support in the kernel, subject to races, and 
certainly would require changes outside of just SELinux.
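
To make the path-walk point concrete, here is a small Python sketch (a
userspace approximation, not kernel code) that walks a path one
component at a time and records every symlink it crosses.  The names
trace_symlinks and max_links are invented for this example.  A single
lookup can cross several links, each with its own target string, so
there is no single (link, target) pair available at one point in time.

```python
import os


def trace_symlinks(path: str, max_links: int = 40) -> list[tuple[str, str]]:
    """Return (symlink path, raw link text) for each link crossed."""
    if not os.path.isabs(path):
        path = os.path.join(os.getcwd(), path)
    hops: list[tuple[str, str]] = []
    pending = [c for c in path.split(os.sep) if c]
    current = os.sep
    while pending:
        name = pending.pop(0)
        if name == ".":
            continue
        if name == "..":
            current = os.path.dirname(current)
            continue
        candidate = os.path.join(current, name)
        if os.path.islink(candidate):
            if len(hops) >= max_links:
                raise OSError("too many levels of symbolic links")
            target = os.readlink(candidate)
            hops.append((candidate, target))
            if target.startswith(os.sep):
                # An absolute target restarts the walk at the root.
                current = os.sep
            # The link text is itself a multi-component path to walk.
            pending = [c for c in target.split(os.sep) if c] + pending
        else:
            current = candidate
    return hops
```

Each entry in the result is a separate point where a check would have
to run, and the filesystem can change between any two of them.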

If you truly cannot impose such restrictions on unconfined_t, then maybe 
podman should run in its own domain.


* Re: New Container vulnerability could potentially use an SELinux fix.
  2019-06-07 16:44 ` Stephen Smalley
@ 2019-06-07 21:06   ` Daniel Walsh
  2019-06-07 21:26     ` Stephen Smalley
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Walsh @ 2019-06-07 21:06 UTC (permalink / raw)
  To: Stephen Smalley, Miloslav Trmac, selinux

On 6/7/19 12:44 PM, Stephen Smalley wrote:
> On 6/7/19 11:42 AM, Daniel Walsh wrote:
>> We have periodic vulnerablities around bad container images having
>> symbolic link attacks against the host.
>>
>> One came out last week about doing a `podman cp`
>>
>> Which would copy content from the host into the container.  The issue
>> was that if the container was running, it could trick the processes
>> copying content into it to follow a symbolic link to external of the
>> container image.
>>
>> The question came up, is there a way to use SELinux to prevent this. And
>> sadly the answer right now is no, because we have no way to know what
>> the label of the process attempting to update the container file system
>> is running as.  Usually it will be running as unconfined_t.
>>
>> One idea would be to add a rule to policy that control the following of
>> symbolic links to only those specified in policy.
>>
>>
>> Something like
>>
>> SPECIALRESTRICTED TYPE container_file_t
>>
>> allow container_file_t container_file_t:symlink follow;
>>
>> Then if a process attempted to copy content onto a symbolic link from
>> container_file_t to a non container_file_t type, the kernel would deny
>> access.
>>
>> Thoughts?
>
> SELinux would prevent it if you didn't allow unconfined_t (or other
> privileged domains) to follow untrustworthy symlinks (e.g. don't allow
> unconfined_t container_file_t:lnk_file read; in the first place).
> That's the right way to prevent it.
>
> Trying to apply a check between symlink and its target as you suggest
> is problematic; we don't generally have them both at the same point. 
> If we are allowed to follow the symlink, we read its contents and
> perform a path walk on that, and that could be a multi-component
> pathname lookup that itself spans further symlinks, mount points,
> etc.  I think that would be challenging to support in the kernel,
> subject to races, and certainly would require changes outside of just
> SELinux.
>
> If you truly cannot impose such restrictions on unconfined_t, then
> maybe podman should run in its own domain.
>
This is not an issue with just podman.  Podman can mount the image, and
tools can then just read/write content under the mountpoint.

I thought I recalled an LSM that prevented symlink attacks where a user
links a file in their home directory to /etc/shadow and then tries to
get the admin to modify that file in the home directory?

I was thinking that if that existed, we could build more controls on it
based on labels rather than just matching UIDs.



* Re: New Container vulnerability could potentially use an SELinux fix.
  2019-06-07 21:06   ` Daniel Walsh
@ 2019-06-07 21:26     ` Stephen Smalley
  2019-06-08 14:08       ` Daniel Walsh
  0 siblings, 1 reply; 10+ messages in thread
From: Stephen Smalley @ 2019-06-07 21:26 UTC (permalink / raw)
  To: dwalsh, Miloslav Trmac, selinux

On 6/7/19 5:06 PM, Daniel Walsh wrote:
> On 6/7/19 12:44 PM, Stephen Smalley wrote:
>> On 6/7/19 11:42 AM, Daniel Walsh wrote:
>>> We have periodic vulnerablities around bad container images having
>>> symbolic link attacks against the host.
>>>
>>> One came out last week about doing a `podman cp`
>>>
>>> Which would copy content from the host into the container.  The issue
>>> was that if the container was running, it could trick the processes
>>> copying content into it to follow a symbolic link to external of the
>>> container image.
>>>
>>> The question came up, is there a way to use SELinux to prevent this. And
>>> sadly the answer right now is no, because we have no way to know what
>>> the label of the process attempting to update the container file system
>>> is running as.  Usually it will be running as unconfined_t.
>>>
>>> One idea would be to add a rule to policy that control the following of
>>> symbolic links to only those specified in policy.
>>>
>>>
>>> Something like
>>>
>>> SPECIALRESTRICTED TYPE container_file_t
>>>
>>> allow container_file_t container_file_t:symlink follow;
>>>
>>> Then if a process attempted to copy content onto a symbolic link from
>>> container_file_t to a non container_file_t type, the kernel would deny
>>> access.
>>>
>>> Thoughts?
>>
>> SELinux would prevent it if you didn't allow unconfined_t (or other
>> privileged domains) to follow untrustworthy symlinks (e.g. don't allow
>> unconfined_t container_file_t:lnk_file read; in the first place).
>> That's the right way to prevent it.
>>
>> Trying to apply a check between symlink and its target as you suggest
>> is problematic; we don't generally have them both at the same point.
>> If we are allowed to follow the symlink, we read its contents and
>> perform a path walk on that, and that could be a multi-component
>> pathname lookup that itself spans further symlinks, mount points,
>> etc.  I think that would be challenging to support in the kernel,
>> subject to races, and certainly would require changes outside of just
>> SELinux.
>>
>> If you truly cannot impose such restrictions on unconfined_t, then
>> maybe podman should run in its own domain.
>>
> This is not an issue with just podman.  Podman can mount the image and
> the tools can just read/write content into the mountpoint.
> 
> I thought I recalled a LSM that prefented symlink attacks when users
> would link a file in the homedir against /etc/shadow and then attempt to
> get the admin to modify the file in his homedir?
> 
> I was thinking that if that existed we could build more controls on it
> based on Labels rather then just UIDs matching.

Not sure if you are thinking of symlink attacks or hard link attacks. 
SELinux supports preventing the former by restricting the ability to 
follow symlinks based on lnk_file read permission, so you can prevent 
trusted processes from following untrustworthy symlinks.  SELinux 
supports preventing the latter by restricting the ability to create hard 
links to unauthorized files.  But you need to write your policies in a 
manner that leverages that support, and a fully unconfined domain isn't 
going to be protected via SELinux by definition; ideally you'd be 
phasing out unconfined altogether like Android did.  Modern kernels also 
have the /proc/sys/fs/protected_hardlinks and 
/proc/sys/fs/protected_symlinks settings, which restrict based on UID, 
but the symlink checks aren't based on the target of the symlink either.


* Re: New Container vulnerability could potentially use an SELinux fix.
  2019-06-07 21:26     ` Stephen Smalley
@ 2019-06-08 14:08       ` Daniel Walsh
  2019-06-10 14:08         ` Stephen Smalley
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Walsh @ 2019-06-08 14:08 UTC (permalink / raw)
  To: Stephen Smalley, Miloslav Trmac, selinux

On 6/7/19 5:26 PM, Stephen Smalley wrote:
> On 6/7/19 5:06 PM, Daniel Walsh wrote:
>> On 6/7/19 12:44 PM, Stephen Smalley wrote:
>>> On 6/7/19 11:42 AM, Daniel Walsh wrote:
>>>> We have periodic vulnerablities around bad container images having
>>>> symbolic link attacks against the host.
>>>>
>>>> One came out last week about doing a `podman cp`
>>>>
>>>> Which would copy content from the host into the container.  The issue
>>>> was that if the container was running, it could trick the processes
>>>> copying content into it to follow a symbolic link to external of the
>>>> container image.
>>>>
>>>> The question came up, is there a way to use SELinux to prevent
>>>> this. And
>>>> sadly the answer right now is no, because we have no way to know what
>>>> the label of the process attempting to update the container file
>>>> system
>>>> is running as.  Usually it will be running as unconfined_t.
>>>>
>>>> One idea would be to add a rule to policy that control the
>>>> following of
>>>> symbolic links to only those specified in policy.
>>>>
>>>>
>>>> Something like
>>>>
>>>> SPECIALRESTRICTED TYPE container_file_t
>>>>
>>>> allow container_file_t container_file_t:symlink follow;
>>>>
>>>> Then if a process attempted to copy content onto a symbolic link from
>>>> container_file_t to a non container_file_t type, the kernel would deny
>>>> access.
>>>>
>>>> Thoughts?
>>>
>>> SELinux would prevent it if you didn't allow unconfined_t (or other
>>> privileged domains) to follow untrustworthy symlinks (e.g. don't allow
>>> unconfined_t container_file_t:lnk_file read; in the first place).
>>> That's the right way to prevent it.
>>>
>>> Trying to apply a check between symlink and its target as you suggest
>>> is problematic; we don't generally have them both at the same point.
>>> If we are allowed to follow the symlink, we read its contents and
>>> perform a path walk on that, and that could be a multi-component
>>> pathname lookup that itself spans further symlinks, mount points,
>>> etc.  I think that would be challenging to support in the kernel,
>>> subject to races, and certainly would require changes outside of just
>>> SELinux.
>>>
>>> If you truly cannot impose such restrictions on unconfined_t, then
>>> maybe podman should run in its own domain.
>>>
>> This is not an issue with just podman.  Podman can mount the image and
>> the tools can just read/write content into the mountpoint.
>>
>> I thought I recalled a LSM that prefented symlink attacks when users
>> would link a file in the homedir against /etc/shadow and then attempt to
>> get the admin to modify the file in his homedir?
>>
>> I was thinking that if that existed we could build more controls on it
>> based on Labels rather then just UIDs matching.
>
> Not sure if you are thinking of symlink attacks or hard link attacks.
> SELinux supports preventing the former by restricting the ability to
> follow symlinks based on lnk_file read permission, so you can prevent
> trusted processes from following untrustworthy symlinks.  SELinux
> supports preventing the latter by restricting the ability to create
> hard links to unauthorized files.  But you need to write your policies
> in a manner that leverages that support, and a fully unconfined domain
> isn't going to be protected via SELinux by definition; ideally you'd
> be phasing out unconfined altogether like Android did.  Modern kernels
> also have the /proc/sys/fs/protected_hardlinks and
> /proc/sys/fs/protected_symlinks settings, which restrict based on UID,
> but the symlink checks aren't based on the target of the symlink either.

Android does not have an admin, so it is a lot easier for them.  But I
am not going to get into that now.  I obviously understand how SELinux
works, but perhaps I am looking for something different.

This link describes something pretty close to what I want, but extended
to labels rather than just UIDs.

https://sysctl-explorer.net/fs/protected_symlinks/


> A long-standing class of security issues is the symlink-based
> time-of-check-time-of-use race, most commonly seen in world-writable
> directories like /tmp. The common method of exploitation of this flaw
> is to cross privilege boundaries when following a given symlink (i.e.
> a **PRIVILEGED** process follows a symlink **PROVIDED BY
> OTHERS**). For a likely incomplete list of hundreds of examples across
> the years, please see:
> http://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=/tmp
>
> When set to “0”, symlink following behavior is unrestricted.
>
> When set to “1” symlinks are permitted to be followed only when
> outside a sticky world-writable directory **WE COULD POTENTIALLY SET
> THIS OR SOME OTHER FLAG**, or when the **LABEL** of the symlink and
> follower match, or when the directory **LABEL** matches the symlink’s
> **LABEL**.
>
> This protection is based on the restrictions in Openwall and grsecurity.
>








* Re: New Container vulnerability could potentially use an SELinux fix.
  2019-06-08 14:08       ` Daniel Walsh
@ 2019-06-10 14:08         ` Stephen Smalley
  2019-06-10 14:37           ` Daniel Walsh
  0 siblings, 1 reply; 10+ messages in thread
From: Stephen Smalley @ 2019-06-10 14:08 UTC (permalink / raw)
  To: dwalsh, Miloslav Trmac, selinux

On 6/8/19 10:08 AM, Daniel Walsh wrote:
> On 6/7/19 5:26 PM, Stephen Smalley wrote:
>> On 6/7/19 5:06 PM, Daniel Walsh wrote:
>>> On 6/7/19 12:44 PM, Stephen Smalley wrote:
>>>> On 6/7/19 11:42 AM, Daniel Walsh wrote:
>>>>> We have periodic vulnerablities around bad container images having
>>>>> symbolic link attacks against the host.
>>>>>
>>>>> One came out last week about doing a `podman cp`
>>>>>
>>>>> Which would copy content from the host into the container.  The issue
>>>>> was that if the container was running, it could trick the processes
>>>>> copying content into it to follow a symbolic link to external of the
>>>>> container image.
>>>>>
>>>>> The question came up, is there a way to use SELinux to prevent
>>>>> this. And
>>>>> sadly the answer right now is no, because we have no way to know what
>>>>> the label of the process attempting to update the container file
>>>>> system
>>>>> is running as.  Usually it will be running as unconfined_t.
>>>>>
>>>>> One idea would be to add a rule to policy that control the
>>>>> following of
>>>>> symbolic links to only those specified in policy.
>>>>>
>>>>>
>>>>> Something like
>>>>>
>>>>> SPECIALRESTRICTED TYPE container_file_t
>>>>>
>>>>> allow container_file_t container_file_t:symlink follow;
>>>>>
>>>>> Then if a process attempted to copy content onto a symbolic link from
>>>>> container_file_t to a non container_file_t type, the kernel would deny
>>>>> access.
>>>>>
>>>>> Thoughts?
>>>>
>>>> SELinux would prevent it if you didn't allow unconfined_t (or other
>>>> privileged domains) to follow untrustworthy symlinks (e.g. don't allow
>>>> unconfined_t container_file_t:lnk_file read; in the first place).
>>>> That's the right way to prevent it.
>>>>
>>>> Trying to apply a check between symlink and its target as you suggest
>>>> is problematic; we don't generally have them both at the same point.
>>>> If we are allowed to follow the symlink, we read its contents and
>>>> perform a path walk on that, and that could be a multi-component
>>>> pathname lookup that itself spans further symlinks, mount points,
>>>> etc.  I think that would be challenging to support in the kernel,
>>>> subject to races, and certainly would require changes outside of just
>>>> SELinux.
>>>>
>>>> If you truly cannot impose such restrictions on unconfined_t, then
>>>> maybe podman should run in its own domain.
>>>>
>>> This is not an issue with just podman.  Podman can mount the image and
>>> the tools can just read/write content into the mountpoint.
>>>
>>> I thought I recalled a LSM that prefented symlink attacks when users
>>> would link a file in the homedir against /etc/shadow and then attempt to
>>> get the admin to modify the file in his homedir?
>>>
>>> I was thinking that if that existed we could build more controls on it
>>> based on Labels rather then just UIDs matching.
>>
>> Not sure if you are thinking of symlink attacks or hard link attacks.
>> SELinux supports preventing the former by restricting the ability to
>> follow symlinks based on lnk_file read permission, so you can prevent
>> trusted processes from following untrustworthy symlinks.  SELinux
>> supports preventing the latter by restricting the ability to create
>> hard links to unauthorized files.  But you need to write your policies
>> in a manner that leverages that support, and a fully unconfined domain
>> isn't going to be protected via SELinux by definition; ideally you'd
>> be phasing out unconfined altogether like Android did.  Modern kernels
>> also have the /proc/sys/fs/protected_hardlinks and
>> /proc/sys/fs/protected_symlinks settings, which restrict based on UID,
>> but the symlink checks aren't based on the target of the symlink either.
> 
> Android does not have an Admin, so it is a lot easier for them.  But not
> going to get into that now.  I obviously understand how SELinux works.
> But perhaps I am looking for something differntly.
> 
> This link defines pretty close to what I would want, but extended for
> labels rather then just UIDS.
> 
> https://sysctl-explorer.net/fs/protected_symlinks/
> 
> 
>> A long-standing class of security issues is the symlink-based
>> time-of-check-time-of-use race, most commonly seen in world-writable
>> directories like /tmp. The common method of exploitation of this flaw
>> is to cross privilege boundaries when following a given symlink (i.e.
>> a **PRIVILEGED** process follows a symlink belonging **PROVIDED BY
>> OTHERS**). For a likely incomplete list of hundreds of examples across
>> the years, please see:
>> http://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=/tmp
>>
>> When set to “0”, symlink following behavior is unrestricted.
>>
>> When set to “1” symlinks are permitted to be followed only when
>> outside a sticky world-writable directory **WE COULD POTENTIALLY SET
>> THIS OR SOME OTHER FLAG**, or when the **LABEL** of the symlink and
>> follower match, or when the directory **LABEL** matches the symlink’s
>> **LABEL**.
>>
>> This protection is based on the restrictions in Openwall and grsecurity.
>>

That's the /proc/sys/fs/protected_symlinks feature I mentioned in my 
email above.  It isn't based on the target of the symlink; it is only 
based on the attributes of the follower process (e.g. root), the 
attributes of the parent directory containing the symlink (e.g. /tmp), 
and the attributes of the symlink file (e.g. /tmp/foo -> /etc/shadow). 
At no point is it checking anything about the target of the symlink, 
e.g. /etc/shadow.  If dwalsh creates a symlink under /tmp (ln -s 
/etc/shadow /tmp/foo) and root tries to follow /tmp/foo, then that will 
fail because 1) the process fsuid (root) != the /tmp/foo symlink owner 
(dwalsh), and 2) /tmp is a sticky and world-writable directory, and 3) 
the /tmp directory owner (root) != the /tmp/foo symlink owner (dwalsh). 
Note that conditions (2) and (3) render the check useless for your use 
case, since you want to prevent following any symlinks writable by 
container processes in any directory within the container filesystem, so 
the directory need not be world-writable/sticky and the parent directory 
UID/label might be identical to the symlink UID/label.
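
The three conditions above can be written down as a small decision
function.  This is a Python model of the description in this email, not
the kernel source; Inode and may_follow are names chosen for the
sketch, and only the owner UID and mode bits are modeled.

```python
import stat
from dataclasses import dataclass


@dataclass(frozen=True)
class Inode:
    uid: int   # owner
    mode: int  # st_mode bits


def may_follow(fsuid: int, link: Inode, parent: Inode,
               protected_symlinks: bool = True) -> bool:
    """Model of the protected_symlinks decision described above.

    Nothing about the symlink's target is ever an input.
    """
    if not protected_symlinks:
        return True
    # (1) The follower owns the symlink.
    if link.uid == fsuid:
        return True
    # (2) The parent directory is not both sticky and world-writable.
    sticky_ww = stat.S_ISVTX | stat.S_IWOTH
    if (parent.mode & sticky_ww) != sticky_ww:
        return True
    # (3) The parent directory's owner also owns the symlink.
    if parent.uid == link.uid:
        return True
    return False
```

For the /tmp/foo example, may_follow(0, Inode(1000, stat.S_IFLNK |
0o777), Inode(0, stat.S_IFDIR | 0o1777)) is False.  Change the parent
to an ordinary 0755 directory inside a container filesystem and the
same call returns True, which is the gap for the container use case.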

The existing SELinux lnk_file read permission check enables you to apply 
stronger label-based controls to all symlinks within the container 
filesystem, not just ones in /tmp-like directories.  Don't allow 
unconfined_t or any other privileged domain read permission to 
container_file_t:lnk_file (or preferably to any file type for which 
:lnk_file create is allowed to container process domains), and you'll 
never have to worry about them following a symlink writable by a 
container process.  This of course assumes that the container filesystem 
is always labeled with a type that is untrusted, whether via mount 
contexts or actual labels.
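
As a toy model of that check (Python, with allow rules reduced to
plain tuples; this is not how policy is stored or evaluated in the
kernel), the decision to follow a symlink depends only on the follower
domain and the label of the link itself.  The domains container_t and
copy_helper_t and the rule set below are assumptions made up for the
example.

```python
# (source_type, target_type, object_class, permission)
AllowRule = tuple[str, str, str, str]


def may_follow_symlink(policy: set[AllowRule], domain: str,
                       link_type: str) -> bool:
    """Following a symlink needs lnk_file read on the link's own type.

    The type of whatever the link points at is never consulted.
    """
    return (domain, link_type, "lnk_file", "read") in policy


# Hypothetical policy: container processes may create and read their
# own symlinks; a dedicated copy helper may write files but has no
# lnk_file read rule, so it cannot follow container-created links.
EXAMPLE_POLICY: set[AllowRule] = {
    ("container_t", "container_file_t", "lnk_file", "create"),
    ("container_t", "container_file_t", "lnk_file", "read"),
    ("copy_helper_t", "container_file_t", "file", "write"),
}
```

Here may_follow_symlink(EXAMPLE_POLICY, "copy_helper_t",
"container_file_t") is False regardless of where the link points.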


* Re: New Container vulnerability could potentially use an SELinux fix.
  2019-06-10 14:08         ` Stephen Smalley
@ 2019-06-10 14:37           ` Daniel Walsh
  2019-06-10 15:00             ` Stephen Smalley
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Walsh @ 2019-06-10 14:37 UTC (permalink / raw)
  To: Stephen Smalley, Miloslav Trmac, selinux

On 6/10/19 10:08 AM, Stephen Smalley wrote:
> On 6/8/19 10:08 AM, Daniel Walsh wrote:
>> On 6/7/19 5:26 PM, Stephen Smalley wrote:
>>> On 6/7/19 5:06 PM, Daniel Walsh wrote:
>>>> On 6/7/19 12:44 PM, Stephen Smalley wrote:
>>>>> On 6/7/19 11:42 AM, Daniel Walsh wrote:
>>>>>> We have periodic vulnerablities around bad container images having
>>>>>> symbolic link attacks against the host.
>>>>>>
>>>>>> One came out last week about doing a `podman cp`
>>>>>>
>>>>>> Which would copy content from the host into the container.  The
>>>>>> issue
>>>>>> was that if the container was running, it could trick the processes
>>>>>> copying content into it to follow a symbolic link to external of the
>>>>>> container image.
>>>>>>
>>>>>> The question came up, is there a way to use SELinux to prevent
>>>>>> this. And
>>>>>> sadly the answer right now is no, because we have no way to know
>>>>>> what
>>>>>> the label of the process attempting to update the container file
>>>>>> system
>>>>>> is running as.  Usually it will be running as unconfined_t.
>>>>>>
>>>>>> One idea would be to add a rule to policy that control the
>>>>>> following of
>>>>>> symbolic links to only those specified in policy.
>>>>>>
>>>>>>
>>>>>> Something like
>>>>>>
>>>>>> SPECIALRESTRICTED TYPE container_file_t
>>>>>>
>>>>>> allow container_file_t container_file_t:symlink follow;
>>>>>>
>>>>>> Then if a process attempted to copy content onto a symbolic link
>>>>>> from
>>>>>> container_file_t to a non container_file_t type, the kernel would
>>>>>> deny
>>>>>> access.
>>>>>>
>>>>>> Thoughts?
>>>>>
>>>>> SELinux would prevent it if you didn't allow unconfined_t (or other
>>>>> privileged domains) to follow untrustworthy symlinks (e.g. don't
>>>>> allow
>>>>> unconfined_t container_file_t:lnk_file read; in the first place).
>>>>> That's the right way to prevent it.
>>>>>
>>>>> Trying to apply a check between symlink and its target as you suggest
>>>>> is problematic; we don't generally have them both at the same point.
>>>>> If we are allowed to follow the symlink, we read its contents and
>>>>> perform a path walk on that, and that could be a multi-component
>>>>> pathname lookup that itself spans further symlinks, mount points,
>>>>> etc.  I think that would be challenging to support in the kernel,
>>>>> subject to races, and certainly would require changes outside of just
>>>>> SELinux.
>>>>>
>>>>> If you truly cannot impose such restrictions on unconfined_t, then
>>>>> maybe podman should run in its own domain.
>>>>>
>>>> This is not an issue with just podman.  Podman can mount the image and
>>>> the tools can just read/write content into the mountpoint.
>>>>
>>>> I thought I recalled a LSM that prefented symlink attacks when users
>>>> would link a file in the homedir against /etc/shadow and then
>>>> attempt to
>>>> get the admin to modify the file in his homedir?
>>>>
>>>> I was thinking that if that existed we could build more controls on it
>>>> based on Labels rather then just UIDs matching.
>>>
>>> Not sure if you are thinking of symlink attacks or hard link attacks.
>>> SELinux supports preventing the former by restricting the ability to
>>> follow symlinks based on lnk_file read permission, so you can prevent
>>> trusted processes from following untrustworthy symlinks.  SELinux
>>> supports preventing the latter by restricting the ability to create
>>> hard links to unauthorized files.  But you need to write your policies
>>> in a manner that leverages that support, and a fully unconfined domain
>>> isn't going to be protected via SELinux by definition; ideally you'd
>>> be phasing out unconfined altogether like Android did.  Modern kernels
>>> also have the /proc/sys/fs/protected_hardlinks and
>>> /proc/sys/fs/protected_symlinks settings, which restrict based on UID,
>>> but the symlink checks aren't based on the target of the symlink
>>> either.
>>
>> Android does not have an Admin, so it is a lot easier for them.  But not
>> going to get into that now.  I obviously understand how SELinux works.
>> But perhaps I am looking for something differntly.
>>
>> This link defines pretty close to what I would want, but extended for
>> labels rather then just UIDS.
>>
>> https://sysctl-explorer.net/fs/protected_symlinks/
>>
>>
>>> A long-standing class of security issues is the symlink-based
>>> time-of-check-time-of-use race, most commonly seen in world-writable
>>> directories like /tmp. The common method of exploitation of this flaw
>>> is to cross privilege boundaries when following a given symlink (i.e.
>>> a **PRIVILEGED** process follows a symlink belonging **PROVIDED BY
>>> OTHERS**). For a likely incomplete list of hundreds of examples across
>>> the years, please see:
>>> http://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=/tmp
>>>
>>> When set to “0”, symlink following behavior is unrestricted.
>>>
>>> When set to “1” symlinks are permitted to be followed only when
>>> outside a sticky world-writable directory **WE COULD POTENTIALLY SET
>>> THIS OR SOME OTHER FLAG**, or when the **LABEL** of the symlink and
>>> follower match, or when the directory **LABEL** matches the symlink’s
>>> **LABEL**.
>>>
>>> This protection is based on the restrictions in Openwall and
>>> grsecurity.
>>>
>
> That's the /proc/sys/fs/protected_symlinks feature I mentioned in my
> email above.  It isn't based on the target of the symlink; it is only
> based on the attributes of the follower process (e.g. root), the
> attributes of the parent directory containing the symlink (e.g. /tmp),
> and the attributes of the symlink file (e.g. /tmp/foo -> /etc/shadow).
> At no point is it checking anything about the target of the symlink,
> e.g. /etc/shadow.  If dwalsh creates a symlink under /tmp (ln -s
> /etc/shadow /tmp/foo) and root tries to follow /tmp/foo, then that
> will fail because 1) the process fsuid (root) != the /tmp/foo symlink
> owner (dwalsh), and 2) /tmp is a sticky and world-writable directory,
> and 3) the /tmp directory owner (root) != the /tmp/foo symlink owner
> (dwalsh). Note that conditions (2) and (3) render the check useless
> for your use case, since you want to prevent following any symlinks
> writable by container processes in any directory within the container
> filesystem, so the directory need not be world-writable/sticky and the
> parent directory UID/label might be identical to the symlink UID/label.
We are mounting the file system (most of the time), so we could add a
flag to indicate that this is a protected file system.
>
>
> The existing SELinux lnk_file read permission check enables you to
> apply stronger label-based controls to all symlinks within the
> container filesystem, not just ones in /tmp-like directories.  Don't
> allow unconfined_t or any other privileged domain read permission to
> container_file_t:lnk_file (or preferably to any file type for which
> :lnk_file create is allowed to container process domains), and you'll
> never have to worry about them following a symlink writable by a
> container process.  This of course assumes that the container
> filesystem is always labeled with a type that is untrusted, whether
> via mount contexts or actual labels.

But we want to allow domains to follow container_file_t links that point
to container_file_t objects, just not follow them when they point to
other types.  As it stands, there is no rule I could write for a domain
like unconfined_t that says: only follow links when the link and target
types match, or when there is an allow rule between those types.
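
To spell out the decision being asked for, here is a userspace Python
sketch.  It is only a model: follow_if_types_match is an invented name,
and type_of is a stand-in label lookup passed in by the caller rather
than a real SELinux API.  Being check-then-use, it is also racy, which
is the kernel-side difficulty raised earlier in the thread.

```python
import os
from typing import Callable


def follow_if_types_match(link: str, type_of: Callable[[str], str]) -> str:
    """Resolve `link`; return the target only if both types are equal.

    `type_of` maps a path to an SELinux-style type name (hypothetical
    lookup supplied by the caller).
    """
    target = os.path.realpath(link)
    link_type = type_of(link)
    target_type = type_of(target)
    if link_type != target_type:
        raise PermissionError(
            f"{link} ({link_type}) points to {target} ({target_type})"
        )
    return target
```

A container_file_t link to a container_file_t file resolves normally,
while a container_file_t link to a file of another type raises
PermissionError.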





* Re: New Container vulnerability could potentially use an SELinux fix.
  2019-06-10 14:37           ` Daniel Walsh
@ 2019-06-10 15:00             ` Stephen Smalley
  2019-06-10 16:58               ` Daniel Walsh
  2019-06-10 17:01               ` Daniel Walsh
  0 siblings, 2 replies; 10+ messages in thread
From: Stephen Smalley @ 2019-06-10 15:00 UTC (permalink / raw)
  To: dwalsh, Miloslav Trmac, selinux

On 6/10/19 10:37 AM, Daniel Walsh wrote:
> On 6/10/19 10:08 AM, Stephen Smalley wrote:
>> On 6/8/19 10:08 AM, Daniel Walsh wrote:
>>> On 6/7/19 5:26 PM, Stephen Smalley wrote:
>>>> On 6/7/19 5:06 PM, Daniel Walsh wrote:
>>>>> On 6/7/19 12:44 PM, Stephen Smalley wrote:
>>>>>> On 6/7/19 11:42 AM, Daniel Walsh wrote:
>>>>>>> We have periodic vulnerablities around bad container images having
>>>>>>> symbolic link attacks against the host.
>>>>>>>
>>>>>>> One came out last week about doing a `podman cp`
>>>>>>>
>>>>>>> Which would copy content from the host into the container.  The
>>>>>>> issue
>>>>>>> was that if the container was running, it could trick the processes
>>>>>>> copying content into it to follow a symbolic link to external of the
>>>>>>> container image.
>>>>>>>
>>>>>>> The question came up, is there a way to use SELinux to prevent
>>>>>>> this. And
>>>>>>> sadly the answer right now is no, because we have no way to know
>>>>>>> what
>>>>>>> the label of the process attempting to update the container file
>>>>>>> system
>>>>>>> is running as.  Usually it will be running as unconfined_t.
>>>>>>>
>>>>>>> One idea would be to add a rule to policy that control the
>>>>>>> following of
>>>>>>> symbolic links to only those specified in policy.
>>>>>>>
>>>>>>>
>>>>>>> Something like
>>>>>>>
>>>>>>> SPECIALRESTRICTED TYPE container_file_t
>>>>>>>
>>>>>>> allow container_file_t container_file_t:symlink follow;
>>>>>>>
>>>>>>> Then if a process attempted to copy content onto a symbolic link
>>>>>>> from
>>>>>>> container_file_t to a non container_file_t type, the kernel would
>>>>>>> deny
>>>>>>> access.
>>>>>>>
>>>>>>> Thoughts?
>>>>>>
>>>>>> SELinux would prevent it if you didn't allow unconfined_t (or other
>>>>>> privileged domains) to follow untrustworthy symlinks (e.g. don't
>>>>>> allow
>>>>>> unconfined_t container_file_t:lnk_file read; in the first place).
>>>>>> That's the right way to prevent it.
>>>>>>
>>>>>> Trying to apply a check between symlink and its target as you suggest
>>>>>> is problematic; we don't generally have them both at the same point.
>>>>>> If we are allowed to follow the symlink, we read its contents and
>>>>>> perform a path walk on that, and that could be a multi-component
>>>>>> pathname lookup that itself spans further symlinks, mount points,
>>>>>> etc.  I think that would be challenging to support in the kernel,
>>>>>> subject to races, and certainly would require changes outside of just
>>>>>> SELinux.
>>>>>>
>>>>>> If you truly cannot impose such restrictions on unconfined_t, then
>>>>>> maybe podman should run in its own domain.
>>>>>>
>>>>> This is not an issue with just podman.  Podman can mount the image and
>>>>> the tools can just read/write content into the mountpoint.
>>>>>
>>>>> I thought I recalled an LSM that prevented symlink attacks when users
>>>>> would link a file in the homedir against /etc/shadow and then
>>>>> attempt to
>>>>> get the admin to modify the file in his homedir?
>>>>>
>>>>> I was thinking that if that existed we could build more controls on it
>>>>> based on Labels rather than just UIDs matching.
>>>>
>>>> Not sure if you are thinking of symlink attacks or hard link attacks.
>>>> SELinux supports preventing the former by restricting the ability to
>>>> follow symlinks based on lnk_file read permission, so you can prevent
>>>> trusted processes from following untrustworthy symlinks.  SELinux
>>>> supports preventing the latter by restricting the ability to create
>>>> hard links to unauthorized files.  But you need to write your policies
>>>> in a manner that leverages that support, and a fully unconfined domain
>>>> isn't going to be protected via SELinux by definition; ideally you'd
>>>> be phasing out unconfined altogether like Android did.  Modern kernels
>>>> also have the /proc/sys/fs/protected_hardlinks and
>>>> /proc/sys/fs/protected_symlinks settings, which restrict based on UID,
>>>> but the symlink checks aren't based on the target of the symlink
>>>> either.
>>>
>>> Android does not have an Admin, so it is a lot easier for them.  But not
>>> going to get into that now.  I obviously understand how SELinux works.
>>> But perhaps I am looking for something differently.
>>>
>>> This link defines pretty close to what I would want, but extended for
>>> labels rather than just UIDS.
>>>
>>> https://sysctl-explorer.net/fs/protected_symlinks/
>>>
>>>
>>>> A long-standing class of security issues is the symlink-based
>>>> time-of-check-time-of-use race, most commonly seen in world-writable
>>>> directories like /tmp. The common method of exploitation of this flaw
>>>> is to cross privilege boundaries when following a given symlink (i.e.
>>>> a **PRIVILEGED** process follows a symlink belonging **PROVIDED BY
>>>> OTHERS**). For a likely incomplete list of hundreds of examples across
>>>> the years, please see:
>>>> http://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=/tmp
>>>>
>>>> When set to “0”, symlink following behavior is unrestricted.
>>>>
>>>> When set to “1” symlinks are permitted to be followed only when
>>>> outside a sticky world-writable directory **WE COULD POTENTIALLY SET
>>>> THIS OR SOME OTHER FLAG**, or when the **LABEL** of the symlink and
>>>> follower match, or when the directory **LABEL** matches the symlink’s
>>>> **LABEL**.
>>>>
>>>> This protection is based on the restrictions in Openwall and
>>>> grsecurity.
>>>>
>>
>> That's the /proc/sys/fs/protected_symlinks feature I mentioned in my
>> email above.  It isn't based on the target of the symlink; it is only
>> based on the attributes of the follower process (e.g. root), the
>> attributes of the parent directory containing the symlink (e.g. /tmp),
>> and the attributes of the symlink file (e.g. /tmp/foo -> /etc/shadow).
>> At no point is it checking anything about the target of the symlink,
>> e.g. /etc/shadow.  If dwalsh creates a symlink under /tmp (ln -s
>> /etc/shadow /tmp/foo) and root tries to follow /tmp/foo, then that
>> will fail because 1) the process fsuid (root) != the /tmp/foo symlink
>> owner (dwalsh), and 2) /tmp is a sticky and world-writable directory,
>> and 3) the /tmp directory owner (root) != the /tmp/foo symlink owner
>> (dwalsh). Note that conditions (2) and (3) render the check useless
>> for your use case, since you want to prevent following any symlinks
>> writable by container processes in any directory within the container
>> filesystem, so the directory need not be world-writable/sticky and the
>> parent directory UID/label might be identical to the symlink UID/label.
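
As an illustration only, the decision walked through above can be modelled
in a few lines of Python.  This is a simplified sketch of the behaviour as
described in this thread, not the kernel implementation; the function name
and its parameters are hypothetical.

```python
import stat


def may_follow_link(follower_fsuid, link_uid, dir_mode, dir_uid,
                    protected_symlinks=True):
    """Model of the UID-based protected_symlinks decision (hypothetical helper).

    Only the follower, the symlink and its parent directory are consulted;
    the target of the symlink is never examined.
    """
    if not protected_symlinks:
        return True  # setting is 0: following is unrestricted
    if follower_fsuid == link_uid:
        return True  # condition (1) does not hold: follower owns the link
    sticky_world_writable = stat.S_ISVTX | stat.S_IWOTH
    if (dir_mode & sticky_world_writable) != sticky_world_writable:
        return True  # condition (2) does not hold: not a /tmp-like directory
    if dir_uid == link_uid:
        return True  # condition (3) does not hold: directory owner owns the link
    return False     # all three conditions hold: refuse to follow


# root (0) following dwalsh's (1000) link in a root-owned /tmp (mode 1777)
print(may_follow_link(0, 1000, 0o41777, 0))
# root following a link in an ordinary container directory owned by UID 1000
print(may_follow_link(0, 1000, 0o40755, 1000))
```

In this model the /tmp case is refused, while the link inside an ordinary
container directory is followed, which is why the existing check does not
cover the container use case.
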
> We are mounting the file system (most of the time), so we could add a
> flag to indicate that this is a protected file system.

You are effectively already doing that by mounting with a context mount 
that assigns container_file_t or whatever type to the filesystem.  You 
don't need something new there.

>>
>>
>> The existing SELinux lnk_file read permission check enables you to
>> apply stronger label-based controls to all symlinks within the
>> container filesystem, not just ones in /tmp-like directories.  Don't
>> allow unconfined_t or any other privileged domain read permission to
>> container_file_t:lnk_file (or preferably to any file type for which
>> :lnk_file create is allowed to container process domains), and you'll
>> never have to worry about them following a symlink writable by a
>> container process.  This of course assumes that the container
>> filesystem is always labeled with a type that is untrusted, whether
>> via mount contexts or actual labels.
> 
> But we want to allow domains to follow container_file_t links that point
> to container_file_t objects.  Just not follow them if they point to
> other types.  This means there is no Protection that I could write to a
> domain like unconfined_t to say only follow links when the types match.
> Or the types have allow rules.

You really don't want programs on the host OS that are acting on a 
container filesystem to ever follow any symlinks within it.  It just 
isn't a good idea; even if you limit it to intra-container symlinks, 
then an attacker could use the host process to overwrite some file 
within the container that wasn't directly writable by him.

In any event, I don't know how one would implement a check between the 
symlink and its target; you'd have to save the symlink information until 
you reach the final target and then call a hook with both of them.  And 
what if there are multiple symlinks in that path?  Symlinks to symlinks?
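
To make the multiple-symlink concern concrete, here is a minimal userspace
sketch in Python that resolves a path one component at a time and records
every symlink it crosses.  It only mimics the shape of a path walk; it is
not how the kernel implements lookup, and symlinks_traversed and max_links
are names invented for this example.

```python
import os
import stat


def symlinks_traversed(path, max_links=40):
    """Resolve path component by component.

    Returns (links, final) where links is a list of (symlink_path, link_text)
    pairs in the order they were crossed and final is the resolved path.
    """
    if not os.path.isabs(path):
        path = os.path.join(os.getcwd(), path)
    # Stack of components still to resolve; the next one is at the end.
    pending = [c for c in path.split(os.sep) if c]
    pending.reverse()
    links = []
    current = os.sep
    while pending:
        comp = pending.pop()
        if comp == ".":
            continue
        if comp == "..":
            current = os.path.dirname(current)
            continue
        candidate = os.path.join(current, comp)
        try:
            st = os.lstat(candidate)
        except FileNotFoundError:
            current = candidate  # missing component: keep walking lexically
            continue
        if stat.S_ISLNK(st.st_mode):
            if len(links) >= max_links:
                raise OSError("too many levels of symbolic links")
            text = os.readlink(candidate)
            links.append((candidate, text))
            if text.startswith(os.sep):
                current = os.sep  # absolute link text restarts at the root
            # The link text is itself a path that must be walked in turn.
            pending.extend(reversed([c for c in text.split(os.sep) if c]))
        else:
            current = candidate
    return links, current
```

A check "between the symlink and its target" would have to decide which of
the recorded links counts as the source, and the answer can change between
this walk and the actual access.
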






* Re: New Container vulnerability could potentially use an SELinux fix.
  2019-06-10 15:00             ` Stephen Smalley
@ 2019-06-10 16:58               ` Daniel Walsh
  2019-06-10 17:01               ` Daniel Walsh
  1 sibling, 0 replies; 10+ messages in thread
From: Daniel Walsh @ 2019-06-10 16:58 UTC (permalink / raw)
  To: Stephen Smalley, Miloslav Trmac, selinux

On 6/10/19 11:00 AM, Stephen Smalley wrote:
> On 6/10/19 10:37 AM, Daniel Walsh wrote:
>> On 6/10/19 10:08 AM, Stephen Smalley wrote:
>>> On 6/8/19 10:08 AM, Daniel Walsh wrote:
>>>> On 6/7/19 5:26 PM, Stephen Smalley wrote:
>>>>> On 6/7/19 5:06 PM, Daniel Walsh wrote:
>>>>>> On 6/7/19 12:44 PM, Stephen Smalley wrote:
>>>>>>> On 6/7/19 11:42 AM, Daniel Walsh wrote:
>>>>>>>> We have periodic vulnerabilities around bad container images having
>>>>>>>> symbolic link attacks against the host.
>>>>>>>>
>>>>>>>> One came out last week about doing a `podman cp`
>>>>>>>>
>>>>>>>> Which would copy content from the host into the container.  The
>>>>>>>> issue
>>>>>>>> was that if the container was running, it could trick the
>>>>>>>> processes
>>>>>>>> copying content into it to follow a symbolic link to external
>>>>>>>> of the
>>>>>>>> container image.
>>>>>>>>
>>>>>>>> The question came up, is there a way to use SELinux to prevent
>>>>>>>> this. And
>>>>>>>> sadly the answer right now is no, because we have no way to know
>>>>>>>> what
>>>>>>>> the label of the process attempting to update the container file
>>>>>>>> system
>>>>>>>> is running as.  Usually it will be running as unconfined_t.
>>>>>>>>
>>>>>>>> One idea would be to add a rule to policy that control the
>>>>>>>> following of
>>>>>>>> symbolic links to only those specified in policy.
>>>>>>>>
>>>>>>>>
>>>>>>>> Something like
>>>>>>>>
>>>>>>>> SPECIALRESTRICTED TYPE container_file_t
>>>>>>>>
>>>>>>>> allow container_file_t container_file_t:symlink follow;
>>>>>>>>
>>>>>>>> Then if a process attempted to copy content onto a symbolic link
>>>>>>>> from
>>>>>>>> container_file_t to a non container_file_t type, the kernel would
>>>>>>>> deny
>>>>>>>> access.
>>>>>>>>
>>>>>>>> Thoughts?
>>>>>>>
>>>>>>> SELinux would prevent it if you didn't allow unconfined_t (or other
>>>>>>> privileged domains) to follow untrustworthy symlinks (e.g. don't
>>>>>>> allow
>>>>>>> unconfined_t container_file_t:lnk_file read; in the first place).
>>>>>>> That's the right way to prevent it.
>>>>>>>
>>>>>>> Trying to apply a check between symlink and its target as you
>>>>>>> suggest
>>>>>>> is problematic; we don't generally have them both at the same
>>>>>>> point.
>>>>>>> If we are allowed to follow the symlink, we read its contents and
>>>>>>> perform a path walk on that, and that could be a multi-component
>>>>>>> pathname lookup that itself spans further symlinks, mount points,
>>>>>>> etc.  I think that would be challenging to support in the kernel,
>>>>>>> subject to races, and certainly would require changes outside of
>>>>>>> just
>>>>>>> SELinux.
>>>>>>>
>>>>>>> If you truly cannot impose such restrictions on unconfined_t, then
>>>>>>> maybe podman should run in its own domain.
>>>>>>>
>>>>>> This is not an issue with just podman.  Podman can mount the
>>>>>> image and
>>>>>> the tools can just read/write content into the mountpoint.
>>>>>>
>>>>>> I thought I recalled an LSM that prevented symlink attacks when users
>>>>>> would link a file in the homedir against /etc/shadow and then
>>>>>> attempt to
>>>>>> get the admin to modify the file in his homedir?
>>>>>>
>>>>>> I was thinking that if that existed we could build more controls
>>>>>> on it
>>>>>> based on Labels rather than just UIDs matching.
>>>>>
>>>>> Not sure if you are thinking of symlink attacks or hard link attacks.
>>>>> SELinux supports preventing the former by restricting the ability to
>>>>> follow symlinks based on lnk_file read permission, so you can prevent
>>>>> trusted processes from following untrustworthy symlinks.  SELinux
>>>>> supports preventing the latter by restricting the ability to create
>>>>> hard links to unauthorized files.  But you need to write your
>>>>> policies
>>>>> in a manner that leverages that support, and a fully unconfined
>>>>> domain
>>>>> isn't going to be protected via SELinux by definition; ideally you'd
>>>>> be phasing out unconfined altogether like Android did.  Modern
>>>>> kernels
>>>>> also have the /proc/sys/fs/protected_hardlinks and
>>>>> /proc/sys/fs/protected_symlinks settings, which restrict based on
>>>>> UID,
>>>>> but the symlink checks aren't based on the target of the symlink
>>>>> either.
>>>>
>>>> Android does not have an Admin, so it is a lot easier for them. 
>>>> But not
>>>> going to get into that now.  I obviously understand how SELinux works.
>>>> But perhaps I am looking for something differently.
>>>>
>>>> This link defines pretty close to what I would want, but extended for
>>>> labels rather than just UIDS.
>>>>
>>>> https://sysctl-explorer.net/fs/protected_symlinks/
>>>>
>>>>
>>>>> A long-standing class of security issues is the symlink-based
>>>>> time-of-check-time-of-use race, most commonly seen in world-writable
>>>>> directories like /tmp. The common method of exploitation of this flaw
>>>>> is to cross privilege boundaries when following a given symlink (i.e.
>>>>> a **PRIVILEGED** process follows a symlink belonging **PROVIDED BY
>>>>> OTHERS**). For a likely incomplete list of hundreds of examples
>>>>> across
>>>>> the years, please see:
>>>>> http://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=/tmp
>>>>>
>>>>> When set to “0”, symlink following behavior is unrestricted.
>>>>>
>>>>> When set to “1” symlinks are permitted to be followed only when
>>>>> outside a sticky world-writable directory **WE COULD POTENTIALLY SET
>>>>> THIS OR SOME OTHER FLAG**, or when the **LABEL** of the symlink and
>>>>> follower match, or when the directory **LABEL** matches the symlink’s
>>>>> **LABEL**.
>>>>>
>>>>> This protection is based on the restrictions in Openwall and
>>>>> grsecurity.
>>>>>
>>>
>>> That's the /proc/sys/fs/protected_symlinks feature I mentioned in my
>>> email above.  It isn't based on the target of the symlink; it is only
>>> based on the attributes of the follower process (e.g. root), the
>>> attributes of the parent directory containing the symlink (e.g. /tmp),
>>> and the attributes of the symlink file (e.g. /tmp/foo -> /etc/shadow).
>>> At no point is it checking anything about the target of the symlink,
>>> e.g. /etc/shadow.  If dwalsh creates a symlink under /tmp (ln -s
>>> /etc/shadow /tmp/foo) and root tries to follow /tmp/foo, then that
>>> will fail because 1) the process fsuid (root) != the /tmp/foo symlink
>>> owner (dwalsh), and 2) /tmp is a sticky and world-writable directory,
>>> and 3) the /tmp directory owner (root) != the /tmp/foo symlink owner
>>> (dwalsh). Note that conditions (2) and (3) render the check useless
>>> for your use case, since you want to prevent following any symlinks
>>> writable by container processes in any directory within the container
>>> filesystem, so the directory need not be world-writable/sticky and the
>>> parent directory UID/label might be identical to the symlink UID/label.
>> We are mounting the file system (most of the time), so we could add a
>> flag to indicate that this is a protected file system.
>
> You are effectively already doing that by mounting with a context
> mount that assigns container_file_t or whatever type to the
> filesystem.  You don't need something new there.
Well, yes with the overlay driver, but not with the VFS driver and maybe
not with fuse-overlay.
>
>>>
>>>
>>> The existing SELinux lnk_file read permission check enables you to
>>> apply stronger label-based controls to all symlinks within the
>>> container filesystem, not just ones in /tmp-like directories.  Don't
>>> allow unconfined_t or any other privileged domain read permission to
>>> container_file_t:lnk_file (or preferably to any file type for which
>>> :lnk_file create is allowed to container process domains), and you'll
>>> never have to worry about them following a symlink writable by a
>>> container process.  This of course assumes that the container
>>> filesystem is always labeled with a type that is untrusted, whether
>>> via mount contexts or actual labels.
>>
>> But we want to allow domains to follow container_file_t links that point
>> to container_file_t objects.  Just not follow them if they point to
>> other types.  This means there is no Protection that I could write to a
>> domain like unconfined_t to say only follow links when the types match.
>> Or the types have allow rules.
>
> You really don't want programs on the host OS that are acting on a
> container filesystem to ever follow any symlinks within it.  It just
> isn't a good idea; even if you limit it to intra-container symlinks,
> then an attacker could use the host process to overwrite some file
> within the container that wasn't directly writable by him.
>
> In any event, I don't know how one would implement a check between the
> symlink and its target; you'd have to save the symlink information
> until you reach the final target and then call a hook with both of
> them.  And what if there are multiple symlinks in that path?  Symlinks
> to symlinks?
>
>
>
>



* Re: New Container vulnerability could potentially use an SELinux fix.
  2019-06-10 15:00             ` Stephen Smalley
  2019-06-10 16:58               ` Daniel Walsh
@ 2019-06-10 17:01               ` Daniel Walsh
  1 sibling, 0 replies; 10+ messages in thread
From: Daniel Walsh @ 2019-06-10 17:01 UTC (permalink / raw)
  To: Stephen Smalley, Miloslav Trmac, selinux

On 6/10/19 11:00 AM, Stephen Smalley wrote:
> On 6/10/19 10:37 AM, Daniel Walsh wrote:
>> On 6/10/19 10:08 AM, Stephen Smalley wrote:
>>> On 6/8/19 10:08 AM, Daniel Walsh wrote:
>>>> On 6/7/19 5:26 PM, Stephen Smalley wrote:
>>>>> On 6/7/19 5:06 PM, Daniel Walsh wrote:
>>>>>> On 6/7/19 12:44 PM, Stephen Smalley wrote:
>>>>>>> On 6/7/19 11:42 AM, Daniel Walsh wrote:
>>>>>>>> We have periodic vulnerabilities around bad container images having
>>>>>>>> symbolic link attacks against the host.
>>>>>>>>
>>>>>>>> One came out last week about doing a `podman cp`
>>>>>>>>
>>>>>>>> Which would copy content from the host into the container.  The
>>>>>>>> issue
>>>>>>>> was that if the container was running, it could trick the
>>>>>>>> processes
>>>>>>>> copying content into it to follow a symbolic link to external
>>>>>>>> of the
>>>>>>>> container image.
>>>>>>>>
>>>>>>>> The question came up, is there a way to use SELinux to prevent
>>>>>>>> this. And
>>>>>>>> sadly the answer right now is no, because we have no way to know
>>>>>>>> what
>>>>>>>> the label of the process attempting to update the container file
>>>>>>>> system
>>>>>>>> is running as.  Usually it will be running as unconfined_t.
>>>>>>>>
>>>>>>>> One idea would be to add a rule to policy that control the
>>>>>>>> following of
>>>>>>>> symbolic links to only those specified in policy.
>>>>>>>>
>>>>>>>>
>>>>>>>> Something like
>>>>>>>>
>>>>>>>> SPECIALRESTRICTED TYPE container_file_t
>>>>>>>>
>>>>>>>> allow container_file_t container_file_t:symlink follow;
>>>>>>>>
>>>>>>>> Then if a process attempted to copy content onto a symbolic link
>>>>>>>> from
>>>>>>>> container_file_t to a non container_file_t type, the kernel would
>>>>>>>> deny
>>>>>>>> access.
>>>>>>>>
>>>>>>>> Thoughts?
>>>>>>>
>>>>>>> SELinux would prevent it if you didn't allow unconfined_t (or other
>>>>>>> privileged domains) to follow untrustworthy symlinks (e.g. don't
>>>>>>> allow
>>>>>>> unconfined_t container_file_t:lnk_file read; in the first place).
>>>>>>> That's the right way to prevent it.
>>>>>>>
>>>>>>> Trying to apply a check between symlink and its target as you
>>>>>>> suggest
>>>>>>> is problematic; we don't generally have them both at the same
>>>>>>> point.
>>>>>>> If we are allowed to follow the symlink, we read its contents and
>>>>>>> perform a path walk on that, and that could be a multi-component
>>>>>>> pathname lookup that itself spans further symlinks, mount points,
>>>>>>> etc.  I think that would be challenging to support in the kernel,
>>>>>>> subject to races, and certainly would require changes outside of
>>>>>>> just
>>>>>>> SELinux.
>>>>>>>
>>>>>>> If you truly cannot impose such restrictions on unconfined_t, then
>>>>>>> maybe podman should run in its own domain.
>>>>>>>
>>>>>> This is not an issue with just podman.  Podman can mount the
>>>>>> image and
>>>>>> the tools can just read/write content into the mountpoint.
>>>>>>
>>>>>> I thought I recalled an LSM that prevented symlink attacks when users
>>>>>> would link a file in the homedir against /etc/shadow and then
>>>>>> attempt to
>>>>>> get the admin to modify the file in his homedir?
>>>>>>
>>>>>> I was thinking that if that existed we could build more controls
>>>>>> on it
>>>>>> based on Labels rather than just UIDs matching.
>>>>>
>>>>> Not sure if you are thinking of symlink attacks or hard link attacks.
>>>>> SELinux supports preventing the former by restricting the ability to
>>>>> follow symlinks based on lnk_file read permission, so you can prevent
>>>>> trusted processes from following untrustworthy symlinks.  SELinux
>>>>> supports preventing the latter by restricting the ability to create
>>>>> hard links to unauthorized files.  But you need to write your
>>>>> policies
>>>>> in a manner that leverages that support, and a fully unconfined
>>>>> domain
>>>>> isn't going to be protected via SELinux by definition; ideally you'd
>>>>> be phasing out unconfined altogether like Android did.  Modern
>>>>> kernels
>>>>> also have the /proc/sys/fs/protected_hardlinks and
>>>>> /proc/sys/fs/protected_symlinks settings, which restrict based on
>>>>> UID,
>>>>> but the symlink checks aren't based on the target of the symlink
>>>>> either.
>>>>
>>>> Android does not have an Admin, so it is a lot easier for them. 
>>>> But not
>>>> going to get into that now.  I obviously understand how SELinux works.
>>>> But perhaps I am looking for something differently.
>>>>
>>>> This link defines pretty close to what I would want, but extended for
>>>> labels rather than just UIDS.
>>>>
>>>> https://sysctl-explorer.net/fs/protected_symlinks/
>>>>
>>>>
>>>>> A long-standing class of security issues is the symlink-based
>>>>> time-of-check-time-of-use race, most commonly seen in world-writable
>>>>> directories like /tmp. The common method of exploitation of this flaw
>>>>> is to cross privilege boundaries when following a given symlink (i.e.
>>>>> a **PRIVILEGED** process follows a symlink belonging **PROVIDED BY
>>>>> OTHERS**). For a likely incomplete list of hundreds of examples
>>>>> across
>>>>> the years, please see:
>>>>> http://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=/tmp
>>>>>
>>>>> When set to “0”, symlink following behavior is unrestricted.
>>>>>
>>>>> When set to “1” symlinks are permitted to be followed only when
>>>>> outside a sticky world-writable directory **WE COULD POTENTIALLY SET
>>>>> THIS OR SOME OTHER FLAG**, or when the **LABEL** of the symlink and
>>>>> follower match, or when the directory **LABEL** matches the symlink’s
>>>>> **LABEL**.
>>>>>
>>>>> This protection is based on the restrictions in Openwall and
>>>>> grsecurity.
>>>>>
>>>
>>> That's the /proc/sys/fs/protected_symlinks feature I mentioned in my
>>> email above.  It isn't based on the target of the symlink; it is only
>>> based on the attributes of the follower process (e.g. root), the
>>> attributes of the parent directory containing the symlink (e.g. /tmp),
>>> and the attributes of the symlink file (e.g. /tmp/foo -> /etc/shadow).
>>> At no point is it checking anything about the target of the symlink,
>>> e.g. /etc/shadow.  If dwalsh creates a symlink under /tmp (ln -s
>>> /etc/shadow /tmp/foo) and root tries to follow /tmp/foo, then that
>>> will fail because 1) the process fsuid (root) != the /tmp/foo symlink
>>> owner (dwalsh), and 2) /tmp is a sticky and world-writable directory,
>>> and 3) the /tmp directory owner (root) != the /tmp/foo symlink owner
>>> (dwalsh). Note that conditions (2) and (3) render the check useless
>>> for your use case, since you want to prevent following any symlinks
>>> writable by container processes in any directory within the container
>>> filesystem, so the directory need not be world-writable/sticky and the
>>> parent directory UID/label might be identical to the symlink UID/label.
>> We are mounting the file system (most of the time), so we could add a
>> flag to indicate that this is a protected file system.
>
> You are effectively already doing that by mounting with a context
> mount that assigns container_file_t or whatever type to the
> filesystem.  You don't need something new there.
>
>>>
>>>
>>> The existing SELinux lnk_file read permission check enables you to
>>> apply stronger label-based controls to all symlinks within the
>>> container filesystem, not just ones in /tmp-like directories.  Don't
>>> allow unconfined_t or any other privileged domain read permission to
>>> container_file_t:lnk_file (or preferably to any file type for which
>>> :lnk_file create is allowed to container process domains), and you'll
>>> never have to worry about them following a symlink writable by a
>>> container process.  This of course assumes that the container
>>> filesystem is always labeled with a type that is untrusted, whether
>>> via mount contexts or actual labels.
>>
>> But we want to allow domains to follow container_file_t links that point
>> to container_file_t objects.  Just not follow them if they point to
>> other types.  This means there is no Protection that I could write to a
>> domain like unconfined_t to say only follow links when the types match.
>> Or the types have allow rules.
>
> You really don't want programs on the host OS that are acting on a
> container filesystem to ever follow any symlinks within it.  It just
> isn't a good idea; even if you limit it to intra-container symlinks,
> then an attacker could use the host process to overwrite some file
> within the container that wasn't directly writable by him.
>
> In any event, I don't know how one would implement a check between the
> symlink and its target; you'd have to save the symlink information
> until you reach the final target and then call a hook with both of
> them.  And what if there are multiple symlinks in that path?  Symlinks
> to symlinks?
>
>
>
>
I would think we would only have to check the source and the final target.

The problem is that lots of processes like dnf, cp, and install deal with
following legitimate symbolic links.  Most of the time, processes
outside of the container that deal with the container content are
installing software.  Having this software fail any time there is a
symbolic link is not going to work.
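
For illustration, the source-versus-final-target comparison suggested above
could be modelled in userspace like this.  It is a sketch under stated
assumptions, not a proposed kernel interface: follow_allowed and
selinux_type are hypothetical names, the type is read from the
security.selinux extended attribute (Linux only), only type equality is
compared (no allow rules), and the check is inherently racy because the
link can change between the comparison and the real access.

```python
import os


def selinux_type(path):
    """Return the SELinux type of path itself, without following a symlink.

    Assumes a Linux host where the label is stored in the security.selinux
    xattr in the usual user:role:type[:level] form.
    """
    raw = os.getxattr(path, "security.selinux", follow_symlinks=False)
    context = raw.rstrip(b"\x00").decode()
    return context.split(":")[2]


def follow_allowed(link_path, type_of=selinux_type):
    """Allow following only when the link and its final target share a type.

    type_of is injectable so the rule can be exercised without SELinux.
    """
    link_type = type_of(link_path)
    # realpath resolves every intermediate symlink down to the final target.
    target_type = type_of(os.path.realpath(link_path))
    return link_type == target_type
```

Under this rule a container_file_t link to another container_file_t object
would still be followed by tools such as dnf or cp, while a link whose
final target carries a different type would be refused.
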



Thread overview: 10+ messages
2019-06-07 15:42 New Container vulnerability could potentially use an SELinux fix Daniel Walsh
2019-06-07 16:44 ` Stephen Smalley
2019-06-07 21:06   ` Daniel Walsh
2019-06-07 21:26     ` Stephen Smalley
2019-06-08 14:08       ` Daniel Walsh
2019-06-10 14:08         ` Stephen Smalley
2019-06-10 14:37           ` Daniel Walsh
2019-06-10 15:00             ` Stephen Smalley
2019-06-10 16:58               ` Daniel Walsh
2019-06-10 17:01               ` Daniel Walsh
