All of lore.kernel.org
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Aleksa Sarai <asarai@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Kees Cook <keescook@chromium.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	John Stultz <john.stultz@linaro.org>,
	Mateusz Guzik <mguzik@redhat.com>,
	Janis Danisevskis <jdanis@google.com>,
	linux-kernel@vger.kernel.org, dev@opencontainers.org,
	containers@lists.linux-foundation.org
Subject: Re: [PATCH] procfs: change the owner of non-dumpable and writeable files
Date: Fri, 20 Jan 2017 17:35:58 +1300	[thread overview]
Message-ID: <87wpdqjqkx.fsf@xmission.com> (raw)
In-Reply-To: <4025e285-9179-b98a-88c0-905f4f9c3ef8@suse.de> (Aleksa Sarai's message of "Fri, 20 Jan 2017 13:35:29 +1100")

Aleksa Sarai <asarai@suse.de> writes:

>> Please verify but the ptrace issue that allowed processes in a container
>> to call setns on our processes should be fixed as of 4.10-rc1.  And the
>> change has been marked for backporting.
>
> ptrace(2) is not the only issue, the issue that we had in runC is that
> a process joining a namespace may have file descriptors that refer to
> the host filesystem. If the process joining is dumpable, a racing
> process inside the container can access those file descriptors through
> the /proc/[pid]/fd/... mechanism.
>
> See CVE-2016-9962.

Wow, I said that badly.  But the issue is ptrace_attach and the
ptrace_may_access checks.

>> AKA it should be this fix that removes the need for your dumpable setting.
>> bfedb589252c ("mm: Add a user_ns owner to mm_struct and fix ptrace permission checks")
>
> I will check, though from what I recall that patch doesn't fix the
> ptrace_may_access checks. Not to mention it won't help if the
> container doesn't have it's own user namespace.

That change very much is to the ptrace_may_access checks.

You are not playing with setgroups if you don't have your own user
namespace.  So I don't see how the other cases are relevant.

Going further there are only two namespaces that are only 3 pieces of
information that I see are relevant in this context.  The pid namespace,
the user namespace, and the process credentials.

You can not join a PID namespace you can only fork into one.

The user namespace is relevant in older kernels as joining a user
namespace would allow ptrace_has_cap check in ptrace_may_access
to pass.  After the change listed above that check does not start
passing until after exec.

The credentials on your process are relevant as changing those
allow the ptrace_may_access checks to start passing for other sets
of processes.



What I think I would do in the situation you describe is to
join what you are going to join.  Limit yourself to creating pid
namespaces with unshare.

If you are joining a user namespace set undumpable.
If you are creating a user namespace create it and then set undumpable.

fork your process into the new pid namespace.
change your pocesses credentials.
exec whatever you are execing, letting close_on_exec clean up for you.

>> Now with that said I believe we want to add the following change now
>> that dumpable is user namespace relative.  That will use not the
>> GLOBAL_ROOT_UID/GID but instead uid and gid 0 in the namespace
>> that dumpable is relative too.
>
> Sure, but that's tangential to the issue under discussion.

It is pretty much the only case I see worth considering.  Certainly I
don't see setgroups as a motivational example.

>> But ugh!  Your case is even more confused that I had first noticed.
>> Saying that a processes is undumpable is completely unnecessary
>> when you are entering into a new fresh user namespace.  Touching
>> setgroups at any point where there are other processes in the namespace
>> makes no sense whatsoever.
>
> Currently in runC the ordering for mixed create-and-join namespaces is
> that we first join existing namespaces and _then_ create new ones. So
> we need to be non-dumpable to avoid the problem in CVE-2016-9962.

Yes mixed create and join is trickier than a pure create and use case,
and trickier than a pure join case.  But see above it looks quite doable
to me to avoid being vulnerable while using the existing permissions
even in the case you describe.

>> Clearing dumpable is to help not leak things
>> into a container when you call setns on a user namespace.
>
> It is also to help not leak things into a container when you join
> other namespaces. Most notably the PID namespace.

Except that you don't strictly join a PID namespace.  You set a context
for children to run in a different PID namespace.  So you are safe
from PID namespaces as long as you don't call fork.

>> +	if (mode != (S_IFDIR|S_IRUGO|S_IXUGO)) {
>
> I'd just like to draw your attention to this special case -- why is
> this special cased? What was the original reasoning behind it? Does it
> make sense for a non-dumpable process to allow someone to change the
> mode of some random /proc/[pid]/ directories?

This has nothing at all to do with changing modes and is all about
what uid/gid are set on the proc inode.   Usually it is the uid/gid
of the process in question but occassionally for undumpable processes
it is root/root to prevent people from accessing the files in question.

> I get the feeling that some of this logic is a bit iffy.

It looks like I forgot to carry forward the comment that explains that
case in my patch.  Something I need to fix before I merge it.

/*
 * Before the /proc/pid/status file was created the only way to read
 * the effective uid of a /process was to stat /proc/pid.  Reading
 * /proc/pid/status is slow enough that procps and other packages
 * kept stating /proc/pid.  To keep the rules in /proc simple I have
 * made this apply to all per process world readable and executable
 * directories.
 */

Or in short.  I broke ps when I removed all of the special cases, and to
fix ps I added the existing special case.  Not that the uid or gid of a
directory that the whole world can access matters.

Eric

  reply	other threads:[~2017-01-20  4:40 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-18  4:01 [PATCH] procfs: change the owner of non-dumpable and writeable files Aleksa Sarai
2017-01-18  4:01 ` Aleksa Sarai
     [not found] ` <20170118040159.4751-1-asarai-l3A5Bk7waGM@public.gmane.org>
2017-01-18 23:22   ` Kees Cook
2017-01-18 23:22     ` Kees Cook
     [not found]     ` <CAGXu5jL4R-rqv-33XvvsMOqcxBgPRm-eAeRThdN515Ux3TSiFw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-01-18 23:34       ` Aleksa Sarai
2017-01-18 23:34         ` Aleksa Sarai
2017-01-19  9:29   ` Michal Hocko
2017-01-19  9:29     ` Michal Hocko
     [not found]     ` <20170119092930.GJ30786-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2017-01-19 13:08       ` Aleksa Sarai
2017-01-19 13:08         ` Aleksa Sarai
     [not found]         ` <e7137350-b8a8-c7e5-e35d-8218a7df1147-l3A5Bk7waGM@public.gmane.org>
2017-01-20  1:57           ` Eric W. Biederman
2017-01-20  1:57             ` Eric W. Biederman
     [not found]             ` <87r33yv6gk.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2017-01-20  2:35               ` Aleksa Sarai
2017-01-20  2:35                 ` Aleksa Sarai
2017-01-20  4:35                 ` Eric W. Biederman [this message]
     [not found]                   ` <87wpdqjqkx.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2017-01-25  6:43                     ` Aleksa Sarai
2017-01-25  6:43                       ` Aleksa Sarai
     [not found]                 ` <4025e285-9179-b98a-88c0-905f4f9c3ef8-l3A5Bk7waGM@public.gmane.org>
2017-01-20  4:35                   ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87wpdqjqkx.fsf@xmission.com \
    --to=ebiederm@xmission.com \
    --cc=akpm@linux-foundation.org \
    --cc=asarai@suse.de \
    --cc=containers@lists.linux-foundation.org \
    --cc=dev@opencontainers.org \
    --cc=jdanis@google.com \
    --cc=john.stultz@linaro.org \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mguzik@redhat.com \
    --cc=mhocko@kernel.org \
    --cc=oleg@redhat.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.