linux-kernel.vger.kernel.org archive mirror
From: "Serge E. Hallyn" <serge@hallyn.com>
To: Christian Brauner <christian@brauner.io>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
	Daniel Colascione <dancol@google.com>,
	Aleksa Sarai <cyphar@cyphar.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	Jann Horn <jannh@google.com>, Andy Lutomirski <luto@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	Linux API <linux-api@vger.kernel.org>,
	Tim Murray <timmurray@google.com>,
	linux-man <linux-man@vger.kernel.org>,
	Kees Cook <keescook@chromium.org>
Subject: Re: [PATCH v1 2/2] signal: add procfd_signal() syscall
Date: Wed, 21 Nov 2018 15:39:46 -0600
Message-ID: <20181121213946.GA10795@mail.hallyn.com>
In-Reply-To: <20181120103111.etlqp7zop34v6nv4@brauner.io>

On Tue, Nov 20, 2018 at 11:31:13AM +0100, Christian Brauner wrote:
> On Mon, Nov 19, 2018 at 10:59:12PM -0600, Eric W. Biederman wrote:
> > Daniel Colascione <dancol@google.com> writes:
> > 
> > > On Mon, Nov 19, 2018 at 1:37 PM Christian Brauner <christian@brauner.io> wrote:
> > >>
> > >> On Mon, Nov 19, 2018 at 01:26:22PM -0800, Daniel Colascione wrote:
> > >> > On Mon, Nov 19, 2018 at 1:21 PM, Christian Brauner <christian@brauner.io> wrote:
> > >> > > That can be done without a loop by comparing the level counter for the
> > >> > > two pid namespaces.
> > >> > >
> > >> > >>
> > >> > >> And you can rewrite pidns_get_parent to use it. So you would instead be
> > >> > >> doing:
> > >> > >>
> > >> > >>     if (pidns_is_descendant(proc_pid_ns, task_active_pid_ns(current)))
> > >> > >>         return -EPERM;
> > >> > >>
> > >> > >> (Or you can just copy the 5-line loop into procfd_signal -- though I
> > >> > >> imagine we'll need this for all of the procfd_* APIs.)
> > >> >
> > >> > Why is any of this even necessary? Why does the child namespace we're
> > >> > considering even have a file descriptor to its ancestor's procfs? If
> > >>
> > >> Because you can send file descriptors between processes and container
> > >> runtimes tend to do that.
> > >
> > > Right. But why *would* a container runtime send one of these procfs
> > > FDs to a container?
> > >
> > >> > it has one of these FDs, it can already *read* all sorts of
> > >> > information it really shouldn't be able to acquire, so the additional
> > >> > ability to send a signal (subject to the usual permission checks)
> > >> > feels like sticking a finger in a dike that's already well-perforated.
> > >> > IMHO, we shouldn't bother with this check. The patch would be simpler
> > >> > without it.
> > >>
> > >> We will definitely not allow signaling processes in an ancestor pid
> > >> namespace! That is a security issue! I can imagine container runtimes
> > >> killing their monitoring process etc. pp. Not happening, unless someone
> > >> with deep expertise in signals can convince me otherwise.
> > >
> > > If parent namespace procfs FDs or mounts really can leak into child
> > > namespaces as easily as Aleksa says, then I don't mind adding the
> > > check. I was under the impression that if you find yourself in this
> > > situation, you already have a big problem.
> > 
> > There is one big reason to have the check, and I have not seen it
> > mentioned yet in this thread.
> > 
> > When SI_USER is set we report the pid of the sender of the signal in
> > si_pid.  When the signal comes from the kernel si_pid == 0.  When a signal
> > is sent from an ancestor pid namespace si_pid also equals 0 (which is
> > reasonable).
> > 
> > A signal out to a process in a parent pid namespace such as SIGCHLD is
> > reasonable as we can map the pid.  I really don't see the point of
> > forbidding that.  From the perspective of the process in the parent pid
> > namespace it is just another process in its pid namespace.  So it
> > should pose no problem from the perspective of the receiving process.
> > 
> > A signal to a process in a pid namespace that is neither a parent nor a
> > descendent pid namespace would be a problem, as there is no well defined
> > notion of what si_pid should be set to.  So for that case perhaps we
> > should have something like a noprocess pid that we can set.  Perhaps we
> > could set si_pid to 0xffffffff.  That would take a small extension to
> > pid_nr_ns.
> > 
> > File descriptors are not namespaced.  It is completely legitimate to use
> > file descriptors to get around limitations of namespaces.
> 
> Frankly, I don't see a good argument for why we would allow that even if
> safe. I have not heard a legitimate use-case or need for this.
> At this point I care about very simple semantics. Being able to signal
> into ancestor pid namespaces and cousin namespaces is interesting but
> makes the syscall more brittle and harder to understand.

Yeah, I'm with you on that.  We can always open that door later if a good
use case comes up, but I prefer simple at first.

> Changing pid_nr_ns() might be the solution but this function is called
> all over the place in the kernel and I'm not going to risk breaking
> something by changing it for a feature that no one so far has ever
> asked for.
> If you are ok with this then we should hold off on this. We can always
> add this feature later by removing the check when someone has a use-case
> for it.
> I'll send a v2 of the patch that keeps the restriction for now. If you
> insist on it being removed we can make the change in a follow-up
> iteration.
> 
> Christian
> 
> > 
> > Adding limitations to a file descriptor based api because someone else
> > can't set up their processes in such a way as to get the restrictions
> > they are looking for seems very sad.
> > 
> > Frankly I think it is one of the better features of namespaces that we
> > have to carefully handle and define these cases so that when the
> > inevitable leaks happen you are not immediately in a world of hurt.  All
> > of the other permission checks etc continue to do their job.  Plus you
> > are prepared for the case when someone wants their containers to have an
> > interesting communication primitive.
> > 
> > Eric
> > 
> > 
> > 
> > 
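
For readers following the si_pid discussion quoted above, here is a
minimal user-space sketch of the two checks being debated: whether one
pid namespace is a descendant of another, and which pid number a
receiver in a given namespace would see for a sender. It is only a
model, loosely patterned on how the kernel keeps one pid number per
namespace level. Every name in it (model_pidns, model_upid, model_pid,
model_pidns_is_descendant, model_pid_nr_ns, MAX_LEVELS) is hypothetical
and is not the kernel's code or the patch under review.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LEVELS 8 /* arbitrary nesting limit for this model only */

/* Hypothetical stand-in for a pid namespace. */
struct model_pidns {
	unsigned int level;               /* 0 for the initial namespace */
	const struct model_pidns *parent; /* NULL for the initial namespace */
};

/* One pid number as seen from one namespace. */
struct model_upid {
	int nr;
	const struct model_pidns *ns;
};

/* Hypothetical stand-in for a task's pid: one number per level. */
struct model_pid {
	unsigned int level; /* level of the namespace the task lives in */
	struct model_upid numbers[MAX_LEVELS];
};

/*
 * Return true if ns is the same namespace as ancestor or nested below it.
 * Walk up from ns until we reach ancestor's level, then compare.
 * Comparing levels alone is not enough: two sibling ("cousin")
 * namespaces can share a level without either being the other's ancestor.
 */
static bool model_pidns_is_descendant(const struct model_pidns *ns,
				      const struct model_pidns *ancestor)
{
	assert(ns != NULL && ancestor != NULL);

	while (ns != NULL && ns->level > ancestor->level)
		ns = ns->parent;
	return ns == ancestor;
}

/*
 * Return the number by which pid is known inside ns, or 0 if pid is not
 * visible there. 0 is what a receiver would see as si_pid when the
 * sender lives in an ancestor or an unrelated namespace.
 */
static int model_pid_nr_ns(const struct model_pid *pid,
			   const struct model_pidns *ns)
{
	assert(pid != NULL && ns != NULL);

	if (ns->level <= pid->level && ns->level < MAX_LEVELS) {
		const struct model_upid *upid = &pid->numbers[ns->level];

		if (upid->ns == ns)
			return upid->nr;
	}
	return 0;
}
```

In this model a sender that lives in a child namespace still has a
number in the parent namespace, so a signal sent "upwards" can carry a
meaningful si_pid. A sender in the parent, or in a cousin namespace,
maps to 0 in the receiver's namespace, which is the ambiguity the
0xffffffff suggestion above is meant to address.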
