linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Tycho Andersen <tycho@tycho.ws>
Cc: Matthew Helsley <matt.helsley@gmail.com>,
	Kees Cook <keescook@chromium.org>,
	lkml <linux-kernel@vger.kernel.org>,
	containers@lists.linux-foundation.org,
	Oleg Nesterov <oleg@redhat.com>,
	Akihiro Suda <suda.akihiro@lab.ntt.co.jp>,
	Tyler Hicks <tyhicks@canonical.com>,
	Christian Brauner <christian.brauner@ubuntu.com>,
	Andy Lutomirski <luto@amacapital.net>,
	"Tobin C . Harding" <me@tobin.cc>
Subject: Re: [PATCH v3 1/4] seccomp: add a return code to trap to userspace
Date: Thu, 14 Jun 2018 16:53:51 -0500	[thread overview]
Message-ID: <87in6lt4pc.fsf@xmission.com> (raw)
In-Reply-To: <20180614210325.GA5673@cisco> (Tycho Andersen's message of "Thu, 14 Jun 2018 14:03:25 -0700")

Tycho Andersen <tycho@tycho.ws> writes:

> On Thu, Jun 14, 2018 at 12:44:21PM -0700, Matthew Helsley wrote:
>> On Tue, Jun 12, 2018 at 4:16 PM, Tycho Andersen <tycho@tycho.ws> wrote:
>> 
>> > Hi Matthew,
>> >
>> > On Tue, Jun 12, 2018 at 02:39:03PM -0700, Matthew Helsley wrote:
>> > > On Thu, May 31, 2018 at 7:49 AM, Tycho Andersen <tycho@tycho.ws> wrote:
>> > >
>> > > <snip>
>> > >
>> > >
>> > > > +struct seccomp_notif {
>> > > > +       __u64 id;
>> > > > +       pid_t pid;
>> > > > +       struct seccomp_data data;
>> > > > +};
>> > > >
>> > >
>> > > Since it's part of the UAPI I think it would be good to add documentation
>> > > to this patch series. Part of that documentation should talk about which
>> > > pid namespaces this pid value is relevant in. This is especially
>> > important
>> > > since the feature is designed for use by things like container/sandbox
>> > > managers.
>> >
>> > Yes, at least Documentation/userspace-api/seccomp_filter.txt should be
>> > updated. I'll add that for the next series.
>> >
>> 
>> Are there some relevant man pages too that should be updated too?
>
> Yes, but since those are in a separate tree, I usually send the sets
> separately.
>
>> > > > +static void seccomp_do_user_notification(int this_syscall,
>> > > > +                                        struct seccomp_filter *match,
>> > > > +                                        const struct seccomp_data *sd)
>> > > > +{
>> > > > +       int err;
>> > > > +       long ret = 0;
>> > > > +       struct seccomp_knotif n = {};
>> > > > +
>> > > > +       mutex_lock(&match->notify_lock);
>> > > > +       err = -ENOSYS;
>> > > > +       if (!match->has_listener)
>> > > > +               goto out;
>> > > > +
>> > > > +       n.pid = current->pid;
>> > > >
>> > >
>> > > How have you tested this code for correctness? I don't see many
>> > > combinations being tested below nor here:
>> > > https://github.com/tych0/kernel-utils/blob/master/
>> > seccomp/notify_stress.c
>> > >
>> > > What about processes in different pid namespaces? Make tests that vary
>> > key
>> > > parameters like these between the task generating the notifications and
>> > the
>> > > task interested in processing the notifications. Make tests that try to
>> > > kill them at interesting times too. etc. Make tests that pass the
>> > > notification fd around and see how they work (or not).
>> > >
>> > > I ask about testing because you're effectively returning a pid value to
>> > > userspace here but not using the proper macros to access the task's
>> > struct
>> > > pid for that purpose. That will work so long as both processes are in the
>> > > same pid namespace but breaks otherwise.
>> > >
>> > > Furthermore, a pid value is not the best solution for the queueing of
>> > these
>> > > notifications because a single pid value is not meaningful/correct
>> > outside
>> > > current's pid namespace and the seccomp notification file descriptor
>> > could
>> > > be passed on to processes in another pid namespaces; this pid value will
>> > > almost always not be relevant or correct there hence taking a reference
>> > to
>> >
>> > Well, it *has* to be relevant in some cases: if you want to access
>> > some of the task's memory to e.g. read the args to the syscall, you
>> > need to ptrace or map_files to access the memory. So we do need to
>> > pass it, but,
>> >
>> > > the struct pid is useful. Then, during read(), the code would use the
>> > > proper macro to turn the struct pid reference into a pid value relevant
>> > in
>> > > the *reader's* pid namespace *or* return something obviously bogus if the
>> > > reader is in a pid namespace that can't see that pid. This could prevent
>> > > management processes from being tricked into clobbering the wrong
>> > process,
>> > > feeding the wrong process sensitive information via syscall results, etc.
>> >
>> > Yes, this makes sense. Seems like we need to do some magic about
>> > passing the tracee's task struct to the listener, so it can do
>> > task_pid_vnr(). We could perhaps require the listener to be in the
>> > init_pid_ns case, but I think things like socket() might still be
>> > useful even if you can't read the tracee's memory.
>> 
>> 
>> You could take a reference to the pid rather than the task
>> then use pid_vnr(). In that case the changes should result in something
>> like:
>> 
>> 
>> struct seccomp_knotif {
>>        /* The pid whose filter triggered the notification */
>>        struct pid *pid;
>> ...
>> 
>> static void seccomp_do_user_notification(...)
>> {
>>     ...
>>     n.pid = get_task_pid(current, PIDTYPE_PID);
>>     ...
>> remove_list:
>>     list_del(&n.list);
>>     put_pid(n.pid);
>>     ...
>> }
>> ...
>> static ssize_t seccomp_notify_read(...)
>> {
>>     ...
>>     unotif.pid = pid_vnr(knotif->pid);
>>     ...
>> }
>> 
>> I like holding the pid reference because it's what we do elsewhere when pid
>> namespaces
>> are a concern and it more precisely specifies what the knotif content needs
>> to convey.
>> Otherwise I don't think it makes a difference.
>
> Great, thanks, I'll do this. I guess we need a put_pid() here too.

A)  We know that the task is stopped.  Unless there is something
    like SIGKILL that can make the task move you don't need to
    take a reference to anything.

B)  pid_vnr is the wrong answer.  When you create the struct file
    and intialize the filter you need to capture the calling processes
    pid namespace.  The you can use "pid_nr_ns(knotif->pid, filter->pid_ns);".
    That will work consistently even if the file descriptor is passed
    between processes.

Eric

  reply	other threads:[~2018-06-14 21:54 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-31 14:49 [PATCH v3 0/4] seccomp trap to userspace Tycho Andersen
2018-05-31 14:49 ` [PATCH v3 1/4] seccomp: add a return code to " Tycho Andersen
2018-06-03 18:41   ` Jann Horn
2018-06-04  0:18     ` Tycho Andersen
2018-06-13 15:32       ` Jann Horn
2018-06-13 15:43         ` Jann Horn
     [not found]   ` <CA+RrjuW98m2coL+TOKq5cL0QhAb=HYxo2DpNoqMzdiwjqhc2BA@mail.gmail.com>
2018-06-12 23:16     ` Tycho Andersen
     [not found]       ` <CA+RrjuUtYoXfbH3cTbSY=QzXcxJsJOa0BL628ADy9N3bTO4=Mw@mail.gmail.com>
2018-06-14 21:03         ` Tycho Andersen
2018-06-14 21:53           ` Eric W. Biederman [this message]
2018-06-20 14:41             ` Tycho Andersen
2018-06-20  5:05   ` Tobin C . Harding
2018-06-20  5:53   ` Tobin C . Harding
     [not found]   ` <CA+RrjuUhFW+XU7RkZOM+f8cyDGOBjJrQHK3GruZmmCETb8ugfA@mail.gmail.com>
2018-06-20 14:55     ` Tycho Andersen
2018-05-31 14:49 ` [PATCH v3 2/4] seccomp: make get_nth_filter available outside of CHECKPOINT_RESTORE Tycho Andersen
2018-05-31 14:49 ` [PATCH v3 3/4] seccomp: add a way to get a listener fd from ptrace Tycho Andersen
2018-05-31 14:49 ` [PATCH v3 4/4] seccomp: add support for passing fds via USER_NOTIF Tycho Andersen
2018-06-02 13:13   ` Jann Horn
2018-06-02 18:18     ` Tycho Andersen
2018-06-02 19:14   ` Alban Crequy
2018-06-04  0:14     ` Tycho Andersen
2018-06-08 16:29 ` [PATCH v3 0/4] seccomp trap to userspace Kees Cook
2018-06-08 21:04   ` Tycho Andersen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87in6lt4pc.fsf@xmission.com \
    --to=ebiederm@xmission.com \
    --cc=christian.brauner@ubuntu.com \
    --cc=containers@lists.linux-foundation.org \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=matt.helsley@gmail.com \
    --cc=me@tobin.cc \
    --cc=oleg@redhat.com \
    --cc=suda.akihiro@lab.ntt.co.jp \
    --cc=tycho@tycho.ws \
    --cc=tyhicks@canonical.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).