All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christian Brauner <christian@brauner.io>
To: Jann Horn <jannh@google.com>
Cc: Andy Lutomirski <luto@kernel.org>,
	Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	David Howells <dhowells@redhat.com>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Linux API <linux-api@vger.kernel.org>,
	kernel list <linux-kernel@vger.kernel.org>,
	Arnd Bergmann <arnd@arndb.de>, Kees Cook <keescook@chromium.org>,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Michael Kerrisk-manpages <mtk.manpages@gmail.com>,
	bl0pbl33p@gmail.com, "Dmitry V. Levin" <ldv@altlinux.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>,
	cyphar@cyphar.com, Al Viro <viro@zeniv.linux.org.uk>,
	"Joel Fernandes (Google)" <joel@joelfernandes.org>,
	Daniel Colascione <dancol@google.com>
Subject: Re: [PATCH 2/4] pid: add pidctl()
Date: Mon, 25 Mar 2019 20:58:45 +0100	[thread overview]
Message-ID: <20190325195844.lrf7kgutlyv77vmu@brauner.io> (raw)
In-Reply-To: <CAG48ez24xPM4f7ty=8k5p6_gVWd-bEUSxca5Ngy5svHgarhz+w@mail.gmail.com>

On Mon, Mar 25, 2019 at 07:18:42PM +0100, Jann Horn wrote:
> On Mon, Mar 25, 2019 at 5:21 PM Christian Brauner <christian@brauner.io> wrote:
> > The pidctl() syscalls builds on, extends, and improves translate_pid() [4].
> > I quote Konstantins original patchset first that has already been acked and
> > picked up by Eric before and whose functionality is preserved in this
> > syscall:
> [...]
> > +
> > +static struct pid_namespace *get_pid_ns_by_fd(int fd)
> > +{
> > +       struct pid_namespace *pidns = ERR_PTR(-EINVAL);
> > +
> > +       if (fd >= 0) {
> > +#ifdef CONFIG_PID_NS
> > +               struct ns_common *ns;
> > +               struct file *file = proc_ns_fget(fd);
> > +               if (IS_ERR(file))
> > +                       return ERR_CAST(file);
> > +
> > +               ns = get_proc_ns(file_inode(file));
> > +               if (ns->ops->type == CLONE_NEWPID)
> > +                       pidns = get_pid_ns(
> > +                               container_of(ns, struct pid_namespace, ns));
> 
> This increments the refcount of the pidns...

I didn't touch that code. That's taken over from the orginial patchset.
Thanks for catching this.

> 
> > +
> > +               fput(file);
> > +#endif
> > +       } else {
> > +               pidns = task_active_pid_ns(current);
> 
> ... but this doesn't. That's pretty subtle; could you please put a
> comment on top of this function that points this out? Or even better,
> change the function to always take a reference, so that the caller
> doesn't have to worry about figuring this out.

Always taking a reference sounds more correct. Will do.

> 
> > +       }
> > +
> > +       return pidns;
> > +}
> [...]
> > +SYSCALL_DEFINE5(pidctl, unsigned int, cmd, pid_t, pid, int, source, int, target,
> > +               unsigned int, flags)
> > +{
> > +       struct pid_namespace *source_ns = NULL, *target_ns = NULL;
> > +       struct pid *struct_pid;
> > +       pid_t result;
> > +
> > +       switch (cmd) {
> > +       case PIDCMD_QUERY_PIDNS:
> > +               if (pid != 0)
> > +                       return -EINVAL;
> > +               pid = 1;
> > +               /* fall through */
> > +       case PIDCMD_QUERY_PID:
> > +               if (flags != 0)
> > +                       return -EINVAL;
> > +               break;
> > +       case PIDCMD_GET_PIDFD:
> > +               if (flags & ~PIDCTL_CLOEXEC)
> > +                       return -EINVAL;
> > +               break;
> > +       default:
> > +               return -EOPNOTSUPP;
> > +       }
> > +
> > +       source_ns = get_pid_ns_by_fd(source);
> > +       result = PTR_ERR(source_ns);
> 
> I very much dislike using PTR_ERR() on pointers before checking
> whether they contain an error value or not. I understand that the
> result of this won't actually be used, but it still seems weird to
> have what is essentially a cast of a potentially valid pointer to a
> potentially smaller integer here.
> 
> Could you maybe move the PTR_ERR() into the error branch? Like so:

Will do!

> 
> if (IS_ERR(source_ns)) {
>   result = PTR_ERR(source_ns);
>   goto err_source;
> }
> 
> > +       if (IS_ERR(source_ns))
> > +               goto err_source;
> > +
> > +       target_ns = get_pid_ns_by_fd(target);
> > +       result = PTR_ERR(target_ns);
> > +       if (IS_ERR(target_ns))
> > +               goto err_target;
> > +
> > +       if (cmd == PIDCMD_QUERY_PIDNS) {
> > +               result = pidns_related(source_ns, target_ns);
> > +       } else {
> > +               rcu_read_lock();
> > +               struct_pid = find_pid_ns(pid, source_ns);
> 
> find_pid_ns() doesn't take a reference on its return value, the return
> value is only pinned into memory by the RCU read-side critical
> section...
> 
> > +               result = struct_pid ? pid_nr_ns(struct_pid, target_ns) : -ESRCH;
> > +               rcu_read_unlock();
> 
> ... which ends here, making struct_pid a dangling pointer...
> 
> > +
> > +               if (cmd == PIDCMD_GET_PIDFD) {
> > +                       int cloexec = (flags & PIDCTL_CLOEXEC) ? O_CLOEXEC : 0;
> > +                       if (result > 0)
> > +                               result = pidfd_create_fd(struct_pid, cloexec);
> 
> ... and then here you continue to use struct_pid. That seems bogus.

I'll just take a reference to struct pid once I found it to prevent that
from happening.

> 
> > +                       else if (result == 0)
> > +                               result = -ENOENT;
> 
> You don't need to have flags for this for new syscalls, you can just
> make everything O_CLOEXEC; if someone wants to preserve an fd across
> execve(), they can just clear that bit with fcntl(). (I thiiink it was
> Andy Lutomirski who said that before regarding another syscall? But my
> memory of that is pretty foggy, might've been someone else.)

If that's the way going forward this is fine with me!

> 
> > +               }
> > +       }
> > +
> > +       if (target)
> > +               put_pid_ns(target_ns);
> > +err_target:
> > +       if (source)
> > +               put_pid_ns(source_ns);
> 
> I think you probably intended to check for "if (target >= 0)" and "if
> (source >= 0)" instead of these conditions, matching the condition in
> get_pid_ns_by_fd()? The current code looks as if it will leave the
> refcount one too high when used with target==0 or source==0, and leave
> the refcount one too low when used with target==-1 or source==-1.
> Anyway, as stated above, I think it would be simpler to
> unconditionally take a reference instead.

Yep.

> 
> > +err_source:
> > +       return result;
> > +}

WARNING: multiple messages have this Message-ID (diff)
From: Christian Brauner <christian@brauner.io>
To: Jann Horn <jannh@google.com>
Cc: Andy Lutomirski <luto@kernel.org>,
	Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	David Howells <dhowells@redhat.com>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Linux API <linux-api@vger.kernel.org>,
	kernel list <linux-kernel@vger.kernel.org>,
	Arnd Bergmann <arnd@arndb.de>, Kees Cook <keescook@chromium.org>,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Michael Kerrisk-manpages <mtk.manpages@gmail.com>,
	bl0pbl33p@gmail.com, "Dmitry V. Levin" <ldv@altlinux.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>,
	cyphar@cyphar.com, Al Viro <viro@zeniv.linux.org.uk>,
	"Joel Fernandes (Google)" <joe>
Subject: Re: [PATCH 2/4] pid: add pidctl()
Date: Mon, 25 Mar 2019 20:58:45 +0100	[thread overview]
Message-ID: <20190325195844.lrf7kgutlyv77vmu@brauner.io> (raw)
In-Reply-To: <CAG48ez24xPM4f7ty=8k5p6_gVWd-bEUSxca5Ngy5svHgarhz+w@mail.gmail.com>

On Mon, Mar 25, 2019 at 07:18:42PM +0100, Jann Horn wrote:
> On Mon, Mar 25, 2019 at 5:21 PM Christian Brauner <christian@brauner.io> wrote:
> > The pidctl() syscalls builds on, extends, and improves translate_pid() [4].
> > I quote Konstantins original patchset first that has already been acked and
> > picked up by Eric before and whose functionality is preserved in this
> > syscall:
> [...]
> > +
> > +static struct pid_namespace *get_pid_ns_by_fd(int fd)
> > +{
> > +       struct pid_namespace *pidns = ERR_PTR(-EINVAL);
> > +
> > +       if (fd >= 0) {
> > +#ifdef CONFIG_PID_NS
> > +               struct ns_common *ns;
> > +               struct file *file = proc_ns_fget(fd);
> > +               if (IS_ERR(file))
> > +                       return ERR_CAST(file);
> > +
> > +               ns = get_proc_ns(file_inode(file));
> > +               if (ns->ops->type == CLONE_NEWPID)
> > +                       pidns = get_pid_ns(
> > +                               container_of(ns, struct pid_namespace, ns));
> 
> This increments the refcount of the pidns...

I didn't touch that code. That's taken over from the orginial patchset.
Thanks for catching this.

> 
> > +
> > +               fput(file);
> > +#endif
> > +       } else {
> > +               pidns = task_active_pid_ns(current);
> 
> ... but this doesn't. That's pretty subtle; could you please put a
> comment on top of this function that points this out? Or even better,
> change the function to always take a reference, so that the caller
> doesn't have to worry about figuring this out.

Always taking a reference sounds more correct. Will do.

> 
> > +       }
> > +
> > +       return pidns;
> > +}
> [...]
> > +SYSCALL_DEFINE5(pidctl, unsigned int, cmd, pid_t, pid, int, source, int, target,
> > +               unsigned int, flags)
> > +{
> > +       struct pid_namespace *source_ns = NULL, *target_ns = NULL;
> > +       struct pid *struct_pid;
> > +       pid_t result;
> > +
> > +       switch (cmd) {
> > +       case PIDCMD_QUERY_PIDNS:
> > +               if (pid != 0)
> > +                       return -EINVAL;
> > +               pid = 1;
> > +               /* fall through */
> > +       case PIDCMD_QUERY_PID:
> > +               if (flags != 0)
> > +                       return -EINVAL;
> > +               break;
> > +       case PIDCMD_GET_PIDFD:
> > +               if (flags & ~PIDCTL_CLOEXEC)
> > +                       return -EINVAL;
> > +               break;
> > +       default:
> > +               return -EOPNOTSUPP;
> > +       }
> > +
> > +       source_ns = get_pid_ns_by_fd(source);
> > +       result = PTR_ERR(source_ns);
> 
> I very much dislike using PTR_ERR() on pointers before checking
> whether they contain an error value or not. I understand that the
> result of this won't actually be used, but it still seems weird to
> have what is essentially a cast of a potentially valid pointer to a
> potentially smaller integer here.
> 
> Could you maybe move the PTR_ERR() into the error branch? Like so:

Will do!

> 
> if (IS_ERR(source_ns)) {
>   result = PTR_ERR(source_ns);
>   goto err_source;
> }
> 
> > +       if (IS_ERR(source_ns))
> > +               goto err_source;
> > +
> > +       target_ns = get_pid_ns_by_fd(target);
> > +       result = PTR_ERR(target_ns);
> > +       if (IS_ERR(target_ns))
> > +               goto err_target;
> > +
> > +       if (cmd == PIDCMD_QUERY_PIDNS) {
> > +               result = pidns_related(source_ns, target_ns);
> > +       } else {
> > +               rcu_read_lock();
> > +               struct_pid = find_pid_ns(pid, source_ns);
> 
> find_pid_ns() doesn't take a reference on its return value, the return
> value is only pinned into memory by the RCU read-side critical
> section...
> 
> > +               result = struct_pid ? pid_nr_ns(struct_pid, target_ns) : -ESRCH;
> > +               rcu_read_unlock();
> 
> ... which ends here, making struct_pid a dangling pointer...
> 
> > +
> > +               if (cmd == PIDCMD_GET_PIDFD) {
> > +                       int cloexec = (flags & PIDCTL_CLOEXEC) ? O_CLOEXEC : 0;
> > +                       if (result > 0)
> > +                               result = pidfd_create_fd(struct_pid, cloexec);
> 
> ... and then here you continue to use struct_pid. That seems bogus.

I'll just take a reference to struct pid once I found it to prevent that
from happening.

> 
> > +                       else if (result == 0)
> > +                               result = -ENOENT;
> 
> You don't need to have flags for this for new syscalls, you can just
> make everything O_CLOEXEC; if someone wants to preserve an fd across
> execve(), they can just clear that bit with fcntl(). (I thiiink it was
> Andy Lutomirski who said that before regarding another syscall? But my
> memory of that is pretty foggy, might've been someone else.)

If that's the way going forward this is fine with me!

> 
> > +               }
> > +       }
> > +
> > +       if (target)
> > +               put_pid_ns(target_ns);
> > +err_target:
> > +       if (source)
> > +               put_pid_ns(source_ns);
> 
> I think you probably intended to check for "if (target >= 0)" and "if
> (source >= 0)" instead of these conditions, matching the condition in
> get_pid_ns_by_fd()? The current code looks as if it will leave the
> refcount one too high when used with target==0 or source==0, and leave
> the refcount one too low when used with target==-1 or source==-1.
> Anyway, as stated above, I think it would be simpler to
> unconditionally take a reference instead.

Yep.

> 
> > +err_source:
> > +       return result;
> > +}

  reply	other threads:[~2019-03-25 19:58 UTC|newest]

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-25 16:20 [PATCH 0/4] pid: add pidctl() Christian Brauner
2019-03-25 16:20 ` [PATCH 1/4] Make anon_inodes unconditional Christian Brauner
2019-03-25 16:20 ` [PATCH 2/4] pid: add pidctl() Christian Brauner
2019-03-25 17:20   ` Mika Penttilä
2019-03-25 19:59     ` Christian Brauner
2019-03-25 19:59       ` Christian Brauner
2019-03-25 18:18   ` Jann Horn
2019-03-25 18:18     ` Jann Horn
2019-03-25 19:58     ` Christian Brauner [this message]
2019-03-25 19:58       ` Christian Brauner
2019-03-26 16:07     ` Joel Fernandes
2019-03-26 16:07       ` Joel Fernandes
2019-03-26 16:15       ` Christian Brauner
2019-03-26 16:15         ` Christian Brauner
2019-03-25 16:20 ` [PATCH 3/4] signal: support pidctl() with pidfd_send_signal() Christian Brauner
2019-03-25 18:28   ` Jonathan Kowalski
2019-03-25 18:28     ` Jonathan Kowalski
2019-03-25 20:05     ` Christian Brauner
2019-03-25 20:05       ` Christian Brauner
2019-03-25 18:39   ` Jann Horn
2019-03-25 18:39     ` Jann Horn
2019-03-25 19:41     ` Christian Brauner
2019-03-25 19:41       ` Christian Brauner
2019-03-25 16:20 ` [PATCH 4/4] tests: add pidctl() tests Christian Brauner
2019-03-25 16:48 ` [PATCH 0/4] pid: add pidctl() Daniel Colascione
2019-03-25 16:48   ` Daniel Colascione
2019-03-25 17:05   ` Konstantin Khlebnikov
2019-03-25 17:07     ` Daniel Colascione
2019-03-25 17:07       ` Daniel Colascione
2019-03-25 17:36   ` Joel Fernandes
2019-03-25 17:36     ` Joel Fernandes
2019-03-25 17:53     ` Daniel Colascione
2019-03-25 17:53       ` Daniel Colascione
2019-03-25 18:19       ` Jonathan Kowalski
2019-03-25 18:19         ` Jonathan Kowalski
2019-03-25 18:57         ` Daniel Colascione
2019-03-25 18:57           ` Daniel Colascione
2019-03-25 19:42           ` Jonathan Kowalski
2019-03-25 19:42             ` Jonathan Kowalski
2019-03-25 20:14             ` Daniel Colascione
2019-03-25 20:14               ` Daniel Colascione
2019-03-25 20:34               ` Jann Horn
2019-03-25 20:34                 ` Jann Horn
2019-03-25 20:40                 ` Jonathan Kowalski
2019-03-25 20:40                   ` Jonathan Kowalski
2019-03-25 21:14                   ` Jonathan Kowalski
2019-03-25 21:14                     ` Jonathan Kowalski
2019-03-25 21:15                   ` Jann Horn
2019-03-25 21:15                     ` Jann Horn
2019-03-25 20:40                 ` Christian Brauner
2019-03-25 20:40                   ` Christian Brauner
2019-03-25 20:15     ` Christian Brauner
2019-03-25 20:15       ` Christian Brauner
2019-03-25 21:11       ` Joel Fernandes
2019-03-25 21:11         ` Joel Fernandes
2019-03-25 21:17         ` Daniel Colascione
2019-03-25 21:17           ` Daniel Colascione
2019-03-25 21:19         ` Jann Horn
2019-03-25 21:19           ` Jann Horn
2019-03-25 21:43           ` Joel Fernandes
2019-03-25 21:43             ` Joel Fernandes
2019-03-25 21:54             ` Jonathan Kowalski
2019-03-25 21:54               ` Jonathan Kowalski
2019-03-25 22:07               ` Daniel Colascione
2019-03-25 22:07                 ` Daniel Colascione
2019-03-25 22:37                 ` Jonathan Kowalski
2019-03-25 22:37                   ` Jonathan Kowalski
2019-03-25 23:14                   ` Daniel Colascione
2019-03-25 23:14                     ` Daniel Colascione
2019-03-26  3:03               ` Joel Fernandes
2019-03-26  3:03                 ` Joel Fernandes
2019-03-25 16:56 ` David Howells
2019-03-25 16:56   ` David Howells
2019-03-25 16:58   ` Daniel Colascione
2019-03-25 16:58     ` Daniel Colascione
2019-03-25 23:39   ` Andy Lutomirski
2019-03-25 23:39     ` Andy Lutomirski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190325195844.lrf7kgutlyv77vmu@brauner.io \
    --to=christian@brauner.io \
    --cc=adobriyan@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=bl0pbl33p@gmail.com \
    --cc=cyphar@cyphar.com \
    --cc=dancol@google.com \
    --cc=dhowells@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=jannh@google.com \
    --cc=joel@joelfernandes.org \
    --cc=keescook@chromium.org \
    --cc=khlebnikov@yandex-team.ru \
    --cc=ldv@altlinux.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mtk.manpages@gmail.com \
    --cc=nagarathnam.muthusamy@oracle.com \
    --cc=oleg@redhat.com \
    --cc=serge@hallyn.com \
    --cc=tglx@linutronix.de \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.