linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jann Horn <jannh@google.com>
To: Christian Brauner <christian@brauner.io>,
	Andy Lutomirski <luto@kernel.org>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	David Howells <dhowells@redhat.com>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Linux API <linux-api@vger.kernel.org>,
	kernel list <linux-kernel@vger.kernel.org>,
	Arnd Bergmann <arnd@arndb.de>, Kees Cook <keescook@chromium.org>,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Michael Kerrisk-manpages <mtk.manpages@gmail.com>,
	bl0pbl33p@gmail.com, "Dmitry V. Levin" <ldv@altlinux.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>,
	cyphar@cyphar.com, Al Viro <viro@zeniv.linux.org.uk>,
	"Joel Fernandes (Google)" <joel@joelfernandes.org>,
	Daniel Colascione <dancol@google.com>
Subject: Re: [PATCH 2/4] pid: add pidctl()
Date: Mon, 25 Mar 2019 19:18:42 +0100	[thread overview]
Message-ID: <CAG48ez24xPM4f7ty=8k5p6_gVWd-bEUSxca5Ngy5svHgarhz+w@mail.gmail.com> (raw)
In-Reply-To: <20190325162052.28987-3-christian@brauner.io>

On Mon, Mar 25, 2019 at 5:21 PM Christian Brauner <christian@brauner.io> wrote:
> The pidctl() syscalls builds on, extends, and improves translate_pid() [4].
> I quote Konstantins original patchset first that has already been acked and
> picked up by Eric before and whose functionality is preserved in this
> syscall:
[...]
> +
> +static struct pid_namespace *get_pid_ns_by_fd(int fd)
> +{
> +       struct pid_namespace *pidns = ERR_PTR(-EINVAL);
> +
> +       if (fd >= 0) {
> +#ifdef CONFIG_PID_NS
> +               struct ns_common *ns;
> +               struct file *file = proc_ns_fget(fd);
> +               if (IS_ERR(file))
> +                       return ERR_CAST(file);
> +
> +               ns = get_proc_ns(file_inode(file));
> +               if (ns->ops->type == CLONE_NEWPID)
> +                       pidns = get_pid_ns(
> +                               container_of(ns, struct pid_namespace, ns));

This increments the refcount of the pidns...

> +
> +               fput(file);
> +#endif
> +       } else {
> +               pidns = task_active_pid_ns(current);

... but this doesn't. That's pretty subtle; could you please put a
comment on top of this function that points this out? Or even better,
change the function to always take a reference, so that the caller
doesn't have to worry about figuring this out.

> +       }
> +
> +       return pidns;
> +}
[...]
> +SYSCALL_DEFINE5(pidctl, unsigned int, cmd, pid_t, pid, int, source, int, target,
> +               unsigned int, flags)
> +{
> +       struct pid_namespace *source_ns = NULL, *target_ns = NULL;
> +       struct pid *struct_pid;
> +       pid_t result;
> +
> +       switch (cmd) {
> +       case PIDCMD_QUERY_PIDNS:
> +               if (pid != 0)
> +                       return -EINVAL;
> +               pid = 1;
> +               /* fall through */
> +       case PIDCMD_QUERY_PID:
> +               if (flags != 0)
> +                       return -EINVAL;
> +               break;
> +       case PIDCMD_GET_PIDFD:
> +               if (flags & ~PIDCTL_CLOEXEC)
> +                       return -EINVAL;
> +               break;
> +       default:
> +               return -EOPNOTSUPP;
> +       }
> +
> +       source_ns = get_pid_ns_by_fd(source);
> +       result = PTR_ERR(source_ns);

I very much dislike using PTR_ERR() on pointers before checking
whether they contain an error value or not. I understand that the
result of this won't actually be used, but it still seems weird to
have what is essentially a cast of a potentially valid pointer to a
potentially smaller integer here.

Could you maybe move the PTR_ERR() into the error branch? Like so:

if (IS_ERR(source_ns)) {
  result = PTR_ERR(source_ns);
  goto err_source;
}

> +       if (IS_ERR(source_ns))
> +               goto err_source;
> +
> +       target_ns = get_pid_ns_by_fd(target);
> +       result = PTR_ERR(target_ns);
> +       if (IS_ERR(target_ns))
> +               goto err_target;
> +
> +       if (cmd == PIDCMD_QUERY_PIDNS) {
> +               result = pidns_related(source_ns, target_ns);
> +       } else {
> +               rcu_read_lock();
> +               struct_pid = find_pid_ns(pid, source_ns);

find_pid_ns() doesn't take a reference on its return value, the return
value is only pinned into memory by the RCU read-side critical
section...

> +               result = struct_pid ? pid_nr_ns(struct_pid, target_ns) : -ESRCH;
> +               rcu_read_unlock();

... which ends here, making struct_pid a dangling pointer...

> +
> +               if (cmd == PIDCMD_GET_PIDFD) {
> +                       int cloexec = (flags & PIDCTL_CLOEXEC) ? O_CLOEXEC : 0;
> +                       if (result > 0)
> +                               result = pidfd_create_fd(struct_pid, cloexec);

... and then here you continue to use struct_pid. That seems bogus.

> +                       else if (result == 0)
> +                               result = -ENOENT;

You don't need to have flags for this for new syscalls, you can just
make everything O_CLOEXEC; if someone wants to preserve an fd across
execve(), they can just clear that bit with fcntl(). (I thiiink it was
Andy Lutomirski who said that before regarding another syscall? But my
memory of that is pretty foggy, might've been someone else.)

> +               }
> +       }
> +
> +       if (target)
> +               put_pid_ns(target_ns);
> +err_target:
> +       if (source)
> +               put_pid_ns(source_ns);

I think you probably intended to check for "if (target >= 0)" and "if
(source >= 0)" instead of these conditions, matching the condition in
get_pid_ns_by_fd()? The current code looks as if it will leave the
refcount one too high when used with target==0 or source==0, and leave
the refcount one too low when used with target==-1 or source==-1.
Anyway, as stated above, I think it would be simpler to
unconditionally take a reference instead.

> +err_source:
> +       return result;
> +}

  parent reply	other threads:[~2019-03-25 18:19 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-25 16:20 [PATCH 0/4] pid: add pidctl() Christian Brauner
2019-03-25 16:20 ` [PATCH 1/4] Make anon_inodes unconditional Christian Brauner
2019-03-25 16:20 ` [PATCH 2/4] pid: add pidctl() Christian Brauner
2019-03-25 17:20   ` Mika Penttilä
2019-03-25 19:59     ` Christian Brauner
2019-03-25 18:18   ` Jann Horn [this message]
2019-03-25 19:58     ` Christian Brauner
2019-03-26 16:07     ` Joel Fernandes
2019-03-26 16:15       ` Christian Brauner
2019-03-25 16:20 ` [PATCH 3/4] signal: support pidctl() with pidfd_send_signal() Christian Brauner
2019-03-25 18:28   ` Jonathan Kowalski
2019-03-25 20:05     ` Christian Brauner
2019-03-25 18:39   ` Jann Horn
2019-03-25 19:41     ` Christian Brauner
2019-03-25 16:20 ` [PATCH 4/4] tests: add pidctl() tests Christian Brauner
2019-03-25 16:48 ` [PATCH 0/4] pid: add pidctl() Daniel Colascione
2019-03-25 17:05   ` Konstantin Khlebnikov
2019-03-25 17:07     ` Daniel Colascione
2019-03-25 17:36   ` Joel Fernandes
2019-03-25 17:53     ` Daniel Colascione
2019-03-25 18:19       ` Jonathan Kowalski
2019-03-25 18:57         ` Daniel Colascione
2019-03-25 19:42           ` Jonathan Kowalski
2019-03-25 20:14             ` Daniel Colascione
2019-03-25 20:34               ` Jann Horn
2019-03-25 20:40                 ` Jonathan Kowalski
2019-03-25 21:14                   ` Jonathan Kowalski
2019-03-25 21:15                   ` Jann Horn
2019-03-25 20:40                 ` Christian Brauner
2019-03-25 20:15     ` Christian Brauner
2019-03-25 21:11       ` Joel Fernandes
2019-03-25 21:17         ` Daniel Colascione
2019-03-25 21:19         ` Jann Horn
2019-03-25 21:43           ` Joel Fernandes
2019-03-25 21:54             ` Jonathan Kowalski
2019-03-25 22:07               ` Daniel Colascione
2019-03-25 22:37                 ` Jonathan Kowalski
2019-03-25 23:14                   ` Daniel Colascione
2019-03-26  3:03               ` Joel Fernandes
2019-03-25 16:56 ` David Howells
2019-03-25 16:58   ` Daniel Colascione
2019-03-25 23:39   ` Andy Lutomirski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAG48ez24xPM4f7ty=8k5p6_gVWd-bEUSxca5Ngy5svHgarhz+w@mail.gmail.com' \
    --to=jannh@google.com \
    --cc=adobriyan@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=bl0pbl33p@gmail.com \
    --cc=christian@brauner.io \
    --cc=cyphar@cyphar.com \
    --cc=dancol@google.com \
    --cc=dhowells@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=joel@joelfernandes.org \
    --cc=keescook@chromium.org \
    --cc=khlebnikov@yandex-team.ru \
    --cc=ldv@altlinux.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mtk.manpages@gmail.com \
    --cc=nagarathnam.muthusamy@oracle.com \
    --cc=oleg@redhat.com \
    --cc=serge@hallyn.com \
    --cc=tglx@linutronix.de \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).