From: Jann Horn
Date: Sat, 30 Mar 2019 19:00:03 +0100
Subject: Re: [PATCH v2 0/5] pid: add pidfd_open()
To: Linus Torvalds
Cc: Christian Brauner, Daniel Colascione, Andrew Lutomirski,
    David Howells, "Serge E. Hallyn", Linux API,
    Linux List Kernel Mailing, Arnd Bergmann, "Eric W. Biederman",
    Konstantin Khlebnikov, Kees Cook, Alexey Dobriyan, Thomas Gleixner,
    Michael Kerrisk-manpages, Jonathan Kowalski, "Dmitry V. Levin",
    Andrew Morton, Oleg Nesterov, Nagarathnam Muthusamy, Aleksa Sarai,
    Al Viro, Joel Fernandes
References: <20190329155425.26059-1-christian@brauner.io>
    <20190330171215.3yrfxwodstmgzmxy@brauner.io>

On Sat, Mar 30, 2019 at 6:24 PM Linus Torvalds wrote:
> On Sat, Mar 30, 2019 at 10:12 AM Christian Brauner wrote:
> > To clarify, what the Android guys really wanted to be part of the api is
> > a way to get race-free access to metadata associated with a given pidfd.
> > And the idea was that *if and only if procfs is mounted* you could do:
> >
> >     int pidfd = pidfd_open(1234, 0);
> >
> >     int procfd = open("/proc", O_RDONLY | O_CLOEXEC);
> >     int procpidfd = ioctl(pidfd, PIDFD_TO_PROCFD, procfd);
>
> And my claim is that this is three system calls - one of them very
> hacky - to just do
>
>     int pidfd = open("/proc/%d", O_PATH);
>
> and you're done. It acts as the pidfd _and_ the way to get the
> associated status files etc.
>
> So there is absolutely zero advantage to going through pidfd_open().
>
> No. No. No.
>
> So the *only* reason for "pidfd_open()" is if you don't have /proc in
> the first place. In which case the whole PIDFD_TO_PROCFD is bogus.

So if, in the future, there is some sort of "create a new task and
return an fd to it" syscall, do you think it should always return
pidfds, or do you think it should return fds to /proc if procfs is
available? And if it should return fds to /proc, does that mean that
this "create a task" API should take an extra argument with a file
descriptor to the procfs instance you want to use?

(This can't always be implemented easily in userspace on top of normal
clone(), because if you create a task without a termination signal -
like a thread -, its PID can be recycled under you.)

An API like this would have less complexity stuffed into a single
syscall if it always returns pidfds, and if you then actually want an
fd to procfs, you can do the conversion that requires specifying a
procfs instance separately.

Of course, if you think that we shouldn't add an API for
pidfd-to-procfs conversion before we have an API for
clone()-with-an-fd-retval, that's understandable.
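
For concreteness, here is a minimal sketch of the "/proc/<pid> directory fd as a process handle" pattern from the quoted text, assuming procfs is mounted at /proc. The helper names open_proc_pid_dir() and read_proc_file() are hypothetical and made up for illustration; only open(), openat(), read() and close() are real calls. It does nothing about the PID-recycling concern above: the numeric PID can still be reused before open() runs.

```c
#define _GNU_SOURCE            /* for O_PATH */
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open /proc/<pid> as an O_PATH directory handle.
 * Assumes procfs is mounted at /proc. Returns an fd, or -1 on error. */
int open_proc_pid_dir(pid_t pid)
{
    char path[64];
    int n = snprintf(path, sizeof(path), "/proc/%ld", (long)pid);

    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    return open(path, O_PATH | O_DIRECTORY | O_CLOEXEC);
}

/* Hypothetical helper: read a metadata file (e.g. "status") relative to
 * the directory fd, so the lookup does not go through the numeric PID
 * again. Returns the number of bytes read (NUL-terminated), or -1. */
ssize_t read_proc_file(int dirfd, const char *name, char *buf, size_t len)
{
    ssize_t total = 0;
    int fd;

    if (len == 0)
        return -1;
    fd = openat(dirfd, name, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    while ((size_t)total < len - 1) {
        ssize_t n = read(fd, buf + total, len - 1 - (size_t)total);

        if (n < 0) {
            close(fd);
            return -1;
        }
        if (n == 0)
            break;              /* end of file */
        total += n;
    }
    buf[total] = '\0';
    close(fd);
    return total;
}
```

A caller would do something like dirfd = open_proc_pid_dir(pid), then read_proc_file(dirfd, "status", buf, sizeof(buf)), and close(dirfd) when done.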