From mboxrd@z Thu Jan 1 00:00:00 1970 From: Christian Brauner Date: Sat, 18 May 2019 10:04:46 +0000 Subject: Re: [PATCH v1 1/2] pid: add pidfd_open() Message-Id: <20190518100435.c5bqpcnra53dsr6p@brauner.io> List-Id: References: <20190516135944.7205-1-christian@brauner.io> <20190516224949.GA15401@localhost> In-Reply-To: <20190516224949.GA15401@localhost> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable To: Joel Fernandes Cc: jannh@google.com, oleg@redhat.com, viro@zeniv.linux.org.uk, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, arnd@arndb.de, akpm@linux-foundation.org, cyphar@cyphar.com, dhowells@redhat.com, ebiederm@xmission.com, elena.reshetova@intel.com, keescook@chromium.org, luto@amacapital.net, luto@kernel.org, tglx@linutronix.de, linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.orgg, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org, linux-api@vger.kernel.org, linux-arch@vger.kernel.org, linux-kselftest@vger.kernel.org, dancol@google.com, serge@hallyn.com, su On Sat, May 18, 2019 at 05:48:03AM -0400, Joel Fernandes wrote: > Hi Christian, >=20 > For next revision, could you also CC surenb@google.com as well? He is also > working on the low memory killer. And also suggest CC to > kernel-team@android.com. And mentioned some comments below, thanks. Yip, totally. Just added them both to my Cc list. :) (I saw you added Suren manually. I added the Android kernel team now too.) >=20 > On Thu, May 16, 2019 at 03:59:42PM +0200, Christian Brauner wrote: > [snip] =20 > > diff --git a/kernel/pid.c b/kernel/pid.c > > index 20881598bdfa..4afca3d6dcb8 100644 > > --- a/kernel/pid.c > > +++ b/kernel/pid.c > > @@ -38,6 +38,7 @@ > > #include > > #include > > #include > > +#include > > #include > > #include > > =20 > > @@ -451,6 +452,55 @@ struct pid *find_ge_pid(int nr, struct pid_namespa= ce *ns) > > return idr_get_next(&ns->idr, &nr); > > } > > =20 > > +/** > > + * pidfd_open() - Open new pid file descriptor. > > + * > > + * @pid: pid for which to retrieve a pidfd > > + * @flags: flags to pass > > + * > > + * This creates a new pid file descriptor with the O_CLOEXEC flag set = for > > + * the process identified by @pid. Currently, the process identified by > > + * @pid must be a thread-group leader. This restriction currently exis= ts > > + * for all aspects of pidfds including pidfd creation (CLONE_PIDFD can= not > > + * be used with CLONE_THREAD) and pidfd polling (only supports thread = group > > + * leaders). > > + * > > + * Return: On success, a cloexec pidfd is returned. > > + * On error, a negative errno number will be returned. > > + */ > > +SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags) > > +{ > > + int fd, ret; > > + struct pid *p; > > + struct task_struct *tsk; > > + > > + if (flags) > > + return -EINVAL; > > + > > + if (pid <=3D 0) > > + return -EINVAL; > > + > > + p =3D find_get_pid(pid); > > + if (!p) > > + return -ESRCH; > > + > > + ret =3D 0; > > + rcu_read_lock(); > > + /* > > + * If this returns non-NULL the pid was used as a thread-group > > + * leader. Note, we race with exec here: If it changes the > > + * thread-group leader we might return the old leader. > > + */ > > + tsk =3D pid_task(p, PIDTYPE_TGID); >=20 > Just trying to understand the comment here. The issue is that we might ei= ther > return the new leader, or the old leader depending on the overlap with > concurrent de_thread right? In either case, we don't care though. >=20 > I suggest to remove the "Note..." part of the comment since it doesn't se= em the > race is relevant here unless we are doing something else with tsk in the > function, but if you want to keep it that's also fine. Comment text should > probably should be 'return the new leader' though. Nah, I actually removed the comment already independently (cf. see [1]). [1]: https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/comm= it/?h=3Dpidfd_open&id=DCfc98c2d957bf3ac14b06414cb2cf4c673fc297 >=20 > > + if (!tsk) > > + ret =3D -ESRCH; >=20 > Perhaps -EINVAL? AFAICS, this can only happen if a CLONE_THREAD pid was > passed as argument to pidfd_open which is invalid. But let me know what y= ou > had in mind.. Hm, from the kernel's perspective ESRCH is correct but I guess EINVAL is nicer for userspace. So I don't have objections to using EINVAL. My first version did too I think. Thanks! Christian From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-11.1 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI, MENTIONS_GIT_HOSTING,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_NEOMUTT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B5F9AC04AAF for ; Sat, 18 May 2019 10:05:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8106420B7C for ; Sat, 18 May 2019 10:05:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=brauner.io header.i=@brauner.io header.b="MzXf8JwF" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729747AbfERKFI (ORCPT ); Sat, 18 May 2019 06:05:08 -0400 Received: from mail-ed1-f67.google.com ([209.85.208.67]:33200 "EHLO mail-ed1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729661AbfERKFH (ORCPT ); Sat, 18 May 2019 06:05:07 -0400 Received: by mail-ed1-f67.google.com with SMTP id n17so14783168edb.0 for ; Sat, 18 May 2019 03:05:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=brauner.io; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=kuxvODYNXQX1UADGYTekrXKJHl9GqTdixhHyMJoLFkg=; b=MzXf8JwF72gJj2Pq8n12P+UywZxc2BVXTvpiVUKrEniJVb1LIFP3IbcOliQYISdvq7 CigvtYVzwi+8s8BzhGA7JzeYEAEgELRKSJUGXkBeKo3jIoY+k+AIz1NIGNQ3IbqWFvd/ pZcoiBfjP1G87UNeXXgpYPrDlzF+SvvAxTS1B9J01RvJsEbVREubCnWtrpwlE+W8uhC8 abJb2p1WNSljrqCFAQH7at9A4yXrlhg+5ZrhECybT+pFDJTv4XLPhvmKoEIQriQEdsEE 2qeWRYeIhjWVjN8Omnws0ZotrB2BKKuxhO4+QGekj7Ec6VajrUKLl5oRB10tcR/KFGxb SomA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=kuxvODYNXQX1UADGYTekrXKJHl9GqTdixhHyMJoLFkg=; b=fLaLJfeXpL4e7verrIp2GTiT9ps9ZOgX2TNtWgQCc02hxck3PnHMLFf2qzEOXOlhs4 sJTjuYNBNxiXNUSWu+iF92FOjnZYUQRbJQ4xe0GIvz3EreQxTxOF9uUsyNn2Ry1o3jSB rOIBEotxrBQMeQMxnll7RIR/XBkU9XbhEVnJW2Q6N98UqJ1RkNTes20A6eEyGQdN4+8H YkDX94WOUkcw0ULWCilXk0ewRQvvzl+rI8Y+4g6iADO3Dyv4Gox5axLUbNx7ConBL+m5 eSdUVd5VBeUUvO4tHfy1a2oYAa5wJ6UZUA33deNYav9qSpkTx8GGx+BPNfgOnVGb6Elw Iujw== X-Gm-Message-State: APjAAAW00NgEYL7jIb7+bLAwS3WidCMMKWfFqM4hxI1vGqSM2SlFnwDI UtirvBeeO1RXZplbhE3rSD6LEQ== X-Google-Smtp-Source: APXvYqyJ8cSwup5T+wLOXgE4G1RIJN3Q14FiCUfotn6HCvE35vO+QFm+mQrHo4U4TisKcq8ZEYMBDQ== X-Received: by 2002:a50:b82d:: with SMTP id j42mr63022736ede.186.1558173905455; Sat, 18 May 2019 03:05:05 -0700 (PDT) Received: from brauner.io ([46.183.103.8]) by smtp.gmail.com with ESMTPSA id y30sm3741910edc.83.2019.05.18.03.04.59 (version=TLS1_3 cipher=AEAD-AES256-GCM-SHA384 bits=256/256); Sat, 18 May 2019 03:05:04 -0700 (PDT) Date: Sat, 18 May 2019 12:04:46 +0200 From: Christian Brauner To: Joel Fernandes Cc: jannh@google.com, oleg@redhat.com, viro@zeniv.linux.org.uk, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, arnd@arndb.de, akpm@linux-foundation.org, cyphar@cyphar.com, dhowells@redhat.com, ebiederm@xmission.com, elena.reshetova@intel.com, keescook@chromium.org, luto@amacapital.net, luto@kernel.org, tglx@linutronix.de, linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.orgg, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org, linux-api@vger.kernel.org, linux-arch@vger.kernel.org, linux-kselftest@vger.kernel.org, dancol@google.com, serge@hallyn.com, surenb@google.com, Geert Uytterhoeven , kernel-team@android.com Subject: Re: [PATCH v1 1/2] pid: add pidfd_open() Message-ID: <20190518100435.c5bqpcnra53dsr6p@brauner.io> References: <20190516135944.7205-1-christian@brauner.io> <20190516224949.GA15401@localhost> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20190516224949.GA15401@localhost> User-Agent: NeoMutt/20180716 Sender: linux-parisc-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-parisc@vger.kernel.org On Sat, May 18, 2019 at 05:48:03AM -0400, Joel Fernandes wrote: > Hi Christian, > > For next revision, could you also CC surenb@google.com as well? He is also > working on the low memory killer. And also suggest CC to > kernel-team@android.com. And mentioned some comments below, thanks. Yip, totally. Just added them both to my Cc list. :) (I saw you added Suren manually. I added the Android kernel team now too.) > > On Thu, May 16, 2019 at 03:59:42PM +0200, Christian Brauner wrote: > [snip] > > diff --git a/kernel/pid.c b/kernel/pid.c > > index 20881598bdfa..4afca3d6dcb8 100644 > > --- a/kernel/pid.c > > +++ b/kernel/pid.c > > @@ -38,6 +38,7 @@ > > #include > > #include > > #include > > +#include > > #include > > #include > > > > @@ -451,6 +452,55 @@ struct pid *find_ge_pid(int nr, struct pid_namespace *ns) > > return idr_get_next(&ns->idr, &nr); > > } > > > > +/** > > + * pidfd_open() - Open new pid file descriptor. > > + * > > + * @pid: pid for which to retrieve a pidfd > > + * @flags: flags to pass > > + * > > + * This creates a new pid file descriptor with the O_CLOEXEC flag set for > > + * the process identified by @pid. Currently, the process identified by > > + * @pid must be a thread-group leader. This restriction currently exists > > + * for all aspects of pidfds including pidfd creation (CLONE_PIDFD cannot > > + * be used with CLONE_THREAD) and pidfd polling (only supports thread group > > + * leaders). > > + * > > + * Return: On success, a cloexec pidfd is returned. > > + * On error, a negative errno number will be returned. > > + */ > > +SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags) > > +{ > > + int fd, ret; > > + struct pid *p; > > + struct task_struct *tsk; > > + > > + if (flags) > > + return -EINVAL; > > + > > + if (pid <= 0) > > + return -EINVAL; > > + > > + p = find_get_pid(pid); > > + if (!p) > > + return -ESRCH; > > + > > + ret = 0; > > + rcu_read_lock(); > > + /* > > + * If this returns non-NULL the pid was used as a thread-group > > + * leader. Note, we race with exec here: If it changes the > > + * thread-group leader we might return the old leader. > > + */ > > + tsk = pid_task(p, PIDTYPE_TGID); > > Just trying to understand the comment here. The issue is that we might either > return the new leader, or the old leader depending on the overlap with > concurrent de_thread right? In either case, we don't care though. > > I suggest to remove the "Note..." part of the comment since it doesn't seem the > race is relevant here unless we are doing something else with tsk in the > function, but if you want to keep it that's also fine. Comment text should > probably should be 'return the new leader' though. Nah, I actually removed the comment already independently (cf. see [1]). [1]: https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/commit/?h=pidfd_open&id=dcfc98c2d957bf3ac14b06414cb2cf4c673fc297 > > > + if (!tsk) > > + ret = -ESRCH; > > Perhaps -EINVAL? AFAICS, this can only happen if a CLONE_THREAD pid was > passed as argument to pidfd_open which is invalid. But let me know what you > had in mind.. Hm, from the kernel's perspective ESRCH is correct but I guess EINVAL is nicer for userspace. So I don't have objections to using EINVAL. My first version did too I think. Thanks! Christian From mboxrd@z Thu Jan 1 00:00:00 1970 From: Christian Brauner Subject: Re: [PATCH v1 1/2] pid: add pidfd_open() Date: Sat, 18 May 2019 12:04:46 +0200 Message-ID: <20190518100435.c5bqpcnra53dsr6p@brauner.io> References: <20190516135944.7205-1-christian@brauner.io> <20190516224949.GA15401@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Return-path: Content-Disposition: inline In-Reply-To: <20190516224949.GA15401@localhost> Sender: linux-kernel-owner@vger.kernel.org List-Archive: List-Post: To: Joel Fernandes Cc: jannh@google.com, oleg@redhat.com, viro@zeniv.linux.org.uk, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, arnd@arndb.de, akpm@linux-foundation.org, cyphar@cyphar.com, dhowells@redhat.com, ebiederm@xmission.com, elena.reshetova@intel.com, keescook@chromium.org, luto@amacapital.net, luto@kernel.org, tglx@linutronix.de, linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.orgg, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org, linux-api@vger.kernel.org, linux-arch@vger.kernel.org, linux-kselftest@vger.kernel.org, dancol@google.com, serge@hallyn.com, su List-ID: On Sat, May 18, 2019 at 05:48:03AM -0400, Joel Fernandes wrote: > Hi Christian, > > For next revision, could you also CC surenb@google.com as well? He is also > working on the low memory killer. And also suggest CC to > kernel-team@android.com. And mentioned some comments below, thanks. Yip, totally. Just added them both to my Cc list. :) (I saw you added Suren manually. I added the Android kernel team now too.) > > On Thu, May 16, 2019 at 03:59:42PM +0200, Christian Brauner wrote: > [snip] > > diff --git a/kernel/pid.c b/kernel/pid.c > > index 20881598bdfa..4afca3d6dcb8 100644 > > --- a/kernel/pid.c > > +++ b/kernel/pid.c > > @@ -38,6 +38,7 @@ > > #include > > #include > > #include > > +#include > > #include > > #include > > > > @@ -451,6 +452,55 @@ struct pid *find_ge_pid(int nr, struct pid_namespace *ns) > > return idr_get_next(&ns->idr, &nr); > > } > > > > +/** > > + * pidfd_open() - Open new pid file descriptor. > > + * > > + * @pid: pid for which to retrieve a pidfd > > + * @flags: flags to pass > > + * > > + * This creates a new pid file descriptor with the O_CLOEXEC flag set for > > + * the process identified by @pid. Currently, the process identified by > > + * @pid must be a thread-group leader. This restriction currently exists > > + * for all aspects of pidfds including pidfd creation (CLONE_PIDFD cannot > > + * be used with CLONE_THREAD) and pidfd polling (only supports thread group > > + * leaders). > > + * > > + * Return: On success, a cloexec pidfd is returned. > > + * On error, a negative errno number will be returned. > > + */ > > +SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags) > > +{ > > + int fd, ret; > > + struct pid *p; > > + struct task_struct *tsk; > > + > > + if (flags) > > + return -EINVAL; > > + > > + if (pid <= 0) > > + return -EINVAL; > > + > > + p = find_get_pid(pid); > > + if (!p) > > + return -ESRCH; > > + > > + ret = 0; > > + rcu_read_lock(); > > + /* > > + * If this returns non-NULL the pid was used as a thread-group > > + * leader. Note, we race with exec here: If it changes the > > + * thread-group leader we might return the old leader. > > + */ > > + tsk = pid_task(p, PIDTYPE_TGID); > > Just trying to understand the comment here. The issue is that we might either > return the new leader, or the old leader depending on the overlap with > concurrent de_thread right? In either case, we don't care though. > > I suggest to remove the "Note..." part of the comment since it doesn't seem the > race is relevant here unless we are doing something else with tsk in the > function, but if you want to keep it that's also fine. Comment text should > probably should be 'return the new leader' though. Nah, I actually removed the comment already independently (cf. see [1]). [1]: https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/commit/?h=pidfd_open&id=dcfc98c2d957bf3ac14b06414cb2cf4c673fc297 > > > + if (!tsk) > > + ret = -ESRCH; > > Perhaps -EINVAL? AFAICS, this can only happen if a CLONE_THREAD pid was > passed as argument to pidfd_open which is invalid. But let me know what you > had in mind.. Hm, from the kernel's perspective ESRCH is correct but I guess EINVAL is nicer for userspace. So I don't have objections to using EINVAL. My first version did too I think. Thanks! Christian From mboxrd@z Thu Jan 1 00:00:00 1970 From: christian at brauner.io (Christian Brauner) Date: Sat, 18 May 2019 12:04:46 +0200 Subject: [PATCH v1 1/2] pid: add pidfd_open() In-Reply-To: <20190516224949.GA15401@localhost> References: <20190516135944.7205-1-christian@brauner.io> <20190516224949.GA15401@localhost> Message-ID: <20190518100435.c5bqpcnra53dsr6p@brauner.io> On Sat, May 18, 2019 at 05:48:03AM -0400, Joel Fernandes wrote: > Hi Christian, > > For next revision, could you also CC surenb at google.com as well? He is also > working on the low memory killer. And also suggest CC to > kernel-team at android.com. And mentioned some comments below, thanks. Yip, totally. Just added them both to my Cc list. :) (I saw you added Suren manually. I added the Android kernel team now too.) > > On Thu, May 16, 2019 at 03:59:42PM +0200, Christian Brauner wrote: > [snip] > > diff --git a/kernel/pid.c b/kernel/pid.c > > index 20881598bdfa..4afca3d6dcb8 100644 > > --- a/kernel/pid.c > > +++ b/kernel/pid.c > > @@ -38,6 +38,7 @@ > > #include > > #include > > #include > > +#include > > #include > > #include > > > > @@ -451,6 +452,55 @@ struct pid *find_ge_pid(int nr, struct pid_namespace *ns) > > return idr_get_next(&ns->idr, &nr); > > } > > > > +/** > > + * pidfd_open() - Open new pid file descriptor. > > + * > > + * @pid: pid for which to retrieve a pidfd > > + * @flags: flags to pass > > + * > > + * This creates a new pid file descriptor with the O_CLOEXEC flag set for > > + * the process identified by @pid. Currently, the process identified by > > + * @pid must be a thread-group leader. This restriction currently exists > > + * for all aspects of pidfds including pidfd creation (CLONE_PIDFD cannot > > + * be used with CLONE_THREAD) and pidfd polling (only supports thread group > > + * leaders). > > + * > > + * Return: On success, a cloexec pidfd is returned. > > + * On error, a negative errno number will be returned. > > + */ > > +SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags) > > +{ > > + int fd, ret; > > + struct pid *p; > > + struct task_struct *tsk; > > + > > + if (flags) > > + return -EINVAL; > > + > > + if (pid <= 0) > > + return -EINVAL; > > + > > + p = find_get_pid(pid); > > + if (!p) > > + return -ESRCH; > > + > > + ret = 0; > > + rcu_read_lock(); > > + /* > > + * If this returns non-NULL the pid was used as a thread-group > > + * leader. Note, we race with exec here: If it changes the > > + * thread-group leader we might return the old leader. > > + */ > > + tsk = pid_task(p, PIDTYPE_TGID); > > Just trying to understand the comment here. The issue is that we might either > return the new leader, or the old leader depending on the overlap with > concurrent de_thread right? In either case, we don't care though. > > I suggest to remove the "Note..." part of the comment since it doesn't seem the > race is relevant here unless we are doing something else with tsk in the > function, but if you want to keep it that's also fine. Comment text should > probably should be 'return the new leader' though. Nah, I actually removed the comment already independently (cf. see [1]). [1]: https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/commit/?h=pidfd_open&id=dcfc98c2d957bf3ac14b06414cb2cf4c673fc297 > > > + if (!tsk) > > + ret = -ESRCH; > > Perhaps -EINVAL? AFAICS, this can only happen if a CLONE_THREAD pid was > passed as argument to pidfd_open which is invalid. But let me know what you > had in mind.. Hm, from the kernel's perspective ESRCH is correct but I guess EINVAL is nicer for userspace. So I don't have objections to using EINVAL. My first version did too I think. Thanks! Christian From mboxrd@z Thu Jan 1 00:00:00 1970 From: christian@brauner.io (Christian Brauner) Date: Sat, 18 May 2019 12:04:46 +0200 Subject: [PATCH v1 1/2] pid: add pidfd_open() In-Reply-To: <20190516224949.GA15401@localhost> References: <20190516135944.7205-1-christian@brauner.io> <20190516224949.GA15401@localhost> Message-ID: <20190518100435.c5bqpcnra53dsr6p@brauner.io> Content-Type: text/plain; charset="UTF-8" Message-ID: <20190518100446.RFQ-lZuaDaI3OxYquzDZUVu2pWpy30HFOX6VdIpEefI@z> On Sat, May 18, 2019@05:48:03AM -0400, Joel Fernandes wrote: > Hi Christian, > > For next revision, could you also CC surenb at google.com as well? He is also > working on the low memory killer. And also suggest CC to > kernel-team at android.com. And mentioned some comments below, thanks. Yip, totally. Just added them both to my Cc list. :) (I saw you added Suren manually. I added the Android kernel team now too.) > > On Thu, May 16, 2019@03:59:42PM +0200, Christian Brauner wrote: > [snip] > > diff --git a/kernel/pid.c b/kernel/pid.c > > index 20881598bdfa..4afca3d6dcb8 100644 > > --- a/kernel/pid.c > > +++ b/kernel/pid.c > > @@ -38,6 +38,7 @@ > > #include > > #include > > #include > > +#include > > #include > > #include > > > > @@ -451,6 +452,55 @@ struct pid *find_ge_pid(int nr, struct pid_namespace *ns) > > return idr_get_next(&ns->idr, &nr); > > } > > > > +/** > > + * pidfd_open() - Open new pid file descriptor. > > + * > > + * @pid: pid for which to retrieve a pidfd > > + * @flags: flags to pass > > + * > > + * This creates a new pid file descriptor with the O_CLOEXEC flag set for > > + * the process identified by @pid. Currently, the process identified by > > + * @pid must be a thread-group leader. This restriction currently exists > > + * for all aspects of pidfds including pidfd creation (CLONE_PIDFD cannot > > + * be used with CLONE_THREAD) and pidfd polling (only supports thread group > > + * leaders). > > + * > > + * Return: On success, a cloexec pidfd is returned. > > + * On error, a negative errno number will be returned. > > + */ > > +SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags) > > +{ > > + int fd, ret; > > + struct pid *p; > > + struct task_struct *tsk; > > + > > + if (flags) > > + return -EINVAL; > > + > > + if (pid <= 0) > > + return -EINVAL; > > + > > + p = find_get_pid(pid); > > + if (!p) > > + return -ESRCH; > > + > > + ret = 0; > > + rcu_read_lock(); > > + /* > > + * If this returns non-NULL the pid was used as a thread-group > > + * leader. Note, we race with exec here: If it changes the > > + * thread-group leader we might return the old leader. > > + */ > > + tsk = pid_task(p, PIDTYPE_TGID); > > Just trying to understand the comment here. The issue is that we might either > return the new leader, or the old leader depending on the overlap with > concurrent de_thread right? In either case, we don't care though. > > I suggest to remove the "Note..." part of the comment since it doesn't seem the > race is relevant here unless we are doing something else with tsk in the > function, but if you want to keep it that's also fine. Comment text should > probably should be 'return the new leader' though. Nah, I actually removed the comment already independently (cf. see [1]). [1]: https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/commit/?h=pidfd_open&id=dcfc98c2d957bf3ac14b06414cb2cf4c673fc297 > > > + if (!tsk) > > + ret = -ESRCH; > > Perhaps -EINVAL? AFAICS, this can only happen if a CLONE_THREAD pid was > passed as argument to pidfd_open which is invalid. But let me know what you > had in mind.. Hm, from the kernel's perspective ESRCH is correct but I guess EINVAL is nicer for userspace. So I don't have objections to using EINVAL. My first version did too I think. Thanks! Christian From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.8 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI, MENTIONS_GIT_HOSTING,SPF_HELO_NONE,SPF_PASS,USER_AGENT_NEOMUTT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2B96FC04AAF for ; Sat, 18 May 2019 10:06:33 +0000 (UTC) Received: from lists.ozlabs.org (lists.ozlabs.org [203.11.71.2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6BFB92087B for ; Sat, 18 May 2019 10:06:32 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=brauner.io header.i=@brauner.io header.b="MzXf8JwF" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6BFB92087B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=brauner.io Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 455gmQ1cnbzDqS0 for ; Sat, 18 May 2019 20:06:30 +1000 (AEST) Authentication-Results: lists.ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=brauner.io (client-ip=2a00:1450:4864:20::541; helo=mail-ed1-x541.google.com; envelope-from=christian@brauner.io; receiver=) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=brauner.io Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; secure) header.d=brauner.io header.i=@brauner.io header.b="MzXf8JwF"; dkim-atps=neutral Received: from mail-ed1-x541.google.com (mail-ed1-x541.google.com [IPv6:2a00:1450:4864:20::541]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 455gkz0QjgzDqQN for ; Sat, 18 May 2019 20:05:13 +1000 (AEST) Received: by mail-ed1-x541.google.com with SMTP id n17so14783173edb.0 for ; Sat, 18 May 2019 03:05:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=brauner.io; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=kuxvODYNXQX1UADGYTekrXKJHl9GqTdixhHyMJoLFkg=; b=MzXf8JwF72gJj2Pq8n12P+UywZxc2BVXTvpiVUKrEniJVb1LIFP3IbcOliQYISdvq7 CigvtYVzwi+8s8BzhGA7JzeYEAEgELRKSJUGXkBeKo3jIoY+k+AIz1NIGNQ3IbqWFvd/ pZcoiBfjP1G87UNeXXgpYPrDlzF+SvvAxTS1B9J01RvJsEbVREubCnWtrpwlE+W8uhC8 abJb2p1WNSljrqCFAQH7at9A4yXrlhg+5ZrhECybT+pFDJTv4XLPhvmKoEIQriQEdsEE 2qeWRYeIhjWVjN8Omnws0ZotrB2BKKuxhO4+QGekj7Ec6VajrUKLl5oRB10tcR/KFGxb SomA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=kuxvODYNXQX1UADGYTekrXKJHl9GqTdixhHyMJoLFkg=; b=NO4JMj1UiS+V6L4FhrRx5hsaDyh0Inhp1tOicX89SSbdFGrpVaH1JSRv7rDyx7IuVa PoVCnjtsLbFwV6eIQJaO33xgbxBDZGlzlvgp6xm0I1k3DOcvsVZo4N9qoauewEDhMSnA /wVbYzpEBb6vvAjRY//osSxPl/MecGp4ViMTDYYEfAZByOiOHJyQjgSgsKCT7vnHS2U/ pMT6mfmxzVJ8HKQZ/Wf99ljTGV8dIrCb993VP97PzDFB18TrF9wPRVDCS7VkeoH7tiOI 7RbQp5m/3/kbbpzUG53c9ufyMYw3H5Q4WsrcG8gDiHR+kTGigvcOAP6u4rsz49fQNqCd v2Uw== X-Gm-Message-State: APjAAAV7gCpBSNfbfAw7dj+CEA5OEdFwd/NlA1Mq7lyXcQr8z9OHdT4n C0Ni0155LsukuvyYLmr2BJJYfw== X-Google-Smtp-Source: APXvYqyJ8cSwup5T+wLOXgE4G1RIJN3Q14FiCUfotn6HCvE35vO+QFm+mQrHo4U4TisKcq8ZEYMBDQ== X-Received: by 2002:a50:b82d:: with SMTP id j42mr63022736ede.186.1558173905455; Sat, 18 May 2019 03:05:05 -0700 (PDT) Received: from brauner.io ([46.183.103.8]) by smtp.gmail.com with ESMTPSA id y30sm3741910edc.83.2019.05.18.03.04.59 (version=TLS1_3 cipher=AEAD-AES256-GCM-SHA384 bits=256/256); Sat, 18 May 2019 03:05:04 -0700 (PDT) Date: Sat, 18 May 2019 12:04:46 +0200 From: Christian Brauner To: Joel Fernandes Subject: Re: [PATCH v1 1/2] pid: add pidfd_open() Message-ID: <20190518100435.c5bqpcnra53dsr6p@brauner.io> References: <20190516135944.7205-1-christian@brauner.io> <20190516224949.GA15401@localhost> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20190516224949.GA15401@localhost> User-Agent: NeoMutt/20180716 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-ia64@vger.kernel.org, linux-sh@vger.kernel.org, oleg@redhat.com, dhowells@redhat.com, linux-kselftest@vger.kernel.org, sparclinux@vger.kernel.org, linux-api@vger.kernel.org, elena.reshetova@intel.com, linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, dancol@google.com, Geert Uytterhoeven , kernel-team@android.com, serge@hallyn.com, linux-xtensa@linux-xtensa.org, keescook@chromium.org, arnd@arndb.de, jannh@google.com, linux-m68k@lists.linux-m68k.org, viro@zeniv.linux.org.uk, luto@kernel.org, linux-mips@vger.kernel.orgg, tglx@linutronix.de, surenb@google.com, linux-arm-kernel@lists.infradead.org, linux-parisc@vger.kernel.org, cyphar@cyphar.com, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, luto@amacapital.net, ebiederm@xmission.com, linux-alpha@vger.kernel.org, akpm@linux-foundation.org, linuxppc-dev@lists.ozlabs.org Errors-To: linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org Sender: "Linuxppc-dev" On Sat, May 18, 2019 at 05:48:03AM -0400, Joel Fernandes wrote: > Hi Christian, > > For next revision, could you also CC surenb@google.com as well? He is also > working on the low memory killer. And also suggest CC to > kernel-team@android.com. And mentioned some comments below, thanks. Yip, totally. Just added them both to my Cc list. :) (I saw you added Suren manually. I added the Android kernel team now too.) > > On Thu, May 16, 2019 at 03:59:42PM +0200, Christian Brauner wrote: > [snip] > > diff --git a/kernel/pid.c b/kernel/pid.c > > index 20881598bdfa..4afca3d6dcb8 100644 > > --- a/kernel/pid.c > > +++ b/kernel/pid.c > > @@ -38,6 +38,7 @@ > > #include > > #include > > #include > > +#include > > #include > > #include > > > > @@ -451,6 +452,55 @@ struct pid *find_ge_pid(int nr, struct pid_namespace *ns) > > return idr_get_next(&ns->idr, &nr); > > } > > > > +/** > > + * pidfd_open() - Open new pid file descriptor. > > + * > > + * @pid: pid for which to retrieve a pidfd > > + * @flags: flags to pass > > + * > > + * This creates a new pid file descriptor with the O_CLOEXEC flag set for > > + * the process identified by @pid. Currently, the process identified by > > + * @pid must be a thread-group leader. This restriction currently exists > > + * for all aspects of pidfds including pidfd creation (CLONE_PIDFD cannot > > + * be used with CLONE_THREAD) and pidfd polling (only supports thread group > > + * leaders). > > + * > > + * Return: On success, a cloexec pidfd is returned. > > + * On error, a negative errno number will be returned. > > + */ > > +SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags) > > +{ > > + int fd, ret; > > + struct pid *p; > > + struct task_struct *tsk; > > + > > + if (flags) > > + return -EINVAL; > > + > > + if (pid <= 0) > > + return -EINVAL; > > + > > + p = find_get_pid(pid); > > + if (!p) > > + return -ESRCH; > > + > > + ret = 0; > > + rcu_read_lock(); > > + /* > > + * If this returns non-NULL the pid was used as a thread-group > > + * leader. Note, we race with exec here: If it changes the > > + * thread-group leader we might return the old leader. > > + */ > > + tsk = pid_task(p, PIDTYPE_TGID); > > Just trying to understand the comment here. The issue is that we might either > return the new leader, or the old leader depending on the overlap with > concurrent de_thread right? In either case, we don't care though. > > I suggest to remove the "Note..." part of the comment since it doesn't seem the > race is relevant here unless we are doing something else with tsk in the > function, but if you want to keep it that's also fine. Comment text should > probably should be 'return the new leader' though. Nah, I actually removed the comment already independently (cf. see [1]). [1]: https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/commit/?h=pidfd_open&id=dcfc98c2d957bf3ac14b06414cb2cf4c673fc297 > > > + if (!tsk) > > + ret = -ESRCH; > > Perhaps -EINVAL? AFAICS, this can only happen if a CLONE_THREAD pid was > passed as argument to pidfd_open which is invalid. But let me know what you > had in mind.. Hm, from the kernel's perspective ESRCH is correct but I guess EINVAL is nicer for userspace. So I don't have objections to using EINVAL. My first version did too I think. Thanks! Christian From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-11.0 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI, MENTIONS_GIT_HOSTING,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_NEOMUTT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EF4E1C04AAF for ; Sat, 18 May 2019 10:05:13 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C290C20C01 for ; Sat, 18 May 2019 10:05:13 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="dlLq7Gpy"; dkim=fail reason="signature verification failed" (2048-bit key) header.d=brauner.io header.i=@brauner.io header.b="MzXf8JwF" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C290C20C01 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=brauner.io Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+infradead-linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:Cc:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References: Message-ID:Subject:To:From:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=GAMfU7klB9QgAUAL284xq6nlI6u2zeDUoavRDmSHMl0=; b=dlLq7GpyG1Stpd XHZHAlcHqyAeUPVWbUlLwLm+sy5g8U/H/jmzmMO4yXv12iF/BUGBDhJh4bEG7pJcIsNVoAi9ALWQq j6LWjZx1I5m09BhTh6v+SsQDRRg/XgvBoN1xVnEBNMocPQo4l5wLMeMJWz1DEDb4Ahh6jIiTLHabh F8vTxkxDyHPRi2RSBfIkbPe9PgQzmvd3GtIBgIpVIM2omjcGc1K6hH7D52DL1993yziKL7lqT4O2F gnE/pfMDRGBdI6o0Gmfbw8/euSmNAb9bsZYzPMWNBpldH0AnKqFZsD00fIBvabIrrtj37ZZuNG5gT IkwaPotSbCDhI7Bf07xQ==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.90_1 #2 (Red Hat Linux)) id 1hRwDH-0001Rb-78; Sat, 18 May 2019 10:05:11 +0000 Received: from mail-ed1-x544.google.com ([2a00:1450:4864:20::544]) by bombadil.infradead.org with esmtps (Exim 4.90_1 #2 (Red Hat Linux)) id 1hRwDE-00013E-1T for linux-arm-kernel@lists.infradead.org; Sat, 18 May 2019 10:05:09 +0000 Received: by mail-ed1-x544.google.com with SMTP id l25so14683786eda.9 for ; Sat, 18 May 2019 03:05:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=brauner.io; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=kuxvODYNXQX1UADGYTekrXKJHl9GqTdixhHyMJoLFkg=; b=MzXf8JwF72gJj2Pq8n12P+UywZxc2BVXTvpiVUKrEniJVb1LIFP3IbcOliQYISdvq7 CigvtYVzwi+8s8BzhGA7JzeYEAEgELRKSJUGXkBeKo3jIoY+k+AIz1NIGNQ3IbqWFvd/ pZcoiBfjP1G87UNeXXgpYPrDlzF+SvvAxTS1B9J01RvJsEbVREubCnWtrpwlE+W8uhC8 abJb2p1WNSljrqCFAQH7at9A4yXrlhg+5ZrhECybT+pFDJTv4XLPhvmKoEIQriQEdsEE 2qeWRYeIhjWVjN8Omnws0ZotrB2BKKuxhO4+QGekj7Ec6VajrUKLl5oRB10tcR/KFGxb SomA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=kuxvODYNXQX1UADGYTekrXKJHl9GqTdixhHyMJoLFkg=; b=U/IfcakcqcjNDN6pav0IuaiimWGHosGuNCWuREDQMaOf3VsdMVdGdQ0162rPf29hJQ ylbc/7HjyeINmo3bdULs7v/Lx1ZcgpHxwkBgKaCcZhsfYBKAzftIvGi/tQvqZJTca3rO aZ1k5EXihlZRElvex1+it53ohM3SKuOaM23jkVh/8Hf/qXk9Pj5VjV4ViK/Qvf8sm4S2 hWoOMDldv1sfTW7EEqFW8LtCuIT9Im0b5GIPNmx44dWKuNEFb0E5mluHHbcgleZi5hjJ deceeAnyLOcOrZ2A4oRdr7TUwGvMvfjzHA6paK51tRI4tnI/mhBmvrYs2Mq99zooGUxr 8YWQ== X-Gm-Message-State: APjAAAX8mWU2a6yBT86/KzLA4dP3lSRS0aaTD2n4ZttQTuCNV8EseoA5 dRY1OD+w6HDuXnVVdNyEdcYKHw== X-Google-Smtp-Source: APXvYqyJ8cSwup5T+wLOXgE4G1RIJN3Q14FiCUfotn6HCvE35vO+QFm+mQrHo4U4TisKcq8ZEYMBDQ== X-Received: by 2002:a50:b82d:: with SMTP id j42mr63022736ede.186.1558173905455; Sat, 18 May 2019 03:05:05 -0700 (PDT) Received: from brauner.io ([46.183.103.8]) by smtp.gmail.com with ESMTPSA id y30sm3741910edc.83.2019.05.18.03.04.59 (version=TLS1_3 cipher=AEAD-AES256-GCM-SHA384 bits=256/256); Sat, 18 May 2019 03:05:04 -0700 (PDT) Date: Sat, 18 May 2019 12:04:46 +0200 From: Christian Brauner To: Joel Fernandes Subject: Re: [PATCH v1 1/2] pid: add pidfd_open() Message-ID: <20190518100435.c5bqpcnra53dsr6p@brauner.io> References: <20190516135944.7205-1-christian@brauner.io> <20190516224949.GA15401@localhost> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20190516224949.GA15401@localhost> User-Agent: NeoMutt/20180716 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20190518_030508_136900_6DD9B10A X-CRM114-Status: GOOD ( 26.39 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-ia64@vger.kernel.org, linux-sh@vger.kernel.org, oleg@redhat.com, dhowells@redhat.com, linux-kselftest@vger.kernel.org, sparclinux@vger.kernel.org, linux-api@vger.kernel.org, elena.reshetova@intel.com, linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, dancol@google.com, Geert Uytterhoeven , kernel-team@android.com, serge@hallyn.com, linux-xtensa@linux-xtensa.org, keescook@chromium.org, arnd@arndb.de, jannh@google.com, linux-m68k@lists.linux-m68k.org, viro@zeniv.linux.org.uk, luto@kernel.org, linux-mips@vger.kernel.orgg, tglx@linutronix.de, surenb@google.com, linux-arm-kernel@lists.infradead.org, linux-parisc@vger.kernel.org, cyphar@cyphar.com, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, luto@amacapital.net, ebiederm@xmission.com, linux-alpha@vger.kernel.org, akpm@linux-foundation.org, linuxppc-dev@lists.ozlabs.org Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+infradead-linux-arm-kernel=archiver.kernel.org@lists.infradead.org On Sat, May 18, 2019 at 05:48:03AM -0400, Joel Fernandes wrote: > Hi Christian, > > For next revision, could you also CC surenb@google.com as well? He is also > working on the low memory killer. And also suggest CC to > kernel-team@android.com. And mentioned some comments below, thanks. Yip, totally. Just added them both to my Cc list. :) (I saw you added Suren manually. I added the Android kernel team now too.) > > On Thu, May 16, 2019 at 03:59:42PM +0200, Christian Brauner wrote: > [snip] > > diff --git a/kernel/pid.c b/kernel/pid.c > > index 20881598bdfa..4afca3d6dcb8 100644 > > --- a/kernel/pid.c > > +++ b/kernel/pid.c > > @@ -38,6 +38,7 @@ > > #include > > #include > > #include > > +#include > > #include > > #include > > > > @@ -451,6 +452,55 @@ struct pid *find_ge_pid(int nr, struct pid_namespace *ns) > > return idr_get_next(&ns->idr, &nr); > > } > > > > +/** > > + * pidfd_open() - Open new pid file descriptor. > > + * > > + * @pid: pid for which to retrieve a pidfd > > + * @flags: flags to pass > > + * > > + * This creates a new pid file descriptor with the O_CLOEXEC flag set for > > + * the process identified by @pid. Currently, the process identified by > > + * @pid must be a thread-group leader. This restriction currently exists > > + * for all aspects of pidfds including pidfd creation (CLONE_PIDFD cannot > > + * be used with CLONE_THREAD) and pidfd polling (only supports thread group > > + * leaders). > > + * > > + * Return: On success, a cloexec pidfd is returned. > > + * On error, a negative errno number will be returned. > > + */ > > +SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags) > > +{ > > + int fd, ret; > > + struct pid *p; > > + struct task_struct *tsk; > > + > > + if (flags) > > + return -EINVAL; > > + > > + if (pid <= 0) > > + return -EINVAL; > > + > > + p = find_get_pid(pid); > > + if (!p) > > + return -ESRCH; > > + > > + ret = 0; > > + rcu_read_lock(); > > + /* > > + * If this returns non-NULL the pid was used as a thread-group > > + * leader. Note, we race with exec here: If it changes the > > + * thread-group leader we might return the old leader. > > + */ > > + tsk = pid_task(p, PIDTYPE_TGID); > > Just trying to understand the comment here. The issue is that we might either > return the new leader, or the old leader depending on the overlap with > concurrent de_thread right? In either case, we don't care though. > > I suggest to remove the "Note..." part of the comment since it doesn't seem the > race is relevant here unless we are doing something else with tsk in the > function, but if you want to keep it that's also fine. Comment text should > probably should be 'return the new leader' though. Nah, I actually removed the comment already independently (cf. see [1]). [1]: https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/commit/?h=pidfd_open&id=dcfc98c2d957bf3ac14b06414cb2cf4c673fc297 > > > + if (!tsk) > > + ret = -ESRCH; > > Perhaps -EINVAL? AFAICS, this can only happen if a CLONE_THREAD pid was > passed as argument to pidfd_open which is invalid. But let me know what you > had in mind.. Hm, from the kernel's perspective ESRCH is correct but I guess EINVAL is nicer for userspace. So I don't have objections to using EINVAL. My first version did too I think. Thanks! Christian _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel