From: Oleg Nesterov <oleg@redhat.com>
To: Joel Fernandes <joel@joelfernandes.org>
Cc: linux-kernel@vger.kernel.org, luto@amacapital.net,
	rostedt@goodmis.org, dancol@google.com, christian@brauner.io,
	jannh@google.com, surenb@google.com,
	torvalds@linux-foundation.org,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Andrei Vagin <avagin@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Arnd Bergmann <arnd@arndb.de>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Kees Cook <keescook@chromium.org>,
	linux-fsdevel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	Michal Hocko <mhocko@suse.com>, Nadav Amit <namit@vmware.com>,
	Serge Hallyn <serge@hallyn.com>, Shuah Khan <shuah@kernel.org>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	Taehee Yoo <ap420073@gmail.com>, Tejun Heo <tj@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	kernel-team@android.com, Tycho Andersen <tycho@tycho.ws>
Subject: Re: [PATCH RFC 1/2] Add polling support to pidfd
Date: Wed, 17 Apr 2019 15:09:41 +0200
Message-ID: <20190417130940.GC32622@redhat.com>
In-Reply-To: <20190416192051.GA184889@google.com>

On 04/16, Joel Fernandes wrote:
>
> On Tue, Apr 16, 2019 at 02:04:31PM +0200, Oleg Nesterov wrote:
> >
> > Could you explain when it should return POLLIN? When the whole process exits?
>
> It returns POLLIN when the task is dead or doesn't exist anymore, or when it
> is in a zombie state and there's no other thread in the thread group.

IOW, when the whole thread group exits, so it can't be used to monitor sub-threads.

Just in case... this patch doesn't modify proc_tid_base_operations, so you
can't poll("/proc/sub-thread-tid") anyway, but iiuc you are going to use
the anonymous file returned by CLONE_PIDFD?


> > Then all you need is
> >
> > 	!task || task->exit_state && thread_group_empty(task)
>
> Yes this works as well, all the tests pass with your suggestion so I'll
> change it to that. Although I will then be giving up returning EPOLLERR if the
> task_struct doesn't exist. We don't need that, but I thought it was cool to
> return it anyway.

OK, task == NULL means that it was already reaped by parent, pid_nr is free,
probably useful....
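
For readers following along, the check quoted above can be modelled in plain
userspace C. This is only a sketch under stated assumptions: struct mock_task,
mock_thread_group_empty() and mock_pidfd_ready() are hypothetical stand-ins
invented for illustration, not kernel code and not the patch itself. It mainly
shows how the expression groups, since && binds tighter than ||.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the task_struct fields used here. */
struct mock_task {
	int exit_state;   /* 0 while running; non-zero once zombie or dead */
	int live_threads; /* threads still in the group, this one included */
};

/* Models thread_group_empty(): true when no other thread is left. */
bool mock_thread_group_empty(const struct mock_task *task)
{
	return task->live_threads <= 1;
}

/*
 * Models the suggested readiness check. Because && binds tighter than ||,
 * the expression reads as
 *	!task || (task->exit_state && thread_group_empty(task))
 * A NULL task stands for "already reaped by the parent".
 */
bool mock_pidfd_ready(const struct mock_task *task)
{
	return !task || (task->exit_state && mock_thread_group_empty(task));
}
```

In this model a zombie leader that still has live sub-threads is not reported
as ready, which matches the "whole thread group exits" reading above.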

> > Please do not use EXIT_DEAD/EXIT_ZOMBIE. And ->wait_pidfd should probably
> > live in task->signal_struct.
>
> About wait_pidfd living in signal_struct, that won't work since the waitqueue
> has to survive for the duration of the poll system call.

That is why I said this will need the additional cleanup in free_signal_struct().
But I was wrong, somehow I forgot that free_poll_entry() needs wq_head->lock ;)
so this will need many more complications, let's forget it...

> Also the waitqueue living in struct pid solves the de_thread() issue I
> mentioned later in the following thread and in the commit message:
> https://lore.kernel.org/patchwork/comment/1257175/

Hmm...

	2. By including the struct pid for the waitqueue means that during
	de_exec, the thread doing de_thread() automatically gets the new
	waitqueue/pid even though its task_struct is different.

this one?

this is not true, or I do not understand...

it gets the _same_ (old, not new) PIDTYPE_TGID pid even if it changes task_struct.
But probably this is what you actually meant, because this is what your patch wants
or I am totally confused.

And note that exec/de_thread doesn't change ->signal_struct, so I do not understand
you anyway. Never mind.

Oleg.


