From: Joel Fernandes <joel@joelfernandes.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: linux-kernel@vger.kernel.org, luto@amacapital.net,
	rostedt@goodmis.org, dancol@google.com, christian@brauner.io,
	jannh@google.com, surenb@google.com,
	torvalds@linux-foundation.org,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Andrei Vagin <avagin@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Arnd Bergmann <arnd@arndb.de>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Kees Cook <keescook@chromium.org>,
	linux-fsdevel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	Michal Hocko <mhocko@suse.com>, Nadav Amit <namit@vmware.com>,
	Serge Hallyn <serge@hallyn.com>, Shuah Khan <shuah@kernel.org>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	Taehee Yoo <ap420073@gmail.com>, Tejun Heo <tj@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	kernel-team@android.com, Tycho Andersen <tycho@tycho.ws>
Subject: Re: [PATCH RFC 1/2] Add polling support to pidfd
Date: Fri, 19 Apr 2019 14:12:58 -0400
Message-ID: <20190419181258.GA251571@google.com>
In-Reply-To: <20190417130940.GC32622@redhat.com>

Just returned to work today after dealing with "life" issues; apologies for
the delay in replying. :)

On Wed, Apr 17, 2019 at 03:09:41PM +0200, Oleg Nesterov wrote:
> On 04/16, Joel Fernandes wrote:
> >
> > On Tue, Apr 16, 2019 at 02:04:31PM +0200, Oleg Nesterov wrote:
> > >
> > > Could you explain when it should return POLLIN? When the whole process exits?
> >
> > It returns POLLIN when the task is dead or doesn't exist anymore, or when it
> > is in a zombie state and there's no other thread in the thread group.
> 
> IOW, when the whole thread group exits, so it can't be used to monitor sub-threads.
> 
> just in case... speaking of this patch it doesn't modify proc_tid_base_operations,
> so you can't poll("/proc/sub-thread-tid") anyway, but iiuc you are going to use
> the anonymous file returned by CLONE_PIDFD ?

Yes, I am going to convert to the non-proc file returned by CLONE_PIDFD.
(But I am still catching up with all the threads and will read the latest on
whether we are still considering proc pidfds; last I understood, we are not.)

> > > Then all you need is
> > >
> > > 	!task || task->exit_state && thread_group_empty(task)
> >
> > Yes, this works as well; all the tests pass with your suggestion, so I'll
> > change it to that. Although I will then be giving up returning EPOLLERR if
> > the task_struct doesn't exist. We don't need that, but I thought it was cool
> > to return it anyway.
> 
> OK, task == NULL means that it was already reaped by parent, pid_nr is free,
> probably useful....

Ok I will add that semantic as well then.
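
To make the agreed readiness condition concrete, here is a minimal user-space
sketch of it. This is not kernel code: struct model_task and both model_*
helpers are hypothetical stand-ins introduced only for illustration, and only
the boolean expression itself comes from the suggestion quoted above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few pieces of task state the check reads. */
struct model_task {
	int exit_state;    /* nonzero once the task is a zombie or dead */
	int other_threads; /* threads in the group besides this one */
};

/* Assumed model of the kernel's thread_group_empty(). */
static bool model_thread_group_empty(const struct model_task *task)
{
	return task->other_threads == 0;
}

/*
 * The suggested condition:
 *   !task || task->exit_state && thread_group_empty(task)
 * A NULL task models "already reaped by the parent".
 */
static bool model_pidfd_readable(const struct model_task *task)
{
	return !task || (task->exit_state && model_thread_group_empty(task));
}
```

In this model an exited leader that still has sub-threads is not readable,
which matches the point that the poll reports whole-thread-group exit only.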

> > > Please do not use EXIT_DEAD/EXIT_ZOMBIE. And ->wait_pidfd should probably
> > > live in task->signal_struct.
> >
> > About wait_pidfd living in signal_struct, that won't work since the waitqueue
> > has to survive for the duration of the poll system call.
> 
> That is why I said this will need the additional cleanup in free_signal_struct().
> But I was wrong, somehow I forgot that free_poll_entry() needs wq_head->lock ;)
> so this will need much more complications, lets forget it...

Ok np :)

> > Also the waitqueue living in struct pid solves the de_thread() issue I
> > mentioned later in the following thread and in the commit message:
> > https://lore.kernel.org/patchwork/comment/1257175/
> 
> Hmm...
> 
> 	2. By including the struct pid for the waitqueue means that during
> 	de_exec, the thread doing de_thread() automatically gets the new
> 	waitqueue/pid even though its task_struct is different.
> 
> this one?
> 
> this is not true, or I do not understand...
> 
> it gets the _same_ (old, not new) PIDTYPE_TGID pid even if it changes task_struct.
> But probably this is what you actually meant, because this is what your patch wants
> or I am totally confused.

Yes, that's what I meant, sorry.

> And note that exec/de_thread doesn't change ->signal_struct, so I do not understand
> you anyway. Nevermind.

Yes right, but the signal_struct would suffer from the waitqueue lifetime
issue anyway so we can't use it. The current patch works well for everything.
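
The lifetime argument can be sketched with a small user-space model. It
assumes, for illustration only, that the waitqueue sits in a reference-counted
object (standing in for struct pid) and that the poller holds its own
reference. struct model_pid and the model_pid_* helpers are hypothetical
names, not kernel APIs.

```c
#include <stdlib.h>

/* Hypothetical model of a refcounted object that owns the waitqueue. */
struct model_pid {
	int refcount;
	int nr_waiters; /* stands in for the wait_pidfd waitqueue */
};

/* Allocate with one reference, modelling the one owned by the task. */
static struct model_pid *model_pid_alloc(void)
{
	struct model_pid *pid = calloc(1, sizeof(*pid));

	if (pid)
		pid->refcount = 1;
	return pid;
}

/* Take an extra reference, as a poller would for the duration of poll. */
static struct model_pid *model_pid_get(struct model_pid *pid)
{
	pid->refcount++;
	return pid;
}

/* Drop a reference; returns 1 if this call freed the object, else 0. */
static int model_pid_put(struct model_pid *pid)
{
	if (--pid->refcount > 0)
		return 0;
	free(pid);
	return 1;
}
```

In this model the task dropping its reference does not free the object while
a poller still holds one, so the waitqueue stays valid until the poller is
done. A waitqueue embedded in a structure that is freed along with the task
state would not have that property.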

thanks,

- Joel

