linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] io_uring: handle signals before letting io-worker exit
       [not found] <60ae94d1.1c69fb81.94f7a.2a35SMTPIN_ADDED_MISSING@mx.google.com>
@ 2021-05-27 13:46 ` Jens Axboe
  2021-05-27 15:21   ` Olivier Langlois
  0 siblings, 1 reply; 4+ messages in thread
From: Jens Axboe @ 2021-05-27 13:46 UTC (permalink / raw)
  To: Olivier Langlois, Pavel Begunkov, io-uring, linux-kernel

On 5/26/21 12:21 PM, Olivier Langlois wrote:
> This is required for proper core dump generation.
> 
> Because signals are normally serviced before resuming userspace and an
> io_worker thread will never resume userspace, it needs to explicitly
> call the signal servicing functions.
> 
> Also, notice that it is possible to exit from the io_wqe_worker()
> function main loop while having a pending signal such as when
> the IO_WQ_BIT_EXIT bit is set.
> 
> It is crucial to service any pending signal before calling do_exit()
> Proper coredump generation is relying on PF_SIGNALED to be set.
> 
> More specifically, exit_mm() is using this flag to wait for the
> core dump completion before releasing its memory descriptor.
> 
> Signed-off-by: Olivier Langlois <olivier@trillion01.com>
> ---
>  fs/io-wq.c | 22 ++++++++++++++++++++--
>  1 file changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/io-wq.c b/fs/io-wq.c
> index 5361a9b4b47b..b76c61e9aff2 100644
> --- a/fs/io-wq.c
> +++ b/fs/io-wq.c
> @@ -9,8 +9,6 @@
>  #include <linux/init.h>
>  #include <linux/errno.h>
>  #include <linux/sched/signal.h>
> -#include <linux/mm.h>
> -#include <linux/sched/mm.h>
>  #include <linux/percpu.h>
>  #include <linux/slab.h>
>  #include <linux/rculist_nulls.h>
> @@ -193,6 +191,26 @@ static void io_worker_exit(struct io_worker *worker)
>  
>  	kfree_rcu(worker, rcu);
>  	io_worker_ref_put(wqe->wq);
> +	/*
> +	 * Because signals are normally serviced before resuming userspace and an
> +	 * io_worker thread will never resume userspace, it needs to explicitly
> +	 * call the signal servicing functions.
> +	 *
> +	 * Also notice that it is possible to exit from the io_wqe_worker()
> +	 * function main loop while having a pending signal such as when
> +	 * the IO_WQ_BIT_EXIT bit is set.
> +	 *
> +	 * It is crucial to service any pending signal before calling do_exit()
> +	 * Proper coredump generation is relying on PF_SIGNALED to be set.
> +	 *
> +	 * More specifically, exit_mm() is using this flag to wait for the
> +	 * core dump completion before releasing its memory descriptor.
> +	 */
> +	if (signal_pending(current)) {
> +		struct ksignal ksig;
> +
> +		get_signal(&ksig);
> +	}
>  	do_exit(0);
>  }

Do we need the same thing in fs/io_uring.c:io_sq_thread()?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] io_uring: handle signals before letting io-worker exit
  2021-05-27 13:46 ` [PATCH] io_uring: handle signals before letting io-worker exit Jens Axboe
@ 2021-05-27 15:21   ` Olivier Langlois
  2021-05-27 15:30     ` Jens Axboe
  0 siblings, 1 reply; 4+ messages in thread
From: Olivier Langlois @ 2021-05-27 15:21 UTC (permalink / raw)
  To: Jens Axboe, Pavel Begunkov, io-uring, linux-kernel

On Thu, 2021-05-27 at 07:46 -0600, Jens Axboe wrote:
> On 5/26/21 12:21 PM, Olivier Langlois wrote:
> > This is required for proper core dump generation.
> > 
> > Because signals are normally serviced before resuming userspace and
> > an
> > io_worker thread will never resume userspace, it needs to
> > explicitly
> > call the signal servicing functions.
> > 
> > Also, notice that it is possible to exit from the io_wqe_worker()
> > function main loop while having a pending signal such as when
> > the IO_WQ_BIT_EXIT bit is set.
> > 
> > It is crucial to service any pending signal before calling
> > do_exit()
> > Proper coredump generation is relying on PF_SIGNALED to be set.
> > 
> > More specifically, exit_mm() is using this flag to wait for the
> > core dump completion before releasing its memory descriptor.
> > 
> > Signed-off-by: Olivier Langlois <olivier@trillion01.com>
> > ---
> >  fs/io-wq.c | 22 ++++++++++++++++++++--
> >  1 file changed, 20 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/io-wq.c b/fs/io-wq.c
> > index 5361a9b4b47b..b76c61e9aff2 100644
> > --- a/fs/io-wq.c
> > +++ b/fs/io-wq.c
> > @@ -9,8 +9,6 @@
> >  #include <linux/init.h>
> >  #include <linux/errno.h>
> >  #include <linux/sched/signal.h>
> > -#include <linux/mm.h>
> > -#include <linux/sched/mm.h>
> >  #include <linux/percpu.h>
> >  #include <linux/slab.h>
> >  #include <linux/rculist_nulls.h>
> > @@ -193,6 +191,26 @@ static void io_worker_exit(struct io_worker
> > *worker)
> >  
> >         kfree_rcu(worker, rcu);
> >         io_worker_ref_put(wqe->wq);
> > +       /*
> > +        * Because signals are normally serviced before resuming
> > userspace and an
> > +        * io_worker thread will never resume userspace, it needs
> > to explicitly
> > +        * call the signal servicing functions.
> > +        *
> > +        * Also notice that it is possible to exit from the
> > io_wqe_worker()
> > +        * function main loop while having a pending signal such as
> > when
> > +        * the IO_WQ_BIT_EXIT bit is set.
> > +        *
> > +        * It is crucial to service any pending signal before
> > calling do_exit()
> > +        * Proper coredump generation is relying on PF_SIGNALED to
> > be set.
> > +        *
> > +        * More specifically, exit_mm() is using this flag to wait
> > for the
> > +        * core dump completion before releasing its memory
> > descriptor.
> > +        */
> > +       if (signal_pending(current)) {
> > +               struct ksignal ksig;
> > +
> > +               get_signal(&ksig);
> > +       }
> >         do_exit(0);
> >  }
> 
> Do we need the same thing in fs/io_uring.c:io_sq_thread()?
> 
Jens,

You are 100% correct. In fact, this is the same problem for ALL
currently existing and future io threads. Therefore, I start to think
that the right place for the fix might be straight into do_exit()...



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] io_uring: handle signals before letting io-worker exit
  2021-05-27 15:21   ` Olivier Langlois
@ 2021-05-27 15:30     ` Jens Axboe
  2021-05-27 16:09       ` Olivier Langlois
  0 siblings, 1 reply; 4+ messages in thread
From: Jens Axboe @ 2021-05-27 15:30 UTC (permalink / raw)
  To: Olivier Langlois, Pavel Begunkov, io-uring, linux-kernel

On 5/27/21 9:21 AM, Olivier Langlois wrote:
> On Thu, 2021-05-27 at 07:46 -0600, Jens Axboe wrote:
>> On 5/26/21 12:21 PM, Olivier Langlois wrote:
>>> This is required for proper core dump generation.
>>>
>>> Because signals are normally serviced before resuming userspace and
>>> an
>>> io_worker thread will never resume userspace, it needs to
>>> explicitly
>>> call the signal servicing functions.
>>>
>>> Also, notice that it is possible to exit from the io_wqe_worker()
>>> function main loop while having a pending signal such as when
>>> the IO_WQ_BIT_EXIT bit is set.
>>>
>>> It is crucial to service any pending signal before calling
>>> do_exit()
>>> Proper coredump generation is relying on PF_SIGNALED to be set.
>>>
>>> More specifically, exit_mm() is using this flag to wait for the
>>> core dump completion before releasing its memory descriptor.
>>>
>>> Signed-off-by: Olivier Langlois <olivier@trillion01.com>
>>> ---
>>>  fs/io-wq.c | 22 ++++++++++++++++++++--
>>>  1 file changed, 20 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/io-wq.c b/fs/io-wq.c
>>> index 5361a9b4b47b..b76c61e9aff2 100644
>>> --- a/fs/io-wq.c
>>> +++ b/fs/io-wq.c
>>> @@ -9,8 +9,6 @@
>>>  #include <linux/init.h>
>>>  #include <linux/errno.h>
>>>  #include <linux/sched/signal.h>
>>> -#include <linux/mm.h>
>>> -#include <linux/sched/mm.h>
>>>  #include <linux/percpu.h>
>>>  #include <linux/slab.h>
>>>  #include <linux/rculist_nulls.h>
>>> @@ -193,6 +191,26 @@ static void io_worker_exit(struct io_worker
>>> *worker)
>>>  
>>>         kfree_rcu(worker, rcu);
>>>         io_worker_ref_put(wqe->wq);
>>> +       /*
>>> +        * Because signals are normally serviced before resuming
>>> userspace and an
>>> +        * io_worker thread will never resume userspace, it needs
>>> to explicitly
>>> +        * call the signal servicing functions.
>>> +        *
>>> +        * Also notice that it is possible to exit from the
>>> io_wqe_worker()
>>> +        * function main loop while having a pending signal such as
>>> when
>>> +        * the IO_WQ_BIT_EXIT bit is set.
>>> +        *
>>> +        * It is crucial to service any pending signal before
>>> calling do_exit()
>>> +        * Proper coredump generation is relying on PF_SIGNALED to
>>> be set.
>>> +        *
>>> +        * More specifically, exit_mm() is using this flag to wait
>>> for the
>>> +        * core dump completion before releasing its memory
>>> descriptor.
>>> +        */
>>> +       if (signal_pending(current)) {
>>> +               struct ksignal ksig;
>>> +
>>> +               get_signal(&ksig);
>>> +       }
>>>         do_exit(0);
>>>  }
>>
>> Do we need the same thing in fs/io_uring.c:io_sq_thread()?
>>
> Jens,
> 
> You are 100% correct. In fact, this is the same problem for ALL
> currently existing and future io threads. Therefore, I start to think
> that the right place for the fix might be straight into do_exit()...

That is what I was getting at. To avoid poluting do_exit() with it, I
think it'd be best to add an io_thread_exit() that simply does:

void io_thread_exit(void)
{
	if (signal_pending(current)) {
		struct ksignal ksig;
		get_signal(&ksig);
	}
	do_exit(0);
}

and convert the do_exit() calls in io_uring/io-wq to io_thread_exit()
instead.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] io_uring: handle signals before letting io-worker exit
  2021-05-27 15:30     ` Jens Axboe
@ 2021-05-27 16:09       ` Olivier Langlois
  0 siblings, 0 replies; 4+ messages in thread
From: Olivier Langlois @ 2021-05-27 16:09 UTC (permalink / raw)
  To: Jens Axboe, Pavel Begunkov, io-uring, linux-kernel

On Thu, 2021-05-27 at 09:30 -0600, Jens Axboe wrote:
> > > 
> > Jens,
> > 
> > You are 100% correct. In fact, this is the same problem for ALL
> > currently existing and future io threads. Therefore, I start to
> > think
> > that the right place for the fix might be straight into
> > do_exit()...
> 
> That is what I was getting at. To avoid poluting do_exit() with it, I
> think it'd be best to add an io_thread_exit() that simply does:
> 
> void io_thread_exit(void)
> {
>         if (signal_pending(current)) {
>                 struct ksignal ksig;
>                 get_signal(&ksig);
>         }
>         do_exit(0);
> }
> 
> and convert the do_exit() calls in io_uring/io-wq to io_thread_exit()
> instead.
> 
IMHO, that would be an acceptable compromise because it does fix my
problem. However, I am of the opinion that it wouldn't be poluting
do_exit() and would in fact be the right place to do it considering
that create_io_thread() is in kernel and theoritically, anyone can call
it to create an io_thread and would be susceptible to get bitten by the
exact same problem and would have to come up with a similar solution if
it is not addressed directly by the kernel.

Also, since I have submitted the patch, I have made the following
realization:

I got bitten by the problem because of a race condition between the io-
mgr thread and its io-wrks threads for processing their pending SIGKILL
and the proposed patch does correct my problem.

The issue would have most likely been buried by 5.13 io-mgr removal...

BUT, even the proposed patch isn't 100% perfect. AFAIK, it is still
possible, but very unlikely, to get a signal between calling
signal_pending() and do_exit().

It might be possible to implement the solution and be 100% correct all
the time by doing it inside do_exit()... I am currently eyeing
exit_signals() as a potential good site for the patch...



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-05-27 16:09 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <60ae94d1.1c69fb81.94f7a.2a35SMTPIN_ADDED_MISSING@mx.google.com>
2021-05-27 13:46 ` [PATCH] io_uring: handle signals before letting io-worker exit Jens Axboe
2021-05-27 15:21   ` Olivier Langlois
2021-05-27 15:30     ` Jens Axboe
2021-05-27 16:09       ` Olivier Langlois

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).