* [PATCH] io_uring: fix early sqd_list removal sqpoll hangs
@ 2021-04-13 10:43 Pavel Begunkov
  2021-04-14 10:46 ` Pavel Begunkov
  2021-04-14 16:19 ` Jens Axboe
  0 siblings, 2 replies; 3+ messages in thread
From: Pavel Begunkov @ 2021-04-13 10:43 UTC (permalink / raw)
  To: Jens Axboe, io-uring; +Cc: stable, Joakim Hassila

[  245.463317] INFO: task iou-sqp-1374:1377 blocked for more than 122 seconds.
[  245.463334] task:iou-sqp-1374    state:D flags:0x00004000
[  245.463345] Call Trace:
[  245.463352]  __schedule+0x36b/0x950
[  245.463376]  schedule+0x68/0xe0
[  245.463385]  __io_uring_cancel+0xfb/0x1a0
[  245.463407]  do_exit+0xc0/0xb40
[  245.463423]  io_sq_thread+0x49b/0x710
[  245.463445]  ret_from_fork+0x22/0x30

This happens when the sqpoll thread fails to run park_task_work before
it goes to exit. An exiting user may then remove the ctx from sqd_list,
so the corresponding io_sq_thread() -> io_uring_cancel_sqpoll() is never
executed. Luckily, the thread just gets stuck in do_exit() in this case.

Cc: stable@vger.kernel.org
Reported-by: Joakim Hassila <joj@mac.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index cadd7a65a7f4..f390914666b1 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6817,6 +6817,9 @@ static int io_sq_thread(void *data)
 	current->flags |= PF_NO_SETAFFINITY;
 
 	mutex_lock(&sqd->lock);
+	/* a user may have exited before the thread started */
+	io_run_task_work_head(&sqd->park_task_work);
+
 	while (!test_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state)) {
 		int ret;
 		bool cap_entries, sqt_spin, needs_sched;
@@ -6833,10 +6836,10 @@ static int io_sq_thread(void *data)
 			}
 			cond_resched();
 			mutex_lock(&sqd->lock);
-			if (did_sig)
-				break;
 			io_run_task_work();
 			io_run_task_work_head(&sqd->park_task_work);
+			if (did_sig)
+				break;
 			timeout = jiffies + sqd->sq_thread_idle;
 			continue;
 		}
-- 
2.24.0
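
The effect of the second hunk (running the queued work before honouring
did_sig) can be illustrated with a small userspace model. This is only a
sketch under stated assumptions, not kernel code: struct park_queue,
run_park_work(), wake_old_order() and wake_new_order() are hypothetical
names invented for the illustration, a plain counter stands in for
sqd->park_task_work, and locking is not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for sqd->park_task_work: just counts callbacks. */
struct park_queue {
	int pending;	/* callbacks queued by an exiting user */
	int executed;	/* callbacks the poll thread actually ran */
};

/* Model of draining the queue, in the spirit of io_run_task_work_head(). */
static void run_park_work(struct park_queue *q)
{
	assert(q != NULL);
	q->executed += q->pending;
	q->pending = 0;
}

/*
 * Ordering before the patch: a pending signal leaves the loop first,
 * so anything still queued is never run. Returns the number of
 * callbacks left behind.
 */
static int wake_old_order(struct park_queue *q, bool did_sig)
{
	if (did_sig)
		return q->pending;
	run_park_work(q);
	return q->pending;
}

/*
 * Ordering after the patch: drain the queue first, then honour the
 * signal. Returns the number of callbacks left behind.
 */
static int wake_new_order(struct park_queue *q, bool did_sig)
{
	run_park_work(q);
	if (did_sig)
		return q->pending;
	return q->pending;
}
```

In this model, calling wake_old_order() with did_sig set leaves every
queued callback pending, while wake_new_order() always drains the queue
before the thread leaves the loop.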



* Re: [PATCH] io_uring: fix early sqd_list removal sqpoll hangs
  2021-04-13 10:43 [PATCH] io_uring: fix early sqd_list removal sqpoll hangs Pavel Begunkov
@ 2021-04-14 10:46 ` Pavel Begunkov
  2021-04-14 16:19 ` Jens Axboe
  1 sibling, 0 replies; 3+ messages in thread
From: Pavel Begunkov @ 2021-04-14 10:46 UTC (permalink / raw)
  To: Jens Axboe, io-uring; +Cc: stable, Joakim Hassila

On 13/04/2021 11:43, Pavel Begunkov wrote:
> [  245.463317] INFO: task iou-sqp-1374:1377 blocked for more than 122 seconds.
> [  245.463334] task:iou-sqp-1374    state:D flags:0x00004000
> [  245.463345] Call Trace:
> [  245.463352]  __schedule+0x36b/0x950
> [  245.463376]  schedule+0x68/0xe0
> [  245.463385]  __io_uring_cancel+0xfb/0x1a0
> [  245.463407]  do_exit+0xc0/0xb40
> [  245.463423]  io_sq_thread+0x49b/0x710
> [  245.463445]  ret_from_fork+0x22/0x30
> 
> This happens when the sqpoll thread fails to run park_task_work before
> it goes to exit. An exiting user may then remove the ctx from sqd_list,
> so the corresponding io_sq_thread() -> io_uring_cancel_sqpoll() is never
> executed. Luckily, the thread just gets stuck in do_exit() in this case.

FWIW, this is actually a 5.12 problem, and I have a reasonably reliable
way to reproduce it.


> Cc: stable@vger.kernel.org
> Reported-by: Joakim Hassila <joj@mac.com>
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>  fs/io_uring.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index cadd7a65a7f4..f390914666b1 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -6817,6 +6817,9 @@ static int io_sq_thread(void *data)
>  	current->flags |= PF_NO_SETAFFINITY;
>  
>  	mutex_lock(&sqd->lock);
> +	/* a user may have exited before the thread started */
> +	io_run_task_work_head(&sqd->park_task_work);
> +
>  	while (!test_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state)) {
>  		int ret;
>  		bool cap_entries, sqt_spin, needs_sched;
> @@ -6833,10 +6836,10 @@ static int io_sq_thread(void *data)
>  			}
>  			cond_resched();
>  			mutex_lock(&sqd->lock);
> -			if (did_sig)
> -				break;
>  			io_run_task_work();
>  			io_run_task_work_head(&sqd->park_task_work);
> +			if (did_sig)
> +				break;
>  			timeout = jiffies + sqd->sq_thread_idle;
>  			continue;
>  		}
> 

-- 
Pavel Begunkov


* Re: [PATCH] io_uring: fix early sqd_list removal sqpoll hangs
  2021-04-13 10:43 [PATCH] io_uring: fix early sqd_list removal sqpoll hangs Pavel Begunkov
  2021-04-14 10:46 ` Pavel Begunkov
@ 2021-04-14 16:19 ` Jens Axboe
  1 sibling, 0 replies; 3+ messages in thread
From: Jens Axboe @ 2021-04-14 16:19 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: stable, Joakim Hassila

On 4/13/21 4:43 AM, Pavel Begunkov wrote:
> [  245.463317] INFO: task iou-sqp-1374:1377 blocked for more than 122 seconds.
> [  245.463334] task:iou-sqp-1374    state:D flags:0x00004000
> [  245.463345] Call Trace:
> [  245.463352]  __schedule+0x36b/0x950
> [  245.463376]  schedule+0x68/0xe0
> [  245.463385]  __io_uring_cancel+0xfb/0x1a0
> [  245.463407]  do_exit+0xc0/0xb40
> [  245.463423]  io_sq_thread+0x49b/0x710
> [  245.463445]  ret_from_fork+0x22/0x30
> 
> This happens when the sqpoll thread fails to run park_task_work before
> it goes to exit. An exiting user may then remove the ctx from sqd_list,
> so the corresponding io_sq_thread() -> io_uring_cancel_sqpoll() is never
> executed. Luckily, the thread just gets stuck in do_exit() in this case.

Added for 5.12, thanks.

-- 
Jens Axboe

