* [PATCH v2] io_uring: fix io_sq_thread_stop running in front of io_sq_thread
@ 2019-07-08 2:07 Jackie Liu
2019-07-08 4:13 ` Jens Axboe
0 siblings, 1 reply; 3+ messages in thread
From: Jackie Liu @ 2019-07-08 2:07 UTC (permalink / raw)
To: axboe; +Cc: ebiggers, liuzhengyuan, linux-block, Jackie Liu
INFO: task syz-executor.5:8634 blocked for more than 143 seconds.
Not tainted 5.2.0-rc5+ #3
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.5 D25632 8634 8224 0x00004004
Call Trace:
context_switch kernel/sched/core.c:2818 [inline]
__schedule+0x658/0x9e0 kernel/sched/core.c:3445
schedule+0x131/0x1d0 kernel/sched/core.c:3509
schedule_timeout+0x9a/0x2b0 kernel/time/timer.c:1783
do_wait_for_common+0x35e/0x5a0 kernel/sched/completion.c:83
__wait_for_common kernel/sched/completion.c:104 [inline]
wait_for_common kernel/sched/completion.c:115 [inline]
wait_for_completion+0x47/0x60 kernel/sched/completion.c:136
kthread_stop+0xb4/0x150 kernel/kthread.c:559
io_sq_thread_stop fs/io_uring.c:2252 [inline]
io_finish_async fs/io_uring.c:2259 [inline]
io_ring_ctx_free fs/io_uring.c:2770 [inline]
io_ring_ctx_wait_and_kill+0x268/0x880 fs/io_uring.c:2834
io_uring_release+0x5d/0x70 fs/io_uring.c:2842
__fput+0x2e4/0x740 fs/file_table.c:280
____fput+0x15/0x20 fs/file_table.c:313
task_work_run+0x17e/0x1b0 kernel/task_work.c:113
tracehook_notify_resume include/linux/tracehook.h:185 [inline]
exit_to_usermode_loop arch/x86/entry/common.c:168 [inline]
prepare_exit_to_usermode+0x402/0x4f0 arch/x86/entry/common.c:199
syscall_return_slowpath+0x110/0x440 arch/x86/entry/common.c:279
do_syscall_64+0x126/0x140 arch/x86/entry/common.c:304
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x412fb1
Code: 80 3b 7c 0f 84 c7 02 00 00 c7 85 d0 00 00 00 00 00 00 00 48 8b 05 cf
a6 24 00 49 8b 14 24 41 b9 cb 2a 44 00 48 89 ee 48 89 df <48> 85 c0 4c 0f
45 c8 45 31 c0 31 c9 e8 0e 5b 00 00 85 c0 41 89 c7
RSP: 002b:00007ffe7ee6a180 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000412fb1
RDX: 0000001b2d920000 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 0000000000000001 R08: 00000000f3a3e1f8 R09: 00000000f3a3e1fc
R10: 00007ffe7ee6a260 R11: 0000000000000293 R12: 000000000075c9a0
R13: 000000000075c9a0 R14: 0000000000024c00 R15: 000000000075bf2c
=============================================
There is an wrong logic, when kthread_park running
in front of io_sq_thread.
CPU#0 CPU#1
io_sq_thread_stop: int kthread(void *_create):
kthread_park()
__kthread_parkme(self); <<< Wrong
kthread_stop()
<< wait for self->exited
<< clear_bit KTHREAD_SHOULD_PARK
ret = threadfn(data);
|
|- io_sq_thread
|- kthread_should_park() << false
|- schedule() <<< nobody wake up
stuck CPU#0 stuck CPU#1
So, use a new variable sqo_thread_started to ensure that io_sq_thread
run first, then io_sq_thread_stop.
Reported-by: syzbot+94324416c485d422fe15@syzkaller.appspotmail.com
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
---
fs/io_uring.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 4ef62a4..b7d92fa 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -78,6 +78,8 @@
#define IORING_MAX_ENTRIES 4096
#define IORING_MAX_FIXED_FILES 1024
+#define SQO_THREAD_STARTED 1
+
struct io_uring {
u32 head ____cacheline_aligned_in_smp;
u32 tail ____cacheline_aligned_in_smp;
@@ -231,6 +233,7 @@ struct io_ring_ctx {
struct task_struct *sqo_thread; /* if using sq thread polling */
struct mm_struct *sqo_mm;
wait_queue_head_t sqo_wait;
+ unsigned long sqo_flags;
struct {
/* CQ ring */
@@ -2009,6 +2012,9 @@ static int io_sq_thread(void *data)
unsigned inflight;
unsigned long timeout;
+ /* io_sq_thread should run before io_sq_thread_stop */
+ set_bit(SQO_THREAD_STARTED, &ctx->sqo_flags);
+
old_fs = get_fs();
set_fs(USER_DS);
@@ -2243,6 +2249,8 @@ static int io_sqe_files_unregister(struct io_ring_ctx *ctx)
static void io_sq_thread_stop(struct io_ring_ctx *ctx)
{
if (ctx->sqo_thread) {
+ while (unlikely(!test_bit(SQO_THREAD_STARTED, &ctx->sqo_flags)))
+ schedule();
/*
* The park is a bit of a work-around, without it we get
* warning spews on shutdown with SQPOLL set and affinity
--
2.7.4
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH v2] io_uring: fix io_sq_thread_stop running in front of io_sq_thread
2019-07-08 2:07 [PATCH v2] io_uring: fix io_sq_thread_stop running in front of io_sq_thread Jackie Liu
@ 2019-07-08 4:13 ` Jens Axboe
2019-07-08 4:38 ` JackieLiu
0 siblings, 1 reply; 3+ messages in thread
From: Jens Axboe @ 2019-07-08 4:13 UTC (permalink / raw)
To: Jackie Liu; +Cc: ebiggers, liuzhengyuan, linux-block
On 7/7/19 8:07 PM, Jackie Liu wrote:
> INFO: task syz-executor.5:8634 blocked for more than 143 seconds.
> Not tainted 5.2.0-rc5+ #3
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> syz-executor.5 D25632 8634 8224 0x00004004
> Call Trace:
> context_switch kernel/sched/core.c:2818 [inline]
> __schedule+0x658/0x9e0 kernel/sched/core.c:3445
> schedule+0x131/0x1d0 kernel/sched/core.c:3509
> schedule_timeout+0x9a/0x2b0 kernel/time/timer.c:1783
> do_wait_for_common+0x35e/0x5a0 kernel/sched/completion.c:83
> __wait_for_common kernel/sched/completion.c:104 [inline]
> wait_for_common kernel/sched/completion.c:115 [inline]
> wait_for_completion+0x47/0x60 kernel/sched/completion.c:136
> kthread_stop+0xb4/0x150 kernel/kthread.c:559
> io_sq_thread_stop fs/io_uring.c:2252 [inline]
> io_finish_async fs/io_uring.c:2259 [inline]
> io_ring_ctx_free fs/io_uring.c:2770 [inline]
> io_ring_ctx_wait_and_kill+0x268/0x880 fs/io_uring.c:2834
> io_uring_release+0x5d/0x70 fs/io_uring.c:2842
> __fput+0x2e4/0x740 fs/file_table.c:280
> ____fput+0x15/0x20 fs/file_table.c:313
> task_work_run+0x17e/0x1b0 kernel/task_work.c:113
> tracehook_notify_resume include/linux/tracehook.h:185 [inline]
> exit_to_usermode_loop arch/x86/entry/common.c:168 [inline]
> prepare_exit_to_usermode+0x402/0x4f0 arch/x86/entry/common.c:199
> syscall_return_slowpath+0x110/0x440 arch/x86/entry/common.c:279
> do_syscall_64+0x126/0x140 arch/x86/entry/common.c:304
> entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x412fb1
> Code: 80 3b 7c 0f 84 c7 02 00 00 c7 85 d0 00 00 00 00 00 00 00 48 8b 05 cf
> a6 24 00 49 8b 14 24 41 b9 cb 2a 44 00 48 89 ee 48 89 df <48> 85 c0 4c 0f
> 45 c8 45 31 c0 31 c9 e8 0e 5b 00 00 85 c0 41 89 c7
> RSP: 002b:00007ffe7ee6a180 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
> RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000412fb1
> RDX: 0000001b2d920000 RSI: 0000000000000000 RDI: 0000000000000003
> RBP: 0000000000000001 R08: 00000000f3a3e1f8 R09: 00000000f3a3e1fc
> R10: 00007ffe7ee6a260 R11: 0000000000000293 R12: 000000000075c9a0
> R13: 000000000075c9a0 R14: 0000000000024c00 R15: 000000000075bf2c
>
> =============================================
>
> There is an wrong logic, when kthread_park running
> in front of io_sq_thread.
>
> CPU#0 CPU#1
>
> io_sq_thread_stop: int kthread(void *_create):
>
> kthread_park()
> __kthread_parkme(self); <<< Wrong
> kthread_stop()
> << wait for self->exited
> << clear_bit KTHREAD_SHOULD_PARK
>
> ret = threadfn(data);
> |
> |- io_sq_thread
> |- kthread_should_park() << false
> |- schedule() <<< nobody wake up
>
> stuck CPU#0 stuck CPU#1
>
> So, use a new variable sqo_thread_started to ensure that io_sq_thread
> run first, then io_sq_thread_stop.
>
> Reported-by: syzbot+94324416c485d422fe15@syzkaller.appspotmail.com
> Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
> ---
> fs/io_uring.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 4ef62a4..b7d92fa 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -78,6 +78,8 @@
> #define IORING_MAX_ENTRIES 4096
> #define IORING_MAX_FIXED_FILES 1024
>
> +#define SQO_THREAD_STARTED 1
Should be 0, first bit.
> @@ -2243,6 +2249,8 @@ static int io_sqe_files_unregister(struct io_ring_ctx *ctx)
> static void io_sq_thread_stop(struct io_ring_ctx *ctx)
> {
> if (ctx->sqo_thread) {
> + while (unlikely(!test_bit(SQO_THREAD_STARTED, &ctx->sqo_flags)))
> + schedule();
This doesn't really look safe - who wakes you up if the thread wasn't
started to begin with, but got started right after? This needs to use
some variant of a completion struct, for example, or some other way of
waiting for the thread to have started.
--
Jens Axboe
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH v2] io_uring: fix io_sq_thread_stop running in front of io_sq_thread
2019-07-08 4:13 ` Jens Axboe
@ 2019-07-08 4:38 ` JackieLiu
0 siblings, 0 replies; 3+ messages in thread
From: JackieLiu @ 2019-07-08 4:38 UTC (permalink / raw)
To: Jens Axboe; +Cc: ebiggers, 刘正元, linux-block
> 在 2019年7月8日,12:13,Jens Axboe <axboe@kernel.dk> 写道:
>
> On 7/7/19 8:07 PM, Jackie Liu wrote:
>> INFO: task syz-executor.5:8634 blocked for more than 143 seconds.
>> Not tainted 5.2.0-rc5+ #3
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> syz-executor.5 D25632 8634 8224 0x00004004
>> Call Trace:
>> context_switch kernel/sched/core.c:2818 [inline]
>> __schedule+0x658/0x9e0 kernel/sched/core.c:3445
>> schedule+0x131/0x1d0 kernel/sched/core.c:3509
>> schedule_timeout+0x9a/0x2b0 kernel/time/timer.c:1783
>> do_wait_for_common+0x35e/0x5a0 kernel/sched/completion.c:83
>> __wait_for_common kernel/sched/completion.c:104 [inline]
>> wait_for_common kernel/sched/completion.c:115 [inline]
>> wait_for_completion+0x47/0x60 kernel/sched/completion.c:136
>> kthread_stop+0xb4/0x150 kernel/kthread.c:559
>> io_sq_thread_stop fs/io_uring.c:2252 [inline]
>> io_finish_async fs/io_uring.c:2259 [inline]
>> io_ring_ctx_free fs/io_uring.c:2770 [inline]
>> io_ring_ctx_wait_and_kill+0x268/0x880 fs/io_uring.c:2834
>> io_uring_release+0x5d/0x70 fs/io_uring.c:2842
>> __fput+0x2e4/0x740 fs/file_table.c:280
>> ____fput+0x15/0x20 fs/file_table.c:313
>> task_work_run+0x17e/0x1b0 kernel/task_work.c:113
>> tracehook_notify_resume include/linux/tracehook.h:185 [inline]
>> exit_to_usermode_loop arch/x86/entry/common.c:168 [inline]
>> prepare_exit_to_usermode+0x402/0x4f0 arch/x86/entry/common.c:199
>> syscall_return_slowpath+0x110/0x440 arch/x86/entry/common.c:279
>> do_syscall_64+0x126/0x140 arch/x86/entry/common.c:304
>> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> RIP: 0033:0x412fb1
>> Code: 80 3b 7c 0f 84 c7 02 00 00 c7 85 d0 00 00 00 00 00 00 00 48 8b 05 cf
>> a6 24 00 49 8b 14 24 41 b9 cb 2a 44 00 48 89 ee 48 89 df <48> 85 c0 4c 0f
>> 45 c8 45 31 c0 31 c9 e8 0e 5b 00 00 85 c0 41 89 c7
>> RSP: 002b:00007ffe7ee6a180 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
>> RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000412fb1
>> RDX: 0000001b2d920000 RSI: 0000000000000000 RDI: 0000000000000003
>> RBP: 0000000000000001 R08: 00000000f3a3e1f8 R09: 00000000f3a3e1fc
>> R10: 00007ffe7ee6a260 R11: 0000000000000293 R12: 000000000075c9a0
>> R13: 000000000075c9a0 R14: 0000000000024c00 R15: 000000000075bf2c
>>
>> =============================================
>>
>> There is an wrong logic, when kthread_park running
>> in front of io_sq_thread.
>>
>> CPU#0 CPU#1
>>
>> io_sq_thread_stop: int kthread(void *_create):
>>
>> kthread_park()
>> __kthread_parkme(self); <<< Wrong
>> kthread_stop()
>> << wait for self->exited
>> << clear_bit KTHREAD_SHOULD_PARK
>>
>> ret = threadfn(data);
>> |
>> |- io_sq_thread
>> |- kthread_should_park() << false
>> |- schedule() <<< nobody wake up
>>
>> stuck CPU#0 stuck CPU#1
>>
>> So, use a new variable sqo_thread_started to ensure that io_sq_thread
>> run first, then io_sq_thread_stop.
>>
>> Reported-by: syzbot+94324416c485d422fe15@syzkaller.appspotmail.com
>> Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
>> ---
>> fs/io_uring.c | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 4ef62a4..b7d92fa 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -78,6 +78,8 @@
>> #define IORING_MAX_ENTRIES 4096
>> #define IORING_MAX_FIXED_FILES 1024
>>
>> +#define SQO_THREAD_STARTED 1
>
> Should be 0, first bit.
>
>> @@ -2243,6 +2249,8 @@ static int io_sqe_files_unregister(struct io_ring_ctx *ctx)
>> static void io_sq_thread_stop(struct io_ring_ctx *ctx)
>> {
>> if (ctx->sqo_thread) {
>> + while (unlikely(!test_bit(SQO_THREAD_STARTED, &ctx->sqo_flags)))
>> + schedule();
>
> This doesn't really look safe - who wakes you up if the thread wasn't
> started to begin with, but got started right after? This needs to use
> some variant of a completion struct, for example, or some other way of
> waiting for the thread to have started.
Hi, Jens.
Yes, you are right, I will use a completion struct for new version, base
on 'drivers/staging/wilc1000/wilc_netdev.c',
It's wl->txq_thread_started.
Thanks.
>
> --
> Jens Axboe
>
>
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2019-07-08 4:38 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-08 2:07 [PATCH v2] io_uring: fix io_sq_thread_stop running in front of io_sq_thread Jackie Liu
2019-07-08 4:13 ` Jens Axboe
2019-07-08 4:38 ` JackieLiu
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.