linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] io_uring: Fix NULL pointer dereference in io_sq_wq_submit_work()
       [not found] <20200901015442.44831-1-yinxin_1989@aliyun.com>
@ 2020-09-01  3:38 ` Jens Axboe
       [not found]   ` <67f27d17-81fa-43a8-baa9-429b1ccd65d0.yinxin_1989@aliyun.com>
  0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2020-09-01  3:38 UTC (permalink / raw)
  To: Xin Yin, viro; +Cc: linux-block, linux-fsdevel, linux-kernel

On 8/31/20 7:54 PM, Xin Yin wrote:
> the commit <1c4404efcf2c0> ("<io_uring: make sure async workqueue
> is canceled on exit>") caused a crash in io_sq_wq_submit_work().
> when io_ring-wq get a req form async_list, which may not have been
> added to task_list. Then try to delete the req from task_list will caused
> a "NULL pointer dereference".

Hmm, do you have a reproducer for this?

> @@ -2356,9 +2358,11 @@ static void io_sq_wq_submit_work(struct work_struct *work)
>   * running. We currently only allow this if the new request is sequential
>   * to the previous one we punted.
>   */
> -static bool io_add_to_prev_work(struct async_list *list, struct io_kiocb *req)
> +static bool io_add_to_prev_work(struct async_list *list, struct io_kiocb *req,
> +							struct io_ring_ctx *ctx)
>  {
>  	bool ret;
> +	unsigned long flags;
>  
>  	if (!list)
>  		return false;
> @@ -2378,6 +2382,13 @@ static bool io_add_to_prev_work(struct async_list *list, struct io_kiocb *req)
>  		list_del_init(&req->list);
>  		ret = false;
>  	}
> +
> +	if (ret) {
> +		spin_lock_irqsave(&ctx->task_lock, flags);
> +		list_add(&req->task_list, &ctx->task_list);
> +		req->work_task = NULL;
> +		spin_unlock_irqrestore(&ctx->task_lock, flags);
> +	}
>  	spin_unlock(&list->lock);
>  	return ret;
>  }
> @@ -2454,7 +2465,7 @@ static int __io_queue_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
>  			s->sqe = sqe_copy;
>  			memcpy(&req->submit, s, sizeof(*s));
>  			list = io_async_list_from_req(ctx, req);
> -			if (!io_add_to_prev_work(list, req)) {
> +			if (!io_add_to_prev_work(list, req, ctx)) {
>  				if (list)
>  					atomic_inc(&list->cnt);
>  				INIT_WORK(&req->work, io_sq_wq_submit_work);
> 

ctx == req->ctx, so you should not need that change.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] io_uring: Fix NULL pointer dereference in io_sq_wq_submit_work()
       [not found]   ` <67f27d17-81fa-43a8-baa9-429b1ccd65d0.yinxin_1989@aliyun.com>
@ 2020-09-01 14:52     ` Jens Axboe
  2020-09-01 20:01       ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2020-09-01 14:52 UTC (permalink / raw)
  To: yinxin_1989, viro; +Cc: linux-block, linux-fsdevel, linux-kernel

On 8/31/20 10:59 PM, yinxin_1989 wrote:
> 
>>On 8/31/20 7:54 PM, Xin Yin wrote:
>>> the commit <1c4404efcf2c0> ("<io_uring: make sure async workqueue
>>> is canceled on exit>") caused a crash in io_sq_wq_submit_work().
>>> when io_ring-wq get a req form async_list, which may not have been
>>> added to task_list. Then try to delete the req from task_list will caused
>>> a "NULL pointer dereference".
>>
>>Hmm, do you have a reproducer for this?
> 
> I update code to linux5.4.y , and I can reproduce this issue on an arm
> board and my x86 pc by fio tools.

Right, I figured this was 5.4 stable, as that's the only version that
has this patch.

> fio -filename=/home/yinxin/testfile -direct=0 -ioengine=io_uring -iodepth 128 -rw=read -bs=16K -size=1G -numjobs=1 -runtime=60 -group_reporting -name=iops

Gotcha, thanks!

>>> @@ -2356,9 +2358,11 @@ static void io_sq_wq_submit_work(struct work_struct *work)
>>>   * running. We currently only allow this if the new request is sequential
>>>   * to the previous one we punted.
>>>   */
>>> -static bool io_add_to_prev_work(struct async_list *list, struct io_kiocb *req)
>>> +static bool io_add_to_prev_work(struct async_list *list, struct io_kiocb *req,
>>> +       struct io_ring_ctx *ctx)
>>>  {
>>>   bool ret;
>>> + unsigned long flags;
>>>  
>>>   if (!list)
>>>    return false;
>>> @@ -2378,6 +2382,13 @@ static bool io_add_to_prev_work(struct async_list *list, struct io_kiocb *req)
>>>    list_del_init(&req->list);
>>>    ret = false;
>>>   }
>>> +
>>> + if (ret) {
>>> +  spin_lock_irqsave(&ctx->task_lock, flags);
>>> +  list_add(&req->task_list, &ctx->task_list);
>>> +  req->work_task = NULL;
>>> +  spin_unlock_irqrestore(&ctx->task_lock, flags);
>>> + }
>>>   spin_unlock(&list->lock);
>>>   return ret;
>>>  }
>>> @@ -2454,7 +2465,7 @@ static int __io_queue_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
>>>     s->sqe = sqe_copy;
>>>     memcpy(&req->submit, s, sizeof(*s));
>>>     list = io_async_list_from_req(ctx, req);
>>> -   if (!io_add_to_prev_work(list, req)) {
>>> +   if (!io_add_to_prev_work(list, req, ctx)) {
>>>      if (list)
>>>       atomic_inc(&list->cnt);
>>>      INIT_WORK(&req->work, io_sq_wq_submit_work);
>>> 
>>ctx == req->ctx, so you should not need that change.
> 
> In my test , the req have not been add to req->task_list(maybe waiting
> for the ctx->task_lock) , and in io_sq_wq_submit_work() try to delete
> it from req->task_list ,which will cause this issue.

Sure, but req->ctx is set when the req is initialized. If req->ctx !=
ctx here, then that would be pretty disastrous... So you can drop that
part of the patch.

Care to send with that changed? Then I'm fine with queueing this up for
5.4-stable. Thanks!

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] io_uring: Fix NULL pointer dereference in io_sq_wq_submit_work()
  2020-09-01 14:52     ` Jens Axboe
@ 2020-09-01 20:01       ` Jens Axboe
  2020-09-01 20:12         ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2020-09-01 20:01 UTC (permalink / raw)
  To: yinxin_1989, viro; +Cc: linux-block, linux-fsdevel, linux-kernel

On 9/1/20 8:52 AM, Jens Axboe wrote:
> On 8/31/20 10:59 PM, yinxin_1989 wrote:
>>
>>> On 8/31/20 7:54 PM, Xin Yin wrote:
>>>> the commit <1c4404efcf2c0> ("<io_uring: make sure async workqueue
>>>> is canceled on exit>") caused a crash in io_sq_wq_submit_work().
>>>> when io_ring-wq get a req form async_list, which may not have been
>>>> added to task_list. Then try to delete the req from task_list will caused
>>>> a "NULL pointer dereference".
>>>
>>> Hmm, do you have a reproducer for this?
>>
>> I update code to linux5.4.y , and I can reproduce this issue on an arm
>> board and my x86 pc by fio tools.
> 
> Right, I figured this was 5.4 stable, as that's the only version that
> has this patch.

I took a closer look, and I think your patch can basically be boiled down
to this single hunk. If you agree, can you resend your patch with the
description based on that, then I'll get it queued up for 5.4-stable.
Thanks!

diff --git a/fs/io_uring.c b/fs/io_uring.c
index fada14ee1cdc..cbbcd85780c4 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2378,6 +2378,16 @@ static bool io_add_to_prev_work(struct async_list *list, struct io_kiocb *req)
 		list_del_init(&req->list);
 		ret = false;
 	}
+
+	if (ret) {
+		struct io_ring_ctx *ctx = req->ctx;
+		unsigned long flags;
+
+		spin_lock_irqsave(&ctx->task_lock, flags);
+		list_add(&req->task_list, &ctx->task_list);
+		req->work_task = NULL;
+		spin_unlock_irqrestore(&ctx->task_lock, flags);
+	}
 	spin_unlock(&list->lock);
 	return ret;
 }

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] io_uring: Fix NULL pointer dereference in io_sq_wq_submit_work()
  2020-09-01 20:01       ` Jens Axboe
@ 2020-09-01 20:12         ` Jens Axboe
  0 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2020-09-01 20:12 UTC (permalink / raw)
  To: yinxin_1989, viro; +Cc: linux-block, linux-fsdevel, linux-kernel

On 9/1/20 2:01 PM, Jens Axboe wrote:
> On 9/1/20 8:52 AM, Jens Axboe wrote:
>> On 8/31/20 10:59 PM, yinxin_1989 wrote:
>>>
>>>> On 8/31/20 7:54 PM, Xin Yin wrote:
>>>>> the commit <1c4404efcf2c0> ("<io_uring: make sure async workqueue
>>>>> is canceled on exit>") caused a crash in io_sq_wq_submit_work().
>>>>> when io_ring-wq get a req form async_list, which may not have been
>>>>> added to task_list. Then try to delete the req from task_list will caused
>>>>> a "NULL pointer dereference".
>>>>
>>>> Hmm, do you have a reproducer for this?
>>>
>>> I update code to linux5.4.y , and I can reproduce this issue on an arm
>>> board and my x86 pc by fio tools.
>>
>> Right, I figured this was 5.4 stable, as that's the only version that
>> has this patch.
> 
> I took a closer look, and I think your patch can basically be boiled down
> to this single hunk. If you agree, can you resend your patch with the
> description based on that, then I'll get it queued up for 5.4-stable.
> Thanks!

Actually, we don't even need the irqsave, this should be enough:

diff --git a/fs/io_uring.c b/fs/io_uring.c
index fada14ee1cdc..2a539b794f3b 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2378,6 +2378,15 @@ static bool io_add_to_prev_work(struct async_list *list, struct io_kiocb *req)
 		list_del_init(&req->list);
 		ret = false;
 	}
+
+	if (ret) {
+		struct io_ring_ctx *ctx = req->ctx;
+
+		spin_lock_irq(&ctx->task_lock);
+		list_add(&req->task_list, &ctx->task_list);
+		req->work_task = NULL;
+		spin_unlock_irq(&ctx->task_lock);
+	}
 	spin_unlock(&list->lock);
 	return ret;
 }

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] io_uring: Fix NULL pointer dereference in io_sq_wq_submit_work()
       [not found] <20200902015948.109580-1-yinxin_1989@aliyun.com>
@ 2020-09-02  2:12 ` Jens Axboe
  0 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2020-09-02  2:12 UTC (permalink / raw)
  To: Xin Yin, viro; +Cc: linux-block, linux-fsdevel, linux-kernel

On 9/1/20 7:59 PM, Xin Yin wrote:
> the commit <1c4404efcf2c0> ("<io_uring: make sure async workqueue
> is canceled on exit>") caused a crash in io_sq_wq_submit_work().
> when io_ring-wq get a req form async_list, which not have been
> added to task_list. Then try to delete the req from task_list will caused
> a "NULL pointer dereference".
> 
> Ensure add req to async_list and task_list at the sametime.
> 
> The crash log looks like this:
> [95995.973638] Unable to handle kernel NULL pointer dereference at virtual address 00000000
> [95995.979123] pgd = c20c8964
> [95995.981803] [00000000] *pgd=1c72d831, *pte=00000000, *ppte=00000000
> [95995.988043] Internal error: Oops: 817 [#1] SMP ARM
> [95995.992814] Modules linked in: bpfilter(-)
> [95995.996898] CPU: 1 PID: 15661 Comm: kworker/u8:5 Not tainted 5.4.56 #2
> [95996.003406] Hardware name: Amlogic Meson platform
> [95996.008108] Workqueue: io_ring-wq io_sq_wq_submit_work
> [95996.013224] PC is at io_sq_wq_submit_work+0x1f4/0x5c4
> [95996.018261] LR is at walk_stackframe+0x24/0x40
> [95996.022685] pc : [<c059b898>]    lr : [<c030da7c>]    psr: 600f0093
> [95996.028936] sp : dc6f7e88  ip : dc6f7df0  fp : dc6f7ef4
> [95996.034148] r10: deff9800  r9 : dc1d1694  r8 : dda58b80
> [95996.039358] r7 : dc6f6000  r6 : dc6f7ebc  r5 : dc1d1600  r4 : deff99c0
> [95996.045871] r3 : 0000cb5d  r2 : 00000000  r1 : ef6b9b80  r0 : c059b88c
> [95996.052385] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
> [95996.059593] Control: 10c5387d  Table: 22be804a  DAC: 00000055
> [95996.065325] Process kworker/u8:5 (pid: 15661, stack limit = 0x78013c69)
> [95996.071923] Stack: (0xdc6f7e88 to 0xdc6f8000)
> [95996.076268] 7e80:                   dc6f7ecc dc6f7e98 00000000 c1f06c08 de9dc800 deff9a04
> [95996.084431] 7ea0: 00000000 dc6f7f7c 00000000 c1f65808 0000080c dc677a00 c1ee9bd0 dc6f7ebc
> [95996.092594] 7ec0: dc6f7ebc d085c8f6 c0445a90 dc1d1e00 e008f300 c0288400 e4ef7100 00000000
> [95996.100757] 7ee0: c20d45b0 e4ef7115 dc6f7f34 dc6f7ef8 c03725f0 c059b6b0 c0288400 c0288400
> [95996.108921] 7f00: c0288400 00000001 c0288418 e008f300 c0288400 e008f314 00000088 c0288418
> [95996.117083] 7f20: c1f03d00 dc6f6038 dc6f7f7c dc6f7f38 c0372df8 c037246c dc6f7f5c 00000000
> [95996.125245] 7f40: c1f03d00 c1f03d00 c20d3cbe c0288400 dc6f7f7c e1c43880 e4fa7980 00000000
> [95996.133409] 7f60: e008f300 c0372d9c e48bbe74 e1c4389c dc6f7fac dc6f7f80 c0379244 c0372da8
> [95996.141570] 7f80: 600f0093 e4fa7980 c0379108 00000000 00000000 00000000 00000000 00000000
> [95996.149734] 7fa0: 00000000 dc6f7fb0 c03010ac c0379114 00000000 00000000 00000000 00000000
> [95996.157897] 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [95996.166060] 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
> [95996.174217] Backtrace:
> [95996.176662] [<c059b6a4>] (io_sq_wq_submit_work) from [<c03725f0>] (process_one_work+0x190/0x4c0)
> [95996.185425]  r10:e4ef7115 r9:c20d45b0 r8:00000000 r7:e4ef7100 r6:c0288400 r5:e008f300
> [95996.193237]  r4:dc1d1e00
> [95996.195760] [<c0372460>] (process_one_work) from [<c0372df8>] (worker_thread+0x5c/0x5bc)
> [95996.203836]  r10:dc6f6038 r9:c1f03d00 r8:c0288418 r7:00000088 r6:e008f314 r5:c0288400
> [95996.211647]  r4:e008f300
> [95996.214173] [<c0372d9c>] (worker_thread) from [<c0379244>] (kthread+0x13c/0x168)
> [95996.221554]  r10:e1c4389c r9:e48bbe74 r8:c0372d9c r7:e008f300 r6:00000000 r5:e4fa7980
> [95996.229363]  r4:e1c43880
> [95996.231888] [<c0379108>] (kthread) from [<c03010ac>] (ret_from_fork+0x14/0x28)
> [95996.239088] Exception stack(0xdc6f7fb0 to 0xdc6f7ff8)
> [95996.244127] 7fa0:                                     00000000 00000000 00000000 00000000
> [95996.252291] 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [95996.260453] 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [95996.267054]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c0379108
> [95996.274866]  r4:e4fa7980 r3:600f0093
> [95996.278430] Code: eb3a59e1 e5952098 e5951094 e5812004 (e5821000)

Thanks for catching this, I'll get it queued for stable.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-09-02  2:12 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20200901015442.44831-1-yinxin_1989@aliyun.com>
2020-09-01  3:38 ` [PATCH] io_uring: Fix NULL pointer dereference in io_sq_wq_submit_work() Jens Axboe
     [not found]   ` <67f27d17-81fa-43a8-baa9-429b1ccd65d0.yinxin_1989@aliyun.com>
2020-09-01 14:52     ` Jens Axboe
2020-09-01 20:01       ` Jens Axboe
2020-09-01 20:12         ` Jens Axboe
     [not found] <20200902015948.109580-1-yinxin_1989@aliyun.com>
2020-09-02  2:12 ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).