io-uring.vger.kernel.org archive mirror
* [PATCH] io_uring: don't issue reqs in iopoll mode when ctx is dying
@ 2021-02-06 15:00 Xiaoguang Wang
  2021-02-07 17:30 ` Pavel Begunkov
  2021-02-24  3:23 ` Xiaoguang Wang
  0 siblings, 2 replies; 16+ messages in thread
From: Xiaoguang Wang @ 2021-02-06 15:00 UTC (permalink / raw)
  To: io-uring; +Cc: axboe, asml.silence, joseph.qi

Abaci Robot reported the following panic:
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 1 PID: 195 at lib/refcount.c:28 refcount_warn_saturate+0x137/0x140
Modules linked in:
CPU: 1 PID: 195 Comm: kworker/u4:2 Not tainted 5.11.0-rc3+ #70
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.or4
Workqueue: events_unbound io_ring_exit_work
RIP: 0010:refcount_warn_saturate+0x137/0x140
Code: 05 ad 63 49 08 01 e8 45 0f 6f 00 0f 0b e9 16 ff ff ff e8 4c ba ae ff 48 c7 c7 28 2e 7c 82 c6 05 90 63 40
RSP: 0018:ffffc900002e3cc8 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
RDX: ffff888102918000 RSI: ffffffff81150a34 RDI: ffff88813bd28570
RBP: ffff8881075cd348 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000080000 R12: ffff8881075cd308
R13: ffff8881075cd348 R14: ffff888122d33ab8 R15: ffff888104780300
FS:  0000000000000000(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000108 CR3: 0000000107636005 CR4: 0000000000060ee0
Call Trace:
 io_dismantle_req+0x3f3/0x5b0
 __io_free_req+0x2c/0x270
 io_put_req+0x4c/0x70
 io_wq_cancel_cb+0x171/0x470
 ? io_match_task.part.0+0x80/0x80
 __io_uring_cancel_task_requests+0xa0/0x190
 io_ring_exit_work+0x32/0x3e0
 process_one_work+0x2f3/0x720
 worker_thread+0x5a/0x4b0
 ? process_one_work+0x720/0x720
 kthread+0x138/0x180
 ? kthread_park+0xd0/0xd0
 ret_from_fork+0x1f/0x30

Later the system will panic due to memory corruption.

The io_identity's count is underflowed. That is because in io_put_identity(),
the first argument tctx comes from req->task->io_uring, while the second
argument comes from the task context that calls io_req_init_async(), so the
compare in io_put_identity() may be meaningless. See the case below:
    task context A issues one polled req, then req->task = A.
    task context B does iopoll, and the above req returns with an EAGAIN error.
    task context B re-issues the req, calling io_queue_async_work() for it.
    req->task->io_uring will be set to task context B's identity, or a new
    one will be COWed.
For the above case, the compare in io_put_identity() is then meaningless.
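
For reference, the check in question is approximately the following (quoting
from memory from the 5.11-rc era fs/io_uring.c, so details may differ
slightly from the tree this patch is against):

static void io_put_identity(struct io_uring_task *tctx, struct io_kiocb *req)
{
	/*
	 * tctx is req->task->io_uring (task A's context), but
	 * req->work.identity may have been grabbed under task B at
	 * re-issue time, so this comparison can mismatch and the
	 * refcount below is then dropped on the wrong identity.
	 */
	if (req->work.identity == &tctx->__identity)
		return;
	if (refcount_dec_and_test(&req->work.identity->count))
		kfree(req->work.identity);
}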

IIUC, req->task should indicate the initial task context that issued the
req; if it then gets an EAGAIN error, we call io_prep_async_work() in the
req->task context. But iopoll reqs seem special: they may be issued
successfully and then get re-issued in another task context because of an
EAGAIN error.

For now, this panic can be addressed by not re-issuing reqs that returned
an EAGAIN error in iopoll mode when the ctx is dying, but we may need to
reconsider the io identity code further.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
---
 fs/io_uring.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 9db05171a774..e3b90426d72b 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2467,7 +2467,12 @@ static void io_iopoll_complete(struct io_ring_ctx *ctx, unsigned int *nr_events,
 		int cflags = 0;
 
 		req = list_first_entry(done, struct io_kiocb, inflight_entry);
-		if (READ_ONCE(req->result) == -EAGAIN) {
+		/*
+		 * If the ctx is dying, don't issue reqs that returned an
+		 * EAGAIN error, since there may be no users to reap them.
+		 */
+		if ((READ_ONCE(req->result) == -EAGAIN) &&
+		    !percpu_ref_is_dying(&ctx->refs)) {
 			req->result = 0;
 			req->iopoll_completed = 0;
 			list_move_tail(&req->inflight_entry, &again);
-- 
2.17.2



* Re: [PATCH] io_uring: don't issue reqs in iopoll mode when ctx is dying
  2021-02-06 15:00 [PATCH] io_uring: don't issue reqs in iopoll mode when ctx is dying Xiaoguang Wang
@ 2021-02-07 17:30 ` Pavel Begunkov
  2021-02-08  2:50   ` Xiaoguang Wang
  2021-02-24 12:42   ` Hao Xu
  2021-02-24  3:23 ` Xiaoguang Wang
  1 sibling, 2 replies; 16+ messages in thread
From: Pavel Begunkov @ 2021-02-07 17:30 UTC (permalink / raw)
  To: Xiaoguang Wang, io-uring; +Cc: axboe, joseph.qi

On 06/02/2021 15:00, Xiaoguang Wang wrote:
> Abaci Robot reported following panic:
> ------------[ cut here ]------------
> refcount_t: underflow; use-after-free.
> WARNING: CPU: 1 PID: 195 at lib/refcount.c:28 refcount_warn_saturate+0x137/0x140
> Modules linked in:
> CPU: 1 PID: 195 Comm: kworker/u4:2 Not tainted 5.11.0-rc3+ #70
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.or4
> Workqueue: events_unbound io_ring_exit_work
> RIP: 0010:refcount_warn_saturate+0x137/0x140
> Code: 05 ad 63 49 08 01 e8 45 0f 6f 00 0f 0b e9 16 ff ff ff e8 4c ba ae ff 48 c7 c7 28 2e 7c 82 c6 05 90 63 40
> RSP: 0018:ffffc900002e3cc8 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
> RDX: ffff888102918000 RSI: ffffffff81150a34 RDI: ffff88813bd28570
> RBP: ffff8881075cd348 R08: 0000000000000001 R09: 0000000000000001
> R10: 0000000000000001 R11: 0000000000080000 R12: ffff8881075cd308
> R13: ffff8881075cd348 R14: ffff888122d33ab8 R15: ffff888104780300
> FS:  0000000000000000(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000108 CR3: 0000000107636005 CR4: 0000000000060ee0
> Call Trace:
>  io_dismantle_req+0x3f3/0x5b0
>  __io_free_req+0x2c/0x270
>  io_put_req+0x4c/0x70
>  io_wq_cancel_cb+0x171/0x470
>  ? io_match_task.part.0+0x80/0x80
>  __io_uring_cancel_task_requests+0xa0/0x190
>  io_ring_exit_work+0x32/0x3e0
>  process_one_work+0x2f3/0x720
>  worker_thread+0x5a/0x4b0
>  ? process_one_work+0x720/0x720
>  kthread+0x138/0x180
>  ? kthread_park+0xd0/0xd0
>  ret_from_fork+0x1f/0x30
> 
> Later system will panic for some memory corruption.
> 
> The io_identity's count is underflowed. It's because in io_put_identity,
> first argument tctx comes from req->task->io_uring, the second argument
> comes from the task context that calls io_req_init_async, so the compare
> in io_put_identity maybe meaningless. See below case:
>     task context A issue one polled req, then req->task = A.
>     task context B do iopoll, above req returns with EAGAIN error.
>     task context B re-issue req, call io_queue_async_work for req.
>     req->task->io_uring will set to task context B's identity, or cow new one.
> then for above case, in io_put_identity(), the compare is meaningless.
> 
> IIUC, req->task should indicates the initial task context that issues req,
> then if it gets EAGAIN error, we'll call io_prep_async_work() in req->task
> context, but iopoll reqs seems special, they maybe issued successfully and
> got re-issued in other task context because of EAGAIN error.

It looks to be as you say, but the patch doesn't solve the issue completely.
1. We must not do io_queue_async_work() under a different task context,
because it potentially uses a different set of resources. So I just
thought that it would be better to punt it to the right task context
via task_work. But...

2. ...iovec import from io_resubmit_prep() might happen after submit ends,
i.e. when the iovec has already been freed in userspace. And that's not
great at all.
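
For 1), roughly what I mean by punting, completely untested and with made-up
helper names (io_kiocb already carries a callback_head for task_work in this
era):

static void io_reissue_task_work(struct callback_head *cb)
{
	struct io_kiocb *req = container_of(cb, struct io_kiocb, task_work);

	/* now running under req->task, i.e. the right set of resources */
	io_queue_async_work(req);
}

static bool io_punt_reissue(struct io_kiocb *req)
{
	init_task_work(&req->task_work, io_reissue_task_work);
	/* task_work_add() fails if req->task is already exiting */
	return !task_work_add(req->task, &req->task_work, TWA_SIGNAL);
}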

> Currently for this panic, we can disable issuing reqs that are returned
> with EAGAIN error in iopoll mode when ctx is dying, but we may need to
> re-consider the io identity codes more.
> 
> Reported-by: Abaci Robot <abaci@linux.alibaba.com>
> Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
> ---
>  fs/io_uring.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 9db05171a774..e3b90426d72b 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -2467,7 +2467,12 @@ static void io_iopoll_complete(struct io_ring_ctx *ctx, unsigned int *nr_events,
>  		int cflags = 0;
>  
>  		req = list_first_entry(done, struct io_kiocb, inflight_entry);
> -		if (READ_ONCE(req->result) == -EAGAIN) {
> +		/*
> +		 * If ctx is dying, don't need to issue reqs that are returned
> +		 * with EAGAIN error, since there maybe no users to reap them.
> +		 */
> +		if ((READ_ONCE(req->result) == -EAGAIN) &&
> +		    !percpu_ref_is_dying(&ctx->refs)) {
>  			req->result = 0;
>  			req->iopoll_completed = 0;
>  			list_move_tail(&req->inflight_entry, &again);
> 

-- 
Pavel Begunkov


* Re: [PATCH] io_uring: don't issue reqs in iopoll mode when ctx is dying
  2021-02-07 17:30 ` Pavel Begunkov
@ 2021-02-08  2:50   ` Xiaoguang Wang
  2021-02-08 13:35     ` Pavel Begunkov
  2021-02-24 12:42   ` Hao Xu
  1 sibling, 1 reply; 16+ messages in thread
From: Xiaoguang Wang @ 2021-02-08  2:50 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: axboe, joseph.qi

hi,

> On 06/02/2021 15:00, Xiaoguang Wang wrote:
>> Abaci Robot reported following panic:
>> ------------[ cut here ]------------
>> refcount_t: underflow; use-after-free.
>> WARNING: CPU: 1 PID: 195 at lib/refcount.c:28 refcount_warn_saturate+0x137/0x140
>> Modules linked in:
>> CPU: 1 PID: 195 Comm: kworker/u4:2 Not tainted 5.11.0-rc3+ #70
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.or4
>> Workqueue: events_unbound io_ring_exit_work
>> RIP: 0010:refcount_warn_saturate+0x137/0x140
>> Code: 05 ad 63 49 08 01 e8 45 0f 6f 00 0f 0b e9 16 ff ff ff e8 4c ba ae ff 48 c7 c7 28 2e 7c 82 c6 05 90 63 40
>> RSP: 0018:ffffc900002e3cc8 EFLAGS: 00010282
>> RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
>> RDX: ffff888102918000 RSI: ffffffff81150a34 RDI: ffff88813bd28570
>> RBP: ffff8881075cd348 R08: 0000000000000001 R09: 0000000000000001
>> R10: 0000000000000001 R11: 0000000000080000 R12: ffff8881075cd308
>> R13: ffff8881075cd348 R14: ffff888122d33ab8 R15: ffff888104780300
>> FS:  0000000000000000(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000108 CR3: 0000000107636005 CR4: 0000000000060ee0
>> Call Trace:
>>   io_dismantle_req+0x3f3/0x5b0
>>   __io_free_req+0x2c/0x270
>>   io_put_req+0x4c/0x70
>>   io_wq_cancel_cb+0x171/0x470
>>   ? io_match_task.part.0+0x80/0x80
>>   __io_uring_cancel_task_requests+0xa0/0x190
>>   io_ring_exit_work+0x32/0x3e0
>>   process_one_work+0x2f3/0x720
>>   worker_thread+0x5a/0x4b0
>>   ? process_one_work+0x720/0x720
>>   kthread+0x138/0x180
>>   ? kthread_park+0xd0/0xd0
>>   ret_from_fork+0x1f/0x30
>>
>> Later system will panic for some memory corruption.
>>
>> The io_identity's count is underflowed. It's because in io_put_identity,
>> first argument tctx comes from req->task->io_uring, the second argument
>> comes from the task context that calls io_req_init_async, so the compare
>> in io_put_identity maybe meaningless. See below case:
>>      task context A issue one polled req, then req->task = A.
>>      task context B do iopoll, above req returns with EAGAIN error.
>>      task context B re-issue req, call io_queue_async_work for req.
>>      req->task->io_uring will set to task context B's identity, or cow new one.
>> then for above case, in io_put_identity(), the compare is meaningless.
>>
>> IIUC, req->task should indicates the initial task context that issues req,
>> then if it gets EAGAIN error, we'll call io_prep_async_work() in req->task
>> context, but iopoll reqs seems special, they maybe issued successfully and
>> got re-issued in other task context because of EAGAIN error.
> 
> Looks as you say, but the patch doesn't solve the issue completely.
> 1. We must not do io_queue_async_work() under a different task context,
> because of it potentially uses a different set of resources. So, I just
> thought that it would be better to punt it to the right task context
> via task_work. But...
> 
> 2. ...iovec import from io_resubmit_prep() might happen after submit ends,
> i.e. when iovec was freed in userspace. And that's not great at all.
Yes, agreed; that's why I say we need to reconsider the io identity code
more in the commit message :) I'll have a try at preparing a better one.

Regards,
Xiaoguang Wang
> 
>> Currently for this panic, we can disable issuing reqs that are returned
>> with EAGAIN error in iopoll mode when ctx is dying, but we may need to
>> re-consider the io identity codes more.
>>
>> Reported-by: Abaci Robot <abaci@linux.alibaba.com>
>> Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
>> ---
>>   fs/io_uring.c | 7 ++++++-
>>   1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 9db05171a774..e3b90426d72b 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -2467,7 +2467,12 @@ static void io_iopoll_complete(struct io_ring_ctx *ctx, unsigned int *nr_events,
>>   		int cflags = 0;
>>   
>>   		req = list_first_entry(done, struct io_kiocb, inflight_entry);
>> -		if (READ_ONCE(req->result) == -EAGAIN) {
>> +		/*
>> +		 * If ctx is dying, don't need to issue reqs that are returned
>> +		 * with EAGAIN error, since there maybe no users to reap them.
>> +		 */
>> +		if ((READ_ONCE(req->result) == -EAGAIN) &&
>> +		    !percpu_ref_is_dying(&ctx->refs)) {
>>   			req->result = 0;
>>   			req->iopoll_completed = 0;
>>   			list_move_tail(&req->inflight_entry, &again);
>>
> 


* Re: [PATCH] io_uring: don't issue reqs in iopoll mode when ctx is dying
  2021-02-08  2:50   ` Xiaoguang Wang
@ 2021-02-08 13:35     ` Pavel Begunkov
  2021-02-22 13:23       ` Pavel Begunkov
  0 siblings, 1 reply; 16+ messages in thread
From: Pavel Begunkov @ 2021-02-08 13:35 UTC (permalink / raw)
  To: Xiaoguang Wang, io-uring; +Cc: axboe, joseph.qi

On 08/02/2021 02:50, Xiaoguang Wang wrote:
>>> The io_identity's count is underflowed. It's because in io_put_identity,
>>> first argument tctx comes from req->task->io_uring, the second argument
>>> comes from the task context that calls io_req_init_async, so the compare
>>> in io_put_identity maybe meaningless. See below case:
>>>      task context A issue one polled req, then req->task = A.
>>>      task context B do iopoll, above req returns with EAGAIN error.
>>>      task context B re-issue req, call io_queue_async_work for req.
>>>      req->task->io_uring will set to task context B's identity, or cow new one.
>>> then for above case, in io_put_identity(), the compare is meaningless.
>>>
>>> IIUC, req->task should indicates the initial task context that issues req,
>>> then if it gets EAGAIN error, we'll call io_prep_async_work() in req->task
>>> context, but iopoll reqs seems special, they maybe issued successfully and
>>> got re-issued in other task context because of EAGAIN error.
>>
>> Looks as you say, but the patch doesn't solve the issue completely.
>> 1. We must not do io_queue_async_work() under a different task context,
>> because of it potentially uses a different set of resources. So, I just
>> thought that it would be better to punt it to the right task context
>> via task_work. But...
>>
>> 2. ...iovec import from io_resubmit_prep() might happen after submit ends,
>> i.e. when iovec was freed in userspace. And that's not great at all.
> Yes, agree, that's why I say we neeed to re-consider the io identity codes
> more in commit message :) I'll have a try to prepare a better one.

I'd vote for dragging -EAGAIN'ed reqs that don't need io_import_iovec()
through task_work for resubmission, and failing everything else. Not great,
but imho better than always setting async_data.

-- 
Pavel Begunkov


* Re: [PATCH] io_uring: don't issue reqs in iopoll mode when ctx is dying
  2021-02-08 13:35     ` Pavel Begunkov
@ 2021-02-22 13:23       ` Pavel Begunkov
  2021-02-24  2:30         ` Xiaoguang Wang
  0 siblings, 1 reply; 16+ messages in thread
From: Pavel Begunkov @ 2021-02-22 13:23 UTC (permalink / raw)
  To: Xiaoguang Wang, io-uring; +Cc: axboe, joseph.qi

On 08/02/2021 13:35, Pavel Begunkov wrote:
> On 08/02/2021 02:50, Xiaoguang Wang wrote:
>>>> The io_identity's count is underflowed. It's because in io_put_identity,
>>>> first argument tctx comes from req->task->io_uring, the second argument
>>>> comes from the task context that calls io_req_init_async, so the compare
>>>> in io_put_identity maybe meaningless. See below case:
>>>>      task context A issue one polled req, then req->task = A.
>>>>      task context B do iopoll, above req returns with EAGAIN error.
>>>>      task context B re-issue req, call io_queue_async_work for req.
>>>>      req->task->io_uring will set to task context B's identity, or cow new one.
>>>> then for above case, in io_put_identity(), the compare is meaningless.
>>>>
>>>> IIUC, req->task should indicates the initial task context that issues req,
>>>> then if it gets EAGAIN error, we'll call io_prep_async_work() in req->task
>>>> context, but iopoll reqs seems special, they maybe issued successfully and
>>>> got re-issued in other task context because of EAGAIN error.
>>>
>>> Looks as you say, but the patch doesn't solve the issue completely.
>>> 1. We must not do io_queue_async_work() under a different task context,
>>> because of it potentially uses a different set of resources. So, I just
>>> thought that it would be better to punt it to the right task context
>>> via task_work. But...
>>>
>>> 2. ...iovec import from io_resubmit_prep() might happen after submit ends,
>>> i.e. when iovec was freed in userspace. And that's not great at all.
>> Yes, agree, that's why I say we neeed to re-consider the io identity codes
>> more in commit message :) I'll have a try to prepare a better one.
> 
> I'd vote for dragging -AGAIN'ed reqs that don't need io_import_iovec()
> through task_work for resubmission, and fail everything else. Not great,
> but imho better than always setting async_data.

Hey Xiaoguang, are you working on this? I would like to leave it to you,
if you do.

-- 
Pavel Begunkov


* Re: [PATCH] io_uring: don't issue reqs in iopoll mode when ctx is dying
  2021-02-22 13:23       ` Pavel Begunkov
@ 2021-02-24  2:30         ` Xiaoguang Wang
  2021-02-24  2:35           ` Jens Axboe
  2021-02-24  9:38           ` Pavel Begunkov
  0 siblings, 2 replies; 16+ messages in thread
From: Xiaoguang Wang @ 2021-02-24  2:30 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: axboe, joseph.qi

hi Pavel,

> On 08/02/2021 13:35, Pavel Begunkov wrote:
>> On 08/02/2021 02:50, Xiaoguang Wang wrote:
>>>>> The io_identity's count is underflowed. It's because in io_put_identity,
>>>>> first argument tctx comes from req->task->io_uring, the second argument
>>>>> comes from the task context that calls io_req_init_async, so the compare
>>>>> in io_put_identity maybe meaningless. See below case:
>>>>>       task context A issue one polled req, then req->task = A.
>>>>>       task context B do iopoll, above req returns with EAGAIN error.
>>>>>       task context B re-issue req, call io_queue_async_work for req.
>>>>>       req->task->io_uring will set to task context B's identity, or cow new one.
>>>>> then for above case, in io_put_identity(), the compare is meaningless.
>>>>>
>>>>> IIUC, req->task should indicates the initial task context that issues req,
>>>>> then if it gets EAGAIN error, we'll call io_prep_async_work() in req->task
>>>>> context, but iopoll reqs seems special, they maybe issued successfully and
>>>>> got re-issued in other task context because of EAGAIN error.
>>>>
>>>> Looks as you say, but the patch doesn't solve the issue completely.
>>>> 1. We must not do io_queue_async_work() under a different task context,
>>>> because of it potentially uses a different set of resources. So, I just
>>>> thought that it would be better to punt it to the right task context
>>>> via task_work. But...
>>>>
>>>> 2. ...iovec import from io_resubmit_prep() might happen after submit ends,
>>>> i.e. when iovec was freed in userspace. And that's not great at all.
>>> Yes, agree, that's why I say we neeed to re-consider the io identity codes
>>> more in commit message :) I'll have a try to prepare a better one.
>>
>> I'd vote for dragging -AGAIN'ed reqs that don't need io_import_iovec()
>> through task_work for resubmission, and fail everything else. Not great,
>> but imho better than always setting async_data.
> 
> Hey Xiaoguang, are you working on this? I would like to leave it to you,
> If you do.
Sorry, currently I'm busy with another project and don't have much time to
work on it yet. Hao Xu will help continue working on a new version of the patch.

Regards,
Xiaoguang Wang
> 


* Re: [PATCH] io_uring: don't issue reqs in iopoll mode when ctx is dying
  2021-02-24  2:30         ` Xiaoguang Wang
@ 2021-02-24  2:35           ` Jens Axboe
  2021-02-24  2:45             ` Xiaoguang Wang
  2021-02-24  9:38           ` Pavel Begunkov
  1 sibling, 1 reply; 16+ messages in thread
From: Jens Axboe @ 2021-02-24  2:35 UTC (permalink / raw)
  To: Xiaoguang Wang, Pavel Begunkov, io-uring; +Cc: joseph.qi

On 2/23/21 7:30 PM, Xiaoguang Wang wrote:
> hi Pavel,
> 
>> On 08/02/2021 13:35, Pavel Begunkov wrote:
>>> On 08/02/2021 02:50, Xiaoguang Wang wrote:
>>>>>> The io_identity's count is underflowed. It's because in io_put_identity,
>>>>>> first argument tctx comes from req->task->io_uring, the second argument
>>>>>> comes from the task context that calls io_req_init_async, so the compare
>>>>>> in io_put_identity maybe meaningless. See below case:
>>>>>>       task context A issue one polled req, then req->task = A.
>>>>>>       task context B do iopoll, above req returns with EAGAIN error.
>>>>>>       task context B re-issue req, call io_queue_async_work for req.
>>>>>>       req->task->io_uring will set to task context B's identity, or cow new one.
>>>>>> then for above case, in io_put_identity(), the compare is meaningless.
>>>>>>
>>>>>> IIUC, req->task should indicates the initial task context that issues req,
>>>>>> then if it gets EAGAIN error, we'll call io_prep_async_work() in req->task
>>>>>> context, but iopoll reqs seems special, they maybe issued successfully and
>>>>>> got re-issued in other task context because of EAGAIN error.
>>>>>
>>>>> Looks as you say, but the patch doesn't solve the issue completely.
>>>>> 1. We must not do io_queue_async_work() under a different task context,
>>>>> because of it potentially uses a different set of resources. So, I just
>>>>> thought that it would be better to punt it to the right task context
>>>>> via task_work. But...
>>>>>
>>>>> 2. ...iovec import from io_resubmit_prep() might happen after submit ends,
>>>>> i.e. when iovec was freed in userspace. And that's not great at all.
>>>> Yes, agree, that's why I say we neeed to re-consider the io identity codes
>>>> more in commit message :) I'll have a try to prepare a better one.
>>>
>>> I'd vote for dragging -AGAIN'ed reqs that don't need io_import_iovec()
>>> through task_work for resubmission, and fail everything else. Not great,
>>> but imho better than always setting async_data.
>>
>> Hey Xiaoguang, are you working on this? I would like to leave it to you,
>> If you do.
> Sorry, currently I'm busy with other project and don't have much time to work on
> it yet. Hao Xu will help to continue work on the new version patch.

Is it issue or reissue? I found this one today:

https://lore.kernel.org/io-uring/c9f6e1f6-ff82-0e58-ab66-956d0cde30ff@kernel.dk/

-- 
Jens Axboe



* Re: [PATCH] io_uring: don't issue reqs in iopoll mode when ctx is dying
  2021-02-24  2:35           ` Jens Axboe
@ 2021-02-24  2:45             ` Xiaoguang Wang
  2021-02-24  2:51               ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread
From: Xiaoguang Wang @ 2021-02-24  2:45 UTC (permalink / raw)
  To: Jens Axboe, Pavel Begunkov, io-uring; +Cc: joseph.qi

hi,

> On 2/23/21 7:30 PM, Xiaoguang Wang wrote:
>> hi Pavel,
>>
>>> On 08/02/2021 13:35, Pavel Begunkov wrote:
>>>> On 08/02/2021 02:50, Xiaoguang Wang wrote:
>>>>>>> The io_identity's count is underflowed. It's because in io_put_identity,
>>>>>>> first argument tctx comes from req->task->io_uring, the second argument
>>>>>>> comes from the task context that calls io_req_init_async, so the compare
>>>>>>> in io_put_identity maybe meaningless. See below case:
>>>>>>>        task context A issue one polled req, then req->task = A.
>>>>>>>        task context B do iopoll, above req returns with EAGAIN error.
>>>>>>>        task context B re-issue req, call io_queue_async_work for req.
>>>>>>>        req->task->io_uring will set to task context B's identity, or cow new one.
>>>>>>> then for above case, in io_put_identity(), the compare is meaningless.
>>>>>>>
>>>>>>> IIUC, req->task should indicates the initial task context that issues req,
>>>>>>> then if it gets EAGAIN error, we'll call io_prep_async_work() in req->task
>>>>>>> context, but iopoll reqs seems special, they maybe issued successfully and
>>>>>>> got re-issued in other task context because of EAGAIN error.
>>>>>>
>>>>>> Looks as you say, but the patch doesn't solve the issue completely.
>>>>>> 1. We must not do io_queue_async_work() under a different task context,
>>>>>> because of it potentially uses a different set of resources. So, I just
>>>>>> thought that it would be better to punt it to the right task context
>>>>>> via task_work. But...
>>>>>>
>>>>>> 2. ...iovec import from io_resubmit_prep() might happen after submit ends,
>>>>>> i.e. when iovec was freed in userspace. And that's not great at all.
>>>>> Yes, agree, that's why I say we neeed to re-consider the io identity codes
>>>>> more in commit message :) I'll have a try to prepare a better one.
>>>>
>>>> I'd vote for dragging -AGAIN'ed reqs that don't need io_import_iovec()
>>>> through task_work for resubmission, and fail everything else. Not great,
>>>> but imho better than always setting async_data.
>>>
>>> Hey Xiaoguang, are you working on this? I would like to leave it to you,
>>> If you do.
>> Sorry, currently I'm busy with other project and don't have much time to work on
>> it yet. Hao Xu will help to continue work on the new version patch.
> 
> Is it issue or reissue? I found this one today:
> 
> https://lore.kernel.org/io-uring/c9f6e1f6-ff82-0e58-ab66-956d0cde30ff@kernel.dk/
Yeah, my initial patch is similar to yours, but it only partially solves
the bug described in my commit message (the ctx-is-dying case); you can
have a look at my commit message for the bug scenario, thanks.

Regards,
Xiaoguang Wang


> 


* Re: [PATCH] io_uring: don't issue reqs in iopoll mode when ctx is dying
  2021-02-24  2:45             ` Xiaoguang Wang
@ 2021-02-24  2:51               ` Jens Axboe
  2021-02-24  9:46                 ` Pavel Begunkov
  0 siblings, 1 reply; 16+ messages in thread
From: Jens Axboe @ 2021-02-24  2:51 UTC (permalink / raw)
  To: Xiaoguang Wang, Pavel Begunkov, io-uring; +Cc: joseph.qi

On 2/23/21 7:45 PM, Xiaoguang Wang wrote:
> hi,
> 
>> On 2/23/21 7:30 PM, Xiaoguang Wang wrote:
>>> hi Pavel,
>>>
>>>> On 08/02/2021 13:35, Pavel Begunkov wrote:
>>>>> On 08/02/2021 02:50, Xiaoguang Wang wrote:
>>>>>>>> The io_identity's count is underflowed. It's because in io_put_identity,
>>>>>>>> first argument tctx comes from req->task->io_uring, the second argument
>>>>>>>> comes from the task context that calls io_req_init_async, so the compare
>>>>>>>> in io_put_identity maybe meaningless. See below case:
>>>>>>>>        task context A issue one polled req, then req->task = A.
>>>>>>>>        task context B do iopoll, above req returns with EAGAIN error.
>>>>>>>>        task context B re-issue req, call io_queue_async_work for req.
>>>>>>>>        req->task->io_uring will set to task context B's identity, or cow new one.
>>>>>>>> then for above case, in io_put_identity(), the compare is meaningless.
>>>>>>>>
>>>>>>>> IIUC, req->task should indicates the initial task context that issues req,
>>>>>>>> then if it gets EAGAIN error, we'll call io_prep_async_work() in req->task
>>>>>>>> context, but iopoll reqs seems special, they maybe issued successfully and
>>>>>>>> got re-issued in other task context because of EAGAIN error.
>>>>>>>
>>>>>>> Looks as you say, but the patch doesn't solve the issue completely.
>>>>>>> 1. We must not do io_queue_async_work() under a different task context,
>>>>>>> because of it potentially uses a different set of resources. So, I just
>>>>>>> thought that it would be better to punt it to the right task context
>>>>>>> via task_work. But...
>>>>>>>
>>>>>>> 2. ...iovec import from io_resubmit_prep() might happen after submit ends,
>>>>>>> i.e. when iovec was freed in userspace. And that's not great at all.
>>>>>> Yes, agree, that's why I say we neeed to re-consider the io identity codes
>>>>>> more in commit message :) I'll have a try to prepare a better one.
>>>>>
>>>>> I'd vote for dragging -AGAIN'ed reqs that don't need io_import_iovec()
>>>>> through task_work for resubmission, and fail everything else. Not great,
>>>>> but imho better than always setting async_data.
>>>>
>>>> Hey Xiaoguang, are you working on this? I would like to leave it to you,
>>>> If you do.
>>> Sorry, currently I'm busy with other project and don't have much time to work on
>>> it yet. Hao Xu will help to continue work on the new version patch.
>>
>> Is it issue or reissue? I found this one today:
>>
>> https://lore.kernel.org/io-uring/c9f6e1f6-ff82-0e58-ab66-956d0cde30ff@kernel.dk/
> Yeah, my initial patch is similar to yours, but it only solves the bug described
> in my commit message partially(ctx is dying), you can have a look at my commit message
> for the bug bug scene, thanks.

Are you sure? We just don't want to reissue it, we need to fail it.
Hence if we catch it at reissue time, that should be enough. But I'm
open to clue batting :-)

-- 
Jens Axboe



* Re: [PATCH] io_uring: don't issue reqs in iopoll mode when ctx is dying
  2021-02-06 15:00 [PATCH] io_uring: don't issue reqs in iopoll mode when ctx is dying Xiaoguang Wang
  2021-02-07 17:30 ` Pavel Begunkov
@ 2021-02-24  3:23 ` Xiaoguang Wang
  1 sibling, 0 replies; 16+ messages in thread
From: Xiaoguang Wang @ 2021-02-24  3:23 UTC (permalink / raw)
  To: io-uring; +Cc: axboe, asml.silence, joseph.qi

hi,

> Abaci Robot reported following panic:
> ------------[ cut here ]------------
> refcount_t: underflow; use-after-free.
> WARNING: CPU: 1 PID: 195 at lib/refcount.c:28 refcount_warn_saturate+0x137/0x140
> Modules linked in:
> CPU: 1 PID: 195 Comm: kworker/u4:2 Not tainted 5.11.0-rc3+ #70
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.or4
> Workqueue: events_unbound io_ring_exit_work
> RIP: 0010:refcount_warn_saturate+0x137/0x140
> Code: 05 ad 63 49 08 01 e8 45 0f 6f 00 0f 0b e9 16 ff ff ff e8 4c ba ae ff 48 c7 c7 28 2e 7c 82 c6 05 90 63 40
> RSP: 0018:ffffc900002e3cc8 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
> RDX: ffff888102918000 RSI: ffffffff81150a34 RDI: ffff88813bd28570
> RBP: ffff8881075cd348 R08: 0000000000000001 R09: 0000000000000001
> R10: 0000000000000001 R11: 0000000000080000 R12: ffff8881075cd308
> R13: ffff8881075cd348 R14: ffff888122d33ab8 R15: ffff888104780300
> FS:  0000000000000000(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000108 CR3: 0000000107636005 CR4: 0000000000060ee0
> Call Trace:
>   io_dismantle_req+0x3f3/0x5b0
>   __io_free_req+0x2c/0x270
>   io_put_req+0x4c/0x70
>   io_wq_cancel_cb+0x171/0x470
>   ? io_match_task.part.0+0x80/0x80
>   __io_uring_cancel_task_requests+0xa0/0x190
>   io_ring_exit_work+0x32/0x3e0
>   process_one_work+0x2f3/0x720
>   worker_thread+0x5a/0x4b0
>   ? process_one_work+0x720/0x720
>   kthread+0x138/0x180
>   ? kthread_park+0xd0/0xd0
>   ret_from_fork+0x1f/0x30
> 
> Later system will panic for some memory corruption.
> 
> The io_identity's count is underflowed. It's because in io_put_identity,
> first argument tctx comes from req->task->io_uring, the second argument
> comes from the task context that calls io_req_init_async, so the compare
> in io_put_identity maybe meaningless. See below case:
>      task context A issue one polled req, then req->task = A.
>      task context B do iopoll, above req returns with EAGAIN error.
>      task context B re-issue req, call io_queue_async_work for req.
>      req->task->io_uring will set to task context B's identity, or cow new one.
> then for above case, in io_put_identity(), the compare is meaningless.
Jens, can your patch solve this issue when the ctx is not dying?

Regards,
Xiaoguang Wang
> 
> IIUC, req->task should indicates the initial task context that issues req,
> then if it gets EAGAIN error, we'll call io_prep_async_work() in req->task
> context, but iopoll reqs seems special, they maybe issued successfully and
> got re-issued in other task context because of EAGAIN error.
> 
> Currently for this panic, we can disable issuing reqs that are returned
> with EAGAIN error in iopoll mode when ctx is dying, but we may need to
> re-consider the io identity codes more.
> 
> Reported-by: Abaci Robot <abaci@linux.alibaba.com>
> Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
> ---
>   fs/io_uring.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 9db05171a774..e3b90426d72b 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -2467,7 +2467,12 @@ static void io_iopoll_complete(struct io_ring_ctx *ctx, unsigned int *nr_events,
>   		int cflags = 0;
>   
>   		req = list_first_entry(done, struct io_kiocb, inflight_entry);
> -		if (READ_ONCE(req->result) == -EAGAIN) {
> +		/*
> +		 * If ctx is dying, don't need to issue reqs that are returned
> +		 * with EAGAIN error, since there maybe no users to reap them.
> +		 */
> +		if ((READ_ONCE(req->result) == -EAGAIN) &&
> +		    !percpu_ref_is_dying(&ctx->refs)) {
>   			req->result = 0;
>   			req->iopoll_completed = 0;
>   			list_move_tail(&req->inflight_entry, &again);
> 


* Re: [PATCH] io_uring: don't issue reqs in iopoll mode when ctx is dying
  2021-02-24  2:30         ` Xiaoguang Wang
  2021-02-24  2:35           ` Jens Axboe
@ 2021-02-24  9:38           ` Pavel Begunkov
  1 sibling, 0 replies; 16+ messages in thread
From: Pavel Begunkov @ 2021-02-24  9:38 UTC (permalink / raw)
  To: Xiaoguang Wang, io-uring; +Cc: axboe, joseph.qi

On 24/02/2021 02:30, Xiaoguang Wang wrote:
> hi Pavel,
> 
>> On 08/02/2021 13:35, Pavel Begunkov wrote:
>>> On 08/02/2021 02:50, Xiaoguang Wang wrote:
>>>>>> The io_identity's count is underflowed. It's because in io_put_identity,
>>>>>> first argument tctx comes from req->task->io_uring, the second argument
>>>>>> comes from the task context that calls io_req_init_async, so the compare
>>>>>> in io_put_identity maybe meaningless. See below case:
>>>>>>       task context A issue one polled req, then req->task = A.
>>>>>>       task context B do iopoll, above req returns with EAGAIN error.
>>>>>>       task context B re-issue req, call io_queue_async_work for req.
>>>>>>       req->task->io_uring will set to task context B's identity, or cow new one.
>>>>>> then for above case, in io_put_identity(), the compare is meaningless.
>>>>>>
>>>>>> IIUC, req->task should indicates the initial task context that issues req,
>>>>>> then if it gets EAGAIN error, we'll call io_prep_async_work() in req->task
>>>>>> context, but iopoll reqs seems special, they maybe issued successfully and
>>>>>> got re-issued in other task context because of EAGAIN error.
>>>>>
>>>>> Looks as you say, but the patch doesn't solve the issue completely.
>>>>> 1. We must not do io_queue_async_work() under a different task context,
>>>>> because of it potentially uses a different set of resources. So, I just
>>>>> thought that it would be better to punt it to the right task context
>>>>> via task_work. But...
>>>>>
>>>>> 2. ...iovec import from io_resubmit_prep() might happen after submit ends,
>>>>> i.e. when iovec was freed in userspace. And that's not great at all.
>>>> Yes, agree, that's why I say we neeed to re-consider the io identity codes
>>>> more in commit message :) I'll have a try to prepare a better one.
>>>
>>> I'd vote for dragging -AGAIN'ed reqs that don't need io_import_iovec()
>>> through task_work for resubmission, and fail everything else. Not great,
>>> but imho better than always setting async_data.
>>
>> Hey Xiaoguang, are you working on this? I would like to leave it to you,
>> If you do.
> Sorry, currently I'm busy with other project and don't have much time to work on
> it yet. Hao Xu will help to continue work on the new version patch.

Thanks for letting me know! It's really not a big deal, I was just leaving
it in case someone wants to take a shot, but we'll get it fixed ourselves
otherwise.

-- 
Pavel Begunkov


* Re: [PATCH] io_uring: don't issue reqs in iopoll mode when ctx is dying
  2021-02-24  2:51               ` Jens Axboe
@ 2021-02-24  9:46                 ` Pavel Begunkov
  2021-02-24  9:59                   ` Pavel Begunkov
  0 siblings, 1 reply; 16+ messages in thread
From: Pavel Begunkov @ 2021-02-24  9:46 UTC (permalink / raw)
  To: Jens Axboe, Xiaoguang Wang, io-uring; +Cc: joseph.qi

On 24/02/2021 02:51, Jens Axboe wrote:
>>>>> On 08/02/2021 13:35, Pavel Begunkov wrote:
>>>>>> On 08/02/2021 02:50, Xiaoguang Wang wrote:
>>>>>>>>> The io_identity's count is underflowed. It's because in io_put_identity,
>>>>>>>>> first argument tctx comes from req->task->io_uring, the second argument
>>>>>>>>> comes from the task context that calls io_req_init_async, so the compare
>>>>>>>>> in io_put_identity maybe meaningless. See below case:
>>>>>>>>>        task context A issue one polled req, then req->task = A.
>>>>>>>>>        task context B do iopoll, above req returns with EAGAIN error.
>>>>>>>>>        task context B re-issue req, call io_queue_async_work for req.
>>>>>>>>>        req->task->io_uring will set to task context B's identity, or cow new one.
>>>>>>>>> then for above case, in io_put_identity(), the compare is meaningless.
>>>>>>>>>
>>>>>>>>> IIUC, req->task should indicates the initial task context that issues req,
>>>>>>>>> then if it gets EAGAIN error, we'll call io_prep_async_work() in req->task
>>>>>>>>> context, but iopoll reqs seems special, they maybe issued successfully and
>>>>>>>>> got re-issued in other task context because of EAGAIN error.
>>>>>>>>
>>>>>>>> Looks as you say, but the patch doesn't solve the issue completely.
>>>>>>>> 1. We must not do io_queue_async_work() under a different task context,
>>>>>>>> because of it potentially uses a different set of resources. So, I just
>>>>>>>> thought that it would be better to punt it to the right task context
>>>>>>>> via task_work. But...
>>>>>>>>
>>>>>>>> 2. ...iovec import from io_resubmit_prep() might happen after submit ends,
>>>>>>>> i.e. when iovec was freed in userspace. And that's not great at all.
>>>>>>> Yes, agree, that's why I say we neeed to re-consider the io identity codes
>>>>>>> more in commit message :) I'll have a try to prepare a better one.
>>>>>>
>>>>>> I'd vote for dragging -AGAIN'ed reqs that don't need io_import_iovec()
>>>>>> through task_work for resubmission, and fail everything else. Not great,
>>>>>> but imho better than always setting async_data.
>>>>>
>>>>> Hey Xiaoguang, are you working on this? I would like to leave it to you,
>>>>> If you do.
>>>> Sorry, currently I'm busy with other project and don't have much time to work on
>>>> it yet. Hao Xu will help to continue work on the new version patch.
>>>
>>> Is it issue or reissue? I found this one today:
>>>
>>> https://lore.kernel.org/io-uring/c9f6e1f6-ff82-0e58-ab66-956d0cde30ff@kernel.dk/
>> Yeah, my initial patch is similar to yours, but it only solves the bug described
>> in my commit message partially(ctx is dying), you can have a look at my commit message
>> for the bug bug scene, thanks.
> 
> Are you sure? We just don't want to reissue it, we need to fail it.
> Hence if we catch it at reissue time, that should be enough. But I'm
> open to clue batting :-)

Jens, IOPOLL can happen from a different task, so:
1) we don't want to grab an io_wq_work context from it. As always, we can
pass it through task_work, or it should be solved with your io-wq patches.

2) it happens who knows when in time, so the iovec may be gone already --
the same reasoning for why io_[read,write]() copy it before going to io-wq.
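
To illustrate 2) from the userspace side, a contrived liburing-based sketch
(not a real reproducer):

#include <liburing.h>

static void submit_readv(struct io_uring *ring, int fd, void *buf, size_t len)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };	/* on stack */
	struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

	io_uring_prep_readv(sqe, fd, &iov, 1, 0);
	io_uring_submit(ring);
	/*
	 * Returning here is legal: the iovec array may be reused or freed
	 * as soon as submit returns.  A kernel-side re-import at some later
	 * iopoll reap time would be reading a dead stack frame.
	 */
}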


-- 
Pavel Begunkov


* Re: [PATCH] io_uring: don't issue reqs in iopoll mode when ctx is dying
  2021-02-24  9:46                 ` Pavel Begunkov
@ 2021-02-24  9:59                   ` Pavel Begunkov
  2021-02-24 10:33                     ` Pavel Begunkov
  0 siblings, 1 reply; 16+ messages in thread
From: Pavel Begunkov @ 2021-02-24  9:59 UTC (permalink / raw)
  To: Jens Axboe, Xiaoguang Wang, io-uring; +Cc: joseph.qi

On 24/02/2021 09:46, Pavel Begunkov wrote:
>> Are you sure? We just don't want to reissue it, we need to fail it.
>> Hence if we catch it at reissue time, that should be enough. But I'm
>> open to clue batting :-)
> 
> Jens, IOPOLL can happen from a different task, so
> 1) we don't want to grab io_wq_work context from it. As always we can pass it
> through task_work, or should be solved with your io-wq patches.
> 
> 2) it happens who knows when in time, so iovec may be gone already -- same
> reasoning why io_[read,write]() copy it before going to io-wq.

The diff below should solve the second problem by failing them (not tested).
1) can be left to the io-wq patches.


diff --git a/fs/io_uring.c b/fs/io_uring.c
index bf9ad810c621..561c29b20463 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2610,8 +2610,11 @@ static void io_iopoll_complete(struct io_ring_ctx *ctx, unsigned int *nr_events,
 		list_del(&req->inflight_entry);
 
 		if (READ_ONCE(req->result) == -EAGAIN) {
+			bool reissue = req->async_data ||
+				!io_op_defs[req->opcode].needs_async_data;
+
 			req->iopoll_completed = 0;
-			if (io_rw_reissue(req))
+			if (reissue && io_rw_reissue(req))
 				continue;
 		}
 


-- 
Pavel Begunkov


* Re: [PATCH] io_uring: don't issue reqs in iopoll mode when ctx is dying
  2021-02-24  9:59                   ` Pavel Begunkov
@ 2021-02-24 10:33                     ` Pavel Begunkov
  0 siblings, 0 replies; 16+ messages in thread
From: Pavel Begunkov @ 2021-02-24 10:33 UTC (permalink / raw)
  To: Jens Axboe, Xiaoguang Wang, io-uring; +Cc: joseph.qi

On 24/02/2021 09:59, Pavel Begunkov wrote:
> On 24/02/2021 09:46, Pavel Begunkov wrote:
>>> Are you sure? We just don't want to reissue it, we need to fail it.
>>> Hence if we catch it at reissue time, that should be enough. But I'm
>>> open to clue batting :-)
>>
>> Jens, IOPOLL can happen from a different task, so
>> 1) we don't want to grab io_wq_work context from it. As always we can pass it
>> through task_work, or should be solved with your io-wq patches.
>>
>> 2) it happens who knows when in time, so iovec may be gone already -- same
>> reasoning why io_[read,write]() copy it before going to io-wq.
> 
> The diff below should solve the second problem by failing them (not tested).
> 1) can be left to the io-wq patches.

We can even try to init it in io_complete_rw_iopoll() similarly to
__io_complete_rw(). The tricky part for me is that "!inline exec" comment,
i.e. how to distinguish io_complete_rw_iopoll() being -EAGAIN'ed inline
from being -EAGAIN'ed from IRQ/etc. Jens?


diff --git a/fs/io_uring.c b/fs/io_uring.c
index bf9ad810c621..413bb4dd0a2f 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2610,8 +2610,11 @@ static void io_iopoll_complete(struct io_ring_ctx *ctx, unsigned int *nr_events,
 		list_del(&req->inflight_entry);
 
 		if (READ_ONCE(req->result) == -EAGAIN) {
+			bool reissue = req->async_data ||
+				!io_op_defs[req->opcode].needs_async_data;
+
 			req->iopoll_completed = 0;
-			if (io_rw_reissue(req))
+			if (reissue && io_rw_reissue(req))
 				continue;
 		}
 
@@ -2829,7 +2832,7 @@ static bool io_resubmit_prep(struct io_kiocb *req)
 }
 #endif
 
-static bool io_rw_reissue(struct io_kiocb *req)
+static bool io_rw_reissue_prep(struct io_kiocb *req)
 {
 #ifdef CONFIG_BLOCK
 	umode_t mode = file_inode(req->file)->i_mode;
@@ -2844,13 +2847,21 @@ static bool io_rw_reissue(struct io_kiocb *req)
 
 	ret = io_sq_thread_acquire_mm_files(req->ctx, req);
 
-	if (!ret && io_resubmit_prep(req)) {
+	if (!ret && io_resubmit_prep(req))
+		return true;
+	req_set_fail_links(req);
+#endif
+	return false;
+
+}
+
+static bool io_rw_reissue(struct io_kiocb *req)
+{
+	if (io_rw_reissue_prep(req)) {
 		refcount_inc(&req->refs);
 		io_queue_async_work(req);
 		return true;
 	}
-	req_set_fail_links(req);
-#endif
 	return false;
 }
 
@@ -2885,8 +2896,12 @@ static void io_complete_rw_iopoll(struct kiocb *kiocb, long res, long res2)
 	if (kiocb->ki_flags & IOCB_WRITE)
 		kiocb_end_write(req);
 
-	if (res != -EAGAIN && res != req->result)
+	if (res == -EAGAIN) {
+		if (/* !inline exec || */ !io_rw_reissue_prep(req))
+			req_set_fail_links(req);
+	} else if (res != req->result) {
 		req_set_fail_links(req);
+	}
 
 	WRITE_ONCE(req->result, res);
 	/* order with io_poll_complete() checking ->result */



-- 
Pavel Begunkov


* Re: [PATCH] io_uring: don't issue reqs in iopoll mode when ctx is dying
  2021-02-07 17:30 ` Pavel Begunkov
  2021-02-08  2:50   ` Xiaoguang Wang
@ 2021-02-24 12:42   ` Hao Xu
  2021-02-25 10:55     ` Pavel Begunkov
  1 sibling, 1 reply; 16+ messages in thread
From: Hao Xu @ 2021-02-24 12:42 UTC (permalink / raw)
  To: Pavel Begunkov, Xiaoguang Wang, io-uring; +Cc: axboe, joseph.qi

On 2021/2/8 1:30 AM, Pavel Begunkov wrote:
> On 06/02/2021 15:00, Xiaoguang Wang wrote:
>> Abaci Robot reported following panic:
>> ------------[ cut here ]------------
>> refcount_t: underflow; use-after-free.
>> WARNING: CPU: 1 PID: 195 at lib/refcount.c:28 refcount_warn_saturate+0x137/0x140
>> Modules linked in:
>> CPU: 1 PID: 195 Comm: kworker/u4:2 Not tainted 5.11.0-rc3+ #70
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.or4
>> Workqueue: events_unbound io_ring_exit_work
>> RIP: 0010:refcount_warn_saturate+0x137/0x140
>> Code: 05 ad 63 49 08 01 e8 45 0f 6f 00 0f 0b e9 16 ff ff ff e8 4c ba ae ff 48 c7 c7 28 2e 7c 82 c6 05 90 63 40
>> RSP: 0018:ffffc900002e3cc8 EFLAGS: 00010282
>> RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
>> RDX: ffff888102918000 RSI: ffffffff81150a34 RDI: ffff88813bd28570
>> RBP: ffff8881075cd348 R08: 0000000000000001 R09: 0000000000000001
>> R10: 0000000000000001 R11: 0000000000080000 R12: ffff8881075cd308
>> R13: ffff8881075cd348 R14: ffff888122d33ab8 R15: ffff888104780300
>> FS:  0000000000000000(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000108 CR3: 0000000107636005 CR4: 0000000000060ee0
>> Call Trace:
>>   io_dismantle_req+0x3f3/0x5b0
>>   __io_free_req+0x2c/0x270
>>   io_put_req+0x4c/0x70
>>   io_wq_cancel_cb+0x171/0x470
>>   ? io_match_task.part.0+0x80/0x80
>>   __io_uring_cancel_task_requests+0xa0/0x190
>>   io_ring_exit_work+0x32/0x3e0
>>   process_one_work+0x2f3/0x720
>>   worker_thread+0x5a/0x4b0
>>   ? process_one_work+0x720/0x720
>>   kthread+0x138/0x180
>>   ? kthread_park+0xd0/0xd0
>>   ret_from_fork+0x1f/0x30
>>
>> Later system will panic for some memory corruption.
>>
>> The io_identity's count is underflowed. It's because in io_put_identity,
>> first argument tctx comes from req->task->io_uring, the second argument
>> comes from the task context that calls io_req_init_async, so the compare
>> in io_put_identity maybe meaningless. See below case:
>>      task context A issue one polled req, then req->task = A.
>>      task context B do iopoll, above req returns with EAGAIN error.
>>      task context B re-issue req, call io_queue_async_work for req.
>>      req->task->io_uring will set to task context B's identity, or cow new one.
>> then for above case, in io_put_identity(), the compare is meaningless.
>>
>> IIUC, req->task should indicates the initial task context that issues req,
>> then if it gets EAGAIN error, we'll call io_prep_async_work() in req->task
>> context, but iopoll reqs seems special, they maybe issued successfully and
>> got re-issued in other task context because of EAGAIN error.
> 
> Looks as you say, but the patch doesn't solve the issue completely.
> 1. We must not do io_queue_async_work() under a different task context,
> because of it potentially uses a different set of resources. So, I just
Hi Pavel,
I have some questions.
  - Why would the resources be potentially different? My understanding is
that the tasks which can share a uring instance must be in the same
process, so they have the same context (files, mm, etc.).
  - Assuming 1 is true, why don't we just do something like this:
        req->work.identity = req->task->io_uring->identity;
    since req->task points to the original context?

> thought that it would be better to punt it to the right task context
> via task_work. But...
> 
> 2. ...iovec import from io_resubmit_prep() might happen after submit ends,
> i.e. when iovec was freed in userspace. And that's not great at all.
I didn't get why we need to import the iovec here. My understanding is
that OPs (such as readv/writev) which need to import an iovec already did
that in the first inline IO; what am I missing? And if it is necessary,
how do we ensure the iovec wasn't freed in userspace by reissue time?
> 
>> Currently for this panic, we can disable issuing reqs that are returned
>> with EAGAIN error in iopoll mode when ctx is dying, but we may need to
>> re-consider the io identity codes more.
>>
>> Reported-by: Abaci Robot <abaci@linux.alibaba.com>
>> Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
>> ---
>>   fs/io_uring.c | 7 ++++++-
>>   1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 9db05171a774..e3b90426d72b 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -2467,7 +2467,12 @@ static void io_iopoll_complete(struct io_ring_ctx *ctx, unsigned int *nr_events,
>>   		int cflags = 0;
>>   
>>   		req = list_first_entry(done, struct io_kiocb, inflight_entry);
>> -		if (READ_ONCE(req->result) == -EAGAIN) {
>> +		/*
>> +		 * If ctx is dying, don't need to issue reqs that are returned
>> +		 * with EAGAIN error, since there maybe no users to reap them.
>> +		 */
>> +		if ((READ_ONCE(req->result) == -EAGAIN) &&
>> +		    !percpu_ref_is_dying(&ctx->refs)) {
>>   			req->result = 0;
>>   			req->iopoll_completed = 0;
>>   			list_move_tail(&req->inflight_entry, &again);
>>
> 



* Re: [PATCH] io_uring: don't issue reqs in iopoll mode when ctx is dying
  2021-02-24 12:42   ` Hao Xu
@ 2021-02-25 10:55     ` Pavel Begunkov
  0 siblings, 0 replies; 16+ messages in thread
From: Pavel Begunkov @ 2021-02-25 10:55 UTC (permalink / raw)
  To: Hao Xu, Xiaoguang Wang, io-uring; +Cc: axboe, joseph.qi

On 24/02/2021 12:42, Hao Xu wrote:
> On 2021/2/8 1:30 AM, Pavel Begunkov wrote:
>> On 06/02/2021 15:00, Xiaoguang Wang wrote:
>> Looks as you say, but the patch doesn't solve the issue completely.
>> 1. We must not do io_queue_async_work() under a different task context,
>> because of it potentially uses a different set of resources. So, I just
> Hi Pavel,
> I've some questions.
>  - Why would the resources be potentially different? My understanding is the tasks which can share a uring instance must be in the same process, so they have same context(files, mm etc.)

"task" and "process" are pretty much synonyms in this context, as is
struct task. And they can have different files/mm; io_uring works across
fork(), for example.
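
A minimal userspace sketch of that (assuming liburing): two tasks with
their own mm/files driving one ring.

#include <liburing.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
	struct io_uring ring;
	struct io_uring_cqe *cqe;

	if (io_uring_queue_init(8, &ring, 0) < 0)
		return 1;

	if (fork() == 0) {
		/* child: separate mm/files after fork, but the same ring
		 * fd and shared SQ/CQ mappings */
		struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);

		io_uring_prep_nop(sqe);
		io_uring_submit(&ring);
		_exit(0);
	}
	wait(NULL);

	/* the parent reaps a completion submitted by a different task */
	if (io_uring_wait_cqe(&ring, &cqe) == 0) {
		printf("nop res=%d\n", cqe->res);
		io_uring_cqe_seen(&ring, cqe);
	}
	io_uring_queue_exit(&ring);
	return 0;
}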

>  - Assume 1 is true, why don't we just do something like this:
>        req->work.identity = req->task->io_uring->identity;
>    since req->task points to the original context

Because there is more to be done than that (see grab_identity or
something), and it may be racy to do. As mentioned, the identity will
cease to exist with Jens' patches, and in any case we can punt to
task_work.

> 
>> thought that it would be better to punt it to the right task context
>> via task_work. But...
>>
>> 2. ...iovec import from io_resubmit_prep() might happen after submit ends,
>> i.e. when iovec was freed in userspace. And that's not great at all.
> I didn't get it why here we need to import iovec. My understanding is OPs(such as readv writev) which need imoport iovec already did that in the first inline IO, what do I omit? 

Nope, it may skip copying them; e.g. see the -EIOCBQUEUED path of io_read().

> And if it is neccessary, how do we ensure the iovec wasn't freed in userspace at the reissuing time?

We should not import_iovec() for reissue after we left submit, and
that's exactly the problem happening with IOPOLL.
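
For reference, the path in question looks roughly like this (simplified
from memory, iovec cleanup trimmed; the point is where the import happens):

static bool io_resubmit_prep(struct io_kiocb *req)
{
	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
	struct iov_iter iter;
	int rw, ret;

	/* nothing to redo if the iovec was copied at submit time */
	if (req->async_data)
		return true;

	switch (req->opcode) {
	case IORING_OP_READV:
	case IORING_OP_READ_FIXED:
	case IORING_OP_READ:
		rw = READ;
		break;
	case IORING_OP_WRITEV:
	case IORING_OP_WRITE_FIXED:
	case IORING_OP_WRITE:
		rw = WRITE;
		break;
	default:
		return false;
	}

	/* re-reads the iovec from userspace: fine during submit, not fine
	 * at some arbitrary later iopoll reap time */
	ret = io_import_iovec(rw, req, &iovec, &iter, false);
	if (ret < 0)
		return false;
	return !io_setup_async_rw(req, iovec, inline_vecs, &iter, false);
}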

For __io_complete_rw(), Jens says that it gets -EAGAIN iff the callback
was called from submission; correct me if I'm wrong, I have never checked
it myself, though.

-- 
Pavel Begunkov


