io-uring.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] fixes of poll
@ 2021-09-27 12:35 Hao Xu
  2021-09-27 12:35 ` [PATCH 1/2] io_uring: fix concurrent poll interruption Hao Xu
  2021-09-27 12:36 ` [PATCH 2/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list Hao Xu
  0 siblings, 2 replies; 6+ messages in thread
From: Hao Xu @ 2021-09-27 12:35 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi

Two fixes to address poll issues.

Hao Xu (2):
  io_uring: fix concurrent poll interruption
  io_uring: fix tw list mess-up by adding tw while it's already in tw
    list

 fs/io_uring.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

-- 
2.24.4


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/2] io_uring: fix concurrent poll interruption
  2021-09-27 12:35 [PATCH 0/2] fixes of poll Hao Xu
@ 2021-09-27 12:35 ` Hao Xu
  2021-09-27 13:33   ` Hao Xu
  2021-09-27 12:36 ` [PATCH 2/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list Hao Xu
  1 sibling, 1 reply; 6+ messages in thread
From: Hao Xu @ 2021-09-27 12:35 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi

There may be concurrent poll interruptions:

async_wake()
->del wait entry
->add to tw list
                       async_wake()
                       ->del wait entry
                       ->add to tw list

This mess up the tw list, let's avoid it by adding a if check before
delete wait entry:
async_wake()
->if empty(wait entry)
    return
->del wait entry
->add to tw list
                       async_wake()
                       ->if empty(wait entry)
                           return <------------will return here
                       ->del wait entry
                       ->add to tw list

Fixes: 88e41cf928a6 ("io_uring: add multishot mode for IORING_OP_POLL_ADD")
Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
---
 fs/io_uring.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 8317c360f7a4..d0b358b9b589 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5245,6 +5245,8 @@ static int __io_async_wake(struct io_kiocb *req, struct io_poll_iocb *poll,
 
 	trace_io_uring_task_add(req->ctx, req->opcode, req->user_data, mask);
 
+	if (list_empty(&poll->wait.entry))
+		return 0;
 	list_del_init(&poll->wait.entry);
 
 	req->result = mask;
-- 
2.24.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list
  2021-09-27 12:35 [PATCH 0/2] fixes of poll Hao Xu
  2021-09-27 12:35 ` [PATCH 1/2] io_uring: fix concurrent poll interruption Hao Xu
@ 2021-09-27 12:36 ` Hao Xu
  2021-09-29 11:13   ` Pavel Begunkov
  1 sibling, 1 reply; 6+ messages in thread
From: Hao Xu @ 2021-09-27 12:36 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi

For multishot mode, there may be cases like:
io_poll_task_func()
-> add_wait_queue()
                            async_wake()
                            ->io_req_task_work_add()
                            this one mess up the running task_work list
                            since req->io_task_work.node is in use.

similar situation for req->io_task_work.fallback_node.
Fix it by set node->next = NULL before we run the tw, so that when we
add req back to the wait queue in middle of tw running, we can safely
re-add it to the tw list.

Fixes: 7cbf1722d5fc ("io_uring: provide FIFO ordering for task_work")
Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
---
 fs/io_uring.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index d0b358b9b589..f667d6286438 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1250,13 +1250,17 @@ static void io_fallback_req_func(struct work_struct *work)
 	struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx,
 						fallback_work.work);
 	struct llist_node *node = llist_del_all(&ctx->fallback_llist);
-	struct io_kiocb *req, *tmp;
+	struct io_kiocb *req;
 	bool locked = false;
 
 	percpu_ref_get(&ctx->refs);
-	llist_for_each_entry_safe(req, tmp, node, io_task_work.fallback_node)
+	req = llist_entry(node, struct io_kiocb, io_task_work.fallback_node);
+	while (member_address_is_nonnull(req, io_task_work.fallback_node)) {
+		node = req->io_task_work.fallback_node.next;
+		req->io_task_work.fallback_node.next = NULL;
 		req->io_task_work.func(req, &locked);
-
+		req = llist_entry(node, struct io_kiocb, io_task_work.fallback_node);
+	}
 	if (locked) {
 		io_submit_flush_completions(ctx);
 		mutex_unlock(&ctx->uring_lock);
@@ -2156,6 +2160,7 @@ static void tctx_task_work(struct callback_head *cb)
 				locked = mutex_trylock(&ctx->uring_lock);
 				percpu_ref_get(&ctx->refs);
 			}
+			node->next = NULL;
 			req->io_task_work.func(req, &locked);
 			node = next;
 		} while (node);
-- 
2.24.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] io_uring: fix concurrent poll interruption
  2021-09-27 12:35 ` [PATCH 1/2] io_uring: fix concurrent poll interruption Hao Xu
@ 2021-09-27 13:33   ` Hao Xu
  0 siblings, 0 replies; 6+ messages in thread
From: Hao Xu @ 2021-09-27 13:33 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi

在 2021/9/27 下午8:35, Hao Xu 写道:
> There may be concurrent poll interruptions:
> 
> async_wake()
> ->del wait entry
> ->add to tw list
>                         async_wake()
>                         ->del wait entry
>                         ->add to tw list
> 
> This mess up the tw list, let's avoid it by adding a if check before
> delete wait entry:
> async_wake()
> ->if empty(wait entry)
>      return
> ->del wait entry
> ->add to tw list
>                         async_wake()
>                         ->if empty(wait entry)
>                             return <------------will return here
>                         ->del wait entry
>                         ->add to tw list
I now think the issue may not exist, since async_wake() is protected by
poll->head->lock, so the wait entry has already been deleted when the
second time event happens and try to visit the wait list to wakeup it.
> 
> Fixes: 88e41cf928a6 ("io_uring: add multishot mode for IORING_OP_POLL_ADD")
> Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
> ---
>   fs/io_uring.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 8317c360f7a4..d0b358b9b589 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -5245,6 +5245,8 @@ static int __io_async_wake(struct io_kiocb *req, struct io_poll_iocb *poll,
>   
>   	trace_io_uring_task_add(req->ctx, req->opcode, req->user_data, mask);
>   
> +	if (list_empty(&poll->wait.entry))
> +		return 0;
>   	list_del_init(&poll->wait.entry);
>   
>   	req->result = mask;
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list
  2021-09-27 12:36 ` [PATCH 2/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list Hao Xu
@ 2021-09-29 11:13   ` Pavel Begunkov
  2021-09-29 12:08     ` Hao Xu
  0 siblings, 1 reply; 6+ messages in thread
From: Pavel Begunkov @ 2021-09-29 11:13 UTC (permalink / raw)
  To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi

On 9/27/21 1:36 PM, Hao Xu wrote:
> For multishot mode, there may be cases like:
> io_poll_task_func()
> -> add_wait_queue()
>                             async_wake()
>                             ->io_req_task_work_add()
>                             this one mess up the running task_work list
>                             since req->io_task_work.node is in use.

By the time req->io_task_work.func() is called, node->next is undefined
and free to use. io_req_task_work_add() will override it without looking
at a prior value and that's fine, as the calling code doesn't touch it
after the callback.

> similar situation for req->io_task_work.fallback_node.
> Fix it by set node->next = NULL before we run the tw, so that when we
> add req back to the wait queue in middle of tw running, we can safely
> re-add it to the tw list.

I might be missing what is the problem you're trying to fix. Does the
above makes sense? It doesn't sound like node->next=NULL can solve
anything.

> Fixes: 7cbf1722d5fc ("io_uring: provide FIFO ordering for task_work")
> Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
> ---
>  fs/io_uring.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index d0b358b9b589..f667d6286438 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -1250,13 +1250,17 @@ static void io_fallback_req_func(struct work_struct *work)
>  	struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx,
>  						fallback_work.work);
>  	struct llist_node *node = llist_del_all(&ctx->fallback_llist);
> -	struct io_kiocb *req, *tmp;
> +	struct io_kiocb *req;
>  	bool locked = false;
>  
>  	percpu_ref_get(&ctx->refs);
> -	llist_for_each_entry_safe(req, tmp, node, io_task_work.fallback_node)
> +	req = llist_entry(node, struct io_kiocb, io_task_work.fallback_node);
> +	while (member_address_is_nonnull(req, io_task_work.fallback_node)) {
> +		node = req->io_task_work.fallback_node.next;
> +		req->io_task_work.fallback_node.next = NULL;
>  		req->io_task_work.func(req, &locked);
> -
> +		req = llist_entry(node, struct io_kiocb, io_task_work.fallback_node);
> +	}
>  	if (locked) {
>  		io_submit_flush_completions(ctx);
>  		mutex_unlock(&ctx->uring_lock);
> @@ -2156,6 +2160,7 @@ static void tctx_task_work(struct callback_head *cb)
>  				locked = mutex_trylock(&ctx->uring_lock);
>  				percpu_ref_get(&ctx->refs);
>  			}
> +			node->next = NULL;
>  			req->io_task_work.func(req, &locked);
>  			node = next;
>  		} while (node);
> 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list
  2021-09-29 11:13   ` Pavel Begunkov
@ 2021-09-29 12:08     ` Hao Xu
  0 siblings, 0 replies; 6+ messages in thread
From: Hao Xu @ 2021-09-29 12:08 UTC (permalink / raw)
  To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi

在 2021/9/29 下午7:13, Pavel Begunkov 写道:
> On 9/27/21 1:36 PM, Hao Xu wrote:
>> For multishot mode, there may be cases like:
>> io_poll_task_func()
>> -> add_wait_queue()
>>                              async_wake()
>>                              ->io_req_task_work_add()
>>                              this one mess up the running task_work list
>>                              since req->io_task_work.node is in use.
> 
> By the time req->io_task_work.func() is called, node->next is undefined
> and free to use. io_req_task_work_add() will override it without looking
> at a prior value and that's fine, as the calling code doesn't touch it
> after the callback.
I misunderstood the code, since node->next will be reset to NULL in
wq_list_add_tail(), so no problem here.
> 
>> similar situation for req->io_task_work.fallback_node.
>> Fix it by set node->next = NULL before we run the tw, so that when we
>> add req back to the wait queue in middle of tw running, we can safely
>> re-add it to the tw list.
> 
> I might be missing what is the problem you're trying to fix. Does the
> above makes sense? It doesn't sound like node->next=NULL can solve
> anything.
> 
>> Fixes: 7cbf1722d5fc ("io_uring: provide FIFO ordering for task_work")
>> Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
>> ---
>>   fs/io_uring.c | 11 ++++++++---
>>   1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index d0b358b9b589..f667d6286438 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -1250,13 +1250,17 @@ static void io_fallback_req_func(struct work_struct *work)
>>   	struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx,
>>   						fallback_work.work);
>>   	struct llist_node *node = llist_del_all(&ctx->fallback_llist);
>> -	struct io_kiocb *req, *tmp;
>> +	struct io_kiocb *req;
>>   	bool locked = false;
>>   
>>   	percpu_ref_get(&ctx->refs);
>> -	llist_for_each_entry_safe(req, tmp, node, io_task_work.fallback_node)
>> +	req = llist_entry(node, struct io_kiocb, io_task_work.fallback_node);
>> +	while (member_address_is_nonnull(req, io_task_work.fallback_node)) {
>> +		node = req->io_task_work.fallback_node.next;
>> +		req->io_task_work.fallback_node.next = NULL;
>>   		req->io_task_work.func(req, &locked);
>> -
>> +		req = llist_entry(node, struct io_kiocb, io_task_work.fallback_node);
>> +	}
>>   	if (locked) {
>>   		io_submit_flush_completions(ctx);
>>   		mutex_unlock(&ctx->uring_lock);
>> @@ -2156,6 +2160,7 @@ static void tctx_task_work(struct callback_head *cb)
>>   				locked = mutex_trylock(&ctx->uring_lock);
>>   				percpu_ref_get(&ctx->refs);
>>   			}
>> +			node->next = NULL;
>>   			req->io_task_work.func(req, &locked);
>>   			node = next;
>>   		} while (node);
>>
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-09-29 12:08 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-27 12:35 [PATCH 0/2] fixes of poll Hao Xu
2021-09-27 12:35 ` [PATCH 1/2] io_uring: fix concurrent poll interruption Hao Xu
2021-09-27 13:33   ` Hao Xu
2021-09-27 12:36 ` [PATCH 2/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list Hao Xu
2021-09-29 11:13   ` Pavel Begunkov
2021-09-29 12:08     ` Hao Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).