io-uring.vger.kernel.org archive mirror
* [PATCH for-5.15 v3 0/2] fix failed linkchain code logic
@ 2021-08-27  9:46 Hao Xu
  2021-08-27  9:46 ` [PATCH 1/2] io_uring: remove redundant req_set_fail() Hao Xu
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Hao Xu @ 2021-08-27  9:46 UTC
  To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi

The first patch is a code cleanup.
The second is the main one: it refactors the linkchain failure path to
fix a problem described in its commit message.

v1-->v2
 - update patch with Pavel's suggestion.
v2-->v3
 - move req->result initialization to a better place
 - add a helper for failing a link node

Hao Xu (2):
  io_uring: remove redundant req_set_fail()
  io_uring: fix failed linkchain code logic

 fs/io_uring.c | 62 ++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 47 insertions(+), 15 deletions(-)

-- 
2.24.4



* [PATCH 1/2] io_uring: remove redundant req_set_fail()
  2021-08-27  9:46 [PATCH for-5.15 v3 0/2] fix failed linkchain code logic Hao Xu
@ 2021-08-27  9:46 ` Hao Xu
  2021-08-27  9:46 ` [PATCH 2/2] io_uring: fix failed linkchain code logic Hao Xu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Hao Xu @ 2021-08-27  9:46 UTC
  To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi

req_set_fail() in io_submit_sqe() is redundant, remove it.
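
As a rough illustration of why the extra call changes nothing, here is a
small user-space model. It assumes that io_req_complete_failed() marks the
request as failed itself before posting the completion, which is what the
removal relies on. struct model_req and the model_* functions are
hypothetical stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical user-space stand-in for REQ_F_FAIL. */
#define MODEL_F_FAIL (1u << 0)

/* Hypothetical stand-in for struct io_kiocb. */
struct model_req {
	unsigned int flags;
	long result;
	int completed;
};

static void model_set_fail(struct model_req *req)
{
	req->flags |= MODEL_F_FAIL;
}

/* Assumed behaviour: failing completion sets the fail flag on its own. */
static void model_complete_failed(struct model_req *req, long res)
{
	assert(!req->completed);	/* a request completes only once */
	model_set_fail(req);
	req->result = res;
	req->completed = 1;
}

/* Returns 1 if the explicit extra call made no observable difference. */
static int extra_set_fail_is_redundant(void)
{
	struct model_req with = {0}, without = {0};

	model_set_fail(&with);	/* the call removed by this patch */
	model_complete_failed(&with, -ECANCELED);
	model_complete_failed(&without, -ECANCELED);

	return with.flags == without.flags &&
	       with.result == without.result &&
	       with.completed == without.completed;
}
```

Under that assumption, calling extra_set_fail_is_redundant() returns 1:
setting the flag twice leaves the request in the same state as setting it
once.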

Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
---
 fs/io_uring.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index d9534c72dc4b..3598319b1340 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6631,7 +6631,6 @@ static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
 fail_req:
 		if (link->head) {
 			/* fail even hard links since we don't submit */
-			req_set_fail(link->head);
 			io_req_complete_failed(link->head, -ECANCELED);
 			link->head = NULL;
 		}
-- 
2.24.4



* [PATCH 2/2] io_uring: fix failed linkchain code logic
  2021-08-27  9:46 [PATCH for-5.15 v3 0/2] fix failed linkchain code logic Hao Xu
  2021-08-27  9:46 ` [PATCH 1/2] io_uring: remove redundant req_set_fail() Hao Xu
@ 2021-08-27  9:46 ` Hao Xu
  2021-08-27 11:01 ` [PATCH for-5.15 v3 0/2] " Pavel Begunkov
  2021-08-27 13:27 ` Jens Axboe
  3 siblings, 0 replies; 10+ messages in thread
From: Hao Xu @ 2021-08-27  9:46 UTC
  To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi

Given a linkchain like this:
req0(link_flag)-->req1(link_flag)-->...-->reqn(no link_flag)

There is a problem:
 - if the submission of an intermediate linked req such as req1 fails,
   the reqs after it won't be cancelled.

   - sqpoll disabled: probably fine, since users can see req1's error
     and stop submitting the following sqes.

   - sqpoll enabled: definitely a problem, since the following sqes will
     be submitted in the next round.

The solution is to refactor the code logic to:
 - if a linked req's submission fails, just mark it and the head (if it
   exists) as REQ_F_FAIL. Leverage req->result to indicate whether it
   failed or was cancelled.
 - submit or fail the whole chain when we come to the end of it.
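
The two steps above can be sketched as a small user-space model. This is
only an illustration under simplifying assumptions: struct model_req, the
MODEL_F_* flags and the model_* functions are hypothetical stand-ins, every
request that is not failed is treated as completing with 0, and the early
return for a failing unlinked request is not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_F_LINK (1u << 0)	/* hypothetical stand-in for REQ_F_LINK */
#define MODEL_F_FAIL (1u << 1)	/* hypothetical stand-in for REQ_F_FAIL */

struct model_req {
	unsigned int flags;
	long prep_ret;	/* input: 0, or the -errno that preparing this req yields */
	long result;	/* output: value the model would post in the CQE */
};

static void model_fail_link_node(struct model_req *req, long res)
{
	req->flags |= MODEL_F_FAIL;
	req->result = res;
}

/* Flush reqs[head..last]: fail them all if the head is marked, else "run". */
static void model_flush_chain(struct model_req *reqs, size_t head, size_t last)
{
	int failed = (reqs[head].flags & MODEL_F_FAIL) != 0;

	for (size_t i = head; i <= last; i++) {
		if (!failed)
			reqs[i].result = 0;		/* pretend it ran fine */
		else if (!(reqs[i].flags & MODEL_F_FAIL))
			reqs[i].result = -ECANCELED;	/* cancelled, not failed */
		/* else: keep the error this req recorded for itself */
	}
}

static void model_submit(struct model_req *reqs, size_t n)
{
	size_t head = 0;
	int in_chain = 0;

	assert(reqs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!in_chain) {
			head = i;
			in_chain = 1;
		}
		if (reqs[i].prep_ret) {
			/* mark the head first, then record the real error */
			if (!(reqs[head].flags & MODEL_F_FAIL))
				model_fail_link_node(&reqs[head], -ECANCELED);
			model_fail_link_node(&reqs[i], reqs[i].prep_ret);
		}
		if (!(reqs[i].flags & MODEL_F_LINK)) {
			/* end of the chain: decide the fate of all of it */
			model_flush_chain(reqs, head, i);
			in_chain = 0;
		}
	}
	if (in_chain)	/* chain left open at the end of the batch */
		model_flush_chain(reqs, head, n - 1);
}
```

For a chain req0 --> req1 --> req2 --> req3 where only req1 fails to
prepare with -EINVAL, the model reports -EINVAL for req1 and -ECANCELED
for the other three, including the reqs queued after the failure.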

Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
---
 fs/io_uring.c | 61 +++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 47 insertions(+), 14 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 3598319b1340..4c83ec227d85 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1175,6 +1175,12 @@ static inline void req_set_fail(struct io_kiocb *req)
 	req->flags |= REQ_F_FAIL;
 }
 
+static inline void req_fail_link_node(struct io_kiocb *req, int res)
+{
+	req_set_fail(req);
+	req->result = res;
+}
+
 static void io_ring_ctx_ref_free(struct percpu_ref *ref)
 {
 	struct io_ring_ctx *ctx = container_of(ref, struct io_ring_ctx, refs);
@@ -1931,11 +1937,16 @@ static void io_fail_links(struct io_kiocb *req)
 
 	req->link = NULL;
 	while (link) {
+		long res = -ECANCELED;
+
+		if (link->flags & REQ_F_FAIL)
+			res = link->result;
+
 		nxt = link->link;
 		link->link = NULL;
 
 		trace_io_uring_fail_link(req, link);
-		io_cqring_fill_event(link->ctx, link->user_data, -ECANCELED, 0);
+		io_cqring_fill_event(link->ctx, link->user_data, res, 0);
 		io_put_req_deferred(link);
 		link = nxt;
 	}
@@ -6519,8 +6530,10 @@ static inline void io_queue_sqe(struct io_kiocb *req)
 	if (unlikely(req->ctx->drain_active) && io_drain_req(req))
 		return;
 
-	if (likely(!(req->flags & REQ_F_FORCE_ASYNC))) {
+	if (likely(!(req->flags & (REQ_F_FORCE_ASYNC | REQ_F_FAIL)))) {
 		__io_queue_sqe(req);
+	} else if (req->flags & REQ_F_FAIL) {
+		io_req_complete_failed(req, req->result);
 	} else {
 		int ret = io_req_prep_async(req);
 
@@ -6629,19 +6642,34 @@ static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
 	ret = io_init_req(ctx, req, sqe);
 	if (unlikely(ret)) {
 fail_req:
+		/* fail even hard links since we don't submit */
 		if (link->head) {
-			/* fail even hard links since we don't submit */
-			io_req_complete_failed(link->head, -ECANCELED);
-			link->head = NULL;
+			/*
+			 * we can judge a link req is failed or cancelled by if
+			 * REQ_F_FAIL is set, but the head is an exception since
+			 * it may be set REQ_F_FAIL because of other req's failure
+			 * so let's leverage req->result to distinguish if a head
+			 * is set REQ_F_FAIL because of its failure or other req's
+			 * failure so that we can set the correct ret code for it.
+			 * init result here to avoid affecting the normal path.
+			 */
+			if (!(link->head->flags & REQ_F_FAIL))
+				req_fail_link_node(link->head, -ECANCELED);
+		} else if (!(req->flags & (REQ_F_LINK | REQ_F_HARDLINK))) {
+			/*
+			 * the current req is a normal req, we should return
+			 * error and thus break the submittion loop.
+			 */
+			io_req_complete_failed(req, ret);
+			return ret;
 		}
-		io_req_complete_failed(req, ret);
-		return ret;
+		req_fail_link_node(req, ret);
+	} else {
+		ret = io_req_prep(req, sqe);
+		if (unlikely(ret))
+			goto fail_req;
 	}
 
-	ret = io_req_prep(req, sqe);
-	if (unlikely(ret))
-		goto fail_req;
-
 	/* don't need @sqe from now on */
 	trace_io_uring_submit_sqe(ctx, req, req->opcode, req->user_data,
 				  req->flags, true,
@@ -6657,9 +6685,14 @@ static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
 	if (link->head) {
 		struct io_kiocb *head = link->head;
 
-		ret = io_req_prep_async(req);
-		if (unlikely(ret))
-			goto fail_req;
+		if (!(req->flags & REQ_F_FAIL)) {
+			ret = io_req_prep_async(req);
+			if (unlikely(ret)) {
+				req_fail_link_node(req, ret);
+				if (!(head->flags & REQ_F_FAIL))
+					req_fail_link_node(head, -ECANCELED);
+			}
+		}
 		trace_io_uring_link(ctx, req, head);
 		link->last->link = req;
 		link->last = req;
-- 
2.24.4



* Re: [PATCH for-5.15 v3 0/2] fix failed linkchain code logic
  2021-08-27  9:46 [PATCH for-5.15 v3 0/2] fix failed linkchain code logic Hao Xu
  2021-08-27  9:46 ` [PATCH 1/2] io_uring: remove redundant req_set_fail() Hao Xu
  2021-08-27  9:46 ` [PATCH 2/2] io_uring: fix failed linkchain code logic Hao Xu
@ 2021-08-27 11:01 ` Pavel Begunkov
  2021-08-27 13:27 ` Jens Axboe
  3 siblings, 0 replies; 10+ messages in thread
From: Pavel Begunkov @ 2021-08-27 11:01 UTC
  To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi

On 8/27/21 10:46 AM, Hao Xu wrote:
> The first patch is a code cleanup.
> The second is the main one: it refactors the linkchain failure path to
> fix a problem described in its commit message.

Looks good, thanks

Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>

> v1-->v2
>  - update patch with Pavel's suggestion.
> v2-->v3
>  - move req->result initialization to a better place
>  - add a helper for failing a link node
> 
> Hao Xu (2):
>   io_uring: remove redundant req_set_fail()
>   io_uring: fix failed linkchain code logic
> 
>  fs/io_uring.c | 62 ++++++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 47 insertions(+), 15 deletions(-)
> 

-- 
Pavel Begunkov


* Re: [PATCH for-5.15 v3 0/2] fix failed linkchain code logic
  2021-08-27  9:46 [PATCH for-5.15 v3 0/2] fix failed linkchain code logic Hao Xu
                   ` (2 preceding siblings ...)
  2021-08-27 11:01 ` [PATCH for-5.15 v3 0/2] " Pavel Begunkov
@ 2021-08-27 13:27 ` Jens Axboe
  2021-08-27 17:04   ` Hao Xu
  3 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2021-08-27 13:27 UTC
  To: Hao Xu; +Cc: io-uring, Pavel Begunkov, Joseph Qi

On 8/27/21 3:46 AM, Hao Xu wrote:
> The first patch is a code cleanup.
> The second is the main one: it refactors the linkchain failure path to
> fix a problem described in its commit message.

Thanks for pulling this one to completion - applied!

-- 
Jens Axboe



* Re: [PATCH for-5.15 v3 0/2] fix failed linkchain code logic
  2021-08-27 13:27 ` Jens Axboe
@ 2021-08-27 17:04   ` Hao Xu
  0 siblings, 0 replies; 10+ messages in thread
From: Hao Xu @ 2021-08-27 17:04 UTC
  To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi

On 2021/8/27 9:27 PM, Jens Axboe wrote:
> On 8/27/21 3:46 AM, Hao Xu wrote:
>> The first patch is a code cleanup.
>> The second is the main one: it refactors the linkchain failure path to
>> fix a problem described in its commit message.
> 
> Thanks for pulling this one to completion - applied!
> 
sorry for the delay.


* Re: [PATCH 2/2] io_uring: fix failed linkchain code logic
  2021-08-23 11:02   ` Pavel Begunkov
  2021-08-23 17:12     ` Pavel Begunkov
@ 2021-08-23 18:45     ` Hao Xu
  1 sibling, 0 replies; 10+ messages in thread
From: Hao Xu @ 2021-08-23 18:45 UTC
  To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi

On 2021/8/23 7:02 PM, Pavel Begunkov wrote:
> On 8/23/21 4:25 AM, Hao Xu wrote:
>> Given a linkchain like this:
>> req0(link_flag)-->req1(link_flag)-->...-->reqn(no link_flag)
>>
>> There is a problem:
> >>   - if some intermediate linked req like req1's submission fails, reqs
>>     after it won't be cancelled.
>>
>>     - sqpoll disabled: maybe it's ok since users can get the error info
>>       of req1 and stop submitting the following sqes.
>>
>>     - sqpoll enabled: definitely a problem, the following sqes will be
>>       submitted in the next round.
>>
>> The solution is to refactor the code logic to:
> >>   - if a linked req's submission fails, just mark it and the head (if it
>>     exists) as REQ_F_FAIL. Leverage req->result to indicate whether it
>>     is failed or cancelled.
>>   - submit or fail the whole chain when we come to the end of it.
> 
> This looks good to me, a couple of comments below.
> 
> 
>> Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
>> ---
>>   fs/io_uring.c | 61 +++++++++++++++++++++++++++++++++++++--------------
>>   1 file changed, 45 insertions(+), 16 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 44b1b2b58e6a..9ae8f2a5c584 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -1776,8 +1776,6 @@ static void io_preinit_req(struct io_kiocb *req, struct io_ring_ctx *ctx)
>>   	req->ctx = ctx;
>>   	req->link = NULL;
>>   	req->async_data = NULL;
>> -	/* not necessary, but safer to zero */
>> -	req->result = 0;
> 
> Please leave it. I'm afraid of leaking stack to userspace because
> ->result juggling looks prone to errors. And preinit is pretty cold
> anyway.
> 
> [...]
> 
>>   
>> @@ -6637,19 +6644,25 @@ static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
>>   	ret = io_init_req(ctx, req, sqe);
>>   	if (unlikely(ret)) {
>>   fail_req:
>> +		/* fail even hard links since we don't submit */
>>   		if (link->head) {
>> -			/* fail even hard links since we don't submit */
>> -			io_req_complete_failed(link->head, -ECANCELED);
>> -			link->head = NULL;
>> +			req_set_fail(link->head);
> 
> I think it will be more reliable if we set head->result here, ...
Sure, I'll send v3 later.
> 
> if (!(link->head->flags & REQ_F_FAIL))
> 	link->head->result = -ECANCELED;
> 
>> -		ret = io_req_prep_async(req);
>> -		if (unlikely(ret))
>> -			goto fail_req;
>> +		if (!(req->flags & REQ_F_FAIL)) {
>> +			ret = io_req_prep_async(req);
>> +			if (unlikely(ret)) {
>> +				req->result = ret;
>> +				req_set_fail(req);
>> +				req_set_fail(link->head);
> 
> ... and here (a helper?), ...
> 
>> +			}
>> +		}
>>   		trace_io_uring_link(ctx, req, head);
>>   		link->last->link = req;
>>   		link->last = req;
>> @@ -6681,6 +6699,17 @@ static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
>>   		if (req->flags & (REQ_F_LINK | REQ_F_HARDLINK)) {
>>   			link->head = req;
>>   			link->last = req;
>> +			/*
>> +			 * we can judge a link req is failed or cancelled by if
>> +			 * REQ_F_FAIL is set, but the head is an exception since
>> +			 * it may be set REQ_F_FAIL because of other req's failure
>> +			 * so let's leverage req->result to distinguish if a head
>> +			 * is set REQ_F_FAIL because of its failure or other req's
>> +			 * failure so that we can set the correct ret code for it.
>> +			 * init result here to avoid affecting the normal path.
>> +			 */
>> +			if (!(req->flags & REQ_F_FAIL))
>> +				req->result = 0;
> 
> ... instead of delaying to this point. Just IMHO, it's easier to look
> after the code when it's set on the spot; otherwise it may be easy to
> break or forget something while changing bits around.
> 
> 
>>   		} else {
>>   			io_queue_sqe(req);
>>   		}
>>
> 



* Re: [PATCH 2/2] io_uring: fix failed linkchain code logic
  2021-08-23 11:02   ` Pavel Begunkov
@ 2021-08-23 17:12     ` Pavel Begunkov
  2021-08-23 18:45     ` Hao Xu
  1 sibling, 0 replies; 10+ messages in thread
From: Pavel Begunkov @ 2021-08-23 17:12 UTC
  To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi

On 8/23/21 12:02 PM, Pavel Begunkov wrote:
> On 8/23/21 4:25 AM, Hao Xu wrote:
>> Given a linkchain like this:
>> req0(link_flag)-->req1(link_flag)-->...-->reqn(no link_flag)
>>
>> There is a problem:
> >>  - if some intermediate linked req like req1's submission fails, reqs
>>    after it won't be cancelled.
>>
>>    - sqpoll disabled: maybe it's ok since users can get the error info
>>      of req1 and stop submitting the following sqes.
>>
>>    - sqpoll enabled: definitely a problem, the following sqes will be
>>      submitted in the next round.
>>
>> The solution is to refactor the code logic to:
> >>  - if a linked req's submission fails, just mark it and the head (if it
>>    exists) as REQ_F_FAIL. Leverage req->result to indicate whether it
>>    is failed or cancelled.
>>  - submit or fail the whole chain when we come to the end of it.
> 
> This looks good to me, a couple of comments below.
> 
> 
>> Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
>> ---
>>  fs/io_uring.c | 61 +++++++++++++++++++++++++++++++++++++--------------
>>  1 file changed, 45 insertions(+), 16 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 44b1b2b58e6a..9ae8f2a5c584 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -1776,8 +1776,6 @@ static void io_preinit_req(struct io_kiocb *req, struct io_ring_ctx *ctx)
>>  	req->ctx = ctx;
>>  	req->link = NULL;
>>  	req->async_data = NULL;
>> -	/* not necessary, but safer to zero */
>> -	req->result = 0;
> 
> Please leave it. I'm afraid of leaking stack to userspace because
                                         ^^^^^
Don't know why I called it "stack", just kernel memory/data

> ->result juggling looks prone to errors. And preinit is pretty cold
> anyway.
> 

-- 
Pavel Begunkov


* Re: [PATCH 2/2] io_uring: fix failed linkchain code logic
  2021-08-23  3:25 ` [PATCH 2/2] io_uring: " Hao Xu
@ 2021-08-23 11:02   ` Pavel Begunkov
  2021-08-23 17:12     ` Pavel Begunkov
  2021-08-23 18:45     ` Hao Xu
  0 siblings, 2 replies; 10+ messages in thread
From: Pavel Begunkov @ 2021-08-23 11:02 UTC
  To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi

On 8/23/21 4:25 AM, Hao Xu wrote:
> Given a linkchain like this:
> req0(link_flag)-->req1(link_flag)-->...-->reqn(no link_flag)
> 
> There is a problem:
>  - if some intermediate linked req like req1's submission fails, reqs
>    after it won't be cancelled.
> 
>    - sqpoll disabled: maybe it's ok since users can get the error info
>      of req1 and stop submitting the following sqes.
> 
>    - sqpoll enabled: definitely a problem, the following sqes will be
>      submitted in the next round.
> 
> The solution is to refactor the code logic to:
>  - if a linked req's submission fails, just mark it and the head (if it
>    exists) as REQ_F_FAIL. Leverage req->result to indicate whether it
>    is failed or cancelled.
>  - submit or fail the whole chain when we come to the end of it.

This looks good to me, a couple of comments below.


> Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
> ---
>  fs/io_uring.c | 61 +++++++++++++++++++++++++++++++++++++--------------
>  1 file changed, 45 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 44b1b2b58e6a..9ae8f2a5c584 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -1776,8 +1776,6 @@ static void io_preinit_req(struct io_kiocb *req, struct io_ring_ctx *ctx)
>  	req->ctx = ctx;
>  	req->link = NULL;
>  	req->async_data = NULL;
> -	/* not necessary, but safer to zero */
> -	req->result = 0;

Please leave it. I'm afraid of leaking stack to userspace because
->result juggling looks prone to errors. And preinit is pretty cold
anyway.
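
To make the concern concrete, here is a minimal user-space sketch. It
assumes a request object is recycled and that some error path reports
->result without ever having assigned it; struct model_req and the model_*
functions are hypothetical stand-ins for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct io_kiocb. */
struct model_req {
	long result;
};

/* Mirrors the defensive zeroing this review asks to keep in preinit. */
static void model_preinit(struct model_req *req)
{
	assert(req != NULL);
	req->result = 0;
}

/*
 * Return what a completion path would report if it read ->result without
 * any later code having assigned it.
 */
static long model_reported_result(long stale, int zero_in_preinit)
{
	struct model_req slot;

	slot.result = stale;	/* leftover contents of a recycled object */
	if (zero_in_preinit)
		model_preinit(&slot);
	return slot.result;
}
```

With the zeroing kept, the stale value never reaches the caller; without
it, whatever was left in the object is reported as the result.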

[...]

>  
> @@ -6637,19 +6644,25 @@ static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
>  	ret = io_init_req(ctx, req, sqe);
>  	if (unlikely(ret)) {
>  fail_req:
> +		/* fail even hard links since we don't submit */
>  		if (link->head) {
> -			/* fail even hard links since we don't submit */
> -			io_req_complete_failed(link->head, -ECANCELED);
> -			link->head = NULL;
> +			req_set_fail(link->head);

I think it will be more reliable if we set head->result here, ...

if (!(link->head->flags & REQ_F_FAIL))
	link->head->result = -ECANCELED;

> -		ret = io_req_prep_async(req);
> -		if (unlikely(ret))
> -			goto fail_req;
> +		if (!(req->flags & REQ_F_FAIL)) {
> +			ret = io_req_prep_async(req);
> +			if (unlikely(ret)) {
> +				req->result = ret;
> +				req_set_fail(req);
> +				req_set_fail(link->head);

... and here (a helper?), ...

> +			}
> +		}
>  		trace_io_uring_link(ctx, req, head);
>  		link->last->link = req;
>  		link->last = req;
> @@ -6681,6 +6699,17 @@ static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
>  		if (req->flags & (REQ_F_LINK | REQ_F_HARDLINK)) {
>  			link->head = req;
>  			link->last = req;
> +			/*
> +			 * we can judge a link req is failed or cancelled by if
> +			 * REQ_F_FAIL is set, but the head is an exception since
> +			 * it may be set REQ_F_FAIL because of other req's failure
> +			 * so let's leverage req->result to distinguish if a head
> +			 * is set REQ_F_FAIL because of its failure or other req's
> +			 * failure so that we can set the correct ret code for it.
> +			 * init result here to avoid affecting the normal path.
> +			 */
> +			if (!(req->flags & REQ_F_FAIL))
> +				req->result = 0;

... instead of delaying to this point. Just IMHO, it's easier to look
after the code when it's set on the spot; otherwise it may be easy to
break or forget something while changing bits around.
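
A minimal sketch of the "set it on the spot" idea, written as a user-space
model. The names struct model_req, MODEL_F_FAIL, model_fail_link_node() and
model_fail_head_once() are hypothetical; the point is only that the flag
and the result are always written together, and that the head keeps the
first error it recorded.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_F_FAIL (1u << 0)	/* hypothetical stand-in for REQ_F_FAIL */

/* Hypothetical stand-in for struct io_kiocb. */
struct model_req {
	unsigned int flags;
	long result;
};

/* Set the fail flag and the result in one place so they cannot diverge. */
static void model_fail_link_node(struct model_req *req, long res)
{
	assert(res < 0);	/* failure codes are negative errno values */
	req->flags |= MODEL_F_FAIL;
	req->result = res;
}

/* Mark the head cancelled only if it has not already recorded a failure. */
static void model_fail_head_once(struct model_req *head)
{
	if (!(head->flags & MODEL_F_FAIL))
		model_fail_link_node(head, -ECANCELED);
}
```

If the head already failed with its own error, a later failure elsewhere
in the chain leaves that error intact; a head that was fine so far gets
-ECANCELED.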


>  		} else {
>  			io_queue_sqe(req);
>  		}
> 

-- 
Pavel Begunkov


* [PATCH 2/2] io_uring: fix failed linkchain code logic
  2021-08-23  3:25 [PATCH for-5.15 v2 " Hao Xu
@ 2021-08-23  3:25 ` Hao Xu
  2021-08-23 11:02   ` Pavel Begunkov
  0 siblings, 1 reply; 10+ messages in thread
From: Hao Xu @ 2021-08-23  3:25 UTC
  To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi

Given a linkchain like this:
req0(link_flag)-->req1(link_flag)-->...-->reqn(no link_flag)

There is a problem:
 - if some intermediate linked req like req1's submission fails, reqs
   after it won't be cancelled.

   - sqpoll disabled: maybe it's ok since users can get the error info
     of req1 and stop submitting the following sqes.

   - sqpoll enabled: definitely a problem, the following sqes will be
     submitted in the next round.

The solution is to refactor the code logic to:
 - if a linked req's submission fails, just mark it and the head (if it
   exists) as REQ_F_FAIL. Leverage req->result to indicate whether it
   is failed or cancelled.
 - submit or fail the whole chain when we come to the end of it.

Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
---
 fs/io_uring.c | 61 +++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 45 insertions(+), 16 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 44b1b2b58e6a..9ae8f2a5c584 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1776,8 +1776,6 @@ static void io_preinit_req(struct io_kiocb *req, struct io_ring_ctx *ctx)
 	req->ctx = ctx;
 	req->link = NULL;
 	req->async_data = NULL;
-	/* not necessary, but safer to zero */
-	req->result = 0;
 }
 
 static void io_flush_cached_locked_reqs(struct io_ring_ctx *ctx,
@@ -1931,11 +1929,16 @@ static void io_fail_links(struct io_kiocb *req)
 
 	req->link = NULL;
 	while (link) {
+		long res = -ECANCELED;
+
+		if (link->flags & REQ_F_FAIL)
+			res = link->result;
+
 		nxt = link->link;
 		link->link = NULL;
 
 		trace_io_uring_fail_link(req, link);
-		io_cqring_fill_event(link->ctx, link->user_data, -ECANCELED, 0);
+		io_cqring_fill_event(link->ctx, link->user_data, res, 0);
 		io_put_req_deferred(link);
 		link = nxt;
 	}
@@ -6527,8 +6530,12 @@ static inline void io_queue_sqe(struct io_kiocb *req)
 	if (unlikely(req->ctx->drain_active) && io_drain_req(req))
 		return;
 
-	if (likely(!(req->flags & REQ_F_FORCE_ASYNC))) {
+	if (likely(!(req->flags & (REQ_F_FORCE_ASYNC | REQ_F_FAIL)))) {
 		__io_queue_sqe(req);
+	} else if (req->flags & REQ_F_FAIL) {
+		long res = req->result ? : -ECANCELED;
+
+		io_req_complete_failed(req, res);
 	} else {
 		int ret = io_req_prep_async(req);
 
@@ -6637,19 +6644,25 @@ static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
 	ret = io_init_req(ctx, req, sqe);
 	if (unlikely(ret)) {
 fail_req:
+		/* fail even hard links since we don't submit */
 		if (link->head) {
-			/* fail even hard links since we don't submit */
-			io_req_complete_failed(link->head, -ECANCELED);
-			link->head = NULL;
+			req_set_fail(link->head);
+		} else if (!(req->flags & (REQ_F_LINK | REQ_F_HARDLINK))) {
+			/*
+			 * the current req is a normal req, we should return
+			 * error and thus break the submittion loop.
+			 */
+			io_req_complete_failed(req, ret);
+			return ret;
 		}
-		io_req_complete_failed(req, ret);
-		return ret;
+		req_set_fail(req);
+		req->result = ret;
+	} else {
+		ret = io_req_prep(req, sqe);
+		if (unlikely(ret))
+			goto fail_req;
 	}
 
-	ret = io_req_prep(req, sqe);
-	if (unlikely(ret))
-		goto fail_req;
-
 	/* don't need @sqe from now on */
 	trace_io_uring_submit_sqe(ctx, req, req->opcode, req->user_data,
 				  req->flags, true,
@@ -6665,9 +6678,14 @@ static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
 	if (link->head) {
 		struct io_kiocb *head = link->head;
 
-		ret = io_req_prep_async(req);
-		if (unlikely(ret))
-			goto fail_req;
+		if (!(req->flags & REQ_F_FAIL)) {
+			ret = io_req_prep_async(req);
+			if (unlikely(ret)) {
+				req->result = ret;
+				req_set_fail(req);
+				req_set_fail(link->head);
+			}
+		}
 		trace_io_uring_link(ctx, req, head);
 		link->last->link = req;
 		link->last = req;
@@ -6681,6 +6699,17 @@ static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
 		if (req->flags & (REQ_F_LINK | REQ_F_HARDLINK)) {
 			link->head = req;
 			link->last = req;
+			/*
+			 * we can judge a link req is failed or cancelled by if
+			 * REQ_F_FAIL is set, but the head is an exception since
+			 * it may be set REQ_F_FAIL because of other req's failure
+			 * so let's leverage req->result to distinguish if a head
+			 * is set REQ_F_FAIL because of its failure or other req's
+			 * failure so that we can set the correct ret code for it.
+			 * init result here to avoid affecting the normal path.
+			 */
+			if (!(req->flags & REQ_F_FAIL))
+				req->result = 0;
 		} else {
 			io_queue_sqe(req);
 		}
-- 
2.24.4



Thread overview: 10+ messages
2021-08-27  9:46 [PATCH for-5.15 v3 0/2] fix failed linkchain code logic Hao Xu
2021-08-27  9:46 ` [PATCH 1/2] io_uring: remove redundant req_set_fail() Hao Xu
2021-08-27  9:46 ` [PATCH 2/2] io_uring: fix failed linkchain code logic Hao Xu
2021-08-27 11:01 ` [PATCH for-5.15 v3 0/2] " Pavel Begunkov
2021-08-27 13:27 ` Jens Axboe
2021-08-27 17:04   ` Hao Xu
  -- strict thread matches above, loose matches on Subject: below --
2021-08-23  3:25 [PATCH for-5.15 v2 " Hao Xu
2021-08-23  3:25 ` [PATCH 2/2] io_uring: " Hao Xu
2021-08-23 11:02   ` Pavel Begunkov
2021-08-23 17:12     ` Pavel Begunkov
2021-08-23 18:45     ` Hao Xu
