* [PATCH V3] io_uring: consider the overflow of sequence for timeout req
@ 2019-10-15 13:59 yangerkun
2019-10-15 14:52 ` Jens Axboe
2019-10-16 1:35 ` yangerkun
0 siblings, 2 replies; 5+ messages in thread
From: yangerkun @ 2019-10-15 13:59 UTC (permalink / raw)
To: axboe, linux-block; +Cc: yangerkun, yi.zhang, houtao1
Currently we compute the sequence of a timeout as 'req->sequence =
ctx->cached_sq_head + count - 1' and find the right place to insert it
in timeout_list by comparing the number of requests each timeout still
expects to complete. This does not take overflow into account:

1. ctx->cached_sq_head + count - 1 may overflow, so a new timeout req
with a bigger count can end up with a smaller req->sequence.

2. The current cached_sq_head may have overflowed relative to an earlier
req, which also leaves the new timeout req with a smaller req->sequence.

Either overflow can misorder timeout_list, so that timeouts complete in
the wrong order. Fix it by reusing req->submit.sequence to store the
count, and by changing the insertion sort logic in io_timeout.
Signed-off-by: yangerkun <yangerkun@huawei.com>
---
fs/io_uring.c | 27 +++++++++++++++++++++------
1 file changed, 21 insertions(+), 6 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 76fdbe84aff5..c9512da06973 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1884,7 +1884,7 @@ static enum hrtimer_restart io_timeout_fn(struct hrtimer *timer)
 
 static int io_timeout(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
-	unsigned count, req_dist, tail_index;
+	unsigned count;
 	struct io_ring_ctx *ctx = req->ctx;
 	struct list_head *entry;
 	struct timespec64 ts;
@@ -1907,21 +1907,36 @@ static int io_timeout(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 		count = 1;
 
 	req->sequence = ctx->cached_sq_head + count - 1;
+	/* reuse it to store the count */
+	req->submit.sequence = count;
 	req->flags |= REQ_F_TIMEOUT;
 
 	/*
 	 * Insertion sort, ensuring the first entry in the list is always
 	 * the one we need first.
 	 */
-	tail_index = ctx->cached_cq_tail - ctx->rings->sq_dropped;
-	req_dist = req->sequence - tail_index;
 	spin_lock_irq(&ctx->completion_lock);
 	list_for_each_prev(entry, &ctx->timeout_list) {
 		struct io_kiocb *nxt = list_entry(entry, struct io_kiocb, list);
-		unsigned dist;
+		unsigned nxt_sq_head;
+		long long tmp, tmp_nxt;
 
-		dist = nxt->sequence - tail_index;
-		if (req_dist >= dist)
+		/*
+		 * Since cached_sq_head + count - 1 can overflow, use type long
+		 * long to store it.
+		 */
+		tmp = (long long)ctx->cached_sq_head + count - 1;
+		nxt_sq_head = nxt->sequence - nxt->submit.sequence + 1;
+		tmp_nxt = (long long)nxt_sq_head + nxt->submit.sequence - 1;
+
+		/*
+		 * cached_sq_head may overflow, and it will never overflow twice
+		 * once there is some timeout req still be valid.
+		 */
+		if (ctx->cached_sq_head < nxt_sq_head)
+			tmp_nxt += UINT_MAX;
+
+		if (tmp >= tmp_nxt)
 			break;
 	}
 	list_add(&req->list, entry);
--
2.17.2
* Re: [PATCH V3] io_uring: consider the overflow of sequence for timeout req
2019-10-15 13:59 [PATCH V3] io_uring: consider the overflow of sequence for timeout req yangerkun
@ 2019-10-15 14:52 ` Jens Axboe
2019-10-16 1:35 ` yangerkun
1 sibling, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2019-10-15 14:52 UTC (permalink / raw)
To: yangerkun, linux-block; +Cc: yi.zhang, houtao1
On 10/15/19 7:59 AM, yangerkun wrote:
> Now we recalculate the sequence of timeout with 'req->sequence =
> ctx->cached_sq_head + count - 1', judge the right place to insert
> for timeout_list by compare the number of request we still expected for
> completion. But we have not consider about the situation of overflow:
>
> 1. ctx->cached_sq_head + count - 1 may overflow. And a bigger count for
> the new timeout req can have a small req->sequence.
>
> 2. cached_sq_head of now may overflow compare with before req. And it
> will lead the timeout req with small req->sequence.
>
> This overflow will lead to the misorder of timeout_list, which can lead
> to the wrong order of the completion of timeout_list. Fix it by reuse
> req->submit.sequence to store the count, and change the logic of
> inserting sort in io_timeout.
Thanks, this looks great. Applied for 5.4.
--
Jens Axboe
* Re: [PATCH V3] io_uring: consider the overflow of sequence for timeout req
2019-10-15 13:59 [PATCH V3] io_uring: consider the overflow of sequence for timeout req yangerkun
2019-10-15 14:52 ` Jens Axboe
@ 2019-10-16 1:35 ` yangerkun
2019-10-16 1:45 ` Jens Axboe
1 sibling, 1 reply; 5+ messages in thread
From: yangerkun @ 2019-10-16 1:35 UTC (permalink / raw)
To: axboe, linux-block; +Cc: yi.zhang, houtao1
On 2019/10/15 21:59, yangerkun wrote:
> Now we recalculate the sequence of timeout with 'req->sequence =
> ctx->cached_sq_head + count - 1', judge the right place to insert
> for timeout_list by compare the number of request we still expected for
> completion. But we have not consider about the situation of overflow:
>
> 1. ctx->cached_sq_head + count - 1 may overflow. And a bigger count for
> the new timeout req can have a small req->sequence.
>
> 2. cached_sq_head of now may overflow compare with before req. And it
> will lead the timeout req with small req->sequence.
>
> This overflow will lead to the misorder of timeout_list, which can lead
> to the wrong order of the completion of timeout_list. Fix it by reuse
> req->submit.sequence to store the count, and change the logic of
> inserting sort in io_timeout.
>
> Signed-off-by: yangerkun <yangerkun@huawei.com>
> ---
> fs/io_uring.c | 27 +++++++++++++++++++++------
> 1 file changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 76fdbe84aff5..c9512da06973 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -1884,7 +1884,7 @@ static enum hrtimer_restart io_timeout_fn(struct hrtimer *timer)
>
> static int io_timeout(struct io_kiocb *req, const struct io_uring_sqe *sqe)
> {
> - unsigned count, req_dist, tail_index;
> + unsigned count;
> struct io_ring_ctx *ctx = req->ctx;
> struct list_head *entry;
> struct timespec64 ts;
> @@ -1907,21 +1907,36 @@ static int io_timeout(struct io_kiocb *req, const struct io_uring_sqe *sqe)
> count = 1;
>
> req->sequence = ctx->cached_sq_head + count - 1;
> + /* reuse it to store the count */
> + req->submit.sequence = count;
> req->flags |= REQ_F_TIMEOUT;
>
> /*
> * Insertion sort, ensuring the first entry in the list is always
> * the one we need first.
> */
> - tail_index = ctx->cached_cq_tail - ctx->rings->sq_dropped;
> - req_dist = req->sequence - tail_index;
> spin_lock_irq(&ctx->completion_lock);
> list_for_each_prev(entry, &ctx->timeout_list) {
> struct io_kiocb *nxt = list_entry(entry, struct io_kiocb, list);
> - unsigned dist;
> + unsigned nxt_sq_head;
> + long long tmp, tmp_nxt;
>
> - dist = nxt->sequence - tail_index;
> - if (req_dist >= dist)
> + /*
> + * Since cached_sq_head + count - 1 can overflow, use type long
> + * long to store it.
> + */
> + tmp = (long long)ctx->cached_sq_head + count - 1;
> + nxt_sq_head = nxt->sequence - nxt->submit.sequence + 1;
> + tmp_nxt = (long long)nxt_sq_head + nxt->submit.sequence - 1;
> +
> + /*
> + * cached_sq_head may overflow, and it will never overflow twice
> + * once there is some timeout req still be valid.
> + */
> + if (ctx->cached_sq_head < nxt_sq_head)
> + tmp_nxt += UINT_MAX;
I think there is a mistake here: UINT_MAX should be added to tmp, not to
tmp_nxt. Sorry about this.
> +
> + if (tmp >= tmp_nxt)
> break;
> }
> list_add(&req->list, entry);
>
* Re: [PATCH V3] io_uring: consider the overflow of sequence for timeout req
2019-10-16 1:35 ` yangerkun
@ 2019-10-16 1:45 ` Jens Axboe
2019-10-16 2:19 ` yangerkun
0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2019-10-16 1:45 UTC (permalink / raw)
To: yangerkun, linux-block; +Cc: yi.zhang, houtao1
On 10/15/19 7:35 PM, yangerkun wrote:
>
>
> On 2019/10/15 21:59, yangerkun wrote:
>> Now we recalculate the sequence of timeout with 'req->sequence =
>> ctx->cached_sq_head + count - 1', judge the right place to insert
>> for timeout_list by compare the number of request we still expected for
>> completion. But we have not consider about the situation of overflow:
>>
>> 1. ctx->cached_sq_head + count - 1 may overflow. And a bigger count for
>> the new timeout req can have a small req->sequence.
>>
>> 2. cached_sq_head of now may overflow compare with before req. And it
>> will lead the timeout req with small req->sequence.
>>
>> This overflow will lead to the misorder of timeout_list, which can lead
>> to the wrong order of the completion of timeout_list. Fix it by reuse
>> req->submit.sequence to store the count, and change the logic of
>> inserting sort in io_timeout.
>>
>> Signed-off-by: yangerkun <yangerkun@huawei.com>
>> ---
>> fs/io_uring.c | 27 +++++++++++++++++++++------
>> 1 file changed, 21 insertions(+), 6 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 76fdbe84aff5..c9512da06973 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -1884,7 +1884,7 @@ static enum hrtimer_restart io_timeout_fn(struct hrtimer *timer)
>>
>> static int io_timeout(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>> {
>> - unsigned count, req_dist, tail_index;
>> + unsigned count;
>> struct io_ring_ctx *ctx = req->ctx;
>> struct list_head *entry;
>> struct timespec64 ts;
>> @@ -1907,21 +1907,36 @@ static int io_timeout(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>> count = 1;
>>
>> req->sequence = ctx->cached_sq_head + count - 1;
>> + /* reuse it to store the count */
>> + req->submit.sequence = count;
>> req->flags |= REQ_F_TIMEOUT;
>>
>> /*
>> * Insertion sort, ensuring the first entry in the list is always
>> * the one we need first.
>> */
>> - tail_index = ctx->cached_cq_tail - ctx->rings->sq_dropped;
>> - req_dist = req->sequence - tail_index;
>> spin_lock_irq(&ctx->completion_lock);
>> list_for_each_prev(entry, &ctx->timeout_list) {
>> struct io_kiocb *nxt = list_entry(entry, struct io_kiocb, list);
>> - unsigned dist;
>> + unsigned nxt_sq_head;
>> + long long tmp, tmp_nxt;
>>
>> - dist = nxt->sequence - tail_index;
>> - if (req_dist >= dist)
>> + /*
>> + * Since cached_sq_head + count - 1 can overflow, use type long
>> + * long to store it.
>> + */
>> + tmp = (long long)ctx->cached_sq_head + count - 1;
>> + nxt_sq_head = nxt->sequence - nxt->submit.sequence + 1;
>> + tmp_nxt = (long long)nxt_sq_head + nxt->submit.sequence - 1;
>> +
>> + /*
>> + * cached_sq_head may overflow, and it will never overflow twice
>> + * once there is some timeout req still be valid.
>> + */
>> + if (ctx->cached_sq_head < nxt_sq_head)
>> + tmp_nxt += UINT_MAX;
>
> Maybe there is a mistake, it should be tmp. So sorry about this.
I ran it through the basic testing, but I guess that doesn't catch overflow
cases. Maybe we can come up with a test that does? It should be pretty simple
to set up an io_uring, post UINT_MAX - 10 nops (or something like that), then
do some timeout testing.

Just send an incremental patch to fix it.
--
Jens Axboe
* Re: [PATCH V3] io_uring: consider the overflow of sequence for timeout req
2019-10-16 1:45 ` Jens Axboe
@ 2019-10-16 2:19 ` yangerkun
0 siblings, 0 replies; 5+ messages in thread
From: yangerkun @ 2019-10-16 2:19 UTC (permalink / raw)
To: Jens Axboe, linux-block; +Cc: yi.zhang, houtao1
On 2019/10/16 9:45, Jens Axboe wrote:
> On 10/15/19 7:35 PM, yangerkun wrote:
>>
>>
>> On 2019/10/15 21:59, yangerkun wrote:
>>> Now we recalculate the sequence of timeout with 'req->sequence =
>>> ctx->cached_sq_head + count - 1', judge the right place to insert
>>> for timeout_list by compare the number of request we still expected for
>>> completion. But we have not consider about the situation of overflow:
>>>
>>> 1. ctx->cached_sq_head + count - 1 may overflow. And a bigger count for
>>> the new timeout req can have a small req->sequence.
>>>
>>> 2. cached_sq_head of now may overflow compare with before req. And it
>>> will lead the timeout req with small req->sequence.
>>>
>>> This overflow will lead to the misorder of timeout_list, which can lead
>>> to the wrong order of the completion of timeout_list. Fix it by reuse
>>> req->submit.sequence to store the count, and change the logic of
>>> inserting sort in io_timeout.
>>>
>>> Signed-off-by: yangerkun <yangerkun@huawei.com>
>>> ---
>>> fs/io_uring.c | 27 +++++++++++++++++++++------
>>> 1 file changed, 21 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>> index 76fdbe84aff5..c9512da06973 100644
>>> --- a/fs/io_uring.c
>>> +++ b/fs/io_uring.c
>>> @@ -1884,7 +1884,7 @@ static enum hrtimer_restart io_timeout_fn(struct hrtimer *timer)
>>>
>>> static int io_timeout(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>>> {
>>> - unsigned count, req_dist, tail_index;
>>> + unsigned count;
>>> struct io_ring_ctx *ctx = req->ctx;
>>> struct list_head *entry;
>>> struct timespec64 ts;
>>> @@ -1907,21 +1907,36 @@ static int io_timeout(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>>> count = 1;
>>>
>>> req->sequence = ctx->cached_sq_head + count - 1;
>>> + /* reuse it to store the count */
>>> + req->submit.sequence = count;
>>> req->flags |= REQ_F_TIMEOUT;
>>>
>>> /*
>>> * Insertion sort, ensuring the first entry in the list is always
>>> * the one we need first.
>>> */
>>> - tail_index = ctx->cached_cq_tail - ctx->rings->sq_dropped;
>>> - req_dist = req->sequence - tail_index;
>>> spin_lock_irq(&ctx->completion_lock);
>>> list_for_each_prev(entry, &ctx->timeout_list) {
>>> struct io_kiocb *nxt = list_entry(entry, struct io_kiocb, list);
>>> - unsigned dist;
>>> + unsigned nxt_sq_head;
>>> + long long tmp, tmp_nxt;
>>>
>>> - dist = nxt->sequence - tail_index;
>>> - if (req_dist >= dist)
>>> + /*
>>> + * Since cached_sq_head + count - 1 can overflow, use type long
>>> + * long to store it.
>>> + */
>>> + tmp = (long long)ctx->cached_sq_head + count - 1;
>>> + nxt_sq_head = nxt->sequence - nxt->submit.sequence + 1;
>>> + tmp_nxt = (long long)nxt_sq_head + nxt->submit.sequence - 1;
>>> +
>>> + /*
>>> + * cached_sq_head may overflow, and it will never overflow twice
>>> + * once there is some timeout req still be valid.
>>> + */
>>> + if (ctx->cached_sq_head < nxt_sq_head)
>>> + tmp_nxt += UINT_MAX;
>>
>> Maybe there is a mistake, it should be tmp. So sorry about this.
>
> I ran it through the basic testing, but I guess it doesn't catch overflow
> cases. Maybe we can come up with one? Should be pretty simple to setup a
> io_uring, post UINT_MAX - 10 nops (or something like that), then do some
> timeout testing.
>
Good idea! I will try to add a testcase for this in liburing.
> Just send an incremental patch to fix it.
OK, will send the fix patch!
>