Linux-Block Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH] io_uring: remove wait loop spurious wakeups
@ 2019-10-07 23:18 Pavel Begunkov (Silence)
  2019-10-08  3:16 ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Pavel Begunkov (Silence) @ 2019-10-07 23:18 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-kernel; +Cc: Pavel Begunkov

From: Pavel Begunkov <asml.silence@gmail.com>

Any changes interesting to tasks waiting in io_cqring_wait() are
commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
also tries to do that but with no reason, that means spurious wakeups
every io_free_req() and io_uring_enter().

Just use percpu_ref_put() instead.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index c934f91c51e9..89d77a626063 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -591,14 +591,6 @@ static void io_cqring_add_event(struct io_ring_ctx *ctx, u64 user_data,
 	io_cqring_ev_posted(ctx);
 }
 
-static void io_ring_drop_ctx_refs(struct io_ring_ctx *ctx, unsigned refs)
-{
-	percpu_ref_put_many(&ctx->refs, refs);
-
-	if (waitqueue_active(&ctx->wait))
-		wake_up(&ctx->wait);
-}
-
 static struct io_kiocb *io_get_req(struct io_ring_ctx *ctx,
 				   struct io_submit_state *state)
 {
@@ -646,7 +638,7 @@ static struct io_kiocb *io_get_req(struct io_ring_ctx *ctx,
 	req->result = 0;
 	return req;
 out:
-	io_ring_drop_ctx_refs(ctx, 1);
+	percpu_ref_put(&ctx->refs);
 	return NULL;
 }
 
@@ -654,7 +646,7 @@ static void io_free_req_many(struct io_ring_ctx *ctx, void **reqs, int *nr)
 {
 	if (*nr) {
 		kmem_cache_free_bulk(req_cachep, *nr, reqs);
-		io_ring_drop_ctx_refs(ctx, *nr);
+		percpu_ref_put_many(&ctx->refs, *nr);
 		*nr = 0;
 	}
 }
@@ -663,7 +655,7 @@ static void __io_free_req(struct io_kiocb *req)
 {
 	if (req->file && !(req->flags & REQ_F_FIXED_FILE))
 		fput(req->file);
-	io_ring_drop_ctx_refs(req->ctx, 1);
+	percpu_ref_put(&req->ctx->refs);
 	kmem_cache_free(req_cachep, req);
 }
 
@@ -3630,7 +3622,7 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
 		}
 	}
 
-	io_ring_drop_ctx_refs(ctx, 1);
+	percpu_ref_put(&ctx->refs);
 out_fput:
 	fdput(f);
 	return submitted ? submitted : ret;
-- 
2.23.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] io_uring: remove wait loop spurious wakeups
  2019-10-07 23:18 [PATCH] io_uring: remove wait loop spurious wakeups Pavel Begunkov (Silence)
@ 2019-10-08  3:16 ` Jens Axboe
  2019-10-08 16:43   ` Pavel Begunkov
  0 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2019-10-08  3:16 UTC (permalink / raw)
  To: Pavel Begunkov (Silence), linux-block, linux-kernel

On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
> From: Pavel Begunkov <asml.silence@gmail.com>
> 
> Any changes interesting to tasks waiting in io_cqring_wait() are
> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
> also tries to do that but with no reason, that means spurious wakeups
> every io_free_req() and io_uring_enter().
> 
> Just use percpu_ref_put() instead.

Looks good, this is a leftover from when the ctx teardown used
the waitqueue as well.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] io_uring: remove wait loop spurious wakeups
  2019-10-08  3:16 ` Jens Axboe
@ 2019-10-08 16:43   ` Pavel Begunkov
  2019-10-08 17:00     ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Pavel Begunkov @ 2019-10-08 16:43 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-kernel

[-- Attachment #1.1: Type: text/plain, Size: 834 bytes --]

On 08/10/2019 06:16, Jens Axboe wrote:
> On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
>> From: Pavel Begunkov <asml.silence@gmail.com>
>>
>> Any changes interesting to tasks waiting in io_cqring_wait() are
>> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
>> also tries to do that but with no reason, that means spurious wakeups
>> every io_free_req() and io_uring_enter().
>>
>> Just use percpu_ref_put() instead.
> 
> Looks good, this is a leftover from when the ctx teardown used
> the waitqueue as well.
> 
BTW, is there a reason for ref-counting in struct io_kiocb? I understand
the idea behind submission reference, but don't see any actual part
needing it.

Tested with another ref-counting patch and got +5-8% to
nops performance.


-- 
Yours sincerely,
Pavel Begunkov


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] io_uring: remove wait loop spurious wakeups
  2019-10-08 16:43   ` Pavel Begunkov
@ 2019-10-08 17:00     ` Jens Axboe
  2019-10-08 20:58       ` Pavel Begunkov
  0 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2019-10-08 17:00 UTC (permalink / raw)
  To: Pavel Begunkov, linux-block, linux-kernel

On 10/8/19 10:43 AM, Pavel Begunkov wrote:
> On 08/10/2019 06:16, Jens Axboe wrote:
>> On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
>>> From: Pavel Begunkov <asml.silence@gmail.com>
>>>
>>> Any changes interesting to tasks waiting in io_cqring_wait() are
>>> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
>>> also tries to do that but with no reason, that means spurious wakeups
>>> every io_free_req() and io_uring_enter().
>>>
>>> Just use percpu_ref_put() instead.
>>
>> Looks good, this is a leftover from when the ctx teardown used
>> the waitqueue as well.
>>
> BTW, is there a reason for ref-counting in struct io_kiocb? I understand
> the idea behind submission reference, but don't see any actual part
> needing it.

In short, it's to prevent the completion running before we're done with
the iocb on the submission side.

> Tested with another ref-counting patch and got +5-8% to
> nops performance.
> 
> 


-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] io_uring: remove wait loop spurious wakeups
  2019-10-08 17:00     ` Jens Axboe
@ 2019-10-08 20:58       ` Pavel Begunkov
  2019-10-08 21:22         ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Pavel Begunkov @ 2019-10-08 20:58 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-kernel

[-- Attachment #1.1: Type: text/plain, Size: 1261 bytes --]

On 08/10/2019 20:00, Jens Axboe wrote:
> On 10/8/19 10:43 AM, Pavel Begunkov wrote:
>> On 08/10/2019 06:16, Jens Axboe wrote:
>>> On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
>>>> From: Pavel Begunkov <asml.silence@gmail.com>
>>>>
>>>> Any changes interesting to tasks waiting in io_cqring_wait() are
>>>> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
>>>> also tries to do that but with no reason, that means spurious wakeups
>>>> every io_free_req() and io_uring_enter().
>>>>
>>>> Just use percpu_ref_put() instead.
>>>
>>> Looks good, this is a leftover from when the ctx teardown used
>>> the waitqueue as well.
>>>
>> BTW, is there a reason for ref-counting in struct io_kiocb? I understand
>> the idea behind submission reference, but don't see any actual part
>> needing it.
> 
> In short, it's to prevent the completion running before we're done with
> the iocb on the submission side.

Yep, that's what I expected. Perhaps I missed something, but what I've
seen following code paths all the way down, it either
1. gets error / completes synchronously and then frees req locally
2. or passes it further (e.g. async list) and never accesses it after


-- 
Yours sincerely,
Pavel Begunkov


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] io_uring: remove wait loop spurious wakeups
  2019-10-08 20:58       ` Pavel Begunkov
@ 2019-10-08 21:22         ` Jens Axboe
  2019-10-08 22:05           ` Pavel Begunkov
  0 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2019-10-08 21:22 UTC (permalink / raw)
  To: Pavel Begunkov, linux-block, linux-kernel

On 10/8/19 2:58 PM, Pavel Begunkov wrote:
> On 08/10/2019 20:00, Jens Axboe wrote:
>> On 10/8/19 10:43 AM, Pavel Begunkov wrote:
>>> On 08/10/2019 06:16, Jens Axboe wrote:
>>>> On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
>>>>> From: Pavel Begunkov <asml.silence@gmail.com>
>>>>>
>>>>> Any changes interesting to tasks waiting in io_cqring_wait() are
>>>>> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
>>>>> also tries to do that but with no reason, that means spurious wakeups
>>>>> every io_free_req() and io_uring_enter().
>>>>>
>>>>> Just use percpu_ref_put() instead.
>>>>
>>>> Looks good, this is a leftover from when the ctx teardown used
>>>> the waitqueue as well.
>>>>
>>> BTW, is there a reason for ref-counting in struct io_kiocb? I understand
>>> the idea behind submission reference, but don't see any actual part
>>> needing it.
>>
>> In short, it's to prevent the completion running before we're done with
>> the iocb on the submission side.
> 
> Yep, that's what I expected. Perhaps I missed something, but what I've
> seen following code paths all the way down, it either
> 1. gets error / completes synchronously and then frees req locally
> 2. or passes it further (e.g. async list) and never accesses it after

As soon as the IO is passed on, it can complete. In fact, it can complete
even _before_ that call returns. That's the issue. Obviously this isn't
true for purely polled IO, but it is true for IRQ based IO.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] io_uring: remove wait loop spurious wakeups
  2019-10-08 21:22         ` Jens Axboe
@ 2019-10-08 22:05           ` Pavel Begunkov
  2019-10-09  2:54             ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Pavel Begunkov @ 2019-10-08 22:05 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-kernel

[-- Attachment #1.1: Type: text/plain, Size: 1842 bytes --]

On 09/10/2019 00:22, Jens Axboe wrote:
> On 10/8/19 2:58 PM, Pavel Begunkov wrote:
>> On 08/10/2019 20:00, Jens Axboe wrote:
>>> On 10/8/19 10:43 AM, Pavel Begunkov wrote:
>>>> On 08/10/2019 06:16, Jens Axboe wrote:
>>>>> On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
>>>>>> From: Pavel Begunkov <asml.silence@gmail.com>
>>>>>>
>>>>>> Any changes interesting to tasks waiting in io_cqring_wait() are
>>>>>> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
>>>>>> also tries to do that but with no reason, that means spurious wakeups
>>>>>> every io_free_req() and io_uring_enter().
>>>>>>
>>>>>> Just use percpu_ref_put() instead.
>>>>>
>>>>> Looks good, this is a leftover from when the ctx teardown used
>>>>> the waitqueue as well.
>>>>>
>>>> BTW, is there a reason for ref-counting in struct io_kiocb? I understand
>>>> the idea behind submission reference, but don't see any actual part
>>>> needing it.
>>>
>>> In short, it's to prevent the completion running before we're done with
>>> the iocb on the submission side.
>>
>> Yep, that's what I expected. Perhaps I missed something, but what I've
>> seen following code paths all the way down, it either
>> 1. gets error / completes synchronously and then frees req locally
>> 2. or passes it further (e.g. async list) and never accesses it after
> 
> As soon as the IO is passed on, it can complete. In fact, it can complete
> even _before_ that call returns. That's the issue. Obviously this isn't
> true for purely polled IO, but it is true for IRQ based IO.

And the idea was to not use io_kiocb after submission. Except when we know,
that it won't complete asynchronously (e.g. error), that could be checked
with return code, I guess.

Anyway, thanks for the explanation!

-- 
Yours sincerely,
Pavel Begunkov


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] io_uring: remove wait loop spurious wakeups
  2019-10-08 22:05           ` Pavel Begunkov
@ 2019-10-09  2:54             ` Jens Axboe
  2019-10-09  9:19               ` Pavel Begunkov
  0 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2019-10-09  2:54 UTC (permalink / raw)
  To: Pavel Begunkov, linux-block, linux-kernel

On 10/8/19 4:05 PM, Pavel Begunkov wrote:
> On 09/10/2019 00:22, Jens Axboe wrote:
>> On 10/8/19 2:58 PM, Pavel Begunkov wrote:
>>> On 08/10/2019 20:00, Jens Axboe wrote:
>>>> On 10/8/19 10:43 AM, Pavel Begunkov wrote:
>>>>> On 08/10/2019 06:16, Jens Axboe wrote:
>>>>>> On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
>>>>>>> From: Pavel Begunkov <asml.silence@gmail.com>
>>>>>>>
>>>>>>> Any changes interesting to tasks waiting in io_cqring_wait() are
>>>>>>> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
>>>>>>> also tries to do that but with no reason, that means spurious wakeups
>>>>>>> every io_free_req() and io_uring_enter().
>>>>>>>
>>>>>>> Just use percpu_ref_put() instead.
>>>>>>
>>>>>> Looks good, this is a leftover from when the ctx teardown used
>>>>>> the waitqueue as well.
>>>>>>
>>>>> BTW, is there a reason for ref-counting in struct io_kiocb? I understand
>>>>> the idea behind submission reference, but don't see any actual part
>>>>> needing it.
>>>>
>>>> In short, it's to prevent the completion running before we're done with
>>>> the iocb on the submission side.
>>>
>>> Yep, that's what I expected. Perhaps I missed something, but what I've
>>> seen following code paths all the way down, it either
>>> 1. gets error / completes synchronously and then frees req locally
>>> 2. or passes it further (e.g. async list) and never accesses it after
>>
>> As soon as the IO is passed on, it can complete. In fact, it can complete
>> even _before_ that call returns. That's the issue. Obviously this isn't
>> true for purely polled IO, but it is true for IRQ based IO.
> 
> And the idea was to not use io_kiocb after submission. Except when we know,
> that it won't complete asynchronously (e.g. error), that could be checked
> with return code, I guess.

I think you're still missing the point. During the submission it can go
away, it can be deep in a call chain. So it's not enough to say "we
won't touch it after completion returns", we need to hold a reference to
ensure it doesn't go away WHILE being submitted.

Hope that helps!

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] io_uring: remove wait loop spurious wakeups
  2019-10-09  2:54             ` Jens Axboe
@ 2019-10-09  9:19               ` Pavel Begunkov
  0 siblings, 0 replies; 9+ messages in thread
From: Pavel Begunkov @ 2019-10-09  9:19 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-kernel

On 10/9/2019 5:54 AM, Jens Axboe wrote:
>>>>>> BTW, is there a reason for ref-counting in struct io_kiocb? I understand
>>>>>> the idea behind submission reference, but don't see any actual part
>>>>>> needing it.
>>>>>
>>>>> In short, it's to prevent the completion running before we're done with
>>>>> the iocb on the submission side.
>>>>
>>>> Yep, that's what I expected. Perhaps I missed something, but what I've
>>>> seen following code paths all the way down, it either
>>>> 1. gets error / completes synchronously and then frees req locally
>>>> 2. or passes it further (e.g. async list) and never accesses it after
>>>
>>> As soon as the IO is passed on, it can complete. In fact, it can complete
>>> even _before_ that call returns. That's the issue. Obviously this isn't
>>> true for purely polled IO, but it is true for IRQ based IO.
>>
>> And the idea was to not use io_kiocb after submission. Except when we know,
>> that it won't complete asynchronously (e.g. error), that could be checked
>> with return code, I guess.
> 
> I think you're still missing the point. During the submission it can go
> away, it can be deep in a call chain. So it's not enough to say "we
> won't touch it after completion returns", we need to hold a reference to
> ensure it doesn't go away WHILE being submitted.
> 
> Hope that helps!

Now I get it, thanks Jens!

-- 
Yours sincerely,
Pavel Begunkov

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, back to index

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-07 23:18 [PATCH] io_uring: remove wait loop spurious wakeups Pavel Begunkov (Silence)
2019-10-08  3:16 ` Jens Axboe
2019-10-08 16:43   ` Pavel Begunkov
2019-10-08 17:00     ` Jens Axboe
2019-10-08 20:58       ` Pavel Begunkov
2019-10-08 21:22         ` Jens Axboe
2019-10-08 22:05           ` Pavel Begunkov
2019-10-09  2:54             ` Jens Axboe
2019-10-09  9:19               ` Pavel Begunkov

Linux-Block Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-block/0 linux-block/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-block linux-block/ https://lore.kernel.org/linux-block \
		linux-block@vger.kernel.org linux-block@archiver.kernel.org
	public-inbox-index linux-block

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-block


AGPL code for this site: git clone https://public-inbox.org/ public-inbox