* [PATCH] aio-posix: avoid reacquiring rcu_read_lock() when polling
@ 2020-02-18 18:27 Stefan Hajnoczi
2020-02-20 11:19 ` Paolo Bonzini
2020-02-21 13:34 ` Stefan Hajnoczi
0 siblings, 2 replies; 3+ messages in thread
From: Stefan Hajnoczi @ 2020-02-18 18:27 UTC (permalink / raw)
To: qemu-devel; +Cc: Fam Zheng, Paolo Bonzini, Stefan Hajnoczi, qemu-block
The first rcu_read_lock/unlock() is expensive. Nested calls are cheap.

This optimization increases IOPS from 73k to 162k with a Linux guest
that has 2 virtio-blk,num-queues=1 and 99 virtio-blk,num-queues=32
devices.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
util/aio-posix.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/util/aio-posix.c b/util/aio-posix.c
index a4977f538e..f67f5b34e9 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -15,6 +15,7 @@

 #include "qemu/osdep.h"
 #include "block/block.h"
+#include "qemu/rcu.h"
 #include "qemu/rcu_queue.h"
 #include "qemu/sockets.h"
 #include "qemu/cutils.h"
@@ -514,6 +515,16 @@ static bool run_poll_handlers_once(AioContext *ctx, int64_t *timeout)
     bool progress = false;
     AioHandler *node;

+    /*
+     * Optimization: ->io_poll() handlers often contain RCU read critical
+     * sections and we therefore see many rcu_read_lock() -> rcu_read_unlock()
+     * -> rcu_read_lock() -> ... sequences with expensive memory
+     * synchronization primitives.  Make the entire polling loop an RCU
+     * critical section because nested rcu_read_lock()/rcu_read_unlock() calls
+     * are cheap.
+     */
+    RCU_READ_LOCK_GUARD();
+
     QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
         if (!node->deleted && node->io_poll &&
             aio_node_check(ctx, node->is_external) &&
--
2.24.1
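
[Editorial illustration, not part of the patch] The reasoning in the code
comment above can be sketched with a toy model of a nestable read-side lock.
This is a simplified sketch under stated assumptions, not QEMU's RCU
implementation: every toy_* name is hypothetical, and a sequentially
consistent fence stands in for the "expensive memory synchronization
primitives" that the comment mentions. The point is only that the outermost
lock/unlock pair pays for synchronization, while nested pairs reduce to a
per-thread counter update.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical toy model of a nestable read-side lock (not QEMU code). */
static _Thread_local unsigned toy_depth;  /* per-thread nesting depth */
static unsigned long toy_fences;          /* number of fences executed */

static void toy_read_lock(void)
{
    if (toy_depth++ > 0) {
        return;                           /* nested: counter update only */
    }
    /* Outermost entry: stand-in for the expensive synchronization. */
    atomic_thread_fence(memory_order_seq_cst);
    toy_fences++;
}

static void toy_read_unlock(void)
{
    assert(toy_depth > 0);                /* catch unbalanced unlocks */
    if (--toy_depth > 0) {
        return;                           /* still inside an outer section */
    }
    /* Outermost exit: stand-in for the expensive synchronization. */
    atomic_thread_fence(memory_order_seq_cst);
    toy_fences++;
}

/* Stand-in for an ->io_poll() handler with its own read-side section. */
static void toy_handler(void)
{
    toy_read_lock();
    toy_read_unlock();
}

/*
 * Run `handlers` handlers once, optionally wrapping the whole loop in a
 * single outer read-side section, and return how many fences were executed.
 */
static unsigned long toy_poll(int handlers, int hold_outer)
{
    toy_fences = 0;
    if (hold_outer) {
        toy_read_lock();
    }
    for (int i = 0; i < handlers; i++) {
        toy_handler();
    }
    if (hold_outer) {
        toy_read_unlock();
    }
    return toy_fences;
}
```

In this model, toy_poll(100, 0) executes 200 fences (two per handler), while
toy_poll(100, 1) executes 2 regardless of the number of handlers. The fence
count is only a proxy for cost. The IOPS figures in the commit message are
the author's measurements on real hardware and are not reproduced by this
sketch.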
* Re: [PATCH] aio-posix: avoid reacquiring rcu_read_lock() when polling
From: Paolo Bonzini @ 2020-02-20 11:19 UTC (permalink / raw)
To: Stefan Hajnoczi, qemu-devel; +Cc: Fam Zheng, qemu-block
On 18/02/20 19:27, Stefan Hajnoczi wrote:
> The first rcu_read_lock/unlock() is expensive. Nested calls are cheap.
>
> This optimization increases IOPS from 73k to 162k with a Linux guest
> that has 2 virtio-blk,num-queues=1 and 99 virtio-blk,num-queues=32
> devices.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> util/aio-posix.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/util/aio-posix.c b/util/aio-posix.c
> index a4977f538e..f67f5b34e9 100644
> --- a/util/aio-posix.c
> +++ b/util/aio-posix.c
> @@ -15,6 +15,7 @@
>
> #include "qemu/osdep.h"
> #include "block/block.h"
> +#include "qemu/rcu.h"
> #include "qemu/rcu_queue.h"
> #include "qemu/sockets.h"
> #include "qemu/cutils.h"
> @@ -514,6 +515,16 @@ static bool run_poll_handlers_once(AioContext *ctx, int64_t *timeout)
>      bool progress = false;
>      AioHandler *node;
>
> +    /*
> +     * Optimization: ->io_poll() handlers often contain RCU read critical
> +     * sections and we therefore see many rcu_read_lock() -> rcu_read_unlock()
> +     * -> rcu_read_lock() -> ... sequences with expensive memory
> +     * synchronization primitives.  Make the entire polling loop an RCU
> +     * critical section because nested rcu_read_lock()/rcu_read_unlock() calls
> +     * are cheap.
> +     */
> +    RCU_READ_LOCK_GUARD();
> +
>      QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
>          if (!node->deleted && node->io_poll &&
>              aio_node_check(ctx, node->is_external) &&
>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
* Re: [PATCH] aio-posix: avoid reacquiring rcu_read_lock() when polling
From: Stefan Hajnoczi @ 2020-02-21 13:34 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Fam Zheng, Paolo Bonzini, qemu-devel, qemu-block
On Tue, Feb 18, 2020 at 06:27:08PM +0000, Stefan Hajnoczi wrote:
> The first rcu_read_lock/unlock() is expensive. Nested calls are cheap.
>
> This optimization increases IOPS from 73k to 162k with a Linux guest
> that has 2 virtio-blk,num-queues=1 and 99 virtio-blk,num-queues=32
> devices.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> util/aio-posix.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block
Stefan