* [PATCH 1/5] aio-posix: fix use after leaving scope in aio_poll()
2020-02-14 17:17 [PATCH 0/5] aio-posix: towards an O(1) event loop Stefan Hajnoczi
@ 2020-02-14 17:17 ` Stefan Hajnoczi
2020-02-19 7:02 ` Sergio Lopez
2020-02-14 17:17 ` [PATCH 2/5] aio-posix: don't pass ns timeout to epoll_wait() Stefan Hajnoczi
From: Stefan Hajnoczi @ 2020-02-14 17:17 UTC (permalink / raw)
To: qemu-devel
Cc: Fam Zheng, Kevin Wolf, qemu-block, Max Reitz, Stefan Hajnoczi,
Marc-André Lureau, Paolo Bonzini
epoll_handler is a stack variable and must not be accessed after it goes
out of scope:
if (aio_epoll_check_poll(ctx, pollfds, npfd, timeout)) {
AioHandler epoll_handler;
...
add_pollfd(&epoll_handler);
ret = aio_epoll(ctx, pollfds, npfd, timeout);
} ...
...
/* if we have any readable fds, dispatch event */
if (ret > 0) {
for (i = 0; i < npfd; i++) {
nodes[i]->pfd.revents = pollfds[i].revents;
}
}
nodes[0] is &epoll_handler, which has already gone out of scope.
There is no need to use pollfds[] for epoll. We don't need an
AioHandler for the epoll fd.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
util/aio-posix.c | 20 ++++++++------------
1 file changed, 8 insertions(+), 12 deletions(-)
diff --git a/util/aio-posix.c b/util/aio-posix.c
index a4977f538e..31a8e03ca7 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -104,17 +104,18 @@ static void aio_epoll_update(AioContext *ctx, AioHandler *node, bool is_new)
}
}
-static int aio_epoll(AioContext *ctx, GPollFD *pfds,
- unsigned npfd, int64_t timeout)
+static int aio_epoll(AioContext *ctx, int64_t timeout)
{
+ GPollFD pfd = {
+ .fd = ctx->epollfd,
+ .events = G_IO_IN | G_IO_OUT | G_IO_HUP | G_IO_ERR,
+ };
AioHandler *node;
int i, ret = 0;
struct epoll_event events[128];
- assert(npfd == 1);
- assert(pfds[0].fd == ctx->epollfd);
if (timeout > 0) {
- ret = qemu_poll_ns(pfds, npfd, timeout);
+ ret = qemu_poll_ns(&pfd, 1, timeout);
}
if (timeout <= 0 || ret > 0) {
ret = epoll_wait(ctx->epollfd, events,
@@ -658,13 +659,8 @@ bool aio_poll(AioContext *ctx, bool blocking)
/* wait until next event */
if (aio_epoll_check_poll(ctx, pollfds, npfd, timeout)) {
- AioHandler epoll_handler;
-
- epoll_handler.pfd.fd = ctx->epollfd;
- epoll_handler.pfd.events = G_IO_IN | G_IO_OUT | G_IO_HUP | G_IO_ERR;
- npfd = 0;
- add_pollfd(&epoll_handler);
- ret = aio_epoll(ctx, pollfds, npfd, timeout);
+ npfd = 0; /* pollfds[] is not being used */
+ ret = aio_epoll(ctx, timeout);
} else {
ret = qemu_poll_ns(pollfds, npfd, timeout);
}
--
2.24.1
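[Editor's note] The fixed pattern is easy to demonstrate outside QEMU. In this sketch (not QEMU code; the helper name and event mask are invented) the epoll fd is polled through a struct pollfd that never escapes the helper, so no stack address can be read after its scope ends the way &epoll_handler was via nodes[]:

```c
/* Minimal sketch: poll the epoll fd directly with a local pollfd. */
#include <assert.h>      /* headers cover the usage example as well */
#include <poll.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* pfd lives and dies inside this function; nothing retains its address. */
static int wait_on_epollfd(int epollfd, int timeout_ms)
{
    struct pollfd pfd = {
        .fd = epollfd,
        .events = POLLIN, /* an epoll fd is readable when events are pending */
    };

    return poll(&pfd, 1, timeout_ms);
}
```

Registering an eventfd with the epoll set and signalling it makes wait_on_epollfd() report readiness without any handler object whose address could outlive its scope.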
* Re: [PATCH 1/5] aio-posix: fix use after leaving scope in aio_poll()
2020-02-14 17:17 ` [PATCH 1/5] aio-posix: fix use after leaving scope in aio_poll() Stefan Hajnoczi
@ 2020-02-19 7:02 ` Sergio Lopez
From: Sergio Lopez @ 2020-02-19 7:02 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: Fam Zheng, Kevin Wolf, qemu-block, qemu-devel, Max Reitz,
Paolo Bonzini, Marc-André Lureau
On Fri, Feb 14, 2020 at 05:17:08PM +0000, Stefan Hajnoczi wrote:
> epoll_handler is a stack variable and must not be accessed after it goes
> out of scope:
>
> if (aio_epoll_check_poll(ctx, pollfds, npfd, timeout)) {
> AioHandler epoll_handler;
> ...
> add_pollfd(&epoll_handler);
> ret = aio_epoll(ctx, pollfds, npfd, timeout);
> } ...
>
> ...
>
> /* if we have any readable fds, dispatch event */
> if (ret > 0) {
> for (i = 0; i < npfd; i++) {
> nodes[i]->pfd.revents = pollfds[i].revents;
> }
> }
>
> nodes[0] is &epoll_handler, which has already gone out of scope.
>
> There is no need to use pollfds[] for epoll. We don't need an
> AioHandler for the epoll fd.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> util/aio-posix.c | 20 ++++++++------------
> 1 file changed, 8 insertions(+), 12 deletions(-)
Reviewed-by: Sergio Lopez <slp@redhat.com>
* [PATCH 2/5] aio-posix: don't pass ns timeout to epoll_wait()
2020-02-14 17:17 [PATCH 0/5] aio-posix: towards an O(1) event loop Stefan Hajnoczi
2020-02-14 17:17 ` [PATCH 1/5] aio-posix: fix use after leaving scope in aio_poll() Stefan Hajnoczi
@ 2020-02-14 17:17 ` Stefan Hajnoczi
2020-02-19 10:12 ` Sergio Lopez
2020-02-14 17:17 ` [PATCH 3/5] qemu/queue.h: add QLIST_SAFE_REMOVE() Stefan Hajnoczi
From: Stefan Hajnoczi @ 2020-02-14 17:17 UTC (permalink / raw)
To: qemu-devel
Cc: Fam Zheng, Kevin Wolf, qemu-block, Max Reitz, Stefan Hajnoczi,
Marc-André Lureau, Paolo Bonzini
Don't pass the nanosecond timeout into epoll_wait(), which expects
milliseconds.
The epoll_wait() timeout value does not matter if qemu_poll_ns()
determined that the poll fd is ready, but passing a value in the wrong
units is still ugly. Pass a 0 timeout to epoll_wait() instead.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
util/aio-posix.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/util/aio-posix.c b/util/aio-posix.c
index 31a8e03ca7..b21bcd8e97 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -116,6 +116,9 @@ static int aio_epoll(AioContext *ctx, int64_t timeout)
if (timeout > 0) {
ret = qemu_poll_ns(&pfd, 1, timeout);
+ if (ret > 0) {
+ timeout = 0;
+ }
}
if (timeout <= 0 || ret > 0) {
ret = epoll_wait(ctx->epollfd, events,
--
2.24.1
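[Editor's note] The units mismatch and the fix can be shown in isolation. A hedged sketch (function names invented; QEMU itself only needs the 0-timeout shortcut, the conversion helper is illustrative): epoll_wait() takes milliseconds, so a nanosecond timeout must be converted, and once the preceding poll has already reported readiness the correct epoll_wait() timeout is simply 0:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a nanosecond timeout to milliseconds, rounding up so we never
 * wake earlier than requested; -1 means block forever, as with poll(). */
static int ns_to_ms_timeout(int64_t ns)
{
    if (ns < 0) {
        return -1;
    }
    return (int)((ns + 999999) / 1000000);
}

/* Pick the epoll_wait() timeout after a qemu_poll_ns()-style pre-poll:
 * if fds are already ready, epoll_wait() must not sleep at all. */
static int epoll_timeout_after_poll(int poll_ret, int64_t timeout_ns)
{
    if (poll_ret > 0) {
        return 0;
    }
    return ns_to_ms_timeout(timeout_ns);
}
```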
* Re: [PATCH 2/5] aio-posix: don't pass ns timeout to epoll_wait()
2020-02-14 17:17 ` [PATCH 2/5] aio-posix: don't pass ns timeout to epoll_wait() Stefan Hajnoczi
@ 2020-02-19 10:12 ` Sergio Lopez
From: Sergio Lopez @ 2020-02-19 10:12 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: Fam Zheng, Kevin Wolf, qemu-block, qemu-devel, Max Reitz,
Paolo Bonzini, Marc-André Lureau
On Fri, Feb 14, 2020 at 05:17:09PM +0000, Stefan Hajnoczi wrote:
> Don't pass the nanosecond timeout into epoll_wait(), which expects
> milliseconds.
>
> The epoll_wait() timeout value does not matter if qemu_poll_ns()
> determined that the poll fd is ready, but passing a value in the wrong
> units is still ugly. Pass a 0 timeout to epoll_wait() instead.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> util/aio-posix.c | 3 +++
> 1 file changed, 3 insertions(+)
Reviewed-by: Sergio Lopez <slp@redhat.com>
* [PATCH 3/5] qemu/queue.h: add QLIST_SAFE_REMOVE()
2020-02-14 17:17 [PATCH 0/5] aio-posix: towards an O(1) event loop Stefan Hajnoczi
2020-02-14 17:17 ` [PATCH 1/5] aio-posix: fix use after leaving scope in aio_poll() Stefan Hajnoczi
2020-02-14 17:17 ` [PATCH 2/5] aio-posix: don't pass ns timeout to epoll_wait() Stefan Hajnoczi
@ 2020-02-14 17:17 ` Stefan Hajnoczi
2020-02-19 10:30 ` Sergio Lopez
2020-02-14 17:17 ` [PATCH 4/5] aio-posix: make AioHandler deletion O(1) Stefan Hajnoczi
From: Stefan Hajnoczi @ 2020-02-14 17:17 UTC (permalink / raw)
To: qemu-devel
Cc: Fam Zheng, Kevin Wolf, qemu-block, Max Reitz, Stefan Hajnoczi,
Marc-André Lureau, Paolo Bonzini
QLIST_REMOVE() assumes the element is in a list. It also leaves the
element's linked list pointers dangling.
Introduce a safe version of QLIST_REMOVE() and convert open-coded
instances of this pattern.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
block.c | 5 +----
chardev/spice.c | 4 +---
include/qemu/queue.h | 14 ++++++++++++++
3 files changed, 16 insertions(+), 7 deletions(-)
diff --git a/block.c b/block.c
index 9c810534d6..484e01d042 100644
--- a/block.c
+++ b/block.c
@@ -2499,10 +2499,7 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
static void bdrv_detach_child(BdrvChild *child)
{
- if (child->next.le_prev) {
- QLIST_REMOVE(child, next);
- child->next.le_prev = NULL;
- }
+ QLIST_SAFE_REMOVE(child, next);
bdrv_replace_child(child, NULL);
diff --git a/chardev/spice.c b/chardev/spice.c
index 241e2b7770..bf7ea1e294 100644
--- a/chardev/spice.c
+++ b/chardev/spice.c
@@ -216,9 +216,7 @@ static void char_spice_finalize(Object *obj)
vmc_unregister_interface(s);
- if (s->next.le_prev) {
- QLIST_REMOVE(s, next);
- }
+ QLIST_SAFE_REMOVE(s, next);
g_free((char *)s->sin.subtype);
g_free((char *)s->sin.portname);
diff --git a/include/qemu/queue.h b/include/qemu/queue.h
index 19425f973f..a276363372 100644
--- a/include/qemu/queue.h
+++ b/include/qemu/queue.h
@@ -144,6 +144,20 @@ struct { \
*(elm)->field.le_prev = (elm)->field.le_next; \
} while (/*CONSTCOND*/0)
+/*
+ * Like QLIST_REMOVE() but safe to call when elm is not in a list
+ */
+#define QLIST_SAFE_REMOVE(elm, field) do { \
+ if ((elm)->field.le_prev != NULL) { \
+ if ((elm)->field.le_next != NULL) \
+ (elm)->field.le_next->field.le_prev = \
+ (elm)->field.le_prev; \
+ *(elm)->field.le_prev = (elm)->field.le_next; \
+ (elm)->field.le_next = NULL; \
+ (elm)->field.le_prev = NULL; \
+ } \
+} while (/*CONSTCOND*/0)
+
#define QLIST_FOREACH(var, head, field) \
for ((var) = ((head)->lh_first); \
(var); \
--
2.24.1
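[Editor's note] The macro's two safety properties can be exercised outside QEMU with a minimal re-creation that uses the same le_next/le_prev layout (struct and function names are invented for this sketch, not QEMU's queue.h):

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *le_next;   /* next element, or NULL */
    struct node **le_prev;  /* address of previous element's le_next */
};

struct list {
    struct node *lh_first;
};

static void list_insert_head(struct list *head, struct node *elm)
{
    elm->le_next = head->lh_first;
    if (elm->le_next != NULL) {
        elm->le_next->le_prev = &elm->le_next;
    }
    head->lh_first = elm;
    elm->le_prev = &head->lh_first;
}

/* Like QLIST_SAFE_REMOVE(): a no-op when elm is not in a list, and the
 * link fields are cleared so later list operations see a clean node. */
static void list_safe_remove(struct node *elm)
{
    if (elm->le_prev != NULL) {
        if (elm->le_next != NULL) {
            elm->le_next->le_prev = elm->le_prev;
        }
        *elm->le_prev = elm->le_next;
        elm->le_next = NULL;
        elm->le_prev = NULL;
    }
}
```

Removing a node twice, or removing one that was never inserted, leaves the list intact instead of dereferencing dangling pointers as plain removal would.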
* Re: [PATCH 3/5] qemu/queue.h: add QLIST_SAFE_REMOVE()
2020-02-14 17:17 ` [PATCH 3/5] qemu/queue.h: add QLIST_SAFE_REMOVE() Stefan Hajnoczi
@ 2020-02-19 10:30 ` Sergio Lopez
From: Sergio Lopez @ 2020-02-19 10:30 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: Fam Zheng, Kevin Wolf, qemu-block, qemu-devel, Max Reitz,
Paolo Bonzini, Marc-André Lureau
On Fri, Feb 14, 2020 at 05:17:10PM +0000, Stefan Hajnoczi wrote:
> QLIST_REMOVE() assumes the element is in a list. It also leaves the
> element's linked list pointers dangling.
>
> Introduce a safe version of QLIST_REMOVE() and convert open-coded
> instances of this pattern.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> block.c | 5 +----
> chardev/spice.c | 4 +---
> include/qemu/queue.h | 14 ++++++++++++++
> 3 files changed, 16 insertions(+), 7 deletions(-)
Reviewed-by: Sergio Lopez <slp@redhat.com>
* [PATCH 4/5] aio-posix: make AioHandler deletion O(1)
2020-02-14 17:17 [PATCH 0/5] aio-posix: towards an O(1) event loop Stefan Hajnoczi
2020-02-14 17:17 ` [PATCH 3/5] qemu/queue.h: add QLIST_SAFE_REMOVE() Stefan Hajnoczi
@ 2020-02-14 17:17 ` Stefan Hajnoczi
2020-02-19 10:41 ` Sergio Lopez
2020-02-14 17:17 ` [PATCH 5/5] aio-posix: make AioHandler dispatch O(1) with epoll Stefan Hajnoczi
2020-02-21 15:29 ` [PATCH 0/5] aio-posix: towards an O(1) event loop Stefan Hajnoczi
From: Stefan Hajnoczi @ 2020-02-14 17:17 UTC (permalink / raw)
To: qemu-devel
Cc: Fam Zheng, Kevin Wolf, qemu-block, Max Reitz, Stefan Hajnoczi,
Marc-André Lureau, Paolo Bonzini
It is not necessary to scan all AioHandlers for deletion. Keep a list
of deleted handlers instead of scanning the full list of all handlers.
The AioHandler->deleted field can be dropped. Let's check if the
handler has been inserted into the deleted list instead. Add a new
QLIST_IS_INSERTED() API for this check.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
include/block/aio.h | 6 ++++-
include/qemu/queue.h | 3 +++
util/aio-posix.c | 53 +++++++++++++++++++++++++++++---------------
3 files changed, 43 insertions(+), 19 deletions(-)
diff --git a/include/block/aio.h b/include/block/aio.h
index 7ba9bd7874..1a0de1508c 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -42,6 +42,7 @@ void qemu_aio_unref(void *p);
void qemu_aio_ref(void *p);
typedef struct AioHandler AioHandler;
+typedef QLIST_HEAD(, AioHandler) AioHandlerList;
typedef void QEMUBHFunc(void *opaque);
typedef bool AioPollFn(void *opaque);
typedef void IOHandler(void *opaque);
@@ -58,7 +59,10 @@ struct AioContext {
QemuRecMutex lock;
/* The list of registered AIO handlers. Protected by ctx->list_lock. */
- QLIST_HEAD(, AioHandler) aio_handlers;
+ AioHandlerList aio_handlers;
+
+ /* The list of AIO handlers to be deleted. Protected by ctx->list_lock. */
+ AioHandlerList deleted_aio_handlers;
/* Used to avoid unnecessary event_notifier_set calls in aio_notify;
* accessed with atomic primitives. If this field is 0, everything
diff --git a/include/qemu/queue.h b/include/qemu/queue.h
index a276363372..699a8a0568 100644
--- a/include/qemu/queue.h
+++ b/include/qemu/queue.h
@@ -158,6 +158,9 @@ struct { \
} \
} while (/*CONSTCOND*/0)
+/* Is elm in a list? */
+#define QLIST_IS_INSERTED(elm, field) ((elm)->field.le_prev != NULL)
+
#define QLIST_FOREACH(var, head, field) \
for ((var) = ((head)->lh_first); \
(var); \
diff --git a/util/aio-posix.c b/util/aio-posix.c
index b21bcd8e97..3a98a2acb9 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -31,10 +31,10 @@ struct AioHandler
AioPollFn *io_poll;
IOHandler *io_poll_begin;
IOHandler *io_poll_end;
- int deleted;
void *opaque;
bool is_external;
QLIST_ENTRY(AioHandler) node;
+ QLIST_ENTRY(AioHandler) node_deleted;
};
#ifdef CONFIG_EPOLL_CREATE1
@@ -67,7 +67,7 @@ static bool aio_epoll_try_enable(AioContext *ctx)
QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
int r;
- if (node->deleted || !node->pfd.events) {
+ if (QLIST_IS_INSERTED(node, node_deleted) || !node->pfd.events) {
continue;
}
event.events = epoll_events_from_pfd(node->pfd.events);
@@ -195,9 +195,11 @@ static AioHandler *find_aio_handler(AioContext *ctx, int fd)
AioHandler *node;
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
- if (node->pfd.fd == fd)
- if (!node->deleted)
+ if (node->pfd.fd == fd) {
+ if (!QLIST_IS_INSERTED(node, node_deleted)) {
return node;
+ }
+ }
}
return NULL;
@@ -216,7 +218,7 @@ static bool aio_remove_fd_handler(AioContext *ctx, AioHandler *node)
/* If a read is in progress, just mark the node as deleted */
if (qemu_lockcnt_count(&ctx->list_lock)) {
- node->deleted = 1;
+ QLIST_INSERT_HEAD_RCU(&ctx->deleted_aio_handlers, node, node_deleted);
node->pfd.revents = 0;
return false;
}
@@ -358,7 +360,7 @@ static void poll_set_started(AioContext *ctx, bool started)
QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
IOHandler *fn;
- if (node->deleted) {
+ if (QLIST_IS_INSERTED(node, node_deleted)) {
continue;
}
@@ -415,6 +417,26 @@ bool aio_pending(AioContext *ctx)
return result;
}
+static void aio_free_deleted_handlers(AioContext *ctx)
+{
+ AioHandler *node;
+
+ if (QLIST_EMPTY_RCU(&ctx->deleted_aio_handlers)) {
+ return;
+ }
+ if (!qemu_lockcnt_dec_if_lock(&ctx->list_lock)) {
+ return; /* we are nested, let the parent do the freeing */
+ }
+
+ while ((node = QLIST_FIRST_RCU(&ctx->deleted_aio_handlers))) {
+ QLIST_REMOVE(node, node);
+ QLIST_REMOVE(node, node_deleted);
+ g_free(node);
+ }
+
+ qemu_lockcnt_inc_and_unlock(&ctx->list_lock);
+}
+
static bool aio_dispatch_handlers(AioContext *ctx)
{
AioHandler *node, *tmp;
@@ -426,7 +448,7 @@ static bool aio_dispatch_handlers(AioContext *ctx)
revents = node->pfd.revents & node->pfd.events;
node->pfd.revents = 0;
- if (!node->deleted &&
+ if (!QLIST_IS_INSERTED(node, node_deleted) &&
(revents & (G_IO_IN | G_IO_HUP | G_IO_ERR)) &&
aio_node_check(ctx, node->is_external) &&
node->io_read) {
@@ -437,21 +459,13 @@ static bool aio_dispatch_handlers(AioContext *ctx)
progress = true;
}
}
- if (!node->deleted &&
+ if (!QLIST_IS_INSERTED(node, node_deleted) &&
(revents & (G_IO_OUT | G_IO_ERR)) &&
aio_node_check(ctx, node->is_external) &&
node->io_write) {
node->io_write(node->opaque);
progress = true;
}
-
- if (node->deleted) {
- if (qemu_lockcnt_dec_if_lock(&ctx->list_lock)) {
- QLIST_REMOVE(node, node);
- g_free(node);
- qemu_lockcnt_inc_and_unlock(&ctx->list_lock);
- }
- }
}
return progress;
@@ -462,6 +476,7 @@ void aio_dispatch(AioContext *ctx)
qemu_lockcnt_inc(&ctx->list_lock);
aio_bh_poll(ctx);
aio_dispatch_handlers(ctx);
+ aio_free_deleted_handlers(ctx);
qemu_lockcnt_dec(&ctx->list_lock);
timerlistgroup_run_timers(&ctx->tlg);
@@ -519,7 +534,7 @@ static bool run_poll_handlers_once(AioContext *ctx, int64_t *timeout)
AioHandler *node;
QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
- if (!node->deleted && node->io_poll &&
+ if (!QLIST_IS_INSERTED(node, node_deleted) && node->io_poll &&
aio_node_check(ctx, node->is_external) &&
node->io_poll(node->opaque)) {
/*
@@ -653,7 +668,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
if (!aio_epoll_enabled(ctx)) {
QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
- if (!node->deleted && node->pfd.events
+ if (!QLIST_IS_INSERTED(node, node_deleted) && node->pfd.events
&& aio_node_check(ctx, node->is_external)) {
add_pollfd(node);
}
@@ -730,6 +745,8 @@ bool aio_poll(AioContext *ctx, bool blocking)
progress |= aio_dispatch_handlers(ctx);
}
+ aio_free_deleted_handlers(ctx);
+
qemu_lockcnt_dec(&ctx->list_lock);
progress |= timerlistgroup_run_timers(&ctx->tlg);
--
2.24.1
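[Editor's note] A standalone sketch of the scheme, using the BSD <sys/queue.h> LIST macros that QEMU's QLIST mirrors (struct and function names are invented; handlers must be zero-initialized, e.g. with calloc(), so the le_prev test is meaningful): each handler carries a second link, deletion pushes it onto a deleted list, and freeing walks only that short list:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/queue.h>

struct handler {
    int fd;
    LIST_ENTRY(handler) node;          /* membership in the active list */
    LIST_ENTRY(handler) node_deleted;  /* membership in the deleted list */
};

LIST_HEAD(handler_list, handler);

/* QLIST_IS_INSERTED() equivalent: le_prev is non-NULL iff the entry is
 * currently linked into a list. */
static bool is_marked_deleted(struct handler *h)
{
    return h->node_deleted.le_prev != NULL;
}

/* O(1): no per-handler flag to scan for later, just a list insertion. */
static void mark_deleted(struct handler_list *deleted, struct handler *h)
{
    LIST_INSERT_HEAD(deleted, h, node_deleted);
}

/* Free only deleted handlers: O(deleted), not O(all handlers). */
static int free_deleted(struct handler_list *deleted)
{
    int freed = 0;
    struct handler *h;

    while ((h = LIST_FIRST(deleted)) != NULL) {
        LIST_REMOVE(h, node);          /* unlink from the active list */
        LIST_REMOVE(h, node_deleted);
        free(h);
        freed++;
    }
    return freed;
}
```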
* Re: [PATCH 4/5] aio-posix: make AioHandler deletion O(1)
2020-02-14 17:17 ` [PATCH 4/5] aio-posix: make AioHandler deletion O(1) Stefan Hajnoczi
@ 2020-02-19 10:41 ` Sergio Lopez
From: Sergio Lopez @ 2020-02-19 10:41 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: Fam Zheng, Kevin Wolf, qemu-block, qemu-devel, Max Reitz,
Paolo Bonzini, Marc-André Lureau
On Fri, Feb 14, 2020 at 05:17:11PM +0000, Stefan Hajnoczi wrote:
> It is not necessary to scan all AioHandlers for deletion. Keep a list
> of deleted handlers instead of scanning the full list of all handlers.
>
> The AioHandler->deleted field can be dropped. Let's check if the
> handler has been inserted into the deleted list instead. Add a new
> QLIST_IS_INSERTED() API for this check.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> include/block/aio.h | 6 ++++-
> include/qemu/queue.h | 3 +++
> util/aio-posix.c | 53 +++++++++++++++++++++++++++++---------------
> 3 files changed, 43 insertions(+), 19 deletions(-)
Reviewed-by: Sergio Lopez <slp@redhat.com>
* [PATCH 5/5] aio-posix: make AioHandler dispatch O(1) with epoll
2020-02-14 17:17 [PATCH 0/5] aio-posix: towards an O(1) event loop Stefan Hajnoczi
2020-02-14 17:17 ` [PATCH 4/5] aio-posix: make AioHandler deletion O(1) Stefan Hajnoczi
@ 2020-02-14 17:17 ` Stefan Hajnoczi
2020-02-19 11:00 ` Sergio Lopez
2020-02-19 11:13 ` Paolo Bonzini
2020-02-21 15:29 ` [PATCH 0/5] aio-posix: towards an O(1) event loop Stefan Hajnoczi
From: Stefan Hajnoczi @ 2020-02-14 17:17 UTC (permalink / raw)
To: qemu-devel
Cc: Fam Zheng, Kevin Wolf, qemu-block, Max Reitz, Stefan Hajnoczi,
Marc-André Lureau, Paolo Bonzini
File descriptor monitoring is O(1) with epoll(7), but
aio_dispatch_handlers() still scans all AioHandlers instead of
dispatching just those that are ready. This makes aio_poll() O(n) with
respect to the total number of registered handlers.
Add a local ready_list to aio_poll() so that each nested aio_poll()
builds a list of handlers ready to be dispatched. Since file descriptor
polling is level-triggered, nested aio_poll() calls also see fds that
were ready in the parent but not yet dispatched. This guarantees that
nested aio_poll() invocations will dispatch all fds, even those that
became ready before the nested invocation.
Since only handlers ready to be dispatched are placed onto the
ready_list, the new aio_dispatch_ready_handlers() function provides O(1)
dispatch.
Note that AioContext polling is still O(n) and currently cannot be fully
disabled. This still needs to be fixed before aio_poll() is fully O(1).
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
util/aio-posix.c | 106 +++++++++++++++++++++++++++++++++--------------
1 file changed, 76 insertions(+), 30 deletions(-)
diff --git a/util/aio-posix.c b/util/aio-posix.c
index 3a98a2acb9..dc33ca08a6 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -34,6 +34,7 @@ struct AioHandler
void *opaque;
bool is_external;
QLIST_ENTRY(AioHandler) node;
+ QLIST_ENTRY(AioHandler) node_ready; /* only used during aio_poll() */
QLIST_ENTRY(AioHandler) node_deleted;
};
@@ -104,7 +105,18 @@ static void aio_epoll_update(AioContext *ctx, AioHandler *node, bool is_new)
}
}
-static int aio_epoll(AioContext *ctx, int64_t timeout)
+/* Add a handler to a ready list */
+static void add_ready_handler(AioHandlerList *ready_list,
+ AioHandler *node,
+ int revents)
+{
+ QLIST_SAFE_REMOVE(node, node_ready); /* remove from nested parent's list */
+ node->pfd.revents = revents;
+ QLIST_INSERT_HEAD(ready_list, node, node_ready);
+}
+
+static int aio_epoll(AioContext *ctx, AioHandlerList *ready_list,
+ int64_t timeout)
{
GPollFD pfd = {
.fd = ctx->epollfd,
@@ -129,11 +141,13 @@ static int aio_epoll(AioContext *ctx, int64_t timeout)
}
for (i = 0; i < ret; i++) {
int ev = events[i].events;
+ int revents = (ev & EPOLLIN ? G_IO_IN : 0) |
+ (ev & EPOLLOUT ? G_IO_OUT : 0) |
+ (ev & EPOLLHUP ? G_IO_HUP : 0) |
+ (ev & EPOLLERR ? G_IO_ERR : 0);
+
node = events[i].data.ptr;
- node->pfd.revents = (ev & EPOLLIN ? G_IO_IN : 0) |
- (ev & EPOLLOUT ? G_IO_OUT : 0) |
- (ev & EPOLLHUP ? G_IO_HUP : 0) |
- (ev & EPOLLERR ? G_IO_ERR : 0);
+ add_ready_handler(ready_list, node, revents);
}
}
out:
@@ -437,36 +451,63 @@ static void aio_free_deleted_handlers(AioContext *ctx)
qemu_lockcnt_inc_and_unlock(&ctx->list_lock);
}
-static bool aio_dispatch_handlers(AioContext *ctx)
+static bool aio_dispatch_handler(AioContext *ctx, AioHandler *node)
{
- AioHandler *node, *tmp;
bool progress = false;
+ int revents;
- QLIST_FOREACH_SAFE_RCU(node, &ctx->aio_handlers, node, tmp) {
- int revents;
+ revents = node->pfd.revents & node->pfd.events;
+ node->pfd.revents = 0;
- revents = node->pfd.revents & node->pfd.events;
- node->pfd.revents = 0;
+ if (!QLIST_IS_INSERTED(node, node_deleted) &&
+ (revents & (G_IO_IN | G_IO_HUP | G_IO_ERR)) &&
+ aio_node_check(ctx, node->is_external) &&
+ node->io_read) {
+ node->io_read(node->opaque);
- if (!QLIST_IS_INSERTED(node, node_deleted) &&
- (revents & (G_IO_IN | G_IO_HUP | G_IO_ERR)) &&
- aio_node_check(ctx, node->is_external) &&
- node->io_read) {
- node->io_read(node->opaque);
-
- /* aio_notify() does not count as progress */
- if (node->opaque != &ctx->notifier) {
- progress = true;
- }
- }
- if (!QLIST_IS_INSERTED(node, node_deleted) &&
- (revents & (G_IO_OUT | G_IO_ERR)) &&
- aio_node_check(ctx, node->is_external) &&
- node->io_write) {
- node->io_write(node->opaque);
+ /* aio_notify() does not count as progress */
+ if (node->opaque != &ctx->notifier) {
progress = true;
}
}
+ if (!QLIST_IS_INSERTED(node, node_deleted) &&
+ (revents & (G_IO_OUT | G_IO_ERR)) &&
+ aio_node_check(ctx, node->is_external) &&
+ node->io_write) {
+ node->io_write(node->opaque);
+ progress = true;
+ }
+
+ return progress;
+}
+
+/*
+ * If we have a list of ready handlers then this is more efficient than
+ * scanning all handlers with aio_dispatch_handlers().
+ */
+static bool aio_dispatch_ready_handlers(AioContext *ctx,
+ AioHandlerList *ready_list)
+{
+ bool progress = false;
+ AioHandler *node;
+
+ while ((node = QLIST_FIRST(ready_list))) {
+ QLIST_SAFE_REMOVE(node, node_ready);
+ progress = aio_dispatch_handler(ctx, node) || progress;
+ }
+
+ return progress;
+}
+
+/* Slower than aio_dispatch_ready_handlers() but only used via glib */
+static bool aio_dispatch_handlers(AioContext *ctx)
+{
+ AioHandler *node, *tmp;
+ bool progress = false;
+
+ QLIST_FOREACH_SAFE_RCU(node, &ctx->aio_handlers, node, tmp) {
+ progress = aio_dispatch_handler(ctx, node) || progress;
+ }
return progress;
}
@@ -628,6 +669,7 @@ static bool try_poll_mode(AioContext *ctx, int64_t *timeout)
bool aio_poll(AioContext *ctx, bool blocking)
{
+ AioHandlerList ready_list = QLIST_HEAD_INITIALIZER(ready_list);
AioHandler *node;
int i;
int ret = 0;
@@ -678,7 +720,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
/* wait until next event */
if (aio_epoll_check_poll(ctx, pollfds, npfd, timeout)) {
npfd = 0; /* pollfds[] is not being used */
- ret = aio_epoll(ctx, timeout);
+ ret = aio_epoll(ctx, &ready_list, timeout);
} else {
ret = qemu_poll_ns(pollfds, npfd, timeout);
}
@@ -733,7 +775,11 @@ bool aio_poll(AioContext *ctx, bool blocking)
/* if we have any readable fds, dispatch event */
if (ret > 0) {
for (i = 0; i < npfd; i++) {
- nodes[i]->pfd.revents = pollfds[i].revents;
+ int revents = pollfds[i].revents;
+
+ if (revents) {
+ add_ready_handler(&ready_list, nodes[i], revents);
+ }
}
}
@@ -742,7 +788,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
progress |= aio_bh_poll(ctx);
if (ret > 0) {
- progress |= aio_dispatch_handlers(ctx);
+ progress |= aio_dispatch_ready_handlers(ctx, &ready_list);
}
aio_free_deleted_handlers(ctx);
--
2.24.1
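[Editor's note] The ready-list mechanics can likewise be sketched standalone (BSD <sys/queue.h> LIST macros; names invented; handlers must start zero-initialized). The safe removal in add_ready_handler() is what lets the same node be queued again after a nested aio_poll() already placed it on a parent's list:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct handler {
    int revents;
    int dispatched;                    /* counts dispatches for the demo */
    LIST_ENTRY(handler) node_ready;
};

LIST_HEAD(ready_list_head, handler);

/* QLIST_SAFE_REMOVE() equivalent: tolerate "not in a list" and clear the
 * link fields so a later re-insert sees a clean node. */
static void ready_safe_remove(struct handler *h)
{
    if (h->node_ready.le_prev != NULL) {
        LIST_REMOVE(h, node_ready);
        h->node_ready.le_next = NULL;
        h->node_ready.le_prev = NULL;
    }
}

static void add_ready_handler(struct ready_list_head *ready_list,
                              struct handler *h, int revents)
{
    ready_safe_remove(h);   /* may still sit on a nested parent's list */
    h->revents = revents;
    LIST_INSERT_HEAD(ready_list, h, node_ready);
}

/* Pop-and-dispatch: each iteration touches one ready handler, so the cost
 * is O(ready fds) rather than O(all registered handlers). */
static int dispatch_ready(struct ready_list_head *ready_list)
{
    int n = 0;
    struct handler *h;

    while ((h = LIST_FIRST(ready_list)) != NULL) {
        ready_safe_remove(h);
        h->dispatched++;
        n++;
    }
    return n;
}
```

Re-adding an already-queued handler first unlinks it, so it is dispatched exactly once with the latest revents value.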
* Re: [PATCH 5/5] aio-posix: make AioHandler dispatch O(1) with epoll
2020-02-14 17:17 ` [PATCH 5/5] aio-posix: make AioHandler dispatch O(1) with epoll Stefan Hajnoczi
@ 2020-02-19 11:00 ` Sergio Lopez
2020-02-19 11:13 ` Paolo Bonzini
From: Sergio Lopez @ 2020-02-19 11:00 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: Fam Zheng, Kevin Wolf, qemu-block, qemu-devel, Max Reitz,
Paolo Bonzini, Marc-André Lureau
On Fri, Feb 14, 2020 at 05:17:12PM +0000, Stefan Hajnoczi wrote:
> File descriptor monitoring is O(1) with epoll(7), but
> aio_dispatch_handlers() still scans all AioHandlers instead of
> dispatching just those that are ready. This makes aio_poll() O(n) with
> respect to the total number of registered handlers.
>
> Add a local ready_list to aio_poll() so that each nested aio_poll()
> builds a list of handlers ready to be dispatched. Since file descriptor
> polling is level-triggered, nested aio_poll() calls also see fds that
> were ready in the parent but not yet dispatched. This guarantees that
> nested aio_poll() invocations will dispatch all fds, even those that
> became ready before the nested invocation.
>
> Since only handlers ready to be dispatched are placed onto the
> ready_list, the new aio_dispatch_ready_handlers() function provides O(1)
> dispatch.
>
> Note that AioContext polling is still O(n) and currently cannot be fully
> disabled. This still needs to be fixed before aio_poll() is fully O(1).
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> util/aio-posix.c | 106 +++++++++++++++++++++++++++++++++--------------
> 1 file changed, 76 insertions(+), 30 deletions(-)
Reviewed-by: Sergio Lopez <slp@redhat.com>
* Re: [PATCH 5/5] aio-posix: make AioHandler dispatch O(1) with epoll
2020-02-14 17:17 ` [PATCH 5/5] aio-posix: make AioHandler dispatch O(1) with epoll Stefan Hajnoczi
2020-02-19 11:00 ` Sergio Lopez
@ 2020-02-19 11:13 ` Paolo Bonzini
2020-02-21 12:59 ` Stefan Hajnoczi
From: Paolo Bonzini @ 2020-02-19 11:13 UTC (permalink / raw)
To: Stefan Hajnoczi, qemu-devel
Cc: Fam Zheng, Kevin Wolf, Max Reitz, qemu-block, Marc-André Lureau
On 14/02/20 18:17, Stefan Hajnoczi wrote:
> + while ((node = QLIST_FIRST(ready_list))) {
> + QLIST_SAFE_REMOVE(node, node_ready);
Why does this need safe remove?
Paolo
> + progress = aio_dispatch_handler(ctx, node) || progress;
> + }
* Re: [PATCH 5/5] aio-posix: make AioHandler dispatch O(1) with epoll
2020-02-19 11:13 ` Paolo Bonzini
@ 2020-02-21 12:59 ` Stefan Hajnoczi
2020-02-21 13:06 ` Paolo Bonzini
From: Stefan Hajnoczi @ 2020-02-21 12:59 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Fam Zheng, Kevin Wolf, qemu-block, qemu-devel, Max Reitz,
Stefan Hajnoczi, Marc-André Lureau
On Wed, Feb 19, 2020 at 12:13:40PM +0100, Paolo Bonzini wrote:
> On 14/02/20 18:17, Stefan Hajnoczi wrote:
> > + while ((node = QLIST_FIRST(ready_list))) {
> > + QLIST_SAFE_REMOVE(node, node_ready);
>
> Why does this need safe remove?
Yes, it's necessary. QLIST_SAFE_REMOVE() has two properties that make
it "safe":
1. It doesn't crash if the node is currently not on a list.
2. It clears the node's linked list pointers so that future linked
list operations (like QLIST_SAFE_REMOVE()) aren't accidentally
performed on stale pointers.
The node has a long lifespan and will be inserted into ready_lists
multiple times. We need to safely remove it from ready_list to protect
against a corruption the next time the node is inserted into a
ready_list again:
/* Add a handler to a ready list */
static void add_ready_handler(AioHandlerList *ready_list,
AioHandler *node,
int revents)
{
QLIST_SAFE_REMOVE(node, node_ready); /* remove from nested parent's list */
^---- would cause corruption if node->node_ready was stale!
Would you like me to add a comment?
Stefan
* Re: [PATCH 5/5] aio-posix: make AioHandler dispatch O(1) with epoll
2020-02-21 12:59 ` Stefan Hajnoczi
@ 2020-02-21 13:06 ` Paolo Bonzini
2020-02-21 14:44 ` Stefan Hajnoczi
2020-02-21 14:47 ` Stefan Hajnoczi
From: Paolo Bonzini @ 2020-02-21 13:06 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: Fam Zheng, Kevin Wolf, qemu-block, qemu-devel, Max Reitz,
Stefan Hajnoczi, Marc-André Lureau
On 21/02/20 13:59, Stefan Hajnoczi wrote:
> 1. It doesn't crash if the node is currently not on a list.
> 2. It clears the node's linked list pointers so that future linked
> list operations (like QLIST_SAFE_REMOVE()) aren't accidentally
> performed on stale pointers.
>
> The node has a long lifespan and will be inserted into ready_lists
> multiple times. We need to safely remove it from ready_list to protect
> against a corruption the next time the node is inserted into a
> ready_list again:
Ah, so the one I singled out is for (2) (we know the node is currently
on a list), while the one below is for (1). Would it make sense to move
(2) to Q*_REMOVE_*? We can do it separately after this pull request.
> /* Add a handler to a ready list */
> static void add_ready_handler(AioHandlerList *ready_list,
> AioHandler *node,
> int revents)
> {
> QLIST_SAFE_REMOVE(node, node_ready); /* remove from nested parent's list */
> ^---- would cause corruption if node->node_ready was stale!
>
> Would you like me to add a comment?
No, it's okay.
Paolo
* Re: [PATCH 5/5] aio-posix: make AioHandler dispatch O(1) with epoll
2020-02-21 13:06 ` Paolo Bonzini
@ 2020-02-21 14:44 ` Stefan Hajnoczi
2020-02-21 14:47 ` Stefan Hajnoczi
From: Stefan Hajnoczi @ 2020-02-21 14:44 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Fam Zheng, Kevin Wolf, qemu-block, Stefan Hajnoczi, qemu-devel,
Max Reitz, Marc-André Lureau
On Fri, Feb 21, 2020 at 02:06:26PM +0100, Paolo Bonzini wrote:
> On 21/02/20 13:59, Stefan Hajnoczi wrote:
> > 1. It doesn't crash if the node is currently not on a list.
> > 2. It clears the node's linked list pointers so that future linked
> >    list operations (like QLIST_SAFE_REMOVE()) aren't accidentally
> >    performed on stale pointers.
> >
> > The node has a long lifespan and will be inserted into ready_lists
> > multiple times. We need to safely remove it from ready_list to protect
> > against a corruption the next time the node is inserted into a
> > ready_list again:
>
> Ah, so the one I singled out is for (2) (we know the node is currently
> on a list), while the one below is for (1). Would it make sense to move
> (2) to Q*_REMOVE_*? We can do it separately after this pull request.
Extending all Q*_REMOVE*() macros to clear the linked list pointers is
nice. I'll send a follow-up patch.
Stefan
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 5/5] aio-posix: make AioHandler dispatch O(1) with epoll
2020-02-21 13:06 ` Paolo Bonzini
2020-02-21 14:44 ` Stefan Hajnoczi
@ 2020-02-21 14:47 ` Stefan Hajnoczi
2020-02-21 15:04 ` Paolo Bonzini
1 sibling, 1 reply; 20+ messages in thread
From: Stefan Hajnoczi @ 2020-02-21 14:47 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Fam Zheng, Kevin Wolf, qemu-block, Stefan Hajnoczi, qemu-devel,
Max Reitz, Marc-André Lureau
[-- Attachment #1: Type: text/plain, Size: 682 bytes --]
On Fri, Feb 21, 2020 at 02:06:26PM +0100, Paolo Bonzini wrote:
> On 21/02/20 13:59, Stefan Hajnoczi wrote:
> > /* Add a handler to a ready list */
> > static void add_ready_handler(AioHandlerList *ready_list,
> >                               AioHandler *node,
> >                               int revents)
> > {
> >     QLIST_SAFE_REMOVE(node, node_ready); /* remove from nested parent's list */
> >     ^---- would cause corruption if node->node_ready was stale!
> >
> > Would you like me to add a comment?
> No, it's okay.
Are you happy with this series?
Shall I include it in my next pull request or do you want to merge it?
Thanks,
Stefan
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 5/5] aio-posix: make AioHandler dispatch O(1) with epoll
2020-02-21 14:47 ` Stefan Hajnoczi
@ 2020-02-21 15:04 ` Paolo Bonzini
2020-02-21 15:29 ` Stefan Hajnoczi
0 siblings, 1 reply; 20+ messages in thread
From: Paolo Bonzini @ 2020-02-21 15:04 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: Fam Zheng, Kevin Wolf, qemu-block, Stefan Hajnoczi, qemu-devel,
Max Reitz, Marc-André Lureau
On 21/02/20 15:47, Stefan Hajnoczi wrote:
>>>     QLIST_SAFE_REMOVE(node, node_ready); /* remove from nested parent's list */
>>>     ^---- would cause corruption if node->node_ready was stale!
>>>
>>> Would you like me to add a comment?
>> No, it's okay.
> Are you happy with this series?
Yes. Let's keep the Q*_REMOVE cleanup on the todo list. I'd keep
Q*_SAFE_REMOVE, but clear the pointer unconditionally in Q*_REMOVE so
that we can have something like Q*_IN_LIST too.
> Shall I include it in my next pull request or do you want to merge it?
No, it's yours.
Paolo
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 5/5] aio-posix: make AioHandler dispatch O(1) with epoll
2020-02-21 15:04 ` Paolo Bonzini
@ 2020-02-21 15:29 ` Stefan Hajnoczi
2020-02-21 15:37 ` Paolo Bonzini
0 siblings, 1 reply; 20+ messages in thread
From: Stefan Hajnoczi @ 2020-02-21 15:29 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Fam Zheng, Kevin Wolf, qemu-block, Stefan Hajnoczi, qemu-devel,
Max Reitz, Marc-André Lureau
[-- Attachment #1: Type: text/plain, Size: 833 bytes --]
On Fri, Feb 21, 2020 at 04:04:10PM +0100, Paolo Bonzini wrote:
> On 21/02/20 15:47, Stefan Hajnoczi wrote:
> >>>     QLIST_SAFE_REMOVE(node, node_ready); /* remove from nested parent's list */
> >>>     ^---- would cause corruption if node->node_ready was stale!
> >>>
> >>> Would you like me to add a comment?
> >> No, it's okay.
> > Are you happy with this series?
>
> Yes. Let's keep the Q*_REMOVE cleanup on the todo list. I'd keep
> Q*_SAFE_REMOVE, but clear the pointer unconditionally in Q*_REMOVE so
> that we can have something like Q*_IN_LIST too.
QLIST_IS_INSERTED() is part of this patch series, although I can rename
it to Q*_IN_LIST() and cover all linked list variants. :)
> > Shall I include it in my next pull request or do you want to merge it?
>
> No, it's yours.
Thanks!
Stefan
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 5/5] aio-posix: make AioHandler dispatch O(1) with epoll
2020-02-21 15:29 ` Stefan Hajnoczi
@ 2020-02-21 15:37 ` Paolo Bonzini
0 siblings, 0 replies; 20+ messages in thread
From: Paolo Bonzini @ 2020-02-21 15:37 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: Fam Zheng, Kevin Wolf, qemu-block, Stefan Hajnoczi, qemu-devel,
Max Reitz, Marc-André Lureau
On 21/02/20 16:29, Stefan Hajnoczi wrote:
>> Yes. Let's keep the Q*_REMOVE cleanup on the todo list. I'd keep
>> Q*_SAFE_REMOVE, but clear the pointer unconditionally in Q*_REMOVE so
>> that we can have something like Q*_IN_LIST too.
> QLIST_IS_INSERTED() is part of this patch series, although I can rename
> it to Q*_IN_LIST() and cover all linked list variants. :)
Yes, I meant having it for all variants. I remembered you adding it but
not the exact spelling!
Paolo
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/5] aio-posix: towards an O(1) event loop
2020-02-14 17:17 [PATCH 0/5] aio-posix: towards an O(1) event loop Stefan Hajnoczi
` (4 preceding siblings ...)
2020-02-14 17:17 ` [PATCH 5/5] aio-posix: make AioHandler dispatch O(1) with epoll Stefan Hajnoczi
@ 2020-02-21 15:29 ` Stefan Hajnoczi
5 siblings, 0 replies; 20+ messages in thread
From: Stefan Hajnoczi @ 2020-02-21 15:29 UTC (permalink / raw)
To: qemu-devel
Cc: Fam Zheng, Kevin Wolf, qemu-block, Max Reitz,
Marc-André Lureau, Paolo Bonzini
[-- Attachment #1: Type: text/plain, Size: 1749 bytes --]
On Fri, Feb 14, 2020 at 05:17:07PM +0000, Stefan Hajnoczi wrote:
> This patch series makes AioHandler deletion and dispatch O(1) with respect to
> the total number of registered handlers. The event loop has scalability
> problems when many AioHandlers are registered because it is O(n). Linux
> epoll(7) is used to avoid scanning over all pollfds but parts of the code still
> scan all AioHandlers.
>
> This series reduces QEMU CPU utilization and therefore increases IOPS,
> especially for guests that have many devices. It was tested with 32 vCPUs, 2
> virtio-blk,num-queues=1,iothread=iothread1, and 99
> virtio-blk,num-queues=32,iothread=iothread1 devices. Using an IOThread is
> necessary because this series does not improve the glib main loop, a non-goal
> since the glib API is inherently O(n).
>
> AioContext polling remains O(n) and will be addressed in a separate patch
> series. This patch series increases IOPS from 260k to 300k when AioContext
> polling is commented out
> (rw=randread,bs=4k,iodepth=32,ioengine=libaio,direct=1).
>
> Stefan Hajnoczi (5):
> aio-posix: fix use after leaving scope in aio_poll()
> aio-posix: don't pass ns timeout to epoll_wait()
> qemu/queue.h: add QLIST_SAFE_REMOVE()
> aio-posix: make AioHandler deletion O(1)
> aio-posix: make AioHandler dispatch O(1) with epoll
>
> block.c | 5 +-
> chardev/spice.c | 4 +-
> include/block/aio.h | 6 +-
> include/qemu/queue.h | 17 +++++
> util/aio-posix.c | 172 +++++++++++++++++++++++++++++--------------
> 5 files changed, 141 insertions(+), 63 deletions(-)
>
> --
> 2.24.1
>
Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block
Stefan
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread