* [Qemu-devel] [PATCH v3 0/7] aio: Support epoll by introducing qemu_poll abstraction
@ 2015-04-16 4:57 Fam Zheng
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 1/7] poll: Introduce QEMU Poll API Fam Zheng
` (7 more replies)
0 siblings, 8 replies; 12+ messages in thread
From: Fam Zheng @ 2015-04-16 4:57 UTC (permalink / raw)
To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, Stefan Hajnoczi
v3: Rebase to master for 2.4.
Although epoll_pwait1 syscall is still being worked on [1], the QEMU part
(if any) will base on this, so let's merge it first.
That part is not included in this version because I'm still evaluating by
comparing epoll_pwait1 with epoll+timerfd as with current master they seem
to be really close.
[1]: http://www.spinics.net/lists/linux-api/msg08216.html
v2: Emulate nanoseconds precison of timeout with ppoll and timerfd.
Their performance is on par with each other, but both much better than
qemu.git:
syscall high # of fd low # of fd
-------------------------------------------------
qemu.git(ppoll) 44 96
ppoll+epoll 85 101
timerfd+epoll 87 109
(In high # of fd case, 3 activated but idle virtio-console devices are
attached, which will add us hundereds of fds to poll)
v1 cover letter
---------------
ppoll(2) doesn't scale as well as epoll: The elapsed time of the syscall is
linear to the number of fd's we poll, which hurts performance a bit when the
number of devices are many, or when a virtio device registers many virtqueues
(virtio-serial, for instance).
This series introduces qemu_poll, which is implemented with g_poll and epoll,
decided at configure time with CONFIG_EPOLL.
Fam
Fam Zheng (7):
poll: Introduce QEMU Poll API
posix-aio: Use QEMU poll interface
poll: Add epoll implementation for qemu_poll
main-loop: Replace qemu_poll_ns with qemu_poll
tests: Add test case for qemu_poll
poll-glib: Support ppoll
poll-linux: Add timerfd support
Makefile.objs | 2 +
aio-posix.c | 52 ++++----
async.c | 5 +-
include/block/aio.h | 7 +-
include/qemu/poll.h | 40 ++++++
include/qemu/timer.h | 13 --
include/qemu/typedefs.h | 4 +-
main-loop.c | 35 +++++-
poll-glib.c | 153 +++++++++++++++++++++++
poll-linux.c | 314 ++++++++++++++++++++++++++++++++++++++++++++++++
qemu-timer.c | 28 -----
tests/Makefile | 2 +
tests/test-poll.c | 272 +++++++++++++++++++++++++++++++++++++++++
13 files changed, 845 insertions(+), 82 deletions(-)
create mode 100644 include/qemu/poll.h
create mode 100644 poll-glib.c
create mode 100644 poll-linux.c
create mode 100644 tests/test-poll.c
--
1.9.3
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Qemu-devel] [PATCH v3 1/7] poll: Introduce QEMU Poll API
2015-04-16 4:57 [Qemu-devel] [PATCH v3 0/7] aio: Support epoll by introducing qemu_poll abstraction Fam Zheng
@ 2015-04-16 4:57 ` Fam Zheng
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 2/7] posix-aio: Use QEMU poll interface Fam Zheng
` (6 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: Fam Zheng @ 2015-04-16 4:57 UTC (permalink / raw)
To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, Stefan Hajnoczi
This is abstract of underlying poll implementation. A glib
implementation is included.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
Makefile.objs | 2 +-
include/qemu/poll.h | 40 +++++++++++++++
include/qemu/typedefs.h | 4 +-
poll-glib.c | 130 ++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 174 insertions(+), 2 deletions(-)
create mode 100644 include/qemu/poll.h
create mode 100644 poll-glib.c
diff --git a/Makefile.objs b/Makefile.objs
index 28999d3..77a56d0 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -9,7 +9,7 @@ util-obj-y = util/ qobject/ qapi/ qapi-types.o qapi-visit.o qapi-event.o
block-obj-y = async.o thread-pool.o
block-obj-y += nbd.o block.o blockjob.o
block-obj-y += main-loop.o iohandler.o qemu-timer.o
-block-obj-$(CONFIG_POSIX) += aio-posix.o
+block-obj-$(CONFIG_POSIX) += aio-posix.o poll-glib.o
block-obj-$(CONFIG_WIN32) += aio-win32.o
block-obj-y += block/
block-obj-y += qemu-io-cmds.o
diff --git a/include/qemu/poll.h b/include/qemu/poll.h
new file mode 100644
index 0000000..597975f
--- /dev/null
+++ b/include/qemu/poll.h
@@ -0,0 +1,40 @@
+#ifndef QEMU_POLL_H
+#define QEMU_POLL_H
+
+#include "qemu/typedefs.h"
+#include "qemu-common.h"
+
+struct QEMUPollEvent {
+ int fd;
+ int events;
+ int revents;
+ void *opaque;
+};
+
+QEMUPoll *qemu_poll_new(void);
+void qemu_poll_free(QEMUPoll *qpoll);
+int qemu_poll(QEMUPoll *qpoll, int64_t timeout_ns);
+
+/* Add an fd to poll. Return -EEXIST if fd already registered. */
+int qemu_poll_add(QEMUPoll *qpoll, int fd, int gio_events, void *opaque);
+
+/* Delete a previously added fd. Return -ENOENT if fd not registered. */
+int qemu_poll_del(QEMUPoll *qpoll, int fd);
+
+/**
+ * A shortcut to:
+ * 1) remove all the existing fds;
+ * 2) add all in @fds, while setting each fd's opaque pointer to the pollfd
+ * struct itself.
+ */
+int qemu_poll_set_fds(QEMUPoll *qpoll, GPollFD *fds, int nfds);
+
+/**
+ * Query the events from last qemu_poll. ONLY revent and opaque are valid in
+ * @events after return.
+ */
+int qemu_poll_get_events(QEMUPoll *qpoll,
+ QEMUPollEvent *events,
+ int max_events);
+
+#endif
diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index cde3314..8e638cb 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -64,10 +64,12 @@ typedef struct QEMUBH QEMUBH;
typedef struct QemuConsole QemuConsole;
typedef struct QEMUFile QEMUFile;
typedef struct QEMUMachine QEMUMachine;
+typedef struct QEMUPoll QEMUPoll;
+typedef struct QEMUPollEvent QEMUPollEvent;
typedef struct QEMUSGList QEMUSGList;
typedef struct QEMUSizedBuffer QEMUSizedBuffer;
-typedef struct QEMUTimerListGroup QEMUTimerListGroup;
typedef struct QEMUTimer QEMUTimer;
+typedef struct QEMUTimerListGroup QEMUTimerListGroup;
typedef struct Range Range;
typedef struct SerialState SerialState;
typedef struct SHPCDevice SHPCDevice;
diff --git a/poll-glib.c b/poll-glib.c
new file mode 100644
index 0000000..64fde69
--- /dev/null
+++ b/poll-glib.c
@@ -0,0 +1,130 @@
+/*
+ * g_poll implementation for QEMU Poll API
+ *
+ * Copyright Red Hat, Inc. 2014
+ *
+ * Authors:
+ * Fam Zheng <famz@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "qemu/timer.h"
+#include "qemu/poll.h"
+
+struct QEMUPoll {
+ /* Array of GPollFD for g_poll() */
+ GArray *gpollfds;
+
+ /* Array of opaque pointers, should be in sync with gpollfds */
+ GArray *opaque;
+};
+
+QEMUPoll *qemu_poll_new(void)
+{
+ QEMUPoll *qpoll = g_new0(QEMUPoll, 1);
+ qpoll->gpollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD));
+ qpoll->opaque = g_array_new(FALSE, FALSE, sizeof(void *));
+ return qpoll;
+}
+
+
+void qemu_poll_free(QEMUPoll *qpoll)
+{
+ g_array_unref(qpoll->gpollfds);
+ g_array_unref(qpoll->opaque);
+ g_free(qpoll);
+}
+
+int qemu_poll(QEMUPoll *qpoll, int64_t timeout_ns)
+{
+ int i;
+ for (i = 0; i < qpoll->gpollfds->len; i++) {
+ GPollFD *p = &g_array_index(qpoll->gpollfds, GPollFD, i);
+ p->revents = 0;
+ }
+ return g_poll((GPollFD *)qpoll->gpollfds->data,
+ qpoll->gpollfds->len,
+ qemu_timeout_ns_to_ms(timeout_ns));
+}
+
+/* Add an fd to poll. Return -EEXIST if fd already registered. */
+int qemu_poll_add(QEMUPoll *qpoll, int fd, int gio_events, void *opaque)
+{
+ int i;
+ GPollFD pfd = (GPollFD) {
+ .fd = fd,
+ .revents = 0,
+ .events = gio_events,
+ };
+
+ for (i = 0; i < qpoll->gpollfds->len; i++) {
+ GPollFD *p = &g_array_index(qpoll->gpollfds, GPollFD, i);
+ if (p->fd == fd) {
+ return -EEXIST;
+ }
+ }
+ g_array_append_val(qpoll->gpollfds, pfd);
+ g_array_append_val(qpoll->opaque, opaque);
+ assert(qpoll->gpollfds->len == qpoll->opaque->len);
+ return 0;
+}
+
+/* Delete a previously added fd. Return -ENOENT if fd not registered. */
+int qemu_poll_del(QEMUPoll *qpoll, int fd)
+{
+ int i;
+ int found = -1;
+
+ for (i = 0; i < qpoll->gpollfds->len; i++) {
+ GPollFD *p = &g_array_index(qpoll->gpollfds, GPollFD, i);
+ if (p->fd == fd) {
+ found = i;
+ break;
+ }
+ }
+ if (found >= 0) {
+ g_array_remove_index(qpoll->gpollfds, found);
+ g_array_remove_index(qpoll->opaque, found);
+ assert(qpoll->gpollfds->len == qpoll->opaque->len);
+ return 0;
+ } else {
+ return -ENOENT;
+ }
+}
+
+int qemu_poll_set_fds(QEMUPoll *qpoll, GPollFD *fds, int nfds)
+{
+ int i;
+ g_array_set_size(qpoll->gpollfds, 0);
+ g_array_set_size(qpoll->opaque, 0);
+ for (i = 0; i < nfds; i++) {
+ void *opaque = &fds[i];
+ g_array_append_val(qpoll->gpollfds, fds[i]);
+ g_array_append_val(qpoll->opaque, opaque);
+ assert(qpoll->gpollfds->len == qpoll->opaque->len);
+ }
+ return nfds;
+}
+
+int qemu_poll_get_events(QEMUPoll *qpoll,
+ QEMUPollEvent *events,
+ int max_events)
+{
+ int i;
+ int r = 0;
+ for (i = 0; i < qpoll->gpollfds->len && r < max_events; i++) {
+ GPollFD *fd = &g_array_index(qpoll->gpollfds, GPollFD, i);
+ if (fd->revents & fd->events) {
+ events[r].fd = fd->fd;
+ events[r].revents = fd->revents;
+ events[r].events = fd->events;
+ events[r].opaque = g_array_index(qpoll->opaque, void *, i);
+ r++;
+ }
+ }
+ return r;
+}
--
1.9.3
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [Qemu-devel] [PATCH v3 2/7] posix-aio: Use QEMU poll interface
2015-04-16 4:57 [Qemu-devel] [PATCH v3 0/7] aio: Support epoll by introducing qemu_poll abstraction Fam Zheng
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 1/7] poll: Introduce QEMU Poll API Fam Zheng
@ 2015-04-16 4:57 ` Fam Zheng
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 3/7] poll: Add epoll implementation for qemu_poll Fam Zheng
` (5 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: Fam Zheng @ 2015-04-16 4:57 UTC (permalink / raw)
To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, Stefan Hajnoczi
The AIO handler list is only modified by aio_set_fd_handler, so we can
easily add del poll fd there. Initialize a QEMUPoll and keep track of
all the fds, so we don't need to rebuild a GPollFD array for g_poll in
aio_poll.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
aio-posix.c | 52 +++++++++++++++++++++-------------------------------
async.c | 5 +++--
include/block/aio.h | 7 +++++--
3 files changed, 29 insertions(+), 35 deletions(-)
diff --git a/aio-posix.c b/aio-posix.c
index cbd4c34..5c81fe6 100644
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -17,6 +17,7 @@
#include "block/block.h"
#include "qemu/queue.h"
#include "qemu/sockets.h"
+#include "qemu/poll.h"
struct AioHandler
{
@@ -55,6 +56,7 @@ void aio_set_fd_handler(AioContext *ctx,
/* Are we deleting the fd handler? */
if (!io_read && !io_write) {
if (node) {
+ qemu_poll_del(ctx->qpoll, fd);
g_source_remove_poll(&ctx->source, &node->pfd);
/* If the lock is held, just mark the node as deleted */
@@ -71,13 +73,15 @@ void aio_set_fd_handler(AioContext *ctx,
}
}
} else {
- if (node == NULL) {
+ if (node) {
+ /* Remove the old */
+ qemu_poll_del(ctx->qpoll, fd);
+ g_source_remove_poll(&ctx->source, &node->pfd);
+ } else {
/* Alloc and insert if it's not already there */
node = g_new0(AioHandler, 1);
node->pfd.fd = fd;
QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);
-
- g_source_add_poll(&ctx->source, &node->pfd);
}
/* Update handler with latest information */
node->io_read = io_read;
@@ -87,6 +91,8 @@ void aio_set_fd_handler(AioContext *ctx,
node->pfd.events = (io_read ? G_IO_IN | G_IO_HUP | G_IO_ERR : 0);
node->pfd.events |= (io_write ? G_IO_OUT | G_IO_ERR : 0);
+ qemu_poll_add(ctx->qpoll, fd, node->pfd.events, node);
+ g_source_add_poll(&ctx->source, &node->pfd);
}
aio_notify(ctx);
@@ -191,6 +197,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
AioHandler *node;
bool was_dispatching;
int ret;
+ int i;
bool progress;
was_dispatching = ctx->dispatching;
@@ -208,38 +215,21 @@ bool aio_poll(AioContext *ctx, bool blocking)
*/
aio_set_dispatching(ctx, !blocking);
- ctx->walking_handlers++;
-
- g_array_set_size(ctx->pollfds, 0);
-
- /* fill pollfds */
- QLIST_FOREACH(node, &ctx->aio_handlers, node) {
- node->pollfds_idx = -1;
- if (!node->deleted && node->pfd.events) {
- GPollFD pfd = {
- .fd = node->pfd.fd,
- .events = node->pfd.events,
- };
- node->pollfds_idx = ctx->pollfds->len;
- g_array_append_val(ctx->pollfds, pfd);
- }
- }
-
- ctx->walking_handlers--;
-
/* wait until next event */
- ret = qemu_poll_ns((GPollFD *)ctx->pollfds->data,
- ctx->pollfds->len,
- blocking ? aio_compute_timeout(ctx) : 0);
+ ret = qemu_poll(ctx->qpoll, blocking ? aio_compute_timeout(ctx) : 0);
/* if we have any readable fds, dispatch event */
if (ret > 0) {
- QLIST_FOREACH(node, &ctx->aio_handlers, node) {
- if (node->pollfds_idx != -1) {
- GPollFD *pfd = &g_array_index(ctx->pollfds, GPollFD,
- node->pollfds_idx);
- node->pfd.revents = pfd->revents;
- }
+ int r;
+ g_array_set_size(ctx->events, ret);
+ r = qemu_poll_get_events(ctx->qpoll,
+ (QEMUPollEvent *)ctx->events->data,
+ ret);
+ assert(r == ret);
+ for (i = 0; i < r; i++) {
+ QEMUPollEvent *e = &g_array_index(ctx->events, QEMUPollEvent, i);
+ node = e->opaque;
+ node->pfd.revents = e->revents;
}
}
diff --git a/async.c b/async.c
index 2b51e87..dd5d62f 100644
--- a/async.c
+++ b/async.c
@@ -27,6 +27,7 @@
#include "block/thread-pool.h"
#include "qemu/main-loop.h"
#include "qemu/atomic.h"
+#include "qemu/poll.h"
/***********************************************************/
/* bottom halves (can be seen as timers which expire ASAP) */
@@ -230,7 +231,6 @@ aio_ctx_finalize(GSource *source)
event_notifier_cleanup(&ctx->notifier);
rfifolock_destroy(&ctx->lock);
qemu_mutex_destroy(&ctx->bh_lock);
- g_array_free(ctx->pollfds, TRUE);
timerlistgroup_deinit(&ctx->tlg);
}
@@ -299,10 +299,11 @@ AioContext *aio_context_new(Error **errp)
return NULL;
}
g_source_set_can_recurse(&ctx->source, true);
+ ctx->qpoll = qemu_poll_new();
+ ctx->events = g_array_new(FALSE, FALSE, sizeof(QEMUPollEvent));
aio_set_event_notifier(ctx, &ctx->notifier,
(EventNotifierHandler *)
event_notifier_test_and_clear);
- ctx->pollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD));
ctx->thread_pool = NULL;
qemu_mutex_init(&ctx->bh_lock);
rfifolock_init(&ctx->lock, aio_rfifolock_cb, ctx);
diff --git a/include/block/aio.h b/include/block/aio.h
index 7d1e26b..af7edba 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -82,8 +82,11 @@ struct AioContext {
/* Used for aio_notify. */
EventNotifier notifier;
- /* GPollFDs for aio_poll() */
- GArray *pollfds;
+ /* qemu_poll context */
+ QEMUPoll *qpoll;
+
+ /* QEMUPollEvents for qemu_poll_get_events() */
+ GArray *events;
/* Thread pool for performing work and receiving completion callbacks */
struct ThreadPool *thread_pool;
--
1.9.3
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [Qemu-devel] [PATCH v3 3/7] poll: Add epoll implementation for qemu_poll
2015-04-16 4:57 [Qemu-devel] [PATCH v3 0/7] aio: Support epoll by introducing qemu_poll abstraction Fam Zheng
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 1/7] poll: Introduce QEMU Poll API Fam Zheng
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 2/7] posix-aio: Use QEMU poll interface Fam Zheng
@ 2015-04-16 4:57 ` Fam Zheng
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 4/7] main-loop: Replace qemu_poll_ns with qemu_poll Fam Zheng
` (4 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: Fam Zheng @ 2015-04-16 4:57 UTC (permalink / raw)
To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, Stefan Hajnoczi
This implements qemu_poll with ppoll + epoll. The only complex part is
qemu_poll_set_fds, which will sync up epollfd with epoll_ctl by
computing the symmetric difference of previous and new fds.
The ppoll is used to retain ns precision of timeout.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
Makefile.objs | 4 +-
poll-linux.c | 232 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 235 insertions(+), 1 deletion(-)
create mode 100644 poll-linux.c
diff --git a/Makefile.objs b/Makefile.objs
index 77a56d0..d3f6f51 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -9,7 +9,9 @@ util-obj-y = util/ qobject/ qapi/ qapi-types.o qapi-visit.o qapi-event.o
block-obj-y = async.o thread-pool.o
block-obj-y += nbd.o block.o blockjob.o
block-obj-y += main-loop.o iohandler.o qemu-timer.o
-block-obj-$(CONFIG_POSIX) += aio-posix.o poll-glib.o
+block-obj-$(CONFIG_POSIX) += aio-posix.o
+block-obj-$(CONFIG_EPOLL) += poll-linux.o
+block-obj-$(if $(CONFIG_EPOLL),n,$(CONFIG_POSIX)) += poll-glib.o
block-obj-$(CONFIG_WIN32) += aio-win32.o
block-obj-y += block/
block-obj-y += qemu-io-cmds.o
diff --git a/poll-linux.c b/poll-linux.c
new file mode 100644
index 0000000..7605004
--- /dev/null
+++ b/poll-linux.c
@@ -0,0 +1,232 @@
+/*
+ * epoll implementation for QEMU Poll API
+ *
+ * Copyright Red Hat, Inc. 2014
+ *
+ * Authors:
+ * Fam Zheng <famz@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include <sys/epoll.h>
+#include <glib.h>
+#include <poll.h>
+#include "qemu-common.h"
+#include "qemu/timer.h"
+#include "qemu/poll.h"
+
+struct QEMUPoll {
+ int epollfd;
+ struct epoll_event *events;
+ int max_events;
+ int nevents;
+ GHashTable *fds;
+};
+
+QEMUPoll *qemu_poll_new(void)
+{
+ int epollfd;
+ QEMUPoll *qpoll = g_new0(QEMUPoll, 1);
+ epollfd = epoll_create1(EPOLL_CLOEXEC);
+ if (epollfd < 0) {
+ perror("epoll_create1:");
+ abort();
+ }
+ qpoll->epollfd = epollfd;
+ qpoll->max_events = 1;
+ qpoll->events = g_new(struct epoll_event, 1);
+ qpoll->fds = g_hash_table_new_full(g_int_hash, g_int_equal,
+ NULL, g_free);
+ return qpoll;
+}
+
+void qemu_poll_free(QEMUPoll *qpoll)
+{
+ g_free(qpoll->events);
+ g_hash_table_destroy(qpoll->fds);
+ close(qpoll->epollfd);
+ g_free(qpoll);
+}
+
+int qemu_poll(QEMUPoll *qpoll, int64_t timeout_ns)
+{
+ int r;
+ struct pollfd fd = {
+ .fd = qpoll->epollfd,
+ .events = POLLIN | POLLOUT | POLLERR | POLLHUP,
+ };
+
+ if (timeout_ns >= 0) {
+ struct timespec ts;
+ ts.tv_sec = timeout_ns / 1000000000LL;
+ ts.tv_nsec = timeout_ns % 1000000000LL;
+ r = ppoll(&fd, 1, &ts, NULL);
+ } else {
+ r = ppoll(&fd, 1, NULL, NULL);
+ }
+
+ if (r > 0) {
+ r = epoll_wait(qpoll->epollfd,
+ qpoll->events,
+ qpoll->max_events,
+ 0);
+ qpoll->nevents = r;
+ }
+ return r;
+}
+
+static inline uint32_t epoll_event_from_gio_events(int gio_events)
+{
+
+ return (gio_events & G_IO_IN ? EPOLLIN : 0) |
+ (gio_events & G_IO_OUT ? EPOLLOUT : 0) |
+ (gio_events & G_IO_ERR ? EPOLLERR : 0) |
+ (gio_events & G_IO_HUP ? EPOLLHUP : 0);
+}
+
+
+/* Add an fd to poll. Return -EEXIST if fd already registered. */
+int qemu_poll_add(QEMUPoll *qpoll, int fd, int gio_events, void *opaque)
+{
+ int ret;
+ struct epoll_event ev;
+ QEMUPollEvent *e;
+
+ ev.events = epoll_event_from_gio_events(gio_events);
+ ev.data.fd = fd;
+ ret = epoll_ctl(qpoll->epollfd, EPOLL_CTL_ADD, fd, &ev);
+ if (ret) {
+ ret = -errno;
+ return ret;
+ }
+ qpoll->max_events++;
+ qpoll->events = g_renew(struct epoll_event,
+ qpoll->events,
+ qpoll->max_events);
+ /* Shouldn't get here if the fd is already added since epoll_ctl would
+ * return -EEXIST, assert to be sure */
+ assert(g_hash_table_lookup(qpoll->fds, &fd) == NULL);
+ e = g_new0(QEMUPollEvent, 1);
+ e->fd = fd;
+ e->events = gio_events;
+ e->opaque = opaque;
+ g_hash_table_insert(qpoll->fds, &e->fd, e);
+ return ret;
+}
+
+/* Delete a previously added fd. Return -ENOENT if fd not registered. */
+int qemu_poll_del(QEMUPoll *qpoll, int fd)
+{
+ int ret;
+
+ if (!g_hash_table_lookup(qpoll->fds, &fd)) {
+ ret = -ENOENT;
+ goto out;
+ }
+ ret = epoll_ctl(qpoll->epollfd, EPOLL_CTL_DEL, fd, NULL);
+ if (ret) {
+ ret = -errno;
+ goto out;
+ }
+ qpoll->max_events--;
+ qpoll->events = g_renew(struct epoll_event,
+ qpoll->events,
+ qpoll->max_events);
+out:
+ g_hash_table_remove(qpoll->fds, &fd);
+ return ret;
+}
+
+static void qemu_poll_copy_fd(gpointer key, gpointer value, gpointer user_data)
+{
+ GHashTable *dst = user_data;
+ QEMUPollEvent *event, *copy;
+
+ event = value;
+ copy = g_new(QEMUPollEvent, 1);
+ *copy = *event;
+ g_hash_table_insert(dst, ©->fd, copy);
+}
+
+static void qemu_poll_del_fd(gpointer key, gpointer value, gpointer user_data)
+{
+ QEMUPoll *qpoll = user_data;
+ QEMUPollEvent *event = value;
+
+ qemu_poll_del(qpoll, event->fd);
+}
+
+int qemu_poll_set_fds(QEMUPoll *qpoll, GPollFD *fds, int nfds)
+{
+ int i;
+ int updated = 0;
+ int ret = nfds;
+ int old_size = g_hash_table_size(qpoll->fds);
+
+ for (i = 0; i < nfds; i++) {
+ int r;
+ GPollFD *fd = &fds[i];
+ QEMUPollEvent *e = g_hash_table_lookup(qpoll->fds, &fd->fd);
+ if (e) {
+ updated++;
+ assert(e->fd == fd->fd);
+ if (e->events == fd->events) {
+ e->opaque = fd;
+ continue;
+ }
+ r = qemu_poll_del(qpoll, fd->fd);
+ if (r < 0) {
+ ret = r;
+ break;
+ }
+ }
+ r = qemu_poll_add(qpoll, fd->fd, fd->events, fd);
+ if (r < 0) {
+ ret = r;
+ break;
+ }
+ }
+
+ if (updated < old_size) {
+ GHashTable *fds_copy;
+
+ fds_copy = g_hash_table_new_full(g_int_hash, g_int_equal, NULL, g_free);
+ g_hash_table_foreach(qpoll->fds, qemu_poll_copy_fd, fds_copy);
+
+ /* Some fds need to be removed, as they are not seen in new fds */
+ for (i = 0; i < nfds; i++) {
+ GPollFD *fd = &fds[i];
+ g_hash_table_remove(fds_copy, &fd->fd);
+ }
+
+ g_hash_table_foreach(fds_copy, qemu_poll_del_fd, qpoll);
+ g_hash_table_destroy(fds_copy);
+ }
+ return ret;
+}
+
+int qemu_poll_get_events(QEMUPoll *qpoll,
+ QEMUPollEvent *events,
+ int max_events)
+{
+ int i;
+ QEMUPollEvent *p;
+ struct epoll_event *ev;
+ int fd;
+
+ for (i = 0; i < MIN(qpoll->nevents, max_events); i++) {
+ ev = &qpoll->events[i];
+ fd = ev->data.fd;
+ p = g_hash_table_lookup(qpoll->fds, &fd);
+ assert(p);
+
+ events[i].revents = ev->events;
+ events[i].opaque = p->opaque;
+ events[i].fd = fd;
+ events[i].events = p->events;
+ }
+ return i;
+}
--
1.9.3
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [Qemu-devel] [PATCH v3 4/7] main-loop: Replace qemu_poll_ns with qemu_poll
2015-04-16 4:57 [Qemu-devel] [PATCH v3 0/7] aio: Support epoll by introducing qemu_poll abstraction Fam Zheng
` (2 preceding siblings ...)
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 3/7] poll: Add epoll implementation for qemu_poll Fam Zheng
@ 2015-04-16 4:57 ` Fam Zheng
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 5/7] tests: Add test case for qemu_poll Fam Zheng
` (3 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: Fam Zheng @ 2015-04-16 4:57 UTC (permalink / raw)
To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, Stefan Hajnoczi
qemu_poll_set_fds + qemu_poll does the same, and when epoll is
available, it is faster.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
include/qemu/timer.h | 13 -------------
main-loop.c | 35 ++++++++++++++++++++++++++++++-----
qemu-timer.c | 28 ----------------------------
3 files changed, 30 insertions(+), 46 deletions(-)
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index e5bd494..ae99deb 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -738,19 +738,6 @@ void timer_put(QEMUFile *f, QEMUTimer *ts);
int qemu_timeout_ns_to_ms(int64_t ns);
/**
- * qemu_poll_ns:
- * @fds: Array of file descriptors
- * @nfds: number of file descriptors
- * @timeout: timeout in nanoseconds
- *
- * Perform a poll like g_poll but with a timeout in nanoseconds.
- * See g_poll documentation for further details.
- *
- * Returns: number of fds ready
- */
-int qemu_poll_ns(GPollFD *fds, guint nfds, int64_t timeout);
-
-/**
* qemu_soonest_timeout:
* @timeout1: first timeout in nanoseconds (or -1 for infinite)
* @timeout2: second timeout in nanoseconds (or -1 for infinite)
diff --git a/main-loop.c b/main-loop.c
index 981bcb5..b70f929 100644
--- a/main-loop.c
+++ b/main-loop.c
@@ -29,6 +29,7 @@
#include "slirp/libslirp.h"
#include "qemu/main-loop.h"
#include "block/aio.h"
+#include "qemu/poll.h"
#ifndef _WIN32
@@ -130,6 +131,7 @@ void qemu_notify_event(void)
}
static GArray *gpollfds;
+static QEMUPoll *qpoll;
int qemu_init_main_loop(Error **errp)
{
@@ -150,6 +152,7 @@ int qemu_init_main_loop(Error **errp)
return -EMFILE;
}
gpollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD));
+ qpoll = qemu_poll_new();
src = aio_get_g_source(qemu_aio_context);
g_source_attach(src, NULL);
g_source_unref(src);
@@ -207,6 +210,8 @@ static int os_host_main_loop_wait(int64_t timeout)
{
int ret;
static int spin_counter;
+ QEMUPollEvent events[1024];
+ int r, i;
glib_pollfds_fill(&timeout);
@@ -236,7 +241,17 @@ static int os_host_main_loop_wait(int64_t timeout)
spin_counter++;
}
- ret = qemu_poll_ns((GPollFD *)gpollfds->data, gpollfds->len, timeout);
+ qemu_poll_set_fds(qpoll, (GPollFD *)gpollfds->data, gpollfds->len);
+ ret = qemu_poll(qpoll, timeout);
+ if (ret > 0) {
+ ret = MIN(ret, sizeof(events) / sizeof(QEMUPollEvent));
+ r = qemu_poll_get_events(qpoll, events, ret);
+ for (i = 0; i < r; i++) {
+ GPollFD *fd = events[i].opaque;
+ fd->revents = events[i].revents;
+ }
+ ret = r;
+ }
if (timeout) {
qemu_mutex_lock_iothread();
@@ -389,9 +404,11 @@ static int os_host_main_loop_wait(int64_t timeout)
{
GMainContext *context = g_main_context_default();
GPollFD poll_fds[1024 * 2]; /* this is probably overkill */
+ QEMUPollEvent poll_events[1024 * 2];
int select_ret = 0;
- int g_poll_ret, ret, i, n_poll_fds;
+ int nevents, ret, i, n_poll_fds;
PollingEntry *pe;
+ QEMUPollEvent *events;
WaitObjects *w = &wait_objects;
gint poll_timeout;
int64_t poll_timeout_ns;
@@ -441,10 +458,18 @@ static int os_host_main_loop_wait(int64_t timeout)
poll_timeout_ns = qemu_soonest_timeout(poll_timeout_ns, timeout);
qemu_mutex_unlock_iothread();
- g_poll_ret = qemu_poll_ns(poll_fds, n_poll_fds + w->num, poll_timeout_ns);
+
+ qemu_poll_set_fds(qpoll, (GPollFD *)poll_fds, n_poll_fds + w->num)
+ nevents = qemu_poll(qpoll, poll_timeout_ns);
qemu_mutex_lock_iothread();
- if (g_poll_ret > 0) {
+ if (nevents > 0) {
+ r = qemu_poll_get_events(qpoll, poll_events, nevents);
+ for (i = 0; i < r; i++) {
+ GPollFD *fd = poll_events[i].opaque;
+ fd->revents = poll_events[i].revents;
+ }
+
for (i = 0; i < w->num; i++) {
w->revents[i] = poll_fds[n_poll_fds + i].revents;
}
@@ -459,7 +484,7 @@ static int os_host_main_loop_wait(int64_t timeout)
g_main_context_dispatch(context);
}
- return select_ret || g_poll_ret;
+ return select_ret || nevents;
}
#endif
diff --git a/qemu-timer.c b/qemu-timer.c
index 5741f0d..41440e2 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -299,34 +299,6 @@ int qemu_timeout_ns_to_ms(int64_t ns)
return (int) ms;
}
-
-/* qemu implementation of g_poll which uses a nanosecond timeout but is
- * otherwise identical to g_poll
- */
-int qemu_poll_ns(GPollFD *fds, guint nfds, int64_t timeout)
-{
-#ifdef CONFIG_PPOLL
- if (timeout < 0) {
- return ppoll((struct pollfd *)fds, nfds, NULL, NULL);
- } else {
- struct timespec ts;
- int64_t tvsec = timeout / 1000000000LL;
- /* Avoid possibly overflowing and specifying a negative number of
- * seconds, which would turn a very long timeout into a busy-wait.
- */
- if (tvsec > (int64_t)INT32_MAX) {
- tvsec = INT32_MAX;
- }
- ts.tv_sec = tvsec;
- ts.tv_nsec = timeout % 1000000000LL;
- return ppoll((struct pollfd *)fds, nfds, &ts, NULL);
- }
-#else
- return g_poll(fds, nfds, qemu_timeout_ns_to_ms(timeout));
-#endif
-}
-
-
void timer_init_tl(QEMUTimer *ts,
QEMUTimerList *timer_list, int scale,
QEMUTimerCB *cb, void *opaque)
--
1.9.3
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [Qemu-devel] [PATCH v3 5/7] tests: Add test case for qemu_poll
2015-04-16 4:57 [Qemu-devel] [PATCH v3 0/7] aio: Support epoll by introducing qemu_poll abstraction Fam Zheng
` (3 preceding siblings ...)
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 4/7] main-loop: Replace qemu_poll_ns with qemu_poll Fam Zheng
@ 2015-04-16 4:57 ` Fam Zheng
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 6/7] poll-glib: Support ppoll Fam Zheng
` (2 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: Fam Zheng @ 2015-04-16 4:57 UTC (permalink / raw)
To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, Stefan Hajnoczi
Signed-off-by: Fam Zheng <famz@redhat.com>
---
tests/Makefile | 2 +
tests/test-poll.c | 272 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 274 insertions(+)
create mode 100644 tests/test-poll.c
diff --git a/tests/Makefile b/tests/Makefile
index 55aa745..d152bf5 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -41,6 +41,7 @@ check-unit-$(CONFIG_POSIX) += tests/test-rfifolock$(EXESUF)
check-unit-y += tests/test-throttle$(EXESUF)
gcov-files-test-aio-$(CONFIG_WIN32) = aio-win32.c
gcov-files-test-aio-$(CONFIG_POSIX) = aio-posix.c
+check-unit-y += tests/test-poll$(EXESUF)
check-unit-y += tests/test-thread-pool$(EXESUF)
gcov-files-test-thread-pool-y = thread-pool.c
gcov-files-test-hbitmap-y = util/hbitmap.c
@@ -251,6 +252,7 @@ tests/check-qjson$(EXESUF): tests/check-qjson.o libqemuutil.a libqemustub.a
tests/check-qom-interface$(EXESUF): tests/check-qom-interface.o $(qom-core-obj) libqemuutil.a libqemustub.a
tests/test-coroutine$(EXESUF): tests/test-coroutine.o $(block-obj-y) libqemuutil.a libqemustub.a
tests/test-aio$(EXESUF): tests/test-aio.o $(block-obj-y) libqemuutil.a libqemustub.a
+tests/test-poll$(EXESUF): tests/test-poll.o $(block-obj-y) libqemuutil.a libqemustub.a
tests/test-rfifolock$(EXESUF): tests/test-rfifolock.o libqemuutil.a libqemustub.a
tests/test-throttle$(EXESUF): tests/test-throttle.o $(block-obj-y) libqemuutil.a libqemustub.a
tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(block-obj-y) libqemuutil.a libqemustub.a
diff --git a/tests/test-poll.c b/tests/test-poll.c
new file mode 100644
index 0000000..75074d8
--- /dev/null
+++ b/tests/test-poll.c
@@ -0,0 +1,272 @@
+/*
+ * QTest testcase for QEMU poll
+ *
+ * Copyright Red Hat, Inc. 2014
+ *
+ * Authors:
+ * Fam Zheng <famz@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include <glib.h>
+#include "qemu/poll.h"
+#include "qemu/event_notifier.h"
+
+static EventNotifier *poll_add_one(QEMUPoll *qpoll)
+{
+ EventNotifier *e = g_new(EventNotifier, 1);
+
+ event_notifier_init(e, false);
+ qemu_poll_add(qpoll, event_notifier_get_fd(e),
+ G_IO_IN,
+ NULL);
+ return e;
+}
+
+static int poll_del_one(QEMUPoll *qpoll, EventNotifier *e)
+{
+ int r = qemu_poll_del(qpoll, event_notifier_get_fd(e));
+ event_notifier_cleanup(e);
+ g_free(e);
+ return r;
+}
+
+static void test_poll_single(void)
+{
+ QEMUPoll *qpoll;
+ EventNotifier *e;
+ int r, i;
+
+ qpoll = qemu_poll_new();
+ g_assert(qpoll);
+
+ e = poll_add_one(qpoll);
+
+ /* Write some data and poll */
+ event_notifier_set(e);
+ r = qemu_poll(qpoll, 0);
+ g_assert_cmpint(r, ==, 1);
+
+ /* Clear data and poll */
+ r = event_notifier_test_and_clear(e);
+ g_assert(r);
+ r = qemu_poll(qpoll, 0);
+ g_assert_cmpint(r, ==, 0);
+
+ /* Write a lot of data and poll */
+ for (i = 0; i < 10000; i++) {
+ event_notifier_set(e);
+ }
+ r = qemu_poll(qpoll, 0);
+ g_assert_cmpint(r, ==, 1);
+
+ /* Clear data and poll again */
+ r = event_notifier_test_and_clear(e);
+ g_assert(r);
+ r = qemu_poll(qpoll, 0);
+ g_assert_cmpint(r, ==, 0);
+
+ /* Clean up */
+ poll_del_one(qpoll, e);
+ qemu_poll_free(qpoll);
+}
+
+static void test_poll_multiple(void)
+{
+ QEMUPoll *qpoll;
+ const int N = 32;
+ EventNotifier *e[N];
+ QEMUPollEvent events[N];
+ int r, s, i;
+
+ qpoll = qemu_poll_new();
+ g_assert(qpoll);
+
+ for (i = 0; i < N; i++) {
+ e[i] = poll_add_one(qpoll);
+ }
+
+ /* Write some data for each and poll */
+ for (i = 0; i < N; i++) {
+ event_notifier_set(e[i]);
+ }
+ r = qemu_poll(qpoll, 0);
+ g_assert_cmpint(r, ==, N);
+
+ /* Clear data and poll */
+ for (i = 0; i < N; i++) {
+ r = event_notifier_test_and_clear(e[i]);
+ g_assert(r);
+ }
+ r = qemu_poll(qpoll, 0);
+ g_assert_cmpint(r, ==, 0);
+
+ /* Write some data for first half and poll */
+ for (i = 0; i < N / 2; i++) {
+ event_notifier_set(e[i]);
+ }
+ r = qemu_poll(qpoll, 0);
+ g_assert_cmpint(r, ==, N / 2);
+ s = qemu_poll_get_events(qpoll, events, N);
+ g_assert_cmpint(s, ==, r);
+ for (i = 0; i < s; i++) {
+ g_assert(events[i].revents & G_IO_IN);
+ }
+
+ /* Clean up */
+ for (i = 0; i < N; i++) {
+ poll_del_one(qpoll, e[i]);
+ }
+ qemu_poll_free(qpoll);
+}
+
+static void test_poll_add_del(void)
+{
+ QEMUPoll *qpoll;
+ EventNotifier *e1, *e2;
+ QEMUPollEvent events[2];
+ int r;
+
+ qpoll = qemu_poll_new();
+ g_assert(qpoll);
+
+ e1 = poll_add_one(qpoll);
+ e2 = poll_add_one(qpoll);
+
+ /* Write some data for each and poll */
+ event_notifier_set(e1);
+ event_notifier_set(e2);
+ r = qemu_poll(qpoll, 0);
+ g_assert_cmpint(r, ==, 2);
+
+ /* Clear e1 and poll */
+ r = event_notifier_test_and_clear(e1);
+ r = qemu_poll(qpoll, 0);
+ g_assert_cmpint(r, ==, 1);
+ r = qemu_poll_get_events(qpoll, events, 2);
+ g_assert_cmpint(r, ==, 1);
+ g_assert_cmpint(events[0].fd, ==, event_notifier_get_fd(e2));
+
+ /* Write to both but remove one and poll the other */
+ event_notifier_set(e1);
+ event_notifier_set(e2);
+ qemu_poll_del(qpoll, event_notifier_get_fd(e2));
+ r = qemu_poll(qpoll, 0);
+ g_assert_cmpint(r, ==, 1);
+ r = qemu_poll_get_events(qpoll, events, 2);
+ g_assert_cmpint(r, ==, 1);
+ g_assert_cmpint(events[0].fd, ==, event_notifier_get_fd(e1));
+
+ r = qemu_poll_del(qpoll, event_notifier_get_fd(e2));
+ g_assert_cmpint(r, ==, -ENOENT);
+
+ /* Add it back and poll both */
+ qemu_poll_add(qpoll, event_notifier_get_fd(e2),
+ G_IO_IN, NULL);
+ r = qemu_poll(qpoll, 0);
+ g_assert_cmpint(r, ==, 2);
+
+ event_notifier_test_and_clear(e1);
+ event_notifier_test_and_clear(e2);
+
+ r = qemu_poll_add(qpoll, event_notifier_get_fd(e2),
+ G_IO_IN, NULL);
+ g_assert_cmpint(r, ==, -EEXIST);
+
+ /* Clean up */
+ poll_del_one(qpoll, e1);
+ poll_del_one(qpoll, e2);
+ qemu_poll_free(qpoll);
+}
+
+static void gpollfd_fill(GPollFD *fds, int start, EventNotifier **e, int n)
+{
+ int i;
+ for (i = 0; i < n; i++) {
+ fds[i + start] = (GPollFD) {
+ .fd = event_notifier_get_fd(e[i]),
+ .events = G_IO_IN,
+ };
+ }
+}
+
+static void test_poll_set_fds(void)
+{
+ QEMUPoll *qpoll;
+ const int M = 20;
+ const int N = 10;
+ EventNotifier *em[M], *en[N];
+ GPollFD fds[M + N];
+ int i, r;
+
+ qpoll = qemu_poll_new();
+ g_assert(qpoll);
+
+ for (i = 0; i < M; i++) {
+ en[i] = poll_add_one(qpoll);
+ }
+
+ for (i = 0; i < M; i++) {
+ em[i] = poll_add_one(qpoll);
+ }
+
+ gpollfd_fill(fds, 0, em, M);
+ gpollfd_fill(fds, M, en, N);
+
+ /* Set N */
+ for (i = 0; i < N; i++) {
+ event_notifier_set(en[i]);
+ }
+
+ /* Poll M + N should get N */
+ r = qemu_poll_set_fds(qpoll, fds, M + N);
+ g_assert_cmpint(r, ==, M + N);
+ r = qemu_poll(qpoll, 0);
+ g_assert_cmpint(r, ==, N);
+
+ /* Poll M should get 0 */
+ r = qemu_poll_set_fds(qpoll, fds, M);
+ g_assert_cmpint(r, ==, M);
+ r = qemu_poll(qpoll, 0);
+ g_assert_cmpint(r, ==, 0);
+
+ /* Poll N should get N */
+ r = qemu_poll_set_fds(qpoll, fds + M, N);
+ g_assert_cmpint(r, ==, N);
+ r = qemu_poll(qpoll, 0);
+ g_assert_cmpint(r, ==, N);
+
+ /* Poll M + N / 2 should get N / 2 */
+ r = qemu_poll_set_fds(qpoll, fds, M + N / 2);
+ g_assert_cmpint(r, ==, M + N / 2);
+ r = qemu_poll(qpoll, 0);
+ g_assert_cmpint(r, ==, N / 2);
+
+ /* Clean up */
+ for (i = 0; i < M; i++) {
+ poll_del_one(qpoll, em[i]);
+ }
+
+ for (i = 0; i < N; i++) {
+ poll_del_one(qpoll, en[i]);
+ }
+ qemu_poll_free(qpoll);
+}
+
+int main(int argc, char **argv)
+{
+ int ret;
+
+ g_test_init(&argc, &argv, NULL);
+
+ g_test_add_func("/poll/single", test_poll_single);
+ g_test_add_func("/poll/multiple", test_poll_multiple);
+ g_test_add_func("/poll/add-del", test_poll_add_del);
+ g_test_add_func("/poll/set-fds", test_poll_set_fds);
+
+ ret = g_test_run();
+
+ return ret;
+}
--
1.9.3
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [Qemu-devel] [PATCH v3 6/7] poll-glib: Support ppoll
2015-04-16 4:57 [Qemu-devel] [PATCH v3 0/7] aio: Support epoll by introducing qemu_poll abstraction Fam Zheng
` (4 preceding siblings ...)
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 5/7] tests: Add test case for qemu_poll Fam Zheng
@ 2015-04-16 4:57 ` Fam Zheng
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 7/7] poll-linux: Add timerfd support Fam Zheng
2015-04-16 13:03 ` [Qemu-devel] [PATCH v3 0/7] aio: Support epoll by introducing qemu_poll abstraction Stefan Hajnoczi
7 siblings, 0 replies; 12+ messages in thread
From: Fam Zheng @ 2015-04-16 4:57 UTC (permalink / raw)
To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, Stefan Hajnoczi
Signed-off-by: Fam Zheng <famz@redhat.com>
---
poll-glib.c | 29 ++++++++++++++++++++++++++---
1 file changed, 26 insertions(+), 3 deletions(-)
diff --git a/poll-glib.c b/poll-glib.c
index 64fde69..23d528d 100644
--- a/poll-glib.c
+++ b/poll-glib.c
@@ -14,6 +14,9 @@
#include "qemu-common.h"
#include "qemu/timer.h"
#include "qemu/poll.h"
+#if CONFIG_PPOLL
+#include <poll.h>
+#endif
struct QEMUPoll {
/* Array of GPollFD for g_poll() */
@@ -42,13 +45,33 @@ void qemu_poll_free(QEMUPoll *qpoll)
int qemu_poll(QEMUPoll *qpoll, int64_t timeout_ns)
{
int i;
+ void *fds = qpoll->gpollfds->data;
+ int nfds = qpoll->gpollfds->len;
+
for (i = 0; i < qpoll->gpollfds->len; i++) {
GPollFD *p = &g_array_index(qpoll->gpollfds, GPollFD, i);
p->revents = 0;
}
- return g_poll((GPollFD *)qpoll->gpollfds->data,
- qpoll->gpollfds->len,
- qemu_timeout_ns_to_ms(timeout_ns));
+#if CONFIG_PPOLL
+ if (timeout_ns < 0) {
+ return ppoll(fds, nfds, NULL, NULL);
+ } else {
+ struct timespec ts;
+ int64_t tvsec = timeout_ns / 1000000000LL;
+ /* Avoid possibly overflowing and specifying a negative number of
+ * seconds, which would turn a very long timeout into a busy-wait.
+ */
+ if (tvsec > (int64_t)INT32_MAX) {
+ tvsec = INT32_MAX;
+ }
+ ts.tv_sec = tvsec;
+ ts.tv_nsec = timeout_ns % 1000000000LL;
+ return ppoll(fds, nfds, &ts, NULL);
+ }
+
+#else
+ return g_poll(data, nfds, qemu_timeout_ns_to_ms(timeout_ns));
+#endif
}
/* Add an fd to poll. Return -EEXIST if fd already registered. */
--
1.9.3
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [Qemu-devel] [PATCH v3 7/7] poll-linux: Add timerfd support
2015-04-16 4:57 [Qemu-devel] [PATCH v3 0/7] aio: Support epoll by introducing qemu_poll abstraction Fam Zheng
` (5 preceding siblings ...)
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 6/7] poll-glib: Support ppoll Fam Zheng
@ 2015-04-16 4:57 ` Fam Zheng
2015-04-16 13:00 ` Stefan Hajnoczi
2015-04-16 13:03 ` [Qemu-devel] [PATCH v3 0/7] aio: Support epoll by introducing qemu_poll abstraction Stefan Hajnoczi
7 siblings, 1 reply; 12+ messages in thread
From: Fam Zheng @ 2015-04-16 4:57 UTC (permalink / raw)
To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, Stefan Hajnoczi
In qemu_poll_timerfd, we arm the timerfd with timeout_ns. The timerfd is
also watched by epollfd, so that when there is no other events,
epoll_wait will still return on time, even though we pass -1 (wait
infinitely).
Signed-off-by: Fam Zheng <famz@redhat.com>
---
poll-linux.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 83 insertions(+), 1 deletion(-)
diff --git a/poll-linux.c b/poll-linux.c
index 7605004..154da8b 100644
--- a/poll-linux.c
+++ b/poll-linux.c
@@ -12,20 +12,52 @@
*/
#include <sys/epoll.h>
+
+#define USE_TIMERFD CONFIG_TIMERFD
+
+#ifdef USE_TIMERFD
+#include <sys/timerfd.h>
+#endif
+
#include <glib.h>
#include <poll.h>
#include "qemu-common.h"
#include "qemu/timer.h"
#include "qemu/poll.h"
+
struct QEMUPoll {
int epollfd;
+#if USE_TIMERFD
+ int timerfd;
+#endif
struct epoll_event *events;
int max_events;
int nevents;
GHashTable *fds;
};
+static void qemu_poll_init_timerfd(QEMUPoll *qpoll)
+{
+#if USE_TIMERFD
+ int r;
+ struct epoll_event ev;
+ qpoll->timerfd = timerfd_create(CLOCK_MONOTONIC,
+ TFD_NONBLOCK | TFD_CLOEXEC);
+ if (qpoll->timerfd < 0) {
+ perror("timerfd_create");
+ abort();
+ }
+ ev.events = EPOLLIN | EPOLLOUT | EPOLLERR | EPOLLHUP;
+ ev.data.fd = qpoll->timerfd;
+ r = epoll_ctl(qpoll->epollfd, EPOLL_CTL_ADD, qpoll->timerfd, &ev);
+ if (r) {
+ perror("epoll_ctl add timerfd");
+ abort();
+ }
+#endif
+}
+
QEMUPoll *qemu_poll_new(void)
{
int epollfd;
@@ -40,6 +72,7 @@ QEMUPoll *qemu_poll_new(void)
qpoll->events = g_new(struct epoll_event, 1);
qpoll->fds = g_hash_table_new_full(g_int_hash, g_int_equal,
NULL, g_free);
+ qemu_poll_init_timerfd(qpoll);
return qpoll;
}
@@ -48,10 +81,13 @@ void qemu_poll_free(QEMUPoll *qpoll)
g_free(qpoll->events);
g_hash_table_destroy(qpoll->fds);
close(qpoll->epollfd);
+#if USE_TIMERFD
+ close(qpoll->timerfd);
+#endif
g_free(qpoll);
}
-int qemu_poll(QEMUPoll *qpoll, int64_t timeout_ns)
+static int qemu_poll_ppoll(QEMUPoll *qpoll, int64_t timeout_ns)
{
int r;
struct pollfd fd = {
@@ -78,6 +114,47 @@ int qemu_poll(QEMUPoll *qpoll, int64_t timeout_ns)
return r;
}
+#if USE_TIMERFD
+static int qemu_poll_timerfd(QEMUPoll *qpoll, int64_t timeout_ns)
+{
+ int r;
+ struct itimerspec its = { { 0 } };
+
+ if (timeout_ns > 0) {
+ its.it_value.tv_sec = timeout_ns / 1000000000LL;
+ its.it_value.tv_nsec = timeout_ns % 1000000000LL;
+ }
+
+ r = timerfd_settime(qpoll->timerfd, 0, &its, NULL);
+ if (r) {
+ struct pollfd fd = {
+ .fd = qpoll->epollfd,
+ .events = POLLIN | POLLOUT | POLLERR | POLLHUP,
+ };
+ perror("timerfd_settime");
+ abort();
+ r = ppoll(&fd, 1, &its.it_value, NULL);
+ }
+ if (r < 0) {
+ return r;
+ }
+ r = epoll_wait(qpoll->epollfd,
+ qpoll->events,
+ qpoll->max_events,
+ timeout_ns > 0 ? -1 : timeout_ns);
+ qpoll->nevents = r;
+ return r;
+}
+#endif
+
+int qemu_poll(QEMUPoll *qpoll, int64_t timeout_ns)
+{
+#if USE_TIMERFD
+ return qemu_poll_timerfd(qpoll, timeout_ns);
+#endif
+ return qemu_poll_ppoll(qpoll, timeout_ns);
+}
+
static inline uint32_t epoll_event_from_gio_events(int gio_events)
{
@@ -220,6 +297,11 @@ int qemu_poll_get_events(QEMUPoll *qpoll,
for (i = 0; i < MIN(qpoll->nevents, max_events); i++) {
ev = &qpoll->events[i];
fd = ev->data.fd;
+#if USE_TIMERFD
+ if (fd == qpoll->timerfd) {
+ continue;
+ }
+#endif
p = g_hash_table_lookup(qpoll->fds, &fd);
assert(p);
--
1.9.3
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH v3 7/7] poll-linux: Add timerfd support
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 7/7] poll-linux: Add timerfd support Fam Zheng
@ 2015-04-16 13:00 ` Stefan Hajnoczi
0 siblings, 0 replies; 12+ messages in thread
From: Stefan Hajnoczi @ 2015-04-16 13:00 UTC (permalink / raw)
To: Fam Zheng; +Cc: Kevin Wolf, Paolo Bonzini, qemu-devel
[-- Attachment #1: Type: text/plain, Size: 1005 bytes --]
On Thu, Apr 16, 2015 at 12:57:36PM +0800, Fam Zheng wrote:
> +static int qemu_poll_timerfd(QEMUPoll *qpoll, int64_t timeout_ns)
> +{
> + int r;
> + struct itimerspec its = { { 0 } };
> +
> + if (timeout_ns > 0) {
> + its.it_value.tv_sec = timeout_ns / 1000000000LL;
> + its.it_value.tv_nsec = timeout_ns % 1000000000LL;
> + }
> +
> + r = timerfd_settime(qpoll->timerfd, 0, &its, NULL);
Can't hurt to put a comment here:
/* The timer must be set even when there is no timeout so the readable
* timerfd is cleared (we never call read(2) on it).
*/
> + if (r) {
> + struct pollfd fd = {
> + .fd = qpoll->epollfd,
> + .events = POLLIN | POLLOUT | POLLERR | POLLHUP,
> + };
> + perror("timerfd_settime");
> + abort();
> + r = ppoll(&fd, 1, &its.it_value, NULL);
Please remove the dead code.
In fact, there is no reasonable error for timerfd_settime(). It should
never fail, I'd be happy with just assert(r == 0).
[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/7] aio: Support epoll by introducing qemu_poll abstraction
2015-04-16 4:57 [Qemu-devel] [PATCH v3 0/7] aio: Support epoll by introducing qemu_poll abstraction Fam Zheng
` (6 preceding siblings ...)
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 7/7] poll-linux: Add timerfd support Fam Zheng
@ 2015-04-16 13:03 ` Stefan Hajnoczi
2015-04-17 2:02 ` Fam Zheng
7 siblings, 1 reply; 12+ messages in thread
From: Stefan Hajnoczi @ 2015-04-16 13:03 UTC (permalink / raw)
To: Fam Zheng; +Cc: Kevin Wolf, Paolo Bonzini, qemu-devel
[-- Attachment #1: Type: text/plain, Size: 1295 bytes --]
On Thu, Apr 16, 2015 at 12:57:29PM +0800, Fam Zheng wrote:
> v3: Rebase to master for 2.4.
> Although epoll_pwait1 syscall is still being worked on [1], the QEMU part
> (if any) will base on this, so let's merge it first.
>
> That part is not included in this version because I'm still evaluating by
> comparing epoll_pwait1 with epoll+timerfd as with current master they seem
> to be really close.
>
> [1]: http://www.spinics.net/lists/linux-api/msg08216.html
>
> v2: Emulate nanoseconds precison of timeout with ppoll and timerfd.
> Their performance is on par with each other, but both much better than
> qemu.git:
>
> syscall high # of fd low # of fd
> -------------------------------------------------
> qemu.git(ppoll) 44 96
> ppoll+epoll 85 101
> timerfd+epoll 87 109
>
> (In high # of fd case, 3 activated but idle virtio-console devices are
> attached, which will add us hundereds of fds to poll)
Have you rerun benchmarks with this patch series?
I wonder how the ppoll-only performance changes. It seems like there
are now additional copies of <fd, events, revents> information and
corresponding malloc/realloc/frees.
Stefan
[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/7] aio: Support epoll by introducing qemu_poll abstraction
2015-04-16 13:03 ` [Qemu-devel] [PATCH v3 0/7] aio: Support epoll by introducing qemu_poll abstraction Stefan Hajnoczi
@ 2015-04-17 2:02 ` Fam Zheng
2015-04-20 10:36 ` Stefan Hajnoczi
0 siblings, 1 reply; 12+ messages in thread
From: Fam Zheng @ 2015-04-17 2:02 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Kevin Wolf, Paolo Bonzini, qemu-devel
On Thu, 04/16 14:03, Stefan Hajnoczi wrote:
> On Thu, Apr 16, 2015 at 12:57:29PM +0800, Fam Zheng wrote:
> > v3: Rebase to master for 2.4.
> > Although epoll_pwait1 syscall is still being worked on [1], the QEMU part
> > (if any) will base on this, so let's merge it first.
> >
> > That part is not included in this version because I'm still evaluating by
> > comparing epoll_pwait1 with epoll+timerfd as with current master they seem
> > to be really close.
> >
> > [1]: http://www.spinics.net/lists/linux-api/msg08216.html
> >
> > v2: Emulate nanoseconds precison of timeout with ppoll and timerfd.
> > Their performance is on par with each other, but both much better than
> > qemu.git:
> >
> > syscall high # of fd low # of fd
> > -------------------------------------------------
> > qemu.git(ppoll) 44 96
> > ppoll+epoll 85 101
> > timerfd+epoll 87 109
> >
> > (In high # of fd case, 3 activated but idle virtio-console devices are
> > attached, which will add us hundereds of fds to poll)
>
> Have you rerun benchmarks with this patch series?
Yes, here:
syscall high # of fd low # of fd
(Unit MB/s) (Unit MB/s)
-------------------------------------------------
qemu.git(ppoll) 24 73
ppoll+epoll 49 77
timerfd+epoll 49 82
>
> I wonder how the ppoll-only performance changes. It seems like there
> are now additional copies of <fd, events, revents> information and
> corresponding malloc/realloc/frees.
>
> Stefan
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/7] aio: Support epoll by introducing qemu_poll abstraction
2015-04-17 2:02 ` Fam Zheng
@ 2015-04-20 10:36 ` Stefan Hajnoczi
0 siblings, 0 replies; 12+ messages in thread
From: Stefan Hajnoczi @ 2015-04-20 10:36 UTC (permalink / raw)
To: Fam Zheng; +Cc: Kevin Wolf, Paolo Bonzini, qemu-devel
[-- Attachment #1: Type: text/plain, Size: 2105 bytes --]
On Fri, Apr 17, 2015 at 10:02:30AM +0800, Fam Zheng wrote:
> On Thu, 04/16 14:03, Stefan Hajnoczi wrote:
> > On Thu, Apr 16, 2015 at 12:57:29PM +0800, Fam Zheng wrote:
> > > v3: Rebase to master for 2.4.
> > > Although epoll_pwait1 syscall is still being worked on [1], the QEMU part
> > > (if any) will base on this, so let's merge it first.
> > >
> > > That part is not included in this version because I'm still evaluating by
> > > comparing epoll_pwait1 with epoll+timerfd as with current master they seem
> > > to be really close.
> > >
> > > [1]: http://www.spinics.net/lists/linux-api/msg08216.html
> > >
> > > v2: Emulate nanoseconds precison of timeout with ppoll and timerfd.
> > > Their performance is on par with each other, but both much better than
> > > qemu.git:
> > >
> > > syscall high # of fd low # of fd
> > > -------------------------------------------------
> > > qemu.git(ppoll) 44 96
> > > ppoll+epoll 85 101
> > > timerfd+epoll 87 109
> > >
> > > (In high # of fd case, 3 activated but idle virtio-console devices are
> > > attached, which will add us hundereds of fds to poll)
> >
> > Have you rerun benchmarks with this patch series?
>
> Yes, here:
>
> syscall high # of fd low # of fd
> (Unit MB/s) (Unit MB/s)
> -------------------------------------------------
> qemu.git(ppoll) 24 73
> ppoll+epoll 49 77
> timerfd+epoll 49 82
Nice. Weird that the results are much lower than the v2 results, but
maybe the host changed?
> >
> > I wonder how the ppoll-only performance changes. It seems like there
> > are now additional copies of <fd, events, revents> information and
> > corresponding malloc/realloc/frees.
Please add a ppoll-only column hosts so we can be confident that the
non-epoll code works. It will also show that non-Linux POSIX hosts
haven't regressed.
Thanks,
Stefan
[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2015-04-20 10:36 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-16 4:57 [Qemu-devel] [PATCH v3 0/7] aio: Support epoll by introducing qemu_poll abstraction Fam Zheng
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 1/7] poll: Introduce QEMU Poll API Fam Zheng
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 2/7] posix-aio: Use QEMU poll interface Fam Zheng
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 3/7] poll: Add epoll implementation for qemu_poll Fam Zheng
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 4/7] main-loop: Replace qemu_poll_ns with qemu_poll Fam Zheng
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 5/7] tests: Add test case for qemu_poll Fam Zheng
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 6/7] poll-glib: Support ppoll Fam Zheng
2015-04-16 4:57 ` [Qemu-devel] [PATCH v3 7/7] poll-linux: Add timerfd support Fam Zheng
2015-04-16 13:00 ` Stefan Hajnoczi
2015-04-16 13:03 ` [Qemu-devel] [PATCH v3 0/7] aio: Support epoll by introducing qemu_poll abstraction Stefan Hajnoczi
2015-04-17 2:02 ` Fam Zheng
2015-04-20 10:36 ` Stefan Hajnoczi
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.