* [PATCH experiment 00/35] stackless coroutine backend
@ 2022-03-10 12:43 Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 01/35] coroutine: add missing coroutine_fn annotations for CoRwlock functions Paolo Bonzini
                   ` (35 more replies)
  0 siblings, 36 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Here is an experiment with using stackless coroutines in QEMU.  It
only compiles enough code to run tests/unit/test-coroutine, but at
least it proves that it's possible to quickly test ideas in the
area of coroutine runtimes.  Another idea that could be toyed with
in a similar manner could be (whoa) C++ coroutines.

As expected, this also found some issues in existing code, so I
plan to submit patches 1-5 separately.

The new backend (which is the only one that works, due to the required
code changes) is in patch 7.  For the big description of what stackless
coroutines are, please refer to that patch.

Patches 8-11 do some initial conversions.  Patch 12 introduces some
preprocessor magic that greatly eases the rest of the work, and then
the tests are converted one at a time, until patch 27 where the only
ones missing are the CoRwlock tests.

Therefore, patches 28-33 convert CoRwlock and patches 34-35 take care
of the corresponding tests, thus concluding the experiment.

Paolo

Paolo Bonzini (35):
  coroutine: add missing coroutine_fn annotations for CoRwlock functions
  coroutine: qemu_coroutine_get_aio_context is not a coroutine_fn
  coroutine: introduce QemuCoLockable
  coroutine: introduce coroutine_only_fn
  coroutine: small code cleanup in qemu_co_rwlock_wrlock
  disable some code
  coroutine: introduce the "stackless coroutine" backend
  /basic/lifecycle
  convert qemu-coroutine-sleep.c to stackless coroutines
  enable tail call optimization of qemu_co_mutex_lock
  convert CoMutex to stackless coroutines
  define magic macros for stackless coroutines
  /basic/yield
  /basic/nesting
  /basic/self
  /basic/entered
  /basic/in_coroutine
  /basic/order
  /perf/lifecycle
  /perf/nesting
  /perf/yield
  /perf/function-call
  /perf/cost
  /basic/no-dangling-access
  /locking/co-mutex
  convert qemu_co_mutex_lock_slowpath to magic macros
  /locking/co-mutex/lockable
  qemu_co_rwlock_maybe_wake_one
  qemu_co_rwlock_rdlock
  qemu_co_rwlock_unlock
  qemu_co_rwlock_downgrade
  qemu_co_rwlock_wrlock
  qemu_co_rwlock_upgrade
  /locking/co-rwlock/upgrade
  /locking/co-rwlock/downgrade

 configure                    |  44 +---
 include/qemu/co-lockable.h   | 110 +++++++++
 include/qemu/coroutine.h     |  99 ++++++--
 include/qemu/coroutine_int.h |   6 -
 include/qemu/lockable.h      |  13 +-
 include/qemu/typedefs.h      |   1 +
 tests/unit/meson.build       |   2 +-
 tests/unit/test-coroutine.c  | 425 +++++++++++++++++++++++++++++------
 util/coroutine-stackless.c   | 159 +++++++++++++
 util/meson.build             |  10 +-
 util/qemu-coroutine-lock.c   | 215 ++++++++++++++----
 util/qemu-coroutine-sleep.c  |  57 ++++-
 util/qemu-coroutine.c        |  18 +-
 13 files changed, 932 insertions(+), 227 deletions(-)
 create mode 100644 include/qemu/co-lockable.h
 create mode 100644 util/coroutine-stackless.c

-- 
2.35.1




* [PATCH 01/35] coroutine: add missing coroutine_fn annotations for CoRwlock functions
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 02/35] coroutine: qemu_coroutine_get_aio_context is not a coroutine_fn Paolo Bonzini
                   ` (34 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

CoRwlock can only be taken or released from a coroutine, and it
can yield.  Mark it as coroutine_fn.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/qemu/coroutine.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index c828a95ee0..da68be5ad2 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -261,7 +261,7 @@ void qemu_co_rwlock_init(CoRwlock *lock);
  * of a parallel writer, control is transferred to the caller of the current
  * coroutine.
  */
-void qemu_co_rwlock_rdlock(CoRwlock *lock);
+void coroutine_fn qemu_co_rwlock_rdlock(CoRwlock *lock);
 
 /**
  * Write Locks the CoRwlock from a reader.  This is a bit more efficient than
@@ -270,7 +270,7 @@ void qemu_co_rwlock_rdlock(CoRwlock *lock);
  * to the caller of the current coroutine; another writer might run while
  * @qemu_co_rwlock_upgrade blocks.
  */
-void qemu_co_rwlock_upgrade(CoRwlock *lock);
+void coroutine_fn qemu_co_rwlock_upgrade(CoRwlock *lock);
 
 /**
  * Downgrades a write-side critical section to a reader.  Downgrading with
@@ -278,20 +278,20 @@ void qemu_co_rwlock_upgrade(CoRwlock *lock);
  * followed by @qemu_co_rwlock_rdlock.  This makes it more efficient, but
  * may also sometimes be necessary for correctness.
  */
-void qemu_co_rwlock_downgrade(CoRwlock *lock);
+void coroutine_fn qemu_co_rwlock_downgrade(CoRwlock *lock);
 
 /**
  * Write Locks the mutex. If the lock cannot be taken immediately because
  * of a parallel reader, control is transferred to the caller of the current
  * coroutine.
  */
-void qemu_co_rwlock_wrlock(CoRwlock *lock);
+void coroutine_fn qemu_co_rwlock_wrlock(CoRwlock *lock);
 
 /**
  * Unlocks the read/write lock and schedules the next coroutine that was
  * waiting for this lock to be run.
  */
-void qemu_co_rwlock_unlock(CoRwlock *lock);
+void coroutine_fn qemu_co_rwlock_unlock(CoRwlock *lock);
 
 typedef struct QemuCoSleep {
     Coroutine *to_wake;
-- 
2.35.1





* [PATCH 02/35] coroutine: qemu_coroutine_get_aio_context is not a coroutine_fn
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 01/35] coroutine: add missing coroutine_fn annotations for CoRwlock functions Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 03/35] coroutine: introduce QemuCoLockable Paolo Bonzini
                   ` (33 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Since it operates on a given coroutine, qemu_coroutine_get_aio_context
can be called from outside coroutine context.

This is for example how qio_channel_restart_read uses it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/qemu/coroutine.h | 2 +-
 util/qemu-coroutine.c    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index da68be5ad2..666f3ba0e0 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -92,7 +92,7 @@ void coroutine_fn qemu_coroutine_yield(void);
 /**
  * Get the AioContext of the given coroutine
  */
-AioContext *coroutine_fn qemu_coroutine_get_aio_context(Coroutine *co);
+AioContext *qemu_coroutine_get_aio_context(Coroutine *co);
 
 /**
  * Get the currently executing coroutine
diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
index c03b2422ff..9f2bd96fa0 100644
--- a/util/qemu-coroutine.c
+++ b/util/qemu-coroutine.c
@@ -200,7 +200,7 @@ bool qemu_coroutine_entered(Coroutine *co)
     return co->caller;
 }
 
-AioContext *coroutine_fn qemu_coroutine_get_aio_context(Coroutine *co)
+AioContext *qemu_coroutine_get_aio_context(Coroutine *co)
 {
     return co->ctx;
 }
-- 
2.35.1





* [PATCH 03/35] coroutine: introduce QemuCoLockable
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 01/35] coroutine: add missing coroutine_fn annotations for CoRwlock functions Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 02/35] coroutine: qemu_coroutine_get_aio_context is not a coroutine_fn Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 04/35] coroutine: introduce coroutine_only_fn Paolo Bonzini
                   ` (32 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

In preparation for splitting "from coroutine" ("awaitable" in other
languages) and "not from coroutine" functions, remove the CoMutex case
from QemuLockable---thus making qemu_lockable_lock and qemu_lockable_unlock
"not awaitable".

To satisfy the qemu_co_queue_wait use case, introduce QemuCoLockable
which can be used for both QemuMutex (which will trivially never yield)
and CoMutex.  qemu_co_lockable_lock and qemu_co_lockable_unlock are
coroutine_fns.
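
The dispatch pattern can be illustrated outside QEMU.  The following is
only a sketch under stated assumptions: every Toy* and toy_* name is
hypothetical, the "locks" are plain flags, and the lock functions take
void * directly so that no function-pointer cast is needed (the real
header casts the typed qemu_*_lock functions instead).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for QemuMutex and CoMutex: just a flag each. */
typedef struct ToyMutex { bool locked; } ToyMutex;
typedef struct ToyCoMutex { bool locked; } ToyCoMutex;

/* Lock/unlock helpers take void * so they match ToyLockFn exactly. */
static void toy_mutex_lock(void *p)      { ((ToyMutex *)p)->locked = true; }
static void toy_mutex_unlock(void *p)    { ((ToyMutex *)p)->locked = false; }
static void toy_co_mutex_lock(void *p)   { ((ToyCoMutex *)p)->locked = true; }
static void toy_co_mutex_unlock(void *p) { ((ToyCoMutex *)p)->locked = false; }

typedef void ToyLockFn(void *);

/* Same shape as the lockable struct: an object plus two callbacks. */
typedef struct ToyCoLockable {
    void *object;
    ToyLockFn *lock;
    ToyLockFn *unlock;
} ToyCoLockable;

/* Compound literal: lives until the end of the enclosing block. */
#define TOY_OBJ_(x, name) (&(ToyCoLockable) {       \
        .object = (x),                              \
        .lock = toy_ ## name ## _lock,              \
        .unlock = toy_ ## name ## _unlock           \
    })

/* _Generic picks the callbacks from the static type of the argument. */
#define TOY_MAKE_CO_LOCKABLE(x)                     \
    _Generic((x),                                   \
             ToyMutex *: TOY_OBJ_(x, mutex),        \
             ToyCoMutex *: TOY_OBJ_(x, co_mutex))

static inline void toy_co_lockable_lock(ToyCoLockable *x)
{
    x->lock(x->object);
}

static inline void toy_co_lockable_unlock(ToyCoLockable *x)
{
    x->unlock(x->object);
}
```

A caller that only sees a ToyCoLockable * can lock either kind of object
without knowing which one it was given, which is what
qemu_co_queue_wait_impl needs.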

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/qemu/co-lockable.h  | 98 +++++++++++++++++++++++++++++++++++++
 include/qemu/coroutine.h    |  5 +-
 include/qemu/lockable.h     | 13 ++---
 include/qemu/typedefs.h     |  1 +
 tests/unit/test-coroutine.c | 10 ++--
 util/qemu-coroutine-lock.c  |  6 +--
 6 files changed, 114 insertions(+), 19 deletions(-)
 create mode 100644 include/qemu/co-lockable.h

diff --git a/include/qemu/co-lockable.h b/include/qemu/co-lockable.h
new file mode 100644
index 0000000000..09f4620017
--- /dev/null
+++ b/include/qemu/co-lockable.h
@@ -0,0 +1,98 @@
+/*
+ * Polymorphic locking functions (aka poor man templates)
+ *
+ * Copyright Red Hat, Inc. 2017, 2018
+ *
+ * Author: Paolo Bonzini <pbonzini@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#ifndef QEMU_CO_LOCKABLE_H
+#define QEMU_CO_LOCKABLE_H
+
+#include "qemu/coroutine.h"
+#include "qemu/thread.h"
+
+typedef void coroutine_fn QemuCoLockUnlockFunc(void *);
+
+struct QemuCoLockable {
+    void *object;
+    QemuCoLockUnlockFunc *lock;
+    QemuCoLockUnlockFunc *unlock;
+};
+
+static inline __attribute__((__always_inline__)) QemuCoLockable *
+qemu_make_co_lockable(void *x, QemuCoLockable *lockable)
+{
+    /*
+     * We cannot test this in a macro, otherwise we get compiler
+     * warnings like "the address of 'm' will always evaluate as 'true'".
+     */
+    return x ? lockable : NULL;
+}
+
+static inline __attribute__((__always_inline__)) QemuCoLockable *
+qemu_null_co_lockable(void *x)
+{
+    if (x != NULL) {
+        qemu_build_not_reached();
+    }
+    return NULL;
+}
+
+/*
+ * In C, compound literals have the lifetime of an automatic variable.
+ * In C++ it would be different, but then C++ wouldn't need QemuCoLockable
+ * either...
+ */
+#define QMCL_OBJ_(x, name) (&(QemuCoLockable) {                         \
+        .object = (x),                                                  \
+        .lock = (QemuCoLockUnlockFunc *) qemu_ ## name ## _lock,        \
+        .unlock = (QemuCoLockUnlockFunc *) qemu_ ## name ## _unlock     \
+    })
+
+/**
+ * QEMU_MAKE_CO_LOCKABLE - Make a polymorphic QemuCoLockable
+ *
+ * @x: a lock object (currently one of QemuMutex, CoMutex).
+ *
+ * Returns a QemuCoLockable object that can be passed around
+ * to a function that can operate with locks of any kind, or
+ * NULL if @x is %NULL.
+ *
+ * Note the special case for void *, so that we may pass "NULL".
+ */
+#define QEMU_MAKE_CO_LOCKABLE(x)                                            \
+    _Generic((x), QemuCoLockable *: (x),                                    \
+             void *: qemu_null_co_lockable(x),                              \
+             QemuMutex *: qemu_make_co_lockable(x, QMCL_OBJ_(x, mutex)),    \
+             CoMutex *: qemu_make_co_lockable(x, QMCL_OBJ_(x, co_mutex)))   \
+
+/**
+ * QEMU_MAKE_CO_LOCKABLE_NONNULL - Make a polymorphic QemuCoLockable
+ *
+ * @x: a lock object (currently one of QemuMutex, QemuRecMutex,
+ *     CoMutex, QemuSpin).
+ *
+ * Returns a QemuCoLockable object that can be passed around
+ * to a function that can operate with locks of any kind.
+ */
+#define QEMU_MAKE_CO_LOCKABLE_NONNULL(x)                        \
+    _Generic((x), QemuCoLockable *: (x),                        \
+                  QemuMutex *: QMCL_OBJ_(x, mutex),             \
+                  CoMutex *: QMCL_OBJ_(x, co_mutex))
+
+static inline void coroutine_fn qemu_co_lockable_lock(QemuCoLockable *x)
+{
+    x->lock(x->object);
+}
+
+static inline void coroutine_fn qemu_co_lockable_unlock(QemuCoLockable *x)
+{
+    x->unlock(x->object);
+}
+
+#endif
diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index 666f3ba0e0..6f4596fc5b 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -204,8 +204,8 @@ void qemu_co_queue_init(CoQueue *queue);
  * locked again afterwards.
  */
 #define qemu_co_queue_wait(queue, lock) \
-    qemu_co_queue_wait_impl(queue, QEMU_MAKE_LOCKABLE(lock))
-void coroutine_fn qemu_co_queue_wait_impl(CoQueue *queue, QemuLockable *lock);
+    qemu_co_queue_wait_impl(queue, QEMU_MAKE_CO_LOCKABLE(lock))
+void coroutine_fn qemu_co_queue_wait_impl(CoQueue *queue, QemuCoLockable *lock);
 
 /**
  * Removes the next coroutine from the CoQueue, and wake it up.
@@ -342,5 +342,6 @@ void qemu_coroutine_increase_pool_batch_size(unsigned int additional_pool_size);
 void qemu_coroutine_decrease_pool_batch_size(unsigned int additional_pool_size);
 
 #include "qemu/lockable.h"
+#include "qemu/co-lockable.h"
 
 #endif /* QEMU_COROUTINE_H */
diff --git a/include/qemu/lockable.h b/include/qemu/lockable.h
index 86db7cb04c..c860f81737 100644
--- a/include/qemu/lockable.h
+++ b/include/qemu/lockable.h
@@ -13,7 +13,6 @@
 #ifndef QEMU_LOCKABLE_H
 #define QEMU_LOCKABLE_H
 
-#include "qemu/coroutine.h"
 #include "qemu/thread.h"
 
 typedef void QemuLockUnlockFunc(void *);
@@ -57,8 +56,7 @@ qemu_null_lockable(void *x)
 /**
  * QEMU_MAKE_LOCKABLE - Make a polymorphic QemuLockable
  *
- * @x: a lock object (currently one of QemuMutex, QemuRecMutex,
- *     CoMutex, QemuSpin).
+ * @x: a lock object (currently one of QemuMutex, QemuRecMutex, QemuSpin).
  *
  * Returns a QemuLockable object that can be passed around
  * to a function that can operate with locks of any kind, or
@@ -71,14 +69,12 @@ qemu_null_lockable(void *x)
              void *: qemu_null_lockable(x),                             \
              QemuMutex *: qemu_make_lockable(x, QML_OBJ_(x, mutex)),    \
              QemuRecMutex *: qemu_make_lockable(x, QML_OBJ_(x, rec_mutex)), \
-             CoMutex *: qemu_make_lockable(x, QML_OBJ_(x, co_mutex)),   \
              QemuSpin *: qemu_make_lockable(x, QML_OBJ_(x, spin)))
 
 /**
  * QEMU_MAKE_LOCKABLE_NONNULL - Make a polymorphic QemuLockable
  *
- * @x: a lock object (currently one of QemuMutex, QemuRecMutex,
- *     CoMutex, QemuSpin).
+ * @x: a lock object (currently one of QemuMutex, QemuRecMutex, QemuSpin).
  *
  * Returns a QemuLockable object that can be passed around
  * to a function that can operate with locks of any kind.
@@ -87,7 +83,6 @@ qemu_null_lockable(void *x)
     _Generic((x), QemuLockable *: (x),                          \
                   QemuMutex *: QML_OBJ_(x, mutex),              \
                   QemuRecMutex *: QML_OBJ_(x, rec_mutex),       \
-                  CoMutex *: QML_OBJ_(x, co_mutex),             \
                   QemuSpin *: QML_OBJ_(x, spin))
 
 static inline void qemu_lockable_lock(QemuLockable *x)
@@ -124,7 +119,7 @@ G_DEFINE_AUTOPTR_CLEANUP_FUNC(QemuLockable, qemu_lockable_auto_unlock)
 /**
  * WITH_QEMU_LOCK_GUARD - Lock a lock object for scope
  *
- * @x: a lock object (currently one of QemuMutex, CoMutex, QemuSpin).
+ * @x: a lock object (currently one of QemuMutex, QemuRecMutex, QemuSpin).
  *
  * This macro defines a lock scope such that entering the scope takes the lock
  * and leaving the scope releases the lock.  Return statements are allowed
@@ -149,7 +144,7 @@ G_DEFINE_AUTOPTR_CLEANUP_FUNC(QemuLockable, qemu_lockable_auto_unlock)
 /**
  * QEMU_LOCK_GUARD - Lock an object until the end of the scope
  *
- * @x: a lock object (currently one of QemuMutex, CoMutex, QemuSpin).
+ * @x: a lock object (currently one of QemuMutex, QemuRecMutex, QemuSpin).
  *
  * This macro takes a lock until the end of the scope.  Return statements
  * release the lock.
diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index 42f4ceb701..144ce82b8b 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -103,6 +103,7 @@ typedef struct QBool QBool;
 typedef struct QDict QDict;
 typedef struct QEMUBH QEMUBH;
 typedef struct QemuConsole QemuConsole;
+typedef struct QemuCoLockable QemuCoLockable;
 typedef struct QEMUFile QEMUFile;
 typedef struct QemuLockable QemuLockable;
 typedef struct QemuMutex QemuMutex;
diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index aa77a3bcb3..82e22db070 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -213,13 +213,13 @@ static void coroutine_fn mutex_fn(void *opaque)
 
 static void coroutine_fn lockable_fn(void *opaque)
 {
-    QemuLockable *x = opaque;
-    qemu_lockable_lock(x);
+    QemuCoLockable *x = opaque;
+    qemu_co_lockable_lock(x);
     assert(!locked);
     locked = true;
     qemu_coroutine_yield();
     locked = false;
-    qemu_lockable_unlock(x);
+    qemu_co_lockable_unlock(x);
     done++;
 }
 
@@ -259,9 +259,9 @@ static void test_co_mutex_lockable(void)
     CoMutex *null_pointer = NULL;
 
     qemu_co_mutex_init(&m);
-    do_test_co_mutex(lockable_fn, QEMU_MAKE_LOCKABLE(&m));
+    do_test_co_mutex(lockable_fn, QEMU_MAKE_CO_LOCKABLE(&m));
 
-    g_assert(QEMU_MAKE_LOCKABLE(null_pointer) == NULL);
+    g_assert(QEMU_MAKE_CO_LOCKABLE(null_pointer) == NULL);
 }
 
 static CoRwlock rwlock;
diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index 2669403839..c29cb69f5e 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -39,13 +39,13 @@ void qemu_co_queue_init(CoQueue *queue)
     QSIMPLEQ_INIT(&queue->entries);
 }
 
-void coroutine_fn qemu_co_queue_wait_impl(CoQueue *queue, QemuLockable *lock)
+void coroutine_fn qemu_co_queue_wait_impl(CoQueue *queue, QemuCoLockable *lock)
 {
     Coroutine *self = qemu_coroutine_self();
     QSIMPLEQ_INSERT_TAIL(&queue->entries, self, co_queue_next);
 
     if (lock) {
-        qemu_lockable_unlock(lock);
+        qemu_co_lockable_unlock(lock);
     }
 
     /* There is no race condition here.  Other threads will call
@@ -63,7 +63,7 @@ void coroutine_fn qemu_co_queue_wait_impl(CoQueue *queue, QemuLockable *lock)
      * other cases of QemuLockable.
      */
     if (lock) {
-        qemu_lockable_lock(lock);
+        qemu_co_lockable_lock(lock);
     }
 }
 
-- 
2.35.1





* [PATCH 04/35] coroutine: introduce coroutine_only_fn
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (2 preceding siblings ...)
  2022-03-10 12:43 ` [PATCH 03/35] coroutine: introduce QemuCoLockable Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 05/35] coroutine: small code cleanup in qemu_co_rwlock_wrlock Paolo Bonzini
                   ` (31 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Some functions only make sense from coroutine context, but never yield.
Mark them as "coroutine_only_fn".

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/qemu/coroutine.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index 6f4596fc5b..b23fba88c2 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -43,6 +43,7 @@
  *   }
  */
 #define coroutine_fn
+#define coroutine_only_fn
 
 typedef struct Coroutine Coroutine;
 
@@ -97,7 +98,7 @@ AioContext *qemu_coroutine_get_aio_context(Coroutine *co);
 /**
  * Get the currently executing coroutine
  */
-Coroutine *coroutine_fn qemu_coroutine_self(void);
+Coroutine *coroutine_only_fn qemu_coroutine_self(void);
 
 /**
  * Return whether or not currently inside a coroutine
@@ -170,7 +171,7 @@ void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex);
 /**
  * Assert that the current coroutine holds @mutex.
  */
-static inline coroutine_fn void qemu_co_mutex_assert_locked(CoMutex *mutex)
+static inline void coroutine_only_fn qemu_co_mutex_assert_locked(CoMutex *mutex)
 {
     /*
      * mutex->holder doesn't need any synchronisation if the assertion holds
-- 
2.35.1





* [PATCH 05/35] coroutine: small code cleanup in qemu_co_rwlock_wrlock
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (3 preceding siblings ...)
  2022-03-10 12:43 ` [PATCH 04/35] coroutine: introduce coroutine_only_fn Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 14:10   ` Philippe Mathieu-Daudé
  2022-03-10 12:43 ` [PATCH 06/35] disable some code Paolo Bonzini
                   ` (30 subsequent siblings)
  35 siblings, 1 reply; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

qemu_co_rwlock_wrlock stores the current coroutine in a local variable;
use it instead of calling qemu_coroutine_self() again.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/qemu-coroutine-lock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index c29cb69f5e..3f12b53a31 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -436,7 +436,7 @@ void qemu_co_rwlock_wrlock(CoRwlock *lock)
         lock->owners = -1;
         qemu_co_mutex_unlock(&lock->mutex);
     } else {
-        CoRwTicket my_ticket = { false, qemu_coroutine_self() };
+        CoRwTicket my_ticket = { false, self };
 
         QSIMPLEQ_INSERT_TAIL(&lock->tickets, &my_ticket, next);
         qemu_co_mutex_unlock(&lock->mutex);
-- 
2.35.1





* [PATCH 06/35] disable some code
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (4 preceding siblings ...)
  2022-03-10 12:43 ` [PATCH 05/35] coroutine: small code cleanup in qemu_co_rwlock_wrlock Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 07/35] coroutine: introduce the "stackless coroutine" backend Paolo Bonzini
                   ` (29 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Disable a lot of code that I can't be bothered to convert right now.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/unit/meson.build      |  2 +-
 tests/unit/test-coroutine.c |  6 ++++++
 util/meson.build            | 10 +++++-----
 util/qemu-coroutine-lock.c  |  2 ++
 util/qemu-coroutine-sleep.c |  2 ++
 5 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/tests/unit/meson.build b/tests/unit/meson.build
index 96b295263e..4ca5fdb699 100644
--- a/tests/unit/meson.build
+++ b/tests/unit/meson.build
@@ -61,7 +61,7 @@ endif
 
 if have_block
   tests += {
-    'test-coroutine': [testblock],
+    'test-coroutine': [],
     'test-aio': [testblock],
     'test-aio-multithread': [testblock],
     'test-throttle': [testblock],
diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index 82e22db070..c230c2fa6e 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -16,6 +16,7 @@
 #include "qemu/coroutine_int.h"
 #include "qemu/lockable.h"
 
+#if 0
 /*
  * Check that qemu_in_coroutine() works
  */
@@ -638,11 +639,13 @@ static void perf_cost(void)
                    duration, ops,
                    (unsigned long)(1000000000.0 * duration / maxcycles));
 }
+#endif
 
 int main(int argc, char **argv)
 {
     g_test_init(&argc, &argv, NULL);
 
+#if 0
     /* This test assumes there is a freelist and marks freed coroutine memory
      * with a sentinel value.  If there is no freelist this would legitimately
      * crash, so skip it.
@@ -650,7 +653,9 @@ int main(int argc, char **argv)
     if (CONFIG_COROUTINE_POOL) {
         g_test_add_func("/basic/no-dangling-access", test_no_dangling_access);
     }
+#endif
 
+#if 0
     g_test_add_func("/basic/lifecycle", test_lifecycle);
     g_test_add_func("/basic/yield", test_yield);
     g_test_add_func("/basic/nesting", test_nesting);
@@ -669,5 +674,6 @@ int main(int argc, char **argv)
         g_test_add_func("/perf/function-call", perf_baseline);
         g_test_add_func("/perf/cost", perf_cost);
     }
+#endif
     return g_test_run();
 }
diff --git a/util/meson.build b/util/meson.build
index f6ee74ad0c..30949cd481 100644
--- a/util/meson.build
+++ b/util/meson.build
@@ -76,13 +76,13 @@ if have_block
   util_ss.add(files('lockcnt.c'))
   util_ss.add(files('main-loop.c'))
   util_ss.add(files('nvdimm-utils.c'))
-  util_ss.add(files('qemu-coroutine.c', 'qemu-coroutine-lock.c', 'qemu-coroutine-io.c'))
-  util_ss.add(when: 'CONFIG_LINUX', if_true: [
-    files('vhost-user-server.c'), vhost_user
-  ])
+  util_ss.add(files('qemu-coroutine.c', 'qemu-coroutine-lock.c')) # 'qemu-coroutine-io.c'
+# util_ss.add(when: 'CONFIG_LINUX', if_true: [
+#   files('vhost-user-server.c'), vhost_user
+# ])
   util_ss.add(files('block-helpers.c'))
   util_ss.add(files('qemu-coroutine-sleep.c'))
-  util_ss.add(files('qemu-co-shared-resource.c'))
+# util_ss.add(files('qemu-co-shared-resource.c'))
   util_ss.add(files('thread-pool.c', 'qemu-timer.c'))
   util_ss.add(files('readline.c'))
   util_ss.add(files('throttle.c'))
diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index 3f12b53a31..d6c0565ba5 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -34,6 +34,7 @@
 #include "block/aio.h"
 #include "trace.h"
 
+#if 0
 void qemu_co_queue_init(CoQueue *queue)
 {
     QSIMPLEQ_INIT(&queue->entries);
@@ -465,3 +466,4 @@ void qemu_co_rwlock_upgrade(CoRwlock *lock)
         assert(lock->owners == -1);
     }
 }
+#endif
diff --git a/util/qemu-coroutine-sleep.c b/util/qemu-coroutine-sleep.c
index 571ab521ff..b5bfb4ad18 100644
--- a/util/qemu-coroutine-sleep.c
+++ b/util/qemu-coroutine-sleep.c
@@ -17,6 +17,7 @@
 #include "qemu/timer.h"
 #include "block/aio.h"
 
+#if 0
 static const char *qemu_co_sleep_ns__scheduled = "qemu_co_sleep_ns";
 
 void qemu_co_sleep_wake(QemuCoSleep *w)
@@ -78,3 +79,4 @@ void coroutine_fn qemu_co_sleep_ns_wakeable(QemuCoSleep *w,
     qemu_co_sleep(w);
     timer_del(&ts);
 }
+#endif
-- 
2.35.1





* [PATCH 07/35] coroutine: introduce the "stackless coroutine" backend
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (5 preceding siblings ...)
  2022-03-10 12:43 ` [PATCH 06/35] disable some code Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 08/35] /basic/lifecycle Paolo Bonzini
                   ` (28 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

This backend is similar to the one that was written for the "Continuation
Passing C" precompiler[1].  The main advantages of stackless coroutines,
in the context of QEMU, are two.  First, they do not make any assumption
about the layout of the stack (e.g. they do not need any special treatment
for control-flow protection technologies such as SafeStack or CET).
Second, they do not hide from the compiler the fact that coroutines can
and will move across threads.

Stackless coroutines actually do have a stack, but the frames of
awaitable functions are kept outside the *processor* stack.  The
qemu_coroutine_switch function from the runtime repeatedly invokes the
"handler" for the topmost entry of the coroutine call stack, so that
there is nothing on the processor stack between qemu_coroutine_switch
and a function that can yield.  Therefore, yielding can be done simply
by returning back to qemu_coroutine_switch.
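
A minimal sketch of such a loop, assuming a fixed-size array of frames
and hypothetical names (Sketch*, sketch_*, demo_*); it is not the code
in util/coroutine-stackless.c.  The three action values are the ones
described further down in this message.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    COROUTINE_CONTINUE,   /* keep running the topmost handler */
    COROUTINE_YIELD,      /* leave the loop, can be resumed later */
    COROUTINE_TERMINATE,  /* leave the loop, coroutine is finished */
} CoroutineAction;

typedef CoroutineAction SketchHandler(void *data);

typedef struct SketchFrame {
    SketchHandler *handler;
    void *data;
} SketchFrame;

#define SKETCH_MAX_FRAMES 16   /* assumed limit, for the sketch only */

typedef struct SketchCoroutine {
    SketchFrame frames[SKETCH_MAX_FRAMES];
    size_t depth;
} SketchCoroutine;

/* "Call": make a new handler the topmost entry. */
static void sketch_push(SketchCoroutine *co, SketchHandler *h, void *data)
{
    assert(co->depth < SKETCH_MAX_FRAMES);
    co->frames[co->depth++] = (SketchFrame){ h, data };
}

/* "Return": drop the topmost entry. */
static void sketch_pop(SketchCoroutine *co)
{
    assert(co->depth > 0);
    co->depth--;
}

/*
 * Invoke the topmost handler until one asks to leave the loop.  Nothing
 * sits on the processor stack between this function and the handler,
 * so yielding is just a return.
 */
static CoroutineAction sketch_switch(SketchCoroutine *co)
{
    for (;;) {
        SketchFrame *top;
        CoroutineAction action;

        assert(co->depth > 0);
        top = &co->frames[co->depth - 1];
        action = top->handler(top->data);
        if (action != COROUTINE_CONTINUE) {
            return action;
        }
    }
}

/* Demo handlers: the root "calls" a leaf, then yields, then terminates. */
typedef struct DemoState {
    SketchCoroutine *co;
    int step;
    int leaf_calls;
} DemoState;

static CoroutineAction demo_leaf(void *opaque)
{
    DemoState *s = opaque;

    s->leaf_calls++;
    sketch_pop(s->co);
    return COROUTINE_CONTINUE;   /* the loop resumes the caller's handler */
}

static CoroutineAction demo_root(void *opaque)
{
    DemoState *s = opaque;

    switch (s->step++) {
    case 0:
        sketch_push(s->co, demo_leaf, s);
        return COROUTINE_CONTINUE;
    case 1:
        return COROUTINE_YIELD;
    default:
        return COROUTINE_TERMINATE;
    }
}
```

Entering the coroutine once runs demo_root, demo_leaf and demo_root
again before the yield reaches the caller; entering it a second time
terminates it.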

An awaitable function is split in two parts.  User code calls an
external function that sets up the stack frame with the arguments and
calls into the second part.  The second part takes a single void*
argument pointing to the stack frame, so that it can be called on
resumption from qemu_coroutine_switch.  You can already see a bit of
this separation in the runtime, where qemu_coroutine_new sets up a frame
for coroutine_trampoline, which is the function that is actually invoked
by qemu_coroutine_switch.

Both parts return a CoroutineAction enum: COROUTINE_CONTINUE to just go on
with the execution loop (typically because an awaitable function reached
the end of the function and popped a stack frame, more on this below),
COROUTINE_YIELD to exit to the caller, COROUTINE_TERMINATE to clean up the
current coroutine and exit to the caller.  COROUTINE_TERMINATE actually
is only returned by the runtime's internal coroutine_trampoline.  The
fact that the return value is always a CoroutineAction (the actual return
value must be passed back through a pointer argument, pointing into the
caller's stack frame) means that code changes are necessary for the
coroutines themselves; that's a separate topic which I'll get to in a moment.
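
To make the two-part split and the pointer-based return value concrete,
here is a sketch under stated assumptions: the frame is allocated with
malloc, the "coroutine stack" is a single hypothetical slot
(sketch_top_handler/sketch_top_frame), and all function names are made
up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum {
    COROUTINE_CONTINUE,
    COROUTINE_YIELD,
    COROUTINE_TERMINATE,
} CoroutineAction;

typedef CoroutineAction SketchHandler(void *frame);

/* Hypothetical one-entry coroutine stack: what to call on resumption. */
static SketchHandler *sketch_top_handler;
static void *sketch_top_frame;

/* Frame of the awaitable function: arguments, result pointer, state. */
struct sum_frame {
    int a, b;
    int *ret;     /* the "real" return value goes through this pointer */
    int step;
};

/* Second part: takes only the frame, so the runtime can re-invoke it. */
static CoroutineAction co__sum_after_yield(void *opaque)
{
    struct sum_frame *f = opaque;

    if (f->step == 0) {
        f->step = 1;
        return COROUTINE_YIELD;        /* suspend once before computing */
    }
    *f->ret = f->a + f->b;
    sketch_top_handler = NULL;         /* pop the frame */
    sketch_top_frame = NULL;
    free(f);
    return COROUTINE_CONTINUE;         /* end of function reached */
}

/* First part: called by user code, sets up the frame from the arguments. */
static CoroutineAction sum_after_yield(int a, int b, int *ret)
{
    struct sum_frame *f = malloc(sizeof(*f));

    assert(f != NULL);
    *f = (struct sum_frame){ .a = a, .b = b, .ret = ret, .step = 0 };
    sketch_top_handler = co__sum_after_yield;
    sketch_top_frame = f;
    return co__sum_after_yield(f);
}
```

The first call returns COROUTINE_YIELD without touching *ret; invoking
the saved handler on the saved frame completes the function and stores
the sum.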

Compared to other backends, the only extra change needed to common code
is in the implementation of qemu_coroutine_yield().  The implementation
for fiber-based backends cannot be reused because qemu_coroutine_yield()
now needs to return a CoroutineAction enum, just like any other
awaitable function[2].

There are two possible implementations of the stackless coroutine handlers.
The first is to change the handler address every time an awaitable
function yields, pointing the handler to a function that executes the
rest of the function.  This effectively means transforming the function
to continuation-passing style and is what Continuation Passing C did.

The alternative is to turn the function into a state machine (a large
switch statement); the information on where to restart execution is
stored in the stack frame.  This is the approach that I chose for these
conversions.  Because the yield points can be nested arbitrarily deep inside
loops or conditionals, this would be a very hard thing to do for a source-to-source
translator in any normal language.  But C is not a normal language, and
especially its switch statement is not normal.  A construction similar
to Duff's device makes it possible to do this in a source-to-source
manner:

  switch (t->_step) {
  case 0:
        do_something_without_yielding();
        for (i = 0; i < n; i++) {
                ...
                t->_step = 1;
                t->i = i;
                t->n = n;
                return another_coroutine(...);
  case 1:
                i = t->i;
                n = t->n;
                ...
        }
  }
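
A complete, compilable variant of the same construction, as a sketch
only: struct count_frame and co__count are hypothetical names, and the
handler yields directly instead of calling another coroutine.

```c
#include <assert.h>

typedef enum {
    COROUTINE_CONTINUE,
    COROUTINE_YIELD,
    COROUTINE_TERMINATE,
} CoroutineAction;

struct count_frame {
    int _step;   /* where to restart: 0 = beginning, 1 = inside the loop */
    int i, n;    /* locals saved across the yield point */
    int sum;     /* running total, kept in the frame for the caller */
};

/* Adds 0..n-1 to sum, yielding after every addition. */
static CoroutineAction co__count(void *opaque)
{
    struct count_frame *t = opaque;
    int i, n;

    switch (t->_step) {
    case 0:
        n = t->n;
        for (i = 0; i < n; i++) {
            t->sum += i;
            /* Save the restart point and the live locals, then yield. */
            t->_step = 1;
            t->i = i;
            t->n = n;
            return COROUTINE_YIELD;
    case 1:
            /* Resumed in the middle of the loop body: reload locals. */
            i = t->i;
            n = t->n;
        }
    }
    return COROUTINE_CONTINUE;   /* loop finished */
}
```

With n = 3 the handler yields three times and returns
COROUTINE_CONTINUE on the fourth invocation; the case label inside the
for body is what lets the switch jump straight back into the loop.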

Putting everything in a single function *should* reduce code size
somewhat (due to prologs and epilogs), and would also provide chances
for optimization when an awaitable function is called but does not yield.
I have not done this in this series, but it basically would entail changing
all occurrences of "return awaitable_function();" to

   if (awaitable_function() == COROUTINE_YIELD) return COROUTINE_YIELD;
   fallthrough;
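
A minimal sketch of that variant, under simplifying assumptions: frames and
the dispatcher are left out, and wait_ready, struct UseFrame and
use_resource are invented for the example.  When the awaitable does not
yield, execution falls into the next case without another trip through the
caller:

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { COROUTINE_YIELD = 1, COROUTINE_CONTINUE = 4 } CoroutineAction;

/* Hypothetical awaitable: yields only if the resource is not ready yet. */
static CoroutineAction wait_ready(bool *ready)
{
    if (*ready) {
        return COROUTINE_CONTINUE;
    }
    *ready = true;      /* pretend the wakeup makes the resource ready */
    return COROUTINE_YIELD;
}

struct UseFrame {
    int _step;
    bool *ready;
    int result;
};

static CoroutineAction use_resource(struct UseFrame *f)
{
    switch (f->_step) {
    case 0:
        f->_step = 1;
        if (wait_ready(f->ready) == COROUTINE_YIELD) {
            return COROUTINE_YIELD;
        }
        /* fall through */
    case 1:
        f->result = 42;
    }
    return COROUTINE_CONTINUE;
}
```

If the resource is ready on entry, a single call to use_resource() runs both
steps; otherwise the first call yields and the second one resumes at case 1.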

Because of the code transformations that are needed, stackless coroutines
require compiler support.  I did the conversion manually for this
experiment, but I don't recommend it for the sake of your sanity; it should
be restricted to internal functions such as coroutine_trampoline
or qemu_coroutine_yield().  Therefore, the idea would be to use a
source-to-source translator, again similar to Continuation Passing C.
Debugging support on the level of the current gdb script is probably
possible too.

A basic translator should produce code roughly similar to what I have
written by hand, except where I applied a little more care or
sophistication to simplify the translation.

For example, if a function does not call an awaitable function except
in tail calls, it is not necessary to construct a stack frame and tear
it down immediately after the tail calls return.  I have sometimes used
this shortcut because it makes the conversion simpler.
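
The tail-call case can be pictured with a toy function whose only awaitable
call is in tail position.  This is a sketch only: demo_yield and demo_lock
are hypothetical and stand in for functions such as qemu_coroutine_yield()
and a lock fast path:

```c
#include <assert.h>

typedef enum { COROUTINE_YIELD = 1, COROUTINE_CONTINUE = 4 } CoroutineAction;

/* Hypothetical awaitable leaf: always suspends the coroutine. */
static CoroutineAction demo_yield(void)
{
    return COROUTINE_YIELD;
}

/*
 * Nothing runs after the awaitable call returns, so there is no state
 * to keep and no frame to allocate: the callee's action is passed on.
 */
static CoroutineAction demo_lock(int *locked)
{
    if (!*locked) {
        *locked = 1;
        return COROUTINE_CONTINUE;  /* fast path, no awaitable call */
    }
    return demo_yield();            /* tail call */
}
```

As soon as any code has to run after the awaitable call, the function needs
a frame and a resume point again.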

In addition, the translator need not save variables that are not written
on any path from a previous yield point to the current one, and need not
load variables that are not read on any path from the current yield point
to the next one.  This looks complicated, but it is a relatively simple
data flow problem; in any case, it is not important for a basic translator.

A final design point is the implementation of the per-coroutine stack.
In this implementation I chose to have no allocations while coroutines
run: the per-coroutine stack is really a stack that carves out frames
from a large COROUTINE_STACK_SIZE-bytes block.  However, it is also
possible to implement coroutine_stack_alloc and coroutine_stack_free
using malloc/free.
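
For comparison, a malloc-based variant could look roughly like the sketch
below.  It is an assumption-laden illustration, not the code in this patch:
CoState is a cut-down stand-in for the per-coroutine structure, and
frame_alloc, frame_free, struct DemoFrame and nop_impl are invented names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef enum { COROUTINE_YIELD = 1, COROUTINE_CONTINUE = 4 } CoroutineAction;
typedef CoroutineAction CoroutineImpl(void *opaque);

typedef struct {
    CoroutineImpl *caller_func;
    void *caller_frame;
} CoroutineFrame;

/* Hypothetical, cut-down per-coroutine state. */
typedef struct {
    CoroutineImpl *current_func;
    void *current_frame;
} CoState;

/* Example frame type; every frame starts with a CoroutineFrame. */
struct DemoFrame {
    CoroutineFrame common;
    int payload;
};

static CoroutineAction nop_impl(void *opaque)
{
    (void)opaque;
    return COROUTINE_CONTINUE;
}

/* Allocate one frame per call and link it to the caller's frame. */
static void *frame_alloc(CoState *co, CoroutineImpl *func, size_t bytes)
{
    CoroutineFrame *f;

    assert(bytes >= sizeof(*f));
    f = calloc(1, bytes);
    if (!f) {
        abort();
    }
    f->caller_func = co->current_func;
    f->caller_frame = co->current_frame;
    co->current_func = func;
    co->current_frame = f;
    return f;
}

/* Frames are still released in LIFO order, innermost first. */
static void frame_free(CoState *co, CoroutineFrame *f)
{
    assert(co->current_frame == f);
    co->current_func = f->caller_func;
    co->current_frame = f->caller_frame;
    free(f);
}
```

The trade-off is one allocation per awaitable call in exchange for not
reserving COROUTINE_STACK_SIZE bytes per coroutine up front.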

[1] https://arxiv.org/pdf/1310.3404.pdf
[2] The QEMU/CPC paper kept the target-independent qemu_coroutine_yield,
    by special casing it in the runtime.  I chose not to because, if
    QEMU were to switch to stackless coroutines, the other backends
    would likely be dropped at the same time.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 configure                    |  44 +---------
 include/qemu/co-lockable.h   |  26 ++++--
 include/qemu/coroutine.h     |  48 +++++++----
 include/qemu/coroutine_int.h |   6 --
 util/coroutine-stackless.c   | 159 +++++++++++++++++++++++++++++++++++
 util/qemu-coroutine.c        |  16 ----
 6 files changed, 212 insertions(+), 87 deletions(-)
 create mode 100644 util/coroutine-stackless.c

diff --git a/configure b/configure
index 886000346a..e45d2c3b9c 100755
--- a/configure
+++ b/configure
@@ -1220,8 +1220,6 @@ Advanced options (experts only):
   --with-trace-file=NAME   Full PATH,NAME of file to store traces
                            Default:trace-<pid>
   --cpu=CPU                Build for host CPU [$cpu]
-  --with-coroutine=BACKEND coroutine backend. Supported options:
-                           ucontext, sigaltstack, windows
   --enable-gcov            enable test coverage analysis with gcov
   --tls-priority           default TLS protocol/cipher priority string
   --enable-plugins
@@ -1242,7 +1240,7 @@ cat << EOF
   debug-info      debugging information
   lto             Enable Link-Time Optimization.
   safe-stack      SafeStack Stack Smash Protection. Depends on
-                  clang/llvm >= 3.7 and requires coroutine backend ucontext.
+                  clang/llvm >= 3.7
   rdma            Enable RDMA-based migration
   pvrdma          Enable PVRDMA support
   vhost-net       vhost-net kernel acceleration support
@@ -2338,39 +2336,7 @@ EOF
   fi
 fi
 
-if test "$coroutine" = ""; then
-  if test "$mingw32" = "yes"; then
-    coroutine=win32
-  elif test "$ucontext_works" = "yes"; then
-    coroutine=ucontext
-  else
-    coroutine=sigaltstack
-  fi
-else
-  case $coroutine in
-  windows)
-    if test "$mingw32" != "yes"; then
-      error_exit "'windows' coroutine backend only valid for Windows"
-    fi
-    # Unfortunately the user visible backend name doesn't match the
-    # coroutine-*.c filename for this case, so we have to adjust it here.
-    coroutine=win32
-    ;;
-  ucontext)
-    if test "$ucontext_works" != "yes"; then
-      feature_not_found "ucontext"
-    fi
-    ;;
-  sigaltstack)
-    if test "$mingw32" = "yes"; then
-      error_exit "only the 'windows' coroutine backend is valid for Windows"
-    fi
-    ;;
-  *)
-    error_exit "unknown coroutine backend $coroutine"
-    ;;
-  esac
-fi
+coroutine=stackless
 
 ##################################################
 # SafeStack
@@ -2395,9 +2361,6 @@ EOF
   else
     error_exit "SafeStack not supported by your compiler"
   fi
-  if test "$coroutine" != "ucontext"; then
-    error_exit "SafeStack is only supported by the coroutine backend ucontext"
-  fi
 else
 cat > $TMPC << EOF
 int main(int argc, char *argv[])
@@ -2427,9 +2390,6 @@ else # "$safe_stack" = ""
     safe_stack="no"
   else
     safe_stack="yes"
-    if test "$coroutine" != "ucontext"; then
-      error_exit "SafeStack is only supported by the coroutine backend ucontext"
-    fi
   fi
 fi
 fi
diff --git a/include/qemu/co-lockable.h b/include/qemu/co-lockable.h
index 09f4620017..95d058e2c9 100644
--- a/include/qemu/co-lockable.h
+++ b/include/qemu/co-lockable.h
@@ -16,7 +16,7 @@
 #include "qemu/coroutine.h"
 #include "qemu/thread.h"
 
-typedef void coroutine_fn QemuCoLockUnlockFunc(void *);
+typedef CoroutineAction QemuCoLockUnlockFunc(void *);
 
 struct QemuCoLockable {
     void *object;
@@ -24,6 +24,18 @@ struct QemuCoLockable {
     QemuCoLockUnlockFunc *unlock;
 };
 
+static inline CoroutineAction qemu_mutex_co_lock(QemuMutex *mutex)
+{
+    qemu_mutex_lock(mutex);
+    return COROUTINE_CONTINUE;
+}
+
+static inline CoroutineAction qemu_mutex_co_unlock(QemuMutex *mutex)
+{
+    qemu_mutex_unlock(mutex);
+    return COROUTINE_CONTINUE;
+}
+
 static inline __attribute__((__always_inline__)) QemuCoLockable *
 qemu_make_co_lockable(void *x, QemuCoLockable *lockable)
 {
@@ -68,7 +80,7 @@ qemu_null_co_lockable(void *x)
 #define QEMU_MAKE_CO_LOCKABLE(x)                                            \
     _Generic((x), QemuCoLockable *: (x),                                    \
              void *: qemu_null_co_lockable(x),                              \
-             QemuMutex *: qemu_make_co_lockable(x, QMCL_OBJ_(x, mutex)),    \
+             QemuMutex *: qemu_make_co_lockable(x, QMCL_OBJ_(x, mutex_co)), \
              CoMutex *: qemu_make_co_lockable(x, QMCL_OBJ_(x, co_mutex)))   \
 
 /**
@@ -82,17 +94,17 @@ qemu_null_co_lockable(void *x)
  */
 #define QEMU_MAKE_CO_LOCKABLE_NONNULL(x)                        \
     _Generic((x), QemuCoLockable *: (x),                        \
-                  QemuMutex *: QMCL_OBJ_(x, mutex),             \
+                  QemuMutex *: QMCL_OBJ_(x, mutex_co),          \
                   CoMutex *: QMCL_OBJ_(x, co_mutex))
 
-static inline void coroutine_fn qemu_co_lockable_lock(QemuCoLockable *x)
+static inline CoroutineAction qemu_co_lockable_lock(QemuCoLockable *x)
 {
-    x->lock(x->object);
+    return x->lock(x->object);
 }
 
-static inline void coroutine_fn qemu_co_lockable_unlock(QemuCoLockable *x)
+static inline CoroutineAction qemu_co_lockable_unlock(QemuCoLockable *x)
 {
-    x->unlock(x->object);
+    return x->unlock(x->object);
 }
 
 #endif
diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index b23fba88c2..2f2be6abfe 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -18,6 +18,20 @@
 #include "qemu/queue.h"
 #include "qemu/timer.h"
 
+typedef enum {
+    COROUTINE_YIELD = 1,
+    COROUTINE_TERMINATE = 2,
+    COROUTINE_ENTER = 3,
+    COROUTINE_CONTINUE = 4,
+} CoroutineAction;
+
+typedef CoroutineAction CoroutineImpl(void *opaque);
+
+typedef struct {
+    CoroutineImpl *caller_func;
+    void *caller_frame;
+} CoroutineFrame;
+
 /**
  * Coroutines are a mechanism for stack switching and can be used for
  * cooperative userspace threading.  These functions provide a simple but
@@ -56,7 +70,7 @@ typedef struct Coroutine Coroutine;
  * When this function returns, the coroutine is destroyed automatically and
  * execution continues in the caller who last entered the coroutine.
  */
-typedef void coroutine_fn CoroutineEntry(void *opaque);
+typedef CoroutineAction CoroutineEntry(void *opaque);
 
 /**
  * Create a new coroutine
@@ -88,7 +102,7 @@ void qemu_aio_coroutine_enter(AioContext *ctx, Coroutine *co);
  * This function does not return until the coroutine is re-entered using
  * qemu_coroutine_enter().
  */
-void coroutine_fn qemu_coroutine_yield(void);
+CoroutineAction qemu_coroutine_yield(void);
 
 /**
  * Get the AioContext of the given coroutine
@@ -160,13 +174,13 @@ void qemu_co_mutex_init(CoMutex *mutex);
  * Locks the mutex. If the lock cannot be taken immediately, control is
  * transferred to the caller of the current coroutine.
  */
-void coroutine_fn qemu_co_mutex_lock(CoMutex *mutex);
+CoroutineAction qemu_co_mutex_lock(CoMutex *mutex);
 
 /**
  * Unlocks the mutex and schedules the next coroutine that was waiting for this
  * lock to be run.
  */
-void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex);
+CoroutineAction qemu_co_mutex_unlock(CoMutex *mutex);
 
 /**
  * Assert that the current coroutine holds @mutex.
@@ -206,7 +220,7 @@ void qemu_co_queue_init(CoQueue *queue);
  */
 #define qemu_co_queue_wait(queue, lock) \
     qemu_co_queue_wait_impl(queue, QEMU_MAKE_CO_LOCKABLE(lock))
-void coroutine_fn qemu_co_queue_wait_impl(CoQueue *queue, QemuCoLockable *lock);
+CoroutineAction qemu_co_queue_wait_impl(CoQueue *queue, QemuCoLockable *lock);
 
 /**
  * Removes the next coroutine from the CoQueue, and wake it up.
@@ -262,7 +276,7 @@ void qemu_co_rwlock_init(CoRwlock *lock);
  * of a parallel writer, control is transferred to the caller of the current
  * coroutine.
  */
-void coroutine_fn qemu_co_rwlock_rdlock(CoRwlock *lock);
+CoroutineAction qemu_co_rwlock_rdlock(CoRwlock *lock);
 
 /**
  * Write Locks the CoRwlock from a reader.  This is a bit more efficient than
@@ -271,7 +285,7 @@ void coroutine_fn qemu_co_rwlock_rdlock(CoRwlock *lock);
  * to the caller of the current coroutine; another writer might run while
  * @qemu_co_rwlock_upgrade blocks.
  */
-void coroutine_fn qemu_co_rwlock_upgrade(CoRwlock *lock);
+CoroutineAction qemu_co_rwlock_upgrade(CoRwlock *lock);
 
 /**
  * Downgrades a write-side critical section to a reader.  Downgrading with
@@ -279,20 +293,20 @@ void coroutine_fn qemu_co_rwlock_upgrade(CoRwlock *lock);
  * followed by @qemu_co_rwlock_rdlock.  This makes it more efficient, but
  * may also sometimes be necessary for correctness.
  */
-void coroutine_fn qemu_co_rwlock_downgrade(CoRwlock *lock);
+CoroutineAction qemu_co_rwlock_downgrade(CoRwlock *lock);
 
 /**
  * Write Locks the mutex. If the lock cannot be taken immediately because
  * of a parallel reader, control is transferred to the caller of the current
  * coroutine.
  */
-void coroutine_fn qemu_co_rwlock_wrlock(CoRwlock *lock);
+CoroutineAction qemu_co_rwlock_wrlock(CoRwlock *lock);
 
 /**
  * Unlocks the read/write lock and schedules the next coroutine that was
  * waiting for this lock to be run.
  */
-void coroutine_fn qemu_co_rwlock_unlock(CoRwlock *lock);
+CoroutineAction qemu_co_rwlock_unlock(CoRwlock *lock);
 
 typedef struct QemuCoSleep {
     Coroutine *to_wake;
@@ -303,18 +317,18 @@ typedef struct QemuCoSleep {
  * during this yield, it can be passed to qemu_co_sleep_wake() to
  * terminate the sleep.
  */
-void coroutine_fn qemu_co_sleep_ns_wakeable(QemuCoSleep *w,
+CoroutineAction qemu_co_sleep_ns_wakeable(QemuCoSleep *w,
                                             QEMUClockType type, int64_t ns);
 
 /**
  * Yield the coroutine until the next call to qemu_co_sleep_wake.
  */
-void coroutine_fn qemu_co_sleep(QemuCoSleep *w);
+CoroutineAction qemu_co_sleep(QemuCoSleep *w);
 
-static inline void coroutine_fn qemu_co_sleep_ns(QEMUClockType type, int64_t ns)
+static inline CoroutineAction qemu_co_sleep_ns(QEMUClockType type, int64_t ns)
 {
     QemuCoSleep w = { 0 };
-    qemu_co_sleep_ns_wakeable(&w, type, ns);
+    return qemu_co_sleep_ns_wakeable(&w, type, ns);
 }
 
 /**
@@ -330,7 +344,7 @@ void qemu_co_sleep_wake(QemuCoSleep *w);
  *
  * Note that this function clobbers the handlers for the file descriptor.
  */
-void coroutine_fn yield_until_fd_readable(int fd);
+CoroutineAction yield_until_fd_readable(int fd);
 
 /**
  * Increase coroutine pool size
@@ -342,7 +356,9 @@ void qemu_coroutine_increase_pool_batch_size(unsigned int additional_pool_size);
  */
 void qemu_coroutine_decrease_pool_batch_size(unsigned int additional_pool_size);
 
-#include "qemu/lockable.h"
 #include "qemu/co-lockable.h"
 
+void *coroutine_only_fn stack_alloc(CoroutineImpl *func, size_t bytes);
+CoroutineAction coroutine_only_fn stack_free(CoroutineFrame *f);
+
 #endif /* QEMU_COROUTINE_H */
diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h
index 1da148552f..1989370194 100644
--- a/include/qemu/coroutine_int.h
+++ b/include/qemu/coroutine_int.h
@@ -35,12 +35,6 @@ extern __thread void *__safestack_unsafe_stack_ptr;
 
 #define COROUTINE_STACK_SIZE (1 << 20)
 
-typedef enum {
-    COROUTINE_YIELD = 1,
-    COROUTINE_TERMINATE = 2,
-    COROUTINE_ENTER = 3,
-} CoroutineAction;
-
 struct Coroutine {
     CoroutineEntry *entry;
     void *entry_arg;
diff --git a/util/coroutine-stackless.c b/util/coroutine-stackless.c
new file mode 100644
index 0000000000..7ba3b0cf63
--- /dev/null
+++ b/util/coroutine-stackless.c
@@ -0,0 +1,159 @@
+/*
+ * stackless coroutine initialization code
+ *
+ * Copyright (C) 2022 Paolo Bonzini <pbonzini@redhat.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.0 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "trace.h"
+#include "qemu/coroutine_int.h"
+
+typedef struct {
+    Coroutine base;
+    void *stack;
+    void *stack_ptr;
+    CoroutineImpl *current_func;
+    void *current_frame;
+} CoroutineStackless;
+
+static __thread CoroutineStackless leader;
+static __thread Coroutine *current;
+
+static void *coroutine_stack_alloc(CoroutineStackless *co, CoroutineImpl *func, size_t bytes)
+{
+    CoroutineFrame *ret = co->stack_ptr;
+
+    bytes = ROUND_UP(bytes, 16);
+    assert(bytes <= COROUTINE_STACK_SIZE - (co->stack_ptr - co->stack));
+    co->stack_ptr += bytes;
+    ret->caller_func = co->current_func;
+    ret->caller_frame = co->current_frame;
+    co->current_func = func;
+    co->current_frame = ret;
+    return ret;
+}
+
+static void coroutine_stack_free(CoroutineStackless *co, CoroutineFrame *f)
+{
+    assert((void *)f >= co->stack && (void *)f < co->stack_ptr);
+    co->current_func = f->caller_func;
+    co->current_frame = f->caller_frame;
+    co->stack_ptr = f;
+}
+
+struct FRAME__coroutine_trampoline {
+    CoroutineFrame common;
+    bool back;
+};
+
+static CoroutineAction coroutine_trampoline(void *_frame)
+{
+    struct FRAME__coroutine_trampoline *_f = _frame;
+    Coroutine *co = current;
+    if (!_f->back) {
+        _f->back = true;
+        // or:
+        //   if (co->entry(co->entry_arg) == COROUTINE_YIELD) return COROUTINE_YIELD;
+        return co->entry(co->entry_arg);
+    }
+
+    _f->back = false;
+    current = co->caller;
+    co->caller = NULL;
+    return COROUTINE_TERMINATE;
+}
+
+Coroutine *qemu_coroutine_new(void)
+{
+    CoroutineStackless *co;
+    struct FRAME__coroutine_trampoline *frame;
+
+    co = g_malloc0(sizeof(*co));
+    co->stack = g_malloc(COROUTINE_STACK_SIZE);
+    co->stack_ptr = co->stack;
+
+    frame = coroutine_stack_alloc(co, coroutine_trampoline, sizeof(*frame));
+    frame->back = false;
+    return &co->base;
+}
+
+void qemu_coroutine_delete(Coroutine *co_)
+{
+    CoroutineStackless *co = DO_UPCAST(CoroutineStackless, base, co_);
+    struct FRAME__coroutine_trampoline *frame = co->current_frame;
+
+    assert(!frame->back);
+    coroutine_stack_free(co, co->current_frame);
+    assert(co->stack_ptr == co->stack);
+    g_free(co->stack);
+    g_free(co);
+}
+
+CoroutineAction
+qemu_coroutine_switch(Coroutine *from, Coroutine *to,
+                      CoroutineAction action)
+{
+    assert(action == COROUTINE_ENTER);
+    assert(to->caller != NULL);
+    current = to;
+    do {
+        CoroutineStackless *co = DO_UPCAST(CoroutineStackless, base, to);
+        action = co->current_func(co->current_frame);
+    } while (action == COROUTINE_CONTINUE);
+    assert(action != COROUTINE_ENTER);
+    return action;
+}
+
+CoroutineAction qemu_coroutine_yield(void)
+{
+    Coroutine *from = current;
+    Coroutine *to = from->caller;
+    trace_qemu_coroutine_yield(from, to);
+    if (!to) {
+        fprintf(stderr, "Co-routine is yielding to no one\n");
+        abort();
+    }
+    from->caller = NULL;
+    current = to;
+    return COROUTINE_YIELD;
+}
+
+Coroutine *qemu_coroutine_self(void)
+{
+    if (!current) {
+        current = &leader.base;
+    }
+    return current;
+}
+
+bool qemu_in_coroutine(void)
+{
+    return current && current->caller;
+}
+
+void *stack_alloc(CoroutineImpl *func, size_t bytes)
+{
+    CoroutineStackless *co = DO_UPCAST(CoroutineStackless, base, current);
+
+    return coroutine_stack_alloc(co, func, bytes);
+}
+
+CoroutineAction stack_free(CoroutineFrame *f)
+{
+    CoroutineStackless *co = DO_UPCAST(CoroutineStackless, base, current);
+    coroutine_stack_free(co, f);
+    return COROUTINE_CONTINUE;
+}
diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
index 9f2bd96fa0..0ae2a4090f 100644
--- a/util/qemu-coroutine.c
+++ b/util/qemu-coroutine.c
@@ -179,22 +179,6 @@ void qemu_coroutine_enter_if_inactive(Coroutine *co)
     }
 }
 
-void coroutine_fn qemu_coroutine_yield(void)
-{
-    Coroutine *self = qemu_coroutine_self();
-    Coroutine *to = self->caller;
-
-    trace_qemu_coroutine_yield(self, to);
-
-    if (!to) {
-        fprintf(stderr, "Co-routine is yielding to no one\n");
-        abort();
-    }
-
-    self->caller = NULL;
-    qemu_coroutine_switch(self, to, COROUTINE_YIELD);
-}
-
 bool qemu_coroutine_entered(Coroutine *co)
 {
     return co->caller;
-- 
2.35.1





* [PATCH 08/35] /basic/lifecycle
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (6 preceding siblings ...)
  2022-03-10 12:43 ` [PATCH 07/35] coroutine: introduce the "stackless coroutine" backend Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 09/35] convert qemu-coroutine-sleep.c to stackless coroutines Paolo Bonzini
                   ` (27 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/unit/test-coroutine.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index c230c2fa6e..3670750c5b 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -423,16 +423,18 @@ static void test_co_rwlock_downgrade(void)
 
     g_assert(c1_done);
 }
+#endif
 
 /*
  * Check that creation, enter, and return work
  */
 
-static void coroutine_fn set_and_exit(void *opaque)
+static CoroutineAction set_and_exit(void *opaque)
 {
     bool *done = opaque;
 
     *done = true;
+    return COROUTINE_CONTINUE;
 }
 
 static void test_lifecycle(void)
@@ -452,6 +454,7 @@ static void test_lifecycle(void)
     g_assert(done); /* expect done to be true (second time) */
 }
 
+#if 0
 
 #define RECORD_SIZE 10 /* Leave some room for expansion */
 struct coroutine_position {
@@ -655,8 +658,8 @@ int main(int argc, char **argv)
     }
 #endif
 
-#if 0
     g_test_add_func("/basic/lifecycle", test_lifecycle);
+#if 0
     g_test_add_func("/basic/yield", test_yield);
     g_test_add_func("/basic/nesting", test_nesting);
     g_test_add_func("/basic/self", test_self);
-- 
2.35.1





* [PATCH 09/35] convert qemu-coroutine-sleep.c to stackless coroutines
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (7 preceding siblings ...)
  2022-03-10 12:43 ` [PATCH 08/35] /basic/lifecycle Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 10/35] enable tail call optimization of qemu_co_mutex_lock Paolo Bonzini
                   ` (26 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

The main change is to qemu_co_sleep_ns_wakeable, which gets the full
conversion treatment.  It's important to note that variables that escape
(have their address taken), such as "QEMUTimer ts" in this case, move
entirely to the frame structure and do not have local variables anymore.
For the others, always using the frame structure would be inefficient,
so they need to be saved and restored.  Perhaps "restrict" would be
an idea too; I haven't investigated it.
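
The reason escaping variables must live in the frame can be seen in a toy
example.  This is a sketch with invented names: fake_timer_arm and
fake_timer_fire stand in for a timer that keeps a pointer to its target,
and struct WaitFrame and wait_co are hypothetical:

```c
#include <assert.h>

typedef enum { COROUTINE_YIELD = 1, COROUTINE_CONTINUE = 4 } CoroutineAction;

/* Hypothetical stand-in for a timer that remembers a pointer. */
static int *pending_target;

static void fake_timer_arm(int *target)
{
    pending_target = target;
}

static void fake_timer_fire(void)
{
    *pending_target = 1;
}

struct WaitFrame {
    int _step;
    int fired;      /* escapes: its address is handed to the timer */
    int result;
};

static CoroutineAction wait_co(struct WaitFrame *f)
{
    switch (f->_step) {
    case 0:
        f->fired = 0;
        /*
         * A C local would be gone after the return below, so the
         * variable whose address is taken lives only in the frame.
         */
        fake_timer_arm(&f->fired);
        f->_step = 1;
        return COROUTINE_YIELD;
    case 1:
        f->result = f->fired ? 7 : -1;
    }
    return COROUTINE_CONTINUE;
}
```

Between the two calls to wait_co() the C stack frame of the first call no
longer exists, but the pointer held by the timer is still valid because it
points into the coroutine frame.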

qemu_co_sleep almost has a tail call to qemu_coroutine_yield(), except for
an assertion after qemu_coroutine_yield() returns.  For simplicity and
to demonstrate the optimization I'm removing the assertion.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/qemu-coroutine-sleep.c | 59 ++++++++++++++++++++++++++++---------
 1 file changed, 45 insertions(+), 14 deletions(-)

diff --git a/util/qemu-coroutine-sleep.c b/util/qemu-coroutine-sleep.c
index b5bfb4ad18..3d0b1579b3 100644
--- a/util/qemu-coroutine-sleep.c
+++ b/util/qemu-coroutine-sleep.c
@@ -17,7 +17,6 @@
 #include "qemu/timer.h"
 #include "block/aio.h"
 
-#if 0
 static const char *qemu_co_sleep_ns__scheduled = "qemu_co_sleep_ns";
 
 void qemu_co_sleep_wake(QemuCoSleep *w)
@@ -42,7 +41,7 @@ static void co_sleep_cb(void *opaque)
     qemu_co_sleep_wake(w);
 }
 
-void coroutine_fn qemu_co_sleep(QemuCoSleep *w)
+CoroutineAction qemu_co_sleep(QemuCoSleep *w)
 {
     Coroutine *co = qemu_coroutine_self();
 
@@ -56,27 +55,59 @@ void coroutine_fn qemu_co_sleep(QemuCoSleep *w)
     }
 
     w->to_wake = co;
-    qemu_coroutine_yield();
+    return qemu_coroutine_yield();
 
     /* w->to_wake is cleared before resuming this coroutine.  */
-    assert(w->to_wake == NULL);
+    // assert(w->to_wake == NULL);
 }
 
-void coroutine_fn qemu_co_sleep_ns_wakeable(QemuCoSleep *w,
-                                            QEMUClockType type, int64_t ns)
-{
-    AioContext *ctx = qemu_get_current_aio_context();
-    QEMUTimer ts;
+struct FRAME__qemu_co_sleep_ns_wakeable {
+    CoroutineFrame common;
+    uint32_t _step;
+    QemuCoSleep *w;
+    QEMUClockType type;
+    int64_t ns;
+    QEMUTimer ts;
+};
 
-    aio_timer_init(ctx, &ts, type, SCALE_NS, co_sleep_cb, w);
-    timer_mod(&ts, qemu_clock_get_ns(type) + ns);
+static CoroutineAction co__qemu_co_sleep_ns_wakeable(void *_frame)
+{
+    struct FRAME__qemu_co_sleep_ns_wakeable *_f = _frame;
+    AioContext *ctx = qemu_get_current_aio_context();
+
+switch(_f->_step) {
+case 0: {
+    QemuCoSleep *w = _f->w;
+    QEMUClockType type = _f->type;
+    int64_t ns = _f->ns;
+    aio_timer_init(ctx, &_f->ts, type, SCALE_NS, co_sleep_cb, w);
+    timer_mod(&_f->ts, qemu_clock_get_ns(type) + ns);
 
     /*
      * The timer will fire in the current AiOContext, so the callback
      * must happen after qemu_co_sleep yields and there is no race
      * between timer_mod and qemu_co_sleep.
      */
-    qemu_co_sleep(w);
-    timer_del(&ts);
+_f->_step = 1;
+    return qemu_co_sleep(w);
+}
+case 1:
+    timer_del(&_f->ts);
+    goto _out;
+}
+_out:
+stack_free(&_f->common);
+return COROUTINE_CONTINUE;
+}
+
+CoroutineAction qemu_co_sleep_ns_wakeable(QemuCoSleep *w,
+                                          QEMUClockType type, int64_t ns)
+{
+    struct FRAME__qemu_co_sleep_ns_wakeable *f;
+    f = stack_alloc(co__qemu_co_sleep_ns_wakeable, sizeof(*f));
+    f->w = w;
+    f->type = type;
+    f->ns = ns;
+    f->_step = 0;
+    return co__qemu_co_sleep_ns_wakeable(f);
 }
-#endif
-- 
2.35.1





* [PATCH 10/35] enable tail call optimization of qemu_co_mutex_lock
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (8 preceding siblings ...)
  2022-03-10 12:43 ` [PATCH 09/35] convert qemu-coroutine-sleep.c to stackless coroutines Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 11/35] convert CoMutex to stackless coroutines Paolo Bonzini
                   ` (25 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Make qemu_co_mutex_lock_slowpath a tail call, so that qemu_co_mutex_lock
does not need to build a stack frame of its own.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/qemu-coroutine-lock.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index d6c0565ba5..048cfcea71 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -231,6 +231,8 @@ static void coroutine_fn qemu_co_mutex_lock_slowpath(AioContext *ctx,
 
     qemu_coroutine_yield();
     trace_qemu_co_mutex_lock_return(mutex, self);
+    mutex->holder = self;
+    self->locks_held++;
 }
 
 void coroutine_fn qemu_co_mutex_lock(CoMutex *mutex)
@@ -266,11 +268,11 @@ retry_fast_path:
         /* Uncontended.  */
         trace_qemu_co_mutex_lock_uncontended(mutex, self);
         mutex->ctx = ctx;
+        mutex->holder = self;
+        self->locks_held++;
     } else {
         qemu_co_mutex_lock_slowpath(ctx, mutex);
     }
-    mutex->holder = self;
-    self->locks_held++;
 }
 
 void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex)
-- 
2.35.1





* [PATCH 11/35] convert CoMutex to stackless coroutines
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (9 preceding siblings ...)
  2022-03-10 12:43 ` [PATCH 10/35] enable tail call optimization of qemu_co_mutex_lock Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 12/35] define magic macros for " Paolo Bonzini
                   ` (24 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Build the frame for qemu_co_mutex_lock_slowpath, because it has code
that runs after qemu_coroutine_yield().  For qemu_co_mutex_lock() and
qemu_co_mutex_unlock(), just return COROUTINE_CONTINUE on paths that do
not go through an awaitable function, which is all of them in the case
of qemu_co_mutex_unlock().

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/qemu-coroutine-lock.c | 60 ++++++++++++++++++++++++++++++--------
 1 file changed, 48 insertions(+), 12 deletions(-)

diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index 048cfcea71..061a376aa4 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -120,6 +120,7 @@ bool qemu_co_queue_empty(CoQueue *queue)
 {
     return QSIMPLEQ_FIRST(&queue->entries) == NULL;
 }
+#endif
 
 /* The wait records are handled with a multiple-producer, single-consumer
  * lock-free queue.  There cannot be two concurrent pop_waiter() calls
@@ -197,15 +198,28 @@ static void coroutine_fn qemu_co_mutex_wake(CoMutex *mutex, Coroutine *co)
     aio_co_wake(co);
 }
 
-static void coroutine_fn qemu_co_mutex_lock_slowpath(AioContext *ctx,
-                                                     CoMutex *mutex)
-{
-    Coroutine *self = qemu_coroutine_self();
+struct FRAME__qemu_co_mutex_lock_slowpath {
+    CoroutineFrame common;
+    uint32_t _step;
+    AioContext *ctx;
+    CoMutex *mutex;
+    Coroutine *self;
     CoWaitRecord w;
+};
+
+static CoroutineAction co__qemu_co_mutex_lock_slowpath(void *_frame)
+{
+    struct FRAME__qemu_co_mutex_lock_slowpath *_f = _frame;
+    AioContext *ctx = _f->ctx;
+    CoMutex *mutex = _f->mutex;
+    Coroutine *self;
     unsigned old_handoff;
 
+switch(_f->_step) {
+case 0: {
+    self = qemu_coroutine_self();
     trace_qemu_co_mutex_lock_entry(mutex, self);
-    push_waiter(mutex, &w);
+    push_waiter(mutex, &_f->w);
 
     /* This is the "Responsibility Hand-Off" protocol; a lock() picks from
      * a concurrent unlock() the responsibility of waking somebody up.
@@ -221,21 +235,40 @@ static void coroutine_fn qemu_co_mutex_lock_slowpath(AioContext *ctx,
         Coroutine *co = to_wake->co;
         if (co == self) {
             /* We got the lock ourselves!  */
-            assert(to_wake == &w);
+            assert(to_wake == &_f->w);
             mutex->ctx = ctx;
-            return;
+            goto _out;
         }
 
         qemu_co_mutex_wake(mutex, co);
     }
 
-    qemu_coroutine_yield();
+_f->_step = 1;
+_f->self = self;
+    return qemu_coroutine_yield();
+}
+case 1:
+self = _f->self;
     trace_qemu_co_mutex_lock_return(mutex, self);
     mutex->holder = self;
     self->locks_held++;
+    goto _out;
+}
+_out:
+return stack_free(&_f->common);
 }
 
-void coroutine_fn qemu_co_mutex_lock(CoMutex *mutex)
+static CoroutineAction qemu_co_mutex_lock_slowpath(AioContext *ctx, CoMutex *mutex)
+{
+    struct FRAME__qemu_co_mutex_lock_slowpath *f;
+    f = stack_alloc(co__qemu_co_mutex_lock_slowpath, sizeof(*f));
+    f->ctx = ctx;
+    f->mutex = mutex;
+    f->_step = 0;
+    return co__qemu_co_mutex_lock_slowpath(f);
+}
+
+CoroutineAction qemu_co_mutex_lock(CoMutex *mutex)
 {
     AioContext *ctx = qemu_get_current_aio_context();
     Coroutine *self = qemu_coroutine_self();
@@ -270,12 +303,13 @@ retry_fast_path:
         mutex->ctx = ctx;
         mutex->holder = self;
         self->locks_held++;
+        return COROUTINE_CONTINUE;
     } else {
-        qemu_co_mutex_lock_slowpath(ctx, mutex);
+        return qemu_co_mutex_lock_slowpath(ctx, mutex);
     }
 }
 
-void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex)
+CoroutineAction qemu_co_mutex_unlock(CoMutex *mutex)
 {
     Coroutine *self = qemu_coroutine_self();
 
@@ -290,7 +324,7 @@ void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex)
     self->locks_held--;
     if (qatomic_fetch_dec(&mutex->locked) == 1) {
         /* No waiting qemu_co_mutex_lock().  Pfew, that was easy!  */
-        return;
+        return COROUTINE_CONTINUE;
     }
 
     for (;;) {
@@ -328,8 +362,10 @@ void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex)
     }
 
     trace_qemu_co_mutex_unlock_return(mutex, self);
+    return COROUTINE_CONTINUE;
 }
 
+#if 0
 struct CoRwTicket {
     bool read;
     Coroutine *co;
-- 
2.35.1





* [PATCH 12/35] define magic macros for stackless coroutines
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (10 preceding siblings ...)
  2022-03-10 12:43 ` [PATCH 11/35] convert CoMutex to stackless coroutines Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 13/35] /basic/yield Paolo Bonzini
                   ` (23 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Because conversion to stackless coroutines is incredibly repetitive,
define some magic variable-argument macros that simplify the task:

- CO_DECLARE_FRAME() declares a frame structure containing the common fields
  (the CoroutineFrame header and the _step resume index) plus any extra
  fields passed as variable arguments

- CO_INIT_FRAME() allocates the frame structure, builds it using any arguments
  provided by the user, and continues with the second part of the
  awaitable function that takes the frame as its only argument

- CO_ARG() declares variables and loads them from the frame structure.  It
  uses typeof() to avoid repeating the type of each variable (the type is
  then written only twice, in CO_DECLARE_FRAME() and in the declaration of
  the user-visible awaitable function)

- CO_DECLARE() also declares variables using typeof, but it's for locals that
  are not prepared by CO_INIT_FRAME()

- CO_SAVE() and CO_LOAD() copy to and from the frame structure
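
As a rough illustration of the pattern the macros abbreviate, here is a
standalone sketch that does not use the QEMU headers.  DemoAction,
struct demo_frame, demo_init(), demo_resume() and demo_count_yields() are
hypothetical names made up for this example; malloc()/free() stand in for
stack_alloc()/stack_free(), and the CoroutineFrame header is left out.
The comments show which macro each line corresponds to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for CoroutineAction; not the QEMU definition. */
typedef enum { DEMO_YIELD, DEMO_DONE } DemoAction;

/*
 * Roughly what CO_DECLARE_FRAME(yield_5_times, void *opaque, int i)
 * declares, without the "CoroutineFrame common" header.
 */
struct demo_frame {
    uint32_t _step;   /* which case label to resume at */
    void *opaque;     /* argument of the awaitable function */
    int i;            /* local that must survive across a yield */
};

/* Second half of the awaitable function: takes only the frame. */
static DemoAction demo_resume(struct demo_frame *_f)
{
    __typeof__(_f->opaque) opaque = _f->opaque;   /* CO_ARG(opaque) */
    __typeof__(_f->i) i;                          /* CO_DECLARE(i) */
    bool *done = opaque;

    switch (_f->_step) {
    case 0:
        for (i = 0; i < 5; i++) {
            _f->i = i;                            /* CO_SAVE(i) */
            _f->_step = 1;
            return DEMO_YIELD;                    /* suspend here */
    case 1:
            i = _f->i;                            /* CO_LOAD(i) */
        }
        *done = true;
        break;
    }
    return DEMO_DONE;
}

/*
 * First half, similar to CO_INIT_FRAME(yield_5_times, opaque): allocate
 * the frame, store the arguments, start at step 0.  The real macro also
 * calls the second half right away; here the caller does that.
 */
static struct demo_frame *demo_init(void *opaque)
{
    struct demo_frame *_f = malloc(sizeof(*_f));
    if (!_f) {
        abort();
    }
    _f->opaque = opaque;
    _f->_step = 0;
    return _f;
}

/* Drive the coroutine to completion and count how often it yielded. */
static int demo_count_yields(void)
{
    bool done = false;
    struct demo_frame *f = demo_init(&done);
    int yields = 0;

    while (demo_resume(f) == DEMO_YIELD) {
        yields++;
    }
    free(f);
    return done ? yields : -1;
}
```

The "case 1:" label sits inside the for loop on purpose: on re-entry the
switch jumps straight back into the loop body, which is why i has to be
reloaded from the frame before the loop increment runs.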

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/qemu/coroutine.h | 41 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index 2f2be6abfe..df148ff80e 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -361,4 +361,45 @@ void qemu_coroutine_decrease_pool_batch_size(unsigned int additional_pool_size);
 void *coroutine_only_fn stack_alloc(CoroutineImpl *func, size_t bytes);
 CoroutineAction coroutine_only_fn stack_free(CoroutineFrame *f);
 
+
+#define CO_DO(MACRO, ...) CO_DO_(MACRO, __VA_ARGS__, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)
+#define CO_DO_(MACRO, a0, a1, a2, a3, a4, a5, a6, a7, a8 , a9, n, ...) CO_DO##n(MACRO, a0, a1, a2, a3, a4, a5, a6, a7, a8 , a9)
+#define CO_DO0(MACRO, a0, ...)
+#define CO_DO1(MACRO, a0, ...) MACRO(a0)
+#define CO_DO2(MACRO, a0, ...) MACRO(a0); CO_DO1(MACRO, __VA_ARGS__)
+#define CO_DO3(MACRO, a0, ...) MACRO(a0); CO_DO2(MACRO, __VA_ARGS__)
+#define CO_DO4(MACRO, a0, ...) MACRO(a0); CO_DO3(MACRO, __VA_ARGS__)
+#define CO_DO5(MACRO, a0, ...) MACRO(a0); CO_DO4(MACRO, __VA_ARGS__)
+#define CO_DO6(MACRO, a0, ...) MACRO(a0); CO_DO5(MACRO, __VA_ARGS__)
+#define CO_DO7(MACRO, a0, ...) MACRO(a0); CO_DO6(MACRO, __VA_ARGS__)
+#define CO_DO8(MACRO, a0, ...) MACRO(a0); CO_DO7(MACRO, __VA_ARGS__)
+#define CO_DO9(MACRO, a0, ...) MACRO(a0); CO_DO8(MACRO, __VA_ARGS__)
+
+#define CO_FRAME1(decl) decl
+#define CO_SAVE1(var) _f->var = var
+#define CO_LOAD1(var) var = _f->var
+#define CO_DECLARE1(var) typeof(_f->var) var
+#define CO_ARG1(var) typeof(_f->var) var = _f->var
+
+#define CO_SAVE(...) CO_DO(CO_SAVE1, __VA_ARGS__)
+#define CO_LOAD(...) CO_DO(CO_LOAD1, __VA_ARGS__)
+#define CO_DECLARE(...) CO_DO(CO_DECLARE1, __VA_ARGS__)
+#define CO_ARG(...) CO_DO(CO_ARG1, __VA_ARGS__)
+
+#define CO_DECLARE_FRAME(func, ...) \
+    struct FRAME__##func { \
+        CoroutineFrame common; \
+        uint32_t _step; \
+        CO_DO(CO_FRAME1, __VA_ARGS__); \
+    }
+
+#define CO_INIT_FRAME(func, ...) \
+    co__##func(({ \
+        struct FRAME__##func *_f; \
+        _f = stack_alloc(co__##func, sizeof(*_f)); \
+        __VA_OPT__(CO_SAVE(__VA_ARGS__);) \
+        _f->_step = 0; \
+        _f; \
+    }))
+
 #endif /* QEMU_COROUTINE_H */
-- 
2.35.1





* [PATCH 13/35] /basic/yield
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (11 preceding siblings ...)
  2022-03-10 12:43 ` [PATCH 12/35] define magic macros for " Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 14/35] /basic/nesting Paolo Bonzini
                   ` (22 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/unit/test-coroutine.c | 27 +++++++++++++++++++++++----
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index 3670750c5b..ae06e97c95 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -141,15 +141,33 @@ static void test_nesting(void)
  * Check that yield/enter transfer control correctly
  */
 
-static void coroutine_fn yield_5_times(void *opaque)
+#endif
+CO_DECLARE_FRAME(yield_5_times, void *opaque, int i);
+static CoroutineAction co__yield_5_times(void *_frame)
 {
+    struct FRAME__yield_5_times *_f = _frame;
+    CO_ARG(opaque);
     bool *done = opaque;
-    int i;
+    CO_DECLARE(i);
 
+switch(_f->_step) {
+case 0:
     for (i = 0; i < 5; i++) {
-        qemu_coroutine_yield();
+CO_SAVE(i);
+_f->_step = 1;
+        return qemu_coroutine_yield();
+case 1:
+CO_LOAD(i);
     }
     *done = true;
+    break;
+}
+return stack_free(&_f->common);
+}
+
+static CoroutineAction yield_5_times(void *opaque)
+{
+    return CO_INIT_FRAME(yield_5_times, opaque);
 }
 
 static void test_yield(void)
@@ -166,6 +184,7 @@ static void test_yield(void)
     g_assert_cmpint(i, ==, 5); /* coroutine must yield 5 times */
 }
 
+#if 0
 static void coroutine_fn c2_fn(void *opaque)
 {
     qemu_coroutine_yield();
@@ -659,8 +678,8 @@ int main(int argc, char **argv)
 #endif
 
     g_test_add_func("/basic/lifecycle", test_lifecycle);
-#if 0
     g_test_add_func("/basic/yield", test_yield);
+#if 0
     g_test_add_func("/basic/nesting", test_nesting);
     g_test_add_func("/basic/self", test_self);
     g_test_add_func("/basic/entered", test_entered);
-- 
2.35.1





* [PATCH 14/35] /basic/nesting
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (12 preceding siblings ...)
  2022-03-10 12:43 ` [PATCH 13/35] /basic/yield Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 15/35] /basic/self Paolo Bonzini
                   ` (21 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/unit/test-coroutine.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index ae06e97c95..7aaadfd31a 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -93,6 +93,7 @@ static void test_entered(void)
     g_assert(!qemu_coroutine_entered(coroutine));
     qemu_coroutine_enter(coroutine);
 }
+#endif
 
 /*
  * Check that coroutines may nest multiple levels
@@ -104,7 +105,7 @@ typedef struct {
     unsigned int max;       /* maximum level of nesting */
 } NestData;
 
-static void coroutine_fn nest(void *opaque)
+static CoroutineAction nest(void *opaque)
 {
     NestData *nd = opaque;
 
@@ -118,6 +119,7 @@ static void coroutine_fn nest(void *opaque)
     }
 
     nd->n_return++;
+    return COROUTINE_CONTINUE;
 }
 
 static void test_nesting(void)
@@ -141,7 +143,6 @@ static void test_nesting(void)
  * Check that yield/enter transfer control correctly
  */
 
-#endif
 CO_DECLARE_FRAME(yield_5_times, void *opaque, int i);
 static CoroutineAction co__yield_5_times(void *_frame)
 {
@@ -679,8 +680,8 @@ int main(int argc, char **argv)
 
     g_test_add_func("/basic/lifecycle", test_lifecycle);
     g_test_add_func("/basic/yield", test_yield);
-#if 0
     g_test_add_func("/basic/nesting", test_nesting);
+#if 0
     g_test_add_func("/basic/self", test_self);
     g_test_add_func("/basic/entered", test_entered);
     g_test_add_func("/basic/in_coroutine", test_in_coroutine);
-- 
2.35.1





* [PATCH 15/35] /basic/self
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (13 preceding siblings ...)
  2022-03-10 12:43 ` [PATCH 14/35] /basic/nesting Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 16/35] /basic/entered Paolo Bonzini
                   ` (20 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/unit/test-coroutine.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index 7aaadfd31a..c701113d77 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -36,14 +36,16 @@ static void test_in_coroutine(void)
     qemu_coroutine_enter(coroutine);
 }
 
+#endif
 /*
  * Check that qemu_coroutine_self() works
  */
 
-static void coroutine_fn verify_self(void *opaque)
+static CoroutineAction verify_self(void *opaque)
 {
     Coroutine **p_co = opaque;
     g_assert(qemu_coroutine_self() == *p_co);
+    return COROUTINE_CONTINUE;
 }
 
 static void test_self(void)
@@ -53,6 +55,7 @@ static void test_self(void)
     coroutine = qemu_coroutine_create(verify_self, &coroutine);
     qemu_coroutine_enter(coroutine);
 }
+#if 0
 
 /*
  * Check that qemu_coroutine_entered() works
@@ -681,8 +684,8 @@ int main(int argc, char **argv)
     g_test_add_func("/basic/lifecycle", test_lifecycle);
     g_test_add_func("/basic/yield", test_yield);
     g_test_add_func("/basic/nesting", test_nesting);
-#if 0
     g_test_add_func("/basic/self", test_self);
+#if 0
     g_test_add_func("/basic/entered", test_entered);
     g_test_add_func("/basic/in_coroutine", test_in_coroutine);
     g_test_add_func("/basic/order", test_order);
-- 
2.35.1





* [PATCH 16/35] /basic/entered
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (14 preceding siblings ...)
  2022-03-10 12:43 ` [PATCH 15/35] /basic/self Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 17/35] /basic/in_coroutine Paolo Bonzini
                   ` (19 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/unit/test-coroutine.c | 30 ++++++++++++++++++++++--------
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index c701113d77..bc75050463 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -55,26 +55,40 @@ static void test_self(void)
     coroutine = qemu_coroutine_create(verify_self, &coroutine);
     qemu_coroutine_enter(coroutine);
 }
-#if 0
 
 /*
  * Check that qemu_coroutine_entered() works
  */
 
-static void coroutine_fn verify_entered_step_2(void *opaque)
+CO_DECLARE_FRAME(verify_entered_step_2, Coroutine *caller);
+static CoroutineAction co__verify_entered_step_2(void *_frame)
 {
-    Coroutine *caller = (Coroutine *)opaque;
+    struct FRAME__verify_entered_step_2 *_f = _frame;
+    CO_ARG(caller);
 
+switch(_f->_step)
+{
+case 0:
     g_assert(qemu_coroutine_entered(caller));
     g_assert(qemu_coroutine_entered(qemu_coroutine_self()));
-    qemu_coroutine_yield();
-
+    _f->_step = 1;
+    return qemu_coroutine_yield();
+case 1:
     /* Once more to check it still works after yielding */
     g_assert(qemu_coroutine_entered(caller));
     g_assert(qemu_coroutine_entered(qemu_coroutine_self()));
+    break;
+}
+return stack_free(&_f->common);
 }
 
-static void coroutine_fn verify_entered_step_1(void *opaque)
+static CoroutineAction verify_entered_step_2(void *opaque)
+{
+    Coroutine *caller = (Coroutine *)opaque;
+    return CO_INIT_FRAME(verify_entered_step_2, caller);
+}
+
+static CoroutineAction verify_entered_step_1(void *opaque)
 {
     Coroutine *self = qemu_coroutine_self();
     Coroutine *coroutine;
@@ -86,6 +100,7 @@ static void coroutine_fn verify_entered_step_1(void *opaque)
     qemu_coroutine_enter(coroutine);
     g_assert(!qemu_coroutine_entered(coroutine));
     qemu_coroutine_enter(coroutine);
+    return COROUTINE_CONTINUE;
 }
 
 static void test_entered(void)
@@ -96,7 +111,6 @@ static void test_entered(void)
     g_assert(!qemu_coroutine_entered(coroutine));
     qemu_coroutine_enter(coroutine);
 }
-#endif
 
 /*
  * Check that coroutines may nest multiple levels
@@ -685,8 +699,8 @@ int main(int argc, char **argv)
     g_test_add_func("/basic/yield", test_yield);
     g_test_add_func("/basic/nesting", test_nesting);
     g_test_add_func("/basic/self", test_self);
-#if 0
     g_test_add_func("/basic/entered", test_entered);
+#if 0
     g_test_add_func("/basic/in_coroutine", test_in_coroutine);
     g_test_add_func("/basic/order", test_order);
     g_test_add_func("/locking/co-mutex", test_co_mutex);
-- 
2.35.1





* [PATCH 17/35] /basic/in_coroutine
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (15 preceding siblings ...)
  2022-03-10 12:43 ` [PATCH 16/35] /basic/entered Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 18/35] /basic/order Paolo Bonzini
                   ` (18 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/unit/test-coroutine.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index bc75050463..6ad653adda 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -16,14 +16,14 @@
 #include "qemu/coroutine_int.h"
 #include "qemu/lockable.h"
 
-#if 0
 /*
  * Check that qemu_in_coroutine() works
  */
 
-static void coroutine_fn verify_in_coroutine(void *opaque)
+static CoroutineAction verify_in_coroutine(void *opaque)
 {
     g_assert(qemu_in_coroutine());
+    return COROUTINE_CONTINUE;
 }
 
 static void test_in_coroutine(void)
@@ -36,7 +36,6 @@ static void test_in_coroutine(void)
     qemu_coroutine_enter(coroutine);
 }
 
-#endif
 /*
  * Check that qemu_coroutine_self() works
  */
@@ -700,8 +699,8 @@ int main(int argc, char **argv)
     g_test_add_func("/basic/nesting", test_nesting);
     g_test_add_func("/basic/self", test_self);
     g_test_add_func("/basic/entered", test_entered);
-#if 0
     g_test_add_func("/basic/in_coroutine", test_in_coroutine);
+#if 0
     g_test_add_func("/basic/order", test_order);
     g_test_add_func("/locking/co-mutex", test_co_mutex);
     g_test_add_func("/locking/co-mutex/lockable", test_co_mutex_lockable);
-- 
2.35.1





* [PATCH 18/35] /basic/order
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (16 preceding siblings ...)
  2022-03-10 12:43 ` [PATCH 17/35] /basic/in_coroutine Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 19/35] /perf/lifecycle Paolo Bonzini
                   ` (17 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/unit/test-coroutine.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index 6ad653adda..3d898d50c6 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -490,8 +490,6 @@ static void test_lifecycle(void)
     g_assert(done); /* expect done to be true (second time) */
 }
 
-#if 0
-
 #define RECORD_SIZE 10 /* Leave some room for expansion */
 struct coroutine_position {
     int func;
@@ -508,13 +506,27 @@ static void record_push(int func, int state)
     cp->state = state;
 }
 
-static void coroutine_fn co_order_test(void *opaque)
+CO_DECLARE_FRAME(co_order_test);
+static CoroutineAction co__co_order_test(void *_frame)
 {
+    struct FRAME__co_order_test *_f = _frame;
+switch(_f->_step) {
+case 0:
     record_push(2, 1);
     g_assert(qemu_in_coroutine());
-    qemu_coroutine_yield();
+_f->_step = 1;
+    return qemu_coroutine_yield();
+case 1:
     record_push(2, 2);
     g_assert(qemu_in_coroutine());
+    break;
+}
+return stack_free(&_f->common);
+}
+
+static CoroutineAction co_order_test(void *opaque)
+{
+    return CO_INIT_FRAME(co_order_test);
 }
 
 static void do_order_test(void)
@@ -544,6 +556,7 @@ static void test_order(void)
         g_assert_cmpint(records[i].state, ==, expected_pos[i].state);
     }
 }
+#if 0
 /*
  * Lifecycle benchmark
  */
@@ -700,8 +713,8 @@ int main(int argc, char **argv)
     g_test_add_func("/basic/self", test_self);
     g_test_add_func("/basic/entered", test_entered);
     g_test_add_func("/basic/in_coroutine", test_in_coroutine);
-#if 0
     g_test_add_func("/basic/order", test_order);
+#if 0
     g_test_add_func("/locking/co-mutex", test_co_mutex);
     g_test_add_func("/locking/co-mutex/lockable", test_co_mutex_lockable);
     g_test_add_func("/locking/co-rwlock/upgrade", test_co_rwlock_upgrade);
-- 
2.35.1





* [PATCH 19/35] /perf/lifecycle
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (17 preceding siblings ...)
  2022-03-10 12:43 ` [PATCH 18/35] /basic/order Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 20/35] /perf/nesting Paolo Bonzini
                   ` (16 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/unit/test-coroutine.c | 44 ++++++++++++++++++++++++++++++++-----
 1 file changed, 39 insertions(+), 5 deletions(-)

diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index 3d898d50c6..439bd269c9 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -556,14 +556,21 @@ static void test_order(void)
         g_assert_cmpint(records[i].state, ==, expected_pos[i].state);
     }
 }
-#if 0
+
 /*
  * Lifecycle benchmark
  */
 
-static void coroutine_fn empty_coroutine(void *opaque)
+CO_DECLARE_FRAME(empty_coroutine);
+static CoroutineAction co__empty_coroutine(void *_frame)
 {
-    /* Do nothing */
+    struct FRAME__empty_coroutine *_f = _frame;
+    return stack_free(&_f->common);
+}
+
+static CoroutineAction empty_coroutine(void *opaque)
+{
+    return CO_INIT_FRAME(empty_coroutine);
 }
 
 static void perf_lifecycle(void)
@@ -572,7 +579,7 @@ static void perf_lifecycle(void)
     unsigned int i, max;
     double duration;
 
-    max = 1000000;
+    max = 10000000;
 
     g_test_timer_start();
     for (i = 0; i < max; i++) {
@@ -584,6 +591,30 @@ static void perf_lifecycle(void)
     g_test_message("Lifecycle %u iterations: %f s", max, duration);
 }
 
+static CoroutineAction empty_coroutine_noalloc(void *opaque)
+{
+    return COROUTINE_CONTINUE;
+}
+
+static void perf_lifecycle_noalloc(void)
+{
+    Coroutine *coroutine;
+    unsigned int i, max;
+    double duration;
+
+    max = 10000000;
+
+    g_test_timer_start();
+    for (i = 0; i < max; i++) {
+        coroutine = qemu_coroutine_create(empty_coroutine_noalloc, NULL);
+        qemu_coroutine_enter(coroutine);
+    }
+    duration = g_test_timer_elapsed();
+
+    g_test_message("Lifecycle (noalloc) %u iterations: %f s", max, duration);
+}
+
+#if 0
 static void perf_nesting(void)
 {
     unsigned int i, maxcycles, maxnesting;
@@ -719,13 +750,16 @@ int main(int argc, char **argv)
     g_test_add_func("/locking/co-mutex/lockable", test_co_mutex_lockable);
     g_test_add_func("/locking/co-rwlock/upgrade", test_co_rwlock_upgrade);
     g_test_add_func("/locking/co-rwlock/downgrade", test_co_rwlock_downgrade);
+#endif
     if (g_test_perf()) {
         g_test_add_func("/perf/lifecycle", perf_lifecycle);
+        g_test_add_func("/perf/lifecycle/noalloc", perf_lifecycle_noalloc);
+#if 0
         g_test_add_func("/perf/nesting", perf_nesting);
         g_test_add_func("/perf/yield", perf_yield);
         g_test_add_func("/perf/function-call", perf_baseline);
         g_test_add_func("/perf/cost", perf_cost);
-    }
 #endif
+    }
     return g_test_run();
 }
-- 
2.35.1





* [PATCH 20/35] /perf/nesting
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (18 preceding siblings ...)
  2022-03-10 12:43 ` [PATCH 19/35] /perf/lifecycle Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:43 ` [PATCH 21/35] /perf/yield Paolo Bonzini
                   ` (15 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/unit/test-coroutine.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index 439bd269c9..75d54e5d29 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -614,7 +614,6 @@ static void perf_lifecycle_noalloc(void)
     g_test_message("Lifecycle %u iterations: %f s", max, duration);
 }
 
-#if 0
 static void perf_nesting(void)
 {
     unsigned int i, maxcycles, maxnesting;
@@ -640,6 +639,7 @@ static void perf_nesting(void)
         maxcycles, maxnesting, duration);
 }
 
+#if 0
 /*
  * Yield benchmark
  */
@@ -754,8 +754,8 @@ int main(int argc, char **argv)
     if (g_test_perf()) {
         g_test_add_func("/perf/lifecycle", perf_lifecycle);
         g_test_add_func("/perf/lifecycle/noalloc", perf_lifecycle_noalloc);
-#if 0
         g_test_add_func("/perf/nesting", perf_nesting);
+#if 0
         g_test_add_func("/perf/yield", perf_yield);
         g_test_add_func("/perf/function-call", perf_baseline);
         g_test_add_func("/perf/cost", perf_cost);
-- 
2.35.1





* [PATCH 21/35] /perf/yield
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (19 preceding siblings ...)
  2022-03-10 12:43 ` [PATCH 20/35] /perf/nesting Paolo Bonzini
@ 2022-03-10 12:43 ` Paolo Bonzini
  2022-03-10 12:44 ` [PATCH 22/35] /perf/function-call Paolo Bonzini
                   ` (14 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/unit/test-coroutine.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index 75d54e5d29..0b7b4d6ef8 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -639,19 +639,33 @@ static void perf_nesting(void)
         maxcycles, maxnesting, duration);
 }
 
-#if 0
 /*
  * Yield benchmark
  */
 
-static void coroutine_fn yield_loop(void *opaque)
+CO_DECLARE_FRAME(yield_loop, void *opaque);
+static CoroutineAction co__yield_loop(void *_frame)
 {
+    struct FRAME__yield_loop *_f = _frame;
+    CO_ARG(opaque);
     unsigned int *counter = opaque;
 
+switch(_f->_step) {
+case 0:
     while ((*counter) > 0) {
         (*counter)--;
-        qemu_coroutine_yield();
+_f->_step = 1;
+        return qemu_coroutine_yield();
+case 1:
     }
+    break;
+}
+return stack_free(&_f->common);
+}
+
+static CoroutineAction yield_loop(void *opaque)
+{
+    return CO_INIT_FRAME(yield_loop, opaque);
 }
 
 static void perf_yield(void)
@@ -672,6 +686,7 @@ static void perf_yield(void)
     g_test_message("Yield %u iterations: %f s", maxcycles, duration);
 }
 
+#if 0
 static __attribute__((noinline)) void dummy(unsigned *i)
 {
     (*i)--;
@@ -755,8 +770,8 @@ int main(int argc, char **argv)
         g_test_add_func("/perf/lifecycle", perf_lifecycle);
         g_test_add_func("/perf/lifecycle/noalloc", perf_lifecycle_noalloc);
         g_test_add_func("/perf/nesting", perf_nesting);
-#if 0
         g_test_add_func("/perf/yield", perf_yield);
+#if 0
         g_test_add_func("/perf/function-call", perf_baseline);
         g_test_add_func("/perf/cost", perf_cost);
 #endif
-- 
2.35.1





* [PATCH 22/35] /perf/function-call
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (20 preceding siblings ...)
  2022-03-10 12:43 ` [PATCH 21/35] /perf/yield Paolo Bonzini
@ 2022-03-10 12:44 ` Paolo Bonzini
  2022-03-10 12:44 ` [PATCH 23/35] /perf/cost Paolo Bonzini
                   ` (13 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/unit/test-coroutine.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index 0b7b4d6ef8..c44287bcb0 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -686,7 +686,6 @@ static void perf_yield(void)
     g_test_message("Yield %u iterations: %f s", maxcycles, duration);
 }
 
-#if 0
 static __attribute__((noinline)) void dummy(unsigned *i)
 {
     (*i)--;
@@ -709,6 +708,7 @@ static void perf_baseline(void)
     g_test_message("Function call %u iterations: %f s", maxcycles, duration);
 }
 
+#if 0
 static __attribute__((noinline)) void perf_cost_func(void *opaque)
 {
     qemu_coroutine_yield();
@@ -771,8 +771,8 @@ int main(int argc, char **argv)
         g_test_add_func("/perf/lifecycle/noalloc", perf_lifecycle_noalloc);
         g_test_add_func("/perf/nesting", perf_nesting);
         g_test_add_func("/perf/yield", perf_yield);
-#if 0
         g_test_add_func("/perf/function-call", perf_baseline);
+#if 0
         g_test_add_func("/perf/cost", perf_cost);
 #endif
     }
-- 
2.35.1





* [PATCH 23/35] /perf/cost
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (21 preceding siblings ...)
  2022-03-10 12:44 ` [PATCH 22/35] /perf/function-call Paolo Bonzini
@ 2022-03-10 12:44 ` Paolo Bonzini
  2022-03-10 12:44 ` [PATCH 24/35] /basic/no-dangling-access Paolo Bonzini
                   ` (12 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/unit/test-coroutine.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index c44287bcb0..080ee76dde 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -708,10 +708,25 @@ static void perf_baseline(void)
     g_test_message("Function call %u iterations: %f s", maxcycles, duration);
 }
 
-#if 0
-static __attribute__((noinline)) void perf_cost_func(void *opaque)
+CO_DECLARE_FRAME(perf_cost_func);
+static CoroutineAction co__perf_cost_func(void *_frame)
 {
-    qemu_coroutine_yield();
+    struct FRAME__perf_cost_func *_f = _frame;
+
+switch(_f->_step)
+{
+case 0:
+    _f->_step = 1;
+    return qemu_coroutine_yield();
+case 1:
+    break;
+}
+return stack_free(&_f->common);
+}
+
+static CoroutineAction perf_cost_func(void *opaque)
+{
+    return CO_INIT_FRAME(perf_cost_func);
 }
 
 static void perf_cost(void)
@@ -737,7 +752,6 @@ static void perf_cost(void)
                    duration, ops,
                    (unsigned long)(1000000000.0 * duration / maxcycles));
 }
-#endif
 
 int main(int argc, char **argv)
 {
@@ -772,9 +786,7 @@ int main(int argc, char **argv)
         g_test_add_func("/perf/nesting", perf_nesting);
         g_test_add_func("/perf/yield", perf_yield);
         g_test_add_func("/perf/function-call", perf_baseline);
-#if 0
         g_test_add_func("/perf/cost", perf_cost);
-#endif
     }
     return g_test_run();
 }
-- 
2.35.1





* [PATCH 24/35] /basic/no-dangling-access
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (22 preceding siblings ...)
  2022-03-10 12:44 ` [PATCH 23/35] /perf/cost Paolo Bonzini
@ 2022-03-10 12:44 ` Paolo Bonzini
  2022-03-10 12:44 ` [PATCH 25/35] /locking/co-mutex Paolo Bonzini
                   ` (11 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/unit/test-coroutine.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index 080ee76dde..0fe9226b86 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -201,16 +201,32 @@ static void test_yield(void)
     g_assert_cmpint(i, ==, 5); /* coroutine must yield 5 times */
 }
 
-#if 0
-static void coroutine_fn c2_fn(void *opaque)
+CO_DECLARE_FRAME(c2_fn);
+static CoroutineAction co__c2_fn(void *_frame)
 {
-    qemu_coroutine_yield();
+    struct FRAME__c2_fn *_f = _frame;
+
+switch(_f->_step)
+{
+case 0:
+    _f->_step = 1;
+    return qemu_coroutine_yield();
+case 1:
+    break;
+}
+return stack_free(&_f->common);
 }
 
-static void coroutine_fn c1_fn(void *opaque)
+static CoroutineAction c2_fn(void *opaque)
+{
+    return CO_INIT_FRAME(c2_fn);
+}
+
+static CoroutineAction c1_fn(void *opaque)
 {
     Coroutine *c2 = opaque;
     qemu_coroutine_enter(c2);
+    return COROUTINE_CONTINUE;
 }
 
 static void test_no_dangling_access(void)
@@ -233,6 +249,7 @@ static void test_no_dangling_access(void)
     *c1 = tmp;
 }
 
+#if 0
 static bool locked;
 static int done;
 
@@ -757,7 +774,6 @@ int main(int argc, char **argv)
 {
     g_test_init(&argc, &argv, NULL);
 
-#if 0
     /* This test assumes there is a freelist and marks freed coroutine memory
      * with a sentinel value.  If there is no freelist this would legitimately
      * crash, so skip it.
@@ -765,7 +781,6 @@ int main(int argc, char **argv)
     if (CONFIG_COROUTINE_POOL) {
         g_test_add_func("/basic/no-dangling-access", test_no_dangling_access);
     }
-#endif
 
     g_test_add_func("/basic/lifecycle", test_lifecycle);
     g_test_add_func("/basic/yield", test_yield);
-- 
2.35.1





* [PATCH 25/35] /locking/co-mutex
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (23 preceding siblings ...)
  2022-03-10 12:44 ` [PATCH 24/35] /basic/no-dangling-access Paolo Bonzini
@ 2022-03-10 12:44 ` Paolo Bonzini
  2022-03-10 12:44 ` [PATCH 26/35] convert qemu_co_mutex_lock_slowpath to magic macros Paolo Bonzini
                   ` (10 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/unit/test-coroutine.c | 35 ++++++++++++++++++++++++++++-------
 1 file changed, 28 insertions(+), 7 deletions(-)

diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index 0fe9226b86..642ef36bc3 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -249,22 +249,41 @@ static void test_no_dangling_access(void)
     *c1 = tmp;
 }
 
-#if 0
 static bool locked;
 static int done;
 
-static void coroutine_fn mutex_fn(void *opaque)
+CO_DECLARE_FRAME(mutex_fn, CoMutex *m);
+static CoroutineAction co__mutex_fn(void *_frame)
 {
-    CoMutex *m = opaque;
-    qemu_co_mutex_lock(m);
+    struct FRAME__mutex_fn *_f = _frame;
+    CO_ARG(m);
+switch(_f->_step) {
+case 0:
+_f->_step = 1;
+    return qemu_co_mutex_lock(m);
+case 1:
     assert(!locked);
     locked = true;
-    qemu_coroutine_yield();
+_f->_step = 2;
+    return qemu_coroutine_yield();
+case 2:
     locked = false;
-    qemu_co_mutex_unlock(m);
+_f->_step = 3;
+    return qemu_co_mutex_unlock(m);
+case 3:
     done++;
+    break;
+}
+return stack_free(&_f->common);
 }
 
+static CoroutineAction mutex_fn(void *opaque)
+{
+    CoMutex *m = opaque;
+    return CO_INIT_FRAME(mutex_fn, m);
+}
+
+#if 0
 static void coroutine_fn lockable_fn(void *opaque)
 {
     QemuCoLockable *x = opaque;
@@ -276,6 +295,7 @@ static void coroutine_fn lockable_fn(void *opaque)
     qemu_co_lockable_unlock(x);
     done++;
 }
+#endif
 
 static void do_test_co_mutex(CoroutineEntry *entry, void *opaque)
 {
@@ -307,6 +327,7 @@ static void test_co_mutex(void)
     do_test_co_mutex(mutex_fn, &m);
 }
 
+#if 0
 static void test_co_mutex_lockable(void)
 {
     CoMutex m;
@@ -789,8 +810,8 @@ int main(int argc, char **argv)
     g_test_add_func("/basic/entered", test_entered);
     g_test_add_func("/basic/in_coroutine", test_in_coroutine);
     g_test_add_func("/basic/order", test_order);
-#if 0
     g_test_add_func("/locking/co-mutex", test_co_mutex);
+#if 0
     g_test_add_func("/locking/co-mutex/lockable", test_co_mutex_lockable);
     g_test_add_func("/locking/co-rwlock/upgrade", test_co_rwlock_upgrade);
     g_test_add_func("/locking/co-rwlock/downgrade", test_co_rwlock_downgrade);
-- 
2.35.1





* [PATCH 26/35] convert qemu_co_mutex_lock_slowpath to magic macros
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (24 preceding siblings ...)
  2022-03-10 12:44 ` [PATCH 25/35] /locking/co-mutex Paolo Bonzini
@ 2022-03-10 12:44 ` Paolo Bonzini
  2022-03-10 12:44 ` [PATCH 27/35] /locking/co-mutex/lockable Paolo Bonzini
                   ` (9 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Replace the hand-written frame structure with one built with the CO_* macros,
mainly to exercise the macros.  The generated code is exactly the same, except
that CO_INIT_FRAME uses a statement expression so that the "return" statement
stays visible in the source.
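
For readers who have not seen the hand-written form, here is a minimal,
self-contained sketch of the pattern being wrapped: a heap-allocated frame,
a step counter, and explicit save/load of a local around a yield.  It is
not QEMU code.  The CO_* macro definitions are not reproduced here, and
every toy_* name below is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an action type; not QEMU's CoroutineAction. */
typedef enum { TOY_CONTINUE, TOY_YIELD } ToyAction;

/* Everything that must survive a yield lives in the frame, not on the stack. */
struct toy_frame {
    uint32_t step;  /* which case label to resume at */
    int limit;      /* argument, stored once when the frame is created */
    int i;          /* local variable saved across the yield */
    int sum;        /* running result */
};

/* Runs until the next yield, or until the function body is finished. */
static ToyAction toy_sum_step(struct toy_frame *f)
{
    int i = 0;

    switch (f->step) {
    case 0:
        for (i = 1; i <= f->limit; i++) {
            f->sum += i;
            f->step = 1;
            f->i = i;           /* save the local before suspending */
            return TOY_YIELD;
    case 1:
            i = f->i;           /* reload the local after resuming */
        }
        break;
    }
    return TOY_CONTINUE;
}

/* Allocates the frame and stores the argument, like a creation wrapper. */
static struct toy_frame *toy_sum_new(int limit)
{
    struct toy_frame *f = calloc(1, sizeof(*f));
    assert(f);
    f->limit = limit;
    return f;
}

/* Resumes the body until it completes; returns how many times it yielded. */
static int toy_run(int limit, int *sum_out)
{
    struct toy_frame *f = toy_sum_new(limit);
    int yields = 0;

    while (toy_sum_step(f) == TOY_YIELD) {
        yields++;
    }
    *sum_out = f->sum;
    free(f);
    return yields;
}
```

Calling toy_run(4, &sum) resumes the body once per loop iteration.  The
point is only that the local "i" is lost on every return and has to be
copied through the frame, which is the bookkeeping the macros are meant
to hide.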

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/qemu-coroutine-lock.c | 25 ++++++-------------------
 1 file changed, 6 insertions(+), 19 deletions(-)

diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index 061a376aa4..51f7da8bda 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -198,21 +198,13 @@ static void coroutine_fn qemu_co_mutex_wake(CoMutex *mutex, Coroutine *co)
     aio_co_wake(co);
 }
 
-struct FRAME__qemu_co_mutex_lock_slowpath {
-    CoroutineFrame common;
-    uint32_t _step;
-    AioContext *ctx;
-    CoMutex *mutex;
-    Coroutine *self;
-    CoWaitRecord w;
-};
+CO_DECLARE_FRAME(qemu_co_mutex_lock_slowpath, AioContext *ctx, CoMutex *mutex, Coroutine *self, CoWaitRecord w);
 
 static CoroutineAction co__qemu_co_mutex_lock_slowpath(void *_frame)
 {
     struct FRAME__qemu_co_mutex_lock_slowpath *_f = _frame;
-    AioContext *ctx = _f->ctx;
-    CoMutex *mutex = _f->mutex;
-    Coroutine *self;
+    CO_ARG(ctx, mutex);
+    CO_DECLARE(self);
     unsigned old_handoff;
 
 switch(_f->_step) {
@@ -244,11 +236,11 @@ case 0: {
     }
 
 _f->_step = 1;
-_f->self = self;
+CO_SAVE(self);
     return qemu_coroutine_yield();
 }
 case 1:
-self = _f->self;
+CO_LOAD(self);
     trace_qemu_co_mutex_lock_return(mutex, self);
     mutex->holder = self;
     self->locks_held++;
@@ -260,12 +252,7 @@ return stack_free(&_f->common);
 
 static CoroutineAction qemu_co_mutex_lock_slowpath(AioContext *ctx, CoMutex *mutex)
 {
-    struct FRAME__qemu_co_mutex_lock_slowpath *f;
-    f = stack_alloc(co__qemu_co_mutex_lock_slowpath, sizeof(*f));
-    f->ctx = ctx;
-    f->mutex = mutex;
-    f->_step = 0;
-    return co__qemu_co_mutex_lock_slowpath(f);
+    return CO_INIT_FRAME(qemu_co_mutex_lock_slowpath, ctx, mutex);
 }
 
 CoroutineAction qemu_co_mutex_lock(CoMutex *mutex)
-- 
2.35.1





* [PATCH 27/35] /locking/co-mutex/lockable
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (25 preceding siblings ...)
  2022-03-10 12:44 ` [PATCH 26/35] convert qemu_co_mutex_lock_slowpath to magic macros Paolo Bonzini
@ 2022-03-10 12:44 ` Paolo Bonzini
  2022-03-10 12:44 ` [PATCH 28/35] qemu_co_rwlock_maybe_wake_one Paolo Bonzini
                   ` (8 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/unit/test-coroutine.c | 35 ++++++++++++++++++++++++++---------
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index 642ef36bc3..db6718db40 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -283,19 +283,36 @@ static CoroutineAction mutex_fn(void *opaque)
     return CO_INIT_FRAME(mutex_fn, m);
 }
 
-#if 0
-static void coroutine_fn lockable_fn(void *opaque)
+CO_DECLARE_FRAME(lockable_fn, QemuCoLockable *x);
+static CoroutineAction co__lockable_fn(void *_frame)
 {
-    QemuCoLockable *x = opaque;
-    qemu_co_lockable_lock(x);
+    struct FRAME__lockable_fn *_f = _frame;
+    CO_ARG(x);
+switch(_f->_step) {
+case 0:
+_f->_step = 1;
+    return qemu_co_lockable_lock(x);
+case 1:
     assert(!locked);
     locked = true;
-    qemu_coroutine_yield();
+_f->_step = 2;
+    return qemu_coroutine_yield();
+case 2:
     locked = false;
-    qemu_co_lockable_unlock(x);
+_f->_step = 3;
+    return qemu_co_lockable_unlock(x);
+case 3:
     done++;
+    break;
+}
+return stack_free(&_f->common);
+}
+
+static CoroutineAction lockable_fn(void *opaque)
+{
+    QemuCoLockable *x = opaque;
+    return CO_INIT_FRAME(lockable_fn, x);
 }
-#endif
 
 static void do_test_co_mutex(CoroutineEntry *entry, void *opaque)
 {
@@ -327,7 +344,6 @@ static void test_co_mutex(void)
     do_test_co_mutex(mutex_fn, &m);
 }
 
-#if 0
 static void test_co_mutex_lockable(void)
 {
     CoMutex m;
@@ -339,6 +355,7 @@ static void test_co_mutex_lockable(void)
     g_assert(QEMU_MAKE_CO_LOCKABLE(null_pointer) == NULL);
 }
 
+#if 0
 static CoRwlock rwlock;
 
 /* Test that readers are properly sent back to the queue when upgrading,
@@ -811,8 +828,8 @@ int main(int argc, char **argv)
     g_test_add_func("/basic/in_coroutine", test_in_coroutine);
     g_test_add_func("/basic/order", test_order);
     g_test_add_func("/locking/co-mutex", test_co_mutex);
-#if 0
     g_test_add_func("/locking/co-mutex/lockable", test_co_mutex_lockable);
+#if 0
     g_test_add_func("/locking/co-rwlock/upgrade", test_co_rwlock_upgrade);
     g_test_add_func("/locking/co-rwlock/downgrade", test_co_rwlock_downgrade);
 #endif
-- 
2.35.1





* [PATCH 28/35] qemu_co_rwlock_maybe_wake_one
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (26 preceding siblings ...)
  2022-03-10 12:44 ` [PATCH 27/35] /locking/co-mutex/lockable Paolo Bonzini
@ 2022-03-10 12:44 ` Paolo Bonzini
  2022-03-10 12:44 ` [PATCH 29/35] qemu_co_rwlock_rdlock Paolo Bonzini
                   ` (7 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

The conversion is simplified on the assumption that qemu_co_mutex_unlock()
never yields: its return value is asserted to be COROUTINE_CONTINUE instead
of being treated as a suspension point.  In other words,
qemu_co_mutex_unlock() and qemu_co_rwlock_maybe_wake_one()
could be declared coroutine_only_fn instead of coroutine_fn.
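
As a rough illustration of that assumption (not QEMU code; the demo_*
names and the DemoAction type are hypothetical), a callee that is known
never to suspend can be called inline and checked with an assertion,
so the caller needs no frame and no extra step:

```c
#include <assert.h>

/* Hypothetical stand-in for an action type; not QEMU's CoroutineAction. */
typedef enum { DEMO_CONTINUE, DEMO_YIELD } DemoAction;

struct demo_lock {
    int held;       /* nonzero while the lock is taken */
    int wakeups;    /* number of waiters woken so far */
};

/* Never suspends: it always reports DEMO_CONTINUE to its caller. */
static DemoAction demo_unlock(struct demo_lock *l)
{
    assert(l->held);
    l->held = 0;
    return DEMO_CONTINUE;
}

/*
 * Because demo_unlock() is assumed not to yield, this function can stay
 * an ordinary function: it asserts the assumption and carries on.
 */
static DemoAction demo_unlock_and_wake(struct demo_lock *l, int have_waiter)
{
    DemoAction action = demo_unlock(l);
    assert(action == DEMO_CONTINUE);

    if (have_waiter) {
        l->wakeups++;
    }
    return DEMO_CONTINUE;
}
```

If the callee could ever return something other than "continue", the
caller would need its own frame and a case label after the call, as in
the converted functions elsewhere in this series.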

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/qemu-coroutine-lock.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index 51f7da8bda..3b50e1dd5b 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -352,7 +352,6 @@ CoroutineAction qemu_co_mutex_unlock(CoMutex *mutex)
     return COROUTINE_CONTINUE;
 }
 
-#if 0
 struct CoRwTicket {
     bool read;
     Coroutine *co;
@@ -367,7 +366,7 @@ void qemu_co_rwlock_init(CoRwlock *lock)
 }
 
 /* Releases the internal CoMutex.  */
-static void qemu_co_rwlock_maybe_wake_one(CoRwlock *lock)
+static CoroutineAction qemu_co_rwlock_maybe_wake_one(CoRwlock *lock)
 {
     CoRwTicket *tkt = QSIMPLEQ_FIRST(&lock->tickets);
     Coroutine *co = NULL;
@@ -393,13 +392,17 @@ static void qemu_co_rwlock_maybe_wake_one(CoRwlock *lock)
 
     if (co) {
         QSIMPLEQ_REMOVE_HEAD(&lock->tickets, next);
-        qemu_co_mutex_unlock(&lock->mutex);
+        CoroutineAction action = qemu_co_mutex_unlock(&lock->mutex);
+        assert(action == COROUTINE_CONTINUE);
         aio_co_wake(co);
     } else {
-        qemu_co_mutex_unlock(&lock->mutex);
+        CoroutineAction action = qemu_co_mutex_unlock(&lock->mutex);
+        assert(action == COROUTINE_CONTINUE);
     }
+    return COROUTINE_CONTINUE;
 }
 
+#if 0
 void qemu_co_rwlock_rdlock(CoRwlock *lock)
 {
     Coroutine *self = qemu_coroutine_self();
-- 
2.35.1





* [PATCH 29/35] qemu_co_rwlock_rdlock
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (27 preceding siblings ...)
  2022-03-10 12:44 ` [PATCH 28/35] qemu_co_rwlock_maybe_wake_one Paolo Bonzini
@ 2022-03-10 12:44 ` Paolo Bonzini
  2022-03-10 12:44 ` [PATCH 30/35] qemu_co_rwlock_unlock Paolo Bonzini
                   ` (6 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/qemu-coroutine-lock.c | 40 +++++++++++++++++++++++++++++---------
 1 file changed, 31 insertions(+), 9 deletions(-)

diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index 3b50e1dd5b..e7eb446566 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -402,32 +402,54 @@ static CoroutineAction qemu_co_rwlock_maybe_wake_one(CoRwlock *lock)
     return COROUTINE_CONTINUE;
 }
 
-#if 0
-void qemu_co_rwlock_rdlock(CoRwlock *lock)
+CO_DECLARE_FRAME(qemu_co_rwlock_rdlock, CoRwlock *lock, Coroutine *self, CoRwTicket my_ticket);
+static CoroutineAction co__qemu_co_rwlock_rdlock(void *_frame)
 {
+    struct FRAME__qemu_co_rwlock_rdlock *_f = _frame;
+    CO_ARG(lock);
     Coroutine *self = qemu_coroutine_self();
 
-    qemu_co_mutex_lock(&lock->mutex);
+switch(_f->_step) {
+case 0:
+_f->_step = 1;
+CO_SAVE(self);
+    return qemu_co_mutex_lock(&lock->mutex);
+case 1:
+CO_LOAD(self);
     /* For fairness, wait if a writer is in line.  */
     if (lock->owners == 0 || (lock->owners > 0 && QSIMPLEQ_EMPTY(&lock->tickets))) {
         lock->owners++;
         qemu_co_mutex_unlock(&lock->mutex);
     } else {
-        CoRwTicket my_ticket = { true, self };
+        _f->my_ticket = (CoRwTicket){ true, self };
 
-        QSIMPLEQ_INSERT_TAIL(&lock->tickets, &my_ticket, next);
+        QSIMPLEQ_INSERT_TAIL(&lock->tickets, &_f->my_ticket, next);
         qemu_co_mutex_unlock(&lock->mutex);
-        qemu_coroutine_yield();
+
+_f->_step = 2;
+        return qemu_coroutine_yield();
+case 2:
         assert(lock->owners >= 1);
 
         /* Possibly wake another reader, which will wake the next in line.  */
-        qemu_co_mutex_lock(&lock->mutex);
+_f->_step = 3;
+        return qemu_co_mutex_lock(&lock->mutex);
+case 3:
+CO_LOAD(self);
         qemu_co_rwlock_maybe_wake_one(lock);
     }
-
-    self->locks_held++;
 }
 
+    self->locks_held++;
+return stack_free(&_f->common);
+}
+
+CoroutineAction qemu_co_rwlock_rdlock(CoRwlock *lock)
+{
+    return CO_INIT_FRAME(qemu_co_rwlock_rdlock, lock);
+}
+
+#if 0
 void qemu_co_rwlock_unlock(CoRwlock *lock)
 {
     Coroutine *self = qemu_coroutine_self();
-- 
2.35.1





* [PATCH 30/35] qemu_co_rwlock_unlock
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (28 preceding siblings ...)
  2022-03-10 12:44 ` [PATCH 29/35] qemu_co_rwlock_rdlock Paolo Bonzini
@ 2022-03-10 12:44 ` Paolo Bonzini
  2022-03-10 12:44 ` [PATCH 31/35] qemu_co_rwlock_downgrade Paolo Bonzini
                   ` (5 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/qemu-coroutine-lock.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index e7eb446566..c164cf6b15 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -449,15 +449,21 @@ CoroutineAction qemu_co_rwlock_rdlock(CoRwlock *lock)
     return CO_INIT_FRAME(qemu_co_rwlock_rdlock, lock);
 }
 
-#if 0
-void qemu_co_rwlock_unlock(CoRwlock *lock)
+CO_DECLARE_FRAME(qemu_co_rwlock_unlock, CoRwlock *lock);
+static CoroutineAction co__qemu_co_rwlock_unlock(void *_frame)
 {
+    struct FRAME__qemu_co_rwlock_unlock *_f = _frame;
+    CO_ARG(lock);
     Coroutine *self = qemu_coroutine_self();
 
+switch(_f->_step) {
+case 0:
+_f->_step = 1;
     assert(qemu_in_coroutine());
     self->locks_held--;
 
-    qemu_co_mutex_lock(&lock->mutex);
+    return qemu_co_mutex_lock(&lock->mutex);
+case 1:
     if (lock->owners > 0) {
         lock->owners--;
     } else {
@@ -465,9 +471,20 @@ void qemu_co_rwlock_unlock(CoRwlock *lock)
         lock->owners = 0;
     }
 
-    qemu_co_rwlock_maybe_wake_one(lock);
+_f->_step = 2;
+    return qemu_co_rwlock_maybe_wake_one(lock);
+case 2:
+    break;
+}
+return stack_free(&_f->common);
 }
 
+CoroutineAction qemu_co_rwlock_unlock(CoRwlock *lock)
+{
+    return CO_INIT_FRAME(qemu_co_rwlock_unlock, lock);
+}
+
+#if 0
 void qemu_co_rwlock_downgrade(CoRwlock *lock)
 {
     qemu_co_mutex_lock(&lock->mutex);
-- 
2.35.1





* [PATCH 31/35] qemu_co_rwlock_downgrade
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (29 preceding siblings ...)
  2022-03-10 12:44 ` [PATCH 30/35] qemu_co_rwlock_unlock Paolo Bonzini
@ 2022-03-10 12:44 ` Paolo Bonzini
  2022-03-10 12:44 ` [PATCH 32/35] qemu_co_rwlock_wrlock Paolo Bonzini
                   ` (4 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/qemu-coroutine-lock.c | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index c164cf6b15..5a7b99cfaf 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -484,17 +484,35 @@ CoroutineAction qemu_co_rwlock_unlock(CoRwlock *lock)
     return CO_INIT_FRAME(qemu_co_rwlock_unlock, lock);
 }
 
-#if 0
-void qemu_co_rwlock_downgrade(CoRwlock *lock)
+CO_DECLARE_FRAME(qemu_co_rwlock_downgrade, CoRwlock *lock);
+static CoroutineAction co__qemu_co_rwlock_downgrade(void *_frame)
 {
-    qemu_co_mutex_lock(&lock->mutex);
+    struct FRAME__qemu_co_rwlock_downgrade *_f = _frame;
+    CO_ARG(lock);
+
+switch(_f->_step) {
+case 0:
+_f->_step = 1;
+    return qemu_co_mutex_lock(&lock->mutex);
+case 1:
     assert(lock->owners == -1);
     lock->owners = 1;
 
     /* Possibly wake another reader, which will wake the next in line.  */
-    qemu_co_rwlock_maybe_wake_one(lock);
+_f->_step = 2;
+    return qemu_co_rwlock_maybe_wake_one(lock);
+case 2:
+    break;
+}
+return stack_free(&_f->common);
 }
 
+CoroutineAction qemu_co_rwlock_downgrade(CoRwlock *lock)
+{
+    return CO_INIT_FRAME(qemu_co_rwlock_downgrade, lock);
+}
+
+#if 0
 void qemu_co_rwlock_wrlock(CoRwlock *lock)
 {
     Coroutine *self = qemu_coroutine_self();
-- 
2.35.1





* [PATCH 32/35] qemu_co_rwlock_wrlock
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (30 preceding siblings ...)
  2022-03-10 12:44 ` [PATCH 31/35] qemu_co_rwlock_downgrade Paolo Bonzini
@ 2022-03-10 12:44 ` Paolo Bonzini
  2022-03-10 12:44 ` [PATCH 33/35] qemu_co_rwlock_upgrade Paolo Bonzini
                   ` (3 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/qemu-coroutine-lock.c | 36 ++++++++++++++++++++++++++++--------
 1 file changed, 28 insertions(+), 8 deletions(-)

diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index 5a7b99cfaf..c0541171d4 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -512,27 +512,47 @@ CoroutineAction qemu_co_rwlock_downgrade(CoRwlock *lock)
     return CO_INIT_FRAME(qemu_co_rwlock_downgrade, lock);
 }
 
-#if 0
-void qemu_co_rwlock_wrlock(CoRwlock *lock)
+CO_DECLARE_FRAME(qemu_co_rwlock_wrlock, CoRwlock *lock, Coroutine *self, CoRwTicket my_ticket);
+static CoroutineAction co__qemu_co_rwlock_wrlock(void *_frame)
 {
+    struct FRAME__qemu_co_rwlock_wrlock *_f = _frame;
+    CO_ARG(lock);
     Coroutine *self = qemu_coroutine_self();
 
-    qemu_co_mutex_lock(&lock->mutex);
+switch(_f->_step) {
+case 0:
+_f->_step = 1;
+CO_SAVE(self);
+    return qemu_co_mutex_lock(&lock->mutex);
+case 1:
+CO_LOAD(self);
     if (lock->owners == 0) {
         lock->owners = -1;
         qemu_co_mutex_unlock(&lock->mutex);
     } else {
-        CoRwTicket my_ticket = { false, self };
+        _f->my_ticket = (CoRwTicket){ false, self };
 
-        QSIMPLEQ_INSERT_TAIL(&lock->tickets, &my_ticket, next);
+        QSIMPLEQ_INSERT_TAIL(&lock->tickets, &_f->my_ticket, next);
         qemu_co_mutex_unlock(&lock->mutex);
-        qemu_coroutine_yield();
+_f->_step = 2;
+        return qemu_coroutine_yield();
+case 2:
+CO_LOAD(self);
         assert(lock->owners == -1);
     }
-
-    self->locks_held++;
+    break;
 }
 
+    self->locks_held++;
+return stack_free(&_f->common);
+}
+
+CoroutineAction qemu_co_rwlock_wrlock(CoRwlock *lock)
+{
+    return CO_INIT_FRAME(qemu_co_rwlock_wrlock, lock);
+}
+
+#if 0
 void qemu_co_rwlock_upgrade(CoRwlock *lock)
 {
     qemu_co_mutex_lock(&lock->mutex);
-- 
2.35.1





* [PATCH 33/35] qemu_co_rwlock_upgrade
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (31 preceding siblings ...)
  2022-03-10 12:44 ` [PATCH 32/35] qemu_co_rwlock_wrlock Paolo Bonzini
@ 2022-03-10 12:44 ` Paolo Bonzini
  2022-03-10 12:44 ` [PATCH 34/35] /locking/co-rwlock/upgrade Paolo Bonzini
                   ` (2 subsequent siblings)
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/qemu-coroutine-lock.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index c0541171d4..9674e8e3e3 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -552,23 +552,39 @@ CoroutineAction qemu_co_rwlock_wrlock(CoRwlock *lock)
     return CO_INIT_FRAME(qemu_co_rwlock_wrlock, lock);
 }
 
-#if 0
-void qemu_co_rwlock_upgrade(CoRwlock *lock)
+CO_DECLARE_FRAME(qemu_co_rwlock_upgrade, CoRwlock *lock, CoRwTicket my_ticket);
+static CoroutineAction co__qemu_co_rwlock_upgrade(void *_frame)
 {
-    qemu_co_mutex_lock(&lock->mutex);
+    struct FRAME__qemu_co_rwlock_upgrade *_f = _frame;
+    CO_ARG(lock);
+
+switch(_f->_step) {
+case 0:
+_f->_step = 1;
+    return qemu_co_mutex_lock(&lock->mutex);
+case 1:
     assert(lock->owners > 0);
     /* For fairness, wait if a writer is in line.  */
     if (lock->owners == 1 && QSIMPLEQ_EMPTY(&lock->tickets)) {
         lock->owners = -1;
         qemu_co_mutex_unlock(&lock->mutex);
     } else {
-        CoRwTicket my_ticket = { false, qemu_coroutine_self() };
+        _f->my_ticket = (CoRwTicket){ false, qemu_coroutine_self() };
 
         lock->owners--;
-        QSIMPLEQ_INSERT_TAIL(&lock->tickets, &my_ticket, next);
+        QSIMPLEQ_INSERT_TAIL(&lock->tickets, &_f->my_ticket, next);
         qemu_co_rwlock_maybe_wake_one(lock);
-        qemu_coroutine_yield();
+_f->_step = 2;
+        return qemu_coroutine_yield();
+case 2:
         assert(lock->owners == -1);
     }
+    break;
+}
+return stack_free(&_f->common);
+}
+
+CoroutineAction qemu_co_rwlock_upgrade(CoRwlock *lock)
+{
+    return CO_INIT_FRAME(qemu_co_rwlock_upgrade, lock);
 }
-#endif
-- 
2.35.1





* [PATCH 34/35] /locking/co-rwlock/upgrade
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (32 preceding siblings ...)
  2022-03-10 12:44 ` [PATCH 33/35] qemu_co_rwlock_upgrade Paolo Bonzini
@ 2022-03-10 12:44 ` Paolo Bonzini
  2022-03-10 12:44 ` [PATCH 35/35] /locking/co-rwlock/downgrade Paolo Bonzini
  2022-03-10 17:42 ` [PATCH experiment 00/35] stackless coroutine backend Stefan Hajnoczi
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/unit/test-coroutine.c | 68 ++++++++++++++++++++++++++++++-------
 1 file changed, 55 insertions(+), 13 deletions(-)

diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index db6718db40..39d0f31492 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -355,7 +355,6 @@ static void test_co_mutex_lockable(void)
     g_assert(QEMU_MAKE_CO_LOCKABLE(null_pointer) == NULL);
 }
 
-#if 0
 static CoRwlock rwlock;
 
 /* Test that readers are properly sent back to the queue when upgrading,
@@ -375,24 +374,66 @@ static CoRwlock rwlock;
  * | unlock       |            |
  */
 
-static void coroutine_fn rwlock_yield_upgrade(void *opaque)
+CO_DECLARE_FRAME(rwlock_yield_upgrade, bool *done);
+static CoroutineAction co__rwlock_yield_upgrade(void *_frame)
 {
-    qemu_co_rwlock_rdlock(&rwlock);
-    qemu_coroutine_yield();
+    struct FRAME__rwlock_yield_upgrade *_f = _frame;
+    CO_ARG(done);
+switch(_f->_step) {
+case 0:
+_f->_step = 1;
+    return qemu_co_rwlock_rdlock(&rwlock);
+case 1:
+_f->_step = 2;
+    return qemu_coroutine_yield();
 
-    qemu_co_rwlock_upgrade(&rwlock);
-    qemu_co_rwlock_unlock(&rwlock);
+case 2:
+_f->_step = 3;
+    return qemu_co_rwlock_upgrade(&rwlock);
+case 3:
+_f->_step = 4;
+    return qemu_co_rwlock_unlock(&rwlock);
 
-    *(bool *)opaque = true;
+case 4:
+    *done = true;
+    break;
+}
+return stack_free(&_f->common);
 }
 
-static void coroutine_fn rwlock_wrlock_yield(void *opaque)
+static CoroutineAction rwlock_yield_upgrade(void *opaque)
 {
-    qemu_co_rwlock_wrlock(&rwlock);
-    qemu_coroutine_yield();
+    bool *done = opaque;
+    return CO_INIT_FRAME(rwlock_yield_upgrade, done);
+}
 
-    qemu_co_rwlock_unlock(&rwlock);
-    *(bool *)opaque = true;
+CO_DECLARE_FRAME(rwlock_wrlock_yield, bool *done);
+static CoroutineAction co__rwlock_wrlock_yield(void *_frame)
+{
+    struct FRAME__rwlock_wrlock_yield *_f = _frame;
+    CO_ARG(done);
+switch(_f->_step) {
+case 0:
+_f->_step = 1;
+    return qemu_co_rwlock_wrlock(&rwlock);
+case 1:
+_f->_step = 2;
+    return qemu_coroutine_yield();
+
+case 2:
+_f->_step = 3;
+    return qemu_co_rwlock_unlock(&rwlock);
+case 3:
+    *done = true;
+    break;
+}
+return stack_free(&_f->common);
+}
+
+static CoroutineAction rwlock_wrlock_yield(void *opaque)
+{
+    bool *done = opaque;
+    return CO_INIT_FRAME(rwlock_wrlock_yield, done);
 }
 
 static void test_co_rwlock_upgrade(void)
@@ -417,6 +458,7 @@ static void test_co_rwlock_upgrade(void)
     g_assert(c2_done);
 }
 
+#if 0
 static void coroutine_fn rwlock_rdlock_yield(void *opaque)
 {
     qemu_co_rwlock_rdlock(&rwlock);
@@ -829,8 +871,8 @@ int main(int argc, char **argv)
     g_test_add_func("/basic/order", test_order);
     g_test_add_func("/locking/co-mutex", test_co_mutex);
     g_test_add_func("/locking/co-mutex/lockable", test_co_mutex_lockable);
-#if 0
     g_test_add_func("/locking/co-rwlock/upgrade", test_co_rwlock_upgrade);
+#if 0
     g_test_add_func("/locking/co-rwlock/downgrade", test_co_rwlock_downgrade);
 #endif
     if (g_test_perf()) {
-- 
2.35.1





* [PATCH 35/35] /locking/co-rwlock/downgrade
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (33 preceding siblings ...)
  2022-03-10 12:44 ` [PATCH 34/35] /locking/co-rwlock/upgrade Paolo Bonzini
@ 2022-03-10 12:44 ` Paolo Bonzini
  2022-03-10 17:42 ` [PATCH experiment 00/35] stackless coroutine backend Stefan Hajnoczi
  35 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 12:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: hreitz, stefanha, qemu-block, sguelton

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/unit/test-coroutine.c | 123 ++++++++++++++++++++++++++++--------
 1 file changed, 98 insertions(+), 25 deletions(-)

diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index 39d0f31492..174ea8d579 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -458,41 +458,117 @@ static void test_co_rwlock_upgrade(void)
     g_assert(c2_done);
 }
 
-#if 0
-static void coroutine_fn rwlock_rdlock_yield(void *opaque)
+CO_DECLARE_FRAME(rwlock_rdlock_yield, bool *done);
+static CoroutineAction co__rwlock_rdlock_yield(void *_frame)
 {
-    qemu_co_rwlock_rdlock(&rwlock);
-    qemu_coroutine_yield();
+    struct FRAME__rwlock_rdlock_yield *_f = _frame;
+    CO_ARG(done);
+switch(_f->_step) {
+case 0:
+_f->_step = 1;
+    return qemu_co_rwlock_rdlock(&rwlock);
+case 1:
+_f->_step = 2;
+    return qemu_coroutine_yield();
 
-    qemu_co_rwlock_unlock(&rwlock);
-    qemu_coroutine_yield();
-
-    *(bool *)opaque = true;
+case 2:
+_f->_step = 3;
+    return qemu_co_rwlock_unlock(&rwlock);
+case 3:
+_f->_step = 4;
+    return qemu_coroutine_yield();
+case 4:
+    *done = true;
+    break;
+}
+return stack_free(&_f->common);
 }
 
-static void coroutine_fn rwlock_wrlock_downgrade(void *opaque)
+static CoroutineAction rwlock_rdlock_yield(void *opaque)
 {
-    qemu_co_rwlock_wrlock(&rwlock);
-
-    qemu_co_rwlock_downgrade(&rwlock);
-    qemu_co_rwlock_unlock(&rwlock);
-    *(bool *)opaque = true;
+    bool *done = opaque;
+    return CO_INIT_FRAME(rwlock_rdlock_yield, done);
 }
 
-static void coroutine_fn rwlock_rdlock(void *opaque)
+CO_DECLARE_FRAME(rwlock_wrlock_downgrade, bool *done);
+static CoroutineAction co__rwlock_wrlock_downgrade(void *_frame)
 {
-    qemu_co_rwlock_rdlock(&rwlock);
+    struct FRAME__rwlock_wrlock_downgrade *_f = _frame;
+    CO_ARG(done);
+switch(_f->_step) {
+case 0:
+_f->_step = 1;
+    return qemu_co_rwlock_wrlock(&rwlock);
 
-    qemu_co_rwlock_unlock(&rwlock);
-    *(bool *)opaque = true;
+case 1:
+_f->_step = 2;
+    return qemu_co_rwlock_downgrade(&rwlock);
+case 2:
+_f->_step = 3;
+    return qemu_co_rwlock_unlock(&rwlock);
+case 3:
+    *done = true;
+    break;
+}
+return stack_free(&_f->common);
 }
 
-static void coroutine_fn rwlock_wrlock(void *opaque)
+static CoroutineAction rwlock_wrlock_downgrade(void *opaque)
 {
-    qemu_co_rwlock_wrlock(&rwlock);
+    bool *done = opaque;
+    return CO_INIT_FRAME(rwlock_wrlock_downgrade, done);
+}
 
-    qemu_co_rwlock_unlock(&rwlock);
-    *(bool *)opaque = true;
+CO_DECLARE_FRAME(rwlock_rdlock, bool *done);
+static CoroutineAction co__rwlock_rdlock(void *_frame)
+{
+    struct FRAME__rwlock_rdlock *_f = _frame;
+    CO_ARG(done);
+switch(_f->_step) {
+case 0:
+_f->_step = 1;
+    return qemu_co_rwlock_rdlock(&rwlock);
+
+case 1:
+_f->_step = 2;
+    return qemu_co_rwlock_unlock(&rwlock);
+case 2:
+    *done = true;
+    break;
+}
+return stack_free(&_f->common);
+}
+
+static CoroutineAction rwlock_rdlock(void *opaque)
+{
+    bool *done = opaque;
+    return CO_INIT_FRAME(rwlock_rdlock, done);
+}
+
+CO_DECLARE_FRAME(rwlock_wrlock, bool *done);
+static CoroutineAction co__rwlock_wrlock(void *_frame)
+{
+    struct FRAME__rwlock_wrlock *_f = _frame;
+    CO_ARG(done);
+switch(_f->_step) {
+case 0:
+_f->_step = 1;
+    return qemu_co_rwlock_wrlock(&rwlock);
+
+case 1:
+_f->_step = 2;
+    return qemu_co_rwlock_unlock(&rwlock);
+case 2:
+    *done = true;
+    break;
+}
+return stack_free(&_f->common);
+}
+
+static CoroutineAction rwlock_wrlock(void *opaque)
+{
+    bool *done = opaque;
+    return CO_INIT_FRAME(rwlock_wrlock, done);
 }
 
 /*
@@ -556,7 +632,6 @@ static void test_co_rwlock_downgrade(void)
 
     g_assert(c1_done);
 }
-#endif
 
 /*
  * Check that creation, enter, and return work
@@ -872,9 +947,7 @@ int main(int argc, char **argv)
     g_test_add_func("/locking/co-mutex", test_co_mutex);
     g_test_add_func("/locking/co-mutex/lockable", test_co_mutex_lockable);
     g_test_add_func("/locking/co-rwlock/upgrade", test_co_rwlock_upgrade);
-#if 0
     g_test_add_func("/locking/co-rwlock/downgrade", test_co_rwlock_downgrade);
-#endif
     if (g_test_perf()) {
         g_test_add_func("/perf/lifecycle", perf_lifecycle);
         g_test_add_func("/perf/lifecycle/noalloc", perf_lifecycle_noalloc);
-- 
2.35.1




* Re: [PATCH 05/35] coroutine: small code cleanup in qemu_co_rwlock_wrlock
  2022-03-10 12:43 ` [PATCH 05/35] coroutine: small code cleanup in qemu_co_rwlock_wrlock Paolo Bonzini
@ 2022-03-10 14:10   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 44+ messages in thread
From: Philippe Mathieu-Daudé @ 2022-03-10 14:10 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: hreitz, qemu-block, stefanha, sguelton

On 10/3/22 13:43, Paolo Bonzini wrote:
> qemu_co_rwlock_wrlock stores the current coroutine in a local variable,
> use it instead of calling qemu_coroutine_self() again.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   util/qemu-coroutine-lock.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH experiment 00/35] stackless coroutine backend
  2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
                   ` (34 preceding siblings ...)
  2022-03-10 12:44 ` [PATCH 35/35] /locking/co-rwlock/downgrade Paolo Bonzini
@ 2022-03-10 17:42 ` Stefan Hajnoczi
  2022-03-10 20:14   ` Paolo Bonzini
  35 siblings, 1 reply; 44+ messages in thread
From: Stefan Hajnoczi @ 2022-03-10 17:42 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: hreitz, qemu-devel, qemu-block, sguelton

On Thu, Mar 10, 2022 at 01:43:38PM +0100, Paolo Bonzini wrote:
> Here is an experiment with using stackless coroutines in QEMU.  It
> only compiles enough code to run tests/unit/test-coroutine, but at
> least it proves that it's possible to quickly test ideas in the
> area of coroutine runtimes.  Another idea that could be toyed with
> in a similar manner could be (whoa) C++ coroutines.
> 
> As expected, this also found some issues in existing code, so I
> plan to submit patches 1-5 separately.
> 
> The new backend (which is the only one that works, due to the required
> code changes) is in patch 7.  For the big description of what stackless
> coroutines are, please refer to that patch.
> 
> Patches 8-11 do some initial conversions.  Patch 12 introduces some
> preprocessor magic that greatly eases the rest of the work, and then
> the tests are converted one at a time, until patch 27 where the only
> ones missing are the CoRwlock tests.
> 
> Therefore, patches 28-33 convert CoRwlock and patches 34-35 take care
> of the corresponding tests, thus concluding the experiment.

Nice, the transformation is clear. It's simpler than a Continuation
Passing Style transform because the loops and if statements remain
unmodified. This is a big advantage of the Duff's device-style
approach.
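
To illustrate the point about loops, here is a minimal sketch of a
switch-based stackless coroutine. It is only an illustration: the names
(DemoAction, demo_frame, demo_co, demo_run) are hypothetical and are not
the types or macros from this series. The for loop keeps its original
shape; the only additions are the step field, the return at the yield
point and the case label that jumps back into the loop body.

```c
/* Hypothetical stand-ins; the series' real CoroutineAction type and
 * frame layout are defined in patch 7 and may differ. */
typedef enum { DEMO_YIELD, DEMO_DONE } DemoAction;

struct demo_frame {
    int step;   /* resume point, playing the role of _f->_step */
    int i;      /* loop variable, kept in the frame so it survives yields */
    int n;      /* loop bound */
    int sum;    /* running result */
};

/*
 * Shape of the code before the transformation:
 *
 *     for (i = 0; i < n; i++) {
 *         sum += i;
 *         yield();
 *     }
 */
static DemoAction demo_co(struct demo_frame *f)
{
    switch (f->step) {
    case 0:
        for (f->i = 0; f->i < f->n; f->i++) {
            f->sum += f->i;
            f->step = 1;
            return DEMO_YIELD;      /* suspend in the middle of the loop */
    case 1:;                        /* resume here, inside the loop body */
        }
    }
    return DEMO_DONE;
}

/* Re-enter the coroutine until it finishes; returns the number of yields. */
static int demo_run(struct demo_frame *f)
{
    int yields = 0;

    while (demo_co(f) == DEMO_YIELD) {
        yields++;
    }
    return yields;
}
```

Starting from a frame with step = 0, n = 3 and sum = 0, demo_run()
re-enters the function once per iteration and the loop variable lives in
the frame rather than on the C stack.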

There are a lot of details to decide on in the translator tool and
runtime to optimize the code. I think the way the stack frames are
organized in this patch series is probably for convenience rather than
performance.

Out of curiosity, did you run the perf tests and compare against
ucontext?

Stefan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH experiment 00/35] stackless coroutine backend
  2022-03-10 17:42 ` [PATCH experiment 00/35] stackless coroutine backend Stefan Hajnoczi
@ 2022-03-10 20:14   ` Paolo Bonzini
  2022-03-11  9:27     ` Stefan Hajnoczi
  0 siblings, 1 reply; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-10 20:14 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: hreitz, qemu-devel, qemu-block, sguelton

On 3/10/22 18:42, Stefan Hajnoczi wrote:
> There are a lot of details to decide on in the translator tool and
> runtime to optimize the code. I think the way the stack frames are
> organized in this patch series is probably for convenience rather than
> performance.

Yes, sometimes the optimizations are there but mostly because they made 
my job easier.

> Out of curiosity, did you run the perf tests and compare against
> ucontext?

Not quite voluntarily, but I noticed I had to add one 0 to make them run 
for a decent amount of time.  So yeah, it's much faster than siglongjmp.

Paolo


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH experiment 00/35] stackless coroutine backend
  2022-03-10 20:14   ` Paolo Bonzini
@ 2022-03-11  9:27     ` Stefan Hajnoczi
  2022-03-11 12:04       ` Paolo Bonzini
  0 siblings, 1 reply; 44+ messages in thread
From: Stefan Hajnoczi @ 2022-03-11  9:27 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: hreitz, qemu-devel, qemu-block, sguelton

On Thu, Mar 10, 2022 at 09:14:07PM +0100, Paolo Bonzini wrote:
> On 3/10/22 18:42, Stefan Hajnoczi wrote:
> > There are a lot of details to decide on in the translator tool and
> > runtime to optimize the code. I think the way the stack frames are
> > organized in this patch series is probably for convenience rather than
> > performance.
> 
> Yes, sometimes the optimizations are there but mostly because they made my
> job easier.
> 
> > Out of curiosity, did you run the perf tests and compare against
> > ucontext?
> 
> Not quite voluntarily, but I noticed I had to add one 0 to make them run for
> a decent amount of time.  So yeah, it's much faster than siglongjmp.

That's a nice first indication that performance will be good. I guess
that deep coroutine_fn stacks could be less efficient with stackless
coroutines compared to ucontext, but the cost of switching between
coroutines (enter/yield) will be lower with stackless coroutines.

Stefan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH experiment 00/35] stackless coroutine backend
  2022-03-11  9:27     ` Stefan Hajnoczi
@ 2022-03-11 12:04       ` Paolo Bonzini
  2022-03-11 12:17         ` Daniel P. Berrangé
  0 siblings, 1 reply; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-11 12:04 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: hreitz, qemu-devel, qemu-block, sguelton

On 3/11/22 10:27, Stefan Hajnoczi wrote:
>> Not quite voluntarily, but I noticed I had to add one 0 to make them run for
>> a decent amount of time.  So yeah, it's much faster than siglongjmp.
> That's a nice first indication that performance will be good. I guess
> that deep coroutine_fn stacks could be less efficient with stackless
> coroutines compared to ucontext, but the cost of switching between
> coroutines (enter/yield) will be lower with stackless coroutines.

Note that right now I'm not placing the coroutine_fn stack on the heap, 
it's still allocated from a contiguous area in virtual address space. 
The contiguous allocation is wrapped by coroutine_stack_alloc and 
coroutine_stack_free, so it's really easy to change them to malloc and free.
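
As a rough sketch of what such a wrapper pair could look like (an
assumption for illustration; the signatures and the demo_ names below
are hypothetical and not taken from the series), frames can be carved
out of one contiguous area with a bump pointer and released in LIFO
order. Swapping the two bodies for malloc() and free() would leave the
callers unchanged.

```c
#include <stddef.h>

#define DEMO_STACK_SIZE 4096

/* Hypothetical per-coroutine stack: one contiguous area plus a bump
 * pointer.  The real layout in the series may differ. */
struct demo_stack {
    _Alignas(16) unsigned char area[DEMO_STACK_SIZE];
    size_t top;                     /* offset of the first free byte */
};

/* Round sizes up so that every frame starts on a 16-byte boundary. */
static size_t demo_align(size_t n)
{
    return (n + 15) & ~(size_t)15;
}

/* Carve a frame out of the contiguous area; NULL if it does not fit. */
static void *demo_stack_alloc(struct demo_stack *s, size_t bytes)
{
    size_t need;
    void *frame;

    if (bytes > DEMO_STACK_SIZE) {
        return NULL;
    }
    need = demo_align(bytes);
    if (need > DEMO_STACK_SIZE - s->top) {
        return NULL;
    }
    frame = &s->area[s->top];
    s->top += need;
    return frame;
}

/* Frames are released in LIFO order, so freeing is just a rewind. */
static void demo_stack_free(struct demo_stack *s, void *frame)
{
    s->top = (size_t)((unsigned char *)frame - s->area);
}
```

The LIFO assumption is what makes the rewind in demo_stack_free()
valid; a malloc-based variant would not need it.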

I also do not have to walk up the whole call stack on coroutine_fn 
yields, because calls from one coroutine_fn to the next are tail calls; 
in exchange for that, I have more indirect calls than if the code did

     if (next_call() == COROUTINE_YIELD) {
         return COROUTINE_YIELD;
     }

For now the choice was again just the one that made the translation easiest.
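
The tail-call arrangement can be sketched roughly as below. This is an
illustration under assumptions, not the code from the series: the names
(demo_common, demo_current, demo_return, demo_enter) are hypothetical.
A yield in the innermost function returns straight to whoever entered
the coroutine, and re-entry jumps directly to the innermost frame; the
price is an indirect call through the caller's frame each time a
function finishes.

```c
#include <stddef.h>

typedef enum { DEMO_YIELD, DEMO_DONE } DemoAction;

/* Hypothetical frame header shared by every coroutine function. */
struct demo_common {
    DemoAction (*resume)(struct demo_common *self); /* indirect call target */
    struct demo_common *caller;     /* frame to continue when this one ends */
    int step;
};

static struct demo_common *demo_current;    /* innermost live frame */
static struct demo_common demo_leaf_frame;  /* single static callee frame,
                                             * a simplification for the demo */
static int demo_outer_resumed;

/* Finish a frame: continue the caller through an indirect tail call. */
static DemoAction demo_return(struct demo_common *f)
{
    demo_current = f->caller;
    if (f->caller == NULL) {
        return DEMO_DONE;
    }
    return f->caller->resume(f->caller);
}

static DemoAction demo_leaf(struct demo_common *f)
{
    switch (f->step) {
    case 0:
        f->step = 1;
        return DEMO_YIELD;  /* goes straight back to demo_enter's caller */
    case 1:
        break;
    }
    return demo_return(f);
}

static DemoAction demo_outer(struct demo_common *f)
{
    switch (f->step) {
    case 0:
        f->step = 1;
        demo_leaf_frame = (struct demo_common){ demo_leaf, f, 0 };
        demo_current = &demo_leaf_frame;
        return demo_leaf(&demo_leaf_frame);     /* tail call */
    case 1:
        demo_outer_resumed++;   /* reached only after the leaf has finished */
        break;
    }
    return demo_return(f);
}

/* Enter or re-enter the coroutine at its innermost frame. */
static DemoAction demo_enter(void)
{
    return demo_current->resume(demo_current);
}
```

With demo_current pointing at an outer frame, the first demo_enter()
yields from the leaf without touching the outer function again, and the
second one resumes the leaf directly and then continues the outer
function through its resume pointer.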

Today I also managed to implement a QEMU-like API on top of C++ coroutines:

     CoroutineFn<int> return_int() {
         co_await qemu_coroutine_yield();
         co_return 30;
     }

     CoroutineFn<void> return_void() {
         co_await qemu_coroutine_yield();
     }

     CoroutineFn<void> co(void *) {
         co_await return_void();
         printf("%d\n", co_await return_int());
         co_await qemu_coroutine_yield();
     }

     int main() {
         Coroutine *f = qemu_coroutine_create(co, NULL);
         printf("--- 0\n");
         qemu_coroutine_enter(f);
         printf("--- 1\n");
         qemu_coroutine_enter(f);
         printf("--- 2\n");
         qemu_coroutine_enter(f);
         printf("--- 3\n");
         qemu_coroutine_enter(f);
         printf("--- 4\n");
     }

The runtime code is absurdly obscure; my favorite bit is

     Yield qemu_coroutine_yield()
     {
         return Yield();
     }

:) However, at 200 lines of code it's certainly smaller than a 
source-to-source translator.  It might be worth investigating a bit 
more.  Only files that define or use a coroutine_fn (which includes 
callers of qemu_coroutine_create) would have to be compiled as C++.

Paolo


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH experiment 00/35] stackless coroutine backend
  2022-03-11 12:04       ` Paolo Bonzini
@ 2022-03-11 12:17         ` Daniel P. Berrangé
  2022-03-13 15:18           ` Paolo Bonzini
  0 siblings, 1 reply; 44+ messages in thread
From: Daniel P. Berrangé @ 2022-03-11 12:17 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: hreitz, qemu-block, qemu-devel, Stefan Hajnoczi, sguelton

On Fri, Mar 11, 2022 at 01:04:33PM +0100, Paolo Bonzini wrote:
> On 3/11/22 10:27, Stefan Hajnoczi wrote:
> > > Not quite voluntarily, but I noticed I had to add one 0 to make them run for
> > > a decent amount of time.  So yeah, it's much faster than siglongjmp.
> > That's a nice first indication that performance will be good. I guess
> > that deep coroutine_fn stacks could be less efficient with stackless
> > coroutines compared to ucontext, but the cost of switching between
> > coroutines (enter/yield) will be lower with stackless coroutines.
> 
> Note that right now I'm not placing the coroutine_fn stack on the heap, it's
> still allocated from a contiguous area in virtual address space. The
> contiguous allocation is wrapped by coroutine_stack_alloc and
> coroutine_stack_free, so it's really easy to change them to malloc and free.
> 
> I also do not have to walk up the whole call stack on coroutine_fn yields,
> because calls from one coroutine_fn to the next are tail calls; in exchange
> for that, I have more indirect calls than if the code did
> 
>     if (next_call() == COROUTINE_YIELD) {
>         return COROUTINE_YIELD;
>     }
> 
> For now the choice was again just the one that made the translation easiest.
> 
> Today I also managed to implement a QEMU-like API on top of C++ coroutines:
> 
>     CoroutineFn<int> return_int() {
>         co_await qemu_coroutine_yield();
>         co_return 30;
>     }
> 
>     CoroutineFn<void> return_void() {
>         co_await qemu_coroutine_yield();
>     }
> 
>     CoroutineFn<void> co(void *) {
>         co_await return_void();
>         printf("%d\n", co_await return_int());
>         co_await qemu_coroutine_yield();
>     }
> 
>     int main() {
>         Coroutine *f = qemu_coroutine_create(co, NULL);
>         printf("--- 0\n");
>         qemu_coroutine_enter(f);
>         printf("--- 1\n");
>         qemu_coroutine_enter(f);
>         printf("--- 2\n");
>         qemu_coroutine_enter(f);
>         printf("--- 3\n");
>         qemu_coroutine_enter(f);
>         printf("--- 4\n");
>     }
> 
> The runtime code is absurdly obscure; my favorite bit is
> 
>     Yield qemu_coroutine_yield()
>     {
>         return Yield();
>     }
> 
> :) However, at 200 lines of code it's certainly smaller than a
> source-to-source translator.  It might be worth investigating a bit more.
> Only files that define or use a coroutine_fn (which includes callers of
> qemu_coroutine_create) would have to be compiled as C++.

Unless I'm misunderstanding what you mean, "define a coroutine_fn"
covers a very large number of functions/files:

  $ git grep coroutine_fn | wc -l
  806
  $ git grep -l coroutine_fn | wc -l
  132

Dominated by the block layer of course, but with tentacles spreading
out into a lot of other code.

Feels like identifying all callers would be tedious/unpleasant enough,
that practically speaking we would have to just compile all of QEMU
as C++.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH experiment 00/35] stackless coroutine backend
  2022-03-11 12:17         ` Daniel P. Berrangé
@ 2022-03-13 15:18           ` Paolo Bonzini
  2022-03-14 13:43             ` Stefan Hajnoczi
  0 siblings, 1 reply; 44+ messages in thread
From: Paolo Bonzini @ 2022-03-13 15:18 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: hreitz, Stefan Hajnoczi, qemu-devel, qemu-block, sguelton

On 3/11/22 13:17, Daniel P. Berrangé wrote:
>> Only files that define or use a coroutine_fn (which includes callers of
>> qemu_coroutine_create) would have to be compiled as C++.
> Unless I'm misunderstanding what you mean, "define a coroutine_fn"
> is a very large number of functions/files
> 
>    $ git grep coroutine_fn | wc -l
>    806
>    $ git grep -l coroutine_fn | wc -l
>    132
> 
> Dominated by the block layer of course, but tentacles spreading
> out into a lot of other code.

The main other user is 9pfs, then there is:

hw/remote/message.c
io/channel.c
job.c
migration/savevm.c
monitor/hmp-cmds.c
monitor/monitor-internal.h
monitor/qmp.c
nbd/client-connection.c
nbd/client.c
nbd/server.c
net/colo-compare.c
net/filter-mirror.c
scsi/pr-manager.c
scsi/qemu-pr-helper.c
ui/console.c
util/vhost-user-server.c

> Feels like identifying all callers would be tedious/unpleasant enough,
> that practically speaking we would have to just compile all of QEMU
> as C++.

Yes, it's a large amount of code, but it's relatively self-contained. 
In io/ for example only three functions would have to become C++ 
(qio_channel_readv_full_all_eof, qio_channel_writev_full_all, 
qio_channel_yield), and it's easy to move them to a separate file 
io/channel-coroutine.cc.

Likewise for e.g. util/async.c or util/thread-pool.c (one function each).

The block layer would almost entirely move to C++, that's for sure.  The 
monitor would be a bit more in the middle, but hardware emulation can 
remain 100% C.

I haven't gotten the thing to compile or run yet, and I'm not sure how 
much time I'll have this week, but the change for test-coroutine.c to 
run should be in the ballpark of this:

  include/qemu/coroutine.h                                 |  26
  tests/unit/meson.build                                   |   6
  tests/unit/{test-coroutine.c => test-coroutine.cc}       | 106
  util/meson.build                                         |   4
  util/{qemu-coroutine-lock.c => qemu-coroutine-lock.cc}   |  65
  util/{qemu-coroutine-sleep.c => qemu-coroutine-sleep.cc} |  10

where the changes are for a good part mechanical.  Switching from "x
coroutine_fn" to CoroutineFn<x> is entirely mechanical, while adding
co_await in front of coroutine calls is half mechanical.  For non-void
functions, the compiler can identify all callers (because the old type
"int" is not compatible with the new type CoroutineFn<int>).  For void
functions one could use warn_unused_result.
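
As a small C-flavoured illustration of the warn_unused_result idea (an
assumption about how it might be applied, with hypothetical names; the
C++ version would put the attribute on functions returning
CoroutineFn<void>), the GCC/Clang attribute makes the compiler warn at
every call site that drops the result:

```c
typedef enum { DEMO_YIELD, DEMO_DONE } DemoAction;

/* warn_unused_result is a GCC/Clang function attribute; other
 * compilers simply get no diagnostic from this macro. */
#if defined(__GNUC__) || defined(__clang__)
#define DEMO_MUST_USE __attribute__((warn_unused_result))
#else
#define DEMO_MUST_USE
#endif

/* Hypothetical converted function: it used to return void, now it
 * returns an action that the caller has to propagate. */
static DEMO_MUST_USE DemoAction demo_void_co(int *counter)
{
    (*counter)++;
    return DEMO_DONE;
}

static DemoAction demo_caller(int *counter)
{
    /*
     * A leftover call in the old style,
     *
     *     demo_void_co(counter);
     *
     * would be flagged with -Wunused-result, which is how unconverted
     * callers of formerly void functions could be found.
     */
    return demo_void_co(counter);
}
```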

The question is which is easier to maintain: stack switching code that
is becoming less and less portable (the status quo, with SafeStack, CET
and the TLS issues that Stefan has worked on), a mixed C/C++ codebase
(C++ coroutines), or a custom source-to-source translator (this series).
The third might be more fun, but it would be quite a large enterprise
and the C++ compiler writers have already done the work.

Part of the changes is common to both cases, since you cannot have
code that can run both inside and outside a coroutine.

Paolo


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH experiment 00/35] stackless coroutine backend
  2022-03-13 15:18           ` Paolo Bonzini
@ 2022-03-14 13:43             ` Stefan Hajnoczi
  0 siblings, 0 replies; 44+ messages in thread
From: Stefan Hajnoczi @ 2022-03-14 13:43 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: hreitz, Daniel P. Berrangé, qemu-devel, qemu-block, sguelton

On Sun, Mar 13, 2022 at 04:18:40PM +0100, Paolo Bonzini wrote:
> On 3/11/22 13:17, Daniel P. Berrangé wrote:
> The question is what is easier to maintain, stack switching code that is
> becoming less and less portable (status quo with SafeStack, CET and the TLS
> issues that Stefan has worked on), a mixed C/C++ codebase (C++ coroutines),
> a custom source-to-source translator (this series).  The third might be more
> fun, but it would be quite a large enterprise and the C++ compiler writers
> have already done the work.

Or a C-to-C++ translator to keep the code in C but still use C++
coroutines :). (I'm joking.)

Stefan

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2022-03-14 14:00 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-10 12:43 [PATCH experiment 00/35] stackless coroutine backend Paolo Bonzini
2022-03-10 12:43 ` [PATCH 01/35] coroutine: add missing coroutine_fn annotations for CoRwlock functions Paolo Bonzini
2022-03-10 12:43 ` [PATCH 02/35] coroutine: qemu_coroutine_get_aio_context is not a coroutine_fn Paolo Bonzini
2022-03-10 12:43 ` [PATCH 03/35] coroutine: introduce QemuCoLockable Paolo Bonzini
2022-03-10 12:43 ` [PATCH 04/35] coroutine: introduce coroutine_only_fn Paolo Bonzini
2022-03-10 12:43 ` [PATCH 05/35] coroutine: small code cleanup in qemu_co_rwlock_wrlock Paolo Bonzini
2022-03-10 14:10   ` Philippe Mathieu-Daudé
2022-03-10 12:43 ` [PATCH 06/35] disable some code Paolo Bonzini
2022-03-10 12:43 ` [PATCH 07/35] coroutine: introduce the "stackless coroutine" backend Paolo Bonzini
2022-03-10 12:43 ` [PATCH 08/35] /basic/lifecycle Paolo Bonzini
2022-03-10 12:43 ` [PATCH 09/35] convert qemu-coroutine-sleep.c to stackless coroutines Paolo Bonzini
2022-03-10 12:43 ` [PATCH 10/35] enable tail call optimization of qemu_co_mutex_lock Paolo Bonzini
2022-03-10 12:43 ` [PATCH 11/35] convert CoMutex to stackless coroutines Paolo Bonzini
2022-03-10 12:43 ` [PATCH 12/35] define magic macros for " Paolo Bonzini
2022-03-10 12:43 ` [PATCH 13/35] /basic/yield Paolo Bonzini
2022-03-10 12:43 ` [PATCH 14/35] /basic/nesting Paolo Bonzini
2022-03-10 12:43 ` [PATCH 15/35] /basic/self Paolo Bonzini
2022-03-10 12:43 ` [PATCH 16/35] /basic/entered Paolo Bonzini
2022-03-10 12:43 ` [PATCH 17/35] /basic/in_coroutine Paolo Bonzini
2022-03-10 12:43 ` [PATCH 18/35] /basic/order Paolo Bonzini
2022-03-10 12:43 ` [PATCH 19/35] /perf/lifecycle Paolo Bonzini
2022-03-10 12:43 ` [PATCH 20/35] /perf/nesting Paolo Bonzini
2022-03-10 12:43 ` [PATCH 21/35] /perf/yield Paolo Bonzini
2022-03-10 12:44 ` [PATCH 22/35] /perf/function-call Paolo Bonzini
2022-03-10 12:44 ` [PATCH 23/35] /perf/cost Paolo Bonzini
2022-03-10 12:44 ` [PATCH 24/35] /basic/no-dangling-access Paolo Bonzini
2022-03-10 12:44 ` [PATCH 25/35] /locking/co-mutex Paolo Bonzini
2022-03-10 12:44 ` [PATCH 26/35] convert qemu_co_mutex_lock_slowpath to magic macros Paolo Bonzini
2022-03-10 12:44 ` [PATCH 27/35] /locking/co-mutex/lockable Paolo Bonzini
2022-03-10 12:44 ` [PATCH 28/35] qemu_co_rwlock_maybe_wake_one Paolo Bonzini
2022-03-10 12:44 ` [PATCH 29/35] qemu_co_rwlock_rdlock Paolo Bonzini
2022-03-10 12:44 ` [PATCH 30/35] qemu_co_rwlock_unlock Paolo Bonzini
2022-03-10 12:44 ` [PATCH 31/35] qemu_co_rwlock_downgrade Paolo Bonzini
2022-03-10 12:44 ` [PATCH 32/35] qemu_co_rwlock_wrlock Paolo Bonzini
2022-03-10 12:44 ` [PATCH 33/35] qemu_co_rwlock_upgrade Paolo Bonzini
2022-03-10 12:44 ` [PATCH 34/35] /locking/co-rwlock/upgrade Paolo Bonzini
2022-03-10 12:44 ` [PATCH 35/35] /locking/co-rwlock/downgrade Paolo Bonzini
2022-03-10 17:42 ` [PATCH experiment 00/35] stackless coroutine backend Stefan Hajnoczi
2022-03-10 20:14   ` Paolo Bonzini
2022-03-11  9:27     ` Stefan Hajnoczi
2022-03-11 12:04       ` Paolo Bonzini
2022-03-11 12:17         ` Daniel P. Berrangé
2022-03-13 15:18           ` Paolo Bonzini
2022-03-14 13:43             ` Stefan Hajnoczi
