* [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff
@ 2013-07-29  3:16 Liu Ping Fan
  2013-07-29  3:16 ` [Qemu-devel] [RFC v2 1/5] timer: protect timers_state with lock Liu Ping Fan
                   ` (6 more replies)
  0 siblings, 7 replies; 38+ messages in thread
From: Liu Ping Fan @ 2013-07-29  3:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, Alex Bligh,
	Anthony Liguori, Paolo Bonzini

Summary of the model:
  Three QEMU-wide clock sources are allowed in the system, and each
AioContext has three corresponding timer lists so that it can run timers
against each clock source.
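As a rough sketch of the layout the series ends up with (names taken from
the patches below; condensed here purely for illustration):

#define QEMU_CLOCK_MAXCNT 3          /* rt_clock, vm_clock, host_clock */

typedef struct TimerList {
    QEMUTimer *active_timers;        /* sorted by expire_time */
    QemuMutex  active_timers_lock;
} TimerList;

typedef struct AioContext {
    /* ... existing fields: GSource, pollfds, thread pool, ... */
    TimerList timer_list[QEMU_CLOCK_MAXCNT];  /* one list per clock source */
} AioContext;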

rfcv2:
   drop the alarm-timer patches (if poll's timeout turns out not to be enough, they will come back)
   fix the qemu_clock_enable sync problem (thanks to Jan and Stefan)
   fix process=true when aio_poll runs timers (thanks to Alex)

Liu Ping Fan (5):
  timer: protect timers_state with lock
  timer: pick out timer list info from QemuClock
  timer: make qemu_clock_enable sync between disable and timer's cb
  timer: associate three timerlists with AioContext
  timer: run timers on aio_poll

 aio-posix.c          |   2 +
 async.c              |   9 +++
 cpus.c               |  26 ++++++--
 include/block/aio.h  |  13 ++++
 include/qemu/timer.h |  24 ++++++-
 main-loop.c          |   2 -
 qemu-timer.c         | 184 ++++++++++++++++++++++++++++++++++++---------------
 7 files changed, 198 insertions(+), 62 deletions(-)

-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Qemu-devel] [RFC v2 1/5] timer: protect timers_state with lock
  2013-07-29  3:16 [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff Liu Ping Fan
@ 2013-07-29  3:16 ` Liu Ping Fan
  2013-07-29  6:26   ` Paolo Bonzini
  2013-07-29  3:16 ` [Qemu-devel] [RFC v2 2/5] timer: pick out timer list info from QemuClock Liu Ping Fan
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 38+ messages in thread
From: Liu Ping Fan @ 2013-07-29  3:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, Alex Bligh,
	Anthony Liguori, Paolo Bonzini

In KVM mode, vm_clock may be read from AioContexts outside the BQL (see
the next patch). This exposes timers_state -- the foundation of vm_clock --
to races. Use a private lock to protect it.
Note that in TCG mode, vm_clock is still read inside the BQL, so the
icount code is left unchanged.

Lock rule: the private lock is innermost, i.e. BQL -> "this lock".
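As a concrete illustration of that rule, a path that needs both locks
(assuming QEMU's usual BQL helpers, qemu_mutex_lock_iothread() and
qemu_mutex_unlock_iothread()) would nest them like this; a sketch only,
not code from the patch:

/* BQL outermost, timers_state_lock innermost; release in reverse order. */
qemu_mutex_lock_iothread();              /* take the BQL first          */
qemu_mutex_lock(&timers_state_lock);     /* ... then the private lock   */

/* ... touch timers_state here ... */

qemu_mutex_unlock(&timers_state_lock);
qemu_mutex_unlock_iothread();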

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
---
 cpus.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/cpus.c b/cpus.c
index 61e86a8..4af81e9 100644
--- a/cpus.c
+++ b/cpus.c
@@ -112,7 +112,9 @@ typedef struct TimersState {
     int64_t dummy;
 } TimersState;
 
-TimersState timers_state;
+static TimersState timers_state;
+/* lock rule: innermost */
+static QemuMutex timers_state_lock;
 
 /* Return the virtual CPU time, based on the instruction counter.  */
 int64_t cpu_get_icount(void)
@@ -134,11 +136,14 @@ int64_t cpu_get_icount(void)
 /* return the host CPU cycle counter and handle stop/restart */
 int64_t cpu_get_ticks(void)
 {
+    int64_t ret;
+
     if (use_icount) {
         return cpu_get_icount();
     }
+    qemu_mutex_lock(&timers_state_lock);
     if (!timers_state.cpu_ticks_enabled) {
-        return timers_state.cpu_ticks_offset;
+        ret = timers_state.cpu_ticks_offset;
     } else {
         int64_t ticks;
         ticks = cpu_get_real_ticks();
@@ -148,41 +153,51 @@ int64_t cpu_get_ticks(void)
             timers_state.cpu_ticks_offset += timers_state.cpu_ticks_prev - ticks;
         }
         timers_state.cpu_ticks_prev = ticks;
-        return ticks + timers_state.cpu_ticks_offset;
+        ret = ticks + timers_state.cpu_ticks_offset;
     }
+    qemu_mutex_unlock(&timers_state_lock);
+    return ret;
 }
 
 /* return the host CPU monotonic timer and handle stop/restart */
 int64_t cpu_get_clock(void)
 {
     int64_t ti;
+
+    qemu_mutex_lock(&timers_state_lock);
     if (!timers_state.cpu_ticks_enabled) {
-        return timers_state.cpu_clock_offset;
+        ti = timers_state.cpu_clock_offset;
     } else {
         ti = get_clock();
-        return ti + timers_state.cpu_clock_offset;
+        ti += timers_state.cpu_clock_offset;
     }
+    qemu_mutex_unlock(&timers_state_lock);
+    return ti;
 }
 
 /* enable cpu_get_ticks() */
 void cpu_enable_ticks(void)
 {
+    qemu_mutex_lock(&timers_state_lock);
     if (!timers_state.cpu_ticks_enabled) {
         timers_state.cpu_ticks_offset -= cpu_get_real_ticks();
         timers_state.cpu_clock_offset -= get_clock();
         timers_state.cpu_ticks_enabled = 1;
     }
+    qemu_mutex_unlock(&timers_state_lock);
 }
 
 /* disable cpu_get_ticks() : the clock is stopped. You must not call
    cpu_get_ticks() after that.  */
 void cpu_disable_ticks(void)
 {
+    qemu_mutex_lock(&timers_state_lock);
     if (timers_state.cpu_ticks_enabled) {
         timers_state.cpu_ticks_offset = cpu_get_ticks();
         timers_state.cpu_clock_offset = cpu_get_clock();
         timers_state.cpu_ticks_enabled = 0;
     }
+    qemu_mutex_unlock(&timers_state_lock);
 }
 
 /* Correlation between real and virtual time is always going to be
@@ -353,6 +368,7 @@ static const VMStateDescription vmstate_timers = {
 
 void configure_icount(const char *option)
 {
+    qemu_mutex_init(&timers_state_lock);
     vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
     if (!option) {
         return;
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Qemu-devel] [RFC v2 2/5] timer: pick out timer list info from QemuClock
  2013-07-29  3:16 [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff Liu Ping Fan
  2013-07-29  3:16 ` [Qemu-devel] [RFC v2 1/5] timer: protect timers_state with lock Liu Ping Fan
@ 2013-07-29  3:16 ` Liu Ping Fan
  2013-07-29  3:16 ` [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb Liu Ping Fan
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 38+ messages in thread
From: Liu Ping Fan @ 2013-07-29  3:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, Alex Bligh,
	Anthony Liguori, Paolo Bonzini

QEMU-wide, we will have three clock sources, represented by
rt_clock/vm_clock/host_clock. On the other hand, we want to run timers
on different backend threads by giving each AioContext its own three
timer lists.
So clean up the QEMUClock struct so that it serves purely as a clock source.
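In other words, after this patch the timer lists are carried separately
from the clock; roughly (condensed from the diff below):

typedef struct TimerList {
    QEMUTimer *active_timers;
    QemuMutex  active_timers_lock;
} TimerList;

static TimerList timer_list[QEMU_CLOCK_MAXCNT];  /* one list per clock type */

struct QEMUClock {            /* now a pure clock source, no timer list */
    NotifierList reset_notifiers;
    int64_t last;
    int type;
    bool enabled;
};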

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
---
 qemu-timer.c | 106 ++++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 68 insertions(+), 38 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index 9500d12..5a42035 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -30,6 +30,7 @@
 
 #include "qemu/timer.h"
 #include "qemu/thread.h"
+#include "qemu/tls.h"
 #ifdef CONFIG_POSIX
 #include <pthread.h>
 #endif
@@ -44,11 +45,16 @@
 #define QEMU_CLOCK_REALTIME 0
 #define QEMU_CLOCK_VIRTUAL  1
 #define QEMU_CLOCK_HOST     2
+#define QEMU_CLOCK_MAXCNT 3
 
-struct QEMUClock {
+typedef struct TimerList {
     QEMUTimer *active_timers;
     QemuMutex active_timers_lock;
+} TimerList;
+
+static TimerList timer_list[QEMU_CLOCK_MAXCNT];
 
+struct QEMUClock {
     NotifierList reset_notifiers;
     int64_t last;
 
@@ -58,7 +64,7 @@ struct QEMUClock {
 
 struct QEMUTimer {
     int64_t expire_time;	/* in nanoseconds */
-    QEMUClock *clock;
+    TimerList *list;
     QEMUTimerCB *cb;
     void *opaque;
     QEMUTimer *next;
@@ -86,6 +92,13 @@ struct qemu_alarm_timer {
 
 static struct qemu_alarm_timer *alarm_timer;
 
+static TimerList *clock_to_timerlist(QEMUClock *clock)
+{
+    int type = clock->type;
+
+    return  &timer_list[type];
+}
+
 static bool qemu_timer_expired_ns(QEMUTimer *timer_head, int64_t current_time)
 {
     return timer_head && (timer_head->expire_time <= current_time);
@@ -95,17 +108,19 @@ static int64_t qemu_next_clock_deadline(QEMUClock *clock, int64_t delta)
 {
     int64_t expire_time, next;
     bool has_timer = false;
+    TimerList *tlist;
 
     if (!clock->enabled) {
         return delta;
     }
 
-    qemu_mutex_lock(&clock->active_timers_lock);
-    if (clock->active_timers) {
+    tlist = clock_to_timerlist(clock);
+    qemu_mutex_lock(&tlist->active_timers_lock);
+    if (tlist->active_timers) {
         has_timer = true;
-        expire_time = clock->active_timers->expire_time;
+        expire_time = tlist->active_timers->expire_time;
     }
-    qemu_mutex_unlock(&clock->active_timers_lock);
+    qemu_mutex_unlock(&tlist->active_timers_lock);
     if (!has_timer) {
         return delta;
     }
@@ -244,16 +259,25 @@ QEMUClock *rt_clock;
 QEMUClock *vm_clock;
 QEMUClock *host_clock;
 
+static void timer_list_init(TimerList *tlist)
+{
+    qemu_mutex_init(&tlist->active_timers_lock);
+    tlist->active_timers = NULL;
+}
+
 static QEMUClock *qemu_new_clock(int type)
 {
     QEMUClock *clock;
+    TimerList *tlist;
 
     clock = g_malloc0(sizeof(QEMUClock));
     clock->type = type;
     clock->enabled = true;
     clock->last = INT64_MIN;
     notifier_list_init(&clock->reset_notifiers);
-    qemu_mutex_init(&clock->active_timers_lock);
+    tlist = clock_to_timerlist(clock);
+    timer_list_init(tlist);
+
     return clock;
 }
 
@@ -269,10 +293,11 @@ void qemu_clock_enable(QEMUClock *clock, bool enabled)
 int64_t qemu_clock_has_timers(QEMUClock *clock)
 {
     bool has_timers;
+    TimerList *tlist = clock_to_timerlist(clock);
 
-    qemu_mutex_lock(&clock->active_timers_lock);
-    has_timers = !!clock->active_timers;
-    qemu_mutex_unlock(&clock->active_timers_lock);
+    qemu_mutex_lock(&tlist->active_timers_lock);
+    has_timers = !!tlist->active_timers;
+    qemu_mutex_unlock(&tlist->active_timers_lock);
     return has_timers;
 }
 
@@ -280,11 +305,12 @@ int64_t qemu_clock_expired(QEMUClock *clock)
 {
     bool has_timers;
     int64_t expire_time;
+    TimerList *tlist = clock_to_timerlist(clock);
 
-    qemu_mutex_lock(&clock->active_timers_lock);
-    has_timers = clock->active_timers;
-    expire_time = clock->active_timers->expire_time;
-    qemu_mutex_unlock(&clock->active_timers_lock);
+    qemu_mutex_lock(&tlist->active_timers_lock);
+    has_timers = tlist->active_timers;
+    expire_time = tlist->active_timers->expire_time;
+    qemu_mutex_unlock(&tlist->active_timers_lock);
 
     return has_timers && expire_time < qemu_get_clock_ns(clock);
 }
@@ -295,11 +321,12 @@ int64_t qemu_clock_deadline(QEMUClock *clock)
     int64_t delta = INT32_MAX;
     bool has_timers;
     int64_t expire_time;
+    TimerList *tlist = clock_to_timerlist(clock);
 
-    qemu_mutex_lock(&clock->active_timers_lock);
-    has_timers = clock->active_timers;
-    expire_time = clock->active_timers->expire_time;
-    qemu_mutex_unlock(&clock->active_timers_lock);
+    qemu_mutex_lock(&tlist->active_timers_lock);
+    has_timers = tlist->active_timers;
+    expire_time = tlist->active_timers->expire_time;
+    qemu_mutex_unlock(&tlist->active_timers_lock);
 
     if (has_timers) {
         delta = expire_time - qemu_get_clock_ns(clock);
@@ -316,7 +343,7 @@ QEMUTimer *qemu_new_timer(QEMUClock *clock, int scale,
     QEMUTimer *ts;
 
     ts = g_malloc0(sizeof(QEMUTimer));
-    ts->clock = clock;
+    ts->list = clock_to_timerlist(clock);
     ts->cb = cb;
     ts->opaque = opaque;
     ts->scale = scale;
@@ -331,11 +358,11 @@ void qemu_free_timer(QEMUTimer *ts)
 /* stop a timer, but do not dealloc it */
 void qemu_del_timer(QEMUTimer *ts)
 {
-    QEMUClock *clock = ts->clock;
+    TimerList *tlist = ts->list;
     QEMUTimer **pt, *t;
 
-    qemu_mutex_lock(&clock->active_timers_lock);
-    pt = &ts->clock->active_timers;
+    qemu_mutex_lock(&tlist->active_timers_lock);
+    pt = &tlist->active_timers;
     for(;;) {
         t = *pt;
         if (!t)
@@ -346,21 +373,21 @@ void qemu_del_timer(QEMUTimer *ts)
         }
         pt = &t->next;
     }
-    qemu_mutex_unlock(&clock->active_timers_lock);
+    qemu_mutex_unlock(&tlist->active_timers_lock);
 }
 
 /* modify the current timer so that it will be fired when current_time
    >= expire_time. The corresponding callback will be called. */
 void qemu_mod_timer_ns(QEMUTimer *ts, int64_t expire_time)
 {
-    QEMUClock *clock = ts->clock;
+    TimerList *tlist = ts->list;
     QEMUTimer **pt, *t;
 
     qemu_del_timer(ts);
 
     /* add the timer in the sorted list */
-    qemu_mutex_lock(&clock->active_timers_lock);
-    pt = &clock->active_timers;
+    qemu_mutex_lock(&tlist->active_timers_lock);
+    pt = &tlist->active_timers;
     for(;;) {
         t = *pt;
         if (!qemu_timer_expired_ns(t, expire_time)) {
@@ -371,10 +398,10 @@ void qemu_mod_timer_ns(QEMUTimer *ts, int64_t expire_time)
     ts->expire_time = expire_time;
     ts->next = *pt;
     *pt = ts;
-    qemu_mutex_unlock(&clock->active_timers_lock);
+    qemu_mutex_unlock(&tlist->active_timers_lock);
 
     /* Rearm if necessary  */
-    if (pt == &ts->clock->active_timers) {
+    if (pt == &tlist->active_timers) {
         qemu_mutex_lock(&alarm_timer->timer_modified_lock);
         alarm_timer->timer_modified = true;
         qemu_mutex_unlock(&alarm_timer->timer_modified_lock);
@@ -390,15 +417,15 @@ void qemu_mod_timer(QEMUTimer *ts, int64_t expire_time)
 bool qemu_timer_pending(QEMUTimer *ts)
 {
     QEMUTimer *t;
-    QEMUClock *clock = ts->clock;
+    TimerList *tlist = ts->list;
 
-    qemu_mutex_lock(&clock->active_timers_lock);
-    for (t = ts->clock->active_timers; t != NULL; t = t->next) {
+    qemu_mutex_lock(&tlist->active_timers_lock);
+    for (t = tlist->active_timers; t != NULL; t = t->next) {
         if (t == ts) {
             break;
         }
     }
-    qemu_mutex_unlock(&clock->active_timers_lock);
+    qemu_mutex_unlock(&tlist->active_timers_lock);
     return t;
 }
 
@@ -411,22 +438,25 @@ void qemu_run_timers(QEMUClock *clock)
 {
     QEMUTimer *ts;
     int64_t current_time;
-   
+    TimerList *tlist;
+
     if (!clock->enabled)
         return;
 
     current_time = qemu_get_clock_ns(clock);
+    tlist = clock_to_timerlist(clock);
+
     for(;;) {
-        qemu_mutex_lock(&clock->active_timers_lock);
-        ts = clock->active_timers;
+        qemu_mutex_lock(&tlist->active_timers_lock);
+        ts = tlist->active_timers;
         if (!qemu_timer_expired_ns(ts, current_time)) {
-            qemu_mutex_unlock(&clock->active_timers_lock);
+            qemu_mutex_unlock(&tlist->active_timers_lock);
             break;
         }
         /* remove timer from the list before calling the callback */
-        clock->active_timers = ts->next;
+        tlist->active_timers = ts->next;
         ts->next = NULL;
-        qemu_mutex_unlock(&clock->active_timers_lock);
+        qemu_mutex_unlock(&tlist->active_timers_lock);
 
         /* run the callback (the timer list can be modified) */
         ts->cb(ts->opaque);
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-07-29  3:16 [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff Liu Ping Fan
  2013-07-29  3:16 ` [Qemu-devel] [RFC v2 1/5] timer: protect timers_state with lock Liu Ping Fan
  2013-07-29  3:16 ` [Qemu-devel] [RFC v2 2/5] timer: pick out timer list info from QemuClock Liu Ping Fan
@ 2013-07-29  3:16 ` Liu Ping Fan
  2013-07-29  6:30   ` Paolo Bonzini
  2013-07-29  3:16 ` [Qemu-devel] [RFC v2 4/5] timer: associate three timerlists with AioContext Liu Ping Fan
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 38+ messages in thread
From: Liu Ping Fan @ 2013-07-29  3:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, Alex Bligh,
	Anthony Liguori, Paolo Bonzini

After disabling a QEMUClock, we should make sure that no QEMUTimer
callbacks are still in flight. To implement that, the disabling caller
waits until the last user has exited.

Note that callers of qemu_clock_enable() must synchronize among
themselves; that is not covered by this patch.
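The pattern implemented below is a drain: callback runners take a
reference on the clock, and the disabling side waits until that count
reaches zero. A stripped-down sketch of the two sides (condensed from
the diff below, not a new API):

/* Reader side (qemu_run_timers): announce use *before* checking enabled. */
atomic_inc(&clock->using);
if (likely(clock->enabled)) {
    /* ... run expired timer callbacks ... */
}
qemu_mutex_lock(&clock->lock);
if (atomic_fetch_dec(&clock->using) == 1 && !clock->enabled) {
    qemu_cond_signal(&clock->wait_using);   /* last user wakes the disabler */
}
qemu_mutex_unlock(&clock->lock);

/* Disabler side (qemu_clock_enable(clock, false)): wait for users to drain. */
clock->enabled = false;
qemu_mutex_lock(&clock->lock);
while (atomic_read(&clock->using)) {
    qemu_cond_wait(&clock->wait_using, &clock->lock);
}
qemu_mutex_unlock(&clock->lock);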

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
---
 qemu-timer.c | 34 ++++++++++++++++++++++++++++++++--
 1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index 5a42035..d941a83 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -60,6 +60,14 @@ struct QEMUClock {
 
     int type;
     bool enabled;
+
+    /* Rule: innermost lock.
+     * Protects the disable path against timer callbacks in flight.
+     */
+    QemuMutex lock;
+    /* reference by timer list */
+    int using;
+    QemuCond wait_using;
 };
 
 struct QEMUTimer {
@@ -274,6 +282,9 @@ static QEMUClock *qemu_new_clock(int type)
     clock->type = type;
     clock->enabled = true;
     clock->last = INT64_MIN;
+    clock->using = 0;
+    qemu_cond_init(&clock->wait_using);
+    qemu_mutex_init(&clock->lock);
     notifier_list_init(&clock->reset_notifiers);
     tlist = clock_to_timerlist(clock);
     timer_list_init(tlist);
@@ -287,6 +298,13 @@ void qemu_clock_enable(QEMUClock *clock, bool enabled)
     clock->enabled = enabled;
     if (enabled && !old) {
         qemu_rearm_alarm_timer(alarm_timer);
+    } else {
+        qemu_mutex_lock(&clock->lock);
+        /* Wait for cb in flight to terminate */
+        while (atomic_read(&clock->using)) {
+            qemu_cond_wait(&clock->wait_using, &clock->lock);
+        }
+        qemu_mutex_unlock(&clock->lock);
     }
 }
 
@@ -440,8 +458,11 @@ void qemu_run_timers(QEMUClock *clock)
     int64_t current_time;
     TimerList *tlist;
 
-    if (!clock->enabled)
-        return;
+    atomic_inc(&clock->using);
+    if (unlikely(!clock->enabled)) {
+        goto exit;
+    }
+
 
     current_time = qemu_get_clock_ns(clock);
     tlist = clock_to_timerlist(clock);
@@ -461,6 +482,15 @@ void qemu_run_timers(QEMUClock *clock)
         /* run the callback (the timer list can be modified) */
         ts->cb(ts->opaque);
     }
+
+exit:
+    qemu_mutex_lock(&clock->lock);
+    if (atomic_fetch_dec(&clock->using) == 1) {
+        if (unlikely(!clock->enabled)) {
+            qemu_cond_signal(&clock->wait_using);
+        }
+    }
+    qemu_mutex_unlock(&clock->lock);
 }
 
 int64_t qemu_get_clock_ns(QEMUClock *clock)
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Qemu-devel] [RFC v2 4/5] timer: associate three timerlists with AioContext
  2013-07-29  3:16 [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff Liu Ping Fan
                   ` (2 preceding siblings ...)
  2013-07-29  3:16 ` [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb Liu Ping Fan
@ 2013-07-29  3:16 ` Liu Ping Fan
  2013-07-29  6:32   ` Paolo Bonzini
  2013-07-29  3:16 ` [Qemu-devel] [RFC v2 5/5] timer: run timers on aio_poll Liu Ping Fan
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 38+ messages in thread
From: Liu Ping Fan @ 2013-07-29  3:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, Alex Bligh,
	Anthony Liguori, Paolo Bonzini

Currently, timers run on the iothread inside the BQL, which limits their
use in some cases, e.g. virtio-blk-dataplane. In order to run timers on
private threads, based on the different clock sources, we arm each
AioContext with three timer lists, one per clock source (QEMUClock).

A little later, we will run timers in aio_poll.
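As an example of how the new helper is meant to be used, a dataplane
thread that owns an AioContext could bind a timer to it like this
(hypothetical user code: dataplane_arm_timer, dataplane_cb and dp are
made-up names, not part of this patch):

/* Hypothetical usage sketch -- not part of this patch. */
static void dataplane_cb(void *opaque)
{
    /* process whatever needs periodic service in this AioContext */
}

void dataplane_arm_timer(AioContext *ctx, void *dp)
{
    QEMUTimer *t = aioctx_new_timer_ns(vm_clock, dataplane_cb, dp, ctx);

    /* fire 100us from now, on this AioContext's timer list */
    qemu_mod_timer_ns(t, qemu_get_clock_ns(vm_clock) + 100 * SCALE_US);
}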

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>

------ issue to fix ---
Note: before this patch, there should be another one to fix the race
between qemu_mod_timer() and qemu_run_timers().
---
 async.c              |  9 ++++++++
 include/block/aio.h  | 13 +++++++++++
 include/qemu/timer.h | 20 ++++++++++++++++
 qemu-timer.c         | 65 +++++++++++++++++++++++++++++++---------------------
 4 files changed, 81 insertions(+), 26 deletions(-)

diff --git a/async.c b/async.c
index ba4072c..7e2340e 100644
--- a/async.c
+++ b/async.c
@@ -201,11 +201,15 @@ static void
 aio_ctx_finalize(GSource     *source)
 {
     AioContext *ctx = (AioContext *) source;
+    int i;
 
     thread_pool_free(ctx->thread_pool);
     aio_set_event_notifier(ctx, &ctx->notifier, NULL, NULL);
     event_notifier_cleanup(&ctx->notifier);
     g_array_free(ctx->pollfds, TRUE);
+    for (i = 0; i < QEMU_CLOCK_MAXCNT; i++) {
+        timer_list_finalize(&ctx->timer_list[i]);
+    }
 }
 
 static GSourceFuncs aio_source_funcs = {
@@ -237,6 +241,8 @@ void aio_notify(AioContext *ctx)
 AioContext *aio_context_new(void)
 {
     AioContext *ctx;
+    int i;
+
     ctx = (AioContext *) g_source_new(&aio_source_funcs, sizeof(AioContext));
     ctx->pollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD));
     ctx->thread_pool = NULL;
@@ -245,6 +251,9 @@ AioContext *aio_context_new(void)
     aio_set_event_notifier(ctx, &ctx->notifier, 
                            (EventNotifierHandler *)
                            event_notifier_test_and_clear, NULL);
+    for (i = 0; i < QEMU_CLOCK_MAXCNT; i++) {
+        timer_list_init(&ctx->timer_list[i]);
+    }
 
     return ctx;
 }
diff --git a/include/block/aio.h b/include/block/aio.h
index 04598b2..cf41b42 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -43,6 +43,18 @@ typedef struct AioHandler AioHandler;
 typedef void QEMUBHFunc(void *opaque);
 typedef void IOHandler(void *opaque);
 
+/* Timer lists associated with an AioContext */
+typedef struct QEMUTimer QEMUTimer;
+#define QEMU_CLOCK_MAXCNT 3
+
+typedef struct TimerList {
+    QEMUTimer *active_timers;
+    QemuMutex active_timers_lock;
+} TimerList;
+
+void timer_list_init(TimerList *tlist);
+void timer_list_finalize(TimerList *tlist);
+
 typedef struct AioContext {
     GSource source;
 
@@ -73,6 +85,7 @@ typedef struct AioContext {
 
     /* Thread pool for performing work and receiving completion callbacks */
     struct ThreadPool *thread_pool;
+    TimerList timer_list[QEMU_CLOCK_MAXCNT];
 } AioContext;
 
 /* Returns 1 if there are still outstanding AIO requests; 0 otherwise */
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 9dd206c..3e5016b 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -33,9 +33,14 @@ extern QEMUClock *vm_clock;
 extern QEMUClock *host_clock;
 
 int64_t qemu_get_clock_ns(QEMUClock *clock);
+/* qemu_clock_has_timers, qemu_clock_expired and qemu_clock_deadline only
+ * run in TCG icount mode, where there is a single AioContext, i.e.
+ * qemu_aio_context.  So we only count the timers on qemu_aio_context.
+ */
 int64_t qemu_clock_has_timers(QEMUClock *clock);
 int64_t qemu_clock_expired(QEMUClock *clock);
 int64_t qemu_clock_deadline(QEMUClock *clock);
+
 void qemu_clock_enable(QEMUClock *clock, bool enabled);
 void qemu_clock_warp(QEMUClock *clock);
 
@@ -45,6 +50,9 @@ void qemu_unregister_clock_reset_notifier(QEMUClock *clock,
 
 QEMUTimer *qemu_new_timer(QEMUClock *clock, int scale,
                           QEMUTimerCB *cb, void *opaque);
+QEMUTimer *aioctx_new_timer(QEMUClock *clock, int scale,
+                          QEMUTimerCB *cb, void *opaque, AioContext *ctx);
+
 void qemu_free_timer(QEMUTimer *ts);
 void qemu_del_timer(QEMUTimer *ts);
 void qemu_mod_timer_ns(QEMUTimer *ts, int64_t expire_time);
@@ -75,6 +83,18 @@ static inline QEMUTimer *qemu_new_timer_ms(QEMUClock *clock, QEMUTimerCB *cb,
     return qemu_new_timer(clock, SCALE_MS, cb, opaque);
 }
 
+static inline QEMUTimer *aioctx_new_timer_ns(QEMUClock *clock, QEMUTimerCB *cb,
+                                           void *opaque, AioContext *ctx)
+{
+    return aioctx_new_timer(clock, SCALE_NS, cb, opaque, ctx);
+}
+
+static inline QEMUTimer *aioctx_new_timer_ms(QEMUClock *clock, QEMUTimerCB *cb,
+                                           void *opaque, AioContext *ctx)
+{
+    return aioctx_new_timer(clock, SCALE_MS, cb, opaque, ctx);
+}
+
 static inline int64_t qemu_get_clock_ms(QEMUClock *clock)
 {
     return qemu_get_clock_ns(clock) / SCALE_MS;
diff --git a/qemu-timer.c b/qemu-timer.c
index d941a83..f15c3e6 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -45,14 +45,6 @@
 #define QEMU_CLOCK_REALTIME 0
 #define QEMU_CLOCK_VIRTUAL  1
 #define QEMU_CLOCK_HOST     2
-#define QEMU_CLOCK_MAXCNT 3
-
-typedef struct TimerList {
-    QEMUTimer *active_timers;
-    QemuMutex active_timers_lock;
-} TimerList;
-
-static TimerList timer_list[QEMU_CLOCK_MAXCNT];
 
 struct QEMUClock {
     NotifierList reset_notifiers;
@@ -72,7 +64,9 @@ struct QEMUClock {
 
 struct QEMUTimer {
     int64_t expire_time;	/* in nanoseconds */
+    /* quick link to AioContext timer list */
     TimerList *list;
+    AioContext *ctx;
     QEMUTimerCB *cb;
     void *opaque;
     QEMUTimer *next;
@@ -100,11 +94,12 @@ struct qemu_alarm_timer {
 
 static struct qemu_alarm_timer *alarm_timer;
 
-static TimerList *clock_to_timerlist(QEMUClock *clock)
+static TimerList *clock_to_timerlist(QEMUClock *clock, AioContext *ctx)
 {
     int type = clock->type;
 
-    return  &timer_list[type];
+    assert(ctx);
+    return  &ctx->timer_list[type];
 }
 
 static bool qemu_timer_expired_ns(QEMUTimer *timer_head, int64_t current_time)
@@ -112,7 +107,8 @@ static bool qemu_timer_expired_ns(QEMUTimer *timer_head, int64_t current_time)
     return timer_head && (timer_head->expire_time <= current_time);
 }
 
-static int64_t qemu_next_clock_deadline(QEMUClock *clock, int64_t delta)
+static int64_t qemu_next_clock_deadline(QEMUClock *clock, int64_t delta,
+    AioContext *ctx)
 {
     int64_t expire_time, next;
     bool has_timer = false;
@@ -122,7 +118,7 @@ static int64_t qemu_next_clock_deadline(QEMUClock *clock, int64_t delta)
         return delta;
     }
 
-    tlist = clock_to_timerlist(clock);
+    tlist = clock_to_timerlist(clock, ctx);
     qemu_mutex_lock(&tlist->active_timers_lock);
     if (tlist->active_timers) {
         has_timer = true;
@@ -140,12 +136,13 @@ static int64_t qemu_next_clock_deadline(QEMUClock *clock, int64_t delta)
 static int64_t qemu_next_alarm_deadline(void)
 {
     int64_t delta = INT64_MAX;
+    AioContext *ctx = *tls_get_thread_aio_context();
 
     if (!use_icount) {
-        delta = qemu_next_clock_deadline(vm_clock, delta);
+        delta = qemu_next_clock_deadline(vm_clock, delta, ctx);
     }
-    delta = qemu_next_clock_deadline(host_clock, delta);
-    return qemu_next_clock_deadline(rt_clock, delta);
+    delta = qemu_next_clock_deadline(host_clock, delta, ctx);
+    return qemu_next_clock_deadline(rt_clock, delta, ctx);
 }
 
 static void qemu_rearm_alarm_timer(struct qemu_alarm_timer *t)
@@ -267,16 +264,21 @@ QEMUClock *rt_clock;
 QEMUClock *vm_clock;
 QEMUClock *host_clock;
 
-static void timer_list_init(TimerList *tlist)
+void timer_list_init(TimerList *tlist)
 {
     qemu_mutex_init(&tlist->active_timers_lock);
     tlist->active_timers = NULL;
 }
 
+void timer_list_finalize(TimerList *tlist)
+{
+    qemu_mutex_destroy(&tlist->active_timers_lock);
+    assert(!tlist->active_timers);
+}
+
 static QEMUClock *qemu_new_clock(int type)
 {
     QEMUClock *clock;
-    TimerList *tlist;
 
     clock = g_malloc0(sizeof(QEMUClock));
     clock->type = type;
@@ -286,8 +288,6 @@ static QEMUClock *qemu_new_clock(int type)
     qemu_cond_init(&clock->wait_using);
     qemu_mutex_init(&clock->lock);
     notifier_list_init(&clock->reset_notifiers);
-    tlist = clock_to_timerlist(clock);
-    timer_list_init(tlist);
 
     return clock;
 }
@@ -308,10 +308,14 @@ void qemu_clock_enable(QEMUClock *clock, bool enabled)
     }
 }
 
+/* qemu_clock_has_timers, qemu_clock_expired and qemu_clock_deadline only
+ * run in TCG icount mode, where there is a single AioContext, i.e.
+ * qemu_aio_context.  So we only count the timers on qemu_aio_context.
+ */
 int64_t qemu_clock_has_timers(QEMUClock *clock)
 {
     bool has_timers;
-    TimerList *tlist = clock_to_timerlist(clock);
+    TimerList *tlist = clock_to_timerlist(clock, qemu_get_aio_context());
 
     qemu_mutex_lock(&tlist->active_timers_lock);
     has_timers = !!tlist->active_timers;
@@ -323,7 +327,7 @@ int64_t qemu_clock_expired(QEMUClock *clock)
 {
     bool has_timers;
     int64_t expire_time;
-    TimerList *tlist = clock_to_timerlist(clock);
+    TimerList *tlist = clock_to_timerlist(clock, qemu_get_aio_context());
 
     qemu_mutex_lock(&tlist->active_timers_lock);
     has_timers = tlist->active_timers;
@@ -339,7 +343,7 @@ int64_t qemu_clock_deadline(QEMUClock *clock)
     int64_t delta = INT32_MAX;
     bool has_timers;
     int64_t expire_time;
-    TimerList *tlist = clock_to_timerlist(clock);
+    TimerList *tlist = clock_to_timerlist(clock, qemu_get_aio_context());
 
     qemu_mutex_lock(&tlist->active_timers_lock);
     has_timers = tlist->active_timers;
@@ -355,19 +359,26 @@ int64_t qemu_clock_deadline(QEMUClock *clock)
     return delta;
 }
 
-QEMUTimer *qemu_new_timer(QEMUClock *clock, int scale,
-                          QEMUTimerCB *cb, void *opaque)
+QEMUTimer *aioctx_new_timer(QEMUClock *clock, int scale,
+                          QEMUTimerCB *cb, void *opaque, AioContext *ctx)
 {
     QEMUTimer *ts;
 
     ts = g_malloc0(sizeof(QEMUTimer));
-    ts->list = clock_to_timerlist(clock);
+    ts->list = clock_to_timerlist(clock, ctx);
     ts->cb = cb;
     ts->opaque = opaque;
     ts->scale = scale;
+    ts->ctx = ctx;
     return ts;
 }
 
+QEMUTimer *qemu_new_timer(QEMUClock *clock, int scale,
+                          QEMUTimerCB *cb, void *opaque)
+{
+    return aioctx_new_timer(clock, scale, cb, opaque, qemu_get_aio_context());
+}
+
 void qemu_free_timer(QEMUTimer *ts)
 {
     g_free(ts);
@@ -457,6 +468,7 @@ void qemu_run_timers(QEMUClock *clock)
     QEMUTimer *ts;
     int64_t current_time;
     TimerList *tlist;
+    AioContext *ctx;
 
     atomic_inc(&clock->using);
     if (unlikely(!clock->enabled)) {
@@ -465,7 +477,8 @@ void qemu_run_timers(QEMUClock *clock)
 
 
     current_time = qemu_get_clock_ns(clock);
-    tlist = clock_to_timerlist(clock);
+    ctx = *tls_get_thread_aio_context();
+    tlist = clock_to_timerlist(clock, ctx);
 
     for(;;) {
         qemu_mutex_lock(&tlist->active_timers_lock);
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Qemu-devel] [RFC v2 5/5] timer: run timers on aio_poll
  2013-07-29  3:16 [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff Liu Ping Fan
                   ` (3 preceding siblings ...)
  2013-07-29  3:16 ` [Qemu-devel] [RFC v2 4/5] timer: associate three timerlists with AioContext Liu Ping Fan
@ 2013-07-29  3:16 ` Liu Ping Fan
  2013-07-29  9:22 ` [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff Stefan Hajnoczi
  2013-07-29 10:18 ` Alex Bligh
  6 siblings, 0 replies; 38+ messages in thread
From: Liu Ping Fan @ 2013-07-29  3:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, Alex Bligh,
	Anthony Liguori, Paolo Bonzini

Stop running timers in the main loop and let each mini event loop
run its own timers.
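Conceptually, every event loop then looks like this (simplified from the
aio-posix.c hunk below; fd handling and the blocking poll are elided):

/* Simplified control flow after this patch (see the real diff below). */
bool aio_poll(AioContext *ctx, bool blocking)
{
    bool progress = false;

    /* ... dispatch bottom halves and ready fd handlers ... */

    /* run this AioContext's expired timers as part of the loop */
    progress |= qemu_run_all_timers();

    /* ... poll for new events if nothing made progress ... */
    return progress;
}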

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
---
 aio-posix.c          |  2 ++
 include/qemu/timer.h |  4 ++--
 main-loop.c          |  2 --
 qemu-timer.c         | 15 ++++++++++-----
 4 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/aio-posix.c b/aio-posix.c
index b68eccd..53fdb1a 100644
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -191,6 +191,8 @@ bool aio_poll(AioContext *ctx, bool blocking)
         progress = true;
     }
 
+    progress |= qemu_run_all_timers();
+
     if (progress && !blocking) {
         return true;
     }
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 3e5016b..9ff8a68 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -61,8 +61,8 @@ bool qemu_timer_pending(QEMUTimer *ts);
 bool qemu_timer_expired(QEMUTimer *timer_head, int64_t current_time);
 uint64_t qemu_timer_expire_time_ns(QEMUTimer *ts);
 
-void qemu_run_timers(QEMUClock *clock);
-void qemu_run_all_timers(void);
+bool qemu_run_timers(QEMUClock *clock);
+bool qemu_run_all_timers(void);
 void configure_alarms(char const *opt);
 void init_clocks(void);
 int init_timer_alarm(void);
diff --git a/main-loop.c b/main-loop.c
index 5fbdd4a..0214ed6 100644
--- a/main-loop.c
+++ b/main-loop.c
@@ -471,8 +471,6 @@ int main_loop_wait(int nonblocking)
     slirp_pollfds_poll(gpollfds, (ret < 0));
 #endif
 
-    qemu_run_all_timers();
-
     return ret;
 }
 
diff --git a/qemu-timer.c b/qemu-timer.c
index f15c3e6..8331e18 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -463,12 +463,13 @@ bool qemu_timer_expired(QEMUTimer *timer_head, int64_t current_time)
     return qemu_timer_expired_ns(timer_head, current_time * timer_head->scale);
 }
 
-void qemu_run_timers(QEMUClock *clock)
+bool qemu_run_timers(QEMUClock *clock)
 {
     QEMUTimer *ts;
     int64_t current_time;
     TimerList *tlist;
     AioContext *ctx;
+    bool process = false;
 
     atomic_inc(&clock->using);
     if (unlikely(!clock->enabled)) {
@@ -494,6 +495,7 @@ void qemu_run_timers(QEMUClock *clock)
 
         /* run the callback (the timer list can be modified) */
         ts->cb(ts->opaque);
+        process = true;
     }
 
 exit:
@@ -504,6 +506,7 @@ exit:
         }
     }
     qemu_mutex_unlock(&clock->lock);
+    return process;
 }
 
 int64_t qemu_get_clock_ns(QEMUClock *clock)
@@ -555,16 +558,17 @@ uint64_t qemu_timer_expire_time_ns(QEMUTimer *ts)
     return qemu_timer_pending(ts) ? ts->expire_time : -1;
 }
 
-void qemu_run_all_timers(void)
+bool qemu_run_all_timers(void)
 {
     bool timer_modified;
+    bool process = false;
 
     alarm_timer->pending = false;
 
     /* vm time timers */
-    qemu_run_timers(vm_clock);
-    qemu_run_timers(rt_clock);
-    qemu_run_timers(host_clock);
+    process |= qemu_run_timers(vm_clock);
+    process |= qemu_run_timers(rt_clock);
+    process |= qemu_run_timers(host_clock);
 
     /* Check if qemu_mod_timer_ns() has been called */
     qemu_mutex_lock(&alarm_timer->timer_modified_lock);
@@ -580,6 +584,7 @@ void qemu_run_all_timers(void)
         qemu_rearm_alarm_timer(alarm_timer);
     }
 
+    return process;
 }
 
 #ifdef _WIN32
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 1/5] timer: protect timers_state with lock
  2013-07-29  3:16 ` [Qemu-devel] [RFC v2 1/5] timer: protect timers_state with lock Liu Ping Fan
@ 2013-07-29  6:26   ` Paolo Bonzini
  2013-07-29  8:01     ` liu ping fan
  0 siblings, 1 reply; 38+ messages in thread
From: Paolo Bonzini @ 2013-07-29  6:26 UTC (permalink / raw)
  To: Liu Ping Fan
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, qemu-devel, Alex Bligh,
	Anthony Liguori

Il 29/07/2013 05:16, Liu Ping Fan ha scritto:
> In KVM mode, vm_clock may be read from AioContexts outside the BQL (see
> the next patch). This exposes timers_state -- the foundation of vm_clock --
> to races. Use a private lock to protect it.
> Note that in TCG mode, vm_clock is still read inside the BQL, so the
> icount code is left unchanged.
>
> Lock rule: the private lock is innermost, i.e. BQL -> "this lock".

This is quite expensive; on the other hand it is probably a good thing
to do even without the other patches.

Can you use the seqlock primitive we initially worked on for the memory
dispatch (with the BQL on the write side)?
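For reference, the seqlock idea is that the writer bumps a sequence
counter around its updates, and lock-free readers retry if they observe
a concurrent write. A minimal sketch of how that could look for
timers_state (illustration only, not a specific QEMU API; it assumes the
smp_wmb()/smp_rmb() barriers from qemu/atomic.h and that the BQL
serializes writers):

/* even = idle, odd = write in progress */
static unsigned timers_state_seq;

static void timers_state_write_begin(void)
{
    timers_state_seq++;               /* becomes odd */
    smp_wmb();                        /* counter update before data stores */
}

static void timers_state_write_end(void)
{
    smp_wmb();                        /* data stores before counter update */
    timers_state_seq++;               /* becomes even again */
}

/* Lock-free reader, e.g. for cpu_get_clock(): retry on concurrent write. */
static int64_t read_cpu_clock_offset(void)
{
    unsigned start;
    int64_t val;

    do {
        start = timers_state_seq;
        smp_rmb();
        val = timers_state.cpu_clock_offset;
        smp_rmb();
    } while ((start & 1) || timers_state_seq != start);

    return val;
}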

Paolo

> Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
> ---
>  cpus.c | 26 +++++++++++++++++++++-----
>  1 file changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/cpus.c b/cpus.c
> index 61e86a8..4af81e9 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -112,7 +112,9 @@ typedef struct TimersState {
>      int64_t dummy;
>  } TimersState;
>  
> -TimersState timers_state;
> +static TimersState timers_state;
> +/* lock rule: innermost */
> +static QemuMutex timers_state_lock;
>  
>  /* Return the virtual CPU time, based on the instruction counter.  */
>  int64_t cpu_get_icount(void)
> @@ -134,11 +136,14 @@ int64_t cpu_get_icount(void)
>  /* return the host CPU cycle counter and handle stop/restart */
>  int64_t cpu_get_ticks(void)
>  {
> +    int64_t ret;
> +
>      if (use_icount) {
>          return cpu_get_icount();
>      }
> +    qemu_mutex_lock(&timers_state_lock);
>      if (!timers_state.cpu_ticks_enabled) {
> -        return timers_state.cpu_ticks_offset;
> +        ret = timers_state.cpu_ticks_offset;
>      } else {
>          int64_t ticks;
>          ticks = cpu_get_real_ticks();
> @@ -148,41 +153,51 @@ int64_t cpu_get_ticks(void)
>              timers_state.cpu_ticks_offset += timers_state.cpu_ticks_prev - ticks;
>          }
>          timers_state.cpu_ticks_prev = ticks;
> -        return ticks + timers_state.cpu_ticks_offset;
> +        ret = ticks + timers_state.cpu_ticks_offset;
>      }
> +    qemu_mutex_unlock(&timers_state_lock);
> +    return ret;
>  }
>  
>  /* return the host CPU monotonic timer and handle stop/restart */
>  int64_t cpu_get_clock(void)
>  {
>      int64_t ti;
> +
> +    qemu_mutex_lock(&timers_state_lock);
>      if (!timers_state.cpu_ticks_enabled) {
> -        return timers_state.cpu_clock_offset;
> +        ti = timers_state.cpu_clock_offset;
>      } else {
>          ti = get_clock();
> -        return ti + timers_state.cpu_clock_offset;
> +        ti += timers_state.cpu_clock_offset;
>      }
> +    qemu_mutex_unlock(&timers_state_lock);
> +    return ti;
>  }
>  
>  /* enable cpu_get_ticks() */
>  void cpu_enable_ticks(void)
>  {
> +    qemu_mutex_lock(&timers_state_lock);
>      if (!timers_state.cpu_ticks_enabled) {
>          timers_state.cpu_ticks_offset -= cpu_get_real_ticks();
>          timers_state.cpu_clock_offset -= get_clock();
>          timers_state.cpu_ticks_enabled = 1;
>      }
> +    qemu_mutex_unlock(&timers_state_lock);
>  }
>  
>  /* disable cpu_get_ticks() : the clock is stopped. You must not call
>     cpu_get_ticks() after that.  */
>  void cpu_disable_ticks(void)
>  {
> +    qemu_mutex_lock(&timers_state_lock);
>      if (timers_state.cpu_ticks_enabled) {
>          timers_state.cpu_ticks_offset = cpu_get_ticks();
>          timers_state.cpu_clock_offset = cpu_get_clock();
>          timers_state.cpu_ticks_enabled = 0;
>      }
> +    qemu_mutex_unlock(&timers_state_lock);
>  }
>  
>  /* Correlation between real and virtual time is always going to be
> @@ -353,6 +368,7 @@ static const VMStateDescription vmstate_timers = {
>  
>  void configure_icount(const char *option)
>  {
> +    qemu_mutex_init(&timers_state_lock);
>      vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
>      if (!option) {
>          return;
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-07-29  3:16 ` [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb Liu Ping Fan
@ 2013-07-29  6:30   ` Paolo Bonzini
  2013-07-29  8:10     ` liu ping fan
  0 siblings, 1 reply; 38+ messages in thread
From: Paolo Bonzini @ 2013-07-29  6:30 UTC (permalink / raw)
  To: Liu Ping Fan
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, qemu-devel, Alex Bligh,
	Anthony Liguori

Il 29/07/2013 05:16, Liu Ping Fan ha scritto:
> After disabling a QEMUClock, we should make sure that no QEMUTimer
> callbacks are still in flight. To implement that, the disabling caller
> waits until the last user has exited.
>
> Note that callers of qemu_clock_enable() must synchronize among
> themselves; that is not covered by this patch.
> 
> Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>

This is an interesting approach.

> -    if (!clock->enabled)
> -        return;
> +    atomic_inc(&clock->using);
> +    if (unlikely(!clock->enabled)) {
> +        goto exit;
> +    }

This can return directly; it doesn't need to increment and decrement
clock->using.

Paolo

>  
>      current_time = qemu_get_clock_ns(clock);
>      tlist = clock_to_timerlist(clock);
> @@ -461,6 +482,15 @@ void qemu_run_timers(QEMUClock *clock)
>          /* run the callback (the timer list can be modified) */
>          ts->cb(ts->opaque);
>      }
> +
> +exit:
> +    qemu_mutex_lock(&clock->lock);
> +    if (atomic_fetch_dec(&clock->using) == 1) {
> +        if (unlikely(!clock->enabled)) {
> +            qemu_cond_signal(&clock->wait_using);
> +        }
> +    }
> +    qemu_mutex_unlock(&clock->lock);
>  }
>  
>  int64_t qemu_get_clock_ns(QEMUClock *clock)
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 4/5] timer: associate three timerlists with AioContext
  2013-07-29  3:16 ` [Qemu-devel] [RFC v2 4/5] timer: associate three timerlists with AioContext Liu Ping Fan
@ 2013-07-29  6:32   ` Paolo Bonzini
  2013-07-29  8:20     ` liu ping fan
  0 siblings, 1 reply; 38+ messages in thread
From: Paolo Bonzini @ 2013-07-29  6:32 UTC (permalink / raw)
  To: Liu Ping Fan
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, qemu-devel, Alex Bligh,
	Anthony Liguori

Il 29/07/2013 05:16, Liu Ping Fan ha scritto:
> Currently, timers run on the iothread inside the BQL, which limits their
> use in some cases, e.g. virtio-blk-dataplane. In order to run timers on
> private threads, based on the different clock sources, we arm each
> AioContext with three timer lists, one per clock source (QEMUClock).
> 
> A little later, we will run timers in aio_poll.
> 
> Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
> 
> ------ issue to fix ---
> Note: before this patch, there should be another one to fix the race
> between qemu_mod_timer() and qemu_run_timers().

Another issue is that deadline computation is not using the AioContext's
timer lists.

Paolo

> ---
>  async.c              |  9 ++++++++
>  include/block/aio.h  | 13 +++++++++++
>  include/qemu/timer.h | 20 ++++++++++++++++
>  qemu-timer.c         | 65 +++++++++++++++++++++++++++++++---------------------
>  4 files changed, 81 insertions(+), 26 deletions(-)
> 
> diff --git a/async.c b/async.c
> index ba4072c..7e2340e 100644
> --- a/async.c
> +++ b/async.c
> @@ -201,11 +201,15 @@ static void
>  aio_ctx_finalize(GSource     *source)
>  {
>      AioContext *ctx = (AioContext *) source;
> +    int i;
>  
>      thread_pool_free(ctx->thread_pool);
>      aio_set_event_notifier(ctx, &ctx->notifier, NULL, NULL);
>      event_notifier_cleanup(&ctx->notifier);
>      g_array_free(ctx->pollfds, TRUE);
> +    for (i = 0; i < QEMU_CLOCK_MAXCNT; i++) {
> +        timer_list_finalize(&ctx->timer_list[i]);
> +    }
>  }
>  
>  static GSourceFuncs aio_source_funcs = {
> @@ -237,6 +241,8 @@ void aio_notify(AioContext *ctx)
>  AioContext *aio_context_new(void)
>  {
>      AioContext *ctx;
> +    int i;
> +
>      ctx = (AioContext *) g_source_new(&aio_source_funcs, sizeof(AioContext));
>      ctx->pollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD));
>      ctx->thread_pool = NULL;
> @@ -245,6 +251,9 @@ AioContext *aio_context_new(void)
>      aio_set_event_notifier(ctx, &ctx->notifier, 
>                             (EventNotifierHandler *)
>                             event_notifier_test_and_clear, NULL);
> +    for (i = 0; i < QEMU_CLOCK_MAXCNT; i++) {
> +        timer_list_init(&ctx->timer_list[i]);
> +    }
>  
>      return ctx;
>  }
> diff --git a/include/block/aio.h b/include/block/aio.h
> index 04598b2..cf41b42 100644
> --- a/include/block/aio.h
> +++ b/include/block/aio.h
> @@ -43,6 +43,18 @@ typedef struct AioHandler AioHandler;
>  typedef void QEMUBHFunc(void *opaque);
>  typedef void IOHandler(void *opaque);
>  
> +/* Timer lists associated with an AioContext */
> +typedef struct QEMUTimer QEMUTimer;
> +#define QEMU_CLOCK_MAXCNT 3
> +
> +typedef struct TimerList {
> +    QEMUTimer *active_timers;
> +    QemuMutex active_timers_lock;
> +} TimerList;
> +
> +void timer_list_init(TimerList *tlist);
> +void timer_list_finalize(TimerList *tlist);
> +
>  typedef struct AioContext {
>      GSource source;
>  
> @@ -73,6 +85,7 @@ typedef struct AioContext {
>  
>      /* Thread pool for performing work and receiving completion callbacks */
>      struct ThreadPool *thread_pool;
> +    TimerList timer_list[QEMU_CLOCK_MAXCNT];
>  } AioContext;
>  
>  /* Returns 1 if there are still outstanding AIO requests; 0 otherwise */
> diff --git a/include/qemu/timer.h b/include/qemu/timer.h
> index 9dd206c..3e5016b 100644
> --- a/include/qemu/timer.h
> +++ b/include/qemu/timer.h
> @@ -33,9 +33,14 @@ extern QEMUClock *vm_clock;
>  extern QEMUClock *host_clock;
>  
>  int64_t qemu_get_clock_ns(QEMUClock *clock);
> +/* qemu_clock_has_timers, qemu_clock_expired and qemu_clock_deadline only
> + * run in TCG icount mode, where there is a single AioContext, i.e.
> + * qemu_aio_context.  So we only count the timers on qemu_aio_context.
> + */
>  int64_t qemu_clock_has_timers(QEMUClock *clock);
>  int64_t qemu_clock_expired(QEMUClock *clock);
>  int64_t qemu_clock_deadline(QEMUClock *clock);
> +
>  void qemu_clock_enable(QEMUClock *clock, bool enabled);
>  void qemu_clock_warp(QEMUClock *clock);
>  
> @@ -45,6 +50,9 @@ void qemu_unregister_clock_reset_notifier(QEMUClock *clock,
>  
>  QEMUTimer *qemu_new_timer(QEMUClock *clock, int scale,
>                            QEMUTimerCB *cb, void *opaque);
> +QEMUTimer *aioctx_new_timer(QEMUClock *clock, int scale,
> +                          QEMUTimerCB *cb, void *opaque, AioContext *ctx);
> +
>  void qemu_free_timer(QEMUTimer *ts);
>  void qemu_del_timer(QEMUTimer *ts);
>  void qemu_mod_timer_ns(QEMUTimer *ts, int64_t expire_time);
> @@ -75,6 +83,18 @@ static inline QEMUTimer *qemu_new_timer_ms(QEMUClock *clock, QEMUTimerCB *cb,
>      return qemu_new_timer(clock, SCALE_MS, cb, opaque);
>  }
>  
> +static inline QEMUTimer *aioctx_new_timer_ns(QEMUClock *clock, QEMUTimerCB *cb,
> +                                           void *opaque, AioContext *ctx)
> +{
> +    return aioctx_new_timer(clock, SCALE_NS, cb, opaque, ctx);
> +}
> +
> +static inline QEMUTimer *aioctx_new_timer_ms(QEMUClock *clock, QEMUTimerCB *cb,
> +                                           void *opaque, AioContext *ctx)
> +{
> +    return aioctx_new_timer(clock, SCALE_MS, cb, opaque, ctx);
> +}
> +
>  static inline int64_t qemu_get_clock_ms(QEMUClock *clock)
>  {
>      return qemu_get_clock_ns(clock) / SCALE_MS;
> diff --git a/qemu-timer.c b/qemu-timer.c
> index d941a83..f15c3e6 100644
> --- a/qemu-timer.c
> +++ b/qemu-timer.c
> @@ -45,14 +45,6 @@
>  #define QEMU_CLOCK_REALTIME 0
>  #define QEMU_CLOCK_VIRTUAL  1
>  #define QEMU_CLOCK_HOST     2
> -#define QEMU_CLOCK_MAXCNT 3
> -
> -typedef struct TimerList {
> -    QEMUTimer *active_timers;
> -    QemuMutex active_timers_lock;
> -} TimerList;
> -
> -static TimerList timer_list[QEMU_CLOCK_MAXCNT];
>  
>  struct QEMUClock {
>      NotifierList reset_notifiers;
> @@ -72,7 +64,9 @@ struct QEMUClock {
>  
>  struct QEMUTimer {
>      int64_t expire_time;	/* in nanoseconds */
> +    /* quick link to AioContext timer list */
>      TimerList *list;
> +    AioContext *ctx;
>      QEMUTimerCB *cb;
>      void *opaque;
>      QEMUTimer *next;
> @@ -100,11 +94,12 @@ struct qemu_alarm_timer {
>  
>  static struct qemu_alarm_timer *alarm_timer;
>  
> -static TimerList *clock_to_timerlist(QEMUClock *clock)
> +static TimerList *clock_to_timerlist(QEMUClock *clock, AioContext *ctx)
>  {
>      int type = clock->type;
>  
> -    return  &timer_list[type];
> +    assert(ctx);
> +    return  &ctx->timer_list[type];
>  }
>  
>  static bool qemu_timer_expired_ns(QEMUTimer *timer_head, int64_t current_time)
> @@ -112,7 +107,8 @@ static bool qemu_timer_expired_ns(QEMUTimer *timer_head, int64_t current_time)
>      return timer_head && (timer_head->expire_time <= current_time);
>  }
>  
> -static int64_t qemu_next_clock_deadline(QEMUClock *clock, int64_t delta)
> +static int64_t qemu_next_clock_deadline(QEMUClock *clock, int64_t delta,
> +    AioContext *ctx)
>  {
>      int64_t expire_time, next;
>      bool has_timer = false;
> @@ -122,7 +118,7 @@ static int64_t qemu_next_clock_deadline(QEMUClock *clock, int64_t delta)
>          return delta;
>      }
>  
> -    tlist = clock_to_timerlist(clock);
> +    tlist = clock_to_timerlist(clock, ctx);
>      qemu_mutex_lock(&tlist->active_timers_lock);
>      if (tlist->active_timers) {
>          has_timer = true;
> @@ -140,12 +136,13 @@ static int64_t qemu_next_clock_deadline(QEMUClock *clock, int64_t delta)
>  static int64_t qemu_next_alarm_deadline(void)
>  {
>      int64_t delta = INT64_MAX;
> +    AioContext *ctx = *tls_get_thread_aio_context();
>  
>      if (!use_icount) {
> -        delta = qemu_next_clock_deadline(vm_clock, delta);
> +        delta = qemu_next_clock_deadline(vm_clock, delta, ctx);
>      }
> -    delta = qemu_next_clock_deadline(host_clock, delta);
> -    return qemu_next_clock_deadline(rt_clock, delta);
> +    delta = qemu_next_clock_deadline(host_clock, delta, ctx);
> +    return qemu_next_clock_deadline(rt_clock, delta, ctx);
>  }
>  
>  static void qemu_rearm_alarm_timer(struct qemu_alarm_timer *t)
> @@ -267,16 +264,21 @@ QEMUClock *rt_clock;
>  QEMUClock *vm_clock;
>  QEMUClock *host_clock;
>  
> -static void timer_list_init(TimerList *tlist)
> +void timer_list_init(TimerList *tlist)
>  {
>      qemu_mutex_init(&tlist->active_timers_lock);
>      tlist->active_timers = NULL;
>  }
>  
> +void timer_list_finalize(TimerList *tlist)
> +{
> +    qemu_mutex_destroy(&tlist->active_timers_lock);
> +    assert(!tlist->active_timers);
> +}
> +
>  static QEMUClock *qemu_new_clock(int type)
>  {
>      QEMUClock *clock;
> -    TimerList *tlist;
>  
>      clock = g_malloc0(sizeof(QEMUClock));
>      clock->type = type;
> @@ -286,8 +288,6 @@ static QEMUClock *qemu_new_clock(int type)
>      qemu_cond_init(&clock->wait_using);
>      qemu_mutex_init(&clock->lock);
>      notifier_list_init(&clock->reset_notifiers);
> -    tlist = clock_to_timerlist(clock);
> -    timer_list_init(tlist);
>  
>      return clock;
>  }
> @@ -308,10 +308,14 @@ void qemu_clock_enable(QEMUClock *clock, bool enabled)
>      }
>  }
>  
> +/* qemu_clock_has_timers, qemu_clock_expired and qemu_clock_deadline only
> + * run in TCG icount mode, where there is a single AioContext, i.e.
> + * qemu_aio_context.  So we only count the timers on qemu_aio_context.
> + */
>  int64_t qemu_clock_has_timers(QEMUClock *clock)
>  {
>      bool has_timers;
> -    TimerList *tlist = clock_to_timerlist(clock);
> +    TimerList *tlist = clock_to_timerlist(clock, qemu_get_aio_context());
>  
>      qemu_mutex_lock(&tlist->active_timers_lock);
>      has_timers = !!tlist->active_timers;
> @@ -323,7 +327,7 @@ int64_t qemu_clock_expired(QEMUClock *clock)
>  {
>      bool has_timers;
>      int64_t expire_time;
> -    TimerList *tlist = clock_to_timerlist(clock);
> +    TimerList *tlist = clock_to_timerlist(clock, qemu_get_aio_context());
>  
>      qemu_mutex_lock(&tlist->active_timers_lock);
>      has_timers = tlist->active_timers;
> @@ -339,7 +343,7 @@ int64_t qemu_clock_deadline(QEMUClock *clock)
>      int64_t delta = INT32_MAX;
>      bool has_timers;
>      int64_t expire_time;
> -    TimerList *tlist = clock_to_timerlist(clock);
> +    TimerList *tlist = clock_to_timerlist(clock, qemu_get_aio_context());
>  
>      qemu_mutex_lock(&tlist->active_timers_lock);
>      has_timers = tlist->active_timers;
> @@ -355,19 +359,26 @@ int64_t qemu_clock_deadline(QEMUClock *clock)
>      return delta;
>  }
>  
> -QEMUTimer *qemu_new_timer(QEMUClock *clock, int scale,
> -                          QEMUTimerCB *cb, void *opaque)
> +QEMUTimer *aioctx_new_timer(QEMUClock *clock, int scale,
> +                          QEMUTimerCB *cb, void *opaque, AioContext *ctx)
>  {
>      QEMUTimer *ts;
>  
>      ts = g_malloc0(sizeof(QEMUTimer));
> -    ts->list = clock_to_timerlist(clock);
> +    ts->list = clock_to_timerlist(clock, ctx);
>      ts->cb = cb;
>      ts->opaque = opaque;
>      ts->scale = scale;
> +    ts->ctx = ctx;
>      return ts;
>  }
>  
> +QEMUTimer *qemu_new_timer(QEMUClock *clock, int scale,
> +                          QEMUTimerCB *cb, void *opaque)
> +{
> +    return aioctx_new_timer(clock, scale, cb, opaque, qemu_get_aio_context());
> +}
> +
>  void qemu_free_timer(QEMUTimer *ts)
>  {
>      g_free(ts);
> @@ -457,6 +468,7 @@ void qemu_run_timers(QEMUClock *clock)
>      QEMUTimer *ts;
>      int64_t current_time;
>      TimerList *tlist;
> +    AioContext *ctx;
>  
>      atomic_inc(&clock->using);
>      if (unlikely(!clock->enabled)) {
> @@ -465,7 +477,8 @@ void qemu_run_timers(QEMUClock *clock)
>  
>  
>      current_time = qemu_get_clock_ns(clock);
> -    tlist = clock_to_timerlist(clock);
> +    ctx = *tls_get_thread_aio_context();
> +    tlist = clock_to_timerlist(clock, ctx);
>  
>      for(;;) {
>          qemu_mutex_lock(&tlist->active_timers_lock);
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 1/5] timer: protect timers_state with lock
  2013-07-29  6:26   ` Paolo Bonzini
@ 2013-07-29  8:01     ` liu ping fan
  0 siblings, 0 replies; 38+ messages in thread
From: liu ping fan @ 2013-07-29  8:01 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, qemu-devel, Alex Bligh,
	Anthony Liguori

On Mon, Jul 29, 2013 at 2:26 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 29/07/2013 05:16, Liu Ping Fan ha scritto:
>> In KVM mode, vm_clock may be read from AioContexts outside the BQL (see
>> the next patch). This exposes timers_state -- the foundation of vm_clock --
>> to races. Use a private lock to protect it.
>> Note that in TCG mode, vm_clock is still read inside the BQL, so the
>> icount code is left unchanged.
>>
>> Lock rule: the private lock is innermost, i.e. BQL -> "this lock".
>
> This is quite expensive; on the other hand it is probably a good thing
> to do even without the other patches.
>
> Can you use the seqlock primitive we initially worked on for the memory
> dispatch (with the BQL on the write side)?
>
Fine, of course, seqlock is a better one.

Thanks,
Pingfan
> Paolo
>
>> Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>> ---
>>  cpus.c | 26 +++++++++++++++++++++-----
>>  1 file changed, 21 insertions(+), 5 deletions(-)
>>
>> diff --git a/cpus.c b/cpus.c
>> index 61e86a8..4af81e9 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -112,7 +112,9 @@ typedef struct TimersState {
>>      int64_t dummy;
>>  } TimersState;
>>
>> -TimersState timers_state;
>> +static TimersState timers_state;
>> +/* lock rule: innermost */
>> +static QemuMutex timers_state_lock;
>>
>>  /* Return the virtual CPU time, based on the instruction counter.  */
>>  int64_t cpu_get_icount(void)
>> @@ -134,11 +136,14 @@ int64_t cpu_get_icount(void)
>>  /* return the host CPU cycle counter and handle stop/restart */
>>  int64_t cpu_get_ticks(void)
>>  {
>> +    int64_t ret;
>> +
>>      if (use_icount) {
>>          return cpu_get_icount();
>>      }
>> +    qemu_mutex_lock(&timers_state_lock);
>>      if (!timers_state.cpu_ticks_enabled) {
>> -        return timers_state.cpu_ticks_offset;
>> +        ret = timers_state.cpu_ticks_offset;
>>      } else {
>>          int64_t ticks;
>>          ticks = cpu_get_real_ticks();
>> @@ -148,41 +153,51 @@ int64_t cpu_get_ticks(void)
>>              timers_state.cpu_ticks_offset += timers_state.cpu_ticks_prev - ticks;
>>          }
>>          timers_state.cpu_ticks_prev = ticks;
>> -        return ticks + timers_state.cpu_ticks_offset;
>> +        ret = ticks + timers_state.cpu_ticks_offset;
>>      }
>> +    qemu_mutex_unlock(&timers_state_lock);
>> +    return ret;
>>  }
>>
>>  /* return the host CPU monotonic timer and handle stop/restart */
>>  int64_t cpu_get_clock(void)
>>  {
>>      int64_t ti;
>> +
>> +    qemu_mutex_lock(&timers_state_lock);
>>      if (!timers_state.cpu_ticks_enabled) {
>> -        return timers_state.cpu_clock_offset;
>> +        ti = timers_state.cpu_clock_offset;
>>      } else {
>>          ti = get_clock();
>> -        return ti + timers_state.cpu_clock_offset;
>> +        ti += timers_state.cpu_clock_offset;
>>      }
>> +    qemu_mutex_unlock(&timers_state_lock);
>> +    return ti;
>>  }
>>
>>  /* enable cpu_get_ticks() */
>>  void cpu_enable_ticks(void)
>>  {
>> +    qemu_mutex_lock(&timers_state_lock);
>>      if (!timers_state.cpu_ticks_enabled) {
>>          timers_state.cpu_ticks_offset -= cpu_get_real_ticks();
>>          timers_state.cpu_clock_offset -= get_clock();
>>          timers_state.cpu_ticks_enabled = 1;
>>      }
>> +    qemu_mutex_unlock(&timers_state_lock);
>>  }
>>
>>  /* disable cpu_get_ticks() : the clock is stopped. You must not call
>>     cpu_get_ticks() after that.  */
>>  void cpu_disable_ticks(void)
>>  {
>> +    qemu_mutex_lock(&timers_state_lock);
>>      if (timers_state.cpu_ticks_enabled) {
>>          timers_state.cpu_ticks_offset = cpu_get_ticks();
>>          timers_state.cpu_clock_offset = cpu_get_clock();
>>          timers_state.cpu_ticks_enabled = 0;
>>      }
>> +    qemu_mutex_unlock(&timers_state_lock);
>>  }
>>
>>  /* Correlation between real and virtual time is always going to be
>> @@ -353,6 +368,7 @@ static const VMStateDescription vmstate_timers = {
>>
>>  void configure_icount(const char *option)
>>  {
>> +    qemu_mutex_init(&timers_state_lock);
>>      vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
>>      if (!option) {
>>          return;
>>
>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-07-29  6:30   ` Paolo Bonzini
@ 2013-07-29  8:10     ` liu ping fan
  2013-07-29 11:21       ` Paolo Bonzini
  0 siblings, 1 reply; 38+ messages in thread
From: liu ping fan @ 2013-07-29  8:10 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, qemu-devel, Alex Bligh,
	Anthony Liguori

On Mon, Jul 29, 2013 at 2:30 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 29/07/2013 05:16, Liu Ping Fan ha scritto:
>> After disabling the QemuClock, we should make sure that no QemuTimers
>> are still in flight. To implement that, the caller of disabling will
>> wait until the last user's exit.
>>
>> Note, the callers of qemu_clock_enable() should be sync by themselves,
>> not protected by this patch.
>>
>> Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>
> This is an interesting approach.
>
>> -    if (!clock->enabled)
>> -        return;
>> +    atomic_inc(&clock->using);
>> +    if (unlikely(!clock->enabled)) {
>> +        goto exit;
>> +    }
>
> This can return directly, it doesn't need to increment and decrement
> clock->using.
>
There is a race condition like the following, which is why I swapped the sequence:

     qemu_clock_enable(false)               run timers

                                            if (!clock->enabled) return;   /* still enabled, go on */
     clock->enabled = false;
     if (atomic_read(&clock->using) == 0)   /* sees 0: no timer cb in flight */
                                            atomic_inc(&clock->using);     /* cbs run anyway */
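
To make the fix concrete, a compressed sketch of the ordering the patch
relies on; the qemu_clock_enable() side is reconstructed from the
description above rather than copied from the patch, so treat it as a
sketch only:

    /* timer side: announce ourselves before looking at clock->enabled */
    void qemu_run_timers(QEMUClock *clock)
    {
        atomic_inc(&clock->using);
        if (unlikely(!clock->enabled)) {
            goto exit;
        }
        /* ... run expired timers ... */
    exit:
        qemu_mutex_lock(&clock->lock);
        if (atomic_fetch_dec(&clock->using) == 1 && !clock->enabled) {
            qemu_cond_signal(&clock->wait_using);   /* wake the disabler */
        }
        qemu_mutex_unlock(&clock->lock);
    }

    /* disabling side: publish enabled = false, then wait for users to drain */
    void qemu_clock_enable(QEMUClock *clock, bool enabled)
    {
        clock->enabled = enabled;
        if (!enabled) {
            qemu_mutex_lock(&clock->lock);
            while (atomic_read(&clock->using)) {
                qemu_cond_wait(&clock->wait_using, &clock->lock);
            }
            qemu_mutex_unlock(&clock->lock);
        }
    }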


Regards,
Pingfan
> Paolo
>
>>
>>      current_time = qemu_get_clock_ns(clock);
>>      tlist = clock_to_timerlist(clock);
>> @@ -461,6 +482,15 @@ void qemu_run_timers(QEMUClock *clock)
>>          /* run the callback (the timer list can be modified) */
>>          ts->cb(ts->opaque);
>>      }
>> +
>> +exit:
>> +    qemu_mutex_lock(&clock->lock);
>> +    if (atomic_fetch_dec(&clock->using) == 1) {
>> +        if (unlikely(!clock->enabled)) {
>> +            qemu_cond_signal(&clock->wait_using);
>> +        }
>> +    }
>> +    qemu_mutex_unlock(&clock->lock);
>>  }
>>
>>  int64_t qemu_get_clock_ns(QEMUClock *clock)
>>
>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 4/5] timer: associate three timerlists with AioContext
  2013-07-29  6:32   ` Paolo Bonzini
@ 2013-07-29  8:20     ` liu ping fan
  2013-07-29 13:11       ` Paolo Bonzini
  0 siblings, 1 reply; 38+ messages in thread
From: liu ping fan @ 2013-07-29  8:20 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, qemu-devel, Alex Bligh,
	Anthony Liguori

On Mon, Jul 29, 2013 at 2:32 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 29/07/2013 05:16, Liu Ping Fan ha scritto:
>> Currently, timers run on iothread inside QBL, this limits the usage
>> of timers in some case, e.g. virtio-blk-dataplane. In order to run
>> timers on private thread based on different clocksource, we arm each
>> AioContext with three timer lists in according to three clocksource
>> (QemuClock).
>>
>> A little later, we will run timers in aio_poll.
>>
>> Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>>
>> ------ issue to fix ---
>> Note: before this patch, there should be another one to fix the race
>> issue by qemu_mod_timer() and _run_timers().
>
> Another issue is that deadline computation is not using the AioContext's
> timer lists.
>
Sorry, I cannot follow your meaning. When an AioContext has its own timer
lists, it will have its own deadline (for the ppoll timeout). So the
computation should be based on the AioContext's timer lists, right?

Thanks and regards,
Pingfan
> Paolo
>
>> ---
>>  async.c              |  9 ++++++++
>>  include/block/aio.h  | 13 +++++++++++
>>  include/qemu/timer.h | 20 ++++++++++++++++
>>  qemu-timer.c         | 65 +++++++++++++++++++++++++++++++---------------------
>>  4 files changed, 81 insertions(+), 26 deletions(-)
>>
>> diff --git a/async.c b/async.c
>> index ba4072c..7e2340e 100644
>> --- a/async.c
>> +++ b/async.c
>> @@ -201,11 +201,15 @@ static void
>>  aio_ctx_finalize(GSource     *source)
>>  {
>>      AioContext *ctx = (AioContext *) source;
>> +    int i;
>>
>>      thread_pool_free(ctx->thread_pool);
>>      aio_set_event_notifier(ctx, &ctx->notifier, NULL, NULL);
>>      event_notifier_cleanup(&ctx->notifier);
>>      g_array_free(ctx->pollfds, TRUE);
>> +    for (i = 0; i < QEMU_CLOCK_MAXCNT; i++) {
>> +        timer_list_finalize(&ctx->timer_list[i]);
>> +    }
>>  }
>>
>>  static GSourceFuncs aio_source_funcs = {
>> @@ -237,6 +241,8 @@ void aio_notify(AioContext *ctx)
>>  AioContext *aio_context_new(void)
>>  {
>>      AioContext *ctx;
>> +    int i;
>> +
>>      ctx = (AioContext *) g_source_new(&aio_source_funcs, sizeof(AioContext));
>>      ctx->pollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD));
>>      ctx->thread_pool = NULL;
>> @@ -245,6 +251,9 @@ AioContext *aio_context_new(void)
>>      aio_set_event_notifier(ctx, &ctx->notifier,
>>                             (EventNotifierHandler *)
>>                             event_notifier_test_and_clear, NULL);
>> +    for (i = 0; i < QEMU_CLOCK_MAXCNT; i++) {
>> +        timer_list_init(&ctx->timer_list[i]);
>> +    }
>>
>>      return ctx;
>>  }
>> diff --git a/include/block/aio.h b/include/block/aio.h
>> index 04598b2..cf41b42 100644
>> --- a/include/block/aio.h
>> +++ b/include/block/aio.h
>> @@ -43,6 +43,18 @@ typedef struct AioHandler AioHandler;
>>  typedef void QEMUBHFunc(void *opaque);
>>  typedef void IOHandler(void *opaque);
>>
>> +/* Related timer with AioContext */
>> +typedef struct QEMUTimer QEMUTimer;
>> +#define QEMU_CLOCK_MAXCNT 3
>> +
>> +typedef struct TimerList {
>> +    QEMUTimer *active_timers;
>> +    QemuMutex active_timers_lock;
>> +} TimerList;
>> +
>> +void timer_list_init(TimerList *tlist);
>> +void timer_list_finalize(TimerList *tlist);
>> +
>>  typedef struct AioContext {
>>      GSource source;
>>
>> @@ -73,6 +85,7 @@ typedef struct AioContext {
>>
>>      /* Thread pool for performing work and receiving completion callbacks */
>>      struct ThreadPool *thread_pool;
>> +    TimerList timer_list[QEMU_CLOCK_MAXCNT];
>>  } AioContext;
>>
>>  /* Returns 1 if there are still outstanding AIO requests; 0 otherwise */
>> diff --git a/include/qemu/timer.h b/include/qemu/timer.h
>> index 9dd206c..3e5016b 100644
>> --- a/include/qemu/timer.h
>> +++ b/include/qemu/timer.h
>> @@ -33,9 +33,14 @@ extern QEMUClock *vm_clock;
>>  extern QEMUClock *host_clock;
>>
>>  int64_t qemu_get_clock_ns(QEMUClock *clock);
>> +/* qemu_clock_has_timers, qemu_clock_expired, qemu_clock_deadline
>> + * run In tcg icount mode. There is only one AioContext i.e. qemu_aio_context.
>> + * So we only count the timers on qemu_aio_context.
>> + */
>>  int64_t qemu_clock_has_timers(QEMUClock *clock);
>>  int64_t qemu_clock_expired(QEMUClock *clock);
>>  int64_t qemu_clock_deadline(QEMUClock *clock);
>> +
>>  void qemu_clock_enable(QEMUClock *clock, bool enabled);
>>  void qemu_clock_warp(QEMUClock *clock);
>>
>> @@ -45,6 +50,9 @@ void qemu_unregister_clock_reset_notifier(QEMUClock *clock,
>>
>>  QEMUTimer *qemu_new_timer(QEMUClock *clock, int scale,
>>                            QEMUTimerCB *cb, void *opaque);
>> +QEMUTimer *aioctx_new_timer(QEMUClock *clock, int scale,
>> +                          QEMUTimerCB *cb, void *opaque, AioContext *ctx);
>> +
>>  void qemu_free_timer(QEMUTimer *ts);
>>  void qemu_del_timer(QEMUTimer *ts);
>>  void qemu_mod_timer_ns(QEMUTimer *ts, int64_t expire_time);
>> @@ -75,6 +83,18 @@ static inline QEMUTimer *qemu_new_timer_ms(QEMUClock *clock, QEMUTimerCB *cb,
>>      return qemu_new_timer(clock, SCALE_MS, cb, opaque);
>>  }
>>
>> +static inline QEMUTimer *aioctx_new_timer_ns(QEMUClock *clock, QEMUTimerCB *cb,
>> +                                           void *opaque, AioContext *ctx)
>> +{
>> +    return aioctx_new_timer(clock, SCALE_NS, cb, opaque, ctx);
>> +}
>> +
>> +static inline QEMUTimer *aioctx_new_timer_ms(QEMUClock *clock, QEMUTimerCB *cb,
>> +                                           void *opaque, AioContext *ctx)
>> +{
>> +    return aioctx_new_timer(clock, SCALE_MS, cb, opaque, ctx);
>> +}
>> +
>>  static inline int64_t qemu_get_clock_ms(QEMUClock *clock)
>>  {
>>      return qemu_get_clock_ns(clock) / SCALE_MS;
>> diff --git a/qemu-timer.c b/qemu-timer.c
>> index d941a83..f15c3e6 100644
>> --- a/qemu-timer.c
>> +++ b/qemu-timer.c
>> @@ -45,14 +45,6 @@
>>  #define QEMU_CLOCK_REALTIME 0
>>  #define QEMU_CLOCK_VIRTUAL  1
>>  #define QEMU_CLOCK_HOST     2
>> -#define QEMU_CLOCK_MAXCNT 3
>> -
>> -typedef struct TimerList {
>> -    QEMUTimer *active_timers;
>> -    QemuMutex active_timers_lock;
>> -} TimerList;
>> -
>> -static TimerList timer_list[QEMU_CLOCK_MAXCNT];
>>
>>  struct QEMUClock {
>>      NotifierList reset_notifiers;
>> @@ -72,7 +64,9 @@ struct QEMUClock {
>>
>>  struct QEMUTimer {
>>      int64_t expire_time;     /* in nanoseconds */
>> +    /* quick link to AioContext timer list */
>>      TimerList *list;
>> +    AioContext *ctx;
>>      QEMUTimerCB *cb;
>>      void *opaque;
>>      QEMUTimer *next;
>> @@ -100,11 +94,12 @@ struct qemu_alarm_timer {
>>
>>  static struct qemu_alarm_timer *alarm_timer;
>>
>> -static TimerList *clock_to_timerlist(QEMUClock *clock)
>> +static TimerList *clock_to_timerlist(QEMUClock *clock, AioContext *ctx)
>>  {
>>      int type = clock->type;
>>
>> -    return  &timer_list[type];
>> +    assert(ctx);
>> +    return  &ctx->timer_list[type];
>>  }
>>
>>  static bool qemu_timer_expired_ns(QEMUTimer *timer_head, int64_t current_time)
>> @@ -112,7 +107,8 @@ static bool qemu_timer_expired_ns(QEMUTimer *timer_head, int64_t current_time)
>>      return timer_head && (timer_head->expire_time <= current_time);
>>  }
>>
>> -static int64_t qemu_next_clock_deadline(QEMUClock *clock, int64_t delta)
>> +static int64_t qemu_next_clock_deadline(QEMUClock *clock, int64_t delta,
>> +    AioContext *ctx)
>>  {
>>      int64_t expire_time, next;
>>      bool has_timer = false;
>> @@ -122,7 +118,7 @@ static int64_t qemu_next_clock_deadline(QEMUClock *clock, int64_t delta)
>>          return delta;
>>      }
>>
>> -    tlist = clock_to_timerlist(clock);
>> +    tlist = clock_to_timerlist(clock, ctx);
>>      qemu_mutex_lock(&tlist->active_timers_lock);
>>      if (tlist->active_timers) {
>>          has_timer = true;
>> @@ -140,12 +136,13 @@ static int64_t qemu_next_clock_deadline(QEMUClock *clock, int64_t delta)
>>  static int64_t qemu_next_alarm_deadline(void)
>>  {
>>      int64_t delta = INT64_MAX;
>> +    AioContext *ctx = *tls_get_thread_aio_context();
>>
>>      if (!use_icount) {
>> -        delta = qemu_next_clock_deadline(vm_clock, delta);
>> +        delta = qemu_next_clock_deadline(vm_clock, delta, ctx);
>>      }
>> -    delta = qemu_next_clock_deadline(host_clock, delta);
>> -    return qemu_next_clock_deadline(rt_clock, delta);
>> +    delta = qemu_next_clock_deadline(host_clock, delta, ctx);
>> +    return qemu_next_clock_deadline(rt_clock, delta, ctx);
>>  }
>>
>>  static void qemu_rearm_alarm_timer(struct qemu_alarm_timer *t)
>> @@ -267,16 +264,21 @@ QEMUClock *rt_clock;
>>  QEMUClock *vm_clock;
>>  QEMUClock *host_clock;
>>
>> -static void timer_list_init(TimerList *tlist)
>> +void timer_list_init(TimerList *tlist)
>>  {
>>      qemu_mutex_init(&tlist->active_timers_lock);
>>      tlist->active_timers = NULL;
>>  }
>>
>> +void timer_list_finalize(TimerList *tlist)
>> +{
>> +    qemu_mutex_destroy(&tlist->active_timers_lock);
>> +    assert(!tlist->active_timers);
>> +}
>> +
>>  static QEMUClock *qemu_new_clock(int type)
>>  {
>>      QEMUClock *clock;
>> -    TimerList *tlist;
>>
>>      clock = g_malloc0(sizeof(QEMUClock));
>>      clock->type = type;
>> @@ -286,8 +288,6 @@ static QEMUClock *qemu_new_clock(int type)
>>      qemu_cond_init(&clock->wait_using);
>>      qemu_mutex_init(&clock->lock);
>>      notifier_list_init(&clock->reset_notifiers);
>> -    tlist = clock_to_timerlist(clock);
>> -    timer_list_init(tlist);
>>
>>      return clock;
>>  }
>> @@ -308,10 +308,14 @@ void qemu_clock_enable(QEMUClock *clock, bool enabled)
>>      }
>>  }
>>
>> +/* qemu_clock_has_timers, qemu_clock_expired, qemu_clock_deadline
>> +  * run In tcg icount mode. There is only one AioContext i.e. qemu_aio_context.
>> + * So we only count the timers on qemu_aio_context.
>> +*/
>>  int64_t qemu_clock_has_timers(QEMUClock *clock)
>>  {
>>      bool has_timers;
>> -    TimerList *tlist = clock_to_timerlist(clock);
>> +    TimerList *tlist = clock_to_timerlist(clock, qemu_get_aio_context());
>>
>>      qemu_mutex_lock(&tlist->active_timers_lock);
>>      has_timers = !!tlist->active_timers;
>> @@ -323,7 +327,7 @@ int64_t qemu_clock_expired(QEMUClock *clock)
>>  {
>>      bool has_timers;
>>      int64_t expire_time;
>> -    TimerList *tlist = clock_to_timerlist(clock);
>> +    TimerList *tlist = clock_to_timerlist(clock, qemu_get_aio_context());
>>
>>      qemu_mutex_lock(&tlist->active_timers_lock);
>>      has_timers = tlist->active_timers;
>> @@ -339,7 +343,7 @@ int64_t qemu_clock_deadline(QEMUClock *clock)
>>      int64_t delta = INT32_MAX;
>>      bool has_timers;
>>      int64_t expire_time;
>> -    TimerList *tlist = clock_to_timerlist(clock);
>> +    TimerList *tlist = clock_to_timerlist(clock, qemu_get_aio_context());
>>
>>      qemu_mutex_lock(&tlist->active_timers_lock);
>>      has_timers = tlist->active_timers;
>> @@ -355,19 +359,26 @@ int64_t qemu_clock_deadline(QEMUClock *clock)
>>      return delta;
>>  }
>>
>> -QEMUTimer *qemu_new_timer(QEMUClock *clock, int scale,
>> -                          QEMUTimerCB *cb, void *opaque)
>> +QEMUTimer *aioctx_new_timer(QEMUClock *clock, int scale,
>> +                          QEMUTimerCB *cb, void *opaque, AioContext *ctx)
>>  {
>>      QEMUTimer *ts;
>>
>>      ts = g_malloc0(sizeof(QEMUTimer));
>> -    ts->list = clock_to_timerlist(clock);
>> +    ts->list = clock_to_timerlist(clock, ctx);
>>      ts->cb = cb;
>>      ts->opaque = opaque;
>>      ts->scale = scale;
>> +    ts->ctx = ctx;
>>      return ts;
>>  }
>>
>> +QEMUTimer *qemu_new_timer(QEMUClock *clock, int scale,
>> +                          QEMUTimerCB *cb, void *opaque)
>> +{
>> +    return aioctx_new_timer(clock, scale, cb, opaque, qemu_get_aio_context());
>> +}
>> +
>>  void qemu_free_timer(QEMUTimer *ts)
>>  {
>>      g_free(ts);
>> @@ -457,6 +468,7 @@ void qemu_run_timers(QEMUClock *clock)
>>      QEMUTimer *ts;
>>      int64_t current_time;
>>      TimerList *tlist;
>> +    AioContext *ctx;
>>
>>      atomic_inc(&clock->using);
>>      if (unlikely(!clock->enabled)) {
>> @@ -465,7 +477,8 @@ void qemu_run_timers(QEMUClock *clock)
>>
>>
>>      current_time = qemu_get_clock_ns(clock);
>> -    tlist = clock_to_timerlist(clock);
>> +    ctx = *tls_get_thread_aio_context();
>> +    tlist = clock_to_timerlist(clock, ctx);
>>
>>      for(;;) {
>>          qemu_mutex_lock(&tlist->active_timers_lock);
>>
>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff
  2013-07-29  3:16 [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff Liu Ping Fan
                   ` (4 preceding siblings ...)
  2013-07-29  3:16 ` [Qemu-devel] [RFC v2 5/5] timer: run timers on aio_poll Liu Ping Fan
@ 2013-07-29  9:22 ` Stefan Hajnoczi
  2013-07-29 10:23   ` Alex Bligh
  2013-07-30  3:35   ` liu ping fan
  2013-07-29 10:18 ` Alex Bligh
  6 siblings, 2 replies; 38+ messages in thread
From: Stefan Hajnoczi @ 2013-07-29  9:22 UTC (permalink / raw)
  To: Liu Ping Fan
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, qemu-devel, Alex Bligh,
	Anthony Liguori, Paolo Bonzini

On Mon, Jul 29, 2013 at 11:16:03AM +0800, Liu Ping Fan wrote:
> summary of the model:
>   Three qemu-wide clock source allowed in system. And each AioContext has
> three corresponding timer list to run timer against clocksource.
> 
> rfcv2:
>    drop patches about alarm-timer(if timeout of poll will not satisfy, will come back to it)
>    fix qemu_clock_enable sync problem (Thanks for Jan and Stefan)
>    fix process=true when aio_poll run timers (Thanks for Alex)
> 
> Liu Ping Fan (5):
>   timer: protect timers_state with lock
>   timer: pick out timer list info from QemuClock
>   timer: make qemu_clock_enable sync between disable and timer's cb
>   timer: associate three timerlists with AioContext
>   timer: run timers on aio_poll
> 
>  aio-posix.c          |   2 +
>  async.c              |   9 +++
>  cpus.c               |  26 ++++++--
>  include/block/aio.h  |  13 ++++
>  include/qemu/timer.h |  24 ++++++-
>  main-loop.c          |   2 -
>  qemu-timer.c         | 184 ++++++++++++++++++++++++++++++++++++---------------
>  7 files changed, 198 insertions(+), 62 deletions(-)

The potential for overlap with Alex Bligh's series is large.  Can you
base your patches on his v4?

It seems the difference is that you make clock sources to be available
globally while Alex's series uses rt_clock (no synchronization
necessary).

Stefan

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff
  2013-07-29  3:16 [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff Liu Ping Fan
                   ` (5 preceding siblings ...)
  2013-07-29  9:22 ` [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff Stefan Hajnoczi
@ 2013-07-29 10:18 ` Alex Bligh
  6 siblings, 0 replies; 38+ messages in thread
From: Alex Bligh @ 2013-07-29 10:18 UTC (permalink / raw)
  To: Liu Ping Fan, qemu-devel
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, Alex Bligh,
	Anthony Liguori, Paolo Bonzini



--On 29 July 2013 11:16:03 +0800 Liu Ping Fan <qemulist@gmail.com> wrote:

> summary of the model:
>   Three qemu-wide clock source allowed in system. And each AioContext has
> three corresponding timer list to run timer against clocksource.
>
> rfcv2:
>    drop patches about alarm-timer (if timeout of poll will not satisfy, will come back to it)
>    fix qemu_clock_enable sync problem (Thanks for Jan and Stefan)
>    fix process=true when aio_poll run timers (Thanks for Alex)

This seems to have a considerable degree of overlap with my PATCHv4
where I split QEMUClock into QEMUClock and QEMUTimerList and then
attached a QEMUTimerList to each AioContext.

-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff
  2013-07-29  9:22 ` [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff Stefan Hajnoczi
@ 2013-07-29 10:23   ` Alex Bligh
  2013-07-29 13:36     ` Stefan Hajnoczi
  2013-07-30  3:35   ` liu ping fan
  1 sibling, 1 reply; 38+ messages in thread
From: Alex Bligh @ 2013-07-29 10:23 UTC (permalink / raw)
  To: Stefan Hajnoczi, Liu Ping Fan
  Cc: Kevin Wolf, Stefan Hajnoczi, Alex Bligh, Jan Kiszka, qemu-devel,
	Anthony Liguori, Paolo Bonzini



--On 29 July 2013 11:22:13 +0200 Stefan Hajnoczi <stefanha@gmail.com> wrote:

> The potential for overlap with Alex Bligh's series is large.  Can you
> base your patches on his v4?
>
> It seems the difference is that you make clock sources to be available
> globally while Alex's series uses rt_clock (no synchronization
> necessary).

I should say PingFan has probably paid more attention to thread safety
than me, as my work was intended to be applied before AioContexts were
used by multiple threads.

-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-07-29  8:10     ` liu ping fan
@ 2013-07-29 11:21       ` Paolo Bonzini
  2013-07-30  2:42         ` liu ping fan
  0 siblings, 1 reply; 38+ messages in thread
From: Paolo Bonzini @ 2013-07-29 11:21 UTC (permalink / raw)
  To: liu ping fan
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, qemu-devel, Alex Bligh,
	Anthony Liguori

Il 29/07/2013 10:10, liu ping fan ha scritto:
> On Mon, Jul 29, 2013 at 2:30 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> Il 29/07/2013 05:16, Liu Ping Fan ha scritto:
>>> After disabling the QemuClock, we should make sure that no QemuTimers
>>> are still in flight. To implement that, the caller of disabling will
>>> wait until the last user's exit.
>>>
>>> Note, the callers of qemu_clock_enable() should be sync by themselves,
>>> not protected by this patch.
>>>
>>> Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>>
>> This is an interesting approach.
>>
>>> -    if (!clock->enabled)
>>> -        return;
>>> +    atomic_inc(&clock->using);
>>> +    if (unlikely(!clock->enabled)) {
>>> +        goto exit;
>>> +    }
>>
>> This can return directly, it doesn't need to increment and decrement
>> clock->using.
>>
> Here is a race condition like the following

Ah, I see.

Still this seems a bit backwards.  Most of the time you will have no one
on the wait_using condvar, but you are almost always signaling the
condvar (should be broadcast BTW)...

Paolo

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 4/5] timer: associate three timerlists with AioContext
  2013-07-29  8:20     ` liu ping fan
@ 2013-07-29 13:11       ` Paolo Bonzini
  2013-07-30  2:35         ` liu ping fan
  0 siblings, 1 reply; 38+ messages in thread
From: Paolo Bonzini @ 2013-07-29 13:11 UTC (permalink / raw)
  To: liu ping fan
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, qemu-devel, Alex Bligh,
	Anthony Liguori

Il 29/07/2013 10:20, liu ping fan ha scritto:
>> > Another issue is that deadline computation is not using the AioContext's
>> > timer lists.
>> >
> Sorry, can not catch the meaning. When AioContext has its own timer
> lists, it will have its own deadline(for ppoll timeout). So the
> computation should be based on AioContext's timer lists, right?

Associating the main loop's timer lists to other AioContexts is really
really wrong...  It is _already_ wrong to run timers in the main
AioContext's aio_poll (because timers may not be ready to run from an
arbitrary aio_poll), it is even worse to run them in other threads
without taking the BQL.

What exactly is the purpose of this series?

Paolo

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff
  2013-07-29 10:23   ` Alex Bligh
@ 2013-07-29 13:36     ` Stefan Hajnoczi
  2013-07-29 13:56       ` Alex Bligh
  0 siblings, 1 reply; 38+ messages in thread
From: Stefan Hajnoczi @ 2013-07-29 13:36 UTC (permalink / raw)
  To: Alex Bligh
  Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel, Liu Ping Fan,
	Anthony Liguori, Jan Kiszka, Paolo Bonzini

On Mon, Jul 29, 2013 at 11:23:52AM +0100, Alex Bligh wrote:
> 
> 
> --On 29 July 2013 11:22:13 +0200 Stefan Hajnoczi <stefanha@gmail.com> wrote:
> 
> >The potential for overlap with Alex Bligh's series is large.  Can you
> >base your patches on his v4?
> >
> >It seems the difference is that you make clock sources to be available
> >globally while Alex's series uses rt_clock (no synchronization
> >necessary).
> 
> I should say PingFan has probably paid more attention to thread safety
> than me, as my work was intended to be applied before AioContexts were
> used by multiple threads.

Exactly.  That's why I think these patches should be based on your
series - they add the thread safety on top.

Stefan

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff
  2013-07-29 13:36     ` Stefan Hajnoczi
@ 2013-07-29 13:56       ` Alex Bligh
  0 siblings, 0 replies; 38+ messages in thread
From: Alex Bligh @ 2013-07-29 13:56 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Alex Bligh, Stefan Hajnoczi, qemu-devel,
	Liu Ping Fan, Anthony Liguori, Jan Kiszka, Paolo Bonzini



--On 29 July 2013 15:36:03 +0200 Stefan Hajnoczi <stefanha@redhat.com> 
wrote:

>> I should say PingFan has probably paid more attention to thread safety
>> than me, as my work was intended to be applied before AioContexts were
>> used by multiple threads.
>
> Exactly.  That's why I think these patches should be based on your
> series - they add the thread safety on top.

OK. I think I'm pretty much done with my series and have incorporated
everyone's comments to date (save a couple of a minor nature that I
pushed back on and heard no response to).

The only other thing I was thinking of was, rather than having each
QemuTimerList hold an AioContext pointer so it knows it can use aio_notify,
providing a callback for notify (this is on clock enable). I think that's
probably polishing the ball, though.

On the thread issue, for my own education, if an AioContext's timers
only ever get called by that AioContext (and thus in the same thread),
which is the case in PATCHv4, I don't think there is a thread safety issue
beyond manipulation of the list of QemuTimerLists associated with
each QEMUClock. That's only ever modified when the AioContext is
created or deleted. This assumes that timers are always set from
the same thread they are run (i.e. there is only one thread
walking the QEMUTimerList).

-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 4/5] timer: associate three timerlists with AioContext
  2013-07-29 13:11       ` Paolo Bonzini
@ 2013-07-30  2:35         ` liu ping fan
  0 siblings, 0 replies; 38+ messages in thread
From: liu ping fan @ 2013-07-30  2:35 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, qemu-devel, Alex Bligh,
	Anthony Liguori

On Mon, Jul 29, 2013 at 9:11 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 29/07/2013 10:20, liu ping fan ha scritto:
>>> > Another issue is that deadline computation is not using the AioContext's
>>> > timer lists.
>>> >
>> Sorry, can not catch the meaning. When AioContext has its own timer
>> lists, it will have its own deadline(for ppoll timeout). So the
>> computation should be based on AioContext's timer lists, right?
>
> Associating the main loop's timer lists to other AioContexts is really

The main loop's timers will only be associated with qemu_aio_context,
not with other AioContexts. (When creating a timer, we bind it to a
specific AioContext, so it will only run on that AioContext.)

> really wrong...  It is _already_ wrong to run timers in the main
> AioContext's aio_poll (because timers may not be ready to run from an

Sorry, but apart from the case where a timer cb calls aio_poll, is there any other reason?
> arbitrary aio_poll), it is even worse to run them in other threads
> without taking the BQL.
>
Could that be fixed by some method to sync the dedicated thread with the vcpus?

> What exactly is the purpose of this series?
>
Enable timers to run on dedicated threads (using aio as the mini-loop),
currently for the following devices:
    virtio-blk-dataplane, for throttling
    hpet, for best-effort timer resolution
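
As a purely hypothetical usage sketch, using the aioctx_new_timer_ns()
helper from patch 4/5 (dataplane_arm_throttle() and throttle_cb() are
made-up names, not code from the series):

    /* the callback runs in whichever thread calls aio_poll() on 'ctx' */
    static void throttle_cb(void *opaque)
    {
        /* release queued requests, re-arm the timer, ... */
    }

    static void dataplane_arm_throttle(AioContext *ctx)
    {
        QEMUTimer *t = aioctx_new_timer_ns(vm_clock, throttle_cb, NULL, ctx);

        qemu_mod_timer_ns(t, qemu_get_clock_ns(vm_clock) + 10 * SCALE_MS);
    }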

Thanks and regards,
Pingfan
> Paolo

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-07-29 11:21       ` Paolo Bonzini
@ 2013-07-30  2:42         ` liu ping fan
  2013-07-30  9:17           ` Paolo Bonzini
  0 siblings, 1 reply; 38+ messages in thread
From: liu ping fan @ 2013-07-30  2:42 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, qemu-devel, Alex Bligh,
	Anthony Liguori

On Mon, Jul 29, 2013 at 7:21 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 29/07/2013 10:10, liu ping fan ha scritto:
>> On Mon, Jul 29, 2013 at 2:30 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> Il 29/07/2013 05:16, Liu Ping Fan ha scritto:
>>>> After disabling the QemuClock, we should make sure that no QemuTimers
>>>> are still in flight. To implement that, the caller of disabling will
>>>> wait until the last user's exit.
>>>>
>>>> Note, the callers of qemu_clock_enable() should be sync by themselves,
>>>> not protected by this patch.
>>>>
>>>> Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>>>
>>> This is an interesting approach.
>>>
>>>> -    if (!clock->enabled)
>>>> -        return;
>>>> +    atomic_inc(&clock->using);
>>>> +    if (unlikely(!clock->enabled)) {
>>>> +        goto exit;
>>>> +    }
>>>
>>> This can return directly, it doesn't need to increment and decrement
>>> clock->using.
>>>
>> Here is a race condition like the following
>
> Ah, I see.
>
> Still this seems a bit backwards.  Most of the time you will have no one
> on the wait_using condvar, but you are almost always signaling the
> condvar (should be broadcast BTW)...
>
I have tried to filter out the normal case with:
+    qemu_mutex_lock(&clock->lock);
+    if (atomic_fetch_dec(&clock->using) == 1) { ---------> 1st place
+        if (unlikely(!clock->enabled)) { -------> 2nd place
+            qemu_cond_signal(&clock->wait_using);
+        }
+    }
+    qemu_mutex_unlock(&clock->lock);

Could I do it better?

Thanks,
Pingfan

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff
  2013-07-29  9:22 ` [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff Stefan Hajnoczi
  2013-07-29 10:23   ` Alex Bligh
@ 2013-07-30  3:35   ` liu ping fan
  1 sibling, 0 replies; 38+ messages in thread
From: liu ping fan @ 2013-07-30  3:35 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, qemu-devel, Alex Bligh,
	Anthony Liguori, Paolo Bonzini

On Mon, Jul 29, 2013 at 5:22 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Mon, Jul 29, 2013 at 11:16:03AM +0800, Liu Ping Fan wrote:
>> summary of the model:
>>   Three qemu-wide clock source allowed in system. And each AioContext has
>> three corresponding timer list to run timer against clocksource.
>>
>> rfcv2:
>>    drop patches about alarm-timer(if timeout of poll will not satisfy, will come back to it)
>>    fix qemu_clock_enable sync problem (Thanks for Jan and Stefan)
>>    fix process=true when aio_poll run timers (Thanks for Alex)
>>
>> Liu Ping Fan (5):
>>   timer: protect timers_state with lock
>>   timer: pick out timer list info from QemuClock
>>   timer: make qemu_clock_enable sync between disable and timer's cb
>>   timer: associate three timerlists with AioContext
>>   timer: run timers on aio_poll
>>
>>  aio-posix.c          |   2 +
>>  async.c              |   9 +++
>>  cpus.c               |  26 ++++++--
>>  include/block/aio.h  |  13 ++++
>>  include/qemu/timer.h |  24 ++++++-
>>  main-loop.c          |   2 -
>>  qemu-timer.c         | 184 ++++++++++++++++++++++++++++++++++++---------------
>>  7 files changed, 198 insertions(+), 62 deletions(-)
>
> The potential for overlap with Alex Bligh's series is large.  Can you
> base your patches on his v4?
>
Re-checking the patches, Alex's patches can meet my need for the further
step of making hpet run in a dedicated thread. So I will rebase the sync
part of my patches onto his.

> It seems the difference is that you make clock sources to be available
> globally while Alex's series uses rt_clock (no synchronization
> necessary).
>
> Stefan

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-07-30  2:42         ` liu ping fan
@ 2013-07-30  9:17           ` Paolo Bonzini
  2013-07-30  9:51             ` Alex Bligh
  2013-08-01  5:54             ` liu ping fan
  0 siblings, 2 replies; 38+ messages in thread
From: Paolo Bonzini @ 2013-07-30  9:17 UTC (permalink / raw)
  To: liu ping fan
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, qemu-devel, Alex Bligh,
	Anthony Liguori

Il 30/07/2013 04:42, liu ping fan ha scritto:
> On Mon, Jul 29, 2013 at 7:21 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> Il 29/07/2013 10:10, liu ping fan ha scritto:
>>> On Mon, Jul 29, 2013 at 2:30 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>> Il 29/07/2013 05:16, Liu Ping Fan ha scritto:
>>>>> After disabling the QemuClock, we should make sure that no QemuTimers
>>>>> are still in flight. To implement that, the caller of disabling will
>>>>> wait until the last user's exit.
>>>>>
>>>>> Note, the callers of qemu_clock_enable() should be sync by themselves,
>>>>> not protected by this patch.
>>>>>
>>>>> Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>>>>
>>>> This is an interesting approach.
>>>>
>>>>> -    if (!clock->enabled)
>>>>> -        return;
>>>>> +    atomic_inc(&clock->using);
>>>>> +    if (unlikely(!clock->enabled)) {
>>>>> +        goto exit;
>>>>> +    }
>>>>
>>>> This can return directly, it doesn't need to increment and decrement
>>>> clock->using.
>>>>
>>> Here is a race condition like the following
>>
>> Ah, I see.
>>
>> Still this seems a bit backwards.  Most of the time you will have no one
>> on the wait_using condvar, but you are almost always signaling the
>> condvar (should be broadcast BTW)...
>>
> I have tried to filter out the normal case by
> +    qemu_mutex_lock(&clock->lock);
> +    if (atomic_fetch_dec(&clock->using) == 1) { ---------> 1st place
> +        if (unlikely(!clock->enabled)) { -------> 2nd place
> +            qemu_cond_signal(&clock->wait_using);
> +        }
> +    }
> +    qemu_mutex_unlock(&clock->lock);
> 
> Could I do it better?

Hmm, do we even need clock->using at this point?  For example:

   qemu_clock_enable()
   {
       clock->enabled = enabled;
       ...
       if (!enabled) {
           /* If another thread is within qemu_run_timers,
            * wait for it to finish.
            */
           qemu_event_wait(&clock->callbacks_done_event);
       }
   }

   qemu_run_timers()
   {
       qemu_event_reset(&clock->callbacks_done_event);
       if (!clock->enabled) {
           goto out;
       }
       ...
   out:
        qemu_event_set(&clock->callbacks_done_event);
   }

In the fast path this only does two atomic operations (an OR for reset,
and XCHG for set).

Paolo

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-07-30  9:17           ` Paolo Bonzini
@ 2013-07-30  9:51             ` Alex Bligh
  2013-07-30 10:12               ` Paolo Bonzini
  2013-08-01  5:54             ` liu ping fan
  1 sibling, 1 reply; 38+ messages in thread
From: Alex Bligh @ 2013-07-30  9:51 UTC (permalink / raw)
  To: Paolo Bonzini, liu ping fan
  Cc: Kevin Wolf, Alex Bligh, Jan Kiszka, Stefan Hajnoczi, qemu-devel,
	Anthony Liguori

Paolo,

--On 30 July 2013 11:17:05 +0200 Paolo Bonzini <pbonzini@redhat.com> wrote:

> Hmm, do we even need clock->using at this point?  For example:
>
>    qemu_clock_enable()
>    {
>        clock->enabled = enabled;
>        ...
>        if (!enabled) {
>            /* If another thread is within qemu_run_timers,
>             * wait for it to finish.
>             */
>            qemu_event_wait(&clock->callbacks_done_event);
>        }
>    }
>
>    qemu_run_timers()
>    {
>        qemu_event_reset(&clock->callbacks_done_event);
>        if (!clock->enabled) {
>            goto out;
>        }
>        ...
>    out:
>        qemu_event_set(&eclock->callbacks_done_event);
>    }
>
> In the fast path this only does two atomic operations (an OR for reset,
> and XCHG for set).

I think I'm missing something here. If Pingfan is rebasing on top
of my patches, or is doing vaguely the same thing, then
qemu_clock_enable will do two things:

1. It will walk the list of QEMUTimerLists
2. For each QemuTimerList associated with the clock, it will call
   qemu_notify or aio_notify

The list of QEMUTimerLists is only accessed by qemu_clock_enable
(to do task 2) and by the creation and deletion of QEMUTimerLists,
which happen only in init and AioContext create/delete (which should
be very rare). I'm presuming qemu_clock_enable(false) is also
pretty rare. It seems to me there is no need to do anything more
complex than a simple mutex protecting the list of QEMUTimerLists of
each QEMUClock.
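
Something along these lines, with the struct layout, list head and field
names invented purely for illustration (this is not code from either series):

    typedef struct QEMUTimerList {
        QEMUTimer *active_timers;
        QemuMutex active_timers_lock;
        AioContext *aio_ctx;                     /* for aio_notify()          */
        QLIST_ENTRY(QEMUTimerList) clock_link;   /* link on clock->timerlists */
    } QEMUTimerList;

    struct QEMUClock {
        QLIST_HEAD(, QEMUTimerList) timerlists;  /* all lists on this clock */
        QemuMutex timerlists_lock;               /* protects only that list */
        bool enabled;
        int type;
    };

    void qemu_clock_enable(QEMUClock *clock, bool enabled)
    {
        bool old = clock->enabled;
        QEMUTimerList *tl;

        clock->enabled = enabled;
        if (enabled && !old) {
            qemu_mutex_lock(&clock->timerlists_lock);
            QLIST_FOREACH(tl, &clock->timerlists, clock_link) {
                /* wake the owning thread so it recomputes its poll deadline */
                aio_notify(tl->aio_ctx);
            }
            qemu_mutex_unlock(&clock->timerlists_lock);
        }
    }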

As far as walking the QEMUTimerList itself is concerned, this is
something which is 99.999% done by the thread owning the AioContext.
qemu_clock_enable should not even be walking this list. So I don't
see why the protection here is needed.

-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-07-30  9:51             ` Alex Bligh
@ 2013-07-30 10:12               ` Paolo Bonzini
  0 siblings, 0 replies; 38+ messages in thread
From: Paolo Bonzini @ 2013-07-30 10:12 UTC (permalink / raw)
  To: Alex Bligh
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, liu ping fan,
	qemu-devel, Anthony Liguori

Il 30/07/2013 11:51, Alex Bligh ha scritto:
> As far as walking the QEMUTimerList itself is concerned, this is
> something which is 99.999% done by the thread owning the AioContext.
> qemu_clock_enable should not even be walking this list. So I don't
> see why the protection here is needed.

The protection is needed not because of qemu_clock_enable, but rather
because of code in qemu_clock_enable's caller.  Such code likely expects
not to run concurrently with timers.

qemu_clock_enable however can be called from other threads than the one
owning the AioContext.  Furthermore, it can happen while timer callbacks
are being called, because callbacks are called without holding any lock.

If you put together these conditions, qemu_clock_enable has to wait for
timer callbacks to finish running before returning.

Paolo

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-07-30  9:17           ` Paolo Bonzini
  2013-07-30  9:51             ` Alex Bligh
@ 2013-08-01  5:54             ` liu ping fan
  2013-08-01  8:57               ` Paolo Bonzini
  1 sibling, 1 reply; 38+ messages in thread
From: liu ping fan @ 2013-08-01  5:54 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, qemu-devel, Alex Bligh,
	Anthony Liguori

On Tue, Jul 30, 2013 at 5:17 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 30/07/2013 04:42, liu ping fan ha scritto:
>> On Mon, Jul 29, 2013 at 7:21 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> Il 29/07/2013 10:10, liu ping fan ha scritto:
>>>> On Mon, Jul 29, 2013 at 2:30 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>>> Il 29/07/2013 05:16, Liu Ping Fan ha scritto:
>>>>>> After disabling the QemuClock, we should make sure that no QemuTimers
>>>>>> are still in flight. To implement that, the caller of disabling will
>>>>>> wait until the last user's exit.
>>>>>>
>>>>>> Note, the callers of qemu_clock_enable() should be sync by themselves,
>>>>>> not protected by this patch.
>>>>>>
>>>>>> Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>>>>>
>>>>> This is an interesting approach.
>>>>>
>>>>>> -    if (!clock->enabled)
>>>>>> -        return;
>>>>>> +    atomic_inc(&clock->using);
>>>>>> +    if (unlikely(!clock->enabled)) {
>>>>>> +        goto exit;
>>>>>> +    }
>>>>>
>>>>> This can return directly, it doesn't need to increment and decrement
>>>>> clock->using.
>>>>>
>>>> Here is a race condition like the following
>>>
>>> Ah, I see.
>>>
>>> Still this seems a bit backwards.  Most of the time you will have no one
>>> on the wait_using condvar, but you are almost always signaling the
>>> condvar (should be broadcast BTW)...
>>>
>> I have tried to filter out the normal case by
>> +    qemu_mutex_lock(&clock->lock);
>> +    if (atomic_fetch_dec(&clock->using) == 1) { ---------> 1st place
>> +        if (unlikely(!clock->enabled)) { -------> 2nd place
>> +            qemu_cond_signal(&clock->wait_using);
>> +        }
>> +    }
>> +    qemu_mutex_unlock(&clock->lock);
>>
>> Could I do it better?
>
> Hmm, do we even need clock->using at this point?  For example:
>
>    qemu_clock_enable()
>    {
>        clock->enabled = enabled;
>        ...
>        if (!enabled) {
>            /* If another thread is within qemu_run_timers,
>             * wait for it to finish.
>             */
>            qemu_event_wait(&clock->callbacks_done_event);
>        }
>    }
>
>    qemu_run_timers()
>    {
>        qemu_event_reset(&clock->callbacks_done_event);
>        if (!clock->enabled) {
>            goto out;
>        }
>        ...
>    out:
>        qemu_event_set(&eclock->callbacks_done_event);
>    }
>
> In the fast path this only does two atomic operations (an OR for reset,
> and XCHG for set).
>
There is a race condition; suppose the following scenario with threads A and B:
A: qemu_event_reset()
B: qemu_event_reset()
A: qemu_event_set()    ----> B is still in flight when qemu_clock_enable() is notified
B: qemu_event_set()

I had tried to build something around futex(2) like qemu_event, but failed.

Thanks and regards,
Pingfan

> Paolo

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-08-01  5:54             ` liu ping fan
@ 2013-08-01  8:57               ` Paolo Bonzini
  2013-08-01  9:35                 ` Alex Bligh
  0 siblings, 1 reply; 38+ messages in thread
From: Paolo Bonzini @ 2013-08-01  8:57 UTC (permalink / raw)
  To: liu ping fan
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, qemu-devel, Alex Bligh,
	Anthony Liguori

> > Hmm, do we even need clock->using at this point?  For example:
> >
> >    qemu_clock_enable()
> >    {
> >        clock->enabled = enabled;
> >        ...
> >        if (!enabled) {
> >            /* If another thread is within qemu_run_timers,
> >             * wait for it to finish.
> >             */
> >            qemu_event_wait(&clock->callbacks_done_event);
> >        }
> >    }
> >
> >    qemu_run_timers()
> >    {
> >        qemu_event_reset(&clock->callbacks_done_event);
> >        if (!clock->enabled) {
> >            goto out;
> >        }
> >        ...
> >    out:
> >        qemu_event_set(&eclock->callbacks_done_event);
> >    }
> >
> > In the fast path this only does two atomic operations (an OR for reset,
> > and XCHG for set).
> >
> There is race condition, suppose the following scenario with A/B thread
> A: qemu_event_reset()
> B: qemu_event_reset()
> A: qemu_event_set() ----> B is still in flight when
> qemu_clock_enable() is notified
> B: qemu_event_set()
> 
> I had tried to build something around futex(2) like qemu_event, but failed.

True, qemu_event basically works only when a single thread resets it.  But
there is no race condition here because qemu_run_timers cannot be executed
concurrently by multiple threads (like aio_poll in your bottom half patches).

Paolo

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-08-01  8:57               ` Paolo Bonzini
@ 2013-08-01  9:35                 ` Alex Bligh
  2013-08-01 12:19                   ` Paolo Bonzini
  0 siblings, 1 reply; 38+ messages in thread
From: Alex Bligh @ 2013-08-01  9:35 UTC (permalink / raw)
  To: Paolo Bonzini, liu ping fan
  Cc: Kevin Wolf, Alex Bligh, Jan Kiszka, Stefan Hajnoczi, qemu-devel,
	Anthony Liguori



--On 1 August 2013 04:57:52 -0400 Paolo Bonzini <pbonzini@redhat.com> wrote:

> True, qemu_event basically works only when a single thread resets it.  But
> there is no race condition here because qemu_run_timers cannot be executed
> concurrently by multiple threads (like aio_poll in your bottom half
> patches).

... or, if rebasing on top of my patches, qemu_run_timers *can* be
executed concurrently by multiple threads, but in respect of any given
QEMUTimerList, it will only be executed by one thread.

-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-08-01  9:35                 ` Alex Bligh
@ 2013-08-01 12:19                   ` Paolo Bonzini
  2013-08-01 13:28                     ` Alex Bligh
  0 siblings, 1 reply; 38+ messages in thread
From: Paolo Bonzini @ 2013-08-01 12:19 UTC (permalink / raw)
  To: Alex Bligh
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, liu ping fan,
	qemu-devel, Anthony Liguori

> > True, qemu_event basically works only when a single thread resets it.  But
> > there is no race condition here because qemu_run_timers cannot be executed
> > concurrently by multiple threads (like aio_poll in your bottom half
> > patches).
> 
> ... or, if rebasing on top of my patches, qemu_run_timers *can* be
> executed concurrently by mulitple threads, but in respect of any given
> QEMUTimerList, it will only be executed by one thread.

... so the event would be placed in QEMUTimerList, not QEMUClock.

qemu_clock_enable then will have to visit all timer lists.  That's
not hard to do, but as locks proliferate we need to have a clear
policy (e.g. BQL -> clock -> timerlist).

So actually there is another problem with this patch (both the
condvar and the event approach are equally buggy).  If a timer
on clock X disables clock X, qemu_clock_enable will deadlock.

I suppose it's easier to ask each user of qemu_clock_enable to
provide its own synchronization, and drop this patch.  The simplest
kind of synchronization is to delay some work to a bottom half in
the clock's AioContext.  If you do that, you know that the timers
are not running.

Ping Fan, this teaches once more the same lesson: let's not invent
complex concurrency mechanisms unless really necessary.  So far
they have almost never survived (there are some exceptions, but we have
always taken them from somewhere else: QemuEvent is an abstraction
of liburcu code to make it portable, RCU and seqlock are from Linux,
and I cannot think of anything else).

Paolo

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-08-01 12:19                   ` Paolo Bonzini
@ 2013-08-01 13:28                     ` Alex Bligh
  2013-08-01 13:51                       ` Paolo Bonzini
  2013-08-02  3:33                       ` liu ping fan
  0 siblings, 2 replies; 38+ messages in thread
From: Alex Bligh @ 2013-08-01 13:28 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Stefan Hajnoczi, Alex Bligh, Jan Kiszka,
	liu ping fan, qemu-devel, Anthony Liguori

Paolo,

--On 1 August 2013 08:19:34 -0400 Paolo Bonzini <pbonzini@redhat.com> wrote:

>> > True, qemu_event basically works only when a single thread resets it.
>> > But there is no race condition here because qemu_run_timers cannot be
>> > executed concurrently by multiple threads (like aio_poll in your
>> > bottom half patches).
>>
>> ... or, if rebasing on top of my patches, qemu_run_timers *can* be
>> executed concurrently by mulitple threads, but in respect of any given
>> QEMUTimerList, it will only be executed by one thread.
>
> ... so the event would be placed in QEMUTimerList, not QEMUClock.

Correct

> qemu_clock_enable then will have to visit all timer lists. That's
> not hard to do,

Correct, v4 of my patch does this.

> but as locks proliferate we need to have a clear
> policy (e.g. BQL -> clock -> timerlist).

But doesn't do the locking bit yet (Pingfan is going to do that I hope)

> So actually there is another problem with this patch (both the
> condvar and the event approach are equally buggy).  If a timer
> on clock X disables clock X, qemu_clock_enable will deadlock.

Yes. I believe there will be a similar problem if a timer
created or deleted an AioContext (clearly less likely!)

> I suppose it's easier to ask each user of qemu_clock_enable to
> provide its own synchronization, and drop this patch.  The simplest
> kind of synchronization is to delay some work to a bottom half in
> the clock's AioContext.  If you do that, you know that the timers
> are not running.

I'm not sure that's true. If two AioContexts run in different
threads, would their BH's and timers not also run in those two
different threads?

-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-08-01 13:28                     ` Alex Bligh
@ 2013-08-01 13:51                       ` Paolo Bonzini
  2013-08-01 14:20                         ` Alex Bligh
  2013-08-02  3:33                       ` liu ping fan
  1 sibling, 1 reply; 38+ messages in thread
From: Paolo Bonzini @ 2013-08-01 13:51 UTC (permalink / raw)
  To: Alex Bligh
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, liu ping fan,
	qemu-devel, Anthony Liguori

On Aug 01 2013, Alex Bligh wrote:
> >So actually there is another problem with this patch (both the
> >condvar and the event approach are equally buggy).  If a timer
> >on clock X disables clock X, qemu_clock_enable will deadlock.
> 
> Yes. I believe there will be a similar problem if a timer
> created or deleted an AioContext (clearly less likely!)
> 
> >I suppose it's easier to ask each user of qemu_clock_enable to
> >provide its own synchronization, and drop this patch.  The simplest
> >kind of synchronization is to delay some work to a bottom half in
> >the clock's AioContext.  If you do that, you know that the timers
> >are not running.
> 
> I'm not sure that's true. If two AioContexts run in different
> threads, would their BH's and timers not also run in those two
> different threads?

Suppose a thread wants to do qemu_clock_enable(foo, false), and the
code after qemu_clock_enable assumes that no timers are running.
Then you should move that code to a bottom half in foo's AioContext.
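
For example, a sketch of that deferral (assuming for the moment that the
clock's timers all run in a single AioContext; the function names here are
invented):

    static void after_disable_bh(void *opaque)
    {
        /* Runs from foo_ctx's aio_poll() between callbacks, so no timer of
         * the disabled clock is executing at this point.  A real version
         * would also delete the bottom half. */
    }

    static void disable_clock_deferred(QEMUClock *clock, AioContext *foo_ctx)
    {
        QEMUBH *bh = aio_bh_new(foo_ctx, after_disable_bh, NULL);

        qemu_clock_enable(clock, false);
        qemu_bh_schedule(bh);
    }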

Paolo

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-08-01 13:51                       ` Paolo Bonzini
@ 2013-08-01 14:20                         ` Alex Bligh
  2013-08-01 14:28                           ` Paolo Bonzini
  0 siblings, 1 reply; 38+ messages in thread
From: Alex Bligh @ 2013-08-01 14:20 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Stefan Hajnoczi, Alex Bligh, Jan Kiszka,
	liu ping fan, qemu-devel, Anthony Liguori

Paolo,

--On 1 August 2013 15:51:11 +0200 Paolo Bonzini <pbonzini@redhat.com> wrote:

>> > So actually there is another problem with this patch (both the
>> > condvar and the event approach are equally buggy).  If a timer
>> > on clock X disables clock X, qemu_clock_enable will deadlock.
>>
>> Yes. I believe there will be a similar problem if a timer
>> created or deleted an AioContext (clearly less likely!)
>>
>> > I suppose it's easier to ask each user of qemu_clock_enable to
>> > provide its own synchronization, and drop this patch.  The simplest
>> > kind of synchronization is to delay some work to a bottom half in
>> > the clock's AioContext.  If you do that, you know that the timers
>> > are not running.
>>
>> I'm not sure that's true. If two AioContexts run in different
>> threads, would their BH's and timers not also run in those two
>> different threads?
>
> Suppose a thread wants to do qemu_clock_enable(foo, false), and the
> code after qemu_clock_enable assumes that no timers are running.
> Then you should move that code to a bottom half in foo's AioContext.

foo is a QEMUClock here.

A QEMUClock may not have just one AioContext. It could have several,
each operated by a different thread.


-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-08-01 14:20                         ` Alex Bligh
@ 2013-08-01 14:28                           ` Paolo Bonzini
  2013-08-02  3:31                             ` liu ping fan
  0 siblings, 1 reply; 38+ messages in thread
From: Paolo Bonzini @ 2013-08-01 14:28 UTC (permalink / raw)
  To: Alex Bligh
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, liu ping fan,
	qemu-devel, Anthony Liguori

 On Aug 01 2013, Alex Bligh wrote:
> Paolo,
> 
> --On 1 August 2013 15:51:11 +0200 Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> >>> So actually there is another problem with this patch (both the
> >>> condvar and the event approach are equally buggy).  If a timer
> >>> on clock X disables clock X, qemu_clock_enable will deadlock.
> >>
> >>Yes. I believe there will be a similar problem if a timer
> >>created or deleted an AioContext (clearly less likely!)
> >>
> >>> I suppose it's easier to ask each user of qemu_clock_enable to
> >>> provide its own synchronization, and drop this patch.  The simplest
> >>> kind of synchronization is to delay some work to a bottom half in
> >>> the clock's AioContext.  If you do that, you know that the timers
> >>> are not running.
> >>
> >>I'm not sure that's true. If two AioContexts run in different
> >>threads, would their BH's and timers not also run in those two
> >>different threads?
> >
> >Suppose a thread wants to do qemu_clock_enable(foo, false), and the
> >code after qemu_clock_enable assumes that no timers are running.
> >Then you should move that code to a bottom half in foo's AioContext.
> 
> foo is a QEMUClock here.
> 
> A QEMUClock may not have just one AioContext. It could have several
> each operated by a different thread.

Oops, you're right.

Checking the code for callers of qemu_clock_enable() it seems like there
shouldn't be any deadlocks.  So it should work if qemu_clock_enable
walks all of the timerlists and waits on each event.

But we should document the expectations since they are different from
the current code.
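
A sketch of that scheme, reusing the qemu_event_* calls from the earlier
pseudocode (the field names and the list walk are invented for
illustration, not the final code):

    /* run only by the single thread that owns this timer list */
    void timerlist_run_timers(QEMUTimerList *tl)
    {
        qemu_event_reset(&tl->timers_done_ev);
        if (!tl->clock->enabled) {
            goto out;
        }
        /* ... run the expired timers on this list ... */
    out:
        qemu_event_set(&tl->timers_done_ev);
    }

    void qemu_clock_enable(QEMUClock *clock, bool enabled)
    {
        bool old = clock->enabled;
        QEMUTimerList *tl;

        clock->enabled = enabled;
        if (!enabled && old) {
            /* after this loop no timer callback of this clock is in flight */
            QLIST_FOREACH(tl, &clock->timerlists, clock_link) {
                qemu_event_wait(&tl->timers_done_ev);
            }
        }
    }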

Paolo

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-08-01 14:28                           ` Paolo Bonzini
@ 2013-08-02  3:31                             ` liu ping fan
  2013-08-02 10:01                               ` Paolo Bonzini
  0 siblings, 1 reply; 38+ messages in thread
From: liu ping fan @ 2013-08-02  3:31 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Stefan Hajnoczi, Alex Bligh, Jan Kiszka, qemu-devel,
	Anthony Liguori

On Thu, Aug 1, 2013 at 10:28 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>  On Aug 01 2013, Alex Bligh wrote:
>> Paolo,
>>
>> --On 1 August 2013 15:51:11 +0200 Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> >>> So actually there is another problem with this patch (both the
>> >>> condvar and the event approach are equally buggy).  If a timer
>> >>> on clock X disables clock X, qemu_clock_enable will deadlock.
>> >>
>> >>Yes. I believe there will be a similar problem if a timer
>> >>created or deleted an AioContext (clearly less likely!)
>> >>
>> >>> I suppose it's easier to ask each user of qemu_clock_enable to
>> >>> provide its own synchronization, and drop this patch.  The simplest
>> >>> kind of synchronization is to delay some work to a bottom half in
>> >>> the clock's AioContext.  If you do that, you know that the timers
>> >>> are not running.
>> >>
>> >>I'm not sure that's true. If two AioContexts run in different
>> >>threads, would their BH's and timers not also run in those two
>> >>different threads?
>> >
>> >Suppose a thread wants to do qemu_clock_enable(foo, false), and the
>> >code after qemu_clock_enable assumes that no timers are running.
>> >Then you should move that code to a bottom half in foo's AioContext.
>>
>> foo is a QEMUClock here.
>>
>> A QEMUClock may not have just one AioContext. It could have several
>> each operated by a different thread.
>
> Oops, you're right.
>
> Checking the code for callers of qemu_clock_enable() it seems like there
> shouldn't be any deadlocks.  So it should work if qemu_clock_enable

Do you mean the caller of qemu_clock_enable(foo, false) can NOT be a
timer cb? As for this deadlock issue, how about making
qemu_clock_enable(foo, false) ASYNC, and forcing the caller to sync
explicitly?

> walks all of the timerlists and waits on each event.
>
> But we should document the expectations since they are different from
> the current code.
>
> Paolo
>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-08-01 13:28                     ` Alex Bligh
  2013-08-01 13:51                       ` Paolo Bonzini
@ 2013-08-02  3:33                       ` liu ping fan
  2013-08-02 14:43                         ` Stefan Hajnoczi
  1 sibling, 1 reply; 38+ messages in thread
From: liu ping fan @ 2013-08-02  3:33 UTC (permalink / raw)
  To: Alex Bligh
  Cc: Kevin Wolf, Stefan Hajnoczi, Jan Kiszka, qemu-devel,
	Anthony Liguori, Paolo Bonzini

On Thu, Aug 1, 2013 at 9:28 PM, Alex Bligh <alex@alex.org.uk> wrote:
> Paolo,
>
>
> --On 1 August 2013 08:19:34 -0400 Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>>> > True, qemu_event basically works only when a single thread resets it.
>>> > But there is no race condition here because qemu_run_timers cannot be
>>> > executed concurrently by multiple threads (like aio_poll in your
>>> > bottom half patches).
>>>
>>> ... or, if rebasing on top of my patches, qemu_run_timers *can* be
> >>> executed concurrently by multiple threads, but in respect of any given
>>> QEMUTimerList, it will only be executed by one thread.
>>
>>
>> ... so the event would be placed in QEMUTimerList, not QEMUClock.
>
>
> Correct
>
>
>> qemu_clock_enable then will have to visit all timer lists. That's
>> not hard to do,
>
>
> Correct, v4 of my patch does this.
>
>
>> but as locks proliferate we need to have a clear
>> policy (e.g. BQL -> clock -> timerlist).
>
>
> But doesn't do the locking bit yet (Pingfan is going to do that I hope)
>
Seems that Stefanha has been involved in this, and sent out his patches :)
>
>> So actually there is another problem with this patch (both the
>> condvar and the event approach are equally buggy).  If a timer
>> on clock X disables clock X, qemu_clock_enable will deadlock.
>
>
> Yes. I believe there will be a similar problem if a timer
> created or deleted an AioContext (clearly less likely!)
>
>
>> I suppose it's easier to ask each user of qemu_clock_enable to
>> provide its own synchronization, and drop this patch.  The simplest
>> kind of synchronization is to delay some work to a bottom half in
>> the clock's AioContext.  If you do that, you know that the timers
>> are not running.
>
>
> I'm not sure that's true. If two AioContexts run in different
> threads, would their BH's and timers not also run in those two
> different threads?
>
> --
> Alex Bligh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-08-02  3:31                             ` liu ping fan
@ 2013-08-02 10:01                               ` Paolo Bonzini
  0 siblings, 0 replies; 38+ messages in thread
From: Paolo Bonzini @ 2013-08-02 10:01 UTC (permalink / raw)
  To: liu ping fan
  Cc: Kevin Wolf, Stefan Hajnoczi, Alex Bligh, Jan Kiszka, qemu-devel,
	Anthony Liguori

 On Aug 02 2013, liu ping fan wrote:
> On Thu, Aug 1, 2013 at 10:28 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > > > > So actually there is another problem with this patch (both the
> > > > > > condvar and the event approach are equally buggy).  If a timer
> > > > > > on clock X disables clock X, qemu_clock_enable will deadlock.
> >
> > Checking the code for callers of qemu_clock_enable() it seems like there
> > shouldn't be any deadlocks.  So it should work if qemu_clock_enable
> 
> Do you mean the caller of qemu_clock_enable(foo,false) can NOT be
> timer cb?

Yes.

> As for this deadlock issue, making
> qemu_clock_enable(foo,false) ASYNC, and forcing the caller to sync
> explicitly ?

Right now the callers of qemu_clock_enable(foo, false) are safe.  So
the problem can be "fixed" just by adequate documentation.
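
Roughly the kind of comment that could sit next to the declaration
(the wording below is only a sketch, not text from the series):

    /*
     * qemu_clock_enable(clock, false) waits for the callbacks that are
     * currently running on every timer list of @clock to finish.  It
     * must therefore not be called from a timer callback of @clock,
     * nor from code that such a callback is waiting on; doing so
     * deadlocks.  Defer the call to a bottom half if necessary.
     */
    void qemu_clock_enable(QEMUClock *clock, bool enabled);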

> > But we should document the expectations since they are different from
> > the current code.
> >
> > Paolo

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-08-02  3:33                       ` liu ping fan
@ 2013-08-02 14:43                         ` Stefan Hajnoczi
  2013-08-05  2:13                           ` liu ping fan
  0 siblings, 1 reply; 38+ messages in thread
From: Stefan Hajnoczi @ 2013-08-02 14:43 UTC (permalink / raw)
  To: liu ping fan
  Cc: Kevin Wolf, Alex Bligh, Jan Kiszka, qemu-devel, Anthony Liguori,
	Paolo Bonzini

On Fri, Aug 02, 2013 at 11:33:40AM +0800, liu ping fan wrote:
> On Thu, Aug 1, 2013 at 9:28 PM, Alex Bligh <alex@alex.org.uk> wrote:
> > Paolo,
> >
> >
> > --On 1 August 2013 08:19:34 -0400 Paolo Bonzini <pbonzini@redhat.com> wrote:
> >
> >>> > True, qemu_event basically works only when a single thread resets it.
> >>> > But there is no race condition here because qemu_run_timers cannot be
> >>> > executed concurrently by multiple threads (like aio_poll in your
> >>> > bottom half patches).
> >>>
> >>> ... or, if rebasing on top of my patches, qemu_run_timers *can* be
> >>> executed concurrently by multiple threads, but in respect of any given
> >>> QEMUTimerList, it will only be executed by one thread.
> >>
> >>
> >> ... so the event would be placed in QEMUTimerList, not QEMUClock.
> >
> >
> > Correct
> >
> >
> >> qemu_clock_enable then will have to visit all timer lists. That's
> >> not hard to do,
> >
> >
> > Correct, v4 of my patch does this.
> >
> >
> >> but as locks proliferate we need to have a clear
> >> policy (e.g. BQL -> clock -> timerlist).
> >
> >
> > But doesn't do the locking bit yet (Pingfan is going to do that I hope)
> >
> Seems that Stefanha has been involved in this, and sent out his patches :)

Hi Ping Fan,
I dropped the series where I added locks to qemu-timer.c from a few
weeks ago.

With your series rebased onto Alex's series, I think my patches are no
longer needed?

Stefan

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb
  2013-08-02 14:43                         ` Stefan Hajnoczi
@ 2013-08-05  2:13                           ` liu ping fan
  0 siblings, 0 replies; 38+ messages in thread
From: liu ping fan @ 2013-08-05  2:13 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Alex Bligh, Jan Kiszka, qemu-devel, Anthony Liguori,
	Paolo Bonzini

On Fri, Aug 2, 2013 at 10:43 PM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> On Fri, Aug 02, 2013 at 11:33:40AM +0800, liu ping fan wrote:
>> On Thu, Aug 1, 2013 at 9:28 PM, Alex Bligh <alex@alex.org.uk> wrote:
>> > Paolo,
>> >
>> >
>> > --On 1 August 2013 08:19:34 -0400 Paolo Bonzini <pbonzini@redhat.com> wrote:
>> >
>> >>> > True, qemu_event basically works only when a single thread resets it.
>> >>> > But there is no race condition here because qemu_run_timers cannot be
>> >>> > executed concurrently by multiple threads (like aio_poll in your
>> >>> > bottom half patches).
>> >>>
>> >>> ... or, if rebasing on top of my patches, qemu_run_timers *can* be
>> >>> executed concurrently by mulitple threads, but in respect of any given
>> >>> QEMUTimerList, it will only be executed by one thread.
>> >>
>> >>
>> >> ... so the event would be placed in QEMUTimerList, not QEMUClock.
>> >
>> >
>> > Correct
>> >
>> >
>> >> qemu_clock_enable then will have to visit all timer lists. That's
>> >> not hard to do,
>> >
>> >
>> > Correct, v4 of my patch does this.
>> >
>> >
>> >> but as locks proliferate we need to have a clear
>> >> policy (e.g. BQL -> clock -> timerlist).
>> >
>> >
>> > But doesn't do the locking bit yet (Pingfan is going to do that I hope)
>> >
>> Seems that Stefanha has been involved in this, and sent out his patches :)
>
> Hi Ping Fan,
> I dropped the series where I added locks to qemu-timer.c from a few
> weeks ago.
>
> With your series rebased onto Alex's series, I think my patches are no
> longer needed?
>
No. The active_timers list, which is still touched by
qemu_mod_timer_ns(), still needs protection between the vcpu
threads, the iothread, etc.  So I think you may want to pick up
your patches again?
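
As a sketch of the kind of protection meant here (the lock placement,
the active_timers_lock field and the timer_list back-pointer are
assumptions; waking up the AioContext that owns the list afterwards is
omitted):

    void qemu_mod_timer_ns(QEMUTimer *ts, int64_t expire_time)
    {
        QEMUTimerList *timer_list = ts->timer_list;
        QEMUTimer **pt, *t;

        /* innermost lock, per the BQL -> clock -> timerlist ordering
         * discussed earlier in the thread */
        qemu_mutex_lock(&timer_list->active_timers_lock);

        /* unlink the timer if it is already pending */
        for (pt = &timer_list->active_timers; (t = *pt) != NULL; pt = &t->next) {
            if (t == ts) {
                *pt = t->next;
                break;
            }
        }

        /* re-insert in expire-time order */
        for (pt = &timer_list->active_timers; (t = *pt) != NULL; pt = &t->next) {
            if (t->expire_time > expire_time) {
                break;
            }
        }
        ts->expire_time = expire_time;
        ts->next = *pt;
        *pt = ts;

        qemu_mutex_unlock(&timer_list->active_timers_lock);
    }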

Regards,
Pingfan

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2013-08-05  2:13 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-29  3:16 [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff Liu Ping Fan
2013-07-29  3:16 ` [Qemu-devel] [RFC v2 1/5] timer: protect timers_state with lock Liu Ping Fan
2013-07-29  6:26   ` Paolo Bonzini
2013-07-29  8:01     ` liu ping fan
2013-07-29  3:16 ` [Qemu-devel] [RFC v2 2/5] timer: pick out timer list info from QemuClock Liu Ping Fan
2013-07-29  3:16 ` [Qemu-devel] [RFC v2 3/5] timer: make qemu_clock_enable sync between disable and timer's cb Liu Ping Fan
2013-07-29  6:30   ` Paolo Bonzini
2013-07-29  8:10     ` liu ping fan
2013-07-29 11:21       ` Paolo Bonzini
2013-07-30  2:42         ` liu ping fan
2013-07-30  9:17           ` Paolo Bonzini
2013-07-30  9:51             ` Alex Bligh
2013-07-30 10:12               ` Paolo Bonzini
2013-08-01  5:54             ` liu ping fan
2013-08-01  8:57               ` Paolo Bonzini
2013-08-01  9:35                 ` Alex Bligh
2013-08-01 12:19                   ` Paolo Bonzini
2013-08-01 13:28                     ` Alex Bligh
2013-08-01 13:51                       ` Paolo Bonzini
2013-08-01 14:20                         ` Alex Bligh
2013-08-01 14:28                           ` Paolo Bonzini
2013-08-02  3:31                             ` liu ping fan
2013-08-02 10:01                               ` Paolo Bonzini
2013-08-02  3:33                       ` liu ping fan
2013-08-02 14:43                         ` Stefan Hajnoczi
2013-08-05  2:13                           ` liu ping fan
2013-07-29  3:16 ` [Qemu-devel] [RFC v2 4/5] timer: associate three timerlists with AioContext Liu Ping Fan
2013-07-29  6:32   ` Paolo Bonzini
2013-07-29  8:20     ` liu ping fan
2013-07-29 13:11       ` Paolo Bonzini
2013-07-30  2:35         ` liu ping fan
2013-07-29  3:16 ` [Qemu-devel] [RFC v2 5/5] timer: run timers on aio_poll Liu Ping Fan
2013-07-29  9:22 ` [Qemu-devel] [RFC v2 0/5] arm AioContext with its own timer stuff Stefan Hajnoczi
2013-07-29 10:23   ` Alex Bligh
2013-07-29 13:36     ` Stefan Hajnoczi
2013-07-29 13:56       ` Alex Bligh
2013-07-30  3:35   ` liu ping fan
2013-07-29 10:18 ` Alex Bligh
