All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/6] tcg: fix icount super slowdown
@ 2017-03-03 13:11 Paolo Bonzini
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 1/5] qemu-timer: fix off-by-one Paolo Bonzini
                   ` (5 more replies)
  0 siblings, 6 replies; 26+ messages in thread
From: Paolo Bonzini @ 2017-03-03 13:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: alex.bennee

icount has become much slower after tcg_cpu_exec has stopped
using the BQL.  There is also a latent bug that is masked by
the slowness.

The slowness happens because every occurrence of a QEMU_CLOCK_VIRTUAL
timer now has to wake up the I/O thread and wait for it.  The rendez-vous
is mediated by the BQL QemuMutex:

- handle_icount_deadline wakes up the I/O thread with BQL taken
- the I/O thread wakes up and waits on the BQL 
- the VCPU thread releases the BQL a little later
- the I/O thread raises an interrupt, which calls qemu_cpu_kick
- the VCPU thread notices the interrupt, takes the BQL to
  process it and waits on it

All this back and forth is extremely expensive, causing a 6 to 8-fold
slowdown when icount is turned on.

One may think that the issue is that the VCPU thread is too dependent
on the BQL, but then the latent bug comes in.  I first tried removing
the BQL completely from the x86 cpu_exec.  Every guest thern hung, and
the only way to fix it (and make everything slow again) was to add a dummy
BQL lock/unlock pair to qemu_tcg_wait_io_event.

This is because in -icount mode you really have to process the events
before the CPU restarts executing the next instruction.  Therefore, this
series moves the processing of QEMU_CLOCK_VIRTUAL timers straight in
the vCPU thread when running in icount mode.  This is only limited to the
main TimerListGroup.  QEMU_CLOCK_VIRTUAL timers in AioContexts still run
outside the vCPU thread.

With this change, icount mode is pretty much running as fast as in 2.8.
I tested the patches are on top of Alex's series with both x86 and aarch64
guests, but they should be pretty much independent.

The good thing is that the infrastructure to do this is basically
already there, in the form of QEMUTimerListNotifyCB.  It only needs to
be generalized a bit (patches 2 and 3) and bugfixed (patch 1 and 4---the
latter is necessary to avoid the "I/O thread spun for 1000 iterations
and consequent slowing down of vCPU thread).

The bad things are:

- I am not sure of what was different before the patch that removed the
BQL from tcg_cpu_exec (and I don't really have time to profile it right
now---I should not be fixing this in fact...).

- the solution sounds a bit ugly and it probably is---though the patch
itself is pretty small, adding only about 30 lines of new code.

Paolo

Paolo Bonzini (5):
  qemu-timer: fix off-by-one
  qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h
  cpus: define QEMUTimerListNotifyCB for QEMU system emulation
  main-loop: remove now unnecessary optimization
  icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread

 cpu-exec.c                   |  1 +
 cpus.c                       | 29 +++++++++++++++++++++++++++--
 hw/core/ptimer.c             |  1 +
 include/qemu/timer.h         | 29 ++++++++++++++++++++++++++---
 include/sysemu/cpus.h        |  3 +++
 kvm-all.c                    |  1 +
 monitor.c                    |  1 +
 replay/replay.c              |  1 +
 stubs/cpu-get-icount.c       |  6 ++++++
 tests/test-aio-multithread.c |  2 +-
 tests/test-aio.c             |  2 +-
 translate-all.c              |  1 +
 util/async.c                 |  2 +-
 util/main-loop.c             |  3 ++-
 util/qemu-timer.c            | 17 ++++++++++-------
 vl.c                         |  5 +----
 16 files changed, 84 insertions(+), 20 deletions(-)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH 1/5] qemu-timer: fix off-by-one
  2017-03-03 13:11 [Qemu-devel] [PATCH 0/6] tcg: fix icount super slowdown Paolo Bonzini
@ 2017-03-03 13:11 ` Paolo Bonzini
  2017-03-03 13:48   ` Edgar E. Iglesias
  2017-03-10  9:46   ` Alex Bennée
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 2/5] qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h Paolo Bonzini
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 26+ messages in thread
From: Paolo Bonzini @ 2017-03-03 13:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: alex.bennee

If the first timer is exactly at the current value of the clock, the
deadline is met and the timer should fire.  This fixes itself without icount,
but with icount execution of instructions will stop exactly at the deadline.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/qemu-timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index 6cf70b9..2f20151 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -199,7 +199,7 @@ bool timerlist_expired(QEMUTimerList *timer_list)
     expire_time = timer_list->active_timers->expire_time;
     qemu_mutex_unlock(&timer_list->active_timers_lock);
 
-    return expire_time < qemu_clock_get_ns(timer_list->clock->type);
+    return expire_time <= qemu_clock_get_ns(timer_list->clock->type);
 }
 
 bool qemu_clock_expired(QEMUClockType type)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH 2/5] qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h
  2017-03-03 13:11 [Qemu-devel] [PATCH 0/6] tcg: fix icount super slowdown Paolo Bonzini
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 1/5] qemu-timer: fix off-by-one Paolo Bonzini
@ 2017-03-03 13:11 ` Paolo Bonzini
  2017-03-03 13:48   ` Edgar E. Iglesias
                     ` (3 more replies)
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 3/5] cpus: define QEMUTimerListNotifyCB for QEMU system emulation Paolo Bonzini
                   ` (3 subsequent siblings)
  5 siblings, 4 replies; 26+ messages in thread
From: Paolo Bonzini @ 2017-03-03 13:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: alex.bennee

This dependency is the wrong way, and we will need util/qemu-timer.h from sysemu/cpus.h
in the next patch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpu-exec.c            | 1 +
 hw/core/ptimer.c      | 1 +
 include/qemu/timer.h  | 1 -
 include/sysemu/cpus.h | 2 ++
 kvm-all.c             | 1 +
 monitor.c             | 1 +
 replay/replay.c       | 1 +
 translate-all.c       | 1 +
 util/main-loop.c      | 1 +
 util/qemu-timer.c     | 1 +
 10 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 1a5ad48..6dbb4da 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -33,6 +33,7 @@
 #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
 #include "hw/i386/apic.h"
 #endif
+#include "sysemu/cpus.h"
 #include "sysemu/replay.h"
 
 /* -icount align implementation. */
diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index 59ccb00..7221c68 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -13,6 +13,7 @@
 #include "sysemu/replay.h"
 #include "sysemu/qtest.h"
 #include "block/aio.h"
+#include "sysemu/cpus.h"
 
 #define DELTA_ADJUST     1
 #define DELTA_NO_ADJUST -1
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 26e6285..91cd8c8 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -4,7 +4,6 @@
 #include "qemu-common.h"
 #include "qemu/notify.h"
 #include "qemu/host-utils.h"
-#include "sysemu/cpus.h"
 
 #define NANOSECONDS_PER_SECOND 1000000000LL
 
diff --git a/include/sysemu/cpus.h b/include/sysemu/cpus.h
index a73b5d4..e521a91 100644
--- a/include/sysemu/cpus.h
+++ b/include/sysemu/cpus.h
@@ -1,6 +1,8 @@
 #ifndef QEMU_CPUS_H
 #define QEMU_CPUS_H
 
+#include "qemu/timer.h"
+
 /* cpus.c */
 bool qemu_in_vcpu_thread(void);
 void qemu_init_cpu_loop(void);
diff --git a/kvm-all.c b/kvm-all.c
index 0c94637..0fd6655 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -29,6 +29,7 @@
 #include "hw/s390x/adapter.h"
 #include "exec/gdbstub.h"
 #include "sysemu/kvm_int.h"
+#include "sysemu/cpus.h"
 #include "qemu/bswap.h"
 #include "exec/memory.h"
 #include "exec/ram_addr.h"
diff --git a/monitor.c b/monitor.c
index b68944d..d8fba78 100644
--- a/monitor.c
+++ b/monitor.c
@@ -77,6 +77,7 @@
 #include "qapi-event.h"
 #include "qmp-introspect.h"
 #include "sysemu/qtest.h"
+#include "sysemu/cpus.h"
 #include "qemu/cutils.h"
 #include "qapi/qmp/dispatch.h"
 
diff --git a/replay/replay.c b/replay/replay.c
index 1835b99..78e2a7e 100644
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -16,6 +16,7 @@
 #include "replay-internal.h"
 #include "qemu/timer.h"
 #include "qemu/main-loop.h"
+#include "sysemu/cpus.h"
 #include "sysemu/sysemu.h"
 #include "qemu/error-report.h"
 
diff --git a/translate-all.c b/translate-all.c
index 956d54b..63b8f56 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -57,6 +57,7 @@
 #include "qemu/timer.h"
 #include "qemu/main-loop.h"
 #include "exec/log.h"
+#include "sysemu/cpus.h"
 
 /* #define DEBUG_TB_INVALIDATE */
 /* #define DEBUG_TB_FLUSH */
diff --git a/util/main-loop.c b/util/main-loop.c
index ad10bca..0b2d8c0 100644
--- a/util/main-loop.c
+++ b/util/main-loop.c
@@ -28,6 +28,7 @@
 #include "qemu/timer.h"
 #include "qemu/sockets.h"	// struct in_addr needed for libslirp.h
 #include "sysemu/qtest.h"
+#include "sysemu/cpus.h"
 #include "slirp/libslirp.h"
 #include "qemu/main-loop.h"
 #include "block/aio.h"
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index 2f20151..ac99340 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -27,6 +27,7 @@
 #include "qemu/timer.h"
 #include "sysemu/replay.h"
 #include "sysemu/sysemu.h"
+#include "sysemu/cpus.h"
 
 #ifdef CONFIG_POSIX
 #include <pthread.h>
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH 3/5] cpus: define QEMUTimerListNotifyCB for QEMU system emulation
  2017-03-03 13:11 [Qemu-devel] [PATCH 0/6] tcg: fix icount super slowdown Paolo Bonzini
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 1/5] qemu-timer: fix off-by-one Paolo Bonzini
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 2/5] qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h Paolo Bonzini
@ 2017-03-03 13:11 ` Paolo Bonzini
  2017-03-03 13:53   ` Edgar E. Iglesias
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 4/5] main-loop: remove now unnecessary optimization Paolo Bonzini
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 26+ messages in thread
From: Paolo Bonzini @ 2017-03-03 13:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: alex.bennee

There is no change for now, because the callback just invokes
qemu_notify_event.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpus.c                       |  5 +++++
 include/qemu/timer.h         |  4 ++--
 include/sysemu/cpus.h        |  1 +
 stubs/cpu-get-icount.c       |  6 ++++++
 tests/test-aio-multithread.c |  2 +-
 tests/test-aio.c             |  2 +-
 util/async.c                 |  2 +-
 util/main-loop.c             |  2 +-
 util/qemu-timer.c            | 10 +++++-----
 9 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/cpus.c b/cpus.c
index 2459f56..747addd 100644
--- a/cpus.c
+++ b/cpus.c
@@ -804,6 +804,11 @@ static void qemu_cpu_kick_rr_cpu(void)
     } while (cpu != atomic_mb_read(&tcg_current_rr_cpu));
 }
 
+void qemu_timer_notify_cb(void *opaque, QEMUClockType type)
+{
+    qemu_notify_event();
+}
+
 static void kick_tcg_thread(void *opaque)
 {
     timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick());
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 91cd8c8..1441b42 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -59,7 +59,7 @@ struct QEMUTimerListGroup {
 };
 
 typedef void QEMUTimerCB(void *opaque);
-typedef void QEMUTimerListNotifyCB(void *opaque);
+typedef void QEMUTimerListNotifyCB(void *opaque, QEMUClockType type);
 
 struct QEMUTimer {
     int64_t expire_time;        /* in nanoseconds */
@@ -776,7 +776,7 @@ static inline int64_t qemu_soonest_timeout(int64_t timeout1, int64_t timeout2)
  *
  * Initialise the clock & timer infrastructure
  */
-void init_clocks(void);
+void init_clocks(QEMUTimerListNotifyCB *notify_cb);
 
 int64_t cpu_get_ticks(void);
 /* Caller must hold BQL */
diff --git a/include/sysemu/cpus.h b/include/sysemu/cpus.h
index e521a91..a8053f1 100644
--- a/include/sysemu/cpus.h
+++ b/include/sysemu/cpus.h
@@ -22,6 +22,7 @@ void dump_drift_info(FILE *f, fprintf_function cpu_fprintf);
 
 /* Unblock cpu */
 void qemu_cpu_kick_self(void);
+void qemu_timer_notify_cb(void *opaque, QEMUClockType type);
 
 void cpu_synchronize_all_states(void);
 void cpu_synchronize_all_post_reset(void);
diff --git a/stubs/cpu-get-icount.c b/stubs/cpu-get-icount.c
index 2e8b63b..0b7239d 100644
--- a/stubs/cpu-get-icount.c
+++ b/stubs/cpu-get-icount.c
@@ -2,6 +2,7 @@
 #include "qemu-common.h"
 #include "qemu/timer.h"
 #include "sysemu/cpus.h"
+#include "qemu/main-loop.h"
 
 int use_icount;
 
@@ -9,3 +10,8 @@ int64_t cpu_get_icount(void)
 {
     abort();
 }
+
+void qemu_timer_notify_cb(void *opaque, QEMUClockType type)
+{
+    qemu_notify_event();
+}
diff --git a/tests/test-aio-multithread.c b/tests/test-aio-multithread.c
index 8b0b40e..549d784 100644
--- a/tests/test-aio-multithread.c
+++ b/tests/test-aio-multithread.c
@@ -438,7 +438,7 @@ static void test_multi_mutex_10(void)
 
 int main(int argc, char **argv)
 {
-    init_clocks();
+    init_clocks(NULL);
 
     g_test_init(&argc, &argv, NULL);
     g_test_add_func("/aio/multi/lifecycle", test_lifecycle);
diff --git a/tests/test-aio.c b/tests/test-aio.c
index 2754f15..54e20d6 100644
--- a/tests/test-aio.c
+++ b/tests/test-aio.c
@@ -835,7 +835,7 @@ int main(int argc, char **argv)
     Error *local_error = NULL;
     GSource *src;
 
-    init_clocks();
+    init_clocks(NULL);
 
     ctx = aio_context_new(&local_error);
     if (!ctx) {
diff --git a/util/async.c b/util/async.c
index 7d469eb..663e297 100644
--- a/util/async.c
+++ b/util/async.c
@@ -351,7 +351,7 @@ void aio_notify_accept(AioContext *ctx)
     }
 }
 
-static void aio_timerlist_notify(void *opaque)
+static void aio_timerlist_notify(void *opaque, QEMUClockType type)
 {
     aio_notify(opaque);
 }
diff --git a/util/main-loop.c b/util/main-loop.c
index 0b2d8c0..a8c1c5b 100644
--- a/util/main-loop.c
+++ b/util/main-loop.c
@@ -147,7 +147,7 @@ int qemu_init_main_loop(Error **errp)
     GSource *src;
     Error *local_error = NULL;
 
-    init_clocks();
+    init_clocks(qemu_timer_notify_cb);
 
     ret = qemu_signal_init();
     if (ret) {
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index ac99340..dc3181e 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -122,7 +122,7 @@ void timerlist_free(QEMUTimerList *timer_list)
     g_free(timer_list);
 }
 
-static void qemu_clock_init(QEMUClockType type)
+static void qemu_clock_init(QEMUClockType type, QEMUTimerListNotifyCB *notify_cb)
 {
     QEMUClock *clock = qemu_clock_ptr(type);
 
@@ -134,7 +134,7 @@ static void qemu_clock_init(QEMUClockType type)
     clock->last = INT64_MIN;
     QLIST_INIT(&clock->timerlists);
     notifier_list_init(&clock->reset_notifiers);
-    main_loop_tlg.tl[type] = timerlist_new(type, NULL, NULL);
+    main_loop_tlg.tl[type] = timerlist_new(type, notify_cb, NULL);
 }
 
 bool qemu_clock_use_for_deadline(QEMUClockType type)
@@ -278,7 +278,7 @@ QEMUTimerList *qemu_clock_get_main_loop_timerlist(QEMUClockType type)
 void timerlist_notify(QEMUTimerList *timer_list)
 {
     if (timer_list->notify_cb) {
-        timer_list->notify_cb(timer_list->notify_opaque);
+        timer_list->notify_cb(timer_list->notify_opaque, timer_list->clock->type);
     } else {
         qemu_notify_event();
     }
@@ -635,11 +635,11 @@ void qemu_clock_unregister_reset_notifier(QEMUClockType type,
     notifier_remove(notifier);
 }
 
-void init_clocks(void)
+void init_clocks(QEMUTimerListNotifyCB *notify_cb)
 {
     QEMUClockType type;
     for (type = 0; type < QEMU_CLOCK_MAX; type++) {
-        qemu_clock_init(type);
+        qemu_clock_init(type, notify_cb);
     }
 
 #ifdef CONFIG_PRCTL_PR_SET_TIMERSLACK
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH 4/5] main-loop: remove now unnecessary optimization
  2017-03-03 13:11 [Qemu-devel] [PATCH 0/6] tcg: fix icount super slowdown Paolo Bonzini
                   ` (2 preceding siblings ...)
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 3/5] cpus: define QEMUTimerListNotifyCB for QEMU system emulation Paolo Bonzini
@ 2017-03-03 13:11 ` Paolo Bonzini
  2017-03-03 13:53   ` Edgar E. Iglesias
  2017-03-13 16:23   ` Alex Bennée
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 5/5] icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread Paolo Bonzini
  2017-03-09 17:19 ` [Qemu-devel] [PATCH 0/6] tcg: fix icount super slowdown Alex Bennée
  5 siblings, 2 replies; 26+ messages in thread
From: Paolo Bonzini @ 2017-03-03 13:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: alex.bennee

This optimization is not necessary anymore, because the vCPU now drops
the I/O thread lock even with TCG.  Drop it to simplify the code and
avoid the "I/O thread spun for 1000 iterations" warning.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 vl.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/vl.c b/vl.c
index bbbf1ba..b21b57e 100644
--- a/vl.c
+++ b/vl.c
@@ -1884,17 +1884,14 @@ static bool main_loop_should_exit(void)
 
 static void main_loop(void)
 {
-    bool nonblocking;
-    int last_io = 0;
 #ifdef CONFIG_PROFILER
     int64_t ti;
 #endif
     do {
-        nonblocking = tcg_enabled() && last_io > 0;
 #ifdef CONFIG_PROFILER
         ti = profile_getclock();
 #endif
-        last_io = main_loop_wait(nonblocking);
+        main_loop_wait(false);
 #ifdef CONFIG_PROFILER
         dev_time += profile_getclock() - ti;
 #endif
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH 5/5] icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread
  2017-03-03 13:11 [Qemu-devel] [PATCH 0/6] tcg: fix icount super slowdown Paolo Bonzini
                   ` (3 preceding siblings ...)
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 4/5] main-loop: remove now unnecessary optimization Paolo Bonzini
@ 2017-03-03 13:11 ` Paolo Bonzini
  2017-03-13 16:53   ` Alex Bennée
  2017-03-09 17:19 ` [Qemu-devel] [PATCH 0/6] tcg: fix icount super slowdown Alex Bennée
  5 siblings, 1 reply; 26+ messages in thread
From: Paolo Bonzini @ 2017-03-03 13:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: alex.bennee

icount has become much slower after tcg_cpu_exec has stopped
using the BQL.  There is also a latent bug that is masked by
the slowness.

The slowness happens because every occurrence of a QEMU_CLOCK_VIRTUAL
timer now has to wake up the I/O thread and wait for it.  The rendez-vous
is mediated by the BQL QemuMutex:

- handle_icount_deadline wakes up the I/O thread with BQL taken
- the I/O thread wakes up and waits on the BQL
- the VCPU thread releases the BQL a little later
- the I/O thread raises an interrupt, which calls qemu_cpu_kick
- the VCPU thread notices the interrupt, takes the BQL to
  process it and waits on it

All this back and forth is extremely expensive, causing a 6 to 8-fold
slowdown when icount is turned on.

One may think that the issue is that the VCPU thread is too dependent
on the BQL, but then the latent bug comes in.  I first tried removing
the BQL completely from the x86 cpu_exec, only to see everything break.
The only way to fix it (and make everything slow again) was to add a dummy
BQL lock/unlock pair.

This is because in -icount mode you really have to process the events
before the CPU restarts executing the next instruction.  Therefore, this
series moves the processing of QEMU_CLOCK_VIRTUAL timers straight in
the vCPU thread when running in icount mode.

The required changes include:

- make the timer notification callback wake up TCG's single vCPU thread
  when run from another thread.  By using async_run_on_cpu, the callback
  can override all_cpu_threads_idle() when the CPU is halted.

- move handle_icount_deadline after qemu_tcg_wait_io_event, so that
  the timer notification callback is invoked after the dummy work item
  wakes up the vCPU thread

- make handle_icount_deadline run the timers instead of just waking the
  I/O thread.

- stop processing the timers in the main loop

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpus.c               | 26 +++++++++++++++++++++++---
 include/qemu/timer.h | 24 ++++++++++++++++++++++++
 util/qemu-timer.c    |  4 +++-
 3 files changed, 50 insertions(+), 4 deletions(-)

diff --git a/cpus.c b/cpus.c
index 747addd..209c196 100644
--- a/cpus.c
+++ b/cpus.c
@@ -804,9 +804,23 @@ static void qemu_cpu_kick_rr_cpu(void)
     } while (cpu != atomic_mb_read(&tcg_current_rr_cpu));
 }
 
+static void do_nothing(CPUState *cpu, run_on_cpu_data unused)
+{
+}
+
 void qemu_timer_notify_cb(void *opaque, QEMUClockType type)
 {
-    qemu_notify_event();
+    if (!use_icount || type != QEMU_CLOCK_VIRTUAL) {
+        qemu_notify_event();
+        return;
+    }
+
+    if (!qemu_in_vcpu_thread() && first_cpu) {
+        /* run_on_cpu will also kick the CPU out of halt so that
+         * handle_icount_deadline runs
+         */
+        async_run_on_cpu(first_cpu, do_nothing, RUN_ON_CPU_NULL);
+    }
 }
 
 static void kick_tcg_thread(void *opaque)
@@ -1220,12 +1234,15 @@ static int64_t tcg_get_icount_limit(void)
 
 static void handle_icount_deadline(void)
 {
+    assert(qemu_in_vcpu_thread());
     if (use_icount) {
         int64_t deadline =
             qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
 
         if (deadline == 0) {
+            /* Wake up other AioContexts.  */
             qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
+            qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
         }
     }
 }
@@ -1338,6 +1355,11 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
         /* Account partial waits to QEMU_CLOCK_VIRTUAL.  */
         qemu_account_warp_timer();
 
+        /* Run the timers here.  This is much more efficient than
+         * waking up the I/O thread and waiting for completion.
+         */
+        handle_icount_deadline();
+
         if (!cpu) {
             cpu = first_cpu;
         }
@@ -1379,8 +1401,6 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
             atomic_mb_set(&cpu->exit_request, 0);
         }
 
-        handle_icount_deadline();
-
         qemu_tcg_wait_io_event(cpu ? cpu : QTAILQ_FIRST(&cpus));
         deal_with_unplugged_cpus();
     }
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 1441b42..e1742f2 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -533,6 +533,12 @@ static inline QEMUTimer *timer_new_tl(QEMUTimerList *timer_list,
  * Create a new timer and associate it with the default
  * timer list for the clock type @type.
  *
+ * The default timer list has one special feature: in icount mode,
+ * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread.  This is
+ * not true of other timer lists, which are typically associated
+ * with an AioContext---each of them runs its timer callbacks in its own
+ * AioContext thread.
+ *
  * Returns: a pointer to the timer
  */
 static inline QEMUTimer *timer_new(QEMUClockType type, int scale,
@@ -550,6 +556,12 @@ static inline QEMUTimer *timer_new(QEMUClockType type, int scale,
  * Create a new timer with nanosecond scale on the default timer list
  * associated with the clock.
  *
+ * The default timer list has one special feature: in icount mode,
+ * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread.  This is
+ * not true of other timer lists, which are typically associated
+ * with an AioContext---each of them runs its timer callbacks in its own
+ * AioContext thread.
+ *
  * Returns: a pointer to the newly created timer
  */
 static inline QEMUTimer *timer_new_ns(QEMUClockType type, QEMUTimerCB *cb,
@@ -564,6 +576,12 @@ static inline QEMUTimer *timer_new_ns(QEMUClockType type, QEMUTimerCB *cb,
  * @cb: the callback to call when the timer expires
  * @opaque: the opaque pointer to pass to the callback
  *
+ * The default timer list has one special feature: in icount mode,
+ * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread.  This is
+ * not true of other timer lists, which are typically associated
+ * with an AioContext---each of them runs its timer callbacks in its own
+ * AioContext thread.
+ *
  * Create a new timer with microsecond scale on the default timer list
  * associated with the clock.
  *
@@ -581,6 +599,12 @@ static inline QEMUTimer *timer_new_us(QEMUClockType type, QEMUTimerCB *cb,
  * @cb: the callback to call when the timer expires
  * @opaque: the opaque pointer to pass to the callback
  *
+ * The default timer list has one special feature: in icount mode,
+ * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread.  This is
+ * not true of other timer lists, which are typically associated
+ * with an AioContext---each of them runs its timer callbacks in its own
+ * AioContext thread.
+ *
  * Create a new timer with millisecond scale on the default timer list
  * associated with the clock.
  *
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index dc3181e..82d5650 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -658,7 +658,9 @@ bool qemu_clock_run_all_timers(void)
     QEMUClockType type;
 
     for (type = 0; type < QEMU_CLOCK_MAX; type++) {
-        progress |= qemu_clock_run_timers(type);
+        if (qemu_clock_use_for_deadline(type)) {
+            progress |= qemu_clock_run_timers(type);
+        }
     }
 
     return progress;
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 1/5] qemu-timer: fix off-by-one
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 1/5] qemu-timer: fix off-by-one Paolo Bonzini
@ 2017-03-03 13:48   ` Edgar E. Iglesias
  2017-03-10  9:46   ` Alex Bennée
  1 sibling, 0 replies; 26+ messages in thread
From: Edgar E. Iglesias @ 2017-03-03 13:48 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, alex.bennee

On Fri, Mar 03, 2017 at 02:11:09PM +0100, Paolo Bonzini wrote:
> If the first timer is exactly at the current value of the clock, the
> deadline is met and the timer should fire.  This fixes itself without icount,
> but with icount execution of instructions will stop exactly at the deadline.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>


> ---
>  util/qemu-timer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/util/qemu-timer.c b/util/qemu-timer.c
> index 6cf70b9..2f20151 100644
> --- a/util/qemu-timer.c
> +++ b/util/qemu-timer.c
> @@ -199,7 +199,7 @@ bool timerlist_expired(QEMUTimerList *timer_list)
>      expire_time = timer_list->active_timers->expire_time;
>      qemu_mutex_unlock(&timer_list->active_timers_lock);
>  
> -    return expire_time < qemu_clock_get_ns(timer_list->clock->type);
> +    return expire_time <= qemu_clock_get_ns(timer_list->clock->type);
>  }
>  
>  bool qemu_clock_expired(QEMUClockType type)
> -- 
> 2.9.3
> 
> 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 2/5] qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h Paolo Bonzini
@ 2017-03-03 13:48   ` Edgar E. Iglesias
  2017-03-03 14:50   ` Alex Bennée
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 26+ messages in thread
From: Edgar E. Iglesias @ 2017-03-03 13:48 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, alex.bennee

On Fri, Mar 03, 2017 at 02:11:10PM +0100, Paolo Bonzini wrote:
> This dependency is the wrong way, and we will need util/qemu-timer.h from sysemu/cpus.h
> in the next patch.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>



> ---
>  cpu-exec.c            | 1 +
>  hw/core/ptimer.c      | 1 +
>  include/qemu/timer.h  | 1 -
>  include/sysemu/cpus.h | 2 ++
>  kvm-all.c             | 1 +
>  monitor.c             | 1 +
>  replay/replay.c       | 1 +
>  translate-all.c       | 1 +
>  util/main-loop.c      | 1 +
>  util/qemu-timer.c     | 1 +
>  10 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/cpu-exec.c b/cpu-exec.c
> index 1a5ad48..6dbb4da 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -33,6 +33,7 @@
>  #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
>  #include "hw/i386/apic.h"
>  #endif
> +#include "sysemu/cpus.h"
>  #include "sysemu/replay.h"
>  
>  /* -icount align implementation. */
> diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
> index 59ccb00..7221c68 100644
> --- a/hw/core/ptimer.c
> +++ b/hw/core/ptimer.c
> @@ -13,6 +13,7 @@
>  #include "sysemu/replay.h"
>  #include "sysemu/qtest.h"
>  #include "block/aio.h"
> +#include "sysemu/cpus.h"
>  
>  #define DELTA_ADJUST     1
>  #define DELTA_NO_ADJUST -1
> diff --git a/include/qemu/timer.h b/include/qemu/timer.h
> index 26e6285..91cd8c8 100644
> --- a/include/qemu/timer.h
> +++ b/include/qemu/timer.h
> @@ -4,7 +4,6 @@
>  #include "qemu-common.h"
>  #include "qemu/notify.h"
>  #include "qemu/host-utils.h"
> -#include "sysemu/cpus.h"
>  
>  #define NANOSECONDS_PER_SECOND 1000000000LL
>  
> diff --git a/include/sysemu/cpus.h b/include/sysemu/cpus.h
> index a73b5d4..e521a91 100644
> --- a/include/sysemu/cpus.h
> +++ b/include/sysemu/cpus.h
> @@ -1,6 +1,8 @@
>  #ifndef QEMU_CPUS_H
>  #define QEMU_CPUS_H
>  
> +#include "qemu/timer.h"
> +
>  /* cpus.c */
>  bool qemu_in_vcpu_thread(void);
>  void qemu_init_cpu_loop(void);
> diff --git a/kvm-all.c b/kvm-all.c
> index 0c94637..0fd6655 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -29,6 +29,7 @@
>  #include "hw/s390x/adapter.h"
>  #include "exec/gdbstub.h"
>  #include "sysemu/kvm_int.h"
> +#include "sysemu/cpus.h"
>  #include "qemu/bswap.h"
>  #include "exec/memory.h"
>  #include "exec/ram_addr.h"
> diff --git a/monitor.c b/monitor.c
> index b68944d..d8fba78 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -77,6 +77,7 @@
>  #include "qapi-event.h"
>  #include "qmp-introspect.h"
>  #include "sysemu/qtest.h"
> +#include "sysemu/cpus.h"
>  #include "qemu/cutils.h"
>  #include "qapi/qmp/dispatch.h"
>  
> diff --git a/replay/replay.c b/replay/replay.c
> index 1835b99..78e2a7e 100644
> --- a/replay/replay.c
> +++ b/replay/replay.c
> @@ -16,6 +16,7 @@
>  #include "replay-internal.h"
>  #include "qemu/timer.h"
>  #include "qemu/main-loop.h"
> +#include "sysemu/cpus.h"
>  #include "sysemu/sysemu.h"
>  #include "qemu/error-report.h"
>  
> diff --git a/translate-all.c b/translate-all.c
> index 956d54b..63b8f56 100644
> --- a/translate-all.c
> +++ b/translate-all.c
> @@ -57,6 +57,7 @@
>  #include "qemu/timer.h"
>  #include "qemu/main-loop.h"
>  #include "exec/log.h"
> +#include "sysemu/cpus.h"
>  
>  /* #define DEBUG_TB_INVALIDATE */
>  /* #define DEBUG_TB_FLUSH */
> diff --git a/util/main-loop.c b/util/main-loop.c
> index ad10bca..0b2d8c0 100644
> --- a/util/main-loop.c
> +++ b/util/main-loop.c
> @@ -28,6 +28,7 @@
>  #include "qemu/timer.h"
>  #include "qemu/sockets.h"	// struct in_addr needed for libslirp.h
>  #include "sysemu/qtest.h"
> +#include "sysemu/cpus.h"
>  #include "slirp/libslirp.h"
>  #include "qemu/main-loop.h"
>  #include "block/aio.h"
> diff --git a/util/qemu-timer.c b/util/qemu-timer.c
> index 2f20151..ac99340 100644
> --- a/util/qemu-timer.c
> +++ b/util/qemu-timer.c
> @@ -27,6 +27,7 @@
>  #include "qemu/timer.h"
>  #include "sysemu/replay.h"
>  #include "sysemu/sysemu.h"
> +#include "sysemu/cpus.h"
>  
>  #ifdef CONFIG_POSIX
>  #include <pthread.h>
> -- 
> 2.9.3
> 
> 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] cpus: define QEMUTimerListNotifyCB for QEMU system emulation
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 3/5] cpus: define QEMUTimerListNotifyCB for QEMU system emulation Paolo Bonzini
@ 2017-03-03 13:53   ` Edgar E. Iglesias
  0 siblings, 0 replies; 26+ messages in thread
From: Edgar E. Iglesias @ 2017-03-03 13:53 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, alex.bennee

On Fri, Mar 03, 2017 at 02:11:11PM +0100, Paolo Bonzini wrote:
> There is no change for now, because the callback just invokes
> qemu_notify_event.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>


> ---
>  cpus.c                       |  5 +++++
>  include/qemu/timer.h         |  4 ++--
>  include/sysemu/cpus.h        |  1 +
>  stubs/cpu-get-icount.c       |  6 ++++++
>  tests/test-aio-multithread.c |  2 +-
>  tests/test-aio.c             |  2 +-
>  util/async.c                 |  2 +-
>  util/main-loop.c             |  2 +-
>  util/qemu-timer.c            | 10 +++++-----
>  9 files changed, 23 insertions(+), 11 deletions(-)
> 
> diff --git a/cpus.c b/cpus.c
> index 2459f56..747addd 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -804,6 +804,11 @@ static void qemu_cpu_kick_rr_cpu(void)
>      } while (cpu != atomic_mb_read(&tcg_current_rr_cpu));
>  }
>  
> +void qemu_timer_notify_cb(void *opaque, QEMUClockType type)
> +{
> +    qemu_notify_event();
> +}
> +
>  static void kick_tcg_thread(void *opaque)
>  {
>      timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick());
> diff --git a/include/qemu/timer.h b/include/qemu/timer.h
> index 91cd8c8..1441b42 100644
> --- a/include/qemu/timer.h
> +++ b/include/qemu/timer.h
> @@ -59,7 +59,7 @@ struct QEMUTimerListGroup {
>  };
>  
>  typedef void QEMUTimerCB(void *opaque);
> -typedef void QEMUTimerListNotifyCB(void *opaque);
> +typedef void QEMUTimerListNotifyCB(void *opaque, QEMUClockType type);
>  
>  struct QEMUTimer {
>      int64_t expire_time;        /* in nanoseconds */
> @@ -776,7 +776,7 @@ static inline int64_t qemu_soonest_timeout(int64_t timeout1, int64_t timeout2)
>   *
>   * Initialise the clock & timer infrastructure
>   */
> -void init_clocks(void);
> +void init_clocks(QEMUTimerListNotifyCB *notify_cb);
>  
>  int64_t cpu_get_ticks(void);
>  /* Caller must hold BQL */
> diff --git a/include/sysemu/cpus.h b/include/sysemu/cpus.h
> index e521a91..a8053f1 100644
> --- a/include/sysemu/cpus.h
> +++ b/include/sysemu/cpus.h
> @@ -22,6 +22,7 @@ void dump_drift_info(FILE *f, fprintf_function cpu_fprintf);
>  
>  /* Unblock cpu */
>  void qemu_cpu_kick_self(void);
> +void qemu_timer_notify_cb(void *opaque, QEMUClockType type);
>  
>  void cpu_synchronize_all_states(void);
>  void cpu_synchronize_all_post_reset(void);
> diff --git a/stubs/cpu-get-icount.c b/stubs/cpu-get-icount.c
> index 2e8b63b..0b7239d 100644
> --- a/stubs/cpu-get-icount.c
> +++ b/stubs/cpu-get-icount.c
> @@ -2,6 +2,7 @@
>  #include "qemu-common.h"
>  #include "qemu/timer.h"
>  #include "sysemu/cpus.h"
> +#include "qemu/main-loop.h"
>  
>  int use_icount;
>  
> @@ -9,3 +10,8 @@ int64_t cpu_get_icount(void)
>  {
>      abort();
>  }
> +
> +void qemu_timer_notify_cb(void *opaque, QEMUClockType type)
> +{
> +    qemu_notify_event();
> +}
> diff --git a/tests/test-aio-multithread.c b/tests/test-aio-multithread.c
> index 8b0b40e..549d784 100644
> --- a/tests/test-aio-multithread.c
> +++ b/tests/test-aio-multithread.c
> @@ -438,7 +438,7 @@ static void test_multi_mutex_10(void)
>  
>  int main(int argc, char **argv)
>  {
> -    init_clocks();
> +    init_clocks(NULL);
>  
>      g_test_init(&argc, &argv, NULL);
>      g_test_add_func("/aio/multi/lifecycle", test_lifecycle);
> diff --git a/tests/test-aio.c b/tests/test-aio.c
> index 2754f15..54e20d6 100644
> --- a/tests/test-aio.c
> +++ b/tests/test-aio.c
> @@ -835,7 +835,7 @@ int main(int argc, char **argv)
>      Error *local_error = NULL;
>      GSource *src;
>  
> -    init_clocks();
> +    init_clocks(NULL);
>  
>      ctx = aio_context_new(&local_error);
>      if (!ctx) {
> diff --git a/util/async.c b/util/async.c
> index 7d469eb..663e297 100644
> --- a/util/async.c
> +++ b/util/async.c
> @@ -351,7 +351,7 @@ void aio_notify_accept(AioContext *ctx)
>      }
>  }
>  
> -static void aio_timerlist_notify(void *opaque)
> +static void aio_timerlist_notify(void *opaque, QEMUClockType type)
>  {
>      aio_notify(opaque);
>  }
> diff --git a/util/main-loop.c b/util/main-loop.c
> index 0b2d8c0..a8c1c5b 100644
> --- a/util/main-loop.c
> +++ b/util/main-loop.c
> @@ -147,7 +147,7 @@ int qemu_init_main_loop(Error **errp)
>      GSource *src;
>      Error *local_error = NULL;
>  
> -    init_clocks();
> +    init_clocks(qemu_timer_notify_cb);
>  
>      ret = qemu_signal_init();
>      if (ret) {
> diff --git a/util/qemu-timer.c b/util/qemu-timer.c
> index ac99340..dc3181e 100644
> --- a/util/qemu-timer.c
> +++ b/util/qemu-timer.c
> @@ -122,7 +122,7 @@ void timerlist_free(QEMUTimerList *timer_list)
>      g_free(timer_list);
>  }
>  
> -static void qemu_clock_init(QEMUClockType type)
> +static void qemu_clock_init(QEMUClockType type, QEMUTimerListNotifyCB *notify_cb)
>  {
>      QEMUClock *clock = qemu_clock_ptr(type);
>  
> @@ -134,7 +134,7 @@ static void qemu_clock_init(QEMUClockType type)
>      clock->last = INT64_MIN;
>      QLIST_INIT(&clock->timerlists);
>      notifier_list_init(&clock->reset_notifiers);
> -    main_loop_tlg.tl[type] = timerlist_new(type, NULL, NULL);
> +    main_loop_tlg.tl[type] = timerlist_new(type, notify_cb, NULL);
>  }
>  
>  bool qemu_clock_use_for_deadline(QEMUClockType type)
> @@ -278,7 +278,7 @@ QEMUTimerList *qemu_clock_get_main_loop_timerlist(QEMUClockType type)
>  void timerlist_notify(QEMUTimerList *timer_list)
>  {
>      if (timer_list->notify_cb) {
> -        timer_list->notify_cb(timer_list->notify_opaque);
> +        timer_list->notify_cb(timer_list->notify_opaque, timer_list->clock->type);
>      } else {
>          qemu_notify_event();
>      }
> @@ -635,11 +635,11 @@ void qemu_clock_unregister_reset_notifier(QEMUClockType type,
>      notifier_remove(notifier);
>  }
>  
> -void init_clocks(void)
> +void init_clocks(QEMUTimerListNotifyCB *notify_cb)
>  {
>      QEMUClockType type;
>      for (type = 0; type < QEMU_CLOCK_MAX; type++) {
> -        qemu_clock_init(type);
> +        qemu_clock_init(type, notify_cb);
>      }
>  
>  #ifdef CONFIG_PRCTL_PR_SET_TIMERSLACK
> -- 
> 2.9.3
> 
> 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] main-loop: remove now unnecessary optimization
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 4/5] main-loop: remove now unnecessary optimization Paolo Bonzini
@ 2017-03-03 13:53   ` Edgar E. Iglesias
  2017-03-13 16:23   ` Alex Bennée
  1 sibling, 0 replies; 26+ messages in thread
From: Edgar E. Iglesias @ 2017-03-03 13:53 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, alex.bennee

On Fri, Mar 03, 2017 at 02:11:12PM +0100, Paolo Bonzini wrote:
> This optimization is not necessary anymore, because the vCPU now drops
> the I/O thread lock even with TCG.  Drop it to simplify the code and
> avoid the "I/O thread spun for 1000 iterations" warning.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>



> ---
>  vl.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/vl.c b/vl.c
> index bbbf1ba..b21b57e 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1884,17 +1884,14 @@ static bool main_loop_should_exit(void)
>  
>  static void main_loop(void)
>  {
> -    bool nonblocking;
> -    int last_io = 0;
>  #ifdef CONFIG_PROFILER
>      int64_t ti;
>  #endif
>      do {
> -        nonblocking = tcg_enabled() && last_io > 0;
>  #ifdef CONFIG_PROFILER
>          ti = profile_getclock();
>  #endif
> -        last_io = main_loop_wait(nonblocking);
> +        main_loop_wait(false);
>  #ifdef CONFIG_PROFILER
>          dev_time += profile_getclock() - ti;
>  #endif
> -- 
> 2.9.3
> 
> 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 2/5] qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h Paolo Bonzini
  2017-03-03 13:48   ` Edgar E. Iglesias
@ 2017-03-03 14:50   ` Alex Bennée
  2017-03-03 14:55     ` Paolo Bonzini
  2017-03-10  7:42   ` [Qemu-devel] [PATCH] fixup! " Alex Bennée
  2017-03-10  9:47   ` [Qemu-devel] [PATCH 2/5] " Alex Bennée
  3 siblings, 1 reply; 26+ messages in thread
From: Alex Bennée @ 2017-03-03 14:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel


Paolo Bonzini <pbonzini@redhat.com> writes:

> This dependency is the wrong way, and we will need util/qemu-timer.h from sysemu/cpus.h
> in the next patch.

This shuffling breaks a couple of targets:

make -k -j9
	CHK version_gen.h
  CC      alpha-softmmu/target/alpha/translate.o
/home/alex/lsrc/qemu/qemu.git/target/alpha/translate.c: In function ‘gen_mfpr’:
/home/alex/lsrc/qemu/qemu.git/target/alpha/translate.c:1321:13: error: ‘use_icount’ undeclared (first use in this function)
         if (use_icount) {
             ^
/home/alex/lsrc/qemu/qemu.git/target/alpha/translate.c:1321:13: note: each undeclared identifier is reported only once for each function it appears in
/home/alex/lsrc/qemu/qemu.git/rules.mak:69: recipe for target 'target/alpha/translate.o' failed
make[1]: *** [target/alpha/translate.o] Error 1
make[1]: Target 'all' not remade because of errors.
Makefile:322: recipe for target 'subdir-alpha-softmmu' failed
make: *** [subdir-alpha-softmmu] Error 2
  CC      ppc64-softmmu/hw/ppc/pnv.o
/home/alex/lsrc/qemu/qemu.git/hw/ppc/pnv.c: In function ‘ppc_powernv_init’:
/home/alex/lsrc/qemu/qemu.git/hw/ppc/pnv.c:446:39: error: ‘smp_cores’ undeclared (first use in this function)
         object_property_set_int(chip, smp_cores, "nr-cores", &error_fatal);
                                       ^
/home/alex/lsrc/qemu/qemu.git/hw/ppc/pnv.c:446:39: note: each undeclared identifier is reported only once for each function it appears in
/home/alex/lsrc/qemu/qemu.git/hw/ppc/pnv.c: In function ‘pnv_chip_realize’:
/home/alex/lsrc/qemu/qemu.git/hw/ppc/pnv.c:686:57: error: ‘smp_threads’ undeclared (first use in this function)
         object_property_set_int(OBJECT(pnv_core), smp_threads, "nr-threads",
                                                         ^
/home/alex/lsrc/qemu/qemu.git/rules.mak:69: recipe for target 'hw/ppc/pnv.o' failed
make[1]: *** [hw/ppc/pnv.o] Error 1
make[1]: Target 'all' not remade because of errors.
Makefile:322: recipe for target 'subdir-ppc64-softmmu' failed
make: *** [subdir-ppc64-softmmu] Error 2
make: Target 'all' not remade because of errors.

Compilation exited abnormally with code 2 at Fri Mar  3 14:49:30


>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  cpu-exec.c            | 1 +
>  hw/core/ptimer.c      | 1 +
>  include/qemu/timer.h  | 1 -
>  include/sysemu/cpus.h | 2 ++
>  kvm-all.c             | 1 +
>  monitor.c             | 1 +
>  replay/replay.c       | 1 +
>  translate-all.c       | 1 +
>  util/main-loop.c      | 1 +
>  util/qemu-timer.c     | 1 +
>  10 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/cpu-exec.c b/cpu-exec.c
> index 1a5ad48..6dbb4da 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -33,6 +33,7 @@
>  #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
>  #include "hw/i386/apic.h"
>  #endif
> +#include "sysemu/cpus.h"
>  #include "sysemu/replay.h"
>
>  /* -icount align implementation. */
> diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
> index 59ccb00..7221c68 100644
> --- a/hw/core/ptimer.c
> +++ b/hw/core/ptimer.c
> @@ -13,6 +13,7 @@
>  #include "sysemu/replay.h"
>  #include "sysemu/qtest.h"
>  #include "block/aio.h"
> +#include "sysemu/cpus.h"
>
>  #define DELTA_ADJUST     1
>  #define DELTA_NO_ADJUST -1
> diff --git a/include/qemu/timer.h b/include/qemu/timer.h
> index 26e6285..91cd8c8 100644
> --- a/include/qemu/timer.h
> +++ b/include/qemu/timer.h
> @@ -4,7 +4,6 @@
>  #include "qemu-common.h"
>  #include "qemu/notify.h"
>  #include "qemu/host-utils.h"
> -#include "sysemu/cpus.h"
>
>  #define NANOSECONDS_PER_SECOND 1000000000LL
>
> diff --git a/include/sysemu/cpus.h b/include/sysemu/cpus.h
> index a73b5d4..e521a91 100644
> --- a/include/sysemu/cpus.h
> +++ b/include/sysemu/cpus.h
> @@ -1,6 +1,8 @@
>  #ifndef QEMU_CPUS_H
>  #define QEMU_CPUS_H
>
> +#include "qemu/timer.h"
> +
>  /* cpus.c */
>  bool qemu_in_vcpu_thread(void);
>  void qemu_init_cpu_loop(void);
> diff --git a/kvm-all.c b/kvm-all.c
> index 0c94637..0fd6655 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -29,6 +29,7 @@
>  #include "hw/s390x/adapter.h"
>  #include "exec/gdbstub.h"
>  #include "sysemu/kvm_int.h"
> +#include "sysemu/cpus.h"
>  #include "qemu/bswap.h"
>  #include "exec/memory.h"
>  #include "exec/ram_addr.h"
> diff --git a/monitor.c b/monitor.c
> index b68944d..d8fba78 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -77,6 +77,7 @@
>  #include "qapi-event.h"
>  #include "qmp-introspect.h"
>  #include "sysemu/qtest.h"
> +#include "sysemu/cpus.h"
>  #include "qemu/cutils.h"
>  #include "qapi/qmp/dispatch.h"
>
> diff --git a/replay/replay.c b/replay/replay.c
> index 1835b99..78e2a7e 100644
> --- a/replay/replay.c
> +++ b/replay/replay.c
> @@ -16,6 +16,7 @@
>  #include "replay-internal.h"
>  #include "qemu/timer.h"
>  #include "qemu/main-loop.h"
> +#include "sysemu/cpus.h"
>  #include "sysemu/sysemu.h"
>  #include "qemu/error-report.h"
>
> diff --git a/translate-all.c b/translate-all.c
> index 956d54b..63b8f56 100644
> --- a/translate-all.c
> +++ b/translate-all.c
> @@ -57,6 +57,7 @@
>  #include "qemu/timer.h"
>  #include "qemu/main-loop.h"
>  #include "exec/log.h"
> +#include "sysemu/cpus.h"
>
>  /* #define DEBUG_TB_INVALIDATE */
>  /* #define DEBUG_TB_FLUSH */
> diff --git a/util/main-loop.c b/util/main-loop.c
> index ad10bca..0b2d8c0 100644
> --- a/util/main-loop.c
> +++ b/util/main-loop.c
> @@ -28,6 +28,7 @@
>  #include "qemu/timer.h"
>  #include "qemu/sockets.h"	// struct in_addr needed for libslirp.h
>  #include "sysemu/qtest.h"
> +#include "sysemu/cpus.h"
>  #include "slirp/libslirp.h"
>  #include "qemu/main-loop.h"
>  #include "block/aio.h"
> diff --git a/util/qemu-timer.c b/util/qemu-timer.c
> index 2f20151..ac99340 100644
> --- a/util/qemu-timer.c
> +++ b/util/qemu-timer.c
> @@ -27,6 +27,7 @@
>  #include "qemu/timer.h"
>  #include "sysemu/replay.h"
>  #include "sysemu/sysemu.h"
> +#include "sysemu/cpus.h"
>
>  #ifdef CONFIG_POSIX
>  #include <pthread.h>


--
Alex Bennée

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h
  2017-03-03 14:50   ` Alex Bennée
@ 2017-03-03 14:55     ` Paolo Bonzini
  0 siblings, 0 replies; 26+ messages in thread
From: Paolo Bonzini @ 2017-03-03 14:55 UTC (permalink / raw)
  To: Alex Bennée; +Cc: qemu-devel



On 03/03/2017 15:50, Alex Bennée wrote:
> 
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
>> This dependency is the wrong way, and we will need util/qemu-timer.h from sysemu/cpus.h
>> in the next patch.
> 
> This shuffling breaks a couple of targets:

Oops.  I only tested x86 and aarch64 before.  This will have to wait for
after soft freeze.

Paolo

> make -k -j9
> 	CHK version_gen.h
>   CC      alpha-softmmu/target/alpha/translate.o
> /home/alex/lsrc/qemu/qemu.git/target/alpha/translate.c: In function ‘gen_mfpr’:
> /home/alex/lsrc/qemu/qemu.git/target/alpha/translate.c:1321:13: error: ‘use_icount’ undeclared (first use in this function)
>          if (use_icount) {
>              ^
> /home/alex/lsrc/qemu/qemu.git/target/alpha/translate.c:1321:13: note: each undeclared identifier is reported only once for each function it appears in
> /home/alex/lsrc/qemu/qemu.git/rules.mak:69: recipe for target 'target/alpha/translate.o' failed
> make[1]: *** [target/alpha/translate.o] Error 1
> make[1]: Target 'all' not remade because of errors.
> Makefile:322: recipe for target 'subdir-alpha-softmmu' failed
> make: *** [subdir-alpha-softmmu] Error 2
>   CC      ppc64-softmmu/hw/ppc/pnv.o
> /home/alex/lsrc/qemu/qemu.git/hw/ppc/pnv.c: In function ‘ppc_powernv_init’:
> /home/alex/lsrc/qemu/qemu.git/hw/ppc/pnv.c:446:39: error: ‘smp_cores’ undeclared (first use in this function)
>          object_property_set_int(chip, smp_cores, "nr-cores", &error_fatal);
>                                        ^
> /home/alex/lsrc/qemu/qemu.git/hw/ppc/pnv.c:446:39: note: each undeclared identifier is reported only once for each function it appears in
> /home/alex/lsrc/qemu/qemu.git/hw/ppc/pnv.c: In function ‘pnv_chip_realize’:
> /home/alex/lsrc/qemu/qemu.git/hw/ppc/pnv.c:686:57: error: ‘smp_threads’ undeclared (first use in this function)
>          object_property_set_int(OBJECT(pnv_core), smp_threads, "nr-threads",
>                                                          ^
> /home/alex/lsrc/qemu/qemu.git/rules.mak:69: recipe for target 'hw/ppc/pnv.o' failed
> make[1]: *** [hw/ppc/pnv.o] Error 1
> make[1]: Target 'all' not remade because of errors.
> Makefile:322: recipe for target 'subdir-ppc64-softmmu' failed
> make: *** [subdir-ppc64-softmmu] Error 2
> make: Target 'all' not remade because of errors.
> 
> Compilation exited abnormally with code 2 at Fri Mar  3 14:49:30
> 
> 
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  cpu-exec.c            | 1 +
>>  hw/core/ptimer.c      | 1 +
>>  include/qemu/timer.h  | 1 -
>>  include/sysemu/cpus.h | 2 ++
>>  kvm-all.c             | 1 +
>>  monitor.c             | 1 +
>>  replay/replay.c       | 1 +
>>  translate-all.c       | 1 +
>>  util/main-loop.c      | 1 +
>>  util/qemu-timer.c     | 1 +
>>  10 files changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/cpu-exec.c b/cpu-exec.c
>> index 1a5ad48..6dbb4da 100644
>> --- a/cpu-exec.c
>> +++ b/cpu-exec.c
>> @@ -33,6 +33,7 @@
>>  #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
>>  #include "hw/i386/apic.h"
>>  #endif
>> +#include "sysemu/cpus.h"
>>  #include "sysemu/replay.h"
>>
>>  /* -icount align implementation. */
>> diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
>> index 59ccb00..7221c68 100644
>> --- a/hw/core/ptimer.c
>> +++ b/hw/core/ptimer.c
>> @@ -13,6 +13,7 @@
>>  #include "sysemu/replay.h"
>>  #include "sysemu/qtest.h"
>>  #include "block/aio.h"
>> +#include "sysemu/cpus.h"
>>
>>  #define DELTA_ADJUST     1
>>  #define DELTA_NO_ADJUST -1
>> diff --git a/include/qemu/timer.h b/include/qemu/timer.h
>> index 26e6285..91cd8c8 100644
>> --- a/include/qemu/timer.h
>> +++ b/include/qemu/timer.h
>> @@ -4,7 +4,6 @@
>>  #include "qemu-common.h"
>>  #include "qemu/notify.h"
>>  #include "qemu/host-utils.h"
>> -#include "sysemu/cpus.h"
>>
>>  #define NANOSECONDS_PER_SECOND 1000000000LL
>>
>> diff --git a/include/sysemu/cpus.h b/include/sysemu/cpus.h
>> index a73b5d4..e521a91 100644
>> --- a/include/sysemu/cpus.h
>> +++ b/include/sysemu/cpus.h
>> @@ -1,6 +1,8 @@
>>  #ifndef QEMU_CPUS_H
>>  #define QEMU_CPUS_H
>>
>> +#include "qemu/timer.h"
>> +
>>  /* cpus.c */
>>  bool qemu_in_vcpu_thread(void);
>>  void qemu_init_cpu_loop(void);
>> diff --git a/kvm-all.c b/kvm-all.c
>> index 0c94637..0fd6655 100644
>> --- a/kvm-all.c
>> +++ b/kvm-all.c
>> @@ -29,6 +29,7 @@
>>  #include "hw/s390x/adapter.h"
>>  #include "exec/gdbstub.h"
>>  #include "sysemu/kvm_int.h"
>> +#include "sysemu/cpus.h"
>>  #include "qemu/bswap.h"
>>  #include "exec/memory.h"
>>  #include "exec/ram_addr.h"
>> diff --git a/monitor.c b/monitor.c
>> index b68944d..d8fba78 100644
>> --- a/monitor.c
>> +++ b/monitor.c
>> @@ -77,6 +77,7 @@
>>  #include "qapi-event.h"
>>  #include "qmp-introspect.h"
>>  #include "sysemu/qtest.h"
>> +#include "sysemu/cpus.h"
>>  #include "qemu/cutils.h"
>>  #include "qapi/qmp/dispatch.h"
>>
>> diff --git a/replay/replay.c b/replay/replay.c
>> index 1835b99..78e2a7e 100644
>> --- a/replay/replay.c
>> +++ b/replay/replay.c
>> @@ -16,6 +16,7 @@
>>  #include "replay-internal.h"
>>  #include "qemu/timer.h"
>>  #include "qemu/main-loop.h"
>> +#include "sysemu/cpus.h"
>>  #include "sysemu/sysemu.h"
>>  #include "qemu/error-report.h"
>>
>> diff --git a/translate-all.c b/translate-all.c
>> index 956d54b..63b8f56 100644
>> --- a/translate-all.c
>> +++ b/translate-all.c
>> @@ -57,6 +57,7 @@
>>  #include "qemu/timer.h"
>>  #include "qemu/main-loop.h"
>>  #include "exec/log.h"
>> +#include "sysemu/cpus.h"
>>
>>  /* #define DEBUG_TB_INVALIDATE */
>>  /* #define DEBUG_TB_FLUSH */
>> diff --git a/util/main-loop.c b/util/main-loop.c
>> index ad10bca..0b2d8c0 100644
>> --- a/util/main-loop.c
>> +++ b/util/main-loop.c
>> @@ -28,6 +28,7 @@
>>  #include "qemu/timer.h"
>>  #include "qemu/sockets.h"	// struct in_addr needed for libslirp.h
>>  #include "sysemu/qtest.h"
>> +#include "sysemu/cpus.h"
>>  #include "slirp/libslirp.h"
>>  #include "qemu/main-loop.h"
>>  #include "block/aio.h"
>> diff --git a/util/qemu-timer.c b/util/qemu-timer.c
>> index 2f20151..ac99340 100644
>> --- a/util/qemu-timer.c
>> +++ b/util/qemu-timer.c
>> @@ -27,6 +27,7 @@
>>  #include "qemu/timer.h"
>>  #include "sysemu/replay.h"
>>  #include "sysemu/sysemu.h"
>> +#include "sysemu/cpus.h"
>>
>>  #ifdef CONFIG_POSIX
>>  #include <pthread.h>
> 
> 
> --
> Alex Bennée
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] tcg: fix icount super slowdown
  2017-03-03 13:11 [Qemu-devel] [PATCH 0/6] tcg: fix icount super slowdown Paolo Bonzini
                   ` (4 preceding siblings ...)
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 5/5] icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread Paolo Bonzini
@ 2017-03-09 17:19 ` Alex Bennée
  2017-03-09 17:22   ` Paolo Bonzini
  5 siblings, 1 reply; 26+ messages in thread
From: Alex Bennée @ 2017-03-09 17:19 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel


Paolo Bonzini <pbonzini@redhat.com> writes:

Hmm your subject line was:

  Subject: [PATCH 0/6] tcg: fix icount super slowdown

But:

> Paolo Bonzini (5):
>   qemu-timer: fix off-by-one
>   qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h
>   cpus: define QEMUTimerListNotifyCB for QEMU system emulation
>   main-loop: remove now unnecessary optimization
>   icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread

Did the tooling missing something or is this just a git-format-patch burp?

--
Alex Bennée

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] tcg: fix icount super slowdown
  2017-03-09 17:19 ` [Qemu-devel] [PATCH 0/6] tcg: fix icount super slowdown Alex Bennée
@ 2017-03-09 17:22   ` Paolo Bonzini
  0 siblings, 0 replies; 26+ messages in thread
From: Paolo Bonzini @ 2017-03-09 17:22 UTC (permalink / raw)
  To: Alex Bennée; +Cc: qemu-devel



On 09/03/2017 18:19, Alex Bennée wrote:
> 
> Hmm your subject line was:
> 
>   Subject: [PATCH 0/6] tcg: fix icount super slowdown
> 
> But:
> 
>> Paolo Bonzini (5):
>>   qemu-timer: fix off-by-one
>>   qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h
>>   cpus: define QEMUTimerListNotifyCB for QEMU system emulation
>>   main-loop: remove now unnecessary optimization
>>   icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread
> Did the tooling missing something or is this just a git-format-patch burp?

Patch 1 was the nios2 fix that I ended up posting as a separate thread.

Paolo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH] fixup! qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 2/5] qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h Paolo Bonzini
  2017-03-03 13:48   ` Edgar E. Iglesias
  2017-03-03 14:50   ` Alex Bennée
@ 2017-03-10  7:42   ` Alex Bennée
  2017-03-10  8:27     ` Peter Maydell
  2017-03-10  9:47   ` [Qemu-devel] [PATCH 2/5] " Alex Bennée
  3 siblings, 1 reply; 26+ messages in thread
From: Alex Bennée @ 2017-03-10  7:42 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, Alex Bennée, David Gibson, Alexander Graf,
	Richard Henderson, open list:PowerPC

---
 hw/ppc/pnv.c             | 1 +
 target/alpha/translate.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 09f0d22def..3fa722af82 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -21,6 +21,7 @@
 #include "qapi/error.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/numa.h"
+#include "sysemu/cpus.h"
 #include "hw/hw.h"
 #include "target/ppc/cpu.h"
 #include "qemu/log.h"
diff --git a/target/alpha/translate.c b/target/alpha/translate.c
index 055286a7b8..df5d695344 100644
--- a/target/alpha/translate.c
+++ b/target/alpha/translate.c
@@ -19,6 +19,7 @@
 
 #include "qemu/osdep.h"
 #include "cpu.h"
+#include "sysemu/cpus.h"
 #include "disas/disas.h"
 #include "qemu/host-utils.h"
 #include "exec/exec-all.h"
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH] fixup! qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h
  2017-03-10  7:42   ` [Qemu-devel] [PATCH] fixup! " Alex Bennée
@ 2017-03-10  8:27     ` Peter Maydell
  0 siblings, 0 replies; 26+ messages in thread
From: Peter Maydell @ 2017-03-10  8:27 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Paolo Bonzini, Alexander Graf, QEMU Developers,
	open list:PowerPC, David Gibson, Richard Henderson

On 10 March 2017 at 08:42, Alex Bennée <alex.bennee@linaro.org> wrote:
> ---
>  hw/ppc/pnv.c             | 1 +
>  target/alpha/translate.c | 1 +
>  2 files changed, 2 insertions(+)

I'm guessing you didn't intend to post a fixup! patch without
a signoff line? :-)

thanks
-- PMM


>
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index 09f0d22def..3fa722af82 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -21,6 +21,7 @@
>  #include "qapi/error.h"
>  #include "sysemu/sysemu.h"
>  #include "sysemu/numa.h"
> +#include "sysemu/cpus.h"
>  #include "hw/hw.h"
>  #include "target/ppc/cpu.h"
>  #include "qemu/log.h"
> diff --git a/target/alpha/translate.c b/target/alpha/translate.c
> index 055286a7b8..df5d695344 100644
> --- a/target/alpha/translate.c
> +++ b/target/alpha/translate.c
> @@ -19,6 +19,7 @@
>
>  #include "qemu/osdep.h"
>  #include "cpu.h"
> +#include "sysemu/cpus.h"
>  #include "disas/disas.h"
>  #include "qemu/host-utils.h"
>  #include "exec/exec-all.h"
> --
> 2.11.0

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 1/5] qemu-timer: fix off-by-one
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 1/5] qemu-timer: fix off-by-one Paolo Bonzini
  2017-03-03 13:48   ` Edgar E. Iglesias
@ 2017-03-10  9:46   ` Alex Bennée
  1 sibling, 0 replies; 26+ messages in thread
From: Alex Bennée @ 2017-03-10  9:46 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel


Paolo Bonzini <pbonzini@redhat.com> writes:

> If the first timer is exactly at the current value of the clock, the
> deadline is met and the timer should fire.  This fixes itself without icount,
> but with icount execution of instructions will stop exactly at the deadline.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  util/qemu-timer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/util/qemu-timer.c b/util/qemu-timer.c
> index 6cf70b9..2f20151 100644
> --- a/util/qemu-timer.c
> +++ b/util/qemu-timer.c
> @@ -199,7 +199,7 @@ bool timerlist_expired(QEMUTimerList *timer_list)
>      expire_time = timer_list->active_timers->expire_time;
>      qemu_mutex_unlock(&timer_list->active_timers_lock);
>
> -    return expire_time < qemu_clock_get_ns(timer_list->clock->type);
> +    return expire_time <= qemu_clock_get_ns(timer_list->clock->type);
>  }
>
>  bool qemu_clock_expired(QEMUClockType type)


--
Alex Bennée

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 2/5] qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h Paolo Bonzini
                     ` (2 preceding siblings ...)
  2017-03-10  7:42   ` [Qemu-devel] [PATCH] fixup! " Alex Bennée
@ 2017-03-10  9:47   ` Alex Bennée
  3 siblings, 0 replies; 26+ messages in thread
From: Alex Bennée @ 2017-03-10  9:47 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel


Paolo Bonzini <pbonzini@redhat.com> writes:

> This dependency is the wrong way, and we will need util/qemu-timer.h from sysemu/cpus.h
> in the next patch.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

With the compile fixes I posted as a !fixup patch:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  cpu-exec.c            | 1 +
>  hw/core/ptimer.c      | 1 +
>  include/qemu/timer.h  | 1 -
>  include/sysemu/cpus.h | 2 ++
>  kvm-all.c             | 1 +
>  monitor.c             | 1 +
>  replay/replay.c       | 1 +
>  translate-all.c       | 1 +
>  util/main-loop.c      | 1 +
>  util/qemu-timer.c     | 1 +
>  10 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/cpu-exec.c b/cpu-exec.c
> index 1a5ad48..6dbb4da 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -33,6 +33,7 @@
>  #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
>  #include "hw/i386/apic.h"
>  #endif
> +#include "sysemu/cpus.h"
>  #include "sysemu/replay.h"
>
>  /* -icount align implementation. */
> diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
> index 59ccb00..7221c68 100644
> --- a/hw/core/ptimer.c
> +++ b/hw/core/ptimer.c
> @@ -13,6 +13,7 @@
>  #include "sysemu/replay.h"
>  #include "sysemu/qtest.h"
>  #include "block/aio.h"
> +#include "sysemu/cpus.h"
>
>  #define DELTA_ADJUST     1
>  #define DELTA_NO_ADJUST -1
> diff --git a/include/qemu/timer.h b/include/qemu/timer.h
> index 26e6285..91cd8c8 100644
> --- a/include/qemu/timer.h
> +++ b/include/qemu/timer.h
> @@ -4,7 +4,6 @@
>  #include "qemu-common.h"
>  #include "qemu/notify.h"
>  #include "qemu/host-utils.h"
> -#include "sysemu/cpus.h"
>
>  #define NANOSECONDS_PER_SECOND 1000000000LL
>
> diff --git a/include/sysemu/cpus.h b/include/sysemu/cpus.h
> index a73b5d4..e521a91 100644
> --- a/include/sysemu/cpus.h
> +++ b/include/sysemu/cpus.h
> @@ -1,6 +1,8 @@
>  #ifndef QEMU_CPUS_H
>  #define QEMU_CPUS_H
>
> +#include "qemu/timer.h"
> +
>  /* cpus.c */
>  bool qemu_in_vcpu_thread(void);
>  void qemu_init_cpu_loop(void);
> diff --git a/kvm-all.c b/kvm-all.c
> index 0c94637..0fd6655 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -29,6 +29,7 @@
>  #include "hw/s390x/adapter.h"
>  #include "exec/gdbstub.h"
>  #include "sysemu/kvm_int.h"
> +#include "sysemu/cpus.h"
>  #include "qemu/bswap.h"
>  #include "exec/memory.h"
>  #include "exec/ram_addr.h"
> diff --git a/monitor.c b/monitor.c
> index b68944d..d8fba78 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -77,6 +77,7 @@
>  #include "qapi-event.h"
>  #include "qmp-introspect.h"
>  #include "sysemu/qtest.h"
> +#include "sysemu/cpus.h"
>  #include "qemu/cutils.h"
>  #include "qapi/qmp/dispatch.h"
>
> diff --git a/replay/replay.c b/replay/replay.c
> index 1835b99..78e2a7e 100644
> --- a/replay/replay.c
> +++ b/replay/replay.c
> @@ -16,6 +16,7 @@
>  #include "replay-internal.h"
>  #include "qemu/timer.h"
>  #include "qemu/main-loop.h"
> +#include "sysemu/cpus.h"
>  #include "sysemu/sysemu.h"
>  #include "qemu/error-report.h"
>
> diff --git a/translate-all.c b/translate-all.c
> index 956d54b..63b8f56 100644
> --- a/translate-all.c
> +++ b/translate-all.c
> @@ -57,6 +57,7 @@
>  #include "qemu/timer.h"
>  #include "qemu/main-loop.h"
>  #include "exec/log.h"
> +#include "sysemu/cpus.h"
>
>  /* #define DEBUG_TB_INVALIDATE */
>  /* #define DEBUG_TB_FLUSH */
> diff --git a/util/main-loop.c b/util/main-loop.c
> index ad10bca..0b2d8c0 100644
> --- a/util/main-loop.c
> +++ b/util/main-loop.c
> @@ -28,6 +28,7 @@
>  #include "qemu/timer.h"
>  #include "qemu/sockets.h"	// struct in_addr needed for libslirp.h
>  #include "sysemu/qtest.h"
> +#include "sysemu/cpus.h"
>  #include "slirp/libslirp.h"
>  #include "qemu/main-loop.h"
>  #include "block/aio.h"
> diff --git a/util/qemu-timer.c b/util/qemu-timer.c
> index 2f20151..ac99340 100644
> --- a/util/qemu-timer.c
> +++ b/util/qemu-timer.c
> @@ -27,6 +27,7 @@
>  #include "qemu/timer.h"
>  #include "sysemu/replay.h"
>  #include "sysemu/sysemu.h"
> +#include "sysemu/cpus.h"
>
>  #ifdef CONFIG_POSIX
>  #include <pthread.h>


--
Alex Bennée

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] main-loop: remove now unnecessary optimization
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 4/5] main-loop: remove now unnecessary optimization Paolo Bonzini
  2017-03-03 13:53   ` Edgar E. Iglesias
@ 2017-03-13 16:23   ` Alex Bennée
  1 sibling, 0 replies; 26+ messages in thread
From: Alex Bennée @ 2017-03-13 16:23 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel


Paolo Bonzini <pbonzini@redhat.com> writes:

> This optimization is not necessary anymore, because the vCPU now drops
> the I/O thread lock even with TCG.  Drop it to simplify the code and
> avoid the "I/O thread spun for 1000 iterations" warning.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>


> ---
>  vl.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/vl.c b/vl.c
> index bbbf1ba..b21b57e 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1884,17 +1884,14 @@ static bool main_loop_should_exit(void)
>
>  static void main_loop(void)
>  {
> -    bool nonblocking;
> -    int last_io = 0;
>  #ifdef CONFIG_PROFILER
>      int64_t ti;
>  #endif
>      do {
> -        nonblocking = tcg_enabled() && last_io > 0;
>  #ifdef CONFIG_PROFILER
>          ti = profile_getclock();
>  #endif
> -        last_io = main_loop_wait(nonblocking);
> +        main_loop_wait(false);
>  #ifdef CONFIG_PROFILER
>          dev_time += profile_getclock() - ti;
>  #endif


--
Alex Bennée

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread
  2017-03-03 13:11 ` [Qemu-devel] [PATCH 5/5] icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread Paolo Bonzini
@ 2017-03-13 16:53   ` Alex Bennée
  2017-03-13 17:16     ` Paolo Bonzini
  0 siblings, 1 reply; 26+ messages in thread
From: Alex Bennée @ 2017-03-13 16:53 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel


Paolo Bonzini <pbonzini@redhat.com> writes:

> icount has become much slower after tcg_cpu_exec has stopped
> using the BQL.  There is also a latent bug that is masked by
> the slowness.
>
> The slowness happens because every occurrence of a QEMU_CLOCK_VIRTUAL
> timer now has to wake up the I/O thread and wait for it.  The rendez-vous
> is mediated by the BQL QemuMutex:
>
> - handle_icount_deadline wakes up the I/O thread with BQL taken
> - the I/O thread wakes up and waits on the BQL
> - the VCPU thread releases the BQL a little later
> - the I/O thread raises an interrupt, which calls qemu_cpu_kick
> - the VCPU thread notices the interrupt, takes the BQL to
>   process it and waits on it
>
> All this back and forth is extremely expensive, causing a 6 to 8-fold
> slowdown when icount is turned on.
>
> One may think that the issue is that the VCPU thread is too dependent
> on the BQL, but then the latent bug comes in.  I first tried removing
> the BQL completely from the x86 cpu_exec, only to see everything break.
> The only way to fix it (and make everything slow again) was to add a dummy
> BQL lock/unlock pair.
>
> This is because in -icount mode you really have to process the events
> before the CPU restarts executing the next instruction.  Therefore, this
> series moves the processing of QEMU_CLOCK_VIRTUAL timers straight in
> the vCPU thread when running in icount mode.
>
> The required changes include:
>
> - make the timer notification callback wake up TCG's single vCPU thread
>   when run from another thread.  By using async_run_on_cpu, the callback
>   can override all_cpu_threads_idle() when the CPU is halted.
>
> - move handle_icount_deadline after qemu_tcg_wait_io_event, so that
>   the timer notification callback is invoked after the dummy work item
>   wakes up the vCPU thread
>
> - make handle_icount_deadline run the timers instead of just waking the
>   I/O thread.
>
> - stop processing the timers in the main loop
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  cpus.c               | 26 +++++++++++++++++++++++---
>  include/qemu/timer.h | 24 ++++++++++++++++++++++++
>  util/qemu-timer.c    |  4 +++-
>  3 files changed, 50 insertions(+), 4 deletions(-)
>
> diff --git a/cpus.c b/cpus.c
> index 747addd..209c196 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -804,9 +804,23 @@ static void qemu_cpu_kick_rr_cpu(void)
>      } while (cpu != atomic_mb_read(&tcg_current_rr_cpu));
>  }
>
> +static void do_nothing(CPUState *cpu, run_on_cpu_data unused)
> +{
> +}
> +
>  void qemu_timer_notify_cb(void *opaque, QEMUClockType type)
>  {
> -    qemu_notify_event();
> +    if (!use_icount || type != QEMU_CLOCK_VIRTUAL) {
> +        qemu_notify_event();
> +        return;
> +    }
> +
> +    if (!qemu_in_vcpu_thread() && first_cpu) {
> +        /* run_on_cpu will also kick the CPU out of halt so that
> +         * handle_icount_deadline runs
> +         */
> +        async_run_on_cpu(first_cpu, do_nothing, RUN_ON_CPU_NULL);
> +    }

This doesn't read quite right - otherwise you get the same effect by
calling qemu_cpu_kick(), or even modify and call qemu_cpu_kick_rr_cpu()?

The only real effect of having something in the work queue is you run
the outside of the loop before returning into the tcg_cpu_exec.

>  }
>
>  static void kick_tcg_thread(void *opaque)
> @@ -1220,12 +1234,15 @@ static int64_t tcg_get_icount_limit(void)
>
>  static void handle_icount_deadline(void)
>  {
> +    assert(qemu_in_vcpu_thread());
>      if (use_icount) {
>          int64_t deadline =
>              qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
>
>          if (deadline == 0) {
> +            /* Wake up other AioContexts.  */
>              qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
> +            qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
>          }
>      }
>  }
> @@ -1338,6 +1355,11 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
>          /* Account partial waits to QEMU_CLOCK_VIRTUAL.  */
>          qemu_account_warp_timer();
>
> +        /* Run the timers here.  This is much more efficient than
> +         * waking up the I/O thread and waiting for completion.
> +         */
> +        handle_icount_deadline();
> +
>          if (!cpu) {
>              cpu = first_cpu;
>          }
> @@ -1379,8 +1401,6 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
>              atomic_mb_set(&cpu->exit_request, 0);
>          }
>
> -        handle_icount_deadline();
> -

I guess we could just as easily move the handling to after
qemu_tcg_wait_io_event(cpu ? cpu : QTAILQ_FIRST(&cpus))?

I noticed we still call handle_icount_deadline in the multi-thread case
which is probably superfluous unless/until we figure out a way for it to
work with MTTCG.

>          qemu_tcg_wait_io_event(cpu ? cpu : QTAILQ_FIRST(&cpus));
>          deal_with_unplugged_cpus();
>      }
> diff --git a/include/qemu/timer.h b/include/qemu/timer.h
> index 1441b42..e1742f2 100644
> --- a/include/qemu/timer.h
> +++ b/include/qemu/timer.h
> @@ -533,6 +533,12 @@ static inline QEMUTimer *timer_new_tl(QEMUTimerList *timer_list,
>   * Create a new timer and associate it with the default
>   * timer list for the clock type @type.
>   *
> + * The default timer list has one special feature: in icount mode,
> + * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread.  This is
> + * not true of other timer lists, which are typically associated
> + * with an AioContext---each of them runs its timer callbacks in its own
> + * AioContext thread.
> + *
>   * Returns: a pointer to the timer
>   */
>  static inline QEMUTimer *timer_new(QEMUClockType type, int scale,
> @@ -550,6 +556,12 @@ static inline QEMUTimer *timer_new(QEMUClockType type, int scale,
>   * Create a new timer with nanosecond scale on the default timer list
>   * associated with the clock.
>   *
> + * The default timer list has one special feature: in icount mode,
> + * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread.  This is
> + * not true of other timer lists, which are typically associated
> + * with an AioContext---each of them runs its timer callbacks in its own
> + * AioContext thread.
> + *
>   * Returns: a pointer to the newly created timer
>   */
>  static inline QEMUTimer *timer_new_ns(QEMUClockType type, QEMUTimerCB *cb,
> @@ -564,6 +576,12 @@ static inline QEMUTimer *timer_new_ns(QEMUClockType type, QEMUTimerCB *cb,
>   * @cb: the callback to call when the timer expires
>   * @opaque: the opaque pointer to pass to the callback
>   *
> + * The default timer list has one special feature: in icount mode,
> + * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread.  This is
> + * not true of other timer lists, which are typically associated
> + * with an AioContext---each of them runs its timer callbacks in its own
> + * AioContext thread.
> + *
>   * Create a new timer with microsecond scale on the default timer list
>   * associated with the clock.
>   *
> @@ -581,6 +599,12 @@ static inline QEMUTimer *timer_new_us(QEMUClockType type, QEMUTimerCB *cb,
>   * @cb: the callback to call when the timer expires
>   * @opaque: the opaque pointer to pass to the callback
>   *
> + * The default timer list has one special feature: in icount mode,
> + * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread.  This is
> + * not true of other timer lists, which are typically associated
> + * with an AioContext---each of them runs its timer callbacks in its own
> + * AioContext thread.
> + *
>   * Create a new timer with millisecond scale on the default timer list
>   * associated with the clock.
>   *
> diff --git a/util/qemu-timer.c b/util/qemu-timer.c
> index dc3181e..82d5650 100644
> --- a/util/qemu-timer.c
> +++ b/util/qemu-timer.c
> @@ -658,7 +658,9 @@ bool qemu_clock_run_all_timers(void)
>      QEMUClockType type;
>
>      for (type = 0; type < QEMU_CLOCK_MAX; type++) {
> -        progress |= qemu_clock_run_timers(type);
> +        if (qemu_clock_use_for_deadline(type)) {
> +            progress |= qemu_clock_run_timers(type);
> +        }

minor nit: its not really qemu_clock_run_all_timers() now is it. We run
all but the virtual timers (if icount is enabled).

>      }
>
>      return progress;


--
Alex Bennée

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread
  2017-03-13 16:53   ` Alex Bennée
@ 2017-03-13 17:16     ` Paolo Bonzini
  2017-03-13 18:15       ` Alex Bennée
  0 siblings, 1 reply; 26+ messages in thread
From: Paolo Bonzini @ 2017-03-13 17:16 UTC (permalink / raw)
  To: Alex Bennée; +Cc: qemu-devel



On 13/03/2017 17:53, Alex Bennée wrote:
> 
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
>> icount has become much slower after tcg_cpu_exec has stopped
>> using the BQL.  There is also a latent bug that is masked by
>> the slowness.
>>
>> The slowness happens because every occurrence of a QEMU_CLOCK_VIRTUAL
>> timer now has to wake up the I/O thread and wait for it.  The rendez-vous
>> is mediated by the BQL QemuMutex:
>>
>> - handle_icount_deadline wakes up the I/O thread with BQL taken
>> - the I/O thread wakes up and waits on the BQL
>> - the VCPU thread releases the BQL a little later
>> - the I/O thread raises an interrupt, which calls qemu_cpu_kick
>> - the VCPU thread notices the interrupt, takes the BQL to
>>   process it and waits on it
>>
>> All this back and forth is extremely expensive, causing a 6 to 8-fold
>> slowdown when icount is turned on.
>>
>> One may think that the issue is that the VCPU thread is too dependent
>> on the BQL, but then the latent bug comes in.  I first tried removing
>> the BQL completely from the x86 cpu_exec, only to see everything break.
>> The only way to fix it (and make everything slow again) was to add a dummy
>> BQL lock/unlock pair.
>>
>> This is because in -icount mode you really have to process the events
>> before the CPU restarts executing the next instruction.  Therefore, this
>> series moves the processing of QEMU_CLOCK_VIRTUAL timers straight in
>> the vCPU thread when running in icount mode.
>>
>> The required changes include:
>>
>> - make the timer notification callback wake up TCG's single vCPU thread
>>   when run from another thread.  By using async_run_on_cpu, the callback
>>   can override all_cpu_threads_idle() when the CPU is halted.
>>
>> - move handle_icount_deadline after qemu_tcg_wait_io_event, so that
>>   the timer notification callback is invoked after the dummy work item
>>   wakes up the vCPU thread
>>
>> - make handle_icount_deadline run the timers instead of just waking the
>>   I/O thread.
>>
>> - stop processing the timers in the main loop
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  cpus.c               | 26 +++++++++++++++++++++++---
>>  include/qemu/timer.h | 24 ++++++++++++++++++++++++
>>  util/qemu-timer.c    |  4 +++-
>>  3 files changed, 50 insertions(+), 4 deletions(-)
>>
>> diff --git a/cpus.c b/cpus.c
>> index 747addd..209c196 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -804,9 +804,23 @@ static void qemu_cpu_kick_rr_cpu(void)
>>      } while (cpu != atomic_mb_read(&tcg_current_rr_cpu));
>>  }
>>
>> +static void do_nothing(CPUState *cpu, run_on_cpu_data unused)
>> +{
>> +}
>> +
>>  void qemu_timer_notify_cb(void *opaque, QEMUClockType type)
>>  {
>> -    qemu_notify_event();
>> +    if (!use_icount || type != QEMU_CLOCK_VIRTUAL) {
>> +        qemu_notify_event();
>> +        return;
>> +    }
>> +
>> +    if (!qemu_in_vcpu_thread() && first_cpu) {
>> +        /* run_on_cpu will also kick the CPU out of halt so that
>> +         * handle_icount_deadline runs
>> +         */
>> +        async_run_on_cpu(first_cpu, do_nothing, RUN_ON_CPU_NULL);
>> +    }
> 
> This doesn't read quite right - otherwise you get the same effect by
> calling qemu_cpu_kick(), or even modify and call qemu_cpu_kick_rr_cpu()?
> 
> The only real effect of having something in the work queue is you run
> the outside of the loop before returning into the tcg_cpu_exec.

Yes, and "the outside of the loop" here means handle_icount_deadline.

But qemu_cpu_kick_rr_cpu won't signal condition variables, and
qemu_cpu_kick does but it won't make cpu_thread_is_idle return false.
Therefore, qemu_tcg_wait_io_event keeps looping.

>>  }
>>
>>  static void kick_tcg_thread(void *opaque)
>> @@ -1220,12 +1234,15 @@ static int64_t tcg_get_icount_limit(void)
>>
>>  static void handle_icount_deadline(void)
>>  {
>> +    assert(qemu_in_vcpu_thread());
>>      if (use_icount) {
>>          int64_t deadline =
>>              qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
>>
>>          if (deadline == 0) {
>> +            /* Wake up other AioContexts.  */
>>              qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
>> +            qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
>>          }
>>      }
>>  }
>> @@ -1338,6 +1355,11 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
>>          /* Account partial waits to QEMU_CLOCK_VIRTUAL.  */
>>          qemu_account_warp_timer();
>>
>> +        /* Run the timers here.  This is much more efficient than
>> +         * waking up the I/O thread and waiting for completion.
>> +         */
>> +        handle_icount_deadline();
>> +
>>          if (!cpu) {
>>              cpu = first_cpu;
>>          }
>> @@ -1379,8 +1401,6 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
>>              atomic_mb_set(&cpu->exit_request, 0);
>>          }
>>
>> -        handle_icount_deadline();
>> -
> 
> I guess we could just as easily move the handling to after
> qemu_tcg_wait_io_event(cpu ? cpu : QTAILQ_FIRST(&cpus))?
> 
> I noticed we still call handle_icount_deadline in the multi-thread case
> which is probably superfluous unless/until we figure out a way for it to
> work with MTTCG.

Should I remove the call?  Add an assert(!use_icount)?

>>          qemu_tcg_wait_io_event(cpu ? cpu : QTAILQ_FIRST(&cpus));
>>          deal_with_unplugged_cpus();
>>      }
>> diff --git a/include/qemu/timer.h b/include/qemu/timer.h
>> index 1441b42..e1742f2 100644
>> --- a/include/qemu/timer.h
>> +++ b/include/qemu/timer.h
>> @@ -533,6 +533,12 @@ static inline QEMUTimer *timer_new_tl(QEMUTimerList *timer_list,
>>   * Create a new timer and associate it with the default
>>   * timer list for the clock type @type.
>>   *
>> + * The default timer list has one special feature: in icount mode,
>> + * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread.  This is
>> + * not true of other timer lists, which are typically associated
>> + * with an AioContext---each of them runs its timer callbacks in its own
>> + * AioContext thread.
>> + *
>>   * Returns: a pointer to the timer
>>   */
>>  static inline QEMUTimer *timer_new(QEMUClockType type, int scale,
>> @@ -550,6 +556,12 @@ static inline QEMUTimer *timer_new(QEMUClockType type, int scale,
>>   * Create a new timer with nanosecond scale on the default timer list
>>   * associated with the clock.
>>   *
>> + * The default timer list has one special feature: in icount mode,
>> + * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread.  This is
>> + * not true of other timer lists, which are typically associated
>> + * with an AioContext---each of them runs its timer callbacks in its own
>> + * AioContext thread.
>> + *
>>   * Returns: a pointer to the newly created timer
>>   */
>>  static inline QEMUTimer *timer_new_ns(QEMUClockType type, QEMUTimerCB *cb,
>> @@ -564,6 +576,12 @@ static inline QEMUTimer *timer_new_ns(QEMUClockType type, QEMUTimerCB *cb,
>>   * @cb: the callback to call when the timer expires
>>   * @opaque: the opaque pointer to pass to the callback
>>   *
>> + * The default timer list has one special feature: in icount mode,
>> + * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread.  This is
>> + * not true of other timer lists, which are typically associated
>> + * with an AioContext---each of them runs its timer callbacks in its own
>> + * AioContext thread.
>> + *
>>   * Create a new timer with microsecond scale on the default timer list
>>   * associated with the clock.
>>   *
>> @@ -581,6 +599,12 @@ static inline QEMUTimer *timer_new_us(QEMUClockType type, QEMUTimerCB *cb,
>>   * @cb: the callback to call when the timer expires
>>   * @opaque: the opaque pointer to pass to the callback
>>   *
>> + * The default timer list has one special feature: in icount mode,
>> + * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread.  This is
>> + * not true of other timer lists, which are typically associated
>> + * with an AioContext---each of them runs its timer callbacks in its own
>> + * AioContext thread.
>> + *
>>   * Create a new timer with millisecond scale on the default timer list
>>   * associated with the clock.
>>   *
>> diff --git a/util/qemu-timer.c b/util/qemu-timer.c
>> index dc3181e..82d5650 100644
>> --- a/util/qemu-timer.c
>> +++ b/util/qemu-timer.c
>> @@ -658,7 +658,9 @@ bool qemu_clock_run_all_timers(void)
>>      QEMUClockType type;
>>
>>      for (type = 0; type < QEMU_CLOCK_MAX; type++) {
>> -        progress |= qemu_clock_run_timers(type);
>> +        if (qemu_clock_use_for_deadline(type)) {
>> +            progress |= qemu_clock_run_timers(type);
>> +        }
> 
> minor nit: its not really qemu_clock_run_all_timers() now is it. We run
> all but the virtual timers (if icount is enabled).

Well yeah, it's all those that pass qemu_clock_use_for_deadline.

Paolo

>>      }
>>
>>      return progress;
> 
> 
> --
> Alex Bennée
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread
  2017-03-13 17:16     ` Paolo Bonzini
@ 2017-03-13 18:15       ` Alex Bennée
  2017-03-14 10:05         ` Paolo Bonzini
  0 siblings, 1 reply; 26+ messages in thread
From: Alex Bennée @ 2017-03-13 18:15 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel


Paolo Bonzini <pbonzini@redhat.com> writes:

> On 13/03/2017 17:53, Alex Bennée wrote:
>>
>> Paolo Bonzini <pbonzini@redhat.com> writes:
>>
>>> icount has become much slower after tcg_cpu_exec has stopped
>>> using the BQL.  There is also a latent bug that is masked by
>>> the slowness.
>>>
>>> The slowness happens because every occurrence of a QEMU_CLOCK_VIRTUAL
>>> timer now has to wake up the I/O thread and wait for it.  The rendez-vous
>>> is mediated by the BQL QemuMutex:
>>>
>>> - handle_icount_deadline wakes up the I/O thread with BQL taken
>>> - the I/O thread wakes up and waits on the BQL
>>> - the VCPU thread releases the BQL a little later
>>> - the I/O thread raises an interrupt, which calls qemu_cpu_kick
>>> - the VCPU thread notices the interrupt, takes the BQL to
>>>   process it and waits on it
>>>
>>> All this back and forth is extremely expensive, causing a 6 to 8-fold
>>> slowdown when icount is turned on.
>>>
>>> One may think that the issue is that the VCPU thread is too dependent
>>> on the BQL, but then the latent bug comes in.  I first tried removing
>>> the BQL completely from the x86 cpu_exec, only to see everything break.
>>> The only way to fix it (and make everything slow again) was to add a dummy
>>> BQL lock/unlock pair.
>>>
>>> This is because in -icount mode you really have to process the events
>>> before the CPU restarts executing the next instruction.  Therefore, this
>>> series moves the processing of QEMU_CLOCK_VIRTUAL timers straight in
>>> the vCPU thread when running in icount mode.
>>>
>>> The required changes include:
>>>
>>> - make the timer notification callback wake up TCG's single vCPU thread
>>>   when run from another thread.  By using async_run_on_cpu, the callback
>>>   can override all_cpu_threads_idle() when the CPU is halted.
>>>
>>> - move handle_icount_deadline after qemu_tcg_wait_io_event, so that
>>>   the timer notification callback is invoked after the dummy work item
>>>   wakes up the vCPU thread
>>>
>>> - make handle_icount_deadline run the timers instead of just waking the
>>>   I/O thread.
>>>
>>> - stop processing the timers in the main loop
>>>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>> ---
>>>  cpus.c               | 26 +++++++++++++++++++++++---
>>>  include/qemu/timer.h | 24 ++++++++++++++++++++++++
>>>  util/qemu-timer.c    |  4 +++-
>>>  3 files changed, 50 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/cpus.c b/cpus.c
>>> index 747addd..209c196 100644
>>> --- a/cpus.c
>>> +++ b/cpus.c
>>> @@ -804,9 +804,23 @@ static void qemu_cpu_kick_rr_cpu(void)
>>>      } while (cpu != atomic_mb_read(&tcg_current_rr_cpu));
>>>  }
>>>
>>> +static void do_nothing(CPUState *cpu, run_on_cpu_data unused)
>>> +{
>>> +}
>>> +
>>>  void qemu_timer_notify_cb(void *opaque, QEMUClockType type)
>>>  {
>>> -    qemu_notify_event();
>>> +    if (!use_icount || type != QEMU_CLOCK_VIRTUAL) {
>>> +        qemu_notify_event();
>>> +        return;
>>> +    }
>>> +
>>> +    if (!qemu_in_vcpu_thread() && first_cpu) {
>>> +        /* run_on_cpu will also kick the CPU out of halt so that
>>> +         * handle_icount_deadline runs
>>> +         */
>>> +        async_run_on_cpu(first_cpu, do_nothing, RUN_ON_CPU_NULL);
>>> +    }
>>
>> This doesn't read quite right - otherwise you get the same effect by
>> calling qemu_cpu_kick(), or even modify and call qemu_cpu_kick_rr_cpu()?
>>
>> The only real effect of having something in the work queue is you run
>> the outside of the loop before returning into the tcg_cpu_exec.
>
> Yes, and "the outside of the loop" here means handle_icount_deadline.
>
> But qemu_cpu_kick_rr_cpu won't signal condition variables, and
> qemu_cpu_kick does but it won't make cpu_thread_is_idle return false.
> Therefore, qemu_tcg_wait_io_event keeps looping.

How about:

  Scheduling work with async_run_on_cpu ensures we exit
  qemu_tcg_wait_io_event if we have halted that
  handle_icount_deadline can run.

>
>>>  }
>>>
>>>  static void kick_tcg_thread(void *opaque)
>>> @@ -1220,12 +1234,15 @@ static int64_t tcg_get_icount_limit(void)
>>>
>>>  static void handle_icount_deadline(void)
>>>  {
>>> +    assert(qemu_in_vcpu_thread());
>>>      if (use_icount) {
>>>          int64_t deadline =
>>>              qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
>>>
>>>          if (deadline == 0) {
>>> +            /* Wake up other AioContexts.  */
>>>              qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
>>> +            qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
>>>          }
>>>      }
>>>  }
>>> @@ -1338,6 +1355,11 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
>>>          /* Account partial waits to QEMU_CLOCK_VIRTUAL.  */
>>>          qemu_account_warp_timer();
>>>
>>> +        /* Run the timers here.  This is much more efficient than
>>> +         * waking up the I/O thread and waiting for completion.
>>> +         */
>>> +        handle_icount_deadline();
>>> +
>>>          if (!cpu) {
>>>              cpu = first_cpu;
>>>          }
>>> @@ -1379,8 +1401,6 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
>>>              atomic_mb_set(&cpu->exit_request, 0);
>>>          }
>>>
>>> -        handle_icount_deadline();
>>> -
>>
>> I guess we could just as easily move the handling to after
>> qemu_tcg_wait_io_event(cpu ? cpu : QTAILQ_FIRST(&cpus))?
>>
>> I noticed we still call handle_icount_deadline in the multi-thread case
>> which is probably superfluous unless/until we figure out a way for it to
>> work with MTTCG.
>
> Should I remove the call?  Add an assert(!use_icount)?

Yes pretty much that as an independent patch.

>
>>>          qemu_tcg_wait_io_event(cpu ? cpu : QTAILQ_FIRST(&cpus));
>>>          deal_with_unplugged_cpus();
>>>      }
>>> diff --git a/include/qemu/timer.h b/include/qemu/timer.h
>>> index 1441b42..e1742f2 100644
>>> --- a/include/qemu/timer.h
>>> +++ b/include/qemu/timer.h
>>> @@ -533,6 +533,12 @@ static inline QEMUTimer *timer_new_tl(QEMUTimerList *timer_list,
>>>   * Create a new timer and associate it with the default
>>>   * timer list for the clock type @type.
>>>   *
>>> + * The default timer list has one special feature: in icount mode,
>>> + * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread.  This is
>>> + * not true of other timer lists, which are typically associated
>>> + * with an AioContext---each of them runs its timer callbacks in its own
>>> + * AioContext thread.
>>> + *
>>>   * Returns: a pointer to the timer
>>>   */
>>>  static inline QEMUTimer *timer_new(QEMUClockType type, int scale,
>>> @@ -550,6 +556,12 @@ static inline QEMUTimer *timer_new(QEMUClockType type, int scale,
>>>   * Create a new timer with nanosecond scale on the default timer list
>>>   * associated with the clock.
>>>   *
>>> + * The default timer list has one special feature: in icount mode,
>>> + * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread.  This is
>>> + * not true of other timer lists, which are typically associated
>>> + * with an AioContext---each of them runs its timer callbacks in its own
>>> + * AioContext thread.
>>> + *
>>>   * Returns: a pointer to the newly created timer
>>>   */
>>>  static inline QEMUTimer *timer_new_ns(QEMUClockType type, QEMUTimerCB *cb,
>>> @@ -564,6 +576,12 @@ static inline QEMUTimer *timer_new_ns(QEMUClockType type, QEMUTimerCB *cb,
>>>   * @cb: the callback to call when the timer expires
>>>   * @opaque: the opaque pointer to pass to the callback
>>>   *
>>> + * The default timer list has one special feature: in icount mode,
>>> + * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread.  This is
>>> + * not true of other timer lists, which are typically associated
>>> + * with an AioContext---each of them runs its timer callbacks in its own
>>> + * AioContext thread.
>>> + *
>>>   * Create a new timer with microsecond scale on the default timer list
>>>   * associated with the clock.
>>>   *
>>> @@ -581,6 +599,12 @@ static inline QEMUTimer *timer_new_us(QEMUClockType type, QEMUTimerCB *cb,
>>>   * @cb: the callback to call when the timer expires
>>>   * @opaque: the opaque pointer to pass to the callback
>>>   *
>>> + * The default timer list has one special feature: in icount mode,
>>> + * %QEMU_CLOCK_VIRTUAL timers are run in the vCPU thread.  This is
>>> + * not true of other timer lists, which are typically associated
>>> + * with an AioContext---each of them runs its timer callbacks in its own
>>> + * AioContext thread.
>>> + *
>>>   * Create a new timer with millisecond scale on the default timer list
>>>   * associated with the clock.
>>>   *
>>> diff --git a/util/qemu-timer.c b/util/qemu-timer.c
>>> index dc3181e..82d5650 100644
>>> --- a/util/qemu-timer.c
>>> +++ b/util/qemu-timer.c
>>> @@ -658,7 +658,9 @@ bool qemu_clock_run_all_timers(void)
>>>      QEMUClockType type;
>>>
>>>      for (type = 0; type < QEMU_CLOCK_MAX; type++) {
>>> -        progress |= qemu_clock_run_timers(type);
>>> +        if (qemu_clock_use_for_deadline(type)) {
>>> +            progress |= qemu_clock_run_timers(type);
>>> +        }
>>
>> minor nit: its not really qemu_clock_run_all_timers() now is it. We run
>> all but the virtual timers (if icount is enabled).
>
> Well yeah, it's all those that pass qemu_clock_use_for_deadline.
>
> Paolo

Have you done any testing with record/replay? So far I have one
reproducible run and one failure. However it is not entirely clear to me
how I am meant to cleanly halt and stop a machine so I don't corrupt the
replay log.

--
Alex Bennée

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread
  2017-03-13 18:15       ` Alex Bennée
@ 2017-03-14 10:05         ` Paolo Bonzini
  2017-03-14 12:57           ` Paolo Bonzini
  0 siblings, 1 reply; 26+ messages in thread
From: Paolo Bonzini @ 2017-03-14 10:05 UTC (permalink / raw)
  To: Alex Bennée; +Cc: qemu-devel



On 13/03/2017 19:15, Alex Bennée wrote:
>> Well yeah, it's all those that pass qemu_clock_use_for_deadline.
>
> Have you done any testing with record/replay? So far I have one
> reproducible run and one failure. However it is not entirely clear to me
> how I am meant to cleanly halt and stop a machine so I don't corrupt the
> replay log.

No, I haven't tried RR.

Paolo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread
  2017-03-14 10:05         ` Paolo Bonzini
@ 2017-03-14 12:57           ` Paolo Bonzini
  2017-03-14 15:43             ` Alex Bennée
  0 siblings, 1 reply; 26+ messages in thread
From: Paolo Bonzini @ 2017-03-14 12:57 UTC (permalink / raw)
  To: Alex Bennée; +Cc: qemu-devel



On 14/03/2017 11:05, Paolo Bonzini wrote:
> 
> 
> On 13/03/2017 19:15, Alex Bennée wrote:
>>> Well yeah, it's all those that pass qemu_clock_use_for_deadline.
>>
>> Have you done any testing with record/replay? So far I have one
>> reproducible run and one failure. However it is not entirely clear to me
>> how I am meant to cleanly halt and stop a machine so I don't corrupt the
>> replay log.
> 
> No, I haven't tried RR.

Tried now, it doesn't work but I won't have time to fix it.

I tried "-kernel /path/to/bzImage -icount 3,rr=record,rrfile=replay.bin
-net none".  record works, replay hangs.  The hang is at a "pause"
instruction.  With "-d in_asm" it still hangs, but a bit later (still in
the firmware).

Paolo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread
  2017-03-14 12:57           ` Paolo Bonzini
@ 2017-03-14 15:43             ` Alex Bennée
  2017-03-14 16:23               ` Paolo Bonzini
  0 siblings, 1 reply; 26+ messages in thread
From: Alex Bennée @ 2017-03-14 15:43 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel


Paolo Bonzini <pbonzini@redhat.com> writes:

> On 14/03/2017 11:05, Paolo Bonzini wrote:
>>
>>
>> On 13/03/2017 19:15, Alex Bennée wrote:
>>>> Well yeah, it's all those that pass qemu_clock_use_for_deadline.
>>>
>>> Have you done any testing with record/replay? So far I have one
>>> reproducible run and one failure. However it is not entirely clear to me
>>> how I am meant to cleanly halt and stop a machine so I don't corrupt the
>>> replay log.
>>
>> No, I haven't tried RR.
>
> Tried now, it doesn't work but I won't have time to fix it.
>
> I tried "-kernel /path/to/bzImage -icount 3,rr=record,rrfile=replay.bin
> -net none".  record works, replay hangs.  The hang is at a "pause"
> instruction.  With "-d in_asm" it still hangs, but a bit later (still in
> the firmware).

I'm happy to keep digging.

So to be clear when you do the record step how do you save your state
before you replay?

I was just entering the MMI and typing quit but that seems to leave the
log in a broken state.

--
Alex Bennée

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread
  2017-03-14 15:43             ` Alex Bennée
@ 2017-03-14 16:23               ` Paolo Bonzini
  0 siblings, 0 replies; 26+ messages in thread
From: Paolo Bonzini @ 2017-03-14 16:23 UTC (permalink / raw)
  To: Alex Bennée; +Cc: qemu-devel



On 14/03/2017 16:43, Alex Bennée wrote:
>>
>> I tried "-kernel /path/to/bzImage -icount 3,rr=record,rrfile=replay.bin
>> -net none".  record works, replay hangs.  The hang is at a "pause"
>> instruction.  With "-d in_asm" it still hangs, but a bit later (still in
>> the firmware).
> I'm happy to keep digging.
> 
> So to be clear when you do the record step how do you save your state
> before you replay?

Here it hangs on the replay way before the end of the record, so it
wasn't an issue.

Paolo

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2017-03-14 16:23 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-03 13:11 [Qemu-devel] [PATCH 0/6] tcg: fix icount super slowdown Paolo Bonzini
2017-03-03 13:11 ` [Qemu-devel] [PATCH 1/5] qemu-timer: fix off-by-one Paolo Bonzini
2017-03-03 13:48   ` Edgar E. Iglesias
2017-03-10  9:46   ` Alex Bennée
2017-03-03 13:11 ` [Qemu-devel] [PATCH 2/5] qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h Paolo Bonzini
2017-03-03 13:48   ` Edgar E. Iglesias
2017-03-03 14:50   ` Alex Bennée
2017-03-03 14:55     ` Paolo Bonzini
2017-03-10  7:42   ` [Qemu-devel] [PATCH] fixup! " Alex Bennée
2017-03-10  8:27     ` Peter Maydell
2017-03-10  9:47   ` [Qemu-devel] [PATCH 2/5] " Alex Bennée
2017-03-03 13:11 ` [Qemu-devel] [PATCH 3/5] cpus: define QEMUTimerListNotifyCB for QEMU system emulation Paolo Bonzini
2017-03-03 13:53   ` Edgar E. Iglesias
2017-03-03 13:11 ` [Qemu-devel] [PATCH 4/5] main-loop: remove now unnecessary optimization Paolo Bonzini
2017-03-03 13:53   ` Edgar E. Iglesias
2017-03-13 16:23   ` Alex Bennée
2017-03-03 13:11 ` [Qemu-devel] [PATCH 5/5] icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread Paolo Bonzini
2017-03-13 16:53   ` Alex Bennée
2017-03-13 17:16     ` Paolo Bonzini
2017-03-13 18:15       ` Alex Bennée
2017-03-14 10:05         ` Paolo Bonzini
2017-03-14 12:57           ` Paolo Bonzini
2017-03-14 15:43             ` Alex Bennée
2017-03-14 16:23               ` Paolo Bonzini
2017-03-09 17:19 ` [Qemu-devel] [PATCH 0/6] tcg: fix icount super slowdown Alex Bennée
2017-03-09 17:22   ` Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.