qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/9] memory: Sanity checks memory transaction when releasing BQL
@ 2021-07-23 19:34 Peter Xu
  2021-07-23 19:34 ` [PATCH v2 1/9] cpus: Export queue work related fields to cpu.h Peter Xu
                   ` (10 more replies)
  0 siblings, 11 replies; 28+ messages in thread
From: Peter Xu @ 2021-07-23 19:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, David Hildenbrand, peterx, Richard Henderson

This is v2 of the series.  It was actually got forgotten for months until it
was used to identify another potential issue of bql usage here (besides it
could still be helpful when debugging a previous kvm dirty ring issue in that
series):

https://lore.kernel.org/qemu-devel/CH0PR02MB7898BBD73D0F3F7D5003BB178BE19@CH0PR02MB7898.namprd02.prod.outlook.com/

So I figured maybe it's still worth to have it, hence a repost.

There're some changes against v1:

  - patch "cpus: Introduce qemu_cond_timedwait_iothread()" is dropped because
    it's introduced in another commit already (b0c3cf9407e64).

  - two more patches to move do_run_on_cpu() into softmmu/ to fix a linux-user
    compliation issue.

Please review, thanks.

=== Original Cover letter ===

This is a continuous work of previous discussion on memory transactions [1].
It should be helpful to fail QEMU far earlier if there's misuse of BQL against
the QEMU memory model.

One example is run_on_cpu() during memory commit.  That'll work previously, but
it'll fail with very strange errors (like KVM ioctl failure due to memslot
already existed, and it's not guaranteed to trigger constantly).  Now it'll
directly fail when run_on_cpu() is called.

Please have a look, thanks.

[1] https://lists.gnu.org/archive/html/qemu-devel/2020-04/msg03205.html

Peter Xu (9):
  cpus: Export queue work related fields to cpu.h
  cpus: Move do_run_on_cpu into softmmu/cpus.c
  memory: Introduce memory_region_transaction_{push|pop}()
  memory: Don't do topology update in memory finalize()
  cpus: Use qemu_cond_wait_iothread() where proper
  cpus: Remove the mutex parameter from do_run_on_cpu()
  cpus: Introduce qemu_mutex_unlock_iothread_prepare()
  memory: Assert on no ongoing memory transaction before release BQL
  memory: Delay the transaction pop() until commit completed

 cpus-common.c                  | 36 ++---------------------
 include/exec/memory-internal.h |  1 +
 include/hw/core/cpu.h          | 22 ++++++--------
 softmmu/cpus.c                 | 42 ++++++++++++++++++++++++---
 softmmu/memory.c               | 53 ++++++++++++++++++++++++++++++----
 5 files changed, 97 insertions(+), 57 deletions(-)

-- 
2.31.1




^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 1/9] cpus: Export queue work related fields to cpu.h
  2021-07-23 19:34 [PATCH v2 0/9] memory: Sanity checks memory transaction when releasing BQL Peter Xu
@ 2021-07-23 19:34 ` Peter Xu
  2021-07-27 13:02   ` David Hildenbrand
  2021-07-23 19:34 ` [PATCH v2 2/9] cpus: Move do_run_on_cpu into softmmu/cpus.c Peter Xu
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Peter Xu @ 2021-07-23 19:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, David Hildenbrand, peterx, Richard Henderson

This patch has no functional change, but prepares for moving the function
do_run_on_cpu() into softmmu/cpus.c.  It does:

  1. Move qemu_work_item into hw/core/cpu.h.
  2. Export queue_work_on_cpu()/qemu_work_cond.

All of them will be used by softmmu/cpus.c later.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 cpus-common.c         | 11 ++---------
 include/hw/core/cpu.h | 10 +++++++++-
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/cpus-common.c b/cpus-common.c
index 6e73d3e58d..d814b2439a 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -27,7 +27,7 @@
 static QemuMutex qemu_cpu_list_lock;
 static QemuCond exclusive_cond;
 static QemuCond exclusive_resume;
-static QemuCond qemu_work_cond;
+QemuCond qemu_work_cond;
 
 /* >= 1 if a thread is inside start_exclusive/end_exclusive.  Written
  * under qemu_cpu_list_lock, read with atomic operations.
@@ -114,14 +114,7 @@ CPUState *qemu_get_cpu(int index)
 /* current CPU in the current thread. It is only valid inside cpu_exec() */
 __thread CPUState *current_cpu;
 
-struct qemu_work_item {
-    QSIMPLEQ_ENTRY(qemu_work_item) node;
-    run_on_cpu_func func;
-    run_on_cpu_data data;
-    bool free, exclusive, done;
-};
-
-static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
+void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
 {
     qemu_mutex_lock(&cpu->work_mutex);
     QSIMPLEQ_INSERT_TAIL(&cpu->work_list, wi, node);
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index bc864564ce..f62ae88524 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -243,7 +243,15 @@ typedef union {
 
 typedef void (*run_on_cpu_func)(CPUState *cpu, run_on_cpu_data data);
 
-struct qemu_work_item;
+struct qemu_work_item {
+    QSIMPLEQ_ENTRY(qemu_work_item) node;
+    run_on_cpu_func func;
+    run_on_cpu_data data;
+    bool free, exclusive, done;
+};
+
+void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi);
+extern QemuCond qemu_work_cond;
 
 #define CPU_UNSET_NUMA_NODE_ID -1
 #define CPU_TRACE_DSTATE_MAX_EVENTS 32
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 2/9] cpus: Move do_run_on_cpu into softmmu/cpus.c
  2021-07-23 19:34 [PATCH v2 0/9] memory: Sanity checks memory transaction when releasing BQL Peter Xu
  2021-07-23 19:34 ` [PATCH v2 1/9] cpus: Export queue work related fields to cpu.h Peter Xu
@ 2021-07-23 19:34 ` Peter Xu
  2021-07-27 13:04   ` David Hildenbrand
  2021-07-23 19:34 ` [PATCH v2 3/9] memory: Introduce memory_region_transaction_{push|pop}() Peter Xu
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Peter Xu @ 2021-07-23 19:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, David Hildenbrand, peterx, Richard Henderson

It's only used by softmmu binaries not linux-user ones.  Make it static and
drop the definition in the header too.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 cpus-common.c         | 25 -------------------------
 include/hw/core/cpu.h | 12 ------------
 softmmu/cpus.c        | 26 ++++++++++++++++++++++++++
 3 files changed, 26 insertions(+), 37 deletions(-)

diff --git a/cpus-common.c b/cpus-common.c
index d814b2439a..670826363f 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -124,31 +124,6 @@ void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
     qemu_cpu_kick(cpu);
 }
 
-void do_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data,
-                   QemuMutex *mutex)
-{
-    struct qemu_work_item wi;
-
-    if (qemu_cpu_is_self(cpu)) {
-        func(cpu, data);
-        return;
-    }
-
-    wi.func = func;
-    wi.data = data;
-    wi.done = false;
-    wi.free = false;
-    wi.exclusive = false;
-
-    queue_work_on_cpu(cpu, &wi);
-    while (!qatomic_mb_read(&wi.done)) {
-        CPUState *self_cpu = current_cpu;
-
-        qemu_cond_wait(&qemu_work_cond, mutex);
-        current_cpu = self_cpu;
-    }
-}
-
 void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data)
 {
     struct qemu_work_item *wi;
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index f62ae88524..711ecad62f 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -689,18 +689,6 @@ void qemu_cpu_kick(CPUState *cpu);
  */
 bool cpu_is_stopped(CPUState *cpu);
 
-/**
- * do_run_on_cpu:
- * @cpu: The vCPU to run on.
- * @func: The function to be executed.
- * @data: Data to pass to the function.
- * @mutex: Mutex to release while waiting for @func to run.
- *
- * Used internally in the implementation of run_on_cpu.
- */
-void do_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data,
-                   QemuMutex *mutex);
-
 /**
  * run_on_cpu:
  * @cpu: The vCPU to run on.
diff --git a/softmmu/cpus.c b/softmmu/cpus.c
index 071085f840..52adc98d39 100644
--- a/softmmu/cpus.c
+++ b/softmmu/cpus.c
@@ -382,6 +382,32 @@ void qemu_init_cpu_loop(void)
     qemu_thread_get_self(&io_thread);
 }
 
+static void
+do_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data,
+              QemuMutex *mutex)
+{
+    struct qemu_work_item wi;
+
+    if (qemu_cpu_is_self(cpu)) {
+        func(cpu, data);
+        return;
+    }
+
+    wi.func = func;
+    wi.data = data;
+    wi.done = false;
+    wi.free = false;
+    wi.exclusive = false;
+
+    queue_work_on_cpu(cpu, &wi);
+    while (!qatomic_mb_read(&wi.done)) {
+        CPUState *self_cpu = current_cpu;
+
+        qemu_cond_wait(&qemu_work_cond, mutex);
+        current_cpu = self_cpu;
+    }
+}
+
 void run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data)
 {
     do_run_on_cpu(cpu, func, data, &qemu_global_mutex);
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 3/9] memory: Introduce memory_region_transaction_{push|pop}()
  2021-07-23 19:34 [PATCH v2 0/9] memory: Sanity checks memory transaction when releasing BQL Peter Xu
  2021-07-23 19:34 ` [PATCH v2 1/9] cpus: Export queue work related fields to cpu.h Peter Xu
  2021-07-23 19:34 ` [PATCH v2 2/9] cpus: Move do_run_on_cpu into softmmu/cpus.c Peter Xu
@ 2021-07-23 19:34 ` Peter Xu
  2021-07-27 13:06   ` David Hildenbrand
  2021-07-23 19:34 ` [PATCH v2 4/9] memory: Don't do topology update in memory finalize() Peter Xu
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Peter Xu @ 2021-07-23 19:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, David Hildenbrand, peterx, Richard Henderson

memory_region_transaction_{begin|commit}() could be too big when finalizing a
memory region.  E.g., we should never attempt to update address space topology
during the finalize() of a memory region.  Provide helpers for further use.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 softmmu/memory.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/softmmu/memory.c b/softmmu/memory.c
index bfedaf9c4d..1a3e9ff8ad 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -1079,10 +1079,20 @@ static void address_space_update_topology(AddressSpace *as)
     address_space_set_flatview(as);
 }
 
+static void memory_region_transaction_push(void)
+{
+    memory_region_transaction_depth++;
+}
+
+static void memory_region_transaction_pop(void)
+{
+    memory_region_transaction_depth--;
+}
+
 void memory_region_transaction_begin(void)
 {
     qemu_flush_coalesced_mmio_buffer();
-    ++memory_region_transaction_depth;
+    memory_region_transaction_push();
 }
 
 void memory_region_transaction_commit(void)
@@ -1092,7 +1102,7 @@ void memory_region_transaction_commit(void)
     assert(memory_region_transaction_depth);
     assert(qemu_mutex_iothread_locked());
 
-    --memory_region_transaction_depth;
+    memory_region_transaction_pop();
     if (!memory_region_transaction_depth) {
         if (memory_region_update_pending) {
             flatviews_reset();
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 4/9] memory: Don't do topology update in memory finalize()
  2021-07-23 19:34 [PATCH v2 0/9] memory: Sanity checks memory transaction when releasing BQL Peter Xu
                   ` (2 preceding siblings ...)
  2021-07-23 19:34 ` [PATCH v2 3/9] memory: Introduce memory_region_transaction_{push|pop}() Peter Xu
@ 2021-07-23 19:34 ` Peter Xu
  2021-07-27 13:21   ` David Hildenbrand
  2021-07-23 19:34 ` [PATCH v2 5/9] cpus: Use qemu_cond_wait_iothread() where proper Peter Xu
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Peter Xu @ 2021-07-23 19:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, David Hildenbrand, peterx, Richard Henderson

Topology update could be wrongly triggered in memory region finalize() if
there's bug somewhere else.  It'll be a very confusing stack when it
happens (e.g., sending KVM ioctl within the RCU thread, and we'll observe it
only until it fails!).

Instead of that, we use the push()/pop() helper to avoid memory transaction
commit, at the same time we use assertions to make sure there's no pending
updates or it's a nested transaction, so it could fail even earlier and in a
more explicit way.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 softmmu/memory.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/softmmu/memory.c b/softmmu/memory.c
index 1a3e9ff8ad..dfce4a2bda 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -170,6 +170,12 @@ struct MemoryRegionIoeventfd {
     EventNotifier *e;
 };
 
+/* Returns whether there's any pending memory updates */
+static bool memory_region_has_pending_update(void)
+{
+    return memory_region_update_pending || ioeventfd_update_pending;
+}
+
 static bool memory_region_ioeventfd_before(MemoryRegionIoeventfd *a,
                                            MemoryRegionIoeventfd *b)
 {
@@ -1756,12 +1762,25 @@ static void memory_region_finalize(Object *obj)
      * and cause an infinite loop.
      */
     mr->enabled = false;
-    memory_region_transaction_begin();
+
+    /*
+     * Use push()/pop() instead of begin()/commit() to make sure below block
+     * won't trigger any topology update (which should never happen, but it's
+     * still a safety belt).
+     */
+    memory_region_transaction_push();
     while (!QTAILQ_EMPTY(&mr->subregions)) {
         MemoryRegion *subregion = QTAILQ_FIRST(&mr->subregions);
         memory_region_del_subregion(mr, subregion);
     }
-    memory_region_transaction_commit();
+    memory_region_transaction_pop();
+
+    /*
+     * Make sure we're either in a nested transaction or there must have no
+     * pending updates due to memory_region_del_subregion() above.
+     */
+    assert(memory_region_transaction_depth ||
+           !memory_region_has_pending_update());
 
     mr->destructor(mr);
     memory_region_clear_coalescing(mr);
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 5/9] cpus: Use qemu_cond_wait_iothread() where proper
  2021-07-23 19:34 [PATCH v2 0/9] memory: Sanity checks memory transaction when releasing BQL Peter Xu
                   ` (3 preceding siblings ...)
  2021-07-23 19:34 ` [PATCH v2 4/9] memory: Don't do topology update in memory finalize() Peter Xu
@ 2021-07-23 19:34 ` Peter Xu
  2021-07-27 12:49   ` David Hildenbrand
  2021-07-23 19:34 ` [PATCH v2 6/9] cpus: Remove the mutex parameter from do_run_on_cpu() Peter Xu
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Peter Xu @ 2021-07-23 19:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, David Hildenbrand, peterx, Richard Henderson

The helper is introduced but we've still got plenty of places that are directly
referencing the qemu_global_mutex itself.  Spread the usage.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 softmmu/cpus.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/softmmu/cpus.c b/softmmu/cpus.c
index 52adc98d39..94c2804192 100644
--- a/softmmu/cpus.c
+++ b/softmmu/cpus.c
@@ -442,7 +442,7 @@ void qemu_wait_io_event(CPUState *cpu)
             slept = true;
             qemu_plugin_vcpu_idle_cb(cpu);
         }
-        qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);
+        qemu_cond_wait_iothread(cpu->halt_cond);
     }
     if (slept) {
         qemu_plugin_vcpu_resume_cb(cpu);
@@ -585,7 +585,7 @@ void pause_all_vcpus(void)
     replay_mutex_unlock();
 
     while (!all_vcpus_paused()) {
-        qemu_cond_wait(&qemu_pause_cond, &qemu_global_mutex);
+        qemu_cond_wait_iothread(&qemu_pause_cond);
         CPU_FOREACH(cpu) {
             qemu_cpu_kick(cpu);
         }
@@ -656,7 +656,7 @@ void qemu_init_vcpu(CPUState *cpu)
     cpus_accel->create_vcpu_thread(cpu);
 
     while (!cpu->created) {
-        qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
+        qemu_cond_wait_iothread(&qemu_cpu_cond);
     }
 }
 
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 6/9] cpus: Remove the mutex parameter from do_run_on_cpu()
  2021-07-23 19:34 [PATCH v2 0/9] memory: Sanity checks memory transaction when releasing BQL Peter Xu
                   ` (4 preceding siblings ...)
  2021-07-23 19:34 ` [PATCH v2 5/9] cpus: Use qemu_cond_wait_iothread() where proper Peter Xu
@ 2021-07-23 19:34 ` Peter Xu
  2021-07-27 12:50   ` David Hildenbrand
  2021-07-23 19:34 ` [PATCH v2 7/9] cpus: Introduce qemu_mutex_unlock_iothread_prepare() Peter Xu
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Peter Xu @ 2021-07-23 19:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, David Hildenbrand, peterx, Richard Henderson

We must use the BQL for do_run_on_cpu() without much choice, it means the
parameter is helpless.  Remove it.  Meanwhile use the newly introduced
qemu_cond_wait_iothread() in do_run_on_cpu().

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 softmmu/cpus.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/softmmu/cpus.c b/softmmu/cpus.c
index 94c2804192..9131f77f87 100644
--- a/softmmu/cpus.c
+++ b/softmmu/cpus.c
@@ -383,8 +383,7 @@ void qemu_init_cpu_loop(void)
 }
 
 static void
-do_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data,
-              QemuMutex *mutex)
+do_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data)
 {
     struct qemu_work_item wi;
 
@@ -403,14 +402,14 @@ do_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data,
     while (!qatomic_mb_read(&wi.done)) {
         CPUState *self_cpu = current_cpu;
 
-        qemu_cond_wait(&qemu_work_cond, mutex);
+        qemu_cond_wait_iothread(&qemu_work_cond);
         current_cpu = self_cpu;
     }
 }
 
 void run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data)
 {
-    do_run_on_cpu(cpu, func, data, &qemu_global_mutex);
+    do_run_on_cpu(cpu, func, data);
 }
 
 static void qemu_cpu_stop(CPUState *cpu, bool exit)
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 7/9] cpus: Introduce qemu_mutex_unlock_iothread_prepare()
  2021-07-23 19:34 [PATCH v2 0/9] memory: Sanity checks memory transaction when releasing BQL Peter Xu
                   ` (5 preceding siblings ...)
  2021-07-23 19:34 ` [PATCH v2 6/9] cpus: Remove the mutex parameter from do_run_on_cpu() Peter Xu
@ 2021-07-23 19:34 ` Peter Xu
  2021-07-27 12:59   ` David Hildenbrand
  2021-07-23 19:34 ` [PATCH v2 8/9] memory: Assert on no ongoing memory transaction before release BQL Peter Xu
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Peter Xu @ 2021-07-23 19:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, David Hildenbrand, peterx, Richard Henderson

The prepare function before unlocking BQL.  There're only three places that can
release the BQL: unlock(), cond_wait() or cond_timedwait().

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 softmmu/cpus.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/softmmu/cpus.c b/softmmu/cpus.c
index 9131f77f87..6085f8edbe 100644
--- a/softmmu/cpus.c
+++ b/softmmu/cpus.c
@@ -66,6 +66,10 @@
 
 static QemuMutex qemu_global_mutex;
 
+static void qemu_mutex_unlock_iothread_prepare(void)
+{
+}
+
 bool cpu_is_stopped(CPUState *cpu)
 {
     return cpu->stopped || !runstate_is_running();
@@ -523,16 +527,19 @@ void qemu_mutex_unlock_iothread(void)
 {
     g_assert(qemu_mutex_iothread_locked());
     iothread_locked = false;
+    qemu_mutex_unlock_iothread_prepare();
     qemu_mutex_unlock(&qemu_global_mutex);
 }
 
 void qemu_cond_wait_iothread(QemuCond *cond)
 {
+    qemu_mutex_unlock_iothread_prepare();
     qemu_cond_wait(cond, &qemu_global_mutex);
 }
 
 void qemu_cond_timedwait_iothread(QemuCond *cond, int ms)
 {
+    qemu_mutex_unlock_iothread_prepare();
     qemu_cond_timedwait(cond, &qemu_global_mutex, ms);
 }
 
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 8/9] memory: Assert on no ongoing memory transaction before release BQL
  2021-07-23 19:34 [PATCH v2 0/9] memory: Sanity checks memory transaction when releasing BQL Peter Xu
                   ` (6 preceding siblings ...)
  2021-07-23 19:34 ` [PATCH v2 7/9] cpus: Introduce qemu_mutex_unlock_iothread_prepare() Peter Xu
@ 2021-07-23 19:34 ` Peter Xu
  2021-07-27 13:00   ` David Hildenbrand
  2021-07-23 19:34 ` [PATCH v2 9/9] memory: Delay the transaction pop() until commit completed Peter Xu
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Peter Xu @ 2021-07-23 19:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, David Hildenbrand, peterx, Richard Henderson

Make sure we don't have any more ongoing memory transaction when releasing the
BQL.  This will trigger an abort if we misuse the QEMU memory model, e.g., when
calling run_on_cpu() during a memory commit.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/exec/memory-internal.h | 1 +
 softmmu/cpus.c                 | 2 ++
 softmmu/memory.c               | 6 ++++++
 3 files changed, 9 insertions(+)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 9fcc2af25c..3124b91c4b 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -35,6 +35,7 @@ static inline AddressSpaceDispatch *address_space_to_dispatch(AddressSpace *as)
 
 FlatView *address_space_get_flatview(AddressSpace *as);
 void flatview_unref(FlatView *view);
+bool memory_region_has_pending_transaction(void);
 
 extern const MemoryRegionOps unassigned_mem_ops;
 
diff --git a/softmmu/cpus.c b/softmmu/cpus.c
index 6085f8edbe..14a50289f8 100644
--- a/softmmu/cpus.c
+++ b/softmmu/cpus.c
@@ -31,6 +31,7 @@
 #include "qapi/qapi-events-run-state.h"
 #include "qapi/qmp/qerror.h"
 #include "exec/gdbstub.h"
+#include "exec/memory-internal.h"
 #include "sysemu/hw_accel.h"
 #include "exec/exec-all.h"
 #include "qemu/thread.h"
@@ -68,6 +69,7 @@ static QemuMutex qemu_global_mutex;
 
 static void qemu_mutex_unlock_iothread_prepare(void)
 {
+    assert(!memory_region_has_pending_transaction());
 }
 
 bool cpu_is_stopped(CPUState *cpu)
diff --git a/softmmu/memory.c b/softmmu/memory.c
index dfce4a2bda..08327c22e2 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -176,6 +176,12 @@ static bool memory_region_has_pending_update(void)
     return memory_region_update_pending || ioeventfd_update_pending;
 }
 
+bool memory_region_has_pending_transaction(void)
+{
+    return memory_region_transaction_depth ||
+        memory_region_has_pending_update();
+}
+
 static bool memory_region_ioeventfd_before(MemoryRegionIoeventfd *a,
                                            MemoryRegionIoeventfd *b)
 {
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 9/9] memory: Delay the transaction pop() until commit completed
  2021-07-23 19:34 [PATCH v2 0/9] memory: Sanity checks memory transaction when releasing BQL Peter Xu
                   ` (7 preceding siblings ...)
  2021-07-23 19:34 ` [PATCH v2 8/9] memory: Assert on no ongoing memory transaction before release BQL Peter Xu
@ 2021-07-23 19:34 ` Peter Xu
  2021-07-27 13:02   ` David Hildenbrand
  2021-07-23 22:36 ` [PATCH v2 0/9] memory: Sanity checks memory transaction when releasing BQL Peter Xu
  2021-07-27 12:41 ` David Hildenbrand
  10 siblings, 1 reply; 28+ messages in thread
From: Peter Xu @ 2021-07-23 19:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, David Hildenbrand, peterx, Richard Henderson

This should be functionally the same as before, but this allows the
memory_region_transaction_depth to be non-zero during commit, which can help us
to do sanity check on misuses.

Since at it, fix an indentation issue on the bracket.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 softmmu/memory.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/softmmu/memory.c b/softmmu/memory.c
index 08327c22e2..cf7943c02c 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -1114,8 +1114,7 @@ void memory_region_transaction_commit(void)
     assert(memory_region_transaction_depth);
     assert(qemu_mutex_iothread_locked());
 
-    memory_region_transaction_pop();
-    if (!memory_region_transaction_depth) {
+    if (memory_region_transaction_depth == 1) {
         if (memory_region_update_pending) {
             flatviews_reset();
 
@@ -1134,7 +1133,14 @@ void memory_region_transaction_commit(void)
             }
             ioeventfd_update_pending = false;
         }
-   }
+    }
+
+    /*
+     * Pop the depth at last, so that memory_region_transaction_depth will
+     * still be non-zero during committing.  This can help us to do some sanity
+     * check within the process of committing.
+     */
+    memory_region_transaction_pop();
 }
 
 static void memory_region_destructor_none(MemoryRegion *mr)
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/9] memory: Sanity checks memory transaction when releasing BQL
  2021-07-23 19:34 [PATCH v2 0/9] memory: Sanity checks memory transaction when releasing BQL Peter Xu
                   ` (8 preceding siblings ...)
  2021-07-23 19:34 ` [PATCH v2 9/9] memory: Delay the transaction pop() until commit completed Peter Xu
@ 2021-07-23 22:36 ` Peter Xu
  2021-07-27 12:41 ` David Hildenbrand
  10 siblings, 0 replies; 28+ messages in thread
From: Peter Xu @ 2021-07-23 22:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, David Hildenbrand, Philippe Mathieu-Daudé,
	Richard Henderson

On Fri, Jul 23, 2021 at 03:34:35PM -0400, Peter Xu wrote:
> This is v2 of the series.  It was actually got forgotten for months until it
> was used to identify another potential issue of bql usage here (besides it
> could still be helpful when debugging a previous kvm dirty ring issue in that
> series):
> 
> https://lore.kernel.org/qemu-devel/CH0PR02MB7898BBD73D0F3F7D5003BB178BE19@CH0PR02MB7898.namprd02.prod.outlook.com/
> 
> So I figured maybe it's still worth to have it, hence a repost.
> 
> There're some changes against v1:
> 
>   - patch "cpus: Introduce qemu_cond_timedwait_iothread()" is dropped because
>     it's introduced in another commit already (b0c3cf9407e64).
> 
>   - two more patches to move do_run_on_cpu() into softmmu/ to fix a linux-user
>     compliation issue.
> 
> Please review, thanks.
> 
> === Original Cover letter ===
> 
> This is a continuous work of previous discussion on memory transactions [1].
> It should be helpful to fail QEMU far earlier if there's misuse of BQL against
> the QEMU memory model.
> 
> One example is run_on_cpu() during memory commit.  That'll work previously, but
> it'll fail with very strange errors (like KVM ioctl failure due to memslot
> already existed, and it's not guaranteed to trigger constantly).  Now it'll
> directly fail when run_on_cpu() is called.
> 
> Please have a look, thanks.
> 
> [1] https://lists.gnu.org/archive/html/qemu-devel/2020-04/msg03205.html

CC Phil too.

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/9] memory: Sanity checks memory transaction when releasing BQL
  2021-07-23 19:34 [PATCH v2 0/9] memory: Sanity checks memory transaction when releasing BQL Peter Xu
                   ` (9 preceding siblings ...)
  2021-07-23 22:36 ` [PATCH v2 0/9] memory: Sanity checks memory transaction when releasing BQL Peter Xu
@ 2021-07-27 12:41 ` David Hildenbrand
  2021-07-27 16:35   ` Peter Xu
  10 siblings, 1 reply; 28+ messages in thread
From: David Hildenbrand @ 2021-07-27 12:41 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: Paolo Bonzini, Richard Henderson

On 23.07.21 21:34, Peter Xu wrote:
> This is v2 of the series.  It was actually got forgotten for months until it
> was used to identify another potential issue of bql usage here (besides it
> could still be helpful when debugging a previous kvm dirty ring issue in that
> series):
> 
> https://lore.kernel.org/qemu-devel/CH0PR02MB7898BBD73D0F3F7D5003BB178BE19@CH0PR02MB7898.namprd02.prod.outlook.com/
> 
> So I figured maybe it's still worth to have it, hence a repost.
> 
> There're some changes against v1:
> 
>    - patch "cpus: Introduce qemu_cond_timedwait_iothread()" is dropped because
>      it's introduced in another commit already (b0c3cf9407e64).
> 
>    - two more patches to move do_run_on_cpu() into softmmu/ to fix a linux-user
>      compliation issue.
> 
> Please review, thanks.
> 
> === Original Cover letter ===
> 
> This is a continuous work of previous discussion on memory transactions [1].
> It should be helpful to fail QEMU far earlier if there's misuse of BQL against
> the QEMU memory model.
> 
> One example is run_on_cpu() during memory commit.  That'll work previously, but
> it'll fail with very strange errors (like KVM ioctl failure due to memslot
> already existed, and it's not guaranteed to trigger constantly).  Now it'll
> directly fail when run_on_cpu() is called.
> 

Functions that silently drop the BQL are really nasty. I once fall into 
a similar trap calling pause_all_vcpus() from within 
memory_region_transaction_begin(), while resizing RAM blocks.


-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 5/9] cpus: Use qemu_cond_wait_iothread() where proper
  2021-07-23 19:34 ` [PATCH v2 5/9] cpus: Use qemu_cond_wait_iothread() where proper Peter Xu
@ 2021-07-27 12:49   ` David Hildenbrand
  0 siblings, 0 replies; 28+ messages in thread
From: David Hildenbrand @ 2021-07-27 12:49 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: Paolo Bonzini, Richard Henderson

On 23.07.21 21:34, Peter Xu wrote:
> The helper is introduced but we've still got plenty of places that are directly
> referencing the qemu_global_mutex itself.  Spread the usage.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   softmmu/cpus.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> index 52adc98d39..94c2804192 100644
> --- a/softmmu/cpus.c
> +++ b/softmmu/cpus.c
> @@ -442,7 +442,7 @@ void qemu_wait_io_event(CPUState *cpu)
>               slept = true;
>               qemu_plugin_vcpu_idle_cb(cpu);
>           }
> -        qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);
> +        qemu_cond_wait_iothread(cpu->halt_cond);
>       }
>       if (slept) {
>           qemu_plugin_vcpu_resume_cb(cpu);
> @@ -585,7 +585,7 @@ void pause_all_vcpus(void)
>       replay_mutex_unlock();
>   
>       while (!all_vcpus_paused()) {
> -        qemu_cond_wait(&qemu_pause_cond, &qemu_global_mutex);
> +        qemu_cond_wait_iothread(&qemu_pause_cond);
>           CPU_FOREACH(cpu) {
>               qemu_cpu_kick(cpu);
>           }
> @@ -656,7 +656,7 @@ void qemu_init_vcpu(CPUState *cpu)
>       cpus_accel->create_vcpu_thread(cpu);
>   
>       while (!cpu->created) {
> -        qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
> +        qemu_cond_wait_iothread(&qemu_cpu_cond);
>       }
>   }
>   
> 

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 6/9] cpus: Remove the mutex parameter from do_run_on_cpu()
  2021-07-23 19:34 ` [PATCH v2 6/9] cpus: Remove the mutex parameter from do_run_on_cpu() Peter Xu
@ 2021-07-27 12:50   ` David Hildenbrand
  0 siblings, 0 replies; 28+ messages in thread
From: David Hildenbrand @ 2021-07-27 12:50 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: Paolo Bonzini, Richard Henderson

On 23.07.21 21:34, Peter Xu wrote:
> We must use the BQL for do_run_on_cpu() without much choice, it means the
> parameter is helpless.  Remove it.  Meanwhile use the newly introduced

s/helpless/useless/ ?

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 7/9] cpus: Introduce qemu_mutex_unlock_iothread_prepare()
  2021-07-23 19:34 ` [PATCH v2 7/9] cpus: Introduce qemu_mutex_unlock_iothread_prepare() Peter Xu
@ 2021-07-27 12:59   ` David Hildenbrand
  2021-07-27 16:08     ` Peter Xu
  0 siblings, 1 reply; 28+ messages in thread
From: David Hildenbrand @ 2021-07-27 12:59 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: Paolo Bonzini, Richard Henderson

On 23.07.21 21:34, Peter Xu wrote:
> The prepare function before unlocking BQL.  There're only three places that can
> release the BQL: unlock(), cond_wait() or cond_timedwait().
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   softmmu/cpus.c | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> index 9131f77f87..6085f8edbe 100644
> --- a/softmmu/cpus.c
> +++ b/softmmu/cpus.c
> @@ -66,6 +66,10 @@
>   
>   static QemuMutex qemu_global_mutex;
>   
> +static void qemu_mutex_unlock_iothread_prepare(void)
> +{
> +}
> +
>   bool cpu_is_stopped(CPUState *cpu)
>   {
>       return cpu->stopped || !runstate_is_running();
> @@ -523,16 +527,19 @@ void qemu_mutex_unlock_iothread(void)
>   {
>       g_assert(qemu_mutex_iothread_locked());
>       iothread_locked = false;
> +    qemu_mutex_unlock_iothread_prepare();
>       qemu_mutex_unlock(&qemu_global_mutex);
>   }
>   
>   void qemu_cond_wait_iothread(QemuCond *cond)
>   {
> +    qemu_mutex_unlock_iothread_prepare();
>       qemu_cond_wait(cond, &qemu_global_mutex);
>   }
>   
>   void qemu_cond_timedwait_iothread(QemuCond *cond, int ms)
>   {
> +    qemu_mutex_unlock_iothread_prepare();
>       qemu_cond_timedwait(cond, &qemu_global_mutex, ms);
>   }
>   
> 

I'd squash this patch into the next one.

I don't quite like the function name, but don't really have a better 
suggestion .... maybe qemu_mutex_might_unlock_iothread(), similar to 
might_sleep() or might_fault() in the kernel. (although here it's pretty 
clear and not a "might"; could be useful in other context where we might 
conditionally unlock the BQL at some point in the future, though)

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 8/9] memory: Assert on no ongoing memory transaction before release BQL
  2021-07-23 19:34 ` [PATCH v2 8/9] memory: Assert on no ongoing memory transaction before release BQL Peter Xu
@ 2021-07-27 13:00   ` David Hildenbrand
  0 siblings, 0 replies; 28+ messages in thread
From: David Hildenbrand @ 2021-07-27 13:00 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: Paolo Bonzini, Richard Henderson

On 23.07.21 21:34, Peter Xu wrote:
> Make sure we don't have any more ongoing memory transaction when releasing the
> BQL.  This will trigger an abort if we misuse the QEMU memory model, e.g., when
> calling run_on_cpu() during a memory commit.

... or pause_all_vcpus() during a memory transaction :)

> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   include/exec/memory-internal.h | 1 +
>   softmmu/cpus.c                 | 2 ++
>   softmmu/memory.c               | 6 ++++++
>   3 files changed, 9 insertions(+)
> 
> diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
> index 9fcc2af25c..3124b91c4b 100644
> --- a/include/exec/memory-internal.h
> +++ b/include/exec/memory-internal.h
> @@ -35,6 +35,7 @@ static inline AddressSpaceDispatch *address_space_to_dispatch(AddressSpace *as)
>   
>   FlatView *address_space_get_flatview(AddressSpace *as);
>   void flatview_unref(FlatView *view);
> +bool memory_region_has_pending_transaction(void);
>   
>   extern const MemoryRegionOps unassigned_mem_ops;
>   
> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> index 6085f8edbe..14a50289f8 100644
> --- a/softmmu/cpus.c
> +++ b/softmmu/cpus.c
> @@ -31,6 +31,7 @@
>   #include "qapi/qapi-events-run-state.h"
>   #include "qapi/qmp/qerror.h"
>   #include "exec/gdbstub.h"
> +#include "exec/memory-internal.h"
>   #include "sysemu/hw_accel.h"
>   #include "exec/exec-all.h"
>   #include "qemu/thread.h"
> @@ -68,6 +69,7 @@ static QemuMutex qemu_global_mutex;
>   
>   static void qemu_mutex_unlock_iothread_prepare(void)
>   {
> +    assert(!memory_region_has_pending_transaction());
>   }
>   
>   bool cpu_is_stopped(CPUState *cpu)
> diff --git a/softmmu/memory.c b/softmmu/memory.c
> index dfce4a2bda..08327c22e2 100644
> --- a/softmmu/memory.c
> +++ b/softmmu/memory.c
> @@ -176,6 +176,12 @@ static bool memory_region_has_pending_update(void)
>       return memory_region_update_pending || ioeventfd_update_pending;
>   }
>   
> +bool memory_region_has_pending_transaction(void)
> +{
> +    return memory_region_transaction_depth ||
> +        memory_region_has_pending_update();
> +}
> +
>   static bool memory_region_ioeventfd_before(MemoryRegionIoeventfd *a,
>                                              MemoryRegionIoeventfd *b)
>   {
> 

LGTM

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 9/9] memory: Delay the transaction pop() until commit completed
  2021-07-23 19:34 ` [PATCH v2 9/9] memory: Delay the transaction pop() until commit completed Peter Xu
@ 2021-07-27 13:02   ` David Hildenbrand
  0 siblings, 0 replies; 28+ messages in thread
From: David Hildenbrand @ 2021-07-27 13:02 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: Paolo Bonzini, Richard Henderson

On 23.07.21 21:34, Peter Xu wrote:
> This should be functionally the same as before, but this allows the
> memory_region_transaction_depth to be non-zero during commit, which can help us
> to do sanity check on misuses.
> 
> Since at it, fix an indentation issue on the bracket.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   softmmu/memory.c | 12 +++++++++---
>   1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/softmmu/memory.c b/softmmu/memory.c
> index 08327c22e2..cf7943c02c 100644
> --- a/softmmu/memory.c
> +++ b/softmmu/memory.c
> @@ -1114,8 +1114,7 @@ void memory_region_transaction_commit(void)
>       assert(memory_region_transaction_depth);
>       assert(qemu_mutex_iothread_locked());
>   
> -    memory_region_transaction_pop();
> -    if (!memory_region_transaction_depth) {
> +    if (memory_region_transaction_depth == 1) {
>           if (memory_region_update_pending) {
>               flatviews_reset();
>   
> @@ -1134,7 +1133,14 @@ void memory_region_transaction_commit(void)
>               }
>               ioeventfd_update_pending = false;
>           }
> -   }
> +    }
> +
> +    /*
> +     * Pop the depth at last, so that memory_region_transaction_depth will
> +     * still be non-zero during committing.  This can help us to do some sanity
> +     * check within the process of committing.
> +     */
> +    memory_region_transaction_pop();
>   }
>   
>   static void memory_region_destructor_none(MemoryRegion *mr)
> 

Sounds sane to me!

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/9] cpus: Export queue work related fields to cpu.h
  2021-07-23 19:34 ` [PATCH v2 1/9] cpus: Export queue work related fields to cpu.h Peter Xu
@ 2021-07-27 13:02   ` David Hildenbrand
  0 siblings, 0 replies; 28+ messages in thread
From: David Hildenbrand @ 2021-07-27 13:02 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: Paolo Bonzini, Richard Henderson

On 23.07.21 21:34, Peter Xu wrote:
> This patch has no functional change, but prepares for moving the function
> do_run_on_cpu() into softmmu/cpus.c.  It does:
> 
>    1. Move qemu_work_item into hw/core/cpu.h.
>    2. Export queue_work_on_cpu()/qemu_work_cond.
> 
> All of them will be used by softmmu/cpus.c later.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   cpus-common.c         | 11 ++---------
>   include/hw/core/cpu.h | 10 +++++++++-
>   2 files changed, 11 insertions(+), 10 deletions(-)
> 
> diff --git a/cpus-common.c b/cpus-common.c
> index 6e73d3e58d..d814b2439a 100644
> --- a/cpus-common.c
> +++ b/cpus-common.c
> @@ -27,7 +27,7 @@
>   static QemuMutex qemu_cpu_list_lock;
>   static QemuCond exclusive_cond;
>   static QemuCond exclusive_resume;
> -static QemuCond qemu_work_cond;
> +QemuCond qemu_work_cond;
>   
>   /* >= 1 if a thread is inside start_exclusive/end_exclusive.  Written
>    * under qemu_cpu_list_lock, read with atomic operations.
> @@ -114,14 +114,7 @@ CPUState *qemu_get_cpu(int index)
>   /* current CPU in the current thread. It is only valid inside cpu_exec() */
>   __thread CPUState *current_cpu;
>   
> -struct qemu_work_item {
> -    QSIMPLEQ_ENTRY(qemu_work_item) node;
> -    run_on_cpu_func func;
> -    run_on_cpu_data data;
> -    bool free, exclusive, done;
> -};
> -
> -static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
> +void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
>   {
>       qemu_mutex_lock(&cpu->work_mutex);
>       QSIMPLEQ_INSERT_TAIL(&cpu->work_list, wi, node);
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index bc864564ce..f62ae88524 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -243,7 +243,15 @@ typedef union {
>   
>   typedef void (*run_on_cpu_func)(CPUState *cpu, run_on_cpu_data data);
>   
> -struct qemu_work_item;
> +struct qemu_work_item {
> +    QSIMPLEQ_ENTRY(qemu_work_item) node;
> +    run_on_cpu_func func;
> +    run_on_cpu_data data;
> +    bool free, exclusive, done;
> +};
> +
> +void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi);
> +extern QemuCond qemu_work_cond;
>   
>   #define CPU_UNSET_NUMA_NODE_ID -1
>   #define CPU_TRACE_DSTATE_MAX_EVENTS 32
> 

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/9] cpus: Move do_run_on_cpu into softmmu/cpus.c
  2021-07-23 19:34 ` [PATCH v2 2/9] cpus: Move do_run_on_cpu into softmmu/cpus.c Peter Xu
@ 2021-07-27 13:04   ` David Hildenbrand
  0 siblings, 0 replies; 28+ messages in thread
From: David Hildenbrand @ 2021-07-27 13:04 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: Paolo Bonzini, Richard Henderson

On 23.07.21 21:34, Peter Xu wrote:
> It's only used by softmmu binaries not linux-user ones.  Make it static and
> drop the definition in the header too.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   cpus-common.c         | 25 -------------------------
>   include/hw/core/cpu.h | 12 ------------
>   softmmu/cpus.c        | 26 ++++++++++++++++++++++++++
>   3 files changed, 26 insertions(+), 37 deletions(-)
> 
> diff --git a/cpus-common.c b/cpus-common.c
> index d814b2439a..670826363f 100644
> --- a/cpus-common.c
> +++ b/cpus-common.c
> @@ -124,31 +124,6 @@ void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
>       qemu_cpu_kick(cpu);
>   }
>   
> -void do_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data,
> -                   QemuMutex *mutex)
> -{
> -    struct qemu_work_item wi;
> -
> -    if (qemu_cpu_is_self(cpu)) {
> -        func(cpu, data);
> -        return;
> -    }
> -
> -    wi.func = func;
> -    wi.data = data;
> -    wi.done = false;
> -    wi.free = false;
> -    wi.exclusive = false;
> -
> -    queue_work_on_cpu(cpu, &wi);
> -    while (!qatomic_mb_read(&wi.done)) {
> -        CPUState *self_cpu = current_cpu;
> -
> -        qemu_cond_wait(&qemu_work_cond, mutex);
> -        current_cpu = self_cpu;
> -    }
> -}
> -
>   void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data)
>   {
>       struct qemu_work_item *wi;
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index f62ae88524..711ecad62f 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -689,18 +689,6 @@ void qemu_cpu_kick(CPUState *cpu);
>    */
>   bool cpu_is_stopped(CPUState *cpu);
>   
> -/**
> - * do_run_on_cpu:
> - * @cpu: The vCPU to run on.
> - * @func: The function to be executed.
> - * @data: Data to pass to the function.
> - * @mutex: Mutex to release while waiting for @func to run.
> - *
> - * Used internally in the implementation of run_on_cpu.
> - */
> -void do_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data,
> -                   QemuMutex *mutex);
> -
>   /**
>    * run_on_cpu:
>    * @cpu: The vCPU to run on.
> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> index 071085f840..52adc98d39 100644
> --- a/softmmu/cpus.c
> +++ b/softmmu/cpus.c
> @@ -382,6 +382,32 @@ void qemu_init_cpu_loop(void)
>       qemu_thread_get_self(&io_thread);
>   }
>   
> +static void
> +do_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data,
> +              QemuMutex *mutex)
> +{
> +    struct qemu_work_item wi;

You could do

struct qemu_work_item wi = {
     .func = func,
     .data = data,
};

instead of the separate initialization below.



> +
> +    if (qemu_cpu_is_self(cpu)) {
> +        func(cpu, data);
> +        return;
> +    }
> +
> +    wi.func = func;
> +    wi.data = data;
> +    wi.done = false;
> +    wi.free = false;
> +    wi.exclusive = false;
> +
> +    queue_work_on_cpu(cpu, &wi);
> +    while (!qatomic_mb_read(&wi.done)) {
> +        CPUState *self_cpu = current_cpu;
> +
> +        qemu_cond_wait(&qemu_work_cond, mutex);
> +        current_cpu = self_cpu;
> +    }
> +}

Reviewed-by: David Hildenbrand <david@redhat.com>


-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 3/9] memory: Introduce memory_region_transaction_{push|pop}()
  2021-07-23 19:34 ` [PATCH v2 3/9] memory: Introduce memory_region_transaction_{push|pop}() Peter Xu
@ 2021-07-27 13:06   ` David Hildenbrand
  0 siblings, 0 replies; 28+ messages in thread
From: David Hildenbrand @ 2021-07-27 13:06 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: Paolo Bonzini, Richard Henderson

On 23.07.21 21:34, Peter Xu wrote:
> memory_region_transaction_{begin|commit}() could be too big when finalizing a
> memory region.  E.g., we should never attempt to update address space topology
> during the finalize() of a memory region.  Provide helpers for further use.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   softmmu/memory.c | 14 ++++++++++++--
>   1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/softmmu/memory.c b/softmmu/memory.c
> index bfedaf9c4d..1a3e9ff8ad 100644
> --- a/softmmu/memory.c
> +++ b/softmmu/memory.c
> @@ -1079,10 +1079,20 @@ static void address_space_update_topology(AddressSpace *as)
>       address_space_set_flatview(as);
>   }
>   
> +static void memory_region_transaction_push(void)
> +{
> +    memory_region_transaction_depth++;
> +}
> +
> +static void memory_region_transaction_pop(void)
> +{
> +    memory_region_transaction_depth--;
> +}
> +

push/pop has to me stack semantics, meaning we do more than just 
increment/decrement the depth.

I'd have used

memory_region_transaction_depth_inc() / 
memory_region_transaction_depth_dec()


LGTM

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 4/9] memory: Don't do topology update in memory finalize()
  2021-07-23 19:34 ` [PATCH v2 4/9] memory: Don't do topology update in memory finalize() Peter Xu
@ 2021-07-27 13:21   ` David Hildenbrand
  2021-07-27 16:02     ` Peter Xu
  0 siblings, 1 reply; 28+ messages in thread
From: David Hildenbrand @ 2021-07-27 13:21 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: Paolo Bonzini, Richard Henderson

On 23.07.21 21:34, Peter Xu wrote:
> Topology update could be wrongly triggered in memory region finalize() if
> there's bug somewhere else.  It'll be a very confusing stack when it
> happens (e.g., sending KVM ioctl within the RCU thread, and we'll observe it
> only until it fails!).
> 
> Instead of that, we use the push()/pop() helper to avoid memory transaction
> commit, at the same time we use assertions to make sure there's no pending
> updates or it's a nested transaction, so it could fail even earlier and in a
> more explicit way.
> 
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   softmmu/memory.c | 23 +++++++++++++++++++++--
>   1 file changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/softmmu/memory.c b/softmmu/memory.c
> index 1a3e9ff8ad..dfce4a2bda 100644
> --- a/softmmu/memory.c
> +++ b/softmmu/memory.c
> @@ -170,6 +170,12 @@ struct MemoryRegionIoeventfd {
>       EventNotifier *e;
>   };
>   
> +/* Returns whether there's any pending memory updates */
> +static bool memory_region_has_pending_update(void)
> +{
> +    return memory_region_update_pending || ioeventfd_update_pending;
> +}
> +
>   static bool memory_region_ioeventfd_before(MemoryRegionIoeventfd *a,
>                                              MemoryRegionIoeventfd *b)
>   {
> @@ -1756,12 +1762,25 @@ static void memory_region_finalize(Object *obj)
>        * and cause an infinite loop.
>        */
>       mr->enabled = false;
> -    memory_region_transaction_begin();
> +
> +    /*
> +     * Use push()/pop() instead of begin()/commit() to make sure below block
> +     * won't trigger any topology update (which should never happen, but it's
> +     * still a safety belt).
> +     */

Hmm, I wonder if we can just keep the begin/end semantics and just do an 
assertion before doing the commit? Does anything speak against that?

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 4/9] memory: Don't do topology update in memory finalize()
  2021-07-27 13:21   ` David Hildenbrand
@ 2021-07-27 16:02     ` Peter Xu
  2021-07-28 12:13       ` David Hildenbrand
  0 siblings, 1 reply; 28+ messages in thread
From: Peter Xu @ 2021-07-27 16:02 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: Paolo Bonzini, qemu-devel, Richard Henderson

On Tue, Jul 27, 2021 at 03:21:31PM +0200, David Hildenbrand wrote:
> On 23.07.21 21:34, Peter Xu wrote:
> > Topology update could be wrongly triggered in memory region finalize() if
> > there's bug somewhere else.  It'll be a very confusing stack when it
> > happens (e.g., sending KVM ioctl within the RCU thread, and we'll observe it
> > only until it fails!).
> > 
> > Instead of that, we use the push()/pop() helper to avoid memory transaction
> > commit, at the same time we use assertions to make sure there's no pending
> > updates or it's a nested transaction, so it could fail even earlier and in a
> > more explicit way.
> > 
> > Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >   softmmu/memory.c | 23 +++++++++++++++++++++--
> >   1 file changed, 21 insertions(+), 2 deletions(-)
> > 
> > diff --git a/softmmu/memory.c b/softmmu/memory.c
> > index 1a3e9ff8ad..dfce4a2bda 100644
> > --- a/softmmu/memory.c
> > +++ b/softmmu/memory.c
> > @@ -170,6 +170,12 @@ struct MemoryRegionIoeventfd {
> >       EventNotifier *e;
> >   };
> > +/* Returns whether there's any pending memory updates */
> > +static bool memory_region_has_pending_update(void)
> > +{
> > +    return memory_region_update_pending || ioeventfd_update_pending;
> > +}
> > +
> >   static bool memory_region_ioeventfd_before(MemoryRegionIoeventfd *a,
> >                                              MemoryRegionIoeventfd *b)
> >   {
> > @@ -1756,12 +1762,25 @@ static void memory_region_finalize(Object *obj)
> >        * and cause an infinite loop.
> >        */
> >       mr->enabled = false;
> > -    memory_region_transaction_begin();
> > +
> > +    /*
> > +     * Use push()/pop() instead of begin()/commit() to make sure below block
> > +     * won't trigger any topology update (which should never happen, but it's
> > +     * still a safety belt).
> > +     */
> 
> Hmm, I wonder if we can just keep the begin/end semantics and just do an
> assertion before doing the commit? Does anything speak against that?

That sounds working too for the case of run_on_cpu and similar, but I think
this patch should be able to cover more.  For example, it's possible depth==0
when enter memory_region_finalize(), but some removal of subregions could
further cause memory layout changes.  IMHO we should also bail out early for
those cases too.  Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 7/9] cpus: Introduce qemu_mutex_unlock_iothread_prepare()
  2021-07-27 12:59   ` David Hildenbrand
@ 2021-07-27 16:08     ` Peter Xu
  2021-07-28 12:11       ` David Hildenbrand
  0 siblings, 1 reply; 28+ messages in thread
From: Peter Xu @ 2021-07-27 16:08 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: Paolo Bonzini, qemu-devel, Richard Henderson

On Tue, Jul 27, 2021 at 02:59:26PM +0200, David Hildenbrand wrote:
> On 23.07.21 21:34, Peter Xu wrote:
> > The prepare function before unlocking BQL.  There're only three places that can
> > release the BQL: unlock(), cond_wait() or cond_timedwait().
> > 
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >   softmmu/cpus.c | 7 +++++++
> >   1 file changed, 7 insertions(+)
> > 
> > diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> > index 9131f77f87..6085f8edbe 100644
> > --- a/softmmu/cpus.c
> > +++ b/softmmu/cpus.c
> > @@ -66,6 +66,10 @@
> >   static QemuMutex qemu_global_mutex;
> > +static void qemu_mutex_unlock_iothread_prepare(void)
> > +{
> > +}
> > +
> >   bool cpu_is_stopped(CPUState *cpu)
> >   {
> >       return cpu->stopped || !runstate_is_running();
> > @@ -523,16 +527,19 @@ void qemu_mutex_unlock_iothread(void)
> >   {
> >       g_assert(qemu_mutex_iothread_locked());
> >       iothread_locked = false;
> > +    qemu_mutex_unlock_iothread_prepare();
> >       qemu_mutex_unlock(&qemu_global_mutex);
> >   }
> >   void qemu_cond_wait_iothread(QemuCond *cond)
> >   {
> > +    qemu_mutex_unlock_iothread_prepare();
> >       qemu_cond_wait(cond, &qemu_global_mutex);
> >   }
> >   void qemu_cond_timedwait_iothread(QemuCond *cond, int ms)
> >   {
> > +    qemu_mutex_unlock_iothread_prepare();
> >       qemu_cond_timedwait(cond, &qemu_global_mutex, ms);
> >   }
> > 
> 
> I'd squash this patch into the next one.
> 
> I don't quite like the function name, but don't really have a better
> suggestion .... maybe qemu_mutex_might_unlock_iothread(), similar to
> might_sleep() or might_fault() in the kernel. (although here it's pretty
> clear and not a "might"; could be useful in other context where we might
> conditionally unlock the BQL at some point in the future, though)

Yes, IMHO "might" describes a capability of doing something, here it's not
(this one should only be called right before releasing bql, not within any
context of having some capability).  The other option I thought was "pre" but
it will be just a short version of "prepare".

Let me know if you have a better suggestion on naming. :) Otherwise I'll keep
the naming, squash this patch into the next and keep your r-b for that.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/9] memory: Sanity checks memory transaction when releasing BQL
  2021-07-27 12:41 ` David Hildenbrand
@ 2021-07-27 16:35   ` Peter Xu
  0 siblings, 0 replies; 28+ messages in thread
From: Peter Xu @ 2021-07-27 16:35 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: Paolo Bonzini, qemu-devel, Richard Henderson

On Tue, Jul 27, 2021 at 02:41:17PM +0200, David Hildenbrand wrote:
> Functions that silently drop the BQL are really nasty. I once fall into a
> similar trap calling pause_all_vcpus() from within
> memory_region_transaction_begin(), while resizing RAM blocks.

Yes, hopefull the series can help all these cases too.  Thanks for reviewing!

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 7/9] cpus: Introduce qemu_mutex_unlock_iothread_prepare()
  2021-07-27 16:08     ` Peter Xu
@ 2021-07-28 12:11       ` David Hildenbrand
  0 siblings, 0 replies; 28+ messages in thread
From: David Hildenbrand @ 2021-07-28 12:11 UTC (permalink / raw)
  To: Peter Xu; +Cc: Paolo Bonzini, qemu-devel, Richard Henderson

On 27.07.21 18:08, Peter Xu wrote:
> On Tue, Jul 27, 2021 at 02:59:26PM +0200, David Hildenbrand wrote:
>> On 23.07.21 21:34, Peter Xu wrote:
>>> The prepare function before unlocking BQL.  There're only three places that can
>>> release the BQL: unlock(), cond_wait() or cond_timedwait().
>>>
>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>>> ---
>>>    softmmu/cpus.c | 7 +++++++
>>>    1 file changed, 7 insertions(+)
>>>
>>> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
>>> index 9131f77f87..6085f8edbe 100644
>>> --- a/softmmu/cpus.c
>>> +++ b/softmmu/cpus.c
>>> @@ -66,6 +66,10 @@
>>>    static QemuMutex qemu_global_mutex;
>>> +static void qemu_mutex_unlock_iothread_prepare(void)
>>> +{
>>> +}
>>> +
>>>    bool cpu_is_stopped(CPUState *cpu)
>>>    {
>>>        return cpu->stopped || !runstate_is_running();
>>> @@ -523,16 +527,19 @@ void qemu_mutex_unlock_iothread(void)
>>>    {
>>>        g_assert(qemu_mutex_iothread_locked());
>>>        iothread_locked = false;
>>> +    qemu_mutex_unlock_iothread_prepare();
>>>        qemu_mutex_unlock(&qemu_global_mutex);
>>>    }
>>>    void qemu_cond_wait_iothread(QemuCond *cond)
>>>    {
>>> +    qemu_mutex_unlock_iothread_prepare();
>>>        qemu_cond_wait(cond, &qemu_global_mutex);
>>>    }
>>>    void qemu_cond_timedwait_iothread(QemuCond *cond, int ms)
>>>    {
>>> +    qemu_mutex_unlock_iothread_prepare();
>>>        qemu_cond_timedwait(cond, &qemu_global_mutex, ms);
>>>    }
>>>
>>
>> I'd squash this patch into the next one.
>>
>> I don't quite like the function name, but don't really have a better
>> suggestion .... maybe qemu_mutex_might_unlock_iothread(), similar to
>> might_sleep() or might_fault() in the kernel. (although here it's pretty
>> clear and not a "might"; could be useful in other context where we might
>> conditionally unlock the BQL at some point in the future, though)
> 
> Yes, IMHO "might" describes a capability of doing something, here it's not
> (this one should only be called right before releasing bql, not within any
> context of having some capability).  The other option I thought was "pre" but
> it will be just a short version of "prepare".
> 
> Let me know if you have a better suggestion on naming. :) Otherwise I'll keep
> the naming, squash this patch into the next and keep your r-b for that.

Fine with me :)


-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 4/9] memory: Don't do topology update in memory finalize()
  2021-07-27 16:02     ` Peter Xu
@ 2021-07-28 12:13       ` David Hildenbrand
  2021-07-28 13:56         ` Peter Xu
  0 siblings, 1 reply; 28+ messages in thread
From: David Hildenbrand @ 2021-07-28 12:13 UTC (permalink / raw)
  To: Peter Xu; +Cc: Paolo Bonzini, qemu-devel, Richard Henderson

On 27.07.21 18:02, Peter Xu wrote:
> On Tue, Jul 27, 2021 at 03:21:31PM +0200, David Hildenbrand wrote:
>> On 23.07.21 21:34, Peter Xu wrote:
>>> Topology update could be wrongly triggered in memory region finalize() if
>>> there's bug somewhere else.  It'll be a very confusing stack when it
>>> happens (e.g., sending KVM ioctl within the RCU thread, and we'll observe it
>>> only until it fails!).
>>>
>>> Instead of that, we use the push()/pop() helper to avoid memory transaction
>>> commit, at the same time we use assertions to make sure there's no pending
>>> updates or it's a nested transaction, so it could fail even earlier and in a
>>> more explicit way.
>>>
>>> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>>> ---
>>>    softmmu/memory.c | 23 +++++++++++++++++++++--
>>>    1 file changed, 21 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/softmmu/memory.c b/softmmu/memory.c
>>> index 1a3e9ff8ad..dfce4a2bda 100644
>>> --- a/softmmu/memory.c
>>> +++ b/softmmu/memory.c
>>> @@ -170,6 +170,12 @@ struct MemoryRegionIoeventfd {
>>>        EventNotifier *e;
>>>    };
>>> +/* Returns whether there's any pending memory updates */
>>> +static bool memory_region_has_pending_update(void)
>>> +{
>>> +    return memory_region_update_pending || ioeventfd_update_pending;
>>> +}
>>> +
>>>    static bool memory_region_ioeventfd_before(MemoryRegionIoeventfd *a,
>>>                                               MemoryRegionIoeventfd *b)
>>>    {
>>> @@ -1756,12 +1762,25 @@ static void memory_region_finalize(Object *obj)
>>>         * and cause an infinite loop.
>>>         */
>>>        mr->enabled = false;
>>> -    memory_region_transaction_begin();
>>> +
>>> +    /*
>>> +     * Use push()/pop() instead of begin()/commit() to make sure below block
>>> +     * won't trigger any topology update (which should never happen, but it's
>>> +     * still a safety belt).
>>> +     */
>>
>> Hmm, I wonder if we can just keep the begin/end semantics and just do an
>> assertion before doing the commit? Does anything speak against that?
> 
> That sounds working too for the case of run_on_cpu and similar, but I think
> this patch should be able to cover more.  For example, it's possible depth==0
> when enter memory_region_finalize(), but some removal of subregions could
> further cause memory layout changes.  IMHO we should also bail out early for
> those cases too.  Thanks,
> 

Do we really have to switch to push/pop to catch these cases early? I'd 
assume we'd just have to formulate the right assertions :)

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 4/9] memory: Don't do topology update in memory finalize()
  2021-07-28 12:13       ` David Hildenbrand
@ 2021-07-28 13:56         ` Peter Xu
  2021-07-28 14:01           ` David Hildenbrand
  0 siblings, 1 reply; 28+ messages in thread
From: Peter Xu @ 2021-07-28 13:56 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: Paolo Bonzini, qemu-devel, Richard Henderson

On Wed, Jul 28, 2021 at 02:13:17PM +0200, David Hildenbrand wrote:
> On 27.07.21 18:02, Peter Xu wrote:
> > On Tue, Jul 27, 2021 at 03:21:31PM +0200, David Hildenbrand wrote:
> > > On 23.07.21 21:34, Peter Xu wrote:
> > > > Topology update could be wrongly triggered in memory region finalize() if
> > > > there's bug somewhere else.  It'll be a very confusing stack when it
> > > > happens (e.g., sending KVM ioctl within the RCU thread, and we'll observe it
> > > > only until it fails!).
> > > > 
> > > > Instead of that, we use the push()/pop() helper to avoid memory transaction
> > > > commit, at the same time we use assertions to make sure there's no pending
> > > > updates or it's a nested transaction, so it could fail even earlier and in a
> > > > more explicit way.
> > > > 
> > > > Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> > > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > > ---
> > > >    softmmu/memory.c | 23 +++++++++++++++++++++--
> > > >    1 file changed, 21 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/softmmu/memory.c b/softmmu/memory.c
> > > > index 1a3e9ff8ad..dfce4a2bda 100644
> > > > --- a/softmmu/memory.c
> > > > +++ b/softmmu/memory.c
> > > > @@ -170,6 +170,12 @@ struct MemoryRegionIoeventfd {
> > > >        EventNotifier *e;
> > > >    };
> > > > +/* Returns whether there's any pending memory updates */
> > > > +static bool memory_region_has_pending_update(void)
> > > > +{
> > > > +    return memory_region_update_pending || ioeventfd_update_pending;
> > > > +}
> > > > +
> > > >    static bool memory_region_ioeventfd_before(MemoryRegionIoeventfd *a,
> > > >                                               MemoryRegionIoeventfd *b)
> > > >    {
> > > > @@ -1756,12 +1762,25 @@ static void memory_region_finalize(Object *obj)
> > > >         * and cause an infinite loop.
> > > >         */
> > > >        mr->enabled = false;
> > > > -    memory_region_transaction_begin();
> > > > +
> > > > +    /*
> > > > +     * Use push()/pop() instead of begin()/commit() to make sure below block
> > > > +     * won't trigger any topology update (which should never happen, but it's
> > > > +     * still a safety belt).
> > > > +     */
> > > 
> > > Hmm, I wonder if we can just keep the begin/end semantics and just do an
> > > assertion before doing the commit? Does anything speak against that?
> > 
> > That sounds working too for the case of run_on_cpu and similar, but I think
> > this patch should be able to cover more.  For example, it's possible depth==0
> > when enter memory_region_finalize(), but some removal of subregions could
> > further cause memory layout changes.  IMHO we should also bail out early for
> > those cases too.  Thanks,
> > 
> 
> Do we really have to switch to push/pop to catch these cases early? I'd
> assume we'd just have to formulate the right assertions :)

The subject and commit message was trying to tell why. :)

"memory: Don't do topology update in memory finalize()"

The root reason is errors within memory commit can be very hard to digest,
however the assertion failure is easier to understand because any memory layout
change is not expected to happen.

The push/pop (I renamed it after your other comment to depth_inc/dec) avoids
memory commit happening at all within finalize(), and make sure we always fail
at the assertion as long as there's any potential memory update (even if the
memory update won't crash qemu immediately).

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 4/9] memory: Don't do topology update in memory finalize()
  2021-07-28 13:56         ` Peter Xu
@ 2021-07-28 14:01           ` David Hildenbrand
  0 siblings, 0 replies; 28+ messages in thread
From: David Hildenbrand @ 2021-07-28 14:01 UTC (permalink / raw)
  To: Peter Xu; +Cc: Paolo Bonzini, qemu-devel, Richard Henderson

On 28.07.21 15:56, Peter Xu wrote:
> On Wed, Jul 28, 2021 at 02:13:17PM +0200, David Hildenbrand wrote:
>> On 27.07.21 18:02, Peter Xu wrote:
>>> On Tue, Jul 27, 2021 at 03:21:31PM +0200, David Hildenbrand wrote:
>>>> On 23.07.21 21:34, Peter Xu wrote:
>>>>> Topology update could be wrongly triggered in memory region finalize() if
>>>>> there's bug somewhere else.  It'll be a very confusing stack when it
>>>>> happens (e.g., sending KVM ioctl within the RCU thread, and we'll observe it
>>>>> only until it fails!).
>>>>>
>>>>> Instead of that, we use the push()/pop() helper to avoid memory transaction
>>>>> commit, at the same time we use assertions to make sure there's no pending
>>>>> updates or it's a nested transaction, so it could fail even earlier and in a
>>>>> more explicit way.
>>>>>
>>>>> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
>>>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>>>>> ---
>>>>>     softmmu/memory.c | 23 +++++++++++++++++++++--
>>>>>     1 file changed, 21 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/softmmu/memory.c b/softmmu/memory.c
>>>>> index 1a3e9ff8ad..dfce4a2bda 100644
>>>>> --- a/softmmu/memory.c
>>>>> +++ b/softmmu/memory.c
>>>>> @@ -170,6 +170,12 @@ struct MemoryRegionIoeventfd {
>>>>>         EventNotifier *e;
>>>>>     };
>>>>> +/* Returns whether there's any pending memory updates */
>>>>> +static bool memory_region_has_pending_update(void)
>>>>> +{
>>>>> +    return memory_region_update_pending || ioeventfd_update_pending;
>>>>> +}
>>>>> +
>>>>>     static bool memory_region_ioeventfd_before(MemoryRegionIoeventfd *a,
>>>>>                                                MemoryRegionIoeventfd *b)
>>>>>     {
>>>>> @@ -1756,12 +1762,25 @@ static void memory_region_finalize(Object *obj)
>>>>>          * and cause an infinite loop.
>>>>>          */
>>>>>         mr->enabled = false;
>>>>> -    memory_region_transaction_begin();
>>>>> +
>>>>> +    /*
>>>>> +     * Use push()/pop() instead of begin()/commit() to make sure below block
>>>>> +     * won't trigger any topology update (which should never happen, but it's
>>>>> +     * still a safety belt).
>>>>> +     */
>>>>
>>>> Hmm, I wonder if we can just keep the begin/end semantics and just do an
>>>> assertion before doing the commit? Does anything speak against that?
>>>
>>> That sounds working too for the case of run_on_cpu and similar, but I think
>>> this patch should be able to cover more.  For example, it's possible depth==0
>>> when enter memory_region_finalize(), but some removal of subregions could
>>> further cause memory layout changes.  IMHO we should also bail out early for
>>> those cases too.  Thanks,
>>>
>>
>> Do we really have to switch to push/pop to catch these cases early? I'd
>> assume we'd just have to formulate the right assertions :)
> 
> The subject and commit message was trying to tell why. :)
> 
> "memory: Don't do topology update in memory finalize()"
> 
> The root reason is errors within memory commit can be very hard to digest,
> however the assertion failure is easier to understand because any memory layout
> change is not expected to happen.
> 
> The push/pop (I renamed it after your other comment to depth_inc/dec) avoids
> memory commit happening at all within finalize(), and make sure we always fail
> at the assertion as long as there's any potential memory update (even if the
> memory update won't crash qemu immediately).

Okay, makes sense to me, thanks

Acked-by: David Hildenbrand <david@redhat.com>


-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2021-07-28 14:03 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-23 19:34 [PATCH v2 0/9] memory: Sanity checks memory transaction when releasing BQL Peter Xu
2021-07-23 19:34 ` [PATCH v2 1/9] cpus: Export queue work related fields to cpu.h Peter Xu
2021-07-27 13:02   ` David Hildenbrand
2021-07-23 19:34 ` [PATCH v2 2/9] cpus: Move do_run_on_cpu into softmmu/cpus.c Peter Xu
2021-07-27 13:04   ` David Hildenbrand
2021-07-23 19:34 ` [PATCH v2 3/9] memory: Introduce memory_region_transaction_{push|pop}() Peter Xu
2021-07-27 13:06   ` David Hildenbrand
2021-07-23 19:34 ` [PATCH v2 4/9] memory: Don't do topology update in memory finalize() Peter Xu
2021-07-27 13:21   ` David Hildenbrand
2021-07-27 16:02     ` Peter Xu
2021-07-28 12:13       ` David Hildenbrand
2021-07-28 13:56         ` Peter Xu
2021-07-28 14:01           ` David Hildenbrand
2021-07-23 19:34 ` [PATCH v2 5/9] cpus: Use qemu_cond_wait_iothread() where proper Peter Xu
2021-07-27 12:49   ` David Hildenbrand
2021-07-23 19:34 ` [PATCH v2 6/9] cpus: Remove the mutex parameter from do_run_on_cpu() Peter Xu
2021-07-27 12:50   ` David Hildenbrand
2021-07-23 19:34 ` [PATCH v2 7/9] cpus: Introduce qemu_mutex_unlock_iothread_prepare() Peter Xu
2021-07-27 12:59   ` David Hildenbrand
2021-07-27 16:08     ` Peter Xu
2021-07-28 12:11       ` David Hildenbrand
2021-07-23 19:34 ` [PATCH v2 8/9] memory: Assert on no ongoing memory transaction before release BQL Peter Xu
2021-07-27 13:00   ` David Hildenbrand
2021-07-23 19:34 ` [PATCH v2 9/9] memory: Delay the transaction pop() until commit completed Peter Xu
2021-07-27 13:02   ` David Hildenbrand
2021-07-23 22:36 ` [PATCH v2 0/9] memory: Sanity checks memory transaction when releasing BQL Peter Xu
2021-07-27 12:41 ` David Hildenbrand
2021-07-27 16:35   ` Peter Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).