* [Qemu-devel] [PATCH v4 0/2] move qcow2_invalidate_cache() out of coroutine context
@ 2016-02-24  8:53 Denis V. Lunev
  2016-02-24  8:53 ` [Qemu-devel] [PATCH 1/2] migration (ordinary): move bdrv_invalidate_cache_all of " Denis V. Lunev
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Denis V. Lunev @ 2016-02-24  8:53 UTC (permalink / raw)
  Cc: Amit Shah, Denis V. Lunev, Juan Quintela, qemu-devel, Paolo Bonzini

It is possible to hit an assert in qcow2_get_specific_info because
s->qcow_version is undefined. This happens when the VM is starting from
a suspended state, i.e. it is processing an incoming migration, and
'info block' is called at the same time.

The problem is that qcow2_invalidate_cache() closes the image and
memset()s BDRVQcowState in the middle.

This operation should not be performed in coroutine context.
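
To make the hazard concrete, here is a minimal standalone sketch, not QEMU
code. ToyState, toy_invalidate() and the hook are hypothetical names; the hook
stands in for whatever else gets to run while a coroutine is yielded in the
middle of the reopen (for example the handler behind 'info block').

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the format driver state (not BDRVQcowState). */
typedef struct {
    int version;
} ToyState;

typedef void (*YieldHook)(ToyState *s, int *min_seen);

/* Observer standing in for a monitor command that reads the state. */
static void observe(ToyState *s, int *min_seen)
{
    if (s->version < *min_seen) {
        *min_seen = s->version;
    }
}

/*
 * Reopen the toy image: wipe the state, then refill it. If 'hook' is
 * non-NULL it runs between the two steps, modelling other code that
 * executes while the reopen is suspended half-way through.
 */
static void toy_invalidate(ToyState *s, YieldHook hook, int *min_seen)
{
    memset(s, 0, sizeof(*s));
    if (hook) {
        hook(s, min_seen);
    }
    s->version = 3;
}

/* Returns the lowest version any observer saw during the experiment. */
int toy_min_version_seen(int interleaved)
{
    ToyState s = { .version = 3 };
    int min_seen = s.version;

    toy_invalidate(&s, interleaved ? observe : NULL, &min_seen);
    observe(&s, &min_seen); /* an observer that runs after the reopen */
    return min_seen;
}
```

toy_min_version_seen(1) returns 0, the wiped value that would trip the
assert, while toy_min_version_seen(0) returns 3 because no observer runs
between the wipe and the refill.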

Changes from v3:
- added qemu_bh_delete at the end of the BH to free the allocated
  structure. Thanks to Fam.
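
The one-shot bottom-half pattern can be sketched as below. This is a toy
queue written for illustration, only loosely modelled on the new / schedule /
delete life cycle; every toy_* name is hypothetical and none of it is QEMU's
implementation. The point is that the callback runs from the main loop rather
than from the caller that scheduled it, and that it deletes its own BH so the
allocation is reclaimed.

```c
#include <stdlib.h>

typedef struct ToyBH ToyBH;
typedef void ToyBHFunc(void *opaque);

struct ToyBH {
    ToyBHFunc *cb;
    void *opaque;
    int scheduled;
    int deleted;
    ToyBH *next;
};

static ToyBH *toy_bh_list;
static int toy_bh_live; /* BHs allocated and not yet freed */

ToyBH *toy_bh_new(ToyBHFunc *cb, void *opaque)
{
    ToyBH *bh = calloc(1, sizeof(*bh));
    if (!bh) {
        abort();
    }
    bh->cb = cb;
    bh->opaque = opaque;
    bh->next = toy_bh_list;
    toy_bh_list = bh;
    toy_bh_live++;
    return bh;
}

void toy_bh_schedule(ToyBH *bh) { bh->scheduled = 1; }

/* Only marks the BH; the memory is released by the next poll. */
void toy_bh_delete(ToyBH *bh) { bh->scheduled = 0; bh->deleted = 1; }

/* One main-loop iteration: run scheduled callbacks, then reap deleted BHs. */
int toy_bh_poll(void)
{
    int ran = 0;
    for (ToyBH *bh = toy_bh_list; bh; bh = bh->next) {
        if (!bh->deleted && bh->scheduled) {
            bh->scheduled = 0;
            bh->cb(bh->opaque);
            ran++;
        }
    }
    for (ToyBH **p = &toy_bh_list; *p; ) {
        ToyBH *bh = *p;
        if (bh->deleted) {
            *p = bh->next;
            free(bh);
            toy_bh_live--;
        } else {
            p = &bh->next;
        }
    }
    return ran;
}

/* Hypothetical stand-in for the incoming migration state. */
typedef struct {
    ToyBH *bh;
    int completed;
} ToyIncoming;

static void toy_incoming_bh(void *opaque)
{
    ToyIncoming *mis = opaque; /* same pointer that was given to toy_bh_new */
    mis->completed = 1;        /* the deferred work would go here */
    toy_bh_delete(mis->bh);    /* one-shot: drop the BH once it has run */
}

/* Returns the number of BHs still allocated, or -1 on unexpected ordering. */
int toy_run_incoming(void)
{
    ToyIncoming mis = { 0 };
    int done_before_poll;

    mis.bh = toy_bh_new(toy_incoming_bh, &mis);
    toy_bh_schedule(mis.bh);
    done_before_poll = mis.completed; /* still 0: nothing ran in the caller */
    toy_bh_poll();                    /* the "main loop" runs the callback */
    return (!done_before_poll && mis.completed) ? toy_bh_live : -1;
}
```

toy_run_incoming() returns 0: the work ran only once the main loop polled,
and the BH was reclaimed afterwards. Without the toy_bh_delete() call the
return value would be 1, i.e. one leaked BH per migration.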

Changes from v2:
- subject lines in patches

Changes from v1:
- fixed spelling. Eric, thank you for spell checking

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Amit Shah <amit.shah@redhat.com>


* [Qemu-devel] [PATCH 1/2] migration (ordinary): move bdrv_invalidate_cache_all out of coroutine context
  2016-02-24  8:53 [Qemu-devel] [PATCH v4 0/2] move qcow2_invalidate_cache() out of coroutine context Denis V. Lunev
@ 2016-02-24  8:53 ` Denis V. Lunev
  2016-02-24  8:53 ` [Qemu-devel] [PATCH 2/2] migration (postcopy): " Denis V. Lunev
  2016-02-25  1:21 ` [Qemu-devel] [PATCH v4 0/2] move qcow2_invalidate_cache() out " Fam Zheng
  2 siblings, 0 replies; 4+ messages in thread
From: Denis V. Lunev @ 2016-02-24  8:53 UTC (permalink / raw)
  Cc: Amit Shah, Denis V. Lunev, Juan Quintela, qemu-devel, Paolo Bonzini

It is possible to hit an assert in qcow2_get_specific_info because
s->qcow_version is undefined. This happens when the VM is starting from
a suspended state, i.e. it is processing an incoming migration, and
'info block' is called at the same time.

The problem is that qcow2_invalidate_cache() closes the image and
memset()s BDRVQcowState in the middle.

The patch moves processing of bdrv_invalidate_cache_all out of
coroutine context for standard migration to avoid that.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Amit Shah <amit.shah@redhat.com>
---
 include/migration/migration.h |  2 +
 migration/migration.c         | 90 ++++++++++++++++++++++++-------------------
 2 files changed, 52 insertions(+), 40 deletions(-)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index 85b6026..ac2c12c 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -104,6 +104,8 @@ struct MigrationIncomingState {
     QemuMutex rp_mutex;    /* We send replies from multiple threads */
     void     *postcopy_tmp_page;
 
+    QEMUBH *bh;
+
     int state;
     /* See savevm.c */
     LoadStateEntry_Head loadvm_handlers;
diff --git a/migration/migration.c b/migration/migration.c
index fc5e50b..f82fdf6 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -323,10 +323,57 @@ void qemu_start_incoming_migration(const char *uri, Error **errp)
     }
 }
 
+static void process_incoming_migration_bh(void *opaque)
+{
+    Error *local_err = NULL;
+    MigrationIncomingState *mis = opaque;
+
+    /* Make sure all file formats flush their mutable metadata */
+    bdrv_invalidate_cache_all(&local_err);
+    if (local_err) {
+        migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
+                          MIGRATION_STATUS_FAILED);
+        error_report_err(local_err);
+        migrate_decompress_threads_join();
+        exit(EXIT_FAILURE);
+    }
+
+    /*
+     * This must happen after all error conditions are dealt with and
+     * we're sure the VM is going to be running on this host.
+     */
+    qemu_announce_self();
+
+    /* If global state section was not received or we are in running
+       state, we need to obey autostart. Any other state is set with
+       runstate_set. */
+
+    if (!global_state_received() ||
+        global_state_get_runstate() == RUN_STATE_RUNNING) {
+        if (autostart) {
+            vm_start();
+        } else {
+            runstate_set(RUN_STATE_PAUSED);
+        }
+    } else {
+        runstate_set(global_state_get_runstate());
+    }
+    migrate_decompress_threads_join();
+    /*
+     * This must happen after any state changes since as soon as an external
+     * observer sees this event they might start to prod at the VM assuming
+     * it's ready to use.
+     */
+    migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
+                      MIGRATION_STATUS_COMPLETED);
+    migration_incoming_state_destroy();
+
+    qemu_bh_delete(mis->bh);
+}
+
 static void process_incoming_migration_co(void *opaque)
 {
     QEMUFile *f = opaque;
-    Error *local_err = NULL;
     MigrationIncomingState *mis;
     PostcopyState ps;
     int ret;
@@ -369,45 +416,8 @@ static void process_incoming_migration_co(void *opaque)
         exit(EXIT_FAILURE);
     }
 
-    /* Make sure all file formats flush their mutable metadata */
-    bdrv_invalidate_cache_all(&local_err);
-    if (local_err) {
-        migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
-                          MIGRATION_STATUS_FAILED);
-        error_report_err(local_err);
-        migrate_decompress_threads_join();
-        exit(EXIT_FAILURE);
-    }
-
-    /*
-     * This must happen after all error conditions are dealt with and
-     * we're sure the VM is going to be running on this host.
-     */
-    qemu_announce_self();
-
-    /* If global state section was not received or we are in running
-       state, we need to obey autostart. Any other state is set with
-       runstate_set. */
-
-    if (!global_state_received() ||
-        global_state_get_runstate() == RUN_STATE_RUNNING) {
-        if (autostart) {
-            vm_start();
-        } else {
-            runstate_set(RUN_STATE_PAUSED);
-        }
-    } else {
-        runstate_set(global_state_get_runstate());
-    }
-    migrate_decompress_threads_join();
-    /*
-     * This must happen after any state changes since as soon as an external
-     * observer sees this event they might start to prod at the VM assuming
-     * it's ready to use.
-     */
-    migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
-                      MIGRATION_STATUS_COMPLETED);
-    migration_incoming_state_destroy();
+    mis->bh = qemu_bh_new(process_incoming_migration_bh, mis);
+    qemu_bh_schedule(mis->bh);
 }
 
 void process_incoming_migration(QEMUFile *f)
-- 
2.5.0


* [Qemu-devel] [PATCH 2/2] migration (postcopy): move bdrv_invalidate_cache_all out of coroutine context
  2016-02-24  8:53 [Qemu-devel] [PATCH v4 0/2] move qcow2_invalidate_cache() out of coroutine context Denis V. Lunev
  2016-02-24  8:53 ` [Qemu-devel] [PATCH 1/2] migration (ordinary): move bdrv_invalidate_cache_all of " Denis V. Lunev
@ 2016-02-24  8:53 ` Denis V. Lunev
  2016-02-25  1:21 ` [Qemu-devel] [PATCH v4 0/2] move qcow2_invalidate_cache() out " Fam Zheng
  2 siblings, 0 replies; 4+ messages in thread
From: Denis V. Lunev @ 2016-02-24  8:53 UTC (permalink / raw)
  Cc: Amit Shah, Denis V. Lunev, Juan Quintela, qemu-devel, Paolo Bonzini

It is possible to hit an assert in qcow2_get_specific_info because
s->qcow_version is undefined. This happens when the VM is starting from
a suspended state, i.e. it is processing an incoming migration, and
'info block' is called at the same time.

The problem is that qcow2_invalidate_cache() closes the image and
memset()s BDRVQcowState in the middle.

The patch moves processing of bdrv_invalidate_cache_all out of
coroutine context for postcopy migration to avoid that. This function
is called with the following stack:
  process_incoming_migration_co
  qemu_loadvm_state
  qemu_loadvm_state_main
  loadvm_process_command
  loadvm_postcopy_handle_run

Signed-off-by: Denis V. Lunev <den@openvz.org>
Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Amit Shah <amit.shah@redhat.com>
---
 migration/savevm.c | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 94f2894..bc7ded9 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1496,17 +1496,10 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
     return 0;
 }
 
-/* After all discards we can start running and asking for pages */
-static int loadvm_postcopy_handle_run(MigrationIncomingState *mis)
+static void loadvm_postcopy_handle_run_bh(void *opaque)
 {
-    PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_RUNNING);
     Error *local_err = NULL;
-
-    trace_loadvm_postcopy_handle_run();
-    if (ps != POSTCOPY_INCOMING_LISTENING) {
-        error_report("CMD_POSTCOPY_RUN in wrong postcopy state (%d)", ps);
-        return -1;
-    }
+    MigrationIncomingState *mis = opaque;
 
     /* TODO we should move all of this lot into postcopy_ram.c or a shared code
      * in migration.c
@@ -1519,7 +1512,6 @@ static int loadvm_postcopy_handle_run(MigrationIncomingState *mis)
     bdrv_invalidate_cache_all(&local_err);
     if (local_err) {
         error_report_err(local_err);
-        return -1;
     }
 
     trace_loadvm_postcopy_handle_run_cpu_sync();
@@ -1535,6 +1527,23 @@ static int loadvm_postcopy_handle_run(MigrationIncomingState *mis)
         runstate_set(RUN_STATE_PAUSED);
     }
 
+    qemu_bh_delete(mis->bh);
+}
+
+/* After all discards we can start running and asking for pages */
+static int loadvm_postcopy_handle_run(MigrationIncomingState *mis)
+{
+    PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_RUNNING);
+
+    trace_loadvm_postcopy_handle_run();
+    if (ps != POSTCOPY_INCOMING_LISTENING) {
+        error_report("CMD_POSTCOPY_RUN in wrong postcopy state (%d)", ps);
+        return -1;
+    }
+
+    mis->bh = qemu_bh_new(loadvm_postcopy_handle_run_bh, mis);
+    qemu_bh_schedule(mis->bh);
+
     /* We need to finish reading the stream from the package
      * and also stop reading anything more from the stream that loaded the
      * package (since it's now being read by the listener thread).
-- 
2.5.0


* Re: [Qemu-devel] [PATCH v4 0/2] move qcow2_invalidate_cache() out of coroutine context
  2016-02-24  8:53 [Qemu-devel] [PATCH v4 0/2] move qcow2_invalidate_cache() out of coroutine context Denis V. Lunev
  2016-02-24  8:53 ` [Qemu-devel] [PATCH 1/2] migration (ordinary): move bdrv_invalidate_cache_all of " Denis V. Lunev
  2016-02-24  8:53 ` [Qemu-devel] [PATCH 2/2] migration (postcopy): " Denis V. Lunev
@ 2016-02-25  1:21 ` Fam Zheng
  2 siblings, 0 replies; 4+ messages in thread
From: Fam Zheng @ 2016-02-25  1:21 UTC (permalink / raw)
  To: Denis V. Lunev; +Cc: Amit Shah, Paolo Bonzini, qemu-devel, Juan Quintela

On Wed, 02/24 11:53, Denis V. Lunev wrote:
> It is possible to hit an assert in qcow2_get_specific_info because
> s->qcow_version is undefined. This happens when the VM is starting from
> a suspended state, i.e. it is processing an incoming migration, and
> 'info block' is called at the same time.
> 
> The problem is that qcow2_invalidate_cache() closes the image and
> memset()s BDRVQcowState in the middle.
> 
> This operation should not be performed in coroutine context.
> 
> Changes from v3:
> - added qemu_bh_delete at the end of BH to free allocated structure.
>   Thanks to Fam.

Looks good to me now. Thanks!

Reviewed-by: Fam Zheng <famz@redhat.com>
