* [PULL 1/9] migration: fix vmdesc leak on vmstate_save() error
2019-09-25 15:01 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
@ 2019-09-25 15:01 ` Dr. David Alan Gilbert (git)
2019-09-25 15:01 ` [PULL 2/9] migration/postcopy: not necessary to do discard when canonicalizing bitmap Dr. David Alan Gilbert (git)
` (8 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2019-09-25 15:01 UTC (permalink / raw)
To: qemu-devel, quintela, peterx, marcandre.lureau, richardw.yang,
alex.benee
From: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190912122514.22504-2-marcandre.lureau@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
migration/qjson.h | 2 ++
migration/savevm.c | 3 +--
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/migration/qjson.h b/migration/qjson.h
index 41664f2d71..1786bb5864 100644
--- a/migration/qjson.h
+++ b/migration/qjson.h
@@ -24,4 +24,6 @@ void json_start_object(QJSON *json, const char *name);
const char *qjson_get_str(QJSON *json);
void qjson_finish(QJSON *json);
+G_DEFINE_AUTOPTR_CLEANUP_FUNC(QJSON, qjson_destroy)
+
#endif /* QEMU_QJSON_H */
diff --git a/migration/savevm.c b/migration/savevm.c
index ee06f91d42..bb9462a54d 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1314,7 +1314,7 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
bool in_postcopy,
bool inactivate_disks)
{
- QJSON *vmdesc;
+ g_autoptr(QJSON) vmdesc = NULL;
int vmdesc_len;
SaveStateEntry *se;
int ret;
@@ -1375,7 +1375,6 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
qemu_put_be32(f, vmdesc_len);
qemu_put_buffer(f, (uint8_t *)qjson_get_str(vmdesc), vmdesc_len);
}
- qjson_destroy(vmdesc);
return 0;
}
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PULL 2/9] migration/postcopy: not necessary to do discard when canonicalizing bitmap
2019-09-25 15:01 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
2019-09-25 15:01 ` [PULL 1/9] migration: fix vmdesc leak on vmstate_save() error Dr. David Alan Gilbert (git)
@ 2019-09-25 15:01 ` Dr. David Alan Gilbert (git)
2019-09-25 15:01 ` [PULL 3/9] migration/postcopy: unsentmap is not necessary for postcopy Dr. David Alan Gilbert (git)
` (7 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2019-09-25 15:01 UTC (permalink / raw)
To: qemu-devel, quintela, peterx, marcandre.lureau, richardw.yang,
alex.benee
From: Wei Yang <richardw.yang@linux.intel.com>
All pages, either partially sent or partially dirty, will be discarded in
postcopy_send_discard_bm_ram(), since we update the unsentmap to be
unsentmap = unsentmap | dirty in ram_postcopy_send_discard_bitmap().
This is not necessary to do discard when canonicalizing bitmap. And by
doing so, we separate the page discard into two individual steps:
* canonicalize bitmap
* discard page
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190819061843.28642-2-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
migration/ram.c | 14 +-------------
1 file changed, 1 insertion(+), 13 deletions(-)
diff --git a/migration/ram.c b/migration/ram.c
index 01df326767..57d1a4627e 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2928,7 +2928,7 @@ static int postcopy_each_ram_send_discard(MigrationState *ms)
}
/**
- * postcopy_chunk_hostpages_pass: canocalize bitmap in hostpages
+ * postcopy_chunk_hostpages_pass: canonicalize bitmap in hostpages
*
* Helper for postcopy_chunk_hostpages; it's called twice to
* canonicalize the two bitmaps, that are similar, but one is
@@ -2991,18 +2991,6 @@ static void postcopy_chunk_hostpages_pass(MigrationState *ms, bool unsent_pass,
host_ratio);
run_start = QEMU_ALIGN_UP(run_start, host_ratio);
- /* Tell the destination to discard this page */
- if (unsent_pass || !test_bit(fixup_start_addr, unsentmap)) {
- /* For the unsent_pass we:
- * discard partially sent pages
- * For the !unsent_pass (dirty) we:
- * discard partially dirty pages that were sent
- * (any partially sent pages were already discarded
- * by the previous unsent_pass)
- */
- postcopy_discard_send_range(ms, fixup_start_addr, host_ratio);
- }
-
/* Clean up the bitmap */
for (page = fixup_start_addr;
page < fixup_start_addr + host_ratio; page++) {
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PULL 3/9] migration/postcopy: unsentmap is not necessary for postcopy
2019-09-25 15:01 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
2019-09-25 15:01 ` [PULL 1/9] migration: fix vmdesc leak on vmstate_save() error Dr. David Alan Gilbert (git)
2019-09-25 15:01 ` [PULL 2/9] migration/postcopy: not necessary to do discard when canonicalizing bitmap Dr. David Alan Gilbert (git)
@ 2019-09-25 15:01 ` Dr. David Alan Gilbert (git)
2019-09-25 15:01 ` [PULL 4/9] migration: remove sent parameter in get_queued_page_not_dirty Dr. David Alan Gilbert (git)
` (6 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2019-09-25 15:01 UTC (permalink / raw)
To: qemu-devel, quintela, peterx, marcandre.lureau, richardw.yang,
alex.benee
From: Wei Yang <richardw.yang@linux.intel.com>
Commit f3f491fcd6dd594ba695 ('Postcopy: Maintain unsentmap') introduced
unsentmap to track not yet sent pages.
This is not necessary since:
* unsentmap is a sub-set of bmap before postcopy start
* unsentmap is the summation of bmap and unsentmap after canonicalizing
This patch just removes it.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190819061843.28642-3-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
include/exec/ram_addr.h | 6 ----
migration/ram.c | 80 ++++++++---------------------------------
2 files changed, 14 insertions(+), 72 deletions(-)
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index a327a80cfe..e96e621de5 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -44,12 +44,6 @@ struct RAMBlock {
size_t page_size;
/* dirty bitmap used during migration */
unsigned long *bmap;
- /* bitmap of pages that haven't been sent even once
- * only maintained and used in postcopy at the moment
- * where it's used to send the dirtymap at the start
- * of the postcopy phase
- */
- unsigned long *unsentmap;
/* bitmap of already received pages in postcopy */
unsigned long *receivedmap;
diff --git a/migration/ram.c b/migration/ram.c
index 57d1a4627e..a8b1aa2597 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2348,7 +2348,7 @@ static bool get_queued_page(RAMState *rs, PageSearchStatus *pss)
dirty = test_bit(page, block->bmap);
if (!dirty) {
trace_get_queued_page_not_dirty(block->idstr, (uint64_t)offset,
- page, test_bit(page, block->unsentmap));
+ page, test_bit(page, block->bmap));
} else {
trace_get_queued_page(block->idstr, (uint64_t)offset, page);
}
@@ -2619,10 +2619,6 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss,
}
pages += tmppages;
- if (pss->block->unsentmap) {
- clear_bit(pss->page, pss->block->unsentmap);
- }
-
pss->page++;
} while ((pss->page & (pagesize_bits - 1)) &&
offset_in_ramblock(pss->block, pss->page << TARGET_PAGE_BITS));
@@ -2776,8 +2772,6 @@ static void ram_save_cleanup(void *opaque)
block->clear_bmap = NULL;
g_free(block->bmap);
block->bmap = NULL;
- g_free(block->unsentmap);
- block->unsentmap = NULL;
}
xbzrle_cleanup();
@@ -2857,8 +2851,6 @@ void ram_postcopy_migrated_memory_release(MigrationState *ms)
* Returns zero on success
*
* Callback from postcopy_each_ram_send_discard for each RAMBlock
- * Note: At this point the 'unsentmap' is the processed bitmap combined
- * with the dirtymap; so a '1' means it's either dirty or unsent.
*
* @ms: current migration state
* @block: RAMBlock to discard
@@ -2867,17 +2859,17 @@ static int postcopy_send_discard_bm_ram(MigrationState *ms, RAMBlock *block)
{
unsigned long end = block->used_length >> TARGET_PAGE_BITS;
unsigned long current;
- unsigned long *unsentmap = block->unsentmap;
+ unsigned long *bitmap = block->bmap;
for (current = 0; current < end; ) {
- unsigned long one = find_next_bit(unsentmap, end, current);
+ unsigned long one = find_next_bit(bitmap, end, current);
unsigned long zero, discard_length;
if (one >= end) {
break;
}
- zero = find_next_zero_bit(unsentmap, end, one + 1);
+ zero = find_next_zero_bit(bitmap, end, one + 1);
if (zero >= end) {
discard_length = end - one;
@@ -2938,16 +2930,12 @@ static int postcopy_each_ram_send_discard(MigrationState *ms)
* clean, not a mix. This function canonicalizes the bitmaps.
*
* @ms: current migration state
- * @unsent_pass: if true we need to canonicalize partially unsent host pages
- * otherwise we need to canonicalize partially dirty host pages
* @block: block that contains the page we want to canonicalize
*/
-static void postcopy_chunk_hostpages_pass(MigrationState *ms, bool unsent_pass,
- RAMBlock *block)
+static void postcopy_chunk_hostpages_pass(MigrationState *ms, RAMBlock *block)
{
RAMState *rs = ram_state;
unsigned long *bitmap = block->bmap;
- unsigned long *unsentmap = block->unsentmap;
unsigned int host_ratio = block->page_size / TARGET_PAGE_SIZE;
unsigned long pages = block->used_length >> TARGET_PAGE_BITS;
unsigned long run_start;
@@ -2957,13 +2945,8 @@ static void postcopy_chunk_hostpages_pass(MigrationState *ms, bool unsent_pass,
return;
}
- if (unsent_pass) {
- /* Find a sent page */
- run_start = find_next_zero_bit(unsentmap, pages, 0);
- } else {
- /* Find a dirty page */
- run_start = find_next_bit(bitmap, pages, 0);
- }
+ /* Find a dirty page */
+ run_start = find_next_bit(bitmap, pages, 0);
while (run_start < pages) {
@@ -2973,11 +2956,7 @@ static void postcopy_chunk_hostpages_pass(MigrationState *ms, bool unsent_pass,
*/
if (QEMU_IS_ALIGNED(run_start, host_ratio)) {
/* Find the end of this run */
- if (unsent_pass) {
- run_start = find_next_bit(unsentmap, pages, run_start + 1);
- } else {
- run_start = find_next_zero_bit(bitmap, pages, run_start + 1);
- }
+ run_start = find_next_zero_bit(bitmap, pages, run_start + 1);
/*
* If the end isn't at the start of a host page, then the
* run doesn't finish at the end of a host page
@@ -2994,9 +2973,6 @@ static void postcopy_chunk_hostpages_pass(MigrationState *ms, bool unsent_pass,
/* Clean up the bitmap */
for (page = fixup_start_addr;
page < fixup_start_addr + host_ratio; page++) {
- /* All pages in this host page are now not sent */
- set_bit(page, unsentmap);
-
/*
* Remark them as dirty, updating the count for any pages
* that weren't previously dirty.
@@ -3005,13 +2981,8 @@ static void postcopy_chunk_hostpages_pass(MigrationState *ms, bool unsent_pass,
}
}
- if (unsent_pass) {
- /* Find the next sent page for the next iteration */
- run_start = find_next_zero_bit(unsentmap, pages, run_start);
- } else {
- /* Find the next dirty page for the next iteration */
- run_start = find_next_bit(bitmap, pages, run_start);
- }
+ /* Find the next dirty page for the next iteration */
+ run_start = find_next_bit(bitmap, pages, run_start);
}
}
@@ -3033,13 +3004,10 @@ static int postcopy_chunk_hostpages(MigrationState *ms, RAMBlock *block)
{
postcopy_discard_send_init(ms, block->idstr);
- /* First pass: Discard all partially sent host pages */
- postcopy_chunk_hostpages_pass(ms, true, block);
/*
- * Second pass: Ensure that all partially dirty host pages are made
- * fully dirty.
+ * Ensure that all partially dirty host pages are made fully dirty.
*/
- postcopy_chunk_hostpages_pass(ms, false, block);
+ postcopy_chunk_hostpages_pass(ms, block);
postcopy_discard_send_finish(ms);
return 0;
@@ -3077,19 +3045,6 @@ int ram_postcopy_send_discard_bitmap(MigrationState *ms)
rs->last_page = 0;
RAMBLOCK_FOREACH_NOT_IGNORED(block) {
- unsigned long pages = block->used_length >> TARGET_PAGE_BITS;
- unsigned long *bitmap = block->bmap;
- unsigned long *unsentmap = block->unsentmap;
-
- if (!unsentmap) {
- /* We don't have a safe way to resize the sentmap, so
- * if the bitmap was resized it will be NULL at this
- * point.
- */
- error_report("migration ram resized during precopy phase");
- rcu_read_unlock();
- return -EINVAL;
- }
/* Deal with TPS != HPS and huge pages */
ret = postcopy_chunk_hostpages(ms, block);
if (ret) {
@@ -3097,12 +3052,9 @@ int ram_postcopy_send_discard_bitmap(MigrationState *ms)
return ret;
}
- /*
- * Update the unsentmap to be unsentmap = unsentmap | dirty
- */
- bitmap_or(unsentmap, unsentmap, bitmap, pages);
#ifdef DEBUG_POSTCOPY
- ram_debug_dump_bitmap(unsentmap, true, pages);
+ ram_debug_dump_bitmap(block->bmap, true,
+ block->used_length >> TARGET_PAGE_BITS);
#endif
}
trace_ram_postcopy_send_discard_bitmap();
@@ -3270,10 +3222,6 @@ static void ram_list_init_bitmaps(void)
bitmap_set(block->bmap, 0, pages);
block->clear_bmap_shift = shift;
block->clear_bmap = bitmap_new(clear_bmap_size(pages, shift));
- if (migrate_postcopy_ram()) {
- block->unsentmap = bitmap_new(pages);
- bitmap_set(block->unsentmap, 0, pages);
- }
}
}
}
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PULL 4/9] migration: remove sent parameter in get_queued_page_not_dirty
2019-09-25 15:01 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
` (2 preceding siblings ...)
2019-09-25 15:01 ` [PULL 3/9] migration/postcopy: unsentmap is not necessary for postcopy Dr. David Alan Gilbert (git)
@ 2019-09-25 15:01 ` Dr. David Alan Gilbert (git)
2019-09-25 15:01 ` [PULL 5/9] migration/rdma: Don't moan about disconnects at the end Dr. David Alan Gilbert (git)
` (5 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2019-09-25 15:01 UTC (permalink / raw)
To: qemu-devel, quintela, peterx, marcandre.lureau, richardw.yang,
alex.benee
From: Wei Yang <richardw.yang@linux.intel.com>
This is a cleanup for previous removal of unsentmap.
The sent parameter is not necessary now.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190819061843.28642-4-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
migration/ram.c | 2 +-
migration/trace-events | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/migration/ram.c b/migration/ram.c
index a8b1aa2597..22423f08cd 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2348,7 +2348,7 @@ static bool get_queued_page(RAMState *rs, PageSearchStatus *pss)
dirty = test_bit(page, block->bmap);
if (!dirty) {
trace_get_queued_page_not_dirty(block->idstr, (uint64_t)offset,
- page, test_bit(page, block->bmap));
+ page);
} else {
trace_get_queued_page(block->idstr, (uint64_t)offset, page);
}
diff --git a/migration/trace-events b/migration/trace-events
index 00ffcd5930..858d415d56 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -76,7 +76,7 @@ qemu_file_fclose(void) ""
# ram.c
get_queued_page(const char *block_name, uint64_t tmp_offset, unsigned long page_abs) "%s/0x%" PRIx64 " page_abs=0x%lx"
-get_queued_page_not_dirty(const char *block_name, uint64_t tmp_offset, unsigned long page_abs, int sent) "%s/0x%" PRIx64 " page_abs=0x%lx (sent=%d)"
+get_queued_page_not_dirty(const char *block_name, uint64_t tmp_offset, unsigned long page_abs) "%s/0x%" PRIx64 " page_abs=0x%lx"
migration_bitmap_sync_start(void) ""
migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64
migration_bitmap_clear_dirty(char *str, uint64_t start, uint64_t size, unsigned long page) "rb %s start 0x%"PRIx64" size 0x%"PRIx64" page 0x%lx"
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PULL 5/9] migration/rdma: Don't moan about disconnects at the end
2019-09-25 15:01 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
` (3 preceding siblings ...)
2019-09-25 15:01 ` [PULL 4/9] migration: remove sent parameter in get_queued_page_not_dirty Dr. David Alan Gilbert (git)
@ 2019-09-25 15:01 ` Dr. David Alan Gilbert (git)
2019-09-25 15:01 ` [PULL 6/9] migration/rdma.c: Swap synchronize_rcu for call_rcu Dr. David Alan Gilbert (git)
` (4 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2019-09-25 15:01 UTC (permalink / raw)
To: qemu-devel, quintela, peterx, marcandre.lureau, richardw.yang,
alex.benee
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
If we've already finished the migration or something has
already gone wrong, don't moan about the migration stream disconnecting.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190913163507.1403-2-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
migration/rdma.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/migration/rdma.c b/migration/rdma.c
index 78e6b72bac..0fcf02f48e 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3253,10 +3253,14 @@ static void rdma_cm_poll_handler(void *opaque)
if (cm_event->event == RDMA_CM_EVENT_DISCONNECTED ||
cm_event->event == RDMA_CM_EVENT_DEVICE_REMOVAL) {
- error_report("receive cm event, cm event is %d", cm_event->event);
- rdma->error_state = -EPIPE;
- if (rdma->return_path) {
- rdma->return_path->error_state = -EPIPE;
+ if (!rdma->error_state &&
+ migration_incoming_get_current()->state !=
+ MIGRATION_STATUS_COMPLETED) {
+ error_report("receive cm event, cm event is %d", cm_event->event);
+ rdma->error_state = -EPIPE;
+ if (rdma->return_path) {
+ rdma->return_path->error_state = -EPIPE;
+ }
}
if (mis->migration_incoming_co) {
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PULL 6/9] migration/rdma.c: Swap synchronize_rcu for call_rcu
2019-09-25 15:01 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
` (4 preceding siblings ...)
2019-09-25 15:01 ` [PULL 5/9] migration/rdma: Don't moan about disconnects at the end Dr. David Alan Gilbert (git)
@ 2019-09-25 15:01 ` Dr. David Alan Gilbert (git)
2019-09-25 15:01 ` [PULL 7/9] tests/migration: Fail on unexpected migration states Dr. David Alan Gilbert (git)
` (3 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2019-09-25 15:01 UTC (permalink / raw)
To: qemu-devel, quintela, peterx, marcandre.lureau, richardw.yang,
alex.benee
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
This fixes a deadlock that can occur on the migration source after
a failed RDMA migration; as the source tries to cleanup it
clears a pair of pointers and uses synchronize_rcu to wait; this
is happening on the main thread. With the CPUs running
a CPU thread can be an rcu reader and attempt to grab the main lock
(kvm_handle_io->address_space_write->flatview_write->flatview_write_continue->
prepare_mmio_access->qemu_mutex_lock_iothread_impl)
Replace the synchronize_rcu with a call_rcu to postpone the freeing.
Fixes: 74637e6f08fceda98806 ("migration: implement bi-directional RDMA QIOChannel")
( https://bugzilla.redhat.com/show_bug.cgi?id=1746787 )
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190913163507.1403-3-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
migration/rdma.c | 39 +++++++++++++++++++++++++++------------
1 file changed, 27 insertions(+), 12 deletions(-)
diff --git a/migration/rdma.c b/migration/rdma.c
index 0fcf02f48e..4c74e88a37 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3017,11 +3017,35 @@ static void qio_channel_rdma_set_aio_fd_handler(QIOChannel *ioc,
}
}
+struct rdma_close_rcu {
+ struct rcu_head rcu;
+ RDMAContext *rdmain;
+ RDMAContext *rdmaout;
+};
+
+/* callback from qio_channel_rdma_close via call_rcu */
+static void qio_channel_rdma_close_rcu(struct rdma_close_rcu *rcu)
+{
+ if (rcu->rdmain) {
+ qemu_rdma_cleanup(rcu->rdmain);
+ }
+
+ if (rcu->rdmaout) {
+ qemu_rdma_cleanup(rcu->rdmaout);
+ }
+
+ g_free(rcu->rdmain);
+ g_free(rcu->rdmaout);
+ g_free(rcu);
+}
+
static int qio_channel_rdma_close(QIOChannel *ioc,
Error **errp)
{
QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
RDMAContext *rdmain, *rdmaout;
+ struct rdma_close_rcu *rcu = g_new(struct rdma_close_rcu, 1);
+
trace_qemu_rdma_close();
rdmain = rioc->rdmain;
@@ -3034,18 +3058,9 @@ static int qio_channel_rdma_close(QIOChannel *ioc,
atomic_rcu_set(&rioc->rdmaout, NULL);
}
- synchronize_rcu();
-
- if (rdmain) {
- qemu_rdma_cleanup(rdmain);
- }
-
- if (rdmaout) {
- qemu_rdma_cleanup(rdmaout);
- }
-
- g_free(rdmain);
- g_free(rdmaout);
+ rcu->rdmain = rdmain;
+ rcu->rdmaout = rdmaout;
+ call_rcu(rcu, qio_channel_rdma_close_rcu, rcu);
return 0;
}
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PULL 7/9] tests/migration: Fail on unexpected migration states
2019-09-25 15:01 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
` (5 preceding siblings ...)
2019-09-25 15:01 ` [PULL 6/9] migration/rdma.c: Swap synchronize_rcu for call_rcu Dr. David Alan Gilbert (git)
@ 2019-09-25 15:01 ` Dr. David Alan Gilbert (git)
2019-09-25 15:01 ` [PULL 8/9] tests/migration/postcopy: trim migration bandwidth Dr. David Alan Gilbert (git)
` (2 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2019-09-25 15:01 UTC (permalink / raw)
To: qemu-devel, quintela, peterx, marcandre.lureau, richardw.yang,
alex.benee
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
We've got various places where we wait for a migration to enter
a given state; but if we enter an unexpected state we tend to fail
in odd ways; add a mechanism for explicitly testing for any state
which we shouldn't be in.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190923131022.15498-2-dgilbert@redhat.com>
Reviewed-by: Cleber Rosa <crosa@redhat.com>
Tested-by: Cleber Rosa <crosa@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
tests/migration-test.c | 23 +++++++++++++++++------
1 file changed, 17 insertions(+), 6 deletions(-)
diff --git a/tests/migration-test.c b/tests/migration-test.c
index 258aa064d4..9c62ee5331 100644
--- a/tests/migration-test.c
+++ b/tests/migration-test.c
@@ -255,15 +255,19 @@ static void read_blocktime(QTestState *who)
}
static void wait_for_migration_status(QTestState *who,
- const char *goal)
+ const char *goal,
+ const char **ungoals)
{
while (true) {
bool completed;
char *status;
+ const char **ungoal;
status = migrate_query_status(who);
completed = strcmp(status, goal) == 0;
- g_assert_cmpstr(status, !=, "failed");
+ for (ungoal = ungoals; *ungoal; ungoal++) {
+ g_assert_cmpstr(status, !=, *ungoal);
+ }
g_free(status);
if (completed) {
return;
@@ -274,7 +278,8 @@ static void wait_for_migration_status(QTestState *who,
static void wait_for_migration_complete(QTestState *who)
{
- wait_for_migration_status(who, "completed");
+ wait_for_migration_status(who, "completed",
+ (const char * []) { "failed", NULL });
}
static void wait_for_migration_pass(QTestState *who)
@@ -809,7 +814,9 @@ static void test_postcopy_recovery(void)
* Wait until postcopy is really started; we can only run the
* migrate-pause command during a postcopy
*/
- wait_for_migration_status(from, "postcopy-active");
+ wait_for_migration_status(from, "postcopy-active",
+ (const char * []) { "failed",
+ "completed", NULL });
/*
* Manually stop the postcopy migration. This emulates a network
@@ -822,7 +829,9 @@ static void test_postcopy_recovery(void)
* migrate-recover command can only succeed if destination machine
* is in the paused state
*/
- wait_for_migration_status(to, "postcopy-paused");
+ wait_for_migration_status(to, "postcopy-paused",
+ (const char * []) { "failed", "active",
+ "completed", NULL });
/*
* Create a new socket to emulate a new channel that is different
@@ -836,7 +845,9 @@ static void test_postcopy_recovery(void)
* Try to rebuild the migration channel using the resume flag and
* the newly created channel
*/
- wait_for_migration_status(from, "postcopy-paused");
+ wait_for_migration_status(from, "postcopy-paused",
+ (const char * []) { "failed", "active",
+ "completed", NULL });
migrate(from, uri, "{'resume': true}");
g_free(uri);
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PULL 8/9] tests/migration/postcopy: trim migration bandwidth
2019-09-25 15:01 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
` (6 preceding siblings ...)
2019-09-25 15:01 ` [PULL 7/9] tests/migration: Fail on unexpected migration states Dr. David Alan Gilbert (git)
@ 2019-09-25 15:01 ` Dr. David Alan Gilbert (git)
2019-09-25 15:01 ` [PULL 9/9] migration/postcopy: Recognise the recovery states as 'in_postcopy' Dr. David Alan Gilbert (git)
2019-09-26 15:13 ` [PULL 0/9] migration queue Peter Maydell
9 siblings, 0 replies; 11+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2019-09-25 15:01 UTC (permalink / raw)
To: qemu-devel, quintela, peterx, marcandre.lureau, richardw.yang,
alex.benee
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
On slow hosts with tcg we were sometimes finding that the migration
would complete during precopy and never get into the postcopy test.
Trim back the bandwidth a bit to make that much less likely.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190923131022.15498-3-dgilbert@redhat.com>
Reviewed-by: Cleber Rosa <crosa@redhat.com>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Cleber Rosa <crosa@redhat.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
tests/migration-test.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tests/migration-test.c b/tests/migration-test.c
index 9c62ee5331..221a33d083 100644
--- a/tests/migration-test.c
+++ b/tests/migration-test.c
@@ -753,7 +753,7 @@ static int migrate_postcopy_prepare(QTestState **from_ptr,
* quickly, but that it doesn't complete precopy even on a slow
* machine, so also set the downtime.
*/
- migrate_set_parameter_int(from, "max-bandwidth", 100000000);
+ migrate_set_parameter_int(from, "max-bandwidth", 30000000);
migrate_set_parameter_int(from, "downtime-limit", 1);
/* Wait for the first serial output from the source */
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PULL 9/9] migration/postcopy: Recognise the recovery states as 'in_postcopy'
2019-09-25 15:01 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
` (7 preceding siblings ...)
2019-09-25 15:01 ` [PULL 8/9] tests/migration/postcopy: trim migration bandwidth Dr. David Alan Gilbert (git)
@ 2019-09-25 15:01 ` Dr. David Alan Gilbert (git)
2019-09-26 15:13 ` [PULL 0/9] migration queue Peter Maydell
9 siblings, 0 replies; 11+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2019-09-25 15:01 UTC (permalink / raw)
To: qemu-devel, quintela, peterx, marcandre.lureau, richardw.yang,
alex.benee
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Various parts of the migration code do different things when they're
in postcopy mode; prior to this patch this has been 'postcopy-active'.
This patch extends 'in_postcopy' to include 'postcopy-paused' and
'postcopy-recover'.
In particular, when you set the max-postcopy-bandwidth parameter, this
only affects the current migration fd if we're 'in_postcopy';
this leads to a race in the postcopy recovery test where it increases
the speed from 4k/sec to unlimited, but that increase can get ignored
if the change is made between the point at which the reconnection
happens and it transitions back to active.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190923174942.12182-1-dgilbert@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
migration/migration.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/migration/migration.c b/migration/migration.c
index 01863a95f5..5f7e4d15e9 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1659,7 +1659,14 @@ bool migration_in_postcopy(void)
{
MigrationState *s = migrate_get_current();
- return (s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
+ switch (s->state) {
+ case MIGRATION_STATUS_POSTCOPY_ACTIVE:
+ case MIGRATION_STATUS_POSTCOPY_PAUSED:
+ case MIGRATION_STATUS_POSTCOPY_RECOVER:
+ return true;
+ default:
+ return false;
+ }
}
bool migration_in_postcopy_after_devices(MigrationState *s)
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PULL 0/9] migration queue
2019-09-25 15:01 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
` (8 preceding siblings ...)
2019-09-25 15:01 ` [PULL 9/9] migration/postcopy: Recognise the recovery states as 'in_postcopy' Dr. David Alan Gilbert (git)
@ 2019-09-26 15:13 ` Peter Maydell
9 siblings, 0 replies; 11+ messages in thread
From: Peter Maydell @ 2019-09-26 15:13 UTC (permalink / raw)
To: Dr. David Alan Gilbert (git)
Cc: Juan Quintela, QEMU Developers, Peter Xu, alex.benee, Wei Yang,
Marc-André Lureau
On Wed, 25 Sep 2019 at 16:06, Dr. David Alan Gilbert (git)
<dgilbert@redhat.com> wrote:
>
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> The following changes since commit 240ab11fb72049d6373cbbec8d788f8e411a00bc:
>
> Merge remote-tracking branch 'remotes/aperard/tags/pull-xen-20190924' into staging (2019-09-24 15:36:31 +0100)
>
> are available in the Git repository at:
>
> git://github.com/dagrh/qemu.git tags/pull-migration-20190925a
>
> for you to fetch changes up to 3748fef9b95a9bc1602f3c4ed2a329d8ef47e63c:
>
> migration/postcopy: Recognise the recovery states as 'in_postcopy' (2019-09-25 15:51:19 +0100)
>
> ----------------------------------------------------------------
> Migration pull 2019-09-25
>
> me: test fixes from (should stop hangs in postcopy tests).
> me: An RDMA cleanup hang fix
> Wei: Tidy ups around postcopy
> Marc-Andre: mem leak fix
>
Applied, thanks.
Please update the changelog at https://wiki.qemu.org/ChangeLog/4.2
for any user-visible changes.
-- PMM
^ permalink raw reply [flat|nested] 11+ messages in thread