* [PATCH] migration: Yield coroutine when receiving MIG_CMD_POSTCOPY_LISTEN
@ 2024-03-29  3:32 Lei Wang
From: Lei Wang @ 2024-03-29  3:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, farosas, wei.w.wang, lei4.wang

When using the post-copy preemption feature to perform post-copy live
migration, the following scenario can lead to a deadlock, so the migration
never finishes:

 - The source connect()s the preemption channel in postcopy_start().
 - The source and destination TCP stacks complete the 3-way handshake, so
   the connection is established.
 - The destination main thread is busy loading the bulk RAM pages, so it
   does not handle the pending connection event in the event loop and does
   not post the semaphore postcopy_qemufile_dst_done for the preemption
   thread.
 - The source side sends non-iterative device states, such as the virtio
   states.
 - The destination main thread starts to receive the virtio states. This
   may trigger a page fault (e.g., virtio_load()->vring_avail_idx() may
   fault because the avail ring page has not been received yet).
 - The page request is sent back to the source side. The source sends the
   page content to the destination side preemption thread.
 - Since the connection event has not been handled and the semaphore
   postcopy_qemufile_dst_done has not been posted, the preemption thread
   on the destination side is blocked and cannot receive the page.
 - The QEMU main load thread on the destination side is stuck at the page
   fault, so it cannot yield and handle the connect() event for the
   preemption channel to unblock the preemption thread.
 - Postcopy is stuck there forever, since this is a deadlock.

The key to reproducing this bug is that the source side sends pages faster
than the destination can handle them. Otherwise, the qemu_get_be64() in
ram_load_precopy() has a chance to yield, because at that point there is no
pending data in the buffer to read. That yield makes the bug harder to
reproduce.

Fix this by yielding the load coroutine when receiving
MIG_CMD_POSTCOPY_LISTEN, so the main event loop can handle the connection
event before the non-iterative device states are loaded, which avoids the
deadlock.
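
The ordering problem described above can be illustrated with a small
single-threaded toy model. This is only a sketch and not QEMU code: every
toy_* name is hypothetical, the "semaphore" is a plain flag, and yielding to
the event loop is modelled as one explicit main-loop iteration. It assumes
the bulk RAM load never yields, which is the reproduction condition stated
above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of the toy_* names below are QEMU APIs. */
struct toy_dest {
    bool connect_event_pending; /* preempt channel connected, not yet handled */
    bool channel_ready;         /* stands in for postcopy_qemufile_dst_done */
};

/* One main-loop iteration: handle the pending connection event, if any. */
static void toy_main_loop_iterate(struct toy_dest *d)
{
    if (d->connect_event_pending) {
        d->connect_event_pending = false;
        d->channel_ready = true; /* "post" the semaphore */
    }
}

/*
 * Loading device state faults on a page that only the preempt thread can
 * receive, and that thread waits for channel_ready. Returns false when the
 * fault could never be resolved, i.e. the real system would hang.
 */
static bool toy_load_device_state(const struct toy_dest *d)
{
    return d->channel_ready;
}

/* Returns true if the load completes, false if it would deadlock. */
static bool toy_incoming_load(bool yield_on_listen)
{
    struct toy_dest d = {
        .connect_event_pending = true, /* handshake already finished */
        .channel_ready = false,
    };

    /* Bulk RAM load: the source outpaces us, so the loader never yields. */

    /* MIG_CMD_POSTCOPY_LISTEN arrives. */
    if (yield_on_listen) {
        toy_main_loop_iterate(&d); /* models yielding to the event loop */
    }

    return toy_load_device_state(&d);
}
```

In this model, toy_incoming_load(false) reports the hang and
toy_incoming_load(true) completes, because the connection event is handled
before the device state is loaded.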

Signed-off-by: Lei Wang <lei4.wang@intel.com>
---
 migration/savevm.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/migration/savevm.c b/migration/savevm.c
index e386c5267f..8fd4dc92f2 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2445,6 +2445,11 @@ static int loadvm_process_command(QEMUFile *f)
         return loadvm_postcopy_handle_advise(mis, len);
 
     case MIG_CMD_POSTCOPY_LISTEN:
+        if (migrate_postcopy_preempt() && qemu_in_coroutine()) {
+            aio_co_schedule(qemu_get_current_aio_context(),
+                            qemu_coroutine_self());
+            qemu_coroutine_yield();
+        }
         return loadvm_postcopy_handle_listen(mis);
 
     case MIG_CMD_POSTCOPY_RUN:
-- 
2.39.3




Thread overview: 17+ messages
2024-03-29  3:32 [PATCH] migration: Yield coroutine when receiving MIG_CMD_POSTCOPY_LISTEN Lei Wang
2024-03-29  8:54 ` Wang, Wei W
2024-04-01 16:13   ` Peter Xu
2024-04-01 17:17     ` Fabiano Rosas
2024-04-01 18:47       ` Peter Xu
2024-04-01 21:22         ` Fabiano Rosas
2024-04-02  6:55     ` Wang, Lei
2024-04-02  7:25       ` Wang, Wei W
2024-04-02  9:28         ` Wang, Lei
2024-04-02 21:39           ` Peter Xu
2024-04-03  8:35             ` Wang, Lei
2024-04-03 14:42               ` Peter Xu
2024-04-03 16:04                 ` Wang, Wei W
2024-04-03 16:33                   ` Peter Xu
2024-04-04 10:11                     ` Wang, Wei W
2024-04-02  7:20     ` Wang, Wei W
2024-04-02 21:43       ` Peter Xu
