From: peterx@redhat.com
To: Peter Maydell <peter.maydell@linaro.org>, qemu-devel@nongnu.org
Cc: "David Hildenbrand" <david@redhat.com>,
	"Eric Blake" <eblake@redhat.com>,
	"Laurent Vivier" <lvivier@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Cédric Le Goater" <clg@redhat.com>,
	"Gerd Hoffmann" <kraxel@redhat.com>,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Fabiano Rosas" <farosas@suse.de>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Thomas Huth" <thuth@redhat.com>,
	"Jason Wang" <jasowang@redhat.com>,
	"Alex Williamson" <alex.williamson@redhat.com>,
	"Peter Xu" <peterx@redhat.com>
Subject: [PULL 24/25] migration: Join the return path thread before releasing to_dst_file
Date: Wed, 28 Feb 2024 13:13:14 +0800
Message-ID: <20240228051315.400759-25-peterx@redhat.com>
In-Reply-To: <20240228051315.400759-1-peterx@redhat.com>

From: Fabiano Rosas <farosas@suse.de>

The return path thread might hang at a blocking system call. Before
joining the thread we might need to issue a shutdown() on the socket
file descriptor to release it. To determine whether the shutdown() is
necessary we look at the QEMUFile error.

Make sure we only clean up the QEMUFile after the return path thread
has been joined.

This fixes a hang when qemu_savevm_state_setup() produces an error
that is detected by migration_detect_error(). In that case
migration_completion() is skipped, so close_return_path_on_source()
would get stuck waiting for the RP thread to terminate.

Reported-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240226203122.22894-2-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.c | 22 +++++++++-------------
 1 file changed, 9 insertions(+), 13 deletions(-)
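
Note: the ordering this patch enforces (wake the blocked thread,
join it, and only then release the file) can be sketched outside of
QEMU as below. This is a minimal illustration, not QEMU code: it
assumes a POSIX system with pthreads and uses a socketpair() in place
of the migration channel. The names rp_thread_fn() and
shutdown_then_join() are hypothetical. Unlike
close_return_path_on_source(), which consults the QEMUFile error to
decide whether a shutdown() is needed, the sketch always issues it.

```c
#include <pthread.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the return path thread: blocks in recv(). */
static void *rp_thread_fn(void *opaque)
{
    int fd = *(int *)opaque;
    char buf[64];
    ssize_t n;

    /* Loops until recv() reports EOF (0) or an error (-1). */
    do {
        n = recv(fd, buf, sizeof(buf), 0);
    } while (n > 0);

    return NULL;
}

/* Returns 0 if the thread was woken, joined and cleaned up, -1 otherwise. */
int shutdown_then_join(void)
{
    int sv[2];
    pthread_t tid;
    int ret = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    if (pthread_create(&tid, NULL, rp_thread_fn, &sv[0]) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* 1. Release the thread from the blocking call. */
    if (shutdown(sv[0], SHUT_RDWR) < 0) {
        ret = -1;
        /* Fall back to closing the peer so the reader still sees EOF. */
        close(sv[1]);
        sv[1] = -1;
    }

    /* 2. Wait for the thread while the descriptor is still valid. */
    if (pthread_join(tid, NULL) != 0) {
        ret = -1;
    }

    /* 3. Only now release the descriptors. */
    close(sv[0]);
    if (sv[1] >= 0) {
        close(sv[1]);
    }
    return ret;
}
```

Calling shutdown_then_join() from a test program should return 0
without hanging. Closing the descriptor before the join would instead
leave the thread using a descriptor that is no longer valid, and
would discard whatever state the caller needs to decide how to wake
it.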

diff --git a/migration/migration.c b/migration/migration.c
index ccb13fa94a..7ba2b60e46 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1342,6 +1342,8 @@ static void migrate_fd_cleanup(MigrationState *s)
 
     qemu_savevm_state_cleanup();
 
+    close_return_path_on_source(s);
+
     if (s->to_dst_file) {
         QEMUFile *tmp;
 
@@ -1366,12 +1368,6 @@ static void migrate_fd_cleanup(MigrationState *s)
         qemu_fclose(tmp);
     }
 
-    /*
-     * We already cleaned up to_dst_file, so errors from the return
-     * path might be due to that, ignore them.
-     */
-    close_return_path_on_source(s);
-
     assert(!migration_is_active(s));
 
     if (s->state == MIGRATION_STATUS_CANCELLING) {
@@ -2914,6 +2910,13 @@ static MigThrError postcopy_pause(MigrationState *s)
     while (true) {
         QEMUFile *file;
 
+        /*
+         * We're already pausing, so ignore any errors on the return
+         * path and just wait for the thread to finish. It will be
+         * re-created when we resume.
+         */
+        close_return_path_on_source(s);
+
         /*
          * Current channel is possibly broken. Release it.  Note that this is
          * guaranteed even without lock because to_dst_file should only be
@@ -2933,13 +2936,6 @@ static MigThrError postcopy_pause(MigrationState *s)
         qemu_file_shutdown(file);
         qemu_fclose(file);
 
-        /*
-         * We're already pausing, so ignore any errors on the return
-         * path and just wait for the thread to finish. It will be
-         * re-created when we resume.
-         */
-        close_return_path_on_source(s);
-
         migrate_set_state(&s->state, s->state,
                           MIGRATION_STATUS_POSTCOPY_PAUSED);
 
-- 
2.43.0



