All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 00/21] migration: Postcopy Preemption
@ 2022-04-25 23:38 Peter Xu
  2022-04-25 23:38 ` [PATCH v5 01/21] tests: fix encoding of IP addresses in x509 certs Peter Xu
                   ` (21 more replies)
  0 siblings, 22 replies; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

This is v5 of postcopy preempt series.  It can also be found here:

  https://github.com/xzpeter/qemu/tree/postcopy-preempt

RFC: https://lore.kernel.org/qemu-devel/20220119080929.39485-1-peterx@redhat.com
V1:  https://lore.kernel.org/qemu-devel/20220216062809.57179-1-peterx@redhat.com
V2:  https://lore.kernel.org/qemu-devel/20220301083925.33483-1-peterx@redhat.com
V3:  https://lore.kernel.org/qemu-devel/20220330213908.26608-1-peterx@redhat.com
V4:  https://lore.kernel.org/qemu-devel/20220331150857.74406-1-peterx@redhat.com

v4->v5 changelog:
- Fixed all checkpatch.pl warnings
- Picked up leftover patches from Dan's tls test case series:
  https://lore.kernel.org/qemu-devel/20220310171821.3724080-1-berrange@redhat.com/
- Rebased to v7.0.0 tag, collected more R-bs from Dave/Dan
- In migrate_fd_cleanup(), use g_clear_pointer() for s->hostname [Dan]
- Mark postcopy-preempt capability for 7.1 not 7.0 [Dan]
- Moved migrate_channel_requires_tls() into tls.[ch] [Dan]
- Mention the bug-fixing side effect of patch "migration: Export
  tls-[creds|hostname|authz] params to cmdline too" on tls_authz [Dan]
- Use g_autoptr where proper [Dan]
- Drop a few (probably over-cautious) asserts on local_err being set [Dan]
- Quite a few renamings in the qtest in the last few test patches [Dan]

Abstract
========

This series contains two parts now:

  (1) Leftover patches from Dan's tls unit tests v2, which is first half
  (2) Leftover patches from my postcopy preempt v4, which is second half

This series added a new migration capability called "postcopy-preempt".  It can
be enabled when postcopy is enabled, and it'll simply (but greatly) speed up
postcopy page requests handling process.

Below are some initial postcopy page request latency measurements after the
new series applied.

For each page size, I measured page request latency for three cases:

  (a) Vanilla:                the old postcopy
  (b) Preempt no-break-huge:  preempt enabled, x-postcopy-preempt-break-huge=off
  (c) Preempt full:           preempt enabled, x-postcopy-preempt-break-huge=on
                              (this is the default option when preempt enabled)

Here x-postcopy-preempt-break-huge parameter is just added in v2 so as to
conditionally disable the behavior to break sending a precopy huge page for
debugging purpose.  So when it's off, postcopy will not preempt precopy
sending a huge page, but still postcopy will use its own channel.

I tested it separately to give a rough idea on which part of the change
helped how much of it.  The overall benefit should be the comparison
between case (a) and (c).

  |-----------+---------+-----------------------+--------------|
  | Page size | Vanilla | Preempt no-break-huge | Preempt full |
  |-----------+---------+-----------------------+--------------|
  | 4K        |   10.68 |               N/A [*] |         0.57 |
  | 2M        |   10.58 |                  5.49 |         5.02 |
  | 1G        | 2046.65 |               933.185 |      649.445 |
  |-----------+---------+-----------------------+--------------|
  [*]: This case is N/A because 4K page does not contain huge page at all

[1] https://github.com/xzpeter/small-stuffs/blob/master/tools/huge_vm/uffd-latency.bpf

TODO List
=========

Avoid precopy write() blocks postcopy
-------------------------------------

I didn't prove this, but I always think the write() syscalls being blocked
for precopy pages can affect postcopy services.  If we can solve this
problem then my wild guess is we can further reduce the average page
latency.

Two solutions at least in mind: (1) we could have made the write side of
the migration channel NON_BLOCK too, or (2) multi-threads on send side,
just like multifd, but we may use lock to protect which page to send too
(e.g., the core idea is we should _never_ rely anything on the main thread,
multifd has that dependency on queuing pages only on main thread).

That can definitely be done and thought about later.

Multi-channel for preemption threads
------------------------------------

Currently the postcopy preempt feature use only one extra channel and one
extra thread on dest (no new thread on src QEMU).  It should be mostly good
enough for major use cases, but when the postcopy queue is long enough
(e.g. hundreds of vCPUs faulted on different pages) logically we could
still observe more delays in average.  Whether growing threads/channels can
solve it is debatable, but sounds worthwhile a try.  That's yet another
thing we can think about after this patchset lands.

Logically the design provides space for that - the receiving postcopy
preempt thread can understand all ram-layer migration protocol, and for
multi channel and multi threads we could simply grow that into multile
threads handling the same protocol (with multiple PostcopyTmpPage).  The
source needs more thoughts on synchronizations, though, but it shouldn't
affect the whole protocol layer, so should be easy to keep compatible.

Please review, thanks.

Daniel P. Berrangé (9):
  tests: fix encoding of IP addresses in x509 certs
  tests: add more helper macros for creating TLS x509 certs
  tests: add migration tests of TLS with PSK credentials
  tests: add migration tests of TLS with x509 credentials
  tests: convert XBZRLE migration test to use common helper
  tests: convert multifd migration tests to use common helper
  tests: add multifd migration tests of TLS with PSK credentials
  tests: add multifd migration tests of TLS with x509 credentials
  tests: ensure migration status isn't reported as failed

Peter Xu (12):
  migration: Add postcopy-preempt capability
  migration: Postcopy preemption preparation on channel creation
  migration: Postcopy preemption enablement
  migration: Postcopy recover with preempt enabled
  migration: Create the postcopy preempt channel asynchronously
  migration: Parameter x-postcopy-preempt-break-huge
  migration: Add helpers to detect TLS capability
  migration: Export tls-[creds|hostname|authz] params to cmdline too
  migration: Enable TLS for preempt channel
  tests: Add postcopy tls migration test
  tests: Add postcopy tls recovery migration test
  tests: Add postcopy preempt tests

 meson.build                          |   1 +
 migration/channel.c                  |  10 +-
 migration/migration.c                | 146 +++-
 migration/migration.h                |  46 +-
 migration/multifd.c                  |   7 +-
 migration/postcopy-ram.c             | 186 ++++-
 migration/postcopy-ram.h             |  11 +
 migration/qemu-file.c                |  27 +
 migration/qemu-file.h                |   1 +
 migration/ram.c                      | 283 +++++++-
 migration/ram.h                      |   4 +-
 migration/savevm.c                   |  46 +-
 migration/socket.c                   |  22 +-
 migration/socket.h                   |   1 +
 migration/tls.c                      |   9 +
 migration/tls.h                      |   4 +
 migration/trace-events               |  15 +-
 qapi/migration.json                  |   8 +-
 tests/qtest/meson.build              |  12 +-
 tests/qtest/migration-helpers.c      |  13 +
 tests/qtest/migration-helpers.h      |   1 +
 tests/qtest/migration-test.c         | 997 ++++++++++++++++++++++++---
 tests/unit/crypto-tls-psk-helpers.c  |  18 +-
 tests/unit/crypto-tls-psk-helpers.h  |   1 +
 tests/unit/crypto-tls-x509-helpers.c |  16 +-
 tests/unit/crypto-tls-x509-helpers.h |  53 ++
 tests/unit/test-crypto-tlssession.c  |  11 +-
 27 files changed, 1779 insertions(+), 170 deletions(-)

-- 
2.32.0



^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v5 01/21] tests: fix encoding of IP addresses in x509 certs
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-04-25 23:38 ` [PATCH v5 02/21] tests: add more helper macros for creating TLS " Peter Xu
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

From: Daniel P. Berrangé <berrange@redhat.com>

We need to encode just the address bytes, not the whole struct sockaddr
data. Add a test case to validate that we're matching on SAN IP
addresses correctly.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 tests/unit/crypto-tls-x509-helpers.c | 16 +++++++++++++---
 tests/unit/test-crypto-tlssession.c  | 11 +++++++++--
 2 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/tests/unit/crypto-tls-x509-helpers.c b/tests/unit/crypto-tls-x509-helpers.c
index fc609b3fd4..e9937f60d8 100644
--- a/tests/unit/crypto-tls-x509-helpers.c
+++ b/tests/unit/crypto-tls-x509-helpers.c
@@ -168,9 +168,19 @@ test_tls_get_ipaddr(const char *addrstr,
     hints.ai_flags = AI_NUMERICHOST;
     g_assert(getaddrinfo(addrstr, NULL, &hints, &res) == 0);
 
-    *datalen = res->ai_addrlen;
-    *data = g_new(char, *datalen);
-    memcpy(*data, res->ai_addr, *datalen);
+    if (res->ai_family == AF_INET) {
+        struct sockaddr_in *in = (struct sockaddr_in *)res->ai_addr;
+        *datalen = sizeof(in->sin_addr);
+        *data = g_new(char, *datalen);
+        memcpy(*data, &in->sin_addr, *datalen);
+    } else if (res->ai_family == AF_INET6) {
+        struct sockaddr_in6 *in = (struct sockaddr_in6 *)res->ai_addr;
+        *datalen = sizeof(in->sin6_addr);
+        *data = g_new(char, *datalen);
+        memcpy(*data, &in->sin6_addr, *datalen);
+    } else {
+        g_assert_not_reached();
+    }
     freeaddrinfo(res);
 }
 
diff --git a/tests/unit/test-crypto-tlssession.c b/tests/unit/test-crypto-tlssession.c
index 5f0da9192c..a6935d8497 100644
--- a/tests/unit/test-crypto-tlssession.c
+++ b/tests/unit/test-crypto-tlssession.c
@@ -512,12 +512,19 @@ int main(int argc, char **argv)
                   false, true, "wiki.qemu.org", NULL);
 
     TEST_SESS_REG(altname4, cacertreq.filename,
+                  servercertalt1req.filename, clientcertreq.filename,
+                  false, false, "192.168.122.1", NULL);
+    TEST_SESS_REG(altname5, cacertreq.filename,
+                  servercertalt1req.filename, clientcertreq.filename,
+                  false, false, "fec0::dead:beaf", NULL);
+
+    TEST_SESS_REG(altname6, cacertreq.filename,
                   servercertalt2req.filename, clientcertreq.filename,
                   false, true, "qemu.org", NULL);
-    TEST_SESS_REG(altname5, cacertreq.filename,
+    TEST_SESS_REG(altname7, cacertreq.filename,
                   servercertalt2req.filename, clientcertreq.filename,
                   false, false, "www.qemu.org", NULL);
-    TEST_SESS_REG(altname6, cacertreq.filename,
+    TEST_SESS_REG(altname8, cacertreq.filename,
                   servercertalt2req.filename, clientcertreq.filename,
                   false, false, "wiki.qemu.org", NULL);
 
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 02/21] tests: add more helper macros for creating TLS x509 certs
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
  2022-04-25 23:38 ` [PATCH v5 01/21] tests: fix encoding of IP addresses in x509 certs Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-04-25 23:38 ` [PATCH v5 03/21] tests: add migration tests of TLS with PSK credentials Peter Xu
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

From: Daniel P. Berrangé <berrange@redhat.com>

These macros are more suited to the general consumers of certs in the
test suite, where we don't need to exercise every single possible
permutation.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 tests/unit/crypto-tls-x509-helpers.h | 53 ++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/tests/unit/crypto-tls-x509-helpers.h b/tests/unit/crypto-tls-x509-helpers.h
index cf6329e653..247e7160eb 100644
--- a/tests/unit/crypto-tls-x509-helpers.h
+++ b/tests/unit/crypto-tls-x509-helpers.h
@@ -26,6 +26,9 @@
 #include <libtasn1.h>
 
 
+#define QCRYPTO_TLS_TEST_CLIENT_NAME "ACME QEMU Client"
+#define QCRYPTO_TLS_TEST_CLIENT_HOSTILE_NAME "ACME Hostile Client"
+
 /*
  * This contains parameter about how to generate
  * certificates.
@@ -118,6 +121,56 @@ void test_tls_cleanup(const char *keyfile);
     };                                                                  \
     test_tls_generate_cert(&varname, NULL)
 
+# define TLS_ROOT_REQ_SIMPLE(varname, fname)                            \
+    QCryptoTLSTestCertReq varname = {                                   \
+        .filename = fname,                                              \
+        .cn = "qemu-CA",                                                \
+        .basicConstraintsEnable = true,                                 \
+        .basicConstraintsCritical = true,                               \
+        .basicConstraintsIsCA = true,                                   \
+        .keyUsageEnable = true,                                         \
+        .keyUsageCritical = true,                                       \
+        .keyUsageValue = GNUTLS_KEY_KEY_CERT_SIGN,                      \
+    };                                                                  \
+    test_tls_generate_cert(&varname, NULL)
+
+# define TLS_CERT_REQ_SIMPLE_CLIENT(varname, cavarname, cname, fname)   \
+    QCryptoTLSTestCertReq varname = {                                   \
+        .filename = fname,                                              \
+        .cn = cname,                                                    \
+        .basicConstraintsEnable = true,                                 \
+        .basicConstraintsCritical = true,                               \
+        .basicConstraintsIsCA = false,                                  \
+        .keyUsageEnable = true,                                         \
+        .keyUsageCritical = true,                                       \
+        .keyUsageValue =                                                \
+        GNUTLS_KEY_DIGITAL_SIGNATURE | GNUTLS_KEY_KEY_ENCIPHERMENT,     \
+        .keyPurposeEnable = true,                                       \
+        .keyPurposeCritical = true,                                     \
+        .keyPurposeOID1 = GNUTLS_KP_TLS_WWW_CLIENT,                     \
+    };                                                                  \
+    test_tls_generate_cert(&varname, cavarname.crt)
+
+# define TLS_CERT_REQ_SIMPLE_SERVER(varname, cavarname, fname,          \
+                                    hostname, ipaddr)                   \
+    QCryptoTLSTestCertReq varname = {                                   \
+        .filename = fname,                                              \
+        .cn = hostname ? hostname : ipaddr,                             \
+        .altname1 = hostname,                                           \
+        .ipaddr1 = ipaddr,                                              \
+        .basicConstraintsEnable = true,                                 \
+        .basicConstraintsCritical = true,                               \
+        .basicConstraintsIsCA = false,                                  \
+        .keyUsageEnable = true,                                         \
+        .keyUsageCritical = true,                                       \
+        .keyUsageValue =                                                \
+        GNUTLS_KEY_DIGITAL_SIGNATURE | GNUTLS_KEY_KEY_ENCIPHERMENT,     \
+        .keyPurposeEnable = true,                                       \
+        .keyPurposeCritical = true,                                     \
+        .keyPurposeOID1 = GNUTLS_KP_TLS_WWW_SERVER,                     \
+    };                                                                  \
+    test_tls_generate_cert(&varname, cavarname.crt)
+
 extern const asn1_static_node pkix_asn1_tab[];
 
 #endif
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 03/21] tests: add migration tests of TLS with PSK credentials
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
  2022-04-25 23:38 ` [PATCH v5 01/21] tests: fix encoding of IP addresses in x509 certs Peter Xu
  2022-04-25 23:38 ` [PATCH v5 02/21] tests: add more helper macros for creating TLS " Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-04-25 23:38 ` [PATCH v5 04/21] tests: add migration tests of TLS with x509 credentials Peter Xu
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

From: Daniel P. Berrangé <berrange@redhat.com>

This validates that we correctly handle migration success and failure
scenarios when using TLS with pre shared keys.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 tests/qtest/meson.build             |   7 +-
 tests/qtest/migration-test.c        | 159 +++++++++++++++++++++++++++-
 tests/unit/crypto-tls-psk-helpers.c |  18 +++-
 tests/unit/crypto-tls-psk-helpers.h |   1 +
 4 files changed, 177 insertions(+), 8 deletions(-)

diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 22e1361210..ec14559e73 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -271,13 +271,18 @@ endif
 
 tpmemu_files = ['tpm-emu.c', 'tpm-util.c', 'tpm-tests.c']
 
+migration_files = [files('migration-helpers.c')]
+if gnutls.found()
+  migration_files += [files('../unit/crypto-tls-psk-helpers.c'), gnutls]
+endif
+
 qtests = {
   'bios-tables-test': [io, 'boot-sector.c', 'acpi-utils.c', 'tpm-emu.c'],
   'cdrom-test': files('boot-sector.c'),
   'dbus-vmstate-test': files('migration-helpers.c') + dbus_vmstate1,
   'erst-test': files('erst-test.c'),
   'ivshmem-test': [rt, '../../contrib/ivshmem-server/ivshmem-server.c'],
-  'migration-test': files('migration-helpers.c'),
+  'migration-test': migration_files,
   'pxe-test': files('boot-sector.c'),
   'qos-test': [chardev, io, qos_test_ss.apply(config_host, strict: false).sources()],
   'tpm-crb-swtpm-test': [io, tpmemu_files],
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 2af36c16a3..f733aa352e 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -23,9 +23,13 @@
 #include "qapi/qapi-visit-sockets.h"
 #include "qapi/qobject-input-visitor.h"
 #include "qapi/qobject-output-visitor.h"
+#include "crypto/tlscredspsk.h"
 
 #include "migration-helpers.h"
 #include "tests/migration/migration-test.h"
+#ifdef CONFIG_GNUTLS
+# include "tests/unit/crypto-tls-psk-helpers.h"
+#endif /* CONFIG_GNUTLS */
 
 /* For dirty ring test; so far only x86_64 is supported */
 #if defined(__linux__) && defined(HOST_X86_64)
@@ -640,6 +644,100 @@ static void test_migrate_end(QTestState *from, QTestState *to, bool test_dest)
     cleanup("dest_serial");
 }
 
+#ifdef CONFIG_GNUTLS
+struct TestMigrateTLSPSKData {
+    char *workdir;
+    char *workdiralt;
+    char *pskfile;
+    char *pskfilealt;
+};
+
+static void *
+test_migrate_tls_psk_start_common(QTestState *from,
+                                  QTestState *to,
+                                  bool mismatch)
+{
+    struct TestMigrateTLSPSKData *data =
+        g_new0(struct TestMigrateTLSPSKData, 1);
+    QDict *rsp;
+
+    data->workdir = g_strdup_printf("%s/tlscredspsk0", tmpfs);
+    data->pskfile = g_strdup_printf("%s/%s", data->workdir,
+                                    QCRYPTO_TLS_CREDS_PSKFILE);
+    mkdir(data->workdir, 0700);
+    test_tls_psk_init(data->pskfile);
+
+    if (mismatch) {
+        data->workdiralt = g_strdup_printf("%s/tlscredspskalt0", tmpfs);
+        data->pskfilealt = g_strdup_printf("%s/%s", data->workdiralt,
+                                           QCRYPTO_TLS_CREDS_PSKFILE);
+        mkdir(data->workdiralt, 0700);
+        test_tls_psk_init_alt(data->pskfilealt);
+    }
+
+    rsp = wait_command(from,
+                       "{ 'execute': 'object-add',"
+                       "  'arguments': { 'qom-type': 'tls-creds-psk',"
+                       "                 'id': 'tlscredspsk0',"
+                       "                 'endpoint': 'client',"
+                       "                 'dir': %s,"
+                       "                 'username': 'qemu'} }",
+                       data->workdir);
+    qobject_unref(rsp);
+
+    rsp = wait_command(to,
+                       "{ 'execute': 'object-add',"
+                       "  'arguments': { 'qom-type': 'tls-creds-psk',"
+                       "                 'id': 'tlscredspsk0',"
+                       "                 'endpoint': 'server',"
+                       "                 'dir': %s } }",
+                       mismatch ? data->workdiralt : data->workdir);
+    qobject_unref(rsp);
+
+    migrate_set_parameter_str(from, "tls-creds", "tlscredspsk0");
+    migrate_set_parameter_str(to, "tls-creds", "tlscredspsk0");
+
+    return data;
+}
+
+static void *
+test_migrate_tls_psk_start_match(QTestState *from,
+                                 QTestState *to)
+{
+    return test_migrate_tls_psk_start_common(from, to, false);
+}
+
+static void *
+test_migrate_tls_psk_start_mismatch(QTestState *from,
+                                    QTestState *to)
+{
+    return test_migrate_tls_psk_start_common(from, to, true);
+}
+
+static void
+test_migrate_tls_psk_finish(QTestState *from,
+                            QTestState *to,
+                            void *opaque)
+{
+    struct TestMigrateTLSPSKData *data = opaque;
+
+    test_tls_psk_cleanup(data->pskfile);
+    if (data->pskfilealt) {
+        test_tls_psk_cleanup(data->pskfilealt);
+    }
+    rmdir(data->workdir);
+    if (data->workdiralt) {
+        rmdir(data->workdiralt);
+    }
+
+    g_free(data->workdiralt);
+    g_free(data->pskfilealt);
+    g_free(data->workdir);
+    g_free(data->pskfile);
+    g_free(data);
+}
+#endif /* CONFIG_GNUTLS */
+
 static int migrate_postcopy_prepare(QTestState **from_ptr,
                                     QTestState **to_ptr,
                                     MigrateStart *args)
@@ -911,7 +1009,7 @@ static void test_precopy_common(MigrateCommon *args)
     test_migrate_end(from, to, args->result == MIG_TEST_SUCCEED);
 }
 
-static void test_precopy_unix(void)
+static void test_precopy_unix_plain(void)
 {
     g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
     MigrateCommon args = {
@@ -922,6 +1020,19 @@ static void test_precopy_unix(void)
     test_precopy_common(&args);
 }
 
+static void test_precopy_unix_tls_psk(void)
+{
+    g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
+    MigrateCommon args = {
+        .connect_uri = uri,
+        .listen_uri = uri,
+        .start_hook = test_migrate_tls_psk_start_match,
+        .finish_hook = test_migrate_tls_psk_finish,
+    };
+
+    test_precopy_common(&args);
+}
+
 static void test_precopy_unix_dirty_ring(void)
 {
     g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
@@ -1026,7 +1137,7 @@ static void test_xbzrle_unix(void)
     test_xbzrle(uri);
 }
 
-static void test_precopy_tcp(void)
+static void test_precopy_tcp_plain(void)
 {
     MigrateCommon args = {
         .listen_uri = "tcp:127.0.0.1:0",
@@ -1035,6 +1146,34 @@ static void test_precopy_tcp(void)
     test_precopy_common(&args);
 }
 
+#ifdef CONFIG_GNUTLS
+static void test_precopy_tcp_tls_psk_match(void)
+{
+    MigrateCommon args = {
+        .listen_uri = "tcp:127.0.0.1:0",
+        .start_hook = test_migrate_tls_psk_start_match,
+        .finish_hook = test_migrate_tls_psk_finish,
+    };
+
+    test_precopy_common(&args);
+}
+
+static void test_precopy_tcp_tls_psk_mismatch(void)
+{
+    MigrateCommon args = {
+        .start = {
+            .hide_stderr = true,
+        },
+        .listen_uri = "tcp:127.0.0.1:0",
+        .start_hook = test_migrate_tls_psk_start_mismatch,
+        .finish_hook = test_migrate_tls_psk_finish,
+        .result = MIG_TEST_FAIL,
+    };
+
+    test_precopy_common(&args);
+}
+#endif /* CONFIG_GNUTLS */
+
 static void *test_migrate_fd_start_hook(QTestState *from,
                                         QTestState *to)
 {
@@ -1497,8 +1636,20 @@ int main(int argc, char **argv)
     qtest_add_func("/migration/postcopy/unix", test_postcopy);
     qtest_add_func("/migration/postcopy/recovery", test_postcopy_recovery);
     qtest_add_func("/migration/bad_dest", test_baddest);
-    qtest_add_func("/migration/precopy/unix", test_precopy_unix);
-    qtest_add_func("/migration/precopy/tcp", test_precopy_tcp);
+    qtest_add_func("/migration/precopy/unix/plain", test_precopy_unix_plain);
+#ifdef CONFIG_GNUTLS
+    qtest_add_func("/migration/precopy/unix/tls/psk",
+                   test_precopy_unix_tls_psk);
+#endif /* CONFIG_GNUTLS */
+
+    qtest_add_func("/migration/precopy/tcp/plain", test_precopy_tcp_plain);
+#ifdef CONFIG_GNUTLS
+    qtest_add_func("/migration/precopy/tcp/tls/psk/match",
+                   test_precopy_tcp_tls_psk_match);
+    qtest_add_func("/migration/precopy/tcp/tls/psk/mismatch",
+                   test_precopy_tcp_tls_psk_mismatch);
+#endif /* CONFIG_GNUTLS */
+
     /* qtest_add_func("/migration/ignore_shared", test_ignore_shared); */
     qtest_add_func("/migration/xbzrle/unix", test_xbzrle_unix);
     qtest_add_func("/migration/fd_proto", test_migrate_fd_proto);
diff --git a/tests/unit/crypto-tls-psk-helpers.c b/tests/unit/crypto-tls-psk-helpers.c
index 4bea7c6fa2..511e08cc9c 100644
--- a/tests/unit/crypto-tls-psk-helpers.c
+++ b/tests/unit/crypto-tls-psk-helpers.c
@@ -24,7 +24,8 @@
 #include "crypto-tls-psk-helpers.h"
 #include "qemu/sockets.h"
 
-void test_tls_psk_init(const char *pskfile)
+static void
+test_tls_psk_init_common(const char *pskfile, const char *user, const char *key)
 {
     FILE *fp;
 
@@ -33,11 +34,22 @@ void test_tls_psk_init(const char *pskfile)
         g_critical("Failed to create pskfile %s: %s", pskfile, strerror(errno));
         abort();
     }
-    /* Don't do this in real applications!  Use psktool. */
-    fprintf(fp, "qemu:009d5638c40fde0c\n");
+    fprintf(fp, "%s:%s\n", user, key);
     fclose(fp);
 }
 
+void test_tls_psk_init(const char *pskfile)
+{
+    /* Don't hard code a key like this in real applications!  Use psktool. */
+    test_tls_psk_init_common(pskfile, "qemu", "009d5638c40fde0c");
+}
+
+void test_tls_psk_init_alt(const char *pskfile)
+{
+    /* Don't hard code a key like this in real applications!  Use psktool. */
+    test_tls_psk_init_common(pskfile, "qemu", "10ffa6a2c42f0388");
+}
+
 void test_tls_psk_cleanup(const char *pskfile)
 {
     unlink(pskfile);
diff --git a/tests/unit/crypto-tls-psk-helpers.h b/tests/unit/crypto-tls-psk-helpers.h
index faa645c629..67f8bdda71 100644
--- a/tests/unit/crypto-tls-psk-helpers.h
+++ b/tests/unit/crypto-tls-psk-helpers.h
@@ -24,6 +24,7 @@
 #include <gnutls/gnutls.h>
 
 void test_tls_psk_init(const char *keyfile);
+void test_tls_psk_init_alt(const char *keyfile);
 void test_tls_psk_cleanup(const char *keyfile);
 
 #endif
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 04/21] tests: add migration tests of TLS with x509 credentials
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
                   ` (2 preceding siblings ...)
  2022-04-25 23:38 ` [PATCH v5 03/21] tests: add migration tests of TLS with PSK credentials Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-04-25 23:38 ` [PATCH v5 05/21] tests: convert XBZRLE migration test to use common helper Peter Xu
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

From: Daniel P. Berrangé <berrange@redhat.com>

This validates that we correctly handle migration success and failure
scenarios when using TLS with x509 certificates. There are quite a few
different scenarios that matter in relation to hostname validation.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 meson.build                  |   1 +
 tests/qtest/meson.build      |   5 +
 tests/qtest/migration-test.c | 382 ++++++++++++++++++++++++++++++++++-
 3 files changed, 386 insertions(+), 2 deletions(-)

diff --git a/meson.build b/meson.build
index d083c6b7bf..d1231b23ae 100644
--- a/meson.build
+++ b/meson.build
@@ -1565,6 +1565,7 @@ config_host_data.set('CONFIG_KEYUTILS', keyutils.found())
 config_host_data.set('CONFIG_GETTID', has_gettid)
 config_host_data.set('CONFIG_GNUTLS', gnutls.found())
 config_host_data.set('CONFIG_GNUTLS_CRYPTO', gnutls_crypto.found())
+config_host_data.set('CONFIG_TASN1', tasn1.found())
 config_host_data.set('CONFIG_GCRYPT', gcrypt.found())
 config_host_data.set('CONFIG_NETTLE', nettle.found())
 config_host_data.set('CONFIG_QEMU_PRIVATE_XTS', xts == 'private')
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index ec14559e73..af7c31d611 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -274,6 +274,11 @@ tpmemu_files = ['tpm-emu.c', 'tpm-util.c', 'tpm-tests.c']
 migration_files = [files('migration-helpers.c')]
 if gnutls.found()
   migration_files += [files('../unit/crypto-tls-psk-helpers.c'), gnutls]
+
+  if tasn1.found()
+    migration_files += [files('../unit/crypto-tls-x509-helpers.c',
+                              '../unit/pkix_asn1_tab.c'), tasn1]
+  endif
 endif
 
 qtests = {
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index f733aa352e..c730697f74 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -29,6 +29,9 @@
 #include "tests/migration/migration-test.h"
 #ifdef CONFIG_GNUTLS
 # include "tests/unit/crypto-tls-psk-helpers.h"
+# ifdef CONFIG_TASN1
+#  include "tests/unit/crypto-tls-x509-helpers.h"
+# endif /* CONFIG_TASN1 */
 #endif /* CONFIG_GNUTLS */
 
 /* For dirty ring test; so far only x86_64 is supported */
@@ -736,6 +739,234 @@ test_migrate_tls_psk_finish(QTestState *from,
     g_free(data->pskfile);
     g_free(data);
 }
+
+#ifdef CONFIG_TASN1
+typedef struct {
+    char *workdir;
+    char *keyfile;
+    char *cacert;
+    char *servercert;
+    char *serverkey;
+    char *clientcert;
+    char *clientkey;
+} TestMigrateTLSX509Data;
+
+typedef struct {
+    bool verifyclient;
+    bool clientcert;
+    bool hostileclient;
+    bool authzclient;
+    const char *certhostname;
+    const char *certipaddr;
+} TestMigrateTLSX509;
+
+static void *
+test_migrate_tls_x509_start_common(QTestState *from,
+                                   QTestState *to,
+                                   TestMigrateTLSX509 *args)
+{
+    TestMigrateTLSX509Data *data = g_new0(TestMigrateTLSX509Data, 1);
+    QDict *rsp;
+
+    data->workdir = g_strdup_printf("%s/tlscredsx5090", tmpfs);
+    data->keyfile = g_strdup_printf("%s/key.pem", data->workdir);
+
+    data->cacert = g_strdup_printf("%s/ca-cert.pem", data->workdir);
+    data->serverkey = g_strdup_printf("%s/server-key.pem", data->workdir);
+    data->servercert = g_strdup_printf("%s/server-cert.pem", data->workdir);
+    if (args->clientcert) {
+        data->clientkey = g_strdup_printf("%s/client-key.pem", data->workdir);
+        data->clientcert = g_strdup_printf("%s/client-cert.pem", data->workdir);
+    }
+
+    mkdir(data->workdir, 0700);
+
+    test_tls_init(data->keyfile);
+    g_assert(link(data->keyfile, data->serverkey) == 0);
+    if (args->clientcert) {
+        g_assert(link(data->keyfile, data->clientkey) == 0);
+    }
+
+    TLS_ROOT_REQ_SIMPLE(cacertreq, data->cacert);
+    if (args->clientcert) {
+        TLS_CERT_REQ_SIMPLE_CLIENT(servercertreq, cacertreq,
+                                   args->hostileclient ?
+                                   QCRYPTO_TLS_TEST_CLIENT_HOSTILE_NAME :
+                                   QCRYPTO_TLS_TEST_CLIENT_NAME,
+                                   data->clientcert);
+    }
+
+    TLS_CERT_REQ_SIMPLE_SERVER(clientcertreq, cacertreq,
+                               data->servercert,
+                               args->certhostname,
+                               args->certipaddr);
+
+    rsp = wait_command(from,
+                       "{ 'execute': 'object-add',"
+                       "  'arguments': { 'qom-type': 'tls-creds-x509',"
+                       "                 'id': 'tlscredsx509client0',"
+                       "                 'endpoint': 'client',"
+                       "                 'dir': %s,"
+                       "                 'sanity-check': true,"
+                       "                 'verify-peer': true} }",
+                       data->workdir);
+    qobject_unref(rsp);
+    migrate_set_parameter_str(from, "tls-creds", "tlscredsx509client0");
+    if (args->certhostname) {
+        migrate_set_parameter_str(from, "tls-hostname", args->certhostname);
+    }
+
+    rsp = wait_command(to,
+                       "{ 'execute': 'object-add',"
+                       "  'arguments': { 'qom-type': 'tls-creds-x509',"
+                       "                 'id': 'tlscredsx509server0',"
+                       "                 'endpoint': 'server',"
+                       "                 'dir': %s,"
+                       "                 'sanity-check': true,"
+                       "                 'verify-peer': %i} }",
+                       data->workdir, args->verifyclient);
+    qobject_unref(rsp);
+    migrate_set_parameter_str(to, "tls-creds", "tlscredsx509server0");
+
+    if (args->authzclient) {
+        rsp = wait_command(to,
+                           "{ 'execute': 'object-add',"
+                           "  'arguments': { 'qom-type': 'authz-simple',"
+                           "                 'id': 'tlsauthz0',"
+                           "                 'identity': %s} }",
+                           "CN=" QCRYPTO_TLS_TEST_CLIENT_NAME);
+        migrate_set_parameter_str(to, "tls-authz", "tlsauthz0");
+    }
+
+    return data;
+}
+
+/*
+ * The normal case: match server's cert hostname against
+ * whatever host we were telling QEMU to connect to (if any)
+ */
+static void *
+test_migrate_tls_x509_start_default_host(QTestState *from,
+                                         QTestState *to)
+{
+    TestMigrateTLSX509 args = {
+        .verifyclient = true,
+        .clientcert = true,
+        .certipaddr = "127.0.0.1"
+    };
+    return test_migrate_tls_x509_start_common(from, to, &args);
+}
+
+/*
+ * The unusual case: the server's cert is different from
+ * the address we're telling QEMU to connect to (if any),
+ * so we must give QEMU an explicit hostname to validate
+ */
+static void *
+test_migrate_tls_x509_start_override_host(QTestState *from,
+                                          QTestState *to)
+{
+    TestMigrateTLSX509 args = {
+        .verifyclient = true,
+        .clientcert = true,
+        .certhostname = "qemu.org",
+    };
+    return test_migrate_tls_x509_start_common(from, to, &args);
+}
+
+/*
+ * The unusual case: the server's cert is different from
+ * the address we're telling QEMU to connect to, and so we
+ * expect the client to reject the server
+ */
+static void *
+test_migrate_tls_x509_start_mismatch_host(QTestState *from,
+                                          QTestState *to)
+{
+    TestMigrateTLSX509 args = {
+        .verifyclient = true,
+        .clientcert = true,
+        .certipaddr = "10.0.0.1",
+    };
+    return test_migrate_tls_x509_start_common(from, to, &args);
+}
+
+static void *
+test_migrate_tls_x509_start_friendly_client(QTestState *from,
+                                            QTestState *to)
+{
+    TestMigrateTLSX509 args = {
+        .verifyclient = true,
+        .clientcert = true,
+        .authzclient = true,
+        .certipaddr = "127.0.0.1",
+    };
+    return test_migrate_tls_x509_start_common(from, to, &args);
+}
+
+static void *
+test_migrate_tls_x509_start_hostile_client(QTestState *from,
+                                           QTestState *to)
+{
+    TestMigrateTLSX509 args = {
+        .verifyclient = true,
+        .clientcert = true,
+        .hostileclient = true,
+        .authzclient = true,
+        .certipaddr = "127.0.0.1",
+    };
+    return test_migrate_tls_x509_start_common(from, to, &args);
+}
+
+/*
+ * The case with no client certificate presented,
+ * and no server verification
+ */
+static void *
+test_migrate_tls_x509_start_allow_anon_client(QTestState *from,
+                                              QTestState *to)
+{
+    TestMigrateTLSX509 args = {
+        .certipaddr = "127.0.0.1",
+    };
+    return test_migrate_tls_x509_start_common(from, to, &args);
+}
+
+/*
+ * The case with no client certificate presented,
+ * and server verification rejecting
+ */
+static void *
+test_migrate_tls_x509_start_reject_anon_client(QTestState *from,
+                                               QTestState *to)
+{
+    TestMigrateTLSX509 args = {
+        .verifyclient = true,
+        .certipaddr = "127.0.0.1",
+    };
+    return test_migrate_tls_x509_start_common(from, to, &args);
+}
+
+static void
+test_migrate_tls_x509_finish(QTestState *from,
+                             QTestState *to,
+                             void *opaque)
+{
+    TestMigrateTLSX509Data *data = opaque;
+
+    test_tls_cleanup(data->keyfile);
+    unlink(data->cacert);
+    unlink(data->servercert);
+    unlink(data->serverkey);
+    unlink(data->clientcert);
+    unlink(data->clientkey);
+    rmdir(data->workdir);
+
+    g_free(data->workdir);
+    g_free(data->keyfile);
+    g_free(data);
+}
+#endif /* CONFIG_TASN1 */
 #endif /* CONFIG_GNUTLS */
 
 static int migrate_postcopy_prepare(QTestState **from_ptr,
@@ -1020,6 +1251,21 @@ static void test_precopy_unix_plain(void)
     test_precopy_common(&args);
 }
 
+static void test_precopy_unix_dirty_ring(void)
+{
+    g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
+    MigrateCommon args = {
+        .start = {
+            .use_dirty_ring = true,
+        },
+        .listen_uri = uri,
+        .connect_uri = uri,
+    };
+
+    test_precopy_common(&args);
+}
+
+#ifdef CONFIG_GNUTLS
 static void test_precopy_unix_tls_psk(void)
 {
     g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
@@ -1033,19 +1279,38 @@ static void test_precopy_unix_tls_psk(void)
     test_precopy_common(&args);
 }
 
-static void test_precopy_unix_dirty_ring(void)
+#ifdef CONFIG_TASN1
+static void test_precopy_unix_tls_x509_default_host(void)
 {
     g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
     MigrateCommon args = {
         .start = {
-            .use_dirty_ring = true,
+            .hide_stderr = true,
         },
+        .connect_uri = uri,
         .listen_uri = uri,
+        .start_hook = test_migrate_tls_x509_start_default_host,
+        .finish_hook = test_migrate_tls_x509_finish,
+        .result = MIG_TEST_FAIL_DEST_QUIT_ERR,
+    };
+
+    test_precopy_common(&args);
+}
+
+static void test_precopy_unix_tls_x509_override_host(void)
+{
+    g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
+    MigrateCommon args = {
         .connect_uri = uri,
+        .listen_uri = uri,
+        .start_hook = test_migrate_tls_x509_start_override_host,
+        .finish_hook = test_migrate_tls_x509_finish,
     };
 
     test_precopy_common(&args);
 }
+#endif /* CONFIG_TASN1 */
+#endif /* CONFIG_GNUTLS */
 
 #if 0
 /* Currently upset on aarch64 TCG */
@@ -1172,6 +1437,97 @@ static void test_precopy_tcp_tls_psk_mismatch(void)
 
     test_precopy_common(&args);
 }
+
+#ifdef CONFIG_TASN1
+static void test_precopy_tcp_tls_x509_default_host(void)
+{
+    MigrateCommon args = {
+        .listen_uri = "tcp:127.0.0.1:0",
+        .start_hook = test_migrate_tls_x509_start_default_host,
+        .finish_hook = test_migrate_tls_x509_finish,
+    };
+
+    test_precopy_common(&args);
+}
+
+static void test_precopy_tcp_tls_x509_override_host(void)
+{
+    MigrateCommon args = {
+        .listen_uri = "tcp:127.0.0.1:0",
+        .start_hook = test_migrate_tls_x509_start_override_host,
+        .finish_hook = test_migrate_tls_x509_finish,
+    };
+
+    test_precopy_common(&args);
+}
+
+static void test_precopy_tcp_tls_x509_mismatch_host(void)
+{
+    MigrateCommon args = {
+        .start = {
+            .hide_stderr = true,
+        },
+        .listen_uri = "tcp:127.0.0.1:0",
+        .start_hook = test_migrate_tls_x509_start_mismatch_host,
+        .finish_hook = test_migrate_tls_x509_finish,
+        .result = MIG_TEST_FAIL_DEST_QUIT_ERR,
+    };
+
+    test_precopy_common(&args);
+}
+
+static void test_precopy_tcp_tls_x509_friendly_client(void)
+{
+    MigrateCommon args = {
+        .listen_uri = "tcp:127.0.0.1:0",
+        .start_hook = test_migrate_tls_x509_start_friendly_client,
+        .finish_hook = test_migrate_tls_x509_finish,
+    };
+
+    test_precopy_common(&args);
+}
+
+static void test_precopy_tcp_tls_x509_hostile_client(void)
+{
+    MigrateCommon args = {
+        .start = {
+            .hide_stderr = true,
+        },
+        .listen_uri = "tcp:127.0.0.1:0",
+        .start_hook = test_migrate_tls_x509_start_hostile_client,
+        .finish_hook = test_migrate_tls_x509_finish,
+        .result = MIG_TEST_FAIL,
+    };
+
+    test_precopy_common(&args);
+}
+
+static void test_precopy_tcp_tls_x509_allow_anon_client(void)
+{
+    MigrateCommon args = {
+        .listen_uri = "tcp:127.0.0.1:0",
+        .start_hook = test_migrate_tls_x509_start_allow_anon_client,
+        .finish_hook = test_migrate_tls_x509_finish,
+    };
+
+    test_precopy_common(&args);
+}
+
+static void test_precopy_tcp_tls_x509_reject_anon_client(void)
+{
+    MigrateCommon args = {
+        .start = {
+            .hide_stderr = true,
+        },
+        .listen_uri = "tcp:127.0.0.1:0",
+        .start_hook = test_migrate_tls_x509_start_reject_anon_client,
+        .finish_hook = test_migrate_tls_x509_finish,
+        .result = MIG_TEST_FAIL,
+    };
+
+    test_precopy_common(&args);
+}
+#endif /* CONFIG_TASN1 */
 #endif /* CONFIG_GNUTLS */
 
 static void *test_migrate_fd_start_hook(QTestState *from,
@@ -1640,6 +1996,12 @@ int main(int argc, char **argv)
 #ifdef CONFIG_GNUTLS
     qtest_add_func("/migration/precopy/unix/tls/psk",
                    test_precopy_unix_tls_psk);
+#ifdef CONFIG_TASN1
+    qtest_add_func("/migration/precopy/unix/tls/x509/default-host",
+                   test_precopy_unix_tls_x509_default_host);
+    qtest_add_func("/migration/precopy/unix/tls/x509/override-host",
+                   test_precopy_unix_tls_x509_override_host);
+#endif /* CONFIG_TASN1 */
 #endif /* CONFIG_GNUTLS */
 
     qtest_add_func("/migration/precopy/tcp/plain", test_precopy_tcp_plain);
@@ -1648,6 +2010,22 @@ int main(int argc, char **argv)
                    test_precopy_tcp_tls_psk_match);
     qtest_add_func("/migration/precopy/tcp/tls/psk/mismatch",
                    test_precopy_tcp_tls_psk_mismatch);
+#ifdef CONFIG_TASN1
+    qtest_add_func("/migration/precopy/tcp/tls/x509/default-host",
+                   test_precopy_tcp_tls_x509_default_host);
+    qtest_add_func("/migration/precopy/tcp/tls/x509/override-host",
+                   test_precopy_tcp_tls_x509_override_host);
+    qtest_add_func("/migration/precopy/tcp/tls/x509/mismatch-host",
+                   test_precopy_tcp_tls_x509_mismatch_host);
+    qtest_add_func("/migration/precopy/tcp/tls/x509/friendly-client",
+                   test_precopy_tcp_tls_x509_friendly_client);
+    qtest_add_func("/migration/precopy/tcp/tls/x509/hostile-client",
+                   test_precopy_tcp_tls_x509_hostile_client);
+    qtest_add_func("/migration/precopy/tcp/tls/x509/allow-anon-client",
+                   test_precopy_tcp_tls_x509_allow_anon_client);
+    qtest_add_func("/migration/precopy/tcp/tls/x509/reject-anon-client",
+                   test_precopy_tcp_tls_x509_reject_anon_client);
+#endif /* CONFIG_TASN1 */
 #endif /* CONFIG_GNUTLS */
 
     /* qtest_add_func("/migration/ignore_shared", test_ignore_shared); */
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 05/21] tests: convert XBZRLE migration test to use common helper
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
                   ` (3 preceding siblings ...)
  2022-04-25 23:38 ` [PATCH v5 04/21] tests: add migration tests of TLS with x509 credentials Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-04-25 23:38 ` [PATCH v5 06/21] tests: convert multifd migration tests " Peter Xu
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

From: Daniel P. Berrangé <berrange@redhat.com>

Most of the XBZRLE migration test logic is common with the rest of the
precopy tests, so it can use the helper with just one small tweak.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 tests/qtest/migration-test.c | 67 ++++++++++++++----------------------
 1 file changed, 25 insertions(+), 42 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index c730697f74..043ae94089 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1174,6 +1174,9 @@ typedef struct {
         /* This test should fail, dest qemu should fail with abnormal status */
         MIG_TEST_FAIL_DEST_QUIT_ERR,
     } result;
+
+    /* Optional: set number of migration passes to wait for */
+    unsigned int iterations;
 } MigrateCommon;
 
 static void test_precopy_common(MigrateCommon *args)
@@ -1219,7 +1222,13 @@ static void test_precopy_common(MigrateCommon *args)
             qtest_set_expected_status(to, 1);
         }
     } else {
-        wait_for_migration_pass(from);
+        if (args->iterations) {
+            while (args->iterations--) {
+                wait_for_migration_pass(from);
+            }
+        } else {
+            wait_for_migration_pass(from);
+        }
 
         migrate_set_parameter_int(from, "downtime-limit", CONVERGE_DOWNTIME);
 
@@ -1349,57 +1358,31 @@ static void test_ignore_shared(void)
 }
 #endif
 
-static void test_xbzrle(const char *uri)
+static void *
+test_migrate_xbzrle_start(QTestState *from,
+                          QTestState *to)
 {
-    MigrateStart args = {};
-    QTestState *from, *to;
-
-    if (test_migrate_start(&from, &to, uri, &args)) {
-        return;
-    }
-
-    /*
-     * We want to pick a speed slow enough that the test completes
-     * quickly, but that it doesn't complete precopy even on a slow
-     * machine, so also set the downtime.
-     */
-    /* 1 ms should make it not converge*/
-    migrate_set_parameter_int(from, "downtime-limit", 1);
-    /* 1GB/s */
-    migrate_set_parameter_int(from, "max-bandwidth", 1000000000);
-
     migrate_set_parameter_int(from, "xbzrle-cache-size", 33554432);
 
     migrate_set_capability(from, "xbzrle", true);
     migrate_set_capability(to, "xbzrle", true);
-    /* Wait for the first serial output from the source */
-    wait_for_serial("src_serial");
 
-    migrate_qmp(from, uri, "{}");
-
-    wait_for_migration_pass(from);
-    /* Make sure we have 2 passes, so the xbzrle cache gets a workout */
-    wait_for_migration_pass(from);
-
-    /* 1000ms should converge */
-    migrate_set_parameter_int(from, "downtime-limit", 1000);
-
-    if (!got_stop) {
-        qtest_qmp_eventwait(from, "STOP");
-    }
-    qtest_qmp_eventwait(to, "RESUME");
-
-    wait_for_serial("dest_serial");
-    wait_for_migration_complete(from);
-
-    test_migrate_end(from, to, true);
+    return NULL;
 }
 
-static void test_xbzrle_unix(void)
+static void test_precopy_unix_xbzrle(void)
 {
     g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
+    MigrateCommon args = {
+        .connect_uri = uri,
+        .listen_uri = uri,
+
+        .start_hook = test_migrate_xbzrle_start,
 
-    test_xbzrle(uri);
+        .iterations = 2,
+    };
+
+    test_precopy_common(&args);
 }
 
 static void test_precopy_tcp_plain(void)
@@ -1993,6 +1976,7 @@ int main(int argc, char **argv)
     qtest_add_func("/migration/postcopy/recovery", test_postcopy_recovery);
     qtest_add_func("/migration/bad_dest", test_baddest);
     qtest_add_func("/migration/precopy/unix/plain", test_precopy_unix_plain);
+    qtest_add_func("/migration/precopy/unix/xbzrle", test_precopy_unix_xbzrle);
 #ifdef CONFIG_GNUTLS
     qtest_add_func("/migration/precopy/unix/tls/psk",
                    test_precopy_unix_tls_psk);
@@ -2029,7 +2013,6 @@ int main(int argc, char **argv)
 #endif /* CONFIG_GNUTLS */
 
     /* qtest_add_func("/migration/ignore_shared", test_ignore_shared); */
-    qtest_add_func("/migration/xbzrle/unix", test_xbzrle_unix);
     qtest_add_func("/migration/fd_proto", test_migrate_fd_proto);
     qtest_add_func("/migration/validate_uuid", test_validate_uuid);
     qtest_add_func("/migration/validate_uuid_error", test_validate_uuid_error);
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 06/21] tests: convert multifd migration tests to use common helper
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
                   ` (4 preceding siblings ...)
  2022-04-25 23:38 ` [PATCH v5 05/21] tests: convert XBZRLE migration test to use common helper Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-04-25 23:38 ` [PATCH v5 07/21] tests: add multifd migration tests of TLS with PSK credentials Peter Xu
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

From: Daniel P. Berrangé <berrange@redhat.com>

Most of the multifd migration test logic is common with the rest of the
precopy tests, so it can use the helper without difficulty. The only
exception of the multifd cancellation test which tries to run multiple
migrations in a row.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 tests/qtest/migration-test.c | 77 +++++++++++++++++++-----------------
 1 file changed, 40 insertions(+), 37 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 043ae94089..c1b0b3aca4 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1739,26 +1739,12 @@ static void test_migrate_auto_converge(void)
     test_migrate_end(from, to, true);
 }
 
-static void test_multifd_tcp(const char *method)
+static void *
+test_migrate_precopy_tcp_multifd_start_common(QTestState *from,
+                                              QTestState *to,
+                                              const char *method)
 {
-    MigrateStart args = {};
-    QTestState *from, *to;
     QDict *rsp;
-    g_autofree char *uri = NULL;
-
-    if (test_migrate_start(&from, &to, "defer", &args)) {
-        return;
-    }
-
-    /*
-     * We want to pick a speed slow enough that the test completes
-     * quickly, but that it doesn't complete precopy even on a slow
-     * machine, so also set the downtime.
-     */
-    /* 1 ms should make it not converge*/
-    migrate_set_parameter_int(from, "downtime-limit", 1);
-    /* 1GB/s */
-    migrate_set_parameter_int(from, "max-bandwidth", 1000000000);
 
     migrate_set_parameter_int(from, "multifd-channels", 16);
     migrate_set_parameter_int(to, "multifd-channels", 16);
@@ -1774,41 +1760,58 @@ static void test_multifd_tcp(const char *method)
                            "  'arguments': { 'uri': 'tcp:127.0.0.1:0' }}");
     qobject_unref(rsp);
 
-    /* Wait for the first serial output from the source */
-    wait_for_serial("src_serial");
-
-    uri = migrate_get_socket_address(to, "socket-address");
-
-    migrate_qmp(from, uri, "{}");
-
-    wait_for_migration_pass(from);
+    return NULL;
+}
 
-    migrate_set_parameter_int(from, "downtime-limit", CONVERGE_DOWNTIME);
+static void *
+test_migrate_precopy_tcp_multifd_start(QTestState *from,
+                                       QTestState *to)
+{
+    return test_migrate_precopy_tcp_multifd_start_common(from, to, "none");
+}
 
-    if (!got_stop) {
-        qtest_qmp_eventwait(from, "STOP");
-    }
-    qtest_qmp_eventwait(to, "RESUME");
+static void *
+test_migrate_precopy_tcp_multifd_zlib_start(QTestState *from,
+                                            QTestState *to)
+{
+    return test_migrate_precopy_tcp_multifd_start_common(from, to, "zlib");
+}
 
-    wait_for_serial("dest_serial");
-    wait_for_migration_complete(from);
-    test_migrate_end(from, to, true);
+#ifdef CONFIG_ZSTD
+static void *
+test_migrate_precopy_tcp_multifd_zstd_start(QTestState *from,
+                                            QTestState *to)
+{
+    return test_migrate_precopy_tcp_multifd_start_common(from, to, "zstd");
 }
+#endif /* CONFIG_ZSTD */
 
 static void test_multifd_tcp_none(void)
 {
-    test_multifd_tcp("none");
+    MigrateCommon args = {
+        .listen_uri = "defer",
+        .start_hook = test_migrate_precopy_tcp_multifd_start,
+    };
+    test_precopy_common(&args);
 }
 
 static void test_multifd_tcp_zlib(void)
 {
-    test_multifd_tcp("zlib");
+    MigrateCommon args = {
+        .listen_uri = "defer",
+        .start_hook = test_migrate_precopy_tcp_multifd_zlib_start,
+    };
+    test_precopy_common(&args);
 }
 
 #ifdef CONFIG_ZSTD
 static void test_multifd_tcp_zstd(void)
 {
-    test_multifd_tcp("zstd");
+    MigrateCommon args = {
+        .listen_uri = "defer",
+        .start_hook = test_migrate_precopy_tcp_multifd_zstd_start,
+    };
+    test_precopy_common(&args);
 }
 #endif
 
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 07/21] tests: add multifd migration tests of TLS with PSK credentials
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
                   ` (5 preceding siblings ...)
  2022-04-25 23:38 ` [PATCH v5 06/21] tests: convert multifd migration tests " Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-04-25 23:38 ` [PATCH v5 08/21] tests: add multifd migration tests of TLS with x509 credentials Peter Xu
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

From: Daniel P. Berrangé <berrange@redhat.com>

This validates that we correctly handle multifd migration success
and failure scenarios when using TLS with pre shared keys.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 tests/qtest/migration-test.c | 60 +++++++++++++++++++++++++++++++++---
 1 file changed, 56 insertions(+), 4 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index c1b0b3aca4..f47e4797e2 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1815,6 +1815,48 @@ static void test_multifd_tcp_zstd(void)
 }
 #endif
 
+#ifdef CONFIG_GNUTLS
+static void *
+test_migrate_multifd_tcp_tls_psk_start_match(QTestState *from,
+                                             QTestState *to)
+{
+    test_migrate_precopy_tcp_multifd_start_common(from, to, "none");
+    return test_migrate_tls_psk_start_match(from, to);
+}
+
+static void *
+test_migrate_multifd_tcp_tls_psk_start_mismatch(QTestState *from,
+                                                QTestState *to)
+{
+    test_migrate_precopy_tcp_multifd_start_common(from, to, "none");
+    return test_migrate_tls_psk_start_mismatch(from, to);
+}
+
+static void test_multifd_tcp_tls_psk_match(void)
+{
+    MigrateCommon args = {
+        .listen_uri = "defer",
+        .start_hook = test_migrate_multifd_tcp_tls_psk_start_match,
+        .finish_hook = test_migrate_tls_psk_finish,
+    };
+    test_precopy_common(&args);
+}
+
+static void test_multifd_tcp_tls_psk_mismatch(void)
+{
+    MigrateCommon args = {
+        .start = {
+            .hide_stderr = true,
+        },
+        .listen_uri = "defer",
+        .start_hook = test_migrate_multifd_tcp_tls_psk_start_mismatch,
+        .finish_hook = test_migrate_tls_psk_finish,
+        .result = MIG_TEST_FAIL,
+    };
+    test_precopy_common(&args);
+}
+#endif /* CONFIG_GNUTLS */
+
 /*
  * This test does:
  *  source               target
@@ -2025,12 +2067,22 @@ int main(int argc, char **argv)
                    test_validate_uuid_dst_not_set);
 
     qtest_add_func("/migration/auto_converge", test_migrate_auto_converge);
-    qtest_add_func("/migration/multifd/tcp/none", test_multifd_tcp_none);
-    qtest_add_func("/migration/multifd/tcp/cancel", test_multifd_tcp_cancel);
-    qtest_add_func("/migration/multifd/tcp/zlib", test_multifd_tcp_zlib);
+    qtest_add_func("/migration/multifd/tcp/plain/none",
+                   test_multifd_tcp_none);
+    qtest_add_func("/migration/multifd/tcp/plain/cancel",
+                   test_multifd_tcp_cancel);
+    qtest_add_func("/migration/multifd/tcp/plain/zlib",
+                   test_multifd_tcp_zlib);
 #ifdef CONFIG_ZSTD
-    qtest_add_func("/migration/multifd/tcp/zstd", test_multifd_tcp_zstd);
+    qtest_add_func("/migration/multifd/tcp/plain/zstd",
+                   test_multifd_tcp_zstd);
 #endif
+#ifdef CONFIG_GNUTLS
+    qtest_add_func("/migration/multifd/tcp/tls/psk/match",
+                   test_multifd_tcp_tls_psk_match);
+    qtest_add_func("/migration/multifd/tcp/tls/psk/mismatch",
+                   test_multifd_tcp_tls_psk_mismatch);
+#endif /* CONFIG_GNUTLS */
 
     if (kvm_dirty_ring_supported()) {
         qtest_add_func("/migration/dirty_ring",
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 08/21] tests: add multifd migration tests of TLS with x509 credentials
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
                   ` (6 preceding siblings ...)
  2022-04-25 23:38 ` [PATCH v5 07/21] tests: add multifd migration tests of TLS with PSK credentials Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-04-25 23:38 ` [PATCH v5 09/21] tests: ensure migration status isn't reported as failed Peter Xu
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

From: Daniel P. Berrangé <berrange@redhat.com>

This validates that we correctly handle multifd migration success
and failure scenarios when using TLS with x509 certificates. There
are quite a few different scenarios that matter in relation to
hostname validation, but we skip a couple as we can assume that
the non-multifd coverage applies to some extent.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 tests/qtest/migration-test.c | 127 +++++++++++++++++++++++++++++++++++
 1 file changed, 127 insertions(+)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index f47e4797e2..5ea0b9360a 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1832,6 +1832,48 @@ test_migrate_multifd_tcp_tls_psk_start_mismatch(QTestState *from,
     return test_migrate_tls_psk_start_mismatch(from, to);
 }
 
+#ifdef CONFIG_TASN1
+static void *
+test_migrate_multifd_tls_x509_start_default_host(QTestState *from,
+                                                 QTestState *to)
+{
+    test_migrate_precopy_tcp_multifd_start_common(from, to, "none");
+    return test_migrate_tls_x509_start_default_host(from, to);
+}
+
+static void *
+test_migrate_multifd_tls_x509_start_override_host(QTestState *from,
+                                                  QTestState *to)
+{
+    test_migrate_precopy_tcp_multifd_start_common(from, to, "none");
+    return test_migrate_tls_x509_start_override_host(from, to);
+}
+
+static void *
+test_migrate_multifd_tls_x509_start_mismatch_host(QTestState *from,
+                                                  QTestState *to)
+{
+    test_migrate_precopy_tcp_multifd_start_common(from, to, "none");
+    return test_migrate_tls_x509_start_mismatch_host(from, to);
+}
+
+static void *
+test_migrate_multifd_tls_x509_start_allow_anon_client(QTestState *from,
+                                                      QTestState *to)
+{
+    test_migrate_precopy_tcp_multifd_start_common(from, to, "none");
+    return test_migrate_tls_x509_start_allow_anon_client(from, to);
+}
+
+static void *
+test_migrate_multifd_tls_x509_start_reject_anon_client(QTestState *from,
+                                                       QTestState *to)
+{
+    test_migrate_precopy_tcp_multifd_start_common(from, to, "none");
+    return test_migrate_tls_x509_start_reject_anon_client(from, to);
+}
+#endif /* CONFIG_TASN1 */
+
 static void test_multifd_tcp_tls_psk_match(void)
 {
     MigrateCommon args = {
@@ -1855,6 +1897,79 @@ static void test_multifd_tcp_tls_psk_mismatch(void)
     };
     test_precopy_common(&args);
 }
+
+#ifdef CONFIG_TASN1
+static void test_multifd_tcp_tls_x509_default_host(void)
+{
+    MigrateCommon args = {
+        .listen_uri = "defer",
+        .start_hook = test_migrate_multifd_tls_x509_start_default_host,
+        .finish_hook = test_migrate_tls_x509_finish,
+    };
+    test_precopy_common(&args);
+}
+
+static void test_multifd_tcp_tls_x509_override_host(void)
+{
+    MigrateCommon args = {
+        .listen_uri = "defer",
+        .start_hook = test_migrate_multifd_tls_x509_start_override_host,
+        .finish_hook = test_migrate_tls_x509_finish,
+    };
+    test_precopy_common(&args);
+}
+
+static void test_multifd_tcp_tls_x509_mismatch_host(void)
+{
+    /*
+     * This has different behaviour to the non-multifd case.
+     *
+     * In non-multifd case when client aborts due to mismatched
+     * cert host, the server has already started trying to load
+     * migration state, and so it exits with I/O  failure.
+     *
+     * In multifd case when client aborts due to mismatched
+     * cert host, the server is still waiting for the other
+     * multifd connections to arrive so hasn't started trying
+     * to load migration state, and thus just aborts the migration
+     * without exiting
+     */
+    MigrateCommon args = {
+        .start = {
+            .hide_stderr = true,
+        },
+        .listen_uri = "defer",
+        .start_hook = test_migrate_multifd_tls_x509_start_mismatch_host,
+        .finish_hook = test_migrate_tls_x509_finish,
+        .result = MIG_TEST_FAIL,
+    };
+    test_precopy_common(&args);
+}
+
+static void test_multifd_tcp_tls_x509_allow_anon_client(void)
+{
+    MigrateCommon args = {
+        .listen_uri = "defer",
+        .start_hook = test_migrate_multifd_tls_x509_start_allow_anon_client,
+        .finish_hook = test_migrate_tls_x509_finish,
+    };
+    test_precopy_common(&args);
+}
+
+static void test_multifd_tcp_tls_x509_reject_anon_client(void)
+{
+    MigrateCommon args = {
+        .start = {
+            .hide_stderr = true,
+        },
+        .listen_uri = "defer",
+        .start_hook = test_migrate_multifd_tls_x509_start_reject_anon_client,
+        .finish_hook = test_migrate_tls_x509_finish,
+        .result = MIG_TEST_FAIL,
+    };
+    test_precopy_common(&args);
+}
+#endif /* CONFIG_TASN1 */
 #endif /* CONFIG_GNUTLS */
 
 /*
@@ -2082,6 +2197,18 @@ int main(int argc, char **argv)
                    test_multifd_tcp_tls_psk_match);
     qtest_add_func("/migration/multifd/tcp/tls/psk/mismatch",
                    test_multifd_tcp_tls_psk_mismatch);
+#ifdef CONFIG_TASN1
+    qtest_add_func("/migration/multifd/tcp/tls/x509/default-host",
+                   test_multifd_tcp_tls_x509_default_host);
+    qtest_add_func("/migration/multifd/tcp/tls/x509/override-host",
+                   test_multifd_tcp_tls_x509_override_host);
+    qtest_add_func("/migration/multifd/tcp/tls/x509/mismatch-host",
+                   test_multifd_tcp_tls_x509_mismatch_host);
+    qtest_add_func("/migration/multifd/tcp/tls/x509/allow-anon-client",
+                   test_multifd_tcp_tls_x509_allow_anon_client);
+    qtest_add_func("/migration/multifd/tcp/tls/x509/reject-anon-client",
+                   test_multifd_tcp_tls_x509_reject_anon_client);
+#endif /* CONFIG_TASN1 */
 #endif /* CONFIG_GNUTLS */
 
     if (kvm_dirty_ring_supported()) {
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 09/21] tests: ensure migration status isn't reported as failed
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
                   ` (7 preceding siblings ...)
  2022-04-25 23:38 ` [PATCH v5 08/21] tests: add multifd migration tests of TLS with x509 credentials Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-04-25 23:38 ` [PATCH v5 10/21] migration: Add postcopy-preempt capability Peter Xu
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

From: Daniel P. Berrangé <berrange@redhat.com>

Various methods in the migration test call 'query_migrate' to fetch the
current status and then access a particular field. Almost all of these
cases expect the migration to be in a non-failed state. In the case of
'wait_for_migration_pass' in particular, if the status is 'failed' then
it will get into an infinite loop. By validating that the status is
not 'failed' the test suite will assert rather than hang when getting
into an unexpected state.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 tests/qtest/migration-helpers.c | 13 +++++++++++++
 tests/qtest/migration-helpers.h |  1 +
 tests/qtest/migration-test.c    |  6 +++---
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
index 4ee26014b7..a6aa59e4e6 100644
--- a/tests/qtest/migration-helpers.c
+++ b/tests/qtest/migration-helpers.c
@@ -107,6 +107,19 @@ QDict *migrate_query(QTestState *who)
     return wait_command(who, "{ 'execute': 'query-migrate' }");
 }
 
+QDict *migrate_query_not_failed(QTestState *who)
+{
+    const char *status;
+    QDict *rsp = migrate_query(who);
+    status = qdict_get_str(rsp, "status");
+    if (g_str_equal(status, "failed")) {
+        g_printerr("query-migrate shows failed migration: %s\n",
+                   qdict_get_str(rsp, "error-desc"));
+    }
+    g_assert(!g_str_equal(status, "failed"));
+    return rsp;
+}
+
 /*
  * Note: caller is responsible to free the returned object via
  * g_free() after use
diff --git a/tests/qtest/migration-helpers.h b/tests/qtest/migration-helpers.h
index 555adafce1..d07e0fb748 100644
--- a/tests/qtest/migration-helpers.h
+++ b/tests/qtest/migration-helpers.h
@@ -26,6 +26,7 @@ G_GNUC_PRINTF(3, 4)
 void migrate_qmp(QTestState *who, const char *uri, const char *fmt, ...);
 
 QDict *migrate_query(QTestState *who);
+QDict *migrate_query_not_failed(QTestState *who);
 
 void wait_for_migration_status(QTestState *who,
                                const char *goal, const char **ungoals);
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 5ea0b9360a..d9f444ea14 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -181,7 +181,7 @@ static int64_t read_ram_property_int(QTestState *who, const char *property)
     QDict *rsp_return, *rsp_ram;
     int64_t result;
 
-    rsp_return = migrate_query(who);
+    rsp_return = migrate_query_not_failed(who);
     if (!qdict_haskey(rsp_return, "ram")) {
         /* Still in setup */
         result = 0;
@@ -198,7 +198,7 @@ static int64_t read_migrate_property_int(QTestState *who, const char *property)
     QDict *rsp_return;
     int64_t result;
 
-    rsp_return = migrate_query(who);
+    rsp_return = migrate_query_not_failed(who);
     result = qdict_get_try_int(rsp_return, property, 0);
     qobject_unref(rsp_return);
     return result;
@@ -213,7 +213,7 @@ static void read_blocktime(QTestState *who)
 {
     QDict *rsp_return;
 
-    rsp_return = migrate_query(who);
+    rsp_return = migrate_query_not_failed(who);
     g_assert(qdict_haskey(rsp_return, "postcopy-blocktime"));
     qobject_unref(rsp_return);
 }
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 10/21] migration: Add postcopy-preempt capability
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
                   ` (8 preceding siblings ...)
  2022-04-25 23:38 ` [PATCH v5 09/21] tests: ensure migration status isn't reported as failed Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-04-25 23:38 ` [PATCH v5 11/21] migration: Postcopy preemption preparation on channel creation Peter Xu
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

Firstly, postcopy already preempts precopy due to the fact that we do
unqueue_page() first before looking into dirty bits.

However that's not enough, e.g., when there're host huge page enabled, when
sending a precopy huge page, a postcopy request needs to wait until the whole
huge page that is sending to finish.  That could introduce quite some delay,
the bigger the huge page is the larger delay it'll bring.

This patch adds a new capability to allow postcopy requests to preempt existing
precopy page during sending a huge page, so that postcopy requests can be
serviced even faster.

Meanwhile to send it even faster, bypass the precopy stream by providing a
standalone postcopy socket for sending requested pages.

Since the new behavior will not be compatible with the old behavior, this will
not be the default, it's enabled only when the new capability is set on both
src/dst QEMUs.

This patch only adds the capability itself, the logic will be added in follow
up patches.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.c | 23 +++++++++++++++++++++++
 migration/migration.h |  1 +
 qapi/migration.json   |  8 +++++++-
 3 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index 5a31b23bd6..75d9185c3a 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1236,6 +1236,11 @@ static bool migrate_caps_check(bool *cap_list,
             error_setg(errp, "Postcopy is not compatible with ignore-shared");
             return false;
         }
+
+        if (cap_list[MIGRATION_CAPABILITY_MULTIFD]) {
+            error_setg(errp, "Multifd is not supported in postcopy");
+            return false;
+        }
     }
 
     if (cap_list[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT]) {
@@ -1279,6 +1284,13 @@ static bool migrate_caps_check(bool *cap_list,
         return false;
     }
 
+    if (cap_list[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT]) {
+        if (!cap_list[MIGRATION_CAPABILITY_POSTCOPY_RAM]) {
+            error_setg(errp, "Postcopy preempt requires postcopy-ram");
+            return false;
+        }
+    }
+
     return true;
 }
 
@@ -2626,6 +2638,15 @@ bool migrate_background_snapshot(void)
     return s->enabled_capabilities[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT];
 }
 
+bool migrate_postcopy_preempt(void)
+{
+    MigrationState *s;
+
+    s = migrate_get_current();
+
+    return s->enabled_capabilities[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT];
+}
+
 /* migration thread support */
 /*
  * Something bad happened to the RP stream, mark an error
@@ -4236,6 +4257,8 @@ static Property migration_properties[] = {
     DEFINE_PROP_MIG_CAP("x-compress", MIGRATION_CAPABILITY_COMPRESS),
     DEFINE_PROP_MIG_CAP("x-events", MIGRATION_CAPABILITY_EVENTS),
     DEFINE_PROP_MIG_CAP("x-postcopy-ram", MIGRATION_CAPABILITY_POSTCOPY_RAM),
+    DEFINE_PROP_MIG_CAP("x-postcopy-preempt",
+                        MIGRATION_CAPABILITY_POSTCOPY_PREEMPT),
     DEFINE_PROP_MIG_CAP("x-colo", MIGRATION_CAPABILITY_X_COLO),
     DEFINE_PROP_MIG_CAP("x-release-ram", MIGRATION_CAPABILITY_RELEASE_RAM),
     DEFINE_PROP_MIG_CAP("x-block", MIGRATION_CAPABILITY_BLOCK),
diff --git a/migration/migration.h b/migration/migration.h
index a863032b71..af4bcb19c2 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -394,6 +394,7 @@ int migrate_decompress_threads(void);
 bool migrate_use_events(void);
 bool migrate_postcopy_blocktime(void);
 bool migrate_background_snapshot(void);
+bool migrate_postcopy_preempt(void);
 
 /* Sending on the return path - generic and then for each message type */
 void migrate_send_rp_shut(MigrationIncomingState *mis,
diff --git a/qapi/migration.json b/qapi/migration.json
index 409eb086a2..f381a94c98 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -463,6 +463,12 @@
 #                       procedure starts. The VM RAM is saved with running VM.
 #                       (since 6.0)
 #
+# @postcopy-preempt: If enabled, the migration process will allow postcopy
+#                    requests to preempt precopy stream, so postcopy requests
+#                    will be handled faster.  This is a performance feature and
+#                    should not affect the correctness of postcopy migration.
+#                    (since 7.1)
+#
 # Features:
 # @unstable: Members @x-colo and @x-ignore-shared are experimental.
 #
@@ -476,7 +482,7 @@
            'block', 'return-path', 'pause-before-switchover', 'multifd',
            'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate',
            { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] },
-           'validate-uuid', 'background-snapshot'] }
+           'validate-uuid', 'background-snapshot', 'postcopy-preempt'] }
 
 ##
 # @MigrationCapabilityStatus:
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 11/21] migration: Postcopy preemption preparation on channel creation
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
                   ` (9 preceding siblings ...)
  2022-04-25 23:38 ` [PATCH v5 10/21] migration: Add postcopy-preempt capability Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-05-11 17:08   ` manish.mishra
  2022-04-25 23:38 ` [PATCH v5 12/21] migration: Postcopy preemption enablement Peter Xu
                   ` (10 subsequent siblings)
  21 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

Create a new socket for postcopy to be prepared to send postcopy requested
pages via this specific channel, so as to not get blocked by precopy pages.

A new thread is also created on dest qemu to receive data from this new channel
based on the ram_load_postcopy() routine.

The ram_load_postcopy(POSTCOPY) branch and the thread has not started to
function, and that'll be done in follow up patches.

Cleanup the new sockets on both src/dst QEMUs, meanwhile look after the new
thread too to make sure it'll be recycled properly.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/migration.c    | 62 +++++++++++++++++++++++----
 migration/migration.h    |  8 ++++
 migration/postcopy-ram.c | 92 ++++++++++++++++++++++++++++++++++++++--
 migration/postcopy-ram.h | 10 +++++
 migration/ram.c          | 25 ++++++++---
 migration/ram.h          |  4 +-
 migration/savevm.c       | 20 ++++-----
 migration/socket.c       | 22 +++++++++-
 migration/socket.h       |  1 +
 migration/trace-events   |  5 ++-
 10 files changed, 218 insertions(+), 31 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 75d9185c3a..e27aa306bc 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -321,6 +321,12 @@ void migration_incoming_state_destroy(void)
         mis->page_requested = NULL;
     }
 
+    if (mis->postcopy_qemufile_dst) {
+        migration_ioc_unregister_yank_from_file(mis->postcopy_qemufile_dst);
+        qemu_fclose(mis->postcopy_qemufile_dst);
+        mis->postcopy_qemufile_dst = NULL;
+    }
+
     yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 }
 
@@ -714,15 +720,21 @@ void migration_fd_process_incoming(QEMUFile *f, Error **errp)
     migration_incoming_process();
 }
 
+static bool migration_needs_multiple_sockets(void)
+{
+    return migrate_use_multifd() || migrate_postcopy_preempt();
+}
+
 void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
 {
     MigrationIncomingState *mis = migration_incoming_get_current();
     Error *local_err = NULL;
     bool start_migration;
+    QEMUFile *f;
 
     if (!mis->from_src_file) {
         /* The first connection (multifd may have multiple) */
-        QEMUFile *f = qemu_fopen_channel_input(ioc);
+        f = qemu_fopen_channel_input(ioc);
 
         if (!migration_incoming_setup(f, errp)) {
             return;
@@ -730,13 +742,18 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
 
         /*
          * Common migration only needs one channel, so we can start
-         * right now.  Multifd needs more than one channel, we wait.
+         * right now.  Some features need more than one channel, we wait.
          */
-        start_migration = !migrate_use_multifd();
+        start_migration = !migration_needs_multiple_sockets();
     } else {
         /* Multiple connections */
-        assert(migrate_use_multifd());
-        start_migration = multifd_recv_new_channel(ioc, &local_err);
+        assert(migration_needs_multiple_sockets());
+        if (migrate_use_multifd()) {
+            start_migration = multifd_recv_new_channel(ioc, &local_err);
+        } else if (migrate_postcopy_preempt()) {
+            f = qemu_fopen_channel_input(ioc);
+            start_migration = postcopy_preempt_new_channel(mis, f);
+        }
         if (local_err) {
             error_propagate(errp, local_err);
             return;
@@ -761,11 +778,20 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
 bool migration_has_all_channels(void)
 {
     MigrationIncomingState *mis = migration_incoming_get_current();
-    bool all_channels;
 
-    all_channels = multifd_recv_all_channels_created();
+    if (!mis->from_src_file) {
+        return false;
+    }
+
+    if (migrate_use_multifd()) {
+        return multifd_recv_all_channels_created();
+    }
+
+    if (migrate_postcopy_preempt()) {
+        return mis->postcopy_qemufile_dst != NULL;
+    }
 
-    return all_channels && mis->from_src_file != NULL;
+    return true;
 }
 
 /*
@@ -1862,6 +1888,12 @@ static void migrate_fd_cleanup(MigrationState *s)
         qemu_fclose(tmp);
     }
 
+    if (s->postcopy_qemufile_src) {
+        migration_ioc_unregister_yank_from_file(s->postcopy_qemufile_src);
+        qemu_fclose(s->postcopy_qemufile_src);
+        s->postcopy_qemufile_src = NULL;
+    }
+
     assert(!migration_is_active(s));
 
     if (s->state == MIGRATION_STATUS_CANCELLING) {
@@ -3237,6 +3269,11 @@ static void migration_completion(MigrationState *s)
         qemu_savevm_state_complete_postcopy(s->to_dst_file);
         qemu_mutex_unlock_iothread();
 
+        /* Shutdown the postcopy fast path thread */
+        if (migrate_postcopy_preempt()) {
+            postcopy_preempt_shutdown_file(s);
+        }
+
         trace_migration_completion_postcopy_end_after_complete();
     } else {
         goto fail;
@@ -4124,6 +4161,15 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
         }
     }
 
+    /* This needs to be done before resuming a postcopy */
+    if (postcopy_preempt_setup(s, &local_err)) {
+        error_report_err(local_err);
+        migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
+                          MIGRATION_STATUS_FAILED);
+        migrate_fd_cleanup(s);
+        return;
+    }
+
     if (resume) {
         /* Wakeup the main migration thread to do the recovery */
         migrate_set_state(&s->state, MIGRATION_STATUS_POSTCOPY_PAUSED,
diff --git a/migration/migration.h b/migration/migration.h
index af4bcb19c2..caa910d956 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -23,6 +23,7 @@
 #include "io/channel-buffer.h"
 #include "net/announce.h"
 #include "qom/object.h"
+#include "postcopy-ram.h"
 
 struct PostcopyBlocktimeContext;
 
@@ -112,6 +113,11 @@ struct MigrationIncomingState {
      * enabled.
      */
     unsigned int postcopy_channels;
+    /* QEMUFile for postcopy only; it'll be handled by a separate thread */
+    QEMUFile *postcopy_qemufile_dst;
+    /* Postcopy priority thread is used to receive postcopy requested pages */
+    QemuThread postcopy_prio_thread;
+    bool postcopy_prio_thread_created;
     /*
      * An array of temp host huge pages to be used, one for each postcopy
      * channel.
@@ -192,6 +198,8 @@ struct MigrationState {
     QEMUBH *cleanup_bh;
     /* Protected by qemu_file_lock */
     QEMUFile *to_dst_file;
+    /* Postcopy specific transfer channel */
+    QEMUFile *postcopy_qemufile_src;
     QIOChannelBuffer *bioc;
     /*
      * Protects to_dst_file/from_dst_file pointers.  We need to make sure we
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index a66dd536d9..e92db0556b 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -33,6 +33,9 @@
 #include "trace.h"
 #include "hw/boards.h"
 #include "exec/ramblock.h"
+#include "socket.h"
+#include "qemu-file-channel.h"
+#include "yank_functions.h"
 
 /* Arbitrary limit on size of each discard command,
  * keeps them around ~200 bytes
@@ -567,6 +570,11 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingState *mis)
 {
     trace_postcopy_ram_incoming_cleanup_entry();
 
+    if (mis->postcopy_prio_thread_created) {
+        qemu_thread_join(&mis->postcopy_prio_thread);
+        mis->postcopy_prio_thread_created = false;
+    }
+
     if (mis->have_fault_thread) {
         Error *local_err = NULL;
 
@@ -1102,8 +1110,13 @@ static int postcopy_temp_pages_setup(MigrationIncomingState *mis)
     int err, i, channels;
     void *temp_page;
 
-    /* TODO: will be boosted when enable postcopy preemption */
-    mis->postcopy_channels = 1;
+    if (migrate_postcopy_preempt()) {
+        /* If preemption enabled, need extra channel for urgent requests */
+        mis->postcopy_channels = RAM_CHANNEL_MAX;
+    } else {
+        /* Both precopy/postcopy on the same channel */
+        mis->postcopy_channels = 1;
+    }
 
     channels = mis->postcopy_channels;
     mis->postcopy_tmp_pages = g_malloc0_n(sizeof(PostcopyTmpPage), channels);
@@ -1170,7 +1183,7 @@ int postcopy_ram_incoming_setup(MigrationIncomingState *mis)
         return -1;
     }
 
-    postcopy_thread_create(mis, &mis->fault_thread, "postcopy/fault",
+    postcopy_thread_create(mis, &mis->fault_thread, "fault-default",
                            postcopy_ram_fault_thread, QEMU_THREAD_JOINABLE);
     mis->have_fault_thread = true;
 
@@ -1185,6 +1198,16 @@ int postcopy_ram_incoming_setup(MigrationIncomingState *mis)
         return -1;
     }
 
+    if (migrate_postcopy_preempt()) {
+        /*
+         * This thread needs to be created after the temp pages because
+         * it'll fetch RAM_CHANNEL_POSTCOPY PostcopyTmpPage immediately.
+         */
+        postcopy_thread_create(mis, &mis->postcopy_prio_thread, "fault-fast",
+                               postcopy_preempt_thread, QEMU_THREAD_JOINABLE);
+        mis->postcopy_prio_thread_created = true;
+    }
+
     trace_postcopy_ram_enable_notify();
 
     return 0;
@@ -1514,3 +1537,66 @@ void postcopy_unregister_shared_ufd(struct PostCopyFD *pcfd)
         }
     }
 }
+
+bool postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file)
+{
+    /*
+     * The new loading channel has its own threads, so it needs to be
+     * blocked too.  It's by default true, just be explicit.
+     */
+    qemu_file_set_blocking(file, true);
+    mis->postcopy_qemufile_dst = file;
+    trace_postcopy_preempt_new_channel();
+
+    /* Start the migration immediately */
+    return true;
+}
+
+int postcopy_preempt_setup(MigrationState *s, Error **errp)
+{
+    QIOChannel *ioc;
+
+    if (!migrate_postcopy_preempt()) {
+        return 0;
+    }
+
+    if (!migrate_multi_channels_is_allowed()) {
+        error_setg(errp, "Postcopy preempt is not supported as current "
+                   "migration stream does not support multi-channels.");
+        return -1;
+    }
+
+    ioc = socket_send_channel_create_sync(errp);
+
+    if (ioc == NULL) {
+        return -1;
+    }
+
+    migration_ioc_register_yank(ioc);
+    s->postcopy_qemufile_src = qemu_fopen_channel_output(ioc);
+
+    trace_postcopy_preempt_new_channel();
+
+    return 0;
+}
+
+void *postcopy_preempt_thread(void *opaque)
+{
+    MigrationIncomingState *mis = opaque;
+    int ret;
+
+    trace_postcopy_preempt_thread_entry();
+
+    rcu_register_thread();
+
+    qemu_sem_post(&mis->thread_sync_sem);
+
+    /* Sending RAM_SAVE_FLAG_EOS to terminate this thread */
+    ret = ram_load_postcopy(mis->postcopy_qemufile_dst, RAM_CHANNEL_POSTCOPY);
+
+    rcu_unregister_thread();
+
+    trace_postcopy_preempt_thread_exit();
+
+    return ret == 0 ? NULL : (void *)-1;
+}
diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h
index 07684c0e1d..34b1080cde 100644
--- a/migration/postcopy-ram.h
+++ b/migration/postcopy-ram.h
@@ -183,4 +183,14 @@ int postcopy_wake_shared(struct PostCopyFD *pcfd, uint64_t client_addr,
 int postcopy_request_shared_page(struct PostCopyFD *pcfd, RAMBlock *rb,
                                  uint64_t client_addr, uint64_t offset);
 
+/* Hard-code channels for now for postcopy preemption */
+enum PostcopyChannels {
+    RAM_CHANNEL_PRECOPY = 0,
+    RAM_CHANNEL_POSTCOPY = 1,
+    RAM_CHANNEL_MAX,
+};
+
+bool postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file);
+int postcopy_preempt_setup(MigrationState *s, Error **errp);
+
 #endif
diff --git a/migration/ram.c b/migration/ram.c
index a2489a2699..f5615e0a76 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3644,15 +3644,15 @@ int ram_postcopy_incoming_init(MigrationIncomingState *mis)
  * rcu_read_lock is taken prior to this being called.
  *
  * @f: QEMUFile where to send the data
+ * @channel: the channel to use for loading
  */
-int ram_load_postcopy(QEMUFile *f)
+int ram_load_postcopy(QEMUFile *f, int channel)
 {
     int flags = 0, ret = 0;
     bool place_needed = false;
     bool matches_target_page_size = false;
     MigrationIncomingState *mis = migration_incoming_get_current();
-    /* Currently we only use channel 0.  TODO: use all the channels */
-    PostcopyTmpPage *tmp_page = &mis->postcopy_tmp_pages[0];
+    PostcopyTmpPage *tmp_page = &mis->postcopy_tmp_pages[channel];
 
     while (!ret && !(flags & RAM_SAVE_FLAG_EOS)) {
         ram_addr_t addr;
@@ -3676,7 +3676,7 @@ int ram_load_postcopy(QEMUFile *f)
         flags = addr & ~TARGET_PAGE_MASK;
         addr &= TARGET_PAGE_MASK;
 
-        trace_ram_load_postcopy_loop((uint64_t)addr, flags);
+        trace_ram_load_postcopy_loop(channel, (uint64_t)addr, flags);
         if (flags & (RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE |
                      RAM_SAVE_FLAG_COMPRESS_PAGE)) {
             block = ram_block_from_stream(mis, f, flags);
@@ -3717,10 +3717,10 @@ int ram_load_postcopy(QEMUFile *f)
             } else if (tmp_page->host_addr !=
                        host_page_from_ram_block_offset(block, addr)) {
                 /* not the 1st TP within the HP */
-                error_report("Non-same host page detected.  "
+                error_report("Non-same host page detected on channel %d: "
                              "Target host page %p, received host page %p "
                              "(rb %s offset 0x"RAM_ADDR_FMT" target_pages %d)",
-                             tmp_page->host_addr,
+                             channel, tmp_page->host_addr,
                              host_page_from_ram_block_offset(block, addr),
                              block->idstr, addr, tmp_page->target_pages);
                 ret = -EINVAL;
@@ -4107,7 +4107,12 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
      */
     WITH_RCU_READ_LOCK_GUARD() {
         if (postcopy_running) {
-            ret = ram_load_postcopy(f);
+            /*
+             * Note!  Here RAM_CHANNEL_PRECOPY is the precopy channel of
+             * postcopy migration, we have another RAM_CHANNEL_POSTCOPY to
+             * service fast page faults.
+             */
+            ret = ram_load_postcopy(f, RAM_CHANNEL_PRECOPY);
         } else {
             ret = ram_load_precopy(f);
         }
@@ -4269,6 +4274,12 @@ static int ram_resume_prepare(MigrationState *s, void *opaque)
     return 0;
 }
 
+void postcopy_preempt_shutdown_file(MigrationState *s)
+{
+    qemu_put_be64(s->postcopy_qemufile_src, RAM_SAVE_FLAG_EOS);
+    qemu_fflush(s->postcopy_qemufile_src);
+}
+
 static SaveVMHandlers savevm_ram_handlers = {
     .save_setup = ram_save_setup,
     .save_live_iterate = ram_save_iterate,
diff --git a/migration/ram.h b/migration/ram.h
index ded0a3a086..5d90945a6e 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -61,7 +61,7 @@ void ram_postcopy_send_discard_bitmap(MigrationState *ms);
 /* For incoming postcopy discard */
 int ram_discard_range(const char *block_name, uint64_t start, size_t length);
 int ram_postcopy_incoming_init(MigrationIncomingState *mis);
-int ram_load_postcopy(QEMUFile *f);
+int ram_load_postcopy(QEMUFile *f, int channel);
 
 void ram_handle_compressed(void *host, uint8_t ch, uint64_t size);
 
@@ -73,6 +73,8 @@ int64_t ramblock_recv_bitmap_send(QEMUFile *file,
                                   const char *block_name);
 int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb);
 bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t start);
+void postcopy_preempt_shutdown_file(MigrationState *s);
+void *postcopy_preempt_thread(void *opaque);
 
 /* ram cache */
 int colo_init_ram_cache(void);
diff --git a/migration/savevm.c b/migration/savevm.c
index d9076897b8..ecee05e631 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2575,16 +2575,6 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis)
 {
     int i;
 
-    /*
-     * If network is interrupted, any temp page we received will be useless
-     * because we didn't mark them as "received" in receivedmap.  After a
-     * proper recovery later (which will sync src dirty bitmap with receivedmap
-     * on dest) these cached small pages will be resent again.
-     */
-    for (i = 0; i < mis->postcopy_channels; i++) {
-        postcopy_temp_page_reset(&mis->postcopy_tmp_pages[i]);
-    }
-
     trace_postcopy_pause_incoming();
 
     assert(migrate_postcopy_ram());
@@ -2613,6 +2603,16 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis)
     /* Notify the fault thread for the invalidated file handle */
     postcopy_fault_thread_notify(mis);
 
+    /*
+     * If network is interrupted, any temp page we received will be useless
+     * because we didn't mark them as "received" in receivedmap.  After a
+     * proper recovery later (which will sync src dirty bitmap with receivedmap
+     * on dest) these cached small pages will be resent again.
+     */
+    for (i = 0; i < mis->postcopy_channels; i++) {
+        postcopy_temp_page_reset(&mis->postcopy_tmp_pages[i]);
+    }
+
     error_report("Detected IO failure for postcopy. "
                  "Migration paused.");
 
diff --git a/migration/socket.c b/migration/socket.c
index 05705a32d8..a7f345b353 100644
--- a/migration/socket.c
+++ b/migration/socket.c
@@ -26,7 +26,7 @@
 #include "io/channel-socket.h"
 #include "io/net-listener.h"
 #include "trace.h"
-
+#include "postcopy-ram.h"
 
 struct SocketOutgoingArgs {
     SocketAddress *saddr;
@@ -39,6 +39,24 @@ void socket_send_channel_create(QIOTaskFunc f, void *data)
                                      f, data, NULL, NULL);
 }
 
+QIOChannel *socket_send_channel_create_sync(Error **errp)
+{
+    QIOChannelSocket *sioc = qio_channel_socket_new();
+
+    if (!outgoing_args.saddr) {
+        object_unref(OBJECT(sioc));
+        error_setg(errp, "Initial sock address not set!");
+        return NULL;
+    }
+
+    if (qio_channel_socket_connect_sync(sioc, outgoing_args.saddr, errp) < 0) {
+        object_unref(OBJECT(sioc));
+        return NULL;
+    }
+
+    return QIO_CHANNEL(sioc);
+}
+
 int socket_send_channel_destroy(QIOChannel *send)
 {
     /* Remove channel */
@@ -158,6 +176,8 @@ socket_start_incoming_migration_internal(SocketAddress *saddr,
 
     if (migrate_use_multifd()) {
         num = migrate_multifd_channels();
+    } else if (migrate_postcopy_preempt()) {
+        num = RAM_CHANNEL_MAX;
     }
 
     if (qio_net_listener_open_sync(listener, saddr, num, errp) < 0) {
diff --git a/migration/socket.h b/migration/socket.h
index 891dbccceb..dc54df4e6c 100644
--- a/migration/socket.h
+++ b/migration/socket.h
@@ -21,6 +21,7 @@
 #include "io/task.h"
 
 void socket_send_channel_create(QIOTaskFunc f, void *data);
+QIOChannel *socket_send_channel_create_sync(Error **errp);
 int socket_send_channel_destroy(QIOChannel *send);
 
 void socket_start_incoming_migration(const char *str, Error **errp);
diff --git a/migration/trace-events b/migration/trace-events
index 1aec580e92..4bc787cf0c 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -91,7 +91,7 @@ migration_bitmap_clear_dirty(char *str, uint64_t start, uint64_t size, unsigned
 migration_throttle(void) ""
 ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %" PRIx64 " %zx"
 ram_load_loop(const char *rbname, uint64_t addr, int flags, void *host) "%s: addr: 0x%" PRIx64 " flags: 0x%x host: %p"
-ram_load_postcopy_loop(uint64_t addr, int flags) "@%" PRIx64 " %x"
+ram_load_postcopy_loop(int channel, uint64_t addr, int flags) "chan=%d addr=0x%" PRIx64 " flags=0x%x"
 ram_postcopy_send_discard_bitmap(void) ""
 ram_save_page(const char *rbname, uint64_t offset, void *host) "%s: offset: 0x%" PRIx64 " host: %p"
 ram_save_queue_pages(const char *rbname, size_t start, size_t len) "%s: start: 0x%zx len: 0x%zx"
@@ -278,6 +278,9 @@ postcopy_request_shared_page(const char *sharer, const char *rb, uint64_t rb_off
 postcopy_request_shared_page_present(const char *sharer, const char *rb, uint64_t rb_offset) "%s already %s offset 0x%"PRIx64
 postcopy_wake_shared(uint64_t client_addr, const char *rb) "at 0x%"PRIx64" in %s"
 postcopy_page_req_del(void *addr, int count) "resolved page req %p total %d"
+postcopy_preempt_new_channel(void) ""
+postcopy_preempt_thread_entry(void) ""
+postcopy_preempt_thread_exit(void) ""
 
 get_mem_fault_cpu_index(int cpu, uint32_t pid) "cpu: %d, pid: %u"
 
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 12/21] migration: Postcopy preemption enablement
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
                   ` (10 preceding siblings ...)
  2022-04-25 23:38 ` [PATCH v5 11/21] migration: Postcopy preemption preparation on channel creation Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-04-25 23:38 ` [PATCH v5 13/21] migration: Postcopy recover with preempt enabled Peter Xu
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

This patch enables postcopy-preempt feature.

It contains two major changes to the migration logic:

(1) Postcopy requests are now sent via a different socket from precopy
    background migration stream, so as to be isolated from very high page
    request delays.

(2) For huge page enabled hosts: when there's postcopy requests, they can now
    intercept a partial sending of huge host pages on src QEMU.

After this patch, we'll live migrate a VM with two channels for postcopy: (1)
PRECOPY channel, which is the default channel that transfers background pages;
and (2) POSTCOPY channel, which only transfers requested pages.

There's no strict rule of which channel to use, e.g., if a requested page is
already being transferred on precopy channel, then we will keep using the same
precopy channel to transfer the page even if it's explicitly requested.  In 99%
of the cases we'll prioritize the channels so we send requested page via the
postcopy channel as long as possible.

On the source QEMU, when we found a postcopy request, we'll interrupt the
PRECOPY channel sending process and quickly switch to the POSTCOPY channel.
After we serviced all the high priority postcopy pages, we'll switch back to
PRECOPY channel so that we'll continue to send the interrupted huge page again.
There's no new thread introduced on src QEMU.

On the destination QEMU, one new thread is introduced to receive page data from
the postcopy specific socket (done in the preparation patch).

This patch has a side effect: after sending postcopy pages, previously we'll
assume the guest will access follow up pages so we'll keep sending from there.
Now it's changed.  Instead of going on with a postcopy requested page, we'll go
back and continue sending the precopy huge page (which can be intercepted by a
postcopy request so the huge page can be sent partially before).

Whether that's a problem is debatable, because "assuming the guest will
continue to access the next page" may not really suite when huge pages are
used, especially if the huge page is large (e.g. 1GB pages).  So that locality
hint is much meaningless if huge pages are used.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.c  |   2 +
 migration/migration.h  |   2 +-
 migration/ram.c        | 251 +++++++++++++++++++++++++++++++++++++++--
 migration/trace-events |   7 ++
 4 files changed, 253 insertions(+), 9 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index e27aa306bc..8264b03d4d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3157,6 +3157,8 @@ static int postcopy_start(MigrationState *ms)
                               MIGRATION_STATUS_FAILED);
     }
 
+    trace_postcopy_preempt_enabled(migrate_postcopy_preempt());
+
     return ret;
 
 fail_closefb:
diff --git a/migration/migration.h b/migration/migration.h
index caa910d956..b8aacfe3af 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -68,7 +68,7 @@ typedef struct {
 struct MigrationIncomingState {
     QEMUFile *from_src_file;
     /* Previously received RAM's RAMBlock pointer */
-    RAMBlock *last_recv_block;
+    RAMBlock *last_recv_block[RAM_CHANNEL_MAX];
     /* A hook to allow cleanup at the end of incoming migration */
     void *transport_data;
     void (*transport_cleanup)(void *data);
diff --git a/migration/ram.c b/migration/ram.c
index f5615e0a76..a4b39e3675 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -295,6 +295,20 @@ struct RAMSrcPageRequest {
     QSIMPLEQ_ENTRY(RAMSrcPageRequest) next_req;
 };
 
+typedef struct {
+    /*
+     * Cached ramblock/offset values if preempted.  They're only meaningful if
+     * preempted==true below.
+     */
+    RAMBlock *ram_block;
+    unsigned long ram_page;
+    /*
+     * Whether a postcopy preemption just happened.  Will be reset after
+     * precopy recovered to background migration.
+     */
+    bool preempted;
+} PostcopyPreemptState;
+
 /* State of RAM for migration */
 struct RAMState {
     /* QEMUFile used for this migration */
@@ -349,6 +363,14 @@ struct RAMState {
     /* Queue of outstanding page requests from the destination */
     QemuMutex src_page_req_mutex;
     QSIMPLEQ_HEAD(, RAMSrcPageRequest) src_page_requests;
+
+    /* Postcopy preemption informations */
+    PostcopyPreemptState postcopy_preempt_state;
+    /*
+     * Current channel we're using on src VM.  Only valid if postcopy-preempt
+     * is enabled.
+     */
+    unsigned int postcopy_channel;
 };
 typedef struct RAMState RAMState;
 
@@ -356,6 +378,11 @@ static RAMState *ram_state;
 
 static NotifierWithReturnList precopy_notifier_list;
 
+static void postcopy_preempt_reset(RAMState *rs)
+{
+    memset(&rs->postcopy_preempt_state, 0, sizeof(PostcopyPreemptState));
+}
+
 /* Whether postcopy has queued requests? */
 static bool postcopy_has_request(RAMState *rs)
 {
@@ -1947,6 +1974,55 @@ void ram_write_tracking_stop(void)
 }
 #endif /* defined(__linux__) */
 
+/*
+ * Check whether two addr/offset of the ramblock falls onto the same host huge
+ * page.  Returns true if so, false otherwise.
+ */
+static bool offset_on_same_huge_page(RAMBlock *rb, uint64_t addr1,
+                                     uint64_t addr2)
+{
+    size_t page_size = qemu_ram_pagesize(rb);
+
+    addr1 = ROUND_DOWN(addr1, page_size);
+    addr2 = ROUND_DOWN(addr2, page_size);
+
+    return addr1 == addr2;
+}
+
+/*
+ * Whether a previous preempted precopy huge page contains current requested
+ * page?  Returns true if so, false otherwise.
+ *
+ * This should really happen very rarely, because it means when we were sending
+ * during background migration for postcopy we're sending exactly the page that
+ * some vcpu got faulted on on dest node.  When it happens, we probably don't
+ * need to do much but drop the request, because we know right after we restore
+ * the precopy stream it'll be serviced.  It'll slightly affect the order of
+ * postcopy requests to be serviced (e.g. it'll be the same as we move current
+ * request to the end of the queue) but it shouldn't be a big deal.  The most
+ * imporant thing is we can _never_ try to send a partial-sent huge page on the
+ * POSTCOPY channel again, otherwise that huge page will got "split brain" on
+ * two channels (PRECOPY, POSTCOPY).
+ */
+static bool postcopy_preempted_contains(RAMState *rs, RAMBlock *block,
+                                        ram_addr_t offset)
+{
+    PostcopyPreemptState *state = &rs->postcopy_preempt_state;
+
+    /* No preemption at all? */
+    if (!state->preempted) {
+        return false;
+    }
+
+    /* Not even the same ramblock? */
+    if (state->ram_block != block) {
+        return false;
+    }
+
+    return offset_on_same_huge_page(block, offset,
+                                    state->ram_page << TARGET_PAGE_BITS);
+}
+
 /**
  * get_queued_page: unqueue a page from the postcopy requests
  *
@@ -1962,9 +2038,17 @@ static bool get_queued_page(RAMState *rs, PageSearchStatus *pss)
     RAMBlock  *block;
     ram_addr_t offset;
 
+again:
     block = unqueue_page(rs, &offset);
 
-    if (!block) {
+    if (block) {
+        /* See comment above postcopy_preempted_contains() */
+        if (postcopy_preempted_contains(rs, block, offset)) {
+            trace_postcopy_preempt_hit(block->idstr, offset);
+            /* This request is dropped */
+            goto again;
+        }
+    } else {
         /*
          * Poll write faults too if background snapshot is enabled; that's
          * when we have vcpus got blocked by the write protected pages.
@@ -2180,6 +2264,117 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss)
     return ram_save_page(rs, pss);
 }
 
+static bool postcopy_needs_preempt(RAMState *rs, PageSearchStatus *pss)
+{
+    /* Not enabled eager preempt?  Then never do that. */
+    if (!migrate_postcopy_preempt()) {
+        return false;
+    }
+
+    /* If the ramblock we're sending is a small page?  Never bother. */
+    if (qemu_ram_pagesize(pss->block) == TARGET_PAGE_SIZE) {
+        return false;
+    }
+
+    /* Not in postcopy at all? */
+    if (!migration_in_postcopy()) {
+        return false;
+    }
+
+    /*
+     * If we're already handling a postcopy request, don't preempt as this page
+     * has got the same high priority.
+     */
+    if (pss->postcopy_requested) {
+        return false;
+    }
+
+    /* If there's postcopy requests, then check it up! */
+    return postcopy_has_request(rs);
+}
+
+/* Returns true if we preempted precopy, false otherwise */
+static void postcopy_do_preempt(RAMState *rs, PageSearchStatus *pss)
+{
+    PostcopyPreemptState *p_state = &rs->postcopy_preempt_state;
+
+    trace_postcopy_preempt_triggered(pss->block->idstr, pss->page);
+
+    /*
+     * Time to preempt precopy. Cache current PSS into preempt state, so that
+     * after handling the postcopy pages we can recover to it.  We need to do
+     * so because the dest VM will have partial of the precopy huge page kept
+     * over in its tmp huge page caches; better move on with it when we can.
+     */
+    p_state->ram_block = pss->block;
+    p_state->ram_page = pss->page;
+    p_state->preempted = true;
+}
+
+/* Whether we're preempted by a postcopy request during sending a huge page */
+static bool postcopy_preempt_triggered(RAMState *rs)
+{
+    return rs->postcopy_preempt_state.preempted;
+}
+
+static void postcopy_preempt_restore(RAMState *rs, PageSearchStatus *pss)
+{
+    PostcopyPreemptState *state = &rs->postcopy_preempt_state;
+
+    assert(state->preempted);
+
+    pss->block = state->ram_block;
+    pss->page = state->ram_page;
+    /* This is not a postcopy request but restoring previous precopy */
+    pss->postcopy_requested = false;
+
+    trace_postcopy_preempt_restored(pss->block->idstr, pss->page);
+
+    /* Reset preempt state, most importantly, set preempted==false */
+    postcopy_preempt_reset(rs);
+}
+
+static void postcopy_preempt_choose_channel(RAMState *rs, PageSearchStatus *pss)
+{
+    MigrationState *s = migrate_get_current();
+    unsigned int channel;
+    QEMUFile *next;
+
+    channel = pss->postcopy_requested ?
+        RAM_CHANNEL_POSTCOPY : RAM_CHANNEL_PRECOPY;
+
+    if (channel != rs->postcopy_channel) {
+        if (channel == RAM_CHANNEL_PRECOPY) {
+            next = s->to_dst_file;
+        } else {
+            next = s->postcopy_qemufile_src;
+        }
+        /* Update and cache the current channel */
+        rs->f = next;
+        rs->postcopy_channel = channel;
+
+        /*
+         * If channel switched, reset last_sent_block since the old sent block
+         * may not be on the same channel.
+         */
+        rs->last_sent_block = NULL;
+
+        trace_postcopy_preempt_switch_channel(channel);
+    }
+
+    trace_postcopy_preempt_send_host_page(pss->block->idstr, pss->page);
+}
+
+/* We need to make sure rs->f always points to the default channel elsewhere */
+static void postcopy_preempt_reset_channel(RAMState *rs)
+{
+    if (migrate_postcopy_preempt() && migration_in_postcopy()) {
+        rs->postcopy_channel = RAM_CHANNEL_PRECOPY;
+        rs->f = migrate_get_current()->to_dst_file;
+        trace_postcopy_preempt_reset_channel();
+    }
+}
+
 /**
  * ram_save_host_page: save a whole host page
  *
@@ -2211,7 +2406,16 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss)
         return 0;
     }
 
+    if (migrate_postcopy_preempt() && migration_in_postcopy()) {
+        postcopy_preempt_choose_channel(rs, pss);
+    }
+
     do {
+        if (postcopy_needs_preempt(rs, pss)) {
+            postcopy_do_preempt(rs, pss);
+            break;
+        }
+
         /* Check the pages is dirty and if it is send it */
         if (migration_bitmap_clear_dirty(rs, pss->block, pss->page)) {
             tmppages = ram_save_target_page(rs, pss);
@@ -2235,6 +2439,19 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss)
     /* The offset we leave with is the min boundary of host page and block */
     pss->page = MIN(pss->page, hostpage_boundary);
 
+    /*
+     * When with postcopy preempt mode, flush the data as soon as possible for
+     * postcopy requests, because we've already sent a whole huge page, so the
+     * dst node should already have enough resource to atomically filling in
+     * the current missing page.
+     *
+     * More importantly, when using separate postcopy channel, we must do
+     * explicit flush or it won't flush until the buffer is full.
+     */
+    if (migrate_postcopy_preempt() && pss->postcopy_requested) {
+        qemu_fflush(rs->f);
+    }
+
     res = ram_save_release_protection(rs, pss, start_page);
     return (res < 0 ? res : pages);
 }
@@ -2276,8 +2493,17 @@ static int ram_find_and_save_block(RAMState *rs)
         found = get_queued_page(rs, &pss);
 
         if (!found) {
-            /* priority queue empty, so just search for something dirty */
-            found = find_dirty_block(rs, &pss, &again);
+            /*
+             * Recover previous precopy ramblock/offset if postcopy has
+             * preempted precopy.  Otherwise find the next dirty bit.
+             */
+            if (postcopy_preempt_triggered(rs)) {
+                postcopy_preempt_restore(rs, &pss);
+                found = true;
+            } else {
+                /* priority queue empty, so just search for something dirty */
+                found = find_dirty_block(rs, &pss, &again);
+            }
         }
 
         if (found) {
@@ -2405,6 +2631,8 @@ static void ram_state_reset(RAMState *rs)
     rs->last_page = 0;
     rs->last_version = ram_list.version;
     rs->xbzrle_enabled = false;
+    postcopy_preempt_reset(rs);
+    rs->postcopy_channel = RAM_CHANNEL_PRECOPY;
 }
 
 #define MAX_WAIT 50 /* ms, half buffered_file limit */
@@ -3043,6 +3271,8 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
     }
     qemu_mutex_unlock(&rs->bitmap_mutex);
 
+    postcopy_preempt_reset_channel(rs);
+
     /*
      * Must occur before EOS (or any QEMUFile operation)
      * because of RDMA protocol.
@@ -3112,6 +3342,8 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
         ram_control_after_iterate(f, RAM_CONTROL_FINISH);
     }
 
+    postcopy_preempt_reset_channel(rs);
+
     if (ret >= 0) {
         multifd_send_sync_main(rs->f);
         qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
@@ -3194,11 +3426,13 @@ static int load_xbzrle(QEMUFile *f, ram_addr_t addr, void *host)
  * @mis: the migration incoming state pointer
  * @f: QEMUFile where to read the data from
  * @flags: Page flags (mostly to see if it's a continuation of previous block)
+ * @channel: the channel we're using
  */
 static inline RAMBlock *ram_block_from_stream(MigrationIncomingState *mis,
-                                              QEMUFile *f, int flags)
+                                              QEMUFile *f, int flags,
+                                              int channel)
 {
-    RAMBlock *block = mis->last_recv_block;
+    RAMBlock *block = mis->last_recv_block[channel];
     char id[256];
     uint8_t len;
 
@@ -3225,7 +3459,7 @@ static inline RAMBlock *ram_block_from_stream(MigrationIncomingState *mis,
         return NULL;
     }
 
-    mis->last_recv_block = block;
+    mis->last_recv_block[channel] = block;
 
     return block;
 }
@@ -3679,7 +3913,7 @@ int ram_load_postcopy(QEMUFile *f, int channel)
         trace_ram_load_postcopy_loop(channel, (uint64_t)addr, flags);
         if (flags & (RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE |
                      RAM_SAVE_FLAG_COMPRESS_PAGE)) {
-            block = ram_block_from_stream(mis, f, flags);
+            block = ram_block_from_stream(mis, f, flags, channel);
             if (!block) {
                 ret = -EINVAL;
                 break;
@@ -3930,7 +4164,8 @@ static int ram_load_precopy(QEMUFile *f)
 
         if (flags & (RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE |
                      RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE)) {
-            RAMBlock *block = ram_block_from_stream(mis, f, flags);
+            RAMBlock *block = ram_block_from_stream(mis, f, flags,
+                                                    RAM_CHANNEL_PRECOPY);
 
             host = host_from_ram_block_offset(block, addr);
             /*
diff --git a/migration/trace-events b/migration/trace-events
index 4bc787cf0c..69f311169a 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -111,6 +111,12 @@ ram_load_complete(int ret, uint64_t seq_iter) "exit_code %d seq iteration %" PRI
 ram_write_tracking_ramblock_start(const char *block_id, size_t page_size, void *addr, size_t length) "%s: page_size: %zu addr: %p length: %zu"
 ram_write_tracking_ramblock_stop(const char *block_id, size_t page_size, void *addr, size_t length) "%s: page_size: %zu addr: %p length: %zu"
 unqueue_page(char *block, uint64_t offset, bool dirty) "ramblock '%s' offset 0x%"PRIx64" dirty %d"
+postcopy_preempt_triggered(char *str, unsigned long page) "during sending ramblock %s offset 0x%lx"
+postcopy_preempt_restored(char *str, unsigned long page) "ramblock %s offset 0x%lx"
+postcopy_preempt_hit(char *str, uint64_t offset) "ramblock %s offset 0x%"PRIx64
+postcopy_preempt_send_host_page(char *str, uint64_t offset) "ramblock %s offset 0x%"PRIx64
+postcopy_preempt_switch_channel(int channel) "%d"
+postcopy_preempt_reset_channel(void) ""
 
 # multifd.c
 multifd_new_send_channel_async(uint8_t id) "channel %u"
@@ -176,6 +182,7 @@ migration_thread_low_pending(uint64_t pending) "%" PRIu64
 migrate_transferred(uint64_t tranferred, uint64_t time_spent, uint64_t bandwidth, uint64_t size) "transferred %" PRIu64 " time_spent %" PRIu64 " bandwidth %" PRIu64 " max_size %" PRId64
 process_incoming_migration_co_end(int ret, int ps) "ret=%d postcopy-state=%d"
 process_incoming_migration_co_postcopy_end_main(void) ""
+postcopy_preempt_enabled(bool value) "%d"
 
 # channel.c
 migration_set_incoming_channel(void *ioc, const char *ioctype) "ioc=%p ioctype=%s"
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 13/21] migration: Postcopy recover with preempt enabled
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
                   ` (11 preceding siblings ...)
  2022-04-25 23:38 ` [PATCH v5 12/21] migration: Postcopy preemption enablement Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-05-16 13:31   ` manish.mishra
  2022-04-25 23:38 ` [PATCH v5 14/21] migration: Create the postcopy preempt channel asynchronously Peter Xu
                   ` (8 subsequent siblings)
  21 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

To allow postcopy recovery, the ram fast load (preempt-only) dest QEMU thread
needs similar handling on fault tolerance.  When ram_load_postcopy() fails,
instead of stopping the thread it halts with a semaphore, preparing to be
kicked again when recovery is detected.

A mutex is introduced to make sure there's no concurrent operation upon the
socket.  To make it simple, the fast ram load thread will take the mutex during
its whole procedure, and only release it if it's paused.  The fast-path socket
will be properly released by the main loading thread safely when there's
network failures during postcopy with that mutex held.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.c    | 27 +++++++++++++++++++++++----
 migration/migration.h    | 19 +++++++++++++++++++
 migration/postcopy-ram.c | 25 +++++++++++++++++++++++--
 migration/qemu-file.c    | 27 +++++++++++++++++++++++++++
 migration/qemu-file.h    |  1 +
 migration/savevm.c       | 26 ++++++++++++++++++++++++--
 migration/trace-events   |  2 ++
 7 files changed, 119 insertions(+), 8 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 8264b03d4d..a0db5de685 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -215,9 +215,11 @@ void migration_object_init(void)
     current_incoming->postcopy_remote_fds =
         g_array_new(FALSE, TRUE, sizeof(struct PostCopyFD));
     qemu_mutex_init(&current_incoming->rp_mutex);
+    qemu_mutex_init(&current_incoming->postcopy_prio_thread_mutex);
     qemu_event_init(&current_incoming->main_thread_load_event, false);
     qemu_sem_init(&current_incoming->postcopy_pause_sem_dst, 0);
     qemu_sem_init(&current_incoming->postcopy_pause_sem_fault, 0);
+    qemu_sem_init(&current_incoming->postcopy_pause_sem_fast_load, 0);
     qemu_mutex_init(&current_incoming->page_request_mutex);
     current_incoming->page_requested = g_tree_new(page_request_addr_cmp);
 
@@ -697,9 +699,9 @@ static bool postcopy_try_recover(void)
 
         /*
          * Here, we only wake up the main loading thread (while the
-         * fault thread will still be waiting), so that we can receive
+         * rest threads will still be waiting), so that we can receive
          * commands from source now, and answer it if needed. The
-         * fault thread will be woken up afterwards until we are sure
+         * rest threads will be woken up afterwards until we are sure
          * that source is ready to reply to page requests.
          */
         qemu_sem_post(&mis->postcopy_pause_sem_dst);
@@ -3470,6 +3472,18 @@ static MigThrError postcopy_pause(MigrationState *s)
         qemu_file_shutdown(file);
         qemu_fclose(file);
 
+        /*
+         * Do the same to postcopy fast path socket too if there is.  No
+         * locking needed because no racer as long as we do this before setting
+         * status to paused.
+         */
+        if (s->postcopy_qemufile_src) {
+            migration_ioc_unregister_yank_from_file(s->postcopy_qemufile_src);
+            qemu_file_shutdown(s->postcopy_qemufile_src);
+            qemu_fclose(s->postcopy_qemufile_src);
+            s->postcopy_qemufile_src = NULL;
+        }
+
         migrate_set_state(&s->state, s->state,
                           MIGRATION_STATUS_POSTCOPY_PAUSED);
 
@@ -3525,8 +3539,13 @@ static MigThrError migration_detect_error(MigrationState *s)
         return MIG_THR_ERR_FATAL;
     }
 
-    /* Try to detect any file errors */
-    ret = qemu_file_get_error_obj(s->to_dst_file, &local_error);
+    /*
+     * Try to detect any file errors.  Note that postcopy_qemufile_src will
+     * be NULL when postcopy preempt is not enabled.
+     */
+    ret = qemu_file_get_error_obj_any(s->to_dst_file,
+                                      s->postcopy_qemufile_src,
+                                      &local_error);
     if (!ret) {
         /* Everything is fine */
         assert(!local_error);
diff --git a/migration/migration.h b/migration/migration.h
index b8aacfe3af..91f845e9e4 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -118,6 +118,18 @@ struct MigrationIncomingState {
     /* Postcopy priority thread is used to receive postcopy requested pages */
     QemuThread postcopy_prio_thread;
     bool postcopy_prio_thread_created;
+    /*
+     * Used to sync between the ram load main thread and the fast ram load
+     * thread.  It protects postcopy_qemufile_dst, which is the postcopy
+     * fast channel.
+     *
+     * The ram fast load thread will take it mostly for the whole lifecycle
+     * because it needs to continuously read data from the channel, and
+     * it'll only release this mutex if postcopy is interrupted, so that
+     * the ram load main thread will take this mutex over and properly
+     * release the broken channel.
+     */
+    QemuMutex postcopy_prio_thread_mutex;
     /*
      * An array of temp host huge pages to be used, one for each postcopy
      * channel.
@@ -147,6 +159,13 @@ struct MigrationIncomingState {
     /* notify PAUSED postcopy incoming migrations to try to continue */
     QemuSemaphore postcopy_pause_sem_dst;
     QemuSemaphore postcopy_pause_sem_fault;
+    /*
+     * This semaphore is used to allow the ram fast load thread (only when
+     * postcopy preempt is enabled) fall into sleep when there's network
+     * interruption detected.  When the recovery is done, the main load
+     * thread will kick the fast ram load thread using this semaphore.
+     */
+    QemuSemaphore postcopy_pause_sem_fast_load;
 
     /* List of listening socket addresses  */
     SocketAddressList *socket_address_list;
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index e92db0556b..b3c81b46f6 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -1580,6 +1580,15 @@ int postcopy_preempt_setup(MigrationState *s, Error **errp)
     return 0;
 }
 
+static void postcopy_pause_ram_fast_load(MigrationIncomingState *mis)
+{
+    trace_postcopy_pause_fast_load();
+    qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex);
+    qemu_sem_wait(&mis->postcopy_pause_sem_fast_load);
+    qemu_mutex_lock(&mis->postcopy_prio_thread_mutex);
+    trace_postcopy_pause_fast_load_continued();
+}
+
 void *postcopy_preempt_thread(void *opaque)
 {
     MigrationIncomingState *mis = opaque;
@@ -1592,11 +1601,23 @@ void *postcopy_preempt_thread(void *opaque)
     qemu_sem_post(&mis->thread_sync_sem);
 
     /* Sending RAM_SAVE_FLAG_EOS to terminate this thread */
-    ret = ram_load_postcopy(mis->postcopy_qemufile_dst, RAM_CHANNEL_POSTCOPY);
+    qemu_mutex_lock(&mis->postcopy_prio_thread_mutex);
+    while (1) {
+        ret = ram_load_postcopy(mis->postcopy_qemufile_dst,
+                                RAM_CHANNEL_POSTCOPY);
+        /* If error happened, go into recovery routine */
+        if (ret) {
+            postcopy_pause_ram_fast_load(mis);
+        } else {
+            /* We're done */
+            break;
+        }
+    }
+    qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex);
 
     rcu_unregister_thread();
 
     trace_postcopy_preempt_thread_exit();
 
-    return ret == 0 ? NULL : (void *)-1;
+    return NULL;
 }
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 1479cddad9..397652f0ba 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -139,6 +139,33 @@ int qemu_file_get_error_obj(QEMUFile *f, Error **errp)
     return f->last_error;
 }
 
+/*
+ * Get last error for either stream f1 or f2 with optional Error*.
+ * The error returned (non-zero) can be either from f1 or f2.
+ *
+ * If any of the qemufile* is NULL, then skip the check on that file.
+ *
+ * When there is no error on both qemufile, zero is returned.
+ */
+int qemu_file_get_error_obj_any(QEMUFile *f1, QEMUFile *f2, Error **errp)
+{
+    int ret = 0;
+
+    if (f1) {
+        ret = qemu_file_get_error_obj(f1, errp);
+        /* If there's already error detected, return */
+        if (ret) {
+            return ret;
+        }
+    }
+
+    if (f2) {
+        ret = qemu_file_get_error_obj(f2, errp);
+    }
+
+    return ret;
+}
+
 /*
  * Set the last error for stream f with optional Error*
  */
diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index 3f36d4dc8c..2564e5e1c7 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -156,6 +156,7 @@ void qemu_file_update_transfer(QEMUFile *f, int64_t len);
 void qemu_file_set_rate_limit(QEMUFile *f, int64_t new_rate);
 int64_t qemu_file_get_rate_limit(QEMUFile *f);
 int qemu_file_get_error_obj(QEMUFile *f, Error **errp);
+int qemu_file_get_error_obj_any(QEMUFile *f1, QEMUFile *f2, Error **errp);
 void qemu_file_set_error_obj(QEMUFile *f, int ret, Error *err);
 void qemu_file_set_error(QEMUFile *f, int ret);
 int qemu_file_shutdown(QEMUFile *f);
diff --git a/migration/savevm.c b/migration/savevm.c
index ecee05e631..050874650a 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2152,6 +2152,13 @@ static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis)
      */
     qemu_sem_post(&mis->postcopy_pause_sem_fault);
 
+    if (migrate_postcopy_preempt()) {
+        /* The channel should already be setup again; make sure of it */
+        assert(mis->postcopy_qemufile_dst);
+        /* Kick the fast ram load thread too */
+        qemu_sem_post(&mis->postcopy_pause_sem_fast_load);
+    }
+
     return 0;
 }
 
@@ -2597,6 +2604,21 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis)
     mis->to_src_file = NULL;
     qemu_mutex_unlock(&mis->rp_mutex);
 
+    /*
+     * NOTE: this must happen before reset the PostcopyTmpPages below,
+     * otherwise it's racy to reset those fields when the fast load thread
+     * can be accessing it in parallel.
+     */
+    if (mis->postcopy_qemufile_dst) {
+        qemu_file_shutdown(mis->postcopy_qemufile_dst);
+        /* Take the mutex to make sure the fast ram load thread halted */
+        qemu_mutex_lock(&mis->postcopy_prio_thread_mutex);
+        migration_ioc_unregister_yank_from_file(mis->postcopy_qemufile_dst);
+        qemu_fclose(mis->postcopy_qemufile_dst);
+        mis->postcopy_qemufile_dst = NULL;
+        qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex);
+    }
+
     migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
                       MIGRATION_STATUS_POSTCOPY_PAUSED);
 
@@ -2634,8 +2656,8 @@ retry:
     while (true) {
         section_type = qemu_get_byte(f);
 
-        if (qemu_file_get_error(f)) {
-            ret = qemu_file_get_error(f);
+        ret = qemu_file_get_error_obj_any(f, mis->postcopy_qemufile_dst, NULL);
+        if (ret) {
             break;
         }
 
diff --git a/migration/trace-events b/migration/trace-events
index 69f311169a..0e385c3a07 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -270,6 +270,8 @@ mark_postcopy_blocktime_begin(uint64_t addr, void *dd, uint32_t time, int cpu, i
 mark_postcopy_blocktime_end(uint64_t addr, void *dd, uint32_t time, int affected_cpu) "addr: 0x%" PRIx64 ", dd: %p, time: %u, affected_cpu: %d"
 postcopy_pause_fault_thread(void) ""
 postcopy_pause_fault_thread_continued(void) ""
+postcopy_pause_fast_load(void) ""
+postcopy_pause_fast_load_continued(void) ""
 postcopy_ram_fault_thread_entry(void) ""
 postcopy_ram_fault_thread_exit(void) ""
 postcopy_ram_fault_thread_fds_core(int baseufd, int quitfd) "ufd: %d quitfd: %d"
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 14/21] migration: Create the postcopy preempt channel asynchronously
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
                   ` (12 preceding siblings ...)
  2022-04-25 23:38 ` [PATCH v5 13/21] migration: Postcopy recover with preempt enabled Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-05-12 17:35   ` Dr. David Alan Gilbert
  2022-05-16 15:02   ` manish.mishra
  2022-04-25 23:38 ` [PATCH v5 15/21] migration: Parameter x-postcopy-preempt-break-huge Peter Xu
                   ` (7 subsequent siblings)
  21 siblings, 2 replies; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

This patch allows the postcopy preempt channel to be created
asynchronously.  The benefit is that when the connection is slow, we won't
take the BQL (and potentially block all things like QMP) for a long time
without releasing.

A function postcopy_preempt_wait_channel() is introduced, allowing the
migration thread to be able to wait on the channel creation.  The channel
is always created by the main thread, in which we'll kick a new semaphore
to tell the migration thread that the channel has created.

We'll need to wait for the new channel in two places: (1) when there's a
new postcopy migration that is starting, or (2) when there's a postcopy
migration to resume.

For the start of migration, we don't need to wait for this channel until
when we want to start postcopy, aka, postcopy_start().  We'll fail the
migration if we found that the channel creation failed (which should
probably not happen at all in 99% of the cases, because the main channel is
using the same network topology).

For a postcopy recovery, we'll need to wait in postcopy_pause().  In that
case if the channel creation failed, we can't fail the migration or we'll
crash the VM, instead we keep in PAUSED state, waiting for yet another
recovery.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.c    | 16 ++++++++++++
 migration/migration.h    |  7 +++++
 migration/postcopy-ram.c | 56 +++++++++++++++++++++++++++++++---------
 migration/postcopy-ram.h |  1 +
 4 files changed, 68 insertions(+), 12 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index a0db5de685..cce741e20e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3020,6 +3020,12 @@ static int postcopy_start(MigrationState *ms)
     int64_t bandwidth = migrate_max_postcopy_bandwidth();
     bool restart_block = false;
     int cur_state = MIGRATION_STATUS_ACTIVE;
+
+    if (postcopy_preempt_wait_channel(ms)) {
+        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
+        return -1;
+    }
+
     if (!migrate_pause_before_switchover()) {
         migrate_set_state(&ms->state, MIGRATION_STATUS_ACTIVE,
                           MIGRATION_STATUS_POSTCOPY_ACTIVE);
@@ -3501,6 +3507,14 @@ static MigThrError postcopy_pause(MigrationState *s)
         if (s->state == MIGRATION_STATUS_POSTCOPY_RECOVER) {
             /* Woken up by a recover procedure. Give it a shot */
 
+            if (postcopy_preempt_wait_channel(s)) {
+                /*
+                 * Preempt enabled, and new channel create failed; loop
+                 * back to wait for another recovery.
+                 */
+                continue;
+            }
+
             /*
              * Firstly, let's wake up the return path now, with a new
              * return path channel.
@@ -4360,6 +4374,7 @@ static void migration_instance_finalize(Object *obj)
     qemu_sem_destroy(&ms->postcopy_pause_sem);
     qemu_sem_destroy(&ms->postcopy_pause_rp_sem);
     qemu_sem_destroy(&ms->rp_state.rp_sem);
+    qemu_sem_destroy(&ms->postcopy_qemufile_src_sem);
     error_free(ms->error);
 }
 
@@ -4406,6 +4421,7 @@ static void migration_instance_init(Object *obj)
     qemu_sem_init(&ms->rp_state.rp_sem, 0);
     qemu_sem_init(&ms->rate_limit_sem, 0);
     qemu_sem_init(&ms->wait_unplug_sem, 0);
+    qemu_sem_init(&ms->postcopy_qemufile_src_sem, 0);
     qemu_mutex_init(&ms->qemu_file_lock);
 }
 
diff --git a/migration/migration.h b/migration/migration.h
index 91f845e9e4..f898b8547a 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -219,6 +219,13 @@ struct MigrationState {
     QEMUFile *to_dst_file;
     /* Postcopy specific transfer channel */
     QEMUFile *postcopy_qemufile_src;
+    /*
+     * It is posted when the preempt channel is established.  Note: this is
+     * used for both the start or recover of a postcopy migration.  We'll
+     * post to this sem every time a new preempt channel is created in the
+     * main thread, and we keep post() and wait() in pair.
+     */
+    QemuSemaphore postcopy_qemufile_src_sem;
     QIOChannelBuffer *bioc;
     /*
      * Protects to_dst_file/from_dst_file pointers.  We need to make sure we
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index b3c81b46f6..1bb603051a 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -1552,10 +1552,50 @@ bool postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file)
     return true;
 }
 
-int postcopy_preempt_setup(MigrationState *s, Error **errp)
+static void
+postcopy_preempt_send_channel_new(QIOTask *task, gpointer opaque)
 {
-    QIOChannel *ioc;
+    MigrationState *s = opaque;
+    QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task));
+    Error *local_err = NULL;
+
+    if (qio_task_propagate_error(task, &local_err)) {
+        /* Something wrong happened.. */
+        migrate_set_error(s, local_err);
+        error_free(local_err);
+    } else {
+        migration_ioc_register_yank(ioc);
+        s->postcopy_qemufile_src = qemu_fopen_channel_output(ioc);
+        trace_postcopy_preempt_new_channel();
+    }
+
+    /*
+     * Kick the waiter in all cases.  The waiter should check upon
+     * postcopy_qemufile_src to know whether it failed or not.
+     */
+    qemu_sem_post(&s->postcopy_qemufile_src_sem);
+    object_unref(OBJECT(ioc));
+}
 
+/* Returns 0 if channel established, -1 for error. */
+int postcopy_preempt_wait_channel(MigrationState *s)
+{
+    /* If preempt not enabled, no need to wait */
+    if (!migrate_postcopy_preempt()) {
+        return 0;
+    }
+
+    /*
+     * We need the postcopy preempt channel to be established before
+     * starting doing anything.
+     */
+    qemu_sem_wait(&s->postcopy_qemufile_src_sem);
+
+    return s->postcopy_qemufile_src ? 0 : -1;
+}
+
+int postcopy_preempt_setup(MigrationState *s, Error **errp)
+{
     if (!migrate_postcopy_preempt()) {
         return 0;
     }
@@ -1566,16 +1606,8 @@ int postcopy_preempt_setup(MigrationState *s, Error **errp)
         return -1;
     }
 
-    ioc = socket_send_channel_create_sync(errp);
-
-    if (ioc == NULL) {
-        return -1;
-    }
-
-    migration_ioc_register_yank(ioc);
-    s->postcopy_qemufile_src = qemu_fopen_channel_output(ioc);
-
-    trace_postcopy_preempt_new_channel();
+    /* Kick an async task to connect */
+    socket_send_channel_create(postcopy_preempt_send_channel_new, s);
 
     return 0;
 }
diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h
index 34b1080cde..6147bf7d1d 100644
--- a/migration/postcopy-ram.h
+++ b/migration/postcopy-ram.h
@@ -192,5 +192,6 @@ enum PostcopyChannels {
 
 bool postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file);
 int postcopy_preempt_setup(MigrationState *s, Error **errp);
+int postcopy_preempt_wait_channel(MigrationState *s);
 
 #endif
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 15/21] migration: Parameter x-postcopy-preempt-break-huge
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
                   ` (13 preceding siblings ...)
  2022-04-25 23:38 ` [PATCH v5 14/21] migration: Create the postcopy preempt channel asynchronously Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-05-12 17:42   ` Dr. David Alan Gilbert
  2022-05-16 15:20   ` manish.mishra
  2022-04-25 23:38 ` [PATCH v5 16/21] migration: Add helpers to detect TLS capability Peter Xu
                   ` (6 subsequent siblings)
  21 siblings, 2 replies; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

Add a parameter that can conditionally disable the "break sending huge
page" behavior in postcopy preemption.  By default it's enabled.

It should only be used for debugging purposes, and we should never remove
the "x-" prefix.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.c | 2 ++
 migration/migration.h | 7 +++++++
 migration/ram.c       | 7 +++++++
 3 files changed, 16 insertions(+)

diff --git a/migration/migration.c b/migration/migration.c
index cce741e20e..cd9650f2e2 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -4329,6 +4329,8 @@ static Property migration_properties[] = {
     DEFINE_PROP_SIZE("announce-step", MigrationState,
                       parameters.announce_step,
                       DEFAULT_MIGRATE_ANNOUNCE_STEP),
+    DEFINE_PROP_BOOL("x-postcopy-preempt-break-huge", MigrationState,
+                      postcopy_preempt_break_huge, true),
 
     /* Migration capabilities */
     DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE),
diff --git a/migration/migration.h b/migration/migration.h
index f898b8547a..6ee520642f 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -340,6 +340,13 @@ struct MigrationState {
     bool send_configuration;
     /* Whether we send section footer during migration */
     bool send_section_footer;
+    /*
+     * Whether we allow break sending huge pages when postcopy preempt is
+     * enabled.  When disabled, we won't interrupt precopy within sending a
+     * host huge page, which is the old behavior of vanilla postcopy.
+     * NOTE: this parameter is ignored if postcopy preempt is not enabled.
+     */
+    bool postcopy_preempt_break_huge;
 
     /* Needed by postcopy-pause state */
     QemuSemaphore postcopy_pause_sem;
diff --git a/migration/ram.c b/migration/ram.c
index a4b39e3675..f3a79c8556 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2266,11 +2266,18 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss)
 
 static bool postcopy_needs_preempt(RAMState *rs, PageSearchStatus *pss)
 {
+    MigrationState *ms = migrate_get_current();
+
     /* Not enabled eager preempt?  Then never do that. */
     if (!migrate_postcopy_preempt()) {
         return false;
     }
 
+    /* If the user explicitly disabled breaking of huge page, skip */
+    if (!ms->postcopy_preempt_break_huge) {
+        return false;
+    }
+
     /* If the ramblock we're sending is a small page?  Never bother. */
     if (qemu_ram_pagesize(pss->block) == TARGET_PAGE_SIZE) {
         return false;
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 16/21] migration: Add helpers to detect TLS capability
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
                   ` (14 preceding siblings ...)
  2022-04-25 23:38 ` [PATCH v5 15/21] migration: Parameter x-postcopy-preempt-break-huge Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-05-12 17:46   ` Dr. David Alan Gilbert
  2022-04-25 23:38 ` [PATCH v5 17/21] migration: Export tls-[creds|hostname|authz] params to cmdline too Peter Xu
                   ` (5 subsequent siblings)
  21 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

Add migrate_tls_enabled() to detect whether TLS is configured.

Add migrate_channel_requires_tls() to detect whether the specific channel
requires TLS.

No functional change intended.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/channel.c   | 10 ++--------
 migration/migration.c |  8 ++++++++
 migration/migration.h |  2 ++
 migration/multifd.c   |  7 +------
 migration/tls.c       |  9 +++++++++
 migration/tls.h       |  4 ++++
 6 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/migration/channel.c b/migration/channel.c
index c6a8dcf1d7..36e59eaeec 100644
--- a/migration/channel.c
+++ b/migration/channel.c
@@ -38,10 +38,7 @@ void migration_channel_process_incoming(QIOChannel *ioc)
     trace_migration_set_incoming_channel(
         ioc, object_get_typename(OBJECT(ioc)));
 
-    if (s->parameters.tls_creds &&
-        *s->parameters.tls_creds &&
-        !object_dynamic_cast(OBJECT(ioc),
-                             TYPE_QIO_CHANNEL_TLS)) {
+    if (migrate_channel_requires_tls(ioc)) {
         migration_tls_channel_process_incoming(s, ioc, &local_err);
     } else {
         migration_ioc_register_yank(ioc);
@@ -71,10 +68,7 @@ void migration_channel_connect(MigrationState *s,
         ioc, object_get_typename(OBJECT(ioc)), hostname, error);
 
     if (!error) {
-        if (s->parameters.tls_creds &&
-            *s->parameters.tls_creds &&
-            !object_dynamic_cast(OBJECT(ioc),
-                                 TYPE_QIO_CHANNEL_TLS)) {
+        if (migrate_channel_requires_tls(ioc)) {
             migration_tls_channel_connect(s, ioc, hostname, &error);
 
             if (!error) {
diff --git a/migration/migration.c b/migration/migration.c
index cd9650f2e2..71a50b5c37 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -49,6 +49,7 @@
 #include "trace.h"
 #include "exec/target_page.h"
 #include "io/channel-buffer.h"
+#include "io/channel-tls.h"
 #include "migration/colo.h"
 #include "hw/boards.h"
 #include "hw/qdev-properties.h"
@@ -4250,6 +4251,13 @@ void migration_global_dump(Monitor *mon)
                    ms->clear_bitmap_shift);
 }
 
+bool migrate_tls_enabled(void)
+{
+    MigrationState *s = migrate_get_current();
+
+    return s->parameters.tls_creds && *s->parameters.tls_creds;
+}
+
 #define DEFINE_PROP_MIG_CAP(name, x)             \
     DEFINE_PROP_BOOL(name, MigrationState, enabled_capabilities[x], false)
 
diff --git a/migration/migration.h b/migration/migration.h
index 6ee520642f..db176ea749 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -436,6 +436,8 @@ bool migrate_use_events(void);
 bool migrate_postcopy_blocktime(void);
 bool migrate_background_snapshot(void);
 bool migrate_postcopy_preempt(void);
+/* Whether TLS is enabled for migration? */
+bool migrate_tls_enabled(void);
 
 /* Sending on the return path - generic and then for each message type */
 void migrate_send_rp_shut(MigrationIncomingState *mis,
diff --git a/migration/multifd.c b/migration/multifd.c
index 9ea4f581e2..19e3c44491 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -782,17 +782,12 @@ static bool multifd_channel_connect(MultiFDSendParams *p,
                                     QIOChannel *ioc,
                                     Error *error)
 {
-    MigrationState *s = migrate_get_current();
-
     trace_multifd_set_outgoing_channel(
         ioc, object_get_typename(OBJECT(ioc)),
         migrate_get_current()->hostname, error);
 
     if (!error) {
-        if (s->parameters.tls_creds &&
-            *s->parameters.tls_creds &&
-            !object_dynamic_cast(OBJECT(ioc),
-                                 TYPE_QIO_CHANNEL_TLS)) {
+        if (migrate_channel_requires_tls(ioc)) {
             multifd_tls_channel_connect(p, ioc, &error);
             if (!error) {
                 /*
diff --git a/migration/tls.c b/migration/tls.c
index 32c384a8b6..bd1fa01048 100644
--- a/migration/tls.c
+++ b/migration/tls.c
@@ -166,3 +166,12 @@ void migration_tls_channel_connect(MigrationState *s,
                               NULL,
                               NULL);
 }
+
+bool migrate_channel_requires_tls(QIOChannel *ioc)
+{
+    if (!migrate_tls_enabled()) {
+        return false;
+    }
+
+    return !object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_TLS);
+}
diff --git a/migration/tls.h b/migration/tls.h
index de4fe2cafd..a54c1dcec7 100644
--- a/migration/tls.h
+++ b/migration/tls.h
@@ -37,4 +37,8 @@ void migration_tls_channel_connect(MigrationState *s,
                                    QIOChannel *ioc,
                                    const char *hostname,
                                    Error **errp);
+
+/* Whether the QIO channel requires further TLS handshake? */
+bool migrate_channel_requires_tls(QIOChannel *ioc);
+
 #endif
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 17/21] migration: Export tls-[creds|hostname|authz] params to cmdline too
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
                   ` (15 preceding siblings ...)
  2022-04-25 23:38 ` [PATCH v5 16/21] migration: Add helpers to detect TLS capability Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-05-12 18:03   ` Dr. David Alan Gilbert
  2022-04-25 23:38 ` [PATCH v5 18/21] migration: Enable TLS for preempt channel Peter Xu
                   ` (4 subsequent siblings)
  21 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

It's useful for specifying tls credentials all in the cmdline (along with
the -object tls-creds-*), especially for debugging purpose.

The trick here is we must remember to not free these fields again in the
finalize() function of migration object, otherwise it'll cause double-free.

The thing is when destroying an object, we'll first destroy the properties
that bound to the object, then the object itself.  To be explicit, when
destroy the object in object_finalize() we have such sequence of
operations:

    object_property_del_all(obj);
    object_deinit(obj, ti);

So after this change the two fields are properly released already even
before reaching the finalize() function but in object_property_del_all(),
hence we don't need to free them anymore in finalize() or it's double-free.

This also fixes a trivial memory leak for tls-authz as we forgot to free it
before this patch.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 71a50b5c37..b0f2de1e09 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -4339,6 +4339,9 @@ static Property migration_properties[] = {
                       DEFAULT_MIGRATE_ANNOUNCE_STEP),
     DEFINE_PROP_BOOL("x-postcopy-preempt-break-huge", MigrationState,
                       postcopy_preempt_break_huge, true),
+    DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds),
+    DEFINE_PROP_STRING("tls-hostname", MigrationState, parameters.tls_hostname),
+    DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz),
 
     /* Migration capabilities */
     DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE),
@@ -4372,12 +4375,9 @@ static void migration_class_init(ObjectClass *klass, void *data)
 static void migration_instance_finalize(Object *obj)
 {
     MigrationState *ms = MIGRATION_OBJ(obj);
-    MigrationParameters *params = &ms->parameters;
 
     qemu_mutex_destroy(&ms->error_mutex);
     qemu_mutex_destroy(&ms->qemu_file_lock);
-    g_free(params->tls_hostname);
-    g_free(params->tls_creds);
     qemu_sem_destroy(&ms->wait_unplug_sem);
     qemu_sem_destroy(&ms->rate_limit_sem);
     qemu_sem_destroy(&ms->pause_sem);
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 18/21] migration: Enable TLS for preempt channel
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
                   ` (16 preceding siblings ...)
  2022-04-25 23:38 ` [PATCH v5 17/21] migration: Export tls-[creds|hostname|authz] params to cmdline too Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-04-25 23:38 ` [PATCH v5 19/21] tests: Add postcopy tls migration test Peter Xu
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

This patch is based on the async preempt channel creation.  It continues
wiring up the new channel with TLS handshake to destionation when enabled.

Note that only the src QEMU needs such operation; the dest QEMU does not
need any change for TLS support due to the fact that all channels are
established synchronously there, so all the TLS magic is already properly
handled by migration_tls_channel_process_incoming().

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/postcopy-ram.c | 57 ++++++++++++++++++++++++++++++++++------
 migration/trace-events   |  1 +
 2 files changed, 50 insertions(+), 8 deletions(-)

diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 1bb603051a..4a4da16389 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -36,6 +36,7 @@
 #include "socket.h"
 #include "qemu-file-channel.h"
 #include "yank_functions.h"
+#include "tls.h"
 
 /* Arbitrary limit on size of each discard command,
  * keeps them around ~200 bytes
@@ -1552,15 +1553,15 @@ bool postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file)
     return true;
 }
 
+/*
+ * Setup the postcopy preempt channel with the IOC.  If ERROR is specified,
+ * setup the error instead.  This helper will free the ERROR if specified.
+ */
 static void
-postcopy_preempt_send_channel_new(QIOTask *task, gpointer opaque)
+postcopy_preempt_send_channel_done(MigrationState *s,
+                                   QIOChannel *ioc, Error *local_err)
 {
-    MigrationState *s = opaque;
-    QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task));
-    Error *local_err = NULL;
-
-    if (qio_task_propagate_error(task, &local_err)) {
-        /* Something wrong happened.. */
+    if (local_err) {
         migrate_set_error(s, local_err);
         error_free(local_err);
     } else {
@@ -1574,7 +1575,47 @@ postcopy_preempt_send_channel_new(QIOTask *task, gpointer opaque)
      * postcopy_qemufile_src to know whether it failed or not.
      */
     qemu_sem_post(&s->postcopy_qemufile_src_sem);
-    object_unref(OBJECT(ioc));
+}
+
+static void
+postcopy_preempt_tls_handshake(QIOTask *task, gpointer opaque)
+{
+    g_autoptr(QIOChannel) ioc = QIO_CHANNEL(qio_task_get_source(task));
+    MigrationState *s = opaque;
+    Error *local_err = NULL;
+
+    qio_task_propagate_error(task, &local_err);
+    postcopy_preempt_send_channel_done(s, ioc, local_err);
+}
+
+static void
+postcopy_preempt_send_channel_new(QIOTask *task, gpointer opaque)
+{
+    g_autoptr(QIOChannel) ioc = QIO_CHANNEL(qio_task_get_source(task));
+    MigrationState *s = opaque;
+    QIOChannelTLS *tioc;
+    Error *local_err = NULL;
+
+    if (qio_task_propagate_error(task, &local_err)) {
+        goto out;
+    }
+
+    if (migrate_channel_requires_tls(ioc)) {
+        tioc = migration_tls_client_create(s, ioc, s->hostname, &local_err);
+        if (!tioc) {
+            goto out;
+        }
+        trace_postcopy_preempt_tls_handshake();
+        qio_channel_set_name(QIO_CHANNEL(tioc), "migration-tls-preempt");
+        qio_channel_tls_handshake(tioc, postcopy_preempt_tls_handshake,
+                                  s, NULL, NULL);
+        /* Setup the channel until TLS handshake finished */
+        return;
+    }
+
+out:
+    /* This handles both good and error cases */
+    postcopy_preempt_send_channel_done(s, ioc, local_err);
 }
 
 /* Returns 0 if channel established, -1 for error. */
diff --git a/migration/trace-events b/migration/trace-events
index 0e385c3a07..a34afe7b85 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -287,6 +287,7 @@ postcopy_request_shared_page(const char *sharer, const char *rb, uint64_t rb_off
 postcopy_request_shared_page_present(const char *sharer, const char *rb, uint64_t rb_offset) "%s already %s offset 0x%"PRIx64
 postcopy_wake_shared(uint64_t client_addr, const char *rb) "at 0x%"PRIx64" in %s"
 postcopy_page_req_del(void *addr, int count) "resolved page req %p total %d"
+postcopy_preempt_tls_handshake(void) ""
 postcopy_preempt_new_channel(void) ""
 postcopy_preempt_thread_entry(void) ""
 postcopy_preempt_thread_exit(void) ""
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 19/21] tests: Add postcopy tls migration test
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
                   ` (17 preceding siblings ...)
  2022-04-25 23:38 ` [PATCH v5 18/21] migration: Enable TLS for preempt channel Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-04-25 23:38 ` [PATCH v5 20/21] tests: Add postcopy tls recovery " Peter Xu
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

We just added TLS tests for precopy but not postcopy.  Add the
corresponding test for vanilla postcopy.

Rename the vanilla postcopy to "postcopy/plain" because all postcopy tests
will only use unix sockets as channel.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 tests/qtest/migration-test.c | 50 +++++++++++++++++++++++++++++++-----
 1 file changed, 43 insertions(+), 7 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index d9f444ea14..c32b350aea 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -481,6 +481,10 @@ typedef struct {
     bool only_target;
     /* Use dirty ring if true; dirty logging otherwise */
     bool use_dirty_ring;
+    /* Whether use TLS channels for postcopy test? */
+    bool postcopy_tls;
+    /* Used only if postcopy_tls==true, to cache the data object */
+    void *postcopy_tls_data;
     const char *opts_source;
     const char *opts_target;
 } MigrateStart;
@@ -980,6 +984,10 @@ static int migrate_postcopy_prepare(QTestState **from_ptr,
         return -1;
     }
 
+    if (args->postcopy_tls) {
+        args->postcopy_tls_data = test_migrate_tls_psk_start_match(from, to);
+    }
+
     migrate_set_capability(from, "postcopy-ram", true);
     migrate_set_capability(to, "postcopy-ram", true);
     migrate_set_capability(to, "postcopy-blocktime", true);
@@ -1004,7 +1012,8 @@ static int migrate_postcopy_prepare(QTestState **from_ptr,
     return 0;
 }
 
-static void migrate_postcopy_complete(QTestState *from, QTestState *to)
+static void migrate_postcopy_complete(QTestState *from, QTestState *to,
+                                      MigrateStart *args)
 {
     wait_for_migration_complete(from);
 
@@ -1015,19 +1024,38 @@ static void migrate_postcopy_complete(QTestState *from, QTestState *to)
         read_blocktime(to);
     }
 
+    if (args->postcopy_tls) {
+        assert(args->postcopy_tls_data);
+        test_migrate_tls_psk_finish(from, to, args->postcopy_tls_data);
+        args->postcopy_tls_data = NULL;
+    }
+
     test_migrate_end(from, to, true);
 }
 
-static void test_postcopy(void)
+static void test_postcopy_common(MigrateStart *args)
 {
-    MigrateStart args = {};
     QTestState *from, *to;
 
-    if (migrate_postcopy_prepare(&from, &to, &args)) {
+    if (migrate_postcopy_prepare(&from, &to, args)) {
         return;
     }
     migrate_postcopy_start(from, to);
-    migrate_postcopy_complete(from, to);
+    migrate_postcopy_complete(from, to, args);
+}
+
+static void test_postcopy(void)
+{
+    MigrateStart args = { };
+
+    test_postcopy_common(&args);
+}
+
+static void test_postcopy_tls_psk(void)
+{
+    MigrateStart args = { .postcopy_tls = true };
+
+    test_postcopy_common(&args);
 }
 
 static void test_postcopy_recovery(void)
@@ -1089,7 +1117,7 @@ static void test_postcopy_recovery(void)
     /* Restore the postcopy bandwidth to unlimited */
     migrate_set_parameter_int(from, "max-postcopy-bandwidth", 0);
 
-    migrate_postcopy_complete(from, to);
+    migrate_postcopy_complete(from, to, &args);
 }
 
 static void test_baddest(void)
@@ -2132,7 +2160,15 @@ int main(int argc, char **argv)
 
     module_call_init(MODULE_INIT_QOM);
 
-    qtest_add_func("/migration/postcopy/unix", test_postcopy);
+    qtest_add_func("/migration/postcopy/plain", test_postcopy);
+#ifdef CONFIG_GNUTLS
+    /*
+     * NOTE: psk test is enough for postcopy, as other types of TLS
+     * channels are tested under precopy.  Here what we want to test is the
+     * general postcopy path that has TLS channel enabled.
+     */
+    qtest_add_func("/migration/postcopy/tls/psk", test_postcopy_tls_psk);
+#endif /* CONFIG_GNUTLS */
     qtest_add_func("/migration/postcopy/recovery", test_postcopy_recovery);
     qtest_add_func("/migration/bad_dest", test_baddest);
     qtest_add_func("/migration/precopy/unix/plain", test_precopy_unix_plain);
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 20/21] tests: Add postcopy tls recovery migration test
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
                   ` (18 preceding siblings ...)
  2022-04-25 23:38 ` [PATCH v5 19/21] tests: Add postcopy tls migration test Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-04-25 23:38 ` [PATCH v5 21/21] tests: Add postcopy preempt tests Peter Xu
  2022-05-16 15:25 ` [PATCH v5 00/21] migration: Postcopy Preemption manish.mishra
  21 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

It's easy to build this upon the postcopy tls test.  Rename the old
postcopy recovery test to postcopy/recovery/plain.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 tests/qtest/migration-test.c | 33 ++++++++++++++++++++++++++-------
 1 file changed, 26 insertions(+), 7 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index c32b350aea..cbcf3f73a4 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1058,15 +1058,15 @@ static void test_postcopy_tls_psk(void)
     test_postcopy_common(&args);
 }
 
-static void test_postcopy_recovery(void)
+static void test_postcopy_recovery_common(MigrateStart *args)
 {
-    MigrateStart args = {
-        .hide_stderr = true,
-    };
     QTestState *from, *to;
     g_autofree char *uri = NULL;
 
-    if (migrate_postcopy_prepare(&from, &to, &args)) {
+    /* Always hide errors for postcopy recover tests since they're expected */
+    args->hide_stderr = true;
+
+    if (migrate_postcopy_prepare(&from, &to, args)) {
         return;
     }
 
@@ -1117,7 +1117,21 @@ static void test_postcopy_recovery(void)
     /* Restore the postcopy bandwidth to unlimited */
     migrate_set_parameter_int(from, "max-postcopy-bandwidth", 0);
 
-    migrate_postcopy_complete(from, to, &args);
+    migrate_postcopy_complete(from, to, args);
+}
+
+static void test_postcopy_recovery(void)
+{
+    MigrateStart args = { };
+
+    test_postcopy_recovery_common(&args);
+}
+
+static void test_postcopy_recovery_tls_psk(void)
+{
+    MigrateStart args = { .postcopy_tls = true };
+
+    test_postcopy_recovery_common(&args);
 }
 
 static void test_baddest(void)
@@ -2169,7 +2183,12 @@ int main(int argc, char **argv)
      */
     qtest_add_func("/migration/postcopy/tls/psk", test_postcopy_tls_psk);
 #endif /* CONFIG_GNUTLS */
-    qtest_add_func("/migration/postcopy/recovery", test_postcopy_recovery);
+    qtest_add_func("/migration/postcopy/recovery/plain",
+                   test_postcopy_recovery);
+#ifdef CONFIG_GNUTLS
+    qtest_add_func("/migration/postcopy/recovery/tls/psk",
+                   test_postcopy_recovery_tls_psk);
+#endif /* CONFIG_GNUTLS */
     qtest_add_func("/migration/bad_dest", test_baddest);
     qtest_add_func("/migration/precopy/unix/plain", test_precopy_unix_plain);
     qtest_add_func("/migration/precopy/unix/xbzrle", test_precopy_unix_xbzrle);
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 21/21] tests: Add postcopy preempt tests
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
                   ` (19 preceding siblings ...)
  2022-04-25 23:38 ` [PATCH v5 20/21] tests: Add postcopy tls recovery " Peter Xu
@ 2022-04-25 23:38 ` Peter Xu
  2022-05-16 15:25 ` [PATCH v5 00/21] migration: Postcopy Preemption manish.mishra
  21 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-04-25 23:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, peterx, Juan Quintela

Four tests are added for preempt mode:

  - Postcopy plain
  - Postcopy recovery
  - Postcopy tls
  - Postcopy tls+recovery

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 tests/qtest/migration-test.c | 54 ++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index cbcf3f73a4..af8d33c898 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -477,6 +477,7 @@ typedef struct {
      */
     bool hide_stderr;
     bool use_shmem;
+    bool postcopy_preempt;
     /* only launch the target process */
     bool only_target;
     /* Use dirty ring if true; dirty logging otherwise */
@@ -992,6 +993,11 @@ static int migrate_postcopy_prepare(QTestState **from_ptr,
     migrate_set_capability(to, "postcopy-ram", true);
     migrate_set_capability(to, "postcopy-blocktime", true);
 
+    if (args->postcopy_preempt) {
+        migrate_set_capability(from, "postcopy-preempt", true);
+        migrate_set_capability(to, "postcopy-preempt", true);
+    }
+
     /* We want to pick a speed slow enough that the test completes
      * quickly, but that it doesn't complete precopy even on a slow
      * machine, so also set the downtime.
@@ -1058,6 +1064,25 @@ static void test_postcopy_tls_psk(void)
     test_postcopy_common(&args);
 }
 
+static void test_postcopy_preempt(void)
+{
+    MigrateStart args = {
+        .postcopy_preempt = true,
+    };
+
+    test_postcopy_common(&args);
+}
+
+static void test_postcopy_preempt_tls_psk(void)
+{
+    MigrateStart args = {
+        .postcopy_preempt = true,
+        .postcopy_tls = true,
+    };
+
+    test_postcopy_common(&args);
+}
+
 static void test_postcopy_recovery_common(MigrateStart *args)
 {
     QTestState *from, *to;
@@ -1134,6 +1159,24 @@ static void test_postcopy_recovery_tls_psk(void)
     test_postcopy_recovery_common(&args);
 }
 
+static void test_postcopy_preempt_recovery(void)
+{
+    MigrateStart args = { .postcopy_preempt = true };
+
+    test_postcopy_recovery_common(&args);
+}
+
+/* This contains preempt+recovery+tls test altogether */
+static void test_postcopy_preempt_all(void)
+{
+    MigrateStart args = {
+        .postcopy_preempt = true,
+        .postcopy_tls = true,
+    };
+
+    test_postcopy_recovery_common(&args);
+}
+
 static void test_baddest(void)
 {
     MigrateStart args = {
@@ -2189,6 +2232,17 @@ int main(int argc, char **argv)
     qtest_add_func("/migration/postcopy/recovery/tls/psk",
                    test_postcopy_recovery_tls_psk);
 #endif /* CONFIG_GNUTLS */
+
+    qtest_add_func("/migration/postcopy/preempt/plain", test_postcopy_preempt);
+    qtest_add_func("/migration/postcopy/preempt/recovery/plain",
+                   test_postcopy_preempt_recovery);
+#ifdef CONFIG_GNUTLS
+    qtest_add_func("/migration/postcopy/preempt/tls/psk",
+                   test_postcopy_preempt_tls_psk);
+    qtest_add_func("/migration/postcopy/preempt/recovery/tls/psk",
+                   test_postcopy_preempt_all);
+#endif /* CONFIG_GNUTLS */
+
     qtest_add_func("/migration/bad_dest", test_baddest);
     qtest_add_func("/migration/precopy/unix/plain", test_precopy_unix_plain);
     qtest_add_func("/migration/precopy/unix/xbzrle", test_precopy_unix_xbzrle);
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 11/21] migration: Postcopy preemption preparation on channel creation
  2022-04-25 23:38 ` [PATCH v5 11/21] migration: Postcopy preemption preparation on channel creation Peter Xu
@ 2022-05-11 17:08   ` manish.mishra
  2022-05-12 16:36     ` Peter Xu
  0 siblings, 1 reply; 40+ messages in thread
From: manish.mishra @ 2022-05-11 17:08 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, Juan Quintela

LGTM

On 26/04/22 5:08 am, Peter Xu wrote:
> Create a new socket for postcopy to be prepared to send postcopy requested
> pages via this specific channel, so as to not get blocked by precopy pages.
>
> A new thread is also created on dest qemu to receive data from this new channel
> based on the ram_load_postcopy() routine.
>
> The ram_load_postcopy(POSTCOPY) branch and the thread has not started to
> function, and that'll be done in follow up patches.
>
> Cleanup the new sockets on both src/dst QEMUs, meanwhile look after the new
> thread too to make sure it'll be recycled properly.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>   migration/migration.c    | 62 +++++++++++++++++++++++----
>   migration/migration.h    |  8 ++++
>   migration/postcopy-ram.c | 92 ++++++++++++++++++++++++++++++++++++++--
>   migration/postcopy-ram.h | 10 +++++
>   migration/ram.c          | 25 ++++++++---
>   migration/ram.h          |  4 +-
>   migration/savevm.c       | 20 ++++-----
>   migration/socket.c       | 22 +++++++++-
>   migration/socket.h       |  1 +
>   migration/trace-events   |  5 ++-
>   10 files changed, 218 insertions(+), 31 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 75d9185c3a..e27aa306bc 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -321,6 +321,12 @@ void migration_incoming_state_destroy(void)
>           mis->page_requested = NULL;
>       }
>   
> +    if (mis->postcopy_qemufile_dst) {
> +        migration_ioc_unregister_yank_from_file(mis->postcopy_qemufile_dst);
> +        qemu_fclose(mis->postcopy_qemufile_dst);
> +        mis->postcopy_qemufile_dst = NULL;
> +    }
> +
>       yank_unregister_instance(MIGRATION_YANK_INSTANCE);
>   }
>   
> @@ -714,15 +720,21 @@ void migration_fd_process_incoming(QEMUFile *f, Error **errp)
>       migration_incoming_process();
>   }
>   
> +static bool migration_needs_multiple_sockets(void)
> +{
> +    return migrate_use_multifd() || migrate_postcopy_preempt();
> +}
> +
>   void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
>   {
>       MigrationIncomingState *mis = migration_incoming_get_current();
>       Error *local_err = NULL;
>       bool start_migration;
> +    QEMUFile *f;
>   
>       if (!mis->from_src_file) {
>           /* The first connection (multifd may have multiple) */
> -        QEMUFile *f = qemu_fopen_channel_input(ioc);
> +        f = qemu_fopen_channel_input(ioc);
>   
>           if (!migration_incoming_setup(f, errp)) {
>               return;
> @@ -730,13 +742,18 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
>   
>           /*
>            * Common migration only needs one channel, so we can start
> -         * right now.  Multifd needs more than one channel, we wait.
> +         * right now.  Some features need more than one channel, we wait.
>            */
> -        start_migration = !migrate_use_multifd();
> +        start_migration = !migration_needs_multiple_sockets();
>       } else {
>           /* Multiple connections */
> -        assert(migrate_use_multifd());
> -        start_migration = multifd_recv_new_channel(ioc, &local_err);
> +        assert(migration_needs_multiple_sockets());
> +        if (migrate_use_multifd()) {
> +            start_migration = multifd_recv_new_channel(ioc, &local_err);
> +        } else if (migrate_postcopy_preempt()) {
> +            f = qemu_fopen_channel_input(ioc);
> +            start_migration = postcopy_preempt_new_channel(mis, f);
> +        }
>           if (local_err) {
>               error_propagate(errp, local_err);
>               return;
> @@ -761,11 +778,20 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
>   bool migration_has_all_channels(void)
>   {
>       MigrationIncomingState *mis = migration_incoming_get_current();
> -    bool all_channels;
>   
> -    all_channels = multifd_recv_all_channels_created();
> +    if (!mis->from_src_file) {
> +        return false;
> +    }
> +
> +    if (migrate_use_multifd()) {
> +        return multifd_recv_all_channels_created();
> +    }
> +
> +    if (migrate_postcopy_preempt()) {
> +        return mis->postcopy_qemufile_dst != NULL;
> +    }
>   
> -    return all_channels && mis->from_src_file != NULL;
> +    return true;
>   }
>   
>   /*
> @@ -1862,6 +1888,12 @@ static void migrate_fd_cleanup(MigrationState *s)
>           qemu_fclose(tmp);
>       }
>   
> +    if (s->postcopy_qemufile_src) {
> +        migration_ioc_unregister_yank_from_file(s->postcopy_qemufile_src);
> +        qemu_fclose(s->postcopy_qemufile_src);
> +        s->postcopy_qemufile_src = NULL;
> +    }
> +
>       assert(!migration_is_active(s));
>   
>       if (s->state == MIGRATION_STATUS_CANCELLING) {
> @@ -3237,6 +3269,11 @@ static void migration_completion(MigrationState *s)
>           qemu_savevm_state_complete_postcopy(s->to_dst_file);
>           qemu_mutex_unlock_iothread();
>   
> +        /* Shutdown the postcopy fast path thread */
> +        if (migrate_postcopy_preempt()) {
> +            postcopy_preempt_shutdown_file(s);
> +        }
> +
>           trace_migration_completion_postcopy_end_after_complete();
>       } else {
>           goto fail;
> @@ -4124,6 +4161,15 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>           }
>       }
>   
> +    /* This needs to be done before resuming a postcopy */
> +    if (postcopy_preempt_setup(s, &local_err)) {
> +        error_report_err(local_err);
> +        migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
> +                          MIGRATION_STATUS_FAILED);
> +        migrate_fd_cleanup(s);
> +        return;
> +    }
> +
>       if (resume) {
>           /* Wakeup the main migration thread to do the recovery */
>           migrate_set_state(&s->state, MIGRATION_STATUS_POSTCOPY_PAUSED,
> diff --git a/migration/migration.h b/migration/migration.h
> index af4bcb19c2..caa910d956 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -23,6 +23,7 @@
>   #include "io/channel-buffer.h"
>   #include "net/announce.h"
>   #include "qom/object.h"
> +#include "postcopy-ram.h"
>   
>   struct PostcopyBlocktimeContext;
>   
> @@ -112,6 +113,11 @@ struct MigrationIncomingState {
>        * enabled.
>        */
>       unsigned int postcopy_channels;
> +    /* QEMUFile for postcopy only; it'll be handled by a separate thread */
> +    QEMUFile *postcopy_qemufile_dst;
> +    /* Postcopy priority thread is used to receive postcopy requested pages */
> +    QemuThread postcopy_prio_thread;
> +    bool postcopy_prio_thread_created;
>       /*
>        * An array of temp host huge pages to be used, one for each postcopy
>        * channel.
> @@ -192,6 +198,8 @@ struct MigrationState {
>       QEMUBH *cleanup_bh;
>       /* Protected by qemu_file_lock */
>       QEMUFile *to_dst_file;
> +    /* Postcopy specific transfer channel */
> +    QEMUFile *postcopy_qemufile_src;
>       QIOChannelBuffer *bioc;
>       /*
>        * Protects to_dst_file/from_dst_file pointers.  We need to make sure we
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index a66dd536d9..e92db0556b 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -33,6 +33,9 @@
>   #include "trace.h"
>   #include "hw/boards.h"
>   #include "exec/ramblock.h"
> +#include "socket.h"
> +#include "qemu-file-channel.h"
> +#include "yank_functions.h"
>   
>   /* Arbitrary limit on size of each discard command,
>    * keeps them around ~200 bytes
> @@ -567,6 +570,11 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingState *mis)
>   {
>       trace_postcopy_ram_incoming_cleanup_entry();
>   
> +    if (mis->postcopy_prio_thread_created) {
> +        qemu_thread_join(&mis->postcopy_prio_thread);
> +        mis->postcopy_prio_thread_created = false;
> +    }
> +
>       if (mis->have_fault_thread) {
>           Error *local_err = NULL;
>   
> @@ -1102,8 +1110,13 @@ static int postcopy_temp_pages_setup(MigrationIncomingState *mis)
>       int err, i, channels;
>       void *temp_page;
>   
> -    /* TODO: will be boosted when enable postcopy preemption */
> -    mis->postcopy_channels = 1;
> +    if (migrate_postcopy_preempt()) {
> +        /* If preemption enabled, need extra channel for urgent requests */
> +        mis->postcopy_channels = RAM_CHANNEL_MAX;
> +    } else {
> +        /* Both precopy/postcopy on the same channel */
> +        mis->postcopy_channels = 1;
> +    }
>   
>       channels = mis->postcopy_channels;
>       mis->postcopy_tmp_pages = g_malloc0_n(sizeof(PostcopyTmpPage), channels);
> @@ -1170,7 +1183,7 @@ int postcopy_ram_incoming_setup(MigrationIncomingState *mis)
>           return -1;
>       }
>   
> -    postcopy_thread_create(mis, &mis->fault_thread, "postcopy/fault",
> +    postcopy_thread_create(mis, &mis->fault_thread, "fault-default",
>                              postcopy_ram_fault_thread, QEMU_THREAD_JOINABLE);
>       mis->have_fault_thread = true;
>   
> @@ -1185,6 +1198,16 @@ int postcopy_ram_incoming_setup(MigrationIncomingState *mis)
>           return -1;
>       }
>   
> +    if (migrate_postcopy_preempt()) {
> +        /*
> +         * This thread needs to be created after the temp pages because
> +         * it'll fetch RAM_CHANNEL_POSTCOPY PostcopyTmpPage immediately.
> +         */
> +        postcopy_thread_create(mis, &mis->postcopy_prio_thread, "fault-fast",
> +                               postcopy_preempt_thread, QEMU_THREAD_JOINABLE);
> +        mis->postcopy_prio_thread_created = true;
> +    }
> +
>       trace_postcopy_ram_enable_notify();
>   
>       return 0;
> @@ -1514,3 +1537,66 @@ void postcopy_unregister_shared_ufd(struct PostCopyFD *pcfd)
>           }
>       }
>   }
> +
> +bool postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file)
> +{
> +    /*
> +     * The new loading channel has its own threads, so it needs to be
> +     * blocked too.  It's by default true, just be explicit.
> +     */
> +    qemu_file_set_blocking(file, true);
> +    mis->postcopy_qemufile_dst = file;
> +    trace_postcopy_preempt_new_channel();
> +
> +    /* Start the migration immediately */
> +    return true;
> +}
> +
> +int postcopy_preempt_setup(MigrationState *s, Error **errp)
> +{
> +    QIOChannel *ioc;
> +
> +    if (!migrate_postcopy_preempt()) {
> +        return 0;
> +    }
> +
> +    if (!migrate_multi_channels_is_allowed()) {
> +        error_setg(errp, "Postcopy preempt is not supported as current "
> +                   "migration stream does not support multi-channels.");
> +        return -1;
> +    }
> +
> +    ioc = socket_send_channel_create_sync(errp);
> +
> +    if (ioc == NULL) {
> +        return -1;
> +    }
> +
> +    migration_ioc_register_yank(ioc);
> +    s->postcopy_qemufile_src = qemu_fopen_channel_output(ioc);
> +
> +    trace_postcopy_preempt_new_channel();
> +
> +    return 0;
> +}
> +
> +void *postcopy_preempt_thread(void *opaque)
> +{
> +    MigrationIncomingState *mis = opaque;
> +    int ret;
> +
> +    trace_postcopy_preempt_thread_entry();
> +
> +    rcu_register_thread();
> +
> +    qemu_sem_post(&mis->thread_sync_sem);
> +
> +    /* Sending RAM_SAVE_FLAG_EOS to terminate this thread */
> +    ret = ram_load_postcopy(mis->postcopy_qemufile_dst, RAM_CHANNEL_POSTCOPY);
> +
> +    rcu_unregister_thread();
> +
> +    trace_postcopy_preempt_thread_exit();
> +
> +    return ret == 0 ? NULL : (void *)-1;
> +}
> diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h
> index 07684c0e1d..34b1080cde 100644
> --- a/migration/postcopy-ram.h
> +++ b/migration/postcopy-ram.h
> @@ -183,4 +183,14 @@ int postcopy_wake_shared(struct PostCopyFD *pcfd, uint64_t client_addr,
>   int postcopy_request_shared_page(struct PostCopyFD *pcfd, RAMBlock *rb,
>                                    uint64_t client_addr, uint64_t offset);
>   
> +/* Hard-code channels for now for postcopy preemption */
> +enum PostcopyChannels {
> +    RAM_CHANNEL_PRECOPY = 0,
> +    RAM_CHANNEL_POSTCOPY = 1,
> +    RAM_CHANNEL_MAX,
> +};
> +
> +bool postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file);
> +int postcopy_preempt_setup(MigrationState *s, Error **errp);
> +
>   #endif
> diff --git a/migration/ram.c b/migration/ram.c
> index a2489a2699..f5615e0a76 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -3644,15 +3644,15 @@ int ram_postcopy_incoming_init(MigrationIncomingState *mis)
>    * rcu_read_lock is taken prior to this being called.
>    *
>    * @f: QEMUFile where to send the data
> + * @channel: the channel to use for loading
>    */
> -int ram_load_postcopy(QEMUFile *f)
> +int ram_load_postcopy(QEMUFile *f, int channel)
>   {
>       int flags = 0, ret = 0;
>       bool place_needed = false;
>       bool matches_target_page_size = false;
>       MigrationIncomingState *mis = migration_incoming_get_current();
> -    /* Currently we only use channel 0.  TODO: use all the channels */
> -    PostcopyTmpPage *tmp_page = &mis->postcopy_tmp_pages[0];
> +    PostcopyTmpPage *tmp_page = &mis->postcopy_tmp_pages[channel];
>   
>       while (!ret && !(flags & RAM_SAVE_FLAG_EOS)) {
>           ram_addr_t addr;
> @@ -3676,7 +3676,7 @@ int ram_load_postcopy(QEMUFile *f)
>           flags = addr & ~TARGET_PAGE_MASK;
>           addr &= TARGET_PAGE_MASK;
>   
> -        trace_ram_load_postcopy_loop((uint64_t)addr, flags);
> +        trace_ram_load_postcopy_loop(channel, (uint64_t)addr, flags);
>           if (flags & (RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE |
>                        RAM_SAVE_FLAG_COMPRESS_PAGE)) {
>               block = ram_block_from_stream(mis, f, flags);
> @@ -3717,10 +3717,10 @@ int ram_load_postcopy(QEMUFile *f)
>               } else if (tmp_page->host_addr !=
>                          host_page_from_ram_block_offset(block, addr)) {
>                   /* not the 1st TP within the HP */
> -                error_report("Non-same host page detected.  "
> +                error_report("Non-same host page detected on channel %d: "
>                                "Target host page %p, received host page %p "
>                                "(rb %s offset 0x"RAM_ADDR_FMT" target_pages %d)",
> -                             tmp_page->host_addr,
> +                             channel, tmp_page->host_addr,
>                                host_page_from_ram_block_offset(block, addr),
>                                block->idstr, addr, tmp_page->target_pages);
>                   ret = -EINVAL;
> @@ -4107,7 +4107,12 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>        */
>       WITH_RCU_READ_LOCK_GUARD() {
>           if (postcopy_running) {
> -            ret = ram_load_postcopy(f);
> +            /*
> +             * Note!  Here RAM_CHANNEL_PRECOPY is the precopy channel of
> +             * postcopy migration, we have another RAM_CHANNEL_POSTCOPY to
> +             * service fast page faults.
> +             */
> +            ret = ram_load_postcopy(f, RAM_CHANNEL_PRECOPY);
>           } else {
>               ret = ram_load_precopy(f);
>           }
> @@ -4269,6 +4274,12 @@ static int ram_resume_prepare(MigrationState *s, void *opaque)
>       return 0;
>   }
>   
> +void postcopy_preempt_shutdown_file(MigrationState *s)
> +{
> +    qemu_put_be64(s->postcopy_qemufile_src, RAM_SAVE_FLAG_EOS);
> +    qemu_fflush(s->postcopy_qemufile_src);
> +}
> +
>   static SaveVMHandlers savevm_ram_handlers = {
>       .save_setup = ram_save_setup,
>       .save_live_iterate = ram_save_iterate,
> diff --git a/migration/ram.h b/migration/ram.h
> index ded0a3a086..5d90945a6e 100644
> --- a/migration/ram.h
> +++ b/migration/ram.h
> @@ -61,7 +61,7 @@ void ram_postcopy_send_discard_bitmap(MigrationState *ms);
>   /* For incoming postcopy discard */
>   int ram_discard_range(const char *block_name, uint64_t start, size_t length);
>   int ram_postcopy_incoming_init(MigrationIncomingState *mis);
> -int ram_load_postcopy(QEMUFile *f);
> +int ram_load_postcopy(QEMUFile *f, int channel);
>   
>   void ram_handle_compressed(void *host, uint8_t ch, uint64_t size);
>   
> @@ -73,6 +73,8 @@ int64_t ramblock_recv_bitmap_send(QEMUFile *file,
>                                     const char *block_name);
>   int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb);
>   bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t start);
> +void postcopy_preempt_shutdown_file(MigrationState *s);
> +void *postcopy_preempt_thread(void *opaque);
>   
>   /* ram cache */
>   int colo_init_ram_cache(void);
> diff --git a/migration/savevm.c b/migration/savevm.c
> index d9076897b8..ecee05e631 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2575,16 +2575,6 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis)
>   {
>       int i;
>   
> -    /*
> -     * If network is interrupted, any temp page we received will be useless
> -     * because we didn't mark them as "received" in receivedmap.  After a
> -     * proper recovery later (which will sync src dirty bitmap with receivedmap
> -     * on dest) these cached small pages will be resent again.
> -     */
> -    for (i = 0; i < mis->postcopy_channels; i++) {
> -        postcopy_temp_page_reset(&mis->postcopy_tmp_pages[i]);
> -    }
> -
>       trace_postcopy_pause_incoming();
>   
>       assert(migrate_postcopy_ram());
> @@ -2613,6 +2603,16 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis)
>       /* Notify the fault thread for the invalidated file handle */
>       postcopy_fault_thread_notify(mis);
>   
> +    /*
> +     * If network is interrupted, any temp page we received will be useless
> +     * because we didn't mark them as "received" in receivedmap.  After a
> +     * proper recovery later (which will sync src dirty bitmap with receivedmap
> +     * on dest) these cached small pages will be resent again.
> +     */
> +    for (i = 0; i < mis->postcopy_channels; i++) {
> +        postcopy_temp_page_reset(&mis->postcopy_tmp_pages[i]);
> +    }
> +
>       error_report("Detected IO failure for postcopy. "
>                    "Migration paused.");
>   
> diff --git a/migration/socket.c b/migration/socket.c
> index 05705a32d8..a7f345b353 100644
> --- a/migration/socket.c
> +++ b/migration/socket.c
> @@ -26,7 +26,7 @@
>   #include "io/channel-socket.h"
>   #include "io/net-listener.h"
>   #include "trace.h"
> -
> +#include "postcopy-ram.h"
>   
>   struct SocketOutgoingArgs {
>       SocketAddress *saddr;
> @@ -39,6 +39,24 @@ void socket_send_channel_create(QIOTaskFunc f, void *data)
>                                        f, data, NULL, NULL);
>   }
>   
> +QIOChannel *socket_send_channel_create_sync(Error **errp)
> +{
> +    QIOChannelSocket *sioc = qio_channel_socket_new();
> +
> +    if (!outgoing_args.saddr) {
> +        object_unref(OBJECT(sioc));
> +        error_setg(errp, "Initial sock address not set!");
> +        return NULL;
> +    }
> +
> +    if (qio_channel_socket_connect_sync(sioc, outgoing_args.saddr, errp) < 0) {
> +        object_unref(OBJECT(sioc));
> +        return NULL;
> +    }
> +
> +    return QIO_CHANNEL(sioc);
> +}
> +
>   int socket_send_channel_destroy(QIOChannel *send)
>   {
>       /* Remove channel */
> @@ -158,6 +176,8 @@ socket_start_incoming_migration_internal(SocketAddress *saddr,
>   
>       if (migrate_use_multifd()) {
>           num = migrate_multifd_channels();
> +    } else if (migrate_postcopy_preempt()) {
> +        num = RAM_CHANNEL_MAX;
>       }
>   
>       if (qio_net_listener_open_sync(listener, saddr, num, errp) < 0) {
> diff --git a/migration/socket.h b/migration/socket.h
> index 891dbccceb..dc54df4e6c 100644
> --- a/migration/socket.h
> +++ b/migration/socket.h
> @@ -21,6 +21,7 @@
>   #include "io/task.h"
>   
>   void socket_send_channel_create(QIOTaskFunc f, void *data);
> +QIOChannel *socket_send_channel_create_sync(Error **errp);
>   int socket_send_channel_destroy(QIOChannel *send);
>   
>   void socket_start_incoming_migration(const char *str, Error **errp);
> diff --git a/migration/trace-events b/migration/trace-events
> index 1aec580e92..4bc787cf0c 100644
> --- a/migration/trace-events
> +++ b/migration/trace-events
> @@ -91,7 +91,7 @@ migration_bitmap_clear_dirty(char *str, uint64_t start, uint64_t size, unsigned
>   migration_throttle(void) ""
>   ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %" PRIx64 " %zx"
>   ram_load_loop(const char *rbname, uint64_t addr, int flags, void *host) "%s: addr: 0x%" PRIx64 " flags: 0x%x host: %p"
> -ram_load_postcopy_loop(uint64_t addr, int flags) "@%" PRIx64 " %x"
> +ram_load_postcopy_loop(int channel, uint64_t addr, int flags) "chan=%d addr=0x%" PRIx64 " flags=0x%x"
>   ram_postcopy_send_discard_bitmap(void) ""
>   ram_save_page(const char *rbname, uint64_t offset, void *host) "%s: offset: 0x%" PRIx64 " host: %p"
>   ram_save_queue_pages(const char *rbname, size_t start, size_t len) "%s: start: 0x%zx len: 0x%zx"
> @@ -278,6 +278,9 @@ postcopy_request_shared_page(const char *sharer, const char *rb, uint64_t rb_off
>   postcopy_request_shared_page_present(const char *sharer, const char *rb, uint64_t rb_offset) "%s already %s offset 0x%"PRIx64
>   postcopy_wake_shared(uint64_t client_addr, const char *rb) "at 0x%"PRIx64" in %s"
>   postcopy_page_req_del(void *addr, int count) "resolved page req %p total %d"
> +postcopy_preempt_new_channel(void) ""
> +postcopy_preempt_thread_entry(void) ""
> +postcopy_preempt_thread_exit(void) ""
>   
>   get_mem_fault_cpu_index(int cpu, uint32_t pid) "cpu: %d, pid: %u"
>   


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 11/21] migration: Postcopy preemption preparation on channel creation
  2022-05-11 17:08   ` manish.mishra
@ 2022-05-12 16:36     ` Peter Xu
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-05-12 16:36 UTC (permalink / raw)
  To: manish.mishra
  Cc: qemu-devel, Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, Juan Quintela

On Wed, May 11, 2022 at 10:38:00PM +0530, manish.mishra wrote:
> LGTM

Manish,

Do you mean you reviewed the patch?  Do you want me to attach this as a
Reviewed-By tag in my next post?

It'll be great if you can provide it by replying with the R-b tag.  I can
also attach it after your confirmation.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 14/21] migration: Create the postcopy preempt channel asynchronously
  2022-04-25 23:38 ` [PATCH v5 14/21] migration: Create the postcopy preempt channel asynchronously Peter Xu
@ 2022-05-12 17:35   ` Dr. David Alan Gilbert
  2022-05-16 15:02   ` manish.mishra
  1 sibling, 0 replies; 40+ messages in thread
From: Dr. David Alan Gilbert @ 2022-05-12 17:35 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Daniel P . Berrange, Juan Quintela,
	Leonardo Bras Soares Passos

* Peter Xu (peterx@redhat.com) wrote:
> This patch allows the postcopy preempt channel to be created
> asynchronously.  The benefit is that when the connection is slow, we won't
> take the BQL (and potentially block all things like QMP) for a long time
> without releasing.
> 
> A function postcopy_preempt_wait_channel() is introduced, allowing the
> migration thread to be able to wait on the channel creation.  The channel
> is always created by the main thread, in which we'll kick a new semaphore
> to tell the migration thread that the channel has created.
> 
> We'll need to wait for the new channel in two places: (1) when there's a
> new postcopy migration that is starting, or (2) when there's a postcopy
> migration to resume.
> 
> For the start of migration, we don't need to wait for this channel until
> when we want to start postcopy, aka, postcopy_start().  We'll fail the
> migration if we found that the channel creation failed (which should
> probably not happen at all in 99% of the cases, because the main channel is
> using the same network topology).
> 
> For a postcopy recovery, we'll need to wait in postcopy_pause().  In that
> case if the channel creation failed, we can't fail the migration or we'll
> crash the VM, instead we keep in PAUSED state, waiting for yet another
> recovery.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  migration/migration.c    | 16 ++++++++++++
>  migration/migration.h    |  7 +++++
>  migration/postcopy-ram.c | 56 +++++++++++++++++++++++++++++++---------
>  migration/postcopy-ram.h |  1 +
>  4 files changed, 68 insertions(+), 12 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index a0db5de685..cce741e20e 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3020,6 +3020,12 @@ static int postcopy_start(MigrationState *ms)
>      int64_t bandwidth = migrate_max_postcopy_bandwidth();
>      bool restart_block = false;
>      int cur_state = MIGRATION_STATUS_ACTIVE;
> +
> +    if (postcopy_preempt_wait_channel(ms)) {
> +        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
> +        return -1;
> +    }
> +
>      if (!migrate_pause_before_switchover()) {
>          migrate_set_state(&ms->state, MIGRATION_STATUS_ACTIVE,
>                            MIGRATION_STATUS_POSTCOPY_ACTIVE);
> @@ -3501,6 +3507,14 @@ static MigThrError postcopy_pause(MigrationState *s)
>          if (s->state == MIGRATION_STATUS_POSTCOPY_RECOVER) {
>              /* Woken up by a recover procedure. Give it a shot */
>  
> +            if (postcopy_preempt_wait_channel(s)) {
> +                /*
> +                 * Preempt enabled, and new channel create failed; loop
> +                 * back to wait for another recovery.
> +                 */
> +                continue;
> +            }
> +
>              /*
>               * Firstly, let's wake up the return path now, with a new
>               * return path channel.
> @@ -4360,6 +4374,7 @@ static void migration_instance_finalize(Object *obj)
>      qemu_sem_destroy(&ms->postcopy_pause_sem);
>      qemu_sem_destroy(&ms->postcopy_pause_rp_sem);
>      qemu_sem_destroy(&ms->rp_state.rp_sem);
> +    qemu_sem_destroy(&ms->postcopy_qemufile_src_sem);
>      error_free(ms->error);
>  }
>  
> @@ -4406,6 +4421,7 @@ static void migration_instance_init(Object *obj)
>      qemu_sem_init(&ms->rp_state.rp_sem, 0);
>      qemu_sem_init(&ms->rate_limit_sem, 0);
>      qemu_sem_init(&ms->wait_unplug_sem, 0);
> +    qemu_sem_init(&ms->postcopy_qemufile_src_sem, 0);
>      qemu_mutex_init(&ms->qemu_file_lock);
>  }
>  
> diff --git a/migration/migration.h b/migration/migration.h
> index 91f845e9e4..f898b8547a 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -219,6 +219,13 @@ struct MigrationState {
>      QEMUFile *to_dst_file;
>      /* Postcopy specific transfer channel */
>      QEMUFile *postcopy_qemufile_src;
> +    /*
> +     * It is posted when the preempt channel is established.  Note: this is
> +     * used for both the start or recover of a postcopy migration.  We'll
> +     * post to this sem every time a new preempt channel is created in the
> +     * main thread, and we keep post() and wait() in pair.
> +     */
> +    QemuSemaphore postcopy_qemufile_src_sem;
>      QIOChannelBuffer *bioc;
>      /*
>       * Protects to_dst_file/from_dst_file pointers.  We need to make sure we
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index b3c81b46f6..1bb603051a 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -1552,10 +1552,50 @@ bool postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file)
>      return true;
>  }
>  
> -int postcopy_preempt_setup(MigrationState *s, Error **errp)
> +static void
> +postcopy_preempt_send_channel_new(QIOTask *task, gpointer opaque)
>  {
> -    QIOChannel *ioc;
> +    MigrationState *s = opaque;
> +    QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task));
> +    Error *local_err = NULL;
> +
> +    if (qio_task_propagate_error(task, &local_err)) {
> +        /* Something wrong happened.. */
> +        migrate_set_error(s, local_err);
> +        error_free(local_err);
> +    } else {
> +        migration_ioc_register_yank(ioc);
> +        s->postcopy_qemufile_src = qemu_fopen_channel_output(ioc);
> +        trace_postcopy_preempt_new_channel();
> +    }
> +
> +    /*
> +     * Kick the waiter in all cases.  The waiter should check upon
> +     * postcopy_qemufile_src to know whether it failed or not.
> +     */
> +    qemu_sem_post(&s->postcopy_qemufile_src_sem);
> +    object_unref(OBJECT(ioc));
> +}
>  
> +/* Returns 0 if channel established, -1 for error. */
> +int postcopy_preempt_wait_channel(MigrationState *s)
> +{
> +    /* If preempt not enabled, no need to wait */
> +    if (!migrate_postcopy_preempt()) {
> +        return 0;
> +    }
> +
> +    /*
> +     * We need the postcopy preempt channel to be established before
> +     * starting doing anything.
> +     */
> +    qemu_sem_wait(&s->postcopy_qemufile_src_sem);
> +
> +    return s->postcopy_qemufile_src ? 0 : -1;
> +}
> +
> +int postcopy_preempt_setup(MigrationState *s, Error **errp)
> +{
>      if (!migrate_postcopy_preempt()) {
>          return 0;
>      }
> @@ -1566,16 +1606,8 @@ int postcopy_preempt_setup(MigrationState *s, Error **errp)
>          return -1;
>      }
>  
> -    ioc = socket_send_channel_create_sync(errp);
> -
> -    if (ioc == NULL) {
> -        return -1;
> -    }
> -
> -    migration_ioc_register_yank(ioc);
> -    s->postcopy_qemufile_src = qemu_fopen_channel_output(ioc);
> -
> -    trace_postcopy_preempt_new_channel();
> +    /* Kick an async task to connect */
> +    socket_send_channel_create(postcopy_preempt_send_channel_new, s);
>  
>      return 0;
>  }
> diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h
> index 34b1080cde..6147bf7d1d 100644
> --- a/migration/postcopy-ram.h
> +++ b/migration/postcopy-ram.h
> @@ -192,5 +192,6 @@ enum PostcopyChannels {
>  
>  bool postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file);
>  int postcopy_preempt_setup(MigrationState *s, Error **errp);
> +int postcopy_preempt_wait_channel(MigrationState *s);
>  
>  #endif
> -- 
> 2.32.0
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 15/21] migration: Parameter x-postcopy-preempt-break-huge
  2022-04-25 23:38 ` [PATCH v5 15/21] migration: Parameter x-postcopy-preempt-break-huge Peter Xu
@ 2022-05-12 17:42   ` Dr. David Alan Gilbert
  2022-05-16 15:20   ` manish.mishra
  1 sibling, 0 replies; 40+ messages in thread
From: Dr. David Alan Gilbert @ 2022-05-12 17:42 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Daniel P . Berrange, Juan Quintela,
	Leonardo Bras Soares Passos

* Peter Xu (peterx@redhat.com) wrote:
> Add a parameter that can conditionally disable the "break sending huge
> page" behavior in postcopy preemption.  By default it's enabled.
> 
> It should only be used for debugging purposes, and we should never remove
> the "x-" prefix.

This is actually a 'property' and not a 'Parameter' isn't it?

> Signed-off-by: Peter Xu <peterx@redhat.com>

Other than the title,


Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  migration/migration.c | 2 ++
>  migration/migration.h | 7 +++++++
>  migration/ram.c       | 7 +++++++
>  3 files changed, 16 insertions(+)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index cce741e20e..cd9650f2e2 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -4329,6 +4329,8 @@ static Property migration_properties[] = {
>      DEFINE_PROP_SIZE("announce-step", MigrationState,
>                        parameters.announce_step,
>                        DEFAULT_MIGRATE_ANNOUNCE_STEP),
> +    DEFINE_PROP_BOOL("x-postcopy-preempt-break-huge", MigrationState,
> +                      postcopy_preempt_break_huge, true),
>  
>      /* Migration capabilities */
>      DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE),
> diff --git a/migration/migration.h b/migration/migration.h
> index f898b8547a..6ee520642f 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -340,6 +340,13 @@ struct MigrationState {
>      bool send_configuration;
>      /* Whether we send section footer during migration */
>      bool send_section_footer;
> +    /*
> +     * Whether we allow break sending huge pages when postcopy preempt is
> +     * enabled.  When disabled, we won't interrupt precopy within sending a
> +     * host huge page, which is the old behavior of vanilla postcopy.
> +     * NOTE: this parameter is ignored if postcopy preempt is not enabled.
> +     */
> +    bool postcopy_preempt_break_huge;
>  
>      /* Needed by postcopy-pause state */
>      QemuSemaphore postcopy_pause_sem;
> diff --git a/migration/ram.c b/migration/ram.c
> index a4b39e3675..f3a79c8556 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2266,11 +2266,18 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss)
>  
>  static bool postcopy_needs_preempt(RAMState *rs, PageSearchStatus *pss)
>  {
> +    MigrationState *ms = migrate_get_current();
> +
>      /* Not enabled eager preempt?  Then never do that. */
>      if (!migrate_postcopy_preempt()) {
>          return false;
>      }
>  
> +    /* If the user explicitly disabled breaking of huge page, skip */
> +    if (!ms->postcopy_preempt_break_huge) {
> +        return false;
> +    }
> +
>      /* If the ramblock we're sending is a small page?  Never bother. */
>      if (qemu_ram_pagesize(pss->block) == TARGET_PAGE_SIZE) {
>          return false;
> -- 
> 2.32.0
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 16/21] migration: Add helpers to detect TLS capability
  2022-04-25 23:38 ` [PATCH v5 16/21] migration: Add helpers to detect TLS capability Peter Xu
@ 2022-05-12 17:46   ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 40+ messages in thread
From: Dr. David Alan Gilbert @ 2022-05-12 17:46 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Daniel P . Berrange, Juan Quintela,
	Leonardo Bras Soares Passos

* Peter Xu (peterx@redhat.com) wrote:
> Add migrate_tls_enabled() to detect whether TLS is configured.
> 
> Add migrate_channel_requires_tls() to detect whether the specific channel
> requires TLS.
> 
> No functional change intended.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  migration/channel.c   | 10 ++--------
>  migration/migration.c |  8 ++++++++
>  migration/migration.h |  2 ++
>  migration/multifd.c   |  7 +------
>  migration/tls.c       |  9 +++++++++
>  migration/tls.h       |  4 ++++
>  6 files changed, 26 insertions(+), 14 deletions(-)
> 
> diff --git a/migration/channel.c b/migration/channel.c
> index c6a8dcf1d7..36e59eaeec 100644
> --- a/migration/channel.c
> +++ b/migration/channel.c
> @@ -38,10 +38,7 @@ void migration_channel_process_incoming(QIOChannel *ioc)
>      trace_migration_set_incoming_channel(
>          ioc, object_get_typename(OBJECT(ioc)));
>  
> -    if (s->parameters.tls_creds &&
> -        *s->parameters.tls_creds &&
> -        !object_dynamic_cast(OBJECT(ioc),
> -                             TYPE_QIO_CHANNEL_TLS)) {
> +    if (migrate_channel_requires_tls(ioc)) {
>          migration_tls_channel_process_incoming(s, ioc, &local_err);
>      } else {
>          migration_ioc_register_yank(ioc);
> @@ -71,10 +68,7 @@ void migration_channel_connect(MigrationState *s,
>          ioc, object_get_typename(OBJECT(ioc)), hostname, error);
>  
>      if (!error) {
> -        if (s->parameters.tls_creds &&
> -            *s->parameters.tls_creds &&
> -            !object_dynamic_cast(OBJECT(ioc),
> -                                 TYPE_QIO_CHANNEL_TLS)) {
> +        if (migrate_channel_requires_tls(ioc)) {
>              migration_tls_channel_connect(s, ioc, hostname, &error);
>  
>              if (!error) {
> diff --git a/migration/migration.c b/migration/migration.c
> index cd9650f2e2..71a50b5c37 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -49,6 +49,7 @@
>  #include "trace.h"
>  #include "exec/target_page.h"
>  #include "io/channel-buffer.h"
> +#include "io/channel-tls.h"
>  #include "migration/colo.h"
>  #include "hw/boards.h"
>  #include "hw/qdev-properties.h"
> @@ -4250,6 +4251,13 @@ void migration_global_dump(Monitor *mon)
>                     ms->clear_bitmap_shift);
>  }
>  
> +bool migrate_tls_enabled(void)
> +{
> +    MigrationState *s = migrate_get_current();
> +
> +    return s->parameters.tls_creds && *s->parameters.tls_creds;
> +}
> +
>  #define DEFINE_PROP_MIG_CAP(name, x)             \
>      DEFINE_PROP_BOOL(name, MigrationState, enabled_capabilities[x], false)
>  
> diff --git a/migration/migration.h b/migration/migration.h
> index 6ee520642f..db176ea749 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -436,6 +436,8 @@ bool migrate_use_events(void);
>  bool migrate_postcopy_blocktime(void);
>  bool migrate_background_snapshot(void);
>  bool migrate_postcopy_preempt(void);
> +/* Whether TLS is enabled for migration? */
> +bool migrate_tls_enabled(void);
>  
>  /* Sending on the return path - generic and then for each message type */
>  void migrate_send_rp_shut(MigrationIncomingState *mis,
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 9ea4f581e2..19e3c44491 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -782,17 +782,12 @@ static bool multifd_channel_connect(MultiFDSendParams *p,
>                                      QIOChannel *ioc,
>                                      Error *error)
>  {
> -    MigrationState *s = migrate_get_current();
> -
>      trace_multifd_set_outgoing_channel(
>          ioc, object_get_typename(OBJECT(ioc)),
>          migrate_get_current()->hostname, error);
>  
>      if (!error) {
> -        if (s->parameters.tls_creds &&
> -            *s->parameters.tls_creds &&
> -            !object_dynamic_cast(OBJECT(ioc),
> -                                 TYPE_QIO_CHANNEL_TLS)) {
> +        if (migrate_channel_requires_tls(ioc)) {
>              multifd_tls_channel_connect(p, ioc, &error);
>              if (!error) {
>                  /*
> diff --git a/migration/tls.c b/migration/tls.c
> index 32c384a8b6..bd1fa01048 100644
> --- a/migration/tls.c
> +++ b/migration/tls.c
> @@ -166,3 +166,12 @@ void migration_tls_channel_connect(MigrationState *s,
>                                NULL,
>                                NULL);
>  }
> +
> +bool migrate_channel_requires_tls(QIOChannel *ioc)
> +{
> +    if (!migrate_tls_enabled()) {
> +        return false;
> +    }
> +
> +    return !object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_TLS);
> +}
> diff --git a/migration/tls.h b/migration/tls.h
> index de4fe2cafd..a54c1dcec7 100644
> --- a/migration/tls.h
> +++ b/migration/tls.h
> @@ -37,4 +37,8 @@ void migration_tls_channel_connect(MigrationState *s,
>                                     QIOChannel *ioc,
>                                     const char *hostname,
>                                     Error **errp);
> +
> +/* Whether the QIO channel requires further TLS handshake? */
> +bool migrate_channel_requires_tls(QIOChannel *ioc);
> +
>  #endif
> -- 
> 2.32.0
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 17/21] migration: Export tls-[creds|hostname|authz] params to cmdline too
  2022-04-25 23:38 ` [PATCH v5 17/21] migration: Export tls-[creds|hostname|authz] params to cmdline too Peter Xu
@ 2022-05-12 18:03   ` Dr. David Alan Gilbert
  2022-05-12 19:05     ` Daniel P. Berrangé
  0 siblings, 1 reply; 40+ messages in thread
From: Dr. David Alan Gilbert @ 2022-05-12 18:03 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Daniel P . Berrange, Juan Quintela,
	Leonardo Bras Soares Passos

* Peter Xu (peterx@redhat.com) wrote:
> It's useful for specifying tls credentials all in the cmdline (along with
> the -object tls-creds-*), especially for debugging purpose.
> 
> The trick here is we must remember to not free these fields again in the
> finalize() function of migration object, otherwise it'll cause double-free.
> 
> The thing is when destroying an object, we'll first destroy the properties
> that bound to the object, then the object itself.  To be explicit, when
> destroy the object in object_finalize() we have such sequence of
> operations:
> 
>     object_property_del_all(obj);
>     object_deinit(obj, ti);
> 
> So after this change the two fields are properly released already even
> before reaching the finalize() function but in object_property_del_all(),
> hence we don't need to free them anymore in finalize() or it's double-free.
> 
> This also fixes a trivial memory leak for tls-authz as we forgot to free it
> before this patch.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  migration/migration.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 71a50b5c37..b0f2de1e09 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -4339,6 +4339,9 @@ static Property migration_properties[] = {
>                        DEFAULT_MIGRATE_ANNOUNCE_STEP),
>      DEFINE_PROP_BOOL("x-postcopy-preempt-break-huge", MigrationState,
>                        postcopy_preempt_break_huge, true),
> +    DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds),
> +    DEFINE_PROP_STRING("tls-hostname", MigrationState, parameters.tls_hostname),
> +    DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz),
>  
>      /* Migration capabilities */
>      DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE),
> @@ -4372,12 +4375,9 @@ static void migration_class_init(ObjectClass *klass, void *data)
>  static void migration_instance_finalize(Object *obj)
>  {
>      MigrationState *ms = MIGRATION_OBJ(obj);
> -    MigrationParameters *params = &ms->parameters;
>  
>      qemu_mutex_destroy(&ms->error_mutex);
>      qemu_mutex_destroy(&ms->qemu_file_lock);
> -    g_free(params->tls_hostname);
> -    g_free(params->tls_creds);

So hmm, why is tls-authz special here?

Dave

>      qemu_sem_destroy(&ms->wait_unplug_sem);
>      qemu_sem_destroy(&ms->rate_limit_sem);
>      qemu_sem_destroy(&ms->pause_sem);
> -- 
> 2.32.0
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 17/21] migration: Export tls-[creds|hostname|authz] params to cmdline too
  2022-05-12 18:03   ` Dr. David Alan Gilbert
@ 2022-05-12 19:05     ` Daniel P. Berrangé
  2022-05-12 20:07       ` Peter Xu
  0 siblings, 1 reply; 40+ messages in thread
From: Daniel P. Berrangé @ 2022-05-12 19:05 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Peter Xu, qemu-devel, Juan Quintela, Leonardo Bras Soares Passos

On Thu, May 12, 2022 at 07:03:13PM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > It's useful for specifying tls credentials all in the cmdline (along with
> > the -object tls-creds-*), especially for debugging purpose.
> > 
> > The trick here is we must remember to not free these fields again in the
> > finalize() function of migration object, otherwise it'll cause double-free.
> > 
> > The thing is when destroying an object, we'll first destroy the properties
> > that bound to the object, then the object itself.  To be explicit, when
> > destroy the object in object_finalize() we have such sequence of
> > operations:
> > 
> >     object_property_del_all(obj);
> >     object_deinit(obj, ti);
> > 
> > So after this change the two fields are properly released already even
> > before reaching the finalize() function but in object_property_del_all(),
> > hence we don't need to free them anymore in finalize() or it's double-free.
> > 
> > This also fixes a trivial memory leak for tls-authz as we forgot to free it
> > before this patch.
> > 
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  migration/migration.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/migration/migration.c b/migration/migration.c
> > index 71a50b5c37..b0f2de1e09 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -4339,6 +4339,9 @@ static Property migration_properties[] = {
> >                        DEFAULT_MIGRATE_ANNOUNCE_STEP),
> >      DEFINE_PROP_BOOL("x-postcopy-preempt-break-huge", MigrationState,
> >                        postcopy_preempt_break_huge, true),
> > +    DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds),
> > +    DEFINE_PROP_STRING("tls-hostname", MigrationState, parameters.tls_hostname),
> > +    DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz),
> >  
> >      /* Migration capabilities */
> >      DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE),
> > @@ -4372,12 +4375,9 @@ static void migration_class_init(ObjectClass *klass, void *data)
> >  static void migration_instance_finalize(Object *obj)
> >  {
> >      MigrationState *ms = MIGRATION_OBJ(obj);
> > -    MigrationParameters *params = &ms->parameters;
> >  
> >      qemu_mutex_destroy(&ms->error_mutex);
> >      qemu_mutex_destroy(&ms->qemu_file_lock);
> > -    g_free(params->tls_hostname);
> > -    g_free(params->tls_creds);
> 
> So hmm, why is tls-authz special here?

Pre-existing memory leak bug IIUC


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 17/21] migration: Export tls-[creds|hostname|authz] params to cmdline too
  2022-05-12 19:05     ` Daniel P. Berrangé
@ 2022-05-12 20:07       ` Peter Xu
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-05-12 20:07 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Dr. David Alan Gilbert, qemu-devel, Juan Quintela,
	Leonardo Bras Soares Passos

On Thu, May 12, 2022 at 08:05:45PM +0100, Daniel P. Berrangé wrote:
> > > @@ -4372,12 +4375,9 @@ static void migration_class_init(ObjectClass *klass, void *data)
> > >  static void migration_instance_finalize(Object *obj)
> > >  {
> > >      MigrationState *ms = MIGRATION_OBJ(obj);
> > > -    MigrationParameters *params = &ms->parameters;
> > >  
> > >      qemu_mutex_destroy(&ms->error_mutex);
> > >      qemu_mutex_destroy(&ms->qemu_file_lock);
> > > -    g_free(params->tls_hostname);
> > > -    g_free(params->tls_creds);
> > 
> > So hmm, why is tls-authz special here?
> 
> Pre-existing memory leak bug IIUC

Right, and there's one extra paragraph in commit message explaining it (per
Dan's request):

  This also fixes a trivial memory leak for tls-authz as we forgot to free it
  before this patch.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 13/21] migration: Postcopy recover with preempt enabled
  2022-04-25 23:38 ` [PATCH v5 13/21] migration: Postcopy recover with preempt enabled Peter Xu
@ 2022-05-16 13:31   ` manish.mishra
  2022-05-16 14:11     ` Peter Xu
  0 siblings, 1 reply; 40+ messages in thread
From: manish.mishra @ 2022-05-16 13:31 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, Juan Quintela


On 26/04/22 5:08 am, Peter Xu wrote:
LGTM,
Peter, I wanted to give review-tag for this and ealier patch too. I am 
new to qemu
review process so not sure how give review-tag, did not find any 
reference on
google too. So if you please let me know how to do it.
> To allow postcopy recovery, the ram fast load (preempt-only) dest QEMU thread
> needs similar handling on fault tolerance.  When ram_load_postcopy() fails,
> instead of stopping the thread it halts with a semaphore, preparing to be
> kicked again when recovery is detected.
>
> A mutex is introduced to make sure there's no concurrent operation upon the
> socket.  To make it simple, the fast ram load thread will take the mutex during
> its whole procedure, and only release it if it's paused.  The fast-path socket
> will be properly released by the main loading thread safely when there's
> network failures during postcopy with that mutex held.
>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   migration/migration.c    | 27 +++++++++++++++++++++++----
>   migration/migration.h    | 19 +++++++++++++++++++
>   migration/postcopy-ram.c | 25 +++++++++++++++++++++++--
>   migration/qemu-file.c    | 27 +++++++++++++++++++++++++++
>   migration/qemu-file.h    |  1 +
>   migration/savevm.c       | 26 ++++++++++++++++++++++++--
>   migration/trace-events   |  2 ++
>   7 files changed, 119 insertions(+), 8 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 8264b03d4d..a0db5de685 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -215,9 +215,11 @@ void migration_object_init(void)
>       current_incoming->postcopy_remote_fds =
>           g_array_new(FALSE, TRUE, sizeof(struct PostCopyFD));
>       qemu_mutex_init(&current_incoming->rp_mutex);
> +    qemu_mutex_init(&current_incoming->postcopy_prio_thread_mutex);
>       qemu_event_init(&current_incoming->main_thread_load_event, false);
>       qemu_sem_init(&current_incoming->postcopy_pause_sem_dst, 0);
>       qemu_sem_init(&current_incoming->postcopy_pause_sem_fault, 0);
> +    qemu_sem_init(&current_incoming->postcopy_pause_sem_fast_load, 0);
>       qemu_mutex_init(&current_incoming->page_request_mutex);
>       current_incoming->page_requested = g_tree_new(page_request_addr_cmp);
>   
> @@ -697,9 +699,9 @@ static bool postcopy_try_recover(void)
>   
>           /*
>            * Here, we only wake up the main loading thread (while the
> -         * fault thread will still be waiting), so that we can receive
> +         * rest threads will still be waiting), so that we can receive
>            * commands from source now, and answer it if needed. The
> -         * fault thread will be woken up afterwards until we are sure
> +         * rest threads will be woken up afterwards until we are sure
>            * that source is ready to reply to page requests.
>            */
>           qemu_sem_post(&mis->postcopy_pause_sem_dst);
> @@ -3470,6 +3472,18 @@ static MigThrError postcopy_pause(MigrationState *s)
>           qemu_file_shutdown(file);
>           qemu_fclose(file);
>   
> +        /*
> +         * Do the same to postcopy fast path socket too if there is.  No
> +         * locking needed because no racer as long as we do this before setting
> +         * status to paused.
> +         */
> +        if (s->postcopy_qemufile_src) {
> +            migration_ioc_unregister_yank_from_file(s->postcopy_qemufile_src);
> +            qemu_file_shutdown(s->postcopy_qemufile_src);
> +            qemu_fclose(s->postcopy_qemufile_src);
> +            s->postcopy_qemufile_src = NULL;
> +        }
> +
>           migrate_set_state(&s->state, s->state,
>                             MIGRATION_STATUS_POSTCOPY_PAUSED);
>   
> @@ -3525,8 +3539,13 @@ static MigThrError migration_detect_error(MigrationState *s)
>           return MIG_THR_ERR_FATAL;
>       }
>   
> -    /* Try to detect any file errors */
> -    ret = qemu_file_get_error_obj(s->to_dst_file, &local_error);
> +    /*
> +     * Try to detect any file errors.  Note that postcopy_qemufile_src will
> +     * be NULL when postcopy preempt is not enabled.
> +     */
> +    ret = qemu_file_get_error_obj_any(s->to_dst_file,
> +                                      s->postcopy_qemufile_src,
> +                                      &local_error);
>       if (!ret) {
>           /* Everything is fine */
>           assert(!local_error);
> diff --git a/migration/migration.h b/migration/migration.h
> index b8aacfe3af..91f845e9e4 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -118,6 +118,18 @@ struct MigrationIncomingState {
>       /* Postcopy priority thread is used to receive postcopy requested pages */
>       QemuThread postcopy_prio_thread;
>       bool postcopy_prio_thread_created;
> +    /*
> +     * Used to sync between the ram load main thread and the fast ram load
> +     * thread.  It protects postcopy_qemufile_dst, which is the postcopy
> +     * fast channel.
> +     *
> +     * The ram fast load thread will take it mostly for the whole lifecycle
> +     * because it needs to continuously read data from the channel, and
> +     * it'll only release this mutex if postcopy is interrupted, so that
> +     * the ram load main thread will take this mutex over and properly
> +     * release the broken channel.
> +     */
> +    QemuMutex postcopy_prio_thread_mutex;
>       /*
>        * An array of temp host huge pages to be used, one for each postcopy
>        * channel.
> @@ -147,6 +159,13 @@ struct MigrationIncomingState {
>       /* notify PAUSED postcopy incoming migrations to try to continue */
>       QemuSemaphore postcopy_pause_sem_dst;
>       QemuSemaphore postcopy_pause_sem_fault;
> +    /*
> +     * This semaphore is used to allow the ram fast load thread (only when
> +     * postcopy preempt is enabled) fall into sleep when there's network
> +     * interruption detected.  When the recovery is done, the main load
> +     * thread will kick the fast ram load thread using this semaphore.
> +     */
> +    QemuSemaphore postcopy_pause_sem_fast_load;
>   
>       /* List of listening socket addresses  */
>       SocketAddressList *socket_address_list;
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index e92db0556b..b3c81b46f6 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -1580,6 +1580,15 @@ int postcopy_preempt_setup(MigrationState *s, Error **errp)
>       return 0;
>   }
>   
> +static void postcopy_pause_ram_fast_load(MigrationIncomingState *mis)
> +{
> +    trace_postcopy_pause_fast_load();
> +    qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex);

I may have misunderstood synchronisation here but very very rare chance,

as both threads are working independently is it possible qemu_sem_post on

postcopy_pause_sem_fast_load is called by main thread even before we go

to qemu_sem_wait in next line, causing some kind of deadlock. That's should

not be possible as i understand it requires manually calling 
qmp_migration_recover

so chances are almost impossible. Just wanted to confirm it.

> +    qemu_sem_wait(&mis->postcopy_pause_sem_fast_load);

Just wanted to confirm why postcopy_pause_incoming is not called here 
itself.

Is it based on assumption that if there is error in any of the channel 
it will

eventually be paused on source side, closing both channels, resulting

postcopy_pause_incoming will be called from main thread on destination?

Usually it should be good to call as early as possible. It is left to main

thread default path so that we do not have any synchronisation overhead?

Also Peter, i was trying to understand postcopy recovery model so is use 
case

of qmp_migrate_pause just for debugging purpose?

> +    qemu_mutex_lock(&mis->postcopy_prio_thread_mutex);
> +    trace_postcopy_pause_fast_load_continued();
> +}
> +
>   void *postcopy_preempt_thread(void *opaque)
>   {
>       MigrationIncomingState *mis = opaque;
> @@ -1592,11 +1601,23 @@ void *postcopy_preempt_thread(void *opaque)
>       qemu_sem_post(&mis->thread_sync_sem);
>   
>       /* Sending RAM_SAVE_FLAG_EOS to terminate this thread */
> -    ret = ram_load_postcopy(mis->postcopy_qemufile_dst, RAM_CHANNEL_POSTCOPY);
> +    qemu_mutex_lock(&mis->postcopy_prio_thread_mutex);
> +    while (1) {
> +        ret = ram_load_postcopy(mis->postcopy_qemufile_dst,
> +                                RAM_CHANNEL_POSTCOPY);
> +        /* If error happened, go into recovery routine */
> +        if (ret) {
> +            postcopy_pause_ram_fast_load(mis);
> +        } else {
> +            /* We're done */
> +            break;
> +        }
> +    }
> +    qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex);
>   
>       rcu_unregister_thread();
>   
>       trace_postcopy_preempt_thread_exit();
>   
> -    return ret == 0 ? NULL : (void *)-1;
> +    return NULL;
>   }
> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> index 1479cddad9..397652f0ba 100644
> --- a/migration/qemu-file.c
> +++ b/migration/qemu-file.c
> @@ -139,6 +139,33 @@ int qemu_file_get_error_obj(QEMUFile *f, Error **errp)
>       return f->last_error;
>   }
>   
> +/*
> + * Get last error for either stream f1 or f2 with optional Error*.
> + * The error returned (non-zero) can be either from f1 or f2.
> + *
> + * If any of the qemufile* is NULL, then skip the check on that file.
> + *
> + * When there is no error on both qemufile, zero is returned.
> + */
> +int qemu_file_get_error_obj_any(QEMUFile *f1, QEMUFile *f2, Error **errp)
> +{
> +    int ret = 0;
> +
> +    if (f1) {
> +        ret = qemu_file_get_error_obj(f1, errp);
> +        /* If there's already error detected, return */
> +        if (ret) {
> +            return ret;
> +        }
> +    }
> +
> +    if (f2) {
> +        ret = qemu_file_get_error_obj(f2, errp);
> +    }
> +
> +    return ret;
> +}
> +
>   /*
>    * Set the last error for stream f with optional Error*
>    */
> diff --git a/migration/qemu-file.h b/migration/qemu-file.h
> index 3f36d4dc8c..2564e5e1c7 100644
> --- a/migration/qemu-file.h
> +++ b/migration/qemu-file.h
> @@ -156,6 +156,7 @@ void qemu_file_update_transfer(QEMUFile *f, int64_t len);
>   void qemu_file_set_rate_limit(QEMUFile *f, int64_t new_rate);
>   int64_t qemu_file_get_rate_limit(QEMUFile *f);
>   int qemu_file_get_error_obj(QEMUFile *f, Error **errp);
> +int qemu_file_get_error_obj_any(QEMUFile *f1, QEMUFile *f2, Error **errp);
>   void qemu_file_set_error_obj(QEMUFile *f, int ret, Error *err);
>   void qemu_file_set_error(QEMUFile *f, int ret);
>   int qemu_file_shutdown(QEMUFile *f);
> diff --git a/migration/savevm.c b/migration/savevm.c
> index ecee05e631..050874650a 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2152,6 +2152,13 @@ static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis)
>        */
>       qemu_sem_post(&mis->postcopy_pause_sem_fault);
>   
> +    if (migrate_postcopy_preempt()) {
> +        /* The channel should already be setup again; make sure of it */
> +        assert(mis->postcopy_qemufile_dst);
> +        /* Kick the fast ram load thread too */

> +        qemu_sem_post(&mis->postcopy_pause_sem_fast_load);
> +    }
> +
>       return 0;
>   }
>   
> @@ -2597,6 +2604,21 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis)
>       mis->to_src_file = NULL;
>       qemu_mutex_unlock(&mis->rp_mutex);
>   
> +    /*
> +     * NOTE: this must happen before reset the PostcopyTmpPages below,
> +     * otherwise it's racy to reset those fields when the fast load thread
> +     * can be accessing it in parallel.
> +     */
> +    if (mis->postcopy_qemufile_dst) {
> +        qemu_file_shutdown(mis->postcopy_qemufile_dst);
> +        /* Take the mutex to make sure the fast ram load thread halted */
> +        qemu_mutex_lock(&mis->postcopy_prio_thread_mutex);
> +        migration_ioc_unregister_yank_from_file(mis->postcopy_qemufile_dst);
> +        qemu_fclose(mis->postcopy_qemufile_dst);
> +        mis->postcopy_qemufile_dst = NULL;
> +        qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex);
> +    }
> +
>       migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
>                         MIGRATION_STATUS_POSTCOPY_PAUSED);
>   
> @@ -2634,8 +2656,8 @@ retry:
>       while (true) {
>           section_type = qemu_get_byte(f);
>   
> -        if (qemu_file_get_error(f)) {
> -            ret = qemu_file_get_error(f);
> +        ret = qemu_file_get_error_obj_any(f, mis->postcopy_qemufile_dst, NULL);
> +        if (ret) {
>               break;
>           }
>   
> diff --git a/migration/trace-events b/migration/trace-events
> index 69f311169a..0e385c3a07 100644
> --- a/migration/trace-events
> +++ b/migration/trace-events
> @@ -270,6 +270,8 @@ mark_postcopy_blocktime_begin(uint64_t addr, void *dd, uint32_t time, int cpu, i
>   mark_postcopy_blocktime_end(uint64_t addr, void *dd, uint32_t time, int affected_cpu) "addr: 0x%" PRIx64 ", dd: %p, time: %u, affected_cpu: %d"
>   postcopy_pause_fault_thread(void) ""
>   postcopy_pause_fault_thread_continued(void) ""
> +postcopy_pause_fast_load(void) ""
> +postcopy_pause_fast_load_continued(void) ""
>   postcopy_ram_fault_thread_entry(void) ""
>   postcopy_ram_fault_thread_exit(void) ""
>   postcopy_ram_fault_thread_fds_core(int baseufd, int quitfd) "ufd: %d quitfd: %d"


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 13/21] migration: Postcopy recover with preempt enabled
  2022-05-16 13:31   ` manish.mishra
@ 2022-05-16 14:11     ` Peter Xu
  2022-05-16 14:51       ` manish.mishra
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2022-05-16 14:11 UTC (permalink / raw)
  To: manish.mishra
  Cc: qemu-devel, Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, Juan Quintela

Hi, Manish,

On Mon, May 16, 2022 at 07:01:35PM +0530, manish.mishra wrote:
> 
> On 26/04/22 5:08 am, Peter Xu wrote:
> LGTM,
> Peter, I wanted to give review-tag for this and ealier patch too. I am new
> to qemu
> review process so not sure how give review-tag, did not find any reference
> on
> google too. So if you please let me know how to do it.

It's here:

https://git.qemu.org/?p=qemu.git;a=blob;f=docs/devel/submitting-a-patch.rst;hb=HEAD#l492

Since afaict QEMU is mostly following what Linux does, you can also
reference to this one with more context:

https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html#using-reported-by-tested-by-reviewed-by-suggested-by-and-fixes

But since you're still having question regarding this patch, no rush on
providing your R-bs; let's finish the discussion first.

[...]

> > +static void postcopy_pause_ram_fast_load(MigrationIncomingState *mis)
> > +{
> > +    trace_postcopy_pause_fast_load();
> > +    qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex);
> 
> I may have misunderstood synchronisation here but very very rare chance,
> 
> as both threads are working independently is it possible qemu_sem_post on
> 
> postcopy_pause_sem_fast_load is called by main thread even before we go
> 
> to qemu_sem_wait in next line, causing some kind of deadlock. That's should
> 
> not be possible as i understand it requires manually calling
> qmp_migration_recover
> 
> so chances are almost impossible. Just wanted to confirm it.

Sorry I don't quite get the question, could you elaborate?  E.g., when the
described deadlock happened, what is both main thread and preempt load
thread doing?  What are they waiting at?

Note: we have already released mutex before waiting on sem.

> 
> > +    qemu_sem_wait(&mis->postcopy_pause_sem_fast_load);
> 
> Just wanted to confirm why postcopy_pause_incoming is not called here
> itself.

postcopy_pause_incoming() is only used in the main ram load thread, while
this function (postcopy_pause_ram_fast_load) is only called by the preempt
load thread.

> 
> Is it based on assumption that if there is error in any of the channel it
> will
> 
> eventually be paused on source side, closing both channels, resulting
> 
> postcopy_pause_incoming will be called from main thread on destination?

Yes.

> 
> Usually it should be good to call as early as possible. It is left to main
> 
> thread default path so that we do not have any synchronisation overhead?

What's the sync overhead you mentioned? What we want to do here is simply
to put all the dest QEMU migration threads into a halted state rather than
quitting, so that they can be continued when necessary.

> 
> Also Peter, i was trying to understand postcopy recovery model so is use
> case
> 
> of qmp_migrate_pause just for debugging purpose?

Yes.  It's also a way to cleanly stop using the network (comparing to force
unplug the nic ports?) for whatever reason with a shutdown() syscall upon
the socket.  I just don't know whether there's any real use case of that in
reality.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 13/21] migration: Postcopy recover with preempt enabled
  2022-05-16 14:11     ` Peter Xu
@ 2022-05-16 14:51       ` manish.mishra
  2022-05-16 15:32         ` manish.mishra
  2022-05-16 15:58         ` Peter Xu
  0 siblings, 2 replies; 40+ messages in thread
From: manish.mishra @ 2022-05-16 14:51 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, Juan Quintela

[-- Attachment #1: Type: text/plain, Size: 4736 bytes --]


On 16/05/22 7:41 pm, Peter Xu wrote:
> Hi, Manish,
>
> On Mon, May 16, 2022 at 07:01:35PM +0530, manish.mishra wrote:
>> On 26/04/22 5:08 am, Peter Xu wrote:
>> LGTM,
>> Peter, I wanted to give review-tag for this and ealier patch too. I am new
>> to qemu
>> review process so not sure how give review-tag, did not find any reference
>> on
>> google too. So if you please let me know how to do it.
> It's here:
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__git.qemu.org_-3Fp-3Dqemu.git-3Ba-3Dblob-3Bf-3Ddocs_devel_submitting-2Da-2Dpatch.rst-3Bhb-3DHEAD-23l492&d=DwIBaQ&c=s883GpUCOChKOHiocYtGcg&r=c4KON2DiMd-szjwjggQcuUvTsPWblztAL0gVzaHnNmc&m=8LU6rphEJ5GMSXEpSxe8JZ_hpn6TQDUXfjWM6Vt7DdShxnU3X5zYXbAMBLPYchdK&s=TUNUCtdl7LWhrdlfnIx1F08kC0d9IMvArl6cNMpfXkc&e=  
>
> Since afaict QEMU is mostly following what Linux does, you can also
> reference to this one with more context:
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.kernel.org_doc_html_v4.17_process_submitting-2Dpatches.html-23using-2Dreported-2Dby-2Dtested-2Dby-2Dreviewed-2Dby-2Dsuggested-2Dby-2Dand-2Dfixes&d=DwIBaQ&c=s883GpUCOChKOHiocYtGcg&r=c4KON2DiMd-szjwjggQcuUvTsPWblztAL0gVzaHnNmc&m=8LU6rphEJ5GMSXEpSxe8JZ_hpn6TQDUXfjWM6Vt7DdShxnU3X5zYXbAMBLPYchdK&s=TJmr_eC4LAccVY1EqgkLleXfJhUgtIjTJmLc3cedYr0&e=  
>
> But since you're still having question regarding this patch, no rush on
> providing your R-bs; let's finish the discussion first.
>
> [...]
>
>>> +static void postcopy_pause_ram_fast_load(MigrationIncomingState *mis)
>>> +{
>>> +    trace_postcopy_pause_fast_load();
>>> +    qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex);
>> I may have misunderstood synchronisation here but very very rare chance,
>>
>> as both threads are working independently is it possible qemu_sem_post on
>>
>> postcopy_pause_sem_fast_load is called by main thread even before we go
>>
>> to qemu_sem_wait in next line, causing some kind of deadlock. That's should
>>
>> not be possible as i understand it requires manually calling
>> qmp_migration_recover
>>
>> so chances are almost impossible. Just wanted to confirm it.
> Sorry I don't quite get the question, could you elaborate?  E.g., when the
> described deadlock happened, what is both main thread and preempt load
> thread doing?  What are they waiting at?
>
> Note: we have already released mutex before waiting on sem.

What i meant here is deadlock could be due the reason that we infinately wait

on qemu_sem_wait(&mis->postcopy_pause_sem_fast_load), because main

thread already called post on postcopy_pause_sem_fast_load after recovery

even before we moved to qemu_sem_wait(&mis->postcopy_pause_sem_fast_load)

in next line. Basically if we miss a post on postcopy_pause_sem_fast_load.

This is nearly impossibily case becuase it requires full recovery path to be completed

before this thread executes just next line. Also as recovery needs to be called manually,

So please ignore this.

Basically i wanted to check if we should use something like

int pthread_cond_wait(pthread_cond_t *restrict cond,
                    pthread_mutex_t *restrict mutex);

so that there is no race between releasing mutex and calling wait.

>
>>> +    qemu_sem_wait(&mis->postcopy_pause_sem_fast_load);
>> Just wanted to confirm why postcopy_pause_incoming is not called here
>> itself.
> postcopy_pause_incoming() is only used in the main ram load thread, while
> this function (postcopy_pause_ram_fast_load) is only called by the preempt
> load thread.
>
ok got it, thanks Peter, i meant if we should close both the channels as soon

as we relise there is some failure instead of main thread waiting for error event

and then closing and pausing post-copy. But agree current approach is good.

>> Is it based on assumption that if there is error in any of the channel it
>> will
>>
>> eventually be paused on source side, closing both channels, resulting
>>
>> postcopy_pause_incoming will be called from main thread on destination?
> Yes.
>
>> Usually it should be good to call as early as possible. It is left to main
>>
>> thread default path so that we do not have any synchronisation overhead?
> What's the sync overhead you mentioned? What we want to do here is simply
> to put all the dest QEMU migration threads into a halted state rather than
> quitting, so that they can be continued when necessary.
>
>> Also Peter, i was trying to understand postcopy recovery model so is use
>> case
>>
>> of qmp_migrate_pause just for debugging purpose?
> Yes.  It's also a way to cleanly stop using the network (comparing to force
> unplug the nic ports?) for whatever reason with a shutdown() syscall upon
> the socket.  I just don't know whether there's any real use case of that in
> reality.
>
> Thanks,
>

[-- Attachment #2: Type: text/html, Size: 8091 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 14/21] migration: Create the postcopy preempt channel asynchronously
  2022-04-25 23:38 ` [PATCH v5 14/21] migration: Create the postcopy preempt channel asynchronously Peter Xu
  2022-05-12 17:35   ` Dr. David Alan Gilbert
@ 2022-05-16 15:02   ` manish.mishra
  2022-05-16 16:21     ` Peter Xu
  1 sibling, 1 reply; 40+ messages in thread
From: manish.mishra @ 2022-05-16 15:02 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, Juan Quintela

[-- Attachment #1: Type: text/plain, Size: 7473 bytes --]


On 26/04/22 5:08 am, Peter Xu wrote:
> This patch allows the postcopy preempt channel to be created
> asynchronously.  The benefit is that when the connection is slow, we won't
> take the BQL (and potentially block all things like QMP) for a long time
> without releasing.
>
> A function postcopy_preempt_wait_channel() is introduced, allowing the
> migration thread to be able to wait on the channel creation.  The channel
> is always created by the main thread, in which we'll kick a new semaphore
> to tell the migration thread that the channel has created.
>
> We'll need to wait for the new channel in two places: (1) when there's a
> new postcopy migration that is starting, or (2) when there's a postcopy
> migration to resume.
>
> For the start of migration, we don't need to wait for this channel until
> when we want to start postcopy, aka, postcopy_start().  We'll fail the
> migration if we found that the channel creation failed (which should
> probably not happen at all in 99% of the cases, because the main channel is
> using the same network topology).
>
> For a postcopy recovery, we'll need to wait in postcopy_pause().  In that
> case if the channel creation failed, we can't fail the migration or we'll
> crash the VM, instead we keep in PAUSED state, waiting for yet another
> recovery.
>
> Signed-off-by: Peter Xu<peterx@redhat.com>

Reviewed-by: Manish Mishra <manish.mishra@nutanix.com>

This also looks good to me, sorry if i mis-understood and this is not correct way to add review tag.

> ---
>   migration/migration.c    | 16 ++++++++++++
>   migration/migration.h    |  7 +++++
>   migration/postcopy-ram.c | 56 +++++++++++++++++++++++++++++++---------
>   migration/postcopy-ram.h |  1 +
>   4 files changed, 68 insertions(+), 12 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index a0db5de685..cce741e20e 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3020,6 +3020,12 @@ static int postcopy_start(MigrationState *ms)
>       int64_t bandwidth = migrate_max_postcopy_bandwidth();
>       bool restart_block = false;
>       int cur_state = MIGRATION_STATUS_ACTIVE;
> +
> +    if (postcopy_preempt_wait_channel(ms)) {
> +        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
> +        return -1;
> +    }
> +
>       if (!migrate_pause_before_switchover()) {
>           migrate_set_state(&ms->state, MIGRATION_STATUS_ACTIVE,
>                             MIGRATION_STATUS_POSTCOPY_ACTIVE);
> @@ -3501,6 +3507,14 @@ static MigThrError postcopy_pause(MigrationState *s)
>           if (s->state == MIGRATION_STATUS_POSTCOPY_RECOVER) {
>               /* Woken up by a recover procedure. Give it a shot */
>   
> +            if (postcopy_preempt_wait_channel(s)) {
> +                /*
> +                 * Preempt enabled, and new channel create failed; loop
> +                 * back to wait for another recovery.
> +                 */
> +                continue;
> +            }
> +
>               /*
>                * Firstly, let's wake up the return path now, with a new
>                * return path channel.
> @@ -4360,6 +4374,7 @@ static void migration_instance_finalize(Object *obj)
>       qemu_sem_destroy(&ms->postcopy_pause_sem);
>       qemu_sem_destroy(&ms->postcopy_pause_rp_sem);
>       qemu_sem_destroy(&ms->rp_state.rp_sem);
> +    qemu_sem_destroy(&ms->postcopy_qemufile_src_sem);
>       error_free(ms->error);
>   }
>   
> @@ -4406,6 +4421,7 @@ static void migration_instance_init(Object *obj)
>       qemu_sem_init(&ms->rp_state.rp_sem, 0);
>       qemu_sem_init(&ms->rate_limit_sem, 0);
>       qemu_sem_init(&ms->wait_unplug_sem, 0);
> +    qemu_sem_init(&ms->postcopy_qemufile_src_sem, 0);
>       qemu_mutex_init(&ms->qemu_file_lock);
>   }
>   
> diff --git a/migration/migration.h b/migration/migration.h
> index 91f845e9e4..f898b8547a 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -219,6 +219,13 @@ struct MigrationState {
>       QEMUFile *to_dst_file;
>       /* Postcopy specific transfer channel */
>       QEMUFile *postcopy_qemufile_src;
> +    /*
> +     * It is posted when the preempt channel is established.  Note: this is
> +     * used for both the start or recover of a postcopy migration.  We'll
> +     * post to this sem every time a new preempt channel is created in the
> +     * main thread, and we keep post() and wait() in pair.
> +     */
> +    QemuSemaphore postcopy_qemufile_src_sem;
>       QIOChannelBuffer *bioc;
>       /*
>        * Protects to_dst_file/from_dst_file pointers.  We need to make sure we
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index b3c81b46f6..1bb603051a 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -1552,10 +1552,50 @@ bool postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file)
>       return true;
>   }
>   
> -int postcopy_preempt_setup(MigrationState *s, Error **errp)
> +static void
> +postcopy_preempt_send_channel_new(QIOTask *task, gpointer opaque)
>   {
> -    QIOChannel *ioc;
> +    MigrationState *s = opaque;
> +    QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task));
> +    Error *local_err = NULL;
> +
> +    if (qio_task_propagate_error(task, &local_err)) {
> +        /* Something wrong happened.. */
> +        migrate_set_error(s, local_err);
> +        error_free(local_err);
> +    } else {
> +        migration_ioc_register_yank(ioc);
> +        s->postcopy_qemufile_src = qemu_fopen_channel_output(ioc);
> +        trace_postcopy_preempt_new_channel();
> +    }
> +
> +    /*
> +     * Kick the waiter in all cases.  The waiter should check upon
> +     * postcopy_qemufile_src to know whether it failed or not.
> +     */
> +    qemu_sem_post(&s->postcopy_qemufile_src_sem);
> +    object_unref(OBJECT(ioc));
> +}
>   
> +/* Returns 0 if channel established, -1 for error. */
> +int postcopy_preempt_wait_channel(MigrationState *s)
> +{
> +    /* If preempt not enabled, no need to wait */
> +    if (!migrate_postcopy_preempt()) {
> +        return 0;
> +    }
> +
> +    /*
> +     * We need the postcopy preempt channel to be established before
> +     * starting doing anything.
> +     */
> +    qemu_sem_wait(&s->postcopy_qemufile_src_sem);
> +
> +    return s->postcopy_qemufile_src ? 0 : -1;
> +}
> +
> +int postcopy_preempt_setup(MigrationState *s, Error **errp)
> +{
>       if (!migrate_postcopy_preempt()) {
>           return 0;
>       }
> @@ -1566,16 +1606,8 @@ int postcopy_preempt_setup(MigrationState *s, Error **errp)
>           return -1;
>       }
>   
> -    ioc = socket_send_channel_create_sync(errp);
> -
> -    if (ioc == NULL) {
> -        return -1;
> -    }
> -
> -    migration_ioc_register_yank(ioc);
> -    s->postcopy_qemufile_src = qemu_fopen_channel_output(ioc);
> -
> -    trace_postcopy_preempt_new_channel();
> +    /* Kick an async task to connect */
> +    socket_send_channel_create(postcopy_preempt_send_channel_new, s);
>   
>       return 0;
>   }
> diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h
> index 34b1080cde..6147bf7d1d 100644
> --- a/migration/postcopy-ram.h
> +++ b/migration/postcopy-ram.h
> @@ -192,5 +192,6 @@ enum PostcopyChannels {
>   
>   bool postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file);
>   int postcopy_preempt_setup(MigrationState *s, Error **errp);
> +int postcopy_preempt_wait_channel(MigrationState *s);
>   
>   #endif

[-- Attachment #2: Type: text/html, Size: 8896 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 15/21] migration: Parameter x-postcopy-preempt-break-huge
  2022-04-25 23:38 ` [PATCH v5 15/21] migration: Parameter x-postcopy-preempt-break-huge Peter Xu
  2022-05-12 17:42   ` Dr. David Alan Gilbert
@ 2022-05-16 15:20   ` manish.mishra
  1 sibling, 0 replies; 40+ messages in thread
From: manish.mishra @ 2022-05-16 15:20 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, Juan Quintela

[-- Attachment #1: Type: text/plain, Size: 2810 bytes --]


On 26/04/22 5:08 am, Peter Xu wrote:
> Add a parameter that can conditionally disable the "break sending huge
> page" behavior in postcopy preemption.  By default it's enabled.
>
> It should only be used for debugging purposes, and we should never remove
> the "x-" prefix.
>
> Signed-off-by: Peter Xu<peterx@redhat.com>
Reviewed-by: Manish Mishra <manish.mishra@nutanix.com>
> ---
>   migration/migration.c | 2 ++
>   migration/migration.h | 7 +++++++
>   migration/ram.c       | 7 +++++++
>   3 files changed, 16 insertions(+)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index cce741e20e..cd9650f2e2 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -4329,6 +4329,8 @@ static Property migration_properties[] = {
>       DEFINE_PROP_SIZE("announce-step", MigrationState,
>                         parameters.announce_step,
>                         DEFAULT_MIGRATE_ANNOUNCE_STEP),
> +    DEFINE_PROP_BOOL("x-postcopy-preempt-break-huge", MigrationState,
> +                      postcopy_preempt_break_huge, true),
>   
>       /* Migration capabilities */
>       DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE),
> diff --git a/migration/migration.h b/migration/migration.h
> index f898b8547a..6ee520642f 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -340,6 +340,13 @@ struct MigrationState {
>       bool send_configuration;
>       /* Whether we send section footer during migration */
>       bool send_section_footer;
> +    /*
> +     * Whether we allow break sending huge pages when postcopy preempt is
> +     * enabled.  When disabled, we won't interrupt precopy within sending a
> +     * host huge page, which is the old behavior of vanilla postcopy.
> +     * NOTE: this parameter is ignored if postcopy preempt is not enabled.
> +     */
> +    bool postcopy_preempt_break_huge;
>   
>       /* Needed by postcopy-pause state */
>       QemuSemaphore postcopy_pause_sem;
> diff --git a/migration/ram.c b/migration/ram.c
> index a4b39e3675..f3a79c8556 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2266,11 +2266,18 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss)
>   
>   static bool postcopy_needs_preempt(RAMState *rs, PageSearchStatus *pss)
>   {
> +    MigrationState *ms = migrate_get_current();
> +
>       /* Not enabled eager preempt?  Then never do that. */
>       if (!migrate_postcopy_preempt()) {
>           return false;
>       }
>   
> +    /* If the user explicitly disabled breaking of huge page, skip */
> +    if (!ms->postcopy_preempt_break_huge) {
> +        return false;
> +    }
> +
>       /* If the ramblock we're sending is a small page?  Never bother. */
>       if (qemu_ram_pagesize(pss->block) == TARGET_PAGE_SIZE) {
>           return false;

[-- Attachment #2: Type: text/html, Size: 4382 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 00/21] migration: Postcopy Preemption
  2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
                   ` (20 preceding siblings ...)
  2022-04-25 23:38 ` [PATCH v5 21/21] tests: Add postcopy preempt tests Peter Xu
@ 2022-05-16 15:25 ` manish.mishra
  2022-05-16 16:06   ` Peter Xu
  21 siblings, 1 reply; 40+ messages in thread
From: manish.mishra @ 2022-05-16 15:25 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, Juan Quintela


On 26/04/22 5:08 am, Peter Xu wrote:
> This is v5 of postcopy preempt series.  It can also be found here:
>
>    https://github.com/xzpeter/qemu/tree/postcopy-preempt
>
> RFC: https://lore.kernel.org/qemu-devel/20220119080929.39485-1-peterx@redhat.com
> V1:  https://lore.kernel.org/qemu-devel/20220216062809.57179-1-peterx@redhat.com
> V2:  https://lore.kernel.org/qemu-devel/20220301083925.33483-1-peterx@redhat.com
> V3:  https://lore.kernel.org/qemu-devel/20220330213908.26608-1-peterx@redhat.com
> V4:  https://lore.kernel.org/qemu-devel/20220331150857.74406-1-peterx@redhat.com
>
> v4->v5 changelog:
> - Fixed all checkpatch.pl warnings
> - Picked up leftover patches from Dan's tls test case series:
>    https://lore.kernel.org/qemu-devel/20220310171821.3724080-1-berrange@redhat.com/
> - Rebased to v7.0.0 tag, collected more R-bs from Dave/Dan
> - In migrate_fd_cleanup(), use g_clear_pointer() for s->hostname [Dan]
> - Mark postcopy-preempt capability for 7.1 not 7.0 [Dan]
> - Moved migrate_channel_requires_tls() into tls.[ch] [Dan]
> - Mention the bug-fixing side effect of patch "migration: Export
>    tls-[creds|hostname|authz] params to cmdline too" on tls_authz [Dan]
> - Use g_autoptr where proper [Dan]
> - Drop a few (probably over-cautious) asserts on local_err being set [Dan]
> - Quite a few renamings in the qtest in the last few test patches [Dan]
>
> Abstract
> ========
>
> This series contains two parts now:
>
>    (1) Leftover patches from Dan's tls unit tests v2, which is first half
>    (2) Leftover patches from my postcopy preempt v4, which is second half
>
> This series added a new migration capability called "postcopy-preempt".  It can
> be enabled when postcopy is enabled, and it'll simply (but greatly) speed up
> postcopy page requests handling process.
>
> Below are some initial postcopy page request latency measurements after the
> new series applied.
>
> For each page size, I measured page request latency for three cases:
>
>    (a) Vanilla:                the old postcopy
>    (b) Preempt no-break-huge:  preempt enabled, x-postcopy-preempt-break-huge=off
>    (c) Preempt full:           preempt enabled, x-postcopy-preempt-break-huge=on
>                                (this is the default option when preempt enabled)
>
> Here x-postcopy-preempt-break-huge parameter is just added in v2 so as to
> conditionally disable the behavior to break sending a precopy huge page for
> debugging purpose.  So when it's off, postcopy will not preempt precopy
> sending a huge page, but still postcopy will use its own channel.
>
> I tested it separately to give a rough idea on which part of the change
> helped how much of it.  The overall benefit should be the comparison
> between case (a) and (c).
>
>    |-----------+---------+-----------------------+--------------|
>    | Page size | Vanilla | Preempt no-break-huge | Preempt full |
>    |-----------+---------+-----------------------+--------------|
>    | 4K        |   10.68 |               N/A [*] |         0.57 |
>    | 2M        |   10.58 |                  5.49 |         5.02 |
>    | 1G        | 2046.65 |               933.185 |      649.445 |
>    |-----------+---------+-----------------------+--------------|
>    [*]: This case is N/A because 4K page does not contain huge page at all
>
> [1] https://github.com/xzpeter/small-stuffs/blob/master/tools/huge_vm/uffd-latency.bpf

Hi Peter, Just wanted understand what setup was used for these experiments like

number of vcpu, workload, network bandwidth so that i can make sense of these

numbers. Also i could not understand reason for so much difference between preempt

full and Preempt no-break-huge especially for 1G case, so if you please share little more

details on this.

>
> TODO List
> =========
>
> Avoid precopy write() blocks postcopy
> -------------------------------------
>
> I didn't prove this, but I always think the write() syscalls being blocked
> for precopy pages can affect postcopy services.  If we can solve this
> problem then my wild guess is we can further reduce the average page
> latency.
>
> Two solutions at least in mind: (1) we could have made the write side of
> the migration channel NON_BLOCK too, or (2) multi-threads on send side,
> just like multifd, but we may use lock to protect which page to send too
> (e.g., the core idea is we should _never_ rely anything on the main thread,
> multifd has that dependency on queuing pages only on main thread).
>
> That can definitely be done and thought about later.
>
> Multi-channel for preemption threads
> ------------------------------------
>
> Currently the postcopy preempt feature use only one extra channel and one
> extra thread on dest (no new thread on src QEMU).  It should be mostly good
> enough for major use cases, but when the postcopy queue is long enough
> (e.g. hundreds of vCPUs faulted on different pages) logically we could
> still observe more delays in average.  Whether growing threads/channels can
> solve it is debatable, but sounds worthwhile a try.  That's yet another
> thing we can think about after this patchset lands.
>
> Logically the design provides space for that - the receiving postcopy
> preempt thread can understand all ram-layer migration protocol, and for
> multi channel and multi threads we could simply grow that into multile
> threads handling the same protocol (with multiple PostcopyTmpPage).  The
> source needs more thoughts on synchronizations, though, but it shouldn't
> affect the whole protocol layer, so should be easy to keep compatible.
>
> Please review, thanks.
>
> Daniel P. Berrangé (9):
>    tests: fix encoding of IP addresses in x509 certs
>    tests: add more helper macros for creating TLS x509 certs
>    tests: add migration tests of TLS with PSK credentials
>    tests: add migration tests of TLS with x509 credentials
>    tests: convert XBZRLE migration test to use common helper
>    tests: convert multifd migration tests to use common helper
>    tests: add multifd migration tests of TLS with PSK credentials
>    tests: add multifd migration tests of TLS with x509 credentials
>    tests: ensure migration status isn't reported as failed
>
> Peter Xu (12):
>    migration: Add postcopy-preempt capability
>    migration: Postcopy preemption preparation on channel creation
>    migration: Postcopy preemption enablement
>    migration: Postcopy recover with preempt enabled
>    migration: Create the postcopy preempt channel asynchronously
>    migration: Parameter x-postcopy-preempt-break-huge
>    migration: Add helpers to detect TLS capability
>    migration: Export tls-[creds|hostname|authz] params to cmdline too
>    migration: Enable TLS for preempt channel
>    tests: Add postcopy tls migration test
>    tests: Add postcopy tls recovery migration test
>    tests: Add postcopy preempt tests
>
>   meson.build                          |   1 +
>   migration/channel.c                  |  10 +-
>   migration/migration.c                | 146 +++-
>   migration/migration.h                |  46 +-
>   migration/multifd.c                  |   7 +-
>   migration/postcopy-ram.c             | 186 ++++-
>   migration/postcopy-ram.h             |  11 +
>   migration/qemu-file.c                |  27 +
>   migration/qemu-file.h                |   1 +
>   migration/ram.c                      | 283 +++++++-
>   migration/ram.h                      |   4 +-
>   migration/savevm.c                   |  46 +-
>   migration/socket.c                   |  22 +-
>   migration/socket.h                   |   1 +
>   migration/tls.c                      |   9 +
>   migration/tls.h                      |   4 +
>   migration/trace-events               |  15 +-
>   qapi/migration.json                  |   8 +-
>   tests/qtest/meson.build              |  12 +-
>   tests/qtest/migration-helpers.c      |  13 +
>   tests/qtest/migration-helpers.h      |   1 +
>   tests/qtest/migration-test.c         | 997 ++++++++++++++++++++++++---
>   tests/unit/crypto-tls-psk-helpers.c  |  18 +-
>   tests/unit/crypto-tls-psk-helpers.h  |   1 +
>   tests/unit/crypto-tls-x509-helpers.c |  16 +-
>   tests/unit/crypto-tls-x509-helpers.h |  53 ++
>   tests/unit/test-crypto-tlssession.c  |  11 +-
>   27 files changed, 1779 insertions(+), 170 deletions(-)
>


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 13/21] migration: Postcopy recover with preempt enabled
  2022-05-16 14:51       ` manish.mishra
@ 2022-05-16 15:32         ` manish.mishra
  2022-05-16 15:58         ` Peter Xu
  1 sibling, 0 replies; 40+ messages in thread
From: manish.mishra @ 2022-05-16 15:32 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, Juan Quintela

[-- Attachment #1: Type: text/plain, Size: 4993 bytes --]


On 16/05/22 8:21 pm, manish.mishra wrote:
>
>
> On 16/05/22 7:41 pm, Peter Xu wrote:
>> Hi, Manish,
>>
>> On Mon, May 16, 2022 at 07:01:35PM +0530, manish.mishra wrote:
>>> On 26/04/22 5:08 am, Peter Xu wrote:
>>> LGTM,
>>> Peter, I wanted to give review-tag for this and ealier patch too. I am new
>>> to qemu
>>> review process so not sure how give review-tag, did not find any reference
>>> on
>>> google too. So if you please let me know how to do it.
>> It's here:
>>
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__git.qemu.org_-3Fp-3Dqemu.git-3Ba-3Dblob-3Bf-3Ddocs_devel_submitting-2Da-2Dpatch.rst-3Bhb-3DHEAD-23l492&d=DwIBaQ&c=s883GpUCOChKOHiocYtGcg&r=c4KON2DiMd-szjwjggQcuUvTsPWblztAL0gVzaHnNmc&m=8LU6rphEJ5GMSXEpSxe8JZ_hpn6TQDUXfjWM6Vt7DdShxnU3X5zYXbAMBLPYchdK&s=TUNUCtdl7LWhrdlfnIx1F08kC0d9IMvArl6cNMpfXkc&e=  
>>
>> Since afaict QEMU is mostly following what Linux does, you can also
>> reference to this one with more context:
>>
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.kernel.org_doc_html_v4.17_process_submitting-2Dpatches.html-23using-2Dreported-2Dby-2Dtested-2Dby-2Dreviewed-2Dby-2Dsuggested-2Dby-2Dand-2Dfixes&d=DwIBaQ&c=s883GpUCOChKOHiocYtGcg&r=c4KON2DiMd-szjwjggQcuUvTsPWblztAL0gVzaHnNmc&m=8LU6rphEJ5GMSXEpSxe8JZ_hpn6TQDUXfjWM6Vt7DdShxnU3X5zYXbAMBLPYchdK&s=TJmr_eC4LAccVY1EqgkLleXfJhUgtIjTJmLc3cedYr0&e=  
>>
>> But since you're still having question regarding this patch, no rush on
>> providing your R-bs; let's finish the discussion first.
>>
>> [...]
>>
>>>> +static void postcopy_pause_ram_fast_load(MigrationIncomingState *mis)
>>>> +{
>>>> +    trace_postcopy_pause_fast_load();
>>>> +    qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex);
>>> I may have misunderstood synchronisation here but very very rare chance,
>>>
>>> as both threads are working independently is it possible qemu_sem_post on
>>>
>>> postcopy_pause_sem_fast_load is called by main thread even before we go
>>>
>>> to qemu_sem_wait in next line, causing some kind of deadlock. That's should
>>>
>>> not be possible as i understand it requires manually calling
>>> qmp_migration_recover
>>>
>>> so chances are almost impossible. Just wanted to confirm it.
>> Sorry I don't quite get the question, could you elaborate?  E.g., when the
>> described deadlock happened, what is both main thread and preempt load
>> thread doing?  What are they waiting at?
>>
>> Note: we have already released mutex before waiting on sem.
>
> What i meant here is deadlock could be due the reason that we infinately wait
>
> on qemu_sem_wait(&mis->postcopy_pause_sem_fast_load), because main
>
> thread already called post on postcopy_pause_sem_fast_load after recovery
>
> even before we moved to qemu_sem_wait(&mis->postcopy_pause_sem_fast_load)
>
> in next line. Basically if we miss a post on postcopy_pause_sem_fast_load.
>
> This is nearly impossibily case becuase it requires full recovery path to be completed
>
> before this thread executes just next line. Also as recovery needs to be called manually,
>
> So please ignore this.
>
> Basically i wanted to check if we should use something like
>
> int pthread_cond_wait(pthread_cond_t *restrict cond,
>                     pthread_mutex_t *restrict mutex);
>
> so that there is no race between releasing mutex and calling wait.
>
Really sorry, please ignore this it is sem_post and sem_wait so that is not possible.
>
>>>> +    qemu_sem_wait(&mis->postcopy_pause_sem_fast_load);
>>> Just wanted to confirm why postcopy_pause_incoming is not called here
>>> itself.
>> postcopy_pause_incoming() is only used in the main ram load thread, while
>> this function (postcopy_pause_ram_fast_load) is only called by the preempt
>> load thread.
>>
> ok got it, thanks Peter, i meant if we should close both the channels as soon
>
> as we relise there is some failure instead of main thread waiting for error event
>
> and then closing and pausing post-copy. But agree current approach is good.
>
>>> Is it based on assumption that if there is error in any of the channel it
>>> will
>>>
>>> eventually be paused on source side, closing both channels, resulting
>>>
>>> postcopy_pause_incoming will be called from main thread on destination?
>> Yes.
>>
>>> Usually it should be good to call as early as possible. It is left to main
>>>
>>> thread default path so that we do not have any synchronisation overhead?
>> What's the sync overhead you mentioned? What we want to do here is simply
>> to put all the dest QEMU migration threads into a halted state rather than
>> quitting, so that they can be continued when necessary.
>>
>>> Also Peter, i was trying to understand postcopy recovery model so is use
>>> case
>>>
>>> of qmp_migrate_pause just for debugging purpose?
>> Yes.  It's also a way to cleanly stop using the network (comparing to force
>> unplug the nic ports?) for whatever reason with a shutdown() syscall upon
>> the socket.  I just don't know whether there's any real use case of that in
>> reality.
>>
>> Thanks,
>>

[-- Attachment #2: Type: text/html, Size: 8611 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 13/21] migration: Postcopy recover with preempt enabled
  2022-05-16 14:51       ` manish.mishra
  2022-05-16 15:32         ` manish.mishra
@ 2022-05-16 15:58         ` Peter Xu
  1 sibling, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-05-16 15:58 UTC (permalink / raw)
  To: manish.mishra
  Cc: qemu-devel, Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, Juan Quintela

On Mon, May 16, 2022 at 08:21:23PM +0530, manish.mishra wrote:
> 
> On 16/05/22 7:41 pm, Peter Xu wrote:
> > Hi, Manish,
> > 
> > On Mon, May 16, 2022 at 07:01:35PM +0530, manish.mishra wrote:
> > > On 26/04/22 5:08 am, Peter Xu wrote:
> > > LGTM,
> > > Peter, I wanted to give review-tag for this and ealier patch too. I am new
> > > to qemu
> > > review process so not sure how give review-tag, did not find any reference
> > > on
> > > google too. So if you please let me know how to do it.
> > It's here:
> > 
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__git.qemu.org_-3Fp-3Dqemu.git-3Ba-3Dblob-3Bf-3Ddocs_devel_submitting-2Da-2Dpatch.rst-3Bhb-3DHEAD-23l492&d=DwIBaQ&c=s883GpUCOChKOHiocYtGcg&r=c4KON2DiMd-szjwjggQcuUvTsPWblztAL0gVzaHnNmc&m=8LU6rphEJ5GMSXEpSxe8JZ_hpn6TQDUXfjWM6Vt7DdShxnU3X5zYXbAMBLPYchdK&s=TUNUCtdl7LWhrdlfnIx1F08kC0d9IMvArl6cNMpfXkc&e=
> > 
> > Since afaict QEMU is mostly following what Linux does, you can also
> > reference to this one with more context:
> > 
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__www.kernel.org_doc_html_v4.17_process_submitting-2Dpatches.html-23using-2Dreported-2Dby-2Dtested-2Dby-2Dreviewed-2Dby-2Dsuggested-2Dby-2Dand-2Dfixes&d=DwIBaQ&c=s883GpUCOChKOHiocYtGcg&r=c4KON2DiMd-szjwjggQcuUvTsPWblztAL0gVzaHnNmc&m=8LU6rphEJ5GMSXEpSxe8JZ_hpn6TQDUXfjWM6Vt7DdShxnU3X5zYXbAMBLPYchdK&s=TJmr_eC4LAccVY1EqgkLleXfJhUgtIjTJmLc3cedYr0&e=
> > 
> > But since you're still having question regarding this patch, no rush on
> > providing your R-bs; let's finish the discussion first.
> > 
> > [...]
> > 
> > > > +static void postcopy_pause_ram_fast_load(MigrationIncomingState *mis)
> > > > +{
> > > > +    trace_postcopy_pause_fast_load();
> > > > +    qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex);
> > > I may have misunderstood synchronisation here but very very rare chance,
> > > 
> > > as both threads are working independently is it possible qemu_sem_post on
> > > 
> > > postcopy_pause_sem_fast_load is called by main thread even before we go
> > > 
> > > to qemu_sem_wait in next line, causing some kind of deadlock. That's should
> > > 
> > > not be possible as i understand it requires manually calling
> > > qmp_migration_recover
> > > 
> > > so chances are almost impossible. Just wanted to confirm it.
> > Sorry I don't quite get the question, could you elaborate?  E.g., when the
> > described deadlock happened, what is both main thread and preempt load
> > thread doing?  What are they waiting at?
> > 
> > Note: we have already released mutex before waiting on sem.
> 
> What i meant here is deadlock could be due the reason that we infinately wait
> 
> on qemu_sem_wait(&mis->postcopy_pause_sem_fast_load), because main
> 
> thread already called post on postcopy_pause_sem_fast_load after recovery
> 
> even before we moved to qemu_sem_wait(&mis->postcopy_pause_sem_fast_load)
> 
> in next line. Basically if we miss a post on postcopy_pause_sem_fast_load.
> 
> This is nearly impossibily case becuase it requires full recovery path to be completed
> 
> before this thread executes just next line. Also as recovery needs to be called manually,
> 
> So please ignore this.

Yes the migration state has a dependency.

The other more obvious reason it won't happen is that sem is number based
and it remembers.  Please try below:

    sem_t *sem = sem_open("test", O_CREAT);
    sem_post(sem);
    sem_wait(sem);

And sem_wait() will return immediately because post() already set it to 1.

> 
> Basically i wanted to check if we should use something like
> 
> int pthread_cond_wait(pthread_cond_t *restrict cond,
>                    pthread_mutex_t *restrict mutex);
> 
> so that there is no race between releasing mutex and calling wait.

Yes I think condvar should also work here but it should be stricter.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 00/21] migration: Postcopy Preemption
  2022-05-16 15:25 ` [PATCH v5 00/21] migration: Postcopy Preemption manish.mishra
@ 2022-05-16 16:06   ` Peter Xu
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-05-16 16:06 UTC (permalink / raw)
  To: manish.mishra
  Cc: qemu-devel, Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, Juan Quintela

On Mon, May 16, 2022 at 08:55:50PM +0530, manish.mishra wrote:
> 
> On 26/04/22 5:08 am, Peter Xu wrote:
> > This is v5 of postcopy preempt series.  It can also be found here:
> > 
> >    https://github.com/xzpeter/qemu/tree/postcopy-preempt
> > 
> > RFC: https://lore.kernel.org/qemu-devel/20220119080929.39485-1-peterx@redhat.com
> > V1:  https://lore.kernel.org/qemu-devel/20220216062809.57179-1-peterx@redhat.com
> > V2:  https://lore.kernel.org/qemu-devel/20220301083925.33483-1-peterx@redhat.com
> > V3:  https://lore.kernel.org/qemu-devel/20220330213908.26608-1-peterx@redhat.com
> > V4:  https://lore.kernel.org/qemu-devel/20220331150857.74406-1-peterx@redhat.com
> > 
> > v4->v5 changelog:
> > - Fixed all checkpatch.pl warnings
> > - Picked up leftover patches from Dan's tls test case series:
> >    https://lore.kernel.org/qemu-devel/20220310171821.3724080-1-berrange@redhat.com/
> > - Rebased to v7.0.0 tag, collected more R-bs from Dave/Dan
> > - In migrate_fd_cleanup(), use g_clear_pointer() for s->hostname [Dan]
> > - Mark postcopy-preempt capability for 7.1 not 7.0 [Dan]
> > - Moved migrate_channel_requires_tls() into tls.[ch] [Dan]
> > - Mention the bug-fixing side effect of patch "migration: Export
> >    tls-[creds|hostname|authz] params to cmdline too" on tls_authz [Dan]
> > - Use g_autoptr where proper [Dan]
> > - Drop a few (probably over-cautious) asserts on local_err being set [Dan]
> > - Quite a few renamings in the qtest in the last few test patches [Dan]
> > 
> > Abstract
> > ========
> > 
> > This series contains two parts now:
> > 
> >    (1) Leftover patches from Dan's tls unit tests v2, which is first half
> >    (2) Leftover patches from my postcopy preempt v4, which is second half
> > 
> > This series added a new migration capability called "postcopy-preempt".  It can
> > be enabled when postcopy is enabled, and it'll simply (but greatly) speed up
> > postcopy page requests handling process.
> > 
> > Below are some initial postcopy page request latency measurements after the
> > new series applied.
> > 
> > For each page size, I measured page request latency for three cases:
> > 
> >    (a) Vanilla:                the old postcopy
> >    (b) Preempt no-break-huge:  preempt enabled, x-postcopy-preempt-break-huge=off
> >    (c) Preempt full:           preempt enabled, x-postcopy-preempt-break-huge=on
> >                                (this is the default option when preempt enabled)
> > 
> > Here x-postcopy-preempt-break-huge parameter is just added in v2 so as to
> > conditionally disable the behavior to break sending a precopy huge page for
> > debugging purpose.  So when it's off, postcopy will not preempt precopy
> > sending a huge page, but still postcopy will use its own channel.
> > 
> > I tested it separately to give a rough idea on which part of the change
> > helped how much of it.  The overall benefit should be the comparison
> > between case (a) and (c).
> > 
> >    |-----------+---------+-----------------------+--------------|
> >    | Page size | Vanilla | Preempt no-break-huge | Preempt full |
> >    |-----------+---------+-----------------------+--------------|
> >    | 4K        |   10.68 |               N/A [*] |         0.57 |
> >    | 2M        |   10.58 |                  5.49 |         5.02 |
> >    | 1G        | 2046.65 |               933.185 |      649.445 |
> >    |-----------+---------+-----------------------+--------------|
> >    [*]: This case is N/A because 4K page does not contain huge page at all
> > 
> > [1] https://github.com/xzpeter/small-stuffs/blob/master/tools/huge_vm/uffd-latency.bpf
> 
> Hi Peter, Just wanted understand what setup was used for these experiments like
> 
> number of vcpu, workload, network bandwidth so that i can make sense of these

40 vcpus, 20GB mem, workload is using mig_mon single thread dirtying
workload:

https://github.com/xzpeter/mig_mon

Network is 10gbps one port.

Another thing to mention is that all these numbers are average page
latencies.

> 
> numbers. Also i could not understand reason for so much difference between preempt
> 
> full and Preempt no-break-huge especially for 1G case, so if you please share little more
> 
> details on this.

The break-huge change is the part where the requested page comes within
sending one huge page in the precopy channel, so we can halt sending that
page and jumps quickly to the postcopy channel for sending that huge page.
When with no-break-huge option we use the separate channel, but we won't
start sending the postcopy page (via postcopy preempt channel) until the
precopy finishes sending the current page.

Please don't trust too much on the 1G use case because the samples are very
much limited (total of 20 pages, and my test memory should be only upon
10+GB which I forgot).

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 14/21] migration: Create the postcopy preempt channel asynchronously
  2022-05-16 15:02   ` manish.mishra
@ 2022-05-16 16:21     ` Peter Xu
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2022-05-16 16:21 UTC (permalink / raw)
  To: manish.mishra
  Cc: qemu-devel, Leonardo Bras Soares Passos, Daniel P . Berrange,
	Dr . David Alan Gilbert, Juan Quintela

On Mon, May 16, 2022 at 08:32:01PM +0530, manish.mishra wrote:
> Reviewed-by: Manish Mishra <manish.mishra@nutanix.com>
> 
> This also looks good to me, sorry if i mis-understood and this is not correct way to add review tag.

This is, and I'll collect it when I repost. :)

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2022-05-16 17:36 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-25 23:38 [PATCH v5 00/21] migration: Postcopy Preemption Peter Xu
2022-04-25 23:38 ` [PATCH v5 01/21] tests: fix encoding of IP addresses in x509 certs Peter Xu
2022-04-25 23:38 ` [PATCH v5 02/21] tests: add more helper macros for creating TLS " Peter Xu
2022-04-25 23:38 ` [PATCH v5 03/21] tests: add migration tests of TLS with PSK credentials Peter Xu
2022-04-25 23:38 ` [PATCH v5 04/21] tests: add migration tests of TLS with x509 credentials Peter Xu
2022-04-25 23:38 ` [PATCH v5 05/21] tests: convert XBZRLE migration test to use common helper Peter Xu
2022-04-25 23:38 ` [PATCH v5 06/21] tests: convert multifd migration tests " Peter Xu
2022-04-25 23:38 ` [PATCH v5 07/21] tests: add multifd migration tests of TLS with PSK credentials Peter Xu
2022-04-25 23:38 ` [PATCH v5 08/21] tests: add multifd migration tests of TLS with x509 credentials Peter Xu
2022-04-25 23:38 ` [PATCH v5 09/21] tests: ensure migration status isn't reported as failed Peter Xu
2022-04-25 23:38 ` [PATCH v5 10/21] migration: Add postcopy-preempt capability Peter Xu
2022-04-25 23:38 ` [PATCH v5 11/21] migration: Postcopy preemption preparation on channel creation Peter Xu
2022-05-11 17:08   ` manish.mishra
2022-05-12 16:36     ` Peter Xu
2022-04-25 23:38 ` [PATCH v5 12/21] migration: Postcopy preemption enablement Peter Xu
2022-04-25 23:38 ` [PATCH v5 13/21] migration: Postcopy recover with preempt enabled Peter Xu
2022-05-16 13:31   ` manish.mishra
2022-05-16 14:11     ` Peter Xu
2022-05-16 14:51       ` manish.mishra
2022-05-16 15:32         ` manish.mishra
2022-05-16 15:58         ` Peter Xu
2022-04-25 23:38 ` [PATCH v5 14/21] migration: Create the postcopy preempt channel asynchronously Peter Xu
2022-05-12 17:35   ` Dr. David Alan Gilbert
2022-05-16 15:02   ` manish.mishra
2022-05-16 16:21     ` Peter Xu
2022-04-25 23:38 ` [PATCH v5 15/21] migration: Parameter x-postcopy-preempt-break-huge Peter Xu
2022-05-12 17:42   ` Dr. David Alan Gilbert
2022-05-16 15:20   ` manish.mishra
2022-04-25 23:38 ` [PATCH v5 16/21] migration: Add helpers to detect TLS capability Peter Xu
2022-05-12 17:46   ` Dr. David Alan Gilbert
2022-04-25 23:38 ` [PATCH v5 17/21] migration: Export tls-[creds|hostname|authz] params to cmdline too Peter Xu
2022-05-12 18:03   ` Dr. David Alan Gilbert
2022-05-12 19:05     ` Daniel P. Berrangé
2022-05-12 20:07       ` Peter Xu
2022-04-25 23:38 ` [PATCH v5 18/21] migration: Enable TLS for preempt channel Peter Xu
2022-04-25 23:38 ` [PATCH v5 19/21] tests: Add postcopy tls migration test Peter Xu
2022-04-25 23:38 ` [PATCH v5 20/21] tests: Add postcopy tls recovery " Peter Xu
2022-04-25 23:38 ` [PATCH v5 21/21] tests: Add postcopy preempt tests Peter Xu
2022-05-16 15:25 ` [PATCH v5 00/21] migration: Postcopy Preemption manish.mishra
2022-05-16 16:06   ` Peter Xu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.