* [Qemu-devel] [PATCH v5 00/17] Multifd
@ 2017-07-17 13:42 Juan Quintela
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 01/17] migrate: Add gboolean return type to migrate_channel_process_incoming Juan Quintela
                   ` (16 more replies)
  0 siblings, 17 replies; 93+ messages in thread
From: Juan Quintela @ 2017-07-17 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: dgilbert, lvivier, peterx, berrange


Hi

This is a new version of the multifd series; changes since the last version:

- tests for the qio functions (a.k.a. make danp happy)
- 1st message from one channel to the other contains:
   <uuid> multifd <channel number>
   This would allow us to create more channels as we want them
   (see the example after this list).
   a.k.a. Making dave happy
- Waiting in reception for new channels using qio listeners.
  Getting threads, qio and reference counters working at the same time
  was interesting.
  Another "make danp happy".

- Lots and lots of small changes and fixes.  Notice that the last 70
  patches or so that I merged were meant to make this series easier/smaller.

- NOT DONE: I haven't been working on measuring performance
  differences; this version was about getting the creation of the
  threads/channels right.
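
As an example of the greeting mentioned above: with the format string
"%s multifd %03d" used later in the series (patch 09) and the default
UUID that is used when qemu is started without one, the third extra
channel would announce itself as:

   5c49fd7e-af88-4a07-b6e8-091fd696ad40 multifd 002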

So, what I want:

- Are people happy with how I have (ab)used qio channels? (yes danp,
  that is you).
- My understanding is th

ToDo:

- Make paolo happy: He wanted to test using control information
  through each channel, not only pages.  This requires yet more
  cleanups to be able to have more than one QEMUFile/RAMState open at
  the same time.

- How I create multiple channels.  Things I know:
  * with current changes, it should work with fd/channels (the multifd bits),
    but we don't have a way to pass multiple fds or exec files.
    Danp, any idea about how to create a UI for it?
  * My idea is that we would split current code to be:
    + channel creation at migration.c
    + rest of bits at ram.c
    + change format to:
      <uuid> main <rest of migration capabilities/parameters> so we can check
      <uuid> postcopy <no clue what parameters are needed>
          Dave wanted a way to create a new fd for postcopy for some time
    + Adding new channels is easy

- Performance data/numbers: I wanted to get this out first; I will
  continue with this.


Please, review.


[v4]
This is the 4th version of multifd. Changes:
- XBZRLE doesn't need to be checked for
- Documentation and defaults are consistent
- split socketArgs
- use iovec instead of creating something similar.
- We now use the exported target page size (another HACK removal)
- created qio_channel_{writev,readv}_all functions.  The _full() name
  was already taken.
  They do the same as the functions without _all(), but if the call
  returns due to blocking they redo it.
- it is checkpatch.pl clean now.

Please comment, Juan.




[v3]

- comments for previous version addressed
- lots of bugs fixed
- remove DPRINTF from ram.c

- add multifd-group parameter; it sets how many pages we send each
  time to the worker threads.  I am open to better names.
- Better flush support.
- with migrate_set_speed 2G it is able to migrate "stress -vm 2
  -vm-bytes 512M" over loopback.

Please review.

Thanks, Juan.

[v2]

This is a version against current code.  It is based on top of the QIO
work.  It improves the thread synchronization and fixes the problem
where we could have two threads handling the same page.

Please comment, Juan.




Juan Quintela (17):
  migrate: Add gboolean return type to migrate_channel_process_incoming
  migration: Create migration_ioc_process_incoming()
  qio: Create new qio_channel_{readv,writev}_all
  migration: Add multifd capability
  migration: Create x-multifd-threads parameter
  migration: Create x-multifd-group parameter
  migration: Create multifd migration threads
  migration: Split migration_fd_process_incomming
  migration: Start of multiple fd work
  migration: Create ram_multifd_page
  migration: Really use multiple pages at a time
  migration: Send the fd number which we are going to use for this page
  migration: Create thread infrastructure for multifd recv side
  migration: Delay the start of reception on main channel
  migration: Test new fd infrastructure
  migration: Transfer pages over new channels
  migration: Flush receive queue

 hmp.c                          |  17 ++
 include/io/channel.h           |  46 ++++
 io/channel.c                   |  76 ++++++
 migration/channel.c            |   6 +-
 migration/channel.h            |   2 +-
 migration/exec.c               |   6 +-
 migration/migration.c          | 128 +++++++++-
 migration/migration.h          |   5 +
 migration/qemu-file-channel.c  |  29 +--
 migration/ram.c                | 541 ++++++++++++++++++++++++++++++++++++++++-
 migration/ram.h                |   7 +
 migration/socket.c             |  50 +++-
 migration/socket.h             |  10 +
 qapi-schema.json               |  32 ++-
 tests/io-channel-helpers.c     |  55 +++++
 tests/io-channel-helpers.h     |   4 +
 tests/test-io-channel-buffer.c |  55 ++++-
 17 files changed, 1010 insertions(+), 59 deletions(-)

-- 
2.9.4


* [Qemu-devel] [PATCH v5 01/17] migrate: Add gboolean return type to migrate_channel_process_incoming
  2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
@ 2017-07-17 13:42 ` Juan Quintela
  2017-07-19 15:01   ` Dr. David Alan Gilbert
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 02/17] migration: Create migration_ioc_process_incoming() Juan Quintela
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-07-17 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: dgilbert, lvivier, peterx, berrange

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/channel.c |  3 ++-
 migration/channel.h |  2 +-
 migration/exec.c    |  6 ++++--
 migration/socket.c  | 12 ++++++++----
 4 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/migration/channel.c b/migration/channel.c
index 3b7252f..719055d 100644
--- a/migration/channel.c
+++ b/migration/channel.c
@@ -19,7 +19,7 @@
 #include "qapi/error.h"
 #include "io/channel-tls.h"
 
-void migration_channel_process_incoming(QIOChannel *ioc)
+gboolean migration_channel_process_incoming(QIOChannel *ioc)
 {
     MigrationState *s = migrate_get_current();
 
@@ -39,6 +39,7 @@ void migration_channel_process_incoming(QIOChannel *ioc)
         QEMUFile *f = qemu_fopen_channel_input(ioc);
         migration_fd_process_incoming(f);
     }
+    return FALSE; /* unregister */
 }
 
 
diff --git a/migration/channel.h b/migration/channel.h
index e4b4057..72cbc9f 100644
--- a/migration/channel.h
+++ b/migration/channel.h
@@ -18,7 +18,7 @@
 
 #include "io/channel.h"
 
-void migration_channel_process_incoming(QIOChannel *ioc);
+gboolean migration_channel_process_incoming(QIOChannel *ioc);
 
 void migration_channel_connect(MigrationState *s,
                                QIOChannel *ioc,
diff --git a/migration/exec.c b/migration/exec.c
index 08b599e..2827f15 100644
--- a/migration/exec.c
+++ b/migration/exec.c
@@ -47,9 +47,11 @@ static gboolean exec_accept_incoming_migration(QIOChannel *ioc,
                                                GIOCondition condition,
                                                gpointer opaque)
 {
-    migration_channel_process_incoming(ioc);
+    gboolean result;
+
+    result = migration_channel_process_incoming(ioc);
     object_unref(OBJECT(ioc));
-    return FALSE; /* unregister */
+    return result;
 }
 
 void exec_start_incoming_migration(const char *command, Error **errp)
diff --git a/migration/socket.c b/migration/socket.c
index 757d382..6195596 100644
--- a/migration/socket.c
+++ b/migration/socket.c
@@ -136,25 +136,29 @@ static gboolean socket_accept_incoming_migration(QIOChannel *ioc,
 {
     QIOChannelSocket *sioc;
     Error *err = NULL;
+    gboolean result;
 
     sioc = qio_channel_socket_accept(QIO_CHANNEL_SOCKET(ioc),
                                      &err);
     if (!sioc) {
         error_report("could not accept migration connection (%s)",
                      error_get_pretty(err));
+        result = FALSE; /* unregister */
         goto out;
     }
 
     trace_migration_socket_incoming_accepted();
 
     qio_channel_set_name(QIO_CHANNEL(sioc), "migration-socket-incoming");
-    migration_channel_process_incoming(QIO_CHANNEL(sioc));
+    result = migration_channel_process_incoming(QIO_CHANNEL(sioc));
     object_unref(OBJECT(sioc));
 
 out:
-    /* Close listening socket as its no longer needed */
-    qio_channel_close(ioc, NULL);
-    return FALSE; /* unregister */
+    if (result == FALSE) {
+        /* Close listening socket as it's no longer needed */
+        qio_channel_close(ioc, NULL);
+    }
+    return result;
 }
 
 
-- 
2.9.4

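The reason for the new gboolean return type is the glib watch convention
that these accept callbacks follow: a watch callback that returns FALSE is
unregistered after it runs, while returning TRUE keeps it alive, which is
what will later let the listening socket keep accepting further (multifd)
channels.  A minimal standalone glib sketch of just that convention,
independent of QEMU (the names tick and quit_cb are made up for
illustration; build against glib-2.0 with pkg-config):

    #include <glib.h>

    /* Returning TRUE keeps the source registered, returning FALSE
     * removes it -- the convention the accept callbacks now follow. */
    static gboolean tick(gpointer data)
    {
        int *count = data;

        g_print("tick %d\n", ++*count);
        return *count < 3;      /* stay registered for three calls */
    }

    /* One-shot source: quit the loop and unregister itself */
    static gboolean quit_cb(gpointer data)
    {
        g_main_loop_quit(data);
        return FALSE;
    }

    int main(void)
    {
        GMainLoop *loop = g_main_loop_new(NULL, FALSE);
        int count = 0;

        g_timeout_add(100, tick, &count);
        g_timeout_add(1000, quit_cb, loop);
        g_main_loop_run(loop);
        g_main_loop_unref(loop);
        return 0;
    }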

* [Qemu-devel] [PATCH v5 02/17] migration: Create migration_ioc_process_incoming()
  2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 01/17] migrate: Add gboolean return type to migrate_channel_process_incoming Juan Quintela
@ 2017-07-17 13:42 ` Juan Quintela
  2017-07-19 13:38   ` Daniel P. Berrange
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 03/17] qio: Create new qio_channel_{readv, writev}_all Juan Quintela
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-07-17 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: dgilbert, lvivier, peterx, berrange

We need to receive the ioc to be able to implement multifd.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/channel.c   |  3 +--
 migration/migration.c | 16 +++++++++++++---
 migration/migration.h |  2 ++
 3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/migration/channel.c b/migration/channel.c
index 719055d..5b777ef 100644
--- a/migration/channel.c
+++ b/migration/channel.c
@@ -36,8 +36,7 @@ gboolean migration_channel_process_incoming(QIOChannel *ioc)
             error_report_err(local_err);
         }
     } else {
-        QEMUFile *f = qemu_fopen_channel_input(ioc);
-        migration_fd_process_incoming(f);
+        return migration_ioc_process_incoming(ioc);
     }
     return FALSE; /* unregister */
 }
diff --git a/migration/migration.c b/migration/migration.c
index a0db40d..c24ad03 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -299,17 +299,15 @@ static void process_incoming_migration_bh(void *opaque)
 
 static void process_incoming_migration_co(void *opaque)
 {
-    QEMUFile *f = opaque;
     MigrationIncomingState *mis = migration_incoming_get_current();
     PostcopyState ps;
     int ret;
 
-    mis->from_src_file = f;
     mis->largest_page_size = qemu_ram_pagesize_largest();
     postcopy_state_set(POSTCOPY_INCOMING_NONE);
     migrate_set_state(&mis->state, MIGRATION_STATUS_NONE,
                       MIGRATION_STATUS_ACTIVE);
-    ret = qemu_loadvm_state(f);
+    ret = qemu_loadvm_state(mis->from_src_file);
 
     ps = postcopy_state_get();
     trace_process_incoming_migration_co_end(ret, ps);
@@ -362,6 +360,18 @@ void migration_fd_process_incoming(QEMUFile *f)
     qemu_coroutine_enter(co);
 }
 
+gboolean migration_ioc_process_incoming(QIOChannel *ioc)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+
+    if (!mis->from_src_file) {
+        QEMUFile *f = qemu_fopen_channel_input(ioc);
+        mis->from_src_file = f;
+        migration_fd_process_incoming(f);
+    }
+    return FALSE; /* unregister */
+}
+
 /*
  * Send a 'SHUT' message on the return channel with the given value
  * to indicate that we've finished with the RP.  Non-0 value indicates
diff --git a/migration/migration.h b/migration/migration.h
index 148c9fa..5a18aea 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -20,6 +20,7 @@
 #include "exec/cpu-common.h"
 #include "qemu/coroutine_int.h"
 #include "hw/qdev.h"
+#include "io/channel.h"
 
 /* State for the incoming migration */
 struct MigrationIncomingState {
@@ -152,6 +153,7 @@ struct MigrationState
 void migrate_set_state(int *state, int old_state, int new_state);
 
 void migration_fd_process_incoming(QEMUFile *f);
+gboolean migration_ioc_process_incoming(QIOChannel *ioc);
 
 uint64_t migrate_max_downtime(void);
 
-- 
2.9.4


* [Qemu-devel] [PATCH v5 03/17] qio: Create new qio_channel_{readv, writev}_all
  2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 01/17] migrate: Add gboolean return type to migrate_channel_process_incoming Juan Quintela
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 02/17] migration: Create migration_ioc_process_incoming() Juan Quintela
@ 2017-07-17 13:42 ` Juan Quintela
  2017-07-19 13:44   ` Daniel P. Berrange
  2017-07-19 15:42   ` Dr. David Alan Gilbert
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 04/17] migration: Add multifd capability Juan Quintela
                   ` (13 subsequent siblings)
  16 siblings, 2 replies; 93+ messages in thread
From: Juan Quintela @ 2017-07-17 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: dgilbert, lvivier, peterx, berrange

The functions wait until they are able to transfer the full iov.

Signed-off-by: Juan Quintela <quintela@redhat.com>

--

Add tests.
---
 include/io/channel.h           | 46 +++++++++++++++++++++++++
 io/channel.c                   | 76 ++++++++++++++++++++++++++++++++++++++++++
 migration/qemu-file-channel.c  | 29 +---------------
 tests/io-channel-helpers.c     | 55 ++++++++++++++++++++++++++++++
 tests/io-channel-helpers.h     |  4 +++
 tests/test-io-channel-buffer.c | 55 ++++++++++++++++++++++++++++--
 6 files changed, 234 insertions(+), 31 deletions(-)

diff --git a/include/io/channel.h b/include/io/channel.h
index db9bb02..bfc97e2 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -269,6 +269,52 @@ ssize_t qio_channel_writev_full(QIOChannel *ioc,
                                 Error **errp);
 
 /**
+ * qio_channel_readv_all:
+ * @ioc: the channel object
+ * @iov: the array of memory regions to read data into
+ * @niov: the length of the @iov array
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Read data from the IO channel, storing it in the
+ * memory regions referenced by @iov. Each element
+ * in the @iov will be fully populated with data
+ * before the next one is used. The @niov parameter
+ * specifies the total number of elements in @iov.
+ *
+ * Returns: the number of bytes read, or -1 on error.
+ * Unlike qio_channel_readv(), this function retries
+ * internally instead of returning QIO_CHANNEL_ERR_BLOCK.
+ */
+ssize_t qio_channel_readv_all(QIOChannel *ioc,
+                              const struct iovec *iov,
+                              size_t niov,
+                              Error **errp);
+
+
+/**
+ * qio_channel_writev_all:
+ * @ioc: the channel object
+ * @iov: the array of memory regions to write data from
+ * @niov: the length of the @iov array
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Write data to the IO channel, reading it from the
+ * memory regions referenced by @iov. Each element
+ * in the @iov will be fully sent, before the next
+ * one is used. The @niov parameter specifies the
+ * total number of elements in @iov.
+ *
+ * It is required for all @iov data to be fully
+ * sent; the function retries until that happens.
+ *
+ * Returns: the number of bytes sent, or -1 on error.
+ */
+ssize_t qio_channel_writev_all(QIOChannel *ioc,
+                               const struct iovec *iov,
+                               size_t niov,
+                               Error **errp);
+
+/**
  * qio_channel_readv:
  * @ioc: the channel object
  * @iov: the array of memory regions to read data into
diff --git a/io/channel.c b/io/channel.c
index cdf7454..82203ef 100644
--- a/io/channel.c
+++ b/io/channel.c
@@ -22,6 +22,7 @@
 #include "io/channel.h"
 #include "qapi/error.h"
 #include "qemu/main-loop.h"
+#include "qemu/iov.h"
 
 bool qio_channel_has_feature(QIOChannel *ioc,
                              QIOChannelFeature feature)
@@ -85,6 +86,81 @@ ssize_t qio_channel_writev_full(QIOChannel *ioc,
 }
 
 
+
+ssize_t qio_channel_readv_all(QIOChannel *ioc,
+                              const struct iovec *iov,
+                              size_t niov,
+                              Error **errp)
+{
+    ssize_t done = 0;
+    struct iovec *local_iov = g_new(struct iovec, niov);
+    struct iovec *local_iov_head = local_iov;
+    unsigned int nlocal_iov = niov;
+
+    nlocal_iov = iov_copy(local_iov, nlocal_iov,
+                          iov, niov,
+                          0, iov_size(iov, niov));
+
+    while (nlocal_iov > 0) {
+        ssize_t len;
+        len = qio_channel_readv(ioc, local_iov, nlocal_iov, errp);
+        if (len == QIO_CHANNEL_ERR_BLOCK) {
+            qio_channel_wait(ioc, G_IO_IN);
+            continue;
+        }
+        if (len < 0) {
+            /* errp has already been set by qio_channel_readv(),
+             * so don't overwrite it here */
+            done = -1;
+            goto cleanup;
+        }
+
+        iov_discard_front(&local_iov, &nlocal_iov, len);
+        done += len;
+    }
+
+ cleanup:
+    g_free(local_iov_head);
+    return done;
+}
+
+ssize_t qio_channel_writev_all(QIOChannel *ioc,
+                               const struct iovec *iov,
+                               size_t niov,
+                               Error **errp)
+{
+    ssize_t done = 0;
+    struct iovec *local_iov = g_new(struct iovec, niov);
+    struct iovec *local_iov_head = local_iov;
+    unsigned int nlocal_iov = niov;
+
+    nlocal_iov = iov_copy(local_iov, nlocal_iov,
+                          iov, niov,
+                          0, iov_size(iov, niov));
+
+    while (nlocal_iov > 0) {
+        ssize_t len;
+        len = qio_channel_writev(ioc, local_iov, nlocal_iov, errp);
+        if (len == QIO_CHANNEL_ERR_BLOCK) {
+            qio_channel_wait(ioc, G_IO_OUT);
+            continue;
+        }
+        if (len < 0) {
+            /* errp has already been set by qio_channel_writev(),
+             * so don't overwrite it here */
+            done = -1;
+            goto cleanup;
+        }
+
+        iov_discard_front(&local_iov, &nlocal_iov, len);
+        done += len;
+    }
+
+ cleanup:
+    g_free(local_iov_head);
+    return done;
+}
+
 ssize_t qio_channel_readv(QIOChannel *ioc,
                           const struct iovec *iov,
                           size_t niov,
diff --git a/migration/qemu-file-channel.c b/migration/qemu-file-channel.c
index e202d73..457ea6c 100644
--- a/migration/qemu-file-channel.c
+++ b/migration/qemu-file-channel.c
@@ -36,35 +36,8 @@ static ssize_t channel_writev_buffer(void *opaque,
                                      int64_t pos)
 {
     QIOChannel *ioc = QIO_CHANNEL(opaque);
-    ssize_t done = 0;
-    struct iovec *local_iov = g_new(struct iovec, iovcnt);
-    struct iovec *local_iov_head = local_iov;
-    unsigned int nlocal_iov = iovcnt;
 
-    nlocal_iov = iov_copy(local_iov, nlocal_iov,
-                          iov, iovcnt,
-                          0, iov_size(iov, iovcnt));
-
-    while (nlocal_iov > 0) {
-        ssize_t len;
-        len = qio_channel_writev(ioc, local_iov, nlocal_iov, NULL);
-        if (len == QIO_CHANNEL_ERR_BLOCK) {
-            qio_channel_wait(ioc, G_IO_OUT);
-            continue;
-        }
-        if (len < 0) {
-            /* XXX handle Error objects */
-            done = -EIO;
-            goto cleanup;
-        }
-
-        iov_discard_front(&local_iov, &nlocal_iov, len);
-        done += len;
-    }
-
- cleanup:
-    g_free(local_iov_head);
-    return done;
+    return qio_channel_writev_all(ioc, iov, iovcnt, NULL);
 }
 
 
diff --git a/tests/io-channel-helpers.c b/tests/io-channel-helpers.c
index 05e5579..3d76d95 100644
--- a/tests/io-channel-helpers.c
+++ b/tests/io-channel-helpers.c
@@ -21,6 +21,7 @@
 #include "qemu/osdep.h"
 #include "io-channel-helpers.h"
 #include "qapi/error.h"
+#include "qemu/iov.h"
 
 struct QIOChannelTest {
     QIOChannel *src;
@@ -153,6 +154,45 @@ static gpointer test_io_thread_reader(gpointer opaque)
     return NULL;
 }
 
+static gpointer test_io_thread_writer_all(gpointer opaque)
+{
+    QIOChannelTest *data = opaque;
+    size_t niov = data->niov;
+    ssize_t ret;
+
+    qio_channel_set_blocking(data->src, data->blocking, NULL);
+
+    ret = qio_channel_writev_all(data->src,
+                                 data->inputv,
+                                 niov,
+                                 &data->writeerr);
+    if (ret != iov_size(data->inputv, data->niov)) {
+        error_setg(&data->writeerr, "Unexpected I/O error");
+    }
+
+    return NULL;
+}
+
+/* This thread receives all data using iovecs */
+static gpointer test_io_thread_reader_all(gpointer opaque)
+{
+    QIOChannelTest *data = opaque;
+    size_t niov = data->niov;
+    ssize_t ret;
+
+    qio_channel_set_blocking(data->dst, data->blocking, NULL);
+
+    ret = qio_channel_readv_all(data->dst,
+                                data->outputv,
+                                niov,
+                                &data->readerr);
+
+    if (ret != iov_size(data->inputv, data->niov)) {
+        error_setg(&data->readerr, "Unexpected I/O error");
+    }
+
+    return NULL;
+}
 
 QIOChannelTest *qio_channel_test_new(void)
 {
@@ -231,6 +271,21 @@ void qio_channel_test_run_reader(QIOChannelTest *test,
     test->dst = NULL;
 }
 
+void qio_channel_test_run_writer_all(QIOChannelTest *test,
+                                     QIOChannel *src)
+{
+    test->src = src;
+    test_io_thread_writer_all(test);
+    test->src = NULL;
+}
+
+void qio_channel_test_run_reader_all(QIOChannelTest *test,
+                                     QIOChannel *dst)
+{
+    test->dst = dst;
+    test_io_thread_reader_all(test);
+    test->dst = NULL;
+}
 
 void qio_channel_test_validate(QIOChannelTest *test)
 {
diff --git a/tests/io-channel-helpers.h b/tests/io-channel-helpers.h
index fedc64f..17b9647 100644
--- a/tests/io-channel-helpers.h
+++ b/tests/io-channel-helpers.h
@@ -36,6 +36,10 @@ void qio_channel_test_run_writer(QIOChannelTest *test,
                                  QIOChannel *src);
 void qio_channel_test_run_reader(QIOChannelTest *test,
                                  QIOChannel *dst);
+void qio_channel_test_run_writer_all(QIOChannelTest *test,
+                                     QIOChannel *src);
+void qio_channel_test_run_reader_all(QIOChannelTest *test,
+                                     QIOChannel *dst);
 
 void qio_channel_test_validate(QIOChannelTest *test);
 
diff --git a/tests/test-io-channel-buffer.c b/tests/test-io-channel-buffer.c
index 64722a2..4bf64ae 100644
--- a/tests/test-io-channel-buffer.c
+++ b/tests/test-io-channel-buffer.c
@@ -22,8 +22,7 @@
 #include "io/channel-buffer.h"
 #include "io-channel-helpers.h"
 
-
-static void test_io_channel_buf(void)
+static void test_io_channel_buf1(void)
 {
     QIOChannelBuffer *buf;
     QIOChannelTest *test;
@@ -39,6 +38,53 @@ static void test_io_channel_buf(void)
     object_unref(OBJECT(buf));
 }
 
+static void test_io_channel_buf2(void)
+{
+    QIOChannelBuffer *buf;
+    QIOChannelTest *test;
+
+    buf = qio_channel_buffer_new(0);
+
+    test = qio_channel_test_new();
+    qio_channel_test_run_writer_all(test, QIO_CHANNEL(buf));
+    buf->offset = 0;
+    qio_channel_test_run_reader(test, QIO_CHANNEL(buf));
+    qio_channel_test_validate(test);
+
+    object_unref(OBJECT(buf));
+}
+
+static void test_io_channel_buf3(void)
+{
+    QIOChannelBuffer *buf;
+    QIOChannelTest *test;
+
+    buf = qio_channel_buffer_new(0);
+
+    test = qio_channel_test_new();
+    qio_channel_test_run_writer(test, QIO_CHANNEL(buf));
+    buf->offset = 0;
+    qio_channel_test_run_reader_all(test, QIO_CHANNEL(buf));
+    qio_channel_test_validate(test);
+
+    object_unref(OBJECT(buf));
+}
+
+static void test_io_channel_buf4(void)
+{
+    QIOChannelBuffer *buf;
+    QIOChannelTest *test;
+
+    buf = qio_channel_buffer_new(0);
+
+    test = qio_channel_test_new();
+    qio_channel_test_run_writer_all(test, QIO_CHANNEL(buf));
+    buf->offset = 0;
+    qio_channel_test_run_reader_all(test, QIO_CHANNEL(buf));
+    qio_channel_test_validate(test);
+
+    object_unref(OBJECT(buf));
+}
 
 int main(int argc, char **argv)
 {
@@ -46,6 +92,9 @@ int main(int argc, char **argv)
 
     g_test_init(&argc, &argv, NULL);
 
-    g_test_add_func("/io/channel/buf", test_io_channel_buf);
+    g_test_add_func("/io/channel/buf1", test_io_channel_buf1);
+    g_test_add_func("/io/channel/buf2", test_io_channel_buf2);
+    g_test_add_func("/io/channel/buf3", test_io_channel_buf3);
+    g_test_add_func("/io/channel/buf4", test_io_channel_buf4);
     return g_test_run();
 }
-- 
2.9.4

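A usage sketch of the new call (assuming this patch is applied; the helper
send_two_buffers is made up for illustration).  The point of the _all
variants is that the caller hands over an iovec and does not have to deal
with short writes or EAGAIN itself:

    #include "qemu/osdep.h"
    #include "qemu/iov.h"
    #include "qemu/error-report.h"
    #include "io/channel.h"

    /* Hypothetical helper: send a header and a payload, either fully
     * or not at all from the caller's point of view. */
    static int send_two_buffers(QIOChannel *ioc,
                                void *hdr, size_t hdr_len,
                                void *payload, size_t payload_len)
    {
        struct iovec iov[2] = {
            { .iov_base = hdr,     .iov_len = hdr_len     },
            { .iov_base = payload, .iov_len = payload_len },
        };
        Error *local_err = NULL;

        /* Retries internally on partial writes and would-block */
        if (qio_channel_writev_all(ioc, iov, 2, &local_err) < 0) {
            error_report_err(local_err);
            return -1;
        }
        return 0;
    }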

* [Qemu-devel] [PATCH v5 04/17] migration: Add multifd capability
  2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
                   ` (2 preceding siblings ...)
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 03/17] qio: Create new qio_channel_{readv, writev}_all Juan Quintela
@ 2017-07-17 13:42 ` Juan Quintela
  2017-07-19 15:44   ` Dr. David Alan Gilbert
  2017-07-19 17:14   ` Eric Blake
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 05/17] migration: Create x-multifd-threads parameter Juan Quintela
                   ` (12 subsequent siblings)
  16 siblings, 2 replies; 93+ messages in thread
From: Juan Quintela @ 2017-07-17 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: dgilbert, lvivier, peterx, berrange

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/migration.c | 9 +++++++++
 migration/migration.h | 1 +
 qapi-schema.json      | 4 ++--
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index c24ad03..af2630b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1282,6 +1282,15 @@ bool migrate_use_events(void)
     return s->enabled_capabilities[MIGRATION_CAPABILITY_EVENTS];
 }
 
+bool migrate_use_multifd(void)
+{
+    MigrationState *s;
+
+    s = migrate_get_current();
+
+    return s->enabled_capabilities[MIGRATION_CAPABILITY_X_MULTIFD];
+}
+
 int migrate_use_xbzrle(void)
 {
     MigrationState *s;
diff --git a/migration/migration.h b/migration/migration.h
index 5a18aea..9da9b4e 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -172,6 +172,7 @@ bool migrate_postcopy_ram(void);
 bool migrate_zero_blocks(void);
 
 bool migrate_auto_converge(void);
+bool migrate_use_multifd(void);
 
 int migrate_use_xbzrle(void);
 int64_t migrate_xbzrle_cache_size(void);
diff --git a/qapi-schema.json b/qapi-schema.json
index ab438ea..2457fb0 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -902,14 +902,14 @@
 #
 # @return-path: If enabled, migration will use the return path even
 #               for precopy. (since 2.10)
+# @x-multifd: Use more than one fd for migration (since 2.10)
 #
 # Since: 1.2
 ##
 { 'enum': 'MigrationCapability',
   'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
            'compress', 'events', 'postcopy-ram', 'x-colo', 'release-ram',
-           'block', 'return-path' ] }
-
+           'block', 'return-path', 'x-multifd'] }
 ##
 # @MigrationCapabilityStatus:
 #
-- 
2.9.4

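For anyone who wants to try it, the capability has to be enabled on both
source and destination before starting the migration.  With HMP that would
be something like:

    (qemu) migrate_set_capability x-multifd on

or the equivalent migrate-set-capabilities QMP command.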

* [Qemu-devel] [PATCH v5 05/17] migration: Create x-multifd-threads parameter
  2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
                   ` (3 preceding siblings ...)
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 04/17] migration: Add multifd capability Juan Quintela
@ 2017-07-17 13:42 ` Juan Quintela
  2017-07-19 16:00   ` Dr. David Alan Gilbert
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 06/17] migration: Create x-multifd-group parameter Juan Quintela
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-07-17 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: dgilbert, lvivier, peterx, berrange

Indicates the number of threads that we would create.  By default we
create 2 threads.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

--

Catch inconsistent defaults (eric).
Improve comment stating that the number of threads is the same as the
number of sockets
---
 hmp.c                 |  7 +++++++
 migration/migration.c | 23 +++++++++++++++++++++++
 migration/migration.h |  1 +
 qapi-schema.json      | 18 ++++++++++++++++--
 4 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/hmp.c b/hmp.c
index d970ea9..92f9456 100644
--- a/hmp.c
+++ b/hmp.c
@@ -335,6 +335,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
         monitor_printf(mon, "%s: %s\n",
             MigrationParameter_lookup[MIGRATION_PARAMETER_BLOCK_INCREMENTAL],
                        params->block_incremental ? "on" : "off");
+        monitor_printf(mon, "%s: %" PRId64 "\n",
+            MigrationParameter_lookup[MIGRATION_PARAMETER_X_MULTIFD_THREADS],
+            params->x_multifd_threads);
     }
 
     qapi_free_MigrationParameters(params);
@@ -1573,6 +1576,9 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
                     goto cleanup;
                 }
                 p.block_incremental = valuebool;
+            case MIGRATION_PARAMETER_X_MULTIFD_THREADS:
+                p.has_x_multifd_threads = true;
+                use_int_value = true;
                 break;
             }
 
@@ -1590,6 +1596,7 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
                 p.cpu_throttle_increment = valueint;
                 p.downtime_limit = valueint;
                 p.x_checkpoint_delay = valueint;
+                p.x_multifd_threads = valueint;
             }
 
             qmp_migrate_set_parameters(&p, &err);
diff --git a/migration/migration.c b/migration/migration.c
index af2630b..148edc1 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -78,6 +78,7 @@
  * Note: Please change this default value to 10000 when we support hybrid mode.
  */
 #define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY 200
+#define DEFAULT_MIGRATE_MULTIFD_THREADS 2
 
 static NotifierList migration_state_notifiers =
     NOTIFIER_LIST_INITIALIZER(migration_state_notifiers);
@@ -460,6 +461,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
     params->x_checkpoint_delay = s->parameters.x_checkpoint_delay;
     params->has_block_incremental = true;
     params->block_incremental = s->parameters.block_incremental;
+    params->has_x_multifd_threads = true;
+    params->x_multifd_threads = s->parameters.x_multifd_threads;
 
     return params;
 }
@@ -712,6 +715,13 @@ void qmp_migrate_set_parameters(MigrationParameters *params, Error **errp)
                     "x_checkpoint_delay",
                     "is invalid, it should be positive");
     }
+    if (params->has_x_multifd_threads &&
+        (params->x_multifd_threads < 1 || params->x_multifd_threads > 255)) {
+        error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
+                   "multifd_threads",
+                   "is invalid, it should be in the range of 1 to 255");
+        return;
+    }
 
     if (params->has_compress_level) {
         s->parameters.compress_level = params->compress_level;
@@ -756,6 +766,9 @@ void qmp_migrate_set_parameters(MigrationParameters *params, Error **errp)
     if (params->has_block_incremental) {
         s->parameters.block_incremental = params->block_incremental;
     }
+    if (params->has_x_multifd_threads) {
+        s->parameters.x_multifd_threads = params->x_multifd_threads;
+    }
 }
 
 
@@ -1291,6 +1304,15 @@ bool migrate_use_multifd(void)
     return s->enabled_capabilities[MIGRATION_CAPABILITY_X_MULTIFD];
 }
 
+int migrate_multifd_threads(void)
+{
+    MigrationState *s;
+
+    s = migrate_get_current();
+
+    return s->parameters.x_multifd_threads;
+}
+
 int migrate_use_xbzrle(void)
 {
     MigrationState *s;
@@ -2055,6 +2077,7 @@ static void migration_instance_init(Object *obj)
         .max_bandwidth = MAX_THROTTLE,
         .downtime_limit = DEFAULT_MIGRATE_SET_DOWNTIME,
         .x_checkpoint_delay = DEFAULT_MIGRATE_X_CHECKPOINT_DELAY,
+        .x_multifd_threads = DEFAULT_MIGRATE_MULTIFD_THREADS,
     };
     ms->parameters.tls_creds = g_strdup("");
     ms->parameters.tls_hostname = g_strdup("");
diff --git a/migration/migration.h b/migration/migration.h
index 9da9b4e..20ea30c 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -173,6 +173,7 @@ bool migrate_zero_blocks(void);
 
 bool migrate_auto_converge(void);
 bool migrate_use_multifd(void);
+int migrate_multifd_threads(void);
 
 int migrate_use_xbzrle(void);
 int64_t migrate_xbzrle_cache_size(void);
diff --git a/qapi-schema.json b/qapi-schema.json
index 2457fb0..444e8f0 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -902,6 +902,7 @@
 #
 # @return-path: If enabled, migration will use the return path even
 #               for precopy. (since 2.10)
+#
 # @x-multifd: Use more than one fd for migration (since 2.10)
 #
 # Since: 1.2
@@ -910,6 +911,7 @@
   'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
            'compress', 'events', 'postcopy-ram', 'x-colo', 'release-ram',
            'block', 'return-path', 'x-multifd'] }
+
 ##
 # @MigrationCapabilityStatus:
 #
@@ -1026,13 +1028,19 @@
 # 	migrated and the destination must already have access to the
 # 	same backing chain as was used on the source.  (since 2.10)
 #
+# @x-multifd-threads: Number of threads used to migrate data in
+#                     parallel. This is the same as the number
+#                     of sockets used for migration.
+#                     The default value is 2 (since 2.10)
+#
 # Since: 2.4
 ##
 { 'enum': 'MigrationParameter',
   'data': ['compress-level', 'compress-threads', 'decompress-threads',
            'cpu-throttle-initial', 'cpu-throttle-increment',
            'tls-creds', 'tls-hostname', 'max-bandwidth',
-           'downtime-limit', 'x-checkpoint-delay', 'block-incremental' ] }
+           'downtime-limit', 'x-checkpoint-delay', 'block-incremental',
+           'x-multifd-threads'] }
 
 ##
 # @migrate-set-parameters:
@@ -1106,6 +1114,11 @@
 # 	migrated and the destination must already have access to the
 # 	same backing chain as was used on the source.  (since 2.10)
 #
+# @x-multifd-threads: Number of threads used to migrate data in
+#                     parallel. This is the same as the number
+#                     of sockets used for migration.
+#                     The default value is 2 (since 2.10)
+#
 # Since: 2.4
 ##
 { 'struct': 'MigrationParameters',
@@ -1119,7 +1132,8 @@
             '*max-bandwidth': 'int',
             '*downtime-limit': 'int',
             '*x-checkpoint-delay': 'int',
-            '*block-incremental': 'bool' } }
+            '*block-incremental': 'bool',
+            '*x-multifd-threads': 'int'} }
 
 ##
 # @query-migrate-parameters:
-- 
2.9.4

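For example, to use four threads (and therefore four sockets/channels)
instead of the default two, the parameter is set on both sides before
migrating; values outside the 1-255 range are rejected by
qmp_migrate_set_parameters():

    (qemu) migrate_set_parameter x-multifd-threads 4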

* [Qemu-devel] [PATCH v5 06/17] migration: Create x-multifd-group parameter
  2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
                   ` (4 preceding siblings ...)
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 05/17] migration: Create x-multifd-threads parameter Juan Quintela
@ 2017-07-17 13:42 ` Juan Quintela
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 07/17] migration: Create multifd migration threads Juan Quintela
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 93+ messages in thread
From: Juan Quintela @ 2017-07-17 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: dgilbert, lvivier, peterx, berrange

Indicates how many pages we are going to send in each batch to a multifd
thread.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

--

Be consistent with defaults and documentation
---
 hmp.c                 |  8 ++++++++
 migration/migration.c | 23 +++++++++++++++++++++++
 migration/migration.h |  1 +
 qapi-schema.json      | 11 +++++++++--
 4 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/hmp.c b/hmp.c
index 92f9456..b01605a 100644
--- a/hmp.c
+++ b/hmp.c
@@ -338,6 +338,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
         monitor_printf(mon, "%s: %" PRId64 "\n",
             MigrationParameter_lookup[MIGRATION_PARAMETER_X_MULTIFD_THREADS],
             params->x_multifd_threads);
+        monitor_printf(mon, "%s: %" PRId64 "\n",
+            MigrationParameter_lookup[MIGRATION_PARAMETER_X_MULTIFD_GROUP],
+            params->x_multifd_group);
     }
 
     qapi_free_MigrationParameters(params);
@@ -1580,6 +1583,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
                 p.has_x_multifd_threads = true;
                 use_int_value = true;
                 break;
+            case MIGRATION_PARAMETER_X_MULTIFD_GROUP:
+                p.has_x_multifd_group = true;
+                use_int_value = true;
+                break;
             }
 
             if (use_int_value) {
@@ -1597,6 +1604,7 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
                 p.downtime_limit = valueint;
                 p.x_checkpoint_delay = valueint;
                 p.x_multifd_threads = valueint;
+                p.x_multifd_group = valueint;
             }
 
             qmp_migrate_set_parameters(&p, &err);
diff --git a/migration/migration.c b/migration/migration.c
index 148edc1..ff3fc9d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -79,6 +79,7 @@
  */
 #define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY 200
 #define DEFAULT_MIGRATE_MULTIFD_THREADS 2
+#define DEFAULT_MIGRATE_MULTIFD_GROUP 16
 
 static NotifierList migration_state_notifiers =
     NOTIFIER_LIST_INITIALIZER(migration_state_notifiers);
@@ -463,6 +464,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
     params->block_incremental = s->parameters.block_incremental;
     params->has_x_multifd_threads = true;
     params->x_multifd_threads = s->parameters.x_multifd_threads;
+    params->has_x_multifd_group = true;
+    params->x_multifd_group = s->parameters.x_multifd_group;
 
     return params;
 }
@@ -722,6 +725,13 @@ void qmp_migrate_set_parameters(MigrationParameters *params, Error **errp)
                    "is invalid, it should be in the range of 1 to 255");
         return;
     }
+    if (params->has_x_multifd_group &&
+            (params->x_multifd_group < 1 || params->x_multifd_group > 10000)) {
+        error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
+                   "multifd_group",
+                   "is invalid, it should be in the range of 1 to 10000");
+        return;
+    }
 
     if (params->has_compress_level) {
         s->parameters.compress_level = params->compress_level;
@@ -769,6 +779,9 @@ void qmp_migrate_set_parameters(MigrationParameters *params, Error **errp)
     if (params->has_x_multifd_threads) {
         s->parameters.x_multifd_threads = params->x_multifd_threads;
     }
+    if (params->has_x_multifd_group) {
+        s->parameters.x_multifd_group = params->x_multifd_group;
+    }
 }
 
 
@@ -1313,6 +1326,15 @@ int migrate_multifd_threads(void)
     return s->parameters.x_multifd_threads;
 }
 
+int migrate_multifd_group(void)
+{
+    MigrationState *s;
+
+    s = migrate_get_current();
+
+    return s->parameters.x_multifd_group;
+}
+
 int migrate_use_xbzrle(void)
 {
     MigrationState *s;
@@ -2078,6 +2100,7 @@ static void migration_instance_init(Object *obj)
         .downtime_limit = DEFAULT_MIGRATE_SET_DOWNTIME,
         .x_checkpoint_delay = DEFAULT_MIGRATE_X_CHECKPOINT_DELAY,
         .x_multifd_threads = DEFAULT_MIGRATE_MULTIFD_THREADS,
+        .x_multifd_group = DEFAULT_MIGRATE_MULTIFD_GROUP,
     };
     ms->parameters.tls_creds = g_strdup("");
     ms->parameters.tls_hostname = g_strdup("");
diff --git a/migration/migration.h b/migration/migration.h
index 20ea30c..4aaaf9e 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -174,6 +174,7 @@ bool migrate_zero_blocks(void);
 bool migrate_auto_converge(void);
 bool migrate_use_multifd(void);
 int migrate_multifd_threads(void);
+int migrate_multifd_group(void);
 
 int migrate_use_xbzrle(void);
 int64_t migrate_xbzrle_cache_size(void);
diff --git a/qapi-schema.json b/qapi-schema.json
index 444e8f0..5b3733e 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1033,6 +1033,9 @@
 #                     of sockets used for migration.
 #                     The default value is 2 (since 2.10)
 #
+# @x-multifd-group: Number of pages sent together to a thread.
+#                   The default value is 16 (since 2.10)
+#
 # Since: 2.4
 ##
 { 'enum': 'MigrationParameter',
@@ -1040,7 +1043,7 @@
            'cpu-throttle-initial', 'cpu-throttle-increment',
            'tls-creds', 'tls-hostname', 'max-bandwidth',
            'downtime-limit', 'x-checkpoint-delay', 'block-incremental',
-           'x-multifd-threads'] }
+           'x-multifd-threads', 'x-multifd-group'] }
 
 ##
 # @migrate-set-parameters:
@@ -1119,6 +1122,9 @@
 #                     number of sockets used for migration.
 #                     The default value is 2 (since 2.10)
 #
+# @x-multifd-group: Number of pages sent together to a thread.
+#                   The default value is 16 (since 2.10)
+#
 # Since: 2.4
 ##
 { 'struct': 'MigrationParameters',
@@ -1133,7 +1139,8 @@
             '*downtime-limit': 'int',
             '*x-checkpoint-delay': 'int',
             '*block-incremental': 'bool',
-            '*x-multifd-threads': 'int'} }
+            '*x-multifd-threads': 'int',
+            '*x-multifd-group': 'int'} }
 
 ##
 # @query-migrate-parameters:
-- 
2.9.4

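As a worked example: with the default x-multifd-group of 16 and 4KiB
target pages, each submission to a worker thread covers 64KiB of guest
memory; setting it to 64 would make that 256KiB per batch.  It is set
like any other parameter, e.g.:

    (qemu) migrate_set_parameter x-multifd-group 64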

* [Qemu-devel] [PATCH v5 07/17] migration: Create multifd migration threads
  2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
                   ` (5 preceding siblings ...)
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 06/17] migration: Create x-multifd-group parameter Juan Quintela
@ 2017-07-17 13:42 ` Juan Quintela
  2017-07-19 16:49   ` Dr. David Alan Gilbert
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 08/17] migration: Split migration_fd_process_incomming Juan Quintela
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-07-17 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: dgilbert, lvivier, peterx, berrange

Creation of the threads, nothing inside yet.

Signed-off-by: Juan Quintela <quintela@redhat.com>

--

Use pointers instead of long array names
Move to use semaphores instead of condition variables, as per paolo's suggestion

Put all the state inside one struct.
Use a counter for the number of threads created.  Needed during cancellation.

Add error return to thread creation

Add id field

Rename functions to multifd_save/load_setup/cleanup
---
 migration/migration.c |  14 ++++
 migration/ram.c       | 192 ++++++++++++++++++++++++++++++++++++++++++++++++++
 migration/ram.h       |   5 ++
 3 files changed, 211 insertions(+)

diff --git a/migration/migration.c b/migration/migration.c
index ff3fc9d..5a82c1c 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -288,6 +288,7 @@ static void process_incoming_migration_bh(void *opaque)
     } else {
         runstate_set(global_state_get_runstate());
     }
+    multifd_load_cleanup();
     /*
      * This must happen after any state changes since as soon as an external
      * observer sees this event they might start to prod at the VM assuming
@@ -348,6 +349,7 @@ static void process_incoming_migration_co(void *opaque)
         migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
                           MIGRATION_STATUS_FAILED);
         error_report("load of migration failed: %s", strerror(-ret));
+        multifd_load_cleanup();
         exit(EXIT_FAILURE);
     }
     mis->bh = qemu_bh_new(process_incoming_migration_bh, mis);
@@ -358,6 +360,11 @@ void migration_fd_process_incoming(QEMUFile *f)
 {
     Coroutine *co = qemu_coroutine_create(process_incoming_migration_co, f);
 
+    if (multifd_load_setup() != 0) {
+        /* We haven't been able to create multifd threads
+           nothing better to do */
+        exit(EXIT_FAILURE);
+    }
     qemu_file_set_blocking(f, false);
     qemu_coroutine_enter(co);
 }
@@ -860,6 +867,7 @@ static void migrate_fd_cleanup(void *opaque)
         }
         qemu_mutex_lock_iothread();
 
+        multifd_save_cleanup();
         qemu_fclose(s->to_dst_file);
         s->to_dst_file = NULL;
     }
@@ -2049,6 +2057,12 @@ void migrate_fd_connect(MigrationState *s)
         }
     }
 
+    if (multifd_save_setup() != 0) {
+        migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
+                          MIGRATION_STATUS_FAILED);
+        migrate_fd_cleanup(s);
+        return;
+    }
     qemu_thread_create(&s->thread, "live_migration", migration_thread, s,
                        QEMU_THREAD_JOINABLE);
     s->migration_thread_running = true;
diff --git a/migration/ram.c b/migration/ram.c
index 1b08296..8e87533 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -356,6 +356,198 @@ static void compress_threads_save_setup(void)
     }
 }
 
+/* Multiple fd's */
+
+struct MultiFDSendParams {
+    uint8_t id;
+    QemuThread thread;
+    QemuSemaphore sem;
+    QemuMutex mutex;
+    bool quit;
+};
+typedef struct MultiFDSendParams MultiFDSendParams;
+
+struct {
+    MultiFDSendParams *params;
+    /* number of created threads */
+    int count;
+} *multifd_send_state;
+
+static void terminate_multifd_send_threads(void)
+{
+    int i;
+
+    for (i = 0; i < multifd_send_state->count; i++) {
+        MultiFDSendParams *p = &multifd_send_state->params[i];
+
+        qemu_mutex_lock(&p->mutex);
+        p->quit = true;
+        qemu_sem_post(&p->sem);
+        qemu_mutex_unlock(&p->mutex);
+    }
+}
+
+void multifd_save_cleanup(void)
+{
+    int i;
+
+    if (!migrate_use_multifd()) {
+        return;
+    }
+    terminate_multifd_send_threads();
+    for (i = 0; i < multifd_send_state->count; i++) {
+        MultiFDSendParams *p = &multifd_send_state->params[i];
+
+        qemu_thread_join(&p->thread);
+        qemu_mutex_destroy(&p->mutex);
+        qemu_sem_destroy(&p->sem);
+    }
+    g_free(multifd_send_state->params);
+    multifd_send_state->params = NULL;
+    g_free(multifd_send_state);
+    multifd_send_state = NULL;
+}
+
+static void *multifd_send_thread(void *opaque)
+{
+    MultiFDSendParams *p = opaque;
+
+    while (true) {
+        qemu_mutex_lock(&p->mutex);
+        if (p->quit) {
+            qemu_mutex_unlock(&p->mutex);
+            break;
+        }
+        qemu_mutex_unlock(&p->mutex);
+        qemu_sem_wait(&p->sem);
+    }
+
+    return NULL;
+}
+
+int multifd_save_setup(void)
+{
+    int thread_count;
+    uint8_t i;
+
+    if (!migrate_use_multifd()) {
+        return 0;
+    }
+    thread_count = migrate_multifd_threads();
+    multifd_send_state = g_malloc0(sizeof(*multifd_send_state));
+    multifd_send_state->params = g_new0(MultiFDSendParams, thread_count);
+    multifd_send_state->count = 0;
+    for (i = 0; i < thread_count; i++) {
+        char thread_name[16];
+        MultiFDSendParams *p = &multifd_send_state->params[i];
+
+        qemu_mutex_init(&p->mutex);
+        qemu_sem_init(&p->sem, 0);
+        p->quit = false;
+        p->id = i;
+        snprintf(thread_name, sizeof(thread_name), "multifdsend_%d", i);
+        qemu_thread_create(&p->thread, thread_name, multifd_send_thread, p,
+                           QEMU_THREAD_JOINABLE);
+        multifd_send_state->count++;
+    }
+    return 0;
+}
+
+struct MultiFDRecvParams {
+    uint8_t id;
+    QemuThread thread;
+    QemuSemaphore sem;
+    QemuMutex mutex;
+    bool quit;
+};
+typedef struct MultiFDRecvParams MultiFDRecvParams;
+
+struct {
+    MultiFDRecvParams *params;
+    /* number of created threads */
+    int count;
+} *multifd_recv_state;
+
+static void terminate_multifd_recv_threads(void)
+{
+    int i;
+
+    for (i = 0; i < multifd_recv_state->count; i++) {
+        MultiFDRecvParams *p = &multifd_recv_state->params[i];
+
+        qemu_mutex_lock(&p->mutex);
+        p->quit = true;
+        qemu_sem_post(&p->sem);
+        qemu_mutex_unlock(&p->mutex);
+    }
+}
+
+void multifd_load_cleanup(void)
+{
+    int i;
+
+    if (!migrate_use_multifd()) {
+        return;
+    }
+    terminate_multifd_recv_threads();
+    for (i = 0; i < multifd_recv_state->count; i++) {
+        MultiFDRecvParams *p = &multifd_recv_state->params[i];
+
+        qemu_thread_join(&p->thread);
+        qemu_mutex_destroy(&p->mutex);
+        qemu_sem_destroy(&p->sem);
+    }
+    g_free(multifd_recv_state->params);
+    multifd_recv_state->params = NULL;
+    g_free(multifd_recv_state);
+    multifd_recv_state = NULL;
+}
+
+static void *multifd_recv_thread(void *opaque)
+{
+    MultiFDRecvParams *p = opaque;
+
+    while (true) {
+        qemu_mutex_lock(&p->mutex);
+        if (p->quit) {
+            qemu_mutex_unlock(&p->mutex);
+            break;
+        }
+        qemu_mutex_unlock(&p->mutex);
+        qemu_sem_wait(&p->sem);
+    }
+
+    return NULL;
+}
+
+int multifd_load_setup(void)
+{
+    int thread_count;
+    uint8_t i;
+
+    if (!migrate_use_multifd()) {
+        return 0;
+    }
+    thread_count = migrate_multifd_threads();
+    multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state));
+    multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count);
+    multifd_recv_state->count = 0;
+    for (i = 0; i < thread_count; i++) {
+        char thread_name[16];
+        MultiFDRecvParams *p = &multifd_recv_state->params[i];
+
+        qemu_mutex_init(&p->mutex);
+        qemu_sem_init(&p->sem, 0);
+        p->quit = false;
+        p->id = i;
+        snprintf(thread_name, sizeof(thread_name), "multifdrecv_%d", i);
+        qemu_thread_create(&p->thread, thread_name, multifd_recv_thread, p,
+                           QEMU_THREAD_JOINABLE);
+        multifd_recv_state->count++;
+    }
+    return 0;
+}
+
 /**
  * save_page_header: write page header to wire
  *
diff --git a/migration/ram.h b/migration/ram.h
index c081fde..93c2bb4 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -39,6 +39,11 @@ int64_t xbzrle_cache_resize(int64_t new_size);
 uint64_t ram_bytes_remaining(void);
 uint64_t ram_bytes_total(void);
 
+int multifd_save_setup(void);
+void multifd_save_cleanup(void);
+int multifd_load_setup(void);
+void multifd_load_cleanup(void);
+
 uint64_t ram_pagesize_summary(void);
 int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len);
 void acct_update_position(QEMUFile *f, size_t size, bool zero);
-- 
2.9.4

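The per-thread loop above is the usual quit-flag-plus-semaphore idle
pattern.  For reference, a standalone sketch of the same pattern using
plain POSIX primitives instead of the qemu_thread/qemu_sem wrappers (all
names are made up; it compiles on its own with -pthread):

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdbool.h>
    #include <stdio.h>

    typedef struct {
        int id;
        pthread_t thread;
        pthread_mutex_t mutex;
        sem_t sem;
        bool quit;
    } Worker;

    static void *worker_thread(void *opaque)
    {
        Worker *w = opaque;

        for (;;) {
            pthread_mutex_lock(&w->mutex);
            if (w->quit) {
                pthread_mutex_unlock(&w->mutex);
                break;
            }
            pthread_mutex_unlock(&w->mutex);
            /* nothing to do yet: sleep until woken with work or quit */
            sem_wait(&w->sem);
        }
        return NULL;
    }

    int main(void)
    {
        Worker w = { .id = 0, .quit = false };

        pthread_mutex_init(&w.mutex, NULL);
        sem_init(&w.sem, 0, 0);
        pthread_create(&w.thread, NULL, worker_thread, &w);

        /* termination: set quit under the lock, then wake the thread */
        pthread_mutex_lock(&w.mutex);
        w.quit = true;
        pthread_mutex_unlock(&w.mutex);
        sem_post(&w.sem);

        pthread_join(w.thread, NULL);
        sem_destroy(&w.sem);
        pthread_mutex_destroy(&w.mutex);
        printf("worker %d joined\n", w.id);
        return 0;
    }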

* [Qemu-devel] [PATCH v5 08/17] migration: Split migration_fd_process_incomming
  2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
                   ` (6 preceding siblings ...)
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 07/17] migration: Create multifd migration threads Juan Quintela
@ 2017-07-17 13:42 ` Juan Quintela
  2017-07-19 17:08   ` Dr. David Alan Gilbert
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work Juan Quintela
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-07-17 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: dgilbert, lvivier, peterx, berrange

We need this in later patches.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/migration.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 5a82c1c..b81c498 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -356,19 +356,31 @@ static void process_incoming_migration_co(void *opaque)
     qemu_bh_schedule(mis->bh);
 }
 
-void migration_fd_process_incoming(QEMUFile *f)
+static void migration_incoming_setup(QEMUFile *f)
 {
-    Coroutine *co = qemu_coroutine_create(process_incoming_migration_co, f);
+    MigrationIncomingState *mis = migration_incoming_get_current();
 
     if (multifd_load_setup() != 0) {
         /* We haven't been able to create multifd threads
            nothing better to do */
         exit(EXIT_FAILURE);
     }
+    mis->from_src_file = f;
     qemu_file_set_blocking(f, false);
+}
+
+static void migration_incoming_process(void)
+{
+    Coroutine *co = qemu_coroutine_create(process_incoming_migration_co, NULL);
     qemu_coroutine_enter(co);
 }
 
+void migration_fd_process_incoming(QEMUFile *f)
+{
+    migration_incoming_setup(f);
+    migration_incoming_process();
+}
+
 gboolean migration_ioc_process_incoming(QIOChannel *ioc)
 {
     MigrationIncomingState *mis = migration_incoming_get_current();
-- 
2.9.4


* [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work
  2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
                   ` (7 preceding siblings ...)
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 08/17] migration: Split migration_fd_process_incomming Juan Quintela
@ 2017-07-17 13:42 ` Juan Quintela
  2017-07-19 13:56   ` Daniel P. Berrange
                     ` (2 more replies)
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page Juan Quintela
                   ` (7 subsequent siblings)
  16 siblings, 3 replies; 93+ messages in thread
From: Juan Quintela @ 2017-07-17 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: dgilbert, lvivier, peterx, berrange

We create a new channel for each new thread created.  We only send an
initial message through each one, to be sure that we are creating the
channels in the right order.

Signed-off-by: Juan Quintela <quintela@redhat.com>

--
Split SocketArgs into incoming and outgoing args

Use UUID's on the initial message, so we are sure we are connecting to
the right channel.

Remove init semaphore.  Now that we use uuids on the init message, we
know that this is our channel.

Fix recv socket destroy; we were destroying send channels.
This was very interesting, because we were using an unreferenced object
without problems.

Move to struct of pointers
init channel sooner.
split recv thread creation.
listen on main thread
---
 migration/migration.c |   7 ++-
 migration/ram.c       | 118 ++++++++++++++++++++++++++++++++++++++++++--------
 migration/ram.h       |   2 +
 migration/socket.c    |  38 ++++++++++++++--
 migration/socket.h    |  10 +++++
 5 files changed, 152 insertions(+), 23 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index b81c498..e1c79d5 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -389,8 +389,13 @@ gboolean migration_ioc_process_incoming(QIOChannel *ioc)
         QEMUFile *f = qemu_fopen_channel_input(ioc);
         mis->from_src_file = f;
         migration_fd_process_incoming(f);
+        if (!migrate_use_multifd()) {
+            return FALSE;
+        } else {
+            return TRUE;
+        }
     }
-    return FALSE; /* unregister */
+    return multifd_new_channel(ioc);
 }
 
 /*
diff --git a/migration/ram.c b/migration/ram.c
index 8e87533..b80f511 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -36,6 +36,7 @@
 #include "xbzrle.h"
 #include "ram.h"
 #include "migration.h"
+#include "socket.h"
 #include "migration/register.h"
 #include "migration/misc.h"
 #include "qemu-file.h"
@@ -46,6 +47,8 @@
 #include "exec/ram_addr.h"
 #include "qemu/rcu_queue.h"
 #include "migration/colo.h"
+#include "sysemu/sysemu.h"
+#include "qemu/uuid.h"
 
 /***********************************************************/
 /* ram save/restore */
@@ -361,6 +364,7 @@ static void compress_threads_save_setup(void)
 struct MultiFDSendParams {
     uint8_t id;
     QemuThread thread;
+    QIOChannel *c;
     QemuSemaphore sem;
     QemuMutex mutex;
     bool quit;
@@ -401,6 +405,7 @@ void multifd_save_cleanup(void)
         qemu_thread_join(&p->thread);
         qemu_mutex_destroy(&p->mutex);
         qemu_sem_destroy(&p->sem);
+        socket_send_channel_destroy(p->c);
     }
     g_free(multifd_send_state->params);
     multifd_send_state->params = NULL;
@@ -408,11 +413,38 @@ void multifd_save_cleanup(void)
     multifd_send_state = NULL;
 }
 
+/* Default uuid for multifd when qemu is not started with uuid */
+static char multifd_uuid[] = "5c49fd7e-af88-4a07-b6e8-091fd696ad40";
+/* strlen(multifd) + '-' + <channel id> + '-' +  UUID_FMT + '\0' */
+#define MULTIFD_UUID_MSG (7 + 1 + 3 + 1 + UUID_FMT_LEN + 1)
+
 static void *multifd_send_thread(void *opaque)
 {
     MultiFDSendParams *p = opaque;
+    char string[MULTIFD_UUID_MSG];
+    char *string_uuid;
+    int res;
+    bool exit = false;
 
-    while (true) {
+    if (qemu_uuid_set) {
+        string_uuid = qemu_uuid_unparse_strdup(&qemu_uuid);
+    } else {
+        string_uuid = g_strdup(multifd_uuid);
+    }
+    res = snprintf(string, MULTIFD_UUID_MSG, "%s multifd %03d",
+                   string_uuid, p->id);
+    g_free(string_uuid);
+
+    /* -1 due to the wonders of '\0' accounting */
+    if (res != (MULTIFD_UUID_MSG - 1)) {
+        error_report("Multifd UUID message '%s' is not of right length",
+            string);
+        exit = true;
+    } else {
+        qio_channel_write(p->c, string, MULTIFD_UUID_MSG, &error_abort);
+    }
+
+    while (!exit) {
         qemu_mutex_lock(&p->mutex);
         if (p->quit) {
             qemu_mutex_unlock(&p->mutex);
@@ -445,6 +477,12 @@ int multifd_save_setup(void)
         qemu_sem_init(&p->sem, 0);
         p->quit = false;
         p->id = i;
+        p->c = socket_send_channel_create();
+        if (!p->c) {
+            error_report("Error creating a send channel");
+            multifd_save_cleanup();
+            return -1;
+        }
         snprintf(thread_name, sizeof(thread_name), "multifdsend_%d", i);
         qemu_thread_create(&p->thread, thread_name, multifd_send_thread, p,
                            QEMU_THREAD_JOINABLE);
@@ -456,6 +494,7 @@ int multifd_save_setup(void)
 struct MultiFDRecvParams {
     uint8_t id;
     QemuThread thread;
+    QIOChannel *c;
     QemuSemaphore sem;
     QemuMutex mutex;
     bool quit;
@@ -463,7 +502,7 @@ struct MultiFDRecvParams {
 typedef struct MultiFDRecvParams MultiFDRecvParams;
 
 struct {
-    MultiFDRecvParams *params;
+    MultiFDRecvParams **params;
     /* number of created threads */
     int count;
 } *multifd_recv_state;
@@ -473,7 +512,7 @@ static void terminate_multifd_recv_threads(void)
     int i;
 
     for (i = 0; i < multifd_recv_state->count; i++) {
-        MultiFDRecvParams *p = &multifd_recv_state->params[i];
+        MultiFDRecvParams *p = multifd_recv_state->params[i];
 
         qemu_mutex_lock(&p->mutex);
         p->quit = true;
@@ -491,11 +530,13 @@ void multifd_load_cleanup(void)
     }
     terminate_multifd_recv_threads();
     for (i = 0; i < multifd_recv_state->count; i++) {
-        MultiFDRecvParams *p = &multifd_recv_state->params[i];
+        MultiFDRecvParams *p = multifd_recv_state->params[i];
 
         qemu_thread_join(&p->thread);
         qemu_mutex_destroy(&p->mutex);
         qemu_sem_destroy(&p->sem);
+        socket_recv_channel_destroy(p->c);
+        g_free(p);
     }
     g_free(multifd_recv_state->params);
     multifd_recv_state->params = NULL;
@@ -520,31 +561,70 @@ static void *multifd_recv_thread(void *opaque)
     return NULL;
 }
 
+gboolean multifd_new_channel(QIOChannel *ioc)
+{
+    int thread_count = migrate_multifd_threads();
+    MultiFDRecvParams *p = g_new0(MultiFDRecvParams, 1);
+    MigrationState *s = migrate_get_current();
+    char string[MULTIFD_UUID_MSG];
+    char string_uuid[UUID_FMT_LEN];
+    char *uuid;
+    int id;
+
+    qio_channel_read(ioc, string, sizeof(string), &error_abort);
+    sscanf(string, "%s multifd %03d", string_uuid, &id);
+
+    if (qemu_uuid_set) {
+        uuid = qemu_uuid_unparse_strdup(&qemu_uuid);
+    } else {
+        uuid = g_strdup(multifd_uuid);
+    }
+    if (strcmp(string_uuid, uuid)) {
+        error_report("multifd: received uuid '%s' and expected uuid '%s'",
+                     string_uuid, uuid);
+        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
+                          MIGRATION_STATUS_FAILED);
+        terminate_multifd_recv_threads();
+        return FALSE;
+    }
+    g_free(uuid);
+
+    if (multifd_recv_state->params[id] != NULL) {
+        error_report("multifd: received id '%d' is already setup'", id);
+        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
+                          MIGRATION_STATUS_FAILED);
+        terminate_multifd_recv_threads();
+        return FALSE;
+    }
+    qemu_mutex_init(&p->mutex);
+    qemu_sem_init(&p->sem, 0);
+    p->quit = false;
+    p->id = id;
+    p->c = ioc;
+    atomic_set(&multifd_recv_state->params[id], p);
+    qemu_thread_create(&p->thread, "multifd_recv", multifd_recv_thread, p,
+                       QEMU_THREAD_JOINABLE);
+    multifd_recv_state->count++;
+
+    /* We need to return FALSE for the last channel */
+    if (multifd_recv_state->count == thread_count) {
+        return FALSE;
+    } else {
+        return TRUE;
+    }
+}
+
 int multifd_load_setup(void)
 {
     int thread_count;
-    uint8_t i;
 
     if (!migrate_use_multifd()) {
         return 0;
     }
     thread_count = migrate_multifd_threads();
     multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state));
-    multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count);
+    multifd_recv_state->params = g_new0(MultiFDRecvParams *, thread_count);
     multifd_recv_state->count = 0;
-    for (i = 0; i < thread_count; i++) {
-        char thread_name[16];
-        MultiFDRecvParams *p = &multifd_recv_state->params[i];
-
-        qemu_mutex_init(&p->mutex);
-        qemu_sem_init(&p->sem, 0);
-        p->quit = false;
-        p->id = i;
-        snprintf(thread_name, sizeof(thread_name), "multifdrecv_%d", i);
-        qemu_thread_create(&p->thread, thread_name, multifd_recv_thread, p,
-                           QEMU_THREAD_JOINABLE);
-        multifd_recv_state->count++;
-    }
     return 0;
 }
 
diff --git a/migration/ram.h b/migration/ram.h
index 93c2bb4..9413544 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -31,6 +31,7 @@
 
 #include "qemu-common.h"
 #include "exec/cpu-common.h"
+#include "io/channel.h"
 
 extern MigrationStats ram_counters;
 extern XBZRLECacheStats xbzrle_counters;
@@ -43,6 +44,7 @@ int multifd_save_setup(void);
 void multifd_save_cleanup(void);
 int multifd_load_setup(void);
 void multifd_load_cleanup(void);
+gboolean multifd_new_channel(QIOChannel *ioc);
 
 uint64_t ram_pagesize_summary(void);
 int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len);
diff --git a/migration/socket.c b/migration/socket.c
index 6195596..32a6b39 100644
--- a/migration/socket.c
+++ b/migration/socket.c
@@ -26,6 +26,38 @@
 #include "io/channel-socket.h"
 #include "trace.h"
 
+int socket_recv_channel_destroy(QIOChannel *recv)
+{
+    /* Remove channel */
+    object_unref(OBJECT(recv));
+    return 0;
+}
+
+struct SocketOutgoingArgs {
+    SocketAddress *saddr;
+    Error **errp;
+} outgoing_args;
+
+QIOChannel *socket_send_channel_create(void)
+{
+    QIOChannelSocket *sioc = qio_channel_socket_new();
+
+    qio_channel_socket_connect_sync(sioc, outgoing_args.saddr,
+                                    outgoing_args.errp);
+    qio_channel_set_delay(QIO_CHANNEL(sioc), false);
+    return QIO_CHANNEL(sioc);
+}
+
+int socket_send_channel_destroy(QIOChannel *send)
+{
+    /* Remove channel */
+    object_unref(OBJECT(send));
+    if (outgoing_args.saddr) {
+        qapi_free_SocketAddress(outgoing_args.saddr);
+        outgoing_args.saddr = NULL;
+    }
+    return 0;
+}
 
 static SocketAddress *tcp_build_address(const char *host_port, Error **errp)
 {
@@ -96,6 +128,9 @@ static void socket_start_outgoing_migration(MigrationState *s,
     struct SocketConnectData *data = g_new0(struct SocketConnectData, 1);
 
     data->s = s;
+    outgoing_args.saddr = saddr;
+    outgoing_args.errp = errp;
+
     if (saddr->type == SOCKET_ADDRESS_TYPE_INET) {
         data->hostname = g_strdup(saddr->u.inet.host);
     }
@@ -106,7 +141,6 @@ static void socket_start_outgoing_migration(MigrationState *s,
                                      socket_outgoing_migration,
                                      data,
                                      socket_connect_data_free);
-    qapi_free_SocketAddress(saddr);
 }
 
 void tcp_start_outgoing_migration(MigrationState *s,
@@ -151,8 +185,6 @@ static gboolean socket_accept_incoming_migration(QIOChannel *ioc,
 
     qio_channel_set_name(QIO_CHANNEL(sioc), "migration-socket-incoming");
     result = migration_channel_process_incoming(QIO_CHANNEL(sioc));
-    object_unref(OBJECT(sioc));
-
 out:
     if (result == FALSE) {
         /* Close listening socket as its no longer needed */
diff --git a/migration/socket.h b/migration/socket.h
index 6b91e9d..dabce0e 100644
--- a/migration/socket.h
+++ b/migration/socket.h
@@ -16,6 +16,16 @@
 
 #ifndef QEMU_MIGRATION_SOCKET_H
 #define QEMU_MIGRATION_SOCKET_H
+
+#include "io/channel.h"
+
+QIOChannel *socket_recv_channel_create(void);
+int socket_recv_channel_destroy(QIOChannel *recv);
+
+QIOChannel *socket_send_channel_create(void);
+
+int socket_send_channel_destroy(QIOChannel *send);
+
 void tcp_start_incoming_migration(const char *host_port, Error **errp);
 
 void tcp_start_outgoing_migration(MigrationState *s, const char *host_port,
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page
  2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
                   ` (8 preceding siblings ...)
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work Juan Quintela
@ 2017-07-17 13:42 ` Juan Quintela
  2017-07-19 19:02   ` Dr. David Alan Gilbert
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time Juan Quintela
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-07-17 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: dgilbert, lvivier, peterx, berrange

The function still doesn't use multifd, but we have simplified
ram_save_page: the xbzrle and RDMA code is gone.  We have added a new
counter and a new flag for this type of page.

Signed-off-by: Juan Quintela <quintela@redhat.com>
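
For reference, the new flag travels in the otherwise unused low bits of the
page offset that save_page_header() writes, as the diff below shows.  A
standalone illustration of that encoding (the page size and values are
assumptions; only the 0x200 flag value comes from the patch):

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define TARGET_PAGE_BITS            12      /* illustrative page size: 4 KiB */
    #define RAM_SAVE_FLAG_MULTIFD_PAGE  0x200

    int main(void)
    {
        uint64_t page = 42;                           /* page index inside the block */
        uint64_t offset = page << TARGET_PAGE_BITS;   /* page aligned: low bits free */

        /* sender: tag the page in the unused low bits of the offset */
        uint64_t header = offset | RAM_SAVE_FLAG_MULTIFD_PAGE;

        /* receiver: split address and flags again */
        uint64_t page_mask = (uint64_t)-1 << TARGET_PAGE_BITS;
        uint64_t flags = header & ~page_mask;
        uint64_t addr  = header & page_mask;

        printf("addr=0x%" PRIx64 " flags=0x%" PRIx64 " multifd=%d\n",
               addr, flags, !!(flags & RAM_SAVE_FLAG_MULTIFD_PAGE));
        return 0;
    }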
---
 hmp.c                 |  2 ++
 migration/migration.c |  1 +
 migration/ram.c       | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 qapi-schema.json      |  5 ++-
 4 files changed, 96 insertions(+), 2 deletions(-)

diff --git a/hmp.c b/hmp.c
index b01605a..eeb308b 100644
--- a/hmp.c
+++ b/hmp.c
@@ -234,6 +234,8 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
             monitor_printf(mon, "postcopy request count: %" PRIu64 "\n",
                            info->ram->postcopy_requests);
         }
+        monitor_printf(mon, "multifd: %" PRIu64 " pages\n",
+                       info->ram->multifd);
     }
 
     if (info->has_disk) {
diff --git a/migration/migration.c b/migration/migration.c
index e1c79d5..d9d5415 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -528,6 +528,7 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s)
     info->ram->dirty_sync_count = ram_counters.dirty_sync_count;
     info->ram->postcopy_requests = ram_counters.postcopy_requests;
     info->ram->page_size = qemu_target_page_size();
+    info->ram->multifd = ram_counters.multifd;
 
     if (migrate_use_xbzrle()) {
         info->has_xbzrle_cache = true;
diff --git a/migration/ram.c b/migration/ram.c
index b80f511..2bf3fa7 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -68,6 +68,7 @@
 #define RAM_SAVE_FLAG_XBZRLE   0x40
 /* 0x80 is reserved in migration.h start with 0x100 next */
 #define RAM_SAVE_FLAG_COMPRESS_PAGE    0x100
+#define RAM_SAVE_FLAG_MULTIFD_PAGE     0x200
 
 static inline bool is_zero_range(uint8_t *p, uint64_t size)
 {
@@ -362,12 +363,17 @@ static void compress_threads_save_setup(void)
 /* Multiple fd's */
 
 struct MultiFDSendParams {
+    /* not changed */
     uint8_t id;
     QemuThread thread;
     QIOChannel *c;
     QemuSemaphore sem;
     QemuMutex mutex;
+    /* protected by param mutex */
     bool quit;
+    uint8_t *address;
+    /* protected by multifd mutex */
+    bool done;
 };
 typedef struct MultiFDSendParams MultiFDSendParams;
 
@@ -375,6 +381,8 @@ struct {
     MultiFDSendParams *params;
     /* number of created threads */
     int count;
+    QemuMutex mutex;
+    QemuSemaphore sem;
 } *multifd_send_state;
 
 static void terminate_multifd_send_threads(void)
@@ -443,6 +451,7 @@ static void *multifd_send_thread(void *opaque)
     } else {
         qio_channel_write(p->c, string, MULTIFD_UUID_MSG, &error_abort);
     }
+    qemu_sem_post(&multifd_send_state->sem);
 
     while (!exit) {
         qemu_mutex_lock(&p->mutex);
@@ -450,6 +459,15 @@ static void *multifd_send_thread(void *opaque)
             qemu_mutex_unlock(&p->mutex);
             break;
         }
+        if (p->address) {
+            p->address = 0;
+            qemu_mutex_unlock(&p->mutex);
+            qemu_mutex_lock(&multifd_send_state->mutex);
+            p->done = true;
+            qemu_mutex_unlock(&multifd_send_state->mutex);
+            qemu_sem_post(&multifd_send_state->sem);
+            continue;
+        }
         qemu_mutex_unlock(&p->mutex);
         qemu_sem_wait(&p->sem);
     }
@@ -469,6 +487,8 @@ int multifd_save_setup(void)
     multifd_send_state = g_malloc0(sizeof(*multifd_send_state));
     multifd_send_state->params = g_new0(MultiFDSendParams, thread_count);
     multifd_send_state->count = 0;
+    qemu_mutex_init(&multifd_send_state->mutex);
+    qemu_sem_init(&multifd_send_state->sem, 0);
     for (i = 0; i < thread_count; i++) {
         char thread_name[16];
         MultiFDSendParams *p = &multifd_send_state->params[i];
@@ -477,6 +497,8 @@ int multifd_save_setup(void)
         qemu_sem_init(&p->sem, 0);
         p->quit = false;
         p->id = i;
+        p->done = true;
+        p->address = 0;
         p->c = socket_send_channel_create();
         if (!p->c) {
             error_report("Error creating a send channel");
@@ -491,6 +513,30 @@ int multifd_save_setup(void)
     return 0;
 }
 
+static int multifd_send_page(uint8_t *address)
+{
+    int i;
+    MultiFDSendParams *p = NULL; /* make happy gcc */
+
+    qemu_sem_wait(&multifd_send_state->sem);
+    qemu_mutex_lock(&multifd_send_state->mutex);
+    for (i = 0; i < multifd_send_state->count; i++) {
+        p = &multifd_send_state->params[i];
+
+        if (p->done) {
+            p->done = false;
+            break;
+        }
+    }
+    qemu_mutex_unlock(&multifd_send_state->mutex);
+    qemu_mutex_lock(&p->mutex);
+    p->address = address;
+    qemu_mutex_unlock(&p->mutex);
+    qemu_sem_post(&p->sem);
+
+    return 0;
+}
+
 struct MultiFDRecvParams {
     uint8_t id;
     QemuThread thread;
@@ -537,6 +583,7 @@ void multifd_load_cleanup(void)
         qemu_sem_destroy(&p->sem);
         socket_recv_channel_destroy(p->c);
         g_free(p);
+        multifd_recv_state->params[i] = NULL;
     }
     g_free(multifd_recv_state->params);
     multifd_recv_state->params = NULL;
@@ -1058,6 +1105,32 @@ static int ram_save_page(RAMState *rs, PageSearchStatus *pss, bool last_stage)
     return pages;
 }
 
+static int ram_multifd_page(RAMState *rs, PageSearchStatus *pss,
+                            bool last_stage)
+{
+    int pages;
+    uint8_t *p;
+    RAMBlock *block = pss->block;
+    ram_addr_t offset = pss->page << TARGET_PAGE_BITS;
+
+    p = block->host + offset;
+
+    pages = save_zero_page(rs, block, offset, p);
+    if (pages == -1) {
+        ram_counters.transferred +=
+            save_page_header(rs, rs->f, block,
+                             offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
+        qemu_put_buffer(rs->f, p, TARGET_PAGE_SIZE);
+        multifd_send_page(p);
+        ram_counters.transferred += TARGET_PAGE_SIZE;
+        pages = 1;
+        ram_counters.normal++;
+        ram_counters.multifd++;
+    }
+
+    return pages;
+}
+
 static int do_compress_ram_page(QEMUFile *f, RAMBlock *block,
                                 ram_addr_t offset)
 {
@@ -1486,6 +1559,8 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss,
         if (migrate_use_compression() &&
             (rs->ram_bulk_stage || !migrate_use_xbzrle())) {
             res = ram_save_compressed_page(rs, pss, last_stage);
+        } else if (migrate_use_multifd()) {
+            res = ram_multifd_page(rs, pss, last_stage);
         } else {
             res = ram_save_page(rs, pss, last_stage);
         }
@@ -2778,6 +2853,10 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     if (!migrate_use_compression()) {
         invalid_flags |= RAM_SAVE_FLAG_COMPRESS_PAGE;
     }
+
+    if (!migrate_use_multifd()) {
+        invalid_flags |= RAM_SAVE_FLAG_MULTIFD_PAGE;
+    }
     /* This RCU critical section can be very long running.
      * When RCU reclaims in the code start to become numerous,
      * it will be necessary to reduce the granularity of this
@@ -2802,13 +2881,17 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
             if (flags & invalid_flags & RAM_SAVE_FLAG_COMPRESS_PAGE) {
                 error_report("Received an unexpected compressed page");
             }
+            if (flags & invalid_flags  & RAM_SAVE_FLAG_MULTIFD_PAGE) {
+                error_report("Received an unexpected multifd page");
+            }
 
             ret = -EINVAL;
             break;
         }
 
         if (flags & (RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE |
-                     RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE)) {
+                     RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE |
+                     RAM_SAVE_FLAG_MULTIFD_PAGE)) {
             RAMBlock *block = ram_block_from_stream(f, flags);
 
             host = host_from_ram_block_offset(block, addr);
@@ -2896,6 +2979,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                 break;
             }
             break;
+
+        case RAM_SAVE_FLAG_MULTIFD_PAGE:
+            qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
+            break;
+
         case RAM_SAVE_FLAG_EOS:
             /* normal exit */
             break;
diff --git a/qapi-schema.json b/qapi-schema.json
index 5b3733e..f708782 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -601,6 +601,8 @@
 # @page-size: The number of bytes per page for the various page-based
 #        statistics (since 2.10)
 #
+# @multifd: number of pages sent with multifd (since 2.10)
+#
 # Since: 0.14.0
 ##
 { 'struct': 'MigrationStats',
@@ -608,7 +610,8 @@
            'duplicate': 'int', 'skipped': 'int', 'normal': 'int',
            'normal-bytes': 'int', 'dirty-pages-rate' : 'int',
            'mbps' : 'number', 'dirty-sync-count' : 'int',
-           'postcopy-requests' : 'int', 'page-size' : 'int' } }
+           'postcopy-requests' : 'int', 'page-size' : 'int',
+           'multifd' : 'int'} }
 
 ##
 # @XBZRLECacheStats:
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time
  2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
                   ` (9 preceding siblings ...)
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page Juan Quintela
@ 2017-07-17 13:42 ` Juan Quintela
  2017-07-19 13:58   ` Daniel P. Berrange
                     ` (2 more replies)
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 12/17] migration: Send the fd number which we are going to use for this page Juan Quintela
                   ` (5 subsequent siblings)
  16 siblings, 3 replies; 93+ messages in thread
From: Juan Quintela @ 2017-07-17 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: dgilbert, lvivier, peterx, berrange

We now send several pages at a time each time that we wake up a thread.

Signed-off-by: Juan Quintela <quintela@redhat.com>

--

Use iovecs instead of creating the equivalent.
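
As an aside, the batching idea is simply: collect page pointers into a
struct iovec array and only hand the group over once it is full.  A
self-contained sketch using plain writev() on a pipe instead of the
migration channel (the group size of 4 is an arbitrary stand-in for the
multifd-group parameter):

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/uio.h>

    #define PAGE_SIZE  4096
    #define GROUP_SIZE 4            /* stand-in for migrate_multifd_group() */

    static char pages[GROUP_SIZE][PAGE_SIZE];

    int main(void)
    {
        struct iovec iov[GROUP_SIZE];
        int fds[2], num = 0;

        if (pipe(fds) < 0) {
            return 1;
        }
        for (int i = 0; i < GROUP_SIZE; i++) {
            memset(pages[i], 'A' + i, PAGE_SIZE);
            iov[num].iov_base = pages[i];          /* queue the page, do not copy it */
            iov[num].iov_len = PAGE_SIZE;
            num++;
            if (num == GROUP_SIZE) {               /* group is full: one writev call */
                ssize_t done = writev(fds[1], iov, num);
                if (done != (ssize_t)num * PAGE_SIZE) {
                    return 1;
                }
                num = 0;
            }
        }
        printf("sent %d pages with a single writev per group\n", GROUP_SIZE);
        return 0;
    }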
---
 migration/ram.c | 46 ++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 40 insertions(+), 6 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 2bf3fa7..90e1bcb 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -362,6 +362,13 @@ static void compress_threads_save_setup(void)
 
 /* Multiple fd's */
 
+
+typedef struct {
+    int num;
+    int size;
+    struct iovec *iov;
+} multifd_pages_t;
+
 struct MultiFDSendParams {
     /* not changed */
     uint8_t id;
@@ -371,7 +378,7 @@ struct MultiFDSendParams {
     QemuMutex mutex;
     /* protected by param mutex */
     bool quit;
-    uint8_t *address;
+    multifd_pages_t pages;
     /* protected by multifd mutex */
     bool done;
 };
@@ -459,8 +466,8 @@ static void *multifd_send_thread(void *opaque)
             qemu_mutex_unlock(&p->mutex);
             break;
         }
-        if (p->address) {
-            p->address = 0;
+        if (p->pages.num) {
+            p->pages.num = 0;
             qemu_mutex_unlock(&p->mutex);
             qemu_mutex_lock(&multifd_send_state->mutex);
             p->done = true;
@@ -475,6 +482,13 @@ static void *multifd_send_thread(void *opaque)
     return NULL;
 }
 
+static void multifd_init_group(multifd_pages_t *pages)
+{
+    pages->num = 0;
+    pages->size = migrate_multifd_group();
+    pages->iov = g_malloc0(pages->size * sizeof(struct iovec));
+}
+
 int multifd_save_setup(void)
 {
     int thread_count;
@@ -498,7 +512,7 @@ int multifd_save_setup(void)
         p->quit = false;
         p->id = i;
         p->done = true;
-        p->address = 0;
+        multifd_init_group(&p->pages);
         p->c = socket_send_channel_create();
         if (!p->c) {
             error_report("Error creating a send channel");
@@ -515,8 +529,23 @@ int multifd_save_setup(void)
 
 static int multifd_send_page(uint8_t *address)
 {
-    int i;
+    int i, j;
     MultiFDSendParams *p = NULL; /* make happy gcc */
+    static multifd_pages_t pages;
+    static bool once;
+
+    if (!once) {
+        multifd_init_group(&pages);
+        once = true;
+    }
+
+    pages.iov[pages.num].iov_base = address;
+    pages.iov[pages.num].iov_len = TARGET_PAGE_SIZE;
+    pages.num++;
+
+    if (pages.num < (pages.size - 1)) {
+        return UINT16_MAX;
+    }
 
     qemu_sem_wait(&multifd_send_state->sem);
     qemu_mutex_lock(&multifd_send_state->mutex);
@@ -530,7 +559,12 @@ static int multifd_send_page(uint8_t *address)
     }
     qemu_mutex_unlock(&multifd_send_state->mutex);
     qemu_mutex_lock(&p->mutex);
-    p->address = address;
+    p->pages.num = pages.num;
+    for (j = 0; j < pages.size; j++) {
+        p->pages.iov[j].iov_base = pages.iov[j].iov_base;
+        p->pages.iov[j].iov_len = pages.iov[j].iov_len;
+    }
+    pages.num = 0;
     qemu_mutex_unlock(&p->mutex);
     qemu_sem_post(&p->sem);
 
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [Qemu-devel] [PATCH v5 12/17] migration: Send the fd number which we are going to use for this page
  2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
                   ` (10 preceding siblings ...)
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time Juan Quintela
@ 2017-07-17 13:42 ` Juan Quintela
  2017-07-20  9:58   ` Dr. David Alan Gilbert
  2017-08-09 16:48   ` Paolo Bonzini
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 13/17] migration: Create thread infrastructure for multifd recv side Juan Quintela
                   ` (4 subsequent siblings)
  16 siblings, 2 replies; 93+ messages in thread
From: Juan Quintela @ 2017-07-17 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: dgilbert, lvivier, peterx, berrange

We are still sending the page through the main channel; that will
change later in the series.

Signed-off-by: Juan Quintela <quintela@redhat.com>
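
For clarity, the two bytes added to the main stream are just the chosen
channel number in big-endian order, placed in front of the page data.  A
tiny standalone equivalent of the qemu_put_be16()/qemu_get_be16() framing
(buffer and values are illustrative):

    #include <stdint.h>
    #include <stdio.h>

    /* big-endian 16-bit put/get, what qemu_put_be16()/qemu_get_be16() do on the wire */
    static void put_be16(uint8_t *buf, uint16_t v)
    {
        buf[0] = v >> 8;
        buf[1] = v & 0xff;
    }

    static uint16_t get_be16(const uint8_t *buf)
    {
        return (uint16_t)((buf[0] << 8) | buf[1]);
    }

    int main(void)
    {
        uint8_t wire[2];
        uint16_t fd_num = 3;        /* channel chosen by the sender for this page */

        put_be16(wire, fd_num);     /* stream: page header | fd_num | page data   */
        printf("receiver sees fd_num=%u\n", get_be16(wire));
        return 0;
    }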
---
 migration/ram.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 90e1bcb..ac0742f 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -568,7 +568,7 @@ static int multifd_send_page(uint8_t *address)
     qemu_mutex_unlock(&p->mutex);
     qemu_sem_post(&p->sem);
 
-    return 0;
+    return i;
 }
 
 struct MultiFDRecvParams {
@@ -1143,6 +1143,7 @@ static int ram_multifd_page(RAMState *rs, PageSearchStatus *pss,
                             bool last_stage)
 {
     int pages;
+    uint16_t fd_num;
     uint8_t *p;
     RAMBlock *block = pss->block;
     ram_addr_t offset = pss->page << TARGET_PAGE_BITS;
@@ -1154,8 +1155,10 @@ static int ram_multifd_page(RAMState *rs, PageSearchStatus *pss,
         ram_counters.transferred +=
             save_page_header(rs, rs->f, block,
                              offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
+        fd_num = multifd_send_page(p);
+        qemu_put_be16(rs->f, fd_num);
+        ram_counters.transferred += 2; /* size of fd_num */
         qemu_put_buffer(rs->f, p, TARGET_PAGE_SIZE);
-        multifd_send_page(p);
         ram_counters.transferred += TARGET_PAGE_SIZE;
         pages = 1;
         ram_counters.normal++;
@@ -2905,6 +2908,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     while (!postcopy_running && !ret && !(flags & RAM_SAVE_FLAG_EOS)) {
         ram_addr_t addr, total_ram_bytes;
         void *host = NULL;
+        uint16_t fd_num;
         uint8_t ch;
 
         addr = qemu_get_be64(f);
@@ -3015,6 +3019,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
             break;
 
         case RAM_SAVE_FLAG_MULTIFD_PAGE:
+            fd_num = qemu_get_be16(f);
+            if (fd_num != 0) {
+                /* this is yet an unused variable, changed later */
+                fd_num = fd_num;
+            }
             qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
             break;
 
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [Qemu-devel] [PATCH v5 13/17] migration: Create thread infrastructure for multifd recv side
  2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
                   ` (11 preceding siblings ...)
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 12/17] migration: Send the fd number which we are going to use for this page Juan Quintela
@ 2017-07-17 13:42 ` Juan Quintela
  2017-07-20 10:22   ` Peter Xu
  2017-07-20 10:29   ` Dr. David Alan Gilbert
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 14/17] migration: Delay the start of reception on main channel Juan Quintela
                   ` (3 subsequent siblings)
  16 siblings, 2 replies; 93+ messages in thread
From: Juan Quintela @ 2017-07-17 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: dgilbert, lvivier, peterx, berrange

We make the locking and the transfer of information explicit, even though we
are still receiving everything through the main thread.

Signed-off-by: Juan Quintela <quintela@redhat.com>
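
Roughly, the recv-side handoff is a mailbox protected by the param mutex
plus a work/ready semaphore pair.  A reduced pthread sketch of that pattern,
not the QEMU code (names, the page count and the single handoff are
simplifications):

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    struct recv_param {
        pthread_mutex_t mutex;
        sem_t sem;         /* main -> worker: new work or quit was posted */
        sem_t ready;       /* worker -> main: mailbox is free again       */
        int pages_num;     /* the mailbox, protected by mutex             */
        int quit;
    };

    static void *recv_thread(void *opaque)
    {
        struct recv_param *p = opaque;

        sem_post(&p->ready);                      /* advertise: ready for work */
        for (;;) {
            pthread_mutex_lock(&p->mutex);
            if (p->quit) {
                pthread_mutex_unlock(&p->mutex);
                break;
            }
            if (p->pages_num) {
                printf("worker: handling %d pages\n", p->pages_num);
                p->pages_num = 0;
                pthread_mutex_unlock(&p->mutex);
                sem_post(&p->ready);              /* mailbox is free again */
                continue;
            }
            pthread_mutex_unlock(&p->mutex);
            sem_wait(&p->sem);                    /* nothing to do, sleep */
        }
        return NULL;
    }

    int main(void)
    {
        struct recv_param p = { .pages_num = 0, .quit = 0 };
        pthread_t tid;

        pthread_mutex_init(&p.mutex, NULL);
        sem_init(&p.sem, 0, 0);
        sem_init(&p.ready, 0, 0);
        pthread_create(&tid, NULL, recv_thread, &p);

        sem_wait(&p.ready);                       /* wait for a free mailbox */
        pthread_mutex_lock(&p.mutex);
        p.pages_num = 16;                         /* hand over one group of pages */
        pthread_mutex_unlock(&p.mutex);
        sem_post(&p.sem);

        sem_wait(&p.ready);                       /* the group was consumed */
        pthread_mutex_lock(&p.mutex);
        p.quit = 1;
        pthread_mutex_unlock(&p.mutex);
        sem_post(&p.sem);
        pthread_join(tid, NULL);
        return 0;
    }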
---
 migration/ram.c | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 60 insertions(+), 8 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index ac0742f..49c4880 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -49,6 +49,7 @@
 #include "migration/colo.h"
 #include "sysemu/sysemu.h"
 #include "qemu/uuid.h"
+#include "qemu/iov.h"
 
 /***********************************************************/
 /* ram save/restore */
@@ -527,7 +528,7 @@ int multifd_save_setup(void)
     return 0;
 }
 
-static int multifd_send_page(uint8_t *address)
+static uint16_t multifd_send_page(uint8_t *address, bool last_page)
 {
     int i, j;
     MultiFDSendParams *p = NULL; /* make happy gcc */
@@ -543,8 +544,10 @@ static int multifd_send_page(uint8_t *address)
     pages.iov[pages.num].iov_len = TARGET_PAGE_SIZE;
     pages.num++;
 
-    if (pages.num < (pages.size - 1)) {
-        return UINT16_MAX;
+    if (!last_page) {
+        if (pages.num < (pages.size - 1)) {
+            return UINT16_MAX;
+        }
     }
 
     qemu_sem_wait(&multifd_send_state->sem);
@@ -572,12 +575,17 @@ static int multifd_send_page(uint8_t *address)
 }
 
 struct MultiFDRecvParams {
+    /* not changed */
     uint8_t id;
     QemuThread thread;
     QIOChannel *c;
+    QemuSemaphore ready;
     QemuSemaphore sem;
     QemuMutex mutex;
+    /* proteced by param mutex */
     bool quit;
+    multifd_pages_t pages;
+    bool done;
 };
 typedef struct MultiFDRecvParams MultiFDRecvParams;
 
@@ -629,12 +637,20 @@ static void *multifd_recv_thread(void *opaque)
 {
     MultiFDRecvParams *p = opaque;
 
+    qemu_sem_post(&p->ready);
     while (true) {
         qemu_mutex_lock(&p->mutex);
         if (p->quit) {
             qemu_mutex_unlock(&p->mutex);
             break;
         }
+        if (p->pages.num) {
+            p->pages.num = 0;
+            p->done = true;
+            qemu_mutex_unlock(&p->mutex);
+            qemu_sem_post(&p->ready);
+            continue;
+        }
         qemu_mutex_unlock(&p->mutex);
         qemu_sem_wait(&p->sem);
     }
@@ -679,8 +695,11 @@ gboolean multifd_new_channel(QIOChannel *ioc)
     }
     qemu_mutex_init(&p->mutex);
     qemu_sem_init(&p->sem, 0);
+    qemu_sem_init(&p->ready, 0);
     p->quit = false;
     p->id = id;
+    p->done = false;
+    multifd_init_group(&p->pages);
     p->c = ioc;
     atomic_set(&multifd_recv_state->params[id], p);
     qemu_thread_create(&p->thread, "multifd_recv", multifd_recv_thread, p,
@@ -709,6 +728,42 @@ int multifd_load_setup(void)
     return 0;
 }
 
+static void multifd_recv_page(uint8_t *address, uint16_t fd_num)
+{
+    int thread_count;
+    MultiFDRecvParams *p;
+    static multifd_pages_t pages;
+    static bool once;
+
+    if (!once) {
+        multifd_init_group(&pages);
+        once = true;
+    }
+
+    pages.iov[pages.num].iov_base = address;
+    pages.iov[pages.num].iov_len = TARGET_PAGE_SIZE;
+    pages.num++;
+
+    if (fd_num == UINT16_MAX) {
+        return;
+    }
+
+    thread_count = migrate_multifd_threads();
+    assert(fd_num < thread_count);
+    p = multifd_recv_state->params[fd_num];
+
+    qemu_sem_wait(&p->ready);
+
+    qemu_mutex_lock(&p->mutex);
+    p->done = false;
+    iov_copy(p->pages.iov, pages.num, pages.iov, pages.num, 0,
+             iov_size(pages.iov, pages.num));
+    p->pages.num = pages.num;
+    pages.num = 0;
+    qemu_mutex_unlock(&p->mutex);
+    qemu_sem_post(&p->sem);
+}
+
 /**
  * save_page_header: write page header to wire
  *
@@ -1155,7 +1210,7 @@ static int ram_multifd_page(RAMState *rs, PageSearchStatus *pss,
         ram_counters.transferred +=
             save_page_header(rs, rs->f, block,
                              offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
-        fd_num = multifd_send_page(p);
+        fd_num = multifd_send_page(p, rs->migration_dirty_pages == 1);
         qemu_put_be16(rs->f, fd_num);
         ram_counters.transferred += 2; /* size of fd_num */
         qemu_put_buffer(rs->f, p, TARGET_PAGE_SIZE);
@@ -3020,10 +3075,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
 
         case RAM_SAVE_FLAG_MULTIFD_PAGE:
             fd_num = qemu_get_be16(f);
-            if (fd_num != 0) {
-                /* this is yet an unused variable, changed later */
-                fd_num = fd_num;
-            }
+            multifd_recv_page(host, fd_num);
             qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
             break;
 
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [Qemu-devel] [PATCH v5 14/17] migration: Delay the start of reception on main channel
  2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
                   ` (12 preceding siblings ...)
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 13/17] migration: Create thread infrastructure for multifd recv side Juan Quintela
@ 2017-07-17 13:42 ` Juan Quintela
  2017-07-20 10:56   ` Dr. David Alan Gilbert
  2017-07-20 11:10   ` Peter Xu
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 15/17] migration: Test new fd infrastructure Juan Quintela
                   ` (2 subsequent siblings)
  16 siblings, 2 replies; 93+ messages in thread
From: Juan Quintela @ 2017-07-17 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: dgilbert, lvivier, peterx, berrange

When we start multifd, we want to delay processing of the main channel
until the other channels are created.

Signed-off-by: Juan Quintela <quintela@redhat.com>
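
The decision itself is small: keep the listening watch registered while
multifd channels are still missing, and start processing once the last one
has arrived.  A trivial standalone sketch of that check (names and the
boolean convention mirroring the watch callbacks are assumptions):

    #include <stdbool.h>
    #include <stdio.h>

    /* true  -> keep the listener registered, more channels are expected
     * false -> everything is here, start processing the migration      */
    static bool expect_more_channels(bool use_multifd, int created, int wanted)
    {
        if (!use_multifd) {
            return false;            /* single channel: process right away */
        }
        return created < wanted;     /* wait until every multifd channel arrived */
    }

    int main(void)
    {
        printf("%d\n", expect_more_channels(true, 1, 4));    /* 1: keep waiting  */
        printf("%d\n", expect_more_channels(true, 4, 4));    /* 0: start loading */
        printf("%d\n", expect_more_channels(false, 0, 0));   /* 0: no multifd    */
        return 0;
    }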
---
 migration/migration.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index d9d5415..e122684 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -358,14 +358,11 @@ static void process_incoming_migration_co(void *opaque)
 
 static void migration_incoming_setup(QEMUFile *f)
 {
-    MigrationIncomingState *mis = migration_incoming_get_current();
-
     if (multifd_load_setup() != 0) {
         /* We haven't been able to create multifd threads
            nothing better to do */
         exit(EXIT_FAILURE);
     }
-    mis->from_src_file = f;
     qemu_file_set_blocking(f, false);
 }
 
@@ -384,18 +381,26 @@ void migration_fd_process_incoming(QEMUFile *f)
 gboolean migration_ioc_process_incoming(QIOChannel *ioc)
 {
     MigrationIncomingState *mis = migration_incoming_get_current();
+    gboolean result = FALSE;
 
     if (!mis->from_src_file) {
         QEMUFile *f = qemu_fopen_channel_input(ioc);
         mis->from_src_file = f;
-        migration_fd_process_incoming(f);
-        if (!migrate_use_multifd()) {
-            return FALSE;
-        } else {
-            return TRUE;
+        migration_incoming_setup(f);
+        if (migrate_use_multifd()) {
+            result = TRUE;
         }
+    } else {
+        /* we can only arrive here if multifd is on
+           and this is a new channel */
+        result = multifd_new_channel(ioc);
     }
-    return multifd_new_channel(ioc);
+    if (result == FALSE) {
+        /* called when !multifd and for last multifd channel */
+        migration_incoming_process();
+    }
+
+    return result;
 }
 
 /*
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [Qemu-devel] [PATCH v5 15/17] migration: Test new fd infrastructure
  2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
                   ` (13 preceding siblings ...)
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 14/17] migration: Delay the start of reception on main channel Juan Quintela
@ 2017-07-17 13:42 ` Juan Quintela
  2017-07-20 11:20   ` Dr. David Alan Gilbert
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 16/17] migration: Transfer pages over new channels Juan Quintela
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue Juan Quintela
  16 siblings, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-07-17 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: dgilbert, lvivier, peterx, berrange

For now we just send the page address through the alternate channels and
check on the receiving end that it matches the page we queued.

Signed-off-by: Juan Quintela <quintela@redhat.com>
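
In essence the test is: push the raw pointer value over the side channel and
verify the receiver reads back exactly the address that was queued.  A
self-contained sketch over a socketpair instead of QIOChannel (a single
process, so the addresses trivially match):

    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int fds[2];
        static char page[4096];
        void *sent = page, *received = NULL;

        if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0) {
            return 1;
        }
        /* sender: write the raw pointer value, not the page contents */
        if (write(fds[0], &sent, sizeof(sent)) != sizeof(sent)) {
            return 1;
        }
        /* receiver: read it back and check it is the page we queued */
        if (read(fds[1], &received, sizeof(received)) != sizeof(received)) {
            return 1;
        }
        if (received != sent) {
            fprintf(stderr, "expected %p, got %p\n", sent, received);
            return 1;
        }
        printf("address check passed for page %p\n", received);
        return 0;
    }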
---
 migration/ram.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 49c4880..b55b243 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -468,8 +468,26 @@ static void *multifd_send_thread(void *opaque)
             break;
         }
         if (p->pages.num) {
+            int i;
+            int num;
+
+            num = p->pages.num;
             p->pages.num = 0;
             qemu_mutex_unlock(&p->mutex);
+
+            for (i = 0; i < num; i++) {
+                if (qio_channel_write(p->c,
+                                      (const char *)&p->pages.iov[i].iov_base,
+                                      sizeof(uint8_t *), &error_abort)
+                    != sizeof(uint8_t *)) {
+                    MigrationState *s = migrate_get_current();
+
+                    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
+                                      MIGRATION_STATUS_FAILED);
+                    terminate_multifd_send_threads();
+                    return NULL;
+                }
+            }
             qemu_mutex_lock(&multifd_send_state->mutex);
             p->done = true;
             qemu_mutex_unlock(&multifd_send_state->mutex);
@@ -636,6 +654,7 @@ void multifd_load_cleanup(void)
 static void *multifd_recv_thread(void *opaque)
 {
     MultiFDRecvParams *p = opaque;
+    uint8_t *recv_address;
 
     qemu_sem_post(&p->ready);
     while (true) {
@@ -645,7 +664,38 @@ static void *multifd_recv_thread(void *opaque)
             break;
         }
         if (p->pages.num) {
+            int i;
+            int num;
+
+            num = p->pages.num;
             p->pages.num = 0;
+
+            for (i = 0; i < num; i++) {
+                if (qio_channel_read(p->c,
+                                     (char *)&recv_address,
+                                     sizeof(uint8_t *), &error_abort)
+                    != sizeof(uint8_t *)) {
+                    MigrationState *s = migrate_get_current();
+
+                    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
+                                      MIGRATION_STATUS_FAILED);
+                    terminate_multifd_recv_threads();
+                    return NULL;
+                }
+                if (recv_address != p->pages.iov[i].iov_base) {
+                    MigrationState *s = migrate_get_current();
+
+                    printf("We received %p what we were expecting %p (%d)\n",
+                           recv_address,
+                           p->pages.iov[i].iov_base, i);
+
+                    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
+                                      MIGRATION_STATUS_FAILED);
+                    terminate_multifd_recv_threads();
+                    return NULL;
+                }
+            }
+
             p->done = true;
             qemu_mutex_unlock(&p->mutex);
             qemu_sem_post(&p->ready);
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [Qemu-devel] [PATCH v5 16/17] migration: Transfer pages over new channels
  2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
                   ` (14 preceding siblings ...)
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 15/17] migration: Test new fd infrastructure Juan Quintela
@ 2017-07-17 13:42 ` Juan Quintela
  2017-07-20 11:31   ` Dr. David Alan Gilbert
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue Juan Quintela
  16 siblings, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-07-17 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: dgilbert, lvivier, peterx, berrange

We switch from sending the page address to sending the real pages.

Signed-off-by: Juan Quintela <quintela@redhat.com>

--

Remove the HACK bit: the function that calculates the size of a
target page is now exported.
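
For the bandwidth accounting change to migration.c below, a small worked
example of the adjusted formula, where multifd traffic is counted as pages
times the target page size on top of the main-stream qemu_ftell() delta
(all numbers are invented):

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        const uint64_t page_size = 4096;               /* qemu_target_page_size() stand-in */

        /* counters sampled on the previous pass through the loop */
        uint64_t qemu_file_bytes = 10 * 1024 * 1024;   /* qemu_ftell(s->to_dst_file) */
        uint64_t multifd_pages   = 2000;               /* ram_counters.multifd       */

        /* counters sampled again, time_spent milliseconds later */
        uint64_t qemu_file_bytes_now = 11 * 1024 * 1024;
        uint64_t multifd_pages_now   = 52000;
        uint64_t time_spent          = 100;            /* ms */

        uint64_t transferred = (qemu_file_bytes_now - qemu_file_bytes) +
                               (multifd_pages_now - multifd_pages) * page_size;
        double bandwidth = (double)transferred / time_spent;    /* bytes per ms */

        printf("transferred=%" PRIu64 " bytes, bandwidth=%.1f bytes/ms\n",
               transferred, bandwidth);
        return 0;
    }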
---
 migration/migration.c | 14 ++++++++----
 migration/ram.c       | 59 +++++++++++++++++----------------------------------
 2 files changed, 29 insertions(+), 44 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index e122684..34a34b7 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1882,13 +1882,14 @@ static void *migration_thread(void *opaque)
     /* Used by the bandwidth calcs, updated later */
     int64_t initial_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
     int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
-    int64_t initial_bytes = 0;
     /*
      * The final stage happens when the remaining data is smaller than
      * this threshold; it's calculated from the requested downtime and
      * measured bandwidth
      */
     int64_t threshold_size = 0;
+    int64_t qemu_file_bytes = 0;
+    int64_t multifd_pages = 0;
     int64_t start_time = initial_time;
     int64_t end_time;
     bool old_vm_running = false;
@@ -1976,9 +1977,13 @@ static void *migration_thread(void *opaque)
         }
         current_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
         if (current_time >= initial_time + BUFFER_DELAY) {
-            uint64_t transferred_bytes = qemu_ftell(s->to_dst_file) -
-                                         initial_bytes;
             uint64_t time_spent = current_time - initial_time;
+            uint64_t qemu_file_bytes_now = qemu_ftell(s->to_dst_file);
+            uint64_t multifd_pages_now = ram_counters.multifd;
+            uint64_t transferred_bytes =
+                (qemu_file_bytes_now - qemu_file_bytes) +
+                (multifd_pages_now - multifd_pages) *
+                qemu_target_page_size();
             double bandwidth = (double)transferred_bytes / time_spent;
             threshold_size = bandwidth * s->parameters.downtime_limit;
 
@@ -1996,7 +2001,8 @@ static void *migration_thread(void *opaque)
 
             qemu_file_reset_rate_limit(s->to_dst_file);
             initial_time = current_time;
-            initial_bytes = qemu_ftell(s->to_dst_file);
+            qemu_file_bytes = qemu_file_bytes_now;
+            multifd_pages = multifd_pages_now;
         }
         if (qemu_file_rate_limit(s->to_dst_file)) {
             /* usleep expects microseconds */
diff --git a/migration/ram.c b/migration/ram.c
index b55b243..c78b286 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -468,25 +468,21 @@ static void *multifd_send_thread(void *opaque)
             break;
         }
         if (p->pages.num) {
-            int i;
             int num;
 
             num = p->pages.num;
             p->pages.num = 0;
             qemu_mutex_unlock(&p->mutex);
 
-            for (i = 0; i < num; i++) {
-                if (qio_channel_write(p->c,
-                                      (const char *)&p->pages.iov[i].iov_base,
-                                      sizeof(uint8_t *), &error_abort)
-                    != sizeof(uint8_t *)) {
-                    MigrationState *s = migrate_get_current();
+            if (qio_channel_writev_all(p->c, p->pages.iov,
+                                       num, &error_abort)
+                != num * TARGET_PAGE_SIZE) {
+                MigrationState *s = migrate_get_current();
 
-                    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
-                                      MIGRATION_STATUS_FAILED);
-                    terminate_multifd_send_threads();
-                    return NULL;
-                }
+                migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
+                                  MIGRATION_STATUS_FAILED);
+                terminate_multifd_send_threads();
+                return NULL;
             }
             qemu_mutex_lock(&multifd_send_state->mutex);
             p->done = true;
@@ -654,7 +650,6 @@ void multifd_load_cleanup(void)
 static void *multifd_recv_thread(void *opaque)
 {
     MultiFDRecvParams *p = opaque;
-    uint8_t *recv_address;
 
     qemu_sem_post(&p->ready);
     while (true) {
@@ -664,38 +659,21 @@ static void *multifd_recv_thread(void *opaque)
             break;
         }
         if (p->pages.num) {
-            int i;
             int num;
 
             num = p->pages.num;
             p->pages.num = 0;
 
-            for (i = 0; i < num; i++) {
-                if (qio_channel_read(p->c,
-                                     (char *)&recv_address,
-                                     sizeof(uint8_t *), &error_abort)
-                    != sizeof(uint8_t *)) {
-                    MigrationState *s = migrate_get_current();
+            if (qio_channel_readv_all(p->c, p->pages.iov,
+                                      num, &error_abort)
+                != num * TARGET_PAGE_SIZE) {
+                MigrationState *s = migrate_get_current();
 
-                    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
-                                      MIGRATION_STATUS_FAILED);
-                    terminate_multifd_recv_threads();
-                    return NULL;
-                }
-                if (recv_address != p->pages.iov[i].iov_base) {
-                    MigrationState *s = migrate_get_current();
-
-                    printf("We received %p what we were expecting %p (%d)\n",
-                           recv_address,
-                           p->pages.iov[i].iov_base, i);
-
-                    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
-                                      MIGRATION_STATUS_FAILED);
-                    terminate_multifd_recv_threads();
-                    return NULL;
-                }
+                migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
+                                  MIGRATION_STATUS_FAILED);
+                terminate_multifd_recv_threads();
+                return NULL;
             }
-
             p->done = true;
             qemu_mutex_unlock(&p->mutex);
             qemu_sem_post(&p->ready);
@@ -1262,8 +1240,10 @@ static int ram_multifd_page(RAMState *rs, PageSearchStatus *pss,
                              offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
         fd_num = multifd_send_page(p, rs->migration_dirty_pages == 1);
         qemu_put_be16(rs->f, fd_num);
+        if (fd_num != UINT16_MAX) {
+            qemu_fflush(rs->f);
+        }
         ram_counters.transferred += 2; /* size of fd_num */
-        qemu_put_buffer(rs->f, p, TARGET_PAGE_SIZE);
         ram_counters.transferred += TARGET_PAGE_SIZE;
         pages = 1;
         ram_counters.normal++;
@@ -3126,7 +3106,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
         case RAM_SAVE_FLAG_MULTIFD_PAGE:
             fd_num = qemu_get_be16(f);
             multifd_recv_page(host, fd_num);
-            qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
             break;
 
         case RAM_SAVE_FLAG_EOS:
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue
  2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
                   ` (15 preceding siblings ...)
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 16/17] migration: Transfer pages over new channels Juan Quintela
@ 2017-07-17 13:42 ` Juan Quintela
  2017-07-20 11:45   ` Dr. David Alan Gilbert
                     ` (2 more replies)
  16 siblings, 3 replies; 93+ messages in thread
From: Juan Quintela @ 2017-07-17 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: dgilbert, lvivier, peterx, berrange

Each time that we sync the bitmap, there is a possibility that we receive
a page that is being processed by a different thread.  We fix this
problem by making sure that we wait for all receiving threads to
finish their work before we proceed with the next stage.

We are low on page flags, so we use a combination that is not valid to
emit that message:  MULTIFD_PAGE and COMPRESSED.

I tried to make a migration command for it, but it doesn't work because
we sometimes sync the bitmap when we have already sent the beginning
of the section, so I just added a new page flag.

Signed-off-by: Juan Quintela <quintela@redhat.com>
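
The flush itself is a plain condition-variable wait: the main thread blocks
until each receiving thread has marked itself done.  A reduced standalone
pthread sketch of that handshake, with everything except the synchronisation
skeleton stripped out (names and the single worker are assumptions):

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    static struct {
        pthread_mutex_t mutex;
        pthread_cond_t cond_sync;
        int done;      /* worker has no queued pages left    */
        int sync;      /* a flush is waiting for this worker */
    } p = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 };

    static void *recv_thread(void *opaque)
    {
        (void)opaque;
        usleep(10000);                       /* pretend to drain the queued pages */
        pthread_mutex_lock(&p.mutex);
        p.done = 1;
        if (p.sync) {                        /* somebody is flushing: wake them */
            pthread_cond_signal(&p.cond_sync);
            p.sync = 0;
        }
        pthread_mutex_unlock(&p.mutex);
        return NULL;
    }

    /* called before the next stage: block until the worker is idle */
    static void multifd_flush_one(void)
    {
        pthread_mutex_lock(&p.mutex);
        while (!p.done) {
            p.sync = 1;
            pthread_cond_wait(&p.cond_sync, &p.mutex);
        }
        pthread_mutex_unlock(&p.mutex);
    }

    int main(void)
    {
        pthread_t tid;

        pthread_create(&tid, NULL, recv_thread, NULL);
        multifd_flush_one();
        printf("receive thread drained, safe to sync the bitmap\n");
        pthread_join(tid, NULL);
        return 0;
    }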
---
 migration/ram.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 56 insertions(+), 1 deletion(-)

diff --git a/migration/ram.c b/migration/ram.c
index c78b286..bffe204 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -71,6 +71,12 @@
 #define RAM_SAVE_FLAG_COMPRESS_PAGE    0x100
 #define RAM_SAVE_FLAG_MULTIFD_PAGE     0x200
 
+/* We are getting low on pages flags, so we start using combinations
+   When we need to flush a page, we sent it as
+   RAM_SAVE_FLAG_MULTIFD_PAGE | RAM_SAVE_FLAG_COMPRESS_PAGE
+   We don't allow that combination
+*/
+
 static inline bool is_zero_range(uint8_t *p, uint64_t size)
 {
     return buffer_is_zero(p, size);
@@ -193,6 +199,9 @@ struct RAMState {
     uint64_t iterations_prev;
     /* Iterations since start */
     uint64_t iterations;
+    /* Indicates if we have synced the bitmap and we need to assure that
+       target has processeed all previous pages */
+    bool multifd_needs_flush;
     /* protects modification of the bitmap */
     uint64_t migration_dirty_pages;
     /* number of dirty bits in the bitmap */
@@ -363,7 +372,6 @@ static void compress_threads_save_setup(void)
 
 /* Multiple fd's */
 
-
 typedef struct {
     int num;
     int size;
@@ -595,9 +603,11 @@ struct MultiFDRecvParams {
     QIOChannel *c;
     QemuSemaphore ready;
     QemuSemaphore sem;
+    QemuCond cond_sync;
     QemuMutex mutex;
     /* proteced by param mutex */
     bool quit;
+    bool sync;
     multifd_pages_t pages;
     bool done;
 };
@@ -637,6 +647,7 @@ void multifd_load_cleanup(void)
         qemu_thread_join(&p->thread);
         qemu_mutex_destroy(&p->mutex);
         qemu_sem_destroy(&p->sem);
+        qemu_cond_destroy(&p->cond_sync);
         socket_recv_channel_destroy(p->c);
         g_free(p);
         multifd_recv_state->params[i] = NULL;
@@ -675,6 +686,10 @@ static void *multifd_recv_thread(void *opaque)
                 return NULL;
             }
             p->done = true;
+            if (p->sync) {
+                qemu_cond_signal(&p->cond_sync);
+                p->sync = false;
+            }
             qemu_mutex_unlock(&p->mutex);
             qemu_sem_post(&p->ready);
             continue;
@@ -724,9 +739,11 @@ gboolean multifd_new_channel(QIOChannel *ioc)
     qemu_mutex_init(&p->mutex);
     qemu_sem_init(&p->sem, 0);
     qemu_sem_init(&p->ready, 0);
+    qemu_cond_init(&p->cond_sync);
     p->quit = false;
     p->id = id;
     p->done = false;
+    p->sync = false;
     multifd_init_group(&p->pages);
     p->c = ioc;
     atomic_set(&multifd_recv_state->params[id], p);
@@ -792,6 +809,27 @@ static void multifd_recv_page(uint8_t *address, uint16_t fd_num)
     qemu_sem_post(&p->sem);
 }
 
+static int multifd_flush(void)
+{
+    int i, thread_count;
+
+    if (!migrate_use_multifd()) {
+        return 0;
+    }
+    thread_count = migrate_multifd_threads();
+    for (i = 0; i < thread_count; i++) {
+        MultiFDRecvParams *p = multifd_recv_state->params[i];
+
+        qemu_mutex_lock(&p->mutex);
+        while (!p->done) {
+            p->sync = true;
+            qemu_cond_wait(&p->cond_sync, &p->mutex);
+        }
+        qemu_mutex_unlock(&p->mutex);
+    }
+    return 0;
+}
+
 /**
  * save_page_header: write page header to wire
  *
@@ -809,6 +847,12 @@ static size_t save_page_header(RAMState *rs, QEMUFile *f,  RAMBlock *block,
 {
     size_t size, len;
 
+    if (rs->multifd_needs_flush &&
+        (offset & RAM_SAVE_FLAG_MULTIFD_PAGE)) {
+        offset |= RAM_SAVE_FLAG_ZERO;
+        rs->multifd_needs_flush = false;
+    }
+
     if (block == rs->last_sent_block) {
         offset |= RAM_SAVE_FLAG_CONTINUE;
     }
@@ -2496,6 +2540,9 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
 
     if (!migration_in_postcopy()) {
         migration_bitmap_sync(rs);
+        if (migrate_use_multifd()) {
+            rs->multifd_needs_flush = true;
+        }
     }
 
     ram_control_before_iterate(f, RAM_CONTROL_FINISH);
@@ -2538,6 +2585,9 @@ static void ram_save_pending(QEMUFile *f, void *opaque, uint64_t max_size,
         qemu_mutex_lock_iothread();
         rcu_read_lock();
         migration_bitmap_sync(rs);
+        if (migrate_use_multifd()) {
+            rs->multifd_needs_flush = true;
+        }
         rcu_read_unlock();
         qemu_mutex_unlock_iothread();
         remaining_size = rs->migration_dirty_pages * TARGET_PAGE_SIZE;
@@ -3012,6 +3062,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
             break;
         }
 
+        if ((flags & (RAM_SAVE_FLAG_MULTIFD_PAGE | RAM_SAVE_FLAG_ZERO))
+                  == (RAM_SAVE_FLAG_MULTIFD_PAGE | RAM_SAVE_FLAG_ZERO)) {
+            multifd_flush();
+            flags = flags & ~RAM_SAVE_FLAG_ZERO;
+        }
         if (flags & (RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE |
                      RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE |
                      RAM_SAVE_FLAG_MULTIFD_PAGE)) {
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 02/17] migration: Create migration_ioc_process_incoming()
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 02/17] migration: Create migration_ioc_process_incoming() Juan Quintela
@ 2017-07-19 13:38   ` Daniel P. Berrange
  2017-07-24 11:09     ` Juan Quintela
  0 siblings, 1 reply; 93+ messages in thread
From: Daniel P. Berrange @ 2017-07-19 13:38 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, dgilbert, lvivier, peterx

On Mon, Jul 17, 2017 at 03:42:23PM +0200, Juan Quintela wrote:
> We need to receive the ioc to be able to implement multifd.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  migration/channel.c   |  3 +--
>  migration/migration.c | 16 +++++++++++++---
>  migration/migration.h |  2 ++
>  3 files changed, 16 insertions(+), 5 deletions(-)
> 
> diff --git a/migration/channel.c b/migration/channel.c
> index 719055d..5b777ef 100644
> --- a/migration/channel.c
> +++ b/migration/channel.c
> @@ -36,8 +36,7 @@ gboolean migration_channel_process_incoming(QIOChannel *ioc)
>              error_report_err(local_err);
>          }
>      } else {
> -        QEMUFile *f = qemu_fopen_channel_input(ioc);
> -        migration_fd_process_incoming(f);
> +        return migration_ioc_process_incoming(ioc);
>      }
>      return FALSE; /* unregister */
>  }

This is going to break TLS with multi FD I'm afraid.


We have two code paths:

 1. Non-TLS

    event loop POLLIN on migration listener socket
     +-> socket_accept_incoming_migration()
          +-> migration_channel_process_incoming()
	       +-> migration_ioc_process_incoming()
	            -> returns FALSE if all required FD channels are now present

 2. TLS

    event loop POLLIN on migration listener socket
     +-> socket_accept_incoming_migration()
          +-> migration_channel_process_incoming()
	       +-> migration_tls_channel_process_incoming
	            -> Registers watch for TLS handhsake on client socket
	            -> returns FALSE immediately to remove listener watch

    event loop POLLIN on migration *client* socket
     +-> migration_tls_incoming_handshake
          +-> migration_channel_process_incoming()
	       +-> migration_ioc_process_incoming()
	            -> return value ignored


So, in this patch you're going to immediately unregister the
migration listener socket watch when the TLS handshake
starts.

You can't rely on propagating a return value back from
migration_ioc_process_incoming(), because that is called
from a different context when using TLS.

To fix this we need to modify socket_accept_incoming_migration()
so that it can do a call like

   if (migration_expect_more_clients())
       return TRUE;
   else
       return FALSE;

and have migration_expect_more_clients() do something like

    if (migrate_use_multifd() && multifd_recv_state->count < thread_count)
        return TRUE;
    else
        return FALSE;

> diff --git a/migration/migration.c b/migration/migration.c
> index a0db40d..c24ad03 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -299,17 +299,15 @@ static void process_incoming_migration_bh(void *opaque)
>  
>  static void process_incoming_migration_co(void *opaque)
>  {
> -    QEMUFile *f = opaque;
>      MigrationIncomingState *mis = migration_incoming_get_current();
>      PostcopyState ps;
>      int ret;
>  
> -    mis->from_src_file = f;
>      mis->largest_page_size = qemu_ram_pagesize_largest();
>      postcopy_state_set(POSTCOPY_INCOMING_NONE);
>      migrate_set_state(&mis->state, MIGRATION_STATUS_NONE,
>                        MIGRATION_STATUS_ACTIVE);
> -    ret = qemu_loadvm_state(f);
> +    ret = qemu_loadvm_state(mis->from_src_file);
>  
>      ps = postcopy_state_get();
>      trace_process_incoming_migration_co_end(ret, ps);
> @@ -362,6 +360,18 @@ void migration_fd_process_incoming(QEMUFile *f)
>      qemu_coroutine_enter(co);
>  }
>  
> +gboolean migration_ioc_process_incoming(QIOChannel *ioc)
> +{
> +    MigrationIncomingState *mis = migration_incoming_get_current();
> +
> +    if (!mis->from_src_file) {
> +        QEMUFile *f = qemu_fopen_channel_input(ioc);
> +        mis->from_src_file = f;
> +        migration_fd_process_incoming(f);
> +    }
> +    return FALSE; /* unregister */
> +}
> +
>  /*
>   * Send a 'SHUT' message on the return channel with the given value
>   * to indicate that we've finished with the RP.  Non-0 value indicates
> diff --git a/migration/migration.h b/migration/migration.h
> index 148c9fa..5a18aea 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -20,6 +20,7 @@
>  #include "exec/cpu-common.h"
>  #include "qemu/coroutine_int.h"
>  #include "hw/qdev.h"
> +#include "io/channel.h"
>  
>  /* State for the incoming migration */
>  struct MigrationIncomingState {
> @@ -152,6 +153,7 @@ struct MigrationState
>  void migrate_set_state(int *state, int old_state, int new_state);
>  
>  void migration_fd_process_incoming(QEMUFile *f);
> +gboolean migration_ioc_process_incoming(QIOChannel *ioc);
>  
>  uint64_t migrate_max_downtime(void);
>  
> -- 
> 2.9.4
> 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/17] qio: Create new qio_channel_{readv, writev}_all
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 03/17] qio: Create new qio_channel_{readv, writev}_all Juan Quintela
@ 2017-07-19 13:44   ` Daniel P. Berrange
  2017-08-08  8:40     ` Juan Quintela
  2017-07-19 15:42   ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 93+ messages in thread
From: Daniel P. Berrange @ 2017-07-19 13:44 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, dgilbert, lvivier, peterx

On Mon, Jul 17, 2017 at 03:42:24PM +0200, Juan Quintela wrote:
> The functions waits until it is able to write the full iov.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> 
> --
> 
> Add tests.
> ---
>  include/io/channel.h           | 46 +++++++++++++++++++++++++
>  io/channel.c                   | 76 ++++++++++++++++++++++++++++++++++++++++++
>  migration/qemu-file-channel.c  | 29 +---------------
>  tests/io-channel-helpers.c     | 55 ++++++++++++++++++++++++++++++
>  tests/io-channel-helpers.h     |  4 +++
>  tests/test-io-channel-buffer.c | 55 ++++++++++++++++++++++++++++--
>  6 files changed, 234 insertions(+), 31 deletions(-)



> diff --git a/io/channel.c b/io/channel.c
> index cdf7454..82203ef 100644
> --- a/io/channel.c
> +++ b/io/channel.c
> @@ -22,6 +22,7 @@
>  #include "io/channel.h"
>  #include "qapi/error.h"
>  #include "qemu/main-loop.h"
> +#include "qemu/iov.h"
>  
>  bool qio_channel_has_feature(QIOChannel *ioc,
>                               QIOChannelFeature feature)
> @@ -85,6 +86,81 @@ ssize_t qio_channel_writev_full(QIOChannel *ioc,
>  }
>  
>  
> +
> +ssize_t qio_channel_readv_all(QIOChannel *ioc,
> +                              const struct iovec *iov,
> +                              size_t niov,
> +                              Error **errp)
> +{
> +    ssize_t done = 0;
> +    struct iovec *local_iov = g_new(struct iovec, niov);
> +    struct iovec *local_iov_head = local_iov;
> +    unsigned int nlocal_iov = niov;
> +
> +    nlocal_iov = iov_copy(local_iov, nlocal_iov,
> +                          iov, niov,
> +                          0, iov_size(iov, niov));
> +
> +    while (nlocal_iov > 0) {
> +        ssize_t len;
> +        len = qio_channel_readv(ioc, local_iov, nlocal_iov, errp);
> +        if (len == QIO_CHANNEL_ERR_BLOCK) {
> +            qio_channel_wait(ioc, G_IO_OUT);
> +            continue;
> +        }
> +        if (len < 0) {
> +            error_setg_errno(errp, EIO,
> +                             "Channel was not able to read full iov");
> +            done = -1;
> +            goto cleanup;
> +        }
> +
> +        iov_discard_front(&local_iov, &nlocal_iov, len);
> +        done += len;
> +    }

If 'len == 0' (i.e. EOF from qio_channel_readv()) then this will busy-loop.
You need to break the loop on that condition and return whatever
'done' currently is.
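
i.e. something like this inside the loop (sketch):

    if (len == 0) {
        /* EOF: the peer closed the channel before the iov was filled */
        break;
    }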

> +
> + cleanup:
> +    g_free(local_iov_head);
> +    return done;
> +}


> diff --git a/tests/io-channel-helpers.c b/tests/io-channel-helpers.c
> index 05e5579..3d76d95 100644
> --- a/tests/io-channel-helpers.c
> +++ b/tests/io-channel-helpers.c
> @@ -21,6 +21,7 @@
>  #include "qemu/osdep.h"
>  #include "io-channel-helpers.h"
>  #include "qapi/error.h"
> +#include "qemu/iov.h"
>  
>  struct QIOChannelTest {
>      QIOChannel *src;
> @@ -153,6 +154,45 @@ static gpointer test_io_thread_reader(gpointer opaque)
>      return NULL;
>  }
>  
> +static gpointer test_io_thread_writer_all(gpointer opaque)
> +{
> +    QIOChannelTest *data = opaque;
> +    size_t niov = data->niov;
> +    ssize_t ret;
> +
> +    qio_channel_set_blocking(data->src, data->blocking, NULL);
> +
> +    ret = qio_channel_writev_all(data->src,
> +                                 data->inputv,
> +                                 niov,
> +                                 &data->writeerr);
> +    if (ret != iov_size(data->inputv, data->niov)) {
> +        error_setg(&data->writeerr, "Unexpected I/O error");
> +    }
> +
> +    return NULL;
> +}
> +
> +/* This thread receives all data using iovecs */
> +static gpointer test_io_thread_reader_all(gpointer opaque)
> +{
> +    QIOChannelTest *data = opaque;
> +    size_t niov = data->niov;
> +    ssize_t ret;
> +
> +    qio_channel_set_blocking(data->dst, data->blocking, NULL);
> +
> +    ret = qio_channel_readv_all(data->dst,
> +                                data->outputv,
> +                                niov,
> +                                &data->readerr);
> +
> +    if (ret != iov_size(data->inputv, data->niov)) {
> +        error_setg(&data->readerr, "Unexpected I/O error");
> +    }
> +
> +    return NULL;
> +}
>  
>  QIOChannelTest *qio_channel_test_new(void)
>  {
> @@ -231,6 +271,21 @@ void qio_channel_test_run_reader(QIOChannelTest *test,
>      test->dst = NULL;
>  }
>  
> +void qio_channel_test_run_writer_all(QIOChannelTest *test,
> +                                     QIOChannel *src)
> +{
> +    test->src = src;
> +    test_io_thread_writer_all(test);
> +    test->src = NULL;
> +}
> +
> +void qio_channel_test_run_reader_all(QIOChannelTest *test,
> +                                     QIOChannel *dst)
> +{
> +    test->dst = dst;
> +    test_io_thread_reader_all(test);
> +    test->dst = NULL;
> +}
>  
>  void qio_channel_test_validate(QIOChannelTest *test)
>  {
> diff --git a/tests/io-channel-helpers.h b/tests/io-channel-helpers.h
> index fedc64f..17b9647 100644
> --- a/tests/io-channel-helpers.h
> +++ b/tests/io-channel-helpers.h
> @@ -36,6 +36,10 @@ void qio_channel_test_run_writer(QIOChannelTest *test,
>                                   QIOChannel *src);
>  void qio_channel_test_run_reader(QIOChannelTest *test,
>                                   QIOChannel *dst);
> +void qio_channel_test_run_writer_all(QIOChannelTest *test,
> +                                     QIOChannel *src);
> +void qio_channel_test_run_reader_all(QIOChannelTest *test,
> +                                     QIOChannel *dst);
>  
>  void qio_channel_test_validate(QIOChannelTest *test);
>  
> diff --git a/tests/test-io-channel-buffer.c b/tests/test-io-channel-buffer.c
> index 64722a2..4bf64ae 100644
> --- a/tests/test-io-channel-buffer.c
> +++ b/tests/test-io-channel-buffer.c
> @@ -22,8 +22,7 @@
>  #include "io/channel-buffer.h"
>  #include "io-channel-helpers.h"
>  
> -
> -static void test_io_channel_buf(void)
> +static void test_io_channel_buf1(void)
>  {
>      QIOChannelBuffer *buf;
>      QIOChannelTest *test;
> @@ -39,6 +38,53 @@ static void test_io_channel_buf(void)
>      object_unref(OBJECT(buf));
>  }
>  
> +static void test_io_channel_buf2(void)
> +{
> +    QIOChannelBuffer *buf;
> +    QIOChannelTest *test;
> +
> +    buf = qio_channel_buffer_new(0);
> +
> +    test = qio_channel_test_new();
> +    qio_channel_test_run_writer_all(test, QIO_CHANNEL(buf));
> +    buf->offset = 0;
> +    qio_channel_test_run_reader(test, QIO_CHANNEL(buf));
> +    qio_channel_test_validate(test);
> +
> +    object_unref(OBJECT(buf));
> +}
> +
> +static void test_io_channel_buf3(void)
> +{
> +    QIOChannelBuffer *buf;
> +    QIOChannelTest *test;
> +
> +    buf = qio_channel_buffer_new(0);
> +
> +    test = qio_channel_test_new();
> +    qio_channel_test_run_writer(test, QIO_CHANNEL(buf));
> +    buf->offset = 0;
> +    qio_channel_test_run_reader_all(test, QIO_CHANNEL(buf));
> +    qio_channel_test_validate(test);
> +
> +    object_unref(OBJECT(buf));
> +}
> +
> +static void test_io_channel_buf4(void)
> +{
> +    QIOChannelBuffer *buf;
> +    QIOChannelTest *test;
> +
> +    buf = qio_channel_buffer_new(0);
> +
> +    test = qio_channel_test_new();
> +    qio_channel_test_run_writer_all(test, QIO_CHANNEL(buf));
> +    buf->offset = 0;
> +    qio_channel_test_run_reader_all(test, QIO_CHANNEL(buf));
> +    qio_channel_test_validate(test);
> +
> +    object_unref(OBJECT(buf));
> +}
>  
>  int main(int argc, char **argv)
>  {
> @@ -46,6 +92,9 @@ int main(int argc, char **argv)
>  
>      g_test_init(&argc, &argv, NULL);
>  
> -    g_test_add_func("/io/channel/buf", test_io_channel_buf);
> +    g_test_add_func("/io/channel/buf1", test_io_channel_buf1);
> +    g_test_add_func("/io/channel/buf2", test_io_channel_buf2);
> +    g_test_add_func("/io/channel/buf3", test_io_channel_buf3);
> +    g_test_add_func("/io/channel/buf4", test_io_channel_buf4);
>      return g_test_run();
>  }

There's no need to add any of these new tests to the test suite.  Instead
you can just change the existing io-channel-helpers.c functions
test_io_thread_writer() and test_io_thread_reader(), to call
qio_channel_writev_all() & qio_channel_readv_all() respectively.
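
e.g. roughly (untested sketch, reusing the existing field names):

    /* body of test_io_thread_writer() */
    qio_channel_set_blocking(data->src, data->blocking, NULL);
    /* writeerr is filled in on failure */
    qio_channel_writev_all(data->src, data->inputv, data->niov,
                           &data->writeerr);

    /* ... and the same pattern with qio_channel_readv_all() in
       test_io_thread_reader() */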

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work Juan Quintela
@ 2017-07-19 13:56   ` Daniel P. Berrange
  2017-07-19 17:35   ` Dr. David Alan Gilbert
  2017-07-20  9:34   ` Peter Xu
  2 siblings, 0 replies; 93+ messages in thread
From: Daniel P. Berrange @ 2017-07-19 13:56 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, dgilbert, lvivier, peterx

On Mon, Jul 17, 2017 at 03:42:30PM +0200, Juan Quintela wrote:
> We create new channels for each new thread created. We only send through
> them a character to be sure that we are creating the channels in the
> right order.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> 
> --
> Split SocketArgs into incoming and outgoing args
> 
> Use UUID's on the initial message, so we are sure we are connecting to
> the right channel.
> 
> Remove init semaphore.  Now that we use uuids on the init message, we
> know that this is our channel.
> 
> Fix recv socket destwroy, we were destroying send channels.
> This was very interesting, because we were using an unreferred object
> without problems.
> 
> Move to struct of pointers
> init channel sooner.
> split recv thread creation.
> listen on main thread
> ---
>  migration/migration.c |   7 ++-
>  migration/ram.c       | 118 ++++++++++++++++++++++++++++++++++++++++++--------
>  migration/ram.h       |   2 +
>  migration/socket.c    |  38 ++++++++++++++--
>  migration/socket.h    |  10 +++++
>  5 files changed, 152 insertions(+), 23 deletions(-)
> 


> diff --git a/migration/ram.c b/migration/ram.c
> index 8e87533..b80f511 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -408,11 +413,38 @@ void multifd_save_cleanup(void)
>      multifd_send_state = NULL;
>  }
>  
> +/* Default uuid for multifd when qemu is not started with uuid */
> +static char multifd_uuid[] = "5c49fd7e-af88-4a07-b6e8-091fd696ad40";
> +/* strlen(multifd) + '-' + <channel id> + '-' +  UUID_FMT + '\0' */
> +#define MULTIFD_UUID_MSG (7 + 1 + 3 + 1 + UUID_FMT_LEN + 1)

> +
>  static void *multifd_send_thread(void *opaque)
>  {
>      MultiFDSendParams *p = opaque;
> +    char string[MULTIFD_UUID_MSG];
> +    char *string_uuid;
> +    int res;
> +    bool exit = false;
>  
> -    while (true) {
> +    if (qemu_uuid_set) {
> +        string_uuid = qemu_uuid_unparse_strdup(&qemu_uuid);
> +    } else {
> +        string_uuid = g_strdup(multifd_uuid);
> +    }
> +    res = snprintf(string, MULTIFD_UUID_MSG, "%s multifd %03d",
> +                   string_uuid, p->id);

Just use g_strdup_printf() here and avoid the error-prone
logic for calculating the "correct" buffer size.
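
i.e. (sketch):

    char *string = g_strdup_printf("%s multifd %03d", string_uuid, p->id);
    /* ... send strlen(string) + 1 bytes ... */
    g_free(string);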

> +    g_free(string_uuid);
> +
> +    /* -1 due to the wonders of '\0' accounting */
> +    if (res != (MULTIFD_UUID_MSG - 1)) {
> +        error_report("Multifd UUID message '%s' is not of right length",
> +            string);
> +        exit = true;
> +    } else {
> +        qio_channel_write(p->c, string, MULTIFD_UUID_MSG, &error_abort);

Ewwww, you can't have QEMU abort when there's an I/O error on
a file descriptor. It needs to fail the migration cleanly.
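
Something along these lines instead (sketch; the exact failure path is up
to you):

    Error *local_err = NULL;

    if (qio_channel_write(p->c, string, MULTIFD_UUID_MSG, &local_err) < 0) {
        error_report_err(local_err);
        exit = true;    /* and mark the migration FAILED instead of aborting */
    }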

> +    }
> +
> +    while (!exit) {
>          qemu_mutex_lock(&p->mutex);
>          if (p->quit) {
>              qemu_mutex_unlock(&p->mutex);

> +gboolean multifd_new_channel(QIOChannel *ioc)
> +{
> +    int thread_count = migrate_multifd_threads();
> +    MultiFDRecvParams *p = g_new0(MultiFDRecvParams, 1);
> +    MigrationState *s = migrate_get_current();
> +    char string[MULTIFD_UUID_MSG];
> +    char string_uuid[UUID_FMT_LEN];
> +    char *uuid;
> +    int id;
> +
> +    qio_channel_read(ioc, string, sizeof(string), &error_abort);

Again, we can't abort QEMU on I/O errors

> +    sscanf(string, "%s multifd %03d", string_uuid, &id);
> +
> +    if (qemu_uuid_set) {
> +        uuid = qemu_uuid_unparse_strdup(&qemu_uuid);
> +    } else {
> +        uuid = g_strdup(multifd_uuid);
> +    }
> +    if (strcmp(string_uuid, uuid)) {
> +        error_report("multifd: received uuid '%s' and expected uuid '%s'",
> +                     string_uuid, uuid);
> +        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> +                          MIGRATION_STATUS_FAILED);
> +        terminate_multifd_recv_threads();
> +        return FALSE;
> +    }
> +    g_free(uuid);
> +
> +    if (multifd_recv_state->params[id] != NULL) {
> +        error_report("multifd: received id '%d' is already setup'", id);
> +        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> +                          MIGRATION_STATUS_FAILED);
> +        terminate_multifd_recv_threads();
> +        return FALSE;
> +    }
> +    qemu_mutex_init(&p->mutex);
> +    qemu_sem_init(&p->sem, 0);
> +    p->quit = false;
> +    p->id = id;
> +    p->c = ioc;
> +    atomic_set(&multifd_recv_state->params[id], p);
> +    qemu_thread_create(&p->thread, "multifd_recv", multifd_recv_thread, p,
> +                       QEMU_THREAD_JOINABLE);
> +    multifd_recv_state->count++;
> +
> +    /* We need to return FALSE for the last channel */
> +    if (multifd_recv_state->count == thread_count) {
> +        return FALSE;
> +    } else {
> +        return TRUE;
> +    }
> +}
> +

> diff --git a/migration/socket.c b/migration/socket.c
> index 6195596..32a6b39 100644
> --- a/migration/socket.c
> +++ b/migration/socket.c
> @@ -26,6 +26,38 @@
>  #include "io/channel-socket.h"
>  #include "trace.h"
>  
> +int socket_recv_channel_destroy(QIOChannel *recv)
> +{
> +    /* Remove channel */
> +    object_unref(OBJECT(recv));
> +    return 0;
> +}
> +
> +struct SocketOutgoingArgs {
> +    SocketAddress *saddr;
> +    Error **errp;
> +} outgoing_args;
> +
> +QIOChannel *socket_send_channel_create(void)
> +{
> +    QIOChannelSocket *sioc = qio_channel_socket_new();
> +
> +    qio_channel_socket_connect_sync(sioc, outgoing_args.saddr,
> +                                    outgoing_args.errp);

This is going to block the caller, which means if someone
calls migrate_cancel it won't be possible to clean up
any threads stuck in this connect call. It is preferable
to use connect_async, and return the sioc immediately. This
lets the caller close the sioc to cancel the connect attempt.
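
i.e. something like (sketch; assumes the existing
qio_channel_socket_connect_async() signature, and the callback name is
made up):

    QIOChannel *socket_send_channel_create(void)
    {
        QIOChannelSocket *sioc = qio_channel_socket_new();

        /* multifd_send_channel_connected() is a hypothetical callback */
        qio_channel_socket_connect_async(sioc, outgoing_args.saddr,
                                         multifd_send_channel_connected,
                                         NULL, NULL);
        return QIO_CHANNEL(sioc);
    }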

> +    qio_channel_set_delay(QIO_CHANNEL(sioc), false);
> +    return QIO_CHANNEL(sioc);
> +}
> +
> +int socket_send_channel_destroy(QIOChannel *send)
> +{
> +    /* Remove channel */
> +    object_unref(OBJECT(send));
> +    if (outgoing_args.saddr) {
> +        qapi_free_SocketAddress(outgoing_args.saddr);
> +        outgoing_args.saddr = NULL;
> +    }
> +    return 0;
> +}
>  
>  static SocketAddress *tcp_build_address(const char *host_port, Error **errp)
>  {
> @@ -96,6 +128,9 @@ static void socket_start_outgoing_migration(MigrationState *s,
>      struct SocketConnectData *data = g_new0(struct SocketConnectData, 1);
>  
>      data->s = s;
> +    outgoing_args.saddr = saddr;
> +    outgoing_args.errp = errp;

If socket_start_outgoing_migration() is called multiple times, then
we're going to leak saddr.

Also 'errp' is pointing to stack memory in the caller, so you're
saving a pointer to a stack frame that will no longer be valid
once this method returns. So that doesn't look safe to me.
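
For the leak, freeing any previous address before storing the new one
would do (sketch); the errp pointer probably shouldn't be stored at all:

    if (outgoing_args.saddr) {
        qapi_free_SocketAddress(outgoing_args.saddr);
    }
    outgoing_args.saddr = saddr;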

> +
>      if (saddr->type == SOCKET_ADDRESS_TYPE_INET) {
>          data->hostname = g_strdup(saddr->u.inet.host);
>      }
> @@ -106,7 +141,6 @@ static void socket_start_outgoing_migration(MigrationState *s,
>                                       socket_outgoing_migration,
>                                       data,
>                                       socket_connect_data_free);
> -    qapi_free_SocketAddress(saddr);
>  }
>  
>  void tcp_start_outgoing_migration(MigrationState *s,
> @@ -151,8 +185,6 @@ static gboolean socket_accept_incoming_migration(QIOChannel *ioc,
>  
>      qio_channel_set_name(QIO_CHANNEL(sioc), "migration-socket-incoming");
>      result = migration_channel_process_incoming(QIO_CHANNEL(sioc));
> -    object_unref(OBJECT(sioc));
> -
>  out:
>      if (result == FALSE) {
>          /* Close listening socket as its no longer needed */
> diff --git a/migration/socket.h b/migration/socket.h
> index 6b91e9d..dabce0e 100644
> --- a/migration/socket.h
> +++ b/migration/socket.h
> @@ -16,6 +16,16 @@
>  
>  #ifndef QEMU_MIGRATION_SOCKET_H
>  #define QEMU_MIGRATION_SOCKET_H
> +
> +#include "io/channel.h"
> +
> +QIOChannel *socket_recv_channel_create(void);
> +int socket_recv_channel_destroy(QIOChannel *recv);
> +
> +QIOChannel *socket_send_channel_create(void);
> +
> +int socket_send_channel_destroy(QIOChannel *send);
> +
>  void tcp_start_incoming_migration(const char *host_port, Error **errp);
>  
>  void tcp_start_outgoing_migration(MigrationState *s, const char *host_port,
> -- 
> 2.9.4
> 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time Juan Quintela
@ 2017-07-19 13:58   ` Daniel P. Berrange
  2017-08-08 11:55     ` Juan Quintela
  2017-07-20  9:44   ` Dr. David Alan Gilbert
  2017-07-20  9:49   ` Peter Xu
  2 siblings, 1 reply; 93+ messages in thread
From: Daniel P. Berrange @ 2017-07-19 13:58 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, dgilbert, lvivier, peterx

On Mon, Jul 17, 2017 at 03:42:32PM +0200, Juan Quintela wrote:
> We now send several pages at a time each time that we wakeup a thread.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> 
> --
> 
> Use iovec's insead of creating the equivalent.
> ---
>  migration/ram.c | 46 ++++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 40 insertions(+), 6 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 2bf3fa7..90e1bcb 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c

> +static void multifd_init_group(multifd_pages_t *pages)
> +{
> +    pages->num = 0;
> +    pages->size = migrate_multifd_group();
> +    pages->iov = g_malloc0(pages->size * sizeof(struct iovec));

Use g_new() so that it checks for overflow in the size calculation.
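
i.e.

    pages->iov = g_new0(struct iovec, pages->size);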

> +}
> +

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 01/17] migrate: Add gboolean return type to migrate_channel_process_incoming
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 01/17] migrate: Add gboolean return type to migrate_channel_process_incoming Juan Quintela
@ 2017-07-19 15:01   ` Dr. David Alan Gilbert
  2017-07-20  7:00     ` Peter Xu
  0 siblings, 1 reply; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-19 15:01 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  migration/channel.c |  3 ++-
>  migration/channel.h |  2 +-
>  migration/exec.c    |  6 ++++--
>  migration/socket.c  | 12 ++++++++----
>  4 files changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/migration/channel.c b/migration/channel.c
> index 3b7252f..719055d 100644
> --- a/migration/channel.c
> +++ b/migration/channel.c
> @@ -19,7 +19,7 @@
>  #include "qapi/error.h"
>  #include "io/channel-tls.h"
>  
> -void migration_channel_process_incoming(QIOChannel *ioc)
> +gboolean migration_channel_process_incoming(QIOChannel *ioc)
>  {
>      MigrationState *s = migrate_get_current();
>  
> @@ -39,6 +39,7 @@ void migration_channel_process_incoming(QIOChannel *ioc)
>          QEMUFile *f = qemu_fopen_channel_input(ioc);
>          migration_fd_process_incoming(f);
>      }
> +    return FALSE; /* unregister */
>  }
>  
>  
> diff --git a/migration/channel.h b/migration/channel.h
> index e4b4057..72cbc9f 100644
> --- a/migration/channel.h
> +++ b/migration/channel.h
> @@ -18,7 +18,7 @@
>  
>  #include "io/channel.h"
>  
> -void migration_channel_process_incoming(QIOChannel *ioc);
> +gboolean migration_channel_process_incoming(QIOChannel *ioc);

Can you add a comment here that says what the return value means?
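
e.g. something like (wording is only a suggestion):

    /*
     * Returns FALSE to unregister the watch on the listening socket,
     * TRUE to keep listening for more incoming connections.
     */
    gboolean migration_channel_process_incoming(QIOChannel *ioc);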

Dave

>  void migration_channel_connect(MigrationState *s,
>                                 QIOChannel *ioc,
> diff --git a/migration/exec.c b/migration/exec.c
> index 08b599e..2827f15 100644
> --- a/migration/exec.c
> +++ b/migration/exec.c
> @@ -47,9 +47,11 @@ static gboolean exec_accept_incoming_migration(QIOChannel *ioc,
>                                                 GIOCondition condition,
>                                                 gpointer opaque)
>  {
> -    migration_channel_process_incoming(ioc);
> +    gboolean result;
> +
> +    result = migration_channel_process_incoming(ioc);
>      object_unref(OBJECT(ioc));
> -    return FALSE; /* unregister */
> +    return result;
>  }
>  
>  void exec_start_incoming_migration(const char *command, Error **errp)
> diff --git a/migration/socket.c b/migration/socket.c
> index 757d382..6195596 100644
> --- a/migration/socket.c
> +++ b/migration/socket.c
> @@ -136,25 +136,29 @@ static gboolean socket_accept_incoming_migration(QIOChannel *ioc,
>  {
>      QIOChannelSocket *sioc;
>      Error *err = NULL;
> +    gboolean result;
>  
>      sioc = qio_channel_socket_accept(QIO_CHANNEL_SOCKET(ioc),
>                                       &err);
>      if (!sioc) {
>          error_report("could not accept migration connection (%s)",
>                       error_get_pretty(err));
> +        result = FALSE; /* unregister */
>          goto out;
>      }
>  
>      trace_migration_socket_incoming_accepted();
>  
>      qio_channel_set_name(QIO_CHANNEL(sioc), "migration-socket-incoming");
> -    migration_channel_process_incoming(QIO_CHANNEL(sioc));
> +    result = migration_channel_process_incoming(QIO_CHANNEL(sioc));
>      object_unref(OBJECT(sioc));
>  
>  out:
> -    /* Close listening socket as its no longer needed */
> -    qio_channel_close(ioc, NULL);
> -    return FALSE; /* unregister */
> +    if (result == FALSE) {
> +        /* Close listening socket as its no longer needed */
> +        qio_channel_close(ioc, NULL);
> +    }
> +    return result;
>  }
>  
>  
> -- 
> 2.9.4
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/17] qio: Create new qio_channel_{readv, writev}_all
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 03/17] qio: Create new qio_channel_{readv, writev}_all Juan Quintela
  2017-07-19 13:44   ` Daniel P. Berrange
@ 2017-07-19 15:42   ` Dr. David Alan Gilbert
  2017-07-19 15:43     ` Daniel P. Berrange
  1 sibling, 1 reply; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-19 15:42 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> The functions waits until it is able to write the full iov.

When is it safe to call these - I see qio_channel_wait does its
own g_main_loop - so I guess they're intended to be called from their
own process?

What causes these to exit if the migration fails for some other
(non-file) related reason?

Dave

> Signed-off-by: Juan Quintela <quintela@redhat.com>
> 
> --
> 
> Add tests.
> ---
>  include/io/channel.h           | 46 +++++++++++++++++++++++++
>  io/channel.c                   | 76 ++++++++++++++++++++++++++++++++++++++++++
>  migration/qemu-file-channel.c  | 29 +---------------
>  tests/io-channel-helpers.c     | 55 ++++++++++++++++++++++++++++++
>  tests/io-channel-helpers.h     |  4 +++
>  tests/test-io-channel-buffer.c | 55 ++++++++++++++++++++++++++++--
>  6 files changed, 234 insertions(+), 31 deletions(-)
> 
> diff --git a/include/io/channel.h b/include/io/channel.h
> index db9bb02..bfc97e2 100644
> --- a/include/io/channel.h
> +++ b/include/io/channel.h
> @@ -269,6 +269,52 @@ ssize_t qio_channel_writev_full(QIOChannel *ioc,
>                                  Error **errp);
>  
>  /**
> + * qio_channel_readv_all:
> + * @ioc: the channel object
> + * @iov: the array of memory regions to read data into
> + * @niov: the length of the @iov array
> + * @errp: pointer to a NULL-initialized error object
> + *
> + * Read data from the IO channel, storing it in the
> + * memory regions referenced by @iov. Each element
> + * in the @iov will be fully populated with data
> + * before the next one is used. The @niov parameter
> + * specifies the total number of elements in @iov.
> + *
> + * Returns: the number of bytes read, or -1 on error,
> + * or QIO_CHANNEL_ERR_BLOCK if no data is available
> + * and the channel is non-blocking
> + */
> +ssize_t qio_channel_readv_all(QIOChannel *ioc,
> +                              const struct iovec *iov,
> +                              size_t niov,
> +                              Error **errp);
> +
> +
> +/**
> + * qio_channel_writev_all:
> + * @ioc: the channel object
> + * @iov: the array of memory regions to write data from
> + * @niov: the length of the @iov array
> + * @errp: pointer to a NULL-initialized error object
> + *
> + * Write data to the IO channel, reading it from the
> + * memory regions referenced by @iov. Each element
> + * in the @iov will be fully sent, before the next
> + * one is used. The @niov parameter specifies the
> + * total number of elements in @iov.
> + *
> + * It is required for all @iov data to be fully
> + * sent.
> + *
> + * Returns: the number of bytes sent, or -1 on error,
> + */
> +ssize_t qio_channel_writev_all(QIOChannel *ioc,
> +                               const struct iovec *iov,
> +                               size_t niov,
> +                               Error **erp);
> +
> +/**
>   * qio_channel_readv:
>   * @ioc: the channel object
>   * @iov: the array of memory regions to read data into
> diff --git a/io/channel.c b/io/channel.c
> index cdf7454..82203ef 100644
> --- a/io/channel.c
> +++ b/io/channel.c
> @@ -22,6 +22,7 @@
>  #include "io/channel.h"
>  #include "qapi/error.h"
>  #include "qemu/main-loop.h"
> +#include "qemu/iov.h"
>  
>  bool qio_channel_has_feature(QIOChannel *ioc,
>                               QIOChannelFeature feature)
> @@ -85,6 +86,81 @@ ssize_t qio_channel_writev_full(QIOChannel *ioc,
>  }
>  
>  
> +
> +ssize_t qio_channel_readv_all(QIOChannel *ioc,
> +                              const struct iovec *iov,
> +                              size_t niov,
> +                              Error **errp)
> +{
> +    ssize_t done = 0;
> +    struct iovec *local_iov = g_new(struct iovec, niov);
> +    struct iovec *local_iov_head = local_iov;
> +    unsigned int nlocal_iov = niov;
> +
> +    nlocal_iov = iov_copy(local_iov, nlocal_iov,
> +                          iov, niov,
> +                          0, iov_size(iov, niov));
> +
> +    while (nlocal_iov > 0) {
> +        ssize_t len;
> +        len = qio_channel_readv(ioc, local_iov, nlocal_iov, errp);
> +        if (len == QIO_CHANNEL_ERR_BLOCK) {
> +            qio_channel_wait(ioc, G_IO_OUT);
> +            continue;
> +        }
> +        if (len < 0) {
> +            error_setg_errno(errp, EIO,
> +                             "Channel was not able to read full iov");
> +            done = -1;
> +            goto cleanup;
> +        }
> +
> +        iov_discard_front(&local_iov, &nlocal_iov, len);
> +        done += len;
> +    }
> +
> + cleanup:
> +    g_free(local_iov_head);
> +    return done;
> +}
> +
> +ssize_t qio_channel_writev_all(QIOChannel *ioc,
> +                               const struct iovec *iov,
> +                               size_t niov,
> +                               Error **errp)
> +{
> +    ssize_t done = 0;
> +    struct iovec *local_iov = g_new(struct iovec, niov);
> +    struct iovec *local_iov_head = local_iov;
> +    unsigned int nlocal_iov = niov;
> +
> +    nlocal_iov = iov_copy(local_iov, nlocal_iov,
> +                          iov, niov,
> +                          0, iov_size(iov, niov));
> +
> +    while (nlocal_iov > 0) {
> +        ssize_t len;
> +        len = qio_channel_writev(ioc, local_iov, nlocal_iov, errp);
> +        if (len == QIO_CHANNEL_ERR_BLOCK) {
> +            qio_channel_wait(ioc, G_IO_OUT);
> +            continue;
> +        }
> +        if (len < 0) {
> +            error_setg_errno(errp, EIO,
> +                             "Channel was not able to write full iov");
> +            done = -1;
> +            goto cleanup;
> +        }
> +
> +        iov_discard_front(&local_iov, &nlocal_iov, len);
> +        done += len;
> +    }
> +
> + cleanup:
> +    g_free(local_iov_head);
> +    return done;
> +}
> +
>  ssize_t qio_channel_readv(QIOChannel *ioc,
>                            const struct iovec *iov,
>                            size_t niov,
> diff --git a/migration/qemu-file-channel.c b/migration/qemu-file-channel.c
> index e202d73..457ea6c 100644
> --- a/migration/qemu-file-channel.c
> +++ b/migration/qemu-file-channel.c
> @@ -36,35 +36,8 @@ static ssize_t channel_writev_buffer(void *opaque,
>                                       int64_t pos)
>  {
>      QIOChannel *ioc = QIO_CHANNEL(opaque);
> -    ssize_t done = 0;
> -    struct iovec *local_iov = g_new(struct iovec, iovcnt);
> -    struct iovec *local_iov_head = local_iov;
> -    unsigned int nlocal_iov = iovcnt;
>  
> -    nlocal_iov = iov_copy(local_iov, nlocal_iov,
> -                          iov, iovcnt,
> -                          0, iov_size(iov, iovcnt));
> -
> -    while (nlocal_iov > 0) {
> -        ssize_t len;
> -        len = qio_channel_writev(ioc, local_iov, nlocal_iov, NULL);
> -        if (len == QIO_CHANNEL_ERR_BLOCK) {
> -            qio_channel_wait(ioc, G_IO_OUT);
> -            continue;
> -        }
> -        if (len < 0) {
> -            /* XXX handle Error objects */
> -            done = -EIO;
> -            goto cleanup;
> -        }
> -
> -        iov_discard_front(&local_iov, &nlocal_iov, len);
> -        done += len;
> -    }
> -
> - cleanup:
> -    g_free(local_iov_head);
> -    return done;
> +    return qio_channel_writev_all(ioc, iov, iovcnt, NULL);
>  }
>  
>  
> diff --git a/tests/io-channel-helpers.c b/tests/io-channel-helpers.c
> index 05e5579..3d76d95 100644
> --- a/tests/io-channel-helpers.c
> +++ b/tests/io-channel-helpers.c
> @@ -21,6 +21,7 @@
>  #include "qemu/osdep.h"
>  #include "io-channel-helpers.h"
>  #include "qapi/error.h"
> +#include "qemu/iov.h"
>  
>  struct QIOChannelTest {
>      QIOChannel *src;
> @@ -153,6 +154,45 @@ static gpointer test_io_thread_reader(gpointer opaque)
>      return NULL;
>  }
>  
> +static gpointer test_io_thread_writer_all(gpointer opaque)
> +{
> +    QIOChannelTest *data = opaque;
> +    size_t niov = data->niov;
> +    ssize_t ret;
> +
> +    qio_channel_set_blocking(data->src, data->blocking, NULL);
> +
> +    ret = qio_channel_writev_all(data->src,
> +                                 data->inputv,
> +                                 niov,
> +                                 &data->writeerr);
> +    if (ret != iov_size(data->inputv, data->niov)) {
> +        error_setg(&data->writeerr, "Unexpected I/O error");
> +    }
> +
> +    return NULL;
> +}
> +
> +/* This thread receives all data using iovecs */
> +static gpointer test_io_thread_reader_all(gpointer opaque)
> +{
> +    QIOChannelTest *data = opaque;
> +    size_t niov = data->niov;
> +    ssize_t ret;
> +
> +    qio_channel_set_blocking(data->dst, data->blocking, NULL);
> +
> +    ret = qio_channel_readv_all(data->dst,
> +                                data->outputv,
> +                                niov,
> +                                &data->readerr);
> +
> +    if (ret != iov_size(data->inputv, data->niov)) {
> +        error_setg(&data->readerr, "Unexpected I/O error");
> +    }
> +
> +    return NULL;
> +}
>  
>  QIOChannelTest *qio_channel_test_new(void)
>  {
> @@ -231,6 +271,21 @@ void qio_channel_test_run_reader(QIOChannelTest *test,
>      test->dst = NULL;
>  }
>  
> +void qio_channel_test_run_writer_all(QIOChannelTest *test,
> +                                     QIOChannel *src)
> +{
> +    test->src = src;
> +    test_io_thread_writer_all(test);
> +    test->src = NULL;
> +}
> +
> +void qio_channel_test_run_reader_all(QIOChannelTest *test,
> +                                     QIOChannel *dst)
> +{
> +    test->dst = dst;
> +    test_io_thread_reader_all(test);
> +    test->dst = NULL;
> +}
>  
>  void qio_channel_test_validate(QIOChannelTest *test)
>  {
> diff --git a/tests/io-channel-helpers.h b/tests/io-channel-helpers.h
> index fedc64f..17b9647 100644
> --- a/tests/io-channel-helpers.h
> +++ b/tests/io-channel-helpers.h
> @@ -36,6 +36,10 @@ void qio_channel_test_run_writer(QIOChannelTest *test,
>                                   QIOChannel *src);
>  void qio_channel_test_run_reader(QIOChannelTest *test,
>                                   QIOChannel *dst);
> +void qio_channel_test_run_writer_all(QIOChannelTest *test,
> +                                     QIOChannel *src);
> +void qio_channel_test_run_reader_all(QIOChannelTest *test,
> +                                     QIOChannel *dst);
>  
>  void qio_channel_test_validate(QIOChannelTest *test);
>  
> diff --git a/tests/test-io-channel-buffer.c b/tests/test-io-channel-buffer.c
> index 64722a2..4bf64ae 100644
> --- a/tests/test-io-channel-buffer.c
> +++ b/tests/test-io-channel-buffer.c
> @@ -22,8 +22,7 @@
>  #include "io/channel-buffer.h"
>  #include "io-channel-helpers.h"
>  
> -
> -static void test_io_channel_buf(void)
> +static void test_io_channel_buf1(void)
>  {
>      QIOChannelBuffer *buf;
>      QIOChannelTest *test;
> @@ -39,6 +38,53 @@ static void test_io_channel_buf(void)
>      object_unref(OBJECT(buf));
>  }
>  
> +static void test_io_channel_buf2(void)
> +{
> +    QIOChannelBuffer *buf;
> +    QIOChannelTest *test;
> +
> +    buf = qio_channel_buffer_new(0);
> +
> +    test = qio_channel_test_new();
> +    qio_channel_test_run_writer_all(test, QIO_CHANNEL(buf));
> +    buf->offset = 0;
> +    qio_channel_test_run_reader(test, QIO_CHANNEL(buf));
> +    qio_channel_test_validate(test);
> +
> +    object_unref(OBJECT(buf));
> +}
> +
> +static void test_io_channel_buf3(void)
> +{
> +    QIOChannelBuffer *buf;
> +    QIOChannelTest *test;
> +
> +    buf = qio_channel_buffer_new(0);
> +
> +    test = qio_channel_test_new();
> +    qio_channel_test_run_writer(test, QIO_CHANNEL(buf));
> +    buf->offset = 0;
> +    qio_channel_test_run_reader_all(test, QIO_CHANNEL(buf));
> +    qio_channel_test_validate(test);
> +
> +    object_unref(OBJECT(buf));
> +}
> +
> +static void test_io_channel_buf4(void)
> +{
> +    QIOChannelBuffer *buf;
> +    QIOChannelTest *test;
> +
> +    buf = qio_channel_buffer_new(0);
> +
> +    test = qio_channel_test_new();
> +    qio_channel_test_run_writer_all(test, QIO_CHANNEL(buf));
> +    buf->offset = 0;
> +    qio_channel_test_run_reader_all(test, QIO_CHANNEL(buf));
> +    qio_channel_test_validate(test);
> +
> +    object_unref(OBJECT(buf));
> +}
>  
>  int main(int argc, char **argv)
>  {
> @@ -46,6 +92,9 @@ int main(int argc, char **argv)
>  
>      g_test_init(&argc, &argv, NULL);
>  
> -    g_test_add_func("/io/channel/buf", test_io_channel_buf);
> +    g_test_add_func("/io/channel/buf1", test_io_channel_buf1);
> +    g_test_add_func("/io/channel/buf2", test_io_channel_buf2);
> +    g_test_add_func("/io/channel/buf3", test_io_channel_buf3);
> +    g_test_add_func("/io/channel/buf4", test_io_channel_buf4);
>      return g_test_run();
>  }
> -- 
> 2.9.4
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/17] qio: Create new qio_channel_{readv, writev}_all
  2017-07-19 15:42   ` Dr. David Alan Gilbert
@ 2017-07-19 15:43     ` Daniel P. Berrange
  2017-07-19 16:04       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 93+ messages in thread
From: Daniel P. Berrange @ 2017-07-19 15:43 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: Juan Quintela, qemu-devel, lvivier, peterx

On Wed, Jul 19, 2017 at 04:42:09PM +0100, Dr. David Alan Gilbert wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
> > The functions waits until it is able to write the full iov.
> 
> When is it safe to call these - I see qio_channel_wait does it's
> own g_main_loop - so I guess they're intended to be called from their
> own process?
> 
> What causes these to exit if the migration fails for some other
> (non-file) related reason?

It'll exit if the other end closes the socket, or if the local QEMU
does a qio_channel_close() on it.  I don't know if this patch series
uses either of those options, though.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 04/17] migration: Add multifd capability
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 04/17] migration: Add multifd capability Juan Quintela
@ 2017-07-19 15:44   ` Dr. David Alan Gilbert
  2017-08-08  8:42     ` Juan Quintela
  2017-07-19 17:14   ` Eric Blake
  1 sibling, 1 reply; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-19 15:44 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Note you need to update this; you need to add the
DEFINE_PROP_MIG_CAP entry in migration_properties[].
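
i.e. presumably (assuming the existing DEFINE_PROP_MIG_CAP macro in
migration.c):

    DEFINE_PROP_MIG_CAP("x-multifd", MIGRATION_CAPABILITY_X_MULTIFD),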

Dave

> ---
>  migration/migration.c | 9 +++++++++
>  migration/migration.h | 1 +
>  qapi-schema.json      | 4 ++--
>  3 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index c24ad03..af2630b 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1282,6 +1282,15 @@ bool migrate_use_events(void)
>      return s->enabled_capabilities[MIGRATION_CAPABILITY_EVENTS];
>  }
>  
> +bool migrate_use_multifd(void)
> +{
> +    MigrationState *s;
> +
> +    s = migrate_get_current();
> +
> +    return s->enabled_capabilities[MIGRATION_CAPABILITY_X_MULTIFD];
> +}
> +
>  int migrate_use_xbzrle(void)
>  {
>      MigrationState *s;
> diff --git a/migration/migration.h b/migration/migration.h
> index 5a18aea..9da9b4e 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -172,6 +172,7 @@ bool migrate_postcopy_ram(void);
>  bool migrate_zero_blocks(void);
>  
>  bool migrate_auto_converge(void);
> +bool migrate_use_multifd(void);
>  
>  int migrate_use_xbzrle(void);
>  int64_t migrate_xbzrle_cache_size(void);
> diff --git a/qapi-schema.json b/qapi-schema.json
> index ab438ea..2457fb0 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -902,14 +902,14 @@
>  #
>  # @return-path: If enabled, migration will use the return path even
>  #               for precopy. (since 2.10)
> +# @x-multifd: Use more than one fd for migration (since 2.10)
>  #
>  # Since: 1.2
>  ##
>  { 'enum': 'MigrationCapability',
>    'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
>             'compress', 'events', 'postcopy-ram', 'x-colo', 'release-ram',
> -           'block', 'return-path' ] }
> -
> +           'block', 'return-path', 'x-multifd'] }
>  ##
>  # @MigrationCapabilityStatus:
>  #
> -- 
> 2.9.4
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 05/17] migration: Create x-multifd-threads parameter
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 05/17] migration: Create x-multifd-threads parameter Juan Quintela
@ 2017-07-19 16:00   ` Dr. David Alan Gilbert
  2017-08-08  8:46     ` Juan Quintela
  0 siblings, 1 reply; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-19 16:00 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> Indicates the number of threads that we would create.  By default we
> create 2 threads.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Also needs the DEFINE_PROP stuff updating - and reworking if Markus' qapi patch lands.

> --
> 
> Catch inconsistent defaults (eric).
> Improve comment stating that number of threads is the same than number
> of sockets
> ---
>  hmp.c                 |  7 +++++++
>  migration/migration.c | 23 +++++++++++++++++++++++
>  migration/migration.h |  1 +
>  qapi-schema.json      | 18 ++++++++++++++++--
>  4 files changed, 47 insertions(+), 2 deletions(-)
> 
> diff --git a/hmp.c b/hmp.c
> index d970ea9..92f9456 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -335,6 +335,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
>          monitor_printf(mon, "%s: %s\n",
>              MigrationParameter_lookup[MIGRATION_PARAMETER_BLOCK_INCREMENTAL],
>                         params->block_incremental ? "on" : "off");
> +        monitor_printf(mon, "%s: %" PRId64 "\n",
> +            MigrationParameter_lookup[MIGRATION_PARAMETER_X_MULTIFD_THREADS],
> +            params->x_multifd_threads);
>      }
>  
>      qapi_free_MigrationParameters(params);
> @@ -1573,6 +1576,9 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
>                      goto cleanup;
>                  }
>                  p.block_incremental = valuebool;
> +            case MIGRATION_PARAMETER_X_MULTIFD_THREADS:
> +                p.has_x_multifd_threads = true;
> +                use_int_value = true;
>                  break;
>              }
>  
> @@ -1590,6 +1596,7 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
>                  p.cpu_throttle_increment = valueint;
>                  p.downtime_limit = valueint;
>                  p.x_checkpoint_delay = valueint;
> +                p.x_multifd_threads = valueint;
>              }
>  
>              qmp_migrate_set_parameters(&p, &err);
> diff --git a/migration/migration.c b/migration/migration.c
> index af2630b..148edc1 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -78,6 +78,7 @@
>   * Note: Please change this default value to 10000 when we support hybrid mode.
>   */
>  #define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY 200
> +#define DEFAULT_MIGRATE_MULTIFD_THREADS 2
>  
>  static NotifierList migration_state_notifiers =
>      NOTIFIER_LIST_INITIALIZER(migration_state_notifiers);
> @@ -460,6 +461,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
>      params->x_checkpoint_delay = s->parameters.x_checkpoint_delay;
>      params->has_block_incremental = true;
>      params->block_incremental = s->parameters.block_incremental;
> +    params->has_x_multifd_threads = true;
> +    params->x_multifd_threads = s->parameters.x_multifd_threads;
>  
>      return params;
>  }
> @@ -712,6 +715,13 @@ void qmp_migrate_set_parameters(MigrationParameters *params, Error **errp)
>                      "x_checkpoint_delay",
>                      "is invalid, it should be positive");
>      }
> +    if (params->has_x_multifd_threads &&
> +        (params->x_multifd_threads < 1 || params->x_multifd_threads > 255)) {
> +        error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
> +                   "multifd_threads",
> +                   "is invalid, it should be in the range of 1 to 255");
> +        return;
> +    }
>  
>      if (params->has_compress_level) {
>          s->parameters.compress_level = params->compress_level;
> @@ -756,6 +766,9 @@ void qmp_migrate_set_parameters(MigrationParameters *params, Error **errp)
>      if (params->has_block_incremental) {
>          s->parameters.block_incremental = params->block_incremental;
>      }
> +    if (params->has_x_multifd_threads) {
> +        s->parameters.x_multifd_threads = params->x_multifd_threads;
> +    }
>  }
>  
>  
> @@ -1291,6 +1304,15 @@ bool migrate_use_multifd(void)
>      return s->enabled_capabilities[MIGRATION_CAPABILITY_X_MULTIFD];
>  }
>  
> +int migrate_multifd_threads(void)
> +{
> +    MigrationState *s;
> +
> +    s = migrate_get_current();
> +
> +    return s->parameters.x_multifd_threads;
> +}
> +
>  int migrate_use_xbzrle(void)
>  {
>      MigrationState *s;
> @@ -2055,6 +2077,7 @@ static void migration_instance_init(Object *obj)
>          .max_bandwidth = MAX_THROTTLE,
>          .downtime_limit = DEFAULT_MIGRATE_SET_DOWNTIME,
>          .x_checkpoint_delay = DEFAULT_MIGRATE_X_CHECKPOINT_DELAY,
> +        .x_multifd_threads = DEFAULT_MIGRATE_MULTIFD_THREADS,
>      };
>      ms->parameters.tls_creds = g_strdup("");
>      ms->parameters.tls_hostname = g_strdup("");
> diff --git a/migration/migration.h b/migration/migration.h
> index 9da9b4e..20ea30c 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -173,6 +173,7 @@ bool migrate_zero_blocks(void);
>  
>  bool migrate_auto_converge(void);
>  bool migrate_use_multifd(void);
> +int migrate_multifd_threads(void);
>  
>  int migrate_use_xbzrle(void);
>  int64_t migrate_xbzrle_cache_size(void);
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 2457fb0..444e8f0 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -902,6 +902,7 @@
>  #
>  # @return-path: If enabled, migration will use the return path even
>  #               for precopy. (since 2.10)
> +#
>  # @x-multifd: Use more than one fd for migration (since 2.10)
>  #
>  # Since: 1.2
> @@ -910,6 +911,7 @@
>    'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
>             'compress', 'events', 'postcopy-ram', 'x-colo', 'release-ram',
>             'block', 'return-path', 'x-multifd'] }
> +

Escapee from previous patch.

>  ##
>  # @MigrationCapabilityStatus:
>  #
> @@ -1026,13 +1028,19 @@
>  # 	migrated and the destination must already have access to the
>  # 	same backing chain as was used on the source.  (since 2.10)
>  #
> +# @x-multifd-threads: Number of threads used to migrate data in
> +#                     parallel. This is the same number that the
> +#                     number of sockets used for migration.
> +#                     The default value is 2 (since 2.10)
> +#

That did make me think for a moment; I guess '2' makes sense once you've
set the x-multifd capability on.  The other possibility would be to
remove the capability and just rely on the threads > 1

Dave

>  # Since: 2.4
>  ##
>  { 'enum': 'MigrationParameter',
>    'data': ['compress-level', 'compress-threads', 'decompress-threads',
>             'cpu-throttle-initial', 'cpu-throttle-increment',
>             'tls-creds', 'tls-hostname', 'max-bandwidth',
> -           'downtime-limit', 'x-checkpoint-delay', 'block-incremental' ] }
> +           'downtime-limit', 'x-checkpoint-delay', 'block-incremental',
> +           'x-multifd-threads'] }
>  
>  ##
>  # @migrate-set-parameters:
> @@ -1106,6 +1114,11 @@
>  # 	migrated and the destination must already have access to the
>  # 	same backing chain as was used on the source.  (since 2.10)
>  #
> +# @x-multifd-threads: Number of threads used to migrate data in
> +#                     parallel. This is the same number that the
> +#                     number of sockets used for migration.
> +#                     The default value is 2 (since 2.10)
> +#
>  # Since: 2.4
>  ##
>  { 'struct': 'MigrationParameters',
> @@ -1119,7 +1132,8 @@
>              '*max-bandwidth': 'int',
>              '*downtime-limit': 'int',
>              '*x-checkpoint-delay': 'int',
> -            '*block-incremental': 'bool' } }
> +            '*block-incremental': 'bool',
> +            '*x-multifd-threads': 'int'} }
>  
>  ##
>  # @query-migrate-parameters:
> -- 
> 2.9.4
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/17] qio: Create new qio_channel_{readv, writev}_all
  2017-07-19 15:43     ` Daniel P. Berrange
@ 2017-07-19 16:04       ` Dr. David Alan Gilbert
  2017-07-19 16:08         ` Daniel P. Berrange
  0 siblings, 1 reply; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-19 16:04 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: Juan Quintela, qemu-devel, lvivier, peterx

* Daniel P. Berrange (berrange@redhat.com) wrote:
> On Wed, Jul 19, 2017 at 04:42:09PM +0100, Dr. David Alan Gilbert wrote:
> > * Juan Quintela (quintela@redhat.com) wrote:
> > > The functions waits until it is able to write the full iov.
> > 
> > When is it safe to call these - I see qio_channel_wait does it's
> > own g_main_loop - so I guess they're intended to be called from their
> > own process?
> > 
> > What causes these to exit if the migration fails for some other
> > (non-file) related reason?
> 
> It'll exit if the other end closes the socket, or if the local QEMU
> does a qio_channel_close() on it.  I don't know if this patch series
> uses either of those options tough.

How do you safely cope with calling close on a socket that's currently
being waited on/might be reading?  In the cancel case we use shutdown()
to force exits without actually closing.

Dave

> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/17] qio: Create new qio_channel_{readv, writev}_all
  2017-07-19 16:04       ` Dr. David Alan Gilbert
@ 2017-07-19 16:08         ` Daniel P. Berrange
  0 siblings, 0 replies; 93+ messages in thread
From: Daniel P. Berrange @ 2017-07-19 16:08 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: Juan Quintela, qemu-devel, lvivier, peterx

On Wed, Jul 19, 2017 at 05:04:19PM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berrange@redhat.com) wrote:
> > On Wed, Jul 19, 2017 at 04:42:09PM +0100, Dr. David Alan Gilbert wrote:
> > > * Juan Quintela (quintela@redhat.com) wrote:
> > > > The functions waits until it is able to write the full iov.
> > > 
> > > When is it safe to call these - I see qio_channel_wait does it's
> > > own g_main_loop - so I guess they're intended to be called from their
> > > own process?
> > > 
> > > What causes these to exit if the migration fails for some other
> > > (non-file) related reason?
> > 
> > It'll exit if the other end closes the socket, or if the local QEMU
> > does a qio_channel_close() on it.  I don't know if this patch series
> > uses either of those options tough.
> 
> How do you safely cope with calling close on a socket that's currently
> being waited on/might be reading?  In the cancel case we use shutdown()
> to force exits with out actually closing.

You can use qio_channel_shutdown() instead if that's desired
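
e.g.

    qio_channel_shutdown(ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);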

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 07/17] migration: Create multifd migration threads
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 07/17] migration: Create multifd migration threads Juan Quintela
@ 2017-07-19 16:49   ` Dr. David Alan Gilbert
  2017-08-08  8:58     ` Juan Quintela
  0 siblings, 1 reply; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-19 16:49 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> Creation of the threads, nothing inside yet.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> 
> --
> 
> Use pointers instead of long array names
> Move to use semaphores instead of conditions as paolo suggestion
> 
> Put all the state inside one struct.
> Use a counter for the number of threads created.  Needed during cancellation.
> 
> Add error return to thread creation
> 
> Add id field
> 
> Rename functions to multifd_save/load_setup/cleanup
> ---
>  migration/migration.c |  14 ++++
>  migration/ram.c       | 192 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  migration/ram.h       |   5 ++
>  3 files changed, 211 insertions(+)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index ff3fc9d..5a82c1c 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -288,6 +288,7 @@ static void process_incoming_migration_bh(void *opaque)
>      } else {
>          runstate_set(global_state_get_runstate());
>      }
> +    multifd_load_cleanup();
>      /*
>       * This must happen after any state changes since as soon as an external
>       * observer sees this event they might start to prod at the VM assuming
> @@ -348,6 +349,7 @@ static void process_incoming_migration_co(void *opaque)
>          migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
>                            MIGRATION_STATUS_FAILED);
>          error_report("load of migration failed: %s", strerror(-ret));
> +        multifd_load_cleanup();
>          exit(EXIT_FAILURE);
>      }
>      mis->bh = qemu_bh_new(process_incoming_migration_bh, mis);
> @@ -358,6 +360,11 @@ void migration_fd_process_incoming(QEMUFile *f)
>  {
>      Coroutine *co = qemu_coroutine_create(process_incoming_migration_co, f);
>  
> +    if (multifd_load_setup() != 0) {
> +        /* We haven't been able to create multifd threads
> +           nothing better to do */
> +        exit(EXIT_FAILURE);
> +    }
>      qemu_file_set_blocking(f, false);
>      qemu_coroutine_enter(co);
>  }
> @@ -860,6 +867,7 @@ static void migrate_fd_cleanup(void *opaque)
>          }
>          qemu_mutex_lock_iothread();
>  
> +        multifd_save_cleanup();
>          qemu_fclose(s->to_dst_file);
>          s->to_dst_file = NULL;
>      }
> @@ -2049,6 +2057,12 @@ void migrate_fd_connect(MigrationState *s)
>          }
>      }
>  
> +    if (multifd_save_setup() != 0) {
> +        migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
> +                          MIGRATION_STATUS_FAILED);
> +        migrate_fd_cleanup(s);
> +        return;
> +    }
>      qemu_thread_create(&s->thread, "live_migration", migration_thread, s,
>                         QEMU_THREAD_JOINABLE);
>      s->migration_thread_running = true;
> diff --git a/migration/ram.c b/migration/ram.c
> index 1b08296..8e87533 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -356,6 +356,198 @@ static void compress_threads_save_setup(void)
>      }
>  }
>  
> +/* Multiple fd's */
> +
> +struct MultiFDSendParams {
> +    uint8_t id;
> +    QemuThread thread;
> +    QemuSemaphore sem;
> +    QemuMutex mutex;
> +    bool quit;
> +};
> +typedef struct MultiFDSendParams MultiFDSendParams;
> +
> +struct {
> +    MultiFDSendParams *params;
> +    /* number of created threads */
> +    int count;
> +} *multifd_send_state;
> +
> +static void terminate_multifd_send_threads(void)
> +{
> +    int i;
> +
> +    for (i = 0; i < multifd_send_state->count; i++) {
> +        MultiFDSendParams *p = &multifd_send_state->params[i];
> +
> +        qemu_mutex_lock(&p->mutex);
> +        p->quit = true;
> +        qemu_sem_post(&p->sem);
> +        qemu_mutex_unlock(&p->mutex);

I don't think you need that lock/unlock pair - as long as no one
else is concurrently setting them back to false; so as long as you
know you're safely past initialisation and no one is trying
to start a new migration at the moment, I think it's safe.
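
i.e. just (untested):

    for (i = 0; i < multifd_send_state->count; i++) {
        MultiFDSendParams *p = &multifd_send_state->params[i];

        /* no lock needed: nothing else can be flipping quit back to
         * false here, and the sem_post publishes the store */
        p->quit = true;
        qemu_sem_post(&p->sem);
    }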

> +    }
> +}
> +
> +void multifd_save_cleanup(void)
> +{
> +    int i;
> +
> +    if (!migrate_use_multifd()) {
> +        return;
> +    }
> +    terminate_multifd_send_threads();
> +    for (i = 0; i < multifd_send_state->count; i++) {
> +        MultiFDSendParams *p = &multifd_send_state->params[i];
> +
> +        qemu_thread_join(&p->thread);
> +        qemu_mutex_destroy(&p->mutex);
> +        qemu_sem_destroy(&p->sem);
> +    }
> +    g_free(multifd_send_state->params);
> +    multifd_send_state->params = NULL;
> +    g_free(multifd_send_state);
> +    multifd_send_state = NULL;

I'd be tempted to add a few traces around here, and also some
protection against it being called twice.  Maybe it shouldn't
happen, but it would be nice to debug it when it does.
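
e.g. at the top of the function - the trace point name is made up and
would need a matching line in migration/trace-events:

    if (!migrate_use_multifd() || !multifd_send_state) {
        /* not set up, or already cleaned up */
        return;
    }
    trace_multifd_save_cleanup();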

> +}
> +
> +static void *multifd_send_thread(void *opaque)
> +{
> +    MultiFDSendParams *p = opaque;
> +
> +    while (true) {
> +        qemu_mutex_lock(&p->mutex);
> +        if (p->quit) {
> +            qemu_mutex_unlock(&p->mutex);
> +            break;
> +        }
> +        qemu_mutex_unlock(&p->mutex);
> +        qemu_sem_wait(&p->sem);

Similar to above, I don't think you need those
locks around the quit check.

> +    }
> +
> +    return NULL;
> +}
> +
> +int multifd_save_setup(void)
> +{
> +    int thread_count;
> +    uint8_t i;
> +
> +    if (!migrate_use_multifd()) {
> +        return 0;
> +    }
> +    thread_count = migrate_multifd_threads();
> +    multifd_send_state = g_malloc0(sizeof(*multifd_send_state));
> +    multifd_send_state->params = g_new0(MultiFDSendParams, thread_count);
> +    multifd_send_state->count = 0;
> +    for (i = 0; i < thread_count; i++) {
> +        char thread_name[16];
> +        MultiFDSendParams *p = &multifd_send_state->params[i];
> +
> +        qemu_mutex_init(&p->mutex);
> +        qemu_sem_init(&p->sem, 0);
> +        p->quit = false;
> +        p->id = i;
> +        snprintf(thread_name, sizeof(thread_name), "multifdsend_%d", i);
> +        qemu_thread_create(&p->thread, thread_name, multifd_send_thread, p,
> +                           QEMU_THREAD_JOINABLE);
> +        multifd_send_state->count++;
> +    }
> +    return 0;
> +}
> +
> +struct MultiFDRecvParams {
> +    uint8_t id;
> +    QemuThread thread;
> +    QemuSemaphore sem;
> +    QemuMutex mutex;
> +    bool quit;
> +};
> +typedef struct MultiFDRecvParams MultiFDRecvParams;
> +
> +struct {
> +    MultiFDRecvParams *params;
> +    /* number of created threads */
> +    int count;
> +} *multifd_recv_state;
> +
> +static void terminate_multifd_recv_threads(void)
> +{
> +    int i;
> +
> +    for (i = 0; i < multifd_recv_state->count; i++) {
> +        MultiFDRecvParams *p = &multifd_recv_state->params[i];
> +
> +        qemu_mutex_lock(&p->mutex);
> +        p->quit = true;
> +        qemu_sem_post(&p->sem);
> +        qemu_mutex_unlock(&p->mutex);
> +    }
> +}
> +
> +void multifd_load_cleanup(void)
> +{
> +    int i;
> +
> +    if (!migrate_use_multifd()) {
> +        return;
> +    }
> +    terminate_multifd_recv_threads();
> +    for (i = 0; i < multifd_recv_state->count; i++) {
> +        MultiFDRecvParams *p = &multifd_recv_state->params[i];
> +
> +        qemu_thread_join(&p->thread);
> +        qemu_mutex_destroy(&p->mutex);
> +        qemu_sem_destroy(&p->sem);
> +    }
> +    g_free(multifd_recv_state->params);
> +    multifd_recv_state->params = NULL;
> +    g_free(multifd_recv_state);
> +    multifd_recv_state = NULL;
> +}
> +
> +static void *multifd_recv_thread(void *opaque)
> +{
> +    MultiFDRecvParams *p = opaque;
> +
> +    while (true) {
> +        qemu_mutex_lock(&p->mutex);
> +        if (p->quit) {
> +            qemu_mutex_unlock(&p->mutex);
> +            break;
> +        }
> +        qemu_mutex_unlock(&p->mutex);
> +        qemu_sem_wait(&p->sem);
> +    }
> +
> +    return NULL;
> +}
> +
> +int multifd_load_setup(void)
> +{
> +    int thread_count;
> +    uint8_t i;
> +
> +    if (!migrate_use_multifd()) {
> +        return 0;
> +    }
> +    thread_count = migrate_multifd_threads();
> +    multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state));
> +    multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count);
> +    multifd_recv_state->count = 0;
> +    for (i = 0; i < thread_count; i++) {
> +        char thread_name[16];
> +        MultiFDRecvParams *p = &multifd_recv_state->params[i];
> +
> +        qemu_mutex_init(&p->mutex);
> +        qemu_sem_init(&p->sem, 0);
> +        p->quit = false;
> +        p->id = i;
> +        snprintf(thread_name, sizeof(thread_name), "multifdrecv_%d", i);
> +        qemu_thread_create(&p->thread, thread_name, multifd_recv_thread, p,
> +                           QEMU_THREAD_JOINABLE);
> +        multifd_recv_state->count++;
> +    }
> +    return 0;
> +}
> +

(It's a shame there's no way to wrap this boilerplate up and share it
between the send/receive threads).
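
If the common fields lived in a shared struct - call it
MultiFDCommonParams, purely hypothetical - both setup loops could call
something like this, untested:

    static void multifd_thread_init(MultiFDCommonParams *p, int id,
                                    const char *name_prefix,
                                    void *(*thread_fn)(void *), void *opaque)
    {
        char thread_name[16];

        qemu_mutex_init(&p->mutex);
        qemu_sem_init(&p->sem, 0);
        p->quit = false;
        p->id = id;
        snprintf(thread_name, sizeof(thread_name), "%s_%d", name_prefix, id);
        qemu_thread_create(&p->thread, thread_name, thread_fn, opaque,
                           QEMU_THREAD_JOINABLE);
    }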

However, all the above is minor, so:


Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

>  /**
>   * save_page_header: write page header to wire
>   *
> diff --git a/migration/ram.h b/migration/ram.h
> index c081fde..93c2bb4 100644
> --- a/migration/ram.h
> +++ b/migration/ram.h
> @@ -39,6 +39,11 @@ int64_t xbzrle_cache_resize(int64_t new_size);
>  uint64_t ram_bytes_remaining(void);
>  uint64_t ram_bytes_total(void);
>  
> +int multifd_save_setup(void);
> +void multifd_save_cleanup(void);
> +int multifd_load_setup(void);
> +void multifd_load_cleanup(void);
> +
>  uint64_t ram_pagesize_summary(void);
>  int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len);
>  void acct_update_position(QEMUFile *f, size_t size, bool zero);
> -- 
> 2.9.4
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 08/17] migration: Split migration_fd_process_incomming
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 08/17] migration: Split migration_fd_process_incomming Juan Quintela
@ 2017-07-19 17:08   ` Dr. David Alan Gilbert
  2017-07-21 12:39     ` Eric Blake
  0 siblings, 1 reply; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-19 17:08 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> We need that on posterior patches.

following/subsequent/later is probably a better word.

other than that;


Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  migration/migration.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 5a82c1c..b81c498 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -356,19 +356,31 @@ static void process_incoming_migration_co(void *opaque)
>      qemu_bh_schedule(mis->bh);
>  }
>  
> -void migration_fd_process_incoming(QEMUFile *f)
> +static void migration_incoming_setup(QEMUFile *f)
>  {
> -    Coroutine *co = qemu_coroutine_create(process_incoming_migration_co, f);
> +    MigrationIncomingState *mis = migration_incoming_get_current();
>  
>      if (multifd_load_setup() != 0) {
>          /* We haven't been able to create multifd threads
>             nothing better to do */
>          exit(EXIT_FAILURE);
>      }
> +    mis->from_src_file = f;
>      qemu_file_set_blocking(f, false);
> +}
> +
> +static void migration_incoming_process(void)
> +{
> +    Coroutine *co = qemu_coroutine_create(process_incoming_migration_co, NULL);
>      qemu_coroutine_enter(co);
>  }
>  
> +void migration_fd_process_incoming(QEMUFile *f)
> +{
> +    migration_incoming_setup(f);
> +    migration_incoming_process();
> +}
> +
>  gboolean migration_ioc_process_incoming(QIOChannel *ioc)
>  {
>      MigrationIncomingState *mis = migration_incoming_get_current();
> -- 
> 2.9.4
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 04/17] migration: Add multifd capability
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 04/17] migration: Add multifd capability Juan Quintela
  2017-07-19 15:44   ` Dr. David Alan Gilbert
@ 2017-07-19 17:14   ` Eric Blake
  1 sibling, 0 replies; 93+ messages in thread
From: Eric Blake @ 2017-07-19 17:14 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel; +Cc: lvivier, dgilbert, peterx

[-- Attachment #1: Type: text/plain, Size: 849 bytes --]

On 07/17/2017 08:42 AM, Juan Quintela wrote:
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  migration/migration.c | 9 +++++++++
>  migration/migration.h | 1 +
>  qapi-schema.json      | 4 ++--
>  3 files changed, 12 insertions(+), 2 deletions(-)
> 

> +++ b/qapi-schema.json
> @@ -902,14 +902,14 @@
>  #
>  # @return-path: If enabled, migration will use the return path even
>  #               for precopy. (since 2.10)
> +# @x-multifd: Use more than one fd for migration (since 2.10)

Are we still aiming for 2.10, even though this is a feature and soft
freeze has already passed?  It would be better to update it to say 2.11.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work Juan Quintela
  2017-07-19 13:56   ` Daniel P. Berrange
@ 2017-07-19 17:35   ` Dr. David Alan Gilbert
  2017-08-08  9:35     ` Juan Quintela
  2017-07-20  9:34   ` Peter Xu
  2 siblings, 1 reply; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-19 17:35 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> We create new channels for each new thread created. We only send through
> them a character to be sure that we are creating the channels in the
> right order.

That text is out of date, isn't it?

> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> 
> --
> Split SocketArgs into incoming and outgoing args
> 
> Use UUID's on the initial message, so we are sure we are connecting to
> the right channel.
> 
> Remove init semaphore.  Now that we use uuids on the init message, we
> know that this is our channel.
> 
> Fix recv socket destwroy, we were destroying send channels.
> This was very interesting, because we were using an unreferred object
> without problems.
> 
> Move to struct of pointers
> init channel sooner.
> split recv thread creation.
> listen on main thread
> ---
>  migration/migration.c |   7 ++-
>  migration/ram.c       | 118 ++++++++++++++++++++++++++++++++++++++++++--------
>  migration/ram.h       |   2 +
>  migration/socket.c    |  38 ++++++++++++++--
>  migration/socket.h    |  10 +++++
>  5 files changed, 152 insertions(+), 23 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index b81c498..e1c79d5 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -389,8 +389,13 @@ gboolean migration_ioc_process_incoming(QIOChannel *ioc)
>          QEMUFile *f = qemu_fopen_channel_input(ioc);
>          mis->from_src_file = f;
>          migration_fd_process_incoming(f);
> +        if (!migrate_use_multifd()) {
> +            return FALSE;
> +        } else {
> +            return TRUE;
> +        }
>      }
> -    return FALSE; /* unregister */
> +    return multifd_new_channel(ioc);
>  }
>  
>  /*
> diff --git a/migration/ram.c b/migration/ram.c
> index 8e87533..b80f511 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -36,6 +36,7 @@
>  #include "xbzrle.h"
>  #include "ram.h"
>  #include "migration.h"
> +#include "socket.h"
>  #include "migration/register.h"
>  #include "migration/misc.h"
>  #include "qemu-file.h"
> @@ -46,6 +47,8 @@
>  #include "exec/ram_addr.h"
>  #include "qemu/rcu_queue.h"
>  #include "migration/colo.h"
> +#include "sysemu/sysemu.h"
> +#include "qemu/uuid.h"
>  
>  /***********************************************************/
>  /* ram save/restore */
> @@ -361,6 +364,7 @@ static void compress_threads_save_setup(void)
>  struct MultiFDSendParams {
>      uint8_t id;
>      QemuThread thread;
> +    QIOChannel *c;
>      QemuSemaphore sem;
>      QemuMutex mutex;
>      bool quit;
> @@ -401,6 +405,7 @@ void multifd_save_cleanup(void)
>          qemu_thread_join(&p->thread);
>          qemu_mutex_destroy(&p->mutex);
>          qemu_sem_destroy(&p->sem);
> +        socket_send_channel_destroy(p->c);
>      }
>      g_free(multifd_send_state->params);
>      multifd_send_state->params = NULL;
> @@ -408,11 +413,38 @@ void multifd_save_cleanup(void)
>      multifd_send_state = NULL;
>  }
>  
> +/* Default uuid for multifd when qemu is not started with uuid */
> +static char multifd_uuid[] = "5c49fd7e-af88-4a07-b6e8-091fd696ad40";
> +/* strlen(multifd) + '-' + <channel id> + '-' +  UUID_FMT + '\0' */
> +#define MULTIFD_UUID_MSG (7 + 1 + 3 + 1 + UUID_FMT_LEN + 1)
> +
>  static void *multifd_send_thread(void *opaque)
>  {
>      MultiFDSendParams *p = opaque;
> +    char string[MULTIFD_UUID_MSG];
> +    char *string_uuid;
> +    int res;
> +    bool exit = false;
>  
> -    while (true) {
> +    if (qemu_uuid_set) {
> +        string_uuid = qemu_uuid_unparse_strdup(&qemu_uuid);
> +    } else {
> +        string_uuid = g_strdup(multifd_uuid);
> +    }
> +    res = snprintf(string, MULTIFD_UUID_MSG, "%s multifd %03d",
> +                   string_uuid, p->id);
> +    g_free(string_uuid);
> +
> +    /* -1 due to the wonders of '\0' accounting */
> +    if (res != (MULTIFD_UUID_MSG - 1)) {
> +        error_report("Multifd UUID message '%s' is not of right length",
> +            string);
> +        exit = true;
> +    } else {
> +        qio_channel_write(p->c, string, MULTIFD_UUID_MSG, &error_abort);
> +    }
> +
> +    while (!exit) {
>          qemu_mutex_lock(&p->mutex);
>          if (p->quit) {
>              qemu_mutex_unlock(&p->mutex);
> @@ -445,6 +477,12 @@ int multifd_save_setup(void)
>          qemu_sem_init(&p->sem, 0);
>          p->quit = false;
>          p->id = i;
> +        p->c = socket_send_channel_create();
> +        if (!p->c) {
> +            error_report("Error creating a send channel");
> +            multifd_save_cleanup();
> +            return -1;
> +        }
>          snprintf(thread_name, sizeof(thread_name), "multifdsend_%d", i);
>          qemu_thread_create(&p->thread, thread_name, multifd_send_thread, p,
>                             QEMU_THREAD_JOINABLE);
> @@ -456,6 +494,7 @@ int multifd_save_setup(void)
>  struct MultiFDRecvParams {
>      uint8_t id;
>      QemuThread thread;
> +    QIOChannel *c;
>      QemuSemaphore sem;
>      QemuMutex mutex;
>      bool quit;
> @@ -463,7 +502,7 @@ struct MultiFDRecvParams {
>  typedef struct MultiFDRecvParams MultiFDRecvParams;
>  
>  struct {
> -    MultiFDRecvParams *params;
> +    MultiFDRecvParams **params;

Probably want to push that up to where you added that struct?

>      /* number of created threads */
>      int count;
>  } *multifd_recv_state;
> @@ -473,7 +512,7 @@ static void terminate_multifd_recv_threads(void)
>      int i;
>  
>      for (i = 0; i < multifd_recv_state->count; i++) {
> -        MultiFDRecvParams *p = &multifd_recv_state->params[i];
> +        MultiFDRecvParams *p = multifd_recv_state->params[i];
>  
>          qemu_mutex_lock(&p->mutex);
>          p->quit = true;
> @@ -491,11 +530,13 @@ void multifd_load_cleanup(void)
>      }
>      terminate_multifd_recv_threads();
>      for (i = 0; i < multifd_recv_state->count; i++) {
> -        MultiFDRecvParams *p = &multifd_recv_state->params[i];
> +        MultiFDRecvParams *p = multifd_recv_state->params[i];
>  
>          qemu_thread_join(&p->thread);
>          qemu_mutex_destroy(&p->mutex);
>          qemu_sem_destroy(&p->sem);
> +        socket_recv_channel_destroy(p->c);
> +        g_free(p);
>      }
>      g_free(multifd_recv_state->params);
>      multifd_recv_state->params = NULL;
> @@ -520,31 +561,70 @@ static void *multifd_recv_thread(void *opaque)
>      return NULL;
>  }
>  
> +gboolean multifd_new_channel(QIOChannel *ioc)
> +{
> +    int thread_count = migrate_multifd_threads();
> +    MultiFDRecvParams *p = g_new0(MultiFDRecvParams, 1);
> +    MigrationState *s = migrate_get_current();
> +    char string[MULTIFD_UUID_MSG];
> +    char string_uuid[UUID_FMT_LEN];
> +    char *uuid;
> +    int id;
> +
> +    qio_channel_read(ioc, string, sizeof(string), &error_abort);
> +    sscanf(string, "%s multifd %03d", string_uuid, &id);
> +
> +    if (qemu_uuid_set) {
> +        uuid = qemu_uuid_unparse_strdup(&qemu_uuid);
> +    } else {
> +        uuid = g_strdup(multifd_uuid);
> +    }
> +    if (strcmp(string_uuid, uuid)) {
> +        error_report("multifd: received uuid '%s' and expected uuid '%s'",
> +                     string_uuid, uuid);

probably worth adding the channel id as well so we can see which
channel it was when it fails.
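
i.e. something like:

    error_report("multifd %d: received uuid '%s' and expected uuid '%s'",
                 id, string_uuid, uuid);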

> +        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> +                          MIGRATION_STATUS_FAILED);
> +        terminate_multifd_recv_threads();
> +        return FALSE;
> +    }
> +    g_free(uuid);
> +
> +    if (multifd_recv_state->params[id] != NULL) {
> +        error_report("multifd: received id '%d' is already setup'", id);
> +        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> +                          MIGRATION_STATUS_FAILED);
> +        terminate_multifd_recv_threads();
> +        return FALSE;
> +    }
> +    qemu_mutex_init(&p->mutex);
> +    qemu_sem_init(&p->sem, 0);
> +    p->quit = false;
> +    p->id = id;
> +    p->c = ioc;
> +    atomic_set(&multifd_recv_state->params[id], p);

Can you explain why this is quite so careful about ordering? Is there
something that could look at params or try and take the mutex before
the count is incremented?

I think it's safe to do:
 p->quit = false;
 p->id = id;
 p->c = ioc;
 &multifd_recv_state->params[id] = p;
 qemu_sem_init(&p->sem, 0);
 qemu_mutex_init(&p->mutex);
 qemu_thread_create(...)
 atomic_inc(&multifd_recv_state->count);    <-- I'm not sure if this
 needs to be atomic

> +    qemu_thread_create(&p->thread, "multifd_recv", multifd_recv_thread, p,
> +                       QEMU_THREAD_JOINABLE);

You've lost the nice numbered thread names that were created in the
version of this code you're removing here.
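
i.e. keep something like the hunk being removed below:

    char thread_name[16];

    snprintf(thread_name, sizeof(thread_name), "multifdrecv_%d", id);
    qemu_thread_create(&p->thread, thread_name, multifd_recv_thread, p,
                       QEMU_THREAD_JOINABLE);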

> +    multifd_recv_state->count++;
> +
> +    /* We need to return FALSE for the last channel */
> +    if (multifd_recv_state->count == thread_count) {
> +        return FALSE;
> +    } else {
> +        return TRUE;
> +    }

return multifd_recv_state->count != thread_count;   ?

> +}
> +
>  int multifd_load_setup(void)
>  {
>      int thread_count;
> -    uint8_t i;
>  
>      if (!migrate_use_multifd()) {
>          return 0;
>      }
>      thread_count = migrate_multifd_threads();
>      multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state));
> -    multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count);
> +    multifd_recv_state->params = g_new0(MultiFDRecvParams *, thread_count);
>      multifd_recv_state->count = 0;
> -    for (i = 0; i < thread_count; i++) {
> -        char thread_name[16];
> -        MultiFDRecvParams *p = &multifd_recv_state->params[i];
> -
> -        qemu_mutex_init(&p->mutex);
> -        qemu_sem_init(&p->sem, 0);
> -        p->quit = false;
> -        p->id = i;
> -        snprintf(thread_name, sizeof(thread_name), "multifdrecv_%d", i);
> -        qemu_thread_create(&p->thread, thread_name, multifd_recv_thread, p,
> -                           QEMU_THREAD_JOINABLE);
> -        multifd_recv_state->count++;
> -    }
>      return 0;
>  }
>  
> diff --git a/migration/ram.h b/migration/ram.h
> index 93c2bb4..9413544 100644
> --- a/migration/ram.h
> +++ b/migration/ram.h
> @@ -31,6 +31,7 @@
>  
>  #include "qemu-common.h"
>  #include "exec/cpu-common.h"
> +#include "io/channel.h"
>  
>  extern MigrationStats ram_counters;
>  extern XBZRLECacheStats xbzrle_counters;
> @@ -43,6 +44,7 @@ int multifd_save_setup(void);
>  void multifd_save_cleanup(void);
>  int multifd_load_setup(void);
>  void multifd_load_cleanup(void);
> +gboolean multifd_new_channel(QIOChannel *ioc);
>  
>  uint64_t ram_pagesize_summary(void);
>  int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len);
> diff --git a/migration/socket.c b/migration/socket.c
> index 6195596..32a6b39 100644
> --- a/migration/socket.c
> +++ b/migration/socket.c
> @@ -26,6 +26,38 @@
>  #include "io/channel-socket.h"
>  #include "trace.h"
>  
> +int socket_recv_channel_destroy(QIOChannel *recv)
> +{
> +    /* Remove channel */
> +    object_unref(OBJECT(recv));
> +    return 0;
> +}
> +
> +struct SocketOutgoingArgs {
> +    SocketAddress *saddr;
> +    Error **errp;
> +} outgoing_args;
> +
> +QIOChannel *socket_send_channel_create(void)
> +{
> +    QIOChannelSocket *sioc = qio_channel_socket_new();
> +
> +    qio_channel_socket_connect_sync(sioc, outgoing_args.saddr,
> +                                    outgoing_args.errp);
> +    qio_channel_set_delay(QIO_CHANNEL(sioc), false);
> +    return QIO_CHANNEL(sioc);
> +}
> +
> +int socket_send_channel_destroy(QIOChannel *send)
> +{
> +    /* Remove channel */
> +    object_unref(OBJECT(send));
> +    if (outgoing_args.saddr) {
> +        qapi_free_SocketAddress(outgoing_args.saddr);
> +        outgoing_args.saddr = NULL;
> +    }
> +    return 0;
> +}
>  
>  static SocketAddress *tcp_build_address(const char *host_port, Error **errp)
>  {
> @@ -96,6 +128,9 @@ static void socket_start_outgoing_migration(MigrationState *s,
>      struct SocketConnectData *data = g_new0(struct SocketConnectData, 1);
>  
>      data->s = s;
> +    outgoing_args.saddr = saddr;
> +    outgoing_args.errp = errp;
> +
>      if (saddr->type == SOCKET_ADDRESS_TYPE_INET) {
>          data->hostname = g_strdup(saddr->u.inet.host);
>      }
> @@ -106,7 +141,6 @@ static void socket_start_outgoing_migration(MigrationState *s,
>                                       socket_outgoing_migration,
>                                       data,
>                                       socket_connect_data_free);
> -    qapi_free_SocketAddress(saddr);
>  }
>  
>  void tcp_start_outgoing_migration(MigrationState *s,
> @@ -151,8 +185,6 @@ static gboolean socket_accept_incoming_migration(QIOChannel *ioc,
>  
>      qio_channel_set_name(QIO_CHANNEL(sioc), "migration-socket-incoming");
>      result = migration_channel_process_incoming(QIO_CHANNEL(sioc));
> -    object_unref(OBJECT(sioc));
> -
>  out:
>      if (result == FALSE) {
>          /* Close listening socket as its no longer needed */
> diff --git a/migration/socket.h b/migration/socket.h
> index 6b91e9d..dabce0e 100644
> --- a/migration/socket.h
> +++ b/migration/socket.h
> @@ -16,6 +16,16 @@
>  
>  #ifndef QEMU_MIGRATION_SOCKET_H
>  #define QEMU_MIGRATION_SOCKET_H
> +
> +#include "io/channel.h"
> +
> +QIOChannel *socket_recv_channel_create(void);
> +int socket_recv_channel_destroy(QIOChannel *recv);
> +
> +QIOChannel *socket_send_channel_create(void);
> +
> +int socket_send_channel_destroy(QIOChannel *send);
> +
>  void tcp_start_incoming_migration(const char *host_port, Error **errp);
>  
>  void tcp_start_outgoing_migration(MigrationState *s, const char *host_port,
> -- 
> 2.9.4

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page Juan Quintela
@ 2017-07-19 19:02   ` Dr. David Alan Gilbert
  2017-07-20  8:10     ` Peter Xu
  2017-08-08 15:56     ` Juan Quintela
  0 siblings, 2 replies; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-19 19:02 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> The function still don't use multifd, but we have simplified
> ram_save_page, xbzrle and RDMA stuff is gone.  We have added a new
> counter and a new flag for this type of pages.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  hmp.c                 |  2 ++
>  migration/migration.c |  1 +
>  migration/ram.c       | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>  qapi-schema.json      |  5 ++-
>  4 files changed, 96 insertions(+), 2 deletions(-)
> 
> diff --git a/hmp.c b/hmp.c
> index b01605a..eeb308b 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -234,6 +234,8 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
>              monitor_printf(mon, "postcopy request count: %" PRIu64 "\n",
>                             info->ram->postcopy_requests);
>          }
> +        monitor_printf(mon, "multifd: %" PRIu64 " pages\n",
> +                       info->ram->multifd);
>      }
>  
>      if (info->has_disk) {
> diff --git a/migration/migration.c b/migration/migration.c
> index e1c79d5..d9d5415 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -528,6 +528,7 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s)
>      info->ram->dirty_sync_count = ram_counters.dirty_sync_count;
>      info->ram->postcopy_requests = ram_counters.postcopy_requests;
>      info->ram->page_size = qemu_target_page_size();
> +    info->ram->multifd = ram_counters.multifd;
>  
>      if (migrate_use_xbzrle()) {
>          info->has_xbzrle_cache = true;
> diff --git a/migration/ram.c b/migration/ram.c
> index b80f511..2bf3fa7 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -68,6 +68,7 @@
>  #define RAM_SAVE_FLAG_XBZRLE   0x40
>  /* 0x80 is reserved in migration.h start with 0x100 next */
>  #define RAM_SAVE_FLAG_COMPRESS_PAGE    0x100
> +#define RAM_SAVE_FLAG_MULTIFD_PAGE     0x200
>  
>  static inline bool is_zero_range(uint8_t *p, uint64_t size)
>  {
> @@ -362,12 +363,17 @@ static void compress_threads_save_setup(void)
>  /* Multiple fd's */
>  
>  struct MultiFDSendParams {
> +    /* not changed */
>      uint8_t id;
>      QemuThread thread;
>      QIOChannel *c;
>      QemuSemaphore sem;
>      QemuMutex mutex;
> +    /* protected by param mutex */
>      bool quit;

Should probably add a comment to say which address space 'address' is
in - it's really a QEMU host pointer - and that's why we can treat 0
as special?

> +    uint8_t *address;
> +    /* protected by multifd mutex */
> +    bool done;

'done' needs a comment to explain what it is, because it sounds
similar to 'quit'; I think 'done' means that the thread is idle,
having done what was asked?
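
Something along these lines perhaps (wording only a suggestion):

    /* pointer (in QEMU's virtual address space) to the page to send;
     * 0/NULL means no work is queued for this thread */
    uint8_t *address;
    /* protected by multifd mutex; set once the thread has finished
     * with the last page it was given and is idle again */
    bool done;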

>  };
>  typedef struct MultiFDSendParams MultiFDSendParams;
>  
> @@ -375,6 +381,8 @@ struct {
>      MultiFDSendParams *params;
>      /* number of created threads */
>      int count;
> +    QemuMutex mutex;
> +    QemuSemaphore sem;
>  } *multifd_send_state;
>  
>  static void terminate_multifd_send_threads(void)
> @@ -443,6 +451,7 @@ static void *multifd_send_thread(void *opaque)
>      } else {
>          qio_channel_write(p->c, string, MULTIFD_UUID_MSG, &error_abort);
>      }
> +    qemu_sem_post(&multifd_send_state->sem);
>  
>      while (!exit) {
>          qemu_mutex_lock(&p->mutex);
> @@ -450,6 +459,15 @@ static void *multifd_send_thread(void *opaque)
>              qemu_mutex_unlock(&p->mutex);
>              break;
>          }
> +        if (p->address) {
> +            p->address = 0;
> +            qemu_mutex_unlock(&p->mutex);
> +            qemu_mutex_lock(&multifd_send_state->mutex);
> +            p->done = true;
> +            qemu_mutex_unlock(&multifd_send_state->mutex);
> +            qemu_sem_post(&multifd_send_state->sem);
> +            continue;
> +        }
>          qemu_mutex_unlock(&p->mutex);
>          qemu_sem_wait(&p->sem);
>      }
> @@ -469,6 +487,8 @@ int multifd_save_setup(void)
>      multifd_send_state = g_malloc0(sizeof(*multifd_send_state));
>      multifd_send_state->params = g_new0(MultiFDSendParams, thread_count);
>      multifd_send_state->count = 0;
> +    qemu_mutex_init(&multifd_send_state->mutex);
> +    qemu_sem_init(&multifd_send_state->sem, 0);
>      for (i = 0; i < thread_count; i++) {
>          char thread_name[16];
>          MultiFDSendParams *p = &multifd_send_state->params[i];
> @@ -477,6 +497,8 @@ int multifd_save_setup(void)
>          qemu_sem_init(&p->sem, 0);
>          p->quit = false;
>          p->id = i;
> +        p->done = true;
> +        p->address = 0;
>          p->c = socket_send_channel_create();
>          if (!p->c) {
>              error_report("Error creating a send channel");
> @@ -491,6 +513,30 @@ int multifd_save_setup(void)
>      return 0;
>  }
>  
> +static int multifd_send_page(uint8_t *address)
> +{
> +    int i;
> +    MultiFDSendParams *p = NULL; /* make happy gcc */
> +
> +    qemu_sem_wait(&multifd_send_state->sem);
> +    qemu_mutex_lock(&multifd_send_state->mutex);
> +    for (i = 0; i < multifd_send_state->count; i++) {
> +        p = &multifd_send_state->params[i];
> +
> +        if (p->done) {
> +            p->done = false;
> +            break;
> +        }
> +    }
> +    qemu_mutex_unlock(&multifd_send_state->mutex);
> +    qemu_mutex_lock(&p->mutex);
> +    p->address = address;
> +    qemu_mutex_unlock(&p->mutex);
> +    qemu_sem_post(&p->sem);

My feeling, without having fully thought it through, is that the
locking around 'address' can be simplified, especially if the
sending thread never actually changes it.

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_11
defines that most of the pthread_ functions act as barriers;
including the sem_post and pthread_cond_signal that qemu_sem_post
uses.

> +    return 0;
> +}
> +
>  struct MultiFDRecvParams {
>      uint8_t id;
>      QemuThread thread;
> @@ -537,6 +583,7 @@ void multifd_load_cleanup(void)
>          qemu_sem_destroy(&p->sem);
>          socket_recv_channel_destroy(p->c);
>          g_free(p);
> +        multifd_recv_state->params[i] = NULL;
>      }
>      g_free(multifd_recv_state->params);
>      multifd_recv_state->params = NULL;
> @@ -1058,6 +1105,32 @@ static int ram_save_page(RAMState *rs, PageSearchStatus *pss, bool last_stage)
>      return pages;
>  }
>  
> +static int ram_multifd_page(RAMState *rs, PageSearchStatus *pss,
> +                            bool last_stage)
> +{
> +    int pages;
> +    uint8_t *p;
> +    RAMBlock *block = pss->block;
> +    ram_addr_t offset = pss->page << TARGET_PAGE_BITS;
> +
> +    p = block->host + offset;
> +
> +    pages = save_zero_page(rs, block, offset, p);
> +    if (pages == -1) {
> +        ram_counters.transferred +=
> +            save_page_header(rs, rs->f, block,
> +                             offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
> +        qemu_put_buffer(rs->f, p, TARGET_PAGE_SIZE);
> +        multifd_send_page(p);
> +        ram_counters.transferred += TARGET_PAGE_SIZE;
> +        pages = 1;
> +        ram_counters.normal++;
> +        ram_counters.multifd++;
> +    }
> +
> +    return pages;
> +}
> +
>  static int do_compress_ram_page(QEMUFile *f, RAMBlock *block,
>                                  ram_addr_t offset)
>  {
> @@ -1486,6 +1559,8 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss,
>          if (migrate_use_compression() &&
>              (rs->ram_bulk_stage || !migrate_use_xbzrle())) {
>              res = ram_save_compressed_page(rs, pss, last_stage);
> +        } else if (migrate_use_multifd()) {
> +            res = ram_multifd_page(rs, pss, last_stage);

It's a pity we can't wire this up with compression, but I understand
why you simplify that.

I'll see how the multiple-pages stuff works below; but the interesting
thing here is we've already split up host-pages, which seems like a bad
idea.


>          } else {
>              res = ram_save_page(rs, pss, last_stage);
>          }
> @@ -2778,6 +2853,10 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>      if (!migrate_use_compression()) {
>          invalid_flags |= RAM_SAVE_FLAG_COMPRESS_PAGE;
>      }
> +
> +    if (!migrate_use_multifd()) {
> +        invalid_flags |= RAM_SAVE_FLAG_MULTIFD_PAGE;
> +    }
>      /* This RCU critical section can be very long running.
>       * When RCU reclaims in the code start to become numerous,
>       * it will be necessary to reduce the granularity of this
> @@ -2802,13 +2881,17 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>              if (flags & invalid_flags & RAM_SAVE_FLAG_COMPRESS_PAGE) {
>                  error_report("Received an unexpected compressed page");
>              }
> +            if (flags & invalid_flags  & RAM_SAVE_FLAG_MULTIFD_PAGE) {
> +                error_report("Received an unexpected multifd page");
> +            }
>  
>              ret = -EINVAL;
>              break;
>          }
>  
>          if (flags & (RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE |
> -                     RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE)) {
> +                     RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE |
> +                     RAM_SAVE_FLAG_MULTIFD_PAGE)) {
>              RAMBlock *block = ram_block_from_stream(f, flags);
>  
>              host = host_from_ram_block_offset(block, addr);
> @@ -2896,6 +2979,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>                  break;
>              }
>              break;
> +
> +        case RAM_SAVE_FLAG_MULTIFD_PAGE:
> +            qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
> +            break;
> +
>          case RAM_SAVE_FLAG_EOS:
>              /* normal exit */
>              break;
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 5b3733e..f708782 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -601,6 +601,8 @@
>  # @page-size: The number of bytes per page for the various page-based
>  #        statistics (since 2.10)
>  #
> +# @multifd: number of pages sent with multifd (since 2.10)

Hopeful!

Dave
>  # Since: 0.14.0
>  ##
>  { 'struct': 'MigrationStats',
> @@ -608,7 +610,8 @@
>             'duplicate': 'int', 'skipped': 'int', 'normal': 'int',
>             'normal-bytes': 'int', 'dirty-pages-rate' : 'int',
>             'mbps' : 'number', 'dirty-sync-count' : 'int',
> -           'postcopy-requests' : 'int', 'page-size' : 'int' } }
> +           'postcopy-requests' : 'int', 'page-size' : 'int',
> +           'multifd' : 'int'} }
>  
>  ##
>  # @XBZRLECacheStats:
> -- 
> 2.9.4
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 01/17] migrate: Add gboolean return type to migrate_channel_process_incoming
  2017-07-19 15:01   ` Dr. David Alan Gilbert
@ 2017-07-20  7:00     ` Peter Xu
  2017-07-20  8:47       ` Daniel P. Berrange
  0 siblings, 1 reply; 93+ messages in thread
From: Peter Xu @ 2017-07-20  7:00 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: Juan Quintela, qemu-devel, lvivier, berrange

On Wed, Jul 19, 2017 at 04:01:10PM +0100, Dr. David Alan Gilbert wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
> > Signed-off-by: Juan Quintela <quintela@redhat.com>
> > ---
> >  migration/channel.c |  3 ++-
> >  migration/channel.h |  2 +-
> >  migration/exec.c    |  6 ++++--
> >  migration/socket.c  | 12 ++++++++----
> >  4 files changed, 15 insertions(+), 8 deletions(-)
> > 
> > diff --git a/migration/channel.c b/migration/channel.c
> > index 3b7252f..719055d 100644
> > --- a/migration/channel.c
> > +++ b/migration/channel.c
> > @@ -19,7 +19,7 @@
> >  #include "qapi/error.h"
> >  #include "io/channel-tls.h"
> >  
> > -void migration_channel_process_incoming(QIOChannel *ioc)
> > +gboolean migration_channel_process_incoming(QIOChannel *ioc)
> >  {
> >      MigrationState *s = migrate_get_current();
> >  
> > @@ -39,6 +39,7 @@ void migration_channel_process_incoming(QIOChannel *ioc)
> >          QEMUFile *f = qemu_fopen_channel_input(ioc);
> >          migration_fd_process_incoming(f);
> >      }
> > +    return FALSE; /* unregister */
> >  }
> >  
> >  
> > diff --git a/migration/channel.h b/migration/channel.h
> > index e4b4057..72cbc9f 100644
> > --- a/migration/channel.h
> > +++ b/migration/channel.h
> > @@ -18,7 +18,7 @@
> >  
> >  #include "io/channel.h"
> >  
> > -void migration_channel_process_incoming(QIOChannel *ioc);
> > +gboolean migration_channel_process_incoming(QIOChannel *ioc);
> 
> Can you add a comment here that says what the return value means.

And it looks like we have G_SOURCE_CONTINUE and G_SOURCE_REMOVE:

https://developer.gnome.org/glib/stable/glib-The-Main-Event-Loop.html#G-SOURCE-CONTINUE:CAPS

Maybe we can use them as well?

I think the problem is that GSourceFunc's return code (which is a
gboolean) is not clear enough.

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page
  2017-07-19 19:02   ` Dr. David Alan Gilbert
@ 2017-07-20  8:10     ` Peter Xu
  2017-07-20 11:48       ` Dr. David Alan Gilbert
  2017-08-08 16:04       ` Juan Quintela
  2017-08-08 15:56     ` Juan Quintela
  1 sibling, 2 replies; 93+ messages in thread
From: Peter Xu @ 2017-07-20  8:10 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: Juan Quintela, qemu-devel, lvivier, berrange

On Wed, Jul 19, 2017 at 08:02:39PM +0100, Dr. David Alan Gilbert wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
> > The function still don't use multifd, but we have simplified
> > ram_save_page, xbzrle and RDMA stuff is gone.  We have added a new
> > counter and a new flag for this type of pages.
> > 
> > Signed-off-by: Juan Quintela <quintela@redhat.com>
> > ---
> >  hmp.c                 |  2 ++
> >  migration/migration.c |  1 +
> >  migration/ram.c       | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++-
> >  qapi-schema.json      |  5 ++-
> >  4 files changed, 96 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hmp.c b/hmp.c
> > index b01605a..eeb308b 100644
> > --- a/hmp.c
> > +++ b/hmp.c
> > @@ -234,6 +234,8 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
> >              monitor_printf(mon, "postcopy request count: %" PRIu64 "\n",
> >                             info->ram->postcopy_requests);
> >          }
> > +        monitor_printf(mon, "multifd: %" PRIu64 " pages\n",
> > +                       info->ram->multifd);
> >      }
> >  
> >      if (info->has_disk) {
> > diff --git a/migration/migration.c b/migration/migration.c
> > index e1c79d5..d9d5415 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -528,6 +528,7 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s)
> >      info->ram->dirty_sync_count = ram_counters.dirty_sync_count;
> >      info->ram->postcopy_requests = ram_counters.postcopy_requests;
> >      info->ram->page_size = qemu_target_page_size();
> > +    info->ram->multifd = ram_counters.multifd;
> >  
> >      if (migrate_use_xbzrle()) {
> >          info->has_xbzrle_cache = true;
> > diff --git a/migration/ram.c b/migration/ram.c
> > index b80f511..2bf3fa7 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -68,6 +68,7 @@
> >  #define RAM_SAVE_FLAG_XBZRLE   0x40
> >  /* 0x80 is reserved in migration.h start with 0x100 next */
> >  #define RAM_SAVE_FLAG_COMPRESS_PAGE    0x100
> > +#define RAM_SAVE_FLAG_MULTIFD_PAGE     0x200
> >  
> >  static inline bool is_zero_range(uint8_t *p, uint64_t size)
> >  {
> > @@ -362,12 +363,17 @@ static void compress_threads_save_setup(void)
> >  /* Multiple fd's */
> >  
> >  struct MultiFDSendParams {
> > +    /* not changed */
> >      uint8_t id;
> >      QemuThread thread;
> >      QIOChannel *c;
> >      QemuSemaphore sem;
> >      QemuMutex mutex;
> > +    /* protected by param mutex */
> >      bool quit;
> 
> Should probably comment to say what address space address is in - this
> is really a qemu pointer - and that's why we can treat 0 as special?

I believe this comment is for "address" below.

Yes, it would be nice to comment it. IIUC it belongs to QEMU's
virtual address space, so it should be okay to use zero as a
"special" value.

> 
> > +    uint8_t *address;
> > +    /* protected by multifd mutex */
> > +    bool done;
> 
> done needs a comment to explain what it is because
> it sounds similar to quit;  I think 'done' is saying that
> the thread is idle having done what was asked?

Since we know a valid address won't be zero, I'm not sure whether we
can just avoid introducing the "done" field (I'm not even sure we
need a lock when modifying "address" - I think not? Please see
below). From what I can see, it works like a state machine, and
"address" can be the state:

            +--------  send thread ---------+
            |                               |
           \|/                              |
        address==0 (IDLE)               address!=0 (ACTIVE)
            |                              /|\
            |                               |
            +--------  main thread ---------+

Then...

> 
> >  };
> >  typedef struct MultiFDSendParams MultiFDSendParams;
> >  
> > @@ -375,6 +381,8 @@ struct {
> >      MultiFDSendParams *params;
> >      /* number of created threads */
> >      int count;
> > +    QemuMutex mutex;
> > +    QemuSemaphore sem;
> >  } *multifd_send_state;
> >  
> >  static void terminate_multifd_send_threads(void)
> > @@ -443,6 +451,7 @@ static void *multifd_send_thread(void *opaque)
> >      } else {
> >          qio_channel_write(p->c, string, MULTIFD_UUID_MSG, &error_abort);
> >      }
> > +    qemu_sem_post(&multifd_send_state->sem);
> >  
> >      while (!exit) {
> >          qemu_mutex_lock(&p->mutex);
> > @@ -450,6 +459,15 @@ static void *multifd_send_thread(void *opaque)
> >              qemu_mutex_unlock(&p->mutex);
> >              break;
> >          }
> > +        if (p->address) {
> > +            p->address = 0;
> > +            qemu_mutex_unlock(&p->mutex);
> > +            qemu_mutex_lock(&multifd_send_state->mutex);
> > +            p->done = true;
> > +            qemu_mutex_unlock(&multifd_send_state->mutex);
> > +            qemu_sem_post(&multifd_send_state->sem);
> > +            continue;

Here, instead of setting address=0 at the entry, can we do this (no
"done" needed this time)?

                 // send the page before clearing p->address
                 send_page(p->address);
                 // clear p->address to switch to "IDLE" state
                 atomic_set(&p->address, 0);
                 // tell main thread, in case it's waiting
                 qemu_sem_post(&multifd_send_state->sem);

And on the main thread...

> > +        }
> >          qemu_mutex_unlock(&p->mutex);
> >          qemu_sem_wait(&p->sem);
> >      }
> > @@ -469,6 +487,8 @@ int multifd_save_setup(void)
> >      multifd_send_state = g_malloc0(sizeof(*multifd_send_state));
> >      multifd_send_state->params = g_new0(MultiFDSendParams, thread_count);
> >      multifd_send_state->count = 0;
> > +    qemu_mutex_init(&multifd_send_state->mutex);
> > +    qemu_sem_init(&multifd_send_state->sem, 0);
> >      for (i = 0; i < thread_count; i++) {
> >          char thread_name[16];
> >          MultiFDSendParams *p = &multifd_send_state->params[i];
> > @@ -477,6 +497,8 @@ int multifd_save_setup(void)
> >          qemu_sem_init(&p->sem, 0);
> >          p->quit = false;
> >          p->id = i;
> > +        p->done = true;
> > +        p->address = 0;
> >          p->c = socket_send_channel_create();
> >          if (!p->c) {
> >              error_report("Error creating a send channel");
> > @@ -491,6 +513,30 @@ int multifd_save_setup(void)
> >      return 0;
> >  }
> >  
> > +static int multifd_send_page(uint8_t *address)
> > +{
> > +    int i;
> > +    MultiFDSendParams *p = NULL; /* make happy gcc */
> > +


> > +    qemu_sem_wait(&multifd_send_state->sem);
> > +    qemu_mutex_lock(&multifd_send_state->mutex);
> > +    for (i = 0; i < multifd_send_state->count; i++) {
> > +        p = &multifd_send_state->params[i];
> > +
> > +        if (p->done) {
> > +            p->done = false;
> > +            break;
> > +        }
> > +    }
> > +    qemu_mutex_unlock(&multifd_send_state->mutex);
> > +    qemu_mutex_lock(&p->mutex);
> > +    p->address = address;
> > +    qemu_mutex_unlock(&p->mutex);
> > +    qemu_sem_post(&p->sem);

... here can we just do this?

retry:
    // don't take any lock, only read each p->address
    for (i = 0; i < multifd_send_state->count; i++) {
        p = &multifd_send_state->params[i];
        if (!p->address) {
            // we found one IDLE send thread
            break;
        }
    }
    if (!p) {
        qemu_sem_wait(&multifd_send_state->sem);
        goto retry;
    }
    // we switch its state, IDLE -> ACTIVE
    atomic_set(&p->address, address);
    // tell the thread to start work
    qemu_sem_post(&p->sem);

The above doesn't use any lock at all (neither the per-thread lock
nor the global one). Would it work?

Thanks,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 01/17] migrate: Add gboolean return type to migrate_channel_process_incoming
  2017-07-20  7:00     ` Peter Xu
@ 2017-07-20  8:47       ` Daniel P. Berrange
  2017-07-24 10:18         ` Juan Quintela
  0 siblings, 1 reply; 93+ messages in thread
From: Daniel P. Berrange @ 2017-07-20  8:47 UTC (permalink / raw)
  To: Peter Xu; +Cc: Dr. David Alan Gilbert, Juan Quintela, qemu-devel, lvivier

On Thu, Jul 20, 2017 at 03:00:23PM +0800, Peter Xu wrote:
> On Wed, Jul 19, 2017 at 04:01:10PM +0100, Dr. David Alan Gilbert wrote:
> > * Juan Quintela (quintela@redhat.com) wrote:
> > > Signed-off-by: Juan Quintela <quintela@redhat.com>
> > > ---
> > >  migration/channel.c |  3 ++-
> > >  migration/channel.h |  2 +-
> > >  migration/exec.c    |  6 ++++--
> > >  migration/socket.c  | 12 ++++++++----
> > >  4 files changed, 15 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/migration/channel.c b/migration/channel.c
> > > index 3b7252f..719055d 100644
> > > --- a/migration/channel.c
> > > +++ b/migration/channel.c
> > > @@ -19,7 +19,7 @@
> > >  #include "qapi/error.h"
> > >  #include "io/channel-tls.h"
> > >  
> > > -void migration_channel_process_incoming(QIOChannel *ioc)
> > > +gboolean migration_channel_process_incoming(QIOChannel *ioc)
> > >  {
> > >      MigrationState *s = migrate_get_current();
> > >  
> > > @@ -39,6 +39,7 @@ void migration_channel_process_incoming(QIOChannel *ioc)
> > >          QEMUFile *f = qemu_fopen_channel_input(ioc);
> > >          migration_fd_process_incoming(f);
> > >      }
> > > +    return FALSE; /* unregister */
> > >  }
> > >  
> > >  
> > > diff --git a/migration/channel.h b/migration/channel.h
> > > index e4b4057..72cbc9f 100644
> > > --- a/migration/channel.h
> > > +++ b/migration/channel.h
> > > @@ -18,7 +18,7 @@
> > >  
> > >  #include "io/channel.h"
> > >  
> > > -void migration_channel_process_incoming(QIOChannel *ioc);
> > > +gboolean migration_channel_process_incoming(QIOChannel *ioc);
> > 
> > Can you add a comment here that says what the return value means.
> 
> And, looks like we have G_SOURCE_CONTINUE and G_SOURCE_REMOVE:
> 
> https://developer.gnome.org/glib/stable/glib-The-Main-Event-Loop.html#G-SOURCE-CONTINUE:CAPS
> 
> Maybe we can use them as well?

Those are newer than our min required glib version, though we could
add compat defines for them
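
e.g. (assuming they appeared in glib 2.32 - worth double checking):

    #if !GLIB_CHECK_VERSION(2, 32, 0)
    #define G_SOURCE_CONTINUE TRUE
    #define G_SOURCE_REMOVE   FALSE
    #endif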

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work Juan Quintela
  2017-07-19 13:56   ` Daniel P. Berrange
  2017-07-19 17:35   ` Dr. David Alan Gilbert
@ 2017-07-20  9:34   ` Peter Xu
  2017-08-08  9:19     ` Juan Quintela
  2 siblings, 1 reply; 93+ messages in thread
From: Peter Xu @ 2017-07-20  9:34 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, dgilbert, lvivier, berrange

On Mon, Jul 17, 2017 at 03:42:30PM +0200, Juan Quintela wrote:

[...]

>  int multifd_load_setup(void)
>  {
>      int thread_count;
> -    uint8_t i;
>  
>      if (!migrate_use_multifd()) {
>          return 0;
>      }
>      thread_count = migrate_multifd_threads();
>      multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state));
> -    multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count);
> +    multifd_recv_state->params = g_new0(MultiFDRecvParams *, thread_count);
>      multifd_recv_state->count = 0;
> -    for (i = 0; i < thread_count; i++) {
> -        char thread_name[16];
> -        MultiFDRecvParams *p = &multifd_recv_state->params[i];
> -
> -        qemu_mutex_init(&p->mutex);
> -        qemu_sem_init(&p->sem, 0);
> -        p->quit = false;
> -        p->id = i;
> -        snprintf(thread_name, sizeof(thread_name), "multifdrecv_%d", i);
> -        qemu_thread_create(&p->thread, thread_name, multifd_recv_thread, p,
> -                           QEMU_THREAD_JOINABLE);
> -        multifd_recv_state->count++;
> -    }

Could I ask why we explicitly switched from a MultiFDRecvParams[]
array to an array of pointers? Can we still use the old array?  Thanks,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time Juan Quintela
  2017-07-19 13:58   ` Daniel P. Berrange
@ 2017-07-20  9:44   ` Dr. David Alan Gilbert
  2017-08-08 12:11     ` Juan Quintela
  2017-07-20  9:49   ` Peter Xu
  2 siblings, 1 reply; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-20  9:44 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> We now send several pages at a time each time that we wakeup a thread.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> 
> --
> 
> Use iovec's insead of creating the equivalent.
> ---
>  migration/ram.c | 46 ++++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 40 insertions(+), 6 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 2bf3fa7..90e1bcb 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -362,6 +362,13 @@ static void compress_threads_save_setup(void)
>  
>  /* Multiple fd's */
>  
> +
> +typedef struct {
> +    int num;
> +    int size;

size_t ?

> +    struct iovec *iov;
> +} multifd_pages_t;
> +
>  struct MultiFDSendParams {
>      /* not changed */
>      uint8_t id;
> @@ -371,7 +378,7 @@ struct MultiFDSendParams {
>      QemuMutex mutex;
>      /* protected by param mutex */
>      bool quit;
> -    uint8_t *address;
> +    multifd_pages_t pages;
>      /* protected by multifd mutex */
>      bool done;
>  };
> @@ -459,8 +466,8 @@ static void *multifd_send_thread(void *opaque)
>              qemu_mutex_unlock(&p->mutex);
>              break;
>          }
> -        if (p->address) {
> -            p->address = 0;
> +        if (p->pages.num) {
> +            p->pages.num = 0;
>              qemu_mutex_unlock(&p->mutex);
>              qemu_mutex_lock(&multifd_send_state->mutex);
>              p->done = true;
> @@ -475,6 +482,13 @@ static void *multifd_send_thread(void *opaque)
>      return NULL;
>  }
>  
> +static void multifd_init_group(multifd_pages_t *pages)
> +{
> +    pages->num = 0;
> +    pages->size = migrate_multifd_group();
> +    pages->iov = g_malloc0(pages->size * sizeof(struct iovec));

Does that get freed anywhere?
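
Presumably multifd_save_cleanup() wants a matching:

    g_free(p->pages.iov);
    p->pages.iov = NULL;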

> +}
> +
>  int multifd_save_setup(void)
>  {
>      int thread_count;
> @@ -498,7 +512,7 @@ int multifd_save_setup(void)
>          p->quit = false;
>          p->id = i;
>          p->done = true;
> -        p->address = 0;
> +        multifd_init_group(&p->pages);
>          p->c = socket_send_channel_create();
>          if (!p->c) {
>              error_report("Error creating a send channel");
> @@ -515,8 +529,23 @@ int multifd_save_setup(void)
>  
>  static int multifd_send_page(uint8_t *address)
>  {
> -    int i;
> +    int i, j;
>      MultiFDSendParams *p = NULL; /* make happy gcc */
> +    static multifd_pages_t pages;
> +    static bool once;
> +
> +    if (!once) {
> +        multifd_init_group(&pages);
> +        once = true;
> +    }
> +
> +    pages.iov[pages.num].iov_base = address;
> +    pages.iov[pages.num].iov_len = TARGET_PAGE_SIZE;
> +    pages.num++;
> +
> +    if (pages.num < (pages.size - 1)) {
> +        return UINT16_MAX;

That's a very odd magic constant to return.
What's your intention?

> +    }
>  
>      qemu_sem_wait(&multifd_send_state->sem);
>      qemu_mutex_lock(&multifd_send_state->mutex);
> @@ -530,7 +559,12 @@ static int multifd_send_page(uint8_t *address)
>      }
>      qemu_mutex_unlock(&multifd_send_state->mutex);
>      qemu_mutex_lock(&p->mutex);
> -    p->address = address;
> +    p->pages.num = pages.num;
> +    for (j = 0; j < pages.size; j++) {
> +        p->pages.iov[j].iov_base = pages.iov[j].iov_base;
> +        p->pages.iov[j].iov_len = pages.iov[j].iov_len;
> +    }

It would seem more logical to update p->pages.num last

This is also a little odd in that iov_len is never really used,
it's always TARGET_PAGE_SIZE.

> +    pages.num = 0;
>      qemu_mutex_unlock(&p->mutex);
>      qemu_sem_post(&p->sem);

What makes sure that any final chunk of pages that was less
than the group size is sent at the end?
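
I'd have expected some small flush helper run at completion time; a
rough, untested sketch - it assumes 'pages' moves to file scope and
'multifd_send_group()' is a made-up name for the hand-off part of
multifd_send_page():

    static void multifd_send_flush(void)
    {
        if (pages.num) {
            /* push the partially filled group through the normal hand-off */
            multifd_send_group(&pages);
            pages.num = 0;
        }
    }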

Dave

> -- 
> 2.9.4
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time Juan Quintela
  2017-07-19 13:58   ` Daniel P. Berrange
  2017-07-20  9:44   ` Dr. David Alan Gilbert
@ 2017-07-20  9:49   ` Peter Xu
  2017-07-20 10:09     ` Peter Xu
  2017-08-08 16:06     ` Juan Quintela
  2 siblings, 2 replies; 93+ messages in thread
From: Peter Xu @ 2017-07-20  9:49 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, dgilbert, lvivier, berrange

On Mon, Jul 17, 2017 at 03:42:32PM +0200, Juan Quintela wrote:

[...]

>  static int multifd_send_page(uint8_t *address)
>  {
> -    int i;
> +    int i, j;
>      MultiFDSendParams *p = NULL; /* make happy gcc */
> +    static multifd_pages_t pages;
> +    static bool once;
> +
> +    if (!once) {
> +        multifd_init_group(&pages);
> +        once = true;

Would it be good to put "pages" into multifd_send_state? One benefit
is keeping the globals together; another is that we can then remove
the "once" here: we can init "pages" when we init the
multifd_send_state struct (but maybe with a better name?...).

(there are similar static variables in multifd_recv_page() as well;
 if this applies, then we can probably use multifd_recv_state for
 that one)

> +    }
> +
> +    pages.iov[pages.num].iov_base = address;
> +    pages.iov[pages.num].iov_len = TARGET_PAGE_SIZE;
> +    pages.num++;
> +
> +    if (pages.num < (pages.size - 1)) {
> +        return UINT16_MAX;

Nit: shall we define something for readability?  Like:

#define  MULTIFD_FD_INVALID  UINT16_MAX

> +    }
>  
>      qemu_sem_wait(&multifd_send_state->sem);
>      qemu_mutex_lock(&multifd_send_state->mutex);
> @@ -530,7 +559,12 @@ static int multifd_send_page(uint8_t *address)
>      }
>      qemu_mutex_unlock(&multifd_send_state->mutex);
>      qemu_mutex_lock(&p->mutex);
> -    p->address = address;
> +    p->pages.num = pages.num;
> +    for (j = 0; j < pages.size; j++) {
> +        p->pages.iov[j].iov_base = pages.iov[j].iov_base;
> +        p->pages.iov[j].iov_len = pages.iov[j].iov_len;
> +    }
> +    pages.num = 0;
>      qemu_mutex_unlock(&p->mutex);
>      qemu_sem_post(&p->sem);
>  
> -- 
> 2.9.4
> 

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 12/17] migration: Send the fd number which we are going to use for this page
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 12/17] migration: Send the fd number which we are going to use for this page Juan Quintela
@ 2017-07-20  9:58   ` Dr. David Alan Gilbert
  2017-08-09 16:48   ` Paolo Bonzini
  1 sibling, 0 replies; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-20  9:58 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> We are still sending the page through the main channel; that will
> change later in the series.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  migration/ram.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 90e1bcb..ac0742f 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -568,7 +568,7 @@ static int multifd_send_page(uint8_t *address)
>      qemu_mutex_unlock(&p->mutex);
>      qemu_sem_post(&p->sem);
>  
> -    return 0;
> +    return i;

is 'i' anything sane so far - I think this is currently a bogus value
so perhaps it's best to keep it as 0 till later?
Also, add a comment to multifd_send_page to say it'll return the fd
to use.
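
Something along these lines, perhaps (wording only a suggestion):

    /*
     * multifd_send_page: queue a page for sending over a multifd channel
     *
     * Returns the number of the channel (fd) that will carry this page,
     * so the destination knows which channel to read it from.
     */
    static int multifd_send_page(uint8_t *address)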


>  }
>  
>  struct MultiFDRecvParams {
> @@ -1143,6 +1143,7 @@ static int ram_multifd_page(RAMState *rs, PageSearchStatus *pss,
>                              bool last_stage)
>  {
>      int pages;
> +    uint16_t fd_num;
>      uint8_t *p;
>      RAMBlock *block = pss->block;
>      ram_addr_t offset = pss->page << TARGET_PAGE_BITS;
> @@ -1154,8 +1155,10 @@ static int ram_multifd_page(RAMState *rs, PageSearchStatus *pss,
>          ram_counters.transferred +=
>              save_page_header(rs, rs->f, block,
>                               offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
> +        fd_num = multifd_send_page(p);
> +        qemu_put_be16(rs->f, fd_num);

Check for errors from multifd_send_page?
(I'd have preferred to have made this one field in a new fixed size word
which will get used for other things as well, but I'm OK with it this
way).

I think you've also got a 2^8 fd limit in some of the other configs
as opposed to the 2^16 here.

> +        ram_counters.transferred += 2; /* size of fd_num */
>          qemu_put_buffer(rs->f, p, TARGET_PAGE_SIZE);
> -        multifd_send_page(p);
>          ram_counters.transferred += TARGET_PAGE_SIZE;
>          pages = 1;
>          ram_counters.normal++;
> @@ -2905,6 +2908,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>      while (!postcopy_running && !ret && !(flags & RAM_SAVE_FLAG_EOS)) {
>          ram_addr_t addr, total_ram_bytes;
>          void *host = NULL;
> +        uint16_t fd_num;
>          uint8_t ch;
>  
>          addr = qemu_get_be64(f);
> @@ -3015,6 +3019,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>              break;
>  
>          case RAM_SAVE_FLAG_MULTIFD_PAGE:
> +            fd_num = qemu_get_be16(f);
> +            if (fd_num != 0) {
> +                /* this is yet an unused variable, changed later */
> +                fd_num = fd_num;
> +            }
>              qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
>              break;

Dave

> -- 
> 2.9.4
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time
  2017-07-20  9:49   ` Peter Xu
@ 2017-07-20 10:09     ` Peter Xu
  2017-08-08 16:06     ` Juan Quintela
  1 sibling, 0 replies; 93+ messages in thread
From: Peter Xu @ 2017-07-20 10:09 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, dgilbert, lvivier, berrange

On Thu, Jul 20, 2017 at 05:49:47PM +0800, Peter Xu wrote:
> On Mon, Jul 17, 2017 at 03:42:32PM +0200, Juan Quintela wrote:
> 
> [...]
> 
> >  static int multifd_send_page(uint8_t *address)
> >  {
> > -    int i;
> > +    int i, j;
> >      MultiFDSendParams *p = NULL; /* make happy gcc */
> > +    static multifd_pages_t pages;
> > +    static bool once;
> > +
> > +    if (!once) {
> > +        multifd_init_group(&pages);
> > +        once = true;
> 
> Would it be good to put the "pages" into multifd_send_state? One reason is
> to keep the globals together; another benefit is that we can remove the
> "once" here: we can then init the "pages" when we init the
> multifd_send_state struct (but maybe with a better name?...).
> 
> (there are similar static variables in multifd_recv_page() as well, if
>  this one applies, then we can possibly use multifd_recv_state for
>  that one)
> 
> > +    }
> > +
> > +    pages.iov[pages.num].iov_base = address;
> > +    pages.iov[pages.num].iov_len = TARGET_PAGE_SIZE;
> > +    pages.num++;
> > +
> > +    if (pages.num < (pages.size - 1)) {
> > +        return UINT16_MAX;
> 
> Nit: shall we define something for readability?  Like:
> 
> #define  MULTIFD_FD_INVALID  UINT16_MAX

Sorry I misunderstood. INVALID may not suit here. Maybe
MULTIFD_FD_CONTINUE?

(afaiu we send this before we send the real fd_num for the chunk)
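
For example (the name is only a suggestion):

    #define MULTIFD_FD_CONTINUE  UINT16_MAX

    if (pages.num < (pages.size - 1)) {
        return MULTIFD_FD_CONTINUE;
    }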

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 13/17] migration: Create thread infrastructure for multifd recv side
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 13/17] migration: Create thread infrastructure for multifd recv side Juan Quintela
@ 2017-07-20 10:22   ` Peter Xu
  2017-08-08 11:41     ` Juan Quintela
  2017-07-20 10:29   ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 93+ messages in thread
From: Peter Xu @ 2017-07-20 10:22 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, dgilbert, lvivier, berrange

On Mon, Jul 17, 2017 at 03:42:34PM +0200, Juan Quintela wrote:

[...]

>  struct MultiFDRecvParams {
> +    /* not changed */
>      uint8_t id;
>      QemuThread thread;
>      QIOChannel *c;
> +    QemuSemaphore ready;
>      QemuSemaphore sem;
>      QemuMutex mutex;
> +    /* proteced by param mutex */
>      bool quit;
> +    multifd_pages_t pages;
> +    bool done;

(Again, I am thinking whether we can get rid of this "done" field, just
 like the comment I left on the sending part, but I'll wait to see how
 that discussion goes in case I missed anything, so I will skip it here
 for now...)

>  };
>  typedef struct MultiFDRecvParams MultiFDRecvParams;
>  
> @@ -629,12 +637,20 @@ static void *multifd_recv_thread(void *opaque)
>  {
>      MultiFDRecvParams *p = opaque;
>  
> +    qemu_sem_post(&p->ready);
>      while (true) {
>          qemu_mutex_lock(&p->mutex);
>          if (p->quit) {
>              qemu_mutex_unlock(&p->mutex);
>              break;
>          }
> +        if (p->pages.num) {
> +            p->pages.num = 0;
> +            p->done = true;
> +            qemu_mutex_unlock(&p->mutex);
> +            qemu_sem_post(&p->ready);
> +            continue;
> +        }
>          qemu_mutex_unlock(&p->mutex);
>          qemu_sem_wait(&p->sem);
>      }
> @@ -679,8 +695,11 @@ gboolean multifd_new_channel(QIOChannel *ioc)
>      }
>      qemu_mutex_init(&p->mutex);
>      qemu_sem_init(&p->sem, 0);
> +    qemu_sem_init(&p->ready, 0);
>      p->quit = false;
>      p->id = id;
> +    p->done = false;
> +    multifd_init_group(&p->pages);
>      p->c = ioc;
>      atomic_set(&multifd_recv_state->params[id], p);
>      qemu_thread_create(&p->thread, "multifd_recv", multifd_recv_thread, p,
> @@ -709,6 +728,42 @@ int multifd_load_setup(void)
>      return 0;
>  }
>  
> +static void multifd_recv_page(uint8_t *address, uint16_t fd_num)
> +{
> +    int thread_count;
> +    MultiFDRecvParams *p;
> +    static multifd_pages_t pages;
> +    static bool once;
> +
> +    if (!once) {
> +        multifd_init_group(&pages);
> +        once = true;
> +    }
> +
> +    pages.iov[pages.num].iov_base = address;
> +    pages.iov[pages.num].iov_len = TARGET_PAGE_SIZE;
> +    pages.num++;
> +
> +    if (fd_num == UINT16_MAX) {

(so this check is a slight mystery as well if we don't define
 something... O:-)

> +        return;
> +    }
> +
> +    thread_count = migrate_multifd_threads();
> +    assert(fd_num < thread_count);
> +    p = multifd_recv_state->params[fd_num];
> +
> +    qemu_sem_wait(&p->ready);

Shall we check for p->pages.num == 0 before the wait? What if the
corresponding thread has already finished its old work and is ready?

> +
> +    qemu_mutex_lock(&p->mutex);
> +    p->done = false;
> +    iov_copy(p->pages.iov, pages.num, pages.iov, pages.num, 0,
> +             iov_size(pages.iov, pages.num));

Question: any reason why we don't use the same for loop as in the
multifd-send code, and just copy the IOVs in that loop? (offset is
always zero, and we are copying the whole thing after all)
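
i.e. something like the send side already does (illustrative):

    int i;

    for (i = 0; i < pages.num; i++) {
        p->pages.iov[i].iov_base = pages.iov[i].iov_base;
        p->pages.iov[i].iov_len = pages.iov[i].iov_len;
    }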

> +    p->pages.num = pages.num;
> +    pages.num = 0;
> +    qemu_mutex_unlock(&p->mutex);
> +    qemu_sem_post(&p->sem);
> +}
> +
>  /**
>   * save_page_header: write page header to wire
>   *
> @@ -1155,7 +1210,7 @@ static int ram_multifd_page(RAMState *rs, PageSearchStatus *pss,
>          ram_counters.transferred +=
>              save_page_header(rs, rs->f, block,
>                               offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
> -        fd_num = multifd_send_page(p);
> +        fd_num = multifd_send_page(p, rs->migration_dirty_pages == 1);
>          qemu_put_be16(rs->f, fd_num);
>          ram_counters.transferred += 2; /* size of fd_num */
>          qemu_put_buffer(rs->f, p, TARGET_PAGE_SIZE);
> @@ -3020,10 +3075,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>  
>          case RAM_SAVE_FLAG_MULTIFD_PAGE:
>              fd_num = qemu_get_be16(f);
> -            if (fd_num != 0) {
> -                /* this is yet an unused variable, changed later */
> -                fd_num = fd_num;
> -            }
> +            multifd_recv_page(host, fd_num);
>              qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
>              break;
>  
> -- 
> 2.9.4
> 

Thanks,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 13/17] migration: Create thread infrastructure for multifd recv side
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 13/17] migration: Create thread infrastructure for multifd recv side Juan Quintela
  2017-07-20 10:22   ` Peter Xu
@ 2017-07-20 10:29   ` Dr. David Alan Gilbert
  2017-08-08 11:51     ` Juan Quintela
  1 sibling, 1 reply; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-20 10:29 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> We make the locking and the transfer of information specific, even if we
> are still receiving things through the main thread.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  migration/ram.c | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 60 insertions(+), 8 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index ac0742f..49c4880 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -49,6 +49,7 @@
>  #include "migration/colo.h"
>  #include "sysemu/sysemu.h"
>  #include "qemu/uuid.h"
> +#include "qemu/iov.h"
>  
>  /***********************************************************/
>  /* ram save/restore */
> @@ -527,7 +528,7 @@ int multifd_save_setup(void)
>      return 0;
>  }
>  
> -static int multifd_send_page(uint8_t *address)
> +static uint16_t multifd_send_page(uint8_t *address, bool last_page)
>  {
>      int i, j;
>      MultiFDSendParams *p = NULL; /* make happy gcc */
> @@ -543,8 +544,10 @@ static int multifd_send_page(uint8_t *address)
>      pages.iov[pages.num].iov_len = TARGET_PAGE_SIZE;
>      pages.num++;
>  
> -    if (pages.num < (pages.size - 1)) {
> -        return UINT16_MAX;
> +    if (!last_page) {
> +        if (pages.num < (pages.size - 1)) {
> +            return UINT16_MAX;
> +        }
>      }

This doesn't feel like it should be in a recv patch.

>      qemu_sem_wait(&multifd_send_state->sem);
> @@ -572,12 +575,17 @@ static int multifd_send_page(uint8_t *address)
>  }
>  
>  struct MultiFDRecvParams {
> +    /* not changed */
>      uint8_t id;
>      QemuThread thread;
>      QIOChannel *c;
> +    QemuSemaphore ready;
>      QemuSemaphore sem;
>      QemuMutex mutex;
> +    /* proteced by param mutex */
>      bool quit;
> +    multifd_pages_t pages;
> +    bool done;
>  };
>  typedef struct MultiFDRecvParams MultiFDRecvParams;

The params between Send and Recv keep looking very similar; I wonder
if we can share them.
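
One possible shape, purely illustrative and not something in this series:

    typedef struct {
        uint8_t id;
        QemuThread thread;
        QIOChannel *c;
        QemuSemaphore ready;    /* so far only used on the recv side */
        QemuSemaphore sem;
        QemuMutex mutex;
        bool quit;
        bool done;
        multifd_pages_t pages;
    } MultiFDParams;

used by both sides, with only the thread function differing.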

> @@ -629,12 +637,20 @@ static void *multifd_recv_thread(void *opaque)
>  {
>      MultiFDRecvParams *p = opaque;
>  
> +    qemu_sem_post(&p->ready);
>      while (true) {
>          qemu_mutex_lock(&p->mutex);
>          if (p->quit) {
>              qemu_mutex_unlock(&p->mutex);
>              break;
>          }
> +        if (p->pages.num) {
> +            p->pages.num = 0;
> +            p->done = true;
> +            qemu_mutex_unlock(&p->mutex);
> +            qemu_sem_post(&p->ready);
> +            continue;
> +        }
>          qemu_mutex_unlock(&p->mutex);
>          qemu_sem_wait(&p->sem);
>      }
> @@ -679,8 +695,11 @@ gboolean multifd_new_channel(QIOChannel *ioc)
>      }
>      qemu_mutex_init(&p->mutex);
>      qemu_sem_init(&p->sem, 0);
> +    qemu_sem_init(&p->ready, 0);
>      p->quit = false;
>      p->id = id;
> +    p->done = false;
> +    multifd_init_group(&p->pages);
>      p->c = ioc;
>      atomic_set(&multifd_recv_state->params[id], p);
>      qemu_thread_create(&p->thread, "multifd_recv", multifd_recv_thread, p,
> @@ -709,6 +728,42 @@ int multifd_load_setup(void)
>      return 0;
>  }
>  
> +static void multifd_recv_page(uint8_t *address, uint16_t fd_num)
> +{
> +    int thread_count;
> +    MultiFDRecvParams *p;
> +    static multifd_pages_t pages;
> +    static bool once;
> +
> +    if (!once) {
> +        multifd_init_group(&pages);
> +        once = true;
> +    }
> +
> +    pages.iov[pages.num].iov_base = address;
> +    pages.iov[pages.num].iov_len = TARGET_PAGE_SIZE;
> +    pages.num++;
> +
> +    if (fd_num == UINT16_MAX) {
> +        return;
> +    }
> +
> +    thread_count = migrate_multifd_threads();
> +    assert(fd_num < thread_count);
> +    p = multifd_recv_state->params[fd_num];
> +
> +    qemu_sem_wait(&p->ready);
> +
> +    qemu_mutex_lock(&p->mutex);
> +    p->done = false;
> +    iov_copy(p->pages.iov, pages.num, pages.iov, pages.num, 0,
> +             iov_size(pages.iov, pages.num));
> +    p->pages.num = pages.num;
> +    pages.num = 0;
> +    qemu_mutex_unlock(&p->mutex);
> +    qemu_sem_post(&p->sem);
> +}
> +
>  /**
>   * save_page_header: write page header to wire
>   *
> @@ -1155,7 +1210,7 @@ static int ram_multifd_page(RAMState *rs, PageSearchStatus *pss,
>          ram_counters.transferred +=
>              save_page_header(rs, rs->f, block,
>                               offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
> -        fd_num = multifd_send_page(p);
> +        fd_num = multifd_send_page(p, rs->migration_dirty_pages == 1);

I think that belongs in the previous patch and probably answers one of
my questions.

>          qemu_put_be16(rs->f, fd_num);
>          ram_counters.transferred += 2; /* size of fd_num */
>          qemu_put_buffer(rs->f, p, TARGET_PAGE_SIZE);
> @@ -3020,10 +3075,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>  
>          case RAM_SAVE_FLAG_MULTIFD_PAGE:
>              fd_num = qemu_get_be16(f);
> -            if (fd_num != 0) {
> -                /* this is yet an unused variable, changed later */
> -                fd_num = fd_num;
> -            }
> +            multifd_recv_page(host, fd_num);
>              qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
>              break;
>  
> -- 
> 2.9.4
> 

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 14/17] migration: Delay the start of reception on main channel
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 14/17] migration: Delay the start of reception on main channel Juan Quintela
@ 2017-07-20 10:56   ` Dr. David Alan Gilbert
  2017-08-08 11:29     ` Juan Quintela
  2017-07-20 11:10   ` Peter Xu
  1 sibling, 1 reply; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-20 10:56 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> When we start multifd, we will want to delay the main channel until
> the others are created.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  migration/migration.c | 23 ++++++++++++++---------
>  1 file changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index d9d5415..e122684 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -358,14 +358,11 @@ static void process_incoming_migration_co(void *opaque)
>  
>  static void migration_incoming_setup(QEMUFile *f)
>  {
> -    MigrationIncomingState *mis = migration_incoming_get_current();
> -
>      if (multifd_load_setup() != 0) {
>          /* We haven't been able to create multifd threads
>             nothing better to do */
>          exit(EXIT_FAILURE);
>      }
> -    mis->from_src_file = f;
>      qemu_file_set_blocking(f, false);
>  }
>  
> @@ -384,18 +381,26 @@ void migration_fd_process_incoming(QEMUFile *f)
>  gboolean migration_ioc_process_incoming(QIOChannel *ioc)
>  {
>      MigrationIncomingState *mis = migration_incoming_get_current();
> +    gboolean result = FALSE;

I wonder if we need some state somewhere so that we can see that the
incoming migration is partially connected - since the main incoming
coroutine hasn't started yet, we've not got much of mis setup.

Dave

>      if (!mis->from_src_file) {
>          QEMUFile *f = qemu_fopen_channel_input(ioc);
>          mis->from_src_file = f;
> -        migration_fd_process_incoming(f);
> -        if (!migrate_use_multifd()) {
> -            return FALSE;
> -        } else {
> -            return TRUE;
> +        migration_incoming_setup(f);
> +        if (migrate_use_multifd()) {
> +            result = TRUE;
>          }
> +    } else {
> +        /* we can only arrive here if multifd is on
> +           and this is a new channel */
> +        result = multifd_new_channel(ioc);
>      }
> -    return multifd_new_channel(ioc);
> +    if (result == FALSE) {
> +        /* called when !multifd and for last multifd channel */
> +        migration_incoming_process();
> +    }
> +
> +    return result;
>  }
>  
>  /*
> -- 
> 2.9.4
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 14/17] migration: Delay the start of reception on main channel
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 14/17] migration: Delay the start of reception on main channel Juan Quintela
  2017-07-20 10:56   ` Dr. David Alan Gilbert
@ 2017-07-20 11:10   ` Peter Xu
  2017-08-08 11:30     ` Juan Quintela
  1 sibling, 1 reply; 93+ messages in thread
From: Peter Xu @ 2017-07-20 11:10 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, dgilbert, lvivier, berrange

On Mon, Jul 17, 2017 at 03:42:35PM +0200, Juan Quintela wrote:
> When we start multifd, we will want to delay the main channel until
> the others are created.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  migration/migration.c | 23 ++++++++++++++---------
>  1 file changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index d9d5415..e122684 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -358,14 +358,11 @@ static void process_incoming_migration_co(void *opaque)
>  
>  static void migration_incoming_setup(QEMUFile *f)
>  {
> -    MigrationIncomingState *mis = migration_incoming_get_current();
> -
>      if (multifd_load_setup() != 0) {
>          /* We haven't been able to create multifd threads
>             nothing better to do */
>          exit(EXIT_FAILURE);
>      }
> -    mis->from_src_file = f;

Shall we keep this, and ...

>      qemu_file_set_blocking(f, false);
>  }
>  
> @@ -384,18 +381,26 @@ void migration_fd_process_incoming(QEMUFile *f)
>  gboolean migration_ioc_process_incoming(QIOChannel *ioc)
>  {
>      MigrationIncomingState *mis = migration_incoming_get_current();
> +    gboolean result = FALSE;
>  
>      if (!mis->from_src_file) {
>          QEMUFile *f = qemu_fopen_channel_input(ioc);
>          mis->from_src_file = f;

... remove this instead?  I am not sure, but it looks like RDMA is still
using migration_fd_process_incoming():

rdma_accept_incoming_migration
  migration_fd_process_incoming
    migration_incoming_setup
    migration_incoming_process
      process_incoming_migration_co <-- here we'll use from_src_file
                                        while it's not inited?

> -        migration_fd_process_incoming(f);
> -        if (!migrate_use_multifd()) {
> -            return FALSE;
> -        } else {
> -            return TRUE;
> +        migration_incoming_setup(f);
> +        if (migrate_use_multifd()) {
> +            result = TRUE;
>          }
> +    } else {
> +        /* we can only arrive here if multifd is on
> +           and this is a new channel */
> +        result = multifd_new_channel(ioc);
>      }
> -    return multifd_new_channel(ioc);
> +    if (result == FALSE) {
> +        /* called when !multifd and for last multifd channel */
> +        migration_incoming_process();
> +    }
> +
> +    return result;

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 15/17] migration: Test new fd infrastructure
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 15/17] migration: Test new fd infrastructure Juan Quintela
@ 2017-07-20 11:20   ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-20 11:20 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> We just send the address through the alternate channels and test that it
> is ok.

So this is just a debug patch?

> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  migration/ram.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 50 insertions(+)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 49c4880..b55b243 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -468,8 +468,26 @@ static void *multifd_send_thread(void *opaque)
>              break;
>          }
>          if (p->pages.num) {
> +            int i;
> +            int num;
> +
> +            num = p->pages.num;
>              p->pages.num = 0;
>              qemu_mutex_unlock(&p->mutex);
> +
> +            for (i = 0; i < num; i++) {
> +                if (qio_channel_write(p->c,
> +                                      (const char *)&p->pages.iov[i].iov_base,
> +                                      sizeof(uint8_t *), &error_abort)

Never abort on the source.

> +                    != sizeof(uint8_t *)) {
> +                    MigrationState *s = migrate_get_current();
> +
> +                    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> +                                      MIGRATION_STATUS_FAILED);
> +                    terminate_multifd_send_threads();

Is it really safe to call terminate_multifd_send_threads from one of the
threads?  It feels like, having set it to FAILED, all the cleanup should
happen back in the main thread.

> +                    return NULL;
> +                }
> +            }
>              qemu_mutex_lock(&multifd_send_state->mutex);
>              p->done = true;
>              qemu_mutex_unlock(&multifd_send_state->mutex);
> @@ -636,6 +654,7 @@ void multifd_load_cleanup(void)
>  static void *multifd_recv_thread(void *opaque)
>  {
>      MultiFDRecvParams *p = opaque;
> +    uint8_t *recv_address;
>  
>      qemu_sem_post(&p->ready);
>      while (true) {
> @@ -645,7 +664,38 @@ static void *multifd_recv_thread(void *opaque)
>              break;
>          }
>          if (p->pages.num) {
> +            int i;
> +            int num;
> +
> +            num = p->pages.num;
>              p->pages.num = 0;
> +
> +            for (i = 0; i < num; i++) {
> +                if (qio_channel_read(p->c,
> +                                     (char *)&recv_address,
> +                                     sizeof(uint8_t *), &error_abort)

and I'd prefer we didn't abort on the dest either, but a bit less
critical (until postcopy?)

> +                    != sizeof(uint8_t *)) {
> +                    MigrationState *s = migrate_get_current();
> +
> +                    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> +                                      MIGRATION_STATUS_FAILED);
> +                    terminate_multifd_recv_threads();
> +                    return NULL;
> +                }
> +                if (recv_address != p->pages.iov[i].iov_base) {
> +                    MigrationState *s = migrate_get_current();
> +
> +                    printf("We received %p what we were expecting %p (%d)\n",
> +                           recv_address,
> +                           p->pages.iov[i].iov_base, i);
> +
> +                    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> +                                      MIGRATION_STATUS_FAILED);
> +                    terminate_multifd_recv_threads();
> +                    return NULL;
> +                }
> +            }
> +
>              p->done = true;
>              qemu_mutex_unlock(&p->mutex);
>              qemu_sem_post(&p->ready);

Dave
> -- 
> 2.9.4
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 16/17] migration: Transfer pages over new channels
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 16/17] migration: Transfer pages over new channels Juan Quintela
@ 2017-07-20 11:31   ` Dr. David Alan Gilbert
  2017-08-08 11:13     ` Juan Quintela
  0 siblings, 1 reply; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-20 11:31 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> We switch from sending the page number to sending the real pages.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> 
> --
> 
> Remove the HACK bit, now we have the function that calculates the size
> of a page exported.
> ---
>  migration/migration.c | 14 ++++++++----
>  migration/ram.c       | 59 +++++++++++++++++----------------------------------
>  2 files changed, 29 insertions(+), 44 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index e122684..34a34b7 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1882,13 +1882,14 @@ static void *migration_thread(void *opaque)
>      /* Used by the bandwidth calcs, updated later */
>      int64_t initial_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>      int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> -    int64_t initial_bytes = 0;
>      /*
>       * The final stage happens when the remaining data is smaller than
>       * this threshold; it's calculated from the requested downtime and
>       * measured bandwidth
>       */
>      int64_t threshold_size = 0;
> +    int64_t qemu_file_bytes = 0;
> +    int64_t multifd_pages = 0;

It feels like these changes to the transfer count should be in a
separate patch.

>      int64_t start_time = initial_time;
>      int64_t end_time;
>      bool old_vm_running = false;
> @@ -1976,9 +1977,13 @@ static void *migration_thread(void *opaque)
>          }
>          current_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>          if (current_time >= initial_time + BUFFER_DELAY) {
> -            uint64_t transferred_bytes = qemu_ftell(s->to_dst_file) -
> -                                         initial_bytes;
>              uint64_t time_spent = current_time - initial_time;
> +            uint64_t qemu_file_bytes_now = qemu_ftell(s->to_dst_file);
> +            uint64_t multifd_pages_now = ram_counters.multifd;
> +            uint64_t transferred_bytes =
> +                (qemu_file_bytes_now - qemu_file_bytes) +
> +                (multifd_pages_now - multifd_pages) *
> +                qemu_target_page_size();

If I've followed this right, then ram_counters.multifd is in the main
thread not the individual threads, so we should be OK doing that.

>              double bandwidth = (double)transferred_bytes / time_spent;
>              threshold_size = bandwidth * s->parameters.downtime_limit;
>  
> @@ -1996,7 +2001,8 @@ static void *migration_thread(void *opaque)
>  
>              qemu_file_reset_rate_limit(s->to_dst_file);
>              initial_time = current_time;
> -            initial_bytes = qemu_ftell(s->to_dst_file);
> +            qemu_file_bytes = qemu_file_bytes_now;
> +            multifd_pages = multifd_pages_now;
>          }
>          if (qemu_file_rate_limit(s->to_dst_file)) {
>              /* usleep expects microseconds */
> diff --git a/migration/ram.c b/migration/ram.c
> index b55b243..c78b286 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -468,25 +468,21 @@ static void *multifd_send_thread(void *opaque)
>              break;
>          }
>          if (p->pages.num) {
> -            int i;
>              int num;
>  
>              num = p->pages.num;
>              p->pages.num = 0;
>              qemu_mutex_unlock(&p->mutex);
>  
> -            for (i = 0; i < num; i++) {
> -                if (qio_channel_write(p->c,
> -                                      (const char *)&p->pages.iov[i].iov_base,
> -                                      sizeof(uint8_t *), &error_abort)
> -                    != sizeof(uint8_t *)) {
> -                    MigrationState *s = migrate_get_current();
> +            if (qio_channel_writev_all(p->c, p->pages.iov,
> +                                       num, &error_abort)
> +                != num * TARGET_PAGE_SIZE) {
> +                MigrationState *s = migrate_get_current();

Same comments as the previous patch; note we should find a way to get
the error message logged.  It is not easy since we're in a thread, but
we need a way to log the errors.

>  
> -                    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> -                                      MIGRATION_STATUS_FAILED);
> -                    terminate_multifd_send_threads();
> -                    return NULL;
> -                }
> +                migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> +                                  MIGRATION_STATUS_FAILED);
> +                terminate_multifd_send_threads();
> +                return NULL;
>              }
>              qemu_mutex_lock(&multifd_send_state->mutex);
>              p->done = true;
> @@ -654,7 +650,6 @@ void multifd_load_cleanup(void)
>  static void *multifd_recv_thread(void *opaque)
>  {
>      MultiFDRecvParams *p = opaque;
> -    uint8_t *recv_address;
>  
>      qemu_sem_post(&p->ready);
>      while (true) {
> @@ -664,38 +659,21 @@ static void *multifd_recv_thread(void *opaque)
>              break;
>          }
>          if (p->pages.num) {
> -            int i;
>              int num;
>  
>              num = p->pages.num;
>              p->pages.num = 0;
>  
> -            for (i = 0; i < num; i++) {
> -                if (qio_channel_read(p->c,
> -                                     (char *)&recv_address,
> -                                     sizeof(uint8_t *), &error_abort)
> -                    != sizeof(uint8_t *)) {
> -                    MigrationState *s = migrate_get_current();
> +            if (qio_channel_readv_all(p->c, p->pages.iov,
> +                                      num, &error_abort)
> +                != num * TARGET_PAGE_SIZE) {
> +                MigrationState *s = migrate_get_current();
>  
> -                    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> -                                      MIGRATION_STATUS_FAILED);
> -                    terminate_multifd_recv_threads();
> -                    return NULL;
> -                }
> -                if (recv_address != p->pages.iov[i].iov_base) {
> -                    MigrationState *s = migrate_get_current();
> -
> -                    printf("We received %p what we were expecting %p (%d)\n",
> -                           recv_address,
> -                           p->pages.iov[i].iov_base, i);
> -
> -                    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> -                                      MIGRATION_STATUS_FAILED);
> -                    terminate_multifd_recv_threads();
> -                    return NULL;
> -                }
> +                migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> +                                  MIGRATION_STATUS_FAILED);
> +                terminate_multifd_recv_threads();
> +                return NULL;
>              }
> -
>              p->done = true;
>              qemu_mutex_unlock(&p->mutex);
>              qemu_sem_post(&p->ready);
> @@ -1262,8 +1240,10 @@ static int ram_multifd_page(RAMState *rs, PageSearchStatus *pss,
>                               offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
>          fd_num = multifd_send_page(p, rs->migration_dirty_pages == 1);
>          qemu_put_be16(rs->f, fd_num);
> +        if (fd_num != UINT16_MAX) {
> +            qemu_fflush(rs->f);
> +        }

Is that to make sure that the relatively small messages actually get
transmitted on the main fd so that the destination starts receiving
them?

I do have a worry there that, since the addresses are going down a
single fd, we are open to deadlock, with the send threads filling up
buffers and blocking while waiting for the receivers to receive.

>          ram_counters.transferred += 2; /* size of fd_num */
> -        qemu_put_buffer(rs->f, p, TARGET_PAGE_SIZE);
>          ram_counters.transferred += TARGET_PAGE_SIZE;
>          pages = 1;
>          ram_counters.normal++;
> @@ -3126,7 +3106,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>          case RAM_SAVE_FLAG_MULTIFD_PAGE:
>              fd_num = qemu_get_be16(f);
>              multifd_recv_page(host, fd_num);
> -            qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
>              break;
>  
>          case RAM_SAVE_FLAG_EOS:

Dave

> -- 
> 2.9.4
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue Juan Quintela
@ 2017-07-20 11:45   ` Dr. David Alan Gilbert
  2017-08-08 10:43     ` Juan Quintela
  2017-07-21  2:40   ` Peter Xu
  2017-07-21  6:03   ` Peter Xu
  2 siblings, 1 reply; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-20 11:45 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> Each time that we sync the bitmap, there is a possibility that we receive
> a page that is being processed by a different thread.  We fix this
> problem just by making sure that we wait for all receiving threads to
> finish their work before we proceed with the next stage.
> 
> We are low on page flags, so we use a combination that is not valid to
> emit that message:  MULTIFD_PAGE and COMPRESSED.
> 
> I tried to make a migration command for it, but it doesn't work because
> we sync the bitmap sometimes when we have already sent the beginning
> of the section, so I just added a new page flag.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  migration/ram.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 56 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index c78b286..bffe204 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -71,6 +71,12 @@
>  #define RAM_SAVE_FLAG_COMPRESS_PAGE    0x100
>  #define RAM_SAVE_FLAG_MULTIFD_PAGE     0x200
>  
> +/* We are getting low on pages flags, so we start using combinations
> +   When we need to flush a page, we sent it as
> +   RAM_SAVE_FLAG_MULTIFD_PAGE | RAM_SAVE_FLAG_COMPRESS_PAGE
> +   We don't allow that combination
> +*/
> +
>  static inline bool is_zero_range(uint8_t *p, uint64_t size)
>  {
>      return buffer_is_zero(p, size);
> @@ -193,6 +199,9 @@ struct RAMState {
>      uint64_t iterations_prev;
>      /* Iterations since start */
>      uint64_t iterations;
> +    /* Indicates if we have synced the bitmap and we need to assure that
> +       target has processeed all previous pages */
> +    bool multifd_needs_flush;
>      /* protects modification of the bitmap */
>      uint64_t migration_dirty_pages;
>      /* number of dirty bits in the bitmap */
> @@ -363,7 +372,6 @@ static void compress_threads_save_setup(void)
>  
>  /* Multiple fd's */
>  
> -
>  typedef struct {
>      int num;
>      int size;
> @@ -595,9 +603,11 @@ struct MultiFDRecvParams {
>      QIOChannel *c;
>      QemuSemaphore ready;
>      QemuSemaphore sem;
> +    QemuCond cond_sync;
>      QemuMutex mutex;
>      /* proteced by param mutex */
>      bool quit;
> +    bool sync;
>      multifd_pages_t pages;
>      bool done;
>  };
> @@ -637,6 +647,7 @@ void multifd_load_cleanup(void)
>          qemu_thread_join(&p->thread);
>          qemu_mutex_destroy(&p->mutex);
>          qemu_sem_destroy(&p->sem);
> +        qemu_cond_destroy(&p->cond_sync);
>          socket_recv_channel_destroy(p->c);
>          g_free(p);
>          multifd_recv_state->params[i] = NULL;
> @@ -675,6 +686,10 @@ static void *multifd_recv_thread(void *opaque)
>                  return NULL;
>              }
>              p->done = true;
> +            if (p->sync) {
> +                qemu_cond_signal(&p->cond_sync);
> +                p->sync = false;
> +            }
>              qemu_mutex_unlock(&p->mutex);
>              qemu_sem_post(&p->ready);
>              continue;
> @@ -724,9 +739,11 @@ gboolean multifd_new_channel(QIOChannel *ioc)
>      qemu_mutex_init(&p->mutex);
>      qemu_sem_init(&p->sem, 0);
>      qemu_sem_init(&p->ready, 0);
> +    qemu_cond_init(&p->cond_sync);
>      p->quit = false;
>      p->id = id;
>      p->done = false;
> +    p->sync = false;
>      multifd_init_group(&p->pages);
>      p->c = ioc;
>      atomic_set(&multifd_recv_state->params[id], p);
> @@ -792,6 +809,27 @@ static void multifd_recv_page(uint8_t *address, uint16_t fd_num)
>      qemu_sem_post(&p->sem);
>  }
>  
> +static int multifd_flush(void)
> +{
> +    int i, thread_count;
> +
> +    if (!migrate_use_multifd()) {
> +        return 0;
> +    }
> +    thread_count = migrate_multifd_threads();
> +    for (i = 0; i < thread_count; i++) {
> +        MultiFDRecvParams *p = multifd_recv_state->params[i];
> +
> +        qemu_mutex_lock(&p->mutex);
> +        while (!p->done) {
> +            p->sync = true;
> +            qemu_cond_wait(&p->cond_sync, &p->mutex);
> +        }

I don't think I understand how that works in the case where the
recv_thread has already set 'done' by the point you set sync=true; how does
it get back to the check and do the signal?

> +        qemu_mutex_unlock(&p->mutex);
> +    }
> +    return 0;
> +}
> +
>  /**
>   * save_page_header: write page header to wire
>   *
> @@ -809,6 +847,12 @@ static size_t save_page_header(RAMState *rs, QEMUFile *f,  RAMBlock *block,
>  {
>      size_t size, len;
>  
> +    if (rs->multifd_needs_flush &&
> +        (offset & RAM_SAVE_FLAG_MULTIFD_PAGE)) {
> +        offset |= RAM_SAVE_FLAG_ZERO;

In the comment near the top you say RAM_SAVE_FLAG_COMPRESS_PAGE;  it's
probably best to add an alias at the top to make it clear, e.g.
  #define RAM_SAVE_FLAG_MULTIFD_SYNC RAM_SAVE_FLAG_ZERO

  or maybe (RAM_SAVE_FLAG_MULTIFD_PAGE | RAM_SAVE_FLAG_ZERO)
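
With the second form the two sites would read roughly (illustrative):

    #define RAM_SAVE_FLAG_MULTIFD_SYNC \
        (RAM_SAVE_FLAG_MULTIFD_PAGE | RAM_SAVE_FLAG_ZERO)

    /* sender, in save_page_header() */
    offset |= RAM_SAVE_FLAG_MULTIFD_SYNC;

    /* receiver, in ram_load() */
    if ((flags & RAM_SAVE_FLAG_MULTIFD_SYNC) == RAM_SAVE_FLAG_MULTIFD_SYNC) {
        multifd_flush();
        flags &= ~RAM_SAVE_FLAG_ZERO;
    }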

> +        rs->multifd_needs_flush = false;
> +    }
> +
>      if (block == rs->last_sent_block) {
>          offset |= RAM_SAVE_FLAG_CONTINUE;
>      }
> @@ -2496,6 +2540,9 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>  
>      if (!migration_in_postcopy()) {
>          migration_bitmap_sync(rs);
> +        if (migrate_use_multifd()) {
> +            rs->multifd_needs_flush = true;
> +        }
>      }
>  
>      ram_control_before_iterate(f, RAM_CONTROL_FINISH);
> @@ -2538,6 +2585,9 @@ static void ram_save_pending(QEMUFile *f, void *opaque, uint64_t max_size,
>          qemu_mutex_lock_iothread();
>          rcu_read_lock();
>          migration_bitmap_sync(rs);
> +        if (migrate_use_multifd()) {
> +            rs->multifd_needs_flush = true;
> +        }
>          rcu_read_unlock();
>          qemu_mutex_unlock_iothread();
>          remaining_size = rs->migration_dirty_pages * TARGET_PAGE_SIZE;
> @@ -3012,6 +3062,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>              break;
>          }
>  
> +        if ((flags & (RAM_SAVE_FLAG_MULTIFD_PAGE | RAM_SAVE_FLAG_ZERO))
> +                  == (RAM_SAVE_FLAG_MULTIFD_PAGE | RAM_SAVE_FLAG_ZERO)) {
> +            multifd_flush();
> +            flags = flags & ~RAM_SAVE_FLAG_ZERO;
> +        }
>          if (flags & (RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE |
>                       RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE |
>                       RAM_SAVE_FLAG_MULTIFD_PAGE)) {

Dave

> -- 
> 2.9.4
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page
  2017-07-20  8:10     ` Peter Xu
@ 2017-07-20 11:48       ` Dr. David Alan Gilbert
  2017-08-08 15:58         ` Juan Quintela
  2017-08-08 16:04       ` Juan Quintela
  1 sibling, 1 reply; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-20 11:48 UTC (permalink / raw)
  To: Peter Xu; +Cc: Juan Quintela, qemu-devel, lvivier, berrange

* Peter Xu (peterx@redhat.com) wrote:
> On Wed, Jul 19, 2017 at 08:02:39PM +0100, Dr. David Alan Gilbert wrote:
> > * Juan Quintela (quintela@redhat.com) wrote:
> > > The function still doesn't use multifd, but we have simplified
> > > ram_save_page; the xbzrle and RDMA stuff is gone.  We have added a new
> > > counter and a new flag for this type of page.
> > > 
> > > Signed-off-by: Juan Quintela <quintela@redhat.com>
> > > ---
> > >  hmp.c                 |  2 ++
> > >  migration/migration.c |  1 +
> > >  migration/ram.c       | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++-
> > >  qapi-schema.json      |  5 ++-
> > >  4 files changed, 96 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/hmp.c b/hmp.c
> > > index b01605a..eeb308b 100644
> > > --- a/hmp.c
> > > +++ b/hmp.c
> > > @@ -234,6 +234,8 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
> > >              monitor_printf(mon, "postcopy request count: %" PRIu64 "\n",
> > >                             info->ram->postcopy_requests);
> > >          }
> > > +        monitor_printf(mon, "multifd: %" PRIu64 " pages\n",
> > > +                       info->ram->multifd);
> > >      }
> > >  
> > >      if (info->has_disk) {
> > > diff --git a/migration/migration.c b/migration/migration.c
> > > index e1c79d5..d9d5415 100644
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -528,6 +528,7 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s)
> > >      info->ram->dirty_sync_count = ram_counters.dirty_sync_count;
> > >      info->ram->postcopy_requests = ram_counters.postcopy_requests;
> > >      info->ram->page_size = qemu_target_page_size();
> > > +    info->ram->multifd = ram_counters.multifd;
> > >  
> > >      if (migrate_use_xbzrle()) {
> > >          info->has_xbzrle_cache = true;
> > > diff --git a/migration/ram.c b/migration/ram.c
> > > index b80f511..2bf3fa7 100644
> > > --- a/migration/ram.c
> > > +++ b/migration/ram.c
> > > @@ -68,6 +68,7 @@
> > >  #define RAM_SAVE_FLAG_XBZRLE   0x40
> > >  /* 0x80 is reserved in migration.h start with 0x100 next */
> > >  #define RAM_SAVE_FLAG_COMPRESS_PAGE    0x100
> > > +#define RAM_SAVE_FLAG_MULTIFD_PAGE     0x200
> > >  
> > >  static inline bool is_zero_range(uint8_t *p, uint64_t size)
> > >  {
> > > @@ -362,12 +363,17 @@ static void compress_threads_save_setup(void)
> > >  /* Multiple fd's */
> > >  
> > >  struct MultiFDSendParams {
> > > +    /* not changed */
> > >      uint8_t id;
> > >      QemuThread thread;
> > >      QIOChannel *c;
> > >      QemuSemaphore sem;
> > >      QemuMutex mutex;
> > > +    /* protected by param mutex */
> > >      bool quit;
> > 
> > Should probably comment to say what address space 'address' is in - this
> > is really a QEMU pointer - and that's why we can treat 0 as special?
> 
> I believe this comment is for "address" below.
> 
> Yes, it would be nice to comment it. IIUC it belongs to virtual
> address space of QEMU, so it should be okay to use zero as a "special"
> value.
> 
> > 
> > > +    uint8_t *address;
> > > +    /* protected by multifd mutex */
> > > +    bool done;
> > 
> > done needs a comment to explain what it is because
> > it sounds similar to quit;  I think 'done' is saying that
> > the thread is idle having done what was asked?
> 
> Since we know that a valid address won't be zero, I am not sure whether we
> can just avoid introducing the "done" field (I am also not sure whether we
> even need the lock when modifying "address"; I think not? Please
> see below). From what I see, it works like a state machine, and
> "address" can be the state:
> 
>             +--------  send thread ---------+
>             |                               |
>            \|/                              |
>         address==0 (IDLE)               address!=0 (ACTIVE)
>             |                              /|\
>             |                               |
>             +--------  main thread ---------+
> 
> Then...
> 
> > 
> > >  };
> > >  typedef struct MultiFDSendParams MultiFDSendParams;
> > >  
> > > @@ -375,6 +381,8 @@ struct {
> > >      MultiFDSendParams *params;
> > >      /* number of created threads */
> > >      int count;
> > > +    QemuMutex mutex;
> > > +    QemuSemaphore sem;
> > >  } *multifd_send_state;
> > >  
> > >  static void terminate_multifd_send_threads(void)
> > > @@ -443,6 +451,7 @@ static void *multifd_send_thread(void *opaque)
> > >      } else {
> > >          qio_channel_write(p->c, string, MULTIFD_UUID_MSG, &error_abort);
> > >      }
> > > +    qemu_sem_post(&multifd_send_state->sem);
> > >  
> > >      while (!exit) {
> > >          qemu_mutex_lock(&p->mutex);
> > > @@ -450,6 +459,15 @@ static void *multifd_send_thread(void *opaque)
> > >              qemu_mutex_unlock(&p->mutex);
> > >              break;
> > >          }
> > > +        if (p->address) {
> > > +            p->address = 0;
> > > +            qemu_mutex_unlock(&p->mutex);
> > > +            qemu_mutex_lock(&multifd_send_state->mutex);
> > > +            p->done = true;
> > > +            qemu_mutex_unlock(&multifd_send_state->mutex);
> > > +            qemu_sem_post(&multifd_send_state->sem);
> > > +            continue;
> 
> Here, instead of setting address=0 at the entry, can we do this (no
> "done" this time)?
> 
>                  // send the page before clearing p->address
>                  send_page(p->address);
>                  // clear p->address to switch to "IDLE" state
>                  atomic_set(&p->address, 0);
>                  // tell main thread, in case it's waiting
>                  qemu_sem_post(&multifd_send_state->sem);
> 
> And on the main thread...
> 
> > > +        }
> > >          qemu_mutex_unlock(&p->mutex);
> > >          qemu_sem_wait(&p->sem);
> > >      }
> > > @@ -469,6 +487,8 @@ int multifd_save_setup(void)
> > >      multifd_send_state = g_malloc0(sizeof(*multifd_send_state));
> > >      multifd_send_state->params = g_new0(MultiFDSendParams, thread_count);
> > >      multifd_send_state->count = 0;
> > > +    qemu_mutex_init(&multifd_send_state->mutex);
> > > +    qemu_sem_init(&multifd_send_state->sem, 0);
> > >      for (i = 0; i < thread_count; i++) {
> > >          char thread_name[16];
> > >          MultiFDSendParams *p = &multifd_send_state->params[i];
> > > @@ -477,6 +497,8 @@ int multifd_save_setup(void)
> > >          qemu_sem_init(&p->sem, 0);
> > >          p->quit = false;
> > >          p->id = i;
> > > +        p->done = true;
> > > +        p->address = 0;
> > >          p->c = socket_send_channel_create();
> > >          if (!p->c) {
> > >              error_report("Error creating a send channel");
> > > @@ -491,6 +513,30 @@ int multifd_save_setup(void)
> > >      return 0;
> > >  }
> > >  
> > > +static int multifd_send_page(uint8_t *address)
> > > +{
> > > +    int i;
> > > +    MultiFDSendParams *p = NULL; /* make happy gcc */
> > > +
> 
> 
> > > +    qemu_sem_wait(&multifd_send_state->sem);
> > > +    qemu_mutex_lock(&multifd_send_state->mutex);
> > > +    for (i = 0; i < multifd_send_state->count; i++) {
> > > +        p = &multifd_send_state->params[i];
> > > +
> > > +        if (p->done) {
> > > +            p->done = false;
> > > +            break;
> > > +        }
> > > +    }
> > > +    qemu_mutex_unlock(&multifd_send_state->mutex);
> > > +    qemu_mutex_lock(&p->mutex);
> > > +    p->address = address;
> > > +    qemu_mutex_unlock(&p->mutex);
> > > +    qemu_sem_post(&p->sem);
> 
> ... here can we just do this?
> 
> retry:
>     // don't take any lock, only read each p->address
>     for (i = 0; i < multifd_send_state->count; i++) {
>         p = &multifd_send_state->params[i];
>         if (!p->address) {
>             // we found one IDLE send thread
>             break;
>         }
>     }
>     if (!p) {
>         qemu_sem_wait(&multifd_send_state->sem);
>         goto retry;
>     }
>     // we switch its state, IDLE -> ACTIVE
>     atomic_set(&p->address, address);
>     // tell the thread to start work
>     qemu_sem_post(&p->sem);
> 
> The above doesn't really use any lock at all (neither the per-thread lock
> nor the global lock). Would it work?

I think what's there can certainly be simplified;  but also note
that the later patch gets rid of 'address' and turns it into a count.
My suggestion was to keep the 'done' and stop using 'address' as something
special, i.e. never write address in the thread; but I think yours might
work as well.

Dave

> Thanks,
> 
> -- 
> Peter Xu
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue Juan Quintela
  2017-07-20 11:45   ` Dr. David Alan Gilbert
@ 2017-07-21  2:40   ` Peter Xu
  2017-08-08 11:40     ` Juan Quintela
  2017-07-21  6:03   ` Peter Xu
  2 siblings, 1 reply; 93+ messages in thread
From: Peter Xu @ 2017-07-21  2:40 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, dgilbert, lvivier, berrange

On Mon, Jul 17, 2017 at 03:42:38PM +0200, Juan Quintela wrote:
> Each time that we sync the bitmap, there is a possibility that we receive
> a page that is being processed by a different thread.  We fix this
> problem just by making sure that we wait for all receiving threads to
> finish their work before we proceed with the next stage.
> 
> We are low on page flags, so we use a combination that is not valid to
> emit that message:  MULTIFD_PAGE and COMPRESSED.
> 
> I tried to make a migration command for it, but it doesn't work because
> we sync the bitmap sometimes when we have already sent the beginning
> of the section, so I just added a new page flag.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  migration/ram.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 56 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index c78b286..bffe204 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -71,6 +71,12 @@
>  #define RAM_SAVE_FLAG_COMPRESS_PAGE    0x100
>  #define RAM_SAVE_FLAG_MULTIFD_PAGE     0x200
>  
> +/* We are getting low on pages flags, so we start using combinations
> +   When we need to flush a page, we sent it as
> +   RAM_SAVE_FLAG_MULTIFD_PAGE | RAM_SAVE_FLAG_COMPRESS_PAGE
> +   We don't allow that combination
> +*/
> +
>  static inline bool is_zero_range(uint8_t *p, uint64_t size)
>  {
>      return buffer_is_zero(p, size);
> @@ -193,6 +199,9 @@ struct RAMState {
>      uint64_t iterations_prev;
>      /* Iterations since start */
>      uint64_t iterations;
> +    /* Indicates if we have synced the bitmap and we need to assure that
> +       target has processeed all previous pages */
> +    bool multifd_needs_flush;
>      /* protects modification of the bitmap */
>      uint64_t migration_dirty_pages;
>      /* number of dirty bits in the bitmap */
> @@ -363,7 +372,6 @@ static void compress_threads_save_setup(void)
>  
>  /* Multiple fd's */
>  
> -
>  typedef struct {
>      int num;
>      int size;
> @@ -595,9 +603,11 @@ struct MultiFDRecvParams {
>      QIOChannel *c;
>      QemuSemaphore ready;
>      QemuSemaphore sem;
> +    QemuCond cond_sync;
>      QemuMutex mutex;
>      /* proteced by param mutex */
>      bool quit;
> +    bool sync;
>      multifd_pages_t pages;
>      bool done;
>  };
> @@ -637,6 +647,7 @@ void multifd_load_cleanup(void)
>          qemu_thread_join(&p->thread);
>          qemu_mutex_destroy(&p->mutex);
>          qemu_sem_destroy(&p->sem);
> +        qemu_cond_destroy(&p->cond_sync);
>          socket_recv_channel_destroy(p->c);
>          g_free(p);
>          multifd_recv_state->params[i] = NULL;
> @@ -675,6 +686,10 @@ static void *multifd_recv_thread(void *opaque)
>                  return NULL;
>              }
>              p->done = true;
> +            if (p->sync) {
> +                qemu_cond_signal(&p->cond_sync);
> +                p->sync = false;
> +            }

Could we use the same p->ready for this purpose? They look similar:
all we want to do is to let the main thread know "worker thread has
finished receiving the last piece and is idle again", right?
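
(Very roughly, and only as an idea, multifd_flush() could then do:

    for (i = 0; i < thread_count; i++) {
        MultiFDRecvParams *p = multifd_recv_state->params[i];

        qemu_sem_wait(&p->ready);   /* returns once the thread went idle */
        qemu_sem_post(&p->ready);   /* keep it available for the next page */
    }

dropping the cond_sync/sync pair, assuming "ready" is only ever posted
when the thread has gone idle.)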

>              qemu_mutex_unlock(&p->mutex);
>              qemu_sem_post(&p->ready);
>              continue;
> @@ -724,9 +739,11 @@ gboolean multifd_new_channel(QIOChannel *ioc)
>      qemu_mutex_init(&p->mutex);
>      qemu_sem_init(&p->sem, 0);
>      qemu_sem_init(&p->ready, 0);
> +    qemu_cond_init(&p->cond_sync);
>      p->quit = false;
>      p->id = id;
>      p->done = false;
> +    p->sync = false;
>      multifd_init_group(&p->pages);
>      p->c = ioc;
>      atomic_set(&multifd_recv_state->params[id], p);
> @@ -792,6 +809,27 @@ static void multifd_recv_page(uint8_t *address, uint16_t fd_num)
>      qemu_sem_post(&p->sem);
>  }
>  
> +static int multifd_flush(void)
> +{
> +    int i, thread_count;
> +
> +    if (!migrate_use_multifd()) {
> +        return 0;
> +    }
> +    thread_count = migrate_multifd_threads();
> +    for (i = 0; i < thread_count; i++) {
> +        MultiFDRecvParams *p = multifd_recv_state->params[i];
> +
> +        qemu_mutex_lock(&p->mutex);
> +        while (!p->done) {
> +            p->sync = true;
> +            qemu_cond_wait(&p->cond_sync, &p->mutex);

(similar comment like above)

> +        }
> +        qemu_mutex_unlock(&p->mutex);
> +    }
> +    return 0;
> +}
> +
>  /**
>   * save_page_header: write page header to wire
>   *
> @@ -809,6 +847,12 @@ static size_t save_page_header(RAMState *rs, QEMUFile *f,  RAMBlock *block,
>  {
>      size_t size, len;
>  
> +    if (rs->multifd_needs_flush &&
> +        (offset & RAM_SAVE_FLAG_MULTIFD_PAGE)) {

If multifd_needs_flush is only for multifd, then we may skip this
check, but it looks more like an assertion:

    if (rs->multifd_needs_flush) {
        assert(offset & RAM_SAVE_FLAG_MULTIFD_PAGE);
        offset |= RAM_SAVE_FLAG_ZERO;
    }

(Dave mentioned the mismatch between the flag named in the commit message
 and the flag used here: ZERO is used, but COMPRESS is mentioned)

> +        offset |= RAM_SAVE_FLAG_ZERO;
> +        rs->multifd_needs_flush = false;
> +    }
> +
>      if (block == rs->last_sent_block) {
>          offset |= RAM_SAVE_FLAG_CONTINUE;
>      }
> @@ -2496,6 +2540,9 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>  
>      if (!migration_in_postcopy()) {
>          migration_bitmap_sync(rs);
> +        if (migrate_use_multifd()) {
> +            rs->multifd_needs_flush = true;
> +        }

Would it be better to move this block into the entry of
migration_bitmap_sync(), instead of setting it at each caller of
migration_bitmap_sync()?
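
Roughly, I am thinking of something like this (just a sketch; the real
placement may need more care):

    static void migration_bitmap_sync(RAMState *rs)
    {
        /* Hypothetical placement: make the destination drain its
           receive threads before it applies pages from the new
           sync round */
        if (migrate_use_multifd()) {
            rs->multifd_needs_flush = true;
        }

        /* ... existing bitmap sync code ... */
    }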

>      }
>  
>      ram_control_before_iterate(f, RAM_CONTROL_FINISH);
> @@ -2538,6 +2585,9 @@ static void ram_save_pending(QEMUFile *f, void *opaque, uint64_t max_size,
>          qemu_mutex_lock_iothread();
>          rcu_read_lock();
>          migration_bitmap_sync(rs);
> +        if (migrate_use_multifd()) {
> +            rs->multifd_needs_flush = true;
> +        }
>          rcu_read_unlock();
>          qemu_mutex_unlock_iothread();
>          remaining_size = rs->migration_dirty_pages * TARGET_PAGE_SIZE;
> @@ -3012,6 +3062,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>              break;
>          }
>  
> +        if ((flags & (RAM_SAVE_FLAG_MULTIFD_PAGE | RAM_SAVE_FLAG_ZERO))
> +                  == (RAM_SAVE_FLAG_MULTIFD_PAGE | RAM_SAVE_FLAG_ZERO)) {
> +            multifd_flush();
> +            flags = flags & ~RAM_SAVE_FLAG_ZERO;
> +        }
>          if (flags & (RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE |
>                       RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE |
>                       RAM_SAVE_FLAG_MULTIFD_PAGE)) {
> -- 
> 2.9.4
> 

Thanks,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue Juan Quintela
  2017-07-20 11:45   ` Dr. David Alan Gilbert
  2017-07-21  2:40   ` Peter Xu
@ 2017-07-21  6:03   ` Peter Xu
  2017-07-21 10:53     ` Juan Quintela
  2 siblings, 1 reply; 93+ messages in thread
From: Peter Xu @ 2017-07-21  6:03 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, dgilbert, lvivier, berrange

On Mon, Jul 17, 2017 at 03:42:38PM +0200, Juan Quintela wrote:
> Each time that we sync the bitmap, there is a possibility that we receive
> a page that is being processed by a different thread.  We fix this
> problem by just making sure that we wait for all receiving threads to
> finish their work before we proceed with the next stage.
> 
> We are low on page flags, so we use a combination that is not valid to
> emit that message:  MULTIFD_PAGE and COMPRESSED.

Btw, would it be possible that we introduce a new QEMU_VM_COMMAND for
this flush operation?  Like: MIG_CMD_MULTIFD_FLUSH?

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue
  2017-07-21  6:03   ` Peter Xu
@ 2017-07-21 10:53     ` Juan Quintela
  0 siblings, 0 replies; 93+ messages in thread
From: Juan Quintela @ 2017-07-21 10:53 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, dgilbert, lvivier, berrange

Peter Xu <peterx@redhat.com> wrote:
> On Mon, Jul 17, 2017 at 03:42:38PM +0200, Juan Quintela wrote:
>> Each time that we sync the bitmap, there is a possibility that we receive
>> a page that is being processed by a different thread.  We fix this
>> problem by just making sure that we wait for all receiving threads to
>> finish their work before we proceed with the next stage.
>> 
>> We are low on page flags, so we use a combination that is not valid to
>> emit that message:  MULTIFD_PAGE and COMPRESSED.
>
> Btw, would it be possible that we introduce a new QEMU_VM_COMMAND for
> this flush operation?  Like: MIG_CMD_MULTIFD_FLUSH?

From the commit message:

> I tried to make a migration command for it, but it doesn't work because
> we sometimes sync the bitmap when we have already sent the beginning
> of the section, so I just added a new page flag.

Yes, I find that a much better design, but without further surgery it
is not trivial.  There are two places where we can sync the bitmap, and
in one of them we have already sent the beginning of the section, so it
is too late to send a command.

Later, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 08/17] migration: Split migration_fd_process_incomming
  2017-07-19 17:08   ` Dr. David Alan Gilbert
@ 2017-07-21 12:39     ` Eric Blake
  0 siblings, 0 replies; 93+ messages in thread
From: Eric Blake @ 2017-07-21 12:39 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, Juan Quintela; +Cc: lvivier, qemu-devel, peterx


On 07/19/2017 12:08 PM, Dr. David Alan Gilbert wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> We need that on posterior patches.
> 
> following/subsequent/later is probably a better word.

I'd go with 'later'.

Also, s/incomming/incoming/ in the subject

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 01/17] migrate: Add gboolean return type to migrate_channel_process_incoming
  2017-07-20  8:47       ` Daniel P. Berrange
@ 2017-07-24 10:18         ` Juan Quintela
  0 siblings, 0 replies; 93+ messages in thread
From: Juan Quintela @ 2017-07-24 10:18 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: Peter Xu, Dr. David Alan Gilbert, qemu-devel, lvivier

"Daniel P. Berrange" <berrange@redhat.com> wrote:
> On Thu, Jul 20, 2017 at 03:00:23PM +0800, Peter Xu wrote:
>> On Wed, Jul 19, 2017 at 04:01:10PM +0100, Dr. David Alan Gilbert wrote:
>> > * Juan Quintela (quintela@redhat.com) wrote:
>> > >  
>> > > -void migration_channel_process_incoming(QIOChannel *ioc);
>> > > +gboolean migration_channel_process_incoming(QIOChannel *ioc);
>> > 
>> > Can you add a comment here that says what the return value means.

Added comments for the two functions (in the .c file, though).

>> 
>> And, looks like we have G_SOURCE_CONTINUE and G_SOURCE_REMOVE:

Used these, thanks.

>> https://developer.gnome.org/glib/stable/glib-The-Main-Event-Loop.html#G-SOURCE-CONTINUE:CAPS
>> 
>> Maybe we can use them as well?
>
> Those are newer than our min required glib version, though we could
> add compat defines for them

Added the compatibility macros.
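
For reference, the compat defines boil down to something like this
(assuming glib's documented values for the two macros):

    #ifndef G_SOURCE_CONTINUE
    #define G_SOURCE_CONTINUE TRUE   /* keep the GSource installed */
    #endif
    #ifndef G_SOURCE_REMOVE
    #define G_SOURCE_REMOVE FALSE    /* remove the GSource */
    #endif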

Thanks to all of you, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 02/17] migration: Create migration_ioc_process_incoming()
  2017-07-19 13:38   ` Daniel P. Berrange
@ 2017-07-24 11:09     ` Juan Quintela
  0 siblings, 0 replies; 93+ messages in thread
From: Juan Quintela @ 2017-07-24 11:09 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: qemu-devel, dgilbert, lvivier, peterx

"Daniel P. Berrange" <berrange@redhat.com> wrote:
> On Mon, Jul 17, 2017 at 03:42:23PM +0200, Juan Quintela wrote:
>> We need to receive the ioc to be able to implement multifd.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> ---
>>  migration/channel.c   |  3 +--
>>  migration/migration.c | 16 +++++++++++++---
>>  migration/migration.h |  2 ++
>>  3 files changed, 16 insertions(+), 5 deletions(-)
>> 
>> diff --git a/migration/channel.c b/migration/channel.c
>> index 719055d..5b777ef 100644
>> --- a/migration/channel.c
>> +++ b/migration/channel.c
>> @@ -36,8 +36,7 @@ gboolean migration_channel_process_incoming(QIOChannel *ioc)
>>              error_report_err(local_err);
>>          }
>>      } else {
>> -        QEMUFile *f = qemu_fopen_channel_input(ioc);
>> -        migration_fd_process_incoming(f);
>> +        return migration_ioc_process_incoming(ioc);
>>      }
>>      return FALSE; /* unregister */
>>  }
>
> This is going to break TLS with multi FD I'm afraid.
>
>
> We have two code paths:
>
>  1. Non-TLS
>
>     event loop POLLIN on migration listener socket
>      +-> socket_accept_incoming_migration()
>           +-> migration_channel_process_incoming()
> 	       +-> migration_ioc_process_incoming()
> 	            -> returns FALSE if all required FD channels are now present
>
>  2. TLS
>
>     event loop POLLIN on migration listener socket
>      +-> socket_accept_incoming_migration()
>           +-> migration_channel_process_incoming()
> 	       +-> migration_tls_channel_process_incoming
> 	            -> Registers watch for TLS handhsake on client socket
> 	            -> returns FALSE immediately to remove listener watch
>
>     event loop POLLIN on migration *client* socket
>      +-> migration_tls_incoming_handshake
>           +-> migration_channel_process_incoming()
> 	       +-> migration_ioc_process_incoming()
> 	            -> return value ignored
>

The part of the cover letter where I explained that TLS was not working
and asked for help got chopped due to user error.

I *think* that I fixed this correctly.

As this worked differently than I expected, I just changed how things
were done.

Thanks, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/17] qio: Create new qio_channel_{readv, writev}_all
  2017-07-19 13:44   ` Daniel P. Berrange
@ 2017-08-08  8:40     ` Juan Quintela
  2017-08-08  9:25       ` Daniel P. Berrange
  0 siblings, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-08-08  8:40 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: qemu-devel, dgilbert, lvivier, peterx

"Daniel P. Berrange" <berrange@redhat.com> wrote:
> On Mon, Jul 17, 2017 at 03:42:24PM +0200, Juan Quintela wrote:
>> The function waits until it is able to write the full iov.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> 
>> --
>> 
>> Add tests.
>> ---
>>  include/io/channel.h           | 46 +++++++++++++++++++++++++
>>  io/channel.c                   | 76 ++++++++++++++++++++++++++++++++++++++++++
>>  migration/qemu-file-channel.c  | 29 +---------------
>>  tests/io-channel-helpers.c     | 55 ++++++++++++++++++++++++++++++
>>  tests/io-channel-helpers.h     |  4 +++
>>  tests/test-io-channel-buffer.c | 55 ++++++++++++++++++++++++++++--
>>  6 files changed, 234 insertions(+), 31 deletions(-)
>
>
>
>> diff --git a/io/channel.c b/io/channel.c
>> index cdf7454..82203ef 100644
>> --- a/io/channel.c
>> +++ b/io/channel.c
>> @@ -22,6 +22,7 @@
>>  #include "io/channel.h"
>>  #include "qapi/error.h"
>>  #include "qemu/main-loop.h"
>> +#include "qemu/iov.h"
>>  
>>  bool qio_channel_has_feature(QIOChannel *ioc,
>>                               QIOChannelFeature feature)
>> @@ -85,6 +86,81 @@ ssize_t qio_channel_writev_full(QIOChannel *ioc,
>>  }
>>  
>>  
>> +
>> +ssize_t qio_channel_readv_all(QIOChannel *ioc,
>> +                              const struct iovec *iov,
>> +                              size_t niov,
>> +                              Error **errp)
>> +{
>> +    ssize_t done = 0;
>> +    struct iovec *local_iov = g_new(struct iovec, niov);
>> +    struct iovec *local_iov_head = local_iov;
>> +    unsigned int nlocal_iov = niov;
>> +
>> +    nlocal_iov = iov_copy(local_iov, nlocal_iov,
>> +                          iov, niov,
>> +                          0, iov_size(iov, niov));
>> +
>> +    while (nlocal_iov > 0) {
>> +        ssize_t len;
>> +        len = qio_channel_readv(ioc, local_iov, nlocal_iov, errp);
>> +        if (len == QIO_CHANNEL_ERR_BLOCK) {
>> +            qio_channel_wait(ioc, G_IO_OUT);
>> +            continue;
>> +        }
>> +        if (len < 0) {
>> +            error_setg_errno(errp, EIO,
>> +                             "Channel was not able to read full iov");
>> +            done = -1;
>> +            goto cleanup;
>> +        }
>> +
>> +        iov_discard_front(&local_iov, &nlocal_iov, len);
>> +        done += len;
>> +    }
>
> If 'len == 0' (ie EOF from qio_channel_readv())  then this will busy
> loop. You need to break the loop on that condition and return whatever
> 'done' currently is.

Done.
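
Roughly, the extra check is (sketch; the exact code in the repost may
differ):

        if (len == 0) {
            /* EOF before the full iov was read: stop looping and
               return whatever 'done' currently is */
            break;
        }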

>> +static void test_io_channel_buf2(void)
>> +{
>> +    QIOChannelBuffer *buf;
>> +    QIOChannelTest *test;
>> +
>> +    buf = qio_channel_buffer_new(0);
>> +
>> +    test = qio_channel_test_new();
>> +    qio_channel_test_run_writer_all(test, QIO_CHANNEL(buf));
>> +    buf->offset = 0;
>> +    qio_channel_test_run_reader(test, QIO_CHANNEL(buf));
>> +    qio_channel_test_validate(test);
>> +
>> +    object_unref(OBJECT(buf));
>> +}
>> +
>> +static void test_io_channel_buf3(void)
>> +{
>> +    QIOChannelBuffer *buf;
>> +    QIOChannelTest *test;
>> +
>> +    buf = qio_channel_buffer_new(0);
>> +
>> +    test = qio_channel_test_new();
>> +    qio_channel_test_run_writer(test, QIO_CHANNEL(buf));
>> +    buf->offset = 0;
>> +    qio_channel_test_run_reader_all(test, QIO_CHANNEL(buf));
>> +    qio_channel_test_validate(test);
>> +
>> +    object_unref(OBJECT(buf));
>> +}
>> +
>> +static void test_io_channel_buf4(void)
>> +{
>> +    QIOChannelBuffer *buf;
>> +    QIOChannelTest *test;
>> +
>> +    buf = qio_channel_buffer_new(0);
>> +
>> +    test = qio_channel_test_new();
>> +    qio_channel_test_run_writer_all(test, QIO_CHANNEL(buf));
>> +    buf->offset = 0;
>> +    qio_channel_test_run_reader_all(test, QIO_CHANNEL(buf));
>> +    qio_channel_test_validate(test);
>> +
>> +    object_unref(OBJECT(buf));
>> +}
>>  
>>  int main(int argc, char **argv)
>>  {
>> @@ -46,6 +92,9 @@ int main(int argc, char **argv)
>>  
>>      g_test_init(&argc, &argv, NULL);
>>  
>> -    g_test_add_func("/io/channel/buf", test_io_channel_buf);
>> +    g_test_add_func("/io/channel/buf1", test_io_channel_buf1);
>> +    g_test_add_func("/io/channel/buf2", test_io_channel_buf2);
>> +    g_test_add_func("/io/channel/buf3", test_io_channel_buf3);
>> +    g_test_add_func("/io/channel/buf4", test_io_channel_buf4);
>>      return g_test_run();
>>  }
>
> There's no need to add any of these additions to the test suite.  Instead
> you can just change the existing io-channel-helpers.c functions
> test_io_thread_writer() and test_io_thread_reader(), to call
> qio_channel_writev_all() & qio_channel_readv_all() respectively.

They are already written now, and the advantage of this was that I was able
to test each writer variant against each reader variant.  That made it easy
to check that everything worked as expected.

Later, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 04/17] migration: Add multifd capability
  2017-07-19 15:44   ` Dr. David Alan Gilbert
@ 2017-08-08  8:42     ` Juan Quintela
  0 siblings, 0 replies; 93+ messages in thread
From: Juan Quintela @ 2017-08-08  8:42 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, lvivier, peterx, berrange

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>
> Note you need to update this;  you need to add the
> DEFINE_PROP_MIG_CAP in migration_properties[]

Done, it was sent before I integrated the other patches into upstream.
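
For reference, it is roughly this (sketch; the exact property name may
end up being different, I am following the "capability-*" convention of
the other entries):

    DEFINE_PROP_MIG_CAP("capability-x-multifd",
                        MIGRATION_CAPABILITY_X_MULTIFD),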

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 05/17] migration: Create x-multifd-threads parameter
  2017-07-19 16:00   ` Dr. David Alan Gilbert
@ 2017-08-08  8:46     ` Juan Quintela
  2017-08-08  9:44       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-08-08  8:46 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, lvivier, peterx, berrange

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> Indicates the number of threads that we would create.  By default we
>> create 2 threads.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>
> Also needs updating DEFINE_PROP stuff - and if Markus' qapi patch lands.

Done.

>>  #
>>  # @return-path: If enabled, migration will use the return path even
>>  #               for precopy. (since 2.10)
>> +#
>>  # @x-multifd: Use more than one fd for migration (since 2.10)
>>  #
>>  # Since: 1.2
>> @@ -910,6 +911,7 @@
>>    'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
>>             'compress', 'events', 'postcopy-ram', 'x-colo', 'release-ram',
>>             'block', 'return-path', 'x-multifd'] }
>> +
>
> Escapee from previous patch.

Done.

>
>>  ##
>>  # @MigrationCapabilityStatus:
>>  #
>> @@ -1026,13 +1028,19 @@
>>  # 	migrated and the destination must already have access to the
>>  # 	same backing chain as was used on the source.  (since 2.10)
>>  #
>> +# @x-multifd-threads: Number of threads used to migrate data in
>> +#                     parallel. This is the same number that the
>> +#                     number of sockets used for migration.
>> +#                     The default value is 2 (since 2.10)
>> +#
>
> That did make me think for a moment; I guess '2' makes sense once you've
> set the x-multifd capability on.  The other possibility would be to
> remove the capability and just rely on the threads > 1

I think this is the same as the xbzrle cache size.  It has a default
value, but we only use it when we set the capability.

I think that it makes the code more orthogonal with the others, no?

Later, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 07/17] migration: Create multifd migration threads
  2017-07-19 16:49   ` Dr. David Alan Gilbert
@ 2017-08-08  8:58     ` Juan Quintela
  0 siblings, 0 replies; 93+ messages in thread
From: Juan Quintela @ 2017-08-08  8:58 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, lvivier, peterx, berrange

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> Creation of the threads, nothing inside yet.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>

>> +        MultiFDSendParams *p = &multifd_send_state->params[i];
>> +
>> +        qemu_mutex_lock(&p->mutex);
>> +        p->quit = true;
>> +        qemu_sem_post(&p->sem);
>> +        qemu_mutex_unlock(&p->mutex);
>
> I don't think you need that lock/unlock pair - as long as no one
> else is currently going around setting them to false; so as long
> as you know you're safely after initialisation and no one is trying
> to start a new migration at the moment then I think it's safe.

This is the error path, or the end of migration.  I get very nervous
with "*I think* that it is safe", and especially, I then lose "moral
authority" when somebody else tries *not* to protect some access to a
variable.

If you prefer it, I can change that to an atomic, but I am not sure
that it would make a big difference.

And yes, in this case it is probably OK, because the sem_post() should
synchronize everything needed, but I am not an expert in all
architectures; better safe than sorry, no?
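
If it helps, the atomic variant would look something like this (just a
sketch, not what the series currently does):

    /* main thread */
    atomic_set(&p->quit, true);
    qemu_sem_post(&p->sem);

    /* worker thread */
    if (atomic_read(&p->quit)) {
        break;
    }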

>> +    }
>> +}
>> +
>> +void multifd_save_cleanup(void)
>> +{
>> +    int i;
>> +
>> +    if (!migrate_use_multifd()) {
>> +        return;
>> +    }
>> +    terminate_multifd_send_threads();
>> +    for (i = 0; i < multifd_send_state->count; i++) {
>> +        MultiFDSendParams *p = &multifd_send_state->params[i];
>> +
>> +        qemu_thread_join(&p->thread);
>> +        qemu_mutex_destroy(&p->mutex);
>> +        qemu_sem_destroy(&p->sem);
>> +    }
>> +    g_free(multifd_send_state->params);
>> +    multifd_send_state->params = NULL;
>> +    g_free(multifd_send_state);
>> +    multifd_send_state = NULL;
>
> I'd be tempted to add a few traces around here, and also some
> protection against it being called twice.  Maybe it shouldn't
> happen, but it would be nice to debug it when it does.

I can change it like I do on the reception side: as it is an array of
pointers, I can easily make them point to NULL.  What do you think?

>
>> +}
>> +
>> +static void *multifd_send_thread(void *opaque)
>> +{
>> +    MultiFDSendParams *p = opaque;
>> +
>> +    while (true) {
>> +        qemu_mutex_lock(&p->mutex);
>> +        if (p->quit) {
>> +            qemu_mutex_unlock(&p->mutex);
>> +            break;
>> +        }
>> +        qemu_mutex_unlock(&p->mutex);
>> +        qemu_sem_wait(&p->sem);
>
> Similar to above, I don't think you need those
> locks around the quit check.

For POSIX it is not strictly needed:



 Applications shall ensure that access to any memory location by more
 than one thread of control (threads or processes) is restricted such
 that no thread of control can read or modify a memory location while
 another thread of control may be modifying it. Such access is restricted
 using functions that synchronize thread execution and also synchronize
 memory with respect to other threads. The following functions
 synchronize memory with respect to other threads:

 fork() pthread_barrier_wait() pthread_cond_broadcast()
 pthread_cond_signal() pthread_cond_timedwait() pthread_cond_wait()
 pthread_create() pthread_join() pthread_mutex_lock()
 pthread_mutex_timedlock()

 pthread_mutex_trylock() pthread_mutex_unlock() pthread_spin_lock()
 pthread_spin_trylock() pthread_spin_unlock() pthread_rwlock_rdlock()
 pthread_rwlock_timedrdlock() pthread_rwlock_timedwrlock()
 pthread_rwlock_tryrdlock() pthread_rwlock_trywrlock()

 pthread_rwlock_unlock() pthread_rwlock_wrlock() sem_post()
 sem_timedwait() sem_trywait() sem_wait() semctl() semop() wait()
 waitpid()

sem_wait() synchronizes memory, but when we add more code to the mix, we
need to make sure that we call some function that synchronizes memory
there.  My experience is that this gets us into trouble down the road.

Just for starters, I never remember this list of functions from memory,
and secondly, on x86, if you are just assigning a variable, it almost
always works correctly, so we only hit the bugs on other
architectures.


>> +int multifd_load_setup(void)
>> +{
>> +    int thread_count;
>> +    uint8_t i;
>> +
>> +    if (!migrate_use_multifd()) {
>> +        return 0;
>> +    }
>> +    thread_count = migrate_multifd_threads();
>> +    multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state));
>> +    multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count);
>> +    multifd_recv_state->count = 0;
>> +    for (i = 0; i < thread_count; i++) {
>> +        char thread_name[16];
>> +        MultiFDRecvParams *p = &multifd_recv_state->params[i];
>> +
>> +        qemu_mutex_init(&p->mutex);
>> +        qemu_sem_init(&p->sem, 0);
>> +        p->quit = false;
>> +        p->id = i;
>> +        snprintf(thread_name, sizeof(thread_name), "multifdrecv_%d", i);
>> +        qemu_thread_create(&p->thread, thread_name, multifd_recv_thread, p,
>> +                           QEMU_THREAD_JOINABLE);
>> +        multifd_recv_state->count++;
>> +    }
>> +    return 0;
>> +}
>> +
>
> (It's a shame there's no way to wrap this boiler plate up to share
> between send/receive threads).

I want to share it with compress/decompress, but first I want to be sure
that this is the scheme that we want.

> However, all the above is minor, so:
>
>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Thanks.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work
  2017-07-20  9:34   ` Peter Xu
@ 2017-08-08  9:19     ` Juan Quintela
  2017-08-09  8:08       ` Peter Xu
  0 siblings, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-08-08  9:19 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, dgilbert, lvivier, berrange

Peter Xu <peterx@redhat.com> wrote:
> On Mon, Jul 17, 2017 at 03:42:30PM +0200, Juan Quintela wrote:
>
> [...]
>
>>  int multifd_load_setup(void)
>>  {
>>      int thread_count;
>> -    uint8_t i;
>>  
>>      if (!migrate_use_multifd()) {
>>          return 0;
>>      }
>>      thread_count = migrate_multifd_threads();
>>      multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state));
>> -    multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count);
>> +    multifd_recv_state->params = g_new0(MultiFDRecvParams *, thread_count);
>>      multifd_recv_state->count = 0;
>> -    for (i = 0; i < thread_count; i++) {
>> -        char thread_name[16];
>> -        MultiFDRecvParams *p = &multifd_recv_state->params[i];
>> -
>> -        qemu_mutex_init(&p->mutex);
>> -        qemu_sem_init(&p->sem, 0);
>> -        p->quit = false;
>> -        p->id = i;
>> -        snprintf(thread_name, sizeof(thread_name), "multifdrecv_%d", i);
>> -        qemu_thread_create(&p->thread, thread_name, multifd_recv_thread, p,
>> -                           QEMU_THREAD_JOINABLE);
>> -        multifd_recv_state->count++;
>> -    }
>
> Could I ask why we explicitly switched from MultiFDRecvParams[] array
> into a pointer array? Can we still use the old array?  Thanks,

Now, we could receive the channels out of order (the wonders of
networking).  So, we have two options that I can see:

* Add interesting global locking to be able to modify the array in place
  (I know that it should be safe, but still).
* Create a new struct for the new connection, and then atomically switch
  the pointer to it.

I can assure you that the second one makes it much easier to detect
when you use the "channel" before you have fully created it O:-)
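
In code, the second option is basically (sketch, names as in the
series):

    /* build the per-channel state off to the side ... */
    MultiFDRecvParams *p = g_new0(MultiFDRecvParams, 1);
    qemu_mutex_init(&p->mutex);
    qemu_sem_init(&p->sem, 0);
    qemu_sem_init(&p->ready, 0);
    p->id = id;
    p->c = ioc;
    /* ... and only publish it once it is complete: readers see either
       NULL or a fully initialized entry, never a half-built one */
    atomic_set(&multifd_recv_state->params[id], p);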

Later, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/17] qio: Create new qio_channel_{readv, writev}_all
  2017-08-08  8:40     ` Juan Quintela
@ 2017-08-08  9:25       ` Daniel P. Berrange
  0 siblings, 0 replies; 93+ messages in thread
From: Daniel P. Berrange @ 2017-08-08  9:25 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, dgilbert, lvivier, peterx

On Tue, Aug 08, 2017 at 10:40:08AM +0200, Juan Quintela wrote:
> "Daniel P. Berrange" <berrange@redhat.com> wrote:
> > On Mon, Jul 17, 2017 at 03:42:24PM +0200, Juan Quintela wrote:
> >> The function waits until it is able to write the full iov.
> >> 
> >> Signed-off-by: Juan Quintela <quintela@redhat.com>
> >> 
> >> --
> >> 
> >> Add tests.
> >> ---
> >>  include/io/channel.h           | 46 +++++++++++++++++++++++++
> >>  io/channel.c                   | 76 ++++++++++++++++++++++++++++++++++++++++++
> >>  migration/qemu-file-channel.c  | 29 +---------------
> >>  tests/io-channel-helpers.c     | 55 ++++++++++++++++++++++++++++++
> >>  tests/io-channel-helpers.h     |  4 +++
> >>  tests/test-io-channel-buffer.c | 55 ++++++++++++++++++++++++++++--
> >>  6 files changed, 234 insertions(+), 31 deletions(-)
> >
> >
> >
> >> diff --git a/io/channel.c b/io/channel.c
> >> index cdf7454..82203ef 100644
> >> --- a/io/channel.c
> >> +++ b/io/channel.c
> >> @@ -22,6 +22,7 @@
> >>  #include "io/channel.h"
> >>  #include "qapi/error.h"
> >>  #include "qemu/main-loop.h"
> >> +#include "qemu/iov.h"
> >>  
> >>  bool qio_channel_has_feature(QIOChannel *ioc,
> >>                               QIOChannelFeature feature)
> >> @@ -85,6 +86,81 @@ ssize_t qio_channel_writev_full(QIOChannel *ioc,
> >>  }
> >>  
> >>  
> >> +
> >> +ssize_t qio_channel_readv_all(QIOChannel *ioc,
> >> +                              const struct iovec *iov,
> >> +                              size_t niov,
> >> +                              Error **errp)
> >> +{
> >> +    ssize_t done = 0;
> >> +    struct iovec *local_iov = g_new(struct iovec, niov);
> >> +    struct iovec *local_iov_head = local_iov;
> >> +    unsigned int nlocal_iov = niov;
> >> +
> >> +    nlocal_iov = iov_copy(local_iov, nlocal_iov,
> >> +                          iov, niov,
> >> +                          0, iov_size(iov, niov));
> >> +
> >> +    while (nlocal_iov > 0) {
> >> +        ssize_t len;
> >> +        len = qio_channel_readv(ioc, local_iov, nlocal_iov, errp);
> >> +        if (len == QIO_CHANNEL_ERR_BLOCK) {
> >> +            qio_channel_wait(ioc, G_IO_OUT);
> >> +            continue;
> >> +        }
> >> +        if (len < 0) {
> >> +            error_setg_errno(errp, EIO,
> >> +                             "Channel was not able to read full iov");
> >> +            done = -1;
> >> +            goto cleanup;
> >> +        }
> >> +
> >> +        iov_discard_front(&local_iov, &nlocal_iov, len);
> >> +        done += len;
> >> +    }
> >
> > If 'len == 0' (ie EOF from qio_channel_readv())  then this will busy
> > loop. You need to break the loop on that condition and return whatever
> > 'done' currently is.
> 
> Done.
> 
> >> +static void test_io_channel_buf2(void)
> >> +{
> >> +    QIOChannelBuffer *buf;
> >> +    QIOChannelTest *test;
> >> +
> >> +    buf = qio_channel_buffer_new(0);
> >> +
> >> +    test = qio_channel_test_new();
> >> +    qio_channel_test_run_writer_all(test, QIO_CHANNEL(buf));
> >> +    buf->offset = 0;
> >> +    qio_channel_test_run_reader(test, QIO_CHANNEL(buf));
> >> +    qio_channel_test_validate(test);
> >> +
> >> +    object_unref(OBJECT(buf));
> >> +}
> >> +
> >> +static void test_io_channel_buf3(void)
> >> +{
> >> +    QIOChannelBuffer *buf;
> >> +    QIOChannelTest *test;
> >> +
> >> +    buf = qio_channel_buffer_new(0);
> >> +
> >> +    test = qio_channel_test_new();
> >> +    qio_channel_test_run_writer(test, QIO_CHANNEL(buf));
> >> +    buf->offset = 0;
> >> +    qio_channel_test_run_reader_all(test, QIO_CHANNEL(buf));
> >> +    qio_channel_test_validate(test);
> >> +
> >> +    object_unref(OBJECT(buf));
> >> +}
> >> +
> >> +static void test_io_channel_buf4(void)
> >> +{
> >> +    QIOChannelBuffer *buf;
> >> +    QIOChannelTest *test;
> >> +
> >> +    buf = qio_channel_buffer_new(0);
> >> +
> >> +    test = qio_channel_test_new();
> >> +    qio_channel_test_run_writer_all(test, QIO_CHANNEL(buf));
> >> +    buf->offset = 0;
> >> +    qio_channel_test_run_reader_all(test, QIO_CHANNEL(buf));
> >> +    qio_channel_test_validate(test);
> >> +
> >> +    object_unref(OBJECT(buf));
> >> +}
> >>  
> >>  int main(int argc, char **argv)
> >>  {
> >> @@ -46,6 +92,9 @@ int main(int argc, char **argv)
> >>  
> >>      g_test_init(&argc, &argv, NULL);
> >>  
> >> -    g_test_add_func("/io/channel/buf", test_io_channel_buf);
> >> +    g_test_add_func("/io/channel/buf1", test_io_channel_buf1);
> >> +    g_test_add_func("/io/channel/buf2", test_io_channel_buf2);
> >> +    g_test_add_func("/io/channel/buf3", test_io_channel_buf3);
> >> +    g_test_add_func("/io/channel/buf4", test_io_channel_buf4);
> >>      return g_test_run();
> >>  }
> >
> > There's no need to add any of these additions to the test suite.  Instead
> > you can just change the existing io-channel-helpers.c functions
> > test_io_thread_writer() and test_io_thread_reader(), to call
> > qio_channel_writev_all() & qio_channel_readv_all() respectively.
> 
> They are already written now, and the advantage of this was that I was able
> to test each writer variant against each reader variant.  That made it easy
> to check that everything worked as expected.

The existing test_io_thread_reader/writer() methods that I mention are
common code used by multiple tests - test-io-channel-buffer,
test-io-chanel-socket, test-io-channel-file, test-io-channel-tls.

I don't want to add code to test-io-channel-buffer that duplicates
functionality that already exists, and is exercised by only one of the
many IO channel implementation tests.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work
  2017-07-19 17:35   ` Dr. David Alan Gilbert
@ 2017-08-08  9:35     ` Juan Quintela
  2017-08-08  9:54       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-08-08  9:35 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, lvivier, peterx, berrange

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> We create new channels for each new thread created. We only send through
>> them a character to be sure that we are creating the channels in the
>> right order.
>
> That text is out of date isn't it?

oops, fixed.


>> +gboolean multifd_new_channel(QIOChannel *ioc)
>> +{
>> +    int thread_count = migrate_multifd_threads();
>> +    MultiFDRecvParams *p = g_new0(MultiFDRecvParams, 1);
>> +    MigrationState *s = migrate_get_current();
>> +    char string[MULTIFD_UUID_MSG];
>> +    char string_uuid[UUID_FMT_LEN];
>> +    char *uuid;
>> +    int id;
>> +
>> +    qio_channel_read(ioc, string, sizeof(string), &error_abort);
>> +    sscanf(string, "%s multifd %03d", string_uuid, &id);
>> +
>> +    if (qemu_uuid_set) {
>> +        uuid = qemu_uuid_unparse_strdup(&qemu_uuid);
>> +    } else {
>> +        uuid = g_strdup(multifd_uuid);
>> +    }
>> +    if (strcmp(string_uuid, uuid)) {
>> +        error_report("multifd: received uuid '%s' and expected uuid '%s'",
>> +                     string_uuid, uuid);
>
> probably worth adding the channel id as well so we can see
> when it fails.

Done.

>> +        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
>> +                          MIGRATION_STATUS_FAILED);
>> +        terminate_multifd_recv_threads();
>> +        return FALSE;
>> +    }
>> +    g_free(uuid);
>> +
>> +    if (multifd_recv_state->params[id] != NULL) {
>> +        error_report("multifd: received id '%d' is already setup'", id);
>> +        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
>> +                          MIGRATION_STATUS_FAILED);
>> +        terminate_multifd_recv_threads();
>> +        return FALSE;
>> +    }
>> +    qemu_mutex_init(&p->mutex);
>> +    qemu_sem_init(&p->sem, 0);
>> +    p->quit = false;
>> +    p->id = id;
>> +    p->c = ioc;
>> +    atomic_set(&multifd_recv_state->params[id], p);
>
> Can you explain why this is quite so careful about ordering ? Is there
> something that could look at params or try and take the mutex before
> the count is incremented?

What happened to me in the middle stages of the patches (yes, doing
this asynchronously was painful) was that:

I created the threads (at the beginning I did the
multifd_recv_state->params[id] = p inside the thread), and that makes things
really, really racy.  I *think* that now we could probably do this
as you state.



> I think it's safe to do:
>  p->quit = false;
>  p->id = id;
>  p->c = ioc;
>  &multifd_recv_state->params[id] = p;
>  qemu_sem_init(&p->sem, 0);
>  qemu_mutex_init(&p->mutex);
>  qemu_thread_create(...)
>  atomic_inc(&multifd_recv_state->count);    <-- I'm not sure if this
>  needs to be atomic

We only change it on the main thread, so it should be enough.  The split
that I want to do is:

we do the listen asynchronously
when something arrives, we just read it (main thread)
we then read <uuid> <string> <arguments>
and then, after checking that the uuid is right, we call whatever function we
have for "string", in our case "multifd", with <arguments> as a single
string parameter.

This should make it easier to create new "channels" for other purposes.
So far so good.
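
Roughly, the dispatch I have in mind looks like this (the helper name
multifd_new_channel_args() is made up, just to illustrate):

    /* first message on a new channel: "<uuid> <command> <arguments>" */
    char buffer[MULTIFD_UUID_MSG];
    char msg_uuid[37], cmd[32], args[64];

    qio_channel_read(ioc, buffer, sizeof(buffer), &error_abort);
    if (sscanf(buffer, "%36s %31s %63[^\n]", msg_uuid, cmd, args) < 2) {
        return FALSE;                 /* malformed greeting */
    }
    /* check msg_uuid against our uuid, then dispatch on the command */
    if (strcmp(cmd, "multifd") == 0) {
        return multifd_new_channel_args(ioc, args);
    }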

But then the question of responsibilities appears.  At the beginning, I
read the string on the reception thread for that channel, and that created a
race because I received the 1st message for that channel before the
channel was fully created (yes, it only happened sometimes, easy to
understand after debugging).  This is the main reason that I changed to
an array of pointers to structs instead of one array of structs.

Then, I had to be very careful to know when I had created all the
channel threads, because otherwise I ended up having races left and right.

I will try to test the ordering that you suggested.

>> +    qemu_thread_create(&p->thread, "multifd_recv", multifd_recv_thread, p,
>> +                       QEMU_THREAD_JOINABLE);
>
> You've lost the nice numbered thread names you had created in the
> previous version of this that you're removing.

I could get them back, but they really were not showing up in gdb; where do
they show up? ps?

>> +    multifd_recv_state->count++;
>> +
>> +    /* We need to return FALSE for the last channel */
>> +    if (multifd_recv_state->count == thread_count) {
>> +        return FALSE;
>> +    } else {
>> +        return TRUE;
>> +    }
>
> return multifd_recv_state->count != thread_count;   ?

For other reasons I changed these functions and now they use a different
way of setting/checking whether we have finished.  Look at the new series.

I didn't do it as you said because I feel it is weird to return a bool
when we expect a gboolean, but .....

Thanks, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 05/17] migration: Create x-multifd-threads parameter
  2017-08-08  8:46     ` Juan Quintela
@ 2017-08-08  9:44       ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-08  9:44 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > * Juan Quintela (quintela@redhat.com) wrote:
> >> Indicates the number of threads that we would create.  By default we
> >> create 2 threads.
> >> 
> >> Signed-off-by: Juan Quintela <quintela@redhat.com>
> >> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >
> > Also needs updating DEFINE_PROP stuff - and if Markus' qapi patch lands.
> 
> Done.
> 
> >>  #
> >>  # @return-path: If enabled, migration will use the return path even
> >>  #               for precopy. (since 2.10)
> >> +#
> >>  # @x-multifd: Use more than one fd for migration (since 2.10)
> >>  #
> >>  # Since: 1.2
> >> @@ -910,6 +911,7 @@
> >>    'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
> >>             'compress', 'events', 'postcopy-ram', 'x-colo', 'release-ram',
> >>             'block', 'return-path', 'x-multifd'] }
> >> +
> >
> > Escapee from previous patch.
> 
> Done.
> 
> >
> >>  ##
> >>  # @MigrationCapabilityStatus:
> >>  #
> >> @@ -1026,13 +1028,19 @@
> >>  # 	migrated and the destination must already have access to the
> >>  # 	same backing chain as was used on the source.  (since 2.10)
> >>  #
> >> +# @x-multifd-threads: Number of threads used to migrate data in
> >> +#                     parallel. This is the same number that the
> >> +#                     number of sockets used for migration.
> >> +#                     The default value is 2 (since 2.10)
> >> +#
> >
> > That did make me think for a moment; I guess '2' makes sense once you've
> > set the x-multifd capability on.  The other possibility would be to
> > remove the capability and just rely on the threads > 1
> 
> I think this is the same as the xbzrle cache size.  It has a default
> value, but we only use it when we set the capability.
> 
> I think that it makes the code more orthogonal with the others, no?

I don't have a strong view either way; it's more orthogonal vs one less
parameter.  Your choice.

Dave

> Later, Juan.
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work
  2017-08-08  9:35     ` Juan Quintela
@ 2017-08-08  9:54       ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-08  9:54 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > * Juan Quintela (quintela@redhat.com) wrote:
> >> We create new channels for each new thread created. We only send through
> >> them a character to be sure that we are creating the channels in the
> >> right order.
> >
> > That text is out of date isn't it?
> 
> oops, fixed.
> 
> 
> >> +gboolean multifd_new_channel(QIOChannel *ioc)
> >> +{
> >> +    int thread_count = migrate_multifd_threads();
> >> +    MultiFDRecvParams *p = g_new0(MultiFDRecvParams, 1);
> >> +    MigrationState *s = migrate_get_current();
> >> +    char string[MULTIFD_UUID_MSG];
> >> +    char string_uuid[UUID_FMT_LEN];
> >> +    char *uuid;
> >> +    int id;
> >> +
> >> +    qio_channel_read(ioc, string, sizeof(string), &error_abort);
> >> +    sscanf(string, "%s multifd %03d", string_uuid, &id);
> >> +
> >> +    if (qemu_uuid_set) {
> >> +        uuid = qemu_uuid_unparse_strdup(&qemu_uuid);
> >> +    } else {
> >> +        uuid = g_strdup(multifd_uuid);
> >> +    }
> >> +    if (strcmp(string_uuid, uuid)) {
> >> +        error_report("multifd: received uuid '%s' and expected uuid '%s'",
> >> +                     string_uuid, uuid);
> >
> > probably worth adding the channel id as well so we can see
> > when it fails.
> 
> Done.
> 
> >> +        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> >> +                          MIGRATION_STATUS_FAILED);
> >> +        terminate_multifd_recv_threads();
> >> +        return FALSE;
> >> +    }
> >> +    g_free(uuid);
> >> +
> >> +    if (multifd_recv_state->params[id] != NULL) {
> >> +        error_report("multifd: received id '%d' is already setup'", id);
> >> +        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> >> +                          MIGRATION_STATUS_FAILED);
> >> +        terminate_multifd_recv_threads();
> >> +        return FALSE;
> >> +    }
> >> +    qemu_mutex_init(&p->mutex);
> >> +    qemu_sem_init(&p->sem, 0);
> >> +    p->quit = false;
> >> +    p->id = id;
> >> +    p->c = ioc;
> >> +    atomic_set(&multifd_recv_state->params[id], p);
> >
> > Can you explain why this is quite so careful about ordering ? Is there
> > something that could look at params or try and take the mutex before
> > the count is incremented?
> 
> What happened to me in the middle stages of the patches (yes, doing
> this asynchronously was painful) was that:
> 
> I created the threads (at the beginning I did the
> multifd_recv_state->params[id] = p inside the thread), and that makes things
> really, really racy.  I *think* that now we could probably do this
> as you state.
> 
> 
> 
> > I think it's safe to do:
> >  p->quit = false;
> >  p->id = id;
> >  p->c = ioc;
> >  &multifd_recv_state->params[id] = p;
> >  qemu_sem_init(&p->sem, 0);
> >  qemu_mutex_init(&p->mutex);
> >  qemu_thread_create(...)
> >  atomic_inc(&multifd_recv_state->count);    <-- I'm not sure if this
> >  needs to be atomic
> 
> We only change it on the main thread, so it should be enough.  The split
> that I want to do is:
> 
> we do the listen asynchronously
> when something arrives, we just read it (main thread)
> we then read <uuid> <string> <arguments>
> and then, after checking that the uuid is right, we call whatever function we
> have for "string", in our case "multifd", with <arguments> as a single
> string parameter.
> 
> This should make it easier to create new "channels" for other purposes.
> So far so good.
> 
> But then the question of responsibilities appears.  At the beginning, I
> read the string on the reception thread for that channel, and that created a
> race because I received the 1st message for that channel before the
> channel was fully created (yes, it only happened sometimes, easy to
> understand after debugging).  This is the main reason that I changed to
> an array of pointers to structs instead of one array of structs.
> 
> Then, I had to be very careful to know when I had created all the
> channel threads, because otherwise I ended up having races left and right.
> 
> I will try to test the ordering that you suggested.
> 
> >> +    qemu_thread_create(&p->thread, "multifd_recv", multifd_recv_thread, p,
> >> +                       QEMU_THREAD_JOINABLE);
> >
> > You've lost the nice numbered thread names you had created in the
> > previous version of this that you're removing.
> 
> I could get them back, but they really were not showing up in gdb; where do
> they show up? ps?

If you start qemu with -name debug-threads=on they show up in gdb's
"info threads", and also in top (hit H) and ps if you turn on the right
option (H as well?).

> >> +    multifd_recv_state->count++;
> >> +
> >> +    /* We need to return FALSE for the last channel */
> >> +    if (multifd_recv_state->count == thread_count) {
> >> +        return FALSE;
> >> +    } else {
> >> +        return TRUE;
> >> +    }
> >
> > return multifd_recv_state->count != thread_count;   ?
> 
> For other reasons I changed these functions and now they use a different
> way of setting/checking whether we have finished.  Look at the new series.
> 
> I didn't do it as you said because I feel it is weird to return a bool
> when we expect a gboolean, but .....

I hope & believe they're defined as compatible:
  https://people.gnome.org/~desrt/glib-docs/glib-Standard-Macros.html#TRUE:CAPS

Dave
> Thanks, Juan.
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue
  2017-07-20 11:45   ` Dr. David Alan Gilbert
@ 2017-08-08 10:43     ` Juan Quintela
  2017-08-08 11:25       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-08-08 10:43 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, lvivier, peterx, berrange

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> Each time that we sync the bitmap, there is a possibility that we receive
>> a page that is being processed by a different thread.  We fix this
>> problem by just making sure that we wait for all receiving threads to
>> finish their work before we proceed with the next stage.
>> 
>> We are low on page flags, so we use a combination that is not valid to
>> emit that message:  MULTIFD_PAGE and COMPRESSED.
>> 
>> I tried to make a migration command for it, but it doesn't work because
>> we sometimes sync the bitmap when we have already sent the beginning
>> of the section, so I just added a new page flag.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>

>> +static int multifd_flush(void)
>> +{
>> +    int i, thread_count;
>> +
>> +    if (!migrate_use_multifd()) {
>> +        return 0;
>> +    }
>> +    thread_count = migrate_multifd_threads();
>> +    for (i = 0; i < thread_count; i++) {
>> +        MultiFDRecvParams *p = multifd_recv_state->params[i];
>> +
>> +        qemu_mutex_lock(&p->mutex);
>> +        while (!p->done) {
>> +            p->sync = true;
>> +            qemu_cond_wait(&p->cond_sync, &p->mutex);
>> +        }
>
> I don't think I understand how that works in the case where the
> recv_thread has already 'done' by the point you set sync=true; how does
> it get back to the check and do the signal?

We have two cases:
* done = true
* done = false

if done is false, we need to wait until it is done.  But if it is true,
we don't have to wait.  By definition, there is nothing on that thread
that we need to wait for.  It is not in the middle of receiving a page.



>
>> +        qemu_mutex_unlock(&p->mutex);
>> +    }
>> +    return 0;
>> +}
>> +
>>  /**
>>   * save_page_header: write page header to wire
>>   *
>> @@ -809,6 +847,12 @@ static size_t save_page_header(RAMState *rs, QEMUFile *f,  RAMBlock *block,
>>  {
>>      size_t size, len;
>>  
>> +    if (rs->multifd_needs_flush &&
>> +        (offset & RAM_SAVE_FLAG_MULTIFD_PAGE)) {
>> +        offset |= RAM_SAVE_FLAG_ZERO;
>
> In the comment near the top you say RAM_SAVE_FLAG_COMPRESS_PAGE;  it's
> probably best to add an alias at the top to make it clear, e.g.
>   #define RAM_SAVE_FLAG_MULTIFD_SYNC RAM_SAVE_FLAG_ZERO
>
>   or maybe (RAM_SAVE_FLAG_MULTIFD_PAGE | RAM_SAVE_FLAG_ZERO)

done.

But I only use it when we use the "or".
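
Something like this (sketch, using the name you suggested):

    #define RAM_SAVE_FLAG_MULTIFD_SYNC \
        (RAM_SAVE_FLAG_MULTIFD_PAGE | RAM_SAVE_FLAG_ZERO)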

Thanks, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 16/17] migration: Transfer pages over new channels
  2017-07-20 11:31   ` Dr. David Alan Gilbert
@ 2017-08-08 11:13     ` Juan Quintela
  2017-08-08 11:32       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-08-08 11:13 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, lvivier, peterx, berrange

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> We switch from sending the page number to sending the real pages.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> 
>> --
>> 
>> Remove the HACK bit, now we have the function that calculates the size
>> of a page exported.
>> ---
>>  migration/migration.c | 14 ++++++++----
>>  migration/ram.c       | 59 +++++++++++++++++----------------------------------
>>  2 files changed, 29 insertions(+), 44 deletions(-)
>> 
>> diff --git a/migration/migration.c b/migration/migration.c
>> index e122684..34a34b7 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -1882,13 +1882,14 @@ static void *migration_thread(void *opaque)
>>      /* Used by the bandwidth calcs, updated later */
>>      int64_t initial_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>>      int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>> -    int64_t initial_bytes = 0;
>>      /*
>>       * The final stage happens when the remaining data is smaller than
>>       * this threshold; it's calculated from the requested downtime and
>>       * measured bandwidth
>>       */
>>      int64_t threshold_size = 0;
>> +    int64_t qemu_file_bytes = 0;
>> +    int64_t multifd_pages = 0;
>
> It feels like these changes to the transfer count should be in a
> separate patch.

Until this patch, we only sent the page addresses, for testing purposes,
so we can change it in the previous patch.  I can split out the
qemu_file_bytes change, though.

>>      int64_t start_time = initial_time;
>>      int64_t end_time;
>>      bool old_vm_running = false;
>> @@ -1976,9 +1977,13 @@ static void *migration_thread(void *opaque)
>>          }
>>          current_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>>          if (current_time >= initial_time + BUFFER_DELAY) {
>> -            uint64_t transferred_bytes = qemu_ftell(s->to_dst_file) -
>> -                                         initial_bytes;
>>              uint64_t time_spent = current_time - initial_time;
>> +            uint64_t qemu_file_bytes_now = qemu_ftell(s->to_dst_file);
>> +            uint64_t multifd_pages_now = ram_counters.multifd;
>> +            uint64_t transferred_bytes =
>> +                (qemu_file_bytes_now - qemu_file_bytes) +
>> +                (multifd_pages_now - multifd_pages) *
>> +                qemu_target_page_size();
>
> If I've followed this right, then ram_counters.multifd is in the main
> thread not the individual threads, so we should be OK doing that.

Yeap.

>
>>              double bandwidth = (double)transferred_bytes / time_spent;
>>              threshold_size = bandwidth * s->parameters.downtime_limit;
>>  
>> @@ -1996,7 +2001,8 @@ static void *migration_thread(void *opaque)
>>  
>>              qemu_file_reset_rate_limit(s->to_dst_file);
>>              initial_time = current_time;
>> -            initial_bytes = qemu_ftell(s->to_dst_file);
>> +            qemu_file_bytes = qemu_file_bytes_now;
>> +            multifd_pages = multifd_pages_now;
>>          }
>>          if (qemu_file_rate_limit(s->to_dst_file)) {
>>              /* usleep expects microseconds */
>> diff --git a/migration/ram.c b/migration/ram.c
>> index b55b243..c78b286 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -468,25 +468,21 @@ static void *multifd_send_thread(void *opaque)
>>              break;
>>          }
>>          if (p->pages.num) {
>> -            int i;
>>              int num;
>>  
>>              num = p->pages.num;
>>              p->pages.num = 0;
>>              qemu_mutex_unlock(&p->mutex);
>>  
>> -            for (i = 0; i < num; i++) {
>> -                if (qio_channel_write(p->c,
>> -                                      (const char *)&p->pages.iov[i].iov_base,
>> -                                      sizeof(uint8_t *), &error_abort)
>> -                    != sizeof(uint8_t *)) {
>> -                    MigrationState *s = migrate_get_current();
>> +            if (qio_channel_writev_all(p->c, p->pages.iov,
>> +                                       num, &error_abort)
>> +                != num * TARGET_PAGE_SIZE) {
>> +                MigrationState *s = migrate_get_current();
>
> Same comments as previous patch; note we should find a way to get
> the error message logged; not easy since we're in a thread, but
> we need to find a way to log the errors.

I am open to suggestions how to set errors in a different thread.

>> @@ -1262,8 +1240,10 @@ static int ram_multifd_page(RAMState *rs, PageSearchStatus *pss,
>>                               offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
>>          fd_num = multifd_send_page(p, rs->migration_dirty_pages == 1);
>>          qemu_put_be16(rs->f, fd_num);
>> +        if (fd_num != UINT16_MAX) {
>> +            qemu_fflush(rs->f);
>> +        }
>
> Is that to make sure that the relatively small messages actually get
> transmitted on the main fd so that the destination starts receiving
> them?

Yeap.

> I do have a worry there that, since the addresses are going down a
> single fd we are open to deadlock by the send threads filling up
> buffers and blocking waiting for the receivers to receive.

I think we are handling this intelligently here.
We only sync when we are sure that the package has finished, so we
should be OK here.  If we finish the migration, we call fflush anyway in
other places, so we can't get stuck as far as I can see.

Later, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue
  2017-08-08 10:43     ` Juan Quintela
@ 2017-08-08 11:25       ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-08 11:25 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > * Juan Quintela (quintela@redhat.com) wrote:
> >> Each time that we sync the bitmap, there is a possibility that we receive
> >> a page that is being processed by a different thread.  We fix this
> >> problem by just making sure that we wait for all receiving threads to
> >> finish their work before we proceed with the next stage.
> >> 
> >> We are low on page flags, so we use a combination that is not valid to
> >> emit that message:  MULTIFD_PAGE and COMPRESSED.
> >> 
> >> I tried to make a migration command for it, but it doesn't work because
> >> we sometimes sync the bitmap when we have already sent the beginning
> >> of the section, so I just added a new page flag.
> >> 
> >> Signed-off-by: Juan Quintela <quintela@redhat.com>
> 
> >> +static int multifd_flush(void)
> >> +{
> >> +    int i, thread_count;
> >> +
> >> +    if (!migrate_use_multifd()) {
> >> +        return 0;
> >> +    }
> >> +    thread_count = migrate_multifd_threads();
> >> +    for (i = 0; i < thread_count; i++) {
> >> +        MultiFDRecvParams *p = multifd_recv_state->params[i];
> >> +
> >> +        qemu_mutex_lock(&p->mutex);
> >> +        while (!p->done) {
> >> +            p->sync = true;
> >> +            qemu_cond_wait(&p->cond_sync, &p->mutex);
> >> +        }
> >
> > I don't think I understand how that works in the case where the
> > recv_thread has already 'done' by the point you set sync=true; how does
> > it get back to the check and do the signal?
> 
> We have two cases:
> * done = true
> * done = false
> 
> if done is false, we need to wait until it is done.  But if it is true,
> we don't have to wait.  By definition, there is nothing on that thread
> that we need to wait for.  It is not in the middle of receiving a page.

OK, and you've got the p->mutex, so done can't become true
between the check at the top of the while() and the p->sync = true
on the next line? OK.

Dave
> 
> 
> >
> >> +        qemu_mutex_unlock(&p->mutex);
> >> +    }
> >> +    return 0;
> >> +}
> >> +
> >>  /**
> >>   * save_page_header: write page header to wire
> >>   *
> >> @@ -809,6 +847,12 @@ static size_t save_page_header(RAMState *rs, QEMUFile *f,  RAMBlock *block,
> >>  {
> >>      size_t size, len;
> >>  
> >> +    if (rs->multifd_needs_flush &&
> >> +        (offset & RAM_SAVE_FLAG_MULTIFD_PAGE)) {
> >> +        offset |= RAM_SAVE_FLAG_ZERO;
> >
> > In the comment near the top you say RAM_SAVE_FLAG_COMPRESS_PAGE;  it's
> > probably best to add an alias at the top to make it clear, e.g.
> >   #define RAM_SAVE_FLAG_MULTIFD_SYNC RAM_SAVE_FLAG_ZERO
> >
> >   or maybe (RAM_SAVE_FLAG_MULTIFD_PAGE | RAM_SAVE_FLAG_ZERO)
> 
> done.
> 
> But I only use it when we use the "or".
> 
> Thanks, Juan.
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 14/17] migration: Delay the start of reception on main channel
  2017-07-20 10:56   ` Dr. David Alan Gilbert
@ 2017-08-08 11:29     ` Juan Quintela
  0 siblings, 0 replies; 93+ messages in thread
From: Juan Quintela @ 2017-08-08 11:29 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, lvivier, peterx, berrange

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> When we start multifd, we will want to delay the main channel until
>> the others are created.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> ---
>>  migration/migration.c | 23 ++++++++++++++---------
>>  1 file changed, 14 insertions(+), 9 deletions(-)
>> 
>> diff --git a/migration/migration.c b/migration/migration.c
>> index d9d5415..e122684 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -358,14 +358,11 @@ static void process_incoming_migration_co(void *opaque)
>>  
>>  static void migration_incoming_setup(QEMUFile *f)
>>  {
>> -    MigrationIncomingState *mis = migration_incoming_get_current();
>> -
>>      if (multifd_load_setup() != 0) {
>>          /* We haven't been able to create multifd threads
>>             nothing better to do */
>>          exit(EXIT_FAILURE);
>>      }
>> -    mis->from_src_file = f;
>>      qemu_file_set_blocking(f, false);
>>  }
>>  
>> @@ -384,18 +381,26 @@ void migration_fd_process_incoming(QEMUFile *f)
>>  gboolean migration_ioc_process_incoming(QIOChannel *ioc)
>>  {
>>      MigrationIncomingState *mis = migration_incoming_get_current();
>> +    gboolean result = FALSE;
>
> I wonder if we need some state somewhere so that we can see that the
> incoming migration is partially connected - since the main incoming
> coroutine hasn't started yet, we've not got much of mis setup.

For other reasons this code has changed, and now this variable doesn't
exist.

Later, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 14/17] migration: Delay the start of reception on main channel
  2017-07-20 11:10   ` Peter Xu
@ 2017-08-08 11:30     ` Juan Quintela
  0 siblings, 0 replies; 93+ messages in thread
From: Juan Quintela @ 2017-08-08 11:30 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, dgilbert, lvivier, berrange

Peter Xu <peterx@redhat.com> wrote:
> On Mon, Jul 17, 2017 at 03:42:35PM +0200, Juan Quintela wrote:
>> When we start multifd, we will want to delay the main channel until
>> the others are created.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> ---
>>  migration/migration.c | 23 ++++++++++++++---------
>>  1 file changed, 14 insertions(+), 9 deletions(-)
>> 
>> diff --git a/migration/migration.c b/migration/migration.c
>> index d9d5415..e122684 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -358,14 +358,11 @@ static void process_incoming_migration_co(void *opaque)
>>  
>>  static void migration_incoming_setup(QEMUFile *f)
>>  {
>> -    MigrationIncomingState *mis = migration_incoming_get_current();
>> -
>>      if (multifd_load_setup() != 0) {
>>          /* We haven't been able to create multifd threads
>>             nothing better to do */
>>          exit(EXIT_FAILURE);
>>      }
>> -    mis->from_src_file = f;
>
> Shall we keep this, and ...
>
>>      qemu_file_set_blocking(f, false);
>>  }
>>  
>> @@ -384,18 +381,26 @@ void migration_fd_process_incoming(QEMUFile *f)
>>  gboolean migration_ioc_process_incoming(QIOChannel *ioc)
>>  {
>>      MigrationIncomingState *mis = migration_incoming_get_current();
>> +    gboolean result = FALSE;
>>  
>>      if (!mis->from_src_file) {
>>          QEMUFile *f = qemu_fopen_channel_input(ioc);
>>          mis->from_src_file = f;
>
> ... remove this instead?  I am not sure, but looks like RDMA is still
> using migration_fd_process_incoming():
>
> rdma_accept_incoming_migration
>   migration_fd_process_incoming
>     migration_incoming_setup
>     migration_incoming_process
>       process_incoming_migration_co <-- here we'll use from_src_file
>                                         while it's not inited?

I reworked all the "incoming" logic for other reasons; I *think* that it
is correct now.

Later, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 16/17] migration: Transfer pages over new channels
  2017-08-08 11:13     ` Juan Quintela
@ 2017-08-08 11:32       ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-08 11:32 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > * Juan Quintela (quintela@redhat.com) wrote:
> >> We switch from sending the page number to sending real pages.
> >> 
> >> Signed-off-by: Juan Quintela <quintela@redhat.com>
> >> 
> >> --
> >> 
> >> Remove the HACK bit; now we have the function that calculates the size
> >> of a page exported.
> >> ---
> >>  migration/migration.c | 14 ++++++++----
> >>  migration/ram.c       | 59 +++++++++++++++++----------------------------------
> >>  2 files changed, 29 insertions(+), 44 deletions(-)
> >> 
> >> diff --git a/migration/migration.c b/migration/migration.c
> >> index e122684..34a34b7 100644
> >> --- a/migration/migration.c
> >> +++ b/migration/migration.c
> >> @@ -1882,13 +1882,14 @@ static void *migration_thread(void *opaque)
> >>      /* Used by the bandwidth calcs, updated later */
> >>      int64_t initial_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> >>      int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> >> -    int64_t initial_bytes = 0;
> >>      /*
> >>       * The final stage happens when the remaining data is smaller than
> >>       * this threshold; it's calculated from the requested downtime and
> >>       * measured bandwidth
> >>       */
> >>      int64_t threshold_size = 0;
> >> +    int64_t qemu_file_bytes = 0;
> >> +    int64_t multifd_pages = 0;
> >
> > It feels like these changes to the transfer count should be in a
> > separate patch.
> 
> Until this patch, we only sent the address for testing purposes;
> we can change it in the previous patch.  I can split out the
> qemu_file_bytes change, though.
> 
> >>      int64_t start_time = initial_time;
> >>      int64_t end_time;
> >>      bool old_vm_running = false;
> >> @@ -1976,9 +1977,13 @@ static void *migration_thread(void *opaque)
> >>          }
> >>          current_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> >>          if (current_time >= initial_time + BUFFER_DELAY) {
> >> -            uint64_t transferred_bytes = qemu_ftell(s->to_dst_file) -
> >> -                                         initial_bytes;
> >>              uint64_t time_spent = current_time - initial_time;
> >> +            uint64_t qemu_file_bytes_now = qemu_ftell(s->to_dst_file);
> >> +            uint64_t multifd_pages_now = ram_counters.multifd;
> >> +            uint64_t transferred_bytes =
> >> +                (qemu_file_bytes_now - qemu_file_bytes) +
> >> +                (multifd_pages_now - multifd_pages) *
> >> +                qemu_target_page_size();
> >
> > If I've followed this right, then ram_counters.multifd is in the main
> > thread not the individual threads, so we should be OK doing that.
> 
> Yeap.
> 
> >
> >>              double bandwidth = (double)transferred_bytes / time_spent;
> >>              threshold_size = bandwidth * s->parameters.downtime_limit;
> >>  
> >> @@ -1996,7 +2001,8 @@ static void *migration_thread(void *opaque)
> >>  
> >>              qemu_file_reset_rate_limit(s->to_dst_file);
> >>              initial_time = current_time;
> >> -            initial_bytes = qemu_ftell(s->to_dst_file);
> >> +            qemu_file_bytes = qemu_file_bytes_now;
> >> +            multifd_pages = multifd_pages_now;
> >>          }
> >>          if (qemu_file_rate_limit(s->to_dst_file)) {
> >>              /* usleep expects microseconds */
> >> diff --git a/migration/ram.c b/migration/ram.c
> >> index b55b243..c78b286 100644
> >> --- a/migration/ram.c
> >> +++ b/migration/ram.c
> >> @@ -468,25 +468,21 @@ static void *multifd_send_thread(void *opaque)
> >>              break;
> >>          }
> >>          if (p->pages.num) {
> >> -            int i;
> >>              int num;
> >>  
> >>              num = p->pages.num;
> >>              p->pages.num = 0;
> >>              qemu_mutex_unlock(&p->mutex);
> >>  
> >> -            for (i = 0; i < num; i++) {
> >> -                if (qio_channel_write(p->c,
> >> -                                      (const char *)&p->pages.iov[i].iov_base,
> >> -                                      sizeof(uint8_t *), &error_abort)
> >> -                    != sizeof(uint8_t *)) {
> >> -                    MigrationState *s = migrate_get_current();
> >> +            if (qio_channel_writev_all(p->c, p->pages.iov,
> >> +                                       num, &error_abort)
> >> +                != num * TARGET_PAGE_SIZE) {
> >> +                MigrationState *s = migrate_get_current();
> >
> > Same comments as previous patch; note we should find a way to get
> > the error message logged; not easy since we're in a thread, but
> > we need to find a way to log the errors.
> 
> I am open to suggestions how to set errors in a different thread.

The thread function can 'return' a value - could that be an error
pointer consumed when the thread is joined?
I'd take a fprintf if nothing else (although that's not actually safe)
but not an abort on the source side. Ever.
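
Something along these lines, maybe (untested sketch, just to show the
shape; how the error then gets fed back into the migration state is a
separate question):

    /* in multifd_send_thread(), instead of passing &error_abort */
    Error *local_err = NULL;

    if (qio_channel_writev_all(p->c, p->pages.iov, num, &local_err)
        != num * TARGET_PAGE_SIZE) {
        terminate_multifd_send_threads();
        return local_err;            /* becomes the thread's exit value */
    }
    ...
    return NULL;

    /* and wherever the send threads get joined */
    Error *err = qemu_thread_join(&p->thread);
    if (err) {
        error_report_err(err);
    }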

> 
> >> @@ -1262,8 +1240,10 @@ static int ram_multifd_page(RAMState *rs, PageSearchStatus *pss,
> >>                               offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
> >>          fd_num = multifd_send_page(p, rs->migration_dirty_pages == 1);
> >>          qemu_put_be16(rs->f, fd_num);
> >> +        if (fd_num != UINT16_MAX) {
> >> +            qemu_fflush(rs->f);
> >> +        }
> >
> > Is that to make sure that the relatively small messages actually get
> > transmitted on the main fd so that the destination starts receiving
> > them?
> 
> Yeap.
> 
> > I do have a worry there that, since the addresses are going down a
> > single fd we are open to deadlock by the send threads filling up
> > buffers and blocking waiting for the receivers to receive.
> 
> I think we are doing the intelligent thing here.
> We only sync when we are sure that the packet has finished, so we
> should be OK here.  If we finish the migration, we call fflush anyway in
> other places, so we can't get stuck as far as I can see.

Dave

> Later, Juan.
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue
  2017-07-21  2:40   ` Peter Xu
@ 2017-08-08 11:40     ` Juan Quintela
  2017-08-10  6:49       ` Peter Xu
  0 siblings, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-08-08 11:40 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, dgilbert, lvivier, berrange

Peter Xu <peterx@redhat.com> wrote:
> On Mon, Jul 17, 2017 at 03:42:38PM +0200, Juan Quintela wrote:
>> Each time that we sync the bitmap, it is possible that we receive
>> a page that is being processed by a different thread.  We fix this
>> problem by making sure that we wait for all receiving threads to
>> finish their work before we proceed with the next stage.
>> 
>> We are low on page flags, so we use a combination that is not valid to
>> emit that message:  MULTIFD_PAGE and COMPRESSED.
>> 
>> I tried to make a migration command for it, but it doesn't work because
>> we sometimes sync the bitmap when we have already sent the beginning
>> of the section, so I just added a new page flag.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>

>> @@ -675,6 +686,10 @@ static void *multifd_recv_thread(void *opaque)
>>                  return NULL;
>>              }
>>              p->done = true;
>> +            if (p->sync) {
>> +                qemu_cond_signal(&p->cond_sync);
>> +                p->sync = false;
>> +            }
>
> Could we use the same p->ready for this purpose? They looks similar:
> all we want to do is to let the main thread know "worker thread has
> finished receiving the last piece and becomes idle again", right?

We *could*, but "ready" is used for each page that we send, while sync is
only used once every round.  Notice that "ready" is a semaphore, and its
semantics are weird.  See next comment.


>> +static int multifd_flush(void)
>> +{
>> +    int i, thread_count;
>> +
>> +    if (!migrate_use_multifd()) {
>> +        return 0;
>> +    }
>> +    thread_count = migrate_multifd_threads();
>> +    for (i = 0; i < thread_count; i++) {
>> +        MultiFDRecvParams *p = multifd_recv_state->params[i];
>> +
>> +        qemu_mutex_lock(&p->mutex);
>> +        while (!p->done) {
>> +            p->sync = true;
>> +            qemu_cond_wait(&p->cond_sync, &p->mutex);
>
> (similar comment like above)

We need to look at the two pieces of code at the same time.  What we are
trying to do:

- make sure that all threads have finished the current round;
  in this particular case, that this thread has finished its current
  round OR that it is waiting for work.

So, the main thread is the one that does the sem_wait(ready) and the channel
thread is the one that does the sem_post(ready).

multifd_recv_thread()

    if (p->sync) {
        sem_post(ready);
        p->sync = false;
    }

multifd_flush()
   if (!p->done) {
       p->sync = true;
       sem_wait(ready);
   }

Ah, but done and sync can be changed from other threads, so current code
will become:

multifd_recv_thread()

    if (p->sync) {
        sem_post(ready);
        p->sync = false;
    }

multifd_flush()
   ...
   mutex_lock(lock);
   if (!p->done) {
       p->sync = true;
       mutex_unlock(lock)
       sem_wait(ready);
       mutex_lock(lock)
   }
   mutex_unlock(lock)

I would claim that this is more complicated to understand.  Mixing
locks and semaphores is ..... interesting, to say the least.  With
condition variables it becomes easy.

Yes, we could change sync/done to atomic accesses, but I am not sure that
makes things much simpler.
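
For reference, the condition-variable pairing from this patch boils down
to the following (both sides run with p->mutex held):

    /* recv thread: announce that this round has finished */
    p->done = true;
    if (p->sync) {
        qemu_cond_signal(&p->cond_sync);
        p->sync = false;
    }

    /* main thread, multifd_flush(): wait for that announcement */
    while (!p->done) {
        p->sync = true;
        qemu_cond_wait(&p->cond_sync, &p->mutex);
    }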

>> +        }
>> +        qemu_mutex_unlock(&p->mutex);
>> +    }
>> +    return 0;
>> +}
>> +
>>  /**
>>   * save_page_header: write page header to wire
>>   *
>> @@ -809,6 +847,12 @@ static size_t save_page_header(RAMState *rs, QEMUFile *f,  RAMBlock *block,
>>  {
>>      size_t size, len;
>>  
>> +    if (rs->multifd_needs_flush &&
>> +        (offset & RAM_SAVE_FLAG_MULTIFD_PAGE)) {
>
> If multifd_needs_flush is only for multifd, then we may skip this
> check, but it looks more like an assertion:
>
>     if (rs->multifd_needs_flush) {
>         assert(offset & RAM_SAVE_FLAG_MULTIFD_PAGE);
>         offset |= RAM_SAVE_FLAG_ZERO;
>     }

No, it could be that this page is a _non_ multifd page, and then ZERO
means something different.  So, we can only send this for MULTIFD pages.

> (Dave mentioned about unaligned flag used in commit message and here:
>  ZERO is used, but COMPRESS is mentioned)

OK, I can change the message.

>> @@ -2496,6 +2540,9 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>>  
>>      if (!migration_in_postcopy()) {
>>          migration_bitmap_sync(rs);
>> +        if (migrate_use_multifd()) {
>> +            rs->multifd_needs_flush = true;
>> +        }
>
> Would it be good to move this block into entry of
> migration_bitmap_sync(), instead of setting it up at the callers of
> migration_bitmap_sync()?

We can't have all of it.

We call migration_bitmap_sync() in 4 places:
- We don't need to set the flag for the 1st synchronization.
- We don't need to set it on postcopy (yet).

So, we can either add code inside to check if we are on the 1st round and
forget about postcopy (we check that elsewhere), or we keep it this way.

So, change becomes:

modified   migration/ram.c
@@ -1131,6 +1131,9 @@ static void migration_bitmap_sync(RAMState *rs)
     if (migrate_use_events()) {
         qapi_event_send_migration_pass(ram_counters.dirty_sync_count, NULL);
     }
+    if (rs->ram_bulk_stage && migrate_use_multifd()) {
+        rs->multifd_needs_flush = true;
+    }
 }
 
 /**
@@ -2533,9 +2536,6 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
 
     if (!migration_in_postcopy()) {
         migration_bitmap_sync(rs);
-        if (migrate_use_multifd()) {
-            rs->multifd_needs_flush = true;
-        }
     }
 
     ram_control_before_iterate(f, RAM_CONTROL_FINISH);
@@ -2578,9 +2578,6 @@ static void ram_save_pending(QEMUFile *f, void *opaque, uint64_t max_size,
         qemu_mutex_lock_iothread();
         rcu_read_lock();
         migration_bitmap_sync(rs);
-        if (migrate_use_multifd()) {
-            rs->multifd_needs_flush = true;
-        }
         rcu_read_unlock();
         qemu_mutex_unlock_iothread();
         remaining_size = rs->migration_dirty_pages * TARGET_PAGE_SIZE;

Three fewer lines, you win.  We already need to check elsewhere that
postcopy & multifd are not enabled at the same time.

Thanks, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 13/17] migration: Create thread infrastructure for multifd recv side
  2017-07-20 10:22   ` Peter Xu
@ 2017-08-08 11:41     ` Juan Quintela
  2017-08-09  5:53       ` Peter Xu
  0 siblings, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-08-08 11:41 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, dgilbert, lvivier, berrange

Peter Xu <peterx@redhat.com> wrote:
> On Mon, Jul 17, 2017 at 03:42:34PM +0200, Juan Quintela wrote:

>> +static void multifd_recv_page(uint8_t *address, uint16_t fd_num)
>> +{
>> +    int thread_count;
>> +    MultiFDRecvParams *p;
>> +    static multifd_pages_t pages;
>> +    static bool once;
>> +
>> +    if (!once) {
>> +        multifd_init_group(&pages);
>> +        once = true;
>> +    }
>> +
>> +    pages.iov[pages.num].iov_base = address;
>> +    pages.iov[pages.num].iov_len = TARGET_PAGE_SIZE;
>> +    pages.num++;
>> +
>> +    if (fd_num == UINT16_MAX) {
>
> (so this check is slightly mysterious as well if we don't define
>  something... O:-)

It means that we continue sending pages in the same "group".  I will add a
comment.

>
>> +        return;
>> +    }
>> +
>> +    thread_count = migrate_multifd_threads();
>> +    assert(fd_num < thread_count);
>> +    p = multifd_recv_state->params[fd_num];
>> +
>> +    qemu_sem_wait(&p->ready);
>
> Shall we check for p->pages.num == 0 before wait? What if the
> corresponding thread is already finished its old work and ready?

This is a semaphore, not a condition variable.  We only use it with
values 0 and 1.  We only wait if the other thread hasn't done the post;
if it has done the post, the wait doesn't block.  (No, I didn't
invent the semaphore names.)
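
In code, the handshake is just:

    /* recv thread: "I am idle, you can hand me work" */
    qemu_sem_post(&p->ready);

    /* main thread: returns at once if the post already happened,
       otherwise blocks until the recv thread becomes idle */
    qemu_sem_wait(&p->ready);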

>> +
>> +    qemu_mutex_lock(&p->mutex);
>> +    p->done = false;
>> +    iov_copy(p->pages.iov, pages.num, pages.iov, pages.num, 0,
>> +             iov_size(pages.iov, pages.num));
>
> Question: any reason why we don't use the same for loop in
> multifd-send codes, and just copy the IOVs in that loop? (offset is
> always zero, and we are copying the whole thing after all)

When I found the function, I only remembered to change one of the
loops.  Nice catch.

Later, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 13/17] migration: Create thread infrastructure for multifd recv side
  2017-07-20 10:29   ` Dr. David Alan Gilbert
@ 2017-08-08 11:51     ` Juan Quintela
  0 siblings, 0 replies; 93+ messages in thread
From: Juan Quintela @ 2017-08-08 11:51 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, lvivier, peterx, berrange

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> We make the locking and the transfer of information specific, even if we
>> are still receiving things through the main thread.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> ---
>>  migration/ram.c | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++-------
>>  1 file changed, 60 insertions(+), 8 deletions(-)
>> 
>> diff --git a/migration/ram.c b/migration/ram.c
>> index ac0742f..49c4880 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -49,6 +49,7 @@
>>  #include "migration/colo.h"
>>  #include "sysemu/sysemu.h"
>>  #include "qemu/uuid.h"
>> +#include "qemu/iov.h"
>>  
>>  /***********************************************************/
>>  /* ram save/restore */
>> @@ -527,7 +528,7 @@ int multifd_save_setup(void)
>>      return 0;
>>  }
>>  
>> -static int multifd_send_page(uint8_t *address)
>> +static uint16_t multifd_send_page(uint8_t *address, bool last_page)
>>  {
>>      int i, j;
>>      MultiFDSendParams *p = NULL; /* make happy gcc */
>> @@ -543,8 +544,10 @@ static int multifd_send_page(uint8_t *address)
>>      pages.iov[pages.num].iov_len = TARGET_PAGE_SIZE;
>>      pages.num++;
>>  
>> -    if (pages.num < (pages.size - 1)) {
>> -        return UINT16_MAX;
>> +    if (!last_page) {
>> +        if (pages.num < (pages.size - 1)) {
>> +            return UINT16_MAX;
>> +        }
>>      }
>
> This doesn't feel like it should be in a recv patch.


I will change it; up to this point we don't need it :p

>
>>      qemu_sem_wait(&multifd_send_state->sem);
>> @@ -572,12 +575,17 @@ static int multifd_send_page(uint8_t *address)
>>  }
>>  
>>  struct MultiFDRecvParams {
>> +    /* not changed */
>>      uint8_t id;
>>      QemuThread thread;
>>      QIOChannel *c;
>> +    QemuSemaphore ready;
>>      QemuSemaphore sem;
>>      QemuMutex mutex;
>> +    /* protected by param mutex */
>>      bool quit;
>> +    multifd_pages_t pages;
>> +    bool done;
>>  };
>>  typedef struct MultiFDRecvParams MultiFDRecvParams;
>
> The params between Send and Recv keep looking very similar; I wonder
> if we can share them.

We use different parameters.  We could do it, but I am not sure it is
worth the trouble.

>>   * save_page_header: write page header to wire
>>   *
>> @@ -1155,7 +1210,7 @@ static int ram_multifd_page(RAMState *rs, PageSearchStatus *pss,
>>          ram_counters.transferred +=
>>              save_page_header(rs, rs->f, block,
>>                               offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
>> -        fd_num = multifd_send_page(p);
>> +        fd_num = multifd_send_page(p, rs->migration_dirty_pages == 1);
>
> I think that belongs in the previous patch and probably answers one of
> my questions.

Ok, I change that.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time
  2017-07-19 13:58   ` Daniel P. Berrange
@ 2017-08-08 11:55     ` Juan Quintela
  0 siblings, 0 replies; 93+ messages in thread
From: Juan Quintela @ 2017-08-08 11:55 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: qemu-devel, dgilbert, lvivier, peterx

"Daniel P. Berrange" <berrange@redhat.com> wrote:
> On Mon, Jul 17, 2017 at 03:42:32PM +0200, Juan Quintela wrote:
>> We now send several pages at a time each time that we wake up a thread.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> 
>> --
>> 
>> Use iovecs instead of creating the equivalent.
>> ---
>>  migration/ram.c | 46 ++++++++++++++++++++++++++++++++++++++++------
>>  1 file changed, 40 insertions(+), 6 deletions(-)
>> 
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 2bf3fa7..90e1bcb 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>
>> +static void multifd_init_group(multifd_pages_t *pages)
>> +{
>> +    pages->num = 0;
>> +    pages->size = migrate_multifd_group();
>> +    pages->iov = g_malloc0(pages->size * sizeof(struct iovec));
>
> Use g_new() so that it checks for overflow in the size calculation.

Done, thanks.
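
i.e. presumably, keeping the zero initialisation:

    pages->iov = g_new0(struct iovec, pages->size);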

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time
  2017-07-20  9:44   ` Dr. David Alan Gilbert
@ 2017-08-08 12:11     ` Juan Quintela
  0 siblings, 0 replies; 93+ messages in thread
From: Juan Quintela @ 2017-08-08 12:11 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, lvivier, peterx, berrange

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> We now send several pages at a time each time that we wake up a thread.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> 
>> --
>> 
>> Use iovecs instead of creating the equivalent.
>> ---
>>  migration/ram.c | 46 ++++++++++++++++++++++++++++++++++++++++------
>>  1 file changed, 40 insertions(+), 6 deletions(-)
>> 
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 2bf3fa7..90e1bcb 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -362,6 +362,13 @@ static void compress_threads_save_setup(void)
>>  
>>  /* Multiple fd's */
>>  
>> +
>> +typedef struct {
>> +    int num;
>> +    int size;
>
> size_t ?

Done.
>
>> +    struct iovec *iov;
>> +} multifd_pages_t;
>> +
>>  struct MultiFDSendParams {
>>      /* not changed */
>>      uint8_t id;
>> @@ -371,7 +378,7 @@ struct MultiFDSendParams {
>>      QemuMutex mutex;
>>      /* protected by param mutex */
>>      bool quit;
>> -    uint8_t *address;
>> +    multifd_pages_t pages;
>>      /* protected by multifd mutex */
>>      bool done;
>>  };
>> @@ -459,8 +466,8 @@ static void *multifd_send_thread(void *opaque)
>>              qemu_mutex_unlock(&p->mutex);
>>              break;
>>          }
>> -        if (p->address) {
>> -            p->address = 0;
>> +        if (p->pages.num) {
>> +            p->pages.num = 0;
>>              qemu_mutex_unlock(&p->mutex);
>>              qemu_mutex_lock(&multifd_send_state->mutex);
>>              p->done = true;
>> @@ -475,6 +482,13 @@ static void *multifd_send_thread(void *opaque)
>>      return NULL;
>>  }
>>  
>> +static void multifd_init_group(multifd_pages_t *pages)
>> +{
>> +    pages->num = 0;
>> +    pages->size = migrate_multifd_group();
>> +    pages->iov = g_malloc0(pages->size * sizeof(struct iovec));
>
> Does that get freed anywhere?

Ooops.  Now it does.

>> +}
>> +
>>  int multifd_save_setup(void)
>>  {
>>      int thread_count;
>> @@ -498,7 +512,7 @@ int multifd_save_setup(void)
>>          p->quit = false;
>>          p->id = i;
>>          p->done = true;
>> -        p->address = 0;
>> +        multifd_init_group(&p->pages);
>>          p->c = socket_send_channel_create();
>>          if (!p->c) {
>>              error_report("Error creating a send channel");
>> @@ -515,8 +529,23 @@ int multifd_save_setup(void)
>>  
>>  static int multifd_send_page(uint8_t *address)
>>  {
>> -    int i;
>> +    int i, j;
>>      MultiFDSendParams *p = NULL; /* make happy gcc */
>> +    static multifd_pages_t pages;
>> +    static bool once;
>> +
>> +    if (!once) {
>> +        multifd_init_group(&pages);
>> +        once = true;
>> +    }
>> +
>> +    pages.iov[pages.num].iov_base = address;
>> +    pages.iov[pages.num].iov_len = TARGET_PAGE_SIZE;
>> +    pages.num++;
>> +
>> +    if (pages.num < (pages.size - 1)) {
>> +        return UINT16_MAX;
>
> That's a very odd magic constant to return.
> What's your intention?
>
>> +    }
>>  
>>      qemu_sem_wait(&multifd_send_state->sem);
>>      qemu_mutex_lock(&multifd_send_state->mutex);
>> @@ -530,7 +559,12 @@ static int multifd_send_page(uint8_t *address)
>>      }
>>      qemu_mutex_unlock(&multifd_send_state->mutex);
>>      qemu_mutex_lock(&p->mutex);
>> -    p->address = address;
>> +    p->pages.num = pages.num;
>> +    for (j = 0; j < pages.size; j++) {
>> +        p->pages.iov[j].iov_base = pages.iov[j].iov_base;
>> +        p->pages.iov[j].iov_len = pages.iov[j].iov_len;
>> +    }
>
> It would seem more logical to update p->pages.num last
>
> This is also a little odd in that iov_len is never really used,
> it's always TARGET_PAGE_SIZE.

Changed to iov_copy() following Peter's suggestion.  And iov_len is used
in the qio send functions, so we have the right value there.

>> +    pages.num = 0;
>>      qemu_mutex_unlock(&p->mutex);
>>      qemu_sem_post(&p->sem);
>
> What makes sure that any final chunk of pages that was less
> than the group size is sent at the end?

See the last_page boolean in a following patch.  It was in the wrong place.

Thanks, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page
  2017-07-19 19:02   ` Dr. David Alan Gilbert
  2017-07-20  8:10     ` Peter Xu
@ 2017-08-08 15:56     ` Juan Quintela
  2017-08-08 16:30       ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-08-08 15:56 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, lvivier, peterx, berrange

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> The function still doesn't use multifd, but we have simplified
>> ram_save_page; the xbzrle and RDMA stuff is gone.  We have added a new
>> counter and a new flag for this type of page.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> ---
>>  hmp.c                 |  2 ++
>>  migration/migration.c |  1 +
>>  migration/ram.c       | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>  qapi-schema.json      |  5 ++-
>>  4 files changed, 96 insertions(+), 2 deletions(-)
>> 
>> diff --git a/hmp.c b/hmp.c
>> index b01605a..eeb308b 100644
>> --- a/hmp.c
>> +++ b/hmp.c
>> @@ -234,6 +234,8 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
>>              monitor_printf(mon, "postcopy request count: %" PRIu64 "\n",
>>                             info->ram->postcopy_requests);
>>          }
>> +        monitor_printf(mon, "multifd: %" PRIu64 " pages\n",
>> +                       info->ram->multifd);
>>      }
>>  
>>      if (info->has_disk) {
>> diff --git a/migration/migration.c b/migration/migration.c
>> index e1c79d5..d9d5415 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -528,6 +528,7 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s)
>>      info->ram->dirty_sync_count = ram_counters.dirty_sync_count;
>>      info->ram->postcopy_requests = ram_counters.postcopy_requests;
>>      info->ram->page_size = qemu_target_page_size();
>> +    info->ram->multifd = ram_counters.multifd;
>>  
>>      if (migrate_use_xbzrle()) {
>>          info->has_xbzrle_cache = true;
>> diff --git a/migration/ram.c b/migration/ram.c
>> index b80f511..2bf3fa7 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -68,6 +68,7 @@
>>  #define RAM_SAVE_FLAG_XBZRLE   0x40
>>  /* 0x80 is reserved in migration.h start with 0x100 next */
>>  #define RAM_SAVE_FLAG_COMPRESS_PAGE    0x100
>> +#define RAM_SAVE_FLAG_MULTIFD_PAGE     0x200
>>  
>>  static inline bool is_zero_range(uint8_t *p, uint64_t size)
>>  {
>> @@ -362,12 +363,17 @@ static void compress_threads_save_setup(void)
>>  /* Multiple fd's */
>>  
>>  struct MultiFDSendParams {
>> +    /* not changed */
>>      uint8_t id;
>>      QemuThread thread;
>>      QIOChannel *c;
>>      QemuSemaphore sem;
>>      QemuMutex mutex;
>> +    /* protected by param mutex */
>>      bool quit;
>
> Should probably comment to say what address space address is in - this
> is really a qemu pointer - and that's why we can treat 0 as special?

Ok.  Added

    /* This is a temporary field.  We are using it now to transmit
       the address of the page.  Later in the series, we change it
       to carry the real page.
    */


>
>> +    uint8_t *address;
>> +    /* protected by multifd mutex */
>> +    bool done;
>
> done needs a comment to explain what it is because
> it sounds similar to quit;  I think 'done' is saying that
> the thread is idle having done what was asked?

    /* has the thread finished the last submitted job? */

>> +static int multifd_send_page(uint8_t *address)
>> +{
>> +    int i;
>> +    MultiFDSendParams *p = NULL; /* make happy gcc */
>> +
>> +    qemu_sem_wait(&multifd_send_state->sem);
>> +    qemu_mutex_lock(&multifd_send_state->mutex);
>> +    for (i = 0; i < multifd_send_state->count; i++) {
>> +        p = &multifd_send_state->params[i];
>> +
>> +        if (p->done) {
>> +            p->done = false;
>> +            break;
>> +        }
>> +    }
>> +    qemu_mutex_unlock(&multifd_send_state->mutex);
>> +    qemu_mutex_lock(&p->mutex);
>> +    p->address = address;
>> +    qemu_mutex_unlock(&p->mutex);
>> +    qemu_sem_post(&p->sem);
>
> My feeling, without having fully thought it through, is that
> the locking around 'address' can be simplified; especially if the
> sending-thread never actually changes it.
>
> http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_11
> defines that most of the pthread_ functions act as barriers;
> including the sem_post and pthread_cond_signal that qemu_sem_post
> uses.

At the end of the series the code is this:

    qemu_mutex_lock(&p->mutex);
    p->pages.num = pages->num;
    iov_copy(p->pages.iov, pages->num, pages->iov, pages->num, 0,
             iov_size(pages->iov, pages->num));
    pages->num = 0;
    qemu_mutex_unlock(&p->mutex);
 
Are you sure that it looks like a good idea to drop the mutex?

The other thread uses pages->num to know if things are ready.

>
>> +    return 0;
>> +}
>> +
>>  struct MultiFDRecvParams {
>>      uint8_t id;
>>      QemuThread thread;
>> @@ -537,6 +583,7 @@ void multifd_load_cleanup(void)
>>          qemu_sem_destroy(&p->sem);
>>          socket_recv_channel_destroy(p->c);
>>          g_free(p);
>> +        multifd_recv_state->params[i] = NULL;
>>      }
>>      g_free(multifd_recv_state->params);
>>      multifd_recv_state->params = NULL;
>> @@ -1058,6 +1105,32 @@ static int ram_save_page(RAMState *rs, PageSearchStatus *pss, bool last_stage)
>>      return pages;
>>  }
>>  
>> +static int ram_multifd_page(RAMState *rs, PageSearchStatus *pss,
>> +                            bool last_stage)
>> +{
>> +    int pages;
>> +    uint8_t *p;
>> +    RAMBlock *block = pss->block;
>> +    ram_addr_t offset = pss->page << TARGET_PAGE_BITS;
>> +
>> +    p = block->host + offset;
>> +
>> +    pages = save_zero_page(rs, block, offset, p);
>> +    if (pages == -1) {
>> +        ram_counters.transferred +=
>> +            save_page_header(rs, rs->f, block,
>> +                             offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
>> +        qemu_put_buffer(rs->f, p, TARGET_PAGE_SIZE);
>> +        multifd_send_page(p);
>> +        ram_counters.transferred += TARGET_PAGE_SIZE;
>> +        pages = 1;
>> +        ram_counters.normal++;
>> +        ram_counters.multifd++;
>> +    }
>> +
>> +    return pages;
>> +}
>> +
>>  static int do_compress_ram_page(QEMUFile *f, RAMBlock *block,
>>                                  ram_addr_t offset)
>>  {
>> @@ -1486,6 +1559,8 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss,
>>          if (migrate_use_compression() &&
>>              (rs->ram_bulk_stage || !migrate_use_xbzrle())) {
>>              res = ram_save_compressed_page(rs, pss, last_stage);
>> +        } else if (migrate_use_multifd()) {
>> +            res = ram_multifd_page(rs, pss, last_stage);
>
> It's a pity we can't wire this up with compression, but I understand
> why you simplify that.
>
> I'll see how the multiple-pages stuff works below; but the interesting
> thing here is we've already split up host-pages, which seems like a bad
> idea.

It is.  But I can't fix all the world in one go :-(
>>  #        statistics (since 2.10)
>>  #
>> +# @multifd: number of pages sent with multifd (since 2.10)
>
> Hopeful!

Everything now says 2.11.

Later, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page
  2017-07-20 11:48       ` Dr. David Alan Gilbert
@ 2017-08-08 15:58         ` Juan Quintela
  0 siblings, 0 replies; 93+ messages in thread
From: Juan Quintela @ 2017-08-08 15:58 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: Peter Xu, qemu-devel, lvivier, berrange

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Peter Xu (peterx@redhat.com) wrote:
>> On Wed, Jul 19, 2017 at 08:02:39PM +0100, Dr. David Alan Gilbert wrote:
>> ... here can we just do this?
>> 
>> retry:
>>     // don't take any lock, only read each p->address
>>     for (i = 0; i < multifd_send_state->count; i++) {
>>         p = &multifd_send_state->params[i];
>>         if (!p->address) {
>>             // we found one IDLE send thread
>>             break;
>>         }
>>     }
>>     if (!p) {
>>         qemu_sem_wait(&multifd_send_state->sem);
>>         goto retry;
>>     }
>>     // we switch its state, IDLE -> ACTIVE
>>     atomic_set(&p->address, address);
>>     // tell the thread to start work
>>     qemu_sem_post(&p->sem);
>> 
>> Above didn't really use any lock at all (either the per thread lock,
>> or the global lock). Would it work?
>
> I think what's there can certainly be simplified;  but also note
> that the later patch gets rid of 'address' and turns it into a count.
> My suggestion was to keep the 'done' and stop using 'address' as something
> special; i.e. never write address in the thread; but I think yours might
> work as well.

I substituted the test of address == 0 with pages.num == 0.

Notice that this is temporary, just to check that I am doing things
right.  We end up sending the pages here.

Later, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page
  2017-07-20  8:10     ` Peter Xu
  2017-07-20 11:48       ` Dr. David Alan Gilbert
@ 2017-08-08 16:04       ` Juan Quintela
  2017-08-09  7:42         ` Peter Xu
  1 sibling, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-08-08 16:04 UTC (permalink / raw)
  To: Peter Xu; +Cc: Dr. David Alan Gilbert, qemu-devel, lvivier, berrange

Peter Xu <peterx@redhat.com> wrote:
> On Wed, Jul 19, 2017 at 08:02:39PM +0100, Dr. David Alan Gilbert wrote:
>> * Juan Quintela (quintela@redhat.com) wrote:

>> >  struct MultiFDSendParams {
>> > +    /* not changed */
>> >      uint8_t id;
>> >      QemuThread thread;
>> >      QIOChannel *c;
>> >      QemuSemaphore sem;
>> >      QemuMutex mutex;
>> > +    /* protected by param mutex */
>> >      bool quit;
>> 
>> Should probably comment to say what address space address is in - this
>> is really a qemu pointer - and that's why we can treat 0 as special?
>
> I believe this comment is for "address" below.
>
> Yes, it would be nice to comment it. IIUC it belongs to virtual
> address space of QEMU, so it should be okay to use zero as a "special"
> value.

See new comments.

>> 
>> > +    uint8_t *address;
>> > +    /* protected by multifd mutex */
>> > +    bool done;
>> 
>> done needs a comment to explain what it is because
>> it sounds similar to quit;  I think 'done' is saying that
>> the thread is idle having done what was asked?
>
> Since we know that valid address won't be zero, not sure whether we
> can just avoid introducing the "done" field (even, not sure whether we
> will need lock when modifying "address", I think not as well? Please
> see below). For what I see this, it works like a state machine, and
> "address" can be the state:
>
>             +--------  send thread ---------+
>             |                               |
>            \|/                              |
>         address==0 (IDLE)               address!=0 (ACTIVE)
>             |                              /|\
>             |                               |
>             +--------  main thread ---------+
>
> Then...

It is needed; we change things later in the series.  We could treat
pages.num == 0 as a special case, but with "done" we can differentiate
the case where we have finished the last round from the case where we
are at the beginning of a new one.

>> 
>> >  };
>> >  typedef struct MultiFDSendParams MultiFDSendParams;
>> >  
>> > @@ -375,6 +381,8 @@ struct {
>> >      MultiFDSendParams *params;
>> >      /* number of created threads */
>> >      int count;
>> > +    QemuMutex mutex;
>> > +    QemuSemaphore sem;
>> >  } *multifd_send_state;
>> >  
>> >  static void terminate_multifd_send_threads(void)
>> > @@ -443,6 +451,7 @@ static void *multifd_send_thread(void *opaque)
>> >      } else {
>> >          qio_channel_write(p->c, string, MULTIFD_UUID_MSG, &error_abort);
>> >      }
>> > +    qemu_sem_post(&multifd_send_state->sem);
>> >  
>> >      while (!exit) {
>> >          qemu_mutex_lock(&p->mutex);
>> > @@ -450,6 +459,15 @@ static void *multifd_send_thread(void *opaque)
>> >              qemu_mutex_unlock(&p->mutex);
>> >              break;
>> >          }
>> > +        if (p->address) {
>> > +            p->address = 0;
>> > +            qemu_mutex_unlock(&p->mutex);
>> > +            qemu_mutex_lock(&multifd_send_state->mutex);
>> > +            p->done = true;
>> > +            qemu_mutex_unlock(&multifd_send_state->mutex);
>> > +            qemu_sem_post(&multifd_send_state->sem);
>> > +            continue;
>
> Here instead of setting up address=0 at the entry, can we do this (no
> "done" for this time)?
>
>                  // send the page before clearing p->address
>                  send_page(p->address);
>                  // clear p->address to switch to "IDLE" state
>                  atomic_set(&p->address, 0);
>                  // tell main thread, in case it's waiting
>                  qemu_sem_post(&multifd_send_state->sem);
>
> And on the main thread...
>
>> > +        }
>> >          qemu_mutex_unlock(&p->mutex);
>> >          qemu_sem_wait(&p->sem);
>> >      }
>> > @@ -469,6 +487,8 @@ int multifd_save_setup(void)
>> >      multifd_send_state = g_malloc0(sizeof(*multifd_send_state));
>> >      multifd_send_state->params = g_new0(MultiFDSendParams, thread_count);
>> >      multifd_send_state->count = 0;
>> > +    qemu_mutex_init(&multifd_send_state->mutex);
>> > +    qemu_sem_init(&multifd_send_state->sem, 0);
>> >      for (i = 0; i < thread_count; i++) {
>> >          char thread_name[16];
>> >          MultiFDSendParams *p = &multifd_send_state->params[i];
>> > @@ -477,6 +497,8 @@ int multifd_save_setup(void)
>> >          qemu_sem_init(&p->sem, 0);
>> >          p->quit = false;
>> >          p->id = i;
>> > +        p->done = true;
>> > +        p->address = 0;
>> >          p->c = socket_send_channel_create();
>> >          if (!p->c) {
>> >              error_report("Error creating a send channel");
>> > @@ -491,6 +513,30 @@ int multifd_save_setup(void)
>> >      return 0;
>> >  }
>> >  
>> > +static int multifd_send_page(uint8_t *address)
>> > +{
>> > +    int i;
>> > +    MultiFDSendParams *p = NULL; /* make happy gcc */
>> > +
>
>
>> > +    qemu_sem_wait(&multifd_send_state->sem);
>> > +    qemu_mutex_lock(&multifd_send_state->mutex);
>> > +    for (i = 0; i < multifd_send_state->count; i++) {
>> > +        p = &multifd_send_state->params[i];
>> > +
>> > +        if (p->done) {
>> > +            p->done = false;
>> > +            break;
>> > +        }
>> > +    }
>> > +    qemu_mutex_unlock(&multifd_send_state->mutex);
>> > +    qemu_mutex_lock(&p->mutex);
>> > +    p->address = address;
>> > +    qemu_mutex_unlock(&p->mutex);
>> > +    qemu_sem_post(&p->sem);
>
> ... here can we just do this?
>
> retry:
>     // don't take any lock, only read each p->address
>     for (i = 0; i < multifd_send_state->count; i++) {
>         p = &multifd_send_state->params[i];
>         if (!p->address) {
>             // we found one IDLE send thread
>             break;
>         }
>     }
>     if (!p) {
>         qemu_sem_wait(&multifd_send_state->sem);
>         goto retry;
>     }
>     // we switch its state, IDLE -> ACTIVE
>     atomic_set(&p->address, address);
>     // tell the thread to start work
>     qemu_sem_post(&p->sem);
>
> Above didn't really use any lock at all (either the per thread lock,
> or the global lock). Would it work?

Probably (surely on x86).

But on the "final code", it becomes:

    qemu_mutex_lock(&multifd_send_state->mutex);
    for (i = 0; i < multifd_send_state->count; i++) {
        p = &multifd_send_state->params[i];

        if (p->done) {
            p->done = false;
            break;
        }
    }
    qemu_mutex_unlock(&multifd_send_state->mutex);
    qemu_mutex_lock(&p->mutex);
    p->pages.num = pages->num;
    iov_copy(p->pages.iov, pages->num, pages->iov, pages->num, 0,
             iov_size(pages->iov, pages->num));
    pages->num = 0;
    qemu_mutex_unlock(&p->mutex);
    qemu_sem_post(&p->sem);

So, we set done to false without the global mutex (yes, we could change
that to an atomic).

But then we copy an iov without a lock?  With the other thread checking
for pages->num == 0?  It sounds a bit fragile to me, no?

Later, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time
  2017-07-20  9:49   ` Peter Xu
  2017-07-20 10:09     ` Peter Xu
@ 2017-08-08 16:06     ` Juan Quintela
  2017-08-09  7:48       ` Peter Xu
  1 sibling, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-08-08 16:06 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, dgilbert, lvivier, berrange

Peter Xu <peterx@redhat.com> wrote:
> On Mon, Jul 17, 2017 at 03:42:32PM +0200, Juan Quintela wrote:
>
> [...]
>
>>  static int multifd_send_page(uint8_t *address)
>>  {
>> -    int i;
>> +    int i, j;
>>      MultiFDSendParams *p = NULL; /* make happy gcc */
>> +    static multifd_pages_t pages;
>> +    static bool once;
>> +
>> +    if (!once) {
>> +        multifd_init_group(&pages);
>> +        once = true;
>
> Would it be good to put the "pages" into multifd_send_state? One is to
> stick globals together; another benefit is that we can remove the
> "once" here: we can then init the "pages" when init multifd_send_state
> struct (but maybe with a better name?...).

I did it that way to be able to free it.

> (there are similar static variables in multifd_recv_page() as well, if
>  this one applies, then we can possibly use multifd_recv_state for
>  that one)

Also there.

>> +    }
>> +
>> +    pages.iov[pages.num].iov_base = address;
>> +    pages.iov[pages.num].iov_len = TARGET_PAGE_SIZE;
>> +    pages.num++;
>> +
>> +    if (pages.num < (pages.size - 1)) {
>> +        return UINT16_MAX;
>
> Nit: shall we define something for readability?  Like:
>
> #define  MULTIFD_FD_INVALID  UINT16_MAX

Also done.

MULTIFD_CONTINUE

But I am open to changes.


>> +    }
>>  
>>      qemu_sem_wait(&multifd_send_state->sem);
>>      qemu_mutex_lock(&multifd_send_state->mutex);
>> @@ -530,7 +559,12 @@ static int multifd_send_page(uint8_t *address)
>>      }
>>      qemu_mutex_unlock(&multifd_send_state->mutex);
>>      qemu_mutex_lock(&p->mutex);
>> -    p->address = address;
>> +    p->pages.num = pages.num;
>> +    for (j = 0; j < pages.size; j++) {
>> +        p->pages.iov[j].iov_base = pages.iov[j].iov_base;
>> +        p->pages.iov[j].iov_len = pages.iov[j].iov_len;
>> +    }
>> +    pages.num = 0;
>>      qemu_mutex_unlock(&p->mutex);
>>      qemu_sem_post(&p->sem);
>>  
>> -- 
>> 2.9.4
>> 

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page
  2017-08-08 15:56     ` Juan Quintela
@ 2017-08-08 16:30       ` Dr. David Alan Gilbert
  2017-08-08 18:02         ` Juan Quintela
  0 siblings, 1 reply; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-08 16:30 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, lvivier, peterx, berrange

* Juan Quintela (quintela@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > * Juan Quintela (quintela@redhat.com) wrote:
> >> The function still doesn't use multifd, but we have simplified
> >> ram_save_page; the xbzrle and RDMA stuff is gone.  We have added a new
> >> counter and a new flag for this type of page.
> >> 
> >> Signed-off-by: Juan Quintela <quintela@redhat.com>
> >> ---
> >>  hmp.c                 |  2 ++
> >>  migration/migration.c |  1 +
> >>  migration/ram.c       | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++-
> >>  qapi-schema.json      |  5 ++-
> >>  4 files changed, 96 insertions(+), 2 deletions(-)
> >> 
> >> diff --git a/hmp.c b/hmp.c
> >> index b01605a..eeb308b 100644
> >> --- a/hmp.c
> >> +++ b/hmp.c
> >> @@ -234,6 +234,8 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
> >>              monitor_printf(mon, "postcopy request count: %" PRIu64 "\n",
> >>                             info->ram->postcopy_requests);
> >>          }
> >> +        monitor_printf(mon, "multifd: %" PRIu64 " pages\n",
> >> +                       info->ram->multifd);
> >>      }
> >>  
> >>      if (info->has_disk) {
> >> diff --git a/migration/migration.c b/migration/migration.c
> >> index e1c79d5..d9d5415 100644
> >> --- a/migration/migration.c
> >> +++ b/migration/migration.c
> >> @@ -528,6 +528,7 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s)
> >>      info->ram->dirty_sync_count = ram_counters.dirty_sync_count;
> >>      info->ram->postcopy_requests = ram_counters.postcopy_requests;
> >>      info->ram->page_size = qemu_target_page_size();
> >> +    info->ram->multifd = ram_counters.multifd;
> >>  
> >>      if (migrate_use_xbzrle()) {
> >>          info->has_xbzrle_cache = true;
> >> diff --git a/migration/ram.c b/migration/ram.c
> >> index b80f511..2bf3fa7 100644
> >> --- a/migration/ram.c
> >> +++ b/migration/ram.c
> >> @@ -68,6 +68,7 @@
> >>  #define RAM_SAVE_FLAG_XBZRLE   0x40
> >>  /* 0x80 is reserved in migration.h start with 0x100 next */
> >>  #define RAM_SAVE_FLAG_COMPRESS_PAGE    0x100
> >> +#define RAM_SAVE_FLAG_MULTIFD_PAGE     0x200
> >>  
> >>  static inline bool is_zero_range(uint8_t *p, uint64_t size)
> >>  {
> >> @@ -362,12 +363,17 @@ static void compress_threads_save_setup(void)
> >>  /* Multiple fd's */
> >>  
> >>  struct MultiFDSendParams {
> >> +    /* not changed */
> >>      uint8_t id;
> >>      QemuThread thread;
> >>      QIOChannel *c;
> >>      QemuSemaphore sem;
> >>      QemuMutex mutex;
> >> +    /* protected by param mutex */
> >>      bool quit;
> >
> > Should probably comment to say what address space address is in - this
> > is really a qemu pointer - and that's why we can treat 0 as special?
> 
> Ok.  Added
> 
>     /* This is a temporary field.  We are using it now to transmit
>        the address of the page.  Later in the series, we change it
>        to carry the real page.
>     */
> 
> 
> >
> >> +    uint8_t *address;
> >> +    /* protected by multifd mutex */
> >> +    bool done;
> >
> > done needs a comment to explain what it is because
> > it sounds similar to quit;  I think 'done' is saying that
> > the thread is idle having done what was asked?
> 
>     /* has the thread finished the last submitted job? */
> 
> >> +static int multifd_send_page(uint8_t *address)
> >> +{
> >> +    int i;
> >> +    MultiFDSendParams *p = NULL; /* make happy gcc */
> >> +
> >> +    qemu_sem_wait(&multifd_send_state->sem);
> >> +    qemu_mutex_lock(&multifd_send_state->mutex);
> >> +    for (i = 0; i < multifd_send_state->count; i++) {
> >> +        p = &multifd_send_state->params[i];
> >> +
> >> +        if (p->done) {
> >> +            p->done = false;
> >> +            break;
> >> +        }
> >> +    }
> >> +    qemu_mutex_unlock(&multifd_send_state->mutex);
> >> +    qemu_mutex_lock(&p->mutex);
> >> +    p->address = address;
> >> +    qemu_mutex_unlock(&p->mutex);
> >> +    qemu_sem_post(&p->sem);
> >
> > My feeling, without having fully thought it through, is that
> > the locking around 'address' can be simplified; especially if the
> > sending-thread never actually changes it.
> >
> > http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_11
> > defines that most of the pthread_ functions act as barriers;
> > including the sem_post and pthread_cond_signal that qemu_sem_post
> > uses.
> 
> At the end of the series the code is this:
> 
>     qemu_mutex_lock(&p->mutex);
>     p->pages.num = pages->num;
>     iov_copy(p->pages.iov, pages->num, pages->iov, pages->num, 0,
>              iov_size(pages->iov, pages->num));
>     pages->num = 0;
>     qemu_mutex_unlock(&p->mutex);
>  
> Are you sure that it looks like a good idea to drop the mutex?
> 
> The other thread uses pages->num to know if things are ready.

Well, I won't push it too hard, but if you:
  a) Know that the other thread isn't accessing the iov
      (because you previously know that it had set done)
  b) Know the other thread wont access it until pages->num gets
     set
  c) Ensure that all changes to the iov are visible before
     the pages->num write is visible - appropriate barriers/ordering

then you're good.  However, the mutex might be simpler.

Dave

> >
> >> +    return 0;
> >> +}
> >> +
> >>  struct MultiFDRecvParams {
> >>      uint8_t id;
> >>      QemuThread thread;
> >> @@ -537,6 +583,7 @@ void multifd_load_cleanup(void)
> >>          qemu_sem_destroy(&p->sem);
> >>          socket_recv_channel_destroy(p->c);
> >>          g_free(p);
> >> +        multifd_recv_state->params[i] = NULL;
> >>      }
> >>      g_free(multifd_recv_state->params);
> >>      multifd_recv_state->params = NULL;
> >> @@ -1058,6 +1105,32 @@ static int ram_save_page(RAMState *rs, PageSearchStatus *pss, bool last_stage)
> >>      return pages;
> >>  }
> >>  
> >> +static int ram_multifd_page(RAMState *rs, PageSearchStatus *pss,
> >> +                            bool last_stage)
> >> +{
> >> +    int pages;
> >> +    uint8_t *p;
> >> +    RAMBlock *block = pss->block;
> >> +    ram_addr_t offset = pss->page << TARGET_PAGE_BITS;
> >> +
> >> +    p = block->host + offset;
> >> +
> >> +    pages = save_zero_page(rs, block, offset, p);
> >> +    if (pages == -1) {
> >> +        ram_counters.transferred +=
> >> +            save_page_header(rs, rs->f, block,
> >> +                             offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
> >> +        qemu_put_buffer(rs->f, p, TARGET_PAGE_SIZE);
> >> +        multifd_send_page(p);
> >> +        ram_counters.transferred += TARGET_PAGE_SIZE;
> >> +        pages = 1;
> >> +        ram_counters.normal++;
> >> +        ram_counters.multifd++;
> >> +    }
> >> +
> >> +    return pages;
> >> +}
> >> +
> >>  static int do_compress_ram_page(QEMUFile *f, RAMBlock *block,
> >>                                  ram_addr_t offset)
> >>  {
> >> @@ -1486,6 +1559,8 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss,
> >>          if (migrate_use_compression() &&
> >>              (rs->ram_bulk_stage || !migrate_use_xbzrle())) {
> >>              res = ram_save_compressed_page(rs, pss, last_stage);
> >> +        } else if (migrate_use_multifd()) {
> >> +            res = ram_multifd_page(rs, pss, last_stage);
> >
> > It's a pity we can't wire this up with compression, but I understand
> > why you simplify that.
> >
> > I'll see how the multiple-pages stuff works below; but the interesting
> > thing here is we've already split up host-pages, which seems like a bad
> > idea.
> 
> It is.  But I can't fix all the world in one go :-(
> >>  #        statistics (since 2.10)
> >>  #
> >> +# @multifd: number of pages sent with multifd (since 2.10)
> >
> > Hopeful!
> 
> Everything now says 2.11.
> 
> Later, Juan.
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page
  2017-08-08 16:30       ` Dr. David Alan Gilbert
@ 2017-08-08 18:02         ` Juan Quintela
  2017-08-08 19:14           ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-08-08 18:02 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, lvivier, peterx, berrange

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
>> > * Juan Quintela (quintela@redhat.com) wrote:

...

>> > My feeling, without having fully thought it through, is that
>> > the locking around 'address' can be simplified; especially if the
>> > sending-thread never actually changes it.
>> >
>> > http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_11
>> > defines that most of the pthread_ functions act as barriers;
>> > including the sem_post and pthread_cond_signal that qemu_sem_post
>> > uses.
>> 
>> At the end of the series the code is this:
>> 
>>     qemu_mutex_lock(&p->mutex);
>>     p->pages.num = pages->num;
>>     iov_copy(p->pages.iov, pages->num, pages->iov, pages->num, 0,
>>              iov_size(pages->iov, pages->num));

****** HERE ******

>>     pages->num = 0;
>>     qemu_mutex_unlock(&p->mutex);
>>  
>> Are you sure that it looks like a good idea to drop the mutex?
>> 
>> The other thread uses pages->num to know if things are ready.
>
> Well, I won't push it too hard, but if you:
>   a) Know that the other thread isn't accessing the iov
>       (because you previously know that it had set done)

This bit I know is true.

>   b) Know the other thread won't access it until pages->num gets
>      set



>   c) Ensure that all changes to the iov are visible before
>      the pages->num write is visible - appropriate barriers/ordering

There is no barrier there that I can see.  I know that it probably
works on x86, but what about other architectures?  I think that at
*** HERE *** we need the memory barrier that we don't have.

> then you're good.  However, the mutex might be simpler.

Code (after all the changes) is:

    qemu_sem_wait(&multifd_send_state->sem);
    qemu_mutex_lock(&multifd_send_state->mutex);
    for (i = 0; i < multifd_send_state->count; i++) {
        p = &multifd_send_state->params[i];

        if (p->done) {
            p->done = false;
            break;
        }
    }
    qemu_mutex_unlock(&multifd_send_state->mutex);
    qemu_mutex_lock(&p->mutex);
    p->pages.num = pages->num;  /* we could probably switch this
                                   statement  with the next, but I doubt
                                   this would make a big difference */
    iov_copy(p->pages.iov, pages->num, pages->iov, pages->num, 0,
             iov_size(pages->iov, pages->num));
    pages->num = 0;
    qemu_mutex_unlock(&p->mutex);
    qemu_sem_post(&p->sem);


And the other thread

        qemu_mutex_lock(&p->mutex);
        [...]
        if (p->pages.num) {
            int num;

            num = p->pages.num;
            p->pages.num = 0;
            qemu_mutex_unlock(&p->mutex);

            if (qio_channel_writev_all(p->c, p->pages.iov,
                                       num, &error_abort)
                != num * TARGET_PAGE_SIZE) {
                MigrationState *s = migrate_get_current();

                migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
                                  MIGRATION_STATUS_FAILED);
                terminate_multifd_send_threads();
                return NULL;
            }
            qemu_mutex_lock(&multifd_send_state->mutex);
            p->done = true;
            qemu_mutex_unlock(&multifd_send_state->mutex);
            qemu_sem_post(&multifd_send_state->sem);
            continue;
        }
        qemu_mutex_unlock(&p->mutex);
        qemu_sem_wait(&p->sem);

This code used to have condition variables for waiting.  With
semaphores, we can probably remove the p->mutex, but then we need to
think hard each time we make a change.

Later, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page
  2017-08-08 18:02         ` Juan Quintela
@ 2017-08-08 19:14           ` Dr. David Alan Gilbert
  2017-08-09 16:48             ` Paolo Bonzini
  0 siblings, 1 reply; 93+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-08 19:14 UTC (permalink / raw)
  To: Juan Quintela; +Cc: lvivier, qemu-devel, peterx, pbonzini

* Juan Quintela (quintela@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > * Juan Quintela (quintela@redhat.com) wrote:
> >> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> >> > * Juan Quintela (quintela@redhat.com) wrote:
> 
> ...
> 
> >> > My feeling, without having fully thought it through, is that
> >> > the locking around 'address' can be simplified; especially if the
> >> > sending-thread never actually changes it.
> >> >
> >> > http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_11
> >> > defines that most of the pthread_ functions act as barriers;
> >> > including the sem_post and pthread_cond_signal that qemu_sem_post
> >> > uses.
> >> 
> >> At the end of the series the code is this:
> >> 
> >>     qemu_mutex_lock(&p->mutex);
> >>     p->pages.num = pages->num;
> >>     iov_copy(p->pages.iov, pages->num, pages->iov, pages->num, 0,
> >>              iov_size(pages->iov, pages->num));
> 
> ****** HERE ******
> 
> >>     pages->num = 0;
> >>     qemu_mutex_unlock(&p->mutex);
> >>  
> >> Are you sure that it looks like a good idea to drop the mutex?
> >> 
> >> The other thread uses pages->num to know if things are ready.
> >
> > Well, I won't push it too hard, but if you:
> >   a) Know that the other thread isn't accessing the iov
> >       (because you previously know that it had set done)
> 
> This bit I know is true.
> 
> >   b) Know the other thread won't access it until pages->num gets
> >      set
> 
> 
> 
> >   c) Ensure that all changes to the iov are visible before
> >      the pages->num write is visible - appropriate barriers/ordering
> 
> There is no barrier there that I can see.  I know that it probably
> works on x86, but what about other architectures?  I think that at
> *** HERE *** we need the memory barrier that we don't have.

Yes, I think that's smp_mb_release() - and you have to do an
smp_mb_acquire after reading the pages->num before accessing the iov.
(Probably worth checking with Paolo).
Or just stick with mutex's.


> > then you're good.  However, the mutex might be simpler.
> 
> Code (after all the changes) is:
> 
>     qemu_sem_wait(&multifd_send_state->sem);
>     qemu_mutex_lock(&multifd_send_state->mutex);
>     for (i = 0; i < multifd_send_state->count; i++) {
>         p = &multifd_send_state->params[i];
> 
>         if (p->done) {
>             p->done = false;
>             break;
>         }
>     }
>     qemu_mutex_unlock(&multifd_send_state->mutex);
>     qemu_mutex_lock(&p->mutex);
>     p->pages.num = pages->num;  /* we could probably switch this
>                                    statement  with the next, but I doubt
>                                    this would make a big difference */
>     iov_copy(p->pages.iov, pages->num, pages->iov, pages->num, 0,
>              iov_size(pages->iov, pages->num));
>     pages->num = 0;
>     qemu_mutex_unlock(&p->mutex);
>     qemu_sem_post(&p->sem);
> 
> 
> And the other thread
> 
>         qemu_mutex_lock(&p->mutex);
>         [...]
>         if (p->pages.num) {
>             int num;
> 
>             num = p->pages.num;
>             p->pages.num = 0;
>             qemu_mutex_unlock(&p->mutex);
> 
>             if (qio_channel_writev_all(p->c, p->pages.iov,
>                                        num, &error_abort)
>                 != num * TARGET_PAGE_SIZE) {
>                 MigrationState *s = migrate_get_current();
> 
>                 migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
>                                   MIGRATION_STATUS_FAILED);
>                 terminate_multifd_send_threads();
>                 return NULL;
>             }
>             qemu_mutex_lock(&multifd_send_state->mutex);
>             p->done = true;
>             qemu_mutex_unlock(&multifd_send_state->mutex);
>             qemu_sem_post(&multifd_send_state->sem);
>             continue;
>         }
>         qemu_mutex_unlock(&p->mutex);
>         qemu_sem_wait(&p->sem);
> 
> This code used to have condition variables for waiting.  With
> semaphores, we can probably remove the p->mutex, but then we need to
> think hard each time we make a change.
> 
> Later, Juan.

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 13/17] migration: Create thread infrastructure for multifd recv side
  2017-08-08 11:41     ` Juan Quintela
@ 2017-08-09  5:53       ` Peter Xu
  0 siblings, 0 replies; 93+ messages in thread
From: Peter Xu @ 2017-08-09  5:53 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, dgilbert, lvivier, berrange

On Tue, Aug 08, 2017 at 01:41:13PM +0200, Juan Quintela wrote:
> Peter Xu <peterx@redhat.com> wrote:
> > On Mon, Jul 17, 2017 at 03:42:34PM +0200, Juan Quintela wrote:
> 
> >> +static void multifd_recv_page(uint8_t *address, uint16_t fd_num)
> >> +{
> >> +    int thread_count;
> >> +    MultiFDRecvParams *p;
> >> +    static multifd_pages_t pages;
> >> +    static bool once;
> >> +
> >> +    if (!once) {
> >> +        multifd_init_group(&pages);
> >> +        once = true;
> >> +    }
> >> +
> >> +    pages.iov[pages.num].iov_base = address;
> >> +    pages.iov[pages.num].iov_len = TARGET_PAGE_SIZE;
> >> +    pages.num++;
> >> +
> >> +    if (fd_num == UINT16_MAX) {
> >
> > (so this check is slightly mistery as well if we don't define
> >  something... O:-)
> 
> It means that we continue sending pages on the same "group".  Will add a
> comment.
> 
> >
> >> +        return;
> >> +    }
> >> +
> >> +    thread_count = migrate_multifd_threads();
> >> +    assert(fd_num < thread_count);
> >> +    p = multifd_recv_state->params[fd_num];
> >> +
> >> +    qemu_sem_wait(&p->ready);
> >
> > Shall we check for p->pages.num == 0 before wait? What if the
> > corresponding thread is already finished its old work and ready?
> 
> this is a semaphore, not a condition variable.  We only use it with
> values 0 and 1.  We only wait if the other thread hasn't done the post;
> if it has done the post, the wait doesn't block. (No, I didn't
> invent the semaphore names.)

Yeah I think you are right. :)  Thanks,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page
  2017-08-08 16:04       ` Juan Quintela
@ 2017-08-09  7:42         ` Peter Xu
  0 siblings, 0 replies; 93+ messages in thread
From: Peter Xu @ 2017-08-09  7:42 UTC (permalink / raw)
  To: Juan Quintela; +Cc: Dr. David Alan Gilbert, qemu-devel, lvivier, berrange

On Tue, Aug 08, 2017 at 06:04:54PM +0200, Juan Quintela wrote:
> Peter Xu <peterx@redhat.com> wrote:
> > On Wed, Jul 19, 2017 at 08:02:39PM +0100, Dr. David Alan Gilbert wrote:
> >> * Juan Quintela (quintela@redhat.com) wrote:
> 
> >> >  struct MultiFDSendParams {
> >> > +    /* not changed */
> >> >      uint8_t id;
> >> >      QemuThread thread;
> >> >      QIOChannel *c;
> >> >      QemuSemaphore sem;
> >> >      QemuMutex mutex;
> >> > +    /* protected by param mutex */
> >> >      bool quit;
> >> 
> >> Should probably comment to say what address space address is in - this
> >> is really a qemu pointer - and that's why we can treat 0 as special?
> >
> > I believe this comment is for "address" below.
> >
> > Yes, it would be nice to comment it. IIUC it belongs to virtual
> > address space of QEMU, so it should be okay to use zero as a "special"
> > value.
> 
> See new comments.
> 
> >> 
> >> > +    uint8_t *address;
> >> > +    /* protected by multifd mutex */
> >> > +    bool done;
> >> 
> >> done needs a comment to explain what it is because
> >> it sounds similar to quit;  I think 'done' is saying that
> >> the thread is idle having done what was asked?
> >
> > Since we know that valid address won't be zero, not sure whether we
> > can just avoid introducing the "done" field (even, not sure whether we
> > will need lock when modifying "address", I think not as well? Please
> > see below). For what I see this, it works like a state machine, and
> > "address" can be the state:
> >
> >             +--------  send thread ---------+
> >             |                               |
> >            \|/                              |
> >         address==0 (IDLE)               address!=0 (ACTIVE)
> >             |                              /|\
> >             |                               |
> >             +--------  main thread ---------+
> >
> > Then...
> 
> It is needed, we change things later in the series.  We could treat as
> an special case page.num == 0. But then we can differentiate the case
> where we have finished the last round and that we are in the beggining
> of the new one.

(Will comment below)

> 
> >> 
> >> >  };
> >> >  typedef struct MultiFDSendParams MultiFDSendParams;
> >> >  
> >> > @@ -375,6 +381,8 @@ struct {
> >> >      MultiFDSendParams *params;
> >> >      /* number of created threads */
> >> >      int count;
> >> > +    QemuMutex mutex;
> >> > +    QemuSemaphore sem;
> >> >  } *multifd_send_state;
> >> >  
> >> >  static void terminate_multifd_send_threads(void)
> >> > @@ -443,6 +451,7 @@ static void *multifd_send_thread(void *opaque)
> >> >      } else {
> >> >          qio_channel_write(p->c, string, MULTIFD_UUID_MSG, &error_abort);
> >> >      }
> >> > +    qemu_sem_post(&multifd_send_state->sem);
> >> >  
> >> >      while (!exit) {
> >> >          qemu_mutex_lock(&p->mutex);
> >> > @@ -450,6 +459,15 @@ static void *multifd_send_thread(void *opaque)
> >> >              qemu_mutex_unlock(&p->mutex);
> >> >              break;
> >> >          }
> >> > +        if (p->address) {
> >> > +            p->address = 0;
> >> > +            qemu_mutex_unlock(&p->mutex);
> >> > +            qemu_mutex_lock(&multifd_send_state->mutex);
> >> > +            p->done = true;
> >> > +            qemu_mutex_unlock(&multifd_send_state->mutex);
> >> > +            qemu_sem_post(&multifd_send_state->sem);
> >> > +            continue;
> >
> > Here instead of setting up address=0 at the entry, can we do this (no
> > "done" for this time)?
> >
> >                  // send the page before clearing p->address
> >                  send_page(p->address);
> >                  // clear p->address to switch to "IDLE" state
> >                  atomic_set(&p->address, 0);
> >                  // tell main thread, in case it's waiting
> >                  qemu_sem_post(&multifd_send_state->sem);
> >
> > And on the main thread...
> >
> >> > +        }
> >> >          qemu_mutex_unlock(&p->mutex);
> >> >          qemu_sem_wait(&p->sem);
> >> >      }
> >> > @@ -469,6 +487,8 @@ int multifd_save_setup(void)
> >> >      multifd_send_state = g_malloc0(sizeof(*multifd_send_state));
> >> >      multifd_send_state->params = g_new0(MultiFDSendParams, thread_count);
> >> >      multifd_send_state->count = 0;
> >> > +    qemu_mutex_init(&multifd_send_state->mutex);
> >> > +    qemu_sem_init(&multifd_send_state->sem, 0);
> >> >      for (i = 0; i < thread_count; i++) {
> >> >          char thread_name[16];
> >> >          MultiFDSendParams *p = &multifd_send_state->params[i];
> >> > @@ -477,6 +497,8 @@ int multifd_save_setup(void)
> >> >          qemu_sem_init(&p->sem, 0);
> >> >          p->quit = false;
> >> >          p->id = i;
> >> > +        p->done = true;
> >> > +        p->address = 0;
> >> >          p->c = socket_send_channel_create();
> >> >          if (!p->c) {
> >> >              error_report("Error creating a send channel");
> >> > @@ -491,6 +513,30 @@ int multifd_save_setup(void)
> >> >      return 0;
> >> >  }
> >> >  
> >> > +static int multifd_send_page(uint8_t *address)
> >> > +{
> >> > +    int i;
> >> > +    MultiFDSendParams *p = NULL; /* make happy gcc */
> >> > +
> >
> >
> >> > +    qemu_sem_wait(&multifd_send_state->sem);
> >> > +    qemu_mutex_lock(&multifd_send_state->mutex);
> >> > +    for (i = 0; i < multifd_send_state->count; i++) {
> >> > +        p = &multifd_send_state->params[i];
> >> > +
> >> > +        if (p->done) {
> >> > +            p->done = false;
> >> > +            break;
> >> > +        }
> >> > +    }
> >> > +    qemu_mutex_unlock(&multifd_send_state->mutex);
> >> > +    qemu_mutex_lock(&p->mutex);
> >> > +    p->address = address;
> >> > +    qemu_mutex_unlock(&p->mutex);
> >> > +    qemu_sem_post(&p->sem);
> >
> > ... here can we just do this?
> >
> > retry:
> >     // don't take any lock, only read each p->address
> >     for (i = 0; i < multifd_send_state->count; i++) {
> >         p = &multifd_send_state->params[i];
> >         if (!p->address) {
> >             // we found one IDLE send thread
> >             break;
> >         }
> >     }
> >     if (!p) {
> >         qemu_sem_wait(&multifd_send_state->sem);
> >         goto retry;
> >     }
> >     // we switch its state, IDLE -> ACTIVE
> >     atomic_set(&p->address, address);
> >     // tell the thread to start work
> >     qemu_sem_post(&p->sem);
> >
> > Above didn't really use any lock at all (either the per thread lock,
> > or the global lock). Would it work?
> 
> Probably (surely on x86).
> 
> But on the "final code", it becomes:
> 
>     qemu_mutex_lock(&multifd_send_state->mutex);
>     for (i = 0; i < multifd_send_state->count; i++) {
>         p = &multifd_send_state->params[i];
> 
>         if (p->done) {
>             p->done = false;
>             break;
>         }
>     }
>     qemu_mutex_unlock(&multifd_send_state->mutex);
>     qemu_mutex_lock(&p->mutex);
>     p->pages.num = pages->num;
>     iov_copy(p->pages.iov, pages->num, pages->iov, pages->num, 0,
>              iov_size(pages->iov, pages->num));
>     pages->num = 0;
>     qemu_mutex_unlock(&p->mutex);
>     qemu_sem_post(&p->sem);
> 
> So, we set done to false, without the global mutex (yes, we can change
> that for one atomic).

Yep. Then IMHO we should be able to avoid the global lock
(multifd_send_state->mutex).

Though I still think that p->done is not really necessary, since there
are only two valid states for each MultiFDSendParams, either:

  p->done == true, p->pages.num == 0

Which means the send thread is idle, or:

  p->done == false, p->pages.num > 0

Which means the send thread is busy. So logically p->pages.num
contains all the information already.

But I'm fine with either way.
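
If we do keep p->done, the claim step could drop the global mutex with an
atomic exchange.  A minimal sketch (only to illustrate the idea, assuming
atomic_xchg() from qemu/atomic.h; untested):

    /* main thread: grab the first idle channel without taking
     * multifd_send_state->mutex */
    MultiFDSendParams *p = NULL;
    int i;

    do {
        for (i = 0; i < multifd_send_state->count; i++) {
            if (atomic_xchg(&multifd_send_state->params[i].done, false)) {
                /* it was true: the channel was idle and now it is ours */
                p = &multifd_send_state->params[i];
                break;
            }
        }
        if (!p) {
            /* every channel busy: wait for one of them to post */
            qemu_sem_wait(&multifd_send_state->sem);
        }
    } while (!p);

    /* the send thread would then set done back with
     * atomic_set(&p->done, true) before posting multifd_send_state->sem */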

> 
> But then we copy an iov without a lock?  With the other thread checking
> for pages->num == 0?  It sounds a bit fragile to me, no?

Yeah, for this one, I am not sure whether it can be achieved by careful
ordering.  When we publish the send request, we may need to:
When we publish the send request, we may need to:

    for (i = 0; i < pages->num; i++) {
        p->pages.iov[j].iov_base = pages.iov[j].iov_base;
        p->pages.iov[j].iov_len = pages.iov[j].iov_len;
    }
    atomic_set(&p->pages.num, pages->num);
    qemu_sem_post(&p->sem);

The point is (just like when we were using address in the current patch)
that we use p->pages.num as the state, though this time num==0 means
IDLE and num>0 means busy.  As long as we set up the IOVs before the
final state change, IMHO we should be fine.

Again, I'm also fine if we want to keep the locks at least in the
first version. I just think it may be faster if we can avoid using
those locks.

Thanks,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time
  2017-08-08 16:06     ` Juan Quintela
@ 2017-08-09  7:48       ` Peter Xu
  2017-08-09  8:05         ` Juan Quintela
  0 siblings, 1 reply; 93+ messages in thread
From: Peter Xu @ 2017-08-09  7:48 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, dgilbert, lvivier, berrange

On Tue, Aug 08, 2017 at 06:06:04PM +0200, Juan Quintela wrote:
> Peter Xu <peterx@redhat.com> wrote:
> > On Mon, Jul 17, 2017 at 03:42:32PM +0200, Juan Quintela wrote:
> >
> > [...]
> >
> >>  static int multifd_send_page(uint8_t *address)
> >>  {
> >> -    int i;
> >> +    int i, j;
> >>      MultiFDSendParams *p = NULL; /* make happy gcc */
> >> +    static multifd_pages_t pages;
> >> +    static bool once;
> >> +
> >> +    if (!once) {
> >> +        multifd_init_group(&pages);
> >> +        once = true;
> >
> > Would it be good to put the "pages" into multifd_send_state? One is to
> > stick globals together; another benefit is that we can remove the
> > "once" here: we can then init the "pages" when init multifd_send_state
> > struct (but maybe with a better name?...).
> 
> I did to be able to free it.

Free it? But they are static variables, so how can we free them?

(I thought the only way to free it is putting it into
 multifd_send_state...)

Something I must have missed here. :(

> 
> > (there are similar static variables in multifd_recv_page() as well, if
> >  this one applies, then we can possibly use multifd_recv_state for
> >  that one)
> 
> Also there.
> 
> >> +    }
> >> +
> >> +    pages.iov[pages.num].iov_base = address;
> >> +    pages.iov[pages.num].iov_len = TARGET_PAGE_SIZE;
> >> +    pages.num++;
> >> +
> >> +    if (pages.num < (pages.size - 1)) {
> >> +        return UINT16_MAX;
> >
> > Nit: shall we define something for readability?  Like:
> >
> > #define  MULTIFD_FD_INVALID  UINT16_MAX
> 
> Also done.
> 
> MULTIFD_CONTINUE
> 
> But I am open to changes.

It's clear enough at least to me. Thanks!
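
Just to be concrete, I guess it ends up looking something like this (a
sketch of my understanding, not the actual patch):

    #define MULTIFD_CONTINUE UINT16_MAX

    ...
    if (pages.num < (pages.size - 1)) {
        /* group not full yet: keep queueing pages, no channel chosen */
        return MULTIFD_CONTINUE;
    }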

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time
  2017-08-09  7:48       ` Peter Xu
@ 2017-08-09  8:05         ` Juan Quintela
  2017-08-09  8:12           ` Peter Xu
  0 siblings, 1 reply; 93+ messages in thread
From: Juan Quintela @ 2017-08-09  8:05 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, dgilbert, lvivier, berrange

Peter Xu <peterx@redhat.com> wrote:
> On Tue, Aug 08, 2017 at 06:06:04PM +0200, Juan Quintela wrote:
>> Peter Xu <peterx@redhat.com> wrote:
>> > On Mon, Jul 17, 2017 at 03:42:32PM +0200, Juan Quintela wrote:
>> >
>> > [...]
>> >
>> >>  static int multifd_send_page(uint8_t *address)
>> >>  {
>> >> -    int i;
>> >> +    int i, j;
>> >>      MultiFDSendParams *p = NULL; /* make happy gcc */
>> >> +    static multifd_pages_t pages;
>> >> +    static bool once;
>> >> +
>> >> +    if (!once) {
>> >> +        multifd_init_group(&pages);
>> >> +        once = true;
>> >
>> > Would it be good to put the "pages" into multifd_send_state? One is to
>> > stick globals together; another benefit is that we can remove the
>> > "once" here: we can then init the "pages" when init multifd_send_state
>> > struct (but maybe with a better name?...).
>> 
>> I did to be able to free it.
>
> Free it? But they are static variables, so how can we free them?
>
> (I thought the only way to free it is putting it into
>  multifd_send_state...)
>
> Something I must have missed here. :(

I made the change that you suggested in response to a comment from Dave
that asked where I freed it.  I see that my sentence was ambiguous.

>
>> 
>> > (there are similar static variables in multifd_recv_page() as well, if
>> >  this one applies, then we can possibly use multifd_recv_state for
>> >  that one)
>> 
>> Also there.
>> 
>> >> +    }
>> >> +
>> >> +    pages.iov[pages.num].iov_base = address;
>> >> +    pages.iov[pages.num].iov_len = TARGET_PAGE_SIZE;
>> >> +    pages.num++;
>> >> +
>> >> +    if (pages.num < (pages.size - 1)) {
>> >> +        return UINT16_MAX;
>> >
>> > Nit: shall we define something for readability?  Like:
>> >
>> > #define  MULTIFD_FD_INVALID  UINT16_MAX
>> 
>> Also done.
>> 
>> MULTIFD_CONTINUE
>> 
>> But I am open to changes.
>
> It's clear enough at least to me. Thanks!

Thanks, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work
  2017-08-08  9:19     ` Juan Quintela
@ 2017-08-09  8:08       ` Peter Xu
  2017-08-09 11:12         ` Juan Quintela
  0 siblings, 1 reply; 93+ messages in thread
From: Peter Xu @ 2017-08-09  8:08 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, dgilbert, lvivier, berrange

On Tue, Aug 08, 2017 at 11:19:35AM +0200, Juan Quintela wrote:
> Peter Xu <peterx@redhat.com> wrote:
> > On Mon, Jul 17, 2017 at 03:42:30PM +0200, Juan Quintela wrote:
> >
> > [...]
> >
> >>  int multifd_load_setup(void)
> >>  {
> >>      int thread_count;
> >> -    uint8_t i;
> >>  
> >>      if (!migrate_use_multifd()) {
> >>          return 0;
> >>      }
> >>      thread_count = migrate_multifd_threads();
> >>      multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state));
> >> -    multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count);
> >> +    multifd_recv_state->params = g_new0(MultiFDRecvParams *, thread_count);
> >>      multifd_recv_state->count = 0;
> >> -    for (i = 0; i < thread_count; i++) {
> >> -        char thread_name[16];
> >> -        MultiFDRecvParams *p = &multifd_recv_state->params[i];
> >> -
> >> -        qemu_mutex_init(&p->mutex);
> >> -        qemu_sem_init(&p->sem, 0);
> >> -        p->quit = false;
> >> -        p->id = i;
> >> -        snprintf(thread_name, sizeof(thread_name), "multifdrecv_%d", i);
> >> -        qemu_thread_create(&p->thread, thread_name, multifd_recv_thread, p,
> >> -                           QEMU_THREAD_JOINABLE);
> >> -        multifd_recv_state->count++;
> >> -    }
> >
> > Could I ask why we explicitly switched from MultiFDRecvParams[] array
> > into a pointer array? Can we still use the old array?  Thanks,
> 
> Now, we could receive the channels out of order (the wonders of
> networking).  So, we have two options that I can see:
> 
> * Add interesting global locking to be able to modify inplace (I know
>   that it should be safe, but yet).
> * Create a new struct in the new connection, and then atomically switch
>   the pointer to the right instruction.
> 
> I can assure you that the second one makes it much easier to detect
> when you use the "channel" before you have fully created it O:-)

Oh, so it's possible that we start to recv pages even if the recv
channel has not yet been established...

Then would current code be problematic? Like in multifd_recv_page() we
have:

static void multifd_recv_page(uint8_t *address, uint16_t fd_num)
{
    ...
    p = multifd_recv_state->params[fd_num];
    qemu_sem_wait(&p->ready);
    ...
}

Can p be NULL here if the channel is not ready yet?

(If so, I think a static array makes more sense...)

Thanks,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time
  2017-08-09  8:05         ` Juan Quintela
@ 2017-08-09  8:12           ` Peter Xu
  0 siblings, 0 replies; 93+ messages in thread
From: Peter Xu @ 2017-08-09  8:12 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, dgilbert, lvivier, berrange

On Wed, Aug 09, 2017 at 10:05:19AM +0200, Juan Quintela wrote:
> Peter Xu <peterx@redhat.com> wrote:
> > On Tue, Aug 08, 2017 at 06:06:04PM +0200, Juan Quintela wrote:
> >> Peter Xu <peterx@redhat.com> wrote:
> >> > On Mon, Jul 17, 2017 at 03:42:32PM +0200, Juan Quintela wrote:
> >> >
> >> > [...]
> >> >
> >> >>  static int multifd_send_page(uint8_t *address)
> >> >>  {
> >> >> -    int i;
> >> >> +    int i, j;
> >> >>      MultiFDSendParams *p = NULL; /* make happy gcc */
> >> >> +    static multifd_pages_t pages;
> >> >> +    static bool once;
> >> >> +
> >> >> +    if (!once) {
> >> >> +        multifd_init_group(&pages);
> >> >> +        once = true;
> >> >
> >> > Would it be good to put the "pages" into multifd_send_state? One is to
> >> > stick globals together; another benefit is that we can remove the
> >> > "once" here: we can then init the "pages" when init multifd_send_state
> >> > struct (but maybe with a better name?...).
> >> 
> >> I did to be able to free it.
> >
> > Free it? But they are static variables, so how can we free them?
> >
> > (I thought the only way to free it is putting it into
> >  multifd_send_state...)
> >
> > Something I must have missed here. :(
> 
> I made the change that you suggested in response to a comment from Dave
> that asked where I freed it.  I see that my sentence was ambiguous.

Oh! Then it's clear now. Thanks!

(Sorry I may have missed some of the emails in the threads)

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work
  2017-08-09  8:08       ` Peter Xu
@ 2017-08-09 11:12         ` Juan Quintela
  0 siblings, 0 replies; 93+ messages in thread
From: Juan Quintela @ 2017-08-09 11:12 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, dgilbert, lvivier, berrange

Peter Xu <peterx@redhat.com> wrote:
> On Tue, Aug 08, 2017 at 11:19:35AM +0200, Juan Quintela wrote:
>> Peter Xu <peterx@redhat.com> wrote:
>> > On Mon, Jul 17, 2017 at 03:42:30PM +0200, Juan Quintela wrote:
>> >
>> > [...]
>> >
>> >>  int multifd_load_setup(void)
>> >>  {
>> >>      int thread_count;
>> >> -    uint8_t i;
>> >>  
>> >>      if (!migrate_use_multifd()) {
>> >>          return 0;
>> >>      }
>> >>      thread_count = migrate_multifd_threads();
>> >>      multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state));
>> >> -    multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count);
>> >> +    multifd_recv_state->params = g_new0(MultiFDRecvParams *, thread_count);
>> >>      multifd_recv_state->count = 0;
>> >> -    for (i = 0; i < thread_count; i++) {
>> >> -        char thread_name[16];
>> >> -        MultiFDRecvParams *p = &multifd_recv_state->params[i];
>> >> -
>> >> -        qemu_mutex_init(&p->mutex);
>> >> -        qemu_sem_init(&p->sem, 0);
>> >> -        p->quit = false;
>> >> -        p->id = i;
>> >> -        snprintf(thread_name, sizeof(thread_name), "multifdrecv_%d", i);
>> >> -        qemu_thread_create(&p->thread, thread_name, multifd_recv_thread, p,
>> >> -                           QEMU_THREAD_JOINABLE);
>> >> -        multifd_recv_state->count++;
>> >> -    }
>> >
>> > Could I ask why we explicitly switched from MultiFDRecvParams[] array
>> > into a pointer array? Can we still use the old array?  Thanks,
>> 
>> Now, we could receive the channels out of order (the wonders of
>> networking).  So, we have two options that I can see:
>> 
>> * Add interesting global locking to be able to modify inplace (I know
>>   that it should be safe, but yet).
>> * Create a new struct in the new connection, and then atomically switch
>>   the pointer to the right instruction.
>> 
>> I can assure you that the second one makes it much easier to detect
>> when you use the "channel" before you have fully created it O:-)
>
> Oh, so it's possible that we start to recv pages even if the recv
> channel has not yet been established...
>
> Then would current code be problematic? Like in multifd_recv_page() we
> have:
>
> static void multifd_recv_page(uint8_t *address, uint16_t fd_num)
> {
>     ...
>     p = multifd_recv_state->params[fd_num];
>     qemu_sem_wait(&p->ready);
>     ...
> }
>
> Here can p==NULL if channel is not ready yet?
>
> (If so, I think a static array makes more sense...)

Yeap.  If we make an error (and believe me, I did), we get a "nice"
segmentation fault where we can see what fd_num is.  Otherwise we
get a hung QEMU.  I know which I prefer O:-)
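
In case it helps, this is more or less the shape of the second option (a
sketch only; "id" and "ioc" stand in for whatever the incoming connection
hands us):

    /* incoming connection: build the whole thing before publishing it */
    MultiFDRecvParams *p = g_new0(MultiFDRecvParams, 1);

    qemu_mutex_init(&p->mutex);
    qemu_sem_init(&p->sem, 0);
    qemu_sem_init(&p->ready, 0);
    p->id = id;
    p->c = ioc;
    atomic_set(&multifd_recv_state->params[id], p);   /* publish last */

    /* main thread, multifd_recv_page(): */
    p = atomic_read(&multifd_recv_state->params[fd_num]);
    /* if we raced with channel creation this is still NULL and we crash
     * loudly right here, instead of hanging */
    qemu_sem_wait(&p->ready);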

Later, Juan.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 12/17] migration: Send the fd number which we are going to use for this page
  2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 12/17] migration: Send the fd number which we are going to use for this page Juan Quintela
  2017-07-20  9:58   ` Dr. David Alan Gilbert
@ 2017-08-09 16:48   ` Paolo Bonzini
  1 sibling, 0 replies; 93+ messages in thread
From: Paolo Bonzini @ 2017-08-09 16:48 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel; +Cc: lvivier, dgilbert, peterx

On 17/07/2017 15:42, Juan Quintela wrote:
> We are still sending the page through the main channel, that would
> change later in the series
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  migration/ram.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 90e1bcb..ac0742f 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -568,7 +568,7 @@ static int multifd_send_page(uint8_t *address)
>      qemu_mutex_unlock(&p->mutex);
>      qemu_sem_post(&p->sem);
>  
> -    return 0;
> +    return i;
>  }
>  
>  struct MultiFDRecvParams {
> @@ -1143,6 +1143,7 @@ static int ram_multifd_page(RAMState *rs, PageSearchStatus *pss,
>                              bool last_stage)
>  {
>      int pages;
> +    uint16_t fd_num;
>      uint8_t *p;
>      RAMBlock *block = pss->block;
>      ram_addr_t offset = pss->page << TARGET_PAGE_BITS;
> @@ -1154,8 +1155,10 @@ static int ram_multifd_page(RAMState *rs, PageSearchStatus *pss,
>          ram_counters.transferred +=
>              save_page_header(rs, rs->f, block,
>                               offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
> +        fd_num = multifd_send_page(p);
> +        qemu_put_be16(rs->f, fd_num);
> +        ram_counters.transferred += 2; /* size of fd_num */
>          qemu_put_buffer(rs->f, p, TARGET_PAGE_SIZE);
> -        multifd_send_page(p);
>          ram_counters.transferred += TARGET_PAGE_SIZE;
>          pages = 1;
>          ram_counters.normal++;
> @@ -2905,6 +2908,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>      while (!postcopy_running && !ret && !(flags & RAM_SAVE_FLAG_EOS)) {
>          ram_addr_t addr, total_ram_bytes;
>          void *host = NULL;
> +        uint16_t fd_num;
>          uint8_t ch;
>  
>          addr = qemu_get_be64(f);
> @@ -3015,6 +3019,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>              break;
>  
>          case RAM_SAVE_FLAG_MULTIFD_PAGE:
> +            fd_num = qemu_get_be16(f);
> +            if (fd_num != 0) {
> +                /* this is yet an unused variable, changed later */
> +                fd_num = fd_num;
> +            }
>              qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
>              break;
>  
> 


I'm still not convinced of doing this instead of just treating all
sockets equivalently (and flushing them all when the main socket is told
that there is a new block).

Paolo

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page
  2017-08-08 19:14           ` Dr. David Alan Gilbert
@ 2017-08-09 16:48             ` Paolo Bonzini
  0 siblings, 0 replies; 93+ messages in thread
From: Paolo Bonzini @ 2017-08-09 16:48 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, Juan Quintela; +Cc: lvivier, qemu-devel, peterx

On 08/08/2017 21:14, Dr. David Alan Gilbert wrote:
>> There is no barrier there that I can see.  I know that it probably work
>> on x86, but in others?  I think that it *** HERE **** we need that
>> memory barrier that we don't have.
> Yes, I think that's smp_mb_release() - and you have to do an
> smp_mb_acquire after reading the pages->num before accessing the iov.

Yes, I think that's correct.

Paolo

> (Probably worth checking with Paolo).
> Or just stick with mutex's.
> 
> 

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue
  2017-08-08 11:40     ` Juan Quintela
@ 2017-08-10  6:49       ` Peter Xu
  0 siblings, 0 replies; 93+ messages in thread
From: Peter Xu @ 2017-08-10  6:49 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, dgilbert, lvivier, berrange

On Tue, Aug 08, 2017 at 01:40:58PM +0200, Juan Quintela wrote:
> Peter Xu <peterx@redhat.com> wrote:
> > On Mon, Jul 17, 2017 at 03:42:38PM +0200, Juan Quintela wrote:
> >> Each time that we sync the bitmap, there is a possibility that we receive
> >> a page that is being processed by a different thread.  We fix this
> >> problem by making sure that we wait for all receiving threads to
> >> finish their work before we proceed with the next stage.
> >> 
> >> We are low on page flags, so we use a combination that is not valid to
> >> emit that message:  MULTIFD_PAGE and COMPRESSED.
> >> 
> >> I tried to make a migration command for it, but it doesn't work because
> >> we sync the bitmap sometimes when we have already sent the beginning
> >> of the section, so I just added a new page flag.
> >> 
> >> Signed-off-by: Juan Quintela <quintela@redhat.com>
> 
> >> @@ -675,6 +686,10 @@ static void *multifd_recv_thread(void *opaque)
> >>                  return NULL;
> >>              }
> >>              p->done = true;
> >> +            if (p->sync) {
> >> +                qemu_cond_signal(&p->cond_sync);
> >> +                p->sync = false;
> >> +            }
> >
> > Could we use the same p->ready for this purpose? They looks similar:
> > all we want to do is to let the main thread know "worker thread has
> > finished receiving the last piece and becomes idle again", right?
> 
> We *could*, but "ready" is used for each page that we sent, sync is only
> used once every round.  Notice that "ready" is a semaphore, and its
> semantic is weird.  See next comment.
> 
> 
> >> +static int multifd_flush(void)
> >> +{
> >> +    int i, thread_count;
> >> +
> >> +    if (!migrate_use_multifd()) {
> >> +        return 0;
> >> +    }
> >> +    thread_count = migrate_multifd_threads();
> >> +    for (i = 0; i < thread_count; i++) {
> >> +        MultiFDRecvParams *p = multifd_recv_state->params[i];
> >> +
> >> +        qemu_mutex_lock(&p->mutex);
> >> +        while (!p->done) {
> >> +            p->sync = true;
> >> +            qemu_cond_wait(&p->cond_sync, &p->mutex);
> >
> > (similar comment like above)
> 
> We need to look at the two pieces of code at the same time.  What are we
> trying to do:
> 
> - making sure that all threads have finished the current round.
>   in this particular case, that this thread has finished its current
>   round OR  that it is waiting for work.
> 
> So, the main thread is the one that does the sem_wait(ready) and the channel
> thread is the one that does the sem_post(ready).
> 
> multifd_recv_thread()
> 
>     if (p->sync) {
>         sem_post(ready);
>         p->sync = false;
>     }
> 
> multifd_flush()
>    if (!p->done) {
>        p->sync = true;
>        sem_wait(ready);
>    }
> 
> Ah, but done and sync can be changed from other threads, so current code
> will become:
> 
> multifd_recv_thread()
> 
>     if (p->sync) {
>         sem_post(ready);
>         p->sync = false;
>     }
> 
> multifd_flush()
>    ...
>    mutex_lock(lock);
>    if (!p->done) {
>        p->sync = true;
>        mutex_unlock(lock)
>        sem_wait(ready);
>        mutex_lock(lock)
>    }
>    mutex_unlock(lock)
> 
> That, I would claim, is more complicated to understand.  Mixing
> locks and semaphores is... interesting, to say the least.  With
> condition variables it becomes easy.
> 
> Yes, we can change sync/done to atomic access, but not sure that makes
> things so much simpler.

I was thinking that p->ready could be used as a notification channel from
the recv thread to the main thread for any reason. But I'm also fine that if
you want to do this separately to have different sync channels for
page-level completions and global flushes especially in first version.

(but I'd say the whole thing feels slightly complicated, while I think
 it could be simpler somewhere...)

> 
> >> +        }
> >> +        qemu_mutex_unlock(&p->mutex);
> >> +    }
> >> +    return 0;
> >> +}
> >> +
> >>  /**
> >>   * save_page_header: write page header to wire
> >>   *
> >> @@ -809,6 +847,12 @@ static size_t save_page_header(RAMState *rs, QEMUFile *f,  RAMBlock *block,
> >>  {
> >>      size_t size, len;
> >>  
> >> +    if (rs->multifd_needs_flush &&
> >> +        (offset & RAM_SAVE_FLAG_MULTIFD_PAGE)) {
> >
> > If multifd_needs_flush is only for multifd, then we may skip this
> > check, but it looks more like an assertion:
> >
> >     if (rs->multifd_needs_flush) {
> >         assert(offset & RAM_SAVE_FLAG_MULTIFD_PAGE);
> >         offset |= RAM_SAVE_FLAG_ZERO;
> >     }
> 
> No, it could be that this page is a _non_ multifd page, and then ZERO
> means something different.  So, we can only send this for MULTIFD pages.

But if multifd_needs_flush==true, it must be a multifd page, no? :)

I think this is trivial, so both work for me.

> 
> > (Dave mentioned about unaligned flag used in commit message and here:
> >  ZERO is used, but COMPRESS is mentioned)
> 
> OK, I can change the message.
> 
> >> @@ -2496,6 +2540,9 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
> >>  
> >>      if (!migration_in_postcopy()) {
> >>          migration_bitmap_sync(rs);
> >> +        if (migrate_use_multifd()) {
> >> +            rs->multifd_needs_flush = true;
> >> +        }
> >
> > Would it be good to move this block into entry of
> > migration_bitmap_sync(), instead of setting it up at the callers of
> > migration_bitmap_sync()?
> 
> We can't have all of it.
> 
> We call migration_bitmap_sync() in 4 places.
> - We don't need to set the flag for the 1st synchronization
> - We don't need to set it on postcopy (yet).

[1]

I see.

> 
> So, we can add code inside to check if we are on the 1st round, and
> forget about postcopy (we check in other place), or we maintain it this way.
> 
> So, change becomes:
> 
> modified   migration/ram.c
> @@ -1131,6 +1131,9 @@ static void migration_bitmap_sync(RAMState *rs)
>      if (migrate_use_events()) {
>          qapi_event_send_migration_pass(ram_counters.dirty_sync_count, NULL);
>      }
> +    if (rs->ram_bulk_stage && migrate_use_multifd()) {

Should this be "!rs->ram_bulk_stage && migrate_use_multifd()"?

> +        rs->multifd_needs_flush = true;
> +    }
>  }
>  
>  /**
> @@ -2533,9 +2536,6 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>  
>      if (!migration_in_postcopy()) {
>          migration_bitmap_sync(rs);
> -        if (migrate_use_multifd()) {
> -            rs->multifd_needs_flush = true;
> -        }
>      }
>  
>      ram_control_before_iterate(f, RAM_CONTROL_FINISH);
> @@ -2578,9 +2578,6 @@ static void ram_save_pending(QEMUFile *f, void *opaque, uint64_t max_size,
>          qemu_mutex_lock_iothread();
>          rcu_read_lock();
>          migration_bitmap_sync(rs);
> -        if (migrate_use_multifd()) {
> -            rs->multifd_needs_flush = true;
> -        }
>          rcu_read_unlock();
>          qemu_mutex_unlock_iothread();
>          remaining_size = rs->migration_dirty_pages * TARGET_PAGE_SIZE;
> 
> three fewer lines, you win.  We already check elsewhere that
> postcopy & multifd are not enabled at the same time.

I got the point. I would slightly prefer the new way to have only one
single place to set multifd_needs_flush (it would be nice to have some
comments like [1] there), but I'm also fine if you prefer the old one.

Thanks,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 93+ messages in thread

end of thread, other threads:[~2017-08-10  6:49 UTC | newest]

Thread overview: 93+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 01/17] migrate: Add gboolean return type to migrate_channel_process_incoming Juan Quintela
2017-07-19 15:01   ` Dr. David Alan Gilbert
2017-07-20  7:00     ` Peter Xu
2017-07-20  8:47       ` Daniel P. Berrange
2017-07-24 10:18         ` Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 02/17] migration: Create migration_ioc_process_incoming() Juan Quintela
2017-07-19 13:38   ` Daniel P. Berrange
2017-07-24 11:09     ` Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 03/17] qio: Create new qio_channel_{readv, writev}_all Juan Quintela
2017-07-19 13:44   ` Daniel P. Berrange
2017-08-08  8:40     ` Juan Quintela
2017-08-08  9:25       ` Daniel P. Berrange
2017-07-19 15:42   ` Dr. David Alan Gilbert
2017-07-19 15:43     ` Daniel P. Berrange
2017-07-19 16:04       ` Dr. David Alan Gilbert
2017-07-19 16:08         ` Daniel P. Berrange
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 04/17] migration: Add multifd capability Juan Quintela
2017-07-19 15:44   ` Dr. David Alan Gilbert
2017-08-08  8:42     ` Juan Quintela
2017-07-19 17:14   ` Eric Blake
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 05/17] migration: Create x-multifd-threads parameter Juan Quintela
2017-07-19 16:00   ` Dr. David Alan Gilbert
2017-08-08  8:46     ` Juan Quintela
2017-08-08  9:44       ` Dr. David Alan Gilbert
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 06/17] migration: Create x-multifd-group parameter Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 07/17] migration: Create multifd migration threads Juan Quintela
2017-07-19 16:49   ` Dr. David Alan Gilbert
2017-08-08  8:58     ` Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 08/17] migration: Split migration_fd_process_incomming Juan Quintela
2017-07-19 17:08   ` Dr. David Alan Gilbert
2017-07-21 12:39     ` Eric Blake
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work Juan Quintela
2017-07-19 13:56   ` Daniel P. Berrange
2017-07-19 17:35   ` Dr. David Alan Gilbert
2017-08-08  9:35     ` Juan Quintela
2017-08-08  9:54       ` Dr. David Alan Gilbert
2017-07-20  9:34   ` Peter Xu
2017-08-08  9:19     ` Juan Quintela
2017-08-09  8:08       ` Peter Xu
2017-08-09 11:12         ` Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page Juan Quintela
2017-07-19 19:02   ` Dr. David Alan Gilbert
2017-07-20  8:10     ` Peter Xu
2017-07-20 11:48       ` Dr. David Alan Gilbert
2017-08-08 15:58         ` Juan Quintela
2017-08-08 16:04       ` Juan Quintela
2017-08-09  7:42         ` Peter Xu
2017-08-08 15:56     ` Juan Quintela
2017-08-08 16:30       ` Dr. David Alan Gilbert
2017-08-08 18:02         ` Juan Quintela
2017-08-08 19:14           ` Dr. David Alan Gilbert
2017-08-09 16:48             ` Paolo Bonzini
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time Juan Quintela
2017-07-19 13:58   ` Daniel P. Berrange
2017-08-08 11:55     ` Juan Quintela
2017-07-20  9:44   ` Dr. David Alan Gilbert
2017-08-08 12:11     ` Juan Quintela
2017-07-20  9:49   ` Peter Xu
2017-07-20 10:09     ` Peter Xu
2017-08-08 16:06     ` Juan Quintela
2017-08-09  7:48       ` Peter Xu
2017-08-09  8:05         ` Juan Quintela
2017-08-09  8:12           ` Peter Xu
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 12/17] migration: Send the fd number which we are going to use for this page Juan Quintela
2017-07-20  9:58   ` Dr. David Alan Gilbert
2017-08-09 16:48   ` Paolo Bonzini
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 13/17] migration: Create thread infrastructure for multifd recv side Juan Quintela
2017-07-20 10:22   ` Peter Xu
2017-08-08 11:41     ` Juan Quintela
2017-08-09  5:53       ` Peter Xu
2017-07-20 10:29   ` Dr. David Alan Gilbert
2017-08-08 11:51     ` Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 14/17] migration: Delay the start of reception on main channel Juan Quintela
2017-07-20 10:56   ` Dr. David Alan Gilbert
2017-08-08 11:29     ` Juan Quintela
2017-07-20 11:10   ` Peter Xu
2017-08-08 11:30     ` Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 15/17] migration: Test new fd infrastructure Juan Quintela
2017-07-20 11:20   ` Dr. David Alan Gilbert
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 16/17] migration: Transfer pages over new channels Juan Quintela
2017-07-20 11:31   ` Dr. David Alan Gilbert
2017-08-08 11:13     ` Juan Quintela
2017-08-08 11:32       ` Dr. David Alan Gilbert
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue Juan Quintela
2017-07-20 11:45   ` Dr. David Alan Gilbert
2017-08-08 10:43     ` Juan Quintela
2017-08-08 11:25       ` Dr. David Alan Gilbert
2017-07-21  2:40   ` Peter Xu
2017-08-08 11:40     ` Juan Quintela
2017-08-10  6:49       ` Peter Xu
2017-07-21  6:03   ` Peter Xu
2017-07-21 10:53     ` Juan Quintela
