All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages
@ 2013-03-21  9:09 Orit Wasserman
  2013-03-21  9:09 ` [Qemu-devel] [RFC 01/12] Add iov_writev to use writev to send iovec (also for files) Orit Wasserman
                   ` (13 more replies)
  0 siblings, 14 replies; 35+ messages in thread
From: Orit Wasserman @ 2013-03-21  9:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Orit Wasserman, pbonzini, mst, chegu_vinod, quintela

In migration all data is copied to a static buffer in QEMUFile,
this hurts our network bandwidth and CPU usage especially with large guests.
We switched to iovec for storing different buffers to send (even a byte field is
considered as a buffer) and use writev to send the iovec.
writev was chosen (as apposed to sendmsg) because it supprts non socket fds.
  
Guest memory pages are not copied by calling a new function 
qemu_put_buffer_no_copy.
The page header data and device state data are still copied into the static
buffer. This data consists of a lot of bytes and integer fields and the static
buffer is used to store it during batching.
Another improvement is changing qemu_putbe64/32/16 to create a single
buffer instead of several byte sized buffer.

Orit Wasserman (12):
  Add iov_writev to use writev to send iovec (also for files)
  Add QemuFileWritevBuffer QemuFileOps
  Add socket_writev_buffer function
  Add stdio_writev_buffer function
  Add block_writev_buffer function
  Update bytes_xfer in qemu_put_byte
  Store the data to send also in iovec
  Use writev ops instead of put_buffer ops
  More optimized qemu_put_be64/32/16
  Add qemu_put_buffer_no_copy
  Use qemu_put_buffer_no_copy for guest memory pages
  Bye Bye put_buffer

 arch_init.c                   |   2 +-
 include/migration/qemu-file.h |  20 ++++---
 include/qemu/iov.h            |  12 ++++
 savevm.c                      | 130 +++++++++++++++++++++++++-----------------
 util/iov.c                    |  36 ++++++++++++
 5 files changed, 139 insertions(+), 61 deletions(-)

-- 
1.7.11.7

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC 01/12] Add iov_writev to use writev to send iovec (also for files)
  2013-03-21  9:09 [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages Orit Wasserman
@ 2013-03-21  9:09 ` Orit Wasserman
  2013-03-21  9:23   ` Paolo Bonzini
  2013-03-21  9:09 ` [Qemu-devel] [RFC 02/12] Add QemuFileWritevBuffer QemuFileOps Orit Wasserman
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Orit Wasserman @ 2013-03-21  9:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Orit Wasserman, pbonzini, mst, chegu_vinod, quintela

We use writev because sendmsg can only be used on socket fds.

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 include/qemu/iov.h | 12 ++++++++++++
 util/iov.c         | 36 ++++++++++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+)

diff --git a/include/qemu/iov.h b/include/qemu/iov.h
index 68d25f2..e26b7ba 100644
--- a/include/qemu/iov.h
+++ b/include/qemu/iov.h
@@ -82,6 +82,18 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
 #define iov_send(sockfd, iov, iov_cnt, offset, bytes) \
   iov_send_recv(sockfd, iov, iov_cnt, offset, bytes, true)
 
+/*
+ * writes iovcnt buffers of data described by iov to the file associated with
+ * the file descriptor fd.
+ *
+ * @fd - file descriptor
+ * @iov - iov to write
+ * @invcnt - number of buffers to write
+ *
+ * @return - number of bytes return or -1 on error
+ */
+ssize_t iov_writev(int fd, struct iovec *iov, unsigned iov_cnt);
+
 /**
  * Produce a text hexdump of iovec `iov' with `iov_cnt' number of elements
  * in file `fp', prefixing each line with `prefix' and processing not more
diff --git a/util/iov.c b/util/iov.c
index 9dae318..bcd6e97 100644
--- a/util/iov.c
+++ b/util/iov.c
@@ -197,6 +197,42 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
     return ret;
 }
 
+ssize_t iov_writev(int fd, struct iovec *iov, unsigned iov_cnt)
+{
+#if defined CONFIG_IOVEC && defined CONFIG_POSIX
+    ssize_t ret;
+
+    do {
+        ret = writev(fd, iov, iov_cnt);
+    } while (ret < 0 && errno == EINTR);
+
+    return ret;
+#else
+    /* else send piece-by-piece */
+    /*XXX Note: windows has WSASend() and WSARecv() */
+    unsigned i = 0;
+    ssize_t ret = 0;
+    while (i < iov_cnt) {
+        ssize_t r = send(fd, iov[i].iov_base, iov[i].iov_len, 0);
+        if (r > 0) {
+            ret += r;
+        } else if (!r) {
+            break;
+        } else if (errno == EINTR) {
+            continue;
+        } else {
+            /* else it is some "other" error,
+             * only return if there was no data processed. */
+            if (ret == 0) {
+                ret = -1;
+            }
+            break;
+        }
+        i++;
+    }
+    return ret;
+#endif
+}
 
 void iov_hexdump(const struct iovec *iov, const unsigned int iov_cnt,
                  FILE *fp, const char *prefix, size_t limit)
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC 02/12] Add QemuFileWritevBuffer QemuFileOps
  2013-03-21  9:09 [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages Orit Wasserman
  2013-03-21  9:09 ` [Qemu-devel] [RFC 01/12] Add iov_writev to use writev to send iovec (also for files) Orit Wasserman
@ 2013-03-21  9:09 ` Orit Wasserman
  2013-03-21  9:09 ` [Qemu-devel] [RFC 03/12] Add socket_writev_buffer function Orit Wasserman
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 35+ messages in thread
From: Orit Wasserman @ 2013-03-21  9:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Orit Wasserman, pbonzini, mst, chegu_vinod, quintela

This will allow us to write an iovec

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 include/migration/qemu-file.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h
index df81261..8d3da9b 100644
--- a/include/migration/qemu-file.h
+++ b/include/migration/qemu-file.h
@@ -51,11 +51,18 @@ typedef int (QEMUFileCloseFunc)(void *opaque);
  */
 typedef int (QEMUFileGetFD)(void *opaque);
 
+/*
+ * This function writes an iovec to file.
+ */
+typedef int (QEMUFileWritevBufferFunc)(void *opaque, struct iovec *iov,
+                                       int iovcnt);
+
 typedef struct QEMUFileOps {
     QEMUFilePutBufferFunc *put_buffer;
     QEMUFileGetBufferFunc *get_buffer;
     QEMUFileCloseFunc *close;
     QEMUFileGetFD *get_fd;
+    QEMUFileWritevBufferFunc *writev_buffer;
 } QEMUFileOps;
 
 QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops);
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC 03/12] Add socket_writev_buffer function
  2013-03-21  9:09 [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages Orit Wasserman
  2013-03-21  9:09 ` [Qemu-devel] [RFC 01/12] Add iov_writev to use writev to send iovec (also for files) Orit Wasserman
  2013-03-21  9:09 ` [Qemu-devel] [RFC 02/12] Add QemuFileWritevBuffer QemuFileOps Orit Wasserman
@ 2013-03-21  9:09 ` Orit Wasserman
  2013-03-21  9:18   ` Paolo Bonzini
  2013-03-21  9:09 ` [Qemu-devel] [RFC 04/12] Add stdio_writev_buffer function Orit Wasserman
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Orit Wasserman @ 2013-03-21  9:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Orit Wasserman, pbonzini, mst, chegu_vinod, quintela

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 savevm.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/savevm.c b/savevm.c
index 35c8d1e..41dc87f 100644
--- a/savevm.c
+++ b/savevm.c
@@ -39,6 +39,7 @@
 #include "qmp-commands.h"
 #include "trace.h"
 #include "qemu/bitops.h"
+#include "qemu/iov.h"
 
 #define SELF_ANNOUNCE_ROUNDS 5
 
@@ -171,6 +172,18 @@ static void coroutine_fn yield_until_fd_readable(int fd)
     qemu_coroutine_yield();
 }
 
+static int socket_writev_buffer(void *opaque, struct iovec *iov, int iovcnt)
+{
+    QEMUFileSocket *s = opaque;
+    ssize_t len;
+
+    len = iov_writev(s->fd, iov, iovcnt);
+    if (len < iov_size(iov, iovcnt)) {
+        len = -socket_error();
+    }
+    return len;
+}
+
 static int socket_get_fd(void *opaque)
 {
     QEMUFileSocket *s = opaque;
@@ -387,6 +400,7 @@ static const QEMUFileOps socket_read_ops = {
 static const QEMUFileOps socket_write_ops = {
     .get_fd =     socket_get_fd,
     .put_buffer = socket_put_buffer,
+    .writev_buffer = socket_writev_buffer,
     .close =      socket_close
 };
 
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC 04/12] Add stdio_writev_buffer function
  2013-03-21  9:09 [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages Orit Wasserman
                   ` (2 preceding siblings ...)
  2013-03-21  9:09 ` [Qemu-devel] [RFC 03/12] Add socket_writev_buffer function Orit Wasserman
@ 2013-03-21  9:09 ` Orit Wasserman
  2013-03-21  9:28   ` Paolo Bonzini
  2013-03-21  9:09 ` [Qemu-devel] [RFC 05/12] Add block_writev_buffer function Orit Wasserman
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Orit Wasserman @ 2013-03-21  9:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Orit Wasserman, pbonzini, mst, chegu_vinod, quintela

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 savevm.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/savevm.c b/savevm.c
index 41dc87f..fa07a86 100644
--- a/savevm.c
+++ b/savevm.c
@@ -241,6 +241,11 @@ static int stdio_get_fd(void *opaque)
     return fileno(s->stdio_file);
 }
 
+static int stdio_writev_buffer(void *opaque, struct iovec *iov, int iovcnt)
+{
+    return iov_writev(stdio_get_fd(opaque), iov, iovcnt);
+}
+
 static int stdio_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, int size)
 {
     QEMUFileStdio *s = opaque;
@@ -321,6 +326,7 @@ static const QEMUFileOps stdio_pipe_read_ops = {
 static const QEMUFileOps stdio_pipe_write_ops = {
     .get_fd =     stdio_get_fd,
     .put_buffer = stdio_put_buffer,
+    .writev_buffer = stdio_writev_buffer,
     .close =      stdio_pclose
 };
 
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC 05/12] Add block_writev_buffer function
  2013-03-21  9:09 [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages Orit Wasserman
                   ` (3 preceding siblings ...)
  2013-03-21  9:09 ` [Qemu-devel] [RFC 04/12] Add stdio_writev_buffer function Orit Wasserman
@ 2013-03-21  9:09 ` Orit Wasserman
  2013-03-21  9:09 ` [Qemu-devel] [RFC 06/12] Update bytes_xfer in qemu_put_byte Orit Wasserman
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 35+ messages in thread
From: Orit Wasserman @ 2013-03-21  9:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Orit Wasserman, pbonzini, mst, chegu_vinod, quintela

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 savevm.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/savevm.c b/savevm.c
index fa07a86..2c8fb03 100644
--- a/savevm.c
+++ b/savevm.c
@@ -459,6 +459,18 @@ fail:
     return NULL;
 }
 
+static int block_writev_buffer(void *opaque, struct iovec *iov, int iovcnt)
+{
+    int i;
+    int size = 0;
+
+    for (i = 0; i < iovcnt; i++) {
+        bdrv_save_vmstate(opaque, iov[i].iov_base, 0, iov[i].iov_len);
+        size += iov[i].iov_len;
+    }
+    return size;
+}
+
 static int block_put_buffer(void *opaque, const uint8_t *buf,
                            int64_t pos, int size)
 {
@@ -483,6 +495,7 @@ static const QEMUFileOps bdrv_read_ops = {
 
 static const QEMUFileOps bdrv_write_ops = {
     .put_buffer = block_put_buffer,
+    .writev_buffer = block_writev_buffer,
     .close =      bdrv_fclose
 };
 
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC 06/12] Update bytes_xfer in qemu_put_byte
  2013-03-21  9:09 [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages Orit Wasserman
                   ` (4 preceding siblings ...)
  2013-03-21  9:09 ` [Qemu-devel] [RFC 05/12] Add block_writev_buffer function Orit Wasserman
@ 2013-03-21  9:09 ` Orit Wasserman
  2013-03-21  9:09 ` [Qemu-devel] [RFC 07/12] Store the data to send also in iovec Orit Wasserman
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 35+ messages in thread
From: Orit Wasserman @ 2013-03-21  9:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Orit Wasserman, pbonzini, mst, chegu_vinod, quintela

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 savevm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/savevm.c b/savevm.c
index 2c8fb03..ec64533 100644
--- a/savevm.c
+++ b/savevm.c
@@ -666,6 +666,8 @@ void qemu_put_byte(QEMUFile *f, int v)
 
     f->buf[f->buf_index++] = v;
     f->is_write = 1;
+    f->bytes_xfer += 1;
+
     if (f->buf_index >= IO_BUF_SIZE) {
         qemu_fflush(f);
     }
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC 07/12] Store the data to send also in iovec
  2013-03-21  9:09 [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages Orit Wasserman
                   ` (5 preceding siblings ...)
  2013-03-21  9:09 ` [Qemu-devel] [RFC 06/12] Update bytes_xfer in qemu_put_byte Orit Wasserman
@ 2013-03-21  9:09 ` Orit Wasserman
  2013-03-21  9:56   ` Paolo Bonzini
  2013-03-21  9:09 ` [Qemu-devel] [RFC 08/12] Use writev ops instead of put_buffer ops Orit Wasserman
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Orit Wasserman @ 2013-03-21  9:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Orit Wasserman, pbonzini, mst, chegu_vinod, quintela

All data is still copied into the static buffer.

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 savevm.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/savevm.c b/savevm.c
index ec64533..d5834ca 100644
--- a/savevm.c
+++ b/savevm.c
@@ -114,6 +114,7 @@ void qemu_announce_self(void)
 /* savevm/loadvm support */
 
 #define IO_BUF_SIZE 32768
+#define MAX_IOV_SIZE 64
 
 struct QEMUFile {
     const QEMUFileOps *ops;
@@ -129,6 +130,9 @@ struct QEMUFile {
     int buf_size; /* 0 when writing */
     uint8_t buf[IO_BUF_SIZE];
 
+    struct iovec iov[MAX_IOV_SIZE];
+    unsigned int iovcnt;
+
     int last_error;
 };
 
@@ -546,6 +550,7 @@ static void qemu_fflush(QEMUFile *f)
             f->pos += f->buf_index;
         }
         f->buf_index = 0;
+        f->iovcnt = 0;
     }
     if (ret < 0) {
         qemu_file_set_error(f, ret);
@@ -638,12 +643,14 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size)
         if (l > size)
             l = size;
         memcpy(f->buf + f->buf_index, buf, l);
+        f->iov[f->iovcnt].iov_base = f->buf + f->buf_index;
+        f->iov[f->iovcnt++].iov_len = l;
         f->is_write = 1;
         f->buf_index += l;
         f->bytes_xfer += l;
         buf += l;
         size -= l;
-        if (f->buf_index >= IO_BUF_SIZE) {
+        if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
             qemu_fflush(f);
             if (qemu_file_get_error(f)) {
                 break;
@@ -667,8 +674,10 @@ void qemu_put_byte(QEMUFile *f, int v)
     f->buf[f->buf_index++] = v;
     f->is_write = 1;
     f->bytes_xfer += 1;
+    f->iov[f->iovcnt].iov_base = f->buf + (f->buf_index - 1);
+    f->iov[f->iovcnt++].iov_len = 1;
 
-    if (f->buf_index >= IO_BUF_SIZE) {
+    if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
         qemu_fflush(f);
     }
 }
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC 08/12] Use writev ops instead of put_buffer ops
  2013-03-21  9:09 [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages Orit Wasserman
                   ` (6 preceding siblings ...)
  2013-03-21  9:09 ` [Qemu-devel] [RFC 07/12] Store the data to send also in iovec Orit Wasserman
@ 2013-03-21  9:09 ` Orit Wasserman
  2013-03-21  9:09 ` [Qemu-devel] [RFC 09/12] More optimized qemu_put_be64/32/16 Orit Wasserman
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 35+ messages in thread
From: Orit Wasserman @ 2013-03-21  9:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Orit Wasserman, pbonzini, mst, chegu_vinod, quintela

Update qemu_fflush and stdio_close to use writev ops

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 savevm.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/savevm.c b/savevm.c
index d5834ca..9506a20 100644
--- a/savevm.c
+++ b/savevm.c
@@ -297,7 +297,7 @@ static int stdio_fclose(void *opaque)
     QEMUFileStdio *s = opaque;
     int ret = 0;
 
-    if (s->file->ops->put_buffer) {
+    if (s->file->ops->writev_buffer) {
         int fd = fileno(s->stdio_file);
         struct stat st;
 
@@ -534,18 +534,19 @@ static void qemu_file_set_error(QEMUFile *f, int ret)
     }
 }
 
-/** Flushes QEMUFile buffer
+/**
+ * Flushes QEMUFile iovec
  *
  */
 static void qemu_fflush(QEMUFile *f)
 {
     int ret = 0;
 
-    if (!f->ops->put_buffer) {
+    if (!f->ops->writev_buffer) {
         return;
     }
-    if (f->is_write && f->buf_index > 0) {
-        ret = f->ops->put_buffer(f->opaque, f->buf, f->pos, f->buf_index);
+    if (f->is_write && f->iovcnt > 0) {
+        ret = f->ops->writev_buffer(f->opaque, f->iov, f->iovcnt);
         if (ret >= 0) {
             f->pos += f->buf_index;
         }
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC 09/12] More optimized qemu_put_be64/32/16
  2013-03-21  9:09 [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages Orit Wasserman
                   ` (7 preceding siblings ...)
  2013-03-21  9:09 ` [Qemu-devel] [RFC 08/12] Use writev ops instead of put_buffer ops Orit Wasserman
@ 2013-03-21  9:09 ` Orit Wasserman
  2013-03-21  9:09 ` [Qemu-devel] [RFC 10/12] Add qemu_put_buffer_no_copy Orit Wasserman
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 35+ messages in thread
From: Orit Wasserman @ 2013-03-21  9:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Orit Wasserman, pbonzini, mst, chegu_vinod, quintela

qemu_put_be functions used qemu_put_byte this caused lots of 1 bytes buffers
in the iovec.
we move to use cpu_put_be64/32/16wu and put a single buffer per call.

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 savevm.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/savevm.c b/savevm.c
index 9506a20..40d96f4 100644
--- a/savevm.c
+++ b/savevm.c
@@ -799,22 +799,26 @@ void qemu_file_reset_rate_limit(QEMUFile *f)
 
 void qemu_put_be16(QEMUFile *f, unsigned int v)
 {
-    qemu_put_byte(f, v >> 8);
-    qemu_put_byte(f, v);
+    uint16_t p;
+    cpu_to_be16wu(&p, v);
+
+    qemu_put_buffer(f, (uint8_t *)&p, sizeof(p));
 }
 
 void qemu_put_be32(QEMUFile *f, unsigned int v)
 {
-    qemu_put_byte(f, v >> 24);
-    qemu_put_byte(f, v >> 16);
-    qemu_put_byte(f, v >> 8);
-    qemu_put_byte(f, v);
+    uint32_t p;
+    cpu_to_be32wu(&p, v);
+
+    qemu_put_buffer(f, (uint8_t *)&p, sizeof(p));
 }
 
 void qemu_put_be64(QEMUFile *f, uint64_t v)
 {
-    qemu_put_be32(f, v >> 32);
-    qemu_put_be32(f, v);
+    uint64_t p;
+    cpu_to_be64wu(&p, v);
+
+    qemu_put_buffer(f, (uint8_t *)&p, sizeof(p));
 }
 
 unsigned int qemu_get_be16(QEMUFile *f)
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC 10/12] Add qemu_put_buffer_no_copy
  2013-03-21  9:09 [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages Orit Wasserman
                   ` (8 preceding siblings ...)
  2013-03-21  9:09 ` [Qemu-devel] [RFC 09/12] More optimized qemu_put_be64/32/16 Orit Wasserman
@ 2013-03-21  9:09 ` Orit Wasserman
  2013-03-21  9:25   ` Paolo Bonzini
  2013-03-23 16:27   ` Michael R. Hines
  2013-03-21  9:09 ` [Qemu-devel] [RFC 11/12] Use qemu_put_buffer_no_copy for guest memory pages Orit Wasserman
                   ` (3 subsequent siblings)
  13 siblings, 2 replies; 35+ messages in thread
From: Orit Wasserman @ 2013-03-21  9:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Orit Wasserman, pbonzini, mst, chegu_vinod, quintela

This allow us to add a buffer to the iovec to send without copying it
into the static buffer.

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 include/migration/qemu-file.h |  5 +++++
 savevm.c                      | 42 ++++++++++++++++++++++++------------------
 2 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h
index 8d3da9b..5168be2 100644
--- a/include/migration/qemu-file.h
+++ b/include/migration/qemu-file.h
@@ -75,6 +75,11 @@ int qemu_fclose(QEMUFile *f);
 int64_t qemu_ftell(QEMUFile *f);
 void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size);
 void qemu_put_byte(QEMUFile *f, int v);
+/*
+ * put_buffer without copying the buffer.
+ * The buffer should be available till it is sent.
+ */
+void qemu_put_buffer_no_copy(QEMUFile *f, const uint8_t *buf, int size);
 
 static inline void qemu_put_ubyte(QEMUFile *f, unsigned int v)
 {
diff --git a/savevm.c b/savevm.c
index 40d96f4..32a506e 100644
--- a/savevm.c
+++ b/savevm.c
@@ -629,6 +629,22 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size)
 {
     int l;
 
+    while (size > 0) {
+        l = IO_BUF_SIZE - f->buf_index;
+        if (l > size) {
+            l = size;
+        }
+        memcpy(f->buf + f->buf_index, buf, l);
+        f->buf_index += l;
+        f->is_write = 1;
+        qemu_put_buffer_no_copy(f, f->buf + (f->buf_index - l), l);
+        buf += l;
+        size -= l;
+    }
+}
+
+void qemu_put_buffer_no_copy(QEMUFile *f, const uint8_t *buf, int size)
+{
     if (f->last_error) {
         return;
     }
@@ -639,24 +655,14 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size)
         abort();
     }
 
-    while (size > 0) {
-        l = IO_BUF_SIZE - f->buf_index;
-        if (l > size)
-            l = size;
-        memcpy(f->buf + f->buf_index, buf, l);
-        f->iov[f->iovcnt].iov_base = f->buf + f->buf_index;
-        f->iov[f->iovcnt++].iov_len = l;
-        f->is_write = 1;
-        f->buf_index += l;
-        f->bytes_xfer += l;
-        buf += l;
-        size -= l;
-        if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
-            qemu_fflush(f);
-            if (qemu_file_get_error(f)) {
-                break;
-            }
-        }
+    f->iov[f->iovcnt].iov_base = (uint8_t *)buf;
+    f->iov[f->iovcnt++].iov_len = size;
+
+    f->is_write = 1;
+    f->bytes_xfer += size;
+
+    if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
+        qemu_fflush(f);
     }
 }
 
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC 11/12] Use qemu_put_buffer_no_copy for guest memory pages
  2013-03-21  9:09 [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages Orit Wasserman
                   ` (9 preceding siblings ...)
  2013-03-21  9:09 ` [Qemu-devel] [RFC 10/12] Add qemu_put_buffer_no_copy Orit Wasserman
@ 2013-03-21  9:09 ` Orit Wasserman
  2013-03-21  9:09 ` [Qemu-devel] [RFC 12/12] Bye Bye put_buffer Orit Wasserman
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 35+ messages in thread
From: Orit Wasserman @ 2013-03-21  9:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Orit Wasserman, pbonzini, mst, chegu_vinod, quintela

This will remove an unneeded copy of guest memory pages.
For the page header and device state we still copy the data to the
static buffer the other option is to allocate the memory on demand
which is more expensive.

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 arch_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch_init.c b/arch_init.c
index 98e2bc6..27b53eb 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -481,7 +481,7 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
             /* XBZRLE overflow or normal page */
             if (bytes_sent == -1) {
                 bytes_sent = save_block_hdr(f, block, offset, cont, RAM_SAVE_FLAG_PAGE);
-                qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
+                qemu_put_buffer_no_copy(f, p, TARGET_PAGE_SIZE);
                 bytes_sent += TARGET_PAGE_SIZE;
                 acct_info.norm_pages++;
             }
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC 12/12] Bye Bye put_buffer
  2013-03-21  9:09 [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages Orit Wasserman
                   ` (10 preceding siblings ...)
  2013-03-21  9:09 ` [Qemu-devel] [RFC 11/12] Use qemu_put_buffer_no_copy for guest memory pages Orit Wasserman
@ 2013-03-21  9:09 ` Orit Wasserman
  2013-03-21  9:29 ` [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages Paolo Bonzini
  2013-03-21  9:48 ` Michael S. Tsirkin
  13 siblings, 0 replies; 35+ messages in thread
From: Orit Wasserman @ 2013-03-21  9:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Orit Wasserman, pbonzini, mst, chegu_vinod, quintela

Removed all unused put_buffer code

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 include/migration/qemu-file.h |  8 --------
 savevm.c                      | 29 -----------------------------
 2 files changed, 37 deletions(-)

diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h
index 5168be2..c76c61b 100644
--- a/include/migration/qemu-file.h
+++ b/include/migration/qemu-file.h
@@ -24,13 +24,6 @@
 #ifndef QEMU_FILE_H
 #define QEMU_FILE_H 1
 
-/* This function writes a chunk of data to a file at the given position.
- * The pos argument can be ignored if the file is only being used for
- * streaming.  The handler should try to write all of the data it can.
- */
-typedef int (QEMUFilePutBufferFunc)(void *opaque, const uint8_t *buf,
-                                    int64_t pos, int size);
-
 /* Read a chunk of data from a file at the given position.  The pos argument
  * can be ignored if the file is only be used for streaming.  The number of
  * bytes actually read should be returned.
@@ -58,7 +51,6 @@ typedef int (QEMUFileWritevBufferFunc)(void *opaque, struct iovec *iov,
                                        int iovcnt);
 
 typedef struct QEMUFileOps {
-    QEMUFilePutBufferFunc *put_buffer;
     QEMUFileGetBufferFunc *get_buffer;
     QEMUFileCloseFunc *close;
     QEMUFileGetFD *get_fd;
diff --git a/savevm.c b/savevm.c
index 32a506e..b8d94cf 100644
--- a/savevm.c
+++ b/savevm.c
@@ -218,18 +218,6 @@ static int socket_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
     return len;
 }
 
-static int socket_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, int size)
-{
-    QEMUFileSocket *s = opaque;
-    ssize_t len;
-
-    len = qemu_send_full(s->fd, buf, size, 0);
-    if (len < size) {
-        len = -socket_error();
-    }
-    return len;
-}
-
 static int socket_close(void *opaque)
 {
     QEMUFileSocket *s = opaque;
@@ -250,12 +238,6 @@ static int stdio_writev_buffer(void *opaque, struct iovec *iov, int iovcnt)
     return iov_writev(stdio_get_fd(opaque), iov, iovcnt);
 }
 
-static int stdio_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, int size)
-{
-    QEMUFileStdio *s = opaque;
-    return fwrite(buf, 1, size, s->stdio_file);
-}
-
 static int stdio_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
 {
     QEMUFileStdio *s = opaque;
@@ -329,7 +311,6 @@ static const QEMUFileOps stdio_pipe_read_ops = {
 
 static const QEMUFileOps stdio_pipe_write_ops = {
     .get_fd =     stdio_get_fd,
-    .put_buffer = stdio_put_buffer,
     .writev_buffer = stdio_writev_buffer,
     .close =      stdio_pclose
 };
@@ -369,7 +350,6 @@ static const QEMUFileOps stdio_file_read_ops = {
 
 static const QEMUFileOps stdio_file_write_ops = {
     .get_fd =     stdio_get_fd,
-    .put_buffer = stdio_put_buffer,
     .close =      stdio_fclose
 };
 
@@ -409,7 +389,6 @@ static const QEMUFileOps socket_read_ops = {
 
 static const QEMUFileOps socket_write_ops = {
     .get_fd =     socket_get_fd,
-    .put_buffer = socket_put_buffer,
     .writev_buffer = socket_writev_buffer,
     .close =      socket_close
 };
@@ -475,13 +454,6 @@ static int block_writev_buffer(void *opaque, struct iovec *iov, int iovcnt)
     return size;
 }
 
-static int block_put_buffer(void *opaque, const uint8_t *buf,
-                           int64_t pos, int size)
-{
-    bdrv_save_vmstate(opaque, buf, pos, size);
-    return size;
-}
-
 static int block_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
 {
     return bdrv_load_vmstate(opaque, buf, pos, size);
@@ -498,7 +470,6 @@ static const QEMUFileOps bdrv_read_ops = {
 };
 
 static const QEMUFileOps bdrv_write_ops = {
-    .put_buffer = block_put_buffer,
     .writev_buffer = block_writev_buffer,
     .close =      bdrv_fclose
 };
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 03/12] Add socket_writev_buffer function
  2013-03-21  9:09 ` [Qemu-devel] [RFC 03/12] Add socket_writev_buffer function Orit Wasserman
@ 2013-03-21  9:18   ` Paolo Bonzini
  2013-03-21  9:47     ` Orit Wasserman
  0 siblings, 1 reply; 35+ messages in thread
From: Paolo Bonzini @ 2013-03-21  9:18 UTC (permalink / raw)
  To: Orit Wasserman; +Cc: mst, chegu_vinod, qemu-devel, quintela

Il 21/03/2013 10:09, Orit Wasserman ha scritto:
> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
> ---
>  savevm.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/savevm.c b/savevm.c
> index 35c8d1e..41dc87f 100644
> --- a/savevm.c
> +++ b/savevm.c
> @@ -39,6 +39,7 @@
>  #include "qmp-commands.h"
>  #include "trace.h"
>  #include "qemu/bitops.h"
> +#include "qemu/iov.h"
>  
>  #define SELF_ANNOUNCE_ROUNDS 5
>  
> @@ -171,6 +172,18 @@ static void coroutine_fn yield_until_fd_readable(int fd)
>      qemu_coroutine_yield();
>  }
>  
> +static int socket_writev_buffer(void *opaque, struct iovec *iov, int iovcnt)
> +{
> +    QEMUFileSocket *s = opaque;
> +    ssize_t len;
> +
> +    len = iov_writev(s->fd, iov, iovcnt);
> +    if (len < iov_size(iov, iovcnt)) {
> +        len = -socket_error();
> +    }
> +    return len;
> +}

This doesn't work on Windows, you need iov_send:

ssize_t iov_send(int sockfd, struct iovec *iov, unsigned iov_cnt,
                 size_t offset, size_t bytes);

Paolo

+
>  static int socket_get_fd(void *opaque)
>  {
>      QEMUFileSocket *s = opaque;
> @@ -387,6 +400,7 @@ static const QEMUFileOps socket_read_ops = {
>  static const QEMUFileOps socket_write_ops = {
>      .get_fd =     socket_get_fd,
>      .put_buffer = socket_put_buffer,
> +    .writev_buffer = socket_writev_buffer,
>      .close =      socket_close
>  };
>  
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 01/12] Add iov_writev to use writev to send iovec (also for files)
  2013-03-21  9:09 ` [Qemu-devel] [RFC 01/12] Add iov_writev to use writev to send iovec (also for files) Orit Wasserman
@ 2013-03-21  9:23   ` Paolo Bonzini
  0 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2013-03-21  9:23 UTC (permalink / raw)
  To: Orit Wasserman; +Cc: mst, chegu_vinod, qemu-devel, quintela

Il 21/03/2013 10:09, Orit Wasserman ha scritto:
> We use writev because sendmsg can only be used on socket fds.
> 
> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
> ---
>  include/qemu/iov.h | 12 ++++++++++++
>  util/iov.c         | 36 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 48 insertions(+)
> 
> diff --git a/include/qemu/iov.h b/include/qemu/iov.h
> index 68d25f2..e26b7ba 100644
> --- a/include/qemu/iov.h
> +++ b/include/qemu/iov.h
> @@ -82,6 +82,18 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
>  #define iov_send(sockfd, iov, iov_cnt, offset, bytes) \
>    iov_send_recv(sockfd, iov, iov_cnt, offset, bytes, true)
>  
> +/*
> + * writes iovcnt buffers of data described by iov to the file associated with
> + * the file descriptor fd.
> + *
> + * @fd - file descriptor
> + * @iov - iov to write
> + * @invcnt - number of buffers to write
> + *
> + * @return - number of bytes return or -1 on error
> + */
> +ssize_t iov_writev(int fd, struct iovec *iov, unsigned iov_cnt);
> +
>  /**
>   * Produce a text hexdump of iovec `iov' with `iov_cnt' number of elements
>   * in file `fp', prefixing each line with `prefix' and processing not more
> diff --git a/util/iov.c b/util/iov.c
> index 9dae318..bcd6e97 100644
> --- a/util/iov.c
> +++ b/util/iov.c
> @@ -197,6 +197,42 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
>      return ret;
>  }
>  
> +ssize_t iov_writev(int fd, struct iovec *iov, unsigned iov_cnt)
> +{
> +#if defined CONFIG_IOVEC && defined CONFIG_POSIX
> +    ssize_t ret;
> +
> +    do {
> +        ret = writev(fd, iov, iov_cnt);
> +    } while (ret < 0 && errno == EINTR);
> +
> +    return ret;
> +#else
> +    /* else send piece-by-piece */
> +    /*XXX Note: windows has WSASend() and WSARecv() */
> +    unsigned i = 0;
> +    ssize_t ret = 0;
> +    while (i < iov_cnt) {
> +        ssize_t r = send(fd, iov[i].iov_base, iov[i].iov_len, 0);
> +        if (r > 0) {
> +            ret += r;
> +        } else if (!r) {
> +            break;
> +        } else if (errno == EINTR) {
> +            continue;
> +        } else {
> +            /* else it is some "other" error,
> +             * only return if there was no data processed. */
> +            if (ret == 0) {
> +                ret = -1;
> +            }
> +            break;
> +        }
> +        i++;
> +    }
> +    return ret;
> +#endif
> +}

I think this code instead belongs in QEMUFile.  qemu_fflush can use
either a single f->ops->writev_buffer call, or multiple
f->ops->put_buffer calls.

Paolo

>  void iov_hexdump(const struct iovec *iov, const unsigned int iov_cnt,
>                   FILE *fp, const char *prefix, size_t limit)
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 10/12] Add qemu_put_buffer_no_copy
  2013-03-21  9:09 ` [Qemu-devel] [RFC 10/12] Add qemu_put_buffer_no_copy Orit Wasserman
@ 2013-03-21  9:25   ` Paolo Bonzini
  2013-03-23 16:27   ` Michael R. Hines
  1 sibling, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2013-03-21  9:25 UTC (permalink / raw)
  To: Orit Wasserman; +Cc: mst, chegu_vinod, qemu-devel, quintela

Il 21/03/2013 10:09, Orit Wasserman ha scritto:
> +    f->iov[f->iovcnt].iov_base = (uint8_t *)buf;
> +    f->iov[f->iovcnt++].iov_len = size;
> +
> +    f->is_write = 1;
> +    f->bytes_xfer += size;
> +
> +    if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
> +        qemu_fflush(f);
>      }

It should not be complex to check if f->iov[f->iovcnt - 1] can be
extended?  This could remove many system calls when you have many
consecutive zero pages.

Paolo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 04/12] Add stdio_writev_buffer function
  2013-03-21  9:09 ` [Qemu-devel] [RFC 04/12] Add stdio_writev_buffer function Orit Wasserman
@ 2013-03-21  9:28   ` Paolo Bonzini
  0 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2013-03-21  9:28 UTC (permalink / raw)
  To: Orit Wasserman; +Cc: mst, chegu_vinod, qemu-devel, quintela

Il 21/03/2013 10:09, Orit Wasserman ha scritto:
> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
> ---
>  savevm.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/savevm.c b/savevm.c
> index 41dc87f..fa07a86 100644
> --- a/savevm.c
> +++ b/savevm.c
> @@ -241,6 +241,11 @@ static int stdio_get_fd(void *opaque)
>      return fileno(s->stdio_file);
>  }
>  
> +static int stdio_writev_buffer(void *opaque, struct iovec *iov, int iovcnt)
> +{
> +    return iov_writev(stdio_get_fd(opaque), iov, iovcnt);
> +}
> +
>  static int stdio_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, int size)
>  {
>      QEMUFileStdio *s = opaque;
> @@ -321,6 +326,7 @@ static const QEMUFileOps stdio_pipe_read_ops = {
>  static const QEMUFileOps stdio_pipe_write_ops = {
>      .get_fd =     stdio_get_fd,
>      .put_buffer = stdio_put_buffer,
> +    .writev_buffer = stdio_writev_buffer,
>      .close =      stdio_pclose
>  };
>  
> 

This may cause subtle bugs if the FILE* is written before opening the
QEMUFile, but not flushed.  It doesn't happen in QEMU, but it is not too
clean.  If you let qemu_fflush pick one of stdio_put_buffer or
stdio_writev_buffer, the problem disappears.

Paolo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages
  2013-03-21  9:09 [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages Orit Wasserman
                   ` (11 preceding siblings ...)
  2013-03-21  9:09 ` [Qemu-devel] [RFC 12/12] Bye Bye put_buffer Orit Wasserman
@ 2013-03-21  9:29 ` Paolo Bonzini
  2013-03-21 10:05   ` Orit Wasserman
  2013-03-21  9:48 ` Michael S. Tsirkin
  13 siblings, 1 reply; 35+ messages in thread
From: Paolo Bonzini @ 2013-03-21  9:29 UTC (permalink / raw)
  To: Orit Wasserman; +Cc: mst, chegu_vinod, qemu-devel, quintela

Il 21/03/2013 10:09, Orit Wasserman ha scritto:
> In migration all data is copied to a static buffer in QEMUFile,
> this hurts our network bandwidth and CPU usage especially with large guests.
> We switched to iovec for storing different buffers to send (even a byte field is
> considered as a buffer) and use writev to send the iovec.
> writev was chosen (as apposed to sendmsg) because it supprts non socket fds.
>   
> Guest memory pages are not copied by calling a new function 
> qemu_put_buffer_no_copy.
> The page header data and device state data are still copied into the static
> buffer. This data consists of a lot of bytes and integer fields and the static
> buffer is used to store it during batching.
> Another improvement is changing qemu_putbe64/32/16 to create a single
> buffer instead of several byte sized buffer.

Very nice!  I just disagree on making writev_buffer mandatory; instead,
a QemuFileOps could choose between implementing either put_buffer or
writev_buffer.  This removes the duplicate code you have between
iov_writev and block_writev_buffer.

Thanks,

Paolo

> Orit Wasserman (12):
>   Add iov_writev to use writev to send iovec (also for files)
>   Add QemuFileWritevBuffer QemuFileOps
>   Add socket_writev_buffer function
>   Add stdio_writev_buffer function
>   Add block_writev_buffer function
>   Update bytes_xfer in qemu_put_byte
>   Store the data to send also in iovec
>   Use writev ops instead of put_buffer ops
>   More optimized qemu_put_be64/32/16
>   Add qemu_put_buffer_no_copy
>   Use qemu_put_buffer_no_copy for guest memory pages
>   Bye Bye put_buffer
> 
>  arch_init.c                   |   2 +-
>  include/migration/qemu-file.h |  20 ++++---
>  include/qemu/iov.h            |  12 ++++
>  savevm.c                      | 130 +++++++++++++++++++++++++-----------------
>  util/iov.c                    |  36 ++++++++++++
>  5 files changed, 139 insertions(+), 61 deletions(-)
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 03/12] Add socket_writev_buffer function
  2013-03-21  9:18   ` Paolo Bonzini
@ 2013-03-21  9:47     ` Orit Wasserman
  2013-03-21  9:47       ` Paolo Bonzini
  0 siblings, 1 reply; 35+ messages in thread
From: Orit Wasserman @ 2013-03-21  9:47 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: mst, chegu_vinod, qemu-devel, quintela

On 03/21/2013 11:18 AM, Paolo Bonzini wrote:
> Il 21/03/2013 10:09, Orit Wasserman ha scritto:
>> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
>> ---
>>  savevm.c | 14 ++++++++++++++
>>  1 file changed, 14 insertions(+)
>>
>> diff --git a/savevm.c b/savevm.c
>> index 35c8d1e..41dc87f 100644
>> --- a/savevm.c
>> +++ b/savevm.c
>> @@ -39,6 +39,7 @@
>>  #include "qmp-commands.h"
>>  #include "trace.h"
>>  #include "qemu/bitops.h"
>> +#include "qemu/iov.h"
>>  
>>  #define SELF_ANNOUNCE_ROUNDS 5
>>  
>> @@ -171,6 +172,18 @@ static void coroutine_fn yield_until_fd_readable(int fd)
>>      qemu_coroutine_yield();
>>  }
>>  
>> +static int socket_writev_buffer(void *opaque, struct iovec *iov, int iovcnt)
>> +{
>> +    QEMUFileSocket *s = opaque;
>> +    ssize_t len;
>> +
>> +    len = iov_writev(s->fd, iov, iovcnt);
>> +    if (len < iov_size(iov, iovcnt)) {
>> +        len = -socket_error();
>> +    }
>> +    return len;
>> +}
> 
> This doesn't work on Windows, you need iov_send:
> 
> ssize_t iov_send(int sockfd, struct iovec *iov, unsigned iov_cnt,
>                  size_t offset, size_t bytes);
> 
iov_writev is implemented like iov_send the only difference is that it uses writev and not sendmsg
so it can work on files.
So in windows it will use send.

Orit
> Paolo
> 
> +
>>  static int socket_get_fd(void *opaque)
>>  {
>>      QEMUFileSocket *s = opaque;
>> @@ -387,6 +400,7 @@ static const QEMUFileOps socket_read_ops = {
>>  static const QEMUFileOps socket_write_ops = {
>>      .get_fd =     socket_get_fd,
>>      .put_buffer = socket_put_buffer,
>> +    .writev_buffer = socket_writev_buffer,
>>      .close =      socket_close
>>  };
>>  
>>
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 03/12] Add socket_writev_buffer function
  2013-03-21  9:47     ` Orit Wasserman
@ 2013-03-21  9:47       ` Paolo Bonzini
  2013-03-21 10:17         ` Orit Wasserman
  0 siblings, 1 reply; 35+ messages in thread
From: Paolo Bonzini @ 2013-03-21  9:47 UTC (permalink / raw)
  To: Orit Wasserman; +Cc: mst, chegu_vinod, qemu-devel, quintela

Il 21/03/2013 10:47, Orit Wasserman ha scritto:
> On 03/21/2013 11:18 AM, Paolo Bonzini wrote:
>> Il 21/03/2013 10:09, Orit Wasserman ha scritto:
>>> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
>>> ---
>>>  savevm.c | 14 ++++++++++++++
>>>  1 file changed, 14 insertions(+)
>>>
>>> diff --git a/savevm.c b/savevm.c
>>> index 35c8d1e..41dc87f 100644
>>> --- a/savevm.c
>>> +++ b/savevm.c
>>> @@ -39,6 +39,7 @@
>>>  #include "qmp-commands.h"
>>>  #include "trace.h"
>>>  #include "qemu/bitops.h"
>>> +#include "qemu/iov.h"
>>>  
>>>  #define SELF_ANNOUNCE_ROUNDS 5
>>>  
>>> @@ -171,6 +172,18 @@ static void coroutine_fn yield_until_fd_readable(int fd)
>>>      qemu_coroutine_yield();
>>>  }
>>>  
>>> +static int socket_writev_buffer(void *opaque, struct iovec *iov, int iovcnt)
>>> +{
>>> +    QEMUFileSocket *s = opaque;
>>> +    ssize_t len;
>>> +
>>> +    len = iov_writev(s->fd, iov, iovcnt);
>>> +    if (len < iov_size(iov, iovcnt)) {
>>> +        len = -socket_error();
>>> +    }
>>> +    return len;
>>> +}
>>
>> This doesn't work on Windows, you need iov_send:
>>
>> ssize_t iov_send(int sockfd, struct iovec *iov, unsigned iov_cnt,
>>                  size_t offset, size_t bytes);
>>
> iov_writev is implemented like iov_send the only difference is that it uses writev and not sendmsg
> so it can work on files.
> So in windows it will use send.

Yes, I would prefer to use send since here we know it's a socket.  See
the general review, I don't think iov_writev is needed.

Paolo

> Orit
>> Paolo
>>
>> +
>>>  static int socket_get_fd(void *opaque)
>>>  {
>>>      QEMUFileSocket *s = opaque;
>>> @@ -387,6 +400,7 @@ static const QEMUFileOps socket_read_ops = {
>>>  static const QEMUFileOps socket_write_ops = {
>>>      .get_fd =     socket_get_fd,
>>>      .put_buffer = socket_put_buffer,
>>> +    .writev_buffer = socket_writev_buffer,
>>>      .close =      socket_close
>>>  };
>>>  
>>>
>>
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages
  2013-03-21  9:09 [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages Orit Wasserman
                   ` (12 preceding siblings ...)
  2013-03-21  9:29 ` [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages Paolo Bonzini
@ 2013-03-21  9:48 ` Michael S. Tsirkin
  2013-03-21  9:53   ` Paolo Bonzini
  13 siblings, 1 reply; 35+ messages in thread
From: Michael S. Tsirkin @ 2013-03-21  9:48 UTC (permalink / raw)
  To: Orit Wasserman; +Cc: pbonzini, chegu_vinod, qemu-devel, quintela

On Thu, Mar 21, 2013 at 11:09:19AM +0200, Orit Wasserman wrote:
> In migration all data is copied to a static buffer in QEMUFile,
> this hurts our network bandwidth and CPU usage especially with large guests.
> We switched to iovec for storing different buffers to send (even a byte field is
> considered as a buffer) and use writev to send the iovec.
> writev was chosen (as apposed to sendmsg) because it supprts non socket fds.
>   
> Guest memory pages are not copied by calling a new function 
> qemu_put_buffer_no_copy.
> The page header data and device state data are still copied into the static
> buffer. This data consists of a lot of bytes and integer fields and the static
> buffer is used to store it during batching.
> Another improvement is changing qemu_putbe64/32/16 to create a single
> buffer instead of several byte sized buffer.

A recent discussion about overcommitted memory made me think:
this will still need to read pages into memory even
if all we will do is send it out on the wire immediately,
dirtying cache etc.

Now, one property of RAM writes is that if guest changes
the memory that we are migrating, we really
don't care that remote will get a new copy and not
the old copy.

So this seems like a perfect place to use vmsplice
and save the read of data from pagecache into QEMU.

For this to work, however, we need to have a way
for QEMUFile to know whether a specific iovec
references RAM (so it's ok to use vmsplice)
or a malloced buffer for device state (so it must use write
to ensure kernel copies data).

What do you think?

> Orit Wasserman (12):
>   Add iov_writev to use writev to send iovec (also for files)
>   Add QemuFileWritevBuffer QemuFileOps
>   Add socket_writev_buffer function
>   Add stdio_writev_buffer function
>   Add block_writev_buffer function
>   Update bytes_xfer in qemu_put_byte
>   Store the data to send also in iovec
>   Use writev ops instead of put_buffer ops
>   More optimized qemu_put_be64/32/16
>   Add qemu_put_buffer_no_copy
>   Use qemu_put_buffer_no_copy for guest memory pages
>   Bye Bye put_buffer
> 
>  arch_init.c                   |   2 +-
>  include/migration/qemu-file.h |  20 ++++---
>  include/qemu/iov.h            |  12 ++++
>  savevm.c                      | 130 +++++++++++++++++++++++++-----------------
>  util/iov.c                    |  36 ++++++++++++
>  5 files changed, 139 insertions(+), 61 deletions(-)
> 
> -- 
> 1.7.11.7

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages
  2013-03-21  9:48 ` Michael S. Tsirkin
@ 2013-03-21  9:53   ` Paolo Bonzini
  2013-03-21 11:09     ` Michael S. Tsirkin
  0 siblings, 1 reply; 35+ messages in thread
From: Paolo Bonzini @ 2013-03-21  9:53 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Orit Wasserman, chegu_vinod, qemu-devel, quintela

Il 21/03/2013 10:48, Michael S. Tsirkin ha scritto:
> A recent discussion about overcommitted memory made me think:
> this will still need to read pages into memory even
> if all we will do is send it out on the wire immediately,
> dirtying cache etc.
> 
> Now, one property of RAM writes is that if guest changes
> the memory that we are migrating, we really
> don't care that remote will get a new copy and not
> the old copy.

Orit's patches already do this (patch 11).  The only copy that remains
is from memory to socket buffers.


> For this to work, however, we need to have a way
> for QEMUFile to know whether a specific iovec
> references RAM (so it's ok to use vmsplice)
> or a malloced buffer for device state (so it must use write
> to ensure kernel copies data).
> 
> What do you think?

Was SPLICE_F_GIFT/SPLICE_F_MOVE ever implemented in the end?  If not,
patch 11 of this series does the same thing you are suggesting, without
the indirection of an additional pipe file descriptor that is required
for splice/vmsplice.

Paolo

>> Orit Wasserman (12):
>>   Add iov_writev to use writev to send iovec (also for files)
>>   Add QemuFileWritevBuffer QemuFileOps
>>   Add socket_writev_buffer function
>>   Add stdio_writev_buffer function
>>   Add block_writev_buffer function
>>   Update bytes_xfer in qemu_put_byte
>>   Store the data to send also in iovec
>>   Use writev ops instead of put_buffer ops
>>   More optimized qemu_put_be64/32/16
>>   Add qemu_put_buffer_no_copy
>>   Use qemu_put_buffer_no_copy for guest memory pages
>>   Bye Bye put_buffer
>>
>>  arch_init.c                   |   2 +-
>>  include/migration/qemu-file.h |  20 ++++---
>>  include/qemu/iov.h            |  12 ++++
>>  savevm.c                      | 130 +++++++++++++++++++++++++-----------------
>>  util/iov.c                    |  36 ++++++++++++
>>  5 files changed, 139 insertions(+), 61 deletions(-)
>>
>> -- 
>> 1.7.11.7

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 07/12] Store the data to send also in iovec
  2013-03-21  9:09 ` [Qemu-devel] [RFC 07/12] Store the data to send also in iovec Orit Wasserman
@ 2013-03-21  9:56   ` Paolo Bonzini
  2013-03-21 11:10     ` Orit Wasserman
  0 siblings, 1 reply; 35+ messages in thread
From: Paolo Bonzini @ 2013-03-21  9:56 UTC (permalink / raw)
  To: Orit Wasserman; +Cc: quintela, chegu_vinod, qemu-devel, mst

Il 21/03/2013 10:09, Orit Wasserman ha scritto:
> All data is still copied into the static buffer.
> 
> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
> ---
>  savevm.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/savevm.c b/savevm.c
> index ec64533..d5834ca 100644
> --- a/savevm.c
> +++ b/savevm.c
> @@ -114,6 +114,7 @@ void qemu_announce_self(void)
>  /* savevm/loadvm support */
>  
>  #define IO_BUF_SIZE 32768
> +#define MAX_IOV_SIZE 64

You could use IOV_MAX, or min(IOV_MAX, 64).  The 64 should be tuned on a
good 10G network...

Paolo

>  struct QEMUFile {
>      const QEMUFileOps *ops;
> @@ -129,6 +130,9 @@ struct QEMUFile {
>      int buf_size; /* 0 when writing */
>      uint8_t buf[IO_BUF_SIZE];
>  
> +    struct iovec iov[MAX_IOV_SIZE];
> +    unsigned int iovcnt;
> +
>      int last_error;
>  };
>  
> @@ -546,6 +550,7 @@ static void qemu_fflush(QEMUFile *f)
>              f->pos += f->buf_index;
>          }
>          f->buf_index = 0;
> +        f->iovcnt = 0;
>      }
>      if (ret < 0) {
>          qemu_file_set_error(f, ret);
> @@ -638,12 +643,14 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size)
>          if (l > size)
>              l = size;
>          memcpy(f->buf + f->buf_index, buf, l);
> +        f->iov[f->iovcnt].iov_base = f->buf + f->buf_index;
> +        f->iov[f->iovcnt++].iov_len = l;
>          f->is_write = 1;
>          f->buf_index += l;
>          f->bytes_xfer += l;
>          buf += l;
>          size -= l;
> -        if (f->buf_index >= IO_BUF_SIZE) {
> +        if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
>              qemu_fflush(f);
>              if (qemu_file_get_error(f)) {
>                  break;
> @@ -667,8 +674,10 @@ void qemu_put_byte(QEMUFile *f, int v)
>      f->buf[f->buf_index++] = v;
>      f->is_write = 1;
>      f->bytes_xfer += 1;
> +    f->iov[f->iovcnt].iov_base = f->buf + (f->buf_index - 1);
> +    f->iov[f->iovcnt++].iov_len = 1;
>  
> -    if (f->buf_index >= IO_BUF_SIZE) {
> +    if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
>          qemu_fflush(f);
>      }
>  }
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages
  2013-03-21  9:29 ` [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages Paolo Bonzini
@ 2013-03-21 10:05   ` Orit Wasserman
  0 siblings, 0 replies; 35+ messages in thread
From: Orit Wasserman @ 2013-03-21 10:05 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: mst, chegu_vinod, qemu-devel, quintela

On 03/21/2013 11:29 AM, Paolo Bonzini wrote:
> Il 21/03/2013 10:09, Orit Wasserman ha scritto:
>> In migration all data is copied to a static buffer in QEMUFile,
>> this hurts our network bandwidth and CPU usage especially with large guests.
>> We switched to iovec for storing different buffers to send (even a byte field is
>> considered as a buffer) and use writev to send the iovec.
>> writev was chosen (as apposed to sendmsg) because it supprts non socket fds.
>>   
>> Guest memory pages are not copied by calling a new function 
>> qemu_put_buffer_no_copy.
>> The page header data and device state data are still copied into the static
>> buffer. This data consists of a lot of bytes and integer fields and the static
>> buffer is used to store it during batching.
>> Another improvement is changing qemu_putbe64/32/16 to create a single
>> buffer instead of several byte sized buffer.
> 
> Very nice!  I just disagree on making writev_buffer mandatory; instead,
> a QemuFileOps could choose between implementing either put_buffer or
> writev_buffer. 
 This removes the duplicate code you have between
> iov_writev and block_writev_buffer.
Sure, I can change the code that if put_buffer exists to revert to the previous
implementation.

Orit
> 
> Thanks,
> 
> Paolo
> 
>> Orit Wasserman (12):
>>   Add iov_writev to use writev to send iovec (also for files)
>>   Add QemuFileWritevBuffer QemuFileOps
>>   Add socket_writev_buffer function
>>   Add stdio_writev_buffer function
>>   Add block_writev_buffer function
>>   Update bytes_xfer in qemu_put_byte
>>   Store the data to send also in iovec
>>   Use writev ops instead of put_buffer ops
>>   More optimized qemu_put_be64/32/16
>>   Add qemu_put_buffer_no_copy
>>   Use qemu_put_buffer_no_copy for guest memory pages
>>   Bye Bye put_buffer
>>
>>  arch_init.c                   |   2 +-
>>  include/migration/qemu-file.h |  20 ++++---
>>  include/qemu/iov.h            |  12 ++++
>>  savevm.c                      | 130 +++++++++++++++++++++++++-----------------
>>  util/iov.c                    |  36 ++++++++++++
>>  5 files changed, 139 insertions(+), 61 deletions(-)
>>
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 03/12] Add socket_writev_buffer function
  2013-03-21  9:47       ` Paolo Bonzini
@ 2013-03-21 10:17         ` Orit Wasserman
  0 siblings, 0 replies; 35+ messages in thread
From: Orit Wasserman @ 2013-03-21 10:17 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: mst, chegu_vinod, qemu-devel, quintela

On 03/21/2013 11:47 AM, Paolo Bonzini wrote:
> Il 21/03/2013 10:47, Orit Wasserman ha scritto:
>> On 03/21/2013 11:18 AM, Paolo Bonzini wrote:
>>> Il 21/03/2013 10:09, Orit Wasserman ha scritto:
>>>> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
>>>> ---
>>>>  savevm.c | 14 ++++++++++++++
>>>>  1 file changed, 14 insertions(+)
>>>>
>>>> diff --git a/savevm.c b/savevm.c
>>>> index 35c8d1e..41dc87f 100644
>>>> --- a/savevm.c
>>>> +++ b/savevm.c
>>>> @@ -39,6 +39,7 @@
>>>>  #include "qmp-commands.h"
>>>>  #include "trace.h"
>>>>  #include "qemu/bitops.h"
>>>> +#include "qemu/iov.h"
>>>>  
>>>>  #define SELF_ANNOUNCE_ROUNDS 5
>>>>  
>>>> @@ -171,6 +172,18 @@ static void coroutine_fn yield_until_fd_readable(int fd)
>>>>      qemu_coroutine_yield();
>>>>  }
>>>>  
>>>> +static int socket_writev_buffer(void *opaque, struct iovec *iov, int iovcnt)
>>>> +{
>>>> +    QEMUFileSocket *s = opaque;
>>>> +    ssize_t len;
>>>> +
>>>> +    len = iov_writev(s->fd, iov, iovcnt);
>>>> +    if (len < iov_size(iov, iovcnt)) {
>>>> +        len = -socket_error();
>>>> +    }
>>>> +    return len;
>>>> +}
>>>
>>> This doesn't work on Windows, you need iov_send:
>>>
>>> ssize_t iov_send(int sockfd, struct iovec *iov, unsigned iov_cnt,
>>>                  size_t offset, size_t bytes);
>>>
>> iov_writev is implemented like iov_send the only difference is that it uses writev and not sendmsg
>> so it can work on files.
>> So in windows it will use send.
> 
> Yes, I would prefer to use send since here we know it's a socket.  See
> the general review, I don't think iov_writev is needed.
Sure.
> 
> Paolo
> 
>> Orit
>>> Paolo
>>>
>>> +
>>>>  static int socket_get_fd(void *opaque)
>>>>  {
>>>>      QEMUFileSocket *s = opaque;
>>>> @@ -387,6 +400,7 @@ static const QEMUFileOps socket_read_ops = {
>>>>  static const QEMUFileOps socket_write_ops = {
>>>>      .get_fd =     socket_get_fd,
>>>>      .put_buffer = socket_put_buffer,
>>>> +    .writev_buffer = socket_writev_buffer,
>>>>      .close =      socket_close
>>>>  };
>>>>  
>>>>
>>>
>>
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages
  2013-03-21  9:53   ` Paolo Bonzini
@ 2013-03-21 11:09     ` Michael S. Tsirkin
  0 siblings, 0 replies; 35+ messages in thread
From: Michael S. Tsirkin @ 2013-03-21 11:09 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Orit Wasserman, chegu_vinod, qemu-devel, quintela

On Thu, Mar 21, 2013 at 10:53:55AM +0100, Paolo Bonzini wrote:
> Il 21/03/2013 10:48, Michael S. Tsirkin ha scritto:
> > A recent discussion about overcommitted memory made me think:
> > this will still need to read pages into memory even
> > if all we will do is send it out on the wire immediately,
> > dirtying cache etc.
> > 
> > Now, one property of RAM writes is that if guest changes
> > the memory that we are migrating, we really
> > don't care that remote will get a new copy and not
> > the old copy.
> 
> Orit's patches already do this (patch 11).  The only copy that remains
> is from memory to socket buffers.
> 
> 
> > For this to work, however, we need to have a way
> > for QEMUFile to know whether a specific iovec
> > references RAM (so it's ok to use vmsplice)
> > or a malloced buffer for device state (so it must use write
> > to ensure kernel copies data).
> > 
> > What do you think?
> 
> Was SPLICE_F_GIFT/SPLICE_F_MOVE ever implemented in the end?  If not,
> patch 11 of this series does the same thing you are suggesting, without
> the indirection of an additional pipe file descriptor that is required
> for splice/vmsplice.
> 
> Paolo

GIFT seems to be implemented so no problem there.
MOVE seems not to be needed for TCP - I think it just uses sendpage
internally which will do zero copy without special hints.


> >> Orit Wasserman (12):
> >>   Add iov_writev to use writev to send iovec (also for files)
> >>   Add QemuFileWritevBuffer QemuFileOps
> >>   Add socket_writev_buffer function
> >>   Add stdio_writev_buffer function
> >>   Add block_writev_buffer function
> >>   Update bytes_xfer in qemu_put_byte
> >>   Store the data to send also in iovec
> >>   Use writev ops instead of put_buffer ops
> >>   More optimized qemu_put_be64/32/16
> >>   Add qemu_put_buffer_no_copy
> >>   Use qemu_put_buffer_no_copy for guest memory pages
> >>   Bye Bye put_buffer
> >>
> >>  arch_init.c                   |   2 +-
> >>  include/migration/qemu-file.h |  20 ++++---
> >>  include/qemu/iov.h            |  12 ++++
> >>  savevm.c                      | 130 +++++++++++++++++++++++++-----------------
> >>  util/iov.c                    |  36 ++++++++++++
> >>  5 files changed, 139 insertions(+), 61 deletions(-)
> >>
> >> -- 
> >> 1.7.11.7

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 07/12] Store the data to send also in iovec
  2013-03-21  9:56   ` Paolo Bonzini
@ 2013-03-21 11:10     ` Orit Wasserman
  2013-03-21 11:14       ` Michael S. Tsirkin
  0 siblings, 1 reply; 35+ messages in thread
From: Orit Wasserman @ 2013-03-21 11:10 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: quintela, chegu_vinod, qemu-devel, mst

On 03/21/2013 11:56 AM, Paolo Bonzini wrote:
> Il 21/03/2013 10:09, Orit Wasserman ha scritto:
>> All data is still copied into the static buffer.
>>
>> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
>> ---
>>  savevm.c | 13 +++++++++++--
>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/savevm.c b/savevm.c
>> index ec64533..d5834ca 100644
>> --- a/savevm.c
>> +++ b/savevm.c
>> @@ -114,6 +114,7 @@ void qemu_announce_self(void)
>>  /* savevm/loadvm support */
>>  
>>  #define IO_BUF_SIZE 32768
>> +#define MAX_IOV_SIZE 64
> 
> You could use IOV_MAX, or min(IOV_MAX, 64).
Sure
 The 64 should be tuned on a
> good 10G network...

You need to remember that iovec of 64 is equivalent to 32 guest pages which are 
128K data. This is a large TCP packet that will be fragmented even with jumbo frames.
I can make it configurable but not sure if it will be useful.
Micheal, what do you think?

Orit
> 
> Paolo
> 
>>  struct QEMUFile {
>>      const QEMUFileOps *ops;
>> @@ -129,6 +130,9 @@ struct QEMUFile {
>>      int buf_size; /* 0 when writing */
>>      uint8_t buf[IO_BUF_SIZE];
>>  
>> +    struct iovec iov[MAX_IOV_SIZE];
>> +    unsigned int iovcnt;
>> +
>>      int last_error;
>>  };
>>  
>> @@ -546,6 +550,7 @@ static void qemu_fflush(QEMUFile *f)
>>              f->pos += f->buf_index;
>>          }
>>          f->buf_index = 0;
>> +        f->iovcnt = 0;
>>      }
>>      if (ret < 0) {
>>          qemu_file_set_error(f, ret);
>> @@ -638,12 +643,14 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size)
>>          if (l > size)
>>              l = size;
>>          memcpy(f->buf + f->buf_index, buf, l);
>> +        f->iov[f->iovcnt].iov_base = f->buf + f->buf_index;
>> +        f->iov[f->iovcnt++].iov_len = l;
>>          f->is_write = 1;
>>          f->buf_index += l;
>>          f->bytes_xfer += l;
>>          buf += l;
>>          size -= l;
>> -        if (f->buf_index >= IO_BUF_SIZE) {
>> +        if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
>>              qemu_fflush(f);
>>              if (qemu_file_get_error(f)) {
>>                  break;
>> @@ -667,8 +674,10 @@ void qemu_put_byte(QEMUFile *f, int v)
>>      f->buf[f->buf_index++] = v;
>>      f->is_write = 1;
>>      f->bytes_xfer += 1;
>> +    f->iov[f->iovcnt].iov_base = f->buf + (f->buf_index - 1);
>> +    f->iov[f->iovcnt++].iov_len = 1;
>>  
>> -    if (f->buf_index >= IO_BUF_SIZE) {
>> +    if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
>>          qemu_fflush(f);
>>      }
>>  }
>>
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 07/12] Store the data to send also in iovec
  2013-03-21 11:10     ` Orit Wasserman
@ 2013-03-21 11:14       ` Michael S. Tsirkin
  2013-03-21 12:50         ` Orit Wasserman
  0 siblings, 1 reply; 35+ messages in thread
From: Michael S. Tsirkin @ 2013-03-21 11:14 UTC (permalink / raw)
  To: Orit Wasserman; +Cc: Paolo Bonzini, chegu_vinod, qemu-devel, quintela

On Thu, Mar 21, 2013 at 01:10:40PM +0200, Orit Wasserman wrote:
> On 03/21/2013 11:56 AM, Paolo Bonzini wrote:
> > Il 21/03/2013 10:09, Orit Wasserman ha scritto:
> >> All data is still copied into the static buffer.
> >>
> >> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
> >> ---
> >>  savevm.c | 13 +++++++++++--
> >>  1 file changed, 11 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/savevm.c b/savevm.c
> >> index ec64533..d5834ca 100644
> >> --- a/savevm.c
> >> +++ b/savevm.c
> >> @@ -114,6 +114,7 @@ void qemu_announce_self(void)
> >>  /* savevm/loadvm support */
> >>  
> >>  #define IO_BUF_SIZE 32768
> >> +#define MAX_IOV_SIZE 64
> > 
> > You could use IOV_MAX, or min(IOV_MAX, 64).
> Sure
>  The 64 should be tuned on a
> > good 10G network...
> 
> You need to remember that iovec of 64 is equivalent to 32 guest pages which are 
> 128K data. This is a large TCP packet that will be fragmented even with jumbo frames.
> I can make it configurable but not sure if it will be useful.
> Micheal, what do you think?
> 
> Orit

You are both right :). 64 looks like a sane value, and we can
try and tune it some more if we have the time.

> > 
> > Paolo
> > 
> >>  struct QEMUFile {
> >>      const QEMUFileOps *ops;
> >> @@ -129,6 +130,9 @@ struct QEMUFile {
> >>      int buf_size; /* 0 when writing */
> >>      uint8_t buf[IO_BUF_SIZE];
> >>  
> >> +    struct iovec iov[MAX_IOV_SIZE];
> >> +    unsigned int iovcnt;
> >> +
> >>      int last_error;
> >>  };
> >>  
> >> @@ -546,6 +550,7 @@ static void qemu_fflush(QEMUFile *f)
> >>              f->pos += f->buf_index;
> >>          }
> >>          f->buf_index = 0;
> >> +        f->iovcnt = 0;
> >>      }
> >>      if (ret < 0) {
> >>          qemu_file_set_error(f, ret);
> >> @@ -638,12 +643,14 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size)
> >>          if (l > size)
> >>              l = size;
> >>          memcpy(f->buf + f->buf_index, buf, l);
> >> +        f->iov[f->iovcnt].iov_base = f->buf + f->buf_index;
> >> +        f->iov[f->iovcnt++].iov_len = l;
> >>          f->is_write = 1;
> >>          f->buf_index += l;
> >>          f->bytes_xfer += l;
> >>          buf += l;
> >>          size -= l;
> >> -        if (f->buf_index >= IO_BUF_SIZE) {
> >> +        if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
> >>              qemu_fflush(f);
> >>              if (qemu_file_get_error(f)) {
> >>                  break;
> >> @@ -667,8 +674,10 @@ void qemu_put_byte(QEMUFile *f, int v)
> >>      f->buf[f->buf_index++] = v;
> >>      f->is_write = 1;
> >>      f->bytes_xfer += 1;
> >> +    f->iov[f->iovcnt].iov_base = f->buf + (f->buf_index - 1);
> >> +    f->iov[f->iovcnt++].iov_len = 1;
> >>  
> >> -    if (f->buf_index >= IO_BUF_SIZE) {
> >> +    if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
> >>          qemu_fflush(f);
> >>      }
> >>  }
> >>
> > 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 07/12] Store the data to send also in iovec
  2013-03-21 11:14       ` Michael S. Tsirkin
@ 2013-03-21 12:50         ` Orit Wasserman
  2013-03-21 13:00           ` Paolo Bonzini
  0 siblings, 1 reply; 35+ messages in thread
From: Orit Wasserman @ 2013-03-21 12:50 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Paolo Bonzini, chegu_vinod, qemu-devel, quintela

On 03/21/2013 01:14 PM, Michael S. Tsirkin wrote:
> On Thu, Mar 21, 2013 at 01:10:40PM +0200, Orit Wasserman wrote:
>> On 03/21/2013 11:56 AM, Paolo Bonzini wrote:
>>> Il 21/03/2013 10:09, Orit Wasserman ha scritto:
>>>> All data is still copied into the static buffer.
>>>>
>>>> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
>>>> ---
>>>>  savevm.c | 13 +++++++++++--
>>>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/savevm.c b/savevm.c
>>>> index ec64533..d5834ca 100644
>>>> --- a/savevm.c
>>>> +++ b/savevm.c
>>>> @@ -114,6 +114,7 @@ void qemu_announce_self(void)
>>>>  /* savevm/loadvm support */
>>>>  
>>>>  #define IO_BUF_SIZE 32768
>>>> +#define MAX_IOV_SIZE 64
>>>
>>> You could use IOV_MAX, or min(IOV_MAX, 64).
>> Sure
>>  The 64 should be tuned on a
>>> good 10G network...
>>
>> You need to remember that iovec of 64 is equivalent to 32 guest pages which are 
>> 128K data. This is a large TCP packet that will be fragmented even with jumbo frames.
>> I can make it configurable but not sure if it will be useful.
>> Micheal, what do you think?
>>
>> Orit
> 
> You are both right :). 64 looks like a sane value, and we can
> try and tune it some more if we have the time.
> 
Paolo, are you ok with leaving it MIN(IOV_MAX, 64) at the moment?
Later we can make it changeable.

Orit
>>>
>>> Paolo
>>>
>>>>  struct QEMUFile {
>>>>      const QEMUFileOps *ops;
>>>> @@ -129,6 +130,9 @@ struct QEMUFile {
>>>>      int buf_size; /* 0 when writing */
>>>>      uint8_t buf[IO_BUF_SIZE];
>>>>  
>>>> +    struct iovec iov[MAX_IOV_SIZE];
>>>> +    unsigned int iovcnt;
>>>> +
>>>>      int last_error;
>>>>  };
>>>>  
>>>> @@ -546,6 +550,7 @@ static void qemu_fflush(QEMUFile *f)
>>>>              f->pos += f->buf_index;
>>>>          }
>>>>          f->buf_index = 0;
>>>> +        f->iovcnt = 0;
>>>>      }
>>>>      if (ret < 0) {
>>>>          qemu_file_set_error(f, ret);
>>>> @@ -638,12 +643,14 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size)
>>>>          if (l > size)
>>>>              l = size;
>>>>          memcpy(f->buf + f->buf_index, buf, l);
>>>> +        f->iov[f->iovcnt].iov_base = f->buf + f->buf_index;
>>>> +        f->iov[f->iovcnt++].iov_len = l;
>>>>          f->is_write = 1;
>>>>          f->buf_index += l;
>>>>          f->bytes_xfer += l;
>>>>          buf += l;
>>>>          size -= l;
>>>> -        if (f->buf_index >= IO_BUF_SIZE) {
>>>> +        if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
>>>>              qemu_fflush(f);
>>>>              if (qemu_file_get_error(f)) {
>>>>                  break;
>>>> @@ -667,8 +674,10 @@ void qemu_put_byte(QEMUFile *f, int v)
>>>>      f->buf[f->buf_index++] = v;
>>>>      f->is_write = 1;
>>>>      f->bytes_xfer += 1;
>>>> +    f->iov[f->iovcnt].iov_base = f->buf + (f->buf_index - 1);
>>>> +    f->iov[f->iovcnt++].iov_len = 1;
>>>>  
>>>> -    if (f->buf_index >= IO_BUF_SIZE) {
>>>> +    if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
>>>>          qemu_fflush(f);
>>>>      }
>>>>  }
>>>>
>>>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 07/12] Store the data to send also in iovec
  2013-03-21 12:50         ` Orit Wasserman
@ 2013-03-21 13:00           ` Paolo Bonzini
  0 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2013-03-21 13:00 UTC (permalink / raw)
  To: Orit Wasserman; +Cc: quintela, chegu_vinod, qemu-devel, Michael S. Tsirkin

Il 21/03/2013 13:50, Orit Wasserman ha scritto:
> On 03/21/2013 01:14 PM, Michael S. Tsirkin wrote:
>> On Thu, Mar 21, 2013 at 01:10:40PM +0200, Orit Wasserman wrote:
>>> On 03/21/2013 11:56 AM, Paolo Bonzini wrote:
>>>> Il 21/03/2013 10:09, Orit Wasserman ha scritto:
>>>>> All data is still copied into the static buffer.
>>>>>
>>>>> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
>>>>> ---
>>>>>  savevm.c | 13 +++++++++++--
>>>>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/savevm.c b/savevm.c
>>>>> index ec64533..d5834ca 100644
>>>>> --- a/savevm.c
>>>>> +++ b/savevm.c
>>>>> @@ -114,6 +114,7 @@ void qemu_announce_self(void)
>>>>>  /* savevm/loadvm support */
>>>>>  
>>>>>  #define IO_BUF_SIZE 32768
>>>>> +#define MAX_IOV_SIZE 64
>>>>
>>>> You could use IOV_MAX, or min(IOV_MAX, 64).
>
> Paolo, are you ok with leaving it MIN(IOV_MAX, 64) at the moment?

Of course! (When I use "..." it means "I don't expect you to do that..."
:)).

Paolo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 10/12] Add qemu_put_buffer_no_copy
  2013-03-21  9:09 ` [Qemu-devel] [RFC 10/12] Add qemu_put_buffer_no_copy Orit Wasserman
  2013-03-21  9:25   ` Paolo Bonzini
@ 2013-03-23 16:27   ` Michael R. Hines
  2013-03-25  8:11     ` Orit Wasserman
  2013-03-25 13:05     ` Paolo Bonzini
  1 sibling, 2 replies; 35+ messages in thread
From: Michael R. Hines @ 2013-03-23 16:27 UTC (permalink / raw)
  To: Orit Wasserman; +Cc: pbonzini, quintela, chegu_vinod, qemu-devel, mst

Can you add a "flag" or something to indicate that the iov pointer 
belongs to RAM and not to device state?

That way, I could re-use this code for RDMA - if I see this flag, I will 
know to send to RDMA.....

- Michael


On 03/21/2013 05:09 AM, Orit Wasserman wrote:
> This allow us to add a buffer to the iovec to send without copying it
> into the static buffer.
>
> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
> ---
>   include/migration/qemu-file.h |  5 +++++
>   savevm.c                      | 42 ++++++++++++++++++++++++------------------
>   2 files changed, 29 insertions(+), 18 deletions(-)
>
> diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h
> index 8d3da9b..5168be2 100644
> --- a/include/migration/qemu-file.h
> +++ b/include/migration/qemu-file.h
> @@ -75,6 +75,11 @@ int qemu_fclose(QEMUFile *f);
>   int64_t qemu_ftell(QEMUFile *f);
>   void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size);
>   void qemu_put_byte(QEMUFile *f, int v);
> +/*
> + * put_buffer without copying the buffer.
> + * The buffer should be available till it is sent.
> + */
> +void qemu_put_buffer_no_copy(QEMUFile *f, const uint8_t *buf, int size);
>
>   static inline void qemu_put_ubyte(QEMUFile *f, unsigned int v)
>   {
> diff --git a/savevm.c b/savevm.c
> index 40d96f4..32a506e 100644
> --- a/savevm.c
> +++ b/savevm.c
> @@ -629,6 +629,22 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size)
>   {
>       int l;
>
> +    while (size > 0) {
> +        l = IO_BUF_SIZE - f->buf_index;
> +        if (l > size) {
> +            l = size;
> +        }
> +        memcpy(f->buf + f->buf_index, buf, l);
> +        f->buf_index += l;
> +        f->is_write = 1;
> +        qemu_put_buffer_no_copy(f, f->buf + (f->buf_index - l), l);
> +        buf += l;
> +        size -= l;
> +    }
> +}
> +
> +void qemu_put_buffer_no_copy(QEMUFile *f, const uint8_t *buf, int size)
> +{
>       if (f->last_error) {
>           return;
>       }
> @@ -639,24 +655,14 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size)
>           abort();
>       }
>
> -    while (size > 0) {
> -        l = IO_BUF_SIZE - f->buf_index;
> -        if (l > size)
> -            l = size;
> -        memcpy(f->buf + f->buf_index, buf, l);
> -        f->iov[f->iovcnt].iov_base = f->buf + f->buf_index;
> -        f->iov[f->iovcnt++].iov_len = l;
> -        f->is_write = 1;
> -        f->buf_index += l;
> -        f->bytes_xfer += l;
> -        buf += l;
> -        size -= l;
> -        if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
> -            qemu_fflush(f);
> -            if (qemu_file_get_error(f)) {
> -                break;
> -            }
> -        }
> +    f->iov[f->iovcnt].iov_base = (uint8_t *)buf;
> +    f->iov[f->iovcnt++].iov_len = size;
> +
> +    f->is_write = 1;
> +    f->bytes_xfer += size;
> +
> +    if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
> +        qemu_fflush(f);
>       }
>   }
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 10/12] Add qemu_put_buffer_no_copy
  2013-03-23 16:27   ` Michael R. Hines
@ 2013-03-25  8:11     ` Orit Wasserman
  2013-03-25 13:05     ` Paolo Bonzini
  1 sibling, 0 replies; 35+ messages in thread
From: Orit Wasserman @ 2013-03-25  8:11 UTC (permalink / raw)
  To: Michael R. Hines; +Cc: pbonzini, quintela, chegu_vinod, qemu-devel, mst

On 03/23/2013 06:27 PM, Michael R. Hines wrote:
> Can you add a "flag" or something to indicate that the iov pointer belongs to RAM and not to device state?
> 
> That way, I could re-use this code for RDMA - if I see this flag, I will know to send to RDMA.....
This function is called only for ram pages so no need for flag.

Orit
> 
> - Michael
> 
> 
> On 03/21/2013 05:09 AM, Orit Wasserman wrote:
>> This allow us to add a buffer to the iovec to send without copying it
>> into the static buffer.
>>
>> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
>> ---
>>   include/migration/qemu-file.h |  5 +++++
>>   savevm.c                      | 42 ++++++++++++++++++++++++------------------
>>   2 files changed, 29 insertions(+), 18 deletions(-)
>>
>> diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h
>> index 8d3da9b..5168be2 100644
>> --- a/include/migration/qemu-file.h
>> +++ b/include/migration/qemu-file.h
>> @@ -75,6 +75,11 @@ int qemu_fclose(QEMUFile *f);
>>   int64_t qemu_ftell(QEMUFile *f);
>>   void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size);
>>   void qemu_put_byte(QEMUFile *f, int v);
>> +/*
>> + * put_buffer without copying the buffer.
>> + * The buffer should be available till it is sent.
>> + */
>> +void qemu_put_buffer_no_copy(QEMUFile *f, const uint8_t *buf, int size);
>>
>>   static inline void qemu_put_ubyte(QEMUFile *f, unsigned int v)
>>   {
>> diff --git a/savevm.c b/savevm.c
>> index 40d96f4..32a506e 100644
>> --- a/savevm.c
>> +++ b/savevm.c
>> @@ -629,6 +629,22 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size)
>>   {
>>       int l;
>>
>> +    while (size > 0) {
>> +        l = IO_BUF_SIZE - f->buf_index;
>> +        if (l > size) {
>> +            l = size;
>> +        }
>> +        memcpy(f->buf + f->buf_index, buf, l);
>> +        f->buf_index += l;
>> +        f->is_write = 1;
>> +        qemu_put_buffer_no_copy(f, f->buf + (f->buf_index - l), l);
>> +        buf += l;
>> +        size -= l;
>> +    }
>> +}
>> +
>> +void qemu_put_buffer_no_copy(QEMUFile *f, const uint8_t *buf, int size)
>> +{
>>       if (f->last_error) {
>>           return;
>>       }
>> @@ -639,24 +655,14 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size)
>>           abort();
>>       }
>>
>> -    while (size > 0) {
>> -        l = IO_BUF_SIZE - f->buf_index;
>> -        if (l > size)
>> -            l = size;
>> -        memcpy(f->buf + f->buf_index, buf, l);
>> -        f->iov[f->iovcnt].iov_base = f->buf + f->buf_index;
>> -        f->iov[f->iovcnt++].iov_len = l;
>> -        f->is_write = 1;
>> -        f->buf_index += l;
>> -        f->bytes_xfer += l;
>> -        buf += l;
>> -        size -= l;
>> -        if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
>> -            qemu_fflush(f);
>> -            if (qemu_file_get_error(f)) {
>> -                break;
>> -            }
>> -        }
>> +    f->iov[f->iovcnt].iov_base = (uint8_t *)buf;
>> +    f->iov[f->iovcnt++].iov_len = size;
>> +
>> +    f->is_write = 1;
>> +    f->bytes_xfer += size;
>> +
>> +    if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
>> +        qemu_fflush(f);
>>       }
>>   }
>>
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 10/12] Add qemu_put_buffer_no_copy
  2013-03-23 16:27   ` Michael R. Hines
  2013-03-25  8:11     ` Orit Wasserman
@ 2013-03-25 13:05     ` Paolo Bonzini
  2013-03-25 15:18       ` Michael R. Hines
  1 sibling, 1 reply; 35+ messages in thread
From: Paolo Bonzini @ 2013-03-25 13:05 UTC (permalink / raw)
  To: Michael R. Hines; +Cc: Orit Wasserman, quintela, chegu vinod, qemu-devel, mst



----- Messaggio originale -----
> Da: "Michael R. Hines" <mrhines@linux.vnet.ibm.com>
> A: "Orit Wasserman" <owasserm@redhat.com>
> Cc: qemu-devel@nongnu.org, pbonzini@redhat.com, mst@redhat.com, "chegu vinod" <chegu_vinod@hp.com>,
> quintela@redhat.com
> Inviato: Sabato, 23 marzo 2013 17:27:49
> Oggetto: Re: [Qemu-devel] [RFC 10/12] Add qemu_put_buffer_no_copy
> 
> Can you add a "flag" or something to indicate that the iov pointer
> belongs to RAM and not to device state?
> 
> That way, I could re-use this code for RDMA - if I see this flag, I
> will know to send to RDMA.....

I am not sure you can, because the function will be preceded by
a qemu_put_be64 to store the header.  That header is not sent in
the RDMA case, isn't it?

Paolo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 10/12] Add qemu_put_buffer_no_copy
  2013-03-25 13:05     ` Paolo Bonzini
@ 2013-03-25 15:18       ` Michael R. Hines
  2013-03-25 15:59         ` Paolo Bonzini
  0 siblings, 1 reply; 35+ messages in thread
From: Michael R. Hines @ 2013-03-25 15:18 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Orit Wasserman, quintela, chegu vinod, qemu-devel, mst

Right, the header's not used - but, are we certain that 
put_buffer_copy() will *always* be used for RAM in the future?

- Michael

On 03/25/2013 09:05 AM, Paolo Bonzini wrote:
>
> ----- Messaggio originale -----
>> Da: "Michael R. Hines" <mrhines@linux.vnet.ibm.com>
>> A: "Orit Wasserman" <owasserm@redhat.com>
>> Cc: qemu-devel@nongnu.org, pbonzini@redhat.com, mst@redhat.com, "chegu vinod" <chegu_vinod@hp.com>,
>> quintela@redhat.com
>> Inviato: Sabato, 23 marzo 2013 17:27:49
>> Oggetto: Re: [Qemu-devel] [RFC 10/12] Add qemu_put_buffer_no_copy
>>
>> Can you add a "flag" or something to indicate that the iov pointer
>> belongs to RAM and not to device state?
>>
>> That way, I could re-use this code for RDMA - if I see this flag, I
>> will know to send to RDMA.....
> I am not sure you can, because the function will be preceded by
> a qemu_put_be64 to store the header.  That header is not sent in
> the RDMA case, isn't it?
>
> Paolo
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC 10/12] Add qemu_put_buffer_no_copy
  2013-03-25 15:18       ` Michael R. Hines
@ 2013-03-25 15:59         ` Paolo Bonzini
  0 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2013-03-25 15:59 UTC (permalink / raw)
  To: Michael R. Hines; +Cc: Orit Wasserman, quintela, chegu vinod, qemu-devel, mst


> Right, the header's not used - but, are we certain that
> put_buffer_copy() will *always* be used for RAM in the future?

I think you should not make any assumption and proceed as if this
series didn't exist.

Paolo

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2013-03-25 16:01 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-03-21  9:09 [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages Orit Wasserman
2013-03-21  9:09 ` [Qemu-devel] [RFC 01/12] Add iov_writev to use writev to send iovec (also for files) Orit Wasserman
2013-03-21  9:23   ` Paolo Bonzini
2013-03-21  9:09 ` [Qemu-devel] [RFC 02/12] Add QemuFileWritevBuffer QemuFileOps Orit Wasserman
2013-03-21  9:09 ` [Qemu-devel] [RFC 03/12] Add socket_writev_buffer function Orit Wasserman
2013-03-21  9:18   ` Paolo Bonzini
2013-03-21  9:47     ` Orit Wasserman
2013-03-21  9:47       ` Paolo Bonzini
2013-03-21 10:17         ` Orit Wasserman
2013-03-21  9:09 ` [Qemu-devel] [RFC 04/12] Add stdio_writev_buffer function Orit Wasserman
2013-03-21  9:28   ` Paolo Bonzini
2013-03-21  9:09 ` [Qemu-devel] [RFC 05/12] Add block_writev_buffer function Orit Wasserman
2013-03-21  9:09 ` [Qemu-devel] [RFC 06/12] Update bytes_xfer in qemu_put_byte Orit Wasserman
2013-03-21  9:09 ` [Qemu-devel] [RFC 07/12] Store the data to send also in iovec Orit Wasserman
2013-03-21  9:56   ` Paolo Bonzini
2013-03-21 11:10     ` Orit Wasserman
2013-03-21 11:14       ` Michael S. Tsirkin
2013-03-21 12:50         ` Orit Wasserman
2013-03-21 13:00           ` Paolo Bonzini
2013-03-21  9:09 ` [Qemu-devel] [RFC 08/12] Use writev ops instead of put_buffer ops Orit Wasserman
2013-03-21  9:09 ` [Qemu-devel] [RFC 09/12] More optimized qemu_put_be64/32/16 Orit Wasserman
2013-03-21  9:09 ` [Qemu-devel] [RFC 10/12] Add qemu_put_buffer_no_copy Orit Wasserman
2013-03-21  9:25   ` Paolo Bonzini
2013-03-23 16:27   ` Michael R. Hines
2013-03-25  8:11     ` Orit Wasserman
2013-03-25 13:05     ` Paolo Bonzini
2013-03-25 15:18       ` Michael R. Hines
2013-03-25 15:59         ` Paolo Bonzini
2013-03-21  9:09 ` [Qemu-devel] [RFC 11/12] Use qemu_put_buffer_no_copy for guest memory pages Orit Wasserman
2013-03-21  9:09 ` [Qemu-devel] [RFC 12/12] Bye Bye put_buffer Orit Wasserman
2013-03-21  9:29 ` [Qemu-devel] [RFC 00/12] Migration: Remove copying of guest ram pages Paolo Bonzini
2013-03-21 10:05   ` Orit Wasserman
2013-03-21  9:48 ` Michael S. Tsirkin
2013-03-21  9:53   ` Paolo Bonzini
2013-03-21 11:09     ` Michael S. Tsirkin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.