All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] Zero copy improvements (QIOChannel + multifd)
@ 2022-07-01 15:59 Leonardo Bras
  2022-07-01 15:59 ` [PATCH v2 1/3] QIOChannelSocket: Fix zero-copy flush returning code 1 when nothing sent Leonardo Bras
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Leonardo Bras @ 2022-07-01 15:59 UTC (permalink / raw)
  To: Daniel P. Berrangé,
	Juan Quintela, Dr. David Alan Gilbert, Eric Blake,
	Markus Armbruster, Peter Xu
  Cc: Leonardo Bras, qemu-devel

The first patch avoid spuriously returning 1 [*] when zero-copy flush is
attempted before any buffer was sent using MSG_ZEROCOPY.

[*] zero-copy not being used, even though it's enabled and supported
by kernel

The second patch introduces a new migration stat (zero-copy-copied)
that will be used to keep track of [*]. 

Honestly I would like some help with this naming, which I don't think
is quite good, but I could also not think on anything better.

The third patch keeps track of how many zero-copy flushes retured 1 [*]

Leonardo Bras (3):
  QIOChannelSocket: Fix zero-copy flush returning code 1 when nothing
    sent
  Add zero-copy-copied migration stat
  migration/multifd: Warn user when zerocopy not working

 qapi/migration.json   | 5 ++++-
 migration/ram.h       | 2 ++
 io/channel-socket.c   | 8 +++++++-
 migration/migration.c | 1 +
 migration/multifd.c   | 2 ++
 migration/ram.c       | 5 +++++
 monitor/hmp-cmds.c    | 4 ++++
 7 files changed, 25 insertions(+), 2 deletions(-)

-- 
2.36.1



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2 1/3] QIOChannelSocket: Fix zero-copy flush returning code 1 when nothing sent
  2022-07-01 15:59 [PATCH v2 0/3] Zero copy improvements (QIOChannel + multifd) Leonardo Bras
@ 2022-07-01 15:59 ` Leonardo Bras
  2022-07-01 15:59 ` [PATCH v2 2/3] Add zero-copy-copied migration stat Leonardo Bras
  2022-07-01 15:59 ` [PATCH v2 3/3] migration/multifd: Warn user when zerocopy not working Leonardo Bras
  2 siblings, 0 replies; 14+ messages in thread
From: Leonardo Bras @ 2022-07-01 15:59 UTC (permalink / raw)
  To: Daniel P. Berrangé,
	Juan Quintela, Dr. David Alan Gilbert, Eric Blake,
	Markus Armbruster, Peter Xu
  Cc: Leonardo Bras, qemu-devel

If flush is called when no buffer was sent with MSG_ZEROCOPY, it currently
returns 1. This return code should be used only when Linux fails to use
MSG_ZEROCOPY on a lot of sendmsg().

Fix this by returning early from flush if no sendmsg(...,MSG_ZEROCOPY)
was attempted.

Fixes: 2bc58ffc2926 ("QIOChannelSocket: Implement io_writev zero copy flag & io_flush for CONFIG_LINUX")
Signed-off-by: Leonardo Bras <leobras@redhat.com>
---
 io/channel-socket.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/io/channel-socket.c b/io/channel-socket.c
index 4466bb1cd4..698c086b70 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -716,12 +716,18 @@ static int qio_channel_socket_flush(QIOChannel *ioc,
     struct cmsghdr *cm;
     char control[CMSG_SPACE(sizeof(*serr))];
     int received;
-    int ret = 1;
+    int ret;
+
+    if (!sioc->zero_copy_queued) {
+        return 0;
+    }
 
     msg.msg_control = control;
     msg.msg_controllen = sizeof(control);
     memset(control, 0, sizeof(control));
 
+    ret = 1;
+
     while (sioc->zero_copy_sent < sioc->zero_copy_queued) {
         received = recvmsg(sioc->fd, &msg, MSG_ERRQUEUE);
         if (received < 0) {
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 2/3] Add zero-copy-copied migration stat
  2022-07-01 15:59 [PATCH v2 0/3] Zero copy improvements (QIOChannel + multifd) Leonardo Bras
  2022-07-01 15:59 ` [PATCH v2 1/3] QIOChannelSocket: Fix zero-copy flush returning code 1 when nothing sent Leonardo Bras
@ 2022-07-01 15:59 ` Leonardo Bras
  2022-07-04  6:18   ` Markus Armbruster
  2022-07-01 15:59 ` [PATCH v2 3/3] migration/multifd: Warn user when zerocopy not working Leonardo Bras
  2 siblings, 1 reply; 14+ messages in thread
From: Leonardo Bras @ 2022-07-01 15:59 UTC (permalink / raw)
  To: Daniel P. Berrangé,
	Juan Quintela, Dr. David Alan Gilbert, Eric Blake,
	Markus Armbruster, Peter Xu
  Cc: Leonardo Bras, qemu-devel

Signed-off-by: Leonardo Bras <leobras@redhat.com>
---
 qapi/migration.json   | 5 ++++-
 migration/migration.c | 1 +
 monitor/hmp-cmds.c    | 4 ++++
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index 7102e474a6..925f009868 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -55,6 +55,9 @@
 # @postcopy-bytes: The number of bytes sent during the post-copy phase
 #                  (since 7.0).
 #
+# @zero-copy-copied: The number of zero-copy flushes that reported data sent
+#                    using zero-copy that ended up being copied. (since 7.2)
+#
 # Since: 0.14
 ##
 { 'struct': 'MigrationStats',
@@ -65,7 +68,7 @@
            'postcopy-requests' : 'int', 'page-size' : 'int',
            'multifd-bytes' : 'uint64', 'pages-per-second' : 'uint64',
            'precopy-bytes' : 'uint64', 'downtime-bytes' : 'uint64',
-           'postcopy-bytes' : 'uint64' } }
+           'postcopy-bytes' : 'uint64', 'zero-copy-copied' : 'uint64' } }
 
 ##
 # @XBZRLECacheStats:
diff --git a/migration/migration.c b/migration/migration.c
index 78f5057373..68fc3894ba 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1034,6 +1034,7 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s)
     info->ram->precopy_bytes = ram_counters.precopy_bytes;
     info->ram->downtime_bytes = ram_counters.downtime_bytes;
     info->ram->postcopy_bytes = ram_counters.postcopy_bytes;
+    info->ram->zero_copy_copied = ram_counters.zero_copy_copied;
 
     if (migrate_use_xbzrle()) {
         info->has_xbzrle_cache = true;
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index ca98df0495..bcb1799543 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -307,6 +307,10 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
             monitor_printf(mon, "postcopy ram: %" PRIu64 " kbytes\n",
                            info->ram->postcopy_bytes >> 10);
         }
+        if (info->ram->zero_copy_copied) {
+            monitor_printf(mon, "zero-copy copied: %" PRIu64 " iterations\n",
+                           info->ram->zero_copy_copied);
+        }
     }
 
     if (info->has_disk) {
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 3/3] migration/multifd: Warn user when zerocopy not working
  2022-07-01 15:59 [PATCH v2 0/3] Zero copy improvements (QIOChannel + multifd) Leonardo Bras
  2022-07-01 15:59 ` [PATCH v2 1/3] QIOChannelSocket: Fix zero-copy flush returning code 1 when nothing sent Leonardo Bras
  2022-07-01 15:59 ` [PATCH v2 2/3] Add zero-copy-copied migration stat Leonardo Bras
@ 2022-07-01 15:59 ` Leonardo Bras
  2 siblings, 0 replies; 14+ messages in thread
From: Leonardo Bras @ 2022-07-01 15:59 UTC (permalink / raw)
  To: Daniel P. Berrangé,
	Juan Quintela, Dr. David Alan Gilbert, Eric Blake,
	Markus Armbruster, Peter Xu
  Cc: Leonardo Bras, qemu-devel

Some errors, like the lack of Scatter-Gather support by the network
interface(NETIF_F_SG) may cause sendmsg(...,MSG_ZEROCOPY) to fail on using
zero-copy, which causes it to fall back to the default copying mechanism.

After each full dirty-bitmap scan there should be a zero-copy flush
happening, which checks for errors each of the previous calls to
sendmsg(...,MSG_ZEROCOPY). If all of them failed to use zero-copy, then
increment zero_copy_copied migration stat to let the user know about it.

Signed-off-by: Leonardo Bras <leobras@redhat.com>
---
 migration/ram.h     | 2 ++
 migration/multifd.c | 2 ++
 migration/ram.c     | 5 +++++
 3 files changed, 9 insertions(+)

diff --git a/migration/ram.h b/migration/ram.h
index ded0a3a086..f6753f1354 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -87,4 +87,6 @@ void ram_write_tracking_prepare(void);
 int ram_write_tracking_start(void);
 void ram_write_tracking_stop(void);
 
+void zero_copy_copied(void);
+
 #endif
diff --git a/migration/multifd.c b/migration/multifd.c
index 684c014c86..ff19bd4881 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -624,6 +624,8 @@ int multifd_send_sync_main(QEMUFile *f)
             if (ret < 0) {
                 error_report_err(err);
                 return -1;
+            } else if (ret == 1) {
+                zero_copy_copied();
             }
         }
     }
diff --git a/migration/ram.c b/migration/ram.c
index 01f9cc1d72..0b71993951 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -407,6 +407,11 @@ static void ram_transferred_add(uint64_t bytes)
     ram_counters.transferred += bytes;
 }
 
+void zero_copy_copied(void)
+{
+    ram_counters.zero_copy_copied++;
+}
+
 /* used by the search for pages to send */
 struct PageSearchStatus {
     /* Current block being searched */
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/3] Add zero-copy-copied migration stat
  2022-07-01 15:59 ` [PATCH v2 2/3] Add zero-copy-copied migration stat Leonardo Bras
@ 2022-07-04  6:18   ` Markus Armbruster
  2022-07-04  9:06     ` Daniel P. Berrangé
  0 siblings, 1 reply; 14+ messages in thread
From: Markus Armbruster @ 2022-07-04  6:18 UTC (permalink / raw)
  To: Leonardo Bras
  Cc: Daniel P. Berrangé,
	Juan Quintela, Dr. David Alan Gilbert, Eric Blake, Peter Xu,
	qemu-devel

Leonardo Bras <leobras@redhat.com> writes:

> Signed-off-by: Leonardo Bras <leobras@redhat.com>
> ---
>  qapi/migration.json   | 5 ++++-
>  migration/migration.c | 1 +
>  monitor/hmp-cmds.c    | 4 ++++
>  3 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 7102e474a6..925f009868 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -55,6 +55,9 @@
>  # @postcopy-bytes: The number of bytes sent during the post-copy phase
>  #                  (since 7.0).
>  #
> +# @zero-copy-copied: The number of zero-copy flushes that reported data sent
> +#                    using zero-copy that ended up being copied. (since 7.2)

The description feels awkward.  What's a "zero-copy flush", and why
should the user care?  I figure what users care about is the number of
all-zero pages we had to "copy", i.e. send the bulky way.  Is this what
@zero-copy-copied reports?

> +#
>  # Since: 0.14
>  ##
>  { 'struct': 'MigrationStats',
> @@ -65,7 +68,7 @@
>             'postcopy-requests' : 'int', 'page-size' : 'int',
>             'multifd-bytes' : 'uint64', 'pages-per-second' : 'uint64',
>             'precopy-bytes' : 'uint64', 'downtime-bytes' : 'uint64',
> -           'postcopy-bytes' : 'uint64' } }
> +           'postcopy-bytes' : 'uint64', 'zero-copy-copied' : 'uint64' } }
>  
>  ##
>  # @XBZRLECacheStats:



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/3] Add zero-copy-copied migration stat
  2022-07-04  6:18   ` Markus Armbruster
@ 2022-07-04  9:06     ` Daniel P. Berrangé
  2022-07-04 11:33       ` Markus Armbruster
  2022-07-04 11:40       ` Dr. David Alan Gilbert
  0 siblings, 2 replies; 14+ messages in thread
From: Daniel P. Berrangé @ 2022-07-04  9:06 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Leonardo Bras, Juan Quintela, Dr. David Alan Gilbert, Eric Blake,
	Peter Xu, qemu-devel

On Mon, Jul 04, 2022 at 08:18:54AM +0200, Markus Armbruster wrote:
> Leonardo Bras <leobras@redhat.com> writes:
> 
> > Signed-off-by: Leonardo Bras <leobras@redhat.com>
> > ---
> >  qapi/migration.json   | 5 ++++-
> >  migration/migration.c | 1 +
> >  monitor/hmp-cmds.c    | 4 ++++
> >  3 files changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/qapi/migration.json b/qapi/migration.json
> > index 7102e474a6..925f009868 100644
> > --- a/qapi/migration.json
> > +++ b/qapi/migration.json
> > @@ -55,6 +55,9 @@
> >  # @postcopy-bytes: The number of bytes sent during the post-copy phase
> >  #                  (since 7.0).
> >  #
> > +# @zero-copy-copied: The number of zero-copy flushes that reported data sent
> > +#                    using zero-copy that ended up being copied. (since 7.2)
> 
> The description feels awkward.  What's a "zero-copy flush", and why
> should the user care?  I figure what users care about is the number of
> all-zero pages we had to "copy", i.e. send the bulky way.  Is this what
> @zero-copy-copied reports?

MigrationCapability field @zero-copy-send instructs QEMU to try to
avoid copying data between userspace and kernel space when transmitting
RAM region.

Even if the kernel supports zero copy, it is not guaranteed to happen,
it is merely a request to try.

QEMU periodically (once per migration iteration) flushes outstanding
zero-copy requests and gets an indication back of whether any copies
took place or not.

So this counter is a reflection of how many iterations resulted  in
zero-copy not being fully honoured.

IOW, ideally this counter will always be zero. If it is non-zero,
then the magnitude gives a very very very rough guide to what's
going on. If it is '1' then it was just a transient limitation.
If it matches the number of migration iterations, then it is a
more systemic limitation.

Incidentally, do we report the migration iteration count ? I
thought we did, but i'm not finding it now that I look.


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/3] Add zero-copy-copied migration stat
  2022-07-04  9:06     ` Daniel P. Berrangé
@ 2022-07-04 11:33       ` Markus Armbruster
  2022-07-04 11:40       ` Dr. David Alan Gilbert
  1 sibling, 0 replies; 14+ messages in thread
From: Markus Armbruster @ 2022-07-04 11:33 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Leonardo Bras, Juan Quintela, Dr. David Alan Gilbert, Eric Blake,
	Peter Xu, qemu-devel

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Mon, Jul 04, 2022 at 08:18:54AM +0200, Markus Armbruster wrote:
>> Leonardo Bras <leobras@redhat.com> writes:
>> 
>> > Signed-off-by: Leonardo Bras <leobras@redhat.com>
>> > ---
>> >  qapi/migration.json   | 5 ++++-
>> >  migration/migration.c | 1 +
>> >  monitor/hmp-cmds.c    | 4 ++++
>> >  3 files changed, 9 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/qapi/migration.json b/qapi/migration.json
>> > index 7102e474a6..925f009868 100644
>> > --- a/qapi/migration.json
>> > +++ b/qapi/migration.json
>> > @@ -55,6 +55,9 @@
>> >  # @postcopy-bytes: The number of bytes sent during the post-copy phase
>> >  #                  (since 7.0).
>> >  #
>> > +# @zero-copy-copied: The number of zero-copy flushes that reported data sent
>> > +#                    using zero-copy that ended up being copied. (since 7.2)
>> 
>> The description feels awkward.  What's a "zero-copy flush", and why
>> should the user care?  I figure what users care about is the number of
>> all-zero pages we had to "copy", i.e. send the bulky way.  Is this what
>> @zero-copy-copied reports?
>
> MigrationCapability field @zero-copy-send instructs QEMU to try to
> avoid copying data between userspace and kernel space when transmitting
> RAM region.
>
> Even if the kernel supports zero copy, it is not guaranteed to happen,
> it is merely a request to try.
>
> QEMU periodically (once per migration iteration) flushes outstanding
> zero-copy requests and gets an indication back of whether any copies
> took place or not.
>
> So this counter is a reflection of how many iterations resulted  in
> zero-copy not being fully honoured.

Aha.  Thanks!

> IOW, ideally this counter will always be zero. If it is non-zero,
> then the magnitude gives a very very very rough guide to what's
> going on. If it is '1' then it was just a transient limitation.
> If it matches the number of migration iterations, then it is a
> more systemic limitation.

I think we should rephrase the documentation in terms of migration
iterations instead of "flushes", because the latter is a non-obvious
term without a definition.

> Incidentally, do we report the migration iteration count ? I
> thought we did, but i'm not finding it now that I look.

I was about to ask exactly that.  I'm not sure the value of
@zero-copy-copied can be usefully interpreted without.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/3] Add zero-copy-copied migration stat
  2022-07-04  9:06     ` Daniel P. Berrangé
  2022-07-04 11:33       ` Markus Armbruster
@ 2022-07-04 11:40       ` Dr. David Alan Gilbert
  2022-07-04 12:04         ` Markus Armbruster
  1 sibling, 1 reply; 14+ messages in thread
From: Dr. David Alan Gilbert @ 2022-07-04 11:40 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Markus Armbruster, Leonardo Bras, Juan Quintela, Eric Blake,
	Peter Xu, qemu-devel

* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Mon, Jul 04, 2022 at 08:18:54AM +0200, Markus Armbruster wrote:
> > Leonardo Bras <leobras@redhat.com> writes:
> > 
> > > Signed-off-by: Leonardo Bras <leobras@redhat.com>
> > > ---
> > >  qapi/migration.json   | 5 ++++-
> > >  migration/migration.c | 1 +
> > >  monitor/hmp-cmds.c    | 4 ++++
> > >  3 files changed, 9 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/qapi/migration.json b/qapi/migration.json
> > > index 7102e474a6..925f009868 100644
> > > --- a/qapi/migration.json
> > > +++ b/qapi/migration.json
> > > @@ -55,6 +55,9 @@
> > >  # @postcopy-bytes: The number of bytes sent during the post-copy phase
> > >  #                  (since 7.0).
> > >  #
> > > +# @zero-copy-copied: The number of zero-copy flushes that reported data sent
> > > +#                    using zero-copy that ended up being copied. (since 7.2)
> > 
> > The description feels awkward.  What's a "zero-copy flush", and why
> > should the user care?  I figure what users care about is the number of
> > all-zero pages we had to "copy", i.e. send the bulky way.  Is this what
> > @zero-copy-copied reports?
> 
> MigrationCapability field @zero-copy-send instructs QEMU to try to
> avoid copying data between userspace and kernel space when transmitting
> RAM region.
> 
> Even if the kernel supports zero copy, it is not guaranteed to happen,
> it is merely a request to try.
> 
> QEMU periodically (once per migration iteration) flushes outstanding
> zero-copy requests and gets an indication back of whether any copies
> took place or not.
> 
> So this counter is a reflection of how many iterations resulted  in
> zero-copy not being fully honoured.
> 
> IOW, ideally this counter will always be zero. If it is non-zero,
> then the magnitude gives a very very very rough guide to what's
> going on. If it is '1' then it was just a transient limitation.
> If it matches the number of migration iterations, then it is a
> more systemic limitation.
> 
> Incidentally, do we report the migration iteration count ? I
> thought we did, but i'm not finding it now that I look.

Yes we do; it's dirty-sync-count

Dave

> 
> With regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/3] Add zero-copy-copied migration stat
  2022-07-04 11:40       ` Dr. David Alan Gilbert
@ 2022-07-04 12:04         ` Markus Armbruster
  2022-07-04 12:16           ` Daniel P. Berrangé
  0 siblings, 1 reply; 14+ messages in thread
From: Markus Armbruster @ 2022-07-04 12:04 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Daniel P. Berrangé,
	Leonardo Bras, Juan Quintela, Eric Blake, Peter Xu, qemu-devel

"Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:

> * Daniel P. Berrangé (berrange@redhat.com) wrote:
>> On Mon, Jul 04, 2022 at 08:18:54AM +0200, Markus Armbruster wrote:
>> > Leonardo Bras <leobras@redhat.com> writes:
>> > 
>> > > Signed-off-by: Leonardo Bras <leobras@redhat.com>
>> > > ---
>> > >  qapi/migration.json   | 5 ++++-
>> > >  migration/migration.c | 1 +
>> > >  monitor/hmp-cmds.c    | 4 ++++
>> > >  3 files changed, 9 insertions(+), 1 deletion(-)
>> > >
>> > > diff --git a/qapi/migration.json b/qapi/migration.json
>> > > index 7102e474a6..925f009868 100644
>> > > --- a/qapi/migration.json
>> > > +++ b/qapi/migration.json
>> > > @@ -55,6 +55,9 @@
>> > >  # @postcopy-bytes: The number of bytes sent during the post-copy phase
>> > >  #                  (since 7.0).
>> > >  #
>> > > +# @zero-copy-copied: The number of zero-copy flushes that reported data sent
>> > > +#                    using zero-copy that ended up being copied. (since 7.2)

since 7.1, unless you're planning for really tortuous review :)

>> > 
>> > The description feels awkward.  What's a "zero-copy flush", and why
>> > should the user care?  I figure what users care about is the number of
>> > all-zero pages we had to "copy", i.e. send the bulky way.  Is this what
>> > @zero-copy-copied reports?
>> 
>> MigrationCapability field @zero-copy-send instructs QEMU to try to
>> avoid copying data between userspace and kernel space when transmitting
>> RAM region.
>> 
>> Even if the kernel supports zero copy, it is not guaranteed to happen,
>> it is merely a request to try.
>> 
>> QEMU periodically (once per migration iteration) flushes outstanding
>> zero-copy requests and gets an indication back of whether any copies
>> took place or not.
>> 
>> So this counter is a reflection of how many iterations resulted  in
>> zero-copy not being fully honoured.
>> 
>> IOW, ideally this counter will always be zero. If it is non-zero,
>> then the magnitude gives a very very very rough guide to what's
>> going on. If it is '1' then it was just a transient limitation.
>> If it matches the number of migration iterations, then it is a
>> more systemic limitation.
>> 
>> Incidentally, do we report the migration iteration count ? I
>> thought we did, but i'm not finding it now that I look.
>
> Yes we do; it's dirty-sync-count

Please rephrase the documentation of @zero-copy-copied in terms of
@dirty-sync-count.  Here's my attempt.

# @zero-copy-copied: Number of times dirty RAM synchronization could not
#                    avoid copying zero pages.  This is between 0 and
#                    @dirty-sync-count.  (since 7.1)



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/3] Add zero-copy-copied migration stat
  2022-07-04 12:04         ` Markus Armbruster
@ 2022-07-04 12:16           ` Daniel P. Berrangé
  2022-07-04 12:59             ` Markus Armbruster
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel P. Berrangé @ 2022-07-04 12:16 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Dr. David Alan Gilbert, Leonardo Bras, Juan Quintela, Eric Blake,
	Peter Xu, qemu-devel

On Mon, Jul 04, 2022 at 02:04:41PM +0200, Markus Armbruster wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
> 
> > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> >> On Mon, Jul 04, 2022 at 08:18:54AM +0200, Markus Armbruster wrote:
> >> > Leonardo Bras <leobras@redhat.com> writes:
> >> > 
> >> > > Signed-off-by: Leonardo Bras <leobras@redhat.com>
> >> > > ---
> >> > >  qapi/migration.json   | 5 ++++-
> >> > >  migration/migration.c | 1 +
> >> > >  monitor/hmp-cmds.c    | 4 ++++
> >> > >  3 files changed, 9 insertions(+), 1 deletion(-)
> >> > >
> >> > > diff --git a/qapi/migration.json b/qapi/migration.json
> >> > > index 7102e474a6..925f009868 100644
> >> > > --- a/qapi/migration.json
> >> > > +++ b/qapi/migration.json
> >> > > @@ -55,6 +55,9 @@
> >> > >  # @postcopy-bytes: The number of bytes sent during the post-copy phase
> >> > >  #                  (since 7.0).
> >> > >  #
> >> > > +# @zero-copy-copied: The number of zero-copy flushes that reported data sent
> >> > > +#                    using zero-copy that ended up being copied. (since 7.2)
> 
> since 7.1, unless you're planning for really tortuous review :)
> 
> >> > 
> >> > The description feels awkward.  What's a "zero-copy flush", and why
> >> > should the user care?  I figure what users care about is the number of
> >> > all-zero pages we had to "copy", i.e. send the bulky way.  Is this what
> >> > @zero-copy-copied reports?
> >> 
> >> MigrationCapability field @zero-copy-send instructs QEMU to try to
> >> avoid copying data between userspace and kernel space when transmitting
> >> RAM region.
> >> 
> >> Even if the kernel supports zero copy, it is not guaranteed to happen,
> >> it is merely a request to try.
> >> 
> >> QEMU periodically (once per migration iteration) flushes outstanding
> >> zero-copy requests and gets an indication back of whether any copies
> >> took place or not.
> >> 
> >> So this counter is a reflection of how many iterations resulted  in
> >> zero-copy not being fully honoured.
> >> 
> >> IOW, ideally this counter will always be zero. If it is non-zero,
> >> then the magnitude gives a very very very rough guide to what's
> >> going on. If it is '1' then it was just a transient limitation.
> >> If it matches the number of migration iterations, then it is a
> >> more systemic limitation.
> >> 
> >> Incidentally, do we report the migration iteration count ? I
> >> thought we did, but i'm not finding it now that I look.
> >
> > Yes we do; it's dirty-sync-count
> 
> Please rephrase the documentation of @zero-copy-copied in terms of
> @dirty-sync-count.  Here's my attempt.
> 
> # @zero-copy-copied: Number of times dirty RAM synchronization could not
> #                    avoid copying zero pages.  This is between 0 and
> #                    @dirty-sync-count.  (since 7.1)

Any one have preferences on the name - i get slight put off by the
repeated word in the property name here.

   @zero-copy-rejects ?
   @zero-copy-misses ?
   @zero-copy-fails ?



With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/3] Add zero-copy-copied migration stat
  2022-07-04 12:16           ` Daniel P. Berrangé
@ 2022-07-04 12:59             ` Markus Armbruster
  2022-07-04 13:14               ` Daniel P. Berrangé
  0 siblings, 1 reply; 14+ messages in thread
From: Markus Armbruster @ 2022-07-04 12:59 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Dr. David Alan Gilbert, Leonardo Bras, Juan Quintela, Eric Blake,
	Peter Xu, qemu-devel

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Mon, Jul 04, 2022 at 02:04:41PM +0200, Markus Armbruster wrote:

[...]

>> Please rephrase the documentation of @zero-copy-copied in terms of
>> @dirty-sync-count.  Here's my attempt.
>> 
>> # @zero-copy-copied: Number of times dirty RAM synchronization could not
>> #                    avoid copying zero pages.  This is between 0 and
>> #                    @dirty-sync-count.  (since 7.1)
>
> Any one have preferences on the name - i get slight put off by the
> repeated word in the property name here.
>
>    @zero-copy-rejects ?
>    @zero-copy-misses ?
>    @zero-copy-fails ?

I'd consider any of these an improvement.  Not a native speaker, but
perhaps "failures" instead of "fails".

We could also express the connection to @dirty-sync-count right in the
name, like @dirty-sync-rejected-zero-copy, @dirty-sync-missed-zero-copy,
@dirty-sync-failed-zero-copy.  Or maybe -copies.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/3] Add zero-copy-copied migration stat
  2022-07-04 12:59             ` Markus Armbruster
@ 2022-07-04 13:14               ` Daniel P. Berrangé
  2022-07-04 18:13                 ` Leonardo Brás
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel P. Berrangé @ 2022-07-04 13:14 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Dr. David Alan Gilbert, Leonardo Bras, Juan Quintela, Eric Blake,
	Peter Xu, qemu-devel

On Mon, Jul 04, 2022 at 02:59:50PM +0200, Markus Armbruster wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
> > On Mon, Jul 04, 2022 at 02:04:41PM +0200, Markus Armbruster wrote:
> 
> [...]
> 
> >> Please rephrase the documentation of @zero-copy-copied in terms of
> >> @dirty-sync-count.  Here's my attempt.
> >> 
> >> # @zero-copy-copied: Number of times dirty RAM synchronization could not
> >> #                    avoid copying zero pages.  This is between 0 and
> >> #                    @dirty-sync-count.  (since 7.1)
> >
> > Any one have preferences on the name - i get slight put off by the
> > repeated word in the property name here.
> >
> >    @zero-copy-rejects ?
> >    @zero-copy-misses ?
> >    @zero-copy-fails ?
> 
> I'd consider any of these an improvement.  Not a native speaker, but
> perhaps "failures" instead of "fails".
> 
> We could also express the connection to @dirty-sync-count right in the
> name, like @dirty-sync-rejected-zero-copy, @dirty-sync-missed-zero-copy,
> @dirty-sync-failed-zero-copy.  Or maybe -copies.

Yeah, @dirty-sync-missed-zero-copy   is probably my favourite.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/3] Add zero-copy-copied migration stat
  2022-07-04 13:14               ` Daniel P. Berrangé
@ 2022-07-04 18:13                 ` Leonardo Brás
  2022-07-05  4:14                   ` Markus Armbruster
  0 siblings, 1 reply; 14+ messages in thread
From: Leonardo Brás @ 2022-07-04 18:13 UTC (permalink / raw)
  To: Daniel P. Berrangé, Markus Armbruster, Dr. David Alan Gilbert
  Cc: Juan Quintela, Eric Blake, Peter Xu, qemu-devel

Thanks Daniel, Markus and Dave for the feedback!

On Mon, 2022-07-04 at 14:14 +0100, Daniel P. Berrangé wrote:
> On Mon, Jul 04, 2022 at 02:59:50PM +0200, Markus Armbruster wrote:
> > Daniel P. Berrangé <berrange@redhat.com> writes:
> > 
> > > On Mon, Jul 04, 2022 at 02:04:41PM +0200, Markus Armbruster wrote:
> > 
[...]

> 
> since 7.1, unless you're planning for really tortuous review :)
> 

Ok, updated :)

> > [...]
> > 
> > > > Please rephrase the documentation of @zero-copy-copied in terms of
> > > > @dirty-sync-count.  Here's my attempt.
> > > > 
> > > > # @zero-copy-copied: Number of times dirty RAM synchronization could not
> > > > #                    avoid copying zero pages.  This is between 0 and
> > > > #                    @dirty-sync-count.  (since 7.1)

That's a great description! And it's almost 100% correct.

IIUC dirty-sync-count increments on migration_bitmap_sync() once after each full
dirty-bitmap scan, and even with multifd syncing at the same point, it could
potentially have a increment per multifd channel.

The only change would be having:
# This is between 0 and @dirty-sync-count * @multifd-channel.


> > > 
> > > Any one have preferences on the name - i get slight put off by the
> > > repeated word in the property name here.
> > > 
> > >    @zero-copy-rejects ?
> > >    @zero-copy-misses ?
> > >    @zero-copy-fails ?
> > 
> > I'd consider any of these an improvement.  Not a native speaker, but
> > perhaps "failures" instead of "fails".
> > 
> > We could also express the connection to @dirty-sync-count right in the
> > name, like @dirty-sync-rejected-zero-copy, @dirty-sync-missed-zero-copy,
> > @dirty-sync-failed-zero-copy.  Or maybe -copies.
> 
> Yeah, @dirty-sync-missed-zero-copy   is probably my favourite.

Ok then, this one there is :)

I will update my patchset with the suggestions, and I should publish a v3
shortly. 

Best regards,
Leo

> 
> With regards,
> Daniel




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/3] Add zero-copy-copied migration stat
  2022-07-04 18:13                 ` Leonardo Brás
@ 2022-07-05  4:14                   ` Markus Armbruster
  0 siblings, 0 replies; 14+ messages in thread
From: Markus Armbruster @ 2022-07-05  4:14 UTC (permalink / raw)
  To: Leonardo Brás
  Cc: Daniel P. Berrangé,
	Markus Armbruster, Dr. David Alan Gilbert, Juan Quintela,
	Eric Blake, Peter Xu, qemu-devel

Leonardo Brás <leobras@redhat.com> writes:

> Thanks Daniel, Markus and Dave for the feedback!
>
> On Mon, 2022-07-04 at 14:14 +0100, Daniel P. Berrangé wrote:
>> On Mon, Jul 04, 2022 at 02:59:50PM +0200, Markus Armbruster wrote:
>> > Daniel P. Berrangé <berrange@redhat.com> writes:
>> > 
>> > > On Mon, Jul 04, 2022 at 02:04:41PM +0200, Markus Armbruster wrote:
>> > 
> [...]
>
>> 
>> since 7.1, unless you're planning for really tortuous review :)
>> 
>
> Ok, updated :)
>
>> > [...]
>> > 
>> > > > Please rephrase the documentation of @zero-copy-copied in terms of
>> > > > @dirty-sync-count.  Here's my attempt.
>> > > > 
>> > > > # @zero-copy-copied: Number of times dirty RAM synchronization could not
>> > > > #                    avoid copying zero pages.  This is between 0 and
>> > > > #                    @dirty-sync-count.  (since 7.1)
>
> That's a great description! And it's almost 100% correct.

That's why we do patch review :)

> IIUC dirty-sync-count increments on migration_bitmap_sync() once after each full
> dirty-bitmap scan, and even with multifd syncing at the same point, it could
> potentially have a increment per multifd channel.
>
> The only change would be having:
> # This is between 0 and @dirty-sync-count * @multifd-channel.

Makes sense to me.

[...]



^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2022-07-05  4:16 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-01 15:59 [PATCH v2 0/3] Zero copy improvements (QIOChannel + multifd) Leonardo Bras
2022-07-01 15:59 ` [PATCH v2 1/3] QIOChannelSocket: Fix zero-copy flush returning code 1 when nothing sent Leonardo Bras
2022-07-01 15:59 ` [PATCH v2 2/3] Add zero-copy-copied migration stat Leonardo Bras
2022-07-04  6:18   ` Markus Armbruster
2022-07-04  9:06     ` Daniel P. Berrangé
2022-07-04 11:33       ` Markus Armbruster
2022-07-04 11:40       ` Dr. David Alan Gilbert
2022-07-04 12:04         ` Markus Armbruster
2022-07-04 12:16           ` Daniel P. Berrangé
2022-07-04 12:59             ` Markus Armbruster
2022-07-04 13:14               ` Daniel P. Berrangé
2022-07-04 18:13                 ` Leonardo Brás
2022-07-05  4:14                   ` Markus Armbruster
2022-07-01 15:59 ` [PATCH v2 3/3] migration/multifd: Warn user when zerocopy not working Leonardo Bras

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.