* [Qemu-devel] [PATCH] add migration capability to bypass the shared memory
@ 2016-08-09 12:10 Lai Jiangshan
  2016-08-09 13:51 ` no-reply
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Lai Jiangshan @ 2016-08-09 12:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Lai Jiangshan, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson, Juan Quintela, Amit Shah, Eric Blake,
	Markus Armbruster

When the migration capability 'bypass-shared-memory'
is set, shared memory regions are skipped during migration.

This capability is the key building block for several useful
features in QEMU, such as qemu-local-migration, qemu-live-update,
extremely-fast-save-restore, vm-template, vm-fast-live-clone,
yet-another-post-copy-migration, etc.

The philosophy behind this capability and the advanced features
built on it is that part of the memory management is separated
out from QEMU, letting other toolkits such as libvirt,
runv (https://github.com/hyperhq/runv/) or a future qemu-cmd
directly access it, manage it, and provide features on top of it.

HyperHQ (http://hyper.sh  http://hypercontainer.io/) introduced
the vm-template (vm-fast-live-clone) feature to hyper containers
several months ago, and it works well.
(see https://github.com/hyperhq/runv/pull/297)

The vm-template feature lets containers (VMs) start in 130ms
and saves 80MB of memory per container (VM), so hyper containers
are as fast and as high-density as normal containers.

On the current QEMU command line, shared memory has to be
configured via a memory-backend object. A -mem-path-share option
could be added to the QEMU command line to combine with -mem-path
for this feature; this patch does not include that
-mem-path-share change.

Advanced features:
1) qemu-local-migration, qemu-live-update
Put the mem-path on tmpfs and set share=on for it when
starting the VM. Example:
-object \
memory-backend-file,id=mem,size=128M,mem-path=/dev/shm/memory,share=on \
-numa node,nodeid=0,cpus=0-7,memdev=mem

When you want to migrate the VM locally (after fixing a security
bug in the QEMU binary, or for another reason), start a new QEMU
with the same command line plus -incoming, then migrate the VM
from the old QEMU to the new one with the migration capability
'bypass-shared-memory' set. The migration transfers the device
state *ONLY*; the memory stays in the original tmpfs-backed file.
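
As a concrete sketch of this flow (the socket path and the elided
guest options are illustrative, not part of the patch):

```shell
# New QEMU, started with the SAME command line as the old one,
# waiting for the incoming device state:
qemu-system-x86_64 ... \
    -object memory-backend-file,id=mem,size=128M,mem-path=/dev/shm/memory,share=on \
    -numa node,nodeid=0,cpus=0-7,memdev=mem \
    -incoming unix:/tmp/mig.sock

# In the old QEMU's QMP monitor, enable the capability and migrate.
# Only the device state crosses the socket; RAM stays in /dev/shm/memory.
# { "execute": "migrate-set-capabilities", "arguments": { "capabilities":
#     [ { "capability": "bypass-shared-memory", "state": true } ] } }
# { "execute": "migrate", "arguments": { "uri": "unix:/tmp/mig.sock" } }
```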

2) extremely-fast-save-restore
The same as above, but with the mem-path on a persistent file system.

3) vm-template, vm-fast-live-clone
The template VM is started as in 1) and paused when the guest
reaches the template point (for example, when the guest app is
ready); then the template VM is saved. (The template's QEMU
process can be killed at this point, because only the memory and
the device-state files (in tmpfs) are needed.)

Then we can launch one or multiple VMs based on the template VM's
state. The new VMs are started without "share=on", so they all
share the initial memory from the memory file and save a lot of
memory. All the new VMs start from the template point, so the
guest app can get to work quickly.

A new VM booted from a template VM cannot become a template
again; if you need this special feature, you could write a
cloneable-tmpfs kernel module for it.

The libvirt toolkit cannot manage vm-template currently; in
hyperhq/runv we use a QEMU wrapper script to do it. I hope
someone adds a "libvirt-managed template" feature to libvirt.
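
A possible wrapper-script flow for the template/clone cycle above
(the file names and the elided guest options are illustrative,
not from the patch):

```shell
# 1. Start the template VM with share=on (as in example 1), pause it
#    at the template point, then save the device state into tmpfs:
#    QMP: { "execute": "stop" }
#    QMP: { "execute": "migrate-set-capabilities", "arguments": { "capabilities":
#              [ { "capability": "bypass-shared-memory", "state": true } ] } }
#    QMP: { "execute": "migrate", "arguments": { "uri": "exec:cat > /dev/shm/state" } }
# The template QEMU can now be killed; only /dev/shm/memory and
# /dev/shm/state are needed from here on.

# 2. Launch each clone WITHOUT share=on, so guest writes stay
#    private to the clone (copy-on-write against the memory file):
qemu-system-x86_64 ... \
    -object memory-backend-file,id=mem,size=128M,mem-path=/dev/shm/memory \
    -numa node,nodeid=0,cpus=0-7,memdev=mem \
    -incoming "exec:cat /dev/shm/state"
```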

4) yet-another-post-copy-migration
This is a possible feature, but no toolkit can do it well yet.
Using an NBD server/client on the memory file is workable but
inconvenient. A special feature for tmpfs might be needed to
complete this feature fully. Nobody needs yet another post-copy
migration method today, but it becomes possible should someone
want it.
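
The dirty-bitmap behaviour this patch introduces (see
migration_bitmap_init in the diff below) can be modelled in a few
lines; this is a hypothetical Python sketch for illustration, not
QEMU code:

```python
# Model of the patch's per-block skip logic: every RAM block's pages
# are marked dirty for migration UNLESS bypass-shared-memory is
# enabled AND the block is shared. Block tuples are illustrative.
TARGET_PAGE_BITS = 12  # 4 KiB target pages, as on x86

def init_dirty_pages(blocks, bypass_shared_memory):
    """blocks: list of (offset, used_length, is_shared) tuples.
    Returns (set of dirty page indices, dirty page count)."""
    dirty = set()
    for offset, used_length, is_shared in blocks:
        if bypass_shared_memory and is_shared:
            continue  # shared block bypassed: its pages are never sent
        first = offset >> TARGET_PAGE_BITS
        npages = used_length >> TARGET_PAGE_BITS
        dirty.update(range(first, first + npages))
    return dirty, len(dirty)

blocks = [
    (0, 128 << 20, True),          # 128M shared guest RAM (tmpfs, share=on)
    (128 << 20, 16 << 20, False),  # 16M private block (e.g. vram)
]
dirty, count = init_dirty_pages(blocks, bypass_shared_memory=True)
print(count)  # prints 4096: only the private block's pages are counted
```

With the capability disabled, every block is counted, matching the
old behaviour of setting the whole bitmap.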

Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
---
 exec.c                        |  5 +++++
 include/exec/cpu-common.h     |  1 +
 include/migration/migration.h |  1 +
 migration/migration.c         |  9 +++++++++
 migration/ram.c               | 37 ++++++++++++++++++++++++++++---------
 qapi-schema.json              |  6 +++++-
 qmp-commands.hx               |  3 +++
 7 files changed, 52 insertions(+), 10 deletions(-)

diff --git a/exec.c b/exec.c
index 8ffde75..888919a 100644
--- a/exec.c
+++ b/exec.c
@@ -1402,6 +1402,11 @@ static void qemu_ram_setup_dump(void *addr, ram_addr_t size)
     }
 }
 
+bool qemu_ram_is_shared(RAMBlock *rb)
+{
+    return rb->flags & RAM_SHARED;
+}
+
 const char *qemu_ram_get_idstr(RAMBlock *rb)
 {
     return rb->idstr;
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 952bcfe..7c18db9 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -58,6 +58,7 @@ RAMBlock *qemu_ram_block_from_host(void *ptr, bool round_offset,
 void qemu_ram_set_idstr(RAMBlock *block, const char *name, DeviceState *dev);
 void qemu_ram_unset_idstr(RAMBlock *block);
 const char *qemu_ram_get_idstr(RAMBlock *rb);
+bool qemu_ram_is_shared(RAMBlock *rb);
 
 void cpu_physical_memory_rw(hwaddr addr, uint8_t *buf,
                             int len, int is_write);
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 3c96623..080b6b2 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -290,6 +290,7 @@ void migrate_add_blocker(Error *reason);
  */
 void migrate_del_blocker(Error *reason);
 
+bool migrate_bypass_shared_memory(void);
 bool migrate_postcopy_ram(void);
 bool migrate_zero_blocks(void);
 
diff --git a/migration/migration.c b/migration/migration.c
index 955d5ee..c87d136 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1189,6 +1189,15 @@ void qmp_migrate_set_downtime(double value, Error **errp)
     max_downtime = (uint64_t)value;
 }
 
+bool migrate_bypass_shared_memory(void)
+{
+    MigrationState *s;
+
+    s = migrate_get_current();
+
+    return s->enabled_capabilities[MIGRATION_CAPABILITY_BYPASS_SHARED_MEMORY];
+}
+
 bool migrate_postcopy_ram(void)
 {
     MigrationState *s;
diff --git a/migration/ram.c b/migration/ram.c
index 815bc0e..880972d 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -605,6 +605,28 @@ static void migration_bitmap_sync_init(void)
     num_dirty_pages_period = 0;
     xbzrle_cache_miss_prev = 0;
     iterations_prev = 0;
+    migration_dirty_pages = 0;
+}
+
+static void migration_bitmap_init(unsigned long *bitmap)
+{
+    RAMBlock *block;
+
+    bitmap_clear(bitmap, 0, last_ram_offset() >> TARGET_PAGE_BITS);
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
+            bitmap_set(bitmap, block->offset >> TARGET_PAGE_BITS,
+                       block->used_length >> TARGET_PAGE_BITS);
+
+            /*
+             * Count the total number of pages used by ram blocks not including
+             * any gaps due to alignment or unplugs.
+             */
+	    migration_dirty_pages += block->used_length >> TARGET_PAGE_BITS;
+	}
+    }
+    rcu_read_unlock();
 }
 
 static void migration_bitmap_sync(void)
@@ -631,7 +653,9 @@ static void migration_bitmap_sync(void)
     qemu_mutex_lock(&migration_bitmap_mutex);
     rcu_read_lock();
     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
-        migration_bitmap_sync_range(block->offset, block->used_length);
+        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
+            migration_bitmap_sync_range(block->offset, block->used_length);
+        }
     }
     rcu_read_unlock();
     qemu_mutex_unlock(&migration_bitmap_mutex);
@@ -1926,19 +1950,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     ram_bitmap_pages = last_ram_offset() >> TARGET_PAGE_BITS;
     migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
     migration_bitmap_rcu->bmap = bitmap_new(ram_bitmap_pages);
-    bitmap_set(migration_bitmap_rcu->bmap, 0, ram_bitmap_pages);
+    migration_bitmap_init(migration_bitmap_rcu->bmap);
 
     if (migrate_postcopy_ram()) {
         migration_bitmap_rcu->unsentmap = bitmap_new(ram_bitmap_pages);
-        bitmap_set(migration_bitmap_rcu->unsentmap, 0, ram_bitmap_pages);
+        bitmap_copy(migration_bitmap_rcu->unsentmap,
+		    migration_bitmap_rcu->bmap, ram_bitmap_pages);
     }
 
-    /*
-     * Count the total number of pages used by ram blocks not including any
-     * gaps due to alignment or unplugs.
-     */
-    migration_dirty_pages = ram_bytes_total() >> TARGET_PAGE_BITS;
-
     memory_global_dirty_log_start();
     migration_bitmap_sync();
     qemu_mutex_unlock_ramlist();
diff --git a/qapi-schema.json b/qapi-schema.json
index 5658723..453e6d9 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -553,11 +553,15 @@
 #          been migrated, pulling the remaining pages along as needed. NOTE: If
 #          the migration fails during postcopy the VM will fail.  (since 2.6)
 #
+# @bypass-shared-memory: the shared memory region will be bypassed on migration.
+#          This feature allows the memory region to be reused by new qemu(s)
+#          or be migrated separately. (since 2.8)
+#
 # Since: 1.2
 ##
 { 'enum': 'MigrationCapability',
   'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
-           'compress', 'events', 'postcopy-ram'] }
+           'compress', 'events', 'postcopy-ram', 'bypass-shared-memory'] }
 
 ##
 # @MigrationCapabilityStatus
diff --git a/qmp-commands.hx b/qmp-commands.hx
index c8d360a..c31152c 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -3723,6 +3723,7 @@ Enable/Disable migration capabilities
 - "compress": use multiple compression threads to accelerate live migration
 - "events": generate events for each migration state change
 - "postcopy-ram": postcopy mode for live migration
+- "bypass-shared-memory": bypass shared memory region
 
 Arguments:
 
@@ -3753,6 +3754,7 @@ Query current migration capabilities
          - "compress": Multiple compression threads state (json-bool)
          - "events": Migration state change event state (json-bool)
          - "postcopy-ram": postcopy ram state (json-bool)
+         - "bypass-shared-memory": bypass shared memory state (json-bool)
 
 Arguments:
 
@@ -3767,6 +3769,7 @@ Example:
      {"state": false, "capability": "compress"},
      {"state": true, "capability": "events"},
      {"state": false, "capability": "postcopy-ram"}
+     {"state": false, "capability": "bypass-shared-memory"}
    ]}
 
 EQMP
-- 
2.7.4 (Apple Git-66)

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH] add migration capability to bypass the shared memory
  2016-08-09 12:10 [Qemu-devel] [PATCH] add migration capability to bypass the shared memory Lai Jiangshan
@ 2016-08-09 13:51 ` no-reply
  2016-08-09 19:12 ` Dr. David Alan Gilbert
  2016-08-10  9:03 ` Juan Quintela
  2 siblings, 0 replies; 25+ messages in thread
From: no-reply @ 2016-08-09 13:51 UTC (permalink / raw)
  To: jiangshanlai
  Cc: famz, qemu-devel, quintela, crosthwaite.peter, armbru, amit.shah,
	pbonzini, rth

Hi,

Your series seems to have some coding style problems. See output below for
more information:

Message-id: 1470744604-80857-1-git-send-email-jiangshanlai@gmail.com
Type: series
Subject: [Qemu-devel] [PATCH] add migration capability to bypass the shared memory

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git show --no-patch --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
e63cdc7 add migration capability to bypass the shared memory

=== OUTPUT BEGIN ===
Checking PATCH 1/1: add migration capability to bypass the shared memory...
ERROR: code indent should never use tabs
#175: FILE: migration/ram.c:626:
+^I    migration_dirty_pages += block->used_length >> TARGET_PAGE_BITS;$

ERROR: code indent should never use tabs
#176: FILE: migration/ram.c:627:
+^I}$

ERROR: code indent should never use tabs
#204: FILE: migration/ram.c:1958:
+^I^I    migration_bitmap_rcu->bmap, ram_bitmap_pages);$

total: 3 errors, 0 warnings, 137 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@freelists.org


* Re: [Qemu-devel] [PATCH] add migration capability to bypass the shared memory
  2016-08-09 12:10 [Qemu-devel] [PATCH] add migration capability to bypass the shared memory Lai Jiangshan
  2016-08-09 13:51 ` no-reply
@ 2016-08-09 19:12 ` Dr. David Alan Gilbert
  2016-08-10  0:54   ` [Qemu-devel] [PATCH V2] " Lai Jiangshan
  2016-08-10  1:09   ` [Qemu-devel] [PATCH] " Lai Jiangshan
  2016-08-10  9:03 ` Juan Quintela
  2 siblings, 2 replies; 25+ messages in thread
From: Dr. David Alan Gilbert @ 2016-08-09 19:12 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: qemu-devel, Juan Quintela, Peter Crosthwaite, Markus Armbruster,
	Amit Shah, Paolo Bonzini, Richard Henderson

* Lai Jiangshan (jiangshanlai@gmail.com) wrote:
> When the migration capability 'bypass-shared-memory'
> is set, the shared memory will be bypassed when migration.
> 
> It is the key feature to enable several excellent features for
> the qemu, such as qemu-local-migration, qemu-live-update,
> extremely-fast-save-restore, vm-template, vm-fast-live-clone,
> yet-another-post-copy-migration, etc..
> 
> The philosophy behind this key feature and the advanced
> key features is that a part of the memory management is
> separated out from the qemu, and let the other toolkits
> such as libvirt, runv(https://github.com/hyperhq/runv/)
> or the next qemu-cmd directly access to it, manage it,
> provide features to it.
> 
> The hyperhq(http://hyper.sh  http://hypercontainer.io/)
> introduced the feature vm-template(vm-fast-live-clone)
> to the hyper container for several months, it works perfect.
> (see https://github.com/hyperhq/runv/pull/297)
> 
> The feature vm-template makes the containers(VMs) can
> be started in 130ms and save 80M memory for every
> container(VM). So that the hyper containers are fast
> and high-density as normal containers.

Very nice.

> In current qemu command line, shared memory has
> to be configured via memory-object. Anyone can add a
> -mem-path-share to the qemu command line for combining
> with -mem-path for this feature. This patch doesn’t include
> this change of -mem-path-share.
> 
> Advanced features:
> 1) qemu-local-migration, qemu-live-update
> Set the mem-path on the tmpfs and set share=on for it when
> start the vm. example:
> -object \
> memory-backend-file,id=mem,size=128M,mem-path=/dev/shm/memory,share=on \
> -numa node,nodeid=0,cpus=0-7,memdev=mem
> 
> when you want to migrate the vm locally (after fixed a security bug
> of the qemu-binary, or other reason), you can start a new qemu with
> the same command line and -incoming, then you can migrate the
> vm from the old qemu to the new qemu with the migration capability
> 'bypass-shared-memory' set. The migration will migrate the device-state
> *ONLY*, the memory is the origin memory backed by tmpfs file.

Interesting; I was wondering about using the xen-save-devices and
xen-load-devices commands to do the same trick; but you're allowing
it to be specified on individual RAM blocks which is probably a good
thing.
(I'm not sure what happens to things like vram).

> 2) extremely-fast-save-restore
> the same above, but the mem-path is on the persistent file system.
> 
> 3)  vm-template, vm-fast-live-clone
> the template vm is started as 1), and paused when the guest reaches
> the template point(example: the guest app is ready), then the template
> vm is saved. (the qemu process of the template can be killed now, because
> we need only the memory and the device state files (in tmpfs)).
> 
> Then we can launch one or multiple VMs base on the template vm states,
> the new VMs are started without the “share=on”, all the new VMs share
> the initial memory from the memory file, they save a lot of memory.
> all the new VMs start from the template point, the guest app can go to
> work quickly.
> 
> The new VM booted from template vm can’t become template
> again, if you need this special feature, you can write a cloneable-tmpfs
> kernel module for it.

You could probably use the new write detection mode in userfaultfd to do
that.

> The libvirt toolkit can’t manage vm-template currently, in the
> hyperhq/runv, we use qemu wrapper script to do it. I hope someone add
> “libvrit managed template” feature to libvirt.
> 
> 4) yet-another-post-copy-migration
> It is a possible feature, no toolkit can do it well now.
> Using nbd server/client on the memory file is reluctantly Ok but
> inconvenient. A special feature for tmpfs might be needed to
> fully complete this feature.
> No one need yet another post copy migration method,
> but it is possible when some crazy man need it.

Well as the crazy man who wrote the current postcopy code, what would
you need it to do that the current one can't?

I can't see anything noticeably wrong with your code; but I see the bot
gave you some style warnings.

Dave

> Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
> ---
>  exec.c                        |  5 +++++
>  include/exec/cpu-common.h     |  1 +
>  include/migration/migration.h |  1 +
>  migration/migration.c         |  9 +++++++++
>  migration/ram.c               | 37 ++++++++++++++++++++++++++++---------
>  qapi-schema.json              |  6 +++++-
>  qmp-commands.hx               |  3 +++
>  7 files changed, 52 insertions(+), 10 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index 8ffde75..888919a 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -1402,6 +1402,11 @@ static void qemu_ram_setup_dump(void *addr, ram_addr_t size)
>      }
>  }
>  
> +bool qemu_ram_is_shared(RAMBlock *rb)
> +{
> +    return rb->flags & RAM_SHARED;
> +}
> +
>  const char *qemu_ram_get_idstr(RAMBlock *rb)
>  {
>      return rb->idstr;
> diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
> index 952bcfe..7c18db9 100644
> --- a/include/exec/cpu-common.h
> +++ b/include/exec/cpu-common.h
> @@ -58,6 +58,7 @@ RAMBlock *qemu_ram_block_from_host(void *ptr, bool round_offset,
>  void qemu_ram_set_idstr(RAMBlock *block, const char *name, DeviceState *dev);
>  void qemu_ram_unset_idstr(RAMBlock *block);
>  const char *qemu_ram_get_idstr(RAMBlock *rb);
> +bool qemu_ram_is_shared(RAMBlock *rb);
>  
>  void cpu_physical_memory_rw(hwaddr addr, uint8_t *buf,
>                              int len, int is_write);
> diff --git a/include/migration/migration.h b/include/migration/migration.h
> index 3c96623..080b6b2 100644
> --- a/include/migration/migration.h
> +++ b/include/migration/migration.h
> @@ -290,6 +290,7 @@ void migrate_add_blocker(Error *reason);
>   */
>  void migrate_del_blocker(Error *reason);
>  
> +bool migrate_bypass_shared_memory(void);
>  bool migrate_postcopy_ram(void);
>  bool migrate_zero_blocks(void);
>  
> diff --git a/migration/migration.c b/migration/migration.c
> index 955d5ee..c87d136 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1189,6 +1189,15 @@ void qmp_migrate_set_downtime(double value, Error **errp)
>      max_downtime = (uint64_t)value;
>  }
>  
> +bool migrate_bypass_shared_memory(void)
> +{
> +    MigrationState *s;
> +
> +    s = migrate_get_current();
> +
> +    return s->enabled_capabilities[MIGRATION_CAPABILITY_BYPASS_SHARED_MEMORY];
> +}
> +
>  bool migrate_postcopy_ram(void)
>  {
>      MigrationState *s;
> diff --git a/migration/ram.c b/migration/ram.c
> index 815bc0e..880972d 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -605,6 +605,28 @@ static void migration_bitmap_sync_init(void)
>      num_dirty_pages_period = 0;
>      xbzrle_cache_miss_prev = 0;
>      iterations_prev = 0;
> +    migration_dirty_pages = 0;
> +}
> +
> +static void migration_bitmap_init(unsigned long *bitmap)
> +{
> +    RAMBlock *block;
> +
> +    bitmap_clear(bitmap, 0, last_ram_offset() >> TARGET_PAGE_BITS);
> +    rcu_read_lock();
> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
> +            bitmap_set(bitmap, block->offset >> TARGET_PAGE_BITS,
> +                       block->used_length >> TARGET_PAGE_BITS);
> +
> +            /*
> +             * Count the total number of pages used by ram blocks not including
> +             * any gaps due to alignment or unplugs.
> +             */
> +	    migration_dirty_pages += block->used_length >> TARGET_PAGE_BITS;
> +	}
> +    }
> +    rcu_read_unlock();
>  }
>  
>  static void migration_bitmap_sync(void)
> @@ -631,7 +653,9 @@ static void migration_bitmap_sync(void)
>      qemu_mutex_lock(&migration_bitmap_mutex);
>      rcu_read_lock();
>      QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> -        migration_bitmap_sync_range(block->offset, block->used_length);
> +        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
> +            migration_bitmap_sync_range(block->offset, block->used_length);
> +        }
>      }
>      rcu_read_unlock();
>      qemu_mutex_unlock(&migration_bitmap_mutex);
> @@ -1926,19 +1950,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>      ram_bitmap_pages = last_ram_offset() >> TARGET_PAGE_BITS;
>      migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
>      migration_bitmap_rcu->bmap = bitmap_new(ram_bitmap_pages);
> -    bitmap_set(migration_bitmap_rcu->bmap, 0, ram_bitmap_pages);
> +    migration_bitmap_init(migration_bitmap_rcu->bmap);
>  
>      if (migrate_postcopy_ram()) {
>          migration_bitmap_rcu->unsentmap = bitmap_new(ram_bitmap_pages);
> -        bitmap_set(migration_bitmap_rcu->unsentmap, 0, ram_bitmap_pages);
> +        bitmap_copy(migration_bitmap_rcu->unsentmap,
> +		    migration_bitmap_rcu->bmap, ram_bitmap_pages);
>      }
>  
> -    /*
> -     * Count the total number of pages used by ram blocks not including any
> -     * gaps due to alignment or unplugs.
> -     */
> -    migration_dirty_pages = ram_bytes_total() >> TARGET_PAGE_BITS;
> -
>      memory_global_dirty_log_start();
>      migration_bitmap_sync();
>      qemu_mutex_unlock_ramlist();
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 5658723..453e6d9 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -553,11 +553,15 @@
>  #          been migrated, pulling the remaining pages along as needed. NOTE: If
>  #          the migration fails during postcopy the VM will fail.  (since 2.6)
>  #
> +# @bypass-shared-memory: the shared memory region will be bypassed on migration.
> +#          This feature allows the memory region to be reused by new qemu(s)
> +#          or be migrated separately. (since 2.8)
> +#
>  # Since: 1.2
>  ##
>  { 'enum': 'MigrationCapability',
>    'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
> -           'compress', 'events', 'postcopy-ram'] }
> +           'compress', 'events', 'postcopy-ram', 'bypass-shared-memory'] }
>  
>  ##
>  # @MigrationCapabilityStatus
> diff --git a/qmp-commands.hx b/qmp-commands.hx
> index c8d360a..c31152c 100644
> --- a/qmp-commands.hx
> +++ b/qmp-commands.hx
> @@ -3723,6 +3723,7 @@ Enable/Disable migration capabilities
>  - "compress": use multiple compression threads to accelerate live migration
>  - "events": generate events for each migration state change
>  - "postcopy-ram": postcopy mode for live migration
> +- "bypass-shared-memory": bypass shared memory region
>  
>  Arguments:
>  
> @@ -3753,6 +3754,7 @@ Query current migration capabilities
>           - "compress": Multiple compression threads state (json-bool)
>           - "events": Migration state change event state (json-bool)
>           - "postcopy-ram": postcopy ram state (json-bool)
> +         - "bypass-shared-memory": bypass shared memory state (json-bool)
>  
>  Arguments:
>  
> @@ -3767,6 +3769,7 @@ Example:
>       {"state": false, "capability": "compress"},
>       {"state": true, "capability": "events"},
>       {"state": false, "capability": "postcopy-ram"}
> +     {"state": false, "capability": "bypass-shared-memory"}
>     ]}
>  
>  EQMP
> -- 
> 2.7.4 (Apple Git-66)
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* [Qemu-devel] [PATCH V2] add migration capability to bypass the shared memory
  2016-08-09 19:12 ` Dr. David Alan Gilbert
@ 2016-08-10  0:54   ` Lai Jiangshan
  2016-08-10  1:22     ` Fam Zheng
                       ` (3 more replies)
  2016-08-10  1:09   ` [Qemu-devel] [PATCH] " Lai Jiangshan
  1 sibling, 4 replies; 25+ messages in thread
From: Lai Jiangshan @ 2016-08-10  0:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: Lai Jiangshan, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson, Juan Quintela, Amit Shah, Eric Blake,
	Markus Armbruster

When the migration capability 'bypass-shared-memory'
is set, shared memory regions are skipped during migration.

This capability is the key building block for several useful
features in QEMU, such as qemu-local-migration, qemu-live-update,
extremely-fast-save-restore, vm-template, vm-fast-live-clone,
yet-another-post-copy-migration, etc.

The philosophy behind this capability and the advanced features
built on it is that part of the memory management is separated
out from QEMU, letting other toolkits such as libvirt,
runv (https://github.com/hyperhq/runv/) or a future qemu-cmd
directly access it, manage it, and provide features on top of it.

HyperHQ (http://hyper.sh  http://hypercontainer.io/) introduced
the vm-template (vm-fast-live-clone) feature to hyper containers
several months ago, and it works well.
(see https://github.com/hyperhq/runv/pull/297)

The vm-template feature lets containers (VMs) start in 130ms
and saves 80MB of memory per container (VM), so hyper containers
are as fast and as high-density as normal containers.

On the current QEMU command line, shared memory has to be
configured via a memory-backend object. A -mem-path-share option
could be added to the QEMU command line to combine with -mem-path
for this feature; this patch does not include that
-mem-path-share change.

Advanced features:
1) qemu-local-migration, qemu-live-update
Put the mem-path on tmpfs and set share=on for it when
starting the VM. Example:
-object \
memory-backend-file,id=mem,size=128M,mem-path=/dev/shm/memory,share=on \
-numa node,nodeid=0,cpus=0-7,memdev=mem

When you want to migrate the VM locally (after fixing a security
bug in the QEMU binary, or for another reason), start a new QEMU
with the same command line plus -incoming, then migrate the VM
from the old QEMU to the new one with the migration capability
'bypass-shared-memory' set. The migration transfers the device
state *ONLY*; the memory stays in the original tmpfs-backed file.

2) extremely-fast-save-restore
The same as above, but with the mem-path on a persistent file system.

3) vm-template, vm-fast-live-clone
The template VM is started as in 1) and paused when the guest
reaches the template point (for example, when the guest app is
ready); then the template VM is saved. (The template's QEMU
process can be killed at this point, because only the memory and
the device-state files (in tmpfs) are needed.)

Then we can launch one or multiple VMs based on the template VM's
state. The new VMs are started without "share=on", so they all
share the initial memory from the memory file and save a lot of
memory. All the new VMs start from the template point, so the
guest app can get to work quickly.

A new VM booted from a template VM cannot become a template
again; if you need this unusual chained-template feature, you
could write a cloneable-tmpfs kernel module for it.

The libvirt toolkit cannot manage vm-template currently; in
hyperhq/runv we use a QEMU wrapper script to do it. I hope
someone adds a "libvirt-managed template" feature to libvirt.

4) yet-another-post-copy-migration
This is a possible feature, but no toolkit can do it well yet.
Using an NBD server/client on the memory file is workable but
inconvenient. A special feature for tmpfs might be needed to
complete this feature fully. Nobody needs yet another post-copy
migration method today, but it becomes possible should someone
want it.

Changed from v1:
   fix style

Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
---
 exec.c                        |  5 +++++
 include/exec/cpu-common.h     |  1 +
 include/migration/migration.h |  1 +
 migration/migration.c         |  9 +++++++++
 migration/ram.c               | 37 ++++++++++++++++++++++++++++---------
 qapi-schema.json              |  6 +++++-
 qmp-commands.hx               |  3 +++
 7 files changed, 52 insertions(+), 10 deletions(-)

diff --git a/exec.c b/exec.c
index 8ffde75..888919a 100644
--- a/exec.c
+++ b/exec.c
@@ -1402,6 +1402,11 @@ static void qemu_ram_setup_dump(void *addr, ram_addr_t size)
     }
 }
 
+bool qemu_ram_is_shared(RAMBlock *rb)
+{
+    return rb->flags & RAM_SHARED;
+}
+
 const char *qemu_ram_get_idstr(RAMBlock *rb)
 {
     return rb->idstr;
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 952bcfe..7c18db9 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -58,6 +58,7 @@ RAMBlock *qemu_ram_block_from_host(void *ptr, bool round_offset,
 void qemu_ram_set_idstr(RAMBlock *block, const char *name, DeviceState *dev);
 void qemu_ram_unset_idstr(RAMBlock *block);
 const char *qemu_ram_get_idstr(RAMBlock *rb);
+bool qemu_ram_is_shared(RAMBlock *rb);
 
 void cpu_physical_memory_rw(hwaddr addr, uint8_t *buf,
                             int len, int is_write);
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 3c96623..080b6b2 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -290,6 +290,7 @@ void migrate_add_blocker(Error *reason);
  */
 void migrate_del_blocker(Error *reason);
 
+bool migrate_bypass_shared_memory(void);
 bool migrate_postcopy_ram(void);
 bool migrate_zero_blocks(void);
 
diff --git a/migration/migration.c b/migration/migration.c
index 955d5ee..c87d136 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1189,6 +1189,15 @@ void qmp_migrate_set_downtime(double value, Error **errp)
     max_downtime = (uint64_t)value;
 }
 
+bool migrate_bypass_shared_memory(void)
+{
+    MigrationState *s;
+
+    s = migrate_get_current();
+
+    return s->enabled_capabilities[MIGRATION_CAPABILITY_BYPASS_SHARED_MEMORY];
+}
+
 bool migrate_postcopy_ram(void)
 {
     MigrationState *s;
diff --git a/migration/ram.c b/migration/ram.c
index 815bc0e..f7c4081 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -605,6 +605,28 @@ static void migration_bitmap_sync_init(void)
     num_dirty_pages_period = 0;
     xbzrle_cache_miss_prev = 0;
     iterations_prev = 0;
+    migration_dirty_pages = 0;
+}
+
+static void migration_bitmap_init(unsigned long *bitmap)
+{
+    RAMBlock *block;
+
+    bitmap_clear(bitmap, 0, last_ram_offset() >> TARGET_PAGE_BITS);
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
+            bitmap_set(bitmap, block->offset >> TARGET_PAGE_BITS,
+                       block->used_length >> TARGET_PAGE_BITS);
+
+            /*
+             * Count the total number of pages used by ram blocks not including
+             * any gaps due to alignment or unplugs.
+             */
+            migration_dirty_pages += block->used_length >> TARGET_PAGE_BITS;
+        }
+    }
+    rcu_read_unlock();
 }
 
 static void migration_bitmap_sync(void)
@@ -631,7 +653,9 @@ static void migration_bitmap_sync(void)
     qemu_mutex_lock(&migration_bitmap_mutex);
     rcu_read_lock();
     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
-        migration_bitmap_sync_range(block->offset, block->used_length);
+        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
+            migration_bitmap_sync_range(block->offset, block->used_length);
+        }
     }
     rcu_read_unlock();
     qemu_mutex_unlock(&migration_bitmap_mutex);
@@ -1926,19 +1950,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     ram_bitmap_pages = last_ram_offset() >> TARGET_PAGE_BITS;
     migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
     migration_bitmap_rcu->bmap = bitmap_new(ram_bitmap_pages);
-    bitmap_set(migration_bitmap_rcu->bmap, 0, ram_bitmap_pages);
+    migration_bitmap_init(migration_bitmap_rcu->bmap);
 
     if (migrate_postcopy_ram()) {
         migration_bitmap_rcu->unsentmap = bitmap_new(ram_bitmap_pages);
-        bitmap_set(migration_bitmap_rcu->unsentmap, 0, ram_bitmap_pages);
+        bitmap_copy(migration_bitmap_rcu->unsentmap,
+                    migration_bitmap_rcu->bmap, ram_bitmap_pages);
     }
 
-    /*
-     * Count the total number of pages used by ram blocks not including any
-     * gaps due to alignment or unplugs.
-     */
-    migration_dirty_pages = ram_bytes_total() >> TARGET_PAGE_BITS;
-
     memory_global_dirty_log_start();
     migration_bitmap_sync();
     qemu_mutex_unlock_ramlist();
diff --git a/qapi-schema.json b/qapi-schema.json
index 5658723..453e6d9 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -553,11 +553,15 @@
 #          been migrated, pulling the remaining pages along as needed. NOTE: If
 #          the migration fails during postcopy the VM will fail.  (since 2.6)
 #
+# @bypass-shared-memory: shared memory regions will be bypassed during
+#          migration. This feature allows the memory regions to be reused
+#          by new qemu(s) or to be migrated separately. (since 2.8)
+#
 # Since: 1.2
 ##
 { 'enum': 'MigrationCapability',
   'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
-           'compress', 'events', 'postcopy-ram'] }
+           'compress', 'events', 'postcopy-ram', 'bypass-shared-memory'] }
 
 ##
 # @MigrationCapabilityStatus
diff --git a/qmp-commands.hx b/qmp-commands.hx
index c8d360a..c31152c 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -3723,6 +3723,7 @@ Enable/Disable migration capabilities
 - "compress": use multiple compression threads to accelerate live migration
 - "events": generate events for each migration state change
 - "postcopy-ram": postcopy mode for live migration
+- "bypass-shared-memory": bypass shared memory region
 
 Arguments:
 
@@ -3753,6 +3754,7 @@ Query current migration capabilities
          - "compress": Multiple compression threads state (json-bool)
          - "events": Migration state change event state (json-bool)
          - "postcopy-ram": postcopy ram state (json-bool)
+         - "bypass-shared-memory": bypass shared memory state (json-bool)
 
 Arguments:
 
@@ -3767,6 +3769,7 @@ Example:
      {"state": false, "capability": "compress"},
      {"state": true, "capability": "events"},
-     {"state": false, "capability": "postcopy-ram"}
+     {"state": false, "capability": "postcopy-ram"},
+     {"state": false, "capability": "bypass-shared-memory"}
    ]}
 
 EQMP
-- 
2.7.4 (Apple Git-66)
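For reference, the capability added above is toggled at runtime like any other migration capability. A hypothetical QMP exchange (the `migrate-set-capabilities` and `migrate` commands are the existing QMP API; the URI is an example value):

```json
{ "execute": "migrate-set-capabilities",
  "arguments": { "capabilities": [
      { "capability": "bypass-shared-memory", "state": true } ] } }
{ "return": {} }
{ "execute": "migrate",
  "arguments": { "uri": "tcp:127.0.0.1:4444" } }
```

With the capability set, only device state and non-shared RAM blocks are transferred; the shared memory-backend-file regions are left for the destination qemu to map itself.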

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH] add migration capability to bypass the shared memory
  2016-08-09 19:12 ` Dr. David Alan Gilbert
  2016-08-10  0:54   ` [Qemu-devel] [PATCH V2] " Lai Jiangshan
@ 2016-08-10  1:09   ` Lai Jiangshan
  1 sibling, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2016-08-10  1:09 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, Juan Quintela, Peter Crosthwaite, Markus Armbruster,
	Amit Shah, Paolo Bonzini, Richard Henderson

On Wed, Aug 10, 2016 at 3:12 AM, Dr. David Alan Gilbert
<dgilbert@redhat.com> wrote:
> * Lai Jiangshan (jiangshanlai@gmail.com) wrote:
>> The feature vm-template makes the containers(VMs) can
>> be started in 130ms and save 80M memory for every
>> container(VM). So that the hyper containers are fast
>> and high-density as normal containers.
>
> Very nice.

Hello, David,

thank you for your review.


>>
>> 4) yet-another-post-copy-migration
>> It is a possible feature, no toolkit can do it well now.
>> Using nbd server/client on the memory file is reluctantly Ok but
>> inconvenient. A special feature for tmpfs might be needed to
>> fully complete this feature.
>> No one need yet another post copy migration method,
>> but it is possible when some crazy man need it.
>
> Well as the crazy man who wrote the current postcopy code, what would
> you need it to do that the current one can't?

I don't think we need to add a new post-copy-migration method.
I mentioned it to show one of the potentials of this patch.

>
> I can't see anything noticeably wrong with your code; but I see the bot
> gave you some style warnings.

Fixed in v2.

Thanks,
Lai

>
> Dave
>
>> Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
>> ---
>>  exec.c                        |  5 +++++
>>  include/exec/cpu-common.h     |  1 +
>>  include/migration/migration.h |  1 +
>>  migration/migration.c         |  9 +++++++++
>>  migration/ram.c               | 37 ++++++++++++++++++++++++++++---------
>>  qapi-schema.json              |  6 +++++-
>>  qmp-commands.hx               |  3 +++
>>  7 files changed, 52 insertions(+), 10 deletions(-)


* Re: [Qemu-devel] [PATCH V2] add migration capability to bypass the shared memory
  2016-08-10  0:54   ` [Qemu-devel] [PATCH V2] " Lai Jiangshan
@ 2016-08-10  1:22     ` Fam Zheng
  2016-08-10  2:22     ` Li, Liang Z
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 25+ messages in thread
From: Fam Zheng @ 2016-08-10  1:22 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: qemu-devel, Juan Quintela, Peter Crosthwaite, Markus Armbruster,
	Amit Shah, Paolo Bonzini, Richard Henderson

On Wed, 08/10 08:54, Lai Jiangshan wrote:
> When the migration capability 'bypass-shared-memory'
> is set, the shared memory will be bypassed when migration.
> 
> It is the key feature to enable several excellent features for
> the qemu, such as qemu-local-migration, qemu-live-update,
> extremely-fast-save-restore, vm-template, vm-fast-live-clone,
> yet-another-post-copy-migration, etc..
> 
> The philosophy behind this key feature and the advanced
> key features is that a part of the memory management is
> separated out from the qemu, and let the other toolkits
> such as libvirt, runv(https://github.com/hyperhq/runv/)
> or the next qemu-cmd directly access to it, manage it,
> provide features to it.
> 
> The hyperhq(http://hyper.sh  http://hypercontainer.io/)
> introduced the feature vm-template(vm-fast-live-clone)
> to the hyper container for several months, it works perfect.
> (see https://github.com/hyperhq/runv/pull/297)
> 
> The feature vm-template makes the containers(VMs) can
> be started in 130ms and save 80M memory for every
> container(VM). So that the hyper containers are fast
> and high-density as normal containers.
> 
> In current qemu command line, shared memory has
> to be configured via memory-object. Anyone can add a
> -mem-path-share to the qemu command line for combining
> with -mem-path for this feature. This patch doesn’t include
> this change of -mem-path-share.
> 
> Advanced features:
> 1) qemu-local-migration, qemu-live-update
> Set the mem-path on the tmpfs and set share=on for it when
> start the vm. example:
> -object \
> memory-backend-file,id=mem,size=128M,mem-path=/dev/shm/memory,share=on \
> -numa node,nodeid=0,cpus=0-7,memdev=mem
> 
> when you want to migrate the vm locally (after fixed a security bug
> of the qemu-binary, or other reason), you can start a new qemu with
> the same command line and -incoming, then you can migrate the
> vm from the old qemu to the new qemu with the migration capability
> 'bypass-shared-memory' set. The migration will migrate the device-state
> *ONLY*, the memory is the origin memory backed by tmpfs file.
> 
> 2) extremely-fast-save-restore
> the same above, but the mem-path is on the persistent file system.
> 
> 3)  vm-template, vm-fast-live-clone
> the template vm is started as 1), and paused when the guest reaches
> the template point(example: the guest app is ready), then the template
> vm is saved. (the qemu process of the template can be killed now, because
> we need only the memory and the device state files (in tmpfs)).
> 
> Then we can launch one or multiple VMs base on the template vm states,
> the new VMs are started without the “share=on”, all the new VMs share
> the initial memory from the memory file, they save a lot of memory.
> all the new VMs start from the template point, the guest app can go to
> work quickly.
> 
> The new VM booted from template vm can’t become template again,
> if you need this unusual chained-template feature, you can write
> a cloneable-tmpfs kernel module for it.
> 
> The libvirt toolkit can’t manage vm-template currently, in the
> hyperhq/runv, we use qemu wrapper script to do it. I hope someone add
> “libvrit managed template” feature to libvirt.
> 
> 4) yet-another-post-copy-migration
> It is a possible feature, no toolkit can do it well now.
> Using nbd server/client on the memory file is reluctantly Ok but
> inconvenient. A special feature for tmpfs might be needed to
> fully complete this feature.
> No one need yet another post copy migration method,
> but it is possible when some crazy man need it.
> 
> Changed from v1:
>    fix style

Thanks for sending the patch and fixing the coding style. Without having looked
at the patch, I have one meta comment here. In the future, please put the
revision log below the "---" line, after the Signed-off-by line. That text is
not wanted in the git history, and when placed there it won't be picked up by
git-am when maintainers apply the patch.

Also please remember to send new versions as a top-level thread instead of as
a reply to v1 (i.e. don't set In-Reply-To in the header).

Thanks,
Fam


* Re: [Qemu-devel] [PATCH V2] add migration capability to bypass the shared memory
  2016-08-10  0:54   ` [Qemu-devel] [PATCH V2] " Lai Jiangshan
  2016-08-10  1:22     ` Fam Zheng
@ 2016-08-10  2:22     ` Li, Liang Z
  2016-08-10  3:22       ` Lai Jiangshan
  2017-09-25  7:42     ` Zhang Haoyu
  2017-09-25 12:13     ` Zhang Haoyu
  3 siblings, 1 reply; 25+ messages in thread
From: Li, Liang Z @ 2016-08-10  2:22 UTC (permalink / raw)
  To: Lai Jiangshan, qemu-devel
  Cc: Juan Quintela, Peter Crosthwaite, Markus Armbruster, Amit Shah,
	Paolo Bonzini, Richard Henderson

Hi Jiangshan,

Glad to see your patch. It's a simple implementation which could provide very useful functions.

> +static void migration_bitmap_init(unsigned long *bitmap) {
> +    RAMBlock *block;
> +
> +    bitmap_clear(bitmap, 0, last_ram_offset() >> TARGET_PAGE_BITS);
> +    rcu_read_lock();
> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +        if (!migrate_bypass_shared_memory()
> || !qemu_ram_is_shared(block)) {
> +            bitmap_set(bitmap, block->offset >> TARGET_PAGE_BITS,

You should use (block->offset >> TARGET_PAGE_BITS )/ BITS_PER_LONG here.

> +                       block->used_length >> TARGET_PAGE_BITS);
> +
> +            /*
> +             * Count the total number of pages used by ram blocks not including
> +             * any gaps due to alignment or unplugs.
> +             */
> +            migration_dirty_pages += block->used_length >>
> TARGET_PAGE_BITS;
> +        }
> +    }
> +    rcu_read_unlock();
>  }

Liang


* Re: [Qemu-devel] [PATCH V2] add migration capability to bypass the shared memory
  2016-08-10  2:22     ` Li, Liang Z
@ 2016-08-10  3:22       ` Lai Jiangshan
  2016-08-10  5:04         ` Li, Liang Z
  0 siblings, 1 reply; 25+ messages in thread
From: Lai Jiangshan @ 2016-08-10  3:22 UTC (permalink / raw)
  To: Li, Liang Z
  Cc: qemu-devel, Juan Quintela, Peter Crosthwaite, Markus Armbruster,
	Amit Shah, Paolo Bonzini, Richard Henderson

On Wed, Aug 10, 2016 at 10:22 AM, Li, Liang Z <liang.z.li@intel.com> wrote:
> Hi Jiangshan,
>
> Glad to see your patch. It's a simple implementation which could provide very useful functions.
>
>> +static void migration_bitmap_init(unsigned long *bitmap) {
>> +    RAMBlock *block;
>> +
>> +    bitmap_clear(bitmap, 0, last_ram_offset() >> TARGET_PAGE_BITS);
>> +    rcu_read_lock();
>> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>> +        if (!migrate_bypass_shared_memory()
>> || !qemu_ram_is_shared(block)) {
>> +            bitmap_set(bitmap, block->offset >> TARGET_PAGE_BITS,
>
> You should use (block->offset >> TARGET_PAGE_BITS )/ BITS_PER_LONG here.

Hello, Li

I might have missed something, could you tell me more?

void bitmap_set(unsigned long *map, long start, long nr);
I think @start and @nr are both counted in bits.

thanks,
Lai


>
>> +                       block->used_length >> TARGET_PAGE_BITS);
>> +
>> +            /*
>> +             * Count the total number of pages used by ram blocks not including
>> +             * any gaps due to alignment or unplugs.
>> +             */
>> +            migration_dirty_pages += block->used_length >>
>> TARGET_PAGE_BITS;
>> +        }
>> +    }
>> +    rcu_read_unlock();
>>  }
>
> Liang


* Re: [Qemu-devel] [PATCH V2] add migration capability to bypass the shared memory
  2016-08-10  3:22       ` Lai Jiangshan
@ 2016-08-10  5:04         ` Li, Liang Z
  2016-08-10  9:11           ` Juan Quintela
  0 siblings, 1 reply; 25+ messages in thread
From: Li, Liang Z @ 2016-08-10  5:04 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: qemu-devel, Juan Quintela, Peter Crosthwaite, Markus Armbruster,
	Amit Shah, Paolo Bonzini, Richard Henderson

> On Wed, Aug 10, 2016 at 10:22 AM, Li, Liang Z <liang.z.li@intel.com> wrote:
> > Hi Jiangshan,
> >
> > Glad to see your patch. It's a simple implementation which could provide
> very useful functions.
> >
> >> +static void migration_bitmap_init(unsigned long *bitmap) {
> >> +    RAMBlock *block;
> >> +
> >> +    bitmap_clear(bitmap, 0, last_ram_offset() >> TARGET_PAGE_BITS);
> >> +    rcu_read_lock();
> >> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> >> +        if (!migrate_bypass_shared_memory()
> >> || !qemu_ram_is_shared(block)) {
> >> +            bitmap_set(bitmap, block->offset >> TARGET_PAGE_BITS,
> >
> > You should use (block->offset >> TARGET_PAGE_BITS )/ BITS_PER_LONG
> here.
> 
> Hello, Li
> 
> I might have missed something, could you tell me more?
> 
> void bitmap_set(unsigned long *map, long start, long nr); I think the @start
> and @nr are both the number of the bits.
> 
> thanks,
> Lai

You are right, I made a mistake when checking the code. Sorry for the noise.

BTW, is it possible to bypass the shared blocks in 'ram_find_and_save_block'?
I mean not checking whether a page is dirty for a shared block; it may make things faster.

Liang



* Re: [Qemu-devel] [PATCH] add migration capability to bypass the shared memory
  2016-08-09 12:10 [Qemu-devel] [PATCH] add migration capability to bypass the shared memory Lai Jiangshan
  2016-08-09 13:51 ` no-reply
  2016-08-09 19:12 ` Dr. David Alan Gilbert
@ 2016-08-10  9:03 ` Juan Quintela
  2016-08-30  4:11   ` Lai Jiangshan
  2 siblings, 1 reply; 25+ messages in thread
From: Juan Quintela @ 2016-08-10  9:03 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: qemu-devel, Paolo Bonzini, Peter Crosthwaite, Richard Henderson,
	Amit Shah, Eric Blake, Markus Armbruster

Lai Jiangshan <jiangshanlai@gmail.com> wrote:

Hi

First of all, I like the patchset a lot, but I would prefer to split it
to find "possible" bugs along the way, especially in postcopy, but not only there.

[very nice description of the patch]

Nothing to say about the QMP and shared memory detection, looks correct
to me.

> diff --git a/migration/ram.c b/migration/ram.c
> index 815bc0e..880972d 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -605,6 +605,28 @@ static void migration_bitmap_sync_init(void)
>      num_dirty_pages_period = 0;
>      xbzrle_cache_miss_prev = 0;
>      iterations_prev = 0;
> +    migration_dirty_pages = 0;
> +}
> +
> +static void migration_bitmap_init(unsigned long *bitmap)
> +{
> +    RAMBlock *block;
> +
> +    bitmap_clear(bitmap, 0, last_ram_offset() >> TARGET_PAGE_BITS);
> +    rcu_read_lock();
> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
> +            bitmap_set(bitmap, block->offset >> TARGET_PAGE_BITS,
> +                       block->used_length >> TARGET_PAGE_BITS);
> +
> +            /*
> +             * Count the total number of pages used by ram blocks not including
> +             * any gaps due to alignment or unplugs.
> +             */
> +	    migration_dirty_pages += block->used_length >> TARGET_PAGE_BITS;
> +	}
> +    }
> +    rcu_read_unlock();
>  }

We can split this function into a different patch.  I haven't fully checked
whether we care about taking the rcu lock here.  The thing I am most
interested in is knowing what happens when we don't set
migration_dirty_pages to the full "possible" number of memory pages.

Once here, should we check for ROM regions?

BTW, couldn't we use:

int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque)
{
    RAMBlock *block;
    int ret = 0;

    rcu_read_lock();
    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
        ret = func(block->idstr, block->host, block->offset,
                   block->used_length, opaque);
        if (ret) {
            break;
        }
    }
    rcu_read_unlock();
    return ret;
}



>  
>  static void migration_bitmap_sync(void)
> @@ -631,7 +653,9 @@ static void migration_bitmap_sync(void)
>      qemu_mutex_lock(&migration_bitmap_mutex);
>      rcu_read_lock();
>      QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> -        migration_bitmap_sync_range(block->offset, block->used_length);
> +        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
> +            migration_bitmap_sync_range(block->offset, block->used_length);
> +        }
>      }
>      rcu_read_unlock();
>      qemu_mutex_unlock(&migration_bitmap_mutex);

Oops, another place where we were not using qemu_ram_foreach_block :p


> @@ -1926,19 +1950,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>      ram_bitmap_pages = last_ram_offset() >> TARGET_PAGE_BITS;
>      migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
>      migration_bitmap_rcu->bmap = bitmap_new(ram_bitmap_pages);
> -    bitmap_set(migration_bitmap_rcu->bmap, 0, ram_bitmap_pages);
> +    migration_bitmap_init(migration_bitmap_rcu->bmap);
>  
>      if (migrate_postcopy_ram()) {
>          migration_bitmap_rcu->unsentmap = bitmap_new(ram_bitmap_pages);
> -        bitmap_set(migration_bitmap_rcu->unsentmap, 0, ram_bitmap_pages);
> +        bitmap_copy(migration_bitmap_rcu->unsentmap,
> +		    migration_bitmap_rcu->bmap, ram_bitmap_pages);
>      }

I think that if we go this route, we should move the whole if inside the
migration_bitmap_init?

>  
> -    /*
> -     * Count the total number of pages used by ram blocks not including any
> -     * gaps due to alignment or unplugs.
> -     */
> -    migration_dirty_pages = ram_bytes_total() >> TARGET_PAGE_BITS;
> -
>      memory_global_dirty_log_start();
>      migration_bitmap_sync();
>      qemu_mutex_unlock_ramlist();


As said, I'm very happy with the patch.  And it got much simpler than I
would have expected.

Thanks, Juan.


* Re: [Qemu-devel] [PATCH V2] add migration capability to bypass the shared memory
  2016-08-10  5:04         ` Li, Liang Z
@ 2016-08-10  9:11           ` Juan Quintela
  2016-08-11  7:11             ` Li, Liang Z
  0 siblings, 1 reply; 25+ messages in thread
From: Juan Quintela @ 2016-08-10  9:11 UTC (permalink / raw)
  To: Li, Liang Z
  Cc: Lai Jiangshan, qemu-devel, Peter Crosthwaite, Markus Armbruster,
	Amit Shah, Paolo Bonzini, Richard Henderson

"Li, Liang Z" <liang.z.li@intel.com> wrote:
>> On Wed, Aug 10, 2016 at 10:22 AM, Li, Liang Z <liang.z.li@intel.com> wrote:
>> > Hi Jiangshan,
>> >
>> > Glad to see your patch. It's a simple implementation which could provide
>> very useful functions.
>> >
>> >> +static void migration_bitmap_init(unsigned long *bitmap) {
>> >> +    RAMBlock *block;
>> >> +
>> >> +    bitmap_clear(bitmap, 0, last_ram_offset() >> TARGET_PAGE_BITS);
>> >> +    rcu_read_lock();
>> >> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>> >> +        if (!migrate_bypass_shared_memory()
>> >> || !qemu_ram_is_shared(block)) {
>> >> +            bitmap_set(bitmap, block->offset >> TARGET_PAGE_BITS,
>> >
>> > You should use (block->offset >> TARGET_PAGE_BITS )/ BITS_PER_LONG
>> here.
>> 
>> Hello, Li
>> 
>> I might have missed something, could you tell me more?
>> 
>> void bitmap_set(unsigned long *map, long start, long nr); I think the @start
>> and @nr are both the number of the bits.
>> 
>> thanks,
>> Lai
>
> You are right,  I have make a mistake by checking the code. Sorry for the noise.
>
> BTW. Is it possible to bypass the shared block in the 'ram_find_and_save_block'?
> I mean no to check if a page is dirty for a shared block, it may make things faster.

Nicely spotted.  That would make things faster.  But once there, we could
split the bitmap by ramblock, and migration_dirty_pages also per block;
that would make all the searches faster, and adding/removing blocks of
ram much easier.  If we start optimizing that function, we could really
do much better than the "again" parameter that we have right now.

Later, Juan.


* Re: [Qemu-devel] [PATCH V2] add migration capability to bypass the shared memory
  2016-08-10  9:11           ` Juan Quintela
@ 2016-08-11  7:11             ` Li, Liang Z
  2016-08-11 14:31               ` Lai Jiangshan
  0 siblings, 1 reply; 25+ messages in thread
From: Li, Liang Z @ 2016-08-11  7:11 UTC (permalink / raw)
  To: quintela, Lai Jiangshan
  Cc: qemu-devel, Peter Crosthwaite, Markus Armbruster, Amit Shah,
	Paolo Bonzini, Richard Henderson

> >>
> >> I might have missed something, could you tell me more?
> >>
> >> void bitmap_set(unsigned long *map, long start, long nr); I think the
> >> @start and @nr are both the number of the bits.
> >>
> >> thanks,
> >> Lai
> >
> > You are right,  I have make a mistake by checking the code. Sorry for the
> noise.
> >
> > BTW. Is it possible to bypass the shared block in the
> 'ram_find_and_save_block'?
> > I mean no to check if a page is dirty for a shared block, it may make things
> faster.
> 
> Nice spotted.  That would make things faster.  But once there we could split
> the bitmap by ramblock, and the migration_dirty_pages also per block, that
> would make all the searchs faster, and adding/removing blocks of ram much
> easier.  If we enter optimizing that function, we could really do much better
> that the "again" parameter that we have right now.
> 
> Later, Juan.

Did some experiments with an 8GB VM:
Total time: 93ms
Setup: 70ms
Downtime: 6ms

So most of the time is spent in the setup stage, and most of the setup-stage
time is spent starting dirty page logging. How about not starting dirty page
logging for shared blocks? Then it would be possible to shorten the total
time to about 30ms.

Liang


* Re: [Qemu-devel] [PATCH V2] add migration capability to bypass the shared memory
  2016-08-11  7:11             ` Li, Liang Z
@ 2016-08-11 14:31               ` Lai Jiangshan
  2016-08-11 14:45                 ` Lai Jiangshan
  2016-08-12  6:48                 ` Li, Liang Z
  0 siblings, 2 replies; 25+ messages in thread
From: Lai Jiangshan @ 2016-08-11 14:31 UTC (permalink / raw)
  To: Li, Liang Z
  Cc: quintela, qemu-devel, Peter Crosthwaite, Markus Armbruster,
	Amit Shah, Paolo Bonzini, Richard Henderson

On Thu, Aug 11, 2016 at 3:11 PM, Li, Liang Z <liang.z.li@intel.com> wrote:
>> >>
>> >> I might have missed something, could you tell me more?
>> >>
>> >> void bitmap_set(unsigned long *map, long start, long nr); I think the
>> >> @start and @nr are both the number of the bits.
>> >>
>> >> thanks,
>> >> Lai
>> >
>> > You are right,  I have make a mistake by checking the code. Sorry for the
>> noise.
>> >
>> > BTW. Is it possible to bypass the shared block in the
>> 'ram_find_and_save_block'?
>> > I mean no to check if a page is dirty for a shared block, it may make things
>> faster.
>>
>> Nice spotted.  That would make things faster.  But once there we could split
>> the bitmap by ramblock, and the migration_dirty_pages also per block, that
>> would make all the searchs faster, and adding/removing blocks of ram much
>> easier.  If we enter optimizing that function, we could really do much better
>> that the "again" parameter that we have right now.
>>
>> Later, Juan.


This patch focuses on the core thing: adding the ability to bypass
migrating the memory.
I don't want to add further optimization to it now.
I hope this patch can be merged on its own, and earlier;
optimization patches can be sent after this patch is accepted.

>
> Did some experiment with an 8GB VM.
> Total time: 93ms
> Setup: 70ms
> Downtime: 6ms
>
> So most of the time spend on Setup stage, and most of the time spend on
> setup stage is starting dirty page logging.
> How about not start dirty page logging for shared blocks? then it's possible
> to shorten the total time to about 30ms.

Bypassing memory migration includes:
  bypassing copying & sending the memory (this patch; saves seconds)
  bypassing dirty page logging (saves tens of milliseconds)
  bypassing other things??

30ms is a huge part of the 130ms in the hyperhq cases.
We definitely need it. I tried it at the beginning, but it is a little
complicated for me, so I am focusing on the major part now.

Thanks,
Lai




>
> Liang


* Re: [Qemu-devel] [PATCH V2] add migration capability to bypass the shared memory
  2016-08-11 14:31               ` Lai Jiangshan
@ 2016-08-11 14:45                 ` Lai Jiangshan
  2017-09-21  3:41                   ` Zhang Haoyu
  2016-08-12  6:48                 ` Li, Liang Z
  1 sibling, 1 reply; 25+ messages in thread
From: Lai Jiangshan @ 2016-08-11 14:45 UTC (permalink / raw)
  To: Li, Liang Z
  Cc: quintela, qemu-devel, Peter Crosthwaite, Markus Armbruster,
	Amit Shah, Paolo Bonzini, Richard Henderson

Note, the old local migration patchset:
https://lists.gnu.org/archive/html/qemu-devel/2013-12/msg00073.html

this patch can be considered a new local migration implementation,
but with more restrictions (the memory must be set shared when the qemu is booted)


* Re: [Qemu-devel] [PATCH V2] add migration capability to bypass the shared memory
  2016-08-11 14:31               ` Lai Jiangshan
  2016-08-11 14:45                 ` Lai Jiangshan
@ 2016-08-12  6:48                 ` Li, Liang Z
  2016-08-12  7:19                   ` Lai Jiangshan
  1 sibling, 1 reply; 25+ messages in thread
From: Li, Liang Z @ 2016-08-12  6:48 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: quintela, qemu-devel, Peter Crosthwaite, Markus Armbruster,
	Amit Shah, Paolo Bonzini, Richard Henderson

> >> > BTW. Is it possible to bypass the shared block in the
> >> 'ram_find_and_save_block'?
> >> > I mean no to check if a page is dirty for a shared block, it may
> >> > make things
> >> faster.
> >>
> >> Nice spotted.  That would make things faster.  But once there we
> >> could split the bitmap by ramblock, and the migration_dirty_pages
> >> also per block, that would make all the searchs faster, and
> >> adding/removing blocks of ram much easier.  If we enter optimizing
> >> that function, we could really do much better that the "again" parameter
> that we have right now.
> >>
> >> Later, Juan.
> 
> 
> This patch focuses on the core thing: adding the ability to bypass migrating
> the memory.
> I don't want to add further optimization for it now.
> I hope this patch merged solo and earlier.
> optimization patches can be sent after this patch accepted.
> 

Will you send a v3?

Liang



* Re: [Qemu-devel] [PATCH V2] add migration capability to bypass the shared memory
  2016-08-12  6:48                 ` Li, Liang Z
@ 2016-08-12  7:19                   ` Lai Jiangshan
  0 siblings, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2016-08-12  7:19 UTC (permalink / raw)
  To: Li, Liang Z
  Cc: quintela, qemu-devel, Peter Crosthwaite, Markus Armbruster,
	Amit Shah, Paolo Bonzini, Richard Henderson

On Fri, Aug 12, 2016 at 2:48 PM, Li, Liang Z <liang.z.li@intel.com> wrote:
>> >> > BTW. Is it possible to bypass the shared block in the
>> >> 'ram_find_and_save_block'?
>> >> > I mean no to check if a page is dirty for a shared block, it may
>> >> > make things
>> >> faster.
>> >>
>> >> Nice spotted.  That would make things faster.  But once there we
>> >> could split the bitmap by ramblock, and the migration_dirty_pages
>> >> also per block, that would make all the searchs faster, and
>> >> adding/removing blocks of ram much easier.  If we enter optimizing
>> >> that function, we could really do much better that the "again" parameter
>> that we have right now.
>> >>
>> >> Later, Juan.
>>
>>
>> This patch focuses on the core thing: adding the ability to bypass migrating
>> the memory.
>> I don't want to add further optimization for it now.
>> I hope this patch merged solo and earlier.
>> optimization patches can be sent after this patch accepted.
>>
>
> Will you send a v3?

After reading the current comments, it seems I don't need to update
the patch until new comments arrive (unless I have missed an
important comment...).

Thanks,
Lai

>
> Liang
>


* Re: [Qemu-devel] [PATCH] add migration capability to bypass the shared memory
  2016-08-10  9:03 ` Juan Quintela
@ 2016-08-30  4:11   ` Lai Jiangshan
  2016-11-18 13:01     ` Alexander Graf
                       ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Lai Jiangshan @ 2016-08-30  4:11 UTC (permalink / raw)
  To: Juan Quintela
  Cc: qemu-devel, Paolo Bonzini, Peter Crosthwaite, Richard Henderson,
	Amit Shah, Eric Blake, Markus Armbruster

On Wed, Aug 10, 2016 at 5:03 PM, Juan Quintela <quintela@redhat.com> wrote:
> Lai Jiangshan <jiangshanlai@gmail.com> wrote:
>
> Hi
>
> First of all, I like a lot the patchset, but I would preffer to split it
> to find "possible" bugs along the lines, especially in postcopy, but not only.

Hello, thanks for the review and comments.

I tried to make the patch sane and tight.
I don't see any strong reason to split it that wouldn't complicate the patch.

>
> [very nice description of the patch]
>
> Nothing to say about the QMP and shared memory detection, looks correct
> to me.
>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 815bc0e..880972d 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -605,6 +605,28 @@ static void migration_bitmap_sync_init(void)
>>      num_dirty_pages_period = 0;
>>      xbzrle_cache_miss_prev = 0;
>>      iterations_prev = 0;
>> +    migration_dirty_pages = 0;
>> +}
>> +
>> +static void migration_bitmap_init(unsigned long *bitmap)
>> +{
>> +    RAMBlock *block;
>> +
>> +    bitmap_clear(bitmap, 0, last_ram_offset() >> TARGET_PAGE_BITS);
>> +    rcu_read_lock();
>> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>> +        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
>> +            bitmap_set(bitmap, block->offset >> TARGET_PAGE_BITS,
>> +                       block->used_length >> TARGET_PAGE_BITS);
>> +
>> +            /*
>> +             * Count the total number of pages used by ram blocks not including
>> +             * any gaps due to alignment or unplugs.
>> +             */
>> +         migration_dirty_pages += block->used_length >> TARGET_PAGE_BITS;
>> +     }
>> +    }
>> +    rcu_read_unlock();
>>  }
>
> We can split this function in a different patch.

It calls the new function migrate_bypass_shared_memory(),
so it is not a good idea to split it out.

> I haven't fully search
> if we care about taking the rcu lock here.  The thing that I am more
> interested is in knowing what happens when we don't set
> migration_dirty_pages as the full "possible" memory pages.

I haven't tested it with postcopy; I don't know how to use postcopy.
From my review I can't find any obvious bugs in it.

I don't think there is any good reason to use migrate_bypass
and postcopy together; I can disable migrate_bypass
when postcopy==true if you want.

>
> Once here, should we check for ROM regions?
>
> BTW, couldn't we use:
>
> int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque)
> {
>     RAMBlock *block;
>     int ret = 0;
>
>     rcu_read_lock();
>     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>         ret = func(block->idstr, block->host, block->offset,
>                    block->used_length, opaque);
>         if (ret) {
>             break;
>         }
>     }
>     rcu_read_unlock();
>     return ret;
> }
>

The patch introduces only one new "QLIST_FOREACH_RCU(ram_list.blocks)",
but:
# git grep 'QLIST_FOREACH_RCU.*ram_list'  | wc -l
#       16

I don't want to introduce qemu_ram_foreach_block()
and touch the other 15 places.
I hope someone does that after this is merged.


>
>
>>
>>  static void migration_bitmap_sync(void)
>> @@ -631,7 +653,9 @@ static void migration_bitmap_sync(void)
>>      qemu_mutex_lock(&migration_bitmap_mutex);
>>      rcu_read_lock();
>>      QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>> -        migration_bitmap_sync_range(block->offset, block->used_length);
>> +        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
>> +            migration_bitmap_sync_range(block->offset, block->used_length);
>> +        }
>>      }
>>      rcu_read_unlock();
>>      qemu_mutex_unlock(&migration_bitmap_mutex);
>
> Oops, another place where we were not using qemu_ram_foreach_block :p
>
>
>> @@ -1926,19 +1950,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>      ram_bitmap_pages = last_ram_offset() >> TARGET_PAGE_BITS;
>>      migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
>>      migration_bitmap_rcu->bmap = bitmap_new(ram_bitmap_pages);
>> -    bitmap_set(migration_bitmap_rcu->bmap, 0, ram_bitmap_pages);
>> +    migration_bitmap_init(migration_bitmap_rcu->bmap);
>>
>>      if (migrate_postcopy_ram()) {
>>          migration_bitmap_rcu->unsentmap = bitmap_new(ram_bitmap_pages);
>> -        bitmap_set(migration_bitmap_rcu->unsentmap, 0, ram_bitmap_pages);
>> +        bitmap_copy(migration_bitmap_rcu->unsentmap,
>> +                 migration_bitmap_rcu->bmap, ram_bitmap_pages);
>>      }
>
> I think that if we go this route, we should move the whole if inside the
> migration_bitmap_init?

Good! I will do that when I update the patch.

Thanks,
Lai

>
>>
>> -    /*
>> -     * Count the total number of pages used by ram blocks not including any
>> -     * gaps due to alignment or unplugs.
>> -     */
>> -    migration_dirty_pages = ram_bytes_total() >> TARGET_PAGE_BITS;
>> -
>>      memory_global_dirty_log_start();
>>      migration_bitmap_sync();
>>      qemu_mutex_unlock_ramlist();
>
>
> As said, very happy with the patch.  And it got much simpler than I
> would have expected.
>
> Thanks, Juan.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH] add migration capability to bypass the shared memory
  2016-08-30  4:11   ` Lai Jiangshan
@ 2016-11-18 13:01     ` Alexander Graf
  2017-01-12 19:19     ` Jianjun Duan
  2017-09-21  6:33     ` Zhang Haoyu
  2 siblings, 0 replies; 25+ messages in thread
From: Alexander Graf @ 2016-11-18 13:01 UTC (permalink / raw)
  To: Lai Jiangshan, Juan Quintela
  Cc: Peter Crosthwaite, qemu-devel, Markus Armbruster, Amit Shah,
	Paolo Bonzini, Richard Henderson

Juan,

It looks like Lai is waiting for a reply from you on this email :).


Alex

On 08/30/2016 06:11 AM, Lai Jiangshan wrote:
> On Wed, Aug 10, 2016 at 5:03 PM, Juan Quintela <quintela@redhat.com> wrote:
>> Lai Jiangshan <jiangshanlai@gmail.com> wrote:
>>
>> Hi
>>
>> First of all, I like a lot the patchset, but I would preffer to split it
>> to find "possible" bugs along the lines, especially in postcopy, but not only.
> Hello, thanks for review and comments
>
> I tried to make the patch be sane and tight.
> I don't see any strong reason to split it without complicating the patch.
>
>> [very nice description of the patch]
>>
>> Nothing to say about the QMP and shared memory detection, looks correct
>> to me.
>>
>>> diff --git a/migration/ram.c b/migration/ram.c
>>> index 815bc0e..880972d 100644
>>> --- a/migration/ram.c
>>> +++ b/migration/ram.c
>>> @@ -605,6 +605,28 @@ static void migration_bitmap_sync_init(void)
>>>       num_dirty_pages_period = 0;
>>>       xbzrle_cache_miss_prev = 0;
>>>       iterations_prev = 0;
>>> +    migration_dirty_pages = 0;
>>> +}
>>> +
>>> +static void migration_bitmap_init(unsigned long *bitmap)
>>> +{
>>> +    RAMBlock *block;
>>> +
>>> +    bitmap_clear(bitmap, 0, last_ram_offset() >> TARGET_PAGE_BITS);
>>> +    rcu_read_lock();
>>> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>>> +        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
>>> +            bitmap_set(bitmap, block->offset >> TARGET_PAGE_BITS,
>>> +                       block->used_length >> TARGET_PAGE_BITS);
>>> +
>>> +            /*
>>> +             * Count the total number of pages used by ram blocks not including
>>> +             * any gaps due to alignment or unplugs.
>>> +             */
>>> +         migration_dirty_pages += block->used_length >> TARGET_PAGE_BITS;
>>> +     }
>>> +    }
>>> +    rcu_read_unlock();
>>>   }
>> We can split this function in a different patch.
> it calls the new function migrate_bypass_shared_memory().
> it is no a good idea to split it out.
>
>> I haven't fully search
>> if we care about taking the rcu lock here.  The thing that I am more
>> interested is in knowing what happens when we don't set
>> migration_dirty_pages as the full "possible" memory pages.
> I hadn't tested it with postcopy, I don't know how to use postcopy.
>  From my review I can't find obvious bugs about it.
>
> I don't think there is any good reason to use migrate_bypass
> and postcopy together,  I can disable the migrate_bypass
> when postcopy==true if you want.
>
>> Once here, should we check for ROM regions?
>>
>> BTW, could'nt we use:
>>
>> int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque)
>> {
>>      RAMBlock *block;
>>      int ret = 0;
>>
>>      rcu_read_lock();
>>      QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>>          ret = func(block->idstr, block->host, block->offset,
>>                     block->used_length, opaque);
>>          if (ret) {
>>              break;
>>          }
>>      }
>>      rcu_read_unlock();
>>      return ret;
>> }
>>
> the patch only introduces only one "QLIST_FOREACH_RCU(ram_list.blocks)"
> but
> # git grep 'QLIST_FOREACH_RCU.*ram_list'  | wc -l
> #       16
>
> I don't want to introduce qemu_ram_foreach_block()
> and touch another 15 places.
> I hope someone do it after merged.
>
>
>>
>>>   static void migration_bitmap_sync(void)
>>> @@ -631,7 +653,9 @@ static void migration_bitmap_sync(void)
>>>       qemu_mutex_lock(&migration_bitmap_mutex);
>>>       rcu_read_lock();
>>>       QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>>> -        migration_bitmap_sync_range(block->offset, block->used_length);
>>> +        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
>>> +            migration_bitmap_sync_range(block->offset, block->used_length);
>>> +        }
>>>       }
>>>       rcu_read_unlock();
>>>       qemu_mutex_unlock(&migration_bitmap_mutex);
>> Oops, another place where we were not using qemu_ram_foreach_block :p
>>
>>
>>> @@ -1926,19 +1950,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>>       ram_bitmap_pages = last_ram_offset() >> TARGET_PAGE_BITS;
>>>       migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
>>>       migration_bitmap_rcu->bmap = bitmap_new(ram_bitmap_pages);
>>> -    bitmap_set(migration_bitmap_rcu->bmap, 0, ram_bitmap_pages);
>>> +    migration_bitmap_init(migration_bitmap_rcu->bmap);
>>>
>>>       if (migrate_postcopy_ram()) {
>>>           migration_bitmap_rcu->unsentmap = bitmap_new(ram_bitmap_pages);
>>> -        bitmap_set(migration_bitmap_rcu->unsentmap, 0, ram_bitmap_pages);
>>> +        bitmap_copy(migration_bitmap_rcu->unsentmap,
>>> +                 migration_bitmap_rcu->bmap, ram_bitmap_pages);
>>>       }
>> I think that if we go this route, we should move the whole if inside the
>> migration_bitmap_init?
> good! I will do it when I update the patch.
>
> Thanks,
> Lai
>
>>> -    /*
>>> -     * Count the total number of pages used by ram blocks not including any
>>> -     * gaps due to alignment or unplugs.
>>> -     */
>>> -    migration_dirty_pages = ram_bytes_total() >> TARGET_PAGE_BITS;
>>> -
>>>       memory_global_dirty_log_start();
>>>       migration_bitmap_sync();
>>>       qemu_mutex_unlock_ramlist();
>>
>> As said, very happy with the patch.  And it got much simpler that I
>> would have expected.
>>
>> Thanks, Juan.


* Re: [Qemu-devel] [PATCH] add migration capability to bypass the shared memory
  2016-08-30  4:11   ` Lai Jiangshan
  2016-11-18 13:01     ` Alexander Graf
@ 2017-01-12 19:19     ` Jianjun Duan
  2017-01-13  4:44       ` Lai Jiangshan
  2017-09-21  6:33     ` Zhang Haoyu
  2 siblings, 1 reply; 25+ messages in thread
From: Jianjun Duan @ 2017-01-12 19:19 UTC (permalink / raw)
  To: Lai Jiangshan, Juan Quintela
  Cc: Peter Crosthwaite, qemu-devel, Markus Armbruster, Amit Shah,
	Paolo Bonzini, Richard Henderson

I have a question related to interplay of bypassing the shared memory in
migration and memory hotplugging. If on the source guest a big chunk of
memory is plugged in, will the shared memory still be mapped the same
way on the guest? i.e, the mapping from guest physical address to the
host virtual address be the same?

Thanks,
Jianjun


On 08/29/2016 09:11 PM, Lai Jiangshan wrote:
> On Wed, Aug 10, 2016 at 5:03 PM, Juan Quintela <quintela@redhat.com> wrote:
>> Lai Jiangshan <jiangshanlai@gmail.com> wrote:
>>
>> Hi
>>
>> First of all, I like a lot the patchset, but I would preffer to split it
>> to find "possible" bugs along the lines, especially in postcopy, but not only.
> 
> Hello, thanks for review and comments
> 
> I tried to make the patch be sane and tight.
> I don't see any strong reason to split it without complicating the patch.
> 
>>
>> [very nice description of the patch]
>>
>> Nothing to say about the QMP and shared memory detection, looks correct
>> to me.
>>
>>> diff --git a/migration/ram.c b/migration/ram.c
>>> index 815bc0e..880972d 100644
>>> --- a/migration/ram.c
>>> +++ b/migration/ram.c
>>> @@ -605,6 +605,28 @@ static void migration_bitmap_sync_init(void)
>>>      num_dirty_pages_period = 0;
>>>      xbzrle_cache_miss_prev = 0;
>>>      iterations_prev = 0;
>>> +    migration_dirty_pages = 0;
>>> +}
>>> +
>>> +static void migration_bitmap_init(unsigned long *bitmap)
>>> +{
>>> +    RAMBlock *block;
>>> +
>>> +    bitmap_clear(bitmap, 0, last_ram_offset() >> TARGET_PAGE_BITS);
>>> +    rcu_read_lock();
>>> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>>> +        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
>>> +            bitmap_set(bitmap, block->offset >> TARGET_PAGE_BITS,
>>> +                       block->used_length >> TARGET_PAGE_BITS);
>>> +
>>> +            /*
>>> +             * Count the total number of pages used by ram blocks not including
>>> +             * any gaps due to alignment or unplugs.
>>> +             */
>>> +         migration_dirty_pages += block->used_length >> TARGET_PAGE_BITS;
>>> +     }
>>> +    }
>>> +    rcu_read_unlock();
>>>  }
>>
>> We can split this function in a different patch.
> 
> it calls the new function migrate_bypass_shared_memory().
> it is no a good idea to split it out.
> 
>> I haven't fully search
>> if we care about taking the rcu lock here.  The thing that I am more
>> interested is in knowing what happens when we don't set
>> migration_dirty_pages as the full "possible" memory pages.
> 
> I hadn't tested it with postcopy, I don't know how to use postcopy.
> From my review I can't find obvious bugs about it.
> 
> I don't think there is any good reason to use migrate_bypass
> and postcopy together,  I can disable the migrate_bypass
> when postcopy==true if you want.
> 
>>
>> Once here, should we check for ROM regions?
>>
>> BTW, could'nt we use:
>>
>> int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque)
>> {
>>     RAMBlock *block;
>>     int ret = 0;
>>
>>     rcu_read_lock();
>>     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>>         ret = func(block->idstr, block->host, block->offset,
>>                    block->used_length, opaque);
>>         if (ret) {
>>             break;
>>         }
>>     }
>>     rcu_read_unlock();
>>     return ret;
>> }
>>
> 
> the patch only introduces only one "QLIST_FOREACH_RCU(ram_list.blocks)"
> but
> # git grep 'QLIST_FOREACH_RCU.*ram_list'  | wc -l
> #       16
> 
> I don't want to introduce qemu_ram_foreach_block()
> and touch another 15 places.
> I hope someone do it after merged.
> 
> 
>>
>>
>>>
>>>  static void migration_bitmap_sync(void)
>>> @@ -631,7 +653,9 @@ static void migration_bitmap_sync(void)
>>>      qemu_mutex_lock(&migration_bitmap_mutex);
>>>      rcu_read_lock();
>>>      QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>>> -        migration_bitmap_sync_range(block->offset, block->used_length);
>>> +        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
>>> +            migration_bitmap_sync_range(block->offset, block->used_length);
>>> +        }
>>>      }
>>>      rcu_read_unlock();
>>>      qemu_mutex_unlock(&migration_bitmap_mutex);
>>
>> Oops, another place where we were not using qemu_ram_foreach_block :p
>>
>>
>>> @@ -1926,19 +1950,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>>      ram_bitmap_pages = last_ram_offset() >> TARGET_PAGE_BITS;
>>>      migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
>>>      migration_bitmap_rcu->bmap = bitmap_new(ram_bitmap_pages);
>>> -    bitmap_set(migration_bitmap_rcu->bmap, 0, ram_bitmap_pages);
>>> +    migration_bitmap_init(migration_bitmap_rcu->bmap);
>>>
>>>      if (migrate_postcopy_ram()) {
>>>          migration_bitmap_rcu->unsentmap = bitmap_new(ram_bitmap_pages);
>>> -        bitmap_set(migration_bitmap_rcu->unsentmap, 0, ram_bitmap_pages);
>>> +        bitmap_copy(migration_bitmap_rcu->unsentmap,
>>> +                 migration_bitmap_rcu->bmap, ram_bitmap_pages);
>>>      }
>>
>> I think that if we go this route, we should move the whole if inside the
>> migration_bitmap_init?
> 
> good! I will do it when I update the patch.
> 
> Thanks,
> Lai
> 
>>
>>>
>>> -    /*
>>> -     * Count the total number of pages used by ram blocks not including any
>>> -     * gaps due to alignment or unplugs.
>>> -     */
>>> -    migration_dirty_pages = ram_bytes_total() >> TARGET_PAGE_BITS;
>>> -
>>>      memory_global_dirty_log_start();
>>>      migration_bitmap_sync();
>>>      qemu_mutex_unlock_ramlist();
>>
>>
>> As said, very happy with the patch.  And it got much simpler that I
>> would have expected.
>>
>> Thanks, Juan.
> 


* Re: [Qemu-devel] [PATCH] add migration capability to bypass the shared memory
  2017-01-12 19:19     ` Jianjun Duan
@ 2017-01-13  4:44       ` Lai Jiangshan
  2017-01-16 17:38         ` Jianjun Duan
  0 siblings, 1 reply; 25+ messages in thread
From: Lai Jiangshan @ 2017-01-13  4:44 UTC (permalink / raw)
  To: Jianjun Duan
  Cc: Juan Quintela, Peter Crosthwaite, qemu-devel, Markus Armbruster,
	Amit Shah, Paolo Bonzini, Richard Henderson

On Fri, Jan 13, 2017 at 3:19 AM, Jianjun Duan <duanj@linux.vnet.ibm.com> wrote:
> I have a question related to interplay of bypassing the shared memory in
> migration and memory hotplugging. If on the source guest a big chunk of
> memory is plugged in, will the shared memory still be mapped the same
> way on the guest? i.e, the mapping from guest physical address to the
> host virtual address be the same?

I don't understand the question. The patch doesn't change
memory hotplugging, nor how pages are mapped
in guest physical address space.

>
> Thanks,
> Jianjun
>
>
> On 08/29/2016 09:11 PM, Lai Jiangshan wrote:
>> On Wed, Aug 10, 2016 at 5:03 PM, Juan Quintela <quintela@redhat.com> wrote:
>>> Lai Jiangshan <jiangshanlai@gmail.com> wrote:
>>>
>>> Hi
>>>
>>> First of all, I like a lot the patchset, but I would preffer to split it
>>> to find "possible" bugs along the lines, especially in postcopy, but not only.
>>
>> Hello, thanks for review and comments
>>
>> I tried to make the patch be sane and tight.
>> I don't see any strong reason to split it without complicating the patch.
>>
>>>
>>> [very nice description of the patch]
>>>
>>> Nothing to say about the QMP and shared memory detection, looks correct
>>> to me.
>>>
>>>> diff --git a/migration/ram.c b/migration/ram.c
>>>> index 815bc0e..880972d 100644
>>>> --- a/migration/ram.c
>>>> +++ b/migration/ram.c
>>>> @@ -605,6 +605,28 @@ static void migration_bitmap_sync_init(void)
>>>>      num_dirty_pages_period = 0;
>>>>      xbzrle_cache_miss_prev = 0;
>>>>      iterations_prev = 0;
>>>> +    migration_dirty_pages = 0;
>>>> +}
>>>> +
>>>> +static void migration_bitmap_init(unsigned long *bitmap)
>>>> +{
>>>> +    RAMBlock *block;
>>>> +
>>>> +    bitmap_clear(bitmap, 0, last_ram_offset() >> TARGET_PAGE_BITS);
>>>> +    rcu_read_lock();
>>>> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>>>> +        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
>>>> +            bitmap_set(bitmap, block->offset >> TARGET_PAGE_BITS,
>>>> +                       block->used_length >> TARGET_PAGE_BITS);
>>>> +
>>>> +            /*
>>>> +             * Count the total number of pages used by ram blocks not including
>>>> +             * any gaps due to alignment or unplugs.
>>>> +             */
>>>> +         migration_dirty_pages += block->used_length >> TARGET_PAGE_BITS;
>>>> +     }
>>>> +    }
>>>> +    rcu_read_unlock();
>>>>  }
>>>
>>> We can split this function in a different patch.
>>
>> it calls the new function migrate_bypass_shared_memory().
>> it is no a good idea to split it out.
>>
>>> I haven't fully search
>>> if we care about taking the rcu lock here.  The thing that I am more
>>> interested is in knowing what happens when we don't set
>>> migration_dirty_pages as the full "possible" memory pages.
>>
>> I hadn't tested it with postcopy, I don't know how to use postcopy.
>> From my review I can't find obvious bugs about it.
>>
>> I don't think there is any good reason to use migrate_bypass
>> and postcopy together,  I can disable the migrate_bypass
>> when postcopy==true if you want.
>>
>>>
>>> Once here, should we check for ROM regions?
>>>
>>> BTW, could'nt we use:
>>>
>>> int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque)
>>> {
>>>     RAMBlock *block;
>>>     int ret = 0;
>>>
>>>     rcu_read_lock();
>>>     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>>>         ret = func(block->idstr, block->host, block->offset,
>>>                    block->used_length, opaque);
>>>         if (ret) {
>>>             break;
>>>         }
>>>     }
>>>     rcu_read_unlock();
>>>     return ret;
>>> }
>>>
>>
>> the patch only introduces only one "QLIST_FOREACH_RCU(ram_list.blocks)"
>> but
>> # git grep 'QLIST_FOREACH_RCU.*ram_list'  | wc -l
>> #       16
>>
>> I don't want to introduce qemu_ram_foreach_block()
>> and touch another 15 places.
>> I hope someone do it after merged.
>>
>>
>>>
>>>
>>>>
>>>>  static void migration_bitmap_sync(void)
>>>> @@ -631,7 +653,9 @@ static void migration_bitmap_sync(void)
>>>>      qemu_mutex_lock(&migration_bitmap_mutex);
>>>>      rcu_read_lock();
>>>>      QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>>>> -        migration_bitmap_sync_range(block->offset, block->used_length);
>>>> +        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
>>>> +            migration_bitmap_sync_range(block->offset, block->used_length);
>>>> +        }
>>>>      }
>>>>      rcu_read_unlock();
>>>>      qemu_mutex_unlock(&migration_bitmap_mutex);
>>>
>>> Oops, another place where we were not using qemu_ram_foreach_block :p
>>>
>>>
>>>> @@ -1926,19 +1950,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>>>      ram_bitmap_pages = last_ram_offset() >> TARGET_PAGE_BITS;
>>>>      migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
>>>>      migration_bitmap_rcu->bmap = bitmap_new(ram_bitmap_pages);
>>>> -    bitmap_set(migration_bitmap_rcu->bmap, 0, ram_bitmap_pages);
>>>> +    migration_bitmap_init(migration_bitmap_rcu->bmap);
>>>>
>>>>      if (migrate_postcopy_ram()) {
>>>>          migration_bitmap_rcu->unsentmap = bitmap_new(ram_bitmap_pages);
>>>> -        bitmap_set(migration_bitmap_rcu->unsentmap, 0, ram_bitmap_pages);
>>>> +        bitmap_copy(migration_bitmap_rcu->unsentmap,
>>>> +                 migration_bitmap_rcu->bmap, ram_bitmap_pages);
>>>>      }
>>>
>>> I think that if we go this route, we should move the whole if inside the
>>> migration_bitmap_init?
>>
>> good! I will do it when I update the patch.
>>
>> Thanks,
>> Lai
>>
>>>
>>>>
>>>> -    /*
>>>> -     * Count the total number of pages used by ram blocks not including any
>>>> -     * gaps due to alignment or unplugs.
>>>> -     */
>>>> -    migration_dirty_pages = ram_bytes_total() >> TARGET_PAGE_BITS;
>>>> -
>>>>      memory_global_dirty_log_start();
>>>>      migration_bitmap_sync();
>>>>      qemu_mutex_unlock_ramlist();
>>>
>>>
>>> As said, very happy with the patch.  And it got much simpler that I
>>> would have expected.
>>>
>>> Thanks, Juan.
>>
>


* Re: [Qemu-devel] [PATCH] add migration capability to bypass the shared memory
  2017-01-13  4:44       ` Lai Jiangshan
@ 2017-01-16 17:38         ` Jianjun Duan
  0 siblings, 0 replies; 25+ messages in thread
From: Jianjun Duan @ 2017-01-16 17:38 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: Juan Quintela, Peter Crosthwaite, qemu-devel, Markus Armbruster,
	Amit Shah, Paolo Bonzini, Richard Henderson



On 01/12/2017 08:44 PM, Lai Jiangshan wrote:
> On Fri, Jan 13, 2017 at 3:19 AM, Jianjun Duan <duanj@linux.vnet.ibm.com> wrote:
>> I have a question related to interplay of bypassing the shared memory in
>> migration and memory hotplugging. If on the source guest a big chunk of
>> memory is plugged in, will the shared memory still be mapped the same
>> way on the guest? i.e, the mapping from guest physical address to the
>> host virtual address be the same?
> 
> I don't understand the question, the patch doesn't change
> the memory hotplugging nor the way how the pages are mapped
> in the guest physical.
> 
Let me try to rephrase it. Is there a scenario in which the mapping
could change? If the mapping could change, will bypassing shared memory
still work?

Thanks,
Jianjun

>>
>> Thanks,
>> Jianjun
>>
>>
>> On 08/29/2016 09:11 PM, Lai Jiangshan wrote:
>>> On Wed, Aug 10, 2016 at 5:03 PM, Juan Quintela <quintela@redhat.com> wrote:
>>>> Lai Jiangshan <jiangshanlai@gmail.com> wrote:
>>>>
>>>> Hi
>>>>
>>>> First of all, I like a lot the patchset, but I would preffer to split it
>>>> to find "possible" bugs along the lines, especially in postcopy, but not only.
>>>
>>> Hello, thanks for review and comments
>>>
>>> I tried to make the patch be sane and tight.
>>> I don't see any strong reason to split it without complicating the patch.
>>>
>>>>
>>>> [very nice description of the patch]
>>>>
>>>> Nothing to say about the QMP and shared memory detection, looks correct
>>>> to me.
>>>>
>>>>> diff --git a/migration/ram.c b/migration/ram.c
>>>>> index 815bc0e..880972d 100644
>>>>> --- a/migration/ram.c
>>>>> +++ b/migration/ram.c
>>>>> @@ -605,6 +605,28 @@ static void migration_bitmap_sync_init(void)
>>>>>      num_dirty_pages_period = 0;
>>>>>      xbzrle_cache_miss_prev = 0;
>>>>>      iterations_prev = 0;
>>>>> +    migration_dirty_pages = 0;
>>>>> +}
>>>>> +
>>>>> +static void migration_bitmap_init(unsigned long *bitmap)
>>>>> +{
>>>>> +    RAMBlock *block;
>>>>> +
>>>>> +    bitmap_clear(bitmap, 0, last_ram_offset() >> TARGET_PAGE_BITS);
>>>>> +    rcu_read_lock();
>>>>> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>>>>> +        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
>>>>> +            bitmap_set(bitmap, block->offset >> TARGET_PAGE_BITS,
>>>>> +                       block->used_length >> TARGET_PAGE_BITS);
>>>>> +
>>>>> +            /*
>>>>> +             * Count the total number of pages used by ram blocks not including
>>>>> +             * any gaps due to alignment or unplugs.
>>>>> +             */
>>>>> +         migration_dirty_pages += block->used_length >> TARGET_PAGE_BITS;
>>>>> +     }
>>>>> +    }
>>>>> +    rcu_read_unlock();
>>>>>  }
>>>>
>>>> We can split this function in a different patch.
>>>
>>> it calls the new function migrate_bypass_shared_memory().
>>> it is no a good idea to split it out.
>>>
>>>> I haven't fully search
>>>> if we care about taking the rcu lock here.  The thing that I am more
>>>> interested is in knowing what happens when we don't set
>>>> migration_dirty_pages as the full "possible" memory pages.
>>>
>>> I hadn't tested it with postcopy, I don't know how to use postcopy.
>>> From my review I can't find obvious bugs about it.
>>>
>>> I don't think there is any good reason to use migrate_bypass
>>> and postcopy together,  I can disable the migrate_bypass
>>> when postcopy==true if you want.
>>>
>>>>
>>>> Once here, should we check for ROM regions?
>>>>
>>>> BTW, could'nt we use:
>>>>
>>>> int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque)
>>>> {
>>>>     RAMBlock *block;
>>>>     int ret = 0;
>>>>
>>>>     rcu_read_lock();
>>>>     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>>>>         ret = func(block->idstr, block->host, block->offset,
>>>>                    block->used_length, opaque);
>>>>         if (ret) {
>>>>             break;
>>>>         }
>>>>     }
>>>>     rcu_read_unlock();
>>>>     return ret;
>>>> }
>>>>
>>>
>>> the patch only introduces only one "QLIST_FOREACH_RCU(ram_list.blocks)"
>>> but
>>> # git grep 'QLIST_FOREACH_RCU.*ram_list'  | wc -l
>>> #       16
>>>
>>> I don't want to introduce qemu_ram_foreach_block()
>>> and touch another 15 places.
>>> I hope someone do it after merged.
>>>
>>>
>>>>
>>>>
>>>>>
>>>>>  static void migration_bitmap_sync(void)
>>>>> @@ -631,7 +653,9 @@ static void migration_bitmap_sync(void)
>>>>>      qemu_mutex_lock(&migration_bitmap_mutex);
>>>>>      rcu_read_lock();
>>>>>      QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>>>>> -        migration_bitmap_sync_range(block->offset, block->used_length);
>>>>> +        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
>>>>> +            migration_bitmap_sync_range(block->offset, block->used_length);
>>>>> +        }
>>>>>      }
>>>>>      rcu_read_unlock();
>>>>>      qemu_mutex_unlock(&migration_bitmap_mutex);
>>>>
>>>> Oops, another place where we were not using qemu_ram_foreach_block :p
>>>>
>>>>
>>>>> @@ -1926,19 +1950,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>>>>      ram_bitmap_pages = last_ram_offset() >> TARGET_PAGE_BITS;
>>>>>      migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
>>>>>      migration_bitmap_rcu->bmap = bitmap_new(ram_bitmap_pages);
>>>>> -    bitmap_set(migration_bitmap_rcu->bmap, 0, ram_bitmap_pages);
>>>>> +    migration_bitmap_init(migration_bitmap_rcu->bmap);
>>>>>
>>>>>      if (migrate_postcopy_ram()) {
>>>>>          migration_bitmap_rcu->unsentmap = bitmap_new(ram_bitmap_pages);
>>>>> -        bitmap_set(migration_bitmap_rcu->unsentmap, 0, ram_bitmap_pages);
>>>>> +        bitmap_copy(migration_bitmap_rcu->unsentmap,
>>>>> +                 migration_bitmap_rcu->bmap, ram_bitmap_pages);
>>>>>      }
>>>>
>>>> I think that if we go this route, we should move the whole if inside the
>>>> migration_bitmap_init?
>>>
>>> good! I will do it when I update the patch.
>>>
>>> Thanks,
>>> Lai
>>>
>>>>
>>>>>
>>>>> -    /*
>>>>> -     * Count the total number of pages used by ram blocks not including any
>>>>> -     * gaps due to alignment or unplugs.
>>>>> -     */
>>>>> -    migration_dirty_pages = ram_bytes_total() >> TARGET_PAGE_BITS;
>>>>> -
>>>>>      memory_global_dirty_log_start();
>>>>>      migration_bitmap_sync();
>>>>>      qemu_mutex_unlock_ramlist();
>>>>
>>>>
>>>> As said, very happy with the patch.  And it got much simpler that I
>>>> would have expected.
>>>>
>>>> Thanks, Juan.
>>>
>>
> 


* Re: [Qemu-devel] [PATCH V2] add migration capability to bypass the shared memory
  2016-08-11 14:45                 ` Lai Jiangshan
@ 2017-09-21  3:41                   ` Zhang Haoyu
  0 siblings, 0 replies; 25+ messages in thread
From: Zhang Haoyu @ 2017-09-21  3:41 UTC (permalink / raw)
  To: Lai Jiangshan, Li, Liang Z
  Cc: Peter Crosthwaite, quintela, Markus Armbruster, qemu-devel,
	Amit Shah, Paolo Bonzini, Richard Henderson

Hi Jiangshan,

Any update from this patch?

Thanks,
Zhang Haoyu

On 2016/8/11 22:45, Lai Jiangshan wrote:
> Note, the old local migration patchset:
> https://lists.gnu.org/archive/html/qemu-devel/2013-12/msg00073.html
> 
> this patch can be considered a new local migration implementation,
> but with more restrictions (the memory must be set as shared when booting QEMU)
> 


* Re: [Qemu-devel] [PATCH] add migration capability to bypass the shared memory
  2016-08-30  4:11   ` Lai Jiangshan
  2016-11-18 13:01     ` Alexander Graf
  2017-01-12 19:19     ` Jianjun Duan
@ 2017-09-21  6:33     ` Zhang Haoyu
  2 siblings, 0 replies; 25+ messages in thread
From: Zhang Haoyu @ 2017-09-21  6:33 UTC (permalink / raw)
  To: Lai Jiangshan, Juan Quintela
  Cc: Peter Crosthwaite, qemu-devel, Markus Armbruster, Amit Shah,
	Paolo Bonzini, Richard Henderson

Hi,

Any update?

Thanks,
Zhang Haoyu

On 2016/8/30 12:11, Lai Jiangshan wrote:
> On Wed, Aug 10, 2016 at 5:03 PM, Juan Quintela <quintela@redhat.com> wrote:
>> Lai Jiangshan <jiangshanlai@gmail.com> wrote:
>>
>> Hi
>>
>> First of all, I like a lot the patchset, but I would preffer to split it
>> to find "possible" bugs along the lines, especially in postcopy, but not only.
> 
> Hello, thanks for review and comments
> 
> I tried to make the patch be sane and tight.
> I don't see any strong reason to split it without complicating the patch.
> 
>>
>> [very nice description of the patch]
>>
>> Nothing to say about the QMP and shared memory detection, looks correct
>> to me.
>>
>>> diff --git a/migration/ram.c b/migration/ram.c
>>> index 815bc0e..880972d 100644
>>> --- a/migration/ram.c
>>> +++ b/migration/ram.c
>>> @@ -605,6 +605,28 @@ static void migration_bitmap_sync_init(void)
>>>      num_dirty_pages_period = 0;
>>>      xbzrle_cache_miss_prev = 0;
>>>      iterations_prev = 0;
>>> +    migration_dirty_pages = 0;
>>> +}
>>> +
>>> +static void migration_bitmap_init(unsigned long *bitmap)
>>> +{
>>> +    RAMBlock *block;
>>> +
>>> +    bitmap_clear(bitmap, 0, last_ram_offset() >> TARGET_PAGE_BITS);
>>> +    rcu_read_lock();
>>> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>>> +        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
>>> +            bitmap_set(bitmap, block->offset >> TARGET_PAGE_BITS,
>>> +                       block->used_length >> TARGET_PAGE_BITS);
>>> +
>>> +            /*
>>> +             * Count the total number of pages used by ram blocks not including
>>> +             * any gaps due to alignment or unplugs.
>>> +             */
>>> +         migration_dirty_pages += block->used_length >> TARGET_PAGE_BITS;
>>> +     }
>>> +    }
>>> +    rcu_read_unlock();
>>>  }
>>
>> We can split this function into a separate patch.
> 
> It calls the new function migrate_bypass_shared_memory(),
> so it is not a good idea to split it out.
> 
>> I haven't fully searched
>> whether we care about taking the rcu lock here.  The thing that I am more
>> interested in is knowing what happens when we don't set
>> migration_dirty_pages to the full "possible" number of memory pages.
> 
> I haven't tested it with postcopy; I don't know how to use postcopy.
> From my review I can't find any obvious bugs in it.
> 
> I don't think there is any good reason to use migrate_bypass
> and postcopy together; I can disable migrate_bypass
> when postcopy==true if you want.
> 
>>
>> Once here, should we check for ROM regions?
>>
>> BTW, couldn't we use:
>>
>> int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque)
>> {
>>     RAMBlock *block;
>>     int ret = 0;
>>
>>     rcu_read_lock();
>>     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>>         ret = func(block->idstr, block->host, block->offset,
>>                    block->used_length, opaque);
>>         if (ret) {
>>             break;
>>         }
>>     }
>>     rcu_read_unlock();
>>     return ret;
>> }
>>
> 
> the patch introduces only one "QLIST_FOREACH_RCU(ram_list.blocks)"
> but
> # git grep 'QLIST_FOREACH_RCU.*ram_list'  | wc -l
> #       16
> 
> I don't want to introduce qemu_ram_foreach_block()
> and touch another 15 places in this patch.
> I hope someone does that refactoring after this is merged.
> 
> 
>>
>>
>>>
>>>  static void migration_bitmap_sync(void)
>>> @@ -631,7 +653,9 @@ static void migration_bitmap_sync(void)
>>>      qemu_mutex_lock(&migration_bitmap_mutex);
>>>      rcu_read_lock();
>>>      QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>>> -        migration_bitmap_sync_range(block->offset, block->used_length);
>>> +        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
>>> +            migration_bitmap_sync_range(block->offset, block->used_length);
>>> +        }
>>>      }
>>>      rcu_read_unlock();
>>>      qemu_mutex_unlock(&migration_bitmap_mutex);
>>
>> Oops, another place where we were not using qemu_ram_foreach_block :p
>>
>>
>>> @@ -1926,19 +1950,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>>      ram_bitmap_pages = last_ram_offset() >> TARGET_PAGE_BITS;
>>>      migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
>>>      migration_bitmap_rcu->bmap = bitmap_new(ram_bitmap_pages);
>>> -    bitmap_set(migration_bitmap_rcu->bmap, 0, ram_bitmap_pages);
>>> +    migration_bitmap_init(migration_bitmap_rcu->bmap);
>>>
>>>      if (migrate_postcopy_ram()) {
>>>          migration_bitmap_rcu->unsentmap = bitmap_new(ram_bitmap_pages);
>>> -        bitmap_set(migration_bitmap_rcu->unsentmap, 0, ram_bitmap_pages);
>>> +        bitmap_copy(migration_bitmap_rcu->unsentmap,
>>> +                 migration_bitmap_rcu->bmap, ram_bitmap_pages);
>>>      }
>>
>> I think that if we go this route, we should move the whole if inside the
>> migration_bitmap_init?
> 
> good! I will do it when I update the patch.
> 
> Thanks,
> Lai
> 
>>
>>>
>>> -    /*
>>> -     * Count the total number of pages used by ram blocks not including any
>>> -     * gaps due to alignment or unplugs.
>>> -     */
>>> -    migration_dirty_pages = ram_bytes_total() >> TARGET_PAGE_BITS;
>>> -
>>>      memory_global_dirty_log_start();
>>>      migration_bitmap_sync();
>>>      qemu_mutex_unlock_ramlist();
>>
>>
>> As said, very happy with the patch.  And it got much simpler than I
>> would have expected.
>>
>> Thanks, Juan.
> 


* Re: [Qemu-devel] [PATCH V2] add migration capability to bypass the shared memory
  2016-08-10  0:54   ` [Qemu-devel] [PATCH V2] " Lai Jiangshan
  2016-08-10  1:22     ` Fam Zheng
  2016-08-10  2:22     ` Li, Liang Z
@ 2017-09-25  7:42     ` Zhang Haoyu
  2017-09-25 12:13     ` Zhang Haoyu
  3 siblings, 0 replies; 25+ messages in thread
From: Zhang Haoyu @ 2017-09-25  7:42 UTC (permalink / raw)
  To: Lai Jiangshan, qemu-devel
  Cc: Juan Quintela, Peter Crosthwaite, Markus Armbruster, Amit Shah,
	Paolo Bonzini, Richard Henderson

If memory is hotplugged during migration, the calculation of migration_dirty_pages
may be incorrect:
void migration_bitmap_extend(ram_addr_t old, ram_addr_t new)
{
        ...
        migration_dirty_pages += new - old;
        call_rcu(old_bitmap, migration_bitmap_free, rcu);
        ...
}

Thanks,
Zhanghaoyu


On 2016/8/10 8:54, Lai Jiangshan wrote:
> When the migration capability 'bypass-shared-memory'
> is set, shared memory will be bypassed during migration.
> 
> It is the key feature that enables several excellent features for
> qemu, such as qemu-local-migration, qemu-live-update,
> extremely-fast-save-restore, vm-template, vm-fast-live-clone,
> yet-another-post-copy-migration, etc.
> 
> The philosophy behind this key feature and the advanced
> features is that a part of the memory management is
> separated out from qemu, letting other toolkits
> such as libvirt, runv(https://github.com/hyperhq/runv/)
> or the next qemu-cmd directly access it, manage it,
> and provide features on top of it.
> 
> The hyperhq(http://hyper.sh  http://hypercontainer.io/)
> project introduced the vm-template(vm-fast-live-clone) feature
> to the hyper container several months ago, and it works perfectly.
> (see https://github.com/hyperhq/runv/pull/297)
> 
> The vm-template feature lets containers(VMs) be
> started in 130ms and saves 80M of memory for every
> container(VM), so the hyper containers are as fast
> and high-density as normal containers.
> 
> In the current qemu command line, shared memory has
> to be configured via a memory-backend object. Anyone can add a
> -mem-path-share option to the qemu command line to combine
> with -mem-path for this feature. This patch doesn't include
> that -mem-path-share change.
> 
> Advanced features:
> 1) qemu-local-migration, qemu-live-update
> Set the mem-path on the tmpfs and set share=on for it when
> starting the vm. Example:
> -object \
> memory-backend-file,id=mem,size=128M,mem-path=/dev/shm/memory,share=on \
> -numa node,nodeid=0,cpus=0-7,memdev=mem
> 
> When you want to migrate the vm locally (after fixing a security bug
> in the qemu binary, or for another reason), you can start a new qemu with
> the same command line and -incoming, then migrate the
> vm from the old qemu to the new qemu with the migration capability
> 'bypass-shared-memory' set. The migration will migrate the device state
> *ONLY*; the memory is the original memory, backed by the tmpfs file.
> 
> 2) extremely-fast-save-restore
> the same as above, but the mem-path is on a persistent file system.
> 
> 3) vm-template, vm-fast-live-clone
> the template vm is started as in 1), and paused when the guest reaches
> the template point (for example, when the guest app is ready); then the
> template vm is saved. (The qemu process of the template can be killed now,
> because we need only the memory and the device-state files (in tmpfs).)
> 
> Then we can launch one or multiple VMs based on the template vm state;
> the new VMs are started without "share=on", and all the new VMs share
> the initial memory from the memory file, which saves a lot of memory.
> All the new VMs start from the template point, so the guest app can get to
> work quickly.
> 
> A new VM booted from a template vm can't become a template again;
> if you need this unusual chained-template feature, you can write
> a cloneable-tmpfs kernel module for it.
> 
> The libvirt toolkit can't manage vm-template currently; in
> hyperhq/runv, we use a qemu wrapper script to do it. I hope someone adds
> a "libvirt managed template" feature to libvirt.
> 
> 4) yet-another-post-copy-migration
> It is a possible feature; no toolkit can do it well now.
> Using an nbd server/client on the memory file is workable but
> inconvenient. A special feature for tmpfs might be needed to
> fully complete this feature.
> No one needs yet another post-copy migration method,
> but it is possible should someone ever need it.
> 
> Changed from v1:
>    fix style
> 
> Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
> ---
>  exec.c                        |  5 +++++
>  include/exec/cpu-common.h     |  1 +
>  include/migration/migration.h |  1 +
>  migration/migration.c         |  9 +++++++++
>  migration/ram.c               | 37 ++++++++++++++++++++++++++++---------
>  qapi-schema.json              |  6 +++++-
>  qmp-commands.hx               |  3 +++
>  7 files changed, 52 insertions(+), 10 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index 8ffde75..888919a 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -1402,6 +1402,11 @@ static void qemu_ram_setup_dump(void *addr, ram_addr_t size)
>      }
>  }
>  
> +bool qemu_ram_is_shared(RAMBlock *rb)
> +{
> +    return rb->flags & RAM_SHARED;
> +}
> +
>  const char *qemu_ram_get_idstr(RAMBlock *rb)
>  {
>      return rb->idstr;
> diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
> index 952bcfe..7c18db9 100644
> --- a/include/exec/cpu-common.h
> +++ b/include/exec/cpu-common.h
> @@ -58,6 +58,7 @@ RAMBlock *qemu_ram_block_from_host(void *ptr, bool round_offset,
>  void qemu_ram_set_idstr(RAMBlock *block, const char *name, DeviceState *dev);
>  void qemu_ram_unset_idstr(RAMBlock *block);
>  const char *qemu_ram_get_idstr(RAMBlock *rb);
> +bool qemu_ram_is_shared(RAMBlock *rb);
>  
>  void cpu_physical_memory_rw(hwaddr addr, uint8_t *buf,
>                              int len, int is_write);
> diff --git a/include/migration/migration.h b/include/migration/migration.h
> index 3c96623..080b6b2 100644
> --- a/include/migration/migration.h
> +++ b/include/migration/migration.h
> @@ -290,6 +290,7 @@ void migrate_add_blocker(Error *reason);
>   */
>  void migrate_del_blocker(Error *reason);
>  
> +bool migrate_bypass_shared_memory(void);
>  bool migrate_postcopy_ram(void);
>  bool migrate_zero_blocks(void);
>  
> diff --git a/migration/migration.c b/migration/migration.c
> index 955d5ee..c87d136 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1189,6 +1189,15 @@ void qmp_migrate_set_downtime(double value, Error **errp)
>      max_downtime = (uint64_t)value;
>  }
>  
> +bool migrate_bypass_shared_memory(void)
> +{
> +    MigrationState *s;
> +
> +    s = migrate_get_current();
> +
> +    return s->enabled_capabilities[MIGRATION_CAPABILITY_BYPASS_SHARED_MEMORY];
> +}
> +
>  bool migrate_postcopy_ram(void)
>  {
>      MigrationState *s;
> diff --git a/migration/ram.c b/migration/ram.c
> index 815bc0e..f7c4081 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -605,6 +605,28 @@ static void migration_bitmap_sync_init(void)
>      num_dirty_pages_period = 0;
>      xbzrle_cache_miss_prev = 0;
>      iterations_prev = 0;
> +    migration_dirty_pages = 0;
This initialization is not necessary.

> +}
> +
> +static void migration_bitmap_init(unsigned long *bitmap)
> +{
> +    RAMBlock *block;
> +
> +    bitmap_clear(bitmap, 0, last_ram_offset() >> TARGET_PAGE_BITS);
> +    rcu_read_lock();
> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
> +            bitmap_set(bitmap, block->offset >> TARGET_PAGE_BITS,
> +                       block->used_length >> TARGET_PAGE_BITS);
> +
> +            /*
> +             * Count the total number of pages used by ram blocks not including
> +             * any gaps due to alignment or unplugs.
> +             */
> +            migration_dirty_pages += block->used_length >> TARGET_PAGE_BITS;
> +        }
> +    }
> +    rcu_read_unlock();
>  }
>  
>  static void migration_bitmap_sync(void)
> @@ -631,7 +653,9 @@ static void migration_bitmap_sync(void)
>      qemu_mutex_lock(&migration_bitmap_mutex);
>      rcu_read_lock();
>      QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> -        migration_bitmap_sync_range(block->offset, block->used_length);
> +        if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
> +            migration_bitmap_sync_range(block->offset, block->used_length);
> +        }
>      }
>      rcu_read_unlock();
>      qemu_mutex_unlock(&migration_bitmap_mutex);
> @@ -1926,19 +1950,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>      ram_bitmap_pages = last_ram_offset() >> TARGET_PAGE_BITS;
>      migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
>      migration_bitmap_rcu->bmap = bitmap_new(ram_bitmap_pages);
> -    bitmap_set(migration_bitmap_rcu->bmap, 0, ram_bitmap_pages);
> +    migration_bitmap_init(migration_bitmap_rcu->bmap);
>  
>      if (migrate_postcopy_ram()) {
>          migration_bitmap_rcu->unsentmap = bitmap_new(ram_bitmap_pages);
> -        bitmap_set(migration_bitmap_rcu->unsentmap, 0, ram_bitmap_pages);
> +        bitmap_copy(migration_bitmap_rcu->unsentmap,
> +                    migration_bitmap_rcu->bmap, ram_bitmap_pages);
>      }
>  
> -    /*
> -     * Count the total number of pages used by ram blocks not including any
> -     * gaps due to alignment or unplugs.
> -     */
> -    migration_dirty_pages = ram_bytes_total() >> TARGET_PAGE_BITS;
> -
>      memory_global_dirty_log_start();
>      migration_bitmap_sync();
>      qemu_mutex_unlock_ramlist();
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 5658723..453e6d9 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -553,11 +553,15 @@
>  #          been migrated, pulling the remaining pages along as needed. NOTE: If
>  #          the migration fails during postcopy the VM will fail.  (since 2.6)
>  #
> +# @bypass-shared-memory: the shared memory region will be bypassed on migration.
> +#          This feature allows the memory region to be reused by new qemu(s)
> +#          or be migrated separately. (since 2.8)
> +#
>  # Since: 1.2
>  ##
>  { 'enum': 'MigrationCapability',
>    'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
> -           'compress', 'events', 'postcopy-ram'] }
> +           'compress', 'events', 'postcopy-ram', 'bypass-shared-memory'] }
>  
>  ##
>  # @MigrationCapabilityStatus
> diff --git a/qmp-commands.hx b/qmp-commands.hx
> index c8d360a..c31152c 100644
> --- a/qmp-commands.hx
> +++ b/qmp-commands.hx
> @@ -3723,6 +3723,7 @@ Enable/Disable migration capabilities
>  - "compress": use multiple compression threads to accelerate live migration
>  - "events": generate events for each migration state change
>  - "postcopy-ram": postcopy mode for live migration
> +- "bypass-shared-memory": bypass shared memory region
>  
>  Arguments:
>  
> @@ -3753,6 +3754,7 @@ Query current migration capabilities
>           - "compress": Multiple compression threads state (json-bool)
>           - "events": Migration state change event state (json-bool)
>           - "postcopy-ram": postcopy ram state (json-bool)
> +         - "bypass-shared-memory": bypass shared memory state (json-bool)
>  
>  Arguments:
>  
> @@ -3767,6 +3769,7 @@ Example:
>       {"state": false, "capability": "compress"},
>       {"state": true, "capability": "events"},
>       {"state": false, "capability": "postcopy-ram"}
> +     {"state": false, "capability": "bypass-shared-memory"}
>     ]}
>  
>  EQMP
> 
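[Editorial note] As a usage sketch for the quoted patch (the capability name is the one it adds; the socket path is illustrative), enabling the bypass and migrating would look like this over QMP:

```json
{ "execute": "migrate-set-capabilities",
  "arguments": { "capabilities": [
    { "capability": "bypass-shared-memory", "state": true } ] } }

{ "execute": "migrate",
  "arguments": { "uri": "unix:/tmp/qemu-migrate.sock" } }
```

For the local-migration case described in the cover letter, the destination qemu would be started with the same -object memory-backend-file,...,share=on command line plus -incoming.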


* Re: [Qemu-devel] [PATCH V2] add migration capability to bypass the shared memory
  2016-08-10  0:54   ` [Qemu-devel] [PATCH V2] " Lai Jiangshan
                       ` (2 preceding siblings ...)
  2017-09-25  7:42     ` Zhang Haoyu
@ 2017-09-25 12:13     ` Zhang Haoyu
  3 siblings, 0 replies; 25+ messages in thread
From: Zhang Haoyu @ 2017-09-25 12:13 UTC (permalink / raw)
  To: Lai Jiangshan, qemu-devel
  Cc: Juan Quintela, Peter Crosthwaite, Markus Armbruster, Amit Shah,
	Paolo Bonzini, Richard Henderson

If memory is hotplugged during migration, the calculation of migration_dirty_pages
may be incorrect; it should be fixed as below:

-void migration_bitmap_extend(ram_addr_t old, ram_addr_t new)
+void migration_bitmap_extend(RAMBlock *block, ram_addr_t old, ram_addr_t new)
{
    /* called in qemu main thread, so there is
     * no writing race against this migration_bitmap
     */
-    if (migration_bitmap_rcu) {
+    if (migration_bitmap_rcu && (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block))) {
        struct BitmapRcu *old_bitmap = migration_bitmap_rcu, *bitmap;
        bitmap = g_new(struct BitmapRcu, 1);

On 2016/8/10 8:54, Lai Jiangshan wrote:
> [full V2 patch quoted verbatim; identical to the copy in the previous message, snipped]


end of thread, other threads:[~2017-09-25 12:10 UTC | newest]

Thread overview: 25+ messages
2016-08-09 12:10 [Qemu-devel] [PATCH] add migration capability to bypass the shared memory Lai Jiangshan
2016-08-09 13:51 ` no-reply
2016-08-09 19:12 ` Dr. David Alan Gilbert
2016-08-10  0:54   ` [Qemu-devel] [PATCH V2] " Lai Jiangshan
2016-08-10  1:22     ` Fam Zheng
2016-08-10  2:22     ` Li, Liang Z
2016-08-10  3:22       ` Lai Jiangshan
2016-08-10  5:04         ` Li, Liang Z
2016-08-10  9:11           ` Juan Quintela
2016-08-11  7:11             ` Li, Liang Z
2016-08-11 14:31               ` Lai Jiangshan
2016-08-11 14:45                 ` Lai Jiangshan
2017-09-21  3:41                   ` Zhang Haoyu
2016-08-12  6:48                 ` Li, Liang Z
2016-08-12  7:19                   ` Lai Jiangshan
2017-09-25  7:42     ` Zhang Haoyu
2017-09-25 12:13     ` Zhang Haoyu
2016-08-10  1:09   ` [Qemu-devel] [PATCH] " Lai Jiangshan
2016-08-10  9:03 ` Juan Quintela
2016-08-30  4:11   ` Lai Jiangshan
2016-11-18 13:01     ` Alexander Graf
2017-01-12 19:19     ` Jianjun Duan
2017-01-13  4:44       ` Lai Jiangshan
2017-01-16 17:38         ` Jianjun Duan
2017-09-21  6:33     ` Zhang Haoyu
