* [Qemu-devel] [RFC v2 0/6] Fix QEMU crash during memory hotplug with vhost=on
@ 2015-06-08 15:19 Igor Mammedov
  2015-06-08 15:19 ` [Qemu-devel] [RFC v2 1/6] memory: get rid of memory_region_destructor_ram_from_ptr() Igor Mammedov
                   ` (5 more replies)
  0 siblings, 6 replies; 25+ messages in thread
From: Igor Mammedov @ 2015-06-08 15:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, mst

Changelog:
 v1->v2:
   * take into account Paolo's review comments
     * do not overload ram_addr
     * ifdef Linux-specific code
   * reserve HVA using an API from exec.c instead of calling
     mmap() directly from memory.c
   * support unmapping of HVA remapped region


The series is still RFC due to vhost issues with small memory
amounts (i.e. the last patch, 6/6, isn't usable yet), so I'm posting
it mainly for review of the memory API bits.

--
When more than ~50 pc-dimm devices are hotplugged with
vhost enabled, QEMU will assert in vhost_commit()
because the backend refuses to accept that many memory ranges.

The series introduces a reserved-HVA MemoryRegion container
into which all hotplugged memory is remapped, and passes
the single container range to vhost instead of a separate
memory range for each hotplugged pc-dimm device.

It's an alternative to raising the limit on supported memory
regions in the backends, since the backend-side approach I've
come up with so far requires quite a bit more code.

With this approach it would be possible to extend the scheme to
initial memory later and provide a single range for all
RAM to vhost, which should speed up its hot path by replacing
the current GPA<->HVA lookup loop with an offset calculation.
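
For reference, the hot-path difference amounts to roughly the following
(an editorial sketch using the vhost_memory layout from linux/vhost.h,
not code from this series):

    #include <stdint.h>
    #include <linux/vhost.h>

    /* today: linear scan over one region per hotplugged DIMM */
    static void *gpa_to_hva_lookup(struct vhost_memory *mem, uint64_t gpa)
    {
        uint32_t i;

        for (i = 0; i < mem->nregions; i++) {
            struct vhost_memory_region *reg = &mem->regions[i];

            if (gpa >= reg->guest_phys_addr &&
                gpa < reg->guest_phys_addr + reg->memory_size) {
                return (void *)(uintptr_t)(reg->userspace_addr +
                                           (gpa - reg->guest_phys_addr));
            }
        }
        return NULL;
    }

    /* with a single reserved-HVA range the lookup collapses to an offset */
    static void *gpa_to_hva_offset(void *hva_base, uint64_t gpa_base,
                                   uint64_t gpa)
    {
        return (char *)hva_base + (gpa - gpa_base);
    }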

Igor Mammedov (6):
  memory: get rid of memory_region_destructor_ram_from_ptr()
  memory: introduce MemoryRegion container with reserved HVA range
  memory: support unmapping of MemoryRegion mapped into HVA parent
  hostmem: return recreated MemoryRegion if current can't be reused
  pc: reserve hotpluggable memory range with
    memory_region_init_hva_range()
  pc: fix QEMU crashing when more than ~50 memory hotplugged

 backends/hostmem.c        |  6 ++++
 exec.c                    | 67 +++++++++++++++++++++++++++++-------------
 hw/i386/pc.c              |  4 +--
 hw/virtio/vhost.c         | 15 ++++++++--
 include/exec/cpu-common.h |  3 ++
 include/exec/memory.h     | 57 +++++++++++++++++++++++++++++++++--
 include/exec/ram_addr.h   |  1 -
 memory.c                  | 75 +++++++++++++++++++++++++++++++++++++++++++----
 8 files changed, 194 insertions(+), 34 deletions(-)

-- 
1.8.3.1


* [Qemu-devel] [RFC v2 1/6] memory: get rid of memory_region_destructor_ram_from_ptr()
  2015-06-08 15:19 [Qemu-devel] [RFC v2 0/6] Fix QEMU crash during memory hotplug with vhost=on Igor Mammedov
@ 2015-06-08 15:19 ` Igor Mammedov
  2015-06-08 15:23   ` Paolo Bonzini
  2015-06-08 15:19 ` [Qemu-devel] [RFC v2 2/6] memory: introduce MemoryRegion container with reserved HVA range Igor Mammedov
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 25+ messages in thread
From: Igor Mammedov @ 2015-06-08 15:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, mst

memory_region_destructor_ram_from_ptr() uses qemu_ram_free_from_ptr()
to free a RAMBlock without freeing the host RAM it points to. The
generic memory_region_destructor_ram() -> qemu_ram_free() path does the
same for RAM_PREALLOC blocks, which is what memory_region_init_ram_ptr()
creates, so reuse memory_region_destructor_ram() instead.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 exec.c                  | 19 -------------------
 include/exec/ram_addr.h |  1 -
 memory.c                |  7 +------
 3 files changed, 1 insertion(+), 26 deletions(-)

diff --git a/exec.c b/exec.c
index e19ab22..d895a86 100644
--- a/exec.c
+++ b/exec.c
@@ -1555,25 +1555,6 @@ ram_addr_t qemu_ram_alloc_resizeable(ram_addr_t size, ram_addr_t maxsz,
     return qemu_ram_alloc_internal(size, maxsz, resized, NULL, true, mr, errp);
 }
 
-void qemu_ram_free_from_ptr(ram_addr_t addr)
-{
-    RAMBlock *block;
-
-    qemu_mutex_lock_ramlist();
-    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
-        if (addr == block->offset) {
-            QLIST_REMOVE_RCU(block, next);
-            ram_list.mru_block = NULL;
-            /* Write list before version */
-            smp_wmb();
-            ram_list.version++;
-            g_free_rcu(block, rcu);
-            break;
-        }
-    }
-    qemu_mutex_unlock_ramlist();
-}
-
 static void reclaim_ramblock(RAMBlock *block)
 {
     if (block->flags & RAM_PREALLOC) {
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index ff558a4..56d2c17 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -37,7 +37,6 @@ int qemu_get_ram_fd(ram_addr_t addr);
 void *qemu_get_ram_block_host_ptr(ram_addr_t addr);
 void *qemu_get_ram_ptr(ram_addr_t addr);
 void qemu_ram_free(ram_addr_t addr);
-void qemu_ram_free_from_ptr(ram_addr_t addr);
 
 int qemu_ram_resize(ram_addr_t base, ram_addr_t newsize, Error **errp);
 
diff --git a/memory.c b/memory.c
index 03c536b..929123f 100644
--- a/memory.c
+++ b/memory.c
@@ -869,11 +869,6 @@ static void memory_region_destructor_alias(MemoryRegion *mr)
     memory_region_unref(mr->alias);
 }
 
-static void memory_region_destructor_ram_from_ptr(MemoryRegion *mr)
-{
-    qemu_ram_free_from_ptr(mr->ram_addr);
-}
-
 static void memory_region_destructor_rom_device(MemoryRegion *mr)
 {
     qemu_ram_free(mr->ram_addr & TARGET_PAGE_MASK);
@@ -1252,7 +1247,7 @@ void memory_region_init_ram_ptr(MemoryRegion *mr,
     memory_region_init(mr, owner, name, size);
     mr->ram = true;
     mr->terminates = true;
-    mr->destructor = memory_region_destructor_ram_from_ptr;
+    mr->destructor = memory_region_destructor_ram;
 
     /* qemu_ram_alloc_from_ptr cannot fail with ptr != NULL.  */
     assert(ptr != NULL);
-- 
1.8.3.1


* [Qemu-devel] [RFC v2 2/6] memory: introduce MemoryRegion container with reserved HVA range
  2015-06-08 15:19 [Qemu-devel] [RFC v2 0/6] Fix QEMU crash during memory hotplug with vhost=on Igor Mammedov
  2015-06-08 15:19 ` [Qemu-devel] [RFC v2 1/6] memory: get rid of memory_region_destructor_ram_from_ptr() Igor Mammedov
@ 2015-06-08 15:19 ` Igor Mammedov
  2015-06-08 15:19 ` [Qemu-devel] [RFC v2 3/6] memory: support unmapping of MemoryRegion mapped into HVA parent Igor Mammedov
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 25+ messages in thread
From: Igor Mammedov @ 2015-06-08 15:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, mst

The patch adds the memory_region_init_hva_range() and
memory_region_find_hva_range() APIs for allocating and looking up
a reserved-HVA MemoryRegion.

A MemoryRegion with a reserved HVA range provides a linear
1:1 GPA->HVA mapping for the RAM MemoryRegions that are added
as subregions inside it.

It will be used for memory hotplug and vhost integration,
reducing all hotplugged MemoryRegions down to one
memory range descriptor, which makes it possible to stay within
vhost's limit on the number of allowed memory ranges.
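
For illustration, the reserve-and-remap dance added below boils down
to the following standalone sketch (Linux-only; editorial example, not
part of the patch):

    #define _GNU_SOURCE
    #include <stddef.h>
    #include <sys/mman.h>

    /* claim a contiguous window of address space with no backing memory */
    static void *reserve_hva(size_t size)
    {
        void *p = mmap(NULL, size, PROT_NONE,
                       MAP_NORESERVE | MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);

        return p == MAP_FAILED ? NULL : p;
    }

    /* atomically move an existing RAM mapping into the reserved window;
     * returns MAP_FAILED on error */
    static void *remap_into(void *rsvd, size_t offset, void *old_hva,
                            size_t len)
    {
        return mremap(old_hva, len, len,
                      MREMAP_MAYMOVE | MREMAP_FIXED, (char *)rsvd + offset);
    }

MREMAP_FIXED replaces whatever is mapped at the target address, so the
PROT_NONE reservation is consumed by the RAM pages without a window in
which another mmap() could steal the range.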

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
v1->v2:
  - rename:
       memory_region_init_rsvd_hva -> memory_region_init_hva_range
       memory_region_find_rsvd_hva -> memory_region_find_hva_range
  - replace using ram_addr with "void *rsvd_hva"
  - guard Linux-specific calls with ifdef
  - split memory reservation out into qemu_ram_reserve_hva()
---
 exec.c                    | 31 ++++++++++++++++++++++++++++
 include/exec/cpu-common.h |  2 ++
 include/exec/memory.h     | 44 +++++++++++++++++++++++++++++++++++++--
 memory.c                  | 52 +++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 127 insertions(+), 2 deletions(-)

diff --git a/exec.c b/exec.c
index d895a86..7e47f64 100644
--- a/exec.c
+++ b/exec.c
@@ -1325,6 +1325,37 @@ static int memory_try_enable_merging(void *addr, size_t len)
     return qemu_madvise(addr, len, QEMU_MADV_MERGEABLE);
 }
 
+#ifdef __linux__
+void *qemu_ram_reserve_hva(ram_addr_t length)
+{
+    void *ptr = mmap(0, length, PROT_NONE,
+                     MAP_NORESERVE | MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
+
+    /* NULL on failure so memory_region_init_hva_range() degrades gracefully */
+    return ptr == MAP_FAILED ? NULL : ptr;
+}
+
+void qemu_ram_remap_hva(ram_addr_t addr, void *new_hva)
+{
+    RAMBlock *block = find_ram_block(addr);
+
+    assert(block);
+    block->host = mremap(block->host, block->used_length,
+                      block->used_length,
+                      MREMAP_MAYMOVE | MREMAP_FIXED, new_hva);
+    memory_try_enable_merging(block->host, block->used_length);
+    qemu_ram_setup_dump(block->host, block->used_length);
+}
+#else
+void *qemu_ram_reserve_hva(ram_addr_t length)
+{
+    return NULL;
+}
+
+void qemu_ram_remap_hva(ram_addr_t addr, void *new_hva)
+{
+    assert(0);
+}
+#endif
+
+
 /* Only legal before guest might have detected the memory size: e.g. on
  * incoming migration, or right after reset.
  *
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 43428bd..ec077ee 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -60,6 +60,8 @@ typedef void CPUWriteMemoryFunc(void *opaque, hwaddr addr, uint32_t value);
 typedef uint32_t CPUReadMemoryFunc(void *opaque, hwaddr addr);
 
 void qemu_ram_remap(ram_addr_t addr, ram_addr_t length);
+void *qemu_ram_reserve_hva(ram_addr_t length);
+void qemu_ram_remap_hva(ram_addr_t addr, void *new_hva);
 /* This should not be used by devices.  */
 MemoryRegion *qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr);
 void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index b61c84f..5e16026 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -174,6 +174,7 @@ struct MemoryRegion {
     bool terminates;
     bool romd_mode;
     bool ram;
+    void *rsvd_hva;
     bool skip_dump;
     bool readonly; /* For RAM regions */
     bool enabled;
@@ -281,6 +282,25 @@ void memory_region_init(MemoryRegion *mr,
                         struct Object *owner,
                         const char *name,
                         uint64_t size);
+/**
+ * memory_region_init_hva_range: Initialize a memory region with a
+ * reserved HVA range
+ *
+ * The region acts as a container for RAM memory regions.
+ * When a subregion is added with memory_region_add_subregion(), the
+ * subregion's backing host memory is remapped into the HVA range
+ * reserved by this region.
+ * Supported only on Linux. If memory reservation and remapping is not
+ * implemented for a platform, this call degrades to a regular
+ * memory_region_init().
+ *
+ * @mr: the #MemoryRegion to be initialized
+ * @owner: the object that tracks the region's reference count
+ * @name: used for debugging; not visible to the user or ABI
+ * @size: size of the region; any subregions beyond this size will be clipped
+ */
+void memory_region_init_hva_range(MemoryRegion *mr,
+                                  struct Object *owner,
+                                  const char *name,
+                                  uint64_t size);
 
 /**
  * memory_region_ref: Add 1 to a memory region's reference count
@@ -620,8 +640,8 @@ int memory_region_get_fd(MemoryRegion *mr);
  * memory_region_get_ram_ptr: Get a pointer into a RAM memory region.
  *
  * Returns a host pointer to a RAM memory region (created with
- * memory_region_init_ram() or memory_region_init_ram_ptr()).  Use with
- * care.
+ * memory_region_init_ram(), memory_region_init_ram_ptr() or
+ * memory_region_init_hva_range()).  Use with care.
  *
  * @mr: the memory region being queried.
  */
@@ -1014,6 +1034,26 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr,
                                        hwaddr addr, uint64_t size);
 
 /**
+ * memory_region_find_hva_range: find the parent MemoryRegion with
+ * a reserved HVA range and translate it into a #MemoryRegionSection.
+ *
+ * Locates the first parent #MemoryRegion of @mr that is
+ * of the reserved HVA type.
+ *
+ * Returns a #MemoryRegionSection that describes the reserved-HVA
+ * memory region:
+ *    .@mr is the found memory region;
+ *    .@offset_within_address_space is the offset of the found memory
+ *      region relative to the address space that contains it;
+ *    .@offset_within_region is the offset of @mr relative to the
+ *      found memory region;
+ *    .@size is the size of the found memory region.
+ *
+ * @mr: a #MemoryRegion whose HVA parent is looked up
+MemoryRegionSection memory_region_find_hva_range(MemoryRegion *mr);
+
+/**
  * address_space_sync_dirty_bitmap: synchronize the dirty log for all memory
  *
  * Synchronizes the dirty page log for an entire address space.
diff --git a/memory.c b/memory.c
index 929123f..f9b8a60 100644
--- a/memory.c
+++ b/memory.c
@@ -934,6 +934,15 @@ void memory_region_init(MemoryRegion *mr,
     }
 }
 
+void memory_region_init_hva_range(MemoryRegion *mr,
+                                  Object *owner,
+                                  const char *name,
+                                  uint64_t size)
+{
+    memory_region_init(mr, owner, name, size);
+    mr->rsvd_hva = qemu_ram_reserve_hva(memory_region_size(mr));
+}
+
 static void memory_region_get_addr(Object *obj, Visitor *v, void *opaque,
                                    const char *name, Error **errp)
 {
@@ -1509,6 +1518,10 @@ int memory_region_get_fd(MemoryRegion *mr)
 
 void *memory_region_get_ram_ptr(MemoryRegion *mr)
 {
+    if (mr->rsvd_hva) {
+        return mr->rsvd_hva;
+    }
+
     if (mr->alias) {
         return memory_region_get_ram_ptr(mr->alias) + mr->alias_offset;
     }
@@ -1737,6 +1750,17 @@ static void memory_region_add_subregion_common(MemoryRegion *mr,
     assert(!subregion->container);
     subregion->container = mr;
     subregion->addr = offset;
+
+    if (mr->ram) {
+        MemoryRegionSection rsvd_hva = memory_region_find_hva_range(mr);
+
+        if (rsvd_hva.mr) {
+            qemu_ram_remap_hva(mr->ram_addr,
+                memory_region_get_ram_ptr(rsvd_hva.mr) +
+                    rsvd_hva.offset_within_region);
+        }
+    }
+
     memory_region_update_container_subregions(subregion);
 }
 
@@ -1879,6 +1903,34 @@ bool memory_region_is_mapped(MemoryRegion *mr)
     return mr->container ? true : false;
 }
 
+MemoryRegionSection memory_region_find_hva_range(MemoryRegion *mr)
+{
+    MemoryRegionSection ret = { .mr = NULL };
+    MemoryRegion *hva_container = NULL;
+    hwaddr addr = mr->addr;
+    MemoryRegion *root;
+
+    for (root = mr; root->container; ) {
+        if (!hva_container && root->rsvd_hva) {
+            hva_container = root;
+            ret.offset_within_region = addr;
+        }
+        root = root->container;
+        addr += root->addr;
+    }
+
+    ret.address_space = memory_region_to_address_space(root);
+    if (!ret.address_space || !hva_container) {
+        return ret;
+    }
+
+    ret.mr = hva_container;
+    ret.offset_within_address_space = addr;
+    ret.size = int128_make64(memory_region_size(ret.mr));
+    return ret;
+}
+
 MemoryRegionSection memory_region_find(MemoryRegion *mr,
                                        hwaddr addr, uint64_t size)
 {
-- 
1.8.3.1


* [Qemu-devel] [RFC v2 3/6] memory: support unmapping of MemoryRegion mapped into HVA parent
  2015-06-08 15:19 [Qemu-devel] [RFC v2 0/6] Fix QEMU crash during memory hotplug with vhost=on Igor Mammedov
  2015-06-08 15:19 ` [Qemu-devel] [RFC v2 1/6] memory: get rid of memory_region_destructor_ram_from_ptr() Igor Mammedov
  2015-06-08 15:19 ` [Qemu-devel] [RFC v2 2/6] memory: introduce MemoryRegion container with reserved HVA range Igor Mammedov
@ 2015-06-08 15:19 ` Igor Mammedov
  2015-06-08 15:26   ` Paolo Bonzini
  2015-06-08 15:32   ` Paolo Bonzini
  2015-06-08 15:19 ` [Qemu-devel] [RFC v2 4/6] hostmem: return recreated MemoryRegion if current can't be reused Igor Mammedov
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 25+ messages in thread
From: Igor Mammedov @ 2015-06-08 15:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, mst

Due to the need to keep the reserved HVA range contiguous (and the
replacement atomic), unmap a subregion's host memory by placing a
reservation mapping over its range, and invalidate the subregion since
it can't be reused anymore.

Also add memory_region_is_hva_mapped() to let a backend check whether
its MemoryRegion should be recreated instead of returning the
invalidated region.
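
The atomicity matters here: a single MAP_FIXED mmap() replaces the old
pages in place, while munmap() followed by mmap() would leave a window
in which another thread's allocation could land inside the reserved
range. An editorial sketch of the two approaches (mirroring
qemu_ram_unmap_hva() below):

    /* racy: after munmap() any other thread's mmap() may claim the
     * range before the reservation below is installed */
    munmap(host, len);
    mmap(host, len, PROT_NONE,
         MAP_FIXED | MAP_NORESERVE | MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);

    /* atomic: one MAP_FIXED mmap() swaps the existing mapping for the
     * PROT_NONE reservation in a single step */
    mmap(host, len, PROT_NONE,
         MAP_FIXED | MAP_NORESERVE | MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);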

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 exec.c                    | 17 ++++++++++++++++-
 include/exec/cpu-common.h |  1 +
 include/exec/memory.h     | 13 +++++++++++++
 memory.c                  | 16 ++++++++++++++++
 4 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/exec.c b/exec.c
index 7e47f64..7bef7f9 100644
--- a/exec.c
+++ b/exec.c
@@ -1332,11 +1332,22 @@ void *qemu_ram_reserve_hva(ram_addr_t length)
     return ptr == MAP_FAILED ? NULL : ptr;
 }
 
+void qemu_ram_unmap_hva(ram_addr_t addr)
+{
+    RAMBlock *block = find_ram_block(addr);
+
+    assert(block);
+    mmap(block->host, block->used_length, PROT_NONE,
+         MAP_FIXED | MAP_NORESERVE | MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
+}
+
 void qemu_ram_remap_hva(ram_addr_t addr, void *new_hva)
 {
     RAMBlock *block = find_ram_block(addr);
 
     assert(block);
+    assert(!(block->flags & RAM_PREALLOC));
+    block->flags |= RAM_PREALLOC;
     block->host = mremap(block->host, block->used_length,
                       block->used_length,
                       MREMAP_MAYMOVE | MREMAP_FIXED, new_hva);
@@ -1595,11 +1606,15 @@ static void reclaim_ramblock(RAMBlock *block)
 #ifndef _WIN32
     } else if (block->fd >= 0) {
         munmap(block->host, block->max_length);
-        close(block->fd);
 #endif
     } else {
         qemu_anon_ram_free(block->host, block->max_length);
     }
+#ifndef _WIN32
+    if (block->fd >= 0) {
+        close(block->fd);
+    }
+#endif
     g_free(block);
 }
 
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index ec077ee..817762c 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -62,6 +62,7 @@ typedef uint32_t CPUReadMemoryFunc(void *opaque, hwaddr addr);
 void qemu_ram_remap(ram_addr_t addr, ram_addr_t length);
 void *qemu_ram_reserve_hva(ram_addr_t length);
 void qemu_ram_remap_hva(ram_addr_t addr, void *new_hva);
+void qemu_ram_unmap_hva(ram_addr_t addr);
 /* This should not be used by devices.  */
 MemoryRegion *qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr);
 void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 5e16026..f9e86d6 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -175,6 +175,7 @@ struct MemoryRegion {
     bool romd_mode;
     bool ram;
     void *rsvd_hva;
+    bool hva_mapped;
     bool skip_dump;
     bool readonly; /* For RAM regions */
     bool enabled;
@@ -902,6 +903,10 @@ void memory_region_add_subregion(MemoryRegion *mr,
  * memory_region_del_subregion()); use memory_region_init_alias() if you
  * want a region to be a subregion in multiple locations.
  *
+ * If @subregion has an HVA container as its parent, the call re-reserves
+ * the HVA range occupied by @subregion, freeing the host memory along
+ * the way, and invalidates @subregion so it can't be reused.
+ *
  * @mr: the region to contain the new subregion; must be a container
  *      initialized with memory_region_init().
  * @offset: the offset relative to @mr where @subregion is added.
@@ -1003,6 +1008,14 @@ bool memory_region_present(MemoryRegion *container, hwaddr addr);
 bool memory_region_is_mapped(MemoryRegion *mr);
 
 /**
+ * memory_region_is_hva_mapped: returns true if the #MemoryRegion is or
+ * was mapped into an HVA container.
+ *
+ * @mr: a #MemoryRegion to check for having been mapped into an HVA parent
+ */
+bool memory_region_is_hva_mapped(MemoryRegion *mr);
+
+/**
  * memory_region_find: translate an address/size relative to a
  * MemoryRegion into a #MemoryRegionSection.
  *
diff --git a/memory.c b/memory.c
index f9b8a60..ab6d41b 100644
--- a/memory.c
+++ b/memory.c
@@ -1754,7 +1754,9 @@ static void memory_region_add_subregion_common(MemoryRegion *mr,
     if (mr->ram) {
         MemoryRegionSection rsvd_hva = memory_region_find_hva_range(mr);
 
+        assert(!subregion->hva_mapped);
         if (rsvd_hva.mr) {
+            subregion->hva_mapped = true;
             qemu_ram_remap_hva(mr->ram_addr,
                 memory_region_get_ram_ptr(rsvd_hva.mr) +
                     rsvd_hva.offset_within_region);
@@ -1788,6 +1790,15 @@ void memory_region_del_subregion(MemoryRegion *mr,
 {
     memory_region_transaction_begin();
     assert(subregion->container == mr);
+
+    if (mr->ram) {
+        MemoryRegionSection rsvd_hva = memory_region_find_hva_range(mr);
+
+        if (rsvd_hva.mr) {
+            qemu_ram_unmap_hva(mr->ram_addr);
+        }
+    }
+
     subregion->container = NULL;
     QTAILQ_REMOVE(&mr->subregions, subregion, subregions_link);
     memory_region_unref(subregion);
@@ -1903,6 +1914,11 @@ bool memory_region_is_mapped(MemoryRegion *mr)
     return mr->container ? true : false;
 }
 
+bool memory_region_is_hva_mapped(MemoryRegion *mr)
+{
+    return mr->hva_mapped ? true : false;
+}
+
 MemoryRegionSection memory_region_find_hva_range(MemoryRegion *mr)
 {
     MemoryRegionSection ret = { .mr = NULL };
-- 
1.8.3.1


* [Qemu-devel] [RFC v2 4/6] hostmem: return recreated MemoryRegion if current can't be reused
  2015-06-08 15:19 [Qemu-devel] [RFC v2 0/6] Fix QEMU crash during memory hotplug with vhost=on Igor Mammedov
                   ` (2 preceding siblings ...)
  2015-06-08 15:19 ` [Qemu-devel] [RFC v2 3/6] memory: support unmapping of MemoryRegion mapped into HVA parent Igor Mammedov
@ 2015-06-08 15:19 ` Igor Mammedov
  2015-06-08 15:30   ` Paolo Bonzini
  2015-06-08 15:19 ` [Qemu-devel] [RFC v2 5/6] pc: reserve hotpluggable memory range with memory_region_init_hva_range() Igor Mammedov
  2015-06-08 15:19 ` [Qemu-devel] [RFC v2 6/6] pc: fix QEMU crashing when more than ~50 memory hotplugged Igor Mammedov
  5 siblings, 1 reply; 25+ messages in thread
From: Igor Mammedov @ 2015-06-08 15:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, mst

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 backends/hostmem.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/backends/hostmem.c b/backends/hostmem.c
index 2a8614d..4486483 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -283,6 +283,12 @@ static void host_memory_backend_init(Object *obj)
 MemoryRegion *
 host_memory_backend_get_memory(HostMemoryBackend *backend, Error **errp)
 {
+    UserCreatableClass *ucc = USER_CREATABLE_GET_CLASS(backend);
+
+    if (memory_region_is_hva_mapped(&backend->mr)) {
+        object_unparent(OBJECT(&backend->mr));
+        ucc->complete(USER_CREATABLE(backend), errp);
+    }
     return memory_region_size(&backend->mr) ? &backend->mr : NULL;
 }
 
-- 
1.8.3.1


* [Qemu-devel] [RFC v2 5/6] pc: reserve hotpluggable memory range with memory_region_init_hva_range()
  2015-06-08 15:19 [Qemu-devel] [RFC v2 0/6] Fix QEMU crash during memory hotplug with vhost=on Igor Mammedov
                   ` (3 preceding siblings ...)
  2015-06-08 15:19 ` [Qemu-devel] [RFC v2 4/6] hostmem: return recreated MemoryRegion if current can't be reused Igor Mammedov
@ 2015-06-08 15:19 ` Igor Mammedov
  2015-06-08 15:19 ` [Qemu-devel] [RFC v2 6/6] pc: fix QEMU crashing when more than ~50 memory hotplugged Igor Mammedov
  5 siblings, 0 replies; 25+ messages in thread
From: Igor Mammedov @ 2015-06-08 15:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, mst

It makes sure that all hotplugged memory is put into a contiguous
HVA range, allowing the use of a 1:1 GPA<->HVA mapping.

The 1:1 mapping will be used by vhost to reduce the number of memory
ranges for hotplugged memory to a single range that covers
the whole hotpluggable memory address space.
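
As a worked example (addresses are illustrative, not taken from the
patch): if the container is reserved at host address container_hva and
mapped at GPA 0x100000000, a DIMM plugged 1 GiB into the container ends
up at GPA 0x140000000 and, after the remap, at exactly
container_hva + 0x40000000, so a GPA->HVA translation needs no lookup:

    hva = container_hva + (gpa - 0x100000000ULL);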

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/i386/pc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 2baff4a..82278a9 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1333,8 +1333,8 @@ FWCfgState *pc_memory_init(MachineState *machine,
             exit(EXIT_FAILURE);
         }
 
-        memory_region_init(&pcms->hotplug_memory, OBJECT(pcms),
-                           "hotplug-memory", hotplug_mem_size);
+        memory_region_init_hva_range(&pcms->hotplug_memory, OBJECT(pcms),
+                                     "hotplug-memory", hotplug_mem_size);
         memory_region_add_subregion(system_memory, pcms->hotplug_memory_base,
                                     &pcms->hotplug_memory);
     }
-- 
1.8.3.1


* [Qemu-devel] [RFC v2 6/6] pc: fix QEMU crashing when more than ~50 memory hotplugged
  2015-06-08 15:19 [Qemu-devel] [RFC v2 0/6] Fix QEMU crash during memory hotplug with vhost=on Igor Mammedov
                   ` (4 preceding siblings ...)
  2015-06-08 15:19 ` [Qemu-devel] [RFC v2 5/6] pc: reserve hotpluggable memory range with memory_region_init_hva_range() Igor Mammedov
@ 2015-06-08 15:19 ` Igor Mammedov
  5 siblings, 0 replies; 25+ messages in thread
From: Igor Mammedov @ 2015-06-08 15:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, mst

QEMU asserts in vhost due to hitting the vhost backend's limit
on the number of supported memory regions.

Instead of increasing the limit in the backends, describe all
hotplugged memory to vhost as one contiguous range with a linear
1:1 HVA->GPA mapping in the backend.

This avoids increasing the VHOST_MEMORY_MAX_NREGIONS limit
in the kernel and refactoring the current region lookup algorithm
into a faster/more scalable data structure. The same applies to
vhost-user, which has an even lower limit.
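
Concretely, instead of one vhost_memory_region per hotplugged DIMM, the
whole hotplug container is described to the backend once. A sketch
(editorial; hotplug_base, hotplug_size and container_hva stand for the
container's GPA start, its size and its reserved HVA base):

    #include <stdint.h>
    #include <linux/vhost.h>

    /* fill one region covering the entire hotpluggable address space */
    static void describe_hotplug_once(struct vhost_memory_region *reg,
                                      uint64_t hotplug_base,
                                      uint64_t hotplug_size,
                                      void *container_hva)
    {
        reg->guest_phys_addr = hotplug_base;
        reg->memory_size     = hotplug_size;
        reg->userspace_addr  = (uint64_t)(uintptr_t)container_hva;
        reg->flags_padding   = 0;
        /* any DIMM inside translates as:
         *   hva = container_hva + (gpa - hotplug_base)
         */
    }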

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/virtio/vhost.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 01f1e04..49a7b2e 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -419,6 +419,7 @@ static void vhost_set_memory(MemoryListener *listener,
     bool log_dirty = memory_region_is_logging(section->mr);
     int s = offsetof(struct vhost_memory, regions) +
         (dev->mem->nregions + 1) * sizeof dev->mem->regions[0];
+    MemoryRegionSection rsvd_hva;
     void *ram;
 
     dev->mem = g_realloc(dev->mem, s);
@@ -427,17 +428,25 @@ static void vhost_set_memory(MemoryListener *listener,
         add = false;
     }
 
+    rsvd_hva = memory_region_find_hva_range(section->mr);
+    if (rsvd_hva.mr) {
+        start_addr = rsvd_hva.offset_within_address_space;
+        size = int128_get64(rsvd_hva.size);
+        ram = memory_region_get_ram_ptr(rsvd_hva.mr);
+    } else {
+        ram = memory_region_get_ram_ptr(section->mr) + section->offset_within_region;
+    }
+
     assert(size);
 
     /* Optimize no-change case. At least cirrus_vga does this a lot at this time. */
-    ram = memory_region_get_ram_ptr(section->mr) + section->offset_within_region;
     if (add) {
-        if (!vhost_dev_cmp_memory(dev, start_addr, size, (uintptr_t)ram)) {
+        if (!rsvd_hva.mr && !vhost_dev_cmp_memory(dev, start_addr, size, (uintptr_t)ram)) {
             /* Region exists with same address. Nothing to do. */
             return;
         }
     } else {
-        if (!vhost_dev_find_reg(dev, start_addr, size)) {
+        if (!rsvd_hva.mr && !vhost_dev_find_reg(dev, start_addr, size)) {
             /* Removing region that we don't access. Nothing to do. */
             return;
         }
-- 
1.8.3.1


* Re: [Qemu-devel] [RFC v2 1/6] memory: get rid of memory_region_destructor_ram_from_ptr()
  2015-06-08 15:19 ` [Qemu-devel] [RFC v2 1/6] memory: get rid of memory_region_destructor_ram_from_ptr() Igor Mammedov
@ 2015-06-08 15:23   ` Paolo Bonzini
  2015-06-08 16:08     ` Igor Mammedov
  0 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2015-06-08 15:23 UTC (permalink / raw)
  To: Igor Mammedov, qemu-devel; +Cc: mst



On 08/06/2015 17:19, Igor Mammedov wrote:
> -    qemu_mutex_lock_ramlist();
> -    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> -        if (addr == block->offset) {
> -            QLIST_REMOVE_RCU(block, next);
> -            ram_list.mru_block = NULL;
> -            /* Write list before version */
> -            smp_wmb();
> -            ram_list.version++;
> -            g_free_rcu(block, rcu);

qemu_ram_free here does:

            call_rcu(block, reclaim_ramblock, rcu);

which is different.

Paolo

> -            break;
> -        }
> -    }
> -    qemu_mutex_unlock_ramlist();


* Re: [Qemu-devel] [RFC v2 3/6] memory: support unmapping of MemoryRegion mapped into HVA parent
  2015-06-08 15:19 ` [Qemu-devel] [RFC v2 3/6] memory: support unmapping of MemoryRegion mapped into HVA parent Igor Mammedov
@ 2015-06-08 15:26   ` Paolo Bonzini
  2015-06-08 15:32   ` Paolo Bonzini
  1 sibling, 0 replies; 25+ messages in thread
From: Paolo Bonzini @ 2015-06-08 15:26 UTC (permalink / raw)
  To: Igor Mammedov, qemu-devel; +Cc: mst



On 08/06/2015 17:19, Igor Mammedov wrote:
> +        assert(!subregion->hva_mapped);
>          if (rsvd_hva.mr) {
> +            subregion->hva_mapped = true;
>              qemu_ram_remap_hva(mr->ram_addr,
>                  memory_region_get_ram_ptr(rsvd_hva.mr) +
>                      rsvd_hva.offset_within_region);
> @@ -1788,6 +1790,15 @@ void memory_region_del_subregion(MemoryRegion *mr,
>  {
>      memory_region_transaction_begin();
>      assert(subregion->container == mr);
> +
> +    if (mr->ram) {

"if (subregion->hva_mapped)" I think would replace this "if" and the one
below.

Paolo

> +        MemoryRegionSection rsvd_hva = memory_region_find_hva_range(mr);
> +
> +        if (rsvd_hva.mr) {
> +            qemu_ram_unmap_hva(mr->ram_addr);
> +        }
> +    }
> +


* Re: [Qemu-devel] [RFC v2 4/6] hostmem: return recreated MemoryRegion if current can't be reused
  2015-06-08 15:19 ` [Qemu-devel] [RFC v2 4/6] hostmem: return recreated MemoryRegion if current can't be reused Igor Mammedov
@ 2015-06-08 15:30   ` Paolo Bonzini
  2015-06-08 16:25     ` Igor Mammedov
  0 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2015-06-08 15:30 UTC (permalink / raw)
  To: Igor Mammedov, qemu-devel; +Cc: mst



On 08/06/2015 17:19, Igor Mammedov wrote:
>  MemoryRegion *
>  host_memory_backend_get_memory(HostMemoryBackend *backend, Error **errp)
>  {
> +    UserCreatableClass *ucc = USER_CREATABLE_GET_CLASS(backend);
> +
> +    if (memory_region_is_hva_mapped(&backend->mr)) {
> +        object_unparent(OBJECT(&backend->mr));
> +        ucc->complete(USER_CREATABLE(backend), errp);
> +    }

I'm not sure I understand this, and the commit message... doesn't help.

Is it for the case where you unplug memory and then reuse the old
backend?  Can we just outlaw this, forcing each memory backend to be
used only once?

Paolo


* Re: [Qemu-devel] [RFC v2 3/6] memory: support unmapping of MemoryRegion mapped into HVA parent
  2015-06-08 15:19 ` [Qemu-devel] [RFC v2 3/6] memory: support unmapping of MemoryRegion mapped into HVA parent Igor Mammedov
  2015-06-08 15:26   ` Paolo Bonzini
@ 2015-06-08 15:32   ` Paolo Bonzini
  2015-06-08 16:13     ` Igor Mammedov
  1 sibling, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2015-06-08 15:32 UTC (permalink / raw)
  To: Igor Mammedov, qemu-devel; +Cc: mst



On 08/06/2015 17:19, Igor Mammedov wrote:
> +void qemu_ram_unmap_hva(ram_addr_t addr)
> +{
> +    RAMBlock *block = find_ram_block(addr);
> +
> +    assert(block);
> +    mmap(block->host, block->used_length, PROT_NONE,
> +         MAP_FIXED | MAP_NORESERVE | MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> +}
> +

Hmm, this is not good. :(  The area at block->host can be in use, for
example via memory_region_ref/memory_region_unref.  This can happen a
bit after the memory_region_del_subregion.  So you can SEGV if you
simply make a synchronous update.  I'm not sure if there is a solution
(but thanks for splitting the patches in a way that made the problem
clear!).

Paolo


* Re: [Qemu-devel] [RFC v2 1/6] memory: get rid of memory_region_destructor_ram_from_ptr()
  2015-06-08 15:23   ` Paolo Bonzini
@ 2015-06-08 16:08     ` Igor Mammedov
  2015-06-08 16:09       ` Paolo Bonzini
  0 siblings, 1 reply; 25+ messages in thread
From: Igor Mammedov @ 2015-06-08 16:08 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, mst

On Mon, 08 Jun 2015 17:23:35 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> 
> 
> On 08/06/2015 17:19, Igor Mammedov wrote:
> > -    qemu_mutex_lock_ramlist();
> > -    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> > -        if (addr == block->offset) {
> > -            QLIST_REMOVE_RCU(block, next);
> > -            ram_list.mru_block = NULL;
> > -            /* Write list before version */
> > -            smp_wmb();
> > -            ram_list.version++;
> > -            g_free_rcu(block, rcu);
> 
> qemu_ram_free here does:
> 
>             call_rcu(block, reclaim_ramblock, rcu);
> 
> which is different.


qemu_ram_free() calls reclaim_ramblock() which does:

  if (!(block->flags & RAM_PREALLOC))
      free_host_memory()

  g_free(block)

while
  g_free_rcu(block, rcu) simply results in g_free(block),

and for memory_region_init_ram_ptr() we set RAM_PREALLOC,
so qemu_ram_free() degrades to g_free(block).
 

  

> 
> Paolo
> 
> > -            break;
> > -        }
> > -    }
> > -    qemu_mutex_unlock_ramlist();


* Re: [Qemu-devel] [RFC v2 1/6] memory: get rid of memory_region_destructor_ram_from_ptr()
  2015-06-08 16:08     ` Igor Mammedov
@ 2015-06-08 16:09       ` Paolo Bonzini
  2015-06-08 16:26         ` Igor Mammedov
  0 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2015-06-08 16:09 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: qemu-devel, mst



On 08/06/2015 18:08, Igor Mammedov wrote:
> On Mon, 08 Jun 2015 17:23:35 +0200
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
>>
>>
>> On 08/06/2015 17:19, Igor Mammedov wrote:
>>> -    qemu_mutex_lock_ramlist();
>>> -    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>>> -        if (addr == block->offset) {
>>> -            QLIST_REMOVE_RCU(block, next);
>>> -            ram_list.mru_block = NULL;
>>> -            /* Write list before version */
>>> -            smp_wmb();
>>> -            ram_list.version++;
>>> -            g_free_rcu(block, rcu);
>>
>> qemu_ram_free here does:
>>
>>             call_rcu(block, reclaim_ramblock, rcu);
>>
>> which is different.
> 
> 
> qemu_ram_free() calls reclaim_ramblock() which does:
> 
>   if (!(block->flags & RAM_PREALLOC))
>       free_host_memory()
> 
>   g_free(block)
> 
> while
>   g_free_rcu(block, rcu) results -> g_free(block)
> 
> and for memory_region_init_ram_ptr() we set RAM_PREALLOC
> so qemu_ram_free() degrades to g_free(block).

Please put this in the commit message. :)

Paolo


* Re: [Qemu-devel] [RFC v2 3/6] memory: support unmapping of MemoryRegion mapped into HVA parent
  2015-06-08 15:32   ` Paolo Bonzini
@ 2015-06-08 16:13     ` Igor Mammedov
  2015-06-08 16:25       ` Michael S. Tsirkin
  0 siblings, 1 reply; 25+ messages in thread
From: Igor Mammedov @ 2015-06-08 16:13 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, mst

On Mon, 08 Jun 2015 17:32:27 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> 
> 
> On 08/06/2015 17:19, Igor Mammedov wrote:
> > +void qemu_ram_unmap_hva(ram_addr_t addr)
> > +{
> > +    RAMBlock *block = find_ram_block(addr);
> > +
> > +    assert(block);
> > +    mmap(block->host, block->used_length, PROT_NONE,
> > +         MAP_FIXED | MAP_NORESERVE | MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> > +}
> > +
> 
> Hmm, this is not good. :(  The area at block->host can be in use, for
> example via memory_region_ref/memory_region_unref.  This can happen a
> bit after the memory_region_del_subregion.  So you can SEGV if you
> simply make a synchronous update.  I'm not sure if there is a solution
Yep, that's the problem I haven't found a solution to so far;
any ideas on how to approach this are appreciated.

The issue is that we have to re-reserve the HVA region first so no other
allocation would claim the gap, and the only way I found was just to call
mmap() on it, which as a side effect invalidates the MemoryRegion's
backing RAM.

> (but thanks for splitting the patches in a way that made the problem
> clear!).
> 
> Paolo


* Re: [Qemu-devel] [RFC v2 4/6] hostmem: return recreated MemoryRegion if current can't be reused
  2015-06-08 15:30   ` Paolo Bonzini
@ 2015-06-08 16:25     ` Igor Mammedov
  2015-06-08 16:28       ` Paolo Bonzini
  0 siblings, 1 reply; 25+ messages in thread
From: Igor Mammedov @ 2015-06-08 16:25 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, mst

On Mon, 08 Jun 2015 17:30:20 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> 
> 
> On 08/06/2015 17:19, Igor Mammedov wrote:
> >  MemoryRegion *
> >  host_memory_backend_get_memory(HostMemoryBackend *backend, Error **errp)
> >  {
> > +    UserCreatableClass *ucc = USER_CREATABLE_GET_CLASS(backend);
> > +
> > +    if (memory_region_is_hva_mapped(&backend->mr)) {
> > +        object_unparent(OBJECT(&backend->mr));
> > +        ucc->complete(USER_CREATABLE(backend), errp);
> > +    }
> 
> I'm not sure I understand this, and the commit message... doesn't help.
> 
> Is it for the case where you unplug memory and then reuse the old
> backend?
yes

>  Can we just outlaw this, forcing each memory backend to be
> used only once?
To outlaw it gracefully, without asserting QEMU during hotplug, the user
should be able to detect that it's outlawed, i.e. use
memory_region_is_hva_mapped(); but yes, we can.

> 
> Paolo


* Re: [Qemu-devel] [RFC v2 3/6] memory: support unmapping of MemoryRegion mapped into HVA parent
  2015-06-08 16:13     ` Igor Mammedov
@ 2015-06-08 16:25       ` Michael S. Tsirkin
  2015-06-08 17:06         ` Paolo Bonzini
  0 siblings, 1 reply; 25+ messages in thread
From: Michael S. Tsirkin @ 2015-06-08 16:25 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: Paolo Bonzini, qemu-devel

On Mon, Jun 08, 2015 at 06:13:14PM +0200, Igor Mammedov wrote:
> On Mon, 08 Jun 2015 17:32:27 +0200
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> > 
> > 
> > On 08/06/2015 17:19, Igor Mammedov wrote:
> > > +void qemu_ram_unmap_hva(ram_addr_t addr)
> > > +{
> > > +    RAMBlock *block = find_ram_block(addr);
> > > +
> > > +    assert(block);
> > > +    mmap(block->host, block->used_length, PROT_NONE,
> > > +         MAP_FIXED | MAP_NORESERVE | MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> > > +}
> > > +
> > 
> > Hmm, this is not good. :(  The area at block->host can be in use, for
> > example via memory_region_ref/memory_region_unref.  This can happen a
> > bit after the memory_region_del_subregion.  So you can SEGV if you
> > simply make a synchronous update.  I'm not sure if there is a solution
> Yep, that's the problem I haven't found a solution to so far;
> any ideas on how to approach this are appreciated.
> 
> The issue is that we have to re-reserve the HVA region first so no other
> allocation would claim the gap, and the only way I found was just to call
> mmap() on it, which as a side effect invalidates the MemoryRegion's
> backing RAM.

Well, the only point where we need to mmap() is where we'd normally
unmap; if that's not safe, then unmapping wouldn't be safe either?

> > (but thanks for splitting the patches in a way that made the problem
> > clear!).
> > 
> > Paolo


* Re: [Qemu-devel] [RFC v2 1/6] memory: get rid of memory_region_destructor_ram_from_ptr()
  2015-06-08 16:09       ` Paolo Bonzini
@ 2015-06-08 16:26         ` Igor Mammedov
  0 siblings, 0 replies; 25+ messages in thread
From: Igor Mammedov @ 2015-06-08 16:26 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, mst

On Mon, 08 Jun 2015 18:09:17 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> 
> 
> On 08/06/2015 18:08, Igor Mammedov wrote:
> > On Mon, 08 Jun 2015 17:23:35 +0200
> > Paolo Bonzini <pbonzini@redhat.com> wrote:
> > 
> >>
> >>
> >> On 08/06/2015 17:19, Igor Mammedov wrote:
> >>> -    qemu_mutex_lock_ramlist();
> >>> -    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> >>> -        if (addr == block->offset) {
> >>> -            QLIST_REMOVE_RCU(block, next);
> >>> -            ram_list.mru_block = NULL;
> >>> -            /* Write list before version */
> >>> -            smp_wmb();
> >>> -            ram_list.version++;
> >>> -            g_free_rcu(block, rcu);
> >>
> >> qemu_ram_free here does:
> >>
> >>             call_rcu(block, reclaim_ramblock, rcu);
> >>
> >> which is different.
> > 
> > 
> > qemu_ram_free() calls reclaim_ramblock() which does:
> > 
> >   if (!(block->flags & RAM_PREALLOC))
> >       free_host_memory()
> > 
> >   g_free(block)
> > 
> > while
> >   g_free_rcu(block, rcu) results -> g_free(block)
> > 
> > and for memory_region_init_ram_ptr() we set RAM_PREALLOC
> > so qemu_ram_free() degrades to g_free(block).
> 
> Please put this in the commit message. :)
OK, I'll post it as a separate cleanup patch since it doesn't depend on
the rest of the series.


> 
> Paolo


* Re: [Qemu-devel] [RFC v2 4/6] hostmem: return recreated MemoryRegion if current can't be reused
  2015-06-08 16:25     ` Igor Mammedov
@ 2015-06-08 16:28       ` Paolo Bonzini
  0 siblings, 0 replies; 25+ messages in thread
From: Paolo Bonzini @ 2015-06-08 16:28 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: qemu-devel, mst



On 08/06/2015 18:25, Igor Mammedov wrote:
> 
>> >  Can we just outlaw this, forcing each memory backend to be
>> > used only once?
> To outlaw it gracefully, without asserting QEMU during hotplug, the user
> should be able to detect that it's outlawed, i.e. use
> memory_region_is_hva_mapped(); but yes, we can.

We can also add a ->detach method to HostMemoryBackend, and forbid
get_memory after it has been invoked.

Paolo


* Re: [Qemu-devel] [RFC v2 3/6] memory: support unmapping of MemoryRegion mapped into HVA parent
  2015-06-08 16:25       ` Michael S. Tsirkin
@ 2015-06-08 17:06         ` Paolo Bonzini
  2015-06-09 10:08           ` Igor Mammedov
  0 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2015-06-08 17:06 UTC (permalink / raw)
  To: Michael S. Tsirkin, Igor Mammedov; +Cc: qemu-devel



On 08/06/2015 18:25, Michael S. Tsirkin wrote:
> > The issue is that we have to re-reserve the HVA region first so no other
> > allocation would claim the gap, and the only way I found was just to call
> > mmap() on it, which as a side effect invalidates the MemoryRegion's
> > backing RAM.
> 
> Well, the only point where we need to mmap() is where we'd normally
> unmap; if that's not safe, then unmapping wouldn't be safe either?

I think it is possible to map slot 2 at address 0x12340000 right
after unmapping slot 1 at the same address but before an RCU grace
period has expired.

If this is possible, then you can have two DIMMs trying to mmap
themselves at the same address.

Probably you need to stop using object_child_foreach in
hw/mem/pc-dimm.c, and instead build your own list.  An object can keep a
"weak" reference to itself in the list, and remove itself from the list
at instance_finalize time.
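
A sketch of the suggested scheme (hypothetical; the list field and
helper names are made up):

    /* hw/mem/pc-dimm.c -- assuming PCDIMMDevice gains a
     * QTAILQ_ENTRY(PCDIMMDevice) list; field */
    static QTAILQ_HEAD(, PCDIMMDevice) dimm_list =
        QTAILQ_HEAD_INITIALIZER(dimm_list);

    static void pc_dimm_instance_init(Object *obj)
    {
        QTAILQ_INSERT_TAIL(&dimm_list, PC_DIMM(obj), list);
    }

    static void pc_dimm_instance_finalize(Object *obj)
    {
        /* the "weak" reference: the object drops itself when it dies */
        QTAILQ_REMOVE(&dimm_list, PC_DIMM(obj), list);
    }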

Paolo


* Re: [Qemu-devel] [RFC v2 3/6] memory: support unmapping of MemoryRegion mapped into HVA parent
  2015-06-08 17:06         ` Paolo Bonzini
@ 2015-06-09 10:08           ` Igor Mammedov
  2015-06-17  8:14             ` Paolo Bonzini
  0 siblings, 1 reply; 25+ messages in thread
From: Igor Mammedov @ 2015-06-09 10:08 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, Michael S. Tsirkin

On Mon, 08 Jun 2015 19:06:39 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> 
> 
> On 08/06/2015 18:25, Michael S. Tsirkin wrote:
> > > The issue is that we have to re-reserve the HVA region first so no other
> > > allocation would claim the gap, and the only way I found was just to call
> > > mmap() on it, which as a side effect invalidates the MemoryRegion's
> > > backing RAM.
> > 
> > Well, the only point where we need to mmap() is where we'd normally
> > unmap; if that's not safe, then unmapping wouldn't be safe either?
> 
> I think it is possible to map slot 2 at address 0x12340000 right
> after unmapping slot 1 at the same address but before an RCU grace
> period has expired.
Let me sum up my understanding of the issue:

1. We can "unmap" the GPA of an HVA-remapped region from the guest's
   current flatview using memory_region_del_subregion(), but the mapping
   will stay in the old flatview until an RCU grace period passes.

2. The hanging reference from the old flatview doesn't allow us to
   mmap(RESERVED) over the to-be-freed range in the container's HVA range.

3. Until #2 is done we can't allow another memory region to be mapped in
   the current flatview at the same range, hence the need to keep a list of
   still-active HVA ranges so we can check at memory_region_add_subregion()
   time that the new mapping is allowed => adding an "Error **errp"
   to memory_region_add_subregion() for reporting failure (a sketch of
   such bookkeeping follows below).
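
The bookkeeping in #3 could be as simple as (a hypothetical sketch, all
names made up):

    #include "qemu/osdep.h"
    #include "qemu/queue.h"

    typedef struct ActiveHvaRange {
        void *start;
        size_t len;
        QLIST_ENTRY(ActiveHvaRange) next;
    } ActiveHvaRange;

    static QLIST_HEAD(, ActiveHvaRange) active_hva_ranges =
        QLIST_HEAD_INITIALIZER(active_hva_ranges);

    /* false if [start, start+len) still overlaps a range that an old
     * flatview may reference */
    static bool hva_range_is_free(void *start, size_t len)
    {
        ActiveHvaRange *r;

        QLIST_FOREACH(r, &active_hva_ranges, next) {
            if ((char *)start < (char *)r->start + r->len &&
                (char *)r->start < (char *)start + len) {
                return false;
            }
        }
        return true;
    }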


> 
> If this is possible, then you can have two DIMMs trying to mmap
> themselves at the same address.
> 
> Probably you need to stop using object_child_foreach in
> hw/mem/pc-dimm.c, and instead build your own list.  An object can keep a
> "weak" reference to itself in the list, and remove itself from the list
> at instance_finalize time.
I don't get what you're suggesting; how would it solve the issue of the
still-alive MemoryRegion reference in the old flatview?


> 
> Paolo


* Re: [Qemu-devel] [RFC v2 3/6] memory: support unmapping of MemoryRegion mapped into HVA parent
  2015-06-09 10:08           ` Igor Mammedov
@ 2015-06-17  8:14             ` Paolo Bonzini
  2015-06-17 15:04               ` Igor Mammedov
  0 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2015-06-17  8:14 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: qemu-devel, Michael S. Tsirkin



On 09/06/2015 12:08, Igor Mammedov wrote:
> 
> 3. Until #2 is done we can't allow another memory region to be mapped in
>    the current flatview at the same range, hence the need to keep a list of
>    still-active HVA ranges so we can check at memory_region_add_subregion()
>    time that the new mapping is allowed => adding an "Error **errp"
>    to memory_region_add_subregion() for reporting failure.

No, we cannot do that.  memory_region_add_subregion cannot fail.  The
RCU mechanism is transparent.

Paolo


* Re: [Qemu-devel] [RFC v2 3/6] memory: support unmapping of MemoryRegion mapped into HVA parent
  2015-06-17  8:14             ` Paolo Bonzini
@ 2015-06-17 15:04               ` Igor Mammedov
  2015-06-17 15:10                 ` Michael S. Tsirkin
  0 siblings, 1 reply; 25+ messages in thread
From: Igor Mammedov @ 2015-06-17 15:04 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, Michael S. Tsirkin

On Wed, 17 Jun 2015 10:14:00 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> 
> 
> On 09/06/2015 12:08, Igor Mammedov wrote:
> > 
> > > 3. Until #2 is done we can't allow another memory region to be mapped in
> > >    the current flatview at the same range, hence the need to keep a list of
> > >    still-active HVA ranges so we can check at memory_region_add_subregion()
> > >    time that the new mapping is allowed => adding an "Error **errp"
> > >    to memory_region_add_subregion() for reporting failure.
> 
> No, we cannot do that.  memory_region_add_subregion cannot fail.  The
> RCU mechanism is transparent.
So what are the options?

Actually, memory_region_add_subregion() (which cannot fail) will continue
to cause problems when one of the listeners fails and has no way to
propagate the error up the stack. It could be that vhost_set_memory() or
kvm_set_memslots() isn't able to allocate memory for a replacement table
in low-memory conditions, and without a way to report back, QEMU is left
with no option but to die.

> 
> Paolo


* Re: [Qemu-devel] [RFC v2 3/6] memory: support unmapping of MemoryRegion mapped into HVA parent
  2015-06-17 15:04               ` Igor Mammedov
@ 2015-06-17 15:10                 ` Michael S. Tsirkin
  2015-06-17 16:15                   ` Paolo Bonzini
  0 siblings, 1 reply; 25+ messages in thread
From: Michael S. Tsirkin @ 2015-06-17 15:10 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: Paolo Bonzini, qemu-devel

On Wed, Jun 17, 2015 at 05:04:37PM +0200, Igor Mammedov wrote:
> On Wed, 17 Jun 2015 10:14:00 +0200
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> > 
> > 
> > On 09/06/2015 12:08, Igor Mammedov wrote:
> > > 
> > > 3. Until #2 is done we can't allow another memory region to be mapped in
> > >    the current flatview at the same range, hence the need to keep a list of
> > >    still-active HVA ranges so we can check at memory_region_add_subregion()
> > >    time that the new mapping is allowed => adding an "Error **errp"
> > >    to memory_region_add_subregion() for reporting failure.
> > 
> > No, we cannot do that.  memory_region_add_subregion cannot fail.  The
> > RCU mechanism is transparent.
> So what are the options?
> 
> Actually, memory_region_add_subregion() (which cannot fail) will continue
> to cause problems when one of the listeners fails and has no way to
> propagate the error up the stack. It could be that vhost_set_memory() or
> kvm_set_memslots() isn't able to allocate memory for a replacement table
> in low-memory conditions, and without a way to report back, QEMU is left
> with no option but to die.

mmap reserving memory just adds it to a data structure somewhere
within glibc, doesn't it? Looks like it's highly unlikely to fail,
so maybe killing QEMU if it does isn't a big deal.

> > 
> > Paolo


* Re: [Qemu-devel] [RFC v2 3/6] memory: support unmapping of MemoryRegion mapped into HVA parent
  2015-06-17 15:10                 ` Michael S. Tsirkin
@ 2015-06-17 16:15                   ` Paolo Bonzini
  2015-06-17 16:30                     ` Michael S. Tsirkin
  0 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2015-06-17 16:15 UTC (permalink / raw)
  To: Michael S. Tsirkin, Igor Mammedov; +Cc: qemu-devel



On 17/06/2015 17:10, Michael S. Tsirkin wrote:
> > Actually, memory_region_add_subregion() (which cannot fail) will continue
> > to cause problems when one of the listeners fails and has no way to
> > propagate the error up the stack. It could be that vhost_set_memory() or
> > kvm_set_memslots() isn't able to allocate memory for a replacement table
> > in low-memory conditions, and without a way to report back, QEMU is left
> > with no option but to die.
> 
> mmap reserving memory just adds it to a data structure somewhere
> within glibc, doesn't it? Looks like it's highly unlikely to fail,
> so maybe killing QEMU if it does isn't a big deal.

It's in the kernel, but yes, a MAP_NORESERVE mmap is highly unlikely to
fail.

Paolo


* Re: [Qemu-devel] [RFC v2 3/6] memory: support unmapping of MemoryRegion mapped into HVA parent
  2015-06-17 16:15                   ` Paolo Bonzini
@ 2015-06-17 16:30                     ` Michael S. Tsirkin
  0 siblings, 0 replies; 25+ messages in thread
From: Michael S. Tsirkin @ 2015-06-17 16:30 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Igor Mammedov, qemu-devel

On Wed, Jun 17, 2015 at 06:15:14PM +0200, Paolo Bonzini wrote:
> 
> 
> On 17/06/2015 17:10, Michael S. Tsirkin wrote:
> > > Actually, memory_region_add_subregion() (which cannot fail) will continue
> > > to cause problems when one of the listeners fails and has no way to
> > > propagate the error up the stack. It could be that vhost_set_memory() or
> > > kvm_set_memslots() isn't able to allocate memory for a replacement table
> > > in low-memory conditions, and without a way to report back, QEMU is left
> > > with no option but to die.
> > 
> > mmap reserving memory just adds it to a data structure somewhere
> > within glibc, doesn't it? Looks like it's highly unlikely to fail,
> > so maybe killing QEMU if it does isn't a big deal.
> 
> It's in the kernel, but yes, a MAP_NORESERVE mmap is highly unlikely to
> fail.
> 
> Paolo

So let's ignore it for now; we always crash if e.g. close() fails, too.

