All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v4 0/6] Rework vhost memory region updates
@ 2017-12-13 18:08 Dr. David Alan Gilbert (git)
  2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 1/6] vhost: Move log_dirty check Dr. David Alan Gilbert (git)
                   ` (5 more replies)
  0 siblings, 6 replies; 24+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2017-12-13 18:08 UTC (permalink / raw)
  To: qemu-devel, pbonzini, imammedo; +Cc: maxime.coquelin, mst, groug

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Hi,
  This patch set reworks the way the vhost code handles changes in
physical address space layout that came from a discussion with Igor.
 
Its intention is to simplify a lot of the update code,
and to make it easier for the postcopy+shared code to
do the hugepage alignments that are needed.
 
Instead of inserting/removing each section during the add/del
callbacks of the listener, we start afresh and build a list
from the add and nop callbacks, then at the end compare the list
we've built with the exisiting list.
 
v4
  Rework based on comments from Paolo; now using add/nop rather
  than doing another flatview walk.
 
Dave

Dr. David Alan Gilbert (6):
  vhost: Move log_dirty check
  vhost: Simplify ring verification checks
  vhost: Add temporary memory structure
  vhost: add regions to temporary list
  vhost: compare and flip in new memory region list
  vhost: Clean out old vhost_set_memory and friends

 hw/virtio/trace-events    |   6 +
 hw/virtio/vhost.c         | 426 +++++++++++++++-------------------------------
 include/hw/virtio/vhost.h |   4 +-
 3 files changed, 146 insertions(+), 290 deletions(-)

-- 
2.14.3

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v4 1/6] vhost: Move log_dirty check
  2017-12-13 18:08 [Qemu-devel] [PATCH v4 0/6] Rework vhost memory region updates Dr. David Alan Gilbert (git)
@ 2017-12-13 18:08 ` Dr. David Alan Gilbert (git)
  2017-12-14 14:30   ` Igor Mammedov
  2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 2/6] vhost: Simplify ring verification checks Dr. David Alan Gilbert (git)
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 24+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2017-12-13 18:08 UTC (permalink / raw)
  To: qemu-devel, pbonzini, imammedo; +Cc: maxime.coquelin, mst, groug

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Move the log_dirty check into vhost_section.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 hw/virtio/trace-events |  3 +++
 hw/virtio/vhost.c      | 20 +++++++++++++-------
 2 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 775461ae98..4a493bcd46 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -1,5 +1,8 @@
 # See docs/devel/tracing.txt for syntax documentation.
 
+# hw/virtio/vhost.c
+vhost_section(const char *name, int r) "%s:%d"
+
 # hw/virtio/virtio.c
 virtqueue_alloc_element(void *elem, size_t sz, unsigned in_num, unsigned out_num) "elem %p size %zd in_num %u out_num %u"
 virtqueue_fill(void *vq, const void *elem, unsigned int len, unsigned int idx) "vq %p elem %p len %u idx %u"
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index e4290ce93d..e923219e63 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -27,6 +27,7 @@
 #include "hw/virtio/virtio-access.h"
 #include "migration/blocker.h"
 #include "sysemu/dma.h"
+#include "trace.h"
 
 /* enabled until disconnected backend stabilizes */
 #define _VHOST_DEBUG 1
@@ -567,18 +568,12 @@ static void vhost_set_memory(MemoryListener *listener,
                                          memory_listener);
     hwaddr start_addr = section->offset_within_address_space;
     ram_addr_t size = int128_get64(section->size);
-    bool log_dirty =
-        memory_region_get_dirty_log_mask(section->mr) & ~(1 << DIRTY_MEMORY_MIGRATION);
     int s = offsetof(struct vhost_memory, regions) +
         (dev->mem->nregions + 1) * sizeof dev->mem->regions[0];
     void *ram;
 
     dev->mem = g_realloc(dev->mem, s);
 
-    if (log_dirty) {
-        add = false;
-    }
-
     assert(size);
 
     /* Optimize no-change case. At least cirrus_vga does this a lot at this time. */
@@ -611,8 +606,19 @@ static void vhost_set_memory(MemoryListener *listener,
 
 static bool vhost_section(MemoryRegionSection *section)
 {
-    return memory_region_is_ram(section->mr) &&
+    bool result;
+    bool log_dirty = memory_region_get_dirty_log_mask(section->mr) &
+                     ~(1 << DIRTY_MEMORY_MIGRATION);
+    result = memory_region_is_ram(section->mr) &&
         !memory_region_is_rom(section->mr);
+
+    /* Vhost doesn't handle any block which is doing dirty-tracking other
+     * than migration; this typically fires on VGA areas.
+     */
+    result &= !log_dirty;
+
+    trace_vhost_section(section->mr->name, result);
+    return result;
 }
 
 static void vhost_begin(MemoryListener *listener)
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v4 2/6] vhost: Simplify ring verification checks
  2017-12-13 18:08 [Qemu-devel] [PATCH v4 0/6] Rework vhost memory region updates Dr. David Alan Gilbert (git)
  2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 1/6] vhost: Move log_dirty check Dr. David Alan Gilbert (git)
@ 2017-12-13 18:08 ` Dr. David Alan Gilbert (git)
  2017-12-14 14:07   ` Igor Mammedov
  2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 3/6] vhost: Add temporary memory structure Dr. David Alan Gilbert (git)
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 24+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2017-12-13 18:08 UTC (permalink / raw)
  To: qemu-devel, pbonzini, imammedo; +Cc: maxime.coquelin, mst, groug

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

vhost_verify_ring_mappings() were used to verify that
rings are still accessible and related memory hasn't
been moved after flatview is updated.

It was doing checks by mapping ring's GPA+len and
checking that HVA hadn't changed with new memory map.
To avoid maybe expensive mapping call, we were
identifying address range that changed and were doing
mapping only if ring was in changed range.

However it's not neccessary to perform ring's GPA
mapping as we already have its current HVA and all
we need is to verify that ring's GPA translates to
the same HVA in updated flatview.

This will allow the following patches to simplify the range
comparison that was previously needed to avoid expensive
verify_ring_mapping calls.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
with modifications by:
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 hw/virtio/vhost.c | 74 ++++++++++++++++++++++++++++++-------------------------
 1 file changed, 41 insertions(+), 33 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index e923219e63..c7ce7baf9b 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -450,35 +450,37 @@ static void vhost_memory_unmap(struct vhost_dev *dev, void *buffer,
     }
 }
 
-static int vhost_verify_ring_part_mapping(struct vhost_dev *dev,
-                                          void *part,
-                                          uint64_t part_addr,
-                                          uint64_t part_size,
-                                          uint64_t start_addr,
-                                          uint64_t size)
+static int vhost_verify_ring_part_mapping(void *ring_hva,
+                                          uint64_t ring_gpa,
+                                          uint64_t ring_size,
+                                          void *reg_hva,
+                                          uint64_t reg_gpa,
+                                          uint64_t reg_size)
 {
-    hwaddr l;
-    void *p;
-    int r = 0;
+    uint64_t hva_ring_offset;
+    uint64_t ring_last = range_get_last(ring_gpa, ring_size);
+    uint64_t reg_last = range_get_last(reg_gpa, reg_size);
 
-    if (!ranges_overlap(start_addr, size, part_addr, part_size)) {
+    if (ring_last < reg_gpa || ring_gpa > reg_last) {
         return 0;
     }
-    l = part_size;
-    p = vhost_memory_map(dev, part_addr, &l, 1);
-    if (!p || l != part_size) {
-        r = -ENOMEM;
+    /* check that whole ring's is mapped */
+    if (ring_last > reg_last) {
+        return -EBUSY;
     }
-    if (p != part) {
-        r = -EBUSY;
+    /* check that ring's MemoryRegion wasn't replaced */
+    hva_ring_offset = ring_gpa - reg_gpa;
+    if (ring_hva != reg_hva + hva_ring_offset) {
+        return -ENOMEM;
     }
-    vhost_memory_unmap(dev, p, l, 0, 0);
-    return r;
+
+    return 0;
 }
 
 static int vhost_verify_ring_mappings(struct vhost_dev *dev,
-                                      uint64_t start_addr,
-                                      uint64_t size)
+                                      void *reg_hva,
+                                      uint64_t reg_gpa,
+                                      uint64_t reg_size)
 {
     int i, j;
     int r = 0;
@@ -492,22 +494,25 @@ static int vhost_verify_ring_mappings(struct vhost_dev *dev,
         struct vhost_virtqueue *vq = dev->vqs + i;
 
         j = 0;
-        r = vhost_verify_ring_part_mapping(dev, vq->desc, vq->desc_phys,
-                                           vq->desc_size, start_addr, size);
+        r = vhost_verify_ring_part_mapping(
+                vq->desc, vq->desc_phys, vq->desc_size,
+                reg_hva, reg_gpa, reg_size);
         if (r) {
             break;
         }
 
         j++;
-        r = vhost_verify_ring_part_mapping(dev, vq->avail, vq->avail_phys,
-                                           vq->avail_size, start_addr, size);
+        r = vhost_verify_ring_part_mapping(
+                vq->desc, vq->desc_phys, vq->desc_size,
+                reg_hva, reg_gpa, reg_size);
         if (r) {
             break;
         }
 
         j++;
-        r = vhost_verify_ring_part_mapping(dev, vq->used, vq->used_phys,
-                                           vq->used_size, start_addr, size);
+        r = vhost_verify_ring_part_mapping(
+                vq->desc, vq->desc_phys, vq->desc_size,
+                reg_hva, reg_gpa, reg_size);
         if (r) {
             break;
         }
@@ -633,8 +638,6 @@ static void vhost_commit(MemoryListener *listener)
 {
     struct vhost_dev *dev = container_of(listener, struct vhost_dev,
                                          memory_listener);
-    hwaddr start_addr = 0;
-    ram_addr_t size = 0;
     uint64_t log_size;
     int r;
 
@@ -649,11 +652,16 @@ static void vhost_commit(MemoryListener *listener)
     }
 
     if (dev->started) {
-        start_addr = dev->mem_changed_start_addr;
-        size = dev->mem_changed_end_addr - dev->mem_changed_start_addr + 1;
-
-        r = vhost_verify_ring_mappings(dev, start_addr, size);
-        assert(r >= 0);
+        int i;
+        for (i = 0; i < dev->mem->nregions; i++) {
+            if (vhost_verify_ring_mappings(dev,
+                               (void *)dev->mem->regions[i].userspace_addr,
+                               dev->mem->regions[i].guest_phys_addr,
+                               dev->mem->regions[i].memory_size)) {
+                error_report("Verify ring failure on region %d", i);
+                abort();
+            }
+        }
     }
 
     if (!dev->log_enabled) {
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v4 3/6] vhost: Add temporary memory structure
  2017-12-13 18:08 [Qemu-devel] [PATCH v4 0/6] Rework vhost memory region updates Dr. David Alan Gilbert (git)
  2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 1/6] vhost: Move log_dirty check Dr. David Alan Gilbert (git)
  2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 2/6] vhost: Simplify ring verification checks Dr. David Alan Gilbert (git)
@ 2017-12-13 18:08 ` Dr. David Alan Gilbert (git)
  2017-12-14 15:15   ` Igor Mammedov
  2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list Dr. David Alan Gilbert (git)
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 24+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2017-12-13 18:08 UTC (permalink / raw)
  To: qemu-devel, pbonzini, imammedo; +Cc: maxime.coquelin, mst, groug

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Add a 2nd 'vhost_memory' structure that will be used to build
the new version as the listener iterates over the address space.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 hw/virtio/vhost.c         | 3 +++
 include/hw/virtio/vhost.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index c7ce7baf9b..4523f45587 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -632,6 +632,8 @@ static void vhost_begin(MemoryListener *listener)
                                          memory_listener);
     dev->mem_changed_end_addr = 0;
     dev->mem_changed_start_addr = -1;
+
+    dev->tmp_mem = g_malloc0(offsetof(struct vhost_memory, regions));
 }
 
 static void vhost_commit(MemoryListener *listener)
@@ -641,6 +643,7 @@ static void vhost_commit(MemoryListener *listener)
     uint64_t log_size;
     int r;
 
+    g_free(dev->tmp_mem);
     if (!dev->memory_changed) {
         return;
     }
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 467dc7794b..41f9e569be 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -70,6 +70,7 @@ struct vhost_dev {
     bool memory_changed;
     hwaddr mem_changed_start_addr;
     hwaddr mem_changed_end_addr;
+    struct vhost_memory *tmp_mem;
     const VhostOps *vhost_ops;
     void *opaque;
     struct vhost_log *log;
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list
  2017-12-13 18:08 [Qemu-devel] [PATCH v4 0/6] Rework vhost memory region updates Dr. David Alan Gilbert (git)
                   ` (2 preceding siblings ...)
  2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 3/6] vhost: Add temporary memory structure Dr. David Alan Gilbert (git)
@ 2017-12-13 18:08 ` Dr. David Alan Gilbert (git)
  2017-12-13 21:31   ` Paolo Bonzini
  2017-12-14 15:27   ` Igor Mammedov
  2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 5/6] vhost: compare and flip in new memory region list Dr. David Alan Gilbert (git)
  2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 6/6] vhost: Clean out old vhost_set_memory and friends Dr. David Alan Gilbert (git)
  5 siblings, 2 replies; 24+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2017-12-13 18:08 UTC (permalink / raw)
  To: qemu-devel, pbonzini, imammedo; +Cc: maxime.coquelin, mst, groug

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

As regions are reported by the listener to the _nop and _add
methods, add them to our new temporary list.
Regions that abut can be merged if the backend allows.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 hw/virtio/trace-events |  2 ++
 hw/virtio/vhost.c      | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 72 insertions(+)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 4a493bcd46..7de0663652 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -1,6 +1,8 @@
 # See docs/devel/tracing.txt for syntax documentation.
 
 # hw/virtio/vhost.c
+vhost_region_add_tmp(const char *name, uint64_t gpa, uint64_t size, uint64_t host) "%s: 0x%"PRIx64"+0x%"PRIx64" @ 0x%"PRIx64
+vhost_region_add_tmp_abut(const char *name, uint64_t new_size) "%s: 0x%"PRIx64
 vhost_section(const char *name, int r) "%s:%d"
 
 # hw/virtio/virtio.c
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 4523f45587..2084888aa7 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -694,6 +694,67 @@ static void vhost_commit(MemoryListener *listener)
     dev->memory_changed = false;
 }
 
+/* Adds the section data to the tmp_mem structure.
+ * It relies on the listener calling us in memory address order
+ * and for each region (via the _add and _nop methods).
+ */
+static void vhost_region_add_tmp(struct vhost_dev *dev,
+                                 MemoryRegionSection *section)
+{
+    bool need_add = true;
+    uint64_t mrs_size = int128_get64(section->size);
+    uint64_t mrs_gpa = section->offset_within_address_space;
+    uintptr_t mrs_host = (uintptr_t)memory_region_get_ram_ptr(section->mr) +
+                         section->offset_within_region;
+
+    trace_vhost_region_add_tmp(section->mr->name, mrs_gpa, mrs_size, mrs_host);
+
+    if (dev->tmp_mem->nregions) {
+        /* Since we already have at least one region, lets see if
+         * this extends it; since we're scanning in order, we only
+         * have to look at the last one, and the FlatView that calls
+         * us shouldn't have overlaps.
+         */
+        struct vhost_memory_region *prev_vmr = dev->tmp_mem->regions +
+                                               (dev->tmp_mem->nregions - 1);
+        uint64_t prev_gpa_start = prev_vmr->guest_phys_addr;
+        uint64_t prev_gpa_end   = range_get_last(prev_gpa_start,
+                                                 prev_vmr->memory_size);
+        uint64_t prev_host_start = prev_vmr->userspace_addr;
+        uint64_t prev_host_end   = range_get_last(prev_host_start,
+                                                  prev_vmr->memory_size);
+
+        if (prev_gpa_end + 1 == mrs_gpa &&
+            prev_host_end + 1 == mrs_host &&
+            (!dev->vhost_ops->vhost_backend_can_merge ||
+                dev->vhost_ops->vhost_backend_can_merge(dev,
+                    mrs_host, mrs_size,
+                    prev_host_start, prev_vmr->memory_size))) {
+            /* The two regions abut */
+            need_add = false;
+            mrs_size = mrs_size + prev_vmr->memory_size;
+            prev_vmr->memory_size = mrs_size;
+            trace_vhost_region_add_tmp_abut(section->mr->name, mrs_size);
+        }
+    }
+
+    if (need_add) {
+        uint32_t nregions = dev->tmp_mem->nregions;
+        size_t s = offsetof(struct vhost_memory, regions) +
+                            (nregions + 1) * sizeof dev->tmp_mem->regions[0];
+        dev->tmp_mem = g_realloc(dev->tmp_mem, s);
+        dev->tmp_mem->nregions++;
+        struct vhost_memory_region *cur_vmr = &dev->tmp_mem->regions[nregions];
+
+        cur_vmr->guest_phys_addr = mrs_gpa;
+        cur_vmr->memory_size     = mrs_size;
+        cur_vmr->userspace_addr  = mrs_host;
+        cur_vmr->flags_padding = 0;
+    }
+
+
+}
+
 static void vhost_region_add(MemoryListener *listener,
                              MemoryRegionSection *section)
 {
@@ -703,6 +764,7 @@ static void vhost_region_add(MemoryListener *listener,
     if (!vhost_section(section)) {
         return;
     }
+    vhost_region_add_tmp(dev, section);
 
     ++dev->n_mem_sections;
     dev->mem_sections = g_renew(MemoryRegionSection, dev->mem_sections,
@@ -800,9 +862,17 @@ static void vhost_iommu_region_del(MemoryListener *listener,
     }
 }
 
+/* Called on regions that have not changed */
 static void vhost_region_nop(MemoryListener *listener,
                              MemoryRegionSection *section)
 {
+    struct vhost_dev *dev = container_of(listener, struct vhost_dev,
+                                         memory_listener);
+    if (!vhost_section(section)) {
+        return;
+    }
+
+    vhost_region_add_tmp(dev, section);
 }
 
 static int vhost_virtqueue_set_addr(struct vhost_dev *dev,
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v4 5/6] vhost: compare and flip in new memory region list
  2017-12-13 18:08 [Qemu-devel] [PATCH v4 0/6] Rework vhost memory region updates Dr. David Alan Gilbert (git)
                   ` (3 preceding siblings ...)
  2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list Dr. David Alan Gilbert (git)
@ 2017-12-13 18:08 ` Dr. David Alan Gilbert (git)
  2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 6/6] vhost: Clean out old vhost_set_memory and friends Dr. David Alan Gilbert (git)
  5 siblings, 0 replies; 24+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2017-12-13 18:08 UTC (permalink / raw)
  To: qemu-devel, pbonzini, imammedo; +Cc: maxime.coquelin, mst, groug

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Compare the temporary region list with the current list
to generate a change flag (replacing the old mem_changed_start_addr)
and flip the temporary list into becoming the new current list.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 hw/virtio/trace-events |  1 +
 hw/virtio/vhost.c      | 18 ++++++++++++++----
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 7de0663652..3dd6be797c 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -1,6 +1,7 @@
 # See docs/devel/tracing.txt for syntax documentation.
 
 # hw/virtio/vhost.c
+vhost_commit(bool started, bool changed) "Started: %d Changed: %d"
 vhost_region_add_tmp(const char *name, uint64_t gpa, uint64_t size, uint64_t host) "%s: 0x%"PRIx64"+0x%"PRIx64" @ 0x%"PRIx64
 vhost_region_add_tmp_abut(const char *name, uint64_t new_size) "%s: 0x%"PRIx64
 vhost_section(const char *name, int r) "%s:%d"
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 2084888aa7..1b276b210f 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -642,15 +642,25 @@ static void vhost_commit(MemoryListener *listener)
                                          memory_listener);
     uint64_t log_size;
     int r;
+    bool changed = false;
 
-    g_free(dev->tmp_mem);
-    if (!dev->memory_changed) {
-        return;
+    if (dev->mem->nregions != dev->tmp_mem->nregions) {
+        changed = true;
+    } else {
+        /* Same size, lets check the contents */
+        size_t region_size = dev->mem->nregions * sizeof dev->mem->regions[0];
+        changed = memcmp(dev->mem->regions, dev->tmp_mem->regions,
+                         region_size) != 0;
     }
+    g_free(dev->mem);
+    dev->mem = dev->tmp_mem;
+    dev->tmp_mem = NULL;
+
+    trace_vhost_commit(dev->started, changed);
     if (!dev->started) {
         return;
     }
-    if (dev->mem_changed_start_addr > dev->mem_changed_end_addr) {
+    if (!changed) {
         return;
     }
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v4 6/6] vhost: Clean out old vhost_set_memory and friends
  2017-12-13 18:08 [Qemu-devel] [PATCH v4 0/6] Rework vhost memory region updates Dr. David Alan Gilbert (git)
                   ` (4 preceding siblings ...)
  2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 5/6] vhost: compare and flip in new memory region list Dr. David Alan Gilbert (git)
@ 2017-12-13 18:08 ` Dr. David Alan Gilbert (git)
  5 siblings, 0 replies; 24+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2017-12-13 18:08 UTC (permalink / raw)
  To: qemu-devel, pbonzini, imammedo; +Cc: maxime.coquelin, mst, groug

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Remove the old update mechanism, vhost_set_memory, and the functions
and flags it used.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 hw/virtio/vhost.c         | 245 ----------------------------------------------
 include/hw/virtio/vhost.h |   3 -
 2 files changed, 248 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 1b276b210f..f6932dc107 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -156,160 +156,6 @@ static void vhost_log_sync_range(struct vhost_dev *dev,
     }
 }
 
-/* Assign/unassign. Keep an unsorted array of non-overlapping
- * memory regions in dev->mem. */
-static void vhost_dev_unassign_memory(struct vhost_dev *dev,
-                                      uint64_t start_addr,
-                                      uint64_t size)
-{
-    int from, to, n = dev->mem->nregions;
-    /* Track overlapping/split regions for sanity checking. */
-    int overlap_start = 0, overlap_end = 0, overlap_middle = 0, split = 0;
-
-    for (from = 0, to = 0; from < n; ++from, ++to) {
-        struct vhost_memory_region *reg = dev->mem->regions + to;
-        uint64_t reglast;
-        uint64_t memlast;
-        uint64_t change;
-
-        /* clone old region */
-        if (to != from) {
-            memcpy(reg, dev->mem->regions + from, sizeof *reg);
-        }
-
-        /* No overlap is simple */
-        if (!ranges_overlap(reg->guest_phys_addr, reg->memory_size,
-                            start_addr, size)) {
-            continue;
-        }
-
-        /* Split only happens if supplied region
-         * is in the middle of an existing one. Thus it can not
-         * overlap with any other existing region. */
-        assert(!split);
-
-        reglast = range_get_last(reg->guest_phys_addr, reg->memory_size);
-        memlast = range_get_last(start_addr, size);
-
-        /* Remove whole region */
-        if (start_addr <= reg->guest_phys_addr && memlast >= reglast) {
-            --dev->mem->nregions;
-            --to;
-            ++overlap_middle;
-            continue;
-        }
-
-        /* Shrink region */
-        if (memlast >= reglast) {
-            reg->memory_size = start_addr - reg->guest_phys_addr;
-            assert(reg->memory_size);
-            assert(!overlap_end);
-            ++overlap_end;
-            continue;
-        }
-
-        /* Shift region */
-        if (start_addr <= reg->guest_phys_addr) {
-            change = memlast + 1 - reg->guest_phys_addr;
-            reg->memory_size -= change;
-            reg->guest_phys_addr += change;
-            reg->userspace_addr += change;
-            assert(reg->memory_size);
-            assert(!overlap_start);
-            ++overlap_start;
-            continue;
-        }
-
-        /* This only happens if supplied region
-         * is in the middle of an existing one. Thus it can not
-         * overlap with any other existing region. */
-        assert(!overlap_start);
-        assert(!overlap_end);
-        assert(!overlap_middle);
-        /* Split region: shrink first part, shift second part. */
-        memcpy(dev->mem->regions + n, reg, sizeof *reg);
-        reg->memory_size = start_addr - reg->guest_phys_addr;
-        assert(reg->memory_size);
-        change = memlast + 1 - reg->guest_phys_addr;
-        reg = dev->mem->regions + n;
-        reg->memory_size -= change;
-        assert(reg->memory_size);
-        reg->guest_phys_addr += change;
-        reg->userspace_addr += change;
-        /* Never add more than 1 region */
-        assert(dev->mem->nregions == n);
-        ++dev->mem->nregions;
-        ++split;
-    }
-}
-
-/* Called after unassign, so no regions overlap the given range. */
-static void vhost_dev_assign_memory(struct vhost_dev *dev,
-                                    uint64_t start_addr,
-                                    uint64_t size,
-                                    uint64_t uaddr)
-{
-    int from, to;
-    struct vhost_memory_region *merged = NULL;
-    for (from = 0, to = 0; from < dev->mem->nregions; ++from, ++to) {
-        struct vhost_memory_region *reg = dev->mem->regions + to;
-        uint64_t prlast, urlast;
-        uint64_t pmlast, umlast;
-        uint64_t s, e, u;
-
-        /* clone old region */
-        if (to != from) {
-            memcpy(reg, dev->mem->regions + from, sizeof *reg);
-        }
-        prlast = range_get_last(reg->guest_phys_addr, reg->memory_size);
-        pmlast = range_get_last(start_addr, size);
-        urlast = range_get_last(reg->userspace_addr, reg->memory_size);
-        umlast = range_get_last(uaddr, size);
-
-        /* check for overlapping regions: should never happen. */
-        assert(prlast < start_addr || pmlast < reg->guest_phys_addr);
-        /* Not an adjacent or overlapping region - do not merge. */
-        if ((prlast + 1 != start_addr || urlast + 1 != uaddr) &&
-            (pmlast + 1 != reg->guest_phys_addr ||
-             umlast + 1 != reg->userspace_addr)) {
-            continue;
-        }
-
-        if (dev->vhost_ops->vhost_backend_can_merge &&
-            !dev->vhost_ops->vhost_backend_can_merge(dev, uaddr, size,
-                                                     reg->userspace_addr,
-                                                     reg->memory_size)) {
-            continue;
-        }
-
-        if (merged) {
-            --to;
-            assert(to >= 0);
-        } else {
-            merged = reg;
-        }
-        u = MIN(uaddr, reg->userspace_addr);
-        s = MIN(start_addr, reg->guest_phys_addr);
-        e = MAX(pmlast, prlast);
-        uaddr = merged->userspace_addr = u;
-        start_addr = merged->guest_phys_addr = s;
-        size = merged->memory_size = e - s + 1;
-        assert(merged->memory_size);
-    }
-
-    if (!merged) {
-        struct vhost_memory_region *reg = dev->mem->regions + to;
-        memset(reg, 0, sizeof *reg);
-        reg->memory_size = size;
-        assert(reg->memory_size);
-        reg->guest_phys_addr = start_addr;
-        reg->userspace_addr = uaddr;
-        ++to;
-    }
-    assert(to <= dev->mem->nregions + 1);
-    dev->mem->nregions = to;
-}
-
 static uint64_t vhost_get_log_size(struct vhost_dev *dev)
 {
     uint64_t log_size = 0;
@@ -526,89 +372,6 @@ static int vhost_verify_ring_mappings(struct vhost_dev *dev,
     return r;
 }
 
-static struct vhost_memory_region *vhost_dev_find_reg(struct vhost_dev *dev,
-						      uint64_t start_addr,
-						      uint64_t size)
-{
-    int i, n = dev->mem->nregions;
-    for (i = 0; i < n; ++i) {
-        struct vhost_memory_region *reg = dev->mem->regions + i;
-        if (ranges_overlap(reg->guest_phys_addr, reg->memory_size,
-                           start_addr, size)) {
-            return reg;
-        }
-    }
-    return NULL;
-}
-
-static bool vhost_dev_cmp_memory(struct vhost_dev *dev,
-                                 uint64_t start_addr,
-                                 uint64_t size,
-                                 uint64_t uaddr)
-{
-    struct vhost_memory_region *reg = vhost_dev_find_reg(dev, start_addr, size);
-    uint64_t reglast;
-    uint64_t memlast;
-
-    if (!reg) {
-        return true;
-    }
-
-    reglast = range_get_last(reg->guest_phys_addr, reg->memory_size);
-    memlast = range_get_last(start_addr, size);
-
-    /* Need to extend region? */
-    if (start_addr < reg->guest_phys_addr || memlast > reglast) {
-        return true;
-    }
-    /* userspace_addr changed? */
-    return uaddr != reg->userspace_addr + start_addr - reg->guest_phys_addr;
-}
-
-static void vhost_set_memory(MemoryListener *listener,
-                             MemoryRegionSection *section,
-                             bool add)
-{
-    struct vhost_dev *dev = container_of(listener, struct vhost_dev,
-                                         memory_listener);
-    hwaddr start_addr = section->offset_within_address_space;
-    ram_addr_t size = int128_get64(section->size);
-    int s = offsetof(struct vhost_memory, regions) +
-        (dev->mem->nregions + 1) * sizeof dev->mem->regions[0];
-    void *ram;
-
-    dev->mem = g_realloc(dev->mem, s);
-
-    assert(size);
-
-    /* Optimize no-change case. At least cirrus_vga does this a lot at this time. */
-    ram = memory_region_get_ram_ptr(section->mr) + section->offset_within_region;
-    if (add) {
-        if (!vhost_dev_cmp_memory(dev, start_addr, size, (uintptr_t)ram)) {
-            /* Region exists with same address. Nothing to do. */
-            return;
-        }
-    } else {
-        if (!vhost_dev_find_reg(dev, start_addr, size)) {
-            /* Removing region that we don't access. Nothing to do. */
-            return;
-        }
-    }
-
-    vhost_dev_unassign_memory(dev, start_addr, size);
-    if (add) {
-        /* Add given mapping, merging adjacent regions if any */
-        vhost_dev_assign_memory(dev, start_addr, size, (uintptr_t)ram);
-    } else {
-        /* Remove old mapping for this memory, if any. */
-        vhost_dev_unassign_memory(dev, start_addr, size);
-    }
-    dev->mem_changed_start_addr = MIN(dev->mem_changed_start_addr, start_addr);
-    dev->mem_changed_end_addr = MAX(dev->mem_changed_end_addr, start_addr + size - 1);
-    dev->memory_changed = true;
-    used_memslots = dev->mem->nregions;
-}
-
 static bool vhost_section(MemoryRegionSection *section)
 {
     bool result;
@@ -630,9 +393,6 @@ static void vhost_begin(MemoryListener *listener)
 {
     struct vhost_dev *dev = container_of(listener, struct vhost_dev,
                                          memory_listener);
-    dev->mem_changed_end_addr = 0;
-    dev->mem_changed_start_addr = -1;
-
     dev->tmp_mem = g_malloc0(offsetof(struct vhost_memory, regions));
 }
 
@@ -682,7 +442,6 @@ static void vhost_commit(MemoryListener *listener)
         if (r < 0) {
             VHOST_OPS_DEBUG("vhost_set_mem_table failed");
         }
-        dev->memory_changed = false;
         return;
     }
     log_size = vhost_get_log_size(dev);
@@ -701,7 +460,6 @@ static void vhost_commit(MemoryListener *listener)
     if (dev->log_size > log_size + VHOST_LOG_BUFFER) {
         vhost_dev_log_resize(dev, log_size);
     }
-    dev->memory_changed = false;
 }
 
 /* Adds the section data to the tmp_mem structure.
@@ -781,7 +539,6 @@ static void vhost_region_add(MemoryListener *listener,
                                 dev->n_mem_sections);
     dev->mem_sections[dev->n_mem_sections - 1] = *section;
     memory_region_ref(section->mr);
-    vhost_set_memory(listener, section, true);
 }
 
 static void vhost_region_del(MemoryListener *listener,
@@ -795,7 +552,6 @@ static void vhost_region_del(MemoryListener *listener,
         return;
     }
 
-    vhost_set_memory(listener, section, false);
     memory_region_unref(section->mr);
     for (i = 0; i < dev->n_mem_sections; ++i) {
         if (dev->mem_sections[i].offset_within_address_space
@@ -1433,7 +1189,6 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
     hdev->log_size = 0;
     hdev->log_enabled = false;
     hdev->started = false;
-    hdev->memory_changed = false;
     memory_listener_register(&hdev->memory_listener, &address_space_memory);
     QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
     return 0;
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 41f9e569be..7e58fa5bc5 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -67,9 +67,6 @@ struct vhost_dev {
     bool log_enabled;
     uint64_t log_size;
     Error *migration_blocker;
-    bool memory_changed;
-    hwaddr mem_changed_start_addr;
-    hwaddr mem_changed_end_addr;
     struct vhost_memory *tmp_mem;
     const VhostOps *vhost_ops;
     void *opaque;
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list
  2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list Dr. David Alan Gilbert (git)
@ 2017-12-13 21:31   ` Paolo Bonzini
  2017-12-14 15:53     ` Dr. David Alan Gilbert
  2017-12-14 15:27   ` Igor Mammedov
  1 sibling, 1 reply; 24+ messages in thread
From: Paolo Bonzini @ 2017-12-13 21:31 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git), qemu-devel, imammedo
  Cc: maxime.coquelin, groug, mst

On 13/12/2017 19:08, Dr. David Alan Gilbert (git) wrote:
> +    if (dev->tmp_mem->nregions) {
> +        /* Since we already have at least one region, lets see if
> +         * this extends it; since we're scanning in order, we only
> +         * have to look at the last one, and the FlatView that calls
> +         * us shouldn't have overlaps.
> +         */
> +        struct vhost_memory_region *prev_vmr = dev->tmp_mem->regions +
> +                                               (dev->tmp_mem->nregions - 1);
> +        uint64_t prev_gpa_start = prev_vmr->guest_phys_addr;
> +        uint64_t prev_gpa_end   = range_get_last(prev_gpa_start,
> +                                                 prev_vmr->memory_size);
> +        uint64_t prev_host_start = prev_vmr->userspace_addr;
> +        uint64_t prev_host_end   = range_get_last(prev_host_start,
> +                                                  prev_vmr->memory_size);
> +
> +        if (prev_gpa_end + 1 == mrs_gpa &&
> +            prev_host_end + 1 == mrs_host &&
> +            (!dev->vhost_ops->vhost_backend_can_merge ||
> +                dev->vhost_ops->vhost_backend_can_merge(dev,
> +                    mrs_host, mrs_size,
> +                    prev_host_start, prev_vmr->memory_size))) {
> +            /* The two regions abut */
> +            need_add = false;
> +            mrs_size = mrs_size + prev_vmr->memory_size;
> +            prev_vmr->memory_size = mrs_size;
> +            trace_vhost_region_add_tmp_abut(section->mr->name, mrs_size);
> +        }
> +    }

Interesting, in which cases does this actually trigger?

Paolo

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/6] vhost: Simplify ring verification checks
  2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 2/6] vhost: Simplify ring verification checks Dr. David Alan Gilbert (git)
@ 2017-12-14 14:07   ` Igor Mammedov
  2017-12-15 12:24     ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 24+ messages in thread
From: Igor Mammedov @ 2017-12-14 14:07 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git)
  Cc: qemu-devel, pbonzini, maxime.coquelin, mst, groug

On Wed, 13 Dec 2017 18:08:03 +0000
"Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:

> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> vhost_verify_ring_mappings() were used to verify that
> rings are still accessible and related memory hasn't
> been moved after flatview is updated.
> 
> It was doing checks by mapping ring's GPA+len and
> checking that HVA hadn't changed with new memory map.
> To avoid maybe expensive mapping call, we were
> identifying address range that changed and were doing
> mapping only if ring was in changed range.
> 
> However it's not neccessary to perform ring's GPA
> mapping as we already have its current HVA and all
> we need is to verify that ring's GPA translates to
> the same HVA in updated flatview.
> 
> This will allow the following patches to simplify the range
> comparison that was previously needed to avoid expensive
> verify_ring_mapping calls.
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> with modifications by:
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  hw/virtio/vhost.c | 74 ++++++++++++++++++++++++++++++-------------------------
>  1 file changed, 41 insertions(+), 33 deletions(-)
> 
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index e923219e63..c7ce7baf9b 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -450,35 +450,37 @@ static void vhost_memory_unmap(struct vhost_dev *dev, void *buffer,
>      }
>  }
>  
> -static int vhost_verify_ring_part_mapping(struct vhost_dev *dev,
> -                                          void *part,
> -                                          uint64_t part_addr,
> -                                          uint64_t part_size,
> -                                          uint64_t start_addr,
> -                                          uint64_t size)
> +static int vhost_verify_ring_part_mapping(void *ring_hva,
> +                                          uint64_t ring_gpa,
> +                                          uint64_t ring_size,
> +                                          void *reg_hva,
> +                                          uint64_t reg_gpa,
> +                                          uint64_t reg_size)
>  {
> -    hwaddr l;
> -    void *p;
> -    int r = 0;
> +    uint64_t hva_ring_offset;
> +    uint64_t ring_last = range_get_last(ring_gpa, ring_size);
> +    uint64_t reg_last = range_get_last(reg_gpa, reg_size);
>  
> -    if (!ranges_overlap(start_addr, size, part_addr, part_size)) {
> +    if (ring_last < reg_gpa || ring_gpa > reg_last) {
>          return 0;
>      }
> -    l = part_size;
> -    p = vhost_memory_map(dev, part_addr, &l, 1);
> -    if (!p || l != part_size) {
> -        r = -ENOMEM;
> +    /* check that whole ring's is mapped */
> +    if (ring_last > reg_last) {
> +        return -EBUSY;
ENOMEM

>      }
> -    if (p != part) {
> -        r = -EBUSY;
> +    /* check that ring's MemoryRegion wasn't replaced */
> +    hva_ring_offset = ring_gpa - reg_gpa;
> +    if (ring_hva != reg_hva + hva_ring_offset) {
> +        return -ENOMEM;
EBUSY

>      }
> -    vhost_memory_unmap(dev, p, l, 0, 0);
> -    return r;
> +
> +    return 0;
>  }
>  
>  static int vhost_verify_ring_mappings(struct vhost_dev *dev,
> -                                      uint64_t start_addr,
> -                                      uint64_t size)
> +                                      void *reg_hva,
> +                                      uint64_t reg_gpa,
> +                                      uint64_t reg_size)
>  {
>      int i, j;
>      int r = 0;
> @@ -492,22 +494,25 @@ static int vhost_verify_ring_mappings(struct vhost_dev *dev,
>          struct vhost_virtqueue *vq = dev->vqs + i;
>  
>          j = 0;
> -        r = vhost_verify_ring_part_mapping(dev, vq->desc, vq->desc_phys,
> -                                           vq->desc_size, start_addr, size);
> +        r = vhost_verify_ring_part_mapping(
> +                vq->desc, vq->desc_phys, vq->desc_size,
> +                reg_hva, reg_gpa, reg_size);
>          if (r) {
>              break;
>          }
>  
>          j++;
> -        r = vhost_verify_ring_part_mapping(dev, vq->avail, vq->avail_phys,
> -                                           vq->avail_size, start_addr, size);
> +        r = vhost_verify_ring_part_mapping(
> +                vq->desc, vq->desc_phys, vq->desc_size,
> +                reg_hva, reg_gpa, reg_size);
>          if (r) {
>              break;
>          }
>  
>          j++;
> -        r = vhost_verify_ring_part_mapping(dev, vq->used, vq->used_phys,
> -                                           vq->used_size, start_addr, size);
> +        r = vhost_verify_ring_part_mapping(
> +                vq->desc, vq->desc_phys, vq->desc_size,
> +                reg_hva, reg_gpa, reg_size);
>          if (r) {
>              break;
>          }
> @@ -633,8 +638,6 @@ static void vhost_commit(MemoryListener *listener)
>  {
>      struct vhost_dev *dev = container_of(listener, struct vhost_dev,
>                                           memory_listener);
> -    hwaddr start_addr = 0;
> -    ram_addr_t size = 0;
>      uint64_t log_size;
>      int r;
>  
> @@ -649,11 +652,16 @@ static void vhost_commit(MemoryListener *listener)
>      }
>  
>      if (dev->started) {
^^^ not needed as we bail out earlier if it's not true

> -        start_addr = dev->mem_changed_start_addr;
> -        size = dev->mem_changed_end_addr - dev->mem_changed_start_addr + 1;
> -
> -        r = vhost_verify_ring_mappings(dev, start_addr, size);
> -        assert(r >= 0);
> +        int i;
> +        for (i = 0; i < dev->mem->nregions; i++) {
> +            if (vhost_verify_ring_mappings(dev,
> +                               (void *)dev->mem->regions[i].userspace_addr,
> +                               dev->mem->regions[i].guest_phys_addr,
> +                               dev->mem->regions[i].memory_size)) {
> +                error_report("Verify ring failure on region %d", i);
> +                abort();
> +            }
> +        }
>      }
>  
>      if (!dev->log_enabled) {

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/6] vhost: Move log_dirty check
  2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 1/6] vhost: Move log_dirty check Dr. David Alan Gilbert (git)
@ 2017-12-14 14:30   ` Igor Mammedov
  2017-12-14 15:20     ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 24+ messages in thread
From: Igor Mammedov @ 2017-12-14 14:30 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git)
  Cc: qemu-devel, pbonzini, maxime.coquelin, mst, groug

On Wed, 13 Dec 2017 18:08:02 +0000
"Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:

> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> Move the log_dirty check into vhost_section.
> 
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  hw/virtio/trace-events |  3 +++
>  hw/virtio/vhost.c      | 20 +++++++++++++-------
>  2 files changed, 16 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> index 775461ae98..4a493bcd46 100644
> --- a/hw/virtio/trace-events
> +++ b/hw/virtio/trace-events
> @@ -1,5 +1,8 @@
>  # See docs/devel/tracing.txt for syntax documentation.
>  
> +# hw/virtio/vhost.c
> +vhost_section(const char *name, int r) "%s:%d"
> +
>  # hw/virtio/virtio.c
>  virtqueue_alloc_element(void *elem, size_t sz, unsigned in_num, unsigned out_num) "elem %p size %zd in_num %u out_num %u"
>  virtqueue_fill(void *vq, const void *elem, unsigned int len, unsigned int idx) "vq %p elem %p len %u idx %u"
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index e4290ce93d..e923219e63 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -27,6 +27,7 @@
>  #include "hw/virtio/virtio-access.h"
>  #include "migration/blocker.h"
>  #include "sysemu/dma.h"
> +#include "trace.h"
>  
>  /* enabled until disconnected backend stabilizes */
>  #define _VHOST_DEBUG 1
> @@ -567,18 +568,12 @@ static void vhost_set_memory(MemoryListener *listener,
>                                           memory_listener);
>      hwaddr start_addr = section->offset_within_address_space;
>      ram_addr_t size = int128_get64(section->size);
> -    bool log_dirty =
> -        memory_region_get_dirty_log_mask(section->mr) & ~(1 << DIRTY_MEMORY_MIGRATION);
>      int s = offsetof(struct vhost_memory, regions) +
>          (dev->mem->nregions + 1) * sizeof dev->mem->regions[0];
>      void *ram;
>  
>      dev->mem = g_realloc(dev->mem, s);
>  
> -    if (log_dirty) {
> -        add = false;
> -    }
> -
>      assert(size);
>  
>      /* Optimize no-change case. At least cirrus_vga does this a lot at this time. */
> @@ -611,8 +606,19 @@ static void vhost_set_memory(MemoryListener *listener,
>  
>  static bool vhost_section(MemoryRegionSection *section)
>  {
> -    return memory_region_is_ram(section->mr) &&
> +    bool result;
> +    bool log_dirty = memory_region_get_dirty_log_mask(section->mr) &
> +                     ~(1 << DIRTY_MEMORY_MIGRATION);
> +    result = memory_region_is_ram(section->mr) &&
>          !memory_region_is_rom(section->mr);
> +
> +    /* Vhost doesn't handle any block which is doing dirty-tracking other
> +     * than migration; this typically fires on VGA areas.
> +     */
> +    result &= !log_dirty;
before patch even if log_dirty, vhost_set_memory will still proceed
and may remove dirty section from memmap and set memory_changed = true

but with this patch it will just ignore such section,
I'm not sure it's right (it might be right with new approach add/nop
but then this patch should go after new code in place and old one
if gone).

> +
> +    trace_vhost_section(section->mr->name, result);
> +    return result;
>  }
>  
>  static void vhost_begin(MemoryListener *listener)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v4 3/6] vhost: Add temporary memory structure
  2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 3/6] vhost: Add temporary memory structure Dr. David Alan Gilbert (git)
@ 2017-12-14 15:15   ` Igor Mammedov
  2017-12-15 13:15     ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 24+ messages in thread
From: Igor Mammedov @ 2017-12-14 15:15 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git)
  Cc: qemu-devel, pbonzini, maxime.coquelin, mst, groug

On Wed, 13 Dec 2017 18:08:04 +0000
"Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:

> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> Add a 2nd 'vhost_memory' structure that will be used to build
> the new version as the listener iterates over the address space.
I'd suggest to add temporary 'mem_sections' instead and
create/use/free temporary 'vhost_memory' structure at commit time,
(1 less duplicated data set to keep in memory)

> 
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  hw/virtio/vhost.c         | 3 +++
>  include/hw/virtio/vhost.h | 1 +
>  2 files changed, 4 insertions(+)
> 
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index c7ce7baf9b..4523f45587 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -632,6 +632,8 @@ static void vhost_begin(MemoryListener *listener)
>                                           memory_listener);
>      dev->mem_changed_end_addr = 0;
>      dev->mem_changed_start_addr = -1;
> +
> +    dev->tmp_mem = g_malloc0(offsetof(struct vhost_memory, regions));
>  }
>  
>  static void vhost_commit(MemoryListener *listener)
> @@ -641,6 +643,7 @@ static void vhost_commit(MemoryListener *listener)
>      uint64_t log_size;
>      int r;
>  
> +    g_free(dev->tmp_mem);
>      if (!dev->memory_changed) {
>          return;
>      }
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index 467dc7794b..41f9e569be 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -70,6 +70,7 @@ struct vhost_dev {
>      bool memory_changed;
>      hwaddr mem_changed_start_addr;
>      hwaddr mem_changed_end_addr;
> +    struct vhost_memory *tmp_mem;
>      const VhostOps *vhost_ops;
>      void *opaque;
>      struct vhost_log *log;

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/6] vhost: Move log_dirty check
  2017-12-14 14:30   ` Igor Mammedov
@ 2017-12-14 15:20     ` Dr. David Alan Gilbert
  2017-12-15 10:01       ` Igor Mammedov
  2017-12-27 12:10       ` Igor Mammedov
  0 siblings, 2 replies; 24+ messages in thread
From: Dr. David Alan Gilbert @ 2017-12-14 15:20 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: qemu-devel, pbonzini, maxime.coquelin, mst, groug

* Igor Mammedov (imammedo@redhat.com) wrote:
> On Wed, 13 Dec 2017 18:08:02 +0000
> "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:
> 
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > Move the log_dirty check into vhost_section.
> > 
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> >  hw/virtio/trace-events |  3 +++
> >  hw/virtio/vhost.c      | 20 +++++++++++++-------
> >  2 files changed, 16 insertions(+), 7 deletions(-)
> > 
> > diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> > index 775461ae98..4a493bcd46 100644
> > --- a/hw/virtio/trace-events
> > +++ b/hw/virtio/trace-events
> > @@ -1,5 +1,8 @@
> >  # See docs/devel/tracing.txt for syntax documentation.
> >  
> > +# hw/virtio/vhost.c
> > +vhost_section(const char *name, int r) "%s:%d"
> > +
> >  # hw/virtio/virtio.c
> >  virtqueue_alloc_element(void *elem, size_t sz, unsigned in_num, unsigned out_num) "elem %p size %zd in_num %u out_num %u"
> >  virtqueue_fill(void *vq, const void *elem, unsigned int len, unsigned int idx) "vq %p elem %p len %u idx %u"
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index e4290ce93d..e923219e63 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> > @@ -27,6 +27,7 @@
> >  #include "hw/virtio/virtio-access.h"
> >  #include "migration/blocker.h"
> >  #include "sysemu/dma.h"
> > +#include "trace.h"
> >  
> >  /* enabled until disconnected backend stabilizes */
> >  #define _VHOST_DEBUG 1
> > @@ -567,18 +568,12 @@ static void vhost_set_memory(MemoryListener *listener,
> >                                           memory_listener);
> >      hwaddr start_addr = section->offset_within_address_space;
> >      ram_addr_t size = int128_get64(section->size);
> > -    bool log_dirty =
> > -        memory_region_get_dirty_log_mask(section->mr) & ~(1 << DIRTY_MEMORY_MIGRATION);
> >      int s = offsetof(struct vhost_memory, regions) +
> >          (dev->mem->nregions + 1) * sizeof dev->mem->regions[0];
> >      void *ram;
> >  
> >      dev->mem = g_realloc(dev->mem, s);
> >  
> > -    if (log_dirty) {
> > -        add = false;
> > -    }
> > -
> >      assert(size);
> >  
> >      /* Optimize no-change case. At least cirrus_vga does this a lot at this time. */
> > @@ -611,8 +606,19 @@ static void vhost_set_memory(MemoryListener *listener,
> >  
> >  static bool vhost_section(MemoryRegionSection *section)
> >  {
> > -    return memory_region_is_ram(section->mr) &&
> > +    bool result;
> > +    bool log_dirty = memory_region_get_dirty_log_mask(section->mr) &
> > +                     ~(1 << DIRTY_MEMORY_MIGRATION);
> > +    result = memory_region_is_ram(section->mr) &&
> >          !memory_region_is_rom(section->mr);
> > +
> > +    /* Vhost doesn't handle any block which is doing dirty-tracking other
> > +     * than migration; this typically fires on VGA areas.
> > +     */
> > +    result &= !log_dirty;
> before patch even if log_dirty, vhost_set_memory will still proceed
> and may remove dirty section from memmap and set memory_changed = true
> 
> but with this patch it will just ignore such section,
> I'm not sure it's right (it might be right with new approach add/nop
> but then this patch should go after new code in place and old one
> if gone).

I thought about that, but then I came to the conclusion that the whole
idea is that we're supposed to be ignoring these regions - so why
should they cause any change to the behaviour of vhost at all?
Thus this way seems safer.

Dave

> > +
> > +    trace_vhost_section(section->mr->name, result);
> > +    return result;
> >  }
> >  
> >  static void vhost_begin(MemoryListener *listener)
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list
  2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list Dr. David Alan Gilbert (git)
  2017-12-13 21:31   ` Paolo Bonzini
@ 2017-12-14 15:27   ` Igor Mammedov
  2017-12-14 18:43     ` Michael S. Tsirkin
  2017-12-15 13:30     ` Dr. David Alan Gilbert
  1 sibling, 2 replies; 24+ messages in thread
From: Igor Mammedov @ 2017-12-14 15:27 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git)
  Cc: qemu-devel, pbonzini, maxime.coquelin, groug, mst

On Wed, 13 Dec 2017 18:08:05 +0000
"Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:

> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> As regions are reported by the listener to the _nop and _add
> methods, add them to our new temporary list.
> Regions that abut can be merged if the backend allows.
> 
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  hw/virtio/trace-events |  2 ++
>  hw/virtio/vhost.c      | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 72 insertions(+)
> 
> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> index 4a493bcd46..7de0663652 100644
> --- a/hw/virtio/trace-events
> +++ b/hw/virtio/trace-events
> @@ -1,6 +1,8 @@
>  # See docs/devel/tracing.txt for syntax documentation.
>  
>  # hw/virtio/vhost.c
> +vhost_region_add_tmp(const char *name, uint64_t gpa, uint64_t size, uint64_t host) "%s: 0x%"PRIx64"+0x%"PRIx64" @ 0x%"PRIx64
> +vhost_region_add_tmp_abut(const char *name, uint64_t new_size) "%s: 0x%"PRIx64
>  vhost_section(const char *name, int r) "%s:%d"
>  
>  # hw/virtio/virtio.c
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 4523f45587..2084888aa7 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -694,6 +694,67 @@ static void vhost_commit(MemoryListener *listener)
>      dev->memory_changed = false;
>  }
>  
> +/* Adds the section data to the tmp_mem structure.
> + * It relies on the listener calling us in memory address order
> + * and for each region (via the _add and _nop methods).
> + */
> +static void vhost_region_add_tmp(struct vhost_dev *dev,
> +                                 MemoryRegionSection *section)
> +{
> +    bool need_add = true;
> +    uint64_t mrs_size = int128_get64(section->size);
> +    uint64_t mrs_gpa = section->offset_within_address_space;
> +    uintptr_t mrs_host = (uintptr_t)memory_region_get_ram_ptr(section->mr) +
> +                         section->offset_within_region;
> +
> +    trace_vhost_region_add_tmp(section->mr->name, mrs_gpa, mrs_size, mrs_host);
> +
> +    if (dev->tmp_mem->nregions) {
> +        /* Since we already have at least one region, lets see if
> +         * this extends it; since we're scanning in order, we only
> +         * have to look at the last one, and the FlatView that calls
> +         * us shouldn't have overlaps.
> +         */
> +        struct vhost_memory_region *prev_vmr = dev->tmp_mem->regions +
> +                                               (dev->tmp_mem->nregions - 1);
> +        uint64_t prev_gpa_start = prev_vmr->guest_phys_addr;
> +        uint64_t prev_gpa_end   = range_get_last(prev_gpa_start,
> +                                                 prev_vmr->memory_size);
> +        uint64_t prev_host_start = prev_vmr->userspace_addr;
> +        uint64_t prev_host_end   = range_get_last(prev_host_start,
> +                                                  prev_vmr->memory_size);
> +
> +        if (prev_gpa_end + 1 == mrs_gpa &&
> +            prev_host_end + 1 == mrs_host &&
> +            (!dev->vhost_ops->vhost_backend_can_merge ||
> +                dev->vhost_ops->vhost_backend_can_merge(dev,
> +                    mrs_host, mrs_size,
> +                    prev_host_start, prev_vmr->memory_size))) {
> +            /* The two regions abut */
> +            need_add = false;
> +            mrs_size = mrs_size + prev_vmr->memory_size;
> +            prev_vmr->memory_size = mrs_size;
> +            trace_vhost_region_add_tmp_abut(section->mr->name, mrs_size);
> +        }
> +    }
> +
> +    if (need_add) {
> +        uint32_t nregions = dev->tmp_mem->nregions;
> +        size_t s = offsetof(struct vhost_memory, regions) +
> +                            (nregions + 1) * sizeof dev->tmp_mem->regions[0];
> +        dev->tmp_mem = g_realloc(dev->tmp_mem, s);
> +        dev->tmp_mem->nregions++;
> +        struct vhost_memory_region *cur_vmr = &dev->tmp_mem->regions[nregions];
> +
> +        cur_vmr->guest_phys_addr = mrs_gpa;
> +        cur_vmr->memory_size     = mrs_size;
> +        cur_vmr->userspace_addr  = mrs_host;
> +        cur_vmr->flags_padding = 0;
> +    }
> +
> +
> +}
> +
>  static void vhost_region_add(MemoryListener *listener,
>                               MemoryRegionSection *section)
>  {
> @@ -703,6 +764,7 @@ static void vhost_region_add(MemoryListener *listener,
>      if (!vhost_section(section)) {
>          return;
>      }
> +    vhost_region_add_tmp(dev, section);
>  
>      ++dev->n_mem_sections;
>      dev->mem_sections = g_renew(MemoryRegionSection, dev->mem_sections,
> @@ -800,9 +862,17 @@ static void vhost_iommu_region_del(MemoryListener *listener,
>      }
>  }
>  
> +/* Called on regions that have not changed */
>  static void vhost_region_nop(MemoryListener *listener,
>                               MemoryRegionSection *section)
>  {
move it close to add/del callbacks

> +    struct vhost_dev *dev = container_of(listener, struct vhost_dev,
> +                                         memory_listener);
> +    if (!vhost_section(section)) {
> +        return;
> +    }
> +
> +    vhost_region_add_tmp(dev, section);
if you'd operate on temporary mem_sections,
I'd add memory_region_ref() and get rid of region_del callback
with its section lookup, by unreffing old sections at commit time.

Also it seems that we have a race in current code where
region_del() unrefs memory region first and then by the
commit time memory region could be gone since old flatview
is unreffed before commit callback is called, but guest still
uses old memory map until vhost_set_mem_table() is complete.
We probably should unref deleted(old) sections after
guest gets new memmap.


>  }
>  
>  static int vhost_virtqueue_set_addr(struct vhost_dev *dev,

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list
  2017-12-13 21:31   ` Paolo Bonzini
@ 2017-12-14 15:53     ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 24+ messages in thread
From: Dr. David Alan Gilbert @ 2017-12-14 15:53 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, imammedo, maxime.coquelin, groug, mst

* Paolo Bonzini (pbonzini@redhat.com) wrote:
> On 13/12/2017 19:08, Dr. David Alan Gilbert (git) wrote:
> > +    if (dev->tmp_mem->nregions) {
> > +        /* Since we already have at least one region, lets see if
> > +         * this extends it; since we're scanning in order, we only
> > +         * have to look at the last one, and the FlatView that calls
> > +         * us shouldn't have overlaps.
> > +         */
> > +        struct vhost_memory_region *prev_vmr = dev->tmp_mem->regions +
> > +                                               (dev->tmp_mem->nregions - 1);
> > +        uint64_t prev_gpa_start = prev_vmr->guest_phys_addr;
> > +        uint64_t prev_gpa_end   = range_get_last(prev_gpa_start,
> > +                                                 prev_vmr->memory_size);
> > +        uint64_t prev_host_start = prev_vmr->userspace_addr;
> > +        uint64_t prev_host_end   = range_get_last(prev_host_start,
> > +                                                  prev_vmr->memory_size);
> > +
> > +        if (prev_gpa_end + 1 == mrs_gpa &&
> > +            prev_host_end + 1 == mrs_host &&
> > +            (!dev->vhost_ops->vhost_backend_can_merge ||
> > +                dev->vhost_ops->vhost_backend_can_merge(dev,
> > +                    mrs_host, mrs_size,
> > +                    prev_host_start, prev_vmr->memory_size))) {
> > +            /* The two regions abut */
> > +            need_add = false;
> > +            mrs_size = mrs_size + prev_vmr->memory_size;
> > +            prev_vmr->memory_size = mrs_size;
> > +            trace_vhost_region_add_tmp_abut(section->mr->name, mrs_size);
> > +        }
> > +    }
> 
> Interesting, in which cases does this actually trigger?

:vhost_section vga-lowmem:0
:vhost_section /objects/mem:1
:vhost_region_add_tmp /objects/mem: 0x0+0xa0000 @ 0x7ff1f5a08000
:vhost_section vga.vram:0
:vhost_section vga-lowmem:0
:vhost_section /objects/mem:1
:vhost_region_add_tmp /objects/mem: 0xc0000+0xa000 @ 0x7ff1f5ac8000
:vhost_section /objects/mem:1
:vhost_region_add_tmp /objects/mem: 0xca000+0x3000 @ 0x7ff1f5ad2000
>vhost_region_add_tmp_abut /objects/mem: 0xd000
:vhost_section /objects/mem:1
:vhost_region_add_tmp /objects/mem: 0xcd000+0x1f000 @ 0x7ff1f5ad5000
>vhost_region_add_tmp_abut /objects/mem: 0x2c000
:vhost_section /objects/mem:1
:vhost_region_add_tmp /objects/mem: 0xec000+0x4000 @ 0x7ff1f5af4000
>vhost_region_add_tmp_abut /objects/mem: 0x30000
:vhost_section /objects/mem:1
:vhost_region_add_tmp /objects/mem: 0xf0000+0x10000 @ 0x7ff1f5af8000
>vhost_region_add_tmp_abut /objects/mem: 0x40000
:vhost_section /objects/mem:1
:vhost_region_add_tmp /objects/mem: 0x100000+0x3ff00000 @ 0x7ff1f5b08000
>vhost_region_add_tmp_abut /objects/mem: 0x3ff40000
:vhost_section vga.vram:0
:vhost_section vga ioports remapped:0
:vhost_section bochs dispi interface:0
:vhost_section qemu extended regs:0
:vhost_section msix-table:0
:vhost_section msix-pba:0
:vhost_section kvm-ioapic:0
:vhost_section hpet:0
:vhost_section kvm-apic-msi:0
:vhost_section pc.bios:0
:vhost_commit Started: 1 Changed: 0

So it's not unusual for us to piece them back together into one
chunk.

Dave

> Paolo
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list
  2017-12-14 15:27   ` Igor Mammedov
@ 2017-12-14 18:43     ` Michael S. Tsirkin
  2017-12-18 20:29       ` Dr. David Alan Gilbert
  2017-12-15 13:30     ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2017-12-14 18:43 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Dr. David Alan Gilbert (git),
	qemu-devel, pbonzini, maxime.coquelin, groug

On Thu, Dec 14, 2017 at 04:27:31PM +0100, Igor Mammedov wrote:
> Also it seems that we have a race in current code where
> region_del() unrefs memory region first and then by the
> commit time memory region could be gone since old flatview
> is unreffed before commit callback is called, but guest still
> uses old memory map until vhost_set_mem_table() is complete.
> We probably should unref deleted(old) sections after
> guest gets new memmap.

Care trying to post a patch for stable? Might be a good idea
to merge before this rework, for the sake of downstreams.

> 
> >  }
> >  
> >  static int vhost_virtqueue_set_addr(struct vhost_dev *dev,

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/6] vhost: Move log_dirty check
  2017-12-14 15:20     ` Dr. David Alan Gilbert
@ 2017-12-15 10:01       ` Igor Mammedov
  2017-12-27 12:10       ` Igor Mammedov
  1 sibling, 0 replies; 24+ messages in thread
From: Igor Mammedov @ 2017-12-15 10:01 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, pbonzini, maxime.coquelin, mst, groug

On Thu, 14 Dec 2017 15:20:10 +0000
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Igor Mammedov (imammedo@redhat.com) wrote:
> > On Wed, 13 Dec 2017 18:08:02 +0000
> > "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:
> >   
> > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > 
> > > Move the log_dirty check into vhost_section.
> > > 
> > > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > ---
> > >  hw/virtio/trace-events |  3 +++
> > >  hw/virtio/vhost.c      | 20 +++++++++++++-------
> > >  2 files changed, 16 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> > > index 775461ae98..4a493bcd46 100644
> > > --- a/hw/virtio/trace-events
> > > +++ b/hw/virtio/trace-events
> > > @@ -1,5 +1,8 @@
> > >  # See docs/devel/tracing.txt for syntax documentation.
> > >  
> > > +# hw/virtio/vhost.c
> > > +vhost_section(const char *name, int r) "%s:%d"
> > > +
> > >  # hw/virtio/virtio.c
> > >  virtqueue_alloc_element(void *elem, size_t sz, unsigned in_num, unsigned out_num) "elem %p size %zd in_num %u out_num %u"
> > >  virtqueue_fill(void *vq, const void *elem, unsigned int len, unsigned int idx) "vq %p elem %p len %u idx %u"
> > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > > index e4290ce93d..e923219e63 100644
> > > --- a/hw/virtio/vhost.c
> > > +++ b/hw/virtio/vhost.c
> > > @@ -27,6 +27,7 @@
> > >  #include "hw/virtio/virtio-access.h"
> > >  #include "migration/blocker.h"
> > >  #include "sysemu/dma.h"
> > > +#include "trace.h"
> > >  
> > >  /* enabled until disconnected backend stabilizes */
> > >  #define _VHOST_DEBUG 1
> > > @@ -567,18 +568,12 @@ static void vhost_set_memory(MemoryListener *listener,
> > >                                           memory_listener);
> > >      hwaddr start_addr = section->offset_within_address_space;
> > >      ram_addr_t size = int128_get64(section->size);
> > > -    bool log_dirty =
> > > -        memory_region_get_dirty_log_mask(section->mr) & ~(1 << DIRTY_MEMORY_MIGRATION);
> > >      int s = offsetof(struct vhost_memory, regions) +
> > >          (dev->mem->nregions + 1) * sizeof dev->mem->regions[0];
> > >      void *ram;
> > >  
> > >      dev->mem = g_realloc(dev->mem, s);
> > >  
> > > -    if (log_dirty) {
> > > -        add = false;
> > > -    }
> > > -
> > >      assert(size);
> > >  
> > >      /* Optimize no-change case. At least cirrus_vga does this a lot at this time. */
> > > @@ -611,8 +606,19 @@ static void vhost_set_memory(MemoryListener *listener,
> > >  
> > >  static bool vhost_section(MemoryRegionSection *section)
> > >  {
> > > -    return memory_region_is_ram(section->mr) &&
> > > +    bool result;
> > > +    bool log_dirty = memory_region_get_dirty_log_mask(section->mr) &
> > > +                     ~(1 << DIRTY_MEMORY_MIGRATION);
> > > +    result = memory_region_is_ram(section->mr) &&
> > >          !memory_region_is_rom(section->mr);
> > > +
> > > +    /* Vhost doesn't handle any block which is doing dirty-tracking other
> > > +     * than migration; this typically fires on VGA areas.
> > > +     */
> > > +    result &= !log_dirty;  
> > before patch even if log_dirty, vhost_set_memory will still proceed
> > and may remove dirty section from memmap and set memory_changed = true
> > 
> > but with this patch it will just ignore such section,
> > I'm not sure it's right (it might be right with new approach add/nop
> > but then this patch should go after new code in place and old one
> > if gone).  
> 
> I thought about that, but then I came to the conclusion that the whole
> idea is that we're supposed to be ignoring these regions - so why
> should they cause any change to the behaviour of vhost at all?
> Thus this way seems safer.
Could it happen that, while being deleted section could become dirty
at the same time? if that could happen then region_del won't process
section and won't call unref on it either.

one more reason to operate on tmp_mem_sections and then
unconditionally unref old mem_sections at commit time at
the right time (after set mem table).

> 
> Dave
> 
> > > +
> > > +    trace_vhost_section(section->mr->name, result);
> > > +    return result;
> > >  }
> > >  
> > >  static void vhost_begin(MemoryListener *listener)  
> >   
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/6] vhost: Simplify ring verification checks
  2017-12-14 14:07   ` Igor Mammedov
@ 2017-12-15 12:24     ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 24+ messages in thread
From: Dr. David Alan Gilbert @ 2017-12-15 12:24 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: qemu-devel, pbonzini, maxime.coquelin, mst, groug

* Igor Mammedov (imammedo@redhat.com) wrote:
> On Wed, 13 Dec 2017 18:08:03 +0000
> "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:
> 
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > vhost_verify_ring_mappings() were used to verify that
> > rings are still accessible and related memory hasn't
> > been moved after flatview is updated.
> > 
> > It was doing checks by mapping ring's GPA+len and
> > checking that HVA hadn't changed with new memory map.
> > To avoid maybe expensive mapping call, we were
> > identifying address range that changed and were doing
> > mapping only if ring was in changed range.
> > 
> > However it's not neccessary to perform ring's GPA
> > mapping as we already have its current HVA and all
> > we need is to verify that ring's GPA translates to
> > the same HVA in updated flatview.
> > 
> > This will allow the following patches to simplify the range
> > comparison that was previously needed to avoid expensive
> > verify_ring_mapping calls.
> > 
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > with modifications by:
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> >  hw/virtio/vhost.c | 74 ++++++++++++++++++++++++++++++-------------------------
> >  1 file changed, 41 insertions(+), 33 deletions(-)
> > 
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index e923219e63..c7ce7baf9b 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> > @@ -450,35 +450,37 @@ static void vhost_memory_unmap(struct vhost_dev *dev, void *buffer,
> >      }
> >  }
> >  
> > -static int vhost_verify_ring_part_mapping(struct vhost_dev *dev,
> > -                                          void *part,
> > -                                          uint64_t part_addr,
> > -                                          uint64_t part_size,
> > -                                          uint64_t start_addr,
> > -                                          uint64_t size)
> > +static int vhost_verify_ring_part_mapping(void *ring_hva,
> > +                                          uint64_t ring_gpa,
> > +                                          uint64_t ring_size,
> > +                                          void *reg_hva,
> > +                                          uint64_t reg_gpa,
> > +                                          uint64_t reg_size)
> >  {
> > -    hwaddr l;
> > -    void *p;
> > -    int r = 0;
> > +    uint64_t hva_ring_offset;
> > +    uint64_t ring_last = range_get_last(ring_gpa, ring_size);
> > +    uint64_t reg_last = range_get_last(reg_gpa, reg_size);
> >  
> > -    if (!ranges_overlap(start_addr, size, part_addr, part_size)) {
> > +    if (ring_last < reg_gpa || ring_gpa > reg_last) {
> >          return 0;
> >      }
> > -    l = part_size;
> > -    p = vhost_memory_map(dev, part_addr, &l, 1);
> > -    if (!p || l != part_size) {
> > -        r = -ENOMEM;
> > +    /* check that whole ring's is mapped */
> > +    if (ring_last > reg_last) {
> > +        return -EBUSY;
> ENOMEM

Done

> >      }
> > -    if (p != part) {
> > -        r = -EBUSY;
> > +    /* check that ring's MemoryRegion wasn't replaced */
> > +    hva_ring_offset = ring_gpa - reg_gpa;
> > +    if (ring_hva != reg_hva + hva_ring_offset) {
> > +        return -ENOMEM;
> EBUSY

Done

> 
> >      }
> > -    vhost_memory_unmap(dev, p, l, 0, 0);
> > -    return r;
> > +
> > +    return 0;
> >  }
> >  
> >  static int vhost_verify_ring_mappings(struct vhost_dev *dev,
> > -                                      uint64_t start_addr,
> > -                                      uint64_t size)
> > +                                      void *reg_hva,
> > +                                      uint64_t reg_gpa,
> > +                                      uint64_t reg_size)
> >  {
> >      int i, j;
> >      int r = 0;
> > @@ -492,22 +494,25 @@ static int vhost_verify_ring_mappings(struct vhost_dev *dev,
> >          struct vhost_virtqueue *vq = dev->vqs + i;
> >  
> >          j = 0;
> > -        r = vhost_verify_ring_part_mapping(dev, vq->desc, vq->desc_phys,
> > -                                           vq->desc_size, start_addr, size);
> > +        r = vhost_verify_ring_part_mapping(
> > +                vq->desc, vq->desc_phys, vq->desc_size,
> > +                reg_hva, reg_gpa, reg_size);
> >          if (r) {
> >              break;
> >          }
> >  
> >          j++;
> > -        r = vhost_verify_ring_part_mapping(dev, vq->avail, vq->avail_phys,
> > -                                           vq->avail_size, start_addr, size);
> > +        r = vhost_verify_ring_part_mapping(
> > +                vq->desc, vq->desc_phys, vq->desc_size,
> > +                reg_hva, reg_gpa, reg_size);
> >          if (r) {
> >              break;
> >          }
> >  
> >          j++;
> > -        r = vhost_verify_ring_part_mapping(dev, vq->used, vq->used_phys,
> > -                                           vq->used_size, start_addr, size);
> > +        r = vhost_verify_ring_part_mapping(
> > +                vq->desc, vq->desc_phys, vq->desc_size,
> > +                reg_hva, reg_gpa, reg_size);
> >          if (r) {
> >              break;
> >          }
> > @@ -633,8 +638,6 @@ static void vhost_commit(MemoryListener *listener)
> >  {
> >      struct vhost_dev *dev = container_of(listener, struct vhost_dev,
> >                                           memory_listener);
> > -    hwaddr start_addr = 0;
> > -    ram_addr_t size = 0;
> >      uint64_t log_size;
> >      int r;
> >  
> > @@ -649,11 +652,16 @@ static void vhost_commit(MemoryListener *listener)
> >      }
> >  
> >      if (dev->started) {
> ^^^ not needed as we bail out earlier if it's not true

Done

> > -        start_addr = dev->mem_changed_start_addr;
> > -        size = dev->mem_changed_end_addr - dev->mem_changed_start_addr + 1;
> > -
> > -        r = vhost_verify_ring_mappings(dev, start_addr, size);
> > -        assert(r >= 0);
> > +        int i;
> > +        for (i = 0; i < dev->mem->nregions; i++) {
> > +            if (vhost_verify_ring_mappings(dev,
> > +                               (void *)dev->mem->regions[i].userspace_addr,
> > +                               dev->mem->regions[i].guest_phys_addr,
> > +                               dev->mem->regions[i].memory_size)) {
> > +                error_report("Verify ring failure on region %d", i);
> > +                abort();
> > +            }
> > +        }
> >      }
> >  
> >      if (!dev->log_enabled) {
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v4 3/6] vhost: Add temporary memory structure
  2017-12-14 15:15   ` Igor Mammedov
@ 2017-12-15 13:15     ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 24+ messages in thread
From: Dr. David Alan Gilbert @ 2017-12-15 13:15 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: qemu-devel, pbonzini, maxime.coquelin, mst, groug

* Igor Mammedov (imammedo@redhat.com) wrote:
> On Wed, 13 Dec 2017 18:08:04 +0000
> "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:
> 
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > Add a 2nd 'vhost_memory' structure that will be used to build
> > the new version as the listener iterates over the address space.
> I'd suggest to add temporary 'mem_sections' instead and
> create/use/free temporary 'vhost_memory' structure at commit time,
> (1 less duplicated data set to keep in memory)

I don't see how this leads to less duplication;  I still end up with
one set of temporary structures, and the sections use Int128 which
is a pain to work with in the comparison/merging code.

Dave

> > 
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> >  hw/virtio/vhost.c         | 3 +++
> >  include/hw/virtio/vhost.h | 1 +
> >  2 files changed, 4 insertions(+)
> > 
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index c7ce7baf9b..4523f45587 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> > @@ -632,6 +632,8 @@ static void vhost_begin(MemoryListener *listener)
> >                                           memory_listener);
> >      dev->mem_changed_end_addr = 0;
> >      dev->mem_changed_start_addr = -1;
> > +
> > +    dev->tmp_mem = g_malloc0(offsetof(struct vhost_memory, regions));
> >  }
> >  
> >  static void vhost_commit(MemoryListener *listener)
> > @@ -641,6 +643,7 @@ static void vhost_commit(MemoryListener *listener)
> >      uint64_t log_size;
> >      int r;
> >  
> > +    g_free(dev->tmp_mem);
> >      if (!dev->memory_changed) {
> >          return;
> >      }
> > diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> > index 467dc7794b..41f9e569be 100644
> > --- a/include/hw/virtio/vhost.h
> > +++ b/include/hw/virtio/vhost.h
> > @@ -70,6 +70,7 @@ struct vhost_dev {
> >      bool memory_changed;
> >      hwaddr mem_changed_start_addr;
> >      hwaddr mem_changed_end_addr;
> > +    struct vhost_memory *tmp_mem;
> >      const VhostOps *vhost_ops;
> >      void *opaque;
> >      struct vhost_log *log;
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list
  2017-12-14 15:27   ` Igor Mammedov
  2017-12-14 18:43     ` Michael S. Tsirkin
@ 2017-12-15 13:30     ` Dr. David Alan Gilbert
  2017-12-15 14:50       ` Paolo Bonzini
  1 sibling, 1 reply; 24+ messages in thread
From: Dr. David Alan Gilbert @ 2017-12-15 13:30 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: qemu-devel, pbonzini, maxime.coquelin, groug, mst

* Igor Mammedov (imammedo@redhat.com) wrote:
> On Wed, 13 Dec 2017 18:08:05 +0000
> "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:
> 
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > As regions are reported by the listener to the _nop and _add
> > methods, add them to our new temporary list.
> > Regions that abut can be merged if the backend allows.
> > 
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> >  hw/virtio/trace-events |  2 ++
> >  hw/virtio/vhost.c      | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 72 insertions(+)
> > 
> > diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> > index 4a493bcd46..7de0663652 100644
> > --- a/hw/virtio/trace-events
> > +++ b/hw/virtio/trace-events
> > @@ -1,6 +1,8 @@
> >  # See docs/devel/tracing.txt for syntax documentation.
> >  
> >  # hw/virtio/vhost.c
> > +vhost_region_add_tmp(const char *name, uint64_t gpa, uint64_t size, uint64_t host) "%s: 0x%"PRIx64"+0x%"PRIx64" @ 0x%"PRIx64
> > +vhost_region_add_tmp_abut(const char *name, uint64_t new_size) "%s: 0x%"PRIx64
> >  vhost_section(const char *name, int r) "%s:%d"
> >  
> >  # hw/virtio/virtio.c
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index 4523f45587..2084888aa7 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> > @@ -694,6 +694,67 @@ static void vhost_commit(MemoryListener *listener)
> >      dev->memory_changed = false;
> >  }
> >  
> > +/* Adds the section data to the tmp_mem structure.
> > + * It relies on the listener calling us in memory address order
> > + * and for each region (via the _add and _nop methods).
> > + */
> > +static void vhost_region_add_tmp(struct vhost_dev *dev,
> > +                                 MemoryRegionSection *section)
> > +{
> > +    bool need_add = true;
> > +    uint64_t mrs_size = int128_get64(section->size);
> > +    uint64_t mrs_gpa = section->offset_within_address_space;
> > +    uintptr_t mrs_host = (uintptr_t)memory_region_get_ram_ptr(section->mr) +
> > +                         section->offset_within_region;
> > +
> > +    trace_vhost_region_add_tmp(section->mr->name, mrs_gpa, mrs_size, mrs_host);
> > +
> > +    if (dev->tmp_mem->nregions) {
> > +        /* Since we already have at least one region, lets see if
> > +         * this extends it; since we're scanning in order, we only
> > +         * have to look at the last one, and the FlatView that calls
> > +         * us shouldn't have overlaps.
> > +         */
> > +        struct vhost_memory_region *prev_vmr = dev->tmp_mem->regions +
> > +                                               (dev->tmp_mem->nregions - 1);
> > +        uint64_t prev_gpa_start = prev_vmr->guest_phys_addr;
> > +        uint64_t prev_gpa_end   = range_get_last(prev_gpa_start,
> > +                                                 prev_vmr->memory_size);
> > +        uint64_t prev_host_start = prev_vmr->userspace_addr;
> > +        uint64_t prev_host_end   = range_get_last(prev_host_start,
> > +                                                  prev_vmr->memory_size);
> > +
> > +        if (prev_gpa_end + 1 == mrs_gpa &&
> > +            prev_host_end + 1 == mrs_host &&
> > +            (!dev->vhost_ops->vhost_backend_can_merge ||
> > +                dev->vhost_ops->vhost_backend_can_merge(dev,
> > +                    mrs_host, mrs_size,
> > +                    prev_host_start, prev_vmr->memory_size))) {
> > +            /* The two regions abut */
> > +            need_add = false;
> > +            mrs_size = mrs_size + prev_vmr->memory_size;
> > +            prev_vmr->memory_size = mrs_size;
> > +            trace_vhost_region_add_tmp_abut(section->mr->name, mrs_size);
> > +        }
> > +    }
> > +
> > +    if (need_add) {
> > +        uint32_t nregions = dev->tmp_mem->nregions;
> > +        size_t s = offsetof(struct vhost_memory, regions) +
> > +                            (nregions + 1) * sizeof dev->tmp_mem->regions[0];
> > +        dev->tmp_mem = g_realloc(dev->tmp_mem, s);
> > +        dev->tmp_mem->nregions++;
> > +        struct vhost_memory_region *cur_vmr = &dev->tmp_mem->regions[nregions];
> > +
> > +        cur_vmr->guest_phys_addr = mrs_gpa;
> > +        cur_vmr->memory_size     = mrs_size;
> > +        cur_vmr->userspace_addr  = mrs_host;
> > +        cur_vmr->flags_padding = 0;
> > +    }
> > +
> > +
> > +}
> > +
> >  static void vhost_region_add(MemoryListener *listener,
> >                               MemoryRegionSection *section)
> >  {
> > @@ -703,6 +764,7 @@ static void vhost_region_add(MemoryListener *listener,
> >      if (!vhost_section(section)) {
> >          return;
> >      }
> > +    vhost_region_add_tmp(dev, section);
> >  
> >      ++dev->n_mem_sections;
> >      dev->mem_sections = g_renew(MemoryRegionSection, dev->mem_sections,
> > @@ -800,9 +862,17 @@ static void vhost_iommu_region_del(MemoryListener *listener,
> >      }
> >  }
> >  
> > +/* Called on regions that have not changed */
> >  static void vhost_region_nop(MemoryListener *listener,
> >                               MemoryRegionSection *section)
> >  {
> move it close to add/del callbacks
> 
> > +    struct vhost_dev *dev = container_of(listener, struct vhost_dev,
> > +                                         memory_listener);
> > +    if (!vhost_section(section)) {
> > +        return;
> > +    }
> > +
> > +    vhost_region_add_tmp(dev, section);
> if you'd operate on temporary mem_sections,
> I'd add memory_region_ref() and get rid of region_del callback
> with its section lookup, by unreffing old sections at commit time.
> 
> Also it seems that we have a race in current code where
> region_del() unrefs memory region first and then by the
> commit time memory region could be gone since old flatview
> is unreffed before commit callback is called, but guest still
> uses old memory map until vhost_set_mem_table() is complete.
> We probably should unref deleted(old) sections after
> guest gets new memmap.

Will they really get cleaned up before the commit() returns?
There's no rcu like thing guarding it?

Dave

> 
> >  }
> >  
> >  static int vhost_virtqueue_set_addr(struct vhost_dev *dev,
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list
  2017-12-15 13:30     ` Dr. David Alan Gilbert
@ 2017-12-15 14:50       ` Paolo Bonzini
  2017-12-15 16:11         ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 24+ messages in thread
From: Paolo Bonzini @ 2017-12-15 14:50 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, Igor Mammedov
  Cc: qemu-devel, maxime.coquelin, groug, mst

On 15/12/2017 14:30, Dr. David Alan Gilbert wrote:
>> Also it seems that we have a race in current code where
>> region_del() unrefs memory region first and then by the
>> commit time memory region could be gone since old flatview
>> is unreffed before commit callback is called, but guest still
>> uses old memory map until vhost_set_mem_table() is complete.
>> We probably should unref deleted(old) sections after
>> guest gets new memmap.
>
> Will they really get cleaned up before the commit() returns?
> There's no rcu like thing guarding it?

The memory subsystem only keeps them alive until before commmit() is
invoked.

Paolo

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list
  2017-12-15 14:50       ` Paolo Bonzini
@ 2017-12-15 16:11         ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 24+ messages in thread
From: Dr. David Alan Gilbert @ 2017-12-15 16:11 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Igor Mammedov, qemu-devel, maxime.coquelin, groug, mst

* Paolo Bonzini (pbonzini@redhat.com) wrote:
> On 15/12/2017 14:30, Dr. David Alan Gilbert wrote:
> >> Also it seems that we have a race in current code where
> >> region_del() unrefs memory region first and then by the
> >> commit time memory region could be gone since old flatview
> >> is unreffed before commit callback is called, but guest still
> >> uses old memory map until vhost_set_mem_table() is complete.
> >> We probably should unref deleted(old) sections after
> >> guest gets new memmap.
> >
> > Will they really get cleaned up before the commit() returns?
> > There's no rcu like thing guarding it?
> 
> The memory subsystem only keeps them alive until before commmit() is
> invoked.

Hmm ok; I guess then we do need to keep the temporary list of
MemoryRegionSections and unref all the old ones after the end of the callback.

I'll rework it (again).

Dave


> Paolo
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list
  2017-12-14 18:43     ` Michael S. Tsirkin
@ 2017-12-18 20:29       ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 24+ messages in thread
From: Dr. David Alan Gilbert @ 2017-12-18 20:29 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Igor Mammedov, qemu-devel, pbonzini, maxime.coquelin, groug

* Michael S. Tsirkin (mst@redhat.com) wrote:
> On Thu, Dec 14, 2017 at 04:27:31PM +0100, Igor Mammedov wrote:
> > Also it seems that we have a race in current code where
> > region_del() unrefs memory region first and then by the
> > commit time memory region could be gone since old flatview
> > is unreffed before commit callback is called, but guest still
> > uses old memory map until vhost_set_mem_table() is complete.
> > We probably should unref deleted(old) sections after
> > guest gets new memmap.
> 
> Care trying to post a patch for stable? Might be a good idea
> to merge before this rework, for the sake of downstreams.

I think the 1st patch of my v5 might be suitable for that; please
have a look.

Dave

> > 
> > >  }
> > >  
> > >  static int vhost_virtqueue_set_addr(struct vhost_dev *dev,
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/6] vhost: Move log_dirty check
  2017-12-14 15:20     ` Dr. David Alan Gilbert
  2017-12-15 10:01       ` Igor Mammedov
@ 2017-12-27 12:10       ` Igor Mammedov
  2018-01-09 18:42         ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 24+ messages in thread
From: Igor Mammedov @ 2017-12-27 12:10 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, pbonzini, maxime.coquelin, mst, groug

On Thu, 14 Dec 2017 15:20:10 +0000
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Igor Mammedov (imammedo@redhat.com) wrote:
> > On Wed, 13 Dec 2017 18:08:02 +0000
> > "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:
> > 
> > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > 
> > > Move the log_dirty check into vhost_section.
> > > 
> > > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > ---
> > >  hw/virtio/trace-events |  3 +++
> > >  hw/virtio/vhost.c      | 20 +++++++++++++-------
> > >  2 files changed, 16 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> > > index 775461ae98..4a493bcd46 100644
> > > --- a/hw/virtio/trace-events
> > > +++ b/hw/virtio/trace-events
> > > @@ -1,5 +1,8 @@
> > >  # See docs/devel/tracing.txt for syntax documentation.
> > >  
> > > +# hw/virtio/vhost.c
> > > +vhost_section(const char *name, int r) "%s:%d"
> > > +
> > >  # hw/virtio/virtio.c
> > >  virtqueue_alloc_element(void *elem, size_t sz, unsigned in_num, unsigned out_num) "elem %p size %zd in_num %u out_num %u"
> > >  virtqueue_fill(void *vq, const void *elem, unsigned int len, unsigned int idx) "vq %p elem %p len %u idx %u"
> > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > > index e4290ce93d..e923219e63 100644
> > > --- a/hw/virtio/vhost.c
> > > +++ b/hw/virtio/vhost.c
> > > @@ -27,6 +27,7 @@
> > >  #include "hw/virtio/virtio-access.h"
> > >  #include "migration/blocker.h"
> > >  #include "sysemu/dma.h"
> > > +#include "trace.h"
> > >  
> > >  /* enabled until disconnected backend stabilizes */
> > >  #define _VHOST_DEBUG 1
> > > @@ -567,18 +568,12 @@ static void vhost_set_memory(MemoryListener *listener,
> > >                                           memory_listener);
> > >      hwaddr start_addr = section->offset_within_address_space;
> > >      ram_addr_t size = int128_get64(section->size);
> > > -    bool log_dirty =
> > > -        memory_region_get_dirty_log_mask(section->mr) & ~(1 << DIRTY_MEMORY_MIGRATION);
> > >      int s = offsetof(struct vhost_memory, regions) +
> > >          (dev->mem->nregions + 1) * sizeof dev->mem->regions[0];
> > >      void *ram;
> > >  
> > >      dev->mem = g_realloc(dev->mem, s);
> > >  
> > > -    if (log_dirty) {
> > > -        add = false;
> > > -    }
> > > -
> > >      assert(size);
> > >  
> > >      /* Optimize no-change case. At least cirrus_vga does this a lot at this time. */
> > > @@ -611,8 +606,19 @@ static void vhost_set_memory(MemoryListener *listener,
> > >  
> > >  static bool vhost_section(MemoryRegionSection *section)
> > >  {
> > > -    return memory_region_is_ram(section->mr) &&
> > > +    bool result;
> > > +    bool log_dirty = memory_region_get_dirty_log_mask(section->mr) &
> > > +                     ~(1 << DIRTY_MEMORY_MIGRATION);
> > > +    result = memory_region_is_ram(section->mr) &&
> > >          !memory_region_is_rom(section->mr);
> > > +
> > > +    /* Vhost doesn't handle any block which is doing dirty-tracking other
> > > +     * than migration; this typically fires on VGA areas.
> > > +     */
> > > +    result &= !log_dirty;
> > before patch even if log_dirty, vhost_set_memory will still proceed
> > and may remove dirty section from memmap and set memory_changed = true
> > 
> > but with this patch it will just ignore such section,
> > I'm not sure it's right (it might be right with new approach add/nop
> > but then this patch should go after new code in place and old one
> > if gone).
> 
> I thought about that, but then I came to the conclusion that the whole
> idea is that we're supposed to be ignoring these regions - so why
> should they cause any change to the behaviour of vhost at all?
> Thus this way seems safer.
the thing is that with originally with old code backend will use memap
with not yet dirty region, then after region becomes dirty current code
will REMOVE dirty region from map and NOTIFY backend.

While this patch is fine for new approach as memmap is build from scratch,
it doesn't look correct for current code since region that just became
dirty will not be removed from memap and might be used backend.
Whether it would really break something or not I'm not sure,
maybe Michael may Ack this patch ii using dirty region is fine.

> 
> Dave
> 
> > > +
> > > +    trace_vhost_section(section->mr->name, result);
> > > +    return result;
> > >  }
> > >  
> > >  static void vhost_begin(MemoryListener *listener)
> > 
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/6] vhost: Move log_dirty check
  2017-12-27 12:10       ` Igor Mammedov
@ 2018-01-09 18:42         ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 24+ messages in thread
From: Dr. David Alan Gilbert @ 2018-01-09 18:42 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: qemu-devel, pbonzini, maxime.coquelin, mst, groug

* Igor Mammedov (imammedo@redhat.com) wrote:
> On Thu, 14 Dec 2017 15:20:10 +0000
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> 
> > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > On Wed, 13 Dec 2017 18:08:02 +0000
> > > "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:
> > > 
> > > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > 
> > > > Move the log_dirty check into vhost_section.
> > > > 
> > > > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > > ---
> > > >  hw/virtio/trace-events |  3 +++
> > > >  hw/virtio/vhost.c      | 20 +++++++++++++-------
> > > >  2 files changed, 16 insertions(+), 7 deletions(-)
> > > > 
> > > > diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> > > > index 775461ae98..4a493bcd46 100644
> > > > --- a/hw/virtio/trace-events
> > > > +++ b/hw/virtio/trace-events
> > > > @@ -1,5 +1,8 @@
> > > >  # See docs/devel/tracing.txt for syntax documentation.
> > > >  
> > > > +# hw/virtio/vhost.c
> > > > +vhost_section(const char *name, int r) "%s:%d"
> > > > +
> > > >  # hw/virtio/virtio.c
> > > >  virtqueue_alloc_element(void *elem, size_t sz, unsigned in_num, unsigned out_num) "elem %p size %zd in_num %u out_num %u"
> > > >  virtqueue_fill(void *vq, const void *elem, unsigned int len, unsigned int idx) "vq %p elem %p len %u idx %u"
> > > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > > > index e4290ce93d..e923219e63 100644
> > > > --- a/hw/virtio/vhost.c
> > > > +++ b/hw/virtio/vhost.c
> > > > @@ -27,6 +27,7 @@
> > > >  #include "hw/virtio/virtio-access.h"
> > > >  #include "migration/blocker.h"
> > > >  #include "sysemu/dma.h"
> > > > +#include "trace.h"
> > > >  
> > > >  /* enabled until disconnected backend stabilizes */
> > > >  #define _VHOST_DEBUG 1
> > > > @@ -567,18 +568,12 @@ static void vhost_set_memory(MemoryListener *listener,
> > > >                                           memory_listener);
> > > >      hwaddr start_addr = section->offset_within_address_space;
> > > >      ram_addr_t size = int128_get64(section->size);
> > > > -    bool log_dirty =
> > > > -        memory_region_get_dirty_log_mask(section->mr) & ~(1 << DIRTY_MEMORY_MIGRATION);
> > > >      int s = offsetof(struct vhost_memory, regions) +
> > > >          (dev->mem->nregions + 1) * sizeof dev->mem->regions[0];
> > > >      void *ram;
> > > >  
> > > >      dev->mem = g_realloc(dev->mem, s);
> > > >  
> > > > -    if (log_dirty) {
> > > > -        add = false;
> > > > -    }
> > > > -
> > > >      assert(size);
> > > >  
> > > >      /* Optimize no-change case. At least cirrus_vga does this a lot at this time. */
> > > > @@ -611,8 +606,19 @@ static void vhost_set_memory(MemoryListener *listener,
> > > >  
> > > >  static bool vhost_section(MemoryRegionSection *section)
> > > >  {
> > > > -    return memory_region_is_ram(section->mr) &&
> > > > +    bool result;
> > > > +    bool log_dirty = memory_region_get_dirty_log_mask(section->mr) &
> > > > +                     ~(1 << DIRTY_MEMORY_MIGRATION);
> > > > +    result = memory_region_is_ram(section->mr) &&
> > > >          !memory_region_is_rom(section->mr);
> > > > +
> > > > +    /* Vhost doesn't handle any block which is doing dirty-tracking other
> > > > +     * than migration; this typically fires on VGA areas.
> > > > +     */
> > > > +    result &= !log_dirty;
> > > before patch even if log_dirty, vhost_set_memory will still proceed
> > > and may remove dirty section from memmap and set memory_changed = true
> > > 
> > > but with this patch it will just ignore such section,
> > > I'm not sure it's right (it might be right with new approach add/nop
> > > but then this patch should go after new code in place and old one
> > > if gone).
> > 
> > I thought about that, but then I came to the conclusion that the whole
> > idea is that we're supposed to be ignoring these regions - so why
> > should they cause any change to the behaviour of vhost at all?
> > Thus this way seems safer.
> the thing is that with originally with old code backend will use memap
> with not yet dirty region, then after region becomes dirty current code
> will REMOVE dirty region from map and NOTIFY backend.
> 
> While this patch is fine for new approach as memmap is build from scratch,
> it doesn't look correct for current code since region that just became
> dirty will not be removed from memap and might be used backend.
> Whether it would really break something or not I'm not sure,
> maybe Michael may Ack this patch ii using dirty region is fine.

That's OK; I can add the check to vhost_region_add_section when I
initially create it, and then swing this change around nearer to the
end of the patchset, so that now we're using the new code.

Dave

> > 
> > Dave
> > 
> > > > +
> > > > +    trace_vhost_section(section->mr->name, result);
> > > > +    return result;
> > > >  }
> > > >  
> > > >  static void vhost_begin(MemoryListener *listener)
> > > 
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2018-01-09 18:42 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-13 18:08 [Qemu-devel] [PATCH v4 0/6] Rework vhost memory region updates Dr. David Alan Gilbert (git)
2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 1/6] vhost: Move log_dirty check Dr. David Alan Gilbert (git)
2017-12-14 14:30   ` Igor Mammedov
2017-12-14 15:20     ` Dr. David Alan Gilbert
2017-12-15 10:01       ` Igor Mammedov
2017-12-27 12:10       ` Igor Mammedov
2018-01-09 18:42         ` Dr. David Alan Gilbert
2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 2/6] vhost: Simplify ring verification checks Dr. David Alan Gilbert (git)
2017-12-14 14:07   ` Igor Mammedov
2017-12-15 12:24     ` Dr. David Alan Gilbert
2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 3/6] vhost: Add temporary memory structure Dr. David Alan Gilbert (git)
2017-12-14 15:15   ` Igor Mammedov
2017-12-15 13:15     ` Dr. David Alan Gilbert
2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 4/6] vhost: add regions to temporary list Dr. David Alan Gilbert (git)
2017-12-13 21:31   ` Paolo Bonzini
2017-12-14 15:53     ` Dr. David Alan Gilbert
2017-12-14 15:27   ` Igor Mammedov
2017-12-14 18:43     ` Michael S. Tsirkin
2017-12-18 20:29       ` Dr. David Alan Gilbert
2017-12-15 13:30     ` Dr. David Alan Gilbert
2017-12-15 14:50       ` Paolo Bonzini
2017-12-15 16:11         ` Dr. David Alan Gilbert
2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 5/6] vhost: compare and flip in new memory region list Dr. David Alan Gilbert (git)
2017-12-13 18:08 ` [Qemu-devel] [PATCH v4 6/6] vhost: Clean out old vhost_set_memory and friends Dr. David Alan Gilbert (git)

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.