All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 00/17] Memory/IOMMU patches part 4: region ownership
@ 2013-06-04 12:13 Paolo Bonzini
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 01/17] memory: add getter/setter for owner Paolo Bonzini
                   ` (16 more replies)
  0 siblings, 17 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 12:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Now that the DMA APIs are unified, we move closer and closer to breaking
memory access from the BQL dependency.  This series adds an API to
reference/unreference memory regions, which is not really needed only
for BQL-less memory access: the big lock can already be dropped between
address_space_map and the corresponding unmap, and the last patch fixes
potential problems in this area if the DMA destination is hot-unplugged.
Lockless memory access only makes things a bit worse.

Reference counting the region piggybacks on reference counting of a QOM
object, the "owner" of the region.  The owner API is designed so that
it will be called as little as possible.  Unowned subregions will get a
region if memory_region_set_owner is called after the subregion is added.
This is in general the common case already; often setting the owner can
be delegated to a bus-specific API that already takes a DeviceState
(for example pci_register_bar or sysbus_init_mmio).

As mentioned earlier, I'm not sending a pull request for the previous
part yet.  Unplugging a PCI bridge found a reference counting bug in
memory_region_set_owner.

To test this I tried hot-unplug for all PCI devices.

Paolo

v1->v2: fixed reference counting bug in memory_region_set_owner
        added sysbus_pass_mmio (patch 6)
        moved qemu_ram_addr_from_host_nofail to cputlb.c (patch 15)

Paolo Bonzini (17):
  memory: add getter/setter for owner
  memory: add ref/unref
  memory: add ref/unref calls
  exec: add a reference to the region returned by
    address_space_translate
  pci: set owner for BARs
  sysbus: add sysbus_pass_mmio
  sysbus: set owner for MMIO regions
  acpi: add memory_region_set_owner calls
  misc: add memory_region_set_owner calls
  isa/portio: allow setting an owner
  vga: add memory_region_set_owner calls
  pci-assign: add memory_region_set_owner calls
  vfio: add memory_region_set_owner calls
  exec: check MRU in qemu_ram_addr_from_host
  exec: move qemu_ram_addr_from_host_nofail to cputlb.c
  memory: return MemoryRegion from qemu_ram_addr_from_host
  memory: ref/unref memory across address_space_map/unmap

 cputlb.c                              | 11 +++++
 exec.c                                | 90 +++++++++++++++++++++++------------
 hw/acpi/ich9.c                        |  1 +
 hw/acpi/piix4.c                       |  5 ++
 hw/char/serial-pci.c                  |  1 +
 hw/core/loader.c                      |  1 +
 hw/core/sysbus.c                      | 14 ++++++
 hw/cpu/arm11mpcore.c                  |  2 +-
 hw/display/cirrus_vga.c               | 19 ++++++--
 hw/display/exynos4210_fimd.c          |  6 +++
 hw/display/framebuffer.c              | 12 +++--
 hw/display/qxl.c                      |  6 ++-
 hw/display/vga-isa-mm.c               |  2 +-
 hw/display/vga-isa.c                  |  4 +-
 hw/display/vga-pci.c                  |  5 +-
 hw/display/vga.c                      | 19 ++++++--
 hw/display/vga_int.h                  |  9 ++--
 hw/display/vmware_vga.c               |  4 +-
 hw/i386/kvm/pci-assign.c              | 11 +++++
 hw/i386/kvmvapic.c                    |  1 +
 hw/isa/apm.c                          |  1 +
 hw/isa/isa-bus.c                      |  2 +
 hw/misc/pc-testdev.c                  |  7 +++
 hw/misc/vfio.c                        | 10 ++++
 hw/pci/pci.c                          |  2 +
 hw/virtio/dataplane/hostmem.c         |  7 +++
 hw/virtio/vhost.c                     |  2 +
 hw/virtio/virtio-balloon.c            |  1 +
 hw/xen/xen_pt.c                       |  4 ++
 include/exec/cpu-common.h             |  3 +-
 include/exec/ioport.h                 |  3 ++
 include/exec/memory.h                 | 69 ++++++++++++++++++++++++++-
 include/hw/sysbus.h                   |  1 +
 include/hw/virtio/dataplane/hostmem.h |  1 +
 ioport.c                              | 10 ++++
 kvm-all.c                             |  2 +
 memory.c                              | 79 ++++++++++++++++++++++++++++++
 target-arm/kvm.c                      |  2 +
 target-i386/kvm.c                     |  4 +-
 target-sparc/mmu_helper.c             |  1 +
 xen-all.c                             |  2 +
 41 files changed, 372 insertions(+), 64 deletions(-)

-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v2 01/17] memory: add getter/setter for owner
  2013-06-04 12:13 [Qemu-devel] [PATCH v2 00/17] Memory/IOMMU patches part 4: region ownership Paolo Bonzini
@ 2013-06-04 12:13 ` Paolo Bonzini
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 02/17] memory: add ref/unref Paolo Bonzini
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 12:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Whenever memory regions are accessed outside the BQL, they need to be
preserved against hot-unplug.  MemoryRegions actually do not have their
own reference count; they piggyback on a QOM object, their "owner".
Add two functions to retrieve and specify the owner.

The setter function will affect the owner recursively on a whole tree
of contained regions, but without crossing (a) aliases (b) regions that
are already owned by another device.  This is so that a device can create
a complex tree of regions and a single call to memory_region_set_owner
will affect the entire tree.

In turn, this lets buses (usually through a bus-specific function, e.g.
pci_register_bar) set the owner for regions that are managed by the bus.
The device must set the owner itself only if the device plays directly
with address_space_memory/io (which shouldn't happen except in special
cases) or if regions are added/deleted after passing the container to
the bus (for example dynamically while the device runs).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/exec/memory.h | 36 ++++++++++++++++++++++++++++++++++++
 memory.c              | 45 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 81 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 3598c4f..e51f30f 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -150,6 +150,7 @@ struct MemoryRegion {
     const MemoryRegionIOMMUOps *iommu_ops;
     void *opaque;
     MemoryRegion *parent;
+    struct Object *owner;
     Int128 size;
     hwaddr addr;
     void (*destructor)(MemoryRegion *mr);
@@ -388,6 +389,41 @@ void memory_region_init_iommu(MemoryRegion *mr,
 void memory_region_destroy(MemoryRegion *mr);
 
 /**
+ * memory_region_owner: get a memory region's owner.
+ *
+ * @mr: the memory region being queried.
+ */
+struct Object *memory_region_owner(MemoryRegion *mr);
+
+/**
+ * memory_region_set_owner: set the owner for a memory region and all
+ * the unowned regions below it.
+ *
+ * The owner of a region is an object that must be preserved together
+ * with the region itself while the region is being accessed.  This
+ * is useful whenever a region is accessed while the big QEMU lock is
+ * not held, even in the simplest case of accessing RAM from
+ * asynchronous block device I/O.
+ *
+ * This function will affect the owner recursively on a whole tree
+ * of contained regions (not aliases), but without crossing regions that
+ * are already owned by another device.  This is so that a device can create
+ * a complex tree of regions and a single call to memory_region_set_owner
+ * will affect the entire tree.
+ *
+ * This function will usually be called through a bus-specific function, e.g.
+ * pci_register_bar or sysbus_init_mmio.  The device must set the owner itself
+ * only if it uses memory_region_add_subregion directly on some address space,
+ * or after the parent region is passed to the bus (for example dynamically
+ * while the device runs).
+ *
+ * @mr: the memory region being set.
+ * @owner: the object that acts as the owner
+ */
+void memory_region_set_owner(MemoryRegion *mr,
+                             struct Object *owner);
+
+/**
  * memory_region_size: get a memory region's size.
  *
  * @mr: the memory region being queried.
diff --git a/memory.c b/memory.c
index c500d8d..b40cdde 100644
--- a/memory.c
+++ b/memory.c
@@ -823,6 +823,7 @@ void memory_region_init(MemoryRegion *mr,
     mr->opaque = NULL;
     mr->iommu_ops = NULL;
     mr->parent = NULL;
+    mr->owner = NULL;
     mr->size = int128_make64(size);
     if (size == UINT64_MAX) {
         mr->size = int128_2_64();
@@ -1089,6 +1090,50 @@ void memory_region_destroy(MemoryRegion *mr)
     g_free(mr->ioeventfds);
 }
 
+Object *memory_region_owner(MemoryRegion *mr)
+{
+    return mr->owner;
+}
+
+void memory_region_set_owner(MemoryRegion *mr,
+                             Object *owner)
+{
+    MemoryRegion *child;
+    Object *old_owner;
+
+    old_owner = mr->owner;
+    assert(old_owner == NULL || old_owner == owner);
+
+    if (owner != NULL && old_owner == NULL) {
+        object_ref(owner);
+    }
+    mr->owner = owner;
+
+    QTAILQ_FOREACH(child, &mr->subregions, subregions_link) {
+        Object *child_owner = child->owner;
+        if (child_owner == NULL || child_owner == owner) {
+            /* Balance the reference that would have been added in
+             * memory_region_add_subregion.  Same below for
+             * memory_region_del_subregion.
+             */
+            if (owner != NULL && child_owner == NULL) {
+                memory_region_ref(child);
+            }
+            memory_region_set_owner(child, owner);
+            if (owner == NULL && child_owner != NULL) {
+                memory_region_unref(child);
+            }
+        }
+    }
+
+    /* Do not unref until all child regions have been processed,
+     * or the old owner might disappear.
+     */
+    if (owner == NULL && old_owner != NULL) {
+        object_unref(old_owner);
+    }
+}
+
 uint64_t memory_region_size(MemoryRegion *mr)
 {
     if (int128_eq(mr->size, int128_2_64())) {
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v2 02/17] memory: add ref/unref
  2013-06-04 12:13 [Qemu-devel] [PATCH v2 00/17] Memory/IOMMU patches part 4: region ownership Paolo Bonzini
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 01/17] memory: add getter/setter for owner Paolo Bonzini
@ 2013-06-04 12:13 ` Paolo Bonzini
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 03/17] memory: add ref/unref calls Paolo Bonzini
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 12:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/exec/memory.h | 30 ++++++++++++++++++++++++++++++
 memory.c              | 14 ++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index e51f30f..bfcdf65 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -268,6 +268,36 @@ struct MemoryListener {
 void memory_region_init(MemoryRegion *mr,
                         const char *name,
                         uint64_t size);
+
+/**
+ * memory_region_ref: Add 1 to a memory region's reference count
+ *
+ * Whenever memory regions are accessed outside the BQL, they need to be
+ * preserved against hot-unplug.  MemoryRegions actually do not have their
+ * own reference count; they piggyback on a QOM object, their "owner".
+ * This function adds a reference to the owner.
+ *
+ * All MemoryRegions must have an owner if they can disappear, even if the
+ * device they belong to operates exclusively under the BQL.  This is because
+ * the region could be returned at any time by memory_region_find, and this
+ * is usually under guest control.
+ *
+ * @mr: the #MemoryRegion
+ */
+void memory_region_ref(MemoryRegion *mr);
+
+/**
+ * memory_region_unref: Remove 1 to a memory region's reference count
+ *
+ * Whenever memory regions are accessed outside the BQL, they need to be
+ * preserved against hot-unplug.  MemoryRegions actually do not have their
+ * own reference count; they piggyback on a QOM object, their "owner".
+ * This function removes a reference to the owner and possibly destroys it.
+ *
+ * @mr: the #MemoryRegion
+ */
+void memory_region_unref(MemoryRegion *mr);
+
 /**
  * memory_region_init_io: Initialize an I/O memory region.
  *
diff --git a/memory.c b/memory.c
index b40cdde..a136a77 100644
--- a/memory.c
+++ b/memory.c
@@ -1134,6 +1134,20 @@ void memory_region_set_owner(MemoryRegion *mr,
     }
 }
 
+void memory_region_ref(MemoryRegion *mr)
+{
+    if (mr && mr->owner) {
+        object_ref(mr->owner);
+    }
+}
+
+void memory_region_unref(MemoryRegion *mr)
+{
+    if (mr && mr->owner) {
+        object_unref(mr->owner);
+    }
+}
+
 uint64_t memory_region_size(MemoryRegion *mr)
 {
     if (int128_eq(mr->size, int128_2_64())) {
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v2 03/17] memory: add ref/unref calls
  2013-06-04 12:13 [Qemu-devel] [PATCH v2 00/17] Memory/IOMMU patches part 4: region ownership Paolo Bonzini
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 01/17] memory: add getter/setter for owner Paolo Bonzini
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 02/17] memory: add ref/unref Paolo Bonzini
@ 2013-06-04 12:13 ` Paolo Bonzini
  2013-06-13  6:28   ` Alexey Kardashevskiy
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 04/17] exec: add a reference to the region returned by address_space_translate Paolo Bonzini
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 12:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Add ref/unref calls at the following places:

- places where memory regions are stashed by a listener and
  used outside the BQL (including in Xen or KVM).

- memory_region_find callsites

- creation of aliases and containers (only the aliased/contained
  region gets a reference to avoid loops)

- around calls to del_subregion/add_subregion, where the region
  could disappear after the first call

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c                                |  6 +++++-
 hw/core/loader.c                      |  1 +
 hw/display/exynos4210_fimd.c          |  6 ++++++
 hw/display/framebuffer.c              | 12 +++++++-----
 hw/i386/kvmvapic.c                    |  1 +
 hw/misc/vfio.c                        |  2 ++
 hw/virtio/dataplane/hostmem.c         |  7 +++++++
 hw/virtio/vhost.c                     |  2 ++
 hw/virtio/virtio-balloon.c            |  1 +
 hw/xen/xen_pt.c                       |  4 ++++
 include/hw/virtio/dataplane/hostmem.h |  1 +
 kvm-all.c                             |  2 ++
 memory.c                              | 20 ++++++++++++++++++++
 target-arm/kvm.c                      |  2 ++
 target-sparc/mmu_helper.c             |  1 +
 xen-all.c                             |  2 ++
 16 files changed, 64 insertions(+), 6 deletions(-)

diff --git a/exec.c b/exec.c
index 8909478..ba50e8d 100644
--- a/exec.c
+++ b/exec.c
@@ -815,12 +815,16 @@ static uint16_t phys_section_add(MemoryRegionSection *section)
                                 phys_sections_nb_alloc);
     }
     phys_sections[phys_sections_nb] = *section;
+    memory_region_ref(section->mr);
     return phys_sections_nb++;
 }
 
 static void phys_sections_clear(void)
 {
-    phys_sections_nb = 0;
+    while (phys_sections_nb > 0) {
+        MemoryRegionSection *section = &phys_sections[--phys_sections_nb];
+        memory_region_unref(section->mr);
+    }
 }
 
 static void register_subpage(AddressSpaceDispatch *d, MemoryRegionSection *section)
diff --git a/hw/core/loader.c b/hw/core/loader.c
index 3a60cbe..44d8714 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -727,6 +727,7 @@ int rom_load_all(void)
         addr += rom->romsize;
         section = memory_region_find(get_system_memory(), rom->addr, 1);
         rom->isrom = int128_nz(section.size) && memory_region_is_rom(section.mr);
+        memory_region_unref(section.mr);
     }
     qemu_register_reset(rom_reset, NULL);
     roms_loaded = 1;
diff --git a/hw/display/exynos4210_fimd.c b/hw/display/exynos4210_fimd.c
index 0da00a9..f44c4a6 100644
--- a/hw/display/exynos4210_fimd.c
+++ b/hw/display/exynos4210_fimd.c
@@ -1126,6 +1126,11 @@ static void fimd_update_memory_section(Exynos4210fimdState *s, unsigned win)
     /* Total number of bytes of virtual screen used by current window */
     w->fb_len = fb_mapped_len = (w->virtpage_width + w->virtpage_offsize) *
             (w->rightbot_y - w->lefttop_y + 1);
+
+    /* TODO: add .exit and unref the region there.  Not needed yet since sysbus
+     * does not support hot-unplug.
+     */
+    memory_region_unref(w->mem_section.mr);
     w->mem_section = memory_region_find(sysbus_address_space(&s->busdev),
             fb_start_addr, w->fb_len);
     assert(w->mem_section.mr);
@@ -1154,6 +1159,7 @@ static void fimd_update_memory_section(Exynos4210fimdState *s, unsigned win)
     return;
 
 error_return:
+    memory_region_unref(w->mem_section.mr);
     w->mem_section.mr = NULL;
     w->mem_section.size = int128_zero();
     w->host_fb_addr = NULL;
diff --git a/hw/display/framebuffer.c b/hw/display/framebuffer.c
index 49c9e59..4546e42 100644
--- a/hw/display/framebuffer.c
+++ b/hw/display/framebuffer.c
@@ -54,11 +54,11 @@ void framebuffer_update_display(
     src_len = src_width * rows;
 
     mem_section = memory_region_find(address_space, base, src_len);
+    mem = mem_section.mr;
     if (int128_get64(mem_section.size) != src_len ||
             !memory_region_is_ram(mem_section.mr)) {
-        return;
+        goto out;
     }
-    mem = mem_section.mr;
     assert(mem);
     assert(mem_section.offset_within_address_space == base);
 
@@ -68,10 +68,10 @@ void framebuffer_update_display(
        but it's not really worth it as dirty flag tracking will probably
        already have failed above.  */
     if (!src_base)
-        return;
+        goto out;
     if (src_len != src_width * rows) {
         cpu_physical_memory_unmap(src_base, src_len, 0, 0);
-        return;
+        goto out;
     }
     src = src_base;
     dest = surface_data(ds);
@@ -102,10 +102,12 @@ void framebuffer_update_display(
     }
     cpu_physical_memory_unmap(src_base, src_len, 0, 0);
     if (first < 0) {
-        return;
+        goto out;
     }
     memory_region_reset_dirty(mem, mem_section.offset_within_region, src_len,
                               DIRTY_MEMORY_VGA);
     *first_row = first;
     *last_row = last;
+out:
+    memory_region_unref(mem);
 }
diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
index 655483b..e375c1c 100644
--- a/hw/i386/kvmvapic.c
+++ b/hw/i386/kvmvapic.c
@@ -605,6 +605,7 @@ static void vapic_map_rom_writable(VAPICROMState *s)
                              rom_size);
     memory_region_add_subregion_overlap(as, rom_paddr, &s->rom, 1000);
     s->rom_mapped_writable = true;
+    memory_region_unref(section.mr);
 }
 
 static int vapic_prepare(VAPICROMState *s)
diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index 52fb036..a1f5803 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -1969,6 +1969,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
     DPRINTF("region_add %"HWADDR_PRIx" - %"HWADDR_PRIx" [%p]\n",
             iova, end - 1, vaddr);
 
+    memory_region_ref(section->mr);
     ret = vfio_dma_map(container, iova, end - iova, vaddr, section->readonly);
     if (ret) {
         error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
@@ -2010,6 +2011,7 @@ static void vfio_listener_region_del(MemoryListener *listener,
             iova, end - 1);
 
     ret = vfio_dma_unmap(container, iova, end - iova);
+    memory_region_unref(section->mr);
     if (ret) {
         error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
                      "0x%"HWADDR_PRIx") = %d (%m)",
diff --git a/hw/virtio/dataplane/hostmem.c b/hw/virtio/dataplane/hostmem.c
index 7e46723..901d98b 100644
--- a/hw/virtio/dataplane/hostmem.c
+++ b/hw/virtio/dataplane/hostmem.c
@@ -64,8 +64,12 @@ out:
 static void hostmem_listener_commit(MemoryListener *listener)
 {
     HostMem *hostmem = container_of(listener, HostMem, listener);
+    int i;
 
     qemu_mutex_lock(&hostmem->current_regions_lock);
+    for (i = 0; i < hostmem->num_current_regions; i++) {
+        memory_region_unref(hostmem->current_regions[i].mr);
+    }
     g_free(hostmem->current_regions);
     hostmem->current_regions = hostmem->new_regions;
     hostmem->num_current_regions = hostmem->num_new_regions;
@@ -92,8 +96,11 @@ static void hostmem_append_new_region(HostMem *hostmem,
         .guest_addr = section->offset_within_address_space,
         .size = int128_get64(section->size),
         .readonly = section->readonly,
+        .mr = section->mr,
     };
     hostmem->num_new_regions++;
+
+    memory_region_ref(section->mr);
 }
 
 static void hostmem_listener_append_region(MemoryListener *listener,
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index baf84ea..96ab625 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -497,6 +497,7 @@ static void vhost_region_add(MemoryListener *listener,
     dev->mem_sections = g_renew(MemoryRegionSection, dev->mem_sections,
                                 dev->n_mem_sections);
     dev->mem_sections[dev->n_mem_sections - 1] = *section;
+    memory_region_ref(section->mr);
     vhost_set_memory(listener, section, true);
 }
 
@@ -512,6 +513,7 @@ static void vhost_region_del(MemoryListener *listener,
     }
 
     vhost_set_memory(listener, section, false);
+    memory_region_unref(section->mr);
     for (i = 0; i < dev->n_mem_sections; ++i) {
         if (dev->mem_sections[i].offset_within_address_space
             == section->offset_within_address_space) {
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index a27051c..3fa72a9 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -205,6 +205,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
             addr = section.offset_within_region;
             balloon_page(memory_region_get_ram_ptr(section.mr) + addr,
                          !!(vq == s->dvq));
+            memory_region_unref(section.mr);
         }
 
         virtqueue_push(vq, &elem, offset);
diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index c199818..be1fd52 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -606,6 +606,7 @@ static void xen_pt_region_add(MemoryListener *l, MemoryRegionSection *sec)
     XenPCIPassthroughState *s = container_of(l, XenPCIPassthroughState,
                                              memory_listener);
 
+    memory_region_ref(sec->mr);
     xen_pt_region_update(s, sec, true);
 }
 
@@ -615,6 +616,7 @@ static void xen_pt_region_del(MemoryListener *l, MemoryRegionSection *sec)
                                              memory_listener);
 
     xen_pt_region_update(s, sec, false);
+    memory_region_unref(sec->mr);
 }
 
 static void xen_pt_io_region_add(MemoryListener *l, MemoryRegionSection *sec)
@@ -622,6 +624,7 @@ static void xen_pt_io_region_add(MemoryListener *l, MemoryRegionSection *sec)
     XenPCIPassthroughState *s = container_of(l, XenPCIPassthroughState,
                                              io_listener);
 
+    memory_region_ref(sec->mr);
     xen_pt_region_update(s, sec, true);
 }
 
@@ -631,6 +634,7 @@ static void xen_pt_io_region_del(MemoryListener *l, MemoryRegionSection *sec)
                                              io_listener);
 
     xen_pt_region_update(s, sec, false);
+    memory_region_unref(sec->mr);
 }
 
 static const MemoryListener xen_pt_memory_listener = {
diff --git a/include/hw/virtio/dataplane/hostmem.h b/include/hw/virtio/dataplane/hostmem.h
index b2cf093..2810f4b 100644
--- a/include/hw/virtio/dataplane/hostmem.h
+++ b/include/hw/virtio/dataplane/hostmem.h
@@ -18,6 +18,7 @@
 #include "qemu/thread.h"
 
 typedef struct {
+    MemoryRegion *mr;
     void *host_addr;
     hwaddr guest_addr;
     uint64_t size;
diff --git a/kvm-all.c b/kvm-all.c
index e6b262f..91aa7ff 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -788,6 +788,7 @@ static void kvm_set_phys_mem(MemoryRegionSection *section, bool add)
 static void kvm_region_add(MemoryListener *listener,
                            MemoryRegionSection *section)
 {
+    memory_region_ref(section->mr);
     kvm_set_phys_mem(section, true);
 }
 
@@ -795,6 +796,7 @@ static void kvm_region_del(MemoryListener *listener,
                            MemoryRegionSection *section)
 {
     kvm_set_phys_mem(section, false);
+    memory_region_unref(section->mr);
 }
 
 static void kvm_log_sync(MemoryListener *listener,
diff --git a/memory.c b/memory.c
index a136a77..cfce42b 100644
--- a/memory.c
+++ b/memory.c
@@ -147,6 +147,7 @@ static bool memory_listener_match(MemoryListener *listener,
         }                                                               \
     } while (0)
 
+/* No need to ref/unref .mr, the FlatRange keeps it alive.  */
 #define MEMORY_LISTENER_UPDATE_REGION(fr, as, dir, callback)            \
     MEMORY_LISTENER_CALL(callback, dir, (&(MemoryRegionSection) {       \
         .mr = (fr)->mr,                                                 \
@@ -262,11 +263,17 @@ static void flatview_insert(FlatView *view, unsigned pos, FlatRange *range)
     memmove(view->ranges + pos + 1, view->ranges + pos,
             (view->nr - pos) * sizeof(FlatRange));
     view->ranges[pos] = *range;
+    memory_region_ref(range->mr);
     ++view->nr;
 }
 
 static void flatview_destroy(FlatView *view)
 {
+    int i;
+
+    for (i = 0; i < view->nr; i++) {
+        memory_region_unref(view->ranges[i].mr);
+    }
     g_free(view->ranges);
 }
 
@@ -796,6 +803,11 @@ static void memory_region_destructor_ram(MemoryRegion *mr)
     qemu_ram_free(mr->ram_addr);
 }
 
+static void memory_region_destructor_alias(MemoryRegion *mr)
+{
+    memory_region_unref(mr->alias);
+}
+
 static void memory_region_destructor_ram_from_ptr(MemoryRegion *mr)
 {
     qemu_ram_free_from_ptr(mr->ram_addr);
@@ -1043,6 +1055,8 @@ void memory_region_init_alias(MemoryRegion *mr,
                               uint64_t size)
 {
     memory_region_init(mr, name, size);
+    memory_region_ref(orig);
+    mr->destructor = memory_region_destructor_alias;
     mr->alias = orig;
     mr->alias_offset = offset;
 }
@@ -1457,6 +1471,7 @@ static void memory_region_add_subregion_common(MemoryRegion *mr,
     memory_region_transaction_begin();
 
     assert(!subregion->parent);
+    memory_region_ref(subregion);
     subregion->parent = mr;
     subregion->addr = offset;
     QTAILQ_FOREACH(other, &mr->subregions, subregions_link) {
@@ -1519,6 +1534,7 @@ void memory_region_del_subregion(MemoryRegion *mr,
     assert(subregion->parent == mr);
     subregion->parent = NULL;
     QTAILQ_REMOVE(&mr->subregions, subregion, subregions_link);
+    memory_region_unref(subregion);
     memory_region_update_pending |= mr->enabled && subregion->enabled;
     memory_region_transaction_commit();
 }
@@ -1546,12 +1562,14 @@ void memory_region_set_address(MemoryRegion *mr, hwaddr addr)
     }
 
     memory_region_transaction_begin();
+    memory_region_ref(mr);
     memory_region_del_subregion(parent, mr);
     if (may_overlap) {
         memory_region_add_subregion_overlap(parent, addr, mr, priority);
     } else {
         memory_region_add_subregion(parent, addr, mr);
     }
+    memory_region_unref(mr);
     memory_region_transaction_commit();
 }
 
@@ -1629,6 +1647,8 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr,
     ret.size = range.size;
     ret.offset_within_address_space = int128_get64(range.start);
     ret.readonly = fr->readonly;
+    memory_region_ref(ret.mr);
+
     return ret;
 }
 
diff --git a/target-arm/kvm.c b/target-arm/kvm.c
index b7bdc03..b9051a4 100644
--- a/target-arm/kvm.c
+++ b/target-arm/kvm.c
@@ -127,6 +127,7 @@ static void kvm_arm_machine_init_done(Notifier *notifier, void *data)
                 abort();
             }
         }
+        memory_region_unref(kd->mr);
         g_free(kd);
     }
 }
@@ -152,6 +153,7 @@ void kvm_arm_register_device(MemoryRegion *mr, uint64_t devid)
     kd->kda.id = devid;
     kd->kda.addr = -1;
     QSLIST_INSERT_HEAD(&kvm_devices_head, kd, entries);
+    memory_region_ref(kd->mr);
 }
 
 typedef struct Reg {
diff --git a/target-sparc/mmu_helper.c b/target-sparc/mmu_helper.c
index 3983c96..740cbe8 100644
--- a/target-sparc/mmu_helper.c
+++ b/target-sparc/mmu_helper.c
@@ -845,6 +845,7 @@ hwaddr cpu_get_phys_page_debug(CPUSPARCState *env, target_ulong addr)
         }
     }
     section = memory_region_find(get_system_memory(), phys_addr, 1);
+    memory_region_unref(section.mr);
     if (!int128_nz(section.size)) {
         return -1;
     }
diff --git a/xen-all.c b/xen-all.c
index cd520b1..764741a 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -459,6 +459,7 @@ static void xen_set_memory(struct MemoryListener *listener,
 static void xen_region_add(MemoryListener *listener,
                            MemoryRegionSection *section)
 {
+    memory_region_ref(section->mr);
     xen_set_memory(listener, section, true);
 }
 
@@ -466,6 +467,7 @@ static void xen_region_del(MemoryListener *listener,
                            MemoryRegionSection *section)
 {
     xen_set_memory(listener, section, false);
+    memory_region_unref(section->mr);
 }
 
 static void xen_sync_dirty_bitmap(XenIOState *state,
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v2 04/17] exec: add a reference to the region returned by address_space_translate
  2013-06-04 12:13 [Qemu-devel] [PATCH v2 00/17] Memory/IOMMU patches part 4: region ownership Paolo Bonzini
                   ` (2 preceding siblings ...)
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 03/17] memory: add ref/unref calls Paolo Bonzini
@ 2013-06-04 12:13 ` Paolo Bonzini
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 05/17] pci: set owner for BARs Paolo Bonzini
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 12:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Once address_space_translate will only be protected by RCU, the returned
MemoryRegion can disappear as soon as the RCU read-side critical
section ends.  Avoid this by adding a reference to the region, and
dropping it in the caller of address_space_translate.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c                | 16 ++++++++++++++--
 include/exec/memory.h |  3 ++-
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/exec.c b/exec.c
index ba50e8d..bf287cb 100644
--- a/exec.c
+++ b/exec.c
@@ -292,6 +292,7 @@ MemoryRegion *address_space_translate(AddressSpace *as, hwaddr addr,
 
     *plen = len;
     *xlat = addr;
+    memory_region_ref(mr);
     return mr;
 }
 
@@ -1994,6 +1995,7 @@ bool address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
                 memcpy(buf, ptr, l);
             }
         }
+        memory_region_unref(mr);
         len -= l;
         buf += l;
         addr += l;
@@ -2159,8 +2161,10 @@ void *address_space_map(AddressSpace *as,
             raddr = memory_region_get_ram_addr(mr) + xlat;
         } else {
             if (memory_region_get_ram_addr(mr) + xlat != raddr + todo) {
+                memory_region_unref(mr);
                 break;
             }
+            memory_region_unref(mr);
         }
 
         len -= l;
@@ -2260,6 +2264,7 @@ static inline uint32_t ldl_phys_internal(hwaddr addr,
             break;
         }
     }
+    memory_region_unref(mr);
     return val;
 }
 
@@ -2319,6 +2324,7 @@ static inline uint64_t ldq_phys_internal(hwaddr addr,
             break;
         }
     }
+    memory_region_unref(mr);
     return val;
 }
 
@@ -2386,6 +2392,7 @@ static inline uint32_t lduw_phys_internal(hwaddr addr,
             break;
         }
     }
+    memory_region_unref(mr);
     return val;
 }
 
@@ -2433,6 +2440,7 @@ void stl_phys_notdirty(hwaddr addr, uint32_t val)
             }
         }
     }
+    memory_region_unref(mr);
 }
 
 /* warning: addr must be aligned */
@@ -2474,6 +2482,7 @@ static inline void stl_phys_internal(hwaddr addr, uint32_t val,
         }
         invalidate_and_set_dirty(addr1, 4);
     }
+    memory_region_unref(mr);
 }
 
 void stl_phys(hwaddr addr, uint32_t val)
@@ -2537,6 +2546,7 @@ static inline void stw_phys_internal(hwaddr addr, uint32_t val,
         }
         invalidate_and_set_dirty(addr1, 2);
     }
+    memory_region_unref(mr);
 }
 
 void stw_phys(hwaddr addr, uint32_t val)
@@ -2626,11 +2636,13 @@ bool cpu_physical_memory_is_io(hwaddr phys_addr)
 {
     MemoryRegion*mr;
     hwaddr l = 1;
+    bool res;
 
     mr = address_space_translate(&address_space_memory,
                                  phys_addr, &phys_addr, &l, false);
 
-    return !(memory_region_is_ram(mr) ||
-             memory_region_is_romd(mr));
+    res = !(memory_region_is_ram(mr) || memory_region_is_romd(mr));
+    memory_region_unref(mr);
+    return res;
 }
 #endif
diff --git a/include/exec/memory.h b/include/exec/memory.h
index bfcdf65..a2546b7 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1015,7 +1015,8 @@ bool address_space_write(AddressSpace *as, hwaddr addr,
 bool address_space_read(AddressSpace *as, hwaddr addr, uint8_t *buf, int len);
 
 /* address_space_translate: translate an address range into an address space
- * into a MemoryRegion and an address range into that section
+ * into a MemoryRegion and an address range into that section.  Add a reference
+ * to that region.
  *
  * @as: #AddressSpace to be accessed
  * @addr: address within that address space
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v2 05/17] pci: set owner for BARs
  2013-06-04 12:13 [Qemu-devel] [PATCH v2 00/17] Memory/IOMMU patches part 4: region ownership Paolo Bonzini
                   ` (3 preceding siblings ...)
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 04/17] exec: add a reference to the region returned by address_space_translate Paolo Bonzini
@ 2013-06-04 12:13 ` Paolo Bonzini
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 06/17] sysbus: add sysbus_pass_mmio Paolo Bonzini
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 12:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/pci/pci.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 26851ac..776ad96 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -913,6 +913,8 @@ void pci_register_bar(PCIDevice *pci_dev, int region_num,
     uint64_t wmask;
     pcibus_t size = memory_region_size(memory);
 
+    memory_region_set_owner(memory, OBJECT(pci_dev));
+
     assert(region_num >= 0);
     assert(region_num < PCI_NUM_REGIONS);
     if (size & (size-1)) {
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v2 06/17] sysbus: add sysbus_pass_mmio
  2013-06-04 12:13 [Qemu-devel] [PATCH v2 00/17] Memory/IOMMU patches part 4: region ownership Paolo Bonzini
                   ` (4 preceding siblings ...)
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 05/17] pci: set owner for BARs Paolo Bonzini
@ 2013-06-04 12:13 ` Paolo Bonzini
  2013-06-04 12:24   ` Peter Maydell
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 07/17] sysbus: set owner for MMIO regions Paolo Bonzini
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 12:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

This matches sysbus_pass_irq in cases where a device is a thin wrapper
of another.  MMIO regions will keep the subdevice as the owner.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/core/sysbus.c     | 12 ++++++++++++
 hw/cpu/arm11mpcore.c |  2 +-
 include/hw/sysbus.h  |  1 +
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/hw/core/sysbus.c b/hw/core/sysbus.c
index 9004d8c..6dbd1f8 100644
--- a/hw/core/sysbus.c
+++ b/hw/core/sysbus.c
@@ -117,6 +117,18 @@ void sysbus_init_mmio(SysBusDevice *dev, MemoryRegion *memory)
     dev->mmio[n].memory = memory;
 }
 
+/* Pass MMIOs from a target device.  */
+void sysbus_pass_mmio(SysBusDevice *dev, SysBusDevice *target)
+{
+    int i;
+    assert(dev->num_mmio == 0);
+    dev->num_mmio = target->num_mmio;
+    for (i = 0; i < dev->num_mmio; i++) {
+        assert(target->mmio[i].addr == -1);
+        dev->mmio[i] = target->mmio[i];
+    }
+}
+
 MemoryRegion *sysbus_mmio_get_region(SysBusDevice *dev, int n)
 {
     return dev->mmio[n].memory;
diff --git a/hw/cpu/arm11mpcore.c b/hw/cpu/arm11mpcore.c
index 90dcead..cc885d1 100644
--- a/hw/cpu/arm11mpcore.c
+++ b/hw/cpu/arm11mpcore.c
@@ -213,7 +213,7 @@ static int realview_mpcore_init(SysBusDevice *dev)
         }
     }
     qdev_init_gpio_in(&dev->qdev, mpcore_rirq_set_irq, 64);
-    sysbus_init_mmio(dev, sysbus_mmio_get_region(s->priv, 0));
+    sysbus_pass_mmio(dev, s->priv);
     return 0;
 }
 
diff --git a/include/hw/sysbus.h b/include/hw/sysbus.h
index 7c2e316..0639343 100644
--- a/include/hw/sysbus.h
+++ b/include/hw/sysbus.h
@@ -48,6 +48,7 @@ struct SysBusDevice {
 
 void *sysbus_new(void);
 void sysbus_init_mmio(SysBusDevice *dev, MemoryRegion *memory);
+void sysbus_pass_mmio(SysBusDevice *dev, SysBusDevice *target);
 MemoryRegion *sysbus_mmio_get_region(SysBusDevice *dev, int n);
 void sysbus_init_irq(SysBusDevice *dev, qemu_irq *p);
 void sysbus_pass_irq(SysBusDevice *dev, SysBusDevice *target);
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v2 07/17] sysbus: set owner for MMIO regions
  2013-06-04 12:13 [Qemu-devel] [PATCH v2 00/17] Memory/IOMMU patches part 4: region ownership Paolo Bonzini
                   ` (5 preceding siblings ...)
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 06/17] sysbus: add sysbus_pass_mmio Paolo Bonzini
@ 2013-06-04 12:13 ` Paolo Bonzini
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 08/17] acpi: add memory_region_set_owner calls Paolo Bonzini
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 12:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/core/sysbus.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/core/sysbus.c b/hw/core/sysbus.c
index 6dbd1f8..e54f1fc 100644
--- a/hw/core/sysbus.c
+++ b/hw/core/sysbus.c
@@ -115,6 +115,8 @@ void sysbus_init_mmio(SysBusDevice *dev, MemoryRegion *memory)
     n = dev->num_mmio++;
     dev->mmio[n].addr = -1;
     dev->mmio[n].memory = memory;
+
+    memory_region_set_owner(dev->mmio[n].memory, OBJECT(dev));
 }
 
 /* Pass MMIOs from a target device.  */
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v2 08/17] acpi: add memory_region_set_owner calls
  2013-06-04 12:13 [Qemu-devel] [PATCH v2 00/17] Memory/IOMMU patches part 4: region ownership Paolo Bonzini
                   ` (6 preceding siblings ...)
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 07/17] sysbus: set owner for MMIO regions Paolo Bonzini
@ 2013-06-04 12:13 ` Paolo Bonzini
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 09/17] misc: " Paolo Bonzini
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 12:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

ACPI regions are added directly to the I/O address space, without
going through BARs.  Thus they need the owner to be set directly.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/acpi/ich9.c  | 1 +
 hw/acpi/piix4.c | 5 +++++
 hw/isa/apm.c    | 1 +
 3 files changed, 7 insertions(+)

diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index 4a17f32..0b19864 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -223,6 +223,7 @@ void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs *pm,
                           8);
     memory_region_add_subregion(&pm->io, ICH9_PMIO_SMI_EN, &pm->io_smi);
 
+    memory_region_set_owner(&pm->io, OBJECT(lpc_pci));
     pm->irq = sci_irq;
     qemu_register_reset(pm_reset, pm);
     pm->powerdown_notifier.notify = pm_powerdown_req;
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index c4af1cc..d097592 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -433,6 +433,8 @@ static int piix4_pm_initfn(PCIDevice *dev)
     acpi_pm1_cnt_init(&s->ar, &s->io, s->s4_val);
     acpi_gpe_init(&s->ar, GPE_LEN);
 
+    memory_region_set_owner(&s->io, OBJECT(s));
+
     s->powerdown_notifier.notify = piix4_pm_powerdown_req;
     qemu_register_powerdown_notifier(&s->powerdown_notifier);
 
@@ -672,10 +674,12 @@ static void piix4_acpi_system_hot_add_init(MemoryRegion *parent,
 {
     memory_region_init_io(&s->io_gpe, &piix4_gpe_ops, s, "apci-gpe0",
                           GPE_LEN);
+    memory_region_set_owner(&s->io_gpe, OBJECT(s));
     memory_region_add_subregion(parent, GPE_BASE, &s->io_gpe);
 
     memory_region_init_io(&s->io_pci, &piix4_pci_ops, s, "apci-pci-hotplug",
                           PCI_HOTPLUG_SIZE);
+    memory_region_set_owner(&s->io_pci, OBJECT(s));
     memory_region_add_subregion(parent, PCI_HOTPLUG_ADDR,
                                 &s->io_pci);
     pci_bus_hotplug(bus, piix4_device_hotplug, &s->dev.qdev);
@@ -683,6 +687,7 @@ static void piix4_acpi_system_hot_add_init(MemoryRegion *parent,
     qemu_for_each_cpu(piix4_init_cpu_status, &s->gpe_cpu);
     memory_region_init_io(&s->io_cpu, &cpu_hotplug_ops, s, "apci-cpu-hotplug",
                           PIIX4_PROC_LEN);
+    memory_region_set_owner(&s->io_cpu, OBJECT(s));
     memory_region_add_subregion(parent, PIIX4_PROC_BASE, &s->io_cpu);
     s->cpu_added_notifier.notify = piix4_cpu_added_req;
     qemu_register_cpu_added_notifier(&s->cpu_added_notifier);
diff --git a/hw/isa/apm.c b/hw/isa/apm.c
index 5f21d21..376c564 100644
--- a/hw/isa/apm.c
+++ b/hw/isa/apm.c
@@ -97,6 +97,7 @@ void apm_init(PCIDevice *dev, APMState *apm, apm_ctrl_changed_t callback,
 
     /* ioport 0xb2, 0xb3 */
     memory_region_init_io(&apm->io, &apm_ops, apm, "apm-io", 2);
+    memory_region_set_owner(&apm->io, OBJECT(dev));
     memory_region_add_subregion(pci_address_space_io(dev), APM_CNT_IOPORT,
                                 &apm->io);
 }
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v2 09/17] misc: add memory_region_set_owner calls
  2013-06-04 12:13 [Qemu-devel] [PATCH v2 00/17] Memory/IOMMU patches part 4: region ownership Paolo Bonzini
                   ` (7 preceding siblings ...)
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 08/17] acpi: add memory_region_set_owner calls Paolo Bonzini
@ 2013-06-04 12:13 ` Paolo Bonzini
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 10/17] isa/portio: allow setting an owner Paolo Bonzini
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 12:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/char/serial-pci.c | 1 +
 hw/misc/pc-testdev.c | 7 +++++++
 2 files changed, 8 insertions(+)

diff --git a/hw/char/serial-pci.c b/hw/char/serial-pci.c
index 2138e35..6b6106b 100644
--- a/hw/char/serial-pci.c
+++ b/hw/char/serial-pci.c
@@ -106,6 +106,7 @@ static int multi_serial_pci_init(PCIDevice *dev)
         s->irq = pci->irqs[i];
         pci->name[i] = g_strdup_printf("uart #%d", i+1);
         memory_region_init_io(&s->io, &serial_io_ops, s, pci->name[i], 8);
+        memory_region_set_owner(&s->io, OBJECT(pci));
         memory_region_add_subregion(&pci->iobar, 8 * i, &s->io);
     }
     return 0;
diff --git a/hw/misc/pc-testdev.c b/hw/misc/pc-testdev.c
index 32df175..77998d6 100644
--- a/hw/misc/pc-testdev.c
+++ b/hw/misc/pc-testdev.c
@@ -150,12 +150,19 @@ static int init_test_device(ISADevice *isa)
 
     memory_region_init_io(&dev->ioport, &test_ioport_ops, dev,
                           "pc-testdev-ioport", 4);
+    memory_region_set_owner(&dev->ioport, OBJECT(dev));
+
     memory_region_init_io(&dev->flush, &test_flush_ops, dev,
                           "pc-testdev-flush-page", 4);
+    memory_region_set_owner(&dev->flush, OBJECT(dev));
+
     memory_region_init_io(&dev->irq, &test_irq_ops, dev,
                           "pc-testdev-irq-line", 24);
+    memory_region_set_owner(&dev->irq, OBJECT(dev));
+
     memory_region_init_io(&dev->iomem, &test_iomem_ops, dev,
                           "pc-testdev-iomem", IOMEM_LEN);
+    memory_region_set_owner(&dev->iomem, OBJECT(dev));
 
     memory_region_add_subregion(io,  0xe0,       &dev->ioport);
     memory_region_add_subregion(io,  0xe4,       &dev->flush);
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v2 10/17] isa/portio: allow setting an owner
  2013-06-04 12:13 [Qemu-devel] [PATCH v2 00/17] Memory/IOMMU patches part 4: region ownership Paolo Bonzini
                   ` (8 preceding siblings ...)
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 09/17] misc: " Paolo Bonzini
@ 2013-06-04 12:13 ` Paolo Bonzini
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 11/17] vga: add memory_region_set_owner calls Paolo Bonzini
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 12:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/isa/isa-bus.c      |  2 ++
 include/exec/ioport.h |  3 +++
 ioport.c              | 10 ++++++++++
 3 files changed, 15 insertions(+)

diff --git a/hw/isa/isa-bus.c b/hw/isa/isa-bus.c
index 7860b17..d263d0f 100644
--- a/hw/isa/isa-bus.c
+++ b/hw/isa/isa-bus.c
@@ -100,6 +100,7 @@ static inline void isa_init_ioport(ISADevice *dev, uint16_t ioport)
 
 void isa_register_ioport(ISADevice *dev, MemoryRegion *io, uint16_t start)
 {
+    memory_region_set_owner(io, OBJECT(dev));
     memory_region_add_subregion(isabus->address_space_io, start, io);
     isa_init_ioport(dev, start);
 }
@@ -116,6 +117,7 @@ void isa_register_portio_list(ISADevice *dev, uint16_t start,
     isa_init_ioport(dev, start);
 
     portio_list_init(piolist, pio_start, opaque, name);
+    portio_list_set_owner(piolist, OBJECT(dev));
     portio_list_add(piolist, isabus->address_space_io, start);
 }
 
diff --git a/include/exec/ioport.h b/include/exec/ioport.h
index fc28350..5fe0d99 100644
--- a/include/exec/ioport.h
+++ b/include/exec/ioport.h
@@ -62,6 +62,7 @@ typedef struct PortioList {
     unsigned nr;
     struct MemoryRegion **regions;
     struct MemoryRegion **aliases;
+    struct Object *owner;
     void *opaque;
     const char *name;
 } PortioList;
@@ -69,6 +70,8 @@ typedef struct PortioList {
 void portio_list_init(PortioList *piolist,
                       const struct MemoryRegionPortio *callbacks,
                       void *opaque, const char *name);
+void portio_list_set_owner(PortioList *piolist,
+                           struct Object *owner);
 void portio_list_destroy(PortioList *piolist);
 void portio_list_add(PortioList *piolist,
                      struct MemoryRegion *address_space,
diff --git a/ioport.c b/ioport.c
index a0ac2a0..1cccd70 100644
--- a/ioport.c
+++ b/ioport.c
@@ -347,6 +347,12 @@ void portio_list_init(PortioList *piolist,
     piolist->address_space = NULL;
     piolist->opaque = opaque;
     piolist->name = name;
+    piolist->owner = NULL;
+}
+
+void portio_list_set_owner(PortioList *piolist, Object *owner)
+{
+    piolist->owner = owner;
 }
 
 void portio_list_destroy(PortioList *piolist)
@@ -386,8 +392,12 @@ static void portio_list_add_1(PortioList *piolist,
      */
     memory_region_init_io(region, ops, piolist->opaque, piolist->name,
                           INT64_MAX);
+    memory_region_set_owner(region, piolist->owner);
+
     memory_region_init_alias(alias, piolist->name,
                              region, start + off_low, off_high - off_low);
+    memory_region_set_owner(alias, piolist->owner);
+
     memory_region_add_subregion(piolist->address_space,
                                 start + off_low, alias);
     piolist->regions[piolist->nr] = region;
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v2 11/17] vga: add memory_region_set_owner calls
  2013-06-04 12:13 [Qemu-devel] [PATCH v2 00/17] Memory/IOMMU patches part 4: region ownership Paolo Bonzini
                   ` (9 preceding siblings ...)
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 10/17] isa/portio: allow setting an owner Paolo Bonzini
@ 2013-06-04 12:13 ` Paolo Bonzini
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 12/17] pci-assign: " Paolo Bonzini
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 12:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/display/cirrus_vga.c | 19 ++++++++++++++-----
 hw/display/qxl.c        |  6 ++++--
 hw/display/vga-isa-mm.c |  2 +-
 hw/display/vga-isa.c    |  4 ++--
 hw/display/vga-pci.c    |  5 +++--
 hw/display/vga.c        | 19 ++++++++++++++-----
 hw/display/vga_int.h    |  9 +++++----
 hw/display/vmware_vga.c |  4 ++--
 8 files changed, 45 insertions(+), 23 deletions(-)

diff --git a/hw/display/cirrus_vga.c b/hw/display/cirrus_vga.c
index a5dbc39..e0410fc 100644
--- a/hw/display/cirrus_vga.c
+++ b/hw/display/cirrus_vga.c
@@ -2805,12 +2805,14 @@ static const MemoryRegionOps cirrus_vga_io_ops = {
     },
 };
 
-static void cirrus_init_common(CirrusVGAState * s, int device_id, int is_pci,
+static void cirrus_init_common(CirrusVGAState * s, int device_id,
+                               DeviceState *owner,
                                MemoryRegion *system_memory,
                                MemoryRegion *system_io)
 {
     int i;
     static int inited;
+    int is_pci = !!object_dynamic_cast(OBJECT(owner), TYPE_PCI_DEVICE);
 
     if (!inited) {
         inited = 1;
@@ -2842,19 +2844,23 @@ static void cirrus_init_common(CirrusVGAState * s, int device_id, int is_pci,
     /* Register ioport 0x3b0 - 0x3df */
     memory_region_init_io(&s->cirrus_vga_io, &cirrus_vga_io_ops, s,
                           "cirrus-io", 0x30);
+    memory_region_set_owner(&s->cirrus_vga_io, OBJECT(owner));
     memory_region_add_subregion(system_io, 0x3b0, &s->cirrus_vga_io);
 
     memory_region_init(&s->low_mem_container,
                        "cirrus-lowmem-container",
                        0x20000);
+    memory_region_set_owner(&s->low_mem_container, OBJECT(owner));
 
     memory_region_init_io(&s->low_mem, &cirrus_vga_mem_ops, s,
                           "cirrus-low-memory", 0x20000);
+    memory_region_set_owner(&s->low_mem, OBJECT(owner));
     memory_region_add_subregion(&s->low_mem_container, 0, &s->low_mem);
     for (i = 0; i < 2; ++i) {
         static const char *names[] = { "vga.bank0", "vga.bank1" };
         MemoryRegion *bank = &s->cirrus_bank[i];
         memory_region_init_alias(bank, names[i], &s->vga.vram, 0, 0x8000);
+        memory_region_set_owner(bank, OBJECT(owner));
         memory_region_set_enabled(bank, false);
         memory_region_add_subregion_overlap(&s->low_mem_container, i * 0x8000,
                                             bank, 1);
@@ -2869,6 +2875,7 @@ static void cirrus_init_common(CirrusVGAState * s, int device_id, int is_pci,
     memory_region_init_io(&s->cirrus_linear_io, &cirrus_linear_io_ops, s,
                           "cirrus-linear-io", s->vga.vram_size_mb
                                               * 1024 * 1024);
+    memory_region_set_owner(&s->cirrus_linear_io, OBJECT(owner));
     memory_region_set_flush_coalesced(&s->cirrus_linear_io);
 
     /* I/O handler for LFB */
@@ -2877,11 +2884,13 @@ static void cirrus_init_common(CirrusVGAState * s, int device_id, int is_pci,
                           s,
                           "cirrus-bitblt-mmio",
                           0x400000);
+    memory_region_set_owner(&s->cirrus_linear_bitblt_io, OBJECT(owner));
     memory_region_set_flush_coalesced(&s->cirrus_linear_bitblt_io);
 
     /* I/O handler for memory-mapped I/O */
     memory_region_init_io(&s->cirrus_mmio_io, &cirrus_mmio_io_ops, s,
                           "cirrus-mmio", CIRRUS_PNPMMIO_SIZE);
+    memory_region_set_owner(&s->cirrus_mmio_io, OBJECT(owner));
     memory_region_set_flush_coalesced(&s->cirrus_mmio_io);
 
     s->real_vram_size =
@@ -2911,8 +2920,8 @@ static int vga_initfn(ISADevice *dev)
     ISACirrusVGAState *d = ISA_CIRRUS_VGA(dev);
     VGACommonState *s = &d->cirrus_vga.vga;
 
-    vga_common_init(s);
-    cirrus_init_common(&d->cirrus_vga, CIRRUS_ID_CLGD5430, 0,
+    vga_common_init(s, DEVICE(dev));
+    cirrus_init_common(&d->cirrus_vga, CIRRUS_ID_CLGD5430, DEVICE(d),
                        isa_address_space(dev), isa_address_space_io(dev));
     s->con = graphic_console_init(DEVICE(dev), s->hw_ops, s);
     rom_add_vga(VGABIOS_CIRRUS_FILENAME);
@@ -2958,8 +2967,8 @@ static int pci_cirrus_vga_initfn(PCIDevice *dev)
      int16_t device_id = pc->device_id;
 
      /* setup VGA */
-     vga_common_init(&s->vga);
-     cirrus_init_common(s, device_id, 1, pci_address_space(dev),
+     vga_common_init(&s->vga, DEVICE(dev));
+     cirrus_init_common(s, device_id, DEVICE(dev), pci_address_space(dev),
                         pci_address_space_io(dev));
      s->vga.con = graphic_console_init(DEVICE(dev), s->vga.hw_ops, &s->vga);
 
diff --git a/hw/display/qxl.c b/hw/display/qxl.c
index c475cb1..3b6cc85 100644
--- a/hw/display/qxl.c
+++ b/hw/display/qxl.c
@@ -2066,9 +2066,11 @@ static int qxl_init_primary(PCIDevice *dev)
     qxl->id = 0;
     qxl_init_ramsize(qxl);
     vga->vram_size_mb = qxl->vga.vram_size >> 20;
-    vga_common_init(vga);
-    vga_init(vga, pci_address_space(dev), pci_address_space_io(dev), false);
+    vga_common_init(vga, DEVICE(dev));
+    vga_init(vga, DEVICE(dev), pci_address_space(dev),
+             pci_address_space_io(dev), false);
     portio_list_init(qxl_vga_port_list, qxl_vga_portio_list, vga, "vga");
+    portio_list_set_owner(qxl_vga_port_list, OBJECT(dev));
     portio_list_add(qxl_vga_port_list, pci_address_space_io(dev), 0x3b0);
 
     vga->con = graphic_console_init(DEVICE(dev), &qxl_ops, qxl);
diff --git a/hw/display/vga-isa-mm.c b/hw/display/vga-isa-mm.c
index ceeb92f..64c6fc3 100644
--- a/hw/display/vga-isa-mm.c
+++ b/hw/display/vga-isa-mm.c
@@ -132,7 +132,7 @@ int isa_vga_mm_init(hwaddr vram_base,
     s = g_malloc0(sizeof(*s));
 
     s->vga.vram_size_mb = VGA_RAM_SIZE >> 20;
-    vga_common_init(&s->vga);
+    vga_common_init(&s->vga, NULL);
     vga_mm_init(s, vram_base, ctrl_base, it_shift, address_space);
 
     s->vga.con = graphic_console_init(NULL, s->vga.hw_ops, s);
diff --git a/hw/display/vga-isa.c b/hw/display/vga-isa.c
index 9e63b69..f52c104 100644
--- a/hw/display/vga-isa.c
+++ b/hw/display/vga-isa.c
@@ -55,9 +55,9 @@ static int vga_initfn(ISADevice *dev)
     MemoryRegion *vga_io_memory;
     const MemoryRegionPortio *vga_ports, *vbe_ports;
 
-    vga_common_init(s);
+    vga_common_init(s, DEVICE(dev));
     s->legacy_address_space = isa_address_space(dev);
-    vga_io_memory = vga_init_io(s, &vga_ports, &vbe_ports);
+    vga_io_memory = vga_init_io(s, DEVICE(dev), &vga_ports, &vbe_ports);
     isa_register_portio_list(dev, 0x3b0, vga_ports, s, "vga");
     if (vbe_ports) {
         isa_register_portio_list(dev, 0x1ce, vbe_ports, s, "vbe");
diff --git a/hw/display/vga-pci.c b/hw/display/vga-pci.c
index cea8db7..3f860e9 100644
--- a/hw/display/vga-pci.c
+++ b/hw/display/vga-pci.c
@@ -147,8 +147,9 @@ static int pci_std_vga_initfn(PCIDevice *dev)
     VGACommonState *s = &d->vga;
 
     /* vga + console init */
-    vga_common_init(s);
-    vga_init(s, pci_address_space(dev), pci_address_space_io(dev), true);
+    vga_common_init(s, DEVICE(dev));
+    vga_init(s, DEVICE(dev), pci_address_space(dev), pci_address_space_io(dev),
+             true);
 
     s->con = graphic_console_init(DEVICE(dev), s->hw_ops, s);
 
diff --git a/hw/display/vga.c b/hw/display/vga.c
index 21a108d..2d7e37a 100644
--- a/hw/display/vga.c
+++ b/hw/display/vga.c
@@ -199,6 +199,8 @@ static void vga_update_memory_access(VGACommonState *s)
         base += isa_mem_base;
         region = g_malloc(sizeof(*region));
         memory_region_init_alias(region, "vga.chain4", &s->vram, offset, size);
+        memory_region_set_owner(region, memory_region_owner(&s->vram));
+
         memory_region_add_subregion_overlap(s->legacy_address_space, base,
                                             region, 2);
         s->chain4_alias = region;
@@ -2256,7 +2258,7 @@ static const GraphicHwOps vga_ops = {
     .text_update = vga_update_text,
 };
 
-void vga_common_init(VGACommonState *s)
+void vga_common_init(VGACommonState *s, DeviceState *owner)
 {
     int i, j, v, b;
 
@@ -2293,6 +2295,7 @@ void vga_common_init(VGACommonState *s)
 
     s->is_vbe_vmstate = 1;
     memory_region_init_ram(&s->vram, "vga.vram", s->vram_size);
+    memory_region_set_owner(&s->vram, OBJECT(owner));
     vmstate_register_ram_global(&s->vram);
     xen_register_framebuffer(&s->vram);
     s->vram_ptr = memory_region_get_ram_ptr(&s->vram);
@@ -2333,7 +2336,7 @@ static const MemoryRegionPortio vbe_portio_list[] = {
 };
 
 /* Used by both ISA and PCI */
-MemoryRegion *vga_init_io(VGACommonState *s,
+MemoryRegion *vga_init_io(VGACommonState *s, DeviceState *owner,
                           const MemoryRegionPortio **vga_ports,
                           const MemoryRegionPortio **vbe_ports)
 {
@@ -2345,13 +2348,15 @@ MemoryRegion *vga_init_io(VGACommonState *s,
     vga_mem = g_malloc(sizeof(*vga_mem));
     memory_region_init_io(vga_mem, &vga_mem_ops, s,
                           "vga-lowmem", 0x20000);
+    memory_region_set_owner(vga_mem, OBJECT(owner));
     memory_region_set_flush_coalesced(vga_mem);
 
     return vga_mem;
 }
 
-void vga_init(VGACommonState *s, MemoryRegion *address_space,
-              MemoryRegion *address_space_io, bool init_vga_ports)
+void vga_init(VGACommonState *s, DeviceState *owner,
+              MemoryRegion *address_space, MemoryRegion *address_space_io,
+              bool init_vga_ports)
 {
     MemoryRegion *vga_io_memory;
     const MemoryRegionPortio *vga_ports, *vbe_ports;
@@ -2364,7 +2369,7 @@ void vga_init(VGACommonState *s, MemoryRegion *address_space,
 
     s->legacy_address_space = address_space;
 
-    vga_io_memory = vga_init_io(s, &vga_ports, &vbe_ports);
+    vga_io_memory = vga_init_io(s, owner, &vga_ports, &vbe_ports);
     memory_region_add_subregion_overlap(address_space,
                                         isa_mem_base + 0x000a0000,
                                         vga_io_memory,
@@ -2372,10 +2377,12 @@ void vga_init(VGACommonState *s, MemoryRegion *address_space,
     memory_region_set_coalescing(vga_io_memory);
     if (init_vga_ports) {
         portio_list_init(vga_port_list, vga_ports, s, "vga");
+        portio_list_set_owner(vga_port_list, OBJECT(owner));
         portio_list_add(vga_port_list, address_space_io, 0x3b0);
     }
     if (vbe_ports) {
         portio_list_init(vbe_port_list, vbe_ports, s, "vbe");
+        portio_list_set_owner(vbe_port_list, OBJECT(owner));
         portio_list_add(vbe_port_list, address_space_io, 0x1ce);
     }
 }
@@ -2387,6 +2394,8 @@ void vga_init_vbe(VGACommonState *s, MemoryRegion *system_memory)
      */
     memory_region_init_alias(&s->vram_vbe, "vram.vbe",
                              &s->vram, 0, memory_region_size(&s->vram));
+    memory_region_set_owner(&s->vram_vbe, memory_region_owner(&s->vram));
+
     /* XXX: use optimized standard vga accesses */
     memory_region_add_subregion(system_memory,
                                 VBE_DISPI_LFB_PHYSICAL_ADDRESS,
diff --git a/hw/display/vga_int.h b/hw/display/vga_int.h
index 66f9f3c..7fe4967 100644
--- a/hw/display/vga_int.h
+++ b/hw/display/vga_int.h
@@ -177,10 +177,11 @@ static inline int c6_to_8(int v)
     return (v << 2) | (b << 1) | b;
 }
 
-void vga_common_init(VGACommonState *s);
-void vga_init(VGACommonState *s, MemoryRegion *address_space,
-              MemoryRegion *address_space_io, bool init_vga_ports);
-MemoryRegion *vga_init_io(VGACommonState *s,
+void vga_common_init(VGACommonState *s, DeviceState *owner);
+void vga_init(VGACommonState *s, DeviceState *owner,
+              MemoryRegion *address_space, MemoryRegion *address_space_io,
+              bool init_vga_ports);
+MemoryRegion *vga_init_io(VGACommonState *s, DeviceState *owner,
                           const MemoryRegionPortio **vga_ports,
                           const MemoryRegionPortio **vbe_ports);
 void vga_common_reset(VGACommonState *s);
diff --git a/hw/display/vmware_vga.c b/hw/display/vmware_vga.c
index fd3569d..1f94b8e 100644
--- a/hw/display/vmware_vga.c
+++ b/hw/display/vmware_vga.c
@@ -1198,8 +1198,8 @@ static void vmsvga_init(DeviceState *dev, struct vmsvga_state_s *s,
     vmstate_register_ram_global(&s->fifo_ram);
     s->fifo_ptr = memory_region_get_ram_ptr(&s->fifo_ram);
 
-    vga_common_init(&s->vga);
-    vga_init(&s->vga, address_space, io, true);
+    vga_common_init(&s->vga, dev);
+    vga_init(&s->vga, dev, address_space, io, true);
     vmstate_register(NULL, 0, &vmstate_vga_common, &s->vga);
     s->new_depth = 32;
 }
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v2 12/17] pci-assign: add memory_region_set_owner calls
  2013-06-04 12:13 [Qemu-devel] [PATCH v2 00/17] Memory/IOMMU patches part 4: region ownership Paolo Bonzini
                   ` (10 preceding siblings ...)
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 11/17] vga: add memory_region_set_owner calls Paolo Bonzini
@ 2013-06-04 12:13 ` Paolo Bonzini
  2013-06-04 16:42   ` Paolo Bonzini
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 13/17] vfio: " Paolo Bonzini
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 12:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/i386/kvm/pci-assign.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index ff85590..4b1c2d9 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -300,6 +300,7 @@ static void assigned_dev_iomem_setup(PCIDevice *pci_dev, int region_num,
     if (e_size > 0) {
         memory_region_init(&region->container, "assigned-dev-container",
                            e_size);
+        memory_region_set_owner(&region->container, OBJECT(pci_dev));
         memory_region_add_subregion(&region->container, 0, &region->real_iomem);
 
         /* deal with MSI-X MMIO page */
@@ -330,9 +331,12 @@ static void assigned_dev_ioport_setup(PCIDevice *pci_dev, int region_num,
 
     region->e_size = size;
     memory_region_init(&region->container, "assigned-dev-container", size);
+    memory_region_set_owner(&region->container, OBJECT(pci_dev));
+
     memory_region_init_io(&region->real_iomem, &assigned_dev_ioport_ops,
                           r_dev->v_addrs + region_num,
                           "assigned-dev-iomem", size);
+    memory_region_set_owner(&region->real_iomem, OBJECT(pci_dev));
     memory_region_add_subregion(&region->container, 0, &region->real_iomem);
 }
 
@@ -482,6 +486,8 @@ static int assigned_dev_register_regions(PCIRegion *io_regions,
                                       &slow_bar_ops, &pci_dev->v_addrs[i],
                                       "assigned-dev-slow-bar",
                                       cur_region->size);
+                memory_region_set_owner(&pci_dev->v_addrs[i].real_iomem,
+                                        OBJECT(pci_dev));
             } else {
                 void *virtbase = pci_dev->v_addrs[i].u.r_virtbase;
                 char name[32];
@@ -490,6 +496,9 @@ static int assigned_dev_register_regions(PCIRegion *io_regions,
                 memory_region_init_ram_ptr(&pci_dev->v_addrs[i].real_iomem,
                                            name, cur_region->size,
                                            virtbase);
+                memory_region_set_owner(&pci_dev->v_addrs[i].real_iomem,
+                                        OBJECT(pci_dev));
+
                 vmstate_register_ram(&pci_dev->v_addrs[i].real_iomem,
                                      &pci_dev->dev.qdev);
             }
@@ -1651,6 +1660,7 @@ static int assigned_dev_register_msix_mmio(AssignedDevice *dev)
 
     memory_region_init_io(&dev->mmio, &assigned_dev_msix_mmio_ops, dev,
                           "assigned-dev-msix", MSIX_PAGE_SIZE);
+    memory_region_set_owner(&dev->mmio, OBJECT(dev));
     return 0;
 }
 
@@ -1916,6 +1926,7 @@ static void assigned_dev_load_option_rom(AssignedDevice *dev)
     snprintf(name, sizeof(name), "%s.rom",
             object_get_typename(OBJECT(dev)));
     memory_region_init_ram(&dev->dev.rom, name, st.st_size);
+    memory_region_set_owner(&dev->dev.rom, OBJECT(dev));
     vmstate_register_ram(&dev->dev.rom, &dev->dev.qdev);
     ptr = memory_region_get_ram_ptr(&dev->dev.rom);
     memset(ptr, 0xff, st.st_size);
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v2 13/17] vfio: add memory_region_set_owner calls
  2013-06-04 12:13 [Qemu-devel] [PATCH v2 00/17] Memory/IOMMU patches part 4: region ownership Paolo Bonzini
                   ` (11 preceding siblings ...)
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 12/17] pci-assign: " Paolo Bonzini
@ 2013-06-04 12:13 ` Paolo Bonzini
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 14/17] exec: check MRU in qemu_ram_addr_from_host Paolo Bonzini
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 12:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/misc/vfio.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index a1f5803..3c0dc9f 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -1156,6 +1156,7 @@ static void vfio_vga_probe_ati_3c3_quirk(VFIODevice *vdev)
 
     memory_region_init_io(&quirk->mem, &vfio_ati_3c3_quirk, quirk,
                           "vfio-ati-3c3-quirk", 1);
+    memory_region_set_owner(&quirk->mem, OBJECT(vdev));
     memory_region_add_subregion(&vdev->vga.region[QEMU_PCI_VGA_IO_HI].mem, 3,
                                 &quirk->mem);
 
@@ -1247,6 +1248,7 @@ static void vfio_probe_ati_4010_quirk(VFIODevice *vdev, int nr)
 
     memory_region_init_io(&quirk->mem, &vfio_ati_4010_quirk, quirk,
                           "vfio-ati-4010-quirk", 8);
+    memory_region_set_owner(&quirk->mem, OBJECT(vdev));
     memory_region_add_subregion_overlap(&vdev->bars[nr].mem, 0, &quirk->mem, 1);
 
     QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, quirk, next);
@@ -1333,6 +1335,7 @@ static void vfio_probe_ati_f10_quirk(VFIODevice *vdev, int nr)
 
     memory_region_init_io(&quirk->mem, &vfio_ati_f10_quirk, quirk,
                           "vfio-ati-f10-quirk", 8);
+    memory_region_set_owner(&quirk->mem, OBJECT(vdev));
     memory_region_add_subregion_overlap(&vdev->bars[nr].mem, 0, &quirk->mem, 1);
 
     QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, quirk, next);
@@ -1453,6 +1456,7 @@ static void vfio_vga_probe_nvidia_3d0_quirk(VFIODevice *vdev)
 
     memory_region_init_io(&quirk->mem, &vfio_nvidia_3d0_quirk, quirk,
                           "vfio-nvidia-3d0-quirk", 6);
+    memory_region_set_owner(&quirk->mem, OBJECT(vdev));
     memory_region_add_subregion(&vdev->vga.region[QEMU_PCI_VGA_IO_HI].mem,
                                 0x10, &quirk->mem);
 
@@ -1568,6 +1572,7 @@ static void vfio_probe_nvidia_bar5_window_quirk(VFIODevice *vdev, int nr)
 
     memory_region_init_io(&quirk->mem, &vfio_nvidia_bar5_window_quirk, quirk,
                           "vfio-nvidia-bar5-window-quirk", 16);
+    memory_region_set_owner(&quirk->mem, OBJECT(vdev));
     memory_region_add_subregion_overlap(&vdev->bars[nr].mem, 0, &quirk->mem, 1);
 
     QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, quirk, next);
@@ -1647,6 +1652,7 @@ static void vfio_probe_nvidia_bar0_88000_quirk(VFIODevice *vdev, int nr)
     memory_region_init_io(&quirk->mem, &vfio_nvidia_bar0_88000_quirk, quirk,
                           "vfio-nvidia-bar0-88000-quirk",
                           TARGET_PAGE_ALIGN(PCIE_CONFIG_SPACE_SIZE));
+    memory_region_set_owner(&quirk->mem, OBJECT(vdev));
     memory_region_add_subregion_overlap(&vdev->bars[nr].mem,
                                         0x88000 & TARGET_PAGE_MASK,
                                         &quirk->mem, 1);
@@ -1726,6 +1732,7 @@ static void vfio_probe_nvidia_bar0_1800_quirk(VFIODevice *vdev, int nr)
     memory_region_init_io(&quirk->mem, &vfio_nvidia_bar0_1800_quirk, quirk,
                           "vfio-nvidia-bar0-1800-quirk",
                           TARGET_PAGE_ALIGN(PCI_CONFIG_SPACE_SIZE));
+    memory_region_set_owner(&quirk->mem, OBJECT(vdev));
     memory_region_add_subregion_overlap(&vdev->bars[nr].mem,
                                         0x1800 & TARGET_PAGE_MASK,
                                         &quirk->mem, 1);
@@ -2237,6 +2244,7 @@ empty_region:
         memory_region_init(submem, name, 0);
     }
 
+    memory_region_set_owner(submem, memory_region_owner(mem));
     memory_region_add_subregion(mem, offset, submem);
 
     return ret;
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v2 14/17] exec: check MRU in qemu_ram_addr_from_host
  2013-06-04 12:13 [Qemu-devel] [PATCH v2 00/17] Memory/IOMMU patches part 4: region ownership Paolo Bonzini
                   ` (12 preceding siblings ...)
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 13/17] vfio: " Paolo Bonzini
@ 2013-06-04 12:13 ` Paolo Bonzini
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 15/17] exec: move qemu_ram_addr_from_host_nofail to cputlb.c Paolo Bonzini
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 12:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

This function is not used outside the iothread mutex, so it
can use ram_list.mru_block.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/exec.c b/exec.c
index bf287cb..9fd4c90 100644
--- a/exec.c
+++ b/exec.c
@@ -1441,18 +1441,26 @@ int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr)
         return 0;
     }
 
+    block = ram_list.mru_block;
+    if (block && block->host && host - block->host < block->length) {
+        goto found;
+    }
+
     QTAILQ_FOREACH(block, &ram_list.blocks, next) {
         /* This case append when the block is not mapped. */
         if (block->host == NULL) {
             continue;
         }
         if (host - block->host < block->length) {
-            *ram_addr = block->offset + (host - block->host);
-            return 0;
+            goto found;
         }
     }
 
     return -1;
+
+found:
+    *ram_addr = block->offset + (host - block->host);
+    return 0;
 }
 
 /* Some of the softmmu routines need to translate from a host pointer
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v2 15/17] exec: move qemu_ram_addr_from_host_nofail to cputlb.c
  2013-06-04 12:13 [Qemu-devel] [PATCH v2 00/17] Memory/IOMMU patches part 4: region ownership Paolo Bonzini
                   ` (13 preceding siblings ...)
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 14/17] exec: check MRU in qemu_ram_addr_from_host Paolo Bonzini
@ 2013-06-04 12:13 ` Paolo Bonzini
  2013-06-04 12:14 ` [Qemu-devel] [PATCH v2 16/17] memory: return MemoryRegion from qemu_ram_addr_from_host Paolo Bonzini
  2013-06-04 12:14 ` [Qemu-devel] [PATCH v2 17/17] memory: ref/unref memory across address_space_map/unmap Paolo Bonzini
  16 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 12:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

It is only used there, and the _nofail and the standard
versions of this function return different things.  Limit
confusion by removing the function from the public headers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cputlb.c                  | 11 +++++++++++
 exec.c                    | 15 ++-------------
 include/exec/cpu-common.h |  1 -
 3 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index 060c67d..c34bd7b 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -158,6 +158,17 @@ void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry, uintptr_t start,
     }
 }
 
+static inline ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr)
+{
+    ram_addr_t ram_addr;
+
+    if (qemu_ram_addr_from_host(ptr, &ram_addr)) {
+        fprintf(stderr, "Bad ram pointer %p\n", ptr);
+        abort();
+    }
+    return ram_addr;
+}
+
 static inline void tlb_update_dirty(CPUTLBEntry *tlb_entry)
 {
     ram_addr_t ram_addr;
diff --git a/exec.c b/exec.c
index 9fd4c90..07d4b5f 100644
--- a/exec.c
+++ b/exec.c
@@ -1431,6 +1431,8 @@ static void *qemu_ram_ptr_length(ram_addr_t addr, ram_addr_t *size)
     }
 }
 
+/* Some of the softmmu routines need to translate from a host pointer
+   (typically a TLB entry) back to a ram offset.  */
 int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr)
 {
     RAMBlock *block;
@@ -1463,19 +1465,6 @@ found:
     return 0;
 }
 
-/* Some of the softmmu routines need to translate from a host pointer
-   (typically a TLB entry) back to a ram offset.  */
-ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr)
-{
-    ram_addr_t ram_addr;
-
-    if (qemu_ram_addr_from_host(ptr, &ram_addr)) {
-        fprintf(stderr, "Bad ram pointer %p\n", ptr);
-        abort();
-    }
-    return ram_addr;
-}
-
 static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
                                uint64_t val, unsigned size)
 {
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index e061e21..8063ba2 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -51,7 +51,6 @@ typedef uint32_t CPUReadMemoryFunc(void *opaque, hwaddr addr);
 void qemu_ram_remap(ram_addr_t addr, ram_addr_t length);
 /* This should not be used by devices.  */
 int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr);
-ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr);
 void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev);
 
 void cpu_physical_memory_rw(hwaddr addr, uint8_t *buf,
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v2 16/17] memory: return MemoryRegion from qemu_ram_addr_from_host
  2013-06-04 12:13 [Qemu-devel] [PATCH v2 00/17] Memory/IOMMU patches part 4: region ownership Paolo Bonzini
                   ` (14 preceding siblings ...)
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 15/17] exec: move qemu_ram_addr_from_host_nofail to cputlb.c Paolo Bonzini
@ 2013-06-04 12:14 ` Paolo Bonzini
  2013-06-04 12:14 ` [Qemu-devel] [PATCH v2 17/17] memory: ref/unref memory across address_space_map/unmap Paolo Bonzini
  16 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 12:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

It will be needed in the next patch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cputlb.c                  |  2 +-
 exec.c                    | 33 ++++++++++++++++++++-------------
 include/exec/cpu-common.h |  2 +-
 target-i386/kvm.c         |  4 ++--
 4 files changed, 24 insertions(+), 17 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index c34bd7b..f8c15dc 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -162,7 +162,7 @@ static inline ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr)
 {
     ram_addr_t ram_addr;
 
-    if (qemu_ram_addr_from_host(ptr, &ram_addr)) {
+    if (qemu_ram_addr_from_host(ptr, &ram_addr) == NULL) {
         fprintf(stderr, "Bad ram pointer %p\n", ptr);
         abort();
     }
diff --git a/exec.c b/exec.c
index 07d4b5f..db03507 100644
--- a/exec.c
+++ b/exec.c
@@ -1329,15 +1329,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
 }
 #endif /* !_WIN32 */
 
-/* Return a host pointer to ram allocated with qemu_ram_alloc.
-   With the exception of the softmmu code in this file, this should
-   only be used for local memory (e.g. video ram) that the device owns,
-   and knows it isn't going to access beyond the end of the block.
-
-   It should not be used for general purpose DMA.
-   Use cpu_physical_memory_map/cpu_physical_memory_rw instead.
- */
-void *qemu_get_ram_ptr(ram_addr_t addr)
+static RAMBlock *qemu_get_ram_block(ram_addr_t addr)
 {
     RAMBlock *block;
 
@@ -1357,6 +1349,21 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
 
 found:
     ram_list.mru_block = block;
+    return block;
+}
+
+/* Return a host pointer to ram allocated with qemu_ram_alloc.
+   With the exception of the softmmu code in this file, this should
+   only be used for local memory (e.g. video ram) that the device owns,
+   and knows it isn't going to access beyond the end of the block.
+
+   It should not be used for general purpose DMA.
+   Use cpu_physical_memory_map/cpu_physical_memory_rw instead.
+ */
+void *qemu_get_ram_ptr(ram_addr_t addr)
+{
+    RAMBlock *block = qemu_get_ram_block(addr);
+
     if (xen_enabled()) {
         /* We need to check if the requested address is in the RAM
          * because we don't want to map the entire memory in QEMU.
@@ -1433,14 +1440,14 @@ static void *qemu_ram_ptr_length(ram_addr_t addr, ram_addr_t *size)
 
 /* Some of the softmmu routines need to translate from a host pointer
    (typically a TLB entry) back to a ram offset.  */
-int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr)
+MemoryRegion *qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr)
 {
     RAMBlock *block;
     uint8_t *host = ptr;
 
     if (xen_enabled()) {
         *ram_addr = xen_ram_addr_from_mapcache(ptr);
-        return 0;
+        return qemu_get_ram_block(*ram_addr)->mr;
     }
 
     block = ram_list.mru_block;
@@ -1458,11 +1465,11 @@ int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr)
         }
     }
 
-    return -1;
+    return NULL;
 
 found:
     *ram_addr = block->offset + (host - block->host);
-    return 0;
+    return block->mr;
 }
 
 static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 8063ba2..8f75233 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -50,7 +50,7 @@ typedef uint32_t CPUReadMemoryFunc(void *opaque, hwaddr addr);
 
 void qemu_ram_remap(ram_addr_t addr, ram_addr_t length);
 /* This should not be used by devices.  */
-int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr);
+MemoryRegion *qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr);
 void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev);
 
 void cpu_physical_memory_rw(hwaddr addr, uint8_t *buf,
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 9ffb6ca..7ba98cd 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -318,7 +318,7 @@ int kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
 
     if ((env->mcg_cap & MCG_SER_P) && addr
         && (code == BUS_MCEERR_AR || code == BUS_MCEERR_AO)) {
-        if (qemu_ram_addr_from_host(addr, &ram_addr) ||
+        if (qemu_ram_addr_from_host(addr, &ram_addr) == NULL ||
             !kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
             fprintf(stderr, "Hardware memory error for memory used by "
                     "QEMU itself instead of guest system!\n");
@@ -350,7 +350,7 @@ int kvm_arch_on_sigbus(int code, void *addr)
         hwaddr paddr;
 
         /* Hope we are lucky for AO MCE */
-        if (qemu_ram_addr_from_host(addr, &ram_addr) ||
+        if (qemu_ram_addr_from_host(addr, &ram_addr) == NULL ||
             !kvm_physical_memory_addr_from_host(CPU(first_cpu)->kvm_state,
                                                 addr, &paddr)) {
             fprintf(stderr, "Hardware memory error for memory used by "
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH v2 17/17] memory: ref/unref memory across address_space_map/unmap
  2013-06-04 12:13 [Qemu-devel] [PATCH v2 00/17] Memory/IOMMU patches part 4: region ownership Paolo Bonzini
                   ` (15 preceding siblings ...)
  2013-06-04 12:14 ` [Qemu-devel] [PATCH v2 16/17] memory: return MemoryRegion from qemu_ram_addr_from_host Paolo Bonzini
@ 2013-06-04 12:14 ` Paolo Bonzini
  16 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 12:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

The iothread mutex might be released between map and unmap, so the
mapped region might disappear.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/exec.c b/exec.c
index db03507..5c37393 100644
--- a/exec.c
+++ b/exec.c
@@ -2057,6 +2057,7 @@ void cpu_physical_memory_write_rom(hwaddr addr,
 }
 
 typedef struct {
+    MemoryRegion *mr;
     void *buffer;
     hwaddr addr;
     hwaddr len;
@@ -2154,15 +2155,18 @@ void *address_space_map(AddressSpace *as,
             bounce.buffer = qemu_memalign(TARGET_PAGE_SIZE, TARGET_PAGE_SIZE);
             bounce.addr = addr;
             bounce.len = l;
+            bounce.mr = mr;
             if (!is_write) {
                 address_space_read(as, addr, bounce.buffer, l);
             }
 
             *plen = l;
+            memory_region_ref(mr);
             return bounce.buffer;
         }
         if (!todo) {
             raddr = memory_region_get_ram_addr(mr) + xlat;
+            memory_region_ref(mr);
         } else {
             if (memory_region_get_ram_addr(mr) + xlat != raddr + todo) {
                 memory_region_unref(mr);
@@ -2189,8 +2193,12 @@ void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len,
                          int is_write, hwaddr access_len)
 {
     if (buffer != bounce.buffer) {
+        MemoryRegion *mr;
+        ram_addr_t addr1;
+
+        mr = qemu_ram_addr_from_host(buffer, &addr1);
+        assert(mr);
         if (is_write) {
-            ram_addr_t addr1 = qemu_ram_addr_from_host_nofail(buffer);
             while (access_len) {
                 unsigned l;
                 l = TARGET_PAGE_SIZE;
@@ -2204,6 +2212,7 @@ void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len,
         if (xen_enabled()) {
             xen_invalidate_map_cache_entry(buffer);
         }
+        memory_region_unref(mr);
         return;
     }
     if (is_write) {
@@ -2211,6 +2220,7 @@ void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len,
     }
     qemu_vfree(bounce.buffer);
     bounce.buffer = NULL;
+    memory_region_unref(bounce.mr);
     cpu_notify_map_clients();
 }
 
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v2 06/17] sysbus: add sysbus_pass_mmio
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 06/17] sysbus: add sysbus_pass_mmio Paolo Bonzini
@ 2013-06-04 12:24   ` Peter Maydell
  2013-06-04 12:31     ` Paolo Bonzini
  0 siblings, 1 reply; 36+ messages in thread
From: Peter Maydell @ 2013-06-04 12:24 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On 4 June 2013 13:13, Paolo Bonzini <pbonzini@redhat.com> wrote:
> This matches sysbus_pass_irq in cases where a device is a thin wrapper
> of another.  MMIO regions will keep the subdevice as the owner.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  hw/core/sysbus.c     | 12 ++++++++++++
>  hw/cpu/arm11mpcore.c |  2 +-
>  include/hw/sysbus.h  |  1 +
>  3 files changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/hw/core/sysbus.c b/hw/core/sysbus.c
> index 9004d8c..6dbd1f8 100644
> --- a/hw/core/sysbus.c
> +++ b/hw/core/sysbus.c
> @@ -117,6 +117,18 @@ void sysbus_init_mmio(SysBusDevice *dev, MemoryRegion *memory)
>      dev->mmio[n].memory = memory;
>  }
>
> +/* Pass MMIOs from a target device.  */
> +void sysbus_pass_mmio(SysBusDevice *dev, SysBusDevice *target)
> +{
> +    int i;
> +    assert(dev->num_mmio == 0);
> +    dev->num_mmio = target->num_mmio;
> +    for (i = 0; i < dev->num_mmio; i++) {
> +        assert(target->mmio[i].addr == -1);
> +        dev->mmio[i] = target->mmio[i];
> +    }
> +}

This is much less flexible than just using sysbus_mmio_get_region(),
because it only lets you pass the whole set of MMIOs from the
other device through, not just the ones you want. Please
just make reference counting work properly with passing
MemoryRegion*s around.

-- PMM

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v2 06/17] sysbus: add sysbus_pass_mmio
  2013-06-04 12:24   ` Peter Maydell
@ 2013-06-04 12:31     ` Paolo Bonzini
  2013-06-04 12:36       ` Peter Maydell
  0 siblings, 1 reply; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 12:31 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel

Il 04/06/2013 14:24, Peter Maydell ha scritto:
> On 4 June 2013 13:13, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> This matches sysbus_pass_irq in cases where a device is a thin wrapper
>> of another.  MMIO regions will keep the subdevice as the owner.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  hw/core/sysbus.c     | 12 ++++++++++++
>>  hw/cpu/arm11mpcore.c |  2 +-
>>  include/hw/sysbus.h  |  1 +
>>  3 files changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/core/sysbus.c b/hw/core/sysbus.c
>> index 9004d8c..6dbd1f8 100644
>> --- a/hw/core/sysbus.c
>> +++ b/hw/core/sysbus.c
>> @@ -117,6 +117,18 @@ void sysbus_init_mmio(SysBusDevice *dev, MemoryRegion *memory)
>>      dev->mmio[n].memory = memory;
>>  }
>>
>> +/* Pass MMIOs from a target device.  */
>> +void sysbus_pass_mmio(SysBusDevice *dev, SysBusDevice *target)
>> +{
>> +    int i;
>> +    assert(dev->num_mmio == 0);
>> +    dev->num_mmio = target->num_mmio;
>> +    for (i = 0; i < dev->num_mmio; i++) {
>> +        assert(target->mmio[i].addr == -1);
>> +        dev->mmio[i] = target->mmio[i];
>> +    }
>> +}
> 
> This is much less flexible than just using sysbus_mmio_get_region(),
> because it only lets you pass the whole set of MMIOs from the
> other device through, not just the ones you want.

How is this different from sysbus_pass_irq?

> Please just make reference counting work properly with passing
> MemoryRegion*s around.

Do you have any idea that doesn't require touch 800 invocation of the
region creation functions?  This looks like a solution in search of a
problem to me.

Paolo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v2 06/17] sysbus: add sysbus_pass_mmio
  2013-06-04 12:31     ` Paolo Bonzini
@ 2013-06-04 12:36       ` Peter Maydell
  2013-06-04 13:24         ` Paolo Bonzini
  0 siblings, 1 reply; 36+ messages in thread
From: Peter Maydell @ 2013-06-04 12:36 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On 4 June 2013 13:31, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 04/06/2013 14:24, Peter Maydell ha scritto:
>> On 4 June 2013 13:13, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> This is much less flexible than just using sysbus_mmio_get_region(),
>> because it only lets you pass the whole set of MMIOs from the
>> other device through, not just the ones you want.
>
> How is this different from sysbus_pass_irq?

sysbus_pass_irq is also an annoyingly inflexible function.
With MMIOs we have the advantage of being able to do better.

>> Please just make reference counting work properly with passing
>> MemoryRegion*s around.
>
> Do you have any idea that doesn't require touch 800 invocation of the
> region creation functions?

I think that would be a straightforward and easy to understand
way to define the ownership rules so I would much rather we
did that. I really don't like the way your current patch
is doing something complicated in an attempt to avoid this.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v2 06/17] sysbus: add sysbus_pass_mmio
  2013-06-04 12:36       ` Peter Maydell
@ 2013-06-04 13:24         ` Paolo Bonzini
  2013-06-04 14:11           ` Peter Maydell
  2013-06-04 16:50           ` Paolo Bonzini
  0 siblings, 2 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 13:24 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel

Il 04/06/2013 14:36, Peter Maydell ha scritto:
> On 4 June 2013 13:31, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> Il 04/06/2013 14:24, Peter Maydell ha scritto:
>>> On 4 June 2013 13:13, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> This is much less flexible than just using sysbus_mmio_get_region(),
>>> because it only lets you pass the whole set of MMIOs from the
>>> other device through, not just the ones you want.
>>
>> How is this different from sysbus_pass_irq?
> 
> sysbus_pass_irq is also an annoyingly inflexible function.
> With MMIOs we have the advantage of being able to do better.

I prefer consistency to useless flexibility.

The day someone will need it, they can add sysbus_pass_one_{irq,mmio}.

>>> Please just make reference counting work properly with passing
>>> MemoryRegion*s around.
>>
>> Do you have any idea that doesn't require touch 800 invocation of the
>> region creation functions?
> 
> I think that would be a straightforward and easy to understand
> way to define the ownership rules so I would much rather we
> did that. I really don't like the way your current patch
> is doing something complicated in an attempt to avoid this.

They are straightforward, documented, and the wide majority of the
devices need not care at all about them.  By contrast, changing 800
invocations of the functions would be impossible to review seriously, it
would have to be redone when boards are qdev/QOM-ified, would be worse
for submitters of new boards.

There are an order of magnitude less calls to memory_region_set_owner
than to memory_region_init_*.  Changing four places suffices to get
ownership for 97% of the devices (309 files in hw/ call
memory_region_init*, 9 devices call memory_region_set_owner):

  hw/core/sysbus.c:1
  hw/isa/isa-bus.c:1
  hw/pci/pci.c:1
  ioport.c:2

Of the remaining calls, 2/3 of them are concentrated in a handful of
devices:

  hw/display/cirrus_vga.c:7
  hw/display/vga.c:4
  hw/i386/kvm/pci-assign.c:7
  hw/misc/vfio.c:8

and all the others could probably be refactored away, and are all for PC
devices, other targets are unaffected (I did review them and sysbus
catches everything):

  hw/acpi/ich9.c:1
  hw/acpi/piix4.c:4
  hw/char/serial-pci.c:1
  hw/isa/apm.c:1
  hw/misc/pc-testdev.c:4


To me, it seems a pretty good abstraction.

Paolo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v2 06/17] sysbus: add sysbus_pass_mmio
  2013-06-04 13:24         ` Paolo Bonzini
@ 2013-06-04 14:11           ` Peter Maydell
  2013-06-04 14:27             ` Paolo Bonzini
  2013-06-04 16:50           ` Paolo Bonzini
  1 sibling, 1 reply; 36+ messages in thread
From: Peter Maydell @ 2013-06-04 14:11 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On 4 June 2013 14:24, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 04/06/2013 14:36, Peter Maydell ha scritto:
>> On 4 June 2013 13:31, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> Il 04/06/2013 14:24, Peter Maydell ha scritto:
>>>> On 4 June 2013 13:13, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>> This is much less flexible than just using sysbus_mmio_get_region(),
>>>> because it only lets you pass the whole set of MMIOs from the
>>>> other device through, not just the ones you want.
>>>
>>> How is this different from sysbus_pass_irq?
>>
>> sysbus_pass_irq is also an annoyingly inflexible function.
>> With MMIOs we have the advantage of being able to do better.
>
> I prefer consistency to useless flexibility.
>
> The day someone will need it, they can add sysbus_pass_one_{irq,mmio}.

We've already got a working implementation, in the shape
of sysbus_mmio_get_region(). This is exactly the right way
to do this API -- we have one API which says "give me a
MemoryRegion*" and one which says "I have a MemoryRegion*,
please expose it". All I'm asking you to do is not break it.

>> I think that would be a straightforward and easy to understand
>> way to define the ownership rules so I would much rather we
>> did that. I really don't like the way your current patch
>> is doing something complicated in an attempt to avoid this.
>
> They are straightforward, documented, and the wide majority of the
> devices need not care at all about them.

I still don't understand them. Why should "hey, please use
this MMIO region as a PCI BAR" imply "and by the way set the
ownership"? Why does "here's an MMIO region which should be
exposed to users of this device" imply "and by the way
set the ownership"? This is just yoking together two separate
things. And it seems wrong that devices shouldn't care about
memory region ownership -- devices *are* the memory region
owners typically. Memory regions should just have owners
always from the start, if we need them to have owners.

>  By contrast, changing 800
> invocations of the functions would be impossible to review seriously, it
> would have to be redone when boards are qdev/QOM-ified, would be worse
> for submitters of new boards.

If it's not clear who ought to be the owner when mmio_init_region
is called then there are problems anyway.

> There are an order of magnitude less calls to memory_region_set_owner
> than to memory_region_init_*.

That's because you've optimised for "minimise number of places
to put calls". The downside is it's much harder to review new
patches. An owner parameter to the mmio_init_region* functions
means (a) it's impossible to forget to set the owner (b) it's
easy to check when looking at the patch whether the owner is
appropriate (c) you don't have to worry about weird cases
where something else might try to set the owner later.

As a concrete example, if somebody submitted cirrus_vga
as a new driver, I have no idea how to tell that it needs
to set the owner for its memory regions, when 99% of
other devices don't. I think this is going to result in
"forgot to set owner" bugs.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v2 06/17] sysbus: add sysbus_pass_mmio
  2013-06-04 14:11           ` Peter Maydell
@ 2013-06-04 14:27             ` Paolo Bonzini
  2013-06-04 14:56               ` Peter Maydell
  0 siblings, 1 reply; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 14:27 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel

Il 04/06/2013 16:11, Peter Maydell ha scritto:
> We've already got a working implementation, in the shape
> of sysbus_mmio_get_region(). This is exactly the right way
> to do this API -- we have one API which says "give me a
> MemoryRegion*" and one which says "I have a MemoryRegion*,
> please expose it". All I'm asking you to do is not break it.

I can add a conditional to sysbus_add_mmio if you prefer.  I think it's
uglier, but I can live with it.

> I still don't understand them. Why should "hey, please use
> this MMIO region as a PCI BAR" imply "and by the way set the
> ownership"? Why does "here's an MMIO region which should be
> exposed to users of this device" imply "and by the way
> set the ownership"?

Because both places are _already_ tying a MemoryRegion to a device.

>>  By contrast, changing 800
>> invocations of the functions would be impossible to review seriously, it
>> would have to be redone when boards are qdev/QOM-ified, would be worse
>> for submitters of new boards.
> 
> If it's not clear who ought to be the owner when mmio_init_region
> is called then there are problems anyway.

It is clear, but this doesn't make a mechanical-but-not-quite patch easy
to review.

It's not that I cannot do it.  I simply believe it is a worse choice.

>> There are an order of magnitude less calls to memory_region_set_owner
>> than to memory_region_init_*.
> 
> That's because you've optimised for "minimise number of places
> to put calls". The downside is it's much harder to review new
> patches. An owner parameter to the mmio_init_region* functions
> means (a) it's impossible to forget to set the owner (b) it's
> easy to check when looking at the patch whether the owner is
> appropriate (c) you don't have to worry about weird cases
> where something else might try to set the owner later.

That's true, I cannot deny that.

> As a concrete example, if somebody submitted cirrus_vga
> as a new driver, I have no idea how to tell that it needs
> to set the owner for its memory regions, when 99% of
> other devices don't. I think this is going to result in
> "forgot to set owner" bugs.

Because cirrus is adding regions directly to address_space_memory/io.
As documented:

 * The device must set the owner itself
 * only if it uses memory_region_add_subregion directly on some address
 * space, or after the parent region is passed to the bus (for example
 * dynamically while the device runs).

Paolo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v2 06/17] sysbus: add sysbus_pass_mmio
  2013-06-04 14:27             ` Paolo Bonzini
@ 2013-06-04 14:56               ` Peter Maydell
  2013-06-04 15:06                 ` Paolo Bonzini
  0 siblings, 1 reply; 36+ messages in thread
From: Peter Maydell @ 2013-06-04 14:56 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On 4 June 2013 15:27, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 04/06/2013 16:11, Peter Maydell ha scritto:
>> We've already got a working implementation, in the shape
>> of sysbus_mmio_get_region(). This is exactly the right way
>> to do this API -- we have one API which says "give me a
>> MemoryRegion*" and one which says "I have a MemoryRegion*,
>> please expose it". All I'm asking you to do is not break it.
>
> I can add a conditional to sysbus_add_mmio if you prefer.  I think it's
> uglier, but I can live with it.

No, I just think it shouldn't be setting the owner.

>> As a concrete example, if somebody submitted cirrus_vga
>> as a new driver, I have no idea how to tell that it needs
>> to set the owner for its memory regions, when 99% of
>> other devices don't. I think this is going to result in
>> "forgot to set owner" bugs.
>
> Because cirrus is adding regions directly to address_space_memory/io.
> As documented:
>
>  * The device must set the owner itself
>  * only if it uses memory_region_add_subregion directly on some address
>  * space, or after the parent region is passed to the bus (for example
>  * dynamically while the device runs).

OK, so why doesn't your patchset make the places in
hw/arm/omap1.c which add memory regions directly
to a subregion set the owner of the region? (or any
of the many other places where we do similar things).

thanks
-- PMM

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v2 06/17] sysbus: add sysbus_pass_mmio
  2013-06-04 14:56               ` Peter Maydell
@ 2013-06-04 15:06                 ` Paolo Bonzini
  0 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 15:06 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel

Il 04/06/2013 16:56, Peter Maydell ha scritto:
>>> As a concrete example, if somebody submitted cirrus_vga
>>> as a new driver, I have no idea how to tell that it needs
>>> to set the owner for its memory regions, when 99% of
>>> other devices don't. I think this is going to result in
>>> "forgot to set owner" bugs.
>>
>> Because cirrus is adding regions directly to address_space_memory/io.
>> As documented:
>>
>>  * The device must set the owner itself
>>  * only if it uses memory_region_add_subregion directly on some address
>>  * space, or after the parent region is passed to the bus (for example
>>  * dynamically while the device runs).
> 
> OK, so why doesn't your patchset make the places in
> hw/arm/omap1.c which add memory regions directly
> to a subregion set the owner of the region?

Because these aren't qdevified.

> (or any of the many other places where we do similar things).

Note that it's only necessary to do so when you add those to the address
space, not to other regions.

Paolo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v2 12/17] pci-assign: add memory_region_set_owner calls
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 12/17] pci-assign: " Paolo Bonzini
@ 2013-06-04 16:42   ` Paolo Bonzini
  0 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 16:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Il 04/06/2013 14:13, Paolo Bonzini ha scritto:
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  hw/i386/kvm/pci-assign.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
> index ff85590..4b1c2d9 100644
> --- a/hw/i386/kvm/pci-assign.c
> +++ b/hw/i386/kvm/pci-assign.c
> @@ -300,6 +300,7 @@ static void assigned_dev_iomem_setup(PCIDevice *pci_dev, int region_num,
>      if (e_size > 0) {
>          memory_region_init(&region->container, "assigned-dev-container",
>                             e_size);
> +        memory_region_set_owner(&region->container, OBJECT(pci_dev));
>          memory_region_add_subregion(&region->container, 0, &region->real_iomem);
>  
>          /* deal with MSI-X MMIO page */
> @@ -330,9 +331,12 @@ static void assigned_dev_ioport_setup(PCIDevice *pci_dev, int region_num,
>  
>      region->e_size = size;
>      memory_region_init(&region->container, "assigned-dev-container", size);
> +    memory_region_set_owner(&region->container, OBJECT(pci_dev));
> +
>      memory_region_init_io(&region->real_iomem, &assigned_dev_ioport_ops,
>                            r_dev->v_addrs + region_num,
>                            "assigned-dev-iomem", size);
> +    memory_region_set_owner(&region->real_iomem, OBJECT(pci_dev));
>      memory_region_add_subregion(&region->container, 0, &region->real_iomem);
>  }
>  
> @@ -482,6 +486,8 @@ static int assigned_dev_register_regions(PCIRegion *io_regions,
>                                        &slow_bar_ops, &pci_dev->v_addrs[i],
>                                        "assigned-dev-slow-bar",
>                                        cur_region->size);
> +                memory_region_set_owner(&pci_dev->v_addrs[i].real_iomem,
> +                                        OBJECT(pci_dev));
>              } else {
>                  void *virtbase = pci_dev->v_addrs[i].u.r_virtbase;
>                  char name[32];
> @@ -490,6 +496,9 @@ static int assigned_dev_register_regions(PCIRegion *io_regions,
>                  memory_region_init_ram_ptr(&pci_dev->v_addrs[i].real_iomem,
>                                             name, cur_region->size,
>                                             virtbase);
> +                memory_region_set_owner(&pci_dev->v_addrs[i].real_iomem,
> +                                        OBJECT(pci_dev));
> +
>                  vmstate_register_ram(&pci_dev->v_addrs[i].real_iomem,
>                                       &pci_dev->dev.qdev);
>              }
> @@ -1651,6 +1660,7 @@ static int assigned_dev_register_msix_mmio(AssignedDevice *dev)
>  
>      memory_region_init_io(&dev->mmio, &assigned_dev_msix_mmio_ops, dev,
>                            "assigned-dev-msix", MSIX_PAGE_SIZE);
> +    memory_region_set_owner(&dev->mmio, OBJECT(dev));
>      return 0;
>  }
>  
> @@ -1916,6 +1926,7 @@ static void assigned_dev_load_option_rom(AssignedDevice *dev)
>      snprintf(name, sizeof(name), "%s.rom",
>              object_get_typename(OBJECT(dev)));
>      memory_region_init_ram(&dev->dev.rom, name, st.st_size);
> +    memory_region_set_owner(&dev->dev.rom, OBJECT(dev));
>      vmstate_register_ram(&dev->dev.rom, &dev->dev.qdev);
>      ptr = memory_region_get_ram_ptr(&dev->dev.rom);
>      memset(ptr, 0xff, st.st_size);
> 

Not needed.  I followed the flow more closely and (unlike vfio) all
memory_region_add_subregion precede pci_register_bar here.

Paolo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v2 06/17] sysbus: add sysbus_pass_mmio
  2013-06-04 13:24         ` Paolo Bonzini
  2013-06-04 14:11           ` Peter Maydell
@ 2013-06-04 16:50           ` Paolo Bonzini
  2013-06-04 17:47             ` Alex Williamson
  1 sibling, 1 reply; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 16:50 UTC (permalink / raw)
  To: Peter Maydell, Alex Williamson; +Cc: qemu-devel

Il 04/06/2013 15:24, Paolo Bonzini ha scritto:
> Il 04/06/2013 14:36, Peter Maydell ha scritto:
>> On 4 June 2013 13:31, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> Il 04/06/2013 14:24, Peter Maydell ha scritto:
>>>> On 4 June 2013 13:13, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>> This is much less flexible than just using sysbus_mmio_get_region(),
>>>> because it only lets you pass the whole set of MMIOs from the
>>>> other device through, not just the ones you want.
>>>
>>> How is this different from sysbus_pass_irq?
>>
>> sysbus_pass_irq is also an annoyingly inflexible function.
>> With MMIOs we have the advantage of being able to do better.
> 
> I prefer consistency to useless flexibility.
> 
> The day someone will need it, they can add sysbus_pass_one_{irq,mmio}.
> 
>>>> Please just make reference counting work properly with passing
>>>> MemoryRegion*s around.
>>>
>>> Do you have any idea that doesn't require touch 800 invocation of the
>>> region creation functions?
>>
>> I think that would be a straightforward and easy to understand
>> way to define the ownership rules so I would much rather we
>> did that. I really don't like the way your current patch
>> is doing something complicated in an attempt to avoid this.
> 
> They are straightforward, documented, and the wide majority of the
> devices need not care at all about them.  By contrast, changing 800
> invocations of the functions would be impossible to review seriously, it
> would have to be redone when boards are qdev/QOM-ified, would be worse
> for submitters of new boards.
> 
> There are an order of magnitude less calls to memory_region_set_owner
> than to memory_region_init_*.  Changing four places suffices to get
> ownership for 97% of the devices (309 files in hw/ call
> memory_region_init*, 9 devices call memory_region_set_owner):
> 
>   hw/core/sysbus.c:1
>   hw/isa/isa-bus.c:1
>   hw/pci/pci.c:1
>   ioport.c:2
> 
> Of the remaining calls, 2/3 of them are concentrated in a handful of
> devices:
> 
>   hw/display/cirrus_vga.c:7
>   hw/display/vga.c:4
>   hw/i386/kvm/pci-assign.c:7
>   hw/misc/vfio.c:8

A closer look at the code, and a better grep command, changed this down to:

    hw/display/cirrus_vga.c:2
    hw/display/qxl.c:1
    hw/display/vga-isa.c:1
    hw/display/vga.c:6
    hw/misc/vfio.c:8

(plus the ones quoted below) where all the calls in cirrus, qxl and vfio
could be removed realtively easily.  In particular, VGA card
implementations do not use pci_register_vga yet, because it's a new API.

So, with further refactoring, it could be brought down to 1 or 2 calls
in hw/display/vga.c and one in vga-isa.c.  hw/display/vga.c needs the
memory_region_set_owner calls, both because of optimization tricks, and
because the code tries to cover both ISA and PCI.  vga-isa.c needs it
because VGAs are special ISA devices, the only ones that do MMIO.

Now, doing all the refactoring may not be worthwhile, but it shows that
the abstraction is even less leaky than it sounds.

I asked Alex Williamson to read the thread and share his opinion.
Interestingly, he had a different mental model of building the memory
regions (passing them to PCI core early rather than late, and that's why
VFIO needed 8 calls in this series).  So I believe his input will be useful.

Paolo

> and all the others could probably be refactored away, and are all for PC
> devices, other targets are unaffected (I did review them and sysbus
> catches everything):
> 
>   hw/acpi/ich9.c:1
>   hw/acpi/piix4.c:4
>   hw/char/serial-pci.c:1
>   hw/isa/apm.c:1
>   hw/misc/pc-testdev.c:4
> 
> 
> To me, it seems a pretty good abstraction.
> 
> Paolo
> 
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v2 06/17] sysbus: add sysbus_pass_mmio
  2013-06-04 16:50           ` Paolo Bonzini
@ 2013-06-04 17:47             ` Alex Williamson
  2013-06-04 19:11               ` Paolo Bonzini
  0 siblings, 1 reply; 36+ messages in thread
From: Alex Williamson @ 2013-06-04 17:47 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Peter Maydell, qemu-devel

On Tue, 2013-06-04 at 18:50 +0200, Paolo Bonzini wrote:
> Il 04/06/2013 15:24, Paolo Bonzini ha scritto:
> > Il 04/06/2013 14:36, Peter Maydell ha scritto:
> >> On 4 June 2013 13:31, Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>> Il 04/06/2013 14:24, Peter Maydell ha scritto:
> >>>> On 4 June 2013 13:13, Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>>> This is much less flexible than just using sysbus_mmio_get_region(),
> >>>> because it only lets you pass the whole set of MMIOs from the
> >>>> other device through, not just the ones you want.
> >>>
> >>> How is this different from sysbus_pass_irq?
> >>
> >> sysbus_pass_irq is also an annoyingly inflexible function.
> >> With MMIOs we have the advantage of being able to do better.
> > 
> > I prefer consistency to useless flexibility.
> > 
> > The day someone will need it, they can add sysbus_pass_one_{irq,mmio}.
> > 
> >>>> Please just make reference counting work properly with passing
> >>>> MemoryRegion*s around.
> >>>
> >>> Do you have any idea that doesn't require touch 800 invocation of the
> >>> region creation functions?
> >>
> >> I think that would be a straightforward and easy to understand
> >> way to define the ownership rules so I would much rather we
> >> did that. I really don't like the way your current patch
> >> is doing something complicated in an attempt to avoid this.
> > 
> > They are straightforward, documented, and the wide majority of the
> > devices need not care at all about them.  By contrast, changing 800
> > invocations of the functions would be impossible to review seriously, it
> > would have to be redone when boards are qdev/QOM-ified, would be worse
> > for submitters of new boards.
> > 
> > There are an order of magnitude less calls to memory_region_set_owner
> > than to memory_region_init_*.  Changing four places suffices to get
> > ownership for 97% of the devices (309 files in hw/ call
> > memory_region_init*, 9 devices call memory_region_set_owner):
> > 
> >   hw/core/sysbus.c:1
> >   hw/isa/isa-bus.c:1
> >   hw/pci/pci.c:1
> >   ioport.c:2
> > 
> > Of the remaining calls, 2/3 of them are concentrated in a handful of
> > devices:
> > 
> >   hw/display/cirrus_vga.c:7
> >   hw/display/vga.c:4
> >   hw/i386/kvm/pci-assign.c:7
> >   hw/misc/vfio.c:8
> 
> A closer look at the code, and a better grep command, changed this down to:
> 
>     hw/display/cirrus_vga.c:2
>     hw/display/qxl.c:1
>     hw/display/vga-isa.c:1
>     hw/display/vga.c:6
>     hw/misc/vfio.c:8
> 
> (plus the ones quoted below) where all the calls in cirrus, qxl and vfio
> could be removed realtively easily.  In particular, VGA card
> implementations do not use pci_register_vga yet, because it's a new API.
> 
> So, with further refactoring, it could be brought down to 1 or 2 calls
> in hw/display/vga.c and one in vga-isa.c.  hw/display/vga.c needs the
> memory_region_set_owner calls, both because of optimization tricks, and
> because the code tries to cover both ISA and PCI.  vga-isa.c needs it
> because VGAs are special ISA devices, the only ones that do MMIO.
> 
> Now, doing all the refactoring may not be worthwhile, but it shows that
> the abstraction is even less leaky than it sounds.
> 
> I asked Alex Williamson to read the thread and share his opinion.
> Interestingly, he had a different mental model of building the memory
> regions (passing them to PCI core early rather than late, and that's why
> VFIO needed 8 calls in this series).  So I believe his input will be useful.

We'll see about that ;)  It's true that it's simply a mental model of
doing the required steps, then optimizing that makes vfio need a
sprinkling of set ownership calls.  Paolo, your patch to move the
PCI/VGA registration later solves this and completely hides memory
region ownership from vfio.  That's great, but as Peter is arguing,
leaves a hole that I'm not even aware that an owner is required for a
memory region and the API still leaves me lots of opportunities to get
it wrong.  So, I have to go back to Rusty's API design guidelines that
an API should difficult to use incorrectly.  From what I see, I'm not
sure we have that here.  An ugly compromise might be a runtime checks
for orphan memory regions after a device is initialized, but that has
it's own set of problems.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v2 06/17] sysbus: add sysbus_pass_mmio
  2013-06-04 17:47             ` Alex Williamson
@ 2013-06-04 19:11               ` Paolo Bonzini
  0 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-04 19:11 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Peter Maydell, qemu-devel

Il 04/06/2013 19:47, Alex Williamson ha scritto:
> We'll see about that ;)

At the very least you broke the tie. :)

> It's true that it's simply a mental model of
> doing the required steps, then optimizing that makes vfio need a
> sprinkling of set ownership calls.  Paolo, your patch to move the
> PCI/VGA registration later solves this and completely hides memory
> region ownership from vfio.  That's great, but as Peter is arguing,
> leaves a hole that I'm not even aware that an owner is required for a
> memory region and the API still leaves me lots of opportunities to get
> it wrong.  So, I have to go back to Rusty's API design guidelines that
> an API should difficult to use incorrectly.  From what I see, I'm not
> sure we have that here.  An ugly compromise might be a runtime checks
> for orphan memory regions after a device is initialized, but that has
> it's own set of problems.  Thanks,

Then I'll hunt for the 800 owners.  In the meanwhile, patch
2/3/4/14/15/16/17 won't change from this series to the next, so a review
of those is welcome. :)

Paolo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v2 03/17] memory: add ref/unref calls
  2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 03/17] memory: add ref/unref calls Paolo Bonzini
@ 2013-06-13  6:28   ` Alexey Kardashevskiy
  2013-06-13  9:02     ` Alexey Kardashevskiy
  0 siblings, 1 reply; 36+ messages in thread
From: Alexey Kardashevskiy @ 2013-06-13  6:28 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: peter.maydell, qemu-devel

Hi!

I do not know how (yet) but this patch breaks qtest on x86 (I bisected it):


make check-qtest V=1
QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$((RANDOM % 255 + 1))} gtester -k
--verbose -m=quick tests/fdc-test tests/ide-test tests/hd-geo-test
tests/rtc-test tests/i440fx-test tests/fw_cfg-test
TEST: tests/fdc-test... (pid=13049)
Broken pipe
FAIL: tests/fdc-test
TEST: tests/ide-test... (pid=13053)
  /x86_64/ide/identify:
Broken pipe
FAIL
GTester: last random seed: R02S2f8a8fd53ff256765db44cefb0a920ce
(pid=13057)
  /x86_64/ide/bmdma/setup:
Broken pipe
FAIL
GTester: last random seed: R02S0cec5d222cfd196e6e839e06d7ddde89
(pid=13061)
  /x86_64/ide/bmdma/simple_rw:                                         FAIL
GTester: last random seed: R02S46a30a1ccd33dc104919118330810a85
(pid=13062)
  /x86_64/ide/bmdma/short_prdt:                                        FAIL
GTester: last random seed: R02S19fdcc95895b870371ed5ddcc8b77eda
(pid=13063)

[...]


On 06/04/2013 10:13 PM, Paolo Bonzini wrote:
> Add ref/unref calls at the following places:
> 
> - places where memory regions are stashed by a listener and
>   used outside the BQL (including in Xen or KVM).
> 
> - memory_region_find callsites
> 
> - creation of aliases and containers (only the aliased/contained
>   region gets a reference to avoid loops)
> 
> - around calls to del_subregion/add_subregion, where the region
>   could disappear after the first call
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  exec.c                                |  6 +++++-
>  hw/core/loader.c                      |  1 +
>  hw/display/exynos4210_fimd.c          |  6 ++++++
>  hw/display/framebuffer.c              | 12 +++++++-----
>  hw/i386/kvmvapic.c                    |  1 +
>  hw/misc/vfio.c                        |  2 ++
>  hw/virtio/dataplane/hostmem.c         |  7 +++++++
>  hw/virtio/vhost.c                     |  2 ++
>  hw/virtio/virtio-balloon.c            |  1 +
>  hw/xen/xen_pt.c                       |  4 ++++
>  include/hw/virtio/dataplane/hostmem.h |  1 +
>  kvm-all.c                             |  2 ++
>  memory.c                              | 20 ++++++++++++++++++++
>  target-arm/kvm.c                      |  2 ++
>  target-sparc/mmu_helper.c             |  1 +
>  xen-all.c                             |  2 ++
>  16 files changed, 64 insertions(+), 6 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index 8909478..ba50e8d 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -815,12 +815,16 @@ static uint16_t phys_section_add(MemoryRegionSection *section)
>                                  phys_sections_nb_alloc);
>      }
>      phys_sections[phys_sections_nb] = *section;
> +    memory_region_ref(section->mr);
>      return phys_sections_nb++;
>  }
>  
>  static void phys_sections_clear(void)
>  {
> -    phys_sections_nb = 0;
> +    while (phys_sections_nb > 0) {
> +        MemoryRegionSection *section = &phys_sections[--phys_sections_nb];
> +        memory_region_unref(section->mr);
> +    }
>  }
>  
>  static void register_subpage(AddressSpaceDispatch *d, MemoryRegionSection *section)
> diff --git a/hw/core/loader.c b/hw/core/loader.c
> index 3a60cbe..44d8714 100644
> --- a/hw/core/loader.c
> +++ b/hw/core/loader.c
> @@ -727,6 +727,7 @@ int rom_load_all(void)
>          addr += rom->romsize;
>          section = memory_region_find(get_system_memory(), rom->addr, 1);
>          rom->isrom = int128_nz(section.size) && memory_region_is_rom(section.mr);
> +        memory_region_unref(section.mr);
>      }
>      qemu_register_reset(rom_reset, NULL);
>      roms_loaded = 1;
> diff --git a/hw/display/exynos4210_fimd.c b/hw/display/exynos4210_fimd.c
> index 0da00a9..f44c4a6 100644
> --- a/hw/display/exynos4210_fimd.c
> +++ b/hw/display/exynos4210_fimd.c
> @@ -1126,6 +1126,11 @@ static void fimd_update_memory_section(Exynos4210fimdState *s, unsigned win)
>      /* Total number of bytes of virtual screen used by current window */
>      w->fb_len = fb_mapped_len = (w->virtpage_width + w->virtpage_offsize) *
>              (w->rightbot_y - w->lefttop_y + 1);
> +
> +    /* TODO: add .exit and unref the region there.  Not needed yet since sysbus
> +     * does not support hot-unplug.
> +     */
> +    memory_region_unref(w->mem_section.mr);
>      w->mem_section = memory_region_find(sysbus_address_space(&s->busdev),
>              fb_start_addr, w->fb_len);
>      assert(w->mem_section.mr);
> @@ -1154,6 +1159,7 @@ static void fimd_update_memory_section(Exynos4210fimdState *s, unsigned win)
>      return;
>  
>  error_return:
> +    memory_region_unref(w->mem_section.mr);
>      w->mem_section.mr = NULL;
>      w->mem_section.size = int128_zero();
>      w->host_fb_addr = NULL;
> diff --git a/hw/display/framebuffer.c b/hw/display/framebuffer.c
> index 49c9e59..4546e42 100644
> --- a/hw/display/framebuffer.c
> +++ b/hw/display/framebuffer.c
> @@ -54,11 +54,11 @@ void framebuffer_update_display(
>      src_len = src_width * rows;
>  
>      mem_section = memory_region_find(address_space, base, src_len);
> +    mem = mem_section.mr;
>      if (int128_get64(mem_section.size) != src_len ||
>              !memory_region_is_ram(mem_section.mr)) {
> -        return;
> +        goto out;
>      }
> -    mem = mem_section.mr;
>      assert(mem);
>      assert(mem_section.offset_within_address_space == base);
>  
> @@ -68,10 +68,10 @@ void framebuffer_update_display(
>         but it's not really worth it as dirty flag tracking will probably
>         already have failed above.  */
>      if (!src_base)
> -        return;
> +        goto out;
>      if (src_len != src_width * rows) {
>          cpu_physical_memory_unmap(src_base, src_len, 0, 0);
> -        return;
> +        goto out;
>      }
>      src = src_base;
>      dest = surface_data(ds);
> @@ -102,10 +102,12 @@ void framebuffer_update_display(
>      }
>      cpu_physical_memory_unmap(src_base, src_len, 0, 0);
>      if (first < 0) {
> -        return;
> +        goto out;
>      }
>      memory_region_reset_dirty(mem, mem_section.offset_within_region, src_len,
>                                DIRTY_MEMORY_VGA);
>      *first_row = first;
>      *last_row = last;
> +out:
> +    memory_region_unref(mem);
>  }
> diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
> index 655483b..e375c1c 100644
> --- a/hw/i386/kvmvapic.c
> +++ b/hw/i386/kvmvapic.c
> @@ -605,6 +605,7 @@ static void vapic_map_rom_writable(VAPICROMState *s)
>                               rom_size);
>      memory_region_add_subregion_overlap(as, rom_paddr, &s->rom, 1000);
>      s->rom_mapped_writable = true;
> +    memory_region_unref(section.mr);
>  }
>  
>  static int vapic_prepare(VAPICROMState *s)
> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
> index 52fb036..a1f5803 100644
> --- a/hw/misc/vfio.c
> +++ b/hw/misc/vfio.c
> @@ -1969,6 +1969,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
>      DPRINTF("region_add %"HWADDR_PRIx" - %"HWADDR_PRIx" [%p]\n",
>              iova, end - 1, vaddr);
>  
> +    memory_region_ref(section->mr);
>      ret = vfio_dma_map(container, iova, end - iova, vaddr, section->readonly);
>      if (ret) {
>          error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
> @@ -2010,6 +2011,7 @@ static void vfio_listener_region_del(MemoryListener *listener,
>              iova, end - 1);
>  
>      ret = vfio_dma_unmap(container, iova, end - iova);
> +    memory_region_unref(section->mr);
>      if (ret) {
>          error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
>                       "0x%"HWADDR_PRIx") = %d (%m)",
> diff --git a/hw/virtio/dataplane/hostmem.c b/hw/virtio/dataplane/hostmem.c
> index 7e46723..901d98b 100644
> --- a/hw/virtio/dataplane/hostmem.c
> +++ b/hw/virtio/dataplane/hostmem.c
> @@ -64,8 +64,12 @@ out:
>  static void hostmem_listener_commit(MemoryListener *listener)
>  {
>      HostMem *hostmem = container_of(listener, HostMem, listener);
> +    int i;
>  
>      qemu_mutex_lock(&hostmem->current_regions_lock);
> +    for (i = 0; i < hostmem->num_current_regions; i++) {
> +        memory_region_unref(hostmem->current_regions[i].mr);
> +    }
>      g_free(hostmem->current_regions);
>      hostmem->current_regions = hostmem->new_regions;
>      hostmem->num_current_regions = hostmem->num_new_regions;
> @@ -92,8 +96,11 @@ static void hostmem_append_new_region(HostMem *hostmem,
>          .guest_addr = section->offset_within_address_space,
>          .size = int128_get64(section->size),
>          .readonly = section->readonly,
> +        .mr = section->mr,
>      };
>      hostmem->num_new_regions++;
> +
> +    memory_region_ref(section->mr);
>  }
>  
>  static void hostmem_listener_append_region(MemoryListener *listener,
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index baf84ea..96ab625 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -497,6 +497,7 @@ static void vhost_region_add(MemoryListener *listener,
>      dev->mem_sections = g_renew(MemoryRegionSection, dev->mem_sections,
>                                  dev->n_mem_sections);
>      dev->mem_sections[dev->n_mem_sections - 1] = *section;
> +    memory_region_ref(section->mr);
>      vhost_set_memory(listener, section, true);
>  }
>  
> @@ -512,6 +513,7 @@ static void vhost_region_del(MemoryListener *listener,
>      }
>  
>      vhost_set_memory(listener, section, false);
> +    memory_region_unref(section->mr);
>      for (i = 0; i < dev->n_mem_sections; ++i) {
>          if (dev->mem_sections[i].offset_within_address_space
>              == section->offset_within_address_space) {
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index a27051c..3fa72a9 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -205,6 +205,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>              addr = section.offset_within_region;
>              balloon_page(memory_region_get_ram_ptr(section.mr) + addr,
>                           !!(vq == s->dvq));
> +            memory_region_unref(section.mr);
>          }
>  
>          virtqueue_push(vq, &elem, offset);
> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> index c199818..be1fd52 100644
> --- a/hw/xen/xen_pt.c
> +++ b/hw/xen/xen_pt.c
> @@ -606,6 +606,7 @@ static void xen_pt_region_add(MemoryListener *l, MemoryRegionSection *sec)
>      XenPCIPassthroughState *s = container_of(l, XenPCIPassthroughState,
>                                               memory_listener);
>  
> +    memory_region_ref(sec->mr);
>      xen_pt_region_update(s, sec, true);
>  }
>  
> @@ -615,6 +616,7 @@ static void xen_pt_region_del(MemoryListener *l, MemoryRegionSection *sec)
>                                               memory_listener);
>  
>      xen_pt_region_update(s, sec, false);
> +    memory_region_unref(sec->mr);
>  }
>  
>  static void xen_pt_io_region_add(MemoryListener *l, MemoryRegionSection *sec)
> @@ -622,6 +624,7 @@ static void xen_pt_io_region_add(MemoryListener *l, MemoryRegionSection *sec)
>      XenPCIPassthroughState *s = container_of(l, XenPCIPassthroughState,
>                                               io_listener);
>  
> +    memory_region_ref(sec->mr);
>      xen_pt_region_update(s, sec, true);
>  }
>  
> @@ -631,6 +634,7 @@ static void xen_pt_io_region_del(MemoryListener *l, MemoryRegionSection *sec)
>                                               io_listener);
>  
>      xen_pt_region_update(s, sec, false);
> +    memory_region_unref(sec->mr);
>  }
>  
>  static const MemoryListener xen_pt_memory_listener = {
> diff --git a/include/hw/virtio/dataplane/hostmem.h b/include/hw/virtio/dataplane/hostmem.h
> index b2cf093..2810f4b 100644
> --- a/include/hw/virtio/dataplane/hostmem.h
> +++ b/include/hw/virtio/dataplane/hostmem.h
> @@ -18,6 +18,7 @@
>  #include "qemu/thread.h"
>  
>  typedef struct {
> +    MemoryRegion *mr;
>      void *host_addr;
>      hwaddr guest_addr;
>      uint64_t size;
> diff --git a/kvm-all.c b/kvm-all.c
> index e6b262f..91aa7ff 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -788,6 +788,7 @@ static void kvm_set_phys_mem(MemoryRegionSection *section, bool add)
>  static void kvm_region_add(MemoryListener *listener,
>                             MemoryRegionSection *section)
>  {
> +    memory_region_ref(section->mr);
>      kvm_set_phys_mem(section, true);
>  }
>  
> @@ -795,6 +796,7 @@ static void kvm_region_del(MemoryListener *listener,
>                             MemoryRegionSection *section)
>  {
>      kvm_set_phys_mem(section, false);
> +    memory_region_unref(section->mr);
>  }
>  
>  static void kvm_log_sync(MemoryListener *listener,
> diff --git a/memory.c b/memory.c
> index a136a77..cfce42b 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -147,6 +147,7 @@ static bool memory_listener_match(MemoryListener *listener,
>          }                                                               \
>      } while (0)
>  
> +/* No need to ref/unref .mr, the FlatRange keeps it alive.  */
>  #define MEMORY_LISTENER_UPDATE_REGION(fr, as, dir, callback)            \
>      MEMORY_LISTENER_CALL(callback, dir, (&(MemoryRegionSection) {       \
>          .mr = (fr)->mr,                                                 \
> @@ -262,11 +263,17 @@ static void flatview_insert(FlatView *view, unsigned pos, FlatRange *range)
>      memmove(view->ranges + pos + 1, view->ranges + pos,
>              (view->nr - pos) * sizeof(FlatRange));
>      view->ranges[pos] = *range;
> +    memory_region_ref(range->mr);
>      ++view->nr;
>  }
>  
>  static void flatview_destroy(FlatView *view)
>  {
> +    int i;
> +
> +    for (i = 0; i < view->nr; i++) {
> +        memory_region_unref(view->ranges[i].mr);
> +    }
>      g_free(view->ranges);
>  }
>  
> @@ -796,6 +803,11 @@ static void memory_region_destructor_ram(MemoryRegion *mr)
>      qemu_ram_free(mr->ram_addr);
>  }
>  
> +static void memory_region_destructor_alias(MemoryRegion *mr)
> +{
> +    memory_region_unref(mr->alias);
> +}
> +
>  static void memory_region_destructor_ram_from_ptr(MemoryRegion *mr)
>  {
>      qemu_ram_free_from_ptr(mr->ram_addr);
> @@ -1043,6 +1055,8 @@ void memory_region_init_alias(MemoryRegion *mr,
>                                uint64_t size)
>  {
>      memory_region_init(mr, name, size);
> +    memory_region_ref(orig);
> +    mr->destructor = memory_region_destructor_alias;
>      mr->alias = orig;
>      mr->alias_offset = offset;
>  }
> @@ -1457,6 +1471,7 @@ static void memory_region_add_subregion_common(MemoryRegion *mr,
>      memory_region_transaction_begin();
>  
>      assert(!subregion->parent);
> +    memory_region_ref(subregion);
>      subregion->parent = mr;
>      subregion->addr = offset;
>      QTAILQ_FOREACH(other, &mr->subregions, subregions_link) {
> @@ -1519,6 +1534,7 @@ void memory_region_del_subregion(MemoryRegion *mr,
>      assert(subregion->parent == mr);
>      subregion->parent = NULL;
>      QTAILQ_REMOVE(&mr->subregions, subregion, subregions_link);
> +    memory_region_unref(subregion);
>      memory_region_update_pending |= mr->enabled && subregion->enabled;
>      memory_region_transaction_commit();
>  }
> @@ -1546,12 +1562,14 @@ void memory_region_set_address(MemoryRegion *mr, hwaddr addr)
>      }
>  
>      memory_region_transaction_begin();
> +    memory_region_ref(mr);
>      memory_region_del_subregion(parent, mr);
>      if (may_overlap) {
>          memory_region_add_subregion_overlap(parent, addr, mr, priority);
>      } else {
>          memory_region_add_subregion(parent, addr, mr);
>      }
> +    memory_region_unref(mr);
>      memory_region_transaction_commit();
>  }
>  
> @@ -1629,6 +1647,8 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr,
>      ret.size = range.size;
>      ret.offset_within_address_space = int128_get64(range.start);
>      ret.readonly = fr->readonly;
> +    memory_region_ref(ret.mr);
> +
>      return ret;
>  }
>  
> diff --git a/target-arm/kvm.c b/target-arm/kvm.c
> index b7bdc03..b9051a4 100644
> --- a/target-arm/kvm.c
> +++ b/target-arm/kvm.c
> @@ -127,6 +127,7 @@ static void kvm_arm_machine_init_done(Notifier *notifier, void *data)
>                  abort();
>              }
>          }
> +        memory_region_unref(kd->mr);
>          g_free(kd);
>      }
>  }
> @@ -152,6 +153,7 @@ void kvm_arm_register_device(MemoryRegion *mr, uint64_t devid)
>      kd->kda.id = devid;
>      kd->kda.addr = -1;
>      QSLIST_INSERT_HEAD(&kvm_devices_head, kd, entries);
> +    memory_region_ref(kd->mr);
>  }
>  
>  typedef struct Reg {
> diff --git a/target-sparc/mmu_helper.c b/target-sparc/mmu_helper.c
> index 3983c96..740cbe8 100644
> --- a/target-sparc/mmu_helper.c
> +++ b/target-sparc/mmu_helper.c
> @@ -845,6 +845,7 @@ hwaddr cpu_get_phys_page_debug(CPUSPARCState *env, target_ulong addr)
>          }
>      }
>      section = memory_region_find(get_system_memory(), phys_addr, 1);
> +    memory_region_unref(section.mr);
>      if (!int128_nz(section.size)) {
>          return -1;
>      }
> diff --git a/xen-all.c b/xen-all.c
> index cd520b1..764741a 100644
> --- a/xen-all.c
> +++ b/xen-all.c
> @@ -459,6 +459,7 @@ static void xen_set_memory(struct MemoryListener *listener,
>  static void xen_region_add(MemoryListener *listener,
>                             MemoryRegionSection *section)
>  {
> +    memory_region_ref(section->mr);
>      xen_set_memory(listener, section, true);
>  }
>  
> @@ -466,6 +467,7 @@ static void xen_region_del(MemoryListener *listener,
>                             MemoryRegionSection *section)
>  {
>      xen_set_memory(listener, section, false);
> +    memory_region_unref(section->mr);
>  }
>  
>  static void xen_sync_dirty_bitmap(XenIOState *state,
> 


-- 
Alexey

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v2 03/17] memory: add ref/unref calls
  2013-06-13  6:28   ` Alexey Kardashevskiy
@ 2013-06-13  9:02     ` Alexey Kardashevskiy
  2013-06-14 10:09       ` Alexey Kardashevskiy
  0 siblings, 1 reply; 36+ messages in thread
From: Alexey Kardashevskiy @ 2013-06-13  9:02 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: peter.maydell, qemu-devel

Fails on qtest_init() in tests/libqtest.c, "Broken pipe". I cannot easily
see what is wrong here with this patch but it is 100% reproducible on x86_64
 :(


On 06/13/2013 04:28 PM, Alexey Kardashevskiy wrote:
> Hi!
> 
> I do not know how (yet) but this patch breaks qtest on x86 (I bisected it):
> 
> 
> make check-qtest V=1
> QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
> MALLOC_PERTURB_=${MALLOC_PERTURB_:-$((RANDOM % 255 + 1))} gtester -k
> --verbose -m=quick tests/fdc-test tests/ide-test tests/hd-geo-test
> tests/rtc-test tests/i440fx-test tests/fw_cfg-test
> TEST: tests/fdc-test... (pid=13049)
> Broken pipe
> FAIL: tests/fdc-test
> TEST: tests/ide-test... (pid=13053)
>   /x86_64/ide/identify:
> Broken pipe
> FAIL
> GTester: last random seed: R02S2f8a8fd53ff256765db44cefb0a920ce
> (pid=13057)
>   /x86_64/ide/bmdma/setup:
> Broken pipe
> FAIL
> GTester: last random seed: R02S0cec5d222cfd196e6e839e06d7ddde89
> (pid=13061)
>   /x86_64/ide/bmdma/simple_rw:                                         FAIL
> GTester: last random seed: R02S46a30a1ccd33dc104919118330810a85
> (pid=13062)
>   /x86_64/ide/bmdma/short_prdt:                                        FAIL
> GTester: last random seed: R02S19fdcc95895b870371ed5ddcc8b77eda
> (pid=13063)
> 
> [...]
> 
> 
> On 06/04/2013 10:13 PM, Paolo Bonzini wrote:
>> Add ref/unref calls at the following places:
>>
>> - places where memory regions are stashed by a listener and
>>   used outside the BQL (including in Xen or KVM).
>>
>> - memory_region_find callsites
>>
>> - creation of aliases and containers (only the aliased/contained
>>   region gets a reference to avoid loops)
>>
>> - around calls to del_subregion/add_subregion, where the region
>>   could disappear after the first call
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  exec.c                                |  6 +++++-
>>  hw/core/loader.c                      |  1 +
>>  hw/display/exynos4210_fimd.c          |  6 ++++++
>>  hw/display/framebuffer.c              | 12 +++++++-----
>>  hw/i386/kvmvapic.c                    |  1 +
>>  hw/misc/vfio.c                        |  2 ++
>>  hw/virtio/dataplane/hostmem.c         |  7 +++++++
>>  hw/virtio/vhost.c                     |  2 ++
>>  hw/virtio/virtio-balloon.c            |  1 +
>>  hw/xen/xen_pt.c                       |  4 ++++
>>  include/hw/virtio/dataplane/hostmem.h |  1 +
>>  kvm-all.c                             |  2 ++
>>  memory.c                              | 20 ++++++++++++++++++++
>>  target-arm/kvm.c                      |  2 ++
>>  target-sparc/mmu_helper.c             |  1 +
>>  xen-all.c                             |  2 ++
>>  16 files changed, 64 insertions(+), 6 deletions(-)


-- 
Alexey

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v2 03/17] memory: add ref/unref calls
  2013-06-13  9:02     ` Alexey Kardashevskiy
@ 2013-06-14 10:09       ` Alexey Kardashevskiy
  2013-06-14 13:56         ` Paolo Bonzini
  2013-06-25  8:49         ` Paolo Bonzini
  0 siblings, 2 replies; 36+ messages in thread
From: Alexey Kardashevskiy @ 2013-06-14 10:09 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: peter.maydell, qemu-devel, Paul Mackerras, Avi Kivity

Hi.

Ok. Back to the bug with this patch. The initial problem with this patch is
that "make check" fails.

Please help with subpages.

It turned out that tests use MALLOC_PERTURB_ which is normally off. Who
does not know - this is a way to tell glibc to fill released memory with
some value and then debug accesses to released memory. Some bright mind
made it random what confuses a lot (and btw valgrind found nothing :-/ ).
So I spend some time before figured out how to reproduce it outside of the
qtest thingy.

The tree is qemu.org/master "bd5c51e Michael Roth qemu-char: don't issue
CHR_EVENT_OPEN in a BH" + replayed patches till the one from $subj on top
of it. QEMU is configured as "configure --target-list=x86_64-softmmu".

The magic is:

export MALLOC_PERTURB_=123
nc -l -U ~/qtest-16318.sock &
nc -l -U ~/qtest-16318.qmp &
x86_64-softmmu/qemu-system-x86_64  \
	-qtest unix:/home/alexey/qtest-16318.sock,nowait \
	-qtest-log /dev/null \
	-qmp unix:/home/alexey/qtest-16318.qmp,nowait \
	-pidfile ~/qtest-16318.pid -machine accel=qtest -vnc none

Immediate crash at (the very last backtrace in this mail is that crash).

x86_cpu_apic_realize() creates a subpage for IO:


#0  aik_dbg_start (f=f@entry=0x5555558c4b41 "subpage_init",
l=l@entry=0x6a0, mr=mr@entry=0x555556556d30)
    at /home/alexey/pcipassthru/qemu-impreza/hw/i386/pc.c:1297
#1  0x0000555555774299 in subpage_init (base=0x0, as=0x5555564a9260) at
/home/alexey/pcipassthru/qemu-impreza/exec.c:1696
#2  register_subpage (d=d@entry=0x555556523d00,
section=section@entry=0x7fffffffd620)
    at /home/alexey/pcipassthru/qemu-impreza/exec.c:845
#3  0x000055555577447d in mem_add (listener=0x555556523d08,
section=<optimized out>)
    at /home/alexey/pcipassthru/qemu-impreza/exec.c:881
#4  0x00005555557c2d69 in address_space_update_topology_pass
(as=as@entry=0x5555564a9260, adding=adding@entry=0x1, old_view=...,
    new_view=...) at /home/alexey/pcipassthru/qemu-impreza/memory.c:751
#5  0x00005555557c64b8 in address_space_update_topology (as=0x5555564a9260)
at /home/alexey/pcipassthru/qemu-impreza/memory.c:766
#6  memory_region_transaction_commit () at
/home/alexey/pcipassthru/qemu-impreza/memory.c:790
#7  0x00005555557c79cd in memory_region_add_subregion_common
(mr=0x555556523c30, offset=offset@entry=0x7e,
    subregion=subregion@entry=0x555556550a28) at
/home/alexey/pcipassthru/qemu-impreza/memory.c:1518
#8  0x00005555557c7ae8 in memory_region_add_subregion (mr=<optimized out>,
offset=offset@entry=0x7e,
    subregion=subregion@entry=0x555556550a28) at
/home/alexey/pcipassthru/qemu-impreza/memory.c:1527
#9  0x0000555555663995 in sysbus_add_io (dev=dev@entry=0x55555654e700,
addr=addr@entry=0x7e, mem=mem@entry=0x555556550a28)
    at /home/alexey/pcipassthru/qemu-impreza/hw/core/sysbus.c:242
#10 0x000055555579cfce in vapic_init (dev=0x55555654e700) at
/home/alexey/pcipassthru/qemu-impreza/hw/i386/kvmvapic.c:707
#11 0x0000555555661651 in device_realize (dev=0x55555654e700,
err=0x7fffffffda40)
    at /home/alexey/pcipassthru/qemu-impreza/hw/core/qdev.c:178
#12 0x0000555555662cf3 in device_set_realized (obj=0x55555654e700,
value=0x1, err=0x7fffffffdb50)
    at /home/alexey/pcipassthru/qemu-impreza/hw/core/qdev.c:699
#13 0x000055555573358e in property_set_bool (obj=0x55555654e700,
v=<optimized out>, opaque=0x55555653c1f0, name=<optimized out>,
    errp=0x7fffffffdb50) at
/home/alexey/pcipassthru/qemu-impreza/qom/object.c:1301
#14 0x0000555555736445 in object_property_set_qobject (obj=0x55555654e700,
value=<optimized out>, name=0x555555896553 "realized",
    errp=0x7fffffffdb50) at
/home/alexey/pcipassthru/qemu-impreza/qom/qom-qobject.c:24
#15 0x000055555573525e in object_property_set_bool
(obj=obj@entry=0x55555654e700, value=value@entry=0x1,
    name=name@entry=0x555555896553 "realized", errp=errp@entry=0x7fffffffdb50)
    at /home/alexey/pcipassthru/qemu-impreza/qom/object.c:852
#16 0x0000555555661c3a in qdev_init (dev=dev@entry=0x55555654e700) at
/home/alexey/pcipassthru/qemu-impreza/hw/core/qdev.c:163
#17 0x0000555555661e91 in qdev_init_nofail (dev=dev@entry=0x55555654e700)
    at /home/alexey/pcipassthru/qemu-impreza/hw/core/qdev.c:277
#18 0x0000555555663789 in sysbus_create_varargs
(name=name@entry=0x5555558c73a1 "kvmvapic", addr=addr@entry=0xffffffffffffffff)
    at /home/alexey/pcipassthru/qemu-impreza/hw/core/sysbus.c:157
#19 0x00005555557a4ead in sysbus_create_simple (irq=0x0,
addr=0xffffffffffffffff, name=0x5555558c73a1 "kvmvapic")
    at /home/alexey/pcipassthru/qemu-impreza/include/hw/sysbus.h:75
#20 apic_init_common (dev=0x555556535350) at
/home/alexey/pcipassthru/qemu-impreza/hw/intc/apic_common.c:311
#21 0x0000555555790fb6 in icc_device_realize (dev=0x555556535350,
errp=0x7fffffffdc80)
    at /home/alexey/pcipassthru/qemu-impreza/hw/cpu/icc_bus.c:50
#22 0x0000555555662cf3 in device_set_realized (obj=0x555556535350,
value=0x1, err=0x7fffffffdd90)
    at /home/alexey/pcipassthru/qemu-impreza/hw/core/qdev.c:699
#23 0x000055555573358e in property_set_bool (obj=0x555556535350,
v=<optimized out>, opaque=0x555556535610, name=<optimized out>,
    errp=0x7fffffffdd90) at
/home/alexey/pcipassthru/qemu-impreza/qom/object.c:1301
#24 0x0000555555736445 in object_property_set_qobject (obj=0x555556535350,
value=<optimized out>, name=0x555555896553 "realized",
    errp=0x7fffffffdd90) at
/home/alexey/pcipassthru/qemu-impreza/qom/qom-qobject.c:24
#25 0x000055555573525e in object_property_set_bool
(obj=obj@entry=0x555556535350, value=value@entry=0x1,
    name=name@entry=0x555555896553 "realized", errp=errp@entry=0x7fffffffdd90)
    at /home/alexey/pcipassthru/qemu-impreza/qom/object.c:852
#26 0x0000555555661c3a in qdev_init (dev=0x555556535350) at
/home/alexey/pcipassthru/qemu-impreza/hw/core/qdev.c:163
#27 0x00005555557d9a7c in x86_cpu_apic_realize (errp=0x7fffffffddd0,
cpu=0x55555653df50)
    at /home/alexey/pcipassthru/qemu-impreza/target-i386/cpu.c:2327
#28 x86_cpu_realizefn (dev=0x55555653df50, errp=0x7fffffffde20) at
/home/alexey/pcipassthru/qemu-impreza/target-i386/cpu.c:2397
#29 0x0000555555662cf3 in device_set_realized (obj=0x55555653df50,
value=0x1, err=0x7fffffffdf30)
    at /home/alexey/pcipassthru/qemu-impreza/hw/core/qdev.c:699
#30 0x000055555573358e in property_set_bool (obj=0x55555653df50,
v=<optimized out>, opaque=0x55555652e390, name=<optimized out>,
    errp=0x7fffffffdf30) at
/home/alexey/pcipassthru/qemu-impreza/qom/object.c:1301
---Type <return> to continue, or q <return> to quit---
#31 0x0000555555736445 in object_property_set_qobject (obj=0x55555653df50,
value=<optimized out>, name=0x555555896553 "realized",
    errp=0x7fffffffdf30) at
/home/alexey/pcipassthru/qemu-impreza/qom/qom-qobject.c:24
#32 0x000055555573525e in object_property_set_bool (obj=0x55555653df50,
value=<optimized out>, name=0x555555896553 "realized",
    errp=0x7fffffffdf30) at
/home/alexey/pcipassthru/qemu-impreza/qom/object.c:852
#33 0x000055555579f3b0 in pc_new_cpu (cpu_model=<optimized out>,
apic_id=0x0, icc_bridge=<optimized out>, errp=0x7fffffffdf70)
    at /home/alexey/pcipassthru/qemu-impreza/hw/i386/pc.c:911
#34 0x00005555557a0fc1 in pc_cpus_init (cpu_model=0x5555558c7a2d "qemu64",
cpu_model@entry=0x0,
    icc_bridge=icc_bridge@entry=0x55555652b420) at
/home/alexey/pcipassthru/qemu-impreza/hw/i386/pc.c:964
#35 0x00005555557a129f in pc_init1 (system_memory=0x555556522e60,
system_io=0x555556523c30, ram_size=ram_size@entry=0x8000000,
    boot_device=boot_device@entry=0x555555891aaa "cad",
kernel_filename=kernel_filename@entry=0x0,
    kernel_cmdline=kernel_cmdline@entry=0x5555558d8fb6 "",
initrd_filename=initrd_filename@entry=0x0,
    cpu_model=cpu_model@entry=0x0, pci_enabled=pci_enabled@entry=0x1,
kvmclock_enabled=kvmclock_enabled@entry=0x1)
    at /home/alexey/pcipassthru/qemu-impreza/hw/i386/pc_piix.c:98
#36 0x00005555557a1aea in pc_init_pci (args=<optimized out>) at
/home/alexey/pcipassthru/qemu-impreza/hw/i386/pc_piix.c:242
#37 0x00005555555dcea0 in main (argc=<optimized out>, argv=<optimized out>,
envp=<optimized out>)
    at /home/alexey/pcipassthru/qemu-impreza/vl.c:4307




This subpage is released later due to some magic which I do not understand:


(gdb) bt
#0  aik_dbg (f=f@entry=0x5555558c4c20 "destroy_page_desc", l=l@entry=0x305)
    at /home/alexey/pcipassthru/qemu-impreza/hw/i386/pc.c:1284
#1  0x0000555555773d48 in destroy_page_desc (section_index=<optimized out>)
at /home/alexey/pcipassthru/qemu-impreza/exec.c:773
#2  destroy_l2_mapping (level=0x0, lp=0x555556559e10) at
/home/alexey/pcipassthru/qemu-impreza/exec.c:791
#3  destroy_l2_mapping (lp=0x555556559e10, level=0x0) at
/home/alexey/pcipassthru/qemu-impreza/exec.c:777
#4  0x0000555555773c88 in destroy_l2_mapping (level=0x1, lp=0x555556559610)
at /home/alexey/pcipassthru/qemu-impreza/exec.c:789
#5  destroy_l2_mapping (lp=0x555556559610, level=0x1) at
/home/alexey/pcipassthru/qemu-impreza/exec.c:777
#6  0x0000555555773c88 in destroy_l2_mapping (level=0x2, lp=0x555556558e10)
at /home/alexey/pcipassthru/qemu-impreza/exec.c:789
#7  destroy_l2_mapping (lp=0x555556558e10, level=0x2) at
/home/alexey/pcipassthru/qemu-impreza/exec.c:777
#8  0x0000555555773c88 in destroy_l2_mapping (level=0x3, lp=0x555556523d00)
at /home/alexey/pcipassthru/qemu-impreza/exec.c:789
#9  destroy_l2_mapping (lp=0x555556523d00, level=0x3) at
/home/alexey/pcipassthru/qemu-impreza/exec.c:777
#10 0x0000555555773df8 in destroy_all_mappings (d=0x555556523d00) at
/home/alexey/pcipassthru/qemu-impreza/exec.c:800
#11 mem_begin (listener=0x555556523d08) at
/home/alexey/pcipassthru/qemu-impreza/exec.c:1732
#12 0x00005555557c6168 in memory_region_transaction_commit () at
/home/alexey/pcipassthru/qemu-impreza/memory.c:787
#13 0x00005555557c79cd in memory_region_add_subregion_common
(mr=mr@entry=0x555556522e60, offset=offset@entry=0xfee00000,
    subregion=subregion@entry=0x55555652d7b8) at
/home/alexey/pcipassthru/qemu-impreza/memory.c:1518
#14 0x00005555557c7a72 in memory_region_add_subregion_overlap
(mr=0x555556522e60, offset=0xfee00000, subregion=0x55555652d7b8,
    priority=<optimized out>) at
/home/alexey/pcipassthru/qemu-impreza/memory.c:1537
#15 0x00005555557a1038 in pc_cpus_init (cpu_model=0x5555558c7a2d "qemu64",
cpu_model@entry=0x0,
    icc_bridge=icc_bridge@entry=0x55555652b420) at
/home/alexey/pcipassthru/qemu-impreza/hw/i386/pc.c:976
#16 0x00005555557a129f in pc_init1 (system_memory=0x555556522e60,
system_io=0x555556523c30, ram_size=ram_size@entry=0x8000000,
    boot_device=boot_device@entry=0x555555891aaa "cad",
kernel_filename=kernel_filename@entry=0x0,
    kernel_cmdline=kernel_cmdline@entry=0x5555558d8fb6 "",
initrd_filename=initrd_filename@entry=0x0,
    cpu_model=cpu_model@entry=0x0, pci_enabled=pci_enabled@entry=0x1,
kvmclock_enabled=kvmclock_enabled@entry=0x1)
    at /home/alexey/pcipassthru/qemu-impreza/hw/i386/pc_piix.c:98
#17 0x00005555557a1aea in pc_init_pci (args=<optimized out>) at
/home/alexey/pcipassthru/qemu-impreza/hw/i386/pc_piix.c:242
#18 0x00005555555dcea0 in main (argc=<optimized out>, argv=<optimized out>,
envp=<optimized out>)
    at /home/alexey/pcipassthru/qemu-impreza/vl.c:4307
(gdb)



And - crash:


#0  object_unref (obj=0xa7a7a7a7a7a7a7a7) at
/home/alexey/pcipassthru/qemu-impreza/qom/object.c:691
#1  0x00005555557c505c in memory_region_unref (mr=<optimized out>) at
/home/alexey/pcipassthru/qemu-impreza/memory.c:1172
#2  0x0000555555775953 in phys_sections_clear () at
/home/alexey/pcipassthru/qemu-impreza/exec.c:826
#3  0x0000555555775999 in core_begin (listener=<optimized out>) at
/home/alexey/pcipassthru/qemu-impreza/exec.c:1738
#4  0x00005555557c6168 in memory_region_transaction_commit () at
/home/alexey/pcipassthru/qemu-impreza/memory.c:787
#5  0x00005555557c79cd in memory_region_add_subregion_common
(mr=mr@entry=0x555556522e60, offset=offset@entry=0xfee00000,
    subregion=subregion@entry=0x55555652d7b8) at
/home/alexey/pcipassthru/qemu-impreza/memory.c:1518
#6  0x00005555557c7a72 in memory_region_add_subregion_overlap
(mr=0x555556522e60, offset=0xfee00000, subregion=0x55555652d7b8,
    priority=<optimized out>) at
/home/alexey/pcipassthru/qemu-impreza/memory.c:1537
#7  0x00005555557a1038 in pc_cpus_init (cpu_model=0x5555558c7a2d "qemu64",
cpu_model@entry=0x0,
    icc_bridge=icc_bridge@entry=0x55555652b420) at
/home/alexey/pcipassthru/qemu-impreza/hw/i386/pc.c:976
#8  0x00005555557a129f in pc_init1 (system_memory=0x555556522e60,
system_io=0x555556523c30, ram_size=ram_size@entry=0x8000000,
    boot_device=boot_device@entry=0x555555891aaa "cad",
kernel_filename=kernel_filename@entry=0x0,
    kernel_cmdline=kernel_cmdline@entry=0x5555558d8fb6 "",
initrd_filename=initrd_filename@entry=0x0,
    cpu_model=cpu_model@entry=0x0, pci_enabled=pci_enabled@entry=0x1,
kvmclock_enabled=kvmclock_enabled@entry=0x1)
    at /home/alexey/pcipassthru/qemu-impreza/hw/i386/pc_piix.c:98
#9  0x00005555557a1aea in pc_init_pci (args=<optimized out>) at
/home/alexey/pcipassthru/qemu-impreza/hw/i386/pc_piix.c:242
#10 0x00005555555dcea0 in main (argc=<optimized out>, argv=<optimized out>,
envp=<optimized out>)
    at /home/alexey/pcipassthru/qemu-impreza/vl.c:4307
(gdb)





On 06/13/2013 07:02 PM, Alexey Kardashevskiy wrote:
> Fails on qtest_init() in tests/libqtest.c, "Broken pipe". I cannot easily
> see what is wrong here with this patch but it is 100% reproducible on x86_64
>  :(
> 
> 
> On 06/13/2013 04:28 PM, Alexey Kardashevskiy wrote:
>> Hi!
>>
>> I do not know how (yet) but this patch breaks qtest on x86 (I bisected it):
>>
>>
>> make check-qtest V=1
>> QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
>> MALLOC_PERTURB_=${MALLOC_PERTURB_:-$((RANDOM % 255 + 1))} gtester -k
>> --verbose -m=quick tests/fdc-test tests/ide-test tests/hd-geo-test
>> tests/rtc-test tests/i440fx-test tests/fw_cfg-test
>> TEST: tests/fdc-test... (pid=13049)
>> Broken pipe
>> FAIL: tests/fdc-test
>> TEST: tests/ide-test... (pid=13053)
>>   /x86_64/ide/identify:
>> Broken pipe
>> FAIL
>> GTester: last random seed: R02S2f8a8fd53ff256765db44cefb0a920ce
>> (pid=13057)
>>   /x86_64/ide/bmdma/setup:
>> Broken pipe
>> FAIL
>> GTester: last random seed: R02S0cec5d222cfd196e6e839e06d7ddde89
>> (pid=13061)
>>   /x86_64/ide/bmdma/simple_rw:                                         FAIL
>> GTester: last random seed: R02S46a30a1ccd33dc104919118330810a85
>> (pid=13062)
>>   /x86_64/ide/bmdma/short_prdt:                                        FAIL
>> GTester: last random seed: R02S19fdcc95895b870371ed5ddcc8b77eda
>> (pid=13063)
>>
>> [...]
>>
>>
>> On 06/04/2013 10:13 PM, Paolo Bonzini wrote:
>>> Add ref/unref calls at the following places:
>>>
>>> - places where memory regions are stashed by a listener and
>>>   used outside the BQL (including in Xen or KVM).
>>>
>>> - memory_region_find callsites
>>>
>>> - creation of aliases and containers (only the aliased/contained
>>>   region gets a reference to avoid loops)
>>>
>>> - around calls to del_subregion/add_subregion, where the region
>>>   could disappear after the first call
>>>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>> ---
>>>  exec.c                                |  6 +++++-
>>>  hw/core/loader.c                      |  1 +
>>>  hw/display/exynos4210_fimd.c          |  6 ++++++
>>>  hw/display/framebuffer.c              | 12 +++++++-----
>>>  hw/i386/kvmvapic.c                    |  1 +
>>>  hw/misc/vfio.c                        |  2 ++
>>>  hw/virtio/dataplane/hostmem.c         |  7 +++++++
>>>  hw/virtio/vhost.c                     |  2 ++
>>>  hw/virtio/virtio-balloon.c            |  1 +
>>>  hw/xen/xen_pt.c                       |  4 ++++
>>>  include/hw/virtio/dataplane/hostmem.h |  1 +
>>>  kvm-all.c                             |  2 ++
>>>  memory.c                              | 20 ++++++++++++++++++++
>>>  target-arm/kvm.c                      |  2 ++
>>>  target-sparc/mmu_helper.c             |  1 +
>>>  xen-all.c                             |  2 ++
>>>  16 files changed, 64 insertions(+), 6 deletions(-)
> 
> 


-- 
Alexey

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v2 03/17] memory: add ref/unref calls
  2013-06-14 10:09       ` Alexey Kardashevskiy
@ 2013-06-14 13:56         ` Paolo Bonzini
  2013-06-14 15:04           ` Alexey Kardashevskiy
  2013-06-25  8:49         ` Paolo Bonzini
  1 sibling, 1 reply; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-14 13:56 UTC (permalink / raw)
  To: Alexey Kardashevskiy, qemu-devel

Il 14/06/2013 06:09, Alexey Kardashevskiy ha scritto:
> And - crash:
> 
> 
> #0  object_unref (obj=0xa7a7a7a7a7a7a7a7) at
> /home/alexey/pcipassthru/qemu-impreza/qom/object.c:691

Dangling pointer.  One ref, two unrefs probably.

I'm redoing the series according to Peter's request, it could fix it
automatically.

Paolo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v2 03/17] memory: add ref/unref calls
  2013-06-14 13:56         ` Paolo Bonzini
@ 2013-06-14 15:04           ` Alexey Kardashevskiy
  0 siblings, 0 replies; 36+ messages in thread
From: Alexey Kardashevskiy @ 2013-06-14 15:04 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, Avi Kivity

On 06/14/2013 11:56 PM, Paolo Bonzini wrote:
> Il 14/06/2013 06:09, Alexey Kardashevskiy ha scritto:
>> And - crash:
>>
>>
>> #0  object_unref (obj=0xa7a7a7a7a7a7a7a7) at
>> /home/alexey/pcipassthru/qemu-impreza/qom/object.c:691
> 
> Dangling pointer.  One ref, two unrefs probably.

No, subpages (and nested MRs) are just g_free'd, it is not a result of
unreferencing, that's the point.


> I'm redoing the series according to Peter's request, it could fix it
> automatically.


When is it expected to be pushed to github?



-- 
Alexey

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH v2 03/17] memory: add ref/unref calls
  2013-06-14 10:09       ` Alexey Kardashevskiy
  2013-06-14 13:56         ` Paolo Bonzini
@ 2013-06-25  8:49         ` Paolo Bonzini
  1 sibling, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2013-06-25  8:49 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: peter.maydell, Paul Mackerras, qemu-devel, Avi Kivity

Il 14/06/2013 12:09, Alexey Kardashevskiy ha scritto:
> Hi.
> 
> Ok. Back to the bug with this patch. The initial problem with this patch is
> that "make check" fails.
> 
> Please help with subpages.
> 
> It turned out that tests use MALLOC_PERTURB_ which is normally off. Who
> does not know - this is a way to tell glibc to fill released memory with
> some value and then debug accesses to released memory. Some bright mind
> made it random what confuses a lot (and btw valgrind found nothing :-/ ).
> So I spend some time before figured out how to reproduce it outside of the
> qtest thingy.
> 
> The tree is qemu.org/master "bd5c51e Michael Roth qemu-char: don't issue
> CHR_EVENT_OPEN in a BH" + replayed patches till the one from $subj on top
> of it. QEMU is configured as "configure --target-list=x86_64-softmmu".
> 
> The magic is:
> 
> export MALLOC_PERTURB_=123
> nc -l -U ~/qtest-16318.sock &
> nc -l -U ~/qtest-16318.qmp &
> x86_64-softmmu/qemu-system-x86_64  \
> 	-qtest unix:/home/alexey/qtest-16318.sock,nowait \
> 	-qtest-log /dev/null \
> 	-qmp unix:/home/alexey/qtest-16318.qmp,nowait \
> 	-pidfile ~/qtest-16318.pid -machine accel=qtest -vnc none
> 
> Immediate crash at (the very last backtrace in this mail is that crash).

It's a use-after-free, strange that valgrind found nothing.  The 
subpage is freed in destroy_page_desc before being used in 
phys_sections_clear.

The fix is to move freeing the phys_sections in phys_sections_clear,
which is also cleaner.

diff --git a/exec.c b/exec.c
index fa5de88..93907e9 100644
--- a/exec.c
+++ b/exec.c
@@ -762,17 +762,6 @@ hwaddr memory_region_section_get_iotlb(CPUArchState *env,
 static int subpage_register (subpage_t *mmio, uint32_t start, uint32_t end,
                              uint16_t section);
 static subpage_t *subpage_init(AddressSpace *as, hwaddr base);
-static void destroy_page_desc(uint16_t section_index)
-{
-    MemoryRegionSection *section = &phys_sections[section_index];
-    MemoryRegion *mr = section->mr;
-
-    if (mr->subpage) {
-        subpage_t *subpage = container_of(mr, subpage_t, iomem);
-        memory_region_destroy(&subpage->iomem);
-        g_free(subpage);
-    }
-}
 
 static void destroy_l2_mapping(PhysPageEntry *lp, unsigned level)
 {
@@ -787,8 +776,6 @@ static void destroy_l2_mapping(PhysPageEntry *lp, unsigned level)
     for (i = 0; i < L2_SIZE; ++i) {
         if (!p[i].is_leaf) {
             destroy_l2_mapping(&p[i], level - 1);
-        } else {
-            destroy_page_desc(p[i].ptr);
         }
     }
     lp->is_leaf = 0;
@@ -819,11 +806,20 @@ static uint16_t phys_section_add(MemoryRegionSection *section)
     return phys_sections_nb++;
 }
 
+static void phys_section_destroy(MemoryRegion *mr)
+{
+    memory_region_unref(mr);
+
+    if (mr->subpage) {
+        subpage_t *subpage = container_of(mr, subpage_t, iomem);
+        memory_region_destroy(&subpage->iomem);
+        g_free(subpage);
+    }
+}
+
 static void phys_sections_clear(void)
 {
     while (phys_sections_nb > 0) {
         MemoryRegionSection *section = &phys_sections[--phys_sections_nb];
-        memory_region_unref(section->mr);
+        phys_section_destroy(section->mr);
     }
 }
 

^ permalink raw reply related	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2013-06-25  8:49 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-06-04 12:13 [Qemu-devel] [PATCH v2 00/17] Memory/IOMMU patches part 4: region ownership Paolo Bonzini
2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 01/17] memory: add getter/setter for owner Paolo Bonzini
2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 02/17] memory: add ref/unref Paolo Bonzini
2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 03/17] memory: add ref/unref calls Paolo Bonzini
2013-06-13  6:28   ` Alexey Kardashevskiy
2013-06-13  9:02     ` Alexey Kardashevskiy
2013-06-14 10:09       ` Alexey Kardashevskiy
2013-06-14 13:56         ` Paolo Bonzini
2013-06-14 15:04           ` Alexey Kardashevskiy
2013-06-25  8:49         ` Paolo Bonzini
2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 04/17] exec: add a reference to the region returned by address_space_translate Paolo Bonzini
2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 05/17] pci: set owner for BARs Paolo Bonzini
2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 06/17] sysbus: add sysbus_pass_mmio Paolo Bonzini
2013-06-04 12:24   ` Peter Maydell
2013-06-04 12:31     ` Paolo Bonzini
2013-06-04 12:36       ` Peter Maydell
2013-06-04 13:24         ` Paolo Bonzini
2013-06-04 14:11           ` Peter Maydell
2013-06-04 14:27             ` Paolo Bonzini
2013-06-04 14:56               ` Peter Maydell
2013-06-04 15:06                 ` Paolo Bonzini
2013-06-04 16:50           ` Paolo Bonzini
2013-06-04 17:47             ` Alex Williamson
2013-06-04 19:11               ` Paolo Bonzini
2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 07/17] sysbus: set owner for MMIO regions Paolo Bonzini
2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 08/17] acpi: add memory_region_set_owner calls Paolo Bonzini
2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 09/17] misc: " Paolo Bonzini
2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 10/17] isa/portio: allow setting an owner Paolo Bonzini
2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 11/17] vga: add memory_region_set_owner calls Paolo Bonzini
2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 12/17] pci-assign: " Paolo Bonzini
2013-06-04 16:42   ` Paolo Bonzini
2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 13/17] vfio: " Paolo Bonzini
2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 14/17] exec: check MRU in qemu_ram_addr_from_host Paolo Bonzini
2013-06-04 12:13 ` [Qemu-devel] [PATCH v2 15/17] exec: move qemu_ram_addr_from_host_nofail to cputlb.c Paolo Bonzini
2013-06-04 12:14 ` [Qemu-devel] [PATCH v2 16/17] memory: return MemoryRegion from qemu_ram_addr_from_host Paolo Bonzini
2013-06-04 12:14 ` [Qemu-devel] [PATCH v2 17/17] memory: ref/unref memory across address_space_map/unmap Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.