* [RFC 0/9] Add an interVM memory sharing device
       [not found] <CGME20200204113102eucas1p172cfb883c70cfc8d7c2832682df3df2a@eucas1p1.samsung.com>
@ 2020-02-04 11:30 ` i.kotrasinsk
       [not found]   ` <CGME20200204113104eucas1p2587768b7daa479ef5c01b45e1da99e45@eucas1p2.samsung.com>
                     ` (11 more replies)
  0 siblings, 12 replies; 20+ messages in thread
From: i.kotrasinsk @ 2020-02-04 11:30 UTC
  To: qemu-devel; +Cc: peter.maydell, Igor Kotrasinski, pbonzini

From: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>

This patchset adds a "memory exposing" device that allows two QEMU
instances to share arbitrary memory regions. Unlike ivshmem, it does not
create a new region of memory that's shared between VMs, but instead
allows one VM to access any memory region of the other VM we choose to
share.

The motivation for this device is a sort of ARM TrustZone "emulation",
where a rich system running on one machine (e.g. x86_64 Linux) is able
to perform SMCs to a trusted system running on another (e.g. OP-TEE on
ARM). With a device that allows sharing arbitrary memory between VMs,
this can be achieved with minimal changes to the trusted system and its
Linux driver while allowing the rich system to run on a speedier x86
emulator. I prepared additional patches for Linux, OP-TEE OS and the
OP-TEE build system as a PoC that such emulation works and passes the
OP-TEE tests; I'm not sure what would be the best way to share them.

This patchset is my first foray into QEMU source code and I'm certain
it's not yet ready to be merged in. I'm not sure whether memory sharing
code has any race conditions or breaks rules of working with memory
regions, or if having VMs communicate synchronously via chardevs is the
right way to do it. I do believe the basic idea for sharing memory
regions is sound and that it could be useful for inter-VM communication.

Igor Kotrasinski (9):
  memory: Add function for finding flat memory ranges
  memory: Support mmap offset for fd-backed memory regions
  memory: Hack - use shared memory when possible
  hw/misc/memexpose: Add documentation
  hw/misc/memexpose: Add core memexpose files
  hw/misc/memexpose: Add memexpose pci device
  hw/misc/memexpose: Add memexpose memory region device
  hw/misc/memexpose: Add simple tests
  hw/arm/virt: Hack in support for memexpose device

 Kconfig.host                            |   3 +
 MAINTAINERS                             |  12 +
 Makefile                                |   1 +
 backends/hostmem-memfd.c                |   2 +-
 configure                               |   8 +
 docs/specs/memexpose-spec.txt           | 168 +++++++++
 exec.c                                  |  10 +-
 hw/arm/virt.c                           | 110 +++++-
 hw/core/numa.c                          |   4 +-
 hw/mem/Kconfig                          |   3 +
 hw/misc/Makefile.objs                   |   1 +
 hw/misc/ivshmem.c                       |   3 +-
 hw/misc/memexpose/Makefile.objs         |   4 +
 hw/misc/memexpose/memexpose-core.c      | 630 ++++++++++++++++++++++++++++++++
 hw/misc/memexpose/memexpose-core.h      | 109 ++++++
 hw/misc/memexpose/memexpose-memregion.c | 142 +++++++
 hw/misc/memexpose/memexpose-memregion.h |  41 +++
 hw/misc/memexpose/memexpose-msg.c       | 261 +++++++++++++
 hw/misc/memexpose/memexpose-msg.h       | 161 ++++++++
 hw/misc/memexpose/memexpose-pci.c       | 218 +++++++++++
 include/exec/memory.h                   |  20 +
 include/exec/ram_addr.h                 |   2 +-
 include/hw/arm/virt.h                   |   5 +
 include/qemu/mmap-alloc.h               |   1 +
 memory.c                                |  82 ++++-
 tests/qtest/Makefile.include            |   2 +
 tests/qtest/memexpose-test.c            | 364 ++++++++++++++++++
 util/mmap-alloc.c                       |   7 +-
 util/oslib-posix.c                      |   2 +-
 29 files changed, 2360 insertions(+), 16 deletions(-)
 create mode 100644 docs/specs/memexpose-spec.txt
 create mode 100644 hw/misc/memexpose/Makefile.objs
 create mode 100644 hw/misc/memexpose/memexpose-core.c
 create mode 100644 hw/misc/memexpose/memexpose-core.h
 create mode 100644 hw/misc/memexpose/memexpose-memregion.c
 create mode 100644 hw/misc/memexpose/memexpose-memregion.h
 create mode 100644 hw/misc/memexpose/memexpose-msg.c
 create mode 100644 hw/misc/memexpose/memexpose-msg.h
 create mode 100644 hw/misc/memexpose/memexpose-pci.c
 create mode 100644 tests/qtest/memexpose-test.c

-- 
2.7.4




* [RFC 1/9] memory: Add function for finding flat memory ranges
       [not found]   ` <CGME20200204113104eucas1p2587768b7daa479ef5c01b45e1da99e45@eucas1p2.samsung.com>
@ 2020-02-04 11:30     ` i.kotrasinsk
  0 siblings, 0 replies; 20+ messages in thread
From: i.kotrasinsk @ 2020-02-04 11:30 UTC
  To: qemu-devel; +Cc: peter.maydell, Igor Kotrasinski, pbonzini

From: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>

Given an address, this lets us find the largest contiguous memory range
(FlatRange) that contains that address.
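
A minimal, self-contained sketch of the lookup idea, assuming the ranges
are sorted by start address and do not overlap. The `Range` type and the
function names below are simplified stand-ins invented for illustration,
not the QEMU types. The comparator in this sketch uses strict comparisons
so that a key sharing a boundary with a range still counts as contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for an address range (hypothetical type). */
typedef struct Range {
    uint64_t start;
    uint64_t size;   /* assumed non-zero; start + size must not overflow */
} Range;

static uint64_t range_end(const Range *r)
{
    return r->start + r->size;
}

/* bsearch() comparator: returns 0 when *key lies entirely inside *elem. */
static int cmp_containing(const void *key_, const void *elem_)
{
    const Range *key = key_;
    const Range *elem = elem_;

    if (key->start < elem->start) {
        return -1;              /* continue in the lower half */
    }
    if (range_end(key) > range_end(elem)) {
        return 1;               /* continue in the upper half */
    }
    return 0;                   /* elem contains key */
}

/*
 * Returns the range containing key, or NULL if no single range does.
 * ranges[] must be sorted by start and must not overlap.
 */
const Range *find_containing(const Range *ranges, size_t n, Range key)
{
    assert(key.size > 0);
    return bsearch(&key, ranges, n, sizeof(ranges[0]), cmp_containing);
}
```

A key that straddles two adjacent ranges is reported as not found, since
no single range contains it.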

Signed-off-by: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
---
 include/exec/memory.h | 19 +++++++++++++
 memory.c              | 79 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 97 insertions(+), 1 deletion(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index e85b7de..6092528 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1915,6 +1915,25 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr,
                                        hwaddr addr, uint64_t size);
 
 /**
+ * memory_region_find_flat_range: translate an address/size relative to
+ * a MemoryRegion into a FlatRange containing it.
+ *
+ * Returns a #MemoryRegionSection that describes this FlatRange.
+ * It will have the following characteristics:
+ *    .@size = 0 iff no containing FlatRange was found
+ *    .@mr is non-%NULL iff a containing FlatRange was found
+ *
+ * Remember that in the return value the @offset_within_region is
+ * relative to the returned region (in the .@mr field), not to the
+ * @mr argument.
+ *
+ * @mr: a MemoryRegion within which @addr is a relative address
+ * @addr: start of the area within @as to be searched
+ * @size: size of the area to be searched
+ */
+MemoryRegionSection memory_region_find_flat_range(MemoryRegion *mr,
+                                                  hwaddr addr, uint64_t size);
+/**
  * memory_global_dirty_log_sync: synchronize the dirty log for all memory
  *
  * Synchronizes the dirty page log for all address spaces.
diff --git a/memory.c b/memory.c
index aeaa8dc..e9f37e7 100644
--- a/memory.c
+++ b/memory.c
@@ -2523,6 +2523,25 @@ static FlatRange *flatview_lookup(FlatView *view, AddrRange addr)
                    sizeof(FlatRange), cmp_flatrange_addr);
 }
 
+static int cmp_flatrange_addr_containing(const void *addr_, const void *fr_)
+{
+    const AddrRange *addr = addr_;
+    const FlatRange *fr = fr_;
+
+    if (int128_lt(addr->start, fr->addr.start)) {
+        return -1;
+    } else if (int128_gt(addrrange_end(*addr), addrrange_end(fr->addr))) {
+        return 1;
+    }
+    return 0;
+}
+
+static FlatRange *flatview_lookup_containing(FlatView *view, AddrRange addr)
+{
+    return bsearch(&addr, view->ranges, view->nr,
+                   sizeof(FlatRange), cmp_flatrange_addr_containing);
+}
+
 bool memory_region_is_mapped(MemoryRegion *mr)
 {
     return mr->container ? true : false;
@@ -2532,7 +2551,8 @@ bool memory_region_is_mapped(MemoryRegion *mr)
  * returned region.  It must be called from an RCU critical section.
  */
 static MemoryRegionSection memory_region_find_rcu(MemoryRegion *mr,
-                                                  hwaddr addr, uint64_t size)
+                                                  hwaddr addr,
+                                                  uint64_t size)
 {
     MemoryRegionSection ret = { .mr = NULL };
     MemoryRegion *root;
@@ -2576,6 +2596,50 @@ static MemoryRegionSection memory_region_find_rcu(MemoryRegion *mr,
     return ret;
 }
 
+/*
+ * Same as memory_region_find_flat_range, but it does not add a reference to
+ * the returned region.  It must be called from an RCU critical section.
+ */
+static MemoryRegionSection memory_region_find_flat_range_rcu(MemoryRegion *mr,
+                                                             hwaddr addr,
+                                                             uint64_t size)
+{
+    MemoryRegionSection ret = { .mr = NULL, .size = 0 };
+    MemoryRegion *root;
+    AddressSpace *as;
+    AddrRange range;
+    FlatView *view;
+    FlatRange *fr;
+
+    addr += mr->addr;
+    for (root = mr; root->container; ) {
+        root = root->container;
+        addr += root->addr;
+    }
+
+    as = memory_region_to_address_space(root);
+    if (!as) {
+        return ret;
+    }
+    range = addrrange_make(int128_make64(addr), int128_make64(size));
+
+    view = address_space_to_flatview(as);
+    fr = flatview_lookup_containing(view, range);
+    if (!fr) {
+        return ret;
+    }
+
+    ret.mr = fr->mr;
+    ret.fv = view;
+    range = fr->addr;
+    ret.offset_within_region = fr->offset_in_region;
+    ret.size = range.size;
+    ret.offset_within_address_space = int128_get64(range.start);
+    ret.readonly = fr->readonly;
+    ret.nonvolatile = fr->nonvolatile;
+    return ret;
+}
+
 MemoryRegionSection memory_region_find(MemoryRegion *mr,
                                        hwaddr addr, uint64_t size)
 {
@@ -2588,6 +2652,19 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr,
     return ret;
 }
 
+MemoryRegionSection memory_region_find_flat_range(MemoryRegion *mr,
+                                                  hwaddr addr, uint64_t size)
+{
+    MemoryRegionSection ret;
+    rcu_read_lock();
+    ret = memory_region_find_flat_range_rcu(mr, addr, size);
+    if (ret.mr) {
+        memory_region_ref(ret.mr);
+    }
+    rcu_read_unlock();
+    return ret;
+}
+
 bool memory_region_present(MemoryRegion *container, hwaddr addr)
 {
     MemoryRegion *mr;
-- 
2.7.4




* [RFC 2/9] memory: Support mmap offset for fd-backed memory regions
       [not found]   ` <CGME20200204113105eucas1p2981e8d1e49ca9621255a4aedf8f1ec6e@eucas1p2.samsung.com>
@ 2020-02-04 11:30     ` i.kotrasinsk
  0 siblings, 0 replies; 20+ messages in thread
From: i.kotrasinsk @ 2020-02-04 11:30 UTC
  To: qemu-devel; +Cc: peter.maydell, Igor Kotrasinski, pbonzini

From: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>

The memexpose device will receive shared memory from another VM and map
parts of it as memory regions. For that, we need to be able to mmap the
region at an offset from shared memory's start.
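
As a rough illustration of what mapping at an offset means at the POSIX
level, here is a small sketch outside of QEMU. `map_fd_window` and
`demo_offset_mapping` are hypothetical helpers written for this example;
the demo uses a temporary file in place of the shared memory received
from the other VM.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map size bytes of fd, starting offset bytes into the file, as shared
 * read/write memory. Returns MAP_FAILED on error, like mmap() itself.
 */
void *map_fd_window(int fd, size_t size, off_t offset)
{
    long page = sysconf(_SC_PAGESIZE);

    /* mmap() requires the file offset to be a multiple of the page size. */
    assert(page > 0 && offset % page == 0);
    return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);
}

/* Returns 0 if a write through a whole-file mapping is visible through a
 * second mapping that starts one page into the same file. */
int demo_offset_mapping(void)
{
    long page = sysconf(_SC_PAGESIZE);
    FILE *f = tmpfile();
    int ok = -1;

    if (!f || page <= 0) {
        return -1;
    }
    int fd = fileno(f);
    if (ftruncate(fd, 2 * page) != 0) {
        fclose(f);
        return -1;
    }

    unsigned char *whole = map_fd_window(fd, 2 * (size_t)page, 0);
    unsigned char *second = map_fd_window(fd, (size_t)page, page);
    if (whole != MAP_FAILED && second != MAP_FAILED) {
        whole[page + 5] = 0xAB;
        ok = (second[5] == 0xAB) ? 0 : -1;
    }
    if (whole != MAP_FAILED) {
        munmap(whole, 2 * (size_t)page);
    }
    if (second != MAP_FAILED) {
        munmap(second, (size_t)page);
    }
    fclose(f);
    return ok;
}
```

The page-alignment requirement on the offset is the main constraint a
caller of the new mmap_offset parameter would need to respect.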

Signed-off-by: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
---
 backends/hostmem-memfd.c  |  2 +-
 exec.c                    | 10 ++++++----
 hw/misc/ivshmem.c         |  3 ++-
 include/exec/memory.h     |  1 +
 include/exec/ram_addr.h   |  2 +-
 include/qemu/mmap-alloc.h |  1 +
 memory.c                  |  3 ++-
 util/mmap-alloc.c         |  7 ++++---
 util/oslib-posix.c        |  2 +-
 9 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/backends/hostmem-memfd.c b/backends/hostmem-memfd.c
index 26070b4..7cd6c53 100644
--- a/backends/hostmem-memfd.c
+++ b/backends/hostmem-memfd.c
@@ -56,7 +56,7 @@ memfd_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
 
     name = host_memory_backend_get_name(backend);
     memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend),
-                                   name, backend->size,
+                                   name, backend->size, 0,
                                    backend->share, fd, errp);
     g_free(name);
 }
diff --git a/exec.c b/exec.c
index 67e520d..afcb3c9 100644
--- a/exec.c
+++ b/exec.c
@@ -1839,6 +1839,7 @@ static int file_ram_open(const char *path,
 
 static void *file_ram_alloc(RAMBlock *block,
                             ram_addr_t memory,
+                            size_t mmap_offset,
                             int fd,
                             bool truncate,
                             Error **errp)
@@ -1892,7 +1893,7 @@ static void *file_ram_alloc(RAMBlock *block,
         perror("ftruncate");
     }
 
-    area = qemu_ram_mmap(fd, memory, block->mr->align,
+    area = qemu_ram_mmap(fd, memory, mmap_offset, block->mr->align,
                          block->flags & RAM_SHARED, block->flags & RAM_PMEM);
     if (area == MAP_FAILED) {
         error_setg_errno(errp, errno,
@@ -2314,7 +2315,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp, bool shared)
 
 #ifdef CONFIG_POSIX
 RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
-                                 uint32_t ram_flags, int fd,
+                                 uint32_t ram_flags, int fd, size_t mmap_offset,
                                  Error **errp)
 {
     RAMBlock *new_block;
@@ -2360,7 +2361,8 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
     new_block->used_length = size;
     new_block->max_length = size;
     new_block->flags = ram_flags;
-    new_block->host = file_ram_alloc(new_block, size, fd, !file_size, errp);
+    new_block->host = file_ram_alloc(new_block, size, mmap_offset, fd,
+                                     !file_size, errp);
     if (!new_block->host) {
         g_free(new_block);
         return NULL;
@@ -2390,7 +2392,7 @@ RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
         return NULL;
     }
 
-    block = qemu_ram_alloc_from_fd(size, mr, ram_flags, fd, errp);
+    block = qemu_ram_alloc_from_fd(size, mr, ram_flags, fd, 0, errp);
     if (!block) {
         if (created) {
             unlink(mem_path);
diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 1a0fad7..4967d4f 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -492,7 +492,8 @@ static void process_msg_shmem(IVShmemState *s, int fd, Error **errp)
 
     /* mmap the region and map into the BAR2 */
     memory_region_init_ram_from_fd(&s->server_bar2, OBJECT(s),
-                                   "ivshmem.bar2", size, true, fd, &local_err);
+                                   "ivshmem.bar2", size, 0, true, fd,
+                                   &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 6092528..28cb2e9 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -917,6 +917,7 @@ void memory_region_init_ram_from_fd(MemoryRegion *mr,
                                     struct Object *owner,
                                     const char *name,
                                     uint64_t size,
+                                    uint64_t mmap_offset,
                                     bool share,
                                     int fd,
                                     Error **errp);
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 5e59a3d..1e85362 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -120,7 +120,7 @@ RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
                                    uint32_t ram_flags, const char *mem_path,
                                    Error **errp);
 RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
-                                 uint32_t ram_flags, int fd,
+                                 uint32_t ram_flags, int fd, size_t mmap_offset,
                                  Error **errp);
 
 RAMBlock *qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
diff --git a/include/qemu/mmap-alloc.h b/include/qemu/mmap-alloc.h
index e786266..bd95504 100644
--- a/include/qemu/mmap-alloc.h
+++ b/include/qemu/mmap-alloc.h
@@ -23,6 +23,7 @@ size_t qemu_mempath_getpagesize(const char *mem_path);
  */
 void *qemu_ram_mmap(int fd,
                     size_t size,
+                    size_t mmap_offset,
                     size_t align,
                     bool shared,
                     bool is_pmem);
diff --git a/memory.c b/memory.c
index e9f37e7..65dd165 100644
--- a/memory.c
+++ b/memory.c
@@ -1584,6 +1584,7 @@ void memory_region_init_ram_from_fd(MemoryRegion *mr,
                                     struct Object *owner,
                                     const char *name,
                                     uint64_t size,
+                                    uint64_t mmap_offset,
                                     bool share,
                                     int fd,
                                     Error **errp)
@@ -1595,7 +1596,7 @@ void memory_region_init_ram_from_fd(MemoryRegion *mr,
     mr->destructor = memory_region_destructor_ram;
     mr->ram_block = qemu_ram_alloc_from_fd(size, mr,
                                            share ? RAM_SHARED : 0,
-                                           fd, &err);
+                                           fd, mmap_offset, &err);
     mr->dirty_log_mask = tcg_enabled() ? (1 << DIRTY_MEMORY_CODE) : 0;
     if (err) {
         mr->size = int128_zero();
diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
index 27dcccd..191db45 100644
--- a/util/mmap-alloc.c
+++ b/util/mmap-alloc.c
@@ -84,6 +84,7 @@ size_t qemu_mempath_getpagesize(const char *mem_path)
 
 void *qemu_ram_mmap(int fd,
                     size_t size,
+                    size_t mmap_offset,
                     size_t align,
                     bool shared,
                     bool is_pmem)
@@ -127,7 +128,7 @@ void *qemu_ram_mmap(int fd,
     flags = MAP_PRIVATE | MAP_ANONYMOUS;
 #endif
 
-    guardptr = mmap(0, total, PROT_NONE, flags, guardfd, 0);
+    guardptr = mmap(0, total, PROT_NONE, flags, guardfd, mmap_offset);
 
     if (guardptr == MAP_FAILED) {
         return MAP_FAILED;
@@ -147,7 +148,7 @@ void *qemu_ram_mmap(int fd,
     offset = QEMU_ALIGN_UP((uintptr_t)guardptr, align) - (uintptr_t)guardptr;
 
     ptr = mmap(guardptr + offset, size, PROT_READ | PROT_WRITE,
-               flags | map_sync_flags, fd, 0);
+               flags | map_sync_flags, fd, mmap_offset);
 
     if (ptr == MAP_FAILED && map_sync_flags) {
         if (errno == ENOTSUP) {
@@ -172,7 +173,7 @@ void *qemu_ram_mmap(int fd,
          * we will remove these flags to handle compatibility.
          */
         ptr = mmap(guardptr + offset, size, PROT_READ | PROT_WRITE,
-                   flags, fd, 0);
+                   flags, fd, mmap_offset);
     }
 
     if (ptr == MAP_FAILED) {
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 5a291cc..e4ffdc1 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -205,7 +205,7 @@ void *qemu_memalign(size_t alignment, size_t size)
 void *qemu_anon_ram_alloc(size_t size, uint64_t *alignment, bool shared)
 {
     size_t align = QEMU_VMALLOC_ALIGN;
-    void *ptr = qemu_ram_mmap(-1, size, align, shared, false);
+    void *ptr = qemu_ram_mmap(-1, size, 0, align, shared, false);
 
     if (ptr == MAP_FAILED) {
         return NULL;
-- 
2.7.4




* [RFC 3/9] memory: Hack - use shared memory when possible
       [not found]   ` <CGME20200204113106eucas1p2cf218553048c75f5a8b7771cde90f5f1@eucas1p2.samsung.com>
@ 2020-02-04 11:30     ` i.kotrasinsk
  0 siblings, 0 replies; 20+ messages in thread
From: i.kotrasinsk @ 2020-02-04 11:30 UTC
  To: qemu-devel; +Cc: peter.maydell, Igor Kotrasinski, pbonzini

From: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>

Signed-off-by: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
---
 hw/core/numa.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/core/numa.c b/hw/core/numa.c
index 0d1b4be..02fd7f5 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -785,8 +785,8 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
     if (mem_path) {
 #ifdef __linux__
         Error *err = NULL;
-        memory_region_init_ram_from_file(mr, owner, name, ram_size, 0, 0,
-                                         mem_path, &err);
+        memory_region_init_ram_from_file(mr, owner, name, ram_size, 0,
+                                         RAM_SHARED, mem_path, &err);
         if (err) {
             error_report_err(err);
             if (mem_prealloc) {
-- 
2.7.4




* [RFC 4/9] hw/misc/memexpose: Add documentation
       [not found]   ` <CGME20200204113107eucas1p2769c0c8204a57751a4e6c5d4fb40e2d5@eucas1p2.samsung.com>
@ 2020-02-04 11:30     ` i.kotrasinsk
  0 siblings, 0 replies; 20+ messages in thread
From: i.kotrasinsk @ 2020-02-04 11:30 UTC
  To: qemu-devel; +Cc: peter.maydell, Igor Kotrasinski, pbonzini

From: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>

Signed-off-by: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
---
 docs/specs/memexpose-spec.txt | 168 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 168 insertions(+)
 create mode 100644 docs/specs/memexpose-spec.txt

diff --git a/docs/specs/memexpose-spec.txt b/docs/specs/memexpose-spec.txt
new file mode 100644
index 0000000..60ccea6
--- /dev/null
+++ b/docs/specs/memexpose-spec.txt
@@ -0,0 +1,168 @@
+= Specification for Inter-VM memory region sharing device =
+
+The inter-VM memory region sharing device (memexpose) is designed to allow two
+QEMU devices to share arbitrary physical memory regions between one another, as
+well as pass simple interrupts. It attempts to share memory regions directly
+when feasible, falling back to MMIO via socket communication when it's not.
+
+The device is modeled by QEMU as a PCI device, as well as a memory
+region/interrupt directly usable on platforms like ARM, with an entry in the
+device tree.
+
+An example use case for memexpose is forwarding ARM Trustzone functionality
+between two VMs running different architectures - one running a rich OS on an
+x86_64 VM, the other running the trusted OS on an ARM VM. In this scenario,
+sharing arbitrary memory regions allows such forwarding to work with minimal
+changes to the trusted OS.
+
+
+== Configuring the memexpose device ==
+
+The device uses two character devices to communicate with the other VM - one for
+synchronous memory accesses, another for passing interrupts. A typical
+configuration of the PCI device looks like this:
+
+        -chardev socket,...,path=/tmp/qemu-memexpose-mem,id="mem" \
+        -chardev socket,...,path=/tmp/qemu-memexpose-intr,id="intr" \
+        -device memexpose-pci,mem_chardev="mem",intr_chardev="intr",shm_size=0xN...
+
+While the arm-virt machine device can be enabled like this:
+ 
+        -chardev socket,...,path=/tmp/qemu-memexpose-mem,id="mem-mem" \
+        -chardev socket,...,path=/tmp/qemu-memexpose-intr,id="mem-intr" \
+        -machine memexpose-ep=mem,memexpose-size=0xN...
+
+Normally one of the VMs would have 'server,nowait' options set on these
+chardevs.
+
+At the moment the memory exposed to the other device always starts at 0
+(relative to system_memory). The shm_size/memexpose-size property indicates the
+size of the exposed region.
+
+The *_chardev/memexpose-ep properties are used to point the memexpose device to
+chardevs used to communicate with the other VM.
+
+
+== Memexpose PCI device interface ==
+
+The device has vendor ID 1af4, device ID 1111, revision 0.
+
+=== PCI BARs ===
+
+The device has two BARs:
+- BAR0 holds device registers and interrupt data (0x1000 byte MMIO),
+- BAR1 maps memory from the other VM.
+
+To use the device, you must first enable it by writing 1 to BAR0 at address 0.
+This makes QEMU wait for another VM to connect. Once that is done, you can
+access the other machine's memory via BAR1.
+
+Interrupts can be sent and received by configuring the device for interrupts and
+reading and writing to registers in BAR0.
+
+=== Device registers ===
+
+BAR0 has the following registers:
+
+    Offset  Size  Access      On reset  Function
+        0     8   read/write        0   Enable/disable device
+                                        bit 0: device enabled / disabled
+                                        bit 1..63: reserved
+    0x400     8   read/write        0   Interrupt RX address
+                                        bit 0: interrupt read
+                                        bit 1..63: reserved
+    0x408     8   read-only        UD   RX Interrupt type
+    0x410   128   read-only        UD   RX Interrupt data
+    0x800     8   read/write        0   Interrupt TX address
+    0x808     8   write-only      N/A   TX Interrupt type
+    0x810   128   write-only      N/A   TX Interrupt data
+
+All other addresses are reserved.
+
+=== Handling interrupts ===
+
+To send interrupts, write to TX interrupt address. Contents of TX interrupt type
+and data regions will be sent along with the interrupt. The device holds an
+internal queue of 16 interrupts; any extra interrupts are silently dropped.
+
+To receive interrupts, read the interrupt RX address. If the value is 1, then
+RX interrupt type and data registers contain the data / type sent by the other
+VM. Otherwise (the value is 0), no more interrupts are queued and RX interrupt
+type/data register contents are undefined.
+
+
+=== Platform device protocol ===
+
+The other memexpose device type (provided on e.g. ARM via device tree) is
+essentially identical to the PCI device. It provides two memory ranges that work
+exactly like the PCI BAR regions and an interrupt for signaling an interrupt
+from the other VM.
+
+== Memexpose peer protocol ==
+
+This section describes the current memexpose protocol. It is a WIP and likely to
+change.
+
+A connection between two VMs connected via memexpose happens on two sockets - an
+interrupt socket and a memory socket. All communication on the former is
+asynchronous, while communication on the latter is synchronous.
+
+When the device is enabled, QEMU waits for memexpose's chardevs to connect. No
+messages are exchanged upon connection. After devices are connected, the
+following messages can be exchanged:
+
+1. Interrupt message, via interrupt socket. This message contains interrupt type
+   and data.
+
+2. Memory access request message, via memory socket. It contains a target
+   address, access size and value to write in case of writes.
+
+3. Memory access return message. This contains an access result (as
+   MemTxResult) and a value in case of reads. If the accessed region can be
+   shared directly, then this region's start, size and shmem file descriptor are
+   also sent.
+
+4. Memory invalidation message. This is sent when a VM's memory region changes
+   status and contains such region's start and size. The other VM is expected to
+   drop any shared regions overlapping with it.
+
+5. Memory invalidation response. This is sent in response to a memory
+   invalidation message; after receiving this the remote VM is guaranteed to
+   have scheduled region invalidation before accessing the region again.
+
+Because QEMU performs memory accesses synchronously, we want to perform memory
+invalidation before returning to the guest OS, and both VMs might try to
+perform a remote memory access at the same time, all messages passed via the
+memory socket have an associated priority.
+
+At any time, only one message with a given priority is in flight. After sending
+a message, the VM reads messages on the memory socket, servicing all messages
+with a priority higher than its own. Once it receives a message with a priority
+lower than its own, it waits for a response to its own message before servicing
+it. This guarantees no deadlocks, assuming that messages don't trigger further
+messages. Message priorities, from highest to lowest, are as follows:
+
+1. Memory invalidation message/response.
+2. Memory access message/response.
+
+Additionally, one of the VMs is assigned a sub-priority higher than another, so
+that its messages of the same type have priority over the other VM's messages.
+
+Memory access messages have the lowest priority in order to guarantee that QEMU
+will not attempt to access memory while in the middle of a memory region
+listener.
+
+=== Memexpose memory sharing ===
+
+This section describes the memexpose memory sharing mechanism.
+
+Memory sharing is implemented lazily; initially no memory regions are shared
+between devices. When a memory access is performed via a socket, the remote VM
+checks whether the underlying memory range is backed by shareable memory. If it
+is, the VM finds out the maximum contiguous flat range backed by this region and
+sends its file descriptor to the local VM, where it is mapped as a subregion.
+
+The memexpose device registers memory listeners for the memory region it's
+using. Whenever a flat range for this region (that is not this device's
+subregion) changes, that range is sent to the other VM and any directly shared
+memory region intersecting this range is scheduled for removal via a BH.
-- 
2.7.4
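
To make the register table in docs/specs/memexpose-spec.txt above more
concrete, here is a guest-side sketch of enabling the device and polling
for a received interrupt. The offsets come from the table; everything
else is an assumption made for illustration: the function and constant
names are hypothetical, `bar0` is assumed to be an already-mapped BAR0,
and registers are assumed to be accessed as native-endian 64-bit words.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* BAR0 register offsets, taken from the register table in the spec. */
enum {
    MEMEXPOSE_REG_ENABLE  = 0x000, /* bit 0: device enabled */
    MEMEXPOSE_REG_RX_ADDR = 0x400, /* reads 1 if an interrupt was received */
    MEMEXPOSE_REG_RX_TYPE = 0x408,
    MEMEXPOSE_REG_RX_DATA = 0x410,
    MEMEXPOSE_REG_TX_ADDR = 0x800,
    MEMEXPOSE_REG_TX_TYPE = 0x808,
    MEMEXPOSE_REG_TX_DATA = 0x810,
    MEMEXPOSE_INTR_DATA_SIZE = 128,
};

static uint64_t reg_read64(volatile uint8_t *bar0, size_t off)
{
    return *(volatile uint64_t *)(bar0 + off);
}

static void reg_write64(volatile uint8_t *bar0, size_t off, uint64_t val)
{
    *(volatile uint64_t *)(bar0 + off) = val;
}

/* Enable the device by writing 1 to BAR0 offset 0. */
void memexpose_enable(volatile uint8_t *bar0)
{
    reg_write64(bar0, MEMEXPOSE_REG_ENABLE, 1);
}

/*
 * Read the RX address register once. Returns true and fills *type and
 * data[] if it read 1; returns false if no interrupt is queued, in which
 * case the type/data registers are undefined and are not read.
 */
bool memexpose_recv_interrupt(volatile uint8_t *bar0, uint64_t *type,
                              uint8_t data[MEMEXPOSE_INTR_DATA_SIZE])
{
    assert(type && data);
    if (reg_read64(bar0, MEMEXPOSE_REG_RX_ADDR) != 1) {
        return false;
    }
    *type = reg_read64(bar0, MEMEXPOSE_REG_RX_TYPE);
    for (size_t i = 0; i < MEMEXPOSE_INTR_DATA_SIZE; i += 8) {
        uint64_t word = reg_read64(bar0, MEMEXPOSE_REG_RX_DATA + i);
        memcpy(data + i, &word, sizeof(word));
    }
    return true;
}
```

A driver would typically call memexpose_recv_interrupt() in a loop from
its interrupt handler until it returns false, draining the queue.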



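
The message priority rule from the "Memexpose peer protocol" section of
the spec in the patch above can be sketched as a small decision function.
This is only an illustration: the names and the integer encoding of
priorities are assumptions made here and are not taken from the memexpose
sources.

```c
#include <assert.h>
#include <stdbool.h>

/* Message classes on the memory socket, lowest priority first. */
typedef enum {
    MSG_CLASS_ACCESS = 0,       /* memory access message/response */
    MSG_CLASS_INVALIDATION = 1, /* memory invalidation message/response */
} MsgClass;

/*
 * Total order over in-flight messages: the class decides first, and the
 * sender's sub-priority breaks ties between messages of the same class.
 */
int msg_priority(MsgClass cls, bool sender_has_high_subprio)
{
    return ((int)cls << 1) | (sender_has_high_subprio ? 1 : 0);
}

/*
 * Called while waiting for the reply to our own message. Returns true if
 * the peer's request must be serviced right away, false if it is deferred
 * until our reply has arrived.
 */
bool service_before_own_reply(int own_prio, int incoming_prio)
{
    /* The two VMs have different sub-priorities, so ties cannot occur. */
    assert(own_prio != incoming_prio);
    return incoming_prio > own_prio;
}
```

With this encoding, a VM waiting on a memory access always services an
incoming invalidation first, and two simultaneous accesses are ordered by
the sub-priority of the VMs.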

* [RFC 5/9] hw/misc/memexpose: Add core memexpose files
       [not found]   ` <CGME20200204113108eucas1p232d86a495fa8200473047ffb58548201@eucas1p2.samsung.com>
@ 2020-02-04 11:30     ` i.kotrasinsk
  0 siblings, 0 replies; 20+ messages in thread
From: i.kotrasinsk @ 2020-02-04 11:30 UTC
  To: qemu-devel; +Cc: peter.maydell, Igor Kotrasinski, pbonzini

From: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>

Signed-off-by: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
---
 Kconfig.host                       |   3 +
 MAINTAINERS                        |   8 +
 Makefile                           |   1 +
 configure                          |   8 +
 hw/mem/Kconfig                     |   3 +
 hw/misc/Makefile.objs              |   1 +
 hw/misc/memexpose/Makefile.objs    |   2 +
 hw/misc/memexpose/memexpose-core.c | 630 +++++++++++++++++++++++++++++++++++++
 hw/misc/memexpose/memexpose-core.h | 109 +++++++
 hw/misc/memexpose/memexpose-msg.c  | 261 +++++++++++++++
 hw/misc/memexpose/memexpose-msg.h  | 161 ++++++++++
 11 files changed, 1187 insertions(+)
 create mode 100644 hw/misc/memexpose/Makefile.objs
 create mode 100644 hw/misc/memexpose/memexpose-core.c
 create mode 100644 hw/misc/memexpose/memexpose-core.h
 create mode 100644 hw/misc/memexpose/memexpose-msg.c
 create mode 100644 hw/misc/memexpose/memexpose-msg.h

diff --git a/Kconfig.host b/Kconfig.host
index 55136e0..7470210 100644
--- a/Kconfig.host
+++ b/Kconfig.host
@@ -20,6 +20,9 @@ config SPICE
 config IVSHMEM
     bool
 
+config MEMEXPOSE
+    bool
+
 config TPM
     bool
 
diff --git a/MAINTAINERS b/MAINTAINERS
index 1f0bc72..d6146c0 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1639,6 +1639,14 @@ F: hw/virtio/virtio-crypto.c
 F: hw/virtio/virtio-crypto-pci.c
 F: include/hw/virtio/virtio-crypto.h
 
+memexpose
+M: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
+S: Maintained
+F: hw/misc/memexpose/memexpose-core.h
+F: hw/misc/memexpose/memexpose-core.c
+F: hw/misc/memexpose/memexpose-msg.h
+F: hw/misc/memexpose/memexpose-msg.c
+
 nvme
 M: Keith Busch <keith.busch@intel.com>
 L: qemu-block@nongnu.org
diff --git a/Makefile b/Makefile
index a6f5d44..b125a1b 100644
--- a/Makefile
+++ b/Makefile
@@ -387,6 +387,7 @@ MINIKCONF_ARGS = \
     CONFIG_KVM=$(CONFIG_KVM) \
     CONFIG_SPICE=$(CONFIG_SPICE) \
     CONFIG_IVSHMEM=$(CONFIG_IVSHMEM) \
+    CONFIG_MEMEXPOSE=$(CONFIG_MEMEXPOSE) \
     CONFIG_TPM=$(CONFIG_TPM) \
     CONFIG_XEN=$(CONFIG_XEN) \
     CONFIG_OPENGL=$(CONFIG_OPENGL) \
diff --git a/configure b/configure
index 5095f01..710e739 100755
--- a/configure
+++ b/configure
@@ -505,6 +505,7 @@ debug_mutex="no"
 libpmem=""
 default_devices="yes"
 plugins="no"
+memexpose="no"
 
 supported_cpu="no"
 supported_os="no"
@@ -1020,6 +1021,10 @@ for opt do
   ;;
   --without-default-devices) default_devices="no"
   ;;
+  --enable-memexpose) memexpose="yes"
+  ;;
+  --disable-memexpose) memexpose="no"
+  ;;
   --enable-gprof) gprof="yes"
   ;;
   --enable-gcov) gcov="yes"
@@ -7400,6 +7405,9 @@ fi
 if test "$ivshmem" = "yes" ; then
   echo "CONFIG_IVSHMEM=y" >> $config_host_mak
 fi
+if test "$memexpose" = "yes" ; then
+  echo "CONFIG_MEMEXPOSE=y" >> $config_host_mak
+fi
 if test "$capstone" != "no" ; then
   echo "CONFIG_CAPSTONE=y" >> $config_host_mak
 fi
diff --git a/hw/mem/Kconfig b/hw/mem/Kconfig
index 620fd4c..e377b05 100644
--- a/hw/mem/Kconfig
+++ b/hw/mem/Kconfig
@@ -5,6 +5,9 @@ config DIMM
 config MEM_DEVICE
     bool
 
+config MEM_EXPOSE
+    bool
+
 config NVDIMM
     bool
     default y
diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index da993f4..7e9a692 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -27,6 +27,7 @@ common-obj-$(CONFIG_PUV3) += puv3_pm.o
 common-obj-$(CONFIG_MACIO) += macio/
 
 common-obj-$(CONFIG_IVSHMEM_DEVICE) += ivshmem.o
+common-obj-$(CONFIG_MEMEXPOSE) += memexpose/
 
 common-obj-$(CONFIG_REALVIEW) += arm_sysctl.o
 common-obj-$(CONFIG_NSERIES) += cbus.o
diff --git a/hw/misc/memexpose/Makefile.objs b/hw/misc/memexpose/Makefile.objs
new file mode 100644
index 0000000..f405fe7
--- /dev/null
+++ b/hw/misc/memexpose/Makefile.objs
@@ -0,0 +1,2 @@
+common-obj-y += memexpose-msg.o
+common-obj-y += memexpose-core.o
diff --git a/hw/misc/memexpose/memexpose-core.c b/hw/misc/memexpose/memexpose-core.c
new file mode 100644
index 0000000..3b6ef3c
--- /dev/null
+++ b/hw/misc/memexpose/memexpose-core.c
@@ -0,0 +1,630 @@
+/*
+ *  Memexpose core
+ *
+ *  Copyright (C) 2020 Samsung Electronics Co Ltd.
+ *    Igor Kotrasinski, <i.kotrasinsk@partner.samsung.com>
+ *
+ *  This program is free software; you can redistribute it and/or modify it
+ *  under the terms of the GNU General Public License as published by the
+ *  Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful, but WITHOUT
+ *  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ *  FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ *  for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "memexpose-core.h"
+#include "exec/address-spaces.h"
+#include "exec/cpu-common.h"
+
+static int memexpose_pop_intr(MemexposeIntr *s)
+{
+    if (s->queue_count == 0) {
+        MEMEXPOSE_DPRINTF("No queued interrupts\n");
+        return 0;
+    }
+    struct memexpose_op_intr *head = &s->intr_queue[s->queue_start];
+    s->intr_rx = *head;
+    s->queue_start = (s->queue_start + 1) % MEMEXPOSE_INTR_QUEUE_SIZE;
+    s->queue_count--;
+
+    if (!s->queue_count) {
+        s->ops.intr(s->ops.parent, 0);
+    }
+    MEMEXPOSE_DPRINTF("Popped interrupt %lx\n", s->intr_rx.type);
+    return 1;
+}
+
+static void memexpose_push_intr(MemexposeIntr *s, struct memexpose_op_intr *msg)
+{
+    int signal = 0, free_slot;
+
+    if (s->queue_count == MEMEXPOSE_INTR_QUEUE_SIZE) {
+        MEMEXPOSE_DPRINTF("Interrupt queue is already full!\n");
+        return;
+    }
+    free_slot = (s->queue_start + s->queue_count) % MEMEXPOSE_INTR_QUEUE_SIZE;
+    s->intr_queue[free_slot] = *msg;
+    if (!s->queue_count) {
+        signal = 1;
+    }
+    s->queue_count++;
+
+    if (signal) {
+        s->ops.intr(s->ops.parent, 1);
+    }
+}
+
+static void process_intr(void *opaque, struct memexpose_op *op, Error **err)
+{
+    MemexposeIntr *s = opaque;
+    switch (op->head.ot) {
+    case MOP_INTR:
+        memexpose_push_intr(s, &op->body.intr);
+        break;
+    default:
+        error_setg(err, "Unknown memexpose intr command %u", op->head.ot);
+    }
+}
+
+static void memexpose_send_intr(MemexposeIntr *s)
+{
+    struct memexpose_op msg;
+
+    msg.head.ot = MOP_INTR;
+    msg.head.size = sizeof(msg.head) + sizeof(msg.body.intr);
+    msg.head.prio = 0;
+    msg.body.intr = s->intr_tx;
+    memexpose_ep_write_async(&s->ep, &msg);
+    MEMEXPOSE_DPRINTF("Sending interrupt %lx\n", msg.body.intr.type);
+}
+
+#define IN_INTR_DATA_RANGE(a, s, r) \
+    (a >= r && \
+     a < r + MEMEXPOSE_MAX_INTR_DATA_SIZE && \
+     (s = MIN(s, r + MEMEXPOSE_MAX_INTR_DATA_SIZE - a), 1))
+
+static uint64_t memexpose_intr_read(void *opaque, hwaddr addr,
+                                    unsigned size)
+{
+    MemexposeIntr *s = opaque;
+    uint64_t ret = 0;
+    unsigned int boff = 8 * (addr & 0x7);
+
+    switch (addr & (~0x7)) {
+    case MEMEXPOSE_INTR_RX_TYPE_ADDR:
+        ret = s->intr_rx.type;
+        ret >>= boff;
+        return ret;
+    case MEMEXPOSE_INTR_TX_TYPE_ADDR:
+        ret = s->intr_tx.type;
+        ret >>= boff;
+        return ret;
+    case MEMEXPOSE_INTR_RECV_ADDR:
+        /* Make multiple read calls in readq and such behave as expected */
+        if (addr & 0x7) {
+            return 0;
+        }
+
+        ret = memexpose_pop_intr(s);
+        return ret;
+    case MEMEXPOSE_INTR_ENABLE_ADDR:
+        if (addr & 0x7) {
+            return 0;
+        }
+        return s->enabled;
+    default:
+        break;
+    }
+
+    if (IN_INTR_DATA_RANGE(addr, size, MEMEXPOSE_INTR_RX_DATA_ADDR)) {
+        uint64_t off = addr - MEMEXPOSE_INTR_RX_DATA_ADDR;
+        memcpy(&ret, s->intr_rx.data + off, size);
+        return ret;
+    } else if (IN_INTR_DATA_RANGE(addr, size, MEMEXPOSE_INTR_TX_DATA_ADDR)) {
+        uint64_t off = addr - MEMEXPOSE_INTR_TX_DATA_ADDR;
+        memcpy(&ret, s->intr_tx.data + off, size);
+        return ret;
+    } else {
+        MEMEXPOSE_DPRINTF("Invalid mmio read at " TARGET_FMT_plx "\n", addr);
+        ret = 0;
+        return ret;
+    }
+}
+
+static void memexpose_intr_write(void *opaque, hwaddr addr,
+                                 uint64_t val, unsigned size)
+{
+    MemexposeIntr *s = opaque;
+    unsigned int boff = 8 * (addr & 0x7);
+    uint64_t mask = ((1LL << (size * 8)) - 1) << boff;
+
+    switch (addr & (~0x7)) {
+    case MEMEXPOSE_INTR_RX_TYPE_ADDR:
+        s->intr_rx.type &= ~mask;
+        s->intr_rx.type |= (val << boff);
+        return;
+    case MEMEXPOSE_INTR_TX_TYPE_ADDR:
+        s->intr_tx.type &= ~mask;
+        s->intr_tx.type |= (val << boff);
+        return;
+    case MEMEXPOSE_INTR_SEND_ADDR:
+        /* Make multiple write calls in writeq and such behave as expected */
+        if (addr & 0x7) {
+            return;
+        }
+        memexpose_send_intr(s);
+        return;
+    case MEMEXPOSE_INTR_ENABLE_ADDR:
+        if (addr & 0x7) {
+            return;
+        }
+        if (val) {
+            if (s->ops.enable) {
+                s->enabled = s->ops.enable(s->ops.parent) ? 0 : 1;
+            } else {
+                s->enabled = 1;
+            }
+        } else {
+            if (s->ops.disable) {
+                s->ops.disable(s->ops.parent);
+            }
+            s->enabled = 0;
+        }
+        return;
+    }
+
+    if (IN_INTR_DATA_RANGE(addr, size, MEMEXPOSE_INTR_RX_DATA_ADDR)) {
+        uint64_t off = addr - MEMEXPOSE_INTR_RX_DATA_ADDR;
+        memcpy(s->intr_rx.data + off, &val, size);
+    } else if (IN_INTR_DATA_RANGE(addr, size, MEMEXPOSE_INTR_TX_DATA_ADDR)) {
+        uint64_t off = addr - MEMEXPOSE_INTR_TX_DATA_ADDR;
+        memcpy(s->intr_tx.data + off, &val, size);
+    } else {
+        MEMEXPOSE_DPRINTF("Invalid mmio write at " TARGET_FMT_plx "\n", addr);
+    }
+}
+
+static const MemoryRegionOps memexpose_intr_ops = {
+    .read = memexpose_intr_read,
+    .write = memexpose_intr_write,
+    .endianness = DEVICE_NATIVE_ENDIAN,
+    .impl = {
+        .min_access_size = 4,
+        .max_access_size = 4,
+    },
+};
+
+void memexpose_intr_init(MemexposeIntr *s, struct memexpose_intr_ops *ops,
+                         Object *parent, CharBackend *chr, Error **errp)
+{
+    if (!qemu_chr_fe_backend_connected(chr)) {
+        error_setg(errp, "You must specify an 'intr_chardev'");
+        return;
+    }
+
+    s->parent = parent;
+    s->ops = *ops;
+    s->enabled = 0;
+    s->queue_start = 0;
+    s->queue_count = 0;
+    memexpose_ep_init(&s->ep, chr, s, 0, process_intr);
+    s->ep.is_async = true;
+    memory_region_init_io(&s->shmem, parent, &memexpose_intr_ops, s,
+                          "memexpose-intr", MEMEXPOSE_INTR_MEM_SIZE);
+}
+
+int memexpose_intr_enable(MemexposeIntr *s)
+{
+    return memexpose_ep_connect(&s->ep);
+}
+
+void memexpose_intr_disable(MemexposeIntr *s)
+{
+    memexpose_ep_disconnect(&s->ep);
+}
+
+void memexpose_intr_destroy(MemexposeIntr *s)
+{
+    memexpose_intr_disable(s);
+    /* Region will be collected with its parent */
+    memexpose_ep_destroy(&s->ep);
+}
+
+static bool memshare_region_overlaps(MemexposeMem *s,
+                                     struct memexpose_memshare_info_fd *share)
+{
+    MemexposeRemoteMemory *mem;
+    QLIST_FOREACH(mem, &s->remote_regions, list) {
+        uint64_t start = memory_region_get_ram_addr(&mem->region);
+        uint64_t size = memory_region_size(&mem->region);
+        MEMEXPOSE_DPRINTF("Comparing regions: received %"PRIx64"-%"PRIx64", "\
+                          "current mapped %"PRIx64"-%"PRIx64"\n",
+                          share->start, share->start + share->size,
+                          start, start + size);
+        if (start < share->start + share->size &&
+            share->start < start + size)
+            return true;
+    }
+    return false;
+}
+
+static void memshare_add_region(MemexposeMem *s, int fd,
+                                struct memexpose_memshare_info_fd *share,
+                                Error **errp)
+{
+    if (share->start >= s->shmem_size) {
+        /* TODO - error out */
+        MEMEXPOSE_DPRINTF("Shared memory start too high: "
+                          "%" PRIx64 " >= %" PRIx64,
+                          share->start, s->shmem_size);
+        close(fd);
+        return;
+    }
+
+    if (memshare_region_overlaps(s, share)) {
+        /* TODO - error out */
+        MEMEXPOSE_DPRINTF("Shared memory %" PRIx64 "-%" PRIx64
+                          " overlaps with existing region",
+                          share->start, share->start + share->size);
+        close(fd);
+        return;
+    }
+
+    uint64_t clamped_size = s->shmem_size - share->start;
+    share->size = MIN(share->size, clamped_size);
+
+    MemexposeRemoteMemory *mem = g_malloc(sizeof(*mem));
+    char *rname = g_strdup_printf("Memexpose shmem "
+                                  "%" PRIx64 "-%" PRIx64" -> %" PRIx64,
+                                  share->start, share->start + share->size,
+                                  share->mmap_start);
+
+    MEMEXPOSE_DPRINTF("Mapping remote memory: %" PRIx64 \
+                      "-%" PRIx64 ", fd offset %" PRIx64 "\n",
+                      share->start, share->size, share->mmap_start);
+
+    memory_region_init_ram_from_fd(&mem->region, s->parent, rname,
+                                   share->size, share->mmap_start,
+                                   true, fd, errp);
+    if (*errp) {
+        error_report_err(*errp);
+        close(fd);
+        return;
+    }
+
+    memory_region_set_nonvolatile(&mem->region, share->nonvolatile);
+    memory_region_set_readonly(&mem->region, share->readonly);
+    g_free(rname);
+    memory_region_add_subregion_overlap(&s->shmem, share->start,
+                                        &mem->region, 1);
+    QLIST_INSERT_HEAD(&s->remote_regions, mem, list);
+}
+
+static void memshare_remove_region(MemexposeMem *s, MemexposeRemoteMemory *reg)
+{
+    /* TODO is this correct? Docs warn about leaked refcounts */
+    QLIST_REMOVE(reg, list);
+    memory_region_del_subregion(&s->shmem, &reg->region);
+    object_unparent(OBJECT(&reg->region));
+}
+
+static void memshare_handle(MemexposeMem *s,
+                            struct memexpose_memshare_info *share)
+{
+    int fd;
+    switch (share->type) {
+    case MEMSHARE_NONE:
+        return;
+    case MEMSHARE_FD:
+        fd = memexpose_ep_recv_fd(&s->ep);
+        MEMEXPOSE_DPRINTF("Received memshare fd: %d\n", fd);
+        if (s->pending_invalidation) {
+            close(fd);
+            return;
+        }
+        Error *err = NULL;
+        memshare_add_region(s, fd, &share->fd, &err); /* TODO - handle errors */
+        return;
+    default:
+        MEMEXPOSE_DPRINTF("Invalid memshare type: %u\n", share->type);
+        return;
+    }
+}
+
+static MemTxResult memexpose_read_slow(void *opaque, hwaddr addr,
+                                       uint64_t *data, unsigned size,
+                                       MemTxAttrs attrs)
+{
+    MemexposeMem *s = opaque;
+
+    struct memexpose_op msg;
+    msg.head.size = sizeof(msg.head) + sizeof(msg.body.read);
+    msg.head.ot = MOP_READ;
+    msg.head.prio = memexpose_ep_msg_prio(&s->ep, msg.head.ot);
+    msg.body.read.offset = addr;
+    msg.body.read.size = size;
+    memexpose_ep_write_sync(&s->ep, &msg);
+
+    MemTxResult res = msg.body.read_ret.ret;
+    if (res == MEMTX_OK) {
+        memshare_handle(s, &msg.body.read_ret.share);
+    }
+    memcpy(data, &msg.body.read_ret.value, size);
+    return res;
+}
+
+static MemTxResult memexpose_write_slow(void *opaque, hwaddr addr,
+                                        uint64_t val, unsigned size,
+                                        MemTxAttrs attrs)
+{
+    MemexposeMem *s = opaque;
+    struct memexpose_op msg;
+    msg.head.size = sizeof(msg.head) + sizeof(msg.body.write);
+    msg.head.ot = MOP_WRITE;
+    msg.head.prio = memexpose_ep_msg_prio(&s->ep, msg.head.ot);
+    msg.body.write.offset = addr;
+    msg.body.write.size = size;
+    msg.body.write.value = val;
+    memexpose_ep_write_sync(&s->ep, &msg);
+
+    MemTxResult res = msg.body.write_ret.ret;
+    if (res == MEMTX_OK) {
+        memshare_handle(s, &msg.body.write_ret.share);
+    }
+    return res;
+}
+
+static const MemoryRegionOps memexpose_region_ops = {
+    .read_with_attrs = memexpose_read_slow,
+    .write_with_attrs = memexpose_write_slow,
+    .endianness = DEVICE_NATIVE_ENDIAN,
+    .impl = {
+        .min_access_size = 1,
+        .max_access_size = 8,
+    },
+};
+
+static void prepare_memshare(MemexposeMem *s,
+                             uint64_t size, uint64_t offset,
+                             struct memexpose_memshare_info *info) {
+    MemoryRegionSection section = memory_region_find_flat_range(
+            s->as.root, offset, size);
+    if (!section.mr) {
+        MEMEXPOSE_DPRINTF("No memory region under %" PRIu64 "!\n", offset);
+        goto unref;
+    }
+
+    int fd = memory_region_get_fd(section.mr);
+    if (fd != -1 && qemu_ram_is_shared(section.mr->ram_block)) {
+        info->type = MEMSHARE_FD;
+        info->fd.mmap_start = section.offset_within_region;
+        info->fd.start = section.offset_within_address_space;
+        info->fd.size = section.size;
+        info->fd.readonly = memory_region_is_rom(section.mr);
+        info->fd.nonvolatile = memory_region_is_nonvolatile(section.mr);
+
+        MEMEXPOSE_DPRINTF("Prepared a memshare fd: %" PRIx64 \
+                          "-%" PRIx64 ", fd offset %" PRIx64 "\n",
+                          info->fd.start, info->fd.size, info->fd.mmap_start);
+        memexpose_ep_send_fd(&s->ep, fd);
+        s->nothing_shared = false;
+    } else {
+        info->type = MEMSHARE_NONE;
+    }
+unref:
+    memory_region_unref(section.mr);
+}
+
+static void memexpose_perform_read_request(
+        MemexposeMem *s, struct memexpose_op_read *in,
+        struct memexpose_op *out)
+{
+    out->head.ot = MOP_READ_RET;
+    out->head.size = sizeof(out->head) + sizeof(out->body.read_ret);
+    out->body.read_ret.ret = 0;
+    out->body.read_ret.share.type = MEMSHARE_NONE;
+
+    MEMEXPOSE_DPRINTF("Reading %u from %lx\n", in->size, in->offset);
+    MemTxResult r = address_space_read(&s->as, in->offset,
+                                       MEMTXATTRS_UNSPECIFIED,
+                                       (uint8_t *) &out->body.read_ret.value,
+                                       in->size);
+    out->body.read_ret.ret = r;
+    if (r != MEMTX_OK) {
+        MEMEXPOSE_DPRINTF("Failed to read\n");
+    } else {
+        prepare_memshare(s, in->size, in->offset, &out->body.read_ret.share);
+    }
+}
+
+static void memexpose_perform_write_request(
+        MemexposeMem *s, struct memexpose_op_write *in,
+        struct memexpose_op *out)
+{
+    out->head.ot = MOP_WRITE_RET;
+    out->head.size = sizeof(out->head) + sizeof(out->body.write_ret);
+    out->body.write_ret.ret = 0;
+    out->body.write_ret.share.type = MEMSHARE_NONE;
+
+    MEMEXPOSE_DPRINTF("Writing %u to %lx\n", in->size, in->offset);
+    MemTxResult r = address_space_write(&s->as, in->offset,
+                                        MEMTXATTRS_UNSPECIFIED,
+                                        (uint8_t *) &in->value,
+                                        in->size);
+    if (r != MEMTX_OK) {
+        out->body.write_ret.ret = -EIO;
+        MEMEXPOSE_DPRINTF("Failed to write\n");
+        return;
+    }
+
+    out->body.write_ret.ret = r;
+    if (r != MEMTX_OK) {
+        MEMEXPOSE_DPRINTF("Failed to read\n");
+    } else {
+        prepare_memshare(s, in->size, in->offset, &out->body.write_ret.share);
+    }
+}
+
+static bool region_is_ours(MemexposeMem *s, MemoryRegion *mr)
+{
+    if (mr == &s->shmem) {
+        return true;
+    }
+
+    MemexposeRemoteMemory *mem;
+    QLIST_FOREACH(mem, &s->remote_regions, list) {
+        if (mr == &mem->region) {
+            return true;
+        }
+    }
+    return false;
+}
+
+static void memexpose_remote_invalidate(MemoryListener *inv,
+                                        MemoryRegionSection *sect)
+{
+    MemexposeMem *s = container_of(inv, MemexposeMem, remote_invalidator);
+    struct memexpose_op msg;
+    struct memexpose_op_reg_inv *ri = &msg.body.reg_inv;
+
+    if (!sect->mr || region_is_ours(s, sect->mr)) {
+        return;
+    }
+    if (s->nothing_shared) {
+        return;
+    }
+
+    msg.head.size = sizeof(msg.head) + sizeof(msg.body.reg_inv);
+    msg.head.ot = MOP_REG_INV;
+    msg.head.prio = memexpose_ep_msg_prio(&s->ep, msg.head.ot);
+
+    ri->start = sect->offset_within_address_space;
+    ri->size = int128_get64(sect->size);
+    MEMEXPOSE_DPRINTF("Region %"PRIx64"-%"PRIx64" changed, "
+                      "sending invalidation request\n",
+                      ri->start, ri->start + ri->size);
+    memexpose_ep_write_sync(&s->ep, &msg);
+}
+
+static void memexpose_invalidate_region(MemexposeMem *s,
+                                        struct memexpose_op_reg_inv *ri,
+                                        struct memexpose_op *out)
+{
+    MemexposeRemoteMemory *mem;
+
+    QLIST_FOREACH(mem, &s->remote_regions, list) {
+        uint64_t start = memory_region_get_ram_addr(&mem->region);
+        uint64_t size = memory_region_size(&mem->region);
+        if (start < ri->start + ri->size &&
+            start + size > ri->start) {
+            mem->should_invalidate = true;
+            s->pending_invalidation = true;
+        }
+    }
+
+    if (s->pending_invalidation) {
+        qemu_bh_schedule(s->reg_inv_bh);
+    }
+
+    out->head.ot = MOP_REG_INV_RET;
+    out->head.size = sizeof(out->head);
+}
+
+static void memexpose_do_reg_inv_bh(void *opaque)
+{
+    MemexposeMem *s = opaque;
+
+    MemexposeRemoteMemory *mem, *tmp;
+    QLIST_FOREACH_SAFE(mem, &s->remote_regions, list, tmp) {
+        if (mem->should_invalidate) {
+            memshare_remove_region(s, mem);
+        }
+    }
+    s->pending_invalidation = false;
+}
+
+static void process_mem(void *opaque, struct memexpose_op *op, Error **err)
+{
+    MemexposeMem *s = opaque;
+    struct memexpose_op resp;
+    resp.head.prio = op->head.prio;
+    switch (op->head.ot) {
+    case MOP_READ:
+        memexpose_perform_read_request(s, &op->body.read, &resp);
+        break;
+    case MOP_WRITE:
+        memexpose_perform_write_request(s, &op->body.write, &resp);
+        break;
+    case MOP_REG_INV:
+        memexpose_invalidate_region(s, &op->body.reg_inv, &resp);
+        break;
+    default:
+        error_setg(err, "Unknown memexpose command %u", op->head.ot);
+        return;
+    }
+    memexpose_ep_write_async(&s->ep, &resp);
+}
+
+void memexpose_mem_init(MemexposeMem *s, Object *parent,
+                        MemoryRegion *as_root,
+                        CharBackend *chr, int prio, Error **errp)
+{
+    if (!qemu_chr_fe_backend_connected(chr)) {
+        error_setg(errp, "You must specify a 'mem_chardev'");
+        return;
+    }
+
+    QLIST_INIT(&s->remote_regions);
+    s->parent = parent;
+    address_space_init(&s->as, as_root, "Memexpose");
+
+    memexpose_ep_init(&s->ep, chr, s, prio, process_mem);
+    s->ep.is_async = false;
+    memory_region_init_io(&s->shmem, parent, &memexpose_region_ops, s,
+                          "memexpose-shmem", s->shmem_size);
+    MEMEXPOSE_DPRINTF("Shmem size %lx\n", memory_region_size(&s->shmem));
+
+    s->nothing_shared = true;
+    s->remote_invalidator = (MemoryListener) {
+        .region_add = memexpose_remote_invalidate,
+        .region_del = memexpose_remote_invalidate,
+    };
+    s->reg_inv_bh = qemu_bh_new(memexpose_do_reg_inv_bh, s);
+    memory_listener_register(&s->remote_invalidator, &s->as);
+}
+
+int memexpose_mem_enable(MemexposeMem *s)
+{
+    return memexpose_ep_connect(&s->ep);
+}
+
+void memexpose_mem_disable(MemexposeMem *s)
+{
+    memexpose_ep_disconnect(&s->ep);
+
+    MemexposeRemoteMemory *mem, *tmp;
+    QLIST_FOREACH_SAFE(mem, &s->remote_regions, list, tmp) {
+        memshare_remove_region(s, mem);
+    }
+    qemu_bh_cancel(s->reg_inv_bh);
+    s->pending_invalidation = false;
+}
+
+void memexpose_mem_destroy(MemexposeMem *s)
+{
+    memexpose_mem_disable(s);
+    /* Region will be collected with its parent */
+    memory_listener_unregister(&s->remote_invalidator);
+    memexpose_ep_destroy(&s->ep);
+    qemu_bh_delete(s->reg_inv_bh);
+    address_space_destroy(&s->as);
+}
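
Note on range checks: memshare_region_overlaps() and
memexpose_invalidate_region() above both need to decide whether two address
ranges intersect. For reference, two half-open ranges intersect exactly when
each one starts before the other one ends. A minimal standalone sketch,
assuming start/size pairs that do not wrap past UINT64_MAX; ranges_overlap
and ranges_overlap_selfcheck are hypothetical helpers, not part of this patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when [a_start, a_start + a_size) and
 * [b_start, b_start + b_size) share at least one byte. The sums are
 * assumed not to overflow.
 */
static bool ranges_overlap(uint64_t a_start, uint64_t a_size,
                           uint64_t b_start, uint64_t b_size)
{
    return a_start < b_start + b_size && b_start < a_start + a_size;
}

/* Usage examples covering the usual cases. */
static void ranges_overlap_selfcheck(void)
{
    assert(ranges_overlap(0x1000, 0x1000, 0x1800, 0x1000));  /* partial */
    assert(ranges_overlap(0x1000, 0x4000, 0x2000, 0x100));   /* contained */
    assert(!ranges_overlap(0x1000, 0x1000, 0x2000, 0x1000)); /* adjacent */
    assert(!ranges_overlap(0x1000, 0x1000, 0x8000, 0x1000)); /* disjoint */
}
```

Both conditions must hold at once. Joining them with a logical OR instead
would report an overlap for almost any pair of ranges, including disjoint
ones.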
diff --git a/hw/misc/memexpose/memexpose-core.h b/hw/misc/memexpose/memexpose-core.h
new file mode 100644
index 0000000..fd0ac60
--- /dev/null
+++ b/hw/misc/memexpose/memexpose-core.h
@@ -0,0 +1,109 @@
+/*
+ *  Memexpose core
+ *
+ *  Copyright (C) 2020 Samsung Electronics Co Ltd.
+ *    Igor Kotrasinski, <i.kotrasinsk@partner.samsung.com>
+ *
+ *  This program is free software; you can redistribute it and/or modify it
+ *  under the terms of the GNU General Public License as published by the
+ *  Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful, but WITHOUT
+ *  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ *  FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ *  for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#ifndef _MEMEXPOSE_CORE_H_
+#define _MEMEXPOSE_CORE_H_
+#include "qemu/osdep.h"
+
+#include <inttypes.h>
+#include "chardev/char-fe.h"
+#include "hw/hw.h"
+#include "exec/memory.h"
+#include "memexpose-msg.h"
+
+#define MEMEXPOSE_INTR_QUEUE_SIZE 16
+
+#define MEMEXPOSE_DEBUG 1
+#define MEMEXPOSE_DPRINTF(fmt, ...)                       \
+    do {                                                \
+        if (MEMEXPOSE_DEBUG) {                            \
+            printf("MEMEXPOSE: " fmt, ## __VA_ARGS__);    \
+        }                                               \
+    } while (0)
+
+#define MEMEXPOSE_INTR_MEM_SIZE 0x1000
+
+
+#define MEMEXPOSE_INTR_ENABLE_ADDR  0x0
+#define MEMEXPOSE_INTR_RECV_ADDR    0x400
+#define MEMEXPOSE_INTR_RX_TYPE_ADDR 0x408
+#define MEMEXPOSE_INTR_RX_DATA_ADDR 0x410
+#define MEMEXPOSE_INTR_SEND_ADDR    0x800
+#define MEMEXPOSE_INTR_TX_TYPE_ADDR 0x808
+#define MEMEXPOSE_INTR_TX_DATA_ADDR 0x810
+
+struct memexpose_intr_ops {
+    void *parent;
+    void (*intr) (void *opaque, int dir);
+    int (*enable) (void *opaque);
+    void (*disable) (void *opaque);
+};
+
+typedef struct MemexposeIntr {
+    Object *parent;
+    struct memexpose_intr_ops ops;
+    int enabled;
+
+    MemexposeEp ep;
+    MemoryRegion shmem;
+
+    struct memexpose_op_intr intr_queue[MEMEXPOSE_INTR_QUEUE_SIZE];
+    int queue_start;
+    int queue_count;
+    struct memexpose_op_intr intr_tx;
+    struct memexpose_op_intr intr_rx;
+} MemexposeIntr;
+
+typedef struct MemexposeMem {
+    Object *parent;
+    MemexposeEp ep;
+
+    AddressSpace as;
+    MemoryRegion shmem;
+    uint64_t shmem_size;
+    QLIST_HEAD(, MemexposeRemoteMemory) remote_regions;
+
+    MemoryListener remote_invalidator;
+    QEMUBH *reg_inv_bh;
+    bool pending_invalidation;
+    bool nothing_shared;
+} MemexposeMem;
+
+typedef struct MemexposeRemoteMemory {
+    MemoryRegion region;
+    bool should_invalidate;
+    QLIST_ENTRY(MemexposeRemoteMemory) list;
+} MemexposeRemoteMemory;
+
+void memexpose_intr_init(MemexposeIntr *s, struct memexpose_intr_ops *ops,
+                         Object *parent, CharBackend *chr, Error **errp);
+void memexpose_intr_destroy(MemexposeIntr *s);
+int memexpose_intr_enable(MemexposeIntr *s);
+void memexpose_intr_disable(MemexposeIntr *s);
+
+void memexpose_mem_init(MemexposeMem *s, Object *parent,
+                        MemoryRegion *as_root,
+                        CharBackend *chr, int prio, Error **errp);
+void memexpose_mem_destroy(MemexposeMem *s);
+int memexpose_mem_enable(MemexposeMem *s);
+void memexpose_mem_disable(MemexposeMem *s);
+
+#endif /* _MEMEXPOSE_CORE_H_ */
diff --git a/hw/misc/memexpose/memexpose-msg.c b/hw/misc/memexpose/memexpose-msg.c
new file mode 100644
index 0000000..7205dd0
--- /dev/null
+++ b/hw/misc/memexpose/memexpose-msg.c
@@ -0,0 +1,261 @@
+/*
+ *  Memexpose message handling
+ *
+ *  Copyright (C) 2020 Samsung Electronics Co Ltd.
+ *    Igor Kotrasinski, <i.kotrasinsk@partner.samsung.com>
+ *
+ *  This program is free software; you can redistribute it and/or modify it
+ *  under the terms of the GNU General Public License as published by the
+ *  Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful, but WITHOUT
+ *  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ *  FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ *  for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "memexpose-msg.h"
+#include "memexpose-core.h"
+
+#define MIN_MSG_SIZE (sizeof(struct memexpose_op_head))
+#define MAX_MSG_SIZE (sizeof(struct memexpose_op))
+
+int memexpose_ep_msg_prio(MemexposeEp *ep, enum memexpose_op_type ot)
+{
+    int ot_prio;
+    switch (ot) {
+    case MOP_READ:
+    case MOP_READ_RET:
+    case MOP_WRITE:
+    case MOP_WRITE_RET:
+        ot_prio = 2;
+        break;
+    default:
+        ot_prio = 0;
+    }
+    return ot_prio + ep->prio;
+}
+
+static int mep_can_receive(void *opaque)
+{
+    int sz;
+    MemexposeEp *ep = opaque;
+    MemexposeMsg *msg = &ep->msg;
+
+    switch (msg->read_state) {
+    case MEMEXPOSE_MSG_BROKEN:
+        return 0;
+    case MEMEXPOSE_MSG_READ_SIZE:
+        return sizeof(msg->buf.head.size) - msg->bytes;
+    case MEMEXPOSE_MSG_READ_BODY:
+        sz = msg->buf.head.size - msg->bytes;
+        if (sz > MAX_MSG_SIZE) {
+            return MAX_MSG_SIZE;  /* We'll handle this as an error later */
+        }
+        return sz;
+    default:
+        MEMEXPOSE_DPRINTF("Invalid read state %d\n", msg->read_state);
+        return 0;
+    }
+}
+
+static int mep_do_receive(MemexposeMsg *msg,
+                          const uint8_t *buf, int size)
+{
+    switch (msg->read_state) {
+    case MEMEXPOSE_MSG_BROKEN:
+        return -1;
+    case MEMEXPOSE_MSG_READ_SIZE:
+        memcpy((unsigned char *)&msg->buf + msg->bytes, buf, size);
+        msg->bytes += size;
+        if (msg->bytes == sizeof(msg->buf.head.size)) {
+            msg->read_state = MEMEXPOSE_MSG_READ_BODY;
+        }
+        return 0;
+    case MEMEXPOSE_MSG_READ_BODY:
+        if (msg->buf.head.size < MIN_MSG_SIZE ||
+            msg->buf.head.size > MAX_MSG_SIZE) {
+            MEMEXPOSE_DPRINTF("Invalid message size %d, protocol broken!\n",
+                              msg->buf.head.size);
+            msg->read_state = MEMEXPOSE_MSG_BROKEN;
+            return -1;
+        }
+        memcpy((unsigned char *)&msg->buf + msg->bytes, buf, size);
+        msg->bytes += size;
+        if (msg->bytes < msg->buf.head.size) {
+            return 0;
+        }
+        msg->bytes = 0;
+        msg->read_state = MEMEXPOSE_MSG_READ_SIZE;
+        return 1;
+    default:
+        MEMEXPOSE_DPRINTF("Invalid read state %d\n", msg->read_state);
+        return -1;
+    }
+}
+
+static void mep_receive(void *opaque, const uint8_t *buf, int size)
+{
+    MemexposeEp *ep = opaque;
+    Error *err = NULL;
+    int new_msg = mep_do_receive(&ep->msg, buf, size);
+    if (new_msg > 0) {
+        ep->handle_msg(ep->data, &ep->msg.buf, &err);
+        if (err) {
+            error_report_err(err);
+        }
+    } else if (new_msg < 0) {
+        error_setg(&err, "Failed to receive memexpose message"); /* FIXME */
+        error_report_err(err);
+    }
+}
+
+static int mep_receive_sync(MemexposeEp *ep, struct memexpose_op *op)
+{
+    int ret = 0;
+    MemexposeMsg *msg = &ep->msg;
+    assert(!ep->is_async);
+
+    while (!ret) {
+        int can_receive = mep_can_receive(ep);
+        unsigned char *msgbuf = (unsigned char *)&msg->buf + msg->bytes;
+        qemu_chr_fe_read_all(ep->chr, msgbuf, can_receive);
+        ret = mep_do_receive(msg, msgbuf, can_receive);
+        if (ret == -1) {
+            return -1;
+        }
+    }
+    *op = msg->buf;
+    return 0;
+}
+
+void memexpose_ep_write_async(MemexposeEp *ep, struct memexpose_op *op)
+{
+    qemu_chr_fe_write_all(ep->chr, (unsigned char *) op, op->head.size);
+}
+
+static void mep_queue_msg(MemexposeEp *ep, struct memexpose_op *op)
+{
+    ep->queued_op = *op;
+    qemu_bh_schedule(ep->queue_msg_bh);
+}
+
+static void mep_queue_msg_bh(void *epp)
+{
+    Error *err = NULL;
+    MemexposeEp *ep = epp;
+    if (!ep->queued_op.head.size) {
+        return;
+    }
+    ep->handle_msg(ep->data, &ep->queued_op, &err); /* FIXME - handle */
+    ep->queued_op.head.size = 0;
+}
+
+/*
+ * Synchronously write a message to another QEMU and receive a response.
+ * To avoid deadlocks, each message type has its priority and no more than one
+ * message of each priority is in flight.
+ *
+ * After we send a message, we await a response while handling all messages of
+ * higher priority and deferring messages of lower priority. This way each side
+ * will have its requests handled until they have time to handle ours.
+ *
+ * The above means that a message handler must be safe to run while an
+ * operation that sent any lower-priority message is still in progress, so
+ * take care to order operations in a way that QEMU can tolerate.
+ */
+void memexpose_ep_write_sync(MemexposeEp *ep, struct memexpose_op *op)
+{
+    assert(!ep->is_async);
+    qemu_chr_fe_write_all(ep->chr, (unsigned char *) op, op->head.size);
+
+    struct memexpose_op resp;
+    int prio = op->head.prio;
+
+    /* FIXME - handle errors */
+    while (true) {
+        Error *err = NULL;
+        mep_receive_sync(ep, &resp);
+        int resp_prio = resp.head.prio;
+        if (resp_prio > prio) {
+            ep->handle_msg(ep->data, &resp, &err);
+        } else if (resp_prio < prio) {
+            mep_queue_msg(ep, &resp);
+        } else {
+            *op = resp;
+            return;
+        }
+    }
+}
+
+void memexpose_ep_init(MemexposeEp *ep, CharBackend *chr, void *data, int prio,
+                       void (*handle_msg)(void *data, struct memexpose_op *op,
+                                          Error **errp))
+{
+    ep->queue_msg_bh = qemu_bh_new(mep_queue_msg_bh, ep);
+    ep->queued_op.head.size = 0;
+    ep->handle_msg = handle_msg;
+    ep->msg.bytes = 0;
+    ep->msg.read_state = MEMEXPOSE_MSG_READ_SIZE;
+    ep->chr = chr;
+    ep->data = data;
+    ep->prio = prio;
+    ep->connected = 0;
+
+    if (handle_msg) {
+        qemu_chr_fe_set_handlers(ep->chr, mep_can_receive,
+                                 mep_receive, NULL, NULL, ep, NULL, true);
+    }
+    Chardev *chrd = qemu_chr_fe_get_driver(ep->chr);
+    assert(chrd);
+    MEMEXPOSE_DPRINTF("Memexpose endpoint at %s\n",
+                      chrd->filename);
+}
+
+/* TODO - protocol for synchronously ending connection */
+void memexpose_ep_destroy(MemexposeEp *ep)
+{
+    qemu_chr_fe_set_handlers(ep->chr, NULL, NULL, NULL, NULL, NULL, NULL, true);
+}
+
+void memexpose_ep_send_fd(MemexposeEp *ep, int fd)
+{
+    qemu_chr_fe_set_msgfds(ep->chr, &fd, 1);
+}
+
+int memexpose_ep_recv_fd(MemexposeEp *ep)
+{
+    return qemu_chr_fe_get_msgfd(ep->chr);
+}
+
+int memexpose_ep_connect(MemexposeEp *ep)
+{
+    /* FIXME - report errors */
+    Error *err = NULL;
+    if (ep->connected) {
+        return 0;
+    }
+
+    int ret = qemu_chr_fe_wait_connected(ep->chr, &err);
+    if (ret) {
+        return ret;
+    }
+
+    ep->connected = 1;
+    return 0;
+}
+
+void memexpose_ep_disconnect(MemexposeEp *ep)
+{
+    if (ep->connected) {
+        qemu_chr_fe_disconnect(ep->chr);
+    }
+    ep->connected = 0;
+}
diff --git a/hw/misc/memexpose/memexpose-msg.h b/hw/misc/memexpose/memexpose-msg.h
new file mode 100644
index 0000000..5543cd4
--- /dev/null
+++ b/hw/misc/memexpose/memexpose-msg.h
@@ -0,0 +1,161 @@
+/*
+ *  Memexpose core
+ *
+ *  Copyright (C) 2020 Samsung Electronics Co Ltd.
+ *    Igor Kotrasinski, <i.kotrasinsk@partner.samsung.com>
+ *
+ *  This program is free software; you can redistribute it and/or modify it
+ *  under the terms of the GNU General Public License as published by the
+ *  Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful, but WITHOUT
+ *  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ *  FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ *  for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#ifndef _MEMEXPOSE_MSG_H_
+#define _MEMEXPOSE_MSG_H_
+
+#include "qemu/osdep.h"
+#include "qemu/typedefs.h"
+#include "chardev/char-fe.h"
+#include "exec/memattrs.h"
+#include <inttypes.h>
+
+#define MEMEXPOSE_MAX_INTR_DATA_SIZE 128
+
+enum memexpose_op_type {
+    MOP_READ,
+    MOP_READ_RET,
+    MOP_WRITE,
+    MOP_WRITE_RET,
+    MOP_REG_INV,
+    MOP_REG_INV_RET,
+    MOP_INTR,
+};
+
+enum memexpose_memshare_type {
+    MEMSHARE_NONE,
+    MEMSHARE_FD,
+};
+
+/*
+ * TODO - we'll need to share more info here, like access permissions for
+ * example
+ */
+struct memexpose_memshare_info_fd {
+    uint64_t start;
+    uint64_t mmap_start;
+    uint64_t size;
+    uint8_t readonly;
+    uint8_t nonvolatile;
+} __attribute__((packed));
+
+/* TODO - this might have variable size in the future */
+struct memexpose_memshare_info {
+    uint8_t type;
+    union {
+        struct memexpose_memshare_info_fd fd;
+    };
+} __attribute__((packed));
+
+/* TODO - endianness */
+struct memexpose_op_head {
+    uint32_t size;
+    uint8_t ot;
+    uint8_t prio;
+} __attribute__((packed));
+
+struct memexpose_op_read {
+    uint64_t offset;
+    uint8_t size;
+} __attribute__((packed));
+
+struct memexpose_op_write {
+    uint64_t offset;
+    uint64_t value;
+    uint8_t size;
+} __attribute__((packed));
+
+struct memexpose_op_read_ret {
+    MemTxResult ret;
+    uint64_t value;
+    struct memexpose_memshare_info share;
+} __attribute__((packed));
+
+struct memexpose_op_write_ret {
+    MemTxResult ret;
+    struct memexpose_memshare_info share;
+} __attribute__((packed));
+
+struct memexpose_op_intr {
+    uint64_t type;
+    uint8_t data[MEMEXPOSE_MAX_INTR_DATA_SIZE];
+} __attribute__((packed));
+
+struct memexpose_op_reg_inv {
+    uint64_t start;
+    uint64_t size;
+} __attribute__((packed));
+
+union memexpose_op_all {
+    struct memexpose_op_read read;
+    struct memexpose_op_write write;
+    struct memexpose_op_read_ret read_ret;
+    struct memexpose_op_write_ret write_ret;
+    struct memexpose_op_intr intr;
+    struct memexpose_op_reg_inv reg_inv;
+} __attribute__((packed));
+
+struct memexpose_op {
+    struct memexpose_op_head head;
+    union memexpose_op_all body;
+} __attribute__((packed));
+
+enum MemexposeMsgState {
+    MEMEXPOSE_MSG_READ_SIZE,
+    MEMEXPOSE_MSG_READ_BODY,
+    MEMEXPOSE_MSG_BROKEN,
+};
+
+typedef struct MemexposeMsg {
+    int read_state;
+    int bytes;
+    struct memexpose_op buf;
+} MemexposeMsg;
+
+typedef struct MemexposeEp {
+    CharBackend *chr;
+    MemexposeMsg msg;
+    bool is_async;
+    int prio;
+    void *data;
+    void (*handle_msg)(void *data, struct memexpose_op *op, Error **err);
+
+    int connected;
+    struct memexpose_op queued_op;
+    QEMUBH *queue_msg_bh;
+} MemexposeEp;
+
+void memexpose_ep_init(MemexposeEp *ep, CharBackend *chr, void *data, int prio,
+                       void (*handle_msg)(void *data, struct memexpose_op *op,
+                                          Error **errp));
+void memexpose_ep_destroy(MemexposeEp *ep);
+
+int memexpose_ep_connect(MemexposeEp *ep);
+void memexpose_ep_disconnect(MemexposeEp *ep);
+
+/* TODO - functions for header boilerplate */
+void memexpose_ep_write_sync(MemexposeEp *ep, struct memexpose_op *op);
+void memexpose_ep_write_async(MemexposeEp *ep, struct memexpose_op *op);
+void memexpose_ep_send_fd(MemexposeEp *ep, int fd);
+int memexpose_ep_recv_fd(MemexposeEp *ep);
+int memexpose_ep_msg_prio(MemexposeEp *ep, enum memexpose_op_type);
+
+#endif /* _MEMEXPOSE_MSG_H_ */
-- 
2.7.4
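
The priority rule described in the comment above memexpose_ep_write_sync()
can be looked at in isolation. The following is a minimal standalone model,
not code from the patch: sketch_action and sketch_classify() are hypothetical
names, and only the three-way priority comparison is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; only the comparison mirrors memexpose_ep_write_sync(). */
enum sketch_action {
    SKETCH_HANDLE_NOW,  /* higher priority: run the handler immediately */
    SKETCH_DEFER,       /* lower priority: queue it for the bottom half */
    SKETCH_IS_RESPONSE, /* same priority: this is the reply we wait for */
};

/* Decide what to do with a message that arrives while a request is pending. */
enum sketch_action sketch_classify(uint8_t in_flight_prio,
                                   uint8_t incoming_prio)
{
    if (incoming_prio > in_flight_prio) {
        return SKETCH_HANDLE_NOW;
    }
    if (incoming_prio < in_flight_prio) {
        return SKETCH_DEFER;
    }
    return SKETCH_IS_RESPONSE;
}
```

With a request of priority 1 in flight, an incoming priority 2 message would
be handled at once, a priority 0 message would be deferred, and a priority 1
message would end the wait.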



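struct memexpose_op_head is currently sent in host byte order (see the
"TODO - endianness" note in memexpose-msg.h). One possible way to pin the
wire format is to serialize the header explicitly as little-endian. The
sketch below is an assumption about how that could look and is not part of
the patch: sketch_head, sketch_head_encode() and sketch_head_decode() are
hypothetical, and the 6-byte layout simply follows the packed field order
size, ot, prio.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_HEAD_WIRE_SIZE 6 /* 4 (size) + 1 (ot) + 1 (prio) */

/* Hypothetical unpacked mirror of struct memexpose_op_head. */
struct sketch_head {
    uint32_t size;
    uint8_t ot;
    uint8_t prio;
};

/* Write the header as little-endian bytes, independent of host order. */
void sketch_head_encode(const struct sketch_head *h,
                        uint8_t out[SKETCH_HEAD_WIRE_SIZE])
{
    out[0] = (uint8_t)(h->size & 0xff);
    out[1] = (uint8_t)((h->size >> 8) & 0xff);
    out[2] = (uint8_t)((h->size >> 16) & 0xff);
    out[3] = (uint8_t)((h->size >> 24) & 0xff);
    out[4] = h->ot;
    out[5] = h->prio;
}

/* Rebuild the header from its little-endian wire form. */
void sketch_head_decode(const uint8_t in[SKETCH_HEAD_WIRE_SIZE],
                        struct sketch_head *h)
{
    h->size = (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
              ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
    h->ot = in[4];
    h->prio = in[5];
}
```

Inside QEMU the same effect would more likely come from the existing
byte-swap helpers such as cpu_to_le32() and le32_to_cpu() from
"qemu/bswap.h"; the manual shifts above only keep the sketch standalone.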

* [RFC 6/9] hw/misc/memexpose: Add memexpose pci device
       [not found]   ` <CGME20200204113108eucas1p2526a9481bf8a4420d359c45f1183fe95@eucas1p2.samsung.com>
@ 2020-02-04 11:30     ` i.kotrasinsk
  0 siblings, 0 replies; 20+ messages in thread
From: i.kotrasinsk @ 2020-02-04 11:30 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Igor Kotrasinski, pbonzini

From: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>

Signed-off-by: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
---
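The interrupt delivery in raise_irq()/lower_irq() differs between MSI and
legacy INTx. The standalone model below is only a sketch for reasoning about
that logic: struct sketch_irq and its fields are hypothetical stand-ins for
the PCI state, msi_enabled(), msi_notify() and pci_set_irq().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the state touched by raise_irq()/lower_irq(). */
struct sketch_irq {
    uint32_t intr_status; /* bit 0 mirrors s->intr_status */
    bool msi_enabled;     /* stands in for msi_enabled() */
    int intx_level;       /* stands in for pci_set_irq() */
    unsigned msi_sent;    /* counts msi_notify() calls */
};

void sketch_raise(struct sketch_irq *s)
{
    s->intr_status |= 1;
    if (s->msi_enabled) {
        s->msi_sent++;     /* one message per raised interrupt */
    } else {
        s->intx_level = 1; /* the line stays asserted until lowered */
    }
}

void sketch_lower(struct sketch_irq *s)
{
    s->intr_status &= ~1u;
    if (!s->intr_status && !s->msi_enabled) {
        s->intx_level = 0; /* only INTx has a level to deassert */
    }
}
```

In this model, lowering is a no-op for MSI apart from clearing the status
bit, which matches the patch: MSI has no line level to deassert.
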
 MAINTAINERS                       |   1 +
 hw/misc/memexpose/Makefile.objs   |   1 +
 hw/misc/memexpose/memexpose-pci.c | 218 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 220 insertions(+)
 create mode 100644 hw/misc/memexpose/memexpose-pci.c

diff --git a/MAINTAINERS b/MAINTAINERS
index d6146c0..50628e4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1646,6 +1646,7 @@ F: hw/misc/memexpose/memexpose-core.h
 F: hw/misc/memexpose/memexpose-core.c
 F: hw/misc/memexpose/memexpose-msg.h
 F: hw/misc/memexpose/memexpose-msg.c
+F: hw/misc/memexpose/memexpose-pci.c
 
 nvme
 M: Keith Busch <keith.busch@intel.com>
diff --git a/hw/misc/memexpose/Makefile.objs b/hw/misc/memexpose/Makefile.objs
index f405fe7..05a2395 100644
--- a/hw/misc/memexpose/Makefile.objs
+++ b/hw/misc/memexpose/Makefile.objs
@@ -1,2 +1,3 @@
 common-obj-y += memexpose-msg.o
 common-obj-y += memexpose-core.o
+common-obj-$(CONFIG_PCI) += memexpose-pci.o
diff --git a/hw/misc/memexpose/memexpose-pci.c b/hw/misc/memexpose/memexpose-pci.c
new file mode 100644
index 0000000..7372651
--- /dev/null
+++ b/hw/misc/memexpose/memexpose-pci.c
@@ -0,0 +1,218 @@
+/*
+ *  Memexpose PCI device
+ *
+ *  Copyright (C) 2020 Samsung Electronics Co Ltd.
+ *    Igor Kotrasinski, <i.kotrasinsk@partner.samsung.com>
+ *
+ *  This program is free software; you can redistribute it and/or modify it
+ *  under the terms of the GNU General Public License as published by the
+ *  Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful, but WITHOUT
+ *  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ *  FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ *  for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qapi/error.h"
+#include "qemu/cutils.h"
+#include "hw/hw.h"
+#include "hw/pci/pci.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/msix.h"
+#include "hw/qdev-properties.h"
+#include "exec/memory.h"
+#include "exec/address-spaces.h"
+#include "memexpose-core.h"
+
+#define PCI_VENDOR_ID_MEMEXPOSE            PCI_VENDOR_ID_REDHAT_QUMRANET
+#define TYPE_MEMEXPOSE_PCI "memexpose-pci"
+#define PCI_DEVICE_ID_MEMEXPOSE     0x1111
+#define MEMEXPOSE_PCI(obj) \
+    OBJECT_CHECK(MemexposePCIState, (obj), TYPE_MEMEXPOSE_PCI)
+
+typedef struct MemexposePCIState {
+    PCIDevice parent_obj;
+
+    CharBackend intr_chr;
+    CharBackend mem_chr;
+
+    MemexposeIntr intr;
+    uint32_t intr_status;
+    MemexposeMem mem;
+} MemexposePCIState;
+
+static void raise_irq(MemexposePCIState *s)
+{
+    s->intr_status |= 1;
+    if (msi_enabled(&s->parent_obj)) {
+        msi_notify(&s->parent_obj, 0);
+    } else {
+        pci_set_irq(&s->parent_obj, 1);
+    }
+}
+
+static void lower_irq(MemexposePCIState *s)
+{
+    s->intr_status &= (~1);
+    if (!s->intr_status && !msi_enabled(&s->parent_obj)) {
+        pci_set_irq(&s->parent_obj, 0);
+    }
+}
+
+static void handle_irq(void *opaque, int dir)
+{
+    MemexposePCIState *s = opaque;
+    if (dir) {
+        raise_irq(s);
+    } else {
+        lower_irq(s);
+    }
+}
+
+static int memexpose_enable(void *opaque)
+{
+    int ret;
+    MemexposePCIState *s = opaque;
+
+    ret = memexpose_intr_enable(&s->intr);
+    if (ret) {
+        return ret;
+    }
+
+    ret = memexpose_mem_enable(&s->mem);
+    if (ret) {
+        memexpose_intr_disable(&s->intr);
+        return ret;
+    }
+
+    return 0;
+}
+
+static void memexpose_disable(void *opaque)
+{
+    MemexposePCIState *s = opaque;
+
+    memexpose_intr_disable(&s->intr);
+    memexpose_mem_disable(&s->mem);
+}
+
+static void memexpose_pci_intr_init(PCIDevice *dev, Error **errp)
+{
+    MemexposePCIState *s = MEMEXPOSE_PCI(dev);
+    struct memexpose_intr_ops ops;
+    ops.intr = handle_irq;
+    ops.enable = memexpose_enable;
+    ops.disable = memexpose_disable;
+    ops.parent = s;
+
+    memexpose_intr_init(&s->intr, &ops, OBJECT(dev), &s->intr_chr, errp);
+    if (*errp) {
+        return;
+    }
+
+    s->intr_status = 0;
+    uint8_t *pci_conf;
+    pci_conf = dev->config;
+    pci_conf[PCI_COMMAND] = PCI_COMMAND_IO | PCI_COMMAND_MEMORY;
+    pci_config_set_interrupt_pin(pci_conf, 1);
+    if (msi_init(dev, 0, 1, true, false, errp)) {
+        error_prepend(errp, "Failed to initialize memexpose MSI: ");
+        memexpose_intr_destroy(&s->intr);
+        return;
+    }
+
+    /* region for registers */
+    pci_register_bar(dev, 0,
+                     PCI_BASE_ADDRESS_SPACE_MEMORY,
+                     &s->intr.shmem);
+    MEMEXPOSE_DPRINTF("Initialized bar.\n");
+}
+
+static void memexpose_pci_intr_exit(PCIDevice *dev)
+{
+    MemexposePCIState *s = MEMEXPOSE_PCI(dev);
+    msi_uninit(dev);
+    memexpose_intr_destroy(&s->intr);
+}
+
+static void memexpose_pci_realize(PCIDevice *dev, Error **errp)
+{
+    MemexposePCIState *s = MEMEXPOSE_PCI(dev);
+    memexpose_pci_intr_init(dev, errp);
+    if (*errp) {
+        return;
+    }
+
+    Chardev *chrd = qemu_chr_fe_get_driver(&s->mem_chr);
+    assert(chrd);
+    MEMEXPOSE_DPRINTF("Memexpose endpoint at %s!\n",
+                      chrd->filename);
+    memexpose_mem_init(&s->mem, OBJECT(dev),
+                       get_system_memory(),
+                       &s->mem_chr, 0, errp);
+    if (*errp) {
+        memexpose_pci_intr_exit(dev);
+        return;
+    }
+
+    pci_register_bar(dev, 1,
+                     PCI_BASE_ADDRESS_SPACE_MEMORY |
+                     PCI_BASE_ADDRESS_MEM_TYPE_64,
+                     &s->mem.shmem);
+    MEMEXPOSE_DPRINTF("Initialized second bar.\n");
+}
+
+static void memexpose_pci_exit(PCIDevice *dev)
+{
+    MemexposePCIState *s = MEMEXPOSE_PCI(dev);
+    memexpose_mem_destroy(&s->mem);
+    memexpose_pci_intr_exit(dev);
+}
+
+static Property memexpose_pci_properties[] = {
+    DEFINE_PROP_CHR("mem_chardev", MemexposePCIState, mem_chr),
+    DEFINE_PROP_CHR("intr_chardev", MemexposePCIState, intr_chr),
+    DEFINE_PROP_UINT64("shm_size", MemexposePCIState, mem.shmem_size, 4096),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void memexpose_pci_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+    k->realize = memexpose_pci_realize;
+    k->exit = memexpose_pci_exit;
+    k->vendor_id = PCI_VENDOR_ID_MEMEXPOSE;
+    k->device_id = PCI_DEVICE_ID_MEMEXPOSE;
+    k->class_id = PCI_CLASS_MEMORY_RAM;
+    k->revision = 1;
+    device_class_set_props(dc, memexpose_pci_properties);
+}
+
+static const TypeInfo memexpose_pci_info = {
+    .name          = TYPE_MEMEXPOSE_PCI,
+    .parent        = TYPE_PCI_DEVICE,
+    .instance_size = sizeof(MemexposePCIState),
+    .class_init    = memexpose_pci_class_init,
+    .interfaces    = (InterfaceInfo[]) {
+        { INTERFACE_CONVENTIONAL_PCI_DEVICE },
+        { },
+    },
+};
+
+
+static void memexpose_pci_register_types(void)
+{
+    type_register_static(&memexpose_pci_info);
+}
+
+type_init(memexpose_pci_register_types)
-- 
2.7.4




* [RFC 7/9] hw/misc/memexpose: Add memexpose memory region device
       [not found]   ` <CGME20200204113109eucas1p18527bb78c3d930d56e6ae9c205f64ba3@eucas1p1.samsung.com>
@ 2020-02-04 11:30     ` i.kotrasinsk
  0 siblings, 0 replies; 20+ messages in thread
From: i.kotrasinsk @ 2020-02-04 11:30 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Igor Kotrasinski, pbonzini

From: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>

Signed-off-by: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
---
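Both the PCI and the sysbus variants enable the interrupt endpoint first and
the memory endpoint second, undoing the first step if the second fails. A
minimal sketch of that pattern follows; struct sketch_dev and the sketch_*
functions are hypothetical and stand in for memexpose_intr_enable(),
memexpose_mem_enable() and memexpose_intr_disable().

```c
#include <assert.h>

/* Hypothetical two-stage device; the fail_* flags force a stage to fail. */
struct sketch_dev {
    int intr_on;
    int mem_on;
    int fail_intr;
    int fail_mem;
};

static int sketch_intr_enable(struct sketch_dev *d)
{
    if (d->fail_intr) {
        return -1;
    }
    d->intr_on = 1;
    return 0;
}

static int sketch_mem_enable(struct sketch_dev *d)
{
    if (d->fail_mem) {
        return -1;
    }
    d->mem_on = 1;
    return 0;
}

/* Enable both stages, rolling the first one back if the second fails. */
int sketch_enable(struct sketch_dev *d)
{
    int ret = sketch_intr_enable(d);
    if (ret) {
        return ret;
    }

    ret = sketch_mem_enable(d);
    if (ret) {
        d->intr_on = 0; /* mirrors memexpose_intr_disable() */
        return ret;
    }
    return 0;
}
```

The point of the rollback is that a failed enable leaves the device in the
same state as before the call, so the guest can retry.
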
 MAINTAINERS                             |   2 +
 hw/misc/memexpose/Makefile.objs         |   1 +
 hw/misc/memexpose/memexpose-memregion.c | 142 ++++++++++++++++++++++++++++++++
 hw/misc/memexpose/memexpose-memregion.h |  41 +++++++++
 4 files changed, 186 insertions(+)
 create mode 100644 hw/misc/memexpose/memexpose-memregion.c
 create mode 100644 hw/misc/memexpose/memexpose-memregion.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 50628e4..2142c07 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1647,6 +1647,8 @@ F: hw/misc/memexpose/memexpose-core.c
 F: hw/misc/memexpose/memexpose-msg.h
 F: hw/misc/memexpose/memexpose-msg.c
 F: hw/misc/memexpose/memexpose-pci.c
+F: hw/misc/memexpose/memexpose-memregion.h
+F: hw/misc/memexpose/memexpose-memregion.c
 
 nvme
 M: Keith Busch <keith.busch@intel.com>
diff --git a/hw/misc/memexpose/Makefile.objs b/hw/misc/memexpose/Makefile.objs
index 05a2395..056bff3 100644
--- a/hw/misc/memexpose/Makefile.objs
+++ b/hw/misc/memexpose/Makefile.objs
@@ -1,3 +1,4 @@
 common-obj-y += memexpose-msg.o
 common-obj-y += memexpose-core.o
 common-obj-$(CONFIG_PCI) += memexpose-pci.o
+common-obj-y += memexpose-memregion.o
diff --git a/hw/misc/memexpose/memexpose-memregion.c b/hw/misc/memexpose/memexpose-memregion.c
new file mode 100644
index 0000000..fbdd966
--- /dev/null
+++ b/hw/misc/memexpose/memexpose-memregion.c
@@ -0,0 +1,142 @@
+/*
+ *  Memexpose memory region device
+ *
+ *  Copyright (C) 2020 Samsung Electronics Co Ltd.
+ *    Igor Kotrasinski, <i.kotrasinsk@partner.samsung.com>
+ *
+ *  This program is free software; you can redistribute it and/or modify it
+ *  under the terms of the GNU General Public License as published by the
+ *  Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful, but WITHOUT
+ *  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ *  FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ *  for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "hw/irq.h"
+#include "hw/sysbus.h"
+#include "hw/qdev-properties.h"
+#include "exec/memory.h"
+#include "exec/address-spaces.h"
+#include "memexpose-core.h"
+#include "memexpose-memregion.h"
+
+static void memexpose_memdev_intr(void *opaque, int dir)
+{
+    MemexposeMemdev *dev = opaque;
+    if (dir) {
+        qemu_set_irq(dev->irq, 1);
+    } else {
+        qemu_set_irq(dev->irq, 0);
+    }
+}
+
+static int memexpose_memdev_enable(void *opaque)
+{
+    int ret;
+    MemexposeMemdev *s = opaque;
+
+    ret = memexpose_intr_enable(&s->intr);
+    if (ret) {
+        return ret;
+    }
+
+    ret = memexpose_mem_enable(&s->mem);
+    if (ret) {
+        memexpose_intr_disable(&s->intr);
+        return ret;
+    }
+
+    return 0;
+}
+
+static void memexpose_memdev_disable(void *opaque)
+{
+    MemexposeMemdev *s = opaque;
+
+    memexpose_intr_disable(&s->intr);
+    memexpose_mem_disable(&s->mem);
+}
+
+static void memexpose_memdev_init(Object *obj)
+{
+    SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
+    MemexposeMemdev *mdev = MEMEXPOSE_MEMDEV(obj);
+    sysbus_init_mmio(sbd, &mdev->intr.shmem);
+    sysbus_init_irq(sbd, &mdev->irq);
+}
+
+static void memexpose_memdev_finalize(Object *obj)
+{
+}
+
+static void memexpose_memdev_realize(DeviceState *dev, Error **errp)
+{
+    MemexposeMemdev *mdev = MEMEXPOSE_MEMDEV(dev);
+    struct memexpose_intr_ops ops = {
+        .parent = dev,
+        .intr = memexpose_memdev_intr,
+        .enable = memexpose_memdev_enable,
+        .disable = memexpose_memdev_disable,
+    };
+
+    memexpose_intr_init(&mdev->intr, &ops, OBJECT(dev), &mdev->intr_chr, errp);
+    if (*errp) {
+        return;
+    }
+    memexpose_mem_init(&mdev->mem, OBJECT(dev),
+                       get_system_memory(),
+                       &mdev->mem_chr, 1, errp);
+    if (*errp) {
+        goto free_intr;
+    }
+    return;
+
+free_intr:
+    memexpose_intr_destroy(&mdev->intr);
+}
+
+static void memexpose_memdev_unrealize(DeviceState *dev, Error **errp)
+{
+    MemexposeMemdev *mdev = MEMEXPOSE_MEMDEV(dev);
+    memexpose_mem_destroy(&mdev->mem);
+    memexpose_intr_destroy(&mdev->intr);
+}
+
+static Property memexpose_memdev_properties[] = {
+    DEFINE_PROP_CHR("intr_chardev", MemexposeMemdev, intr_chr),
+    DEFINE_PROP_CHR("mem_chardev", MemexposeMemdev, mem_chr),
+    DEFINE_PROP_UINT64("shm_size", MemexposeMemdev, mem.shmem_size, 4096),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void memexpose_memdev_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    dc->realize = memexpose_memdev_realize;
+    dc->unrealize = memexpose_memdev_unrealize;
+    device_class_set_props(dc, memexpose_memdev_properties);
+}
+
+static const TypeInfo memexpose_memdev_info = {
+    .name = TYPE_MEMEXPOSE_MEMDEV,
+    .parent = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(MemexposeMemdev),
+    .instance_init = memexpose_memdev_init,
+    .instance_finalize = memexpose_memdev_finalize,
+    .class_init = memexpose_memdev_class_init,
+};
+
+static void register_types(void)
+{
+    type_register_static(&memexpose_memdev_info);
+}
+
+type_init(register_types);
diff --git a/hw/misc/memexpose/memexpose-memregion.h b/hw/misc/memexpose/memexpose-memregion.h
new file mode 100644
index 0000000..7eddcbe
--- /dev/null
+++ b/hw/misc/memexpose/memexpose-memregion.h
@@ -0,0 +1,41 @@
+/*
+ *  Memexpose memory region device
+ *
+ *  Copyright (C) 2020 Samsung Electronics Co Ltd.
+ *    Igor Kotrasinski, <i.kotrasinsk@partner.samsung.com>
+ *
+ *  This program is free software; you can redistribute it and/or modify it
+ *  under the terms of the GNU General Public License as published by the
+ *  Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful, but WITHOUT
+ *  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ *  FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ *  for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#ifndef _MEMEXPOSE_MEMDEV_H_
+#define _MEMEXPOSE_MEMDEV_H_
+
+#include "memexpose-core.h"
+#include "hw/sysbus.h"
+
+#define TYPE_MEMEXPOSE_MEMDEV "memexpose-memdev"
+#define MEMEXPOSE_MEMDEV(obj) \
+    OBJECT_CHECK(MemexposeMemdev, (obj), TYPE_MEMEXPOSE_MEMDEV)
+
+typedef struct MemexposeMemdev {
+    SysBusDevice dev;
+    MemexposeIntr intr;
+    MemexposeMem mem;
+    CharBackend intr_chr;
+    CharBackend mem_chr;
+    qemu_irq irq;
+} MemexposeMemdev;
+
+#endif /* _MEMEXPOSE_MEMDEV_H_ */
-- 
2.7.4




* [RFC 8/9] hw/misc/memexpose: Add simple tests
       [not found]   ` <CGME20200204113110eucas1p2f9ab3639730113139730d1853772d100@eucas1p2.samsung.com>
@ 2020-02-04 11:30     ` i.kotrasinsk
  0 siblings, 0 replies; 20+ messages in thread
From: i.kotrasinsk @ 2020-02-04 11:30 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Igor Kotrasinski, pbonzini

From: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>

Signed-off-by: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
---
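mexp_recv_intr() in the test polls the receive register a bounded number of
times before giving up. The standalone sketch below models only that loop:
struct sketch_reg, sketch_read() and sketch_poll() are hypothetical, and the
fake register replaces the qtest PCI read.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register source: reads as nonzero after ready_after reads. */
struct sketch_reg {
    unsigned reads;
    unsigned ready_after;
};

uint64_t sketch_read(struct sketch_reg *r)
{
    return r->reads++ >= r->ready_after ? 1 : 0;
}

/* Poll up to max_tries times; return the last value read (0 on timeout). */
uint64_t sketch_poll(struct sketch_reg *r, unsigned max_tries)
{
    uint64_t val = 0;
    unsigned i;

    for (i = 0; i < max_tries && val == 0; i++) {
        val = sketch_read(r);
        /* the real test sleeps with g_usleep(10000) between attempts */
    }
    return val;
}
```

Note that on timeout mexp_recv_intr() still reads the RX type and data
registers, so callers need to check the return value before trusting them.
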
 MAINTAINERS                  |   1 +
 tests/qtest/Makefile.include |   2 +
 tests/qtest/memexpose-test.c | 364 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 367 insertions(+)
 create mode 100644 tests/qtest/memexpose-test.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 2142c07..55bc6ab 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1649,6 +1649,7 @@ F: hw/misc/memexpose/memexpose-msg.c
 F: hw/misc/memexpose/memexpose-pci.c
 F: hw/misc/memexpose/memexpose-memregion.h
 F: hw/misc/memexpose/memexpose-memregion.c
+F: tests/qtest/memexpose-test.c
 
 nvme
 M: Keith Busch <keith.busch@intel.com>
diff --git a/tests/qtest/Makefile.include b/tests/qtest/Makefile.include
index e6bb4ab..3b580bc 100644
--- a/tests/qtest/Makefile.include
+++ b/tests/qtest/Makefile.include
@@ -14,6 +14,7 @@ check-qtest-pci-$(CONFIG_RTL8139_PCI) += rtl8139-test
 check-qtest-pci-$(CONFIG_VGA) += display-vga-test
 check-qtest-pci-$(CONFIG_HDA) += intel-hda-test
 check-qtest-pci-$(CONFIG_IVSHMEM_DEVICE) += ivshmem-test
+check-qtest-x86_64-$(CONFIG_MEMEXPOSE) += memexpose-test
 
 DBUS_DAEMON := $(shell which dbus-daemon 2>/dev/null)
 ifneq ($(GDBUS_CODEGEN),)
@@ -289,6 +290,7 @@ tests/qtest/test-filter-mirror$(EXESUF): tests/qtest/test-filter-mirror.o $(qtes
 tests/qtest/test-filter-redirector$(EXESUF): tests/qtest/test-filter-redirector.o $(qtest-obj-y)
 tests/qtest/test-x86-cpuid-compat$(EXESUF): tests/qtest/test-x86-cpuid-compat.o $(qtest-obj-y)
 tests/qtest/ivshmem-test$(EXESUF): tests/qtest/ivshmem-test.o contrib/ivshmem-server/ivshmem-server.o $(libqos-pc-obj-y) $(libqos-spapr-obj-y)
+tests/qtest/memexpose-test$(EXESUF): tests/qtest/memexpose-test.o $(libqos-pc-obj-y)
 tests/qtest/dbus-vmstate-test$(EXESUF): tests/qtest/dbus-vmstate-test.o tests/qtest/migration-helpers.o tests/qtest/dbus-vmstate1.o $(libqos-pc-obj-y) $(libqos-spapr-obj-y)
 tests/qtest/vhost-user-bridge$(EXESUF): tests/qtest/vhost-user-bridge.o $(test-util-obj-y) libvhost-user.a
 tests/qtest/test-arm-mptimer$(EXESUF): tests/qtest/test-arm-mptimer.o
diff --git a/tests/qtest/memexpose-test.c b/tests/qtest/memexpose-test.c
new file mode 100644
index 0000000..70a8a73
--- /dev/null
+++ b/tests/qtest/memexpose-test.c
@@ -0,0 +1,364 @@
+/*
+ *  Memexpose PCI device tests
+ *
+ *  Copyright (C) 2020 Samsung Electronics Co Ltd.
+ *    Igor Kotrasinski, <i.kotrasinsk@partner.samsung.com>
+ *
+ *  This program is free software; you can redistribute it and/or modify it
+ *  under the terms of the GNU General Public License as published by the
+ *  Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful, but WITHOUT
+ *  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ *  FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ *  for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include <glib/gstdio.h>
+#include "libqos/libqos-pc.h"
+#include "libqtest-single.h"
+#include "hw/misc/memexpose/memexpose-core.h"
+
+static char *tmpshm;
+static char *tmpdir;
+
+static void save_fn(QPCIDevice *dev, int devfn, void *data)
+{
+    QPCIDevice **pdev = (QPCIDevice **) data;
+
+    *pdev = dev;
+}
+
+static QPCIDevice *get_device(QPCIBus *pcibus)
+{
+    QPCIDevice *dev;
+
+    dev = NULL;
+    qpci_device_foreach(pcibus, 0x1af4, 0x1111, save_fn, &dev);
+    g_assert(dev != NULL);
+
+    return dev;
+}
+
+typedef struct _MexpState {
+    QOSState *qs;
+    QPCIBar reg_bar, mem_bar;
+    QPCIDevice *dev;
+} MexpState;
+
+
+static inline void read_mexp_mem(MexpState *s, uint64_t off,
+                                 void *buf, size_t len)
+{
+    qpci_memread(s->dev, s->mem_bar, off, buf, len);
+}
+
+static inline void write_mexp_mem(MexpState *s, uint64_t off,
+                                  const void *buf, size_t len)
+{
+    qpci_memwrite(s->dev, s->mem_bar, off, buf, len);
+}
+
+static inline void read_mem(MexpState *s, uint64_t off,
+                            void *buf, size_t len)
+{
+    char *cbuf = buf;
+    for (size_t i = 0; i < len; i++) {
+        cbuf[i] = qtest_readb(s->qs->qts, off + i);
+    }
+}
+
+static inline void write_mem(MexpState *s, uint64_t off,
+                             const void *buf, size_t len)
+{
+    const char *cbuf = buf;
+    for (size_t i = 0; i < len; i++) {
+        qtest_writeb(s->qs->qts, off + i, cbuf[i]);
+    }
+}
+
+static inline void write_mexp_reg(MexpState *s, uint64_t off,
+                                  uint64_t val)
+{
+    qpci_io_writeq(s->dev, s->reg_bar, off, val);
+}
+
+static inline uint64_t read_mexp_reg(MexpState *s, uint64_t off)
+{
+    return qpci_io_readq(s->dev, s->reg_bar, off);
+}
+
+static void mexp_send_intr(MexpState *s, uint64_t type,
+                           uint64_t data)
+{
+    uint64_t send = 1;
+    write_mexp_reg(s, MEMEXPOSE_INTR_TX_TYPE_ADDR, type);
+    write_mexp_reg(s, MEMEXPOSE_INTR_TX_DATA_ADDR, data);
+    write_mexp_reg(s, MEMEXPOSE_INTR_SEND_ADDR, send);
+}
+
+static uint64_t mexp_recv_intr(MexpState *s, uint64_t *type,
+                               uint64_t *data)
+{
+    uint64_t recv = 0;
+    int tries = 0;
+    while (recv == 0 && tries < 100) {
+        recv = read_mexp_reg(s, MEMEXPOSE_INTR_RECV_ADDR);
+        if (recv) {
+            break;
+        }
+        tries++;
+        g_usleep(10000);
+    }
+    *type = read_mexp_reg(s, MEMEXPOSE_INTR_RX_TYPE_ADDR);
+    *data = read_mexp_reg(s, MEMEXPOSE_INTR_RX_DATA_ADDR);
+    return recv;
+}
+
+static void setup_vm_cmd(MexpState *s, const char *cmd, bool msix)
+{
+    uint64_t barsize;
+    const char *arch = qtest_get_arch();
+
+    if (strcmp(arch, "x86_64") == 0) {
+        s->qs = qtest_pc_boot(cmd);
+    } else {
+        g_printerr("memexpose-test tests are only available on x86_64\n");
+        exit(EXIT_FAILURE);
+    }
+    s->dev = get_device(s->qs->pcibus);
+    s->reg_bar = qpci_iomap(s->dev, 0, &barsize);
+    g_assert_cmpuint(barsize, ==, MEMEXPOSE_INTR_MEM_SIZE);
+
+    if (msix) {
+        qpci_msix_enable(s->dev);
+    }
+
+    s->mem_bar = qpci_iomap(s->dev, 1, &barsize);
+
+    qpci_device_enable(s->dev);
+}
+
+static void remove_socks(char *tmp_path)
+{
+    char *memsock = g_strdup_printf("%s/qemu-mexp-mem", tmp_path);
+    g_remove(memsock);
+    g_free(memsock);
+
+    char *intsock = g_strdup_printf("%s/qemu-mexp-intr", tmp_path);
+    g_remove(intsock);
+    g_free(intsock);
+}
+static void add_socks(char *tmp_path)
+{
+    char *memsock = g_strdup_printf("%s/qemu-mexp-mem", tmp_path);
+    mkfifo(memsock, 0700);
+    g_free(memsock);
+
+    char *intsock = g_strdup_printf("%s/qemu-mexp-intr", tmp_path);
+    mkfifo(intsock, 0700);
+    g_free(intsock);
+}
+
+static void setup_vm(MexpState *s, int server)
+{
+    unsigned long shm_size = 1 << 28;
+    const char *socksrv = server ? "server,nowait," : "";
+    char *cmd = g_strdup_printf("-mem-path %s "
+                                "-device memexpose-pci,mem_chardev=mem-mem,"
+                                "intr_chardev=mem-intr,shm_size=0x%lx "
+                                "-chardev socket,%spath=%s/qemu-mexp-mem,id=mem-mem "
+                                "-chardev socket,%spath=%s/qemu-mexp-intr,id=mem-intr",
+                                tmpshm, shm_size,
+                                socksrv, tmpdir, socksrv, tmpdir);
+    setup_vm_cmd(s, cmd, false);
+    g_free(cmd);
+}
+
+static void cleanup_vm(MexpState *s)
+{
+    assert(!global_qtest);
+    g_free(s->dev);
+    qtest_shutdown(s->qs);
+}
+
+static void setup_connected_vms(MexpState *s1, MexpState *s2)
+{
+    remove_socks(tmpdir);
+    add_socks(tmpdir);
+    setup_vm(s1, 1);
+    setup_vm(s2, 0);
+
+    write_mexp_reg(s1, MEMEXPOSE_INTR_ENABLE_ADDR, 1);
+    write_mexp_reg(s2, MEMEXPOSE_INTR_ENABLE_ADDR, 1);
+}
+
+static void test_memexpose_simple_memshare(void)
+{
+    size_t sixty_four_megs = 1 << (20 + 6);
+    uint32_t in, out;
+
+    MexpState s1, s2;
+    setup_connected_vms(&s1, &s2);
+
+    in = 0xdeadbeef;
+    write_mem(&s1, sixty_four_megs, &in, 4);
+    read_mexp_mem(&s2, sixty_four_megs, &out, 4);
+    g_assert_cmphex(in, ==, out);
+    in = 0xbaba1510;
+    write_mem(&s1, sixty_four_megs, &in, 4);
+    read_mexp_mem(&s2, sixty_four_megs, &out, 4);
+    g_assert_cmphex(in, ==, out);
+
+    in = 0xaaaaaaaa;
+    write_mexp_mem(&s1, sixty_four_megs, &in, 4);
+    read_mem(&s2, sixty_four_megs, &out, 4);
+    g_assert_cmphex(in, ==, out);
+    in = 0xbbbbbbbb;
+    write_mexp_mem(&s1, sixty_four_megs, &in, 4);
+    read_mem(&s2, sixty_four_megs, &out, 4);
+    g_assert_cmphex(in, ==, out);
+
+    cleanup_vm(&s1);
+    cleanup_vm(&s2);
+}
+
+static void test_memexpose_simple_interrupts(void)
+{
+    MexpState s1, s2;
+    setup_connected_vms(&s1, &s2);
+
+    mexp_send_intr(&s1, 0x1, 0xdeadbea7);
+    mexp_send_intr(&s1, 0x2, 0xdeadbaba);
+
+    uint64_t type, data, received;
+
+    received = mexp_recv_intr(&s2, &type, &data);
+    g_assert_cmpuint(received, ==, 1);
+    g_assert_cmphex(type, ==, 0x1);
+    g_assert_cmphex(data, ==, 0xdeadbea7);
+
+    received = mexp_recv_intr(&s2, &type, &data);
+    g_assert_cmpuint(received, ==, 1);
+    g_assert_cmphex(type, ==, 0x2);
+    g_assert_cmphex(data, ==, 0xdeadbaba);
+
+    cleanup_vm(&s1);
+    cleanup_vm(&s2);
+}
+
+static void test_memexpose_overfull_intr_queue(void)
+{
+    MexpState s1, s2;
+    setup_connected_vms(&s1, &s2);
+
+    unsigned int i, expected, runs = MEMEXPOSE_INTR_QUEUE_SIZE + 10;
+    uint64_t type, data;
+
+    for (i = 0; i < runs; i++) {
+        mexp_send_intr(&s1, i, i);
+    }
+
+    expected = 0;
+    while (mexp_recv_intr(&s2, &type, &data)) {
+        if (expected < MEMEXPOSE_INTR_QUEUE_SIZE) {
+            g_assert_cmphex(type, ==, expected);
+            g_assert_cmphex(data, ==, expected);
+            expected += 1;
+        } else {
+            g_assert_cmphex(type, >, expected);
+            g_assert_cmphex(type, <, runs);
+            g_assert_cmphex(data, >, expected);
+            g_assert_cmphex(data, <, runs);
+            expected = type;
+        }
+    }
+    g_assert_cmpuint(expected, >=, MEMEXPOSE_INTR_QUEUE_SIZE - 1);
+
+    cleanup_vm(&s1);
+    cleanup_vm(&s2);
+}
+
+static void test_memexpose_intr_data(void)
+{
+    MexpState s1, s2;
+    setup_connected_vms(&s1, &s2);
+
+    unsigned int i;
+    uint64_t type, data, received;
+
+    uint64_t send = 1;
+    write_mexp_reg(&s1, MEMEXPOSE_INTR_TX_TYPE_ADDR, 0);
+    for (i = 0; i < MEMEXPOSE_MAX_INTR_DATA_SIZE; i += 8) {
+        write_mexp_reg(&s1, MEMEXPOSE_INTR_TX_DATA_ADDR + i, i);
+    }
+    write_mexp_reg(&s1, MEMEXPOSE_INTR_SEND_ADDR, send);
+
+    received = mexp_recv_intr(&s2, &type, &data);
+    g_assert_cmpuint(received, ==, 1);
+    for (i = 0; i < MEMEXPOSE_MAX_INTR_DATA_SIZE; i += 8) {
+        data = read_mexp_reg(&s1, MEMEXPOSE_INTR_TX_DATA_ADDR + i);
+        g_assert_cmphex(data, ==, i);
+    }
+
+    cleanup_vm(&s1);
+    cleanup_vm(&s2);
+}
+
+static void cleanup(void)
+{
+    if (tmpshm) {
+        g_rmdir(tmpshm);
+        tmpshm = NULL;
+    }
+
+    if (tmpdir) {
+        remove_socks(tmpdir);
+        g_rmdir(tmpdir);
+        tmpdir = NULL;
+    }
+}
+
+static void abrt_handler(void *data)
+{
+    cleanup();
+}
+
+int main(int argc, char **argv)
+{
+    int ret = 1;
+    gchar dir[] = "/tmp/memexpose-test.XXXXXX";
+    gchar shmdir[] = "/dev/shm/memexpose-test.XXXXXX";
+
+    g_test_init(&argc, &argv, NULL);
+
+    qtest_add_abrt_handler(abrt_handler, NULL);
+
+    if (mkdtemp(dir) == NULL) {
+        g_error("mkdtemp: %s", g_strerror(errno));
+        goto out;
+    }
+    tmpdir = dir;
+    if (mkdtemp(shmdir) == NULL) {
+        g_error("mkdtemp: %s", g_strerror(errno));
+        goto out;
+    }
+    tmpshm = shmdir;
+
+    qtest_add_func("/memexpose/memory", test_memexpose_simple_memshare);
+    qtest_add_func("/memexpose/interrupts", test_memexpose_simple_interrupts);
+    qtest_add_func("/memexpose/interrupts_full_queue",
+                   test_memexpose_overfull_intr_queue);
+    qtest_add_func("/memexpose/interrupts_all_data", test_memexpose_intr_data);
+    ret = g_test_run();
+
+out:
+    cleanup();
+    return ret;
+}
-- 
2.7.4




* [RFC 9/9] hw/arm/virt: Hack in support for memexpose device
       [not found]   ` <CGME20200204113111eucas1p2a96ec20fbaf679215b50d9f03245b33e@eucas1p2.samsung.com>
@ 2020-02-04 11:30     ` i.kotrasinsk
  0 siblings, 0 replies; 20+ messages in thread
From: i.kotrasinsk @ 2020-02-04 11:30 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Igor Kotrasinski, pbonzini

From: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>

Signed-off-by: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
---
 hw/arm/virt.c         | 110 +++++++++++++++++++++++++++++++++++++++++++++++++-
 include/hw/arm/virt.h |   5 +++
 2 files changed, 114 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index f788fe2..ba35b21 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -71,6 +71,8 @@
 #include "hw/mem/pc-dimm.h"
 #include "hw/mem/nvdimm.h"
 #include "hw/acpi/generic_event_device.h"
+#include "hw/misc/memexpose/memexpose-core.h"
+#include "hw/misc/memexpose/memexpose-memregion.h"
 
 #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
     static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
@@ -168,6 +170,8 @@ static MemMapEntry extended_memmap[] = {
     /* Additional 64 MB redist region (can contain up to 512 redistributors) */
     [VIRT_HIGH_GIC_REDIST2] =   { 0x0, 64 * MiB },
     [VIRT_HIGH_PCIE_ECAM] =     { 0x0, 256 * MiB },
+    [VIRT_HIGH_MEMEXPOSE_MMIO] =     { 0x0, 256 * MiB },
+    [VIRT_HIGH_MEMEXPOSE] =     { 0x0, 32 * GiB },
     /* Second PCIe window */
     [VIRT_HIGH_PCIE_MMIO] =     { 0x0, 512 * GiB },
 };
@@ -179,6 +183,7 @@ static const int a15irqmap[] = {
     [VIRT_GPIO] = 7,
     [VIRT_SECURE_UART] = 8,
     [VIRT_ACPI_GED] = 9,
+    [VIRT_MEMEXPOSE] = 10,
     [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
     [VIRT_GIC_V2M] = 48, /* ...to 48 + NUM_GICV2M_SPIS - 1 */
     [VIRT_SMMU] = 74,    /* ...to 74 + NUM_SMMU_IRQS - 1 */
@@ -763,6 +768,67 @@ static void create_uart(const VirtMachineState *vms, int uart,
     g_free(nodename);
 }
 
+static void create_memexpose(const VirtMachineState *vms, MemoryRegion *mem,
+                             Error **errp)
+{
+    if (!vms->memexpose_size) {
+        error_setg(errp, "For memexpose support, memexpose_size "
+                         "needs to be greater than zero");
+        return;
+    }
+    if (!strcmp("", vms->memexpose_ep)) {
+        error_setg(errp, "For memexpose support, memexpose_ep "
+                         "needs to be non-empty");
+        return;
+    }
+
+    DeviceState *dev = qdev_create(NULL, "memexpose-memdev");
+
+    hwaddr base = vms->memmap[VIRT_HIGH_MEMEXPOSE].base;
+    hwaddr size = vms->memexpose_size;
+    hwaddr mmio_base = vms->memmap[VIRT_HIGH_MEMEXPOSE_MMIO].base;
+    hwaddr mmio_size = MEMEXPOSE_INTR_MEM_SIZE;
+    int irq = vms->irqmap[VIRT_MEMEXPOSE];
+
+    qdev_prop_set_uint64(dev, "shm_size", size);
+
+    char *mem_ep = g_strdup_printf("%s-mem", vms->memexpose_ep);
+    Chardev *c = qemu_chr_find(mem_ep);
+    g_free(mem_ep);
+    if (!c) {
+        error_setg(errp, "Failed to find memexpose memory endpoint");
+        return;
+    }
+    qdev_prop_set_chr(dev, "mem_chardev", c);
+    char *intr_ep = g_strdup_printf("%s-intr", vms->memexpose_ep);
+    c = qemu_chr_find(intr_ep);
+    g_free(intr_ep);
+    if (!c) {
+        error_setg(errp, "Failed to find memexpose interrupt endpoint");
+        return;
+    }
+    qdev_prop_set_chr(dev, "intr_chardev", c);
+
+    qdev_init_nofail(dev);
+    MemexposeMemdev *mdev = MEMEXPOSE_MEMDEV(dev);
+    SysBusDevice *s = SYS_BUS_DEVICE(dev);
+    memory_region_add_subregion(mem, mmio_base, &mdev->intr.shmem);
+    memory_region_add_subregion(mem, base, &mdev->mem.shmem);
+    sysbus_connect_irq(s, 0, qdev_get_gpio_in(vms->gic, irq));
+
+    char *nodename = g_strdup_printf("/memexpose@%" PRIx64, mmio_base);
+    qemu_fdt_add_subnode(vms->fdt, nodename);
+    qemu_fdt_setprop_string(vms->fdt, nodename, "compatible",
+                            "memexpose-memregion");
+    qemu_fdt_setprop_sized_cells(vms->fdt, nodename, "reg",
+                                 2, mmio_base, 2, mmio_size,
+                                 2, base, 2, size);
+    qemu_fdt_setprop_cells(vms->fdt, nodename, "interrupts",
+                           GIC_FDT_IRQ_TYPE_SPI, irq,
+                           GIC_FDT_IRQ_FLAGS_LEVEL_HI);
+    g_free(nodename);
+}
+
 static void create_rtc(const VirtMachineState *vms)
 {
     char *nodename;
@@ -1572,7 +1638,6 @@ static void machvirt_init(MachineState *machine)
                            UINT64_MAX);
         memory_region_add_subregion_overlap(secure_sysmem, 0, sysmem, -1);
     }
-
     firmware_loaded = virt_firmware_init(vms, sysmem,
                                          secure_sysmem ?: sysmem);
 
@@ -1721,6 +1786,8 @@ static void machvirt_init(MachineState *machine)
     fdt_add_pmu_nodes(vms);
 
     create_uart(vms, VIRT_UART, sysmem, serial_hd(0));
+    if (vms->memexpose_size > 0)
+        create_memexpose(vms, sysmem, &error_abort);
 
     if (vms->secure) {
         create_secure_ram(vms, secure_sysmem);
@@ -1849,6 +1916,32 @@ static void virt_set_gic_version(Object *obj, const char *value, Error **errp)
     }
 }
 
+static char *virt_get_memexpose_ep(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+    return g_strdup(vms->memexpose_ep);
+}
+
+static void virt_set_memexpose_ep(Object *obj, const char *value, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+    g_free(vms->memexpose_ep);
+    vms->memexpose_ep = g_strdup(value);
+}
+
+static char *virt_get_memexpose_size(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+    return g_strdup_printf("%" PRIu64, vms->memexpose_size);
+}
+
+static void virt_set_memexpose_size(Object *obj, const char *value,
+                                    Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+    parse_option_size("memexpose-size", value, &vms->memexpose_size, errp);
+}
+
 static char *virt_get_iommu(Object *obj, Error **errp)
 {
     VirtMachineState *vms = VIRT_MACHINE(obj);
@@ -2103,6 +2196,21 @@ static void virt_instance_init(Object *obj)
                                     "Set GIC version. "
                                     "Valid values are 2, 3 and host", NULL);
 
+    /* Memexpose disabled by default */
+    vms->memexpose_ep = g_strdup("");
+    object_property_add_str(obj, "memexpose-ep", virt_get_memexpose_ep,
+                            virt_set_memexpose_ep, NULL);
+    object_property_set_description(obj, "memexpose-ep",
+                                    "Set path to memexpose server socket. "
+                                    "Sockets used for communication will be "
+                                    "<name>-intr and <name>-mem. Set to empty "
+                                    "to disable memexpose.", NULL);
+    vms->memexpose_size = 0;
+    object_property_add_str(obj, "memexpose-size", virt_get_memexpose_size,
+                            virt_set_memexpose_size, NULL);
+    object_property_set_description(obj, "memexpose-size",
+                                    "Size of the memexpose region to access.",
+                                    NULL);
     vms->highmem_ecam = !vmc->no_highmem_ecam;
 
     if (vmc->no_its) {
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 71508bf..d0aeb67 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -76,6 +76,7 @@ enum {
     VIRT_PLATFORM_BUS,
     VIRT_GPIO,
     VIRT_SECURE_UART,
+    VIRT_MEMEXPOSE,
     VIRT_SECURE_MEM,
     VIRT_PCDIMM_ACPI,
     VIRT_ACPI_GED,
@@ -86,6 +87,8 @@ enum {
 enum {
     VIRT_HIGH_GIC_REDIST2 =  VIRT_LOWMEMMAP_LAST,
     VIRT_HIGH_PCIE_ECAM,
+    VIRT_HIGH_MEMEXPOSE_MMIO,
+    VIRT_HIGH_MEMEXPOSE,
     VIRT_HIGH_PCIE_MMIO,
 };
 
@@ -124,6 +127,8 @@ typedef struct {
     bool its;
     bool virt;
     int32_t gic_version;
+    char *memexpose_ep;
+    uint64_t memexpose_size;
     VirtIOMMUType iommu;
     struct arm_boot_info bootinfo;
     MemMapEntry *memmap;
-- 
2.7.4




* Re: [RFC 0/9] Add an interVM memory sharing device
  2020-02-04 11:30 ` [RFC 0/9] Add an interVM memory sharing device i.kotrasinsk
                     ` (8 preceding siblings ...)
       [not found]   ` <CGME20200204113111eucas1p2a96ec20fbaf679215b50d9f03245b33e@eucas1p2.samsung.com>
@ 2020-02-04 12:13   ` no-reply
  2020-02-04 12:16   ` no-reply
  2020-02-05 14:39   ` Stefan Hajnoczi
  11 siblings, 0 replies; 20+ messages in thread
From: no-reply @ 2020-02-04 12:13 UTC (permalink / raw)
  To: i.kotrasinsk; +Cc: peter.maydell, i.kotrasinsk, qemu-devel, pbonzini

Patchew URL: https://patchew.org/QEMU/1580815851-28887-1-git-send-email-i.kotrasinsk@partner.samsung.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [RFC 0/9] Add an interVM memory sharing device
Message-id: 1580815851-28887-1-git-send-email-i.kotrasinsk@partner.samsung.com
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 - [tag update]      patchew/20200204105142.21845-1-alex.bennee@linaro.org -> patchew/20200204105142.21845-1-alex.bennee@linaro.org
 - [tag update]      patchew/20200204110501.10731-1-dgilbert@redhat.com -> patchew/20200204110501.10731-1-dgilbert@redhat.com
Switched to a new branch 'test'
4b2f7b1 hw/arm/virt: Hack in support for memexpose device
bea4faf hw/misc/memexpose: Add simple tests
b9a53fc hw/misc/memexpose: Add memexpose memory region device
8d2c64f hw/misc/memexpose: Add memexpose pci device
6a6d45a hw/misc/memexpose: Add core memexpose files
c35aef7 hw/misc/memexpose: Add documentation
d6e7169 memory: Hack - use shared memory when possible
6f6a337 memory: Support mmap offset for fd-backed memory regions
fe515a9 memory: Add function for finding flat memory ranges

=== OUTPUT BEGIN ===
1/9 Checking commit fe515a937f89 (memory: Add function for finding flat memory ranges)
2/9 Checking commit 6f6a337b8e94 (memory: Support mmap offset for fd-backed memory regions)
3/9 Checking commit d6e716993be8 (memory: Hack - use shared memory when possible)
4/9 Checking commit c35aef77ec71 (hw/misc/memexpose: Add documentation)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#11: 
new file mode 100644

ERROR: code blocks in documentation should have empty lines with exactly 4 columns of whitespace
#45: FILE: docs/specs/memexpose-spec.txt:30:
+ $

total: 1 errors, 1 warnings, 168 lines checked

Patch 4/9 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

5/9 Checking commit 6a6d45a15b43 (hw/misc/memexpose: Add core memexpose files)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#115: 
new file mode 100644

total: 0 errors, 1 warnings, 1235 lines checked

Patch 5/9 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
6/9 Checking commit 8d2c64f273a3 (hw/misc/memexpose: Add memexpose pci device)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#31: 
new file mode 100644

total: 0 errors, 1 warnings, 228 lines checked

Patch 6/9 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
7/9 Checking commit b9a53fc1ef8b (hw/misc/memexpose: Add memexpose memory region device)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#33: 
new file mode 100644

total: 0 errors, 1 warnings, 195 lines checked

Patch 7/9 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
8/9 Checking commit bea4faff4da5 (hw/misc/memexpose: Add simple tests)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#43: 
new file mode 100644

total: 0 errors, 1 warnings, 385 lines checked

Patch 8/9 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
9/9 Checking commit 4b2f7b165800 (hw/arm/virt: Hack in support for memexpose device)
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/1580815851-28887-1-git-send-email-i.kotrasinsk@partner.samsung.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com


* Re: [RFC 0/9] Add an interVM memory sharing device
  2020-02-04 11:30 ` [RFC 0/9] Add an interVM memory sharing device i.kotrasinsk
                     ` (9 preceding siblings ...)
  2020-02-04 12:13   ` [RFC 0/9] Add an interVM memory sharing device no-reply
@ 2020-02-04 12:16   ` no-reply
  2020-02-05 14:39   ` Stefan Hajnoczi
  11 siblings, 0 replies; 20+ messages in thread
From: no-reply @ 2020-02-04 12:16 UTC (permalink / raw)
  To: i.kotrasinsk; +Cc: peter.maydell, i.kotrasinsk, qemu-devel, pbonzini

Patchew URL: https://patchew.org/QEMU/1580815851-28887-1-git-send-email-i.kotrasinsk@partner.samsung.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      block/blkreplay.o
  CC      block/parallels.o
  CC      block/blklogwrites.o
/tmp/qemu-test/src/docs/../include/exec/memory.h:923: warning: Function parameter or member 'mmap_offset' not described in 'memory_region_init_ram_from_fd'

Warning, treated as error:
/tmp/qemu-test/src/docs/../include/exec/memory.h:1923:Unexpected indentation.
  CC      block/block-backend.o
  CC      block/snapshot.o
---
  CC      block/create.o
  CC      block/throttle-groups.o
  CC      block/nbd.o
make: *** [Makefile:1043: docs/devel/index.html] Error 2
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 662, in <module>
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=4e0ce4bc006546c6a7e47701712aa571', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-gsa7_m4x/src/docker-src.2020-02-04-07.14.22.20243:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=4e0ce4bc006546c6a7e47701712aa571
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-gsa7_m4x/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real    1m55.421s
user    0m8.478s


The full log is available at
http://patchew.org/logs/1580815851-28887-1-git-send-email-i.kotrasinsk@partner.samsung.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com


* Re: [RFC 0/9] Add an interVM memory sharing device
  2020-02-04 11:30 ` [RFC 0/9] Add an interVM memory sharing device i.kotrasinsk
                     ` (10 preceding siblings ...)
  2020-02-04 12:16   ` no-reply
@ 2020-02-05 14:39   ` Stefan Hajnoczi
  2020-02-05 14:49     ` Jan Kiszka
  11 siblings, 1 reply; 20+ messages in thread
From: Stefan Hajnoczi @ 2020-02-05 14:39 UTC (permalink / raw)
  To: i.kotrasinsk
  Cc: peter.maydell, Igor Mammedov, Jan Kiszka, qemu-devel, pbonzini

[-- Attachment #1: Type: text/plain, Size: 4691 bytes --]

On Tue, Feb 04, 2020 at 12:30:42PM +0100, i.kotrasinsk@partner.samsung.com wrote:
> From: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
> 
> This patchset adds a "memory exposing" device that allows two QEMU
> instances to share arbitrary memory regions. Unlike ivshmem, it does not
> create a new region of memory that's shared between VMs, but instead
> allows one VM to access any memory region of the other VM we choose to
> share.
> 
> The motivation for this device is a sort of ARM Trustzone "emulation",
> where a rich system running on one machine (e.g. x86_64 linux) is able
> to perform SMCs to a trusted system running on another (e.g. OpTEE on
> ARM). With a device that allows sharing arbitrary memory between VMs,
> this can be achieved with minimal changes to the trusted system and its
> linux driver while allowing the rich system to run on a speedier x86
> emulator. I prepared additional patches for linux, OpTEE OS and OpTEE
> build system as a PoC that such emulation works and passes OpTEE tests;
> I'm not sure what would be the best way to share them.
> 
> This patchset is my first foray into QEMU source code and I'm certain
> it's not yet ready to be merged in. I'm not sure whether memory sharing
> code has any race conditions or breaks rules of working with memory
> regions, or if having VMs communicate synchronously via chardevs is the
> right way to do it. I do believe the basic idea for sharing memory
> regions is sound and that it could be useful for inter-VM communication.

Hi,
Without having looked into the patches yet, I'm already wondering if you
can use the existing -object
memory-backend-file,size=512M,mem-path=/my/shared/mem feature for your
use case?

That's the existing mechanism for fully sharing guest RAM and if you
want to share all of memory then maybe a device is not necessary - just
share the memory.
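
For illustration only, here is a minimal sketch of the POSIX mechanism that a shared, file-backed memory region relies on: two MAP_SHARED mappings of the same file see each other's stores. This is not QEMU code. The helper names (map_shared, shared_roundtrip) and the temporary path are hypothetical, and both mappings live in one process purely to keep the example self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map `size` bytes of the file behind `fd` with MAP_SHARED, so that stores
 * are visible through every other shared mapping of the same file.
 * Returns NULL on failure.
 */
static void *map_shared(int fd, size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Write `value` through one mapping of a temporary file and read it back
 * through a second, independent mapping of the same file.
 * Returns the value read, or 0 if any step fails.
 */
static uint32_t shared_roundtrip(uint32_t value)
{
    char path[] = "/tmp/shared-mem-sketch.XXXXXX"; /* hypothetical path */
    size_t size = 4096;
    uint32_t seen = 0;
    int fd = mkstemp(path);

    if (fd < 0) {
        return 0;
    }
    unlink(path); /* the open descriptor keeps the file alive */

    if (ftruncate(fd, size) == 0) {
        uint32_t *writer = map_shared(fd, size);
        uint32_t *reader = map_shared(fd, size);

        if (writer && reader) {
            writer[0] = value;  /* store through the first mapping */
            seen = reader[0];   /* load through the second mapping */
        }
        if (writer) {
            munmap(writer, size);
        }
        if (reader) {
            munmap(reader, size);
        }
    }
    close(fd);
    return seen;
}
```

With two QEMU instances, each process would hold one such mapping of the same backing file. Whether that covers the use case here depends on whether all of guest RAM can be backed this way.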

Stefan

> Igor Kotrasinski (9):
>   memory: Add function for finding flat memory ranges
>   memory: Support mmap offset for fd-backed memory regions
>   memory: Hack - use shared memory when possible
>   hw/misc/memexpose: Add documentation
>   hw/misc/memexpose: Add core memexpose files
>   hw/misc/memexpose: Add memexpose pci device
>   hw/misc/memexpose: Add memexpose memory region device
>   hw/misc/memexpose: Add simple tests
>   hw/arm/virt: Hack in support for memexpose device
> 
>  Kconfig.host                            |   3 +
>  MAINTAINERS                             |  12 +
>  Makefile                                |   1 +
>  backends/hostmem-memfd.c                |   2 +-
>  configure                               |   8 +
>  docs/specs/memexpose-spec.txt           | 168 +++++++++
>  exec.c                                  |  10 +-
>  hw/arm/virt.c                           | 110 +++++-
>  hw/core/numa.c                          |   4 +-
>  hw/mem/Kconfig                          |   3 +
>  hw/misc/Makefile.objs                   |   1 +
>  hw/misc/ivshmem.c                       |   3 +-
>  hw/misc/memexpose/Makefile.objs         |   4 +
>  hw/misc/memexpose/memexpose-core.c      | 630 ++++++++++++++++++++++++++++++++
>  hw/misc/memexpose/memexpose-core.h      | 109 ++++++
>  hw/misc/memexpose/memexpose-memregion.c | 142 +++++++
>  hw/misc/memexpose/memexpose-memregion.h |  41 +++
>  hw/misc/memexpose/memexpose-msg.c       | 261 +++++++++++++
>  hw/misc/memexpose/memexpose-msg.h       | 161 ++++++++
>  hw/misc/memexpose/memexpose-pci.c       | 218 +++++++++++
>  include/exec/memory.h                   |  20 +
>  include/exec/ram_addr.h                 |   2 +-
>  include/hw/arm/virt.h                   |   5 +
>  include/qemu/mmap-alloc.h               |   1 +
>  memory.c                                |  82 ++++-
>  tests/qtest/Makefile.include            |   2 +
>  tests/qtest/memexpose-test.c            | 364 ++++++++++++++++++
>  util/mmap-alloc.c                       |   7 +-
>  util/oslib-posix.c                      |   2 +-
>  29 files changed, 2360 insertions(+), 16 deletions(-)
>  create mode 100644 docs/specs/memexpose-spec.txt
>  create mode 100644 hw/misc/memexpose/Makefile.objs
>  create mode 100644 hw/misc/memexpose/memexpose-core.c
>  create mode 100644 hw/misc/memexpose/memexpose-core.h
>  create mode 100644 hw/misc/memexpose/memexpose-memregion.c
>  create mode 100644 hw/misc/memexpose/memexpose-memregion.h
>  create mode 100644 hw/misc/memexpose/memexpose-msg.c
>  create mode 100644 hw/misc/memexpose/memexpose-msg.h
>  create mode 100644 hw/misc/memexpose/memexpose-pci.c
>  create mode 100644 tests/qtest/memexpose-test.c
> 
> -- 
> 2.7.4
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]


* Re: [RFC 0/9] Add an interVM memory sharing device
  2020-02-05 14:39   ` Stefan Hajnoczi
@ 2020-02-05 14:49     ` Jan Kiszka
  2020-02-07  9:00       ` Igor Kotrasiński
  0 siblings, 1 reply; 20+ messages in thread
From: Jan Kiszka @ 2020-02-05 14:49 UTC (permalink / raw)
  To: Stefan Hajnoczi, i.kotrasinsk
  Cc: peter.maydell, Igor Mammedov, qemu-devel, pbonzini

On 05.02.20 15:39, Stefan Hajnoczi wrote:
> On Tue, Feb 04, 2020 at 12:30:42PM +0100, i.kotrasinsk@partner.samsung.com wrote:
>> From: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
>>
>> This patchset adds a "memory exposing" device that allows two QEMU
>> instances to share arbitrary memory regions. Unlike ivshmem, it does not
>> create a new region of memory that's shared between VMs, but instead
>> allows one VM to access any memory region of the other VM we choose to
>> share.
>>
>> The motivation for this device is a sort of ARM Trustzone "emulation",
>> where a rich system running on one machine (e.g. x86_64 linux) is able
>> to perform SMCs to a trusted system running on another (e.g. OpTEE on
>> ARM). With a device that allows sharing arbitrary memory between VMs,
>> this can be achieved with minimal changes to the trusted system and its
>> linux driver while allowing the rich system to run on a speedier x86
>> emulator. I prepared additional patches for linux, OpTEE OS and OpTEE
>> build system as a PoC that such emulation works and passes OpTEE tests;
>> I'm not sure what would be the best way to share them.
>>
>> This patchset is my first foray into QEMU source code and I'm certain
>> it's not yet ready to be merged in. I'm not sure whether memory sharing
>> code has any race conditions or breaks rules of working with memory
>> regions, or if having VMs communicate synchronously via chardevs is the
>> right way to do it. I do believe the basic idea for sharing memory
>> regions is sound and that it could be useful for inter-VM communication.
> 
> Hi,
> Without having looked into the patches yet, I'm already wondering if you
> can use the existing -object
> memory-backend-file,size=512M,mem-path=/my/shared/mem feature for your
> use case?
> 
> That's the existing mechanism for fully sharing guest RAM and if you
> want to share all of memory then maybe a device is not necessary - just
> share the memory.

I suspect it's about sharing that memory in a discoverable way. Maybe it 
is also about the signalling channel defined in the device.

OTOH, when it's really about sharing everything, even bidirectional, 
that rather looks like a pragmatic shortcut, not a generic model.

The patches should clarify their use case a bit further, I think. The 
title suggests a generic sharing solution, but my impression is that it 
rather caters to a specific case under specific boundary conditions.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



* Re: [RFC 0/9] Add an interVM memory sharing device
  2020-02-05 14:49     ` Jan Kiszka
@ 2020-02-07  9:00       ` Igor Kotrasiński
  2020-02-07 10:04         ` Igor Mammedov
  0 siblings, 1 reply; 20+ messages in thread
From: Igor Kotrasiński @ 2020-02-07  9:00 UTC (permalink / raw)
  To: Jan Kiszka, Stefan Hajnoczi
  Cc: peter.maydell, Igor Mammedov, qemu-devel, pbonzini

On 2/5/20 3:49 PM, Jan Kiszka wrote:
> On 05.02.20 15:39, Stefan Hajnoczi wrote:
>> On Tue, Feb 04, 2020 at 12:30:42PM +0100, 
>> i.kotrasinsk@partner.samsung.com wrote:
>>> From: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
>>>
>>> This patchset adds a "memory exposing" device that allows two QEMU
>>> instances to share arbitrary memory regions. Unlike ivshmem, it does not
>>> create a new region of memory that's shared between VMs, but instead
>>> allows one VM to access any memory region of the other VM we choose to
>>> share.
>>>
>>> The motivation for this device is a sort of ARM Trustzone "emulation",
>>> where a rich system running on one machine (e.g. x86_64 linux) is able
>>> to perform SMCs to a trusted system running on another (e.g. OpTEE on
>>> ARM). With a device that allows sharing arbitrary memory between VMs,
>>> this can be achieved with minimal changes to the trusted system and its
>>> linux driver while allowing the rich system to run on a speedier x86
>>> emulator. I prepared additional patches for linux, OpTEE OS and OpTEE
>>> build system as a PoC that such emulation works and passes OpTEE tests;
>>> I'm not sure what would be the best way to share them.
>>>
>>> This patchset is my first foray into QEMU source code and I'm certain
>>> it's not yet ready to be merged in. I'm not sure whether memory sharing
>>> code has any race conditions or breaks rules of working with memory
>>> regions, or if having VMs communicate synchronously via chardevs is the
>>> right way to do it. I do believe the basic idea for sharing memory
>>> regions is sound and that it could be useful for inter-VM communication.
>>
>> Hi,
>> Without having looked into the patches yet, I'm already wondering if you
>> can use the existing -object
>> memory-backend-file,size=512M,mem-path=/my/shared/mem feature for your
>> use case?
>>
>> That's the existing mechanism for fully sharing guest RAM and if you
>> want to share all of memory then maybe a device is not necessary - just
>> share the memory.

That option adds memory in addition to the memory allocated with the 
'-m' flag, doesn't it? I looked into that option, and it seemed to me 
you can't back all memory this way.

Apart from that, the only advantage my solution has is that it's aware 
of any memory overlaying the memory-backed regions (i.e. memory backed 
by accessor functions). Maybe I don't need this for my use case, I'd 
have to test that.

> 
> I suspect it's about sharing that memory in a discoverable way. Maybe it 
> is also about the signalling channel defined in the device.
> 
> OTOH, when it's really about sharing everything, even bidirectional, 
> that rather looks like a pragmatic shortcut, not a generic model.
> 
> The patches should clarify their use case a bit further, I think. The 
> title suggests a generic sharing solution, but my impression is that it 
> rather caters to a specific case under specific boundary conditions.
> 
> Jan
> 

The solution does stem from a specific use case, the ARM Trustzone 
forwarding described in the cover letter. Normally both OSes can pass 
data around by sharing physical addresses (potentially from anywhere in 
memory), so giving VMs an ability to access one another's memory no 
matter how it's backed allows for minimal trusted OS modification, just 
offsetting physical addresses. The interrupt functionality also reflects
this: it's intended to pass around SMC data.
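
As a rough sketch of what "just offsetting physical addresses" could look like on the trusted OS side: the peer's physical address is translated by adding the base of the local window through which the peer's memory is visible. The names PeerWindow and peer_to_local are hypothetical and do not come from the patches. The sketch also assumes the window maps the peer's address space linearly starting at peer address 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the local window onto the peer VM's memory. */
typedef struct {
    uint64_t base;  /* local physical address where the window starts */
    uint64_t size;  /* number of peer bytes reachable through the window */
} PeerWindow;

/*
 * Translate a physical address used by the peer VM into the local physical
 * address that reaches the same byte. Fails if the range
 * [peer_addr, peer_addr + len) does not fit inside the window.
 * Assumes w->base + w->size does not overflow.
 */
static bool peer_to_local(const PeerWindow *w, uint64_t peer_addr,
                          uint64_t len, uint64_t *local_addr)
{
    /* Written without peer_addr + len so the check cannot overflow. */
    if (len > w->size || peer_addr > w->size - len) {
        return false;
    }
    *local_addr = w->base + peer_addr;
    return true;
}
```

A buffer address received with an SMC would go through such a translation before being dereferenced. The bounds check matters once the shared view is limited to a smaller MemoryRegion.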

I realize that this kind of total memory sharing couples the two VMs 
tightly - this is why I'm asking for comments on this, perhaps there's a 
better solution for this specific scenario.

For what it's worth, the extent of this sharing can be reduced by using 
a limited MemoryRegion like it's done for secure and non-secure memory 
views on ARM.

Igor



* Re: [RFC 0/9] Add an interVM memory sharing device
  2020-02-07  9:00       ` Igor Kotrasiński
@ 2020-02-07 10:04         ` Igor Mammedov
  2020-02-07 16:33           ` Stefan Hajnoczi
  0 siblings, 1 reply; 20+ messages in thread
From: Igor Mammedov @ 2020-02-07 10:04 UTC (permalink / raw)
  To: Igor Kotrasiński
  Cc: peter.maydell, Jan Kiszka, pbonzini, qemu-devel, Stefan Hajnoczi

On Fri, 7 Feb 2020 10:00:50 +0100
Igor Kotrasiński <i.kotrasinsk@partner.samsung.com> wrote:

> On 2/5/20 3:49 PM, Jan Kiszka wrote:
> > On 05.02.20 15:39, Stefan Hajnoczi wrote:  
> >> On Tue, Feb 04, 2020 at 12:30:42PM +0100, 
> >> i.kotrasinsk@partner.samsung.com wrote:  
> >>> From: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
> >>>
> >>> This patchset adds a "memory exposing" device that allows two QEMU
> >>> instances to share arbitrary memory regions. Unlike ivshmem, it does not
> >>> create a new region of memory that's shared between VMs, but instead
> >>> allows one VM to access any memory region of the other VM we choose to
> >>> share.
> >>>
> >>> The motivation for this device is a sort of ARM Trustzone "emulation",
> >>> where a rich system running on one machine (e.g. x86_64 linux) is able
> >>> to perform SMCs to a trusted system running on another (e.g. OpTEE on
> >>> ARM). With a device that allows sharing arbitrary memory between VMs,
> >>> this can be achieved with minimal changes to the trusted system and its
> >>> linux driver while allowing the rich system to run on a speedier x86
> >>> emulator. I prepared additional patches for linux, OpTEE OS and OpTEE
> >>> build system as a PoC that such emulation works and passes OpTEE tests;
> >>> I'm not sure what would be the best way to share them.
> >>>
> >>> This patchset is my first foray into QEMU source code and I'm certain
> >>> it's not yet ready to be merged in. I'm not sure whether memory sharing
> >>> code has any race conditions or breaks rules of working with memory
> >>> regions, or if having VMs communicate synchronously via chardevs is the
> >>> right way to do it. I do believe the basic idea for sharing memory
> >>> regions is sound and that it could be useful for inter-VM communication.  
> >>
> >> Hi,
> >> Without having looked into the patches yet, I'm already wondering if you
> >> can use the existing -object
> >> memory-backend-file,size=512M,mem-path=/my/shared/mem feature for your
> >> use case?
> >>
> >> That's the existing mechanism for fully sharing guest RAM and if you
> >> want to share all of memory then maybe a device is not necessary - just
> >> share the memory.  
> 
> That option adds memory in addition to the memory allocated with the 
> '-m' flag, doesn't it? I looked into that option, and it seemed to me 
> you can't back all memory this way.
With current QEMU you can experiment with memory sharing using the
NUMA workaround:

-m 512 \
-object memory-backend-file,id=mem,size=512M,mem-path=/my/shared/mem,share=on \
-numa node,memdev=mem

There is also a series on the list that allows sharing main RAM
without the NUMA workaround, see
  "[PATCH v4 00/80] refactor main RAM allocation to use hostmem backend"

With it applied you can share main RAM with the following CLI:

-object memory-backend-file,id=mem,size=512M,mem-path=/my/shared/mem,share=on \
-m 512 \
-M virt,memory-backend=mem

> Apart from that, the only advantage my solution has is that it's aware 
> of any memory overlaying the memory-backed regions (i.e. memory backed 
> by accessor functions). Maybe I don't need this for my use case, I'd 
> have to test that.
> 
> > 
> > I suspect it's about sharing that memory in a discoverable way. Maybe it 
> > is also about the signalling channel defined in the device.
> > 
> > OTOH, when it's really about sharing everything, even bidirectional, 
> > that rather looks like a pragmatic shortcut, not a generic model.
> > 
> > The patches should clarify their use case a bit further, I think. The 
> > title suggests a generic sharing solution, but my impression is that it 
> > rather caters a specific case under specific boundary conditions.
> > 
> > Jan
> >   
> 
> The solution does stem from a specific use case, the ARM Trustzone 
> forwarding described in the cover letter. Normally both OSes can pass 
> data around by sharing physical addresses (potentially from anywhere in 
> memory), so giving VMs an ability to access one another's memory no 
> matter how it's backed allows for minimal trusted OS modification, just 
> offsetting physical addresses. The interrupt functionality also reflects 
> this, it's intended to pass around SMC data.
> 
> I realize that this kind of total memory sharing couples the two VMs 
> tightly - this is why I'm asking for comments on this, perhaps there's a 
> better solution for this specific scenario.
> 
> For what it's worth, the extent of this sharing can be reduced by using 
> a limited MemoryRegion like it's done for secure and non-secure memory 
> views on ARM.
> 
> Igor
> 




* Re: [RFC 0/9] Add an interVM memory sharing device
  2020-02-07 10:04         ` Igor Mammedov
@ 2020-02-07 16:33           ` Stefan Hajnoczi
  2020-02-10 13:01             ` Igor Kotrasiński
  0 siblings, 1 reply; 20+ messages in thread
From: Stefan Hajnoczi @ 2020-02-07 16:33 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Jan Kiszka, Igor Kotrasiński, qemu-devel, pbonzini, peter.maydell


On Fri, Feb 07, 2020 at 11:04:03AM +0100, Igor Mammedov wrote:
> On Fri, 7 Feb 2020 10:00:50 +0100
> Igor Kotrasiński <i.kotrasinsk@partner.samsung.com> wrote:
> 
> > On 2/5/20 3:49 PM, Jan Kiszka wrote:
> > > On 05.02.20 15:39, Stefan Hajnoczi wrote:  
> > >> On Tue, Feb 04, 2020 at 12:30:42PM +0100, 
> > >> i.kotrasinsk@partner.samsung.com wrote:  
> > >>> From: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
> > >>>
> > >>> This patchset adds a "memory exposing" device that allows two QEMU
> > >>> instances to share arbitrary memory regions. Unlike ivshmem, it does not
> > >>> create a new region of memory that's shared between VMs, but instead
> > >>> allows one VM to access any memory region of the other VM we choose to
> > >>> share.
> > >>>
> > >>> The motivation for this device is a sort of ARM Trustzone "emulation",
> > >>> where a rich system running on one machine (e.g. x86_64 linux) is able
> > >>> to perform SMCs to a trusted system running on another (e.g. OpTEE on
> > >>> ARM). With a device that allows sharing arbitrary memory between VMs,
> > >>> this can be achieved with minimal changes to the trusted system and its
> > >>> linux driver while allowing the rich system to run on a speedier x86
> > >>> emulator. I prepared additional patches for linux, OpTEE OS and OpTEE
> > >>> build system as a PoC that such emulation works and passes OpTEE tests;
> > >>> I'm not sure what would be the best way to share them.
> > >>>
> > >>> This patchset is my first foray into QEMU source code and I'm certain
> > >>> it's not yet ready to be merged in. I'm not sure whether memory sharing
> > >>> code has any race conditions or breaks rules of working with memory
> > >>> regions, or if having VMs communicate synchronously via chardevs is the
> > >>> right way to do it. I do believe the basic idea for sharing memory
> > >>> regions is sound and that it could be useful for inter-VM communication.  
> > >>
> > >> Hi,
> > >> Without having looked into the patches yet, I'm already wondering if you
> > >> can use the existing -object
> > >> memory-backend-file,size=512M,mem-path=/my/shared/mem feature for your
> > >> use case?
> > >>
> > >> That's the existing mechanism for fully sharing guest RAM and if you
> > >> want to share all of memory then maybe a device is not necessary - just
> > >> share the memory.  
> > 
> > That option adds memory in addition to the memory allocated with the 
> > '-m' flag, doesn't it? I looked into that option, and it seemed to me 
> > you can't back all memory this way.
> with current QEMU you play with memory sharing using numa workaround
> 
> -m 512 \
> -object memory-backend-file,id=mem,size=512M,mem-path=/my/shared/mem,share=on \
> -numa node,memdev=mem
> 
> also on the list there is series that allows to share main ram
> without numa workaround, see
>   "[PATCH v4 00/80] refactor main RAM allocation to use hostmem backend"
> 
> with it applied you can share main RAM with following CLI:
> 
> -object memory-backend-file,id=mem,size=512M,mem-path=/my/shared/mem,share=on \
> -m 512 \
> -M virt,memory-backend=mem

Nice!  That takes care of memory.

If signalling (e.g. a notification interrupt) is necessary then a
mechanism is still needed for that.  I don't know enough about TrustZone
to suggest an appropriate way of doing it with existing QEMU features.
Maybe Peter understands?

Stefan


* Re: [RFC 0/9] Add an interVM memory sharing device
  2020-02-07 16:33           ` Stefan Hajnoczi
@ 2020-02-10 13:01             ` Igor Kotrasiński
  2020-02-12 13:57               ` Stefan Hajnoczi
  2020-04-01 12:58               ` Igor Kotrasiński
  0 siblings, 2 replies; 20+ messages in thread
From: Igor Kotrasiński @ 2020-02-10 13:01 UTC (permalink / raw)
  To: Stefan Hajnoczi, Igor Mammedov
  Cc: Jan Kiszka, pbonzini, qemu-devel, peter.maydell

On 2/7/20 5:33 PM, Stefan Hajnoczi wrote:
> On Fri, Feb 07, 2020 at 11:04:03AM +0100, Igor Mammedov wrote:
>> On Fri, 7 Feb 2020 10:00:50 +0100
>> Igor Kotrasiński <i.kotrasinsk@partner.samsung.com> wrote:
>>
>>> On 2/5/20 3:49 PM, Jan Kiszka wrote:
>>>> On 05.02.20 15:39, Stefan Hajnoczi wrote:
>>>>> On Tue, Feb 04, 2020 at 12:30:42PM +0100,
>>>>> i.kotrasinsk@partner.samsung.com wrote:
>>>>>> From: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
>>>>>>
>>>>>> This patchset adds a "memory exposing" device that allows two QEMU
>>>>>> instances to share arbitrary memory regions. Unlike ivshmem, it does not
>>>>>> create a new region of memory that's shared between VMs, but instead
>>>>>> allows one VM to access any memory region of the other VM we choose to
>>>>>> share.
>>>>>>
>>>>>> The motivation for this device is a sort of ARM Trustzone "emulation",
>>>>>> where a rich system running on one machine (e.g. x86_64 linux) is able
>>>>>> to perform SMCs to a trusted system running on another (e.g. OpTEE on
>>>>>> ARM). With a device that allows sharing arbitrary memory between VMs,
>>>>>> this can be achieved with minimal changes to the trusted system and its
>>>>>> linux driver while allowing the rich system to run on a speedier x86
>>>>>> emulator. I prepared additional patches for linux, OpTEE OS and OpTEE
>>>>>> build system as a PoC that such emulation works and passes OpTEE tests;
>>>>>> I'm not sure what would be the best way to share them.
>>>>>>
>>>>>> This patchset is my first foray into QEMU source code and I'm certain
>>>>>> it's not yet ready to be merged in. I'm not sure whether memory sharing
>>>>>> code has any race conditions or breaks rules of working with memory
>>>>>> regions, or if having VMs communicate synchronously via chardevs is the
>>>>>> right way to do it. I do believe the basic idea for sharing memory
>>>>>> regions is sound and that it could be useful for inter-VM communication.
>>>>>
>>>>> Hi,
>>>>> Without having looked into the patches yet, I'm already wondering if you
>>>>> can use the existing -object
>>>>> memory-backend-file,size=512M,mem-path=/my/shared/mem feature for your
>>>>> use case?
>>>>>
>>>>> That's the existing mechanism for fully sharing guest RAM and if you
>>>>> want to share all of memory then maybe a device is not necessary - just
>>>>> share the memory.
>>>
>>> That option adds memory in addition to the memory allocated with the
>>> '-m' flag, doesn't it? I looked into that option, and it seemed to me
>>> you can't back all memory this way.
>> with current QEMU you play with memory sharing using numa workaround
>>
>> -m 512 \
>> -object memory-backend-file,id=mem,size=512M,mem-path=/my/shared/mem,share=on \
>> -numa node,memdev=mem
>>
>> also on the list there is series that allows to share main ram
>> without numa workaround, see
>>    "[PATCH v4 00/80] refactor main RAM allocation to use hostmem backend"
>>
>> with it applied you can share main RAM with following CLI:
>>
>> -object memory-backend-file,id=mem,size=512M,mem-path=/my/shared/mem,share=on \
>> -m 512 \
>> -M virt,memory-backend=mem
> 
> Nice!  That takes care of memory.

After a bit of hacking to map the shared RAM instead of communicating 
via socket I can confirm - I can run OpTEE this way, and it passes 
tests. My solution is *technically* more accurate since it is aware of 
memory subregions and completely independent from memory backend setup, 
but with my use case satisfied already, I don't think it's of use to anyone.

> 
> If signalling (e.g. a notification interrupt) is necessary then a
> mechanism is still needed for that.  I don't know enough about TrustZone
> to suggest an appropriate way of doing it with existing QEMU features.
> Maybe Peter understands?
> 

Any signalling mechanism that can pass data along with it (e.g. ivshmem 
with its shared memory) will suffice.
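
As a rough sketch of "signalling that can pass data along with it", the
C fragment below pairs a mailbox in shared memory with a doorbell. A
Linux eventfd stands in for the doorbell here; that choice, the struct
layout and the function names are assumptions made for illustration and
do not describe the memexpose protocol or any existing QEMU device.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/eventfd.h>

/* Hypothetical request layout placed in memory that both sides can see. */
struct smc_mailbox {
    uint32_t func_id;   /* which call is being requested */
    uint64_t args[4];   /* call arguments, e.g. shared physical addresses */
};

/* Fill in the mailbox first, then ring the doorbell. */
int mailbox_send(struct smc_mailbox *box, int doorbell_fd,
                 uint32_t func_id, const uint64_t args[4])
{
    uint64_t one = 1;

    box->func_id = func_id;
    memcpy(box->args, args, sizeof(box->args));
    /* Writing an 8-byte value adds it to the eventfd counter. */
    if (write(doorbell_fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
        return -1;
    }
    return 0;
}

/* Block until the doorbell has been rung, then copy the request out. */
int mailbox_recv(const struct smc_mailbox *box, int doorbell_fd,
                 struct smc_mailbox *out)
{
    uint64_t count;

    /* Reading returns the counter value and resets it to zero. */
    if (read(doorbell_fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        return -1;
    }
    *out = *box;
    return 0;
}
```

A real implementation would also need a reply path and some rule about
who owns the mailbox at any given time; both are left out here.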

Igor



* Re: [RFC 0/9] Add an interVM memory sharing device
  2020-02-10 13:01             ` Igor Kotrasiński
@ 2020-02-12 13:57               ` Stefan Hajnoczi
  2020-04-01 12:58               ` Igor Kotrasiński
  1 sibling, 0 replies; 20+ messages in thread
From: Stefan Hajnoczi @ 2020-02-12 13:57 UTC (permalink / raw)
  To: Igor Kotrasiński
  Cc: peter.maydell, Igor Mammedov, pbonzini, qemu-devel, Jan Kiszka


On Mon, Feb 10, 2020 at 02:01:48PM +0100, Igor Kotrasiński wrote:
> On 2/7/20 5:33 PM, Stefan Hajnoczi wrote:
> > On Fri, Feb 07, 2020 at 11:04:03AM +0100, Igor Mammedov wrote:
> >> On Fri, 7 Feb 2020 10:00:50 +0100
> >> Igor Kotrasiński <i.kotrasinsk@partner.samsung.com> wrote:
> >>
> >>> On 2/5/20 3:49 PM, Jan Kiszka wrote:
> >>>> On 05.02.20 15:39, Stefan Hajnoczi wrote:
> >>>>> On Tue, Feb 04, 2020 at 12:30:42PM +0100,
> >>>>> i.kotrasinsk@partner.samsung.com wrote:
> >>>>>> From: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
> >>>>>>
> >>>>>> This patchset adds a "memory exposing" device that allows two QEMU
> >>>>>> instances to share arbitrary memory regions. Unlike ivshmem, it does not
> >>>>>> create a new region of memory that's shared between VMs, but instead
> >>>>>> allows one VM to access any memory region of the other VM we choose to
> >>>>>> share.
> >>>>>>
> >>>>>> The motivation for this device is a sort of ARM Trustzone "emulation",
> >>>>>> where a rich system running on one machine (e.g. x86_64 linux) is able
> >>>>>> to perform SMCs to a trusted system running on another (e.g. OpTEE on
> >>>>>> ARM). With a device that allows sharing arbitrary memory between VMs,
> >>>>>> this can be achieved with minimal changes to the trusted system and its
> >>>>>> linux driver while allowing the rich system to run on a speedier x86
> >>>>>> emulator. I prepared additional patches for linux, OpTEE OS and OpTEE
> >>>>>> build system as a PoC that such emulation works and passes OpTEE tests;
> >>>>>> I'm not sure what would be the best way to share them.
> >>>>>>
> >>>>>> This patchset is my first foray into QEMU source code and I'm certain
> >>>>>> it's not yet ready to be merged in. I'm not sure whether memory sharing
> >>>>>> code has any race conditions or breaks rules of working with memory
> >>>>>> regions, or if having VMs communicate synchronously via chardevs is the
> >>>>>> right way to do it. I do believe the basic idea for sharing memory
> >>>>>> regions is sound and that it could be useful for inter-VM communication.
> >>>>>
> >>>>> Hi,
> >>>>> Without having looked into the patches yet, I'm already wondering if you
> >>>>> can use the existing -object
> >>>>> memory-backend-file,size=512M,mem-path=/my/shared/mem feature for your
> >>>>> use case?
> >>>>>
> >>>>> That's the existing mechanism for fully sharing guest RAM and if you
> >>>>> want to share all of memory then maybe a device is not necessary - just
> >>>>> share the memory.
> >>>
> >>> That option adds memory in addition to the memory allocated with the
> >>> '-m' flag, doesn't it? I looked into that option, and it seemed to me
> >>> you can't back all memory this way.
> >> with current QEMU you play with memory sharing using numa workaround
> >>
> >> -m 512 \
> >> -object memory-backend-file,id=mem,size=512M,mem-path=/my/shared/mem,share=on \
> >> -numa node,memdev=mem
> >>
> >> also on the list there is series that allows to share main ram
> >> without numa workaround, see
> >>    "[PATCH v4 00/80] refactor main RAM allocation to use hostmem backend"
> >>
> >> with it applied you can share main RAM with following CLI:
> >>
> >> -object memory-backend-file,id=mem,size=512M,mem-path=/my/shared/mem,share=on \
> >> -m 512 \
> >> -M virt,memory-backend=mem
> > 
> > Nice!  That takes care of memory.
> 
> After a bit of hacking to map the shared RAM instead of communicating 
> via socket I can confirm - I can run OpTEE this way, and it passes 
> tests. My solution is *technically* more accurate since it is aware of 
> memory subregions and completely independent from memory backend setup, 
> but with my use case satisfied already, I don't think it's of use to anyone.

Great!

Stefan


* Re: [RFC 0/9] Add an interVM memory sharing device
  2020-02-10 13:01             ` Igor Kotrasiński
  2020-02-12 13:57               ` Stefan Hajnoczi
@ 2020-04-01 12:58               ` Igor Kotrasiński
  1 sibling, 0 replies; 20+ messages in thread
From: Igor Kotrasiński @ 2020-04-01 12:58 UTC (permalink / raw)
  To: Stefan Hajnoczi, Igor Mammedov
  Cc: Jan Kiszka, pbonzini, qemu-devel, peter.maydell

On 10.02.2020 14:01, Igor Kotrasiński wrote:
> On 2/7/20 5:33 PM, Stefan Hajnoczi wrote:
>> On Fri, Feb 07, 2020 at 11:04:03AM +0100, Igor Mammedov wrote:
>>> On Fri, 7 Feb 2020 10:00:50 +0100
>>> Igor Kotrasiński <i.kotrasinsk@partner.samsung.com> wrote:
>>>
>>>> On 2/5/20 3:49 PM, Jan Kiszka wrote:
>>>>> On 05.02.20 15:39, Stefan Hajnoczi wrote:
>>>>>> On Tue, Feb 04, 2020 at 12:30:42PM +0100,
>>>>>> i.kotrasinsk@partner.samsung.com wrote:
>>>>>>> From: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
>>>>>>>
>>>>>>> This patchset adds a "memory exposing" device that allows two QEMU
>>>>>>> instances to share arbitrary memory regions. Unlike ivshmem, it 
>>>>>>> does not
>>>>>>> create a new region of memory that's shared between VMs, but instead
>>>>>>> allows one VM to access any memory region of the other VM we 
>>>>>>> choose to
>>>>>>> share.
>>>>>>>
>>>>>>> The motivation for this device is a sort of ARM Trustzone 
>>>>>>> "emulation",
>>>>>>> where a rich system running on one machine (e.g. x86_64 linux) is 
>>>>>>> able
>>>>>>> to perform SMCs to a trusted system running on another (e.g. 
>>>>>>> OpTEE on
>>>>>>> ARM). With a device that allows sharing arbitrary memory between 
>>>>>>> VMs,
>>>>>>> this can be achieved with minimal changes to the trusted system 
>>>>>>> and its
>>>>>>> linux driver while allowing the rich system to run on a speedier x86
>>>>>>> emulator. I prepared additional patches for linux, OpTEE OS and 
>>>>>>> OpTEE
>>>>>>> build system as a PoC that such emulation works and passes OpTEE 
>>>>>>> tests;
>>>>>>> I'm not sure what would be the best way to share them.
>>>>>>>
>>>>>>> This patchset is my first foray into QEMU source code and I'm 
>>>>>>> certain
>>>>>>> it's not yet ready to be merged in. I'm not sure whether memory 
>>>>>>> sharing
>>>>>>> code has any race conditions or breaks rules of working with memory
>>>>>>> regions, or if having VMs communicate synchronously via chardevs 
>>>>>>> is the
>>>>>>> right way to do it. I do believe the basic idea for sharing memory
>>>>>>> regions is sound and that it could be useful for inter-VM 
>>>>>>> communication.
>>>>>>
>>>>>> Hi,
>>>>>> Without having looked into the patches yet, I'm already wondering 
>>>>>> if you
>>>>>> can use the existing -object
>>>>>> memory-backend-file,size=512M,mem-path=/my/shared/mem feature for 
>>>>>> your
>>>>>> use case?
>>>>>>
>>>>>> That's the existing mechanism for fully sharing guest RAM and if you
>>>>>> want to share all of memory then maybe a device is not necessary - 
>>>>>> just
>>>>>> share the memory.
>>>>
>>>> That option adds memory in addition to the memory allocated with the
>>>> '-m' flag, doesn't it? I looked into that option, and it seemed to me
>>>> you can't back all memory this way.
>>> with current QEMU you play with memory sharing using numa workaround
>>>
>>> -m 512 \
>>> -object memory-backend-file,id=mem,size=512M,mem-path=/my/shared/mem,share=on \
>>> -numa node,memdev=mem
>>>
>>> also on the list there is series that allows to share main ram
>>> without numa workaround, see
>>>    "[PATCH v4 00/80] refactor main RAM allocation to use hostmem 
>>> backend"
>>>
>>> with it applied you can share main RAM with following CLI:
>>>
>>> -object memory-backend-file,id=mem,size=512M,mem-path=/my/shared/mem,share=on \
>>> -m 512 \
>>> -M virt,memory-backend=mem
>>
>> Nice!  That takes care of memory.
> 
> After a bit of hacking to map the shared RAM instead of communicating 
> via socket I can confirm - I can run OpTEE this way, and it passes 
> tests. My solution is *technically* more accurate since it is aware of 
> memory subregions and completely independent from memory backend setup, 
> but with my use case satisfied already, I don't think it's of use to 
> anyone.
> 

After a long while hacking QEMU to achieve 1-to-1 memory mapping between 
machines, I realized that I wasn't completely right here. I can share 
main memory from both machines, but for the ARM machine that covers only 
non-secure memory. Secure memory (VIRT_SECURE_MEM) is always allocated 
with memory_region_init_ram(). I don't know whether other secure memory 
regions (e.g. VIRT_FLASH) might need to be shared as well.

This can probably be solved by allowing these regions to use file-backed 
memory when configured to do so, then mapping these files in the other 
machine at correct offsets.
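
For illustration, mapping part of such a backing file at a given offset
can be sketched in plain C as below. This is not QEMU code: the function
name is hypothetical, and it only assumes that the region is backed by
an ordinary file that both processes can open.

```c
#include <assert.h>
#include <stddef.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>

/*
 * Map `len` bytes of the file at `path`, starting at byte `file_off`,
 * as shared read/write memory. `file_off` must be a multiple of the
 * page size. Returns the mapping, or NULL on failure.
 */
void *map_shared_region(const char *path, off_t file_off, size_t len)
{
    void *p;
    int fd = open(path, O_RDWR);

    if (fd < 0) {
        return NULL;
    }
    /* MAP_SHARED makes stores visible to every other mapping of the file. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, file_off);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : p;
}
```

Two processes that map the same file range this way see each other's
writes, which is the property the other machine would rely on.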

Another solution would be the memory sharing from this patchset. If we 
strip away interrupts, PCI stuff and I/O memory region support, it 
amounts to basically the same thing: a mechanism for accessing shareable 
memory in the other machine in a discoverable way.

>>
>> If signalling (e.g. a notification interrupt) is necessary then a
>> mechanism is still needed for that.  I don't know enough about TrustZone
>> to suggest an appropriate way of doing it with existing QEMU features.
>> Maybe Peter understands?
>>
> 
> Any signalling mechanism that can pass data along with it (e.g. ivshmem 
> with its shared memory) will suffice.

Igor



end of thread, other threads:[~2020-04-01 12:59 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CGME20200204113102eucas1p172cfb883c70cfc8d7c2832682df3df2a@eucas1p1.samsung.com>
2020-02-04 11:30 ` [RFC 0/9] Add an interVM memory sharing device i.kotrasinsk
     [not found]   ` <CGME20200204113104eucas1p2587768b7daa479ef5c01b45e1da99e45@eucas1p2.samsung.com>
2020-02-04 11:30     ` [RFC 1/9] memory: Add function for finding flat memory ranges i.kotrasinsk
     [not found]   ` <CGME20200204113105eucas1p2981e8d1e49ca9621255a4aedf8f1ec6e@eucas1p2.samsung.com>
2020-02-04 11:30     ` [RFC 2/9] memory: Support mmap offset for fd-backed memory regions i.kotrasinsk
     [not found]   ` <CGME20200204113106eucas1p2cf218553048c75f5a8b7771cde90f5f1@eucas1p2.samsung.com>
2020-02-04 11:30     ` [RFC 3/9] memory: Hack - use shared memory when possible i.kotrasinsk
     [not found]   ` <CGME20200204113107eucas1p2769c0c8204a57751a4e6c5d4fb40e2d5@eucas1p2.samsung.com>
2020-02-04 11:30     ` [RFC 4/9] hw/misc/memexpose: Add documentation i.kotrasinsk
     [not found]   ` <CGME20200204113108eucas1p232d86a495fa8200473047ffb58548201@eucas1p2.samsung.com>
2020-02-04 11:30     ` [RFC 5/9] hw/misc/memexpose: Add core memexpose files i.kotrasinsk
     [not found]   ` <CGME20200204113108eucas1p2526a9481bf8a4420d359c45f1183fe95@eucas1p2.samsung.com>
2020-02-04 11:30     ` [RFC 6/9] hw/misc/memexpose: Add memexpose pci device i.kotrasinsk
     [not found]   ` <CGME20200204113109eucas1p18527bb78c3d930d56e6ae9c205f64ba3@eucas1p1.samsung.com>
2020-02-04 11:30     ` [RFC 7/9] hw/misc/memexpose: Add memexpose memory region device i.kotrasinsk
     [not found]   ` <CGME20200204113110eucas1p2f9ab3639730113139730d1853772d100@eucas1p2.samsung.com>
2020-02-04 11:30     ` [RFC 8/9] hw/misc/memexpose: Add simple tests i.kotrasinsk
     [not found]   ` <CGME20200204113111eucas1p2a96ec20fbaf679215b50d9f03245b33e@eucas1p2.samsung.com>
2020-02-04 11:30     ` [RFC 9/9] hw/arm/virt: Hack in support for memexpose device i.kotrasinsk
2020-02-04 12:13   ` [RFC 0/9] Add an interVM memory sharing device no-reply
2020-02-04 12:16   ` no-reply
2020-02-05 14:39   ` Stefan Hajnoczi
2020-02-05 14:49     ` Jan Kiszka
2020-02-07  9:00       ` Igor Kotrasiński
2020-02-07 10:04         ` Igor Mammedov
2020-02-07 16:33           ` Stefan Hajnoczi
2020-02-10 13:01             ` Igor Kotrasiński
2020-02-12 13:57               ` Stefan Hajnoczi
2020-04-01 12:58               ` Igor Kotrasiński
