From: David Hildenbrand <david@redhat.com>
To: qemu-devel@nongnu.org
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>,
	David Hildenbrand <david@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	Peter Xu <peterx@redhat.com>,
	Luiz Capitulino <lcapitulino@redhat.com>,
	Auger Eric <eric.auger@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Wei Yang <richardw.yang@linux.intel.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: [PATCH PROTOTYPE 1/6] memory: Introduce sparse RAM handler for memory regions
Date: Thu, 24 Sep 2020 18:04:18 +0200
Message-ID: <20200924160423.106747-2-david@redhat.com>
In-Reply-To: <20200924160423.106747-1-david@redhat.com>

We have some special RAM memory regions (managed by paravirtualized memory
devices - virtio-mem), whereby the guest agreed to only use selected memory
ranges. This results in "sparse" mmaps, "sparse" RAMBlocks and "sparse"
RAM memory regions.

In most cases, we don't currently care about that - e.g., in KVM, we simply
have a single KVM memory slot (and as the number is fairly limited, we'll
have to keep it like that). However, in case of vfio, registering the
whole region with the kernel results in all pages getting pinned, and
therefore an unexpectedly high memory consumption. This is the main
reason why vfio is incompatible with memory ballooning.

Let's introduce a way to communicate the actual accessible/mapped (meaning,
not discarded) pieces for such a sparse memory region, and get notified on
changes (e.g., a virtio-mem device plugging/unplugging memory).

We expect that the SparseRAMHandler is set for a memory region before it
is mapped into guest physical address space (so before any memory
listeners get notified about the addition), and that the SparseRAMHandler
is not unset before the memory region has been unmapped from guest physical
address space (so after all memory listeners got notified about the removal).

This is somewhat similar to the iommu memory region notifier mechanism.
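
To illustrate the intended flow, here is a minimal, self-contained sketch of
the register/replay/notify pattern described above. It is not the QEMU code
from this patch: Handler, Notifier, NB_BLOCKS, BLOCK_SIZE and all function
names are hypothetical stand-ins, and the single-listener handler with a
fixed 2 MiB granularity is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the QEMU types from this patch. */
#define NB_BLOCKS  8
#define BLOCK_SIZE (2u * 1024 * 1024)   /* assumed granularity: 2 MiB */

typedef struct Notifier Notifier;
typedef int (*NotifyMap)(Notifier *n, uint64_t offset, uint64_t size);
typedef void (*NotifyUnmap)(Notifier *n, uint64_t offset, uint64_t size);

struct Notifier {
    NotifyMap notify_map;
    NotifyUnmap notify_unmap;
    uint64_t mapped_bytes;              /* consumer-side bookkeeping */
};

typedef struct {
    bool plugged[NB_BLOCKS];            /* which blocks are currently mapped */
    Notifier *listener;                 /* assumption: a single listener */
} Handler;

/* Remember the listener; it learns about existing mappings via replay. */
static void handler_register(Handler *h, Notifier *n)
{
    h->listener = n;
}

/* Tell the listener about every block that is mapped right now. */
static int handler_replay_mapped(Handler *h, Notifier *n)
{
    for (size_t i = 0; i < NB_BLOCKS; i++) {
        if (h->plugged[i]) {
            int ret = n->notify_map(n, (uint64_t)i * BLOCK_SIZE, BLOCK_SIZE);
            if (ret) {
                return ret;
            }
        }
    }
    return 0;
}

/* Change the state of one block and notify the listener about it. */
static int handler_set_plugged(Handler *h, size_t block, bool plugged)
{
    uint64_t offset = (uint64_t)block * BLOCK_SIZE;

    assert(block < NB_BLOCKS);
    if (h->plugged[block] == plugged) {
        return 0;
    }
    if (h->listener && plugged) {
        /* Mapping can fail; in that case the block stays unplugged. */
        int ret = h->listener->notify_map(h->listener, offset, BLOCK_SIZE);
        if (ret) {
            return ret;
        }
    } else if (h->listener) {
        h->listener->notify_unmap(h->listener, offset, BLOCK_SIZE);
    }
    h->plugged[block] = plugged;
    return 0;
}

/* Example consumer: only counts how many bytes it would have pinned. */
static int count_map(Notifier *n, uint64_t offset, uint64_t size)
{
    (void)offset;
    n->mapped_bytes += size;
    return 0;
}

static void count_unmap(Notifier *n, uint64_t offset, uint64_t size)
{
    (void)offset;
    n->mapped_bytes -= size;
}
```

A consumer would register its notifier, replay the currently mapped blocks
once, and from then on only account for the pieces it gets notified about,
instead of pinning the whole region.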

TODO:
- Better documentation.
- Better Naming?
- Handle it on RAMBlocks?
- SPAPR special handling required (virtio-mem only supports x86-64 for now)?
- Catch mapping errors during hotplug in a nice way
- Fail early when a certain number of mappings would be exceeded
  (instead of eventually consuming too many, leaving none for others)
- Resizeable memory region handling (future).
- Callback to check the state of a block.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/exec/memory.h | 115 ++++++++++++++++++++++++++++++++++++++++++
 softmmu/memory.c      |   7 +++
 2 files changed, 122 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index f1bb2a7df5..2931ead730 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -42,6 +42,12 @@ typedef struct IOMMUMemoryRegionClass IOMMUMemoryRegionClass;
 DECLARE_OBJ_CHECKERS(IOMMUMemoryRegion, IOMMUMemoryRegionClass,
                      IOMMU_MEMORY_REGION, TYPE_IOMMU_MEMORY_REGION)
 
+#define TYPE_SPARSE_RAM_HANDLER "sparse-ram-handler"
+typedef struct SparseRAMHandlerClass SparseRAMHandlerClass;
+typedef struct SparseRAMHandler SparseRAMHandler;
+DECLARE_OBJ_CHECKERS(SparseRAMHandler, SparseRAMHandlerClass,
+                     SPARSE_RAM_HANDLER, TYPE_SPARSE_RAM_HANDLER)
+
 extern bool global_dirty_log;
 
 typedef struct MemoryRegionOps MemoryRegionOps;
@@ -136,6 +142,28 @@ static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
     n->iommu_idx = iommu_idx;
 }
 
+struct SparseRAMNotifier;
+typedef int (*SparseRAMNotifyMap)(struct SparseRAMNotifier *notifier,
+                                  const MemoryRegion *mr, uint64_t mr_offset,
+                                  uint64_t size);
+typedef void (*SparseRAMNotifyUnmap)(struct SparseRAMNotifier *notifier,
+                                     const MemoryRegion *mr, uint64_t mr_offset,
+                                     uint64_t size);
+
+typedef struct SparseRAMNotifier {
+    SparseRAMNotifyMap notify_map;
+    SparseRAMNotifyUnmap notify_unmap;
+    QLIST_ENTRY(SparseRAMNotifier) next;
+} SparseRAMNotifier;
+
+static inline void sparse_ram_notifier_init(SparseRAMNotifier *notifier,
+                                            SparseRAMNotifyMap map_fn,
+                                            SparseRAMNotifyUnmap unmap_fn)
+{
+    notifier->notify_map = map_fn;
+    notifier->notify_unmap = unmap_fn;
+}
+
 /*
  * Memory region callbacks
  */
@@ -352,6 +380,36 @@ struct IOMMUMemoryRegionClass {
     int (*num_indexes)(IOMMUMemoryRegion *iommu);
 };
 
+struct SparseRAMHandlerClass {
+    /* private */
+    InterfaceClass parent_class;
+
+    /*
+     * Returns the minimum granularity at which granularity-aligned pieces
+     * within the memory region can become either mapped or unmapped.
+     */
+    uint64_t (*get_granularity)(const SparseRAMHandler *srh,
+                                const MemoryRegion *mr);
+
+    /*
+     * Register a listener for mapping changes.
+     */
+    void (*register_listener)(SparseRAMHandler *srh, const MemoryRegion *mr,
+                              SparseRAMNotifier *notifier);
+
+    /*
+     * Unregister a listener for mapping changes.
+     */
+    void (*unregister_listener)(SparseRAMHandler *srh, const MemoryRegion *mr,
+                                SparseRAMNotifier *notifier);
+
+    /*
+     * Replay notifications for mapped RAM.
+     */
+    int (*replay_mapped)(SparseRAMHandler *srh, const MemoryRegion *mr,
+                         SparseRAMNotifier *notifier);
+};
+
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
 typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
 
@@ -399,6 +457,7 @@ struct MemoryRegion {
     const char *name;
     unsigned ioeventfd_nb;
     MemoryRegionIoeventfd *ioeventfds;
+    SparseRAMHandler *srh; /* For RAM only */
 };
 
 struct IOMMUMemoryRegion {
@@ -1889,6 +1948,62 @@ bool memory_region_present(MemoryRegion *container, hwaddr addr);
  */
 bool memory_region_is_mapped(MemoryRegion *mr);
 
+
+static inline SparseRAMHandler *memory_region_get_sparse_ram_handler(
+                                                               MemoryRegion *mr)
+{
+    return mr->srh;
+}
+
+static inline bool memory_region_is_sparse_ram(MemoryRegion *mr)
+{
+    return memory_region_get_sparse_ram_handler(mr) != NULL;
+}
+
+static inline void memory_region_set_sparse_ram_handler(MemoryRegion *mr,
+                                                        SparseRAMHandler *srh)
+{
+    g_assert(memory_region_is_ram(mr));
+    mr->srh = srh;
+}
+
+static inline void memory_region_register_sparse_ram_notifier(MemoryRegion *mr,
+                                                           SparseRAMNotifier *n)
+{
+    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
+    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
+
+    srhc->register_listener(srh, mr, n);
+}
+
+static inline void memory_region_unregister_sparse_ram_notifier(
+                                                               MemoryRegion *mr,
+                                                           SparseRAMNotifier *n)
+{
+    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
+    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
+
+    srhc->unregister_listener(srh, mr, n);
+}
+
+static inline uint64_t memory_region_sparse_ram_get_granularity(
+                                                               MemoryRegion *mr)
+{
+    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
+    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
+
+    return srhc->get_granularity(srh, mr);
+}
+
+static inline int memory_region_sparse_ram_replay_mapped(MemoryRegion *mr,
+                                                         SparseRAMNotifier *n)
+{
+    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
+    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
+
+    return srhc->replay_mapped(srh, mr, n);
+}
+
 /**
  * memory_region_find: translate an address/size relative to a
  * MemoryRegion into a #MemoryRegionSection.
diff --git a/softmmu/memory.c b/softmmu/memory.c
index d030eb6f7c..89649f52f7 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -3241,10 +3241,17 @@ static const TypeInfo iommu_memory_region_info = {
     .abstract           = true,
 };
 
+static const TypeInfo sparse_ram_handler_info = {
+    .parent             = TYPE_INTERFACE,
+    .name               = TYPE_SPARSE_RAM_HANDLER,
+    .class_size         = sizeof(SparseRAMHandlerClass),
+};
+
 static void memory_register_types(void)
 {
     type_register_static(&memory_region_info);
     type_register_static(&iommu_memory_region_info);
+    type_register_static(&sparse_ram_handler_info);
 }
 
 type_init(memory_register_types)
-- 
2.26.2


