From: David Hildenbrand <david@redhat.com>
To: qemu-devel@nongnu.org
Cc: "David Hildenbrand" <david@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Igor Mammedov" <imammedo@redhat.com>,
	"Xiao Guangrong" <xiaoguangrong.eric@gmail.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Peter Xu" <peterx@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Eduardo Habkost" <eduardo@habkost.net>,
	"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
	"Yanan Wang" <wangyanan55@huawei.com>,
	"Michal Privoznik" <mprivozn@redhat.com>,
	"Daniel P . Berrangé" <berrange@redhat.com>,
	"Gavin Shan" <gshan@redhat.com>,
	"Alex Williamson" <alex.williamson@redhat.com>,
	kvm@vger.kernel.org
Subject: [PATCH v1 09/15] memory-device, vhost: Support memory devices that dynamically consume multiple memslots
Date: Fri, 16 Jun 2023 11:26:48 +0200
Message-ID: <20230616092654.175518-10-david@redhat.com>
In-Reply-To: <20230616092654.175518-1-david@redhat.com>

We want to support memory devices that have a dynamically managed memory
region container as device memory region. This device memory region maps
multiple RAM memory subregions (e.g., aliases to the same RAM memory region),
whereby these subregions can be (un)mapped on demand.

Each RAM subregion will consume a memslot in KVM and vhost, so such a
device consumes memslots dynamically, usually starting with 0. We already
track the number of used vs. required memslots for all memory devices. From
that, we can derive the number of reserved memslots (required but not yet
used) that must not be used otherwise. We only have to add a way for memory
devices to expose how many memslots they require, such that we can properly
consider them as required (and as reserved until actually used). Let's
properly document what's supported and what's not.
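
As a rough illustration of the accounting described above (a standalone
sketch, not the QEMU code; struct memslot_counters and both helper names
are made up for this example):

```c
/* Hypothetical stand-in for the two counters kept per machine. */
struct memslot_counters {
    unsigned int required; /* memslots the plugged memory devices may need */
    unsigned int used;     /* memslots currently consumed by mapped regions */
};

/* Required but not yet used; clamps to 0 if the counters are inconsistent. */
static unsigned int reserved_memslots(const struct memslot_counters *c)
{
    return c->used > c->required ? 0 : c->required - c->used;
}

/* Free memslots that remain once the reservation is honoured. */
static unsigned int available_memslots(unsigned int free_slots,
                                       const struct memslot_counters *c)
{
    const unsigned int reserved = reserved_memslots(c);

    return free_slots < reserved ? 0 : free_slots - reserved;
}
```

With required = 8 and used = 3, five memslots are reserved; if 20 memslots
are free, 15 remain available for other memory devices.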

The target use case is virtio-mem, which will dynamically map parts of a
source RAM memory region into the container device region using aliases,
consuming one memslot per alias.
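
A toy model of "one memslot per mapped alias" might look as follows. This
is only a sketch under assumptions: it does not use the QEMU memory API,
and DEMO_MAX_MEMSLOTS, struct demo_container and the demo_* helpers are
hypothetical names chosen for illustration.

```c
#include <stdbool.h>

#define DEMO_MAX_MEMSLOTS 4 /* hypothetical number a device would expose */

struct demo_container {
    bool mapped[DEMO_MAX_MEMSLOTS]; /* one entry per potential alias */
    unsigned int used_memslots;     /* aliases currently mapped */
};

/* Map alias idx; each mapped alias accounts for exactly one memslot. */
static bool demo_map_alias(struct demo_container *c, unsigned int idx)
{
    if (idx >= DEMO_MAX_MEMSLOTS || c->mapped[idx]) {
        return false;
    }
    c->mapped[idx] = true;
    c->used_memslots++;
    return true;
}

/* Unmap alias idx, giving its memslot back. */
static bool demo_unmap_alias(struct demo_container *c, unsigned int idx)
{
    if (idx >= DEMO_MAX_MEMSLOTS || !c->mapped[idx]) {
        return false;
    }
    c->mapped[idx] = false;
    c->used_memslots--;
    return true;
}
```

Because an alias index beyond the exposed limit is rejected, the number of
used memslots in this model can never exceed the number the device
announced up front.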

Extend the vhost memslot check accordingly and give a hint that adding
vhost devices before adding memory devices might make it work (especially
virtio-mem devices, once they determine the number of memslots to use
at runtime).
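
To make the ordering hint concrete, here is a minimal standalone sketch of
the extended check (demo_vhost_memslots_fit is a hypothetical helper, and
the numbers below are invented for illustration):

```c
#include <stdbool.h>

/*
 * True if a vhost backend with the given memslot limit can cope with the
 * memslots already used plus the ones reserved by memory devices. Absent
 * overflow this is the negation of "used + reserved > limit"; it is
 * written with a subtraction so the unsigned sum cannot wrap around.
 */
static bool demo_vhost_memslots_fit(unsigned int used, unsigned int reserved,
                                    unsigned int limit)
{
    if (used > limit) {
        return false;
    }
    return reserved <= limit - used;
}
```

Assume a backend limit of 32 and 10 used memslots. If the vhost device is
added first, nothing is reserved yet and the check passes. If a memory
device that reserves 30 memslots was added first, the same vhost device is
rejected.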

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/mem/memory-device.c         | 36 +++++++++++++++++++++++++++++++++-
 hw/virtio/vhost.c              | 18 +++++++++++++----
 include/hw/mem/memory-device.h |  7 +++++++
 stubs/qmp_memory_device.c      |  5 +++++
 4 files changed, 61 insertions(+), 5 deletions(-)

diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c
index 752258333b..2e6536c841 100644
--- a/hw/mem/memory-device.c
+++ b/hw/mem/memory-device.c
@@ -88,6 +88,40 @@ static unsigned int get_free_memslots(void)
     return MIN(vhost_get_free_memslots(), kvm_get_free_memslots());
 }
 
+/* Memslots that are reserved by memory devices (required but still unused). */
+static unsigned int get_reserved_memslots(MachineState *ms)
+{
+    if (ms->device_memory->used_memslots >
+        ms->device_memory->required_memslots) {
+        /* This is unexpected, and we warned already in the memory notifier. */
+        return 0;
+    }
+    return ms->device_memory->required_memslots -
+           ms->device_memory->used_memslots;
+}
+
+unsigned int memory_devices_get_reserved_memslots(void)
+{
+    if (!current_machine->device_memory) {
+        return 0;
+    }
+    return get_reserved_memslots(current_machine);
+}
+
+/* Memslots that are still free but not reserved by memory devices yet. */
+static unsigned int get_available_memslots(MachineState *ms)
+{
+    const unsigned int free = get_free_memslots();
+    const unsigned int reserved = get_reserved_memslots(ms);
+
+    if (free < reserved) {
+        warn_report_once("The reserved memory slots (%u) exceed the free"
+                         " memory slots (%u)", reserved, free);
+        return 0;
+    }
+    return free - reserved;
+}
+
 /*
  * The memslot soft limit for memory devices. The soft limit might change at
  * runtime in corner cases (that should certainly be avoided), for example, when
@@ -146,7 +180,7 @@ static void memory_device_check_addable(MachineState *ms, MemoryDeviceState *md,
                                         MemoryRegion *mr, Error **errp)
 {
     const uint64_t used_region_size = ms->device_memory->used_region_size;
-    const unsigned int available_memslots = get_free_memslots();
+    const unsigned int available_memslots = get_available_memslots(ms);
     const uint64_t size = memory_region_size(mr);
     unsigned int required_memslots;
 
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 472ccba4ab..b1e2eca55d 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1422,7 +1422,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
                    VhostBackendType backend_type, uint32_t busyloop_timeout,
                    Error **errp)
 {
-    unsigned int used;
+    unsigned int used, reserved, limit;
     uint64_t features;
     int i, r, n_initialized_vqs = 0;
 
@@ -1528,9 +1528,19 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
     } else {
         used = used_memslots;
     }
-    if (used > hdev->vhost_ops->vhost_backend_memslots_limit(hdev)) {
-        error_setg(errp, "vhost backend memory slots limit is less"
-                   " than current number of present memory slots");
+    /*
+     * We simplify by assuming that reserved memslots are compatible with used
+     * vhost devices (if vhost only supports shared memory, the memory devices
+     * better use shared memory) and that reserved memslots are not used for
+     * ROM.
+     */
+    reserved = memory_devices_get_reserved_memslots();
+    limit = hdev->vhost_ops->vhost_backend_memslots_limit(hdev);
+    if (used + reserved > limit) {
+        error_setg(errp, "vhost backend memory slots limit (%u) is less"
+                   " than current number of used (%u) and reserved (%u)"
+                   " memory slots. Try adding vhost devices before memory"
+                   " devices.", limit, used, reserved);
         r = -EINVAL;
         goto fail_busyloop;
     }
diff --git a/include/hw/mem/memory-device.h b/include/hw/mem/memory-device.h
index 755f6304c6..7e8e4452cb 100644
--- a/include/hw/mem/memory-device.h
+++ b/include/hw/mem/memory-device.h
@@ -47,6 +47,12 @@ typedef struct MemoryDeviceState MemoryDeviceState;
  * single RAM/ROM memory region or a memory region container with subregions
  * that are RAM/ROM memory regions or aliases to RAM/ROM memory regions. Other
  * memory regions or subregions are not supported.
+ *
+ * If the device memory region returned via @get_memory_region is a
+ * memory region container, subregions may be (un)mapped dynamically
+ * as long as the number of memslots returned by @get_memslots() is never
+ * exceeded and as long as all memory regions are of the same kind (e.g.,
+ * all RAM or all ROM).
  */
 struct MemoryDeviceClass {
     /* private */
@@ -127,6 +133,7 @@ struct MemoryDeviceClass {
 MemoryDeviceInfoList *qmp_memory_device_list(void);
 uint64_t get_plugged_memory_size(void);
 void memory_devices_notify_vhost_device_added(void);
+unsigned int memory_devices_get_reserved_memslots(void);
 void memory_device_pre_plug(MemoryDeviceState *md, MachineState *ms,
                             const uint64_t *legacy_align, Error **errp);
 void memory_device_plug(MemoryDeviceState *md, MachineState *ms);
diff --git a/stubs/qmp_memory_device.c b/stubs/qmp_memory_device.c
index b0e3e34f85..74707ed9fd 100644
--- a/stubs/qmp_memory_device.c
+++ b/stubs/qmp_memory_device.c
@@ -14,3 +14,8 @@ uint64_t get_plugged_memory_size(void)
 void memory_devices_notify_vhost_device_added(void)
 {
 }
+
+unsigned int memory_devices_get_reserved_memslots(void)
+{
+    return 0;
+}
-- 
2.40.1



Thread overview (21+ messages):
2023-06-16  9:26 [PATCH v1 00/15] virtio-mem: Expose device memory through multiple memslots David Hildenbrand
2023-06-16  9:26 ` [PATCH v1 01/15] memory-device: Track the required memslots in DeviceMemoryState David Hildenbrand
2023-06-16  9:26 ` [PATCH v1 02/15] kvm: Add stub for kvm_get_max_memslots() David Hildenbrand
2023-06-16  9:26 ` [PATCH v1 03/15] vhost: Add vhost_get_max_memslots() David Hildenbrand
2023-06-16  9:26 ` [PATCH v1 04/15] memory-device, vhost: Add a memslot soft limit for memory devices David Hildenbrand
2023-06-16  9:26   ` [PATCH v1 04/15] memory-device,vhost: " David Hildenbrand
2023-06-16  9:26 ` [PATCH v1 05/15] kvm: Return number of free memslots David Hildenbrand
2023-06-16  9:26 ` [PATCH v1 06/15] vhost: " David Hildenbrand
2023-06-16  9:26 ` [PATCH v1 07/15] memory-device: Support memory devices that statically consume multiple memslots David Hildenbrand
2023-06-16  9:26 ` [PATCH v1 08/15] memory-device: Track the actually used memslots in DeviceMemoryState David Hildenbrand
2023-06-16  9:26 ` David Hildenbrand [this message]
2023-06-16  9:26   ` [PATCH v1 09/15] memory-device,vhost: Support memory devices that dynamically consume multiple memslots David Hildenbrand
2023-06-16  9:26 ` [PATCH v1 10/15] pc-dimm: Provide pc_dimm_get_free_slots() to query free ram slots David Hildenbrand
2023-06-16  9:26 ` [PATCH v1 11/15] memory-device: Support memory-devices with auto-detection of the number of memslots David Hildenbrand
2023-06-16  9:26 ` [PATCH v1 12/15] memory: Clarify mapping requirements for RamDiscardManager David Hildenbrand
2023-06-16  9:26 ` [PATCH v1 13/15] virtio-mem: Expose device memory via multiple memslots if enabled David Hildenbrand
2023-07-13 19:58   ` Maciej S. Szmigiero
2023-07-14 10:01     ` David Hildenbrand
2023-06-16  9:26 ` [PATCH v1 14/15] memory, vhost: Allow for marking memory device memory regions unmergeable David Hildenbrand
2023-06-16  9:26   ` [PATCH v1 14/15] memory,vhost: " David Hildenbrand
2023-06-16  9:26 ` [PATCH v1 15/15] virtio-mem: Mark memslot alias " David Hildenbrand
