From: David Hildenbrand <david@redhat.com>
To: qemu-devel@nongnu.org
Cc: "David Hildenbrand" <david@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Igor Mammedov" <imammedo@redhat.com>,
	"Xiao Guangrong" <xiaoguangrong.eric@gmail.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Peter Xu" <peterx@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Eduardo Habkost" <eduardo@habkost.net>,
	"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
	"Yanan Wang" <wangyanan55@huawei.com>,
	"Michal Privoznik" <mprivozn@redhat.com>,
	"Daniel P . Berrangé" <berrange@redhat.com>,
	"Gavin Shan" <gshan@redhat.com>,
	"Alex Williamson" <alex.williamson@redhat.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Maciej S . Szmigiero" <mail@maciej.szmigiero.name>,
	kvm@vger.kernel.org
Subject: [PATCH v3 09/16] memory-device,vhost: Support memory devices that dynamically consume memslots
Date: Fri,  8 Sep 2023 16:21:29 +0200
Message-ID: <20230908142136.403541-10-david@redhat.com>
In-Reply-To: <20230908142136.403541-1-david@redhat.com>

We want to support memory devices that use a dynamically managed memory
region container as their device memory region. This device memory region
maps multiple RAM memory subregions (e.g., aliases to the same RAM memory
region), and these subregions can be (un)mapped on demand.

Each RAM subregion will consume a memslot in KVM and vhost, so such a
device consumes memslots dynamically, usually starting with 0. We already
track the number of used vs. required memslots for all memory devices.
From that, we can derive the number of reserved memslots (required but
not yet used) that must not be used otherwise.
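
As a minimal standalone sketch of that derivation (not the QEMU code
itself): the struct below is a hypothetical stand-in for the two counters
tracked in DeviceMemoryState, and the helper name is made up for
illustration. It mirrors the clamping done by get_reserved_memslots() in
this patch.

```c
/* Hypothetical stand-in for the counters tracked in DeviceMemoryState. */
struct memslot_counters {
    unsigned int required_memslots; /* memslots all memory devices may need */
    unsigned int used_memslots;     /* memslots currently mapped by them */
};

/*
 * Reserved memslots are required but not yet used. If more memslots are
 * used than required (unexpected), report 0 instead of wrapping around.
 */
static unsigned int reserved_memslots(const struct memslot_counters *c)
{
    if (c->used_memslots > c->required_memslots) {
        return 0;
    }
    return c->required_memslots - c->used_memslots;
}
```

With 8 required memslots and none mapped yet, all 8 are reserved; once
the device maps 3 subregions, 5 remain reserved.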

The target use cases are virtio-mem and the Hyper-V balloon, which will
dynamically map aliases to a RAM memory region into their device memory
region container.

Properly document what's supported and what's not and extend the vhost
memslot check accordingly.
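
The extended checks can be sketched in isolation as follows. This is an
illustration under stated assumptions, not the code in the diff: both
helper names are hypothetical, and the parameters stand in for the values
QEMU obtains from kvm_get_free_memslots(), vhost_get_free_memslots(), and
the vhost backend's memslot limit.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: a new memory device needing @required memslots can
 * only be added if the slots reported as free also cover the @reserved
 * slots already promised to existing memory devices.
 */
static bool can_add_memory_device(unsigned int free_slots,
                                  unsigned int required,
                                  unsigned int reserved)
{
    return free_slots >= required + reserved;
}

/*
 * Hypothetical helper: a vhost backend with memslot @limit can only be
 * initialized if it fits the @used memslots plus the @reserved ones.
 */
static bool vhost_backend_fits(unsigned int limit, unsigned int used,
                               unsigned int reserved)
{
    return used + reserved <= limit;
}
```

For example, with 10 free memslots of which 8 are reserved, a device
requiring 1 memslot can still be added, whereas one requiring 3 cannot.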

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/mem/memory-device.c         | 29 +++++++++++++++++++++++++++--
 hw/virtio/vhost.c              | 18 ++++++++++++++----
 include/hw/mem/memory-device.h |  7 +++++++
 stubs/memory_device.c          |  5 +++++
 4 files changed, 53 insertions(+), 6 deletions(-)

diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c
index d37cfbd65d..1b14ba5661 100644
--- a/hw/mem/memory-device.c
+++ b/hw/mem/memory-device.c
@@ -62,19 +62,44 @@ static unsigned int memory_device_get_memslots(MemoryDeviceState *md)
     return 1;
 }
 
+/*
+ * Memslots that are reserved by memory devices (required but still reported
+ * as free by KVM / vhost).
+ */
+static unsigned int get_reserved_memslots(MachineState *ms)
+{
+    if (ms->device_memory->used_memslots >
+        ms->device_memory->required_memslots) {
+        /* This is unexpected, and we warned already in the memory notifier. */
+        return 0;
+    }
+    return ms->device_memory->required_memslots -
+           ms->device_memory->used_memslots;
+}
+
+unsigned int memory_devices_get_reserved_memslots(void)
+{
+    if (!current_machine->device_memory) {
+        return 0;
+    }
+    return get_reserved_memslots(current_machine);
+}
+
 static void memory_device_check_addable(MachineState *ms, MemoryDeviceState *md,
                                         MemoryRegion *mr, Error **errp)
 {
     const uint64_t used_region_size = ms->device_memory->used_region_size;
     const uint64_t size = memory_region_size(mr);
     const unsigned int required_memslots = memory_device_get_memslots(md);
+    const unsigned int reserved_memslots = get_reserved_memslots(ms);
 
     /* we will need memory slots for kvm and vhost */
-    if (kvm_enabled() && kvm_get_free_memslots() < required_memslots) {
+    if (kvm_enabled() &&
+        kvm_get_free_memslots() < required_memslots + reserved_memslots) {
         error_setg(errp, "hypervisor has not enough free memory slots left");
         return;
     }
-    if (vhost_get_free_memslots() < required_memslots) {
+    if (vhost_get_free_memslots() < required_memslots + reserved_memslots) {
         error_setg(errp, "a used vhost backend has not enough free memory slots left");
         return;
     }
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 8e84dca246..f7e1ac12a8 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -23,6 +23,7 @@
 #include "qemu/log.h"
 #include "standard-headers/linux/vhost_types.h"
 #include "hw/virtio/virtio-bus.h"
+#include "hw/mem/memory-device.h"
 #include "migration/blocker.h"
 #include "migration/qemu-file-types.h"
 #include "sysemu/dma.h"
@@ -1423,7 +1424,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
                    VhostBackendType backend_type, uint32_t busyloop_timeout,
                    Error **errp)
 {
-    unsigned int used;
+    unsigned int used, reserved, limit;
     uint64_t features;
     int i, r, n_initialized_vqs = 0;
 
@@ -1529,9 +1530,18 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
     } else {
         used = used_memslots;
     }
-    if (used > hdev->vhost_ops->vhost_backend_memslots_limit(hdev)) {
-        error_setg(errp, "vhost backend memory slots limit is less"
-                   " than current number of present memory slots");
+    /*
+     * We assume that all reserved memslots actually require a real memslot
+     * in our vhost backend. This might not be true, for example, if the
+     * memslot would be ROM. If ever relevant, we can optimize for that --
+     * but we'll need additional information about the reservations.
+     */
+    reserved = memory_devices_get_reserved_memslots();
+    limit = hdev->vhost_ops->vhost_backend_memslots_limit(hdev);
+    if (used + reserved > limit) {
+        error_setg(errp, "vhost backend memory slots limit (%u) is less"
+                   " than current number of used (%u) and reserved (%u)"
+                   " memory slots for memory devices", limit, used, reserved);
         r = -EINVAL;
         goto fail_busyloop;
     }
diff --git a/include/hw/mem/memory-device.h b/include/hw/mem/memory-device.h
index b51a579fb9..c7b624da6a 100644
--- a/include/hw/mem/memory-device.h
+++ b/include/hw/mem/memory-device.h
@@ -46,6 +46,12 @@ typedef struct MemoryDeviceState MemoryDeviceState;
  * single RAM memory region or a memory region container with subregions
  * that are RAM memory regions or aliases to RAM memory regions. Other
  * memory regions or subregions are not supported.
+ *
+ * If the device memory region returned via @get_memory_region() is a
+ * memory region container, subregions may be (un)mapped dynamically
+ * as long as the number of memslots returned by @get_memslots() is not
+ * exceeded and as long as all memory regions are of the same kind (e.g.,
+ * all RAM or all ROM).
  */
 struct MemoryDeviceClass {
     /* private */
@@ -125,6 +131,7 @@ struct MemoryDeviceClass {
 
 MemoryDeviceInfoList *qmp_memory_device_list(void);
 uint64_t get_plugged_memory_size(void);
+unsigned int memory_devices_get_reserved_memslots(void);
 void memory_device_pre_plug(MemoryDeviceState *md, MachineState *ms,
                             const uint64_t *legacy_align, Error **errp);
 void memory_device_plug(MemoryDeviceState *md, MachineState *ms);
diff --git a/stubs/memory_device.c b/stubs/memory_device.c
index e75cac62dc..318a5d4187 100644
--- a/stubs/memory_device.c
+++ b/stubs/memory_device.c
@@ -10,3 +10,8 @@ uint64_t get_plugged_memory_size(void)
 {
     return (uint64_t)-1;
 }
+
+unsigned int memory_devices_get_reserved_memslots(void)
+{
+    return 0;
+}
-- 
2.41.0


Thread overview: 39+ messages
2023-09-08 14:21 [PATCH v3 00/16] virtio-mem: Expose device memory through multiple memslots David Hildenbrand
2023-09-08 14:21 ` [PATCH v3 01/16] vhost: Rework memslot filtering and fix "used_memslot" tracking David Hildenbrand
2023-09-08 14:21 ` [PATCH v3 02/16] vhost: Remove vhost_backend_can_merge() callback David Hildenbrand
2023-09-08 14:21 ` [PATCH v3 03/16] softmmu/physmem: Fixup qemu_ram_block_from_host() documentation David Hildenbrand
2023-09-08 14:21 ` [PATCH v3 04/16] kvm: Return number of free memslots David Hildenbrand
2023-09-16 16:05   ` Maciej S. Szmigiero
2023-09-08 14:21 ` [PATCH v3 05/16] vhost: " David Hildenbrand
2023-09-16 16:07   ` Maciej S. Szmigiero
2023-09-08 14:21 ` [PATCH v3 06/16] memory-device: Support memory devices with multiple memslots David Hildenbrand
2023-09-16 16:27   ` Maciej S. Szmigiero
2023-09-08 14:21 ` [PATCH v3 07/16] stubs: Rename qmp_memory_device.c to memory_device.c David Hildenbrand
2023-09-16 16:28   ` Maciej S. Szmigiero
2023-09-08 14:21 ` [PATCH v3 08/16] memory-device: Track required and actually used memslots in DeviceMemoryState David Hildenbrand
2023-09-16 16:36   ` Maciej S. Szmigiero
2023-09-08 14:21 ` David Hildenbrand [this message]
2023-09-08 14:21   ` [PATCH v3 09/16] memory-device, vhost: Support memory devices that dynamically consume memslots David Hildenbrand
2023-09-16 17:52   ` Maciej S. Szmigiero
2023-09-16 17:52     ` [PATCH v3 09/16] memory-device,vhost: " Maciej S. Szmigiero
2023-09-08 14:21 ` [PATCH v3 10/16] kvm: Add stub for kvm_get_max_memslots() David Hildenbrand
2023-09-16 17:13   ` Maciej S. Szmigiero
2023-09-08 14:21 ` [PATCH v3 11/16] vhost: Add vhost_get_max_memslots() David Hildenbrand
2023-09-16 17:16   ` Maciej S. Szmigiero
2023-09-08 14:21 ` [PATCH v3 12/16] memory-device,vhost: Support automatic decision on the number of memslots David Hildenbrand
2023-09-08 14:21   ` [PATCH v3 12/16] memory-device, vhost: " David Hildenbrand
2023-09-17 10:46   ` [PATCH v3 12/16] memory-device,vhost: " Maciej S. Szmigiero
2023-09-17 10:46     ` [PATCH v3 12/16] memory-device, vhost: " Maciej S. Szmigiero
2023-09-18 12:33     ` David Hildenbrand
2023-09-18 12:33       ` [PATCH v3 12/16] memory-device,vhost: " David Hildenbrand
2023-09-08 14:21 ` [PATCH v3 13/16] memory: Clarify mapping requirements for RamDiscardManager David Hildenbrand
2023-09-16 17:31   ` Maciej S. Szmigiero
2023-09-08 14:21 ` [PATCH v3 14/16] virtio-mem: Expose device memory via multiple memslots if enabled David Hildenbrand
2023-09-17 11:47   ` Maciej S. Szmigiero
2023-09-19  8:08     ` David Hildenbrand
2023-09-08 14:21 ` [PATCH v3 15/16] memory,vhost: Allow for marking memory device memory regions unmergeable David Hildenbrand
2023-09-08 14:21   ` [PATCH v3 15/16] memory, vhost: " David Hildenbrand
2023-09-08 14:21 ` [PATCH v3 16/16] virtio-mem: Mark memslot alias " David Hildenbrand
2023-09-11  7:45 ` [PATCH v3 00/16] virtio-mem: Expose device memory through multiple memslots David Hildenbrand
2023-09-19  8:20   ` David Hildenbrand
2023-09-19  9:34     ` David Hildenbrand
