* [GIT PULL 0/4] Host Memory Backends and Memory devices queue 2023-01-02
@ 2023-01-02 11:29 David Hildenbrand
  2023-01-02 11:29 ` [GIT PULL 1/4] virtio-mem: Fix the bitmap index of the section offset David Hildenbrand
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: David Hildenbrand @ 2023-01-02 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Igor Mammedov, Michael S . Tsirkin, Paolo Bonzini,
	David Hildenbrand, Chenyi Qiang, Michal Privoznik,
	Philippe Mathieu-Daudé

The following changes since commit 222059a0fccf4af3be776fe35a5ea2d6a68f9a0b:

  Merge tag 'pull-ppc-20221221' of https://gitlab.com/danielhb/qemu into staging (2022-12-21 18:08:09 +0000)

are available in the Git repository at:

  https://github.com/davidhildenbrand/qemu.git tags/mem-2023-01-02

for you to fetch changes up to 6bb613f0812d1364fc8fcf0846647446884d5148:

  hostmem: Honor multiple preferred nodes if possible (2022-12-28 14:59:55 +0100)

----------------------------------------------------------------
Hi,

"Host Memory Backends" and "Memory devices" queue ("mem"):
- virtio-mem fixes
- Use new MPOL_PREFERRED_MANY mbind() policy for memory backends if
  possible

----------------------------------------------------------------
Chenyi Qiang (2):
  virtio-mem: Fix the bitmap index of the section offset
  virtio-mem: Fix the iterator variable in a vmem->rdl_list loop

Michal Privoznik (1):
  hostmem: Honor multiple preferred nodes if possible

Philippe Mathieu-Daudé (1):
  virtio-mem: Fix typo in function name

 backends/hostmem.c     | 19 +++++++++++++++++--
 hw/virtio/virtio-mem.c | 18 +++++++++---------
 meson.build            |  5 +++++
 3 files changed, 31 insertions(+), 11 deletions(-)

-- 
2.39.0




* [GIT PULL 1/4] virtio-mem: Fix the bitmap index of the section offset
  2023-01-02 11:29 [GIT PULL 0/4] Host Memory Backends and Memory devices queue 2023-01-02 David Hildenbrand
@ 2023-01-02 11:29 ` David Hildenbrand
  2023-01-02 11:29 ` [GIT PULL 2/4] virtio-mem: Fix the iterator variable in a vmem->rdl_list loop David Hildenbrand
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: David Hildenbrand @ 2023-01-02 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Igor Mammedov, Michael S . Tsirkin, Paolo Bonzini,
	David Hildenbrand, Chenyi Qiang, Michal Privoznik,
	Philippe Mathieu-Daudé,
	qemu-stable

From: Chenyi Qiang <chenyi.qiang@intel.com>

vmem->bitmap indexes the memory region of the virtio-mem backend at a
granularity of block_size. To calculate the bitmap index of the target
section offset, the offset must be divided by block_size, not by
bitmap_size.
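
As a rough illustration of the index calculation (a minimal sketch, not
QEMU code: the sketch_* helpers and the sizes used below are assumptions
made up for this example), the bitmap holds one bit per block, so a byte
offset maps to a bit index by dividing by the block size:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of QEMU: number of bits needed to track
 * a region at block granularity (one bit per block).
 */
uint64_t sketch_bitmap_size(uint64_t region_size, uint64_t block_size)
{
    return region_size / block_size;
}

/*
 * Hypothetical helper, not part of QEMU: index of the bit that covers
 * the given byte offset within the region.
 */
uint64_t sketch_offset_to_bit(uint64_t offset_within_region,
                              uint64_t block_size)
{
    /* Divide by the block size, not by the number of bits in the bitmap. */
    return offset_within_region / block_size;
}
```

With an assumed 1 GiB region and 2 MiB blocks the bitmap has 512 bits, and
an offset of 512 MiB maps to bit 256. Dividing the same offset by the bitmap
size (512) instead would give 1048576, far beyond the end of the bitmap.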

Fixes: 2044969f0b ("virtio-mem: Implement RamDiscardManager interface")
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20221216062231.11181-1-chenyi.qiang@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-mem.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index d96bde1fab..5c22c4b876 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -235,7 +235,7 @@ static int virtio_mem_for_each_plugged_section(const VirtIOMEM *vmem,
     uint64_t offset, size;
     int ret = 0;
 
-    first_bit = s->offset_within_region / vmem->bitmap_size;
+    first_bit = s->offset_within_region / vmem->block_size;
     first_bit = find_next_bit(vmem->bitmap, vmem->bitmap_size, first_bit);
     while (first_bit < vmem->bitmap_size) {
         MemoryRegionSection tmp = *s;
@@ -267,7 +267,7 @@ static int virtio_mem_for_each_unplugged_section(const VirtIOMEM *vmem,
     uint64_t offset, size;
     int ret = 0;
 
-    first_bit = s->offset_within_region / vmem->bitmap_size;
+    first_bit = s->offset_within_region / vmem->block_size;
     first_bit = find_next_zero_bit(vmem->bitmap, vmem->bitmap_size, first_bit);
     while (first_bit < vmem->bitmap_size) {
         MemoryRegionSection tmp = *s;
-- 
2.39.0




* [GIT PULL 2/4] virtio-mem: Fix the iterator variable in a vmem->rdl_list loop
  2023-01-02 11:29 [GIT PULL 0/4] Host Memory Backends and Memory devices queue 2023-01-02 David Hildenbrand
  2023-01-02 11:29 ` [GIT PULL 1/4] virtio-mem: Fix the bitmap index of the section offset David Hildenbrand
@ 2023-01-02 11:29 ` David Hildenbrand
  2023-01-02 11:29 ` [GIT PULL 3/4] virtio-mem: Fix typo in function name David Hildenbrand
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: David Hildenbrand @ 2023-01-02 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Igor Mammedov, Michael S . Tsirkin, Paolo Bonzini,
	David Hildenbrand, Chenyi Qiang, Michal Privoznik,
	Philippe Mathieu-Daudé,
	qemu-stable

From: Chenyi Qiang <chenyi.qiang@intel.com>

The rollback loop iterates with rdl2, so it must use rdl2 (not rdl) to
revert the already-notified listeners.
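
The rollback pattern can be sketched roughly as follows. This is a
simplified, self-contained illustration using an array instead of a QLIST;
the SketchListener type and the sketch_* functions are hypothetical and do
not mirror QEMU's RamDiscardListener API:

```c
#include <stddef.h>

/* Hypothetical listener, for illustration only. */
typedef struct SketchListener {
    int fail_populate;  /* nonzero: the populate callback reports failure */
    int populated;      /* set by populate, cleared by discard */
} SketchListener;

int sketch_populate(SketchListener *l)
{
    if (l->fail_populate) {
        return -1;
    }
    l->populated = 1;
    return 0;
}

void sketch_discard(SketchListener *l)
{
    l->populated = 0;
}

/* Notify all listeners; on failure, revert the ones already notified. */
int sketch_notify_plug(SketchListener *listeners, size_t n)
{
    size_t i, j;
    int ret = 0;

    for (i = 0; i < n; i++) {
        ret = sketch_populate(&listeners[i]);
        if (ret) {
            break;
        }
    }
    if (ret) {
        /*
         * Roll back using the second iterator (j), stopping at the
         * listener that failed (i). Using listeners[i] here would be
         * the analogue of the bug fixed by this patch.
         */
        for (j = 0; j < i; j++) {
            sketch_discard(&listeners[j]);
        }
    }
    return ret;
}
```

The design point is that the failing listener marks where the rollback
stops, while each earlier listener is reverted through its own entry.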

Fixes: 2044969f0b ("virtio-mem: Implement RamDiscardManager interface")
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20221228090312.17276-1-chenyi.qiang@intel.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-mem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index 5c22c4b876..2b0271442b 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -341,7 +341,7 @@ static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset,
     if (ret) {
         /* Notify all already-notified listeners. */
         QLIST_FOREACH(rdl2, &vmem->rdl_list, next) {
-            MemoryRegionSection tmp = *rdl->section;
+            MemoryRegionSection tmp = *rdl2->section;
 
             if (rdl2 == rdl) {
                 break;
-- 
2.39.0




* [GIT PULL 3/4] virtio-mem: Fix typo in function name
  2023-01-02 11:29 [GIT PULL 0/4] Host Memory Backends and Memory devices queue 2023-01-02 David Hildenbrand
  2023-01-02 11:29 ` [GIT PULL 1/4] virtio-mem: Fix the bitmap index of the section offset David Hildenbrand
  2023-01-02 11:29 ` [GIT PULL 2/4] virtio-mem: Fix the iterator variable in a vmem->rdl_list loop David Hildenbrand
@ 2023-01-02 11:29 ` David Hildenbrand
  2023-01-02 11:29 ` [GIT PULL 4/4] hostmem: Honor multiple preferred nodes if possible David Hildenbrand
  2023-01-05 16:58 ` [GIT PULL 0/4] Host Memory Backends and Memory devices queue 2023-01-02 Peter Maydell
  4 siblings, 0 replies; 6+ messages in thread
From: David Hildenbrand @ 2023-01-02 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Igor Mammedov, Michael S . Tsirkin, Paolo Bonzini,
	David Hildenbrand, Chenyi Qiang, Michal Privoznik,
	Philippe Mathieu-Daudé

From: Philippe Mathieu-Daudé <philmd@linaro.org>

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221228130956.80515-1-philmd@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-mem.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index 2b0271442b..1ed1f5a4af 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -207,7 +207,7 @@ static int virtio_mem_for_each_unplugged_range(const VirtIOMEM *vmem, void *arg,
  *
  * Returns false if the intersection is empty, otherwise returns true.
  */
-static bool virito_mem_intersect_memory_section(MemoryRegionSection *s,
+static bool virtio_mem_intersect_memory_section(MemoryRegionSection *s,
                                                 uint64_t offset, uint64_t size)
 {
     uint64_t start = MAX(s->offset_within_region, offset);
@@ -245,7 +245,7 @@ static int virtio_mem_for_each_plugged_section(const VirtIOMEM *vmem,
                                       first_bit + 1) - 1;
         size = (last_bit - first_bit + 1) * vmem->block_size;
 
-        if (!virito_mem_intersect_memory_section(&tmp, offset, size)) {
+        if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) {
             break;
         }
         ret = cb(&tmp, arg);
@@ -277,7 +277,7 @@ static int virtio_mem_for_each_unplugged_section(const VirtIOMEM *vmem,
                                  first_bit + 1) - 1;
         size = (last_bit - first_bit + 1) * vmem->block_size;
 
-        if (!virito_mem_intersect_memory_section(&tmp, offset, size)) {
+        if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) {
             break;
         }
         ret = cb(&tmp, arg);
@@ -313,7 +313,7 @@ static void virtio_mem_notify_unplug(VirtIOMEM *vmem, uint64_t offset,
     QLIST_FOREACH(rdl, &vmem->rdl_list, next) {
         MemoryRegionSection tmp = *rdl->section;
 
-        if (!virito_mem_intersect_memory_section(&tmp, offset, size)) {
+        if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) {
             continue;
         }
         rdl->notify_discard(rdl, &tmp);
@@ -329,7 +329,7 @@ static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset,
     QLIST_FOREACH(rdl, &vmem->rdl_list, next) {
         MemoryRegionSection tmp = *rdl->section;
 
-        if (!virito_mem_intersect_memory_section(&tmp, offset, size)) {
+        if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) {
             continue;
         }
         ret = rdl->notify_populate(rdl, &tmp);
@@ -346,7 +346,7 @@ static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset,
             if (rdl2 == rdl) {
                 break;
             }
-            if (!virito_mem_intersect_memory_section(&tmp, offset, size)) {
+            if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) {
                 continue;
             }
             rdl2->notify_discard(rdl2, &tmp);
-- 
2.39.0




* [GIT PULL 4/4] hostmem: Honor multiple preferred nodes if possible
  2023-01-02 11:29 [GIT PULL 0/4] Host Memory Backends and Memory devices queue 2023-01-02 David Hildenbrand
                   ` (2 preceding siblings ...)
  2023-01-02 11:29 ` [GIT PULL 3/4] virtio-mem: Fix typo in function name David Hildenbrand
@ 2023-01-02 11:29 ` David Hildenbrand
  2023-01-05 16:58 ` [GIT PULL 0/4] Host Memory Backends and Memory devices queue 2023-01-02 Peter Maydell
  4 siblings, 0 replies; 6+ messages in thread
From: David Hildenbrand @ 2023-01-02 11:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Igor Mammedov, Michael S . Tsirkin, Paolo Bonzini,
	David Hildenbrand, Chenyi Qiang, Michal Privoznik,
	Philippe Mathieu-Daudé

From: Michal Privoznik <mprivozn@redhat.com>

If a memory-backend is configured with mode
HOST_MEM_POLICY_PREFERRED then
host_memory_backend_memory_complete() calls mbind() as:

  mbind(..., MPOL_PREFERRED, nodemask, ...);

Here, 'nodemask' is a bitmap of host NUMA nodes and corresponds
to the .host-nodes attribute. Therefore, there can be multiple
nodes specified. However, the documentation to MPOL_PREFERRED
says:

  MPOL_PREFERRED
    This mode sets the preferred node for allocation. ...
    If nodemask specifies more than one node ID, the first node
    in the mask will be selected as the preferred node.

Therefore, only the first node is honored and the rest are
silently ignored. With recent changes to the kernel and
numactl, we can do better.

Linux v5.15 added support for MPOL_PREFERRED_MANY, which accepts
multiple preferred NUMA nodes, via commit cfcaa66f8032
("mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY").

Then, the numa_has_preferred_many() API was introduced in numactl
(v2.0.15~26), allowing applications to query kernel support.

Wiring this all together, we can pass MPOL_PREFERRED_MANY to the
mbind() call instead and stop silently ignoring multiple nodes.
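
The mode selection can be sketched in isolation as below. This is a
minimal illustration under stated assumptions: the SKETCH_* enumerators are
local stand-ins for the MPOL_* constants from <numaif.h>, and
has_preferred_many stands in for the return value of
numa_has_preferred_many(); neither name exists in QEMU or libnuma:

```c
/* Local stand-ins for the MPOL_* policy constants (illustrative only). */
enum sketch_mode {
    SKETCH_DEFAULT,
    SKETCH_PREFERRED,
    SKETCH_BIND,
    SKETCH_INTERLEAVE,
    SKETCH_PREFERRED_MANY
};

/*
 * Pick the mode to hand to mbind(): upgrade "preferred" to
 * "preferred many" only when support was detected (value > 0);
 * every other policy is passed through unchanged.
 */
enum sketch_mode sketch_pick_mode(enum sketch_mode policy,
                                  int has_preferred_many)
{
    if (policy == SKETCH_PREFERRED && has_preferred_many > 0) {
        return SKETCH_PREFERRED_MANY;
    }
    return policy;
}
```

Keeping the fallback to the plain preferred mode means hosts without
support behave as before, with only the first node honored.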

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Message-Id: <a0b4adce1af5bd2344c2218eb4a04b3ff7bcfdb4.1671097918.git.mprivozn@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 backends/hostmem.c | 19 +++++++++++++++++--
 meson.build        |  5 +++++
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/backends/hostmem.c b/backends/hostmem.c
index 8640294c10..747e7838c0 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -23,7 +23,12 @@
 
 #ifdef CONFIG_NUMA
 #include <numaif.h>
+#include <numa.h>
 QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_DEFAULT != MPOL_DEFAULT);
+/*
+ * HOST_MEM_POLICY_PREFERRED may either translate to MPOL_PREFERRED or
+ * MPOL_PREFERRED_MANY, see comments further below.
+ */
 QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_PREFERRED != MPOL_PREFERRED);
 QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_BIND != MPOL_BIND);
 QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_INTERLEAVE != MPOL_INTERLEAVE);
@@ -346,6 +351,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
          * before mbind(). note: MPOL_MF_STRICT is ignored on hugepages so
          * this doesn't catch hugepage case. */
         unsigned flags = MPOL_MF_STRICT | MPOL_MF_MOVE;
+        int mode = backend->policy;
 
         /* check for invalid host-nodes and policies and give more verbose
          * error messages than mbind(). */
@@ -369,9 +375,18 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
                BITS_TO_LONGS(MAX_NODES + 1) * sizeof(unsigned long));
         assert(maxnode <= MAX_NODES);
 
+#ifdef HAVE_NUMA_HAS_PREFERRED_MANY
+        if (mode == MPOL_PREFERRED && numa_has_preferred_many() > 0) {
+            /*
+             * Replace with MPOL_PREFERRED_MANY otherwise the mbind() below
+             * silently picks the first node.
+             */
+            mode = MPOL_PREFERRED_MANY;
+        }
+#endif
+
         if (maxnode &&
-            mbind(ptr, sz, backend->policy, backend->host_nodes, maxnode + 1,
-                  flags)) {
+            mbind(ptr, sz, mode, backend->host_nodes, maxnode + 1, flags)) {
             if (backend->policy != MPOL_DEFAULT || errno != ENOSYS) {
                 error_setg_errno(errp, errno,
                                  "cannot bind memory to host NUMA nodes");
diff --git a/meson.build b/meson.build
index 4c6f8a674a..3f31db5963 100644
--- a/meson.build
+++ b/meson.build
@@ -1858,6 +1858,11 @@ config_host_data.set('CONFIG_LINUX_AIO', libaio.found())
 config_host_data.set('CONFIG_LINUX_IO_URING', linux_io_uring.found())
 config_host_data.set('CONFIG_LIBPMEM', libpmem.found())
 config_host_data.set('CONFIG_NUMA', numa.found())
+if numa.found()
+  config_host_data.set('HAVE_NUMA_HAS_PREFERRED_MANY',
+                       cc.has_function('numa_has_preferred_many',
+                                       dependencies: numa))
+endif
 config_host_data.set('CONFIG_OPENGL', opengl.found())
 config_host_data.set('CONFIG_PROFILER', get_option('profiler'))
 config_host_data.set('CONFIG_RBD', rbd.found())
-- 
2.39.0




* Re: [GIT PULL 0/4] Host Memory Backends and Memory devices queue 2023-01-02
  2023-01-02 11:29 [GIT PULL 0/4] Host Memory Backends and Memory devices queue 2023-01-02 David Hildenbrand
                   ` (3 preceding siblings ...)
  2023-01-02 11:29 ` [GIT PULL 4/4] hostmem: Honor multiple preferred nodes if possible David Hildenbrand
@ 2023-01-05 16:58 ` Peter Maydell
  4 siblings, 0 replies; 6+ messages in thread
From: Peter Maydell @ 2023-01-05 16:58 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: qemu-devel, Igor Mammedov, Michael S . Tsirkin, Paolo Bonzini,
	Chenyi Qiang, Michal Privoznik, Philippe Mathieu-Daudé

On Mon, 2 Jan 2023 at 11:31, David Hildenbrand <david@redhat.com> wrote:
>
> The following changes since commit 222059a0fccf4af3be776fe35a5ea2d6a68f9a0b:
>
>   Merge tag 'pull-ppc-20221221' of https://gitlab.com/danielhb/qemu into staging (2022-12-21 18:08:09 +0000)
>
> are available in the Git repository at:
>
>   https://github.com/davidhildenbrand/qemu.git tags/mem-2023-01-02
>
> for you to fetch changes up to 6bb613f0812d1364fc8fcf0846647446884d5148:
>
>   hostmem: Honor multiple preferred nodes if possible (2022-12-28 14:59:55 +0100)
>
> ----------------------------------------------------------------
> Hi,
>
> "Host Memory Backends" and "Memory devices" queue ("mem"):
> - virtio-mem fixes
> - Use new MPOL_PREFERRED_MANY mbind() policy for memory backends if
>   possible
>


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/8.0
for any user-visible changes.

-- PMM


