All of lore.kernel.org
 help / color / mirror / Atom feed
From: Brice Goglin <Brice.Goglin@inria.fr>
To: QEMU Developers <qemu-devel@nongnu.org>
Cc: "Liu Jingqi" <jingqi.liu@intel.com>,
	"Eduardo Habkost" <eduardo@habkost.net>,
	"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
	"Philippe Mathieu-Daudé" <f4bug@amsat.org>,
	"Yanan Wang" <wangyanan55@huawei.com>,
	"Jonathan Cameron" <jonathan.cameron@huawei.com>,
	"Igor Mammedov" <imammedo@redhat.com>
Subject: [PATCH v3 1/4] hmat acpi: Don't require initiator value in -numa
Date: Wed, 29 Jun 2022 11:34:12 +0200	[thread overview]
Message-ID: <965c03a7-b61b-dc25-1527-b26cb5f5f9bd@inria.fr> (raw)
In-Reply-To: <1e9d4c01-0ce3-4514-1e5b-ffccc6e88f9e@inria.fr>


[-- Attachment #1.1: Type: text/plain, Size: 7273 bytes --]

The "Memory Proximity Domain Attributes" structure of the ACPI HMAT
has a "Processor Proximity Domain Valid" flag that is currently
always set because Qemu -numa requires an initiator=X value
when hmat=on. Unsetting this flag allows to create more complex
memory topologies by having multiple best initiators for a single
memory target.

This patch allows -numa without initiator=X when hmat=on by keeping
the default value MAX_NODES in numa_state->nodes[i].initiator.
All places reading numa_state->nodes[i].initiator already check
whether it's different from MAX_NODES before using it.

Tested with
qemu-system-x86_64 -accel kvm \
  -machine pc,hmat=on \
  -drive if=pflash,format=raw,file=./OVMF.fd \
  -drive media=disk,format=qcow2,file=efi.qcow2 \
  -smp 4 \
  -m 3G \
  -object memory-backend-ram,size=1G,id=ram0 \
  -object memory-backend-ram,size=1G,id=ram1 \
  -object memory-backend-ram,size=1G,id=ram2 \
  -numa node,nodeid=0,memdev=ram0,cpus=0-1 \
  -numa node,nodeid=1,memdev=ram1,cpus=2-3 \
  -numa node,nodeid=2,memdev=ram2 \
  -numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-latency,latency=10 \
  -numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=10485760 \
  -numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-latency,latency=20 \
  -numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=5242880 \
  -numa hmat-lb,initiator=0,target=2,hierarchy=memory,data-type=access-latency,latency=30 \
  -numa hmat-lb,initiator=0,target=2,hierarchy=memory,data-type=access-bandwidth,bandwidth=1048576 \
  -numa hmat-lb,initiator=1,target=0,hierarchy=memory,data-type=access-latency,latency=20 \
  -numa hmat-lb,initiator=1,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=5242880 \
  -numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-latency,latency=10 \
  -numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=10485760 \
  -numa hmat-lb,initiator=1,target=2,hierarchy=memory,data-type=access-latency,latency=30 \
  -numa hmat-lb,initiator=1,target=2,hierarchy=memory,data-type=access-bandwidth,bandwidth=1048576
which reports NUMA node2 at same distance from both node0 and node1 as seen in lstopo:
Machine (2966MB total) + Package P#0
   NUMANode P#2 (979MB)
   Group0
     NUMANode P#0 (980MB)
     Core P#0 + PU P#0
     Core P#1 + PU P#1
   Group0
     NUMANode P#1 (1007MB)
     Core P#2 + PU P#2
     Core P#3 + PU P#3

Before this patch, we had to add ",initiator=X" to "-numa node,nodeid=2,memdev=ram2".
The lstopo output difference between initiator=1 and no initiator is:
@@ -1,10 +1,10 @@
  Machine (2966MB total) + Package P#0
+  NUMANode P#2 (979MB)
    Group0
      NUMANode P#0 (980MB)
      Core P#0 + PU P#0
      Core P#1 + PU P#1
    Group0
      NUMANode P#1 (1007MB)
-    NUMANode P#2 (979MB)
      Core P#2 + PU P#2
      Core P#3 + PU P#3

Corresponding changes in the HMAT MPDA structure:
@@ -49,10 +49,10 @@
  [078h 0120   2]               Structure Type : 0000 [Memory Proximity Domain Attributes]
  [07Ah 0122   2]                     Reserved : 0000
  [07Ch 0124   4]                       Length : 00000028
-[080h 0128   2]        Flags (decoded below) : 0001
-            Processor Proximity Domain Valid : 1
+[080h 0128   2]        Flags (decoded below) : 0000
+            Processor Proximity Domain Valid : 0
  [082h 0130   2]                    Reserved1 : 0000
-[084h 0132   4] Attached Initiator Proximity Domain : 00000001
+[084h 0132   4] Attached Initiator Proximity Domain : 00000080
  [088h 0136   4]      Memory Proximity Domain : 00000002
  [08Ch 0140   4]                    Reserved2 : 00000000
  [090h 0144   8]                    Reserved3 : 0000000000000000

Final HMAT SLLB structures:
[0A0h 0160   2]               Structure Type : 0001 [System Locality Latency and Bandwidth Information]
[0A2h 0162   2]                     Reserved : 0000
[0A4h 0164   4]                       Length : 00000040
[0A8h 0168   1]        Flags (decoded below) : 00
                             Memory Hierarchy : 0
[0A9h 0169   1]                    Data Type : 00
[0AAh 0170   2]                    Reserved1 : 0000
[0ACh 0172   4] Initiator Proximity Domains # : 00000002
[0B0h 0176   4]   Target Proximity Domains # : 00000003
[0B4h 0180   4]                    Reserved2 : 00000000
[0B8h 0184   8]              Entry Base Unit : 0000000000002710
[0C0h 0192   4] Initiator Proximity Domain List : 00000000
[0C4h 0196   4] Initiator Proximity Domain List : 00000001
[0C8h 0200   4] Target Proximity Domain List : 00000000
[0CCh 0204   4] Target Proximity Domain List : 00000001
[0D0h 0208   4] Target Proximity Domain List : 00000002
[0D4h 0212   2]                        Entry : 0001
[0D6h 0214   2]                        Entry : 0002
[0D8h 0216   2]                        Entry : 0003
[0DAh 0218   2]                        Entry : 0002
[0DCh 0220   2]                        Entry : 0001
[0DEh 0222   2]                        Entry : 0003

[0E0h 0224   2]               Structure Type : 0001 [System Locality Latency and Bandwidth Information]
[0E2h 0226   2]                     Reserved : 0000
[0E4h 0228   4]                       Length : 00000040
[0E8h 0232   1]        Flags (decoded below) : 00
                             Memory Hierarchy : 0
[0E9h 0233   1]                    Data Type : 03
[0EAh 0234   2]                    Reserved1 : 0000
[0ECh 0236   4] Initiator Proximity Domains # : 00000002
[0F0h 0240   4]   Target Proximity Domains # : 00000003
[0F4h 0244   4]                    Reserved2 : 00000000
[0F8h 0248   8]              Entry Base Unit : 0000000000000001
[100h 0256   4] Initiator Proximity Domain List : 00000000
[104h 0260   4] Initiator Proximity Domain List : 00000001
[108h 0264   4] Target Proximity Domain List : 00000000
[10Ch 0268   4] Target Proximity Domain List : 00000001
[110h 0272   4] Target Proximity Domain List : 00000002
[114h 0276   2]                        Entry : 000A
[116h 0278   2]                        Entry : 0005
[118h 0280   2]                        Entry : 0001
[11Ah 0282   2]                        Entry : 0005
[11Ch 0284   2]                        Entry : 000A
[11Eh 0286   2]                        Entry : 0001

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
  hw/core/machine.c | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index a673302cce..d4d7e77401 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -1173,9 +1173,7 @@ static void numa_validate_initiator(NumaState *numa_state)
  
      for (i = 0; i < numa_state->num_nodes; i++) {
          if (numa_info[i].initiator == MAX_NODES) {
-            error_report("The initiator of NUMA node %d is missing, use "
-                         "'-numa node,initiator' option to declare it", i);
-            exit(1);
+            continue;
          }
  
          if (!numa_info[numa_info[i].initiator].present) {
-- 
2.30.2




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]

  reply	other threads:[~2022-06-29  9:44 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-29  9:32 [PATCH v3 0/4] hmat acpi: Don't require initiator value in -numa Brice Goglin
2022-06-29  9:34 ` Brice Goglin [this message]
2022-06-29  9:34 ` [PATCH v3 2/4] tests: acpi: add and whitelist *.hmat-noinitiator expected blobs Brice Goglin
2022-06-29  9:35 ` [PATCH v3 3/4] tests: acpi: q35: add test for hmat nodes without initiators Brice Goglin
2022-06-29 13:52   ` Igor Mammedov
2022-06-29  9:35 ` [PATCH v3 4/4] tests: acpi: q35: update expected blobs *.hmat-noinitiators Brice Goglin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=965c03a7-b61b-dc25-1527-b26cb5f5f9bd@inria.fr \
    --to=brice.goglin@inria.fr \
    --cc=eduardo@habkost.net \
    --cc=f4bug@amsat.org \
    --cc=imammedo@redhat.com \
    --cc=jingqi.liu@intel.com \
    --cc=jonathan.cameron@huawei.com \
    --cc=marcel.apfelbaum@gmail.com \
    --cc=qemu-devel@nongnu.org \
    --cc=wangyanan55@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.