All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] account for NVDIMM nodes during SRAT generation
@ 2020-04-28  1:28 Vishal Verma
  2020-04-28  1:28 ` [PATCH 1/3] diffs-allowed: add the SRAT AML to diffs-allowed Vishal Verma
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Vishal Verma @ 2020-04-28  1:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: jingqi.liu, Dan Williams, Dave Hansen, Vishal Verma, Michael S. Tsirkin

On the command line, one can specify a NUMA node for NVDIMM devices. If
we set up the topology to give NVDIMMs their own nodes, i.e. not
containing any CPUs or regular memory, qemu doesn't populate SRAT memory
affinity structures for these nodes. However the NFIT does reference
those proximity domains.

As a result, Linux, while parsing the SRAT, fails to initialize node
related structures for these nodes, and they never end up in the
nodes_possible map. When these are onlined at a later point (via
hotplug), this causes problems.

I've followed the instructions in bios-tables-test.c to update the
expected SRAT binary, and the tests (make check) pass. Patches 1 and 3
are the relevant ones for the binary update.

Patch 2 is the main patch which changes SRAT generation.

Vishal Verma (3):
  diffs-allowed: add the SRAT AML to diffs-allowed
  hw/acpi-build: account for NVDIMM numa nodes in SRAT
  tests/acpi: update expected SRAT files

 hw/i386/acpi-build.c             |  20 ++++++++++++++++++++
 tests/data/acpi/pc/SRAT.dimmpxm  | Bin 392 -> 392 bytes
 tests/data/acpi/q35/SRAT.dimmpxm | Bin 392 -> 392 bytes
 3 files changed, 20 insertions(+)

-- 
2.25.4



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/3] diffs-allowed: add the SRAT AML to diffs-allowed
  2020-04-28  1:28 [PATCH 0/3] account for NVDIMM nodes during SRAT generation Vishal Verma
@ 2020-04-28  1:28 ` Vishal Verma
  2020-04-28  1:28 ` [PATCH 2/3] hw/acpi-build: account for NVDIMM numa nodes in SRAT Vishal Verma
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Vishal Verma @ 2020-04-28  1:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: jingqi.liu, Dan Williams, Dave Hansen, Vishal Verma, Michael S. Tsirkin

In anticipation of a change to the SRAT generation in qemu, add the AML
file to diffs-allowed.

Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 tests/qtest/bios-tables-test-allowed-diff.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..83d3ea5032 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,2 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/pc/SRAT.dimmpxm",
-- 
2.25.4



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/3] hw/acpi-build: account for NVDIMM numa nodes in SRAT
  2020-04-28  1:28 [PATCH 0/3] account for NVDIMM nodes during SRAT generation Vishal Verma
  2020-04-28  1:28 ` [PATCH 1/3] diffs-allowed: add the SRAT AML to diffs-allowed Vishal Verma
@ 2020-04-28  1:28 ` Vishal Verma
  2020-04-30  3:42   ` Liu, Jingqi
  2020-05-21 15:16   ` Igor Mammedov
  2020-04-28  1:28 ` [PATCH 3/3] tests/acpi: update expected SRAT files Vishal Verma
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 12+ messages in thread
From: Vishal Verma @ 2020-04-28  1:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: jingqi.liu, Dan Williams, Dave Hansen, Vishal Verma, Michael S. Tsirkin

NVDIMMs can belong to their own proximity domains, as described by the
NFIT. In such cases, the SRAT needs to have Memory Affinity structures
in the SRAT for these NVDIMMs, otherwise Linux doesn't populate node
data structures properly during NUMA initialization. See the following
for an example failure case.

https://lore.kernel.org/linux-nvdimm/20200416225438.15208-1-vishal.l.verma@intel.com/

Fix this by adding device address range and node information from
NVDIMMs to the SRAT in build_srat().

The relevant command line options to exercise this are below. Nodes 0-1
contain CPUs and regular memory, and nodes 2-3 are the NVDIMM address
space.

  -numa node,nodeid=0,mem=2048M,
  -numa node,nodeid=1,mem=2048M,
  -numa node,nodeid=2,mem=0,
  -object memory-backend-file,id=nvmem0,share,mem-path=nvdimm-0,size=16384M,align=128M
  -device nvdimm,memdev=nvmem0,id=nv0,label-size=2M,node=2
  -numa node,nodeid=3,mem=0,
  -object memory-backend-file,id=nvmem1,share,mem-path=nvdimm-1,size=16384M,align=128M
  -device nvdimm,memdev=nvmem1,id=nv1,label-size=2M,node=3

Cc: Jingqi Liu <jingqi.liu@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 hw/i386/acpi-build.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 23c77eeb95..b0da67de0e 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -48,6 +48,7 @@
 #include "migration/vmstate.h"
 #include "hw/mem/memory-device.h"
 #include "hw/mem/nvdimm.h"
+#include "qemu/nvdimm-utils.h"
 #include "sysemu/numa.h"
 #include "sysemu/reset.h"
 
@@ -2429,6 +2430,25 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
                               MEM_AFFINITY_ENABLED);
         }
     }
+
+    if (machine->nvdimms_state->is_enabled) {
+        GSList *device_list = nvdimm_get_device_list();
+
+        for (; device_list; device_list = device_list->next) {
+            DeviceState *dev = device_list->data;
+            int node = object_property_get_int(OBJECT(dev), PC_DIMM_NODE_PROP,
+                                               NULL);
+            uint64_t addr = object_property_get_uint(OBJECT(dev),
+                                                     PC_DIMM_ADDR_PROP, NULL);
+            uint64_t size = object_property_get_uint(OBJECT(dev),
+                                                     PC_DIMM_SIZE_PROP, NULL);
+
+            numamem = acpi_data_push(table_data, sizeof *numamem);
+            build_srat_memory(numamem, addr, size, node,
+                              MEM_AFFINITY_ENABLED | MEM_AFFINITY_NON_VOLATILE);
+        }
+    }
+
     slots = (table_data->len - numa_start) / sizeof *numamem;
     for (; slots < pcms->numa_nodes + 2; slots++) {
         numamem = acpi_data_push(table_data, sizeof *numamem);
-- 
2.25.4



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/3] tests/acpi: update expected SRAT files
  2020-04-28  1:28 [PATCH 0/3] account for NVDIMM nodes during SRAT generation Vishal Verma
  2020-04-28  1:28 ` [PATCH 1/3] diffs-allowed: add the SRAT AML to diffs-allowed Vishal Verma
  2020-04-28  1:28 ` [PATCH 2/3] hw/acpi-build: account for NVDIMM numa nodes in SRAT Vishal Verma
@ 2020-04-28  1:28 ` Vishal Verma
  2020-04-28  2:44 ` [PATCH 0/3] account for NVDIMM nodes during SRAT generation no-reply
  2020-05-12 18:45 ` Verma, Vishal L
  4 siblings, 0 replies; 12+ messages in thread
From: Vishal Verma @ 2020-04-28  1:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: jingqi.liu, Dan Williams, Dave Hansen, Vishal Verma, Michael S. Tsirkin

Update the expected SRAT files for the change to account for NVDIMM numa
nodes in the SRAT.

AML Diff:
  --- /tmp/asl-V49YJ0.dsl	2020-04-27 18:50:52.680043327 -0600
  +++ /tmp/asl-48AZJ0.dsl	2020-04-27 18:50:52.679043344 -0600
  @@ -3,7 +3,7 @@
    * AML/ASL+ Disassembler version 20190509 (64-bit version)
    * Copyright (c) 2000 - 2019 Intel Corporation
    *
  - * Disassembly of tests/data/acpi/pc/SRAT.dimmpxm, Mon Apr 27 18:50:52 2020
  + * Disassembly of /tmp/aml-U3BZJ0, Mon Apr 27 18:50:52 2020
    *
    * ACPI Data Table [SRAT]
    *
  @@ -13,7 +13,7 @@
   [000h 0000   4]                    Signature : "SRAT"    [System Resource Affinity Table]
   [004h 0004   4]                 Table Length : 00000188
   [008h 0008   1]                     Revision : 01
  -[009h 0009   1]                     Checksum : 80
  +[009h 0009   1]                     Checksum : 68
   [00Ah 0010   6]                       Oem ID : "BOCHS "
   [010h 0016   8]                 Oem Table ID : "BXPCSRAT"
   [018h 0024   4]                 Oem Revision : 00000001
  @@ -140,15 +140,15 @@
   [138h 0312   1]                Subtable Type : 01 [Memory Affinity]
   [139h 0313   1]                       Length : 28

  -[13Ah 0314   4]             Proximity Domain : 00000000
  +[13Ah 0314   4]             Proximity Domain : 00000002
   [13Eh 0318   2]                    Reserved1 : 0000
  -[140h 0320   8]                 Base Address : 0000000000000000
  -[148h 0328   8]               Address Length : 0000000000000000
  +[140h 0320   8]                 Base Address : 0000000108000000
  +[148h 0328   8]               Address Length : 0000000008000000
   [150h 0336   4]                    Reserved2 : 00000000
  -[154h 0340   4]        Flags (decoded below) : 00000000
  -                                     Enabled : 0
  +[154h 0340   4]        Flags (decoded below) : 00000005
  +                                     Enabled : 1
                                  Hot Pluggable : 0
  -                                Non-Volatile : 0
  +                                Non-Volatile : 1
   [158h 0344   8]                    Reserved3 : 0000000000000000

   [160h 0352   1]                Subtable Type : 01 [Memory Affinity]
  @@ -167,7 +167,7 @@

   Raw Table Data: Length 392 (0x188)

  -    0000: 53 52 41 54 88 01 00 00 01 80 42 4F 43 48 53 20  // SRAT......BOCHS
  +    0000: 53 52 41 54 88 01 00 00 01 68 42 4F 43 48 53 20  // SRAT.....hBOCHS
       0010: 42 58 50 43 53 52 41 54 01 00 00 00 42 58 50 43  // BXPCSRAT....BXPC
       0020: 01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  // ................
       0030: 00 10 00 00 01 00 00 00 00 00 00 00 00 00 00 00  // ................
  @@ -186,9 +186,9 @@
       0100: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  // ................
       0110: 01 28 03 00 00 00 00 00 00 00 00 06 00 00 00 00  // .(..............
       0120: 00 00 00 02 00 00 00 00 00 00 00 00 01 00 00 00  // ................
  -    0130: 00 00 00 00 00 00 00 00 01 28 00 00 00 00 00 00  // .........(......
  -    0140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  // ................
  -    0150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  // ................
  +    0130: 00 00 00 00 00 00 00 00 01 28 02 00 00 00 00 00  // .........(......
  +    0140: 00 00 00 08 01 00 00 00 00 00 00 08 00 00 00 00  // ................
  +    0150: 00 00 00 00 05 00 00 00 00 00 00 00 00 00 00 00  // ................
       0160: 01 28 03 00 00 00 00 00 00 00 00 00 01 00 00 00  // .(..............
       0170: 00 00 00 F8 00 00 00 00 00 00 00 00 03 00 00 00  // ................
       0180: 00 00 00 00 00 00 00 00                          // ........

Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 tests/data/acpi/pc/SRAT.dimmpxm             | Bin 392 -> 392 bytes
 tests/data/acpi/q35/SRAT.dimmpxm            | Bin 392 -> 392 bytes
 tests/qtest/bios-tables-test-allowed-diff.h |   1 -
 3 files changed, 1 deletion(-)

diff --git a/tests/data/acpi/pc/SRAT.dimmpxm b/tests/data/acpi/pc/SRAT.dimmpxm
index f5c0267ea24bb404b6b4e687390140378fbdc3f1..5a13c61b9041c6045c29643bf93a111fb1c0c76a 100644
GIT binary patch
delta 51
scmeBR?qKE$4ss0XU}Rum%-G0fz$nec00kUCF%aN@Pz(&LlS3Je0lmQmhyVZp

delta 51
icmeBR?qKE$4ss0XU}RumY}m+Uz$ndt8%z#mGzI{_tp$hx

diff --git a/tests/data/acpi/q35/SRAT.dimmpxm b/tests/data/acpi/q35/SRAT.dimmpxm
index f5c0267ea24bb404b6b4e687390140378fbdc3f1..5a13c61b9041c6045c29643bf93a111fb1c0c76a 100644
GIT binary patch
delta 51
scmeBR?qKE$4ss0XU}Rum%-G0fz$nec00kUCF%aN@Pz(&LlS3Je0lmQmhyVZp

delta 51
icmeBR?qKE$4ss0XU}RumY}m+Uz$ndt8%z#mGzI{_tp$hx

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index 83d3ea5032..dfb8523c8b 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,2 +1 @@
 /* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/pc/SRAT.dimmpxm",
-- 
2.25.4



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] account for NVDIMM nodes during SRAT generation
  2020-04-28  1:28 [PATCH 0/3] account for NVDIMM nodes during SRAT generation Vishal Verma
                   ` (2 preceding siblings ...)
  2020-04-28  1:28 ` [PATCH 3/3] tests/acpi: update expected SRAT files Vishal Verma
@ 2020-04-28  2:44 ` no-reply
  2020-04-28 16:02   ` Verma, Vishal L
  2020-05-12 18:45 ` Verma, Vishal L
  4 siblings, 1 reply; 12+ messages in thread
From: no-reply @ 2020-04-28  2:44 UTC (permalink / raw)
  To: vishal.l.verma
  Cc: mst, jingqi.liu, dave.hansen, qemu-devel, vishal.l.verma, dan.j.williams

Patchew URL: https://patchew.org/QEMU/20200428012810.10877-1-vishal.l.verma@intel.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [PATCH 0/3] account for NVDIMM nodes during SRAT generation
Message-id: 20200428012810.10877-1-vishal.l.verma@intel.com
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Switched to a new branch 'test'
34c59d3 tests/acpi: update expected SRAT files
dcc96eb hw/acpi-build: account for NVDIMM numa nodes in SRAT
08b7ee5 diffs-allowed: add the SRAT AML to diffs-allowed

=== OUTPUT BEGIN ===
1/3 Checking commit 08b7ee5e7ddf (diffs-allowed: add the SRAT AML to diffs-allowed)
2/3 Checking commit dcc96eb97d46 (hw/acpi-build: account for NVDIMM numa nodes in SRAT)
ERROR: Do not add expected files together with tests, follow instructions in tests/qtest/bios-tables-test.c: both tests/qtest/bios-tables-test-allowed-diff.h and hw/i386/acpi-build.c found

ERROR: Do not add expected files together with tests, follow instructions in tests/qtest/bios-tables-test.c: both tests/qtest/bios-tables-test-allowed-diff.h and hw/i386/acpi-build.c found

total: 2 errors, 0 warnings, 32 lines checked

Patch 2/3 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

3/3 Checking commit 34c59d3a0232 (tests/acpi: update expected SRAT files)
ERROR: Do not add expected files together with tests, follow instructions in tests/qtest/bios-tables-test.c: both tests/data/acpi/pc/SRAT.dimmpxm and hw/i386/acpi-build.c found

ERROR: Do not add expected files together with tests, follow instructions in tests/qtest/bios-tables-test.c: both tests/data/acpi/q35/SRAT.dimmpxm and hw/i386/acpi-build.c found

ERROR: Do not add expected files together with tests, follow instructions in tests/qtest/bios-tables-test.c: both tests/qtest/bios-tables-test-allowed-diff.h and hw/i386/acpi-build.c found

ERROR: Do not add expected files together with tests, follow instructions in tests/qtest/bios-tables-test.c: both tests/qtest/bios-tables-test-allowed-diff.h and hw/i386/acpi-build.c found

total: 4 errors, 0 warnings, 1 lines checked

Patch 3/3 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20200428012810.10877-1-vishal.l.verma@intel.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] account for NVDIMM nodes during SRAT generation
  2020-04-28  2:44 ` [PATCH 0/3] account for NVDIMM nodes during SRAT generation no-reply
@ 2020-04-28 16:02   ` Verma, Vishal L
  0 siblings, 0 replies; 12+ messages in thread
From: Verma, Vishal L @ 2020-04-28 16:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: Liu, Jingqi, Williams, Dan J, dave.hansen, mst

On Mon, 2020-04-27 at 19:44 -0700, no-reply@patchew.org wrote:
> Patchew URL: https://patchew.org/QEMU/20200428012810.10877-1-vishal.l.verma@intel.com/
[..]
> 
> === OUTPUT BEGIN ===
> 1/3 Checking commit 08b7ee5e7ddf (diffs-allowed: add the SRAT AML to
> diffs-allowed)
> 2/3 Checking commit dcc96eb97d46 (hw/acpi-build: account for NVDIMM
> numa nodes in SRAT)
> ERROR: Do not add expected files together with tests, follow
> instructions in tests/qtest/bios-tables-test.c: both tests/qtest/bios-
> tables-test-allowed-diff.h and hw/i386/acpi-build.c found
> 
> ERROR: Do not add expected files together with tests, follow
> instructions in tests/qtest/bios-tables-test.c: both tests/qtest/bios-
> tables-test-allowed-diff.h and hw/i386/acpi-build.c found

I'm not sure I follow these errors - I think I followed the instructions
in bios-tables-test.c exactly.. Did I miss something?

> 
> total: 2 errors, 0 warnings, 32 lines checked
> 
> Patch 2/3 has style problems, please review.  If any of these errors
> are false positives report them to the maintainer, see
> CHECKPATCH in MAINTAINERS.
> 
> 3/3 Checking commit 34c59d3a0232 (tests/acpi: update expected SRAT
> files)
> ERROR: Do not add expected files together with tests, follow
> instructions in tests/qtest/bios-tables-test.c: both
> tests/data/acpi/pc/SRAT.dimmpxm and hw/i386/acpi-build.c found
> 
> ERROR: Do not add expected files together with tests, follow
> instructions in tests/qtest/bios-tables-test.c: both
> tests/data/acpi/q35/SRAT.dimmpxm and hw/i386/acpi-build.c found
> 
> ERROR: Do not add expected files together with tests, follow
> instructions in tests/qtest/bios-tables-test.c: both tests/qtest/bios-
> tables-test-allowed-diff.h and hw/i386/acpi-build.c found
> 
> ERROR: Do not add expected files together with tests, follow
> instructions in tests/qtest/bios-tables-test.c: both tests/qtest/bios-
> tables-test-allowed-diff.h and hw/i386/acpi-build.c found
> 
> total: 4 errors, 0 warnings, 1 lines checked
> 
> Patch 3/3 has style problems, please review.  If any of these errors
> are false positives report them to the maintainer, see
> CHECKPATCH in MAINTAINERS.
> 
> === OUTPUT END ===
> 
> Test command exited with code: 1
> 
> 
> The full log is available at
> http://patchew.org/logs/20200428012810.10877-1-vishal.l.verma@intel.com/testing.checkpatch/?type=message.
> ---
> Email generated automatically by Patchew [https://patchew.org/].
> Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] hw/acpi-build: account for NVDIMM numa nodes in SRAT
  2020-04-28  1:28 ` [PATCH 2/3] hw/acpi-build: account for NVDIMM numa nodes in SRAT Vishal Verma
@ 2020-04-30  3:42   ` Liu, Jingqi
  2020-05-21 15:16   ` Igor Mammedov
  1 sibling, 0 replies; 12+ messages in thread
From: Liu, Jingqi @ 2020-04-30  3:42 UTC (permalink / raw)
  To: Verma, Vishal L, qemu-devel
  Cc: Williams, Dan J, Dave Hansen, Michael S. Tsirkin

[-- Attachment #1: Type: text/plain, Size: 3151 bytes --]

On 4/28/2020 9:28 AM, Verma, Vishal L wrote:
> NVDIMMs can belong to their own proximity domains, as described by the
> NFIT. In such cases, the SRAT needs to have Memory Affinity structures
> in the SRAT for these NVDIMMs, otherwise Linux doesn't populate node
> data structures properly during NUMA initialization. See the following
> for an example failure case.
>
> https://lore.kernel.org/linux-nvdimm/20200416225438.15208-1-vishal.l.verma@intel.com/
>
> Fix this by adding device address range and node information from
> NVDIMMs to the SRAT in build_srat().
>
> The relevant command line options to exercise this are below. Nodes 0-1
> contain CPUs and regular memory, and nodes 2-3 are the NVDIMM address
> space.
>
>    -numa node,nodeid=0,mem=2048M,
>    -numa node,nodeid=1,mem=2048M,
>    -numa node,nodeid=2,mem=0,
>    -object memory-backend-file,id=nvmem0,share,mem-path=nvdimm-0,size=16384M,align=128M
>    -device nvdimm,memdev=nvmem0,id=nv0,label-size=2M,node=2
>    -numa node,nodeid=3,mem=0,
>    -object memory-backend-file,id=nvmem1,share,mem-path=nvdimm-1,size=16384M,align=128M
>    -device nvdimm,memdev=nvmem1,id=nv1,label-size=2M,node=3
>
> Cc: Jingqi Liu <jingqi.liu@intel.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
> ---
>   hw/i386/acpi-build.c | 20 ++++++++++++++++++++
>   1 file changed, 20 insertions(+)
>
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index 23c77eeb95..b0da67de0e 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -48,6 +48,7 @@
>   #include "migration/vmstate.h"
>   #include "hw/mem/memory-device.h"
>   #include "hw/mem/nvdimm.h"
> +#include "qemu/nvdimm-utils.h"
>   #include "sysemu/numa.h"
>   #include "sysemu/reset.h"
>   
> @@ -2429,6 +2430,25 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
>                                 MEM_AFFINITY_ENABLED);
>           }
>       }
> +
> +    if (machine->nvdimms_state->is_enabled) {
> +        GSList *device_list = nvdimm_get_device_list();
> +
> +        for (; device_list; device_list = device_list->next) {
> +            DeviceState *dev = device_list->data;
> +            int node = object_property_get_int(OBJECT(dev), PC_DIMM_NODE_PROP,
> +                                               NULL);
> +            uint64_t addr = object_property_get_uint(OBJECT(dev),
> +                                                     PC_DIMM_ADDR_PROP, NULL);
> +            uint64_t size = object_property_get_uint(OBJECT(dev),
> +                                                     PC_DIMM_SIZE_PROP, NULL);
> +
> +            numamem = acpi_data_push(table_data, sizeof *numamem);
> +            build_srat_memory(numamem, addr, size, node,
> +                              MEM_AFFINITY_ENABLED | MEM_AFFINITY_NON_VOLATILE);
> +        }
> +    }
> +

Looks good for me.
Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>

Thanks,
Jingqi

>       slots = (table_data->len - numa_start) / sizeof *numamem;
>       for (; slots < pcms->numa_nodes + 2; slots++) {
>           numamem = acpi_data_push(table_data, sizeof *numamem);

[-- Attachment #2: Type: text/html, Size: 3969 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] account for NVDIMM nodes during SRAT generation
  2020-04-28  1:28 [PATCH 0/3] account for NVDIMM nodes during SRAT generation Vishal Verma
                   ` (3 preceding siblings ...)
  2020-04-28  2:44 ` [PATCH 0/3] account for NVDIMM nodes during SRAT generation no-reply
@ 2020-05-12 18:45 ` Verma, Vishal L
  4 siblings, 0 replies; 12+ messages in thread
From: Verma, Vishal L @ 2020-05-12 18:45 UTC (permalink / raw)
  To: qemu-devel
  Cc: lvivier, thuth, ehabkost, mst, Liu, Jingqi, dave.hansen,
	qemu-devel, pbonzini, imammedo, Williams,  Dan J, rth

On Mon, 2020-04-27 at 19:28 -0600, Vishal Verma wrote:
> On the command line, one can specify a NUMA node for NVDIMM devices. If
> we set up the topology to give NVDIMMs their own nodes, i.e. not
> containing any CPUs or regular memory, qemu doesn't populate SRAT memory
> affinity structures for these nodes. However the NFIT does reference
> those proximity domains.
> 
> As a result, Linux, while parsing the SRAT, fails to initialize node
> related structures for these nodes, and they never end up in the
> nodes_possible map. When these are onlined at a later point (via
> hotplug), this causes problems.
> 
> I've followed the instructions in bios-tables-test.c to update the
> expected SRAT binary, and the tests (make check) pass. Patches 1 and 3
> are the relevant ones for the binary update.
> 
> Patch 2 is the main patch which changes SRAT generation.
> 
> Vishal Verma (3):
>   diffs-allowed: add the SRAT AML to diffs-allowed
>   hw/acpi-build: account for NVDIMM numa nodes in SRAT
>   tests/acpi: update expected SRAT files
> 
>  hw/i386/acpi-build.c             |  20 ++++++++++++++++++++
>  tests/data/acpi/pc/SRAT.dimmpxm  | Bin 392 -> 392 bytes
>  tests/data/acpi/q35/SRAT.dimmpxm | Bin 392 -> 392 bytes
>  3 files changed, 20 insertions(+)
> 
Hi All - Just pinging this patchset again. I neglected to CC maintainers
in the original posting, doing so now. The full series can be seen here:

https://lore.kernel.org/qemu-devel/20200428012810.10877-1-vishal.l.verma@intel.com/

If I should resend the patches, please let me know and I'll be happy to
do so.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] hw/acpi-build: account for NVDIMM numa nodes in SRAT
  2020-04-28  1:28 ` [PATCH 2/3] hw/acpi-build: account for NVDIMM numa nodes in SRAT Vishal Verma
  2020-04-30  3:42   ` Liu, Jingqi
@ 2020-05-21 15:16   ` Igor Mammedov
  2020-05-28  1:24     ` Verma, Vishal L
  1 sibling, 1 reply; 12+ messages in thread
From: Igor Mammedov @ 2020-05-21 15:16 UTC (permalink / raw)
  To: Vishal Verma
  Cc: jingqi.liu, Dan Williams, Dave Hansen, qemu-devel, Michael S. Tsirkin

On Mon, 27 Apr 2020 19:28:09 -0600
Vishal Verma <vishal.l.verma@intel.com> wrote:

> NVDIMMs can belong to their own proximity domains, as described by the
> NFIT. In such cases, the SRAT needs to have Memory Affinity structures
> in the SRAT for these NVDIMMs, otherwise Linux doesn't populate node
> data structures properly during NUMA initialization. See the following
> for an example failure case.
> 
> https://lore.kernel.org/linux-nvdimm/20200416225438.15208-1-vishal.l.verma@intel.com/
> 
> Fix this by adding device address range and node information from
> NVDIMMs to the SRAT in build_srat().
> 
> The relevant command line options to exercise this are below. Nodes 0-1
> contain CPUs and regular memory, and nodes 2-3 are the NVDIMM address
> space.
> 
>   -numa node,nodeid=0,mem=2048M,
>   -numa node,nodeid=1,mem=2048M,
>   -numa node,nodeid=2,mem=0,
>   -object memory-backend-file,id=nvmem0,share,mem-path=nvdimm-0,size=16384M,align=128M
>   -device nvdimm,memdev=nvmem0,id=nv0,label-size=2M,node=2
>   -numa node,nodeid=3,mem=0,
>   -object memory-backend-file,id=nvmem1,share,mem-path=nvdimm-1,size=16384M,align=128M
>   -device nvdimm,memdev=nvmem1,id=nv1,label-size=2M,node=3
> 
> Cc: Jingqi Liu <jingqi.liu@intel.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
> ---
>  hw/i386/acpi-build.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index 23c77eeb95..b0da67de0e 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -48,6 +48,7 @@
>  #include "migration/vmstate.h"
>  #include "hw/mem/memory-device.h"
>  #include "hw/mem/nvdimm.h"
> +#include "qemu/nvdimm-utils.h"
>  #include "sysemu/numa.h"
>  #include "sysemu/reset.h"
>  
> @@ -2429,6 +2430,25 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
>                                MEM_AFFINITY_ENABLED);
>          }
>      }
> +
> +    if (machine->nvdimms_state->is_enabled) {
> +        GSList *device_list = nvdimm_get_device_list();
> +
> +        for (; device_list; device_list = device_list->next) {
> +            DeviceState *dev = device_list->data;
> +            int node = object_property_get_int(OBJECT(dev), PC_DIMM_NODE_PROP,
> +                                               NULL);
> +            uint64_t addr = object_property_get_uint(OBJECT(dev),
> +                                                     PC_DIMM_ADDR_PROP, NULL);
> +            uint64_t size = object_property_get_uint(OBJECT(dev),
> +                                                     PC_DIMM_SIZE_PROP, NULL);
> +
suggest to use error_abort in getters

> +            numamem = acpi_data_push(table_data, sizeof *numamem);
> +            build_srat_memory(numamem, addr, size, node,
> +                              MEM_AFFINITY_ENABLED | MEM_AFFINITY_NON_VOLATILE);
> +        }
who is in charge of freeing device_list ?

> +    }

There is ARM version of build_srat(),
I suggest to put this NVDIMM specific part in helper function within hw/acpi/nvdimm.c
and use it from both build_srat() functions.

> +
>      slots = (table_data->len - numa_start) / sizeof *numamem;
>      for (; slots < pcms->numa_nodes + 2; slots++) {
>          numamem = acpi_data_push(table_data, sizeof *numamem);



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] hw/acpi-build: account for NVDIMM numa nodes in SRAT
  2020-05-21 15:16   ` Igor Mammedov
@ 2020-05-28  1:24     ` Verma, Vishal L
  2020-05-28 10:41       ` Igor Mammedov
  0 siblings, 1 reply; 12+ messages in thread
From: Verma, Vishal L @ 2020-05-28  1:24 UTC (permalink / raw)
  To: imammedo; +Cc: Liu, Jingqi, Williams, Dan J, dave.hansen, qemu-devel, mst

On Thu, 2020-05-21 at 17:16 +0200, Igor Mammedov wrote:

Hi Igor, Thanks for the review.

[..]
> >  
> > @@ -2429,6 +2430,25 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
> >                                MEM_AFFINITY_ENABLED);
> >          }
> >      }
> > +
> > +    if (machine->nvdimms_state->is_enabled) {
> > +        GSList *device_list = nvdimm_get_device_list();
> > +
> > +        for (; device_list; device_list = device_list->next) {
> > +            DeviceState *dev = device_list->data;
> > +            int node = object_property_get_int(OBJECT(dev), PC_DIMM_NODE_PROP,
> > +                                               NULL);
> > +            uint64_t addr = object_property_get_uint(OBJECT(dev),
> > +                                                     PC_DIMM_ADDR_PROP, NULL);
> > +            uint64_t size = object_property_get_uint(OBJECT(dev),
> > +                                                     PC_DIMM_SIZE_PROP, NULL);
> > +
> suggest to use error_abort in getters

Yep, fixed in v2.

> 
> > +            numamem = acpi_data_push(table_data, sizeof *numamem);
> > +            build_srat_memory(numamem, addr, size, node,
> > +                              MEM_AFFINITY_ENABLED | MEM_AFFINITY_NON_VOLATILE);
> > +        }
> who is in charge of freeing device_list ?

Thanks, also fixed in v2.

> 
> > +    }
> 
> There is ARM version of build_srat(),
> I suggest to put this NVDIMM specific part in helper function within hw/acpi/nvdimm.c
> and use it from both build_srat() functions.

Splitting the work out into a helper function in nvdimm.c does make
sense, and I've done that. However, looking at the arm version of
build_srat and generally in virt-acpi-build.c, I don't see any NVDIMM
support, so unless I'm mistaken, it wouldn't make sense to actually call
this from the arm version of build_srat.

I'll send a v2 with the above fixes.

> 
> > +
> >      slots = (table_data->len - numa_start) / sizeof *numamem;
> >      for (; slots < pcms->numa_nodes + 2; slots++) {
> >          numamem = acpi_data_push(table_data, sizeof *numamem);

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] hw/acpi-build: account for NVDIMM numa nodes in SRAT
  2020-05-28  1:24     ` Verma, Vishal L
@ 2020-05-28 10:41       ` Igor Mammedov
  2020-05-28 16:27         ` Verma, Vishal L
  0 siblings, 1 reply; 12+ messages in thread
From: Igor Mammedov @ 2020-05-28 10:41 UTC (permalink / raw)
  To: Verma, Vishal L
  Cc: Liu, Jingqi, Williams, Dan J, dave.hansen, qemu-devel, mst

On Thu, 28 May 2020 01:24:42 +0000
"Verma, Vishal L" <vishal.l.verma@intel.com> wrote:

> On Thu, 2020-05-21 at 17:16 +0200, Igor Mammedov wrote:
> 
> Hi Igor, Thanks for the review.
> 
> [..]
> > >  
> > > @@ -2429,6 +2430,25 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
> > >                                MEM_AFFINITY_ENABLED);
> > >          }
> > >      }
> > > +
> > > +    if (machine->nvdimms_state->is_enabled) {
> > > +        GSList *device_list = nvdimm_get_device_list();
> > > +
> > > +        for (; device_list; device_list = device_list->next) {
> > > +            DeviceState *dev = device_list->data;
> > > +            int node = object_property_get_int(OBJECT(dev), PC_DIMM_NODE_PROP,
> > > +                                               NULL);
> > > +            uint64_t addr = object_property_get_uint(OBJECT(dev),
> > > +                                                     PC_DIMM_ADDR_PROP, NULL);
> > > +            uint64_t size = object_property_get_uint(OBJECT(dev),
> > > +                                                     PC_DIMM_SIZE_PROP, NULL);
> > > +  
> > suggest to use error_abort in getters  
> 
> Yep, fixed in v2.
> 
> >   
> > > +            numamem = acpi_data_push(table_data, sizeof *numamem);
> > > +            build_srat_memory(numamem, addr, size, node,
> > > +                              MEM_AFFINITY_ENABLED | MEM_AFFINITY_NON_VOLATILE);
> > > +        }  
> > who is in charge of freeing device_list ?  
> 
> Thanks, also fixed in v2.
> 
> >   
> > > +    }  
> > 
> > There is ARM version of build_srat(),
> > I suggest to put this NVDIMM specific part in helper function within hw/acpi/nvdimm.c
> > and use it from both build_srat() functions.  
> 
> Splitting the work out into a helper function in nvdimm.c does make
> sense, and I've done that. However, looking at the arm version of
> build_srat and generally in virt-acpi-build.c, I don't see any NVDIMM
> support, so unless I'm mistaken, it wouldn't make sense to actually call
> this from the arm version of build_srat.

perhaps you are lookin into old version on QEMU
current HEAD has followin snippet:

virt-acpi-build.c:
    if (ms->nvdimms_state->is_enabled) {
        nvdimm_build_acpi(table_offsets, tables_blob, tables->linker,
                          ms->nvdimms_state, ms->ram_slots);
    }
 
> 
> I'll send a v2 with the above fixes.
> 
> >   
> > > +
> > >      slots = (table_data->len - numa_start) / sizeof *numamem;
> > >      for (; slots < pcms->numa_nodes + 2; slots++) {
> > >          numamem = acpi_data_push(table_data, sizeof *numamem);  



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] hw/acpi-build: account for NVDIMM numa nodes in SRAT
  2020-05-28 10:41       ` Igor Mammedov
@ 2020-05-28 16:27         ` Verma, Vishal L
  0 siblings, 0 replies; 12+ messages in thread
From: Verma, Vishal L @ 2020-05-28 16:27 UTC (permalink / raw)
  To: imammedo; +Cc: Liu, Jingqi, Williams, Dan J, dave.hansen, qemu-devel, mst

On Thu, 2020-05-28 at 12:41 +0200, Igor Mammedov wrote:
> On Thu, 28 May 2020 01:24:42 +0000
> "Verma, Vishal L" <vishal.l.verma@intel.com> wrote:
> > On Thu, 2020-05-21 at 17:16 +0200, Igor Mammedov wrote:
> > 
[..]
> > > 
> > > There is ARM version of build_srat(),
> > > I suggest to put this NVDIMM specific part in helper function within hw/acpi/nvdimm.c
> > > and use it from both build_srat() functions.  
> > 
> > Splitting the work out into a helper function in nvdimm.c does make
> > sense, and I've done that. However, looking at the arm version of
> > build_srat and generally in virt-acpi-build.c, I don't see any NVDIMM
> > support, so unless I'm mistaken, it wouldn't make sense to actually call
> > this from the arm version of build_srat.
> 
> perhaps you are lookin into old version on QEMU
> current HEAD has followin snippet:
> 
> virt-acpi-build.c:
>     if (ms->nvdimms_state->is_enabled) {
>         nvdimm_build_acpi(table_offsets, tables_blob, tables->linker,
>                           ms->nvdimms_state, ms->ram_slots);
>     }
>  
Sorry, I jumped the gun on sending v3 and didn't see this email until
after sending it. You're right, I think I was on an older version
before. I've added the nvdimm bit to build_srat here and going to run
through the tests. If everything looks ok I'll send a v4.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-05-28 16:28 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-28  1:28 [PATCH 0/3] account for NVDIMM nodes during SRAT generation Vishal Verma
2020-04-28  1:28 ` [PATCH 1/3] diffs-allowed: add the SRAT AML to diffs-allowed Vishal Verma
2020-04-28  1:28 ` [PATCH 2/3] hw/acpi-build: account for NVDIMM numa nodes in SRAT Vishal Verma
2020-04-30  3:42   ` Liu, Jingqi
2020-05-21 15:16   ` Igor Mammedov
2020-05-28  1:24     ` Verma, Vishal L
2020-05-28 10:41       ` Igor Mammedov
2020-05-28 16:27         ` Verma, Vishal L
2020-04-28  1:28 ` [PATCH 3/3] tests/acpi: update expected SRAT files Vishal Verma
2020-04-28  2:44 ` [PATCH 0/3] account for NVDIMM nodes during SRAT generation no-reply
2020-04-28 16:02   ` Verma, Vishal L
2020-05-12 18:45 ` Verma, Vishal L

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.