* [PATCH v2 0/2] hw/arm/virt: Fix qemu booting failure on device-tree
@ 2021-10-12  6:36 Gavin Shan
  2021-10-12  6:36 ` [PATCH v2 1/2] numa: Set default distance map if needed Gavin Shan
  2021-10-12  6:36 ` [PATCH v2 2/2] hw/arm/virt: Don't create device-tree node for empty NUMA node Gavin Shan
  0 siblings, 2 replies; 4+ messages in thread
From: Gavin Shan @ 2021-10-12  6:36 UTC (permalink / raw)
  To: qemu-arm; +Cc: peter.maydell, drjones, qemu-devel, shan.gavin, ehabkost

Empty NUMA nodes, where no memory resides, are allowed on the ARM64 virt
platform. However, QEMU fails to boot because the device-tree can't be
populated due to the conflicting device-tree node names of these empty
NUMA nodes. For example, QEMU fails to boot with the following error
message when the command line below is used.

  /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
  -accel kvm -machine virt,gic-version=host               \
  -cpu host -smp 4,sockets=2,cores=2,threads=1            \
  -m 1024M,slots=16,maxmem=64G                            \
  -object memory-backend-ram,id=mem0,size=512M            \
  -object memory-backend-ram,id=mem1,size=512M            \
  -numa node,nodeid=0,cpus=0-1,memdev=mem0                \
  -numa node,nodeid=1,cpus=2-3,memdev=mem1                \
  -numa node,nodeid=2                                     \
  -numa node,nodeid=3                                     \
    :
  qemu-system-aarch64: FDT: Failed to create subnode /memory@80000000: FDT_ERR_EXISTS

The latest device-tree specification doesn't indicate how the device-tree
nodes should be populated for these empty NUMA nodes. The proposed way
to handle this is documented in the Linux kernel. The Linux kernel patches
have been acknowledged and are expected to be merged upstream soon.

  https://lkml.org/lkml/2021/9/27/31

This series follows the suggestion in the Linux kernel patches to
resolve the QEMU boot failure: the device-tree nodes corresponding to
the empty NUMA nodes aren't created, but their NUMA IDs and distances
are still included in the distance-map device-tree node.

Changelog
=========
v2:
   * Amend PATCH[01/02]'s changelog to explain why we don't need a
     switch to disable generating the default distance map        (Drew)

Gavin Shan (2):
  numa: Set default distance map if needed
  hw/arm/virt: Don't create device-tree node for empty NUMA node

 hw/arm/boot.c  |  4 ++++
 hw/core/numa.c | 13 +++++++++++--
 2 files changed, 15 insertions(+), 2 deletions(-)

-- 
2.23.0




* [PATCH v2 1/2] numa: Set default distance map if needed
  2021-10-12  6:36 [PATCH v2 0/2] hw/arm/virt: Fix qemu booting failure on device-tree Gavin Shan
@ 2021-10-12  6:36 ` Gavin Shan
  2021-10-12  6:39   ` Andrew Jones
  2021-10-12  6:36 ` [PATCH v2 2/2] hw/arm/virt: Don't create device-tree node for empty NUMA node Gavin Shan
  1 sibling, 1 reply; 4+ messages in thread
From: Gavin Shan @ 2021-10-12  6:36 UTC (permalink / raw)
  To: qemu-arm; +Cc: peter.maydell, drjones, qemu-devel, shan.gavin, ehabkost

The following option is used to specify the distance map. It's
possible that the option isn't provided by the user. In this case,
the distance map isn't populated and exposed to the platform. On the
other hand, empty NUMA nodes, where no memory resides, are
allowed on the ARM64 virt platform. For these empty NUMA nodes,
their corresponding device-tree nodes aren't populated, but
their NUMA IDs should be included in the "/distance-map"
device-tree node, so that the kernel can probe them properly if
device-tree is used.

  -numa dist,src=<numa_id>,dst=<numa_id>,val=<distance>

So when the user doesn't specify a distance map, we need to generate
the default distance map, where the local and remote distances
are 10 and 20 respectively. However, this is going to change the
hardware description of the guest in this particular scenario.
That's fine, as the guest should either ignore the distance
map completely or parse it properly by following the device-tree
specification.

This introduces an extra parameter to the existing function
complete_init_numa_distance() to generate the default distance
map when no node pair distances are provided by the user.

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 hw/core/numa.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)
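
For readers who want to see the fill rule in isolation, here is a minimal
standalone sketch of it. It is not the QEMU code: the function name, the
SKETCH_* macros and the fixed 8-node bound are hypothetical and exist only
for this illustration. The values 10 and 20 are the local and remote
distances named in the commit message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_NODES        8   /* hypothetical bound for this sketch */
#define SKETCH_DISTANCE_MIN     10  /* local distance from the commit message */
#define SKETCH_DISTANCE_DEFAULT 20  /* remote distance from the commit message */

/*
 * Fill every unset (zero) entry of an nb_nodes x nb_nodes distance matrix.
 * The diagonal always gets the local distance. With is_default set, the
 * remaining entries get the default remote distance; otherwise they are
 * copied from the opposite direction, which assumes that direction was
 * provided by the user.
 */
static void sketch_fill_distances(
    uint8_t dist[SKETCH_MAX_NODES][SKETCH_MAX_NODES],
    int nb_nodes, bool is_default)
{
    assert(nb_nodes >= 0 && nb_nodes <= SKETCH_MAX_NODES);

    for (int src = 0; src < nb_nodes; src++) {
        for (int dst = 0; dst < nb_nodes; dst++) {
            if (dist[src][dst] != 0) {
                continue;   /* keep values that were set explicitly */
            }
            if (src == dst) {
                dist[src][dst] = SKETCH_DISTANCE_MIN;
            } else if (is_default) {
                dist[src][dst] = SKETCH_DISTANCE_DEFAULT;
            } else {
                dist[src][dst] = dist[dst][src];
            }
        }
    }
}
```

Calling the sketch on an all-zero matrix with is_default set gives 10 on the
diagonal and 20 everywhere else, so every node ID, including the empty
ones, ends up with a row and a column in the matrix.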

diff --git a/hw/core/numa.c b/hw/core/numa.c
index 510d096a88..c508d857a0 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -594,7 +594,7 @@ static void validate_numa_distance(MachineState *ms)
     }
 }
 
-static void complete_init_numa_distance(MachineState *ms)
+static void complete_init_numa_distance(MachineState *ms, bool is_default)
 {
     int src, dst;
     NodeInfo *numa_info = ms->numa_state->nodes;
@@ -609,6 +609,8 @@ static void complete_init_numa_distance(MachineState *ms)
             if (numa_info[src].distance[dst] == 0) {
                 if (src == dst) {
                     numa_info[src].distance[dst] = NUMA_DISTANCE_MIN;
+                } else if (is_default) {
+                    numa_info[src].distance[dst] = NUMA_DISTANCE_DEFAULT;
                 } else {
                     numa_info[src].distance[dst] = numa_info[dst].distance[src];
                 }
@@ -716,13 +718,20 @@ void numa_complete_configuration(MachineState *ms)
          * A->B != distance B->A, then that means the distance table is
          * asymmetric. In this case, the distances for both directions
          * of all node pairs are required.
+         *
+         * The default node pair distances, which are 10 and 20 for the
+         * local and remote nodes separately, are provided if user doesn't
+         * specify any node pair distances.
          */
         if (ms->numa_state->have_numa_distance) {
             /* Validate enough NUMA distance information was provided. */
             validate_numa_distance(ms);
 
             /* Validation succeeded, now fill in any missing distances. */
-            complete_init_numa_distance(ms);
+            complete_init_numa_distance(ms, false);
+        } else {
+            complete_init_numa_distance(ms, true);
+            ms->numa_state->have_numa_distance = true;
         }
     }
 }
-- 
2.23.0




* [PATCH v2 2/2] hw/arm/virt: Don't create device-tree node for empty NUMA node
  2021-10-12  6:36 [PATCH v2 0/2] hw/arm/virt: Fix qemu booting failure on device-tree Gavin Shan
  2021-10-12  6:36 ` [PATCH v2 1/2] numa: Set default distance map if needed Gavin Shan
@ 2021-10-12  6:36 ` Gavin Shan
  1 sibling, 0 replies; 4+ messages in thread
From: Gavin Shan @ 2021-10-12  6:36 UTC (permalink / raw)
  To: qemu-arm; +Cc: peter.maydell, drjones, qemu-devel, shan.gavin, ehabkost

Empty NUMA nodes, where no memory resides, are allowed. For
example, the following command line specifies two empty NUMA nodes.
With this, QEMU fails to boot because of the conflicting device-tree
node names, as the following error message indicates.

  /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
  -accel kvm -machine virt,gic-version=host               \
  -cpu host -smp 4,sockets=2,cores=2,threads=1            \
  -m 1024M,slots=16,maxmem=64G                            \
  -object memory-backend-ram,id=mem0,size=512M            \
  -object memory-backend-ram,id=mem1,size=512M            \
  -numa node,nodeid=0,cpus=0-1,memdev=mem0                \
  -numa node,nodeid=1,cpus=2-3,memdev=mem1                \
  -numa node,nodeid=2                                     \
  -numa node,nodeid=3
    :
  qemu-system-aarch64: FDT: Failed to create subnode /memory@80000000: FDT_ERR_EXISTS

As specified by the Linux device-tree binding document, the device-tree
nodes for these empty NUMA nodes shouldn't be generated. However,
the corresponding NUMA node IDs should be included in the distance
map device-tree node. This patch skips populating the device-tree nodes
for these empty NUMA nodes to avoid the error, so that QEMU can be
started successfully.

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
---
 hw/arm/boot.c | 4 ++++
 1 file changed, 4 insertions(+)
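
To illustrate why the names collide, here is a minimal standalone sketch of
the address walk, assuming that each memory node is named after its base
address and that the base advances by the node's size. It is not the QEMU
code: the function name and signature are hypothetical. A zero-sized node
leaves the base unchanged, so two empty nodes in a row would both ask for
the same /memory@<base> name unless they are skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Walk the per-node memory sizes and record the base address of each
 * memory node that would be emitted. Empty nodes (size 0) are skipped,
 * so no two emitted nodes can share a base address.
 * Returns the number of bases written to out_bases, which must have
 * room for nb_nodes entries.
 */
static size_t sketch_memory_node_bases(uint64_t start,
                                       const uint64_t *node_mem,
                                       size_t nb_nodes,
                                       uint64_t *out_bases)
{
    size_t emitted = 0;
    uint64_t base = start;

    assert(node_mem != NULL && out_bases != NULL);

    for (size_t i = 0; i < nb_nodes; i++) {
        if (node_mem[i] == 0) {
            continue;               /* empty NUMA node: no memory node */
        }
        out_bases[emitted++] = base;
        base += node_mem[i];        /* next node starts after this one */
    }
    return emitted;
}
```

With the sizes from the command line above (512M, 512M, 0, 0) and an assumed
start address of 0x40000000, the sketch emits two bases, 0x40000000 and
0x60000000. Without the skip, nodes 2 and 3 would both land on 0x80000000,
which matches the node name in the error message.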

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 57efb61ee4..4e5898fcdc 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -603,6 +603,10 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
         mem_base = binfo->loader_start;
         for (i = 0; i < ms->numa_state->num_nodes; i++) {
             mem_len = ms->numa_state->nodes[i].node_mem;
+            if (!mem_len) {
+                continue;
+            }
+
             rc = fdt_add_memory_node(fdt, acells, mem_base,
                                      scells, mem_len, i);
             if (rc < 0) {
-- 
2.23.0




* Re: [PATCH v2 1/2] numa: Set default distance map if needed
  2021-10-12  6:36 ` [PATCH v2 1/2] numa: Set default distance map if needed Gavin Shan
@ 2021-10-12  6:39   ` Andrew Jones
  0 siblings, 0 replies; 4+ messages in thread
From: Andrew Jones @ 2021-10-12  6:39 UTC (permalink / raw)
  To: Gavin Shan; +Cc: peter.maydell, qemu-devel, qemu-arm, shan.gavin, ehabkost

On Tue, Oct 12, 2021 at 02:36:02PM +0800, Gavin Shan wrote:
> The following option is used to specify the distance map. It's
> possible the option isn't provided by user. In this case, the
> distance map isn't populated and exposed to platform. On the
> other hand, the empty NUMA node, where no memory resides, is
> allowed on ARM64 virt platform. For these empty NUMA nodes,
> their corresponding device-tree nodes aren't populated, but
> their NUMA IDs should be included in the "/distance-map"
> device-tree node, so that kernel can probe them properly if
> device-tree is used.
> 
>   -numa,dist,src=<numa_id>,dst=<numa_id>,val=<distance>
> 
> So when user doesn't specify distance map, we need to generate
> the default distance map, where the local and remote distances
> are 10 and 20 separately. However, this is going to change the
> hardware description of the guest in this particular scenario.
> It's fine as the guest should be tolerant to ignore the distance
> map completely or parse it properly by following the device-tree
> specification.
> 
> This introduces an extra parameter to the exiting function
> complete_init_numa_distance() to generate the default distance
> map when no node pair distances are provided by user.
> 
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  hw/core/numa.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
>

Reviewed-by: Andrew Jones <drjones@redhat.com>



