xen-devel.lists.xenproject.org archive mirror
From: Stefano Stabellini <sstabellini@kernel.org>
To: Wei Chen <Wei.Chen@arm.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>,
	 "xen-devel@lists.xenproject.org"
	<xen-devel@lists.xenproject.org>,
	 "julien@xen.org" <julien@xen.org>,
	 Bertrand Marquis <Bertrand.Marquis@arm.com>,
	 "jbeulich@suse.com" <jbeulich@suse.com>,
	 "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>,
	 "roger.pau@citrix.com" <roger.pau@citrix.com>,
	"wl@xen.org" <wl@xen.org>
Subject: RE: [PATCH 08/37] xen/x86: add detection of discontinous node memory range
Date: Sun, 26 Sep 2021 20:13:09 -0700 (PDT)
Message-ID: <alpine.DEB.2.21.2109262002390.5022@sstabellini-ThinkPad-T480s>
In-Reply-To: <DB9PR08MB6857A3176752B3E08EAE4D739EA69@DB9PR08MB6857.eurprd08.prod.outlook.com>

On Sun, 26 Sep 2021, Wei Chen wrote:
> > -----Original Message-----
> > From: Stefano Stabellini <sstabellini@kernel.org>
> > Sent: September 25, 2021 3:53
> > To: Wei Chen <Wei.Chen@arm.com>
> > Cc: Stefano Stabellini <sstabellini@kernel.org>;
> > xen-devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > <Bertrand.Marquis@arm.com>; jbeulich@suse.com; andrew.cooper3@citrix.com;
> > roger.pau@citrix.com; wl@xen.org
> > Subject: RE: [PATCH 08/37] xen/x86: add detection of discontinous node
> > memory range
> > 
> > On Fri, 24 Sep 2021, Wei Chen wrote:
> > > > -----Original Message-----
> > > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > > Sent: September 24, 2021 8:26
> > > > To: Wei Chen <Wei.Chen@arm.com>
> > > > Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org;
> > > > julien@xen.org; Bertrand Marquis <Bertrand.Marquis@arm.com>;
> > > > jbeulich@suse.com; andrew.cooper3@citrix.com; roger.pau@citrix.com;
> > > > wl@xen.org
> > > > Subject: Re: [PATCH 08/37] xen/x86: add detection of discontinous node
> > > > memory range
> > > >
> > > > CC'ing x86 maintainers
> > > >
> > > > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > > > One NUMA node may contain several memory blocks. In the current Xen
> > > > > code, Xen maintains a node memory range for each node to cover
> > > > > all of its memory blocks. The problem is that if the gap between
> > > > > two of a node's memory blocks contains memory blocks that don't
> > > > > belong to this node (remote memory blocks), this node's memory range
> > > > > will be expanded to cover these remote memory blocks.
> > > > >
> > > > > One node's memory range then contains other nodes' memory, which is
> > > > > obviously not reasonable. This means the current NUMA code can only
> > > > > support nodes with continuous memory blocks. However, on a physical
> > > > > machine, the addresses of multiple nodes can be interleaved.
> > > > >
> > > > > So in this patch, we add code to detect discontinuous memory blocks
> > > > > for one node. NUMA initialization will fail and error messages
> > > > > will be printed when Xen detects such a hardware configuration.
> > > >
> > > > At least on ARM, it is not just memory that can be interleaved, but
> > > > also MMIO regions. For instance:
> > > >
> > > > node0 bank0 0-0x1000000
> > > > MMIO 0x1000000-0x1002000
> > > > Hole 0x1002000-0x2000000
> > > > node0 bank1 0x2000000-0x3000000
> > > >
> > > > So I am not familiar with the SRAT format, but I think on ARM the
> > > > check would look different: we would just look for multiple memory
> > > > ranges under a device_type = "memory" node of a NUMA node in device
> > > > tree.
> > > >
> > > >
> > >
> > > Should I include/refine the above message in the commit log?
> > 
> > Let me ask you a question first.
> > 
> > With the NUMA implementation of this patch series, can we deal with
> > cases where each node has multiple memory banks, not interleaved?
> 
> Yes.
> 
> > As an example:
> > 
> > node0: 0x0        - 0x10000000
> > MMIO : 0x10000000 - 0x20000000
> > node0: 0x20000000 - 0x30000000
> > MMIO : 0x30000000 - 0x50000000
> > node1: 0x50000000 - 0x60000000
> > MMIO : 0x60000000 - 0x80000000
> > node2: 0x80000000 - 0x90000000
> > 
> > 
> > I assume we can deal with this case simply by setting node0 memory to
> > 0x0-0x30000000 even if there is actually something else, a device, that
> > doesn't belong to node0 in between the two node0 banks?
> 
> While this configuration is rare in SoC design, it is not impossible.

Definitely, I have seen it before.


> > Is it only interleaved memory from other nodes that causes issues? In
> > other words, is only the following a problematic scenario?
> > 
> > node0: 0x0        - 0x10000000
> > MMIO : 0x10000000 - 0x20000000
> > node1: 0x20000000 - 0x30000000
> > MMIO : 0x30000000 - 0x50000000
> > node0: 0x50000000 - 0x60000000
> > 
> > Because node1 is in between the two ranges of node0?
> > 
> 
> But only device_type="memory" ranges can be added for allocation.
> For MMIO there are two cases:
> 1. The MMIO region doesn't have a NUMA id property.
> 2. The MMIO region has a NUMA id property, like some PCIe controllers.
>    But we don't need to handle these kinds of MMIO devices
>    in memory block parsing, because we don't need to allocate
>    memory from these MMIO ranges. And for accessing them, we need
>    a NUMA-aware PCIe controller driver or generic NUMA-aware
>    MMIO access APIs.

Yes, I am not too worried about devices with a NUMA id property because
they are less common and this series doesn't handle them at all, right?
I imagine they would be treated like any other device without NUMA
awareness.

I am thinking about the case where the memory of each NUMA node is made
of multiple banks. I understand that this patch adds an explicit check
for cases where these banks are interleaved, however there are many
other cases where NUMA memory nodes are *not* interleaved but they are
still made of multiple discontinuous banks, like in the two examples
above.

My question is whether this patch series in its current form can handle
the two cases above correctly. If so, I am wondering how it works given
that we only have a single "start" and "size" parameter per node.

On the other hand, if this series cannot handle the two cases above, my
question is whether it would fail explicitly or not. The new
check is_node_memory_continuous doesn't seem to be able to catch them.
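
To make the difference between the two layouts quoted above concrete, here
is a minimal C sketch. It is not the code from the patch and it does not
reproduce is_node_memory_continuous; the struct, the function name
node_span_has_remote_memory and the example tables are all hypothetical,
written only for illustration. It assumes each node is described by a
single span (lowest start to highest end) and asks whether that span
overlaps memory owned by another node. MMIO regions and holes are not in
the table, so they never trigger the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; not Xen's actual structures. */
typedef uint64_t paddr_t;

struct mem_block {
    unsigned int nid;   /* NUMA node that owns this block */
    paddr_t start;      /* inclusive */
    paddr_t end;        /* exclusive */
};

/*
 * Return true if the single span [lowest start, highest end) of node
 * 'nid' overlaps a memory block owned by a different node.
 */
static bool node_span_has_remote_memory(const struct mem_block *blocks,
                                        size_t nr, unsigned int nid)
{
    paddr_t lo = UINT64_MAX, hi = 0;
    size_t i;

    assert(blocks != NULL || nr == 0);

    /* Pass 1: compute the span that would cover all of the node's banks. */
    for (i = 0; i < nr; i++) {
        if (blocks[i].nid != nid)
            continue;
        if (blocks[i].start < lo)
            lo = blocks[i].start;
        if (blocks[i].end > hi)
            hi = blocks[i].end;
    }

    /* Pass 2: look for another node's memory inside that span. */
    for (i = 0; i < nr; i++) {
        if (blocks[i].nid == nid)
            continue;
        if (blocks[i].start < hi && blocks[i].end > lo)
            return true;
    }

    return false;
}

/* First quoted layout: node0's two banks are separated only by MMIO. */
static const struct mem_block example_mmio_gap[] = {
    { 0, 0x00000000, 0x10000000 },
    { 0, 0x20000000, 0x30000000 },
    { 1, 0x50000000, 0x60000000 },
    { 2, 0x80000000, 0x90000000 },
};

/* Second quoted layout: node1's memory sits between node0's two banks. */
static const struct mem_block example_interleaved[] = {
    { 0, 0x00000000, 0x10000000 },
    { 1, 0x20000000, 0x30000000 },
    { 0, 0x50000000, 0x60000000 },
};
```

Under these assumptions, node_span_has_remote_memory(example_mmio_gap, 4, 0)
returns false, while node_span_has_remote_memory(example_interleaved, 3, 0)
returns true. A check of this shape accepts the first layout even though
node0's span 0x0-0x30000000 still covers an MMIO range that is not node0
memory, which is the situation the questions above are about.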


> > I am asking these questions because it is certainly possible to have
> > multiple memory ranges for each NUMA node in device tree, either by
> > specifying multiple ranges with a single "reg" property, or by
> > specifying multiple memory nodes with the same numa-node-id.
> 
> 
> 
