* [PATCH V5 0/4] ACPI: Support Generic Initiator proximity domains
From: Jonathan Cameron @ 2019-10-04 11:43 UTC
To: linux-mm, linux-acpi, linux-kernel, linux-arm-kernel, x86
Cc: Keith Busch, jglisse, Rafael J . Wysocki, linuxarm, Andrew Morton,
    Dan Williams, Jonathan Cameron

Introduces a new type of NUMA node for cases where we want to represent
the access characteristics of a non CPU initiator of memory requests,
as these differ from all those for existing nodes containing CPUs and/or
memory.

This patch set has been sitting around for a long time (rebases only
since v3 in April) without significant review.  I would appreciate it
very much if anyone has time to take a look.

One outstanding question to highlight in this series is whether
we should assume all ACPI supporting architectures support Generic
Initiator domains, or whether to introduce an
ARCH_HAS_GENERIC_INITIATOR_DOMAINS entry in Kconfig.

Changes since V4:

At Rafael's suggestion:

Rebase on top of Dan Williams' Specific Purpose Memory series as that
moves srat.c.  Original patches cherry-picked fine onto mmotm with
Dan's patches applied.

Applies to mmotm-2019-09-25 +
https://lore.kernel.org/linux-acpi/156140036490.2951909.1837804994781523185.stgit@dwillia2-desk3.amr.corp.intel.com/
[PATCH v4 00/10] EFI Specific Purpose Memory Support
(note there are some trivial conflicts to deal with when applying
the SPM series).

Changes since V3.
* Rebase.

Changes since RFC V2.
* RFC dropped as now we have x86 support, so the lack of guards in
  the ACPI code etc should now be fine.
* Added x86 support.  Note this has only been tested on QEMU as I don't
  have a convenient x86 NUMA machine to play with.
  Note that this fitted together rather differently from arm64 so I'm
  particularly interested in feedback on the two solutions.

Since RFC V1.
* Fix incorrect interpretation of the ACPI entry noted by Keith Busch
* Use the acpica headers definitions that are now in mmotm.

It's worth noting that safely putting a given device in a GI node may
require changes to the existing drivers, as it's not unusual to assume
you have local memory or processor core.  There may be further
constraints not yet covered by this patch.

Original cover letter...

ACPI 6.3 introduced a new entity that can be part of a NUMA proximity
domain.  It may share such a domain with the existing options (memory,
CPU etc) but it may also exist on its own.

The intent is to allow the description of the NUMA properties
(particularly via HMAT) of accelerators and other initiators of memory
activity that are not the host processor running the operating system.

This patch set introduces 'just enough' to make them work for arm64 and
x86.  It should be trivial to support other architectures, I just don't
have suitable NUMA systems readily available to test.

There are a few quirks that need to be considered.

1. Fall back nodes
******************

As pre ACPI 6.3 supporting operating systems do not have Generic
Initiator Proximity Domains it is possible to specify, via _PXM in DSDT,
that another device is part of such a GI only node.  This currently
blows up spectacularly.

Whilst we can obviously 'now' protect against such a situation (see the
related thread on PCI _PXM support and the threadripper board identified
there as also falling into the problem of using non existent nodes
https://patchwork.kernel.org/patch/10723311/ ), there is no way to be
sure we will never have legacy OSes that are not protected against
this.  It would also be 'non ideal' to fall back to a default node as
there may be a better (non GI) node to pick if GI nodes aren't
available.
The work around is that we also have a new system wide OSC bit that
allows an operating system to 'announce' that it supports Generic
Initiators.  This allows the firmware to use DSDT magic to 'move'
devices between the nodes dependent on whether our new nodes are there
or not.

2. New ways of assigning a proximity domain for devices
*******************************************************

Until now, the only way firmware could indicate that a particular
device (outside the 'special' set of cpus etc) was to be found in a
particular Proximity Domain was by the use of _PXM in DSDT.

That is equally valid with GI domains, but we have new options.  The
SRAT affinity structure includes a handle (ACPI or PCI) to identify
devices within the system and specify their proximity domain that way.
If both _PXM and this are provided, they should give the same answer.

For now this patch set completely ignores that feature as we don't need
it to start the discussion.  It will form a follow up set at some point
(if no one else fancies doing it).

Jonathan Cameron (4):
  ACPI: Support Generic Initiator only domains
  arm64: Support Generic Initiator only domains
  x86: Support Generic Initiator only proximity domains
  ACPI: Let ACPI know we support Generic Initiator Affinity Structures

 arch/arm64/kernel/smp.c        |  8 +++++
 arch/x86/include/asm/numa.h    |  2 ++
 arch/x86/kernel/setup.c        |  1 +
 arch/x86/mm/numa.c             | 14 ++++++++
 drivers/acpi/bus.c             |  1 +
 drivers/acpi/numa/srat.c       | 62 +++++++++++++++++++++++++++++++++-
 drivers/base/node.c            |  3 ++
 include/asm-generic/topology.h |  3 ++
 include/linux/acpi.h           |  1 +
 include/linux/nodemask.h       |  1 +
 include/linux/topology.h       |  7 ++++
 11 files changed, 102 insertions(+), 1 deletion(-)

-- 
2.20.1
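
For readers following section 2 of the cover letter, here is a minimal
userspace sketch of pulling the PCI form of the SRAT device handle apart.
It is not code from this series.  The byte offsets (segment in bytes 0-1,
BDF in bytes 2-3) simply mirror what the pr_debug() call in patch 1/4
reads; the struct and function names are hypothetical, and treating the
fields as little-endian is an assumption based on ACPI tables being
little-endian.

```c
#include <stdint.h>

/* Hypothetical holder for the PCI form of the 16-byte device handle. */
struct gi_pci_handle {
	uint16_t segment;	/* PCI segment, handle bytes 0-1 */
	uint16_t bdf;		/* raw bus/device/function word, bytes 2-3 */
};

/*
 * Decode the first four bytes of a device handle.  The fields are
 * assembled byte by byte, so the result does not depend on host
 * endianness or on the alignment of the buffer.
 */
static struct gi_pci_handle gi_decode_pci_handle(const uint8_t handle[16])
{
	struct gi_pci_handle h;

	h.segment = (uint16_t)(handle[0] | (handle[1] << 8));
	h.bdf = (uint16_t)(handle[2] | (handle[3] << 8));
	return h;
}
```

The BDF word is deliberately left raw here; how it splits into bus,
device and function should be checked against the ACPI specification
rather than assumed from this sketch.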
* [PATCH V5 1/4] ACPI: Support Generic Initiator only domains
From: Jonathan Cameron @ 2019-10-04 11:43 UTC
To: linux-mm, linux-acpi, linux-kernel, linux-arm-kernel, x86
Cc: Keith Busch, jglisse, Rafael J . Wysocki, linuxarm, Andrew Morton,
    Dan Williams, Jonathan Cameron

Generic Initiators are a new ACPI concept that allows for the
description of proximity domains that contain a device which
performs memory access (such as a network card) but neither
host CPU nor Memory.

This patch has the parsing code and provides the infrastructure
for an architecture to associate these new domains with their
nearest memory processing node.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 drivers/acpi/numa/srat.c       | 62 +++++++++++++++++++++++++++++++++-
 drivers/base/node.c            |  3 ++
 include/asm-generic/topology.h |  3 ++
 include/linux/nodemask.h       |  1 +
 include/linux/topology.h       |  7 ++++
 5 files changed, 75 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/numa/srat.c b/drivers/acpi/numa/srat.c
index eadbf90e65d1..fe34315a9234 100644
--- a/drivers/acpi/numa/srat.c
+++ b/drivers/acpi/numa/srat.c
@@ -170,6 +170,38 @@ acpi_table_print_srat_entry(struct acpi_subtable_header *header)
 		}
 		break;
 
+	case ACPI_SRAT_TYPE_GENERIC_AFFINITY:
+	{
+		struct acpi_srat_generic_affinity *p =
+			(struct acpi_srat_generic_affinity *)header;
+		char name[9] = {};
+
+		if (p->device_handle_type == 0) {
+			/*
+			 * For pci devices this may be the only place they
+			 * are assigned a proximity domain
+			 */
+			pr_debug("SRAT Generic Initiator(Seg:%u BDF:%u) in proximity domain %d %s\n",
+				 *(u16 *)(&p->device_handle[0]),
+				 *(u16 *)(&p->device_handle[2]),
+				 p->proximity_domain,
+				 (p->flags & ACPI_SRAT_GENERIC_AFFINITY_ENABLED) ?
+				 "enabled" : "disabled");
+		} else {
+			/*
+			 * In this case we can rely on the device having a
+			 * proximity domain reference
+			 */
+			memcpy(name, p->device_handle, 8);
+			pr_info("SRAT Generic Initiator(HID=%.8s UID=%.4s) in proximity domain %d %s\n",
+				(char *)(&p->device_handle[0]),
+				(char *)(&p->device_handle[8]),
+				p->proximity_domain,
+				(p->flags & ACPI_SRAT_GENERIC_AFFINITY_ENABLED) ?
+				"enabled" : "disabled");
+		}
+	}
+	break;
 	default:
 		pr_warn("Found unsupported SRAT entry (type = 0x%x)\n",
 			header->type);
@@ -378,6 +410,32 @@ acpi_parse_gicc_affinity(union acpi_subtable_headers *header,
 	return 0;
 }
 
+static int __init
+acpi_parse_gi_affinity(union acpi_subtable_headers *header,
+		       const unsigned long end)
+{
+	struct acpi_srat_generic_affinity *gi_affinity;
+	int node;
+
+	gi_affinity = (struct acpi_srat_generic_affinity *)header;
+	if (!gi_affinity)
+		return -EINVAL;
+	acpi_table_print_srat_entry(&header->common);
+
+	if (!(gi_affinity->flags & ACPI_SRAT_GENERIC_AFFINITY_ENABLED))
+		return -EINVAL;
+
+	node = acpi_map_pxm_to_node(gi_affinity->proximity_domain);
+	if (node == NUMA_NO_NODE || node >= MAX_NUMNODES) {
+		pr_err("SRAT: Too many proximity domains.\n");
+		return -EINVAL;
+	}
+	node_set(node, numa_nodes_parsed);
+	node_set_state(node, N_GENERIC_INITIATOR);
+
+	return 0;
+}
+
 static int __initdata parsed_numa_memblks;
 
 static int __init
@@ -433,7 +491,7 @@ int __init acpi_numa_init(void)
 
 	/* SRAT: System Resource Affinity Table */
 	if (!acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat)) {
-		struct acpi_subtable_proc srat_proc[3];
+		struct acpi_subtable_proc srat_proc[4];
 
 		memset(srat_proc, 0, sizeof(srat_proc));
 		srat_proc[0].id = ACPI_SRAT_TYPE_CPU_AFFINITY;
@@ -442,6 +500,8 @@ int __init acpi_numa_init(void)
 		srat_proc[1].handler = acpi_parse_x2apic_affinity;
 		srat_proc[2].id = ACPI_SRAT_TYPE_GICC_AFFINITY;
 		srat_proc[2].handler = acpi_parse_gicc_affinity;
+		srat_proc[3].id = ACPI_SRAT_TYPE_GENERIC_AFFINITY;
+		srat_proc[3].handler = acpi_parse_gi_affinity;
 
 		acpi_table_parse_entries_array(ACPI_SIG_SRAT,
 					sizeof(struct acpi_table_srat),
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 296546ffed6c..e5863baa8cb6 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -977,6 +977,8 @@ static struct node_attr node_state_attr[] = {
 #endif
 	[N_MEMORY] = _NODE_ATTR(has_memory, N_MEMORY),
 	[N_CPU] = _NODE_ATTR(has_cpu, N_CPU),
+	[N_GENERIC_INITIATOR] = _NODE_ATTR(has_generic_initiator,
+					   N_GENERIC_INITIATOR),
 };
 
 static struct attribute *node_state_attrs[] = {
@@ -988,6 +990,7 @@ static struct attribute *node_state_attrs[] = {
 #endif
 	&node_state_attr[N_MEMORY].attr.attr,
 	&node_state_attr[N_CPU].attr.attr,
+	&node_state_attr[N_GENERIC_INITIATOR].attr.attr,
 	NULL
 };
 
diff --git a/include/asm-generic/topology.h b/include/asm-generic/topology.h
index 238873739550..54d0b4176a45 100644
--- a/include/asm-generic/topology.h
+++ b/include/asm-generic/topology.h
@@ -71,6 +71,9 @@
 #ifndef set_cpu_numa_mem
 #define set_cpu_numa_mem(cpu, node)
 #endif
+#ifndef set_gi_numa_mem
+#define set_gi_numa_mem(gi, node)
+#endif
 
 #endif	/* !CONFIG_NUMA || !CONFIG_HAVE_MEMORYLESS_NODES */
 
diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index 27e7fa36f707..1aebf766fb52 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -399,6 +399,7 @@ enum node_states {
 #endif
 	N_MEMORY,		/* The node has memory(regular, high, movable) */
 	N_CPU,		/* The node has one or more cpus */
+	N_GENERIC_INITIATOR,	/* The node is a GI only node */
 	NR_NODE_STATES
 };
 
diff --git a/include/linux/topology.h b/include/linux/topology.h
index eb2fe6edd73c..05ccf011e489 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -140,6 +140,13 @@ static inline void set_numa_mem(int node)
 }
 #endif
 
+#ifndef set_gi_numa_mem
+static inline void set_gi_numa_mem(int gi, int node)
+{
+	_node_numa_mem_[gi] = node;
+}
+#endif
+
 #ifndef node_to_mem_node
 static inline int node_to_mem_node(int node)
 {
-- 
2.20.1
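
The control flow of the new SRAT handler in patch 1/4 can be sketched
outside the kernel roughly as follows.  This is an illustration under
stated assumptions, not the kernel implementation: plain bitmasks stand
in for the kernel nodemasks, GI_FLAG_ENABLED stands in for
ACPI_SRAT_GENERIC_AFFINITY_ENABLED, and sketch_map_pxm_to_node() is a
hypothetical first-come allocator standing in for
acpi_map_pxm_to_node().

```c
#include <errno.h>
#include <stdint.h>

#define GI_FLAG_ENABLED	0x1u	/* stand-in for the SRAT "enabled" flag */
#define MAX_NODES	4	/* tiny limit so exhaustion is easy to hit */
#define NO_NODE		(-1)

static uint32_t node_pxm[MAX_NODES];	/* proximity domain behind each node */
static int nr_nodes;
static unsigned int nodes_parsed;	/* bit n set: node n was seen in SRAT */
static unsigned int nodes_with_gi;	/* bit n set: node n holds a GI */

/* Return the node already mapped to pxm, or allocate the next free one. */
static int sketch_map_pxm_to_node(uint32_t pxm)
{
	for (int n = 0; n < nr_nodes; n++)
		if (node_pxm[n] == pxm)
			return n;
	if (nr_nodes == MAX_NODES)
		return NO_NODE;
	node_pxm[nr_nodes] = pxm;
	return nr_nodes++;
}

/* Same order of checks as the handler: enabled flag first, then mapping. */
static int parse_gi_entry(uint32_t flags, uint32_t pxm)
{
	int node;

	if (!(flags & GI_FLAG_ENABLED))
		return -EINVAL;

	node = sketch_map_pxm_to_node(pxm);
	if (node == NO_NODE)
		return -EINVAL;

	nodes_parsed |= 1u << node;
	nodes_with_gi |= 1u << node;
	return 0;
}
```

A disabled entry is rejected before any node is allocated, so it cannot
consume a node number.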
* Re: [PATCH V5 1/4] ACPI: Support Generic Initiator only domains
From: Rafael J. Wysocki @ 2019-10-18 10:18 UTC
To: Jonathan Cameron
Cc: linux-mm, linux-acpi, linux-kernel, linux-arm-kernel, x86,
    Keith Busch, jglisse, linuxarm, Andrew Morton, Dan Williams

On Friday, October 4, 2019 1:43:27 PM CEST Jonathan Cameron wrote:
> Generic Initiators are a new ACPI concept that allows for the
> description of proximity domains that contain a device which
> performs memory access (such as a network card) but neither
> host CPU nor Memory.
>
> This patch has the parsing code and provides the infrastructure
> for an architecture to associate these new domains with their
> nearest memory processing node.
>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

This depends on the series from Dan at:

https://lore.kernel.org/linux-acpi/CAPcyv4gBSX58CWH4HZ28w0_cZRzJrhgdEFHa2g8KDqyv8aFqZQ@mail.gmail.com/T/#m1acce3ae8f29f680c0d95fd1e840e703949fbc48

AFAICS, so please respin when that one hits the Linus' tree.
* Re: [PATCH V5 1/4] ACPI: Support Generic Initiator only domains
From: Jonathan Cameron @ 2019-10-18 12:46 UTC
To: Rafael J. Wysocki
Cc: linux-mm, linux-acpi, linux-kernel, linux-arm-kernel, x86,
    Keith Busch, jglisse, linuxarm, Andrew Morton, Dan Williams

On Fri, 18 Oct 2019 12:18:33 +0200
"Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:

> On Friday, October 4, 2019 1:43:27 PM CEST Jonathan Cameron wrote:
> > Generic Initiators are a new ACPI concept that allows for the
> > description of proximity domains that contain a device which
> > performs memory access (such as a network card) but neither
> > host CPU nor Memory.
> >
> > This patch has the parsing code and provides the infrastructure
> > for an architecture to associate these new domains with their
> > nearest memory processing node.
> >
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>
> This depends on the series from Dan at:
>
> https://lore.kernel.org/linux-acpi/CAPcyv4gBSX58CWH4HZ28w0_cZRzJrhgdEFHa2g8KDqyv8aFqZQ@mail.gmail.com/T/#m1acce3ae8f29f680c0d95fd1e840e703949fbc48
>

Hi Rafael,

Yes. Cover letter mentions it was rebased on v4 of that series.

> AFAICS, so please respin when that one hits the Linus' tree.

Sure, though that pushes it out another cycle and it's beginning to
get a bit silly (just rebases since April).

I guess it can't be helped given the series hits several trees.

Note that this version applies completely clean on top of V7 of Dan's
SPM/hmem set applied to the tip tree (which I assume is the route that
will take).  Hence, unless something else changes, the respin will be
identical to this version.

Thanks,

Jonathan
* Re: [PATCH V5 1/4] ACPI: Support Generic Initiator only domains
From: Rafael J. Wysocki @ 2019-11-07 14:54 UTC
To: Jonathan Cameron
Cc: linux-mm, linux-acpi, linux-kernel, linux-arm-kernel, x86,
    Keith Busch, jglisse, linuxarm, Andrew Morton, Dan Williams

On Friday, October 18, 2019 2:46:56 PM CET Jonathan Cameron wrote:
> On Fri, 18 Oct 2019 12:18:33 +0200
> "Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:
>
> > On Friday, October 4, 2019 1:43:27 PM CEST Jonathan Cameron wrote:
> > > Generic Initiators are a new ACPI concept that allows for the
> > > description of proximity domains that contain a device which
> > > performs memory access (such as a network card) but neither
> > > host CPU nor Memory.
> > >
> > > This patch has the parsing code and provides the infrastructure
> > > for an architecture to associate these new domains with their
> > > nearest memory processing node.
> > >
> > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >
> > This depends on the series from Dan at:
> >
> > https://lore.kernel.org/linux-acpi/CAPcyv4gBSX58CWH4HZ28w0_cZRzJrhgdEFHa2g8KDqyv8aFqZQ@mail.gmail.com/T/#m1acce3ae8f29f680c0d95fd1e840e703949fbc48
> >
> Hi Rafael,
>
> Yes. Cover letter mentions it was rebased on v4 of that series.
>
> > AFAICS, so please respin when that one hits the Linus' tree.
>
> Sure, though that pushes it out another cycle and it's beginning to
> get a bit silly (just rebases since April).
>
> I guess it can't be helped given the series hits several trees.

I've just applied the Dan's series and I can take patch [1/4] from this one,
but for the [2-3/4] I'd like to get some ACKs from the arm64 and x86 people
respectively.

Thanks!
* Re: [PATCH V5 1/4] ACPI: Support Generic Initiator only domains
From: Jonathan Cameron @ 2019-11-12 17:07 UTC
To: Rafael J. Wysocki
Cc: linux-mm, linux-acpi, linux-kernel, linux-arm-kernel, x86,
    Keith Busch, jglisse, linuxarm, Andrew Morton, Dan Williams, will,
    lorenzo.pieralisi, guohanjun, Ingo Molnar

On Thu, 7 Nov 2019 15:54:28 +0100
"Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:

> On Friday, October 18, 2019 2:46:56 PM CET Jonathan Cameron wrote:
> > On Fri, 18 Oct 2019 12:18:33 +0200
> > "Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:
> >
> > > On Friday, October 4, 2019 1:43:27 PM CEST Jonathan Cameron wrote:
> > > > Generic Initiators are a new ACPI concept that allows for the
> > > > description of proximity domains that contain a device which
> > > > performs memory access (such as a network card) but neither
> > > > host CPU nor Memory.
> > > >
> > > > This patch has the parsing code and provides the infrastructure
> > > > for an architecture to associate these new domains with their
> > > > nearest memory processing node.
> > > >
> > > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > >
> > > This depends on the series from Dan at:
> > >
> > > https://lore.kernel.org/linux-acpi/CAPcyv4gBSX58CWH4HZ28w0_cZRzJrhgdEFHa2g8KDqyv8aFqZQ@mail.gmail.com/T/#m1acce3ae8f29f680c0d95fd1e840e703949fbc48
> > >
> > Hi Rafael,
> >
> > Yes. Cover letter mentions it was rebased on v4 of that series.
> >
> > > AFAICS, so please respin when that one hits the Linus' tree.
> >
> > Sure, though that pushes it out another cycle and it's beginning to
> > get a bit silly (just rebases since April).
> >
> > I guess it can't be helped given the series hits several trees.
>
> I've just applied the Dan's series and I can take patch [1/4] from this one,
> but for the [2-3/4] I'd like to get some ACKs from the arm64 and x86 people
> respectively.

Thanks Rafael!

Absolutely understood on the need for Acks.

For ARM let us try a few more CCs
+CC Will, Lorenzo, Hanjun.

Also Ingo on basis of showing a passing interest in the x86
patch previously.  Otherwise I think we have the x86 people most
likely to comment already cc'd.

https://patchwork.kernel.org/cover/11174247/ has the full series.

I'd appreciate anyone who has time taking a look at these.  The
actual actions in the architectures are very simple, but I may well
be missing some subtlety.

>
> Thanks!
>

Thanks,

Jonathan
* Re: [PATCH V5 1/4] ACPI: Support Generic Initiator only domains
From: Dan Williams @ 2019-11-12 17:55 UTC
To: Jonathan Cameron
Cc: Linux MM, Linux ACPI, Linux Kernel Mailing List, Linux ARM, X86 ML,
    Keith Busch, Jérôme Glisse, Rafael J . Wysocki, Linuxarm,
    Andrew Morton, Tao Xu

[ add Tao Xu ]

On Fri, Oct 4, 2019 at 4:45 AM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> Generic Initiators are a new ACPI concept that allows for the
> description of proximity domains that contain a device which
> performs memory access (such as a network card) but neither
> host CPU nor Memory.
>
> This patch has the parsing code and provides the infrastructure
> for an architecture to associate these new domains with their
> nearest memory processing node.

Thanks for this Jonathan. May I ask how this was tested? Tao has been
working on qemu support for HMAT [1]. I have not checked if it already
supports generic initiator entries, but it would be helpful to include
an example of how the kernel sees these configurations in practice.

[1]: http://patchwork.ozlabs.org/cover/1096737/
* Re: [PATCH V5 1/4] ACPI: Support Generic Initiator only domains
From: Jonathan Cameron @ 2019-11-13 9:47 UTC
To: Dan Williams
Cc: Linux MM, Linux ACPI, Linux Kernel Mailing List, Linux ARM, X86 ML,
    Keith Busch, Jérôme Glisse, Rafael J . Wysocki, Linuxarm,
    Andrew Morton, Tao Xu

On Tue, 12 Nov 2019 09:55:17 -0800
Dan Williams <dan.j.williams@intel.com> wrote:

> [ add Tao Xu ]
>
> On Fri, Oct 4, 2019 at 4:45 AM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > Generic Initiators are a new ACPI concept that allows for the
> > description of proximity domains that contain a device which
> > performs memory access (such as a network card) but neither
> > host CPU nor Memory.
> >
> > This patch has the parsing code and provides the infrastructure
> > for an architecture to associate these new domains with their
> > nearest memory processing node.
>
> Thanks for this Jonathan. May I ask how this was tested? Tao has been
> working on qemu support for HMAT [1]. I have not checked if it already
> supports generic initiator entries, but it would be helpful to include
> an example of how the kernel sees these configurations in practice.
>
> [1]: http://patchwork.ozlabs.org/cover/1096737/

Tested against qemu with SRAT and SLIT table overrides from an
initrd to actually create the node and give it distances
(those all turn up correctly in the normal places).  DSDT override
used to move an emulated network card into the GI numa node.  That
currently requires the PCI patch referred to in the cover letter.
On arm64 tested both on qemu and real hardware (overrides on tables
even for real hardware as I can't persuade our BIOS team to implement
Generic Initiators until an OS is actually using them.)
The main real requirement is that memory allocations then occur from one of the nodes at the minimal distance when you do a devm_ allocation from a device assigned to the GI node. Also need to be able to query the distances to allow load balancing etc. All that works as expected. It only has a fairly tangential connection to HMAT in that HMAT can provide information on GI nodes. Given HMAT code is quite happy with memoryless nodes anyway it should work. QEMU doesn't currently have support to create GI SRAT entries let alone HMAT using them. Whilst I could look at adding such support to QEMU, it's not exactly high priority to emulate something we can test easily by overriding the tables before the kernel reads them. I'll look at how hard it is to build an HMAT table for my test configs based on the ones I used to test your HMAT patches a while back. Should be easy if tedious. Jonathan ^ permalink raw reply [flat|nested] 21+ messages in thread
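[ Editorial sketch, not part of the original mail. ] The requirement described above (allocations for a device in a GI-only node should come from a memory node at minimal distance) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct topo, MAX_NODES and nearest_memory_node() are hypothetical names made up for illustration, the distance values follow the SLIT convention where 10 means local, and none of this is the code from the series.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8

/* Hypothetical, simplified view of a NUMA topology (not a kernel structure). */
struct topo {
	size_t nr_nodes;
	unsigned char distance[MAX_NODES][MAX_NODES]; /* SLIT style, 10 == local */
	bool has_memory[MAX_NODES];
};

/*
 * Return the id of the node with memory that is closest to 'from',
 * or -1 if no node has memory or the arguments are out of range.
 * Ties go to the lowest node id.
 */
int nearest_memory_node(const struct topo *t, size_t from)
{
	int best = -1;
	unsigned int best_dist = UINT_MAX;

	assert(t != NULL);

	if (t->nr_nodes > MAX_NODES || from >= t->nr_nodes)
		return -1;

	for (size_t n = 0; n < t->nr_nodes; n++) {
		if (!t->has_memory[n])
			continue;
		if (t->distance[from][n] < best_dist) {
			best_dist = t->distance[from][n];
			best = (int)n;
		}
	}
	return best;
}
```

With two memory nodes (0 and 1) and a memoryless GI node 2 that is at distance 30 from node 0 and 15 from node 1, nearest_memory_node(&t, 2) picks node 1. A driver that wants to load balance across several equally near nodes would need more than this single answer, as the patch descriptions note.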
* Re: [PATCH V5 1/4] ACPI: Support Generic Initiator only domains 2019-11-13 9:47 ` Jonathan Cameron @ 2019-11-13 13:57 ` Tao Xu 2019-11-13 16:52 ` Dan Williams 2019-11-13 17:48 ` Jonathan Cameron 0 siblings, 2 replies; 21+ messages in thread From: Tao Xu @ 2019-11-13 13:57 UTC (permalink / raw) To: Jonathan Cameron, Dan Williams Cc: Linux MM, Linux ACPI, Linux Kernel Mailing List, Linux ARM, X86 ML, Keith Busch, Jérôme Glisse, Rafael J . Wysocki, Linuxarm, Andrew Morton On 11/13/2019 5:47 PM, Jonathan Cameron wrote: > On Tue, 12 Nov 2019 09:55:17 -0800 > Dan Williams <dan.j.williams@intel.com> wrote: > >> [ add Tao Xu ] >> >> On Fri, Oct 4, 2019 at 4:45 AM Jonathan Cameron >> <Jonathan.Cameron@huawei.com> wrote: >>> >>> Generic Initiators are a new ACPI concept that allows for the >>> description of proximity domains that contain a device which >>> performs memory access (such as a network card) but neither >>> host CPU nor Memory. >>> >>> This patch has the parsing code and provides the infrastructure >>> for an architecture to associate these new domains with their >>> nearest memory processing node. >> >> Thanks for this Jonathan. May I ask how this was tested? Tao has been >> working on qemu support for HMAT [1]. I have not checked if it already >> supports generic initiator entries, but it would be helpful to include >> an example of how the kernel sees these configurations in practice. >> >> [1]: http://patchwork.ozlabs.org/cover/1096737/ > > Tested against qemu with SRAT and SLIT table overrides from an > initrd to actually create the node and give it distances > (those all turn up correctly in the normal places). DSDT override > used to move an emulated network card into the GI numa node. That > currently requires the PCI patch referred to in the cover letter. > On arm64 tested both on qemu and real hardware (overrides on tables > even for real hardware as I can't persuade our BIOS team to implement > Generic Initiators until an OS is actually using them.) 
> > Main real requirement is memory allocations then occur from one of > the nodes at the minimal distance when you are do a devm_ allocation > from a device assigned. Also need to be able to query the distances > to allow load balancing etc. All that works as expected. > > It only has a fairly tangential connection to HMAT in that HMAT > can provide information on GI nodes. Given HMAT code is quite happy > with memoryless nodes anyway it should work. QEMU doesn't currently > have support to create GI SRAT entries let alone HMAT using them. > > Whilst I could look at adding such support to QEMU, it's not > exactly high priority to emulate something we can test easily > by overriding the tables before the kernel reads them. > > I'll look at how hard it is to build an HMAT tables for my test > configs based on the ones I used to test your HMAT patches a while > back. Should be easy if tedious. > > Jonathan > Indeed, HMAT can support Generic Initiators, but as far as I know, QEMU can only emulate a node with cpu and memory, or a memory-only node. Even if we assign a node with cpu only, qemu will raise an error. Considering compatibility, there is a lot of work to do for QEMU if we change the NUMA or SRAT table. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH V5 1/4] ACPI: Support Generic Initiator only domains 2019-11-13 13:57 ` Tao Xu @ 2019-11-13 16:52 ` Dan Williams 2019-11-13 17:56 ` Jonathan Cameron 2019-11-13 17:48 ` Jonathan Cameron 1 sibling, 1 reply; 21+ messages in thread From: Dan Williams @ 2019-11-13 16:52 UTC (permalink / raw) To: Tao Xu Cc: Jonathan Cameron, Linux MM, Linux ACPI, Linux Kernel Mailing List, Linux ARM, X86 ML, Keith Busch, Jérôme Glisse, Rafael J . Wysocki, Linuxarm, Andrew Morton On Wed, Nov 13, 2019 at 5:57 AM Tao Xu <tao3.xu@intel.com> wrote: > > On 11/13/2019 5:47 PM, Jonathan Cameron wrote: > > On Tue, 12 Nov 2019 09:55:17 -0800 > > Dan Williams <dan.j.williams@intel.com> wrote: > > > >> [ add Tao Xu ] > >> > >> On Fri, Oct 4, 2019 at 4:45 AM Jonathan Cameron > >> <Jonathan.Cameron@huawei.com> wrote: > >>> > >>> Generic Initiators are a new ACPI concept that allows for the > >>> description of proximity domains that contain a device which > >>> performs memory access (such as a network card) but neither > >>> host CPU nor Memory. > >>> > >>> This patch has the parsing code and provides the infrastructure > >>> for an architecture to associate these new domains with their > >>> nearest memory processing node. > >> > >> Thanks for this Jonathan. May I ask how this was tested? Tao has been > >> working on qemu support for HMAT [1]. I have not checked if it already > >> supports generic initiator entries, but it would be helpful to include > >> an example of how the kernel sees these configurations in practice. > >> > >> [1]: http://patchwork.ozlabs.org/cover/1096737/ > > > > Tested against qemu with SRAT and SLIT table overrides from an > > initrd to actually create the node and give it distances > > (those all turn up correctly in the normal places). DSDT override > > used to move an emulated network card into the GI numa node. That > > currently requires the PCI patch referred to in the cover letter. 
> > On arm64 tested both on qemu and real hardware (overrides on tables > > even for real hardware as I can't persuade our BIOS team to implement > > Generic Initiators until an OS is actually using them.) > > > > Main real requirement is memory allocations then occur from one of > > the nodes at the minimal distance when you are do a devm_ allocation > > from a device assigned. Also need to be able to query the distances > > to allow load balancing etc. All that works as expected. > > > > It only has a fairly tangential connection to HMAT in that HMAT > > can provide information on GI nodes. Given HMAT code is quite happy > > with memoryless nodes anyway it should work. QEMU doesn't currently > > have support to create GI SRAT entries let alone HMAT using them. > > > > Whilst I could look at adding such support to QEMU, it's not > > exactly high priority to emulate something we can test easily > > by overriding the tables before the kernel reads them. > > > > I'll look at how hard it is to build an HMAT tables for my test > > configs based on the ones I used to test your HMAT patches a while > > back. Should be easy if tedious. > > > > Jonathan > > > Indeed, HMAT can support Generic Initiator, but as far as I know, QEMU > only can emulate a node with cpu and memory, or memory-only. Even if we > assign a node with cpu only, qemu will raise error. Considering > compatibility, there are lots of work to do for QEMU if we change NUMA > or SRAT table. Thanks for the background. It would still be a useful feature to be able to define a memory + generic-initiator node in qemu. That will mirror real world accelerators with local memory configurations. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH V5 1/4] ACPI: Support Generic Initiator only domains 2019-11-13 16:52 ` Dan Williams @ 2019-11-13 17:56 ` Jonathan Cameron 0 siblings, 0 replies; 21+ messages in thread From: Jonathan Cameron @ 2019-11-13 17:56 UTC (permalink / raw) To: Dan Williams Cc: Tao Xu, Linux MM, Linux ACPI, Linux Kernel Mailing List, Linux ARM, X86 ML, Keith Busch, Jérôme Glisse, Rafael J . Wysocki, Linuxarm, Andrew Morton On Wed, 13 Nov 2019 08:52:46 -0800 Dan Williams <dan.j.williams@intel.com> wrote: > On Wed, Nov 13, 2019 at 5:57 AM Tao Xu <tao3.xu@intel.com> wrote: > > > > On 11/13/2019 5:47 PM, Jonathan Cameron wrote: > > > On Tue, 12 Nov 2019 09:55:17 -0800 > > > Dan Williams <dan.j.williams@intel.com> wrote: > > > > > >> [ add Tao Xu ] > > >> > > >> On Fri, Oct 4, 2019 at 4:45 AM Jonathan Cameron > > >> <Jonathan.Cameron@huawei.com> wrote: > > >>> > > >>> Generic Initiators are a new ACPI concept that allows for the > > >>> description of proximity domains that contain a device which > > >>> performs memory access (such as a network card) but neither > > >>> host CPU nor Memory. > > >>> > > >>> This patch has the parsing code and provides the infrastructure > > >>> for an architecture to associate these new domains with their > > >>> nearest memory processing node. > > >> > > >> Thanks for this Jonathan. May I ask how this was tested? Tao has been > > >> working on qemu support for HMAT [1]. I have not checked if it already > > >> supports generic initiator entries, but it would be helpful to include > > >> an example of how the kernel sees these configurations in practice. > > >> > > >> [1]: http://patchwork.ozlabs.org/cover/1096737/ > > > > > > Tested against qemu with SRAT and SLIT table overrides from an > > > initrd to actually create the node and give it distances > > > (those all turn up correctly in the normal places). DSDT override > > > used to move an emulated network card into the GI numa node. 
That > > > currently requires the PCI patch referred to in the cover letter. > > > On arm64 tested both on qemu and real hardware (overrides on tables > > > even for real hardware as I can't persuade our BIOS team to implement > > > Generic Initiators until an OS is actually using them.) > > > > > > Main real requirement is memory allocations then occur from one of > > > the nodes at the minimal distance when you are do a devm_ allocation > > > from a device assigned. Also need to be able to query the distances > > > to allow load balancing etc. All that works as expected. > > > > > > It only has a fairly tangential connection to HMAT in that HMAT > > > can provide information on GI nodes. Given HMAT code is quite happy > > > with memoryless nodes anyway it should work. QEMU doesn't currently > > > have support to create GI SRAT entries let alone HMAT using them. > > > > > > Whilst I could look at adding such support to QEMU, it's not > > > exactly high priority to emulate something we can test easily > > > by overriding the tables before the kernel reads them. > > > > > > I'll look at how hard it is to build an HMAT tables for my test > > > configs based on the ones I used to test your HMAT patches a while > > > back. Should be easy if tedious. > > > > > > Jonathan > > > > > Indeed, HMAT can support Generic Initiator, but as far as I know, QEMU > > only can emulate a node with cpu and memory, or memory-only. Even if we > > assign a node with cpu only, qemu will raise error. Considering > > compatibility, there are lots of work to do for QEMU if we change NUMA > > or SRAT table. > > Thanks for the background. It would still be a useful feature to be > able to define a memory + generic-initiator node in qemu. That will > mirror real world accelerators with local memory configurations. Ah crossed with my essay. This simple case you have here is easier to discuss. Lets call it a GPU on a coherent interconnect with local memory. 
What do you think should happen for access0 in sysfs? Do we want the GPU reflected in there or not? This particular case doesn't actually need a GI, though perhaps you might want one purely to give HMAT based info. On a pre GI system you would just use a memory only node and use DSDT _PXM to put the GPU device in it. Whilst I agree a means of testing this in qemu might be more friendly than doing it by overriding tables, the overriding route lets you do the crazy corner cases + generate 'invalid' tables which are also useful for testing. Thanks, Jonathan ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH V5 1/4] ACPI: Support Generic Initiator only domains 2019-11-13 13:57 ` Tao Xu 2019-11-13 16:52 ` Dan Williams @ 2019-11-13 17:48 ` Jonathan Cameron 2019-11-13 23:20 ` Dan Williams 1 sibling, 1 reply; 21+ messages in thread From: Jonathan Cameron @ 2019-11-13 17:48 UTC (permalink / raw) To: Tao Xu Cc: Dan Williams, Linux MM, Linux ACPI, Linux Kernel Mailing List, Linux ARM, X86 ML, Keith Busch, Jérôme Glisse, Rafael J . Wysocki, Linuxarm, Andrew Morton On Wed, 13 Nov 2019 21:57:24 +0800 Tao Xu <tao3.xu@intel.com> wrote: > On 11/13/2019 5:47 PM, Jonathan Cameron wrote: > > On Tue, 12 Nov 2019 09:55:17 -0800 > > Dan Williams <dan.j.williams@intel.com> wrote: > > > >> [ add Tao Xu ] > >> > >> On Fri, Oct 4, 2019 at 4:45 AM Jonathan Cameron > >> <Jonathan.Cameron@huawei.com> wrote: > >>> > >>> Generic Initiators are a new ACPI concept that allows for the > >>> description of proximity domains that contain a device which > >>> performs memory access (such as a network card) but neither > >>> host CPU nor Memory. > >>> > >>> This patch has the parsing code and provides the infrastructure > >>> for an architecture to associate these new domains with their > >>> nearest memory processing node. > >> > >> Thanks for this Jonathan. May I ask how this was tested? Tao has been > >> working on qemu support for HMAT [1]. I have not checked if it already > >> supports generic initiator entries, but it would be helpful to include > >> an example of how the kernel sees these configurations in practice. > >> > >> [1]: http://patchwork.ozlabs.org/cover/1096737/ > > > > Tested against qemu with SRAT and SLIT table overrides from an > > initrd to actually create the node and give it distances > > (those all turn up correctly in the normal places). DSDT override > > used to move an emulated network card into the GI numa node. That > > currently requires the PCI patch referred to in the cover letter. 
> > On arm64 tested both on qemu and real hardware (overrides on tables > > even for real hardware as I can't persuade our BIOS team to implement > > Generic Initiators until an OS is actually using them.) > > > > Main real requirement is memory allocations then occur from one of > > the nodes at the minimal distance when you are do a devm_ allocation > > from a device assigned. Also need to be able to query the distances > > to allow load balancing etc. All that works as expected. > > > > It only has a fairly tangential connection to HMAT in that HMAT > > can provide information on GI nodes. Given HMAT code is quite happy > > with memoryless nodes anyway it should work. QEMU doesn't currently > > have support to create GI SRAT entries let alone HMAT using them. > > > > Whilst I could look at adding such support to QEMU, it's not > > exactly high priority to emulate something we can test easily > > by overriding the tables before the kernel reads them. > > > > I'll look at how hard it is to build an HMAT tables for my test > > configs based on the ones I used to test your HMAT patches a while > > back. Should be easy if tedious. > > > > Jonathan > > > Indeed, HMAT can support Generic Initiator, but as far as I know, QEMU > only can emulate a node with cpu and memory, or memory-only. Even if we > assign a node with cpu only, qemu will raise error. Considering > compatibility, there are lots of work to do for QEMU if we change NUMA > or SRAT table. > I faked up a quick HMAT table. Used a configuration with 3x CPU and memory nodes, 1x memory only node and 1x GI node. Two test cases. In the first, the GI node is further from the memory only node than the CPU containing nodes are (the realistic case for existing hardware). That behaves as expected: there are no /sys/bus/node/devices/nodeX/access0 entries for the GI node, and the appropriate ones for the memory only node appear as normal.
The other case is more interesting: we have the memory only node nearer to the GI node than to any of the CPUs. In that case, for x86 at least, the HMAT code is happy to put an access0 directory in the GI node, with empty access0/initiators and the memory node under access0/targets.

The memory only node is node4 and the GI node is node3. So the relevant dirs under /sys/bus/node/devices are:

node3/access0/initiators/ Empty
node3/access0/targets/node4
node4/access0/initiators/[node3 read_bandwidth write_bandwidth etc]
node4/access0/targets/ Empty

So the current result (I think - the HMAT interface still confuses me :) is that a GI node is treated like a CPU node. This might mean there is no useful information available if you want to figure out which CPU containing node is nearest to Memory when the GI node is nearer still.

Is this a problem? I'm not sure...

If we don't want to include GI nodes then we can possibly use the node_state(x, N_CPU) method to check before considering them, or I guess parse SRAT to extract that info directly. I tried this and it seems to work, so I can add a patch doing this in the next version if we think this is the 'right' thing to do.

So what do you think 'should' happen?

Jonathan ^ permalink raw reply [flat|nested] 21+ messages in thread
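[ Editorial sketch, not part of the original mail. ] The question raised above, whether a GI-only node should count as the best initiator for a memory target, comes down to whether the selection filters on "has a CPU". The model below is a deliberately simplified illustration: struct node_info and best_initiator() are hypothetical names, a single read bandwidth figure stands in for the several HMAT attributes the real code considers, and the has_cpu flag only plays the role that a node_state(nid, N_CPU) check would play in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-node description relative to one memory target. */
struct node_info {
	bool is_initiator;    /* CPU node or Generic Initiator node */
	bool has_cpu;         /* stand-in for node_state(nid, N_CPU) */
	unsigned int read_bw; /* bandwidth to the target, made-up units */
};

/*
 * Pick the initiator with the highest read bandwidth to the target.
 * With cpu_only set, Generic Initiator only nodes are skipped.
 * Returns the node id, or -1 if no node qualifies.
 * Ties go to the lowest node id.
 */
int best_initiator(const struct node_info *nodes, size_t nr, bool cpu_only)
{
	int best = -1;
	unsigned int best_bw = 0;

	assert(nodes != NULL);

	for (size_t n = 0; n < nr; n++) {
		if (!nodes[n].is_initiator)
			continue;
		if (cpu_only && !nodes[n].has_cpu)
			continue;
		if (best < 0 || nodes[n].read_bw > best_bw) {
			best_bw = nodes[n].read_bw;
			best = (int)n;
		}
	}
	return best;
}
```

For a layout like the one in the mail (CPU nodes 0 to 2, GI node 3, memory only node 4) with the GI node given the highest bandwidth, the unfiltered call returns node 3 while the cpu_only call returns the best CPU node. Both answers are meaningful, which is why the choice of what to expose by default is a policy question rather than a parsing one.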
* Re: [PATCH V5 1/4] ACPI: Support Generic Initiator only domains 2019-11-13 17:48 ` Jonathan Cameron @ 2019-11-13 23:20 ` Dan Williams 2019-11-14 11:26 ` Jonathan Cameron 0 siblings, 1 reply; 21+ messages in thread From: Dan Williams @ 2019-11-13 23:20 UTC (permalink / raw) To: Jonathan Cameron Cc: Tao Xu, Linux MM, Linux ACPI, Linux Kernel Mailing List, Linux ARM, X86 ML, Keith Busch, Jérôme Glisse, Rafael J . Wysocki, Linuxarm, Andrew Morton On Wed, Nov 13, 2019 at 9:49 AM Jonathan Cameron <jonathan.cameron@huawei.com> wrote: > > On Wed, 13 Nov 2019 21:57:24 +0800 > Tao Xu <tao3.xu@intel.com> wrote: > > > On 11/13/2019 5:47 PM, Jonathan Cameron wrote: > > > On Tue, 12 Nov 2019 09:55:17 -0800 > > > Dan Williams <dan.j.williams@intel.com> wrote: > > > > > >> [ add Tao Xu ] > > >> > > >> On Fri, Oct 4, 2019 at 4:45 AM Jonathan Cameron > > >> <Jonathan.Cameron@huawei.com> wrote: > > >>> > > >>> Generic Initiators are a new ACPI concept that allows for the > > >>> description of proximity domains that contain a device which > > >>> performs memory access (such as a network card) but neither > > >>> host CPU nor Memory. > > >>> > > >>> This patch has the parsing code and provides the infrastructure > > >>> for an architecture to associate these new domains with their > > >>> nearest memory processing node. > > >> > > >> Thanks for this Jonathan. May I ask how this was tested? Tao has been > > >> working on qemu support for HMAT [1]. I have not checked if it already > > >> supports generic initiator entries, but it would be helpful to include > > >> an example of how the kernel sees these configurations in practice. > > >> > > >> [1]: http://patchwork.ozlabs.org/cover/1096737/ > > > > > > Tested against qemu with SRAT and SLIT table overrides from an > > > initrd to actually create the node and give it distances > > > (those all turn up correctly in the normal places). DSDT override > > > used to move an emulated network card into the GI numa node. 
That > > > currently requires the PCI patch referred to in the cover letter. > > > On arm64 tested both on qemu and real hardware (overrides on tables > > > even for real hardware as I can't persuade our BIOS team to implement > > > Generic Initiators until an OS is actually using them.) > > > > > > Main real requirement is memory allocations then occur from one of > > > the nodes at the minimal distance when you are do a devm_ allocation > > > from a device assigned. Also need to be able to query the distances > > > to allow load balancing etc. All that works as expected. > > > > > > It only has a fairly tangential connection to HMAT in that HMAT > > > can provide information on GI nodes. Given HMAT code is quite happy > > > with memoryless nodes anyway it should work. QEMU doesn't currently > > > have support to create GI SRAT entries let alone HMAT using them. > > > > > > Whilst I could look at adding such support to QEMU, it's not > > > exactly high priority to emulate something we can test easily > > > by overriding the tables before the kernel reads them. > > > > > > I'll look at how hard it is to build an HMAT tables for my test > > > configs based on the ones I used to test your HMAT patches a while > > > back. Should be easy if tedious. > > > > > > Jonathan > > > > > Indeed, HMAT can support Generic Initiator, but as far as I know, QEMU > > only can emulate a node with cpu and memory, or memory-only. Even if we > > assign a node with cpu only, qemu will raise error. Considering > > compatibility, there are lots of work to do for QEMU if we change NUMA > > or SRAT table. > > > > I faked up a quick HMAT table. > > Used a configuration with 3x CPU and memory nodes, 1x memory only node > and 1x GI node. Two test cases, one where the GI initiator is further than > the CPU containing nodes from the memory only node (realistic case for > existing hardware). 
That behaves as expected and there are no > /sys/node/bus/nodeX/access0 entries for the GI node > + appropriate ones for the memory only node as normal. > > The other case is more interesting we have the memory only node nearer > to the GI node than to any of the CPUs. In that case for x86 at least > the HMAT code is happy to put an access0 directory GI in the GI node > with empty access0/initiators and the memory node under access0/targets > > The memory only node is node4 and the GI node node3. > > So relevant dirs under /sys/bus/nodes/devices > > node3/access0/initators/ Empty > node3/access0/targets/node4 This makes sense node3 is an initiator, no other nodes can initiate to it. > node4/access0/initators/[node3 read_bandwidth write_bandwith etc] > node4/access0/targets/ Empty > > So the result current (I think - the HMAT interface still confuses > me :) is that a GI node is treated like a CPU node. This might mean > there is no useful information available if you want to figure out > which CPU containing node is nearest to Memory when the GI node is > nearer still. > > Is this a problem? I'm not sure... > > If we don't want to include GI nodes then we can possibly > use the node_state(N_CPU, x) method to check before considering > them, or I guess parse SRAT to extract that info directly. > > I tried this and it seems to work so can add patch doing this > next version if we think this is the 'right' thing to do. > > So what do you think 'should' happen? I think this might be our first case for adding an "access1" instance by default. I.e. in the case when access0 is not a cpu, then access1 is there to at least show the "local" cpu and let userspace see the performance difference of cpu vs a specific-initiator access. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH V5 1/4] ACPI: Support Generic Initiator only domains 2019-11-13 23:20 ` Dan Williams @ 2019-11-14 11:26 ` Jonathan Cameron 2019-11-16 20:45 ` Dan Williams 0 siblings, 1 reply; 21+ messages in thread From: Jonathan Cameron @ 2019-11-14 11:26 UTC (permalink / raw) To: Dan Williams Cc: Tao Xu, Linux MM, Linux ACPI, Linux Kernel Mailing List, Linux ARM, X86 ML, Keith Busch, Jérôme Glisse, Rafael J . Wysocki, Linuxarm, Andrew Morton On Wed, 13 Nov 2019 15:20:01 -0800 Dan Williams <dan.j.williams@intel.com> wrote: > On Wed, Nov 13, 2019 at 9:49 AM Jonathan Cameron > <jonathan.cameron@huawei.com> wrote: > > > > On Wed, 13 Nov 2019 21:57:24 +0800 > > Tao Xu <tao3.xu@intel.com> wrote: > > > > > On 11/13/2019 5:47 PM, Jonathan Cameron wrote: > > > > On Tue, 12 Nov 2019 09:55:17 -0800 > > > > Dan Williams <dan.j.williams@intel.com> wrote: > > > > > > > >> [ add Tao Xu ] > > > >> > > > >> On Fri, Oct 4, 2019 at 4:45 AM Jonathan Cameron > > > >> <Jonathan.Cameron@huawei.com> wrote: > > > >>> > > > >>> Generic Initiators are a new ACPI concept that allows for the > > > >>> description of proximity domains that contain a device which > > > >>> performs memory access (such as a network card) but neither > > > >>> host CPU nor Memory. > > > >>> > > > >>> This patch has the parsing code and provides the infrastructure > > > >>> for an architecture to associate these new domains with their > > > >>> nearest memory processing node. > > > >> > > > >> Thanks for this Jonathan. May I ask how this was tested? Tao has been > > > >> working on qemu support for HMAT [1]. I have not checked if it already > > > >> supports generic initiator entries, but it would be helpful to include > > > >> an example of how the kernel sees these configurations in practice. 
> > > >> > > > >> [1]: http://patchwork.ozlabs.org/cover/1096737/ > > > > > > > > Tested against qemu with SRAT and SLIT table overrides from an > > > > initrd to actually create the node and give it distances > > > > (those all turn up correctly in the normal places). DSDT override > > > > used to move an emulated network card into the GI numa node. That > > > > currently requires the PCI patch referred to in the cover letter. > > > > On arm64 tested both on qemu and real hardware (overrides on tables > > > > even for real hardware as I can't persuade our BIOS team to implement > > > > Generic Initiators until an OS is actually using them.) > > > > > > > > Main real requirement is memory allocations then occur from one of > > > > the nodes at the minimal distance when you are do a devm_ allocation > > > > from a device assigned. Also need to be able to query the distances > > > > to allow load balancing etc. All that works as expected. > > > > > > > > It only has a fairly tangential connection to HMAT in that HMAT > > > > can provide information on GI nodes. Given HMAT code is quite happy > > > > with memoryless nodes anyway it should work. QEMU doesn't currently > > > > have support to create GI SRAT entries let alone HMAT using them. > > > > > > > > Whilst I could look at adding such support to QEMU, it's not > > > > exactly high priority to emulate something we can test easily > > > > by overriding the tables before the kernel reads them. > > > > > > > > I'll look at how hard it is to build an HMAT tables for my test > > > > configs based on the ones I used to test your HMAT patches a while > > > > back. Should be easy if tedious. > > > > > > > > Jonathan > > > > > > > Indeed, HMAT can support Generic Initiator, but as far as I know, QEMU > > > only can emulate a node with cpu and memory, or memory-only. Even if we > > > assign a node with cpu only, qemu will raise error. 
Considering > > > compatibility, there are lots of work to do for QEMU if we change NUMA > > > or SRAT table. > > > > > > > I faked up a quick HMAT table. > > > > Used a configuration with 3x CPU and memory nodes, 1x memory only node > > and 1x GI node. Two test cases, one where the GI initiator is further than > > the CPU containing nodes from the memory only node (realistic case for > > existing hardware). That behaves as expected and there are no > > /sys/node/bus/nodeX/access0 entries for the GI node > > + appropriate ones for the memory only node as normal. > > > > The other case is more interesting we have the memory only node nearer > > to the GI node than to any of the CPUs. In that case for x86 at least > > the HMAT code is happy to put an access0 directory GI in the GI node > > with empty access0/initiators and the memory node under access0/targets > > > > The memory only node is node4 and the GI node node3. > > > > So relevant dirs under /sys/bus/nodes/devices > > > > node3/access0/initators/ Empty > > node3/access0/targets/node4 > > This makes sense node3 is an initiator, no other nodes can initiate to it. > > > node4/access0/initators/[node3 read_bandwidth write_bandwith etc] > > node4/access0/targets/ Empty > > > > So the result current (I think - the HMAT interface still confuses > > me :) is that a GI node is treated like a CPU node. This might mean > > there is no useful information available if you want to figure out > > which CPU containing node is nearest to Memory when the GI node is > > nearer still. > > > > Is this a problem? I'm not sure... > > > > If we don't want to include GI nodes then we can possibly > > use the node_state(N_CPU, x) method to check before considering > > them, or I guess parse SRAT to extract that info directly. > > > > I tried this and it seems to work so can add patch doing this > > next version if we think this is the 'right' thing to do. > > > > So what do you think 'should' happen? 
> > I think this might be our first case for adding an "access1" instance > by default. I.e. in the case when access0 is not a cpu, then access1 > is there to at least show the "local" cpu and let userspace see the > performance difference of cpu vs a specific-initiator access. Hi Dan, Agreed that it makes sense to expand how we describe these cases a bit. To make sure I've understood correctly let me paraphrase what you are proposing (and tweak it a bit ;) Assuming for this purpose we don't put GIs in CPU nodes as that makes for a really fiddly explanation. In reality the code will need to handle that. 1) Leave access0 as it currently is with this series - so continue to not distinguish between CPU nodes and Generic Initiator containing ones? 2) Add access1 which is effectively access0 ignoring Generic Initiators? My feeling is that any existing users of access0 are definitely not going to be expecting generic initiators, so we might want to do this the other way around. access0 is only CPUs and memory, access1 is including generic initiators. If there are no GIs don't expose access1 at all? For now we could simply block the GI visibility in access0 and deal with access1 as a separate series. I suspect we will get push back as there are no known users of our new access1 so it may take a while to prove utility and get it accepted. Thanks, Jonathan ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH V5 1/4] ACPI: Support Generic Initiator only domains 2019-11-14 11:26 ` Jonathan Cameron @ 2019-11-16 20:45 ` Dan Williams 2019-11-18 17:18 ` Brice Goglin 0 siblings, 1 reply; 21+ messages in thread From: Dan Williams @ 2019-11-16 20:45 UTC (permalink / raw) To: Jonathan Cameron Cc: Tao Xu, Linux MM, Linux ACPI, Linux Kernel Mailing List, Linux ARM, X86 ML, Keith Busch, Jérôme Glisse, Rafael J . Wysocki, Linuxarm, Andrew Morton On Thu, Nov 14, 2019 at 3:27 AM Jonathan Cameron <jonathan.cameron@huawei.com> wrote: [..] > Hi Dan, > > Agreed that it makes sense to expand how we describe these cases a bit. > To make sure I've understood correctly let me paraphrase what you > are proposing (and tweak it a bit ;) > > Assuming for this purpose we don't put GIs in CPU nodes as that makes > for really fiddly explanation. In reality the code will need to handle > that. > > 1) Leave access0 as it currently is with this series - so continue to > not distinguish between CPU nodes and Generic Initator containing ones? Yes, but with the caveat that I think 2) also needs to be part of the series before it goes upstream. I.e. don't regress the amount of default information just because a generic initiator is present. > 2) Add access 1 which is effectively access0 ignoring Generic Initiators? Effectively yes, but I'd say it differently. Always display the access class for the local initiator as defined by the HMAT as access0, but also include the "local" cpu node. > My feeling is that any existing users of access0 are definitely not going > to be expecting generic initiators, so we might want to do this the other > way around. access0 is only CPUs and memory, access1 is including > generic initiators. If there are no GIs don't expose access1 at all? There are no consumers of the information that I know of, so I do not see the risk of regression. > For now we could simply block the GI visibility in access0 and deal > with access1 as a separate series. 
I suspect we will get push back > as there are no known users of our new access1 so it may take a while > to prove utility and get it accepted. The problem is that HMAT gives an unequivocal answer for "local" because it lists it in the table explicitly. Everything else is a subjective determination from parsing the performance data and picking a metric. If access0 is a GI, then let sysfs just reflect that truth. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH V5 1/4] ACPI: Support Generic Initiator only domains 2019-11-16 20:45 ` Dan Williams @ 2019-11-18 17:18 ` Brice Goglin 0 siblings, 0 replies; 21+ messages in thread From: Brice Goglin @ 2019-11-18 17:18 UTC (permalink / raw) To: Dan Williams, Jonathan Cameron Cc: Tao Xu, Linux MM, Linux ACPI, Linux Kernel Mailing List, Linux ARM, X86 ML, Keith Busch, Jérôme Glisse, Rafael J . Wysocki, Linuxarm, Andrew Morton Le 16/11/2019 à 21:45, Dan Williams a écrit : > >> My feeling is that any existing users of access0 are definitely not going >> to be expecting generic initiators, so we might want to do this the other >> way around. access0 is only CPUs and memory, access1 is including >> generic initiators. If there are no GIs don't expose access1 at all? > There are no consumers of the information that I know of, so I do not > see the risk of regression. hwloc already reads access0/initiators/ node symlinks (mostly useful for finding which CPUs are local to kmem dax devices). If I understand correctly the changes you propose, we would get an empty list of CPUs in the access0/initiators/ nodes? If it only occurs on platforms with GI (when are those coming to market?), I'd say it's not a big deal for us, we'll manage to have users upgrade their hwloc. Brice ^ permalink raw reply [flat|nested] 21+ messages in thread
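Brice's point about hwloc reading the access0/initiators/ node symlinks can be illustrated with a small userspace sketch. This is not hwloc's code: the helpers parse_node_name() and list_initiators() are hypothetical names, and the path assumes the /sys/devices/system/node/nodeN/accessM/initiators/ layout described in the kernel's NUMA performance documentation. It only shows why a consumer that expects CPU nodes in that directory would notice if the entries became GI only nodes.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a sysfs entry name of the form "node<N>".
 * Returns N, or -1 if the name is anything else (".", "..", "uevent", ...).
 */
int parse_node_name(const char *name)
{
	char *end;
	long id;

	if (strncmp(name, "node", 4) != 0 || !isdigit((unsigned char)name[4]))
		return -1;

	id = strtol(name + 4, &end, 10);
	if (*end != '\0' || id > INT_MAX)
		return -1;

	return (int)id;
}

/*
 * Collect the initiator node ids linked under one access class of a
 * target (memory) node. Returns the number of ids stored, or -1 if the
 * directory cannot be opened (for example, no HMAT data on this system).
 * The path layout is an assumption of this sketch.
 */
int list_initiators(int target, int access_class, int *ids, int max_ids)
{
	char path[128];
	struct dirent *de;
	DIR *dir;
	int n = 0;

	assert(ids != NULL && max_ids > 0);

	snprintf(path, sizeof(path),
		 "/sys/devices/system/node/node%d/access%d/initiators",
		 target, access_class);

	dir = opendir(path);
	if (!dir)
		return -1;

	while ((de = readdir(dir)) != NULL) {
		int id = parse_node_name(de->d_name);

		if (id >= 0 && n < max_ids)
			ids[n++] = id;
	}

	closedir(dir);
	return n;
}
```

A caller would then still have to check each returned node for CPUs (for instance via that node's cpulist file), since with this series an initiator node is no longer guaranteed to contain any.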
* [PATCH V5 2/4] arm64: Support Generic Initiator only domains 2019-10-04 11:43 [PATCH V5 0/4] ACPI: Support Generic Initiator proximity domains Jonathan Cameron 2019-10-04 11:43 ` [PATCH V5 1/4] ACPI: Support Generic Initiator only domains Jonathan Cameron @ 2019-10-04 11:43 ` Jonathan Cameron 2019-10-04 11:43 ` [PATCH V5 3/4] x86: Support Generic Initiator only proximity domains Jonathan Cameron 2019-10-04 11:43 ` [PATCH V5 4/4] ACPI: Let ACPI know we support Generic Initiator Affinity Structures Jonathan Cameron 3 siblings, 0 replies; 21+ messages in thread From: Jonathan Cameron @ 2019-10-04 11:43 UTC (permalink / raw) To: linux-mm, linux-acpi, linux-kernel, linux-arm-kernel, x86 Cc: Keith Busch, jglisse, Rafael J . Wysocki, linuxarm, Andrew Morton, Dan Williams, Jonathan Cameron The one thing that currently needs doing from an architecture point of view is associating the GI domain with its nearest memory domain. This allows all the standard NUMA aware code to get a 'reasonable' answer. A clever driver might elect to do load balancing etc if there are multiple host / memory domains nearby, but that's a decision for the driver. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> --- arch/arm64/kernel/smp.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c index dc9fe879c279..fd7f6b1cdc84 100644 --- a/arch/arm64/kernel/smp.c +++ b/arch/arm64/kernel/smp.c @@ -713,6 +713,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus) { int err; unsigned int cpu; + unsigned int node; unsigned int this_cpu; init_cpu_topology(); @@ -751,6 +752,13 @@ void __init smp_prepare_cpus(unsigned int max_cpus) set_cpu_present(cpu, true); numa_store_cpu_info(cpu); } + + /* + * Walk the numa domains and set the node to numa memory reference + * for any that are Generic Initiator Only. 
+ */ + for_each_node_state(node, N_GENERIC_INITIATOR) + set_gi_numa_mem(node, local_memory_node(node)); } void (*__smp_cross_call)(const struct cpumask *, unsigned int); -- 2.20.1 ^ permalink raw reply related [flat|nested] 21+ messages in thread
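The idea of associating a GI only domain with its nearest memory domain can be sketched outside the kernel as a search over a distance table. This is only an illustration under stated assumptions: nearest_memory_node(), the example distance values, and the has_memory array are all hypothetical, and the sketch does not reproduce how local_memory_node() is actually implemented in the kernel.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MAX_NODES 4

/*
 * Hypothetical 3-node system: nodes 0 and 1 have CPUs and memory,
 * node 2 contains only a Generic Initiator. Values are made up.
 */
const int example_dist[MAX_NODES][MAX_NODES] = {
	{ 10, 20, 17, 0 },
	{ 20, 10, 21, 0 },
	{ 17, 21, 10, 0 },
};
const bool example_has_memory[MAX_NODES] = { true, true, false };

/*
 * Return the memory-bearing node closest to 'from', or -1 if no node
 * has memory. Ties are resolved in favour of the lowest node id.
 */
int nearest_memory_node(int from, int nr_nodes,
			const int dist[][MAX_NODES],
			const bool has_memory[])
{
	int best = -1;
	int best_dist = INT_MAX;

	assert(from >= 0 && from < nr_nodes && nr_nodes <= MAX_NODES);

	for (int n = 0; n < nr_nodes; n++) {
		if (!has_memory[n])
			continue;
		if (dist[from][n] < best_dist) {
			best_dist = dist[from][n];
			best = n;
		}
	}

	return best;
}
```

With the example table, the GI only node 2 resolves to node 0, while nodes that have their own memory resolve to themselves.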
* [PATCH V5 3/4] x86: Support Generic Initiator only proximity domains 2019-10-04 11:43 [PATCH V5 0/4] ACPI: Support Generic Initiator proximity domains Jonathan Cameron 2019-10-04 11:43 ` [PATCH V5 1/4] ACPI: Support Generic Initiator only domains Jonathan Cameron 2019-10-04 11:43 ` [PATCH V5 2/4] arm64: " Jonathan Cameron @ 2019-10-04 11:43 ` Jonathan Cameron 2019-10-07 14:55 ` Ingo Molnar 2019-10-04 11:43 ` [PATCH V5 4/4] ACPI: Let ACPI know we support Generic Initiator Affinity Structures Jonathan Cameron 3 siblings, 1 reply; 21+ messages in thread From: Jonathan Cameron @ 2019-10-04 11:43 UTC (permalink / raw) To: linux-mm, linux-acpi, linux-kernel, linux-arm-kernel, x86 Cc: Keith Busch, jglisse, Rafael J . Wysocki, linuxarm, Andrew Morton, Dan Williams, Jonathan Cameron Done in a somewhat different fashion to arm64. Here the infrastructure for memoryless domains was already in place. That infrastructure applies just as well to domains that also don't have a CPU, hence it works for Generic Initiator Domains. In common with memoryless domains we only register GI domains if the proximity node is not online. If a domain is already a memory containing domain, or a memoryless domain there is nothing to do just because it also contains a Generic Initiator. 
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> --- arch/x86/include/asm/numa.h | 2 ++ arch/x86/kernel/setup.c | 1 + arch/x86/mm/numa.c | 14 ++++++++++++++ 3 files changed, 17 insertions(+) diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h index bbfde3d2662f..f631467272a3 100644 --- a/arch/x86/include/asm/numa.h +++ b/arch/x86/include/asm/numa.h @@ -62,12 +62,14 @@ extern void numa_clear_node(int cpu); extern void __init init_cpu_to_node(void); extern void numa_add_cpu(int cpu); extern void numa_remove_cpu(int cpu); +extern void init_gi_nodes(void); #else /* CONFIG_NUMA */ static inline void numa_set_node(int cpu, int node) { } static inline void numa_clear_node(int cpu) { } static inline void init_cpu_to_node(void) { } static inline void numa_add_cpu(int cpu) { } static inline void numa_remove_cpu(int cpu) { } +static inline void init_gi_nodes(void) { } #endif /* CONFIG_NUMA */ #ifdef CONFIG_DEBUG_PER_CPU_MAPS diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index cfb533d42371..b6c977907ea5 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -1264,6 +1264,7 @@ void __init setup_arch(char **cmdline_p) prefill_possible_map(); init_cpu_to_node(); + init_gi_nodes(); io_apic_init_mappings(); diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c index 4123100e0eaf..50bf724a425e 100644 --- a/arch/x86/mm/numa.c +++ b/arch/x86/mm/numa.c @@ -733,6 +733,20 @@ static void __init init_memory_less_node(int nid) */ } +/* + * Generic Initiator Nodes may have neither CPU nor Memory. + * At this stage if either of the others were present we would + * already be online. + */ +void __init init_gi_nodes(void) +{ + int nid; + + for_each_node_state(nid, N_GENERIC_INITIATOR) + if (!node_online(nid)) + init_memory_less_node(nid); +} + /* * Setup early cpu_to_node. * -- 2.20.1 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH V5 3/4] x86: Support Generic Initiator only proximity domains 2019-10-04 11:43 ` [PATCH V5 3/4] x86: Support Generic Initiator only proximity domains Jonathan Cameron @ 2019-10-07 14:55 ` Ingo Molnar 2019-10-08 11:17 ` Jonathan Cameron 0 siblings, 1 reply; 21+ messages in thread From: Ingo Molnar @ 2019-10-07 14:55 UTC (permalink / raw) To: Jonathan Cameron Cc: linux-mm, linux-acpi, linux-kernel, linux-arm-kernel, x86, Keith Busch, jglisse, Rafael J . Wysocki, linuxarm, Andrew Morton, Dan Williams * Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: > Done in a somewhat different fashion to arm64. > Here the infrastructure for memoryless domains was already > in place. That infrastructure applies just as well to > domains that also don't have a CPU, hence it works for > Generic Initiator Domains. > > In common with memoryless domains we only register GI domains > if the proximity node is not online. If a domain is already > a memory containing domain, or a memoryless domain there is > nothing to do just because it also contains a Generic Initiator. 
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > --- > arch/x86/include/asm/numa.h | 2 ++ > arch/x86/kernel/setup.c | 1 + > arch/x86/mm/numa.c | 14 ++++++++++++++ > 3 files changed, 17 insertions(+) > > diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h > index bbfde3d2662f..f631467272a3 100644 > --- a/arch/x86/include/asm/numa.h > +++ b/arch/x86/include/asm/numa.h > @@ -62,12 +62,14 @@ extern void numa_clear_node(int cpu); > extern void __init init_cpu_to_node(void); > extern void numa_add_cpu(int cpu); > extern void numa_remove_cpu(int cpu); > +extern void init_gi_nodes(void); > #else /* CONFIG_NUMA */ > static inline void numa_set_node(int cpu, int node) { } > static inline void numa_clear_node(int cpu) { } > static inline void init_cpu_to_node(void) { } > static inline void numa_add_cpu(int cpu) { } > static inline void numa_remove_cpu(int cpu) { } > +static inline void init_gi_nodes(void) { } > #endif /* CONFIG_NUMA */ > > #ifdef CONFIG_DEBUG_PER_CPU_MAPS > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c > index cfb533d42371..b6c977907ea5 100644 > --- a/arch/x86/kernel/setup.c > +++ b/arch/x86/kernel/setup.c > @@ -1264,6 +1264,7 @@ void __init setup_arch(char **cmdline_p) > prefill_possible_map(); > > init_cpu_to_node(); > + init_gi_nodes(); > > io_apic_init_mappings(); > > diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c > index 4123100e0eaf..50bf724a425e 100644 > --- a/arch/x86/mm/numa.c > +++ b/arch/x86/mm/numa.c > @@ -733,6 +733,20 @@ static void __init init_memory_less_node(int nid) > */ > } > > +/* > + * Generic Initiator Nodes may have neither CPU nor Memory. > + * At this stage if either of the others were present we would > + * already be online. > + */ > +void __init init_gi_nodes(void) > +{ > + int nid; > + > + for_each_node_state(nid, N_GENERIC_INITIATOR) > + if (!node_online(nid)) > + init_memory_less_node(nid); > +} Nit: missing curly braces. 
How do these work in practice, will a system that only had nodes 0-1 today grow a third node '2' that won't have any CPUs or memory on them? Thanks, Ingo ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH V5 3/4] x86: Support Generic Initiator only proximity domains 2019-10-07 14:55 ` Ingo Molnar @ 2019-10-08 11:17 ` Jonathan Cameron 0 siblings, 0 replies; 21+ messages in thread From: Jonathan Cameron @ 2019-10-08 11:17 UTC (permalink / raw) To: Ingo Molnar Cc: linux-mm, linux-acpi, linux-kernel, linux-arm-kernel, x86, Keith Busch, jglisse, Rafael J . Wysocki, linuxarm, Andrew Morton, Dan Williams On Mon, 7 Oct 2019 16:55:05 +0200 Ingo Molnar <mingo@kernel.org> wrote: > * Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: > > > Done in a somewhat different fashion to arm64. > > Here the infrastructure for memoryless domains was already > > in place. That infrastructure applies just as well to > > domains that also don't have a CPU, hence it works for > > Generic Initiator Domains. > > > > In common with memoryless domains we only register GI domains > > if the proximity node is not online. If a domain is already > > a memory containing domain, or a memoryless domain there is > > nothing to do just because it also contains a Generic Initiator. 
> > > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > --- > > arch/x86/include/asm/numa.h | 2 ++ > > arch/x86/kernel/setup.c | 1 + > > arch/x86/mm/numa.c | 14 ++++++++++++++ > > 3 files changed, 17 insertions(+) > > > > diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h > > index bbfde3d2662f..f631467272a3 100644 > > --- a/arch/x86/include/asm/numa.h > > +++ b/arch/x86/include/asm/numa.h > > @@ -62,12 +62,14 @@ extern void numa_clear_node(int cpu); > > extern void __init init_cpu_to_node(void); > > extern void numa_add_cpu(int cpu); > > extern void numa_remove_cpu(int cpu); > > +extern void init_gi_nodes(void); > > #else /* CONFIG_NUMA */ > > static inline void numa_set_node(int cpu, int node) { } > > static inline void numa_clear_node(int cpu) { } > > static inline void init_cpu_to_node(void) { } > > static inline void numa_add_cpu(int cpu) { } > > static inline void numa_remove_cpu(int cpu) { } > > +static inline void init_gi_nodes(void) { } > > #endif /* CONFIG_NUMA */ > > > > #ifdef CONFIG_DEBUG_PER_CPU_MAPS > > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c > > index cfb533d42371..b6c977907ea5 100644 > > --- a/arch/x86/kernel/setup.c > > +++ b/arch/x86/kernel/setup.c > > @@ -1264,6 +1264,7 @@ void __init setup_arch(char **cmdline_p) > > prefill_possible_map(); > > > > init_cpu_to_node(); > > + init_gi_nodes(); > > > > io_apic_init_mappings(); > > > > diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c > > index 4123100e0eaf..50bf724a425e 100644 > > --- a/arch/x86/mm/numa.c > > +++ b/arch/x86/mm/numa.c > > @@ -733,6 +733,20 @@ static void __init init_memory_less_node(int nid) > > */ > > } > > > > +/* > > + * Generic Initiator Nodes may have neither CPU nor Memory. > > + * At this stage if either of the others were present we would > > + * already be online. 
> > + */ > > +void __init init_gi_nodes(void) > > +{ > > + int nid; > > + > > + for_each_node_state(nid, N_GENERIC_INITIATOR) > > + if (!node_online(nid)) > > + init_memory_less_node(nid); > > +} > > Nit: missing curly braces. Good point. > > How do these work in practice, will a system that only had nodes 0-1 > today grow a third node '2' that won't have any CPUs or memory on them? Yes. Exactly that. The result is that fallback lists etc work when _PXM is used to assign a device into that new node. The interesting bit comes when a driver goes further and queries the NUMA distances from SLIT. At that point the driver can elect to do load balancing across multiple nodes at similar distances. In theory you can also specify a device you wish to put into the node via the SRAT entry (IIRC using segment + BDF for PCI devices), but for now I haven't implemented that method. > > Thanks, > > Ingo Thanks, Jonathan ^ permalink raw reply [flat|nested] 21+ messages in thread
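The load balancing idea mentioned above, where a driver in a GI only node picks among several nodes at similar SLIT distances, might look roughly like the following. This is a userspace sketch under assumptions, not kernel or driver code: nodes_at_similar_distance(), the 'slack' tolerance, and the example distance table are hypothetical, and a real driver would obtain distances from the kernel's NUMA distance interface rather than a static array.

```c
#include <assert.h>
#include <limits.h>

#define NR_NODES 4

/*
 * Hypothetical SLIT-style table: nodes 0-2 have CPUs and memory,
 * node 3 contains only a Generic Initiator. Values are made up.
 */
const int example_slit[NR_NODES][NR_NODES] = {
	{ 10, 20, 20, 17 },
	{ 20, 10, 20, 18 },
	{ 20, 20, 10, 30 },
	{ 17, 18, 30, 10 },
};

/*
 * Return a bitmask of the nodes (other than 'from') whose distance from
 * 'from' is within 'slack' of the closest one. Bit n set means node n is
 * a candidate for spreading allocations or work.
 */
unsigned int nodes_at_similar_distance(int from,
				       const int dist[NR_NODES][NR_NODES],
				       int slack)
{
	unsigned int mask = 0;
	int min = INT_MAX;

	assert(from >= 0 && from < NR_NODES && slack >= 0);

	/* First pass: find the distance to the closest other node. */
	for (int n = 0; n < NR_NODES; n++)
		if (n != from && dist[from][n] < min)
			min = dist[from][n];

	/* Second pass: keep every node within the tolerance. */
	for (int n = 0; n < NR_NODES; n++)
		if (n != from && dist[from][n] - min <= slack)
			mask |= 1u << n;

	return mask;
}
```

With the example table and a slack of 2, the GI only node 3 would consider nodes 0 and 1 (distances 17 and 18) and ignore node 2 (distance 30); with a slack of 0 only node 0 remains.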
* [PATCH V5 4/4] ACPI: Let ACPI know we support Generic Initiator Affinity Structures 2019-10-04 11:43 [PATCH V5 0/4] ACPI: Support Generic Initiator proximity domains Jonathan Cameron ` (2 preceding siblings ...) 2019-10-04 11:43 ` [PATCH V5 3/4] x86: Support Generic Initiator only proximity domains Jonathan Cameron @ 2019-10-04 11:43 ` Jonathan Cameron 3 siblings, 0 replies; 21+ messages in thread From: Jonathan Cameron @ 2019-10-04 11:43 UTC (permalink / raw) To: linux-mm, linux-acpi, linux-kernel, linux-arm-kernel, x86 Cc: Keith Busch, jglisse, Rafael J . Wysocki, linuxarm, Andrew Morton, Dan Williams, Jonathan Cameron Until we tell ACPI that we support generic initiators, it will have to operate in fall back domain mode and all _PXM entries should be on existing non GI domains. This patch sets the relevant OSC bit to make that happen. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> --- drivers/acpi/bus.c | 1 + include/linux/acpi.h | 1 + 2 files changed, 2 insertions(+) diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c index 48bc96d45bab..9d40e465232f 100644 --- a/drivers/acpi/bus.c +++ b/drivers/acpi/bus.c @@ -302,6 +302,7 @@ static void acpi_bus_osc_support(void) capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_HOTPLUG_OST_SUPPORT; capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_PCLPI_SUPPORT; + capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_GENERIC_INITIATOR_SUPPORT; #ifdef CONFIG_X86 if (boot_cpu_has(X86_FEATURE_HWP)) { diff --git a/include/linux/acpi.h b/include/linux/acpi.h index 8b4e516bac00..195b21e93aaa 100644 --- a/include/linux/acpi.h +++ b/include/linux/acpi.h @@ -505,6 +505,7 @@ acpi_status acpi_run_osc(acpi_handle handle, struct acpi_osc_context *context); #define OSC_SB_PCLPI_SUPPORT 0x00000080 #define OSC_SB_OSLPI_SUPPORT 0x00000100 #define OSC_SB_CPC_DIVERSE_HIGH_SUPPORT 0x00001000 +#define OSC_SB_GENERIC_INITIATOR_SUPPORT 0x00002000 extern bool osc_sb_apei_support_acked; extern bool osc_pc_lpi_support_confirmed; -- 2.20.1 ^ permalink raw reply related 
[flat|nested] 21+ messages in thread
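The effect of patch 4/4 on the _OSC support word can be shown with a small standalone sketch. The two bit values are copied from the include/linux/acpi.h hunk above; everything else is an assumption for illustration: build_support_dword() and firmware_may_use_gi_domains() are hypothetical helpers, the capability buffer is reduced to two dwords, and the index value of OSC_SUPPORT_DWORD is assumed to be 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values as they appear in the include/linux/acpi.h hunk. */
#define OSC_SB_PCLPI_SUPPORT			0x00000080
#define OSC_SB_GENERIC_INITIATOR_SUPPORT	0x00002000

/* Index of the support dword in the capability buffer (assumed). */
#define OSC_SUPPORT_DWORD			1

/* The new bit must not collide with an existing one. */
static_assert((OSC_SB_PCLPI_SUPPORT & OSC_SB_GENERIC_INITIATOR_SUPPORT) == 0,
	      "OSC support bits overlap");

/*
 * Build the support dword the OS would advertise. Only one of the
 * pre-existing bits is modelled here to keep the sketch short.
 */
uint32_t build_support_dword(bool gi_supported)
{
	uint32_t capbuf[2] = { 0, 0 };

	capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_PCLPI_SUPPORT;
	if (gi_supported)
		capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_GENERIC_INITIATOR_SUPPORT;

	return capbuf[OSC_SUPPORT_DWORD];
}

/*
 * Firmware-side view: without the GI bit, _PXM entries are expected to
 * stay on existing non GI domains (the fall back behaviour described in
 * the commit message).
 */
bool firmware_may_use_gi_domains(uint32_t support)
{
	return (support & OSC_SB_GENERIC_INITIATOR_SUPPORT) != 0;
}
```

An OS without this patch corresponds to build_support_dword(false), in which case firmware is expected to stay in fall back domain mode.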
end of thread, other threads:[~2019-11-18 17:19 UTC | newest] Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-10-04 11:43 [PATCH V5 0/4] ACPI: Support Generic Initiator proximity domains Jonathan Cameron 2019-10-04 11:43 ` [PATCH V5 1/4] ACPI: Support Generic Initiator only domains Jonathan Cameron 2019-10-18 10:18 ` Rafael J. Wysocki 2019-10-18 12:46 ` Jonathan Cameron 2019-11-07 14:54 ` Rafael J. Wysocki 2019-11-12 17:07 ` Jonathan Cameron 2019-11-12 17:55 ` Dan Williams 2019-11-13 9:47 ` Jonathan Cameron 2019-11-13 13:57 ` Tao Xu 2019-11-13 16:52 ` Dan Williams 2019-11-13 17:56 ` Jonathan Cameron 2019-11-13 17:48 ` Jonathan Cameron 2019-11-13 23:20 ` Dan Williams 2019-11-14 11:26 ` Jonathan Cameron 2019-11-16 20:45 ` Dan Williams 2019-11-18 17:18 ` Brice Goglin 2019-10-04 11:43 ` [PATCH V5 2/4] arm64: " Jonathan Cameron 2019-10-04 11:43 ` [PATCH V5 3/4] x86: Support Generic Initiator only proximity domains Jonathan Cameron 2019-10-07 14:55 ` Ingo Molnar 2019-10-08 11:17 ` Jonathan Cameron 2019-10-04 11:43 ` [PATCH V5 4/4] ACPI: Let ACPI know we support Generic Initiator Affinity Structures Jonathan Cameron