* [PATCH] x86: Fix an issue with invalid ACPI numa config
@ 2018-11-15 11:06 Jonathan Cameron
From: Jonathan Cameron @ 2018-11-15 11:06 UTC
  To: helgaas, linux-pci
  Cc: linuxarm, Ingo Molnar, linux-acpi, Dave Hansen, Andy Lutomirski,
	Peter Zijlstra, x86, Jonathan Cameron

The addition of support for reading the NUMA node of a PCI
card from _PXM resulted in Martin's system not booting.
Looking at the ACPI tables, there are _PXM entries for the
root ports but no SRAT table.

The absence of an SRAT table results in dummy_numa_init()
being called.  However, unlike on arm64, this does not then
result in numa_off being set.  When the PCI code later calls
acpi_get_node() for any PCI card below the root port, it
walks up the ACPI tree until it finds the _PXM value in the
root port.  This value is then passed to
acpi_map_pxm_to_node().  If numa_off is set, this returns
NUMA_NO_NODE (as it does on arm64).  On x86, where numa_off
is not set, it instead tries to allocate a NUMA node from the
unused set without setting up all the infrastructure that
would normally accompany such a call.  We have not identified
exactly which driver is causing the subsequent hang for
Martin.
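
To make the difference concrete, here is a rough userspace model of
the mapping step described above.  It is a sketch, not the kernel
code: the model_* names, the tiny table sizes, and the assumption
that allocated nodes are tracked separately from online nodes are
all mine, chosen only to show how numa_off gates the allocation.

```c
#include <stdbool.h>

#define NUMA_NO_NODE    (-1)
#define MAX_PXM_DOMAINS 8	/* illustrative size, not the kernel's */
#define MAX_NUMNODES    4	/* illustrative size, not the kernel's */

static bool numa_off;
static int pxm_to_node_map[MAX_PXM_DOMAINS];
static bool node_allocated[MAX_NUMNODES];	/* nodes handed out to a PXM */
static bool node_online[MAX_NUMNODES];		/* nodes that were fully set up */

/*
 * Hypothetical model of the state after dummy_numa_init(): only node 0
 * is set up and no PXM has been mapped.  The argument models whether
 * the proposed "numa_off = true" line is present.
 */
static void model_dummy_numa_init(bool set_numa_off)
{
	for (int i = 0; i < MAX_PXM_DOMAINS; i++)
		pxm_to_node_map[i] = NUMA_NO_NODE;
	for (int i = 0; i < MAX_NUMNODES; i++) {
		node_allocated[i] = false;
		node_online[i] = false;
	}
	node_online[0] = true;
	numa_off = set_numa_off;
}

/* Hypothetical model of the PXM to node lookup. */
static int model_map_pxm_to_node(int pxm)
{
	if (pxm < 0 || pxm >= MAX_PXM_DOMAINS || numa_off)
		return NUMA_NO_NODE;

	if (pxm_to_node_map[pxm] != NUMA_NO_NODE)
		return pxm_to_node_map[pxm];

	/* No mapping yet: hand out the first unused node number. */
	for (int node = 0; node < MAX_NUMNODES; node++) {
		if (!node_allocated[node]) {
			node_allocated[node] = true;
			pxm_to_node_map[pxm] = node;
			return node;
		}
	}
	return NUMA_NO_NODE;
}

/* True if the node went through the normal setup in this model. */
static bool model_node_is_online(int node)
{
	return node >= 0 && node < MAX_NUMNODES && node_online[node];
}
```

In this model, calling model_dummy_numa_init(false) and then mapping
two distinct PXM values hands the second one a node number that was
never brought online, which is the kind of half-initialised state
suspected here.  After model_dummy_numa_init(true) every lookup
returns NUMA_NO_NODE instead.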

It is invalid under the ACPI spec to specify new NUMA nodes
using _PXM if they have no presence in SRAT.  Thus the
simplest fix is to set numa_off when NUMA is disabled due to
an invalid SRAT (here, not present at all).

I do not have easy access to appropriate x86 numa systems so
would appreciate some testing of this one!

Known problem board setups:

AMD Ryzen Threadripper 2950X on ASROCK X399 TAICHI
MSI X399 SLI PLUS (probably - not confirmed yet)

The PCI patch has been reverted, so this fix is not critical.

Reported-by: Martin Hundebøll <martin@geanix.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Fixes: bad7dcd94f39 ("ACPI/PCI: Pay attention to device-specific _PXM node values")

---
 arch/x86/mm/numa.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 1308f5408bf7..ce1182f953ff 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -695,6 +695,8 @@ static int __init dummy_numa_init(void)
 	node_set(0, numa_nodes_parsed);
 	numa_add_memblk(0, 0, PFN_PHYS(max_pfn));
 
+	numa_off = true;
+
 	return 0;
 }
 
-- 
2.18.0


