From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: <linux-pci@vger.kernel.org>, <x86@kernel.org>, <helgaas@kernel.org>
Cc: <linuxarm@huawei.com>, Ingo Molnar <mingo@kernel.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andy Lutomirski <luto@kernel.org>,
	"Peter Zijlstra" <peterz@infradead.org>, <martin@geanix.com>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>
Subject: [PATCH V2] x86: Fix an issue with invalid ACPI NUMA config
Date: Tue, 11 Dec 2018 17:47:37 +0800
Message-ID: <20181211094737.71554-1-Jonathan.Cameron@huawei.com>

Adding support for reading the NUMA node of a PCI card from _PXM stopped
Martin's system from booting. Looking at the ACPI tables, it seems that
there are _PXM entries for the root ports, but no SRAT table.

The absence of the SRAT table results in dummy_numa_init() being called.
However, unlike on arm64, this doesn't result in numa_off being set.

When the PCI code later comes along and calls acpi_get_node() for any PCI
card below the root port, it navigates up the ACPI tree until it finds the
_PXM value in the root port. This value is then passed to
acpi_map_pxm_to_node().
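
For illustration only, the upward walk can be modelled in user space as
below. This is a sketch and not the kernel implementation: struct
model_dev, model_get_pxm() and NO_PXM are hypothetical names invented for
this example.

```c
#include <stddef.h>

#define NO_PXM (-1)

/* Hypothetical stand-in for an object in the ACPI namespace. */
struct model_dev {
	const struct model_dev *parent;	/* NULL at the top of the tree */
	int pxm;			/* NO_PXM if the object has no _PXM */
};

/*
 * Walk from the device towards the root and return the first _PXM value
 * found, or NO_PXM if no ancestor provides one.
 */
static int model_get_pxm(const struct model_dev *dev)
{
	for (; dev; dev = dev->parent) {
		if (dev->pxm != NO_PXM)
			return dev->pxm;
	}
	return NO_PXM;
}
```

In this model, a card with no _PXM of its own inherits the value of the
root port above it, which is the case hit on the affected boards.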

As numa_off has not been set on x86, acpi_map_pxm_to_node() tries to
allocate a NUMA node from the unused set, without setting up all the
infrastructure that would normally accompany such a call. We have not
identified exactly which driver is causing the subsequent hang for Martin.

If numa_off had been set, as it is in the equivalent flow on arm64, then
acpi_map_pxm_to_node() would return NUMA_NO_NODE, which is what we want to
happen.
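
A simplified user-space model of that behaviour, for illustration only.
All model_* names and MODEL_* limits are hypothetical, and the real
function tracks its state differently; the only point shown is that the
numa_off check short-circuits the allocation of a new node.

```c
#include <stdbool.h>

#define MODEL_NO_NODE	(-1)
#define MODEL_MAX_PXM	8
#define MODEL_MAX_NODES	4

static bool model_numa_off;
static int model_pxm_to_node[MODEL_MAX_PXM];
static bool model_node_used[MODEL_MAX_NODES];

/* Forget all mappings and choose whether NUMA is treated as off. */
static void model_reset(bool numa_off)
{
	int i;

	model_numa_off = numa_off;
	for (i = 0; i < MODEL_MAX_PXM; i++)
		model_pxm_to_node[i] = MODEL_NO_NODE;
	for (i = 0; i < MODEL_MAX_NODES; i++)
		model_node_used[i] = false;
}

/* Map a proximity domain to a node, allocating an unused node on demand. */
static int model_map_pxm_to_node(int pxm)
{
	int node;

	/* With numa_off set, no node is ever handed out. */
	if (pxm < 0 || pxm >= MODEL_MAX_PXM || model_numa_off)
		return MODEL_NO_NODE;

	if (model_pxm_to_node[pxm] != MODEL_NO_NODE)
		return model_pxm_to_node[pxm];

	/* Take the first node from the unused set, if there is one. */
	for (node = 0; node < MODEL_MAX_NODES; node++) {
		if (!model_node_used[node]) {
			model_node_used[node] = true;
			model_pxm_to_node[pxm] = node;
			return node;
		}
	}
	return MODEL_NO_NODE;
}
```

With model_reset(false), a lookup for a previously unseen _PXM value hands
out a fresh node; with model_reset(true), the same lookup returns
MODEL_NO_NODE.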

It is invalid under the ACPI spec to specify new NUMA nodes using _PXM if
they have no presence in SRAT. Thus the simplest fix is to set numa_off when
NUMA support is disabled due to an invalid SRAT (here not present at all).

I do not have easy access to appropriate x86 NUMA systems so would
appreciate some testing of this one!

Known problem board setups:

AMD Ryzen Threadripper 2950X on ASROCK X399 TAICHI
MSI X399 SLI PLUS (probably - not confirmed yet)

The PCI patch has been reverted, so this fix is not critical.

Reported-by: Martin Hundebøll <martin@geanix.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Fixes: bad7dcd94f39 ("ACPI/PCI: Pay attention to device-specific _PXM node values")
---
Changes since V1:
* Update commit message as suggested by Bjorn Helgaas.
* No functional changes.

 arch/x86/mm/numa.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 1308f5408bf7..ce1182f953ff 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -695,6 +695,8 @@ static int __init dummy_numa_init(void)
 	node_set(0, numa_nodes_parsed);
 	numa_add_memblk(0, 0, PFN_PHYS(max_pfn));
 
+	numa_off = true;
+
 	return 0;
 }
 
-- 
2.19.1


