linux-kernel.vger.kernel.org archive mirror
From: Valentin Schneider <valentin.schneider@arm.com>
To: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"linux-ia64@vger.kernel.org" <linux-ia64@vger.kernel.org>,
	Sergei Trofimovich <slyfox@gentoo.org>,
	debian-ia64 <debian-ia64@lists.debian.org>
Subject: Re: [PATCH 0/1] sched/topology: NUMA distance deduplication
Date: Wed, 17 Mar 2021 19:36:27 +0000
Message-ID: <8735wtr2ro.mognet@arm.com>
In-Reply-To: <255d6b5d-194e-eb0e-ecdd-97477a534441@physik.fu-berlin.de>


Hi,

On 17/03/21 20:04, John Paul Adrian Glaubitz wrote:
> Hi Valentin!
>
>> As pointed out by Barry in [1], there are topologies out there that struggle to
>> go through the NUMA distance deduplicating sort. Included patch is something
>> I wrote back when I started untangling this distance > 2 mess.
>>
>> It's only been lightly tested on some array of QEMU-powered topologies I keep
>> around for this sort of things. I *think* this works out fine with the NODE
>> topology level, but I wouldn't be surprised if I (re)introduced an off-by-one
>> error in there.
>
> This patch causes a regression on my ia64 RX2660 server:
>
> [    0.040000] smp: Brought up 1 node, 4 CPUs
> [    0.040000] Total of 4 processors activated (12713.98 BogoMIPS).
> [    0.044000] ERROR: Invalid distance value range
> [    0.044000]
>
> The machine still seems to boot normally besides the huge amount of spam. Full message
> log below.
>
> Any idea?
>

Harumph!

The expected / valid distance value range (as per the ACPI spec) is
[10, 255], with 10 being the local distance (actually, double-checking the
spec, 255 is supposed to mean "unreachable", but whatever).

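To make that range concrete, here is a minimal sketch of how a single raw
distance value could be classified. The helper slit_classify() and the enum
are made-up names for illustration, not kernel or ACPICA identifiers; only
the values 10 (local) and 255 (unreachable) come from the spec.

```c
/* Boundary values from the ACPI SLIT description. */
#define SLIT_LOCAL_DISTANCE 10   /* distance from a node to itself */
#define SLIT_UNREACHABLE    255  /* no path between the two nodes */

/* Hypothetical classification of one table entry. */
enum slit_kind {
	SLIT_INVALID,   /* outside [10, 255] */
	SLIT_LOCAL,     /* exactly 10 */
	SLIT_REMOTE,    /* 11..254 */
	SLIT_NO_PATH    /* exactly 255 */
};

/* Hypothetical helper: classify a single raw distance value. */
static enum slit_kind slit_classify(int d)
{
	if (d < SLIT_LOCAL_DISTANCE || d > SLIT_UNREACHABLE)
		return SLIT_INVALID;
	if (d == SLIT_LOCAL_DISTANCE)
		return SLIT_LOCAL;
	if (d == SLIT_UNREACHABLE)
		return SLIT_NO_PATH;
	return SLIT_REMOTE;
}
```
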
Now, something in your system is exposing 256 nodes, all of them at distance 0
from one another, which is outside that range - the spam you're seeing is a
printout of

  node_distance(i,j) for all nodes i, j
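
For illustration, a rough userspace sketch of that kind of printout: it
walks a row-major distance table, prints every entry, and counts the ones
outside [10, 255]. table_distance() and dump_and_check() are hypothetical
names standing in for node_distance() and the scheduler's own check; this
is not the kernel code.

```c
#include <stdio.h>

#define MIN_VALID_DISTANCE 10   /* local distance per the ACPI spec */
#define MAX_VALID_DISTANCE 255  /* "unreachable" per the ACPI spec */

/* Hypothetical stand-in for node_distance(i, j): row-major lookup. */
static int table_distance(const int *table, int nr_nodes, int i, int j)
{
	return table[i * nr_nodes + j];
}

/*
 * Print the full nr_nodes x nr_nodes matrix to 'out' and return how many
 * entries fall outside the valid range.
 */
static int dump_and_check(const int *table, int nr_nodes, FILE *out)
{
	int bad = 0;

	for (int i = 0; i < nr_nodes; i++) {
		for (int j = 0; j < nr_nodes; j++) {
			int d = table_distance(table, nr_nodes, i, j);

			fprintf(out, "%3d ", d);
			if (d < MIN_VALID_DISTANCE || d > MAX_VALID_DISTANCE)
				bad++;
		}
		fputc('\n', out);
	}
	return bad;
}
```

Fed a table of all zeroes, like the one your firmware seems to provide,
every single entry gets counted as invalid.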

I see ACPI in your boot logs, so I'm guessing you have a bogus SLIT (the
ACPI table with node distances). You should be able to double-check this
with something like:

$ acpidump > acpi.dump
$ acpixtract -a acpi.dump
$ iasl -d *.dat
$ cat slit.dsl

As for fixing it, I think you have the following options:

a) Complain to your hardware vendor to have them fix the table and ship a
   firmware fix
b) Fix the ACPI table yourself - I've been told it's doable for *some* of
   them, but I've never done that myself
c) Compile your kernel with CONFIG_NUMA=n, as AFAICT you only actually have
   a single node
d) Ignore the warning


c) is clearly not ideal if you want to use a somewhat generic kernel image
across a wide range of machines; d) is also a bit yucky...
