linux-kernel.vger.kernel.org archive mirror
From: Dietmar Eggemann <dietmar.eggemann@arm.com>
To: Valentin Schneider <valentin.schneider@arm.com>,
	Meelis Roos <mroos@linux.ee>, LKML <linux-kernel@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Barry Song <song.bao.hua@hisilicon.com>,
	Mel Gorman <mgorman@suse.de>
Subject: Re: 5.11-rc4+git: Shortest NUMA path spans too many nodes
Date: Thu, 21 Jan 2021 19:53:37 +0100
Message-ID: <f0818204-66d1-bf01-062e-0aeec9ce806d@arm.com>
In-Reply-To: <jhjh7na2lsj.mognet@arm.com>

On 21/01/2021 19:21, Valentin Schneider wrote:
> On 21/01/21 19:39, Meelis Roos wrote:
>>> Could you paste the output of the below?
>>>
>>>    $ cat /sys/devices/system/node/node*/distance
>>
>> 10 12 12 14 14 14 14 16
>> 12 10 14 12 14 14 12 14
>> 12 14 10 14 12 12 14 14
>> 14 12 14 10 12 12 14 14
>> 14 14 12 12 10 14 12 14
>> 14 14 12 12 14 10 14 12
>> 14 12 14 14 12 14 10 12
>> 16 14 14 14 14 12 12 10
>>
> 
> Thanks!
> 
>>
>>> Additionally, booting your system with CONFIG_SCHED_DEBUG=y and
>>> appending 'sched_debug' to your cmdline should yield some extra data.
>>
>> [    0.000000] Linux version 5.11.0-rc4-00015-g45dfb8a5659a (mroos@x4600m2) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.1) #55 SMP Thu Jan 21 19:23:10 EET 2021
>> [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.11.0-rc4-00015-g45dfb8a5659a root=/dev/sda1 ro quiet
> 
> This is missing 'sched_debug' to get the extra topology debug prints (yes
> it needs an extra cmdline argument on top of having CONFIG_SCHED_DEBUG=y),
> but I should be able to generate those locally by feeding QEMU the above
> distance table.

This can be reproduced with QEMU (simplified to only 1 CPU per node):

$ qemu-system-aarch64 -kernel /opt/git/kernel_org/arch/arm64/boot/Image -hda /opt/git/tools/qemu-imgs-manipulator/images/qemu-image-aarch64.img -append 'root=/dev/vda console=ttyAMA0 loglevel=8 sched_debug' -nographic -machine virt,gic-version=max -smp cores=8 -m 512 -cpu cortex-a57 -numa node,cpus=0,nodeid=0 -numa node,cpus=1,nodeid=1, -numa node,cpus=2,nodeid=2, -numa node,cpus=3,nodeid=3, -numa node,cpus=4,nodeid=4, -numa node,cpus=5,nodeid=5, -numa node,cpus=6,nodeid=6, -numa node,cpus=7,nodeid=7, -numa dist,src=0,dst=1,val=12, -numa dist,src=0,dst=2,val=12, -numa dist,src=0,dst=3,val=14, -numa dist,src=0,dst=4,val=14, -numa dist,src=0,dst=5,val=14, -numa dist,src=0,dst=6,val=14, -numa dist,src=0,dst=7,val=16, -numa dist,src=1,dst=2,val=14, -numa dist,src=1,dst=3,val=12, -numa dist,src=1,dst=4,val=14, -numa dist,src=1,dst=5,val=14, -numa dist,src=1,dst=6,val=12, -numa dist,src=1,dst=7,val=14, -numa dist,src=2,dst=3,val=14, -numa dist,src=2,dst=4,val=12, -numa dist,src=2,dst=5,val=12, -numa dist,src=2,dst=6,val=14, -numa dist,src=2,dst=7,val=14, -numa dist,src=3,dst=4,val=12, -numa dist,src=3,dst=5,val=12, -numa dist,src=3,dst=6,val=14, -numa dist,src=3,dst=7,val=14, -numa dist,src=4,dst=5,val=14, -numa dist,src=4,dst=6,val=12, -numa dist,src=4,dst=7,val=14, -numa dist,src=5,dst=6,val=14, -numa dist,src=5,dst=7,val=12, -numa dist,src=6,dst=7,val=12
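
The long list of -numa options above follows mechanically from the distance table, so it could be generated rather than typed. A minimal sketch, assuming one CPU per node as in the command above; `DIST` and `numa_args` are names made up for this illustration. Only one direction per node pair is emitted, as in the original command, on the understanding that QEMU mirrors the distance for the opposite direction.

```python
# Distance table as reported by /sys/devices/system/node/node*/distance.
DIST = [
    [10, 12, 12, 14, 14, 14, 14, 16],
    [12, 10, 14, 12, 14, 14, 12, 14],
    [12, 14, 10, 14, 12, 12, 14, 14],
    [14, 12, 14, 10, 12, 12, 14, 14],
    [14, 14, 12, 12, 10, 14, 12, 14],
    [14, 14, 12, 12, 14, 10, 14, 12],
    [14, 12, 14, 14, 12, 14, 10, 12],
    [16, 14, 14, 14, 14, 12, 12, 10],
]


def numa_args(dist):
    """Return the QEMU -numa arguments for a symmetric distance table.

    Hypothetical helper: one node per CPU, then one 'dist' entry per
    unordered node pair (src < dst).
    """
    n = len(dist)
    args = []
    for node in range(n):
        args += ["-numa", f"node,cpus={node},nodeid={node}"]
    for src in range(n):
        for dst in range(src + 1, n):
            args += ["-numa", f"dist,src={src},dst={dst},val={dist[src][dst]}"]
    return args


if __name__ == "__main__":
    # Paste the output after the other qemu-system-aarch64 options.
    print(" ".join(numa_args(DIST)))
```

For the 8-node table this yields 8 node entries and 28 dist entries, matching the command line above.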

[    0.206628] ------------[ cut here ]------------
[    0.206698] Shortest NUMA path spans too many nodes
[    0.207119] WARNING: CPU: 0 PID: 1 at kernel/sched/topology.c:753 cpu_attach_domain+0x42c/0x87c
[    0.207176] Modules linked in:
[    0.207373] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.11.0-rc2-00010-g65bcf072e20e-dirty #81
[    0.207458] Hardware name: linux,dummy-virt (DT)
[    0.207584] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
[    0.207618] pc : cpu_attach_domain+0x42c/0x87c
[    0.207646] lr : cpu_attach_domain+0x42c/0x87c
[    0.207665] sp : ffff800011fcbbf0
[    0.207679] x29: ffff800011fcbbf0 x28: ffff0000024d8200 
[    0.207735] x27: 0000000000001fef x26: 0000000000001917 
[    0.207755] x25: ffff0000024d8000 x24: 0000000000001917 
[    0.207772] x23: 0000000000000000 x22: ffff800011b69a40 
[    0.207789] x21: ffff0000024d8320 x20: ffff8000116fda80 
[    0.207806] x19: ffff0000024d8000 x18: 0000000000000000 
[    0.207822] x17: 0000000000000000 x16: 00000000bd30d762 
[    0.207838] x15: 0000000000000030 x14: ffffffffffffffff 
[    0.207855] x13: ffff800011b82e08 x12: 00000000000001b9 
[    0.207871] x11: 0000000000000093 x10: ffff800011bdae08 
[    0.207887] x9 : 00000000fffff000 x8 : ffff800011b82e08 
[    0.207922] x7 : ffff800011bdae08 x6 : 0000000000000000 
[    0.207939] x5 : 0000000000000000 x4 : 0000000000000000 
[    0.207955] x3 : 00000000ffffffff x2 : 0000000000000000 
[    0.207972] x1 : 0000000000000000 x0 : ffff000018020000 
[    0.208125] Call trace:
[    0.208230]  cpu_attach_domain+0x42c/0x87c
[    0.208256]  build_sched_domains+0x1238/0x12f4
[    0.208271]  sched_init_domains+0x80/0xb0
[    0.208283]  sched_init_smp+0x30/0x80
[    0.208299]  kernel_init_freeable+0xf4/0x238
[    0.208313]  kernel_init+0x14/0x118
[    0.208328]  ret_from_fork+0x10/0x34
[    0.208507] ---[ end trace 75cafa7c7d1a3d7e ]---
[    0.208706] CPU0 attaching sched-domain(s):
[    0.208756]  domain-0: span=0-2 level=NUMA
[    0.209001]   groups: 0:{ span=0 cap=1017 }, 1:{ span=1 cap=1016 }, 2:{ span=2 cap=1015 }
[    0.209247]   domain-1: span=0-6 level=NUMA
[    0.209280]    groups: 0:{ span=0-2 mask=0 cap=3048 }, 3:{ span=1,3-5 mask=3 cap=4073 }, 6:{ span=1,4,6-7 mask=6 cap=4084 }
[    0.209693] ERROR: groups don't span domain->span
[    0.209703]    domain-2: span=0-7 level=NUMA
[    0.209722]     groups: 0:{ span=0-6 mask=0 cap=7114 }, 7:{ span=1-7 mask=7 cap=7163 }
[    0.210361] CPU1 attaching sched-domain(s):
[    0.210376]  domain-0: span=0-1,3,6 level=NUMA
[    0.210411]   groups: 1:{ span=1 cap=1016 }, 3:{ span=3 cap=1018 }, 6:{ span=6 cap=1017 }, 0:{ span=0 cap=1017 }
[    0.210493]   domain-1: span=0-7 level=NUMA
[    0.210511]    groups: 1:{ span=0-1,3,6 mask=1 cap=4075 }, 2:{ span=0,2,4-5 mask=2 cap=4070 }, 7:{ span=5-7 mask=7 cap=3067 }
[    0.210641] CPU2 attaching sched-domain(s):
[    0.210653]  domain-0: span=0,2,4-5 level=NUMA
[    0.210672]   groups: 2:{ span=2 cap=1015 }, 4:{ span=4 cap=1016 }, 5:{ span=5 cap=1015 }, 0:{ span=0 cap=1017 }
[    0.210752]   domain-1: span=0-7 level=NUMA
[    0.210769]    groups: 2:{ span=0,2,4-5 mask=2 cap=4070 }, 3:{ span=1,3-5 mask=3 cap=4073 }, 6:{ span=1,4,6-7 mask=6 cap=4084 }
[    0.210860] CPU3 attaching sched-domain(s):
[    0.210870]  domain-0: span=1,3-5 level=NUMA
[    0.210887]   groups: 3:{ span=3 cap=1018 }, 4:{ span=4 cap=1016 }, 5:{ span=5 cap=1015 }, 1:{ span=1 cap=1016 }
[    0.210965]   domain-1: span=0-7 level=NUMA
[    0.210981]    groups: 3:{ span=1,3-5 mask=3 cap=4073 }, 6:{ span=1,4,6-7 mask=6 cap=4084 }, 0:{ span=0-2 mask=0 cap=3048 }
[    0.211109] CPU4 attaching sched-domain(s):
[    0.211134]  domain-0: span=2-4,6 level=NUMA
[    0.211151]   groups: 4:{ span=4 cap=1016 }, 6:{ span=6 cap=1017 }, 2:{ span=2 cap=1015 }, 3:{ span=3 cap=1018 }
[    0.211229]   domain-1: span=0-7 level=NUMA
[    0.211245]    groups: 4:{ span=2-4,6 mask=4 cap=4081 }, 5:{ span=2-3,5,7 mask=5 cap=4082 }, 0:{ span=0-2 mask=0 cap=3048 }
[    0.211383] CPU5 attaching sched-domain(s):
[    0.211393]  domain-0: span=2-3,5,7 level=NUMA
[    0.211425]   groups: 5:{ span=5 cap=1015 }, 7:{ span=7 cap=1019 }, 2:{ span=2 cap=1015 }, 3:{ span=3 cap=1018 }
[    0.211506]   domain-1: span=0-7 level=NUMA
[    0.211524]    groups: 5:{ span=2-3,5,7 mask=5 cap=4082 }, 6:{ span=1,4,6-7 mask=6 cap=4084 }, 0:{ span=0-2 mask=0 cap=3048 }
[    0.211618] CPU6 attaching sched-domain(s):
[    0.211628]  domain-0: span=1,4,6-7 level=NUMA
[    0.211645]   groups: 6:{ span=6 cap=1017 }, 7:{ span=7 cap=1019 }, 1:{ span=1 cap=1016 }, 4:{ span=4 cap=1016 }
[    0.211728]   domain-1: span=0-7 level=NUMA
[    0.211745]    groups: 6:{ span=1,4,6-7 mask=6 cap=4084 }, 0:{ span=0-2 mask=0 cap=3048 }, 3:{ span=1,3-5 mask=3 cap=4073 }
[    0.211855] CPU7 attaching sched-domain(s):
[    0.211866]  domain-0: span=5-7 level=NUMA
[    0.211884]   groups: 7:{ span=7 cap=1019 }, 5:{ span=5 cap=1015 }, 6:{ span=6 cap=1017 }
[    0.211949]   domain-1: span=1-7 level=NUMA
[    0.211966]    groups: 7:{ span=5-7 mask=7 cap=3067 }, 1:{ span=0-1,3,6 mask=1 cap=4075 }, 2:{ span=0,2,4-5 mask=2 cap=4070 }
[    0.212047] ERROR: groups don't span domain->span
[    0.212055]    domain-2: span=0-7 level=NUMA
[    0.212072]     groups: 7:{ span=1-7 mask=7 cap=7163 }, 0:{ span=0-6 mask=0 cap=7114 }
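
The spans and groups in the log above can be reproduced from the distance table with a simplified model. This is a sketch of what the log shows, not the kernel's code: a domain span is taken to be all nodes within a given distance, and groups are built by walking the span from the local node and adding the child-level span of every node not yet covered. `span`, `build_groups` and `stray_nodes` are hypothetical names.

```python
DIST = [
    [10, 12, 12, 14, 14, 14, 14, 16],
    [12, 10, 14, 12, 14, 14, 12, 14],
    [12, 14, 10, 14, 12, 12, 14, 14],
    [14, 12, 14, 10, 12, 12, 14, 14],
    [14, 14, 12, 12, 10, 14, 12, 14],
    [14, 14, 12, 12, 14, 10, 14, 12],
    [14, 12, 14, 14, 12, 14, 10, 12],
    [16, 14, 14, 14, 14, 12, 12, 10],
]


def span(node, max_dist):
    """Nodes whose distance from `node` is at most `max_dist`."""
    return {j for j, d in enumerate(DIST[node]) if d <= max_dist}


def build_groups(node, level_dist, child_dist):
    """Simplified group construction for the domain of `node` at `level_dist`.

    Walk the domain span starting at `node` (wrapping around); each node
    that is not yet covered contributes its own child-level span as a group.
    """
    domain = span(node, level_dist)
    n = len(DIST)
    covered, groups = set(), []
    for k in range(n):
        i = (node + k) % n
        if i not in domain or i in covered:
            continue
        group = span(i, child_dist)
        groups.append((i, group))
        covered |= group
    return domain, groups


def stray_nodes(node, level_dist, child_dist):
    """Nodes present in some group but absent from the domain span."""
    domain, groups = build_groups(node, level_dist, child_dist)
    union = set()
    for _, group in groups:
        union |= group
    return union - domain


if __name__ == "__main__":
    for node in range(len(DIST)):
        print(node, sorted(span(node, 12)), sorted(span(node, 14)),
              sorted(stray_nodes(node, 14, 12)))
```

In this model the distance-14 domain of node 0 spans 0-6, but its groups (built from nodes 0, 3 and 6) also pull in node 7; likewise node 7's domain spans 1-7 while its groups pull in node 0. That matches the two "groups don't span domain->span" errors reported for CPU0 and CPU7.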

# cat /sys/devices/system/node/node*/distance
10 12 12 14 14 14 14 16
12 10 14 12 14 14 12 14
12 14 10 14 12 12 14 14
14 12 14 10 12 12 14 14
14 14 12 12 10 14 12 14
14 14 12 12 14 10 14 12
14 12 14 14 12 14 10 12
16 14 14 14 14 12 12 10

The '16' seems to be the culprit. What does such a topology look like?
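
One way to get a picture of it is to read the table as a hop count. A rough sketch, under the assumption (inferred from the table, not from any hardware documentation) that a distance of 12 marks a direct link between two nodes; `LINK`, `neighbours` and `hops` are names invented for this example.

```python
from collections import deque

DIST = [
    [10, 12, 12, 14, 14, 14, 14, 16],
    [12, 10, 14, 12, 14, 14, 12, 14],
    [12, 14, 10, 14, 12, 12, 14, 14],
    [14, 12, 14, 10, 12, 12, 14, 14],
    [14, 14, 12, 12, 10, 14, 12, 14],
    [14, 14, 12, 12, 14, 10, 14, 12],
    [14, 12, 14, 14, 12, 14, 10, 12],
    [16, 14, 14, 14, 14, 12, 12, 10],
]

LINK = 12  # assumption: the smallest remote distance is a direct link


def neighbours(dist, link=LINK):
    """Map each node to the nodes it is assumed to be directly linked to."""
    return {i: [j for j, d in enumerate(row) if d == link]
            for i, row in enumerate(dist)}


def hops(dist, start, link=LINK):
    """Breadth-first search over the assumed links; returns hop counts."""
    adj = neighbours(dist, link)
    seen = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        for nxt in adj[cur]:
            if nxt not in seen:
                seen[nxt] = seen[cur] + 1
                queue.append(nxt)
    return seen


if __name__ == "__main__":
    for node, links in neighbours(DIST).items():
        print(f"node {node}: linked to {links}")
    print("hops from node 0:", hops(DIST, 0))
```

Under that reading, nodes 0 and 7 have two links each and all other nodes have three. Every entry in the table equals 10 + 2 * hops, and nodes 0 and 7 are the only pair that is three hops apart, which is where the '16' comes from.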
