From: David Hildenbrand <david@redhat.com>
To: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@kernel.org>, Andrew Morton <akpm@linux-foundation.org>,
    linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Mel Gorman <mgorman@suse.de>, Vlastimil Babka <vbabka@suse.cz>,
    "Kirill A. Shutemov" <kirill@shutemov.name>, Christopher Lameter <cl@linux.com>,
    Michael Ellerman <mpe@ellerman.id.au>, Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH v2 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline
Date: Tue, 12 May 2020 09:49:05 +0200
Message-ID: <45d50d80-c998-9372-42eb-ca753a7258b9@redhat.com>
In-Reply-To: <20200511174731.GD1961@linux.vnet.ibm.com>

On 11.05.20 19:47, Srikar Dronamraju wrote:
> * David Hildenbrand <david@redhat.com> [2020-05-08 15:42:12]:
>
> Hi David,
>
> Thanks for the steps to try out.
>
>>>
>>> #! /bin/bash
>>> sudo x86_64-softmmu/qemu-system-x86_64 \
>>>     --enable-kvm \
>>>     -m 4G,maxmem=20G,slots=2 \
>>>     -smp sockets=2,cores=2 \
>>>     -numa node,nodeid=0,cpus=0-1,mem=4G -numa node,nodeid=1,cpus=2-3,mem=0G \
>>
>> Sorry, this line has to be
>>
>> -numa node,nodeid=0,cpus=0-3,mem=4G -numa node,nodeid=1,mem=0G \
>>
>>>     -kernel /home/dhildenb/git/linux/arch/x86_64/boot/bzImage \
>>>     -append "console=ttyS0 rd.shell rd.luks=0 rd.lvm=0 rd.md=0 rd.dm=0" \
>>>     -initrd /boot/initramfs-5.2.8-200.fc30.x86_64.img \
>>>     -machine pc,nvdimm \
>>>     -nographic \
>>>     -nodefaults \
>>>     -chardev stdio,id=serial \
>>>     -device isa-serial,chardev=serial \
>>>     -chardev socket,id=monitor,path=/var/tmp/monitor,server,nowait \
>>>     -mon chardev=monitor,mode=readline
>>>
>>> to get a cpu-less and memory-less node 1. Never tried it with node 0.
>>>
>
> I tried
>
> qemu-system-x86_64 -enable-kvm -m 4G,maxmem=20G,slots=2 -smp sockets=2,cores=2 -cpu host -numa node,nodeid=0,cpus=0-3,mem=4G -numa node,nodeid=1,mem=0G -vga none -nographic -serial mon:stdio /home/srikar/fedora.qcow2
>
> and the resulting guest was:
>
> [root@localhost ~]# numactl -H
> available: 1 nodes (0)
> node 0 cpus: 0 1 2 3
> node 0 size: 3927 MB
> node 0 free: 3316 MB
> node distances:
> node   0
>   0:  10
>
> [root@localhost ~]# lscpu
> Architecture:        x86_64
> CPU op-mode(s):      32-bit, 64-bit
> Byte Order:          Little Endian
> Address sizes:       40 bits physical, 48 bits virtual
> CPU(s):              4
> On-line CPU(s) list: 0-3
> Thread(s) per core:  1
> Core(s) per socket:  2
> Socket(s):           2
> NUMA node(s):        1
> Vendor ID:           GenuineIntel
> CPU family:          6
> Model:               46
> Model name:          Intel(R) Xeon(R) CPU X7560 @ 2.27GHz
> Stepping:            6
> CPU MHz:             2260.986
> BogoMIPS:            4521.97
> Virtualization:      VT-x
> Hypervisor vendor:   KVM
> Virtualization type: full
> L1d cache:           32K
> L1i cache:           32K
> L2 cache:            4096K
> L3 cache:            16384K
> NUMA node0 CPU(s):   0-3
> Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni vmx ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer hypervisor lahf_lm cpuid_fault pti ssbd ibrs ibpb tpr_shadow vnmi flexpriority ept vpid tsc_adjust arat umip arch_capabilities
>
> [root@localhost ~]# cat /sys/devices/system/node/online
> 0
> [root@localhost ~]# cat /sys/devices/system/node/possible
> 0-1
>
> ---------------------------------------------------------------------------------
>
> I also tried
>
> qemu-system-x86_64 -enable-kvm -m 4G,maxmem=20G,slots=2 -smp sockets=2,cores=2 -cpu host -numa node,nodeid=1,cpus=0-3,mem=4G -numa node,nodeid=0,mem=0G -vga none -nographic -serial mon:stdio /home/srikar/fedora.qcow2
>
> and the resulting guest was:
>
> [root@localhost ~]# numactl -H
> available: 1 nodes (0)
> node 0 cpus: 0 1 2 3
> node 0 size: 3927 MB
> node 0 free: 3316 MB
> node distances:
> node   0
>   0:  10
>
> [root@localhost ~]# lscpu
> Architecture:        x86_64
> CPU op-mode(s):      32-bit, 64-bit
> Byte Order:          Little Endian
> Address sizes:       40 bits physical, 48 bits virtual
> CPU(s):              4
> On-line CPU(s) list: 0-3
> Thread(s) per core:  1
> Core(s) per socket:  2
> Socket(s):           2
> NUMA node(s):        1
> Vendor ID:           GenuineIntel
> CPU family:          6
> Model:               46
> Model name:          Intel(R) Xeon(R) CPU X7560 @ 2.27GHz
> Stepping:            6
> CPU MHz:             2260.986
> BogoMIPS:            4521.97
> Virtualization:      VT-x
> Hypervisor vendor:   KVM
> Virtualization type: full
> L1d cache:           32K
> L1i cache:           32K
> L2 cache:            4096K
> L3 cache:            16384K
> NUMA node0 CPU(s):   0-3
> Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni vmx ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer hypervisor lahf_lm cpuid_fault pti ssbd ibrs ibpb tpr_shadow vnmi flexpriority ept vpid tsc_adjust arat umip arch_capabilities
>
> [root@localhost ~]# cat /sys/devices/system/node/online
> 0
> [root@localhost ~]# cat /sys/devices/system/node/possible
> 0-1
>
> Even without my patch, with both combinations, I am still unable to see a
> cpuless, memoryless node being online. And the interesting part being even

Yeah, I think on x86 all memory-less and cpu-less nodes are offline by
default. Especially when hot-unplugging CPUs/memory, we set them offline
as well. But as Michal mentioned, the node handling code is complicated
and differs between the various architectures.

> if I mark node 0 as cpuless,memoryless and node 1 as actual node, the system
> somewhere marks node 0 as the actual node.

Is the kernel maybe mapping PXM 1 to node 0 in that case, because it
always requires node 0 to be online/contain memory?
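As an aside: the online/possible files above use the kernel's range-list
format ("0", "0-1", possibly something like "0,2-4" on larger boxes). If
you want to script the "possible but offline" check from a guest, a small
illustrative sketch (not kernel code, just a helper for reading those
sysfs strings):

```python
# Illustrative sketch: parse the range-list format used by
# /sys/devices/system/node/{online,possible} and report node IDs
# that are possible but currently offline.

def parse_range_list(text):
    """Parse a kernel range list such as "0", "0-1" or "0,2-4" into a set."""
    nodes = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            nodes.update(range(int(lo), int(hi) + 1))
        else:
            nodes.add(int(part))
    return nodes

def possible_but_offline(online, possible):
    """Return the sorted list of node IDs that are possible but not online."""
    return sorted(parse_range_list(possible) - parse_range_list(online))

# Mirrors the guest output above: online is "0", possible is "0-1".
print(possible_but_offline("0", "0-1"))  # -> [1]
```

On a real guest one would read the two sysfs files instead of hard-coding
the strings.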
It would be interesting to see what happens if you hotplug a DIMM to
(QEMU) node 0 - whether PXM 0 then gets mapped to node 1 as well.

-- 
Thanks,

David / dhildenb
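For anyone repeating the experiment: the DIMM hotplug suggested above can
be driven from the QEMU monitor. A sketch (untested here), assuming the
guest was started with spare memory slots as in the
`-m 4G,maxmem=20G,slots=2` command lines earlier in the thread, and that
the `id` names are arbitrary:

```
(qemu) object_add memory-backend-ram,id=mem0,size=1G
(qemu) device_add pc-dimm,id=dimm0,memdev=mem0,node=0
```

Re-checking /sys/devices/system/node/online inside the guest afterwards,
along with the SRAT messages in dmesg, should show whether PXM 0 ends up
as Linux node 0 or gets remapped.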