From: Barry Song <21cnbao@gmail.com>
To: bp@alien8.de, catalin.marinas@arm.com, dietmar.eggemann@arm.com,
	gregkh@linuxfoundation.org, hpa@zytor.com, juri.lelli@redhat.com,
	bristot@redhat.com, lenb@kernel.org, mgorman@suse.de,
	mingo@redhat.com, peterz@infradead.org, rjw@rjwysocki.net,
	sudeep.holla@arm.com, tglx@linutronix.de
Cc: aubrey.li@linux.intel.com, bsegall@google.com, guodong.xu@linaro.org,
	jonathan.cameron@huawei.com, liguozhu@hisilicon.com,
	linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, mark.rutland@arm.com,
	msys.mizuma@gmail.com, prime.zeng@hisilicon.com, rostedt@goodmis.org,
	tim.c.chen@linux.intel.com, valentin.schneider@arm.com,
	vincent.guittot@linaro.org, will@kernel.org, x86@kernel.org,
	xuwei5@huawei.com, yangyicong@huawei.com, linuxarm@huawei.com,
	Barry Song <song.bao.hua@hisilicon.com>
Subject: [PATCH 0/3] Represent cluster topology and enable load balance between clusters
Date: Fri, 20 Aug 2021 13:30:05 +1200
Message-ID: <20210820013008.12881-1-21cnbao@gmail.com>

From: Barry Song <song.bao.hua@hisilicon.com>

ARM64 machines like kunpeng920 and x86 machines like Jacobsville have a
level of hardware topology in which some CPU cores, typically 4 cores,
share L3 tags or L2 cache.

That means spreading tasks between clusters will bring more memory
bandwidth and decrease cache contention. But packing tasks within a
cluster might help decrease the latency of cache synchronization.

We have three series to bring up the cluster level scheduler in the
kernel. This is the first series.

1st series (this one): make the kernel aware of clusters, expose clusters
to the sysfs ABI and add SCHED_CLUSTER, which enables load balancing among
clusters and benefits many workloads.
Testing shows this can boost performance significantly. For example, it
improves SPECrate mcf by 25.1% on Jacobsville and mcf by 13.574% on
kunpeng920.
2nd series (packing path): modify wake_affine and let the kernel select
CPUs within the cluster first, before scanning the whole LLC, so that we
can benefit from the lower latency of communication within a single
cluster. This series is much trickier, so we would like to send it after
the 1st series settles down. Prototype here:
https://op-lists.linaro.org/pipermail/linaro-open-discussions/2021-June/000219.html

3rd series: a sysctl from Tim Chen to permit users to enable or disable
the cluster scheduler. Prototype here:
Add run time sysctl to enable/disable cluster scheduling
https://op-lists.linaro.org/pipermail/linaro-open-discussions/2021-July/000258.html

This series is rebased on Greg's driver-core-next with the update in
topology sysfs ABI.

-V1:
 differences with RFC v6
 * rebased on top of the latest update in topology sysfs ABI of Greg's
   driver-core-next
 * removed the wake_affine path modification, which will be sent
   separately as the 2nd series
 * cluster_id is obtained by detecting a valid ID before falling back to
   using the offset
 * lots of benchmark data from both x86 Jacobsville and ARM64 kunpeng920

-RFC v6:
https://lore.kernel.org/lkml/20210420001844.9116-1-song.bao.hua@hisilicon.com/

Barry Song (1):
  scheduler: Add cluster scheduler level in core and related Kconfig for
    ARM64

Jonathan Cameron (1):
  topology: Represent clusters of CPUs within a die

Tim Chen (1):
  scheduler: Add cluster scheduler level for x86

 .../ABI/stable/sysfs-devices-system-cpu       | 15 +++++
 Documentation/admin-guide/cputopology.rst     | 12 ++--
 arch/arm64/Kconfig                            |  7 ++
 arch/arm64/kernel/topology.c                  |  2 +
 arch/x86/Kconfig                              |  8 +++
 arch/x86/include/asm/smp.h                    |  7 ++
 arch/x86/include/asm/topology.h               |  3 +
 arch/x86/kernel/cpu/cacheinfo.c               |  1 +
 arch/x86/kernel/cpu/common.c                  |  3 +
 arch/x86/kernel/smpboot.c                     | 44 +++++++++++-
 drivers/acpi/pptt.c                           | 67 +++++++++++++++++++
 drivers/base/arch_topology.c                  | 14 ++++
 drivers/base/topology.c                       | 10 +++
 include/linux/acpi.h                          |  5 ++
 include/linux/arch_topology.h                 |  5 ++
 include/linux/sched/topology.h                |  7 ++
 include/linux/topology.h                      | 13 ++++
 kernel/sched/topology.c                       |  5 ++
 18 files changed, 223 insertions(+), 5 deletions(-)

-- 
2.25.1
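
For readers who want to inspect the cluster level from userspace once the
topology is exposed through sysfs, the following is a minimal sketch and
not part of the series. It assumes each CPU directory carries a
topology/cluster_cpus_list file in the usual kernel cpulist format (for
example "0-3"); the cover letter above only says that clusters are exposed
to the sysfs ABI, so treat that file name as an assumption and check
Documentation/ABI/stable/sysfs-devices-system-cpu on your kernel. The
helper names parse_cpulist() and clusters() are hypothetical.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel cpulist string such as "0-3,8" into a sorted list."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty input or stray comma
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def clusters(cpu_root="/sys/devices/system/cpu"):
    """Return the distinct cluster CPU sets found under cpu_root, sorted."""
    found = set()
    # "cpu[0-9]*" matches cpu0, cpu12, ... but not cpufreq or cpuidle.
    for cpu_dir in Path(cpu_root).glob("cpu[0-9]*"):
        # Assumed file name; it is absent when no cluster info is exposed.
        path = cpu_dir / "topology" / "cluster_cpus_list"
        if path.is_file():
            found.add(tuple(parse_cpulist(path.read_text())))
    return sorted(found)


if __name__ == "__main__":
    for members in clusters():
        print(members)
```

On a machine without cluster information the script prints nothing, since
the assumed file is simply missing. Passing a different cpu_root makes it
possible to run the same code against a copied or synthetic sysfs tree.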
Thread overview: 16+ messages
2021-08-20  1:30 Barry Song [this message]
2021-08-20  1:30 ` [PATCH 0/3] Represent cluster topology and enable load balance between clusters Barry Song
2021-08-20  1:30 ` [PATCH 1/3] topology: Represent clusters of CPUs within a die Barry Song
2021-08-20  1:30 ` Barry Song
2022-05-06 20:24 ` [BUG] " Jeremy Linton
2022-05-06 20:24 ` Jeremy Linton
2022-05-09 10:15 ` Jonathan Cameron
2022-05-09 10:15 ` Jonathan Cameron
2022-05-10 19:17 ` Darren Hart
2022-05-10 19:17 ` Darren Hart
2021-08-20  1:30 ` [PATCH 2/3] scheduler: Add cluster scheduler level in core and related Kconfig for ARM64 Barry Song
2021-08-20  1:30 ` Barry Song
2021-08-20  1:30 ` [PATCH 3/3] scheduler: Add cluster scheduler level for x86 Barry Song
2021-08-20  1:30 ` Barry Song
2021-08-23 17:49 ` Tim Chen
2021-08-23 17:49 ` Tim Chen