All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] sched: Optimize housekeeping_cpumask in for_each_cpu_and
@ 2021-06-06 13:11 Yuan ZhaoXiong
  2021-06-28 13:58 ` [tip: sched/core] sched: Optimize housekeeping_cpumask() in for_each_cpu_and() tip-bot2 for Yuan ZhaoXiong
  0 siblings, 1 reply; 2+ messages in thread
From: Yuan ZhaoXiong @ 2021-06-06 13:11 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot
  Cc: linux-kernel, lirongqing, yuanzhaoxiong

On a 128 cores AMD machine, there are 8 cores in nohz_full mode, and
the others are used for housekeeping. When many housekeeping cpus are
in idle state, we can observe huge time burn in the loop for searching
nearest busy housekeeper cpu by ftrace.

   9)               |              get_nohz_timer_target() {
   9)               |                housekeeping_test_cpu() {
   9)   0.390 us    |                  housekeeping_get_mask.part.1();
   9)   0.561 us    |                }
   9)   0.090 us    |                __rcu_read_lock();
   9)   0.090 us    |                housekeeping_cpumask();
   9)   0.521 us    |                housekeeping_cpumask();
   9)   0.140 us    |                housekeeping_cpumask();

   ...

   9)   0.500 us    |                housekeeping_cpumask();
   9)               |                housekeeping_any_cpu() {
   9)   0.090 us    |                  housekeeping_get_mask.part.1();
   9)   0.100 us    |                  sched_numa_find_closest();
   9)   0.491 us    |                }
   9)   0.100 us    |                __rcu_read_unlock();
   9) + 76.163 us   |              }

for_each_cpu_and() is a micro function, so in get_nohz_timer_target()
function the
        for_each_cpu_and(i, sched_domain_span(sd),
                housekeeping_cpumask(HK_FLAG_TIMER))
equals to below:
        for (i = -1; i = cpumask_next_and(i, sched_domain_span(sd),
                housekeeping_cpumask(HK_FLAG_TIMER)), i < nr_cpu_ids;)
That will cause that housekeeping_cpumask() will be invoked many times.
The housekeeping_cpumask() function returns a const value, so it is
unnecessary to invoke it every time. This patch can minimize the worst
searching time from ~76us to ~16us in my testing.

Similarly, the find_new_ilb() function has the same problem.

Co-developed-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Yuan ZhaoXiong <yuanzhaoxiong@baidu.com>
---
 kernel/sched/core.c | 6 ++++--
 kernel/sched/fair.c | 6 ++++--
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 98191218d891..14ad3bb36321 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -645,6 +645,7 @@ int get_nohz_timer_target(void)
 {
 	int i, cpu = smp_processor_id(), default_cpu = -1;
 	struct sched_domain *sd;
+	const struct cpumask *hk_mask;
 
 	if (housekeeping_cpu(cpu, HK_FLAG_TIMER)) {
 		if (!idle_cpu(cpu))
@@ -652,10 +653,11 @@ int get_nohz_timer_target(void)
 		default_cpu = cpu;
 	}
 
+	hk_mask = housekeeping_cpumask(HK_FLAG_TIMER);
+
 	rcu_read_lock();
 	for_each_domain(cpu, sd) {
-		for_each_cpu_and(i, sched_domain_span(sd),
-			housekeeping_cpumask(HK_FLAG_TIMER)) {
+		for_each_cpu_and(i, sched_domain_span(sd), hk_mask) {
 			if (cpu == i)
 				continue;
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 794c2cb945f8..d3ecfbf160bf 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10097,9 +10097,11 @@ static inline int on_null_domain(struct rq *rq)
 static inline int find_new_ilb(void)
 {
 	int ilb;
+	const struct cpumask *hk_mask;
 
-	for_each_cpu_and(ilb, nohz.idle_cpus_mask,
-			      housekeeping_cpumask(HK_FLAG_MISC)) {
+	hk_mask = housekeeping_cpumask(HK_FLAG_MISC);
+
+	for_each_cpu_and(ilb, nohz.idle_cpus_mask, hk_mask) {
 
 		if (ilb == smp_processor_id())
 			continue;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* [tip: sched/core] sched: Optimize housekeeping_cpumask() in for_each_cpu_and()
  2021-06-06 13:11 [PATCH] sched: Optimize housekeeping_cpumask in for_each_cpu_and Yuan ZhaoXiong
@ 2021-06-28 13:58 ` tip-bot2 for Yuan ZhaoXiong
  0 siblings, 0 replies; 2+ messages in thread
From: tip-bot2 for Yuan ZhaoXiong @ 2021-06-28 13:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Li RongQing, Yuan ZhaoXiong, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     031e3bd8986fffe31e1ddbf5264cccfe30c9abd7
Gitweb:        https://git.kernel.org/tip/031e3bd8986fffe31e1ddbf5264cccfe30c9abd7
Author:        Yuan ZhaoXiong <yuanzhaoxiong@baidu.com>
AuthorDate:    Sun, 06 Jun 2021 21:11:55 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Mon, 28 Jun 2021 15:42:26 +02:00

sched: Optimize housekeeping_cpumask() in for_each_cpu_and()

On a 128 cores AMD machine, there are 8 cores in nohz_full mode, and
the others are used for housekeeping. When many housekeeping cpus are
in idle state, we can observe huge time burn in the loop for searching
nearest busy housekeeper cpu by ftrace.

   9)               |              get_nohz_timer_target() {
   9)               |                housekeeping_test_cpu() {
   9)   0.390 us    |                  housekeeping_get_mask.part.1();
   9)   0.561 us    |                }
   9)   0.090 us    |                __rcu_read_lock();
   9)   0.090 us    |                housekeeping_cpumask();
   9)   0.521 us    |                housekeeping_cpumask();
   9)   0.140 us    |                housekeeping_cpumask();

   ...

   9)   0.500 us    |                housekeeping_cpumask();
   9)               |                housekeeping_any_cpu() {
   9)   0.090 us    |                  housekeeping_get_mask.part.1();
   9)   0.100 us    |                  sched_numa_find_closest();
   9)   0.491 us    |                }
   9)   0.100 us    |                __rcu_read_unlock();
   9) + 76.163 us   |              }

for_each_cpu_and() is a micro function, so in get_nohz_timer_target()
function the
        for_each_cpu_and(i, sched_domain_span(sd),
                housekeeping_cpumask(HK_FLAG_TIMER))
equals to below:
        for (i = -1; i = cpumask_next_and(i, sched_domain_span(sd),
                housekeeping_cpumask(HK_FLAG_TIMER)), i < nr_cpu_ids;)
That will cause that housekeeping_cpumask() will be invoked many times.
The housekeeping_cpumask() function returns a const value, so it is
unnecessary to invoke it every time. This patch can minimize the worst
searching time from ~76us to ~16us in my testing.

Similarly, the find_new_ilb() function has the same problem.

Co-developed-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Yuan ZhaoXiong <yuanzhaoxiong@baidu.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1622985115-51007-1-git-send-email-yuanzhaoxiong@baidu.com
---
 kernel/sched/core.c | 6 ++++--
 kernel/sched/fair.c | 6 ++++--
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2883c22..0c22cd0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -993,6 +993,7 @@ int get_nohz_timer_target(void)
 {
 	int i, cpu = smp_processor_id(), default_cpu = -1;
 	struct sched_domain *sd;
+	const struct cpumask *hk_mask;
 
 	if (housekeeping_cpu(cpu, HK_FLAG_TIMER)) {
 		if (!idle_cpu(cpu))
@@ -1000,10 +1001,11 @@ int get_nohz_timer_target(void)
 		default_cpu = cpu;
 	}
 
+	hk_mask = housekeeping_cpumask(HK_FLAG_TIMER);
+
 	rcu_read_lock();
 	for_each_domain(cpu, sd) {
-		for_each_cpu_and(i, sched_domain_span(sd),
-			housekeeping_cpumask(HK_FLAG_TIMER)) {
+		for_each_cpu_and(i, sched_domain_span(sd), hk_mask) {
 			if (cpu == i)
 				continue;
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 45edf61..11d2294 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10188,9 +10188,11 @@ static inline int on_null_domain(struct rq *rq)
 static inline int find_new_ilb(void)
 {
 	int ilb;
+	const struct cpumask *hk_mask;
 
-	for_each_cpu_and(ilb, nohz.idle_cpus_mask,
-			      housekeeping_cpumask(HK_FLAG_MISC)) {
+	hk_mask = housekeeping_cpumask(HK_FLAG_MISC);
+
+	for_each_cpu_and(ilb, nohz.idle_cpus_mask, hk_mask) {
 
 		if (ilb == smp_processor_id())
 			continue;

^ permalink raw reply related	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-06-28 13:58 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-06 13:11 [PATCH] sched: Optimize housekeeping_cpumask in for_each_cpu_and Yuan ZhaoXiong
2021-06-28 13:58 ` [tip: sched/core] sched: Optimize housekeeping_cpumask() in for_each_cpu_and() tip-bot2 for Yuan ZhaoXiong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.