linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Add busy loop polling for idle SMT
@ 2021-11-16 11:51 Peng Wang
  2021-11-16 11:51 ` [PATCH] sched/idle: support busy loop polling on idle SMT cpus Peng Wang
  2021-11-17 10:58 ` [PATCH] Add busy loop polling for idle SMT Peter Zijlstra
  0 siblings, 2 replies; 8+ messages in thread
From: Peng Wang @ 2021-11-16 11:51 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot
  Cc: linux-kernel

Now we have cpu_idle_force_poll which uses cpu_relax() waiting for
an arriving IPI, while sometimes busy loop on idle cpu is also
useful to provide consistent pipeline interference for hardware SMT.

When hardware SMT is enabled, the switching between idle and
busy state of one cpu will cause performance fluctuation of
other sibling cpus on the same core.

In pay-for-execution-time scenario, cloud service providers prefer
stable performance data to set stabel price for same workload.
Different execution time of the same workload caused by different
idle or busy state of sibling SMT cpus will make different bills, which
is confused for customers.

Since there is no dynamic CPU time scaling based on SMT pipeline interference,
to coordinate sibling SMT noise no matter whether they are idle or not,
busy loop in idle state can provide approximately consistent pipeline interference.

For example, a workload computing tangent and cotangent will finish in 9071ms when
sibling SMT cpus are idle, and 13299ms when sibling SMT cpus are computiing other workload.
This generate 32% performance fluctuation.

SMT idle polling makes things slower, but we can set bigger cpu quota to make up
a deficiency. This also increase power consumption by 2.2%, which is acceptable.

There may be some other possible solutions, while each has its own problem:
a) disbale hardware SMT, which means half of SMT is unused and more hardware cost.
b) busy loop in a userspace thread, but the cpu usage is confusing.

We propose this patch to discuss the performance fluctuation problem related to SMT
pipeline interference, and any comments are welcome.

Peng Wang (1):
  sched/idle: support busy loop polling on idle SMT cpus

 arch/Kconfig         | 10 ++++++
 kernel/sched/core.c  | 39 ++++++++++++++++++++++
 kernel/sched/idle.c  | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/sched/sched.h | 18 +++++++++++
 4 files changed, 158 insertions(+)

-- 
2.9.5


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] sched/idle: support busy loop polling on idle SMT cpus
  2021-11-16 11:51 [PATCH] Add busy loop polling for idle SMT Peng Wang
@ 2021-11-16 11:51 ` Peng Wang
  2021-11-16 23:08   ` kernel test robot
                     ` (2 more replies)
  2021-11-17 10:58 ` [PATCH] Add busy loop polling for idle SMT Peter Zijlstra
  1 sibling, 3 replies; 8+ messages in thread
From: Peng Wang @ 2021-11-16 11:51 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot
  Cc: linux-kernel

When hardware SMT is enabled, the switching between one cpu's idle and busy
state will cause performance fluctuation of sibling cpus on the same core.

Consistent noise is needed when requiring stable performance on one SMT cpu
regardless of whether the sibling cpus on the same core is idle.

The orginal cpu_idle_force_poll uses cpu_relax() waiting an arriving IPI,
while this smt_idle_force_poll uses busy loop to provide consistent SMT
pipeline interference.

Use a CGroup config to enable busy loop polling for specific tasks.

Interface: cgroup/cpu.smt_idle_poll
 0: default behavior
 1: SMT idle sibling cpus will do busy loop polling

Co-developed-by: Huaixin Chang <changhuaixin@linux.alibaba.com>
Signed-off-by: Huaixin Chang <changhuaixin@linux.alibaba.com>
Co-developed-by: Shanpei Chen <shanpeic@linux.alibaba.com>
Signed-off-by: Shanpei Chen <shanpeic@linux.alibaba.com>
Signed-off-by: Peng Wang <rocking@linux.alibaba.com>
---
 arch/Kconfig         | 10 ++++++
 kernel/sched/core.c  | 39 ++++++++++++++++++++++
 kernel/sched/idle.c  | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/sched/sched.h | 18 +++++++++++
 4 files changed, 158 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index 26b8ed1..805c3b9 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -233,6 +233,16 @@ config GENERIC_SMP_IDLE_THREAD
 config GENERIC_IDLE_POLL_SETUP
 	bool
 
+config SMT_IDLE_POLL
+	bool "SMT idle poll"
+	depends on SCHED_SMT
+	help
+	  When hardware SMT is enabled, the switching between idle or busy
+	  state of one cpu will cause performance fluctuation of other
+	  cpus on the same core. This provides a way to support consistent
+	  SMT noise.
+	  If in doubt, say "N".
+
 config ARCH_HAS_FORTIFY_SOURCE
 	bool
 	help
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3c9b0fd..ef478d6 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4845,6 +4845,7 @@ static struct rq *finish_task_switch(struct task_struct *prev)
 	prev_state = READ_ONCE(prev->__state);
 	vtime_task_switch(prev);
 	perf_event_task_sched_in(prev, current);
+	smt_idle_poll_switch(rq);
 	finish_task(prev);
 	tick_nohz_task_switch();
 	finish_lock_switch(rq);
@@ -10189,6 +10190,30 @@ static u64 cpu_shares_read_u64(struct cgroup_subsys_state *css,
 	return (u64) scale_load_down(tg->shares);
 }
 
+#ifdef CONFIG_SMT_IDLE_POLL
+static atomic_t smt_idle_poll_count;
+static int smt_idle_poll_write_u64(struct cgroup_subsys_state *css,
+				struct cftype *cftype, u64 smt_idle_poll)
+{
+	if (smt_idle_poll == 1) {
+		if (atomic_inc_return(&smt_idle_poll_count) == 1)
+			static_branch_enable(&__smt_idle_poll_enabled);
+	} else if (smt_idle_poll == 0) {
+		if (atomic_dec_return(&smt_idle_poll_count) == 0)
+			static_branch_disable(&__smt_idle_poll_enabled);
+	} else
+		return -EINVAL;
+	css_tg(css)->need_smt_idle_poll = smt_idle_poll;
+	return 0;
+}
+
+static u64 smt_idle_poll_read_u64(struct cgroup_subsys_state *css,
+			       struct cftype *cft)
+{
+	return (u64)css_tg(css)->need_smt_idle_poll;
+}
+#endif
+
 #ifdef CONFIG_CFS_BANDWIDTH
 static DEFINE_MUTEX(cfs_constraints_mutex);
 
@@ -10565,6 +10590,13 @@ static struct cftype cpu_legacy_files[] = {
 		.read_s64 = cpu_idle_read_s64,
 		.write_s64 = cpu_idle_write_s64,
 	},
+#ifdef CONFIG_SMT_IDLE_POLL
+	{
+		.name = "smt_idle_poll",
+		.read_u64 = smt_idle_poll_read_u64,
+		.write_u64 = smt_idle_poll_write_u64,
+	},
+#endif
 #endif
 #ifdef CONFIG_CFS_BANDWIDTH
 	{
@@ -10782,6 +10814,13 @@ static struct cftype cpu_files[] = {
 		.read_s64 = cpu_idle_read_s64,
 		.write_s64 = cpu_idle_write_s64,
 	},
+#ifdef CONFIG_SMT_IDLE_POLL
+	{
+		.name = "smt_idle_poll",
+		.read_u64 = smt_idle_poll_read_u64,
+		.write_u64 = smt_idle_poll_write_u64,
+	},
+#endif
 #endif
 #ifdef CONFIG_CFS_BANDWIDTH
 	{
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index d17b0a5..bfb8f56 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -52,6 +52,90 @@ static int __init cpu_idle_nopoll_setup(char *__unused)
 __setup("hlt", cpu_idle_nopoll_setup);
 #endif
 
+#ifdef CONFIG_SMT_IDLE_POLL
+DEFINE_STATIC_KEY_FALSE(__smt_idle_poll_enabled);
+
+void smt_idle_poll_switch(struct rq *rq)
+{
+	struct rq *smt_rq;
+	struct task_group *tg;
+	int smt, cpu;
+
+	if (!static_branch_unlikely(&__smt_idle_poll_enabled))
+		return;
+
+	if (rq->idle == current) {
+		rq->need_smt_idle_poll = false;
+		return;
+	}
+
+	rcu_read_lock();
+	tg = container_of(task_css_check(current, cpu_cgrp_id, true),
+		struct task_group, css);
+	rq->need_smt_idle_poll = tg->need_smt_idle_poll;
+	rcu_read_unlock();
+
+	if (!rq->need_smt_idle_poll)
+		return;
+
+	cpu = rq->cpu;
+	for_each_cpu(smt, cpu_smt_mask(cpu)) {
+		if (cpu == smt || !cpu_online(smt))
+			continue;
+		smt_rq = cpu_rq(smt);
+		if (smt_rq->idle == smt_rq->curr && !smt_rq->in_smt_idle_poll)
+			smp_send_reschedule(smt);
+	}
+}
+
+/**
+ * need_smt_idle_poll - Decide whether to continue SMT idle polling.
+ */
+static bool need_smt_idle_poll(void)
+{
+	int cpu, ht;
+	struct rq *rq;
+
+	if (!static_branch_unlikely(&__smt_idle_poll_enabled))
+		return false;
+
+	if (tif_need_resched())
+		return false;
+
+	cpu = smp_processor_id();
+	for_each_cpu(ht, cpu_smt_mask(cpu)) {
+		if (cpu == ht || !cpu_online(ht))
+			continue;
+
+		rq = cpu_rq(ht);
+		if (rq->need_smt_idle_poll)
+			return true;
+	}
+
+	return false;
+}
+
+static noinline int __cpuidle smt_idle_poll(void)
+{
+	trace_cpu_idle(0, smp_processor_id());
+	stop_critical_timings();
+	rcu_idle_enter();
+	local_irq_enable();
+
+	this_rq()->in_smt_idle_poll = true;
+	while (need_smt_idle_poll())
+		;
+	this_rq()->in_smt_idle_poll = false;
+
+	rcu_idle_exit();
+	start_critical_timings();
+	trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
+
+	return 1;
+}
+
+#endif
+
 static noinline int __cpuidle cpu_idle_poll(void)
 {
 	trace_cpu_idle(0, smp_processor_id());
@@ -293,6 +377,13 @@ static void do_idle(void)
 		arch_cpu_idle_enter();
 		rcu_nocb_flush_deferred_wakeup();
 
+#ifdef CONFIG_SMT_IDLE_POLL
+		if (need_smt_idle_poll()) {
+			tick_nohz_idle_restart_tick();
+			smt_idle_poll();
+		}
+#endif
+
 		/*
 		 * In poll mode we reenable interrupts and spin. Also if we
 		 * detected in the wakeup from idle path that the tick
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 0e66749..58b3af3 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -401,6 +401,10 @@ struct task_group {
 	/* A positive value indicates that this is a SCHED_IDLE group. */
 	int			idle;
 
+#ifdef CONFIG_SMT_IDLE_POLL
+	bool			need_smt_idle_poll;
+#endif
+
 #ifdef	CONFIG_SMP
 	/*
 	 * load_avg can be heavily contended at clock tick time, so put
@@ -1114,6 +1118,11 @@ struct rq {
 	unsigned char		core_forceidle;
 	unsigned int		core_forceidle_seq;
 #endif
+
+#ifdef CONFIG_SMT_IDLE_POLL
+	bool			need_smt_idle_poll;
+	bool			in_smt_idle_poll;
+#endif
 };
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
@@ -1302,6 +1311,15 @@ static inline bool sched_group_cookie_match(struct rq *rq,
 }
 #endif /* CONFIG_SCHED_CORE */
 
+#ifdef CONFIG_SMT_IDLE_POLL
+DECLARE_STATIC_KEY_FALSE(__smt_idle_poll_enabled);
+extern void smt_idle_poll_switch(struct rq *rq);
+#else
+static inline void smt_idle_poll_switch(struct rq *rq)
+{
+}
+#endif
+
 static inline void lockdep_assert_rq_held(struct rq *rq)
 {
 	lockdep_assert_held(__rq_lockp(rq));
-- 
2.9.5


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] sched/idle: support busy loop polling on idle SMT cpus
  2021-11-16 11:51 ` [PATCH] sched/idle: support busy loop polling on idle SMT cpus Peng Wang
@ 2021-11-16 23:08   ` kernel test robot
  2021-11-18 11:36   ` kernel test robot
  2021-11-27 15:50   ` kernel test robot
  2 siblings, 0 replies; 8+ messages in thread
From: kernel test robot @ 2021-11-16 23:08 UTC (permalink / raw)
  To: Peng Wang, mingo, peterz, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, bristot
  Cc: llvm, kbuild-all, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4565 bytes --]

Hi Peng,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/sched/core]
[also build test ERROR on tip/master linux/master linus/master v5.16-rc1 next-20211116]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Peng-Wang/sched-idle-support-busy-loop-polling-on-idle-SMT-cpus/20211116-195600
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 8ea9183db4ad8afbcb7089a77c23eaf965b0cacd
config: i386-randconfig-r034-20211116 (attached as .config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project fbe72e41b99dc7994daac300d208a955be3e4a0a)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/11c6a63e5c0f496b43058351f1e2a488ee27a5ec
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Peng-Wang/sched-idle-support-busy-loop-polling-on-idle-SMT-cpus/20211116-195600
        git checkout 11c6a63e5c0f496b43058351f1e2a488ee27a5ec
        # save the attached .config to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> kernel/sched/idle.c:73:44: error: use of undeclared identifier 'cpu_cgrp_id'
           tg = container_of(task_css_check(current, cpu_cgrp_id, true),
                                                     ^
>> kernel/sched/idle.c:73:44: error: use of undeclared identifier 'cpu_cgrp_id'
>> kernel/sched/idle.c:73:44: error: use of undeclared identifier 'cpu_cgrp_id'
>> kernel/sched/idle.c:73:7: error: offsetof of incomplete type 'struct task_group'
           tg = container_of(task_css_check(current, cpu_cgrp_id, true),
                ^
   include/linux/kernel.h:498:21: note: expanded from macro 'container_of'
           ((type *)(__mptr - offsetof(type, member))); })
                              ^        ~~~~
   include/linux/stddef.h:17:32: note: expanded from macro 'offsetof'
   #define offsetof(TYPE, MEMBER)  __compiler_offsetof(TYPE, MEMBER)
                                   ^                   ~~~~
   include/linux/compiler_types.h:140:35: note: expanded from macro '__compiler_offsetof'
   #define __compiler_offsetof(a, b)       __builtin_offsetof(a, b)
                                           ^                  ~
   include/linux/sched.h:69:8: note: forward declaration of 'struct task_group'
   struct task_group;
          ^
>> kernel/sched/idle.c:73:5: error: assigning to 'struct task_group *' from incompatible type 'void'
           tg = container_of(task_css_check(current, cpu_cgrp_id, true),
              ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> kernel/sched/idle.c:75:29: error: incomplete definition of type 'struct task_group'
           rq->need_smt_idle_poll = tg->need_smt_idle_poll;
                                    ~~^
   include/linux/sched.h:69:8: note: forward declaration of 'struct task_group'
   struct task_group;
          ^
   6 errors generated.


vim +/cpu_cgrp_id +73 kernel/sched/idle.c

    57	
    58	void smt_idle_poll_switch(struct rq *rq)
    59	{
    60		struct rq *smt_rq;
    61		struct task_group *tg;
    62		int smt, cpu;
    63	
    64		if (!static_branch_unlikely(&__smt_idle_poll_enabled))
    65			return;
    66	
    67		if (rq->idle == current) {
    68			rq->need_smt_idle_poll = false;
    69			return;
    70		}
    71	
    72		rcu_read_lock();
  > 73		tg = container_of(task_css_check(current, cpu_cgrp_id, true),
    74			struct task_group, css);
  > 75		rq->need_smt_idle_poll = tg->need_smt_idle_poll;
    76		rcu_read_unlock();
    77	
    78		if (!rq->need_smt_idle_poll)
    79			return;
    80	
    81		cpu = rq->cpu;
    82		for_each_cpu(smt, cpu_smt_mask(cpu)) {
    83			if (cpu == smt || !cpu_online(smt))
    84				continue;
    85			smt_rq = cpu_rq(smt);
    86			if (smt_rq->idle == smt_rq->curr && !smt_rq->in_smt_idle_poll)
    87				smp_send_reschedule(smt);
    88		}
    89	}
    90	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 39128 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Add busy loop polling for idle SMT
  2021-11-16 11:51 [PATCH] Add busy loop polling for idle SMT Peng Wang
  2021-11-16 11:51 ` [PATCH] sched/idle: support busy loop polling on idle SMT cpus Peng Wang
@ 2021-11-17 10:58 ` Peter Zijlstra
  2021-11-19  3:19   ` Peng Wang
  1 sibling, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2021-11-17 10:58 UTC (permalink / raw)
  To: Peng Wang
  Cc: mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
	bsegall, mgorman, bristot, linux-kernel

On Tue, Nov 16, 2021 at 07:51:35PM +0800, Peng Wang wrote:
> Now we have cpu_idle_force_poll which uses cpu_relax() waiting for
> an arriving IPI, while sometimes busy loop on idle cpu is also
> useful to provide consistent pipeline interference for hardware SMT.
> 
> When hardware SMT is enabled, the switching between idle and
> busy state of one cpu will cause performance fluctuation of
> other sibling cpus on the same core.
> 
> In pay-for-execution-time scenario, cloud service providers prefer
> stable performance data to set stabel price for same workload.
> Different execution time of the same workload caused by different
> idle or busy state of sibling SMT cpus will make different bills, which
> is confused for customers.
> 
> Since there is no dynamic CPU time scaling based on SMT pipeline interference,
> to coordinate sibling SMT noise no matter whether they are idle or not,
> busy loop in idle state can provide approximately consistent pipeline interference.
> 
> For example, a workload computing tangent and cotangent will finish in 9071ms when
> sibling SMT cpus are idle, and 13299ms when sibling SMT cpus are computiing other workload.
> This generate 32% performance fluctuation.
> 
> SMT idle polling makes things slower, but we can set bigger cpu quota to make up
> a deficiency. This also increase power consumption by 2.2%, which is acceptable.
> 
> There may be some other possible solutions, while each has its own problem:
> a) disbale hardware SMT, which means half of SMT is unused and more hardware cost.
> b) busy loop in a userspace thread, but the cpu usage is confusing.
> 
> We propose this patch to discuss the performance fluctuation problem related to SMT
> pipeline interference, and any comments are welcome.

I think you missed April Fools' Day by a wide margin.

Lowering performance and increasing power usage is a direct
contradiction to sanity. It also doesn't really work as advertised,
if the siblings are competing for AVX resources the performance is a
*lot* lower than when an AVX task is competing against a spinner like
this.



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] sched/idle: support busy loop polling on idle SMT cpus
  2021-11-16 11:51 ` [PATCH] sched/idle: support busy loop polling on idle SMT cpus Peng Wang
  2021-11-16 23:08   ` kernel test robot
@ 2021-11-18 11:36   ` kernel test robot
  2021-11-27 15:50   ` kernel test robot
  2 siblings, 0 replies; 8+ messages in thread
From: kernel test robot @ 2021-11-18 11:36 UTC (permalink / raw)
  To: Peng Wang, mingo, peterz, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, bristot
  Cc: kbuild-all, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 7078 bytes --]

Hi Peng,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/sched/core]
[also build test ERROR on tip/master linux/master linus/master v5.16-rc1 next-20211118]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Peng-Wang/sched-idle-support-busy-loop-polling-on-idle-SMT-cpus/20211116-195600
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 8ea9183db4ad8afbcb7089a77c23eaf965b0cacd
config: sparc-randconfig-p001-20211118 (attached as .config)
compiler: sparc64-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/11c6a63e5c0f496b43058351f1e2a488ee27a5ec
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Peng-Wang/sched-idle-support-busy-loop-polling-on-idle-SMT-cpus/20211116-195600
        git checkout 11c6a63e5c0f496b43058351f1e2a488ee27a5ec
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=sparc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from include/linux/list.h:9,
                    from include/linux/rculist.h:10,
                    from include/linux/pid.h:5,
                    from include/linux/sched.h:14,
                    from kernel/sched/sched.h:5,
                    from kernel/sched/idle.c:9:
   kernel/sched/idle.c: In function 'smt_idle_poll_switch':
>> kernel/sched/idle.c:73:27: error: implicit declaration of function 'task_css_check' [-Werror=implicit-function-declaration]
      73 |         tg = container_of(task_css_check(current, cpu_cgrp_id, true),
         |                           ^~~~~~~~~~~~~~
   include/linux/kernel.h:494:33: note: in definition of macro 'container_of'
     494 |         void *__mptr = (void *)(ptr);                                   \
         |                                 ^~~
   kernel/sched/idle.c:73:51: error: 'cpu_cgrp_id' undeclared (first use in this function); did you mean 'mem_cgroup_id'?
      73 |         tg = container_of(task_css_check(current, cpu_cgrp_id, true),
         |                                                   ^~~~~~~~~~~
   include/linux/kernel.h:494:33: note: in definition of macro 'container_of'
     494 |         void *__mptr = (void *)(ptr);                                   \
         |                                 ^~~
   kernel/sched/idle.c:73:51: note: each undeclared identifier is reported only once for each function it appears in
      73 |         tg = container_of(task_css_check(current, cpu_cgrp_id, true),
         |                                                   ^~~~~~~~~~~
   include/linux/kernel.h:494:33: note: in definition of macro 'container_of'
     494 |         void *__mptr = (void *)(ptr);                                   \
         |                                 ^~~
   In file included from <command-line>:
   include/linux/kernel.h:495:58: error: invalid use of undefined type 'struct task_group'
     495 |         BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) &&   \
         |                                                          ^~
   include/linux/compiler_types.h:302:23: note: in definition of macro '__compiletime_assert'
     302 |                 if (!(condition))                                       \
         |                       ^~~~~~~~~
   include/linux/compiler_types.h:322:9: note: in expansion of macro '_compiletime_assert'
     322 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
         |         ^~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
      39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
         |                                     ^~~~~~~~~~~~~~~~~~
   include/linux/kernel.h:495:9: note: in expansion of macro 'BUILD_BUG_ON_MSG'
     495 |         BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) &&   \
         |         ^~~~~~~~~~~~~~~~
   include/linux/kernel.h:495:27: note: in expansion of macro '__same_type'
     495 |         BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) &&   \
         |                           ^~~~~~~~~~~
   kernel/sched/idle.c:73:14: note: in expansion of macro 'container_of'
      73 |         tg = container_of(task_css_check(current, cpu_cgrp_id, true),
         |              ^~~~~~~~~~~~
   include/linux/compiler_types.h:140:41: error: invalid use of undefined type 'struct task_group'
     140 | #define __compiler_offsetof(a, b)       __builtin_offsetof(a, b)
         |                                         ^~~~~~~~~~~~~~~~~~
   include/linux/stddef.h:17:33: note: in expansion of macro '__compiler_offsetof'
      17 | #define offsetof(TYPE, MEMBER)  __compiler_offsetof(TYPE, MEMBER)
         |                                 ^~~~~~~~~~~~~~~~~~~
   include/linux/kernel.h:498:28: note: in expansion of macro 'offsetof'
     498 |         ((type *)(__mptr - offsetof(type, member))); })
         |                            ^~~~~~~~
   kernel/sched/idle.c:73:14: note: in expansion of macro 'container_of'
      73 |         tg = container_of(task_css_check(current, cpu_cgrp_id, true),
         |              ^~~~~~~~~~~~
   kernel/sched/idle.c:75:36: error: invalid use of undefined type 'struct task_group'
      75 |         rq->need_smt_idle_poll = tg->need_smt_idle_poll;
         |                                    ^~
   cc1: some warnings being treated as errors


vim +/task_css_check +73 kernel/sched/idle.c

    57	
    58	void smt_idle_poll_switch(struct rq *rq)
    59	{
    60		struct rq *smt_rq;
    61		struct task_group *tg;
    62		int smt, cpu;
    63	
    64		if (!static_branch_unlikely(&__smt_idle_poll_enabled))
    65			return;
    66	
    67		if (rq->idle == current) {
    68			rq->need_smt_idle_poll = false;
    69			return;
    70		}
    71	
    72		rcu_read_lock();
  > 73		tg = container_of(task_css_check(current, cpu_cgrp_id, true),
    74			struct task_group, css);
    75		rq->need_smt_idle_poll = tg->need_smt_idle_poll;
    76		rcu_read_unlock();
    77	
    78		if (!rq->need_smt_idle_poll)
    79			return;
    80	
    81		cpu = rq->cpu;
    82		for_each_cpu(smt, cpu_smt_mask(cpu)) {
    83			if (cpu == smt || !cpu_online(smt))
    84				continue;
    85			smt_rq = cpu_rq(smt);
    86			if (smt_rq->idle == smt_rq->curr && !smt_rq->in_smt_idle_poll)
    87				smp_send_reschedule(smt);
    88		}
    89	}
    90	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 38159 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Add busy loop polling for idle SMT
  2021-11-17 10:58 ` [PATCH] Add busy loop polling for idle SMT Peter Zijlstra
@ 2021-11-19  3:19   ` Peng Wang
  2021-11-19  9:25     ` Peter Zijlstra
  0 siblings, 1 reply; 8+ messages in thread
From: Peng Wang @ 2021-11-19  3:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
	bsegall, mgorman, bristot, linux-kernel

On 17/11/2021 18:58, , Peter Zijlstra wrote:
> On Tue, Nov 16, 2021 at 07:51:35PM +0800, Peng Wang wrote:
>> Now we have cpu_idle_force_poll which uses cpu_relax() waiting for
>> an arriving IPI, while sometimes busy loop on idle cpu is also
>> useful to provide consistent pipeline interference for hardware SMT.
>>
>> When hardware SMT is enabled, the switching between idle and
>> busy state of one cpu will cause performance fluctuation of
>> other sibling cpus on the same core.
>>
>> In pay-for-execution-time scenario, cloud service providers prefer
>> stable performance data to set stabel price for same workload.
>> Different execution time of the same workload caused by different
>> idle or busy state of sibling SMT cpus will make different bills, which
>> is confused for customers.
>>
>> Since there is no dynamic CPU time scaling based on SMT pipeline interference,
>> to coordinate sibling SMT noise no matter whether they are idle or not,
>> busy loop in idle state can provide approximately consistent pipeline interference.
>>
>> For example, a workload computing tangent and cotangent will finish in 9071ms when
>> sibling SMT cpus are idle, and 13299ms when sibling SMT cpus are computiing other workload.
>> This generate 32% performance fluctuation.
>>
>> SMT idle polling makes things slower, but we can set bigger cpu quota to make up
>> a deficiency. This also increase power consumption by 2.2%, which is acceptable.
>>
>> There may be some other possible solutions, while each has its own problem:
>> a) disbale hardware SMT, which means half of SMT is unused and more hardware cost.
>> b) busy loop in a userspace thread, but the cpu usage is confusing.
>>
>> We propose this patch to discuss the performance fluctuation problem related to SMT
>> pipeline interference, and any comments are welcome.
> 
> I think you missed April Fools' Day by a wide margin.
> 
> Lowering performance and increasing power usage is a direct

Siblings' noise depends on workloads, when persuing performance 
stability, we have to consider what performance data to keep:

a) the worst with all-time noise
b) the best with monopolizing a whole core by disabling SMT or using 
core scheduling, while wasting some logic CPUs
c) A number between the worst and the best which is hard to decide

That's where lowering performance comes from.

> contradiction to sanity. It also doesn't really work as advertised,
> if the siblings are competing for AVX resources the performance is a
> *lot* lower than when an AVX task is competing against a spinner like
> this.
> 

Yes, idle SMT busy loop polling can only provide approximately pipeline 
interference for normal instructions.

When it comes to AVX works, we notice an idea modifing CPU time 
accounting[1], do you think the combination can lead to a feasible
solution, or any other better ideas?

[1] https://www.usenix.org/conference/atc21/presentation/gottschlag

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Add busy loop polling for idle SMT
  2021-11-19  3:19   ` Peng Wang
@ 2021-11-19  9:25     ` Peter Zijlstra
  0 siblings, 0 replies; 8+ messages in thread
From: Peter Zijlstra @ 2021-11-19  9:25 UTC (permalink / raw)
  To: Peng Wang
  Cc: mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
	bsegall, mgorman, bristot, linux-kernel

On Fri, Nov 19, 2021 at 11:19:03AM +0800, Peng Wang wrote:
> 
> Yes, idle SMT busy loop polling can only provide approximately pipeline
> interference for normal instructions.
> 
> When it comes to AVX works, we notice an idea modifing CPU time
> accounting[1], do you think the combination can lead to a feasible
> solution, or any other better ideas?

I'm not at all interested in solving this business model induced brain
trauma.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] sched/idle: support busy loop polling on idle SMT cpus
  2021-11-16 11:51 ` [PATCH] sched/idle: support busy loop polling on idle SMT cpus Peng Wang
  2021-11-16 23:08   ` kernel test robot
  2021-11-18 11:36   ` kernel test robot
@ 2021-11-27 15:50   ` kernel test robot
  2 siblings, 0 replies; 8+ messages in thread
From: kernel test robot @ 2021-11-27 15:50 UTC (permalink / raw)
  To: Peng Wang, mingo, peterz, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, bristot
  Cc: kbuild-all, linux-kernel

Hi Peng,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/sched/core]
[also build test ERROR on tip/master linux/master linus/master v5.16-rc2 next-20211126]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Peng-Wang/sched-idle-support-busy-loop-polling-on-idle-SMT-cpus/20211116-195600
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 8ea9183db4ad8afbcb7089a77c23eaf965b0cacd
config: sparc64-randconfig-r035-20211116 (https://download.01.org/0day-ci/archive/20211127/202111272343.0rQFsj1v-lkp@intel.com/config)
compiler: sparc64-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/11c6a63e5c0f496b43058351f1e2a488ee27a5ec
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Peng-Wang/sched-idle-support-busy-loop-polling-on-idle-SMT-cpus/20211116-195600
        git checkout 11c6a63e5c0f496b43058351f1e2a488ee27a5ec
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=sparc64 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from include/linux/list.h:9,
                    from include/linux/rculist.h:10,
                    from include/linux/pid.h:5,
                    from include/linux/sched.h:14,
                    from kernel/sched/sched.h:5,
                    from kernel/sched/idle.c:9:
   kernel/sched/idle.c: In function 'smt_idle_poll_switch':
>> kernel/sched/idle.c:73:51: error: 'cpu_cgrp_id' undeclared (first use in this function); did you mean 'io_cgrp_id'?
      73 |         tg = container_of(task_css_check(current, cpu_cgrp_id, true),
         |                                                   ^~~~~~~~~~~
   include/linux/kernel.h:494:33: note: in definition of macro 'container_of'
     494 |         void *__mptr = (void *)(ptr);                                   \
         |                                 ^~~
   kernel/sched/idle.c:73:27: note: in expansion of macro 'task_css_check'
      73 |         tg = container_of(task_css_check(current, cpu_cgrp_id, true),
         |                           ^~~~~~~~~~~~~~
   kernel/sched/idle.c:73:51: note: each undeclared identifier is reported only once for each function it appears in
      73 |         tg = container_of(task_css_check(current, cpu_cgrp_id, true),
         |                                                   ^~~~~~~~~~~
   include/linux/kernel.h:494:33: note: in definition of macro 'container_of'
     494 |         void *__mptr = (void *)(ptr);                                   \
         |                                 ^~~
   kernel/sched/idle.c:73:27: note: in expansion of macro 'task_css_check'
      73 |         tg = container_of(task_css_check(current, cpu_cgrp_id, true),
         |                           ^~~~~~~~~~~~~~
   In file included from <command-line>:
   include/linux/kernel.h:495:58: error: invalid use of undefined type 'struct task_group'
     495 |         BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) &&   \
         |                                                          ^~
   include/linux/compiler_types.h:302:23: note: in definition of macro '__compiletime_assert'
     302 |                 if (!(condition))                                       \
         |                       ^~~~~~~~~
   include/linux/compiler_types.h:322:9: note: in expansion of macro '_compiletime_assert'
     322 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
         |         ^~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
      39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
         |                                     ^~~~~~~~~~~~~~~~~~
   include/linux/kernel.h:495:9: note: in expansion of macro 'BUILD_BUG_ON_MSG'
     495 |         BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) &&   \
         |         ^~~~~~~~~~~~~~~~
   include/linux/kernel.h:495:27: note: in expansion of macro '__same_type'
     495 |         BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) &&   \
         |                           ^~~~~~~~~~~
   kernel/sched/idle.c:73:14: note: in expansion of macro 'container_of'
      73 |         tg = container_of(task_css_check(current, cpu_cgrp_id, true),
         |              ^~~~~~~~~~~~
>> include/linux/compiler_types.h:140:41: error: invalid use of undefined type 'struct task_group'
     140 | #define __compiler_offsetof(a, b)       __builtin_offsetof(a, b)
         |                                         ^~~~~~~~~~~~~~~~~~
   include/linux/stddef.h:17:33: note: in expansion of macro '__compiler_offsetof'
      17 | #define offsetof(TYPE, MEMBER)  __compiler_offsetof(TYPE, MEMBER)
         |                                 ^~~~~~~~~~~~~~~~~~~
   include/linux/kernel.h:498:28: note: in expansion of macro 'offsetof'
     498 |         ((type *)(__mptr - offsetof(type, member))); })
         |                            ^~~~~~~~
   kernel/sched/idle.c:73:14: note: in expansion of macro 'container_of'
      73 |         tg = container_of(task_css_check(current, cpu_cgrp_id, true),
         |              ^~~~~~~~~~~~
>> kernel/sched/idle.c:75:36: error: invalid use of undefined type 'struct task_group'
      75 |         rq->need_smt_idle_poll = tg->need_smt_idle_poll;
         |                                    ^~


vim +73 kernel/sched/idle.c

    57	
    58	void smt_idle_poll_switch(struct rq *rq)
    59	{
    60		struct rq *smt_rq;
    61		struct task_group *tg;
    62		int smt, cpu;
    63	
    64		if (!static_branch_unlikely(&__smt_idle_poll_enabled))
    65			return;
    66	
    67		if (rq->idle == current) {
    68			rq->need_smt_idle_poll = false;
    69			return;
    70		}
    71	
    72		rcu_read_lock();
  > 73		tg = container_of(task_css_check(current, cpu_cgrp_id, true),
    74			struct task_group, css);
  > 75		rq->need_smt_idle_poll = tg->need_smt_idle_poll;
    76		rcu_read_unlock();
    77	
    78		if (!rq->need_smt_idle_poll)
    79			return;
    80	
    81		cpu = rq->cpu;
    82		for_each_cpu(smt, cpu_smt_mask(cpu)) {
    83			if (cpu == smt || !cpu_online(smt))
    84				continue;
    85			smt_rq = cpu_rq(smt);
    86			if (smt_rq->idle == smt_rq->curr && !smt_rq->in_smt_idle_poll)
    87				smp_send_reschedule(smt);
    88		}
    89	}
    90	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-11-27 15:53 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-16 11:51 [PATCH] Add busy loop polling for idle SMT Peng Wang
2021-11-16 11:51 ` [PATCH] sched/idle: support busy loop polling on idle SMT cpus Peng Wang
2021-11-16 23:08   ` kernel test robot
2021-11-18 11:36   ` kernel test robot
2021-11-27 15:50   ` kernel test robot
2021-11-17 10:58 ` [PATCH] Add busy loop polling for idle SMT Peter Zijlstra
2021-11-19  3:19   ` Peng Wang
2021-11-19  9:25     ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).