linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V2 0/3] Create sched_select_cpu() and use it in workqueues
@ 2012-09-27  9:04 Viresh Kumar
  2012-09-27  9:04 ` [PATCH V2 1/3] sched: Create sched_select_cpu() to give preferred CPU for power saving Viresh Kumar
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Viresh Kumar @ 2012-09-27  9:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: pjt, paul.mckenney, tglx, tj, suresh.b.siddha, venki, mingo,
	peterz, robin.randhawa, Steve.Bannister, Arvind.Chauhan,
	amit.kucheria, vincent.guittot, linaro-dev, patches,
	Viresh Kumar

Hi All,

This is V2 of my sched_select_cpu() work.

In order to save power, it would be useful to schedule work onto non-IDLE cpus
instead of waking up an IDLE one.

To achieve this, we need scheduler to guide kernel frameworks (like: timers &
workqueues) on which is the most preferred CPU that must be used for these
tasks.

This patchset is about implementing this concept.

- The first patch adds sched_select_cpu() routine which returns the preferred
  cpu which is non-idle. 
- Second patch removes idle_cpu() calls from timer & hrtimer.
- Third patch is about adapting this change in workqueue framework.

Earlier discussions over v1 can be found here:
http://www.mail-archive.com/linaro-dev@lists.linaro.org/msg13342.html

Earlier discussions over this concept were done at last LPC:
http://summit.linuxplumbersconf.org/lpc-2012/meeting/90/lpc2012-sched-timer-workqueue/

Module created for testing this behavior is present here:
http://git.linaro.org/gitweb?p=people/vireshk/module.git;a=summary

Following are the steps followed in test module:
1. Run single work on each cpu
2. This work will start a timer after x (tested with 10) jiffies of delay
3. Timer routine queues a work... (This may be called from idle or non-idle cpu)
   and starts the same timer again STEP 3 is done for n number of times (i.e.
   queuing n works, one after other)
4. All works will call a single routine, which will count following per cpu:
 - Total works processed by a CPU
 - Total works processed by a CPU, which are queued from it
 - Total works processed by a CPU, which aren't queued from it

Setup:
-----
- ARM Vexpress TC2 - big.LITTLE CPU
- Core 0-1: A15, 2-4: A7
- rootfs: linaro-ubuntu-nano

Results:
-------
Without Workqueue Modification, i.e. PATCH 3/3:
[ 2493.022335] Workqueue Analyser: works processsed by CPU0, Total: 1000, Own: 0, migrated: 0
[ 2493.047789] Workqueue Analyser: works processsed by CPU1, Total: 1000, Own: 0, migrated: 0
[ 2493.072918] Workqueue Analyser: works processsed by CPU2, Total: 1000, Own: 0, migrated: 0
[ 2493.098576] Workqueue Analyser: works processsed by CPU3, Total: 1000, Own: 0, migrated: 0
[ 2493.123702] Workqueue Analyser: works processsed by CPU4, Total: 1000, Own: 0, migrated: 0

With Workqueue Modification, i.e. PATCH 3/3:
[ 2493.022335] Workqueue Analyser: works processsed by CPU0, Total: 1002, Own: 999, migrated: 3
[ 2493.047789] Workqueue Analyser: works processsed by CPU1, Total: 998,  Own: 997, migrated: 1
[ 2493.072918] Workqueue Analyser: works processsed by CPU2, Total: 1013, Own: 996, migrated: 17
[ 2493.098576] Workqueue Analyser: works processsed by CPU3, Total: 998,  Own: 993, migrated: 5
[ 2493.123702] Workqueue Analyser: works processsed by CPU4, Total: 989,  Own: 987, migrated: 2

V1->V2
-----
- New SD_* macros removed now and earlier ones used
- sched_select_cpu() rewritten and it includes the check on current cpu's
  idleness.
- cpu_idle() calls from timer and hrtimer removed now.
- Patch 2/3 from V1, removed as it doesn't apply to latest workqueue branch from
  tejun.
- CONFIG_MIGRATE_WQ removed and so is wq_select_cpu()
- sched_select_cpu() called only from __queue_work()
- got tejun/for-3.7 branch in my tree, before making workqueue changes.

Viresh Kumar (3):
  sched: Create sched_select_cpu() to give preferred CPU for power
    saving
  timer: hrtimer: Don't check idle_cpu() before calling
    get_nohz_timer_target()
  workqueue: Schedule work on non-idle cpu instead of current one

 include/linux/sched.h | 16 ++++++++++--
 kernel/hrtimer.c      |  2 +-
 kernel/sched/core.c   | 69 +++++++++++++++++++++++++++++++--------------------
 kernel/timer.c        |  9 ++++---
 kernel/workqueue.c    |  2 +-
 5 files changed, 63 insertions(+), 35 deletions(-)

-- 
1.7.12.rc2.18.g61b472e



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH V2 1/3] sched: Create sched_select_cpu() to give preferred CPU for power saving
  2012-09-27  9:04 [PATCH V2 0/3] Create sched_select_cpu() and use it in workqueues Viresh Kumar
@ 2012-09-27  9:04 ` Viresh Kumar
  2012-09-27  9:04 ` [PATCH V2 2/3] timer: hrtimer: Don't check idle_cpu() before calling get_nohz_timer_target() Viresh Kumar
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Viresh Kumar @ 2012-09-27  9:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: pjt, paul.mckenney, tglx, tj, suresh.b.siddha, venki, mingo,
	peterz, robin.randhawa, Steve.Bannister, Arvind.Chauhan,
	amit.kucheria, vincent.guittot, linaro-dev, patches,
	Viresh Kumar

In order to save power, it would be useful to schedule work onto non-IDLE cpus
instead of waking up an IDLE one.

To achieve this, we need scheduler to guide kernel frameworks (like: timers &
workqueues) on which is the most preferred CPU that must be used for these
tasks.

This routine returns the preferred cpu which is non-idle. It accepts a bitwise
OR of SD_* flags present in linux/sched.h. If the local CPU isn't idle, it is
returned back. If it is idle, then we must look for another CPU which have all
the flags passed as argument as set. Also, as this activity is part of load
balancing only, SD_LOAD_BALANCE must also be set for selected domain.

This patch reuses the code from get_nohz_timer_target() routine, which had
similar implementation. get_nohz_timer_target() is also modified to use
sched_select_cpu() now.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 include/linux/sched.h | 16 ++++++++++--
 kernel/sched/core.c   | 69 +++++++++++++++++++++++++++++++--------------------
 2 files changed, 56 insertions(+), 29 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 0059212..d2722cf 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -272,14 +272,26 @@ extern void init_idle_bootup_task(struct task_struct *idle);
 
 extern int runqueue_is_locked(int cpu);
 
-#if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ)
+#ifdef CONFIG_SMP
+extern int sched_select_cpu(unsigned int sd_flags);
+
+#ifdef CONFIG_NO_HZ
 extern void select_nohz_load_balancer(int stop_tick);
 extern void set_cpu_sd_state_idle(void);
-extern int get_nohz_timer_target(void);
+/*
+ * In the semi idle case, use the nearest busy cpu for migrating timers
+ * from an idle cpu.  This is good for power-savings.
+ *
+ * We don't do similar optimization for completely idle system, as
+ * selecting an idle cpu will add more delays to the timers than intended
+ * (as that cpu's timer base may not be uptodate wrt jiffies etc).
+ */
+#define get_nohz_timer_target() sched_select_cpu(0)
 #else
 static inline void select_nohz_load_balancer(int stop_tick) { }
 static inline void set_cpu_sd_state_idle(void) { }
 #endif
+#endif /* CONFIG_SMP */
 
 /*
  * Only dump TASK_* tasks. (0 for all tasks)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index de97083..d573183 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -542,33 +542,6 @@ void resched_cpu(int cpu)
 
 #ifdef CONFIG_NO_HZ
 /*
- * In the semi idle case, use the nearest busy cpu for migrating timers
- * from an idle cpu.  This is good for power-savings.
- *
- * We don't do similar optimization for completely idle system, as
- * selecting an idle cpu will add more delays to the timers than intended
- * (as that cpu's timer base may not be uptodate wrt jiffies etc).
- */
-int get_nohz_timer_target(void)
-{
-	int cpu = smp_processor_id();
-	int i;
-	struct sched_domain *sd;
-
-	rcu_read_lock();
-	for_each_domain(cpu, sd) {
-		for_each_cpu(i, sched_domain_span(sd)) {
-			if (!idle_cpu(i)) {
-				cpu = i;
-				goto unlock;
-			}
-		}
-	}
-unlock:
-	rcu_read_unlock();
-	return cpu;
-}
-/*
  * When add_timer_on() enqueues a timer into the timer wheel of an
  * idle CPU then this timer might expire before the next timer event
  * which is scheduled to wake up that CPU. In case of a completely
@@ -639,6 +612,48 @@ void sched_avg_update(struct rq *rq)
 	}
 }
 
+/*
+ * This routine returns the preferred cpu which is non-idle. It accepts a
+ * bitwise OR of SD_* flags present in linux/sched.h. If the local CPU isn't
+ * idle, it is returned back. If it is idle, then we must look for another CPU
+ * which have all the flags passed as argument as set. Also, as this activity is
+ * part of load balancing only, SD_LOAD_BALANCE must also be set for selected
+ * domain.
+ */
+int sched_select_cpu(unsigned int sd_flags)
+{
+	struct sched_domain *sd;
+	int cpu = smp_processor_id();
+	int i;
+
+	/* If Current cpu isn't idle, don't migrate anything */
+	if (!idle_cpu(cpu))
+		return cpu;
+
+	/* Add SD_LOAD_BALANCE to flags */
+	sd_flags |= SD_LOAD_BALANCE;
+
+	rcu_read_lock();
+	for_each_domain(cpu, sd) {
+		/*
+		 * If sd doesnt' have both sd_flags and SD_LOAD_BALANCE set,
+		 * skip sd.
+		 */
+		if ((sd->flags & sd_flags) != sd_flags)
+			continue;
+
+		for_each_cpu(i, sched_domain_span(sd)) {
+			if (!idle_cpu(i)) {
+				cpu = i;
+				goto unlock;
+			}
+		}
+	}
+unlock:
+	rcu_read_unlock();
+	return cpu;
+}
+
 #else /* !CONFIG_SMP */
 void resched_task(struct task_struct *p)
 {
-- 
1.7.12.rc2.18.g61b472e



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH V2 2/3] timer: hrtimer: Don't check idle_cpu() before calling get_nohz_timer_target()
  2012-09-27  9:04 [PATCH V2 0/3] Create sched_select_cpu() and use it in workqueues Viresh Kumar
  2012-09-27  9:04 ` [PATCH V2 1/3] sched: Create sched_select_cpu() to give preferred CPU for power saving Viresh Kumar
@ 2012-09-27  9:04 ` Viresh Kumar
  2012-09-27 10:31   ` Viresh Kumar
  2012-09-27  9:04 ` [PATCH V2 3/3] workqueue: Schedule work on non-idle cpu instead of current one Viresh Kumar
  2012-10-15  4:38 ` [PATCH V2 0/3] Create sched_select_cpu() and use it in workqueues Viresh Kumar
  3 siblings, 1 reply; 13+ messages in thread
From: Viresh Kumar @ 2012-09-27  9:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: pjt, paul.mckenney, tglx, tj, suresh.b.siddha, venki, mingo,
	peterz, robin.randhawa, Steve.Bannister, Arvind.Chauhan,
	amit.kucheria, vincent.guittot, linaro-dev, patches,
	Viresh Kumar

Check for current cpu's idleness already done in implementation of
sched_select_cpu() which is called by get_nohz_timer_target(). So, no need to
call idle_cpu() twice, once from sched_select_cpu() and once from timer and
hrtimer before calling get_nohz_timer_target().

This patch removes calls to idle_cpu() from timer and hrtimer.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 kernel/hrtimer.c | 2 +-
 kernel/timer.c   | 9 +++++----
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 6db7a5e..74bdaf6 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -159,7 +159,7 @@ struct hrtimer_clock_base *lock_hrtimer_base(const struct hrtimer *timer,
 static int hrtimer_get_target(int this_cpu, int pinned)
 {
 #ifdef CONFIG_NO_HZ
-	if (!pinned && get_sysctl_timer_migration() && idle_cpu(this_cpu))
+	if (!pinned && get_sysctl_timer_migration())
 		return get_nohz_timer_target();
 #endif
 	return this_cpu;
diff --git a/kernel/timer.c b/kernel/timer.c
index d5de1b2..8409d9d 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -720,7 +720,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
 {
 	struct tvec_base *base, *new_base;
 	unsigned long flags;
-	int ret = 0 , cpu;
+	int ret = 0 , cpu = 0;
 
 	timer_stats_timer_set_start_info(timer);
 	BUG_ON(!timer->function);
@@ -733,12 +733,13 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
 
 	debug_activate(timer, expires);
 
-	cpu = smp_processor_id();
-
 #if defined(CONFIG_NO_HZ) && defined(CONFIG_SMP)
-	if (!pinned && get_sysctl_timer_migration() && idle_cpu(cpu))
+	if (!pinned && get_sysctl_timer_migration())
 		cpu = get_nohz_timer_target();
 #endif
+	if (!cpu)
+		cpu = smp_processor_id();
+
 	new_base = per_cpu(tvec_bases, cpu);
 
 	if (base != new_base) {
-- 
1.7.12.rc2.18.g61b472e



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH V2 3/3] workqueue: Schedule work on non-idle cpu instead of current one
  2012-09-27  9:04 [PATCH V2 0/3] Create sched_select_cpu() and use it in workqueues Viresh Kumar
  2012-09-27  9:04 ` [PATCH V2 1/3] sched: Create sched_select_cpu() to give preferred CPU for power saving Viresh Kumar
  2012-09-27  9:04 ` [PATCH V2 2/3] timer: hrtimer: Don't check idle_cpu() before calling get_nohz_timer_target() Viresh Kumar
@ 2012-09-27  9:04 ` Viresh Kumar
  2012-09-28  6:03   ` Viresh Kumar
  2012-09-30  8:54   ` Tejun Heo
  2012-10-15  4:38 ` [PATCH V2 0/3] Create sched_select_cpu() and use it in workqueues Viresh Kumar
  3 siblings, 2 replies; 13+ messages in thread
From: Viresh Kumar @ 2012-09-27  9:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: pjt, paul.mckenney, tglx, tj, suresh.b.siddha, venki, mingo,
	peterz, robin.randhawa, Steve.Bannister, Arvind.Chauhan,
	amit.kucheria, vincent.guittot, linaro-dev, patches,
	Viresh Kumar

Workqueues queues work on current cpu, if the caller haven't passed a preferred
cpu. This may wake up an idle CPU, which is actually not required.

This work can be processed by any CPU and so we must select a non-idle CPU here.
This patch adds in support in workqueue framework to get preferred CPU details
from the scheduler, instead of using current CPU.

Most of the time when a work is queued, the current cpu isn't idle and so we
will choose it only. There are cases when a cpu is idle when it queues some
work. For example, consider following scenario:
- A cpu has programmed a timer and is IDLE now.
- CPU gets into interrupt handler due to timer and queues a work. As the CPU is
  currently IDLE, we can queue this work to some other CPU.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 kernel/workqueue.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 143fd8c..5fa4ba4 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1238,7 +1238,7 @@ static void __queue_work(unsigned int cpu, struct workqueue_struct *wq,
 		struct global_cwq *last_gcwq;
 
 		if (cpu == WORK_CPU_UNBOUND)
-			cpu = raw_smp_processor_id();
+			cpu = sched_select_cpu(0);
 
 		/*
 		 * It's multi cpu.  If @work was previously on a different
-- 
1.7.12.rc2.18.g61b472e



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH V2 2/3] timer: hrtimer: Don't check idle_cpu() before calling get_nohz_timer_target()
  2012-09-27  9:04 ` [PATCH V2 2/3] timer: hrtimer: Don't check idle_cpu() before calling get_nohz_timer_target() Viresh Kumar
@ 2012-09-27 10:31   ` Viresh Kumar
  0 siblings, 0 replies; 13+ messages in thread
From: Viresh Kumar @ 2012-09-27 10:31 UTC (permalink / raw)
  To: linux-kernel
  Cc: pjt, paul.mckenney, tglx, tj, suresh.b.siddha, venki, mingo,
	peterz, robin.randhawa, Steve.Bannister, Arvind.Chauhan,
	amit.kucheria, vincent.guittot, linaro-dev, patches,
	Viresh Kumar

On 27 September 2012 14:34, Viresh Kumar <viresh.kumar@linaro.org> wrote:

>  #if defined(CONFIG_NO_HZ) && defined(CONFIG_SMP)
> -       if (!pinned && get_sysctl_timer_migration() && idle_cpu(cpu))
> +       if (!pinned && get_sysctl_timer_migration())
>                 cpu = get_nohz_timer_target();
>  #endif
> +       if (!cpu)
> +               cpu = smp_processor_id();

Ahh... Bad change... cpu can be returned as zero by get_nohz_timer_target().

Please consider below patch here:

------------8<-------------


Subject: [PATCH] timer: hrtimer: Don't check idle_cpu() before calling
 get_nohz_timer_target()

Check for current cpu's idleness already done in implementation of
sched_select_cpu() which is called by get_nohz_timer_target(). So, no need to
call idle_cpu() twice, once from sched_select_cpu() and once from timer and
hrtimer before calling get_nohz_timer_target().

This patch removes calls to idle_cpu() from timer and hrtimer.

This also reduces an extra call to smp_processor_id() when
get_nohz_timer_target() is called.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 kernel/hrtimer.c | 2 +-
 kernel/timer.c   | 8 +++++---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 6db7a5e..74bdaf6 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -159,7 +159,7 @@ struct hrtimer_clock_base *lock_hrtimer_base(const
struct hrtimer *timer,
 static int hrtimer_get_target(int this_cpu, int pinned)
 {
 #ifdef CONFIG_NO_HZ
-	if (!pinned && get_sysctl_timer_migration() && idle_cpu(this_cpu))
+	if (!pinned && get_sysctl_timer_migration())
 		return get_nohz_timer_target();
 #endif
 	return this_cpu;
diff --git a/kernel/timer.c b/kernel/timer.c
index d5de1b2..db57606 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -733,11 +733,13 @@ __mod_timer(struct timer_list *timer, unsigned
long expires,

 	debug_activate(timer, expires);

-	cpu = smp_processor_id();
-
 #if defined(CONFIG_NO_HZ) && defined(CONFIG_SMP)
-	if (!pinned && get_sysctl_timer_migration() && idle_cpu(cpu))
+	if (!pinned && get_sysctl_timer_migration())
 		cpu = get_nohz_timer_target();
+	else
+		cpu = smp_processor_id();
+#else
+	cpu = smp_processor_id();
 #endif
 	new_base = per_cpu(tvec_bases, cpu);

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH V2 3/3] workqueue: Schedule work on non-idle cpu instead of current one
  2012-09-27  9:04 ` [PATCH V2 3/3] workqueue: Schedule work on non-idle cpu instead of current one Viresh Kumar
@ 2012-09-28  6:03   ` Viresh Kumar
  2012-09-30  8:54   ` Tejun Heo
  1 sibling, 0 replies; 13+ messages in thread
From: Viresh Kumar @ 2012-09-28  6:03 UTC (permalink / raw)
  To: Tejun Heo
  Cc: pjt, paul.mckenney, tglx, Linux Kernel Mailing List,
	suresh.b.siddha, venki, mingo, peterz, robin.randhawa,
	Steve.Bannister, Arvind.Chauhan, amit.kucheria, vincent.guittot,
	linaro-dev, patches, Viresh Kumar

On 27 September 2012 14:34, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> Workqueues queues work on current cpu, if the caller haven't passed a preferred
> cpu. This may wake up an idle CPU, which is actually not required.
>
> This work can be processed by any CPU and so we must select a non-idle CPU here.
> This patch adds in support in workqueue framework to get preferred CPU details
> from the scheduler, instead of using current CPU.
>
> Most of the time when a work is queued, the current cpu isn't idle and so we
> will choose it only. There are cases when a cpu is idle when it queues some
> work. For example, consider following scenario:
> - A cpu has programmed a timer and is IDLE now.
> - CPU gets into interrupt handler due to timer and queues a work. As the CPU is
>   currently IDLE, we can queue this work to some other CPU.
>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  kernel/workqueue.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 143fd8c..5fa4ba4 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1238,7 +1238,7 @@ static void __queue_work(unsigned int cpu, struct workqueue_struct *wq,
>                 struct global_cwq *last_gcwq;
>
>                 if (cpu == WORK_CPU_UNBOUND)
> -                       cpu = raw_smp_processor_id();
> +                       cpu = sched_select_cpu(0);

Hi Tejun,

Following change is also required with this patch:


--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1383,7 +1383,7 @@ static void __queue_delayed_work(int cpu, struct
workqueue_struct *wq,
                if (gcwq)
                        lcpu = gcwq->cpu;
                if (lcpu == WORK_CPU_UNBOUND)
-                       lcpu = raw_smp_processor_id();
+                       lcpu = sched_select_cpu(0);
        } else {
                lcpu = WORK_CPU_UNBOUND;
        }

Otherwise, we will end up running the re-reentrancy check, which we are trying
to skip for delayed work.

--
viresh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V2 3/3] workqueue: Schedule work on non-idle cpu instead of current one
  2012-09-27  9:04 ` [PATCH V2 3/3] workqueue: Schedule work on non-idle cpu instead of current one Viresh Kumar
  2012-09-28  6:03   ` Viresh Kumar
@ 2012-09-30  8:54   ` Tejun Heo
  2012-09-30 12:16     ` Viresh Kumar
  1 sibling, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2012-09-30  8:54 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, pjt, paul.mckenney, tglx, suresh.b.siddha, venki,
	mingo, peterz, robin.randhawa, Steve.Bannister, Arvind.Chauhan,
	amit.kucheria, vincent.guittot, linaro-dev, patches

Hello, Viresh.

On Thu, Sep 27, 2012 at 02:34:05PM +0530, Viresh Kumar wrote:
> - A cpu has programmed a timer and is IDLE now.
> - CPU gets into interrupt handler due to timer and queues a work. As the CPU is
>   currently IDLE, we can queue this work to some other CPU.

I'm still a bit confused, if the CPU is already running the IRQ
handler, the CPU is not idle by definition.  What am I missing here?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V2 3/3] workqueue: Schedule work on non-idle cpu instead of current one
  2012-09-30  8:54   ` Tejun Heo
@ 2012-09-30 12:16     ` Viresh Kumar
  2012-10-01  0:32       ` Tejun Heo
  0 siblings, 1 reply; 13+ messages in thread
From: Viresh Kumar @ 2012-09-30 12:16 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-kernel, pjt, paul.mckenney, tglx, suresh.b.siddha, venki,
	mingo, peterz, robin.randhawa, Steve.Bannister, Arvind.Chauhan,
	amit.kucheria, vincent.guittot, linaro-dev, patches

On 30 September 2012 14:24, Tejun Heo <tj@kernel.org> wrote:
> On Thu, Sep 27, 2012 at 02:34:05PM +0530, Viresh Kumar wrote:
>> - A cpu has programmed a timer and is IDLE now.
>> - CPU gets into interrupt handler due to timer and queues a work. As the CPU is
>>   currently IDLE, we can queue this work to some other CPU.
>
> I'm still a bit confused, if the CPU is already running the IRQ
> handler, the CPU is not idle by definition.  What am I missing here?

Hi Tejun,

For the scheduler CPU is idle, if all below are true:
- current task is idle task
- nr_running == 0
- wake_list is empty

And during these conditions, there can be a timer running in background.
And when we reach its interrupt handler, then also these conditions hold true
and local cpu is idle.

--
Viresh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V2 3/3] workqueue: Schedule work on non-idle cpu instead of current one
  2012-09-30 12:16     ` Viresh Kumar
@ 2012-10-01  0:32       ` Tejun Heo
  2012-10-01  3:47         ` Viresh Kumar
  0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2012-10-01  0:32 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, pjt, paul.mckenney, tglx, suresh.b.siddha, venki,
	mingo, peterz, robin.randhawa, Steve.Bannister, Arvind.Chauhan,
	amit.kucheria, vincent.guittot, linaro-dev, patches

Hello,

On Sun, Sep 30, 2012 at 05:46:45PM +0530, Viresh Kumar wrote:
> For the scheduler CPU is idle, if all below are true:
> - current task is idle task
> - nr_running == 0
> - wake_list is empty
> 
> And during these conditions, there can be a timer running in background.
> And when we reach its interrupt handler, then also these conditions hold true
> and local cpu is idle.

It isn't about the CPU being actually idle?  Also, if it's only about
timers, shouldn't it be enough to implement it for timer and
delayed_work?

It would be great if you explain what you're trying to achieve how.  I
can't tell what you're aiming for and why that would be beneficial
from these submissions.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V2 3/3] workqueue: Schedule work on non-idle cpu instead of current one
  2012-10-01  0:32       ` Tejun Heo
@ 2012-10-01  3:47         ` Viresh Kumar
  2012-10-01  9:12           ` Vincent Guittot
  0 siblings, 1 reply; 13+ messages in thread
From: Viresh Kumar @ 2012-10-01  3:47 UTC (permalink / raw)
  To: vincent.guittot, Tejun Heo
  Cc: linux-kernel, pjt, paul.mckenney, tglx, suresh.b.siddha, venki,
	mingo, peterz, robin.randhawa, Steve.Bannister, Arvind.Chauhan,
	amit.kucheria, linaro-dev, patches

On 1 October 2012 06:02, Tejun Heo <tj@kernel.org> wrote:
> It isn't about the CPU being actually idle?

No. Being idle only from scheduler's perspective. :)

> Also, if it's only about timers, shouldn't it be enough to implement
> it for timer and delayed_work?

What if we need a timer, which must re-arm itself + schedule a work?
delayed_work will not be sufficient in that case, and we would need
to use normal work.

If i am not wrong, there can be other future users of this routine too.
@Vincent: Can you please comment on this?

> It would be great if you explain what you're trying to achieve how.  I
> can't tell what you're aiming for and why that would be beneficial
> from these submissions.

Following slides are implemented by Vincent and presented during LPC.
Please have a look at them, they explain the problem statement well:

http://www.linuxplumbersconf.org/2012/wp-content/uploads/2012/08/lpc2012-sched-timer-workqueue.pdf

Specifically slides: 12 & 19.

There aren't too many users with this behavior, but even a single user
will be sufficient not to let the cpu get idle at all. And that will result in
less power saving.

--
viresh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V2 3/3] workqueue: Schedule work on non-idle cpu instead of current one
  2012-10-01  3:47         ` Viresh Kumar
@ 2012-10-01  9:12           ` Vincent Guittot
  0 siblings, 0 replies; 13+ messages in thread
From: Vincent Guittot @ 2012-10-01  9:12 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Tejun Heo, linux-kernel, pjt, paul.mckenney, tglx,
	suresh.b.siddha, venki, mingo, peterz, robin.randhawa,
	Steve.Bannister, Arvind.Chauhan, amit.kucheria, linaro-dev,
	patches

On 1 October 2012 05:47, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 1 October 2012 06:02, Tejun Heo <tj@kernel.org> wrote:
>> It isn't about the CPU being actually idle?
>
> No. Being idle only from scheduler's perspective. :)
>
>> Also, if it's only about timers, shouldn't it be enough to implement
>> it for timer and delayed_work?
>
> What if we need a timer, which must re-arm itself + schedule a work?
> delayed_work will not be sufficient in that case, and we would need
> to use normal work.
>
> If i am not wrong, there can be other future users of this routine too.
> @Vincent: Can you please comment on this?

The goal is to consolidate the place, where the selection of a target
CPU for running something is done. The scheduler should select the CPU
in order to have coherent behaviour of all framework.

A delayed work can be delayed for a long period and the target CPU
that have been selected for the timer could not be the best place to
queue a work any more.

The goal is to be able to migrate a work if the scheduler thinks that'
is necessary.

Vincent

>
>> It would be great if you explain what you're trying to achieve how.  I
>> can't tell what you're aiming for and why that would be beneficial
>> from these submissions.
>
> Following slides are implemented by Vincent and presented during LPC.
> Please have a look at them, they explain the problem statement well:
>
> http://www.linuxplumbersconf.org/2012/wp-content/uploads/2012/08/lpc2012-sched-timer-workqueue.pdf
>
> Specifically slides: 12 & 19.
>
> There aren't too many users with this behavior, but even a single user
> will be sufficient not to let the cpu get idle at all. And that will result in
> less power saving.
>
> --
> viresh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V2 0/3] Create sched_select_cpu() and use it in workqueues
  2012-09-27  9:04 [PATCH V2 0/3] Create sched_select_cpu() and use it in workqueues Viresh Kumar
                   ` (2 preceding siblings ...)
  2012-09-27  9:04 ` [PATCH V2 3/3] workqueue: Schedule work on non-idle cpu instead of current one Viresh Kumar
@ 2012-10-15  4:38 ` Viresh Kumar
  2012-10-31  9:00   ` Viresh Kumar
  3 siblings, 1 reply; 13+ messages in thread
From: Viresh Kumar @ 2012-10-15  4:38 UTC (permalink / raw)
  To: mingo, peterz, tglx
  Cc: pjt, paul.mckenney, tj, suresh.b.siddha, venki, robin.randhawa,
	Steve.Bannister, amit.kucheria, vincent.guittot, linaro-dev,
	patches, Linux Kernel Mailing List

On 27 September 2012 14:34, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> This is V2 of my sched_select_cpu() work.
>
> In order to save power, it would be useful to schedule work onto non-IDLE cpus
> instead of waking up an IDLE one.
>
> To achieve this, we need scheduler to guide kernel frameworks (like: timers &
> workqueues) on which is the most preferred CPU that must be used for these
> tasks.
>
> This patchset is about implementing this concept.
>
> - The first patch adds sched_select_cpu() routine which returns the preferred
>   cpu which is non-idle.
> - Second patch removes idle_cpu() calls from timer & hrtimer.
> - Third patch is about adapting this change in workqueue framework.
>
> Earlier discussions over v1 can be found here:
> http://www.mail-archive.com/linaro-dev@lists.linaro.org/msg13342.html
>
> Earlier discussions over this concept were done at last LPC:
> http://summit.linuxplumbersconf.org/lpc-2012/meeting/90/lpc2012-sched-timer-workqueue/
>
> Module created for testing this behavior is present here:
> http://git.linaro.org/gitweb?p=people/vireshk/module.git;a=summary
>
> Following are the steps followed in test module:
> 1. Run single work on each cpu
> 2. This work will start a timer after x (tested with 10) jiffies of delay
> 3. Timer routine queues a work... (This may be called from idle or non-idle cpu)
>    and starts the same timer again STEP 3 is done for n number of times (i.e.
>    queuing n works, one after other)
> 4. All works will call a single routine, which will count following per cpu:
>  - Total works processed by a CPU
>  - Total works processed by a CPU, which are queued from it
>  - Total works processed by a CPU, which aren't queued from it
>
> Setup:
> -----
> - ARM Vexpress TC2 - big.LITTLE CPU
> - Core 0-1: A15, 2-4: A7
> - rootfs: linaro-ubuntu-nano
>
> Results:
> -------
> Without Workqueue Modification, i.e. PATCH 3/3:
> [ 2493.022335] Workqueue Analyser: works processsed by CPU0, Total: 1000, Own: 0, migrated: 0
> [ 2493.047789] Workqueue Analyser: works processsed by CPU1, Total: 1000, Own: 0, migrated: 0
> [ 2493.072918] Workqueue Analyser: works processsed by CPU2, Total: 1000, Own: 0, migrated: 0
> [ 2493.098576] Workqueue Analyser: works processsed by CPU3, Total: 1000, Own: 0, migrated: 0
> [ 2493.123702] Workqueue Analyser: works processsed by CPU4, Total: 1000, Own: 0, migrated: 0
>
> With Workqueue Modification, i.e. PATCH 3/3:
> [ 2493.022335] Workqueue Analyser: works processsed by CPU0, Total: 1002, Own: 999, migrated: 3
> [ 2493.047789] Workqueue Analyser: works processsed by CPU1, Total: 998,  Own: 997, migrated: 1
> [ 2493.072918] Workqueue Analyser: works processsed by CPU2, Total: 1013, Own: 996, migrated: 17
> [ 2493.098576] Workqueue Analyser: works processsed by CPU3, Total: 998,  Own: 993, migrated: 5
> [ 2493.123702] Workqueue Analyser: works processsed by CPU4, Total: 989,  Own: 987, migrated: 2
>
> V1->V2
> -----
> - New SD_* macros removed now and earlier ones used
> - sched_select_cpu() rewritten and it includes the check on current cpu's
>   idleness.
> - cpu_idle() calls from timer and hrtimer removed now.
> - Patch 2/3 from V1, removed as it doesn't apply to latest workqueue branch from
>   tejun.
> - CONFIG_MIGRATE_WQ removed and so is wq_select_cpu()
> - sched_select_cpu() called only from __queue_work()
> - got tejun/for-3.7 branch in my tree, before making workqueue changes.
>
> Viresh Kumar (3):
>   sched: Create sched_select_cpu() to give preferred CPU for power
>     saving
>   timer: hrtimer: Don't check idle_cpu() before calling
>     get_nohz_timer_target()
>   workqueue: Schedule work on non-idle cpu instead of current one

Hi Guys,

I totally understand since last few weeks you guys were very busy as the
merge window was around. So, didn't tried to disturb you then :)

Can you please share your viewpoint on this patchset now? And also
the running timer migration patch (which was sent separately)?

Thanks in Advance.

--
viresh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V2 0/3] Create sched_select_cpu() and use it in workqueues
  2012-10-15  4:38 ` [PATCH V2 0/3] Create sched_select_cpu() and use it in workqueues Viresh Kumar
@ 2012-10-31  9:00   ` Viresh Kumar
  0 siblings, 0 replies; 13+ messages in thread
From: Viresh Kumar @ 2012-10-31  9:00 UTC (permalink / raw)
  To: mingo, peterz, tglx
  Cc: pjt, paul.mckenney, tj, suresh.b.siddha, venki, robin.randhawa,
	Steve.Bannister, amit.kucheria, vincent.guittot, linaro-dev,
	patches, Linux Kernel Mailing List

On 15 October 2012 10:08, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> I totally understand since last few weeks you guys were very busy as the
> merge window was around. So, didn't tried to disturb you then :)
>
> Can you please share your viewpoint on this patchset now? And also
> the running timer migration patch (which was sent separately)?

Ping!!

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2012-10-31  9:00 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-09-27  9:04 [PATCH V2 0/3] Create sched_select_cpu() and use it in workqueues Viresh Kumar
2012-09-27  9:04 ` [PATCH V2 1/3] sched: Create sched_select_cpu() to give preferred CPU for power saving Viresh Kumar
2012-09-27  9:04 ` [PATCH V2 2/3] timer: hrtimer: Don't check idle_cpu() before calling get_nohz_timer_target() Viresh Kumar
2012-09-27 10:31   ` Viresh Kumar
2012-09-27  9:04 ` [PATCH V2 3/3] workqueue: Schedule work on non-idle cpu instead of current one Viresh Kumar
2012-09-28  6:03   ` Viresh Kumar
2012-09-30  8:54   ` Tejun Heo
2012-09-30 12:16     ` Viresh Kumar
2012-10-01  0:32       ` Tejun Heo
2012-10-01  3:47         ` Viresh Kumar
2012-10-01  9:12           ` Vincent Guittot
2012-10-15  4:38 ` [PATCH V2 0/3] Create sched_select_cpu() and use it in workqueues Viresh Kumar
2012-10-31  9:00   ` Viresh Kumar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).