* [PATCH] sched: Optimize task_sched_runtime()
@ 2013-11-11 17:29 Peter Zijlstra
  2013-11-13 17:25 ` [tip:sched/urgent] " tip-bot for Peter Zijlstra
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2013-11-11 17:29 UTC (permalink / raw)
  To: linux-kernel, mingo; +Cc: kosaki.motohiro, lwoodman, pjt

Subject: sched: Optimize task_sched_runtime()
From: Peter Zijlstra <peterz@infradead.org>
Date: Mon Nov 11 18:21:56 CET 2013

Large multi-threaded apps like to hit this using do_sys_times() and
then queue up on the rq->lock.

Avoid when possible.

Larry reported a ~20% performance increase in his test case.

Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reported-by: Larry Woodman <lwoodman@redhat.com>
Suggested-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-m3prfhn4woqzrg4w029obww8@git.kernel.org
---
 kernel/sched/core.c |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2253,6 +2253,20 @@ unsigned long long task_sched_runtime(st
 	struct rq *rq;
 	u64 ns = 0;
 
+#if defined(CONFIG_64BIT) && defined(CONFIG_SMP)
+	/*
+	 * 64-bit doesn't need locks to atomically read a 64bit value.
+	 * So we have an optimization chance when the task's delta_exec is 0.
+	 * Reading ->on_cpu is racy, but this is ok.
+	 *
+	 * If we race with it leaving cpu, we'll take a lock. So we're correct.
+	 * If we race with it entering cpu, unaccounted time is 0. This is
+	 * indistinguishable from the read occurring a few cycles earlier.
+	 */
+	if (!p->on_cpu)
+		return p->se.sum_exec_runtime;
+#endif
+
 	rq = task_rq_lock(p, &flags);
 	ns = p->se.sum_exec_runtime + do_task_delta_exec(p, rq);
 	task_rq_unlock(rq, p, &flags);
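
For readers who want to try the idea outside the kernel, here is a minimal userspace sketch of the same fast-path/slow-path split. It is not the kernel code: struct task, task_runtime() and pending_delta are hypothetical names, a pthread mutex stands in for rq->lock, and C11 atomics replace the plain loads the patch relies on (a plain racy read would be undefined behaviour in portable userspace C).

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the task fields the patch touches. */
struct task {
	pthread_mutex_t lock;       /* plays the role of rq->lock */
	atomic_int on_cpu;          /* nonzero while the task is running */
	_Atomic uint64_t sum_exec;  /* runtime already accounted, in ns */
	uint64_t pending_delta;     /* runtime not yet accounted; guarded by lock */
};

/* Return the task's total runtime, taking the lock only when needed. */
static uint64_t task_runtime(struct task *t)
{
	uint64_t ns;

	/*
	 * Fast path: a task that is not running has no unaccounted delta,
	 * so a single atomic 64-bit load is enough. If the task starts
	 * running right after the check, the result equals what a reader
	 * a few cycles earlier would have seen.
	 */
	if (!atomic_load_explicit(&t->on_cpu, memory_order_acquire))
		return atomic_load_explicit(&t->sum_exec, memory_order_relaxed);

	/* Slow path: the task is running, so add the pending delta under the lock. */
	pthread_mutex_lock(&t->lock);
	ns = atomic_load_explicit(&t->sum_exec, memory_order_relaxed) +
	     t->pending_delta;
	pthread_mutex_unlock(&t->lock);

	return ns;
}
```

In this sketch the lock is contended only by readers of tasks that are currently running; readers of idle tasks never touch it, which is the effect the patch aims for on rq->lock.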

* [tip:sched/urgent] sched: Optimize task_sched_runtime()
  2013-11-11 17:29 [PATCH] sched: Optimize task_sched_runtime() Peter Zijlstra
@ 2013-11-13 17:25 ` tip-bot for Peter Zijlstra
  2013-11-18 23:59   ` KOSAKI Motohiro
  2013-11-19  3:16   ` Davidlohr Bueso
  0 siblings, 2 replies; 5+ messages in thread
From: tip-bot for Peter Zijlstra @ 2013-11-13 17:25 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, torvalds, pjt, peterz, akpm, tglx,
	kosaki.motohiro, lwoodman

Commit-ID:  911b2898b3c9fe0048e9485ad1629ed4fce330fd
Gitweb:     http://git.kernel.org/tip/911b2898b3c9fe0048e9485ad1629ed4fce330fd
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Mon, 11 Nov 2013 18:21:56 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 13 Nov 2013 13:33:54 +0100

sched: Optimize task_sched_runtime()

Large multi-threaded apps like to hit this using do_sys_times() and
then queue up on the rq->lock.

Avoid when possible.

Larry reported a ~20% performance increase in his test case.

Reported-by: Larry Woodman <lwoodman@redhat.com>
Suggested-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20131111172925.GG26898@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/core.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1deccd7..c180860 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2253,6 +2253,20 @@ unsigned long long task_sched_runtime(struct task_struct *p)
 	struct rq *rq;
 	u64 ns = 0;
 
+#if defined(CONFIG_64BIT) && defined(CONFIG_SMP)
+	/*
+	 * 64-bit doesn't need locks to atomically read a 64bit value.
+	 * So we have an optimization chance when the task's delta_exec is 0.
+	 * Reading ->on_cpu is racy, but this is ok.
+	 *
+	 * If we race with it leaving cpu, we'll take a lock. So we're correct.
+	 * If we race with it entering cpu, unaccounted time is 0. This is
+	 * indistinguishable from the read occurring a few cycles earlier.
+	 */
+	if (!p->on_cpu)
+		return p->se.sum_exec_runtime;
+#endif
+
 	rq = task_rq_lock(p, &flags);
 	ns = p->se.sum_exec_runtime + do_task_delta_exec(p, rq);
 	task_rq_unlock(rq, p, &flags);
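
The workload described in the commit message can be approximated from userspace roughly as follows. This is only a sketch, assuming that times(2) is the system call that reaches do_sys_times(); worker(), run_workers() and MAX_THREADS are hypothetical names, and this is not Larry's actual test case.

```c
#include <pthread.h>
#include <sys/times.h>
#include <time.h>

#define MAX_THREADS 64	/* arbitrary cap chosen for this sketch */

struct worker_arg {
	int calls;	/* number of times() calls to issue */
	int ok;		/* number of calls that succeeded */
};

/* Each thread queries process CPU times in a tight loop. */
static void *worker(void *p)
{
	struct worker_arg *arg = p;
	struct tms buf;

	for (int i = 0; i < arg->calls; i++)
		if (times(&buf) != (clock_t)-1)
			arg->ok++;
	return NULL;
}

/*
 * Start nthreads workers and return the total number of successful
 * times() calls, or -1 if the arguments are invalid or a thread could
 * not be created.
 */
static int run_workers(int nthreads, int calls)
{
	pthread_t tid[MAX_THREADS];
	struct worker_arg args[MAX_THREADS];
	int started = 0, total = 0;

	if (nthreads < 1 || nthreads > MAX_THREADS)
		return -1;

	for (int i = 0; i < nthreads; i++) {
		args[i].calls = calls;
		args[i].ok = 0;
		if (pthread_create(&tid[i], NULL, worker, &args[i]) != 0)
			break;
		started++;
	}

	/* Join only the threads that were actually started. */
	for (int i = 0; i < started; i++) {
		pthread_join(tid[i], NULL);
		total += args[i].ok;
	}

	return started == nthreads ? total : -1;
}
```

Timing run_workers() with a large thread count on kernels with and without the patch would be one way to look for the contention described above; no numbers are claimed here.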

* Re: [tip:sched/urgent] sched: Optimize task_sched_runtime()
  2013-11-13 17:25 ` [tip:sched/urgent] " tip-bot for Peter Zijlstra
@ 2013-11-18 23:59   ` KOSAKI Motohiro
  2013-11-19  3:16   ` Davidlohr Bueso
  1 sibling, 0 replies; 5+ messages in thread
From: KOSAKI Motohiro @ 2013-11-18 23:59 UTC (permalink / raw)
  To: mingo, hpa, linux-kernel, torvalds, peterz, pjt, akpm, tglx,
	lwoodman, linux-tip-commits
  Cc: kosaki.motohiro

On 11/13/2013 12:25 PM, tip-bot for Peter Zijlstra wrote:
> Commit-ID:  911b2898b3c9fe0048e9485ad1629ed4fce330fd
> Gitweb:     http://git.kernel.org/tip/911b2898b3c9fe0048e9485ad1629ed4fce330fd
> Author:     Peter Zijlstra <peterz@infradead.org>
> AuthorDate: Mon, 11 Nov 2013 18:21:56 +0100
> Committer:  Ingo Molnar <mingo@kernel.org>
> CommitDate: Wed, 13 Nov 2013 13:33:54 +0100
> 
> sched: Optimize task_sched_runtime()
> 
> Large multi-threaded apps like to hit this using do_sys_times() and
> then queue up on the rq->lock.
> 
> Avoid when possible.
> 
> Larry reported a ~20% performance increase in his test case.
> 
> Reported-by: Larry Woodman <lwoodman@redhat.com>
> Suggested-by: Paul Turner <pjt@google.com>
> Signed-off-by: Peter Zijlstra <peterz@infradead.org>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Link: http://lkml.kernel.org/r/20131111172925.GG26898@twins.programming.kicks-ass.net
> Signed-off-by: Ingo Molnar <mingo@kernel.org>

Sorry for the delay; I was on vacation. As you know, this patch was
originally written by me, and I have already tested it. So, of course,

Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

# S-O-B is better? dunno.

* Re: [tip:sched/urgent] sched: Optimize task_sched_runtime()
  2013-11-13 17:25 ` [tip:sched/urgent] " tip-bot for Peter Zijlstra
  2013-11-18 23:59   ` KOSAKI Motohiro
@ 2013-11-19  3:16   ` Davidlohr Bueso
  2013-11-19  7:17     ` Ingo Molnar
  1 sibling, 1 reply; 5+ messages in thread
From: Davidlohr Bueso @ 2013-11-19  3:16 UTC (permalink / raw)
  To: mingo, hpa, linux-kernel, torvalds, peterz, pjt, akpm, tglx,
	kosaki.motohiro, lwoodman
  Cc: linux-tip-commits

On Wed, 2013-11-13 at 09:25 -0800, tip-bot for Peter Zijlstra wrote:
> Commit-ID:  911b2898b3c9fe0048e9485ad1629ed4fce330fd
> Gitweb:     http://git.kernel.org/tip/911b2898b3c9fe0048e9485ad1629ed4fce330fd
> Author:     Peter Zijlstra <peterz@infradead.org>
> AuthorDate: Mon, 11 Nov 2013 18:21:56 +0100
> Committer:  Ingo Molnar <mingo@kernel.org>
> CommitDate: Wed, 13 Nov 2013 13:33:54 +0100
> 
> sched: Optimize task_sched_runtime()
> 
> Large multi-threaded apps like to hit this using do_sys_times() and
> then queue up on the rq->lock.
> 
> Avoid when possible.
> 
> Larry reported a ~20% performance increase in his test case.
> 
> Reported-by: Larry Woodman <lwoodman@redhat.com>
> Suggested-by: Paul Turner <pjt@google.com>
> Signed-off-by: Peter Zijlstra <peterz@infradead.org>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Link: http://lkml.kernel.org/r/20131111172925.GG26898@twins.programming.kicks-ass.net
> Signed-off-by: Ingo Molnar <mingo@kernel.org>

For what it's worth:

Tested-by: Davidlohr Bueso <davidlohr@hp.com>


* Re: [tip:sched/urgent] sched: Optimize task_sched_runtime()
  2013-11-19  3:16   ` Davidlohr Bueso
@ 2013-11-19  7:17     ` Ingo Molnar
  0 siblings, 0 replies; 5+ messages in thread
From: Ingo Molnar @ 2013-11-19  7:17 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: hpa, linux-kernel, torvalds, peterz, pjt, akpm, tglx,
	kosaki.motohiro, lwoodman, linux-tip-commits


* Davidlohr Bueso <davidlohr@hp.com> wrote:

> On Wed, 2013-11-13 at 09:25 -0800, tip-bot for Peter Zijlstra wrote:
> > Commit-ID:  911b2898b3c9fe0048e9485ad1629ed4fce330fd
> > Gitweb:     http://git.kernel.org/tip/911b2898b3c9fe0048e9485ad1629ed4fce330fd
> > Author:     Peter Zijlstra <peterz@infradead.org>
> > AuthorDate: Mon, 11 Nov 2013 18:21:56 +0100
> > Committer:  Ingo Molnar <mingo@kernel.org>
> > CommitDate: Wed, 13 Nov 2013 13:33:54 +0100
> > 
> > sched: Optimize task_sched_runtime()
> > 
> > Large multi-threaded apps like to hit this using do_sys_times() and
> > then queue up on the rq->lock.
> > 
> > Avoid when possible.
> > 
> > Larry reported a ~20% performance increase in his test case.
> > 
> > Reported-by: Larry Woodman <lwoodman@redhat.com>
> > Suggested-by: Paul Turner <pjt@google.com>
> > Signed-off-by: Peter Zijlstra <peterz@infradead.org>
> > Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Link: http://lkml.kernel.org/r/20131111172925.GG26898@twins.programming.kicks-ass.net
> > Signed-off-by: Ingo Molnar <mingo@kernel.org>
> 
> For what it's worth:
> 
> Tested-by: Davidlohr Bueso <davidlohr@hp.com>

Thanks for the testing - the change is upstream already and unless it 
causes regressions it will be part of the v3.13 kernel.

Thanks,

	Ingo
