linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Arun Sharma <asharma@fb.com>
Cc: linux-kernel@vger.kernel.org,
	Arnaldo Carvalho de Melo <acme@infradead.org>,
	Andrew Vagin <avagin@openvz.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Ingo Molnar <mingo@elte.hu>, Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH] trace: reset sleep/block start time on task switch
Date: Mon, 23 Jan 2012 22:03:51 +0100	[thread overview]
Message-ID: <1327352631.2446.22.camel@twins> (raw)
In-Reply-To: <4F1DA9D0.6090208@fb.com>

On Mon, 2012-01-23 at 10:41 -0800, Arun Sharma wrote:
> On 1/23/12 3:34 AM, Peter Zijlstra wrote:

> > This'll fail to compile for !CONFIG_SCHEDSTAT I guess.. I should have
> > paid more attention to the initial patch, that tracepoint having
> > side-effects is a big no-no.
> >
> > Having unconditional writes there is somewhat sad, but I suspect putting
> > a conditional around it isn't going to help much..
> 
> For performance reasons?

Yep.

> 
> > bah can we
> > restructure things so we don't need this?
> >
> 
> We can go back to the old code, where these values were getting reset in 
> {en,de}queue_sleeper(). But we'll have to do it conditionally, so the 
> values are preserved till context switch time when we need it there.
> 
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -1003,6 +1003,8 @@ static void enqueue_sleeper(struct cfs_rq *cfs_rq, 
> struct sched_entity *se)
>                  if (unlikely(delta > se->statistics.sleep_max))
>                          se->statistics.sleep_max = delta;
> 
> +               if (!trace_sched_stat_sleeptime_enabled())
> +                       se->statistics.sleep_start = 0;
>                  se->statistics.sum_sleep_runtime += delta;
> 
>                  if (tsk) {
> @@ -1019,6 +1021,8 @@ static void enqueue_sleeper(struct cfs_rq *cfs_rq, 
> struct sched_entity *se)
>                  if (unlikely(delta > se->statistics.block_max))
>                          se->statistics.block_max = delta;
> 
> +               if (!trace_sched_stat_sleeptime_enabled())
> +                       se->statistics.block_start = 0;
>                  se->statistics.sum_sleep_runtime += delta;
> 
>                  if (tsk) {
> 
> This looks pretty ugly too,

Agreed, it still violates the tracepoints shouldn't actually affect the
code principle.

>  I don't know how to check for a tracepoint 
> being active (Steven?). The only advantage of this approach is that it's 
> in the sleep/wakeup path, rather than the context switch path.

Right, thus avoiding the stores for preemptions.

> Conceptually, the following seems to be the simplest:
> 
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1939,8 +1939,10 @@ static void finish_task_switch(struct rq *rq, 
> struct task_struct *prev)
>          finish_lock_switch(rq, prev);
> 
>          trace_sched_stat_sleeptime(current, rq->clock);
> +#ifdef CONFIG_SCHEDSTAT
>          current->se.statistics.block_start = 0;
>          current->se.statistics.sleep_start = 0;
> +#endif /* CONFIG_SCHEDSTAT */
> 
> Perhaps we can reorder fields in sched_statistics so we touch one 
> cacheline here instead of two?

That might help some, but no stores for preemptions would still be best.

I was thinking something along the lines of the below, since enqueue
uses non-zero to mean its set and we need the values to still be
available in schedule dequeue is the last moment we can clear them.

This would limit the stores to the blocking case, your suggestion of
moving them to the same cacheline will then get us back where we started
in terms of performance.

Or did I miss something?


---
 kernel/sched/fair.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 84adb2d..60f9ab9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1191,6 +1191,9 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 		if (entity_is_task(se)) {
 			struct task_struct *tsk = task_of(se);
 
+			se->statistics.sleep_start = 0;
+			se->statistics.block_start = 0;
+
 			if (tsk->state & TASK_INTERRUPTIBLE)
 				se->statistics.sleep_start = rq_of(cfs_rq)->clock;
 			if (tsk->state & TASK_UNINTERRUPTIBLE)


  reply	other threads:[~2012-01-23 21:04 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-20  2:20 [PATCH] trace: reset sleep/block start time on task switch Arun Sharma
2012-01-23 11:34 ` Peter Zijlstra
2012-01-23 18:41   ` Arun Sharma
2012-01-23 21:03     ` Peter Zijlstra [this message]
2012-01-23 23:02       ` Arun Sharma
2012-01-24 14:27         ` Peter Zijlstra
2012-01-24 21:46           ` Arun Sharma
2012-01-25  9:20             ` Frederic Weisbecker
2012-01-25 19:50               ` Arun Sharma
2012-01-25 20:15                 ` Steven Rostedt
2012-01-25 22:29                   ` Arun Sharma
2012-01-26  2:27                     ` Frederic Weisbecker
2012-01-26 19:13                       ` Arun Sharma
2012-01-26  2:21                 ` Frederic Weisbecker
2012-02-10 18:43                 ` Peter Zijlstra
2012-02-10 20:07                   ` Arun Sharma

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1327352631.2446.22.camel@twins \
    --to=a.p.zijlstra@chello.nl \
    --cc=acme@infradead.org \
    --cc=asharma@fb.com \
    --cc=avagin@openvz.org \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).