linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Yuyang Du <yuyang.du@intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Byungchul Park <byungchul.park@lge.com>,
	mingo@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] sched: sync with the cfs_rq when changing sched class
Date: Fri, 14 Aug 2015 07:20:20 +0800	[thread overview]
Message-ID: <20150813232020.GA7780@intel.com> (raw)
In-Reply-To: <20150813152212.GE16853@twins.programming.kicks-ass.net>

On Thu, Aug 13, 2015 at 05:22:12PM +0200, Peter Zijlstra wrote:
> > when did I say "don't need to consider..."?
> > 
> > Doing more does not mean better, or just trivial. BTW, the task switched_to
> > does not have to be switched_from before. 
> 
> Correct, there's a few corner cases we need to consider.
> 
> However, I think we unconditionally call init_entity_runnable_average()
> on all tasks, regardless of their 'initial' sched class, so it should
> have a valid state.
> 
> Another thing to consider is the state being very stale, suppose it
> started live as FAIR, ran for a bit, got switched to !FAIR by means of
> sys_sched_setscheduler()/sys_sched_setattr() or similar, runs for a long
> time and for some reason gets switched back to FAIR, we need to age and
> or re-init things.
> 
> I _think_ we can use last_update_time for that, but I've not looked too
> hard.
> 
> That is, age based on last_update_time, if all 0, reinit, or somesuch.
> 
> 
> The most common case of switched_from()/switched_to() is Priority
> Inheritance, and that typically results in very short lived stints as
> !FAIR and the avg data should be still accurate by the time we return.

Right, we have the following cases to consider:
queued or not * PI or sched_setscheduler * switched_from_fair or not

After enumerating them, it simply boils down to this basic question:
with either short or long lost record, what we predict?

It can have to do with perspective, preference, etc, and it differs
in how frequent it happens, which we lack. Otherwise, we don't have
the context to see what difference it makes.

To address this, I think the following patch might be helpful first.

--
sched: Provide sched class and priority change statistics to task

The sched class and priority changes make substantial impact for
a task, but we really have not a quantitative understanding of how
frequent they are, which makes it diffcult to work on many cases
especially some corner cases. We privide it at task level.

Signed-off-by: Yuyang Du <yuyang.du@intel.com>
---
 include/linux/sched.h |    3 +++
 kernel/sched/core.c   |    8 +++++++-
 kernel/sched/debug.c  |    2 ++
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5b50082..9111696 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1228,6 +1228,9 @@ struct sched_statistics {
 	u64			nr_wakeups_affine_attempts;
 	u64			nr_wakeups_passive;
 	u64			nr_wakeups_idle;
+
+	u64			class_change;
+	u64			priority_change;
 };
 #endif
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 655557d..b65af01 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1009,12 +1009,18 @@ static inline void check_class_changed(struct rq *rq, struct task_struct *p,
 				       int oldprio)
 {
 	if (prev_class != p->sched_class) {
+		schedstat_inc(p, se.statistics.class_change);
+		schedstat_inc(p, se.statistics.priority_change);
+
 		if (prev_class->switched_from)
 			prev_class->switched_from(rq, p);
 
 		p->sched_class->switched_to(rq, p);
-	} else if (oldprio != p->prio || dl_task(p))
+	} else if (oldprio != p->prio || dl_task(p)) {
+		schedstat_inc(p, se.statistics.priority_change);
+
 		p->sched_class->prio_changed(rq, p, oldprio);
+	}
 }
 
 void check_preempt_curr(struct rq *rq, struct task_struct *p, int flags)
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 6415117..fac8b89 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -597,6 +597,8 @@ void proc_sched_show_task(struct task_struct *p, struct seq_file *m)
 	P(se.statistics.nr_wakeups_affine_attempts);
 	P(se.statistics.nr_wakeups_passive);
 	P(se.statistics.nr_wakeups_idle);
+	P(se.statistics.class_change);
+	P(se.statistics.priority_change);
 
 	{
 		u64 avg_atom, avg_per_cpu;
-- 
1.7.9.5


  reply	other threads:[~2015-08-14  7:10 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-13  5:55 [PATCH] sched: sync with the cfs_rq when changing sched class byungchul.park
2015-08-12 22:41 ` Yuyang Du
2015-08-13  7:19   ` Byungchul Park
2015-08-13  2:30     ` Yuyang Du
2015-08-13  7:46 ` Peter Zijlstra
2015-08-13  8:21   ` Byungchul Park
2015-08-13  2:15     ` Yuyang Du
2015-08-13 10:56       ` Byungchul Park
2015-08-13 15:22       ` Peter Zijlstra
2015-08-13 23:20         ` Yuyang Du [this message]
2015-08-14  8:46           ` Peter Zijlstra
2015-08-15  6:52         ` Byungchul Park
2015-08-15  7:16           ` Byungchul Park
2015-08-13  8:42   ` Byungchul Park
2015-08-14 12:59 ` T. Zhou
2015-08-15  4:24   ` Byungchul Park
2015-08-15 12:34     ` T. Zhou

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150813232020.GA7780@intel.com \
    --to=yuyang.du@intel.com \
    --cc=byungchul.park@lge.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).