All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vincent Guittot <vincent.guittot@linaro.org>
To: Tim Chen <tim.c.chen@linux.intel.com>
Cc: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
	dietmar.eggemann@arm.com, rostedt@goodmis.org,
	bsegall@google.com, mgorman@suse.de, bristot@redhat.com,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 0/5] Improve newidle lb cost tracking and early abort
Date: Thu, 28 Oct 2021 14:15:30 +0200	[thread overview]
Message-ID: <20211028121530.GA19512@vingu-book> (raw)
In-Reply-To: <720fd26424927dd27fea4e5719dafe8a0afaa8c4.camel@linux.intel.com>

Le mercredi 27 oct. 2021 à 13:53:32 (-0700), Tim Chen a écrit :
> On Wed, 2021-10-27 at 10:49 +0200, Vincent Guittot wrote:
> > 
> > > Looking at the profile on update_blocked_averages a bit more,
> > > the majority of the call to update_blocked_averages
> > > happens in run_rebalance_domain.  And we are not
> > > including that cost of update_blocked_averages for
> > > run_rebalance_domains in our current patch set. I think
> > > the patch set should account for that too.
> > 
> > nohz_newidle_balance keeps using sysctl_sched_migration_cost to
> > trigger a _nohz_idle_balance(cpu_rq(cpu), NOHZ_STATS_KICK, CPU_IDLE);
> > This would probably benefit to take into account the cost of
> > update_blocked_averages instead
> > 
> 
> For the case where
> 
> 	this_rq->avg_idle < sysctl_sched_migration_cost
> 
> in newidle_balance(), we skip to the out: label
> 
> out:
>         /* Move the next balance forward */
>         if (time_after(this_rq->next_balance, next_balance))
>                 this_rq->next_balance = next_balance;
> 
>         if (pulled_task)
>                 this_rq->idle_stamp = 0;
>         else
>                 nohz_newidle_balance(this_rq);
> 
> and we call nohz_newidle_balance as we don't have a pulled_task.
> 
> It seems to make sense to skip the call
> to nohz_newidle_balance() for this case? 

nohz_newidle_balance() also tests this condition :
(this_rq->avg_idle < sysctl_sched_migration_cost)
and doesn't set NOHZ_NEWILB_KICKi in such case

But this patch now used the condition :
this_rq->avg_idle < sd->max_newidle_lb_cost
and sd->max_newidle_lb_cost can be higher than sysctl_sched_migration_cost

which means that we can set NOHZ_NEWILB_KICK:
-although we decided to skip newidle loop
-or when we abort because this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost 

This is even more true when sysctl_sched_migration_cost is lowered which is your case IIRC

The patch below ensures that we don't set NOHZ_NEWILB_KICK in such cases:

---
 kernel/sched/fair.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c19f4bb3df1a..36ddae208959 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10779,7 +10779,7 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
 	int this_cpu = this_rq->cpu;
 	u64 t0, t1, curr_cost = 0;
 	struct sched_domain *sd;
-	int pulled_task = 0;
+	int pulled_task = 0, early_stop = 0;
 
 	update_misfit_status(NULL, this_rq);
 
@@ -10816,8 +10816,16 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
 	if (!READ_ONCE(this_rq->rd->overload) ||
 	    (sd && this_rq->avg_idle < sd->max_newidle_lb_cost)) {
 
-		if (sd)
+		if (sd) {
 			update_next_balance(sd, &next_balance);
+
+			/*
+			 * We skip new idle LB because there is not enough
+			 * time before next wake up. Make sure that we will
+			 * not kick NOHZ_NEWILB_KICK
+			 */
+			early_stop = 1;
+		}
 		rcu_read_unlock();
 
 		goto out;
@@ -10836,8 +10844,10 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
 
 		update_next_balance(sd, &next_balance);
 
-		if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost)
+		if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost) {
+			early_stop = 1;
 			break;
+		}
 
 		if (sd->flags & SD_BALANCE_NEWIDLE) {
 
@@ -10887,7 +10897,7 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
 
 	if (pulled_task)
 		this_rq->idle_stamp = 0;
-	else
+	else if (!early_stop)
 		nohz_newidle_balance(this_rq);
 
 	rq_repin_lock(this_rq, rf);
-- 

> We expect a very short idle and a task to wake shortly.  
> So we do not have to pull a task
> to this idle cpu and incur the migration cost.
> 
> Tim
> 
> 
> 
> 

  reply	other threads:[~2021-10-28 12:15 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-19 12:35 [PATCH v3 0/5] Improve newidle lb cost tracking and early abort Vincent Guittot
2021-10-19 12:35 ` [PATCH v3 1/5] sched/fair: Account update_blocked_averages in newidle_balance cost Vincent Guittot
2021-10-21  9:43   ` Mel Gorman
2021-10-31 10:16   ` [tip: sched/core] " tip-bot2 for Vincent Guittot
2021-10-19 12:35 ` [PATCH v3 2/5] sched/fair: Skip update_blocked_averages if we are defering load balance Vincent Guittot
2021-10-21  9:44   ` Mel Gorman
2021-10-21 12:45     ` Vincent Guittot
2021-10-31 10:16   ` [tip: sched/core] " tip-bot2 for Vincent Guittot
2021-10-19 12:35 ` [PATCH 3/5] sched/fair: Wait before decaying max_newidle_lb_cost Vincent Guittot
2021-10-21  9:45   ` Mel Gorman
2021-10-29 10:01   ` Dietmar Eggemann
2021-10-29 12:19     ` Vincent Guittot
2021-10-31 10:16   ` [tip: sched/core] " tip-bot2 for Vincent Guittot
2021-10-19 12:35 ` [PATCH v3 4/5] sched/fair: Remove sysctl_sched_migration_cost condition Vincent Guittot
2021-10-21  9:46   ` Mel Gorman
2021-10-29 10:01   ` Dietmar Eggemann
2021-10-29 12:30     ` Vincent Guittot
2021-10-31 10:16   ` [tip: sched/core] " tip-bot2 for Vincent Guittot
2021-10-19 12:35 ` [PATCH v3 5/5] sched/fair: cleanup newidle_balance Vincent Guittot
2021-10-21  9:46   ` Mel Gorman
2021-10-31 10:16   ` [tip: sched/core] sched/fair: Cleanup newidle_balance tip-bot2 for Vincent Guittot
2021-10-21  9:52 ` [PATCH v3 0/5] Improve newidle lb cost tracking and early abort Mel Gorman
2021-10-21 12:42   ` Vincent Guittot
2021-10-26 17:25 ` Tim Chen
2021-10-27  8:49   ` Vincent Guittot
2021-10-27 20:53     ` Tim Chen
2021-10-28 12:15       ` Vincent Guittot [this message]
2021-10-28 21:36         ` Tim Chen
2021-10-29 17:00     ` Tim Chen
2021-10-31 10:18       ` Vincent Guittot
2021-10-28 23:25 ` Joel Fernandes
2021-10-29  7:49   ` Vincent Guittot
2021-10-29 19:37     ` Joel Fernandes
2021-10-29 10:01 ` Dietmar Eggemann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211028121530.GA19512@vingu-book \
    --to=vincent.guittot@linaro.org \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tim.c.chen@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.