From: Peter Zijlstra <peterz@infradead.org>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: mingo@redhat.com, juri.lelli@redhat.com,
dietmar.eggemann@arm.com, rostedt@goodmis.org,
bsegall@google.com, mgorman@suse.de, bristot@redhat.com,
linux-kernel@vger.kernel.org, tim.c.chen@linux.intel.com
Subject: Re: [PATCH v2 3/4] sched/fair: Wait before decaying max_newidle_lb_cost
Date: Fri, 15 Oct 2021 19:40:45 +0200
Message-ID: <20211015174045.GI174703@worktop.programming.kicks-ass.net>
In-Reply-To: <20211015124654.18093-4-vincent.guittot@linaro.org>
On Fri, Oct 15, 2021 at 02:46:53PM +0200, Vincent Guittot wrote:
> Decay max_newidle_lb_cost only when it has not been updated for a while
> and ensure to not decay a recently changed value.
I was thinking of something more along these lines; of course, I have no
idea how well it actually behaves.
Index: linux-2.6/include/linux/sched/topology.h
===================================================================
--- linux-2.6.orig/include/linux/sched/topology.h
+++ linux-2.6/include/linux/sched/topology.h
@@ -98,7 +98,6 @@ struct sched_domain {
/* idle_balance() stats */
u64 max_newidle_lb_cost;
- unsigned long next_decay_max_lb_cost;
u64 avg_scan_cost; /* select_idle_sibling */
Index: linux-2.6/kernel/sched/fair.c
===================================================================
--- linux-2.6.orig/kernel/sched/fair.c
+++ linux-2.6/kernel/sched/fair.c
@@ -10241,6 +10241,17 @@ void update_max_interval(void)
}
/*
+ * Asymmetric IIR filter, 1/4th down, 3/4th up.
+ */
+static void update_newidle_cost(u64 *cost, u64 new)
+{
+ s64 diff = new - *cost;
+ if (diff > 0)
+ diff *= 3;
+ *cost += diff / 4;
+}
+
+/*
* It checks each scheduling domain to see if it is due to be balanced,
* and initiates a balancing operation if so.
*
@@ -10256,33 +10267,18 @@ static void rebalance_domains(struct rq
/* Earliest time when we have to do rebalance again */
unsigned long next_balance = jiffies + 60*HZ;
int update_next_balance = 0;
- int need_serialize, need_decay = 0;
- u64 max_cost = 0;
+ int need_serialize;
rcu_read_lock();
for_each_domain(cpu, sd) {
- /*
- * Decay the newidle max times here because this is a regular
- * visit to all the domains. Decay ~1% per second.
- */
- if (time_after(jiffies, sd->next_decay_max_lb_cost)) {
- sd->max_newidle_lb_cost =
- (sd->max_newidle_lb_cost * 253) / 256;
- sd->next_decay_max_lb_cost = jiffies + HZ;
- need_decay = 1;
- }
- max_cost += sd->max_newidle_lb_cost;
/*
* Stop the load balance at this level. There is another
* CPU in our sched group which is doing load balancing more
* actively.
*/
- if (!continue_balancing) {
- if (need_decay)
- continue;
+ if (!continue_balancing)
break;
- }
interval = get_sd_balance_interval(sd, busy);
@@ -10313,14 +10309,7 @@ out:
update_next_balance = 1;
}
}
- if (need_decay) {
- /*
- * Ensure the rq-wide value also decays but keep it at a
- * reasonable floor to avoid funnies with rq->avg_idle.
- */
- rq->max_idle_balance_cost =
- max((u64)sysctl_sched_migration_cost, max_cost);
- }
+
rcu_read_unlock();
/*
@@ -10909,8 +10898,7 @@ static int newidle_balance(struct rq *th
t1 = sched_clock_cpu(this_cpu);
domain_cost = t1 - t0;
- if (domain_cost > sd->max_newidle_lb_cost)
- sd->max_newidle_lb_cost = domain_cost;
+ update_newidle_cost(&sd->max_newidle_lb_cost, domain_cost);
curr_cost += domain_cost;
t0 = t1;
@@ -10930,8 +10918,7 @@ static int newidle_balance(struct rq *th
raw_spin_rq_lock(this_rq);
- if (curr_cost > this_rq->max_idle_balance_cost)
- this_rq->max_idle_balance_cost = curr_cost;
+ update_newidle_cost(&this_rq->max_idle_balance_cost, curr_cost);
/*
* While browsing the domains, we released the rq lock, a task could
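
To get a feel for how the asymmetric filter in the patch above responds to samples, here is a minimal userspace sketch. It assumes uint64_t/int64_t as stand-ins for the kernel's u64/s64, and feed_samples() is a hypothetical helper that is not part of the patch.

```c
#include <stdint.h>

/*
 * Userspace copy of the filter from the patch: move 3/4 of the way
 * towards a higher sample, but only 1/4 of the way towards a lower one.
 */
static void update_newidle_cost(uint64_t *cost, uint64_t new_cost)
{
	/* Unsigned subtraction wraps; the cast recovers the signed difference. */
	int64_t diff = (int64_t)(new_cost - *cost);

	if (diff > 0)
		diff *= 3;

	/* Signed division truncates towards zero, so |diff| < 4 can yield 0. */
	*cost += diff / 4;
}

/* Hypothetical helper: apply the same sample n times, return the result. */
static uint64_t feed_samples(uint64_t cost, uint64_t sample, int n)
{
	while (n-- > 0)
		update_newidle_cost(&cost, sample);
	return cost;
}
```

With these definitions, feed_samples(1000, 2000, 1) returns 1750 and feed_samples(2000, 1000, 1) also returns 1750: a single higher sample pulls the estimate most of the way up, while a single lower sample only drops it by a quarter of the difference. Because of the truncating division, a value within 3 of the sample stops moving; with costs measured in nanoseconds that is unlikely to matter, but it is a property of this sketch worth keeping in mind.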