From: Vincent Guittot <vincent.guittot@linaro.org>
To: Qais Yousef <qais.yousef@arm.com>
Cc: Mel Gorman <mgorman@suse.de>,
Patrick Bellasi <patrick.bellasi@matbug.net>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Randy Dunlap <rdunlap@infradead.org>,
Jonathan Corbet <corbet@lwn.net>,
Juri Lelli <juri.lelli@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>,
Luis Chamberlain <mcgrof@kernel.org>,
Kees Cook <keescook@chromium.org>,
Iurii Zaikin <yzaikin@google.com>,
Quentin Perret <qperret@google.com>,
Valentin Schneider <valentin.schneider@arm.com>,
Pavan Kondeti <pkondeti@codeaurora.org>,
linux-doc@vger.kernel.org,
linux-kernel <linux-kernel@vger.kernel.org>,
linux-fs <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 1/2] sched/uclamp: Add a new sysctl to control RT default boost value
Date: Tue, 9 Jun 2020 19:10:07 +0200
Message-ID: <CAKfTPtCKS-2RoaMHhKGigjzc7dhXhx0z3dYNQLD3Q9aRC_tCnw@mail.gmail.com>
In-Reply-To: <20200608123102.6sdhdhit7lac5cfl@e107158-lin.cambridge.arm.com>
On Mon, 8 Jun 2020 at 14:31, Qais Yousef <qais.yousef@arm.com> wrote:
>
> On 06/04/20 14:14, Vincent Guittot wrote:
>
> [...]
>
> > I have tried your patch and I don't see any difference compared to
> > previous tests. Let me give you more details of my setup:
> > I create 3 levels of cgroups and usually run the tests at all 4 levels
> > (including root). The results above are for the root level.
> >
> > But I see a difference at other levels:
> >
> >                           root           level 1        level 2        level 3
> >
> > /w patch uclamp disable   50097          46615          43806          41078
> > tip uclamp enable         48706(-2.78%)  45583(-2.21%)  42851(-2.18%)  40313(-1.86%)
> > /w patch uclamp enable    48882(-2.43%)  45774(-1.80%)  43108(-1.59%)  40667(-1.00%)
> >
> > Whereas tip with uclamp stays around 2% behind tip without uclamp, the
> > diff of uclamp with your patch tends to decrease when we increase the
> > number of levels.
>
> So I did try to dig more into this, but I think it's either not a good
> reproducer or what we're observing here is uArch level latencies caused by the
> new code that seem to produce a bigger knock on effect than what they really
> are.
>
> First, CONFIG_FAIR_GROUP_SCHED is 'expensive', for some definition of
> expensive..
Yes, enabling CONFIG_FAIR_GROUP_SCHED adds overhead.
>
> *** uclamp disabled/fair group enabled ***
>
> # Executed 50000 pipe operations between two threads
>
> Total time: 0.958 [sec]
>
> 19.177100 usecs/op
> 52145 ops/sec
>
> *** uclamp disabled/fair group disabled ***
>
> # Executed 50000 pipe operations between two threads
> Total time: 0.808 [sec]
>
> 16.176200 usecs/op
> 61819 ops/sec
>
> So there's a 15.6% drop in ops/sec when enabling this option. I think it's good
> to look at the absolute number of usecs/op; fair group adds around
> 3 usecs/op.
>
> I dropped FAIR_GROUP_SCHED from my config to eliminate this overhead and focus
> solely on uclamp overhead.
Have you checked that both tests run at the root level?
Your function-graph log below shows several calls to
update_cfs_group(), which means that the trace was not taken
at the root level but most probably at the 3rd level, and I wonder
whether you used the same setup for running the benchmark above. This
could explain such a huge difference, because I don't see such a
difference on my platform; it is more around 2%:
For uclamp disable / fair group enable / function graph enable: 47994 ops/sec
For uclamp disable / fair group disable / function graph enable: 49107 ops/sec
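
To make the percentages in this thread easier to compare, here is a minimal sketch of the arithmetic behind them. The helper names drop_percent() and usecs_per_op() are made up for illustration and are not part of perf or any other tool; the inputs are the ops/sec figures quoted above.

```c
/*
 * Illustrative helpers only: the names are hypothetical and exist
 * nowhere in the kernel tree or in perf.
 */

/* Relative drop of @measured against @baseline, in percent. */
static double drop_percent(double baseline, double measured)
{
	return (baseline - measured) / baseline * 100.0;
}

/* Convert a throughput in ops/sec into a latency in usecs/op. */
static double usecs_per_op(double ops_per_sec)
{
	return 1e6 / ops_per_sec;
}
```

With these, drop_percent(61819, 52145) gives roughly 15.6 for the quoted numbers, while drop_percent(49107, 47994) gives roughly 2.3 for the numbers above. usecs_per_op(52145) - usecs_per_op(61819) comes out at about 3, matching the quoted 3 usecs/op.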
>
> With uclamp enabled but no fair group I get
>
> *** uclamp enabled/fair group disabled ***
>
> # Executed 50000 pipe operations between two threads
> Total time: 0.856 [sec]
>
> 17.125740 usecs/op
> 58391 ops/sec
>
> The drop is 5.5% in ops/sec. Or 1 usecs/op.
>
> I don't know what's the expectation here. 1 us could be a lot, but I don't
> think we expect the new code to take more than few 100s of ns anyway. If you
> add potential caching effects, reaching 1 us wouldn't be that hard.
>
> Note that in my runs I chose performance governor and use `taskset 0x2` to
You might want to set 2 CPUs in your cpumask instead of 1 in order to
have 1 CPU for each thread
> force running on a big core to make sure the runs are repeatable.
I also use the performance governor but don't pin tasks because I use SMP.
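
For the two-CPU cpumask suggested above, a rough sketch using sched_setaffinity(2) could look like the following. The helper names build_two_cpu_mask() and pin_to_two_cpus() are hypothetical, and which CPU numbers correspond to big cores depends on the platform, so the caller has to choose them.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>

/* Hypothetical helper: fill @set with exactly two CPUs, one per thread. */
static void build_two_cpu_mask(cpu_set_t *set, int cpu_a, int cpu_b)
{
	CPU_ZERO(set);
	CPU_SET(cpu_a, set);
	CPU_SET(cpu_b, set);
}

/*
 * Hypothetical helper: restrict @pid (0 means the calling thread) to the
 * two given CPUs. Returns 0 on success, -1 with errno set on failure,
 * exactly as sched_setaffinity() does.
 */
static int pin_to_two_cpus(pid_t pid, int cpu_a, int cpu_b)
{
	cpu_set_t set;

	build_two_cpu_mask(&set, cpu_a, cpu_b);
	return sched_setaffinity(pid, sizeof(set), &set);
}
```

Calling pin_to_two_cpus(0, 1, 2) before starting the benchmark threads would correspond to a mask of 0x6 with taskset, instead of the single-CPU 0x2.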
>
> On Juno-r2 I managed to scrap most of the 1 us with the below patch. It seems
> there was weird branching behavior that affects the I$ in my case. It'd be good
> to try it out to see if it makes a difference for you.
The performance is slightly worse on my setup:
For uclamp enable / fair group disable / function graph enable: 48413 ops/sec
With the patch below: 47804 ops/sec
>
> The I$ effect is my best educated guess. Perf doesn't catch this path and
> I couldn't convince it to look at cache and branch misses between 2 specific
> points.
>
> Other subtle code shuffling did have weird effect on the result too. One worthy
> one is making uclamp_rq_dec() noinline gains back ~400 ns. Making
> uclamp_rq_inc() noinline *too* cancels this gain out :-/
>
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 0464569f26a7..0835ee20a3c7 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1071,13 +1071,11 @@ static inline void uclamp_rq_dec_id(struct rq *rq, struct task_struct *p,
>
>  static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p)
>  {
> -	enum uclamp_id clamp_id;
> -
>  	if (unlikely(!p->sched_class->uclamp_enabled))
>  		return;
>
> -	for_each_clamp_id(clamp_id)
> -		uclamp_rq_inc_id(rq, p, clamp_id);
> +	uclamp_rq_inc_id(rq, p, UCLAMP_MIN);
> +	uclamp_rq_inc_id(rq, p, UCLAMP_MAX);
>
>  	/* Reset clamp idle holding when there is one RUNNABLE task */
>  	if (rq->uclamp_flags & UCLAMP_FLAG_IDLE)
> @@ -1086,13 +1084,11 @@ static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p)
>
>  static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p)
>  {
> -	enum uclamp_id clamp_id;
> -
>  	if (unlikely(!p->sched_class->uclamp_enabled))
>  		return;
>
> -	for_each_clamp_id(clamp_id)
> -		uclamp_rq_dec_id(rq, p, clamp_id);
> +	uclamp_rq_dec_id(rq, p, UCLAMP_MIN);
> +	uclamp_rq_dec_id(rq, p, UCLAMP_MAX);
>  }
>
>  static inline void
>
>
> FWIW I fail to see activate/deactivate_task in perf record. They don't show up
> on the list which means this micro benchmark doesn't stress them as Mel's test
> does.
Strange because I have been able to trace them.
>
> Worth noting that I did try running the same test on 2 vCPU VirtualBox VM and
> 64 vCPU qemu and I couldn't spot a difference when uclamp was enabled/disabled
> in these 2 environments.
>
> >
> > Besides this, it's also interesting to notice the ~6% perf impact
> > between each level for the same image.
>
> Besides my observation above, I captured this function_graph when
> FAIR_GROUP_SCHED is enabled. What I pasted below is a particularly bad
> deactivation; it's not always that costly.
>
> This run was recorded with uclamp disabled.
>
> I admit I don't know how much of these numbers is ftrace overhead. When trying
> to capture similar runs for uclamp, the numbers didn't add up compared to
> running the test without ftrace generating the graph. If juno is suffering from
> bad branching costs in this path, then I suspect ftrace will amplify this as
> AFAIU it'll cause extra jumps on entry and exit.
>
>
>
> sched-pipe-6532 [001] 9407.276302: funcgraph_entry: | deactivate_task() {
> sched-pipe-6532 [001] 9407.276302: funcgraph_entry: | dequeue_task_fair() {
> sched-pipe-6532 [001] 9407.276303: funcgraph_entry: | update_curr() {
> sched-pipe-6532 [001] 9407.276304: funcgraph_entry: 0.780 us | update_min_vruntime();
> sched-pipe-6532 [001] 9407.276306: funcgraph_entry: | cpuacct_charge() {
> sched-pipe-6532 [001] 9407.276306: funcgraph_entry: 0.820 us | __rcu_read_lock();
> sched-pipe-6532 [001] 9407.276308: funcgraph_entry: 0.740 us | __rcu_read_unlock();
> sched-pipe-6532 [001] 9407.276309: funcgraph_exit: 3.980 us | }
> sched-pipe-6532 [001] 9407.276310: funcgraph_entry: 0.720 us | __rcu_read_lock();
> sched-pipe-6532 [001] 9407.276312: funcgraph_entry: 0.720 us | __rcu_read_unlock();
> sched-pipe-6532 [001] 9407.276313: funcgraph_exit: 9.840 us | }
> sched-pipe-6532 [001] 9407.276314: funcgraph_entry: | __update_load_avg_se() {
> sched-pipe-6532 [001] 9407.276315: funcgraph_entry: 0.720 us | __accumulate_pelt_segments();
> sched-pipe-6532 [001] 9407.276316: funcgraph_exit: 2.260 us | }
> sched-pipe-6532 [001] 9407.276317: funcgraph_entry: | __update_load_avg_cfs_rq() {
> sched-pipe-6532 [001] 9407.276318: funcgraph_entry: 0.860 us | __accumulate_pelt_segments();
> sched-pipe-6532 [001] 9407.276319: funcgraph_exit: 2.340 us | }
> sched-pipe-6532 [001] 9407.276320: funcgraph_entry: 0.760 us | clear_buddies();
> sched-pipe-6532 [001] 9407.276321: funcgraph_entry: 0.800 us | account_entity_dequeue();
> sched-pipe-6532 [001] 9407.276323: funcgraph_entry: 0.720 us | update_cfs_group();
> sched-pipe-6532 [001] 9407.276324: funcgraph_entry: 0.740 us | update_min_vruntime();
> sched-pipe-6532 [001] 9407.276326: funcgraph_entry: 0.720 us | set_next_buddy();
> sched-pipe-6532 [001] 9407.276327: funcgraph_entry: | __update_load_avg_se() {
> sched-pipe-6532 [001] 9407.276328: funcgraph_entry: 0.740 us | __accumulate_pelt_segments();
> sched-pipe-6532 [001] 9407.276329: funcgraph_exit: 2.220 us | }
> sched-pipe-6532 [001] 9407.276330: funcgraph_entry: | __update_load_avg_cfs_rq() {
> sched-pipe-6532 [001] 9407.276331: funcgraph_entry: 0.740 us | __accumulate_pelt_segments();
> sched-pipe-6532 [001] 9407.276332: funcgraph_exit: 2.180 us | }
> sched-pipe-6532 [001] 9407.276333: funcgraph_entry: | update_cfs_group() {
> sched-pipe-6532 [001] 9407.276334: funcgraph_entry: | reweight_entity() {
> sched-pipe-6532 [001] 9407.276335: funcgraph_entry: | update_curr() {
> sched-pipe-6532 [001] 9407.276335: funcgraph_entry: 0.720 us | __calc_delta();
> sched-pipe-6532 [001] 9407.276337: funcgraph_entry: 0.740 us | update_min_vruntime();
> sched-pipe-6532 [001] 9407.276338: funcgraph_exit: 3.560 us | }
> sched-pipe-6532 [001] 9407.276339: funcgraph_entry: 0.720 us | account_entity_dequeue();
> sched-pipe-6532 [001] 9407.276340: funcgraph_entry: 0.720 us | account_entity_enqueue();
> sched-pipe-6532 [001] 9407.276342: funcgraph_exit: 7.860 us | }
> sched-pipe-6532 [001] 9407.276342: funcgraph_exit: 9.280 us | }
> sched-pipe-6532 [001] 9407.276343: funcgraph_entry: | __update_load_avg_se() {
> sched-pipe-6532 [001] 9407.276344: funcgraph_entry: 0.720 us | __accumulate_pelt_segments();
> sched-pipe-6532 [001] 9407.276345: funcgraph_exit: 2.180 us | }
> sched-pipe-6532 [001] 9407.276346: funcgraph_entry: | __update_load_avg_cfs_rq() {
> sched-pipe-6532 [001] 9407.276347: funcgraph_entry: 0.740 us | __accumulate_pelt_segments();
> sched-pipe-6532 [001] 9407.276348: funcgraph_exit: 2.180 us | }
> sched-pipe-6532 [001] 9407.276349: funcgraph_entry: | update_cfs_group() {
> sched-pipe-6532 [001] 9407.276350: funcgraph_entry: | reweight_entity() {
> sched-pipe-6532 [001] 9407.276350: funcgraph_entry: | update_curr() {
> sched-pipe-6532 [001] 9407.276351: funcgraph_entry: 0.740 us | __calc_delta();
> sched-pipe-6532 [001] 9407.276353: funcgraph_entry: 0.720 us | update_min_vruntime();
> sched-pipe-6532 [001] 9407.276354: funcgraph_exit: 3.580 us | }
> sched-pipe-6532 [001] 9407.276355: funcgraph_entry: 0.740 us | account_entity_dequeue();
> sched-pipe-6532 [001] 9407.276356: funcgraph_entry: 0.720 us | account_entity_enqueue();
> sched-pipe-6532 [001] 9407.276358: funcgraph_exit: 7.960 us | }
> sched-pipe-6532 [001] 9407.276358: funcgraph_exit: 9.400 us | }
> sched-pipe-6532 [001] 9407.276360: funcgraph_entry: | __update_load_avg_se() {
> sched-pipe-6532 [001] 9407.276360: funcgraph_entry: 0.740 us | __accumulate_pelt_segments();
> sched-pipe-6532 [001] 9407.276362: funcgraph_exit: 2.220 us | }
> sched-pipe-6532 [001] 9407.276362: funcgraph_entry: | __update_load_avg_cfs_rq() {
> sched-pipe-6532 [001] 9407.276363: funcgraph_entry: 0.740 us | __accumulate_pelt_segments();
> sched-pipe-6532 [001] 9407.276365: funcgraph_exit: 2.160 us | }
> sched-pipe-6532 [001] 9407.276366: funcgraph_entry: | update_cfs_group() {
> sched-pipe-6532 [001] 9407.276367: funcgraph_entry: | reweight_entity() {
> sched-pipe-6532 [001] 9407.276368: funcgraph_entry: | update_curr() {
> sched-pipe-6532 [001] 9407.276368: funcgraph_entry: 0.720 us | __calc_delta();
> sched-pipe-6532 [001] 9407.276370: funcgraph_entry: 0.720 us | update_min_vruntime();
> sched-pipe-6532 [001] 9407.276371: funcgraph_exit: 3.540 us | }
> sched-pipe-6532 [001] 9407.276372: funcgraph_entry: 0.740 us | account_entity_dequeue();
> sched-pipe-6532 [001] 9407.276373: funcgraph_entry: 0.720 us | account_entity_enqueue();
> sched-pipe-6532 [001] 9407.276375: funcgraph_exit: 7.840 us | }
> sched-pipe-6532 [001] 9407.276375: funcgraph_exit: 9.300 us | }
> sched-pipe-6532 [001] 9407.276376: funcgraph_entry: 0.720 us | hrtick_update();
> sched-pipe-6532 [001] 9407.276377: funcgraph_exit: + 75.000 us | }
> sched-pipe-6532 [001] 9407.276378: funcgraph_exit: + 76.700 us | }
>
>
> Cheers
>
> --
> Qais Yousef