From: Qais Yousef <qais.yousef@arm.com>
To: Mel Gorman <mgorman@suse.de>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Randy Dunlap <rdunlap@infradead.org>,
Jonathan Corbet <corbet@lwn.net>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>,
Luis Chamberlain <mcgrof@kernel.org>,
Kees Cook <keescook@chromium.org>,
Iurii Zaikin <yzaikin@google.com>,
Quentin Perret <qperret@google.com>,
Valentin Schneider <valentin.schneider@arm.com>,
Patrick Bellasi <patrick.bellasi@matbug.net>,
Pavan Kondeti <pkondeti@codeaurora.org>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org, chris.redpath@arm.com,
lukasz.luba@arm.com
Subject: Re: [PATCH 1/2] sched/uclamp: Add a new sysctl to control RT default boost value
Date: Tue, 16 Jun 2020 12:08:26 +0100
Message-ID: <20200616110824.dgkkbyapn3io6wik@e107158-lin>
In-Reply-To: <20200611105811.5q5rga2cmy6ypq7e@e107158-lin.cambridge.arm.com>
On 06/11/20 11:58, Qais Yousef wrote:
[...]
>
> nouclam nouclamp uclam uclamp uclamp.disable uclamp uclamp uclamp
> nouclamp recompile uclamp uclamp2 uclamp.disabled opt opt2 opt.disabled
> Hmean send-64 158.07 ( 0.00%) 156.99 * -0.68%* 163.83 * 3.65%* 160.97 * 1.83%* 163.93 * 3.71%* 159.62 * 0.98%* 161.79 * 2.36%* 161.14 * 1.94%*
> Hmean send-128 314.86 ( 0.00%) 314.41 * -0.14%* 329.05 * 4.51%* 322.88 * 2.55%* 327.88 * 4.14%* 317.56 * 0.86%* 320.72 * 1.86%* 319.62 * 1.51%*
> Hmean send-256 629.98 ( 0.00%) 625.78 * -0.67%* 652.67 * 3.60%* 639.98 * 1.59%* 643.99 * 2.22%* 631.96 * 0.31%* 635.75 * 0.92%* 644.10 * 2.24%*
> Hmean send-1024 2465.04 ( 0.00%) 2452.29 * -0.52%* 2554.66 * 3.64%* 2509.60 * 1.81%* 2540.71 * 3.07%* 2495.82 * 1.25%* 2490.50 * 1.03%* 2509.86 * 1.82%*
> Hmean send-2048 4717.57 ( 0.00%) 4713.17 * -0.09%* 4923.98 * 4.38%* 4811.01 * 1.98%* 4881.87 * 3.48%* 4793.82 * 1.62%* 4820.28 * 2.18%* 4824.60 * 2.27%*
> Hmean send-3312 7412.33 ( 0.00%) 7433.42 * 0.28%* 7717.76 * 4.12%* 7522.97 * 1.49%* 7620.99 * 2.82%* 7522.89 * 1.49%* 7614.51 * 2.73%* 7568.51 * 2.11%*
> Hmean send-4096 9021.55 ( 0.00%) 8988.71 * -0.36%* 9337.62 * 3.50%* 9075.49 * 0.60%* 9258.34 * 2.62%* 9117.17 * 1.06%* 9175.85 * 1.71%* 9079.50 * 0.64%*
> Hmean send-8192 15370.36 ( 0.00%) 15467.63 * 0.63%* 15999.52 * 4.09%* 15467.80 * 0.63%* 15978.69 * 3.96%* 15619.84 * 1.62%* 15395.09 * 0.16%* 15779.73 * 2.66%*
> Hmean send-16384 26512.35 ( 0.00%) 26498.18 * -0.05%* 26931.86 * 1.58%* 26513.18 * 0.00%* 26873.98 * 1.36%* 26456.38 * -0.21%* 26467.77 * -0.17%* 26975.04 * 1.75%*
I have attempted a few other things after this.
As pointed out above, with 5.7-rc7 I can't see a regression.
The machine I'm testing on is a 2-socket Xeon E5 with 2x10 cores (40 CPUs).
If I switch to 5.6, I can see a drop (each run performed twice):
nouclamp nouclamp2 uclamp uclamp2
Hmean send-64 162.43 ( 0.00%) 161.46 * -0.60%* 157.84 * -2.82%* 158.11 * -2.66%*
Hmean send-128 324.71 ( 0.00%) 323.88 * -0.25%* 314.78 * -3.06%* 314.94 * -3.01%*
Hmean send-256 641.55 ( 0.00%) 640.22 * -0.21%* 628.67 * -2.01%* 631.79 * -1.52%*
Hmean send-1024 2525.28 ( 0.00%) 2520.31 * -0.20%* 2448.26 * -3.05%* 2497.15 * -1.11%*
Hmean send-2048 4836.14 ( 0.00%) 4827.47 * -0.18%* 4712.08 * -2.57%* 4757.70 * -1.62%*
Hmean send-3312 7540.83 ( 0.00%) 7603.14 * 0.83%* 7425.45 * -1.53%* 7499.87 * -0.54%*
Hmean send-4096 9124.53 ( 0.00%) 9224.90 * 1.10%* 8948.82 * -1.93%* 9087.20 * -0.41%*
Hmean send-8192 15589.67 ( 0.00%) 15768.82 * 1.15%* 15486.35 * -0.66%* 15594.53 * 0.03%*
Hmean send-16384 26386.47 ( 0.00%) 26683.64 * 1.13%* 25752.25 * -2.40%* 26609.64 * 0.85%*
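For readers checking the numbers: the Hmean rows are presumably harmonic means
of the per-iteration throughput, and the starred percentages appear to be the
relative change against the first (baseline) column. A minimal sketch of both
calculations follows. The function names hmean() and pct_change() are
hypothetical helpers for illustration, not part of any benchmark tool.

```c
#include <assert.h>
#include <stddef.h>

/* Harmonic mean of n positive samples: n / sum(1 / x_i). */
double hmean(const double *samples, size_t n)
{
    double inv_sum = 0.0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        assert(samples[i] > 0.0);
        inv_sum += 1.0 / samples[i];
    }
    return (double)n / inv_sum;
}

/* Relative change of value against baseline, in percent. */
double pct_change(double baseline, double value)
{
    assert(baseline != 0.0);
    return (value - baseline) / baseline * 100.0;
}
```

For the send-64 row above, pct_change(162.43, 157.84) comes out at roughly
-2.8, in line with the reported drop. The last digit can differ slightly
because the table values are already rounded.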
If I apply the 2 patches from my previous email, with uclamp enabled I see
nouclamp nouclamp2 uclamp-opt uclamp-opt2
Hmean send-64 162.43 ( 0.00%) 161.46 * -0.60%* 159.84 * -1.60%* 160.79 * -1.01%*
Hmean send-128 324.71 ( 0.00%) 323.88 * -0.25%* 318.44 * -1.93%* 321.88 * -0.87%*
Hmean send-256 641.55 ( 0.00%) 640.22 * -0.21%* 633.54 * -1.25%* 640.43 * -0.17%*
Hmean send-1024 2525.28 ( 0.00%) 2520.31 * -0.20%* 2497.47 * -1.10%* 2522.00 * -0.13%*
Hmean send-2048 4836.14 ( 0.00%) 4827.47 * -0.18%* 4773.63 * -1.29%* 4825.31 * -0.22%*
Hmean send-3312 7540.83 ( 0.00%) 7603.14 * 0.83%* 7512.92 * -0.37%* 7482.66 * -0.77%*
Hmean send-4096 9124.53 ( 0.00%) 9224.90 * 1.10%* 9076.62 * -0.52%* 9175.58 * 0.56%*
Hmean send-8192 15589.67 ( 0.00%) 15768.82 * 1.15%* 15466.02 * -0.79%* 15792.10 * 1.30%*
Hmean send-16384 26386.47 ( 0.00%) 26683.64 * 1.13%* 26234.79 * -0.57%* 26459.95 * 0.28%*
This shows that on this machine, the system is slowed down by bad D$ (data
cache) behavior on accesses to rq->uclamp[].bucket[] and p->uclamp{_rq}[].
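To make the D$ concern concrete, here is a rough userspace model of how many
cache lines the per-rq bucket arrays can span. Everything in it is an
assumption for illustration: the *_model struct layouts, the 40-byte stand-in
for the fields that precede the clamp data, the bucket count of 5 and the
64-byte line size. It is not the kernel's actual layout.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE_BYTES 64 /* assumed line size */
#define UCLAMP_CNT        2 /* min and max clamp */
#define UCLAMP_BUCKETS    5 /* assumed bucket count */

/* Simplified stand-in for one bucket: a clamp value and a task refcount. */
struct uclamp_bucket_model {
    unsigned long value;
    unsigned long tasks;
};

/* Simplified stand-in for the per-rq, per-clamp state. */
struct uclamp_rq_model {
    unsigned int value;
    struct uclamp_bucket_model bucket[UCLAMP_BUCKETS];
};

/* Simplified stand-in for a runqueue with unrelated fields in front. */
struct rq_model {
    char other_fields[40]; /* hypothetical preceding members */
    struct uclamp_rq_model uclamp[UCLAMP_CNT];
};

/* Number of cache lines touched by the byte range [offset, offset + size). */
size_t cache_lines_spanned(size_t offset, size_t size, size_t line_bytes)
{
    assert(line_bytes > 0);
    if (size == 0)
        return 0;
    return (offset + size - 1) / line_bytes - offset / line_bytes + 1;
}

/* Cache lines covered by the clamp data inside the model runqueue. */
size_t uclamp_lines_in_rq_model(void)
{
    return cache_lines_spanned(offsetof(struct rq_model, uclamp),
                               sizeof(((struct rq_model *)0)->uclamp),
                               CACHE_LINE_BYTES);
}
```

Every enqueue and dequeue that walks the buckets may touch several of these
lines, so where the arrays sit relative to other hot fields can plausibly
change the miss rate.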
If I disable uclamp using the static key I get
nouclamp nouclamp2 uclamp-opt.disabled uclamp-opt.disabled2
Hmean send-64 162.43 ( 0.00%) 161.46 * -0.60%* 161.21 * -0.75%* 161.05 * -0.85%*
Hmean send-128 324.71 ( 0.00%) 323.88 * -0.25%* 321.09 * -1.11%* 319.72 * -1.54%*
Hmean send-256 641.55 ( 0.00%) 640.22 * -0.21%* 637.37 * -0.65%* 637.82 * -0.58%*
Hmean send-1024 2525.28 ( 0.00%) 2520.31 * -0.20%* 2510.07 * -0.60%* 2504.99 * -0.80%*
Hmean send-2048 4836.14 ( 0.00%) 4827.47 * -0.18%* 4795.29 * -0.84%* 4788.99 * -0.97%*
Hmean send-3312 7540.83 ( 0.00%) 7603.14 * 0.83%* 7490.27 * -0.67%* 7498.56 * -0.56%*
Hmean send-4096 9124.53 ( 0.00%) 9224.90 * 1.10%* 9108.73 * -0.17%* 9196.45 * 0.79%*
Hmean send-8192 15589.67 ( 0.00%) 15768.82 * 1.15%* 15649.50 * 0.38%* 16101.68 * 3.28%*
Hmean send-16384 26386.47 ( 0.00%) 26683.64 * 1.13%* 26435.38 * 0.19%* 27199.11 * 3.08%*
After this, I decided to check whether the regression is observed all the way
up to 5.7-rc7.
For 5.7-rc1 I get (comparing against 5.6-nouclamp)
nouclamp nouclamp2 uclamp uclamp2
Hmean send-64 162.43 ( 0.00%) 161.46 * -0.60%* 155.56 * -4.23%* 156.72 * -3.52%*
Hmean send-128 324.71 ( 0.00%) 323.88 * -0.25%* 311.68 * -4.01%* 312.63 * -3.72%*
Hmean send-256 641.55 ( 0.00%) 640.22 * -0.21%* 616.03 * -3.98%* 620.83 * -3.23%*
Hmean send-1024 2525.28 ( 0.00%) 2520.31 * -0.20%* 2441.92 * -3.30%* 2433.83 * -3.62%*
Hmean send-2048 4836.14 ( 0.00%) 4827.47 * -0.18%* 4698.42 * -2.85%* 4682.22 * -3.18%*
Hmean send-3312 7540.83 ( 0.00%) 7603.14 * 0.83%* 7379.37 * -2.14%* 7354.82 * -2.47%*
Hmean send-4096 9124.53 ( 0.00%) 9224.90 * 1.10%* 8797.21 * -3.59%* 8815.65 * -3.39%*
Hmean send-8192 15589.67 ( 0.00%) 15768.82 * 1.15%* 15009.19 * -3.72%* 15065.16 * -3.36%*
Hmean send-16384 26386.47 ( 0.00%) 26683.64 * 1.13%* 25829.20 * -2.11%* 25783.17 * -2.29%*
For 5.7-rc2, the overhead disappears again (against 5.6-nouclamp)
nouclamp nouclamp2 uclamp uclamp2
Hmean send-64 162.43 ( 0.00%) 161.46 * -0.60%* 162.97 * 0.34%* 163.31 * 0.54%*
Hmean send-128 324.71 ( 0.00%) 323.88 * -0.25%* 323.94 * -0.24%* 325.74 * 0.32%*
Hmean send-256 641.55 ( 0.00%) 640.22 * -0.21%* 641.82 * 0.04%* 645.11 * 0.56%*
Hmean send-1024 2525.28 ( 0.00%) 2520.31 * -0.20%* 2522.74 * -0.10%* 2535.63 * 0.41%*
Hmean send-2048 4836.14 ( 0.00%) 4827.47 * -0.18%* 4836.74 * 0.01%* 4838.62 * 0.05%*
Hmean send-3312 7540.83 ( 0.00%) 7603.14 * 0.83%* 7635.31 * 1.25%* 7613.91 * 0.97%*
Hmean send-4096 9124.53 ( 0.00%) 9224.90 * 1.10%* 9198.58 * 0.81%* 9161.53 * 0.41%*
Hmean send-8192 15589.67 ( 0.00%) 15768.82 * 1.15%* 15804.47 * 1.38%* 15755.91 * 1.07%*
Hmean send-16384 26386.47 ( 0.00%) 26683.64 * 1.13%* 26649.29 * 1.00%* 26677.46 * 1.10%*
I stopped here, to be honest. I thought maybe NUMA scheduling is making the
uclamp accesses more expensive in certain patterns, so I tried with
numactl -N 0 (using 5.7-rc1):
nouclamp nouclamp2 uclamp-N0-1 uclamp-N0-2
Hmean send-64 162.43 ( 0.00%) 161.46 * -0.60%* 156.26 * -3.80%* 156.00 * -3.96%*
Hmean send-128 324.71 ( 0.00%) 323.88 * -0.25%* 312.20 * -3.85%* 312.94 * -3.63%*
Hmean send-256 641.55 ( 0.00%) 640.22 * -0.21%* 620.29 * -3.31%* 619.25 * -3.48%*
Hmean send-1024 2525.28 ( 0.00%) 2520.31 * -0.20%* 2437.59 * -3.47%* 2433.94 * -3.62%*
Hmean send-2048 4836.14 ( 0.00%) 4827.47 * -0.18%* 4671.28 * -3.41%* 4714.49 * -2.52%*
Hmean send-3312 7540.83 ( 0.00%) 7603.14 * 0.83%* 7355.86 * -2.45%* 7387.51 * -2.03%*
Hmean send-4096 9124.53 ( 0.00%) 9224.90 * 1.10%* 8793.02 * -3.63%* 8883.88 * -2.64%*
Hmean send-8192 15589.67 ( 0.00%) 15768.82 * 1.15%* 14898.76 * -4.43%* 14958.19 * -4.05%*
Hmean send-16384 26386.47 ( 0.00%) 26683.64 * 1.13%* 25745.40 * -2.43%* 25800.01 * -2.22%*
And it had no effect. Interestingly, Lukasz can see an improvement if he tries
something similar on his machine.
Do we have any history of code/data layout affecting the performance of the
hot path? On the Juno board (octa-core big.LITTLE Arm platform), I could make
the overhead disappear with a simple code shuffle (for perf bench sched pipe).
I have tried putting the rq->uclamp[].bucket[] structures into their own
per-CPU variable, since the rq is read by many CPUs and I thought that might
lead to bad cache patterns, given that the uclamp data is mostly read by the
owning CPU, but no luck with this approach.
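The per-CPU experiment can be pictured with a small userspace model: the clamp
refcounts live in their own cache-line-aligned per-CPU slots instead of inside
the shared runqueue structure. This is only a sketch. The array uclamp_pcpu[],
the sizes and the helper names are hypothetical, and in the kernel the storage
would presumably be declared with the per-CPU macros rather than a plain array.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_MODEL     4 /* assumed CPU count for the model */
#define CACHE_LINE_BYTES 64 /* assumed line size */
#define UCLAMP_CNT        2
#define UCLAMP_BUCKETS    5

/* One slot per CPU, aligned so that no two CPUs share a cache line. */
struct uclamp_percpu_model {
    _Alignas(CACHE_LINE_BYTES) unsigned long tasks[UCLAMP_CNT][UCLAMP_BUCKETS];
};

struct uclamp_percpu_model uclamp_pcpu[NR_CPUS_MODEL];

/* Account one more runnable task in a bucket; returns the new count. */
unsigned long uclamp_pcpu_inc(int cpu, int clamp_id, int bucket_id)
{
    assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
    assert(clamp_id >= 0 && clamp_id < UCLAMP_CNT);
    assert(bucket_id >= 0 && bucket_id < UCLAMP_BUCKETS);
    return ++uclamp_pcpu[cpu].tasks[clamp_id][bucket_id];
}

/* Nonzero if the slots of the two CPUs overlap in at least one cache line. */
int uclamp_pcpu_share_line(int cpu_a, int cpu_b)
{
    uintptr_t a = (uintptr_t)&uclamp_pcpu[cpu_a];
    uintptr_t b = (uintptr_t)&uclamp_pcpu[cpu_b];
    uintptr_t a_first = a / CACHE_LINE_BYTES;
    uintptr_t a_last = (a + sizeof(uclamp_pcpu[cpu_a]) - 1) / CACHE_LINE_BYTES;
    uintptr_t b_first = b / CACHE_LINE_BYTES;
    uintptr_t b_last = (b + sizeof(uclamp_pcpu[cpu_b]) - 1) / CACHE_LINE_BYTES;

    return a_first <= b_last && b_first <= a_last;
}
```

With this layout, updates from one CPU never dirty a line that holds another
CPU's clamp data. As noted above, that separation alone did not recover the
lost throughput on this machine.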
I am working on a proper static key patch now that disables uclamp by default
and only enables it if userspace attempts to modify any of the knobs it
provides; once that happens, we switch it on and keep it on. I am testing it
at the moment.
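The intended behavior can be sketched in userspace with a plain flag standing
in for the static key. All names here are hypothetical and the logic is an
assumption about the design described above, not the actual patch. A real
static key patches the branch in the kernel text, whereas this model pays for
a load and a compare on every call.

```c
#include <assert.h>
#include <stdbool.h>

#define UCLAMP_MAX_VALUE 1024 /* assumed upper bound for a clamp value */

static bool uclamp_used_model;         /* off by default, never cleared */
static unsigned int uclamp_knob_model; /* last value written by "userspace" */
unsigned int rq_uclamp_tasks_model;    /* work done only when enabled */

bool uclamp_is_used_model(void)
{
    return uclamp_used_model;
}

/* Hot path: returns immediately until uclamp has been switched on. */
void uclamp_rq_inc_model(void)
{
    if (!uclamp_used_model)
        return;
    rq_uclamp_tasks_model++;
}

void uclamp_rq_dec_model(void)
{
    if (!uclamp_used_model)
        return;
    assert(rq_uclamp_tasks_model > 0);
    rq_uclamp_tasks_model--;
}

/* Knob write: the first valid write switches uclamp on for good. */
int uclamp_set_knob_model(unsigned int value)
{
    if (value > UCLAMP_MAX_VALUE)
        return -1; /* rejected writes do not enable anything */
    uclamp_knob_model = value;
    uclamp_used_model = true;
    return 0;
}
```

Until the first successful uclamp_set_knob_model() call, the enqueue and
dequeue helpers do no bookkeeping at all, which is the property the patch is
after for systems that never touch the knobs.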
Thanks
--
Qais Yousef