From: Qais Yousef <qais.yousef@arm.com>
To: Mel Gorman <mgorman@suse.de>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Randy Dunlap <rdunlap@infradead.org>,
Jonathan Corbet <corbet@lwn.net>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>,
Luis Chamberlain <mcgrof@kernel.org>,
Kees Cook <keescook@chromium.org>,
Iurii Zaikin <yzaikin@google.com>,
Quentin Perret <qperret@google.com>,
Valentin Schneider <valentin.schneider@arm.com>,
Patrick Bellasi <patrick.bellasi@matbug.net>,
Pavan Kondeti <pkondeti@codeaurora.org>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 1/2] sched/uclamp: Add a new sysctl to control RT default boost value
Date: Wed, 3 Jun 2020 13:41:13 +0100
Message-ID: <20200603124112.w5stb7v2z3kzcze3@e107158-lin.cambridge.arm.com>
In-Reply-To: <20200603094036.GF3070@suse.de>
On 06/03/20 10:40, Mel Gorman wrote:
> On Tue, Jun 02, 2020 at 06:46:00PM +0200, Dietmar Eggemann wrote:
> > On 29.05.20 12:08, Mel Gorman wrote:
> > > On Thu, May 28, 2020 at 06:11:12PM +0200, Peter Zijlstra wrote:
> > >>> FWIW, I think you're referring to Mel's notice in OSPM regarding the overhead.
> > >>> Trying to see what goes on in there.
> > >>
> > >> Indeed, that one. The fact that regular distros cannot enable this
> > >> feature due to performance overhead is unfortunate. It means there is a
> > >> lot less potential for this stuff.
> > >
> > > During that talk, I was vague about the cost, admitted I had not looked
> > > too closely at mainline performance and had since deleted the data given
> > > that the problem was first spotted in early April. If I heard someone
> > > else making statements like I did at the talk, I would consider it a bit
> > > vague, potentially FUD, possibly wrong and worth rechecking myself. In
> > > terms of distributions "cannot enable this", we could but I was unwilling
> > > to pay the cost for a feature no one has asked for yet. If they had, I
> > > would endeavour to put it behind static branches and disable it by default
> > > (like what happened for PSI). I was contacted offlist about my comments
> > > at OSPM and gathered new data to respond properly. For the record, here
> > > is an edited version of my response;
> >
> > [...]
> >
> > I ran these tests on 'Ubuntu 18.04 Desktop' on Intel E5-2690 v2
> > (2 sockets * 10 cores * 2 threads) with powersave governor as:
> >
> > $ numactl -N 0 ./run-mmtests.sh XXX
> >
> > w/ config-network-netperf-unbound.
> >
> > Running w/o 'numactl -N 0' gives slightly worse results.
> >
> > without-clamp     : CONFIG_UCLAMP_TASK is not set
> > with-clamp        : CONFIG_UCLAMP_TASK=y,
> >                     CONFIG_UCLAMP_TASK_GROUP is not set
> > with-clamp-tskgrp : CONFIG_UCLAMP_TASK=y,
> >                     CONFIG_UCLAMP_TASK_GROUP=y
> >
> >
> > netperf-udp
> >                          ./5.7.0-rc7         ./5.7.0-rc7         ./5.7.0-rc7
> >                        without-clamp          with-clamp   with-clamp-tskgrp
> >
> > Hmean send-64       153.62 (  0.00%)    151.80 * -1.19%*    155.60 *  1.28%*
> > Hmean send-128      306.77 (  0.00%)    306.27 * -0.16%*    309.39 *  0.85%*
> > Hmean send-256      608.54 (  0.00%)    604.28 * -0.70%*    613.42 *  0.80%*
> > Hmean send-1024    2395.80 (  0.00%)   2365.67 * -1.26%*   2409.50 *  0.57%*
> > Hmean send-2048    4608.70 (  0.00%)   4544.02 * -1.40%*   4665.96 *  1.24%*
> > Hmean send-3312    7223.97 (  0.00%)   7158.88 * -0.90%*   7331.23 *  1.48%*
> > Hmean send-4096    8729.53 (  0.00%)   8598.78 * -1.50%*   8860.47 *  1.50%*
> > Hmean send-8192   14961.77 (  0.00%)  14418.92 * -3.63%*  14908.36 * -0.36%*
> > Hmean send-16384  25799.50 (  0.00%)  25025.64 * -3.00%*  25831.20 *  0.12%*
> > Hmean recv-64       153.62 (  0.00%)    151.80 * -1.19%*    155.60 *  1.28%*
> > Hmean recv-128      306.77 (  0.00%)    306.27 * -0.16%*    309.39 *  0.85%*
> > Hmean recv-256      608.54 (  0.00%)    604.28 * -0.70%*    613.42 *  0.80%*
> > Hmean recv-1024    2395.80 (  0.00%)   2365.67 * -1.26%*   2409.50 *  0.57%*
> > Hmean recv-2048    4608.70 (  0.00%)   4544.02 * -1.40%*   4665.95 *  1.24%*
> > Hmean recv-3312    7223.97 (  0.00%)   7158.88 * -0.90%*   7331.23 *  1.48%*
> > Hmean recv-4096    8729.53 (  0.00%)   8598.78 * -1.50%*   8860.47 *  1.50%*
> > Hmean recv-8192   14961.61 (  0.00%)  14418.88 * -3.63%*  14908.30 * -0.36%*
> > Hmean recv-16384  25799.39 (  0.00%)  25025.49 * -3.00%*  25831.00 *  0.12%*
> >
> > netperf-tcp
> >
> > Hmean 64            818.65 (  0.00%)    812.98 * -0.69%*    826.17 *  0.92%*
> > Hmean 128          1569.55 (  0.00%)   1555.79 * -0.88%*   1586.94 *  1.11%*
> > Hmean 256          2952.86 (  0.00%)   2915.07 * -1.28%*   2968.15 *  0.52%*
> > Hmean 1024        10425.91 (  0.00%)  10296.68 * -1.24%*  10418.38 * -0.07%*
> > Hmean 2048        17454.51 (  0.00%)  17369.57 * -0.49%*  17419.24 * -0.20%*
> > Hmean 3312        22509.95 (  0.00%)  22229.69 * -1.25%*  22373.32 * -0.61%*
> > Hmean 4096        25033.23 (  0.00%)  24859.59 * -0.69%*  24912.50 * -0.48%*
> > Hmean 8192        32080.51 (  0.00%)  31744.51 * -1.05%*  31800.45 * -0.87%*
> > Hmean 16384       36531.86 (  0.00%)  37064.68 *  1.46%*  37397.71 *  2.37%*
> >
> > The diffs are smaller than on openSUSE Leap 15.1 and some of the
> > uclamp taskgroup results are better?
> >
>
> I don't see the stddev and coeff but these look close to borderline.
> Sure, they are marked with a * so it passed a significance test but it's
> still a very marginal difference for netperf. It's possible that the
> systemd configurations differ in some way that is significant for uclamp
> but I don't know what that is.

Hmm, so what you're saying is that Dietmar didn't reproduce the problem
you're observing? I was hoping to use that to dig more into it.

>
> > With this test setup we now can play with the uclamp code in
> > enqueue_task() and dequeue_task().
> >
>
> That is still true. An annotated perf profile should tell you if the
> uclamp code is being heavily used or if it's bailing early but it's also
> possible that uclamp overhead is not a big deal on your particular
> machine.
>
> The possibility that either the distribution, the machine or both are
> critical for detecting a problem with uclamp may explain why any overhead
> was missed. Even if it is marginal, it still makes sense to minimise the
> amount of uclamp code that is executed if no limit is specified for tasks.

One speculation I have is that the accesses to struct uclamp_rq are causing
bad cache behavior in your case. Your mmtests description of netperf says that
it is sensitive to cacheline bouncing.

Looking at struct rq, uclamp_rq spans 2 cachelines:

	/* --- cacheline 1 boundary (64 bytes) --- */
	struct uclamp_rq uclamp[2]; /* 64 96 */
	/* --- cacheline 2 boundary (128 bytes) was 32 bytes ago --- */
	unsigned int uclamp_flags; /* 160 4 */

	/* XXX 28 bytes hole, try to pack */

Changing struct uclamp_bucket to use unsigned int instead of unsigned long
helps fit it all in a single cacheline:

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index db3a57675ccf..63b5397a1708 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -833,8 +833,8 @@ extern void rto_push_irq_work_func(struct irq_work *work);
* clamp value.
*/
struct uclamp_bucket {
- unsigned long value : bits_per(SCHED_CAPACITY_SCALE);
- unsigned long tasks : BITS_PER_LONG - bits_per(SCHED_CAPACITY_SCALE);
+ unsigned int value : bits_per(SCHED_CAPACITY_SCALE);
+ unsigned int tasks : 32 - bits_per(SCHED_CAPACITY_SCALE);
};
/*

	/* --- cacheline 1 boundary (64 bytes) --- */
	struct uclamp_rq uclamp[2]; /* 64 48 */
	unsigned int uclamp_flags; /* 112 4 */

Is it something worth experimenting with?
Thanks
--
Qais Yousef