From: K Prateek Nayak <kprateek.nayak@amd.com>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
dietmar.eggemann@arm.com, rostedt@goodmis.org,
bsegall@google.com, mgorman@suse.de, bristot@redhat.com,
vschneid@redhat.com, linux-kernel@vger.kernel.org,
parth@linux.ibm.com, qyousef@layalina.io, chris.hyser@oracle.com,
patrick.bellasi@matbug.net, David.Laight@aculab.com,
pjt@google.com, pavel@ucw.cz, tj@kernel.org, qperret@google.com,
tim.c.chen@linux.intel.com, joshdon@google.com, timj@gnu.org,
yu.c.chen@intel.com, youssefesmat@chromium.org,
joel@joelfernandes.org
Subject: Re: [PATCH v9 0/9] Add latency priority for CFS class
Date: Wed, 7 Dec 2022 21:56:04 +0530
Message-ID: <e3fdc51b-19aa-c85d-f51e-16ff9cf64e2a@amd.com>
In-Reply-To: <CAKfTPtDgVT8mGhDbh9Z40769Ju1DMFpL+zu+rEqnYyJRYetmfg@mail.gmail.com>
Hello Vincent,
Thank you for taking a look at the report.
On 11/28/2022 10:49 PM, Vincent Guittot wrote:
> Hi Prateek,
>
> On Mon, 28 Nov 2022 at 12:52, K Prateek Nayak <kprateek.nayak@amd.com> wrote:
>>
>> Hello Vincent,
>>
>> Following are the test results on dual socket Zen3 machine (2 x 64C/128T)
>>
>> tl;dr
>>
>> o All benchmarks with DEFAULT_LATENCY_NICE value are comparable to tip.
>> There is, however, a noticeable dip for unixbench-spawn test case.
>>
>> o With the 2 rbtree approach, I do not see much difference in the
>> hackbench results with varying latency nice value. Tests on v5 did
>> yield noticeable improvements for hackbench.
>> (https://lore.kernel.org/lkml/cd48ebbb-9724-985f-28e3-e558dea07827@amd.com/)
>
> The 2 rbtree approach is the one that was already used in v5. I just
> reran hackbench tests with the latest tip and v6.2-rc7 and I can see a
> large performance improvement for pipe tests on my system (an 8-core
> system). Could you try with a larger number of groups? Like 64, 128,
> and 256 groups?
Ah! My bad. I've rerun hackbench with a larger number of groups and I see a
clear win for pipes with latency nice 19. Hackbench with sockets sees a
small win too.
o pipes
$ perf bench sched messaging -p -l 50000 -g <groups>
latency_nice: 0 19 -20
32-groups: 9.43 (0.00 pct) 6.42 (31.91 pct) 9.75 (-3.39 pct)
64-groups: 21.55 (0.00 pct) 12.97 (39.81 pct) 21.48 (0.32 pct)
128-groups: 41.15 (0.00 pct) 24.18 (41.23 pct) 46.69 (-13.46 pct)
256-groups: 78.87 (0.00 pct) 43.65 (44.65 pct) 78.84 (0.03 pct)
512-groups: 125.48 (0.00 pct) 78.91 (37.11 pct) 136.21 (-8.55 pct)
1024-groups: 292.81 (0.00 pct) 151.36 (48.30 pct) 323.57 (-10.50 pct)
o sockets
$ perf bench sched messaging -l 100000 -g <groups>
latency_nice: 0 19 -20
32-groups: 27.23 (0.00 pct) 27.00 (0.84 pct) 26.92 (1.13 pct)
64-groups: 45.71 (0.00 pct) 44.58 (2.47 pct) 45.86 (-0.32 pct)
128-groups: 79.55 (0.00 pct) 78.22 (1.67 pct) 80.01 (-0.57 pct)
256-groups: 161.41 (0.00 pct) 164.04 (-1.62 pct) 169.57 (-5.05 pct)
512-groups: 326.41 (0.00 pct) 310.00 (5.02 pct) 342.17 (-4.82 pct)
1024-groups: 634.36 (0.00 pct) 633.59 (0.12 pct) 640.05 (-0.89 pct)
Note: All tests were done in NPS1 mode.
>
>>
>> [..snip..]
>>
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> ~ Unixbench - DEFAULT_LATENCY_NICE ~
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> o NPS1
>>
>> Test Metric Parallelism tip latency_nice
>> unixbench-dhry2reg Hmean unixbench-dhry2reg-1 48929419.48 ( 0.00%) 49137039.06 ( 0.42%)
>> unixbench-dhry2reg Hmean unixbench-dhry2reg-512 6275526953.25 ( 0.00%) 6265580479.15 ( -0.16%)
>> unixbench-syscall Amean unixbench-syscall-1 2994319.73 ( 0.00%) 3008596.83 * -0.48%*
>> unixbench-syscall Amean unixbench-syscall-512 7349715.87 ( 0.00%) 7420994.50 * -0.97%*
>> unixbench-pipe Hmean unixbench-pipe-1 2830206.03 ( 0.00%) 2854405.99 * 0.86%*
>> unixbench-pipe Hmean unixbench-pipe-512 326207828.01 ( 0.00%) 328997804.52 * 0.86%*
>> unixbench-spawn Hmean unixbench-spawn-1 6394.21 ( 0.00%) 6367.75 ( -0.41%)
>> unixbench-spawn Hmean unixbench-spawn-512 72700.64 ( 0.00%) 71454.19 * -1.71%*
>> unixbench-execl Hmean unixbench-execl-1 4723.61 ( 0.00%) 4750.59 ( 0.57%)
>> unixbench-execl Hmean unixbench-execl-512 11212.05 ( 0.00%) 11262.13 ( 0.45%)
>>
>> o NPS2
>>
>> Test Metric Parallelism tip latency_nice
>> unixbench-dhry2reg Hmean unixbench-dhry2reg-1 49271512.85 ( 0.00%) 49245260.43 ( -0.05%)
>> unixbench-dhry2reg Hmean unixbench-dhry2reg-512 6267992483.03 ( 0.00%) 6264951100.67 ( -0.05%)
>> unixbench-syscall Amean unixbench-syscall-1 2995885.93 ( 0.00%) 3005975.10 * -0.34%*
>> unixbench-syscall Amean unixbench-syscall-512 7388865.77 ( 0.00%) 7276275.63 * 1.52%*
>> unixbench-pipe Hmean unixbench-pipe-1 2828971.95 ( 0.00%) 2856578.72 * 0.98%*
>> unixbench-pipe Hmean unixbench-pipe-512 326225385.37 ( 0.00%) 328941270.81 * 0.83%*
>> unixbench-spawn Hmean unixbench-spawn-1 6958.71 ( 0.00%) 6954.21 ( -0.06%)
>> unixbench-spawn Hmean unixbench-spawn-512 85443.56 ( 0.00%) 70536.42 * -17.45%* (0.67% vs 0.93% - CoEff var)
>
> I don't expect any perf improvement or regression when the latency
> nice is not changed
This regression can be ignored. Although the results from back-to-back
runs are very stable, I see the results vary when I rebuild the
unixbench binaries on my test setup.
tip latency_nice
unixbench-spawn-512 73489.0 78260.4 (kexec)
unixbench-spawn-512 73332.7 77821.2 (reboot)
unixbench-spawn-512 86207.4 82281.2 (rebuilt + reboot)
I'll go back and look more into the spawn test, because there is
something else at play there, but the other Unixbench results seem
to be stable looking at the rerun.
>
>> unixbench-execl Hmean unixbench-execl-1 4767.99 ( 0.00%) 4752.63 * -0.32%*
>> unixbench-execl Hmean unixbench-execl-512 11250.72 ( 0.00%) 11320.97 ( 0.62%)
>>
>> o NPS4
>>
>> Test Metric Parallelism tip latency_nice
>> unixbench-dhry2reg Hmean unixbench-dhry2reg-1 49041932.68 ( 0.00%) 49156671.05 ( 0.23%)
>> unixbench-dhry2reg Hmean unixbench-dhry2reg-512 6286981589.85 ( 0.00%) 6285248711.40 ( -0.03%)
>> unixbench-syscall Amean unixbench-syscall-1 2992405.60 ( 0.00%) 3008933.03 * -0.55%*
>> unixbench-syscall Amean unixbench-syscall-512 7971789.70 ( 0.00%) 7814622.23 * 1.97%*
>> unixbench-pipe Hmean unixbench-pipe-1 2822892.54 ( 0.00%) 2852615.11 * 1.05%*
>> unixbench-pipe Hmean unixbench-pipe-512 326408309.83 ( 0.00%) 329617202.56 * 0.98%*
>> unixbench-spawn Hmean unixbench-spawn-1 7685.31 ( 0.00%) 7243.54 ( -5.75%)
>> unixbench-spawn Hmean unixbench-spawn-512 72245.56 ( 0.00%) 77000.81 * 6.58%*
>> unixbench-execl Hmean unixbench-execl-1 4761.42 ( 0.00%) 4733.12 * -0.59%*
>> unixbench-execl Hmean unixbench-execl-512 11533.53 ( 0.00%) 11660.17 ( 1.10%)
>>
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> ~ Hackbench - Various Latency Nice Values ~
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> o 100000 loops
>>
>> - pipe (process)
>>
>> Test: LN: 0 LN: 19 LN: -20
>> 1-groups: 3.91 (0.00 pct) 3.91 (0.00 pct) 3.81 (2.55 pct)
>> 2-groups: 4.48 (0.00 pct) 4.52 (-0.89 pct) 4.53 (-1.11 pct)
>> 4-groups: 4.83 (0.00 pct) 4.83 (0.00 pct) 4.87 (-0.82 pct)
>> 8-groups: 5.09 (0.00 pct) 5.00 (1.76 pct) 5.07 (0.39 pct)
>> 16-groups: 6.92 (0.00 pct) 6.79 (1.87 pct) 6.96 (-0.57 pct)
>>
>> - pipe (thread)
>>
>> 1-groups: 4.13 (0.00 pct) 4.08 (1.21 pct) 4.11 (0.48 pct)
>> 2-groups: 4.78 (0.00 pct) 4.90 (-2.51 pct) 4.79 (-0.20 pct)
>> 4-groups: 5.12 (0.00 pct) 5.08 (0.78 pct) 5.16 (-0.78 pct)
>> 8-groups: 5.31 (0.00 pct) 5.28 (0.56 pct) 5.33 (-0.37 pct)
>> 16-groups: 7.34 (0.00 pct) 7.27 (0.95 pct) 7.33 (0.13 pct)
>>
>> - socket (process)
>>
>> Test: LN: 0 LN: 19 LN: -20
>> 1-groups: 6.61 (0.00 pct) 6.38 (3.47 pct) 6.54 (1.05 pct)
>> 2-groups: 6.59 (0.00 pct) 6.67 (-1.21 pct) 6.11 (7.28 pct)
>> 4-groups: 6.77 (0.00 pct) 6.78 (-0.14 pct) 6.79 (-0.29 pct)
>> 8-groups: 8.29 (0.00 pct) 8.39 (-1.20 pct) 8.36 (-0.84 pct)
>> 16-groups: 12.21 (0.00 pct) 12.03 (1.47 pct) 12.35 (-1.14 pct)
>>
>> - socket (thread)
>>
>> Test: LN: 0 LN: 19 LN: -20
>> 1-groups: 6.50 (0.00 pct) 5.99 (7.84 pct) 6.02 (7.38 pct) ^
>> 2-groups: 6.07 (0.00 pct) 6.20 (-2.14 pct) 6.23 (-2.63 pct)
>> 4-groups: 6.61 (0.00 pct) 6.64 (-0.45 pct) 6.63 (-0.30 pct)
>> 8-groups: 8.87 (0.00 pct) 8.67 (2.25 pct) 8.78 (1.01 pct)
>> 16-groups: 12.63 (0.00 pct) 12.54 (0.71 pct) 12.59 (0.31 pct)
>>
>>> [..snip..]
>>>
>>
>> Apart from a couple of anomalies, latency nice reduces wait time, especially
>> when the system is heavily loaded. If there is any data, or any specific
>> workload you would like me to run on the test system, please do let me know.
>> Meanwhile, I'll try to get some numbers for larger workloads like SpecJBB
>> that did see improvements with latency nice on v5.
Following are results for SpecJBB in NPS1 mode:
+----------------+-------------------+---------+
| Metric         |   Latency Nice    |   tip   |
|                +---------+---------+         |
|                |    0    |   19    |         |
+----------------+---------+---------+---------+
| Max jOPS       | 100.00% | 102.19% | 101.02% |
| Critical jOPS  | 100.00% | 122.41% | 100.41% |
+----------------+---------+---------+---------+
SpecJBB throughput for Max-jOPS is similar across the board
but Critical-jOPS throughput sees a good uplift again with
latency nice 19.
>
> [..snip..]
>
If there is any specific workload you would like me to test,
please do let me know. I'll try to test more workloads I come
across with different latency nice values and update you
with the results on this thread.
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
--
Thanks and Regards,
Prateek
Thread overview:
2022-11-15 17:18 [PATCH v9 0/9] Add latency priority for CFS class Vincent Guittot
2022-11-15 17:18 ` [PATCH v9 1/9] sched/fair: fix unfairness at wakeup Vincent Guittot
2022-11-15 17:18 ` [PATCH v9 2/9] sched: Introduce latency-nice as a per-task attribute Vincent Guittot
2022-11-15 17:18 ` [PATCH v9 3/9] sched/core: Propagate parent task's latency requirements to the child task Vincent Guittot
2022-11-15 17:18 ` [PATCH v9 4/9] sched: Allow sched_{get,set}attr to change latency_nice of the task Vincent Guittot
2022-11-15 17:18 ` [PATCH 5/9] sched/fair: Take into account latency priority at wakeup Vincent Guittot
2022-11-29 4:25 ` Joel Fernandes
2022-11-29 8:58 ` Vincent Guittot
2022-11-29 15:45 ` Joel Fernandes
2022-11-29 17:20 ` Vincent Guittot
2022-11-30 3:09 ` Joel Fernandes
2022-11-30 13:42 ` Vincent Guittot
2022-11-15 17:18 ` [PATCH v9 6/9] sched/fair: Add sched group latency support Vincent Guittot
2022-11-15 17:18 ` [PATCH v9 7/9] sched/core: Support latency priority with sched core Vincent Guittot
2022-11-15 17:18 ` [PATCH v9 8/9] sched/fair: Add latency list Vincent Guittot
2022-11-15 17:18 ` [PATCH v9 9/9] sched/fair: remove check_preempt_from_others Vincent Guittot
2022-11-28 11:51 ` [PATCH v9 0/9] Add latency priority for CFS class K Prateek Nayak
2022-11-28 17:19 ` Vincent Guittot
2022-12-07 16:26 ` K Prateek Nayak [this message]