From: Raghavendra K T <raghavendra.kt@amd.com>
To: Mateusz Guzik <mjguzik@gmail.com>,
Ankur Arora <ankur.a.arora@oracle.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org,
akpm@linux-foundation.org, luto@kernel.org, bp@alien8.de,
dave.hansen@linux.intel.com, hpa@zytor.com, mingo@redhat.com,
juri.lelli@redhat.com, vincent.guittot@linaro.org,
willy@infradead.org, mgorman@suse.de, peterz@infradead.org,
rostedt@goodmis.org, tglx@linutronix.de, jon.grimm@amd.com,
bharata@amd.com, boris.ostrovsky@oracle.com,
konrad.wilk@oracle.com
Subject: Re: [PATCH v2 0/9] x86/clear_huge_page: multi-page clearing
Date: Fri, 8 Sep 2023 07:48:16 +0530
Message-ID: <5570c6b9-4abd-1526-cd17-ed45f7d51b20@amd.com>
In-Reply-To: <20230903081404.hmkhnrk243h2nuoa@f>
On 9/3/2023 1:44 PM, Mateusz Guzik wrote:
> On Wed, Aug 30, 2023 at 11:49:49AM -0700, Ankur Arora wrote:
>> This series adds a multi-page clearing primitive, clear_pages(),
>> which enables more effective use of x86 string instructions by
>> advertising the real region-size to be cleared.
>>
>> Region-size can be used as a hint by uarchs to optimize the
>> clearing.
>>
>> Also add allow_resched() which marks a code-section as allowing
>> rescheduling in the irqentry_exit path. This allows clear_pages()
>> to get by without having to call cond_resched() periodically.
>> (preempt_model_full() already handles this via
>> irqentry_exit_cond_resched(), so we handle this similarly for
>> preempt_model_none() and preempt_model_voluntary().)
>>
>> Performance
>> ==
>>
>> With this demand fault performance gets a decent increase:
>>
>>   *Milan*      mm/clear_huge_page   x86/clear_huge_page   change
>>                     (GB/s)               (GB/s)
>>
>>   pg-sz=2MB         14.55                19.29            +32.5%
>>   pg-sz=1GB         19.34                49.60           +156.4%
>>
>> Milan (and some other AMD Zen uarchs tested) take advantage of the
>> hint to elide cacheline allocation for pg-sz=1GB. The cut-off for
>> this optimization seems to be at around region-size > LLC-size so
>> the pg-sz=2MB load still allocates cachelines.
>>
>
> Have you benchmarked clzero? It is an AMD-specific instruction issuing
> non-temporal stores. It is definitely something to try out for 1G pages.
>
> One would think rep stosq has to be at least not worse since the CPU is
> explicitly told what to do and is free to optimize it however it sees
> fit, but the rep prefix has a long history of underperforming.
>
> I'm not saying it is going to be better, but that this should be tested,
> albeit one can easily argue this can be done at a later date.
>
> I would do it myself but my access to AMD CPUs is limited.
>
Hello Mateusz,

I plugged in CLZERO unconditionally (even for the coherent path, with
sfence) for my earlier experiments on top of this series.

Test: Use mmap(MAP_HUGETLB) to demand-fault a 64GB region (NUMA
node0), for both base-hugepage-size=2M and 1GB

perf stat -r 10 -d -d numactl -m 0 -N 0 <test>

SUT: AMD Bergamo with 2 nodes / 2 sockets, 128 cores per socket.

With that, the time taken (in seconds) is:
for 2M: 1.092125
for 1G: 0.997661

So overall, the results for the 64GB experiment look like this:
Time taken for 64GB region, in seconds (lower = better)

page-size   base      patched    (gain%)   patched-clzero   (gain%)
2M          5.0779    2.50623    (50.64)   1.092125         (78)
1G          2.50623   1.012439   (59.60)   0.997661         (60)
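
The gain% columns can be reproduced from the elapsed times, assuming
gain is defined as (base - new) / base. A minimal Python sketch of that
arithmetic follows; the helper name gain_percent() is only illustrative
and the times are copied from the table above:

```python
# Elapsed times in seconds for the 64GB region, copied from the table.
TIMES = {
    "2M": {"base": 5.0779, "patched": 2.50623, "patched-clzero": 1.092125},
    "1G": {"base": 2.50623, "patched": 1.012439, "patched-clzero": 0.997661},
}


def gain_percent(base_s: float, new_s: float) -> float:
    """Reduction in elapsed time relative to base, in percent.

    Assumption: this is the definition behind the (gain%) columns.
    """
    return (base_s - new_s) / base_s * 100.0


if __name__ == "__main__":
    for page_size, t in TIMES.items():
        for variant in ("patched", "patched-clzero"):
            gain = gain_percent(t["base"], t[variant])
            print(f"{page_size} {variant:<15} {gain:6.2f}%")
```

Under that assumption the "patched" figures come out as 50.64 and 59.60,
matching the table, and the CLZERO figures round to the 78 and 60 shown.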

In summary, I see further improvement even for the 2M base size (2.5x).

Overall, CLZERO clearing is promising. But we may need threshold tuning
and hint passing, as done in Ankur's earlier posting
Link:
https://lore.kernel.org/lkml/20220606202109.1306034-1-ankur.a.arora@oracle.com/
on top of the current series.

I need to experiment further with different chunk sizes as well as base
sizes (for both clzero and rep stos).

Thanks and Regards
- Raghu
Run Details:

 Performance counter stats for 'numactl -m 0 -N 0 map_hugetlb_1G' (10 runs):

         996.34 msec task-clock               #    0.999 CPUs utilized                 ( +- 0.02% )
              2      context-switches         #    2.007 /sec                          ( +- 21.34% )
              0      cpu-migrations           #    0.000 /sec
            212      page-faults              #  212.735 /sec                          ( +- 0.20% )
  3,116,497,471      cycles                   #    3.127 GHz                           ( +- 0.02% ) (35.66%)
        100,343      stalled-cycles-frontend  #    0.00% frontend cycles idle          ( +- 16.85% ) (35.75%)
      1,369,118      stalled-cycles-backend   #    0.04% backend cycles idle           ( +- 3.45% ) (35.86%)
  4,325,987,025      instructions             #    1.39 insn per cycle
                                              #    0.00 stalled cycles per insn        ( +- 0.02% ) (35.87%)
  1,078,119,163      branches                 #    1.082 G/sec                         ( +- 0.01% ) (35.87%)
         87,907      branch-misses            #    0.01% of all branches               ( +- 5.22% ) (35.83%)
     12,337,100      L1-dcache-loads          #   12.380 M/sec                         ( +- 5.44% ) (35.74%)
        280,300      L1-dcache-load-misses    #    2.48% of all L1-dcache accesses     ( +- 5.74% ) (35.64%)
      1,464,549      L1-icache-loads          #    1.470 M/sec                         ( +- 1.61% ) (35.63%)
         30,659      L1-icache-load-misses    #    2.12% of all L1-icache accesses     ( +- 3.30% ) (35.62%)
         17,366      dTLB-loads               #   17.426 K/sec                         ( +- 5.52% ) (35.63%)
         11,774      dTLB-load-misses         #   81.79% of all dTLB cache accesses    ( +- 7.94% ) (35.63%)
              0      iTLB-loads               #    0.000 /sec                          (35.63%)
              2      iTLB-load-misses         #    0.00% of all iTLB cache accesses    ( +-342.39% ) (35.64%)

       0.997661 +- 0.000150 seconds time elapsed  ( +- 0.02% )

 Performance counter stats for 'numactl -m 0 -N 0 map_hugetlb' (10 runs):

       1,089.97 msec task-clock               #    0.998 CPUs utilized                 ( +- 0.03% )
              3      context-switches         #    2.750 /sec                          ( +- 15.11% )
              0      cpu-migrations           #    0.000 /sec
         32,917      page-faults              #   30.172 K/sec                         ( +- 0.00% )
  3,408,713,422      cycles                   #    3.124 GHz                           ( +- 0.03% ) (35.60%)
        982,417      stalled-cycles-frontend  #    0.03% frontend cycles idle          ( +- 2.77% ) (35.60%)
      8,495,409      stalled-cycles-backend   #    0.25% backend cycles idle           ( +- 6.12% ) (35.59%)
  4,970,939,278      instructions             #    1.46 insn per cycle
                                              #    0.00 stalled cycles per insn        ( +- 0.04% ) (35.64%)
  1,196,644,653      branches                 #    1.097 G/sec                         ( +- 0.03% ) (35.73%)
        196,584      branch-misses            #    0.02% of all branches               ( +- 2.79% ) (35.78%)
    226,254,284      L1-dcache-loads          #  207.388 M/sec                         ( +- 0.23% ) (35.78%)
      1,161,607      L1-dcache-load-misses    #    0.52% of all L1-dcache accesses     ( +- 3.27% ) (35.78%)
     21,757,775      L1-icache-loads          #   19.943 M/sec                         ( +- 0.66% ) (35.77%)
        165,503      L1-icache-load-misses    #    0.78% of all L1-icache accesses     ( +- 3.11% ) (35.78%)
      1,118,573      dTLB-loads               #    1.025 M/sec                         ( +- 1.38% ) (35.78%)
        415,943      dTLB-load-misses         #   37.10% of all dTLB cache accesses    ( +- 1.12% ) (35.78%)
             36      iTLB-loads               #   32.998 /sec                          ( +- 18.47% ) (35.74%)
         49,785      iTLB-load-misses         # 270570.65% of all iTLB cache accesses  ( +- 0.34% ) (35.65%)

       1.092125 +- 0.000350 seconds time elapsed  ( +- 0.03% )
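
The page-faults counts in the two runs can be sanity-checked against the
region size: a 64GB region faulted in one hugepage at a time should take
region / page-size hugepage faults. The sketch below does that arithmetic
in Python. The helper name expected_hugepage_faults() is illustrative, and
attributing the remainder to ordinary small-page faults outside the
hugetlb mapping (process startup and the like) is an assumption, not
something the perf output itself states:

```python
MIB = 1 << 20
GIB = 1 << 30
REGION_BYTES = 64 * GIB  # size of the demand-faulted region in the test

# page-faults as reported by perf stat (averaged over 10 runs).
REPORTED_FAULTS = {"2M": 32_917, "1G": 212}
PAGE_BYTES = {"2M": 2 * MIB, "1G": 1 * GIB}


def expected_hugepage_faults(region_bytes: int, page_bytes: int) -> int:
    """One fault per hugepage when the region is touched for the first time."""
    return region_bytes // page_bytes


if __name__ == "__main__":
    for name, page_bytes in PAGE_BYTES.items():
        expected = expected_hugepage_faults(REGION_BYTES, page_bytes)
        other = REPORTED_FAULTS[name] - expected
        print(f"{name}: expected {expected} hugepage faults, "
              f"reported {REPORTED_FAULTS[name]}, remainder {other}")
```

The remainder is nearly identical in both runs (about 150 faults), which
is consistent with a fixed per-process overhead rather than anything
specific to the hugepage size.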