* [PATCH 00/24] Optimise page alloc/free fast paths v2
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
Sorry for the quick resend. One patch still had a warning and, while I
was there, I added a few patches to the bulk pcp free path.
Changelog since v1
o Fix an unused variable warning
o Throw in a few optimisations in the bulk pcp free path
o Rebase to 4.6-rc3
Another year, another round of page allocator optimisations, focusing this
time on the alloc and free fast paths. This should help workloads
that are allocator-intensive from kernel space, where the cost of zeroing
is not necessarily incurred.
The series is motivated by the observation that page alloc microbenchmarks
on multiple machines regressed between 3.12.44 and 4.4. Second, there were
discussions before LSF/MM about the possibility of adding another
page allocator, which is potentially hazardous, but a patch series improving
performance is better than whining.
After the series is applied, there are still hazards. In the free paths,
the debugging checks and page zone/pageblock lookups dominate, but
there was no obvious solution to that. In the alloc path, the major
contributors are dealing with zonelists, new page preparation, the fair
zone allocation policy and numerous statistic updates. The fair zone
allocation policy is removed by the per-node LRU series if that gets
merged, so it's not a major concern at the moment.
On normal userspace benchmarks, there is little impact as the zeroing cost
is significant, but it is still visible:
aim9
4.6.0-rc2 4.6.0-rc2
vanilla cpuset-v1r20
Min page_test 864733.33 ( 0.00%) 922986.67 ( 6.74%)
Min brk_test 6212191.87 ( 0.00%) 6271866.67 ( 0.96%)
Min exec_test 1294.67 ( 0.00%) 1306.00 ( 0.88%)
Min fork_test 12644.90 ( 0.00%) 12713.33 ( 0.54%)
The overall impact on a page allocator microbenchmark for a range of orders
and numbers of pages allocated in a batch is:
4.6.0-rc3 4.6.0-rc3
vanilla micro-v2
Min alloc-odr0-1 428.00 ( 0.00%) 343.00 ( 19.86%)
Min alloc-odr0-2 314.00 ( 0.00%) 252.00 ( 19.75%)
Min alloc-odr0-4 256.00 ( 0.00%) 209.00 ( 18.36%)
Min alloc-odr0-8 223.00 ( 0.00%) 182.00 ( 18.39%)
Min alloc-odr0-16 207.00 ( 0.00%) 168.00 ( 18.84%)
Min alloc-odr0-32 197.00 ( 0.00%) 162.00 ( 17.77%)
Min alloc-odr0-64 193.00 ( 0.00%) 159.00 ( 17.62%)
Min alloc-odr0-128 191.00 ( 0.00%) 157.00 ( 17.80%)
Min alloc-odr0-256 200.00 ( 0.00%) 167.00 ( 16.50%)
Min alloc-odr0-512 212.00 ( 0.00%) 179.00 ( 15.57%)
Min alloc-odr0-1024 220.00 ( 0.00%) 184.00 ( 16.36%)
Min alloc-odr0-2048 226.00 ( 0.00%) 190.00 ( 15.93%)
Min alloc-odr0-4096 233.00 ( 0.00%) 197.00 ( 15.45%)
Min alloc-odr0-8192 235.00 ( 0.00%) 199.00 ( 15.32%)
Min alloc-odr0-16384 235.00 ( 0.00%) 199.00 ( 15.32%)
Min alloc-odr1-1 519.00 ( 0.00%) 461.00 ( 11.18%)
Min alloc-odr1-2 391.00 ( 0.00%) 344.00 ( 12.02%)
Min alloc-odr1-4 312.00 ( 0.00%) 276.00 ( 11.54%)
Min alloc-odr1-8 276.00 ( 0.00%) 238.00 ( 13.77%)
Min alloc-odr1-16 256.00 ( 0.00%) 220.00 ( 14.06%)
Min alloc-odr1-32 247.00 ( 0.00%) 211.00 ( 14.57%)
Min alloc-odr1-64 242.00 ( 0.00%) 208.00 ( 14.05%)
Min alloc-odr1-128 245.00 ( 0.00%) 206.00 ( 15.92%)
Min alloc-odr1-256 244.00 ( 0.00%) 206.00 ( 15.57%)
Min alloc-odr1-512 245.00 ( 0.00%) 209.00 ( 14.69%)
Min alloc-odr1-1024 246.00 ( 0.00%) 211.00 ( 14.23%)
Min alloc-odr1-2048 253.00 ( 0.00%) 220.00 ( 13.04%)
Min alloc-odr1-4096 258.00 ( 0.00%) 224.00 ( 13.18%)
Min alloc-odr1-8192 261.00 ( 0.00%) 226.00 ( 13.41%)
Min alloc-odr2-1 560.00 ( 0.00%) 480.00 ( 14.29%)
Min alloc-odr2-2 422.00 ( 0.00%) 366.00 ( 13.27%)
Min alloc-odr2-4 339.00 ( 0.00%) 289.00 ( 14.75%)
Min alloc-odr2-8 297.00 ( 0.00%) 250.00 ( 15.82%)
Min alloc-odr2-16 277.00 ( 0.00%) 233.00 ( 15.88%)
Min alloc-odr2-32 268.00 ( 0.00%) 223.00 ( 16.79%)
Min alloc-odr2-64 266.00 ( 0.00%) 219.00 ( 17.67%)
Min alloc-odr2-128 264.00 ( 0.00%) 218.00 ( 17.42%)
Min alloc-odr2-256 265.00 ( 0.00%) 219.00 ( 17.36%)
Min alloc-odr2-512 270.00 ( 0.00%) 224.00 ( 17.04%)
Min alloc-odr2-1024 279.00 ( 0.00%) 234.00 ( 16.13%)
Min alloc-odr2-2048 284.00 ( 0.00%) 239.00 ( 15.85%)
Min alloc-odr2-4096 285.00 ( 0.00%) 239.00 ( 16.14%)
Min alloc-odr3-1 629.00 ( 0.00%) 526.00 ( 16.38%)
Min alloc-odr3-2 471.00 ( 0.00%) 395.00 ( 16.14%)
Min alloc-odr3-4 382.00 ( 0.00%) 315.00 ( 17.54%)
Min alloc-odr3-8 466.00 ( 0.00%) 279.00 ( 40.13%)
Min alloc-odr3-16 316.00 ( 0.00%) 259.00 ( 18.04%)
Min alloc-odr3-32 307.00 ( 0.00%) 251.00 ( 18.24%)
Min alloc-odr3-64 305.00 ( 0.00%) 248.00 ( 18.69%)
Min alloc-odr3-128 308.00 ( 0.00%) 248.00 ( 19.48%)
Min alloc-odr3-256 317.00 ( 0.00%) 256.00 ( 19.24%)
Min alloc-odr3-512 327.00 ( 0.00%) 262.00 ( 19.88%)
Min alloc-odr3-1024 332.00 ( 0.00%) 268.00 ( 19.28%)
Min alloc-odr3-2048 333.00 ( 0.00%) 269.00 ( 19.22%)
Min alloc-odr4-1 764.00 ( 0.00%) 607.00 ( 20.55%)
Min alloc-odr4-2 577.00 ( 0.00%) 459.00 ( 20.45%)
Min alloc-odr4-4 473.00 ( 0.00%) 370.00 ( 21.78%)
Min alloc-odr4-8 420.00 ( 0.00%) 327.00 ( 22.14%)
Min alloc-odr4-16 397.00 ( 0.00%) 309.00 ( 22.17%)
Min alloc-odr4-32 391.00 ( 0.00%) 303.00 ( 22.51%)
Min alloc-odr4-64 395.00 ( 0.00%) 302.00 ( 23.54%)
Min alloc-odr4-128 408.00 ( 0.00%) 311.00 ( 23.77%)
Min alloc-odr4-256 421.00 ( 0.00%) 326.00 ( 22.57%)
Min alloc-odr4-512 428.00 ( 0.00%) 333.00 ( 22.20%)
Min alloc-odr4-1024 429.00 ( 0.00%) 330.00 ( 23.08%)
Min free-odr0-1 216.00 ( 0.00%) 193.00 ( 10.65%)
Min free-odr0-2 152.00 ( 0.00%) 137.00 ( 9.87%)
Min free-odr0-4 119.00 ( 0.00%) 107.00 ( 10.08%)
Min free-odr0-8 106.00 ( 0.00%) 95.00 ( 10.38%)
Min free-odr0-16 97.00 ( 0.00%) 87.00 ( 10.31%)
Min free-odr0-32 92.00 ( 0.00%) 82.00 ( 10.87%)
Min free-odr0-64 89.00 ( 0.00%) 80.00 ( 10.11%)
Min free-odr0-128 89.00 ( 0.00%) 79.00 ( 11.24%)
Min free-odr0-256 102.00 ( 0.00%) 94.00 ( 7.84%)
Min free-odr0-512 117.00 ( 0.00%) 110.00 ( 5.98%)
Min free-odr0-1024 125.00 ( 0.00%) 118.00 ( 5.60%)
Min free-odr0-2048 131.00 ( 0.00%) 123.00 ( 6.11%)
Min free-odr0-4096 136.00 ( 0.00%) 126.00 ( 7.35%)
Min free-odr0-8192 136.00 ( 0.00%) 127.00 ( 6.62%)
Min free-odr0-16384 137.00 ( 0.00%) 127.00 ( 7.30%)
Min free-odr1-1 317.00 ( 0.00%) 292.00 ( 7.89%)
Min free-odr1-2 228.00 ( 0.00%) 210.00 ( 7.89%)
Min free-odr1-4 182.00 ( 0.00%) 169.00 ( 7.14%)
Min free-odr1-8 162.00 ( 0.00%) 148.00 ( 8.64%)
Min free-odr1-16 152.00 ( 0.00%) 138.00 ( 9.21%)
Min free-odr1-32 144.00 ( 0.00%) 132.00 ( 8.33%)
Min free-odr1-64 143.00 ( 0.00%) 131.00 ( 8.39%)
Min free-odr1-128 148.00 ( 0.00%) 136.00 ( 8.11%)
Min free-odr1-256 150.00 ( 0.00%) 141.00 ( 6.00%)
Min free-odr1-512 151.00 ( 0.00%) 144.00 ( 4.64%)
Min free-odr1-1024 155.00 ( 0.00%) 147.00 ( 5.16%)
Min free-odr1-2048 157.00 ( 0.00%) 150.00 ( 4.46%)
Min free-odr1-4096 156.00 ( 0.00%) 147.00 ( 5.77%)
Min free-odr1-8192 156.00 ( 0.00%) 146.00 ( 6.41%)
Min free-odr2-1 363.00 ( 0.00%) 315.00 ( 13.22%)
Min free-odr2-2 256.00 ( 0.00%) 229.00 ( 10.55%)
Min free-odr2-4 209.00 ( 0.00%) 189.00 ( 9.57%)
Min free-odr2-8 182.00 ( 0.00%) 162.00 ( 10.99%)
Min free-odr2-16 171.00 ( 0.00%) 154.00 ( 9.94%)
Min free-odr2-32 165.00 ( 0.00%) 152.00 ( 7.88%)
Min free-odr2-64 166.00 ( 0.00%) 153.00 ( 7.83%)
Min free-odr2-128 167.00 ( 0.00%) 156.00 ( 6.59%)
Min free-odr2-256 170.00 ( 0.00%) 159.00 ( 6.47%)
Min free-odr2-512 177.00 ( 0.00%) 165.00 ( 6.78%)
Min free-odr2-1024 184.00 ( 0.00%) 168.00 ( 8.70%)
Min free-odr2-2048 182.00 ( 0.00%) 165.00 ( 9.34%)
Min free-odr2-4096 181.00 ( 0.00%) 163.00 ( 9.94%)
Min free-odr3-1 442.00 ( 0.00%) 376.00 ( 14.93%)
Min free-odr3-2 310.00 ( 0.00%) 272.00 ( 12.26%)
Min free-odr3-4 253.00 ( 0.00%) 215.00 ( 15.02%)
Min free-odr3-8 285.00 ( 0.00%) 193.00 ( 32.28%)
Min free-odr3-16 207.00 ( 0.00%) 179.00 ( 13.53%)
Min free-odr3-32 207.00 ( 0.00%) 180.00 ( 13.04%)
Min free-odr3-64 212.00 ( 0.00%) 184.00 ( 13.21%)
Min free-odr3-128 216.00 ( 0.00%) 189.00 ( 12.50%)
Min free-odr3-256 224.00 ( 0.00%) 197.00 ( 12.05%)
Min free-odr3-512 231.00 ( 0.00%) 201.00 ( 12.99%)
Min free-odr3-1024 230.00 ( 0.00%) 202.00 ( 12.17%)
Min free-odr3-2048 229.00 ( 0.00%) 199.00 ( 13.10%)
Min free-odr4-1 559.00 ( 0.00%) 460.00 ( 17.71%)
Min free-odr4-2 406.00 ( 0.00%) 333.00 ( 17.98%)
Min free-odr4-4 336.00 ( 0.00%) 272.00 ( 19.05%)
Min free-odr4-8 298.00 ( 0.00%) 240.00 ( 19.46%)
Min free-odr4-16 283.00 ( 0.00%) 235.00 ( 16.96%)
Min free-odr4-32 291.00 ( 0.00%) 239.00 ( 17.87%)
Min free-odr4-64 297.00 ( 0.00%) 242.00 ( 18.52%)
Min free-odr4-128 309.00 ( 0.00%) 257.00 ( 16.83%)
Min free-odr4-256 322.00 ( 0.00%) 275.00 ( 14.60%)
Min free-odr4-512 326.00 ( 0.00%) 279.00 ( 14.42%)
Min free-odr4-1024 325.00 ( 0.00%) 275.00 ( 15.38%)
Min total-odr0-1 644.00 ( 0.00%) 536.00 ( 16.77%)
Min total-odr0-2 466.00 ( 0.00%) 389.00 ( 16.52%)
Min total-odr0-4 375.00 ( 0.00%) 316.00 ( 15.73%)
Min total-odr0-8 329.00 ( 0.00%) 277.00 ( 15.81%)
Min total-odr0-16 304.00 ( 0.00%) 255.00 ( 16.12%)
Min total-odr0-32 289.00 ( 0.00%) 244.00 ( 15.57%)
Min total-odr0-64 282.00 ( 0.00%) 239.00 ( 15.25%)
Min total-odr0-128 280.00 ( 0.00%) 236.00 ( 15.71%)
Min total-odr0-256 302.00 ( 0.00%) 261.00 ( 13.58%)
Min total-odr0-512 329.00 ( 0.00%) 289.00 ( 12.16%)
Min total-odr0-1024 345.00 ( 0.00%) 302.00 ( 12.46%)
Min total-odr0-2048 357.00 ( 0.00%) 313.00 ( 12.32%)
Min total-odr0-4096 369.00 ( 0.00%) 323.00 ( 12.47%)
Min total-odr0-8192 371.00 ( 0.00%) 326.00 ( 12.13%)
Min total-odr0-16384 372.00 ( 0.00%) 326.00 ( 12.37%)
Min total-odr1-1 836.00 ( 0.00%) 754.00 ( 9.81%)
Min total-odr1-2 619.00 ( 0.00%) 554.00 ( 10.50%)
Min total-odr1-4 495.00 ( 0.00%) 445.00 ( 10.10%)
Min total-odr1-8 438.00 ( 0.00%) 386.00 ( 11.87%)
Min total-odr1-16 408.00 ( 0.00%) 358.00 ( 12.25%)
Min total-odr1-32 391.00 ( 0.00%) 343.00 ( 12.28%)
Min total-odr1-64 385.00 ( 0.00%) 339.00 ( 11.95%)
Min total-odr1-128 393.00 ( 0.00%) 342.00 ( 12.98%)
Min total-odr1-256 394.00 ( 0.00%) 347.00 ( 11.93%)
Min total-odr1-512 396.00 ( 0.00%) 353.00 ( 10.86%)
Min total-odr1-1024 401.00 ( 0.00%) 358.00 ( 10.72%)
Min total-odr1-2048 410.00 ( 0.00%) 370.00 ( 9.76%)
Min total-odr1-4096 414.00 ( 0.00%) 371.00 ( 10.39%)
Min total-odr1-8192 417.00 ( 0.00%) 372.00 ( 10.79%)
Min total-odr2-1 923.00 ( 0.00%) 795.00 ( 13.87%)
Min total-odr2-2 678.00 ( 0.00%) 595.00 ( 12.24%)
Min total-odr2-4 548.00 ( 0.00%) 478.00 ( 12.77%)
Min total-odr2-8 480.00 ( 0.00%) 412.00 ( 14.17%)
Min total-odr2-16 448.00 ( 0.00%) 387.00 ( 13.62%)
Min total-odr2-32 433.00 ( 0.00%) 375.00 ( 13.39%)
Min total-odr2-64 432.00 ( 0.00%) 372.00 ( 13.89%)
Min total-odr2-128 431.00 ( 0.00%) 374.00 ( 13.23%)
Min total-odr2-256 436.00 ( 0.00%) 378.00 ( 13.30%)
Min total-odr2-512 447.00 ( 0.00%) 389.00 ( 12.98%)
Min total-odr2-1024 463.00 ( 0.00%) 402.00 ( 13.17%)
Min total-odr2-2048 466.00 ( 0.00%) 404.00 ( 13.30%)
Min total-odr2-4096 466.00 ( 0.00%) 402.00 ( 13.73%)
Min total-odr3-1 1071.00 ( 0.00%) 904.00 ( 15.59%)
Min total-odr3-2 781.00 ( 0.00%) 667.00 ( 14.60%)
Min total-odr3-4 636.00 ( 0.00%) 531.00 ( 16.51%)
Min total-odr3-8 751.00 ( 0.00%) 472.00 ( 37.15%)
Min total-odr3-16 523.00 ( 0.00%) 438.00 ( 16.25%)
Min total-odr3-32 514.00 ( 0.00%) 431.00 ( 16.15%)
Min total-odr3-64 517.00 ( 0.00%) 432.00 ( 16.44%)
Min total-odr3-128 524.00 ( 0.00%) 437.00 ( 16.60%)
Min total-odr3-256 541.00 ( 0.00%) 453.00 ( 16.27%)
Min total-odr3-512 558.00 ( 0.00%) 463.00 ( 17.03%)
Min total-odr3-1024 562.00 ( 0.00%) 470.00 ( 16.37%)
Min total-odr3-2048 562.00 ( 0.00%) 468.00 ( 16.73%)
Min total-odr4-1 1323.00 ( 0.00%) 1067.00 ( 19.35%)
Min total-odr4-2 983.00 ( 0.00%) 792.00 ( 19.43%)
Min total-odr4-4 809.00 ( 0.00%) 642.00 ( 20.64%)
Min total-odr4-8 718.00 ( 0.00%) 567.00 ( 21.03%)
Min total-odr4-16 680.00 ( 0.00%) 544.00 ( 20.00%)
Min total-odr4-32 682.00 ( 0.00%) 542.00 ( 20.53%)
Min total-odr4-64 692.00 ( 0.00%) 544.00 ( 21.39%)
Min total-odr4-128 717.00 ( 0.00%) 568.00 ( 20.78%)
Min total-odr4-256 743.00 ( 0.00%) 601.00 ( 19.11%)
Min total-odr4-512 754.00 ( 0.00%) 612.00 ( 18.83%)
Min total-odr4-1024 754.00 ( 0.00%) 605.00 ( 19.76%)
fs/buffer.c | 10 +-
include/linux/compaction.h | 6 +-
include/linux/cpuset.h | 42 ++++--
include/linux/mm.h | 5 +-
include/linux/mmzone.h | 34 +++--
include/linux/page-flags.h | 7 +-
include/linux/vmstat.h | 2 -
kernel/cpuset.c | 14 +-
mm/compaction.c | 12 +-
mm/internal.h | 4 +-
mm/mempolicy.c | 19 +--
mm/mmzone.c | 2 +-
mm/page_alloc.c | 328 +++++++++++++++++++++++++++------------------
mm/vmstat.c | 25 ----
14 files changed, 293 insertions(+), 217 deletions(-)
--
2.6.4
* [PATCH 01/24] mm, page_alloc: Only check PageCompound for high-order pages
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
order-0 pages by definition cannot be compound so avoid the check in the
fast path for those pages.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 25 +++++++++++++++++--------
1 file changed, 17 insertions(+), 8 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 59de90d5d3a3..5d205bcfe10d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1024,24 +1024,33 @@ void __meminit reserve_bootmem_region(unsigned long start, unsigned long end)
static bool free_pages_prepare(struct page *page, unsigned int order)
{
- bool compound = PageCompound(page);
- int i, bad = 0;
+ int bad = 0;
VM_BUG_ON_PAGE(PageTail(page), page);
- VM_BUG_ON_PAGE(compound && compound_order(page) != order, page);
trace_mm_page_free(page, order);
kmemcheck_free_shadow(page, order);
kasan_free_pages(page, order);
+ /*
+ * Check tail pages before head page information is cleared to
+ * avoid checking PageCompound for order-0 pages.
+ */
+ if (order) {
+ bool compound = PageCompound(page);
+ int i;
+
+ VM_BUG_ON_PAGE(compound && compound_order(page) != order, page);
+
+ for (i = 1; i < (1 << order); i++) {
+ if (compound)
+ bad += free_tail_pages_check(page, page + i);
+ bad += free_pages_check(page + i);
+ }
+ }
if (PageAnon(page))
page->mapping = NULL;
bad += free_pages_check(page);
- for (i = 1; i < (1 << order); i++) {
- if (compound)
- bad += free_tail_pages_check(page, page + i);
- bad += free_pages_check(page + i);
- }
if (bad)
return false;
--
2.6.4
* [PATCH 02/24] mm, page_alloc: Use new PageAnonHead helper in the free page fast path
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
The PageAnon check always resolves compound_head, but this is a relatively
expensive check when the caller already knows the page is a head page. This
patch creates a helper and uses it in the page free path, which only operates
on head pages.
With this patch and "Only check PageCompound for high-order pages", the
performance difference on a page allocator microbenchmark is;
4.6.0-rc2 4.6.0-rc2
vanilla nocompound-v1r20
Min alloc-odr0-1 425.00 ( 0.00%) 417.00 ( 1.88%)
Min alloc-odr0-2 313.00 ( 0.00%) 308.00 ( 1.60%)
Min alloc-odr0-4 257.00 ( 0.00%) 253.00 ( 1.56%)
Min alloc-odr0-8 224.00 ( 0.00%) 221.00 ( 1.34%)
Min alloc-odr0-16 208.00 ( 0.00%) 205.00 ( 1.44%)
Min alloc-odr0-32 199.00 ( 0.00%) 199.00 ( 0.00%)
Min alloc-odr0-64 195.00 ( 0.00%) 193.00 ( 1.03%)
Min alloc-odr0-128 192.00 ( 0.00%) 191.00 ( 0.52%)
Min alloc-odr0-256 204.00 ( 0.00%) 200.00 ( 1.96%)
Min alloc-odr0-512 213.00 ( 0.00%) 212.00 ( 0.47%)
Min alloc-odr0-1024 219.00 ( 0.00%) 219.00 ( 0.00%)
Min alloc-odr0-2048 225.00 ( 0.00%) 225.00 ( 0.00%)
Min alloc-odr0-4096 230.00 ( 0.00%) 231.00 ( -0.43%)
Min alloc-odr0-8192 235.00 ( 0.00%) 234.00 ( 0.43%)
Min alloc-odr0-16384 235.00 ( 0.00%) 234.00 ( 0.43%)
Min free-odr0-1 215.00 ( 0.00%) 191.00 ( 11.16%)
Min free-odr0-2 152.00 ( 0.00%) 136.00 ( 10.53%)
Min free-odr0-4 119.00 ( 0.00%) 107.00 ( 10.08%)
Min free-odr0-8 106.00 ( 0.00%) 96.00 ( 9.43%)
Min free-odr0-16 97.00 ( 0.00%) 87.00 ( 10.31%)
Min free-odr0-32 91.00 ( 0.00%) 83.00 ( 8.79%)
Min free-odr0-64 89.00 ( 0.00%) 81.00 ( 8.99%)
Min free-odr0-128 88.00 ( 0.00%) 80.00 ( 9.09%)
Min free-odr0-256 106.00 ( 0.00%) 95.00 ( 10.38%)
Min free-odr0-512 116.00 ( 0.00%) 111.00 ( 4.31%)
Min free-odr0-1024 125.00 ( 0.00%) 118.00 ( 5.60%)
Min free-odr0-2048 133.00 ( 0.00%) 126.00 ( 5.26%)
Min free-odr0-4096 136.00 ( 0.00%) 130.00 ( 4.41%)
Min free-odr0-8192 138.00 ( 0.00%) 130.00 ( 5.80%)
Min free-odr0-16384 137.00 ( 0.00%) 130.00 ( 5.11%)
There is a sizable boost to the free path performance. While there
is an apparent boost on the allocation side, it's likely a coincidence
or due to the patches slightly reducing cache footprint.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
include/linux/page-flags.h | 7 ++++++-
mm/page_alloc.c | 2 +-
2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index f4ed4f1b0c77..ccd04ee1ba2d 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -371,10 +371,15 @@ PAGEFLAG(Idle, idle, PF_ANY)
#define PAGE_MAPPING_KSM 2
#define PAGE_MAPPING_FLAGS (PAGE_MAPPING_ANON | PAGE_MAPPING_KSM)
+static __always_inline int PageAnonHead(struct page *page)
+{
+ return ((unsigned long)page->mapping & PAGE_MAPPING_ANON) != 0;
+}
+
static __always_inline int PageAnon(struct page *page)
{
page = compound_head(page);
- return ((unsigned long)page->mapping & PAGE_MAPPING_ANON) != 0;
+ return PageAnonHead(page);
}
#ifdef CONFIG_KSM
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5d205bcfe10d..6812de41f698 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1048,7 +1048,7 @@ static bool free_pages_prepare(struct page *page, unsigned int order)
bad += free_pages_check(page + i);
}
}
- if (PageAnon(page))
+ if (PageAnonHead(page))
page->mapping = NULL;
bad += free_pages_check(page);
if (bad)
--
2.6.4
* [PATCH 03/24] mm, page_alloc: Reduce branches in zone_statistics
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
zone_statistics takes more branches than it really needs to account for an
unlikely GFP flag. Reduce the number of branches and annotate
the unlikely case.
The performance difference on a page allocator microbenchmark is;
4.6.0-rc2 4.6.0-rc2
nocompound-v1r10 statbranch-v1r10
Min alloc-odr0-1 417.00 ( 0.00%) 419.00 ( -0.48%)
Min alloc-odr0-2 308.00 ( 0.00%) 305.00 ( 0.97%)
Min alloc-odr0-4 253.00 ( 0.00%) 250.00 ( 1.19%)
Min alloc-odr0-8 221.00 ( 0.00%) 219.00 ( 0.90%)
Min alloc-odr0-16 205.00 ( 0.00%) 203.00 ( 0.98%)
Min alloc-odr0-32 199.00 ( 0.00%) 195.00 ( 2.01%)
Min alloc-odr0-64 193.00 ( 0.00%) 191.00 ( 1.04%)
Min alloc-odr0-128 191.00 ( 0.00%) 189.00 ( 1.05%)
Min alloc-odr0-256 200.00 ( 0.00%) 198.00 ( 1.00%)
Min alloc-odr0-512 212.00 ( 0.00%) 210.00 ( 0.94%)
Min alloc-odr0-1024 219.00 ( 0.00%) 216.00 ( 1.37%)
Min alloc-odr0-2048 225.00 ( 0.00%) 221.00 ( 1.78%)
Min alloc-odr0-4096 231.00 ( 0.00%) 227.00 ( 1.73%)
Min alloc-odr0-8192 234.00 ( 0.00%) 232.00 ( 0.85%)
Min alloc-odr0-16384 234.00 ( 0.00%) 232.00 ( 0.85%)
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/vmstat.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 5e4300482897..2e58ead9bcf5 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -581,17 +581,21 @@ void drain_zonestat(struct zone *zone, struct per_cpu_pageset *pset)
*/
void zone_statistics(struct zone *preferred_zone, struct zone *z, gfp_t flags)
{
- if (z->zone_pgdat == preferred_zone->zone_pgdat) {
+ int local_nid = numa_node_id();
+ enum zone_stat_item local_stat = NUMA_LOCAL;
+
+ if (unlikely(flags & __GFP_OTHER_NODE)) {
+ local_stat = NUMA_OTHER;
+ local_nid = preferred_zone->node;
+ }
+
+ if (z->node == local_nid) {
__inc_zone_state(z, NUMA_HIT);
+ __inc_zone_state(z, local_stat);
} else {
__inc_zone_state(z, NUMA_MISS);
__inc_zone_state(preferred_zone, NUMA_FOREIGN);
}
- if (z->node == ((flags & __GFP_OTHER_NODE) ?
- preferred_zone->node : numa_node_id()))
- __inc_zone_state(z, NUMA_LOCAL);
- else
- __inc_zone_state(z, NUMA_OTHER);
}
/*
--
2.6.4
* [PATCH 04/24] mm, page_alloc: Inline zone_statistics
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
zone_statistics has one call site but it's a public function. Make
it static and inline.
The performance difference on a page allocator microbenchmark is;
4.6.0-rc2 4.6.0-rc2
statbranch-v1r20 statinline-v1r20
Min alloc-odr0-1 419.00 ( 0.00%) 412.00 ( 1.67%)
Min alloc-odr0-2 305.00 ( 0.00%) 301.00 ( 1.31%)
Min alloc-odr0-4 250.00 ( 0.00%) 247.00 ( 1.20%)
Min alloc-odr0-8 219.00 ( 0.00%) 215.00 ( 1.83%)
Min alloc-odr0-16 203.00 ( 0.00%) 199.00 ( 1.97%)
Min alloc-odr0-32 195.00 ( 0.00%) 191.00 ( 2.05%)
Min alloc-odr0-64 191.00 ( 0.00%) 187.00 ( 2.09%)
Min alloc-odr0-128 189.00 ( 0.00%) 185.00 ( 2.12%)
Min alloc-odr0-256 198.00 ( 0.00%) 193.00 ( 2.53%)
Min alloc-odr0-512 210.00 ( 0.00%) 207.00 ( 1.43%)
Min alloc-odr0-1024 216.00 ( 0.00%) 213.00 ( 1.39%)
Min alloc-odr0-2048 221.00 ( 0.00%) 220.00 ( 0.45%)
Min alloc-odr0-4096 227.00 ( 0.00%) 226.00 ( 0.44%)
Min alloc-odr0-8192 232.00 ( 0.00%) 229.00 ( 1.29%)
Min alloc-odr0-16384 232.00 ( 0.00%) 229.00 ( 1.29%)
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
include/linux/vmstat.h | 2 --
mm/page_alloc.c | 31 +++++++++++++++++++++++++++++++
mm/vmstat.c | 29 -----------------------------
3 files changed, 31 insertions(+), 31 deletions(-)
diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 73fae8c4a5fb..152d26b7f972 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -163,12 +163,10 @@ static inline unsigned long zone_page_state_snapshot(struct zone *zone,
#ifdef CONFIG_NUMA
extern unsigned long node_page_state(int node, enum zone_stat_item item);
-extern void zone_statistics(struct zone *, struct zone *, gfp_t gfp);
#else
#define node_page_state(node, item) global_page_state(item)
-#define zone_statistics(_zl, _z, gfp) do { } while (0)
#endif /* CONFIG_NUMA */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6812de41f698..b56c2b2911a2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2352,6 +2352,37 @@ int split_free_page(struct page *page)
}
/*
+ * Update NUMA hit/miss statistics
+ *
+ * Must be called with interrupts disabled.
+ *
+ * When __GFP_OTHER_NODE is set assume the node of the preferred
+ * zone is the local node. This is useful for daemons who allocate
+ * memory on behalf of other processes.
+ */
+static inline void zone_statistics(struct zone *preferred_zone, struct zone *z,
+ gfp_t flags)
+{
+#ifdef CONFIG_NUMA
+ int local_nid = numa_node_id();
+ enum zone_stat_item local_stat = NUMA_LOCAL;
+
+ if (unlikely(flags & __GFP_OTHER_NODE)) {
+ local_stat = NUMA_OTHER;
+ local_nid = preferred_zone->node;
+ }
+
+ if (z->node == local_nid) {
+ __inc_zone_state(z, NUMA_HIT);
+ __inc_zone_state(z, local_stat);
+ } else {
+ __inc_zone_state(z, NUMA_MISS);
+ __inc_zone_state(preferred_zone, NUMA_FOREIGN);
+ }
+#endif
+}
+
+/*
* Allocate a page from the given zone. Use pcplists for order-0 allocations.
*/
static inline
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 2e58ead9bcf5..a4bda11eac8d 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -570,35 +570,6 @@ void drain_zonestat(struct zone *zone, struct per_cpu_pageset *pset)
#ifdef CONFIG_NUMA
/*
- * zonelist = the list of zones passed to the allocator
- * z = the zone from which the allocation occurred.
- *
- * Must be called with interrupts disabled.
- *
- * When __GFP_OTHER_NODE is set assume the node of the preferred
- * zone is the local node. This is useful for daemons who allocate
- * memory on behalf of other processes.
- */
-void zone_statistics(struct zone *preferred_zone, struct zone *z, gfp_t flags)
-{
- int local_nid = numa_node_id();
- enum zone_stat_item local_stat = NUMA_LOCAL;
-
- if (unlikely(flags & __GFP_OTHER_NODE)) {
- local_stat = NUMA_OTHER;
- local_nid = preferred_zone->node;
- }
-
- if (z->node == local_nid) {
- __inc_zone_state(z, NUMA_HIT);
- __inc_zone_state(z, local_stat);
- } else {
- __inc_zone_state(z, NUMA_MISS);
- __inc_zone_state(preferred_zone, NUMA_FOREIGN);
- }
-}
-
-/*
* Determine the per node value of a stat item.
*/
unsigned long node_page_state(int node, enum zone_stat_item item)
--
2.6.4
* [PATCH 05/24] mm, page_alloc: Inline the fast path of the zonelist iterator
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
The page allocator iterates through a zonelist for zones that match
the addressing limitations and nodemask of the caller, but many allocations
will not be restricted. Despite this, there is always function call
overhead, which builds up.
This patch inlines the optimistic basic case and only calls the
iterator function for the complex case. A hindrance was the fact that
cpuset_current_mems_allowed is used in the fastpath as the allowed nodemask
even though all nodes are allowed on most systems. The patch handles this
by only considering cpuset_current_mems_allowed if a cpuset exists. As well
as being faster in the fast path, this removes some junk from the slowpath.
The performance difference on a page allocator microbenchmark is;
4.6.0-rc2 4.6.0-rc2
statinline-v1r20 optiter-v1r20
Min alloc-odr0-1 412.00 ( 0.00%) 382.00 ( 7.28%)
Min alloc-odr0-2 301.00 ( 0.00%) 282.00 ( 6.31%)
Min alloc-odr0-4 247.00 ( 0.00%) 233.00 ( 5.67%)
Min alloc-odr0-8 215.00 ( 0.00%) 203.00 ( 5.58%)
Min alloc-odr0-16 199.00 ( 0.00%) 188.00 ( 5.53%)
Min alloc-odr0-32 191.00 ( 0.00%) 182.00 ( 4.71%)
Min alloc-odr0-64 187.00 ( 0.00%) 177.00 ( 5.35%)
Min alloc-odr0-128 185.00 ( 0.00%) 175.00 ( 5.41%)
Min alloc-odr0-256 193.00 ( 0.00%) 184.00 ( 4.66%)
Min alloc-odr0-512 207.00 ( 0.00%) 197.00 ( 4.83%)
Min alloc-odr0-1024 213.00 ( 0.00%) 203.00 ( 4.69%)
Min alloc-odr0-2048 220.00 ( 0.00%) 209.00 ( 5.00%)
Min alloc-odr0-4096 226.00 ( 0.00%) 214.00 ( 5.31%)
Min alloc-odr0-8192 229.00 ( 0.00%) 218.00 ( 4.80%)
Min alloc-odr0-16384 229.00 ( 0.00%) 219.00 ( 4.37%)
perf indicated that next_zones_zonelist disappeared in the profile and
__next_zones_zonelist did not appear. This is expected as the micro-benchmark
would hit the inlined fast-path every time.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
include/linux/mmzone.h | 13 +++++++++++--
mm/mmzone.c | 2 +-
mm/page_alloc.c | 26 +++++++++-----------------
3 files changed, 21 insertions(+), 20 deletions(-)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index c60df9257cc7..0c4d5ebb3849 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -922,6 +922,10 @@ static inline int zonelist_node_idx(struct zoneref *zoneref)
#endif /* CONFIG_NUMA */
}
+struct zoneref *__next_zones_zonelist(struct zoneref *z,
+ enum zone_type highest_zoneidx,
+ nodemask_t *nodes);
+
/**
* next_zones_zonelist - Returns the next zone at or below highest_zoneidx within the allowed nodemask using a cursor within a zonelist as a starting point
* @z - The cursor used as a starting point for the search
@@ -934,9 +938,14 @@ static inline int zonelist_node_idx(struct zoneref *zoneref)
* being examined. It should be advanced by one before calling
* next_zones_zonelist again.
*/
-struct zoneref *next_zones_zonelist(struct zoneref *z,
+static __always_inline struct zoneref *next_zones_zonelist(struct zoneref *z,
enum zone_type highest_zoneidx,
- nodemask_t *nodes);
+ nodemask_t *nodes)
+{
+ if (likely(!nodes && zonelist_zone_idx(z) <= highest_zoneidx))
+ return z;
+ return __next_zones_zonelist(z, highest_zoneidx, nodes);
+}
/**
* first_zones_zonelist - Returns the first zone at or below highest_zoneidx within the allowed nodemask in a zonelist
diff --git a/mm/mmzone.c b/mm/mmzone.c
index 52687fb4de6f..5652be858e5e 100644
--- a/mm/mmzone.c
+++ b/mm/mmzone.c
@@ -52,7 +52,7 @@ static inline int zref_in_nodemask(struct zoneref *zref, nodemask_t *nodes)
}
/* Returns the next zone at or below highest_zoneidx in a zonelist */
-struct zoneref *next_zones_zonelist(struct zoneref *z,
+struct zoneref *__next_zones_zonelist(struct zoneref *z,
enum zone_type highest_zoneidx,
nodemask_t *nodes)
{
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b56c2b2911a2..e9acc0b0f787 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3193,17 +3193,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
*/
alloc_flags = gfp_to_alloc_flags(gfp_mask);
- /*
- * Find the true preferred zone if the allocation is unconstrained by
- * cpusets.
- */
- if (!(alloc_flags & ALLOC_CPUSET) && !ac->nodemask) {
- struct zoneref *preferred_zoneref;
- preferred_zoneref = first_zones_zonelist(ac->zonelist,
- ac->high_zoneidx, NULL, &ac->preferred_zone);
- ac->classzone_idx = zonelist_zone_idx(preferred_zoneref);
- }
-
/* This is the last chance, in general, before the goto nopage. */
page = get_page_from_freelist(gfp_mask, order,
alloc_flags & ~ALLOC_NO_WATERMARKS, ac);
@@ -3359,14 +3348,21 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
struct zoneref *preferred_zoneref;
struct page *page = NULL;
unsigned int cpuset_mems_cookie;
- int alloc_flags = ALLOC_WMARK_LOW|ALLOC_CPUSET|ALLOC_FAIR;
+ int alloc_flags = ALLOC_WMARK_LOW|ALLOC_FAIR;
gfp_t alloc_mask; /* The gfp_t that was actually used for allocation */
struct alloc_context ac = {
.high_zoneidx = gfp_zone(gfp_mask),
+ .zonelist = zonelist,
.nodemask = nodemask,
.migratetype = gfpflags_to_migratetype(gfp_mask),
};
+ if (cpusets_enabled()) {
+ alloc_flags |= ALLOC_CPUSET;
+ if (!ac.nodemask)
+ ac.nodemask = &cpuset_current_mems_allowed;
+ }
+
gfp_mask &= gfp_allowed_mask;
lockdep_trace_alloc(gfp_mask);
@@ -3390,16 +3386,12 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
retry_cpuset:
cpuset_mems_cookie = read_mems_allowed_begin();
- /* We set it here, as __alloc_pages_slowpath might have changed it */
- ac.zonelist = zonelist;
-
/* Dirty zone balancing only done in the fast path */
ac.spread_dirty_pages = (gfp_mask & __GFP_WRITE);
/* The preferred zone is used for statistics later */
preferred_zoneref = first_zones_zonelist(ac.zonelist, ac.high_zoneidx,
- ac.nodemask ? : &cpuset_current_mems_allowed,
- &ac.preferred_zone);
+ ac.nodemask, &ac.preferred_zone);
if (!ac.preferred_zone)
goto out;
ac.classzone_idx = zonelist_zone_idx(preferred_zoneref);
--
2.6.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 06/24] mm, page_alloc: Use __dec_zone_state for order-0 page allocation
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
` (4 preceding siblings ...)
2016-04-12 10:12 ` [PATCH 05/24] mm, page_alloc: Inline the fast path of the zonelist iterator Mel Gorman
@ 2016-04-12 10:12 ` Mel Gorman
2016-04-12 10:12 ` [PATCH 07/24] mm, page_alloc: Avoid unnecessary zone lookups during pageblock operations Mel Gorman
` (17 subsequent siblings)
23 siblings, 0 replies; 25+ messages in thread
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
__dec_zone_state is cheaper to use for removing an order-0 page as it
has fewer conditions to check.
The performance difference on a page allocator microbenchmark is:
4.6.0-rc2 4.6.0-rc2
optiter-v1r20 decstat-v1r20
Min alloc-odr0-1 382.00 ( 0.00%) 381.00 ( 0.26%)
Min alloc-odr0-2 282.00 ( 0.00%) 275.00 ( 2.48%)
Min alloc-odr0-4 233.00 ( 0.00%) 229.00 ( 1.72%)
Min alloc-odr0-8 203.00 ( 0.00%) 199.00 ( 1.97%)
Min alloc-odr0-16 188.00 ( 0.00%) 186.00 ( 1.06%)
Min alloc-odr0-32 182.00 ( 0.00%) 179.00 ( 1.65%)
Min alloc-odr0-64 177.00 ( 0.00%) 174.00 ( 1.69%)
Min alloc-odr0-128 175.00 ( 0.00%) 172.00 ( 1.71%)
Min alloc-odr0-256 184.00 ( 0.00%) 181.00 ( 1.63%)
Min alloc-odr0-512 197.00 ( 0.00%) 193.00 ( 2.03%)
Min alloc-odr0-1024 203.00 ( 0.00%) 201.00 ( 0.99%)
Min alloc-odr0-2048 209.00 ( 0.00%) 206.00 ( 1.44%)
Min alloc-odr0-4096 214.00 ( 0.00%) 212.00 ( 0.93%)
Min alloc-odr0-8192 218.00 ( 0.00%) 215.00 ( 1.38%)
Min alloc-odr0-16384 219.00 ( 0.00%) 216.00 ( 1.37%)
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e9acc0b0f787..ab16560b76e6 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2414,6 +2414,7 @@ struct page *buffered_rmqueue(struct zone *preferred_zone,
else
page = list_first_entry(list, struct page, lru);
+ __dec_zone_state(zone, NR_ALLOC_BATCH);
list_del(&page->lru);
pcp->count--;
} else {
@@ -2435,11 +2436,11 @@ struct page *buffered_rmqueue(struct zone *preferred_zone,
spin_unlock(&zone->lock);
if (!page)
goto failed;
+ __mod_zone_page_state(zone, NR_ALLOC_BATCH, -(1 << order));
__mod_zone_freepage_state(zone, -(1 << order),
get_pcppage_migratetype(page));
}
- __mod_zone_page_state(zone, NR_ALLOC_BATCH, -(1 << order));
if (atomic_long_read(&zone->vm_stat[NR_ALLOC_BATCH]) <= 0 &&
!test_bit(ZONE_FAIR_DEPLETED, &zone->flags))
set_bit(ZONE_FAIR_DEPLETED, &zone->flags);
--
2.6.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 07/24] mm, page_alloc: Avoid unnecessary zone lookups during pageblock operations
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
` (5 preceding siblings ...)
2016-04-12 10:12 ` [PATCH 06/24] mm, page_alloc: Use __dec_zone_state for order-0 page allocation Mel Gorman
@ 2016-04-12 10:12 ` Mel Gorman
2016-04-12 10:12 ` [PATCH 08/24] mm, page_alloc: Convert alloc_flags to unsigned Mel Gorman
` (16 subsequent siblings)
23 siblings, 0 replies; 25+ messages in thread
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
Pageblocks have an associated bitmap to store migrate types and whether
the pageblock should be skipped during compaction. The bitmap may be
associated with a memory section or a zone, but the zone is looked up
unconditionally. The compiler should optimise the lookup away
automatically, so in many cases this is a cosmetic patch only.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 22 +++++++++-------------
1 file changed, 9 insertions(+), 13 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ab16560b76e6..d00847bb1612 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6759,23 +6759,23 @@ void *__init alloc_large_system_hash(const char *tablename,
}
/* Return a pointer to the bitmap storing bits affecting a block of pages */
-static inline unsigned long *get_pageblock_bitmap(struct zone *zone,
+static inline unsigned long *get_pageblock_bitmap(struct page *page,
unsigned long pfn)
{
#ifdef CONFIG_SPARSEMEM
return __pfn_to_section(pfn)->pageblock_flags;
#else
- return zone->pageblock_flags;
+ return page_zone(page)->pageblock_flags;
#endif /* CONFIG_SPARSEMEM */
}
-static inline int pfn_to_bitidx(struct zone *zone, unsigned long pfn)
+static inline int pfn_to_bitidx(struct page *page, unsigned long pfn)
{
#ifdef CONFIG_SPARSEMEM
pfn &= (PAGES_PER_SECTION-1);
return (pfn >> pageblock_order) * NR_PAGEBLOCK_BITS;
#else
- pfn = pfn - round_down(zone->zone_start_pfn, pageblock_nr_pages);
+ pfn = pfn - round_down(page_zone(page)->zone_start_pfn, pageblock_nr_pages);
return (pfn >> pageblock_order) * NR_PAGEBLOCK_BITS;
#endif /* CONFIG_SPARSEMEM */
}
@@ -6793,14 +6793,12 @@ unsigned long get_pfnblock_flags_mask(struct page *page, unsigned long pfn,
unsigned long end_bitidx,
unsigned long mask)
{
- struct zone *zone;
unsigned long *bitmap;
unsigned long bitidx, word_bitidx;
unsigned long word;
- zone = page_zone(page);
- bitmap = get_pageblock_bitmap(zone, pfn);
- bitidx = pfn_to_bitidx(zone, pfn);
+ bitmap = get_pageblock_bitmap(page, pfn);
+ bitidx = pfn_to_bitidx(page, pfn);
word_bitidx = bitidx / BITS_PER_LONG;
bitidx &= (BITS_PER_LONG-1);
@@ -6822,20 +6820,18 @@ void set_pfnblock_flags_mask(struct page *page, unsigned long flags,
unsigned long end_bitidx,
unsigned long mask)
{
- struct zone *zone;
unsigned long *bitmap;
unsigned long bitidx, word_bitidx;
unsigned long old_word, word;
BUILD_BUG_ON(NR_PAGEBLOCK_BITS != 4);
- zone = page_zone(page);
- bitmap = get_pageblock_bitmap(zone, pfn);
- bitidx = pfn_to_bitidx(zone, pfn);
+ bitmap = get_pageblock_bitmap(page, pfn);
+ bitidx = pfn_to_bitidx(page, pfn);
word_bitidx = bitidx / BITS_PER_LONG;
bitidx &= (BITS_PER_LONG-1);
- VM_BUG_ON_PAGE(!zone_spans_pfn(zone, pfn), page);
+ VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn), page);
bitidx += end_bitidx;
mask <<= (BITS_PER_LONG - bitidx - 1);
--
2.6.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 08/24] mm, page_alloc: Convert alloc_flags to unsigned
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
` (6 preceding siblings ...)
2016-04-12 10:12 ` [PATCH 07/24] mm, page_alloc: Avoid unnecessary zone lookups during pageblock operations Mel Gorman
@ 2016-04-12 10:12 ` Mel Gorman
2016-04-12 10:12 ` [PATCH 09/24] mm, page_alloc: Convert nr_fair_skipped to bool Mel Gorman
` (15 subsequent siblings)
23 siblings, 0 replies; 25+ messages in thread
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
alloc_flags is a bitmask of flags but it is declared signed, which does
not necessarily generate the best code depending on the compiler. Even
without a measurable impact, it makes more sense for this to be unsigned.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
include/linux/compaction.h | 6 +++---
include/linux/mmzone.h | 3 ++-
mm/compaction.c | 12 +++++++-----
mm/internal.h | 2 +-
mm/page_alloc.c | 26 ++++++++++++++------------
5 files changed, 27 insertions(+), 22 deletions(-)
diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index d7c8de583a23..242b660f64e6 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -39,12 +39,12 @@ extern int sysctl_compact_unevictable_allowed;
extern int fragmentation_index(struct zone *zone, unsigned int order);
extern unsigned long try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
- int alloc_flags, const struct alloc_context *ac,
- enum migrate_mode mode, int *contended);
+ unsigned int alloc_flags, const struct alloc_context *ac,
+ enum migrate_mode mode, int *contended);
extern void compact_pgdat(pg_data_t *pgdat, int order);
extern void reset_isolation_suitable(pg_data_t *pgdat);
extern unsigned long compaction_suitable(struct zone *zone, int order,
- int alloc_flags, int classzone_idx);
+ unsigned int alloc_flags, int classzone_idx);
extern void defer_compaction(struct zone *zone, int order);
extern bool compaction_deferred(struct zone *zone, int order);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 0c4d5ebb3849..f49bb9add372 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -747,7 +747,8 @@ extern struct mutex zonelists_mutex;
void build_all_zonelists(pg_data_t *pgdat, struct zone *zone);
void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx);
bool zone_watermark_ok(struct zone *z, unsigned int order,
- unsigned long mark, int classzone_idx, int alloc_flags);
+ unsigned long mark, int classzone_idx,
+ unsigned int alloc_flags);
bool zone_watermark_ok_safe(struct zone *z, unsigned int order,
unsigned long mark, int classzone_idx);
enum memmap_context {
diff --git a/mm/compaction.c b/mm/compaction.c
index ccf97b02b85f..244bb669b5a6 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1259,7 +1259,8 @@ static int compact_finished(struct zone *zone, struct compact_control *cc,
* COMPACT_CONTINUE - If compaction should run now
*/
static unsigned long __compaction_suitable(struct zone *zone, int order,
- int alloc_flags, int classzone_idx)
+ unsigned int alloc_flags,
+ int classzone_idx)
{
int fragindex;
unsigned long watermark;
@@ -1304,7 +1305,8 @@ static unsigned long __compaction_suitable(struct zone *zone, int order,
}
unsigned long compaction_suitable(struct zone *zone, int order,
- int alloc_flags, int classzone_idx)
+ unsigned int alloc_flags,
+ int classzone_idx)
{
unsigned long ret;
@@ -1464,7 +1466,7 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
static unsigned long compact_zone_order(struct zone *zone, int order,
gfp_t gfp_mask, enum migrate_mode mode, int *contended,
- int alloc_flags, int classzone_idx)
+ unsigned int alloc_flags, int classzone_idx)
{
unsigned long ret;
struct compact_control cc = {
@@ -1505,8 +1507,8 @@ int sysctl_extfrag_threshold = 500;
* This is the main entry point for direct page compaction.
*/
unsigned long try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
- int alloc_flags, const struct alloc_context *ac,
- enum migrate_mode mode, int *contended)
+ unsigned int alloc_flags, const struct alloc_context *ac,
+ enum migrate_mode mode, int *contended)
{
int may_enter_fs = gfp_mask & __GFP_FS;
int may_perform_io = gfp_mask & __GFP_IO;
diff --git a/mm/internal.h b/mm/internal.h
index b79abb6721cf..f6d0a5875ec4 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -175,7 +175,7 @@ struct compact_control {
bool direct_compaction; /* False from kcompactd or /proc/... */
int order; /* order a direct compactor needs */
const gfp_t gfp_mask; /* gfp mask of a direct compactor */
- const int alloc_flags; /* alloc flags of a direct compactor */
+ const unsigned int alloc_flags; /* alloc flags of a direct compactor */
const int classzone_idx; /* zone index of a direct compactor */
struct zone *zone;
int contended; /* Signal need_sched() or lock
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d00847bb1612..4bce6298dd07 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1526,7 +1526,7 @@ static inline bool free_pages_prezeroed(bool poisoned)
}
static int prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags,
- int alloc_flags)
+ unsigned int alloc_flags)
{
int i;
bool poisoned = true;
@@ -2388,7 +2388,8 @@ static inline void zone_statistics(struct zone *preferred_zone, struct zone *z,
static inline
struct page *buffered_rmqueue(struct zone *preferred_zone,
struct zone *zone, unsigned int order,
- gfp_t gfp_flags, int alloc_flags, int migratetype)
+ gfp_t gfp_flags, unsigned int alloc_flags,
+ int migratetype)
{
unsigned long flags;
struct page *page;
@@ -2542,12 +2543,13 @@ static inline bool should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
* to check in the allocation paths if no pages are free.
*/
static bool __zone_watermark_ok(struct zone *z, unsigned int order,
- unsigned long mark, int classzone_idx, int alloc_flags,
+ unsigned long mark, int classzone_idx,
+ unsigned int alloc_flags,
long free_pages)
{
long min = mark;
int o;
- const int alloc_harder = (alloc_flags & ALLOC_HARDER);
+ const bool alloc_harder = (alloc_flags & ALLOC_HARDER);
/* free_pages may go negative - that's OK */
free_pages -= (1 << order) - 1;
@@ -2610,7 +2612,7 @@ static bool __zone_watermark_ok(struct zone *z, unsigned int order,
}
bool zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,
- int classzone_idx, int alloc_flags)
+ int classzone_idx, unsigned int alloc_flags)
{
return __zone_watermark_ok(z, order, mark, classzone_idx, alloc_flags,
zone_page_state(z, NR_FREE_PAGES));
@@ -2958,7 +2960,7 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
/* Try memory compaction for high-order allocations before reclaim */
static struct page *
__alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
- int alloc_flags, const struct alloc_context *ac,
+ unsigned int alloc_flags, const struct alloc_context *ac,
enum migrate_mode mode, int *contended_compaction,
bool *deferred_compaction)
{
@@ -3014,7 +3016,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
#else
static inline struct page *
__alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
- int alloc_flags, const struct alloc_context *ac,
+ unsigned int alloc_flags, const struct alloc_context *ac,
enum migrate_mode mode, int *contended_compaction,
bool *deferred_compaction)
{
@@ -3054,7 +3056,7 @@ __perform_reclaim(gfp_t gfp_mask, unsigned int order,
/* The really slow allocator path where we enter direct reclaim */
static inline struct page *
__alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
- int alloc_flags, const struct alloc_context *ac,
+ unsigned int alloc_flags, const struct alloc_context *ac,
unsigned long *did_some_progress)
{
struct page *page = NULL;
@@ -3093,10 +3095,10 @@ static void wake_all_kswapds(unsigned int order, const struct alloc_context *ac)
wakeup_kswapd(zone, order, zone_idx(ac->preferred_zone));
}
-static inline int
+static inline unsigned int
gfp_to_alloc_flags(gfp_t gfp_mask)
{
- int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
+ unsigned int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
/* __GFP_HIGH is assumed to be the same as ALLOC_HIGH to save a branch. */
BUILD_BUG_ON(__GFP_HIGH != (__force gfp_t) ALLOC_HIGH);
@@ -3157,7 +3159,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
{
bool can_direct_reclaim = gfp_mask & __GFP_DIRECT_RECLAIM;
struct page *page = NULL;
- int alloc_flags;
+ unsigned int alloc_flags;
unsigned long pages_reclaimed = 0;
unsigned long did_some_progress;
enum migrate_mode migration_mode = MIGRATE_ASYNC;
@@ -3349,7 +3351,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
struct zoneref *preferred_zoneref;
struct page *page = NULL;
unsigned int cpuset_mems_cookie;
- int alloc_flags = ALLOC_WMARK_LOW|ALLOC_FAIR;
+ unsigned int alloc_flags = ALLOC_WMARK_LOW|ALLOC_FAIR;
gfp_t alloc_mask; /* The gfp_t that was actually used for allocation */
struct alloc_context ac = {
.high_zoneidx = gfp_zone(gfp_mask),
--
2.6.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 09/24] mm, page_alloc: Convert nr_fair_skipped to bool
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
` (7 preceding siblings ...)
2016-04-12 10:12 ` [PATCH 08/24] mm, page_alloc: Convert alloc_flags to unsigned Mel Gorman
@ 2016-04-12 10:12 ` Mel Gorman
2016-04-12 10:12 ` [PATCH 10/24] mm, page_alloc: Remove unnecessary local variable in get_page_from_freelist Mel Gorman
` (14 subsequent siblings)
23 siblings, 0 replies; 25+ messages in thread
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
The number of zones skipped due to a zone expiring its fair zone allocation
quota is irrelevant; only whether any zone was skipped matters. Convert to bool.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4bce6298dd07..e778485a64c1 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2677,7 +2677,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
struct zoneref *z;
struct page *page = NULL;
struct zone *zone;
- int nr_fair_skipped = 0;
+ bool fair_skipped;
bool zonelist_rescan;
zonelist_scan:
@@ -2705,7 +2705,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
if (!zone_local(ac->preferred_zone, zone))
break;
if (test_bit(ZONE_FAIR_DEPLETED, &zone->flags)) {
- nr_fair_skipped++;
+ fair_skipped = true;
continue;
}
}
@@ -2798,7 +2798,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
*/
if (alloc_flags & ALLOC_FAIR) {
alloc_flags &= ~ALLOC_FAIR;
- if (nr_fair_skipped) {
+ if (fair_skipped) {
zonelist_rescan = true;
reset_alloc_batches(ac->preferred_zone);
}
--
2.6.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 10/24] mm, page_alloc: Remove unnecessary local variable in get_page_from_freelist
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
` (8 preceding siblings ...)
2016-04-12 10:12 ` [PATCH 09/24] mm, page_alloc: Convert nr_fair_skipped to bool Mel Gorman
@ 2016-04-12 10:12 ` Mel Gorman
2016-04-12 10:12 ` [PATCH 11/24] mm, page_alloc: Remove unnecessary initialisation " Mel Gorman
` (13 subsequent siblings)
23 siblings, 0 replies; 25+ messages in thread
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
zonelist here is a copy of a struct field that is used once. Ditch it.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e778485a64c1..313db1c43839 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2673,7 +2673,6 @@ static struct page *
get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
const struct alloc_context *ac)
{
- struct zonelist *zonelist = ac->zonelist;
struct zoneref *z;
struct page *page = NULL;
struct zone *zone;
@@ -2687,7 +2686,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
* Scan zonelist, looking for a zone with enough free.
* See also __cpuset_node_allowed() comment in kernel/cpuset.c.
*/
- for_each_zone_zonelist_nodemask(zone, z, zonelist, ac->high_zoneidx,
+ for_each_zone_zonelist_nodemask(zone, z, ac->zonelist, ac->high_zoneidx,
ac->nodemask) {
unsigned long mark;
--
2.6.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 11/24] mm, page_alloc: Remove unnecessary initialisation in get_page_from_freelist
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
` (9 preceding siblings ...)
2016-04-12 10:12 ` [PATCH 10/24] mm, page_alloc: Remove unnecessary local variable in get_page_from_freelist Mel Gorman
@ 2016-04-12 10:12 ` Mel Gorman
2016-04-12 10:12 ` [PATCH 12/24] mm, page_alloc: Remove unnecessary initialisation from __alloc_pages_nodemask() Mel Gorman
` (12 subsequent siblings)
23 siblings, 0 replies; 25+ messages in thread
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
See subject.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 313db1c43839..f5ddb342c967 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2674,7 +2674,6 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
const struct alloc_context *ac)
{
struct zoneref *z;
- struct page *page = NULL;
struct zone *zone;
bool fair_skipped;
bool zonelist_rescan;
@@ -2688,6 +2687,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
*/
for_each_zone_zonelist_nodemask(zone, z, ac->zonelist, ac->high_zoneidx,
ac->nodemask) {
+ struct page *page;
unsigned long mark;
if (cpusets_enabled() &&
--
2.6.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 12/24] mm, page_alloc: Remove unnecessary initialisation from __alloc_pages_nodemask()
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
` (10 preceding siblings ...)
2016-04-12 10:12 ` [PATCH 11/24] mm, page_alloc: Remove unnecessary initialisation " Mel Gorman
@ 2016-04-12 10:12 ` Mel Gorman
2016-04-12 10:12 ` [PATCH 13/24] mm, page_alloc: Remove redundant check for empty zonelist Mel Gorman
` (11 subsequent siblings)
23 siblings, 0 replies; 25+ messages in thread
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
page is guaranteed to be set before it is read, with or without the
initialisation.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f5ddb342c967..df03ccc7f07c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3348,7 +3348,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, nodemask_t *nodemask)
{
struct zoneref *preferred_zoneref;
- struct page *page = NULL;
+ struct page *page;
unsigned int cpuset_mems_cookie;
unsigned int alloc_flags = ALLOC_WMARK_LOW|ALLOC_FAIR;
gfp_t alloc_mask; /* The gfp_t that was actually used for allocation */
--
2.6.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 13/24] mm, page_alloc: Remove redundant check for empty zonelist
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
` (11 preceding siblings ...)
2016-04-12 10:12 ` [PATCH 12/24] mm, page_alloc: Remove unnecessary initialisation from __alloc_pages_nodemask() Mel Gorman
@ 2016-04-12 10:12 ` Mel Gorman
2016-04-12 10:12 ` [PATCH 14/24] mm, page_alloc: Simplify last cpupid reset Mel Gorman
` (10 subsequent siblings)
23 siblings, 0 replies; 25+ messages in thread
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
A check is made for an empty zonelist early in the page allocator fast
path but it's unnecessary. The check after the first_zones_zonelist call
catches that situation. Removing the first check makes the failure
slightly slower on machines with memoryless nodes but that is a corner
case that can live with the overhead.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 8 --------
1 file changed, 8 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index df03ccc7f07c..e50e754ec9eb 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3374,14 +3374,6 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
if (should_fail_alloc_page(gfp_mask, order))
return NULL;
- /*
- * Check the zones suitable for the gfp_mask contain at least one
- * valid zone. It's possible to have an empty zonelist as a result
- * of __GFP_THISNODE and a memoryless node
- */
- if (unlikely(!zonelist->_zonerefs->zone))
- return NULL;
-
if (IS_ENABLED(CONFIG_CMA) && ac.migratetype == MIGRATE_MOVABLE)
alloc_flags |= ALLOC_CMA;
--
2.6.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 14/24] mm, page_alloc: Simplify last cpupid reset
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
` (12 preceding siblings ...)
2016-04-12 10:12 ` [PATCH 13/24] mm, page_alloc: Remove redundant check for empty zonelist Mel Gorman
@ 2016-04-12 10:12 ` Mel Gorman
2016-04-12 10:12 ` [PATCH 15/24] mm, page_alloc: Move might_sleep_if check to the allocator slowpath Mel Gorman
` (9 subsequent siblings)
23 siblings, 0 replies; 25+ messages in thread
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
The current reset unnecessarily clears flags and makes pointless calculations.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
include/linux/mm.h | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ffcff53e3b2b..60656db00abd 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -837,10 +837,7 @@ extern int page_cpupid_xchg_last(struct page *page, int cpupid);
static inline void page_cpupid_reset_last(struct page *page)
{
- int cpupid = (1 << LAST_CPUPID_SHIFT) - 1;
-
- page->flags &= ~(LAST_CPUPID_MASK << LAST_CPUPID_PGSHIFT);
- page->flags |= (cpupid & LAST_CPUPID_MASK) << LAST_CPUPID_PGSHIFT;
+ page->flags |= LAST_CPUPID_MASK << LAST_CPUPID_PGSHIFT;
}
#endif /* LAST_CPUPID_NOT_IN_PAGE_FLAGS */
#else /* !CONFIG_NUMA_BALANCING */
--
2.6.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 15/24] mm, page_alloc: Move might_sleep_if check to the allocator slowpath
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
` (13 preceding siblings ...)
2016-04-12 10:12 ` [PATCH 14/24] mm, page_alloc: Simplify last cpupid reset Mel Gorman
@ 2016-04-12 10:12 ` Mel Gorman
2016-04-12 10:12 ` [PATCH 16/24] mm, page_alloc: Move __GFP_HARDWALL modifications out of the fastpath Mel Gorman
` (8 subsequent siblings)
23 siblings, 0 replies; 25+ messages in thread
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
There is a debugging check for callers that specify __GFP_DIRECT_RECLAIM
from a context that cannot sleep. Triggering this is almost certainly
a bug, but it's also overhead in the fast path. Move the check to the
slow path. It'll be harder to trigger as it'll only be checked when
watermarks are depleted, but it'll also only be checked in a path that
can sleep.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e50e754ec9eb..73dc0413e997 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3176,6 +3176,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
return NULL;
}
+ might_sleep_if(gfp_mask & __GFP_DIRECT_RECLAIM);
+
/*
* We also sanity check to catch abuse of atomic reserves being used by
* callers that are not in atomic context.
@@ -3369,8 +3371,6 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
lockdep_trace_alloc(gfp_mask);
- might_sleep_if(gfp_mask & __GFP_DIRECT_RECLAIM);
-
if (should_fail_alloc_page(gfp_mask, order))
return NULL;
--
2.6.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 16/24] mm, page_alloc: Move __GFP_HARDWALL modifications out of the fastpath
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
` (14 preceding siblings ...)
2016-04-12 10:12 ` [PATCH 15/24] mm, page_alloc: Move might_sleep_if check to the allocator slowpath Mel Gorman
@ 2016-04-12 10:12 ` Mel Gorman
2016-04-12 10:12 ` [PATCH 17/24] mm, page_alloc: Reduce cost of fair zone allocation policy retry Mel Gorman
` (7 subsequent siblings)
23 siblings, 0 replies; 25+ messages in thread
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
__GFP_HARDWALL only has meaning in the context of cpusets but the fast path
always applies the flag on the first attempt. Move the manipulations into
the cpuset paths where they will be masked by a static branch in the common
case.
With the other micro-optimisations in this series combined, the impact on
a page allocator microbenchmark is
4.6.0-rc2 4.6.0-rc2
decstat-v1r20 micro-v1r20
Min alloc-odr0-1 381.00 ( 0.00%) 377.00 ( 1.05%)
Min alloc-odr0-2 275.00 ( 0.00%) 273.00 ( 0.73%)
Min alloc-odr0-4 229.00 ( 0.00%) 226.00 ( 1.31%)
Min alloc-odr0-8 199.00 ( 0.00%) 196.00 ( 1.51%)
Min alloc-odr0-16 186.00 ( 0.00%) 183.00 ( 1.61%)
Min alloc-odr0-32 179.00 ( 0.00%) 175.00 ( 2.23%)
Min alloc-odr0-64 174.00 ( 0.00%) 172.00 ( 1.15%)
Min alloc-odr0-128 172.00 ( 0.00%) 170.00 ( 1.16%)
Min alloc-odr0-256 181.00 ( 0.00%) 183.00 ( -1.10%)
Min alloc-odr0-512 193.00 ( 0.00%) 191.00 ( 1.04%)
Min alloc-odr0-1024 201.00 ( 0.00%) 199.00 ( 1.00%)
Min alloc-odr0-2048 206.00 ( 0.00%) 204.00 ( 0.97%)
Min alloc-odr0-4096 212.00 ( 0.00%) 210.00 ( 0.94%)
Min alloc-odr0-8192 215.00 ( 0.00%) 213.00 ( 0.93%)
Min alloc-odr0-16384 216.00 ( 0.00%) 214.00 ( 0.93%)
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 73dc0413e997..219e0d05ed88 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3353,7 +3353,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
struct page *page;
unsigned int cpuset_mems_cookie;
unsigned int alloc_flags = ALLOC_WMARK_LOW|ALLOC_FAIR;
- gfp_t alloc_mask; /* The gfp_t that was actually used for allocation */
+ gfp_t alloc_mask = gfp_mask; /* The gfp_t that was actually used for allocation */
struct alloc_context ac = {
.high_zoneidx = gfp_zone(gfp_mask),
.zonelist = zonelist,
@@ -3362,6 +3362,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
};
if (cpusets_enabled()) {
+ alloc_mask |= __GFP_HARDWALL;
alloc_flags |= ALLOC_CPUSET;
if (!ac.nodemask)
ac.nodemask = &cpuset_current_mems_allowed;
@@ -3391,7 +3392,6 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
ac.classzone_idx = zonelist_zone_idx(preferred_zoneref);
/* First allocation attempt */
- alloc_mask = gfp_mask|__GFP_HARDWALL;
page = get_page_from_freelist(alloc_mask, order, alloc_flags, &ac);
if (unlikely(!page)) {
/*
@@ -3417,8 +3417,10 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
* the mask is being updated. If a page allocation is about to fail,
* check if the cpuset changed during allocation and if so, retry.
*/
- if (unlikely(!page && read_mems_allowed_retry(cpuset_mems_cookie)))
+ if (unlikely(!page && read_mems_allowed_retry(cpuset_mems_cookie))) {
+ alloc_mask = gfp_mask;
goto retry_cpuset;
+ }
return page;
}
--
2.6.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 17/24] mm, page_alloc: Reduce cost of fair zone allocation policy retry
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
` (15 preceding siblings ...)
2016-04-12 10:12 ` [PATCH 16/24] mm, page_alloc: Move __GFP_HARDWALL modifications out of the fastpath Mel Gorman
@ 2016-04-12 10:12 ` Mel Gorman
2016-04-12 10:12 ` [PATCH 18/24] mm, page_alloc: Shortcut watermark checks for order-0 pages Mel Gorman
` (6 subsequent siblings)
23 siblings, 0 replies; 25+ messages in thread
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
The fair zone allocation policy is not free but its cost can be reduced
slightly. This patch removes an unnecessary local variable, checks the
likely conditions of the fair zone policy first, uses a bool instead of
a flags check and falls through when a remote node is encountered instead
of doing a full restart. The benefit is marginal but it's there:
4.6.0-rc2 4.6.0-rc2
decstat-v1r20 optfair-v1r20
Min alloc-odr0-1 377.00 ( 0.00%) 380.00 ( -0.80%)
Min alloc-odr0-2 273.00 ( 0.00%) 273.00 ( 0.00%)
Min alloc-odr0-4 226.00 ( 0.00%) 227.00 ( -0.44%)
Min alloc-odr0-8 196.00 ( 0.00%) 196.00 ( 0.00%)
Min alloc-odr0-16 183.00 ( 0.00%) 183.00 ( 0.00%)
Min alloc-odr0-32 175.00 ( 0.00%) 173.00 ( 1.14%)
Min alloc-odr0-64 172.00 ( 0.00%) 169.00 ( 1.74%)
Min alloc-odr0-128 170.00 ( 0.00%) 169.00 ( 0.59%)
Min alloc-odr0-256 183.00 ( 0.00%) 180.00 ( 1.64%)
Min alloc-odr0-512 191.00 ( 0.00%) 190.00 ( 0.52%)
Min alloc-odr0-1024 199.00 ( 0.00%) 198.00 ( 0.50%)
Min alloc-odr0-2048 204.00 ( 0.00%) 204.00 ( 0.00%)
Min alloc-odr0-4096 210.00 ( 0.00%) 209.00 ( 0.48%)
Min alloc-odr0-8192 213.00 ( 0.00%) 213.00 ( 0.00%)
Min alloc-odr0-16384 214.00 ( 0.00%) 214.00 ( 0.00%)
The benefit is marginal at best, but one of the most important gains,
avoiding a second search when falling back to another node, is not triggered
by this particular test, so the improvement in some corner cases is understated.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 32 ++++++++++++++------------------
1 file changed, 14 insertions(+), 18 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 219e0d05ed88..25a8ab07b287 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2675,12 +2675,10 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
{
struct zoneref *z;
struct zone *zone;
- bool fair_skipped;
- bool zonelist_rescan;
+ bool fair_skipped = false;
+ bool apply_fair = (alloc_flags & ALLOC_FAIR);
zonelist_scan:
- zonelist_rescan = false;
-
/*
* Scan zonelist, looking for a zone with enough free.
* See also __cpuset_node_allowed() comment in kernel/cpuset.c.
@@ -2700,13 +2698,16 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
* page was allocated in should have no effect on the
* time the page has in memory before being reclaimed.
*/
- if (alloc_flags & ALLOC_FAIR) {
- if (!zone_local(ac->preferred_zone, zone))
- break;
+ if (apply_fair) {
if (test_bit(ZONE_FAIR_DEPLETED, &zone->flags)) {
fair_skipped = true;
continue;
}
+ if (!zone_local(ac->preferred_zone, zone)) {
+ if (fair_skipped)
+ goto reset_fair;
+ apply_fair = false;
+ }
}
/*
* When allocating a page cache page for writing, we
@@ -2795,18 +2796,13 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
* include remote zones now, before entering the slowpath and waking
* kswapd: prefer spilling to a remote zone over swapping locally.
*/
- if (alloc_flags & ALLOC_FAIR) {
- alloc_flags &= ~ALLOC_FAIR;
- if (fair_skipped) {
- zonelist_rescan = true;
- reset_alloc_batches(ac->preferred_zone);
- }
- if (nr_online_nodes > 1)
- zonelist_rescan = true;
- }
-
- if (zonelist_rescan)
+ if (fair_skipped) {
+reset_fair:
+ apply_fair = false;
+ fair_skipped = false;
+ reset_alloc_batches(ac->preferred_zone);
goto zonelist_scan;
+ }
return NULL;
}
--
2.6.4
* [PATCH 18/24] mm, page_alloc: Shortcut watermark checks for order-0 pages
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
` (16 preceding siblings ...)
2016-04-12 10:12 ` [PATCH 17/24] mm, page_alloc: Reduce cost of fair zone allocation policy retry Mel Gorman
@ 2016-04-12 10:12 ` Mel Gorman
2016-04-12 10:12 ` [PATCH 19/24] mm, page_alloc: Avoid looking up the first zone in a zonelist twice Mel Gorman
` (5 subsequent siblings)
23 siblings, 0 replies; 25+ messages in thread
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
Watermarks have to be checked on every allocation, taking into account the
number of pages being allocated and whether reserves can be accessed. The
reserves only matter if memory is limited and the free_pages adjustment only
applies to high-order pages. This patch adds a shortcut for order-0 pages that
avoids numerous calculations when there is plenty of free memory, yielding the
following performance difference in a page allocator microbenchmark:
4.6.0-rc2 4.6.0-rc2
optfair-v1r20 fastmark-v1r20
Min alloc-odr0-1 380.00 ( 0.00%) 364.00 ( 4.21%)
Min alloc-odr0-2 273.00 ( 0.00%) 262.00 ( 4.03%)
Min alloc-odr0-4 227.00 ( 0.00%) 214.00 ( 5.73%)
Min alloc-odr0-8 196.00 ( 0.00%) 186.00 ( 5.10%)
Min alloc-odr0-16 183.00 ( 0.00%) 173.00 ( 5.46%)
Min alloc-odr0-32 173.00 ( 0.00%) 165.00 ( 4.62%)
Min alloc-odr0-64 169.00 ( 0.00%) 161.00 ( 4.73%)
Min alloc-odr0-128 169.00 ( 0.00%) 159.00 ( 5.92%)
Min alloc-odr0-256 180.00 ( 0.00%) 168.00 ( 6.67%)
Min alloc-odr0-512 190.00 ( 0.00%) 180.00 ( 5.26%)
Min alloc-odr0-1024 198.00 ( 0.00%) 190.00 ( 4.04%)
Min alloc-odr0-2048 204.00 ( 0.00%) 196.00 ( 3.92%)
Min alloc-odr0-4096 209.00 ( 0.00%) 202.00 ( 3.35%)
Min alloc-odr0-8192 213.00 ( 0.00%) 206.00 ( 3.29%)
Min alloc-odr0-16384 214.00 ( 0.00%) 206.00 ( 3.74%)
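The shortcut can be sketched as a self-contained userspace mock (the field
names echo the patch but the struct, values and slow-path body here are
hypothetical, not kernel code): one subtraction and one compare serve the
common order-0 case, and everything else falls back to the full check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mock of a zone's free-page accounting. */
struct mock_zone {
	long nr_free_pages;
	long nr_free_cma_pages;
	long lowmem_reserve;	/* reserve for the requested classzone */
};

static bool slow_check_called;

/* Stand-in for __zone_watermark_ok(): the expensive, exact check. */
static bool mock_watermark_slow(const struct mock_zone *z, unsigned int order,
				long mark, long free_pages)
{
	(void)order;
	slow_check_called = true;
	return free_pages > mark + z->lowmem_reserve;	/* simplified */
}

/* Order-0 fast path: one subtraction and one compare when memory is
 * plentiful; fall back to the full check otherwise. */
static bool mock_watermark_fast(const struct mock_zone *z, unsigned int order,
				long mark, bool can_use_cma)
{
	long free_pages = z->nr_free_pages;
	long cma_pages = can_use_cma ? 0 : z->nr_free_cma_pages;

	if (!order && (free_pages - cma_pages) > mark + z->lowmem_reserve)
		return true;

	return mock_watermark_slow(z, order, mark, free_pages);
}
```

As in the patch, the fast check may spuriously fail (and take the slow path)
but never spuriously succeeds for order-0, which keeps it safe.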
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 28 +++++++++++++++++++++++++++-
1 file changed, 27 insertions(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 25a8ab07b287..c131218913e8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2618,6 +2618,32 @@ bool zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,
zone_page_state(z, NR_FREE_PAGES));
}
+static inline bool zone_watermark_fast(struct zone *z, unsigned int order,
+ unsigned long mark, int classzone_idx, unsigned int alloc_flags)
+{
+ long free_pages = zone_page_state(z, NR_FREE_PAGES);
+ long cma_pages = 0;
+
+#ifdef CONFIG_CMA
+ /* If allocation can't use CMA areas don't use free CMA pages */
+ if (!(alloc_flags & ALLOC_CMA))
+ cma_pages = zone_page_state(z, NR_FREE_CMA_PAGES);
+#endif
+
+ /*
+ * Fast check for order-0 only. If this fails then the reserves
+ * need to be calculated. There is a corner case where the check
+ * passes but only the high-order atomic reserve are free. If
+ * the caller is !atomic then it'll uselessly search the free
+ * list. That corner case is then slower but it is harmless.
+ */
+ if (!order && (free_pages - cma_pages) > mark + z->lowmem_reserve[classzone_idx])
+ return true;
+
+ return __zone_watermark_ok(z, order, mark, classzone_idx, alloc_flags,
+ free_pages);
+}
+
bool zone_watermark_ok_safe(struct zone *z, unsigned int order,
unsigned long mark, int classzone_idx)
{
@@ -2739,7 +2765,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
continue;
mark = zone->watermark[alloc_flags & ALLOC_WMARK_MASK];
- if (!zone_watermark_ok(zone, order, mark,
+ if (!zone_watermark_fast(zone, order, mark,
ac->classzone_idx, alloc_flags)) {
int ret;
--
2.6.4
* [PATCH 19/24] mm, page_alloc: Avoid looking up the first zone in a zonelist twice
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
` (17 preceding siblings ...)
2016-04-12 10:12 ` [PATCH 18/24] mm, page_alloc: Shortcut watermark checks for order-0 pages Mel Gorman
@ 2016-04-12 10:12 ` Mel Gorman
2016-04-12 10:12 ` [PATCH 20/24] mm, page_alloc: Check multiple page fields with a single branch Mel Gorman
` (4 subsequent siblings)
23 siblings, 0 replies; 25+ messages in thread
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
The allocator fast path looks up the first usable zone in a zonelist
and then get_page_from_freelist does the same job in the zonelist
iterator. This patch preserves the zoneref from the first lookup so the
search is not repeated.
4.6.0-rc2 4.6.0-rc2
fastmark-v1r20 initonce-v1r20
Min alloc-odr0-1 364.00 ( 0.00%) 359.00 ( 1.37%)
Min alloc-odr0-2 262.00 ( 0.00%) 260.00 ( 0.76%)
Min alloc-odr0-4 214.00 ( 0.00%) 214.00 ( 0.00%)
Min alloc-odr0-8 186.00 ( 0.00%) 186.00 ( 0.00%)
Min alloc-odr0-16 173.00 ( 0.00%) 173.00 ( 0.00%)
Min alloc-odr0-32 165.00 ( 0.00%) 165.00 ( 0.00%)
Min alloc-odr0-64 161.00 ( 0.00%) 162.00 ( -0.62%)
Min alloc-odr0-128 159.00 ( 0.00%) 161.00 ( -1.26%)
Min alloc-odr0-256 168.00 ( 0.00%) 170.00 ( -1.19%)
Min alloc-odr0-512 180.00 ( 0.00%) 181.00 ( -0.56%)
Min alloc-odr0-1024 190.00 ( 0.00%) 190.00 ( 0.00%)
Min alloc-odr0-2048 196.00 ( 0.00%) 196.00 ( 0.00%)
Min alloc-odr0-4096 202.00 ( 0.00%) 202.00 ( 0.00%)
Min alloc-odr0-8192 206.00 ( 0.00%) 205.00 ( 0.49%)
Min alloc-odr0-16384 206.00 ( 0.00%) 205.00 ( 0.49%)
The benefit is negligible and the results are within the noise but each
cycle counts.
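The idea can be sketched in plain userspace C (mock types, not the kernel
zoneref/zonelist definitions): the first lookup returns a position in the
array, and later iteration resumes from that cached position instead of
searching from the start again.

```c
#include <assert.h>
#include <stddef.h>

/* Mock of a zoneref entry; zone_id 0 means "no zone at this slot". */
struct mock_zoneref {
	int zone_id;
};

static int searches;

/* Stand-in for first_zones_zonelist(): linear scan for a usable zone. */
static struct mock_zoneref *first_usable(struct mock_zoneref *refs, size_t n)
{
	searches++;
	for (size_t i = 0; i < n; i++)
		if (refs[i].zone_id)
			return &refs[i];
	return NULL;
}

/* Fast path: look up once; both the statistics code and the zone walk
 * would then reuse the cached position instead of searching again. */
static int alloc_fastpath(struct mock_zoneref *refs, size_t n)
{
	struct mock_zoneref *preferred = first_usable(refs, n);

	if (!preferred)
		return -1;
	/* iteration would continue from 'preferred' here, no re-search */
	return preferred->zone_id;
}
```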
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
fs/buffer.c | 10 +++++-----
include/linux/mmzone.h | 18 +++++++++++-------
mm/internal.h | 2 +-
mm/mempolicy.c | 19 ++++++++++---------
mm/page_alloc.c | 34 ++++++++++++++++------------------
5 files changed, 43 insertions(+), 40 deletions(-)
diff --git a/fs/buffer.c b/fs/buffer.c
index af0d9a82a8ed..754813a6962b 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -255,17 +255,17 @@ __find_get_block_slow(struct block_device *bdev, sector_t block)
*/
static void free_more_memory(void)
{
- struct zone *zone;
+ struct zoneref *z;
int nid;
wakeup_flusher_threads(1024, WB_REASON_FREE_MORE_MEM);
yield();
for_each_online_node(nid) {
- (void)first_zones_zonelist(node_zonelist(nid, GFP_NOFS),
- gfp_zone(GFP_NOFS), NULL,
- &zone);
- if (zone)
+
+ z = first_zones_zonelist(node_zonelist(nid, GFP_NOFS),
+ gfp_zone(GFP_NOFS), NULL);
+ if (z->zone)
try_to_free_pages(node_zonelist(nid, GFP_NOFS), 0,
GFP_NOFS, NULL);
}
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index f49bb9add372..bf153ed097d5 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -962,13 +962,10 @@ static __always_inline struct zoneref *next_zones_zonelist(struct zoneref *z,
*/
static inline struct zoneref *first_zones_zonelist(struct zonelist *zonelist,
enum zone_type highest_zoneidx,
- nodemask_t *nodes,
- struct zone **zone)
+ nodemask_t *nodes)
{
- struct zoneref *z = next_zones_zonelist(zonelist->_zonerefs,
+ return next_zones_zonelist(zonelist->_zonerefs,
highest_zoneidx, nodes);
- *zone = zonelist_zone(z);
- return z;
}
/**
@@ -983,10 +980,17 @@ static inline struct zoneref *first_zones_zonelist(struct zonelist *zonelist,
* within a given nodemask
*/
#define for_each_zone_zonelist_nodemask(zone, z, zlist, highidx, nodemask) \
- for (z = first_zones_zonelist(zlist, highidx, nodemask, &zone); \
+ for (z = first_zones_zonelist(zlist, highidx, nodemask), zone = zonelist_zone(z); \
zone; \
z = next_zones_zonelist(++z, highidx, nodemask), \
- zone = zonelist_zone(z)) \
+ zone = zonelist_zone(z))
+
+#define for_next_zone_zonelist_nodemask(zone, z, zlist, highidx, nodemask) \
+ for (zone = z->zone; \
+ zone; \
+ z = next_zones_zonelist(++z, highidx, nodemask), \
+ zone = zonelist_zone(z))
+
/**
* for_each_zone_zonelist - helper macro to iterate over valid zones in a zonelist at or below a given zone index
diff --git a/mm/internal.h b/mm/internal.h
index f6d0a5875ec4..4c2396cd514c 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -102,7 +102,7 @@ extern pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address);
struct alloc_context {
struct zonelist *zonelist;
nodemask_t *nodemask;
- struct zone *preferred_zone;
+ struct zoneref *preferred_zoneref;
int classzone_idx;
int migratetype;
enum zone_type high_zoneidx;
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 36cc01bc950a..66d73efba370 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1744,18 +1744,18 @@ unsigned int mempolicy_slab_node(void)
return interleave_nodes(policy);
case MPOL_BIND: {
+ struct zoneref *z;
+
/*
* Follow bind policy behavior and start allocation at the
* first node.
*/
struct zonelist *zonelist;
- struct zone *zone;
enum zone_type highest_zoneidx = gfp_zone(GFP_KERNEL);
zonelist = &NODE_DATA(node)->node_zonelists[0];
- (void)first_zones_zonelist(zonelist, highest_zoneidx,
- &policy->v.nodes,
- &zone);
- return zone ? zone->node : node;
+ z = first_zones_zonelist(zonelist, highest_zoneidx,
+ &policy->v.nodes);
+ return z->zone ? z->zone->node : node;
}
default:
@@ -2284,7 +2284,7 @@ static void sp_free(struct sp_node *n)
int mpol_misplaced(struct page *page, struct vm_area_struct *vma, unsigned long addr)
{
struct mempolicy *pol;
- struct zone *zone;
+ struct zoneref *z;
int curnid = page_to_nid(page);
unsigned long pgoff;
int thiscpu = raw_smp_processor_id();
@@ -2316,6 +2316,7 @@ int mpol_misplaced(struct page *page, struct vm_area_struct *vma, unsigned long
break;
case MPOL_BIND:
+
/*
* allows binding to multiple nodes.
* use current page if in policy nodemask,
@@ -2324,11 +2325,11 @@ int mpol_misplaced(struct page *page, struct vm_area_struct *vma, unsigned long
*/
if (node_isset(curnid, pol->v.nodes))
goto out;
- (void)first_zones_zonelist(
+ z = first_zones_zonelist(
node_zonelist(numa_node_id(), GFP_HIGHUSER),
gfp_zone(GFP_HIGHUSER),
- &pol->v.nodes, &zone);
- polnid = zone->node;
+ &pol->v.nodes);
+ polnid = z->zone->node;
break;
default:
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c131218913e8..4019dfe26b11 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2699,7 +2699,7 @@ static struct page *
get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
const struct alloc_context *ac)
{
- struct zoneref *z;
+ struct zoneref *z = ac->preferred_zoneref;
struct zone *zone;
bool fair_skipped = false;
bool apply_fair = (alloc_flags & ALLOC_FAIR);
@@ -2709,7 +2709,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
* Scan zonelist, looking for a zone with enough free.
* See also __cpuset_node_allowed() comment in kernel/cpuset.c.
*/
- for_each_zone_zonelist_nodemask(zone, z, ac->zonelist, ac->high_zoneidx,
+ for_next_zone_zonelist_nodemask(zone, z, ac->zonelist, ac->high_zoneidx,
ac->nodemask) {
struct page *page;
unsigned long mark;
@@ -2729,7 +2729,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
fair_skipped = true;
continue;
}
- if (!zone_local(ac->preferred_zone, zone)) {
+ if (!zone_local(ac->preferred_zoneref->zone, zone)) {
if (fair_skipped)
goto reset_fair;
apply_fair = false;
@@ -2775,7 +2775,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
goto try_this_zone;
if (zone_reclaim_mode == 0 ||
- !zone_allows_reclaim(ac->preferred_zone, zone))
+ !zone_allows_reclaim(ac->preferred_zoneref->zone, zone))
continue;
ret = zone_reclaim(zone, gfp_mask, order);
@@ -2797,7 +2797,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
}
try_this_zone:
- page = buffered_rmqueue(ac->preferred_zone, zone, order,
+ page = buffered_rmqueue(ac->preferred_zoneref->zone, zone, order,
gfp_mask, alloc_flags, ac->migratetype);
if (page) {
if (prep_new_page(page, order, gfp_mask, alloc_flags))
@@ -2826,7 +2826,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
reset_fair:
apply_fair = false;
fair_skipped = false;
- reset_alloc_batches(ac->preferred_zone);
+ reset_alloc_batches(ac->preferred_zoneref->zone);
goto zonelist_scan;
}
@@ -3113,7 +3113,7 @@ static void wake_all_kswapds(unsigned int order, const struct alloc_context *ac)
for_each_zone_zonelist_nodemask(zone, z, ac->zonelist,
ac->high_zoneidx, ac->nodemask)
- wakeup_kswapd(zone, order, zone_idx(ac->preferred_zone));
+ wakeup_kswapd(zone, order, zonelist_zone_idx(ac->preferred_zoneref));
}
static inline unsigned int
@@ -3333,7 +3333,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
if ((did_some_progress && order <= PAGE_ALLOC_COSTLY_ORDER) ||
((gfp_mask & __GFP_REPEAT) && pages_reclaimed < (1 << order))) {
/* Wait for some write requests to complete then retry */
- wait_iff_congested(ac->preferred_zone, BLK_RW_ASYNC, HZ/50);
+ wait_iff_congested(ac->preferred_zoneref->zone, BLK_RW_ASYNC, HZ/50);
goto retry;
}
@@ -3371,7 +3371,6 @@ struct page *
__alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, nodemask_t *nodemask)
{
- struct zoneref *preferred_zoneref;
struct page *page;
unsigned int cpuset_mems_cookie;
unsigned int alloc_flags = ALLOC_WMARK_LOW|ALLOC_FAIR;
@@ -3407,11 +3406,11 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
ac.spread_dirty_pages = (gfp_mask & __GFP_WRITE);
/* The preferred zone is used for statistics later */
- preferred_zoneref = first_zones_zonelist(ac.zonelist, ac.high_zoneidx,
- ac.nodemask, &ac.preferred_zone);
- if (!ac.preferred_zone)
+ ac.preferred_zoneref = first_zones_zonelist(ac.zonelist, ac.high_zoneidx,
+ ac.nodemask);
+ if (!ac.preferred_zoneref->zone)
goto out;
- ac.classzone_idx = zonelist_zone_idx(preferred_zoneref);
+ ac.classzone_idx = zonelist_zone_idx(ac.preferred_zoneref);
/* First allocation attempt */
page = get_page_from_freelist(alloc_mask, order, alloc_flags, &ac);
@@ -4440,13 +4439,12 @@ static void build_zonelists(pg_data_t *pgdat)
*/
int local_memory_node(int node)
{
- struct zone *zone;
+ struct zoneref *z;
- (void)first_zones_zonelist(node_zonelist(node, GFP_KERNEL),
+ z = first_zones_zonelist(node_zonelist(node, GFP_KERNEL),
gfp_zone(GFP_KERNEL),
- NULL,
- &zone);
- return zone->node;
+ NULL);
+ return z->zone->node;
}
#endif
--
2.6.4
* [PATCH 20/24] mm, page_alloc: Check multiple page fields with a single branch
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
` (18 preceding siblings ...)
2016-04-12 10:12 ` [PATCH 19/24] mm, page_alloc: Avoid looking up the first zone in a zonelist twice Mel Gorman
@ 2016-04-12 10:12 ` Mel Gorman
2016-04-12 10:12 ` [PATCH 21/24] cpuset: use static key better and convert to new API Mel Gorman
` (3 subsequent siblings)
23 siblings, 0 replies; 25+ messages in thread
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
Every page allocated or freed is checked for sanity to avoid corruptions
that are difficult to detect later. A bad page could be due to a number of
fields. Instead of using multiple branches, this patch combines multiple
fields into a single branch. A detailed check is only necessary if that
check fails.
4.6.0-rc2 4.6.0-rc2
initonce-v1r20 multcheck-v1r20
Min alloc-odr0-1 359.00 ( 0.00%) 348.00 ( 3.06%)
Min alloc-odr0-2 260.00 ( 0.00%) 254.00 ( 2.31%)
Min alloc-odr0-4 214.00 ( 0.00%) 213.00 ( 0.47%)
Min alloc-odr0-8 186.00 ( 0.00%) 186.00 ( 0.00%)
Min alloc-odr0-16 173.00 ( 0.00%) 173.00 ( 0.00%)
Min alloc-odr0-32 165.00 ( 0.00%) 166.00 ( -0.61%)
Min alloc-odr0-64 162.00 ( 0.00%) 162.00 ( 0.00%)
Min alloc-odr0-128 161.00 ( 0.00%) 160.00 ( 0.62%)
Min alloc-odr0-256 170.00 ( 0.00%) 169.00 ( 0.59%)
Min alloc-odr0-512 181.00 ( 0.00%) 180.00 ( 0.55%)
Min alloc-odr0-1024 190.00 ( 0.00%) 188.00 ( 1.05%)
Min alloc-odr0-2048 196.00 ( 0.00%) 194.00 ( 1.02%)
Min alloc-odr0-4096 202.00 ( 0.00%) 199.00 ( 1.49%)
Min alloc-odr0-8192 205.00 ( 0.00%) 202.00 ( 1.46%)
Min alloc-odr0-16384 205.00 ( 0.00%) 203.00 ( 0.98%)
Again, the benefit is marginal but avoiding excessive branches is
important. Ideally the paths would not have to check these conditions at
all but regrettably abandoning the tests would make use-after-free bugs
much harder to detect.
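The trick can be illustrated with a minimal userspace mock (simplified field
types rather than the real struct page layout): OR together the fields that
must all be zero so the common "page is fine" case costs a single branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative mock; field names echo struct page but this is not the
 * kernel layout. A free page is expected to have mapcount -1, a NULL
 * mapping, a zero refcount and none of the checked flags set. */
struct mock_page {
	unsigned long flags;
	long mapcount;		/* expected -1, as with _mapcount */
	void *mapping;		/* expected NULL */
	int refcount;		/* expected 0 */
};

#define CHECK_FLAGS_AT_FREE 0xffUL	/* hypothetical flag mask */

static bool page_expected_state(const struct mock_page *page,
				unsigned long check_flags)
{
	if (page->mapcount != -1)
		return false;

	/* One branch covers mapping, refcount and the flag subset: the OR
	 * is non-zero iff any individual field is unexpected. */
	if ((uintptr_t)page->mapping |
	    (unsigned long)page->refcount |
	    (page->flags & check_flags))
		return false;

	return true;
}
```

If the combined check fails, the caller still walks the fields one by one to
report which of them was bad, exactly as the patch does.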
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 55 +++++++++++++++++++++++++++++++++++++++++++------------
1 file changed, 43 insertions(+), 12 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4019dfe26b11..0100609f6510 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -784,10 +784,42 @@ static inline void __free_one_page(struct page *page,
zone->free_area[order].nr_free++;
}
+/*
+ * A bad page could be due to a number of fields. Instead of multiple branches,
+ * try and check multiple fields with one check. The caller must do a detailed
+ * check if necessary.
+ */
+static inline bool page_expected_state(struct page *page,
+ unsigned long check_flags)
+{
+ if (unlikely(atomic_read(&page->_mapcount) != -1))
+ return false;
+
+ if (unlikely((unsigned long)page->mapping |
+ page_ref_count(page) |
+#ifdef CONFIG_MEMCG
+ (unsigned long)page->mem_cgroup |
+#endif
+ (page->flags & check_flags)))
+ return false;
+
+ return true;
+}
+
static inline int free_pages_check(struct page *page)
{
- const char *bad_reason = NULL;
- unsigned long bad_flags = 0;
+ const char *bad_reason;
+ unsigned long bad_flags;
+
+ if (page_expected_state(page, PAGE_FLAGS_CHECK_AT_FREE)) {
+ page_cpupid_reset_last(page);
+ page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
+ return 0;
+ }
+
+ /* Something has gone sideways, find it */
+ bad_reason = NULL;
+ bad_flags = 0;
if (unlikely(atomic_read(&page->_mapcount) != -1))
bad_reason = "nonzero mapcount";
@@ -803,14 +835,8 @@ static inline int free_pages_check(struct page *page)
if (unlikely(page->mem_cgroup))
bad_reason = "page still charged to cgroup";
#endif
- if (unlikely(bad_reason)) {
- bad_page(page, bad_reason, bad_flags);
- return 1;
- }
- page_cpupid_reset_last(page);
- if (page->flags & PAGE_FLAGS_CHECK_AT_PREP)
- page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
- return 0;
+ bad_page(page, bad_reason, bad_flags);
+ return 1;
}
/*
@@ -1491,9 +1517,14 @@ static inline void expand(struct zone *zone, struct page *page,
*/
static inline int check_new_page(struct page *page)
{
- const char *bad_reason = NULL;
- unsigned long bad_flags = 0;
+ const char *bad_reason;
+ unsigned long bad_flags;
+
+ if (page_expected_state(page, PAGE_FLAGS_CHECK_AT_PREP|__PG_HWPOISON))
+ return 0;
+ bad_reason = NULL;
+ bad_flags = 0;
if (unlikely(atomic_read(&page->_mapcount) != -1))
bad_reason = "nonzero mapcount";
if (unlikely(page->mapping != NULL))
--
2.6.4
* [PATCH 21/24] cpuset: use static key better and convert to new API
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
` (19 preceding siblings ...)
2016-04-12 10:12 ` [PATCH 20/24] mm, page_alloc: Check multiple page fields with a single branch Mel Gorman
@ 2016-04-12 10:12 ` Mel Gorman
2016-04-12 10:12 ` [PATCH 22/24] mm, page_alloc: Check once if a zone has isolated pageblocks Mel Gorman
` (2 subsequent siblings)
23 siblings, 0 replies; 25+ messages in thread
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
From: Vlastimil Babka <vbabka@suse.cz>
An important function for cpusets is cpuset_node_allowed(), which optimizes
for the case where only the single root CPU set exists and any allocation is
trivially allowed. But the check "nr_cpusets() <= 1" doesn't use the
cpusets_enabled_key static key the way static keys are intended, where jump
labels eliminate the branching overhead. This patch converts it so that the
static key is used properly. It's also switched to the new static key API and
the checking functions are converted to return bool instead of int. We also
provide a new variant __cpuset_zone_allowed() which expects that the static
key check was already done and the key was enabled. This is needed for
get_page_from_freelist() where we want to also avoid the relatively slower
check when ALLOC_CPUSET is not set in alloc_flags.
The impact on the page allocator microbenchmark is less than expected but the
cleanup in itself is worthwhile.
4.6.0-rc2 4.6.0-rc2
multcheck-v1r20 cpuset-v1r20
Min alloc-odr0-1 348.00 ( 0.00%) 348.00 ( 0.00%)
Min alloc-odr0-2 254.00 ( 0.00%) 254.00 ( 0.00%)
Min alloc-odr0-4 213.00 ( 0.00%) 213.00 ( 0.00%)
Min alloc-odr0-8 186.00 ( 0.00%) 183.00 ( 1.61%)
Min alloc-odr0-16 173.00 ( 0.00%) 171.00 ( 1.16%)
Min alloc-odr0-32 166.00 ( 0.00%) 163.00 ( 1.81%)
Min alloc-odr0-64 162.00 ( 0.00%) 159.00 ( 1.85%)
Min alloc-odr0-128 160.00 ( 0.00%) 157.00 ( 1.88%)
Min alloc-odr0-256 169.00 ( 0.00%) 166.00 ( 1.78%)
Min alloc-odr0-512 180.00 ( 0.00%) 180.00 ( 0.00%)
Min alloc-odr0-1024 188.00 ( 0.00%) 187.00 ( 0.53%)
Min alloc-odr0-2048 194.00 ( 0.00%) 193.00 ( 0.52%)
Min alloc-odr0-4096 199.00 ( 0.00%) 198.00 ( 0.50%)
Min alloc-odr0-8192 202.00 ( 0.00%) 201.00 ( 0.50%)
Min alloc-odr0-16384 203.00 ( 0.00%) 202.00 ( 0.49%)
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
include/linux/cpuset.h | 42 ++++++++++++++++++++++++++++--------------
kernel/cpuset.c | 14 +++++++-------
mm/page_alloc.c | 2 +-
3 files changed, 36 insertions(+), 22 deletions(-)
diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
index fea160ee5803..054c734d0170 100644
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -16,26 +16,26 @@
#ifdef CONFIG_CPUSETS
-extern struct static_key cpusets_enabled_key;
+extern struct static_key_false cpusets_enabled_key;
static inline bool cpusets_enabled(void)
{
- return static_key_false(&cpusets_enabled_key);
+ return static_branch_unlikely(&cpusets_enabled_key);
}
static inline int nr_cpusets(void)
{
/* jump label reference count + the top-level cpuset */
- return static_key_count(&cpusets_enabled_key) + 1;
+ return static_key_count(&cpusets_enabled_key.key) + 1;
}
static inline void cpuset_inc(void)
{
- static_key_slow_inc(&cpusets_enabled_key);
+ static_branch_inc(&cpusets_enabled_key);
}
static inline void cpuset_dec(void)
{
- static_key_slow_dec(&cpusets_enabled_key);
+ static_branch_dec(&cpusets_enabled_key);
}
extern int cpuset_init(void);
@@ -48,16 +48,25 @@ extern nodemask_t cpuset_mems_allowed(struct task_struct *p);
void cpuset_init_current_mems_allowed(void);
int cpuset_nodemask_valid_mems_allowed(nodemask_t *nodemask);
-extern int __cpuset_node_allowed(int node, gfp_t gfp_mask);
+extern bool __cpuset_node_allowed(int node, gfp_t gfp_mask);
-static inline int cpuset_node_allowed(int node, gfp_t gfp_mask)
+static inline bool cpuset_node_allowed(int node, gfp_t gfp_mask)
{
- return nr_cpusets() <= 1 || __cpuset_node_allowed(node, gfp_mask);
+ if (cpusets_enabled())
+ return __cpuset_node_allowed(node, gfp_mask);
+ return true;
}
-static inline int cpuset_zone_allowed(struct zone *z, gfp_t gfp_mask)
+static inline bool __cpuset_zone_allowed(struct zone *z, gfp_t gfp_mask)
{
- return cpuset_node_allowed(zone_to_nid(z), gfp_mask);
+ return __cpuset_node_allowed(zone_to_nid(z), gfp_mask);
+}
+
+static inline bool cpuset_zone_allowed(struct zone *z, gfp_t gfp_mask)
+{
+ if (cpusets_enabled())
+ return __cpuset_zone_allowed(z, gfp_mask);
+ return true;
}
extern int cpuset_mems_allowed_intersects(const struct task_struct *tsk1,
@@ -174,14 +183,19 @@ static inline int cpuset_nodemask_valid_mems_allowed(nodemask_t *nodemask)
return 1;
}
-static inline int cpuset_node_allowed(int node, gfp_t gfp_mask)
+static inline bool cpuset_node_allowed(int node, gfp_t gfp_mask)
{
- return 1;
+ return true;
}
-static inline int cpuset_zone_allowed(struct zone *z, gfp_t gfp_mask)
+static inline bool __cpuset_zone_allowed(struct zone *z, gfp_t gfp_mask)
{
- return 1;
+ return true;
+}
+
+static inline bool cpuset_zone_allowed(struct zone *z, gfp_t gfp_mask)
+{
+ return true;
}
static inline int cpuset_mems_allowed_intersects(const struct task_struct *tsk1,
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 00ab5c2b7c5b..37a0b44d101f 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -62,7 +62,7 @@
#include <linux/cgroup.h>
#include <linux/wait.h>
-struct static_key cpusets_enabled_key __read_mostly = STATIC_KEY_INIT_FALSE;
+DEFINE_STATIC_KEY_FALSE(cpusets_enabled_key);
/* See "Frequency meter" comments, below. */
@@ -2528,27 +2528,27 @@ static struct cpuset *nearest_hardwall_ancestor(struct cpuset *cs)
* GFP_KERNEL - any node in enclosing hardwalled cpuset ok
* GFP_USER - only nodes in current tasks mems allowed ok.
*/
-int __cpuset_node_allowed(int node, gfp_t gfp_mask)
+bool __cpuset_node_allowed(int node, gfp_t gfp_mask)
{
struct cpuset *cs; /* current cpuset ancestors */
int allowed; /* is allocation in zone z allowed? */
unsigned long flags;
if (in_interrupt())
- return 1;
+ return true;
if (node_isset(node, current->mems_allowed))
- return 1;
+ return true;
/*
* Allow tasks that have access to memory reserves because they have
* been OOM killed to get memory anywhere.
*/
if (unlikely(test_thread_flag(TIF_MEMDIE)))
- return 1;
+ return true;
if (gfp_mask & __GFP_HARDWALL) /* If hardwall request, stop here */
- return 0;
+ return false;
if (current->flags & PF_EXITING) /* Let dying task have memory */
- return 1;
+ return true;
/* Not hardwall and node outside mems_allowed: scan up cpusets */
spin_lock_irqsave(&callback_lock, flags);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0100609f6510..3fd8489b3055 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2747,7 +2747,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
if (cpusets_enabled() &&
(alloc_flags & ALLOC_CPUSET) &&
- !cpuset_zone_allowed(zone, gfp_mask))
+ !__cpuset_zone_allowed(zone, gfp_mask))
continue;
/*
* Distribute pages in proportion to the individual
--
2.6.4
* [PATCH 22/24] mm, page_alloc: Check once if a zone has isolated pageblocks
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
` (20 preceding siblings ...)
2016-04-12 10:12 ` [PATCH 21/24] cpuset: use static key better and convert to new API Mel Gorman
@ 2016-04-12 10:12 ` Mel Gorman
2016-04-12 10:12 ` [PATCH 23/24] mm, page_alloc: Remove unnecessary variable from free_pcppages_bulk Mel Gorman
2016-04-12 10:12 ` [PATCH 24/24] mm, page_alloc: Do not lookup pcp migratetype during bulk free Mel Gorman
23 siblings, 0 replies; 25+ messages in thread
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
When bulk freeing pages from the per-cpu lists, the zone is checked
for isolated pageblocks on every release. This patch checks it once
per drain. Technically this is race-prone, but so is the existing
code.
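The pattern is simply hoisting an (effectively) loop-invariant lookup out of
the per-page loop. A minimal userspace sketch (the counter is a stand-in for
the cost of has_isolate_pageblock(); none of this is kernel code):

```c
#include <assert.h>
#include <stdbool.h>

static int lookups;

/* Stand-in for has_isolate_pageblock(zone). */
static bool has_isolated(void)
{
	lookups++;
	return false;
}

/* Mock of free_pcppages_bulk(): sample the condition once per drain
 * instead of re-evaluating it for every page freed. */
static void drain_pages(int count)
{
	bool isolated = has_isolated();	/* one lookup per drain */

	for (int i = 0; i < count; i++) {
		if (isolated) {
			/* would re-read the pageblock migratetype here */
		}
		/* free one page ... */
	}
}
```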
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3fd8489b3055..854925c99c23 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -857,6 +857,7 @@ static void free_pcppages_bulk(struct zone *zone, int count,
int batch_free = 0;
int to_free = count;
unsigned long nr_scanned;
+ bool isolated_pageblocks = has_isolate_pageblock(zone);
spin_lock(&zone->lock);
nr_scanned = zone_page_state(zone, NR_PAGES_SCANNED);
@@ -896,7 +897,7 @@ static void free_pcppages_bulk(struct zone *zone, int count,
/* MIGRATE_ISOLATE page should not go to pcplists */
VM_BUG_ON_PAGE(is_migrate_isolate(mt), page);
/* Pageblock could have been isolated meanwhile */
- if (unlikely(has_isolate_pageblock(zone)))
+ if (unlikely(isolated_pageblocks))
mt = get_pageblock_migratetype(page);
__free_one_page(page, page_to_pfn(page), zone, 0, mt);
--
2.6.4
* [PATCH 23/24] mm, page_alloc: Remove unnecessary variable from free_pcppages_bulk
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
` (21 preceding siblings ...)
2016-04-12 10:12 ` [PATCH 22/24] mm, page_alloc: Check once if a zone has isolated pageblocks Mel Gorman
@ 2016-04-12 10:12 ` Mel Gorman
2016-04-12 10:12 ` [PATCH 24/24] mm, page_alloc: Do not lookup pcp migratetype during bulk free Mel Gorman
23 siblings, 0 replies; 25+ messages in thread
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
The original value of count is never reused after the loop, so the temporary
to_free variable can be removed and count decremented directly.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 854925c99c23..1b1553c1156c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -855,7 +855,6 @@ static void free_pcppages_bulk(struct zone *zone, int count,
{
int migratetype = 0;
int batch_free = 0;
- int to_free = count;
unsigned long nr_scanned;
bool isolated_pageblocks = has_isolate_pageblock(zone);
@@ -864,7 +863,7 @@ static void free_pcppages_bulk(struct zone *zone, int count,
if (nr_scanned)
__mod_zone_page_state(zone, NR_PAGES_SCANNED, -nr_scanned);
- while (to_free) {
+ while (count) {
struct page *page;
struct list_head *list;
@@ -884,7 +883,7 @@ static void free_pcppages_bulk(struct zone *zone, int count,
/* This is the only non-empty list. Free them all. */
if (batch_free == MIGRATE_PCPTYPES)
- batch_free = to_free;
+ batch_free = count;
do {
int mt; /* migratetype of the to-be-freed page */
@@ -902,7 +901,7 @@ static void free_pcppages_bulk(struct zone *zone, int count,
__free_one_page(page, page_to_pfn(page), zone, 0, mt);
trace_mm_page_pcpu_drain(page, 0, mt);
- } while (--to_free && --batch_free && !list_empty(list));
+ } while (--count && --batch_free && !list_empty(list));
}
spin_unlock(&zone->lock);
}
--
2.6.4
* [PATCH 24/24] mm, page_alloc: Do not lookup pcp migratetype during bulk free
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
` (22 preceding siblings ...)
2016-04-12 10:12 ` [PATCH 23/24] mm, page_alloc: Remove unnecessary variable from free_pcppages_bulk Mel Gorman
@ 2016-04-12 10:12 ` Mel Gorman
23 siblings, 0 replies; 25+ messages in thread
From: Mel Gorman @ 2016-04-12 10:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Linux-MM, LKML, Mel Gorman
During bulk free, the pcp type of the page is known as it was removed
from a specific list. It only needs to be rechecked if an isolated
pageblock exists. This patch removes an unnecessary variable in
the process. The impact is that the round-robin freeing of PCP
lists is distorted when an isolated pageblock is encountered but
that is a rare and harmless corner-case.
The impact of the bulk free patches on the page allocator microbenchmark is visible for higher batch counts, when the bulk free paths are hit.
pagealloc
4.6.0-rc3 4.6.0-rc3
cpuset-v2r2 micro-v2r2
Min free-odr0-1 191.00 ( 0.00%) 195.00 ( -2.09%)
Min free-odr0-2 136.00 ( 0.00%) 136.00 ( 0.00%)
Min free-odr0-4 107.00 ( 0.00%) 107.00 ( 0.00%)
Min free-odr0-8 95.00 ( 0.00%) 95.00 ( 0.00%)
Min free-odr0-16 87.00 ( 0.00%) 87.00 ( 0.00%)
Min free-odr0-32 82.00 ( 0.00%) 82.00 ( 0.00%)
Min free-odr0-64 80.00 ( 0.00%) 80.00 ( 0.00%)
Min free-odr0-128 79.00 ( 0.00%) 79.00 ( 0.00%)
Min free-odr0-256 94.00 ( 0.00%) 97.00 ( -3.19%)
Min free-odr0-512 112.00 ( 0.00%) 109.00 ( 2.68%)
Min free-odr0-1024 118.00 ( 0.00%) 118.00 ( 0.00%)
Min free-odr0-2048 123.00 ( 0.00%) 121.00 ( 1.63%)
Min free-odr0-4096 127.00 ( 0.00%) 125.00 ( 1.57%)
Min free-odr0-8192 129.00 ( 0.00%) 127.00 ( 1.55%)
Min free-odr0-16384 128.00 ( 0.00%) 127.00 ( 0.78%)
The gain is tiny but the patch is trivial.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1b1553c1156c..4d4079309760 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -876,7 +876,7 @@ static void free_pcppages_bulk(struct zone *zone, int count,
*/
do {
batch_free++;
- if (++migratetype == MIGRATE_PCPTYPES)
+ if (++migratetype >= MIGRATE_PCPTYPES)
migratetype = 0;
list = &pcp->lists[migratetype];
} while (list_empty(list));
@@ -886,21 +886,16 @@ static void free_pcppages_bulk(struct zone *zone, int count,
batch_free = count;
do {
- int mt; /* migratetype of the to-be-freed page */
-
page = list_last_entry(list, struct page, lru);
/* must delete as __free_one_page list manipulates */
list_del(&page->lru);
- mt = get_pcppage_migratetype(page);
- /* MIGRATE_ISOLATE page should not go to pcplists */
- VM_BUG_ON_PAGE(is_migrate_isolate(mt), page);
/* Pageblock could have been isolated meanwhile */
if (unlikely(isolated_pageblocks))
- mt = get_pageblock_migratetype(page);
+ migratetype = get_pageblock_migratetype(page);
- __free_one_page(page, page_to_pfn(page), zone, 0, mt);
- trace_mm_page_pcpu_drain(page, 0, mt);
+ __free_one_page(page, page_to_pfn(page), zone, 0, migratetype);
+ trace_mm_page_pcpu_drain(page, 0, migratetype);
} while (--count && --batch_free && !list_empty(list));
}
spin_unlock(&zone->lock);
--
2.6.4
Thread overview: 25+ messages
2016-04-12 10:12 [PATCH 00/24] Optimise page alloc/free fast paths v2 Mel Gorman
2016-04-12 10:12 ` [PATCH 01/24] mm, page_alloc: Only check PageCompound for high-order pages Mel Gorman
2016-04-12 10:12 ` [PATCH 02/24] mm, page_alloc: Use new PageAnonHead helper in the free page fast path Mel Gorman
2016-04-12 10:12 ` [PATCH 03/24] mm, page_alloc: Reduce branches in zone_statistics Mel Gorman
2016-04-12 10:12 ` [PATCH 04/24] mm, page_alloc: Inline zone_statistics Mel Gorman
2016-04-12 10:12 ` [PATCH 05/24] mm, page_alloc: Inline the fast path of the zonelist iterator Mel Gorman
2016-04-12 10:12 ` [PATCH 06/24] mm, page_alloc: Use __dec_zone_state for order-0 page allocation Mel Gorman
2016-04-12 10:12 ` [PATCH 07/24] mm, page_alloc: Avoid unnecessary zone lookups during pageblock operations Mel Gorman
2016-04-12 10:12 ` [PATCH 08/24] mm, page_alloc: Convert alloc_flags to unsigned Mel Gorman
2016-04-12 10:12 ` [PATCH 09/24] mm, page_alloc: Convert nr_fair_skipped to bool Mel Gorman
2016-04-12 10:12 ` [PATCH 10/24] mm, page_alloc: Remove unnecessary local variable in get_page_from_freelist Mel Gorman
2016-04-12 10:12 ` [PATCH 11/24] mm, page_alloc: Remove unnecessary initialisation " Mel Gorman
2016-04-12 10:12 ` [PATCH 12/24] mm, page_alloc: Remove unnecessary initialisation from __alloc_pages_nodemask() Mel Gorman
2016-04-12 10:12 ` [PATCH 13/24] mm, page_alloc: Remove redundant check for empty zonelist Mel Gorman
2016-04-12 10:12 ` [PATCH 14/24] mm, page_alloc: Simplify last cpupid reset Mel Gorman
2016-04-12 10:12 ` [PATCH 15/24] mm, page_alloc: Move might_sleep_if check to the allocator slowpath Mel Gorman
2016-04-12 10:12 ` [PATCH 16/24] mm, page_alloc: Move __GFP_HARDWALL modifications out of the fastpath Mel Gorman
2016-04-12 10:12 ` [PATCH 17/24] mm, page_alloc: Reduce cost of fair zone allocation policy retry Mel Gorman
2016-04-12 10:12 ` [PATCH 18/24] mm, page_alloc: Shortcut watermark checks for order-0 pages Mel Gorman
2016-04-12 10:12 ` [PATCH 19/24] mm, page_alloc: Avoid looking up the first zone in a zonelist twice Mel Gorman
2016-04-12 10:12 ` [PATCH 20/24] mm, page_alloc: Check multiple page fields with a single branch Mel Gorman
2016-04-12 10:12 ` [PATCH 21/24] cpuset: use static key better and convert to new API Mel Gorman
2016-04-12 10:12 ` [PATCH 22/24] mm, page_alloc: Check once if a zone has isolated pageblocks Mel Gorman
2016-04-12 10:12 ` [PATCH 23/24] mm, page_alloc: Remove unnecessary variable from free_pcppages_bulk Mel Gorman
2016-04-12 10:12 ` [PATCH 24/24] mm, page_alloc: Do not lookup pcp migratetype during bulk free Mel Gorman