[RFC PATCH 0/6] proactive kcompactd

* [RFC PATCH 0/6] proactive kcompactd
@ 2017-07-27 16:06 Vlastimil Babka
  2017-07-27 16:06 ` [PATCH 1/6] mm, kswapd: refactor kswapd_try_to_sleep() Vlastimil Babka
                   ` (6 more replies)
  0 siblings, 7 replies; 21+ messages in thread
From: Vlastimil Babka @ 2017-07-27 16:06 UTC (permalink / raw)
  To: linux-mm
  Cc: Joonsoo Kim, Mel Gorman, David Rientjes, Michal Hocko,
	Johannes Weiner, Andrea Arcangeli, Rik van Riel, Vlastimil Babka

As we discussed at last LSF/MM [1], the goal here is to shift more compaction
work to kcompactd, which currently just makes a single high-order page
available and then goes to sleep. The last patch, evolved from the initial RFC
[2] does this by recording for each order > 0 how many allocations would have
potentially be able to skip direct compaction, if the memory wasn't fragmented.
Kcompactd then tries to compact as long as it takes to make that many
allocations satisfiable. This approach avoids any hooks in allocator fast
paths. There are more details to this, see the last patch.

The first 4 patches fix some corner cases where kcompactd wasn't properly woken
up in my basic testing, and could be reviewed and merged immediately if found
OK. Patch 5 terminates compaction (direct or kcompactd) faster when free memory
has been consumed in parallel. IIRC similar thing was already proposed by
Joonsoo.

First I did some basic testing with workload described in patches 2-4, where
memory is fragmented with allocating a large file and then in every other page
a hole is punched. A test doing GFP_NOWAIT allocations with short sleeps then
fails allocation, waking up kcompactd so that the next allocation succeeds,
then another one fails, waking up kcompactd etc. After the series, the number
of consecutive successes gradually grows as kcompactd increases its target,
and then it falls down as all free memory is depleted.

Then I did some more measurements with mmtests stress-highalloc (3 iterations)
configured for allocating order-4 pages with GFP_NOWAIT, to make it rely on
kcompactd completely. The baseline kernel is 4.12.3 plus "mm, page_alloc:
fallback to smallest page when not stealing whole pageblock"

                               4.12.3                4.12.3                4.12.3                4.12.3
                                 base                patch4                patch5                patch6
Success 1 Min         71.00 (  0.00%)       71.00 (  0.00%)       71.00 (  0.00%)       74.00 ( -4.23%)
Success 1 Mean        72.33 (  0.00%)       72.33 (  0.00%)       72.33 (  0.00%)       75.00 ( -3.69%)
Success 1 Max         73.00 (  0.00%)       74.00 ( -1.37%)       74.00 ( -1.37%)       76.00 ( -4.11%)
Success 2 Min         78.00 (  0.00%)       74.00 (  5.13%)       76.00 (  2.56%)       80.00 ( -2.56%)
Success 2 Mean        80.00 (  0.00%)       77.33 (  3.33%)       79.33 (  0.83%)       81.67 ( -2.08%)
Success 2 Max         81.00 (  0.00%)       81.00 (  0.00%)       82.00 ( -1.23%)       84.00 ( -3.70%)
Success 3 Min         88.00 (  0.00%)       88.00 (  0.00%)       91.00 ( -3.41%)       90.00 ( -2.27%)
Success 3 Mean        88.33 (  0.00%)       88.67 ( -0.38%)       91.33 ( -3.40%)       90.67 ( -2.64%)
Success 3 Max         89.00 (  0.00%)       90.00 ( -1.12%)       92.00 ( -3.37%)       91.00 ( -2.25%)

Success rates didn't change much, already quite high for an order-4 GFP_NOWAIT.

                                    4.12.3       4.12.3     4.12.3      4.12.3
                                      base      patch4      patch5      patch6
Kcompactd wakeups                    15705       16312       15335       20234
Compaction stalls                      155         130         135         130
Compaction success                      86          69          79          82
Compaction failures                     69          61          56          48
Page migrate success                925279      945304      954274     1363974
Page migrate failure                 77284       76466       82188       15060
Compaction pages isolated          1947918     1987482     2008541     2748177
Compaction migrate scanned      1501322768  1590902016  1601288469    85829846
Compaction free scanned          509199325   526560956   522041706   104149027
Compaction cost                      11507       12156       12238        2069

Not much happening until patch6, which results in more kcompactd wakeups, but
surprisingly much reduced scanning activity, and improved migrate stats.

Same test, but order-9 (again GFP_NOWAIT)

                               4.12.3                4.12.3                4.12.3                4.12.3
                                 base                patch4                patch5                patch6
Success 1 Min         57.00 (  0.00%)       56.00 (  1.75%)       54.00 (  5.26%)       56.00 (  1.75%)
Success 1 Mean        59.00 (  0.00%)       59.33 ( -0.56%)       56.00 (  5.08%)       58.00 (  1.69%)
Success 1 Max         60.00 (  0.00%)       63.00 ( -5.00%)       58.00 (  3.33%)       60.00 (  0.00%)
Success 2 Min         66.00 (  0.00%)       66.00 (  0.00%)       67.00 ( -1.52%)       65.00 (  1.52%)
Success 2 Mean        66.33 (  0.00%)       67.00 ( -1.01%)       67.00 ( -1.01%)       66.33 (  0.00%)
Success 2 Max         67.00 (  0.00%)       68.00 ( -1.49%)       67.00 (  0.00%)       68.00 ( -1.49%)
Success 3 Min         53.00 (  0.00%)       56.00 ( -5.66%)       51.00 (  3.77%)       57.00 ( -7.55%)
Success 3 Mean        56.00 (  0.00%)       57.00 ( -1.79%)       54.33 (  2.98%)       57.33 ( -2.38%)
Success 3 Max         58.00 (  0.00%)       59.00 ( -1.72%)       58.00 (  0.00%)       58.00 (  0.00%)

                                    4.12.3       4.12.3     4.12.3      4.12.3
                                      base      patch4      patch5      patch6
Kcompactd wakeups                      992        1676        1749        1661
Compaction stalls                      134         139         151          91
Compaction success                      93          83         103          53
Compaction failures                     41          55          48          37
Page migrate success                885733      904325      849397      869434
Page migrate failure                  8261       12819       12299       10288
Compaction pages isolated          1779692     1822833     1713638     1749977
Compaction migrate scanned        95755848    87494396    96276153    18487127
Compaction migrate prescanned            0           0           0           0
Compaction free scanned           33409748    38040646    34997109    15738289
Compaction free direct alloc             0           0           0           0
Compaction free dir. all. miss           0           0           0           0
Compaction cost                       1623        1585        1588        1065

Order-9 allocations are more likely to trigger the corner cases fixed by
patches 2-4 and thus we see increased kcompactd wakeups with patch 4.
Patch 6 again significantly decreases the numbers of pages scanned. It's not
yet clear how. Optimistic explanation would be that creating more free
high-order at once is more efficient than repeatedly creating a single page
always rescanning part of the zone uselessly, but it needs more investigation.
I will also redo the test with gfp flags allowing direct compaction, and see
if the series does shift the direct compaction effort into kcompactd as
expected. Meanwhile I would like some feedback whether this is going into the
right direction or not...

[1] https://lwn.net/Articles/717656/
[2] https://marc.info/?l=linux-mm&m=148898500006034

Vlastimil Babka (6):
  mm, kswapd: refactor kswapd_try_to_sleep()
  mm, kswapd: don't reset kswapd_order prematurely
  mm, kswapd: reset kswapd's order to 0 when it fails to reclaim enough
  mm, kswapd: wake up kcompactd when kswapd had too many failures
  mm, compaction: stop when number of free pages goes below watermark
  mm: make kcompactd more proactive

 include/linux/compaction.h |   6 ++
 include/linux/mmzone.h     |   3 +
 mm/compaction.c            | 226 ++++++++++++++++++++++++++++++++++++++++++++-
 mm/page_alloc.c            |  13 +++
 mm/vmscan.c                | 149 +++++++++++++++++-------------
 5 files changed, 329 insertions(+), 68 deletions(-)

-- 
2.13.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 21+ messages in thread