* [PATCH 0/3] Removal of lumpy reclaim V2
@ 2012-04-11 16:38 ` Mel Gorman
0 siblings, 0 replies; 36+ messages in thread
From: Mel Gorman @ 2012-04-11 16:38 UTC (permalink / raw)
To: Andrew Morton
Cc: Rik van Riel, Konstantin Khlebnikov, Hugh Dickins, Ying Han,
Mel Gorman, Linux-MM, LKML
Andrew, these three patches should replace the two lumpy reclaim patches
you already have. When applied, there is no functional difference (only
slight changes in layout) but the changelogs are better.
Changelog since V1
o Ying pointed out that compaction was waiting on page writeback and the
description of the patches in V1 was broken. This version is the same
except that it is structured differently to explain that waiting on
page writeback is removed.
o Rebased to v3.4-rc2
This series removes lumpy reclaim and some stalling logic that was
unintentionally being used by memory compaction. The end result
is that stalling on dirty pages during page reclaim now depends on
wait_iff_congested().
Four kernels were compared
3.3.0 vanilla
3.4.0-rc2 vanilla
3.4.0-rc2 lumpyremove-v2 is patch one from this series
3.4.0-rc2 nosync-v2r3 is the full series
Removing lumpy reclaim saves almost 900 bytes of text whereas the full
series removes about 1200 bytes of text.
   text    data     bss      dec    hex filename
6740375 1927944 2260992 10929311 a6c49f vmlinux-3.4.0-rc2-vanilla
6739479 1927944 2260992 10928415 a6c11f vmlinux-3.4.0-rc2-lumpyremove-v2
6739159 1927944 2260992 10928095 a6bfdf vmlinux-3.4.0-rc2-nosync-v2
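As a quick check of the table above: size(1) reports section sizes in
bytes, so the text savings are plain differences between the text columns.
A minimal sketch in C (the helper name text_saved is made up for
illustration and is not part of the series):

```c
#include <assert.h>

/* Hypothetical helper: bytes of text removed between two kernel builds. */
long text_saved(long text_before, long text_after)
{
	/* The series only removes code, so the build should not grow. */
	assert(text_before >= text_after);
	return text_before - text_after;
}
```

With the figures above, text_saved(6740375, 6739479) gives 896 bytes for
patch one and text_saved(6740375, 6739159) gives 1216 bytes for the full
series.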
There are behaviour changes in the series and so tests were run with
monitoring of ftrace events. The monitoring itself distorts the performance
results but the new behaviour should be clearer.
fs-mark running in a threaded configuration showed little of interest as
it did not push reclaim aggressively.
FS-Mark Multi Threaded
                          3.3.0-vanilla            rc2-vanilla       lumpyremove-v2r3            nosync-v2r3
Files/s min            3.20 (    0.00%)       3.20 (    0.00%)       3.20 (    0.00%)       3.20 (    0.00%)
Files/s mean           3.20 (    0.00%)       3.20 (    0.00%)       3.20 (    0.00%)       3.20 (    0.00%)
Files/s stddev         0.00 (    0.00%)       0.00 (    0.00%)       0.00 (    0.00%)       0.00 (    0.00%)
Files/s max            3.20 (    0.00%)       3.20 (    0.00%)       3.20 (    0.00%)       3.20 (    0.00%)
Overhead min      508667.00 (    0.00%)  521350.00 (   -2.49%)  544292.00 (   -7.00%)  547168.00 (   -7.57%)
Overhead mean     551185.00 (    0.00%)  652690.73 (  -18.42%)  991208.40 (  -79.83%)  570130.53 (   -3.44%)
Overhead stddev    18200.69 (    0.00%)  331958.29 (-1723.88%) 1579579.43 (-8578.68%)    9576.81 (   47.38%)
Overhead max      576775.00 (    0.00%) 1846634.00 ( -220.17%) 6901055.00 (-1096.49%)  585675.00 (   -1.54%)
MMTests Statistics: duration
Sys Time Running Test (seconds) 309.90 300.95 307.33 298.95
User+Sys Time Running Test (seconds) 319.32 309.67 315.69 307.51
Total Elapsed Time (seconds) 1187.85 1193.09 1191.98 1193.73
MMTests Statistics: vmstat
Page Ins 80532 82212 81420 79480
Page Outs 111434984 111456240 111437376 111582628
Swap Ins 0 0 0 0
Swap Outs 0 0 0 0
Direct pages scanned 44881 27889 27453 34843
Kswapd pages scanned 25841428 25860774 25861233 25843212
Kswapd pages reclaimed 25841393 25860741 25861199 25843179
Direct pages reclaimed 44881 27889 27453 34843
Kswapd efficiency 99% 99% 99% 99%
Kswapd velocity 21754.791 21675.460 21696.029 21649.127
Direct efficiency 100% 100% 100% 100%
Direct velocity 37.783 23.375 23.031 29.188
Percentage direct scans 0% 0% 0% 0%
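The derived rows in these vmstat summaries appear to follow directly from
the raw counters and the elapsed time. The sketch below is inferred from
the numbers in the tables rather than taken from the MMTests source, and
the function names are illustrative only: efficiency is pages reclaimed as
a truncated percentage of pages scanned, velocity is pages scanned per
second of elapsed time, and percentage direct scans is the direct reclaim
share of all scanning.

```c
#include <assert.h>

/*
 * Reclaimed pages as a truncated percentage of scanned pages.
 * Assumption: nothing scanned is reported as 100%, which is what the
 * "Direct efficiency" rows show when no direct scanning took place.
 */
unsigned long long efficiency_pct(unsigned long long reclaimed,
				  unsigned long long scanned)
{
	if (scanned == 0)
		return 100;
	return reclaimed * 100 / scanned;
}

/* Pages scanned per second of total elapsed time. */
double scan_velocity(unsigned long long scanned, double elapsed_secs)
{
	assert(elapsed_secs > 0.0);
	return (double)scanned / elapsed_secs;
}

/* Direct reclaim's share of all scanning, truncated to a whole percent. */
unsigned long long direct_scan_pct(unsigned long long direct_scanned,
				   unsigned long long kswapd_scanned)
{
	unsigned long long total = direct_scanned + kswapd_scanned;

	if (total == 0)
		return 0;
	return direct_scanned * 100 / total;
}
```

For the 3.3.0-vanilla column above, scan_velocity(25841428, 1187.85) comes
to roughly 21754.79, in line with the reported Kswapd velocity, and
efficiency_pct(25841393, 25841428) truncates to 99.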
ftrace showed that there was no stalling on writeback or pages submitted
for IO from reclaim context.
postmark was similar and while it was more interesting, it also did not
push reclaim heavily.
POSTMARK
                                        3.3.0-vanilla        rc2-vanilla   lumpyremove-v2r3        nosync-v2r3
Transactions per second:             16.00 (   0.00%)   20.00 (  25.00%)   18.00 (  12.50%)   17.00 (   6.25%)
Data megabytes read per second:      18.80 (   0.00%)   24.27 (  29.10%)   22.26 (  18.40%)   20.54 (   9.26%)
Data megabytes written per second:   35.83 (   0.00%)   46.25 (  29.08%)   42.42 (  18.39%)   39.14 (   9.24%)
Files created alone per second:      28.00 (   0.00%)   38.00 (  35.71%)   34.00 (  21.43%)   30.00 (   7.14%)
Files create/transact per second:     8.00 (   0.00%)   10.00 (  25.00%)    9.00 (  12.50%)    8.00 (   0.00%)
Files deleted alone per second:     556.00 (   0.00%) 1224.00 ( 120.14%) 3062.00 ( 450.72%) 6124.00 (1001.44%)
Files delete/transact per second:     8.00 (   0.00%)   10.00 (  25.00%)    9.00 (  12.50%)    8.00 (   0.00%)
MMTests Statistics: duration
Sys Time Running Test (seconds) 113.34 107.99 109.73 108.72
User+Sys Time Running Test (seconds) 145.51 139.81 143.32 143.55
Total Elapsed Time (seconds) 1159.16 899.23 980.17 1062.27
MMTests Statistics: vmstat
Page Ins 13710192 13729032 13727944 13760136
Page Outs 43071140 42987228 42733684 42931624
Swap Ins 0 0 0 0
Swap Outs 0 0 0 0
Direct pages scanned 0 0 0 0
Kswapd pages scanned 9941613 9937443 9939085 9929154
Kswapd pages reclaimed 9940926 9936751 9938397 9928465
Direct pages reclaimed 0 0 0 0
Kswapd efficiency 99% 99% 99% 99%
Kswapd velocity 8576.567 11051.058 10140.164 9347.109
Direct efficiency 100% 100% 100% 100%
Direct velocity 0.000 0.000 0.000 0.000
It looks here as if the full series regresses performance but as ftrace
showed no usage of wait_iff_congested() or sync reclaim I am assuming it's
a disruption due to monitoring. Other data such as memory usage, page IO
and swap IO all looked similar.
Running a benchmark with a plain dd showed nothing very interesting. The
full series stalled in wait_iff_congested() slightly less but stall times
on vanilla kernels were marginal.
Running a benchmark that hammered on file-backed mappings showed stalls
due to congestion but not in sync writebacks.
MICRO
3.3.0-vanilla rc2-vanilla lumpyremove-v2r3 nosync-v2r3
MMTests Statistics: duration
Sys Time Running Test (seconds) 308.13 294.50 298.75 299.53
User+Sys Time Running Test (seconds) 330.45 316.28 318.93 320.79
Total Elapsed Time (seconds) 1814.90 1833.88 1821.14 1832.91
MMTests Statistics: vmstat
Page Ins 108712 120708 97224 110344
Page Outs 155514576 156017404 155813676 156193256
Swap Ins 0 0 0 0
Swap Outs 0 0 0 0
Direct pages scanned 2599253 1550480 2512822 2414760
Kswapd pages scanned 69742364 71150694 68839041 69692533
Kswapd pages reclaimed 34824488 34773341 34796602 34799396
Direct pages reclaimed 53693 94750 61792 75205
Kswapd efficiency 49% 48% 50% 49%
Kswapd velocity 38427.662 38797.901 37799.972 38022.889
Direct efficiency 2% 6% 2% 3%
Direct velocity 1432.174 845.464 1379.807 1317.446
Percentage direct scans 3% 2% 3% 3%
Page writes by reclaim 0 0 0 0
Page writes file 0 0 0 0
Page writes anon 0 0 0 0
Page reclaim immediate 0 0 0 1218
Page rescued immediate 0 0 0 0
Slabs scanned 15360 16384 13312 16384
Direct inode steals 0 0 0 0
Kswapd inode steals 4340 4327 1630 4323
FTrace Reclaim Statistics: congestion_wait
Direct number congest waited 0 0 0 0
Direct time congest waited 0ms 0ms 0ms 0ms
Direct full congest waited 0 0 0 0
Direct number conditional waited 900 870 754 789
Direct time conditional waited 0ms 0ms 0ms 20ms
Direct full conditional waited 0 0 0 0
KSwapd number congest waited 2106 2308 2116 1915
KSwapd time congest waited 139924ms 157832ms 125652ms 132516ms
KSwapd full congest waited 1346 1530 1202 1278
KSwapd number conditional waited 12922 16320 10943 14670
KSwapd time conditional waited 0ms 0ms 0ms 0ms
KSwapd full conditional waited 0 0 0 0
Reclaim statistics are not radically changed. The stall times in kswapd
are massive but it is clear that they are due to calls to congestion_wait()
and that is almost certainly the call in balance_pgdat(). Otherwise stalls
due to dirty pages are non-existent.
I ran a benchmark that stressed high-order allocation. This is a very
artificial load but was used in the past to evaluate lumpy reclaim and
compaction. Generally I look at allocation success rates and latency figures.
STRESS-HIGHALLOC
                3.3.0-vanilla      rc2-vanilla lumpyremove-v2r3      nosync-v2r3
Pass 1        81.00 (  0.00%)  28.00 (-53.00%)  24.00 (-57.00%)  28.00 (-53.00%)
Pass 2        82.00 (  0.00%)  39.00 (-43.00%)  38.00 (-44.00%)  43.00 (-39.00%)
while Rested  88.00 (  0.00%)  87.00 ( -1.00%)  88.00 (  0.00%)  88.00 (  0.00%)
MMTests Statistics: duration
Sys Time Running Test (seconds) 740.93 681.42 685.14 684.87
User+Sys Time Running Test (seconds) 2922.65 3269.52 3281.35 3279.44
Total Elapsed Time (seconds) 1161.73 1152.49 1159.55 1161.44
MMTests Statistics: vmstat
Page Ins 4486020 2807256 2855944 2876244
Page Outs 7261600 7973688 7975320 7986120
Swap Ins 31694 0 0 0
Swap Outs 98179 0 0 0
Direct pages scanned 53494 57731 34406 113015
Kswapd pages scanned 6271173 1287481 1278174 1219095
Kswapd pages reclaimed 2029240 1281025 1260708 1201583
Direct pages reclaimed 1468 14564 16649 92456
Kswapd efficiency 32% 99% 98% 98%
Kswapd velocity 5398.133 1117.130 1102.302 1049.641
Direct efficiency 2% 25% 48% 81%
Direct velocity 46.047 50.092 29.672 97.306
Percentage direct scans 0% 4% 2% 8%
Page writes by reclaim 1616049 0 0 0
Page writes file 1517870 0 0 0
Page writes anon 98179 0 0 0
Page reclaim immediate 103778 27339 9796 17831
Page rescued immediate 0 0 0 0
Slabs scanned 1096704 986112 980992 998400
Direct inode steals 223 215040 216736 247881
Kswapd inode steals 175331 61548 68444 63066
Kswapd skipped wait 21991 0 1 0
THP fault alloc 1 135 125 134
THP collapse alloc 393 311 228 236
THP splits 25 13 7 8
THP fault fallback 0 0 0 0
THP collapse fail 3 5 7 7
Compaction stalls 865 1270 1422 1518
Compaction success 370 401 353 383
Compaction failures 495 869 1069 1135
Compaction pages moved 870155 3828868 4036106 4423626
Compaction move failure 26429 23865 29742 27514
Success rates are completely hosed for 3.4-rc2 which is almost certainly
due to [fe2c2a10: vmscan: reclaim at order 0 when compaction is enabled]. I
expected this would happen for kswapd and impair allocation success rates
(https://lkml.org/lkml/2012/1/25/166) but I did not anticipate this much
of a difference: 80% less scanning and 37% less reclaim by kswapd.
In comparison, reclaim/compaction is not aggressive and gives up easily
which is the intended behaviour. hugetlbfs uses __GFP_REPEAT and would be
much more aggressive about reclaim/compaction than THP allocations are. The
stress test above is allocating like neither THP nor hugetlbfs but is much
closer to THP.
Mainline is now impaired in terms of high order allocation under heavy load
although I do not know to what degree as I did not test with __GFP_REPEAT.
Keep this in mind for bugs related to hugepage pool resizing, THP allocation
and high order atomic allocation failures from network devices.
In terms of congestion throttling, I see the following for this test
FTrace Reclaim Statistics: congestion_wait
Direct number congest waited 3 0 0 0
Direct time congest waited 0ms 0ms 0ms 0ms
Direct full congest waited 0 0 0 0
Direct number conditional waited 957 512 1081 1075
Direct time conditional waited 0ms 0ms 0ms 0ms
Direct full conditional waited 0 0 0 0
KSwapd number congest waited 36 4 3 5
KSwapd time congest waited 3148ms 400ms 300ms 500ms
KSwapd full congest waited 30 4 3 5
KSwapd number conditional waited 88514 197 332 542
KSwapd time conditional waited 4980ms 0ms 0ms 0ms
KSwapd full conditional waited 49 0 0 0
The "conditional waited" times are the most interesting as this is directly
impacted by the number of dirty pages encountered during scan. As lumpy
reclaim is no longer scanning contiguous ranges, it is finding fewer dirty
pages. This brings wait times from about 5 seconds to 0. kswapd itself is
still calling congestion_wait() so it'll still stall but it's a lot less.
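To make the distinction between the two kinds of wait concrete, here is a
rough userspace model. It is a sketch under assumptions, not kernel code:
it reflects my understanding that wait_iff_congested() in kernels of this
era only sleeps when at least one backing device is congested and reclaim
has flagged the zone as congested, and otherwise just yields, whereas
congestion_wait() always goes to sleep for up to its timeout. The struct
and function names are invented for illustration.

```c
#include <stdbool.h>

/* Simplified stand-in for the state the real functions consult. */
struct congestion_state {
	int nr_bdi_congested;        /* backing devices currently congested */
	bool zone_reclaim_congested; /* zone flagged congested by reclaim */
};

/* Models the "conditional wait": sleep only if both conditions hold. */
bool would_sleep_conditional(const struct congestion_state *s)
{
	return s->nr_bdi_congested > 0 && s->zone_reclaim_congested;
}

/* Models the unconditional wait: it sleeps regardless of the state. */
bool would_sleep_unconditional(const struct congestion_state *s)
{
	(void)s;
	return true;
}
```

Under this model a zone that reclaim has not flagged as congested never
stalls in the conditional path, which fits the "conditional waited" times
dropping to 0ms once fewer dirty pages are encountered during scanning.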
In terms of the type of IO we were doing, I see this
FTrace Reclaim Statistics: mm_vmscan_writepage
Direct writes anon sync 0 0 0 0
Direct writes anon async 0 0 0 0
Direct writes file sync 0 0 0 0
Direct writes file async 0 0 0 0
Direct writes mixed sync 0 0 0 0
Direct writes mixed async 0 0 0 0
KSwapd writes anon sync 0 0 0 0
KSwapd writes anon async 91682 0 0 0
KSwapd writes file sync 0 0 0 0
KSwapd writes file async 822629 0 0 0
KSwapd writes mixed sync 0 0 0 0
KSwapd writes mixed async 0 0 0 0
In 3.3, kswapd was doing a bunch of async writes of pages but
reclaim/compaction was never reaching a point where it was doing sync
IO. This does not guarantee that reclaim/compaction was not calling
wait_on_page_writeback() but I would consider it unlikely. It indicates
that merging patches 2 and 3 to stop reclaim/compaction calling
wait_on_page_writeback() should be safe.
include/trace/events/vmscan.h | 40 ++-----
mm/vmscan.c | 263 ++++-------------------------------------
2 files changed, 37 insertions(+), 266 deletions(-)
--
1.7.9.2
* [PATCH 1/3] mm: vmscan: Remove lumpy reclaim
2012-04-11 16:38 ` Mel Gorman
@ 2012-04-11 16:38 ` Mel Gorman
-1 siblings, 0 replies; 36+ messages in thread
From: Mel Gorman @ 2012-04-11 16:38 UTC (permalink / raw)
To: Andrew Morton
Cc: Rik van Riel, Konstantin Khlebnikov, Hugh Dickins, Ying Han,
Mel Gorman, Linux-MM, LKML
Lumpy reclaim had a purpose but in the mind of some, it was to kick
the system so hard it thrashed. For others the purpose was to complicate
vmscan.c. Over time it was given softer shoes and a nicer attitude but
memory compaction needs to step up and replace it so this patch sends
lumpy reclaim to the farm.
The tracepoint format changes for isolating LRU pages with this patch
applied. Furthermore reclaim/compaction can no longer queue dirty pages in
pageout() if the underlying BDI is congested. Lumpy reclaim used this logic
and reclaim/compaction was using it in error.
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
include/trace/events/vmscan.h | 26 ++------
mm/vmscan.c | 144 +++++------------------------------------
2 files changed, 19 insertions(+), 151 deletions(-)
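For readers who want to poke at the new mode selection outside the kernel,
the set_reclaim_mode() hunk in the diff that follows can be modelled in
userspace roughly as below. This is a sketch, not the patch itself: the
function name model_reclaim_mode is hypothetical, COMPACTION_BUILD is
turned into a parameter, the bitwise types are flattened to unsigned int,
and PAGE_ALLOC_COSTLY_ORDER (3) and DEF_PRIORITY (12) are restated here as
assumptions about the mainline values at the time.

```c
#include <stdbool.h>

#define RECLAIM_MODE_SINGLE     0x01u
#define RECLAIM_MODE_ASYNC      0x02u
#define RECLAIM_MODE_SYNC       0x04u
#define RECLAIM_MODE_COMPACTION 0x10u

/* Assumed mainline values, copied here so the model stands alone. */
#define PAGE_ALLOC_COSTLY_ORDER 3
#define DEF_PRIORITY            12

/*
 * Userspace model of the post-patch decision: reclaim/compaction is
 * used only for high-order requests that are either costly or made
 * under memory pressure; everything else is order-0 async reclaim.
 */
unsigned int model_reclaim_mode(bool compaction_build, int order,
				int priority, bool sync)
{
	unsigned int syncmode = sync ? RECLAIM_MODE_SYNC : RECLAIM_MODE_ASYNC;

	if (compaction_build && order &&
	    (order > PAGE_ALLOC_COSTLY_ORDER || priority < DEF_PRIORITY - 2))
		return RECLAIM_MODE_COMPACTION | syncmode;

	return RECLAIM_MODE_SINGLE | RECLAIM_MODE_ASYNC;
}
```

Note that with compaction compiled out the model always falls back to
order-0 async reclaim, since there is no longer a lumpy mode to select.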
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index f64560e..1c20a1f 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -263,22 +263,16 @@ DECLARE_EVENT_CLASS(mm_vmscan_lru_isolate_template,
unsigned long nr_requested,
unsigned long nr_scanned,
unsigned long nr_taken,
- unsigned long nr_lumpy_taken,
- unsigned long nr_lumpy_dirty,
- unsigned long nr_lumpy_failed,
isolate_mode_t isolate_mode,
int file),
- TP_ARGS(order, nr_requested, nr_scanned, nr_taken, nr_lumpy_taken, nr_lumpy_dirty, nr_lumpy_failed, isolate_mode, file),
+ TP_ARGS(order, nr_requested, nr_scanned, nr_taken, isolate_mode, file),
TP_STRUCT__entry(
__field(int, order)
__field(unsigned long, nr_requested)
__field(unsigned long, nr_scanned)
__field(unsigned long, nr_taken)
- __field(unsigned long, nr_lumpy_taken)
- __field(unsigned long, nr_lumpy_dirty)
- __field(unsigned long, nr_lumpy_failed)
__field(isolate_mode_t, isolate_mode)
__field(int, file)
),
@@ -288,22 +282,16 @@ DECLARE_EVENT_CLASS(mm_vmscan_lru_isolate_template,
__entry->nr_requested = nr_requested;
__entry->nr_scanned = nr_scanned;
__entry->nr_taken = nr_taken;
- __entry->nr_lumpy_taken = nr_lumpy_taken;
- __entry->nr_lumpy_dirty = nr_lumpy_dirty;
- __entry->nr_lumpy_failed = nr_lumpy_failed;
__entry->isolate_mode = isolate_mode;
__entry->file = file;
),
- TP_printk("isolate_mode=%d order=%d nr_requested=%lu nr_scanned=%lu nr_taken=%lu contig_taken=%lu contig_dirty=%lu contig_failed=%lu file=%d",
+ TP_printk("isolate_mode=%d order=%d nr_requested=%lu nr_scanned=%lu nr_taken=%lu file=%d",
__entry->isolate_mode,
__entry->order,
__entry->nr_requested,
__entry->nr_scanned,
__entry->nr_taken,
- __entry->nr_lumpy_taken,
- __entry->nr_lumpy_dirty,
- __entry->nr_lumpy_failed,
__entry->file)
);
@@ -313,13 +301,10 @@ DEFINE_EVENT(mm_vmscan_lru_isolate_template, mm_vmscan_lru_isolate,
unsigned long nr_requested,
unsigned long nr_scanned,
unsigned long nr_taken,
- unsigned long nr_lumpy_taken,
- unsigned long nr_lumpy_dirty,
- unsigned long nr_lumpy_failed,
isolate_mode_t isolate_mode,
int file),
- TP_ARGS(order, nr_requested, nr_scanned, nr_taken, nr_lumpy_taken, nr_lumpy_dirty, nr_lumpy_failed, isolate_mode, file)
+ TP_ARGS(order, nr_requested, nr_scanned, nr_taken, isolate_mode, file)
);
@@ -329,13 +314,10 @@ DEFINE_EVENT(mm_vmscan_lru_isolate_template, mm_vmscan_memcg_isolate,
unsigned long nr_requested,
unsigned long nr_scanned,
unsigned long nr_taken,
- unsigned long nr_lumpy_taken,
- unsigned long nr_lumpy_dirty,
- unsigned long nr_lumpy_failed,
isolate_mode_t isolate_mode,
int file),
- TP_ARGS(order, nr_requested, nr_scanned, nr_taken, nr_lumpy_taken, nr_lumpy_dirty, nr_lumpy_failed, isolate_mode, file)
+ TP_ARGS(order, nr_requested, nr_scanned, nr_taken, isolate_mode, file)
);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 33c332b..a4b86bd 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -58,9 +58,6 @@
* RECLAIM_MODE_SINGLE: Reclaim only order-0 pages
* RECLAIM_MODE_ASYNC: Do not block
* RECLAIM_MODE_SYNC: Allow blocking e.g. call wait_on_page_writeback
- * RECLAIM_MODE_LUMPYRECLAIM: For high-order allocations, take a reference
- * page from the LRU and reclaim all pages within a
- * naturally aligned range
* RECLAIM_MODE_COMPACTION: For high-order allocations, reclaim a number of
* order-0 pages and then compact the zone
*/
@@ -68,7 +65,6 @@ typedef unsigned __bitwise__ reclaim_mode_t;
#define RECLAIM_MODE_SINGLE ((__force reclaim_mode_t)0x01u)
#define RECLAIM_MODE_ASYNC ((__force reclaim_mode_t)0x02u)
#define RECLAIM_MODE_SYNC ((__force reclaim_mode_t)0x04u)
-#define RECLAIM_MODE_LUMPYRECLAIM ((__force reclaim_mode_t)0x08u)
#define RECLAIM_MODE_COMPACTION ((__force reclaim_mode_t)0x10u)
struct scan_control {
@@ -367,27 +363,17 @@ out:
static void set_reclaim_mode(int priority, struct scan_control *sc,
bool sync)
{
+ /* Sync reclaim used only for compaction */
reclaim_mode_t syncmode = sync ? RECLAIM_MODE_SYNC : RECLAIM_MODE_ASYNC;
/*
- * Initially assume we are entering either lumpy reclaim or
- * reclaim/compaction.Depending on the order, we will either set the
- * sync mode or just reclaim order-0 pages later.
- */
- if (COMPACTION_BUILD)
- sc->reclaim_mode = RECLAIM_MODE_COMPACTION;
- else
- sc->reclaim_mode = RECLAIM_MODE_LUMPYRECLAIM;
-
- /*
- * Avoid using lumpy reclaim or reclaim/compaction if possible by
- * restricting when its set to either costly allocations or when
+ * Restrict reclaim/compaction to costly allocations or when
* under memory pressure
*/
- if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
- sc->reclaim_mode |= syncmode;
- else if (sc->order && priority < DEF_PRIORITY - 2)
- sc->reclaim_mode |= syncmode;
+ if (COMPACTION_BUILD && sc->order &&
+ (sc->order > PAGE_ALLOC_COSTLY_ORDER ||
+ priority < DEF_PRIORITY - 2))
+ sc->reclaim_mode = RECLAIM_MODE_COMPACTION | syncmode;
else
sc->reclaim_mode = RECLAIM_MODE_SINGLE | RECLAIM_MODE_ASYNC;
}
@@ -416,10 +402,6 @@ static int may_write_to_queue(struct backing_dev_info *bdi,
return 1;
if (bdi == current->backing_dev_info)
return 1;
-
- /* lumpy reclaim for hugepage often need a lot of write */
- if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
- return 1;
return 0;
}
@@ -710,10 +692,6 @@ static enum page_references page_check_references(struct page *page,
referenced_ptes = page_referenced(page, 1, mz->mem_cgroup, &vm_flags);
referenced_page = TestClearPageReferenced(page);
- /* Lumpy reclaim - ignore references */
- if (sc->reclaim_mode & RECLAIM_MODE_LUMPYRECLAIM)
- return PAGEREF_RECLAIM;
-
/*
* Mlock lost the isolation race with us. Let try_to_unmap()
* move the page to the unevictable list.
@@ -824,7 +802,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
wait_on_page_writeback(page);
else {
unlock_page(page);
- goto keep_lumpy;
+ goto keep_reclaim_mode;
}
}
@@ -908,7 +886,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
goto activate_locked;
case PAGE_SUCCESS:
if (PageWriteback(page))
- goto keep_lumpy;
+ goto keep_reclaim_mode;
if (PageDirty(page))
goto keep;
@@ -1008,7 +986,7 @@ keep_locked:
unlock_page(page);
keep:
reset_reclaim_mode(sc);
-keep_lumpy:
+keep_reclaim_mode:
list_add(&page->lru, &ret_pages);
VM_BUG_ON(PageLRU(page) || PageUnevictable(page));
}
@@ -1064,11 +1042,7 @@ int __isolate_lru_page(struct page *page, isolate_mode_t mode, int file)
if (!all_lru_mode && !!page_is_file_cache(page) != file)
return ret;
- /*
- * When this function is being called for lumpy reclaim, we
- * initially look into all LRU pages, active, inactive and
- * unevictable; only give shrink_page_list evictable pages.
- */
+ /* Do not give back unevictable pages for compaction */
if (PageUnevictable(page))
return ret;
@@ -1153,9 +1127,6 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
struct lruvec *lruvec;
struct list_head *src;
unsigned long nr_taken = 0;
- unsigned long nr_lumpy_taken = 0;
- unsigned long nr_lumpy_dirty = 0;
- unsigned long nr_lumpy_failed = 0;
unsigned long scan;
int lru = LRU_BASE;
@@ -1168,10 +1139,6 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
for (scan = 0; scan < nr_to_scan && !list_empty(src); scan++) {
struct page *page;
- unsigned long pfn;
- unsigned long end_pfn;
- unsigned long page_pfn;
- int zone_id;
page = lru_to_page(src);
prefetchw_prev_lru_page(page, src, flags);
@@ -1193,84 +1160,6 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
default:
BUG();
}
-
- if (!sc->order || !(sc->reclaim_mode & RECLAIM_MODE_LUMPYRECLAIM))
- continue;
-
- /*
- * Attempt to take all pages in the order aligned region
- * surrounding the tag page. Only take those pages of
- * the same active state as that tag page. We may safely
- * round the target page pfn down to the requested order
- * as the mem_map is guaranteed valid out to MAX_ORDER,
- * where that page is in a different zone we will detect
- * it from its zone id and abort this block scan.
- */
- zone_id = page_zone_id(page);
- page_pfn = page_to_pfn(page);
- pfn = page_pfn & ~((1 << sc->order) - 1);
- end_pfn = pfn + (1 << sc->order);
- for (; pfn < end_pfn; pfn++) {
- struct page *cursor_page;
-
- /* The target page is in the block, ignore it. */
- if (unlikely(pfn == page_pfn))
- continue;
-
- /* Avoid holes within the zone. */
- if (unlikely(!pfn_valid_within(pfn)))
- break;
-
- cursor_page = pfn_to_page(pfn);
-
- /* Check that we have not crossed a zone boundary. */
- if (unlikely(page_zone_id(cursor_page) != zone_id))
- break;
-
- /*
- * If we don't have enough swap space, reclaiming of
- * anon page which don't already have a swap slot is
- * pointless.
- */
- if (nr_swap_pages <= 0 && PageSwapBacked(cursor_page) &&
- !PageSwapCache(cursor_page))
- break;
-
- if (__isolate_lru_page(cursor_page, mode, file) == 0) {
- unsigned int isolated_pages;
-
- mem_cgroup_lru_del(cursor_page);
- list_move(&cursor_page->lru, dst);
- isolated_pages = hpage_nr_pages(cursor_page);
- nr_taken += isolated_pages;
- nr_lumpy_taken += isolated_pages;
- if (PageDirty(cursor_page))
- nr_lumpy_dirty += isolated_pages;
- scan++;
- pfn += isolated_pages - 1;
- } else {
- /*
- * Check if the page is freed already.
- *
- * We can't use page_count() as that
- * requires compound_head and we don't
- * have a pin on the page here. If a
- * page is tail, we may or may not
- * have isolated the head, so assume
- * it's not free, it'd be tricky to
- * track the head status without a
- * page pin.
- */
- if (!PageTail(cursor_page) &&
- !atomic_read(&cursor_page->_count))
- continue;
- break;
- }
- }
-
- /* If we break out of the loop above, lumpy reclaim failed */
- if (pfn < end_pfn)
- nr_lumpy_failed++;
}
*nr_scanned = scan;
@@ -1278,7 +1167,6 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
trace_mm_vmscan_lru_isolate(sc->order,
nr_to_scan, scan,
nr_taken,
- nr_lumpy_taken, nr_lumpy_dirty, nr_lumpy_failed,
mode, file);
return nr_taken;
}
@@ -1466,13 +1354,13 @@ static inline bool should_reclaim_stall(unsigned long nr_taken,
int priority,
struct scan_control *sc)
{
- int lumpy_stall_priority;
+ int stall_priority;
/* kswapd should not stall on sync IO */
if (current_is_kswapd())
return false;
- /* Only stall on lumpy reclaim */
+ /* Only stall for memory compaction */
if (sc->reclaim_mode & RECLAIM_MODE_SINGLE)
return false;
@@ -1487,11 +1375,11 @@ static inline bool should_reclaim_stall(unsigned long nr_taken,
* priority to be much higher before stalling.
*/
if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
- lumpy_stall_priority = DEF_PRIORITY;
+ stall_priority = DEF_PRIORITY;
else
- lumpy_stall_priority = DEF_PRIORITY / 3;
+ stall_priority = DEF_PRIORITY / 3;
- return priority <= lumpy_stall_priority;
+ return priority <= stall_priority;
}
/*
@@ -1523,8 +1411,6 @@ shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
}
set_reclaim_mode(priority, sc, false);
- if (sc->reclaim_mode & RECLAIM_MODE_LUMPYRECLAIM)
- isolate_mode |= ISOLATE_ACTIVE;
lru_add_drain();
--
1.7.9.2
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 2/3] mm: vmscan: Do not stall on writeback during memory compaction
2012-04-11 16:38 ` Mel Gorman
@ 2012-04-11 16:38 ` Mel Gorman
-1 siblings, 0 replies; 36+ messages in thread
From: Mel Gorman @ 2012-04-11 16:38 UTC (permalink / raw)
To: Andrew Morton
Cc: Rik van Riel, Konstantin Khlebnikov, Hugh Dickins, Ying Han,
Mel Gorman, Linux-MM, LKML
This patch stops reclaim/compaction from entering sync reclaim. Sync
reclaim was only intended for lumpy reclaim and its use by
reclaim/compaction was an oversight. Page migration has its own logic for
stalling on writeback pages if necessary and memory compaction is already
using it.
Waiting on page writeback is bad for a number of reasons but the primary
one is that waiting on writeback to a slow device like USB can take a
considerable length of time. Page reclaim instead uses wait_iff_congested()
to throttle if too many dirty pages are being scanned.
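As a rough illustration of the behaviour change, the handling of a page
found under writeback in shrink_page_list() can be modelled in userspace
as below. This is only a sketch: enum wb_action and the
writeback_action_*() helpers are hypothetical names made up for the
example, and the real code also unlocks the page and updates counters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes for a page found under writeback; not kernel symbols. */
enum wb_action { WB_WAIT, WB_KEEP };

/* Before: sync reclaim that may enter the filesystem waits for the IO. */
static enum wb_action writeback_action_old(bool sync_mode, bool may_enter_fs)
{
	if (sync_mode && may_enter_fs)
		return WB_WAIT;	/* stands in for wait_on_page_writeback() */
	return WB_KEEP;		/* unlock the page and leave it on the LRU */
}

/* After: a page under writeback is always skipped, never waited on. */
static enum wb_action writeback_action_new(void)
{
	return WB_KEEP;
}

/* Usage: only the old sync path could block on a slow device. */
static void writeback_action_check(void)
{
	assert(writeback_action_old(true, true) == WB_WAIT);
	assert(writeback_action_old(true, false) == WB_KEEP);
	assert(writeback_action_old(false, true) == WB_KEEP);
	assert(writeback_action_new() == WB_KEEP);
}
```

With the patch applied only the second helper's behaviour remains, so any
throttling has to come from wait_iff_congested() rather than from waiting
on individual pages.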
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
include/trace/events/vmscan.h | 10 ++---
mm/vmscan.c | 85 ++++-------------------------------------
2 files changed, 13 insertions(+), 82 deletions(-)
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 1c20a1f..044e8ba 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -13,7 +13,7 @@
#define RECLAIM_WB_ANON 0x0001u
#define RECLAIM_WB_FILE 0x0002u
#define RECLAIM_WB_MIXED 0x0010u
-#define RECLAIM_WB_SYNC 0x0004u
+#define RECLAIM_WB_SYNC 0x0004u /* Unused, all reclaim async */
#define RECLAIM_WB_ASYNC 0x0008u
#define show_reclaim_flags(flags) \
@@ -27,13 +27,13 @@
#define trace_reclaim_flags(page, sync) ( \
(page_is_file_cache(page) ? RECLAIM_WB_FILE : RECLAIM_WB_ANON) | \
- (sync & RECLAIM_MODE_SYNC ? RECLAIM_WB_SYNC : RECLAIM_WB_ASYNC) \
+ (RECLAIM_WB_ASYNC) \
)
#define trace_shrink_flags(file, sync) ( \
- (sync & RECLAIM_MODE_SYNC ? RECLAIM_WB_MIXED : \
- (file ? RECLAIM_WB_FILE : RECLAIM_WB_ANON)) | \
- (sync & RECLAIM_MODE_SYNC ? RECLAIM_WB_SYNC : RECLAIM_WB_ASYNC) \
+ ( \
+ (file ? RECLAIM_WB_FILE : RECLAIM_WB_ANON) | \
+ (RECLAIM_WB_ASYNC) \
)
TRACE_EVENT(mm_vmscan_kswapd_sleep,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index a4b86bd..68319e4 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -56,15 +56,11 @@
/*
* reclaim_mode determines how the inactive list is shrunk
* RECLAIM_MODE_SINGLE: Reclaim only order-0 pages
- * RECLAIM_MODE_ASYNC: Do not block
- * RECLAIM_MODE_SYNC: Allow blocking e.g. call wait_on_page_writeback
* RECLAIM_MODE_COMPACTION: For high-order allocations, reclaim a number of
* order-0 pages and then compact the zone
*/
typedef unsigned __bitwise__ reclaim_mode_t;
#define RECLAIM_MODE_SINGLE ((__force reclaim_mode_t)0x01u)
-#define RECLAIM_MODE_ASYNC ((__force reclaim_mode_t)0x02u)
-#define RECLAIM_MODE_SYNC ((__force reclaim_mode_t)0x04u)
#define RECLAIM_MODE_COMPACTION ((__force reclaim_mode_t)0x10u)
struct scan_control {
@@ -360,12 +356,8 @@ out:
return ret;
}
-static void set_reclaim_mode(int priority, struct scan_control *sc,
- bool sync)
+static void set_reclaim_mode(int priority, struct scan_control *sc)
{
- /* Sync reclaim used only for compaction */
- reclaim_mode_t syncmode = sync ? RECLAIM_MODE_SYNC : RECLAIM_MODE_ASYNC;
-
/*
* Restrict reclaim/compaction to costly allocations or when
* under memory pressure
@@ -373,14 +365,14 @@ static void set_reclaim_mode(int priority, struct scan_control *sc,
if (COMPACTION_BUILD && sc->order &&
(sc->order > PAGE_ALLOC_COSTLY_ORDER ||
priority < DEF_PRIORITY - 2))
- sc->reclaim_mode = RECLAIM_MODE_COMPACTION | syncmode;
+ sc->reclaim_mode = RECLAIM_MODE_COMPACTION;
else
- sc->reclaim_mode = RECLAIM_MODE_SINGLE | RECLAIM_MODE_ASYNC;
+ sc->reclaim_mode = RECLAIM_MODE_SINGLE;
}
static void reset_reclaim_mode(struct scan_control *sc)
{
- sc->reclaim_mode = RECLAIM_MODE_SINGLE | RECLAIM_MODE_ASYNC;
+ sc->reclaim_mode = RECLAIM_MODE_SINGLE;
}
static inline int is_page_cache_freeable(struct page *page)
@@ -791,19 +783,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
if (PageWriteback(page)) {
nr_writeback++;
- /*
- * Synchronous reclaim cannot queue pages for
- * writeback due to the possibility of stack overflow
- * but if it encounters a page under writeback, wait
- * for the IO to complete.
- */
- if ((sc->reclaim_mode & RECLAIM_MODE_SYNC) &&
- may_enter_fs)
- wait_on_page_writeback(page);
- else {
- unlock_page(page);
- goto keep_reclaim_mode;
- }
+ unlock_page(page);
+ goto keep;
}
references = page_check_references(page, mz, sc);
@@ -886,7 +867,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
goto activate_locked;
case PAGE_SUCCESS:
if (PageWriteback(page))
- goto keep_reclaim_mode;
+ goto keep;
if (PageDirty(page))
goto keep;
@@ -985,8 +966,6 @@ activate_locked:
keep_locked:
unlock_page(page);
keep:
- reset_reclaim_mode(sc);
-keep_reclaim_mode:
list_add(&page->lru, &ret_pages);
VM_BUG_ON(PageLRU(page) || PageUnevictable(page));
}
@@ -1342,47 +1321,6 @@ update_isolated_counts(struct mem_cgroup_zone *mz,
}
/*
- * Returns true if a direct reclaim should wait on pages under writeback.
- *
- * If we are direct reclaiming for contiguous pages and we do not reclaim
- * everything in the list, try again and wait for writeback IO to complete.
- * This will stall high-order allocations noticeably. Only do that when really
- * need to free the pages under high memory pressure.
- */
-static inline bool should_reclaim_stall(unsigned long nr_taken,
- unsigned long nr_freed,
- int priority,
- struct scan_control *sc)
-{
- int stall_priority;
-
- /* kswapd should not stall on sync IO */
- if (current_is_kswapd())
- return false;
-
- /* Only stall for memory compaction */
- if (sc->reclaim_mode & RECLAIM_MODE_SINGLE)
- return false;
-
- /* If we have reclaimed everything on the isolated list, no stall */
- if (nr_freed == nr_taken)
- return false;
-
- /*
- * For high-order allocations, there are two stall thresholds.
- * High-cost allocations stall immediately where as lower
- * order allocations such as stacks require the scanning
- * priority to be much higher before stalling.
- */
- if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
- stall_priority = DEF_PRIORITY;
- else
- stall_priority = DEF_PRIORITY / 3;
-
- return priority <= stall_priority;
-}
-
-/*
* shrink_inactive_list() is a helper for shrink_zone(). It returns the number
* of reclaimed pages
*/
@@ -1410,7 +1348,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
return SWAP_CLUSTER_MAX;
}
- set_reclaim_mode(priority, sc, false);
+ set_reclaim_mode(priority, sc);
lru_add_drain();
@@ -1442,13 +1380,6 @@ shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
nr_reclaimed = shrink_page_list(&page_list, mz, sc, priority,
&nr_dirty, &nr_writeback);
- /* Check if we should syncronously wait for writeback */
- if (should_reclaim_stall(nr_taken, nr_reclaimed, priority, sc)) {
- set_reclaim_mode(priority, sc, true);
- nr_reclaimed += shrink_page_list(&page_list, mz, sc,
- priority, &nr_dirty, &nr_writeback);
- }
-
spin_lock_irq(&zone->lru_lock);
reclaim_stat->recent_scanned[0] += nr_anon;
--
1.7.9.2
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 2/3] mm: vmscan: Do not stall on writeback during memory compaction
@ 2012-04-11 16:38 ` Mel Gorman
0 siblings, 0 replies; 36+ messages in thread
From: Mel Gorman @ 2012-04-11 16:38 UTC (permalink / raw)
To: Andrew Morton
Cc: Rik van Riel, Konstantin Khlebnikov, Hugh Dickins, Ying Han,
Mel Gorman, Linux-MM, LKML
This patch stops reclaim/compaction entering sync reclaim as this was only
intended for lumpy reclaim and an oversight. Page migration has its own
logic for stalling on writeback pages if necessary and memory compaction
is already using it.
Waiting on page writeback is bad for a number of reasons but the primary
one is that waiting on writeback to a slow device like USB can take a
considerable length of time. Page reclaim instead uses wait_iff_congested()
to throttle if too many dirty pages are being scanned.
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
include/trace/events/vmscan.h | 10 ++---
mm/vmscan.c | 85 ++++-------------------------------------
2 files changed, 13 insertions(+), 82 deletions(-)
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 1c20a1f..044e8ba 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -13,7 +13,7 @@
#define RECLAIM_WB_ANON 0x0001u
#define RECLAIM_WB_FILE 0x0002u
#define RECLAIM_WB_MIXED 0x0010u
-#define RECLAIM_WB_SYNC 0x0004u
+#define RECLAIM_WB_SYNC 0x0004u /* Unused, all reclaim async */
#define RECLAIM_WB_ASYNC 0x0008u
#define show_reclaim_flags(flags) \
@@ -27,13 +27,13 @@
#define trace_reclaim_flags(page, sync) ( \
(page_is_file_cache(page) ? RECLAIM_WB_FILE : RECLAIM_WB_ANON) | \
- (sync & RECLAIM_MODE_SYNC ? RECLAIM_WB_SYNC : RECLAIM_WB_ASYNC) \
+ (RECLAIM_WB_ASYNC) \
)
#define trace_shrink_flags(file, sync) ( \
- (sync & RECLAIM_MODE_SYNC ? RECLAIM_WB_MIXED : \
- (file ? RECLAIM_WB_FILE : RECLAIM_WB_ANON)) | \
- (sync & RECLAIM_MODE_SYNC ? RECLAIM_WB_SYNC : RECLAIM_WB_ASYNC) \
+ ( \
+ (file ? RECLAIM_WB_FILE : RECLAIM_WB_ANON) | \
+ (RECLAIM_WB_ASYNC) \
)
TRACE_EVENT(mm_vmscan_kswapd_sleep,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index a4b86bd..68319e4 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -56,15 +56,11 @@
/*
* reclaim_mode determines how the inactive list is shrunk
* RECLAIM_MODE_SINGLE: Reclaim only order-0 pages
- * RECLAIM_MODE_ASYNC: Do not block
- * RECLAIM_MODE_SYNC: Allow blocking e.g. call wait_on_page_writeback
* RECLAIM_MODE_COMPACTION: For high-order allocations, reclaim a number of
* order-0 pages and then compact the zone
*/
typedef unsigned __bitwise__ reclaim_mode_t;
#define RECLAIM_MODE_SINGLE ((__force reclaim_mode_t)0x01u)
-#define RECLAIM_MODE_ASYNC ((__force reclaim_mode_t)0x02u)
-#define RECLAIM_MODE_SYNC ((__force reclaim_mode_t)0x04u)
#define RECLAIM_MODE_COMPACTION ((__force reclaim_mode_t)0x10u)
struct scan_control {
@@ -360,12 +356,8 @@ out:
return ret;
}
-static void set_reclaim_mode(int priority, struct scan_control *sc,
- bool sync)
+static void set_reclaim_mode(int priority, struct scan_control *sc)
{
- /* Sync reclaim used only for compaction */
- reclaim_mode_t syncmode = sync ? RECLAIM_MODE_SYNC : RECLAIM_MODE_ASYNC;
-
/*
* Restrict reclaim/compaction to costly allocations or when
* under memory pressure
@@ -373,14 +365,14 @@ static void set_reclaim_mode(int priority, struct scan_control *sc,
if (COMPACTION_BUILD && sc->order &&
(sc->order > PAGE_ALLOC_COSTLY_ORDER ||
priority < DEF_PRIORITY - 2))
- sc->reclaim_mode = RECLAIM_MODE_COMPACTION | syncmode;
+ sc->reclaim_mode = RECLAIM_MODE_COMPACTION;
else
- sc->reclaim_mode = RECLAIM_MODE_SINGLE | RECLAIM_MODE_ASYNC;
+ sc->reclaim_mode = RECLAIM_MODE_SINGLE;
}
static void reset_reclaim_mode(struct scan_control *sc)
{
- sc->reclaim_mode = RECLAIM_MODE_SINGLE | RECLAIM_MODE_ASYNC;
+ sc->reclaim_mode = RECLAIM_MODE_SINGLE;
}
static inline int is_page_cache_freeable(struct page *page)
@@ -791,19 +783,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
if (PageWriteback(page)) {
nr_writeback++;
- /*
- * Synchronous reclaim cannot queue pages for
- * writeback due to the possibility of stack overflow
- * but if it encounters a page under writeback, wait
- * for the IO to complete.
- */
- if ((sc->reclaim_mode & RECLAIM_MODE_SYNC) &&
- may_enter_fs)
- wait_on_page_writeback(page);
- else {
- unlock_page(page);
- goto keep_reclaim_mode;
- }
+ unlock_page(page);
+ goto keep;
}
references = page_check_references(page, mz, sc);
@@ -886,7 +867,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
goto activate_locked;
case PAGE_SUCCESS:
if (PageWriteback(page))
- goto keep_reclaim_mode;
+ goto keep;
if (PageDirty(page))
goto keep;
@@ -985,8 +966,6 @@ activate_locked:
keep_locked:
unlock_page(page);
keep:
- reset_reclaim_mode(sc);
-keep_reclaim_mode:
list_add(&page->lru, &ret_pages);
VM_BUG_ON(PageLRU(page) || PageUnevictable(page));
}
@@ -1342,47 +1321,6 @@ update_isolated_counts(struct mem_cgroup_zone *mz,
}
/*
- * Returns true if a direct reclaim should wait on pages under writeback.
- *
- * If we are direct reclaiming for contiguous pages and we do not reclaim
- * everything in the list, try again and wait for writeback IO to complete.
- * This will stall high-order allocations noticeably. Only do that when really
- * need to free the pages under high memory pressure.
- */
-static inline bool should_reclaim_stall(unsigned long nr_taken,
- unsigned long nr_freed,
- int priority,
- struct scan_control *sc)
-{
- int stall_priority;
-
- /* kswapd should not stall on sync IO */
- if (current_is_kswapd())
- return false;
-
- /* Only stall for memory compaction */
- if (sc->reclaim_mode & RECLAIM_MODE_SINGLE)
- return false;
-
- /* If we have reclaimed everything on the isolated list, no stall */
- if (nr_freed == nr_taken)
- return false;
-
- /*
- * For high-order allocations, there are two stall thresholds.
- * High-cost allocations stall immediately where as lower
- * order allocations such as stacks require the scanning
- * priority to be much higher before stalling.
- */
- if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
- stall_priority = DEF_PRIORITY;
- else
- stall_priority = DEF_PRIORITY / 3;
-
- return priority <= stall_priority;
-}
-
-/*
* shrink_inactive_list() is a helper for shrink_zone(). It returns the number
* of reclaimed pages
*/
@@ -1410,7 +1348,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
return SWAP_CLUSTER_MAX;
}
- set_reclaim_mode(priority, sc, false);
+ set_reclaim_mode(priority, sc);
lru_add_drain();
@@ -1442,13 +1380,6 @@ shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
nr_reclaimed = shrink_page_list(&page_list, mz, sc, priority,
&nr_dirty, &nr_writeback);
- /* Check if we should syncronously wait for writeback */
- if (should_reclaim_stall(nr_taken, nr_reclaimed, priority, sc)) {
- set_reclaim_mode(priority, sc, true);
- nr_reclaimed += shrink_page_list(&page_list, mz, sc,
- priority, &nr_dirty, &nr_writeback);
- }
-
spin_lock_irq(&zone->lru_lock);
reclaim_stat->recent_scanned[0] += nr_anon;
--
1.7.9.2
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
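For readers following the removal above, the two stall thresholds in
should_reclaim_stall() can be modelled in userspace as below. This is only a
sketch: stall_priority_reached() is a hypothetical name, the DEF_PRIORITY and
PAGE_ALLOC_COSTLY_ORDER values are restated from mainline of that era as
assumptions, and the kswapd, RECLAIM_MODE_SINGLE and nr_freed == nr_taken
early exits are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, restated from mainline around v3.4. */
#define DEF_PRIORITY 12
#define PAGE_ALLOC_COSTLY_ORDER 3

/*
 * Userspace model of the priority test at the end of the removed
 * should_reclaim_stall(). The function name is hypothetical and the
 * early-exit checks of the original are deliberately omitted.
 */
static bool stall_priority_reached(int order, int priority)
{
	int stall_priority;

	assert(priority >= 0 && priority <= DEF_PRIORITY);

	if (order > PAGE_ALLOC_COSTLY_ORDER)
		stall_priority = DEF_PRIORITY;		/* stall immediately */
	else
		stall_priority = DEF_PRIORITY / 3;	/* only under heavy scanning */

	/* Lower numeric priority means reclaim is scanning harder. */
	return priority <= stall_priority;
}
```

With these values an order-9 request would stall on the first pass
(priority 12), while an order-1 request such as a stack would only stall once
priority had dropped to 4 or below.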
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 3/3] mm: vmscan: Remove reclaim_mode_t
2012-04-11 16:38 ` Mel Gorman
@ 2012-04-11 16:38 ` Mel Gorman
-1 siblings, 0 replies; 36+ messages in thread
From: Mel Gorman @ 2012-04-11 16:38 UTC (permalink / raw)
To: Andrew Morton
Cc: Rik van Riel, Konstantin Khlebnikov, Hugh Dickins, Ying Han,
Mel Gorman, Linux-MM, LKML
There is little motivation for reclaim_mode_t once RECLAIM_MODE_[A]SYNC
and lumpy reclaim have been removed. This patch gets rid of reclaim_mode_t
as well and improves the documentation about what reclaim/compaction is
and when it is triggered.
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
include/trace/events/vmscan.h | 4 +--
mm/vmscan.c | 72 +++++++++++++----------------------------
2 files changed, 24 insertions(+), 52 deletions(-)
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 044e8ba..0794aa2 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -25,12 +25,12 @@
{RECLAIM_WB_ASYNC, "RECLAIM_WB_ASYNC"} \
) : "RECLAIM_WB_NONE"
-#define trace_reclaim_flags(page, sync) ( \
+#define trace_reclaim_flags(page) ( \
(page_is_file_cache(page) ? RECLAIM_WB_FILE : RECLAIM_WB_ANON) | \
(RECLAIM_WB_ASYNC) \
)
-#define trace_shrink_flags(file, sync) ( \
+#define trace_shrink_flags(file) \
( \
(file ? RECLAIM_WB_FILE : RECLAIM_WB_ANON) | \
(RECLAIM_WB_ASYNC) \
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 68319e4..36c6ad2 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -53,16 +53,6 @@
#define CREATE_TRACE_POINTS
#include <trace/events/vmscan.h>
-/*
- * reclaim_mode determines how the inactive list is shrunk
- * RECLAIM_MODE_SINGLE: Reclaim only order-0 pages
- * RECLAIM_MODE_COMPACTION: For high-order allocations, reclaim a number of
- * order-0 pages and then compact the zone
- */
-typedef unsigned __bitwise__ reclaim_mode_t;
-#define RECLAIM_MODE_SINGLE ((__force reclaim_mode_t)0x01u)
-#define RECLAIM_MODE_COMPACTION ((__force reclaim_mode_t)0x10u)
-
struct scan_control {
/* Incremented by the number of inactive pages that were scanned */
unsigned long nr_scanned;
@@ -89,12 +79,6 @@ struct scan_control {
int order;
/*
- * Intend to reclaim enough continuous memory rather than reclaim
- * enough amount of memory. i.e, mode for high order allocation.
- */
- reclaim_mode_t reclaim_mode;
-
- /*
* The memory cgroup that hit its limit and as a result is the
* primary target of this reclaim invocation.
*/
@@ -356,25 +340,6 @@ out:
return ret;
}
-static void set_reclaim_mode(int priority, struct scan_control *sc)
-{
- /*
- * Restrict reclaim/compaction to costly allocations or when
- * under memory pressure
- */
- if (COMPACTION_BUILD && sc->order &&
- (sc->order > PAGE_ALLOC_COSTLY_ORDER ||
- priority < DEF_PRIORITY - 2))
- sc->reclaim_mode = RECLAIM_MODE_COMPACTION;
- else
- sc->reclaim_mode = RECLAIM_MODE_SINGLE;
-}
-
-static void reset_reclaim_mode(struct scan_control *sc)
-{
- sc->reclaim_mode = RECLAIM_MODE_SINGLE;
-}
-
static inline int is_page_cache_freeable(struct page *page)
{
/*
@@ -497,8 +462,7 @@ static pageout_t pageout(struct page *page, struct address_space *mapping,
/* synchronous write or broken a_ops? */
ClearPageReclaim(page);
}
- trace_mm_vmscan_writepage(page,
- trace_reclaim_flags(page, sc->reclaim_mode));
+ trace_mm_vmscan_writepage(page, trace_reclaim_flags(page));
inc_zone_page_state(page, NR_VMSCAN_WRITE);
return PAGE_SUCCESS;
}
@@ -953,7 +917,6 @@ cull_mlocked:
try_to_free_swap(page);
unlock_page(page);
putback_lru_page(page);
- reset_reclaim_mode(sc);
continue;
activate_locked:
@@ -1348,8 +1311,6 @@ shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
return SWAP_CLUSTER_MAX;
}
- set_reclaim_mode(priority, sc);
-
lru_add_drain();
if (!sc->may_unmap)
@@ -1428,7 +1389,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
zone_idx(zone),
nr_scanned, nr_reclaimed,
priority,
- trace_shrink_flags(file, sc->reclaim_mode));
+ trace_shrink_flags(file));
return nr_reclaimed;
}
@@ -1507,8 +1468,6 @@ static void shrink_active_list(unsigned long nr_to_scan,
lru_add_drain();
- reset_reclaim_mode(sc);
-
if (!sc->may_unmap)
isolate_mode |= ISOLATE_UNMAPPED;
if (!sc->may_writepage)
@@ -1821,23 +1780,35 @@ out:
}
}
+/* Use reclaim/compaction for costly allocs or under memory pressure */
+static bool in_reclaim_compaction(int priority, struct scan_control *sc)
+{
+ if (COMPACTION_BUILD && sc->order &&
+ (sc->order > PAGE_ALLOC_COSTLY_ORDER ||
+ priority < DEF_PRIORITY - 2))
+ return true;
+
+ return false;
+}
+
/*
- * Reclaim/compaction depends on a number of pages being freed. To avoid
- * disruption to the system, a small number of order-0 pages continue to be
- * rotated and reclaimed in the normal fashion. However, by the time we get
- * back to the allocator and call try_to_compact_zone(), we ensure that
- * there are enough free pages for it to be likely successful
+ * Reclaim/compaction is used for high-order allocation requests. It reclaims
+ * order-0 pages before compacting the zone. should_continue_reclaim() returns
+ * true if more pages should be reclaimed such that when the page allocator
+ * calls try_to_compact_zone() that it will have enough free pages to succeed.
+ * It will give up earlier than that if there is difficulty reclaiming pages.
*/
static inline bool should_continue_reclaim(struct mem_cgroup_zone *mz,
unsigned long nr_reclaimed,
unsigned long nr_scanned,
+ int priority,
struct scan_control *sc)
{
unsigned long pages_for_compaction;
unsigned long inactive_lru_pages;
/* If not in reclaim/compaction mode, stop */
- if (!(sc->reclaim_mode & RECLAIM_MODE_COMPACTION))
+ if (!in_reclaim_compaction(priority, sc))
return false;
/* Consider stopping depending on scan and reclaim activity */
@@ -1944,7 +1915,8 @@ restart:
/* reclaim/compaction might need reclaim to continue */
if (should_continue_reclaim(mz, nr_reclaimed,
- sc->nr_scanned - nr_scanned, sc))
+ sc->nr_scanned - nr_scanned,
+ priority, sc))
goto restart;
throttle_vm_writeout(sc->gfp_mask);
--
1.7.9.2
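The condition that in_reclaim_compaction() encodes can be tried out in
userspace with the sketch below. It is not the kernel function: the model_
prefix marks a hypothetical helper, struct scan_control is reduced to a bare
order argument, COMPACTION_BUILD is assumed to be 1 (CONFIG_COMPACTION=y),
and the DEF_PRIORITY and PAGE_ALLOC_COSTLY_ORDER values are restated from
mainline of that era as assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, restated from mainline around v3.4. */
#define DEF_PRIORITY 12
#define PAGE_ALLOC_COSTLY_ORDER 3
#define COMPACTION_BUILD 1	/* assumption: CONFIG_COMPACTION=y */

/*
 * Hypothetical userspace model of the predicate added by this patch.
 * Reclaim/compaction applies to high-order requests that are either
 * costly or being serviced under memory pressure.
 */
static bool model_in_reclaim_compaction(int priority, int order)
{
	assert(priority >= 0 && priority <= DEF_PRIORITY);

	return COMPACTION_BUILD && order &&
	       (order > PAGE_ALLOC_COSTLY_ORDER ||
		priority < DEF_PRIORITY - 2);
}
```

Under these assumptions an order-2 request enters reclaim/compaction only
once priority has fallen to 9 or lower, whereas an order-9 request qualifies
from the first pass and an order-0 request never does.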
^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH 0/3] Removal of lumpy reclaim V2
2012-04-11 16:38 ` Mel Gorman
@ 2012-04-11 17:17 ` Rik van Riel
-1 siblings, 0 replies; 36+ messages in thread
From: Rik van Riel @ 2012-04-11 17:17 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, Konstantin Khlebnikov, Hugh Dickins, Ying Han,
Linux-MM, LKML
On 04/11/2012 12:38 PM, Mel Gorman wrote:
> Success rates are completely hosed for 3.4-rc2 which is almost certainly
> due to [fe2c2a10: vmscan: reclaim at order 0 when compaction is enabled]. I
> expected this would happen for kswapd and impair allocation success rates
> (https://lkml.org/lkml/2012/1/25/166) but I did not anticipate this much
> a difference: 80% less scanning, 37% less reclaim by kswapd
Also, no gratuitous pageouts of anonymous memory.
That was what really made a difference on a somewhat
heavily loaded desktop + kvm workload.
> In comparison, reclaim/compaction is not aggressive and gives up easily
> which is the intended behaviour. hugetlbfs uses __GFP_REPEAT and would be
> much more aggressive about reclaim/compaction than THP allocations are. The
> stress test above is allocating like neither THP nor hugetlbfs but is much
> closer to THP.
Next step: get rid of __GFP_NO_KSWAPD for THP, first
in the -mm kernel
> Mainline is now impaired in terms of high order allocation under heavy load
> although I do not know to what degree as I did not test with __GFP_REPEAT.
> Keep this in mind for bugs related to hugepage pool resizing, THP allocation
> and high order atomic allocation failures from network devices.
This might be due to smaller allocations not bumping
the compaction deferring code, when we have deferred
compaction for a higher order allocation.
I wonder if the compaction deferring code is simply
too defer-happy, now that we ignore compaction at
lower orders than where compaction failed?
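The deferral behaviour discussed above can be illustrated with a small
userspace model. It is loosely based on how compaction deferral worked around
v3.4 and is written from memory, so treat the struct, the field names, the
model_ helpers and the COMPACT_MAX_DEFER_SHIFT value as assumptions rather
than the kernel's actual code.

```c
#include <assert.h>
#include <stdbool.h>

#define COMPACT_MAX_DEFER_SHIFT 6	/* assumed cap on the back-off */

/* Hypothetical stand-in for the per-zone deferral state. */
struct zone_model {
	unsigned int compact_considered;	/* attempts since last deferral */
	unsigned int compact_defer_shift;	/* back-off exponent */
	int compact_order_failed;		/* lowest order that failed */
};

/* Record a compaction failure at 'order' and lengthen the back-off. */
static void model_defer_compaction(struct zone_model *z, int order)
{
	assert(z && order >= 0);

	z->compact_considered = 0;
	z->compact_defer_shift++;
	if (order < z->compact_order_failed)
		z->compact_order_failed = order;
	if (z->compact_defer_shift > COMPACT_MAX_DEFER_SHIFT)
		z->compact_defer_shift = COMPACT_MAX_DEFER_SHIFT;
}

/* Return true if compaction for 'order' should be skipped this time. */
static bool model_compaction_deferred(struct zone_model *z, int order)
{
	unsigned long defer_limit;

	assert(z && order >= 0);
	defer_limit = 1UL << z->compact_defer_shift;

	/* Orders below the failed order are never deferred or counted. */
	if (order < z->compact_order_failed)
		return false;

	if (++z->compact_considered > defer_limit)
		z->compact_considered = defer_limit;

	return z->compact_considered < defer_limit;
}
```

In this model a request below the failed order returns early and does not
advance compact_considered, which is the "not bumping" effect raised above:
only requests at or above the failed order count towards ending the back-off.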
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 1/3] mm: vmscan: Remove lumpy reclaim
2012-04-11 16:38 ` Mel Gorman
@ 2012-04-11 17:25 ` Rik van Riel
-1 siblings, 0 replies; 36+ messages in thread
From: Rik van Riel @ 2012-04-11 17:25 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, Konstantin Khlebnikov, Hugh Dickins, Ying Han,
Linux-MM, LKML
On 04/11/2012 12:38 PM, Mel Gorman wrote:
> Lumpy reclaim had a purpose but in the mind of some, it was to kick
> the system so hard it thrashed. For others the purpose was to complicate
> vmscan.c. Over time it was given softer shoes and a nicer attitude but
> memory compaction needs to step up and replace it so this patch sends
> lumpy reclaim to the farm.
>
> The tracepoint format changes for isolating LRU pages with this patch
> applied. Furthermore reclaim/compaction can no longer queue dirty pages in
> pageout() if the underlying BDI is congested. Lumpy reclaim used this logic
> and reclaim/compaction was using it in error.
>
> Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 2/3] mm: vmscan: Do not stall on writeback during memory compaction
2012-04-11 16:38 ` Mel Gorman
@ 2012-04-11 17:26 ` Rik van Riel
-1 siblings, 0 replies; 36+ messages in thread
From: Rik van Riel @ 2012-04-11 17:26 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, Konstantin Khlebnikov, Hugh Dickins, Ying Han,
Linux-MM, LKML
On 04/11/2012 12:38 PM, Mel Gorman wrote:
> This patch stops reclaim/compaction entering sync reclaim, as this was only
> intended for lumpy reclaim and its use here was an oversight. Page migration
> has its own logic for stalling on writeback pages if necessary and memory
> compaction is already using it.
>
> Waiting on page writeback is bad for a number of reasons but the primary
> one is that waiting on writeback to a slow device like USB can take a
> considerable length of time. Page reclaim instead uses wait_iff_congested()
> to throttle if too many dirty pages are being scanned.
>
> Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 3/3] mm: vmscan: Remove reclaim_mode_t
2012-04-11 16:38 ` Mel Gorman
@ 2012-04-11 17:26 ` Rik van Riel
-1 siblings, 0 replies; 36+ messages in thread
From: Rik van Riel @ 2012-04-11 17:26 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, Konstantin Khlebnikov, Hugh Dickins, Ying Han,
Linux-MM, LKML
On 04/11/2012 12:38 PM, Mel Gorman wrote:
> There is little motivation for reclaim_mode_t once RECLAIM_MODE_[A]SYNC
> and lumpy reclaim have been removed. This patch gets rid of reclaim_mode_t
> as well and improves the documentation about what reclaim/compaction is
> and when it is triggered.
>
> Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 0/3] Removal of lumpy reclaim V2
2012-04-11 17:17 ` Rik van Riel
@ 2012-04-11 17:52 ` Mel Gorman
-1 siblings, 0 replies; 36+ messages in thread
From: Mel Gorman @ 2012-04-11 17:52 UTC (permalink / raw)
To: Rik van Riel
Cc: Andrew Morton, Konstantin Khlebnikov, Hugh Dickins, Ying Han,
Linux-MM, LKML
On Wed, Apr 11, 2012 at 01:17:02PM -0400, Rik van Riel wrote:
> On 04/11/2012 12:38 PM, Mel Gorman wrote:
>
> >Success rates are completely hosed for 3.4-rc2 which is almost certainly
> >due to [fe2c2a10: vmscan: reclaim at order 0 when compaction is enabled]. I
> >expected this would happen for kswapd and impair allocation success rates
> >(https://lkml.org/lkml/2012/1/25/166) but I did not anticipate this much
> >a difference: 80% less scanning, 37% less reclaim by kswapd
>
> Also, no gratuitous pageouts of anonymous memory.
> That was what really made a difference on a somewhat
> heavily loaded desktop + kvm workload.
>
Indeed.
> >In comparison, reclaim/compaction is not aggressive and gives up easily
> >which is the intended behaviour. hugetlbfs uses __GFP_REPEAT and would be
> >much more aggressive about reclaim/compaction than THP allocations are. The
> >stress test above is allocating like neither THP or hugetlbfs but is much
> >closer to THP.
>
> Next step: get rid of __GFP_NO_KSWAPD for THP, first
> in the -mm kernel
>
Initially the flag was introduced because kswapd reclaimed too
aggressively. One would like to believe that it would be less of a problem
now but we must avoid a situation where the CPU and reclaim cost of kswapd
exceeds the benefit of allocating a THP.
> >Mainline is now impaired in terms of high order allocation under heavy load
> >although I do not know to what degree as I did not test with __GFP_REPEAT.
> >Keep this in mind for bugs related to hugepage pool resizing, THP allocation
> >and high order atomic allocation failures from network devices.
>
> This might be due to smaller allocations not bumping
> the compaction deferring code, when we have deferred
> compaction for a higher order allocation.
>
It's one possibility but in this case I am not inclined to blame memory
compaction as such although there is some indication that there is a bug in
the free scanner that would make compaction less effective than it should be.
> I wonder if the compaction deferring code is simply
> too defer-happy, now that we ignore compaction at
> lower orders than where compaction failed?
I do not think it's a compaction deferral problem. We do not record
statistics on how often we defer compaction but if you look at the compaction
statistics you'll see that "Compaction stalls" and "Compaction pages moved"
figures are much higher. This implies that we are using compaction more
aggressively in 3.4-rc2 instead of deferring more.
You may also note that "Compaction success" figures are more or less the
same as 3.3 but that "Compaction failures" are higher. This indicates that
in 3.2 the high success rate was partially due to lumpy reclaim freeing
up the contiguous page before memory compaction was needed in memory
pressure situations. If that is accurate then adjusting the logic in
should_continue_reclaim() for reclaim/compaction may partially address
the issue but not 100% of the way as reclaim/compaction will still be
racing with other allocation requests. This race is likely to be tighter
now because an accidental side-effect of lumpy reclaim was to throttle
parallel allocation requests in swap. It may not be very
straightforward to fix :)
--
Mel Gorman
SUSE Labs
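Anyone wanting to compare compaction activity on their own kernels could
start from a sketch like the one below. It assumes that figures such as
"Compaction stalls" are derived from the compact_* counters in /proc/vmstat
(for example compact_stall, compact_success and compact_fail, present when
compaction is built in); the helper names are hypothetical and the exact set
of counters varies between kernel versions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one "name value" line in /proc/vmstat format. Returns 1 and fills
 * name/value if the line is a compact_* counter, 0 otherwise.
 */
static int parse_compact_counter(const char *line, char *name, size_t len,
				 unsigned long long *value)
{
	char buf[64];

	assert(line && name && value && len > 0);

	if (sscanf(line, "%63s %llu", buf, value) != 2)
		return 0;
	if (strncmp(buf, "compact_", 8) != 0)
		return 0;

	snprintf(name, len, "%s", buf);
	return 1;
}

/* Print every compact_* counter found in a vmstat-style file. */
static int dump_compact_counters(const char *path)
{
	char line[128], name[64];
	unsigned long long value;
	FILE *fp = fopen(path, "r");

	if (!fp)
		return -1;

	while (fgets(line, sizeof(line), fp))
		if (parse_compact_counter(line, name, sizeof(name), &value))
			printf("%-28s %llu\n", name, value);

	fclose(fp);
	return 0;
}
```

Calling dump_compact_counters("/proc/vmstat") before and after a test run and
subtracting the two sets of values gives per-run deltas that can be compared
across kernels.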
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 0/3] Removal of lumpy reclaim V2
2012-04-11 17:52 ` Mel Gorman
@ 2012-04-11 18:06 ` Rik van Riel
-1 siblings, 0 replies; 36+ messages in thread
From: Rik van Riel @ 2012-04-11 18:06 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, Konstantin Khlebnikov, Hugh Dickins, Ying Han,
Linux-MM, LKML
On 04/11/2012 01:52 PM, Mel Gorman wrote:
> On Wed, Apr 11, 2012 at 01:17:02PM -0400, Rik van Riel wrote:
>> Next step: get rid of __GFP_NO_KSWAPD for THP, first
>> in the -mm kernel
>>
>
> Initially the flag was introduced because kswapd reclaimed too
> aggressively. One would like to believe that it would be less of a problem
> now but we must avoid a situation where the CPU and reclaim cost of kswapd
> exceeds the benefit of allocating a THP.
Since kswapd and the direct reclaim code now use
the same conditionals for calling compaction,
the cost ought to be identical.
I agree this is something we should shake out
in -mm for a while though, before considering a
mainline merge.
Andrew, would you be willing to take a removal
of __GFP_NO_KSWAPD in -mm, and push it to Linus
for the 3.6 kernel if no ill effects are seen
in -mm and -next?
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 2/3] mm: vmscan: Do not stall on writeback during memory compaction
2012-04-11 17:26 ` Rik van Riel
@ 2012-04-11 18:51 ` KOSAKI Motohiro
-1 siblings, 0 replies; 36+ messages in thread
From: KOSAKI Motohiro @ 2012-04-11 18:51 UTC (permalink / raw)
To: Rik van Riel
Cc: Mel Gorman, Andrew Morton, Konstantin Khlebnikov, Hugh Dickins,
Ying Han, Linux-MM, LKML
On Wed, Apr 11, 2012 at 1:26 PM, Rik van Riel <riel@redhat.com> wrote:
> On 04/11/2012 12:38 PM, Mel Gorman wrote:
>>
>> This patch stops reclaim/compaction entering sync reclaim as this was only
>> intended for lumpy reclaim and an oversight. Page migration has its own
>> logic for stalling on writeback pages if necessary and memory compaction
>> is already using it.
>>
>> Waiting on page writeback is bad for a number of reasons but the primary
>> one is that waiting on writeback to a slow device like USB can take a
>> considerable length of time. Page reclaim instead uses
>> wait_iff_congested()
>> to throttle if too many dirty pages are being scanned.
>>
>> Signed-off-by: Mel Gorman<mgorman@suse.de>
>
>
> Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 1/3] mm: vmscan: Remove lumpy reclaim
2012-04-11 17:25 ` Rik van Riel
@ 2012-04-11 18:54 ` KOSAKI Motohiro
-1 siblings, 0 replies; 36+ messages in thread
From: KOSAKI Motohiro @ 2012-04-11 18:54 UTC (permalink / raw)
To: Rik van Riel
Cc: Mel Gorman, Andrew Morton, Konstantin Khlebnikov, Hugh Dickins,
Ying Han, Linux-MM, LKML
On Wed, Apr 11, 2012 at 1:25 PM, Rik van Riel <riel@redhat.com> wrote:
> On 04/11/2012 12:38 PM, Mel Gorman wrote:
>>
>> Lumpy reclaim had a purpose but in the mind of some, it was to kick
>> the system so hard it thrashed. For others the purpose was to complicate
>> vmscan.c. Over time it was given softer shoes and a nicer attitude but
>> memory compaction needs to step up and replace it so this patch sends
>> lumpy reclaim to the farm.
>>
>> The tracepoint format changes for isolating LRU pages with this patch
>> applied. Furthermore reclaim/compaction can no longer queue dirty pages in
>> pageout() if the underlying BDI is congested. Lumpy reclaim used this
>> logic
>> and reclaim/compaction was using it in error.
>>
>> Signed-off-by: Mel Gorman<mgorman@suse.de>
>
> Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 3/3] mm: vmscan: Remove reclaim_mode_t
2012-04-11 17:26 ` Rik van Riel
@ 2012-04-11 19:48 ` KOSAKI Motohiro
-1 siblings, 0 replies; 36+ messages in thread
From: KOSAKI Motohiro @ 2012-04-11 19:48 UTC (permalink / raw)
To: Rik van Riel
Cc: Mel Gorman, Andrew Morton, Konstantin Khlebnikov, Hugh Dickins,
Ying Han, Linux-MM, LKML
On Wed, Apr 11, 2012 at 1:26 PM, Rik van Riel <riel@redhat.com> wrote:
> On 04/11/2012 12:38 PM, Mel Gorman wrote:
>>
>> There is little motivation for reclaim_mode_t once RECLAIM_MODE_[A]SYNC
>> and lumpy reclaim have been removed. This patch gets rid of reclaim_mode_t
>> as well and improves the documentation about what reclaim/compaction is
>> and when it is triggered.
>>
>> Signed-off-by: Mel Gorman<mgorman@suse.de>
>
> Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 0/3] Removal of lumpy reclaim V2
2012-04-11 16:38 ` Mel Gorman
@ 2012-04-11 23:37 ` Ying Han
-1 siblings, 0 replies; 36+ messages in thread
From: Ying Han @ 2012-04-11 23:37 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, Rik van Riel, Konstantin Khlebnikov, Hugh Dickins,
Linux-MM, LKML
On Wed, Apr 11, 2012 at 9:38 AM, Mel Gorman <mgorman@suse.de> wrote:
> Andrew, these three patches should replace the two lumpy reclaim patches
> you already have. When applied, there is no functional difference (only
> slight changes in layout) but the changelogs are better.
>
> Changelog since V1
> o Ying pointed out that compaction was waiting on page writeback and the
> description of the patches in V1 was broken. This version is the same
> except that it is structured differently to explain that waiting on
> page writeback is removed.
> o Rebased to v3.4-rc2
>
> This series removes lumpy reclaim and some stalling logic that was
> unintentionally being used by memory compaction. The end result
> is that stalling on dirty pages during page reclaim now depends on
> wait_iff_congested().
>
> Four kernels were compared
>
> 3.3.0 vanilla
> 3.4.0-rc2 vanilla
> 3.4.0-rc2 lumpyremove-v2 is patch one from this series
> 3.4.0-rc2 nosync-v2r3 is the full series
>
> Removing lumpy reclaim saves almost 900 bytes of text whereas the full
> series removes about 1200 bytes of text.
>
>    text    data     bss      dec    hex filename
> 6740375 1927944 2260992 10929311 a6c49f vmlinux-3.4.0-rc2-vanilla
> 6739479 1927944 2260992 10928415 a6c11f vmlinux-3.4.0-rc2-lumpyremove-v2
> 6739159 1927944 2260992 10928095 a6bfdf vmlinux-3.4.0-rc2-nosync-v2
>
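As a sanity check on the quoted size figures, the savings can be recomputed
from the text column. This is a minimal Python sketch, assuming the column is
in bytes as `size` normally reports it; the dictionary keys and the helper
name are made up for illustration.

```python
# Text-segment sizes copied from the quoted `size` output (assumed bytes).
text_bytes = {
    "vanilla": 6740375,
    "lumpyremove-v2": 6739479,
    "nosync-v2": 6739159,
}


def text_saved(baseline: str, patched: str) -> int:
    """Return how many bytes of text `patched` saves relative to `baseline`."""
    return text_bytes[baseline] - text_bytes[patched]


lumpy_saved = text_saved("vanilla", "lumpyremove-v2")
series_saved = text_saved("vanilla", "nosync-v2")
print(f"patch 1 saves {lumpy_saved} bytes, full series saves {series_saved} bytes")
```

The differences come out at 896 and 1216, which matches the dec and hex
columns and puts the savings at roughly 900 and 1200 bytes rather than
kilobytes.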
> There are behaviour changes in the series and so tests were run with
> monitoring of ftrace events. This disrupts results so the performance
> results are distorted but the new behaviour should be clearer.
>
> fs-mark running in a threaded configuration showed little of interest as
> it did not push reclaim aggressively
>
> FS-Mark Multi Threaded
> 3.3.0-vanilla rc2-vanilla lumpyremove-v2r3 nosync-v2r3
> Files/s min 3.20 ( 0.00%) 3.20 ( 0.00%) 3.20 ( 0.00%) 3.20 ( 0.00%)
> Files/s mean 3.20 ( 0.00%) 3.20 ( 0.00%) 3.20 ( 0.00%) 3.20 ( 0.00%)
> Files/s stddev 0.00 ( 0.00%) 0.00 ( 0.00%) 0.00 ( 0.00%) 0.00 ( 0.00%)
> Files/s max 3.20 ( 0.00%) 3.20 ( 0.00%) 3.20 ( 0.00%) 3.20 ( 0.00%)
> Overhead min 508667.00 ( 0.00%) 521350.00 (-2.49%) 544292.00 (-7.00%) 547168.00 (-7.57%)
> Overhead mean 551185.00 ( 0.00%) 652690.73 (-18.42%) 991208.40 (-79.83%) 570130.53 (-3.44%)
> Overhead stddev 18200.69 ( 0.00%) 331958.29 (-1723.88%) 1579579.43 (-8578.68%) 9576.81 (47.38%)
> Overhead max 576775.00 ( 0.00%) 1846634.00 (-220.17%) 6901055.00 (-1096.49%) 585675.00 (-1.54%)
> MMTests Statistics: duration
> Sys Time Running Test (seconds) 309.90 300.95 307.33 298.95
> User+Sys Time Running Test (seconds) 319.32 309.67 315.69 307.51
> Total Elapsed Time (seconds) 1187.85 1193.09 1191.98 1193.73
>
> MMTests Statistics: vmstat
> Page Ins 80532 82212 81420 79480
> Page Outs 111434984 111456240 111437376 111582628
> Swap Ins 0 0 0 0
> Swap Outs 0 0 0 0
> Direct pages scanned 44881 27889 27453 34843
> Kswapd pages scanned 25841428 25860774 25861233 25843212
> Kswapd pages reclaimed 25841393 25860741 25861199 25843179
> Direct pages reclaimed 44881 27889 27453 34843
> Kswapd efficiency 99% 99% 99% 99%
> Kswapd velocity 21754.791 21675.460 21696.029 21649.127
> Direct efficiency 100% 100% 100% 100%
> Direct velocity 37.783 23.375 23.031 29.188
> Percentage direct scans 0% 0% 0% 0%
>
> ftrace showed that there was no stalling on writeback or pages submitted
> for IO from reclaim context.
>
>
> postmark was similar and while it was more interesting, it also did not
> push reclaim heavily.
>
> POSTMARK
> 3.3.0-vanilla rc2-vanilla lumpyremove-v2r3 nosync-v2r3
> Transactions per second: 16.00 ( 0.00%) 20.00 (25.00%) 18.00 (12.50%) 17.00 ( 6.25%)
> Data megabytes read per second: 18.80 ( 0.00%) 24.27 (29.10%) 22.26 (18.40%) 20.54 ( 9.26%)
> Data megabytes written per second: 35.83 ( 0.00%) 46.25 (29.08%) 42.42 (18.39%) 39.14 ( 9.24%)
> Files created alone per second: 28.00 ( 0.00%) 38.00 (35.71%) 34.00 (21.43%) 30.00 ( 7.14%)
> Files create/transact per second: 8.00 ( 0.00%) 10.00 (25.00%) 9.00 (12.50%) 8.00 ( 0.00%)
> Files deleted alone per second: 556.00 ( 0.00%) 1224.00 (120.14%) 3062.00 (450.72%) 6124.00 (1001.44%)
> Files delete/transact per second: 8.00 ( 0.00%) 10.00 (25.00%) 9.00 (12.50%) 8.00 ( 0.00%)
>
> MMTests Statistics: duration
> Sys Time Running Test (seconds) 113.34 107.99 109.73 108.72
> User+Sys Time Running Test (seconds) 145.51 139.81 143.32 143.55
> Total Elapsed Time (seconds) 1159.16 899.23 980.17 1062.27
>
> MMTests Statistics: vmstat
> Page Ins 13710192 13729032 13727944 13760136
> Page Outs 43071140 42987228 42733684 42931624
> Swap Ins 0 0 0 0
> Swap Outs 0 0 0 0
> Direct pages scanned 0 0 0 0
> Kswapd pages scanned 9941613 9937443 9939085 9929154
> Kswapd pages reclaimed 9940926 9936751 9938397 9928465
> Direct pages reclaimed 0 0 0 0
> Kswapd efficiency 99% 99% 99% 99%
> Kswapd velocity 8576.567 11051.058 10140.164 9347.109
> Direct efficiency 100% 100% 100% 100%
> Direct velocity 0.000 0.000 0.000 0.000
>
> It looks like here that the full series regresses performance but as ftrace
> showed no usage of wait_iff_congested() or sync reclaim I am assuming it's
> a disruption due to monitoring. Other data such as memory usage, page IO,
> swap IO all looked similar.
>
> Running a benchmark with a plain DD showed nothing very interesting. The
> full series stalled in wait_iff_congested() slightly less but stall times
> on vanilla kernels were marginal.
>
> Running a benchmark that hammered on file-backed mappings showed stalls
> due to congestion but not in sync writebacks
>
> MICRO
> 3.3.0-vanilla rc2-vanilla lumpyremove-v2r3 nosync-v2r3
> MMTests Statistics: duration
> Sys Time Running Test (seconds) 308.13 294.50 298.75 299.53
> User+Sys Time Running Test (seconds) 330.45 316.28 318.93 320.79
> Total Elapsed Time (seconds) 1814.90 1833.88 1821.14 1832.91
>
> MMTests Statistics: vmstat
> Page Ins 108712 120708 97224 110344
> Page Outs 155514576 156017404 155813676 156193256
> Swap Ins 0 0 0 0
> Swap Outs 0 0 0 0
> Direct pages scanned 2599253 1550480 2512822 2414760
> Kswapd pages scanned 69742364 71150694 68839041 69692533
> Kswapd pages reclaimed 34824488 34773341 34796602 34799396
> Direct pages reclaimed 53693 94750 61792 75205
> Kswapd efficiency 49% 48% 50% 49%
> Kswapd velocity 38427.662 38797.901 37799.972 38022.889
> Direct efficiency 2% 6% 2% 3%
> Direct velocity 1432.174 845.464 1379.807 1317.446
> Percentage direct scans 3% 2% 3% 3%
> Page writes by reclaim 0 0 0 0
> Page writes file 0 0 0 0
> Page writes anon 0 0 0 0
> Page reclaim immediate 0 0 0 1218
> Page rescued immediate 0 0 0 0
> Slabs scanned 15360 16384 13312 16384
> Direct inode steals 0 0 0 0
> Kswapd inode steals 4340 4327 1630 4323
>
> FTrace Reclaim Statistics: congestion_wait
> Direct number congest waited 0 0 0 0
> Direct time congest waited 0ms 0ms 0ms 0ms
> Direct full congest waited 0 0 0 0
> Direct number conditional waited 900 870 754 789
> Direct time conditional waited 0ms 0ms 0ms 20ms
> Direct full conditional waited 0 0 0 0
> KSwapd number congest waited 2106 2308 2116 1915
> KSwapd time congest waited 139924ms 157832ms 125652ms 132516ms
> KSwapd full congest waited 1346 1530 1202 1278
> KSwapd number conditional waited 12922 16320 10943 14670
> KSwapd time conditional waited 0ms 0ms 0ms 0ms
> KSwapd full conditional waited 0 0 0 0
>
>
> Reclaim statistics are not radically changed. The stall times in kswapd
> are massive but it is clear that it is due to calls to congestion_wait()
> and that is almost certainly the call in balance_pgdat(). Otherwise stalls
> due to dirty pages are non-existent.
>
> I ran a benchmark that stressed high-order allocation. This is a very
> artificial load but was used in the past to evaluate lumpy reclaim and
> compaction. Generally I look at allocation success rates and latency figures.
>
> STRESS-HIGHALLOC
>               3.3.0-vanilla    rc2-vanilla      lumpyremove-v2r3  nosync-v2r3
> Pass 1        81.00 ( 0.00%)   28.00 (-53.00%)  24.00 (-57.00%)   28.00 (-53.00%)
> Pass 2        82.00 ( 0.00%)   39.00 (-43.00%)  38.00 (-44.00%)   43.00 (-39.00%)
> while Rested  88.00 ( 0.00%)   87.00 (-1.00%)   88.00 ( 0.00%)    88.00 ( 0.00%)
>
> MMTests Statistics: duration
> Sys Time Running Test (seconds) 740.93 681.42 685.14 684.87
> User+Sys Time Running Test (seconds) 2922.65 3269.52 3281.35 3279.44
> Total Elapsed Time (seconds) 1161.73 1152.49 1159.55 1161.44
>
> MMTests Statistics: vmstat
> Page Ins 4486020 2807256 2855944 2876244
> Page Outs 7261600 7973688 7975320 7986120
> Swap Ins 31694 0 0 0
> Swap Outs 98179 0 0 0
> Direct pages scanned 53494 57731 34406 113015
> Kswapd pages scanned 6271173 1287481 1278174 1219095
> Kswapd pages reclaimed 2029240 1281025 1260708 1201583
> Direct pages reclaimed 1468 14564 16649 92456
> Kswapd efficiency 32% 99% 98% 98%
> Kswapd velocity 5398.133 1117.130 1102.302 1049.641
> Direct efficiency 2% 25% 48% 81%
> Direct velocity 46.047 50.092 29.672 97.306
> Percentage direct scans 0% 4% 2% 8%
> Page writes by reclaim 1616049 0 0 0
> Page writes file 1517870 0 0 0
> Page writes anon 98179 0 0 0
> Page reclaim immediate 103778 27339 9796 17831
> Page rescued immediate 0 0 0 0
> Slabs scanned 1096704 986112 980992 998400
> Direct inode steals 223 215040 216736 247881
> Kswapd inode steals 175331 61548 68444 63066
> Kswapd skipped wait 21991 0 1 0
> THP fault alloc 1 135 125 134
> THP collapse alloc 393 311 228 236
> THP splits 25 13 7 8
> THP fault fallback 0 0 0 0
> THP collapse fail 3 5 7 7
> Compaction stalls 865 1270 1422 1518
> Compaction success 370 401 353 383
> Compaction failures 495 869 1069 1135
> Compaction pages moved 870155 3828868 4036106 4423626
> Compaction move failure 26429 23865 29742 27514
>
> Success rates are completely hosed for 3.4-rc2 which is almost certainly
> due to [fe2c2a10: vmscan: reclaim at order 0 when compaction is enabled]. I
> expected this would happen for kswapd and impair allocation success rates
> (https://lkml.org/lkml/2012/1/25/166) but I did not anticipate this much
> of a difference: 80% less scanning, 37% less reclaim by kswapd.
>
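The derived rows in the quoted vmstat table can be reproduced from the raw
counters. The sketch below is Python and rests on assumptions: the formulas
(efficiency as reclaimed/scanned truncated to a whole percent, velocity as
pages scanned per second of elapsed time) are inferred from the numbers, not
stated in the thread, and the names are illustrative rather than MMTests
identifiers.

```python
# Kswapd counters copied from the quoted STRESS-HIGHALLOC vmstat table.
# Only the two vanilla kernels are included to keep the example short.
runs = {
    "3.3.0-vanilla": {"scanned": 6271173, "reclaimed": 2029240, "elapsed_s": 1161.73},
    "rc2-vanilla": {"scanned": 1287481, "reclaimed": 1281025, "elapsed_s": 1152.49},
}


def efficiency_pct(reclaimed: int, scanned: int) -> int:
    """Pages reclaimed per 100 pages scanned, truncated to an integer."""
    return 100 * reclaimed // scanned


def velocity(scanned: int, elapsed_s: float) -> float:
    """Pages scanned per second of elapsed test time."""
    return scanned / elapsed_s


def drop_pct(old: int, new: int) -> float:
    """Relative decrease from old to new, as a percentage."""
    return 100.0 * (old - new) / old


for name, r in runs.items():
    eff = efficiency_pct(r["reclaimed"], r["scanned"])
    vel = velocity(r["scanned"], r["elapsed_s"])
    print(f"{name}: kswapd efficiency {eff}%, velocity {vel:.3f}")

old, new = runs["3.3.0-vanilla"], runs["rc2-vanilla"]
scan_drop = drop_pct(old["scanned"], new["scanned"])
reclaim_drop = drop_pct(old["reclaimed"], new["reclaimed"])
print(f"kswapd scanning down {scan_drop:.1f}%, reclaim down {reclaim_drop:.1f}%")
```

With these inputs the drops come out at about 79.5% and 36.9%, in line with
the rounded 80% and 37% quoted above.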
> In comparison, reclaim/compaction is not aggressive and gives up easily
> which is the intended behaviour. hugetlbfs uses __GFP_REPEAT and would be
> much more aggressive about reclaim/compaction than THP allocations are. The
> stress test above is allocating like neither THP nor hugetlbfs but is much
> closer to THP.
>
> Mainline is now impaired in terms of high order allocation under heavy load
> although I do not know to what degree as I did not test with __GFP_REPEAT.
> Keep this in mind for bugs related to hugepage pool resizing, THP allocation
> and high order atomic allocation failures from network devices.
>
> In terms of congestion throttling, I see the following for this test
>
> FTrace Reclaim Statistics: congestion_wait
> Direct number congest waited 3 0 0 0
> Direct time congest waited 0ms 0ms 0ms 0ms
> Direct full congest waited 0 0 0 0
> Direct number conditional waited 957 512 1081 1075
> Direct time conditional waited 0ms 0ms 0ms 0ms
> Direct full conditional waited 0 0 0 0
> KSwapd number congest waited 36 4 3 5
> KSwapd time congest waited 3148ms 400ms 300ms 500ms
> KSwapd full congest waited 30 4 3 5
> KSwapd number conditional waited 88514 197 332 542
> KSwapd time conditional waited 4980ms 0ms 0ms 0ms
> KSwapd full conditional waited 49 0 0 0
>
> The "conditional waited" times are the most interesting as this is directly
> impacted by the number of dirty pages encountered during scan. As lumpy
> reclaim is no longer scanning contiguous ranges, it is finding fewer dirty
> pages. This brings wait times from about 5 seconds to 0. kswapd itself is
> still calling congestion_wait() so it'll still stall but it's a lot less.
>
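One way to read the quoted congestion table is to divide the total wait time
by the number of waits. This Python sketch uses only the kswapd rows; the
variable names are illustrative, and treating the result as a per-call
timeout is an inference from the data, not something stated in the thread.

```python
# Kswapd rows copied from the quoted congestion_wait table (STRESS-HIGHALLOC).
kernels = ["3.3.0-vanilla", "rc2-vanilla", "lumpyremove-v2r3", "nosync-v2r3"]
congest_calls = [36, 4, 3, 5]               # "KSwapd number congest waited"
congest_ms = [3148, 400, 300, 500]          # "KSwapd time congest waited"
conditional_calls = [88514, 197, 332, 542]  # "KSwapd number conditional waited"
conditional_ms = [4980, 0, 0, 0]            # "KSwapd time conditional waited"


def mean_ms(total_ms: int, calls: int) -> float:
    """Average wait per call in milliseconds; 0.0 when there were no calls."""
    return total_ms / calls if calls else 0.0


mean_congest = {
    k: mean_ms(ms, n) for k, n, ms in zip(kernels, congest_calls, congest_ms)
}
mean_conditional = {
    k: mean_ms(ms, n) for k, n, ms in zip(kernels, conditional_calls, conditional_ms)
}

for k in kernels:
    print(f"{k}: congest {mean_congest[k]:.1f} ms/call, "
          f"conditional {mean_conditional[k]:.3f} ms/call")
```

On the three 3.4-rc2 kernels every kswapd congestion wait averages exactly
100 ms, which fits the "full congest waited" counts equalling the number of
waits. On 3.3.0 the conditional waits total about 5 seconds but are spread
over 88514 calls, so the average per call is well under a millisecond.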
> In terms of the type of IO we were doing, I see this
>
> FTrace Reclaim Statistics: mm_vmscan_writepage
> Direct writes anon sync 0 0 0 0
> Direct writes anon async 0 0 0 0
> Direct writes file sync 0 0 0 0
> Direct writes file async 0 0 0 0
> Direct writes mixed sync 0 0 0 0
> Direct writes mixed async 0 0 0 0
> KSwapd writes anon sync 0 0 0 0
> KSwapd writes anon async 91682 0 0 0
> KSwapd writes file sync 0 0 0 0
> KSwapd writes file async 822629 0 0 0
> KSwapd writes mixed sync 0 0 0 0
> KSwapd writes mixed async 0 0 0 0
>
> In 3.2, kswapd was doing a bunch of async writes of pages but
> reclaim/compaction was never reaching a point where it was doing sync
> IO. This does not guarantee that reclaim/compaction was not calling
> wait_on_page_writeback() but I would consider it unlikely. It indicates
> that merging patches 2 and 3 to stop reclaim/compaction calling
> wait_on_page_writeback() should be safe.
>
> include/trace/events/vmscan.h | 40 ++-----
> mm/vmscan.c | 263 ++++-------------------------------------
> 2 files changed, 37 insertions(+), 266 deletions(-)
>
> --
> 1.7.9.2
>
This might be a naive question: what do we do about users with the following
in their .config file?
# CONFIG_COMPACTION is not set
--Ying
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 0/3] Removal of lumpy reclaim V2
@ 2012-04-11 23:37 ` Ying Han
0 siblings, 0 replies; 36+ messages in thread
From: Ying Han @ 2012-04-11 23:37 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, Rik van Riel, Konstantin Khlebnikov, Hugh Dickins,
Linux-MM, LKML
On Wed, Apr 11, 2012 at 9:38 AM, Mel Gorman <mgorman@suse.de> wrote:
> Andrew, these three patches should replace the two lumpy reclaim patches
> you already have. When applied, there is no functional difference (only
> slight changes in layout) but the changelogs are better.
>
> Changelog since V1
> o Ying pointed out that compaction was waiting on page writeback and the
> description of the patches in V1 was broken. This version is the same
> except that it is structured differently to explain that waiting on
> page writeback is removed.
> o Rebased to v3.4-rc2
>
> This series removes lumpy reclaim and some stalling logic that was
> unintentionally being used by memory compaction. The end result
> is that stalling on dirty pages during page reclaim now depends on
> wait_iff_congested().
>
> Four kernels were compared
>
> 3.3.0 vanilla
> 3.4.0-rc2 vanilla
> 3.4.0-rc2 lumpyremove-v2 is patch one from this series
> 3.4.0-rc2 nosync-v2r3 is the full series
>
> Removing lumpy reclaim saves almost 900 bytes of text whereas the full
> series removes about 1200 bytes of text.
>
>    text    data     bss      dec    hex filename
> 6740375 1927944 2260992 10929311 a6c49f vmlinux-3.4.0-rc2-vanilla
> 6739479 1927944 2260992 10928415 a6c11f vmlinux-3.4.0-rc2-lumpyremove-v2
> 6739159 1927944 2260992 10928095 a6bfdf vmlinux-3.4.0-rc2-nosync-v2
>
> There are behaviour changes in the series and so tests were run with
> monitoring of ftrace events. This disrupts results so the performance
> results are distorted but the new behaviour should be clearer.
>
> fs-mark running in a threaded configuration showed little of interest as
> it did not push reclaim aggressively
>
> FS-Mark Multi Threaded
> 3.3.0-vanilla rc2-vanilla lumpyremove-v2r3 nosync-v2r3
> Files/s min 3.20 ( 0.00%) 3.20 ( 0.00%) 3.20 ( 0.00%) 3.20 ( 0.00%)
> Files/s mean 3.20 ( 0.00%) 3.20 ( 0.00%) 3.20 ( 0.00%) 3.20 ( 0.00%)
> Files/s stddev 0.00 ( 0.00%) 0.00 ( 0.00%) 0.00 ( 0.00%) 0.00 ( 0.00%)
> Files/s max 3.20 ( 0.00%) 3.20 ( 0.00%) 3.20 ( 0.00%) 3.20 ( 0.00%)
> Overhead min 508667.00 ( 0.00%) 521350.00 (-2.49%) 544292.00 (-7.00%) 547168.00 (-7.57%)
> Overhead mean 551185.00 ( 0.00%) 652690.73 (-18.42%) 991208.40 (-79.83%) 570130.53 (-3.44%)
> Overhead stddev 18200.69 ( 0.00%) 331958.29 (-1723.88%) 1579579.43 (-8578.68%) 9576.81 (47.38%)
> Overhead max 576775.00 ( 0.00%) 1846634.00 (-220.17%) 6901055.00 (-1096.49%) 585675.00 (-1.54%)
> MMTests Statistics: duration
> Sys Time Running Test (seconds) 309.90 300.95 307.33 298.95
> User+Sys Time Running Test (seconds) 319.32 309.67 315.69 307.51
> Total Elapsed Time (seconds) 1187.85 1193.09 1191.98 1193.73
>
> MMTests Statistics: vmstat
> Page Ins 80532 82212 81420 79480
> Page Outs 111434984 111456240 111437376 111582628
> Swap Ins 0 0 0 0
> Swap Outs 0 0 0 0
> Direct pages scanned 44881 27889 27453 34843
> Kswapd pages scanned 25841428 25860774 25861233 25843212
> Kswapd pages reclaimed 25841393 25860741 25861199 25843179
> Direct pages reclaimed 44881 27889 27453 34843
> Kswapd efficiency 99% 99% 99% 99%
> Kswapd velocity 21754.791 21675.460 21696.029 21649.127
> Direct efficiency 100% 100% 100% 100%
> Direct velocity 37.783 23.375 23.031 29.188
> Percentage direct scans 0% 0% 0% 0%
>
> ftrace showed that there was no stalling on writeback or pages submitted
> for IO from reclaim context.
>
>
> postmark was similar and while it was more interesting, it also did not
> push reclaim heavily.
>
> POSTMARK
> 3.3.0-vanilla rc2-vanilla lumpyremove-v2r3 nosync-v2r3
> Transactions per second: 16.00 ( 0.00%) 20.00 (25.00%) 18.00 (12.50%) 17.00 ( 6.25%)
> Data megabytes read per second: 18.80 ( 0.00%) 24.27 (29.10%) 22.26 (18.40%) 20.54 ( 9.26%)
> Data megabytes written per second: 35.83 ( 0.00%) 46.25 (29.08%) 42.42 (18.39%) 39.14 ( 9.24%)
> Files created alone per second: 28.00 ( 0.00%) 38.00 (35.71%) 34.00 (21.43%) 30.00 ( 7.14%)
> Files create/transact per second: 8.00 ( 0.00%) 10.00 (25.00%) 9.00 (12.50%) 8.00 ( 0.00%)
> Files deleted alone per second: 556.00 ( 0.00%) 1224.00 (120.14%) 3062.00 (450.72%) 6124.00 (1001.44%)
> Files delete/transact per second: 8.00 ( 0.00%) 10.00 (25.00%) 9.00 (12.50%) 8.00 ( 0.00%)
>
> MMTests Statistics: duration
> Sys Time Running Test (seconds) 113.34 107.99 109.73 108.72
> User+Sys Time Running Test (seconds) 145.51 139.81 143.32 143.55
> Total Elapsed Time (seconds) 1159.16 899.23 980.17 1062.27
>
> MMTests Statistics: vmstat
> Page Ins 13710192 13729032 13727944 13760136
> Page Outs 43071140 42987228 42733684 42931624
> Swap Ins 0 0 0 0
> Swap Outs 0 0 0 0
> Direct pages scanned 0 0 0 0
> Kswapd pages scanned 9941613 9937443 9939085 9929154
> Kswapd pages reclaimed 9940926 9936751 9938397 9928465
> Direct pages reclaimed 0 0 0 0
> Kswapd efficiency 99% 99% 99% 99%
> Kswapd velocity 8576.567 11051.058 10140.164 9347.109
> Direct efficiency 100% 100% 100% 100%
> Direct velocity 0.000 0.000 0.000 0.000
>
> It looks like here that the full series regresses performance but as ftrace
> showed no usage of wait_iff_congested() or sync reclaim I am assuming it's
> a disruption due to monitoring. Other data such as memory usage, page IO,
> swap IO all looked similar.
>
> Running a benchmark with a plain DD showed nothing very interesting. The
> full series stalled in wait_iff_congested() slightly less but stall times
> on vanilla kernels were marginal.
>
> Running a benchmark that hammered on file-backed mappings showed stalls
> due to congestion but not in sync writebacks
>
> MICRO
> 3.3.0-vanilla rc2-vanilla lumpyremove-v2r3 nosync-v2r3
> MMTests Statistics: duration
> Sys Time Running Test (seconds) 308.13 294.50 298.75 299.53
> User+Sys Time Running Test (seconds) 330.45 316.28 318.93 320.79
> Total Elapsed Time (seconds) 1814.90 1833.88 1821.14 1832.91
>
> MMTests Statistics: vmstat
> Page Ins 108712 120708 97224 110344
> Page Outs 155514576 156017404 155813676 156193256
> Swap Ins 0 0 0 0
> Swap Outs 0 0 0 0
> Direct pages scanned 2599253 1550480 2512822 2414760
> Kswapd pages scanned 69742364 71150694 68839041 69692533
> Kswapd pages reclaimed 34824488 34773341 34796602 34799396
> Direct pages reclaimed 53693 94750 61792 75205
> Kswapd efficiency 49% 48% 50% 49%
> Kswapd velocity 38427.662 38797.901 37799.972 38022.889
> Direct efficiency 2% 6% 2% 3%
> Direct velocity 1432.174 845.464 1379.807 1317.446
> Percentage direct scans 3% 2% 3% 3%
> Page writes by reclaim 0 0 0 0
> Page writes file 0 0 0 0
> Page writes anon 0 0 0 0
> Page reclaim immediate 0 0 0 1218
> Page rescued immediate 0 0 0 0
> Slabs scanned 15360 16384 13312 16384
> Direct inode steals 0 0 0 0
> Kswapd inode steals 4340 4327 1630 4323
>
> FTrace Reclaim Statistics: congestion_wait
> Direct number congest waited 0 0 0 0
> Direct time congest waited 0ms 0ms 0ms 0ms
> Direct full congest waited 0 0 0 0
> Direct number conditional waited 900 870 754 789
> Direct time conditional waited 0ms 0ms 0ms 20ms
> Direct full conditional waited 0 0 0 0
> KSwapd number congest waited 2106 2308 2116 1915
> KSwapd time congest waited 139924ms 157832ms 125652ms 132516ms
> KSwapd full congest waited 1346 1530 1202 1278
> KSwapd number conditional waited 12922 16320 10943 14670
> KSwapd time conditional waited 0ms 0ms 0ms 0ms
> KSwapd full conditional waited 0 0 0 0
>
>
> Reclaim statistics are not radically changed. The stall times in kswapd
> are massive but it is clear that it is due to calls to congestion_wait()
> and that is almost certainly the call in balance_pgdat(). Otherwise stalls
> due to dirty pages are non-existent.
>
> I ran a benchmark that stressed high-order allocation. This is a very
> artificial load but was used in the past to evaluate lumpy reclaim and
> compaction. Generally I look at allocation success rates and latency figures.
>
> STRESS-HIGHALLOC
>               3.3.0-vanilla    rc2-vanilla      lumpyremove-v2r3  nosync-v2r3
> Pass 1        81.00 ( 0.00%)   28.00 (-53.00%)  24.00 (-57.00%)   28.00 (-53.00%)
> Pass 2        82.00 ( 0.00%)   39.00 (-43.00%)  38.00 (-44.00%)   43.00 (-39.00%)
> while Rested  88.00 ( 0.00%)   87.00 (-1.00%)   88.00 ( 0.00%)    88.00 ( 0.00%)
>
> MMTests Statistics: duration
> Sys Time Running Test (seconds)           740.93     681.42     685.14     684.87
> User+Sys Time Running Test (seconds)     2922.65    3269.52    3281.35    3279.44
> Total Elapsed Time (seconds)             1161.73    1152.49    1159.55    1161.44
>
> MMTests Statistics: vmstat
> Page Ins                     4486020    2807256    2855944    2876244
> Page Outs                    7261600    7973688    7975320    7986120
> Swap Ins                       31694          0          0          0
> Swap Outs                      98179          0          0          0
> Direct pages scanned           53494      57731      34406     113015
> Kswapd pages scanned         6271173    1287481    1278174    1219095
> Kswapd pages reclaimed       2029240    1281025    1260708    1201583
> Direct pages reclaimed          1468      14564      16649      92456
> Kswapd efficiency                32%        99%        98%        98%
> Kswapd velocity             5398.133   1117.130   1102.302   1049.641
> Direct efficiency                 2%        25%        48%        81%
> Direct velocity               46.047     50.092     29.672     97.306
> Percentage direct scans           0%         4%         2%         8%
> Page writes by reclaim       1616049          0          0          0
> Page writes file             1517870          0          0          0
> Page writes anon               98179          0          0          0
> Page reclaim immediate        103778      27339       9796      17831
> Page rescued immediate             0          0          0          0
> Slabs scanned                1096704     986112     980992     998400
> Direct inode steals              223     215040     216736     247881
> Kswapd inode steals           175331      61548      68444      63066
> Kswapd skipped wait            21991          0          1          0
> THP fault alloc                    1        135        125        134
> THP collapse alloc               393        311        228        236
> THP splits                        25         13          7          8
> THP fault fallback                 0          0          0          0
> THP collapse fail                  3          5          7          7
> Compaction stalls                865       1270       1422       1518
> Compaction success               370        401        353        383
> Compaction failures              495        869       1069       1135
> Compaction pages moved        870155    3828868    4036106    4423626
> Compaction move failure        26429      23865      29742      27514
>
> Success rates are completely hosed for 3.4-rc2 which is almost certainly
> due to [fe2c2a10: vmscan: reclaim at order 0 when compaction is enabled]. I
> expected this would happen for kswapd and impair allocation success rates
> (https://lkml.org/lkml/2012/1/25/166) but I did not anticipate this much
> of a difference: 80% less scanning, 37% less reclaim by kswapd.
>
> In comparison, reclaim/compaction is not aggressive and gives up easily
> which is the intended behaviour. hugetlbfs uses __GFP_REPEAT and would be
> much more aggressive about reclaim/compaction than THP allocations are. The
> stress test above is allocating like neither THP nor hugetlbfs but is much
> closer to THP.
>
> Mainline is now impaired in terms of high order allocation under heavy load
> although I do not know to what degree as I did not test with __GFP_REPEAT.
> Keep this in mind for bugs related to hugepage pool resizing, THP allocation
> and high order atomic allocation failures from network devices.
>
> In terms of congestion throttling, I see the following for this test
>
> FTrace Reclaim Statistics: congestion_wait
> Direct number congest waited             3       0       0       0
> Direct time congest waited             0ms     0ms     0ms     0ms
> Direct full congest waited               0       0       0       0
> Direct number conditional waited       957     512    1081    1075
> Direct time conditional waited         0ms     0ms     0ms     0ms
> Direct full conditional waited           0       0       0       0
> KSwapd number congest waited            36       4       3       5
> KSwapd time congest waited          3148ms   400ms   300ms   500ms
> KSwapd full congest waited              30       4       3       5
> KSwapd number conditional waited     88514     197     332     542
> KSwapd time conditional waited      4980ms     0ms     0ms     0ms
> KSwapd full conditional waited          49       0       0       0
>
> The "conditional waited" times are the most interesting as they are directly
> impacted by the number of dirty pages encountered during scan. As lumpy
> reclaim is no longer scanning contiguous ranges, it is finding fewer dirty
> pages. This brings wait times from about 5 seconds to 0. kswapd itself is
> still calling congestion_wait() so it'll still stall but it's a lot less.
>
> In terms of the type of IO we were doing, I see this
>
> FTrace Reclaim Statistics: mm_vmscan_writepage
> Direct writes anon sync           0       0       0       0
> Direct writes anon async          0       0       0       0
> Direct writes file sync           0       0       0       0
> Direct writes file async          0       0       0       0
> Direct writes mixed sync          0       0       0       0
> Direct writes mixed async         0       0       0       0
> KSwapd writes anon sync           0       0       0       0
> KSwapd writes anon async      91682       0       0       0
> KSwapd writes file sync           0       0       0       0
> KSwapd writes file async     822629       0       0       0
> KSwapd writes mixed sync          0       0       0       0
> KSwapd writes mixed async         0       0       0       0
>
> In 3.2, kswapd was doing a bunch of async writes of pages but
> reclaim/compaction was never reaching a point where it was doing sync
> IO. This does not guarantee that reclaim/compaction was not calling
> wait_on_page_writeback() but I would consider it unlikely. It indicates
> that merging patches 2 and 3 to stop reclaim/compaction calling
> wait_on_page_writeback() should be safe.
>
> include/trace/events/vmscan.h | 40 ++-----
> mm/vmscan.c | 263 ++++-------------------------------------
> 2 files changed, 37 insertions(+), 266 deletions(-)
>
> --
> 1.7.9.2
>
It might be a naive question, but what do we do about users with the
following in their .config file?
# CONFIG_COMPACTION is not set
--Ying
^ permalink raw reply [flat|nested] 36+ messages in thread
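The derived rows in the quoted MMTests vmstat table (efficiency, velocity, percentage of direct scans) and the "80% less scanning, 37% less reclaim" remark can be re-derived from the raw counters in that table. The following is a minimal sketch, not the MMTests implementation: the function names are made up for illustration, and truncating percentages to whole numbers is an assumption inferred from the published figures.

```python
# Raw counters copied from the quoted MMTests tables, in column order:
# 3.3.0-vanilla, rc2-vanilla, lumpyremove-v2r3, nosync-v2r3.
elapsed_seconds = [1161.73, 1152.49, 1159.55, 1161.44]
kswapd_scanned = [6271173, 1287481, 1278174, 1219095]
kswapd_reclaimed = [2029240, 1281025, 1260708, 1201583]
direct_scanned = [53494, 57731, 34406, 113015]
direct_reclaimed = [1468, 14564, 16649, 92456]


def efficiency(reclaimed, scanned):
    """Pages reclaimed per 100 pages scanned, truncated (assumed rounding)."""
    return reclaimed * 100 // scanned


def velocity(scanned, seconds):
    """Pages scanned per second of elapsed test time."""
    return round(scanned / seconds, 3)


def percent_less(before, after):
    """Relative drop from `before` to `after`, in percent."""
    return (before - after) * 100 / before


kswapd_eff = [efficiency(r, s) for r, s in zip(kswapd_reclaimed, kswapd_scanned)]
direct_eff = [efficiency(r, s) for r, s in zip(direct_reclaimed, direct_scanned)]
kswapd_vel = [velocity(s, t) for s, t in zip(kswapd_scanned, elapsed_seconds)]
direct_vel = [velocity(s, t) for s, t in zip(direct_scanned, elapsed_seconds)]
direct_share = [d * 100 // (d + k) for d, k in zip(direct_scanned, kswapd_scanned)]

print("Kswapd efficiency (%):", kswapd_eff)
print("Direct efficiency (%):", direct_eff)
print("Kswapd velocity:", kswapd_vel)
print("Direct velocity:", direct_vel)
print("Percentage direct scans:", direct_share)

# 3.3.0-vanilla versus 3.4.0-rc2-vanilla, kswapd only.
print("kswapd scanning drop: %.1f%%" % percent_less(kswapd_scanned[0], kswapd_scanned[1]))
print("kswapd reclaim drop:  %.1f%%" % percent_less(kswapd_reclaimed[0], kswapd_reclaimed[1]))
```

Under these assumptions the computed rows agree with the quoted table, and the two drops come out just under 80% and 37% respectively.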
* Re: [PATCH 0/3] Removal of lumpy reclaim V2
2012-04-11 16:38 ` Mel Gorman
@ 2012-04-11 23:54 ` Hugh Dickins
-1 siblings, 0 replies; 36+ messages in thread
From: Hugh Dickins @ 2012-04-11 23:54 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, Rik van Riel, Konstantin Khlebnikov, Ying Han,
Linux-MM, LKML
On Wed, 11 Apr 2012, Mel Gorman wrote:
>
> Removing lumpy reclaim saves almost 900K of text where as the full series
> removes 1200K of text.
Impressive...
>
>    text    data     bss      dec    hex filename
> 6740375 1927944 2260992 10929311 a6c49f vmlinux-3.4.0-rc2-vanilla
> 6739479 1927944 2260992 10928415 a6c11f vmlinux-3.4.0-rc2-lumpyremove-v2
> 6739159 1927944 2260992 10928095 a6bfdf vmlinux-3.4.0-rc2-nosync-v2
... but I fear you meant " bytes" instead of "K" ;)
Hugh
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 0/3] Removal of lumpy reclaim V2
2012-04-11 23:54 ` Hugh Dickins
@ 2012-04-12 5:44 ` Mel Gorman
-1 siblings, 0 replies; 36+ messages in thread
From: Mel Gorman @ 2012-04-12 5:44 UTC (permalink / raw)
To: Hugh Dickins
Cc: Andrew Morton, Rik van Riel, Konstantin Khlebnikov, Ying Han,
Linux-MM, LKML
On Wed, Apr 11, 2012 at 04:54:18PM -0700, Hugh Dickins wrote:
> On Wed, 11 Apr 2012, Mel Gorman wrote:
> >
> > Removing lumpy reclaim saves almost 900K of text where as the full series
> > removes 1200K of text.
>
> Impressive...
>
> >
> >    text    data     bss      dec    hex filename
> > 6740375 1927944 2260992 10929311 a6c49f vmlinux-3.4.0-rc2-vanilla
> > 6739479 1927944 2260992 10928415 a6c11f vmlinux-3.4.0-rc2-lumpyremove-v2
> > 6739159 1927944 2260992 10928095 a6bfdf vmlinux-3.4.0-rc2-nosync-v2
>
> ... but I fear you meant " bytes" instead of "K" ;)
>
Whoops, I do :)
--
Mel Gorman
SUSE Labs
^ permalink raw reply [flat|nested] 36+ messages in thread
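The correction agreed on above (bytes, not K) can be checked directly from the "text" column of the quoted size output. A minimal sketch, assuming only the three figures in that table; the variable names are illustrative:

```python
# "text" column from the quoted size output, in bytes.
text_bytes = {
    "vmlinux-3.4.0-rc2-vanilla": 6740375,
    "vmlinux-3.4.0-rc2-lumpyremove-v2": 6739479,
    "vmlinux-3.4.0-rc2-nosync-v2": 6739159,
}

baseline = text_bytes["vmlinux-3.4.0-rc2-vanilla"]

# Bytes of text saved relative to the vanilla build.
savings = {
    name: baseline - size
    for name, size in text_bytes.items()
    if name != "vmlinux-3.4.0-rc2-vanilla"
}

for name, saved in savings.items():
    print(f"{name}: {saved} bytes smaller than vanilla")
```

The differences are 896 and 1216 bytes, which matches "almost 900" and "1200" once the unit is read as bytes.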
* Re: [PATCH 0/3] Removal of lumpy reclaim V2
2012-04-11 23:37 ` Ying Han
@ 2012-04-12 5:49 ` Mel Gorman
-1 siblings, 0 replies; 36+ messages in thread
From: Mel Gorman @ 2012-04-12 5:49 UTC (permalink / raw)
To: Ying Han
Cc: Andrew Morton, Rik van Riel, Konstantin Khlebnikov, Hugh Dickins,
Linux-MM, LKML
On Wed, Apr 11, 2012 at 04:37:00PM -0700, Ying Han wrote:
> > In 3.2, kswapd was doing a bunch of async writes of pages but
> > reclaim/compaction was never reaching a point where it was doing sync
> > IO. This does not guarantee that reclaim/compaction was not calling
> > wait_on_page_writeback() but I would consider it unlikely. It indicates
> > that merging patches 2 and 3 to stop reclaim/compaction calling
> > wait_on_page_writeback() should be safe.
> >
> > include/trace/events/vmscan.h | 40 ++-----
> > mm/vmscan.c | 263 ++++-------------------------------------
> > 2 files changed, 37 insertions(+), 266 deletions(-)
> >
> > --
> > 1.7.9.2
> >
>
> It might be a naive question, but what do we do about users with the
> following in their .config file?
>
> # CONFIG_COMPACTION is not set
>
After lumpy reclaim is removed, page reclaim will reclaim order-0 pages
and a high-order page will only be freed by chance. It remains to be seen
how many users really depended on lumpy reclaim like this and why they
were not using compaction. Two configurations that may care are NOMMU and
SLUB. NOMMU may not notice as it was already unable to handle anonymous
pages in lumpy reclaim. SLUB will fall back to using order-0 pages.
--
Mel Gorman
SUSE Labs
^ permalink raw reply [flat|nested] 36+ messages in thread
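To illustrate why order-0 reclaim only yields a high-order page by chance, here is a toy model. It is not kernel code and does not model the LRU. It assumes a hypothetical zone in which every page starts in use, and compares freeing the same number of pages at scattered positions against freeing whole aligned blocks, which is the effect lumpy reclaim and compaction aim for. All names and sizes are made up for illustration.

```python
import random


def count_free_blocks(free_pages, total_pages, order):
    """Count naturally aligned 2**order blocks whose pages are all free."""
    block = 1 << order
    return sum(
        all(pfn in free_pages for pfn in range(start, start + block))
        for start in range(0, total_pages, block)
    )


def reclaim_scattered(total_pages, nr_to_free, rng):
    """Toy order-0 reclaim: free pages with no regard to where they sit."""
    return set(rng.sample(range(total_pages), nr_to_free))


def reclaim_contiguous(total_pages, nr_to_free, order, rng):
    """Toy contiguity-aware reclaim: free whole aligned blocks."""
    block = 1 << order
    starts = rng.sample(range(0, total_pages, block), nr_to_free // block)
    return {pfn for start in starts for pfn in range(start, start + block)}


TOTAL_PAGES = 4096  # hypothetical zone size, every page initially in use
ORDER = 4           # looking for 16-page blocks
NR_TO_FREE = 256    # same reclaim budget for both strategies

rng = random.Random(0)
scattered = count_free_blocks(
    reclaim_scattered(TOTAL_PAGES, NR_TO_FREE, rng), TOTAL_PAGES, ORDER
)
contiguous = count_free_blocks(
    reclaim_contiguous(TOTAL_PAGES, NR_TO_FREE, ORDER, rng), TOTAL_PAGES, ORDER
)

print("order-%d blocks freed, scattered reclaim:  %d" % (ORDER, scattered))
print("order-%d blocks freed, contiguous reclaim: %d" % (ORDER, contiguous))
```

With the same budget, the contiguous strategy always produces NR_TO_FREE / 16 free blocks in this model, while the scattered strategy can never produce more than that and typically produces far fewer.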
* Re: [PATCH 0/3] Removal of lumpy reclaim V2
2012-04-11 18:06 ` Rik van Riel
@ 2012-04-12 9:32 ` Mel Gorman
-1 siblings, 0 replies; 36+ messages in thread
From: Mel Gorman @ 2012-04-12 9:32 UTC (permalink / raw)
To: Rik van Riel
Cc: Andrew Morton, Konstantin Khlebnikov, Hugh Dickins, Ying Han,
Linux-MM, LKML
On Wed, Apr 11, 2012 at 02:06:11PM -0400, Rik van Riel wrote:
> On 04/11/2012 01:52 PM, Mel Gorman wrote:
> >On Wed, Apr 11, 2012 at 01:17:02PM -0400, Rik van Riel wrote:
>
> >>Next step: get rid of __GFP_NO_KSWAPD for THP, first
> >>in the -mm kernel
> >>
> >
> >Initially the flag was introduced because kswapd reclaimed too
> >aggressively. One would like to believe that it would be less of a problem
> >now but we must avoid a situation where the CPU and reclaim cost of kswapd
> >exceeds the benefit of allocating a THP.
>
> Since kswapd and the direct reclaim code now use
> the same conditionals for calling compaction,
> the cost ought to be identical.
>
kswapd has different retry logic for reclaim and can stay awake if there
are continual calls to wakeup_kswapd() setting pgdat->kswapd_max_order
and kswapd makes forward progress. It's not identical enough that I would
express 100% confidence that it will be free of problems.
--
Mel Gorman
SUSE Labs
^ permalink raw reply [flat|nested] 36+ messages in thread
Thread overview: 36+ messages
2012-04-11 16:38 [PATCH 0/3] Removal of lumpy reclaim V2 Mel Gorman
2012-04-11 16:38 ` [PATCH 1/3] mm: vmscan: Remove lumpy reclaim Mel Gorman
2012-04-11 17:25 ` Rik van Riel
2012-04-11 18:54 ` KOSAKI Motohiro
2012-04-11 16:38 ` [PATCH 2/3] mm: vmscan: Do not stall on writeback during memory compaction Mel Gorman
2012-04-11 17:26 ` Rik van Riel
2012-04-11 18:51 ` KOSAKI Motohiro
2012-04-11 16:38 ` [PATCH 3/3] mm: vmscan: Remove reclaim_mode_t Mel Gorman
2012-04-11 17:26 ` Rik van Riel
2012-04-11 19:48 ` KOSAKI Motohiro
2012-04-11 17:17 ` [PATCH 0/3] Removal of lumpy reclaim V2 Rik van Riel
2012-04-11 17:52 ` Mel Gorman
2012-04-11 18:06 ` Rik van Riel
2012-04-12 9:32 ` Mel Gorman
2012-04-11 23:37 ` Ying Han
2012-04-12 5:49 ` Mel Gorman
2012-04-11 23:54 ` Hugh Dickins
2012-04-12 5:44 ` Mel Gorman