linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/5] Reduce sequential read overhead
@ 2014-07-09  8:13 Mel Gorman
  2014-07-09  8:13 ` [PATCH 1/6] mm: pagemap: Avoid unnecessary overhead when tracepoints are deactivated Mel Gorman
                   ` (5 more replies)
  0 siblings, 6 replies; 25+ messages in thread
From: Mel Gorman @ 2014-07-09  8:13 UTC
  To: Andrew Morton
  Cc: Linux Kernel, Linux-MM, Linux-FSDevel, Johannes Weiner, Mel Gorman

This was formerly the series "Improve sequential read throughput" which
noted some major differences in performance of tiobench since 3.0. While
there are a number of factors, two that dominated were the introduction
of the fair zone allocation policy and changes to CFQ.

The behaviour of the fair zone allocation policy makes more sense than
tiobench does as a benchmark, and the CFQ defaults were not changed due to
insufficient benchmarking.

This series is what's left. It's one functional fix to the fair zone
allocation policy when used on NUMA machines and a reduction of overhead
in general. tiobench was used for the comparison despite its flaws as an
IO benchmark because in this case we are primarily interested in the
overhead of page allocator and page reclaim activity.

On UMA, it makes little difference to overhead:

          3.16.0-rc3   3.16.0-rc3
             vanilla lowercost-v5
User          383.61      386.77
System        403.83      401.74
Elapsed      5411.50     5413.11

On a 4-socket NUMA machine it's a bit more noticeable:

          3.16.0-rc3   3.16.0-rc3
             vanilla lowercost-v5
User          746.94      802.00
System      65336.22    40852.33
Elapsed     27553.52    27368.46
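
For a sense of scale, the relative change in each row can be computed
directly from the figures above. The following Python sketch is only
illustrative arithmetic; the helper name percent_change is made up for
this example. It puts the NUMA system time reduction at roughly 37%,
while the UMA change is well under 1%.

```python
def percent_change(baseline: float, patched: float) -> float:
    """Return the change from baseline to patched as a percentage of baseline."""
    return (patched - baseline) / baseline * 100.0


# Figures copied from the two tables above (vanilla vs lowercost-v5).
rows = {
    "UMA  System": (403.83, 401.74),
    "NUMA System": (65336.22, 40852.33),
    "NUMA Elapsed": (27553.52, 27368.46),
}

for label, (vanilla, lowercost) in rows.items():
    # Negative values mean the patched kernel used less time.
    print(f"{label:<13} {percent_change(vanilla, lowercost):+7.2f}%")
```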

 include/linux/mmzone.h         | 217 ++++++++++++++++++++++-------------------
 include/trace/events/pagemap.h |  16 ++-
 mm/page_alloc.c                | 122 ++++++++++++-----------
 mm/swap.c                      |   4 +-
 mm/vmscan.c                    |   7 +-
 mm/vmstat.c                    |   9 +-
 6 files changed, 198 insertions(+), 177 deletions(-)

-- 
1.8.4.5


* [PATCH 0/6] Improve sequential read throughput v2
@ 2014-06-25  7:58 Mel Gorman
  2014-06-25  7:58 ` [PATCH 1/6] mm: pagemap: Avoid unnecessary overhead when tracepoints are deactivated Mel Gorman
  0 siblings, 1 reply; 25+ messages in thread
From: Mel Gorman @ 2014-06-25  7:58 UTC
  To: Linux Kernel, Linux-MM, Linux-FSDevel
  Cc: Johannes Weiner, Jens Axboe, Jeff Moyer, Dave Chinner, Mel Gorman

Changelog since v1
o Rebase to v3.16-rc2
o Move CFQ patch to end of series where it can be rejected more easily if necessary
o Introduce a page reclaim patch related to kswapd/fairzone interactions
o Rework fair zone policy patch

IO performance since 3.0 has been a mixed bag. In many respects we are
better, but in some we are worse, and one of the latter is sequential
read throughput. This is visible in a number of benchmarks but I looked
at tiobench most closely. This uses ext3 on a mid-range desktop and
compares against 3.0.

                                      3.16.0-rc2            3.16.0-rc2                 3.0.0
                                         vanilla                cfq600               vanilla
Min    SeqRead-MB/sec-1         120.96 (  0.00%)      140.43 ( 16.10%)      134.04 ( 10.81%)
Min    SeqRead-MB/sec-2         100.73 (  0.00%)      118.18 ( 17.32%)      120.76 ( 19.88%)
Min    SeqRead-MB/sec-4          96.05 (  0.00%)      110.84 ( 15.40%)      114.49 ( 19.20%)
Min    SeqRead-MB/sec-8          82.46 (  0.00%)       92.40 ( 12.05%)       98.04 ( 18.89%)
Min    SeqRead-MB/sec-16         66.37 (  0.00%)       76.68 ( 15.53%)       79.49 ( 19.77%)

This series does not fully restore throughput performance to 3.0 levels
but it brings it acceptably close. While throughput for higher numbers
of threads is lower, it is known that it can be tuned by increasing
target_latency or disabling low_latency, giving higher overall throughput
at the cost of latency and IO fairness.
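
As a sketch of the tuning mentioned above: CFQ exposes low_latency and
target_latency as per-device files under /sys/block/<dev>/queue/iosched/
when it is the active scheduler. The helper names, the device name sda
and the value 600 below are assumptions for illustration only, and
writing to sysfs requires root.

```python
from pathlib import Path


def cfq_tunable_path(device: str, name: str, sysfs_root: str = "/sys") -> Path:
    """Build the path of a CFQ tunable for a block device.

    sysfs_root is a parameter only so the helper can be exercised
    against a scratch directory instead of the real /sys.
    """
    return Path(sysfs_root) / "block" / device / "queue" / "iosched" / name


def set_cfq_tunable(device: str, name: str, value: int,
                    sysfs_root: str = "/sys") -> None:
    """Write an integer value to a CFQ tunable file."""
    cfq_tunable_path(device, name, sysfs_root).write_text(f"{value}\n")


def get_cfq_tunable(device: str, name: str, sysfs_root: str = "/sys") -> int:
    """Read a CFQ tunable back as an integer."""
    return int(cfq_tunable_path(device, name, sysfs_root).read_text().strip())


# Example usage (as root, with CFQ selected for the hypothetical device "sda"):
#   set_cfq_tunable("sda", "low_latency", 0)       # trade latency for throughput
#   set_cfq_tunable("sda", "target_latency", 600)  # illustrative value
```

Either change favours aggregate throughput over per-process latency and
fairness, so it is worth re-measuring both after adjusting them.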

This series is ordered in ascending-likelihood-to-cause-controversy so
that a partial series can still potentially be merged even if parts of it
are nacked (e.g. CFQ). For reference, here is the series without the CFQ
patch at the end.

                                      3.16.0-rc2            3.16.0-rc2                 3.0.0
                                         vanilla             lessdirty               vanilla
Min    SeqRead-MB/sec-1         120.96 (  0.00%)      141.04 ( 16.60%)      134.04 ( 10.81%)
Min    SeqRead-MB/sec-2         100.73 (  0.00%)      116.26 ( 15.42%)      120.76 ( 19.88%)
Min    SeqRead-MB/sec-4          96.05 (  0.00%)      109.52 ( 14.02%)      114.49 ( 19.20%)
Min    SeqRead-MB/sec-8          82.46 (  0.00%)       88.60 (  7.45%)       98.04 ( 18.89%)
Min    SeqRead-MB/sec-16         66.37 (  0.00%)       69.87 (  5.27%)       79.49 ( 19.77%)


 block/cfq-iosched.c            |   2 +-
 include/linux/mmzone.h         | 210 ++++++++++++++++++++++-------------------
 include/linux/writeback.h      |   1 +
 include/trace/events/pagemap.h |  16 ++--
 mm/internal.h                  |   1 +
 mm/mm_init.c                   |   5 +-
 mm/page-writeback.c            |  15 +--
 mm/page_alloc.c                | 206 ++++++++++++++++++++++++++--------------
 mm/swap.c                      |   4 +-
 mm/vmscan.c                    |  16 ++--
 mm/vmstat.c                    |   4 +-
 11 files changed, 285 insertions(+), 195 deletions(-)

-- 
1.8.4.5



end of thread, other threads:[~2014-09-10 20:32 UTC | newest]

Thread overview: 25+ messages
2014-07-09  8:13 [PATCH 0/5] Reduce sequential read overhead Mel Gorman
2014-07-09  8:13 ` [PATCH 1/6] mm: pagemap: Avoid unnecessary overhead when tracepoints are deactivated Mel Gorman
2014-07-10 12:01   ` Johannes Weiner
2014-07-09  8:13 ` [PATCH 2/6] mm: Rearrange zone fields into read-only, page alloc, statistics and page reclaim lines Mel Gorman
2014-07-10 12:06   ` Johannes Weiner
2014-07-09  8:13 ` [PATCH 3/6] mm: Move zone->pages_scanned into a vmstat counter Mel Gorman
2014-07-10 12:08   ` Johannes Weiner
2014-07-09  8:13 ` [PATCH 4/6] mm: vmscan: Only update per-cpu thresholds for online CPU Mel Gorman
2014-07-10 12:09   ` Johannes Weiner
2014-07-09  8:13 ` [PATCH 5/6] mm: page_alloc: Abort fair zone allocation policy when remotes nodes are encountered Mel Gorman
2014-07-10 12:14   ` Johannes Weiner
2014-07-10 12:44     ` Mel Gorman
2014-07-09  8:13 ` [PATCH 6/6] mm: page_alloc: Reduce cost of the fair zone allocation policy Mel Gorman
2014-07-10 12:18   ` Johannes Weiner
2014-08-08 15:27   ` Vlastimil Babka
2014-08-11 12:12     ` Mel Gorman
2014-08-11 12:34       ` Vlastimil Babka
2014-09-02 14:01         ` Johannes Weiner
2014-09-05 10:14           ` [PATCH] mm: page_alloc: Fix setting of ZONE_FAIR_DEPLETED on UP Mel Gorman
2014-09-07  6:32             ` Leon Romanovsky
2014-09-08 11:57               ` [PATCH] mm: page_alloc: Fix setting of ZONE_FAIR_DEPLETED on UP v2 Mel Gorman
2014-09-09 19:53                 ` Andrew Morton
2014-09-10  9:16                   ` Mel Gorman
2014-09-10 20:32                     ` Johannes Weiner
  -- strict thread matches above, loose matches on Subject: below --
2014-06-25  7:58 [PATCH 0/6] Improve sequential read throughput v2 Mel Gorman
2014-06-25  7:58 ` [PATCH 1/6] mm: pagemap: Avoid unnecessary overhead when tracepoints are deactivated Mel Gorman
