All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC v8 0/8] Implement Data Access Monitoring-based Memory Operation Schemes
@ 2020-05-12 11:53 SeongJae Park
  2020-05-12 11:53 ` [RFC v8 1/8] mm/madvise: Export do_madvise() to external GPL modules SeongJae Park
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: SeongJae Park @ 2020-05-12 11:53 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, aarcange, acme,
	alexander.shishkin, amit, benh, brendan.d.gregg, brendanhiggins,
	cai, colin.king, corbet, dwmw, irogers, jolsa, kirill,
	mark.rutland, mgorman, minchan, mingo, namhyung, peterz, rdunlap,
	riel, rientjes, rostedt, sblbir, shakeelb, shuah, sj38.park, snu,
	vbabka, vdavydov.dev, yang.shi, ying.huang, linux-damon,
	linux-mm, linux-doc, linux-kernel

From: SeongJae Park <sjpark@amazon.de>

DAMON[1] can be used as a primitive for data access awared memory management
optimizations.  That said, users who want such optimizations should run DAMON,
read the monitoring results, analyze it, plan a new memory management scheme,
and apply the new scheme by themselves.  Such efforts will be inevitable for
some complicated optimizations.

However, in many other cases, the users would simply want the system to apply a
memory management action to a memory region of a specific size having a
specific access frequency for a specific time.  For example, "page out a memory
region larger than 100 MiB keeping only rare accesses more than 2 minutes", or
"Do not use THP for a memory region larger than 2 MiB rarely accessed for more
than 1 seconds".

This RFC patchset makes DAMON to handle such data access monitoring-based
operation schemes.  With this change, users can do the data access aware
optimizations by simply specifying their schemes to DAMON.

[1] https://lore.kernel.org/linux-mm/20200406130938.14066-1-sjpark@amazon.com/


Evaluations
===========

Setup
-----

On my personal QEMU/KVM based virtual machine on an Intel i7 host machine
running Ubuntu 18.04, I measure runtime and consumed system memory while
running various realistic workloads with several configurations.  I use 13 and
12 workloads in PARSEC3[3] and SPLASH-2X[4] benchmark suites, respectively.  I
personally use another wrapper scripts[5] for setup and run of the workloads.
On top of this patchset, we also applied the DAMON-based operation schemes
patchset[6] for this evaluation.

Measurement
~~~~~~~~~~~

For the measurement of the amount of consumed memory in system global scope, I
drop caches before starting each of the workloads and monitor 'MemFree' in the
'/proc/meminfo' file.  To make results more stable, I repeat the runs 5 times
and average results.  You can get stdev, min, and max of the numbers among the
repeated runs in appendix below.

Configurations
~~~~~~~~~~~~~~

The configurations I use are as below.

orig: Linux v5.5 with 'madvise' THP policy
rec: 'orig' plus DAMON running with record feature
thp: same with 'orig', but use 'always' THP policy
ethp: 'orig' plus a DAMON operation scheme[6], 'efficient THP'
prcl: 'orig' plus a DAMON operation scheme, 'proactive reclaim[7]'

I use 'rec' for measurement of DAMON overheads to target workloads and system
memory.  The remaining configs including 'thp', 'ethp', and 'prcl' are for
measurement of DAMON monitoring accuracy.

'ethp' and 'prcl' is simple DAMON-based operation schemes developed for
proof of concepts of DAMON.  'ethp' reduces memory space waste of THP by using
DAMON for decision of promotions and demotion for huge pages, while 'prcl' is
as similar as the original work.  Those are implemented as below:

# format: <min/max size> <min/max frequency (0-100)> <min/max age> <action>
# ethp: Use huge pages if a region >2MB shows >5% access rate, use regular
# pages if a region >2MB shows <5% access rate for >1 second
2M null    5 null    null null    hugepage
2M null    null 5    1s null      nohugepage

# prcl: If a region >4KB shows <5% access rate for >5 seconds, page out.
4K null    null 5    500ms null      pageout

Note that both 'ethp' and 'prcl' are designed with my only straightforward
intuition, because those are for only proof of concepts and monitoring accuracy
of DAMON.  In other words, those are not for production.  For production use,
those should be tuned more.


[1] "Redis latency problems troubleshooting", https://redis.io/topics/latency
[2] "Disable Transparent Huge Pages (THP)",
    https://docs.mongodb.com/manual/tutorial/transparent-huge-pages/
[3] "The PARSEC Becnhmark Suite", https://parsec.cs.princeton.edu/index.htm
[4] "SPLASH-2x", https://parsec.cs.princeton.edu/parsec3-doc.htm#splash2x
[5] "parsec3_on_ubuntu", https://github.com/sjp38/parsec3_on_ubuntu
[6] "[RFC v4 0/7] Implement Data Access Monitoring-based Memory Operation
    Schemes",
    https://lore.kernel.org/linux-mm/20200303121406.20954-1-sjpark@amazon.com/
[7] "Proactively reclaiming idle memory", https://lwn.net/Articles/787611/


Results
-------

Below two tables show the measurement results.  The runtimes are in seconds
while the memory usages are in KiB.  Each configurations except 'orig' shows
its overhead relative to 'orig' in percent within parenthesises.

runtime                 orig     rec      (overhead) thp      (overhead) ethp     (overhead) prcl     (overhead)
parsec3/blackscholes    107.065  107.478  (0.39)     106.682  (-0.36)    107.365  (0.28)     111.811  (4.43)
parsec3/bodytrack       79.256   79.450   (0.25)     78.645   (-0.77)    79.314   (0.07)     80.305   (1.32)
parsec3/canneal         139.497  141.181  (1.21)     121.526  (-12.88)   130.074  (-6.75)    154.644  (10.86)
parsec3/dedup           11.879   11.873   (-0.05)    11.693   (-1.56)    11.948   (0.58)     12.694   (6.86)
parsec3/facesim         207.814  208.467  (0.31)     203.743  (-1.96)    206.759  (-0.51)    214.603  (3.27)
parsec3/ferret          190.124  190.955  (0.44)     189.575  (-0.29)    190.852  (0.38)     191.548  (0.75)
parsec3/fluidanimate    211.046  212.282  (0.59)     208.832  (-1.05)    212.143  (0.52)     218.774  (3.66)
parsec3/freqmine        289.259  290.096  (0.29)     288.510  (-0.26)    290.177  (0.32)     296.269  (2.42)
parsec3/raytrace        118.522  119.701  (0.99)     119.469  (0.80)     118.964  (0.37)     130.584  (10.18)
parsec3/streamcluster   323.619  327.830  (1.30)     283.374  (-12.44)   287.837  (-11.06)   330.216  (2.04)
parsec3/swaptions       154.007  155.714  (1.11)     154.767  (0.49)     154.955  (0.62)     155.256  (0.81)
parsec3/vips            58.822   58.750   (-0.12)    58.564   (-0.44)    58.807   (-0.02)    60.320   (2.55)
parsec3/x264            67.335   72.516   (7.69)     64.680   (-3.94)    70.096   (4.10)     72.465   (7.62)
splash2x/barnes         80.335   80.979   (0.80)     73.758   (-8.19)    78.874   (-1.82)    99.226   (23.52)
splash2x/fft            33.441   33.312   (-0.38)    22.909   (-31.49)   31.561   (-5.62)    41.496   (24.09)
splash2x/lu_cb          85.691   85.706   (0.02)     84.352   (-1.56)    85.943   (0.29)     88.914   (3.76)
splash2x/lu_ncb         92.338   92.749   (0.45)     89.773   (-2.78)    92.888   (0.60)     94.104   (1.91)
splash2x/ocean_cp       44.542   44.795   (0.57)     42.958   (-3.56)    44.061   (-1.08)    49.091   (10.21)
splash2x/ocean_ncp      82.101   82.006   (-0.12)    51.418   (-37.37)   64.496   (-21.44)   105.998  (29.11)
splash2x/radiosity      91.296   91.353   (0.06)     90.668   (-0.69)    91.379   (0.09)     103.265  (13.11)
splash2x/radix          31.243   31.417   (0.56)     25.176   (-19.42)   30.297   (-3.03)    38.474   (23.14)
splash2x/raytrace       84.405   84.863   (0.54)     83.498   (-1.08)    83.637   (-0.91)    85.166   (0.90)
splash2x/volrend        87.516   88.156   (0.73)     86.311   (-1.38)    87.016   (-0.57)    88.318   (0.92)
splash2x/water_nsquared 233.515  233.826  (0.13)     221.169  (-5.29)    224.430  (-3.89)    236.929  (1.46)
splash2x/water_spatial  89.207   89.448   (0.27)     89.396   (0.21)     89.826   (0.69)     97.700   (9.52)
total                   2993.890 3014.920 (0.70)     2851.460 (-4.76)    2923.710 (-2.34)    3158.180 (5.49)


memused.avg             orig         rec          (overhead) thp          (overhead) ethp         (overhead) prcl         (overhead)
parsec3/blackscholes    1819997.200  1832626.000  (0.69)     1821707.000  (0.09)     1830010.400  (0.55)     1651016.200  (-9.28)
parsec3/bodytrack       1416437.600  1430462.200  (0.99)     1420736.400  (0.30)     1428355.600  (0.84)     1430327.000  (0.98)
parsec3/canneal         1040414.400  1050736.800  (0.99)     1041515.600  (0.11)     1048562.200  (0.78)     1049049.400  (0.83)
parsec3/dedup           2414431.800  2454260.400  (1.65)     2423175.400  (0.36)     2396560.200  (-0.74)    2379898.200  (-1.43)
parsec3/facesim         540432.200   551410.200   (2.03)     545978.200   (1.03)     558558.400   (3.35)     483755.400   (-10.49)
parsec3/ferret          318728.600   333971.800   (4.78)     322158.200   (1.08)     332889.200   (4.44)     327896.400   (2.88)
parsec3/fluidanimate    576917.800   585126.600   (1.42)     575123.200   (-0.31)    585429.200   (1.48)     484810.600   (-15.97)
parsec3/freqmine        987882.200   997030.600   (0.93)     990429.200   (0.26)     998484.000   (1.07)     770740.200   (-21.98)
parsec3/raytrace        1747059.800  1752904.000  (0.33)     1738853.600  (-0.47)    1753948.600  (0.39)     1578118.000  (-9.67)
parsec3/streamcluster   121857.600   133934.400   (9.91)     121777.800   (-0.07)    133145.800   (9.26)     131512.800   (7.92)
parsec3/swaptions       14123.000    29254.400    (107.14)   14017.200    (-0.75)    26470.600    (87.43)    28429.800    (101.30)
parsec3/vips            2957631.800  2972884.400  (0.52)     2938855.400  (-0.63)    2960746.000  (0.11)     2946850.800  (-0.36)
parsec3/x264            3184777.200  3214527.400  (0.93)     3177061.000  (-0.24)    3192446.600  (0.24)     3185851.800  (0.03)
splash2x/barnes         1209737.200  1214763.200  (0.42)     1242138.400  (2.68)     1215857.600  (0.51)     994280.800   (-17.81)
splash2x/fft            9362799.400  9178844.600  (-1.96)    9264052.600  (-1.05)    9164996.600  (-2.11)    9452048.200  (0.95)
splash2x/lu_cb          515716.000   524071.600   (1.62)     521226.200   (1.07)     524261.400   (1.66)     372910.200   (-27.69)
splash2x/lu_ncb         512898.200   523057.600   (1.98)     520630.800   (1.51)     523779.000   (2.12)     446282.400   (-12.99)
splash2x/ocean_cp       3346038.000  3288703.600  (-1.71)    3386906.600  (1.22)     3330937.200  (-0.45)    3266442.400  (-2.38)
splash2x/ocean_ncp      3886945.600  3871894.000  (-0.39)    7066192.000  (81.79)    5065229.800  (30.31)    3652078.200  (-6.04)
splash2x/radiosity      1467107.200  1468850.800  (0.12)     1481292.600  (0.97)     1470335.800  (0.22)     530923.400   (-63.81)
splash2x/radix          1708330.800  1699792.200  (-0.50)    1352708.600  (-20.82)   1601339.200  (-6.26)    2043947.800  (19.65)
splash2x/raytrace       44817.200    59047.800    (31.75)    52010.200    (16.05)    60407.200    (34.79)    53916.400    (20.30)
splash2x/volrend        151534.200   167791.400   (10.73)    151759.000   (0.15)     165012.400   (8.89)     160864.600   (6.16)
splash2x/water_nsquared 46549.400    61846.800    (32.86)    51741.200    (11.15)    59214.400    (27.21)    91869.400    (97.36)
splash2x/water_spatial  669085.200   675929.200   (1.02)     665924.600   (-0.47)    676218.200   (1.07)     538430.200   (-19.53)
total                   40062200.000 40073700.000 (0.03)     42888000.000 (7.05)     41103100.000 (2.60)     38052200.000 (-5.02)


DAMON Overheads
~~~~~~~~~~~~~~~

In total, DAMON recording feature incurs 0.70% runtime overhead and 0.03%
memory space overhead.

For convenience test run of 'rec', I use a Python wrapper.  The wrapper
constantly consumes about 10-15MB of memory.  This becomes high memory overhead
if the target workload has small memory footprint.  Nonetheless, the overheads
are not from DAMON, but from the wrapper, and thus should be ignored.  This
fake memory overhead continues in 'ethp' and 'prcl', as those configurations
are also using the Python wrapper.


Efficient THP
~~~~~~~~~~~~~

THP 'always' enabled policy achieves 4.76% speedup but incurs 7.05% memory
overhead.  It achieves 37.37% speedup in best case, but 81.79% memory overhead
in worst case.  Interestingly, both the best and worst case are with
'splash2x/ocean_ncp').

The 2-lines implementation of data access monitoring based THP version ('ethp')
shows 2.34% speedup and 2.60% memory overhead.  In other words, 'ethp' removes
63.12% of THP memory waste while preserving 49.15% of THP speedup in total.  In
case of the 'splash2x/ocean_ncp', 'ethp' removes 62.94% of THP memory waste
while preserving 57.37% of THP speedup.


Proactive Reclamation
~~~~~~~~~~~~~~~~~~~~

As same to the original work, I use 'zram' swap device for this configuration.

In total, our 1 line implementation of Proactive Reclamation, 'prcl', incurred
8.41% runtime overhead in total while achieving 5.83% system memory usage
reduction.

Nonetheless, as the memory usage is calculated with 'MemFree' in
'/proc/meminfo', it contains the SwapCached pages.  As the swapcached pages can
be easily evicted, I also measured the residential set size of the workloads:

rss.avg                 orig         rec          (overhead) thp          (overhead) ethp         (overhead) prcl         (overhead)
parsec3/blackscholes    591452.000   591466.400   (0.00)     593145.200   (0.29)     590609.400   (-0.14)    324379.000   (-45.16)
parsec3/bodytrack       32458.600    32352.200    (-0.33)    32218.400    (-0.74)    32376.400    (-0.25)    27186.000    (-16.24)
parsec3/canneal         841311.600   839888.400   (-0.17)    837008.400   (-0.51)    837811.000   (-0.42)    823276.200   (-2.14)
parsec3/dedup           1219096.600  1228038.800  (0.73)     1235610.800  (1.35)     1214267.000  (-0.40)    992031.000   (-18.63)
parsec3/facesim         311322.200   311574.400   (0.08)     316277.000   (1.59)     312593.800   (0.41)     188789.400   (-39.36)
parsec3/ferret          99536.600    99556.800    (0.02)     102366.000   (2.84)     99799.000    (0.26)     88392.000    (-11.20)
parsec3/fluidanimate    531893.600   531856.000   (-0.01)    532143.400   (0.05)     532190.200   (0.06)     421798.800   (-20.70)
parsec3/freqmine        553533.200   552730.400   (-0.15)    555642.600   (0.38)     553895.400   (0.07)     78335.000    (-85.85)
parsec3/raytrace        894094.200   894849.000   (0.08)     889964.000   (-0.46)    892865.000   (-0.14)    332911.800   (-62.77)
parsec3/streamcluster   110938.000   110968.200   (0.03)     111673.400   (0.66)     111312.200   (0.34)     109911.200   (-0.93)
parsec3/swaptions       5630.000     5634.800     (0.09)     5656.600     (0.47)     5692.000     (1.10)     4028.400     (-28.45)
parsec3/vips            32107.000    32045.200    (-0.19)    32207.800    (0.31)     32293.800    (0.58)     29093.600    (-9.39)
parsec3/x264            81926.000    82143.000    (0.26)     83258.400    (1.63)     82570.600    (0.79)     80651.800    (-1.56)
splash2x/barnes         1215468.800  1217889.800  (0.20)     1222006.800  (0.54)     1217425.600  (0.16)     752405.200   (-38.10)
splash2x/fft            9584734.800  9568872.800  (-0.17)    9660321.400  (0.79)     9646012.000  (0.64)     8367492.800  (-12.70)
splash2x/lu_cb          510555.400   510807.400   (0.05)     514448.600   (0.76)     509281.800   (-0.25)    349272.200   (-31.59)
splash2x/lu_ncb         510310.000   508915.600   (-0.27)    513886.000   (0.70)     510288.400   (-0.00)    431521.800   (-15.44)
splash2x/ocean_cp       3408724.400  3408424.600  (-0.01)    3446054.400  (1.10)     3419536.200  (0.32)     3173818.600  (-6.89)
splash2x/ocean_ncp      3923539.600  3922605.400  (-0.02)    7175526.600  (82.88)    5152558.800  (31.32)    3475756.000  (-11.41)
splash2x/radiosity      1476050.000  1475470.400  (-0.04)    1485747.000  (0.66)     1476232.600  (0.01)     269512.200   (-81.74)
splash2x/radix          1756385.400  1752676.000  (-0.21)    1431621.600  (-18.49)   1711460.800  (-2.56)    1923448.200  (9.51)
splash2x/raytrace       23286.400    23311.200    (0.11)     28440.800    (22.13)    26977.200    (15.85)    15685.200    (-32.64)
splash2x/volrend        44089.400    44125.600    (0.08)     44436.600    (0.79)     44250.400    (0.37)     27616.800    (-37.36)
splash2x/water_nsquared 29437.600    29403.200    (-0.12)    29817.400    (1.29)     30040.000    (2.05)     25369.600    (-13.82)
splash2x/water_spatial  656264.400   656566.400   (0.05)     656016.400   (-0.04)    656420.200   (0.02)     474480.400   (-27.70)
total                   28444100.000 28432200.000 (-0.04)    31535300.000 (10.87)    29698900.000 (4.41)     22787200.000 (-19.89)

In total, 19.89% of residential sets were reduced.

With parsec3/freqmine, 'prcl' reduced 85.85% of residential sets and 21.98% of
system memory usage while incurring only 2.42% runtime overhead.


Baseline and Complete Git Tree
==============================


The patches are based on the v5.6 plus v11 DAMON patchset[1] and Minchan's
``do_madvise()`` patch[2].  Minchan's patch was necessary for reuse of
``madvise()`` code in DAMON.  You can also clone the complete git tree:

    $ git clone git://github.com/sjp38/linux -b damos/rfc/v8

The web is also available:
https://github.com/sjp38/linux/releases/tag/damos/rfc/v8

The latest DAMON development tree is also available at:
https://github.com/sjp38/linux/tree/damon/master


[1] https://lore.kernel.org/linux-mm/20200511123302.12520-1-sjpark@amazon.com/
[2] https://lore.kernel.org/linux-mm/20200302193630.68771-2-minchan@kernel.org/


Sequence Of Patches
===================

The first patch allows DAMON to reuse ``madvise()`` code for the actions.  The
second patch accounts age of each region.  The third patch implements the
handling of the schemes in DAMON and exports a kernel space programming
interface for it.  The fourth patch implements a debugfs interface for the
privileged people and programs.  The fifth and sixth patches each adds kunit
tests and selftests for these changes, and the seventhe patch adds human
friendly schemes support to the user space tool for DAMON.  Finally, the eighth
patch documents this new feature in the document.


Patch History
=============

Changes from RFC v7
(https://lore.kernel.org/linux-mm/20200429124540.32232-1-sjpark@amazon.com/)
 - Rebase on DAMON v11 patchset
 - Add documentation

Changes from RFC v6
(https://lore.kernel.org/linux-mm/20200407100007.3894-1-sjpark@amazon.com/)
 - Rebase on DAMON v9 patchset
 - Cleanup code and fix typos (Stefan Nuernberger)

Changes from RFC v5
(https://lore.kernel.org/linux-mm/20200330115042.17431-1-sjpark@amazon.com/)
 - Rebase on DAMON v8 patchset
 - Update test results
 - Fix DAMON userspace tool crash on signal handling
 - Fix checkpatch warnings

Changes from RFC v4
(https://lore.kernel.org/linux-mm/20200303121406.20954-1-sjpark@amazon.com/)
 - Handle CONFIG_ADVISE_SYSCALL
 - Clean up code (Jonathan Cameron)
 - Update test results
 - Rebase on v5.6 + DAMON v7

Changes from RFC v3
(https://lore.kernel.org/linux-mm/20200225102300.23895-1-sjpark@amazon.com/)
 - Add Reviewed-by from Brendan Higgins
 - Code cleanup: Modularize madvise() call
 - Fix a trivial bug in the wrapper python script
 - Add more stable and detailed evaluation results with updated ETHP scheme

Changes from RFC v2
(https://lore.kernel.org/linux-mm/20200218085309.18346-1-sjpark@amazon.com/)
 - Fix aging mechanism for more better 'old region' selection
 - Add more kunittests and kselftests for this patchset
 - Support more human friedly description and application of 'schemes'

Changes from RFC v1
(https://lore.kernel.org/linux-mm/20200210150921.32482-1-sjpark@amazon.com/)
 - Properly adjust age accounting related properties after splitting, merging,
   and action applying


SeongJae Park (8):
  mm/madvise: Export do_madvise() to external GPL modules
  mm/damon: Account age of target regions
  mm/damon: Implement data access monitoring-based operation schemes
  mm/damon/schemes: Implement a debugfs interface
  mm/damon-test: Add kunit test case for regions age accounting
  mm/damon/selftests: Add 'schemes' debugfs tests
  damon/tools: Support more human friendly 'schemes' control
  Documentation/admin-guide/mm: Document DAMON-based operation schemes

 .../admin-guide/mm/data_access_monitor.rst    | 100 +++-
 include/linux/damon.h                         |  29 ++
 mm/damon-test.h                               |   5 +
 mm/damon.c                                    | 429 +++++++++++++++++-
 mm/madvise.c                                  |   1 +
 tools/damon/_convert_damos.py                 | 127 ++++++
 tools/damon/_damon.py                         | 143 ++++++
 tools/damon/damo                              |   7 +
 tools/damon/record.py                         | 135 +-----
 tools/damon/schemes.py                        | 105 +++++
 .../testing/selftests/damon/debugfs_attrs.sh  |  29 ++
 11 files changed, 979 insertions(+), 131 deletions(-)
 create mode 100755 tools/damon/_convert_damos.py
 create mode 100644 tools/damon/_damon.py
 create mode 100644 tools/damon/schemes.py

-- 
2.17.1

==================================== >8 =======================================

Appendix: Stdev / min / max numbers among the repeated runs
===========================================================

Below are stdev/min/max of each number in the 5 repeated runs.

runtime_avg             orig    rec     thp     ethp    prcl
parsec3/blackscholes    107.065 107.478 106.682 107.365 111.811
parsec3/bodytrack       79.256  79.450  78.645  79.314  80.305
parsec3/canneal         139.497 141.181 121.526 130.074 154.644
parsec3/dedup           11.879  11.873  11.693  11.948  12.694
parsec3/facesim         207.814 208.467 203.743 206.759 214.603
parsec3/ferret          190.124 190.955 189.575 190.852 191.548
parsec3/fluidanimate    211.046 212.282 208.832 212.143 218.774
parsec3/freqmine        289.259 290.096 288.510 290.177 296.269
parsec3/raytrace        118.522 119.701 119.469 118.964 130.584
parsec3/streamcluster   323.619 327.830 283.374 287.837 330.216
parsec3/swaptions       154.007 155.714 154.767 154.955 155.256
parsec3/vips            58.822  58.750  58.564  58.807  60.320
parsec3/x264            67.335  72.516  64.680  70.096  72.465
splash2x/barnes         80.335  80.979  73.758  78.874  99.226
splash2x/fft            33.441  33.312  22.909  31.561  41.496
splash2x/lu_cb          85.691  85.706  84.352  85.943  88.914
splash2x/lu_ncb         92.338  92.749  89.773  92.888  94.104
splash2x/ocean_cp       44.542  44.795  42.958  44.061  49.091
splash2x/ocean_ncp      82.101  82.006  51.418  64.496  105.998
splash2x/radiosity      91.296  91.353  90.668  91.379  103.265
splash2x/radix          31.243  31.417  25.176  30.297  38.474
splash2x/raytrace       84.405  84.863  83.498  83.637  85.166
splash2x/volrend        87.516  88.156  86.311  87.016  88.318
splash2x/water_nsquared 233.515 233.826 221.169 224.430 236.929
splash2x/water_spatial  89.207  89.448  89.396  89.826  97.700


memused.avg_avg         orig        rec         thp         ethp        prcl
parsec3/blackscholes    1819997.200 1832626.000 1821707.000 1830010.400 1651016.200
parsec3/bodytrack       1416437.600 1430462.200 1420736.400 1428355.600 1430327.000
parsec3/canneal         1040414.400 1050736.800 1041515.600 1048562.200 1049049.400
parsec3/dedup           2414431.800 2454260.400 2423175.400 2396560.200 2379898.200
parsec3/facesim         540432.200  551410.200  545978.200  558558.400  483755.400
parsec3/ferret          318728.600  333971.800  322158.200  332889.200  327896.400
parsec3/fluidanimate    576917.800  585126.600  575123.200  585429.200  484810.600
parsec3/freqmine        987882.200  997030.600  990429.200  998484.000  770740.200
parsec3/raytrace        1747059.800 1752904.000 1738853.600 1753948.600 1578118.000
parsec3/streamcluster   121857.600  133934.400  121777.800  133145.800  131512.800
parsec3/swaptions       14123.000   29254.400   14017.200   26470.600   28429.800
parsec3/vips            2957631.800 2972884.400 2938855.400 2960746.000 2946850.800
parsec3/x264            3184777.200 3214527.400 3177061.000 3192446.600 3185851.800
splash2x/barnes         1209737.200 1214763.200 1242138.400 1215857.600 994280.800
splash2x/fft            9362799.400 9178844.600 9264052.600 9164996.600 9452048.200
splash2x/lu_cb          515716.000  524071.600  521226.200  524261.400  372910.200
splash2x/lu_ncb         512898.200  523057.600  520630.800  523779.000  446282.400
splash2x/ocean_cp       3346038.000 3288703.600 3386906.600 3330937.200 3266442.400
splash2x/ocean_ncp      3886945.600 3871894.000 7066192.000 5065229.800 3652078.200
splash2x/radiosity      1467107.200 1468850.800 1481292.600 1470335.800 530923.400
splash2x/radix          1708330.800 1699792.200 1352708.600 1601339.200 2043947.800
splash2x/raytrace       44817.200   59047.800   52010.200   60407.200   53916.400
splash2x/volrend        151534.200  167791.400  151759.000  165012.400  160864.600
splash2x/water_nsquared 46549.400   61846.800   51741.200   59214.400   91869.400
splash2x/water_spatial  669085.200  675929.200  665924.600  676218.200  538430.200


rss.avg_avg             orig        rec         thp         ethp        prcl
parsec3/blackscholes    591452.000  591466.400  593145.200  590609.400  324379.000
parsec3/bodytrack       32458.600   32352.200   32218.400   32376.400   27186.000
parsec3/canneal         841311.600  839888.400  837008.400  837811.000  823276.200
parsec3/dedup           1219096.600 1228038.800 1235610.800 1214267.000 992031.000
parsec3/facesim         311322.200  311574.400  316277.000  312593.800  188789.400
parsec3/ferret          99536.600   99556.800   102366.000  99799.000   88392.000
parsec3/fluidanimate    531893.600  531856.000  532143.400  532190.200  421798.800
parsec3/freqmine        553533.200  552730.400  555642.600  553895.400  78335.000
parsec3/raytrace        894094.200  894849.000  889964.000  892865.000  332911.800
parsec3/streamcluster   110938.000  110968.200  111673.400  111312.200  109911.200
parsec3/swaptions       5630.000    5634.800    5656.600    5692.000    4028.400
parsec3/vips            32107.000   32045.200   32207.800   32293.800   29093.600
parsec3/x264            81926.000   82143.000   83258.400   82570.600   80651.800
splash2x/barnes         1215468.800 1217889.800 1222006.800 1217425.600 752405.200
splash2x/fft            9584734.800 9568872.800 9660321.400 9646012.000 8367492.800
splash2x/lu_cb          510555.400  510807.400  514448.600  509281.800  349272.200
splash2x/lu_ncb         510310.000  508915.600  513886.000  510288.400  431521.800
splash2x/ocean_cp       3408724.400 3408424.600 3446054.400 3419536.200 3173818.600
splash2x/ocean_ncp      3923539.600 3922605.400 7175526.600 5152558.800 3475756.000
splash2x/radiosity      1476050.000 1475470.400 1485747.000 1476232.600 269512.200
splash2x/radix          1756385.400 1752676.000 1431621.600 1711460.800 1923448.200
splash2x/raytrace       23286.400   23311.200   28440.800   26977.200   15685.200
splash2x/volrend        44089.400   44125.600   44436.600   44250.400   27616.800
splash2x/water_nsquared 29437.600   29403.200   29817.400   30040.000   25369.600
splash2x/water_spatial  656264.400  656566.400  656016.400  656420.200  474480.400


runtime_stdev           orig  rec   thp   ethp   prcl
parsec3/blackscholes    1.002 0.968 1.072 0.699  1.353
parsec3/bodytrack       0.772 0.691 0.546 0.544  0.999
parsec3/canneal         5.381 3.346 1.187 1.693  6.379
parsec3/dedup           0.098 0.063 0.047 0.119  0.835
parsec3/facesim         0.842 0.407 0.699 0.825  0.745
parsec3/ferret          0.343 0.809 0.214 1.009  0.453
parsec3/fluidanimate    0.732 0.864 1.506 1.420  1.539
parsec3/freqmine        3.491 2.325 2.746 1.057  2.781
parsec3/raytrace        0.946 0.919 1.315 0.746  0.812
parsec3/streamcluster   1.294 2.267 2.133 0.837  2.858
parsec3/swaptions       0.918 1.154 1.433 0.789  0.983
parsec3/vips            0.157 0.130 0.150 0.313  0.663
parsec3/x264            5.485 3.596 3.120 3.251  4.454
splash2x/barnes         0.401 0.827 0.213 0.734  5.511
splash2x/fft            0.245 0.372 0.067 2.244  4.638
splash2x/lu_cb          1.439 0.306 0.082 0.928  1.463
splash2x/lu_ncb         0.339 0.165 0.208 0.266  1.054
splash2x/ocean_cp       0.175 0.308 0.290 0.379  6.556
splash2x/ocean_ncp      1.485 0.939 0.477 14.037 15.558
splash2x/radiosity      0.666 0.448 0.444 0.466  1.553
splash2x/radix          0.301 0.402 0.137 1.554  6.122
splash2x/raytrace       1.053 0.853 0.786 0.535  0.275
splash2x/volrend        0.911 0.309 0.783 0.828  1.168
splash2x/water_nsquared 2.791 1.930 2.511 4.602  2.299
splash2x/water_spatial  0.795 0.696 0.794 0.548  1.663


memused.avg_stdev       orig      rec       thp       ethp       prcl
parsec3/blackscholes    5734.731  4311.105  7196.073  6250.088   38591.335
parsec3/bodytrack       5604.283  8053.736  5653.392  7457.510   6963.186
parsec3/canneal         4234.325  4698.502  1277.093  3276.701   5093.649
parsec3/dedup           57376.221 12160.066 7145.142  60455.246  83253.830
parsec3/facesim         1512.291  2418.956  3818.778  11707.979  5285.020
parsec3/ferret          2636.174  1932.627  2836.998  2171.699   3600.209
parsec3/fluidanimate    1598.216  3489.956  1799.876  2744.956   16040.475
parsec3/freqmine        4737.378  2585.209  1516.651  833.023    2835.236
parsec3/raytrace        6029.193  5301.939  3917.335  2996.082   9466.302
parsec3/streamcluster   5896.735  2787.781  1670.123  1757.718   3018.166
parsec3/swaptions       2643.941  269.967   1914.219  2180.731   544.245
parsec3/vips            1783.138  5078.111  24973.322 31076.452  38391.263
parsec3/x264            15667.795 15632.156 11233.959 25678.288  36209.397
splash2x/barnes         6571.840  8658.759  42823.767 7075.071   68590.867
splash2x/fft            85635.738 47518.810 17455.819 49252.727  363372.295
splash2x/lu_cb          1421.258  1647.129  2372.638  2105.939   17907.733
splash2x/lu_ncb         3998.792  1236.203  1718.105  3234.752   19646.973
splash2x/ocean_cp       4887.504  2266.515  2997.870  40046.934  25117.676
splash2x/ocean_ncp      25633.509 3240.085  8002.012  980010.789 122762.809
splash2x/radiosity      6798.900  2183.071  2027.795  2417.779   30261.866
splash2x/radix          39814.934 34503.616 57493.970 87549.945  235275.193
splash2x/raytrace       2875.630  1353.957  1065.653  1918.838   1356.082
splash2x/volrend        1222.512  3380.618  2242.415  1507.868   2955.215
splash2x/water_nsquared 3198.770  3823.809  16179.890 3511.015   62231.581
splash2x/water_spatial  714.574   2989.515  4637.876  2556.774   26980.198


rss.avg_stdev           orig       rec        thp       ethp        prcl
parsec3/blackscholes    2539.004   2464.031   2156.462  2413.093    53977.533
parsec3/bodytrack       94.396     226.072    121.582   149.827     1071.906
parsec3/canneal         1140.831   1332.975   1659.666  1579.872    7374.389
parsec3/dedup           20103.782  14597.251  9030.847  20039.333   314199.052
parsec3/facesim         363.581    494.161    906.467   473.521     6371.760
parsec3/ferret          199.416    138.839    697.471   78.519      1163.491
parsec3/fluidanimate    33.903     47.666     394.817   368.483     16330.277
parsec3/freqmine        410.158    981.816    637.904   1114.078    8881.688
parsec3/raytrace        1724.461   1463.681   746.438   1706.933    15221.381
parsec3/streamcluster   24.698     37.080     688.713   521.613     73.251
parsec3/swaptions       32.521     65.965     48.861    23.195      74.535
parsec3/vips            186.227    113.344    147.699   426.642     497.310
parsec3/x264            230.339    84.763     739.852   189.659     1015.909
splash2x/barnes         2068.964   1788.076   2402.419  1463.312    114524.086
splash2x/fft            175328.325 179979.020 17840.083 148550.842  818559.897
splash2x/lu_cb          256.770    397.374    23.500    2601.638    20636.424
splash2x/lu_ncb         210.223    2042.361   42.133    238.893     19596.759
splash2x/ocean_cp       4103.035   2449.065   1812.305  12470.919   299893.309
splash2x/ocean_ncp      22401.137  18053.550  5341.733  1003372.423 366901.046
splash2x/radiosity      1221.591   1271.056   349.319   710.017     40872.559
splash2x/radix          37002.584  26159.493  7497.310  81687.572   133361.762
splash2x/raytrace       30.943     36.783     456.262   545.508     1046.860
splash2x/volrend        114.991    68.611     208.251   249.526     4425.980
splash2x/water_nsquared 34.926     27.294     463.759   413.561     534.799
splash2x/water_spatial  602.384    1035.828   544.494   861.522     25796.221


runtime_min             orig    rec     thp     ethp    prcl
parsec3/blackscholes    106.122 106.677 105.919 106.619 110.383
parsec3/bodytrack       78.583  78.662  77.963  78.873  79.467
parsec3/canneal         128.976 134.898 120.434 127.203 143.840
parsec3/dedup           11.752  11.767  11.627  11.827  11.899
parsec3/facesim         206.796 207.966 202.737 205.231 213.218
parsec3/ferret          189.745 190.214 189.266 189.905 190.969
parsec3/fluidanimate    210.378 211.312 207.586 210.468 216.577
parsec3/freqmine        286.317 288.065 286.234 289.249 293.411
parsec3/raytrace        117.755 118.384 118.072 118.358 129.605
parsec3/streamcluster   321.690 325.011 279.751 286.955 326.447
parsec3/swaptions       153.185 154.154 153.310 153.993 154.328
parsec3/vips            58.688  58.537  58.430  58.451  59.520
parsec3/x264            62.342  65.581  62.383  64.317  66.676
splash2x/barnes         79.756  80.335  73.387  78.191  93.555
splash2x/fft            33.066  32.855  22.804  28.031  33.648
splash2x/lu_cb          84.857  85.454  84.211  85.408  86.695
splash2x/lu_ncb         91.931  92.534  89.463  92.670  93.163
splash2x/ocean_cp       44.342  44.283  42.531  43.634  44.678
splash2x/ocean_ncp      79.808  80.530  50.690  52.658  82.238
splash2x/radiosity      90.522  91.085  90.146  91.076  101.043
splash2x/radix          30.793  30.748  25.059  27.514  31.649
splash2x/raytrace       83.115  83.872  82.632  83.073  84.777
splash2x/volrend        86.573  87.847  85.367  86.157  87.235
splash2x/water_nsquared 230.687 231.397 216.456 218.151 234.711
splash2x/water_spatial  88.231  88.761  88.657  89.263  94.696


memused.avg_min         orig        rec         thp         ethp        prcl
parsec3/blackscholes    1811544.000 1826536.000 1808494.000 1821773.000 1614734.000
parsec3/bodytrack       1408397.000 1417771.000 1409682.000 1418166.000 1421597.000
parsec3/canneal         1033746.000 1044350.000 1039137.000 1042459.000 1039997.000
parsec3/dedup           2301102.000 2439442.000 2415716.000 2312844.000 2271479.000
parsec3/facesim         538342.000  548319.000  538855.000  548533.000  476439.000
parsec3/ferret          314307.000  330592.000  318816.000  328550.000  322669.000
parsec3/fluidanimate    575266.000  580145.000  572053.000  580210.000  471031.000
parsec3/freqmine        983385.000  992568.000  989103.000  996967.000  765382.000
parsec3/raytrace        1739135.000 1746208.000 1732673.000 1750043.000 1562113.000
parsec3/streamcluster   117830.000  129708.000  119065.000  130544.000  127311.000
parsec3/swaptions       9019.000    28779.000   10747.000   22525.000   27794.000
parsec3/vips            2955461.000 2966185.000 2889282.000 2899751.000 2899574.000
parsec3/x264            3167265.000 3184099.000 3165670.000 3147043.000 3127201.000
splash2x/barnes         1197072.000 1202375.000 1216718.000 1208018.000 927735.000
splash2x/fft            9252959.000 9127817.000 9241331.000 9070892.000 9171009.000
splash2x/lu_cb          513078.000  521249.000  516804.000  522107.000  352197.000
splash2x/lu_ncb         507538.000  520640.000  518193.000  520217.000  414060.000
splash2x/ocean_cp       3338417.000 3285969.000 3381462.000 3284112.000 3220039.000
splash2x/ocean_ncp      3850694.000 3866285.000 7054914.000 3863994.000 3533390.000
splash2x/radiosity      1458533.000 1466378.000 1478591.000 1466341.000 500186.000
splash2x/radix          1632067.000 1660555.000 1237961.000 1433203.000 1742575.000
splash2x/raytrace       41129.000   57383.000   50345.000   56853.000   51558.000
splash2x/volrend        149557.000  161324.000  149036.000  162226.000  156885.000
splash2x/water_nsquared 42148.000   54926.000   37080.000   53261.000   59453.000
splash2x/water_spatial  667829.000  672042.000  657036.000  674431.000  499066.000


rss.avg_min             orig        rec         thp         ethp        prcl
parsec3/blackscholes    588296.000  588492.000  590821.000  588424.000  275254.000
parsec3/bodytrack       32314.000   31950.000   32051.000   32229.000   25606.000
parsec3/canneal         839821.000  837971.000  835293.000  835454.000  814989.000
parsec3/dedup           1190628.000 1198910.000 1217746.000 1191978.000 387128.000
parsec3/facesim         311025.000  311017.000  315101.000  312000.000  178780.000
parsec3/ferret          99241.000   99307.000   101493.000  99715.000   87322.000
parsec3/fluidanimate    531852.000  531808.000  531716.000  531812.000  407062.000
parsec3/freqmine        553154.000  551378.000  554723.000  552409.000  60624.000
parsec3/raytrace        891883.000  892921.000  889021.000  890269.000  320336.000
parsec3/streamcluster   110908.000  110901.000  110959.000  110905.000  109822.000
parsec3/swaptions       5595.000    5569.000    5598.000    5669.000    3932.000
parsec3/vips            31813.000   31872.000   31999.000   31862.000   28309.000
parsec3/x264            81595.000   81995.000   82492.000   82257.000   79007.000
splash2x/barnes         1213152.000 1215177.000 1219337.000 1215911.000 633247.000
splash2x/fft            9352024.000 9344708.000 9635110.000 9350033.000 7315602.000
splash2x/lu_cb          510081.000  510173.000  514423.000  504167.000  328613.000
splash2x/lu_ncb         510113.000  504887.000  513811.000  509917.000  400025.000
splash2x/ocean_cp       3403622.000 3406128.000 3444901.000 3402319.000 2645269.000
splash2x/ocean_ncp      3879080.000 3886923.000 7166490.000 3922086.000 2992831.000
splash2x/radiosity      1474071.000 1473050.000 1485076.000 1475643.000 227437.000
splash2x/radix          1695640.000 1705200.000 1419128.000 1554897.000 1749998.000
splash2x/raytrace       23268.000   23256.000   28084.000   26045.000   14677.000
splash2x/volrend        44000.000   44028.000   44150.000   44027.000   22121.000
splash2x/water_nsquared 29380.000   29380.000   29284.000   29436.000   24453.000
splash2x/water_spatial  655590.000  654947.000  655388.000  655542.000  432854.000


runtime_max             orig    rec     thp     ethp    prcl
parsec3/blackscholes    108.370 108.877 108.776 108.445 113.864
parsec3/bodytrack       80.276  80.366  79.630  80.126  81.659
parsec3/canneal         143.367 144.197 123.455 131.975 162.892
parsec3/dedup           12.054  11.965  11.772  12.142  14.055
parsec3/facesim         209.196 209.195 204.742 207.583 215.245
parsec3/ferret          190.661 192.285 189.877 192.743 192.066
parsec3/fluidanimate    212.472 213.463 211.522 214.281 220.408
parsec3/freqmine        295.246 293.927 292.255 292.148 300.932
parsec3/raytrace        120.358 120.792 121.136 120.404 131.839
parsec3/streamcluster   325.071 331.539 286.436 289.228 334.317
parsec3/swaptions       155.697 157.038 156.825 156.309 156.606
parsec3/vips            59.062  58.877  58.842  59.265  61.472
parsec3/x264            75.345  75.685  70.758  73.542  78.697
splash2x/barnes         80.831  82.605  74.027  79.812  106.207
splash2x/fft            33.718  33.881  23.008  33.554  47.834
splash2x/lu_cb          88.563  86.292  84.463  87.789  91.005
splash2x/lu_ncb         92.726  92.950  90.083  93.311  96.009
splash2x/ocean_cp       44.868  45.243  43.415  44.749  61.729
splash2x/ocean_ncp      84.442  83.289  51.998  82.229  124.633
splash2x/radiosity      92.378  92.244  91.275  92.292  105.059
splash2x/radix          31.681  31.874  25.371  32.192  45.455
splash2x/raytrace       85.864  85.906  84.819  84.530  85.539
splash2x/volrend        88.814  88.744  87.670  88.399  90.504
splash2x/water_nsquared 237.596 237.199 223.741 231.241 241.168
splash2x/water_spatial  90.466  90.773  90.521  90.550  99.268


memused.avg_max         orig        rec         thp         ethp        prcl
parsec3/blackscholes    1827340.000 1836897.000 1828187.000 1837378.000 1707479.000
parsec3/bodytrack       1421976.000 1437658.000 1424985.000 1437609.000 1436792.000
parsec3/canneal         1046601.000 1056260.000 1042789.000 1051592.000 1055545.000
parsec3/dedup           2456740.000 2467145.000 2435467.000 2467154.000 2466283.000
parsec3/facesim         542381.000  554228.000  550226.000  581529.000  492295.000
parsec3/ferret          321498.000  336341.000  325716.000  334070.000  331389.000
parsec3/fluidanimate    579520.000  588703.000  577242.000  588291.000  516079.000
parsec3/freqmine        996987.000  999207.000  992443.000  999281.000  773858.000
parsec3/raytrace        1754751.000 1760398.000 1743557.000 1756996.000 1586220.000
parsec3/streamcluster   133449.000  137064.000  124066.000  134933.000  134636.000
parsec3/swaptions       16397.000   29621.000   15944.000   28895.000   29207.000
parsec3/vips            2960001.000 2981427.000 2955163.000 2986429.000 2983387.000
parsec3/x264            3205259.000 3228002.000 3197838.000 3219088.000 3223033.000
splash2x/barnes         1214951.000 1228755.000 1327664.000 1228861.000 1114904.000
splash2x/fft            9480354.000 9264341.000 9282594.000 9213597.000 10164992.000
splash2x/lu_cb          517263.000  526113.000  523255.000  528095.000  402158.000
splash2x/lu_ncb         517764.000  524148.000  522610.000  528755.000  471551.000
splash2x/ocean_cp       3352774.000 3292570.000 3389379.000 3391900.000 3287861.000
splash2x/ocean_ncp      3908896.000 3875773.000 7077948.000 5935372.000 3867246.000
splash2x/radiosity      1475164.000 1472553.000 1484177.000 1473081.000 581921.000
splash2x/radix          1747493.000 1755210.000 1386306.000 1670956.000 2257898.000
splash2x/raytrace       47791.000   60947.000   53194.000   62454.000   55115.000
splash2x/volrend        153317.000  170996.000  154377.000  166662.000  164911.000
splash2x/water_nsquared 50000.000   66673.000   82988.000   62664.000   216319.000
splash2x/water_spatial  669928.000  681200.000  669739.000  681264.000  576776.000


rss.avg_max             orig        rec         thp         ethp        prcl
parsec3/blackscholes    593578.000  593873.000  596546.000  593668.000  404235.000
parsec3/bodytrack       32583.000   32617.000   32400.000   32664.000   28517.000
parsec3/canneal         842867.000  841952.000  839916.000  840304.000  835563.000
parsec3/dedup           1236591.000 1237214.000 1242257.000 1238979.000 1235900.000
parsec3/facesim         311936.000  312443.000  317898.000  313322.000  197379.000
parsec3/ferret          99760.000   99669.000   103618.000  99923.000   90262.000
parsec3/fluidanimate    531952.000  531940.000  532718.000  532706.000  453297.000
parsec3/freqmine        554327.000  553656.000  556388.000  555822.000  83652.000
parsec3/raytrace        896387.000  897364.000  891293.000  894360.000  362231.000
parsec3/streamcluster   110965.000  111003.000  112888.000  112335.000  110019.000
parsec3/swaptions       5675.000    5761.000    5721.000    5736.000    4110.000
parsec3/vips            32314.000   32166.000   32368.000   33057.000   29649.000
parsec3/x264            82128.000   82256.000   84304.000   82773.000   81839.000
splash2x/barnes         1219196.000 1219746.000 1225193.000 1219322.000 944438.000
splash2x/fft            9756750.000 9744458.000 9680119.000 9739032.000 9774117.000
splash2x/lu_cb          510829.000  511270.000  514484.000  510940.000  384608.000
splash2x/lu_ncb         510607.000  510501.000  513941.000  510667.000  455172.000
splash2x/ocean_cp       3414341.000 3412957.000 3449666.000 3434074.000 3401949.000
splash2x/ocean_ncp      3938869.000 3935467.000 7181326.000 6013719.000 3929055.000
splash2x/radiosity      1477633.000 1476719.000 1486072.000 1477598.000 343386.000
splash2x/radix          1796005.000 1783915.000 1440644.000 1776567.000 2039690.000
splash2x/raytrace       23348.000   23356.000   29332.000   27568.000   17323.000
splash2x/volrend        44314.000   44213.000   44736.000   44729.000   32296.000
splash2x/water_nsquared 29488.000   29444.000   30541.000   30583.000   26084.000
splash2x/water_spatial  657107.000  657889.000  656721.000  657519.000  510062.000

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [RFC v8 1/8] mm/madvise: Export do_madvise() to external GPL modules
  2020-05-12 11:53 [RFC v8 0/8] Implement Data Access Monitoring-based Memory Operation Schemes SeongJae Park
@ 2020-05-12 11:53 ` SeongJae Park
  2020-05-12 11:53 ` [RFC v8 2/8] mm/damon: Account age of target regions SeongJae Park
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2020-05-12 11:53 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, aarcange, acme,
	alexander.shishkin, amit, benh, brendan.d.gregg, brendanhiggins,
	cai, colin.king, corbet, dwmw, irogers, jolsa, kirill,
	mark.rutland, mgorman, minchan, mingo, namhyung, peterz, rdunlap,
	riel, rientjes, rostedt, sblbir, shakeelb, shuah, sj38.park, snu,
	vbabka, vdavydov.dev, yang.shi, ying.huang, linux-damon,
	linux-mm, linux-doc, linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit exports 'do_madvise()' to external GPL modules, so that
other modules including DAMON could use the function.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 mm/madvise.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/madvise.c b/mm/madvise.c
index 80f8a1839f70..151aaf285cdd 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1151,6 +1151,7 @@ int do_madvise(struct task_struct *target_task, struct mm_struct *mm,
 
 	return error;
 }
+EXPORT_SYMBOL_GPL(do_madvise);
 
 SYSCALL_DEFINE3(madvise, unsigned long, start, size_t, len_in, int, behavior)
 {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC v8 2/8] mm/damon: Account age of target regions
  2020-05-12 11:53 [RFC v8 0/8] Implement Data Access Monitoring-based Memory Operation Schemes SeongJae Park
  2020-05-12 11:53 ` [RFC v8 1/8] mm/madvise: Export do_madvise() to external GPL modules SeongJae Park
@ 2020-05-12 11:53 ` SeongJae Park
  2020-05-12 11:53 ` [RFC v8 3/8] mm/damon: Implement data access monitoring-based operation schemes SeongJae Park
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2020-05-12 11:53 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, aarcange, acme,
	alexander.shishkin, amit, benh, brendan.d.gregg, brendanhiggins,
	cai, colin.king, corbet, dwmw, irogers, jolsa, kirill,
	mark.rutland, mgorman, minchan, mingo, namhyung, peterz, rdunlap,
	riel, rientjes, rostedt, sblbir, shakeelb, shuah, sj38.park, snu,
	vbabka, vdavydov.dev, yang.shi, ying.huang, linux-damon,
	linux-mm, linux-doc, linux-kernel

From: SeongJae Park <sjpark@amazon.de>

DAMON can be used as a primitive for data access pattern aware memory
management optimizations.  However, users who want such optimizations
should run DAMON, read the monitoring results, analyze it, plan a new
memory management scheme, and apply the new scheme by themselves.  It
would not be too hard, but still require some level of effort.  For
complicated optimizations, this effort is inevitable.

That said, in many cases, users would simply want to apply an actions to
a memory region of a specific size having a specific access frequency
for a specific time.  For example, "page out a memory region larger than
100 MiB but having a low access frequency more than 10 minutes", or "Use
THP for a memory region larger than 2 MiB having a high access frequency
for more than 2 seconds".

For such optimizations, users will need to first account the age of each
region themselves.  To reduce such efforts, this commit implements a
simple age account of each region in DAMON.  For each aggregation step,
DAMON compares the access frequency and start/end address of each region
with those from last aggregation and reset the age of the region if the
change is significant.  Else, the age is incremented.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 include/linux/damon.h |   5 ++
 mm/damon.c            | 106 ++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 107 insertions(+), 4 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index bc46ea00e9a1..7276b2a31c38 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -22,6 +22,11 @@ struct damon_region {
 	unsigned long sampling_addr;
 	unsigned int nr_accesses;
 	struct list_head list;
+
+	unsigned int age;
+	unsigned long last_vm_start;
+	unsigned long last_vm_end;
+	unsigned int last_nr_accesses;
 };
 
 /* Represents a monitoring target task */
diff --git a/mm/damon.c b/mm/damon.c
index 6fb847312a5d..da266e8b2b30 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -86,6 +86,10 @@ static struct damon_region *damon_new_region(struct damon_ctx *ctx,
 	region->nr_accesses = 0;
 	INIT_LIST_HEAD(&region->list);
 
+	region->age = 0;
+	region->last_vm_start = vm_start;
+	region->last_vm_end = vm_end;
+
 	return region;
 }
 
@@ -659,11 +663,45 @@ static void kdamond_reset_aggregated(struct damon_ctx *c)
 					sizeof(r->nr_accesses));
 			trace_damon_aggregated(t->pid, nr,
 					r->vm_start, r->vm_end, r->nr_accesses);
+			r->last_nr_accesses = r->nr_accesses;
 			r->nr_accesses = 0;
 		}
 	}
 }
 
+#define diff_of(a, b) (a > b ? a - b : b - a)
+
+/*
+ * Increase or reset the age of the given monitoring target region
+ *
+ * If the area or '->nr_accesses' has changed significantly, reset the '->age'.
+ * Else, increase the age.
+ */
+static void damon_do_count_age(struct damon_region *r, unsigned int threshold)
+{
+	unsigned long region_threshold = (r->vm_end - r->vm_start) / 4;
+	unsigned long region_diff = diff_of(r->vm_start, r->last_vm_start) +
+			diff_of(r->vm_end, r->last_vm_end);
+	unsigned int nr_accesses_diff = diff_of(r->nr_accesses,
+			r->last_nr_accesses);
+
+	if (region_diff > region_threshold || nr_accesses_diff > threshold)
+		r->age = 0;
+	else
+		r->age++;
+}
+
+static void kdamond_count_age(struct damon_ctx *c, unsigned int threshold)
+{
+	struct damon_task *t;
+	struct damon_region *r;
+
+	damon_for_each_task(c, t) {
+		damon_for_each_region(r, t)
+			damon_do_count_age(r, threshold);
+	}
+}
+
 #define sz_damon_region(r) (r->vm_end - r->vm_start)
 
 /*
@@ -672,33 +710,86 @@ static void kdamond_reset_aggregated(struct damon_ctx *c)
 static void damon_merge_two_regions(struct damon_region *l,
 				struct damon_region *r)
 {
-	l->nr_accesses = (l->nr_accesses * sz_damon_region(l) +
-			r->nr_accesses * sz_damon_region(r)) /
-			(sz_damon_region(l) + sz_damon_region(r));
+	unsigned long sz_l = sz_damon_region(l), sz_r = sz_damon_region(r);
+
+	l->nr_accesses = (l->nr_accesses * sz_l + r->nr_accesses * sz_r) /
+			(sz_l + sz_r);
+	l->age = (l->age * sz_l + r->age * sz_r) / (sz_l + sz_r);
 	l->vm_end = r->vm_end;
 	damon_destroy_region(r);
 }
 
-#define diff_of(a, b) (a > b ? a - b : b - a)
+static inline void set_last_area(struct damon_region *r, struct region *last)
+{
+	r->last_vm_start = last->start;
+	r->last_vm_end = last->end;
+}
+
+static inline void get_last_area(struct damon_region *r, struct region *last)
+{
+	last->start = r->last_vm_start;
+	last->end = r->last_vm_end;
+}
 
 /*
  * Merge adjacent regions having similar access frequencies
  *
  * t		task affected by merge operation
  * thres	'->nr_accesses' diff threshold for the merge
+ *
+ * After each merge, the biggest mergee region becomes the last shape of the
+ * new region.  If two regions split from one region at the end of previous
+ * aggregation interval are merged into one region, we handle the two regions
+ * as one big mergee, because it can lead to unproper last shape record if we
+ * don't do so.
+ *
+ * To understand why we take special care for regions split from one region,
+ * suppose that a region of size 10 has split into two regions of size 4 and 6.
+ * Two regions show similar access frequency for next aggregation interval and
+ * thus now be merged into one region again.  Because the split is made
+ * regardless of the access pattern, DAMON should say the region of size 10 had
+ * no area change for last aggregation interval.  However, if the two mergees
+ * are handled separately, DAMON will say the merged region has changed its
+ * size from 6 to 10.
  */
 static void damon_merge_regions_of(struct damon_task *t, unsigned int thres)
 {
 	struct damon_region *r, *prev = NULL, *next;
+	struct region biggest_mergee;	/* the biggest region being merged */
+	unsigned long sz_biggest = 0;	/* size of the biggest_mergee */
+	unsigned long sz_mergee = 0;	/* size of current mergee */
 
 	damon_for_each_region_safe(r, next, t) {
 		if (!prev || prev->vm_end != r->vm_start ||
 		    diff_of(prev->nr_accesses, r->nr_accesses) > thres) {
+			if (sz_biggest)
+				set_last_area(prev, &biggest_mergee);
+
 			prev = r;
+			sz_biggest = sz_damon_region(prev);
+			get_last_area(prev, &biggest_mergee);
 			continue;
 		}
+
+
+		/* Set size of current mergee and biggest mergee */
+		sz_mergee += sz_damon_region(r);
+		if (sz_mergee > sz_biggest) {
+			sz_biggest = sz_mergee;
+			get_last_area(r, &biggest_mergee);
+		}
+
+		/*
+		 * If next region and current region is not originated from
+		 * same region, initialize the size of mergee.
+		 */
+		if (r->last_vm_start != next->last_vm_start)
+			sz_mergee = 0;
+
 		damon_merge_two_regions(prev, r);
 	}
+	if (sz_biggest)
+		set_last_area(prev, &biggest_mergee);
 }
 
 /*
@@ -731,6 +822,12 @@ static void damon_split_region_at(struct damon_ctx *ctx,
 	struct damon_region *new;
 
 	new = damon_new_region(ctx, r->vm_start + sz_r, r->vm_end);
+	new->age = r->age;
+	new->last_vm_start = r->vm_start;
+	new->last_nr_accesses = r->last_nr_accesses;
+
+	r->last_vm_start = r->vm_start;
+	r->last_vm_end = r->vm_end;
 	r->vm_end = new->vm_start;
 
 	damon_insert_region(new, r, damon_next_region(r));
@@ -957,6 +1054,7 @@ static int kdamond_fn(void *data)
 
 		if (kdamond_aggregate_interval_passed(ctx)) {
 			kdamond_merge_regions(ctx, max_nr_accesses / 10);
+			kdamond_count_age(ctx, max_nr_accesses / 10);
 			if (ctx->aggregate_cb)
 				ctx->aggregate_cb(ctx);
 			kdamond_reset_aggregated(ctx);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC v8 3/8] mm/damon: Implement data access monitoring-based operation schemes
  2020-05-12 11:53 [RFC v8 0/8] Implement Data Access Monitoring-based Memory Operation Schemes SeongJae Park
  2020-05-12 11:53 ` [RFC v8 1/8] mm/madvise: Export do_madvise() to external GPL modules SeongJae Park
  2020-05-12 11:53 ` [RFC v8 2/8] mm/damon: Account age of target regions SeongJae Park
@ 2020-05-12 11:53 ` SeongJae Park
  2020-05-12 11:53 ` [RFC v8 4/8] mm/damon/schemes: Implement a debugfs interface SeongJae Park
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2020-05-12 11:53 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, aarcange, acme,
	alexander.shishkin, amit, benh, brendan.d.gregg, brendanhiggins,
	cai, colin.king, corbet, dwmw, irogers, jolsa, kirill,
	mark.rutland, mgorman, minchan, mingo, namhyung, peterz, rdunlap,
	riel, rientjes, rostedt, sblbir, shakeelb, shuah, sj38.park, snu,
	vbabka, vdavydov.dev, yang.shi, ying.huang, linux-damon,
	linux-mm, linux-doc, linux-kernel

From: SeongJae Park <sjpark@amazon.de>

In many cases, users might use DAMON for simple data access aware
memory management optimizations such as applying an operation scheme to
a memory region of a specific size having a specific access frequency
for a specific time.  For example, "page out a memory region larger than
100 MiB but having a low access frequency more than 10 minutes", or "Use
THP for a memory region larger than 2 MiB having a high access frequency
for more than 2 seconds".

To minimize users from spending their time for implementation of such
simple data access monitoring-based operation schemes, this commit makes
DAMON to handle such schemes directly.  With this commit, users can
simply specify their desired schemes to DAMON.

Each of the schemes is composed with conditions for filtering of the
target memory regions and desired memory management action for the
target.  Specifically, the format is::

    <min/max size> <min/max access frequency> <min/max age> <action>

The filtering conditions are size of memory region, number of accesses
to the region monitored by DAMON, and the age of the region.  The age of
region is incremented periodically but reset when its addresses or
access frequency has significantly changed or the action of a scheme was
applied.  For the action, current implementation supports only a few of
madvise() hints, ``MADV_WILLNEED``, ``MADV_COLD``, ``MADV_PAGEOUT``,
``MADV_HUGEPAGE``, and ``MADV_NOHUGEPAGE``.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 include/linux/damon.h |  24 +++++++
 mm/damon.c            | 149 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 173 insertions(+)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index 7276b2a31c38..0f26d8aad33c 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -36,6 +36,27 @@ struct damon_task {
 	struct list_head list;
 };
 
+/* Data Access Monitoring-based Operation Scheme */
+enum damos_action {
+	DAMOS_WILLNEED,
+	DAMOS_COLD,
+	DAMOS_PAGEOUT,
+	DAMOS_HUGEPAGE,
+	DAMOS_NOHUGEPAGE,
+	DAMOS_ACTION_LEN,
+};
+
+struct damos {
+	unsigned int min_sz_region;
+	unsigned int max_sz_region;
+	unsigned int min_nr_accesses;
+	unsigned int max_nr_accesses;
+	unsigned int min_age_region;
+	unsigned int max_age_region;
+	enum damos_action action;
+	struct list_head list;
+};
+
 /*
  * For each 'sample_interval', DAMON checks whether each region is accessed or
  * not.  It aggregates and keeps the access information (number of accesses to
@@ -65,6 +86,7 @@ struct damon_ctx {
 	struct mutex kdamond_lock;
 
 	struct list_head tasks_list;	/* 'damon_task' objects */
+	struct list_head schemes_list;	/* 'damos' objects */
 
 	/* callbacks */
 	void (*sample_cb)(struct damon_ctx *context);
@@ -75,6 +97,8 @@ int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids);
 int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
 		unsigned long aggr_int, unsigned long regions_update_int,
 		unsigned long min_nr_reg, unsigned long max_nr_reg);
+int damon_set_schemes(struct damon_ctx *ctx,
+			struct damos **schemes, ssize_t nr_schemes);
 int damon_set_recording(struct damon_ctx *ctx,
 				unsigned int rbuf_len, char *rfile_path);
 int damon_start(struct damon_ctx *ctx);
diff --git a/mm/damon.c b/mm/damon.c
index da266e8b2b30..13275c31a6c5 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -11,6 +11,7 @@
 
 #define CREATE_TRACE_POINTS
 
+#include <asm-generic/mman-common.h>
 #include <linux/damon.h>
 #include <linux/debugfs.h>
 #include <linux/delay.h>
@@ -52,6 +53,12 @@
 #define damon_for_each_task_safe(ctx, t, next) \
 	list_for_each_entry_safe(t, next, &(ctx)->tasks_list, list)
 
+#define damon_for_each_scheme(ctx, r) \
+	list_for_each_entry(r, &(ctx)->schemes_list, list)
+
+#define damon_for_each_scheme_safe(ctx, s, next) \
+	list_for_each_entry_safe(s, next, &(ctx)->schemes_list, list)
+
 #define MAX_RECORD_BUFFER_LEN	(4 * 1024 * 1024)
 #define MAX_RFILE_PATH_LEN	256
 
@@ -167,6 +174,27 @@ static void damon_destroy_task(struct damon_task *t)
 	damon_free_task(t);
 }
 
+static void damon_add_scheme(struct damon_ctx *ctx, struct damos *s)
+{
+	list_add_tail(&s->list, &ctx->schemes_list);
+}
+
+static void damon_del_scheme(struct damos *s)
+{
+	list_del(&s->list);
+}
+
+static void damon_free_scheme(struct damos *s)
+{
+	kfree(s);
+}
+
+static void damon_destroy_scheme(struct damos *s)
+{
+	damon_del_scheme(s);
+	damon_free_scheme(s);
+}
+
 static unsigned int nr_damon_tasks(struct damon_ctx *ctx)
 {
 	struct damon_task *t;
@@ -702,6 +730,101 @@ static void kdamond_count_age(struct damon_ctx *c, unsigned int threshold)
 	}
 }
 
+#ifndef CONFIG_ADVISE_SYSCALLS
+static int damos_madvise(struct damon_task *task, struct damon_region *r,
+			int behavior)
+{
+	return -EINVAL;
+}
+#else
+static int damos_madvise(struct damon_task *task, struct damon_region *r,
+			int behavior)
+{
+	struct task_struct *t;
+	struct mm_struct *mm;
+	int ret = -ENOMEM;
+
+	t = damon_get_task_struct(task);
+	if (!t)
+		goto out;
+	mm = damon_get_mm(task);
+	if (!mm)
+		goto put_task_out;
+
+	ret = do_madvise(t, mm, PAGE_ALIGN(r->vm_start),
+			PAGE_ALIGN(r->vm_end - r->vm_start), behavior);
+	mmput(mm);
+put_task_out:
+	put_task_struct(t);
+out:
+	return ret;
+}
+#endif	/* CONFIG_ADVISE_SYSCALLS */
+
+static int damos_do_action(struct damon_task *task, struct damon_region *r,
+			enum damos_action action)
+{
+	int madv_action;
+
+	switch (action) {
+	case DAMOS_WILLNEED:
+		madv_action = MADV_WILLNEED;
+		break;
+	case DAMOS_COLD:
+		madv_action = MADV_COLD;
+		break;
+	case DAMOS_PAGEOUT:
+		madv_action = MADV_PAGEOUT;
+		break;
+	case DAMOS_HUGEPAGE:
+		madv_action = MADV_HUGEPAGE;
+		break;
+	case DAMOS_NOHUGEPAGE:
+		madv_action = MADV_NOHUGEPAGE;
+		break;
+	default:
+		pr_warn("Wrong action %d\n", action);
+		return -EINVAL;
+	}
+
+	return damos_madvise(task, r, madv_action);
+}
+
+static void damon_do_apply_schemes(struct damon_ctx *c, struct damon_task *t,
+				struct damon_region *r)
+{
+	struct damos *s;
+	unsigned long sz;
+
+	damon_for_each_scheme(c, s) {
+		sz = r->vm_end - r->vm_start;
+		if ((s->min_sz_region && sz < s->min_sz_region) ||
+				(s->max_sz_region && s->max_sz_region < sz))
+			continue;
+		if ((s->min_nr_accesses && r->nr_accesses < s->min_nr_accesses)
+				|| (s->max_nr_accesses &&
+					s->max_nr_accesses < r->nr_accesses))
+			continue;
+		if ((s->min_age_region && r->age < s->min_age_region) ||
+				(s->max_age_region &&
+				 s->max_age_region < r->age))
+			continue;
+		damos_do_action(t, r, s->action);
+		r->age = 0;
+	}
+}
+
+static void kdamond_apply_schemes(struct damon_ctx *c)
+{
+	struct damon_task *t;
+	struct damon_region *r;
+
+	damon_for_each_task(c, t) {
+		damon_for_each_region(r, t)
+			damon_do_apply_schemes(c, t, r);
+	}
+}
+
 #define sz_damon_region(r) (r->vm_end - r->vm_start)
 
 /*
@@ -1057,6 +1180,7 @@ static int kdamond_fn(void *data)
 			kdamond_count_age(ctx, max_nr_accesses / 10);
 			if (ctx->aggregate_cb)
 				ctx->aggregate_cb(ctx);
+			kdamond_apply_schemes(ctx);
 			kdamond_reset_aggregated(ctx);
 			kdamond_split_regions(ctx);
 		}
@@ -1137,6 +1261,30 @@ int damon_stop(struct damon_ctx *ctx)
 	return -EPERM;
 }
 
+/**
+ * damon_set_schemes() - Set data access monitoring based operation schemes.
+ * @ctx:	monitoring context
+ * @schemes:	array of the schemes
+ * @nr_schemes:	number of entries in @schemes
+ *
+ * This function should not be called while the kdamond of the context is
+ * running.
+ *
+ * Return: 0 if success, or negative error code otherwise.
+ */
+int damon_set_schemes(struct damon_ctx *ctx, struct damos **schemes,
+			ssize_t nr_schemes)
+{
+	struct damos *s, *next;
+	ssize_t i;
+
+	damon_for_each_scheme_safe(ctx, s, next)
+		damon_destroy_scheme(s);
+	for (i = 0; i < nr_schemes; i++)
+		damon_add_scheme(ctx, schemes[i]);
+	return 0;
+}
+
 /**
  * damon_set_pids() - Set monitoring target processes.
  * @ctx:	monitoring context
@@ -1571,6 +1719,7 @@ static int __init damon_init_user_ctx(void)
 	mutex_init(&ctx->kdamond_lock);
 
 	INIT_LIST_HEAD(&ctx->tasks_list);
+	INIT_LIST_HEAD(&ctx->schemes_list);
 
 	return 0;
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC v8 4/8] mm/damon/schemes: Implement a debugfs interface
  2020-05-12 11:53 [RFC v8 0/8] Implement Data Access Monitoring-based Memory Operation Schemes SeongJae Park
                   ` (2 preceding siblings ...)
  2020-05-12 11:53 ` [RFC v8 3/8] mm/damon: Implement data access monitoring-based operation schemes SeongJae Park
@ 2020-05-12 11:53 ` SeongJae Park
  2020-05-12 11:53 ` [RFC v8 5/8] mm/damon-test: Add kunit test case for regions age accounting SeongJae Park
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2020-05-12 11:53 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, aarcange, acme,
	alexander.shishkin, amit, benh, brendan.d.gregg, brendanhiggins,
	cai, colin.king, corbet, dwmw, irogers, jolsa, kirill,
	mark.rutland, mgorman, minchan, mingo, namhyung, peterz, rdunlap,
	riel, rientjes, rostedt, sblbir, shakeelb, shuah, sj38.park, snu,
	vbabka, vdavydov.dev, yang.shi, ying.huang, linux-damon,
	linux-mm, linux-doc, linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit implements a debugfs interface for the data access
monitoring oriented memory management schemes.  It is supposed to be
used by administrators and/or privileged user space programs.  Users can
read and update the rules using ``<debugfs>/damon/schemes`` file.  The
format is::

    <min/max size> <min/max access frequency> <min/max age> <action>

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 mm/damon.c | 174 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 172 insertions(+), 2 deletions(-)

diff --git a/mm/damon.c b/mm/damon.c
index 13275c31a6c5..be8a986f08e0 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -174,6 +174,29 @@ static void damon_destroy_task(struct damon_task *t)
 	damon_free_task(t);
 }
 
+static struct damos *damon_new_scheme(
+		unsigned int min_sz_region, unsigned int max_sz_region,
+		unsigned int min_nr_accesses, unsigned int max_nr_accesses,
+		unsigned int min_age_region, unsigned int max_age_region,
+		enum damos_action action)
+{
+	struct damos *scheme;
+
+	scheme = kmalloc(sizeof(*scheme), GFP_KERNEL);
+	if (!scheme)
+		return NULL;
+	scheme->min_sz_region = min_sz_region;
+	scheme->max_sz_region = max_sz_region;
+	scheme->min_nr_accesses = min_nr_accesses;
+	scheme->max_nr_accesses = max_nr_accesses;
+	scheme->min_age_region = min_age_region;
+	scheme->max_age_region = max_age_region;
+	scheme->action = action;
+	INIT_LIST_HEAD(&scheme->list);
+
+	return scheme;
+}
+
 static void damon_add_scheme(struct damon_ctx *ctx, struct damos *s)
 {
 	list_add_tail(&s->list, &ctx->schemes_list);
@@ -1439,6 +1462,147 @@ static ssize_t debugfs_monitor_on_write(struct file *file,
 	return ret;
 }
 
+static ssize_t sprint_schemes(struct damon_ctx *c, char *buf, ssize_t len)
+{
+	struct damos *s;
+	int written = 0;
+	int rc;
+
+	damon_for_each_scheme(c, s) {
+		rc = snprintf(&buf[written], len - written,
+				"%u %u %u %u %u %u %d\n",
+				s->min_sz_region, s->max_sz_region,
+				s->min_nr_accesses, s->max_nr_accesses,
+				s->min_age_region, s->max_age_region,
+				s->action);
+		if (!rc)
+			return -ENOMEM;
+
+		written += rc;
+	}
+	return written;
+}
+
+static ssize_t debugfs_schemes_read(struct file *file, char __user *buf,
+		size_t count, loff_t *ppos)
+{
+	struct damon_ctx *ctx = &damon_user_ctx;
+	char *kbuf;
+	ssize_t len;
+
+	kbuf = kmalloc(count, GFP_KERNEL);
+	if (!kbuf)
+		return -ENOMEM;
+
+	len = sprint_schemes(ctx, kbuf, count);
+	if (len < 0)
+		goto out;
+	len = simple_read_from_buffer(buf, count, ppos, kbuf, len);
+
+out:
+	kfree(kbuf);
+	return len;
+}
+
+static void free_schemes_arr(struct damos **schemes, ssize_t nr_schemes)
+{
+	ssize_t i;
+
+	for (i = 0; i < nr_schemes; i++)
+		kfree(schemes[i]);
+	kfree(schemes);
+}
+
+/*
+ * Converts a string into an array of struct damos pointers
+ *
+ * Returns an array of struct damos pointers that converted if the conversion
+ * success, or NULL otherwise.
+ */
+static struct damos **str_to_schemes(const char *str, ssize_t len,
+				ssize_t *nr_schemes)
+{
+	struct damos *scheme, **schemes;
+	const int max_nr_schemes = 256;
+	int pos = 0, parsed, ret;
+	unsigned int min_sz, max_sz, min_nr_a, max_nr_a, min_age, max_age;
+	unsigned int action;
+
+	schemes = kmalloc_array(max_nr_schemes, sizeof(scheme),
+			GFP_KERNEL);
+	if (!schemes)
+		return NULL;
+
+	*nr_schemes = 0;
+	while (pos < len && *nr_schemes < max_nr_schemes) {
+		ret = sscanf(&str[pos], "%u %u %u %u %u %u %u%n",
+				&min_sz, &max_sz, &min_nr_a, &max_nr_a,
+				&min_age, &max_age, &action, &parsed);
+		if (ret != 7)
+			break;
+		if (action >= DAMOS_ACTION_LEN) {
+			pr_err("wrong action %d\n", action);
+			goto fail;
+		}
+
+		pos += parsed;
+		scheme = damon_new_scheme(min_sz, max_sz, min_nr_a, max_nr_a,
+				min_age, max_age, action);
+		if (!scheme)
+			goto fail;
+
+		schemes[*nr_schemes] = scheme;
+		*nr_schemes += 1;
+	}
+	if (!*nr_schemes)
+		goto fail;
+	return schemes;
+fail:
+	free_schemes_arr(schemes, *nr_schemes);
+	return NULL;
+}
+
+static ssize_t debugfs_schemes_write(struct file *file, const char __user *buf,
+		size_t count, loff_t *ppos)
+{
+	struct damon_ctx *ctx = &damon_user_ctx;
+	char *kbuf;
+	struct damos **schemes;
+	ssize_t nr_schemes = 0, ret;
+	int err;
+
+	if (*ppos)
+		return -EINVAL;
+
+	kbuf = kmalloc(count, GFP_KERNEL);
+	if (!kbuf)
+		return -ENOMEM;
+
+	ret = simple_write_to_buffer(kbuf, count, ppos, buf, count);
+	if (ret < 0)
+		goto out;
+
+	schemes = str_to_schemes(kbuf, ret, &nr_schemes);
+
+	mutex_lock(&ctx->kdamond_lock);
+	if (ctx->kdamond) {
+		ret = -EBUSY;
+		goto unlock_out;
+	}
+
+	err = damon_set_schemes(ctx, schemes, nr_schemes);
+	if (err)
+		ret = err;
+	else
+		nr_schemes = 0;
+unlock_out:
+	mutex_unlock(&ctx->kdamond_lock);
+	free_schemes_arr(schemes, nr_schemes);
+out:
+	kfree(kbuf);
+	return ret;
+}
+
 static ssize_t damon_sprint_pids(struct damon_ctx *ctx, char *buf, ssize_t len)
 {
 	struct damon_task *t;
@@ -1664,6 +1828,12 @@ static const struct file_operations pids_fops = {
 	.write = debugfs_pids_write,
 };
 
+static const struct file_operations schemes_fops = {
+	.owner = THIS_MODULE,
+	.read = debugfs_schemes_read,
+	.write = debugfs_schemes_write,
+};
+
 static const struct file_operations record_fops = {
 	.owner = THIS_MODULE,
 	.read = debugfs_record_read,
@@ -1680,10 +1850,10 @@ static struct dentry *debugfs_root;
 
 static int __init damon_debugfs_init(void)
 {
-	const char * const file_names[] = {"attrs", "record",
+	const char * const file_names[] = {"attrs", "record", "schemes",
 		"pids", "monitor_on"};
 	const struct file_operations *fops[] = {&attrs_fops, &record_fops,
-		&pids_fops, &monitor_on_fops};
+		&schemes_fops, &pids_fops, &monitor_on_fops};
 	int i;
 
 	debugfs_root = debugfs_create_dir("damon", NULL);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC v8 5/8] mm/damon-test: Add kunit test case for regions age accounting
  2020-05-12 11:53 [RFC v8 0/8] Implement Data Access Monitoring-based Memory Operation Schemes SeongJae Park
                   ` (3 preceding siblings ...)
  2020-05-12 11:53 ` [RFC v8 4/8] mm/damon/schemes: Implement a debugfs interface SeongJae Park
@ 2020-05-12 11:53 ` SeongJae Park
  2020-05-12 11:53 ` [RFC v8 6/8] mm/damon/selftests: Add 'schemes' debugfs tests SeongJae Park
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2020-05-12 11:53 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, aarcange, acme,
	alexander.shishkin, amit, benh, brendan.d.gregg, brendanhiggins,
	cai, colin.king, corbet, dwmw, irogers, jolsa, kirill,
	mark.rutland, mgorman, minchan, mingo, namhyung, peterz, rdunlap,
	riel, rientjes, rostedt, sblbir, shakeelb, shuah, sj38.park, snu,
	vbabka, vdavydov.dev, yang.shi, ying.huang, linux-damon,
	linux-mm, linux-doc, linux-kernel

From: SeongJae Park <sjpark@amazon.de>

After merges of regions, each region should know their last shape in
proper way to measure the changes from the last modification and reset
the age if the changes are significant.  This commit adds kunit test
cases checking whether the regions are knowing their last shape properly
after merges of regions.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
---
 mm/damon-test.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mm/damon-test.h b/mm/damon-test.h
index 851c2062a480..c7a7f3b9b57f 100644
--- a/mm/damon-test.h
+++ b/mm/damon-test.h
@@ -551,6 +551,8 @@ static void damon_test_merge_regions_of(struct kunit *test)
 
 	unsigned long saddrs[] = {0, 114, 130, 156, 170};
 	unsigned long eaddrs[] = {112, 130, 156, 170, 230};
+	unsigned long lsa[] = {0, 114, 130, 156, 184};
+	unsigned long lea[] = {100, 122, 156, 170, 230};
 	int i;
 
 	t = damon_new_task(42);
@@ -567,6 +569,9 @@ static void damon_test_merge_regions_of(struct kunit *test)
 		r = __nth_region_of(t, i);
 		KUNIT_EXPECT_EQ(test, r->vm_start, saddrs[i]);
 		KUNIT_EXPECT_EQ(test, r->vm_end, eaddrs[i]);
+		KUNIT_EXPECT_EQ(test, r->last_vm_start, lsa[i]);
+		KUNIT_EXPECT_EQ(test, r->last_vm_end, lea[i]);
+
 	}
 	damon_free_task(t);
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC v8 6/8] mm/damon/selftests: Add 'schemes' debugfs tests
  2020-05-12 11:53 [RFC v8 0/8] Implement Data Access Monitoring-based Memory Operation Schemes SeongJae Park
                   ` (4 preceding siblings ...)
  2020-05-12 11:53 ` [RFC v8 5/8] mm/damon-test: Add kunit test case for regions age accounting SeongJae Park
@ 2020-05-12 11:53 ` SeongJae Park
  2020-05-12 11:53 ` [RFC v8 7/8] damon/tools: Support more human friendly 'schemes' control SeongJae Park
  2020-05-12 11:53 ` [RFC v8 8/8] Documentation/admin-guide/mm: Document DAMON-based operation schemes SeongJae Park
  7 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2020-05-12 11:53 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, aarcange, acme,
	alexander.shishkin, amit, benh, brendan.d.gregg, brendanhiggins,
	cai, colin.king, corbet, dwmw, irogers, jolsa, kirill,
	mark.rutland, mgorman, minchan, mingo, namhyung, peterz, rdunlap,
	riel, rientjes, rostedt, sblbir, shakeelb, shuah, sj38.park, snu,
	vbabka, vdavydov.dev, yang.shi, ying.huang, linux-damon,
	linux-mm, linux-doc, linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit adds simple selftets for 'schemes' debugfs file of DAMON.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 .../testing/selftests/damon/debugfs_attrs.sh  | 29 +++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/tools/testing/selftests/damon/debugfs_attrs.sh b/tools/testing/selftests/damon/debugfs_attrs.sh
index d5188b0f71b1..4aeb2037a67e 100755
--- a/tools/testing/selftests/damon/debugfs_attrs.sh
+++ b/tools/testing/selftests/damon/debugfs_attrs.sh
@@ -97,6 +97,35 @@ fi
 
 echo $ORIG_CONTENT > $file
 
+# Test schemes file
+file="$DBGFS/schemes"
+
+ORIG_CONTENT=$(cat $file)
+echo "1 2 3 4 5 6 3" > $file
+if [ $? -ne 0 ]
+then
+	echo "$file write fail"
+	echo $ORIG_CONTENT > $file
+	exit 1
+fi
+
+echo "1 2
+3 4 5 6 3" > $file
+if [ $? -eq 0 ]
+then
+	echo "$file multi line write success (expected fail)"
+	echo $ORIG_CONTENT > $file
+	exit 1
+fi
+
+echo > $file
+if [ $? -ne 0 ]
+then
+	echo "$file empty string writing fail"
+	echo $ORIG_CONTENT > $file
+	exit 1
+fi
+
 # Test pids file
 file="$DBGFS/pids"
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC v8 7/8] damon/tools: Support more human friendly 'schemes' control
  2020-05-12 11:53 [RFC v8 0/8] Implement Data Access Monitoring-based Memory Operation Schemes SeongJae Park
                   ` (5 preceding siblings ...)
  2020-05-12 11:53 ` [RFC v8 6/8] mm/damon/selftests: Add 'schemes' debugfs tests SeongJae Park
@ 2020-05-12 11:53 ` SeongJae Park
  2020-05-12 11:53 ` [RFC v8 8/8] Documentation/admin-guide/mm: Document DAMON-based operation schemes SeongJae Park
  7 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2020-05-12 11:53 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, aarcange, acme,
	alexander.shishkin, amit, benh, brendan.d.gregg, brendanhiggins,
	cai, colin.king, corbet, dwmw, irogers, jolsa, kirill,
	mark.rutland, mgorman, minchan, mingo, namhyung, peterz, rdunlap,
	riel, rientjes, rostedt, sblbir, shakeelb, shuah, sj38.park, snu,
	vbabka, vdavydov.dev, yang.shi, ying.huang, linux-damon,
	linux-mm, linux-doc, linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit implements 'schemes' subcommand of the damon userspace tool.
It can be used to describe and apply the data access monitoring-based
operation schemes in more human friendly fashion.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 tools/damon/_convert_damos.py | 127 ++++++++++++++++++++++++++++++
 tools/damon/_damon.py         | 143 ++++++++++++++++++++++++++++++++++
 tools/damon/damo              |   7 ++
 tools/damon/record.py         | 135 +++-----------------------------
 tools/damon/schemes.py        | 105 +++++++++++++++++++++++++
 5 files changed, 394 insertions(+), 123 deletions(-)
 create mode 100755 tools/damon/_convert_damos.py
 create mode 100644 tools/damon/_damon.py
 create mode 100644 tools/damon/schemes.py

diff --git a/tools/damon/_convert_damos.py b/tools/damon/_convert_damos.py
new file mode 100755
index 000000000000..709d0991899c
--- /dev/null
+++ b/tools/damon/_convert_damos.py
@@ -0,0 +1,127 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+"""
+Change human readable data access monitoring-based operation schemes to the low
+level input for the '<debugfs>/damon/schemes' file.  Below is an example of the
+schemes written in the human readable format:
+
+    # format is:
+    # <min/max size> <min/max frequency (0-99)> <min/max age> <action>
+    #
+    # B/K/M/G/T for Bytes/KiB/MiB/GiB/TiB
+    # us/ms/s/m/h/d for micro-seconds/milli-seconds/seconds/minutes/hours/days
+    # 'null' means zero for size and age.
+
+    # if a region keeps a high access frequency for more than 100ms, put the
+    # region on the head of the LRU list (call madvise() with MADV_WILLNEED).
+    null    null    80      null    100ms   0s      willneed
+
+    # if a region keeps a low access frequency for more than 200ms and less
+    # than one hour, put the # region on the tail of the LRU list (call
+    # madvise() with MADV_COLD).
+    0B      0B      10      20      200ms   1h cold
+
+    # if a region keeps a very low access frequency for more than 1 minute,
+    # swap out the region immediately (call madvise() with MADV_PAGEOUT).
+    0B      null    0       10      60s     0s pageout
+
+    # if a region of a size bigger than 2MiB keeps a very high access frequency
+    # for more than 100ms, let the region to use huge pages (call madvise()
+    # with MADV_HUGEPAGE).
+    2M      null    90      99      100ms   0s hugepage
+
+    # If a regions of a size bigger than 2MiB keeps small access frequency for
+    # more than 100ms, avoid the region using huge pages (call madvise() with
+    # MADV_NOHUGEPAGE).
+    2M      null    0       25      100ms   0s nohugepage
+"""
+
+import argparse
+
+unit_to_bytes = {'B': 1, 'K': 1024, 'M': 1024 * 1024, 'G': 1024 * 1024 * 1024,
+        'T': 1024 * 1024 * 1024 * 1024}
+
+def text_to_bytes(txt):
+    if txt == 'null':
+        return 0
+    unit = txt[-1]
+    number = int(txt[:-1])
+    return number * unit_to_bytes[unit]
+
+unit_to_usecs = {'us': 1, 'ms': 1000, 's': 1000 * 1000, 'm': 60 * 1000 * 1000,
+        'h': 60 * 60 * 1000 * 1000, 'd': 24 * 60 * 60 * 1000 * 1000}
+
+def text_to_us(txt):
+    if txt == 'null':
+        return 0
+    unit = txt[-2:]
+    if unit in ['us', 'ms']:
+        number = int(txt[:-2])
+    else:
+        unit = txt[-1]
+        number = int(txt[:-1])
+    return number * unit_to_usecs[unit]
+
+damos_action_to_int = {'DAMOS_WILLNEED': 0, 'DAMOS_COLD': 1,
+        'DAMOS_PAGEOUT': 2, 'DAMOS_HUGEPAGE': 3, 'DAMOS_NOHUGEPAGE': 4}
+
+def text_to_damos_action(txt):
+    return damos_action_to_int['DAMOS_' + txt.upper()]
+
+def text_to_nr_accesses(txt, max_nr_accesses):
+    if txt == 'null':
+        return 0
+    return int(int(txt) * max_nr_accesses / 100)
+
+def debugfs_scheme(line, sample_interval, aggr_interval):
+    fields = line.split()
+    if len(fields) != 7:
+        print('wrong input line: %s' % line)
+        exit(1)
+
+    limit_nr_accesses = aggr_interval / sample_interval
+    try:
+        min_sz = text_to_bytes(fields[0])
+        max_sz = text_to_bytes(fields[1])
+        min_nr_accesses = text_to_nr_accesses(fields[2], limit_nr_accesses)
+        max_nr_accesses = text_to_nr_accesses(fields[3], limit_nr_accesses)
+        min_age = text_to_us(fields[4]) / aggr_interval
+        max_age = text_to_us(fields[5]) / aggr_interval
+        action = text_to_damos_action(fields[6])
+    except:
+        print('wrong input field')
+        raise
+    return '%d\t%d\t%d\t%d\t%d\t%d\t%d' % (min_sz, max_sz, min_nr_accesses,
+            max_nr_accesses, min_age, max_age, action)
+
+def convert(schemes_file, sample_interval, aggr_interval):
+    lines = []
+    with open(schemes_file, 'r') as f:
+        for line in f:
+            if line.startswith('#'):
+                continue
+            line = line.strip()
+            if line == '':
+                continue
+            lines.append(debugfs_scheme(line, sample_interval, aggr_interval))
+    return '\n'.join(lines)
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument('input', metavar='<file>',
+            help='input file describing the schemes')
+    parser.add_argument('-s', '--sample', metavar='<interval>', type=int,
+            default=5000, help='sampling interval (us)')
+    parser.add_argument('-a', '--aggr', metavar='<interval>', type=int,
+            default=100000, help='aggregation interval (us)')
+    args = parser.parse_args()
+
+    schemes_file = args.input
+    sample_interval = args.sample
+    aggr_interval = args.aggr
+
+    print(convert(schemes_file, sample_interval, aggr_interval))
+
+if __name__ == '__main__':
+    main()
diff --git a/tools/damon/_damon.py b/tools/damon/_damon.py
new file mode 100644
index 000000000000..0a703ec7471a
--- /dev/null
+++ b/tools/damon/_damon.py
@@ -0,0 +1,143 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+"""
+Contains core functions for DAMON debugfs control.
+"""
+
+import os
+import subprocess
+
+debugfs_attrs = None
+debugfs_record = None
+debugfs_schemes = None
+debugfs_pids = None
+debugfs_monitor_on = None
+
+def set_target_pid(pid):
+    return subprocess.call('echo %s > %s' % (pid, debugfs_pids), shell=True,
+            executable='/bin/bash')
+
+def turn_damon(on_off):
+    return subprocess.call("echo %s > %s" % (on_off, debugfs_monitor_on),
+            shell=True, executable="/bin/bash")
+
+def is_damon_running():
+    with open(debugfs_monitor_on, 'r') as f:
+        return f.read().strip() == 'on'
+
+class Attrs:
+    sample_interval = None
+    aggr_interval = None
+    regions_update_interval = None
+    min_nr_regions = None
+    max_nr_regions = None
+    rbuf_len = None
+    rfile_path = None
+    schemes = None
+
+    def __init__(self, s, a, r, n, x, l, f, c):
+        self.sample_interval = s
+        self.aggr_interval = a
+        self.regions_update_interval = r
+        self.min_nr_regions = n
+        self.max_nr_regions = x
+        self.rbuf_len = l
+        self.rfile_path = f
+        self.schemes = c
+
+    def __str__(self):
+        return "%s %s %s %s %s %s %s\n%s" % (self.sample_interval,
+                self.aggr_interval, self.regions_update_interval,
+                self.min_nr_regions, self.max_nr_regions, self.rbuf_len,
+                self.rfile_path, self.schemes)
+
+    def attr_str(self):
+        return "%s %s %s %s %s " % (self.sample_interval, self.aggr_interval,
+                self.regions_update_interval, self.min_nr_regions,
+                self.max_nr_regions)
+
+    def record_str(self):
+        return '%s %s ' % (self.rbuf_len, self.rfile_path)
+
+    def apply(self):
+        ret = subprocess.call('echo %s > %s' % (self.attr_str(), debugfs_attrs),
+                shell=True, executable='/bin/bash')
+        if ret:
+            return ret
+        ret = subprocess.call('echo %s > %s' % (self.record_str(),
+            debugfs_record), shell=True, executable='/bin/bash')
+        if ret:
+            return ret
+        return subprocess.call('echo %s > %s' % (
+            self.schemes.replace('\n', ' '), debugfs_schemes), shell=True,
+            executable='/bin/bash')
+
+def current_attrs():
+    with open(debugfs_attrs, 'r') as f:
+        attrs = f.read().split()
+    attrs = [int(x) for x in attrs]
+
+    with open(debugfs_record, 'r') as f:
+        rattrs = f.read().split()
+    attrs.append(int(rattrs[0]))
+    attrs.append(rattrs[1])
+
+    with open(debugfs_schemes, 'r') as f:
+        schemes = f.read()
+    attrs.append(schemes)
+
+    return Attrs(*attrs)
+
+def chk_update_debugfs(debugfs):
+    global debugfs_attrs
+    global debugfs_record
+    global debugfs_schemes
+    global debugfs_pids
+    global debugfs_monitor_on
+
+    debugfs_damon = os.path.join(debugfs, 'damon')
+    debugfs_attrs = os.path.join(debugfs_damon, 'attrs')
+    debugfs_record = os.path.join(debugfs_damon, 'record')
+    debugfs_schemes = os.path.join(debugfs_damon, 'schemes')
+    debugfs_pids = os.path.join(debugfs_damon, 'pids')
+    debugfs_monitor_on = os.path.join(debugfs_damon, 'monitor_on')
+
+    if not os.path.isdir(debugfs_damon):
+        print("damon debugfs dir (%s) not found", debugfs_damon)
+        exit(1)
+
+    for f in [debugfs_attrs, debugfs_record, debugfs_schemes, debugfs_pids,
+            debugfs_monitor_on]:
+        if not os.path.isfile(f):
+            print("damon debugfs file (%s) not found" % f)
+            exit(1)
+
+def cmd_args_to_attrs(args):
+    "Generate attributes with specified arguments"
+    sample_interval = args.sample
+    aggr_interval = args.aggr
+    regions_update_interval = args.updr
+    min_nr_regions = args.minr
+    max_nr_regions = args.maxr
+    rbuf_len = args.rbuf
+    if not os.path.isabs(args.out):
+        args.out = os.path.join(os.getcwd(), args.out)
+    rfile_path = args.out
+    schemes = args.schemes
+    return Attrs(sample_interval, aggr_interval, regions_update_interval,
+            min_nr_regions, max_nr_regions, rbuf_len, rfile_path, schemes)
+
+def set_attrs_argparser(parser):
+    parser.add_argument('-d', '--debugfs', metavar='<debugfs>', type=str,
+            default='/sys/kernel/debug', help='debugfs mounted path')
+    parser.add_argument('-s', '--sample', metavar='<interval>', type=int,
+            default=5000, help='sampling interval')
+    parser.add_argument('-a', '--aggr', metavar='<interval>', type=int,
+            default=100000, help='aggregate interval')
+    parser.add_argument('-u', '--updr', metavar='<interval>', type=int,
+            default=1000000, help='regions update interval')
+    parser.add_argument('-n', '--minr', metavar='<# regions>', type=int,
+            default=10, help='minimal number of regions')
+    parser.add_argument('-m', '--maxr', metavar='<# regions>', type=int,
+            default=1000, help='maximum number of regions')
diff --git a/tools/damon/damo b/tools/damon/damo
index 58e1099ae5fc..ce7180069bef 100755
--- a/tools/damon/damo
+++ b/tools/damon/damo
@@ -5,6 +5,7 @@ import argparse
 
 import record
 import report
+import schemes
 
 class SubCmdHelpFormatter(argparse.RawDescriptionHelpFormatter):
     def _format_action(self, action):
@@ -25,6 +26,10 @@ parser_record = subparser.add_parser('record',
         help='record data accesses of the given target processes')
 record.set_argparser(parser_record)
 
+parser_schemes = subparser.add_parser('schemes',
+        help='apply operation schemes to the given target process')
+schemes.set_argparser(parser_schemes)
+
 parser_report = subparser.add_parser('report',
         help='report the recorded data accesses in the specified form')
 report.set_argparser(parser_report)
@@ -33,5 +38,7 @@ args = parser.parse_args()
 
 if args.command == 'record':
     record.main(args)
+elif args.command == 'schemes':
+    schemes.main(args)
 elif args.command == 'report':
     report.main(args)
diff --git a/tools/damon/record.py b/tools/damon/record.py
index a547d479a103..3bbf7b8359da 100644
--- a/tools/damon/record.py
+++ b/tools/damon/record.py
@@ -6,28 +6,12 @@ Record data access patterns of the target process.
 """
 
 import argparse
-import copy
 import os
 import signal
 import subprocess
 import time
 
-debugfs_attrs = None
-debugfs_record = None
-debugfs_pids = None
-debugfs_monitor_on = None
-
-def set_target_pid(pid):
-    return subprocess.call('echo %s > %s' % (pid, debugfs_pids), shell=True,
-            executable='/bin/bash')
-
-def turn_damon(on_off):
-    return subprocess.call("echo %s > %s" % (on_off, debugfs_monitor_on),
-            shell=True, executable="/bin/bash")
-
-def is_damon_running():
-    with open(debugfs_monitor_on, 'r') as f:
-        return f.read().strip() == 'on'
+import _damon
 
 def do_record(target, is_target_cmd, attrs, old_attrs):
     if os.path.isfile(attrs.rfile_path):
@@ -36,93 +20,29 @@ def do_record(target, is_target_cmd, attrs, old_attrs):
     if attrs.apply():
         print('attributes (%s) failed to be applied' % attrs)
         cleanup_exit(old_attrs, -1)
-    print('# damon attrs: %s' % attrs)
+    print('# damon attrs: %s %s' % (attrs.attr_str(), attrs.record_str()))
     if is_target_cmd:
         p = subprocess.Popen(target, shell=True, executable='/bin/bash')
         target = p.pid
-    if set_target_pid(target):
+    if _damon.set_target_pid(target):
         print('pid setting (%s) failed' % target)
         cleanup_exit(old_attrs, -2)
-    if turn_damon('on'):
+    if _damon.turn_damon('on'):
         print('could not turn on damon' % target)
         cleanup_exit(old_attrs, -3)
     if is_target_cmd:
         p.wait()
     while True:
         # damon will turn it off by itself if the target tasks are terminated.
-        if not is_damon_running():
+        if not _damon.is_damon_running():
             break
         time.sleep(1)
 
     cleanup_exit(old_attrs, 0)
 
-class Attrs:
-    sample_interval = None
-    aggr_interval = None
-    regions_update_interval = None
-    min_nr_regions = None
-    max_nr_regions = None
-    rbuf_len = None
-    rfile_path = None
-
-    def __init__(self, s, a, r, n, x, l, f):
-        self.sample_interval = s
-        self.aggr_interval = a
-        self.regions_update_interval = r
-        self.min_nr_regions = n
-        self.max_nr_regions = x
-        self.rbuf_len = l
-        self.rfile_path = f
-
-    def __str__(self):
-        return "%s %s %s %s %s %s %s" % (self.sample_interval, self.aggr_interval,
-                self.regions_update_interval, self.min_nr_regions,
-                self.max_nr_regions, self.rbuf_len, self.rfile_path)
-
-    def attr_str(self):
-        return "%s %s %s %s %s " % (self.sample_interval, self.aggr_interval,
-                self.regions_update_interval, self.min_nr_regions,
-                self.max_nr_regions)
-
-    def record_str(self):
-        return '%s %s ' % (self.rbuf_len, self.rfile_path)
-
-    def apply(self):
-        ret = subprocess.call('echo %s > %s' % (self.attr_str(), debugfs_attrs),
-                shell=True, executable='/bin/bash')
-        if ret:
-            return ret
-        return subprocess.call('echo %s > %s' % (self.record_str(),
-            debugfs_record), shell=True, executable='/bin/bash')
-
-def current_attrs():
-    with open(debugfs_attrs, 'r') as f:
-        attrs = f.read().split()
-    attrs = [int(x) for x in attrs]
-
-    with open(debugfs_record, 'r') as f:
-        rattrs = f.read().split()
-    attrs.append(int(rattrs[0]))
-    attrs.append(rattrs[1])
-    return Attrs(*attrs)
-
-def cmd_args_to_attrs(args):
-    "Generate attributes with specified arguments"
-    sample_interval = args.sample
-    aggr_interval = args.aggr
-    regions_update_interval = args.updr
-    min_nr_regions = args.minr
-    max_nr_regions = args.maxr
-    rbuf_len = args.rbuf
-    if not os.path.isabs(args.out):
-        args.out = os.path.join(os.getcwd(), args.out)
-    rfile_path = args.out
-    return Attrs(sample_interval, aggr_interval, regions_update_interval,
-            min_nr_regions, max_nr_regions, rbuf_len, rfile_path)
-
 def cleanup_exit(orig_attrs, exit_code):
-    if is_damon_running():
-        if turn_damon('off'):
+    if _damon.is_damon_running():
+        if _damon.turn_damon('off'):
             print('failed to turn damon off!')
     if orig_attrs:
         if orig_attrs.apply():
@@ -133,51 +53,19 @@ def sighandler(signum, frame):
     print('\nsignal %s received' % signum)
     cleanup_exit(orig_attrs, signum)
 
-def chk_update_debugfs(debugfs):
-    global debugfs_attrs
-    global debugfs_record
-    global debugfs_pids
-    global debugfs_monitor_on
-
-    debugfs_damon = os.path.join(debugfs, 'damon')
-    debugfs_attrs = os.path.join(debugfs_damon, 'attrs')
-    debugfs_record = os.path.join(debugfs_damon, 'record')
-    debugfs_pids = os.path.join(debugfs_damon, 'pids')
-    debugfs_monitor_on = os.path.join(debugfs_damon, 'monitor_on')
-
-    if not os.path.isdir(debugfs_damon):
-        print("damon debugfs dir (%s) not found", debugfs_damon)
-        exit(1)
-
-    for f in [debugfs_attrs, debugfs_record, debugfs_pids, debugfs_monitor_on]:
-        if not os.path.isfile(f):
-            print("damon debugfs file (%s) not found" % f)
-            exit(1)
-
 def chk_permission():
     if os.geteuid() != 0:
         print("Run as root")
         exit(1)
 
 def set_argparser(parser):
+    _damon.set_attrs_argparser(parser)
     parser.add_argument('target', type=str, metavar='<target>',
             help='the target command or the pid to record')
-    parser.add_argument('-s', '--sample', metavar='<interval>', type=int,
-            default=5000, help='sampling interval')
-    parser.add_argument('-a', '--aggr', metavar='<interval>', type=int,
-            default=100000, help='aggregate interval')
-    parser.add_argument('-u', '--updr', metavar='<interval>', type=int,
-            default=1000000, help='regions update interval')
-    parser.add_argument('-n', '--minr', metavar='<# regions>', type=int,
-            default=10, help='minimal number of regions')
-    parser.add_argument('-m', '--maxr', metavar='<# regions>', type=int,
-            default=1000, help='maximum number of regions')
     parser.add_argument('-l', '--rbuf', metavar='<len>', type=int,
             default=1024*1024, help='length of record result buffer')
     parser.add_argument('-o', '--out', metavar='<file path>', type=str,
             default='damon.data', help='output file path')
-    parser.add_argument('-d', '--debugfs', metavar='<debugfs>', type=str,
-            default='/sys/kernel/debug', help='debugfs mounted path')
 
 def main(args=None):
     global orig_attrs
@@ -187,13 +75,14 @@ def main(args=None):
         args = parser.parse_args()
 
     chk_permission()
-    chk_update_debugfs(args.debugfs)
+    _damon.chk_update_debugfs(args.debugfs)
 
     signal.signal(signal.SIGINT, sighandler)
     signal.signal(signal.SIGTERM, sighandler)
-    orig_attrs = current_attrs()
+    orig_attrs = _damon.current_attrs()
 
-    new_attrs = cmd_args_to_attrs(args)
+    args.schemes = ''
+    new_attrs = _damon.cmd_args_to_attrs(args)
     target = args.target
 
     target_fields = target.split()
diff --git a/tools/damon/schemes.py b/tools/damon/schemes.py
new file mode 100644
index 000000000000..ca1551fe5696
--- /dev/null
+++ b/tools/damon/schemes.py
@@ -0,0 +1,105 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+"""
+Apply given operation schemes to the target process.
+"""
+
+import argparse
+import os
+import signal
+import subprocess
+import time
+
+import _convert_damos
+import _damon
+
+def run_damon(target, is_target_cmd, attrs, old_attrs):
+    if os.path.isfile(attrs.rfile_path):
+        os.rename(attrs.rfile_path, attrs.rfile_path + '.old')
+
+    if attrs.apply():
+        print('attributes (%s) failed to be applied' % attrs)
+        cleanup_exit(old_attrs, -1)
+    print('# damon attrs: %s %s' % (attrs.attr_str(), attrs.record_str()))
+    for line in attrs.schemes.split('\n'):
+        print('# scheme: %s' % line)
+    if is_target_cmd:
+        p = subprocess.Popen(target, shell=True, executable='/bin/bash')
+        target = p.pid
+    if _damon.set_target_pid(target):
+        print('pid setting (%s) failed' % target)
+        cleanup_exit(old_attrs, -2)
+    if _damon.turn_damon('on'):
+        print('could not turn on damon' % target)
+        cleanup_exit(old_attrs, -3)
+    if is_target_cmd:
+        p.wait()
+    while True:
+        # damon will turn it off by itself if the target tasks are terminated.
+        if not _damon.is_damon_running():
+            break
+        time.sleep(1)
+
+    cleanup_exit(old_attrs, 0)
+
+def cleanup_exit(orig_attrs, exit_code):
+    if _damon.is_damon_running():
+        if _damon.turn_damon('off'):
+            print('failed to turn damon off!')
+    if orig_attrs:
+        if orig_attrs.apply():
+            print('original attributes (%s) restoration failed!' % orig_attrs)
+    exit(exit_code)
+
+def sighandler(signum, frame):
+    print('\nsignal %s received' % signum)
+    cleanup_exit(orig_attrs, signum)
+
+def chk_permission():
+    if os.geteuid() != 0:
+        print("Run as root")
+        exit(1)
+
+def set_argparser(parser):
+    _damon.set_attrs_argparser(parser)
+    parser.add_argument('target', type=str, metavar='<target>',
+            help='the target command or the pid to record')
+    parser.add_argument('-c', '--schemes', metavar='<file>', type=str,
+            default='damon.schemes',
+            help='data access monitoring-based operation schemes')
+
+def main(args=None):
+    global orig_attrs
+    if not args:
+        parser = argparse.ArgumentParser()
+        set_argparser(parser)
+        args = parser.parse_args()
+
+    chk_permission()
+    _damon.chk_update_debugfs(args.debugfs)
+
+    signal.signal(signal.SIGINT, sighandler)
+    signal.signal(signal.SIGTERM, sighandler)
+    orig_attrs = _damon.current_attrs()
+
+    args.rbuf = 0
+    args.out = 'null'
+    args.schemes = _convert_damos.convert(args.schemes, args.sample, args.aggr)
+    new_attrs = _damon.cmd_args_to_attrs(args)
+    target = args.target
+
+    target_fields = target.split()
+    if not subprocess.call('which %s > /dev/null' % target_fields[0],
+            shell=True, executable='/bin/bash'):
+        run_damon(target, True, new_attrs, orig_attrs)
+    else:
+        try:
+            pid = int(target)
+        except:
+            print('target \'%s\' is neither a command, nor a pid' % target)
+            exit(1)
+        run_damon(target, False, new_attrs, orig_attrs)
+
+if __name__ == '__main__':
+    main()
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC v8 8/8] Documentation/admin-guide/mm: Document DAMON-based operation schemes
  2020-05-12 11:53 [RFC v8 0/8] Implement Data Access Monitoring-based Memory Operation Schemes SeongJae Park
                   ` (6 preceding siblings ...)
  2020-05-12 11:53 ` [RFC v8 7/8] damon/tools: Support more human friendly 'schemes' control SeongJae Park
@ 2020-05-12 11:53 ` SeongJae Park
  7 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2020-05-12 11:53 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, aarcange, acme,
	alexander.shishkin, amit, benh, brendan.d.gregg, brendanhiggins,
	cai, colin.king, corbet, dwmw, irogers, jolsa, kirill,
	mark.rutland, mgorman, minchan, mingo, namhyung, peterz, rdunlap,
	riel, rientjes, rostedt, sblbir, shakeelb, shuah, sj38.park, snu,
	vbabka, vdavydov.dev, yang.shi, ying.huang, linux-damon,
	linux-mm, linux-doc, linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit documents DAMON-based operation schemes in the DAMON
document.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 .../admin-guide/mm/data_access_monitor.rst    | 100 +++++++++++++++++-
 1 file changed, 98 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/mm/data_access_monitor.rst b/Documentation/admin-guide/mm/data_access_monitor.rst
index 915956aa1065..d4a48bc63400 100644
--- a/Documentation/admin-guide/mm/data_access_monitor.rst
+++ b/Documentation/admin-guide/mm/data_access_monitor.rst
@@ -182,8 +182,8 @@ only for each of a user-specified time interval (``regions update interval``).
 ``debugfs`` Interface
 =====================
 
-DAMON exports four files, ``attrs``, ``pids``, ``record``, and ``monitor_on``
-under its debugfs directory, ``<debugfs>/damon/``.
+DAMON exports five files, ``attrs``, ``pids``, ``record``, ``schemes`` and
+``monitor_on`` under its debugfs directory, ``<debugfs>/damon/``.
 
 Attributes
 ----------
@@ -227,6 +227,46 @@ be 4 KiB and the result to be saved in ``/damon.data``::
     # cat record
     4096 /damon.data
 
+Schemes
+-------
+
+For usual DAMON-based data access awared memory management optimizations, users
+would simply want the system to apply a memory management action to a memory
+region of a specific size having a specific access frequency for a specific
+time.  DAMON receives such formalized operation schemes from user and applies
+those to the target processes.
+
+Users can get and set the schemes by reading from and writing to ``schemes``
+debugfs file.  To the file, each of the schemes should represented in each line
+in below form:
+
+    min-size max-size min-acc max-acc min-age max-age action
+
+Bytes for size of regions (``min-size`` and ``max-size``), number of monitored
+accesses per aggregate interval for access frequency (``min-acc`` and
+``max-acc``), number of aggregate intervals for age of regions (``min-age`` and
+``max-age``), and predefined integer for memory management actions should be
+used.  ``madvise()`` system call with specific hint are currently available.
+The numbers and their representing memory hint are as below::
+
+    0   MADV_WILLNEED
+    1   MADV_COLD
+    2   MADV_PAGEOUT
+    3   MADV_HUGEPAGE
+    4   MADV_NOHUGEPAGE
+
+You can disable schemes by simply writing empty string to the file.  For
+example, below commands applies a scheme saying “If a memory region larger than
+4 KiB (4096 0) is showing less than 5 accesses per aggregate interval (0 5) for
+more than 5 aggregate interval (5 0), page out the region (2)”, check the
+entered scheme again, and finally remove the scheme.::
+
+    # cd <debugfs>/damon
+    # echo "4096 0 0 5 5 0 2" > schemes
+    # cat schemes
+    4096 0 0 5 5 0 2
+    # echo > schemes
+
 Turning On/Off
 --------------
 
@@ -426,3 +466,59 @@ made.
 
 Users can specify the resolution of the distribution (``--range``).  It also
 supports 'gnuplot' based simple visualization (``--plot``) of the distribution.
+
+
+DAMON-based Operation Schemes
+-----------------------------
+
+The ``schemes`` subcommand applies given data access pattern based operation
+schemes to the given target processes.  The target processes are described
+using the command to spawn the processes or pid of running processes, as
+similar to that of ``record`` subcommand.  Meanwhile, the operation schemes
+should be saved in a text file using below format and passed to ``schemes``
+subcommand via ``--schemes`` option.
+
+    min-size max-size min-acc max-acc min-age max-age action
+
+The format also supports comments, several units for size and age of regions,
+and human readable action names.  Currently supported operation actions are
+WILLNEED, COLD, PAGEOUT, HUGEPAGE, and NOHUGEPAGE.  Each of the actions works
+as same to that of madvise() system call.  Below is an example schemes file.
+Please also note that 0 for max values means infinite.::
+
+    # format is:
+    # <min/max size> <min/max frequency (0-99)> <min/max age> <action>
+    #
+    # B/K/M/G/T for Bytes/KiB/MiB/GiB/TiB
+    # us/ms/s/m/h/d for micro-seconds/milli-seconds/seconds/minutes/hours/days
+    # 'null' means zero for size and age.
+
+    # if a region keeps a high access frequency for more than 100ms, put the
+    # region on the head of the LRU list (call madvise() with MADV_WILLNEED).
+    null    null    80      null    100ms   0s      willneed
+
+    # if a region keeps a low access frequency for more than 200ms and less
+    # than one hour, put the # region on the tail of the LRU list (call
+    # madvise() with MADV_COLD).
+    0B      0B      10      20      200ms   1h cold
+
+    # if a region keeps a very low access frequency for more than 1 minute,
+    # swap out the region immediately (call madvise() with MADV_PAGEOUT).
+    0B      null    0       10      60s     0s pageout
+
+    # if a region of a size bigger than 2MiB keeps a very high access frequency
+    # for more than 100ms, let the region to use huge pages (call madvise()
+    # with MADV_HUGEPAGE).
+    2M      null    90      99      100ms   0s hugepage
+
+    # If a regions of a size bigger than 2MiB keeps small access frequency for
+    # more than 100ms, avoid the region using huge pages (call madvise() with
+    # MADV_NOHUGEPAGE).
+    2M      null    0       25      100ms   0s nohugepage
+
+For example, you can make a running process named 'foo' to use THP for memory
+regions keeping 2MB or larger size and having very high access frequency for
+more than 100 milliseconds using below commands::
+
+    $ echo "2M null 90 99 100ms 0s hugepage" > my_thp_scheme
+    $ ./damo schemes --schemes my_thp_scheme `pidof foo`
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-05-12 11:58 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-12 11:53 [RFC v8 0/8] Implement Data Access Monitoring-based Memory Operation Schemes SeongJae Park
2020-05-12 11:53 ` [RFC v8 1/8] mm/madvise: Export do_madvise() to external GPL modules SeongJae Park
2020-05-12 11:53 ` [RFC v8 2/8] mm/damon: Account age of target regions SeongJae Park
2020-05-12 11:53 ` [RFC v8 3/8] mm/damon: Implement data access monitoring-based operation schemes SeongJae Park
2020-05-12 11:53 ` [RFC v8 4/8] mm/damon/schemes: Implement a debugfs interface SeongJae Park
2020-05-12 11:53 ` [RFC v8 5/8] mm/damon-test: Add kunit test case for regions age accounting SeongJae Park
2020-05-12 11:53 ` [RFC v8 6/8] mm/damon/selftests: Add 'schemes' debugfs tests SeongJae Park
2020-05-12 11:53 ` [RFC v8 7/8] damon/tools: Support more human friendly 'schemes' control SeongJae Park
2020-05-12 11:53 ` [RFC v8 8/8] Documentation/admin-guide/mm: Document DAMON-based operation schemes SeongJae Park

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.