linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC v7 0/7] Implement Data Access Monitoring-based Memory Operation Schemes
@ 2020-04-29 12:45 SeongJae Park
  2020-04-29 12:45 ` [RFC v7 1/7] mm/madvise: Export do_madvise() to external GPL modules SeongJae Park
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: SeongJae Park @ 2020-04-29 12:45 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, aarcange, acme,
	alexander.shishkin, amit, benh, brendan.d.gregg, brendanhiggins,
	cai, colin.king, corbet, dwmw, irogers, jolsa, kirill,
	mark.rutland, mgorman, minchan, mingo, namhyung, peterz, rdunlap,
	riel, rientjes, rostedt, sblbir, shakeelb, shuah, sj38.park, snu,
	vbabka, vdavydov.dev, yang.shi, ying.huang, linux-damon,
	linux-mm, linux-doc, linux-kernel

From: SeongJae Park <sjpark@amazon.de>

DAMON[1] can be used as a primitive for data access awared memory management
optimizations.  That said, users who want such optimizations should run DAMON,
read the monitoring results, analyze it, plan a new memory management scheme,
and apply the new scheme by themselves.  Such efforts will be inevitable for
some complicated optimizations.

However, in many other cases, the users would simply want the system to apply a
memory management action to a memory region of a specific size having a
specific access frequency for a specific time.  For example, "page out a memory
region larger than 100 MiB keeping only rare accesses more than 2 minutes", or
"Do not use THP for a memory region larger than 2 MiB rarely accessed for more
than 1 seconds".

This RFC patchset makes DAMON to handle such data access monitoring-based
operation schemes.  With this change, users can do the data access aware
optimizations by simply specifying their schemes to DAMON.

[1] https://lore.kernel.org/linux-mm/20200406130938.14066-1-sjpark@amazon.com/


Evaluations
===========

Setup
-----

On my personal QEMU/KVM based virtual machine on an Intel i7 host machine
running Ubuntu 18.04, I measure runtime and consumed system memory while
running various realistic workloads with several configurations.  I use 13 and
12 workloads in PARSEC3[3] and SPLASH-2X[4] benchmark suites, respectively.  I
personally use another wrapper scripts[5] for setup and run of the workloads.
On top of this patchset, we also applied the DAMON-based operation schemes
patchset[6] for this evaluation.

Measurement
~~~~~~~~~~~

For the measurement of the amount of consumed memory in system global scope, I
drop caches before starting each of the workloads and monitor 'MemFree' in the
'/proc/meminfo' file.  To make results more stable, I repeat the runs 5 times
and average results.  You can get stdev, min, and max of the numbers among the
repeated runs in appendix below.

Configurations
~~~~~~~~~~~~~~

The configurations I use are as below.

orig: Linux v5.6 with 'madvise' THP policy
rec: 'orig' plus DAMON running with record feature
thp: same with 'orig', but use 'always' THP policy
ethp: 'orig' plus a DAMON operation scheme[6], 'efficient THP'
prcl: 'orig' plus a DAMON operation scheme, 'proactive reclaim[7]'

I use 'rec' for measurement of DAMON overheads to target workloads and system
memory.  The remaining configs including 'thp', 'ethp', and 'prcl' are for
measurement of DAMON monitoring accuracy.

'ethp' and 'prcl' is simple DAMON-based operation schemes developed for
proof of concepts of DAMON.  'ethp' reduces memory space waste of THP by using
DAMON for decision of promotions and demotion for huge pages, while 'prcl' is
as similar as the original work.  Those are implemented as below:

# format: <min/max size> <min/max frequency (0-100)> <min/max age> <action>
# ethp: Use huge pages if a region >2MB shows >5% access rate, use regular
# pages if a region >2MB shows <5% access rate for >1 second
2M null    5 null    null null    hugepage
2M null    null 5    1s null      nohugepage

# prcl: If a region >4KB shows <5% access rate for >5 seconds, page out.
4K null    null 5    500ms null      pageout

Note that both 'ethp' and 'prcl' are designed with my only straightforward
intuition, because those are for only proof of concepts and monitoring accuracy
of DAMON.  In other words, those are not for production.  For production use,
those should be tuned more.


[1] "Redis latency problems troubleshooting", https://redis.io/topics/latency
[2] "Disable Transparent Huge Pages (THP)",
    https://docs.mongodb.com/manual/tutorial/transparent-huge-pages/
[3] "The PARSEC Becnhmark Suite", https://parsec.cs.princeton.edu/index.htm
[4] "SPLASH-2x", https://parsec.cs.princeton.edu/parsec3-doc.htm#splash2x
[5] "parsec3_on_ubuntu", https://github.com/sjp38/parsec3_on_ubuntu
[6] "[RFC v4 0/7] Implement Data Access Monitoring-based Memory Operation
    Schemes",
    https://lore.kernel.org/linux-mm/20200303121406.20954-1-sjpark@amazon.com/
[7] "Proactively reclaiming idle memory", https://lwn.net/Articles/787611/


Results
-------

Below two tables show the measurement results.  The runtimes are in seconds
while the memory usages are in KiB.  Each configurations except 'orig' shows
its overhead relative to 'orig' in percent within parenthesises.

runtime                 orig     rec      (overhead) thp      (overhead) ethp     (overhead) prcl     (overhead)
parsec3/blackscholes    107.755  106.693  (-0.99)    106.408  (-1.25)    107.848  (0.09)     112.142  (4.07)
parsec3/bodytrack       79.603   79.110   (-0.62)    78.862   (-0.93)    79.577   (-0.03)    80.579   (1.23)
parsec3/canneal         139.588  139.148  (-0.31)    125.747  (-9.92)    130.833  (-6.27)    157.601  (12.90)
parsec3/dedup           11.923   11.860   (-0.53)    11.739   (-1.55)    11.931   (0.06)     13.090   (9.78)
parsec3/facesim         208.270  208.401  (0.06)     205.557  (-1.30)    206.114  (-1.04)    216.352  (3.88)
parsec3/ferret          190.247  190.540  (0.15)     191.056  (0.43)     190.492  (0.13)     193.026  (1.46)
parsec3/fluidanimate    210.495  212.142  (0.78)     210.075  (-0.20)    211.365  (0.41)     220.724  (4.86)
parsec3/freqmine        287.887  292.770  (1.70)     287.576  (-0.11)    289.190  (0.45)     296.266  (2.91)
parsec3/raytrace        117.887  119.385  (1.27)     118.781  (0.76)     118.572  (0.58)     129.831  (10.13)
parsec3/streamcluster   321.637  327.692  (1.88)     283.875  (-11.74)   291.699  (-9.31)    329.212  (2.36)
parsec3/swaptions       154.148  155.623  (0.96)     155.070  (0.60)     154.952  (0.52)     155.241  (0.71)
parsec3/vips            58.851   58.527   (-0.55)    58.396   (-0.77)    58.979   (0.22)     59.970   (1.90)
parsec3/x264            70.559   68.624   (-2.74)    66.662   (-5.52)    67.817   (-3.89)    71.065   (0.72)
splash2x/barnes         80.678   80.491   (-0.23)    74.135   (-8.11)    79.493   (-1.47)    98.688   (22.32)
splash2x/fft            33.565   33.434   (-0.39)    23.153   (-31.02)   31.181   (-7.10)    45.662   (36.04)
splash2x/lu_cb          85.536   85.391   (-0.17)    84.396   (-1.33)    86.323   (0.92)     89.000   (4.05)
splash2x/lu_ncb         92.899   92.830   (-0.07)    90.075   (-3.04)    93.566   (0.72)     95.603   (2.91)
splash2x/ocean_cp       44.529   44.741   (0.47)     43.049   (-3.32)    44.117   (-0.93)    57.652   (29.47)
splash2x/ocean_ncp      81.271   81.538   (0.33)     51.337   (-36.83)   62.990   (-22.49)   137.621  (69.34)
splash2x/radiosity      91.411   91.329   (-0.09)    90.889   (-0.57)    91.944   (0.58)     102.682  (12.33)
splash2x/radix          31.194   31.202   (0.03)     25.258   (-19.03)   28.667   (-8.10)    43.684   (40.04)
splash2x/raytrace       83.930   84.754   (0.98)     83.734   (-0.23)    83.394   (-0.64)    84.932   (1.19)
splash2x/volrend        86.163   87.052   (1.03)     86.918   (0.88)     86.621   (0.53)     87.520   (1.57)
splash2x/water_nsquared 231.335  234.050  (1.17)     222.722  (-3.72)    224.502  (-2.95)    236.589  (2.27)
splash2x/water_spatial  88.753   89.167   (0.47)     89.542   (0.89)     89.510   (0.85)     97.960   (10.37)
total                   2990.130 3006.480 (0.55)     2865.010 (-4.18)    2921.670 (-2.29)    3212.680 (7.44)


memused.avg             orig         rec          (overhead) thp          (overhead) ethp         (overhead) prcl         (overhead)
parsec3/blackscholes    1816303.000  1835404.800  (1.05)     1825285.800  (0.49)     1827203.000  (0.60)     1641411.600  (-9.63)
parsec3/bodytrack       1413888.000  1435353.800  (1.52)     1418535.200  (0.33)     1423560.600  (0.68)     1449993.600  (2.55)
parsec3/canneal         1042149.000  1053590.600  (1.10)     1038469.400  (-0.35)    1051556.600  (0.90)     1044271.200  (0.20)
parsec3/dedup           2364713.400  2448044.200  (3.52)     2397824.600  (1.40)     2427849.200  (2.67)     2402863.000  (1.61)
parsec3/facesim         540004.800   554035.000   (2.60)     543449.800   (0.64)     553955.400   (2.58)     483559.400   (-10.45)
parsec3/ferret          319349.600   331756.400   (3.89)     319751.600   (0.13)     333884.000   (4.55)     329600.400   (3.21)
parsec3/fluidanimate    576741.400   587662.400   (1.89)     576208.000   (-0.09)    586089.800   (1.62)     489655.000   (-15.10)
parsec3/freqmine        986222.400   999265.800   (1.32)     987716.200   (0.15)     1001756.400  (1.58)     766269.800   (-22.30)
parsec3/raytrace        1748338.200  1750036.000  (0.10)     1742218.400  (-0.35)    1755005.000  (0.38)     1584009.400  (-9.40)
parsec3/streamcluster   134980.800   136257.600   (0.95)     119580.000   (-11.41)   135188.600   (0.15)     132589.600   (-1.77)
parsec3/swaptions       13893.800    28265.000    (103.44)   16206.000    (16.64)    27826.800    (100.28)   26332.800    (89.53)
parsec3/vips            2954105.600  2972710.000  (0.63)     2955940.200  (0.06)     2971989.600  (0.61)     2968768.600  (0.50)
parsec3/x264            3169214.400  3206571.400  (1.18)     3185179.200  (0.50)     3170560.000  (0.04)     3209772.400  (1.28)
splash2x/barnes         1213585.000  1211837.400  (-0.14)    1220890.600  (0.60)     1215453.600  (0.15)     974635.600   (-19.69)
splash2x/fft            9371991.000  9201587.200  (-1.82)    9292089.200  (-0.85)    9108707.400  (-2.81)    9625476.600  (2.70)
splash2x/lu_cb          515113.800   523791.000   (1.68)     520880.200   (1.12)     523066.800   (1.54)     362113.400   (-29.70)
splash2x/lu_ncb         514847.800   524934.000   (1.96)     521362.400   (1.27)     521515.600   (1.30)     445374.200   (-13.49)
splash2x/ocean_cp       3341933.600  3322040.400  (-0.60)    3381251.000  (1.18)     3292229.400  (-1.49)    3181383.000  (-4.80)
splash2x/ocean_ncp      3899426.800  3870830.800  (-0.73)    7065641.200  (81.20)    5099403.200  (30.77)    3557460.000  (-8.77)
splash2x/radiosity      1465960.800  1470778.600  (0.33)     1482777.600  (1.15)     1500133.400  (2.33)     498807.200   (-65.97)
splash2x/radix          1711100.800  1672141.400  (-2.28)    1387826.200  (-18.89)   1516728.600  (-11.36)   2043053.600  (19.40)
splash2x/raytrace       47586.400    58698.000    (23.35)    51308.400    (7.82)     61274.800    (28.77)    54446.200    (14.42)
splash2x/volrend        150480.400   164633.800   (9.41)     150819.600   (0.23)     163517.400   (8.66)     161828.200   (7.54)
splash2x/water_nsquared 47147.600    62403.400    (32.36)    47689.600    (1.15)     60030.800    (27.33)    59736.600    (26.70)
splash2x/water_spatial  666544.600   674447.800   (1.19)     665904.600   (-0.10)    673677.600   (1.07)     559765.200   (-16.02)
total                   40025500.000 40096900.000 (0.18)     42914900.000 (7.22)     41002100.000 (2.44)     38053200.000 (-4.93)


DAMON Overheads
~~~~~~~~~~~~~~~

In total, DAMON recording feature incurs 0.55% runtime overhead (up to 1.88% in
worst case with 'parsec3/streamcluster') and 0.18% memory space overhead.

For convenience test run of 'rec', I use a Python wrapper.  The wrapper
constantly consumes about 10-15MB of memory.  This becomes high memory overhead
if the target workload has small memory footprint.  In detail,
parsec3/swaptions (13 MiB), splash2x/raytrace (47 MiB), splash2x/volrend (150
MiB), and splash2x/water_nsquared (46 MiB)) show 103.44%, 23%, 9%, and 32%
overheads, respectively.  Nonetheless, the overheads are not from DAMON, but
from the wrapper, and thus should be ignored.  This fake memory overhead
continues in 'ethp' and 'prcl', as those configurations are also using the
Python wrapper.


Efficient THP
~~~~~~~~~~~~~

THP 'always' enabled policy achieves 4.18% speedup but incurs 7.22% memory
overhead.  It achieves 36.83% speedup in best case, but 81.20% memory overhead
in worst case.  Interestingly, both the best and worst case are with
'splash2x/ocean_ncp').

The 2-lines implementation of data access monitoring based THP version ('ethp')
shows 2.29% speedup and 2.44% memory overhead.  In other words, 'ethp' removes
66.2% of THP memory waste while preserving 54.78% of THP speedup in total.  In
case of the 'splash2x/ocean_ncp', 'ethp' removes 62.10% of THP memory waste
while preserving 61% of THP speedup.


Proactive Reclamation
~~~~~~~~~~~~~~~~~~~~

As same to the original work, I use 'zram' swap device for this configuration.

In total, our 1 line implementation of Proactive Reclamation, 'prcl', incurred
7.44% runtime overhead in total while achieving 4.93% system memory usage
reduction.

Nonetheless, as the memory usage is calculated with 'MemFree' in
'/proc/meminfo', it contains the SwapCached pages.  As the swapcached pages can
be easily evicted, I also measured the residential set size of the workloads:

rss.avg                 orig         rec          (overhead) thp          (overhead) ethp         (overhead) prcl         (overhead)
parsec3/blackscholes    591461.000   590761.000   (-0.12)    592669.200   (0.20)     592442.600   (0.17)     308627.200   (-47.82)
parsec3/bodytrack       32201.400    32242.800    (0.13)     32299.000    (0.30)     32327.600    (0.39)     27411.000    (-14.88)
parsec3/canneal         841593.600   839721.400   (-0.22)    837427.600   (-0.50)    838363.400   (-0.38)    822220.600   (-2.30)
parsec3/dedup           1210000.600  1235153.600  (2.08)     1205207.200  (-0.40)    1229808.800  (1.64)     827881.400   (-31.58)
parsec3/facesim         311630.400   311273.200   (-0.11)    314747.400   (1.00)     312449.400   (0.26)     184104.600   (-40.92)
parsec3/ferret          99714.800    99558.400    (-0.16)    100996.800   (1.29)     99769.600    (0.05)     88979.200    (-10.77)
parsec3/fluidanimate    531429.600   531855.200   (0.08)     531744.800   (0.06)     532158.600   (0.14)     428154.000   (-19.43)
parsec3/freqmine        553063.600   552561.000   (-0.09)    556588.600   (0.64)     553518.000   (0.08)     65516.800    (-88.15)
parsec3/raytrace        894129.800   894332.400   (0.02)     889421.800   (-0.53)    892801.000   (-0.15)    363634.000   (-59.33)
parsec3/streamcluster   110887.200   110949.400   (0.06)     111508.400   (0.56)     111645.000   (0.68)     109921.200   (-0.87)
parsec3/swaptions       5688.600     5660.800     (-0.49)    5656.400     (-0.57)    5709.200     (0.36)     4201.000     (-26.15)
parsec3/vips            31774.800    31992.000    (0.68)     32134.800    (1.13)     32212.400    (1.38)     29026.000    (-8.65)
parsec3/x264            81897.400    81842.200    (-0.07)    83073.800    (1.44)     82435.200    (0.66)     80929.400    (-1.18)
splash2x/barnes         1216429.200  1212158.000  (-0.35)    1223021.400  (0.54)     1218261.200  (0.15)     710678.800   (-41.58)
splash2x/fft            9582824.800  9732597.400  (1.56)     9695113.400  (1.17)     9665607.200  (0.86)     7959449.000  (-16.94)
splash2x/lu_cb          509782.600   509423.400   (-0.07)    514467.000   (0.92)     510521.000   (0.14)     346267.200   (-32.08)
splash2x/lu_ncb         509735.200   510578.000   (0.17)     513892.200   (0.82)     509864.800   (0.03)     429509.800   (-15.74)
splash2x/ocean_cp       3402516.400  3405858.200  (0.10)     3442579.400  (1.18)     3411920.400  (0.28)     2782917.800  (-18.21)
splash2x/ocean_ncp      3924875.800  3921542.800  (-0.08)    7179644.000  (82.93)    5243201.400  (33.59)    2760506.600  (-29.67)
splash2x/radiosity      1472925.800  1475449.200  (0.17)     1485645.800  (0.86)     1473646.000  (0.05)     248785.000   (-83.11)
splash2x/radix          1748452.000  1750998.000  (0.15)     1434846.600  (-17.94)   1606307.800  (-8.13)    1713493.600  (-2.00)
splash2x/raytrace       23265.600    23278.400    (0.06)     29232.800    (25.65)    27050.400    (16.27)    16464.600    (-29.23)
splash2x/volrend        44020.600    44048.400    (0.06)     44148.400    (0.29)     44125.400    (0.24)     28101.800    (-36.16)
splash2x/water_nsquared 29420.800    29409.600    (-0.04)    29808.400    (1.32)     29984.800    (1.92)     25234.000    (-14.23)
splash2x/water_spatial  656716.000   656514.200   (-0.03)    656023.000   (-0.11)    656411.600   (-0.05)    498736.400   (-24.06)
total                   28416316.000 28589600.000 (0.61)     31541823.000 (11.00)    29712600.000 (4.56)     20860800.000 (-26.59)

In total, 26.59% of residential sets were reduced.

With parsec3/freqmine, 'prcl' reduced 88.15% of residential sets and 22.30% of
system memory footprint while incurring only 2.91% runtime overhead.


Baseline and Complete Git Tree
==============================


The patches are based on the v5.6 plus v9 DAMON patchset[1] and Minchan's
``do_madvise()`` patch[2].  Minchan's patch was necessary for reuse of
``madvise()`` code in DAMON.  You can also clone the complete git tree:

    $ git clone git://github.com/sjp38/linux -b damos/rfc/v7

The web is also available:
https://github.com/sjp38/linux/releases/tag/damos/rfc/v7

The latest DAMON development tree is also available at:
https://github.com/sjp38/linux/tree/damon/master


[1] https://lore.kernel.org/linux-mm/20200406130938.14066-1-sjpark@amazon.com/
[2] https://lore.kernel.org/linux-mm/20200302193630.68771-2-minchan@kernel.org/


Sequence Of Patches
===================

The first patch allows DAMON to reuse ``madvise()`` code for the actions.  The
second patch accounts age of each region.  The third patch implements the
handling of the schemes in DAMON and exports a kernel space programming
interface for it.  The fourth patch implements a debugfs interface for the
privileged people and programs.  The fifth and sixth patches each adds kunit
tests and selftests for these changes, and finally the seventhe patch adds
human friendly schemes support to the user space tool for DAMON.


Patch History
=============

Changes from RFC v6
(https://lore.kernel.org/linux-mm/20200407100007.3894-1-sjpark@amazon.com/)
 - Rebase on DAMON v9 patchset
 - Cleanup code and fix typos (Stefan Nuernberger)

Changes from RFC v5
(https://lore.kernel.org/linux-mm/20200330115042.17431-1-sjpark@amazon.com/)
 - Rebase on DAMON v8 patchset
 - Update test results
 - Fix DAMON userspace tool crash on signal handling
 - Fix checkpatch warnings

Changes from RFC v4
(https://lore.kernel.org/linux-mm/20200303121406.20954-1-sjpark@amazon.com/)
 - Handle CONFIG_ADVISE_SYSCALL
 - Clean up code (Jonathan Cameron)
 - Update test results
 - Rebase on v5.6 + DAMON v7

Changes from RFC v3
(https://lore.kernel.org/linux-mm/20200225102300.23895-1-sjpark@amazon.com/)
 - Add Reviewed-by from Brendan Higgins
 - Code cleanup: Modularize madvise() call
 - Fix a trivial bug in the wrapper python script
 - Add more stable and detailed evaluation results with updated ETHP scheme

Changes from RFC v2
(https://lore.kernel.org/linux-mm/20200218085309.18346-1-sjpark@amazon.com/)
 - Fix aging mechanism for more better 'old region' selection
 - Add more kunittests and kselftests for this patchset
 - Support more human friedly description and application of 'schemes'

Changes from RFC v1
(https://lore.kernel.org/linux-mm/20200210150921.32482-1-sjpark@amazon.com/)
 - Properly adjust age accounting related properties after splitting, merging,
   and action applying

SeongJae Park (7):
  mm/madvise: Export do_madvise() to external GPL modules
  mm/damon: Account age of target regions
  mm/damon: Implement data access monitoring-based operation schemes
  mm/damon/schemes: Implement a debugfs interface
  mm/damon-test: Add kunit test case for regions age accounting
  mm/damon/selftests: Add 'schemes' debugfs tests
  damon/tools: Support more human friendly 'schemes' control

 include/linux/damon.h                         |  29 ++
 mm/damon-test.h                               |   5 +
 mm/damon.c                                    | 429 +++++++++++++++++-
 mm/madvise.c                                  |   1 +
 tools/damon/_convert_damos.py                 | 126 +++++
 tools/damon/_damon.py                         | 143 ++++++
 tools/damon/damo                              |   7 +
 tools/damon/record.py                         | 135 +-----
 tools/damon/schemes.py                        | 105 +++++
 .../testing/selftests/damon/debugfs_attrs.sh  |  29 ++
 10 files changed, 880 insertions(+), 129 deletions(-)
 create mode 100755 tools/damon/_convert_damos.py
 create mode 100644 tools/damon/_damon.py
 create mode 100644 tools/damon/schemes.py

-- 
2.17.1

==================================== >8 =======================================

Appendix: Stdev / min / max numbers among the repeated runs
===========================================================

Below are stdev/min/max of each number in the 5 repeated runs.

runtime_stdev           orig  rec   thp   ethp   prcl
parsec3/blackscholes    0.927 0.093 0.479 0.809  1.879
parsec3/bodytrack       0.716 0.299 0.545 0.744  0.759
parsec3/canneal         5.413 2.320 4.270 3.734  4.561
parsec3/dedup           0.087 0.045 0.051 0.103  1.027
parsec3/facesim         1.286 0.243 1.832 1.070  1.478
parsec3/ferret          1.224 0.355 1.410 0.280  1.042
parsec3/fluidanimate    0.771 1.015 1.825 0.560  3.507
parsec3/freqmine        1.583 2.597 0.788 1.133  2.163
parsec3/raytrace        0.221 0.694 0.158 0.398  0.540
parsec3/streamcluster   0.999 1.660 1.426 3.448  1.246
parsec3/swaptions       1.014 0.949 2.997 0.902  1.032
parsec3/vips            0.322 0.127 0.280 0.349  0.365
parsec3/x264            4.271 5.999 5.492 3.739  5.560
splash2x/barnes         0.972 0.698 0.558 1.094  3.955
splash2x/fft            0.447 0.143 0.352 2.985  7.956
splash2x/lu_cb          0.796 0.105 0.088 0.688  1.200
splash2x/lu_ncb         1.024 0.277 0.199 1.103  2.872
splash2x/ocean_cp       0.280 0.126 0.201 0.673  3.281
splash2x/ocean_ncp      1.057 0.508 0.335 12.102 14.090
splash2x/radiosity      0.785 0.173 0.260 0.870  0.893
splash2x/radix          0.316 0.150 0.201 2.437  7.187
splash2x/raytrace       0.424 0.321 0.882 0.830  0.626
splash2x/volrend        0.089 0.381 0.972 0.242  0.223
splash2x/water_nsquared 2.306 3.673 2.910 2.275  0.274
splash2x/water_spatial  0.107 0.230 0.709 0.800  2.819


memused.avg_stdev       orig       rec       thp       ethp       prcl
parsec3/blackscholes    9697.635   1494.697  2387.690  8093.143   43663.214
parsec3/bodytrack       7250.643   1473.509  2397.013  7603.035   39980.147
parsec3/canneal         4210.342   1424.186  7433.352  5297.711   4658.795
parsec3/dedup           68212.037  20891.405 45047.056 53208.941  80774.420
parsec3/facesim         2032.111   1231.900  2189.628  1509.638   3163.506
parsec3/ferret          3086.358   3654.773  2260.834  479.277    1326.218
parsec3/fluidanimate    4134.608   2544.518  1207.893  3600.042   19132.237
parsec3/freqmine        1289.178   1158.483  2775.569  11680.579  1778.696
parsec3/raytrace        2998.385   4523.060  4970.297  2610.227   10123.220
parsec3/streamcluster   29359.793  778.918   2727.880  8358.612   1162.631
parsec3/swaptions       1333.761   1725.212  921.479   1256.970   1450.368
parsec3/vips            3396.246   3850.901  3947.109  4234.346   4890.742
parsec3/x264            37565.788  16116.006 23477.849 36372.921  19756.335
splash2x/barnes         2187.324   2570.255  4164.816  6122.991   23278.418
splash2x/fft            114780.227 55225.274 38989.522 120264.044 320549.636
splash2x/lu_cb          1068.762   1221.553  2979.611  1226.053   9996.730
splash2x/lu_ncb         3170.041   1571.103  1376.271  2051.800   24940.971
splash2x/ocean_cp       7068.865   35021.454 6299.243  5732.506   39711.553
splash2x/ocean_ncp      7937.572   4710.785  6351.772  884113.443 65600.214
splash2x/radiosity      6180.740   1808.464  1987.039  62425.903  49386.961
splash2x/radix          19986.863  5055.739  7747.845  133910.722 250182.861
splash2x/raytrace       824.991    805.535   1398.821  1882.344   657.657
splash2x/volrend        1536.609   2907.749  1343.772  1551.393   1794.672
splash2x/water_nsquared 2302.140   589.065   2673.043  2603.017   1178.533
splash2x/water_spatial  4190.273   1206.995  4811.902  1670.600   54112.547


rss.avg_stdev           orig       rec       thp       ethp       prcl
parsec3/blackscholes    1974.955   2499.204  2250.405  1922.173   63345.041
parsec3/bodytrack       220.466    268.157   131.569   74.021     761.960
parsec3/canneal         1471.797   514.756   1434.294  2098.894   5416.763
parsec3/dedup           32354.306  476.043   44399.324 11765.220  447871.173
parsec3/facesim         180.793    347.747   1332.481  699.827    4196.029
parsec3/ferret          56.986     197.517   1604.932  445.317    1218.997
parsec3/fluidanimate    901.496    35.091    118.211   364.814    20828.449
parsec3/freqmine        598.492    921.266   1049.698  716.771    3775.504
parsec3/raytrace        1438.348   1493.995  933.785   1600.661   40270.688
parsec3/streamcluster   84.156     35.942    540.225   552.519    76.022
parsec3/swaptions       72.212     67.857    68.884    92.417     184.822
parsec3/vips            210.871    119.778   429.014   385.145    492.484
parsec3/x264            391.476    275.537   657.096   211.499    524.207
splash2x/barnes         3042.395   7105.471  984.499   2359.814   51528.340
splash2x/fft            177542.817 25508.833 65650.997 46205.634  986121.141
splash2x/lu_cb          482.980    2549.395  22.414    692.206    10484.456
splash2x/lu_ncb         752.005    318.677   42.691    601.876    25431.036
splash2x/ocean_cp       9541.635   5736.370  4909.930  7999.780   199531.665
splash2x/ocean_ncp      8671.685   16560.130 3528.334  945156.130 205065.499
splash2x/radiosity      4009.875   1272.857  347.112   2746.042   84332.709
splash2x/radix          31387.749  49955.889 4666.096  140269.485 162155.771
splash2x/raytrace       57.722     74.085    1291.440  489.024    2241.461
splash2x/volrend        54.169     72.182    89.641    208.225    2268.328
splash2x/water_nsquared 23.379     29.890    435.101   490.508    631.352
splash2x/water_spatial  611.088    652.141   885.563   554.320    71409.571


runtime_min             orig    rec     thp     ethp    prcl
parsec3/blackscholes    106.457 106.550 105.994 106.795 109.004
parsec3/bodytrack       78.367  78.800  78.222  78.645  79.723
parsec3/canneal         129.744 135.271 121.695 126.207 149.199
parsec3/dedup           11.822  11.785  11.693  11.818  11.963
parsec3/facesim         206.997 208.071 203.557 205.254 215.114
parsec3/ferret          189.292 190.111 189.528 190.210 192.073
parsec3/fluidanimate    209.892 211.267 207.865 210.921 216.901
parsec3/freqmine        286.118 288.196 286.343 287.564 294.548
parsec3/raytrace        117.501 118.562 118.566 118.213 129.207
parsec3/streamcluster   320.227 325.232 281.686 287.583 327.193
parsec3/swaptions       153.229 154.133 153.392 154.194 154.358
parsec3/vips            58.563  58.352  57.859  58.604  59.446
parsec3/x264            64.915  62.497  59.804  64.030  63.511
splash2x/barnes         79.605  79.729  73.315  78.168  94.994
splash2x/fft            32.830  33.302  22.901  26.244  34.666
splash2x/lu_cb          84.837  85.198  84.320  85.354  87.937
splash2x/lu_ncb         91.839  92.540  89.880  92.368  93.502
splash2x/ocean_cp       44.189  44.592  42.787  43.538  53.972
splash2x/ocean_ncp      79.264  81.014  50.772  52.880  119.121
splash2x/radiosity      90.665  91.160  90.471  91.020  101.365
splash2x/radix          30.702  31.060  25.087  25.822  33.994
splash2x/raytrace       83.267  84.228  82.642  82.295  83.801
splash2x/volrend        86.087  86.621  85.258  86.344  87.316
splash2x/water_nsquared 229.264 231.365 217.897 222.087 236.275
splash2x/water_spatial  88.576  88.934  88.633  88.829  93.093


memused.avg_min         orig        rec         thp         ethp        prcl
parsec3/blackscholes    1806450.000 1832800.000 1821208.000 1815059.000 1597823.000
parsec3/bodytrack       1406716.000 1433260.000 1415546.000 1414766.000 1422325.000
parsec3/canneal         1034762.000 1050949.000 1029342.000 1045707.000 1039362.000
parsec3/dedup           2276407.000 2416275.000 2332186.000 2326706.000 2293520.000
parsec3/facesim         537730.000  552392.000  541119.000  551696.000  479697.000
parsec3/ferret          314753.000  325707.000  315704.000  333059.000  327897.000
parsec3/fluidanimate    569205.000  582740.000  574036.000  579170.000  465726.000
parsec3/freqmine        984189.000  997617.000  983088.000  990709.000  763236.000
parsec3/raytrace        1743861.000 1745491.000 1737423.000 1750450.000 1574814.000
parsec3/streamcluster   119184.000  135411.000  115804.000  127617.000  130339.000
parsec3/swaptions       11455.000   24872.000   15156.000   25501.000   24410.000
parsec3/vips            2950013.000 2968672.000 2952220.000 2966535.000 2961988.000
parsec3/x264            3105486.000 3187448.000 3152008.000 3124959.000 3186481.000
splash2x/barnes         1210156.000 1207677.000 1213432.000 1207101.000 942444.000
splash2x/fft            9169120.000 9153626.000 9248274.000 8931352.000 9298670.000
splash2x/lu_cb          513286.000  521543.000  516460.000  520848.000  343202.000
splash2x/lu_ncb         509384.000  522742.000  519016.000  518494.000  414389.000
splash2x/ocean_cp       3332320.000 3283348.000 3369876.000 3283512.000 3111864.000
splash2x/ocean_ncp      3887754.000 3865529.000 7060592.000 3875220.000 3491442.000
splash2x/radiosity      1456077.000 1467334.000 1478828.000 1463326.000 426797.000
splash2x/radix          1671807.000 1665862.000 1380882.000 1343347.000 1720222.000
splash2x/raytrace       46261.000   57711.000   50246.000   57849.000   53681.000
splash2x/volrend        147829.000  161237.000  148246.000  160764.000  158943.000
splash2x/water_nsquared 42598.000   61731.000   42600.000   54974.000   57599.000
splash2x/water_spatial  660084.000  672456.000  656933.000  670901.000  476582.000


rss.avg_min             orig        rec         thp         ethp        prcl
parsec3/blackscholes    588530.000  588342.000  590573.000  588953.000  251104.000
parsec3/bodytrack       31780.000   31948.000   32128.000   32212.000   26108.000
parsec3/canneal         839418.000  839190.000  835078.000  835363.000  816479.000
parsec3/dedup           1165305.000 1234371.000 1143193.000 1206303.000 194759.000
parsec3/facesim         311415.000  310889.000  312549.000  311587.000  178906.000
parsec3/ferret          99636.000   99188.000   99631.000   99183.000   87446.000
parsec3/fluidanimate    529628.000  531824.000  531584.000  531880.000  402604.000
parsec3/freqmine        551880.000  551304.000  555413.000  552349.000  59796.000
parsec3/raytrace        892361.000  892703.000  888396.000  890062.000  317630.000
parsec3/streamcluster   110762.000  110887.000  110975.000  111028.000  109785.000
parsec3/swaptions       5552.000    5565.000    5567.000    5533.000    4028.000
parsec3/vips            31569.000   31792.000   31569.000   31770.000   28081.000
parsec3/x264            81172.000   81427.000   82115.000   82098.000   80171.000
splash2x/barnes         1211709.000 1198036.000 1221765.000 1214537.000 612264.000
splash2x/fft            9325088.000 9702371.000 9643955.000 9609539.000 6873966.000
splash2x/lu_cb          509124.000  504333.000  514440.000  509140.000  326514.000
splash2x/lu_ncb         508924.000  509949.000  513828.000  508981.000  398087.000
splash2x/ocean_cp       3390197.000 3396522.000 3435755.000 3406635.000 2496312.000
splash2x/ocean_ncp      3911994.000 3888754.000 7174881.000 3931613.000 2510919.000
splash2x/radiosity      1466955.000 1473002.000 1485104.000 1469533.000 133616.000
splash2x/radix          1718566.000 1651402.000 1427018.000 1431687.000 1421661.000
splash2x/raytrace       23192.000   23184.000   28360.000   26609.000   14007.000
splash2x/volrend        43966.000   43984.000   44036.000   43916.000   24641.000
splash2x/water_nsquared 29384.000   29372.000   29461.000   29368.000   24322.000
splash2x/water_spatial  655822.000  655553.000  655257.000  655564.000  385536.000


runtime_max             orig    rec     thp     ethp    prcl
parsec3/blackscholes    108.639 106.823 107.089 108.523 114.100
parsec3/bodytrack       80.344  79.682  79.610  80.455  81.570
parsec3/canneal         144.660 141.343 133.450 136.502 162.011
parsec3/dedup           12.053  11.914  11.815  12.125  14.375
parsec3/facesim         210.501 208.796 208.025 208.131 219.250
parsec3/ferret          192.667 191.103 193.199 190.842 194.873
parsec3/fluidanimate    212.016 214.100 212.441 212.302 226.134
parsec3/freqmine        290.759 296.147 288.329 290.879 300.302
parsec3/raytrace        118.187 120.142 119.051 119.264 130.698
parsec3/streamcluster   322.889 329.792 286.140 296.215 330.679
parsec3/swaptions       155.672 156.698 161.053 156.156 157.261
parsec3/vips            59.441  58.744  58.660  59.471  60.506
parsec3/x264            75.093  76.112  71.583  73.333  77.155
splash2x/barnes         82.041  81.774  75.048  80.939  106.030
splash2x/fft            34.055  33.708  23.851  33.607  55.145
splash2x/lu_cb          86.523  85.472  84.560  87.341  90.481
splash2x/lu_ncb         94.408  93.267  90.395  95.377  100.905
splash2x/ocean_cp       44.899  44.971  43.330  45.350  62.554
splash2x/ocean_ncp      82.065  82.463  51.817  80.988  159.273
splash2x/radiosity      92.750  91.653  91.252  93.436  104.075
splash2x/radix          31.621  31.422  25.630  31.439  53.405
splash2x/raytrace       84.439  85.095  85.075  84.137  85.696
splash2x/volrend        86.330  87.694  87.981  87.032  87.949
splash2x/water_nsquared 235.682 241.083 225.483 228.505 237.043
splash2x/water_spatial  88.888  89.473  90.619  91.044  101.648


memused.avg_max         orig        rec         thp         ethp        prcl
parsec3/blackscholes    1828721.000 1836936.000 1828313.000 1836904.000 1725072.000
parsec3/bodytrack       1423906.000 1437442.000 1421857.000 1436967.000 1529455.000
parsec3/canneal         1047894.000 1055037.000 1044682.000 1060394.000 1050224.000
parsec3/dedup           2422783.000 2470880.000 2440561.000 2469625.000 2482833.000
parsec3/facesim         543497.000  555890.000  547311.000  555693.000  488209.000
parsec3/ferret          322168.000  335269.000  321546.000  334310.000  331252.000
parsec3/fluidanimate    581849.000  589377.000  577381.000  588655.000  519412.000
parsec3/freqmine        987399.000  1001129.000 991383.000  1024350.000 768684.000
parsec3/raytrace        1752775.000 1756513.000 1748414.000 1757901.000 1602179.000
parsec3/streamcluster   193685.000  137433.000  123499.000  151169.000  133488.000
parsec3/swaptions       15394.000   29542.000   17551.000   29108.000   28190.000
parsec3/vips            2959475.000 2980019.000 2963026.000 2978095.000 2973578.000
parsec3/x264            3208695.000 3231006.000 3216669.000 3202809.000 3238422.000
splash2x/barnes         1215967.000 1214849.000 1225941.000 1225748.000 1002529.000
splash2x/fft            9498187.000 9306674.000 9348936.000 9248373.000 10130342.000
splash2x/lu_cb          516522.000  524947.000  523693.000  524516.000  371119.000
splash2x/lu_ncb         518686.000  527209.000  522813.000  523686.000  481892.000
splash2x/ocean_cp       3351940.000 3363005.000 3387628.000 3301121.000 3225897.000
splash2x/ocean_ncp      3911769.000 3879024.000 7076827.000 5828184.000 3663069.000
splash2x/radiosity      1473754.000 1472333.000 1484191.000 1624759.000 551833.000
splash2x/radix          1727012.000 1680485.000 1402384.000 1670110.000 2391320.000
splash2x/raytrace       48817.000   60002.000   54006.000   63010.000   55226.000
splash2x/volrend        152398.000  169388.000  152201.000  165242.000  164335.000
splash2x/water_nsquared 48865.000   63465.000   50230.000   62111.000   61226.000
splash2x/water_spatial  670394.000  675796.000  670321.000  675621.000  630537.000


rss.avg_max             orig        rec         thp         ethp        prcl
parsec3/blackscholes    593431.000  593881.000  596538.000  594269.000  431382.000
parsec3/bodytrack       32421.000   32705.000   32516.000   32411.000   28188.000
parsec3/canneal         843084.000  840577.000  839187.000  841070.000  831519.000
parsec3/dedup           1236606.000 1235718.000 1242719.000 1236695.000 1235588.000
parsec3/facesim         311916.000  311790.000  316261.000  313469.000  190366.000
parsec3/ferret          99770.000   99753.000   103510.000  100556.000  90763.000
parsec3/fluidanimate    531936.000  531912.000  531948.000  532809.000  460491.000
parsec3/freqmine        553491.000  553511.000  558414.000  554173.000  70911.000
parsec3/raytrace        896681.000  896342.000  891012.000  894728.000  425544.000
parsec3/streamcluster   110997.000  110999.000  112287.000  112349.000  110011.000
parsec3/swaptions       5763.000    5723.000    5722.000    5799.000    4553.000
parsec3/vips            32099.000   32135.000   32879.000   32930.000   29459.000
parsec3/x264            82209.000   82190.000   83818.000   82679.000   81584.000
splash2x/barnes         1220175.000 1216655.000 1224298.000 1221938.000 759477.000
splash2x/fft            9748999.000 9768966.000 9824334.000 9726585.000 9749104.000
splash2x/lu_cb          510299.000  510868.000  514505.000  510922.000  355965.000
splash2x/lu_ncb         510679.000  510788.000  513962.000  510663.000  466846.000
splash2x/ocean_cp       3416130.000 3414420.000 3449904.000 3427720.000 2998629.000
splash2x/ocean_ncp      3936906.000 3932580.000 7183097.000 6017945.000 3040572.000
splash2x/radiosity      1477020.000 1476736.000 1486152.000 1477564.000 344068.000
splash2x/radix          1788375.000 1781055.000 1440575.000 1774484.000 1892298.000
splash2x/raytrace       23356.000   23360.000   31800.000   27846.000   20436.000
splash2x/volrend        44119.000   44157.000   44260.000   44517.000   30636.000
splash2x/water_nsquared 29448.000   29452.000   30625.000   30629.000   25995.000
splash2x/water_spatial  657379.000  657433.000  657217.000  657169.000  602051.000

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC v7 1/7] mm/madvise: Export do_madvise() to external GPL modules
  2020-04-29 12:45 [RFC v7 0/7] Implement Data Access Monitoring-based Memory Operation Schemes SeongJae Park
@ 2020-04-29 12:45 ` SeongJae Park
  2020-04-29 12:45 ` [RFC v7 2/7] mm/damon: Account age of target regions SeongJae Park
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: SeongJae Park @ 2020-04-29 12:45 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, aarcange, acme,
	alexander.shishkin, amit, benh, brendan.d.gregg, brendanhiggins,
	cai, colin.king, corbet, dwmw, irogers, jolsa, kirill,
	mark.rutland, mgorman, minchan, mingo, namhyung, peterz, rdunlap,
	riel, rientjes, rostedt, sblbir, shakeelb, shuah, sj38.park, snu,
	vbabka, vdavydov.dev, yang.shi, ying.huang, linux-damon,
	linux-mm, linux-doc, linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit exports 'do_madvise()' to external GPL modules, so that
other modules including DAMON could use the function.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 mm/madvise.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/madvise.c b/mm/madvise.c
index 80f8a1839f70..151aaf285cdd 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1151,6 +1151,7 @@ int do_madvise(struct task_struct *target_task, struct mm_struct *mm,
 
 	return error;
 }
+EXPORT_SYMBOL_GPL(do_madvise);
 
 SYSCALL_DEFINE3(madvise, unsigned long, start, size_t, len_in, int, behavior)
 {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC v7 2/7] mm/damon: Account age of target regions
  2020-04-29 12:45 [RFC v7 0/7] Implement Data Access Monitoring-based Memory Operation Schemes SeongJae Park
  2020-04-29 12:45 ` [RFC v7 1/7] mm/madvise: Export do_madvise() to external GPL modules SeongJae Park
@ 2020-04-29 12:45 ` SeongJae Park
  2020-04-29 12:45 ` [RFC v7 3/7] mm/damon: Implement data access monitoring-based operation schemes SeongJae Park
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: SeongJae Park @ 2020-04-29 12:45 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, aarcange, acme,
	alexander.shishkin, amit, benh, brendan.d.gregg, brendanhiggins,
	cai, colin.king, corbet, dwmw, irogers, jolsa, kirill,
	mark.rutland, mgorman, minchan, mingo, namhyung, peterz, rdunlap,
	riel, rientjes, rostedt, sblbir, shakeelb, shuah, sj38.park, snu,
	vbabka, vdavydov.dev, yang.shi, ying.huang, linux-damon,
	linux-mm, linux-doc, linux-kernel

From: SeongJae Park <sjpark@amazon.de>

DAMON can be used as a primitive for data access pattern aware memory
management optimizations.  However, users who want such optimizations
should run DAMON, read the monitoring results, analyze it, plan a new
memory management scheme, and apply the new scheme by themselves.  It
would not be too hard, but still require some level of effort.  For
complicated optimizations, this effort is inevitable.

That said, in many cases, users would simply want to apply an actions to
a memory region of a specific size having a specific access frequency
for a specific time.  For example, "page out a memory region larger than
100 MiB but having a low access frequency more than 10 minutes", or "Use
THP for a memory region larger than 2 MiB having a high access frequency
for more than 2 seconds".

For such optimizations, users will need to first account the age of each
region themselves.  To reduce such efforts, this commit implements a
simple age account of each region in DAMON.  For each aggregation step,
DAMON compares the access frequency and start/end address of each region
with those from last aggregation and reset the age of the region if the
change is significant.  Else, the age is incremented.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 include/linux/damon.h |   5 ++
 mm/damon.c            | 106 ++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 107 insertions(+), 4 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index bc46ea00e9a1..7276b2a31c38 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -22,6 +22,11 @@ struct damon_region {
 	unsigned long sampling_addr;
 	unsigned int nr_accesses;
 	struct list_head list;
+
+	unsigned int age;
+	unsigned long last_vm_start;
+	unsigned long last_vm_end;
+	unsigned int last_nr_accesses;
 };
 
 /* Represents a monitoring target task */
diff --git a/mm/damon.c b/mm/damon.c
index 95f5d94b97e6..704462ff30d9 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -86,6 +86,10 @@ static struct damon_region *damon_new_region(struct damon_ctx *ctx,
 	region->nr_accesses = 0;
 	INIT_LIST_HEAD(&region->list);
 
+	region->age = 0;
+	region->last_vm_start = vm_start;
+	region->last_vm_end = vm_end;
+
 	return region;
 }
 
@@ -659,11 +663,45 @@ static void kdamond_reset_aggregated(struct damon_ctx *c)
 					sizeof(r->nr_accesses));
 			trace_damon_aggregated(t->pid, nr,
 					r->vm_start, r->vm_end, r->nr_accesses);
+			r->last_nr_accesses = r->nr_accesses;
 			r->nr_accesses = 0;
 		}
 	}
 }
 
+#define diff_of(a, b) (a > b ? a - b : b - a)
+
+/*
+ * Increase or reset the age of the given monitoring target region
+ *
+ * If the area or '->nr_accesses' has changed significantly, reset the '->age'.
+ * Else, increase the age.
+ */
+static void damon_do_count_age(struct damon_region *r, unsigned int threshold)
+{
+	unsigned long region_threshold = (r->vm_end - r->vm_start) / 4;
+	unsigned long region_diff = diff_of(r->vm_start, r->last_vm_start) +
+			diff_of(r->vm_end, r->last_vm_end);
+	unsigned int nr_accesses_diff = diff_of(r->nr_accesses,
+			r->last_nr_accesses);
+
+	if (region_diff > region_threshold || nr_accesses_diff > threshold)
+		r->age = 0;
+	else
+		r->age++;
+}
+
+static void kdamond_count_age(struct damon_ctx *c, unsigned int threshold)
+{
+	struct damon_task *t;
+	struct damon_region *r;
+
+	damon_for_each_task(c, t) {
+		damon_for_each_region(r, t)
+			damon_do_count_age(r, threshold);
+	}
+}
+
 #define sz_damon_region(r) (r->vm_end - r->vm_start)
 
 /*
@@ -672,33 +710,86 @@ static void kdamond_reset_aggregated(struct damon_ctx *c)
 static void damon_merge_two_regions(struct damon_region *l,
 				struct damon_region *r)
 {
-	l->nr_accesses = (l->nr_accesses * sz_damon_region(l) +
-			r->nr_accesses * sz_damon_region(r)) /
-			(sz_damon_region(l) + sz_damon_region(r));
+	unsigned long sz_l = sz_damon_region(l), sz_r = sz_damon_region(r);
+
+	l->nr_accesses = (l->nr_accesses * sz_l + r->nr_accesses * sz_r) /
+			(sz_l + sz_r);
+	l->age = (l->age * sz_l + r->age * sz_r) / (sz_l + sz_r);
 	l->vm_end = r->vm_end;
 	damon_destroy_region(r);
 }
 
-#define diff_of(a, b) (a > b ? a - b : b - a)
+static inline void set_last_area(struct damon_region *r, struct region *last)
+{
+	r->last_vm_start = last->start;
+	r->last_vm_end = last->end;
+}
+
+static inline void get_last_area(struct damon_region *r, struct region *last)
+{
+	last->start = r->last_vm_start;
+	last->end = r->last_vm_end;
+}
 
 /*
  * Merge adjacent regions having similar access frequencies
  *
  * t		task affected by merge operation
  * thres	'->nr_accesses' diff threshold for the merge
+ *
+ * After each merge, the biggest mergee region becomes the last shape of the
+ * new region.  If two regions split from one region at the end of previous
+ * aggregation interval are merged into one region, we handle the two regions
+ * as one big mergee, because it can lead to unproper last shape record if we
+ * don't do so.
+ *
+ * To understand why we take special care for regions split from one region,
+ * suppose that a region of size 10 has split into two regions of size 4 and 6.
+ * Two regions show similar access frequency for next aggregation interval and
+ * thus now be merged into one region again.  Because the split is made
+ * regardless of the access pattern, DAMON should say the region of size 10 had
+ * no area change for last aggregation interval.  However, if the two mergees
+ * are handled separately, DAMON will say the merged region has changed its
+ * size from 6 to 10.
  */
 static void damon_merge_regions_of(struct damon_task *t, unsigned int thres)
 {
 	struct damon_region *r, *prev = NULL, *next;
+	struct region biggest_mergee;	/* the biggest region being merged */
+	unsigned long sz_biggest = 0;	/* size of the biggest_mergee */
+	unsigned long sz_mergee = 0;	/* size of current mergee */
 
 	damon_for_each_region_safe(r, next, t) {
 		if (!prev || prev->vm_end != r->vm_start ||
 		    diff_of(prev->nr_accesses, r->nr_accesses) > thres) {
+			if (sz_biggest)
+				set_last_area(prev, &biggest_mergee);
+
 			prev = r;
+			sz_biggest = sz_damon_region(prev);
+			get_last_area(prev, &biggest_mergee);
 			continue;
 		}
+
+
+		/* Set size of current mergee and biggest mergee */
+		sz_mergee += sz_damon_region(r);
+		if (sz_mergee > sz_biggest) {
+			sz_biggest = sz_mergee;
+			get_last_area(r, &biggest_mergee);
+		}
+
+		/*
+		 * If next region and current region is not originated from
+		 * same region, initialize the size of mergee.
+		 */
+		if (r->last_vm_start != next->last_vm_start)
+			sz_mergee = 0;
+
 		damon_merge_two_regions(prev, r);
 	}
+	if (sz_biggest)
+		set_last_area(prev, &biggest_mergee);
 }
 
 /*
@@ -731,6 +822,12 @@ static void damon_split_region_at(struct damon_ctx *ctx,
 	struct damon_region *new;
 
 	new = damon_new_region(ctx, r->vm_start + sz_r, r->vm_end);
+	new->age = r->age;
+	new->last_vm_start = r->vm_start;
+	new->last_nr_accesses = r->last_nr_accesses;
+
+	r->last_vm_start = r->vm_start;
+	r->last_vm_end = r->vm_end;
 	r->vm_end = new->vm_start;
 
 	damon_insert_region(new, r, damon_next_region(r));
@@ -940,6 +1037,7 @@ static int kdamond_fn(void *data)
 
 		if (kdamond_aggregate_interval_passed(ctx)) {
 			kdamond_merge_regions(ctx, max_nr_accesses / 10);
+			kdamond_count_age(ctx, max_nr_accesses / 10);
 			if (ctx->aggregate_cb)
 				ctx->aggregate_cb(ctx);
 			kdamond_reset_aggregated(ctx);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC v7 3/7] mm/damon: Implement data access monitoring-based operation schemes
  2020-04-29 12:45 [RFC v7 0/7] Implement Data Access Monitoring-based Memory Operation Schemes SeongJae Park
  2020-04-29 12:45 ` [RFC v7 1/7] mm/madvise: Export do_madvise() to external GPL modules SeongJae Park
  2020-04-29 12:45 ` [RFC v7 2/7] mm/damon: Account age of target regions SeongJae Park
@ 2020-04-29 12:45 ` SeongJae Park
  2020-04-29 12:45 ` [RFC v7 4/7] mm/damon/schemes: Implement a debugfs interface SeongJae Park
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: SeongJae Park @ 2020-04-29 12:45 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, aarcange, acme,
	alexander.shishkin, amit, benh, brendan.d.gregg, brendanhiggins,
	cai, colin.king, corbet, dwmw, irogers, jolsa, kirill,
	mark.rutland, mgorman, minchan, mingo, namhyung, peterz, rdunlap,
	riel, rientjes, rostedt, sblbir, shakeelb, shuah, sj38.park, snu,
	vbabka, vdavydov.dev, yang.shi, ying.huang, linux-damon,
	linux-mm, linux-doc, linux-kernel

From: SeongJae Park <sjpark@amazon.de>

In many cases, users might use DAMON for simple data access aware
memory management optimizations such as applying an operation scheme to
a memory region of a specific size having a specific access frequency
for a specific time.  For example, "page out a memory region larger than
100 MiB but having a low access frequency more than 10 minutes", or "Use
THP for a memory region larger than 2 MiB having a high access frequency
for more than 2 seconds".

To minimize users from spending their time for implementation of such
simple data access monitoring-based operation schemes, this commit makes
DAMON to handle such schemes directly.  With this commit, users can
simply specify their desired schemes to DAMON.

Each of the schemes is composed with conditions for filtering of the
target memory regions and desired memory management action for the
target.  Specifically, the format is::

    <min/max size> <min/max access frequency> <min/max age> <action>

The filtering conditions are size of memory region, number of accesses
to the region monitored by DAMON, and the age of the region.  The age of
region is incremented periodically but reset when its addresses or
access frequency has significantly changed or the action of a scheme was
applied.  For the action, current implementation supports only a few of
madvise() hints, ``MADV_WILLNEED``, ``MADV_COLD``, ``MADV_PAGEOUT``,
``MADV_HUGEPAGE``, and ``MADV_NOHUGEPAGE``.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 include/linux/damon.h |  24 +++++++
 mm/damon.c            | 149 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 173 insertions(+)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index 7276b2a31c38..0f26d8aad33c 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -36,6 +36,27 @@ struct damon_task {
 	struct list_head list;
 };
 
+/* Data Access Monitoring-based Operation Scheme */
+enum damos_action {
+	DAMOS_WILLNEED,
+	DAMOS_COLD,
+	DAMOS_PAGEOUT,
+	DAMOS_HUGEPAGE,
+	DAMOS_NOHUGEPAGE,
+	DAMOS_ACTION_LEN,
+};
+
+struct damos {
+	unsigned int min_sz_region;
+	unsigned int max_sz_region;
+	unsigned int min_nr_accesses;
+	unsigned int max_nr_accesses;
+	unsigned int min_age_region;
+	unsigned int max_age_region;
+	enum damos_action action;
+	struct list_head list;
+};
+
 /*
  * For each 'sample_interval', DAMON checks whether each region is accessed or
  * not.  It aggregates and keeps the access information (number of accesses to
@@ -65,6 +86,7 @@ struct damon_ctx {
 	struct mutex kdamond_lock;
 
 	struct list_head tasks_list;	/* 'damon_task' objects */
+	struct list_head schemes_list;	/* 'damos' objects */
 
 	/* callbacks */
 	void (*sample_cb)(struct damon_ctx *context);
@@ -75,6 +97,8 @@ int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids);
 int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
 		unsigned long aggr_int, unsigned long regions_update_int,
 		unsigned long min_nr_reg, unsigned long max_nr_reg);
+int damon_set_schemes(struct damon_ctx *ctx,
+			struct damos **schemes, ssize_t nr_schemes);
 int damon_set_recording(struct damon_ctx *ctx,
 				unsigned int rbuf_len, char *rfile_path);
 int damon_start(struct damon_ctx *ctx);
diff --git a/mm/damon.c b/mm/damon.c
index 704462ff30d9..56a4fb4d1e4a 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -11,6 +11,7 @@
 
 #define CREATE_TRACE_POINTS
 
+#include <asm-generic/mman-common.h>
 #include <linux/damon.h>
 #include <linux/debugfs.h>
 #include <linux/delay.h>
@@ -52,6 +53,12 @@
 #define damon_for_each_task_safe(ctx, t, next) \
 	list_for_each_entry_safe(t, next, &(ctx)->tasks_list, list)
 
+#define damon_for_each_scheme(ctx, r) \
+	list_for_each_entry(r, &(ctx)->schemes_list, list)
+
+#define damon_for_each_scheme_safe(ctx, s, next) \
+	list_for_each_entry_safe(s, next, &(ctx)->schemes_list, list)
+
 #define MAX_RECORD_BUFFER_LEN	(4 * 1024 * 1024)
 #define MAX_RFILE_PATH_LEN	256
 
@@ -167,6 +174,27 @@ static void damon_destroy_task(struct damon_task *t)
 	damon_free_task(t);
 }
 
+static void damon_add_scheme(struct damon_ctx *ctx, struct damos *s)
+{
+	list_add_tail(&s->list, &ctx->schemes_list);
+}
+
+static void damon_del_scheme(struct damos *s)
+{
+	list_del(&s->list);
+}
+
+static void damon_free_scheme(struct damos *s)
+{
+	kfree(s);
+}
+
+static void damon_destroy_scheme(struct damos *s)
+{
+	damon_del_scheme(s);
+	damon_free_scheme(s);
+}
+
 static unsigned int nr_damon_tasks(struct damon_ctx *ctx)
 {
 	struct damon_task *t;
@@ -702,6 +730,101 @@ static void kdamond_count_age(struct damon_ctx *c, unsigned int threshold)
 	}
 }
 
+#ifndef CONFIG_ADVISE_SYSCALLS
+static int damos_madvise(struct damon_task *task, struct damon_region *r,
+			int behavior)
+{
+	return -EINVAL;
+}
+#else
+static int damos_madvise(struct damon_task *task, struct damon_region *r,
+			int behavior)
+{
+	struct task_struct *t;
+	struct mm_struct *mm;
+	int ret = -ENOMEM;
+
+	t = damon_get_task_struct(task);
+	if (!t)
+		goto out;
+	mm = damon_get_mm(task);
+	if (!mm)
+		goto put_task_out;
+
+	ret = do_madvise(t, mm, PAGE_ALIGN(r->vm_start),
+			PAGE_ALIGN(r->vm_end - r->vm_start), behavior);
+	mmput(mm);
+put_task_out:
+	put_task_struct(t);
+out:
+	return ret;
+}
+#endif	/* CONFIG_ADVISE_SYSCALLS */
+
+static int damos_do_action(struct damon_task *task, struct damon_region *r,
+			enum damos_action action)
+{
+	int madv_action;
+
+	switch (action) {
+	case DAMOS_WILLNEED:
+		madv_action = MADV_WILLNEED;
+		break;
+	case DAMOS_COLD:
+		madv_action = MADV_COLD;
+		break;
+	case DAMOS_PAGEOUT:
+		madv_action = MADV_PAGEOUT;
+		break;
+	case DAMOS_HUGEPAGE:
+		madv_action = MADV_HUGEPAGE;
+		break;
+	case DAMOS_NOHUGEPAGE:
+		madv_action = MADV_NOHUGEPAGE;
+		break;
+	default:
+		pr_warn("Wrong action %d\n", action);
+		return -EINVAL;
+	}
+
+	return damos_madvise(task, r, madv_action);
+}
+
+static void damon_do_apply_schemes(struct damon_ctx *c, struct damon_task *t,
+				struct damon_region *r)
+{
+	struct damos *s;
+	unsigned long sz;
+
+	damon_for_each_scheme(c, s) {
+		sz = r->vm_end - r->vm_start;
+		if ((s->min_sz_region && sz < s->min_sz_region) ||
+				(s->max_sz_region && s->max_sz_region < sz))
+			continue;
+		if ((s->min_nr_accesses && r->nr_accesses < s->min_nr_accesses)
+				|| (s->max_nr_accesses &&
+					s->max_nr_accesses < r->nr_accesses))
+			continue;
+		if ((s->min_age_region && r->age < s->min_age_region) ||
+				(s->max_age_region &&
+				 s->max_age_region < r->age))
+			continue;
+		damos_do_action(t, r, s->action);
+		r->age = 0;
+	}
+}
+
+static void kdamond_apply_schemes(struct damon_ctx *c)
+{
+	struct damon_task *t;
+	struct damon_region *r;
+
+	damon_for_each_task(c, t) {
+		damon_for_each_region(r, t)
+			damon_do_apply_schemes(c, t, r);
+	}
+}
+
 #define sz_damon_region(r) (r->vm_end - r->vm_start)
 
 /*
@@ -1040,6 +1163,7 @@ static int kdamond_fn(void *data)
 			kdamond_count_age(ctx, max_nr_accesses / 10);
 			if (ctx->aggregate_cb)
 				ctx->aggregate_cb(ctx);
+			kdamond_apply_schemes(ctx);
 			kdamond_reset_aggregated(ctx);
 			kdamond_split_regions(ctx);
 		}
@@ -1120,6 +1244,30 @@ int damon_stop(struct damon_ctx *ctx)
 	return -EPERM;
 }
 
+/**
+ * damon_set_schemes() - Set data access monitoring based operation schemes.
+ * @ctx:	monitoring context
+ * @schemes:	array of the schemes
+ * @nr_schemes:	number of entries in @schemes
+ *
+ * This function should not be called while the kdamond of the context is
+ * running.
+ *
+ * Return: 0 if success, or negative error code otherwise.
+ */
+int damon_set_schemes(struct damon_ctx *ctx, struct damos **schemes,
+			ssize_t nr_schemes)
+{
+	struct damos *s, *next;
+	ssize_t i;
+
+	damon_for_each_scheme_safe(ctx, s, next)
+		damon_destroy_scheme(s);
+	for (i = 0; i < nr_schemes; i++)
+		damon_add_scheme(ctx, schemes[i]);
+	return 0;
+}
+
 /**
  * damon_set_pids() - Set monitoring target processes.
  * @ctx:	monitoring context
@@ -1554,6 +1702,7 @@ static int __init damon_init_user_ctx(void)
 	mutex_init(&ctx->kdamond_lock);
 
 	INIT_LIST_HEAD(&ctx->tasks_list);
+	INIT_LIST_HEAD(&ctx->schemes_list);
 
 	return 0;
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC v7 4/7] mm/damon/schemes: Implement a debugfs interface
  2020-04-29 12:45 [RFC v7 0/7] Implement Data Access Monitoring-based Memory Operation Schemes SeongJae Park
                   ` (2 preceding siblings ...)
  2020-04-29 12:45 ` [RFC v7 3/7] mm/damon: Implement data access monitoring-based operation schemes SeongJae Park
@ 2020-04-29 12:45 ` SeongJae Park
  2020-04-29 12:45 ` [RFC v7 5/7] mm/damon-test: Add kunit test case for regions age accounting SeongJae Park
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: SeongJae Park @ 2020-04-29 12:45 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, aarcange, acme,
	alexander.shishkin, amit, benh, brendan.d.gregg, brendanhiggins,
	cai, colin.king, corbet, dwmw, irogers, jolsa, kirill,
	mark.rutland, mgorman, minchan, mingo, namhyung, peterz, rdunlap,
	riel, rientjes, rostedt, sblbir, shakeelb, shuah, sj38.park, snu,
	vbabka, vdavydov.dev, yang.shi, ying.huang, linux-damon,
	linux-mm, linux-doc, linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit implements a debugfs interface for the data access
monitoring oriented memory management schemes.  It is supposed to be
used by administrators and/or privileged user space programs.  Users can
read and update the rules using ``<debugfs>/damon/schemes`` file.  The
format is::

    <min/max size> <min/max access frequency> <min/max age> <action>

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 mm/damon.c | 174 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 172 insertions(+), 2 deletions(-)

diff --git a/mm/damon.c b/mm/damon.c
index 56a4fb4d1e4a..cbe3fb0a317e 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -174,6 +174,29 @@ static void damon_destroy_task(struct damon_task *t)
 	damon_free_task(t);
 }
 
+static struct damos *damon_new_scheme(
+		unsigned int min_sz_region, unsigned int max_sz_region,
+		unsigned int min_nr_accesses, unsigned int max_nr_accesses,
+		unsigned int min_age_region, unsigned int max_age_region,
+		enum damos_action action)
+{
+	struct damos *scheme;
+
+	scheme = kmalloc(sizeof(*scheme), GFP_KERNEL);
+	if (!scheme)
+		return NULL;
+	scheme->min_sz_region = min_sz_region;
+	scheme->max_sz_region = max_sz_region;
+	scheme->min_nr_accesses = min_nr_accesses;
+	scheme->max_nr_accesses = max_nr_accesses;
+	scheme->min_age_region = min_age_region;
+	scheme->max_age_region = max_age_region;
+	scheme->action = action;
+	INIT_LIST_HEAD(&scheme->list);
+
+	return scheme;
+}
+
 static void damon_add_scheme(struct damon_ctx *ctx, struct damos *s)
 {
 	list_add_tail(&s->list, &ctx->schemes_list);
@@ -1422,6 +1445,147 @@ static ssize_t debugfs_monitor_on_write(struct file *file,
 	return ret;
 }
 
+static ssize_t sprint_schemes(struct damon_ctx *c, char *buf, ssize_t len)
+{
+	struct damos *s;
+	int written = 0;
+	int rc;
+
+	damon_for_each_scheme(c, s) {
+		rc = snprintf(&buf[written], len - written,
+				"%u %u %u %u %u %u %d\n",
+				s->min_sz_region, s->max_sz_region,
+				s->min_nr_accesses, s->max_nr_accesses,
+				s->min_age_region, s->max_age_region,
+				s->action);
+		if (!rc)
+			return -ENOMEM;
+
+		written += rc;
+	}
+	return written;
+}
+
+static ssize_t debugfs_schemes_read(struct file *file, char __user *buf,
+		size_t count, loff_t *ppos)
+{
+	struct damon_ctx *ctx = &damon_user_ctx;
+	char *kbuf;
+	ssize_t len;
+
+	kbuf = kmalloc(count, GFP_KERNEL);
+	if (!kbuf)
+		return -ENOMEM;
+
+	len = sprint_schemes(ctx, kbuf, count);
+	if (len < 0)
+		goto out;
+	len = simple_read_from_buffer(buf, count, ppos, kbuf, len);
+
+out:
+	kfree(kbuf);
+	return len;
+}
+
+static void free_schemes_arr(struct damos **schemes, ssize_t nr_schemes)
+{
+	ssize_t i;
+
+	for (i = 0; i < nr_schemes; i++)
+		kfree(schemes[i]);
+	kfree(schemes);
+}
+
+/*
+ * Converts a string into an array of struct damos pointers
+ *
+ * Returns an array of struct damos pointers that converted if the conversion
+ * success, or NULL otherwise.
+ */
+static struct damos **str_to_schemes(const char *str, ssize_t len,
+				ssize_t *nr_schemes)
+{
+	struct damos *scheme, **schemes;
+	const int max_nr_schemes = 256;
+	int pos = 0, parsed, ret;
+	unsigned int min_sz, max_sz, min_nr_a, max_nr_a, min_age, max_age;
+	unsigned int action;
+
+	schemes = kmalloc_array(max_nr_schemes, sizeof(scheme),
+			GFP_KERNEL);
+	if (!schemes)
+		return NULL;
+
+	*nr_schemes = 0;
+	while (pos < len && *nr_schemes < max_nr_schemes) {
+		ret = sscanf(&str[pos], "%u %u %u %u %u %u %u%n",
+				&min_sz, &max_sz, &min_nr_a, &max_nr_a,
+				&min_age, &max_age, &action, &parsed);
+		if (ret != 7)
+			break;
+		if (action >= DAMOS_ACTION_LEN) {
+			pr_err("wrong action %d\n", action);
+			goto fail;
+		}
+
+		pos += parsed;
+		scheme = damon_new_scheme(min_sz, max_sz, min_nr_a, max_nr_a,
+				min_age, max_age, action);
+		if (!scheme)
+			goto fail;
+
+		schemes[*nr_schemes] = scheme;
+		*nr_schemes += 1;
+	}
+	if (!*nr_schemes)
+		goto fail;
+	return schemes;
+fail:
+	free_schemes_arr(schemes, *nr_schemes);
+	return NULL;
+}
+
+static ssize_t debugfs_schemes_write(struct file *file, const char __user *buf,
+		size_t count, loff_t *ppos)
+{
+	struct damon_ctx *ctx = &damon_user_ctx;
+	char *kbuf;
+	struct damos **schemes;
+	ssize_t nr_schemes = 0, ret;
+	int err;
+
+	if (*ppos)
+		return -EINVAL;
+
+	kbuf = kmalloc(count, GFP_KERNEL);
+	if (!kbuf)
+		return -ENOMEM;
+
+	ret = simple_write_to_buffer(kbuf, count, ppos, buf, count);
+	if (ret < 0)
+		goto out;
+
+	schemes = str_to_schemes(kbuf, ret, &nr_schemes);
+
+	mutex_lock(&ctx->kdamond_lock);
+	if (ctx->kdamond) {
+		ret = -EBUSY;
+		goto unlock_out;
+	}
+
+	err = damon_set_schemes(ctx, schemes, nr_schemes);
+	if (err)
+		ret = err;
+	else
+		nr_schemes = 0;
+unlock_out:
+	mutex_unlock(&ctx->kdamond_lock);
+	free_schemes_arr(schemes, nr_schemes);
+out:
+	kfree(kbuf);
+	return ret;
+}
+
 static ssize_t damon_sprint_pids(struct damon_ctx *ctx, char *buf, ssize_t len)
 {
 	struct damon_task *t;
@@ -1647,6 +1811,12 @@ static const struct file_operations pids_fops = {
 	.write = debugfs_pids_write,
 };
 
+static const struct file_operations schemes_fops = {
+	.owner = THIS_MODULE,
+	.read = debugfs_schemes_read,
+	.write = debugfs_schemes_write,
+};
+
 static const struct file_operations record_fops = {
 	.owner = THIS_MODULE,
 	.read = debugfs_record_read,
@@ -1663,10 +1833,10 @@ static struct dentry *debugfs_root;
 
 static int __init damon_debugfs_init(void)
 {
-	const char * const file_names[] = {"attrs", "record",
+	const char * const file_names[] = {"attrs", "record", "schemes",
 		"pids", "monitor_on"};
 	const struct file_operations *fops[] = {&attrs_fops, &record_fops,
-		&pids_fops, &monitor_on_fops};
+		&schemes_fops, &pids_fops, &monitor_on_fops};
 	int i;
 
 	debugfs_root = debugfs_create_dir("damon", NULL);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC v7 5/7] mm/damon-test: Add kunit test case for regions age accounting
  2020-04-29 12:45 [RFC v7 0/7] Implement Data Access Monitoring-based Memory Operation Schemes SeongJae Park
                   ` (3 preceding siblings ...)
  2020-04-29 12:45 ` [RFC v7 4/7] mm/damon/schemes: Implement a debugfs interface SeongJae Park
@ 2020-04-29 12:45 ` SeongJae Park
  2020-04-29 12:45 ` [RFC v7 6/7] mm/damon/selftests: Add 'schemes' debugfs tests SeongJae Park
  2020-04-29 12:45 ` [RFC v7 7/7] damon/tools: Support more human friendly 'schemes' control SeongJae Park
  6 siblings, 0 replies; 8+ messages in thread
From: SeongJae Park @ 2020-04-29 12:45 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, aarcange, acme,
	alexander.shishkin, amit, benh, brendan.d.gregg, brendanhiggins,
	cai, colin.king, corbet, dwmw, irogers, jolsa, kirill,
	mark.rutland, mgorman, minchan, mingo, namhyung, peterz, rdunlap,
	riel, rientjes, rostedt, sblbir, shakeelb, shuah, sj38.park, snu,
	vbabka, vdavydov.dev, yang.shi, ying.huang, linux-damon,
	linux-mm, linux-doc, linux-kernel

From: SeongJae Park <sjpark@amazon.de>

After merges of regions, each region should know their last shape in
proper way to measure the changes from the last modification and reset
the age if the changes are significant.  This commit adds kunit test
cases checking whether the regions are knowing their last shape properly
after merges of regions.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
---
 mm/damon-test.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mm/damon-test.h b/mm/damon-test.h
index 439ffce783f6..780094deea05 100644
--- a/mm/damon-test.h
+++ b/mm/damon-test.h
@@ -551,6 +551,8 @@ static void damon_test_merge_regions_of(struct kunit *test)
 
 	unsigned long saddrs[] = {0, 114, 130, 156, 170};
 	unsigned long eaddrs[] = {112, 130, 156, 170, 230};
+	unsigned long lsa[] = {0, 114, 130, 156, 184};
+	unsigned long lea[] = {100, 122, 156, 170, 230};
 	int i;
 
 	t = damon_new_task(42);
@@ -567,6 +569,9 @@ static void damon_test_merge_regions_of(struct kunit *test)
 		r = __nth_region_of(t, i);
 		KUNIT_EXPECT_EQ(test, r->vm_start, saddrs[i]);
 		KUNIT_EXPECT_EQ(test, r->vm_end, eaddrs[i]);
+		KUNIT_EXPECT_EQ(test, r->last_vm_start, lsa[i]);
+		KUNIT_EXPECT_EQ(test, r->last_vm_end, lea[i]);
+
 	}
 	damon_free_task(t);
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC v7 6/7] mm/damon/selftests: Add 'schemes' debugfs tests
  2020-04-29 12:45 [RFC v7 0/7] Implement Data Access Monitoring-based Memory Operation Schemes SeongJae Park
                   ` (4 preceding siblings ...)
  2020-04-29 12:45 ` [RFC v7 5/7] mm/damon-test: Add kunit test case for regions age accounting SeongJae Park
@ 2020-04-29 12:45 ` SeongJae Park
  2020-04-29 12:45 ` [RFC v7 7/7] damon/tools: Support more human friendly 'schemes' control SeongJae Park
  6 siblings, 0 replies; 8+ messages in thread
From: SeongJae Park @ 2020-04-29 12:45 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, aarcange, acme,
	alexander.shishkin, amit, benh, brendan.d.gregg, brendanhiggins,
	cai, colin.king, corbet, dwmw, irogers, jolsa, kirill,
	mark.rutland, mgorman, minchan, mingo, namhyung, peterz, rdunlap,
	riel, rientjes, rostedt, sblbir, shakeelb, shuah, sj38.park, snu,
	vbabka, vdavydov.dev, yang.shi, ying.huang, linux-damon,
	linux-mm, linux-doc, linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit adds simple selftets for 'schemes' debugfs file of DAMON.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 .../testing/selftests/damon/debugfs_attrs.sh  | 29 +++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/tools/testing/selftests/damon/debugfs_attrs.sh b/tools/testing/selftests/damon/debugfs_attrs.sh
index d5188b0f71b1..4aeb2037a67e 100755
--- a/tools/testing/selftests/damon/debugfs_attrs.sh
+++ b/tools/testing/selftests/damon/debugfs_attrs.sh
@@ -97,6 +97,35 @@ fi
 
 echo $ORIG_CONTENT > $file
 
+# Test schemes file
+file="$DBGFS/schemes"
+
+ORIG_CONTENT=$(cat $file)
+echo "1 2 3 4 5 6 3" > $file
+if [ $? -ne 0 ]
+then
+	echo "$file write fail"
+	echo $ORIG_CONTENT > $file
+	exit 1
+fi
+
+echo "1 2
+3 4 5 6 3" > $file
+if [ $? -eq 0 ]
+then
+	echo "$file multi line write success (expected fail)"
+	echo $ORIG_CONTENT > $file
+	exit 1
+fi
+
+echo > $file
+if [ $? -ne 0 ]
+then
+	echo "$file empty string writing fail"
+	echo $ORIG_CONTENT > $file
+	exit 1
+fi
+
 # Test pids file
 file="$DBGFS/pids"
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC v7 7/7] damon/tools: Support more human friendly 'schemes' control
  2020-04-29 12:45 [RFC v7 0/7] Implement Data Access Monitoring-based Memory Operation Schemes SeongJae Park
                   ` (5 preceding siblings ...)
  2020-04-29 12:45 ` [RFC v7 6/7] mm/damon/selftests: Add 'schemes' debugfs tests SeongJae Park
@ 2020-04-29 12:45 ` SeongJae Park
  6 siblings, 0 replies; 8+ messages in thread
From: SeongJae Park @ 2020-04-29 12:45 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, aarcange, acme,
	alexander.shishkin, amit, benh, brendan.d.gregg, brendanhiggins,
	cai, colin.king, corbet, dwmw, irogers, jolsa, kirill,
	mark.rutland, mgorman, minchan, mingo, namhyung, peterz, rdunlap,
	riel, rientjes, rostedt, sblbir, shakeelb, shuah, sj38.park, snu,
	vbabka, vdavydov.dev, yang.shi, ying.huang, linux-damon,
	linux-mm, linux-doc, linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit implements 'schemes' subcommand of the damon userspace tool.
It can be used to describe and apply the data access monitoring-based
operation schemes in more human friendly fashion.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 tools/damon/_convert_damos.py | 126 ++++++++++++++++++++++++++++++
 tools/damon/_damon.py         | 143 ++++++++++++++++++++++++++++++++++
 tools/damon/damo              |   7 ++
 tools/damon/record.py         | 135 +++-----------------------------
 tools/damon/schemes.py        | 105 +++++++++++++++++++++++++
 5 files changed, 393 insertions(+), 123 deletions(-)
 create mode 100755 tools/damon/_convert_damos.py
 create mode 100644 tools/damon/_damon.py
 create mode 100644 tools/damon/schemes.py

diff --git a/tools/damon/_convert_damos.py b/tools/damon/_convert_damos.py
new file mode 100755
index 000000000000..3a01c6b16d18
--- /dev/null
+++ b/tools/damon/_convert_damos.py
@@ -0,0 +1,126 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+"""
+Change human readable data access monitoring-based operation schemes to the low
+level input for the '<debugfs>/damon/schemes' file.  Below is an example of the
+schemes written in the human readable format:
+
+# format is: <min/max size> <min/max frequency (0-100)> <min/max age> <action>
+# lines starts with '#' or blank are ignored.
+# B/K/M/G/T for Bytes/KiB/MiB/GiB/TiB
+# us/ms/s/m/h/d for micro-seconds/milli-seconds/seconds/minutes/hours/days
+# 'null' means zero, which passes the check
+
+# if a region (no matter of its size) keeps a high access frequency for more
+# than 100ms, put the region on the head of the LRU list (call madvise() with
+# MADV_WILLNEED).
+null	null	80	null	100ms	null	willneed
+
+# if a region keeps a low access frequency for more than 100ms, put the
+# region on the tail of the LRU list (call madvise() with MADV_COLD).
+0B	0B	10	20	200ms	1h cold
+
+# if a region keeps a very low access frequency for more than 100ms, swap
+# out the region immediately (call madvise() with MADV_PAGEOUT).
+0B	null	0	10	100ms	2h pageout
+
+# if a region of a size bigger than 2MiB keeps a very high access frequency
+# for more than 100ms, let the region to use huge pages (call madvise()
+# with MADV_HUGEPAGE).
+2M	null	90	99	100ms	2h hugepage
+
+# If a regions of a size bigger than 2MiB keeps no high access frequency
+# for more than 100ms, avoid the region from using huge pages (call
+# madvise() with MADV_NOHUGEPAGE).
+2M	null	0	25	100ms	2h nohugepage
+"""
+
+import argparse
+
+unit_to_bytes = {'B': 1, 'K': 1024, 'M': 1024 * 1024, 'G': 1024 * 1024 * 1024,
+        'T': 1024 * 1024 * 1024 * 1024}
+
+def text_to_bytes(txt):
+    if txt == 'null':
+        return 0
+    unit = txt[-1]
+    number = int(txt[:-1])
+    return number * unit_to_bytes[unit]
+
+unit_to_usecs = {'us': 1, 'ms': 1000, 's': 1000 * 1000, 'm': 60 * 1000 * 1000,
+        'h': 60 * 60 * 1000 * 1000, 'd': 24 * 60 * 60 * 1000 * 1000}
+
+def text_to_us(txt):
+    if txt == 'null':
+        return 0
+    unit = txt[-2:]
+    if unit in ['us', 'ms']:
+        number = int(txt[:-2])
+    else:
+        unit = txt[-1]
+        number = int(txt[:-1])
+    return number * unit_to_usecs[unit]
+
+damos_action_to_int = {'DAMOS_WILLNEED': 0, 'DAMOS_COLD': 1,
+        'DAMOS_PAGEOUT': 2, 'DAMOS_HUGEPAGE': 3, 'DAMOS_NOHUGEPAGE': 4}
+
+def text_to_damos_action(txt):
+    return damos_action_to_int['DAMOS_' + txt.upper()]
+
+def text_to_nr_accesses(txt, max_nr_accesses):
+    if txt == 'null':
+        return 0
+    return int(int(txt) * max_nr_accesses / 100)
+
+def debugfs_scheme(line, sample_interval, aggr_interval):
+    fields = line.split()
+    if len(fields) != 7:
+        print('wrong input line: %s' % line)
+        exit(1)
+
+    limit_nr_accesses = aggr_interval / sample_interval
+    try:
+        min_sz = text_to_bytes(fields[0])
+        max_sz = text_to_bytes(fields[1])
+        min_nr_accesses = text_to_nr_accesses(fields[2], limit_nr_accesses)
+        max_nr_accesses = text_to_nr_accesses(fields[3], limit_nr_accesses)
+        min_age = text_to_us(fields[4]) / aggr_interval
+        max_age = text_to_us(fields[5]) / aggr_interval
+        action = text_to_damos_action(fields[6])
+    except:
+        print('wrong input field')
+        raise
+    return '%d\t%d\t%d\t%d\t%d\t%d\t%d' % (min_sz, max_sz, min_nr_accesses,
+            max_nr_accesses, min_age, max_age, action)
+
+def convert(schemes_file, sample_interval, aggr_interval):
+    lines = []
+    with open(schemes_file, 'r') as f:
+        for line in f:
+            if line.startswith('#'):
+                continue
+            line = line.strip()
+            if line == '':
+                continue
+            lines.append(debugfs_scheme(line, sample_interval, aggr_interval))
+    return '\n'.join(lines)
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument('input', metavar='<file>',
+            help='input file describing the schemes')
+    parser.add_argument('-s', '--sample', metavar='<interval>', type=int,
+            default=5000, help='sampling interval (us)')
+    parser.add_argument('-a', '--aggr', metavar='<interval>', type=int,
+            default=100000, help='aggregation interval (us)')
+    args = parser.parse_args()
+
+    schemes_file = args.input
+    sample_interval = args.sample
+    aggr_interval = args.aggr
+
+    print(convert(schemes_file, sample_interval, aggr_interval))
+
+if __name__ == '__main__':
+    main()
diff --git a/tools/damon/_damon.py b/tools/damon/_damon.py
new file mode 100644
index 000000000000..0a703ec7471a
--- /dev/null
+++ b/tools/damon/_damon.py
@@ -0,0 +1,143 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+"""
+Contains core functions for DAMON debugfs control.
+"""
+
+import os
+import subprocess
+
+debugfs_attrs = None
+debugfs_record = None
+debugfs_schemes = None
+debugfs_pids = None
+debugfs_monitor_on = None
+
+def set_target_pid(pid):
+    return subprocess.call('echo %s > %s' % (pid, debugfs_pids), shell=True,
+            executable='/bin/bash')
+
+def turn_damon(on_off):
+    return subprocess.call("echo %s > %s" % (on_off, debugfs_monitor_on),
+            shell=True, executable="/bin/bash")
+
+def is_damon_running():
+    with open(debugfs_monitor_on, 'r') as f:
+        return f.read().strip() == 'on'
+
+class Attrs:
+    sample_interval = None
+    aggr_interval = None
+    regions_update_interval = None
+    min_nr_regions = None
+    max_nr_regions = None
+    rbuf_len = None
+    rfile_path = None
+    schemes = None
+
+    def __init__(self, s, a, r, n, x, l, f, c):
+        self.sample_interval = s
+        self.aggr_interval = a
+        self.regions_update_interval = r
+        self.min_nr_regions = n
+        self.max_nr_regions = x
+        self.rbuf_len = l
+        self.rfile_path = f
+        self.schemes = c
+
+    def __str__(self):
+        return "%s %s %s %s %s %s %s\n%s" % (self.sample_interval,
+                self.aggr_interval, self.regions_update_interval,
+                self.min_nr_regions, self.max_nr_regions, self.rbuf_len,
+                self.rfile_path, self.schemes)
+
+    def attr_str(self):
+        return "%s %s %s %s %s " % (self.sample_interval, self.aggr_interval,
+                self.regions_update_interval, self.min_nr_regions,
+                self.max_nr_regions)
+
+    def record_str(self):
+        return '%s %s ' % (self.rbuf_len, self.rfile_path)
+
+    def apply(self):
+        ret = subprocess.call('echo %s > %s' % (self.attr_str(), debugfs_attrs),
+                shell=True, executable='/bin/bash')
+        if ret:
+            return ret
+        ret = subprocess.call('echo %s > %s' % (self.record_str(),
+            debugfs_record), shell=True, executable='/bin/bash')
+        if ret:
+            return ret
+        return subprocess.call('echo %s > %s' % (
+            self.schemes.replace('\n', ' '), debugfs_schemes), shell=True,
+            executable='/bin/bash')
+
+def current_attrs():
+    with open(debugfs_attrs, 'r') as f:
+        attrs = f.read().split()
+    attrs = [int(x) for x in attrs]
+
+    with open(debugfs_record, 'r') as f:
+        rattrs = f.read().split()
+    attrs.append(int(rattrs[0]))
+    attrs.append(rattrs[1])
+
+    with open(debugfs_schemes, 'r') as f:
+        schemes = f.read()
+    attrs.append(schemes)
+
+    return Attrs(*attrs)
+
+def chk_update_debugfs(debugfs):
+    global debugfs_attrs
+    global debugfs_record
+    global debugfs_schemes
+    global debugfs_pids
+    global debugfs_monitor_on
+
+    debugfs_damon = os.path.join(debugfs, 'damon')
+    debugfs_attrs = os.path.join(debugfs_damon, 'attrs')
+    debugfs_record = os.path.join(debugfs_damon, 'record')
+    debugfs_schemes = os.path.join(debugfs_damon, 'schemes')
+    debugfs_pids = os.path.join(debugfs_damon, 'pids')
+    debugfs_monitor_on = os.path.join(debugfs_damon, 'monitor_on')
+
+    if not os.path.isdir(debugfs_damon):
+        print("damon debugfs dir (%s) not found", debugfs_damon)
+        exit(1)
+
+    for f in [debugfs_attrs, debugfs_record, debugfs_schemes, debugfs_pids,
+            debugfs_monitor_on]:
+        if not os.path.isfile(f):
+            print("damon debugfs file (%s) not found" % f)
+            exit(1)
+
+def cmd_args_to_attrs(args):
+    "Generate attributes with specified arguments"
+    sample_interval = args.sample
+    aggr_interval = args.aggr
+    regions_update_interval = args.updr
+    min_nr_regions = args.minr
+    max_nr_regions = args.maxr
+    rbuf_len = args.rbuf
+    if not os.path.isabs(args.out):
+        args.out = os.path.join(os.getcwd(), args.out)
+    rfile_path = args.out
+    schemes = args.schemes
+    return Attrs(sample_interval, aggr_interval, regions_update_interval,
+            min_nr_regions, max_nr_regions, rbuf_len, rfile_path, schemes)
+
+def set_attrs_argparser(parser):
+    parser.add_argument('-d', '--debugfs', metavar='<debugfs>', type=str,
+            default='/sys/kernel/debug', help='debugfs mounted path')
+    parser.add_argument('-s', '--sample', metavar='<interval>', type=int,
+            default=5000, help='sampling interval')
+    parser.add_argument('-a', '--aggr', metavar='<interval>', type=int,
+            default=100000, help='aggregate interval')
+    parser.add_argument('-u', '--updr', metavar='<interval>', type=int,
+            default=1000000, help='regions update interval')
+    parser.add_argument('-n', '--minr', metavar='<# regions>', type=int,
+            default=10, help='minimal number of regions')
+    parser.add_argument('-m', '--maxr', metavar='<# regions>', type=int,
+            default=1000, help='maximum number of regions')
diff --git a/tools/damon/damo b/tools/damon/damo
index 58e1099ae5fc..ce7180069bef 100755
--- a/tools/damon/damo
+++ b/tools/damon/damo
@@ -5,6 +5,7 @@ import argparse
 
 import record
 import report
+import schemes
 
 class SubCmdHelpFormatter(argparse.RawDescriptionHelpFormatter):
     def _format_action(self, action):
@@ -25,6 +26,10 @@ parser_record = subparser.add_parser('record',
         help='record data accesses of the given target processes')
 record.set_argparser(parser_record)
 
+parser_schemes = subparser.add_parser('schemes',
+        help='apply operation schemes to the given target process')
+schemes.set_argparser(parser_schemes)
+
 parser_report = subparser.add_parser('report',
         help='report the recorded data accesses in the specified form')
 report.set_argparser(parser_report)
@@ -33,5 +38,7 @@ args = parser.parse_args()
 
 if args.command == 'record':
     record.main(args)
+elif args.command == 'schemes':
+    schemes.main(args)
 elif args.command == 'report':
     report.main(args)
diff --git a/tools/damon/record.py b/tools/damon/record.py
index a547d479a103..3bbf7b8359da 100644
--- a/tools/damon/record.py
+++ b/tools/damon/record.py
@@ -6,28 +6,12 @@ Record data access patterns of the target process.
 """
 
 import argparse
-import copy
 import os
 import signal
 import subprocess
 import time
 
-debugfs_attrs = None
-debugfs_record = None
-debugfs_pids = None
-debugfs_monitor_on = None
-
-def set_target_pid(pid):
-    return subprocess.call('echo %s > %s' % (pid, debugfs_pids), shell=True,
-            executable='/bin/bash')
-
-def turn_damon(on_off):
-    return subprocess.call("echo %s > %s" % (on_off, debugfs_monitor_on),
-            shell=True, executable="/bin/bash")
-
-def is_damon_running():
-    with open(debugfs_monitor_on, 'r') as f:
-        return f.read().strip() == 'on'
+import _damon
 
 def do_record(target, is_target_cmd, attrs, old_attrs):
     if os.path.isfile(attrs.rfile_path):
@@ -36,93 +20,29 @@ def do_record(target, is_target_cmd, attrs, old_attrs):
     if attrs.apply():
         print('attributes (%s) failed to be applied' % attrs)
         cleanup_exit(old_attrs, -1)
-    print('# damon attrs: %s' % attrs)
+    print('# damon attrs: %s %s' % (attrs.attr_str(), attrs.record_str()))
     if is_target_cmd:
         p = subprocess.Popen(target, shell=True, executable='/bin/bash')
         target = p.pid
-    if set_target_pid(target):
+    if _damon.set_target_pid(target):
         print('pid setting (%s) failed' % target)
         cleanup_exit(old_attrs, -2)
-    if turn_damon('on'):
+    if _damon.turn_damon('on'):
         print('could not turn on damon' % target)
         cleanup_exit(old_attrs, -3)
     if is_target_cmd:
         p.wait()
     while True:
         # damon will turn it off by itself if the target tasks are terminated.
-        if not is_damon_running():
+        if not _damon.is_damon_running():
             break
         time.sleep(1)
 
     cleanup_exit(old_attrs, 0)
 
-class Attrs:
-    sample_interval = None
-    aggr_interval = None
-    regions_update_interval = None
-    min_nr_regions = None
-    max_nr_regions = None
-    rbuf_len = None
-    rfile_path = None
-
-    def __init__(self, s, a, r, n, x, l, f):
-        self.sample_interval = s
-        self.aggr_interval = a
-        self.regions_update_interval = r
-        self.min_nr_regions = n
-        self.max_nr_regions = x
-        self.rbuf_len = l
-        self.rfile_path = f
-
-    def __str__(self):
-        return "%s %s %s %s %s %s %s" % (self.sample_interval, self.aggr_interval,
-                self.regions_update_interval, self.min_nr_regions,
-                self.max_nr_regions, self.rbuf_len, self.rfile_path)
-
-    def attr_str(self):
-        return "%s %s %s %s %s " % (self.sample_interval, self.aggr_interval,
-                self.regions_update_interval, self.min_nr_regions,
-                self.max_nr_regions)
-
-    def record_str(self):
-        return '%s %s ' % (self.rbuf_len, self.rfile_path)
-
-    def apply(self):
-        ret = subprocess.call('echo %s > %s' % (self.attr_str(), debugfs_attrs),
-                shell=True, executable='/bin/bash')
-        if ret:
-            return ret
-        return subprocess.call('echo %s > %s' % (self.record_str(),
-            debugfs_record), shell=True, executable='/bin/bash')
-
-def current_attrs():
-    with open(debugfs_attrs, 'r') as f:
-        attrs = f.read().split()
-    attrs = [int(x) for x in attrs]
-
-    with open(debugfs_record, 'r') as f:
-        rattrs = f.read().split()
-    attrs.append(int(rattrs[0]))
-    attrs.append(rattrs[1])
-    return Attrs(*attrs)
-
-def cmd_args_to_attrs(args):
-    "Generate attributes with specified arguments"
-    sample_interval = args.sample
-    aggr_interval = args.aggr
-    regions_update_interval = args.updr
-    min_nr_regions = args.minr
-    max_nr_regions = args.maxr
-    rbuf_len = args.rbuf
-    if not os.path.isabs(args.out):
-        args.out = os.path.join(os.getcwd(), args.out)
-    rfile_path = args.out
-    return Attrs(sample_interval, aggr_interval, regions_update_interval,
-            min_nr_regions, max_nr_regions, rbuf_len, rfile_path)
-
 def cleanup_exit(orig_attrs, exit_code):
-    if is_damon_running():
-        if turn_damon('off'):
+    if _damon.is_damon_running():
+        if _damon.turn_damon('off'):
             print('failed to turn damon off!')
     if orig_attrs:
         if orig_attrs.apply():
@@ -133,51 +53,19 @@ def sighandler(signum, frame):
     print('\nsignal %s received' % signum)
     cleanup_exit(orig_attrs, signum)
 
-def chk_update_debugfs(debugfs):
-    global debugfs_attrs
-    global debugfs_record
-    global debugfs_pids
-    global debugfs_monitor_on
-
-    debugfs_damon = os.path.join(debugfs, 'damon')
-    debugfs_attrs = os.path.join(debugfs_damon, 'attrs')
-    debugfs_record = os.path.join(debugfs_damon, 'record')
-    debugfs_pids = os.path.join(debugfs_damon, 'pids')
-    debugfs_monitor_on = os.path.join(debugfs_damon, 'monitor_on')
-
-    if not os.path.isdir(debugfs_damon):
-        print("damon debugfs dir (%s) not found", debugfs_damon)
-        exit(1)
-
-    for f in [debugfs_attrs, debugfs_record, debugfs_pids, debugfs_monitor_on]:
-        if not os.path.isfile(f):
-            print("damon debugfs file (%s) not found" % f)
-            exit(1)
-
 def chk_permission():
     if os.geteuid() != 0:
         print("Run as root")
         exit(1)
 
 def set_argparser(parser):
+    _damon.set_attrs_argparser(parser)
     parser.add_argument('target', type=str, metavar='<target>',
             help='the target command or the pid to record')
-    parser.add_argument('-s', '--sample', metavar='<interval>', type=int,
-            default=5000, help='sampling interval')
-    parser.add_argument('-a', '--aggr', metavar='<interval>', type=int,
-            default=100000, help='aggregate interval')
-    parser.add_argument('-u', '--updr', metavar='<interval>', type=int,
-            default=1000000, help='regions update interval')
-    parser.add_argument('-n', '--minr', metavar='<# regions>', type=int,
-            default=10, help='minimal number of regions')
-    parser.add_argument('-m', '--maxr', metavar='<# regions>', type=int,
-            default=1000, help='maximum number of regions')
     parser.add_argument('-l', '--rbuf', metavar='<len>', type=int,
             default=1024*1024, help='length of record result buffer')
     parser.add_argument('-o', '--out', metavar='<file path>', type=str,
             default='damon.data', help='output file path')
-    parser.add_argument('-d', '--debugfs', metavar='<debugfs>', type=str,
-            default='/sys/kernel/debug', help='debugfs mounted path')
 
 def main(args=None):
     global orig_attrs
@@ -187,13 +75,14 @@ def main(args=None):
         args = parser.parse_args()
 
     chk_permission()
-    chk_update_debugfs(args.debugfs)
+    _damon.chk_update_debugfs(args.debugfs)
 
     signal.signal(signal.SIGINT, sighandler)
     signal.signal(signal.SIGTERM, sighandler)
-    orig_attrs = current_attrs()
+    orig_attrs = _damon.current_attrs()
 
-    new_attrs = cmd_args_to_attrs(args)
+    args.schemes = ''
+    new_attrs = _damon.cmd_args_to_attrs(args)
     target = args.target
 
     target_fields = target.split()
diff --git a/tools/damon/schemes.py b/tools/damon/schemes.py
new file mode 100644
index 000000000000..ca1551fe5696
--- /dev/null
+++ b/tools/damon/schemes.py
@@ -0,0 +1,105 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+"""
+Apply given operation schemes to the target process.
+"""
+
+import argparse
+import os
+import signal
+import subprocess
+import time
+
+import _convert_damos
+import _damon
+
+def run_damon(target, is_target_cmd, attrs, old_attrs):
+    if os.path.isfile(attrs.rfile_path):
+        os.rename(attrs.rfile_path, attrs.rfile_path + '.old')
+
+    if attrs.apply():
+        print('attributes (%s) failed to be applied' % attrs)
+        cleanup_exit(old_attrs, -1)
+    print('# damon attrs: %s %s' % (attrs.attr_str(), attrs.record_str()))
+    for line in attrs.schemes.split('\n'):
+        print('# scheme: %s' % line)
+    if is_target_cmd:
+        p = subprocess.Popen(target, shell=True, executable='/bin/bash')
+        target = p.pid
+    if _damon.set_target_pid(target):
+        print('pid setting (%s) failed' % target)
+        cleanup_exit(old_attrs, -2)
+    if _damon.turn_damon('on'):
+        print('could not turn on damon' % target)
+        cleanup_exit(old_attrs, -3)
+    if is_target_cmd:
+        p.wait()
+    while True:
+        # damon will turn it off by itself if the target tasks are terminated.
+        if not _damon.is_damon_running():
+            break
+        time.sleep(1)
+
+    cleanup_exit(old_attrs, 0)
+
+def cleanup_exit(orig_attrs, exit_code):
+    if _damon.is_damon_running():
+        if _damon.turn_damon('off'):
+            print('failed to turn damon off!')
+    if orig_attrs:
+        if orig_attrs.apply():
+            print('original attributes (%s) restoration failed!' % orig_attrs)
+    exit(exit_code)
+
+def sighandler(signum, frame):
+    print('\nsignal %s received' % signum)
+    cleanup_exit(orig_attrs, signum)
+
+def chk_permission():
+    if os.geteuid() != 0:
+        print("Run as root")
+        exit(1)
+
+def set_argparser(parser):
+    _damon.set_attrs_argparser(parser)
+    parser.add_argument('target', type=str, metavar='<target>',
+            help='the target command or the pid to record')
+    parser.add_argument('-c', '--schemes', metavar='<file>', type=str,
+            default='damon.schemes',
+            help='data access monitoring-based operation schemes')
+
+def main(args=None):
+    global orig_attrs
+    if not args:
+        parser = argparse.ArgumentParser()
+        set_argparser(parser)
+        args = parser.parse_args()
+
+    chk_permission()
+    _damon.chk_update_debugfs(args.debugfs)
+
+    signal.signal(signal.SIGINT, sighandler)
+    signal.signal(signal.SIGTERM, sighandler)
+    orig_attrs = _damon.current_attrs()
+
+    args.rbuf = 0
+    args.out = 'null'
+    args.schemes = _convert_damos.convert(args.schemes, args.sample, args.aggr)
+    new_attrs = _damon.cmd_args_to_attrs(args)
+    target = args.target
+
+    target_fields = target.split()
+    if not subprocess.call('which %s > /dev/null' % target_fields[0],
+            shell=True, executable='/bin/bash'):
+        run_damon(target, True, new_attrs, orig_attrs)
+    else:
+        try:
+            pid = int(target)
+        except:
+            print('target \'%s\' is neither a command, nor a pid' % target)
+            exit(1)
+        run_damon(target, False, new_attrs, orig_attrs)
+
+if __name__ == '__main__':
+    main()
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-04-29 12:49 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-29 12:45 [RFC v7 0/7] Implement Data Access Monitoring-based Memory Operation Schemes SeongJae Park
2020-04-29 12:45 ` [RFC v7 1/7] mm/madvise: Export do_madvise() to external GPL modules SeongJae Park
2020-04-29 12:45 ` [RFC v7 2/7] mm/damon: Account age of target regions SeongJae Park
2020-04-29 12:45 ` [RFC v7 3/7] mm/damon: Implement data access monitoring-based operation schemes SeongJae Park
2020-04-29 12:45 ` [RFC v7 4/7] mm/damon/schemes: Implement a debugfs interface SeongJae Park
2020-04-29 12:45 ` [RFC v7 5/7] mm/damon-test: Add kunit test case for regions age accounting SeongJae Park
2020-04-29 12:45 ` [RFC v7 6/7] mm/damon/selftests: Add 'schemes' debugfs tests SeongJae Park
2020-04-29 12:45 ` [RFC v7 7/7] damon/tools: Support more human friendly 'schemes' control SeongJae Park

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).