From: SeongJae Park <sjpark@amazon.com>
To: <akpm@linux-foundation.org>
Cc: SeongJae Park <sjpark@amazon.de>, <Jonathan.Cameron@Huawei.com>,
	<aarcange@redhat.com>, <acme@kernel.org>,
	<alexander.shishkin@linux.intel.com>, <amit@kernel.org>,
	<benh@kernel.crashing.org>, <brendan.d.gregg@gmail.com>,
	<brendanhiggins@google.com>, <cai@lca.pw>,
	<colin.king@canonical.com>, <corbet@lwn.net>, <dwmw@amazon.com>,
	<irogers@google.com>, <jolsa@redhat.com>, <kirill@shutemov.name>,
	<mark.rutland@arm.com>, <mgorman@suse.de>, <minchan@kernel.org>,
	<mingo@redhat.com>, <namhyung@kernel.org>, <peterz@infradead.org>,
	<rdunlap@infradead.org>, <riel@surriel.com>,
	<rientjes@google.com>, <rostedt@goodmis.org>, <sblbir@amazon.com>,
	<shakeelb@google.com>, <shuah@kernel.org>, <sj38.park@gmail.com>,
	<snu@amazon.de>, <vbabka@suse.cz>, <vdavydov.dev@gmail.com>,
	<yang.shi@linux.alibaba.com>, <ying.huang@intel.com>,
	<linux-damon@amazon.com>, <linux-mm@kvack.org>,
	<linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: [RFC v8 8/8] Documentation/admin-guide/mm: Document DAMON-based operation schemes
Date: Tue, 12 May 2020 13:53:43 +0200
Message-ID: <20200512115343.27699-9-sjpark@amazon.com>
In-Reply-To: <20200512115343.27699-1-sjpark@amazon.com>

From: SeongJae Park <sjpark@amazon.de>

This commit documents DAMON-based operation schemes in the DAMON
document.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 .../admin-guide/mm/data_access_monitor.rst    | 100 +++++++++++++++++-
 1 file changed, 98 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/mm/data_access_monitor.rst b/Documentation/admin-guide/mm/data_access_monitor.rst
index 915956aa1065..d4a48bc63400 100644
--- a/Documentation/admin-guide/mm/data_access_monitor.rst
+++ b/Documentation/admin-guide/mm/data_access_monitor.rst
@@ -182,8 +182,8 @@ only for each of a user-specified time interval (``regions update interval``).
 ``debugfs`` Interface
 =====================
 
-DAMON exports four files, ``attrs``, ``pids``, ``record``, and ``monitor_on``
-under its debugfs directory, ``<debugfs>/damon/``.
+DAMON exports five files, ``attrs``, ``pids``, ``record``, ``schemes`` and
+``monitor_on`` under its debugfs directory, ``<debugfs>/damon/``.
 
 Attributes
 ----------
@@ -227,6 +227,46 @@ be 4 KiB and the result to be saved in ``/damon.data``::
     # cat record
     4096 /damon.data
 
+Schemes
+-------
+
+For usual DAMON-based data access aware memory management optimizations, users
+would simply want the system to apply a memory management action to a memory
+region of a specific size having a specific access frequency for a specific
+time.  DAMON receives such formalized operation schemes from the user and
+applies those to the target processes.
+
+Users can get and set the schemes by reading from and writing to the
+``schemes`` debugfs file.  Each scheme should be written on its own line in
+the form below::
+
+    min-size max-size min-acc max-acc min-age max-age action
+
+Sizes of regions (``min-size`` and ``max-size``) are in bytes, access
+frequencies (``min-acc`` and ``max-acc``) are in number of monitored accesses
+per aggregate interval, ages of regions (``min-age`` and ``max-age``) are in
+number of aggregate intervals, and the memory management action is given as a
+predefined integer.  Currently, the available actions are ``madvise()`` system
+call hints.  The numbers and the memory hints they represent are as below::
+
+    0   MADV_WILLNEED
+    1   MADV_COLD
+    2   MADV_PAGEOUT
+    3   MADV_HUGEPAGE
+    4   MADV_NOHUGEPAGE
+
+You can disable schemes by simply writing an empty string to the file.  For
+example, the commands below apply a scheme saying “If a memory region larger
+than 4 KiB (4096 0) is showing less than 5 accesses per aggregate interval
+(0 5) for more than 5 aggregate intervals (5 0), page out the region (2)”,
+check the entered scheme again, and finally remove the scheme::
+
+    # cd <debugfs>/damon
+    # echo "4096 0 0 5 5 0 2" > schemes
+    # cat schemes
+    4096 0 0 5 5 0 2
+    # echo > schemes
+
 Turning On/Off
 --------------
 
@@ -426,3 +466,59 @@ made.
 
 Users can specify the resolution of the distribution (``--range``).  It also
 supports 'gnuplot' based simple visualization (``--plot``) of the distribution.
+
+
+DAMON-based Operation Schemes
+-----------------------------
+
+The ``schemes`` subcommand applies the given data access pattern based
+operation schemes to the given target processes.  The target processes are
+specified using either the command to spawn the processes or the pids of
+running processes, similar to the ``record`` subcommand.  Meanwhile, the
+operation schemes should be saved in a text file and passed to the ``schemes``
+subcommand via the ``--schemes`` option.  Each scheme uses the format below::
+
+    min-size max-size min-acc max-acc min-age max-age action
+
+The format also supports comments, several units for size and age of regions,
+and human readable action names.  Currently supported operation actions are
+WILLNEED, COLD, PAGEOUT, HUGEPAGE, and NOHUGEPAGE.  Each action works the same
+as the corresponding ``madvise()`` system call hint.  Note that 0 for max
+values means infinity.  Below is an example schemes file::
+
+    # format is:
+    # <min/max size> <min/max frequency (0-99)> <min/max age> <action>
+    #
+    # B/K/M/G/T for Bytes/KiB/MiB/GiB/TiB
+    # us/ms/s/m/h/d for micro-seconds/milli-seconds/seconds/minutes/hours/days
+    # 'null' means zero for size and age.
+
+    # if a region keeps a high access frequency for more than 100ms, put the
+    # region on the head of the LRU list (call madvise() with MADV_WILLNEED).
+    null    null    80      null    100ms   0s      willneed
+
+    # if a region keeps a low access frequency for more than 200ms and less
+    # than one hour, put the region on the tail of the LRU list (call
+    # madvise() with MADV_COLD).
+    0B      0B      10      20      200ms   1h cold
+
+    # if a region keeps a very low access frequency for more than 1 minute,
+    # swap out the region immediately (call madvise() with MADV_PAGEOUT).
+    0B      null    0       10      60s     0s pageout
+
+    # if a region of a size bigger than 2MiB keeps a very high access frequency
+    # for more than 100ms, let the region use huge pages (call madvise()
+    # with MADV_HUGEPAGE).
+    2M      null    90      99      100ms   0s hugepage
+
+    # if a region of a size bigger than 2MiB keeps a low access frequency for
+    # more than 100ms, avoid using huge pages for the region (call madvise()
+    # with MADV_NOHUGEPAGE).
+    2M      null    0       25      100ms   0s nohugepage
+
+For example, you can make a running process named 'foo' use THP for memory
+regions of 2 MiB or larger that keep a very high access frequency for more
+than 100 milliseconds using the commands below::
+
+    $ echo "2M null 90 99 100ms 0s hugepage" > my_thp_scheme
+    $ ./damo schemes --schemes my_thp_scheme `pidof foo`
-- 
2.17.1
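
For readers of the ``schemes`` debugfs format documented in this patch, the
way a scheme line selects regions can be sketched in Python as below.  This is
only an illustration, not the kernel implementation.  The names Scheme,
parse_scheme(), and matches() are hypothetical, the bounds are assumed to be
inclusive, and a max value of 0 is read as "no upper bound", following the
"4096 0" example in the document.

```python
from dataclasses import dataclass

# Action numbers as listed in the "Schemes" section of the document.
ACTIONS = {
    0: "MADV_WILLNEED",
    1: "MADV_COLD",
    2: "MADV_PAGEOUT",
    3: "MADV_HUGEPAGE",
    4: "MADV_NOHUGEPAGE",
}


@dataclass(frozen=True)
class Scheme:
    """One line of the debugfs file (hypothetical helper type)."""
    min_size: int  # bytes
    max_size: int  # bytes, 0 assumed to mean no upper bound
    min_acc: int   # monitored accesses per aggregate interval
    max_acc: int   # 0 assumed to mean no upper bound
    min_age: int   # number of aggregate intervals
    max_age: int   # 0 assumed to mean no upper bound
    action: int    # key of ACTIONS


def parse_scheme(line: str) -> Scheme:
    """Parse 'min-size max-size min-acc max-acc min-age max-age action'."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError("expected 7 fields, got %d" % len(fields))
    values = [int(field) for field in fields]
    if values[6] not in ACTIONS:
        raise ValueError("unknown action number %d" % values[6])
    return Scheme(*values)


def in_range(value: int, low: int, high: int) -> bool:
    """Inclusive range check (an assumption); high == 0 means unbounded."""
    return value >= low and (high == 0 or value <= high)


def matches(scheme: Scheme, size: int, nr_accesses: int, age: int) -> bool:
    """Return True if a region with these properties fits the scheme."""
    return (in_range(size, scheme.min_size, scheme.max_size)
            and in_range(nr_accesses, scheme.min_acc, scheme.max_acc)
            and in_range(age, scheme.min_age, scheme.max_age))
```

With the example scheme from the document, parse_scheme("4096 0 0 5 5 0 2")
would select an 8 KiB region showing 3 accesses per aggregate interval for 10
aggregate intervals, and would skip the same region if it showed 6 accesses.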


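The human readable schemes file format described for the ``schemes``
subcommand can be read along the lines of the Python sketch below.  It is not
the code of the ``damo`` tool.  The function names are hypothetical, and the
result keeps sizes in bytes, frequencies in the 0-99 range, and ages in
microseconds.  Converting those to the units of the debugfs file depends on
the monitoring attributes and is intentionally left out.

```python
# Unit suffixes as listed in the example schemes file.
SIZE_UNITS = {"B": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}
# Checked in this order so that "us" and "ms" are tried before "s" and "m".
TIME_UNITS_US = (
    ("us", 1),
    ("ms", 1000),
    ("s", 1000 * 1000),
    ("m", 60 * 1000 * 1000),
    ("h", 3600 * 1000 * 1000),
    ("d", 86400 * 1000 * 1000),
)
ACTION_NUMBERS = {
    "willneed": 0,
    "cold": 1,
    "pageout": 2,
    "hugepage": 3,
    "nohugepage": 4,
}


def parse_size(text):
    """Return a size in bytes; 'null' means zero."""
    if text == "null":
        return 0
    unit = text[-1]
    if unit not in SIZE_UNITS:
        raise ValueError("unknown size unit in %r" % text)
    return int(text[:-1]) * SIZE_UNITS[unit]


def parse_age_us(text):
    """Return an age in microseconds; 'null' means zero."""
    if text == "null":
        return 0
    for suffix, scale in TIME_UNITS_US:
        if text.endswith(suffix):
            return int(text[:-len(suffix)]) * scale
    raise ValueError("unknown time unit in %r" % text)


def parse_frequency(text):
    """Return an access frequency in 0-99; 'null' is treated as zero."""
    if text == "null":
        return 0
    value = int(text)
    if not 0 <= value <= 99:
        raise ValueError("frequency out of range: %d" % value)
    return value


def parse_scheme_line(line):
    """Parse one line of a schemes file; None for comments and blank lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = line.split()
    if len(fields) != 7:
        raise ValueError("expected 7 fields, got %d" % len(fields))
    action = fields[6].lower()
    if action not in ACTION_NUMBERS:
        raise ValueError("unknown action %r" % fields[6])
    return (parse_size(fields[0]), parse_size(fields[1]),
            parse_frequency(fields[2]), parse_frequency(fields[3]),
            parse_age_us(fields[4]), parse_age_us(fields[5]),
            ACTION_NUMBERS[action])
```

For instance, the THP scheme from the document, "2M null 90 99 100ms 0s
hugepage", would parse to a minimum size of 2097152 bytes, no maximum size, a
frequency range of 90 to 99, a minimum age of 100000 microseconds, and action
number 3.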