From: SeongJae Park <sjpark@amazon.com>
To: <akpm@linux-foundation.org>
Cc: SeongJae Park <sjpark@amazon.de>, <Jonathan.Cameron@Huawei.com>,
	<aarcange@redhat.com>, <acme@kernel.org>,
	<alexander.shishkin@linux.intel.com>, <amit@kernel.org>,
	<benh@kernel.crashing.org>, <brendan.d.gregg@gmail.com>,
	<brendanhiggins@google.com>, <cai@lca.pw>,
	<colin.king@canonical.com>, <corbet@lwn.net>, <david@redhat.com>,
	<dwmw@amazon.com>, <elver@google.com>, <fan.du@intel.com>,
	<foersleo@amazon.de>, <gthelen@google.com>, <irogers@google.com>,
	<jolsa@redhat.com>, <kirill@shutemov.name>,
	<mark.rutland@arm.com>, <mgorman@suse.de>, <minchan@kernel.org>,
	<mingo@redhat.com>, <namhyung@kernel.org>, <peterz@infradead.org>,
	<rdunlap@infradead.org>, <riel@surriel.com>,
	<rientjes@google.com>, <rostedt@goodmis.org>, <rppt@kernel.org>,
	<sblbir@amazon.com>, <shakeelb@google.com>, <shuah@kernel.org>,
	<sj38.park@gmail.com>, <snu@amazon.de>, <vbabka@suse.cz>,
	<vdavydov.dev@gmail.com>, <yang.shi@linux.alibaba.com>,
	<ying.huang@intel.com>, <zgf574564920@gmail.com>,
	<linux-damon@amazon.com>, <linux-mm@kvack.org>,
	<linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: [PATCH v22 07/18] mm/page_idle: Avoid interferences from concurrent users
Date: Tue, 20 Oct 2020 10:59:29 +0200
Message-ID: <20201020085940.13875-8-sjpark@amazon.com>
In-Reply-To: <20201020085940.13875-1-sjpark@amazon.com>

From: SeongJae Park <sjpark@amazon.de>

Concurrent Idle Page Tracking users can interfere with each other
because the interface doesn't provide a central rule for synchronization
between the users.  Users could implement their own synchronization
rule, but even in that case, applications developed by different users
would not know how to synchronize with each other.  To help with this
situation, this commit introduces a centralized synchronization
infrastructure for Idle Page Tracking.

In detail, this commit introduces a mutex for Idle Page Tracking, called
'page_idle_lock'.  It is exposed to user space via a new boolean sysfs
file, '/sys/kernel/mm/page_idle/lock'.  By writing to and reading from
the file, users can acquire or release the mutex and read its status.
Writes to the Idle Page Tracking 'bitmap' file fail if the lock is not
held, while reads of the file can be done regardless of the lock status.

Note that users could still interfere with each other if they abuse this
locking rule.  Nevertheless, this change makes the rule visible to them.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
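For illustration only (not part of the patch): a minimal user-space
sketch of the lock protocol described above, assuming a kernel with this
patch applied.  'page_idle_lock_write()' and 'PAGE_IDLE_LOCK_PATH' are
hypothetical names introduced here; only the sysfs path and the '1'/'0'
semantics come from the patch.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Path proposed by this patch; it exists only on kernels carrying it. */
#define PAGE_IDLE_LOCK_PATH "/sys/kernel/mm/page_idle/lock"

/*
 * Hypothetical helper: write "1" (acquire) or "0" (release) to the lock
 * file at 'path'.  Returns 0 on success or a negative errno value, e.g.
 * -EBUSY if another user already holds the lock.
 */
int page_idle_lock_write(const char *path, int acquire)
{
	const char *val = acquire ? "1" : "0";
	ssize_t n;
	int err = 0;
	int fd;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, val, 1);
	if (n < 0)
		err = -errno;	/* save errno before close() can change it */
	else if (n != 1)
		err = -EIO;	/* short write: treat as an I/O error */

	close(fd);
	return err;
}
```

A caller would acquire with page_idle_lock_write(PAGE_IDLE_LOCK_PATH, 1),
back off or retry on -EBUSY, write to the 'bitmap' file, and finally
release with page_idle_lock_write(PAGE_IDLE_LOCK_PATH, 0).  Taking the
path as a parameter is a design choice of this sketch so that it can be
exercised against an ordinary file.
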
 .../admin-guide/mm/idle_page_tracking.rst     | 22 +++++++---
 mm/page_idle.c                                | 40 +++++++++++++++++++
 2 files changed, 56 insertions(+), 6 deletions(-)
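
For illustration only (not part of the patch): the documentation hunk
below states that the page at PFN #i maps to bit #i%64 of array element
#i/64.  A minimal sketch of that arithmetic for user space follows.  The
helper names are hypothetical, and the two macros mirror the ones in
mm/page_idle.c but are redefined here with user-space types.

```c
#include <stdint.h>

#define BITMAP_CHUNK_SIZE sizeof(uint64_t)		/* 8 bytes per element */
#define BITMAP_CHUNK_BITS (BITMAP_CHUNK_SIZE * 8)	/* 64 pages per element */

/* Byte offset in the 'bitmap' file of the 8-byte element covering 'pfn'. */
static inline uint64_t pfn_to_bitmap_offset(uint64_t pfn)
{
	return pfn / BITMAP_CHUNK_BITS * BITMAP_CHUNK_SIZE;
}

/* Mask selecting the bit for 'pfn' inside that element (native byte order). */
static inline uint64_t pfn_to_bitmap_mask(uint64_t pfn)
{
	return UINT64_C(1) << (pfn % BITMAP_CHUNK_BITS);
}
```

For example, PFN 130 lives in array element 2 (byte offset 16) at bit 2.
Reads and writes of the 'bitmap' file must be aligned to 8 bytes, so the
offset above is always a valid file position.
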

diff --git a/Documentation/admin-guide/mm/idle_page_tracking.rst b/Documentation/admin-guide/mm/idle_page_tracking.rst
index df9394fb39c2..3f5e7a8b5b78 100644
--- a/Documentation/admin-guide/mm/idle_page_tracking.rst
+++ b/Documentation/admin-guide/mm/idle_page_tracking.rst
@@ -21,13 +21,13 @@ User API
 ========
 
 The idle page tracking API is located at ``/sys/kernel/mm/page_idle``.
-Currently, it consists of the only read-write file,
-``/sys/kernel/mm/page_idle/bitmap``.
+Currently, it consists of two read-write files,
+``/sys/kernel/mm/page_idle/bitmap`` and ``/sys/kernel/mm/page_idle/lock``.
 
-The file implements a bitmap where each bit corresponds to a memory page. The
-bitmap is represented by an array of 8-byte integers, and the page at PFN #i is
-mapped to bit #i%64 of array element #i/64, byte order is native. When a bit is
-set, the corresponding page is idle.
+The ``bitmap`` file implements a bitmap where each bit corresponds to a memory
+page. The bitmap is represented by an array of 8-byte integers, and the page at
+PFN #i is mapped to bit #i%64 of array element #i/64, byte order is native.
+When a bit is set, the corresponding page is idle.
 
 A page is considered idle if it has not been accessed since it was marked idle
 (for more details on what "accessed" actually means see the :ref:`Implementation
@@ -74,6 +74,16 @@ See :ref:`Documentation/admin-guide/mm/pagemap.rst <pagemap>` for more
 information about ``/proc/pid/pagemap``, ``/proc/kpageflags``, and
 ``/proc/kpagecgroup``.
 
+The ``lock`` file is for avoiding interference between concurrent users.  If
+the content of the ``lock`` file is ``1``, it means the ``bitmap`` file is
+currently being used by someone.  While the content of the ``lock`` file is
+``1``, writing ``1`` to the file fails.  Therefore, users should first
+successfully write ``1`` to the ``lock`` file before using the ``bitmap``
+file, and write ``0`` to the ``lock`` file after they finish using the
+``bitmap`` file.  If a user writes the ``bitmap`` file while the ``lock`` is
+``0``, the write fails.  Meanwhile, reads of the ``bitmap`` file succeed
+regardless of the ``lock`` status.
+
 .. _impl_details:
 
 Implementation Details
diff --git a/mm/page_idle.c b/mm/page_idle.c
index 144fb4ed961d..0aa45f848570 100644
--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -16,6 +16,8 @@
 #define BITMAP_CHUNK_SIZE	sizeof(u64)
 #define BITMAP_CHUNK_BITS	(BITMAP_CHUNK_SIZE * BITS_PER_BYTE)
 
+static DEFINE_MUTEX(page_idle_lock);
+
 /*
  * Idle page tracking only considers user memory pages, for other types of
  * pages the idle flag is always unset and an attempt to set it is silently
@@ -169,6 +171,9 @@ static ssize_t page_idle_bitmap_write(struct file *file, struct kobject *kobj,
 	unsigned long pfn, end_pfn;
 	int bit;
 
+	if (!mutex_is_locked(&page_idle_lock))
+		return -EPERM;
+
 	if (pos % BITMAP_CHUNK_SIZE || count % BITMAP_CHUNK_SIZE)
 		return -EINVAL;
 
@@ -197,17 +202,52 @@ static ssize_t page_idle_bitmap_write(struct file *file, struct kobject *kobj,
 	return (char *)in - buf;
 }
 
+static ssize_t page_idle_lock_show(struct kobject *kobj,
+		struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%d\n", mutex_is_locked(&page_idle_lock));
+}
+
+static ssize_t page_idle_lock_store(struct kobject *kobj,
+		struct kobj_attribute *attr, const char *buf, size_t count)
+{
+	bool do_lock;
+	int ret;
+
+	ret = kstrtobool(buf, &do_lock);
+	if (ret < 0)
+		return ret;
+
+	if (do_lock) {
+		if (!mutex_trylock(&page_idle_lock))
+			return -EBUSY;
+	} else {
+		mutex_unlock(&page_idle_lock);
+	}
+
+	return count;
+}
+
 static struct bin_attribute page_idle_bitmap_attr =
 		__BIN_ATTR(bitmap, 0600,
 			   page_idle_bitmap_read, page_idle_bitmap_write, 0);
 
+static struct kobj_attribute page_idle_lock_attr =
+		__ATTR(lock, 0600, page_idle_lock_show, page_idle_lock_store);
+
 static struct bin_attribute *page_idle_bin_attrs[] = {
 	&page_idle_bitmap_attr,
 	NULL,
 };
 
+static struct attribute *page_idle_lock_attrs[] = {
+	&page_idle_lock_attr.attr,
+	NULL,
+};
+
 static const struct attribute_group page_idle_attr_group = {
 	.bin_attrs = page_idle_bin_attrs,
+	.attrs = page_idle_lock_attrs,
 	.name = "page_idle",
 };
 
-- 
2.17.1


