From: Minchan Kim <minchan@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
linux-api@vger.kernel.org, oleksandr@redhat.com,
Suren Baghdasaryan <surenb@google.com>,
Tim Murray <timmurray@google.com>,
Daniel Colascione <dancol@google.com>,
Sandeep Patil <sspatil@google.com>,
Sonny Rao <sonnyrao@google.com>,
Brian Geffon <bgeffon@google.com>, Michal Hocko <mhocko@suse.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Shakeel Butt <shakeelb@google.com>,
John Dias <joaodias@google.com>,
Joel Fernandes <joel@joelfernandes.org>,
sj38.park@gmail.com, alexander.h.duyck@linux.intel.com,
Jann Horn <jannh@google.com>, Minchan Kim <minchan@kernel.org>,
SeongJae Park <sjpark@amazon.de>
Subject: [PATCH v6 7/7] mm/madvise: allow KSM hints for remote API
Date: Tue, 18 Feb 2020 17:44:33 -0800
Message-ID: <20200219014433.88424-8-minchan@kernel.org>
In-Reply-To: <20200219014433.88424-1-minchan@kernel.org>
From: Oleksandr Natalenko <oleksandr@redhat.com>
It all began with the fact that KSM works only on memory that is marked
by madvise(). The only way to get around that is to either:
* use LD_PRELOAD; or
* patch the kernel with something like UKSM or PKSM.
(I intentionally skip the ptrace can of worms here.)
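For reference, the self-marking that KSM relies on today can be sketched as
below. This is only an illustration: map_mergeable() is a hypothetical helper
name, and the sketch assumes a glibc system where <sys/mman.h> exposes
MADV_MERGEABLE.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map `len` bytes of anonymous memory and ask KSM to consider the range
 * for merging. Returns the mapping on success, or NULL with errno set.
 * map_mergeable() is a hypothetical name used only for this sketch.
 */
static void *map_mergeable(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	/* madvise() fails with EINVAL if the kernel lacks CONFIG_KSM. */
	if (madvise(p, len, MADV_MERGEABLE) != 0) {
		int saved = errno;

		munmap(p, len);
		errno = saved;
		return NULL;
	}
	return p;
}
```

The call has to come from inside the process that owns the memory, which is
exactly the restriction described above. Marking a range only makes it a
candidate: ksmd must also be running (/sys/kernel/mm/ksm/run set to 1) before
any pages are merged.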
To overcome this restriction, let's employ the new remote madvise API. It
can be used by a small userspace helper daemon that does the auto-KSM
job for us.
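A rough sketch of what the core of such a helper might look like follows.
Several things here are assumptions and not part of this patch:
remote_mergeable() is a hypothetical name, the argument layout (pidfd plus an
iovec array) is the process_madvise() form that was later merged in Linux 5.10
and may differ from the interface in this v6 series, and the fallback syscall
numbers are the x86-64/arm64 values. A kernel without this patch is expected
to reject MADV_MERGEABLE here with EINVAL.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Fallback syscall numbers (assumption: x86-64 or arm64). */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_process_madvise
#define SYS_process_madvise 440
#endif

/*
 * Ask the kernel to treat [addr, addr + len) in process `pid` as
 * mergeable. Returns the number of bytes advised, or -1 with errno set.
 * remote_mergeable() is a hypothetical helper name for this sketch.
 */
static ssize_t remote_mergeable(pid_t pid, void *addr, size_t len)
{
	struct iovec vec = { .iov_base = addr, .iov_len = len };
	ssize_t ret;
	int saved;
	int pidfd;

	pidfd = (int)syscall(SYS_pidfd_open, pid, 0U);
	if (pidfd < 0)
		return -1;

	ret = syscall(SYS_process_madvise, pidfd, &vec, 1UL,
		      MADV_MERGEABLE, 0U);

	/* Preserve the syscall's errno across close(). */
	saved = errno;
	close(pidfd);
	errno = saved;
	return ret;
}
```

A real daemon would presumably walk /proc/<pid>/maps to pick anonymous ranges
and call this once per range; that policy is left out of the sketch.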
I see two major consumers of remote KSM hints:
* hosts that run containers, especially similar ones in a trusted
environment that share the same runtime, such as Node.js;
* heavy applications that can be run in multiple instances: not only
open-source ones like Firefox, but also those that cannot be modified
because they are binary-only and, maybe, statically linked.
Speaking of statistics, more numbers can be found in the very first
submission related to this one [1]. On my current setup with two
Firefox instances, I get 100 to 200 MiB saved for the second instance,
depending on the number of tabs.
1 FF instance with 15 tabs:
$ echo "$(cat /sys/kernel/mm/ksm/pages_sharing) * 4 / 1024" | bc
410
2 FF instances, second one has 12 tabs (all the tabs are different):
$ echo "$(cat /sys/kernel/mm/ksm/pages_sharing) * 4 / 1024" | bc
592
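The shell one-liners above multiply pages_sharing by 4 and divide by 1024,
which assumes 4 KiB pages and truncates to whole MiB. The same conversion as a
small C sketch (ksm_pages_to_mib() is a hypothetical name, not a kernel or
libc function):

```c
#include <stdint.h>

/*
 * Convert a pages_sharing count into whole MiB, truncating the result
 * the same way integer division in bc does. `page_kib` is the page size
 * in KiB (4 on the setup measured above, by assumption).
 */
static uint64_t ksm_pages_to_mib(uint64_t pages, uint64_t page_kib)
{
	return pages * page_kib / 1024;
}
```

With the two readings above, the second instance accounts for 592 - 410 = 182
MiB of additional sharing, which sits inside the quoted 100 to 200 MiB range.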
At the moment I do not have specific numbers for containerised
workloads, but those should be comparable when the containers share a
similar or identical runtime.
[1] https://lore.kernel.org/patchwork/patch/1012142/
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
mm/madvise.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/mm/madvise.c b/mm/madvise.c
index c55a18fe71f9..b97c7e1a5cab 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1005,6 +1005,10 @@ process_madvise_behavior_valid(int behavior)
 	switch (behavior) {
 	case MADV_COLD:
 	case MADV_PAGEOUT:
+#ifdef CONFIG_KSM
+	case MADV_MERGEABLE:
+	case MADV_UNMERGEABLE:
+#endif
 		return true;
 	default:
 		return false;
--
2.25.0.265.gbab2e86ba0-goog
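The effect of the hunk above can be modelled in userspace as follows. This is
a sketch, not the kernel code: remote_hint_allowed() is a hypothetical name,
the ksm_enabled flag stands in for the compile-time CONFIG_KSM check, and the
fallback MADV_* values are the generic Linux ones, used only if the libc
headers do not provide them.

```c
#define _GNU_SOURCE
#include <stdbool.h>
#include <sys/mman.h>

/* Fallbacks for older headers (generic Linux values, by assumption). */
#ifndef MADV_MERGEABLE
#define MADV_MERGEABLE 12
#endif
#ifndef MADV_UNMERGEABLE
#define MADV_UNMERGEABLE 13
#endif
#ifndef MADV_COLD
#define MADV_COLD 20
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21
#endif

/*
 * Userspace model of the remote-hint whitelist after this patch.
 * `ksm_enabled` plays the role of CONFIG_KSM being set at build time.
 */
static bool remote_hint_allowed(int behavior, bool ksm_enabled)
{
	switch (behavior) {
	case MADV_COLD:
	case MADV_PAGEOUT:
		return true;
	case MADV_MERGEABLE:
	case MADV_UNMERGEABLE:
		return ksm_enabled;
	default:
		return false;
	}
}
```

Every other advice value, MADV_DONTNEED for example, stays rejected for remote
callers regardless of the KSM setting.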