From: Johannes Weiner <hannes@cmpxchg.org>
To: linux-mm@kvack.org
Cc: Rik van Riel <riel@surriel.com>,
Minchan Kim <minchan.kim@gmail.com>,
Michal Hocko <mhocko@suse.com>,
Andrew Morton <akpm@linux-foundation.org>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 03/14] mm: allow swappiness that prefers reclaiming anon over the file workingset
Date: Wed, 20 May 2020 19:25:14 -0400 [thread overview]
Message-ID: <20200520232525.798933-4-hannes@cmpxchg.org> (raw)
In-Reply-To: <20200520232525.798933-1-hannes@cmpxchg.org>
With the advent of fast random IO devices (SSDs, PMEM) and in-memory
swap devices such as zswap, it's possible for swap to be much faster
than filesystems, and for swapping to be preferable over thrashing
filesystem caches.
Allow setting swappiness - which defines the rough relative IO cost of
cache misses between page cache and swap-backed pages - to reflect
such situations by making the swap-preferred range configurable.
v2: clarify how to calculate swappiness (Minchan Kim)
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
Documentation/admin-guide/sysctl/vm.rst | 23 ++++++++++++++++++-----
kernel/sysctl.c | 3 ++-
mm/vmscan.c | 2 +-
3 files changed, 21 insertions(+), 7 deletions(-)
diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index 0329a4d3fa9e..d46d5b7013c6 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -831,14 +831,27 @@ When page allocation performance is not a bottleneck and you want all
swappiness
==========
-This control is used to define how aggressive the kernel will swap
-memory pages. Higher values will increase aggressiveness, lower values
-decrease the amount of swap. A value of 0 instructs the kernel not to
-initiate swap until the amount of free and file-backed pages is less
-than the high water mark in a zone.
+This control is used to define the rough relative IO cost of swapping
+and filesystem paging, as a value between 0 and 200. At 100, the VM
+assumes equal IO cost and will thus apply memory pressure to the page
+cache and swap-backed pages equally; lower values signify more
+expensive swap IO, higher values indicates cheaper.
+
+Keep in mind that filesystem IO patterns under memory pressure tend to
+be more efficient than swap's random IO. An optimal value will require
+experimentation and will also be workload-dependent.
The default value is 60.
+For in-memory swap, like zram or zswap, as well as hybrid setups that
+have swap on faster devices than the filesystem, values beyond 100 can
+be considered. For example, if the random IO against the swap device
+is on average 2x faster than IO from the filesystem, swappiness should
+be 133 (x + 2x = 200, 2x = 133.33).
+
+At 0, the kernel will not initiate swap until the amount of free and
+file-backed pages is less than the high watermark in a zone.
+
unprivileged_userfaultfd
========================
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 8a176d8727a3..7f15d292e44c 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -131,6 +131,7 @@ static unsigned long zero_ul;
static unsigned long one_ul = 1;
static unsigned long long_max = LONG_MAX;
static int one_hundred = 100;
+static int two_hundred = 200;
static int one_thousand = 1000;
#ifdef CONFIG_PRINTK
static int ten_thousand = 10000;
@@ -1391,7 +1392,7 @@ static struct ctl_table vm_table[] = {
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
- .extra2 = &one_hundred,
+ .extra2 = &two_hundred,
},
#ifdef CONFIG_HUGETLB_PAGE
{
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 70b0e2c6a4b9..43f88b1a4f14 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -161,7 +161,7 @@ struct scan_control {
#endif
/*
- * From 0 .. 100. Higher means more swappy.
+ * From 0 .. 200. Higher means more swappy.
*/
int vm_swappiness = 60;
/*
--
2.26.2
next prev parent reply other threads:[~2020-05-20 23:26 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-20 23:25 [PATCH 00/14] mm: balance LRU lists based on relative thrashing v2 Johannes Weiner
2020-05-20 23:25 ` [PATCH 01/14] mm: fix LRU balancing effect of new transparent huge pages Johannes Weiner
2020-05-27 19:48 ` Shakeel Butt
2020-05-20 23:25 ` [PATCH 02/14] mm: keep separate anon and file statistics on page reclaim activity Johannes Weiner
2020-05-20 23:25 ` Johannes Weiner [this message]
2020-05-20 23:25 ` [PATCH 04/14] mm: fold and remove lru_cache_add_anon() and lru_cache_add_file() Johannes Weiner
2020-05-20 23:25 ` [PATCH 05/14] mm: workingset: let cache workingset challenge anon Johannes Weiner
2020-05-27 2:06 ` Joonsoo Kim
2020-05-27 13:43 ` Johannes Weiner
2020-05-28 7:16 ` Joonsoo Kim
2020-05-28 17:01 ` Johannes Weiner
2020-05-29 6:48 ` Joonsoo Kim
2020-05-29 15:12 ` Johannes Weiner
2020-06-01 6:14 ` Joonsoo Kim
2020-06-01 15:56 ` Johannes Weiner
2020-06-01 20:44 ` Johannes Weiner
2020-06-04 13:35 ` Vlastimil Babka
2020-06-04 15:05 ` Johannes Weiner
2020-06-12 3:19 ` Joonsoo Kim
2020-06-15 13:41 ` Johannes Weiner
2020-06-16 6:09 ` Joonsoo Kim
2020-06-02 2:34 ` Joonsoo Kim
2020-06-02 16:47 ` Johannes Weiner
2020-06-03 5:40 ` Joonsoo Kim
2020-05-20 23:25 ` [PATCH 06/14] mm: remove use-once cache bias from LRU balancing Johannes Weiner
2020-05-20 23:25 ` [PATCH 07/14] mm: vmscan: drop unnecessary div0 avoidance rounding in get_scan_count() Johannes Weiner
2020-05-20 23:25 ` [PATCH 08/14] mm: base LRU balancing on an explicit cost model Johannes Weiner
2020-05-20 23:25 ` [PATCH 09/14] mm: deactivations shouldn't bias the LRU balance Johannes Weiner
2020-05-22 13:33 ` Qian Cai
2020-05-26 15:55 ` Johannes Weiner
2020-05-27 0:54 ` Qian Cai
2020-05-20 23:25 ` [PATCH 10/14] mm: only count actual rotations as LRU reclaim cost Johannes Weiner
2020-05-20 23:25 ` [PATCH 11/14] mm: balance LRU lists based on relative thrashing Johannes Weiner
2020-05-20 23:25 ` [PATCH 12/14] mm: vmscan: determine anon/file pressure balance at the reclaim root Johannes Weiner
2020-05-20 23:25 ` [PATCH 13/14] mm: vmscan: reclaim writepage is IO cost Johannes Weiner
2020-05-20 23:25 ` [PATCH 14/14] mm: vmscan: limit the range of LRU type balancing Johannes Weiner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200520232525.798933-4-hannes@cmpxchg.org \
--to=hannes@cmpxchg.org \
--cc=akpm@linux-foundation.org \
--cc=iamjoonsoo.kim@lge.com \
--cc=kernel-team@fb.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=minchan.kim@gmail.com \
--cc=riel@surriel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).