All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Rientjes <rientjes@google.com>
To: SeongJae Park <sjpark@amazon.com>,
	 Andrew Morton <akpm@linux-foundation.org>
Cc: Yang Shi <shy828301@gmail.com>, Michal Hocko <mhocko@kernel.org>,
	 Shakeel Butt <shakeelb@google.com>,
	Yang Shi <yang.shi@linux.alibaba.com>,
	 Roman Gushchin <guro@fb.com>, Greg Thelen <gthelen@google.com>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	 Vladimir Davydov <vdavydov.dev@gmail.com>,
	cgroups@vger.kernel.org,  linux-mm@kvack.org
Subject: [patch] mm, memcg: provide an anon_reclaimable stat
Date: Thu, 16 Jul 2020 13:58:19 -0700 (PDT)	[thread overview]
Message-ID: <alpine.DEB.2.23.453.2007161357490.3209847@chino.kir.corp.google.com> (raw)
In-Reply-To: <alpine.DEB.2.23.453.2007151031020.2788464@chino.kir.corp.google.com>

Userspace can lack insight into the amount of memory that can be reclaimed
from a memcg based on values from memory.stat.  Two specific examples:

 - Lazy freeable memory (MADV_FREE) that are clean anonymous pages on the
   inactive file LRU that can be quickly reclaimed under memory pressure
   but otherwise shows up as mapped anon in memory.stat, and

 - Memory on deferred split queues (thp) that are compound pages that can
   be split and uncharged from the memcg under memory pressure, but
   otherwise shows up as charged anon LRU memory in memory.stat.

Both of this anonymous usage is also charged to memory.current.

Userspace can currently derive this information but it depends on kernel
implementation details for how this memory is handled for the purposes of
reclaim (anon on inactive file LRU or unmapped anon on the LRU).

For the purposes of writing portable userspace code that does not need to
have insight into the kernel implementation for reclaimable memory, this
exports a stat that reveals the amount of anonymous memory that can be
reclaimed and uncharged from the memcg to start new applications.

As the kernel implementation evolves for memory that can be reclaimed
under memory pressure, this stat can be kept consistent.

Signed-off-by: David Rientjes <rientjes@google.com>
---
 Documentation/admin-guide/cgroup-v2.rst |  6 +++++
 mm/memcontrol.c                         | 31 +++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1296,6 +1296,12 @@ PAGE_SIZE multiple when read back.
 		Amount of memory used in anonymous mappings backed by
 		transparent hugepages
 
+	  anon_reclaimable
+		The amount of charged anonymous memory that can be reclaimed
+		under memory pressure without swap.  This currently includes
+		lazy freeable memory (MADV_FREE) and compound pages that can be
+		split and uncharged.
+
 	  inactive_anon, active_anon, inactive_file, active_file, unevictable
 		Amount of memory, swap-backed and filesystem-backed,
 		on the internal memory management lists used by the
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1350,6 +1350,32 @@ static bool mem_cgroup_wait_acct_move(struct mem_cgroup *memcg)
 	return false;
 }
 
+/*
+ * Returns the amount of anon memory that is charged to the memcg that is
+ * reclaimable under memory pressure without swap, in pages.
+ */
+static unsigned long memcg_anon_reclaimable(struct mem_cgroup *memcg)
+{
+	long deferred, lazyfree;
+
+	/*
+	 * Deferred pages are charged anonymous pages that are on the LRU but
+	 * are unmapped.  These compound pages are split under memory pressure.
+	 */
+	deferred = max_t(long, memcg_page_state(memcg, NR_ACTIVE_ANON) +
+			       memcg_page_state(memcg, NR_INACTIVE_ANON) -
+			       memcg_page_state(memcg, NR_ANON_MAPPED), 0);
+	/*
+	 * Lazyfree pages are charged clean anonymous pages that are on the file
+	 * LRU and can be reclaimed under memory pressure.
+	 */
+	lazyfree = max_t(long, memcg_page_state(memcg, NR_ACTIVE_FILE) +
+			       memcg_page_state(memcg, NR_INACTIVE_FILE) -
+			       memcg_page_state(memcg, NR_FILE_PAGES), 0);
+
+	return deferred + lazyfree;
+}
+
 static char *memory_stat_format(struct mem_cgroup *memcg)
 {
 	struct seq_buf s;
@@ -1363,6 +1389,9 @@ static char *memory_stat_format(struct mem_cgroup *memcg)
 	 * Provide statistics on the state of the memory subsystem as
 	 * well as cumulative event counters that show past behavior.
 	 *
+	 * All values in this buffer are read individually, so no implied
+	 * consistency amongst them.
+	 *
 	 * This list is ordered following a combination of these gradients:
 	 * 1) generic big picture -> specifics and details
 	 * 2) reflecting userspace activity -> reflecting kernel heuristics
@@ -1405,6 +1434,8 @@ static char *memory_stat_format(struct mem_cgroup *memcg)
 		       (u64)memcg_page_state(memcg, NR_ANON_THPS) *
 		       HPAGE_PMD_SIZE);
 #endif
+	seq_buf_printf(&s, "anon_reclaimable %llu\n",
+		       (u64)memcg_anon_reclaimable(memcg) * PAGE_SIZE);
 
 	for (i = 0; i < NR_LRU_LISTS; i++)
 		seq_buf_printf(&s, "%s %llu\n", lru_list_name(i),


WARNING: multiple messages have this Message-ID (diff)
From: David Rientjes <rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
To: SeongJae Park <sjpark-vV1OtcyAfmbQT0dZR+AlfA@public.gmane.org>,
	Andrew Morton
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Cc: Yang Shi <shy828301-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Shakeel Butt <shakeelb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Yang Shi
	<yang.shi-KPsoFbNs7GizrGE5bRqYAgC/G2K4zDHf@public.gmane.org>,
	Roman Gushchin <guro-b10kYP2dOMg@public.gmane.org>,
	Greg Thelen <gthelen-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
	Vladimir Davydov
	<vdavydov.dev-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org
Subject: [patch] mm, memcg: provide an anon_reclaimable stat
Date: Thu, 16 Jul 2020 13:58:19 -0700 (PDT)	[thread overview]
Message-ID: <alpine.DEB.2.23.453.2007161357490.3209847@chino.kir.corp.google.com> (raw)
In-Reply-To: <alpine.DEB.2.23.453.2007151031020.2788464-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org>

Userspace can lack insight into the amount of memory that can be reclaimed
from a memcg based on values from memory.stat.  Two specific examples:

 - Lazy freeable memory (MADV_FREE) that are clean anonymous pages on the
   inactive file LRU that can be quickly reclaimed under memory pressure
   but otherwise shows up as mapped anon in memory.stat, and

 - Memory on deferred split queues (thp) that are compound pages that can
   be split and uncharged from the memcg under memory pressure, but
   otherwise shows up as charged anon LRU memory in memory.stat.

Both of this anonymous usage is also charged to memory.current.

Userspace can currently derive this information but it depends on kernel
implementation details for how this memory is handled for the purposes of
reclaim (anon on inactive file LRU or unmapped anon on the LRU).

For the purposes of writing portable userspace code that does not need to
have insight into the kernel implementation for reclaimable memory, this
exports a stat that reveals the amount of anonymous memory that can be
reclaimed and uncharged from the memcg to start new applications.

As the kernel implementation evolves for memory that can be reclaimed
under memory pressure, this stat can be kept consistent.

Signed-off-by: David Rientjes <rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
---
 Documentation/admin-guide/cgroup-v2.rst |  6 +++++
 mm/memcontrol.c                         | 31 +++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1296,6 +1296,12 @@ PAGE_SIZE multiple when read back.
 		Amount of memory used in anonymous mappings backed by
 		transparent hugepages
 
+	  anon_reclaimable
+		The amount of charged anonymous memory that can be reclaimed
+		under memory pressure without swap.  This currently includes
+		lazy freeable memory (MADV_FREE) and compound pages that can be
+		split and uncharged.
+
 	  inactive_anon, active_anon, inactive_file, active_file, unevictable
 		Amount of memory, swap-backed and filesystem-backed,
 		on the internal memory management lists used by the
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1350,6 +1350,32 @@ static bool mem_cgroup_wait_acct_move(struct mem_cgroup *memcg)
 	return false;
 }
 
+/*
+ * Returns the amount of anon memory that is charged to the memcg that is
+ * reclaimable under memory pressure without swap, in pages.
+ */
+static unsigned long memcg_anon_reclaimable(struct mem_cgroup *memcg)
+{
+	long deferred, lazyfree;
+
+	/*
+	 * Deferred pages are charged anonymous pages that are on the LRU but
+	 * are unmapped.  These compound pages are split under memory pressure.
+	 */
+	deferred = max_t(long, memcg_page_state(memcg, NR_ACTIVE_ANON) +
+			       memcg_page_state(memcg, NR_INACTIVE_ANON) -
+			       memcg_page_state(memcg, NR_ANON_MAPPED), 0);
+	/*
+	 * Lazyfree pages are charged clean anonymous pages that are on the file
+	 * LRU and can be reclaimed under memory pressure.
+	 */
+	lazyfree = max_t(long, memcg_page_state(memcg, NR_ACTIVE_FILE) +
+			       memcg_page_state(memcg, NR_INACTIVE_FILE) -
+			       memcg_page_state(memcg, NR_FILE_PAGES), 0);
+
+	return deferred + lazyfree;
+}
+
 static char *memory_stat_format(struct mem_cgroup *memcg)
 {
 	struct seq_buf s;
@@ -1363,6 +1389,9 @@ static char *memory_stat_format(struct mem_cgroup *memcg)
 	 * Provide statistics on the state of the memory subsystem as
 	 * well as cumulative event counters that show past behavior.
 	 *
+	 * All values in this buffer are read individually, so no implied
+	 * consistency amongst them.
+	 *
 	 * This list is ordered following a combination of these gradients:
 	 * 1) generic big picture -> specifics and details
 	 * 2) reflecting userspace activity -> reflecting kernel heuristics
@@ -1405,6 +1434,8 @@ static char *memory_stat_format(struct mem_cgroup *memcg)
 		       (u64)memcg_page_state(memcg, NR_ANON_THPS) *
 		       HPAGE_PMD_SIZE);
 #endif
+	seq_buf_printf(&s, "anon_reclaimable %llu\n",
+		       (u64)memcg_anon_reclaimable(memcg) * PAGE_SIZE);
 
 	for (i = 0; i < NR_LRU_LISTS; i++)
 		seq_buf_printf(&s, "%s %llu\n", lru_list_name(i),

  reply	other threads:[~2020-07-16 20:58 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-15  3:18 [patch] mm, memcg: provide a stat to describe reclaimable memory David Rientjes
2020-07-15  3:18 ` David Rientjes
2020-07-15  7:00 ` David Rientjes
2020-07-15  7:00   ` David Rientjes
2020-07-15  7:15   ` SeongJae Park
2020-07-15  7:15     ` SeongJae Park
2020-07-15 17:33     ` David Rientjes
2020-07-15 17:33       ` David Rientjes
2020-07-16 20:58       ` David Rientjes [this message]
2020-07-16 20:58         ` [patch] mm, memcg: provide an anon_reclaimable stat David Rientjes
2020-07-16 21:07         ` Shakeel Butt
2020-07-16 21:07           ` Shakeel Butt
2020-07-16 21:28           ` David Rientjes
2020-07-16 21:28             ` David Rientjes
2020-07-17  1:37             ` Shakeel Butt
2020-07-17  1:37               ` Shakeel Butt
2020-07-17  8:34         ` Michal Hocko
2020-07-17  8:34           ` Michal Hocko
2020-07-17 14:39         ` Johannes Weiner
2020-07-17 14:39           ` Johannes Weiner
2020-07-15 13:10 ` [patch] mm, memcg: provide a stat to describe reclaimable memory Chris Down
2020-07-15 13:10   ` Chris Down
     [not found]   ` <20200715131048.GA176092-6Bi1550iOqEnzZ6mRAm98g@public.gmane.org>
2020-07-15 18:02     ` David Rientjes
2020-07-17 12:17       ` Chris Down
2020-07-17 12:17         ` Chris Down
2020-07-17 19:37         ` David Rientjes
2020-07-17 19:37           ` David Rientjes
2020-07-20  7:37           ` Michal Hocko
2020-07-20  7:37             ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.23.453.2007161357490.3209847@chino.kir.corp.google.com \
    --to=rientjes@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=gthelen@google.com \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=shakeelb@google.com \
    --cc=shy828301@gmail.com \
    --cc=sjpark@amazon.com \
    --cc=vdavydov.dev@gmail.com \
    --cc=yang.shi@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.