From: Johannes Weiner <hannes@cmpxchg.org>
To: David Rientjes <rientjes@google.com>
Cc: SeongJae Park <sjpark@amazon.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Yang Shi <shy828301@gmail.com>, Michal Hocko <mhocko@kernel.org>,
	Shakeel Butt <shakeelb@google.com>,
	Yang Shi <yang.shi@linux.alibaba.com>,
	Roman Gushchin <guro@fb.com>, Greg Thelen <gthelen@google.com>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch] mm, memcg: provide an anon_reclaimable stat
Date: Fri, 17 Jul 2020 10:39:02 -0400
Message-ID: <20200717143902.GA266388@cmpxchg.org>
In-Reply-To: <alpine.DEB.2.23.453.2007161357490.3209847@chino.kir.corp.google.com>

On Thu, Jul 16, 2020 at 01:58:19PM -0700, David Rientjes wrote:
> @@ -1350,6 +1350,32 @@ static bool mem_cgroup_wait_acct_move(struct mem_cgroup *memcg)
>  	return false;
>  }
>  
> +/*
> + * Returns the amount of anon memory that is charged to the memcg that is
> + * reclaimable under memory pressure without swap, in pages.
> + */
> +static unsigned long memcg_anon_reclaimable(struct mem_cgroup *memcg)
> +{
> +	long deferred, lazyfree;
> +
> +	/*
> +	 * Deferred pages are charged anonymous pages that are on the LRU but
> +	 * are unmapped.  These compound pages are split under memory pressure.
> +	 */
> +	deferred = max_t(long, memcg_page_state(memcg, NR_ACTIVE_ANON) +
> +			       memcg_page_state(memcg, NR_INACTIVE_ANON) -
> +			       memcg_page_state(memcg, NR_ANON_MAPPED), 0);
> +	/*
> +	 * Lazyfree pages are charged clean anonymous pages that are on the file
> +	 * LRU and can be reclaimed under memory pressure.
> +	 */
> +	lazyfree = max_t(long, memcg_page_state(memcg, NR_ACTIVE_FILE) +
> +			       memcg_page_state(memcg, NR_INACTIVE_FILE) -
> +			       memcg_page_state(memcg, NR_FILE_PAGES), 0);

Unfortunately, we don't know whether these pages have been reused after
the madvise until we actually do the rmap walk in page reclaim. All of
them could have dirty PTEs and require swapout after all.

The MADV_FREE tradeoff was that the freed pages can get reused by
userspace without another context switch and TLB flush in the common
case, by exploiting the fact that the MMU sets the dirty bit for
us. The downside is that the kernel doesn't know what state these
pages are in until it takes a close-up look at them one by one.
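
To make the reuse case concrete, here is a minimal userspace sketch of
the pattern described above. It is only an illustration under stated
assumptions: lazyfree_reuse_demo() is a hypothetical helper and not part
of the patch, and the fallback MADV_FREE value of 8 is the Linux UAPI
constant, defined here only for libc headers that lack it. The helper
dirties a few anonymous pages, marks them MADV_FREE, and then stores to
the first page again with a plain memory write. No syscall tells the
kernel about that store, so a counter-based stat cannot tell the reused
page from the still-discardable ones.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_FREE
#define MADV_FREE 8	/* Linux UAPI value, for libc headers that lack it */
#endif

/*
 * Hypothetical demo helper, not part of the patch under discussion.
 *
 * Returns 0 if MADV_FREE was accepted, 1 if the kernel rejected it
 * (the rest of the demo still runs), and -1 if setup failed.
 */
int lazyfree_reuse_demo(size_t npages, unsigned char *first,
			unsigned char *other)
{
	long psz = sysconf(_SC_PAGESIZE);
	unsigned char *buf;
	size_t len;
	int rc = 0;

	assert(npages >= 2);
	if (psz <= 0)
		return -1;
	len = npages * (size_t)psz;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	/* Fault in and dirty every page. */
	memset(buf, 0xAA, len);

	/* Tell the kernel the contents are disposable under pressure. */
	if (madvise(buf, len, MADV_FREE) != 0)
		rc = 1;

	/*
	 * Reuse only the first page. This is a plain store: the MMU sets
	 * the PTE dirty bit, and the kernel only finds out when reclaim
	 * inspects the page.
	 */
	memset(buf, 0xBB, (size_t)psz);

	/* The reused page must hold the new data. */
	*first = ((volatile unsigned char *)buf)[0];
	/* An untouched page holds the old data, or zeroes if already freed. */
	*other = ((volatile unsigned char *)buf)[psz];

	munmap(buf, len);
	return rc;
}
```

Calling lazyfree_reuse_demo(4, &first, &other) should leave 0xBB in
first, while other is either 0xAA or 0 depending on whether the kernel
has already discarded that page. From the kernel's side, all four pages
look the same in the LRU counters until reclaim walks them.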

