linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: js1304@gmail.com
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Michal Hocko <mhocko@kernel.org>, Hugh Dickins <hughd@google.com>,
	Minchan Kim <minchan@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Mel Gorman <mgorman@techsingularity.net>,
	kernel-team@lge.com, Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: Re: [PATCH v3 6/9] mm/workingset: handle the page without memcg
Date: Wed, 18 Mar 2020 15:59:11 -0400	[thread overview]
Message-ID: <20200318195911.GF154135@cmpxchg.org> (raw)
In-Reply-To: <1584423717-3440-7-git-send-email-iamjoonsoo.kim@lge.com>

On Tue, Mar 17, 2020 at 02:41:54PM +0900, js1304@gmail.com wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> When implementing workingset detection for anonymous page, I found
> some swapcache pages with NULL memcg. From the code reading, I found
> two reasons.
> 
> One is the case that swap-in readahead happens. The other is the
> corner case related to the shmem cache. These two problems should be
> fixed, but, it's not straight-forward to fix. For example, when swap-off,
> all swapped-out pages are read into swapcache. In this case, who's the
> owner of the swapcache page?
> 
> Since this problem doesn't look trivial, I decide to leave the issue and
> handles this corner case on the place where the error occurs.
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

It wouldn't be hard to find out who owns this page. The code in
mem_cgroup_try_charge() is only a few lines:

			swp_entry_t ent = { .val = page_private(page), };
			unsigned short id = lookup_swap_cgroup_id(ent);

			rcu_read_lock();
			memcg = mem_cgroup_from_id(id);
			if (memcg && !css_tryget_online(&memcg->css))
				memcg = NULL;
			rcu_read_unlock();

THAT BEING SAID, I don't think we actually *want* to know the original
cgroup for readahead pages. Because before they are accessed and
charged to the original owner, the pages are sitting on the root
cgroup LRU list and follow the root group's aging speed and LRU order.

Eviction and refault tracking is about the LRU that hosts the pages.

So IMO your patch is much less of a hack than you might think.

> diff --git a/mm/workingset.c b/mm/workingset.c
> index a9f474a..8d2e83a 100644
> --- a/mm/workingset.c
> +++ b/mm/workingset.c
> @@ -257,6 +257,10 @@ void *workingset_eviction(struct page *page, struct mem_cgroup *target_memcg)
>  	VM_BUG_ON_PAGE(page_count(page), page);
>  	VM_BUG_ON_PAGE(!PageLocked(page), page);
>  
> +	/* page_memcg() can be NULL if swap-in readahead happens */
> +	if (!page_memcg(page))
> +		return NULL;
> +
>  	advance_inactive_age(page_memcg(page), pgdat, is_file);
>  
>  	lruvec = mem_cgroup_lruvec(target_memcg, pgdat);

This means a readahead page that hasn't been accessed will actively
not be tracked as an eviction and later as a refault.

I think that's the right thing to do, but I would expand the comment:

/*
 * A page can be without a cgroup here when it was brought in by swap
 * readahead and nobody has touched it since.
 *
 * The idea behind the workingset code is to tell on page fault time
 * whether pages have been previously used or not. Since this page
 * hasn't been used, don't store a shadow entry for it; when it later
 * faults back in, we treat it as the new page that it is.
 */

  reply	other threads:[~2020-03-18 19:59 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-17  5:41 [PATCH v3 0/9] workingset protection/detection on the anonymous LRU list js1304
2020-03-17  5:41 ` [PATCH v3 1/9] mm/vmscan: make active/inactive ratio as 1:1 for anon lru js1304
2020-03-18 17:45   ` Johannes Weiner
2020-03-17  5:41 ` [PATCH v3 2/9] mm/vmscan: protect the workingset on anonymous LRU js1304
2020-03-18 17:51   ` Johannes Weiner
2020-03-19  4:01     ` Joonsoo Kim
2020-03-17  5:41 ` [PATCH v3 3/9] mm/workingset: extend the workingset detection for anon LRU js1304
2020-03-18 18:06   ` Johannes Weiner
2020-03-19  4:13     ` Joonsoo Kim
2020-03-17  5:41 ` [PATCH v3 4/9] mm/swapcache: support to handle the value in swapcache js1304
2020-03-18 18:33   ` Johannes Weiner
2020-03-19  6:01     ` Joonsoo Kim
2020-03-17  5:41 ` [PATCH v3 5/9] mm/workingset: use the node counter if memcg is the root memcg js1304
2020-03-18 19:18   ` Johannes Weiner
2020-03-19  6:20     ` Joonsoo Kim
2020-03-17  5:41 ` [PATCH v3 6/9] mm/workingset: handle the page without memcg js1304
2020-03-18 19:59   ` Johannes Weiner [this message]
2020-03-19  8:31     ` Joonsoo Kim
2020-03-17  5:41 ` [PATCH v3 7/9] mm/swap: implement workingset detection for anonymous LRU js1304
2020-03-17  5:41 ` [PATCH v3 8/9] mm/vmscan: restore active/inactive ratio " js1304
2020-03-17  5:41 ` [PATCH v3 9/9] mm/swap: count a new anonymous page as a reclaim_state's rotate js1304

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200318195911.GF154135@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=js1304@gmail.com \
    --cc=kernel-team@lge.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@kernel.org \
    --cc=minchan@kernel.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).