linux-kernel.vger.kernel.org archive mirror
From: Johannes Weiner <hannes@cmpxchg.org>
To: Minchan Kim <minchan@kernel.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>, Mel Gorman <mgorman@suse.de>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Andi Kleen <andi@firstfloor.org>, Michal Hocko <mhocko@suse.cz>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	kernel-team@fb.com
Subject: Re: [PATCH 10/10] mm: balance LRU lists based on relative thrashing
Date: Wed, 22 Jun 2016 17:56:52 -0400
Message-ID: <20160622215652.GB24150@cmpxchg.org>
In-Reply-To: <20160620074208.GA28207@bbox>

On Mon, Jun 20, 2016 at 04:42:08PM +0900, Minchan Kim wrote:
> On Fri, Jun 17, 2016 at 01:01:29PM -0400, Johannes Weiner wrote:
> > On Fri, Jun 17, 2016 at 04:49:45PM +0900, Minchan Kim wrote:
> > > On Thu, Jun 16, 2016 at 11:12:07AM -0400, Johannes Weiner wrote:
> > > > On Wed, Jun 15, 2016 at 11:23:41AM +0900, Minchan Kim wrote:
> > > > > Do we want to retain [1]?
> > > > > 
> > > > > This patch is motivated by swap IO potentially being much faster than
> > > > > file IO, so wouldn't it be natural to rely on refault feedback rather
> > > > > than force eviction of file cache?
> > > > > 
> > > > > [1] e9868505987a, mm,vmscan: only evict file pages when we have plenty?
> > > > 
> > > > Yes! We don't want to go after the workingset, whether it be cache or
> > > > anonymous, while there is single-use page cache lying around that we
> > > > can reclaim for free, with no IO and little risk of future IO. Anon
> > > > memory doesn't have this equivalent. Only cache is lazy-reclaimed.
> > > > 
> > > > Once the cache refaults, we activate it to reflect the fact that it's
> > > > workingset. Only when we run out of single-use cache do we want to
> > > > reclaim multi-use pages, and *then* we balance workingsets based on
> > > > cost of refetching each side from secondary storage.
> > > 
> > > If pages in inactive file LRU are really single-use page cache, I agree.
> > > 
> > > However, how can the logic work like that?
> > > If reclaimed file pages were part of the workingset (i.e., refaults happen),
> > > we put pressure on the anonymous LRU, but get_scan_count still forces
> > > reclaim from the file LRU until the inactive file LRU list is small enough.
> > > 
> > > With that, too much of the file workingset could be evicted even though
> > > anon swap is cheaper on fast swap storage?
> > > 
> > > IOW, the refault mechanism works once the inactive file LRU list is small
> > > enough, but a small inactive file LRU doesn't guarantee it has only
> > > multiple-use pages. Hm, isn't that a problem?
> > 
> > It's a trade-off between the cost of detecting a new workingset from a
> > stream of use-once pages, and the cost that use-once pages impose on the
> > established workingset.
> > 
> > That's a pretty easy choice, if you ask me. I'd rather ask cache pages
> > to prove they are multi-use than have use-once pages put pressure on
> > the workingset.
> 
> Makes sense.
> 
> > 
> > Sure, a spike like you describe is certainly possible, where a good
> > portion of the inactive file pages will be re-used in the near future,
> > yet we evict all of them in a burst of memory pressure when we should
> > have swapped. That's a worst case scenario for the use-once policy in
> > a workingset transition.
> 
> So, the point is how frequently such a case happens. A scenario I can
> think of is that if we use one-cgroup-per-app, many file pages would be
> on the inactive LRU while the active LRU is almost empty until reclaim
> kicks in. That is because reclaim running in parallel with an app launch
> normally makes the app's startup really slow, which is why mobile
> platforms use notifiers to free memory in advance via killing/reclaiming.
> Anyway, once we have enough free memory and launch a new app in a new
> cgroup, pages stay on the LRU list they were born on (i.e., anon: active,
> file: inactive) without aging.
> 
> Then, the activity manager can set memory.high on a less important
> app-cgroup to reclaim it with a high swappiness value, because the swap
> device is much faster on that system and there are many more anonymous
> pages than file-backed pages. Surely, the activity manager will expect
> lots of anonymous pages to be swapped out, but contrary to expectation,
> it will easily see such a spike, with lots of file-backed pages being
> reclaimed and refaulted until the inactive file LRU is small enough.
> 
> I think it's a plausible enough scenario on a small system using
> one-cgroup-per-app.

That's the workingset transition I was talking about. The algorithm is
designed to settle towards stable memory patterns. We can't possibly
remove one of the key components of this - the use-once policy - to
speed up a few seconds of workingset transition when it comes at the
risk of potentially thrashing the workingset for *hours*.

The fact that swap IO can be faster than filesystem IO doesn't change
this at all. The point is that the reclaim and refetch IO cost of
use-once cache is ZERO. Causing swap IO to make room for more and more
unused cache pages doesn't make any sense, no matter the swap speed.
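
To make the ordering concrete, here is a toy sketch of the policy argued
for above. It is not the kernel's get_scan_count(): the function name, the
"inactive file larger than active file" test for "plenty of use-once
cache", and the cost inputs are all assumptions made for illustration.
While use-once cache is plentiful, all pressure goes to the file side;
only afterwards is pressure split between anon and file, each side being
scanned in inverse proportion to its relative refetch cost.

```python
def split_reclaim(inactive_file, active_file, to_reclaim, anon_cost, file_cost):
    """Toy model of the use-once policy; NOT the kernel implementation.

    All parameters are hypothetical page counts / relative costs.
    Returns how many pages to reclaim from each side.
    """
    # Assumption: "plenty of single-use cache" is modelled as the inactive
    # file list being larger than the active file list.
    if inactive_file > active_file:
        # Use-once cache costs no IO to drop, so take everything from it.
        return {"anon": 0, "file": to_reclaim}

    total_cost = anon_cost + file_cost
    if total_cost == 0:
        # No cost information yet: split evenly.
        anon = to_reclaim // 2
    else:
        # The more expensive the file side is to refetch, the larger the
        # share of pressure that goes to anon, and vice versa.
        anon = to_reclaim * file_cost // total_cost
    return {"anon": anon, "file": to_reclaim - anon}
```

In this model the swap speed only enters through anon_cost, and it is
never consulted while the first branch applies, which is the point made
above: a fast swap device does not justify swapping to make room for
cache that nobody is reusing.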

I really don't see the relevance of this discussion to this patch set.

