linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Minchan Kim <minchan@kernel.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-api@vger.kernel.org, Johannes Weiner <hannes@cmpxchg.org>,
	Tim Murray <timmurray@google.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Daniel Colascione <dancol@google.com>,
	Shakeel Butt <shakeelb@google.com>,
	Sonny Rao <sonnyrao@google.com>,
	Brian Geffon <bgeffon@google.com>,
	jannh@google.com, oleg@redhat.com, christian@brauner.io,
	oleksandr@redhat.com, hdanton@sina.com, lizeb@google.com
Subject: Re: [PATCH v2 1/5] mm: introduce MADV_COLD
Date: Thu, 20 Jun 2019 09:06:51 +0900	[thread overview]
Message-ID: <20190620000650.GB52978@google.com> (raw)
In-Reply-To: <20190619125611.GO2968@dhcp22.suse.cz>

On Wed, Jun 19, 2019 at 02:56:12PM +0200, Michal Hocko wrote:
> On Mon 10-06-19 20:12:48, Minchan Kim wrote:
> > When a process expects no accesses to a certain memory range, it could
> > give a hint to kernel that the pages can be reclaimed when memory pressure
> > happens but data should be preserved for future use.  This could reduce
> > workingset eviction so it ends up increasing performance.
> > 
> > This patch introduces the new MADV_COLD hint to madvise(2) syscall.
> > MADV_COLD can be used by a process to mark a memory range as not expected
> > to be used in the near future. The hint can help kernel in deciding which
> > pages to evict early during memory pressure.
> > 
> > It works for every LRU pages like MADV_[DONTNEED|FREE]. IOW, It moves
> > 
> > 	active file page -> inactive file LRU
> > 	active anon page -> inacdtive anon LRU
> > 
> > Unlike MADV_FREE, it doesn't move active anonymous pages to inactive
> > file LRU's head because MADV_COLD is a little bit different symantic.
> > MADV_FREE means it's okay to discard when the memory pressure because
> > the content of the page is *garbage* so freeing such pages is almost zero
> > overhead since we don't need to swap out and access afterward causes just
> > minor fault. Thus, it would make sense to put those freeable pages in
> > inactive file LRU to compete other used-once pages. It makes sense for
> > implmentaion point of view, too because it's not swapbacked memory any
> > longer until it would be re-dirtied. Even, it could give a bonus to make
> > them be reclaimed on swapless system. However, MADV_COLD doesn't mean
> > garbage so reclaiming them requires swap-out/in in the end so it's bigger
> > cost. Since we have designed VM LRU aging based on cost-model, anonymous
> > cold pages would be better to position inactive anon's LRU list, not file
> > LRU. Furthermore, it would help to avoid unnecessary scanning if system
> > doesn't have a swap device. Let's start simpler way without adding
> > complexity at this moment.
> 
> I would only add that it is a caveat that workloads with a lot of page
> cache are likely to ignore MADV_COLD on anonymous memory because we
> rarely age anonymous LRU lists.

Okay, I will add some more.

> 
> [...]
> > +static int madvise_cold_pte_range(pmd_t *pmd, unsigned long addr,
> > +				unsigned long end, struct mm_walk *walk)
> > +{
> 
> This is duplicating a large part of madvise_free_pte_range with some
> subtle differences which are not explained anywhere (e.g. why does
> madvise_free_huge_pmd need try_lock on a page while not here? etc.).

madvise_free_huge_pmd handle dirty bit but this is not.

> 
> Why cannot we reuse a large part of that code and differ essentially on
> the reclaim target check and action? Have you considered to consolidate
> the code to share as much as possible? Maybe that is easier said than
> done because the devil is always in details...

Yub, it was not pretty when I tried. Please see last patch in this
patchset.

  reply	other threads:[~2019-06-20  0:07 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-10 11:12 [PATCH v2 0/5] Introduce MADV_COLD and MADV_PAGEOUT Minchan Kim
2019-06-10 11:12 ` [PATCH v2 1/5] mm: introduce MADV_COLD Minchan Kim
2019-06-19 12:56   ` Michal Hocko
2019-06-20  0:06     ` Minchan Kim [this message]
2019-06-20  7:08       ` Michal Hocko
2019-06-20  8:44         ` Minchan Kim
2019-06-10 11:12 ` [PATCH v2 2/5] mm: change PAGEREF_RECLAIM_CLEAN with PAGE_REFRECLAIM Minchan Kim
2019-06-19 13:09   ` Michal Hocko
2019-06-10 11:12 ` [PATCH v2 3/5] mm: account nr_isolated_xxx in [isolate|putback]_lru_page Minchan Kim
2019-06-10 11:12 ` [PATCH v2 4/5] mm: introduce MADV_PAGEOUT Minchan Kim
2019-06-19 13:24   ` Michal Hocko
2019-06-20  4:16     ` Minchan Kim
2019-06-20  7:04       ` Michal Hocko
2019-06-20  8:40         ` Minchan Kim
2019-06-20  9:22           ` Michal Hocko
2019-06-20 10:32             ` Minchan Kim
2019-06-20 10:55               ` Michal Hocko
2019-06-10 11:12 ` [PATCH v2 5/5] mm: factor out pmd young/dirty bit handling and THP split Minchan Kim
2019-06-10 18:03 ` [PATCH v2 0/5] Introduce MADV_COLD and MADV_PAGEOUT Dave Hansen
2019-06-13  4:51   ` Minchan Kim
2019-06-12 10:59 ` Pavel Machek
2019-06-12 11:19   ` Oleksandr Natalenko
2019-06-12 11:37     ` Pavel Machek
2019-06-19 12:27 ` Michal Hocko
2019-06-19 23:42   ` Minchan Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190620000650.GB52978@google.com \
    --to=minchan@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=bgeffon@google.com \
    --cc=christian@brauner.io \
    --cc=dancol@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=hdanton@sina.com \
    --cc=jannh@google.com \
    --cc=joel@joelfernandes.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lizeb@google.com \
    --cc=mhocko@kernel.org \
    --cc=oleg@redhat.com \
    --cc=oleksandr@redhat.com \
    --cc=shakeelb@google.com \
    --cc=sonnyrao@google.com \
    --cc=surenb@google.com \
    --cc=timmurray@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).