linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Barry Song <21cnbao@gmail.com>
To: Yu Zhao <yuzhao@google.com>
Cc: "Johannes Weiner" <hannes@cmpxchg.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Mel Gorman" <mgorman@suse.de>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Andi Kleen" <ak@linux.intel.com>,
	"Aneesh Kumar" <aneesh.kumar@linux.ibm.com>,
	"Catalin Marinas" <catalin.marinas@arm.com>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"Hillf Danton" <hdanton@sina.com>, "Jens Axboe" <axboe@kernel.dk>,
	"Jesse Barnes" <jsbarnes@google.com>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Michael Larabel" <Michael@michaellarabel.com>,
	"Mike Rapoport" <rppt@kernel.org>,
	"Rik van Riel" <riel@surriel.com>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Will Deacon" <will@kernel.org>,
	"Ying Huang" <ying.huang@intel.com>,
	LAK <linux-arm-kernel@lists.infradead.org>,
	"Linux Doc Mailing List" <linux-doc@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	"Kernel Page Reclaim v2" <page-reclaim@google.com>,
	x86 <x86@kernel.org>, "Brian Geffon" <bgeffon@google.com>,
	"Jan Alexander Steffens" <heftig@archlinux.org>,
	"Oleksandr Natalenko" <oleksandr@natalenko.name>,
	"Steven Barrett" <steven@liquorix.net>,
	"Suleiman Souhlal" <suleiman@google.com>,
	"Daniel Byrne" <djbyrne@mtu.edu>,
	"Donald Carr" <d@chaos-reins.com>,
	"Holger Hoffstätte" <holger@applied-asynchrony.com>,
	"Konstantin Kharlamov" <Hi-Angel@yandex.ru>,
	"Shuang Zhai" <szhai2@cs.rochester.edu>,
	"Sofia Trinh" <sofia.trinh@edi.works>
Subject: Re: [PATCH v7 04/12] mm: multigenerational LRU: groundwork
Date: Sat, 12 Mar 2022 23:37:35 +1300	[thread overview]
Message-ID: <CAGsJ_4zT7gtGSEoAay=VE6x_YZkNKtLymRL94pBnVgKekPzxaQ@mail.gmail.com> (raw)
In-Reply-To: <CAOUHufaUJD8nC6PDVfmkeTwB4BtzBzigxh+V-hfR-_26VwjOPA@mail.gmail.com>

On Sat, Mar 12, 2022 at 12:45 PM Yu Zhao <yuzhao@google.com> wrote:
>
> On Fri, Mar 11, 2022 at 3:16 AM Barry Song <21cnbao@gmail.com> wrote:
> >
> > On Tue, Feb 15, 2022 at 10:43 PM Yu Zhao <yuzhao@google.com> wrote:
> > >
> > > On Thu, Feb 10, 2022 at 03:41:57PM -0500, Johannes Weiner wrote:
> > >
> > > Thanks for reviewing.
> > >
> > > > > +static inline bool lru_gen_is_active(struct lruvec *lruvec, int gen)
> > > > > +{
> > > > > +   unsigned long max_seq = lruvec->lrugen.max_seq;
> > > > > +
> > > > > +   VM_BUG_ON(gen >= MAX_NR_GENS);
> > > > > +
> > > > > +   /* see the comment on MIN_NR_GENS */
> > > > > +   return gen == lru_gen_from_seq(max_seq) || gen == lru_gen_from_seq(max_seq - 1);
> > > > > +}
> > > >
> > > > I'm still reading the series, so correct me if I'm wrong: the "active"
> > > > set is split into two generations for the sole purpose of the
> > > > second-chance policy for fresh faults, right?
> > >
> > > To be precise, the active/inactive notion on top of generations is
> > > just for ABI compatibility, e.g., the counters in /proc/vmstat.
> > > Otherwise, this function wouldn't be needed.
> >
> > Hi Yu,
> > I am still quite confused as i am seeing both active/inactive and lru_gen.
> > eg:
> >
> > root@ubuntu:~# cat /proc/vmstat | grep active
> > nr_zone_inactive_anon 22797
> > nr_zone_active_anon 578405
> > nr_zone_inactive_file 0
> > nr_zone_active_file 4156
> > nr_inactive_anon 22800
> > nr_active_anon 578574
> > nr_inactive_file 0
> > nr_active_file 4215
>
> Yes, this is expected. We have to maintain the ABI, i.e., the
> *_active/inactive_* counters.
>
> > and:
> >
> > root@ubuntu:~# cat /sys//kernel/debug/lru_gen
> >
> > ...
> > memcg    36 /user.slice/user-0.slice/user@0.service
> >  node     0
> >          20      18820         22           0
> >          21       7452          0           0
> >          22       7448          0           0
> > memcg    33 /user.slice/user-0.slice/user@0.service/app.slice
> >  node     0
> >           0    2171452          0           0
> >           1    2171452          0           0
> >           2    2171452          0           0
> >           3    2171452          0           0
> > memcg    37 /user.slice/user-0.slice/session-1.scope
> >  node     0
> >          42      51804     102127           0
> >          43      18840     275622           0
> >          44      16104     216805           1
> >
> > Does it mean one page could be in both one of the generations and one
> > of the active/inactive lists?
>
> In terms of the data structure, evictable pages are either on
> lruvec->lists or lrugen->lists.
>
> > Do we have some mapping relationship between active/inactive lists
> > with generations?
>
> For the counters, yes -- pages in max_seq and max_seq-1 are counted as
> active, and the rest are inactive.
>
> > We used to put a faulted file page in inactive, if we access it a
> > second time, it can be promoted
> > to active. then in recent years, we have also applied this to anon
> > pages while kernel adds
> > workingset protection for anon pages. so basically both anon and file
> > pages go into the inactive
> > list for the 1st time, if we access it for the second time, they go to
> > the active list. if we don't access
> > it any more, they are likely to be reclaimed as they are inactive.
> > we do have some special fastpath for code section, executable file
> > pages are kept on active list
> > as long as they are accessed.
>
> Yes.
>
> > so all of the above concerns are actually not that correct?
>
> They are valid concerns but I don't know any popular workloads that
> care about them.

Hi Yu,
here we can get a workload in Kim's patchset while he added workingset
protection
for anon pages:
https://patchwork.kernel.org/project/linux-mm/cover/1581401993-20041-1-git-send-email-iamjoonsoo.kim@lge.com/
anon pages used to go to active rather than inactive, but kim's patchset
moved to use inactive first. then only after the anon page is accessed
second time, it can move to active.

"In current implementation, newly created or swap-in anonymous page is

started on the active list. Growing the active list results in rebalancing
active/inactive list so old pages on the active list are demoted to the
inactive list. Hence, hot page on the active list isn't protected at all.

Following is an example of this situation.

Assume that 50 hot pages on active list and system can contain total
100 pages. Numbers denote the number of pages on active/inactive
list (active | inactive). (h) stands for hot pages and (uo) stands for
used-once pages.

1. 50 hot pages on active list
50(h) | 0

2. workload: 50 newly created (used-once) pages
50(uo) | 50(h)

3. workload: another 50 newly created (used-once) pages
50(uo) | 50(uo), swap-out 50(h)

As we can see, hot pages are swapped-out and it would cause swap-in later."

Is MGLRU able to avoid the swap-out of the 50 hot pages? since MGLRU
is putting faulted pages to the youngest generation directly, do we have the
risk mentioned in Kim's patchset?

Thanks
Barry


  reply	other threads:[~2022-03-12 10:37 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-08  8:18 [PATCH v7 00/12] Multigenerational LRU Framework Yu Zhao
2022-02-08  8:18 ` [PATCH v7 01/12] mm: x86, arm64: add arch_has_hw_pte_young() Yu Zhao
2022-02-08  8:24   ` Yu Zhao
2022-02-08 10:33   ` Will Deacon
2022-02-08  8:18 ` [PATCH v7 02/12] mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG Yu Zhao
2022-02-08  8:27   ` Yu Zhao
2022-02-08  8:18 ` [PATCH v7 03/12] mm/vmscan.c: refactor shrink_node() Yu Zhao
2022-02-08  8:18 ` [PATCH v7 04/12] mm: multigenerational LRU: groundwork Yu Zhao
2022-02-08  8:28   ` Yu Zhao
2022-02-10 20:41   ` Johannes Weiner
2022-02-15  9:43     ` Yu Zhao
2022-02-15 21:53       ` Johannes Weiner
2022-02-21  8:14         ` Yu Zhao
2022-02-23 21:18           ` Yu Zhao
2022-02-25 16:34             ` Minchan Kim
2022-03-03 15:29           ` Johannes Weiner
2022-03-03 19:26             ` Yu Zhao
2022-03-03 21:43               ` Johannes Weiner
2022-03-11 10:16       ` Barry Song
2022-03-11 23:45         ` Yu Zhao
2022-03-12 10:37           ` Barry Song [this message]
2022-03-12 21:11             ` Yu Zhao
2022-03-13  4:57               ` Barry Song
2022-03-14 11:11                 ` Barry Song
2022-03-14 16:45                   ` Yu Zhao
2022-03-14 23:38                     ` Barry Song
     [not found]                       ` <CAOUHufa9eY44QadfGTzsxa2=hEvqwahXd7Canck5Gt-N6c4UKA@mail.gmail.com>
     [not found]                         ` <CAGsJ_4zvj5rmz7DkW-kJx+jmUT9G8muLJ9De--NZma9ey0Oavw@mail.gmail.com>
2022-03-15 10:29                           ` Barry Song
2022-03-16  2:46                             ` Yu Zhao
2022-03-16  4:37                               ` Barry Song
2022-03-16  5:44                                 ` Yu Zhao
2022-03-16  6:06                                   ` Barry Song
2022-03-16 21:37                                     ` Yu Zhao
2022-02-10 21:37   ` Matthew Wilcox
2022-02-13 21:16     ` Yu Zhao
2022-02-08  8:18 ` [PATCH v7 05/12] mm: multigenerational LRU: minimal implementation Yu Zhao
2022-02-08  8:33   ` Yu Zhao
2022-02-08 16:50   ` Johannes Weiner
2022-02-10  2:53     ` Yu Zhao
2022-02-13 10:04   ` Hillf Danton
2022-02-17  0:13     ` Yu Zhao
2022-02-23  8:27   ` Huang, Ying
2022-02-23  9:36     ` Yu Zhao
2022-02-24  0:59       ` Huang, Ying
2022-02-24  1:34         ` Yu Zhao
2022-02-24  3:31           ` Huang, Ying
2022-02-24  4:09             ` Yu Zhao
2022-02-24  5:27               ` Huang, Ying
2022-02-24  5:35                 ` Yu Zhao
2022-02-08  8:18 ` [PATCH v7 06/12] mm: multigenerational LRU: exploit locality in rmap Yu Zhao
2022-02-08  8:40   ` Yu Zhao
2022-02-08  8:18 ` [PATCH v7 07/12] mm: multigenerational LRU: support page table walks Yu Zhao
2022-02-08  8:39   ` Yu Zhao
2022-02-08  8:18 ` [PATCH v7 08/12] mm: multigenerational LRU: optimize multiple memcgs Yu Zhao
2022-02-08  8:18 ` [PATCH v7 09/12] mm: multigenerational LRU: runtime switch Yu Zhao
2022-02-08  8:42   ` Yu Zhao
2022-02-08  8:19 ` [PATCH v7 10/12] mm: multigenerational LRU: thrashing prevention Yu Zhao
2022-02-08  8:43   ` Yu Zhao
2022-02-08  8:19 ` [PATCH v7 11/12] mm: multigenerational LRU: debugfs interface Yu Zhao
2022-02-18 18:56   ` [page-reclaim] " David Rientjes
2022-02-08  8:19 ` [PATCH v7 12/12] mm: multigenerational LRU: documentation Yu Zhao
2022-02-08  8:44   ` Yu Zhao
2022-02-14 10:28   ` Mike Rapoport
2022-02-16  3:22     ` Yu Zhao
2022-02-21  9:01       ` Mike Rapoport
2022-02-22  1:47         ` Yu Zhao
2022-02-23 10:58           ` Mike Rapoport
2022-02-23 21:20             ` Yu Zhao
2022-02-08 10:11 ` [PATCH v7 00/12] Multigenerational LRU Framework Oleksandr Natalenko
2022-02-08 11:14   ` Michal Hocko
2022-02-08 11:23     ` Oleksandr Natalenko
2022-02-11 20:12 ` Alexey Avramov
2022-02-12 21:01   ` Yu Zhao
2022-03-03  6:06 ` Vaibhav Jain
2022-03-03  6:47   ` Yu Zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAGsJ_4zT7gtGSEoAay=VE6x_YZkNKtLymRL94pBnVgKekPzxaQ@mail.gmail.com' \
    --to=21cnbao@gmail.com \
    --cc=Hi-Angel@yandex.ru \
    --cc=Michael@michaellarabel.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=axboe@kernel.dk \
    --cc=bgeffon@google.com \
    --cc=catalin.marinas@arm.com \
    --cc=corbet@lwn.net \
    --cc=d@chaos-reins.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=djbyrne@mtu.edu \
    --cc=hannes@cmpxchg.org \
    --cc=hdanton@sina.com \
    --cc=heftig@archlinux.org \
    --cc=holger@applied-asynchrony.com \
    --cc=jsbarnes@google.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@kernel.org \
    --cc=oleksandr@natalenko.name \
    --cc=page-reclaim@google.com \
    --cc=riel@surriel.com \
    --cc=rppt@kernel.org \
    --cc=sofia.trinh@edi.works \
    --cc=steven@liquorix.net \
    --cc=suleiman@google.com \
    --cc=szhai2@cs.rochester.edu \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    --cc=ying.huang@intel.com \
    --cc=yuzhao@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).