From: Yu Zhao <yuzhao@google.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andi Kleen <ak@linux.intel.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Hillf Danton <hdanton@sina.com>, Jens Axboe <axboe@kernel.dk>,
	Jesse Barnes <jsbarnes@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Jonathan Corbet <corbet@lwn.net>,
	Matthew Wilcox <willy@infradead.org>,
	Mel Gorman <mgorman@suse.de>,
	Michael Larabel <Michael@michaellarabel.com>,
	Rik van Riel <riel@surriel.com>, Vlastimil Babka <vbabka@suse.cz>,
	Will Deacon <will@kernel.org>, Ying Huang <ying.huang@intel.com>,
	linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	page-reclaim@google.com, x86@kernel.org,
	Konstantin Kharlamov <Hi-Angel@yandex.ru>
Subject: Re: [PATCH v6 6/9] mm: multigenerational lru: aging
Date: Mon, 10 Jan 2022 18:18:55 -0700
Message-ID: <Ydza/zXKY9ATRoh6@google.com>
In-Reply-To: <YdxSUuDc3OC4pe+f@dhcp22.suse.cz>

On Mon, Jan 10, 2022 at 04:35:46PM +0100, Michal Hocko wrote:
> On Fri 07-01-22 16:36:11, Yu Zhao wrote:
> > On Fri, Jan 07, 2022 at 02:11:29PM +0100, Michal Hocko wrote:
> > > On Tue 04-01-22 13:22:25, Yu Zhao wrote:
> > > [...]
> > > > +static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc)
> > > > +{
> > > > +	struct mem_cgroup *memcg;
> > > > +	bool success = false;
> > > > +	unsigned long min_ttl = READ_ONCE(lru_gen_min_ttl);
> > > > +
> > > > +	VM_BUG_ON(!current_is_kswapd());
> > > > +
> > > > +	current->reclaim_state->mm_walk = &pgdat->mm_walk;
> > > > +
> > > > +	memcg = mem_cgroup_iter(NULL, NULL, NULL);
> > > > +	do {
> > > > +		struct lruvec *lruvec = mem_cgroup_lruvec(memcg, pgdat);
> > > > +
> > > > +		if (age_lruvec(lruvec, sc, min_ttl))
> > > > +			success = true;
> > > > +
> > > > +		cond_resched();
> > > > +	} while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)));
> > > > +
> > > > +	if (!success && mutex_trylock(&oom_lock)) {
> > > > +		struct oom_control oc = {
> > > > +			.gfp_mask = sc->gfp_mask,
> > > > +			.order = sc->order,
> > > > +		};
> > > > +
> > > > +		if (!oom_reaping_in_progress())
> > > > +			out_of_memory(&oc);
> > > > +
> > > > +		mutex_unlock(&oom_lock);
> > > > +	}
> > > 
> > > Why do you need to trigger oom killer from this path? Why cannot you
> > > rely on the page allocator to do that like we do now?
> > 
> > This is per desktop users' (repeated) requests. They can't tolerate
> > thrashing as servers do because of UI lags; and they usually don't
> > have fancy tools like oomd.
> > 
> > Related discussions I saw:
> > https://github.com/zen-kernel/zen-kernel/issues/218
> > https://lore.kernel.org/lkml/20101028191523.GA14972@google.com/
> > https://lore.kernel.org/lkml/20211213051521.21f02dd2@mail.inbox.lv/
> > https://lore.kernel.org/lkml/54C2C89C.8080002@gmail.com/
> > https://lore.kernel.org/lkml/d9802b6a-949b-b327-c4a6-3dbca485ec20@gmx.com/
> 
> I do not really see any arguments why a userspace-based thrashing
> detection cannot be used for those. Could you clarify?

It definitely can be done. But who is going to do it for every distro
and all individual users? AFAIK, not a single distro provides such a
solution for desktop/laptop/phone users.

There is also the theoretical question of how reliable a userspace
solution can be. What if the userspace solution itself gets stuck in
the direct reclaim path? I'm not sure anybody has done the research
to prove or debunk it.

In addition, exactly what PSI values should be used on different
models of consumer electronics? Nobody knows. We have a team working
on this and we haven't figured it out for all our Chromebook models.
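
For context, a userspace detector of the kind discussed here would
typically poll /proc/pressure/memory and compare one of the averages
against a threshold. The sketch below only parses one line of that
file and applies a cutoff. The struct, the function names and the
choice of the "full" avg10 field are illustrative assumptions, and the
threshold is left to the caller because no good per-device value is
known.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* One line of /proc/pressure/memory, e.g.
 * "full avg10=0.00 avg60=0.00 avg300=0.00 total=0".
 * Hypothetical helper type, not part of any kernel or library API. */
struct psi_line {
	char kind[8];			/* "some" or "full" */
	double avg10, avg60, avg300;	/* stall percentages */
	unsigned long long total_us;	/* cumulative stall time, microseconds */
};

/* Parse one PSI line; returns true only if all five fields were read. */
static bool parse_psi_line(const char *line, struct psi_line *out)
{
	int n = sscanf(line, "%7s avg10=%lf avg60=%lf avg300=%lf total=%llu",
		       out->kind, &out->avg10, &out->avg60, &out->avg300,
		       &out->total_us);

	return n == 5;
}

/* Assumed policy: treat the system as thrashing when the "full" avg10
 * value reaches the caller-supplied threshold (in percent). */
static bool is_thrashing(const struct psi_line *p, double threshold_pct)
{
	return strcmp(p->kind, "full") == 0 && p->avg10 >= threshold_pct;
}
```

The cutoff passed to is_thrashing() is exactly the number that has to
be tuned per device, and the process running this loop can itself
stall in reclaim.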

As Andrew said, "a blunt instrument like this would be useful".
https://lore.kernel.org/lkml/20211202135824.33d2421bf5116801cfa2040d@linux-foundation.org/

I'd like to have less code in kernel too, but I've learned never to
walk over users. If I remove this and they come after me asking why,
I'd have a hard time convincing them.

> Also my question was pointing to why out_of_memory is called from the
> reclaim rather than the allocator (memcg charging path). It is the
> caller of the reclaim to control different reclaim strategies and tell
> when all the hopes are lost and the oom killer should be invoked. This
> allows for different policies at the allocator level and this change
> will break that AFAICS. E.g. what if the underlying allocation context
> is __GFP_NORETRY?

This is called in kswapd only, and by default (min_ttl=0) it doesn't
do anything. So __GFP_NORETRY doesn't apply. The question would be
more along the lines of long-term ABI support.

And I'll add the following comments, if you think we can keep this
logic:
   OOM kill if every generation from all memcgs is younger than min_ttl.
   Another theoretical possibility is that all memcgs are either below
   min or ineligible at priority 0, but this isn't the main goal.
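
The first sentence of that comment can be modelled in plain C as
follows. This is a userspace sketch of the rule only, not the code
from the patch: the struct, the function name and the millisecond
clock are assumptions, and it leaves out memcg protection, oom_lock
and the check for reaping in progress.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-memcg lruvec: only the creation
 * time of its oldest generation matters for this rule. */
struct lruvec_model {
	unsigned long oldest_gen_birth_ms;
};

/* Returns true when an OOM kill would be warranted: min_ttl is set
 * and, in every lruvec, even the oldest generation is younger than
 * min_ttl. Assumes now_ms >= oldest_gen_birth_ms for every entry. */
static bool all_generations_too_young(const struct lruvec_model *vecs,
				      size_t n, unsigned long now_ms,
				      unsigned long min_ttl_ms)
{
	size_t i;

	if (!min_ttl_ms || !n)
		return false;	/* min_ttl=0 disables the check */

	for (i = 0; i < n; i++) {
		/* One lruvec with an old enough generation is enough
		 * to count as success. */
		if (now_ms - vecs[i].oldest_gen_birth_ms >= min_ttl_ms)
			return false;
	}

	return true;
}
```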

(Please read my reply at the bottom to decide whether we should keep
 it or not. Thanks.)

> > From patch 8:
> >   Personal computers
> >   ------------------
> >   :Thrashing prevention: Write ``N`` to
> >    ``/sys/kernel/mm/lru_gen/min_ttl_ms`` to prevent the working set of
> >    ``N`` milliseconds from getting evicted. The OOM killer is invoked if
> >    this working set can't be kept in memory. Based on the average human
> >    detectable lag (~100ms), ``N=1000`` usually eliminates intolerable
> >    lags due to thrashing. Larger values like ``N=3000`` make lags less
> >    noticeable at the cost of more OOM kills.
> 
> This is a very good example of something that should be a self contained
> patch with its own justification.

Consider it done.
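
For anyone who wants to try the knob quoted above from a program
rather than a shell, a minimal sketch follows. The helper name is
hypothetical. The path in the documentation is
/sys/kernel/mm/lru_gen/min_ttl_ms, and it is passed in as a parameter
here so that the function can be exercised against an ordinary file.

```c
#include <stdio.h>

/* Write "N\n" to the given file. Returns 0 on success, -1 on any
 * failure to open, write or close. Hypothetical helper. */
static int write_min_ttl_ms(const char *path, unsigned int ms)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;

	ok = fprintf(f, "%u\n", ms) > 0;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

Calling write_min_ttl_ms("/sys/kernel/mm/lru_gen/min_ttl_ms", 1000)
would correspond to the N=1000 case in the documentation. It needs
root and a kernel with this series applied.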

> TBH it is really not all that clear to
> me that we want to provide any user visible knob to control OOM behavior
> based on a time based QoS.

Agreed, and it didn't exist until v4, i.e., after I was asked several
times to provide it.

For example:
https://github.com/zen-kernel/zen-kernel/issues/223

And another example:
   Your Multigenerational LRU patchset is pretty complex and
   effective, but does not eliminate thrashing condition fully on an
   old PCs with slow HDD.

   I'm kindly asking you to cooperate with hakavlad if it's possible
   and maybe re-implement parts of le9 patch in your patchset wherever
   acceptable, as they are quite similar in the core concept.

This is an excerpt from an email from iam@valdikss.org.ru, and he has
posted demo videos in this discussion:
https://lore.kernel.org/lkml/2dc51fc8-f14e-17ed-a8c6-0ec70423bf54@valdikss.org.ru/

