linux-mm.kvack.org archive mirror
From: Yu Zhao <yuzhao@google.com>
To: Hillf Danton <hdanton@sina.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v7 05/12] mm: multigenerational LRU: minimal implementation
Date: Wed, 16 Feb 2022 17:13:13 -0700
Message-ID: <Yg2TGSruRVMGpxqk@google.com>
In-Reply-To: <20220213100417.1183-1-hdanton@sina.com>

On Sun, Feb 13, 2022 at 06:04:17PM +0800, Hillf Danton wrote:

Hi Hillf,

> On Tue,  8 Feb 2022 01:18:55 -0700 Yu Zhao wrote:
> > +
> > +/******************************************************************************
> > + *                          the aging
> > + ******************************************************************************/
> > +
> > +static int folio_inc_gen(struct lruvec *lruvec, struct folio *folio, bool reclaiming)
> > +{
> > +	unsigned long old_flags, new_flags;
> > +	int type = folio_is_file_lru(folio);
> > +	struct lru_gen_struct *lrugen = &lruvec->lrugen;
> > +	int new_gen, old_gen = lru_gen_from_seq(lrugen->min_seq[type]);
> > +
> > +	do {
> > +		new_flags = old_flags = READ_ONCE(folio->flags);
> > +		VM_BUG_ON_FOLIO(!(new_flags & LRU_GEN_MASK), folio);
> > +
> > +		new_gen = ((new_flags & LRU_GEN_MASK) >> LRU_GEN_PGOFF) - 1;
> 
> Is the chance zero for deadloop if new_gen != old_gen?

No deadloop here, because the counter is only cleared during isolation,
and here it's protected against isolation (under the LRU lock, which is
asserted in the lru_gen_balance_size() -> lru_gen_update_size() path).
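
To illustrate the retry loop being discussed, here is a minimal userspace
sketch using C11 atomics instead of the kernel's cmpxchg(). The bit layout
(GEN_PGOFF, GEN_MASK, RECLAIM_BIT) and the helper names flags_inc_gen() and
flags_gen() are hypothetical and only mirror the shape of the quoted code.
The loop retries only when another writer changed the flags word between
the load and the compare-and-swap, so it makes progress as long as the
counter cannot be cleared underneath it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout: the generation is stored as gen + 1 in bits 8..10. */
#define GEN_PGOFF   8
#define GEN_MASK    (7UL << GEN_PGOFF)
#define MAX_NR_GENS 4
#define RECLAIM_BIT (1UL << 0)

/* Decode the generation stored in a flags word, or -1 if the counter is 0. */
static int flags_gen(unsigned long flags)
{
	return (int)((flags & GEN_MASK) >> GEN_PGOFF) - 1;
}

/*
 * Move the flags word to the generation after old_gen.
 * Assumption: the caller holds a lock that prevents the counter from
 * being cleared, which is the role the LRU lock plays in the patch.
 */
static int flags_inc_gen(_Atomic unsigned long *flags, int old_gen,
			 bool reclaiming)
{
	unsigned long old_flags = atomic_load(flags);
	unsigned long new_flags;
	int new_gen = (old_gen + 1) % MAX_NR_GENS;

	do {
		/* A zero counter would mean the entry is not on a list. */
		assert(old_flags & GEN_MASK);

		new_flags = old_flags & ~GEN_MASK;
		new_flags |= ((unsigned long)new_gen + 1) << GEN_PGOFF;
		if (reclaiming)
			new_flags |= RECLAIM_BIT;
		/* On failure, old_flags is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(flags, &old_flags, new_flags));

	return new_gen;
}
```

Unrelated bits in the word are preserved because each attempt starts from
the value that was most recently observed.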

> > +		new_gen = (old_gen + 1) % MAX_NR_GENS;
> > +
> > +		new_flags &= ~LRU_GEN_MASK;
> > +		new_flags |= (new_gen + 1UL) << LRU_GEN_PGOFF;
> > +		new_flags &= ~(LRU_REFS_MASK | LRU_REFS_FLAGS);
> > +		/* for folio_end_writeback() */
> 
> 		/* for folio_end_writeback() and sort_folio() */ in terms of
> reclaiming?

Right.

> > +		if (reclaiming)
> > +			new_flags |= BIT(PG_reclaim);
> > +	} while (cmpxchg(&folio->flags, old_flags, new_flags) != old_flags);
> > +
> > +	lru_gen_balance_size(lruvec, folio, old_gen, new_gen);
> > +
> > +	return new_gen;
> > +}
> 
> ...
> 
> > +/******************************************************************************
> > + *                          the eviction
> > + ******************************************************************************/
> > +
> > +static bool sort_folio(struct lruvec *lruvec, struct folio *folio, int tier_idx)
> > +{
> 
> Nit, the 80-column-char format is preferred.

Will do.

> > +	bool success;
> > +	int gen = folio_lru_gen(folio);
> > +	int type = folio_is_file_lru(folio);
> > +	int zone = folio_zonenum(folio);
> > +	int tier = folio_lru_tier(folio);
> > +	int delta = folio_nr_pages(folio);
> > +	struct lru_gen_struct *lrugen = &lruvec->lrugen;
> > +
> > +	VM_BUG_ON_FOLIO(gen >= MAX_NR_GENS, folio);
> > +
> > +	if (!folio_evictable(folio)) {
> > +		success = lru_gen_del_folio(lruvec, folio, true);
> > +		VM_BUG_ON_FOLIO(!success, folio);
> > +		folio_set_unevictable(folio);
> > +		lruvec_add_folio(lruvec, folio);
> > +		__count_vm_events(UNEVICTABLE_PGCULLED, delta);
> > +		return true;
> > +	}
> > +
> > +	if (type && folio_test_anon(folio) && folio_test_dirty(folio)) {
> > +		success = lru_gen_del_folio(lruvec, folio, true);
> > +		VM_BUG_ON_FOLIO(!success, folio);
> > +		folio_set_swapbacked(folio);
> > +		lruvec_add_folio_tail(lruvec, folio);
> > +		return true;
> > +	}
> > +
> > +	if (tier > tier_idx) {
> > +		int hist = lru_hist_from_seq(lrugen->min_seq[type]);
> > +
> > +		gen = folio_inc_gen(lruvec, folio, false);
> > +		list_move_tail(&folio->lru, &lrugen->lists[gen][type][zone]);
> > +
> > +		WRITE_ONCE(lrugen->promoted[hist][type][tier - 1],
> > +			   lrugen->promoted[hist][type][tier - 1] + delta);
> > +		__mod_lruvec_state(lruvec, WORKINGSET_ACTIVATE_BASE + type, delta);
> > +		return true;
> > +	}
> > +
> > +	if (folio_test_locked(folio) || folio_test_writeback(folio) ||
> > +	    (type && folio_test_dirty(folio))) {
> > +		gen = folio_inc_gen(lruvec, folio, true);
> > +		list_move(&folio->lru, &lrugen->lists[gen][type][zone]);
> > +		return true;
> 
> This makes the cold dirty page cache younger instead of writing it out in the
> background reclaimer context, and the question arising is whether laundering is
> deferred until the flusher threads are woken up in the following patches.

This is a good point. In contrast to the active/inactive LRU, MGLRU
doesn't write out dirty file pages (whether in kswapd or in direct
reclaim) -- this is writeback's job and it should be better at doing
this. In fact, commit 21b4ee7029 ("xfs: drop ->writepage completely")
has disabled dirty file page writeouts in the reclaim path completely.
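
As a rough illustration of the order of checks in the quoted sort_folio(),
the sketch below models the decision as a pure function. The enum, the
struct and the name sort_decision() are hypothetical, and the final
fall-through case is an assumption, since the quoted code is cut off before
the end of the function. The point is that a dirty file folio is deferred
to a younger generation rather than written out by reclaim.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes; the names are illustrative, not kernel symbols. */
enum sort_action {
	SORT_UNEVICTABLE,	/* move to the unevictable list */
	SORT_SWAPBACKED,	/* dirty anon folio found on the file list */
	SORT_PROMOTE,		/* tier is above the eviction threshold */
	SORT_DEFER,		/* move to the next generation, no writeout */
	SORT_EVICT_CANDIDATE,	/* assumed fall-through: left for isolation */
};

/* Simplified stand-in for the folio state the quoted code looks at. */
struct folio_model {
	bool evictable, file, anon, dirty, locked, writeback;
	int tier;
};

static enum sort_action sort_decision(const struct folio_model *f,
				      int tier_idx)
{
	if (!f->evictable)
		return SORT_UNEVICTABLE;
	if (f->file && f->anon && f->dirty)
		return SORT_SWAPBACKED;
	if (f->tier > tier_idx)
		return SORT_PROMOTE;
	/* Reclaim does not write the folio out itself; writeback will. */
	if (f->locked || f->writeback || (f->file && f->dirty))
		return SORT_DEFER;
	return SORT_EVICT_CANDIDATE;
}
```

In this model, "file" stands for the list type, so a dirty folio on the
anon list is not deferred by the last check.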

Reclaim indirectly wakes up writeback after clean file pages drop
below a threshold (dirty ratio). However, dirty pages might be
undercounted on a system that uses a large number of mmapped file
pages. MGLRU optimizes this by calling folio_mark_dirty() on pages
mapped by dirty PTEs when scanning page tables. (Why not, since it's
already looking at the accessed bit?)
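
The idea of propagating the dirty bit while scanning for the accessed bit
can be sketched in userspace as follows. This is a simplified model, not
the kernel's page table walker: struct pte_model, struct page_model and
scan_ptes() are hypothetical, and the dirty bit is checked for every
present entry here, which may differ from the conditions in the actual
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical software model of a PTE and the page it maps. */
#define PTE_ACCESSED (1u << 0)
#define PTE_DIRTY    (1u << 1)

struct page_model {
	bool dirty;	/* what dirty accounting can see */
	bool young;	/* recently accessed */
};

struct pte_model {
	unsigned int bits;
	struct page_model *page;	/* NULL if nothing is mapped */
};

/*
 * Walk a range of PTEs. For each mapped entry, harvest the accessed
 * bit and, since the entry is being inspected anyway, propagate the
 * dirty bit to the page so that it is counted as dirty.
 * Returns the number of pages newly marked dirty.
 */
static size_t scan_ptes(struct pte_model *ptes, size_t nr)
{
	size_t newly_dirty = 0;

	for (size_t i = 0; i < nr; i++) {
		struct pte_model *pte = &ptes[i];

		if (!pte->page)
			continue;

		if (pte->bits & PTE_ACCESSED) {
			pte->page->young = true;
			pte->bits &= ~PTE_ACCESSED;
		}

		if ((pte->bits & PTE_DIRTY) && !pte->page->dirty) {
			pte->page->dirty = true;
			newly_dirty++;
		}
	}

	return newly_dirty;
}
```

Pages that were dirtied only through a mapping become visible to the dirty
accounting after one scan, without a separate pass over the page tables.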

The commit above explained this design choice from the performance
aspect. From the implementation aspect, it also creates a boundary
between reclaim and writeback. This simplifies things, e.g., the
PageWriteback() check in shrink_page_list() is no longer relevant for
MGLRU, and neither is the top half of the PageDirty() check.


