From: Yu Zhao <yuzhao@google.com>
To: Barry Song <21cnbao@gmail.com>
Cc: "Linus Torvalds" <torvalds@linux-foundation.org>,
"Will Deacon" <will@kernel.org>,
"Andrew Morton" <akpm@linux-foundation.org>,
Linux-MM <linux-mm@kvack.org>, "Andi Kleen" <ak@linux.intel.com>,
"Aneesh Kumar" <aneesh.kumar@linux.ibm.com>,
"Catalin Marinas" <catalin.marinas@arm.com>,
"Dave Hansen" <dave.hansen@linux.intel.com>,
"Hillf Danton" <hdanton@sina.com>, "Jens Axboe" <axboe@kernel.dk>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Jonathan Corbet" <corbet@lwn.net>,
"Matthew Wilcox" <willy@infradead.org>,
"Mel Gorman" <mgorman@suse.de>,
"Michael Larabel" <Michael@michaellarabel.com>,
"Michal Hocko" <mhocko@kernel.org>,
"Mike Rapoport" <rppt@kernel.org>,
"Peter Zijlstra" <peterz@infradead.org>,
"Tejun Heo" <tj@kernel.org>, "Vlastimil Babka" <vbabka@suse.cz>,
LAK <linux-arm-kernel@lists.infradead.org>,
"Linux Doc Mailing List" <linux-doc@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>, x86 <x86@kernel.org>,
"Kernel Page Reclaim v2" <page-reclaim@google.com>,
"Brian Geffon" <bgeffon@google.com>,
"Jan Alexander Steffens" <heftig@archlinux.org>,
"Oleksandr Natalenko" <oleksandr@natalenko.name>,
"Steven Barrett" <steven@liquorix.net>,
"Suleiman Souhlal" <suleiman@google.com>,
"Daniel Byrne" <djbyrne@mtu.edu>,
"Donald Carr" <d@chaos-reins.com>,
"Holger Hoffstätte" <holger@applied-asynchrony.com>,
"Konstantin Kharlamov" <Hi-Angel@yandex.ru>,
"Shuang Zhai" <szhai2@cs.rochester.edu>,
"Sofia Trinh" <sofia.trinh@edi.works>,
"Vaibhav Jain" <vaibhav@linux.ibm.com>,
huzhanyuan@oppo.com
Subject: Re: [PATCH v11 07/14] mm: multi-gen LRU: exploit locality in rmap
Date: Thu, 16 Jun 2022 19:42:25 -0600
Message-ID: <CAOUHufYvH2LaGyAJZFQNOsGDBKD2++aFnTV6=qaVtcNrKjS_bA@mail.gmail.com>
In-Reply-To: <CAOUHufbOwPSbBwd7TG0QFt4YJvBp93Q9nUJEDvMpUA6PqjYMUQ@mail.gmail.com>
On Thu, Jun 16, 2022 at 5:29 PM Yu Zhao <yuzhao@google.com> wrote:
>
> On Thu, Jun 16, 2022 at 4:33 PM Barry Song <21cnbao@gmail.com> wrote:
> >
> > On Fri, Jun 17, 2022 at 9:56 AM Yu Zhao <yuzhao@google.com> wrote:
> > >
> > > On Wed, Jun 8, 2022 at 4:46 PM Barry Song <21cnbao@gmail.com> wrote:
> > > >
> > > > On Thu, Jun 9, 2022 at 3:52 AM Linus Torvalds
> > > > <torvalds@linux-foundation.org> wrote:
> > > > >
> > > > > On Tue, Jun 7, 2022 at 5:43 PM Barry Song <21cnbao@gmail.com> wrote:
> > > > > >
> > > > > > Given we used to have a flush for clear pte young in LRU, right now we are
> > > > > > moving to nop in almost all cases for the flush unless the address becomes
> > > > > > young exactly after look_around and before ptep_clear_flush_young_notify.
> > > > > > > It means we are actually dropping the flush. So the question is, were we
> > > > > > > overcautious? Do we actually not need the flush at all, even without mglru?
> > > > >
> > > > > We stopped flushing the TLB on A bit clears on x86 back in 2014.
> > > > >
> > > > > See commit b13b1d2d8692 ("x86/mm: In the PTE swapout page reclaim case
> > > > > clear the accessed bit instead of flushing the TLB").
> > > >
> > > > This is true for x86, RISC-V, powerpc and s390, but it is not true for
> > > > most platforms.
> > > >
> > > > There was an attempt to do the same thing in arm64:
> > > > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1793830.html
> > > > but arm64 still sent a nosync tlbi and depended on a deferred dsb:
> > > > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1794484.html
> > >
> > > Barry, you've already answered your own question.
> > >
> > > Without commit 07509e10dcc7 ("arm64: pgtable: Fix pte_accessible()"):
> > >  #define pte_accessible(mm, pte) \
> > > -        (mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid_young(pte))
> > > +        (mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid(pte))
> > >
> > > You missed all TLB flushes for PTEs that have gone through
> > > ptep_test_and_clear_young() on the reclaim path. But most of the time,
> > > you got away with it, with only occasional app crashes:
> > > https://lore.kernel.org/r/CAGsJ_4w6JjuG4rn2P=d974wBOUtXUUnaZKnx+-G6a8_mSROa+Q@mail.gmail.com/
> > >
> > > Why?
> >
> > Yes. On the arm64 platform, ptep_test_and_clear_young() without a flush
> > can cause random apps to crash.
> > ptep_test_and_clear_young() + flush doesn't cause this kind of crash, though.
> > But after applying commit 07509e10dcc7 ("arm64: pgtable: Fix
> > pte_accessible()") on arm64, ptep_test_and_clear_young() without a flush
> > won't cause apps to crash.
> >
> > ptep_test_and_clear_young(), with flush, without commit 07509e10dcc7: OK
> > ptep_test_and_clear_young(), without flush, with commit 07509e10dcc7: OK
> > ptep_test_and_clear_young(), without flush, without commit 07509e10dcc7: CRASH
>
> I agree -- my question was rhetorical :)
>
> I was trying to imply this logic:
> 1. We cleared the A-bit in PTEs with ptep_test_and_clear_young()
> 2. We missed TLB flush for those PTEs on the reclaim path, i.e., case
> 3 (case 1 & 2 guarantee flushes)
> 3. We saw crashes, but only occasionally
>
> If the TLB had cached those PTEs, we would have seen the crashes more
> often, which contradicts our observation. So the conclusion is that the
> TLB didn't cache them most of the time, meaning that flushing the TLB
> just for the sake of the A-bit isn't necessary.
>
> > Do you think it is safe to totally remove the flush code even for
> > the original LRU?
>
> Affirmative, based not only on my word but also on third parties':
> 1. Your (indirect) observation
> 2. Alexander's benchmark:
> https://lore.kernel.org/r/BYAPR12MB271295B398729E07F31082A7CFAA0@BYAPR12MB2712.namprd12.prod.outlook.com/
> 3. The fundamental hardware limitation in terms of the TLB scalability
> (Fig. 1): https://www.usenix.org/legacy/events/osdi02/tech/full_papers/navarro/navarro.pdf
4. Intel's commit b13b1d2d8692 ("x86/mm: In the PTE swapout page
reclaim case clear the accessed bit instead of flushing the TLB")
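
To illustrate the pte_accessible() change quoted above, here is a minimal
userspace sketch (not kernel code). The toy_* names and the three-flag
struct are made up for this example, and pte_present() and
mm_tlb_flush_pending() are reduced to plain booleans. It only shows why a
PTE whose A-bit had been cleared stopped counting as accessible under the
old definition, so that a later flush keyed on pte_accessible() could be
skipped:

```c
#include <stdbool.h>

/*
 * Toy PTE holding only the state discussed in this thread.
 * Hypothetical model, NOT the arm64 page table entry layout.
 */
struct toy_pte {
	bool present;	/* stands in for pte_present() */
	bool valid;	/* stands in for pte_valid() */
	bool young;	/* the accessed (A) bit */
};

/* Models ptep_test_and_clear_young(): clear the A-bit, report its old value. */
static bool toy_test_and_clear_young(struct toy_pte *pte)
{
	bool was_young = pte->young;

	pte->young = false;
	return was_young;
}

/*
 * Old definition (before 07509e10dcc7): with no flush pending, a PTE
 * counts as accessible only if it is valid AND young.
 */
static bool toy_pte_accessible_old(bool flush_pending, struct toy_pte pte)
{
	return flush_pending ? pte.present : (pte.valid && pte.young);
}

/*
 * New definition (after 07509e10dcc7): with no flush pending, being
 * valid is enough, whatever the A-bit says.
 */
static bool toy_pte_accessible_new(bool flush_pending, struct toy_pte pte)
{
	return flush_pending ? pte.present : pte.valid;
}
```

With no flush pending, a valid PTE that has just gone through
toy_test_and_clear_young() is reported as not accessible by the old
predicate and as accessible by the new one. That is consistent with the
"without flush, without commit 07509e10dcc7: CRASH" row in Barry's table.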