linux-kernel.vger.kernel.org archive mirror
From: Barry Song <21cnbao@gmail.com>
To: Will Deacon <will@kernel.org>
Cc: "Yu Zhao" <yuzhao@google.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	Linux-MM <linux-mm@kvack.org>, "Andi Kleen" <ak@linux.intel.com>,
	"Aneesh Kumar" <aneesh.kumar@linux.ibm.com>,
	"Catalin Marinas" <catalin.marinas@arm.com>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"Hillf Danton" <hdanton@sina.com>, "Jens Axboe" <axboe@kernel.dk>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Mel Gorman" <mgorman@suse.de>,
	"Michael Larabel" <Michael@michaellarabel.com>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Mike Rapoport" <rppt@kernel.org>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Tejun Heo" <tj@kernel.org>, "Vlastimil Babka" <vbabka@suse.cz>,
	LAK <linux-arm-kernel@lists.infradead.org>,
	"Linux Doc Mailing List" <linux-doc@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>, x86 <x86@kernel.org>,
	"Kernel Page Reclaim v2" <page-reclaim@google.com>,
	"Brian Geffon" <bgeffon@google.com>,
	"Jan Alexander Steffens" <heftig@archlinux.org>,
	"Oleksandr Natalenko" <oleksandr@natalenko.name>,
	"Steven Barrett" <steven@liquorix.net>,
	"Suleiman Souhlal" <suleiman@google.com>,
	"Daniel Byrne" <djbyrne@mtu.edu>,
	"Donald Carr" <d@chaos-reins.com>,
	"Holger Hoffstätte" <holger@applied-asynchrony.com>,
	"Konstantin Kharlamov" <Hi-Angel@yandex.ru>,
	"Shuang Zhai" <szhai2@cs.rochester.edu>,
	"Sofia Trinh" <sofia.trinh@edi.works>,
	"Vaibhav Jain" <vaibhav@linux.ibm.com>,
	huzhanyuan@oppo.com
Subject: Re: [PATCH v11 07/14] mm: multi-gen LRU: exploit locality in rmap
Date: Tue, 7 Jun 2022 10:37:46 +1200
Message-ID: <CAGsJ_4zGEdHDv0ObZ-5y8sFKLO7Y6ZjTsZFs0KvdLwA_-iGJ5A@mail.gmail.com>
In-Reply-To: <20220607102135.GA32448@willie-the-truck>

On Tue, Jun 7, 2022 at 10:21 PM Will Deacon <will@kernel.org> wrote:
>
> On Tue, Jun 07, 2022 at 07:37:10PM +1200, Barry Song wrote:
> > On Mon, Jun 6, 2022 at 9:25 PM Barry Song <21cnbao@gmail.com> wrote:
> > > On Wed, May 18, 2022 at 4:49 PM Yu Zhao <yuzhao@google.com> wrote:
> > > > diff --git a/mm/rmap.c b/mm/rmap.c
> > > > index fedb82371efe..7cb7ef29088a 100644
> > > > --- a/mm/rmap.c
> > > > +++ b/mm/rmap.c
> > > > @@ -73,6 +73,7 @@
> > > >  #include <linux/page_idle.h>
> > > >  #include <linux/memremap.h>
> > > >  #include <linux/userfaultfd_k.h>
> > > > +#include <linux/mm_inline.h>
> > > >
> > > >  #include <asm/tlbflush.h>
> > > >
> > > > @@ -821,6 +822,12 @@ static bool folio_referenced_one(struct folio *folio,
> > > >                 }
> > > >
> > > >                 if (pvmw.pte) {
> > > > +                       if (lru_gen_enabled() && pte_young(*pvmw.pte) &&
> > > > +                           !(vma->vm_flags & (VM_SEQ_READ | VM_RAND_READ))) {
> > > > +                               lru_gen_look_around(&pvmw);
> > > > +                               referenced++;
> > > > +                       }
> > > > +
> > > >                         if (ptep_clear_flush_young_notify(vma, address,
> > >
> > > Hello, Yu.
> > > look_around() is calling ptep_test_and_clear_young(pvmw->vma, addr, pte + i)
> > > only without flush and notify. for flush, there is a tlb operation for arm64:
> > > static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
> > >                                          unsigned long address, pte_t *ptep)
> > > {
> > >         int young = ptep_test_and_clear_young(vma, address, ptep);
> > >
> > >         if (young) {
> > >                 /*
> > >                  * We can elide the trailing DSB here since the worst that can
> > >                  * happen is that a CPU continues to use the young entry in its
> > >                  * TLB and we mistakenly reclaim the associated page. The
> > >                  * window for such an event is bounded by the next
> > >                  * context-switch, which provides a DSB to complete the TLB
> > >                  * invalidation.
> > >                  */
> > >                 flush_tlb_page_nosync(vma, address);
> > >         }
> > >
> > >         return young;
> > > }
> > >
> > > Does it mean the current kernel is over cautious?  is it
> > > safe to call ptep_test_and_clear_young() only?
> >
> > I can't really explain why we are getting a random app/java vm crash in monkey
> > test by using ptep_test_and_clear_young() only in lru_gen_look_around() on an
> > armv8-a machine without hardware PTE young support.
> >
> > Moving to  ptep_clear_flush_young() in look_around can make the random
> > hang disappear according to zhanyuan(Cc-ed).
> >
> > On x86, ptep_clear_flush_young() is exactly ptep_test_and_clear_young()
> > after
> >  'commit b13b1d2d8692 ("x86/mm: In the PTE swapout page reclaim case clear
> > the accessed bit instead of flushing the TLB")'
> >
> > But on arm64, they are different. according to Will's comments in this
> > thread which
> > tried to make arm64 same with x86,
> > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1793881.html
> >
> > "
> > This is blindly copied from x86 and isn't true for us: we don't invalidate
> > the TLB on context switch. That means our window for keeping the stale
> > entries around is potentially much bigger and might not be a great idea.
> >
> > If we roll a TLB invalidation routine without the trailing DSB, what sort of
> > performance does that get you?
> > "
> > We shouldn't think ptep_clear_flush_young() is safe enough in LRU to
> > clear PTE young? Any comments from Will?
>
> Given that this issue is specific to the multi-gen LRU work, I think Yu is
> the best person to comment. However, looking quickly at your analysis above,
> I wonder if the code is relying on this sequence:
>
>
>         ptep_test_and_clear_young(vma, address, ptep);
>         ptep_clear_flush_young(vma, address, ptep);
>
>
> to invalidate the TLB. On arm64, that won't be the case, as the invalidation
> in ptep_clear_flush_young() is predicated on the pte being young (and this
> patches the generic implementation in mm/pgtable-generic.c). In fact, that
> second function call is always going to be a no-op unless the pte became
> young again in the middle.
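
To make the point above concrete, here is a minimal user-space model of the
two helpers. It is only a sketch, not kernel code: struct model_pte,
struct model_tlb and the model_* functions are hypothetical names invented
for illustration, and the model assumes the flush is issued only when the
PTE was found young, as in the arm64 helper quoted earlier.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_pte {
	bool young;		/* accessed ("young") bit in the page table entry */
};

struct model_tlb {
	bool cached_young;	/* TLB still holds a copy of the entry marked young */
	int flushes;		/* number of invalidations issued so far */
};

/* Models ptep_test_and_clear_young(): clear the bit, report its old value. */
static bool model_test_and_clear_young(struct model_pte *pte)
{
	bool young = pte->young;

	pte->young = false;
	return young;
}

/*
 * Models ptep_clear_flush_young(): the invalidation is predicated on the
 * PTE having been young at the time of the call.
 */
static bool model_clear_flush_young(struct model_pte *pte,
				    struct model_tlb *tlb)
{
	bool young = model_test_and_clear_young(pte);

	if (young) {
		tlb->cached_young = false;
		tlb->flushes++;
	}
	return young;
}
```

In this model, calling model_test_and_clear_young() and then
model_clear_flush_young() on a PTE that starts out young leaves flushes at 0
and cached_young still set: the second call sees an old PTE and skips the
invalidation.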

Hi Will,

Thanks for your reply, and sorry that my question was not clear. What I
actually meant to ask is this: right now lru_gen_look_around() uses only
ptep_test_and_clear_young(), without a TLB flush, to clear the young bit
in the PTEs of a range of pages that includes the specific address:

void lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
{
       ...

       for (i = 0, addr = start; addr != end; i++, addr += PAGE_SIZE) {
               ...

               if (!ptep_test_and_clear_young(pvmw->vma, addr, pte + i))
                       continue;

               ...
       }
}

I wonder whether this is safe on arm64. Do we need to switch to
ptep_clear_flush_young() in the loop?
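
To illustrate what differs between the two choices, here is a small
user-space sketch of the loop. It is not the kernel implementation:
struct sim_page and sim_look_around() are hypothetical names, and the model
simply assumes that a TLB entry keeps its cached young state until it is
explicitly invalidated. It says nothing about whether a stale entry is
actually harmful, which is the open question.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct sim_page {
	bool pte_young;	/* accessed ("young") bit in the page table entry */
	bool tlb_young;	/* TLB still caches this entry as young */
};

/*
 * Walk n neighbouring pages and clear their young bits. With flush set,
 * each page found young also has its cached TLB entry dropped.
 * Returns how many pages are still cached as young in the TLB after the
 * walk, i.e. entries that disagree with the now-old PTE.
 */
static size_t sim_look_around(struct sim_page *pages, size_t n, bool flush)
{
	size_t stale = 0;

	for (size_t i = 0; i < n; i++) {
		if (pages[i].pte_young) {
			pages[i].pte_young = false;
			if (flush)
				pages[i].tlb_young = false;
		}
		if (pages[i].tlb_young)
			stale++;
	}
	return stale;
}
```

With three pages of which two start out young in both the PTE and the TLB,
sim_look_around(pages, 3, false) returns 2, while the same walk with
flush set to true returns 0.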

>
> Will

Thanks
Barry
