linux-riscv.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Joel Fernandes <joel@joelfernandes.org>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: linux-mips@linux-mips.org, Rich Felker <dalias@libc.org>,
	linux-ia64@vger.kernel.org, linux-sh@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Balbir Singh <bsingharora@gmail.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Will Deacon <will.deacon@arm.com>,
	mhocko@kernel.org, linux-mm@kvack.org, lokeshgidra@google.com,
	sparclinux@vger.kernel.org, linux-riscv@lists.infradead.org,
	elfring@users.sourceforge.net, Jonas Bonn <jonas@southpole.se>,
	kvmarm@lists.cs.columbia.edu, dancol@google.com,
	Yoshinori Sato <ysato@users.sourceforge.jp>,
	linux-xtensa@linux-xtensa.org, linux-hexagon@vger.kernel.org,
	Helge Deller <deller@gmx.de>,
	"maintainer:X86 ARCHITECTURE \(32-BIT AND 64-BIT\)"
	<x86@kernel.org>,
	hughd@google.com, "James E.J. Bottomley" <jejb@parisc-linux.org>,
	kasan-dev@googlegroups.com, anton.ivanov@kot-begemot.co.uk,
	Ingo Molnar <mingo@redhat.com>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>,
	linux-snps-arc@lists.infradead.org, kernel-team@android.com,
	Sam Creasey <sammy@sammy.net>, Fenghua Yu <fenghua.yu@intel.com>,
	linux-s390@vger.kernel.org, Jeff Dike <jdike@addtoit.com>,
	linux-um@lists.infradead.org,
	Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>,
	Julia Lawall <Julia.Lawall@lip6.fr>,
	linux-m68k@lists.linux-m68k.org, Borislav Petkov <bp@alien8.de>,
	Andy Lutomirski <luto@kernel.org>,
	nios2-dev@lists.rocketboards.org,
	Stafford Horne <shorne@gmail.com>, Guan Xuetao <gxt@pku.edu.cn>,
	Chris Zankel <chris@zankel.net>, Tony Luck <tony.luck@intel.com>,
	Richard Weinberger <richard@nod.at>,
	linux-parisc@vger.kernel.org, pantin@google.com,
	Max Filippov <jcmvbkbc@gmail.com>,
	linux-kernel@vger.kernel.org, minchan@kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-alpha@vger.kernel.org, Ley Foon Tan <lftan@altera.com>,
	akpm@linux-foundation.org, linuxppc-dev@lists.ozlabs.org,
	"David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH 2/4] mm: speed up mremap by 500x on large regions (v2)
Date: Wed, 24 Oct 2018 19:09:07 -0700	[thread overview]
Message-ID: <20181025020907.GA13560@joelaf.mtv.corp.google.com> (raw)
Message-ID: <20181025020907.OS777LGVojirFXRJFNyj6eAjdE_C3R5tPDnR3_821vs@z> (raw)
In-Reply-To: <20181024125724.yf6frdimjulf35do@kshutemo-mobl1>

On Wed, Oct 24, 2018 at 03:57:24PM +0300, Kirill A. Shutemov wrote:
> On Wed, Oct 24, 2018 at 10:57:33PM +1100, Balbir Singh wrote:
> > On Wed, Oct 24, 2018 at 01:12:56PM +0300, Kirill A. Shutemov wrote:
> > > On Fri, Oct 12, 2018 at 06:31:58PM -0700, Joel Fernandes (Google) wrote:
> > > > diff --git a/mm/mremap.c b/mm/mremap.c
> > > > index 9e68a02a52b1..2fd163cff406 100644
> > > > --- a/mm/mremap.c
> > > > +++ b/mm/mremap.c
> > > > @@ -191,6 +191,54 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd,
> > > >  		drop_rmap_locks(vma);
> > > >  }
> > > >  
> > > > +static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
> > > > +		  unsigned long new_addr, unsigned long old_end,
> > > > +		  pmd_t *old_pmd, pmd_t *new_pmd, bool *need_flush)
> > > > +{
> > > > +	spinlock_t *old_ptl, *new_ptl;
> > > > +	struct mm_struct *mm = vma->vm_mm;
> > > > +
> > > > +	if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK)
> > > > +	    || old_end - old_addr < PMD_SIZE)
> > > > +		return false;
> > > > +
> > > > +	/*
> > > > +	 * The destination pmd shouldn't be established, free_pgtables()
> > > > +	 * should have release it.
> > > > +	 */
> > > > +	if (WARN_ON(!pmd_none(*new_pmd)))
> > > > +		return false;
> > > > +
> > > > +	/*
> > > > +	 * We don't have to worry about the ordering of src and dst
> > > > +	 * ptlocks because exclusive mmap_sem prevents deadlock.
> > > > +	 */
> > > > +	old_ptl = pmd_lock(vma->vm_mm, old_pmd);
> > > > +	if (old_ptl) {
> > > 
> > > How can it ever be false?

Kirill,
It cannot, you are right. I'll remove the test.

By the way, there are new changes upstream by Linus which flush the TLB
before releasing the ptlock instead of after. I'm guessing that patch came
about because of reviews of this patch and someone spotted an issue in the
existing code :)

Anyway the patch in concern is:
eb66ae030829 ("mremap: properly flush TLB before releasing the page")

I need to rebase on top of that with appropriate modifications, but I worry
that this patch will slow down performance since we have to flush at every
PMD/PTE move before releasing the ptlock. Where as with my patch, the
intention is to flush only at once in the end of move_page_tables. When I
tried to flush TLB on every PMD move, it was quite slow on my arm64 device [2].

Further observation [1] is, it seems like the move_huge_pmds and move_ptes code
is a bit sub optimal in the sense, we are acquiring and releasing the same
ptlock for a bunch of PMDs if the said PMDs are on the same page-table page
right? Instead we can do better by acquiring and release the ptlock less
often.

I think this observation [1] and the frequent TLB flush issue [2] can be solved
by acquiring the ptlock once for a bunch of PMDs, move them all, then flush
the tlb and then release the ptlock, and then proceed to doing the same thing
for the PMDs in the next page-table page. What do you think?

- Joel


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

  parent reply	other threads:[~2018-10-25  2:09 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-13  1:31 [PATCH 0/4] Add support for fast mremap Joel Fernandes (Google)
2018-10-13  1:31 ` Joel Fernandes (Google)
2018-10-13  1:31 ` [PATCH 1/4] treewide: remove unused address argument from pte_alloc functions (v2) Joel Fernandes (Google)
2018-10-13  1:31   ` Joel Fernandes (Google)
2018-10-24  8:37   ` Peter Zijlstra
2018-10-24  8:37     ` Peter Zijlstra
2018-10-25  2:21     ` Joel Fernandes
2018-10-25  2:21       ` Joel Fernandes
2018-10-26  8:52       ` Peter Zijlstra
2018-10-26  8:52         ` Peter Zijlstra
2018-10-25 10:47     ` Kirill A. Shutemov
2018-10-25 10:47       ` Kirill A. Shutemov
2018-10-26  8:50       ` Peter Zijlstra
2018-10-26  8:50         ` Peter Zijlstra
2018-10-13  1:31 ` [PATCH 2/4] mm: speed up mremap by 500x on large regions (v2) Joel Fernandes (Google)
2018-10-13  1:31   ` Joel Fernandes (Google)
2018-10-15  9:42   ` Christoph Hellwig
2018-10-15  9:42     ` Christoph Hellwig
2018-10-15 22:33     ` Joel Fernandes
2018-10-15 22:33       ` Joel Fernandes
2018-10-16 11:29       ` Vlastimil Babka
2018-10-16 11:29         ` Vlastimil Babka
2018-10-16 19:43         ` Joel Fernandes
2018-10-16 19:43           ` Joel Fernandes
2018-10-17  7:38           ` Vlastimil Babka
2018-10-17  7:38             ` Vlastimil Babka
2018-10-24 10:12   ` Kirill A. Shutemov
2018-10-24 10:12     ` Kirill A. Shutemov
2018-10-24 11:57     ` Balbir Singh
2018-10-24 11:57       ` Balbir Singh
2018-10-24 12:57       ` Kirill A. Shutemov
2018-10-24 12:57         ` Kirill A. Shutemov
2018-10-25  2:09         ` Joel Fernandes [this message]
2018-10-25  2:09           ` Joel Fernandes
2018-10-25 10:19           ` Kirill A. Shutemov
2018-10-25 10:19             ` Kirill A. Shutemov
2018-10-26 21:11             ` Joel Fernandes
2018-10-26 21:11               ` Joel Fernandes
2018-10-29 10:28               ` Will Deacon
2018-10-29 10:28                 ` Will Deacon
2018-10-25  2:13       ` Joel Fernandes
2018-10-25  2:13         ` Joel Fernandes
2018-10-27 10:21         ` Balbir Singh
2018-10-27 10:21           ` Balbir Singh
2018-10-27 19:39           ` Joel Fernandes
2018-10-27 19:39             ` Joel Fernandes
2018-10-28 22:40             ` Balbir Singh
2018-10-28 22:40               ` Balbir Singh
2018-10-13  1:31 ` [PATCH 3/4] arm64: select HAVE_MOVE_PMD for faster mremap (v1) Joel Fernandes (Google)
2018-10-13  1:31   ` Joel Fernandes (Google)
2018-10-13  1:32 ` [PATCH 4/4] x86: " Joel Fernandes (Google)
2018-10-13  1:32   ` Joel Fernandes (Google)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181025020907.GA13560@joelaf.mtv.corp.google.com \
    --to=joel@joelfernandes.org \
    --cc=Julia.Lawall@lip6.fr \
    --cc=akpm@linux-foundation.org \
    --cc=anton.ivanov@kot-begemot.co.uk \
    --cc=aryabinin@virtuozzo.com \
    --cc=bp@alien8.de \
    --cc=bsingharora@gmail.com \
    --cc=catalin.marinas@arm.com \
    --cc=chris@zankel.net \
    --cc=dalias@libc.org \
    --cc=dancol@google.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=davem@davemloft.net \
    --cc=deller@gmx.de \
    --cc=elfring@users.sourceforge.net \
    --cc=fenghua.yu@intel.com \
    --cc=geert@linux-m68k.org \
    --cc=gxt@pku.edu.cn \
    --cc=hughd@google.com \
    --cc=jcmvbkbc@gmail.com \
    --cc=jdike@addtoit.com \
    --cc=jejb@parisc-linux.org \
    --cc=jonas@southpole.se \
    --cc=kasan-dev@googlegroups.com \
    --cc=kernel-team@android.com \
    --cc=kirill@shutemov.name \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=lftan@altera.com \
    --cc=linux-alpha@vger.kernel.org \
    --cc=linux-hexagon@vger.kernel.org \
    --cc=linux-ia64@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-m68k@lists.linux-m68k.org \
    --cc=linux-mips@linux-mips.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-parisc@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux-sh@vger.kernel.org \
    --cc=linux-snps-arc@lists.infradead.org \
    --cc=linux-um@lists.infradead.org \
    --cc=linux-xtensa@linux-xtensa.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=lokeshgidra@google.com \
    --cc=luto@kernel.org \
    --cc=mhocko@kernel.org \
    --cc=minchan@kernel.org \
    --cc=mingo@redhat.com \
    --cc=nios2-dev@lists.rocketboards.org \
    --cc=pantin@google.com \
    --cc=peterz@infradead.org \
    --cc=richard@nod.at \
    --cc=sammy@sammy.net \
    --cc=shorne@gmail.com \
    --cc=sparclinux@vger.kernel.org \
    --cc=stefan.kristiansson@saunalahti.fi \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=will.deacon@arm.com \
    --cc=x86@kernel.org \
    --cc=ysato@users.sourceforge.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).