linux-arm-kernel.lists.infradead.org archive mirror
From: Lokesh Gidra <lokeshgidra@google.com>
To: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Joel Fernandes <joelaf@google.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Gavin Shan <gshan@redhat.com>, Brian Geffon <bgeffon@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Kalesh Singh <kaleshsingh@google.com>,
	Ram Pai <linuxram@us.ibm.com>,
	Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>,
	"open list:MEMORY MANAGEMENT" <linux-mm@kvack.org>,
	William Kucharski <william.kucharski@oracle.com>,
	Sandipan Das <sandipan@linux.ibm.com>,
	"open list:KERNEL SELFTEST FRAMEWORK"
	<linux-kselftest@vger.kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Christian Brauner <christian.brauner@ubuntu.com>,
	Shuah Khan <shuah@kernel.org>,
	Mina Almasry <almasrymina@google.com>, Jia He <justin.he@arm.com>,
	Arnd Bergmann <arnd@arndb.de>, Will Deacon <will@kernel.org>,
	Masahiro Yamada <masahiroy@kernel.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	Krzysztof Kozlowski <krzk@kernel.org>,
	Ingo Molnar <mingo@redhat.com>,
	Sami Tolvanen <samitolvanen@google.com>,
	"Cc: Android Kernel" <kernel-team@android.com>,
	Hassan Naveed <hnaveed@wavecomp.com>,
	Ralph Campbell <rcampbell@nvidia.com>,
	Kees Cook <keescook@chromium.org>,
	Minchan Kim <minchan@google.com>,
	Zhenyu Ye <yezhenyu2@huawei.com>,
	John Hubbard <jhubbard@nvidia.com>,
	Frederic Weisbecker <frederic@kernel.org>,
	Borislav Petkov <bp@alien8.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Suren Baghdasaryan <surenb@google.com>,
	"moderated list:ARM64 PORT \(AARCH64 ARCHITECTURE\)"
	<linux-arm-kernel@lists.infradead.org>,
	SeongJae Park <sjpark@amazon.de>,
	Dave Hansen <dave.hansen@intel.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mike Rapoport <rppt@kernel.org>
Subject: Re: [PATCH 0/5] Speed up mremap on large regions
Date: Thu, 1 Oct 2020 23:39:53 -0700
Message-ID: <CA+EESO5P1P4_Mb_7q0E9Y9uv6f9wK4kTALqCOKsc36k+E4p-5Q@mail.gmail.com>
In-Reply-To: <20201002053547.7roe7b4mpamw4uk2@black.fi.intel.com>

On Thu, Oct 1, 2020 at 10:36 PM Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
>
> On Thu, Oct 01, 2020 at 05:09:02PM -0700, Lokesh Gidra wrote:
> > On Thu, Oct 1, 2020 at 9:00 AM Kalesh Singh <kaleshsingh@google.com> wrote:
> > >
> > > On Thu, Oct 1, 2020 at 8:27 AM Kirill A. Shutemov
> > > <kirill.shutemov@linux.intel.com> wrote:
> > > >
> > > > On Wed, Sep 30, 2020 at 03:42:17PM -0700, Lokesh Gidra wrote:
> > > > > On Wed, Sep 30, 2020 at 3:32 PM Kirill A. Shutemov
> > > > > <kirill.shutemov@linux.intel.com> wrote:
> > > > > >
> > > > > > On Wed, Sep 30, 2020 at 10:21:17PM +0000, Kalesh Singh wrote:
> > > > > > > mremap time can be optimized by moving entries at the PMD/PUD level if
> > > > > > > the source and destination addresses are PMD/PUD-aligned and
> > > > > > > PMD/PUD-sized. Enable moving at the PMD and PUD levels on arm64 and
> > > > > > > x86. Other architectures where this type of move is supported and known to
> > > > > > > be safe can also opt-in to these optimizations by enabling HAVE_MOVE_PMD
> > > > > > > and HAVE_MOVE_PUD.
> > > > > > >
> > > > > > > Observed Performance Improvements for remapping a PUD-aligned 1GB-sized
> > > > > > > region on x86 and arm64:
> > > > > > >
> > > > > > >     - HAVE_MOVE_PMD is already enabled on x86 : N/A
> > > > > > >     - Enabling HAVE_MOVE_PUD on x86   : ~13x speed up
> > > > > > >
> > > > > > >     - Enabling HAVE_MOVE_PMD on arm64 : ~ 8x speed up
> > > > > > >     - Enabling HAVE_MOVE_PUD on arm64 : ~19x speed up
> > > > > > >
> > > > > > >           Altogether, HAVE_MOVE_PMD and HAVE_MOVE_PUD
> > > > > > >           give a total of ~150x speed up on arm64.
> > > > > >
> > > > > > Is there a *real* workload that benefits from HAVE_MOVE_PUD?
> > > > > >
> > > > > We have a Java garbage collector under development which requires
> > > > > moving physical pages of multi-gigabyte heap using mremap. During this
> > > > > move, the application threads have to be paused for correctness. It is
> > > > > critical to keep this pause as short as possible to avoid jitters
> > > > > during user interaction. This is where HAVE_MOVE_PUD will greatly
> > > > > help.
> > > >
> > > > Any chance to quantify the effect of mremap() with and without
> > > > HAVE_MOVE_PUD?
> > > >
> > > > I doubt it's a major contributor to the GC pause. I expect you need to
> > > > move tens of gigs to get sizable effect. And if your GC routinely moves
> > > > tens of gigs, maybe problem somewhere else?
> > > >
> > > > I'm asking for numbers, because an increase in complexity comes with a
> > > > cost. If it doesn't provide a substantial benefit to a real workload,
> > > > maintaining the code forever doesn't make sense.
> > >
> > mremap is indeed the biggest contributor to the GC pause. It has to
> > take place in what is typically known as a 'stop-the-world' pause,
> > wherein all application threads are paused. During this pause the GC
> > thread flips the GC roots (threads' stacks, globals etc.), and then
> > resumes threads along with concurrent compaction of the heap. This
> > GC-root flip differs depending on which compaction algorithm is being
> > used.
> >
> > In our case it involves updating object references in threads' stacks
> > and remapping java heap to a different location. The threads' stacks
> > can be handled in parallel with the mremap. Therefore, the dominant
> > factor is indeed the cost of mremap. From patches 2 and 4, it is clear
> > that remapping 1GB without this optimization will take ~9ms on arm64.
> >
> > Although this mremap has to happen only once every GC cycle, and the
> > typical size is also not going to be more than a GB or 2, pausing
> > application threads for ~9ms is guaranteed to cause jitters. OTOH,
> > with this optimization, mremap is reduced to ~60us, which is a totally
> > acceptable pause time.
> >
> > Unfortunately, implementation of the new GC algorithm hasn't yet
> > reached the point where I can quantify the effect of this
> > optimization. But I can confirm that without this optimization the new
> > GC will not be approved.
>
> IIUC, the 9ms -> 90us improvement is attributed to the combination of
> HAVE_MOVE_PMD and HAVE_MOVE_PUD, right? I expect HAVE_MOVE_PMD to be
> reasonable for some workloads, but the marginal benefit of HAVE_MOVE_PUD
> is in doubt. Do you see that it's useful for your workload?
>
Yes, 9ms -> 90us is with both combined. Past experience has shown that
even a ~1ms stop-the-world pause is prone to cause jitter.
HAVE_MOVE_PMD alone only gets us about that far, so HAVE_MOVE_PUD is
required to bring the mremap cost down to an acceptable level.

Ideally, I was hoping that the functionality of HAVE_MOVE_PMD could be
extended to all levels of the hierarchical page table, simplifying the
implementation in the process. Unfortunately, judging from patch 3,
that doesn't seem to be possible.
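
For illustration only, here is a minimal userspace sketch of the kind of
aligned move discussed in this thread. It is not taken from the patch
series or from the GC. It assumes 4KB base pages, so that one PMD entry
spans 2MB. The helper names (reserve_aligned, move_region, remap_demo)
are hypothetical. Whether the kernel actually takes a PMD- or PUD-level
fast path depends on the architecture and kernel configuration; the
sketch only sets up the alignment and size preconditions and checks that
the data arrives at the destination.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Assumption: 4KB base pages, so one PMD entry covers 2MB. */
#define PMD_SPAN (2UL << 20)

/* Reserve `len` bytes of address space starting on an `align` boundary. */
static void *reserve_aligned(size_t len, size_t align, int prot)
{
	assert(align && (align & (align - 1)) == 0); /* power of two */

	/* Over-allocate so that an aligned window is guaranteed to fit. */
	size_t total = len + align;
	uint8_t *raw = mmap(NULL, total, prot,
			    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED)
		return NULL;

	uintptr_t start = ((uintptr_t)raw + align - 1) & ~(uintptr_t)(align - 1);
	uint8_t *aligned = (uint8_t *)start;

	/* Trim the unused head and tail of the over-allocation. */
	if (aligned > raw)
		munmap(raw, (size_t)(aligned - raw));
	size_t tail = (size_t)((raw + total) - (aligned + len));
	if (tail)
		munmap(aligned + len, tail);
	return aligned;
}

/* Move a mapping to a fixed address; returns dst or NULL on failure. */
static void *move_region(void *src, void *dst, size_t len)
{
	void *p = mremap(src, len, len, MREMAP_MAYMOVE | MREMAP_FIXED, dst);
	return p == MAP_FAILED ? NULL : p;
}

/* Returns 0 if a PMD-aligned, PMD-sized region was moved intact. */
static int remap_demo(void)
{
	size_t len = 2 * PMD_SPAN;
	uint8_t *src = reserve_aligned(len, PMD_SPAN, PROT_READ | PROT_WRITE);
	uint8_t *dst = reserve_aligned(len, PMD_SPAN, PROT_NONE);
	if (!src || !dst)
		return -1;

	memset(src, 0xAB, len);

	/* MREMAP_FIXED replaces the PROT_NONE placeholder at dst. */
	uint8_t *moved = move_region(src, dst, len);
	if (moved != dst)
		return -1;

	int ok = moved[0] == 0xAB && moved[len - 1] == 0xAB;
	munmap(moved, len);
	return ok ? 0 : -1;
}
```

Calling remap_demo() from a main() and wrapping the move_region() call
in clock_gettime() would give a rough userspace view of the mremap cost
for a given size and alignment.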

> --
>  Kirill A. Shutemov

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
