linux-mips.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nicholas Piggin <npiggin@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrey Konovalov <andreyknvl@google.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Rich Felker <dalias@libc.org>,
	"David S. Miller" <davem@davemloft.net>,
	Christoph Hellwig <hch@lst.de>, James Hogan <jhogan@kernel.org>,
	Khalid Aziz <khalid.aziz@oracle.com>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	linux-mips@vger.kernel.org, Linux-MM <linux-mm@kvack.org>,
	linuxppc-dev@lists.ozlabs.org,
	Linux-sh list <linux-sh@vger.kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Paul Burton <paul.burton@mips.com>,
	Paul Mackerras <paulus@samba.org>,
	sparclinux@vger.kernel.org,
	the arch/x86 maintainers <x86@kernel.org>,
	Yoshinori Sato <ysato@users.sourceforge.jp>
Subject: Re: [PATCH 16/16] mm: pass get_user_pages_fast iterator arguments in a structure
Date: Fri, 21 Jun 2019 18:29:27 +1000	[thread overview]
Message-ID: <1561104674.cxm7sn77rx.astroid@bobo.none> (raw)
In-Reply-To: <CAHk-=wh46y3x5O0HkR=R4ETh6e5pDCrEsJ94CtC0fyQiYYAf6A@mail.gmail.com>

Linus Torvalds's on June 21, 2019 3:21 am:
> On Thu, Jun 20, 2019 at 5:19 AM Nicholas Piggin <npiggin@gmail.com> wrote:
>>
>> The processor aliasing problem happens because the struct will
>> be initialised with stores using one base register (e.g., stack
>> register), and then same memory is loaded using a different
>> register (e.g., parameter register).
> 
> Hmm. Honestly, I've never seen anything like that in any kernel profiles.
> 
> Compared to the problems I _do_ see (which is usually the obvious
> cache misses, and locking), it must either be in the noise or it's
> some problem specific to whatever CPU you are doing performance work
> on?

No you're right, the performance hit from these flushes is not a
big hit that stands out in cycle counts. I just look at kernel code
for various flushes. Branches not surprisingly are usually the main
culprit, but they're normally not so interesting.

Static alias prediction seems to work well outside this case. It's
interesting, you need both a store ; load sequence that does not
predict well (e.g., using a different base register), and you also
need that load to be executed ahead of the store.

The small stack structure for arguments is the perfect case. Bad
pattern, and load executed right after store. Even then you also need
a reason to delay the store (e.g., source not ready or store queue
full), but those hazards do show up.

Now, even when all that goes wrong, there are dynamic heuristics that
can take over. So if you run a repetitive microbenchmark you won't
see it.

Some CPUs seem to be quite aggressive about giving up and turning off
the alias prediction globally if you take misses (Intel x86 used to do
that IIRC, not sure if they still do). So in that case you wouldn't
even see it show up in one place, everything will just run slightly
slower.

What I worry about is high rate direct IO workloads that see single
flushes in these paths as significant. Or if this thing creeps in to
the kernel too much and just slightly raises global misses enough,
then it will cause disambiguation to be significantly shut down.

Thanks,
Nick


  parent reply	other threads:[~2019-06-21  8:29 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-11 14:40 switch the remaining architectures to use generic GUP v3 Christoph Hellwig
2019-06-11 14:40 ` [PATCH 01/16] mm: use untagged_addr() for get_user_pages_fast addresses Christoph Hellwig
2019-06-11 19:22   ` Khalid Aziz
2019-06-21 13:16   ` Jason Gunthorpe
2019-06-21 13:39   ` Jason Gunthorpe
2019-06-21 15:35     ` Khalid Aziz
2019-06-21 15:54       ` Jason Gunthorpe
2019-06-25  7:41     ` Christoph Hellwig
2019-06-25  7:43     ` Christoph Hellwig
2019-06-11 14:40 ` [PATCH 02/16] mm: simplify gup_fast_permitted Christoph Hellwig
2019-06-21 13:40   ` Jason Gunthorpe
2019-06-11 14:40 ` [PATCH 03/16] mm: lift the x86_32 PAE version of gup_get_pte to common code Christoph Hellwig
2019-06-21 13:45   ` Jason Gunthorpe
2019-06-11 14:40 ` [PATCH 04/16] MIPS: use the generic get_user_pages_fast code Christoph Hellwig
2019-06-21 14:05   ` Jason Gunthorpe
2019-06-25  7:46     ` Christoph Hellwig
2019-06-11 14:40 ` [PATCH 05/16] sh: add the missing pud_page definition Christoph Hellwig
2019-06-11 14:40 ` [PATCH 06/16] sh: use the generic get_user_pages_fast code Christoph Hellwig
2019-06-11 14:40 ` [PATCH 07/16] sparc64: add the missing pgd_page definition Christoph Hellwig
2019-06-11 14:40 ` [PATCH 08/16] sparc64: define untagged_addr() Christoph Hellwig
2019-06-11 19:23   ` Khalid Aziz
2019-06-11 14:40 ` [PATCH 09/16] sparc64: use the generic get_user_pages_fast code Christoph Hellwig
2019-06-11 19:35   ` Khalid Aziz
2019-06-11 14:40 ` [PATCH 10/16] mm: rename CONFIG_HAVE_GENERIC_GUP to CONFIG_HAVE_FAST_GUP Christoph Hellwig
2019-06-11 19:35   ` Khalid Aziz
2019-06-21 14:28   ` Jason Gunthorpe
2019-06-25  7:50     ` Christoph Hellwig
2019-06-11 14:40 ` [PATCH 11/16] mm: consolidate the get_user_pages* implementations Christoph Hellwig
2019-06-21 14:41   ` Jason Gunthorpe
2019-06-25  7:56     ` Christoph Hellwig
2019-06-25 11:56       ` Jason Gunthorpe
2019-06-11 14:40 ` [PATCH 12/16] mm: validate get_user_pages_fast flags Christoph Hellwig
2019-06-11 14:40 ` [PATCH 13/16] mm: move the powerpc hugepd code to mm/gup.c Christoph Hellwig
2019-06-11 14:41 ` [PATCH 14/16] mm: switch gup_hugepte to use try_get_compound_head Christoph Hellwig
2019-06-11 14:41 ` [PATCH 15/16] mm: mark the page referenced in gup_hugepte Christoph Hellwig
2019-06-11 14:41 ` [PATCH 16/16] mm: pass get_user_pages_fast iterator arguments in a structure Christoph Hellwig
2019-06-12  0:52   ` Nicholas Piggin
2019-06-12  1:09     ` Linus Torvalds
2019-06-20 12:18       ` Nicholas Piggin
2019-06-20 17:21         ` Linus Torvalds
2019-06-21  8:15           ` Christoph Hellwig
2019-06-21 23:55             ` Nicholas Piggin
2019-06-21  8:29           ` Nicholas Piggin [this message]
2019-06-12  1:27     ` Nadav Amit
2019-06-20 11:45 ` switch the remaining architectures to use generic GUP v3 Christoph Hellwig
2019-06-21 14:43 ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1561104674.cxm7sn77rx.astroid@bobo.none \
    --to=npiggin@gmail.com \
    --cc=andreyknvl@google.com \
    --cc=benh@kernel.crashing.org \
    --cc=dalias@libc.org \
    --cc=davem@davemloft.net \
    --cc=hch@lst.de \
    --cc=jhogan@kernel.org \
    --cc=khalid.aziz@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-sh@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mpe@ellerman.id.au \
    --cc=paul.burton@mips.com \
    --cc=paulus@samba.org \
    --cc=sparclinux@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    --cc=ysato@users.sourceforge.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).