From: Peter Collingbourne <pcc@google.com>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>,
	Will Deacon <will@kernel.org>,
	 Andrey Konovalov <andreyknvl@gmail.com>,
	Evgenii Stepanov <eugenis@google.com>,
	Szabolcs Nagy <szabolcs.nagy@arm.com>,
	Tejas Belagod <Tejas.Belagod@arm.com>,
	 Linux ARM <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH v2] arm64: mte: switch GCR_EL1 on task switch rather than entry/exit
Date: Thu, 8 Jul 2021 18:50:09 -0700
Message-ID: <CAMn1gO5eG=_C7xvt=To5WjircK7J-sOrak=437LtDE4uVPe=tQ@mail.gmail.com>
In-Reply-To: <20210705125217.GA4799@arm.com>

On Mon, Jul 5, 2021 at 5:52 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> On Fri, Jul 02, 2021 at 12:45:18PM -0700, Peter Collingbourne wrote:
> > Accessing GCR_EL1 and issuing an ISB can be expensive on some
> > microarchitectures. To avoid taking this performance hit on every
> > kernel entry/exit, switch GCR_EL1 on task switch rather than
> > entry/exit. This is essentially a revert of commit bad1e1c663e0
> > ("arm64: mte: switch GCR_EL1 in kernel entry and exit").
>
> As per the discussion in v1, we can avoid an ISB, though we are still
> left with the GCR_EL1 access. I'm surprised that access to a non
> self-synchronising register is that expensive but I suspect the
> benchmark is just timing a dummy syscall. I'm not asking for numbers but
> I'd like to make sure we don't optimise for unrealistic use-cases. Is
> something like a geekbench score affected for example?

FWIW, I was using this test program:
https://patchwork.kernel.org/project/linux-arm-kernel/patch/20200801011152.39838-1-pcc@google.com/#23572981

Since it issues an invalid syscall, it's a good way to measure the
effect of changes to entry/exit in isolation, but it does mean that we
need to be careful when also making changes elsewhere in the kernel, as
will become apparent in a moment.
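
For readers who don't want to follow the link, the general shape of
such a measurement can be sketched as below. This is only a minimal
userspace sketch, not the linked test program: the helper names
(now_ns, invalid_syscall_cost_ns) and the choice of CLOCK_MONOTONIC are
assumptions made for illustration. It relies on syscall number -1 being
invalid, so the kernel should do little beyond entry, a range check,
and exit.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock reading in nanoseconds (hypothetical helper). */
uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Issue `iters` invalid syscalls back to back and return the mean
 * round-trip cost in nanoseconds. Each call is expected to fail, so
 * the time is dominated by kernel entry/exit rather than by any
 * syscall body.
 */
double invalid_syscall_cost_ns(unsigned long iters)
{
	uint64_t start, end;
	unsigned long i;

	assert(iters > 0);
	start = now_ns();
	for (i = 0; i < iters; i++)
		syscall(-1);	/* no such syscall: returns -1 */
	end = now_ns();
	return (double)(end - start) / (double)iters;
}
```

Calling invalid_syscall_cost_ns() with a large iteration count (say,
a few million) before and after a kernel change gives a per-syscall
figure to compare. As noted above, it says nothing about costs the
change adds outside the entry/exit path.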

> While we can get rid of the IRG in the kernel, at some point we may want
> to use ADDG as generated by the compiler. That too is affected by the
> GCR_EL1.Exclude mask.
>
> > This requires changing how we generate random tags for HW tag-based
> > KASAN, since at this point IRG would use the user's exclusion mask,
> > which may not be suitable for kernel use. In this patch I chose to take
> > the modulus of CNTVCT_EL0, however alternative approaches are possible.
>
> So a few successive mte_get_mem_tag() will give the same result if the
> counter hasn't changed. Even if ARMv8.6 requires a 1GHz timer frequency,
> I think an implementation is allowed to count in bigger increments.

Yes, I observed that the Apple M1, for example, counts in increments
of 16. Taking the modulus of the timer would happen to work as long as
the increment is both small enough (so that the timer would likely
have advanced by the time we need to make another allocation) and a
power of 2 (so that we cycle through all of the possible tag values),
which I would expect to be the case on most microarchitectures.
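
The coverage argument can be checked with a small userspace sketch. It
assumes tags are derived as counter % nr_tags; the function names and
the tag counts used below are hypothetical and are not taken from the
patch. For a counter that advances by `step` each tick, the number of
distinct residues visited is nr_tags / gcd(step, nr_tags), so a
power-of-2 step reaches every tag whenever nr_tags is odd.

```c
#include <assert.h>
#include <stdint.h>

/* Greatest common divisor by Euclid's algorithm. */
unsigned int gcd_u32(unsigned int a, unsigned int b)
{
	while (b != 0) {
		unsigned int t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/*
 * Count the distinct values of (counter % nr_tags) observed while the
 * counter advances by `step` per tick, starting from `start`.
 * nr_tags must be in the range 1..64 so the seen-set fits in a u64.
 */
unsigned int distinct_tags(uint64_t start, unsigned int step,
			   unsigned int nr_tags)
{
	uint64_t seen = 0, counter = start;
	unsigned int i, count = 0;

	assert(nr_tags >= 1 && nr_tags <= 64);
	/* nr_tags ticks are enough to walk the whole residue cycle. */
	for (i = 0; i < nr_tags; i++) {
		uint64_t bit = 1ull << (counter % nr_tags);

		if (!(seen & bit)) {
			seen |= bit;
			count++;
		}
		counter += step;
	}
	return count;
}
```

With a step of 16, distinct_tags(0, 16, 15) visits all 15 residues,
whereas distinct_tags(0, 16, 16) only ever produces one, which is why
the modulus must not share a factor with the counter increment.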

However, I developed an in-kernel allocator microbenchmark which
revealed a more important issue with this patch: on most cores,
switching from IRG to reading the timer costs more than the
performance gained by moving from the single ISB patch to the GCR on
task switch patch. This means that if KASAN is enabled, a single
allocation would wipe out the performance improvement from avoiding
touching GCR on entry/exit. I also tried a number of alternative
approaches, and they were also too expensive. So now I am less
inclined to push for an approach that avoids touching GCR on
entry/exit.

> BTW, can you also modify mte_set_kernel_gcr to only do a write to the
> GCR_EL1 register rather than a read-modify-write?

Yes, this helps a bit. In v3 I now do this as well as the single ISB
change.

Peter
