From: Mark Rutland <mark.rutland@arm.com>
To: andrey.konovalov@linux.dev
Cc: Marco Elver <elver@google.com>,
	Alexander Potapenko <glider@google.com>,
	Andrey Konovalov <andreyknvl@gmail.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	Andrey Ryabinin <ryabinin.a.a@gmail.com>,
	kasan-dev@googlegroups.com,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Vincenzo Frascino <vincenzo.frascino@arm.com>,
	Sami Tolvanen <samitolvanen@google.com>,
	linux-arm-kernel@lists.infradead.org,
	Peter Collingbourne <pcc@google.com>,
	Evgenii Stepanov <eugenis@google.com>,
	Florian Mayer <fmayer@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrey Konovalov <andreyknvl@google.com>
Subject: Re: [PATCH v3 0/3] kasan, arm64, scs: collect stack traces from Shadow Call Stack
Date: Thu, 14 Apr 2022 14:40:21 +0100
Message-ID: <YlgkRXkCLeQ5IcaD@lakrids>
In-Reply-To: <YlgVa+AP0g4IYvzN@lakrids>

On Thu, Apr 14, 2022 at 01:36:59PM +0100, Mark Rutland wrote:
> As I suspected, you're hitting a known performance oddity with QEMU TCG
> mode where pointer authentication is *incredibly* slow when using the
> architected QARMA5 algorithm (enabled by default with `-cpu max`).

> This overhead has nothing to do with the *nature* of the unwinder, and
> is an artifact of the *platform* and the *structure* of the code.
> There's plenty that can be done to avoid that overhead

FWIW, from a quick look, disabling KASAN instrumentation for the
stacktrace object alone (with no other changes) has a significant impact
(compounded by the TCG QARMA5 slowdown), and I note that x86 doesn't
bother instrumenting its stacktrace code anyway, so we could consider
doing likewise.

Atop that, replacing set_bit() with __set_bit() brings the regular
unwinder *really* close to the earlier SCS unwinder figures. I know that
the on_accessible_stack() calculations and checks could be amortized
with some refactoring (which I'd planned to do anyway), so I think it's
plausible that with some changes to the existing unwinder we can bring
the difference into the noise.
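
For readers unfamiliar with the distinction: in the kernel, set_bit() is
an atomic read-modify-write, while __set_bit() is the non-atomic form,
which is only valid when nothing else can modify the bitmap concurrently.
The following is a minimal userspace analogue using C11 atomics, not the
kernel implementation; the names atomic_set_bit(), plain_set_bit() and
BITS_PER_WORD are made up for this sketch.

```c
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical helper macro for this sketch: bits in one bitmap word. */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Analogue of set_bit(): an atomic OR into the containing word, safe
 * against concurrent updates to the same word.
 */
static void atomic_set_bit(unsigned int nr, _Atomic unsigned long *map)
{
	atomic_fetch_or(&map[nr / BITS_PER_WORD],
			1UL << (nr % BITS_PER_WORD));
}

/*
 * Analogue of __set_bit(): a plain load, OR and store. Only correct when
 * the caller guarantees nothing else touches the bitmap concurrently.
 */
static void plain_set_bit(unsigned int nr, unsigned long *map)
{
	map[nr / BITS_PER_WORD] |= 1UL << (nr % BITS_PER_WORD);
}
```

Both helpers leave the same bit pattern behind; the difference is only
whether the update is performed atomically.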

> Generic KASAN w/ `-cpu max`
> ---------------------------
> 
> master-no-stack-traces: 12.66
> master:                 18.39 (+45.2%)
> master-no-stack-depot:  17.85 (+40.1%)
> up-scs-stacks-v3:       13.54 (+7.0%)

master-noasan:            15.67 (+23.8%)
master-noasan-__set_bit:  14.61 (+15.5%)

> Generic KASAN w/ `-cpu max,pauth-impdef=true`
> ---------------------------------------------
> 
> master-no-stack-traces: 2.69
> master:                 3.35 (+24.5%)
> master-no-stack-depot:  3.54 (+31.5%)
> up-scs-stacks-v3:       2.80 (+4.1%)

master-noasan:            3.05 (+13.0%)
master-noasan-__set_bit:  2.96 (+10.0%)

> Generic KASAN w/ `-cpu max,pauth=false`
> ---------------------------------------
> 
> master-no-stack-traces: 1.92
> master:                 2.27  (+18.2%)
> master-no-stack-depot:  2.22  (+15.6%)
> up-scs-stacks-v3:       2.06  (+7.3%)

master-noasan:             2.14 (+11.4%)
master-noasan-__set_bit:   2.10 (+9.4%)
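
The percentages in the tables above appear to be overheads relative to
the master-no-stack-traces figure for the same configuration. A minimal
sketch of that arithmetic follows, assuming this interpretation;
overhead_pct() and print_row() are hypothetical helpers, not part of any
benchmark tooling. Recomputing from the rounded figures can differ from
the reported value in the last digit (for example 14.61 against 12.66
gives about +15.4%), presumably because the reported percentages were
derived from unrounded measurements.

```c
#include <stdio.h>

/* Relative overhead of `measured` over `baseline`, in percent. */
static double overhead_pct(double baseline, double measured)
{
	return (measured - baseline) / baseline * 100.0;
}

/* Print one table row in roughly the layout used in this thread. */
static void print_row(const char *label, double baseline, double measured)
{
	printf("%-26s %6.2f (%+.1f%%)\n", label, measured,
	       overhead_pct(baseline, measured));
}
```

For example, print_row("master-noasan:", 12.66, 15.67) formats the first
master-noasan row from the `-cpu max` table.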

Thanks,
Mark.
