linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Murray <andrew.murray@arm.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	Will Deacon <will.deacon@arm.com>,
	Ard.Biesheuvel@arm.com, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v3 3/5] arm64: atomics: avoid out-of-line ll/sc atomics
Date: Wed, 28 Aug 2019 12:53:45 +0100	[thread overview]
Message-ID: <20190828115345.GU14582@e119886-lin.cambridge.arm.com> (raw)
In-Reply-To: <20190822170155.GE33080@lakrids.cambridge.arm.com>

On Thu, Aug 22, 2019 at 06:01:55PM +0100, Mark Rutland wrote:
> On Mon, Aug 12, 2019 at 03:36:23PM +0100, Andrew Murray wrote:
> > When building for LSE atomics (CONFIG_ARM64_LSE_ATOMICS), if the hardware
> > or toolchain doesn't support it the existing code will fallback to ll/sc
> > atomics. It achieves this by branching from inline assembly to a function
> > that is built with specical compile flags. Further this results in the
> > clobbering of registers even when the fallback isn't used increasing
> > register pressure.
> > 
> > Let's improve this by providing inline implementations of both LSE and
> > ll/sc and use a static key to select between them. This allows for the
> > compiler to generate better atomics code.
> > 
> > To improve icache performance for the LL/SC fallback atomics, we put them
> > in their own subsection.
> > 
> > Please note that as atomic_arch.h is included indirectly by kernel.h
> > (via bitops.h), we cannot depend on features provided later in the kernel.h
> > file. This prevents us from placing the system_uses_lse_atomics function
> > in cpu_feature.h due to its dependencies.
> > 
> > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> 
> [...]
> 
> > diff --git a/arch/arm64/include/asm/atomic_arch.h b/arch/arm64/include/asm/atomic_arch.h
> > new file mode 100644
> > index 000000000000..255a284321c6
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/atomic_arch.h
> > @@ -0,0 +1,154 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Selection between LSE and LL/SC atomics.
> > + *
> > + * Copyright (C) 2018 ARM Ltd.
> > + * Author: Andrew Murray <andrew.murray@arm.com>
> > + */
> > +
> > +#ifndef __ASM_ATOMIC_ARCH_H
> > +#define __ASM_ATOMIC_ARCH_H
> > +
> > +#include <asm/atomic_lse.h>
> > +#include <asm/atomic_ll_sc.h>
> > +
> > +#include <linux/jump_label.h>
> > +#include <asm/cpucaps.h>
> 
> I'm guessing that we have to include the <asm/atomic_*> headers first
> due to the include dependencies. If that's the case, could we please
> have a comment here to that effect?
> 
> Minor nit, but could we also order those two alphabetically, please?
> 
> The general style is to have headers alphabetically, with (for reasons
> unknown) the <linux/*> headers before the <asm/*> headers.

There doesn't appear to be any valid reason for ordering the includes
in the way I have done, so I'll change it to the
linux-followed-by-asm-alphabetical way.

> 
> [...]
> 
> > +#if IS_ENABLED(CONFIG_ARM64_LSE_ATOMICS) && IS_ENABLED(CONFIG_AS_LSE)
> > +#define __LL_SC_FALLBACK(asm_ops)					\
> > +"	b	3f\n"							\
> > +"	.subsection	1\n"						\
> > +"3:\n"									\
> > +asm_ops "\n"								\
> > +"	b	4f\n"							\
> > +"	.previous\n"							\
> > +"4:\n"
> > +#else
> > +#define __LL_SC_FALLBACK(asm_ops) asm_ops
> >  #endif
> 
> Can we instead make the ll/sc functions with the cold attribute (wrapped
> by <linux/compiler.h> as __cold)?
> 
> IIUC that should have a similar effect, and might allow GCC to do better
> (e.g. merging compatible instances of the ll/sc code in the same cold
> subsection).
> 
> https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/Common-Function-Attributes.html#index-cold-function-attribute

Thanks for suggesting this, I hadn't considered this previously. However
the cold attribute only seems to have an effect when the function it is
applied to isn't inline. I think we want to keep both LSE and fallback
functions as inline.

At present the fallback functions are inlined, due to the use of 'unlikely'
they end up at the end of the function they are called from. Within this
inline'd function we take unconditional branches to/from the actual LL/SC
implementation, as we are using subsections the LL/SC implementations are
grouped together at the end of the each compilation unit. As we are using
unconditional branches, each call to an atomics function results in the LL/SC
implementation being duplicated (just like any other inline function, except
the code is elsewhere). We get some locality benefits from the use of
subsection but that is per-compilation unit (so you'll get clusters of them
across the vmlinux).

This approach gives us bloat, we can mitigate this by not using inline
functions, and further by optimising for size. We can put all the fallback
atomics in a single section to benefit from vmlinux-wide code locality -
however to benefit from all this we must use functions calls and their
associated overhead.

Thanks,

Andrew Murray

> 
> Otherwise, this is looking much nicer!
> 
> Thanks,
> Mark.
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2019-08-28 11:54 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-12 14:36 [PATCH v3 0/5] arm64: avoid out-of-line ll/sc atomics Andrew Murray
2019-08-12 14:36 ` [PATCH v3 1/5] jump_label: Don't warn on __exit jump entries Andrew Murray
2019-08-22 15:32   ` Mark Rutland
2019-08-22 18:41     ` Andrew Murray
2019-08-12 14:36 ` [PATCH v3 2/5] arm64: Use correct ll/sc atomic constraints Andrew Murray
2019-08-22 15:32   ` Mark Rutland
2019-08-28 13:01     ` Andrew Murray
2019-08-28 15:25       ` Mark Rutland
2019-08-28 15:44         ` Andrew Murray
2019-08-28 16:24           ` Mark Rutland
2019-08-28 16:42             ` Andrew Murray
2019-08-12 14:36 ` [PATCH v3 3/5] arm64: atomics: avoid out-of-line ll/sc atomics Andrew Murray
2019-08-22 17:01   ` Mark Rutland
2019-08-28 11:53     ` Andrew Murray [this message]
2019-08-12 14:36 ` [PATCH v3 4/5] arm64: avoid using hard-coded registers for LSE atomics Andrew Murray
2019-08-12 14:36 ` [PATCH v3 5/5] arm64: atomics: remove atomic_ll_sc compilation unit Andrew Murray
2019-08-27 16:49 ` [PATCH v3 0/5] arm64: avoid out-of-line ll/sc atomics Will Deacon
2019-08-28  9:04   ` Andrew Murray

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190828115345.GU14582@e119886-lin.cambridge.arm.com \
    --to=andrew.murray@arm.com \
    --cc=Ard.Biesheuvel@arm.com \
    --cc=boqun.feng@gmail.com \
    --cc=catalin.marinas@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=mark.rutland@arm.com \
    --cc=peterz@infradead.org \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).