From: Andrew Murray <andrew.murray@arm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	Will Deacon <will.deacon@arm.com>,
	linux-arm-kernel@lists.infradead.org, Ard.Biesheuvel@arm.com
Subject: Re: [PATCH v1 0/5] arm64: avoid out-of-line ll/sc atomics
Date: Fri, 17 May 2019 11:08:03 +0100
Message-ID: <20190517100802.GS8268@e119886-lin.cambridge.arm.com>
In-Reply-To: <20190517072401.GI2623@hirez.programming.kicks-ass.net>

On Fri, May 17, 2019 at 09:24:01AM +0200, Peter Zijlstra wrote:
> On Thu, May 16, 2019 at 04:53:39PM +0100, Andrew Murray wrote:
> > When building for LSE atomics (CONFIG_ARM64_LSE_ATOMICS), if the hardware
> > or toolchain doesn't support it the existing code will fall back to ll/sc
> > atomics. It achieves this by branching from inline assembly to a function
> > that is built with special compile flags. Further this results in the
> > clobbering of registers even when the fallback isn't used increasing
> > register pressure.
> >
> > Let's improve this by providing inline implementations of both LSE and
> > ll/sc and use a static key to select between them. This allows for the
> > compiler to generate better atomics code.
>
> Don't you guys have alternatives? That would avoid having both versions
> in the code, and thus significantly cuts back on the bloat.

Yes we do.

Prior to patch 3 of this series, the ARM64_LSE_ATOMIC_INSN macro used
ALTERNATIVE to either bl to a fallback ll/sc function (and nops) - or
execute some LSE instructions.

But this approach limits the compiler's ability to optimise the code,
both because the asm clobber list is the superset of the ll/sc and LSE
clobbers and because of the gcc compiler flags used on the ll/sc
functions.

I think the alternative solution (excuse the pun) that you are suggesting
is to put the body of the ll/sc or LSE code in the ALTERNATIVE
oldinstr/newinstr blocks (i.e. drop the fallback branches). However, this
still gives us some bloat (though less than my current solution) because
we're now inlining the larger fallback ll/sc code, whereas previously it
lived in non-inlined functions. We still end up with potentially
unnecessary clobbers for LSE code with this approach.

Approach prior to this series:

   BL 1 or NOP <- single alternative instruction
   LSE
   LSE
   ...

1: LL/SC <- LL/SC fallback not inlined so reused
   LL/SC
   LL/SC
   LL/SC

Approach proposed by this series:

   BL 1 or NOP <- single alternative instruction
   LSE
   LSE
   BL 2
1: LL/SC <- inlined LL/SC and thus duplicated
   LL/SC
   LL/SC
   LL/SC
2: ..

Approach using alternative without branches:

   LSE
   LSE
   NOP
   NOP

or

   LL/SC <- inlined LL/SC and thus duplicated
   LL/SC
   LL/SC
   LL/SC

I guess there is a balance here between bloat and code optimisation.

>
> > These changes add a small amount of bloat on defconfig according to
> > bloat-o-meter:
> >
> > text:
> >   add/remove: 1/108 grow/shrink: 3448/20 up/down: 272768/-4320 (268448)
> >   Total: Before=12363112, After=12631560, chg +2.17%
>
> I'd say 2% is quite significant bloat.

Thanks,

Andrew Murray
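For readers following the thread, the "both implementations inline, selected at run time" idea can be sketched in user-space C. This is only an analogy under stated assumptions, not the code from the series: the names `have_lse_demo`, `add_return_lse_demo`, `add_return_llsc_demo` and `add_return_demo` are hypothetical, a plain `bool` stands in for the kernel static key, and C11 atomics stand in for the real LSE and LL/SC instruction sequences.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a static key: decided once, then only read. */
static bool have_lse_demo;

/* Stand-in for the LSE path: one atomic read-modify-write operation. */
static inline int add_return_lse_demo(atomic_int *v, int i)
{
	return atomic_fetch_add(v, i) + i;
}

/* Stand-in for the LL/SC path: load, then retry the store until it lands. */
static inline int add_return_llsc_demo(atomic_int *v, int i)
{
	int old = atomic_load(v);

	/* On failure, 'old' is refreshed with the current value of *v. */
	while (!atomic_compare_exchange_weak(v, &old, old + i))
		;
	return old + i;
}

/*
 * Both bodies are visible to the compiler at every call site, so each can
 * get its own register allocation, at the cost of emitting both copies.
 */
static inline int add_return_demo(atomic_int *v, int i)
{
	return have_lse_demo ? add_return_lse_demo(v, i)
			     : add_return_llsc_demo(v, i);
}
```

Every call site of `add_return_demo()` carries both paths, which mirrors the duplication in the second diagram above. A real static key would patch the selecting branch rather than test a variable each time.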
Thread overview: 14+ messages

2019-05-16 15:53 Andrew Murray
2019-05-16 15:53 [PATCH v1 1/5] jump_label: Don't warn on __exit jump entries Andrew Murray
2019-05-16 15:53 [PATCH v1 2/5] arm64: Use correct ll/sc atomic constraints Andrew Murray
2019-05-16 15:53 [PATCH v1 3/5] arm64: atomics: avoid out-of-line ll/sc atomics Andrew Murray
2019-05-16 15:53 [PATCH v1 4/5] arm64: avoid using hard-coded registers for LSE atomics Andrew Murray
2019-05-16 15:53 [PATCH v1 5/5] arm64: atomics: remove atomic_ll_sc compilation unit Andrew Murray
2019-05-17  7:24 [PATCH v1 0/5] arm64: avoid out-of-line ll/sc atomics Peter Zijlstra
2019-05-17 10:08 Andrew Murray [this message]
2019-05-17 10:29 Ard Biesheuvel
2019-05-22 10:45 Andrew Murray
2019-05-22 11:44 Ard Biesheuvel
2019-05-22 15:36 Andrew Murray
2019-05-17 12:05 Peter Zijlstra
2019-05-17 12:19 Ard Biesheuvel