All of lore.kernel.org
 help / color / mirror / Atom feed
From: trong@android.com (Tri Vo)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH] arm64: build CONFIG_ARM64_LSE_ATOMICS without -ffixed- flags
Date: Fri, 3 Aug 2018 18:05:29 -0700	[thread overview]
Message-ID: <CANA+-vDMckt3WN+7E59vd_EKnUS4r_iuPNgyp2cs6-uWiAg3qA@mail.gmail.com> (raw)
In-Reply-To: <CAKv+Gu-SgWCzbBH1iZnWKYbTj28TiL6Xd_x9OTm0CXSiBQCitw@mail.gmail.com>

On Fri, Aug 3, 2018 at 12:34 PM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> On 3 August 2018 at 21:13, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>> On 3 August 2018 at 21:08, Tri Vo <trong@android.com> wrote:
>>> On Fri, Aug 3, 2018 at 10:27 AM, Ard Biesheuvel
>>> <ard.biesheuvel@linaro.org> wrote:
>>>> On 3 August 2018 at 19:17, Nick Desaulniers <ndesaulniers@google.com> wrote:
>>>>> On Fri, Aug 3, 2018 at 1:43 AM Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>>>>>>
>>>>>> On 3 August 2018 at 01:17, Tri Vo <trong@android.com> wrote:
>>>>>> > Remove -ffixed- compile flags as they are not required for correct
>>>>>> > behavior of out-of-line ll/sc implementations of atomic functions.
>>>>>> > Registers (except x0 which contains return value) only need to be
>>>>>> > callee-saved, not necessarily fixed.
>>>>>> >
>>>>>> > For readability purposes, represent "callee-saves-all-registers" calling
>>>>>> > convention with a combination of -fcall-used- and -fcall-saved- flags only.
>>>>>> >
>>>>>> > Signed-off-by: Tri Vo <trong@android.com>
>>>>>>
>>>>>> Wouldn't this permit the compiler to stack/unstack these registers in
>>>>>> the implementations of the out of line LL/SC fallbacks.
>>>
>>> Yes. However, the number of registers that get stacked/unstacked
>>> should remain the same since compiler should only preserve registers
>>> that are used by callee (assuming callee saves all).
>>>>>
>>>>> I'm curious to understand more about how this LSE patching works, and
>>>>> what the constraints exactly are.  Are there some docs I should read
>>>>> that you can point me to?  I see
>>>>> arch/arm64/include/asm/atomic_ll_sc.h, but are those are the in line
>>>>> implementations?  Does that make arch/arm64/include/asm/atomic.h the
>>>>> out of line definitions?  What are the restrictions around preventing
>>>>> the compiler to push/pop those registers?
>>>>>
>>>>
>>>> The idea is that LSE atomics can each be patched into a bl instruction
>>>> that calls the appropriate LL/SC fallback, but without the overhead of
>>>> an ordinary function call. (actually the bl is there by default, but
>>>> patched into LSE ops on h/w that supports it)
>>>>
>>>> So that is why the 'bl' is in an asm() block, and why x30 is used as a
>>>> temporary/clobber in the LSE sequences: the function call is
>>>> completely hidden from the compiler. Obviously, for performance, you
>>>> want to prevent these fallbacks from touching the stack.
>>>
>>> The function call is hidden from the compiler on the caller side
>>> using 'bl' in the asm block. I suspect that is done to make
>>> alternative instructions the same byte length (constraints of dynamic
>>> patching). However, on the callee side, LL/SC fallbacks
>>> (in atomic_ll_sc.h) are implemented as C functions. So the compiler
>>> still handles the function call details and register allocation
>>> within those functions.
>>>
>>> Since the caller doesn't save any registers, callee stacks/unstacks
>>> all the registers it uses. So, for example, LL/SC fallback
>>> implementation of atomic_add preserves 2 registers with or without
>>> this change.
>>>
>>
>> Did you look at the disassembly?

Yes, I did.
>
> Actually, I only just spotted that you only modify x1 .. x7 and that
> the other registers already use -fcall-saved-xNN
>
> Will should comment on why the distinction exists in the first place.
> Note that a strict reading of the GCC man page suggests that using
> -fcall-saved-x8 is a bad idea, given that it is the indirect return
> register. However, given the limited scope of the option here, that
> probably does not matter in practice.
>
> So I guess you need this patch for Clang? Would be better to mention
> that in the commit log if this is the case.

Yes, I am planning to fold whatever compiler support is needed for LSE
atomics into "callee-saves-all-registers" calling convention (without
ffixed flags). I'll then implement that calling convention in Clang.
I'll mention that in the commit message.

  reply	other threads:[~2018-08-04  1:05 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-02 23:17 [PATCH] arm64: build CONFIG_ARM64_LSE_ATOMICS without -ffixed- flags Tri Vo
2018-08-03  8:43 ` Ard Biesheuvel
2018-08-03 17:17   ` Nick Desaulniers
2018-08-03 17:27     ` Ard Biesheuvel
2018-08-03 19:08       ` Tri Vo
2018-08-03 19:13         ` Ard Biesheuvel
2018-08-03 19:34           ` Ard Biesheuvel
2018-08-04  1:05             ` Tri Vo [this message]
2018-08-04  7:21               ` Ard Biesheuvel
2018-08-06 21:54                 ` Tri Vo
2018-08-06 22:00                   ` Ard Biesheuvel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CANA+-vDMckt3WN+7E59vd_EKnUS4r_iuPNgyp2cs6-uWiAg3qA@mail.gmail.com \
    --to=trong@android.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.