archive mirror
 help / color / mirror / Atom feed
From: Richard Sandiford <>
To: Dan Li <>
Subject: Re: [PATCH] [PATCH,v4,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
Date: Fri, 11 Feb 2022 09:53:04 +0000	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <> (Dan Li's message of "Fri, 11 Feb 2022 00:57:10 -0800")

Dan Li <> writes:
> On 2/10/22 01:55, Richard Sandiford wrote:
>>> There might be a little difference:
>>> - Using push candidates means that a register to be ignored in pop
>>> candidates will not be emitted again during the "restore" (pop_candidates
>>> should always be a subset of push_candidates, since popping a register
>>> without a push might not make sense).
>> The push candidates are simply a subset of the saved registers though.
>> Similarly, the pop candidates are simply a subset of the restored registers.
>> So I think the requirement operates at that level: the restored registers
>> must be a subset of the saved registers.
>> In other circumstances it could have been the other way around:
>> there might have been a change that stopped us from saving two
>> registers during the allocation, but we wanted to carry on restoring
>> two registers during the deallocation.  I don't think there's a
>> reason that the push candidates *have* to be a superset of the
>> pop candidates (even though they are with the current change).
> Oh yeah, that sounds more reasonable.
>>> When we use "pop_candidate[12]", one more insn is emitted:
>>> 0000000000400604 <main>:
>>>      400604:       a9bf53f3        stp     x19, x20, [sp, #-16]!
>>>      400608:       52800000        mov     w0, #0x0
>>> +  40060c:       f94007f4        ldr     x20, [sp, #8]
>>>      400610:       f84107f3        ldr     x19, [sp], #16
>>>      400614:       d65f03c0        ret
>>> But in the case of ignoring a specific register (like scs ignores x30),
>>> there is no difference between the two (because we always need
>>> to explicitly specify which registers to ignore in the parameter of
>>> aarch64_restore_callee_saves).
>> I think this is the correct behaviour.  If we don't want to restore
>> a register at all then it should be excluded from the restore list
>> somehow.  In your case you're doing that be using a limit of
>> X29_REGNUM instead of X30_REGNUM.
> Got it, I'll use pop candidates in the next version.
>> FWIW, I did wonder whether aarch64_restore_callee_saves should be
>> doing the scs pop, rather than aarch64_expand_epilogue, and in an
>> earlier draft of the previous review I'd asked for that.  It does
>> seem conceptually cleaner, but in practice, it would probably have
>> been awkward to implement.  E.g. we'd need to explicitly stop an
>> LDP being formed with X30 as the second register.
> Well, then I think I should keep it the same here :).
>> But treating scs push and scs pop as part of the register save and
>> restore sequences would have one advantage: it would allow the
>> scs push and scs pop to be shrink-wrapped.
> Sorry for my limited knowledge of shrink warping, I don't think I get
> it here (I tried to find a case when compiling the kernel and some
> gcc test cases but I still don't have a clue.).
> I see that the bitmap of LR_REGNUM is cleared in
> aarch64_get_separate_components and scs push/pop are x18 based operations.
> If we handle them in aarch64_restore/save_callee_saves,
> could scs push/pop be shrink-wrapped in some cases?

Yeah, I think so.  E.g. for:

void f();
int g(int x) {
    if (x == 0)
        return 1;
    return 2;

shrink wrapping would allow the scs push and pop to move along with the
x30 save:

        cbnz    w0, .L9
        mov     w0, 1
        stp     x29, x30, [sp, -16]!
        mov     x29, sp
        bl      f
        mov     w0, 2
        ldp     x29, x30, [sp], 16

The idea is that aarch64_save_callee_saves would treat the scs push
as part of saving x30 (along with the normal store to the frame chain,
when used).  aarch64_restore_callee_saves would similarly treat the scs
pop as the way of restoring x30 (instead of loading from the frame chain).
This is in contrast to the current patch, where the scs push and pop are
treated as fixed parts of the prologue and epilogue instead, and where
aarch64_restore_callee_saves tries to avoid doing anything for x30.

If shrink-wrapping decides to treat x30 as a separate “component”, as it
does in the example above, then the scs push and pop would be emitted
by aarch64_process_components instead.

It would be more complex, but it would give better code.


  reply	other threads:[~2022-02-11  9:53 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-05 11:04 [PATCH] [PATCH,v4,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack Dan Li
2022-02-09 16:08 ` Richard Sandiford
2022-02-10  3:06   ` Dan Li
2022-02-10  9:55     ` Richard Sandiford
2022-02-11  8:57       ` Dan Li
2022-02-11  9:53         ` Richard Sandiford [this message]
2022-02-11 13:43           ` Dan Li
2022-02-11 15:35             ` Richard Sandiford
2022-02-12  5:30               ` Dan Li
2022-02-12  8:43               ` Dan Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \ \ \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).