All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Collingbourne <pcc@google.com>
To: Nick Desaulniers <ndesaulniers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Masahiro Yamada <yamada.masahiro@socionext.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	clang-built-linux <clang-built-linux@googlegroups.com>,
	Will Deacon <will@kernel.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH] arm64: Add support for relocating the kernel with RELR relocations
Date: Tue, 9 Jul 2019 15:04:35 -0700	[thread overview]
Message-ID: <CAMn1gO5RR7mbOb3ZgkpN=fbP4tfYYZJLXeX6xyvT6xtqxZwpVw@mail.gmail.com> (raw)
In-Reply-To: <CAKwvOd=q8-zZJ443YYQbzQrtvVX1Z70o6sT3Zft+vYOuCjAhxA@mail.gmail.com>

On Mon, Jul 8, 2019 at 11:02 AM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> On Fri, Jul 5, 2019 at 1:03 AM 'Peter Collingbourne' via Clang Built
> Linux <clang-built-linux@googlegroups.com> wrote:
> >
> > RELR is a relocation packing format for relative relocations.
> > The format is described in a generic-abi proposal:
> > https://groups.google.com/d/topic/generic-abi/bX460iggiKg/discussion
>
> Cool, Rahul reports 9-19% savings for various userspace binaries.
> Just curious, but a quick scan makes it seem like this could be
> do-able for other arch's as well? (maybe a topic for a separate
> thread)

Yes, but it would likely involve reimplementing the self-relocation
code in assembly for each architecture so that we can guarantee that
the code does not itself need to be relocated. Probably worth
discussing on a separate thread.

> > The LLD linker can be instructed to pack relocations in the RELR
> > format by passing the flag --pack-dyn-relocs=relr.
> >
> > This patch adds a new config option, CONFIG_RELR. Enabling this option
> > instructs the linker to pack vmlinux's relative relocations in the RELR
> > format, and causes the kernel to apply the relocations at startup along
> > with the RELA relocations. RELA relocations still need to be applied
> > because the linker will emit RELA relative relocations if they are
> > unrepresentable in the RELR format (i.e. address not a multiple of 2).
> >
> > Enabling CONFIG_RELR reduces the size of a defconfig kernel image
> > with CONFIG_RANDOMIZE_BASE by 3.5MB/16% uncompressed, or 550KB/5%
> > compressed (lz4).
>
> Neat! Thanks for the patch; I'll try to get it tested this week to see
> if I can reproduce the results and boot test on hardware (I think
> llvm-nm has no known issues, I'll need to check llvm-objcopy).

Thanks. I've already boot tested it using qemu and I was planning to
test on hikey960 (currently waiting on a part), but more testing would
be useful.

> > Signed-off-by: Peter Collingbourne <pcc@google.com>
> > ---
> >  arch/arm64/Kconfig              |  9 +++++
> >  arch/arm64/Makefile             |  4 ++
> >  arch/arm64/kernel/head.S        | 70 ++++++++++++++++++++++++++++-----
> >  arch/arm64/kernel/vmlinux.lds.S |  9 +++++
> >  4 files changed, 83 insertions(+), 9 deletions(-)
> >
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index 697ea05107298..f0cd0d2607e70 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -1447,6 +1447,15 @@ config RELOCATABLE
> >           relocation pass at runtime even if the kernel is loaded at the
> >           same address it was linked at.
> >
> > +config RELR
> > +       bool "Use RELR relocation packing"
> > +       depends on RELOCATABLE && $(ld-option,--pack-dyn-relocs=relr)
>
> Oh, ld-option in Kconfig? +Masahiro
>
> > +       help
> > +         Store the kernel's dynamic relocations in the RELR relocation packing
> > +         format. Requires a compatible linker (currently only LLD supports
> > +         this feature), as well as compatible NM and OBJCOPY utilities
> > +         (llvm-nm and llvm-objcopy are compatible).
>
> So sounds like `make LD=ld.lld NM=llvm-nm OBJCOPY=llvm-objcopy` will
> be needed to test.  The ld-option check above doesn't seem strong
> enough, but maybe it's not easy to feature test NM or OBJCOPY?

Right. Ideally we want to test the property that the tool accepts an
input file with a .relr.dyn section, and this isn't easy without
actually creating such a file. We could test that the tools are
actually the LLVM versions (e.g. by testing the output of $TOOL
--version), but I'm not sure if we want to exclude the possibility
that GNU or other toolchains will add support for this section in the
future.

> > +
> >  config RANDOMIZE_BASE
> >         bool "Randomize the address of the kernel image"
> >         select ARM64_MODULE_PLTS if MODULES
> > diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
> > index e9d2e578cbe67..16a8636f815c9 100644
> > --- a/arch/arm64/Makefile
> > +++ b/arch/arm64/Makefile
> > @@ -22,6 +22,10 @@ LDFLAGS_vmlinux              += -shared -Bsymbolic -z notext -z norelro \
> >                         $(call ld-option, --no-apply-dynamic-relocs)
> >  endif
> >
> > +ifeq ($(CONFIG_RELR),y)
> > +  LDFLAGS_vmlinux += --pack-dyn-relocs=relr
> > +endif
> > +
> >  ifeq ($(CONFIG_ARM64_ERRATUM_843419),y)
> >    ifeq ($(call ld-option, --fix-cortex-a53-843419),)
> >  $(warning ld does not support --fix-cortex-a53-843419; kernel may be susceptible to erratum)
> > diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> > index 2cdacd1c141b9..9b27d5e7d8f70 100644
> > --- a/arch/arm64/kernel/head.S
> > +++ b/arch/arm64/kernel/head.S
> > @@ -102,6 +102,7 @@ pe_header:
> >          *  x23        stext() .. start_kernel()  physical misalignment/KASLR offset
> >          *  x28        __create_page_tables()     callee preserved temp register
> >          *  x19/x20    __primary_switch()         callee preserved temp registers
> > +        *  x24        __primary_switch()         current RELR displacement
> >          */
> >  ENTRY(stext)
> >         bl      preserve_boot_args
> > @@ -824,24 +825,63 @@ __relocate_kernel:
> >          * Iterate over each entry in the relocation table, and apply the
> >          * relocations in place.
> >          */
> > -       ldr     w9, =__rela_offset              // offset to reloc table
> > -       ldr     w10, =__rela_size               // size of reloc table
> > -
> >         mov_q   x11, KIMAGE_VADDR               // default virtual offset
> >         add     x11, x11, x23                   // actual virtual offset
> > +
> > +       ldr     w9, =__rela_offset              // offset to reloc table
> > +       ldr     w10, =__rela_size               // size of reloc table
>
> Was this reordering intentional?  I don't think w9 or w10 would change
> across the mov_q and add above? Or is it just to match the loop update
> below?

Yes, it was intended to group all of the code that deals with RELA
tables together so that it resembles the RELR code.

> >         add     x9, x9, x11                     // __va(.rela)
> >         add     x10, x9, x10                    // __va(.rela) + sizeof(.rela)
> >
> >  0:     cmp     x9, x10
> >         b.hs    1f
> > -       ldp     x11, x12, [x9], #24
> > -       ldr     x13, [x9, #-8]
> > -       cmp     w12, #R_AARCH64_RELATIVE
> > +       ldp     x12, x13, [x9], #24
> > +       ldr     x14, [x9, #-8]
> > +       cmp     w13, #R_AARCH64_RELATIVE
>
> Can you help me understand this renaming?
> x11 -> x12
> x13 -> x14
> x12 -> x13
> but they all get clobbered before use in your added ifdef hunk?

I use the value of x11 before it is clobbered in the instruction with
the comment "// __va(.relr)".

> >         b.ne    0b
> > -       add     x13, x13, x23                   // relocate
> > -       str     x13, [x11, x23]
> > +       add     x14, x14, x23                   // relocate
> > +       str     x14, [x12, x23]
> >         b       0b
> > -1:     ret
> > +
> > +1:
> > +#ifdef CONFIG_RELR
> > +       ldr     w9, =__relr_offset              // offset to reloc table
> > +       ldr     w10, =__relr_size               // size of reloc table
>
> Were these modified since first loaded in the above hunk?  I see the
> offsets applied below, but I don't spot any loops back up to `1:` (but
> could be missing it).  It also doesn't look like x11 or x10 are
> modified below (or above), so this looks like we're rematerializing
> values that already exist in those registers, IIUC?  Maybe I'm missing
> some side effect of one of the instructions?

These instructions refer to __relr_{offset,size} which are different
from __rela_{offset,size} loaded above. They are only loaded once per
function call; the main loop is between labels 2 and 7 below.

You might have missed the implicit increment of x9 by 8 in the "ldr
x11, [x9], #8" instruction below (see also similar instructions
above), which is how we move to the next relocation table entry in the
main loop.

> > +       add     x9, x9, x11                     // __va(.relr)
> > +       add     x10, x9, x10                    // __va(.relr) + sizeof(.relr)
> > +
> > +       sub     x15, x23, x24                   // delta from previous offset
> > +       cbz     x15, 7f                         // nothing to do if unchanged
> > +       mov     x24, x23                        // save new offset
> > +
> > +2:     cmp     x9, x10
> > +       b.hs    7f
> > +       ldr     x11, [x9], #8
> > +       tbnz    x11, #0, 3f                     // branch to handle bitmaps
> > +       add     x13, x11, x23
> > +       ldr     x12, [x13]                      // relocate address entry
> > +       add     x12, x12, x15
> > +       str     x12, [x13], #8                  // adjust to start of bitmap
> > +       b       2b
> > +
> > +3:     mov     x14, x13
> > +4:     lsr     x11, x11, #1
> > +       cbz     x11, 6f
> > +       tbz     x11, #0, 5f                     // skip bit if not set
> > +       ldr     x12, [x14]                      // relocate bit
> > +       add     x12, x12, x15
> > +       str     x12, [x14]
> > +
> > +5:     add     x14, x14, #8                    // move to next bit's address
> > +       b       4b
> > +
> > +6:     add     x13, x13, #(8 * 63)             // move to next bitmap's address
>
> Sorry, what's this constant `#(8 * 63)`?

It is the number of bytes covered by a bitmap entry. 8 is the word
size, and 63 is the number of significant bits in a bitmap entry.

>
> > +       b       2b
> > +
> > +7:
> > +#endif
> > +       ret
> > +
> >  ENDPROC(__relocate_kernel)
> >  #endif
> >
> > @@ -854,6 +894,18 @@ __primary_switch:
> >         adrp    x1, init_pg_dir
> >         bl      __enable_mmu
> >  #ifdef CONFIG_RELOCATABLE
> > +#ifdef CONFIG_RELR
> > +       /*
> > +        * RELR is similar to REL in that the addends are stored in place in the
> > +        * binary. This means that RELR relocations cannot be applied
> > +        * idempotently. We use x24 to keep track of the currently applied
> > +        * displacement so that we can correctly relocate if __relocate_kernel
> > +        * is called twice with non-zero displacements (i.e. if there is both a
> > +        * physical misalignment and a KASLR displacement). We start off at 0
>
> Sounds like I should test w/ and w/o CONFIG_RANDOMIZE_BASE enabled?

Sure, makes sense.

Peter

> > +        * because no displacement has been applied yet.
> > +        */
> > +       mov     x24, #0
> > +#endif
> >         bl      __relocate_kernel
> >  #ifdef CONFIG_RANDOMIZE_BASE
> >         ldr     x8, =__primary_switched
> > diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> > index 7fa0083749078..31716afa30f65 100644
> > --- a/arch/arm64/kernel/vmlinux.lds.S
> > +++ b/arch/arm64/kernel/vmlinux.lds.S
> > @@ -200,6 +200,15 @@ SECTIONS
> >         __rela_offset   = ABSOLUTE(ADDR(.rela.dyn) - KIMAGE_VADDR);
> >         __rela_size     = SIZEOF(.rela.dyn);
> >
> > +#ifdef CONFIG_RELR
> > +       .relr.dyn : ALIGN(8) {
> > +               *(.relr.dyn)
> > +       }
> > +
> > +       __relr_offset   = ABSOLUTE(ADDR(.relr.dyn) - KIMAGE_VADDR);
> > +       __relr_size     = SIZEOF(.relr.dyn);
> > +#endif
> > +
> >         . = ALIGN(SEGMENT_ALIGN);
> >         __initdata_end = .;
> >         __init_end = .;
> > --
>
> --
> Thanks,
> ~Nick Desaulniers

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2019-07-09 22:04 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-05  8:02 [PATCH] arm64: Add support for relocating the kernel with RELR relocations Peter Collingbourne
2019-07-08 18:02 ` Nick Desaulniers
2019-07-09 22:04   ` Peter Collingbourne [this message]
2019-07-09 23:13     ` Nick Desaulniers
2019-07-12 19:40       ` Peter Collingbourne
2019-07-10 16:21 ` Will Deacon
2019-07-12 19:40   ` Peter Collingbourne
2019-07-10 23:14 ` Nick Desaulniers
2019-07-12 19:40   ` Peter Collingbourne
2019-07-12 19:33 ` [PATCH v2] arm64: Add support for relocating the kernel with RELR Peter Collingbourne
2019-07-12 19:38 ` [PATCH v2] arm64: Add support for relocating the kernel with RELR relocations Peter Collingbourne
2019-07-29 20:00   ` Peter Collingbourne
2019-07-31 16:48   ` Will Deacon
2019-08-01  1:19     ` Peter Collingbourne
2019-08-01  1:18   ` [PATCH v3] " Peter Collingbourne
2019-08-01 12:05     ` Will Deacon
2019-08-01 17:51       ` Peter Collingbourne

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAMn1gO5RR7mbOb3ZgkpN=fbP4tfYYZJLXeX6xyvT6xtqxZwpVw@mail.gmail.com' \
    --to=pcc@google.com \
    --cc=ard.biesheuvel@linaro.org \
    --cc=catalin.marinas@arm.com \
    --cc=clang-built-linux@googlegroups.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=mark.rutland@arm.com \
    --cc=ndesaulniers@google.com \
    --cc=will@kernel.org \
    --cc=yamada.masahiro@socionext.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.