Re: [PATCH] arm64: Add support for relocating the kernel with RELR relocations

From: Peter Collingbourne <pcc@google.com>
To: Nick Desaulniers <ndesaulniers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Masahiro Yamada <yamada.masahiro@socionext.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	clang-built-linux <clang-built-linux@googlegroups.com>,
	Will Deacon <will@kernel.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH] arm64: Add support for relocating the kernel with RELR relocations
Date: Tue, 9 Jul 2019 15:04:35 -0700	[thread overview]
Message-ID: <CAMn1gO5RR7mbOb3ZgkpN=fbP4tfYYZJLXeX6xyvT6xtqxZwpVw@mail.gmail.com> (raw)
In-Reply-To: <CAKwvOd=q8-zZJ443YYQbzQrtvVX1Z70o6sT3Zft+vYOuCjAhxA@mail.gmail.com>

On Mon, Jul 8, 2019 at 11:02 AM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> On Fri, Jul 5, 2019 at 1:03 AM 'Peter Collingbourne' via Clang Built
> Linux <clang-built-linux@googlegroups.com> wrote:
> >
> > RELR is a relocation packing format for relative relocations.
> > The format is described in a generic-abi proposal:
> > https://groups.google.com/d/topic/generic-abi/bX460iggiKg/discussion
>
> Cool, Rahul reports 9-19% savings for various userspace binaries.
> Just curious, but a quick scan makes it seem like this could be
> do-able for other arch's as well? (maybe a topic for a separate
> thread)

Yes, but it would likely involve reimplementing the self-relocation
code in assembly for each architecture so that we can guarantee that
the code does not itself need to be relocated. Probably worth
discussing on a separate thread.

> > The LLD linker can be instructed to pack relocations in the RELR
> > format by passing the flag --pack-dyn-relocs=relr.
> >
> > This patch adds a new config option, CONFIG_RELR. Enabling this option
> > instructs the linker to pack vmlinux's relative relocations in the RELR
> > format, and causes the kernel to apply the relocations at startup along
> > with the RELA relocations. RELA relocations still need to be applied
> > because the linker will emit RELA relative relocations if they are
> > unrepresentable in the RELR format (i.e. address not a multiple of 2).
> >
> > Enabling CONFIG_RELR reduces the size of a defconfig kernel image
> > with CONFIG_RANDOMIZE_BASE by 3.5MB/16% uncompressed, or 550KB/5%
> > compressed (lz4).
>
> Neat! Thanks for the patch; I'll try to get it tested this week to see
> if I can reproduce the results and boot test on hardware (I think
> llvm-nm has no known issues, I'll need to check llvm-objcopy).

Thanks. I've already boot tested it using qemu and I was planning to
test on hikey960 (currently waiting on a part), but more testing would
be useful.

> > Signed-off-by: Peter Collingbourne <pcc@google.com>
> > ---
> >  arch/arm64/Kconfig              |  9 +++++
> >  arch/arm64/Makefile             |  4 ++
> >  arch/arm64/kernel/head.S        | 70 ++++++++++++++++++++++++++++-----
> >  arch/arm64/kernel/vmlinux.lds.S |  9 +++++
> >  4 files changed, 83 insertions(+), 9 deletions(-)
> >
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index 697ea05107298..f0cd0d2607e70 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -1447,6 +1447,15 @@ config RELOCATABLE
> >           relocation pass at runtime even if the kernel is loaded at the
> >           same address it was linked at.
> >
> > +config RELR
> > +       bool "Use RELR relocation packing"
> > +       depends on RELOCATABLE && $(ld-option,--pack-dyn-relocs=relr)
>
> Oh, ld-option in Kconfig? +Masahiro
>
> > +       help
> > +         Store the kernel's dynamic relocations in the RELR relocation packing
> > +         format. Requires a compatible linker (currently only LLD supports
> > +         this feature), as well as compatible NM and OBJCOPY utilities
> > +         (llvm-nm and llvm-objcopy are compatible).
>
> So sounds like `make LD=ld.lld NM=llvm-nm OBJCOPY=llvm-objcopy` will
> be needed to test.  The ld-option check above doesn't seem strong
> enough, but maybe it's not easy to feature test NM or OBJCOPY?

Right. Ideally we want to test the property that the tool accepts an
input file with a .relr.dyn section, and this isn't easy without
actually creating such a file. We could test that the tools are
actually the LLVM versions (e.g. by testing the output of $TOOL
--version), but I'm not sure if we want to exclude the possibility
that GNU or other toolchains will add support for this section in the
future.

> > +
> >  config RANDOMIZE_BASE
> >         bool "Randomize the address of the kernel image"
> >         select ARM64_MODULE_PLTS if MODULES
> > diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
> > index e9d2e578cbe67..16a8636f815c9 100644
> > --- a/arch/arm64/Makefile
> > +++ b/arch/arm64/Makefile
> > @@ -22,6 +22,10 @@ LDFLAGS_vmlinux              += -shared -Bsymbolic -z notext -z norelro \
> >                         $(call ld-option, --no-apply-dynamic-relocs)
> >  endif
> >
> > +ifeq ($(CONFIG_RELR),y)
> > +  LDFLAGS_vmlinux += --pack-dyn-relocs=relr
> > +endif
> > +
> >  ifeq ($(CONFIG_ARM64_ERRATUM_843419),y)
> >    ifeq ($(call ld-option, --fix-cortex-a53-843419),)
> >  $(warning ld does not support --fix-cortex-a53-843419; kernel may be susceptible to erratum)
> > diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> > index 2cdacd1c141b9..9b27d5e7d8f70 100644
> > --- a/arch/arm64/kernel/head.S
> > +++ b/arch/arm64/kernel/head.S
> > @@ -102,6 +102,7 @@ pe_header:
> >          *  x23        stext() .. start_kernel()  physical misalignment/KASLR offset
> >          *  x28        __create_page_tables()     callee preserved temp register
> >          *  x19/x20    __primary_switch()         callee preserved temp registers
> > +        *  x24        __primary_switch()         current RELR displacement
> >          */
> >  ENTRY(stext)
> >         bl      preserve_boot_args
> > @@ -824,24 +825,63 @@ __relocate_kernel:
> >          * Iterate over each entry in the relocation table, and apply the
> >          * relocations in place.
> >          */
> > -       ldr     w9, =__rela_offset              // offset to reloc table
> > -       ldr     w10, =__rela_size               // size of reloc table
> > -
> >         mov_q   x11, KIMAGE_VADDR               // default virtual offset
> >         add     x11, x11, x23                   // actual virtual offset
> > +
> > +       ldr     w9, =__rela_offset              // offset to reloc table
> > +       ldr     w10, =__rela_size               // size of reloc table
>
> Was this reordering intentional?  I don't think w9 or w10 would change
> across the mov_q and add above? Or is it just to match the loop update
> below?

Yes, it was intended to group all of the code that deals with RELA
tables together so that it resembles the RELR code.

> >         add     x9, x9, x11                     // __va(.rela)
> >         add     x10, x9, x10                    // __va(.rela) + sizeof(.rela)
> >
> >  0:     cmp     x9, x10
> >         b.hs    1f
> > -       ldp     x11, x12, [x9], #24
> > -       ldr     x13, [x9, #-8]
> > -       cmp     w12, #R_AARCH64_RELATIVE
> > +       ldp     x12, x13, [x9], #24
> > +       ldr     x14, [x9, #-8]
> > +       cmp     w13, #R_AARCH64_RELATIVE
>
> Can you help me understand this renaming?
> x11 -> x12
> x13 -> x14
> x12 -> x13
> but they all get clobbered before use in your added ifdef hunk?

I use the value of x11 before it is clobbered in the instruction with
the comment "// __va(.relr)".

> >         b.ne    0b
> > -       add     x13, x13, x23                   // relocate
> > -       str     x13, [x11, x23]
> > +       add     x14, x14, x23                   // relocate
> > +       str     x14, [x12, x23]
> >         b       0b
> > -1:     ret
> > +
> > +1:
> > +#ifdef CONFIG_RELR
> > +       ldr     w9, =__relr_offset              // offset to reloc table
> > +       ldr     w10, =__relr_size               // size of reloc table
>
> Were these modified since first loaded in the above hunk?  I see the
> offsets applied below, but I don't spot any loops back up to `1:` (but
> could be missing it).  It also doesn't look like x11 or x10 are
> modified below (or above), so this looks like we're rematerializing
> values that already exist in those registers, IIUC?  Maybe I'm missing
> some side effect of one of the instructions?

These instructions refer to __relr_{offset,size} which are different
from __rela_{offset,size} loaded above. They are only loaded once per
function call; the main loop is between labels 2 and 7 below.

You might have missed the implicit increment of x9 by 8 in the "ldr
x11, [x9], #8" instruction below (see also similar instructions
above), which is how we move to the next relocation table entry in the
main loop.

> > +       add     x9, x9, x11                     // __va(.relr)
> > +       add     x10, x9, x10                    // __va(.relr) + sizeof(.relr)
> > +
> > +       sub     x15, x23, x24                   // delta from previous offset
> > +       cbz     x15, 7f                         // nothing to do if unchanged
> > +       mov     x24, x23                        // save new offset
> > +
> > +2:     cmp     x9, x10
> > +       b.hs    7f
> > +       ldr     x11, [x9], #8
> > +       tbnz    x11, #0, 3f                     // branch to handle bitmaps
> > +       add     x13, x11, x23
> > +       ldr     x12, [x13]                      // relocate address entry
> > +       add     x12, x12, x15
> > +       str     x12, [x13], #8                  // adjust to start of bitmap
> > +       b       2b
> > +
> > +3:     mov     x14, x13
> > +4:     lsr     x11, x11, #1
> > +       cbz     x11, 6f
> > +       tbz     x11, #0, 5f                     // skip bit if not set
> > +       ldr     x12, [x14]                      // relocate bit
> > +       add     x12, x12, x15
> > +       str     x12, [x14]
> > +
> > +5:     add     x14, x14, #8                    // move to next bit's address
> > +       b       4b
> > +
> > +6:     add     x13, x13, #(8 * 63)             // move to next bitmap's address
>
> Sorry, what's this constant `#(8 * 63)`?

It is the number of bytes covered by a bitmap entry. 8 is the word
size, and 63 is the number of significant bits in a bitmap entry.

>
> > +       b       2b
> > +
> > +7:
> > +#endif
> > +       ret
> > +
> >  ENDPROC(__relocate_kernel)
> >  #endif
> >
> > @@ -854,6 +894,18 @@ __primary_switch:
> >         adrp    x1, init_pg_dir
> >         bl      __enable_mmu
> >  #ifdef CONFIG_RELOCATABLE
> > +#ifdef CONFIG_RELR
> > +       /*
> > +        * RELR is similar to REL in that the addends are stored in place in the
> > +        * binary. This means that RELR relocations cannot be applied
> > +        * idempotently. We use x24 to keep track of the currently applied
> > +        * displacement so that we can correctly relocate if __relocate_kernel
> > +        * is called twice with non-zero displacements (i.e. if there is both a
> > +        * physical misalignment and a KASLR displacement). We start off at 0
>
> Sounds like I should test w/ and w/o CONFIG_RANDOMIZE_BASE enabled?

Sure, makes sense.

Peter

> > +        * because no displacement has been applied yet.
> > +        */
> > +       mov     x24, #0
> > +#endif
> >         bl      __relocate_kernel
> >  #ifdef CONFIG_RANDOMIZE_BASE
> >         ldr     x8, =__primary_switched
> > diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> > index 7fa0083749078..31716afa30f65 100644
> > --- a/arch/arm64/kernel/vmlinux.lds.S
> > +++ b/arch/arm64/kernel/vmlinux.lds.S
> > @@ -200,6 +200,15 @@ SECTIONS
> >         __rela_offset   = ABSOLUTE(ADDR(.rela.dyn) - KIMAGE_VADDR);
> >         __rela_size     = SIZEOF(.rela.dyn);
> >
> > +#ifdef CONFIG_RELR
> > +       .relr.dyn : ALIGN(8) {
> > +               *(.relr.dyn)
> > +       }
> > +
> > +       __relr_offset   = ABSOLUTE(ADDR(.relr.dyn) - KIMAGE_VADDR);
> > +       __relr_size     = SIZEOF(.relr.dyn);
> > +#endif
> > +
> >         . = ALIGN(SEGMENT_ALIGN);
> >         __initdata_end = .;
> >         __init_end = .;
> > --
>
> --
> Thanks,
> ~Nick Desaulniers

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel