linux-toolchains.vger.kernel.org archive mirror
From: Nick Desaulniers <ndesaulniers@google.com>
To: Petr Pavlu <petr.pavlu@suse.com>
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
	nathan@kernel.org, trix@redhat.com, corbet@lwn.net,
	linux-kernel@vger.kernel.org,
	clang-built-linux <llvm@lists.linux.dev>,
	linux-toolchains <linux-toolchains@vger.kernel.org>,
	Fangrui Song <maskray@google.com>
Subject: Re: [PATCH v5] x86: Avoid relocation information in final vmlinux
Date: Mon, 20 Mar 2023 11:35:30 -0700
Message-ID: <CAKwvOdkFvMgYypc4w+UChO2O50wSHqXJUct2fkxrd+Qgn2C4Cg@mail.gmail.com>
In-Reply-To: <20230320121006.4863-1-petr.pavlu@suse.com>

On Mon, Mar 20, 2023 at 5:10 AM Petr Pavlu <petr.pavlu@suse.com> wrote:
>
> The Linux build process on x86 roughly consists of compiling all input
> files, statically linking them into a vmlinux ELF file, and then taking
> and turning this file into an actual bzImage bootable file.
>
> vmlinux has in this process two main purposes:
> 1) It is an intermediate build target on the way to produce the final
>    bootable image.
> 2) It is a file that is expected to be used by debuggers and standard
>    ELF tooling to work with the built kernel.
>
> For the second purpose, a vmlinux file is typically collected by various
> package build recipes, such as distribution spec files, including the
> kernel's own tar-pkg target.
>
> When building a kernel supporting KASLR with CONFIG_X86_NEED_RELOCS,
> vmlinux contains also relocation information produced by using the
> --emit-relocs linker option. This is utilized by subsequent build steps
> to create vmlinux.relocs and produce a relocatable image. However, the
> information is not needed by debuggers and other standard ELF tooling.
>
> The issue is then that the collected vmlinux file and hence distribution
> packages end up unnecessarily large because of this extra data. The
> following is a size comparison of vmlinux v6.0 with and without the
> relocation information:
> | Configuration      | With relocs | Stripped relocs |
> | x86_64_defconfig   |       70 MB |           43 MB |
> | +CONFIG_DEBUG_INFO |      818 MB |          367 MB |

Thanks for getting this to work with llvm-objcopy.  It's a pretty big
win for us, especially for ThinLTO builds, which produce a ridiculous
amount of debug info duplication (something I'm petitioning our DWARF
folks to look into for DWARFv6).  Some measurements (all with LLVM=1):

Before this patch:
defconfig:
76M vmlinux
DEBUG_INFO:
510M vmlinux
DEBUG_INFO+LTO_CLANG_THIN:
796M vmlinux

After this patch:
defconfig:
48M vmlinux (-36.8%)
DEBUG_INFO:
270M vmlinux (-47%)
DEBUG_INFO+LTO_CLANG_THIN:
400M vmlinux (-49.8%)

So basically a 50% reduction in vmlinux size, depending on the precise
configs selected. That's pretty great!

Android usually keeps vmlinux artifacts around along with the
compressed image in case we need to debug the image later; this should
help us cut our storage costs for those quite a bit.  arm64 is more
common for Android, but x86_64 is pretty helpful as a virtualized
target; we use it a lot for first-party development.

I also tested that I could still boot the result in QEMU, attach GDB,
and hit breakpoints in the resulting vmlinux.  I also checked that no
rel/rela sections were left behind in the resulting vmlinux images.
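
For anyone who wants to repeat that last check, here is a minimal
sketch of one way to do it.  It assumes readelf (or llvm-readelf) is on
PATH; the two helper function names are made up for illustration, and
the patterns simply mirror the --remove-section globs from the patch:

```shell
#!/bin/sh
# Sketch only: list relocation sections still present in an ELF file.
# section_names and reloc_sections are hypothetical helper names.

# Extract section names from `readelf -S -W` output read on stdin.
# Section header lines look like "  [ 2] .rela.text  RELA ...".
section_names() {
	sed -n 's/^ *\[ *[0-9][0-9]*\] *\([^ ]*\).*/\1/p'
}

# Keep only names matching .rel.*, .rel__*, .rela.* or .rela__*.
# `|| true` keeps the exit status zero when nothing matches.
reloc_sections() {
	grep -E '^\.rela?(\.|__)' || true
}

# Usage (assumes ./vmlinux exists):
#   readelf -S -W vmlinux | section_names | reloc_sections
```

Empty output from the pipeline in the usage comment means no matching
sections remain.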

Tested-by: Nick Desaulniers <ndesaulniers@google.com>

Some minor review comments below.


I do also wonder if linkers have something like --emit-relocs, but
with the option to write the relocations to a separate file.  That
would save us from emitting them into vmlinux only to split them out
again.

>
> Optimize a resulting vmlinux by adding a postlink step that splits the
> relocation information into vmlinux.relocs and then strips it from the
> vmlinux binary.
>
> Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
> ---
>
> Changes since v4 [1]:
> - Update the example target which is mentioned in the patch description
>   to collect vmlinux from binrpm-pkg to tar-pkg, to reflect fc8c2d8ff206
>   ("kbuild: Stop including vmlinux.bz2 in the rpm's").
>
> Changes since v3 [2]:
> - Update the Kbuild.include path in arch/x86/Makefile.postlink to work
>   after 67d7c3023a67 ("kbuild: remove --include-dir MAKEFLAG from top
>   Makefile").
>
> Changes since v2 [3]:
> - Ignore only the moved vmlinux.relocs, add it to .gitignore and
>   Documentation/dontdiff.
> - Clean up the patch description.
>
> Changes since v1 [4]:
> - Fix the command to remove relocations to work with llvm-objcopy too.
>
> [1] https://lore.kernel.org/lkml/20230227131829.26824-1-petr.pavlu@suse.com/
> [2] https://lore.kernel.org/lkml/20221211141227.7622-1-petr.pavlu@suse.com/
> [3] https://lore.kernel.org/lkml/20220927084632.14531-1-petr.pavlu@suse.com/
> [4] https://lore.kernel.org/lkml/20220913132911.6850-1-petr.pavlu@suse.com/
>
>  .gitignore                          |  1 +
>  Documentation/dontdiff              |  1 +
>  arch/x86/Makefile.postlink          | 41 +++++++++++++++++++++++++++++
>  arch/x86/boot/compressed/.gitignore |  1 -
>  arch/x86/boot/compressed/Makefile   | 10 +++----
>  5 files changed, 47 insertions(+), 7 deletions(-)
>  create mode 100644 arch/x86/Makefile.postlink
>
> diff --git a/.gitignore b/.gitignore
> index 70ec6037fa7a..9bafd3c6bb5f 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -65,6 +65,7 @@ modules.order
>  /vmlinux
>  /vmlinux.32
>  /vmlinux.map
> +/vmlinux.relocs

Why do you move this from the arch/x86/boot/compressed/ dir?

>  /vmlinux.symvers
>  /vmlinux-gdb.py
>  /vmlinuz
> diff --git a/Documentation/dontdiff b/Documentation/dontdiff
> index 3c399f132e2d..a62ad01e6d11 100644
> --- a/Documentation/dontdiff
> +++ b/Documentation/dontdiff
> @@ -254,6 +254,7 @@ vmlinux.aout
>  vmlinux.bin.all
>  vmlinux.lds
>  vmlinux.map
> +vmlinux.relocs
>  vmlinux.symvers
>  vmlinuz
>  voffset.h
> diff --git a/arch/x86/Makefile.postlink b/arch/x86/Makefile.postlink
> new file mode 100644
> index 000000000000..195af937aa4d
> --- /dev/null
> +++ b/arch/x86/Makefile.postlink
> @@ -0,0 +1,41 @@
> +# SPDX-License-Identifier: GPL-2.0
> +# ===========================================================================
> +# Post-link x86 pass
> +# ===========================================================================
> +#
> +# 1. Separate relocations from vmlinux into vmlinux.relocs.
> +# 2. Strip relocations from vmlinux.
> +
> +PHONY := __archpost
> +__archpost:
> +
> +-include include/config/auto.conf
> +include $(srctree)/scripts/Kbuild.include
> +
> +CMD_RELOCS = arch/x86/tools/relocs
> +quiet_cmd_relocs = RELOCS  $@.relocs
> +      cmd_relocs = $(CMD_RELOCS) $@ > $@.relocs;$(CMD_RELOCS) --abs-relocs $@
> +
> +quiet_cmd_strip_relocs = RSTRIP  $@
> +      cmd_strip_relocs = $(OBJCOPY) --remove-section='.rel.*' --remove-section='.rel__*' --remove-section='.rela.*' --remove-section='.rela__*' $@

This line is pretty long (146 chars); can you use \ to wrap it?
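
As a rough illustration of that kind of wrapping, here is the same
objcopy invocation written as a standalone shell function with one
--remove-section per line.  The function name and the OBJCOPY fallback
are assumptions for the sketch, not part of the patch:

```shell
#!/bin/sh
# Sketch only: strip_relocs is a hypothetical name.  The options are
# the ones from the patch; OBJCOPY falls back to plain "objcopy".
strip_relocs() {
	"${OBJCOPY:-objcopy}" \
		--remove-section='.rel.*' \
		--remove-section='.rel__*' \
		--remove-section='.rela.*' \
		--remove-section='.rela__*' \
		"$1"
}

# Usage (modifies the file in place):
#   strip_relocs vmlinux
```

Trailing backslashes work the same way inside a Makefile variable
definition, where make joins the continued lines with a single space.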

> +
> +# `@true` prevents complaint when there is nothing to be done
> +
> +vmlinux: FORCE
> +       @true
> +ifeq ($(CONFIG_X86_NEED_RELOCS),y)
> +       $(call cmd,relocs)
> +       $(call cmd,strip_relocs)
> +endif
> +
> +%.ko: FORCE
> +       @true
> +
> +clean:
> +       @rm -f vmlinux.relocs
> +
> +PHONY += FORCE clean
> +
> +FORCE:
> +
> +.PHONY: $(PHONY)
> diff --git a/arch/x86/boot/compressed/.gitignore b/arch/x86/boot/compressed/.gitignore
> index 25805199a506..b2968175fc27 100644
> --- a/arch/x86/boot/compressed/.gitignore
> +++ b/arch/x86/boot/compressed/.gitignore
> @@ -1,7 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  relocs
>  vmlinux.bin.all
> -vmlinux.relocs
>  vmlinux.lds
>  mkpiggy
>  piggy.S
> diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
> index 6b6cfe607bdb..19d1fb601796 100644
> --- a/arch/x86/boot/compressed/Makefile
> +++ b/arch/x86/boot/compressed/Makefile
> @@ -121,14 +121,12 @@ $(obj)/vmlinux.bin: vmlinux FORCE
>
>  targets += $(patsubst $(obj)/%,%,$(vmlinux-objs-y)) vmlinux.bin.all vmlinux.relocs
>
> -CMD_RELOCS = arch/x86/tools/relocs
> -quiet_cmd_relocs = RELOCS  $@
> -      cmd_relocs = $(CMD_RELOCS) $< > $@;$(CMD_RELOCS) --abs-relocs $<
> -$(obj)/vmlinux.relocs: vmlinux FORCE
> -       $(call if_changed,relocs)
> +# vmlinux.relocs is created by the vmlinux postlink step.
> +vmlinux.relocs: vmlinux
> +       @true
>
>  vmlinux.bin.all-y := $(obj)/vmlinux.bin
> -vmlinux.bin.all-$(CONFIG_X86_NEED_RELOCS) += $(obj)/vmlinux.relocs
> +vmlinux.bin.all-$(CONFIG_X86_NEED_RELOCS) += vmlinux.relocs

Why do you remove $(obj) here? I'm guessing that's why you moved
vmlinux.relocs between .gitignore files?

>
>  $(obj)/vmlinux.bin.gz: $(vmlinux.bin.all-y) FORCE
>         $(call if_changed,gzip)
> --
> 2.35.3
>


-- 
Thanks,
~Nick Desaulniers
