Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / Atom feed
From: "Fāng-ruì Sòng" <maskray@google.com>
To: Arvind Sankar <nivedita@alum.mit.edu>, Kees Cook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>,
	linux-efi@vger.kernel.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	Manoj Gupta <manojgupta@google.com>,
	Will Deacon <will@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-arch@vger.kernel.org, Andi Kleen <ak@linux.intel.com>,
	Masahiro Yamada <masahiroy@kernel.org>,
	x86@kernel.org, Russell King <linux@armlinux.org.uk>,
	Ard Biesheuvel <ardb@kernel.org>,
	clang-built-linux@googlegroups.com,
	Ingo Molnar <mingo@redhat.com>, Luis Lozano <llozano@google.com>,
	Borislav Petkov <bp@suse.de>, Arnd Bergmann <arnd@arndb.de>,
	Jian Cai <jiancai@google.com>,
	Nathan Chancellor <natechancellor@gmail.com>,
	Peter Collingbourne <pcc@google.com>,
	linux-arm-kernel@lists.infradead.org,
	Michal Marek <michal.lkml@markovi.net>,
	Nick Desaulniers <ndesaulniers@google.com>,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org,
	James Morse <james.morse@arm.com>
Subject: Re: [PATCH v5 13/36] vmlinux.lds.h: add PGO and AutoFDO input sections
Date: Mon, 3 Aug 2020 18:19:35 -0700
Message-ID: <20200804011935.b4asdxdxwvwic7js@google.com> (raw)
In-Reply-To: <20200803201525.GA1351390@rani.riverdale.lan>

On 2020-08-03, Arvind Sankar wrote:
>On Mon, Aug 03, 2020 at 12:05:06PM -0700, Andi Kleen wrote:
>> > However, the history of their being together comes from
>> >
>> >   9bebe9e5b0f3 ("kbuild: Fix .text.unlikely placement")
>> >
>> > which seems to indicate there was some problem with having them separated out,
>> > although I don't quite understand what the issue was from the commit message.
>>
>> Separating it out is less efficient. Gives worse packing for the hot part
>> if they are not aligned to 64byte boundaries, which they are usually not.
>>
>> It also improves packing of the cold part, but that probably doesn't matter.
>>
>> -Andi
>
>Why is that? Both .text and .text.hot have alignment of 2^4 (default
>function alignment on x86) by default, so it doesn't seem like it should
>matter for packing density.  Avoiding interspersing cold text among
>regular/hot text seems like it should be a net win.
>
>That old commit doesn't reference efficiency -- it says there was some
>problem with matching when they were separated out, but there were no
>wildcard section names back then.

I just want to share some context. GNU ld's internal linker script does
impose a particular input section order by specifying separate input section descriptions:

   .text           :
   {
     *(.text.unlikely .text.*_unlikely .text.unlikely.*)
     *(.text.exit .text.exit.*)
     *(.text.startup .text.startup.*)
     *(.text.hot .text.hot.*)
     *(SORT(.text.sorted.*))   # binutils 5fa5f8f5fe494ba4fe98c11899a5464cd164ec75, invented for GCC's call graph profiling. LLVM doesn't use it
     *(.text .stub .text.* .gnu.linkonce.t.*)
     ...

This order is a bit arbitrary. gold and LLD have -z keep-text-section-prefix. With the option,
there can be several output sections, with the '.unlikely'/'.exit'/'.startup'/etc suffix.
This has the advantage that the hot/unlikely/exit/etc attribution of a particular function is more obvious:

   [ 2] .text             PROGBITS        000000000040007c 00007c 000003 00  AX  0   0  4
   [ 3] .text.startup     PROGBITS        000000000040007f 00007f 000001 00  AX  0   0  1
   [ 4] .text.exit        PROGBITS        0000000000400080 000080 000002 00  AX  0   0  1
   [ 5] .text.unlikely    PROGBITS        0000000000400082 000082 000001 00  AX  0   0  1 
   ...

In our case we only need one output section.......

If we place all text sections in one input section description:

     *(.text.unlikely .text.*_unlikely .text.exit .text.exit.* .text.startup .text.startup.* .text.hot .text.hot.* ... )

In many cases the input sections are laid out in the input order. In LLD there are two ordering cases:

* If clang PGO (-fprofile-use=) is enabled, .llvm.call-graph-profile will be created automatically.
   LLD can perform reordering **within an input section description**. The ordering is quite complex,
   you can read https://github.com/llvm/llvm-project/blob/master/lld/ELF/CallGraphSort.cpp#L9 if you are curious:)

   I don't know the performance improvement of this heuristic. (I don't think the original paper
   cgo2017-hfsort-final1.pdf took ThinLTO into account, so the result might not
   reflect realistic work loads where both ThinLTO and PGO are used) This, if matters, likely only matters
   for very large executable, not the case for the kernel.
* On some RISC architectures (ARM/AArch64/PowerPC),
   the ordered sections (due to either .llvm.call-graph-profile or
   --symbol-reordering-file=; the two can't be used together) are placed in a
   suitable place in the input section description
   ( http://reviews.llvm.org/D44969 )


In summary, using one (large) input section description may have some
performance improvement with LLD but I don't think it will be significant.
There may be some size improvement for ARM/AArch64/PowerPC if someone wants to test.

>commit 9bebe9e5b0f3109a14000df25308c2971f872605
>Author: Andi Kleen <ak@linux.intel.com>
>Date:   Sun Jul 19 18:01:19 2015 -0700
>
>    kbuild: Fix .text.unlikely placement
>
>    When building a kernel with .text.unlikely text the unlikely text for
>    each translation unit was put next to the main .text code in the
>    final vmlinux.
>
>    The problem is that the linker doesn't allow more specific submatches
>    of a section name in a different linker script statement after the
>    main match.
>
>    So we need to move them all into one line. With that change
>    .text.unlikely is at the end of everything again.
>
>    I also moved .text.hot into the same statement though, even though
>    that's not strictly needed.
>
>    Signed-off-by: Andi Kleen <ak@linux.intel.com>
>    Signed-off-by: Michal Marek <mmarek@suse.com>
>
>diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
>index 8bd374d3cf21..1781e54ea6d3 100644
>--- a/include/asm-generic/vmlinux.lds.h
>+++ b/include/asm-generic/vmlinux.lds.h
>@@ -412,12 +412,10 @@
>  * during second ld run in second ld pass when generating System.map */
> #define TEXT_TEXT							\
> 		ALIGN_FUNCTION();					\
>-		*(.text.hot)						\
>-		*(.text .text.fixup)					\
>+		*(.text.hot .text .text.fixup .text.unlikely)		\
> 		*(.ref.text)						\
> 	MEM_KEEP(init.text)						\
> 	MEM_KEEP(exit.text)						\
>-		*(.text.unlikely)
>
>
> /* sched.text is aling to function alignment to secure we have same

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply index

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-31 23:07 [PATCH v5 00/36] Warn on orphan section placement Kees Cook
2020-07-31 23:07 ` [PATCH v5 01/36] x86/boot/compressed: Move .got.plt entries out of the .got section Kees Cook
2020-07-31 23:07 ` [PATCH v5 02/36] x86/boot/compressed: Force hidden visibility for all symbol references Kees Cook
2020-07-31 23:07 ` [PATCH v5 03/36] x86/boot/compressed: Get rid of GOT fixup code Kees Cook
2020-07-31 23:07 ` [PATCH v5 04/36] x86/boot: Add .text.* to setup.ld Kees Cook
2020-07-31 23:07 ` [PATCH v5 05/36] x86/boot: Remove run-time relocations from .head.text code Kees Cook
2020-07-31 23:42   ` Nick Desaulniers
2020-07-31 23:07 ` [PATCH v5 06/36] x86/boot: Remove run-time relocations from head_{32, 64}.S Kees Cook
2020-08-07 18:12   ` Nick Desaulniers
2020-08-07 20:20     ` [PATCH v5 06/36] x86/boot: Remove run-time relocations from head_{32,64}.S Arvind Sankar
2020-07-31 23:07 ` [PATCH v5 07/36] x86/boot: Check that there are no run-time relocations Kees Cook
2020-07-31 23:07 ` [PATCH v5 08/36] vmlinux.lds.h: Create COMMON_DISCARDS Kees Cook
2020-07-31 23:07 ` [PATCH v5 09/36] vmlinux.lds.h: Add .gnu.version* to COMMON_DISCARDS Kees Cook
2020-07-31 23:07 ` [PATCH v5 10/36] vmlinux.lds.h: Avoid KASAN and KCSAN's unwanted sections Kees Cook
2020-07-31 23:07 ` [PATCH v5 11/36] vmlinux.lds.h: Split ELF_DETAILS from STABS_DEBUG Kees Cook
2020-07-31 23:07 ` [PATCH v5 12/36] vmlinux.lds.h: Add .symtab, .strtab, and .shstrtab to ELF_DETAILS Kees Cook
2020-07-31 23:07 ` [PATCH v5 13/36] vmlinux.lds.h: add PGO and AutoFDO input sections Kees Cook
2020-08-01  3:51   ` Arvind Sankar
2020-08-01  6:18     ` Kees Cook
2020-08-01 17:27       ` Arvind Sankar
2020-08-03 19:05     ` Andi Kleen
2020-08-03 20:15       ` Arvind Sankar
2020-08-04  1:19         ` Fāng-ruì Sòng [this message]
2020-08-04  4:45         ` Andi Kleen
2020-08-04  5:32           ` Fāng-ruì Sòng
2020-08-04 16:06           ` Arvind Sankar
2020-08-21 19:18             ` Kees Cook
2020-07-31 23:07 ` [PATCH v5 14/36] efi/libstub: Disable -mbranch-protection Kees Cook
2020-07-31 23:07 ` [PATCH v5 15/36] arm64/mm: Remove needless section quotes Kees Cook
2020-07-31 23:08 ` [PATCH v5 16/36] arm64/kernel: Remove needless Call Frame Information annotations Kees Cook
2020-07-31 23:08 ` [PATCH v5 17/36] arm64/build: Remove .eh_frame* sections due to unwind tables Kees Cook
2020-07-31 23:08 ` [PATCH v5 18/36] arm64/build: Use common DISCARDS in linker script Kees Cook
2020-07-31 23:08 ` [PATCH v5 19/36] arm64/build: Add missing DWARF sections Kees Cook
2020-07-31 23:08 ` [PATCH v5 20/36] arm64/build: Assert for unwanted sections Kees Cook
2020-07-31 23:08 ` [PATCH v5 21/36] arm64/build: Warn on orphan section placement Kees Cook
2020-07-31 23:08 ` [PATCH v5 22/36] arm/build: Refactor linker script headers Kees Cook
2020-07-31 23:08 ` [PATCH v5 23/36] arm/build: Explicitly keep .ARM.attributes sections Kees Cook
2020-08-03 19:02   ` Nick Desaulniers
2020-08-17 22:06     ` Fangrui Song
2020-07-31 23:08 ` [PATCH v5 24/36] arm/build: Add missing sections Kees Cook
2020-07-31 23:08 ` [PATCH v5 25/36] arm/build: Warn on orphan section placement Kees Cook
2020-07-31 23:08 ` [PATCH v5 26/36] arm/boot: Handle all sections explicitly Kees Cook
2020-07-31 23:08 ` [PATCH v5 27/36] arm/boot: Warn on orphan section placement Kees Cook
2020-07-31 23:08 ` [PATCH v5 28/36] x86/asm: Avoid generating unused kprobe sections Kees Cook
2020-07-31 23:08 ` [PATCH v5 29/36] x86/build: Enforce an empty .got.plt section Kees Cook
2020-08-01  2:12   ` Arvind Sankar
2020-08-01  5:32     ` Kees Cook
2020-08-21 17:49     ` Kees Cook
2020-07-31 23:08 ` [PATCH v5 30/36] x86/build: Assert for unwanted sections Kees Cook
2020-07-31 23:08 ` [PATCH v5 31/36] x86/build: Warn on orphan section placement Kees Cook
2020-07-31 23:08 ` [PATCH v5 32/36] x86/boot/compressed: Reorganize zero-size section asserts Kees Cook
2020-08-01  1:47   ` Arvind Sankar
2020-08-01  2:53     ` Arvind Sankar
2020-08-01  5:36       ` Kees Cook
2020-08-01 17:12         ` Arvind Sankar
2020-08-21 18:24           ` Kees Cook
2020-08-01  5:35     ` Kees Cook
2020-08-01 17:00       ` Arvind Sankar
2020-08-21 18:19     ` Kees Cook
2020-07-31 23:08 ` [PATCH v5 33/36] x86/boot/compressed: Remove, discard, or assert for unwanted sections Kees Cook
2020-07-31 23:08 ` [PATCH v5 34/36] x86/boot/compressed: Add missing debugging sections to output Kees Cook
2020-07-31 23:08 ` [PATCH v5 35/36] x86/boot/compressed: Warn on orphan section placement Kees Cook
2020-07-31 23:08 ` [PATCH v5 36/36] arm/build: Assert for unwanted sections Kees Cook

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200804011935.b4asdxdxwvwic7js@google.com \
    --to=maskray@google.com \
    --cc=ak@linux.intel.com \
    --cc=ardb@kernel.org \
    --cc=arnd@arndb.de \
    --cc=bp@suse.de \
    --cc=catalin.marinas@arm.com \
    --cc=clang-built-linux@googlegroups.com \
    --cc=james.morse@arm.com \
    --cc=jiancai@google.com \
    --cc=keescook@chromium.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-efi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=llozano@google.com \
    --cc=manojgupta@google.com \
    --cc=mark.rutland@arm.com \
    --cc=masahiroy@kernel.org \
    --cc=michal.lkml@markovi.net \
    --cc=mingo@redhat.com \
    --cc=natechancellor@gmail.com \
    --cc=ndesaulniers@google.com \
    --cc=nivedita@alum.mit.edu \
    --cc=pcc@google.com \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-ARM-Kernel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-arm-kernel/0 linux-arm-kernel/git/0.git
	git clone --mirror https://lore.kernel.org/linux-arm-kernel/1 linux-arm-kernel/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-arm-kernel linux-arm-kernel/ https://lore.kernel.org/linux-arm-kernel \
		linux-arm-kernel@lists.infradead.org
	public-inbox-index linux-arm-kernel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.infradead.lists.linux-arm-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git