linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Uros Bizjak <ubizjak@gmail.com>
To: x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Uros Bizjak <ubizjak@gmail.com>,
	Andy Lutomirski <luto@kernel.org>, Ingo Molnar <mingo@kernel.org>,
	Nadav Amit <namit@vmware.com>, Brian Gerst <brgerst@gmail.com>,
	Denys Vlasenko <dvlasenk@redhat.com>,
	"H . Peter Anvin" <hpa@zytor.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Borislav Petkov <bp@alien8.de>,
	Josh Poimboeuf <jpoimboe@redhat.com>
Subject: [PATCH 0/4] x86/percpu: Use segment qualifiers
Date: Wed,  4 Oct 2023 16:49:40 +0200	[thread overview]
Message-ID: <20231004145137.86537-1-ubizjak@gmail.com> (raw)

This patchset resurrect the work of Richard Henderson [1] and Nadav
Amit [2] to introduce named address spaces compiler extension [3,4]
into the linux kernel.

On the x86 target, variables may be declared as being relative to
the %fs or %gs segments.

__seg_fs
__seg_gs

The object is accessed with the respective segment override prefix.

The following patchset takes a bit more cautious approach and converts
only moves, currently implemented as an asm, to generic moves to/from
named address space. The compiler is then able to propagate memory
arguments into instructions that use these memory references, producing
more compact assembly, in addition to avoiding using a register as a
temporary to hold value from the memory.

The patchset enables propagation of hundreds of memory arguments,
resulting in the cumulative code size reduction of 7.94kB (please note
that the kernel is compiled with -O2, so the code size is not entirely
correct measure; some parts of the code can now be duplicated for
better performance due to -O2, etc...).

Some examples of propagations:

a) into sign/zero extensions:

 110b54:       65 0f b6 05 00 00 00    movzbl %gs:0x0(%rip),%eax
 11ab90:       65 0f b6 15 00 00 00    movzbl %gs:0x0(%rip),%edx
 14484a:       65 0f b7 35 00 00 00    movzwl %gs:0x0(%rip),%esi
 1a08a9:       65 0f b6 43 78          movzbl %gs:0x78(%rbx),%eax
 1a08f9:       65 0f b6 43 78          movzbl %gs:0x78(%rbx),%eax

 4ab29a:       65 48 63 15 00 00 00    movslq %gs:0x0(%rip),%rdx
 4be128:       65 4c 63 25 00 00 00    movslq %gs:0x0(%rip),%r12
 547468:       65 48 63 1f             movslq %gs:(%rdi),%rbx
 5474e7:       65 48 63 0a             movslq %gs:(%rdx),%rcx
 54d05d:       65 48 63 0d 00 00 00    movslq %gs:0x0(%rip),%rcx

b) into compares:

 b40804:       65 f7 05 00 00 00 00    testl  $0xf0000,%gs:0x0(%rip)
 b487e8:       65 f7 05 00 00 00 00    testl  $0xf0000,%gs:0x0(%rip)
 b6f14c:       65 f6 05 00 00 00 00    testb  $0x1,%gs:0x0(%rip)
 bac1b8:       65 f6 05 00 00 00 00    testb  $0x1,%gs:0x0(%rip)
 df2244:       65 f7 05 00 00 00 00    testl  $0xff00,%gs:0x0(%rip)

 9a7517:       65 80 3d 00 00 00 00    cmpb   $0x0,%gs:0x0(%rip)
 b282ba:       65 44 3b 35 00 00 00    cmp    %gs:0x0(%rip),%r14d
 b48f61:       65 66 83 3d 00 00 00    cmpw   $0x8,%gs:0x0(%rip)
 b493fe:       65 80 38 00             cmpb   $0x0,%gs:(%rax)
 b73867:       65 66 83 3d 00 00 00    cmpw   $0x8,%gs:0x0(%rip)

c) into other insns:

 65ec02:       65 0f 44 15 00 00 00    cmove  %gs:0x0(%rip),%edx
 6c98ac:       65 0f 44 15 00 00 00    cmove  %gs:0x0(%rip),%edx
 9aafaf:       65 0f 44 15 00 00 00    cmove  %gs:0x0(%rip),%edx
 b45868:       65 0f 48 35 00 00 00    cmovs  %gs:0x0(%rip),%esi
 d276f8:       65 0f 44 15 00 00 00    cmove  %gs:0x0(%rip),%edx

The above propagations result in the following code size
improvements for current mainline kernel (with the default config),
compiled with

gcc (GCC) 12.3.1 20230508 (Red Hat 12.3.1-1)

   text    data     bss     dec     hex filename
25508862        4386540  808388 30703790        1d480ae vmlinux-vanilla.o
25500922        4386532  808388 30695842        1d461a2 vmlinux-new.o

The conversion of other read-modify-write instructions does not bring
us any benefits, the compiler has some problems when constructing RMW
instructions from the generic code and easily misses some opportunities.

There are other optimizations possible involving arch_raw_cpu_ptr and
aggressive caching of current that are implemented in the original
patch series. These can be implemented as follow-ups at some later
time.

The patcshet was tested on Fedora 38 with kernel 6.5.5 and gcc 13.2.1
(In fact, I'm writing this message on the patched kernel.)

[1] https://lore.kernel.org/lkml/1454483253-11246-1-git-send-email-rth@twiddle.net/
[2] https://lore.kernel.org/lkml/20190823224424.15296-1-namit@vmware.com/
[3] https://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html
[4] https://clang.llvm.org/docs/LanguageExtensions.html#target-specific-extensions

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Nadav Amit <namit@vmware.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>

Uros Bizjak (4):
  x86/percpu: Update arch/x86/include/asm/percpu.h to the current tip
  x86/percpu: Enable named address spaces with known compiler version
  x86/percpu: Use compiler segment prefix qualifier
  x86/percpu: Use C for percpu read/write accessors

 arch/x86/Kconfig               |   7 +
 arch/x86/include/asm/percpu.h  | 237 ++++++++++++++++++++++++++++-----
 arch/x86/include/asm/preempt.h |   2 +-
 3 files changed, 209 insertions(+), 37 deletions(-)

-- 
2.41.0


             reply	other threads:[~2023-10-04 14:51 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-04 14:49 Uros Bizjak [this message]
2023-10-04 14:49 ` [PATCH 1/4] x86/percpu: Update arch/x86/include/asm/percpu.h to the current tip Uros Bizjak
2023-10-04 14:49 ` [PATCH 2/4] x86/percpu: Enable named address spaces with known compiler version Uros Bizjak
2023-10-05  7:20   ` [tip: x86/percpu] " tip-bot2 for Uros Bizjak
2023-10-04 14:49 ` [PATCH 3/4] x86/percpu: Use compiler segment prefix qualifier Uros Bizjak
2023-10-05  7:20   ` [tip: x86/percpu] " tip-bot2 for Nadav Amit
2023-10-04 14:49 ` [PATCH 4/4] x86/percpu: Use C for percpu read/write accessors Uros Bizjak
2023-10-04 16:37   ` Ingo Molnar
2023-10-04 16:40     ` Ingo Molnar
2023-10-04 19:23     ` [PATCH v2 " Uros Bizjak
2023-10-04 19:42       ` Linus Torvalds
2023-10-04 20:07         ` Uros Bizjak
2023-10-04 20:12           ` Linus Torvalds
2023-10-04 20:19             ` Linus Torvalds
2023-10-04 20:22               ` Uros Bizjak
2023-10-05  7:06       ` Ingo Molnar
2023-10-05  7:40         ` Uros Bizjak
2023-10-05  7:20       ` [tip: x86/percpu] " tip-bot2 for Uros Bizjak
2023-10-08 17:59   ` [PATCH 4/4] " Linus Torvalds
2023-10-08 19:17     ` Uros Bizjak
2023-10-08 20:13       ` Linus Torvalds
2023-10-08 20:48         ` Linus Torvalds
2023-10-08 21:41           ` Uros Bizjak
2023-10-09 11:41             ` Ingo Molnar
2023-10-09 11:51               ` Ingo Molnar
2023-10-09 12:00                 ` Uros Bizjak
2023-10-09 12:20                   ` Ingo Molnar
2023-10-09 12:21                   ` Nadav Amit
2023-10-09 12:42                     ` Uros Bizjak
2023-10-09 12:53                       ` Nadav Amit
2023-10-09 12:27               ` Uros Bizjak
2023-10-09 14:35               ` Uros Bizjak
2024-04-10 11:11                 ` Andrey Konovalov
2024-04-10 11:21                   ` Uros Bizjak
2024-04-10 11:24                     ` Andrey Konovalov
2023-10-09 11:42       ` Ingo Molnar
2023-10-10  6:37     ` Uros Bizjak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231004145137.86537-1-ubizjak@gmail.com \
    --to=ubizjak@gmail.com \
    --cc=bp@alien8.de \
    --cc=brgerst@gmail.com \
    --cc=dvlasenk@redhat.com \
    --cc=hpa@zytor.com \
    --cc=jpoimboe@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@kernel.org \
    --cc=namit@vmware.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).