From: Ingo Molnar <mingo@kernel.org>
To: Uros Bizjak <ubizjak@gmail.com>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
Andy Lutomirski <luto@kernel.org>, Nadav Amit <namit@vmware.com>,
Brian Gerst <brgerst@gmail.com>,
Denys Vlasenko <dvlasenk@redhat.com>,
"H . Peter Anvin" <hpa@zytor.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>,
Borislav Petkov <bp@alien8.de>,
Josh Poimboeuf <jpoimboe@redhat.com>
Subject: Re: [PATCH 4/4] x86/percpu: Use C for percpu read/write accessors
Date: Wed, 4 Oct 2023 18:40:42 +0200
Message-ID: <ZR2VitjPb6Miksim@gmail.com>
In-Reply-To: <ZR2U4DLycLT5xFH6@gmail.com>
* Ingo Molnar <mingo@kernel.org> wrote:
>
> * Uros Bizjak <ubizjak@gmail.com> wrote:
>
> > The percpu code mostly uses inline assembly. Using segment qualifiers
> > allows C code to be used instead, which enables the compiler to perform
> > various optimizations (e.g. propagation of memory arguments). Convert
> > percpu read and write accessors to C code, so the memory argument can
> > be propagated to the instruction that uses this argument.
> >
> > Some examples of propagations:
> >
> > a) into sign/zero extensions:
> >
> > 110b54: 65 0f b6 05 00 00 00 movzbl %gs:0x0(%rip),%eax
> > 11ab90: 65 0f b6 15 00 00 00 movzbl %gs:0x0(%rip),%edx
> > 14484a: 65 0f b7 35 00 00 00 movzwl %gs:0x0(%rip),%esi
> > 1a08a9: 65 0f b6 43 78 movzbl %gs:0x78(%rbx),%eax
> > 1a08f9: 65 0f b6 43 78 movzbl %gs:0x78(%rbx),%eax
> >
> > 4ab29a: 65 48 63 15 00 00 00 movslq %gs:0x0(%rip),%rdx
> > 4be128: 65 4c 63 25 00 00 00 movslq %gs:0x0(%rip),%r12
> > 547468: 65 48 63 1f movslq %gs:(%rdi),%rbx
> > 5474e7: 65 48 63 0a movslq %gs:(%rdx),%rcx
> > 54d05d: 65 48 63 0d 00 00 00 movslq %gs:0x0(%rip),%rcx
>
> Could you please also quote a 'before' assembly sequence, at least once
> per group of propagations?
I.e., for any changes to x86 code generation, please follow the changelog
format of:
7c097ca50d2b ("x86/percpu: Do not clobber %rsi in percpu_{try_,}cmpxchg{64,128}_op")
...
Move the load of %rsi outside inline asm, so the compiler can
reuse the value. The code in slub.o improves from:
55ac: 49 8b 3c 24 mov (%r12),%rdi
55b0: 48 8d 4a 40 lea 0x40(%rdx),%rcx
55b4: 49 8b 1c 07 mov (%r15,%rax,1),%rbx
55b8: 4c 89 f8 mov %r15,%rax
55bb: 48 8d 37 lea (%rdi),%rsi
55be: e8 00 00 00 00 callq 55c3 <...>
55bf: R_X86_64_PLT32 this_cpu_cmpxchg16b_emu-0x4
55c3: 75 a3 jne 5568 <...>
55c5: ...
0000000000000000 <.altinstr_replacement>:
5: 65 48 0f c7 0f cmpxchg16b %gs:(%rdi)
to:
55ac: 49 8b 34 24 mov (%r12),%rsi
55b0: 48 8d 4a 40 lea 0x40(%rdx),%rcx
55b4: 49 8b 1c 07 mov (%r15,%rax,1),%rbx
55b8: 4c 89 f8 mov %r15,%rax
55bb: e8 00 00 00 00 callq 55c0 <...>
55bc: R_X86_64_PLT32 this_cpu_cmpxchg16b_emu-0x4
55c0: 75 a6 jne 5568 <...>
55c2: ...
Where the alternative replacement instruction now uses %rsi:
0000000000000000 <.altinstr_replacement>:
5: 65 48 0f c7 0e cmpxchg16b %gs:(%rsi)
The instruction (effectively a reg-reg move) at 55bb: in the original
assembly is removed. Also, both the CALL and replacement CMPXCHG16B
are 5 bytes long, removing the need for NOPs in the asm code.
...
Thanks,
Ingo