linux-kernel.vger.kernel.org archive mirror
From: Ingo Molnar <mingo@kernel.org>
To: Uros Bizjak <ubizjak@gmail.com>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
	Andy Lutomirski <luto@kernel.org>, Nadav Amit <namit@vmware.com>,
	Brian Gerst <brgerst@gmail.com>,
	Denys Vlasenko <dvlasenk@redhat.com>,
	"H . Peter Anvin" <hpa@zytor.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Borislav Petkov <bp@alien8.de>,
	Josh Poimboeuf <jpoimboe@redhat.com>
Subject: Re: [PATCH 4/4] x86/percpu: Use C for percpu read/write accessors
Date: Wed, 4 Oct 2023 18:40:42 +0200
Message-ID: <ZR2VitjPb6Miksim@gmail.com>
In-Reply-To: <ZR2U4DLycLT5xFH6@gmail.com>


* Ingo Molnar <mingo@kernel.org> wrote:

> 
> * Uros Bizjak <ubizjak@gmail.com> wrote:
> 
> > The percpu code mostly uses inline assembly. Using segment qualifiers
> > allows C code to be used instead, which enables the compiler to perform
> > various optimizations (e.g. propagation of memory arguments). Convert
> > percpu read and write accessors to C code, so the memory argument can
> > be propagated to the instruction that uses this argument.
> > 
> > Some examples of propagations:
> > 
> > a) into sign/zero extensions:
> > 
> >  110b54:       65 0f b6 05 00 00 00    movzbl %gs:0x0(%rip),%eax
> >  11ab90:       65 0f b6 15 00 00 00    movzbl %gs:0x0(%rip),%edx
> >  14484a:       65 0f b7 35 00 00 00    movzwl %gs:0x0(%rip),%esi
> >  1a08a9:       65 0f b6 43 78          movzbl %gs:0x78(%rbx),%eax
> >  1a08f9:       65 0f b6 43 78          movzbl %gs:0x78(%rbx),%eax
> > 
> >  4ab29a:       65 48 63 15 00 00 00    movslq %gs:0x0(%rip),%rdx
> >  4be128:       65 4c 63 25 00 00 00    movslq %gs:0x0(%rip),%r12
> >  547468:       65 48 63 1f             movslq %gs:(%rdi),%rbx
> >  5474e7:       65 48 63 0a             movslq %gs:(%rdx),%rcx
> >  54d05d:       65 48 63 0d 00 00 00    movslq %gs:0x0(%rip),%rcx
> 
> Could you please also quote a 'before' assembly sequence, at least once
> per group of propagations?

I.e., for any changes to x86 code generation, please follow the changelog
format of:

   7c097ca50d2b ("x86/percpu: Do not clobber %rsi in percpu_{try_,}cmpxchg{64,128}_op")

   ...
	Move the load of %rsi outside inline asm, so the compiler can
	reuse the value. The code in slub.o improves from:

	    55ac:	49 8b 3c 24          	mov    (%r12),%rdi
	    55b0:	48 8d 4a 40          	lea    0x40(%rdx),%rcx
	    55b4:	49 8b 1c 07          	mov    (%r15,%rax,1),%rbx
	    55b8:	4c 89 f8             	mov    %r15,%rax
	    55bb:	48 8d 37             	lea    (%rdi),%rsi
	    55be:	e8 00 00 00 00       	callq  55c3 <...>
				55bf: R_X86_64_PLT32	this_cpu_cmpxchg16b_emu-0x4
	    55c3:	75 a3                	jne    5568 <...>
	    55c5:	...

	 0000000000000000 <.altinstr_replacement>:
	   5:	65 48 0f c7 0f       	cmpxchg16b %gs:(%rdi)

	to:

	    55ac:	49 8b 34 24          	mov    (%r12),%rsi
	    55b0:	48 8d 4a 40          	lea    0x40(%rdx),%rcx
	    55b4:	49 8b 1c 07          	mov    (%r15,%rax,1),%rbx
	    55b8:	4c 89 f8             	mov    %r15,%rax
	    55bb:	e8 00 00 00 00       	callq  55c0 <...>
				55bc: R_X86_64_PLT32	this_cpu_cmpxchg16b_emu-0x4
	    55c0:	75 a6                	jne    5568 <...>
	    55c2:	...

	Where the alternative replacement instruction now uses %rsi:

	 0000000000000000 <.altinstr_replacement>:
	   5:	65 48 0f c7 0e       	cmpxchg16b %gs:(%rsi)

	The instruction (effectively a reg-reg move) at 55bb: in the original
	assembly is removed. Also, both the CALL and replacement CMPXCHG16B
	are 5 bytes long, removing the need for NOPs in the asm code.
   ...

Thanks,

	Ingo


Thread overview: 37+ messages
2023-10-04 14:49 [PATCH 0/4] x86/percpu: Use segment qualifiers Uros Bizjak
2023-10-04 14:49 ` [PATCH 1/4] x86/percpu: Update arch/x86/include/asm/percpu.h to the current tip Uros Bizjak
2023-10-04 14:49 ` [PATCH 2/4] x86/percpu: Enable named address spaces with known compiler version Uros Bizjak
2023-10-05  7:20   ` [tip: x86/percpu] " tip-bot2 for Uros Bizjak
2023-10-04 14:49 ` [PATCH 3/4] x86/percpu: Use compiler segment prefix qualifier Uros Bizjak
2023-10-05  7:20   ` [tip: x86/percpu] " tip-bot2 for Nadav Amit
2023-10-04 14:49 ` [PATCH 4/4] x86/percpu: Use C for percpu read/write accessors Uros Bizjak
2023-10-04 16:37   ` Ingo Molnar
2023-10-04 16:40     ` Ingo Molnar [this message]
2023-10-04 19:23     ` [PATCH v2 " Uros Bizjak
2023-10-04 19:42       ` Linus Torvalds
2023-10-04 20:07         ` Uros Bizjak
2023-10-04 20:12           ` Linus Torvalds
2023-10-04 20:19             ` Linus Torvalds
2023-10-04 20:22               ` Uros Bizjak
2023-10-05  7:06       ` Ingo Molnar
2023-10-05  7:40         ` Uros Bizjak
2023-10-05  7:20       ` [tip: x86/percpu] " tip-bot2 for Uros Bizjak
2023-10-08 17:59   ` [PATCH 4/4] " Linus Torvalds
2023-10-08 19:17     ` Uros Bizjak
2023-10-08 20:13       ` Linus Torvalds
2023-10-08 20:48         ` Linus Torvalds
2023-10-08 21:41           ` Uros Bizjak
2023-10-09 11:41             ` Ingo Molnar
2023-10-09 11:51               ` Ingo Molnar
2023-10-09 12:00                 ` Uros Bizjak
2023-10-09 12:20                   ` Ingo Molnar
2023-10-09 12:21                   ` Nadav Amit
2023-10-09 12:42                     ` Uros Bizjak
2023-10-09 12:53                       ` Nadav Amit
2023-10-09 12:27               ` Uros Bizjak
2023-10-09 14:35               ` Uros Bizjak
2024-04-10 11:11                 ` Andrey Konovalov
2024-04-10 11:21                   ` Uros Bizjak
2024-04-10 11:24                     ` Andrey Konovalov
2023-10-09 11:42       ` Ingo Molnar
2023-10-10  6:37     ` Uros Bizjak
