From: Ingo Molnar <mingo@kernel.org>
To: Uros Bizjak <ubizjak@gmail.com>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
	Andy Lutomirski <luto@kernel.org>, Nadav Amit <namit@vmware.com>,
	Brian Gerst <brgerst@gmail.com>,
	Denys Vlasenko <dvlasenk@redhat.com>,
	"H . Peter Anvin" <hpa@zytor.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Borislav Petkov <bp@alien8.de>,
	Josh Poimboeuf <jpoimboe@redhat.com>
Subject: Re: [PATCH 4/4] x86/percpu: Use C for percpu read/write accessors
Date: Wed, 4 Oct 2023 18:40:42 +0200
Message-ID: <ZR2VitjPb6Miksim@gmail.com>
In-Reply-To: <ZR2U4DLycLT5xFH6@gmail.com>


* Ingo Molnar <mingo@kernel.org> wrote:

> 
> * Uros Bizjak <ubizjak@gmail.com> wrote:
> 
> > The percpu code mostly uses inline assembly. Using segment qualifiers
> > allows to use C code instead, which enables the compiler to perform
> > various optimizations (e.g. propagation of memory arguments). Convert
> > percpu read and write accessors to C code, so the memory argument can
> > be propagated to the instruction that uses this argument.
> > 
> > Some examples of propagations:
> > 
> > a) into sign/zero extensions:
> > 
> >  110b54:       65 0f b6 05 00 00 00    movzbl %gs:0x0(%rip),%eax
> >  11ab90:       65 0f b6 15 00 00 00    movzbl %gs:0x0(%rip),%edx
> >  14484a:       65 0f b7 35 00 00 00    movzwl %gs:0x0(%rip),%esi
> >  1a08a9:       65 0f b6 43 78          movzbl %gs:0x78(%rbx),%eax
> >  1a08f9:       65 0f b6 43 78          movzbl %gs:0x78(%rbx),%eax
> > 
> >  4ab29a:       65 48 63 15 00 00 00    movslq %gs:0x0(%rip),%rdx
> >  4be128:       65 4c 63 25 00 00 00    movslq %gs:0x0(%rip),%r12
> >  547468:       65 48 63 1f             movslq %gs:(%rdi),%rbx
> >  5474e7:       65 48 63 0a             movslq %gs:(%rdx),%rcx
> >  54d05d:       65 48 63 0d 00 00 00    movslq %gs:0x0(%rip),%rcx
> 
> Could you please also quote a 'before' assembly sequence, at least once
> per group of propagations?

I.e., for any changes to x86 code generation, please follow the changelog
format of:

   7c097ca50d2b ("x86/percpu: Do not clobber %rsi in percpu_{try_,}cmpxchg{64,128}_op")

   ...
	Move the load of %rsi outside inline asm, so the compiler can
	reuse the value. The code in slub.o improves from:

	    55ac:	49 8b 3c 24          	mov    (%r12),%rdi
	    55b0:	48 8d 4a 40          	lea    0x40(%rdx),%rcx
	    55b4:	49 8b 1c 07          	mov    (%r15,%rax,1),%rbx
	    55b8:	4c 89 f8             	mov    %r15,%rax
	    55bb:	48 8d 37             	lea    (%rdi),%rsi
	    55be:	e8 00 00 00 00       	callq  55c3 <...>
				55bf: R_X86_64_PLT32	this_cpu_cmpxchg16b_emu-0x4
	    55c3:	75 a3                	jne    5568 <...>
	    55c5:	...

	 0000000000000000 <.altinstr_replacement>:
	   5:	65 48 0f c7 0f       	cmpxchg16b %gs:(%rdi)

	to:

	    55ac:	49 8b 34 24          	mov    (%r12),%rsi
	    55b0:	48 8d 4a 40          	lea    0x40(%rdx),%rcx
	    55b4:	49 8b 1c 07          	mov    (%r15,%rax,1),%rbx
	    55b8:	4c 89 f8             	mov    %r15,%rax
	    55bb:	e8 00 00 00 00       	callq  55c0 <...>
				55bc: R_X86_64_PLT32	this_cpu_cmpxchg16b_emu-0x4
	    55c0:	75 a6                	jne    5568 <...>
	    55c2:	...

	Where the alternative replacement instruction now uses %rsi:

	 0000000000000000 <.altinstr_replacement>:
	   5:	65 48 0f c7 0e       	cmpxchg16b %gs:(%rsi)

	The instruction (effectively a reg-reg move) at 55bb: in the original
	assembly is removed. Also, both the CALL and replacement CMPXCHG16B
	are 5 bytes long, removing the need for NOPs in the asm code.
   ...

Thanks,

	Ingo
