linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nadav Amit <namit@vmware.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Nadav Amit <namit@vmware.com>
Subject: [RFC 6/7] x86/percpu: Optimized arch_raw_cpu_ptr()
Date: Thu, 18 Jul 2019 10:41:09 -0700	[thread overview]
Message-ID: <20190718174110.4635-7-namit@vmware.com> (raw)
In-Reply-To: <20190718174110.4635-1-namit@vmware.com>

Implementing arch_raw_cpu_ptr() in C, allows the compiler to perform
better optimizations, such as setting an appropriate base to compute
the address instead of an add instruction.

The benefit of this computation is relevant only when using compiler
segment qualifiers. It is inapplicable to use this method when the
address size is greater than the maximum operand size, as it is when
building vdso32.

Distinguish between the two cases in which preemption is disabled (as
happens when this_cpu_ptr() is used) and enabled (when raw_cpu_ptr() is
used).

This allows optimizations, for instance in rcu_dynticks_eqs_exit(),
the following code:

  mov    $0x2bbc0,%rax
  add    %gs:0x7ef07570(%rip),%rax	# 0x10358 <this_cpu_off>
  lock xadd %edx,0xd8(%rax)

Turns with this patch into:

  mov    %gs:0x7ef08aa5(%rip),%rax	# 0x10358 <this_cpu_off>
  lock xadd %edx,0x2bc58(%rax)

Signed-off-by: Nadav Amit <namit@vmware.com>
---
 arch/x86/include/asm/percpu.h | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 13987f9bc82f..8bac7db397cc 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -73,20 +73,41 @@
 #endif /* USE_X86_SEG_SUPPORT */
 
 #define __my_cpu_offset		this_cpu_read(this_cpu_off)
+#define __raw_my_cpu_offset	__this_cpu_read(this_cpu_off)
+#define __my_cpu_ptr(ptr)	(__my_cpu_type(*ptr) *)(uintptr_t)(ptr)
 
+#if USE_X86_SEG_SUPPORT && (!defined(BUILD_VDSO32) || defined(CONFIG_X86_64))
+/*
+ * Efficient implementation for cases in which the compiler supports C segments.
+ * Allows the compiler to perform additional optimizations that can save more
+ * instructions.
+ *
+ * This optimized version can only be used if the pointer size equals to native
+ * operand size, which does not happen when vdso32 is used.
+ */
+#define __arch_raw_cpu_ptr_qual(qual, ptr)				\
+({									\
+	(qual typeof(*(ptr)) __kernel __force *)((uintptr_t)(ptr) +	\
+						__my_cpu_offset);	\
+})
+#else /* USE_X86_SEG_SUPPORT && (!defined(BUILD_VDSO32) || defined(CONFIG_X86_64)) */
 /*
  * Compared to the generic __my_cpu_offset version, the following
  * saves one instruction and avoids clobbering a temp register.
  */
-#define arch_raw_cpu_ptr(ptr)				\
+#define __arch_raw_cpu_ptr_qual(qual, ptr)		\
 ({							\
 	unsigned long tcp_ptr__;			\
-	asm volatile("add " __percpu_arg(1) ", %0"	\
+	asm qual ("add " __percpu_arg(1) ", %0"		\
 		     : "=r" (tcp_ptr__)			\
 		     : "m" (__my_cpu_var(this_cpu_off)),\
 		       "0" (ptr));			\
 	(typeof(*(ptr)) __kernel __force *)tcp_ptr__;	\
 })
+#endif /* USE_X86_SEG_SUPPORT && (!defined(BUILD_VDSO32) || defined(CONFIG_X86_64)) */
+
+#define arch_raw_cpu_ptr(ptr)				__arch_raw_cpu_ptr_qual(volatile, ptr)
+#define arch_raw_cpu_ptr_preemption_disabled(ptr)	__arch_raw_cpu_ptr_qual( , ptr)
 #else /* CONFIG_SMP */
 #define __percpu_seg_override
 #define __percpu_prefix		""
-- 
2.17.1


  parent reply	other threads:[~2019-07-19  1:04 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-18 17:41 [RFC 0/7] x86/percpu: Use segment qualifiers Nadav Amit
2019-07-18 17:41 ` [RFC 1/7] compiler: Report x86 segment support Nadav Amit
2019-07-18 17:41 ` [RFC 2/7] x86/percpu: Use compiler segment prefix qualifier Nadav Amit
2019-07-18 17:41 ` [RFC 3/7] x86/percpu: Use C for percpu accesses when possible Nadav Amit
2019-07-22 20:52   ` Peter Zijlstra
2019-07-22 21:12     ` Nadav Amit
2019-07-18 17:41 ` [RFC 4/7] x86: Fix possible caching of current_task Nadav Amit
2019-07-18 17:41 ` [RFC 5/7] percpu: Assume preemption is disabled on per_cpu_ptr() Nadav Amit
2019-07-18 17:41 ` Nadav Amit [this message]
2019-07-18 17:41 ` [RFC 7/7] x86/current: Aggressive caching of current Nadav Amit
2019-07-22 21:07   ` Peter Zijlstra
2019-07-22 21:20     ` Nadav Amit
2019-07-22 21:09 ` [RFC 0/7] x86/percpu: Use segment qualifiers Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190718174110.4635-7-namit@vmware.com \
    --to=namit@vmware.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).