[PATCH] x86: Optimize variable_test_bit()

From: Peter Zijlstra @ 2015-05-01 15:16 UTC
  To: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Linus Torvalds
  Cc: linux-kernel, Borislav Petkov, Jakub Jelinek

While looking at some asm I noticed we produce the most horrid code for
test_bit():

    1a5e:       49 0f a3 30             bt     %rsi,(%r8)
    1a62:       45 19 c0                sbb    %r8d,%r8d
    1a65:       45 85 c0                test   %r8d,%r8d
    1a68:       75 a6                   jne    1a10 <x86_schedule_events+0xc0>

Since test_bit() doesn't actually have any output variables, we can use
asm goto without having to add a memory clobber. This reduces the code
to something sensible:

    1a12:       49 0f a3 30             bt     %rsi,(%r8)
    1a16:       72 68                   jb     1a80 <x86_schedule_events+0x130>

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
PS. should we kill the memory clobber for __test_and_change_bit()? It
    seems inconsistent and out of place.

PPS. Jakub, I see gcc 5.1 still doesn't support output operands for
     asm goto; is this something we can get 'fixed'?
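
For anyone who wants to poke at the codegen outside the kernel tree, here
is a minimal userspace sketch of the same idea. It is an illustration only,
not the patch: test_bit_sketch() and the BITS_PER_LONG definition are
hypothetical stand-ins, it assumes GCC (or a clang with asm goto support)
on x86-64, it uses only an "r" constraint for nr so the operand size of bt
is never ambiguous, and it falls back to plain C everywhere else.

```c
#include <limits.h>

/* Stand-in for the kernel's BITS_PER_LONG; an assumption of this sketch. */
#define BITS_PER_LONG ((long)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Hypothetical userspace analogue of variable_test_bit().
 * Returns 1 if bit nr of the bitmap at addr is set, 0 otherwise.
 * nr is assumed to be non-negative.
 */
static inline int test_bit_sketch(long nr, const volatile unsigned long *addr)
{
#if defined(__x86_64__) && defined(__GNUC__)
	/*
	 * asm goto takes no output operands here: the result is carried
	 * by the branch itself, so no register has to be materialised
	 * and re-tested afterwards.
	 */
	__asm__ goto ("bt %1, %0\n\t"
		      "jc %l[bit_set]"
		      : /* no outputs */
		      : "m" (*addr), "r" (nr)
		      : "cc" /* flags; implied on x86, listed for clarity */
		      : bit_set);
	return 0;
bit_set:
	return 1;
#else
	/* Portable fallback: select the word, then the bit within it. */
	return (int)((addr[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL);
#endif
}
```

Compiling a caller such as "if (test_bit_sketch(n, map)) ..." with -O2 and
looking at the disassembly should show whether the bt is followed directly
by a conditional jump; the exact output depends on the compiler version.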

 arch/x86/include/asm/bitops.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index cfe3b954d5e4..bcf4fa77c04f 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -313,6 +313,15 @@ static __always_inline int constant_test_bit(long nr, const volatile unsigned lo
 
 static inline int variable_test_bit(long nr, volatile const unsigned long *addr)
 {
+#ifdef CC_HAVE_ASM_GOTO
+	asm_volatile_goto ("bt %1, %0\n\t"
+			   "jc %l[cc_label]"
+			   : : "m" (*(unsigned long *)addr), "Ir" (nr)
+			   : : cc_label);
+	return 0;
+cc_label:
+	return 1;
+#else
 	int oldbit;
 
 	asm volatile("bt %2,%1\n\t"
@@ -321,6 +330,7 @@ static inline int variable_test_bit(long nr, volatile const unsigned long *addr)
 		     : "m" (*(unsigned long *)addr), "Ir" (nr));
 
 	return oldbit;
+#endif
 }
 
 #if 0 /* Fool kernel-doc since it doesn't do macros yet */