* [PATCH] lockref: remove cpu_relax() again
@ 2013-09-05 13:18 Heiko Carstens
  2013-09-05 14:13 ` Heiko Carstens
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Heiko Carstens @ 2013-09-05 13:18 UTC (permalink / raw)
  To: Linus Torvalds, Tony Luck; +Cc: linux-kernel

d472d9d9 "lockref: Relax in cmpxchg loop" added a cpu_relax() call to the
CMPXCHG_LOOP() macro. However, this seems wrong to me, since it is very
likely that the next round will succeed (or the loop will be left).
Even worse: cpu_relax() is very expensive on s390, since it means "yield
my virtual cpu to the hypervisor". So we are talking about several
thousand cycles.

In fact, some measurements show the bad impact of the cpu_relax() call on
s390, using Linus' test case that does "stat()" like mad:

Without converting s390 to lockref:
Total loops: 81236173

After converting s390 to lockref:
Total loops: 31896802

After converting s390 to lockref but with removed cpu_relax() call:
Total loops: 86242190

So the cpu_relax() call completely contradicts the intention of
CONFIG_CMPXCHG_LOCKREF at least on s390.

*If* however the cpu_relax() makes sense on other platforms maybe we could
add something like we have already with "arch_mutex_cpu_relax()":

include/linux/mutex.h:
 #ifndef CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX
 #define arch_mutex_cpu_relax()  cpu_relax()
 #endif

arch/s390/include/asm/mutex.h:
 #define arch_mutex_cpu_relax()  barrier()

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
---
 lib/lockref.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/lib/lockref.c b/lib/lockref.c
index 9d76f40..7819c2d 100644
--- a/lib/lockref.c
+++ b/lib/lockref.c
@@ -19,7 +19,6 @@
 		if (likely(old.lock_count == prev.lock_count)) {		\
 			SUCCESS;						\
 		}								\
-		cpu_relax();							\
 	}									\
 } while (0)
 
-- 
1.8.3.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] lockref: remove cpu_relax() again
  2013-09-05 13:18 [PATCH] lockref: remove cpu_relax() again Heiko Carstens
@ 2013-09-05 14:13 ` Heiko Carstens
  2013-09-05 14:48 ` Luck, Tony
  2013-09-05 15:31 ` Linus Torvalds
  2 siblings, 0 replies; 13+ messages in thread
From: Heiko Carstens @ 2013-09-05 14:13 UTC (permalink / raw)
  To: Linus Torvalds, Tony Luck, linux-kernel

On Thu, Sep 05, 2013 at 03:18:14PM +0200, Heiko Carstens wrote:
> d472d9d9 "lockref: Relax in cmpxchg loop" added a cpu_relax() call to the
> CMPXCHG_LOOP() macro. However, this seems wrong to me, since it is very
> likely that the next round will succeed (or the loop will be left).
> Even worse: cpu_relax() is very expensive on s390, since it means "yield
> my virtual cpu to the hypervisor". So we are talking about several
> thousand cycles.
> 
> In fact, some measurements show the bad impact of the cpu_relax() call on
> s390, using Linus' test case that does "stat()" like mad:
> 
> Without converting s390 to lockref:
> Total loops: 81236173
> 
> After converting s390 to lockref:
> Total loops: 31896802
> 
> After converting s390 to lockref but with removed cpu_relax() call:
> Total loops: 86242190

All of those should have been "converting s390 to ARCH_USE_CMPXCHG_LOCKREF"
instead of "to lockref" of course .. ;)



* RE: [PATCH] lockref: remove cpu_relax() again
  2013-09-05 13:18 [PATCH] lockref: remove cpu_relax() again Heiko Carstens
  2013-09-05 14:13 ` Heiko Carstens
@ 2013-09-05 14:48 ` Luck, Tony
  2013-09-05 15:31 ` Linus Torvalds
  2 siblings, 0 replies; 13+ messages in thread
From: Luck, Tony @ 2013-09-05 14:48 UTC (permalink / raw)
  To: Heiko Carstens, Linus Torvalds; +Cc: linux-kernel

> *If* however the cpu_relax() makes sense on other platforms maybe we could
> add something like we have already with "arch_mutex_cpu_relax()":

I'll do some more measurements on ia64.  During my first tests cpu_relax() seemed
to be a big win - but I only ran "./t" a couple of times.  Later (with the cpu_relax() in
place) I ran a bunch more iterations, and found that the variation from run to run
is much larger with lockref.  The mean score is 60% higher, but the standard deviation
is an order of magnitude bigger (enough that one run out of 20 with lockref scored
lower than the pre-lockref kernel).

I think this is expected ... cmpxchg is a free-for-all - and sometimes poor placement
across the four-socket system might cause short-term starvation of a thread while
threads on another socket monopolize the cache line.

-Tony


* Re: [PATCH] lockref: remove cpu_relax() again
  2013-09-05 13:18 [PATCH] lockref: remove cpu_relax() again Heiko Carstens
  2013-09-05 14:13 ` Heiko Carstens
  2013-09-05 14:48 ` Luck, Tony
@ 2013-09-05 15:31 ` Linus Torvalds
  2013-09-05 17:35   ` Luck, Tony
  2013-09-05 17:54   ` Waiman Long
  2 siblings, 2 replies; 13+ messages in thread
From: Linus Torvalds @ 2013-09-05 15:31 UTC (permalink / raw)
  To: Heiko Carstens; +Cc: Tony Luck, Linux Kernel Mailing List

On Thu, Sep 5, 2013 at 6:18 AM, Heiko Carstens
<heiko.carstens@de.ibm.com> wrote:
>
> *If* however the cpu_relax() makes sense on other platforms maybe we could
> add something like we have already with "arch_mutex_cpu_relax()":

I actually think it won't.

The lockref cmpxchg isn't waiting for something to change - it only
loops _if_ something has changed, and rather than cpu_relax(), we most
likely want to try to take advantage of the fact that we have the
changed data in our exclusive cacheline, and try to get our ref update
out as soon as possible.

IOW, the lockref loop is not an idle loop like a spinlock "wait for
lock to be released", it's very much an active loop of "oops,
something changed".

And there can't be any livelock, since by definition somebody else
_did_ make progress. In fact, adding the cpu_relax() probably just
makes things much less fair - once somebody else raced on you, the
cpu_relax() now makes it more likely that _another_ cpu does so too.

That said, let's see what Tony's numbers are. On x86, it doesn't seem to
matter, but as Tony noticed, the variability can be quite high (for
me, the numbers tend to be quite stable when running the test program
multiple times in a loop, but then variation between boots or after
having done something else can be quite big - I suspect the cache
access patterns end up varying wildly with different dentry layout and
hash chain depth).

              Linus


* RE: [PATCH] lockref: remove cpu_relax() again
  2013-09-05 15:31 ` Linus Torvalds
@ 2013-09-05 17:35   ` Luck, Tony
  2013-09-05 17:53     ` Linus Torvalds
  2013-09-05 17:54   ` Waiman Long
  1 sibling, 1 reply; 13+ messages in thread
From: Luck, Tony @ 2013-09-05 17:35 UTC (permalink / raw)
  To: Linus Torvalds, Heiko Carstens; +Cc: Linux Kernel Mailing List


> And there can't be any livelock, since by definition somebody else
> _did_ make progress. In fact, adding the cpu_relax() probably just
> makes things much less fair - once somebody else raced on you, the
> cpu_relax() now makes it more likely that _another_ cpu does so too.
>
> That said, let's see what Tony's numbers are.

Data from 20 runs of "./t"

3.11 + Linus enabling patches, but ia64 not enabled (commit bc08b449ee14a from Linus tree).
mean 3469553.800000
min 3367709.000000
max 3494154.000000
stddev = 43613.722742

Now add ia64 enabling (including the cpu_relax())
mean 5509067.150000 // nice boost
min 3191639.000000 // worst case is worse than worst case before we made the change
max 6508629.000000
stddev = 793243.943875 // much more variation from run to run

Comment out the cpu_relax()
mean 2185864.400000 // this sucks
min 2141242.000000
max 2286505.000000
stddev = 40847.960152 // but it consistently sucks

So Linus is right that the cpu_relax() makes things less fair ... but without it performance sucks so
much that I don't want to use the clever cmpxchg at all - I'm much better off without it!

This may be caused by Itanium hyper-threading (SOEMT - switch on event multi-threading) where
the spinning thread means that its buddy retires no instructions until h/w times it out and forces
a switch.  But that's just a guess - losing the cacheline to whoever made the change that caused
the cmpxchg to fail should also force a thread switch.

-Tony


* Re: [PATCH] lockref: remove cpu_relax() again
  2013-09-05 17:35   ` Luck, Tony
@ 2013-09-05 17:53     ` Linus Torvalds
  2013-09-05 18:57       ` Luck, Tony
  0 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2013-09-05 17:53 UTC (permalink / raw)
  To: Luck, Tony; +Cc: Heiko Carstens, Linux Kernel Mailing List

On Thu, Sep 5, 2013 at 10:35 AM, Luck, Tony <tony.luck@intel.com> wrote:
>
> So Linus is right that the cpu_relax() makes things less fair ... but without it performance sucks so
> much that I don't want to use the clever cmpxchg at all - I'm much better off without it!

Hmm. Is this single-socket? The thing is really only supposed to help
if there's tons of contention.

Also, it strikes me that ia64 has tons of different versions of
cmpxchg, and the one you use by default is the one with "acquire"
semantics. That may well be the right thing to do, but I have this
possibly unfounded gut feeling that you might want to try using a
relaxed cmpxchg and then perhaps have a memory barrier *after* it has
successfully executed.

I'll have to think a bit more about what the exact memory ordering
requirements for lockref's are, but my gut feel is that just for
incrementing the reference count you don't actually have any real
memory ordering requirements.

                 Linus


* Re: [PATCH] lockref: remove cpu_relax() again
  2013-09-05 15:31 ` Linus Torvalds
  2013-09-05 17:35   ` Luck, Tony
@ 2013-09-05 17:54   ` Waiman Long
  1 sibling, 0 replies; 13+ messages in thread
From: Waiman Long @ 2013-09-05 17:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Heiko Carstens, Tony Luck, Linux Kernel Mailing List

On 09/05/2013 11:31 AM, Linus Torvalds wrote:
> On Thu, Sep 5, 2013 at 6:18 AM, Heiko Carstens
> <heiko.carstens@de.ibm.com>  wrote:
>> *If* however the cpu_relax() makes sense on other platforms maybe we could
>> add something like we have already with "arch_mutex_cpu_relax()":
> I actually think it won't.
>
> The lockref cmpxchg isn't waiting for something to change - it only
> loops _if_ something has changed, and rather than cpu_relax(), we most
> likely want to try to take advantage of the fact that we have the
> changed data in our exclusive cacheline, and try to get our ref update
> out as soon as possible.
>
> IOW, the lockref loop is not an idle loop like a spinlock "wait for
> lock to be released", it's very much an active loop of "oops,
> something changed".
>
> And there can't be any livelock, since by definition somebody else
> _did_ make progress. In fact, adding the cpu_relax() probably just
> makes things much less fair - once somebody else raced on you, the
> cpu_relax() now makes it more likely that _another_ cpu does so too.
>
> That said, let's see what Tony's numbers are. On x86, it doesn't seem to
> matter, but as Tony noticed, the variability can be quite high (for
> me, the numbers tend to be quite stable when running the test program
> multiple times in a loop, but then variation between boots or after
> having done something else can be quite big - I suspect the cache
> access patterns end up varying wildly with different dentry layout and
> hash chain depth).
>
>                Linus
I have found that having a cpu_relax() at the bottom of the while
loop in the CMPXCHG_LOOP() macro does help performance in some cases on
x86-64 processors. I saw no noticeable difference for the AIM7
short workload. On the high_systime workload, however, I saw about 5%
better performance with cpu_relax(). Below are the perf profiles of
the two high_systime runs at 1500 users on an 80-core DL980 with HT off.

Without cpu_relax():

      4.60%     ls  [kernel.kallsyms]     [k] _raw_spin_lock
                   |--48.50%-- lockref_put_or_lock
                   |--46.97%-- lockref_get_or_lock
                   |--1.04%-- lockref_get

      2.95%     reaim  [kernel.kallsyms]     [k] _raw_spin_lock
                   |--29.70%-- lockref_put_or_lock
                   |--28.87%-- lockref_get_or_lock
                   |--0.63%-- lockref_get


With cpu_relax():

      1.67%     reaim  [kernel.kallsyms]     [k] _raw_spin_lock
                   |--14.80%-- lockref_put_or_lock
                   |--14.04%-- lockref_get_or_lock

      1.36%     ls  [kernel.kallsyms]     [k] _raw_spin_lock
                   |--44.94%-- lockref_put_or_lock
                   |--43.12%-- lockref_get_or_lock

So I would suggest having some kind of conditional cpu_relax() in
the loop.

-Longman





* RE: [PATCH] lockref: remove cpu_relax() again
  2013-09-05 17:53     ` Linus Torvalds
@ 2013-09-05 18:57       ` Luck, Tony
  2013-09-05 19:21         ` Linus Torvalds
  0 siblings, 1 reply; 13+ messages in thread
From: Luck, Tony @ 2013-09-05 18:57 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Heiko Carstens, Linux Kernel Mailing List


> Also, it strikes me that ia64 has tons of different versions of
> cmpxchg, and the one you use by default is the one with "acquire"
> semantics

Not "tons", just two.  You can ask for "acquire" or "release" semantics,
there is no relaxed option.

Worse still - early processor implementations actually just ignored
the acquire/release and did a full fence all the time.  Unfortunately
this meant a lot of badly written code that used .acq when they really
wanted .rel became legacy out in the wild - so when we made a cpu
that strictly did the .acq or .rel ... all that code started breaking - so
we had to back-pedal and keep the "legacy" behavior of a full fence :-(

-Tony



* Re: [PATCH] lockref: remove cpu_relax() again
  2013-09-05 18:57       ` Luck, Tony
@ 2013-09-05 19:21         ` Linus Torvalds
  2013-09-05 19:45           ` Luck, Tony
  0 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2013-09-05 19:21 UTC (permalink / raw)
  To: Luck, Tony; +Cc: Heiko Carstens, Linux Kernel Mailing List

On Thu, Sep 5, 2013 at 11:57 AM, Luck, Tony <tony.luck@intel.com> wrote:
>
> Not "tons", just two.  You can ask for "acquire" or "release" semantics,
> there is no relaxed option.

Seriously? You can't just do a cache-coherent cmpxchg without extra
serialization? Oh well.

> Worse still - early processor implementations actually just ignored
> the acquire/release and did a full fence all the time.  Unfortunately
> this meant a lot of badly written code that used .acq when they really
> wanted .rel became legacy out in the wild - so when we made a cpu
> that strictly did the .acq or .rel ... all that code started breaking - so
> we had to back-pedal and keep the "legacy" behavior of a full fence :-(

Ugh. Can you try what happens with the weaker release-semantics
performance-wise for that code? Do it *just* for the lockref code..

         Linus


* RE: [PATCH] lockref: remove cpu_relax() again
  2013-09-05 19:21         ` Linus Torvalds
@ 2013-09-05 19:45           ` Luck, Tony
  2013-09-05 19:50             ` Linus Torvalds
  0 siblings, 1 reply; 13+ messages in thread
From: Luck, Tony @ 2013-09-05 19:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Heiko Carstens, Linux Kernel Mailing List


>> Worse still - early processor implementations actually just ignored
>> the acquire/release and did a full fence all the time.  Unfortunately
>> this meant a lot of badly written code that used .acq when they really
>> wanted .rel became legacy out in the wild - so when we made a cpu
>> that strictly did the .acq or .rel ... all that code started breaking - so
>> we had to back-pedal and keep the "legacy" behavior of a full fence :-(
>
> Ugh. Can you try what happens with the weaker release-semantics
> performance-wise for that code? Do it *just* for the lockref code..

No.  I can change the Linux code to say "cmpxchg.rel" here ... but the
h/w will do exactly the same thing it did when I had "cmpxchg.acq".

-Tony


* Re: [PATCH] lockref: remove cpu_relax() again
  2013-09-05 19:45           ` Luck, Tony
@ 2013-09-05 19:50             ` Linus Torvalds
  2013-09-05 19:56               ` Luck, Tony
  0 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2013-09-05 19:50 UTC (permalink / raw)
  To: Luck, Tony; +Cc: Heiko Carstens, Linux Kernel Mailing List

On Thu, Sep 5, 2013 at 12:45 PM, Luck, Tony <tony.luck@intel.com> wrote:
>
> No.  I can change the Linux code to say "cmpxchg.rel" here ... but the
> h/w will do exactly the same thing it did when I had "cmpxchg.acq".

Oh, so when you said "So we had to back-pedal and keep the "legacy"
behavior of a full fence", you meant the hardware design itself, not
(as I assumed) the Linux kernel header behavior.

Oh well. Hopefully somebody in hardware learnt how stupid it is to
expose weak memory ordering to software. But probably not.

Ugh. Your four-socket machine certainly should have been able to see
the performance improvements of not spinning.

That said, another thing that strikes me is that you have 32 CPU
threads, and the stupid test-program I sent out had MAX_THREADS set to
16.  Did you change that? Becuase if not, then some of the extreme
performance profile might be about how the threads get scheduled on
your machine (HT threads vs full cores etc).

                Linus


* RE: [PATCH] lockref: remove cpu_relax() again
  2013-09-05 19:50             ` Linus Torvalds
@ 2013-09-05 19:56               ` Luck, Tony
  2013-09-06 18:36                 ` Tony Luck
  0 siblings, 1 reply; 13+ messages in thread
From: Luck, Tony @ 2013-09-05 19:56 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Heiko Carstens, Linux Kernel Mailing List


> That said, another thing that strikes me is that you have 32 CPU
> threads, and the stupid test-program I sent out had MAX_THREADS set to
> 16.  Did you change that? Because if not, then some of the extreme
> performance profile might be about how the threads get scheduled on
> your machine (HT threads vs full cores etc).

I'll try to get new numbers with 32 threads[*] - but even if they look good, I'd
be upset about the 16 thread case being worse with the cmpxchg/no-cpu-relax
case than the original code.

-Tony

[*] probably not till tomorrow


* Re: [PATCH] lockref: remove cpu_relax() again
  2013-09-05 19:56               ` Luck, Tony
@ 2013-09-06 18:36                 ` Tony Luck
  0 siblings, 0 replies; 13+ messages in thread
From: Tony Luck @ 2013-09-06 18:36 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Heiko Carstens, Linux Kernel Mailing List

No new Itanium numbers yet ... but I did wonder how this works
on multi-socket x86 ... so I tweaked "t.c" to increase the thread count to
64 to max out my 4-socket Xeon E5-4650 (8 cores/socket, 2 threads/core)
and also print out the individual scores from each thread.

$ ./t /tmp 64
389827
717666
1540293
130764
681839
33357
606966
33716
33183
33230
69685
76422
352851
34940
257132
34192
34200
34098
34053
34459
234399
33678
241571
545912
620857
65818
32853
739440
33697
683655
741366
36208
385775
446198
45974
33056
403944
717415
254782
166754
702745
43661
1042180
437367
43751
503342
154223
706917
878167
43802
51667
660875
33261
522425
33627
33637
33446
33604
52963
33688
406088
551690
446474
33289
Threads = 64 Total loops: 19109114

Individual thread performance varies from 32853 to 1540293 - a factor of 46.9.
Sometimes it is good to sacrifice fairness for throughput. But wow!

Running for longer [ s/sleep(10)/sleep(300) ] gave things a chance to
even out - but I still see a factor of 3.5 between the fastest and the
slowest.

-Tony


