* [PATCH] arch,locking: Ciao arch_mutex_cpu_relax()
@ 2014-06-20 18:21 Davidlohr Bueso
  2014-06-23  6:54 ` Peter Zijlstra
  2014-06-23  6:58 ` Peter Zijlstra
From: Davidlohr Bueso @ 2014-06-20 18:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: Linus Torvalds, Heiko Carstens, Davidlohr Bueso,
	Aswin Chandramouleeswaran, linux-kernel

From: Davidlohr Bueso <davidlohr@hp.com>

The arch_mutex_cpu_relax() function, introduced by 34b133f, is
hacky and ugly. It was added a few years ago to address the fact
that common cpu_relax() calls include yielding on s390, and thus
impact the optimistic spinning functionality of mutexes. Nowadays
we use this function well beyond mutexes: rwsem, qrwlock, mcs and
lockref. Since the macro that defines the call is in the mutex header,
any users must include mutex.h and the naming is misleading as well.

This patch (i) renames the call to arch_cpu_relax (for lack of a better
name), and (ii) defines it in each arch's asm/processor.h local header,
just like for regular cpu_relax() functions. On all archs, except s390,
arch_cpu_relax is simply cpu_relax, and thus we can take it out of
mutex.h. While this can seem redundant or weird, I believe it is a
good choice as it allows us to move arch-specific logic out of generic
locking primitives and enables future(?) archs to transparently define
it, similarly to System Z.

Please note that these changes are only tested on x86-64.

Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
---
 arch/alpha/include/asm/processor.h     | 1 +
 arch/arc/include/asm/processor.h       | 2 ++
 arch/arm/include/asm/processor.h       | 2 ++
 arch/arm64/include/asm/processor.h     | 1 +
 arch/avr32/include/asm/processor.h     | 1 +
 arch/blackfin/include/asm/processor.h  | 2 +-
 arch/c6x/include/asm/processor.h       | 1 +
 arch/cris/include/asm/processor.h      | 1 +
 arch/hexagon/include/asm/processor.h   | 1 +
 arch/ia64/include/asm/processor.h      | 1 +
 arch/m32r/include/asm/processor.h      | 1 +
 arch/m68k/include/asm/processor.h      | 1 +
 arch/metag/include/asm/processor.h     | 1 +
 arch/mips/include/asm/processor.h      | 1 +
 arch/mn10300/include/asm/processor.h   | 2 ++
 arch/openrisc/include/asm/processor.h  | 1 +
 arch/parisc/include/asm/processor.h    | 1 +
 arch/powerpc/include/asm/processor.h   | 2 ++
 arch/s390/include/asm/processor.h      | 2 +-
 arch/score/include/asm/processor.h     | 1 +
 arch/sh/include/asm/processor.h        | 1 +
 arch/sparc/include/asm/processor_32.h  | 2 ++
 arch/sparc/include/asm/processor_64.h  | 1 +
 arch/tile/include/asm/processor.h      | 2 ++
 arch/unicore32/include/asm/processor.h | 1 +
 arch/x86/include/asm/processor.h       | 2 ++
 arch/xtensa/include/asm/processor.h    | 1 +
 include/linux/mutex.h                  | 4 ----
 kernel/locking/mcs_spinlock.c          | 8 +++-----
 kernel/locking/mcs_spinlock.h          | 4 ++--
 kernel/locking/mutex.c                 | 4 ++--
 kernel/locking/qrwlock.c               | 9 ++++-----
 kernel/locking/rwsem-xadd.c            | 4 ++--
 lib/lockref.c                          | 3 +--
 34 files changed, 48 insertions(+), 24 deletions(-)

diff --git a/arch/alpha/include/asm/processor.h b/arch/alpha/include/asm/processor.h
index 6cb7fe8..8ca4b4d 100644
--- a/arch/alpha/include/asm/processor.h
+++ b/arch/alpha/include/asm/processor.h
@@ -57,6 +57,7 @@ unsigned long get_wchan(struct task_struct *p);
   ((tsk) == current ? rdusp() : task_thread_info(tsk)->pcb.usp)
 
 #define cpu_relax()	barrier()
+#define arch_cpu_relax() cpu_relax()
 
 #define ARCH_HAS_PREFETCH
 #define ARCH_HAS_PREFETCHW
diff --git a/arch/arc/include/asm/processor.h b/arch/arc/include/asm/processor.h
index d99f9b3..8e1bf6b 100644
--- a/arch/arc/include/asm/processor.h
+++ b/arch/arc/include/asm/processor.h
@@ -62,6 +62,8 @@ unsigned long thread_saved_pc(struct task_struct *t);
 #define cpu_relax()	do { } while (0)
 #endif
 
+#define arch_cpu_relax() cpu_relax()
+
 #define copy_segments(tsk, mm)      do { } while (0)
 #define release_segments(mm)        do { } while (0)
 
diff --git a/arch/arm/include/asm/processor.h b/arch/arm/include/asm/processor.h
index c3d5fc1..448f98b 100644
--- a/arch/arm/include/asm/processor.h
+++ b/arch/arm/include/asm/processor.h
@@ -82,6 +82,8 @@ unsigned long get_wchan(struct task_struct *p);
 #define cpu_relax()			barrier()
 #endif
 
+#define arch_cpu_relax()                cpu_relax()
+
 #define task_pt_regs(p) \
 	((struct pt_regs *)(THREAD_START_SP + task_stack_page(p)) - 1)
 
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 34de2a8..d6c2d64 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -129,6 +129,7 @@ extern void release_thread(struct task_struct *);
 unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()			barrier()
+#define arch_cpu_relax()                cpu_relax()
 
 /* Thread switching */
 extern struct task_struct *cpu_switch_to(struct task_struct *prev,
diff --git a/arch/avr32/include/asm/processor.h b/arch/avr32/include/asm/processor.h
index 972adcc..15d93d0 100644
--- a/arch/avr32/include/asm/processor.h
+++ b/arch/avr32/include/asm/processor.h
@@ -92,6 +92,7 @@ extern struct avr32_cpuinfo boot_cpu_data;
 #define TASK_UNMAPPED_BASE	(PAGE_ALIGN(TASK_SIZE / 3))
 
 #define cpu_relax()		barrier()
+#define arch_cpu_relax()        cpu_relax()
 #define cpu_sync_pipeline()	asm volatile("sub pc, -2" : : : "memory")
 
 struct cpu_context {
diff --git a/arch/blackfin/include/asm/processor.h b/arch/blackfin/include/asm/processor.h
index d0e72e9..05945e6 100644
--- a/arch/blackfin/include/asm/processor.h
+++ b/arch/blackfin/include/asm/processor.h
@@ -99,7 +99,7 @@ unsigned long get_wchan(struct task_struct *p);
 #define	KSTK_ESP(tsk)	((tsk) == current ? rdusp() : (tsk)->thread.usp)
 
 #define cpu_relax()    	smp_mb()
-
+#define arch_cpu_relax() cpu_relax()
 
 /* Get the Silicon Revision of the chip */
 static inline uint32_t __pure bfin_revid(void)
diff --git a/arch/c6x/include/asm/processor.h b/arch/c6x/include/asm/processor.h
index b9eb3da..ca26777 100644
--- a/arch/c6x/include/asm/processor.h
+++ b/arch/c6x/include/asm/processor.h
@@ -121,6 +121,7 @@ extern unsigned long get_wchan(struct task_struct *p);
 #define KSTK_ESP(task)	(task_pt_regs(task)->sp)
 
 #define cpu_relax()		do { } while (0)
+#define arch_cpu_relax()        cpu_relax()
 
 extern const struct seq_operations cpuinfo_op;
 
diff --git a/arch/cris/include/asm/processor.h b/arch/cris/include/asm/processor.h
index 15b815d..2edd8e9 100644
--- a/arch/cris/include/asm/processor.h
+++ b/arch/cris/include/asm/processor.h
@@ -63,6 +63,7 @@ static inline void release_thread(struct task_struct *dead_task)
 #define init_stack      (init_thread_union.stack)
 
 #define cpu_relax()     barrier()
+#define arch_cpu_relax() cpu_relax()
 
 void default_idle(void);
 
diff --git a/arch/hexagon/include/asm/processor.h b/arch/hexagon/include/asm/processor.h
index 45a8254..00edcd2 100644
--- a/arch/hexagon/include/asm/processor.h
+++ b/arch/hexagon/include/asm/processor.h
@@ -56,6 +56,7 @@ struct thread_struct {
 }
 
 #define cpu_relax() __vmyield()
+#define arch_cpu_relax() cpu_relax()
 
 /*
  * Decides where the kernel will search for a free chunk of vm space during
diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h
index efd1b92..3034d5a 100644
--- a/arch/ia64/include/asm/processor.h
+++ b/arch/ia64/include/asm/processor.h
@@ -548,6 +548,7 @@ ia64_eoi (void)
 }
 
 #define cpu_relax()	ia64_hint(ia64_hint_pause)
+#define arch_cpu_relax() cpu_relax()
 
 static inline int
 ia64_get_irr(unsigned int vector)
diff --git a/arch/m32r/include/asm/processor.h b/arch/m32r/include/asm/processor.h
index 5767367..05ade24 100644
--- a/arch/m32r/include/asm/processor.h
+++ b/arch/m32r/include/asm/processor.h
@@ -133,5 +133,6 @@ unsigned long get_wchan(struct task_struct *p);
 #define KSTK_ESP(tsk)  ((tsk)->thread.sp)
 
 #define cpu_relax()	barrier()
+#define arch_cpu_relax() cpu_relax()
 
 #endif /* _ASM_M32R_PROCESSOR_H */
diff --git a/arch/m68k/include/asm/processor.h b/arch/m68k/include/asm/processor.h
index b0768a6..7b6a79b 100644
--- a/arch/m68k/include/asm/processor.h
+++ b/arch/m68k/include/asm/processor.h
@@ -176,5 +176,6 @@ unsigned long get_wchan(struct task_struct *p);
 #define task_pt_regs(tsk)	((struct pt_regs *) ((tsk)->thread.esp0))
 
 #define cpu_relax()	barrier()
+#define arch_cpu_relax() cpu_relax()
 
 #endif
diff --git a/arch/metag/include/asm/processor.h b/arch/metag/include/asm/processor.h
index a8a3747..7077e58 100644
--- a/arch/metag/include/asm/processor.h
+++ b/arch/metag/include/asm/processor.h
@@ -155,6 +155,7 @@ unsigned long get_wchan(struct task_struct *p);
 #define user_stack_pointer(regs)        ((regs)->ctx.AX[0].U0)
 
 #define cpu_relax()     barrier()
+#define arch_cpu_relax()  cpu_relax()
 
 extern void setup_priv(void);
 
diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
index ad70cba..99ab29c 100644
--- a/arch/mips/include/asm/processor.h
+++ b/arch/mips/include/asm/processor.h
@@ -367,6 +367,7 @@ unsigned long get_wchan(struct task_struct *p);
 #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
 
 #define cpu_relax()	barrier()
+#define arch_cpu_relax() cpu_relax()
 
 /*
  * Return_address is a replacement for __builtin_return_address(count)
diff --git a/arch/mn10300/include/asm/processor.h b/arch/mn10300/include/asm/processor.h
index 8b80b19..76fc9b7 100644
--- a/arch/mn10300/include/asm/processor.h
+++ b/arch/mn10300/include/asm/processor.h
@@ -68,7 +68,9 @@ extern struct mn10300_cpuinfo cpu_data[];
 extern void identify_cpu(struct mn10300_cpuinfo *);
 extern void print_cpu_info(struct mn10300_cpuinfo *);
 extern void dodgy_tsc(void);
+
 #define cpu_relax() barrier()
+#define arch_cpu_relax() cpu_relax()
 
 /*
  * User space process size: 1.75GB (default).
diff --git a/arch/openrisc/include/asm/processor.h b/arch/openrisc/include/asm/processor.h
index cab746f..39cfb38 100644
--- a/arch/openrisc/include/asm/processor.h
+++ b/arch/openrisc/include/asm/processor.h
@@ -101,6 +101,7 @@ extern unsigned long thread_saved_pc(struct task_struct *t);
 #define init_stack      (init_thread_union.stack)
 
 #define cpu_relax()     barrier()
+#define arch_cpu_relax() cpu_relax()
 
 #endif /* __ASSEMBLY__ */
 #endif /* __ASM_OPENRISC_PROCESSOR_H */
diff --git a/arch/parisc/include/asm/processor.h b/arch/parisc/include/asm/processor.h
index d951c96..08cab20 100644
--- a/arch/parisc/include/asm/processor.h
+++ b/arch/parisc/include/asm/processor.h
@@ -338,6 +338,7 @@ extern unsigned long get_wchan(struct task_struct *p);
 #define KSTK_ESP(tsk)	((tsk)->thread.regs.gr[30])
 
 #define cpu_relax()	barrier()
+#define arch_cpu_relax() cpu_relax()
 
 /* Used as a macro to identify the combined VIPT/PIPT cached
  * CPUs which require a guarantee of coherency (no inequivalent
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 6d59072..aa2c7ad 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -400,6 +400,8 @@ static inline unsigned long __pack_fe01(unsigned int fpmode)
 #define cpu_relax()	barrier()
 #endif
 
+#define arch_cpu_relax() cpu_relax()
+
 /* Check that a certain kernel stack pointer is valid in task_struct p */
 int validate_sp(unsigned long sp, struct task_struct *p,
                        unsigned long nbytes);
diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h
index 6f02d45..e8da302 100644
--- a/arch/s390/include/asm/processor.h
+++ b/arch/s390/include/asm/processor.h
@@ -217,7 +217,7 @@ static inline void cpu_relax(void)
 	barrier();
 }
 
-#define arch_mutex_cpu_relax()  barrier()
+#define arch_cpu_relax()  barrier()
 
 static inline void psw_set_key(unsigned int key)
 {
diff --git a/arch/score/include/asm/processor.h b/arch/score/include/asm/processor.h
index d9a922d..cfc09fb 100644
--- a/arch/score/include/asm/processor.h
+++ b/arch/score/include/asm/processor.h
@@ -24,6 +24,7 @@ extern unsigned long get_wchan(struct task_struct *p);
 #define current_text_addr() ({ __label__ _l; _l: &&_l; })
 
 #define cpu_relax()		barrier()
+#define arch_cpu_relax()        cpu_relax()
 #define release_thread(thread)	do {} while (0)
 
 /*
diff --git a/arch/sh/include/asm/processor.h b/arch/sh/include/asm/processor.h
index 5448f9b..5b65a3f 100644
--- a/arch/sh/include/asm/processor.h
+++ b/arch/sh/include/asm/processor.h
@@ -97,6 +97,7 @@ extern struct sh_cpuinfo cpu_data[];
 
 #define cpu_sleep()	__asm__ __volatile__ ("sleep" : : : "memory")
 #define cpu_relax()	barrier()
+#define arch_cpu_relax() cpu_relax()
 
 void default_idle(void);
 void stop_this_cpu(void *);
diff --git a/arch/sparc/include/asm/processor_32.h b/arch/sparc/include/asm/processor_32.h
index 2c7baa4..363cc74 100644
--- a/arch/sparc/include/asm/processor_32.h
+++ b/arch/sparc/include/asm/processor_32.h
@@ -118,6 +118,8 @@ extern unsigned long get_wchan(struct task_struct *);
 extern struct task_struct *last_task_used_math;
 
 #define cpu_relax()	barrier()
+#define arch_cpu_relax() cpu_relax()
+
 extern void (*sparc_idle)(void);
 
 #endif
diff --git a/arch/sparc/include/asm/processor_64.h b/arch/sparc/include/asm/processor_64.h
index 4c3f7f0..b1a5f8c 100644
--- a/arch/sparc/include/asm/processor_64.h
+++ b/arch/sparc/include/asm/processor_64.h
@@ -216,6 +216,7 @@ extern unsigned long get_wchan(struct task_struct *task);
 				     "nop\n\t"				\
 				     ".previous"			\
 				     ::: "memory")
+#define arch_cpu_relax() cpu_relax()
 
 /* Prefetch support.  This is tuned for UltraSPARC-III and later.
  * UltraSPARC-I will treat these as nops, and UltraSPARC-II has
diff --git a/arch/tile/include/asm/processor.h b/arch/tile/include/asm/processor.h
index 4232363..f39e6c7 100644
--- a/arch/tile/include/asm/processor.h
+++ b/arch/tile/include/asm/processor.h
@@ -266,6 +266,8 @@ static inline void cpu_relax(void)
 	barrier();
 }
 
+#define arch_cpu_relax() cpu_relax()
+
 /* Info on this processor (see fs/proc/cpuinfo.c) */
 struct seq_operations;
 extern const struct seq_operations cpuinfo_op;
diff --git a/arch/unicore32/include/asm/processor.h b/arch/unicore32/include/asm/processor.h
index 4eaa421..fa6c925 100644
--- a/arch/unicore32/include/asm/processor.h
+++ b/arch/unicore32/include/asm/processor.h
@@ -71,6 +71,7 @@ extern void release_thread(struct task_struct *);
 unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()			barrier()
+#define arch_cpu_relax()                cpu_relax()
 
 #define task_pt_regs(p) \
 	((struct pt_regs *)(THREAD_START_SP + task_stack_page(p)) - 1)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a4ea023..57782d1 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -696,6 +696,8 @@ static inline void cpu_relax(void)
 	rep_nop();
 }
 
+#define arch_cpu_relax() cpu_relax()
+
 /* Stop speculative execution and prefetching of modified code. */
 static inline void sync_core(void)
 {
diff --git a/arch/xtensa/include/asm/processor.h b/arch/xtensa/include/asm/processor.h
index abb5970..1d97d55 100644
--- a/arch/xtensa/include/asm/processor.h
+++ b/arch/xtensa/include/asm/processor.h
@@ -182,6 +182,7 @@ extern unsigned long get_wchan(struct task_struct *p);
 #define KSTK_ESP(tsk)		(task_pt_regs(tsk)->areg[1])
 
 #define cpu_relax()  barrier()
+#define arch_cpu_relax() cpu_relax()
 
 /* Special register access. */
 
diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 11692de..cf63ead 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -176,8 +176,4 @@ extern void mutex_unlock(struct mutex *lock);
 
 extern int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock);
 
-#ifndef arch_mutex_cpu_relax
-# define arch_mutex_cpu_relax() cpu_relax()
-#endif
-
 #endif /* __LINUX_MUTEX_H */
diff --git a/kernel/locking/mcs_spinlock.c b/kernel/locking/mcs_spinlock.c
index 838dc9e..e651829 100644
--- a/kernel/locking/mcs_spinlock.c
+++ b/kernel/locking/mcs_spinlock.c
@@ -1,6 +1,4 @@
-
 #include <linux/percpu.h>
-#include <linux/mutex.h>
 #include <linux/sched.h>
 #include "mcs_spinlock.h"
 
@@ -53,7 +51,7 @@ osq_wait_next(struct optimistic_spin_queue **lock,
 				break;
 		}
 
-		arch_mutex_cpu_relax();
+		arch_cpu_relax();
 	}
 
 	return next;
@@ -89,7 +87,7 @@ bool osq_lock(struct optimistic_spin_queue **lock)
 		if (need_resched())
 			goto unqueue;
 
-		arch_mutex_cpu_relax();
+		arch_cpu_relax();
 	}
 	return true;
 
@@ -115,7 +113,7 @@ unqueue:
 		if (smp_load_acquire(&node->locked))
 			return true;
 
-		arch_mutex_cpu_relax();
+		arch_cpu_relax();
 
 		/*
 		 * Or we race against a concurrent unqueue()'s step-B, in which
diff --git a/kernel/locking/mcs_spinlock.h b/kernel/locking/mcs_spinlock.h
index a2dbac4..75eb35b 100644
--- a/kernel/locking/mcs_spinlock.h
+++ b/kernel/locking/mcs_spinlock.h
@@ -27,7 +27,7 @@ struct mcs_spinlock {
 #define arch_mcs_spin_lock_contended(l)					\
 do {									\
 	while (!(smp_load_acquire(l)))					\
-		arch_mutex_cpu_relax();					\
+		arch_cpu_relax();					\
 } while (0)
 #endif
 
@@ -104,7 +104,7 @@ void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
 			return;
 		/* Wait until the next pointer is set */
 		while (!(next = ACCESS_ONCE(node->next)))
-			arch_mutex_cpu_relax();
+			arch_cpu_relax();
 	}
 
 	/* Pass lock to next waiter. */
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index bc73d33..f584685 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -152,7 +152,7 @@ int mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner)
 		if (need_resched())
 			break;
 
-		arch_mutex_cpu_relax();
+		arch_cpu_relax();
 	}
 	rcu_read_unlock();
 
@@ -471,7 +471,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		 * memory barriers as we'll eventually observe the right
 		 * values at the cost of a few extra spins.
 		 */
-		arch_mutex_cpu_relax();
+		arch_cpu_relax();
 	}
 	osq_unlock(&lock->osq);
 slowpath:
diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index fb5b8ac..eb1042a 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -20,7 +20,6 @@
 #include <linux/cpumask.h>
 #include <linux/percpu.h>
 #include <linux/hardirq.h>
-#include <linux/mutex.h>
 #include <asm/qrwlock.h>
 
 /**
@@ -35,7 +34,7 @@ static __always_inline void
 rspin_until_writer_unlock(struct qrwlock *lock, u32 cnts)
 {
 	while ((cnts & _QW_WMASK) == _QW_LOCKED) {
-		arch_mutex_cpu_relax();
+		arch_cpu_relax();
 		cnts = smp_load_acquire((u32 *)&lock->cnts);
 	}
 }
@@ -75,7 +74,7 @@ void queue_read_lock_slowpath(struct qrwlock *lock)
 	 * to make sure that the write lock isn't taken.
 	 */
 	while (atomic_read(&lock->cnts) & _QW_WMASK)
-		arch_mutex_cpu_relax();
+		arch_cpu_relax();
 
 	cnts = atomic_add_return(_QR_BIAS, &lock->cnts) - _QR_BIAS;
 	rspin_until_writer_unlock(lock, cnts);
@@ -114,7 +113,7 @@ void queue_write_lock_slowpath(struct qrwlock *lock)
 				    cnts | _QW_WAITING) == cnts))
 			break;
 
-		arch_mutex_cpu_relax();
+		arch_cpu_relax();
 	}
 
 	/* When no more readers, set the locked flag */
@@ -125,7 +124,7 @@ void queue_write_lock_slowpath(struct qrwlock *lock)
 				    _QW_LOCKED) == _QW_WAITING))
 			break;
 
-		arch_mutex_cpu_relax();
+		arch_cpu_relax();
 	}
 unlock:
 	arch_spin_unlock(&lock->lock);
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index dacc321..8416a9f 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -329,7 +329,7 @@ bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct task_struct *owner)
 		if (need_resched())
 			break;
 
-		arch_mutex_cpu_relax();
+		arch_cpu_relax();
 	}
 	rcu_read_unlock();
 
@@ -381,7 +381,7 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 		 * memory barriers as we'll eventually observe the right
 		 * values at the cost of a few extra spins.
 		 */
-		arch_mutex_cpu_relax();
+		arch_cpu_relax();
 	}
 	osq_unlock(&sem->osq);
 done:
diff --git a/lib/lockref.c b/lib/lockref.c
index f07a40d..b2c57a8 100644
--- a/lib/lockref.c
+++ b/lib/lockref.c
@@ -1,6 +1,5 @@
 #include <linux/export.h>
 #include <linux/lockref.h>
-#include <linux/mutex.h>
 
 #if USE_CMPXCHG_LOCKREF
 
@@ -29,7 +28,7 @@
 		if (likely(old.lock_count == prev.lock_count)) {		\
 			SUCCESS;						\
 		}								\
-		arch_mutex_cpu_relax();						\
+		arch_cpu_relax();						\
 	}									\
 } while (0)
 
-- 
1.8.1.4
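
Condensed from the hunks above, the net effect on a generic spin-wait
loop (the mcs_spinlock.h hunk, lightly abridged as a sketch) is:

	/* Before: the fallback lived in include/linux/mutex.h, so any
	 * generic spin loop had to include mutex.h just for this:
	 */
	#ifndef arch_mutex_cpu_relax
	# define arch_mutex_cpu_relax() cpu_relax()
	#endif

	while (!(smp_load_acquire(l)))
		arch_mutex_cpu_relax();

	/* After: each asm/processor.h supplies the macro next to
	 * cpu_relax() itself (identical to cpu_relax() everywhere
	 * except s390), and the mutex.h include goes away.
	 */
	#define arch_cpu_relax() cpu_relax()

	while (!(smp_load_acquire(l)))
		arch_cpu_relax();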





* Re: [PATCH] arch,locking: Ciao arch_mutex_cpu_relax()
  2014-06-20 18:21 [PATCH] arch,locking: Ciao arch_mutex_cpu_relax() Davidlohr Bueso
@ 2014-06-23  6:54 ` Peter Zijlstra
  2014-06-23  7:13   ` Vineet Gupta
  2014-06-23  6:58 ` Peter Zijlstra
From: Peter Zijlstra @ 2014-06-23  6:54 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Ingo Molnar, Linus Torvalds, Heiko Carstens,
	Aswin Chandramouleeswaran, linux-kernel, vgupta

On Fri, Jun 20, 2014 at 11:21:13AM -0700, Davidlohr Bueso wrote:
> diff --git a/arch/arc/include/asm/processor.h b/arch/arc/include/asm/processor.h
> index d99f9b3..8e1bf6b 100644
> --- a/arch/arc/include/asm/processor.h
> +++ b/arch/arc/include/asm/processor.h
> @@ -62,6 +62,8 @@ unsigned long thread_saved_pc(struct task_struct *t);
>  #define cpu_relax()	do { } while (0)
>  #endif
>  
> +#define arch_cpu_relax() cpu_relax()
> +
>  #define copy_segments(tsk, mm)      do { } while (0)
>  #define release_segments(mm)        do { } while (0)

I'm not at all sure that the cpu_relax() definition ARC has is valid. We
rely on cpu_relax() being at least a barrier() all over the place, and
it doesn't need to be SMP only. You can have a UP wait loop waiting for
an interrupt for example.
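
To make the concern concrete, a hypothetical UP wait loop (not from
this patch; the names are made up):

	static int irq_fired;		/* set from an interrupt handler */

	void wait_for_irq(void)
	{
		while (!irq_fired)
			cpu_relax();
	}

With the ARC !SMP definition, cpu_relax() expands to nothing, so gcc
may hoist the load of irq_fired out of the loop and spin on a stale
register value forever. With cpu_relax() defined as barrier(), the
flag is re-read on every iteration.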

Vineet?


* Re: [PATCH] arch,locking: Ciao arch_mutex_cpu_relax()
  2014-06-20 18:21 [PATCH] arch,locking: Ciao arch_mutex_cpu_relax() Davidlohr Bueso
  2014-06-23  6:54 ` Peter Zijlstra
@ 2014-06-23  6:58 ` Peter Zijlstra
  2014-06-24 15:06   ` Davidlohr Bueso
From: Peter Zijlstra @ 2014-06-23  6:58 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Ingo Molnar, Linus Torvalds, Heiko Carstens,
	Aswin Chandramouleeswaran, linux-kernel

On Fri, Jun 20, 2014 at 11:21:13AM -0700, Davidlohr Bueso wrote:
> From: Davidlohr Bueso <davidlohr@hp.com>
> 
> The arch_mutex_cpu_relax() function, introduced by 34b133f, is
> hacky and ugly. It was added a few years ago to address the fact
> that common cpu_relax() calls include yielding on s390, and thus
> impact the optimistic spinning functionality of mutexes. Nowadays
> we use this function well beyond mutexes: rwsem, qrwlock, mcs and
> lockref. Since the macro that defines the call is in the mutex header,
> any users must include mutex.h and the naming is misleading as well.
> 
> This patch (i) renames the call to arch_cpu_relax (for lack of a better
> name), and (ii) defines it in each arch's asm/processor.h local header,
> just like for regular cpu_relax() functions. On all archs, except s390,
> arch_cpu_relax is simply cpu_relax, and thus we can take it out of
> mutex.h. While this can seem redundant or weird, I believe it is a
> good choice as it allows us to move arch-specific logic out of generic
> locking primitives and enables future(?) archs to transparently define
> it, similarly to System Z.
> 
> Please note that these changes are only tested on x86-64.

While I like the general idea, does anyone have a better name for this?
So in particular, the difference is that on s390:

 cpu_relax()                - yields the vcpu
 arch_{,mutex_}cpu_relax()  - will actually spin-wait
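
The call sites being converted are short spin-waits of this shape
(condensed from the mutex_spin_on_owner() hunk above; the loop
condition is paraphrased):

	while (owner_running(lock, owner)) {
		if (need_resched())
			break;
		/* This must stay cheap and on-CPU: yielding the vcpu
		 * here, as s390's plain cpu_relax() does, defeats the
		 * point of optimistic spinning. */
		arch_cpu_relax();
	}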




* Re: [PATCH] arch,locking: Ciao arch_mutex_cpu_relax()
  2014-06-23  6:54 ` Peter Zijlstra
@ 2014-06-23  7:13   ` Vineet Gupta
  2014-06-23 11:34     ` Peter Zijlstra
From: Vineet Gupta @ 2014-06-23  7:13 UTC (permalink / raw)
  To: Peter Zijlstra, Davidlohr Bueso
  Cc: Ingo Molnar, Linus Torvalds, Heiko Carstens,
	Aswin Chandramouleeswaran, linux-kernel

Hi Peter,

On Monday 23 June 2014 12:24 PM, Peter Zijlstra wrote:
> On Fri, Jun 20, 2014 at 11:21:13AM -0700, Davidlohr Bueso wrote:
>> diff --git a/arch/arc/include/asm/processor.h b/arch/arc/include/asm/processor.h
>> index d99f9b3..8e1bf6b 100644
>> --- a/arch/arc/include/asm/processor.h
>> +++ b/arch/arc/include/asm/processor.h
>> @@ -62,6 +62,8 @@ unsigned long thread_saved_pc(struct task_struct *t);
>>  #define cpu_relax()	do { } while (0)
>>  #endif
>>  
>> +#define arch_cpu_relax() cpu_relax()
>> +
>>  #define copy_segments(tsk, mm)      do { } while (0)
>>  #define release_segments(mm)        do { } while (0)
> 
> I'm not at all sure that the cpu_relax() definition ARC has is valid. We
> rely on cpu_relax() being at least a barrier() all over the place, and
> it doesn't need to be SMP only. You can have a UP wait loop waiting for
> an interrupt for example.
> 
> Vineet?

Over the years we've not had any trouble with !SMP cpu_relax() being a no-op (and
the barrier version was only required when we hit a hard hang in our initial SMP
code). UP busy-wait looping would be frowned upon in general.

However, what we have now is just a code optimization quirk for !SMP, since a
compiler barrier will cause gcc to dump out and reload scratch regs - especially
for our deep reg file.

Here's what I get with current UP kernel switching to compiler barrier

./scripts/bloat-o-meter vmlinux-pre-cpu-relax  vmlinux | head
add/remove: 1/0 grow/shrink: 75/5 up/down: 1218/-32 (1186)
function                                     old     new   delta
path_init                                    708     826    +118
sys_semtimedop                              2540    2640    +100
...
__slab_alloc.isra.constprop                  564     560      -4
deactivate_slab                              886     878      -8

So it doesn't look too bad, although I've not run any performance tests. We can
switch UP to barrier if you feel it is needed semantically.
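
(A toy example of where the extra code comes from, hypothetical and not
from the kernel tree: barrier()'s "memory" clobber makes gcc assume all
memory may have changed, so values it would otherwise keep live in
scratch registers must be spilled and reloaded.)

	#define barrier()	__asm__ __volatile__("" ::: "memory")

	int wait_sum(volatile int *stop, int *a, int *b)
	{
		int s = 0;

		while (!*stop) {
			s += *a + *b;	/* reloaded every pass... */
			barrier();	/* ...because of this clobber;
					 * without it, gcc hoists both
					 * loads out of the loop */
		}
		return s;
	}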

-Vineet




* Re: [PATCH] arch,locking: Ciao arch_mutex_cpu_relax()
  2014-06-23  7:13   ` Vineet Gupta
@ 2014-06-23 11:34     ` Peter Zijlstra
From: Peter Zijlstra @ 2014-06-23 11:34 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Davidlohr Bueso, Ingo Molnar, Linus Torvalds, Heiko Carstens,
	Aswin Chandramouleeswaran, linux-kernel

On Mon, Jun 23, 2014 at 12:43:46PM +0530, Vineet Gupta wrote:
> Hi Peter,
> 
> On Monday 23 June 2014 12:24 PM, Peter Zijlstra wrote:
> > On Fri, Jun 20, 2014 at 11:21:13AM -0700, Davidlohr Bueso wrote:
> >> diff --git a/arch/arc/include/asm/processor.h b/arch/arc/include/asm/processor.h
> >> index d99f9b3..8e1bf6b 100644
> >> --- a/arch/arc/include/asm/processor.h
> >> +++ b/arch/arc/include/asm/processor.h
> >> @@ -62,6 +62,8 @@ unsigned long thread_saved_pc(struct task_struct *t);
> >>  #define cpu_relax()	do { } while (0)
> >>  #endif
> >>  
> >> +#define arch_cpu_relax() cpu_relax()
> >> +
> >>  #define copy_segments(tsk, mm)      do { } while (0)
> >>  #define release_segments(mm)        do { } while (0)
> > 
> > I'm not at all sure that the cpu_relax() definition ARC has is valid. We
> > rely on cpu_relax() being at least a barrier() all over the place, and
> > it doesn't need to be SMP only. You can have a UP wait loop waiting for
> > an interrupt for example.
> > 
> > Vineet?
> 
> Over the years we've not had any trouble with !SMP cpu_relax() being a no-op (and
> the barrier version was only required when we hit a hard hang in our initial SMP
> code). UP busy-wait looping would be frowned upon in general.
> 
> However, what we have now is just a code optimization quirk for !SMP, since a
> compiler barrier will cause gcc to dump out and reload scratch regs - especially
> for our deep reg file.
> 
> Here's what I get with current UP kernel switching to compiler barrier
> 
> ./scripts/bloat-o-meter vmlinux-pre-cpu-relax  vmlinux | head
> add/remove: 1/0 grow/shrink: 75/5 up/down: 1218/-32 (1186)
> function                                     old     new   delta
> path_init                                    708     826    +118
> sys_semtimedop                              2540    2640    +100
> ...
> __slab_alloc.isra.constprop                  564     560      -4
> deactivate_slab                              886     878      -8
> 
> So it doesn't look too bad, although I've not run any performance tests. We can
> switch UP to barrier if you feel it is needed semantically.

Hard call; semantically it makes more sense to have it be a barrier, and it's
also certainly safer. Then again, you say you've been running like this for
years etc., and I agree that UP spin-loops _should_ be rare.


* Re: [PATCH] arch,locking: Ciao arch_mutex_cpu_relax()
  2014-06-23  6:58 ` Peter Zijlstra
@ 2014-06-24 15:06   ` Davidlohr Bueso
  2014-06-25  6:25     ` Heiko Carstens
From: Davidlohr Bueso @ 2014-06-24 15:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Linus Torvalds, Heiko Carstens,
	Aswin Chandramouleeswaran, linux-kernel

On Mon, 2014-06-23 at 08:58 +0200, Peter Zijlstra wrote:
> On Fri, Jun 20, 2014 at 11:21:13AM -0700, Davidlohr Bueso wrote:
> > From: Davidlohr Bueso <davidlohr@hp.com>
> > 
> > The arch_mutex_cpu_relax() function, introduced by 34b133f, is
> > hacky and ugly. It was added a few years ago to address the fact
> > that common cpu_relax() calls include yielding on s390, and thus
> > impact the optimistic spinning functionality of mutexes. Nowadays
> > we use this function well beyond mutexes: rwsem, qrwlock, mcs and
> > lockref. Since the macro that defines the call is in the mutex header,
> > any users must include mutex.h and the naming is misleading as well.
> > 
> > This patch (i) renames the call to arch_cpu_relax (for lack of a better
> > name), and (ii) defines it in each arch's asm/processor.h local header,
> > just like for regular cpu_relax() functions. On all archs, except s390,
> > arch_cpu_relax is simply cpu_relax, and thus we can take it out of
> > mutex.h. While this can seem redundant or weird, I believe it is a
> > good choice as it allows us to move arch-specific logic out of generic
> > locking primitives and enables future(?) archs to transparently define
> > it, similarly to System Z.
> > 
> > Please note that these changes are only tested on x86-64.
> 
> While I like the general idea, does anyone have a better name for this?
> So in particular, the difference is that on s390:
> 
>  cpu_relax()                - yields the vcpu
>  arch_{,mutex_}cpu_relax()  - will actually spin-wait

iirc Heiko had suggested cpu_relax_simple() in the past. I don't think
it's any better or worse than arch_cpu_relax(). For s390
cpu_relax_noyield() would perhaps be suitable, but not very descriptive
for the rest of the archs. I'm really lacking creativity for this name.

Thanks,
Davidlohr




* Re: [PATCH] arch,locking: Ciao arch_mutex_cpu_relax()
  2014-06-24 15:06   ` Davidlohr Bueso
@ 2014-06-25  6:25     ` Heiko Carstens
  2014-06-25 11:57       ` Linus Torvalds
From: Heiko Carstens @ 2014-06-25  6:25 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Peter Zijlstra, Ingo Molnar, Linus Torvalds,
	Aswin Chandramouleeswaran, linux-kernel

On Tue, Jun 24, 2014 at 08:06:55AM -0700, Davidlohr Bueso wrote:
> On Mon, 2014-06-23 at 08:58 +0200, Peter Zijlstra wrote:
> > While I like the general idea, does anyone have a better name for this?
> > So in particular, the difference is that on s390:
> > 
> >  cpu_relax()                - yields the vcpu
> >  arch_{,mutex_}cpu_relax()  - will actually spin-wait
> 
> iirc Heiko had suggested cpu_relax_simple() in the past. I don't think
> it's any better or worse than arch_cpu_relax(). For s390
> cpu_relax_noyield() would perhaps be suitable, but not very descriptive
> for the rest of the archs. I'm really lacking creativity for this name.

Maybe cpu_relax_spin()? However, that doesn't sound much better either.



* Re: [PATCH] arch,locking: Ciao arch_mutex_cpu_relax()
  2014-06-25  6:25     ` Heiko Carstens
@ 2014-06-25 11:57       ` Linus Torvalds
  2014-06-29 22:09         ` [PATCH v2] " Davidlohr Bueso
From: Linus Torvalds @ 2014-06-25 11:57 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Davidlohr Bueso, Peter Zijlstra, Ingo Molnar,
	Aswin Chandramouleeswaran, Linux Kernel Mailing List

On Tue, Jun 24, 2014 at 11:25 PM, Heiko Carstens
<heiko.carstens@de.ibm.com> wrote:
>
> Maybe cpu_relax_spin()? However, that doesn't sound much better either.

"cpu_relax_lowlatency()", perhaps? Naming it for what the use is
("relax, but only if you can do it with very low latency")

             Linus


* [PATCH v2] arch,locking: Ciao arch_mutex_cpu_relax()
  2014-06-25 11:57       ` Linus Torvalds
@ 2014-06-29 22:09         ` Davidlohr Bueso
  2014-08-05 13:04           ` Geert Uytterhoeven
From: Davidlohr Bueso @ 2014-06-29 22:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Heiko Carstens, Peter Zijlstra, Ingo Molnar,
	Aswin Chandramouleeswaran, Linux Kernel Mailing List

From: Davidlohr Bueso <davidlohr@hp.com>

The arch_mutex_cpu_relax() function, introduced by 34b133f, is
hacky and ugly. It was added a few years ago to address the fact
that common cpu_relax() calls include yielding on s390, and thus
impact the optimistic spinning functionality of mutexes. Nowadays
we use this function well beyond mutexes: rwsem, qrwlock, mcs and
lockref. Since the macro that defines the call is in the mutex header,
any users must include mutex.h and the naming is misleading as well.

This patch (i) renames the call to cpu_relax_lowlatency ("relax, but
only if you can do it with very low latency") and (ii) defines it in
each arch's asm/processor.h local header, just like for regular cpu_relax
functions. On all archs, except s390, cpu_relax_lowlatency is simply cpu_relax,
and thus we can take it out of mutex.h. While this can seem redundant,
I believe it is a good choice as it allows us to move arch-specific
logic out of generic locking primitives and enables future(?) archs to
transparently define it, similarly to System Z.

Please note that these changes are only tested on x86-64.

Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
---
Changes from v1: Rename arch_cpu_relax to cpu_relax_lowlatency,
based on the purpose of the function, as suggested by Linus.

 arch/alpha/include/asm/processor.h     | 1 +
 arch/arc/include/asm/processor.h       | 2 ++
 arch/arm/include/asm/processor.h       | 2 ++
 arch/arm64/include/asm/processor.h     | 1 +
 arch/avr32/include/asm/processor.h     | 1 +
 arch/blackfin/include/asm/processor.h  | 2 +-
 arch/c6x/include/asm/processor.h       | 1 +
 arch/cris/include/asm/processor.h      | 1 +
 arch/hexagon/include/asm/processor.h   | 1 +
 arch/ia64/include/asm/processor.h      | 1 +
 arch/m32r/include/asm/processor.h      | 1 +
 arch/m68k/include/asm/processor.h      | 1 +
 arch/metag/include/asm/processor.h     | 1 +
 arch/mips/include/asm/processor.h      | 1 +
 arch/mn10300/include/asm/processor.h   | 2 ++
 arch/openrisc/include/asm/processor.h  | 1 +
 arch/parisc/include/asm/processor.h    | 1 +
 arch/powerpc/include/asm/processor.h   | 2 ++
 arch/s390/include/asm/processor.h      | 2 +-
 arch/score/include/asm/processor.h     | 1 +
 arch/sh/include/asm/processor.h        | 1 +
 arch/sparc/include/asm/processor_32.h  | 2 ++
 arch/sparc/include/asm/processor_64.h  | 1 +
 arch/tile/include/asm/processor.h      | 2 ++
 arch/unicore32/include/asm/processor.h | 1 +
 arch/x86/include/asm/processor.h       | 2 ++
 arch/xtensa/include/asm/processor.h    | 1 +
 include/linux/mutex.h                  | 4 ----
 kernel/locking/mcs_spinlock.c          | 8 +++-----
 kernel/locking/mcs_spinlock.h          | 4 ++--
 kernel/locking/mutex.c                 | 4 ++--
 kernel/locking/qrwlock.c               | 9 ++++-----
 kernel/locking/rwsem-xadd.c            | 4 ++--
 lib/lockref.c                          | 3 +--
 34 files changed, 48 insertions(+), 24 deletions(-)

diff --git a/arch/alpha/include/asm/processor.h b/arch/alpha/include/asm/processor.h
index 6cb7fe8..b4cf036 100644
--- a/arch/alpha/include/asm/processor.h
+++ b/arch/alpha/include/asm/processor.h
@@ -57,6 +57,7 @@ unsigned long get_wchan(struct task_struct *p);
   ((tsk) == current ? rdusp() : task_thread_info(tsk)->pcb.usp)
 
 #define cpu_relax()	barrier()
+#define cpu_relax_lowlatency() cpu_relax()
 
 #define ARCH_HAS_PREFETCH
 #define ARCH_HAS_PREFETCHW
diff --git a/arch/arc/include/asm/processor.h b/arch/arc/include/asm/processor.h
index d99f9b3..82588f3 100644
--- a/arch/arc/include/asm/processor.h
+++ b/arch/arc/include/asm/processor.h
@@ -62,6 +62,8 @@ unsigned long thread_saved_pc(struct task_struct *t);
 #define cpu_relax()	do { } while (0)
 #endif
 
+#define cpu_relax_lowlatency() cpu_relax()
+
 #define copy_segments(tsk, mm)      do { } while (0)
 #define release_segments(mm)        do { } while (0)
 
diff --git a/arch/arm/include/asm/processor.h b/arch/arm/include/asm/processor.h
index c3d5fc1..8a1e8e9 100644
--- a/arch/arm/include/asm/processor.h
+++ b/arch/arm/include/asm/processor.h
@@ -82,6 +82,8 @@ unsigned long get_wchan(struct task_struct *p);
 #define cpu_relax()			barrier()
 #endif
 
+#define cpu_relax_lowlatency()                cpu_relax()
+
 #define task_pt_regs(p) \
 	((struct pt_regs *)(THREAD_START_SP + task_stack_page(p)) - 1)
 
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 34de2a8..4610b0d 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -129,6 +129,7 @@ extern void release_thread(struct task_struct *);
 unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()			barrier()
+#define cpu_relax_lowlatency()                cpu_relax()
 
 /* Thread switching */
 extern struct task_struct *cpu_switch_to(struct task_struct *prev,
diff --git a/arch/avr32/include/asm/processor.h b/arch/avr32/include/asm/processor.h
index 972adcc..941593c 100644
--- a/arch/avr32/include/asm/processor.h
+++ b/arch/avr32/include/asm/processor.h
@@ -92,6 +92,7 @@ extern struct avr32_cpuinfo boot_cpu_data;
 #define TASK_UNMAPPED_BASE	(PAGE_ALIGN(TASK_SIZE / 3))
 
 #define cpu_relax()		barrier()
+#define cpu_relax_lowlatency()        cpu_relax()
 #define cpu_sync_pipeline()	asm volatile("sub pc, -2" : : : "memory")
 
 struct cpu_context {
diff --git a/arch/blackfin/include/asm/processor.h b/arch/blackfin/include/asm/processor.h
index d0e72e9..7acd466 100644
--- a/arch/blackfin/include/asm/processor.h
+++ b/arch/blackfin/include/asm/processor.h
@@ -99,7 +99,7 @@ unsigned long get_wchan(struct task_struct *p);
 #define	KSTK_ESP(tsk)	((tsk) == current ? rdusp() : (tsk)->thread.usp)
 
 #define cpu_relax()    	smp_mb()
-
+#define cpu_relax_lowlatency() cpu_relax()
 
 /* Get the Silicon Revision of the chip */
 static inline uint32_t __pure bfin_revid(void)
diff --git a/arch/c6x/include/asm/processor.h b/arch/c6x/include/asm/processor.h
index b9eb3da..f2ef31b 100644
--- a/arch/c6x/include/asm/processor.h
+++ b/arch/c6x/include/asm/processor.h
@@ -121,6 +121,7 @@ extern unsigned long get_wchan(struct task_struct *p);
 #define KSTK_ESP(task)	(task_pt_regs(task)->sp)
 
 #define cpu_relax()		do { } while (0)
+#define cpu_relax_lowlatency()        cpu_relax()
 
 extern const struct seq_operations cpuinfo_op;
 
diff --git a/arch/cris/include/asm/processor.h b/arch/cris/include/asm/processor.h
index 15b815d..862126b 100644
--- a/arch/cris/include/asm/processor.h
+++ b/arch/cris/include/asm/processor.h
@@ -63,6 +63,7 @@ static inline void release_thread(struct task_struct *dead_task)
 #define init_stack      (init_thread_union.stack)
 
 #define cpu_relax()     barrier()
+#define cpu_relax_lowlatency() cpu_relax()
 
 void default_idle(void);
 
diff --git a/arch/hexagon/include/asm/processor.h b/arch/hexagon/include/asm/processor.h
index 45a8254..d850113 100644
--- a/arch/hexagon/include/asm/processor.h
+++ b/arch/hexagon/include/asm/processor.h
@@ -56,6 +56,7 @@ struct thread_struct {
 }
 
 #define cpu_relax() __vmyield()
+#define cpu_relax_lowlatency() cpu_relax()
 
 /*
  * Decides where the kernel will search for a free chunk of vm space during
diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h
index efd1b92..c736713 100644
--- a/arch/ia64/include/asm/processor.h
+++ b/arch/ia64/include/asm/processor.h
@@ -548,6 +548,7 @@ ia64_eoi (void)
 }
 
 #define cpu_relax()	ia64_hint(ia64_hint_pause)
+#define cpu_relax_lowlatency() cpu_relax()
 
 static inline int
 ia64_get_irr(unsigned int vector)
diff --git a/arch/m32r/include/asm/processor.h b/arch/m32r/include/asm/processor.h
index 5767367..9f8fd9b 100644
--- a/arch/m32r/include/asm/processor.h
+++ b/arch/m32r/include/asm/processor.h
@@ -133,5 +133,6 @@ unsigned long get_wchan(struct task_struct *p);
 #define KSTK_ESP(tsk)  ((tsk)->thread.sp)
 
 #define cpu_relax()	barrier()
+#define cpu_relax_lowlatency() cpu_relax()
 
 #endif /* _ASM_M32R_PROCESSOR_H */
diff --git a/arch/m68k/include/asm/processor.h b/arch/m68k/include/asm/processor.h
index b0768a6..20dda1d 100644
--- a/arch/m68k/include/asm/processor.h
+++ b/arch/m68k/include/asm/processor.h
@@ -176,5 +176,6 @@ unsigned long get_wchan(struct task_struct *p);
 #define task_pt_regs(tsk)	((struct pt_regs *) ((tsk)->thread.esp0))
 
 #define cpu_relax()	barrier()
+#define cpu_relax_lowlatency() cpu_relax()
 
 #endif
diff --git a/arch/metag/include/asm/processor.h b/arch/metag/include/asm/processor.h
index a8a3747..881071c 100644
--- a/arch/metag/include/asm/processor.h
+++ b/arch/metag/include/asm/processor.h
@@ -155,6 +155,7 @@ unsigned long get_wchan(struct task_struct *p);
 #define user_stack_pointer(regs)        ((regs)->ctx.AX[0].U0)
 
 #define cpu_relax()     barrier()
+#define cpu_relax_lowlatency()  cpu_relax()
 
 extern void setup_priv(void);
 
diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
index ad70cba..d5098bc 100644
--- a/arch/mips/include/asm/processor.h
+++ b/arch/mips/include/asm/processor.h
@@ -367,6 +367,7 @@ unsigned long get_wchan(struct task_struct *p);
 #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
 
 #define cpu_relax()	barrier()
+#define cpu_relax_lowlatency() cpu_relax()
 
 /*
  * Return_address is a replacement for __builtin_return_address(count)
diff --git a/arch/mn10300/include/asm/processor.h b/arch/mn10300/include/asm/processor.h
index 8b80b19..769d5ed 100644
--- a/arch/mn10300/include/asm/processor.h
+++ b/arch/mn10300/include/asm/processor.h
@@ -68,7 +68,9 @@ extern struct mn10300_cpuinfo cpu_data[];
 extern void identify_cpu(struct mn10300_cpuinfo *);
 extern void print_cpu_info(struct mn10300_cpuinfo *);
 extern void dodgy_tsc(void);
+
 #define cpu_relax() barrier()
+#define cpu_relax_lowlatency() cpu_relax()
 
 /*
  * User space process size: 1.75GB (default).
diff --git a/arch/openrisc/include/asm/processor.h b/arch/openrisc/include/asm/processor.h
index cab746f..4d235e3 100644
--- a/arch/openrisc/include/asm/processor.h
+++ b/arch/openrisc/include/asm/processor.h
@@ -101,6 +101,7 @@ extern unsigned long thread_saved_pc(struct task_struct *t);
 #define init_stack      (init_thread_union.stack)
 
 #define cpu_relax()     barrier()
+#define cpu_relax_lowlatency() cpu_relax()
 
 #endif /* __ASSEMBLY__ */
 #endif /* __ASM_OPENRISC_PROCESSOR_H */
diff --git a/arch/parisc/include/asm/processor.h b/arch/parisc/include/asm/processor.h
index d951c96..689a8ad 100644
--- a/arch/parisc/include/asm/processor.h
+++ b/arch/parisc/include/asm/processor.h
@@ -338,6 +338,7 @@ extern unsigned long get_wchan(struct task_struct *p);
 #define KSTK_ESP(tsk)	((tsk)->thread.regs.gr[30])
 
 #define cpu_relax()	barrier()
+#define cpu_relax_lowlatency() cpu_relax()
 
 /* Used as a macro to identify the combined VIPT/PIPT cached
  * CPUs which require a guarantee of coherency (no inequivalent
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 6d59072..dda7ac4 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -400,6 +400,8 @@ static inline unsigned long __pack_fe01(unsigned int fpmode)
 #define cpu_relax()	barrier()
 #endif
 
+#define cpu_relax_lowlatency() cpu_relax()
+
 /* Check that a certain kernel stack pointer is valid in task_struct p */
 int validate_sp(unsigned long sp, struct task_struct *p,
                        unsigned long nbytes);
diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h
index 6f02d45..e568fc8 100644
--- a/arch/s390/include/asm/processor.h
+++ b/arch/s390/include/asm/processor.h
@@ -217,7 +217,7 @@ static inline void cpu_relax(void)
 	barrier();
 }
 
-#define arch_mutex_cpu_relax()  barrier()
+#define cpu_relax_lowlatency()  barrier()
 
 static inline void psw_set_key(unsigned int key)
 {
diff --git a/arch/score/include/asm/processor.h b/arch/score/include/asm/processor.h
index d9a922d..851f441 100644
--- a/arch/score/include/asm/processor.h
+++ b/arch/score/include/asm/processor.h
@@ -24,6 +24,7 @@ extern unsigned long get_wchan(struct task_struct *p);
 #define current_text_addr() ({ __label__ _l; _l: &&_l; })
 
 #define cpu_relax()		barrier()
+#define cpu_relax_lowlatency()        cpu_relax()
 #define release_thread(thread)	do {} while (0)
 
 /*
diff --git a/arch/sh/include/asm/processor.h b/arch/sh/include/asm/processor.h
index 5448f9b..1506897 100644
--- a/arch/sh/include/asm/processor.h
+++ b/arch/sh/include/asm/processor.h
@@ -97,6 +97,7 @@ extern struct sh_cpuinfo cpu_data[];
 
 #define cpu_sleep()	__asm__ __volatile__ ("sleep" : : : "memory")
 #define cpu_relax()	barrier()
+#define cpu_relax_lowlatency() cpu_relax()
 
 void default_idle(void);
 void stop_this_cpu(void *);
diff --git a/arch/sparc/include/asm/processor_32.h b/arch/sparc/include/asm/processor_32.h
index 2c7baa4..f873358 100644
--- a/arch/sparc/include/asm/processor_32.h
+++ b/arch/sparc/include/asm/processor_32.h
@@ -118,6 +118,8 @@ extern unsigned long get_wchan(struct task_struct *);
 extern struct task_struct *last_task_used_math;
 
 #define cpu_relax()	barrier()
+#define cpu_relax_lowlatency() cpu_relax()
+
 extern void (*sparc_idle)(void);
 
 #endif
diff --git a/arch/sparc/include/asm/processor_64.h b/arch/sparc/include/asm/processor_64.h
index 4c3f7f0..733544b 100644
--- a/arch/sparc/include/asm/processor_64.h
+++ b/arch/sparc/include/asm/processor_64.h
@@ -216,6 +216,7 @@ extern unsigned long get_wchan(struct task_struct *task);
 				     "nop\n\t"				\
 				     ".previous"			\
 				     ::: "memory")
+#define cpu_relax_lowlatency() cpu_relax()
 
 /* Prefetch support.  This is tuned for UltraSPARC-III and later.
  * UltraSPARC-I will treat these as nops, and UltraSPARC-II has
diff --git a/arch/tile/include/asm/processor.h b/arch/tile/include/asm/processor.h
index 4232363..dd4f9f1 100644
--- a/arch/tile/include/asm/processor.h
+++ b/arch/tile/include/asm/processor.h
@@ -266,6 +266,8 @@ static inline void cpu_relax(void)
 	barrier();
 }
 
+#define cpu_relax_lowlatency() cpu_relax()
+
 /* Info on this processor (see fs/proc/cpuinfo.c) */
 struct seq_operations;
 extern const struct seq_operations cpuinfo_op;
diff --git a/arch/unicore32/include/asm/processor.h b/arch/unicore32/include/asm/processor.h
index 4eaa421..8d21b7a 100644
--- a/arch/unicore32/include/asm/processor.h
+++ b/arch/unicore32/include/asm/processor.h
@@ -71,6 +71,7 @@ extern void release_thread(struct task_struct *);
 unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()			barrier()
+#define cpu_relax_lowlatency()                cpu_relax()
 
 #define task_pt_regs(p) \
 	((struct pt_regs *)(THREAD_START_SP + task_stack_page(p)) - 1)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a4ea023..32cc237 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -696,6 +696,8 @@ static inline void cpu_relax(void)
 	rep_nop();
 }
 
+#define cpu_relax_lowlatency() cpu_relax()
+
 /* Stop speculative execution and prefetching of modified code. */
 static inline void sync_core(void)
 {
diff --git a/arch/xtensa/include/asm/processor.h b/arch/xtensa/include/asm/processor.h
index abb5970..b61bdf0 100644
--- a/arch/xtensa/include/asm/processor.h
+++ b/arch/xtensa/include/asm/processor.h
@@ -182,6 +182,7 @@ extern unsigned long get_wchan(struct task_struct *p);
 #define KSTK_ESP(tsk)		(task_pt_regs(tsk)->areg[1])
 
 #define cpu_relax()  barrier()
+#define cpu_relax_lowlatency() cpu_relax()
 
 /* Special register access. */
 
diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 11692de..cf63ead 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -176,8 +176,4 @@ extern void mutex_unlock(struct mutex *lock);
 
 extern int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock);
 
-#ifndef arch_mutex_cpu_relax
-# define arch_mutex_cpu_relax() cpu_relax()
-#endif
-
 #endif /* __LINUX_MUTEX_H */
diff --git a/kernel/locking/mcs_spinlock.c b/kernel/locking/mcs_spinlock.c
index 838dc9e..b469bd3 100644
--- a/kernel/locking/mcs_spinlock.c
+++ b/kernel/locking/mcs_spinlock.c
@@ -1,6 +1,4 @@
-
 #include <linux/percpu.h>
-#include <linux/mutex.h>
 #include <linux/sched.h>
 #include "mcs_spinlock.h"
 
@@ -53,7 +51,7 @@ osq_wait_next(struct optimistic_spin_queue **lock,
 				break;
 		}
 
-		arch_mutex_cpu_relax();
+		cpu_relax_lowlatency();
 	}
 
 	return next;
@@ -89,7 +87,7 @@ bool osq_lock(struct optimistic_spin_queue **lock)
 		if (need_resched())
 			goto unqueue;
 
-		arch_mutex_cpu_relax();
+		cpu_relax_lowlatency();
 	}
 	return true;
 
@@ -115,7 +113,7 @@ unqueue:
 		if (smp_load_acquire(&node->locked))
 			return true;
 
-		arch_mutex_cpu_relax();
+		cpu_relax_lowlatency();
 
 		/*
 		 * Or we race against a concurrent unqueue()'s step-B, in which
diff --git a/kernel/locking/mcs_spinlock.h b/kernel/locking/mcs_spinlock.h
index a2dbac4..856880e 100644
--- a/kernel/locking/mcs_spinlock.h
+++ b/kernel/locking/mcs_spinlock.h
@@ -27,7 +27,7 @@ struct mcs_spinlock {
 #define arch_mcs_spin_lock_contended(l)					\
 do {									\
 	while (!(smp_load_acquire(l)))					\
-		arch_mutex_cpu_relax();					\
+		cpu_relax_lowlatency();					\
 } while (0)
 #endif
 
@@ -104,7 +104,7 @@ void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
 			return;
 		/* Wait until the next pointer is set */
 		while (!(next = ACCESS_ONCE(node->next)))
-			arch_mutex_cpu_relax();
+			cpu_relax_lowlatency();
 	}
 
 	/* Pass lock to next waiter. */
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index bc73d33..19ceb91 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -152,7 +152,7 @@ int mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner)
 		if (need_resched())
 			break;
 
-		arch_mutex_cpu_relax();
+		cpu_relax_lowlatency();
 	}
 	rcu_read_unlock();
 
@@ -471,7 +471,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		 * memory barriers as we'll eventually observe the right
 		 * values at the cost of a few extra spins.
 		 */
-		arch_mutex_cpu_relax();
+		cpu_relax_lowlatency();
 	}
 	osq_unlock(&lock->osq);
 slowpath:
diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index fb5b8ac..f956ede 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -20,7 +20,6 @@
 #include <linux/cpumask.h>
 #include <linux/percpu.h>
 #include <linux/hardirq.h>
-#include <linux/mutex.h>
 #include <asm/qrwlock.h>
 
 /**
@@ -35,7 +34,7 @@ static __always_inline void
 rspin_until_writer_unlock(struct qrwlock *lock, u32 cnts)
 {
 	while ((cnts & _QW_WMASK) == _QW_LOCKED) {
-		arch_mutex_cpu_relax();
+		cpu_relax_lowlatency();
 		cnts = smp_load_acquire((u32 *)&lock->cnts);
 	}
 }
@@ -75,7 +74,7 @@ void queue_read_lock_slowpath(struct qrwlock *lock)
 	 * to make sure that the write lock isn't taken.
 	 */
 	while (atomic_read(&lock->cnts) & _QW_WMASK)
-		arch_mutex_cpu_relax();
+		cpu_relax_lowlatency();
 
 	cnts = atomic_add_return(_QR_BIAS, &lock->cnts) - _QR_BIAS;
 	rspin_until_writer_unlock(lock, cnts);
@@ -114,7 +113,7 @@ void queue_write_lock_slowpath(struct qrwlock *lock)
 				    cnts | _QW_WAITING) == cnts))
 			break;
 
-		arch_mutex_cpu_relax();
+		cpu_relax_lowlatency();
 	}
 
 	/* When no more readers, set the locked flag */
@@ -125,7 +124,7 @@ void queue_write_lock_slowpath(struct qrwlock *lock)
 				    _QW_LOCKED) == _QW_WAITING))
 			break;
 
-		arch_mutex_cpu_relax();
+		cpu_relax_lowlatency();
 	}
 unlock:
 	arch_spin_unlock(&lock->lock);
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index dacc321..f69ddb0 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -329,7 +329,7 @@ bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct task_struct *owner)
 		if (need_resched())
 			break;
 
-		arch_mutex_cpu_relax();
+		cpu_relax_lowlatency();
 	}
 	rcu_read_unlock();
 
@@ -381,7 +381,7 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 		 * memory barriers as we'll eventually observe the right
 		 * values at the cost of a few extra spins.
 		 */
-		arch_mutex_cpu_relax();
+		cpu_relax_lowlatency();
 	}
 	osq_unlock(&sem->osq);
 done:
diff --git a/lib/lockref.c b/lib/lockref.c
index f07a40d..d2233de 100644
--- a/lib/lockref.c
+++ b/lib/lockref.c
@@ -1,6 +1,5 @@
 #include <linux/export.h>
 #include <linux/lockref.h>
-#include <linux/mutex.h>
 
 #if USE_CMPXCHG_LOCKREF
 
@@ -29,7 +28,7 @@
 		if (likely(old.lock_count == prev.lock_count)) {		\
 			SUCCESS;						\
 		}								\
-		arch_mutex_cpu_relax();						\
+		cpu_relax_lowlatency();						\
 	}									\
 } while (0)
 
-- 
1.8.1.4





* Re: [PATCH v2] arch,locking: Ciao arch_mutex_cpu_relax()
  2014-06-29 22:09         ` [PATCH v2] " Davidlohr Bueso
@ 2014-08-05 13:04           ` Geert Uytterhoeven
  2014-08-05 17:42             ` Davidlohr Bueso
From: Geert Uytterhoeven @ 2014-08-05 13:04 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Linus Torvalds, Heiko Carstens, Peter Zijlstra, Ingo Molnar,
	Aswin Chandramouleeswaran, Linux Kernel Mailing List

Hi David,

On Mon, Jun 30, 2014 at 12:09 AM, Davidlohr Bueso <davidlohr@hp.com> wrote:
> From: Davidlohr Bueso <davidlohr@hp.com>
>
> The arch_mutex_cpu_relax() function, introduced by 34b133f, is
> hacky and ugly. It was added a few years ago to address the fact
> that common cpu_relax() calls include yielding on s390, and thus
> impact the optimistic spinning functionality of mutexes. Nowadays
> we use this function well beyond mutexes: rwsem, qrwlock, mcs and
> lockref. Since the macro that defines the call is in the mutex header,
> any users must include mutex.h and the naming is misleading as well.
>
> This patch (i) renames the call to cpu_relax_lowlatency ("relax, but
> only if you can do it with very low latency") and (ii) defines it in
> each arch's asm/processor.h local header, just like for regular cpu_relax
> functions. On all archs, except s390, cpu_relax_lowlatency is simply cpu_relax,
> and thus we can take it out of mutex.h. While this can seem redundant,
> I believe it is a good choice as it allows us to move arch-specific
> logic out of generic locking primitives and enables future(?) archs to
> transparently define it, similarly to System Z.
>
> Please note that these changes are only tested on x86-64.
>
> Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
> ---
> Changes from v1: Rename arch_cpu_relax to cpu_relax_lowlatency,
> based on the purpose of the function, as suggested by Linus.
>
>  arch/alpha/include/asm/processor.h     | 1 +
>  arch/arc/include/asm/processor.h       | 2 ++
>  arch/arm/include/asm/processor.h       | 2 ++
>  arch/arm64/include/asm/processor.h     | 1 +
>  arch/avr32/include/asm/processor.h     | 1 +
>  arch/blackfin/include/asm/processor.h  | 2 +-
>  arch/c6x/include/asm/processor.h       | 1 +
>  arch/cris/include/asm/processor.h      | 1 +
>  arch/hexagon/include/asm/processor.h   | 1 +
>  arch/ia64/include/asm/processor.h      | 1 +
>  arch/m32r/include/asm/processor.h      | 1 +
>  arch/m68k/include/asm/processor.h      | 1 +
>  arch/metag/include/asm/processor.h     | 1 +
>  arch/mips/include/asm/processor.h      | 1 +
>  arch/mn10300/include/asm/processor.h   | 2 ++
>  arch/openrisc/include/asm/processor.h  | 1 +
>  arch/parisc/include/asm/processor.h    | 1 +
>  arch/powerpc/include/asm/processor.h   | 2 ++
>  arch/s390/include/asm/processor.h      | 2 +-
>  arch/score/include/asm/processor.h     | 1 +
>  arch/sh/include/asm/processor.h        | 1 +
>  arch/sparc/include/asm/processor_32.h  | 2 ++
>  arch/sparc/include/asm/processor_64.h  | 1 +
>  arch/tile/include/asm/processor.h      | 2 ++
>  arch/unicore32/include/asm/processor.h | 1 +
>  arch/x86/include/asm/processor.h       | 2 ++
>  arch/xtensa/include/asm/processor.h    | 1 +
>  include/linux/mutex.h                  | 4 ----
>  kernel/locking/mcs_spinlock.c          | 8 +++-----
>  kernel/locking/mcs_spinlock.h          | 4 ++--
>  kernel/locking/mutex.c                 | 4 ++--
>  kernel/locking/qrwlock.c               | 9 ++++-----
>  kernel/locking/rwsem-xadd.c            | 4 ++--
>  lib/lockref.c                          | 3 +--
>  34 files changed, 48 insertions(+), 24 deletions(-)

It looks like you forgot to update frv? It's been failing on -next for a
few days:

kernel/locking/mcs_spinlock.h:87:2: error: implicit declaration of
function 'cpu_relax_lowlatency'
[-Werror=implicit-function-declaration]
cc1: some warnings being treated as errors
kernel/locking/mcs_spinlock.h:87:2: error: implicit declaration of
function 'cpu_relax_lowlatency'
[-Werror=implicit-function-declaration]
make[3]: *** [kernel/locking/mcs_spinlock.o] Error 1
cc1: some warnings being treated as errors
make[3]: *** [kernel/locking/mutex.o] Error 1

http://kisskb.ellerman.id.au/kisskb/buildresult/11616307/
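
For context, mcs_spinlock.h:87 is the MCS contended-acquire spin loop,
which after this series relaxes via the renamed call. Roughly (a sketch
of the call site, not the exact source):

#define arch_mcs_spin_lock_contended(l)				\
do {								\
	while (!(smp_load_acquire(l)))				\
		cpu_relax_lowlatency();				\
} while (0)

Since frv's asm/processor.h never gained a cpu_relax_lowlatency()
definition, every locking file that pulls in mcs_spinlock.h now fails
with the implicit-declaration error above.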

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] arch,locking: Ciao arch_mutex_cpu_relax()
  2014-08-05 13:04           ` Geert Uytterhoeven
@ 2014-08-05 17:42             ` Davidlohr Bueso
  2014-08-08  1:18               ` Davidlohr Bueso
  0 siblings, 1 reply; 14+ messages in thread
From: Davidlohr Bueso @ 2014-08-05 17:42 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Linus Torvalds, Heiko Carstens, Peter Zijlstra, Ingo Molnar,
	Aswin Chandramouleeswaran, Linux Kernel Mailing List

On Tue, 2014-08-05 at 15:04 +0200, Geert Uytterhoeven wrote:
> It looks like you forgot to update frv? It's been failing on -next for a
> few days:
> 
> kernel/locking/mcs_spinlock.h:87:2: error: implicit declaration of
> function 'cpu_relax_lowlatency'
> [-Werror=implicit-function-declaration]
> cc1: some warnings being treated as errors
> kernel/locking/mcs_spinlock.h:87:2: error: implicit declaration of
> function 'cpu_relax_lowlatency'
> [-Werror=implicit-function-declaration]
> make[3]: *** [kernel/locking/mcs_spinlock.o] Error 1
> cc1: some warnings being treated as errors
> make[3]: *** [kernel/locking/mutex.o] Error 1
> 
> http://kisskb.ellerman.id.au/kisskb/buildresult/11616307/

Ah, indeed. Thanks for the report; afaict this was the only missing
arch.

8<---------------------------------------------------------------
From: Davidlohr Bueso <davidlohr@hp.com>
Subject: [PATCH] frv: Define cpu_relax_lowlatency()

Fixes errors such as:

kernel/locking/mcs_spinlock.h:87:2: error: implicit declaration of function 'cpu_relax_lowlatency'

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
---
 arch/frv/include/asm/processor.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/frv/include/asm/processor.h b/arch/frv/include/asm/processor.h
index a34f309..6554e78 100644
--- a/arch/frv/include/asm/processor.h
+++ b/arch/frv/include/asm/processor.h
@@ -129,7 +129,8 @@ unsigned long get_wchan(struct task_struct *p);
 #define	KSTK_EIP(tsk)	((tsk)->thread.frame0->pc)
 #define	KSTK_ESP(tsk)	((tsk)->thread.frame0->sp)
 
-#define cpu_relax()    barrier()
+#define cpu_relax() barrier()
+#define cpu_relax_lowlatency() cpu_relax()
 
 /* data cache prefetch */
 #define ARCH_HAS_PREFETCH
-- 
1.8.1.4
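
Aliasing cpu_relax_lowlatency() to cpu_relax() is the right choice for
frv: its cpu_relax() is already only a compiler barrier and never
yields, so spinning on it stays low latency by construction. As a
sketch, barrier() itself (per include/linux/compiler-gcc.h) is:

#define barrier() __asm__ __volatile__("" : : : "memory")

which emits no instructions but keeps the compiler from caching the
spun-on value across iterations of a wait loop.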




^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] arch,locking: Ciao arch_mutex_cpu_relax()
  2014-08-05 17:42             ` Davidlohr Bueso
@ 2014-08-08  1:18               ` Davidlohr Bueso
  2014-08-08 10:45                 ` Guenter Roeck
  2014-08-15 19:34                 ` Guenter Roeck
  0 siblings, 2 replies; 14+ messages in thread
From: Davidlohr Bueso @ 2014-08-08  1:18 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Linus Torvalds, Heiko Carstens, Peter Zijlstra, Ingo Molnar,
	Aswin Chandramouleeswaran, Linux Kernel Mailing List,
	Guenter Roeck

On Tue, 2014-08-05 at 10:42 -0700, Davidlohr Bueso wrote:
> On Tue, 2014-08-05 at 15:04 +0200, Geert Uytterhoeven wrote:
> > It looks like you forgot to update frv? It's been failing on -next for a
> > few days:

Is there any way developers can be alerted sooner about this (i.e., while
it's still in the -next phase), like automated emails or something? This
would be extra nice for those archs that are harder to get tested.

> > kernel/locking/mcs_spinlock.h:87:2: error: implicit declaration of
> > function 'cpu_relax_lowlatency'
> > [-Werror=implicit-function-declaration]
> > cc1: some warnings being treated as errors
> > kernel/locking/mcs_spinlock.h:87:2: error: implicit declaration of
> > function 'cpu_relax_lowlatency'
> > [-Werror=implicit-function-declaration]
> > make[3]: *** [kernel/locking/mcs_spinlock.o] Error 1
> > cc1: some warnings being treated as errors
> > make[3]: *** [kernel/locking/mutex.o] Error 1
> > 
> > http://kisskb.ellerman.id.au/kisskb/buildresult/11616307/
> 
> Ah, indeed. Thanks for the report; afaict this was the only missing
> arch.

Adding Guenter who also reported this yesterday.

Linus, since this is build-breaking an entire arch, it might be worth
avoiding the whole -tip thing and get the fix in as soon as possible.

Thanks,
Davidlohr


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] arch,locking: Ciao arch_mutex_cpu_relax()
  2014-08-08  1:18               ` Davidlohr Bueso
@ 2014-08-08 10:45                 ` Guenter Roeck
  2014-08-15 19:34                 ` Guenter Roeck
  1 sibling, 0 replies; 14+ messages in thread
From: Guenter Roeck @ 2014-08-08 10:45 UTC (permalink / raw)
  To: Davidlohr Bueso, Geert Uytterhoeven
  Cc: Linus Torvalds, Heiko Carstens, Peter Zijlstra, Ingo Molnar,
	Aswin Chandramouleeswaran, Linux Kernel Mailing List

On 08/07/2014 06:18 PM, Davidlohr Bueso wrote:
> On Tue, 2014-08-05 at 10:42 -0700, Davidlohr Bueso wrote:
>> On Tue, 2014-08-05 at 15:04 +0200, Geert Uytterhoeven wrote:
>>> It looks like you forgot to update frv? It's been failing on -next for a
>>> few days:
>
> Is there any way developers can be alerted sooner about this (i.e., while
> it's still in the -next phase), like automated emails or something? This
> would be extra nice for those archs that are harder to get tested.
>
Not sure if that would help. After all, images are being built.
People just don't care enough to have a look.

FWIW, next-20140808 build results on my build/test farm:
	total: 132 pass: 120 fail: 12
Failed builds:
	frv:defconfig
	mips:nlm_xlp_defconfig
	openrisc:defconfig
	powerpc:allmodconfig (binutils 2.23 and 2.24)
	powerpc:mpc83xx_defconfig (ditto)
	powerpc:mpc85xx_defconfig (ditto)
	powerpc:mpc85xx_smp_defconfig (ditto)
	unicore32:defconfig

with qemu test runs failing for arm, mips, mips64, and ppc.

This points to the next problem: I simply do not have the time to track down
all those failures, and I would guess neither do others who might be
interested. Geert is doing an excellent job trying to report and fix things,
but quite frankly, sometimes one feels like Sisyphus trying to keep builds
clean.

Guenter


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] arch,locking: Ciao arch_mutex_cpu_relax()
  2014-08-08  1:18               ` Davidlohr Bueso
  2014-08-08 10:45                 ` Guenter Roeck
@ 2014-08-15 19:34                 ` Guenter Roeck
  1 sibling, 0 replies; 14+ messages in thread
From: Guenter Roeck @ 2014-08-15 19:34 UTC (permalink / raw)
  To: Davidlohr Bueso; +Cc: Geert Uytterhoeven, linux-kernel

On Thu, Aug 07, 2014 at 06:18:34PM -0700, Davidlohr Bueso wrote:
> On Tue, 2014-08-05 at 10:42 -0700, Davidlohr Bueso wrote:
> > On Tue, 2014-08-05 at 15:04 +0200, Geert Uytterhoeven wrote:
> > > It looks like you forgot to update frv? It's been failing on -next for a
> > > few days:
> 
> Is there any way developers can be alerted sooner about this (i.e., while
> it's still in the -next phase), like automated emails or something? This
> would be extra nice for those archs that are harder to get tested.
> 
> > > kernel/locking/mcs_spinlock.h:87:2: error: implicit declaration of
> > > function 'cpu_relax_lowlatency'
> > > [-Werror=implicit-function-declaration]
> > > cc1: some warnings being treated as errors
> > > kernel/locking/mcs_spinlock.h:87:2: error: implicit declaration of
> > > function 'cpu_relax_lowlatency'
> > > [-Werror=implicit-function-declaration]
> > > make[3]: *** [kernel/locking/mcs_spinlock.o] Error 1
> > > cc1: some warnings being treated as errors
> > > make[3]: *** [kernel/locking/mutex.o] Error 1
> > > 
> > > http://kisskb.ellerman.id.au/kisskb/buildresult/11616307/
> > 
> > Ah, indeed. Thanks for the report; afaict this was the only missing
> > arch.
> 
> Adding Guenter who also reported this yesterday.
> 
> Linus, since this is build-breaking an entire arch, it might be worth
> avoiding the whole -tip thing and get the fix in as soon as possible.
> 

Hi Davidlohr,

The fix is still not upstream, unfortunately. Maybe the patch got lost because
it was not sent as a separate patch, or maybe it has to go through the frv
maintainer, or some Cc: was missing. It would be great if you could follow up
on this.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2014-08-15 19:34 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-20 18:21 [PATCH] arch,locking: Ciao arch_mutex_cpu_relax() Davidlohr Bueso
2014-06-23  6:54 ` Peter Zijlstra
2014-06-23  7:13   ` Vineet Gupta
2014-06-23 11:34     ` Peter Zijlstra
2014-06-23  6:58 ` Peter Zijlstra
2014-06-24 15:06   ` Davidlohr Bueso
2014-06-25  6:25     ` Heiko Carstens
2014-06-25 11:57       ` Linus Torvalds
2014-06-29 22:09         ` [PATCH v2] " Davidlohr Bueso
2014-08-05 13:04           ` Geert Uytterhoeven
2014-08-05 17:42             ` Davidlohr Bueso
2014-08-08  1:18               ` Davidlohr Bueso
2014-08-08 10:45                 ` Guenter Roeck
2014-08-15 19:34                 ` Guenter Roeck
