linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mutex: have non-spinning mutexes on s390 by default
@ 2009-04-09 15:47 Heiko Carstens
  2009-04-09 15:54 ` Peter Zijlstra
  2009-04-17 21:42 ` [PATCH] " Folkert van Heusden
  0 siblings, 2 replies; 12+ messages in thread
From: Heiko Carstens @ 2009-04-09 15:47 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: Martin Schwidefsky, Christian Borntraeger, linux-kernel

From: Heiko Carstens <heiko.carstens@de.ibm.com>

The adaptive spinning mutexes will not always do what one would expect on
virtualized architectures like s390. Especially the cpu_relax() loop in
mutex_spin_on_owner might hurt if the mutex holding cpu has been scheduled
away by the hypervisor.
We would end up in a cpu_relax() loop when there is no chance that the
state of the mutex changes until the target cpu has been scheduled again by
the hypervisor.
For that reason we should change the default behaviour to no-spin on s390.

We do have an instruction which allows to yield the current cpu in favour of
a different target cpu. Also we have an instruction which allows us to figure
out if the target cpu is physically backed.

However we need to do some performance tests until we can come up with
a solution that will do the right thing on s390.
Until then make the old behaviour default for us.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
---
 arch/Kconfig            |    3 +++
 arch/s390/Kconfig       |    1 +
 kernel/sched_features.h |    4 ++++
 3 files changed, 8 insertions(+)

Index: linux-2.6/arch/Kconfig
===================================================================
--- linux-2.6.orig/arch/Kconfig
+++ linux-2.6/arch/Kconfig
@@ -109,3 +109,6 @@ config HAVE_CLK
 
 config HAVE_DMA_API_DEBUG
 	bool
+
+config HAVE_DEFAULT_NO_SPIN_MUTEXES
+	bool
Index: linux-2.6/arch/s390/Kconfig
===================================================================
--- linux-2.6.orig/arch/s390/Kconfig
+++ linux-2.6/arch/s390/Kconfig
@@ -82,6 +82,7 @@ config S390
 	select USE_GENERIC_SMP_HELPERS if SMP
 	select HAVE_SYSCALL_WRAPPERS
 	select HAVE_FUNCTION_TRACER
+	select HAVE_DEFAULT_NO_SPIN_MUTEXES
 	select HAVE_OPROFILE
 	select HAVE_KPROBES
 	select HAVE_KRETPROBES
Index: linux-2.6/kernel/sched_features.h
===================================================================
--- linux-2.6.orig/kernel/sched_features.h
+++ linux-2.6/kernel/sched_features.h
@@ -14,4 +14,8 @@ SCHED_FEAT(LB_WAKEUP_UPDATE, 1)
 SCHED_FEAT(ASYM_EFF_LOAD, 1)
 SCHED_FEAT(WAKEUP_OVERLAP, 0)
 SCHED_FEAT(LAST_BUDDY, 1)
+#ifdef CONFIG_HAVE_DEFAULT_NO_SPIN_MUTEXES
+SCHED_FEAT(OWNER_SPIN, 0)
+#else
 SCHED_FEAT(OWNER_SPIN, 1)
+#endif

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mutex: have non-spinning mutexes on s390 by default
  2009-04-09 15:47 [PATCH] mutex: have non-spinning mutexes on s390 by default Heiko Carstens
@ 2009-04-09 15:54 ` Peter Zijlstra
  2009-04-09 16:14   ` Heiko Carstens
  2009-04-17 21:42 ` [PATCH] " Folkert van Heusden
  1 sibling, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2009-04-09 15:54 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Ingo Molnar, Martin Schwidefsky, Christian Borntraeger, linux-kernel

On Thu, 2009-04-09 at 17:47 +0200, Heiko Carstens wrote:
> From: Heiko Carstens <heiko.carstens@de.ibm.com>
> 
> The adaptive spinning mutexes will not always do what one would expect on
> virtualized architectures like s390. Especially the cpu_relax() loop in
> mutex_spin_on_owner might hurt if the mutex holding cpu has been scheduled
> away by the hypervisor.
> We would end up in a cpu_relax() loop when there is no chance that the
> state of the mutex changes until the target cpu has been scheduled again by
> the hypervisor.
> For that reason we should change the default behaviour to no-spin on s390.
> 
> We do have an instruction which allows to yield the current cpu in favour of
> a different target cpu. Also we have an instruction which allows us to figure
> out if the target cpu is physically backed.
> 
> However we need to do some performance tests until we can come up with
> a solution that will do the right thing on s390.
> Until then make the old behaviour default for us.
> 
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Ingo Molnar <mingo@elte.hu>
> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
> ---
>  arch/Kconfig            |    3 +++
>  arch/s390/Kconfig       |    1 +
>  kernel/sched_features.h |    4 ++++
>  3 files changed, 8 insertions(+)
> 
> Index: linux-2.6/arch/Kconfig
> ===================================================================
> --- linux-2.6.orig/arch/Kconfig
> +++ linux-2.6/arch/Kconfig
> @@ -109,3 +109,6 @@ config HAVE_CLK
>  
>  config HAVE_DMA_API_DEBUG
>  	bool
> +
> +config HAVE_DEFAULT_NO_SPIN_MUTEXES
> +	bool
> Index: linux-2.6/arch/s390/Kconfig
> ===================================================================
> --- linux-2.6.orig/arch/s390/Kconfig
> +++ linux-2.6/arch/s390/Kconfig
> @@ -82,6 +82,7 @@ config S390
>  	select USE_GENERIC_SMP_HELPERS if SMP
>  	select HAVE_SYSCALL_WRAPPERS
>  	select HAVE_FUNCTION_TRACER
> +	select HAVE_DEFAULT_NO_SPIN_MUTEXES
>  	select HAVE_OPROFILE
>  	select HAVE_KPROBES
>  	select HAVE_KRETPROBES
> Index: linux-2.6/kernel/sched_features.h
> ===================================================================
> --- linux-2.6.orig/kernel/sched_features.h
> +++ linux-2.6/kernel/sched_features.h
> @@ -14,4 +14,8 @@ SCHED_FEAT(LB_WAKEUP_UPDATE, 1)
>  SCHED_FEAT(ASYM_EFF_LOAD, 1)
>  SCHED_FEAT(WAKEUP_OVERLAP, 0)
>  SCHED_FEAT(LAST_BUDDY, 1)
> +#ifdef CONFIG_HAVE_DEFAULT_NO_SPIN_MUTEXES
> +SCHED_FEAT(OWNER_SPIN, 0)
> +#else
>  SCHED_FEAT(OWNER_SPIN, 1)
> +#endif

Hmm, I'd rather have you'd make the whole block in __mutex_lock_common
go away on that CONFIG thingy.

Would be nice though to get something working on s390, does it have a
monitor wait like ins where it can properly sleep so that another
virtual host can run?

If so, we could possibly do a monitor wait on the lock owner field
instead of spinning.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mutex: have non-spinning mutexes on s390 by default
  2009-04-09 15:54 ` Peter Zijlstra
@ 2009-04-09 16:14   ` Heiko Carstens
  2009-04-09 16:48     ` Heiko Carstens
  0 siblings, 1 reply; 12+ messages in thread
From: Heiko Carstens @ 2009-04-09 16:14 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Martin Schwidefsky, Christian Borntraeger, linux-kernel

On Thu, 09 Apr 2009 17:54:56 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> > Index: linux-2.6/kernel/sched_features.h
> > ===================================================================
> > --- linux-2.6.orig/kernel/sched_features.h
> > +++ linux-2.6/kernel/sched_features.h
> > @@ -14,4 +14,8 @@ SCHED_FEAT(LB_WAKEUP_UPDATE, 1)
> >  SCHED_FEAT(ASYM_EFF_LOAD, 1)
> >  SCHED_FEAT(WAKEUP_OVERLAP, 0)
> >  SCHED_FEAT(LAST_BUDDY, 1)
> > +#ifdef CONFIG_HAVE_DEFAULT_NO_SPIN_MUTEXES
> > +SCHED_FEAT(OWNER_SPIN, 0)
> > +#else
> >  SCHED_FEAT(OWNER_SPIN, 1)
> > +#endif
> 
> Hmm, I'd rather have you'd make the whole block in __mutex_lock_common
> go away on that CONFIG thingy.

Sure, I can do that.

> Would be nice though to get something working on s390, does it have a
> monitor wait like ins where it can properly sleep so that another
> virtual host can run?
> 
> If so, we could possibly do a monitor wait on the lock owner field
> instead of spinning.

What we now have: the cpu_relax() implementation on s390 will yield the
current (virtual) cpu and give it back to the hypervisor. If the
hypervisor has nothing else to schedule or for fairness reasons decides
that this cpu has to run again it will be scheduled again.
One roundtrip (exit/reenter) is quite expensive (~1200 cycles).

We also have a directed yield where you can tell the hypervisor "don't
schedule me again until cpu x was has been scheduled".

And we have an IPI (signal processer with order code sense running)
which examines if the target cpu is actually physically backed. That's
in the order of ~80 cycles. At least that's what I just measured with
two hypervisors running below my Linux kernel.

So the idea for the spinning loop would be to additionaly test if the
targer cpu is physically backed. If it is we could happily spin and
sense again, more or less like the current code does.
Now the question is what do we do if it isn't backed? Yield the current
cpu in favour of the target cpu or just make the current process sleep
and schedule a different one?
For that I'd like to have performance numbers before we go in any
direction.. It might take some time to get the numbers since it's not
easy to get a slot for performance measurements.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mutex: have non-spinning mutexes on s390 by default
  2009-04-09 16:14   ` Heiko Carstens
@ 2009-04-09 16:48     ` Heiko Carstens
  2009-04-09 16:53       ` Peter Zijlstra
  2009-04-09 18:12       ` [tip:core/urgent] " Heiko Carstens
  0 siblings, 2 replies; 12+ messages in thread
From: Heiko Carstens @ 2009-04-09 16:48 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: Heiko Carstens, Martin Schwidefsky, Christian Borntraeger, linux-kernel

Updated Patch below:

Subject: [PATCH] mutex: have non-spinning mutexes on s390 by default

From: Heiko Carstens <heiko.carstens@de.ibm.com>

The adaptive spinning mutexes will not always do what one would expect on
virtualized architectures like s390. Especially the cpu_relax() loop in
mutex_spin_on_owner might hurt if the mutex holding cpu has been scheduled
away by the hypervisor.
We would end up in a cpu_relax() loop when there is no chance that the
state of the mutex changes until the target cpu has been scheduled again by
the hypervisor.
For that reason we should change the default behaviour to no-spin on s390.

We do have an instruction which allows to yield the current cpu in favour of
a different target cpu. Also we have an instruction which allows us to figure
out if the target cpu is physically backed.

However we need to do some performance tests until we can come up with
a solution that will do the right thing on s390.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
---
 arch/Kconfig      |    3 +++
 arch/s390/Kconfig |    1 +
 kernel/mutex.c    |    3 ++-
 3 files changed, 6 insertions(+), 1 deletion(-)

Index: linux-2.6/arch/Kconfig
===================================================================
--- linux-2.6.orig/arch/Kconfig
+++ linux-2.6/arch/Kconfig
@@ -109,3 +109,6 @@ config HAVE_CLK
 
 config HAVE_DMA_API_DEBUG
 	bool
+
+config HAVE_DEFAULT_NO_SPIN_MUTEXES
+	bool
Index: linux-2.6/arch/s390/Kconfig
===================================================================
--- linux-2.6.orig/arch/s390/Kconfig
+++ linux-2.6/arch/s390/Kconfig
@@ -82,6 +82,7 @@ config S390
 	select USE_GENERIC_SMP_HELPERS if SMP
 	select HAVE_SYSCALL_WRAPPERS
 	select HAVE_FUNCTION_TRACER
+	select HAVE_DEFAULT_NO_SPIN_MUTEXES
 	select HAVE_OPROFILE
 	select HAVE_KPROBES
 	select HAVE_KRETPROBES
Index: linux-2.6/kernel/mutex.c
===================================================================
--- linux-2.6.orig/kernel/mutex.c
+++ linux-2.6/kernel/mutex.c
@@ -148,7 +148,8 @@ __mutex_lock_common(struct mutex *lock, 
 
 	preempt_disable();
 	mutex_acquire(&lock->dep_map, subclass, 0, ip);
-#if defined(CONFIG_SMP) && !defined(CONFIG_DEBUG_MUTEXES)
+#if defined(CONFIG_SMP) && !defined(CONFIG_DEBUG_MUTEXES) && \
+    !defined(CONFIG_HAVE_DEFAULT_NO_SPIN_MUTEXES)
 	/*
 	 * Optimistic spinning.
 	 *

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mutex: have non-spinning mutexes on s390 by default
  2009-04-09 16:48     ` Heiko Carstens
@ 2009-04-09 16:53       ` Peter Zijlstra
  2009-04-09 17:38         ` Peter Zijlstra
  2009-04-09 18:12       ` [tip:core/urgent] " Heiko Carstens
  1 sibling, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2009-04-09 16:53 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Ingo Molnar, Martin Schwidefsky, Christian Borntraeger, linux-kernel

On Thu, 2009-04-09 at 18:48 +0200, Heiko Carstens wrote:
> Updated Patch below:
> 
> Subject: [PATCH] mutex: have non-spinning mutexes on s390 by default
> 
> From: Heiko Carstens <heiko.carstens@de.ibm.com>
> 
> The adaptive spinning mutexes will not always do what one would expect on
> virtualized architectures like s390. Especially the cpu_relax() loop in
> mutex_spin_on_owner might hurt if the mutex holding cpu has been scheduled
> away by the hypervisor.
> We would end up in a cpu_relax() loop when there is no chance that the
> state of the mutex changes until the target cpu has been scheduled again by
> the hypervisor.
> For that reason we should change the default behaviour to no-spin on s390.
> 
> We do have an instruction which allows to yield the current cpu in favour of
> a different target cpu. Also we have an instruction which allows us to figure
> out if the target cpu is physically backed.
> 
> However we need to do some performance tests until we can come up with
> a solution that will do the right thing on s390.
> 
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

I was looking at how an monitor-wait could be used here, but that
appears non-trivial, there's two variables we're watching, lock->owner
and rq->curr, either could change.

Reducing that to 1 seems an interesting problem :-)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mutex: have non-spinning mutexes on s390 by default
  2009-04-09 16:53       ` Peter Zijlstra
@ 2009-04-09 17:38         ` Peter Zijlstra
  2009-04-09 17:50           ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2009-04-09 17:38 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Ingo Molnar, Martin Schwidefsky, Christian Borntraeger,
	linux-kernel, Jeremy Fitzhardinge, Avi Kivity

On Thu, 2009-04-09 at 18:53 +0200, Peter Zijlstra wrote:

> I was looking at how an monitor-wait could be used here, but that
> appears non-trivial, there's two variables we're watching, lock->owner
> and rq->curr, either could change.
> 
> Reducing that to 1 seems an interesting problem :-)

How about something like this?

Does it make sense to implement an monitor-wait spinlock for the virt
case as well? Avi, Jeremy?

---
 arch/Kconfig                     |    3 +++
 arch/x86/Kconfig                 |    1 +
 arch/x86/include/asm/processor.h |   21 +++++++++++++++++++++
 include/linux/sched.h            |    2 ++
 kernel/mutex.h                   |    1 +
 kernel/sched.c                   |   27 ++++++++++++++++++++++++++-
 6 files changed, 54 insertions(+), 1 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index dc81b34..67aa9f9 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -109,3 +109,6 @@ config HAVE_CLK
 
 config HAVE_DMA_API_DEBUG
 	bool
+
+config HAVE_MUTEX_MWAIT
+	bool
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2560fff..62d378b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -47,6 +47,7 @@ config X86
 	select HAVE_KERNEL_LZMA
 	select HAVE_ARCH_KMEMCHECK
 	select HAVE_DMA_API_DEBUG
+	select HAVE_MUTEX_MWAIT
 
 config ARCH_DEFCONFIG
 	string
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 7c39de7..c2617e4 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -727,6 +727,27 @@ static inline void __sti_mwait(unsigned long eax, unsigned long ecx)
 		     :: "a" (eax), "c" (ecx));
 }
 
+#ifdef CONFIG_SMP
+static inline void mutex_spin_monitor(void *addr)
+{
+	if (!boot_cpu_has(X86_FEATURE_CLFLUSH_MONITOR))
+		return;
+
+	__monitor(addr, 0, 0);
+	smp_mb();
+}
+
+static inline void mutex_spin_monitor_wait(void)
+{
+	if (!boot_cpu_has(X86_FEATURE_CLFLUSH_MONITOR)) {
+		cpu_relax();
+		return;
+	}
+
+	__mwait(0, 0);
+}
+#endif
+
 extern void mwait_idle_with_hints(unsigned long eax, unsigned long ecx);
 
 extern void select_idle_routine(const struct cpuinfo_x86 *c);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 25bdac7..87f945e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -342,7 +342,9 @@ extern signed long schedule_timeout_killable(signed long timeout);
 extern signed long schedule_timeout_uninterruptible(signed long timeout);
 asmlinkage void __schedule(void);
 asmlinkage void schedule(void);
+
 extern int mutex_spin_on_owner(struct mutex *lock, struct thread_info *owner);
+extern void mutex_spin_unlock(void);
 
 struct nsproxy;
 struct user_namespace;
diff --git a/kernel/mutex.h b/kernel/mutex.h
index 67578ca..c4d2d7a 100644
--- a/kernel/mutex.h
+++ b/kernel/mutex.h
@@ -25,6 +25,7 @@ static inline void mutex_set_owner(struct mutex *lock)
 static inline void mutex_clear_owner(struct mutex *lock)
 {
 	lock->owner = NULL;
+	mutex_spin_unlock();
 }
 #else
 static inline void mutex_set_owner(struct mutex *lock)
diff --git a/kernel/sched.c b/kernel/sched.c
index f2cf383..f15af61 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -5184,6 +5184,28 @@ need_resched_nonpreemptible:
 EXPORT_SYMBOL(schedule);
 
 #ifdef CONFIG_SMP
+
+#ifndef CONFIG_HAVE_MUTEX_MWAIT
+static inline void mutex_spin_monitor(void *addr)
+{
+}
+
+static inline void mutex_spin_monitor_wait(void)
+{
+	cpu_relax();
+}
+#endif
+
+void mutex_spin_unlock(void)
+{
+	/*
+	 * XXX fix the below to now require as many ins
+	 */
+	preempt_disable();
+	this_rq()->curr = current;
+	preempt_enable();
+}
+
 /*
  * Look out! "owner" is an entirely speculative pointer
  * access and not reliable.
@@ -5225,6 +5247,9 @@ int mutex_spin_on_owner(struct mutex *lock, struct thread_info *owner)
 	rq = cpu_rq(cpu);
 
 	for (;;) {
+
+		mutex_spin_monitor(&rq->curr);
+
 		/*
 		 * Owner changed, break to re-assess state.
 		 */
@@ -5237,7 +5262,7 @@ int mutex_spin_on_owner(struct mutex *lock, struct thread_info *owner)
 		if (task_thread_info(rq->curr) != owner || need_resched())
 			return 0;
 
-		cpu_relax();
+		mutex_spin_monitor_wait();
 	}
 out:
 	return 1;


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] mutex: have non-spinning mutexes on s390 by default
  2009-04-09 17:38         ` Peter Zijlstra
@ 2009-04-09 17:50           ` Jeremy Fitzhardinge
  2009-04-09 18:34             ` Peter Zijlstra
  0 siblings, 1 reply; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2009-04-09 17:50 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Heiko Carstens, Ingo Molnar, Martin Schwidefsky,
	Christian Borntraeger, linux-kernel, Avi Kivity,
	Arjan van de Ven, H. Peter Anvin

Peter Zijlstra wrote:
> On Thu, 2009-04-09 at 18:53 +0200, Peter Zijlstra wrote:
>
>   
>> I was looking at how an monitor-wait could be used here, but that
>> appears non-trivial, there's two variables we're watching, lock->owner
>> and rq->curr, either could change.
>>
>> Reducing that to 1 seems an interesting problem :-)
>>     
>
> How about something like this?
>
> Does it make sense to implement an monitor-wait spinlock for the virt
> case as well? Avi, Jeremy?
>   

Last time I tried to put mwait in a spinlock, Arjan (or HPA?) said that 
mwait takes approx a week and a half to wake up, and that it was really 
only useful for idle power savings.

Has that changed?

Aside from that, using mwait directly doesn't really help PV Xen much 
(it never an available CPU feature); we'd need a higher-level hook to 
implement something else to block the vcpu.

    J

> ---
>  arch/Kconfig                     |    3 +++
>  arch/x86/Kconfig                 |    1 +
>  arch/x86/include/asm/processor.h |   21 +++++++++++++++++++++
>  include/linux/sched.h            |    2 ++
>  kernel/mutex.h                   |    1 +
>  kernel/sched.c                   |   27 ++++++++++++++++++++++++++-
>  6 files changed, 54 insertions(+), 1 deletions(-)
>
> diff --git a/arch/Kconfig b/arch/Kconfig
> index dc81b34..67aa9f9 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -109,3 +109,6 @@ config HAVE_CLK
>  
>  config HAVE_DMA_API_DEBUG
>  	bool
> +
> +config HAVE_MUTEX_MWAIT
> +	bool
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 2560fff..62d378b 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -47,6 +47,7 @@ config X86
>  	select HAVE_KERNEL_LZMA
>  	select HAVE_ARCH_KMEMCHECK
>  	select HAVE_DMA_API_DEBUG
> +	select HAVE_MUTEX_MWAIT
>  
>  config ARCH_DEFCONFIG
>  	string
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 7c39de7..c2617e4 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -727,6 +727,27 @@ static inline void __sti_mwait(unsigned long eax, unsigned long ecx)
>  		     :: "a" (eax), "c" (ecx));
>  }
>  
> +#ifdef CONFIG_SMP
> +static inline void mutex_spin_monitor(void *addr)
> +{
> +	if (!boot_cpu_has(X86_FEATURE_CLFLUSH_MONITOR))
> +		return;
> +
> +	__monitor(addr, 0, 0);
> +	smp_mb();
> +}
> +
> +static inline void mutex_spin_monitor_wait(void)
> +{
> +	if (!boot_cpu_has(X86_FEATURE_CLFLUSH_MONITOR)) {
> +		cpu_relax();
> +		return;
> +	}
> +
> +	__mwait(0, 0);
> +}
> +#endif
> +
>  extern void mwait_idle_with_hints(unsigned long eax, unsigned long ecx);
>  
>  extern void select_idle_routine(const struct cpuinfo_x86 *c);
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 25bdac7..87f945e 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -342,7 +342,9 @@ extern signed long schedule_timeout_killable(signed long timeout);
>  extern signed long schedule_timeout_uninterruptible(signed long timeout);
>  asmlinkage void __schedule(void);
>  asmlinkage void schedule(void);
> +
>  extern int mutex_spin_on_owner(struct mutex *lock, struct thread_info *owner);
> +extern void mutex_spin_unlock(void);
>  
>  struct nsproxy;
>  struct user_namespace;
> diff --git a/kernel/mutex.h b/kernel/mutex.h
> index 67578ca..c4d2d7a 100644
> --- a/kernel/mutex.h
> +++ b/kernel/mutex.h
> @@ -25,6 +25,7 @@ static inline void mutex_set_owner(struct mutex *lock)
>  static inline void mutex_clear_owner(struct mutex *lock)
>  {
>  	lock->owner = NULL;
> +	mutex_spin_unlock();
>  }
>  #else
>  static inline void mutex_set_owner(struct mutex *lock)
> diff --git a/kernel/sched.c b/kernel/sched.c
> index f2cf383..f15af61 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -5184,6 +5184,28 @@ need_resched_nonpreemptible:
>  EXPORT_SYMBOL(schedule);
>  
>  #ifdef CONFIG_SMP
> +
> +#ifndef CONFIG_HAVE_MUTEX_MWAIT
> +static inline void mutex_spin_monitor(void *addr)
> +{
> +}
> +
> +static inline void mutex_spin_monitor_wait(void)
> +{
> +	cpu_relax();
> +}
> +#endif
> +
> +void mutex_spin_unlock(void)
> +{
> +	/*
> +	 * XXX fix the below to now require as many ins
> +	 */
> +	preempt_disable();
> +	this_rq()->curr = current;
> +	preempt_enable();
> +}
> +
>  /*
>   * Look out! "owner" is an entirely speculative pointer
>   * access and not reliable.
> @@ -5225,6 +5247,9 @@ int mutex_spin_on_owner(struct mutex *lock, struct thread_info *owner)
>  	rq = cpu_rq(cpu);
>  
>  	for (;;) {
> +
> +		mutex_spin_monitor(&rq->curr);
> +
>  		/*
>  		 * Owner changed, break to re-assess state.
>  		 */
> @@ -5237,7 +5262,7 @@ int mutex_spin_on_owner(struct mutex *lock, struct thread_info *owner)
>  		if (task_thread_info(rq->curr) != owner || need_resched())
>  			return 0;
>  
> -		cpu_relax();
> +		mutex_spin_monitor_wait();
>  	}
>  out:
>  	return 1;
>
>   


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [tip:core/urgent] mutex: have non-spinning mutexes on s390 by default
  2009-04-09 16:48     ` Heiko Carstens
  2009-04-09 16:53       ` Peter Zijlstra
@ 2009-04-09 18:12       ` Heiko Carstens
  1 sibling, 0 replies; 12+ messages in thread
From: Heiko Carstens @ 2009-04-09 18:12 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, a.p.zijlstra, schwidefsky, borntraeger,
	heiko.carstens, tglx, mingo

Commit-ID:  36cd3c9f925b9307236505ae7ad1ad7ac4d4357c
Gitweb:     http://git.kernel.org/tip/36cd3c9f925b9307236505ae7ad1ad7ac4d4357c
Author:     Heiko Carstens <heiko.carstens@de.ibm.com>
AuthorDate: Thu, 9 Apr 2009 18:48:34 +0200
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Thu, 9 Apr 2009 19:28:24 +0200

mutex: have non-spinning mutexes on s390 by default

Impact: performance regression fix for s390

The adaptive spinning mutexes will not always do what one would expect on
virtualized architectures like s390. Especially the cpu_relax() loop in
mutex_spin_on_owner might hurt if the mutex holding cpu has been scheduled
away by the hypervisor.

We would end up in a cpu_relax() loop when there is no chance that the
state of the mutex changes until the target cpu has been scheduled again by
the hypervisor.

For that reason we should change the default behaviour to no-spin on s390.

We do have an instruction which allows to yield the current cpu in favour of
a different target cpu. Also we have an instruction which allows us to figure
out if the target cpu is physically backed.

However we need to do some performance tests until we can come up with
a solution that will do the right thing on s390.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
LKML-Reference: <20090409184834.7a0df7b2@osiris.boeblingen.de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 arch/Kconfig      |    3 +++
 arch/s390/Kconfig |    1 +
 kernel/mutex.c    |    3 ++-
 3 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index dc81b34..78a35e9 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -109,3 +109,6 @@ config HAVE_CLK
 
 config HAVE_DMA_API_DEBUG
 	bool
+
+config HAVE_DEFAULT_NO_SPIN_MUTEXES
+	bool
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index dcb667c..2eca5fe 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -82,6 +82,7 @@ config S390
 	select USE_GENERIC_SMP_HELPERS if SMP
 	select HAVE_SYSCALL_WRAPPERS
 	select HAVE_FUNCTION_TRACER
+	select HAVE_DEFAULT_NO_SPIN_MUTEXES
 	select HAVE_OPROFILE
 	select HAVE_KPROBES
 	select HAVE_KRETPROBES
diff --git a/kernel/mutex.c b/kernel/mutex.c
index 5d79781..507cf2b 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -148,7 +148,8 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 
 	preempt_disable();
 	mutex_acquire(&lock->dep_map, subclass, 0, ip);
-#if defined(CONFIG_SMP) && !defined(CONFIG_DEBUG_MUTEXES)
+#if defined(CONFIG_SMP) && !defined(CONFIG_DEBUG_MUTEXES) && \
+    !defined(CONFIG_HAVE_DEFAULT_NO_SPIN_MUTEXES)
 	/*
 	 * Optimistic spinning.
 	 *

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] mutex: have non-spinning mutexes on s390 by default
  2009-04-09 17:50           ` Jeremy Fitzhardinge
@ 2009-04-09 18:34             ` Peter Zijlstra
  0 siblings, 0 replies; 12+ messages in thread
From: Peter Zijlstra @ 2009-04-09 18:34 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Heiko Carstens, Ingo Molnar, Martin Schwidefsky,
	Christian Borntraeger, linux-kernel, Avi Kivity,
	Arjan van de Ven, H. Peter Anvin

On Thu, 2009-04-09 at 10:50 -0700, Jeremy Fitzhardinge wrote:
> Peter Zijlstra wrote:
> > On Thu, 2009-04-09 at 18:53 +0200, Peter Zijlstra wrote:
> >
> >   
> >> I was looking at how an monitor-wait could be used here, but that
> >> appears non-trivial, there's two variables we're watching, lock->owner
> >> and rq->curr, either could change.
> >>
> >> Reducing that to 1 seems an interesting problem :-)
> >>     
> >
> > How about something like this?
> >
> > Does it make sense to implement an monitor-wait spinlock for the virt
> > case as well? Avi, Jeremy?
> >   
> 
> Last time I tried to put mwait in a spinlock, Arjan (or HPA?) said that 
> mwait takes approx a week and a half to wake up, and that it was really 
> only useful for idle power savings.

Yeah, sad that.

> Has that changed?

Nothing much, I was thinking perhaps it would make sense for the virt
case, but if its not properly virtualized then its pretty useless
indeed.

monitor-wait is basically a hardware/hv futex like thing, so I thought
it might help -- spinning in a guest is pretty painful.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mutex: have non-spinning mutexes on s390 by default
  2009-04-09 15:47 [PATCH] mutex: have non-spinning mutexes on s390 by default Heiko Carstens
  2009-04-09 15:54 ` Peter Zijlstra
@ 2009-04-17 21:42 ` Folkert van Heusden
  2009-04-20 12:01   ` Heiko Carstens
  1 sibling, 1 reply; 12+ messages in thread
From: Folkert van Heusden @ 2009-04-17 21:42 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Ingo Molnar, Peter Zijlstra, Martin Schwidefsky,
	Christian Borntraeger, linux-kernel

> The adaptive spinning mutexes will not always do what one would expect on
> virtualized architectures like s390. Especially the cpu_relax() loop in
> mutex_spin_on_owner might hurt if the mutex holding cpu has been scheduled
> away by the hypervisor.
> We would end up in a cpu_relax() loop when there is no chance that the
> state of the mutex changes until the target cpu has been scheduled again by
> the hypervisor.
> For that reason we should change the default behaviour to no-spin on s390.

Hmmm. Wouldn't this be a change that applies to ibm system p as wel?
E.g. with linux in an lpar.


Folkert van Heusden

-- 
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mutex: have non-spinning mutexes on s390 by default
  2009-04-17 21:42 ` [PATCH] " Folkert van Heusden
@ 2009-04-20 12:01   ` Heiko Carstens
  2009-04-20 12:04     ` David Miller
  0 siblings, 1 reply; 12+ messages in thread
From: Heiko Carstens @ 2009-04-20 12:01 UTC (permalink / raw)
  To: Folkert van Heusden
  Cc: Ingo Molnar, Peter Zijlstra, Martin Schwidefsky,
	Christian Borntraeger, linux-kernel

On Fri, 17 Apr 2009 23:42:12 +0200
Folkert van Heusden <folkert@vanheusden.com> wrote:

> > The adaptive spinning mutexes will not always do what one would expect on
> > virtualized architectures like s390. Especially the cpu_relax() loop in
> > mutex_spin_on_owner might hurt if the mutex holding cpu has been scheduled
> > away by the hypervisor.
> > We would end up in a cpu_relax() loop when there is no chance that the
> > state of the mutex changes until the target cpu has been scheduled again by
> > the hypervisor.
> > For that reason we should change the default behaviour to no-spin on s390.
> 
> Hmmm. Wouldn't this be a change that applies to ibm system p as wel?
> E.g. with linux in an lpar.

This applies to every kernel that runs under a hypervisor. However the
difference is that s390 kernels _always_ have a hypervisor, whereas
other architectures may or may not have one.
In case the mutex spinning code ist causing problem it still can be
switched at run time. That way other architectures can come up with a
solution that fits their needs best.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mutex: have non-spinning mutexes on s390 by default
  2009-04-20 12:01   ` Heiko Carstens
@ 2009-04-20 12:04     ` David Miller
  0 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2009-04-20 12:04 UTC (permalink / raw)
  To: heiko.carstens
  Cc: folkert, mingo, peterz, schwidefsky, borntraeger, linux-kernel

From: Heiko Carstens <heiko.carstens@de.ibm.com>
Date: Mon, 20 Apr 2009 14:01:03 +0200

> On Fri, 17 Apr 2009 23:42:12 +0200
> Folkert van Heusden <folkert@vanheusden.com> wrote:
> 
>> > The adaptive spinning mutexes will not always do what one would expect on
>> > virtualized architectures like s390. Especially the cpu_relax() loop in
>> > mutex_spin_on_owner might hurt if the mutex holding cpu has been scheduled
>> > away by the hypervisor.
>> > We would end up in a cpu_relax() loop when there is no chance that the
>> > state of the mutex changes until the target cpu has been scheduled again by
>> > the hypervisor.
>> > For that reason we should change the default behaviour to no-spin on s390.
>> 
>> Hmmm. Wouldn't this be a change that applies to ibm system p as wel?
>> E.g. with linux in an lpar.
> 
> This applies to every kernel that runs under a hypervisor.

I wouldn't say "every" because this change is definitely not
appropriate on sparc64 hypervisor based systems.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2009-04-20 12:04 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-04-09 15:47 [PATCH] mutex: have non-spinning mutexes on s390 by default Heiko Carstens
2009-04-09 15:54 ` Peter Zijlstra
2009-04-09 16:14   ` Heiko Carstens
2009-04-09 16:48     ` Heiko Carstens
2009-04-09 16:53       ` Peter Zijlstra
2009-04-09 17:38         ` Peter Zijlstra
2009-04-09 17:50           ` Jeremy Fitzhardinge
2009-04-09 18:34             ` Peter Zijlstra
2009-04-09 18:12       ` [tip:core/urgent] " Heiko Carstens
2009-04-17 21:42 ` [PATCH] " Folkert van Heusden
2009-04-20 12:01   ` Heiko Carstens
2009-04-20 12:04     ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).