linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* i386 FRAME_POINTER needs DEBUG_MUTEXES
@ 2006-01-10 17:44 Hugh Dickins
  2006-01-10 21:07 ` [patch] fix i386 mutex fastpath on FRAME_POINTER && !DEBUG_MUTEXES Ingo Molnar
  0 siblings, 1 reply; 5+ messages in thread
From: Hugh Dickins @ 2006-01-10 17:44 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel

I find the 2.6.15-git6 i386 CONFIG_FRAME_POINTER=y doesn't work unless
CONFIG_DEBUG_MUTEXES=y, because of the __mutex_fastpath jumps to fail_fn
(giving two pushes of %ebp to one pop).  Whereas x86_64 __mutex_fastpaths
use calls and work with FRAME_POINTER=y.  Whether i386 deserves asm mods
rather than Kconfigery to force one from other, I've no strong instinct.

Hugh

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [patch] fix i386 mutex fastpath on FRAME_POINTER && !DEBUG_MUTEXES
  2006-01-10 17:44 i386 FRAME_POINTER needs DEBUG_MUTEXES Hugh Dickins
@ 2006-01-10 21:07 ` Ingo Molnar
  0 siblings, 0 replies; 5+ messages in thread
From: Ingo Molnar @ 2006-01-10 21:07 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: linux-kernel, Linus Torvalds


* Hugh Dickins <hugh@veritas.com> wrote:

> I find the 2.6.15-git6 i386 CONFIG_FRAME_POINTER=y doesn't work unless 
> CONFIG_DEBUG_MUTEXES=y, because of the __mutex_fastpath jumps to 
> fail_fn (giving two pushes of %ebp to one pop).  Whereas x86_64 
> __mutex_fastpaths use calls and work with FRAME_POINTER=y.  Whether 
> i386 deserves asm mods rather than Kconfigery to force one from other, 
> I've no strong instinct.

ah, good eyes! This explains some of the crashes i saw today. The patch 
below solves it. Build and boot tested on x86. Linus, please apply.

	Ingo

--
call the mutex slowpath more conservatively - e.g. FRAME_POINTERS can
change the calling convention, in which case a direct branch to the
slowpath becomes illegal. Bug found by Hugh Dickins.

Signed-off-by: Ingo Molnar <mingo@elte.hu>

----

 include/asm-i386/mutex.h |   16 ++++++++++++++--
 kernel/mutex.c           |    9 ---------
 2 files changed, 14 insertions(+), 11 deletions(-)

Index: linux/include/asm-i386/mutex.h
===================================================================
--- linux.orig/include/asm-i386/mutex.h
+++ linux/include/asm-i386/mutex.h
@@ -28,7 +28,13 @@ do {									\
 									\
 	__asm__ __volatile__(						\
 		LOCK	"   decl (%%eax)	\n"			\
-			"   js "#fail_fn"	\n"			\
+			"   js 2f		\n"			\
+			"1:			\n"			\
+									\
+		LOCK_SECTION_START("")					\
+			"2: call "#fail_fn"	\n"			\
+			"   jmp 1b		\n"			\
+		LOCK_SECTION_END					\
 									\
 		:"=a" (dummy)						\
 		: "a" (count)						\
@@ -78,7 +84,13 @@ do {									\
 									\
 	__asm__ __volatile__(						\
 		LOCK	"   incl (%%eax)	\n"			\
-			"   jle "#fail_fn"	\n"			\
+			"   jle 2f		\n"			\
+			"1:			\n"			\
+									\
+		LOCK_SECTION_START("")					\
+			"2: call "#fail_fn"	\n"			\
+			"   jmp 1b		\n"			\
+		LOCK_SECTION_END					\
 									\
 		:"=a" (dummy)						\
 		: "a" (count)						\
Index: linux/kernel/mutex.c
===================================================================
--- linux.orig/kernel/mutex.c
+++ linux/kernel/mutex.c
@@ -84,12 +84,6 @@ void fastcall __sched mutex_lock(struct 
 	/*
 	 * The locking fastpath is the 1->0 transition from
 	 * 'unlocked' into 'locked' state.
-	 *
-	 * NOTE: if asm/mutex.h is included, then some architectures
-	 * rely on mutex_lock() having _no other code_ here but this
-	 * fastpath. That allows the assembly fastpath to do
-	 * tail-merging optimizations. (If you want to put testcode
-	 * here, do it under #ifndef CONFIG_MUTEX_DEBUG.)
 	 */
 	__mutex_fastpath_lock(&lock->count, __mutex_lock_slowpath);
 }
@@ -115,8 +109,6 @@ void fastcall __sched mutex_unlock(struc
 	/*
 	 * The unlocking fastpath is the 0->1 transition from 'locked'
 	 * into 'unlocked' state:
-	 *
-	 * NOTE: no other code must be here - see mutex_lock() .
 	 */
 	__mutex_fastpath_unlock(&lock->count, __mutex_unlock_slowpath);
 }
@@ -261,7 +253,6 @@ __mutex_lock_interruptible_slowpath(atom
  */
 int fastcall __sched mutex_lock_interruptible(struct mutex *lock)
 {
-	/* NOTE: no other code must be here - see mutex_lock() */
 	return __mutex_fastpath_lock_retval
 			(&lock->count, __mutex_lock_interruptible_slowpath);
 }

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] fix i386 mutex fastpath on FRAME_POINTER &&  !DEBUG_MUTEXES
@ 2006-01-12 17:53 Chuck Ebbert
  0 siblings, 0 replies; 5+ messages in thread
From: Chuck Ebbert @ 2006-01-12 17:53 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Linus Torvalds, Hugh Dickins, Ingo Molnar

In-Reply-To: <Pine.LNX.4.58.0601110511590.24014@devserv.devel.redhat.com>

On Wed, 11 Jan 2006, Ingo Molnar wrote:

> On Wed, 11 Jan 2006, Chuck Ebbert wrote:
>
> > 
> >                 LOCK    "   decl (%%eax)        \n"                     \
> >                         "   jns 1f              \n"                     \
> >                         "   call "#fail_fn"     \n"                     \
> >                         "1:                     \n"                     \
> >                                                                         \
> >                 :"=a" (dummy)                                           \
> >                 : "a" (count)                                           \
> > 
> > 
> > Will the extra taken forward conditional jump in the fastpath cause much
> > of a slowdown?
> 
> yeah - the fastpath is much more common than the slowpath.

But that's how the spinlock code does it.  Should that be changed to put the
spinloops in .text.lock and make the fastpaths a fall-through?

-- 
Chuck
Currently reading: _Olympos_ by Dan Simmons

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] fix i386 mutex fastpath on FRAME_POINTER &&  !DEBUG_MUTEXES
  2006-01-11  8:24 Chuck Ebbert
@ 2006-01-11 10:13 ` Ingo Molnar
  0 siblings, 0 replies; 5+ messages in thread
From: Ingo Molnar @ 2006-01-11 10:13 UTC (permalink / raw)
  To: Chuck Ebbert; +Cc: Ingo Molnar, Hugh Dickins, Linus Torvalds, linux-kernel


On Wed, 11 Jan 2006, Chuck Ebbert wrote:

> In-Reply-To: <20060110210744.GA8850@elte.hu>
> 
> On Tue, 10 Jan 2006 at 22:07:44 +0100, Ingo Molnar wrote:
> 
> > --- linux.orig/include/asm-i386/mutex.h
> > +++ linux/include/asm-i386/mutex.h
> > @@ -28,7 +28,13 @@ do {                                                                       \
> >                                                                       \
> >       __asm__ __volatile__(                                           \
> >               LOCK    "   decl (%%eax)        \n"                     \
> > -                     "   js "#fail_fn"       \n"                     \
> > +                     "   js 2f               \n"                     \
> > +                     "1:                     \n"                     \
> > +                                                                     \
> > +             LOCK_SECTION_START("")                                  \
> > +                     "2: call "#fail_fn"     \n"                     \
> > +                     "   jmp 1b              \n"                     \
> > +             LOCK_SECTION_END                                        \
> >                                                                       \
> >               :"=a" (dummy)                                           \
> >               : "a" (count)                                           \
> 
> 
> But now it's inefficient again.
> 
> Why not this:
> 
>                 LOCK    "   decl (%%eax)        \n"                     \
>                         "   jns 1f              \n"                     \
>                         "   call "#fail_fn"     \n"                     \
>                         "1:                     \n"                     \
>                                                                         \
>                 :"=a" (dummy)                                           \
>                 : "a" (count)                                           \
> 
> 
> Will the extra taken forward conditional jump in the fastpath cause much
> of a slowdown?

yeah - the fastpath is much more common than the slowpath.

	Ingo

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] fix i386 mutex fastpath on FRAME_POINTER && !DEBUG_MUTEXES
@ 2006-01-11  8:24 Chuck Ebbert
  2006-01-11 10:13 ` Ingo Molnar
  0 siblings, 1 reply; 5+ messages in thread
From: Chuck Ebbert @ 2006-01-11  8:24 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Hugh Dickins, Ingo Molnar, Linus Torvalds, linux-kernel

In-Reply-To: <20060110210744.GA8850@elte.hu>

On Tue, 10 Jan 2006 at 22:07:44 +0100, Ingo Molnar wrote:

> --- linux.orig/include/asm-i386/mutex.h
> +++ linux/include/asm-i386/mutex.h
> @@ -28,7 +28,13 @@ do {                                                                       \
>                                                                       \
>       __asm__ __volatile__(                                           \
>               LOCK    "   decl (%%eax)        \n"                     \
> -                     "   js "#fail_fn"       \n"                     \
> +                     "   js 2f               \n"                     \
> +                     "1:                     \n"                     \
> +                                                                     \
> +             LOCK_SECTION_START("")                                  \
> +                     "2: call "#fail_fn"     \n"                     \
> +                     "   jmp 1b              \n"                     \
> +             LOCK_SECTION_END                                        \
>                                                                       \
>               :"=a" (dummy)                                           \
>               : "a" (count)                                           \


But now it's inefficient again.

Why not this:

                LOCK    "   decl (%%eax)        \n"                     \
                        "   jns 1f              \n"                     \
                        "   call "#fail_fn"     \n"                     \
                        "1:                     \n"                     \
                                                                        \
                :"=a" (dummy)                                           \
                : "a" (count)                                           \


Will the extra taken forward conditional jump in the fastpath cause much of
a slowdown?

-- 
Chuck
Currently reading: _Olympos_ by Dan Simmons

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2006-01-12 17:56 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-01-10 17:44 i386 FRAME_POINTER needs DEBUG_MUTEXES Hugh Dickins
2006-01-10 21:07 ` [patch] fix i386 mutex fastpath on FRAME_POINTER && !DEBUG_MUTEXES Ingo Molnar
2006-01-11  8:24 Chuck Ebbert
2006-01-11 10:13 ` Ingo Molnar
2006-01-12 17:53 Chuck Ebbert

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).