linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm
@ 2004-09-15 16:01 Zwane Mwaikambo
  2004-09-15 21:45 ` Andrew Morton
  0 siblings, 1 reply; 19+ messages in thread
From: Zwane Mwaikambo @ 2004-09-15 16:01 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Andrew Morton, Andi Kleen, William Lee Irwin III

William spotted this stray bit, LOCK_SECTION isn't used anymore on x86_64. 
Andrew i've diffed against -mm because i'd like for the irq enable on 
contention patch to be merged, i believe making spinlock functions out 
of line was a prerequisite Andi wanted before agreeing to the irq enable 
code.

Signed-off-by: Zwane Mwaikambo <zwane@fsmlabs.com>

Index: linux-2.6.9-rc1-mm5/include/asm-x86_64/spinlock.h
===================================================================
RCS file: /home/cvsroot/linux-2.6.9-rc1-mm5/include/asm-x86_64/spinlock.h,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 spinlock.h
--- linux-2.6.9-rc1-mm5/include/asm-x86_64/spinlock.h	13 Sep 2004 12:46:47 -0000	1.1.1.1
+++ linux-2.6.9-rc1-mm5/include/asm-x86_64/spinlock.h	15 Sep 2004 08:47:52 -0000
@@ -45,20 +45,18 @@ typedef struct {
 #define spin_lock_string \
 	"\n1:\t" \
 	"lock ; decb %0\n\t" \
-	"js 2f\n" \
-	LOCK_SECTION_START("") \
+	"jns 3f\n" \
 	"2:\t" \
 	"rep;nop\n\t" \
 	"cmpb $0,%0\n\t" \
 	"jle 2b\n\t" \
 	"jmp 1b\n" \
-	LOCK_SECTION_END
+	"3:\n\t"
 
 #define spin_lock_string_flags \
 	"\n1:\t" \
 	"lock ; decb %0\n\t" \
-	"js 2f\n\t" \
-	LOCK_SECTION_START("") \
+	"jns 4f\n\t" \
 	"2:\t" \
 	"test $0x200, %1\n\t" \
 	"jz 3f\n\t" \
@@ -69,7 +67,7 @@ typedef struct {
 	"jle 3b\n\t" \
 	"cli\n\t" \
 	"jmp 1b\n" \
-	LOCK_SECTION_END
+	"4:\n\t"
 
 /*
  * This works. Despite all the confusion.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm
  2004-09-15 21:45 ` Andrew Morton
@ 2004-09-15 17:55   ` Zwane Mwaikambo
  2004-09-15 21:47   ` Ingo Molnar
  2004-09-16  6:13   ` Andi Kleen
  2 siblings, 0 replies; 19+ messages in thread
From: Zwane Mwaikambo @ 2004-09-15 17:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linux Kernel, Andi Kleen, William Lee Irwin III, Ingo Molnar

On Wed, 15 Sep 2004, Andrew Morton wrote:

> Zwane Mwaikambo <zwane@fsmlabs.com> wrote:
> >
> > William spotted this stray bit, LOCK_SECTION isn't used anymore on x86_64. 
> 
> btw, Ingo and I were scratching heads over an x86_64 oops in curent -linus
> trees.
> 
> If you enable profiling and frame pointers, profile_pc() goes splat
> dereferencing the `regs' argument when it decides that the pc refers to a
> lock section.  Ingo said `regs' had a value of 0x2, iirc.  Consider this a
> bug report ;)

Yeah profile_pc on x86_64 is just broken, i'll have a look.

Thanks,
	Zwane


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm
  2004-09-15 16:01 [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm Zwane Mwaikambo
@ 2004-09-15 21:45 ` Andrew Morton
  2004-09-15 17:55   ` Zwane Mwaikambo
                     ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Andrew Morton @ 2004-09-15 21:45 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: linux-kernel, ak, wli, Ingo Molnar

Zwane Mwaikambo <zwane@fsmlabs.com> wrote:
>
> William spotted this stray bit, LOCK_SECTION isn't used anymore on x86_64. 

btw, Ingo and I were scratching heads over an x86_64 oops in curent -linus
trees.

If you enable profiling and frame pointers, profile_pc() goes splat
dereferencing the `regs' argument when it decides that the pc refers to a
lock section.  Ingo said `regs' had a value of 0x2, iirc.  Consider this a
bug report ;)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm
  2004-09-15 21:45 ` Andrew Morton
  2004-09-15 17:55   ` Zwane Mwaikambo
@ 2004-09-15 21:47   ` Ingo Molnar
  2004-09-16  6:13   ` Andi Kleen
  2 siblings, 0 replies; 19+ messages in thread
From: Ingo Molnar @ 2004-09-15 21:47 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Zwane Mwaikambo, linux-kernel, ak, wli


* Andrew Morton <akpm@osdl.org> wrote:

> Zwane Mwaikambo <zwane@fsmlabs.com> wrote:
> >
> > William spotted this stray bit, LOCK_SECTION isn't used anymore on x86_64. 
> 
> btw, Ingo and I were scratching heads over an x86_64 oops in curent -linus
> trees.
> 
> If you enable profiling and frame pointers, profile_pc() goes splat
> dereferencing the `regs' argument when it decides that the pc refers to a
> lock section.  Ingo said `regs' had a value of 0x2, iirc.  Consider this a
> bug report ;)

'regs->rbp' had the wrong value i think.

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm
  2004-09-15 21:45 ` Andrew Morton
  2004-09-15 17:55   ` Zwane Mwaikambo
  2004-09-15 21:47   ` Ingo Molnar
@ 2004-09-16  6:13   ` Andi Kleen
  2004-09-16  6:27     ` Ingo Molnar
  2 siblings, 1 reply; 19+ messages in thread
From: Andi Kleen @ 2004-09-16  6:13 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Zwane Mwaikambo, linux-kernel, ak, wli, Ingo Molnar

On Wed, Sep 15, 2004 at 02:45:23PM -0700, Andrew Morton wrote:
> Zwane Mwaikambo <zwane@fsmlabs.com> wrote:
> >
> > William spotted this stray bit, LOCK_SECTION isn't used anymore on x86_64. 
> 
> btw, Ingo and I were scratching heads over an x86_64 oops in curent -linus
> trees.
> 
> If you enable profiling and frame pointers, profile_pc() goes splat
> dereferencing the `regs' argument when it decides that the pc refers to a
> lock section.  Ingo said `regs' had a value of 0x2, iirc.  Consider this a
> bug report ;)

Known problem. Interrupts don't save regs->rbp, but the new profile_pc
that was introduced recently uses it.

One quick fix is to just use SAVE_ALL in the interrupt entry code,
but I don't like this because it will affect interrupt latency.

The real fix is to fix profile_pc to not reference it.

-Andi

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm
  2004-09-16  6:13   ` Andi Kleen
@ 2004-09-16  6:27     ` Ingo Molnar
  2004-09-16  6:44       ` Andi Kleen
  2004-09-16 12:44       ` Zwane Mwaikambo
  0 siblings, 2 replies; 19+ messages in thread
From: Ingo Molnar @ 2004-09-16  6:27 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, Zwane Mwaikambo, linux-kernel, wli


* Andi Kleen <ak@suse.de> wrote:

> Known problem. Interrupts don't save regs->rbp, but the new profile_pc
> that was introduced recently uses it.
> 
> One quick fix is to just use SAVE_ALL in the interrupt entry code, but
> I don't like this because it will affect interrupt latency.
> 
> The real fix is to fix profile_pc to not reference it.

it would be nice if we could profile the callers of the spinlock
functions instead of the opaque spinlock functions.

the ebp trick is nice, but forcing a formal stack frame for every
function has global performance implications. Couldnt we define some
sort of current-> field [or current_thread_info() field] that the
spinlock code could set and clear, which field would be listened to by
profile_pc(), so that the time spent spinning would be attributed to the
callee? Something like:

spin_lock()
	current->real_pc = __builtin_return_address(0);
	...
	current->real_pc = 0;

profile_pc():
	...
	if (current->real_pc)
		pc = current->real_pc;
	else
		pc = regs->rip;

this basically means that we set up a 'conditional frame pointer' in the
spinlock functions - but only in the spinlock functions! It is true that 
this would be 1-2 cycles more expensive than using the frame pointer 
register but considering the totality of performance i believe this is 
wastly superior.

AFAIR __builtin_return_address(0) is nice and sweet on all platforms,
and it works even with -fomit-frame-pointers. (levels 1 and more are not
guaranteed to work, but level 0 always works.) On x86/x64 gcc derives it
from %esp so the overhead of setting up current->real_pc. On platforms
that have 'current' (or current_thread_info()) in a register, saving and
clearing current->real_pc ought to be a matter of 2-3 instructions.
(could be 2 instructions total on x64 - the same cost as
-fno-omit-frame-pointer ebp saving would have!)

->real_pc would have a small race window from the point it's cleared up
until it returns, but it's not a big issue because 99% of the spinlock
related profile overhead is either in the spinning section or the first
access to the cacheline. If there is some small overhead measured
spin_lock() it's not a big issue, as long as the brunt of the overhead
is attributed to the ->real_pc callee.

spin_unlock() doesnt even need to set up ->real_pc - making this
solution even cheaper.

hm?

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm
  2004-09-16  6:27     ` Ingo Molnar
@ 2004-09-16  6:44       ` Andi Kleen
  2004-09-16  6:51         ` Ingo Molnar
  2004-09-16 12:44       ` Zwane Mwaikambo
  1 sibling, 1 reply; 19+ messages in thread
From: Andi Kleen @ 2004-09-16  6:44 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andi Kleen, Andrew Morton, Zwane Mwaikambo, linux-kernel, wli

On Thu, Sep 16, 2004 at 08:27:59AM +0200, Ingo Molnar wrote:
> 
> * Andi Kleen <ak@suse.de> wrote:
> 
> > Known problem. Interrupts don't save regs->rbp, but the new profile_pc
> > that was introduced recently uses it.
> > 
> > One quick fix is to just use SAVE_ALL in the interrupt entry code, but
> > I don't like this because it will affect interrupt latency.
> > 
> > The real fix is to fix profile_pc to not reference it.
> 
> it would be nice if we could profile the callers of the spinlock
> functions instead of the opaque spinlock functions.
> 
> the ebp trick is nice, but forcing a formal stack frame for every
> function has global performance implications. Couldnt we define some

I don't think that is needed anyways.  It would seem to 
be overkill to me to make a relatively fast path slower
just for the profiler interrupt.

I think the idea was that the spinlock functions should be small 
enough that they don't have any stack local variables. In this case 
for the standard non FP build you can just use *regs->rsp. The only problem
was with CONFIG_FRAME_POINTER, because then the compiler puts
an additional word onto the stack. I think the right way 
is to just correct for this word and still use rsp.

Here's an untested patch to implement this. Does this fix the problem for 
you?

-Andi


Index: linux/arch/x86_64/kernel/time.c
===================================================================
--- linux.orig/arch/x86_64/kernel/time.c	2004-09-13 22:19:22.%N +0200
+++ linux/arch/x86_64/kernel/time.c	2004-09-16 08:43:06.%N +0200
@@ -184,9 +184,11 @@
 unsigned long profile_pc(struct pt_regs *regs)
 {
 	unsigned long pc = instruction_pointer(regs);
-
+	/* [0] is the frame pointer, [1] is the return address.
+	   This assumes that the spinlock function is small enough
+	   to not have any stack variables. */	
 	if (in_lock_functions(pc))
-		return *(unsigned long *)regs->rbp;
+		return ((unsigned long *)regs->rsp)[1];
 	return pc;
 }
 EXPORT_SYMBOL(profile_pc);


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm
  2004-09-16  6:44       ` Andi Kleen
@ 2004-09-16  6:51         ` Ingo Molnar
  2004-09-16  6:53           ` Andi Kleen
  0 siblings, 1 reply; 19+ messages in thread
From: Ingo Molnar @ 2004-09-16  6:51 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, Zwane Mwaikambo, linux-kernel, wli


* Andi Kleen <ak@suse.de> wrote:

> I think the idea was that the spinlock functions should be small
> enough that they don't have any stack local variables. [...]

this might work for x64, but even there, are you sure it works even with
CONFIG_PREEMPT enabled (there the spinlocks are more complex)? It sure
wont work on x86 so i think we need a generic 'soft PC' solution.

the alternative would be to unwind the stack - quite some task on some 
platforms ...

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm
  2004-09-16  6:51         ` Ingo Molnar
@ 2004-09-16  6:53           ` Andi Kleen
  2004-09-16  6:58             ` Ingo Molnar
  0 siblings, 1 reply; 19+ messages in thread
From: Andi Kleen @ 2004-09-16  6:53 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andi Kleen, Andrew Morton, Zwane Mwaikambo, linux-kernel, wli

On Thu, Sep 16, 2004 at 08:51:01AM +0200, Ingo Molnar wrote:
> 
> * Andi Kleen <ak@suse.de> wrote:
> 
> > I think the idea was that the spinlock functions should be small
> > enough that they don't have any stack local variables. [...]
> 
> this might work for x64, but even there, are you sure it works even with
> CONFIG_PREEMPT enabled (there the spinlocks are more complex)? It sure

I would expect so. x86-64 should have enough callee clobbered registers
by default to handle it.  Unless someone makes them complex enough that it
needs more than 4 variables or so or adds another function call. But I hope 
this won't happen.

> wont work on x86 so i think we need a generic 'soft PC' solution.
> 
> the alternative would be to unwind the stack - quite some task on some 
> platforms ...

Sometimes call graph profiling would be very useful, but I wouldn't 
want the profiler to do it by default, especially not for this silly
simple case. dwarf2 unwinding is complex enough that just requiring 
frame pointers for the CG case would look attractive.

-Andi


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm
  2004-09-16  6:53           ` Andi Kleen
@ 2004-09-16  6:58             ` Ingo Molnar
  2004-09-16  7:09               ` Andi Kleen
  0 siblings, 1 reply; 19+ messages in thread
From: Ingo Molnar @ 2004-09-16  6:58 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, Zwane Mwaikambo, linux-kernel, wli


* Andi Kleen <ak@suse.de> wrote:

> > the alternative would be to unwind the stack - quite some task on some 
> > platforms ...
> 
> Sometimes call graph profiling would be very useful, but I wouldn't
> want the profiler to do it by default, especially not for this silly
> simple case. dwarf2 unwinding is complex enough that just requiring
> frame pointers for the CG case would look attractive.

but ... frame pointers are overhead to all of the kernel. (the overhead
is less pronounced on architectures with more registers, but even with
15 registers, 14 or 15 can still be a difference.) While unwinding is
overhead to profiling only - if enabled. Also, there could be other
functionality (exception handling?) that could benefit from dwarf2
unwinding.

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm
  2004-09-16  6:58             ` Ingo Molnar
@ 2004-09-16  7:09               ` Andi Kleen
  2004-09-16  7:19                 ` Ingo Molnar
  0 siblings, 1 reply; 19+ messages in thread
From: Andi Kleen @ 2004-09-16  7:09 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andi Kleen, Andrew Morton, Zwane Mwaikambo, linux-kernel, wli

On Thu, Sep 16, 2004 at 08:58:05AM +0200, Ingo Molnar wrote:
> 
> * Andi Kleen <ak@suse.de> wrote:
> 
> > > the alternative would be to unwind the stack - quite some task on some 
> > > platforms ...
> > 
> > Sometimes call graph profiling would be very useful, but I wouldn't
> > want the profiler to do it by default, especially not for this silly
> > simple case. dwarf2 unwinding is complex enough that just requiring
> > frame pointers for the CG case would look attractive.
> 
> but ... frame pointers are overhead to all of the kernel. (the overhead

Yes, it may be up to a few percent in extreme cases because it 
adds a stall on rsp on K8.  On the other hand the code
may get slightly smaller, because a rbp reference is one byte
shorter than a rsp reference.

> is less pronounced on architectures with more registers, but even with
> 15 registers, 14 or 15 can still be a difference.) While unwinding is
> overhead to profiling only - if enabled. Also, there could be other
> functionality (exception handling?) that could benefit from dwarf2
> unwinding.

Your oopses could get better backtraces, but that could be done
with frame pointers too (I'm surprised nobody did it already btw...) 

You can try to write a dwarf2 unwinder for the kernel  (actually
I think IA64 already has one, but it's quite complex as expected
and doesn't easily map to anything else). Even with that doing
a dwarf2 unwind interpretation will have much more overhead.  
For me it doesn't look unreasonable to recompile the kernel for
special profiles though. 

-Andi

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm
  2004-09-16  7:09               ` Andi Kleen
@ 2004-09-16  7:19                 ` Ingo Molnar
  2004-09-16  7:29                   ` Andi Kleen
  0 siblings, 1 reply; 19+ messages in thread
From: Ingo Molnar @ 2004-09-16  7:19 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, Zwane Mwaikambo, linux-kernel, wli


* Andi Kleen <ak@suse.de> wrote:

> > is less pronounced on architectures with more registers, but even with
> > 15 registers, 14 or 15 can still be a difference.) While unwinding is
> > overhead to profiling only - if enabled. Also, there could be other
> > functionality (exception handling?) that could benefit from dwarf2
> > unwinding.
> 
> Your oopses could get better backtraces, but that could be done with
> frame pointers too (I'm surprised nobody did it already btw...) 

i did it for x86 a number of times. It gets really messy - you need to
save ebp across interrupt and exception contexts and the ptrace code has
deep assumptions there. But frame pointers on x86 are really really bad
- 6 instead of 7 registers, quite a number of more spills.  _Despite
this overhead_, there are distros that picked framepointers on x86 due
to the debuggability plus! So by not having a clean unwinding and
backtracing infrastructure we are forcing distributors to compile an
inferior kernel.

> You can try to write a dwarf2 unwinder for the kernel (actually I
> think IA64 already has one, but it's quite complex as expected and
> doesn't easily map to anything else). Even with that doing a dwarf2
> unwind interpretation will have much more overhead.  For me it doesn't
> look unreasonable to recompile the kernel for special profiles though. 

the main issue is production level distro kernels - they will pick the
profilable option. So we must work on this issue some more. My ->real_pc
solution solves the profiling problem at a minimal cost (the cost to
spin_lock is close to the cost of ebp saving (on x86), and the cost to
other functions is zero). If a distro has the option between compiling
with framepointers or compiling with ->real_pc i'm sure they will pick
the ->real_pc solution. I dont say it's an elegant solution but it will
work on all architectures - and kernel/spinlock.c is shared amongst all
architectures.

(likewise, they'd pick the dwarf2 unwinder over ->real_pc because that
removes the spin_lock overhead but a dwarf2 unwinder doesnt seem to be
in the works ...)

Even on x64, you really dont want profiling to break just because
someone compiles with spinlock debugging enabled and you happen to run
out of the 7 callee-saved registers ... I think your patch is dangerous
in this respect - it might work if you can detect for sure at build time
whether there's any local variable. Tricks like this really tend to
haunt us later.

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm
  2004-09-16  7:19                 ` Ingo Molnar
@ 2004-09-16  7:29                   ` Andi Kleen
  2004-09-16  7:44                     ` Ingo Molnar
  0 siblings, 1 reply; 19+ messages in thread
From: Andi Kleen @ 2004-09-16  7:29 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andi Kleen, Andrew Morton, Zwane Mwaikambo, linux-kernel, wli

On Thu, Sep 16, 2004 at 09:19:31AM +0200, Ingo Molnar wrote:
> i did it for x86 a number of times. It gets really messy - you need to
> save ebp across interrupt and exception contexts and the ptrace code has
> deep assumptions there. But frame pointers on x86 are really really bad
> - 6 instead of 7 registers, quite a number of more spills.  _Despite
> this overhead_, there are distros that picked framepointers on x86 due
> to the debuggability plus! So by not having a clean unwinding and
> backtracing infrastructure we are forcing distributors to compile an
> inferior kernel.

I think you're overstating the case here. We certainly have adequate 
profiling and debugging infrastructure on x86-64 with frame pointer
off. 

In fact on x86-64 it works even better because kgdb actually
works properly with dwarf2 and without FP and the kernel has all the proper
unwind info.

I was talking about call graph profiling, but that's an additional
new feature that could be added in the future (oprofile already
supports it with an extra patch on i386). It may need it. But
not having it wasn't a big issue so far. 

I think most users of CG profiling want to have it for user space
anyways, not necessarily for the kernel.

> 
> > You can try to write a dwarf2 unwinder for the kernel (actually I
> > think IA64 already has one, but it's quite complex as expected and
> > doesn't easily map to anything else). Even with that doing a dwarf2
> > unwind interpretation will have much more overhead.  For me it doesn't
> > look unreasonable to recompile the kernel for special profiles though. 
> 
> the main issue is production level distro kernels - they will pick the
> profilable option. So we must work on this issue some more. My ->real_pc

Something is mixed up here:

The whole problem only happens on kernels using frame pointer. I never
saw it, simply because I don't use frame pointers.

On a frame pointer less kernel profiling works just fine, and 
with this fix it should work the same on a FP kernel.

> solution solves the profiling problem at a minimal cost (the cost to

I think my new profile_pc gives it at even smaller cost though :)


> (likewise, they'd pick the dwarf2 unwinder over ->real_pc because that
> removes the spin_lock overhead but a dwarf2 unwinder doesnt seem to be
> in the works ...)
> 
> Even on x64, you really dont want profiling to break just because
> someone compiles with spinlock debugging enabled and you happen to run
> out of the 7 callee-saved registers ... I think your patch is dangerous

If someone adds such bloated debug code they can deal with it
themselves. I don't think we should try to handle such extreme
cases by default.

Ok, adding a comment there may be fair to warn them in advance.

> in this respect - it might work if you can detect for sure at build time
> whether there's any local variable. Tricks like this really tend to
> haunt us later.

There are already lots of such assumptions in the kernel (e.g. WCHAN and 
others).  I don't think adding one more is a big issue.

-Andi


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm
  2004-09-16  7:29                   ` Andi Kleen
@ 2004-09-16  7:44                     ` Ingo Molnar
  2004-09-16  7:53                       ` Andi Kleen
  2004-09-16  9:01                       ` Andi Kleen
  0 siblings, 2 replies; 19+ messages in thread
From: Ingo Molnar @ 2004-09-16  7:44 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, Zwane Mwaikambo, linux-kernel, wli


* Andi Kleen <ak@suse.de> wrote:

> Something is mixed up here:
> 
> The whole problem only happens on kernels using frame pointer. I never
> saw it, simply because I don't use frame pointers.
>
> On a frame pointer less kernel profiling works just fine, and with
> this fix it should work the same on a FP kernel.

it only works on pointer less kernels because the spinlock profile 
unwinding is _conditional_ on an FP kernel right now:

 #if defined(CONFIG_SMP) && defined(CONFIG_FRAME_POINTER)
 unsigned long profile_pc(struct pt_regs *regs)
 {
 ...

on non-FP kernels you'll see all the overhead in the single spin_lock()
function, agreed?

> > in this respect - it might work if you can detect for sure at build time
> > whether there's any local variable. Tricks like this really tend to
> > haunt us later.
> 
> There are already lots of such assumptions in the kernel (e.g. WCHAN
> and others).  I don't think adding one more is a big issue.

wchan has only one assumption: that that all __sched section functions
have a valid frame pointer. This is not unrobust at all. (it is
nonperformant though on register-starved platforms and having proper
unwind would fix wchan too.) In fact ->real_pc could be used by __sched
functions as well, it would likely be cheaper (on x86) than having to
compile with -fno-omit-frame-pointers.

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm
  2004-09-16  7:44                     ` Ingo Molnar
@ 2004-09-16  7:53                       ` Andi Kleen
  2004-09-16  9:01                       ` Andi Kleen
  1 sibling, 0 replies; 19+ messages in thread
From: Andi Kleen @ 2004-09-16  7:53 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andi Kleen, Andrew Morton, Zwane Mwaikambo, linux-kernel, wli

On Thu, Sep 16, 2004 at 09:44:31AM +0200, Ingo Molnar wrote:
> it only works on pointer less kernels because the spinlock profile 
> unwinding is _conditional_ on an FP kernel right now:
> 
>  #if defined(CONFIG_SMP) && defined(CONFIG_FRAME_POINTER)
>  unsigned long profile_pc(struct pt_regs *regs)
>  {
>  ...
> 
> on non-FP kernels you'll see all the overhead in the single spin_lock()
> function, agreed?

Agreed.  Ok, should be easy to fix with the same scheme, except
that it has to use offset 0 instead of 1 for the !FP case.
I will cook up a patch.

BTW the SMP check is also wrong because preemptive kernels also
need this unwinding, although they should not spend too much
time in locking.

> > > in this respect - it might work if you can detect for sure at build time
> > > whether there's any local variable. Tricks like this really tend to
> > > haunt us later.
> > 
> > There are already lots of such assumptions in the kernel (e.g. WCHAN
> > and others).  I don't think adding one more is a big issue.
> 
> wchan has only one assumption: that that all __sched section functions

It's more than WCHAN, e.g. sysrq-t assumes it too. 

-Andi

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm
  2004-09-16  7:44                     ` Ingo Molnar
  2004-09-16  7:53                       ` Andi Kleen
@ 2004-09-16  9:01                       ` Andi Kleen
  1 sibling, 0 replies; 19+ messages in thread
From: Andi Kleen @ 2004-09-16  9:01 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andi Kleen, Andrew Morton, Zwane Mwaikambo, linux-kernel, wli

On Thu, Sep 16, 2004 at 09:44:31AM +0200, Ingo Molnar wrote:
> 
> * Andi Kleen <ak@suse.de> wrote:
> 
> > Something is mixed up here:
> > 
> > The whole problem only happens on kernels using frame pointer. I never
> > saw it, simply because I don't use frame pointers.
> >
> > On a frame pointer less kernel profiling works just fine, and with
> > this fix it should work the same on a FP kernel.
> 
> it only works on pointer less kernels because the spinlock profile 
> unwinding is _conditional_ on an FP kernel right now:

Actually my idea doesn't work  because there is a one instruction
race where rbp is not set up on the stack yet. I think I give up and just save
%rbp in this case.

-Andi

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm
  2004-09-16  6:27     ` Ingo Molnar
  2004-09-16  6:44       ` Andi Kleen
@ 2004-09-16 12:44       ` Zwane Mwaikambo
  2004-09-16 19:30         ` Ingo Molnar
  1 sibling, 1 reply; 19+ messages in thread
From: Zwane Mwaikambo @ 2004-09-16 12:44 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andi Kleen, Andrew Morton, linux-kernel, wli

On Thu, 16 Sep 2004, Ingo Molnar wrote:

> the ebp trick is nice, but forcing a formal stack frame for every
> function has global performance implications. Couldnt we define some
> sort of current-> field [or current_thread_info() field] that the
> spinlock code could set and clear, which field would be listened to by
> profile_pc(), so that the time spent spinning would be attributed to the
> callee? Something like:

I think the generic route is nice but wouldn't this break with the 
following.

taskA:
spin_lock(lockA); // contended
<interrupt>
int1:
spin_lock(lockB)

I was thinking along the likes of a per_cpu real_pc, but realised it falls 
prey to the same problem as above... Unless we have irq threads, then of 
course your solution works.

	Zwane


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm
  2004-09-16 12:44       ` Zwane Mwaikambo
@ 2004-09-16 19:30         ` Ingo Molnar
  0 siblings, 0 replies; 19+ messages in thread
From: Ingo Molnar @ 2004-09-16 19:30 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: Andi Kleen, Andrew Morton, linux-kernel, wli


* Zwane Mwaikambo <zwane@linuxpower.ca> wrote:

> On Thu, 16 Sep 2004, Ingo Molnar wrote:
> 
> > the ebp trick is nice, but forcing a formal stack frame for every
> > function has global performance implications. Couldnt we define some
> > sort of current-> field [or current_thread_info() field] that the
> > spinlock code could set and clear, which field would be listened to by
> > profile_pc(), so that the time spent spinning would be attributed to the
> > callee? Something like:
> 
> I think the generic route is nice but wouldn't this break with the 
> following.
> 
> taskA:
> spin_lock(lockA); // contended
> <interrupt>
> int1:
> spin_lock(lockB)
> 
> I was thinking along the likes of a per_cpu real_pc, but realised it
> falls prey to the same problem as above... Unless we have irq threads,
> then of course your solution works.

you mean the nesting? spin_lock() should save/restore the value instead
of setting/clearing it - and fork() should initialize it to zero.

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm
@ 2004-09-15 22:42 Andrew Chew
  0 siblings, 0 replies; 19+ messages in thread
From: Andrew Chew @ 2004-09-15 22:42 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton; +Cc: Zwane Mwaikambo, linux-kernel, ak, wli

> > If you enable profiling and frame pointers, profile_pc() goes splat 
> > dereferencing the `regs' argument when it decides that the 
> pc refers 
> > to a lock section.  Ingo said `regs' had a value of 0x2, iirc.  
> > Consider this a bug report ;)

I've been seeing this as well, although I'm pretty sure I have profiling
disabled.  Are you sure it isn't the combination of SMP support and
frame pointers that causes this bug to occur? (It looks like
profile_pc() takes on a different implementation only if both CONFIG_SMP
and CONFIG_FRAME_POINTER is defined.)

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2004-09-16 19:29 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-09-15 16:01 [PATCH] remove LOCK_SECTION from x86_64 spin_lock asm Zwane Mwaikambo
2004-09-15 21:45 ` Andrew Morton
2004-09-15 17:55   ` Zwane Mwaikambo
2004-09-15 21:47   ` Ingo Molnar
2004-09-16  6:13   ` Andi Kleen
2004-09-16  6:27     ` Ingo Molnar
2004-09-16  6:44       ` Andi Kleen
2004-09-16  6:51         ` Ingo Molnar
2004-09-16  6:53           ` Andi Kleen
2004-09-16  6:58             ` Ingo Molnar
2004-09-16  7:09               ` Andi Kleen
2004-09-16  7:19                 ` Ingo Molnar
2004-09-16  7:29                   ` Andi Kleen
2004-09-16  7:44                     ` Ingo Molnar
2004-09-16  7:53                       ` Andi Kleen
2004-09-16  9:01                       ` Andi Kleen
2004-09-16 12:44       ` Zwane Mwaikambo
2004-09-16 19:30         ` Ingo Molnar
2004-09-15 22:42 Andrew Chew

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).