* [PATCH] x86: LLVMLinux: Reimplement current_stack_pointer without register usage.
From: behanw @ 2014-02-21  4:44 UTC (permalink / raw)
  To: tglx, mingo, hpa, x86, peterz, ak, oleg; +Cc: akpm, linux-kernel, Behan Webster

From: Behan Webster <behanw@converseincode.com>

Use asm to make the globally named register work again for gcc and clang.
Much more efficient than copying the stack pointer to a variable and back again.

Signed-off-by: Behan Webster <behanw@converseincode.com>
---
 arch/x86/include/asm/thread_info.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index e1940c0..e27ccc1 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -163,10 +163,10 @@ struct thread_info {
  */
 #ifndef __ASSEMBLY__
 
-#define current_stack_pointer ({		\
-	unsigned long sp;			\
-	asm("mov %%esp,%0" : "=g" (sp));	\
-	sp;					\
+#define current_stack_pointer ({			\
+	register unsigned long sp asm("esp") __used;	\
+	asm("" : "=r" (sp));				\
+	sp;						\
 })
 
 /* how to get the thread information struct from C */
-- 
1.8.3.2



* Re: [PATCH] x86: LLVMLinux: Reimplement current_stack_pointer without register usage.
From: H. Peter Anvin @ 2014-02-21  4:55 UTC (permalink / raw)
  To: behanw, tglx, mingo, x86, peterz, ak, oleg; +Cc: akpm, linux-kernel

This seems like really deep magic when looking at it... at the very 
least, this needs to be very carefully commented, including why it works 
on the various platforms.

How much does this actually affect the output?  I only see three uses of 
current_stack_pointer:

/* how to get the thread information struct from C */
static inline struct thread_info *current_thread_info(void)
{
         return (struct thread_info *)
                 (current_stack_pointer & ~(THREAD_SIZE - 1));
}

... here we need the mov anyway, because we have to then AND it with a 
mask, which we obviously can't do inside the stack pointer.
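
For illustration, the masking step can be sketched in userspace C. This is
only a sketch under stated assumptions: DEMO_THREAD_SIZE (8192 here), struct
demo_thread_info and both helper names are hypothetical, and an aligned heap
block stands in for a kernel stack. It relies on the same property the kernel
code does, namely that the stack is THREAD_SIZE-aligned with thread_info at
its base.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed value for illustration; the kernel's THREAD_SIZE depends on config. */
#define DEMO_THREAD_SIZE 8192UL

/* Hypothetical stand-in for struct thread_info. */
struct demo_thread_info {
	int cpu;
};

/* Clear the low bits of a stack address to reach the base of its stack. */
static struct demo_thread_info *demo_thread_info_from_sp(uintptr_t sp)
{
	return (struct demo_thread_info *)
		(sp & ~(uintptr_t)(DEMO_THREAD_SIZE - 1));
}

/* Build a fake aligned stack and check that any address in it maps to its base. */
static int demo_lookup_roundtrip(void)
{
	void *stack = aligned_alloc(DEMO_THREAD_SIZE, DEMO_THREAD_SIZE);
	struct demo_thread_info *ti = stack;
	uintptr_t fake_sp;
	int ok;

	if (!stack)
		return 0;

	ti->cpu = 3;
	/* Pretend the stack pointer sits 64 bytes below the top of the stack. */
	fake_sp = (uintptr_t)stack + DEMO_THREAD_SIZE - 64;

	assert(demo_thread_info_from_sp(fake_sp) == ti);
	ok = demo_thread_info_from_sp(fake_sp)->cpu == 3;

	free(stack);
	return ok;
}
```

Whichever way the stack pointer is read, the result has to land in a general
purpose register before the AND, since the mask cannot be applied to the
stack pointer in place.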

kernel/irq_32.c:        irqctx->tinfo.previous_esp = current_stack_pointer;

(two times)

Here we are moving it into a memory variable anyway, which the "=g" 
constraint should allow.
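
A rough userspace sketch of that case follows. The names demo_previous_sp and
demo_save_stack_pointer are hypothetical stand-ins for the irqctx field and
its assignment, the 64-bit branch uses %rsp only so the sketch builds on
x86-64, and the non-x86 fallback is an approximation added for portability.
With "=g" the compiler is free to pick a memory operand, so the store can be
a single mov from the stack pointer into the variable.

```c
#include <stdint.h>

/* Hypothetical stand-in for irqctx->tinfo.previous_esp. */
static uintptr_t demo_previous_sp;

/* Store the current stack pointer into a memory variable. */
static inline void demo_save_stack_pointer(void)
{
#if defined(__x86_64__)
	/* "=g" permits a register or memory output operand. */
	__asm__ volatile("mov %%rsp,%0" : "=g" (demo_previous_sp));
#elif defined(__i386__)
	__asm__ volatile("mov %%esp,%0" : "=g" (demo_previous_sp));
#else
	/* Not x86: approximate with the frame address (assumption for portability). */
	demo_previous_sp = (uintptr_t)__builtin_frame_address(0);
#endif
}
```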

So I see no evidence this is more efficient in any way.

	-hpa


On 02/20/2014 08:44 PM, behanw@converseincode.com wrote:
> From: Behan Webster <behanw@converseincode.com>
>
> Use asm to make the globally named register work again for gcc and clang.
> Much more efficient than copying the stack pointer to a variable and back again.
>
> Signed-off-by: Behan Webster <behanw@converseincode.com>
> ---
>   arch/x86/include/asm/thread_info.h | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
> index e1940c0..e27ccc1 100644
> --- a/arch/x86/include/asm/thread_info.h
> +++ b/arch/x86/include/asm/thread_info.h
> @@ -163,10 +163,10 @@ struct thread_info {
>    */
>   #ifndef __ASSEMBLY__
>
> -#define current_stack_pointer ({		\
> -	unsigned long sp;			\
> -	asm("mov %%esp,%0" : "=g" (sp));	\
> -	sp;					\
> +#define current_stack_pointer ({			\
> +	register unsigned long sp asm("esp") __used;	\
> +	asm("" : "=r" (sp));				\
> +	sp;						\
>   })
>
>   /* how to get the thread information struct from C */
>



* Re: [PATCH] x86: LLVMLinux: Reimplement current_stack_pointer without register usage.
From: Andy Lutomirski @ 2014-02-26  3:00 UTC (permalink / raw)
  To: H. Peter Anvin, behanw, tglx, mingo, x86, peterz, ak, oleg
  Cc: akpm, linux-kernel

On 02/20/2014 08:55 PM, H. Peter Anvin wrote:
> This seems like really deep magic when looking at it... at the very
> least, this needs to be very carefully commented, including why it works
> on the various platforms.
> 
> How much does this actually affect the output?  I only see three uses of
> current_stack_pointer:
> 
> /* how to get the thread information struct from C */
> static inline struct thread_info *current_thread_info(void)
> {
>         return (struct thread_info *)
>                 (current_stack_pointer & ~(THREAD_SIZE - 1));
> }
> 
> ... here we need the mov anyway, because we have to then AND it with a
> mask, which we obviously can't do inside the stack pointer.

No clue what code is actually generated, but the new code could generate:

mov $MASK, %rax;
and %esp, %rax;

Admittedly, I can't see any reason why this would be an improvement.

--Andy


* Re: [PATCH] x86: LLVMLinux: Reimplement current_stack_pointer without register usage.
From: H. Peter Anvin @ 2014-02-26  3:03 UTC (permalink / raw)
  To: Andy Lutomirski, behanw, tglx, mingo, x86, peterz, ak, oleg
  Cc: akpm, linux-kernel

On 02/25/2014 07:00 PM, Andy Lutomirski wrote:
>>
>> How much does this actually affect the output?  I only see three uses of
>> current_stack_pointer:
>>
>> /* how to get the thread information struct from C */
>> static inline struct thread_info *current_thread_info(void)
>> {
>>         return (struct thread_info *)
>>                 (current_stack_pointer & ~(THREAD_SIZE - 1));
>> }
>>
>> ... here we need the mov anyway, because we have to then AND it with a
>> mask, which we obviously can't do inside the stack pointer.
> 
> No clue what code is actually generated, but the new code could generate:
> 
> mov $MASK, %rax;
> and %esp, %rax;
> 
> Admittedly, I can't see any reason why this would be an improvement.
> 

You have to generate one of the code sequences:

	mov $MASK, %eax
	and %esp, %eax

... or ...

	mov %esp, %eax
	and $MASK, %eax

No real difference either way.
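
In C terms the two orderings look like this. It is a sketch only: DEMO_MASK
assumes the mask is ~(THREAD_SIZE - 1) with a THREAD_SIZE of 8192, and both
function names are hypothetical. Each function needs one load into a scratch
value and one AND, and both return the same result for any input.

```c
#include <stdint.h>

/* Assumed mask for illustration: ~(THREAD_SIZE - 1) with THREAD_SIZE = 8192. */
#define DEMO_MASK (~(uintptr_t)(8192 - 1))

/* Mirrors "mov $MASK, %eax; and %esp, %eax". */
static uintptr_t demo_mask_first(uintptr_t sp)
{
	uintptr_t r = DEMO_MASK;

	r &= sp;
	return r;
}

/* Mirrors "mov %esp, %eax; and $MASK, %eax". */
static uintptr_t demo_sp_first(uintptr_t sp)
{
	uintptr_t r = sp;

	r &= DEMO_MASK;
	return r;
}
```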

	-hpa



