linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/4] x86/asm/entry/64: better label name, fix comments
@ 2015-03-25 17:18 Denys Vlasenko
  2015-03-25 17:18 ` [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path Denys Vlasenko
                   ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: Denys Vlasenko @ 2015-03-25 17:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Denys Vlasenko, Linus Torvalds, Steven Rostedt, Borislav Petkov,
	H. Peter Anvin, Andy Lutomirski, Oleg Nesterov,
	Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
	x86, linux-kernel

A named label "ret_from_sys_call" implies that there are jumps
to this location from elsewhere, as happens with many other labels
in this file.
But this label is used only by the JMP a few insns above.
To make that obvious, use local numeric label instead.

Improve comments:

"and return regs->ax" isn't too informative. We always return regs->ax.

The comment suggesting that it'd be cool to use rip relative addressing for CALL
is deleted. It's unclear why that would be an improvement - we aren't striving
to use position-independent code here. PIC code here would require something like
LEA sys_call_table(%rip),reg + CALL *(reg,%rax*8)...

"iret frame is also incomplete" is no longer true, fix that too.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Borislav Petkov <bp@alien8.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Alexei Starovoitov <ast@plumgrid.com>
CC: Will Drewry <wad@chromium.org>
CC: Kees Cook <keescook@chromium.org>
CC: x86@kernel.org
CC: linux-kernel@vger.kernel.org
---
 arch/x86/kernel/entry_64.S | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index bf9afad..40f7760 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -258,16 +258,15 @@ system_call_fastpath:
 	andl $__SYSCALL_MASK,%eax
 	cmpl $__NR_syscall_max,%eax
 #endif
-	ja ret_from_sys_call  /* and return regs->ax */
+	ja	1f	/* return -ENOSYS (already in pt_regs->ax) */
 	movq %r10,%rcx
-	call *sys_call_table(,%rax,8)  # XXX:	 rip relative
+	call *sys_call_table(,%rax,8)
 	movq %rax,RAX(%rsp)
+1:
 /*
- * Syscall return path ending with SYSRET (fast path)
- * Has incompletely filled pt_regs, iret frame is also incomplete.
+ * Syscall return path ending with SYSRET (fast path).
+ * Has incompletely filled pt_regs.
  */
-ret_from_sys_call:
-
 	LOCKDEP_SYS_EXIT
 	DISABLE_INTERRUPTS(CLBR_NONE)
 	TRACE_IRQS_OFF
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path
  2015-03-25 17:18 [PATCH 1/4] x86/asm/entry/64: better label name, fix comments Denys Vlasenko
@ 2015-03-25 17:18 ` Denys Vlasenko
  2015-03-25 17:29   ` Ingo Molnar
  2015-03-25 17:18 ` [PATCH 3/4] x86/asm/entry/64: use smaller insns Denys Vlasenko
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 24+ messages in thread
From: Denys Vlasenko @ 2015-03-25 17:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Denys Vlasenko, Linus Torvalds, Steven Rostedt, Borislav Petkov,
	H. Peter Anvin, Andy Lutomirski, Oleg Nesterov,
	Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
	x86, linux-kernel

SYSRET code path has a small irq-off block.
On this code path, TRACE_IRQS_ON can't be called right before interrupts
are enabled for real, we can't clobber registers there.
So current code does it earlier, in a safe place.

But with this, TRACE_IRQS_OFF/ON frames just two fast instructions,
which is ridiculous: now most of irq-off block is _outside_ of the framing.

Do the same thing that we do on SYSCALL entry: do not track this irq-off block,
it is very small to ever cause noticeable irq latency.

Be careful: make sure that "jnz int_ret_from_sys_call_irqs_off" now does
invoke TRACE_IRQS_OFF - move int_ret_from_sys_call_irqs_off label before
TRACE_IRQS_OFF.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Borislav Petkov <bp@alien8.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Alexei Starovoitov <ast@plumgrid.com>
CC: Will Drewry <wad@chromium.org>
CC: Kees Cook <keescook@chromium.org>
CC: x86@kernel.org
CC: linux-kernel@vger.kernel.org
---
 arch/x86/kernel/entry_64.S | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 40f7760..11b7339 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -268,8 +268,11 @@ system_call_fastpath:
  * Has incompletely filled pt_regs.
  */
 	LOCKDEP_SYS_EXIT
+	/*
+	 * We do not frame this tiny irq-off block with TRACE_IRQS_OFF/ON,
+	 * it is too small to ever cause noticeable irq latency.
+	 */
 	DISABLE_INTERRUPTS(CLBR_NONE)
-	TRACE_IRQS_OFF
 
 	/*
 	 * We must check ti flags with interrupts (or at least preemption)
@@ -283,10 +286,7 @@ system_call_fastpath:
 	jnz int_ret_from_sys_call_irqs_off	/* Go to the slow path */
 
 	CFI_REMEMBER_STATE
-	/*
-	 * sysretq will re-enable interrupts:
-	 */
-	TRACE_IRQS_ON
+
 	RESTORE_C_REGS_EXCEPT_RCX_R11
 	movq	RIP(%rsp),%rcx
 	CFI_REGISTER	rip,rcx
@@ -297,6 +297,7 @@ system_call_fastpath:
 	 * 64bit SYSRET restores rip from rcx,
 	 * rflags from r11 (but RF and VM bits are forced to 0),
 	 * cs and ss are loaded from MSRs.
+	 * Restoration of rflags re-enables interrupts.
 	 */
 	USERGS_SYSRET64
 
@@ -345,8 +346,8 @@ tracesys_phase2:
  */
 GLOBAL(int_ret_from_sys_call)
 	DISABLE_INTERRUPTS(CLBR_NONE)
-	TRACE_IRQS_OFF
 int_ret_from_sys_call_irqs_off:
+	TRACE_IRQS_OFF
 	movl $_TIF_ALLWORK_MASK,%edi
 	/* edi:	mask to check */
 GLOBAL(int_with_check)
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 3/4] x86/asm/entry/64: use smaller insns
  2015-03-25 17:18 [PATCH 1/4] x86/asm/entry/64: better label name, fix comments Denys Vlasenko
  2015-03-25 17:18 ` [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path Denys Vlasenko
@ 2015-03-25 17:18 ` Denys Vlasenko
  2015-03-25 18:03   ` Andy Lutomirski
                     ` (2 more replies)
  2015-03-25 17:18 ` [PATCH 4/4] x86/asm/entry/64: fix typo in comment Denys Vlasenko
  2015-03-27 11:46 ` [tip:x86/asm] x86/asm/entry/64: Use better label name, fix comments tip-bot for Denys Vlasenko
  3 siblings, 3 replies; 24+ messages in thread
From: Denys Vlasenko @ 2015-03-25 17:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Denys Vlasenko, Linus Torvalds, Steven Rostedt, Borislav Petkov,
	H. Peter Anvin, Andy Lutomirski, Oleg Nesterov,
	Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
	x86, linux-kernel

The $AUDIT_ARCH_X86_64 parameter to syscall_trace_enter_phase1/2
is a 32-bit constant, loading it with 32-bit MOV produces 5-byte insn
instead of 10-byte one.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Borislav Petkov <bp@alien8.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Alexei Starovoitov <ast@plumgrid.com>
CC: Will Drewry <wad@chromium.org>
CC: Kees Cook <keescook@chromium.org>
CC: x86@kernel.org
CC: linux-kernel@vger.kernel.org
---
 arch/x86/kernel/entry_64.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 11b7339..23aa43e 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -306,7 +306,7 @@ system_call_fastpath:
 	/* Do syscall entry tracing */
 tracesys:
 	movq %rsp, %rdi
-	movq $AUDIT_ARCH_X86_64, %rsi
+	movl $AUDIT_ARCH_X86_64, %esi
 	call syscall_trace_enter_phase1
 	test %rax, %rax
 	jnz tracesys_phase2		/* if needed, run the slow path */
@@ -317,7 +317,7 @@ tracesys:
 tracesys_phase2:
 	SAVE_EXTRA_REGS
 	movq %rsp, %rdi
-	movq $AUDIT_ARCH_X86_64, %rsi
+	movl $AUDIT_ARCH_X86_64, %esi
 	movq %rax,%rdx
 	call syscall_trace_enter_phase2
 
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 4/4] x86/asm/entry/64: fix typo in comment
  2015-03-25 17:18 [PATCH 1/4] x86/asm/entry/64: better label name, fix comments Denys Vlasenko
  2015-03-25 17:18 ` [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path Denys Vlasenko
  2015-03-25 17:18 ` [PATCH 3/4] x86/asm/entry/64: use smaller insns Denys Vlasenko
@ 2015-03-25 17:18 ` Denys Vlasenko
  2015-03-27 11:46 ` [tip:x86/asm] x86/asm/entry/64: Use better label name, fix comments tip-bot for Denys Vlasenko
  3 siblings, 0 replies; 24+ messages in thread
From: Denys Vlasenko @ 2015-03-25 17:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Denys Vlasenko, Linus Torvalds, Steven Rostedt, Borislav Petkov,
	H. Peter Anvin, Andy Lutomirski, Oleg Nesterov,
	Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
	x86, linux-kernel

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Borislav Petkov <bp@alien8.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Alexei Starovoitov <ast@plumgrid.com>
CC: Will Drewry <wad@chromium.org>
CC: Kees Cook <keescook@chromium.org>
CC: x86@kernel.org
CC: linux-kernel@vger.kernel.org
---
 arch/x86/kernel/entry_64.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 23aa43e..47e126b 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -1407,7 +1407,7 @@ ENTRY(nmi)
 	 * NMI.
 	 */
 
-	/* Use %rdx as out temp variable throughout */
+	/* Use %rdx as our temp variable throughout */
 	pushq_cfi %rdx
 	CFI_REL_OFFSET rdx, 0
 
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path
  2015-03-25 17:18 ` [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path Denys Vlasenko
@ 2015-03-25 17:29   ` Ingo Molnar
  2015-03-25 17:41     ` Denys Vlasenko
  0 siblings, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2015-03-25 17:29 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Linus Torvalds, Steven Rostedt, Borislav Petkov, H. Peter Anvin,
	Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
	Alexei Starovoitov, Will Drewry, Kees Cook, x86, linux-kernel


* Denys Vlasenko <dvlasenk@redhat.com> wrote:

> SYSRET code path has a small irq-off block.
> On this code path, TRACE_IRQS_ON can't be called right before interrupts
> are enabled for real, we can't clobber registers there.
> So current code does it earlier, in a safe place.
> 
> But with this, TRACE_IRQS_OFF/ON frames just two fast instructions,
> which is ridiculous: now most of irq-off block is _outside_ of the framing.
> 
> Do the same thing that we do on SYSCALL entry: do not track this irq-off block,
> it is very small to ever cause noticeable irq latency.
> 
> Be careful: make sure that "jnz int_ret_from_sys_call_irqs_off" now does
> invoke TRACE_IRQS_OFF - move int_ret_from_sys_call_irqs_off label before
> TRACE_IRQS_OFF.

> @@ -345,8 +346,8 @@ tracesys_phase2:
>   */
>  GLOBAL(int_ret_from_sys_call)
>  	DISABLE_INTERRUPTS(CLBR_NONE)
> -	TRACE_IRQS_OFF
>  int_ret_from_sys_call_irqs_off:
> +	TRACE_IRQS_OFF
>  	movl $_TIF_ALLWORK_MASK,%edi
>  	/* edi:	mask to check */

This latter trick absolutely needs a comment, to keep future lockdep 
developers from wondering about the mismatch and the weird label 
placement ...

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path
  2015-03-25 17:29   ` Ingo Molnar
@ 2015-03-25 17:41     ` Denys Vlasenko
  2015-03-25 18:04       ` Ingo Molnar
  0 siblings, 1 reply; 24+ messages in thread
From: Denys Vlasenko @ 2015-03-25 17:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Steven Rostedt, Borislav Petkov, H. Peter Anvin,
	Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
	Alexei Starovoitov, Will Drewry, Kees Cook, x86, linux-kernel

On 03/25/2015 06:29 PM, Ingo Molnar wrote:
> 
> * Denys Vlasenko <dvlasenk@redhat.com> wrote:
> 
>> SYSRET code path has a small irq-off block.
>> On this code path, TRACE_IRQS_ON can't be called right before interrupts
>> are enabled for real, we can't clobber registers there.
>> So current code does it earlier, in a safe place.
>>
>> But with this, TRACE_IRQS_OFF/ON frames just two fast instructions,
>> which is ridiculous: now most of irq-off block is _outside_ of the framing.
>>
>> Do the same thing that we do on SYSCALL entry: do not track this irq-off block,
>> it is very small to ever cause noticeable irq latency.
>>
>> Be careful: make sure that "jnz int_ret_from_sys_call_irqs_off" now does
>> invoke TRACE_IRQS_OFF - move int_ret_from_sys_call_irqs_off label before
>> TRACE_IRQS_OFF.
> 
>> @@ -345,8 +346,8 @@ tracesys_phase2:
>>   */
>>  GLOBAL(int_ret_from_sys_call)
>>  	DISABLE_INTERRUPTS(CLBR_NONE)
>> -	TRACE_IRQS_OFF
>>  int_ret_from_sys_call_irqs_off:
>> +	TRACE_IRQS_OFF
>>  	movl $_TIF_ALLWORK_MASK,%edi
>>  	/* edi:	mask to check */
> 
> This latter trick absolutely needs a comment, to keep future lockdep 
> developers from wondering about the mismatch and the weird label 
> placement ...

Unsure how to format it.

How about:


	DISABLE_INTERRUPTS(CLBR_NONE)
int_ret_from_sys_call_irqs_off: /* jumps come here with irqs off */
	TRACE_IRQS_OFF



(In truth, there is only one jump as of now, but using pliral
"jumps" if that would change)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
  2015-03-25 17:18 ` [PATCH 3/4] x86/asm/entry/64: use smaller insns Denys Vlasenko
@ 2015-03-25 18:03   ` Andy Lutomirski
  2015-03-25 23:51   ` Linus Torvalds
  2015-03-27 11:47   ` [tip:x86/asm] x86/asm/entry/64: Use smaller instructions tip-bot for Denys Vlasenko
  2 siblings, 0 replies; 24+ messages in thread
From: Andy Lutomirski @ 2015-03-25 18:03 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Ingo Molnar, Linus Torvalds, Steven Rostedt, Borislav Petkov,
	H. Peter Anvin, Oleg Nesterov, Frederic Weisbecker,
	Alexei Starovoitov, Will Drewry, Kees Cook, X86 ML, linux-kernel

On Wed, Mar 25, 2015 at 10:18 AM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
> The $AUDIT_ARCH_X86_64 parameter to syscall_trace_enter_phase1/2
> is a 32-bit constant, loading it with 32-bit MOV produces 5-byte insn
> instead of 10-byte one.

Acked-by: Andy Lutomirski <luto@kernel.org>

>
> Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
> CC: Linus Torvalds <torvalds@linux-foundation.org>
> CC: Steven Rostedt <rostedt@goodmis.org>
> CC: Ingo Molnar <mingo@kernel.org>
> CC: Borislav Petkov <bp@alien8.de>
> CC: "H. Peter Anvin" <hpa@zytor.com>
> CC: Andy Lutomirski <luto@amacapital.net>
> CC: Oleg Nesterov <oleg@redhat.com>
> CC: Frederic Weisbecker <fweisbec@gmail.com>
> CC: Alexei Starovoitov <ast@plumgrid.com>
> CC: Will Drewry <wad@chromium.org>
> CC: Kees Cook <keescook@chromium.org>
> CC: x86@kernel.org
> CC: linux-kernel@vger.kernel.org
> ---
>  arch/x86/kernel/entry_64.S | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
> index 11b7339..23aa43e 100644
> --- a/arch/x86/kernel/entry_64.S
> +++ b/arch/x86/kernel/entry_64.S
> @@ -306,7 +306,7 @@ system_call_fastpath:
>         /* Do syscall entry tracing */
>  tracesys:
>         movq %rsp, %rdi
> -       movq $AUDIT_ARCH_X86_64, %rsi
> +       movl $AUDIT_ARCH_X86_64, %esi
>         call syscall_trace_enter_phase1
>         test %rax, %rax
>         jnz tracesys_phase2             /* if needed, run the slow path */
> @@ -317,7 +317,7 @@ tracesys:
>  tracesys_phase2:
>         SAVE_EXTRA_REGS
>         movq %rsp, %rdi
> -       movq $AUDIT_ARCH_X86_64, %rsi
> +       movl $AUDIT_ARCH_X86_64, %esi
>         movq %rax,%rdx
>         call syscall_trace_enter_phase2
>
> --
> 1.8.1.4
>



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path
  2015-03-25 17:41     ` Denys Vlasenko
@ 2015-03-25 18:04       ` Ingo Molnar
  2015-03-25 18:19         ` Denys Vlasenko
  0 siblings, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2015-03-25 18:04 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Linus Torvalds, Steven Rostedt, Borislav Petkov, H. Peter Anvin,
	Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
	Alexei Starovoitov, Will Drewry, Kees Cook, x86, linux-kernel


* Denys Vlasenko <dvlasenk@redhat.com> wrote:

> On 03/25/2015 06:29 PM, Ingo Molnar wrote:
> > 
> > * Denys Vlasenko <dvlasenk@redhat.com> wrote:
> > 
> >> SYSRET code path has a small irq-off block.
> >> On this code path, TRACE_IRQS_ON can't be called right before interrupts
> >> are enabled for real, we can't clobber registers there.
> >> So current code does it earlier, in a safe place.
> >>
> >> But with this, TRACE_IRQS_OFF/ON frames just two fast instructions,
> >> which is ridiculous: now most of irq-off block is _outside_ of the framing.
> >>
> >> Do the same thing that we do on SYSCALL entry: do not track this irq-off block,
> >> it is very small to ever cause noticeable irq latency.
> >>
> >> Be careful: make sure that "jnz int_ret_from_sys_call_irqs_off" now does
> >> invoke TRACE_IRQS_OFF - move int_ret_from_sys_call_irqs_off label before
> >> TRACE_IRQS_OFF.
> > 
> >> @@ -345,8 +346,8 @@ tracesys_phase2:
> >>   */
> >>  GLOBAL(int_ret_from_sys_call)
> >>  	DISABLE_INTERRUPTS(CLBR_NONE)
> >> -	TRACE_IRQS_OFF
> >>  int_ret_from_sys_call_irqs_off:
> >> +	TRACE_IRQS_OFF
> >>  	movl $_TIF_ALLWORK_MASK,%edi
> >>  	/* edi:	mask to check */
> > 
> > This latter trick absolutely needs a comment, to keep future lockdep 
> > developers from wondering about the mismatch and the weird label 
> > placement ...
> 
> Unsure how to format it.
> 
> How about:
> 
> 
> 	DISABLE_INTERRUPTS(CLBR_NONE)
> int_ret_from_sys_call_irqs_off: /* jumps come here with irqs off */
> 	TRACE_IRQS_OFF

Why not something like 'jumps come here from the irqs-off SYSRET 
path'?

> 
> 
> 
> (In truth, there is only one jump as of now, but using pliral
> "jumps" if that would change)

I'd also put a comment to the actual sysret IRQ-disablement that we 
are skipping with the annotation. Explain that it's an optimization 
for a visible irqs-off path that needs no annotation - and that the 
moment something complex is done in that path, this optimization loses 
its validity.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path
  2015-03-25 18:04       ` Ingo Molnar
@ 2015-03-25 18:19         ` Denys Vlasenko
  0 siblings, 0 replies; 24+ messages in thread
From: Denys Vlasenko @ 2015-03-25 18:19 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Steven Rostedt, Borislav Petkov, H. Peter Anvin,
	Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
	Alexei Starovoitov, Will Drewry, Kees Cook, x86, linux-kernel

On 03/25/2015 07:04 PM, Ingo Molnar wrote:
>>> * Denys Vlasenko <dvlasenk@redhat.com> wrote:
>>>
>>>> SYSRET code path has a small irq-off block.
>>>> On this code path, TRACE_IRQS_ON can't be called right before interrupts
>>>> are enabled for real, we can't clobber registers there.
>>>> So current code does it earlier, in a safe place.
>>>>
>>>> But with this, TRACE_IRQS_OFF/ON frames just two fast instructions,
>>>> which is ridiculous: now most of irq-off block is _outside_ of the framing.
>>>>
>>>> Do the same thing that we do on SYSCALL entry: do not track this irq-off block,
>>>> it is very small to ever cause noticeable irq latency.
>>>>
>>>> Be careful: make sure that "jnz int_ret_from_sys_call_irqs_off" now does
>>>> invoke TRACE_IRQS_OFF - move int_ret_from_sys_call_irqs_off label before
>>>> TRACE_IRQS_OFF.
>>>
>>>> @@ -345,8 +346,8 @@ tracesys_phase2:
>>>>   */
>>>>  GLOBAL(int_ret_from_sys_call)
>>>>  	DISABLE_INTERRUPTS(CLBR_NONE)
>>>> -	TRACE_IRQS_OFF
>>>>  int_ret_from_sys_call_irqs_off:
>>>> +	TRACE_IRQS_OFF
>>>>  	movl $_TIF_ALLWORK_MASK,%edi
>>>>  	/* edi:	mask to check */
>>>
>>> This latter trick absolutely needs a comment, to keep future lockdep 
>>> developers from wondering about the mismatch and the weird label 
>>> placement ...
>>
>> Unsure how to format it.
>>
>> How about:
>>
>>
>> 	DISABLE_INTERRUPTS(CLBR_NONE)
>> int_ret_from_sys_call_irqs_off: /* jumps come here with irqs off */
>> 	TRACE_IRQS_OFF
> 
> Why not something like 'jumps come here from the irqs-off SYSRET 
> path'?

Ok. I'll send v2 for patches 1 and 2.

(Patch 1 will be expanded, there is another instance of jump
needing the same treatment.)

>>
>>
>>
>> (In truth, there is only one jump as of now, but using pliral
>> "jumps" if that would change)
> 
> I'd also put a comment to the actual sysret IRQ-disablement that we 
> are skipping with the annotation.


Patch does add such a comment:

+       /*
+        * We do not frame this tiny irq-off block with TRACE_IRQS_OFF/ON,
+        * it is too small to ever cause noticeable irq latency.
+        */



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
  2015-03-25 17:18 ` [PATCH 3/4] x86/asm/entry/64: use smaller insns Denys Vlasenko
  2015-03-25 18:03   ` Andy Lutomirski
@ 2015-03-25 23:51   ` Linus Torvalds
  2015-03-25 23:56     ` H. Peter Anvin
  2015-03-26  9:30     ` Ingo Molnar
  2015-03-27 11:47   ` [tip:x86/asm] x86/asm/entry/64: Use smaller instructions tip-bot for Denys Vlasenko
  2 siblings, 2 replies; 24+ messages in thread
From: Linus Torvalds @ 2015-03-25 23:51 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Ingo Molnar, Steven Rostedt, Borislav Petkov, H. Peter Anvin,
	Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
	Alexei Starovoitov, Will Drewry, Kees Cook,
	the arch/x86 maintainers, Linux Kernel Mailing List

On Wed, Mar 25, 2015 at 10:18 AM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
> The $AUDIT_ARCH_X86_64 parameter to syscall_trace_enter_phase1/2
> is a 32-bit constant, loading it with 32-bit MOV produces 5-byte insn
> instead of 10-byte one.

Side note: has anybody talked to the assembler people? This would seem
to be very much something that the assembler could have noticed and
done on its own. It's a bit sad that we need to overspecify these
things..

If it had actually been a 64-bit constant, the assembler would have
ended up silently using a different instruction encoding *anyway*
("movabs"), so it's not like the "movq" in any way specifies one
particular instruction representation, and the assembler already picks
different instruction versions for different constant values. Why not
this one?

                       Linus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
  2015-03-25 23:51   ` Linus Torvalds
@ 2015-03-25 23:56     ` H. Peter Anvin
  2015-03-26  0:05       ` Linus Torvalds
  2015-03-26  9:30     ` Ingo Molnar
  1 sibling, 1 reply; 24+ messages in thread
From: H. Peter Anvin @ 2015-03-25 23:56 UTC (permalink / raw)
  To: Linus Torvalds, Denys Vlasenko
  Cc: Ingo Molnar, Steven Rostedt, Borislav Petkov, Andy Lutomirski,
	Oleg Nesterov, Frederic Weisbecker, Alexei Starovoitov,
	Will Drewry, Kees Cook, the arch/x86 maintainers,
	Linux Kernel Mailing List

No, movabs is yet another instruction (with a 64-bit absolute address.) But movq can mean 10 or 7 bytes...

On March 25, 2015 4:51:50 PM PDT, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>On Wed, Mar 25, 2015 at 10:18 AM, Denys Vlasenko <dvlasenk@redhat.com>
>wrote:
>> The $AUDIT_ARCH_X86_64 parameter to syscall_trace_enter_phase1/2
>> is a 32-bit constant, loading it with 32-bit MOV produces 5-byte insn
>> instead of 10-byte one.
>
>Side note: has anybody talked to the assembler people? This would seem
>to be very much something that the assembler could have noticed and
>done on its own. It's a bit sad that we need to overspecify these
>things..
>
>If it had actually been a 64-bit constant, the assembler would have
>ended up silently using a different instruction encoding *anyway*
>("movabs"), so it's not like the "movq" in any way specifies one
>particular instruction representation, and the assembler already picks
>different instruction versions for different constant values. Why not
>this one?
>
>                       Linus

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
  2015-03-25 23:56     ` H. Peter Anvin
@ 2015-03-26  0:05       ` Linus Torvalds
  2015-03-26  0:12         ` H. Peter Anvin
  2015-03-26  9:27         ` Borislav Petkov
  0 siblings, 2 replies; 24+ messages in thread
From: Linus Torvalds @ 2015-03-26  0:05 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Denys Vlasenko, Ingo Molnar, Steven Rostedt, Borislav Petkov,
	Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
	Alexei Starovoitov, Will Drewry, Kees Cook,
	the arch/x86 maintainers, Linux Kernel Mailing List

On Wed, Mar 25, 2015 at 4:56 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> No, movabs is yet another instruction (with a 64-bit absolute address.) But movq can mean 10 or 7 bytes...

I mentioned movabs because that is literally what as generates at
least for me (or then objdump is confused):

  [torvalds@i7 ~]$ as -v
  GNU assembler version 2.24 (x86_64-redhat-linux) using BFD version
version 2.24

  [torvalds@i7 ~]$ cat t.s
  main:
        movq $0x12, %rdi
        movq $0x1234, %rdi
        movq $0x123456, %rdi
        movq $0x12345678, %rdi
        movq $0x123456789ab, %rdi

  [torvalds@i7 ~]$ as t.s
  [torvalds@i7 ~]$ objdump -d a.out
  ...
     0: 48 c7 c7 12 00 00 00 mov    $0x12,%rdi
     7: 48 c7 c7 34 12 00 00 mov    $0x1234,%rdi
     e: 48 c7 c7 56 34 12 00 mov    $0x123456,%rdi
    15: 48 c7 c7 78 56 34 12 mov    $0x12345678,%rdi
    1c: 48 bf ab 89 67 45 23 movabs $0x123456789ab,%rdi
    23: 01 00 00

so 'as' is clearly just stupid. It already takes the size of the
constant into account and generates different instructions. Why not
for the common 32-bit case too?

Oh well.

                       Linus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
  2015-03-26  0:05       ` Linus Torvalds
@ 2015-03-26  0:12         ` H. Peter Anvin
  2015-03-26  9:27         ` Borislav Petkov
  1 sibling, 0 replies; 24+ messages in thread
From: H. Peter Anvin @ 2015-03-26  0:12 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Denys Vlasenko, Ingo Molnar, Steven Rostedt, Borislav Petkov,
	Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
	Alexei Starovoitov, Will Drewry, Kees Cook,
	the arch/x86 maintainers, Linux Kernel Mailing List

Ok, "movabs" is crazier than I thought.

On March 25, 2015 5:05:50 PM PDT, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>On Wed, Mar 25, 2015 at 4:56 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>> No, movabs is yet another instruction (with a 64-bit absolute
>address.) But movq can mean 10 or 7 bytes...
>
>I mentioned movabs because that is literally what as generates at
>least for me (or then objdump is confused):
>
>  [torvalds@i7 ~]$ as -v
>  GNU assembler version 2.24 (x86_64-redhat-linux) using BFD version
>version 2.24
>
>  [torvalds@i7 ~]$ cat t.s
>  main:
>        movq $0x12, %rdi
>        movq $0x1234, %rdi
>        movq $0x123456, %rdi
>        movq $0x12345678, %rdi
>        movq $0x123456789ab, %rdi
>
>  [torvalds@i7 ~]$ as t.s
>  [torvalds@i7 ~]$ objdump -d a.out
>  ...
>     0: 48 c7 c7 12 00 00 00 mov    $0x12,%rdi
>     7: 48 c7 c7 34 12 00 00 mov    $0x1234,%rdi
>     e: 48 c7 c7 56 34 12 00 mov    $0x123456,%rdi
>    15: 48 c7 c7 78 56 34 12 mov    $0x12345678,%rdi
>    1c: 48 bf ab 89 67 45 23 movabs $0x123456789ab,%rdi
>    23: 01 00 00
>
>so 'as' is clearly just stupid. It already takes the size of the
>constant into account and generates different instructions. Why not
>for the common 32-bit case too?
>
>Oh well.
>
>                       Linus

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
  2015-03-26  0:05       ` Linus Torvalds
  2015-03-26  0:12         ` H. Peter Anvin
@ 2015-03-26  9:27         ` Borislav Petkov
  2015-03-26  9:37           ` Ingo Molnar
  1 sibling, 1 reply; 24+ messages in thread
From: Borislav Petkov @ 2015-03-26  9:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: H. Peter Anvin, Denys Vlasenko, Ingo Molnar, Steven Rostedt,
	Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
	Alexei Starovoitov, Will Drewry, Kees Cook,
	the arch/x86 maintainers, Linux Kernel Mailing List

On Wed, Mar 25, 2015 at 05:05:50PM -0700, Linus Torvalds wrote:
> so 'as' is clearly just stupid. It already takes the size of the
> constant into account and generates different instructions. Why not
> for the common 32-bit case too?

I think the destination register mandates which insn to use:

        mov $0x12345678, %rdi
        mov $0x12345678, %edi

...

  15:   48 c7 c7 78 56 34 12    mov    $0x12345678,%rdi
  1c:   bf 78 56 34 12          mov    $0x12345678,%edi

and 'as' is perhaps not insolent enough to go and change it when seeing
the 32-bit constant.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
  2015-03-25 23:51   ` Linus Torvalds
  2015-03-25 23:56     ` H. Peter Anvin
@ 2015-03-26  9:30     ` Ingo Molnar
  1 sibling, 0 replies; 24+ messages in thread
From: Ingo Molnar @ 2015-03-26  9:30 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Denys Vlasenko, Steven Rostedt, Borislav Petkov, H. Peter Anvin,
	Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
	Alexei Starovoitov, Will Drewry, Kees Cook,
	the arch/x86 maintainers, Linux Kernel Mailing List


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Wed, Mar 25, 2015 at 10:18 AM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
> > The $AUDIT_ARCH_X86_64 parameter to syscall_trace_enter_phase1/2
> > is a 32-bit constant, loading it with 32-bit MOV produces 5-byte insn
> > instead of 10-byte one.
> 
> Side note: has anybody talked to the assembler people? This would 
> seem to be very much something that the assembler could have noticed 
> and done on its own. [...]

Maybe GCC already picks a 32-bit opcode in these small-constant cases, 
so there was little incentive to optimize on the GAS side, other than 
making it correct.

> [...] It's a bit sad that we need to overspecify these things..

Yeah, that's sad.

Yesterday when I have read Denys's patch I double checked that there's 
no other similar (easily identifiable ...) movq opcode left in the 
64-bit entry code, so we seem to have squashed most of them.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
  2015-03-26  9:27         ` Borislav Petkov
@ 2015-03-26  9:37           ` Ingo Molnar
  2015-03-26  9:51             ` Borislav Petkov
  2015-03-26 10:07             ` Denys Vlasenko
  0 siblings, 2 replies; 24+ messages in thread
From: Ingo Molnar @ 2015-03-26  9:37 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linus Torvalds, H. Peter Anvin, Denys Vlasenko, Steven Rostedt,
	Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
	Alexei Starovoitov, Will Drewry, Kees Cook,
	the arch/x86 maintainers, Linux Kernel Mailing List


* Borislav Petkov <bp@alien8.de> wrote:

> On Wed, Mar 25, 2015 at 05:05:50PM -0700, Linus Torvalds wrote:
> > so 'as' is clearly just stupid. It already takes the size of the
> > constant into account and generates different instructions. Why not
> > for the common 32-bit case too?
> 
> I think the destination register mandates which insn to use:
> 
>         mov $0x12345678, %rdi
>         mov $0x12345678, %edi
> 
> ...
> 
>   15:   48 c7 c7 78 56 34 12    mov    $0x12345678,%rdi
>   1c:   bf 78 56 34 12          mov    $0x12345678,%edi
> 
> and 'as' is perhaps not insolent enough to go and change it when 
> seeing the 32-bit constant.

Well, GAS generally has the freedom to use more optimal opcodes, as 
long as behavior on the CPU matches what the specified opcode does.

Saying that 'movq $0x1, %rdi' should map to the longer variant is 
defensible: there should probably always be a way to specify a very 
specific opcode even if it's not optimal - say I want to fill in an 
alignment space and don't want to use an extra NOP.

But here GAS generates the 10-byte opcode even if 'mov $0x12345678, 
%rdi' is used, which is an unforced error.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
  2015-03-26  9:37           ` Ingo Molnar
@ 2015-03-26  9:51             ` Borislav Petkov
  2015-03-26 10:07             ` Denys Vlasenko
  1 sibling, 0 replies; 24+ messages in thread
From: Borislav Petkov @ 2015-03-26  9:51 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, H. Peter Anvin, Denys Vlasenko, Steven Rostedt,
	Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
	Alexei Starovoitov, Will Drewry, Kees Cook,
	the arch/x86 maintainers, Linux Kernel Mailing List

On Thu, Mar 26, 2015 at 10:37:32AM +0100, Ingo Molnar wrote:
> But here GAS generates the 10-byte opcode even if 'mov $0x12345678, 
> %rdi' is used, which is an unforced error.

7 bytes:

48 c7 c7 78 56 34 12    mov    $0x12345678,%rdi

which is REX + opcode C7 + ModRM + 4-byte immediate.

If it used the bX opcodes, it would have to use REX prefix too for the
%rdi register which would generate a 10-byte insn then:

               0:       48 bf 78 56 43 12 00    mov $0x12435678,%rdi
               7:       00 00 00

REX + opcode BF + 8 bytes immediate.

So I think using the REX register "rdi" forces it to use the REX prefix
and thus the longer instructions.

All IMHO of course.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
  2015-03-26  9:37           ` Ingo Molnar
  2015-03-26  9:51             ` Borislav Petkov
@ 2015-03-26 10:07             ` Denys Vlasenko
  2015-03-26 10:25               ` Ingo Molnar
  2015-03-26 10:25               ` Borislav Petkov
  1 sibling, 2 replies; 24+ messages in thread
From: Denys Vlasenko @ 2015-03-26 10:07 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Borislav Petkov, Linus Torvalds, H. Peter Anvin, Denys Vlasenko,
	Steven Rostedt, Andy Lutomirski, Oleg Nesterov,
	Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
	the arch/x86 maintainers, Linux Kernel Mailing List

On Thu, Mar 26, 2015 at 10:37 AM, Ingo Molnar <mingo@kernel.org> wrote:
>> On Wed, Mar 25, 2015 at 05:05:50PM -0700, Linus Torvalds wrote:
>> > so 'as' is clearly just stupid. It already takes the size of the
>> > constant into account and generates different instructions. Why not
>> > for the common 32-bit case too?
>>
>> I think the destination register mandates which insn to use:
>>
>>         mov $0x12345678, %rdi
>>         mov $0x12345678, %edi
>>
>> ...
>>
>>   15:   48 c7 c7 78 56 34 12    mov    $0x12345678,%rdi
>>   1c:   bf 78 56 34 12          mov    $0x12345678,%edi
>>
>> and 'as' is perhaps not insolent enough to go and change it when
>> seeing the 32-bit constant.
>
> Well, GAS generally has the freedom to use more optimal opcodes, as
> long as behavior on the CPU matches what the specified opcode does.
>
> Saying that 'movq $0x1, %rdi' should map to the longer variant is
> defensible: there should probably always be a way to specify a very
> specific opcode even if it's not optimal - say I want to fill in an
> alignment space and don't want to use an extra NOP.
>
> But here GAS generates the 10-byte opcode even if 'mov $0x12345678,
> %rdi' is used, which is an unforced error.

In my experiment, GAS uses 10-byte insn only for constants which
won't work with 7-byte encoding; or if I explicitly ask for "movabs":

_start:         .globl  _start
                mov $0x12345678,%edi  # 5 bytes
                mov $0x12345678,%rdi  # 7 bytes
                movq $0x12345678,%rdi  # 7 bytes
                mov $0x80000000,%rdi  # 10 bytes
                mov $0x123456789,%rdi  # 10 bytes
                movabs $0x12345678,%rdi # 10 bytes


$ gcc -nostartfiles -nostdlib -c z.S && objdump -dr z.o
z.o:     file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_start>:
   0:    bf 78 56 34 12           mov    $0x12345678,%edi
   5:    48 c7 c7 78 56 34 12     mov    $0x12345678,%rdi
   c:    48 c7 c7 78 56 34 12     mov    $0x12345678,%rdi
  13:    48 bf 00 00 00 80 00     movabs $0x80000000,%rdi
  1a:    00 00 00
  1d:    48 bf 89 67 45 23 01     movabs $0x123456789,%rdi
  24:    00 00 00
  27:    48 bf 78 56 34 12 00     movabs $0x12345678,%rdi
  2e:    00 00 00

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
  2015-03-26 10:07             ` Denys Vlasenko
@ 2015-03-26 10:25               ` Ingo Molnar
  2015-03-26 10:53                 ` Denys Vlasenko
  2015-03-26 10:25               ` Borislav Petkov
  1 sibling, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2015-03-26 10:25 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Borislav Petkov, Linus Torvalds, H. Peter Anvin, Denys Vlasenko,
	Steven Rostedt, Andy Lutomirski, Oleg Nesterov,
	Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
	the arch/x86 maintainers, Linux Kernel Mailing List


* Denys Vlasenko <vda.linux@googlemail.com> wrote:

> In my experiment, GAS uses 10-byte insn only for constants which
> won't work with 7-byte encoding; or if I explicitly ask for "movabs":
> 
> _start:         .globl  _start
>                 mov $0x12345678,%edi  # 5 bytes
>                 mov $0x12345678,%rdi  # 7 bytes
>                 movq $0x12345678,%rdi  # 7 bytes
>                 mov $0x80000000,%rdi  # 10 bytes
>                 mov $0x123456789,%rdi  # 10 bytes
>                 movabs $0x12345678,%rdi # 10 bytes
> 
> 
> $ gcc -nostartfiles -nostdlib -c z.S && objdump -dr z.o
> z.o:     file format elf64-x86-64
> Disassembly of section .text:
> 0000000000000000 <_start>:
>    0:    bf 78 56 34 12           mov    $0x12345678,%edi
>    5:    48 c7 c7 78 56 34 12     mov    $0x12345678,%rdi
>    c:    48 c7 c7 78 56 34 12     mov    $0x12345678,%rdi
>   13:    48 bf 00 00 00 80 00     movabs $0x80000000,%rdi
>   1a:    00 00 00
>   1d:    48 bf 89 67 45 23 01     movabs $0x123456789,%rdi
>   24:    00 00 00
>   27:    48 bf 78 56 34 12 00     movabs $0x12345678,%rdi
>   2e:    00 00 00

I see, so:

       movq $AUDIT_ARCH_X86_64, %rsi

generated a 10-byte MOVABS opcode, while moving into %esi generates 
the 5-byte 32-bit MOV opcode?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
  2015-03-26 10:07             ` Denys Vlasenko
  2015-03-26 10:25               ` Ingo Molnar
@ 2015-03-26 10:25               ` Borislav Petkov
  1 sibling, 0 replies; 24+ messages in thread
From: Borislav Petkov @ 2015-03-26 10:25 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Ingo Molnar, Linus Torvalds, H. Peter Anvin, Denys Vlasenko,
	Steven Rostedt, Andy Lutomirski, Oleg Nesterov,
	Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
	the arch/x86 maintainers, Linux Kernel Mailing List

On Thu, Mar 26, 2015 at 11:07:42AM +0100, Denys Vlasenko wrote:
> In my experiment, GAS uses 10-byte insn only for constants which
> won't work with 7-byte encoding; or if I explicitly ask for "movabs":
> 
> _start:         .globl  _start
>                 mov $0x12345678,%edi  # 5 bytes
>                 mov $0x12345678,%rdi  # 7 bytes
>                 movq $0x12345678,%rdi  # 7 bytes
>                 mov $0x80000000,%rdi  # 10 bytes

Right, and since they're signed immediates, the AUDIT_ARCH_X86_64 thing
is 0xc000003e and does not fit in an s32, thus the 64-bit immediate with
bf opcode:

	mov $0x80000000-1,%rdi
	mov $0x80000000,%rdi
	mov $0xc000003e,%rdi
	mov $0xc000003e,%edi

...

  21:   48 c7 c7 ff ff ff 7f    mov    $0x7fffffff,%rdi
  28:   48 bf 00 00 00 80 00    movabs $0x80000000,%rdi
  2f:   00 00 00 
  32:   48 bf 3e 00 00 c0 00    movabs $0xc000003e,%rdi
  39:   00 00 00
  3c:   bf 3e 00 00 c0          mov    $0xc000003e,%edi

Makes sense to me.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
  2015-03-26 10:25               ` Ingo Molnar
@ 2015-03-26 10:53                 ` Denys Vlasenko
  0 siblings, 0 replies; 24+ messages in thread
From: Denys Vlasenko @ 2015-03-26 10:53 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Borislav Petkov, Linus Torvalds, H. Peter Anvin, Denys Vlasenko,
	Steven Rostedt, Andy Lutomirski, Oleg Nesterov,
	Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
	the arch/x86 maintainers, Linux Kernel Mailing List

On Thu, Mar 26, 2015 at 11:25 AM, Ingo Molnar <mingo@kernel.org> wrote:
> * Denys Vlasenko <vda.linux@googlemail.com> wrote:
>> In my experiment, GAS uses 10-byte insn only for constants which
>> won't work with 7-byte encoding; or if I explicitly ask for "movabs":
>>
>> _start:         .globl  _start
>>                 mov $0x12345678,%edi  # 5 bytes
>>                 mov $0x12345678,%rdi  # 7 bytes
>>                 movq $0x12345678,%rdi  # 7 bytes
>>                 mov $0x80000000,%rdi  # 10 bytes
>>                 mov $0x123456789,%rdi  # 10 bytes
>>                 movabs $0x12345678,%rdi # 10 bytes
>>
>>
>> $ gcc -nostartfiles -nostdlib -c z.S && objdump -dr z.o
>> z.o:     file format elf64-x86-64
>> Disassembly of section .text:
>> 0000000000000000 <_start>:
>>    0:    bf 78 56 34 12           mov    $0x12345678,%edi
>>    5:    48 c7 c7 78 56 34 12     mov    $0x12345678,%rdi
>>    c:    48 c7 c7 78 56 34 12     mov    $0x12345678,%rdi
>>   13:    48 bf 00 00 00 80 00     movabs $0x80000000,%rdi
>>   1a:    00 00 00
>>   1d:    48 bf 89 67 45 23 01     movabs $0x123456789,%rdi
>>   24:    00 00 00
>>   27:    48 bf 78 56 34 12 00     movabs $0x12345678,%rdi
>>   2e:    00 00 00
>
> I see, so:
>
>        movq $AUDIT_ARCH_X86_64, %rsi
>
> generated a 10-byte MOVABS opcode, while moving into %esi generates
> the 5-byte 32-bit MOV opcode?

Exactly.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [tip:x86/asm] x86/asm/entry/64: Use better label name, fix comments
  2015-03-25 17:18 [PATCH 1/4] x86/asm/entry/64: better label name, fix comments Denys Vlasenko
                   ` (2 preceding siblings ...)
  2015-03-25 17:18 ` [PATCH 4/4] x86/asm/entry/64: fix typo in comment Denys Vlasenko
@ 2015-03-27 11:46 ` tip-bot for Denys Vlasenko
  2015-03-27 12:15   ` Borislav Petkov
  3 siblings, 1 reply; 24+ messages in thread
From: tip-bot for Denys Vlasenko @ 2015-03-27 11:46 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: luto, fweisbec, linux-kernel, ast, dvlasenk, bp, mingo, wad,
	oleg, rostedt, torvalds, hpa, keescook, tglx

Commit-ID:  146b2b097d7a322b64b88a927fc5d870fc79a60b
Gitweb:     http://git.kernel.org/tip/146b2b097d7a322b64b88a927fc5d870fc79a60b
Author:     Denys Vlasenko <dvlasenk@redhat.com>
AuthorDate: Wed, 25 Mar 2015 18:18:13 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 27 Mar 2015 09:57:05 +0100

x86/asm/entry/64: Use better label name, fix comments

A named label "ret_from_sys_call" implies that there are jumps
to this location from elsewhere, as happens with many other
labels in this file.

But this label is used only by the JMP a few insns above.
To make that obvious, use local numeric label instead.

Improve comments:

"and return regs->ax" isn't too informative. We always return
regs->ax.

The comment suggesting that it'd be cool to use rip relative
addressing for CALL is deleted. It's unclear why that would be
an improvement - we aren't striving to use position-independent
code here. PIC code here would require something like LEA
sys_call_table(%rip),reg + CALL *(reg,%rax*8)...

"iret frame is also incomplete" is no longer true, fix that too.

Also fix typo in comment.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1427303896-24023-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/entry_64.S | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index bf9afad..9988c4b 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -258,16 +258,15 @@ system_call_fastpath:
 	andl $__SYSCALL_MASK,%eax
 	cmpl $__NR_syscall_max,%eax
 #endif
-	ja ret_from_sys_call  /* and return regs->ax */
+	ja	1f	/* return -ENOSYS (already in pt_regs->ax) */
 	movq %r10,%rcx
-	call *sys_call_table(,%rax,8)  # XXX:	 rip relative
+	call *sys_call_table(,%rax,8)
 	movq %rax,RAX(%rsp)
+1:
 /*
- * Syscall return path ending with SYSRET (fast path)
- * Has incompletely filled pt_regs, iret frame is also incomplete.
+ * Syscall return path ending with SYSRET (fast path).
+ * Has incompletely filled pt_regs.
  */
-ret_from_sys_call:
-
 	LOCKDEP_SYS_EXIT
 	DISABLE_INTERRUPTS(CLBR_NONE)
 	TRACE_IRQS_OFF
@@ -1407,7 +1406,7 @@ ENTRY(nmi)
 	 * NMI.
 	 */
 
-	/* Use %rdx as out temp variable throughout */
+	/* Use %rdx as our temp variable throughout */
 	pushq_cfi %rdx
 	CFI_REL_OFFSET rdx, 0
 

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [tip:x86/asm] x86/asm/entry/64: Use smaller instructions
  2015-03-25 17:18 ` [PATCH 3/4] x86/asm/entry/64: use smaller insns Denys Vlasenko
  2015-03-25 18:03   ` Andy Lutomirski
  2015-03-25 23:51   ` Linus Torvalds
@ 2015-03-27 11:47   ` tip-bot for Denys Vlasenko
  2 siblings, 0 replies; 24+ messages in thread
From: tip-bot for Denys Vlasenko @ 2015-03-27 11:47 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: fweisbec, oleg, wad, luto, tglx, ast, bp, dvlasenk, rostedt,
	torvalds, hpa, linux-kernel, keescook, mingo

Commit-ID:  47eb582e702880c302036d17341c7ea1a7dc2a53
Gitweb:     http://git.kernel.org/tip/47eb582e702880c302036d17341c7ea1a7dc2a53
Author:     Denys Vlasenko <dvlasenk@redhat.com>
AuthorDate: Wed, 25 Mar 2015 18:18:15 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 27 Mar 2015 09:57:06 +0100

x86/asm/entry/64: Use smaller instructions

The $AUDIT_ARCH_X86_64 parameter to syscall_trace_enter_phase1/2
is a 32-bit constant, loading it with 32-bit MOV produces 5-byte
insn instead of 10-byte MOVABS one.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1427303896-24023-3-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/entry_64.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 9988c4b..f85d2cc 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -305,7 +305,7 @@ system_call_fastpath:
 	/* Do syscall entry tracing */
 tracesys:
 	movq %rsp, %rdi
-	movq $AUDIT_ARCH_X86_64, %rsi
+	movl $AUDIT_ARCH_X86_64, %esi
 	call syscall_trace_enter_phase1
 	test %rax, %rax
 	jnz tracesys_phase2		/* if needed, run the slow path */
@@ -316,7 +316,7 @@ tracesys:
 tracesys_phase2:
 	SAVE_EXTRA_REGS
 	movq %rsp, %rdi
-	movq $AUDIT_ARCH_X86_64, %rsi
+	movl $AUDIT_ARCH_X86_64, %esi
 	movq %rax,%rdx
 	call syscall_trace_enter_phase2
 

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [tip:x86/asm] x86/asm/entry/64: Use better label name, fix comments
  2015-03-27 11:46 ` [tip:x86/asm] x86/asm/entry/64: Use better label name, fix comments tip-bot for Denys Vlasenko
@ 2015-03-27 12:15   ` Borislav Petkov
  0 siblings, 0 replies; 24+ messages in thread
From: Borislav Petkov @ 2015-03-27 12:15 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: ast, dvlasenk, luto, fweisbec, linux-kernel, wad, rostedt, oleg,
	mingo, torvalds, hpa, keescook, tglx, linux-tip-commits

On Fri, Mar 27, 2015 at 04:46:44AM -0700, tip-bot for Denys Vlasenko wrote:
> Commit-ID:  146b2b097d7a322b64b88a927fc5d870fc79a60b
> Gitweb:     http://git.kernel.org/tip/146b2b097d7a322b64b88a927fc5d870fc79a60b
> Author:     Denys Vlasenko <dvlasenk@redhat.com>
> AuthorDate: Wed, 25 Mar 2015 18:18:13 +0100
> Committer:  Ingo Molnar <mingo@kernel.org>
> CommitDate: Fri, 27 Mar 2015 09:57:05 +0100
> 
> x86/asm/entry/64: Use better label name, fix comments
> 
> A named label "ret_from_sys_call" implies that there are jumps
> to this location from elsewhere, as happens with many other
> labels in this file.
> 
> But this label is used only by the JMP a few insns above.
> To make that obvious, use local numeric label instead.
> 
> Improve comments:
> 
> "and return regs->ax" isn't too informative. We always return
> regs->ax.
> 
> The comment suggesting that it'd be cool to use rip relative
> addressing for CALL is deleted. It's unclear why that would be
> an improvement - we aren't striving to use position-independent
> code here. PIC code here would require something like LEA
> sys_call_table(%rip),reg + CALL *(reg,%rax*8)...
> 
> "iret frame is also incomplete" is no longer true, fix that too.
> 
> Also fix typo in comment.
> 
> Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Will Drewry <wad@chromium.org>
> Link: http://lkml.kernel.org/r/1427303896-24023-1-git-send-email-dvlasenk@redhat.com
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> ---
>  arch/x86/kernel/entry_64.S | 13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)

Shouldn't you take v2 of that one?

https://lkml.kernel.org/r/1427307629-10024-1-git-send-email-dvlasenk@redhat.com

v2 is missing this hunk though:

---
@@ -1407,7 +1406,7 @@ ENTRY(nmi)
         * NMI.
         */

-       /* Use %rdx as out temp variable throughout */
+       /* Use %rdx as our temp variable throughout */
        pushq_cfi %rdx
        CFI_REL_OFFSET rdx, 0
--

I blame Denys for sending too many patches! :)

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2015-03-27 12:17 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-25 17:18 [PATCH 1/4] x86/asm/entry/64: better label name, fix comments Denys Vlasenko
2015-03-25 17:18 ` [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path Denys Vlasenko
2015-03-25 17:29   ` Ingo Molnar
2015-03-25 17:41     ` Denys Vlasenko
2015-03-25 18:04       ` Ingo Molnar
2015-03-25 18:19         ` Denys Vlasenko
2015-03-25 17:18 ` [PATCH 3/4] x86/asm/entry/64: use smaller insns Denys Vlasenko
2015-03-25 18:03   ` Andy Lutomirski
2015-03-25 23:51   ` Linus Torvalds
2015-03-25 23:56     ` H. Peter Anvin
2015-03-26  0:05       ` Linus Torvalds
2015-03-26  0:12         ` H. Peter Anvin
2015-03-26  9:27         ` Borislav Petkov
2015-03-26  9:37           ` Ingo Molnar
2015-03-26  9:51             ` Borislav Petkov
2015-03-26 10:07             ` Denys Vlasenko
2015-03-26 10:25               ` Ingo Molnar
2015-03-26 10:53                 ` Denys Vlasenko
2015-03-26 10:25               ` Borislav Petkov
2015-03-26  9:30     ` Ingo Molnar
2015-03-27 11:47   ` [tip:x86/asm] x86/asm/entry/64: Use smaller instructions tip-bot for Denys Vlasenko
2015-03-25 17:18 ` [PATCH 4/4] x86/asm/entry/64: fix typo in comment Denys Vlasenko
2015-03-27 11:46 ` [tip:x86/asm] x86/asm/entry/64: Use better label name, fix comments tip-bot for Denys Vlasenko
2015-03-27 12:15   ` Borislav Petkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).