* [PATCH 1/4] x86/asm/entry/64: better label name, fix comments
@ 2015-03-25 17:18 Denys Vlasenko
2015-03-25 17:18 ` [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path Denys Vlasenko
` (3 more replies)
0 siblings, 4 replies; 24+ messages in thread
From: Denys Vlasenko @ 2015-03-25 17:18 UTC (permalink / raw)
To: Ingo Molnar
Cc: Denys Vlasenko, Linus Torvalds, Steven Rostedt, Borislav Petkov,
H. Peter Anvin, Andy Lutomirski, Oleg Nesterov,
Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
x86, linux-kernel
A named label "ret_from_sys_call" implies that there are jumps
to this location from elsewhere, as happens with many other labels
in this file.
But this label is used only by the JMP a few insns above.
To make that obvious, use local numeric label instead.
Improve comments:
"and return regs->ax" isn't too informative. We always return regs->ax.
The comment suggesting that it'd be cool to use rip relative addressing for CALL
is deleted. It's unclear why that would be an improvement - we aren't striving
to use position-independent code here. PIC code here would require something like
LEA sys_call_table(%rip),reg + CALL *(reg,%rax*8)...
"iret frame is also incomplete" is no longer true, fix that too.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Borislav Petkov <bp@alien8.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Alexei Starovoitov <ast@plumgrid.com>
CC: Will Drewry <wad@chromium.org>
CC: Kees Cook <keescook@chromium.org>
CC: x86@kernel.org
CC: linux-kernel@vger.kernel.org
---
arch/x86/kernel/entry_64.S | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index bf9afad..40f7760 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -258,16 +258,15 @@ system_call_fastpath:
andl $__SYSCALL_MASK,%eax
cmpl $__NR_syscall_max,%eax
#endif
- ja ret_from_sys_call /* and return regs->ax */
+ ja 1f /* return -ENOSYS (already in pt_regs->ax) */
movq %r10,%rcx
- call *sys_call_table(,%rax,8) # XXX: rip relative
+ call *sys_call_table(,%rax,8)
movq %rax,RAX(%rsp)
+1:
/*
- * Syscall return path ending with SYSRET (fast path)
- * Has incompletely filled pt_regs, iret frame is also incomplete.
+ * Syscall return path ending with SYSRET (fast path).
+ * Has incompletely filled pt_regs.
*/
-ret_from_sys_call:
-
LOCKDEP_SYS_EXIT
DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
--
1.8.1.4
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path
2015-03-25 17:18 [PATCH 1/4] x86/asm/entry/64: better label name, fix comments Denys Vlasenko
@ 2015-03-25 17:18 ` Denys Vlasenko
2015-03-25 17:29 ` Ingo Molnar
2015-03-25 17:18 ` [PATCH 3/4] x86/asm/entry/64: use smaller insns Denys Vlasenko
` (2 subsequent siblings)
3 siblings, 1 reply; 24+ messages in thread
From: Denys Vlasenko @ 2015-03-25 17:18 UTC (permalink / raw)
To: Ingo Molnar
Cc: Denys Vlasenko, Linus Torvalds, Steven Rostedt, Borislav Petkov,
H. Peter Anvin, Andy Lutomirski, Oleg Nesterov,
Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
x86, linux-kernel
SYSRET code path has a small irq-off block.
On this code path, TRACE_IRQS_ON can't be called right before interrupts
are enabled for real, we can't clobber registers there.
So current code does it earlier, in a safe place.
But with this, TRACE_IRQS_OFF/ON frames just two fast instructions,
which is ridiculous: now most of irq-off block is _outside_ of the framing.
Do the same thing that we do on SYSCALL entry: do not track this irq-off block,
it is very small to ever cause noticeable irq latency.
Be careful: make sure that "jnz int_ret_from_sys_call_irqs_off" now does
invoke TRACE_IRQS_OFF - move int_ret_from_sys_call_irqs_off label before
TRACE_IRQS_OFF.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Borislav Petkov <bp@alien8.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Alexei Starovoitov <ast@plumgrid.com>
CC: Will Drewry <wad@chromium.org>
CC: Kees Cook <keescook@chromium.org>
CC: x86@kernel.org
CC: linux-kernel@vger.kernel.org
---
arch/x86/kernel/entry_64.S | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 40f7760..11b7339 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -268,8 +268,11 @@ system_call_fastpath:
* Has incompletely filled pt_regs.
*/
LOCKDEP_SYS_EXIT
+ /*
+ * We do not frame this tiny irq-off block with TRACE_IRQS_OFF/ON,
+ * it is too small to ever cause noticeable irq latency.
+ */
DISABLE_INTERRUPTS(CLBR_NONE)
- TRACE_IRQS_OFF
/*
* We must check ti flags with interrupts (or at least preemption)
@@ -283,10 +286,7 @@ system_call_fastpath:
jnz int_ret_from_sys_call_irqs_off /* Go to the slow path */
CFI_REMEMBER_STATE
- /*
- * sysretq will re-enable interrupts:
- */
- TRACE_IRQS_ON
+
RESTORE_C_REGS_EXCEPT_RCX_R11
movq RIP(%rsp),%rcx
CFI_REGISTER rip,rcx
@@ -297,6 +297,7 @@ system_call_fastpath:
* 64bit SYSRET restores rip from rcx,
* rflags from r11 (but RF and VM bits are forced to 0),
* cs and ss are loaded from MSRs.
+ * Restoration of rflags re-enables interrupts.
*/
USERGS_SYSRET64
@@ -345,8 +346,8 @@ tracesys_phase2:
*/
GLOBAL(int_ret_from_sys_call)
DISABLE_INTERRUPTS(CLBR_NONE)
- TRACE_IRQS_OFF
int_ret_from_sys_call_irqs_off:
+ TRACE_IRQS_OFF
movl $_TIF_ALLWORK_MASK,%edi
/* edi: mask to check */
GLOBAL(int_with_check)
--
1.8.1.4
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 3/4] x86/asm/entry/64: use smaller insns
2015-03-25 17:18 [PATCH 1/4] x86/asm/entry/64: better label name, fix comments Denys Vlasenko
2015-03-25 17:18 ` [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path Denys Vlasenko
@ 2015-03-25 17:18 ` Denys Vlasenko
2015-03-25 18:03 ` Andy Lutomirski
` (2 more replies)
2015-03-25 17:18 ` [PATCH 4/4] x86/asm/entry/64: fix typo in comment Denys Vlasenko
2015-03-27 11:46 ` [tip:x86/asm] x86/asm/entry/64: Use better label name, fix comments tip-bot for Denys Vlasenko
3 siblings, 3 replies; 24+ messages in thread
From: Denys Vlasenko @ 2015-03-25 17:18 UTC (permalink / raw)
To: Ingo Molnar
Cc: Denys Vlasenko, Linus Torvalds, Steven Rostedt, Borislav Petkov,
H. Peter Anvin, Andy Lutomirski, Oleg Nesterov,
Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
x86, linux-kernel
The $AUDIT_ARCH_X86_64 parameter to syscall_trace_enter_phase1/2
is a 32-bit constant, loading it with 32-bit MOV produces 5-byte insn
instead of 10-byte one.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Borislav Petkov <bp@alien8.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Alexei Starovoitov <ast@plumgrid.com>
CC: Will Drewry <wad@chromium.org>
CC: Kees Cook <keescook@chromium.org>
CC: x86@kernel.org
CC: linux-kernel@vger.kernel.org
---
arch/x86/kernel/entry_64.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 11b7339..23aa43e 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -306,7 +306,7 @@ system_call_fastpath:
/* Do syscall entry tracing */
tracesys:
movq %rsp, %rdi
- movq $AUDIT_ARCH_X86_64, %rsi
+ movl $AUDIT_ARCH_X86_64, %esi
call syscall_trace_enter_phase1
test %rax, %rax
jnz tracesys_phase2 /* if needed, run the slow path */
@@ -317,7 +317,7 @@ tracesys:
tracesys_phase2:
SAVE_EXTRA_REGS
movq %rsp, %rdi
- movq $AUDIT_ARCH_X86_64, %rsi
+ movl $AUDIT_ARCH_X86_64, %esi
movq %rax,%rdx
call syscall_trace_enter_phase2
--
1.8.1.4
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 4/4] x86/asm/entry/64: fix typo in comment
2015-03-25 17:18 [PATCH 1/4] x86/asm/entry/64: better label name, fix comments Denys Vlasenko
2015-03-25 17:18 ` [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path Denys Vlasenko
2015-03-25 17:18 ` [PATCH 3/4] x86/asm/entry/64: use smaller insns Denys Vlasenko
@ 2015-03-25 17:18 ` Denys Vlasenko
2015-03-27 11:46 ` [tip:x86/asm] x86/asm/entry/64: Use better label name, fix comments tip-bot for Denys Vlasenko
3 siblings, 0 replies; 24+ messages in thread
From: Denys Vlasenko @ 2015-03-25 17:18 UTC (permalink / raw)
To: Ingo Molnar
Cc: Denys Vlasenko, Linus Torvalds, Steven Rostedt, Borislav Petkov,
H. Peter Anvin, Andy Lutomirski, Oleg Nesterov,
Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
x86, linux-kernel
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Borislav Petkov <bp@alien8.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Alexei Starovoitov <ast@plumgrid.com>
CC: Will Drewry <wad@chromium.org>
CC: Kees Cook <keescook@chromium.org>
CC: x86@kernel.org
CC: linux-kernel@vger.kernel.org
---
arch/x86/kernel/entry_64.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 23aa43e..47e126b 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -1407,7 +1407,7 @@ ENTRY(nmi)
* NMI.
*/
- /* Use %rdx as out temp variable throughout */
+ /* Use %rdx as our temp variable throughout */
pushq_cfi %rdx
CFI_REL_OFFSET rdx, 0
--
1.8.1.4
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path
2015-03-25 17:18 ` [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path Denys Vlasenko
@ 2015-03-25 17:29 ` Ingo Molnar
2015-03-25 17:41 ` Denys Vlasenko
0 siblings, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2015-03-25 17:29 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Linus Torvalds, Steven Rostedt, Borislav Petkov, H. Peter Anvin,
Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
Alexei Starovoitov, Will Drewry, Kees Cook, x86, linux-kernel
* Denys Vlasenko <dvlasenk@redhat.com> wrote:
> SYSRET code path has a small irq-off block.
> On this code path, TRACE_IRQS_ON can't be called right before interrupts
> are enabled for real, we can't clobber registers there.
> So current code does it earlier, in a safe place.
>
> But with this, TRACE_IRQS_OFF/ON frames just two fast instructions,
> which is ridiculous: now most of irq-off block is _outside_ of the framing.
>
> Do the same thing that we do on SYSCALL entry: do not track this irq-off block,
> it is very small to ever cause noticeable irq latency.
>
> Be careful: make sure that "jnz int_ret_from_sys_call_irqs_off" now does
> invoke TRACE_IRQS_OFF - move int_ret_from_sys_call_irqs_off label before
> TRACE_IRQS_OFF.
> @@ -345,8 +346,8 @@ tracesys_phase2:
> */
> GLOBAL(int_ret_from_sys_call)
> DISABLE_INTERRUPTS(CLBR_NONE)
> - TRACE_IRQS_OFF
> int_ret_from_sys_call_irqs_off:
> + TRACE_IRQS_OFF
> movl $_TIF_ALLWORK_MASK,%edi
> /* edi: mask to check */
This latter trick absolutely needs a comment, to keep future lockdep
developers from wondering about the mismatch and the weird label
placement ...
Thanks,
Ingo
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path
2015-03-25 17:29 ` Ingo Molnar
@ 2015-03-25 17:41 ` Denys Vlasenko
2015-03-25 18:04 ` Ingo Molnar
0 siblings, 1 reply; 24+ messages in thread
From: Denys Vlasenko @ 2015-03-25 17:41 UTC (permalink / raw)
To: Ingo Molnar
Cc: Linus Torvalds, Steven Rostedt, Borislav Petkov, H. Peter Anvin,
Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
Alexei Starovoitov, Will Drewry, Kees Cook, x86, linux-kernel
On 03/25/2015 06:29 PM, Ingo Molnar wrote:
>
> * Denys Vlasenko <dvlasenk@redhat.com> wrote:
>
>> SYSRET code path has a small irq-off block.
>> On this code path, TRACE_IRQS_ON can't be called right before interrupts
>> are enabled for real, we can't clobber registers there.
>> So current code does it earlier, in a safe place.
>>
>> But with this, TRACE_IRQS_OFF/ON frames just two fast instructions,
>> which is ridiculous: now most of irq-off block is _outside_ of the framing.
>>
>> Do the same thing that we do on SYSCALL entry: do not track this irq-off block,
>> it is very small to ever cause noticeable irq latency.
>>
>> Be careful: make sure that "jnz int_ret_from_sys_call_irqs_off" now does
>> invoke TRACE_IRQS_OFF - move int_ret_from_sys_call_irqs_off label before
>> TRACE_IRQS_OFF.
>
>> @@ -345,8 +346,8 @@ tracesys_phase2:
>> */
>> GLOBAL(int_ret_from_sys_call)
>> DISABLE_INTERRUPTS(CLBR_NONE)
>> - TRACE_IRQS_OFF
>> int_ret_from_sys_call_irqs_off:
>> + TRACE_IRQS_OFF
>> movl $_TIF_ALLWORK_MASK,%edi
>> /* edi: mask to check */
>
> This latter trick absolutely needs a comment, to keep future lockdep
> developers from wondering about the mismatch and the weird label
> placement ...
Unsure how to format it.
How about:
DISABLE_INTERRUPTS(CLBR_NONE)
int_ret_from_sys_call_irqs_off: /* jumps come here with irqs off */
TRACE_IRQS_OFF
(In truth, there is only one jump as of now, but using pliral
"jumps" if that would change)
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
2015-03-25 17:18 ` [PATCH 3/4] x86/asm/entry/64: use smaller insns Denys Vlasenko
@ 2015-03-25 18:03 ` Andy Lutomirski
2015-03-25 23:51 ` Linus Torvalds
2015-03-27 11:47 ` [tip:x86/asm] x86/asm/entry/64: Use smaller instructions tip-bot for Denys Vlasenko
2 siblings, 0 replies; 24+ messages in thread
From: Andy Lutomirski @ 2015-03-25 18:03 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Ingo Molnar, Linus Torvalds, Steven Rostedt, Borislav Petkov,
H. Peter Anvin, Oleg Nesterov, Frederic Weisbecker,
Alexei Starovoitov, Will Drewry, Kees Cook, X86 ML, linux-kernel
On Wed, Mar 25, 2015 at 10:18 AM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
> The $AUDIT_ARCH_X86_64 parameter to syscall_trace_enter_phase1/2
> is a 32-bit constant, loading it with 32-bit MOV produces 5-byte insn
> instead of 10-byte one.
Acked-by: Andy Lutomirski <luto@kernel.org>
>
> Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
> CC: Linus Torvalds <torvalds@linux-foundation.org>
> CC: Steven Rostedt <rostedt@goodmis.org>
> CC: Ingo Molnar <mingo@kernel.org>
> CC: Borislav Petkov <bp@alien8.de>
> CC: "H. Peter Anvin" <hpa@zytor.com>
> CC: Andy Lutomirski <luto@amacapital.net>
> CC: Oleg Nesterov <oleg@redhat.com>
> CC: Frederic Weisbecker <fweisbec@gmail.com>
> CC: Alexei Starovoitov <ast@plumgrid.com>
> CC: Will Drewry <wad@chromium.org>
> CC: Kees Cook <keescook@chromium.org>
> CC: x86@kernel.org
> CC: linux-kernel@vger.kernel.org
> ---
> arch/x86/kernel/entry_64.S | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
> index 11b7339..23aa43e 100644
> --- a/arch/x86/kernel/entry_64.S
> +++ b/arch/x86/kernel/entry_64.S
> @@ -306,7 +306,7 @@ system_call_fastpath:
> /* Do syscall entry tracing */
> tracesys:
> movq %rsp, %rdi
> - movq $AUDIT_ARCH_X86_64, %rsi
> + movl $AUDIT_ARCH_X86_64, %esi
> call syscall_trace_enter_phase1
> test %rax, %rax
> jnz tracesys_phase2 /* if needed, run the slow path */
> @@ -317,7 +317,7 @@ tracesys:
> tracesys_phase2:
> SAVE_EXTRA_REGS
> movq %rsp, %rdi
> - movq $AUDIT_ARCH_X86_64, %rsi
> + movl $AUDIT_ARCH_X86_64, %esi
> movq %rax,%rdx
> call syscall_trace_enter_phase2
>
> --
> 1.8.1.4
>
--
Andy Lutomirski
AMA Capital Management, LLC
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path
2015-03-25 17:41 ` Denys Vlasenko
@ 2015-03-25 18:04 ` Ingo Molnar
2015-03-25 18:19 ` Denys Vlasenko
0 siblings, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2015-03-25 18:04 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Linus Torvalds, Steven Rostedt, Borislav Petkov, H. Peter Anvin,
Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
Alexei Starovoitov, Will Drewry, Kees Cook, x86, linux-kernel
* Denys Vlasenko <dvlasenk@redhat.com> wrote:
> On 03/25/2015 06:29 PM, Ingo Molnar wrote:
> >
> > * Denys Vlasenko <dvlasenk@redhat.com> wrote:
> >
> >> SYSRET code path has a small irq-off block.
> >> On this code path, TRACE_IRQS_ON can't be called right before interrupts
> >> are enabled for real, we can't clobber registers there.
> >> So current code does it earlier, in a safe place.
> >>
> >> But with this, TRACE_IRQS_OFF/ON frames just two fast instructions,
> >> which is ridiculous: now most of irq-off block is _outside_ of the framing.
> >>
> >> Do the same thing that we do on SYSCALL entry: do not track this irq-off block,
> >> it is very small to ever cause noticeable irq latency.
> >>
> >> Be careful: make sure that "jnz int_ret_from_sys_call_irqs_off" now does
> >> invoke TRACE_IRQS_OFF - move int_ret_from_sys_call_irqs_off label before
> >> TRACE_IRQS_OFF.
> >
> >> @@ -345,8 +346,8 @@ tracesys_phase2:
> >> */
> >> GLOBAL(int_ret_from_sys_call)
> >> DISABLE_INTERRUPTS(CLBR_NONE)
> >> - TRACE_IRQS_OFF
> >> int_ret_from_sys_call_irqs_off:
> >> + TRACE_IRQS_OFF
> >> movl $_TIF_ALLWORK_MASK,%edi
> >> /* edi: mask to check */
> >
> > This latter trick absolutely needs a comment, to keep future lockdep
> > developers from wondering about the mismatch and the weird label
> > placement ...
>
> Unsure how to format it.
>
> How about:
>
>
> DISABLE_INTERRUPTS(CLBR_NONE)
> int_ret_from_sys_call_irqs_off: /* jumps come here with irqs off */
> TRACE_IRQS_OFF
Why not something like 'jumps come here from the irqs-off SYSRET
path'?
>
>
>
> (In truth, there is only one jump as of now, but using pliral
> "jumps" if that would change)
I'd also put a comment to the actual sysret IRQ-disablement that we
are skipping with the annotation. Explain that it's an optimization
for a visible irqs-off path that needs no annotation - and that the
moment something complex is done in that path, this optimization loses
its validity.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path
2015-03-25 18:04 ` Ingo Molnar
@ 2015-03-25 18:19 ` Denys Vlasenko
0 siblings, 0 replies; 24+ messages in thread
From: Denys Vlasenko @ 2015-03-25 18:19 UTC (permalink / raw)
To: Ingo Molnar
Cc: Linus Torvalds, Steven Rostedt, Borislav Petkov, H. Peter Anvin,
Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
Alexei Starovoitov, Will Drewry, Kees Cook, x86, linux-kernel
On 03/25/2015 07:04 PM, Ingo Molnar wrote:
>>> * Denys Vlasenko <dvlasenk@redhat.com> wrote:
>>>
>>>> SYSRET code path has a small irq-off block.
>>>> On this code path, TRACE_IRQS_ON can't be called right before interrupts
>>>> are enabled for real, we can't clobber registers there.
>>>> So current code does it earlier, in a safe place.
>>>>
>>>> But with this, TRACE_IRQS_OFF/ON frames just two fast instructions,
>>>> which is ridiculous: now most of irq-off block is _outside_ of the framing.
>>>>
>>>> Do the same thing that we do on SYSCALL entry: do not track this irq-off block,
>>>> it is very small to ever cause noticeable irq latency.
>>>>
>>>> Be careful: make sure that "jnz int_ret_from_sys_call_irqs_off" now does
>>>> invoke TRACE_IRQS_OFF - move int_ret_from_sys_call_irqs_off label before
>>>> TRACE_IRQS_OFF.
>>>
>>>> @@ -345,8 +346,8 @@ tracesys_phase2:
>>>> */
>>>> GLOBAL(int_ret_from_sys_call)
>>>> DISABLE_INTERRUPTS(CLBR_NONE)
>>>> - TRACE_IRQS_OFF
>>>> int_ret_from_sys_call_irqs_off:
>>>> + TRACE_IRQS_OFF
>>>> movl $_TIF_ALLWORK_MASK,%edi
>>>> /* edi: mask to check */
>>>
>>> This latter trick absolutely needs a comment, to keep future lockdep
>>> developers from wondering about the mismatch and the weird label
>>> placement ...
>>
>> Unsure how to format it.
>>
>> How about:
>>
>>
>> DISABLE_INTERRUPTS(CLBR_NONE)
>> int_ret_from_sys_call_irqs_off: /* jumps come here with irqs off */
>> TRACE_IRQS_OFF
>
> Why not something like 'jumps come here from the irqs-off SYSRET
> path'?
Ok. I'll send v2 for patches 1 and 2.
(Patch 1 will be expanded, there is another instance of jump
needing the same treatment.)
>>
>>
>>
>> (In truth, there is only one jump as of now, but using pliral
>> "jumps" if that would change)
>
> I'd also put a comment to the actual sysret IRQ-disablement that we
> are skipping with the annotation.
Patch does add such a comment:
+ /*
+ * We do not frame this tiny irq-off block with TRACE_IRQS_OFF/ON,
+ * it is too small to ever cause noticeable irq latency.
+ */
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
2015-03-25 17:18 ` [PATCH 3/4] x86/asm/entry/64: use smaller insns Denys Vlasenko
2015-03-25 18:03 ` Andy Lutomirski
@ 2015-03-25 23:51 ` Linus Torvalds
2015-03-25 23:56 ` H. Peter Anvin
2015-03-26 9:30 ` Ingo Molnar
2015-03-27 11:47 ` [tip:x86/asm] x86/asm/entry/64: Use smaller instructions tip-bot for Denys Vlasenko
2 siblings, 2 replies; 24+ messages in thread
From: Linus Torvalds @ 2015-03-25 23:51 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Ingo Molnar, Steven Rostedt, Borislav Petkov, H. Peter Anvin,
Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
Alexei Starovoitov, Will Drewry, Kees Cook,
the arch/x86 maintainers, Linux Kernel Mailing List
On Wed, Mar 25, 2015 at 10:18 AM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
> The $AUDIT_ARCH_X86_64 parameter to syscall_trace_enter_phase1/2
> is a 32-bit constant, loading it with 32-bit MOV produces 5-byte insn
> instead of 10-byte one.
Side note: has anybody talked to the assembler people? This would seem
to be very much something that the assembler could have noticed and
done on its own. It's a bit sad that we need to overspecify these
things..
If it had actually been a 64-bit constant, the assembler would have
ended up silently using a different instruction encoding *anyway*
("movabs"), so it's not like the "movq" in any way specifies one
particular instruction representation, and the assembler already picks
different instruction versions for different constant values. Why not
this one?
Linus
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
2015-03-25 23:51 ` Linus Torvalds
@ 2015-03-25 23:56 ` H. Peter Anvin
2015-03-26 0:05 ` Linus Torvalds
2015-03-26 9:30 ` Ingo Molnar
1 sibling, 1 reply; 24+ messages in thread
From: H. Peter Anvin @ 2015-03-25 23:56 UTC (permalink / raw)
To: Linus Torvalds, Denys Vlasenko
Cc: Ingo Molnar, Steven Rostedt, Borislav Petkov, Andy Lutomirski,
Oleg Nesterov, Frederic Weisbecker, Alexei Starovoitov,
Will Drewry, Kees Cook, the arch/x86 maintainers,
Linux Kernel Mailing List
No, movabs is yet another instruction (with a 64-bit absolute address.) But movq can mean 10 or 7 bytes...
On March 25, 2015 4:51:50 PM PDT, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>On Wed, Mar 25, 2015 at 10:18 AM, Denys Vlasenko <dvlasenk@redhat.com>
>wrote:
>> The $AUDIT_ARCH_X86_64 parameter to syscall_trace_enter_phase1/2
>> is a 32-bit constant, loading it with 32-bit MOV produces 5-byte insn
>> instead of 10-byte one.
>
>Side note: has anybody talked to the assembler people? This would seem
>to be very much something that the assembler could have noticed and
>done on its own. It's a bit sad that we need to overspecify these
>things..
>
>If it had actually been a 64-bit constant, the assembler would have
>ended up silently using a different instruction encoding *anyway*
>("movabs"), so it's not like the "movq" in any way specifies one
>particular instruction representation, and the assembler already picks
>different instruction versions for different constant values. Why not
>this one?
>
> Linus
--
Sent from my mobile phone. Please pardon brevity and lack of formatting.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
2015-03-25 23:56 ` H. Peter Anvin
@ 2015-03-26 0:05 ` Linus Torvalds
2015-03-26 0:12 ` H. Peter Anvin
2015-03-26 9:27 ` Borislav Petkov
0 siblings, 2 replies; 24+ messages in thread
From: Linus Torvalds @ 2015-03-26 0:05 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Denys Vlasenko, Ingo Molnar, Steven Rostedt, Borislav Petkov,
Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
Alexei Starovoitov, Will Drewry, Kees Cook,
the arch/x86 maintainers, Linux Kernel Mailing List
On Wed, Mar 25, 2015 at 4:56 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> No, movabs is yet another instruction (with a 64-bit absolute address.) But movq can mean 10 or 7 bytes...
I mentioned movabs because that is literally what as generates at
least for me (or then objdump is confused):
[torvalds@i7 ~]$ as -v
GNU assembler version 2.24 (x86_64-redhat-linux) using BFD version
version 2.24
[torvalds@i7 ~]$ cat t.s
main:
movq $0x12, %rdi
movq $0x1234, %rdi
movq $0x123456, %rdi
movq $0x12345678, %rdi
movq $0x123456789ab, %rdi
[torvalds@i7 ~]$ as t.s
[torvalds@i7 ~]$ objdump -d a.out
...
0: 48 c7 c7 12 00 00 00 mov $0x12,%rdi
7: 48 c7 c7 34 12 00 00 mov $0x1234,%rdi
e: 48 c7 c7 56 34 12 00 mov $0x123456,%rdi
15: 48 c7 c7 78 56 34 12 mov $0x12345678,%rdi
1c: 48 bf ab 89 67 45 23 movabs $0x123456789ab,%rdi
23: 01 00 00
so 'as' is clearly just stupid. It already takes the size of the
constant into account and generates different instructions. Why not
for the common 32-bit case too?
Oh well.
Linus
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
2015-03-26 0:05 ` Linus Torvalds
@ 2015-03-26 0:12 ` H. Peter Anvin
2015-03-26 9:27 ` Borislav Petkov
1 sibling, 0 replies; 24+ messages in thread
From: H. Peter Anvin @ 2015-03-26 0:12 UTC (permalink / raw)
To: Linus Torvalds
Cc: Denys Vlasenko, Ingo Molnar, Steven Rostedt, Borislav Petkov,
Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
Alexei Starovoitov, Will Drewry, Kees Cook,
the arch/x86 maintainers, Linux Kernel Mailing List
Ok, "movabs" is crazier than I thought.
On March 25, 2015 5:05:50 PM PDT, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>On Wed, Mar 25, 2015 at 4:56 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>> No, movabs is yet another instruction (with a 64-bit absolute
>address.) But movq can mean 10 or 7 bytes...
>
>I mentioned movabs because that is literally what as generates at
>least for me (or then objdump is confused):
>
> [torvalds@i7 ~]$ as -v
> GNU assembler version 2.24 (x86_64-redhat-linux) using BFD version
>version 2.24
>
> [torvalds@i7 ~]$ cat t.s
> main:
> movq $0x12, %rdi
> movq $0x1234, %rdi
> movq $0x123456, %rdi
> movq $0x12345678, %rdi
> movq $0x123456789ab, %rdi
>
> [torvalds@i7 ~]$ as t.s
> [torvalds@i7 ~]$ objdump -d a.out
> ...
> 0: 48 c7 c7 12 00 00 00 mov $0x12,%rdi
> 7: 48 c7 c7 34 12 00 00 mov $0x1234,%rdi
> e: 48 c7 c7 56 34 12 00 mov $0x123456,%rdi
> 15: 48 c7 c7 78 56 34 12 mov $0x12345678,%rdi
> 1c: 48 bf ab 89 67 45 23 movabs $0x123456789ab,%rdi
> 23: 01 00 00
>
>so 'as' is clearly just stupid. It already takes the size of the
>constant into account and generates different instructions. Why not
>for the common 32-bit case too?
>
>Oh well.
>
> Linus
--
Sent from my mobile phone. Please pardon brevity and lack of formatting.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
2015-03-26 0:05 ` Linus Torvalds
2015-03-26 0:12 ` H. Peter Anvin
@ 2015-03-26 9:27 ` Borislav Petkov
2015-03-26 9:37 ` Ingo Molnar
1 sibling, 1 reply; 24+ messages in thread
From: Borislav Petkov @ 2015-03-26 9:27 UTC (permalink / raw)
To: Linus Torvalds
Cc: H. Peter Anvin, Denys Vlasenko, Ingo Molnar, Steven Rostedt,
Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
Alexei Starovoitov, Will Drewry, Kees Cook,
the arch/x86 maintainers, Linux Kernel Mailing List
On Wed, Mar 25, 2015 at 05:05:50PM -0700, Linus Torvalds wrote:
> so 'as' is clearly just stupid. It already takes the size of the
> constant into account and generates different instructions. Why not
> for the common 32-bit case too?
I think the destination register mandates which insn to use:
mov $0x12345678, %rdi
mov $0x12345678, %edi
...
15: 48 c7 c7 78 56 34 12 mov $0x12345678,%rdi
1c: bf 78 56 34 12 mov $0x12345678,%edi
and 'as' is perhaps not insolent enough to go and change it when seeing
the 32-bit constant.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
2015-03-25 23:51 ` Linus Torvalds
2015-03-25 23:56 ` H. Peter Anvin
@ 2015-03-26 9:30 ` Ingo Molnar
1 sibling, 0 replies; 24+ messages in thread
From: Ingo Molnar @ 2015-03-26 9:30 UTC (permalink / raw)
To: Linus Torvalds
Cc: Denys Vlasenko, Steven Rostedt, Borislav Petkov, H. Peter Anvin,
Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
Alexei Starovoitov, Will Drewry, Kees Cook,
the arch/x86 maintainers, Linux Kernel Mailing List
* Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Wed, Mar 25, 2015 at 10:18 AM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
> > The $AUDIT_ARCH_X86_64 parameter to syscall_trace_enter_phase1/2
> > is a 32-bit constant, loading it with 32-bit MOV produces 5-byte insn
> > instead of 10-byte one.
>
> Side note: has anybody talked to the assembler people? This would
> seem to be very much something that the assembler could have noticed
> and done on its own. [...]
Maybe GCC already picks a 32-bit opcode in these small-constant cases,
so there was little incentive to optimize on the GAS side, other than
making it correct.
> [...] It's a bit sad that we need to overspecify these things..
Yeah, that's sad.
Yesterday when I have read Denys's patch I double checked that there's
no other similar (easily identifiable ...) movq opcode left in the
64-bit entry code, so we seem to have squashed most of them.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
2015-03-26 9:27 ` Borislav Petkov
@ 2015-03-26 9:37 ` Ingo Molnar
2015-03-26 9:51 ` Borislav Petkov
2015-03-26 10:07 ` Denys Vlasenko
0 siblings, 2 replies; 24+ messages in thread
From: Ingo Molnar @ 2015-03-26 9:37 UTC (permalink / raw)
To: Borislav Petkov
Cc: Linus Torvalds, H. Peter Anvin, Denys Vlasenko, Steven Rostedt,
Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
Alexei Starovoitov, Will Drewry, Kees Cook,
the arch/x86 maintainers, Linux Kernel Mailing List
* Borislav Petkov <bp@alien8.de> wrote:
> On Wed, Mar 25, 2015 at 05:05:50PM -0700, Linus Torvalds wrote:
> > so 'as' is clearly just stupid. It already takes the size of the
> > constant into account and generates different instructions. Why not
> > for the common 32-bit case too?
>
> I think the destination register mandates which insn to use:
>
> mov $0x12345678, %rdi
> mov $0x12345678, %edi
>
> ...
>
> 15: 48 c7 c7 78 56 34 12 mov $0x12345678,%rdi
> 1c: bf 78 56 34 12 mov $0x12345678,%edi
>
> and 'as' is perhaps not insolent enough to go and change it when
> seeing the 32-bit constant.
Well, GAS generally has the freedom to use more optimal opcodes, as
long as behavior on the CPU matches what the specified opcode does.
Saying that 'movq $0x1, %rdi' should map to the longer variant is
defensible: there should probably always be a way to specify a very
specific opcode even if it's not optimal - say I want to fill in an
alignment space and don't want to use an extra NOP.
But here GAS generates the 10-byte opcode even if 'mov $0x12345678,
%rdi' is used, which is an unforced error.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
2015-03-26 9:37 ` Ingo Molnar
@ 2015-03-26 9:51 ` Borislav Petkov
2015-03-26 10:07 ` Denys Vlasenko
1 sibling, 0 replies; 24+ messages in thread
From: Borislav Petkov @ 2015-03-26 9:51 UTC (permalink / raw)
To: Ingo Molnar
Cc: Linus Torvalds, H. Peter Anvin, Denys Vlasenko, Steven Rostedt,
Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
Alexei Starovoitov, Will Drewry, Kees Cook,
the arch/x86 maintainers, Linux Kernel Mailing List
On Thu, Mar 26, 2015 at 10:37:32AM +0100, Ingo Molnar wrote:
> But here GAS generates the 10-byte opcode even if 'mov $0x12345678,
> %rdi' is used, which is an unforced error.
7 bytes:
48 c7 c7 78 56 34 12 mov $0x12345678,%rdi
which is REX + opcode C7 + ModRM + 4-byte immediate.
If it used the bX opcodes, it would have to use REX prefix too for the
%rdi register which would generate a 10-byte insn then:
0: 48 bf 78 56 43 12 00 mov $0x12435678,%rdi
7: 00 00 00
REX + opcode BF + 8 bytes immediate.
So I think using the REX register "rdi" forces it to use the REX prefix
and thus the longer instructions.
All IMHO of course.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
2015-03-26 9:37 ` Ingo Molnar
2015-03-26 9:51 ` Borislav Petkov
@ 2015-03-26 10:07 ` Denys Vlasenko
2015-03-26 10:25 ` Ingo Molnar
2015-03-26 10:25 ` Borislav Petkov
1 sibling, 2 replies; 24+ messages in thread
From: Denys Vlasenko @ 2015-03-26 10:07 UTC (permalink / raw)
To: Ingo Molnar
Cc: Borislav Petkov, Linus Torvalds, H. Peter Anvin, Denys Vlasenko,
Steven Rostedt, Andy Lutomirski, Oleg Nesterov,
Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
the arch/x86 maintainers, Linux Kernel Mailing List
On Thu, Mar 26, 2015 at 10:37 AM, Ingo Molnar <mingo@kernel.org> wrote:
>> On Wed, Mar 25, 2015 at 05:05:50PM -0700, Linus Torvalds wrote:
>> > so 'as' is clearly just stupid. It already takes the size of the
>> > constant into account and generates different instructions. Why not
>> > for the common 32-bit case too?
>>
>> I think the destination register mandates which insn to use:
>>
>> mov $0x12345678, %rdi
>> mov $0x12345678, %edi
>>
>> ...
>>
>> 15: 48 c7 c7 78 56 34 12 mov $0x12345678,%rdi
>> 1c: bf 78 56 34 12 mov $0x12345678,%edi
>>
>> and 'as' is perhaps not insolent enough to go and change it when
>> seeing the 32-bit constant.
>
> Well, GAS generally has the freedom to use more optimal opcodes, as
> long as behavior on the CPU matches what the specified opcode does.
>
> Saying that 'movq $0x1, %rdi' should map to the longer variant is
> defensible: there should probably always be a way to specify a very
> specific opcode even if it's not optimal - say I want to fill in an
> alignment space and don't want to use an extra NOP.
>
> But here GAS generates the 10-byte opcode even if 'mov $0x12345678,
> %rdi' is used, which is an unforced error.
In my experiment, GAS uses 10-byte insn only for constants which
won't work with 7-byte encoding; or if I explicitly ask for "movabs":
_start: .globl _start
mov $0x12345678,%edi # 5 bytes
mov $0x12345678,%rdi # 7 bytes
movq $0x12345678,%rdi # 7 bytes
mov $0x80000000,%rdi # 10 bytes
mov $0x123456789,%rdi # 10 bytes
movabs $0x12345678,%rdi # 10 bytes
$ gcc -nostartfiles -nostdlib -c z.S && objdump -dr z.o
z.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_start>:
0: bf 78 56 34 12 mov $0x12345678,%edi
5: 48 c7 c7 78 56 34 12 mov $0x12345678,%rdi
c: 48 c7 c7 78 56 34 12 mov $0x12345678,%rdi
13: 48 bf 00 00 00 80 00 movabs $0x80000000,%rdi
1a: 00 00 00
1d: 48 bf 89 67 45 23 01 movabs $0x123456789,%rdi
24: 00 00 00
27: 48 bf 78 56 34 12 00 movabs $0x12345678,%rdi
2e: 00 00 00
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
2015-03-26 10:07 ` Denys Vlasenko
@ 2015-03-26 10:25 ` Ingo Molnar
2015-03-26 10:53 ` Denys Vlasenko
2015-03-26 10:25 ` Borislav Petkov
1 sibling, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2015-03-26 10:25 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Borislav Petkov, Linus Torvalds, H. Peter Anvin, Denys Vlasenko,
Steven Rostedt, Andy Lutomirski, Oleg Nesterov,
Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
the arch/x86 maintainers, Linux Kernel Mailing List
* Denys Vlasenko <vda.linux@googlemail.com> wrote:
> In my experiment, GAS uses 10-byte insn only for constants which
> won't work with 7-byte encoding; or if I explicitly ask for "movabs":
>
> _start: .globl _start
> mov $0x12345678,%edi # 5 bytes
> mov $0x12345678,%rdi # 7 bytes
> movq $0x12345678,%rdi # 7 bytes
> mov $0x80000000,%rdi # 10 bytes
> mov $0x123456789,%rdi # 10 bytes
> movabs $0x12345678,%rdi # 10 bytes
>
>
> $ gcc -nostartfiles -nostdlib -c z.S && objdump -dr z.o
> z.o: file format elf64-x86-64
> Disassembly of section .text:
> 0000000000000000 <_start>:
> 0: bf 78 56 34 12 mov $0x12345678,%edi
> 5: 48 c7 c7 78 56 34 12 mov $0x12345678,%rdi
> c: 48 c7 c7 78 56 34 12 mov $0x12345678,%rdi
> 13: 48 bf 00 00 00 80 00 movabs $0x80000000,%rdi
> 1a: 00 00 00
> 1d: 48 bf 89 67 45 23 01 movabs $0x123456789,%rdi
> 24: 00 00 00
> 27: 48 bf 78 56 34 12 00 movabs $0x12345678,%rdi
> 2e: 00 00 00
I see, so:
movq $AUDIT_ARCH_X86_64, %rsi
generated a 10-byte MOVABS opcode, while moving into %esi generates
the 5-byte 32-bit MOV opcode?
Thanks,
Ingo
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
2015-03-26 10:07 ` Denys Vlasenko
2015-03-26 10:25 ` Ingo Molnar
@ 2015-03-26 10:25 ` Borislav Petkov
1 sibling, 0 replies; 24+ messages in thread
From: Borislav Petkov @ 2015-03-26 10:25 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Ingo Molnar, Linus Torvalds, H. Peter Anvin, Denys Vlasenko,
Steven Rostedt, Andy Lutomirski, Oleg Nesterov,
Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
the arch/x86 maintainers, Linux Kernel Mailing List
On Thu, Mar 26, 2015 at 11:07:42AM +0100, Denys Vlasenko wrote:
> In my experiment, GAS uses 10-byte insn only for constants which
> won't work with 7-byte encoding; or if I explicitly ask for "movabs":
>
> _start: .globl _start
> mov $0x12345678,%edi # 5 bytes
> mov $0x12345678,%rdi # 7 bytes
> movq $0x12345678,%rdi # 7 bytes
> mov $0x80000000,%rdi # 10 bytes
Right, and since they're signed immediates, the AUDIT_ARCH_X86_64 thing
is 0xc000003e and does not fit in an s32, thus the 64-bit immediate with
bf opcode:
mov $0x80000000-1,%rdi
mov $0x80000000,%rdi
mov $0xc000003e,%rdi
mov $0xc000003e,%edi
...
21: 48 c7 c7 ff ff ff 7f mov $0x7fffffff,%rdi
28: 48 bf 00 00 00 80 00 movabs $0x80000000,%rdi
2f: 00 00 00
32: 48 bf 3e 00 00 c0 00 movabs $0xc000003e,%rdi
39: 00 00 00
3c: bf 3e 00 00 c0 mov $0xc000003e,%edi
Makes sense to me.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/4] x86/asm/entry/64: use smaller insns
2015-03-26 10:25 ` Ingo Molnar
@ 2015-03-26 10:53 ` Denys Vlasenko
0 siblings, 0 replies; 24+ messages in thread
From: Denys Vlasenko @ 2015-03-26 10:53 UTC (permalink / raw)
To: Ingo Molnar
Cc: Borislav Petkov, Linus Torvalds, H. Peter Anvin, Denys Vlasenko,
Steven Rostedt, Andy Lutomirski, Oleg Nesterov,
Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
the arch/x86 maintainers, Linux Kernel Mailing List
On Thu, Mar 26, 2015 at 11:25 AM, Ingo Molnar <mingo@kernel.org> wrote:
> * Denys Vlasenko <vda.linux@googlemail.com> wrote:
>> In my experiment, GAS uses 10-byte insn only for constants which
>> won't work with 7-byte encoding; or if I explicitly ask for "movabs":
>>
>> _start: .globl _start
>> mov $0x12345678,%edi # 5 bytes
>> mov $0x12345678,%rdi # 7 bytes
>> movq $0x12345678,%rdi # 7 bytes
>> mov $0x80000000,%rdi # 10 bytes
>> mov $0x123456789,%rdi # 10 bytes
>> movabs $0x12345678,%rdi # 10 bytes
>>
>>
>> $ gcc -nostartfiles -nostdlib -c z.S && objdump -dr z.o
>> z.o: file format elf64-x86-64
>> Disassembly of section .text:
>> 0000000000000000 <_start>:
>> 0: bf 78 56 34 12 mov $0x12345678,%edi
>> 5: 48 c7 c7 78 56 34 12 mov $0x12345678,%rdi
>> c: 48 c7 c7 78 56 34 12 mov $0x12345678,%rdi
>> 13: 48 bf 00 00 00 80 00 movabs $0x80000000,%rdi
>> 1a: 00 00 00
>> 1d: 48 bf 89 67 45 23 01 movabs $0x123456789,%rdi
>> 24: 00 00 00
>> 27: 48 bf 78 56 34 12 00 movabs $0x12345678,%rdi
>> 2e: 00 00 00
>
> I see, so:
>
> movq $AUDIT_ARCH_X86_64, %rsi
>
> generated a 10-byte MOVABS opcode, while moving into %esi generates
> the 5-byte 32-bit MOV opcode?
Exactly.
^ permalink raw reply [flat|nested] 24+ messages in thread
* [tip:x86/asm] x86/asm/entry/64: Use better label name, fix comments
2015-03-25 17:18 [PATCH 1/4] x86/asm/entry/64: better label name, fix comments Denys Vlasenko
` (2 preceding siblings ...)
2015-03-25 17:18 ` [PATCH 4/4] x86/asm/entry/64: fix typo in comment Denys Vlasenko
@ 2015-03-27 11:46 ` tip-bot for Denys Vlasenko
2015-03-27 12:15 ` Borislav Petkov
3 siblings, 1 reply; 24+ messages in thread
From: tip-bot for Denys Vlasenko @ 2015-03-27 11:46 UTC (permalink / raw)
To: linux-tip-commits
Cc: luto, fweisbec, linux-kernel, ast, dvlasenk, bp, mingo, wad,
oleg, rostedt, torvalds, hpa, keescook, tglx
Commit-ID: 146b2b097d7a322b64b88a927fc5d870fc79a60b
Gitweb: http://git.kernel.org/tip/146b2b097d7a322b64b88a927fc5d870fc79a60b
Author: Denys Vlasenko <dvlasenk@redhat.com>
AuthorDate: Wed, 25 Mar 2015 18:18:13 +0100
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 27 Mar 2015 09:57:05 +0100
x86/asm/entry/64: Use better label name, fix comments
A named label "ret_from_sys_call" implies that there are jumps
to this location from elsewhere, as happens with many other
labels in this file.
But this label is used only by the JMP a few insns above.
To make that obvious, use local numeric label instead.
Improve comments:
"and return regs->ax" isn't too informative. We always return
regs->ax.
The comment suggesting that it'd be cool to use rip relative
addressing for CALL is deleted. It's unclear why that would be
an improvement - we aren't striving to use position-independent
code here. PIC code here would require something like LEA
sys_call_table(%rip),reg + CALL *(reg,%rax*8)...
"iret frame is also incomplete" is no longer true, fix that too.
Also fix typo in comment.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1427303896-24023-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/entry_64.S | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index bf9afad..9988c4b 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -258,16 +258,15 @@ system_call_fastpath:
andl $__SYSCALL_MASK,%eax
cmpl $__NR_syscall_max,%eax
#endif
- ja ret_from_sys_call /* and return regs->ax */
+ ja 1f /* return -ENOSYS (already in pt_regs->ax) */
movq %r10,%rcx
- call *sys_call_table(,%rax,8) # XXX: rip relative
+ call *sys_call_table(,%rax,8)
movq %rax,RAX(%rsp)
+1:
/*
- * Syscall return path ending with SYSRET (fast path)
- * Has incompletely filled pt_regs, iret frame is also incomplete.
+ * Syscall return path ending with SYSRET (fast path).
+ * Has incompletely filled pt_regs.
*/
-ret_from_sys_call:
-
LOCKDEP_SYS_EXIT
DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
@@ -1407,7 +1406,7 @@ ENTRY(nmi)
* NMI.
*/
- /* Use %rdx as out temp variable throughout */
+ /* Use %rdx as our temp variable throughout */
pushq_cfi %rdx
CFI_REL_OFFSET rdx, 0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [tip:x86/asm] x86/asm/entry/64: Use smaller instructions
2015-03-25 17:18 ` [PATCH 3/4] x86/asm/entry/64: use smaller insns Denys Vlasenko
2015-03-25 18:03 ` Andy Lutomirski
2015-03-25 23:51 ` Linus Torvalds
@ 2015-03-27 11:47 ` tip-bot for Denys Vlasenko
2 siblings, 0 replies; 24+ messages in thread
From: tip-bot for Denys Vlasenko @ 2015-03-27 11:47 UTC (permalink / raw)
To: linux-tip-commits
Cc: fweisbec, oleg, wad, luto, tglx, ast, bp, dvlasenk, rostedt,
torvalds, hpa, linux-kernel, keescook, mingo
Commit-ID: 47eb582e702880c302036d17341c7ea1a7dc2a53
Gitweb: http://git.kernel.org/tip/47eb582e702880c302036d17341c7ea1a7dc2a53
Author: Denys Vlasenko <dvlasenk@redhat.com>
AuthorDate: Wed, 25 Mar 2015 18:18:15 +0100
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 27 Mar 2015 09:57:06 +0100
x86/asm/entry/64: Use smaller instructions
The $AUDIT_ARCH_X86_64 parameter to syscall_trace_enter_phase1/2
is a 32-bit constant, loading it with 32-bit MOV produces 5-byte
insn instead of 10-byte MOVABS one.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1427303896-24023-3-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/entry_64.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 9988c4b..f85d2cc 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -305,7 +305,7 @@ system_call_fastpath:
/* Do syscall entry tracing */
tracesys:
movq %rsp, %rdi
- movq $AUDIT_ARCH_X86_64, %rsi
+ movl $AUDIT_ARCH_X86_64, %esi
call syscall_trace_enter_phase1
test %rax, %rax
jnz tracesys_phase2 /* if needed, run the slow path */
@@ -316,7 +316,7 @@ tracesys:
tracesys_phase2:
SAVE_EXTRA_REGS
movq %rsp, %rdi
- movq $AUDIT_ARCH_X86_64, %rsi
+ movl $AUDIT_ARCH_X86_64, %esi
movq %rax,%rdx
call syscall_trace_enter_phase2
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [tip:x86/asm] x86/asm/entry/64: Use better label name, fix comments
2015-03-27 11:46 ` [tip:x86/asm] x86/asm/entry/64: Use better label name, fix comments tip-bot for Denys Vlasenko
@ 2015-03-27 12:15 ` Borislav Petkov
0 siblings, 0 replies; 24+ messages in thread
From: Borislav Petkov @ 2015-03-27 12:15 UTC (permalink / raw)
To: Ingo Molnar
Cc: ast, dvlasenk, luto, fweisbec, linux-kernel, wad, rostedt, oleg,
mingo, torvalds, hpa, keescook, tglx, linux-tip-commits
On Fri, Mar 27, 2015 at 04:46:44AM -0700, tip-bot for Denys Vlasenko wrote:
> Commit-ID: 146b2b097d7a322b64b88a927fc5d870fc79a60b
> Gitweb: http://git.kernel.org/tip/146b2b097d7a322b64b88a927fc5d870fc79a60b
> Author: Denys Vlasenko <dvlasenk@redhat.com>
> AuthorDate: Wed, 25 Mar 2015 18:18:13 +0100
> Committer: Ingo Molnar <mingo@kernel.org>
> CommitDate: Fri, 27 Mar 2015 09:57:05 +0100
>
> x86/asm/entry/64: Use better label name, fix comments
>
> A named label "ret_from_sys_call" implies that there are jumps
> to this location from elsewhere, as happens with many other
> labels in this file.
>
> But this label is used only by the JMP a few insns above.
> To make that obvious, use local numeric label instead.
>
> Improve comments:
>
> "and return regs->ax" isn't too informative. We always return
> regs->ax.
>
> The comment suggesting that it'd be cool to use rip relative
> addressing for CALL is deleted. It's unclear why that would be
> an improvement - we aren't striving to use position-independent
> code here. PIC code here would require something like LEA
> sys_call_table(%rip),reg + CALL *(reg,%rax*8)...
>
> "iret frame is also incomplete" is no longer true, fix that too.
>
> Also fix typo in comment.
>
> Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Will Drewry <wad@chromium.org>
> Link: http://lkml.kernel.org/r/1427303896-24023-1-git-send-email-dvlasenk@redhat.com
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> ---
> arch/x86/kernel/entry_64.S | 13 ++++++-------
> 1 file changed, 6 insertions(+), 7 deletions(-)
Shouldn't you take v2 of that one?
https://lkml.kernel.org/r/1427307629-10024-1-git-send-email-dvlasenk@redhat.com
v2 is missing this hunk though:
---
@@ -1407,7 +1406,7 @@ ENTRY(nmi)
* NMI.
*/
- /* Use %rdx as out temp variable throughout */
+ /* Use %rdx as our temp variable throughout */
pushq_cfi %rdx
CFI_REL_OFFSET rdx, 0
--
I blame Denys for sending too many patches! :)
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--
^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2015-03-27 12:17 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-25 17:18 [PATCH 1/4] x86/asm/entry/64: better label name, fix comments Denys Vlasenko
2015-03-25 17:18 ` [PATCH 2/4] x86/asm/entry/64: do not TRACE_IRQS fast SYSRET64 path Denys Vlasenko
2015-03-25 17:29 ` Ingo Molnar
2015-03-25 17:41 ` Denys Vlasenko
2015-03-25 18:04 ` Ingo Molnar
2015-03-25 18:19 ` Denys Vlasenko
2015-03-25 17:18 ` [PATCH 3/4] x86/asm/entry/64: use smaller insns Denys Vlasenko
2015-03-25 18:03 ` Andy Lutomirski
2015-03-25 23:51 ` Linus Torvalds
2015-03-25 23:56 ` H. Peter Anvin
2015-03-26 0:05 ` Linus Torvalds
2015-03-26 0:12 ` H. Peter Anvin
2015-03-26 9:27 ` Borislav Petkov
2015-03-26 9:37 ` Ingo Molnar
2015-03-26 9:51 ` Borislav Petkov
2015-03-26 10:07 ` Denys Vlasenko
2015-03-26 10:25 ` Ingo Molnar
2015-03-26 10:53 ` Denys Vlasenko
2015-03-26 10:25 ` Borislav Petkov
2015-03-26 9:30 ` Ingo Molnar
2015-03-27 11:47 ` [tip:x86/asm] x86/asm/entry/64: Use smaller instructions tip-bot for Denys Vlasenko
2015-03-25 17:18 ` [PATCH 4/4] x86/asm/entry/64: fix typo in comment Denys Vlasenko
2015-03-27 11:46 ` [tip:x86/asm] x86/asm/entry/64: Use better label name, fix comments tip-bot for Denys Vlasenko
2015-03-27 12:15 ` Borislav Petkov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).