* Re: CONFIG_PAGE_TABLE_ISOLATION=y on x86_64 causes gcc to segfault when building x86_32 binaries
2018-01-03 18:52 ` Thomas Gleixner
@ 2018-01-03 22:12 ` Laura Abbott
2018-01-03 22:14 ` Andy Lutomirski
` (3 subsequent siblings)
4 siblings, 0 replies; 16+ messages in thread
From: Laura Abbott @ 2018-01-03 22:12 UTC
To: Thomas Gleixner, Lars Wendler
Cc: LKML, x86, Borislav Betkov, Andy Lutomirski, Dave Hansen,
Peter Zijlstra, Greg KH, Boris Ostrovsky, Juergen Gross
On 01/03/2018 10:52 AM, Thomas Gleixner wrote:
> On Wed, 3 Jan 2018, Thomas Gleixner wrote:
>
>> On Wed, 3 Jan 2018, Lars Wendler wrote:
>>> On Wed, 3 Jan 2018 13:05:38 +0100 (CET)
>>> Thomas Gleixner <tglx@linutronix.de> wrote:
>>>> Also can you please try Linus v4.15-rc6 with PTI enabled so we can see
>>>> whether that's a backport issue or a general one?
>>>
>>> Same problem with 4.15-rc6. So I suppose that means it's a general
>>> issue.
>>
>> Just a shot in the dark as I just decoded another issue on an AMD CPU. Can
>> you please try the patch below?
>
> Ok. Found the real issue. This is a problem on AMD boxen.
>
Fedora reporter says it fixes it.
> Fix below.
>
> Can Xen folks please have a look at that as well?
>
> Thanks,
>
> tglx
>
> 8<-------------------
>
> arch/x86/entry/entry_64_compat.S | 13 ++++++-------
> 1 file changed, 6 insertions(+), 7 deletions(-)
>
> --- a/arch/x86/entry/entry_64_compat.S
> +++ b/arch/x86/entry/entry_64_compat.S
> @@ -190,8 +190,13 @@ ENTRY(entry_SYSCALL_compat)
> /* Interrupts are off on entry. */
> swapgs
>
> - /* Stash user ESP and switch to the kernel stack. */
> + /* Stash user ESP */
> movl %esp, %r8d
> +
> + /* Use %rsp as scratch reg. User ESP is stashed in r8 */
> + SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
> +
> + /* Switch to the kernel stack */
> movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
>
> /* Construct struct pt_regs on stack */
> @@ -220,12 +225,6 @@ GLOBAL(entry_SYSCALL_compat_after_hwfram
> pushq $0 /* pt_regs->r15 = 0 */
>
> /*
> - * We just saved %rdi so it is safe to clobber. It is not
> - * preserved during the C calls inside TRACE_IRQS_OFF anyway.
> - */
> - SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
> -
> - /*
> * User mode is traced as though IRQs are on, and SYSENTER
> * turned them off.
> */
>
* Re: CONFIG_PAGE_TABLE_ISOLATION=y on x86_64 causes gcc to segfault when building x86_32 binaries
2018-01-03 18:52 ` Thomas Gleixner
2018-01-03 22:12 ` Laura Abbott
@ 2018-01-03 22:14 ` Andy Lutomirski
2018-01-03 22:22 ` Thomas Gleixner
2018-01-03 22:27 ` Dave Hansen
2018-01-03 22:25 ` [tip:x86/pti] x86/pti: Switch to kernel CR3 at early in entry_SYSCALL_compat() tip-bot for Thomas Gleixner
` (2 subsequent siblings)
4 siblings, 2 replies; 16+ messages in thread
From: Andy Lutomirski @ 2018-01-03 22:14 UTC
To: Thomas Gleixner
Cc: Lars Wendler, LKML, X86 ML, Borislav Betkov, Andy Lutomirski,
Dave Hansen, Peter Zijlstra, Greg KH, Laura Abbott,
Boris Ostrovsky, Juergen Gross
On Wed, Jan 3, 2018 at 10:52 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Wed, 3 Jan 2018, Thomas Gleixner wrote:
>
>> On Wed, 3 Jan 2018, Lars Wendler wrote:
>> > On Wed, 3 Jan 2018 13:05:38 +0100 (CET)
>> > Thomas Gleixner <tglx@linutronix.de> wrote:
>> > > Also can you please try Linus v4.15-rc6 with PTI enabled so we can see
>> > > whether that's a backport issue or a general one?
>> >
>> > Same problem with 4.15-rc6. So I suppose that means it's a general
>> > issue.
>>
>> Just a shot in the dark as I just decoded another issue on an AMD CPU. Can
>> you please try the patch below?
>
> Ok. Found the real issue. This is a problem on AMD boxen.
>
> Fix below.
>
> Can Xen folks please have a look at that as well?
>
> Thanks,
>
> tglx
>
> 8<-------------------
>
> arch/x86/entry/entry_64_compat.S | 13 ++++++-------
> 1 file changed, 6 insertions(+), 7 deletions(-)
>
> --- a/arch/x86/entry/entry_64_compat.S
> +++ b/arch/x86/entry/entry_64_compat.S
> @@ -190,8 +190,13 @@ ENTRY(entry_SYSCALL_compat)
> /* Interrupts are off on entry. */
> swapgs
>
> - /* Stash user ESP and switch to the kernel stack. */
> + /* Stash user ESP */
> movl %esp, %r8d
> +
> + /* Use %rsp as scratch reg. User ESP is stashed in r8 */
> + SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
> +
> + /* Switch to the kernel stack */
> movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
>
> /* Construct struct pt_regs on stack */
> @@ -220,12 +225,6 @@ GLOBAL(entry_SYSCALL_compat_after_hwfram
> pushq $0 /* pt_regs->r15 = 0 */
>
> /*
> - * We just saved %rdi so it is safe to clobber. It is not
> - * preserved during the C calls inside TRACE_IRQS_OFF anyway.
> - */
> - SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
> -
> - /*
> * User mode is traced as though IRQs are on, and SYSENTER
> * turned them off.
> */
What's the issue that this is fixing?
* Re: CONFIG_PAGE_TABLE_ISOLATION=y on x86_64 causes gcc to segfault when building x86_32 binaries
2018-01-03 22:14 ` Andy Lutomirski
@ 2018-01-03 22:22 ` Thomas Gleixner
2018-01-03 23:43 ` Andy Lutomirski
2018-01-03 22:27 ` Dave Hansen
1 sibling, 1 reply; 16+ messages in thread
From: Thomas Gleixner @ 2018-01-03 22:22 UTC
To: Andy Lutomirski
Cc: Lars Wendler, LKML, X86 ML, Borislav Betkov, Dave Hansen,
Peter Zijlstra, Greg KH, Laura Abbott, Boris Ostrovsky,
Juergen Gross
On Wed, 3 Jan 2018, Andy Lutomirski wrote:
> On Wed, Jan 3, 2018 at 10:52 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > On Wed, 3 Jan 2018, Thomas Gleixner wrote:
> >
> >> On Wed, 3 Jan 2018, Lars Wendler wrote:
> >> > On Wed, 3 Jan 2018 13:05:38 +0100 (CET)
> >> > Thomas Gleixner <tglx@linutronix.de> wrote:
> >> > > Also can you please try Linus v4.15-rc6 with PTI enabled so we can see
> >> > > whether that's a backport issue or a general one?
> >> >
> >> > Same problem with 4.15-rc6. So I suppose that means it's a general
> >> > issue.
> >>
> >> Just a shot in the dark as I just decoded another issue on an AMD CPU. Can
> >> you please try the patch below?
> >
> > Ok. Found the real issue. This is a problem on AMD boxen.
> >
> > Fix below.
> >
> > Can Xen folks please have a look at that as well?
> >
> > Thanks,
> >
> > tglx
> >
> > 8<-------------------
> >
> > arch/x86/entry/entry_64_compat.S | 13 ++++++-------
> > 1 file changed, 6 insertions(+), 7 deletions(-)
> >
> > --- a/arch/x86/entry/entry_64_compat.S
> > +++ b/arch/x86/entry/entry_64_compat.S
> > @@ -190,8 +190,13 @@ ENTRY(entry_SYSCALL_compat)
> > /* Interrupts are off on entry. */
> > swapgs
> >
> > - /* Stash user ESP and switch to the kernel stack. */
> > + /* Stash user ESP */
> > movl %esp, %r8d
> > +
> > + /* Use %rsp as scratch reg. User ESP is stashed in r8 */
> > + SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
> > +
> > + /* Switch to the kernel stack */
> > movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
> >
> > /* Construct struct pt_regs on stack */
> > @@ -220,12 +225,6 @@ GLOBAL(entry_SYSCALL_compat_after_hwfram
> > pushq $0 /* pt_regs->r15 = 0 */
> >
> > /*
> > - * We just saved %rdi so it is safe to clobber. It is not
> > - * preserved during the C calls inside TRACE_IRQS_OFF anyway.
> > - */
> > - SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
> > -
> > - /*
> > * User mode is traced as though IRQs are on, and SYSENTER
> > * turned them off.
> > */
>
> What's the issue that this is fixing?
> > movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
before switching CR3 is obviously broken ...
* Re: CONFIG_PAGE_TABLE_ISOLATION=y on x86_64 causes gcc to segfault when building x86_32 binaries
2018-01-03 22:22 ` Thomas Gleixner
@ 2018-01-03 23:43 ` Andy Lutomirski
0 siblings, 0 replies; 16+ messages in thread
From: Andy Lutomirski @ 2018-01-03 23:43 UTC
To: Thomas Gleixner
Cc: Andy Lutomirski, Lars Wendler, LKML, X86 ML, Borislav Betkov,
Dave Hansen, Peter Zijlstra, Greg KH, Laura Abbott,
Boris Ostrovsky, Juergen Gross
> On Jan 3, 2018, at 2:22 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
>
>> On Wed, 3 Jan 2018, Andy Lutomirski wrote:
>>
>>> On Wed, Jan 3, 2018 at 10:52 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
>>>> On Wed, 3 Jan 2018, Thomas Gleixner wrote:
>>>>
>>>>> On Wed, 3 Jan 2018, Lars Wendler wrote:
>>>>> On Wed, 3 Jan 2018 13:05:38 +0100 (CET)
>>>>> Thomas Gleixner <tglx@linutronix.de> wrote:
>>>>>> Also can you please try Linus v4.15-rc6 with PTI enabled so we can see
>>>>>> whether that's a backport issue or a general one?
>>>>>
>>>>> Same problem with 4.15-rc6. So I suppose that means it's a general
>>>>> issue.
>>>>
>>>> Just a shot in the dark as I just decoded another issue on an AMD CPU. Can
>>>> you please try the patch below?
>>>
>>> Ok. Found the real issue. This is a problem on AMD boxen.
>>>
>>> Fix below.
>>>
>>> Can Xen folks please have a look at that as well?
>>>
>>> Thanks,
>>>
>>> tglx
>>>
>>> 8<-------------------
>>>
>>> arch/x86/entry/entry_64_compat.S | 13 ++++++-------
>>> 1 file changed, 6 insertions(+), 7 deletions(-)
>>>
>>> --- a/arch/x86/entry/entry_64_compat.S
>>> +++ b/arch/x86/entry/entry_64_compat.S
>>> @@ -190,8 +190,13 @@ ENTRY(entry_SYSCALL_compat)
>>> /* Interrupts are off on entry. */
>>> swapgs
>>>
>>> - /* Stash user ESP and switch to the kernel stack. */
>>> + /* Stash user ESP */
>>> movl %esp, %r8d
>>> +
>>> + /* Use %rsp as scratch reg. User ESP is stashed in r8 */
>>> + SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
>>> +
>>> + /* Switch to the kernel stack */
>>> movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
>>>
>>> /* Construct struct pt_regs on stack */
>>> @@ -220,12 +225,6 @@ GLOBAL(entry_SYSCALL_compat_after_hwfram
>>> pushq $0 /* pt_regs->r15 = 0 */
>>>
>>> /*
>>> - * We just saved %rdi so it is safe to clobber. It is not
>>> - * preserved during the C calls inside TRACE_IRQS_OFF anyway.
>>> - */
>>> - SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
>>> -
>>> - /*
>>> * User mode is traced as though IRQs are on, and SYSENTER
>>> * turned them off.
>>> */
>>
>> What's the issue that this is fixing?
>
>>> movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
>
> before switching CR3 is obviously broken ...
>
>
Duh.
This is what happens when we have five hundred versions of the patches and change how it all works halfway through. And the 0day bot doesn't test the AMD path.
>
* Re: CONFIG_PAGE_TABLE_ISOLATION=y on x86_64 causes gcc to segfault when building x86_32 binaries
2018-01-03 22:14 ` Andy Lutomirski
2018-01-03 22:22 ` Thomas Gleixner
@ 2018-01-03 22:27 ` Dave Hansen
1 sibling, 0 replies; 16+ messages in thread
From: Dave Hansen @ 2018-01-03 22:27 UTC
To: Andy Lutomirski, Thomas Gleixner
Cc: Lars Wendler, LKML, X86 ML, Borislav Betkov, Peter Zijlstra,
Greg KH, Laura Abbott, Boris Ostrovsky, Juergen Gross
On 01/03/2018 02:14 PM, Andy Lutomirski wrote:
> + /* Use %rsp as scratch reg. User ESP is stashed in r8 */
> + SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
> +
> + /* Switch to the kernel stack */
> movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
The stack is unreadable at this point without the CR3 switch.
> What's the issue that this is fixing?
Users doing 32-bit SYSCALLs on the CPUs that support them double fault
since they end up with an %rsp that they can't access.
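
The ordering problem described above can be sketched as a toy model. This is
not kernel code: ToyCpu, UnmappedAccess, entry_old_order and entry_new_order
are hypothetical names made up for illustration. Following the commit
description in this thread, the model assumes the per-CPU variable holding
the kernel stack pointer is mapped only in the kernel page tables.

```python
# Toy model only: all names here are hypothetical, none are kernel APIs.
USER_CR3, KERNEL_CR3 = "user", "kernel"


class UnmappedAccess(Exception):
    """Stands in for the fault taken when touching an unmapped address."""


class ToyCpu:
    def __init__(self, top_of_stack):
        # A compat SYSCALL enters the kernel still on the user page tables.
        self.cr3 = USER_CR3
        self._top_of_stack = top_of_stack

    def read_top_of_stack(self):
        # Assumption: with PTI the per-CPU variable is only mapped in the
        # kernel page tables, so reading it on the user CR3 faults.
        if self.cr3 != KERNEL_CR3:
            raise UnmappedAccess("per-CPU variable not mapped on user CR3")
        return self._top_of_stack


def entry_old_order(cpu):
    """Old order: load the kernel stack pointer, switch CR3 afterwards."""
    rsp = cpu.read_top_of_stack()  # raises: still on the user page tables
    cpu.cr3 = KERNEL_CR3
    return rsp


def entry_new_order(cpu):
    """Patched order: switch CR3 first, then load the kernel stack pointer."""
    cpu.cr3 = KERNEL_CR3
    return cpu.read_top_of_stack()
```

In this model entry_old_order() raises UnmappedAccess, while
entry_new_order() returns the stack pointer, which mirrors why the patch
moves SWITCH_TO_KERNEL_CR3 ahead of the PER_CPU_VAR access.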
* [tip:x86/pti] x86/pti: Switch to kernel CR3 at early in entry_SYSCALL_compat()
2018-01-03 18:52 ` Thomas Gleixner
2018-01-03 22:12 ` Laura Abbott
2018-01-03 22:14 ` Andy Lutomirski
@ 2018-01-03 22:25 ` tip-bot for Thomas Gleixner
2018-01-03 23:46 ` CONFIG_PAGE_TABLE_ISOLATION=y on x86_64 causes gcc to segfault when building x86_32 binaries Lars Wendler
2018-01-04 2:44 ` Boris Ostrovsky
4 siblings, 0 replies; 16+ messages in thread
From: tip-bot for Thomas Gleixner @ 2018-01-03 22:25 UTC
To: linux-tip-commits
Cc: bp, linux-kernel, wendler.lars, labbott, hpa, jgross, mingo, tglx
Commit-ID: d7732ba55c4b6a2da339bb12589c515830cfac2c
Gitweb: https://git.kernel.org/tip/d7732ba55c4b6a2da339bb12589c515830cfac2c
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Wed, 3 Jan 2018 19:52:04 +0100
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 3 Jan 2018 23:19:32 +0100
x86/pti: Switch to kernel CR3 at early in entry_SYSCALL_compat()
The preparation for PTI which added CR3 switching to the entry code
misplaced the CR3 switch in entry_SYSCALL_compat().
With PTI enabled the entry code tries to access a per cpu variable after
switching to kernel GS. This fails because that variable is not mapped to
user space. This results in a double fault and in the worst case a kernel
crash.
Move the switch ahead of the access and clobber RSP which has been saved
already.
Fixes: 8a09317b895f ("x86/mm/pti: Prepare the x86/entry assembly code for entry/exit CR3 switching")
Reported-by: Lars Wendler <wendler.lars@web.de>
Reported-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Betkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801031949200.1957@nanos
---
arch/x86/entry/entry_64_compat.S | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 40f1700..98d5358 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -190,8 +190,13 @@ ENTRY(entry_SYSCALL_compat)
/* Interrupts are off on entry. */
swapgs
- /* Stash user ESP and switch to the kernel stack. */
+ /* Stash user ESP */
movl %esp, %r8d
+
+ /* Use %rsp as scratch reg. User ESP is stashed in r8 */
+ SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
+
+ /* Switch to the kernel stack */
movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
/* Construct struct pt_regs on stack */
@@ -220,12 +225,6 @@ GLOBAL(entry_SYSCALL_compat_after_hwframe)
pushq $0 /* pt_regs->r15 = 0 */
/*
- * We just saved %rdi so it is safe to clobber. It is not
- * preserved during the C calls inside TRACE_IRQS_OFF anyway.
- */
- SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
-
- /*
* User mode is traced as though IRQs are on, and SYSENTER
* turned them off.
*/
* Re: CONFIG_PAGE_TABLE_ISOLATION=y on x86_64 causes gcc to segfault when building x86_32 binaries
2018-01-03 18:52 ` Thomas Gleixner
` (2 preceding siblings ...)
2018-01-03 22:25 ` [tip:x86/pti] x86/pti: Switch to kernel CR3 at early in entry_SYSCALL_compat() tip-bot for Thomas Gleixner
@ 2018-01-03 23:46 ` Lars Wendler
2018-01-04 2:44 ` Boris Ostrovsky
4 siblings, 0 replies; 16+ messages in thread
From: Lars Wendler @ 2018-01-03 23:46 UTC
To: Thomas Gleixner
Cc: LKML, x86, Borislav Betkov, Andy Lutomirski, Dave Hansen,
Peter Zijlstra, Greg KH, Laura Abbott, Boris Ostrovsky,
Juergen Gross
On Wed, 3 Jan 2018 19:52:04 +0100 (CET)
Thomas Gleixner <tglx@linutronix.de> wrote:
> On Wed, 3 Jan 2018, Thomas Gleixner wrote:
>
> > On Wed, 3 Jan 2018, Lars Wendler wrote:
> > > On Wed, 3 Jan 2018 13:05:38 +0100 (CET)
> > > Thomas Gleixner <tglx@linutronix.de> wrote:
> > > > Also can you please try Linus v4.15-rc6 with PTI enabled so we
> > > > can see whether that's a backport issue or a general one?
> > >
> > > Same problem with 4.15-rc6. So I suppose that means it's a general
> > > issue.
> >
> > Just a shot in the dark as I just decoded another issue on an AMD
> > CPU. Can you please try the patch below?
>
> Ok. Found the real issue. This is a problem on AMD boxen.
>
> Fix below.
>
> Can Xen folks please have a look at that as well?
>
> Thanks,
>
> tglx
That indeed fixes the issue. Thank you!
Kind regards
Lars
* Re: CONFIG_PAGE_TABLE_ISOLATION=y on x86_64 causes gcc to segfault when building x86_32 binaries
2018-01-03 18:52 ` Thomas Gleixner
` (3 preceding siblings ...)
2018-01-03 23:46 ` CONFIG_PAGE_TABLE_ISOLATION=y on x86_64 causes gcc to segfault when building x86_32 binaries Lars Wendler
@ 2018-01-04 2:44 ` Boris Ostrovsky
4 siblings, 0 replies; 16+ messages in thread
From: Boris Ostrovsky @ 2018-01-04 2:44 UTC
To: Thomas Gleixner, Lars Wendler
Cc: LKML, x86, Borislav Betkov, Andy Lutomirski, Dave Hansen,
Peter Zijlstra, Greg KH, Laura Abbott, Juergen Gross
On 01/03/2018 01:52 PM, Thomas Gleixner wrote:
> On Wed, 3 Jan 2018, Thomas Gleixner wrote:
>
>> On Wed, 3 Jan 2018, Lars Wendler wrote:
>>> On Wed, 3 Jan 2018 13:05:38 +0100 (CET)
>>> Thomas Gleixner <tglx@linutronix.de> wrote:
>>>> Also can you please try Linus v4.15-rc6 with PTI enabled so we can see
>>>> whether that's a backport issue or a general one?
>>> Same problem with 4.15-rc6. So I suppose that means it's a general
>>> issue.
>> Just a shot in the dark as I just decoded another issue on an AMD CPU. Can
>> you please try the patch below?
> Ok. Found the real issue. This is a problem on AMD boxen.
>
> Fix below.
>
> Can Xen folks please have a look at that as well?
(Apologies for the delay)
This is not an issue for PV guests.
-boris
>
> Thanks,
>
> tglx
>
> 8<-------------------
>
> arch/x86/entry/entry_64_compat.S | 13 ++++++-------
> 1 file changed, 6 insertions(+), 7 deletions(-)
>
> --- a/arch/x86/entry/entry_64_compat.S
> +++ b/arch/x86/entry/entry_64_compat.S
> @@ -190,8 +190,13 @@ ENTRY(entry_SYSCALL_compat)
> /* Interrupts are off on entry. */
> swapgs
>
> - /* Stash user ESP and switch to the kernel stack. */
> + /* Stash user ESP */
> movl %esp, %r8d
> +
> + /* Use %rsp as scratch reg. User ESP is stashed in r8 */
> + SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
> +
> + /* Switch to the kernel stack */
> movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
>
> /* Construct struct pt_regs on stack */
> @@ -220,12 +225,6 @@ GLOBAL(entry_SYSCALL_compat_after_hwfram
> pushq $0 /* pt_regs->r15 = 0 */
>
> /*
> - * We just saved %rdi so it is safe to clobber. It is not
> - * preserved during the C calls inside TRACE_IRQS_OFF anyway.
> - */
> - SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
> -
> - /*
> * User mode is traced as though IRQs are on, and SYSENTER
> * turned them off.
> */