linux-kernel.vger.kernel.org archive mirror
* Re: stable/linux-4.14.y boot: 108 boots: 0 failed, 107 passed with 1 conflict (v4.14.11)
From: Thomas Gleixner @ 2018-01-03 10:36 UTC
  To: Guillaume Tucker
  Cc: Dave Hansen, Ingo Molnar, Greg Kroah-Hartman,
	kernel-build-reports, Matt Hart, stable, LKML, x86,
	Andy Lutomirski, Peter Zijlstra, Paolo Bonzini, qemu-devel

On Wed, 3 Jan 2018, Guillaume Tucker wrote:
> On 03/01/18 09:48, Thomas Gleixner wrote:
> > > Well, it turns out this is not exactly a conflict as there's a
> > > subtle difference between the qemu devices in lab-mhart and in
> > > lab-collabora.  The ones in lab-collabora are configured to use
> > > KVM, and it looks like the ones in lab-mhart aren't.
> > > 
> > > So this job with KVM enabled passes in lab-collabora:
> > > 
> > >    https://lava.collabora.co.uk/scheduler/job/1032358
> > > 
> > > but it fails if I tell LAVA (qemu) to disable KVM:
> > > 
> > >    https://lava.collabora.co.uk/scheduler/job/1032359
> > > 
> > > with the same panic as in lab-mhart.  It seems like it's failing
> > > to return from an interrupt:
> > > 
> > >    http://lava.streamtester.net/scheduler/job/87308
> > > 
> > >    [    2.678828]  ? native_iret+0x7/0x7
> > >    [    2.679208] WARNING: can't dereference iret registers at 00000000ffc66068 for ip page_fault+0x11/0x60
> > > 
> > > This triggered an automated bisection on kernelci.org, please see
> > > the results below.
> > > 
> > > I may run another bisection with this config enabled earlier in
> > > the history to track down the actual change in the code that
> > > introduced the issue, let me know if it's worth doing.
> > 
> > No, because before that commit not all pieces are in place.
> > 
> > Can you please try the failing kernel with pti=off on the command line?
> 
> It does boot with pti=off (and KVM disabled):
> 
>   https://lava.collabora.co.uk/scheduler/job/1032387

So it's a qemu issue. Added qemu folks on Cc.

> > I'll start a test instance here as well.
 
Thanks,

	tglx
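
The three boot configurations compared in this message differ only in
whether QEMU uses KVM and whether pti=off is on the kernel command line.
A minimal sketch of how such command lines could be assembled follows.
The exact QEMU arguments used by the LAVA jobs are not shown in the
thread, so the kernel image path, the console setting and the helper
name qemu_args are assumptions for illustration only. -enable-kvm,
-kernel, -append and -nographic are standard QEMU options, and without
-enable-kvm QEMU falls back to its TCG software emulation.

```python
import shlex


def qemu_args(kernel, use_kvm, pti_off=False, qemu_bin="qemu-system-x86_64"):
    """Build a QEMU argument list for a direct kernel boot (hypothetical helper)."""
    # Kernel command line; console=ttyS0 is an assumption, pti=off is the
    # parameter requested in the thread.
    cmdline = ["console=ttyS0"]
    if pti_off:
        cmdline.append("pti=off")
    args = [qemu_bin, "-nographic", "-kernel", kernel,
            "-append", " ".join(cmdline)]
    if use_kvm:
        # Without this option QEMU uses TCG emulation instead of KVM.
        args.append("-enable-kvm")
    return args


# The three cases discussed in the thread: (use_kvm, pti_off).
CONFIGS = {
    "kvm": (True, False),
    "no-kvm": (False, False),
    "no-kvm, pti=off": (False, True),
}

if __name__ == "__main__":
    for name, (use_kvm, pti_off) in CONFIGS.items():
        # "bzImage" is a placeholder path, not taken from the LAVA jobs.
        print(f"{name}: {shlex.join(qemu_args('bzImage', use_kvm, pti_off))}")
```

Per the reports quoted above, the KVM case passes, the non-KVM case
panics, and the non-KVM case with pti=off boots.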

* Re: stable/linux-4.14.y boot: 108 boots: 0 failed, 107 passed with 1 conflict (v4.14.11)
From: Paolo Bonzini @ 2018-01-03 11:14 UTC
  To: Thomas Gleixner, Guillaume Tucker
  Cc: Dave Hansen, Ingo Molnar, Greg Kroah-Hartman,
	kernel-build-reports, Matt Hart, stable, LKML, x86,
	Andy Lutomirski, Peter Zijlstra, qemu-devel

On 03/01/2018 11:36, Thomas Gleixner wrote:
> On Wed, 3 Jan 2018, Guillaume Tucker wrote:
>> On 03/01/18 09:48, Thomas Gleixner wrote:
>>>> Well, it turns out this is not exactly a conflict as there's a
>>>> subtle difference between the qemu devices in lab-mhart and in
>>>> lab-collabora.  The ones in lab-collabora are configured to use
>>>> KVM, and it looks like the ones in lab-mhart aren't.
>>>>
>>>> So this job with KVM enabled passes in lab-collabora:
>>>>
>>>>    https://lava.collabora.co.uk/scheduler/job/1032358
>>>>
>>>> but it fails if I tell LAVA (qemu) to disable KVM:
>>>>
>>>>    https://lava.collabora.co.uk/scheduler/job/1032359
>>>>
>>>> with the same panic as in lab-mhart.  It seems like it's failing
>>>> to return from an interrupt:
>>>>
>>>>    http://lava.streamtester.net/scheduler/job/87308
>>>>
>>>>    [    2.678828]  ? native_iret+0x7/0x7
>>>>    [    2.679208] WARNING: can't dereference iret registers at 00000000ffc66068 for ip page_fault+0x11/0x60
>>>>
>>>> This triggered an automated bisection on kernelci.org, please see
>>>> the results below.
>>>>
>>>> I may run another bisection with this config enabled earlier in
>>>> the history to track down the actual change in the code that
>>>> introduced the issue, let me know if it's worth doing.
>>>
>>> No, because before that commit not all pieces are in place.
>>>
>>> Can you please try the failing kernel with pti=off on the command line?
>>
>> It does boot with pti=off (and KVM disabled):
>>
>>   https://lava.collabora.co.uk/scheduler/job/1032387
> 
> So it's a qemu issue. Added qemu folks on Cc.

Reproduced, thanks.  I will look into it.

Paolo

* Re: stable/linux-4.14.y boot: 108 boots: 0 failed, 107 passed with 1 conflict (v4.14.11)
From: Thomas Gleixner @ 2018-01-03 19:55 UTC
  To: Paolo Bonzini
  Cc: Guillaume Tucker, Dave Hansen, Ingo Molnar, Greg Kroah-Hartman,
	kernel-build-reports, Matt Hart, stable, LKML, x86,
	Andy Lutomirski, Peter Zijlstra, qemu-devel

On Wed, 3 Jan 2018, Paolo Bonzini wrote:
> On 03/01/2018 11:36, Thomas Gleixner wrote:
> > On Wed, 3 Jan 2018, Guillaume Tucker wrote:
> >> On 03/01/18 09:48, Thomas Gleixner wrote:
> >>>> Well, it turns out this is not exactly a conflict as there's a
> >>>> subtle difference between the qemu devices in lab-mhart and in
> >>>> lab-collabora.  The ones in lab-collabora are configured to use
> >>>> KVM, and it looks like the ones in lab-mhart aren't.
> >>>>
> >>>> So this job with KVM enabled passes in lab-collabora:
> >>>>
> >>>>    https://lava.collabora.co.uk/scheduler/job/1032358
> >>>>
> >>>> but it fails if I tell LAVA (qemu) to disable KVM:
> >>>>
> >>>>    https://lava.collabora.co.uk/scheduler/job/1032359
> >>>>
> >>>> with the same panic as in lab-mhart.  It seems like it's failing
> >>>> to return from an interrupt:
> >>>>
> >>>>    http://lava.streamtester.net/scheduler/job/87308
> >>>>
> >>>>    [    2.678828]  ? native_iret+0x7/0x7
> >>>>    [    2.679208] WARNING: can't dereference iret registers at 00000000ffc66068 for ip page_fault+0x11/0x60
> >>>>
> >>>> This triggered an automated bisection on kernelci.org, please see
> >>>> the results below.
> >>>>
> >>>> I may run another bisection with this config enabled earlier in
> >>>> the history to track down the actual change in the code that
> >>>> introduced the issue, let me know if it's worth doing.
> >>>
> >>> No, because before that commit not all pieces are in place.
> >>>
> >>> Can you please try the failing kernel with pti=off on the command line?
> >>
> >> It does boot with pti=off (and KVM disabled):
> >>
> >>   https://lava.collabora.co.uk/scheduler/job/1032387
> > 
> > So it's a qemu issue. Added qemu folks on Cc.
> 
> Reproduced, thanks.  I will look into it.

I just noticed that the qemu instance emulates an AMD CPU.

We discovered an AMD-related issue which fits the problem you are seeing
today.

Can you try the patch below please?

Thanks,

	tglx

8<------------------

--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -190,8 +190,13 @@ ENTRY(entry_SYSCALL_compat)
 	/* Interrupts are off on entry. */
 	swapgs
 
-	/* Stash user ESP and switch to the kernel stack. */
+	/* Stash user ESP */
 	movl	%esp, %r8d
+
+	/* Use %rsp as scratch reg. User ESP is stashed in r8 */
+	SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
+	
+	/* Switch to the kernel stack */
 	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
 
 	/* Construct struct pt_regs on stack */
@@ -220,12 +225,6 @@ GLOBAL(entry_SYSCALL_compat_after_hwfram
 	pushq   $0			/* pt_regs->r15 = 0 */
 
 	/*
-	 * We just saved %rdi so it is safe to clobber.  It is not
-	 * preserved during the C calls inside TRACE_IRQS_OFF anyway.
-	 */
-	SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
-
-	/*
 	 * User mode is traced as though IRQs are on, and SYSENTER
 	 * turned them off.
 	 */
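
The patch moves SWITCH_TO_KERNEL_CR3 ahead of the load of
cpu_current_top_of_stack and uses %rsp as the scratch register, since
the user ESP is already stashed in %r8d at that point. The message does
not spell out the failure mode. One plausible reading is that with PTI
enabled the user page tables do not map the per-CPU variable or the
kernel stack, so touching them before the CR3 switch faults. The toy
model below illustrates only that ordering argument. The region names
and the contents of the two mapping sets are assumptions made for the
sketch, not taken from the kernel source.

```python
class PageFault(Exception):
    """Raised when a step touches memory not mapped by the active page tables."""


# Assumed mappings: the user page tables expose only the entry code,
# the kernel page tables expose everything the entry path needs.
USER_CR3 = frozenset({"entry_text"})
KERNEL_CR3 = frozenset({"entry_text", "percpu", "kernel_stack"})

# Each step is (name, region it touches). The CR3 switch touches no region.
BEFORE_PATCH = [
    ("swapgs", "entry_text"),
    ("stash_user_esp", "entry_text"),
    ("load_kernel_rsp", "percpu"),        # reads cpu_current_top_of_stack
    ("push_pt_regs", "kernel_stack"),
    ("switch_to_kernel_cr3", None),
]

AFTER_PATCH = [
    ("swapgs", "entry_text"),
    ("stash_user_esp", "entry_text"),
    ("switch_to_kernel_cr3", None),       # now done first, using %rsp as scratch
    ("load_kernel_rsp", "percpu"),
    ("push_pt_regs", "kernel_stack"),
]


def run(steps):
    """Execute steps starting on the user page tables; return the completed steps."""
    cr3 = USER_CR3
    done = []
    for name, region in steps:
        if name == "switch_to_kernel_cr3":
            cr3 = KERNEL_CR3
        elif region not in cr3:
            raise PageFault(f"{name} touches {region!r} before the CR3 switch")
        done.append(name)
    return done


if __name__ == "__main__":
    print("after patch:", run(AFTER_PATCH))
    try:
        run(BEFORE_PATCH)
    except PageFault as exc:
        print("before patch:", exc)
```

In this model the old ordering faults at the per-CPU load, while the
patched ordering runs to completion.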
