* Internal error: Oops - BUG() / kvm boot race - arm64 kpti patchset related
@ 2018-02-28 15:45 Paolo Pisati
  2018-02-28 15:51 ` Ard Biesheuvel
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Paolo Pisati @ 2018-02-28 15:45 UTC
  To: linux-arm-kernel

Reproducible on 4.16-rc3 and 4.4.20 using defconfig - the failure is
intermittent, but i could reproduce it 100% if i boot loop the kvm instance 
(it usually shows up in less than 10 iterations but i tested 32 boots before
marking it good).
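
A minimal sketch of the boot-loop test described above. It assumes a
caller-supplied function (hypothetically named boot_once here) that boots the
guest once and returns True on a clean boot; only the 32-boot threshold comes
from this message. The second helper assumes boots fail independently with a
fixed probability, which is an illustrative simplification.

```python
from typing import Callable, Optional


def boot_loop(boot_once: Callable[[], bool], iterations: int = 32) -> Optional[int]:
    """Boot repeatedly and stop at the first failure.

    Returns the 1-based index of the first failed boot, or None if all
    `iterations` boots succeeded (the kernel is then marked "good").
    `boot_once` is a hypothetical callable, not part of any real tool.
    """
    for attempt in range(1, iterations + 1):
        if not boot_once():
            return attempt
    return None


def miss_probability(p_fail: float, iterations: int = 32) -> float:
    """Chance that a bad kernel survives every boot and is wrongly marked good.

    Assumes each boot fails independently with probability `p_fail`.
    """
    return (1.0 - p_fail) ** iterations
```

As a purely hypothetical figure, a kernel that failed one boot in ten would
still pass 32 boots in a row roughly 3% of the time, which is why the loop
count matters when marking a commit good.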

I bisected it down to this interval in linux-4.14.y:

2feb36e arm64: kpti: Add ->enable callback to remap swapper using nG mappings
ee28fed arm64: mm: Permit transitioning from Global to Non-Global without BBM
6928820 arm64: kpti: Make use of nG dependent on arm64_kernel_unmapped_at_el0()
c98c8c2 arm64: Turn on KPTI only on CPUs that need it

c98c8c2 is good, 2feb36e is bad - couldn't bisect in between, since it didn't
boot there.
And yes, when i tested 2feb36e i applied the "el1 trashing fix" mentioned here:
https://www.spinics.net/lists/arm-kernel/msg636489.html
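
For the commits in between that do not boot, one option is to let
`git bisect run` skip them. The sketch below maps a test outcome to the exit
codes that `git bisect run` understands (0 = good, 125 = skip, any other code
from 1 to 127 = bad). The function name and its two inputs are hypothetical,
and telling "did not boot at all" apart from "hit the BUG() during boot" is
assumed to be possible from the console log.

```python
# Exit codes interpreted by `git bisect run`:
# 0 = good, 125 = skip this commit, any other code from 1 to 127 = bad.
GOOD, BAD, SKIP = 0, 1, 125


def classify(booted_to_userspace: bool, saw_bug_oops: bool) -> int:
    """Map one test outcome to a `git bisect run` exit code.

    Both arguments are hypothetical observations a wrapper script would
    extract from the guest console log.
    """
    if saw_bug_oops:
        return BAD   # the failure being hunted
    if not booted_to_userspace:
        return SKIP  # unrelated breakage: this commit cannot be tested
    return GOOD
```

A wrapper script would end with `sys.exit(classify(...))`. When commits are
skipped, git bisect reports a range of candidate commits rather than a single
first bad commit, which matches the interval listed above.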

Anything else i can do to help debug this?
-- 
bye,
p.

* Internal error: Oops - BUG() / kvm boot race - arm64 kpti patchset related
  2018-02-28 15:45 Internal error: Oops - BUG() / kvm boot race - arm64 kpti patchset related Paolo Pisati
@ 2018-02-28 15:51 ` Ard Biesheuvel
  2018-02-28 16:13   ` Paolo Pisati
  2018-02-28 15:51 ` Marc Zyngier
  2018-02-28 16:07 ` Will Deacon
  2 siblings, 1 reply; 10+ messages in thread
From: Ard Biesheuvel @ 2018-02-28 15:51 UTC
  To: linux-arm-kernel

On 28 February 2018 at 15:45, Paolo Pisati <p.pisati@gmail.com> wrote:
> Reproducible on 4.16-rc3 and 4.4.20 using defconfig - the failure is
> intermittent, but i could reproduce it 100% if i boot loop the kvm instance
> (it usually shows up in less than 10 iterations but i tested 32 boots before
> marking it good).
>
> I bisected it down to this interval in linux-4.14.y:
>
> 2feb36e arm64: kpti: Add ->enable callback to remap swapper using nG mappings
> ee28fed arm64: mm: Permit transitioning from Global to Non-Global without BBM
> 6928820 arm64: kpti: Make use of nG dependent on arm64_kernel_unmapped_at_el0()
> c98c8c2 arm64: Turn on KPTI only on CPUs that need it
>
> c98c8c2 is good, 2feb36e is bad - couldn't bisect in between, since it didn't
> boot there.
> And yes, when i tested 2feb36e i applied the "el1 trashing fix" mentioned here:
> https://www.spinics.net/lists/arm-kernel/msg636489.html
>
> Anything else i can do to help debug this?

First of all, v4.4.20 ?!? How on earth could that have anything to do with KPTI?

In any case, you could try whether this patch helps at all, or at
least makes your bisect less inconclusive.

https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/commit/?h=fixes/core&id=753e8abc36b2c966caea075db0c845563c8a19bf

* Internal error: Oops - BUG() / kvm boot race - arm64 kpti patchset related
  2018-02-28 15:45 Internal error: Oops - BUG() / kvm boot race - arm64 kpti patchset related Paolo Pisati
  2018-02-28 15:51 ` Ard Biesheuvel
@ 2018-02-28 15:51 ` Marc Zyngier
  2018-02-28 16:15   ` Paolo Pisati
  2018-02-28 16:07 ` Will Deacon
  2 siblings, 1 reply; 10+ messages in thread
From: Marc Zyngier @ 2018-02-28 15:51 UTC
  To: linux-arm-kernel

Paolo,

On 28/02/18 15:45, Paolo Pisati wrote:
> Reproducible on 4.16-rc3 and 4.4.20 using defconfig - the failure is
> intermittent, but i could reproduce it 100% if i boot loop the kvm instance 
> (it usually shows up in less than 10 iterations but i tested 32 boots before
> marking it good).
> 
> I bisected it down to this interval in linux-4.14.y:
> 
> 2feb36e arm64: kpti: Add ->enable callback to remap swapper using nG mappings
> ee28fed arm64: mm: Permit transitioning from Global to Non-Global without BBM
> 6928820 arm64: kpti: Make use of nG dependent on arm64_kernel_unmapped_at_el0()
> c98c8c2 arm64: Turn on KPTI only on CPUs that need it
> 
> c98c8c2 is good, 2feb36e is bad - couldn't bisect in between, since it didn't
> boot there.
> And yes, when i tested 2feb36e i applied the "el1 trashing fix" mentioned here:
> https://www.spinics.net/lists/arm-kernel/msg636489.html
> 
> Anything else i can do to help debug this?

What HW are you using? Your command line? Your configuration? How are
you rebooting your guest (to EFI? directly to the kernel itself?)? How
come it didn't boot between these 4 commits? Does the failure affect
the host or the guest?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

* Internal error: Oops - BUG() / kvm boot race - arm64 kpti patchset related
  2018-02-28 15:45 Internal error: Oops - BUG() / kvm boot race - arm64 kpti patchset related Paolo Pisati
  2018-02-28 15:51 ` Ard Biesheuvel
  2018-02-28 15:51 ` Marc Zyngier
@ 2018-02-28 16:07 ` Will Deacon
  2018-02-28 16:12   ` Paolo Pisati
  2 siblings, 1 reply; 10+ messages in thread
From: Will Deacon @ 2018-02-28 16:07 UTC
  To: linux-arm-kernel

On Wed, Feb 28, 2018 at 04:45:23PM +0100, Paolo Pisati wrote:
> Reproducible on 4.16-rc3 and 4.4.20 using defconfig - the failure is
> intermittent, but i could reproduce it 100% if i boot loop the kvm instance 
> (it usually shows up in less than 10 iterations but i tested 32 boots before
> marking it good).

[...]

> Anything else i can do to help debug this?

If you could share the crash log, that would be helpful.

Cheers,

Will

* Internal error: Oops - BUG() / kvm boot race - arm64 kpti patchset related
  2018-02-28 16:07 ` Will Deacon
@ 2018-02-28 16:12   ` Paolo Pisati
  2018-02-28 16:18     ` Will Deacon
  2018-02-28 16:18     ` Mark Rutland
  0 siblings, 2 replies; 10+ messages in thread
From: Paolo Pisati @ 2018-02-28 16:12 UTC
  To: linux-arm-kernel

On Wed, Feb 28, 2018 at 04:07:35PM +0000, Will Deacon wrote:
> On Wed, Feb 28, 2018 at 04:45:23PM +0100, Paolo Pisati wrote:
> > Reproducible on 4.16-rc3 and 4.4.20 using defconfig - the failure is
> > intermittent, but i could reproduce it 100% if i boot loop the kvm instance 
> > (it usually shows up in less than 10 iterations but i tested 32 boots before
> > marking it good).
> 
> [...]
> 
> > Anything else i can do to help debug this?
> 
> If you could share the crash log, that would be helpful.

Oops log here:
https://launchpadlibrarian.net/358687166/submission_2018-02-26T14.58.17.996524.html#9-1-log

I prepared the email, then when i was about to send it out i inadvertently cut out the
crash...
-- 
bye,
p.

* Internal error: Oops - BUG() / kvm boot race - arm64 kpti patchset related
  2018-02-28 15:51 ` Ard Biesheuvel
@ 2018-02-28 16:13   ` Paolo Pisati
  0 siblings, 0 replies; 10+ messages in thread
From: Paolo Pisati @ 2018-02-28 16:13 UTC
  To: linux-arm-kernel

On Wed, Feb 28, 2018 at 03:51:09PM +0000, Ard Biesheuvel wrote:
> 
> First of all, v4.4.20 ?!? How on earth could that have anything to do with KPTI?

Sorry, i meant 4.14.20...

> 
> In any case, you could try whether this patch helps at all, or at
> least makes your bisect less inconclusive.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/commit/?h=fixes/core&id=753e8abc36b2c966caea075db0c845563c8a19bf

-- 
bye,
p.

* Internal error: Oops - BUG() / kvm boot race - arm64 kpti patchset related
  2018-02-28 15:51 ` Marc Zyngier
@ 2018-02-28 16:15   ` Paolo Pisati
  0 siblings, 0 replies; 10+ messages in thread
From: Paolo Pisati @ 2018-02-28 16:15 UTC
  To: linux-arm-kernel

On Wed, Feb 28, 2018 at 03:51:55PM +0000, Marc Zyngier wrote:
> 
> What HW are you using? Your command line? Your configuration? How are
> you rebooting your guest (to EFI? directly to the kernel itself?)? How
> come it didn't boot between these 4 commits? Does the failure affect
> the host or the guest?

I inadvertently cut out the crash log from the original email, hope this helps:

https://launchpadlibrarian.net/358687166/submission_2018-02-26T14.58.17.996524.html#9-1-log
-- 
bye,
p.

* Internal error: Oops - BUG() / kvm boot race - arm64 kpti patchset related
  2018-02-28 16:12   ` Paolo Pisati
@ 2018-02-28 16:18     ` Will Deacon
  2018-02-28 16:25       ` Paolo Pisati
  2018-02-28 16:18     ` Mark Rutland
  1 sibling, 1 reply; 10+ messages in thread
From: Will Deacon @ 2018-02-28 16:18 UTC
  To: linux-arm-kernel

On Wed, Feb 28, 2018 at 05:12:56PM +0100, Paolo Pisati wrote:
> On Wed, Feb 28, 2018 at 04:07:35PM +0000, Will Deacon wrote:
> > On Wed, Feb 28, 2018 at 04:45:23PM +0100, Paolo Pisati wrote:
> > > Reproducible on 4.16-rc3 and 4.4.20 using defconfig - the failure is
> > > intermittent, but i could reproduce it 100% if i boot loop the kvm instance 
> > > (it usually shows up in less than 10 iterations but i tested 32 boots before
> > > marking it good).
> > 
> > [...]
> > 
> > > Anything else i can do to help debug this?
> > 
> > If you could share the crash log, that would be helpful.
> 
> Oops log here:
> https://launchpadlibrarian.net/358687166/submission_2018-02-26T14.58.17.996524.html#9-1-log

The patch that Ard linked to should resolve that particular crash.
I see this is a 4.13-based kernel. Is the source tree available anywhere?

Will

* Internal error: Oops - BUG() / kvm boot race - arm64 kpti patchset related
  2018-02-28 16:12   ` Paolo Pisati
  2018-02-28 16:18     ` Will Deacon
@ 2018-02-28 16:18     ` Mark Rutland
  1 sibling, 0 replies; 10+ messages in thread
From: Mark Rutland @ 2018-02-28 16:18 UTC
  To: linux-arm-kernel

On Wed, Feb 28, 2018 at 05:12:56PM +0100, Paolo Pisati wrote:
> On Wed, Feb 28, 2018 at 04:07:35PM +0000, Will Deacon wrote:
> > On Wed, Feb 28, 2018 at 04:45:23PM +0100, Paolo Pisati wrote:
> > > Reproducible on 4.16-rc3 and 4.4.20 using defconfig - the failure is
> > > intermittent, but i could reproduce it 100% if i boot loop the kvm instance 
> > > (it usually shows up in less than 10 iterations but i tested 32 boots before
> > > marking it good).
> > 
> > [...]
> > 
> > > Anything else i can do to help debug this?
> > 
> > If you could share the crash log, that would be helpful.
> 
> Oops log here:
> https://launchpadlibrarian.net/358687166/submission_2018-02-26T14.58.17.996524.html#9-1-log

In your tree, does arch/arm64/mm/mmu.c line 142 look like:

  BUG_ON(!pgattr_change_is_safe(...)) ?

If so, this looks like the issue Ard fixed in:

  https://lkml.kernel.org/r/20180223180448.6006-1-ard.biesheuvel@linaro.org

Thanks,
Mark.

* Internal error: Oops - BUG() / kvm boot race - arm64 kpti patchset related
  2018-02-28 16:18     ` Will Deacon
@ 2018-02-28 16:25       ` Paolo Pisati
  0 siblings, 0 replies; 10+ messages in thread
From: Paolo Pisati @ 2018-02-28 16:25 UTC
  To: linux-arm-kernel

On Wed, Feb 28, 2018 at 04:18:33PM +0000, Will Deacon wrote:
> > 
> > Oops log here:
> > https://launchpadlibrarian.net/358687166/submission_2018-02-26T14.58.17.996524.html#9-1-log
> 
> The patch that Ard linked to should resolve that particular crash.
> I see this is a 4.13-based kernel. Is the source tree available anywhere?

https://git.launchpad.net/~p-pisati/ubuntu/+source/linux/log/?h=artful-master-next-arm64-kpti-414-backport

I'll try that patch, thanks.
-- 
bye,
p.
