All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marc Zyngier <marc.zyngier@arm.com>
To: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Kyle McMartin <kmcmarti@redhat.com>,
	Catalin Marinas <Catalin.Marinas@arm.com>,
	Will Deacon <Will.Deacon@arm.com>,
	Mark Rutland <Mark.Rutland@arm.com>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"linaro-kernel@lists.linaro.org" <linaro-kernel@lists.linaro.org>,
	"geoff@infradead.org" <geoff@infradead.org>,
	"kexec@lists.infradead.org" <kexec@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"broonie@kernel.org" <broonie@kernel.org>,
	"david.griego@linaro.org" <david.griego@linaro.org>,
	"christoffer.dall@linaro.org" <christoffer.dall@linaro.org>,
	"freddy77@gmail.com" <freddy77@gmail.com>
Subject: Re: [RFC v2 0/5] arm64: kvm: reset hyp context for kexec
Date: Tue, 31 Mar 2015 08:31:59 +0100	[thread overview]
Message-ID: <20150331083159.7c3f78d4@arm.com> (raw)
In-Reply-To: <551A38FC.1040801@linaro.org>

On Tue, 31 Mar 2015 07:04:44 +0100
AKASHI Takahiro <takahiro.akashi@linaro.org> wrote:

Hi Takahiro,

> Marc,
> 
> On 03/30/2015 05:54 PM, AKASHI Takahiro wrote:
> > On 03/30/2015 04:16 PM, Marc Zyngier wrote:
> >> On Mon, 30 Mar 2015 02:39:53 +0100
> >> AKASHI Takahiro <takahiro.akashi@linaro.org> wrote:
> >>
> >>> On 03/28/2015 02:40 AM, Kyle McMartin wrote:
> >>>> On Fri, Mar 27, 2015 at 03:37:04PM +0000, Marc Zyngier wrote:
> >>>>>> [  236.260863] Kernel panic - not syncing: HYP panic:
> >>>>>> [  236.260863] PS:600003c9 PC:000003ffffff0830 ESR:0000000096000006
> >>>>>
> >>>>> It would be interesting if you could find out what you have at offset
> >>>>> 0x830 of hyp-init.o (the stack trace is for EL1, and is not going to
> >>>>> help much).
> >>>>>
> >>>>
> >>>> Given the alignment, i'm going to assume i'm looking at the right thing:
> >>>>
> >>>> 0000000000000820 <__kvm_hyp_reset>:
> >>>>        820:       d51c2000        msr     ttbr0_el2, x0
> >>>>        824:       d5033fdf        isb
> >>>>        828:       d50c871f        tlbi    alle2
> >>>>        82c:       d5033f9f        dsb     sy
> >>>>        830:       10000060        adr     x0, 83c <__kvm_hyp_reset+0x1c>
> >>>>        834:       b3403c01        bfxil   x1, x0, #0, #16
> >>>>        838:       d61f0020        br      x1
> >>>>        83c:       d53c1000        mrs     x0, sctlr_el2
> >>>>
> >>>> but it seems fairly implausible to be trapping on ADR x0, 1f...
> >>>
> >>>
> >>> I've never seen this panic on fast model...
> >>>
> >>> ESR shows that
> >>>     - Exception class: Data abort taken without a change in Exception level
> >>>     - Data fault status code: Translation fault at EL2
> >>>
> >>> and FAR seems not to be a proper address.
> >>
> >> ... which is consistent with what we're seeing here (data fault on
> >> something that doesn't generate a load/store). I'm pretty sure the
> >> page tables are screwed.
> >>
> >> Have you tested it with 64k pages?
> >
> > Hmm... It seems that I was able to reproduce the problem if 64k pages enabled.
> 
> The entry address in trampoline code calc'ed by kvm_virt_to_trampoline(__kvm_hyp_reset)
> seems to be wrong due to improper page-alignment in hyp-init.S.
> The following patch fixed this problem, at least, in my environment(fast model).
> (I don't know why it's PAGE_SHIFT - 1, not PAGE_SHIFT.)
> 
>  >diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
>  >index d212990..45b8d98 100644
>  >--- a/arch/arm64/kvm/hyp-init.S
>  >+++ b/arch/arm64/kvm/hyp-init.S
>  >@@ -24,7 +24,7 @@
>  >        .text
>  >        .pushsection    .hyp.idmap.text, "ax"
>  >
>  >-       .align  11
>  >+       .align  (PAGE_SHIFT - 1)

I'm afraid this is wrong. This alignment is for the vectors (which have
to be aligned on a 2kB boundary, hence the ".align 11"), not for the
code. Aligning it on a 32kB boundary doesn't make any sense, and just
hides the bug.

I bet that without this hack, the hyp-init code is spread across two
64kB pages, and the kernel generates a bounce page for this code. By
changing the alignment, you just end up having the code to fit in a
single page, and no bounce page gets generated.

If I'm right above the above, it means that you're computing something
against the wrong base. Can you please verify this scenario?

Now, the good news is that Ard is removing the bounce page from the KVM
code (for unrelated reasons), and this may just be the saving grace.

>  >
>  > ENTRY(__kvm_hyp_init)
>  >        ventry  __invalid               // Synchronous EL2t
> 
> 
> After applying this patch, I got another problem with kexec-tools on 64k page kernel,
> but I've already modified kexec-tools.

The idea that userspace behavior is dependent on the kernel page size
is deeply worrying...

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny.

WARNING: multiple messages have this Message-ID (diff)
From: marc.zyngier@arm.com (Marc Zyngier)
To: linux-arm-kernel@lists.infradead.org
Subject: [RFC v2 0/5] arm64: kvm: reset hyp context for kexec
Date: Tue, 31 Mar 2015 08:31:59 +0100	[thread overview]
Message-ID: <20150331083159.7c3f78d4@arm.com> (raw)
In-Reply-To: <551A38FC.1040801@linaro.org>

On Tue, 31 Mar 2015 07:04:44 +0100
AKASHI Takahiro <takahiro.akashi@linaro.org> wrote:

Hi Takahiro,

> Marc,
> 
> On 03/30/2015 05:54 PM, AKASHI Takahiro wrote:
> > On 03/30/2015 04:16 PM, Marc Zyngier wrote:
> >> On Mon, 30 Mar 2015 02:39:53 +0100
> >> AKASHI Takahiro <takahiro.akashi@linaro.org> wrote:
> >>
> >>> On 03/28/2015 02:40 AM, Kyle McMartin wrote:
> >>>> On Fri, Mar 27, 2015 at 03:37:04PM +0000, Marc Zyngier wrote:
> >>>>>> [  236.260863] Kernel panic - not syncing: HYP panic:
> >>>>>> [  236.260863] PS:600003c9 PC:000003ffffff0830 ESR:0000000096000006
> >>>>>
> >>>>> It would be interesting if you could find out what you have at offset
> >>>>> 0x830 of hyp-init.o (the stack trace is for EL1, and is not going to
> >>>>> help much).
> >>>>>
> >>>>
> >>>> Given the alignment, i'm going to assume i'm looking at the right thing:
> >>>>
> >>>> 0000000000000820 <__kvm_hyp_reset>:
> >>>>        820:       d51c2000        msr     ttbr0_el2, x0
> >>>>        824:       d5033fdf        isb
> >>>>        828:       d50c871f        tlbi    alle2
> >>>>        82c:       d5033f9f        dsb     sy
> >>>>        830:       10000060        adr     x0, 83c <__kvm_hyp_reset+0x1c>
> >>>>        834:       b3403c01        bfxil   x1, x0, #0, #16
> >>>>        838:       d61f0020        br      x1
> >>>>        83c:       d53c1000        mrs     x0, sctlr_el2
> >>>>
> >>>> but it seems fairly implausible to be trapping on ADR x0, 1f...
> >>>
> >>>
> >>> I've never seen this panic on fast model...
> >>>
> >>> ESR shows that
> >>>     - Exception class: Data abort taken without a change in Exception level
> >>>     - Data fault status code: Translation fault at EL2
> >>>
> >>> and FAR seems not to be a proper address.
> >>
> >> ... which is consistent with what we're seeing here (data fault on
> >> something that doesn't generate a load/store). I'm pretty sure the
> >> page tables are screwed.
> >>
> >> Have you tested it with 64k pages?
> >
> > Hmm... It seems that I was able to reproduce the problem if 64k pages enabled.
> 
> The entry address in trampoline code calc'ed by kvm_virt_to_trampoline(__kvm_hyp_reset)
> seems to be wrong due to improper page-alignment in hyp-init.S.
> The following patch fixed this problem, at least, in my environment(fast model).
> (I don't know why it's PAGE_SHIFT - 1, not PAGE_SHIFT.)
> 
>  >diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
>  >index d212990..45b8d98 100644
>  >--- a/arch/arm64/kvm/hyp-init.S
>  >+++ b/arch/arm64/kvm/hyp-init.S
>  >@@ -24,7 +24,7 @@
>  >        .text
>  >        .pushsection    .hyp.idmap.text, "ax"
>  >
>  >-       .align  11
>  >+       .align  (PAGE_SHIFT - 1)

I'm afraid this is wrong. This alignment is for the vectors (which have
to be aligned on a 2kB boundary, hence the ".align 11"), not for the
code. Aligning it on a 32kB boundary doesn't make any sense, and just
hides the bug.

I bet that without this hack, the hyp-init code is spread across two
64kB pages, and the kernel generates a bounce page for this code. By
changing the alignment, you just end up having the code to fit in a
single page, and no bounce page gets generated.

If I'm right above the above, it means that you're computing something
against the wrong base. Can you please verify this scenario?

Now, the good news is that Ard is removing the bounce page from the KVM
code (for unrelated reasons), and this may just be the saving grace.

>  >
>  > ENTRY(__kvm_hyp_init)
>  >        ventry  __invalid               // Synchronous EL2t
> 
> 
> After applying this patch, I got another problem with kexec-tools on 64k page kernel,
> but I've already modified kexec-tools.

The idea that userspace behavior is dependent on the kernel page size
is deeply worrying...

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny.

WARNING: multiple messages have this Message-ID (diff)
From: Marc Zyngier <marc.zyngier@arm.com>
To: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Mark Rutland <Mark.Rutland@arm.com>,
	"linaro-kernel@lists.linaro.org" <linaro-kernel@lists.linaro.org>,
	"christoffer.dall@linaro.org" <christoffer.dall@linaro.org>,
	"geoff@infradead.org" <geoff@infradead.org>,
	Catalin Marinas <Catalin.Marinas@arm.com>,
	Will Deacon <Will.Deacon@arm.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"freddy77@gmail.com" <freddy77@gmail.com>,
	"broonie@kernel.org" <broonie@kernel.org>,
	Kyle McMartin <kmcmarti@redhat.com>,
	"kexec@lists.infradead.org" <kexec@lists.infradead.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"david.griego@linaro.org" <david.griego@linaro.org>
Subject: Re: [RFC v2 0/5] arm64: kvm: reset hyp context for kexec
Date: Tue, 31 Mar 2015 08:31:59 +0100	[thread overview]
Message-ID: <20150331083159.7c3f78d4@arm.com> (raw)
In-Reply-To: <551A38FC.1040801@linaro.org>

On Tue, 31 Mar 2015 07:04:44 +0100
AKASHI Takahiro <takahiro.akashi@linaro.org> wrote:

Hi Takahiro,

> Marc,
> 
> On 03/30/2015 05:54 PM, AKASHI Takahiro wrote:
> > On 03/30/2015 04:16 PM, Marc Zyngier wrote:
> >> On Mon, 30 Mar 2015 02:39:53 +0100
> >> AKASHI Takahiro <takahiro.akashi@linaro.org> wrote:
> >>
> >>> On 03/28/2015 02:40 AM, Kyle McMartin wrote:
> >>>> On Fri, Mar 27, 2015 at 03:37:04PM +0000, Marc Zyngier wrote:
> >>>>>> [  236.260863] Kernel panic - not syncing: HYP panic:
> >>>>>> [  236.260863] PS:600003c9 PC:000003ffffff0830 ESR:0000000096000006
> >>>>>
> >>>>> It would be interesting if you could find out what you have at offset
> >>>>> 0x830 of hyp-init.o (the stack trace is for EL1, and is not going to
> >>>>> help much).
> >>>>>
> >>>>
> >>>> Given the alignment, i'm going to assume i'm looking at the right thing:
> >>>>
> >>>> 0000000000000820 <__kvm_hyp_reset>:
> >>>>        820:       d51c2000        msr     ttbr0_el2, x0
> >>>>        824:       d5033fdf        isb
> >>>>        828:       d50c871f        tlbi    alle2
> >>>>        82c:       d5033f9f        dsb     sy
> >>>>        830:       10000060        adr     x0, 83c <__kvm_hyp_reset+0x1c>
> >>>>        834:       b3403c01        bfxil   x1, x0, #0, #16
> >>>>        838:       d61f0020        br      x1
> >>>>        83c:       d53c1000        mrs     x0, sctlr_el2
> >>>>
> >>>> but it seems fairly implausible to be trapping on ADR x0, 1f...
> >>>
> >>>
> >>> I've never seen this panic on fast model...
> >>>
> >>> ESR shows that
> >>>     - Exception class: Data abort taken without a change in Exception level
> >>>     - Data fault status code: Translation fault at EL2
> >>>
> >>> and FAR seems not to be a proper address.
> >>
> >> ... which is consistent with what we're seeing here (data fault on
> >> something that doesn't generate a load/store). I'm pretty sure the
> >> page tables are screwed.
> >>
> >> Have you tested it with 64k pages?
> >
> > Hmm... It seems that I was able to reproduce the problem if 64k pages enabled.
> 
> The entry address in trampoline code calc'ed by kvm_virt_to_trampoline(__kvm_hyp_reset)
> seems to be wrong due to improper page-alignment in hyp-init.S.
> The following patch fixed this problem, at least, in my environment(fast model).
> (I don't know why it's PAGE_SHIFT - 1, not PAGE_SHIFT.)
> 
>  >diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
>  >index d212990..45b8d98 100644
>  >--- a/arch/arm64/kvm/hyp-init.S
>  >+++ b/arch/arm64/kvm/hyp-init.S
>  >@@ -24,7 +24,7 @@
>  >        .text
>  >        .pushsection    .hyp.idmap.text, "ax"
>  >
>  >-       .align  11
>  >+       .align  (PAGE_SHIFT - 1)

I'm afraid this is wrong. This alignment is for the vectors (which have
to be aligned on a 2kB boundary, hence the ".align 11"), not for the
code. Aligning it on a 32kB boundary doesn't make any sense, and just
hides the bug.

I bet that without this hack, the hyp-init code is spread across two
64kB pages, and the kernel generates a bounce page for this code. By
changing the alignment, you just end up having the code to fit in a
single page, and no bounce page gets generated.

If I'm right above the above, it means that you're computing something
against the wrong base. Can you please verify this scenario?

Now, the good news is that Ard is removing the bounce page from the KVM
code (for unrelated reasons), and this may just be the saving grace.

>  >
>  > ENTRY(__kvm_hyp_init)
>  >        ventry  __invalid               // Synchronous EL2t
> 
> 
> After applying this patch, I got another problem with kexec-tools on 64k page kernel,
> but I've already modified kexec-tools.

The idea that userspace behavior is dependent on the kernel page size
is deeply worrying...

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny.

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

  reply	other threads:[~2015-03-31  7:32 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-26  8:25 [RFC v2 0/5] arm64: kvm: reset hyp context for kexec AKASHI Takahiro
2015-03-26  8:25 ` AKASHI Takahiro
2015-03-26  8:25 ` AKASHI Takahiro
2015-03-26  8:25 ` [RFC v2 1/5] arm64: kvm: add a cpu tear-down function AKASHI Takahiro
2015-03-26  8:25   ` AKASHI Takahiro
2015-03-26  8:25   ` AKASHI Takahiro
2015-03-26  8:25 ` [RFC v2 2/5] arm64: kvm: allow EL2 context to be reset on shutdown AKASHI Takahiro
2015-03-26  8:25   ` AKASHI Takahiro
2015-03-26  8:25   ` AKASHI Takahiro
2015-03-26  8:25 ` [RFC v2 3/5] arm64: kvm: add cpu reset hook for cpu hotplug AKASHI Takahiro
2015-03-26  8:25   ` AKASHI Takahiro
2015-03-26  8:25   ` AKASHI Takahiro
2015-03-26  8:25 ` [RFC v2 4/5] arm64: kvm: add cpu reset at module exit AKASHI Takahiro
2015-03-26  8:25   ` AKASHI Takahiro
2015-03-26  8:25   ` AKASHI Takahiro
2015-03-26  8:25 ` [RFC v2 5/5] arm: kvm: add stub implementation for kvm_cpu_reset() AKASHI Takahiro
2015-03-26  8:25   ` AKASHI Takahiro
2015-03-26  8:25   ` AKASHI Takahiro
2015-03-27 15:31 ` [RFC v2 0/5] arm64: kvm: reset hyp context for kexec Kyle McMartin
2015-03-27 15:31   ` Kyle McMartin
2015-03-27 15:31   ` Kyle McMartin
2015-03-27 15:37   ` Marc Zyngier
2015-03-27 15:37     ` Marc Zyngier
2015-03-27 15:37     ` Marc Zyngier
2015-03-27 17:40     ` Kyle McMartin
2015-03-27 17:40       ` Kyle McMartin
2015-03-27 17:40       ` Kyle McMartin
2015-03-27 17:50       ` Marc Zyngier
2015-03-27 17:50         ` Marc Zyngier
2015-03-27 17:50         ` Marc Zyngier
2015-03-30  1:39       ` AKASHI Takahiro
2015-03-30  1:39         ` AKASHI Takahiro
2015-03-30  1:39         ` AKASHI Takahiro
2015-03-30  7:16         ` Marc Zyngier
2015-03-30  7:16           ` Marc Zyngier
2015-03-30  7:16           ` Marc Zyngier
2015-03-30  8:54           ` AKASHI Takahiro
2015-03-30  8:54             ` AKASHI Takahiro
2015-03-30  8:54             ` AKASHI Takahiro
2015-03-31  6:04             ` AKASHI Takahiro
2015-03-31  6:04               ` AKASHI Takahiro
2015-03-31  6:04               ` AKASHI Takahiro
2015-03-31  7:31               ` Marc Zyngier [this message]
2015-03-31  7:31                 ` Marc Zyngier
2015-03-31  7:31                 ` Marc Zyngier
2015-04-01  5:06                 ` AKASHI Takahiro
2015-04-01  5:06                   ` AKASHI Takahiro
2015-04-01  5:06                   ` AKASHI Takahiro
2015-04-01  9:05                   ` Marc Zyngier
2015-04-01  9:05                     ` Marc Zyngier
2015-04-01  9:05                     ` Marc Zyngier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150331083159.7c3f78d4@arm.com \
    --to=marc.zyngier@arm.com \
    --cc=Catalin.Marinas@arm.com \
    --cc=Mark.Rutland@arm.com \
    --cc=Will.Deacon@arm.com \
    --cc=broonie@kernel.org \
    --cc=christoffer.dall@linaro.org \
    --cc=david.griego@linaro.org \
    --cc=freddy77@gmail.com \
    --cc=geoff@infradead.org \
    --cc=kexec@lists.infradead.org \
    --cc=kmcmarti@redhat.com \
    --cc=linaro-kernel@lists.linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=takahiro.akashi@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.