From: Pingfan Liu <kernelfans@gmail.com>
To: James Morse <james.morse@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>,
Kees Cook <keescook@chromium.org>,
Geoff Levand <geoff@infradead.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Mark Brown <broonie@kernel.org>,
Remi Denis-Courmont <remi.denis.courmont@huawei.com>,
Will Deacon <will@kernel.org>, Ard Biesheuvel <ardb@kernel.org>,
linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH] arm64/relocate_kernel: remove redundant but misleading code
Date: Fri, 7 Aug 2020 22:14:41 +0800 [thread overview]
Message-ID: <CAFgQCTtmEoeWwTNo75Z6ymXDBRCKRWzyYEu4x35teyHSW39Czw@mail.gmail.com> (raw)
In-Reply-To: <9b0da257-785b-90ba-de3c-b9ee9ccdeeba@arm.com>
Hi Morse,
Appreciate for your patient explanation. I have no experience of
arm/kvm and after a quick through, I still have some questions. Please
correct me if I am wrong.
On Thu, Aug 6, 2020 at 8:20 PM James Morse <james.morse@arm.com> wrote:
>
> Hi Liu,
>
> On 06/08/2020 09:26, Pingfan Liu wrote:
> > The kexec switch sequence looks like the following:
> > SYM_CODE_START(__cpu_soft_restart)
> > ...
> > pre_disable_mmu_workaround
> > msr sctlr_el1, x12
> > ...
> > br x8
> >
> > SYM_CODE_START(arm64_relocate_new_kernel)
> > ...
> > pre_disable_mmu_workaround
> > msr sctlr_el2, x0
> > ...
>
> > "msr sctlr_el2, x0" is misleading, because "br x8" jump to a physical
> > address, which has no entry in idmap.
>
> Even better: this code run from a copy allocated by kexec, its not in the idmap either.
Yes, looks better.
>
> See the memcpy() in machine_kexec().
>
>
> > It implies that MMU has already been fully off after "msr sctlr_el1, x12".
>
>
> > And according to "D12.2.101 SCTLR_EL2, System Control Register (EL2)" in
> > "ARM Architecture Reference Manual", actually, EL2&0 host accesses
> > to SCTLR_EL2 when using mnemonic SCTLR_EL1.
>
> Only when HCR_EL2.E2H is enabled. If linux booted at EL2 on a non-VHE system, SCTLR_EL1
> and SCTLR_EL2 are different registers, both of which are managed by linux/KVM.
Yeah. I have realized these factors when composing this patch, but not
sure anything is missing.
Kexec on arm is introduced by the following two commits
d28f6df arm64/kexec: Add core kexec support
f9076ec arm64: Add back cpu reset routines
And at d28f6df, the code snippet is
in kernel/cpu-reset.S,
ENTRY(__cpu_soft_restart)
/* Clear sctlr_el1 flags. */
mrs x12, sctlr_el1
ldr x13, =SCTLR_ELx_FLAGS
bic x12, x12, x13
msr sctlr_el1, x12
isb
cbz x0, 1f // el2_switch?
mov x0, #HVC_SOFT_RESTART
hvc #0 // no return
//--> as the note, I think for both EL1&0 host and guest, they
will never resume the following code
//--> so in the original patch, the rest code is only for EL2&0 host
1: mov x18, x1 // entry
mov x0, x2 // arg0
mov x1, x3 // arg1
mov x2, x4 // arg2
br x18
ENDPROC(__cpu_soft_restart)
And in kernel/hyp-stub.S
el1_sync:
mrs x30, esr_el2
lsr x30, x30, #ESR_ELx_EC_SHIFT
cmp x30, #ESR_ELx_EC_HVC64
b.ne 9f // Not an HVC trap
cmp x0, #HVC_GET_VECTORS
b.ne 1f
mrs x0, vbar_el2
b 9f
1: cmp x0, #HVC_SET_VECTORS
b.ne 2f
msr vbar_el2, x1
b 9f
2: cmp x0, #HVC_SOFT_RESTART
b.ne 3f
mov x0, x2
mov x2, x4
mov x4, x1
mov x1, x3
br x4 // no return
/* Someone called kvm_call_hyp() against the hyp-stub... */
3: mov x0, #ARM_EXCEPTION_HYP_GONE
9: eret
ENDPROC(el1_sync)
I think for a soft restart, it jumps to the new kernel by 'br x4'. But
it seems a bug, which does not clear I+C bits, not disable EL2
translation regime.
On the other hand, if it wants to resume execution immediately after
"hvc #0" in ENTRY(__cpu_soft_restart), it should eret by the
modification of "spsr_el2.M[3:0] = PSR_MODE_EL2h", the code originates
from EL1, but the rest code tries to modify EL2 registers.
By this way, the code can do the exact things you mentioned. But there
seems to be a missing of such mechanisms.
>
>
> > Hence removing the redundant but misleading code.
>
> This isn't the reason its redundant...
>
>
> > diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
> > index 4a18055..37721eb 100644
> > --- a/arch/arm64/kernel/cpu-reset.S
> > +++ b/arch/arm64/kernel/cpu-reset.S
> > @@ -35,6 +35,10 @@ SYM_CODE_START(__cpu_soft_restart)
> > mov_q x13, SCTLR_ELx_FLAGS
> > bic x12, x12, x13
> > pre_disable_mmu_workaround
> > + /*
> > + * either disable EL1&0 translation regime or disable EL2&0 translation
> > + * regime if HCR_EL2.E2H == 1
> > + */> msr sctlr_el1, x12
> > isb
>
> On a VHE system, yes the cpu-reset.S disables EL2&0 by writing to SCTLR_EL1
>
> But on a non-VHE system, that same code disabled EL1&0. cup-reset.S goes on to call
> HVC_SOFT_RESTART for EL2, which may be serviced by KVM or the hyp-stub. (or maybe
> something else that implements the hyp-stub api)
>
> For kexec, on non-VHE both EL1&0 and EL2 get disabled.
Yes. I see the latest code in SYM_CODE_START(__kvm_handle_stub_hvc)
does the things exactly as what you said.
>
>
> > diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
> > index 542d6ed..84eec95 100644
> > --- a/arch/arm64/kernel/relocate_kernel.S
> > +++ b/arch/arm64/kernel/relocate_kernel.S
> > @@ -36,18 +36,6 @@ SYM_CODE_START(arm64_relocate_new_kernel)
> > mov x14, xzr /* x14 = entry ptr */
> > mov x13, xzr /* x13 = copy dest */
> >
> > - /* Clear the sctlr_el2 flags. */
> > - mrs x0, CurrentEL
> > - cmp x0, #CurrentEL_EL2
> > - b.ne 1f
> > - mrs x0, sctlr_el2
> > - mov_q x1, SCTLR_ELx_FLAGS
> > - bic x0, x0, x1
> > - pre_disable_mmu_workaround
> > - msr sctlr_el2, x0
> > - isb
> > -1:
>
> I agree this doesn't disable the MMU anymore.
>
> This was originally kept to disable the I+C bits when Kdump interrupted KVM, but since KVM
> formalised the hyp-stub API, and has this exact sequence to back its HVC_SOFT_RESTART, it
> was only needed for the hyp-stub itself, which has no clue about these SCTLR_EL2 bits.
>
> HVC_SOFT_RESTART only says it needs to disable the MMU. See
> Documentation/virt/kvm/arm/hyp-abi.rst
>
>
> I think its fine to remove this, but the reason is because el2_setup doesn't set those
> bits, and KVM clears them when its unloaded, or has a HVC_SOFT_RESTART call.
>
> It might be worth updating the document, but we'd need to check the guarantee is the same
> on 32bit. I assume there is no out-of-tree user of the hyp-stub abi.
I have no idea about 32bit. Could you enlighten me?
Again, I am a fresh man on arm/kvm. Please forgive me if I make any
silly mistakes.
Thanks for your time.
Pingfan
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2020-08-07 14:16 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-08-06 8:26 [PATCH] arm64/relocate_kernel: remove redundant but misleading code Pingfan Liu
2020-08-06 12:20 ` James Morse
2020-08-07 14:14 ` Pingfan Liu [this message]
2020-08-12 14:21 ` Pingfan Liu
2020-08-25 17:56 ` James Morse
2020-08-28 8:38 ` Pingfan Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAFgQCTtmEoeWwTNo75Z6ymXDBRCKRWzyYEu4x35teyHSW39Czw@mail.gmail.com \
--to=kernelfans@gmail.com \
--cc=ardb@kernel.org \
--cc=broonie@kernel.org \
--cc=catalin.marinas@arm.com \
--cc=geoff@infradead.org \
--cc=james.morse@arm.com \
--cc=keescook@chromium.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=mark.rutland@arm.com \
--cc=remi.denis.courmont@huawei.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).