On 7/18/19 1:13 PM, Tomasz Nowicki wrote:
> Hello Alex,
>
> On 09.07.2019 15:20, Alexandru Elisei wrote:
>> On 6/21/19 10:38 AM, Marc Zyngier wrote:
>>> From: Jintack Lim
>>>
>>> When supporting nested virtualization a guest hypervisor executing AT
>>> instructions must be trapped and emulated by the host hypervisor,
>>> because untrapped AT instructions operating on S1E1 will use the wrong
>>> translation regieme (the one used to emulate virtual EL2 in EL1 instead
>> I think that should be "regime".
>>
>>> of virtual EL1) and AT instructions operating on S12 will not work from
>>> EL1.
>>>
>>> This patch does several things.
>>>
>>> 1. List and define all AT system instructions to emulate and document
>>> the emulation design.
>>>
>>> 2. Implement AT instruction handling logic in EL2. This will be used to
>>> emulate AT instructions executed in the virtual EL2.
>>>
>>> AT instruction emulation works by loading the proper processor
>>> context, which depends on the trapped instruction and the virtual
>>> HCR_EL2, to the EL1 virtual memory control registers and executing AT
>>> instructions. Note that ctxt->hw_sys_regs is expected to have the
>>> proper processor context before calling the handling
>>> function(__kvm_at_insn) implemented in this patch.
>>>
>>> 4. Emulate AT S1E[01] instructions by issuing the same instructions in
>>> EL2. We set the physical EL1 registers, NV and NV1 bits as described in
>>> the AT instruction emulation overview.
>> Is item number 3 missing, or is that the result of an unfortunate typo?
>>
>>> 5. Emulate AT A12E[01] instructions in two steps: First, do the stage-1
>>> translation by reusing the existing AT emulation functions. Second, do
>>> the stage-2 translation by walking the guest hypervisor's stage-2 page
>>> table in software. Record the translation result to PAR_EL1.
>>>
>>> 6. Emulate AT S1E2 instructions by issuing the corresponding S1E1
>>> instructions in EL2.
>>> We set the physical EL1 registers and the HCR_EL2
>>> register as described in the AT instruction emulation overview.
>>>
>>> 7. Forward system instruction traps to the virtual EL2 if the corresponding
>>> virtual AT bit is set in the virtual HCR_EL2.
>>>
>>> [ Much logic above has been reworked by Marc Zyngier ]
>>>
>>> Signed-off-by: Jintack Lim
>>> Signed-off-by: Marc Zyngier
>>> Signed-off-by: Christoffer Dall
>>> ---
>>> arch/arm64/include/asm/kvm_arm.h |   2 +
>>> arch/arm64/include/asm/kvm_asm.h |   2 +
>>> arch/arm64/include/asm/sysreg.h  |  17 +++
>>> arch/arm64/kvm/hyp/Makefile      |   1 +
>>> arch/arm64/kvm/hyp/at.c          | 217 +++++++++++++++++++++++++++++++
>>> arch/arm64/kvm/hyp/switch.c      |  13 +-
>>> arch/arm64/kvm/sys_regs.c        | 202 +++++++++++++++++++++++++++-
>>> 7 files changed, 450 insertions(+), 4 deletions(-)
>>> create mode 100644 arch/arm64/kvm/hyp/at.c
>>>
> [...]
>
>>> +
>>> +void __kvm_at_s1e01(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
>>> +{
>>> +	struct kvm_cpu_context *ctxt = &vcpu->arch.ctxt;
>>> +	struct mmu_config config;
>>> +	struct kvm_s2_mmu *mmu;
>>> +
>>> +	/*
>>> +	 * We can only get here when trapping from vEL2, so we're
>>> +	 * translating a guest guest VA.
>>> +	 *
>>> +	 * FIXME: Obtaining the S2 MMU for a a guest guest is horribly
>>> +	 * racy, and we may not find it.
>>> +	 */
>>> +	spin_lock(&vcpu->kvm->mmu_lock);
>>> +
>>> +	mmu = lookup_s2_mmu(vcpu->kvm,
>>> +			    vcpu_read_sys_reg(vcpu, VTTBR_EL2),
>>> +			    vcpu_read_sys_reg(vcpu, HCR_EL2));
>> From ARM DDI 0487D.b, the description for AT S1E1R (page C5-467, it's the same
>> for the other at s1e{0,1}* instructions):
>>
>> [..] Performs stage 1 address translation, with permisions as if reading from
>> the given virtual address from EL1, or from EL2 [..], using the following
>> translation regime:
>> - If HCR_EL2.{E2H,TGE} is {1, 1}, the EL2&0 translation regime, accessed from EL2.
>>
>> If the guest is VHE, I don't think there's any need to switch mmus.
>> The AT instruction will use the physical EL1&0 translation regime already on
>> the hardware (assuming host HCR_EL2.TGE == 0), which is the vEL2&0 regime for
>> the guest hypervisor.
> Here we want to run AT for L2 (guest guest) EL1&0 regime and not the L1
> (guest hypervisor) so we have to lookup and switch to nested VM MMU
> context. Or did I miss your point?
>
> Thanks,
> Tomasz

What I mean to say is that if the L1 guest has set HCR_EL2.{E2H, TGE} = {1, 1},
then the instruction affects the vEL2&0 translation regime (as per the
instruction description in the architecture), which is already loaded. The AT
instruction will affect the L1 guest hypervisor, not the L2 guest. In other
words:

if (!vcpu_el2_e2h_is_set(vcpu) || !vcpu_el2_tge_is_set(vcpu))
	/* switch mmus, the instruction affects the L2 guest (the guest guest) */
else
	/* do not switch mmus, the instruction affects the L1 guest hypervisor which is loaded */

I hope this makes things clearer.
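
For illustration only, here is the same check as a standalone sketch that
works on a raw virtual HCR_EL2 value. The helper name
at_s1e01_needs_mmu_switch() is made up for this mail and is not part of the
patch. The bit positions (TGE = bit 27, E2H = bit 34) are the architectural
ones. The real code would presumably use the vcpu_el2_*_is_set() accessors
instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural HCR_EL2 bit positions. */
#define HCR_TGE	(UINT64_C(1) << 27)
#define HCR_E2H	(UINT64_C(1) << 34)

/*
 * Hypothetical helper, not taken from the patch.
 *
 * Returns true when an AT S1E{0,1}* instruction trapped from vEL2 targets
 * the L2 guest's EL1&0 regime, so the nested stage-2 MMU has to be looked
 * up and switched in. Returns false when the virtual HCR_EL2.{E2H,TGE} is
 * {1, 1}: the instruction then targets the vEL2&0 regime of the L1 guest
 * hypervisor, which is already loaded.
 */
static inline bool at_s1e01_needs_mmu_switch(uint64_t vhcr_el2)
{
	bool e2h = (vhcr_el2 & HCR_E2H) != 0;
	bool tge = (vhcr_el2 & HCR_TGE) != 0;

	return !(e2h && tge);
}
```

Only the {1, 1} combination skips the switch; every other combination of the
two bits still needs the L2 MMU context.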