From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752065AbdJDJNt (ORCPT ); Wed, 4 Oct 2017 05:13:49 -0400
Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:58358 "EHLO
	foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751925AbdJDJNq (ORCPT ); Wed, 4 Oct 2017 05:13:46 -0400
Subject: Re: [RFC PATCH v2 19/31] KVM: arm64: Describe AT instruction emulation design
To: Jintack Lim, James Morse
Cc: KVM General, Catalin Marinas, Will Deacon, linux@armlinux.org.uk,
	lkml - Kernel Mailing List, arm-mail-list, Paolo Bonzini,
	kvmarm@lists.cs.columbia.edu
References: <1507000273-3735-1-git-send-email-jintack.lim@linaro.org>
	<1507000273-3735-17-git-send-email-jintack.lim@linaro.org>
	<59D3CAF2.2030704@arm.com>
From: Marc Zyngier
Organization: ARM Ltd
Message-ID: <7335d045-fb58-3235-fadd-6f6b59304c2b@arm.com>
Date: Wed, 4 Oct 2017 10:13:42 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Language: en-GB
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/10/17 22:11, Jintack Lim wrote:
> Hi James,
>
> On Tue, Oct 3, 2017 at 1:37 PM, James Morse wrote:
>> Hi Jintack,
>>
>> On 03/10/17 04:11, Jintack Lim wrote:
>>> This design overview will help to digest the subsequent patches that
>>> implement AT instruction emulation.
>>
>>> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
>>> index 8d04926..d8728cc 100644
>>> --- a/arch/arm64/kvm/sys_regs.c
>>> +++ b/arch/arm64/kvm/sys_regs.c
>>> @@ -1621,6 +1621,72 @@ static bool access_id_aa64mmfr0_el1(struct kvm_vcpu *v,
>>> 	{ SYS_DESC(SYS_SP_EL2), NULL, reset_special, SP_EL2, 0},
>>> };
>>>
>>> +/*
>>> + * AT instruction emulation
>>> + *
>>> + * We emulate AT instructions executed in the virtual EL2.
>>
>>> + * Basic strategy for the stage-1 translation emulation is to load proper
>>> + * context, which depends on the trapped instruction and the virtual HCR_EL2,
>>> + * to the EL1 virtual memory control registers and execute S1E[01] instructions
>>> + * in EL2. See below for more detail.
>>
>> What happens if the guest memory containing some stage1-page-table has been
>> unmapped from stage2? (e.g. it's swapped to disk).
>>
>> (there is some background to this: I tried to implement the kvm_translate
>> ioctl() using this approach, running 'at s1e1*' from EL2. I ran into problems
>> when parts of the guest's stage1 page tables had been unmapped from stage2.)
>>
>> From memory, I found that the AT instructions would fault-in those pages when
>> run from EL1, but when executing the same instruction at EL2 they just failed
>> without any hint of which IPA needed mapping in.

Let me see if I follow:

AT S1E1 at EL1 should only generate a fault if the page table walking
itself generates a fault (the guest page tables have been swapped out),
and the fault is taken to EL2. At that point, that's a normal
translation fault, which EL2 can easily resolve and restart the AT
instruction. This is in fact no different from a faulting load/store.

Doing the same thing at EL2 would indeed simply indicate a failed
translation, and not generate a fault, which I think is what you're
observing. After all, it is the hypervisor that unmapped those pages,
it might as well properly track what is happening.

It is a bit of an odd case because the AT here is executed at vEL2
(EL1), and trapped to EL2 because of the NV bits. If it wasn't trapped,
everything would just work.

In this case, I can't see any other way but to walk the S1PT by hand,
having put all the other vcpus on hold to avoid concurrent
modifications... Yes, this sucks. If only AT could do partial walks...

The saving grace is that this only happens in the unmapped S1PT case.
The above can be used as a fallback if the AT S1 from EL2 actually fails.

> I think I haven't encountered this case yet, probably because I
> usually don't set a swap partition.
>
> In fact, I couldn't find pseudocode for AT instructions. If you
> happened to have one, is that behavior you observed described in ARM
> ARM?

See J1.1.5 in the ARMv8 ARM Rev B.a, and the various comments indicating
how this applies to Address Translation instructions. There is also some
description of what is expected from the AT instructions in D4.2.11.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
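
For illustration only, the fallback described in the reply above could be
sketched roughly as follows. This is not code from the patch series: the
at_hw_fn/at_sw_fn hooks, struct at_result, emulate_at_s1() and the demo_*
stand-ins are all hypothetical names, and the only architectural facts
assumed are that PAR_EL1 bit 0 (F) reports a failed translation and that
bits [47:12] carry the output address on success.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAR_F		UINT64_C(0x1)			/* bit 0: translation failed */
#define PAR_PA_MASK	UINT64_C(0x0000fffffffff000)	/* PA[47:12] when F == 0 */

/* Hypothetical hooks; KVM has no functions with these names. */
typedef uint64_t (*at_hw_fn)(uint64_t va);	/* run AT S1E[01]* at EL2, return PAR_EL1 */
typedef uint64_t (*at_sw_fn)(uint64_t va);	/* walk the guest S1PT by hand, return a
						 * PAR_EL1-format value; a real version would
						 * first put the other vcpus on hold */

struct at_result {
	uint64_t par;		/* value to expose as the guest's PAR_EL1 */
	bool used_sw_walk;	/* true if the slow path was taken */
};

/* Try the hardware AT first; only fall back to the manual walk if it fails. */
static struct at_result emulate_at_s1(uint64_t va, at_hw_fn hw, at_sw_fn sw)
{
	struct at_result res;

	assert(hw && sw);

	res.par = hw(va);
	res.used_sw_walk = false;

	if (res.par & PAR_F) {
		/*
		 * A failure at EL2 does not say which IPA needs mapping in,
		 * so the by-hand walk has to find out.
		 */
		res.par = sw(va);
		res.used_sw_walk = true;
	}

	return res;
}

/* Demo stand-ins so the sketch can be exercised in user space. */
static uint64_t demo_hw_at(uint64_t va)
{
	/* Pretend VAs below 1MB translate 1:1 and everything else fails. */
	return va < 0x100000 ? (va & PAR_PA_MASK) : PAR_F;
}

static uint64_t demo_sw_walk(uint64_t va)
{
	/* Pretend the manual walk resolves VA to VA + 1GB. */
	return (va + 0x40000000) & PAR_PA_MASK;
}
```

Calling emulate_at_s1(va, demo_hw_at, demo_sw_walk) takes the fast path for
addresses the stand-in hardware can translate and the slow path otherwise,
which keeps the expensive walk confined to the unmapped S1PT case.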