From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751483AbdCQRlG (ORCPT <rfc822;w@1wt.eu>);
        Fri, 17 Mar 2017 13:41:06 -0400
Received: from mx1.redhat.com ([209.132.183.28]:52224 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751107AbdCQRlE (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
        Fri, 17 Mar 2017 13:41:04 -0400
DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com E2F703DBC7
Authentication-Results: ext-mx06.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com
Authentication-Results: ext-mx06.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=pbonzini@redhat.com
DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com E2F703DBC7
Subject: Re: [PATCH] KVM: nVMX: Fix L2 guest hang if shadow page tables on EPT
To: Ladi Prosek <lprosek@redhat.com>, Wanpeng Li <kernellwp@gmail.com>
References: <1489761691-11441-1-git-send-email-wanpeng.li@hotmail.com>
 <CABdb734qbaDh51DK2xhAC0kRgqtgY-mb6YvstM-5D6XXqhuQqA@mail.gmail.com>
Cc: linux-kernel@vger.kernel.org, KVM list <kvm@vger.kernel.org>,
        =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?= <rkrcmar@redhat.com>,
        Wanpeng Li <wanpeng.li@hotmail.com>
From: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <8647981f-ac36-4c26-23ce-4f36479efd7c@redhat.com>
Date: Fri, 17 Mar 2017 18:33:14 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.7.0
MIME-Version: 1.0
In-Reply-To: <CABdb734qbaDh51DK2xhAC0kRgqtgY-mb6YvstM-5D6XXqhuQqA@mail.gmail.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Fri, 17 Mar 2017 17:33:18 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org


On 17/03/2017 18:28, Ladi Prosek wrote:
> On Fri, Mar 17, 2017 at 3:41 PM, Wanpeng Li <kernellwp@gmail.com> wrote:
>> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>
>> The L2 guest hangs when using shadow page tables on EPT; the trace on L1
>> shows L2 kvm_exit with reason EXCEPTION_NMI and a page fault, repeatedly:
>>
>> qemu-system-x86-2821  [003] d..2    45.848814: kvm_entry: vcpu 0
>> qemu-system-x86-2821  [003] ...1    45.848827: kvm_exit: reason EXCEPTION_NMI rip 0xe05b info fe05b 80000b0e
>> qemu-system-x86-2821  [003] ...1    45.848827: kvm_page_fault: address fe05b error_code 14
>>
>> Commit 7ca29de21362 (KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT)
>> prevents loading L2's PDPTRs by dereferencing L2's CR3, since CR3 is
>> uninitialized in real mode. Hyper-V as L1 emulates L2 real mode with PAE
>> paging and EPT enabled. However, during system boot there is a transition
>> from Legacy mode's Protected sub-mode to Long mode, and the check
>> in nested_vmx_load_cr3() prevents loading PDPTRs while L2 is still in
>> Protected mode w/ PAE paging and nested EPT/shadow page tables on EPT. Actually,
>> the original commit was only intended to prevent dereferencing L2's CR3
>> when the L1 hypervisor emulates L2's real mode through vm8086.
>>
>> This patch fixes it by allowing PDPTRs to be loaded if PAE paging and EPT
>> are enabled and !vm86_active.
>>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Radim Krčmář <rkrcmar@redhat.com>
>> Cc: Ladi Prosek <lprosek@redhat.com>
>> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>> ---
>>  arch/x86/kvm/vmx.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index c664365..2b2a05f 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -9933,7 +9933,7 @@ static bool nested_cr3_valid(struct kvm_vcpu *vcpu, unsigned long val)
>>  static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool nested_ept,
>>                                u32 *entry_failure_code)
>>  {
>> -       if (cr3 != kvm_read_cr3(vcpu) || (!nested_ept && pdptrs_changed(vcpu))) {
>> +       if (cr3 != kvm_read_cr3(vcpu) || pdptrs_changed(vcpu)) {
>>                 if (!nested_cr3_valid(vcpu, cr3)) {
>>                         *entry_failure_code = ENTRY_FAIL_DEFAULT;
>>                         return 1;
>> @@ -9944,7 +9944,7 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne
>>                  * must not be dereferenced.
>>                  */
>>                 if (!is_long_mode(vcpu) && is_pae(vcpu) && is_paging(vcpu) &&
>> -                   !nested_ept) {
>> +                   !(nested_ept && to_vmx(vcpu)->rmode.vm86_active)) {
> 
> This change breaks Hyper-V on KVM. L2 hangs on start-up, same symptoms
> as before 7ca29de21362.

Looks like we need _two_ testcases then... :)

Paolo

> I'll take a closer look next week. Is there an easy way for me to
> reproduce the issue you're seeing?
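
For reference, the two versions of the inner condition in the second hunk can be compared with a small standalone model. This is only a sketch and not kernel code: struct l2_state and the pdptr_cond_* functions are hypothetical names, and plain booleans stand in for is_long_mode(), is_pae(), is_paging(), the nested_ept argument and rmode.vm86_active. It says nothing about which behaviour is correct for a given L1 hypervisor; it only shows where the two conditions differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state the quoted condition inspects. */
struct l2_state {
    bool long_mode;    /* models is_long_mode(vcpu) */
    bool pae;          /* models is_pae(vcpu) */
    bool paging;       /* models is_paging(vcpu) */
    bool nested_ept;   /* models the nested_ept argument */
    bool vm86_active;  /* models to_vmx(vcpu)->rmode.vm86_active */
};

/* Condition on the '-' side of the second hunk. */
static bool pdptr_cond_before(const struct l2_state *s)
{
    return !s->long_mode && s->pae && s->paging && !s->nested_ept;
}

/* Condition on the '+' side of the second hunk. */
static bool pdptr_cond_after(const struct l2_state *s)
{
    return !s->long_mode && s->pae && s->paging &&
           !(s->nested_ept && s->vm86_active);
}

/* Count the input combinations (out of 32) where the two disagree. */
static int pdptr_cond_diff_count(void)
{
    int diff = 0;

    for (unsigned int bits = 0; bits < 32; bits++) {
        struct l2_state s = {
            .long_mode   = (bits & 1) != 0,
            .pae         = (bits & 2) != 0,
            .paging      = (bits & 4) != 0,
            .nested_ept  = (bits & 8) != 0,
            .vm86_active = (bits & 16) != 0,
        };

        if (pdptr_cond_before(&s) != pdptr_cond_after(&s))
            diff++;
    }
    return diff;
}

/* The single disagreeing case: PAE paging outside long mode, nested EPT
 * in use, vm86 not active. */
static void pdptr_model_selfcheck(void)
{
    struct l2_state s = { false, true, true, true, false };

    assert(!pdptr_cond_before(&s));
    assert(pdptr_cond_after(&s));
    assert(pdptr_cond_diff_count() == 1);
}
```

In this model the two conditions disagree for exactly one combination of inputs, the one checked in pdptr_model_selfcheck(); with nested EPT and vm86 both active, neither version is true.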