Date: Mon, 11 Nov 2019 09:37:57 -0800
From: Sean Christopherson
To: Thomas Lamprecht
Cc: Greg Kroah-Hartman, linux-kernel@vger.kernel.org, stable@vger.kernel.org, Nadav Amit, Doug Reiland, Peter Xu, Paolo Bonzini
Subject: Re: [PATCH 4.19 167/211] KVM: x86: Manually calculate reserved bits when loading PDPTRS
Message-ID: <20191111173757.GB11805@linux.intel.com>
References: <20191003154447.010950442@linuxfoundation.org> <20191003154525.870373223@linuxfoundation.org>
 <68d02406-b9cc-2fc1-848c-5d272d9a3350@proxmox.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <68d02406-b9cc-2fc1-848c-5d272d9a3350@proxmox.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Nov 11, 2019 at 10:32:05AM +0100, Thomas Lamprecht wrote:
> On 10/3/19 5:53 PM, Greg Kroah-Hartman wrote:
> > From: Sean Christopherson
> >
> > commit 16cfacc8085782dab8e365979356ce1ca87fd6cc upstream.
> >
> > Manually generate the PDPTR reserved bit mask when explicitly loading
> > PDPTRs.  The reserved bits that are being tracked by the MMU reflect the
>
> It seems that a backport of this to stable and distro kernels tickled out
> some issue[0] for KVM Linux 64bit guests on older than about 8-10 year old
> Intel CPUs[1].

It manifests specifically when running with EPT disabled (no surprise
there).  Actually, it probably would reproduce simply with unrestricted
guest disabled, but that's beside the point.

The issue is a flawed PAE-paging check in kvm_set_cr3(), which causes KVM
to incorrectly load PDPTRs in 64-bit mode and inject a #GP.  It's a sneaky
little bugger because the "if (is_long_mode() ..." makes it appear to be
correct at first glance.

	if (is_long_mode(vcpu) &&
	    (cr3 & rsvd_bits(cpuid_maxphyaddr(vcpu), 63)))
		return 1;
	else if (is_pae(vcpu) && is_paging(vcpu) &&    <--- needs !is_long_mode()
		 !load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3))
		return 1;

With unrestricted guest, KVM doesn't intercept writes to CR3 and so doesn't
trigger the buggy code.

This doesn't fail upstream because the offending code was refactored to
encapsulate the PAE checks in a single helper, precisely to avoid this type
of headache.

commit bf03d4f9334728bf7c8ffc7de787df48abd6340e
Author: Paolo Bonzini
Date:   Thu Jun 6 18:52:44 2019 +0200

    KVM: x86: introduce is_pae_paging

    Checking for 32-bit PAE is quite common around code that fiddles with
    the PDPTRs.  Add a function to compress all checks into a single
    invocation.

Commit bf03d4f93347 ("KVM: x86: introduce is_pae_paging") doesn't apply
cleanly to 4.19 or earlier because of the VMX file movement in 4.20.  But,
the relevant changes in x86.c do apply cleanly, and I've quadruple checked
that the PAE checks in vmx.c are correct, i.e. applying the patch and
ignoring the nested.c/vmx.c conflicts would be a viable lazy option.

> Basically, booting this kernel as host, then running an KVM guest distro
> or kernel fails it that guest kernel early in the boot phase without any
> error or other log to serial console, earlyprintk.

...

>
> [0]: https://bugzilla.kernel.org/show_bug.cgi?id=205441
> [1]: models tested as problematic are: intel core2duo E8500; Xeon E5420; so
> westmere, conroe and that stuff. AFAICT anything from about pre-2010 which
> has VMX support (i.e. is 64bit based)

Note, not Westmere, which has EPT and unrestricted guest.  Xeon E5420 is
Harpertown, a.k.a. Penryn, the shrink of Conroe.
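
For anyone who wants to poke at the logic without a Core 2 box, here is a
rough user-space model of the two checks.  It is only a sketch, not the
kernel code: struct vcpu_model and its fields are made up for illustration,
the two booleans cr3_rsvd_bits_set and pdptrs_valid stand in for the real
rsvd_bits() test and the result of load_pdptrs(), and set_cr3_buggy() /
set_cr3_fixed() are hypothetical names.  The is_pae_paging() body follows
what the upstream helper is described as doing, i.e. PAE paging only when
not in long mode.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the vCPU state the CR3 check consults. */
struct vcpu_model {
	bool cr0_pg;		/* CR0.PG: paging enabled */
	bool cr4_pae;		/* CR4.PAE */
	bool efer_lma;		/* EFER.LMA: long mode active */
	bool cr3_rsvd_bits_set;	/* models cr3 & rsvd_bits(maxphyaddr, 63) */
	bool pdptrs_valid;	/* models load_pdptrs() succeeding */
};

static inline bool is_long_mode(const struct vcpu_model *v)
{
	return v->efer_lma;
}

static inline bool is_pae(const struct vcpu_model *v)
{
	return v->cr4_pae;
}

static inline bool is_paging(const struct vcpu_model *v)
{
	return v->cr0_pg;
}

/* 32-bit PAE paging only: long mode also has PAE and PG set. */
static inline bool is_pae_paging(const struct vcpu_model *v)
{
	return !is_long_mode(v) && is_pae(v) && is_paging(v);
}

/* Returns 1 when the model rejects the CR3 write (the guest sees #GP). */
static inline int set_cr3_buggy(const struct vcpu_model *v)
{
	if (is_long_mode(v) && v->cr3_rsvd_bits_set)
		return 1;
	/* Also true in long mode, so PDPTRs get "loaded" for a 64-bit guest. */
	else if (is_pae(v) && is_paging(v) && !v->pdptrs_valid)
		return 1;
	return 0;
}

/* Same flow, but the PDPTR load is restricted to 32-bit PAE paging. */
static inline int set_cr3_fixed(const struct vcpu_model *v)
{
	if (is_long_mode(v) && v->cr3_rsvd_bits_set)
		return 1;
	else if (is_pae_paging(v) && !v->pdptrs_valid)
		return 1;
	return 0;
}
```

Feed it a 64-bit guest (cr0_pg, cr4_pae and efer_lma all set, a clean CR3,
and pdptrs_valid false on the assumption that the PML4 entries don't pass
the PDPTE reserved-bit check) and the buggy variant returns 1 while the
fixed one returns 0.  A genuine 32-bit PAE guest gets the same answer from
both.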