From: "Jan Beulich"
Subject: Re: hap_invlpg() vs INVLPGA
Date: Mon, 01 Feb 2016 02:58:44 -0700
Message-ID: <56AF3A6502000078000CCD61@prv-mh.provo.novell.com>
In-Reply-To: <56AF2832.2010801@amazon.de>
References: <56AB761F02000078000CC667@prv-mh.provo.novell.com>
 <56AB6FB7.7030003@amazon.de>
 <56AB70FE.9030906@amazon.de>
 <56AB992202000078000CC72C@prv-mh.provo.novell.com>
 <56AB9CD9.8070103@amazon.de>
 <56AF1FBB02000078000CCBF0@prv-mh.provo.novell.com>
 <56AF13D7.9080406@amazon.de>
 <56AF2CCE02000078000CCC44@prv-mh.provo.novell.com>
 <56AF2832.2010801@amazon.de>
To: Christoph Egger
Cc: xen-devel
List-Id: xen-devel@lists.xenproject.org

>>> On 01.02.16 at 10:41, wrote:
> On 01/02/16 10:00, Jan Beulich wrote:
>>>>> On 01.02.16 at 09:14, wrote:
>>> On 01/02/16 09:04, Jan Beulich wrote:
>>>>>> This, otoh, reads as if you imply we intercept the L2's INVLPG.
>>>>>> Yet the INVLPG intercept gets cleared when the domain uses
>>>>>> NPT (and your original change also didn't alter any intercept
>>>>>> settings). Hence I'm still lost how hap_invlpg() can be reached
>>>>>> in that case other than via emulating INVLPG in the instruction
>>>>>> emulator.
>>>>>
>>>>> svm_invlpg_intercept() and vmx_invlpg_intercept() call
>>>>> paging_invlpg(). paging_invlpg() calls hap_invlpg()
>>>>> as initialized in xen/arch/x86/mm/hap/hap.c
>>>>
>>>> That's all fine, but according to my previous reply: How does
>>>> execution reach svm_invlpg_intercept() when the INVLPG
>>>> intercept gets disabled for domains using HAP (NPT)?
>>>
>>> The intercept bitmask for L1 guest and L2 guest gets binary or'ed
>>> when emulating the VMENTRY for the L1 guest.
>>> That way you get also intercepts for the L1 hypervisor.
>>
>> Okay, I can see this perhaps being correct (albeit unexpected)
>> for general1-intercepts (because all 32 bits are defined), but
>> clearly this is broken for e.g. general2-intercepts (where the
>> guest could set flags the hypervisor doesn't know about),
>> leading to the BUG() in nsvm_vmcb_guest_intercepts_exitcode().
>> Hence I didn't expect such behavior to be there in the first place.
>
> Whenever new intercepts get defined then those must be added.

I'm sorry, but no - this attitude is why nested mode can't be expected
to become supported any time soon. Unknown intercepts must be
explicitly filtered out and/or unknown L2 exits must be handled
gracefully (to at least the hypervisor).

>> And then this still doesn't make svm_invlpg_intercept() reachable:
>> While the L2 guest runs, the INVLPG intercept would be reflected
>> to the L1 guest. Whereas while the L1 guest runs, the intercept
>> would be off.
>
> While this is correct, L0 hypervisor must flush the nested hap or
> whatever the L1 hypervisor does has no real effect to the L2 guest,
> otherwise because the TLB/MMU pagetable walk is not different.

I don't understand: You agree that svm_invlpg_intercept() is
unreachable when the guest uses HAP, but at the same time you
say that what it does is required for correct operation?

Jan
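
To illustrate the filtering point made above (OR the two intercept masks, but drop any bits the host does not know about), here is a minimal sketch. It is not Xen's actual code: the function names, the struct-free interface, and the mask values used in any call are hypothetical and chosen only for illustration. The real bit layout of the general1/general2 intercept words is defined by the hardware specification and is not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: combine the intercepts the host (L0) wants with
 * the intercepts the L1 hypervisor requested for its L2 guest.
 * Only guest-requested bits that appear in known_mask are carried over,
 * so a flag the host does not understand can never reach the merged word.
 */
static uint32_t merge_intercepts(uint32_t host, uint32_t guest,
                                 uint32_t known_mask)
{
    return host | (guest & known_mask);
}

/*
 * Hypothetical helper: report whether the guest asked for any intercept
 * outside known_mask, so the caller can handle that case gracefully
 * (for example by logging or failing the nested entry) instead of
 * hitting an unexpected exit code later on.
 */
static bool has_unknown_intercepts(uint32_t guest, uint32_t known_mask)
{
    return (guest & ~known_mask) != 0;
}
```

With an illustrative known_mask of 0x0000ffff, a guest value of 0x80000002 would contribute only bit 1 to the merged word, and has_unknown_intercepts() would return true for it because bit 31 lies outside the mask.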