From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Jan Beulich" <JBeulich@suse.com>
Subject: Re: hap_invlpg() vs INVLPGA
Date: Mon, 01 Feb 2016 02:58:44 -0700
Message-ID: <56AF3A6502000078000CCD61@prv-mh.provo.novell.com>
References: <56AB761F02000078000CC667@prv-mh.provo.novell.com>
	<56AB6FB7.7030003@amazon.de> <56AB70FE.9030906@amazon.de>
	<56AB992202000078000CC72C@prv-mh.provo.novell.com>
	<56AB9CD9.8070103@amazon.de>
	<56AF1FBB02000078000CCBF0@prv-mh.provo.novell.com>
	<56AF13D7.9080406@amazon.de>
	<56AF2CCE02000078000CCC44@prv-mh.provo.novell.com>
	<56AF2832.2010801@amazon.de>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
Received: from mail6.bemta3.messagelabs.com ([195.245.230.39])
	by lists.xen.org with esmtp (Exim 4.72)
	(envelope-from <JBeulich@suse.com>) id 1aQBG0-0004Ha-Qy
	for xen-devel@lists.xenproject.org; Mon, 01 Feb 2016 09:58:53 +0000
In-Reply-To: <56AF2832.2010801@amazon.de>
Content-Disposition: inline
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Christoph Egger <chegger@amazon.de>
Cc: xen-devel <xen-devel@lists.xenproject.org>
List-Id: xen-devel@lists.xenproject.org

>>> On 01.02.16 at 10:41, <chegger@amazon.de> wrote:
> On 01/02/16 10:00, Jan Beulich wrote:
>>>>> On 01.02.16 at 09:14, <chegger@amazon.de> wrote:
>>> On 01/02/16 09:04, Jan Beulich wrote:
>>>>>> This, otoh, reads as if you imply we intercept the L2's INVLPG.
>>>>>> Yet the INVLPG intercept gets cleared when the domain uses
>>>>>> NPT (and your original change also didn't alter any intercept
>>>>>> settings). Hence I'm still lost how hap_invlpg() can be reached
>>>>>> in that case other than via emulating INVLPG in the instruction
>>>>>> emulator.
>>>>>
>>>>> svm_invlpg_intercept() and vmx_invlpg_intercept() call
>>>>> paging_invlpg().  paging_invlpg() calls hap_invlpg()
>>>>> as initialized in xen/arch/x86/mm/hap/hap.c
>>>>
>>>> That's all fine, but according to my previous reply: How does
>>>> execution reach svm_invlpg_intercept() when the INVLPG
>>>> intercept gets disabled for domains using HAP (NPT)?
>>>
>>> The intercept bitmask for L1 guest and L2 guest gets binary or'ed
>>> when emulating the VMENTRY for the L1 guest.
>>> That way you get also intercepts for the L1 hypervisor.
>> 
>> Okay, I can see this perhaps being correct (albeit unexpected)
>> for general1-intercepts (because all 32 bits are defined), but
>> clearly this is broken for e.g. general2-intercepts (where the
>> guest could set flags the hypervisor doesn't know about),
>> leading to the BUG() in nsvm_vmcb_guest_intercepts_exitcode().
>> Hence I didn't expect such behavior to be there in the first place.
> 
> Whenever new intercepts get defined then those must be added.

I'm sorry, but no - this attitude is why nested mode can't be
expected to become supported any time soon. Unknown intercepts
must be explicitly filtered out and/or unknown L2 exits must be
handled gracefully (at least as far as the hypervisor is concerned).

>> And then this still doesn't make svm_invlpg_intercept() reachable:
>> While the L2 guest runs, the INVLPG intercept would be reflected
>> to the L1 guest. Whereas while the L1 guest runs, the intercept
>> would be off.
> 
> While this is correct, L0 hypervisor must flush the nested hap or
> whatever the L1 hypervisor does has no real effect to the L2 guest,
> otherwise because the TLB/MMU pagetable walk is not different.

I don't understand: You agree that svm_invlpg_intercept() is
unreachable when the guest uses HAP, but at the same time
you say that what it does is required for correct operation?
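
To spell out the reachability argument as I understand it, here is a
toy model. It is only a sketch of the reasoning above, not of the real
exit handling: the enum, invlpg_disposition() and its parameters are
all invented for the example, and it assumes an exit requested by L1
is always reflected to L1.

```c
#include <stdbool.h>

/* Hypothetical outcomes for an INVLPG executed by a guest. */
enum invlpg_outcome {
    NO_EXIT,        /* instruction runs natively, no VMEXIT */
    REFLECT_TO_L1,  /* VMEXIT is forwarded to the L1 hypervisor */
    HANDLED_BY_L0   /* L0's own intercept handler runs */
};

/*
 * in_l2:        true while the L2 guest runs, false while L1 runs
 * l0_intercept: L0 wants the INVLPG intercept itself (off with HAP/NPT)
 * l1_intercept: L1 requested the INVLPG intercept for its L2 guest
 */
static enum invlpg_outcome invlpg_disposition(bool in_l2, bool l0_intercept,
                                              bool l1_intercept)
{
    if (!in_l2)
        /* L1 runs with L0's intercepts only. */
        return l0_intercept ? HANDLED_BY_L0 : NO_EXIT;

    /* L2 runs with the OR of both intercept settings. */
    if (l1_intercept)
        return REFLECT_TO_L1;

    return l0_intercept ? HANDLED_BY_L0 : NO_EXIT;
}
```

In this model, with l0_intercept false (the HAP case), no combination
of the other two inputs yields HANDLED_BY_L0.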

Jan