From mboxrd@z Thu Jan 1 00:00:00 1970 From: Tim Deegan Subject: Re: [PATCH v2 1/6] x86/EPT: don't walk entire page tables when globally changing types Date: Thu, 24 Apr 2014 17:41:03 +0200 Message-ID: <20140424154103.GH48969@deinos.phlegethon.org> References: <53567B1C020000780000AB8C@nat28.tlf.novell.com> <53567C85020000780000ABA3@nat28.tlf.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta4.messagelabs.com ([85.158.143.247]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1WdLlw-0000V7-Vi for xen-devel@lists.xenproject.org; Thu, 24 Apr 2014 15:41:13 +0000 Content-Disposition: inline In-Reply-To: <53567C85020000780000ABA3@nat28.tlf.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: Kevin Tian , Keir Fraser , suravee.suthikulpanit@amd.com, Eddie Dong , Jun Nakajima , xen-devel , Boris Ostrovsky List-Id: xen-devel@lists.xenproject.org At 13:28 +0100 on 22 Apr (1398169701), Jan Beulich wrote: > Instead leverage the EPT_MISCONFIG VM exit by marking just the top > level entries as needing recalculation of their type, propagating the > the recalculation state down as necessary such that the actual > recalculation gets done upon access. > > For this to work, we have to > - restrict the types between which conversions can be done (right now > only the two types involved in log dirty tracking need to be taken > care of) > - remember the ranges that log dirty tracking was requested for as well > as whether global log dirty tracking is in effect > > Signed-off-by: Jan Beulich Generally looks good; I think the core idea of reusing the misconfig mechanism is the right way to go. The naming you've chosen implies that it might be extended later to other lazily-updated types; I'm not sure how well that will scale. OTOH right now I can't think of any other users for this (and I can see why you're not happy with using change_type() to flush out foreign mappings on teardown after this changes). A few comments on detail below: > --- a/xen/arch/x86/mm/hap/hap.c > +++ b/xen/arch/x86/mm/hap/hap.c > @@ -110,11 +110,18 @@ int hap_track_dirty_vram(struct domain * > if ( begin_pfn != dirty_vram->begin_pfn || > begin_pfn + nr != dirty_vram->end_pfn ) > { > + unsigned long ostart = dirty_vram->begin_pfn; > + unsigned long oend = dirty_vram->end_pfn; > + > dirty_vram->begin_pfn = begin_pfn; > dirty_vram->end_pfn = begin_pfn + nr; > > paging_unlock(d); > > + if ( oend > ostart ) > + p2m_change_type_range(d, ostart, oend, > + p2m_ram_logdirty, p2m_ram_rw); > + Does this (and the simlar change below) not risk losing updates if we're in full log-dirty mode? I think it should be fine to leave those entries as log-dirty, since at worst they'll provoke another fault-and-fixup. > --- a/xen/arch/x86/mm/p2m.c > +++ b/xen/arch/x86/mm/p2m.c > @@ -116,8 +116,14 @@ static int p2m_init_hostp2m(struct domai > > if ( p2m ) > { > - d->arch.p2m = p2m; > - return 0; > + p2m->logdirty_ranges = rangeset_new(d, "log-dirty", > + RANGESETF_prettyprint_hex); Is there anything the guest can do to cause this rangeset to grow very large? Like moving its video RAM around? > +/* > + * Resolve deliberately mis-configured (EMT field set to an invalid value) > + * entries in the page table hierarchy for the given GFN: > + * - calculate the correct value for the EMT field > + * - if marked so, re-calculate the P2M type > + * - propagate EMT and re-calculation flag down to the next page table level > + * for entries not involved in the translation of the given GFN > + */ > +static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn) > { Please document the different return values here as well. > > - return !!okay; > + return rc >= !spurious; Please just write out this logic in a human-readable way. The compiler will DTRT. :) > } > > /* > @@ -416,12 +470,11 @@ ept_set_entry(struct p2m_domain *p2m, un > unsigned long gfn_remainder = gfn; > int i, target = order / EPT_TABLE_ORDER; > int rc = 0; > - int ret = 0; > bool_t direct_mmio = (p2mt == p2m_mmio_direct); > uint8_t ipat = 0; > int need_modify_vtd_table = 1; > int vtd_pte_present = 0; > - int needs_sync = 1; > + int ret, needs_sync = -1; Hmmm, another tristate. I find these hard to follow, esp. without comments to say which non-zero value is which. I wonder if we could add some sort of framework for these to make them clearer. Ideally we'd have an enum-like type and rely on the compiler to DTRT about optimizing the tests. In this case, where needs_sync isn't being passed as a return value, I think we'd be fine with two booleans in place of this int. And more generally, I would like to see at least a comment on every new instance of this trick before I ack it in x86/mm code. Cheers, Tim.