From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Aravindh Puthiyaparambil (aravindp)"
Subject: Re: [PATCH RFC 1/4] x86/mm: Shadow and p2m changes for PV mem_access
Date: Fri, 2 May 2014 16:28:11 +0000
Message-ID: <97A500D504438F4ABC02EBA81613CC63317E9FD2@xmb-aln-x02.cisco.com>
References: <1398746705-6658-1-git-send-email-aravindp@cisco.com>
 <1398746705-6658-2-git-send-email-aravindp@cisco.com>
 <535F83FE020000780000D397@nat28.tlf.novell.com>
 <97A500D504438F4ABC02EBA81613CC63317E7AC4@xmb-aln-x02.cisco.com>
 <5360B51D020000780000D9A2@g0-1-119.ukb-fw-asa.gns.novell.com>
 <97A500D504438F4ABC02EBA81613CC63317E9527@xmb-aln-x02.cisco.com>
 <536372E4020000780000E5A3@mail.emea.novell.com>
In-Reply-To: <536372E4020000780000E5A3@mail.emea.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Language: en-US
To: Jan Beulich
Cc: Keir Fraser, Ian Campbell,
 "Andres Lagar-Cavilla (andreslc@gridcentric.ca)", Tim Deegan,
 "xen-devel@lists.xenproject.org", Ian Jackson
List-Id: xen-devel@lists.xenproject.org

>>>> On 01.05.14 at 00:20, wrote:
>>> I think implying _n to mean "default" is at least a latent bug anyway
>>> (since that precludes setting that type). And _if_ "get" really does
>>> any modification, that minimally would require a big comment.
>>
>> For PV domains access_n is precluded.
>>
>>> Furthermore, with it being possible for the default type to change,
>>> the net effect might easily be different between setting the default
>>> type at initialization time or when the first lookup happens.
>>
>> I am not sure that I follow your concern here.
>> Setting default access does not mean that the access table gets
>> cleared. So if a particular MFN entry's access value changed and the
>> default value was changed, then the MFN should retain the access value
>> it was changed to and not the default value. The new default value will
>> only be returned for MFNs for which set_entry was not called, and I
>> think that is correct.
>
> It might still be a difference when the designated access gets effected:
> I would assume that a request for the default type should obtain the
> default set at the time of the request, not the one in effect at the
> time the entry gets first used.

Ah, I understand now. Given that I am moving away from using the lookup
table, this might not be a concern anymore.

>>>>>> +int p2m_mem_access_init(struct p2m_domain *p2m)
>>>>>> +{
>>>>>> +    struct domain *d = p2m->domain;
>>>>>> +    mfn_t *access_lookup_table;
>>>>>> +    uint32_t nr_access_table_pages;
>>>>>> +    uint32_t ctr;
>>>>>> +
>>>>>> +    nr_access_table_pages = get_domain_nr_access_table_pages(d);
>>>>>> +    access_lookup_table = xzalloc_array(mfn_t, nr_access_table_pages);
>>>>>
>>>>> This surely is going to be an order > 0 allocation, and you even
>>>>> risk it being an order > MAX_ORDER one. The former is disallowed at
>>>>> runtime by convention/agreement, the latter is an outright bug.
>>>>
>>>> Sorry, I was not aware of the convention. What should I be doing
>>>> here instead?
>>>
>>> Especially with the above note about GFN space being effectively
>>> unbounded for PV guests, you need to find a different data structure
>>> to represent the information you need. Perhaps something similar to
>>> what log-dirty mode uses, except that you would need more levels in
>>> the extreme case, and hence it might be worthwhile making the
>>> actually used number of levels dynamic?
>>
>> I am a little confused here. In my initial discussions about this I
>> was told that PV guests have bounded and contiguous memory. This is
This is >> why I took the approach of an indexable array. >> >> " OTOH, all you need is a byte per pfn, and the great thing is that in >> PV domains, the physmap is bounded and continuous. Unlike HVM and its >> PCI holes, etc, which demand the sparse tree structure. So you can >> allocate an easily indexable array, notwithstanding super page concerns (I >think/hope)." >> >> http://lists.xenproject.org/archives/html/xen-devel/2013-11/msg03860.h >> tml >> >> Did I misunderstand something here? > >What he stated is the situation at domain start. Due to ballooning and other >kinds of manipulations (as said, pv-ops re-arranges things to avoid having >memory [PFNs] overlap MMIO [MFNs]) this may change immediately after >the domain started. > >And even if it was a plain contiguous linear address space, you still wouldn't be >permitted to allocate a flat one-byte-per-page array, as for large domains this >might still overflow the MAX_ORDER allocation constraint. OK, the flat array is a bad idea. I am looking in to some of Tim's suggestions to do this now. Either stash the values in the shadow_flags or reuse the p2m-pt implementation. I am going to try reusing the shadows_flags first and if that does not work I will go with the p2m-pt implementation. >>>>>> @@ -2443,7 +2443,8 @@ int sh_remove_all_mappings(struct vcpu *v, >>>>>mfn_t gmfn) >>>>>> if ( !(shadow_mode_external(v->domain) >>>>>> && (page->count_info & PGC_count_mask) <= 3 >>>>>> && ((page->u.inuse.type_info & PGT_count_mask) >>>>>> - == !!is_xen_heap_page(page))) ) >>>>>> + == !!is_xen_heap_page(page))) >>>>>> + && !(shadow_mode_mem_access(v->domain)) ) >>>>> >>>>>You're breaking indentation, there are pointless parentheses again, >>>>>but most importantly - why? >>>> >>>> Sorry, I meant to ask a question about this in my patch message and >>>> mark this as a workaround. I am seeing the message on all >>>> sh_remove_all_mappings() and I was not able to figure out why this >>>> was happening. 
>>>> I just added this as a workaround. I was hoping you or Tim would
>>>> shed more light on this.
>>>
>>> I'm afraid that without detail on which pages this triggers on, and
>>> you at least having spent some time finding out where the stray/extra
>>> references may be coming from, it's going to be hard to help.
>>
>> Here are some of the prints I saw. I typically saw it for every
>> set_entry() call.
>
> There are some rather large usage counts among the examples you gave -
> these surely need understanding rather than blindly ignoring.

Tim gave me the reason for this:

"This is because you are not in shadow_mode_refcounts() (i.e. the page's
refcount and typecount are based on the _guest_ pagetables rather than
the _shadow_ ones). That means that sh_remove_all_mappings() can't use
the refcounts to figure out when it's removed the last shadow mapping."

Thanks,
Aravindh