From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Jan Beulich" Subject: Re: [PATCH v5 10/15] x86/altp2m: add remaining support routines. Date: Tue, 21 Jul 2015 00:38:22 -0600 Message-ID: <55AE04FE020000780009376F@prv-mh.provo.novell.com> References: <1436832903-12639-1-git-send-email-edmund.h.white@intel.com> <1436832903-12639-11-git-send-email-edmund.h.white@intel.com> <55A539710200007800090CB0@mail.emea.novell.com> <55A796B20200007800091C61@mail.emea.novell.com> <55ACB6EF0200007800092E1B@mail.emea.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: Content-Disposition: inline List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Ravi Sahita Cc: Tim Deegan , Wei Liu , George Dunlap , Andrew Cooper , Ian Jackson , Edmund H White , "xen-devel@lists.xen.org" , "tlengyel@novetta.com" , Daniel De Graaf List-Id: xen-devel@lists.xenproject.org >>> On 21.07.15 at 07:46, wrote: >> From: Jan Beulich [mailto:JBeulich@suse.com] >>Sent: Sunday, July 19, 2015 11:53 PM >>>>> On 18.07.15 at 00:32, wrote: >>>> From: Jan Beulich [mailto:JBeulich@suse.com] >>>>Sent: Thursday, July 16, 2015 2:34 AM >>>> >>>>>>> On 16.07.15 at 11:16, wrote: >>>>>> From: Jan Beulich [mailto:JBeulich@suse.com] >>>>>>Sent: Tuesday, July 14, 2015 7:32 AM >>>>>>>>> On 14.07.15 at 02:14, wrote: >>>>>>> @@ -2965,9 +3003,15 @@ int hvm_hap_nested_page_fault(paddr_t gpa, >>>>>>unsigned long gla, >>>>>>> if ( npfec.write_access ) >>>>>>> { >>>>>>> paging_mark_dirty(currd, mfn_x(mfn)); >>>>>>> + /* If p2m is really an altp2m, unlock here to avoid >>>>>>> + lock >>>>> ordering >>>>>>> + * violation when the change below is propagated from >>>>>>> + host p2m >>>>> */ >>>>>>> + if ( ap2m_active ) >>>>>>> + __put_gfn(p2m, gfn); >>>>>>> p2m_change_type_one(currd, gfn, p2m_ram_logdirty, >>>>>>> p2m_ram_rw); >>>>>> >>>>>>And this won't result in any races? >>>>> >>>>> No >>>> >>>>To be honest I expected a little more than just "no" here. Now I have >>>>to ask - why? >>>> >>> >>> Yes, I should have described it more than that :-) so this part of >>> the code is handling the log dirty transition of the page, and this >>> page permission transition happens always on the hostp2m. Given the >>> way the locking order is setup (hostp2m->altp2m-list-lock->altp2m and >>> there was a separate writeup and discussion with George on that), at >>> this point in this sequence there is a p2m lock (whether it's a >>> hostp2m or altp2m lock depends on the mode of the >>> domain) - the reason we have to drop the lock here first is due to >>> what happens next; the permission changes in hostp2m will be serially >>> propagated to altp2ms and not dropping the lock here would cause a >>> locking order violation. Hope that clarifies. >> >>Sadly it doesn't at all: You re-explain why you need to drop the lock, while > you >>fail to say anything on why this won't cause a race. >> > > It only drops the lock on the altp2m, which is no longer required in this > function anyway. The important aspect is that there is still a lock held on > the host p2m, and that is dropped after the log-dirty updates, as it would be > in the non-altp2m case (maybe that was the part I should have explained > clearly in the para above). Does that clarify or do you see a particular race > condition here? (We don't ). Sounds okay then. >>>>>>> +long p2m_init_altp2m_by_id(struct domain *d, uint16_t idx) { >>>>>>> + long rc = -EINVAL; >>>>>> >>>>>>Why long (for both variable and function return type)? (More of >>>>>>these in functions below.) >>>>> >>>>> Because the error variable in the code that calls these (in hvm.c) >>>>> is a long, and you had given feedback earlier to propagate the >>>>> returns from these functions through that calling code. >>>> >>>>I don't see the connection. The function only returns zero or -E... >>>>values, so why would its return type be "long"? >>>> >>> >>> do_hvm_op declares a rc that is of type "long" and hence this returns >>> a "long" >> >>What type your caller(s) return is of no interest at all here: What would you > do >>if you had multiple callers with differing return types? >>A function's return type should be chosen based on the range of values it may >>return, and the result possibly widened to not yield inefficient code (like > in >>some of the uint16_t cases elsewhere in the series would be necessary). >> > > What do you suggest the return type be? For the case here - int (quite obviously I would say). For the uint16_t ones - unsigned int. >>>>>>> +void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn, >>>>>>> + mfn_t mfn, unsigned int page_order, >>>>>>> + p2m_type_t p2mt, p2m_access_t >>>>>>> +p2ma) { >>>>>>> + struct p2m_domain *p2m; >>>>>>> + p2m_access_t a; >>>>>>> + p2m_type_t t; >>>>>>> + mfn_t m; >>>>>>> + uint16_t i; >>>>>>> + bool_t reset_p2m; >>>>>>> + unsigned int reset_count = 0; >>>>>>> + uint16_t last_reset_idx = ~0; >>>>>>> + >>>>>>> + if ( !altp2m_active(d) ) >>>>>>> + return; >>>>>>> + >>>>>>> + altp2m_list_lock(d); >>>>>>> + >>>>>>> + for ( i = 0; i < MAX_ALTP2M; i++ ) >>>>>>> + { >>>>>>> + if ( d->arch.altp2m_eptp[i] == INVALID_MFN ) >>>>>>> + continue; >>>>>>> + >>>>>>> + p2m = d->arch.altp2m_p2m[i]; >>>>>>> + m = get_gfn_type_access(p2m, gfn_x(gfn), &t, &a, 0, >>>>>>> + NULL); >>>>>>> + >>>>>>> + reset_p2m = 0; >>>>>>> + >>>>>>> + /* Check for a dropped page that may impact this altp2m */ >>>>>>> + if ( mfn_x(mfn) == INVALID_MFN && >>>>>>> + gfn_x(gfn) >= p2m->min_remapped_gfn && >>>>>>> + gfn_x(gfn) <= p2m->max_remapped_gfn ) >>>>>>> + reset_p2m = 1; >>>>>> >>>>>>Considering that this looks like an optimization, what's the >>>>>>downside of possibly having min=0 and max=? >>I.e. >>>>>>can there a long latency operation result that's this way a guest can >>effect? >>>>>> >>>>> >>>>> ... A p2m is a gfn->mfn map, amongst other things. There is a >>>>> reverse >>>>> mfn->gfn map, but that is only valid for the host p2m. Unless the >>>>> remap altp2m hypercall is used, the gfn->mfn map in every altp2m >>>>> mirrors the gfn->mfn map in the host p2m (or a subset thereof, due >>>>> to lazy-copy), so handling removal of an mfn from a guest is simple: >>>>> do a reverse look up for the host p2m and mark the relevant gfn as >>>>> invalid, then do the same for every altp2m where that gfn is currently >>valid. >>>>> >>>>> Remap changes things: it says take gfn1 and replace ->mfn with the >>>>> ->mfn of gfn2. Here is where the optimization is used and the >>>>> ->invalidate >>>>logic is: >>>>> record the lowest and highest gfn2's that have been used in remap >>>>> ops; if an mfn is dropped from the hostp2m, for the purposes of >>>>> altp2m invalidation, see if the gfn derived from the host p2m >>>>> reverse lookup falls within the range of used gfn2's. If it does, an >>>>> invalidation is required. Which is why min and max are inited the >>>>> way they are - hope the explanation clarifies this optimization. >>>> >>>>Sadly it doesn't, it just re-states what I already understood and >>>>doesn't answer the question: What happens if min=0 and >>>>max=>>>space>? I.e. can the guest nullify the optimization by careful >>>>space>fiddling >>> issuing >>>>some of the new hypercalls, and if so will this have any negative >>>>impact on >>> the >>>>hypervisor? I'm asking this from a security standpoint ... >>>> >>> >>> To take that exact case, If min=0 and max= then >>> any hostp2m change where the first mfn is dropped, will cause all >>> altp2ms to be reset even if the mfn dropped doesn't affect altp2ms at >>> all, which wont serve as an optimization at all - Hope that clarifies. >> >>Again - no. I understand the optimization is gone then. But what's the effect? >>I.e. will the guest, by extending this range to be arbitrarily wide, be able > to >>cause a long latency hypervisor operation (i.e. a DoS)? >> > > The extent of the range affects the likelihood of an invalidation. It has no > impact on the cost of an invalidation (so no its not a DoS issue). > I'm not sure what change you are suggesting here or just clarification (if > you think this optimization is confusing perhaps some documentation of this > optimization will help?) Well, the optimization must be optimizing _something_. And hence _something_ must go sub-optimal when the optimization is being subverted. And the question is how much worse un-optimized is compared to optimized. It _looks like_ the overall effect really is just to avoid a one time (for a given non-preemptible operation) reset, but whether that's really the case depends on the calling contexts (which, as said a couple of times before, is hard to see for a patch that introduces functions without callers - hence the question). >>>>Nor do I find my question answered why max can't be initialized to zero: >>>>You don't care whether max is a valid GFN when a certain GFN doesn't >>>>fall in the (then empty) [min, max] range. What am I missing? >>> >>> Since 0 is a valid GFN so we cannot initialize min or max to 0 - since >>> matching this condition (gfn_x(gfn) >= p2m->min_remapped_gfn && >>> gfn_x(gfn) <= >>> p2m->max_remapped_gfn) will cause a reset (throw away) of the altp2m >>> p2m->to >>> rebuild it from the hostp2m. So essentially what is being done here is >>> the range is the non-existent set to start with, unless some altp2m >>> changes occur, and then it is grown to be the smallest set around the gfns >>affected. >> >>Again you just re-state what was already clear, yet you neglect answering the >>actual question. Taking what you wrote above, when max=0 (and >>min=INVALID_GFN), then >> >> gfn_x(gfn) >= p2m->min_remapped_gfn && >> gfn_x(gfn) <= p2m->max_remapped_gfn >> >>will be false for any value of gfn; in fact the "max" part won't even be > looked >>at because the "min" part will already be false for any valid gfn, i.e. only > the >>INVALID_GFN case would make it to the "max" part. >> > > You are suggesting an alternative which will work, but what we have will > also produce the same results, and the results are correct for both cases - so > can we go with the approach we have taken currently? Of course we can, but should we? I.e. why use sub-optimal code when there clearly is a way to improve it? Counter question - why are you insisting to use a model requiring (in the earlier pointed out place) four comparisons when two can do? I realize this is only a small inefficiency here, but you should realize that they add up if we don't look to avoid them where possible. Jan