On 2014-03-26 21:22, Bandan Das wrote:
> Jan Kiszka writes:
>
>> On 2014-03-22 17:43, Bandan Das wrote:
>>> Jan Kiszka writes:
>>>
>>>> On 2014-03-20 21:58, Bandan Das wrote:
>>>>> Jan Kiszka writes:
>>>>>
>>>>>> On 2014-03-20 04:28, Bandan Das wrote:
>>>>>>> Some L1 hypervisors such as Xen seem to be calling invept after
>>>>>>> vmclear or before vmptrld on L2. In this case, proceed with
>>>>>>> falling through and syncing roots as a case where
>>>>>>> context-wide invalidation can't be supported.
>>>>>>
>>>>>> Can we also base this behaviour on a statement in the SDM? At first
>>>>>> glance, I do not find anything like this over there.
>>>>>
>>>>> The SDM has nothing of this sort explicitly mentioned, but 28.3.3.1
>>>>> "Operations that Invalidate Cached Mappings" does mention that
>>>>> the instruction may invalidate mappings associated with other
>>>>> EP4TAs (even in the single-context case).
>>>>
>>>> Yes, "may". So we are implementing undefined behavior in order to
>>>> please a broken hypervisor that relies on it? Then please state this
>>>> in the patch and probably also inform Xen about their issue.
>>>
>>> Why undefined behavior? We don't do anything specific for the
>>> single-context invalidation case anyway, i.e. if the EPTP matches
>>> what vmcs12 has, single-context invalidation already falls through to
>>> the global invalidation case. All this change does is add "L1 calls
>>> invept after vmclear and before vmptrld" to the list of cases that
>>> fall through to global invalidation, since nVMX doesn't have any
>>> knowledge of the current EPTP in this case.
>>
>> OK, I think I misunderstood what the guest expects and how we currently
>> achieve this: we do not track the mapping between guest and host EPTP,
>> thus cannot properly emulate its behaviour. We therefore need to flush
>> everything.
>>
>>>
>>> Or do you think we should rethink this approach?
>>
>> Well, I wonder if we should expose single-context invept support at all.
>>
>> I'm also wondering if we are returning proper flags on
>>
>>     if ((operand.eptp & eptp_mask) !=
>>             (nested_ept_get_cr3(vcpu) & eptp_mask))
>>         break;
>
> That does sound plausible, but then we would have to get rid of the
> little optimization in the code you have quoted above: we would have
> to flush and sync roots for all invept calls. So my preference is to
> keep it.

Do you have any numbers on how many per-context invept calls we are able
to ignore? At least for KVM this should be 0, as we only flush on changes
to the current EPT structures.

Jan

>
>> Neither nested_vmx_succeed nor nested_vmx_fail* is called if this
>> condition evaluates to true.
>
> It appears we should always call nested_vmx_succeed(). If you are OK
> with that, I will send a v2.
>
> Thanks,
> Bandan
>
>> Jan
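
[Editorial sketch for context, not part of the thread: a minimal C sketch
of how the tail of handle_invept() could look once the two points discussed
above are applied, i.e. single-context invalidation falls through to global
invalidation because nVMX does not track the guest-to-host EPTP mapping,
and nested_vmx_succeed() is called unconditionally so L1 sees proper result
flags. Permission checks and decoding of the guest memory operand are
elided; helper names such as kvm_mmu_sync_roots(), kvm_mmu_flush_tlb() and
nested_vmx_failValid() follow the KVM code of that era, but this is an
illustration, not the exact upstream diff.]

	switch (type) {
	case VMX_EPT_EXTENT_CONTEXT:
		/*
		 * We cannot invalidate a single guest EPT context
		 * precisely, since we do not track which host roots a
		 * guest EPTP maps to; over-invalidate by falling
		 * through to the global case.  This also covers L1
		 * issuing INVEPT after VMCLEAR or before VMPTRLD,
		 * where no current EPTP is known.
		 */
	case VMX_EPT_EXTENT_GLOBAL:
		kvm_mmu_sync_roots(vcpu);
		kvm_mmu_flush_tlb(vcpu);
		/*
		 * Always report success, even when the requested EPTP
		 * did not match the current one; silently dropping the
		 * flags would leave L1 with stale VMX status bits.
		 */
		nested_vmx_succeed(vcpu);
		break;
	default:
		/* L1 requested an invalidation type we do not support. */
		nested_vmx_failValid(vcpu,
				VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID);
		break;
	}

	skip_emulated_instruction(vcpu);
	return 1;

[In the actual code the supported invalidation types are validated against
the capabilities reported to L1 before this switch, so the default branch
may be unreachable in practice; it is kept here for completeness.]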