Re: [PATCH V5] x86/altp2m: Fix crash with INVALID_ALTP2M EPTP index

From: George Dunlap <george.dunlap@citrix.com>
To: Razvan Cojocaru <rcojocaru@bitdefender.com>, xen-devel@lists.xen.org
Cc: george.dunlap@eu.citrix.com, andrew.cooper3@citrix.com,
	kevin.tian@intel.com, jbeulich@suse.com, jun.nakajima@intel.com
Subject: Re: [PATCH V5] x86/altp2m: Fix crash with INVALID_ALTP2M EPTP index
Date: Fri, 20 Jul 2018 18:18:18 +0100	[thread overview]
Message-ID: <8f681a50-e161-b8d5-013f-6b807ad5f942@citrix.com> (raw)
In-Reply-To: <91e61736-c6a4-a628-dbb2-a2d50a9ff38e@bitdefender.com>

On 07/20/2018 05:29 PM, Razvan Cojocaru wrote:
> On 07/20/2018 06:07 PM, George Dunlap wrote:
>> On 06/28/2018 03:35 PM, Razvan Cojocaru wrote:
>>> A VM exit handler executed immediately after enabling #VE might
>>> find a stale __vmsave()d EPTP_INDEX, stored by calling
>>> altp2m_vcpu_destroy() when SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS
>>> had been enabled by altp2m_vcpu_update_vmfunc_ve().
>>>
>>> vmx_vmexit_handler() __vmread()s EPTP_INDEX as soon as
>>> SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS is set, so if an
>>> application enables altp2m on a domain, succesfully calls
>>> xc_altp2m_set_vcpu_enable_notify(), then disables altp2m and
>>> exits, a second run of said application will likely read the
>>> INVALID_ALTP2M EPTP_INDEX set when disabling altp2m in the first
>>> run, and crash the host with the BUG_ON(idx >= MAX_ALTP2M),
>>> between xc_altp2m_set_vcpu_enable_notify() and
>>> xc_altp2m_set_domain_state(..., false).
>>>
>>> The problem is not restricted to an INVALID_ALTP2M EPTP_INDEX
>>> (which can only sanely happen on altp2m uninit), but applies
>>> to any stale index previously saved - which means that all
>>> altp2m_vcpu_update_vmfunc_ve() calls must also call
>>> altp2m_vcpu_update_p2m() after setting
>>> SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS, in order to make sure
>>> that the stored EPTP_INDEX is always valid at
>>> vmx_vmexit_handler() time.
>>
>> I'm sorry, this description still doesn't make hardly any sense to me,
>> nor the solution, even after reading all the previous threads on the
>> issue.  The description doesn't, for instance, mention vcpu_pause() at
>> all, in spite of the fact that it seems (from the previous discussion)
>> that this is a critical part of why this solution works; nor is there
>> any comment in the code about the required discipline regarding
>> SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS,  making it fairly likely that
>> someone will re-introduce a bug like this in the future.
>>
>> My normal template for something like this is
>> 1. Explain what the current situation is
>> 2. Explain why that's a problem
>> 3. Describe what you're changing and how it fixes it.
>>
>> I can't help but think the right thing to do here is in vmx.c somewhere
>> -- it is, after all, code in vmx.c that:
>> 1. Sets and clears SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS
>> 2. Writes EPTP_INDEX
>> 3. Assumes that SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS => EPTP_INDEX is
>> valid.
>>
>> What about something like the attached, instead (compile-tested only)?
> George, thanks for the review, comments and new patch! You're the third
> person telling me that the patch description is hard to parse - I'll
> definitely work on that skill in the future (and sorry for the
> inconvenience).

No worries -- everything here is a bit of a tangled mess.

> The vcpu_pause() lead was a red herring in my initial investigation of
> the issue, and that is the reason why it didn't make it into the patch
> description. The pausing already done is fine.
> 
> I've tested your patch on my system (where I can reproduce the crash
> with a 100% reproduction rate without it), and I've had no crashes - so
> it does seem to have fixed the problem. Thinking about the crash path,
> it also makes sense that it would fix the problem - I can't think of any
> objections to it.
> 
> Let me try the explanation again:
> 
> The current situation: when we run twice an altp2m client application
> which uses altp2m_vcpu_update_vmfunc_ve() (it _has_ to be twice), the
> following happens: after the first run of the application,
> altp2m_vcpu_destroy() gets called as part of the cleanup process, and
> this stores INVALID_ALTP2M EPTP_INDEX in the VMCS.

Right, I meant, the current situation in terms of the way the code in
Xen / the processor currently behaves / what it expects.

I tried to follow that pattern in my own patch.  The key to the whole
bug is this:

* vmx_vmexit_handler() assumes that is VIRT_EXCEPTION is set, that
EPTP_INDEX is valid

Once you state it that way, you realize, OK that's false.  But why is it
false?

* Because VIRT_EXCEPTION is enabled without touching EPTP_INDEX

That's the core problem.  That description by itself should make anyone
go, "Yeah, that will be a a problem."  The details of how that can go
wrong is just icing on the cake / grep fodder for people looking for how
to fix their own problem.

The reason this ever worked, AFAICT, is that EPTP_INDEX was accidentally
correct.  If we'd initialized EPTP_INDEX with 0xDEADBEEF on VMCS
creation, then you also would have hit the bug.  (In fact, that might
not be a bad idea.)

Furthermore, imagine the following scenario:

* dom0 enables altp2m on domain A
* dom0 switches altp2m to view 1 on domain A
* dom0 enables #VE on domain A
* domain A has a vmexit
  -> At this point, EPTP_INDEX is 0, so the vmexit code will drop a
reference on altp2m index 1 and increase the reference count on altp2m
index 0 #

My patch fixes the above issue, but your patch doesn't (AFAICT).  What
altp2m_vcpu_destroy() did wasn't fundamentally buggy; it just
highlighted the issue by doing the equivalent of putting 0xDEADBEEF in
EPTP_INDEX; and what your patch did was to reverse that, by making
EPTP_INDEX accidentally correct again the next time you ran your test.

(Let me know if I'm wrong about that!)

Stating the problem like that -- saying what the assumption is that's
being violated and why -- is not only more clear, but it also leads to a
more robust solution.

> I just
> thought that I should change the code that's _not_ VMX-specific in case
> altp2m is extended to SVM.

Right, but that assumes the internals are going to be similar somehow.
It's better if you don't have to make assumptions about the internals of
an interface you're calling.

> Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
> Tested-by: Razvan Cojocaru <rcojocaru@bitdefender.com>

Thanks. :-)

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel