* [PATCH V2 1/2] xen/altp2m: set access_required properly for all altp2ms
@ 2018-06-13  8:52 Razvan Cojocaru
  2018-06-13  8:52 ` [PATCH V2 2/2] x86/altp2m: Fixed domain crash with INVALID_ALTP2M EPTP index Razvan Cojocaru
  2018-06-27 14:09 ` [PATCH V2 1/2] xen/altp2m: set access_required properly for all altp2ms Wei Liu
  0 siblings, 2 replies; 14+ messages in thread
From: Razvan Cojocaru @ 2018-06-13  8:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Razvan Cojocaru, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall,
	Tamas K Lengyel, Jan Beulich

For the hostp2m, access_required starts off as 0, then it can be
set with xc_domain_set_access_required(). However, all the altp2ms
set it to 1 on init, and ignore both the hostp2m and the hypercall.
This patch sets access_required to the value from the hostp2m
on altp2m init, and propagates the values received via hypercall
to all the active altp2ms, when applicable.

Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>

---
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Tamas K Lengyel <tamas@tklengyel.com>
---
Changes since V1:
 - Corrected typos in the commit message.
 - Moved arch_domain_set_access_required() to x86/mm/mem_access.c
   and arm/mem_access.c.
 - Fixed wrong tab type.
 - Removed the d->arch.altp2m_eptp[i] == mfn_x(INVALID_MFN) check.
 - Moved the setting of p2m_get_hostp2m(d)->access_required to
   the arch_domain_set_access_required() function.
---
 xen/arch/arm/mem_access.c    |  5 +++++
 xen/arch/x86/mm/mem_access.c | 18 ++++++++++++++++++
 xen/arch/x86/mm/p2m.c        |  3 ++-
 xen/common/domctl.c          |  4 ++--
 xen/include/xen/domain.h     |  2 ++
 5 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
index ae2686f..a59c6ef 100644
--- a/xen/arch/arm/mem_access.c
+++ b/xen/arch/arm/mem_access.c
@@ -453,6 +453,11 @@ int p2m_get_mem_access(struct domain *d, gfn_t gfn,
     return ret;
 }
 
+void arch_domain_set_access_required(struct domain *d, bool access_required)
+{
+    p2m_get_hostp2m(d)->access_required = access_required;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/mm/mem_access.c b/xen/arch/x86/mm/mem_access.c
index c0cd017..6811572 100644
--- a/xen/arch/x86/mm/mem_access.c
+++ b/xen/arch/x86/mm/mem_access.c
@@ -465,6 +465,24 @@ int p2m_get_mem_access(struct domain *d, gfn_t gfn, xenmem_access_t *access)
     return _p2m_get_mem_access(p2m, gfn, access);
 }
 
+void arch_domain_set_access_required(struct domain *d, bool access_required)
+{
+    unsigned int i;
+
+    p2m_get_hostp2m(d)->access_required = access_required;
+
+    if ( !altp2m_active(d) )
+        return;
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        struct p2m_domain *p2m = d->arch.altp2m_p2m[i];
+
+        if ( p2m )
+            p2m->access_required = access_required;
+    }
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index c53cab4..8e9fbb5 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -199,6 +199,7 @@ static int p2m_init_altp2m(struct domain *d)
 {
     unsigned int i;
     struct p2m_domain *p2m;
+    struct p2m_domain *hostp2m = p2m_get_hostp2m(d);
 
     mm_lock_init(&d->arch.altp2m_list_lock);
     for ( i = 0; i < MAX_ALTP2M; i++ )
@@ -210,7 +211,7 @@ static int p2m_init_altp2m(struct domain *d)
             return -ENOMEM;
         }
         p2m->p2m_class = p2m_alternate;
-        p2m->access_required = 1;
+        p2m->access_required = hostp2m->access_required;
         _atomic_set(&p2m->active_vcpus, 0);
     }
 
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 9b7bc08..37f174f 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -1092,8 +1092,8 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
         else
         {
             domain_pause(d);
-            p2m_get_hostp2m(d)->access_required =
-                op->u.access_required.access_required;
+            arch_domain_set_access_required(d,
+                op->u.access_required.access_required);
             domain_unpause(d);
         }
         break;
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index 177cb35..9df53a6 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -66,6 +66,8 @@ void arch_domain_unpause(struct domain *d);
 
 int arch_domain_soft_reset(struct domain *d);
 
+void arch_domain_set_access_required(struct domain *d, bool access_required);
+
 int arch_set_info_guest(struct vcpu *, vcpu_guest_context_u);
 void arch_get_info_guest(struct vcpu *, vcpu_guest_context_u);
 
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [PATCH V2 2/2] x86/altp2m: Fixed domain crash with INVALID_ALTP2M EPTP index
  2018-06-13  8:52 [PATCH V2 1/2] xen/altp2m: set access_required properly for all altp2ms Razvan Cojocaru
@ 2018-06-13  8:52 ` Razvan Cojocaru
  2018-06-22 15:28   ` Jan Beulich
  2018-06-27 14:09 ` [PATCH V2 1/2] xen/altp2m: set access_required properly for all altp2ms Wei Liu
  1 sibling, 1 reply; 14+ messages in thread
From: Razvan Cojocaru @ 2018-06-13  8:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Tamas K Lengyel, Jun Nakajima, Razvan Cojocaru,
	Andrew Cooper, Jan Beulich

vcpu_altp2m(v).p2midx can become INVALID_ALTP2M with normal
usage (in altp2m_vcpu_reset()), which can then result in that
value being __vmwritten() in EPTP_INDEX by vmx_vcpu_update_eptp().
The value can then end up being __vmread() in vmx_vmexit_handler()
which then calls BUG_ON(idx >= MAX_ALTP2M). Since MAX_ALTP2M is
currently 10 and INVALID_ALTP2M is #defined as 0xffff, the
domain will always crash in this case.

Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>

---
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Tamas K Lengyel <tamas@tklengyel.com>
---
 xen/arch/x86/hvm/vmx/vmx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 9707514..c7f3925 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3592,7 +3592,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
             }
         }
 
-        if ( idx != vcpu_altp2m(v).p2midx )
+        if ( idx != INVALID_ALTP2M && idx != vcpu_altp2m(v).p2midx )
         {
             BUG_ON(idx >= MAX_ALTP2M);
             atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
-- 
2.7.4



* Re: [PATCH V2 2/2] x86/altp2m: Fixed domain crash with INVALID_ALTP2M EPTP index
  2018-06-13  8:52 ` [PATCH V2 2/2] x86/altp2m: Fixed domain crash with INVALID_ALTP2M EPTP index Razvan Cojocaru
@ 2018-06-22 15:28   ` Jan Beulich
  2018-06-22 16:55     ` Razvan Cojocaru
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2018-06-22 15:28 UTC (permalink / raw)
  To: Razvan Cojocaru; +Cc: Andrew Cooper, Kevin Tian, tamas, Jun Nakajima, xen-devel

>>> On 13.06.18 at 10:52, <rcojocaru@bitdefender.com> wrote:
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -3592,7 +3592,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>              }
>          }
>  
> -        if ( idx != vcpu_altp2m(v).p2midx )
> +        if ( idx != INVALID_ALTP2M && idx != vcpu_altp2m(v).p2midx )
>          {
>              BUG_ON(idx >= MAX_ALTP2M);

In the code immediately ahead of this there is an INVALID_ALTP2M check
already (in the else branch). If the __vmread() can legitimately produce
this value, why would the domain be crashed when getting back
INVALID_ALTP2M in the other case? I think the correctness of your change
can only be judged once both code paths behave consistently.

Jan




* Re: [PATCH V2 2/2] x86/altp2m: Fixed domain crash with INVALID_ALTP2M EPTP index
  2018-06-22 15:28   ` Jan Beulich
@ 2018-06-22 16:55     ` Razvan Cojocaru
  2018-06-25 12:12       ` Razvan Cojocaru
  0 siblings, 1 reply; 14+ messages in thread
From: Razvan Cojocaru @ 2018-06-22 16:55 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Kevin Tian, tamas, Jun Nakajima, xen-devel

On 06/22/2018 06:28 PM, Jan Beulich wrote:
>>>> On 13.06.18 at 10:52, <rcojocaru@bitdefender.com> wrote:
>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> @@ -3592,7 +3592,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>>              }
>>          }
>>  
>> -        if ( idx != vcpu_altp2m(v).p2midx )
>> +        if ( idx != INVALID_ALTP2M && idx != vcpu_altp2m(v).p2midx )
>>          {
>>              BUG_ON(idx >= MAX_ALTP2M);
> 
> In the code immediately ahead of this there is an INVALID_ALTP2M check
> already (in the else branch). If the __vmread() can legitimately produce
> this value, why would the domain be crashed when getting back
> INVALID_ALTP2M in the other case? I think the correctness of your change
> can only be judged once both code paths behave consistently.

You're right, I had somehow convinced myself that this is a #VE-specific
problem, but it looks like a generic altp2m problem. I'll simulate the
other branch in the code and see what it does with my small test
application.


Thanks,
Razvan


* Re: [PATCH V2 2/2] x86/altp2m: Fixed domain crash with INVALID_ALTP2M EPTP index
  2018-06-22 16:55     ` Razvan Cojocaru
@ 2018-06-25 12:12       ` Razvan Cojocaru
  2018-06-25 12:28         ` Jan Beulich
  0 siblings, 1 reply; 14+ messages in thread
From: Razvan Cojocaru @ 2018-06-25 12:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Kevin Tian, tamas, Jun Nakajima, xen-devel

On 06/22/2018 07:55 PM, Razvan Cojocaru wrote:
> On 06/22/2018 06:28 PM, Jan Beulich wrote:
>>>>> On 13.06.18 at 10:52, <rcojocaru@bitdefender.com> wrote:
>>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>>> @@ -3592,7 +3592,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>>>              }
>>>          }
>>>  
>>> -        if ( idx != vcpu_altp2m(v).p2midx )
>>> +        if ( idx != INVALID_ALTP2M && idx != vcpu_altp2m(v).p2midx )
>>>          {
>>>              BUG_ON(idx >= MAX_ALTP2M);
>>
>> In the code immediately ahead of this there is an INVALID_ALTP2M check
>> already (in the else branch). If the __vmread() can legitimately produce
>> this value, why would the domain be crashed when getting back
>> INVALID_ALTP2M in the other case? I think the correctness of your change
>> can only be judged once both code paths behave consistently.
> 
> You're right, I had somehow convinced myself that this is a #VE-specific
> problem, but it looks like a generic altp2m problem. I'll simulate the
> other branch in the code and see what it does with my small test
> application.

After a bit of debugging, the issue explained in full seems to be this
(it indeed appears to be #VE-specific, as initially assumed): client
application calls xc_altp2m_set_domain_state(xci, domid, 1), followed by
xc_altp2m_set_vcpu_enable_notify() (with a suitable gfn), followed by
xc_altp2m_set_domain_state(xci, domid, 0).

This causes Xen to go through the following steps:

1. altp2m_vcpu_initialise() (calls altp2m_vcpu_reset()).
2. HVMOP_altp2m_vcpu_enable_notify -> vmx_vcpu_update_vmfunc_ve().
3. altp2m_vcpu_destroy() (calls altp2m_vcpu_reset() and (indirectly)
vmx_vcpu_update_eptp()).
4. Still part of the altp2m_vcpu_destroy() workflow,
altp2m_vcpu_update_vmfunc_ve(v) gets called.

At step 2, vmx_vcpu_update_vmfunc_ve() modifies
v->arch.hvm_vmx.secondary_exec_control (from 0x1054eb to 0x1474eb -
which has the SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS bit set).

At step 3, altp2m_vcpu_reset() sets av->p2midx = INVALID_ALTP2M, then
vmx_vcpu_update_eptp() sees that SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS
is set, and as a consequence calls __vmwrite(EPTP_INDEX,
vcpu_altp2m(v).p2midx).

Now, at step 4 the SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS bit should
become 0, because altp2m_vcpu_reset() has set veinfo_gfn to INVALID_GFN.
But _sometimes_, what happens is that _between_ steps 3 and 4 a
vmx_vmexit_handler() occurs, which __vmread()s EPTP_INDEX (on the logic
branch I've tried to fix), compares it to MAX_ALTP2M and then proceeds
to BUG_ON(), bringing the hypervisor down.
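
As an aside, the two secondary_exec_control values quoted for step 2 can
be decoded with a few lines of standalone C. This is only a sketch: the
two bit values below are what Xen's vmcs.h is believed to define and
should be checked against the tree, and changed_bits() and
describe_change() are made-up helper names.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed values (believed to match Xen's vmcs.h; verify against the tree). */
#define SECONDARY_EXEC_ENABLE_VM_FUNCTIONS     0x00002000u
#define SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS  0x00040000u

/* Return the bits that differ between two secondary_exec_control values. */
static uint32_t changed_bits(uint32_t before, uint32_t after)
{
    return before ^ after;
}

/* Hypothetical helper: report which of the two known controls changed. */
static void describe_change(uint32_t before, uint32_t after)
{
    uint32_t diff = changed_bits(before, after);

    printf("changed bits: %#x\n", (unsigned int)diff);
    if ( diff & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS )
        printf("  VM functions toggled\n");
    if ( diff & SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
        printf("  virtualization exceptions (#VE) toggled\n");
}
```

Under those assumptions, 0x1054eb and 0x1474eb differ in exactly the two
bits above, which fits the statement that the #VE control is set after
step 2.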


Thanks,
Razvan


* Re: [PATCH V2 2/2] x86/altp2m: Fixed domain crash with INVALID_ALTP2M EPTP index
  2018-06-25 12:12       ` Razvan Cojocaru
@ 2018-06-25 12:28         ` Jan Beulich
  2018-06-25 12:32           ` Razvan Cojocaru
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2018-06-25 12:28 UTC (permalink / raw)
  To: Razvan Cojocaru; +Cc: Andrew Cooper, Kevin Tian, tamas, Jun Nakajima, xen-devel

>>> On 25.06.18 at 14:12, <rcojocaru@bitdefender.com> wrote:
> On 06/22/2018 07:55 PM, Razvan Cojocaru wrote:
>> On 06/22/2018 06:28 PM, Jan Beulich wrote:
>>>>>> On 13.06.18 at 10:52, <rcojocaru@bitdefender.com> wrote:
>>>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>>>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>>>> @@ -3592,7 +3592,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>>>>              }
>>>>          }
>>>>  
>>>> -        if ( idx != vcpu_altp2m(v).p2midx )
>>>> +        if ( idx != INVALID_ALTP2M && idx != vcpu_altp2m(v).p2midx )
>>>>          {
>>>>              BUG_ON(idx >= MAX_ALTP2M);
>>>
>>> In the code immediately ahead of this there is an INVALID_ALTP2M check
>>> already (in the else branch). If the __vmread() can legitimately produce
>>> this value, why would the domain be crashed when getting back
>>> INVALID_ALTP2M in the other case? I think the correctness of your change
>>> can only be judged once both code paths behave consistently.
>> 
>> You're right, I had somehow convinced myself that this is a #VE-specific
>> problem, but it looks like a generic altp2m problem. I'll simulate the
>> other branch in the code and see what it does with my small test
>> application.
> 
> After a bit of debugging, the issue explained in full seems to be this
> (it indeed appears to be #VE-specific, as initially assumed): client
> application calls xc_altp2m_set_domain_state(xci, domid, 1), followed by
> xc_altp2m_set_vcpu_enable_notify() (with a suitable gfn), followed by
> xc_altp2m_set_domain_state(xci, domid, 0).
> 
> This causes Xen to go through the following steps:
> 
> 1. altp2m_vcpu_initialise() (calls altp2m_vcpu_reset()).
> 2. HVMOP_altp2m_vcpu_enable_notify -> vmx_vcpu_update_vmfunc_ve().
> 3. altp2m_vcpu_destroy() (calls altp2m_vcpu_reset() and (indirectly)
> vmx_vcpu_update_eptp()).
> 4. Still part of the altp2m_vcpu_destroy() workflow,
> altp2m_vcpu_update_vmfunc_ve(v) gets called.
> 
> At step 2, vmx_vcpu_update_vmfunc_ve() modifies
> v->arch.hvm_vmx.secondary_exec_control (from 0x1054eb to 0x1474eb -
> which has the SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS bit set).
> 
> At step 3, altp2m_vcpu_reset() sets av->p2midx = INVALID_ALTP2M, then
> vmx_vcpu_update_eptp() sees that SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS
> is set, and as a consequence calls __vmwrite(EPTP_INDEX,
> vcpu_altp2m(v).p2midx).
> 
> Now, at step 4 the SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS bit should now
> become 0, because altp2m_vcpu_reset() has set veinfo_gfn to INVALID_GFN.
> But _sometimes_, what happens is that _between_ steps 3 and 4 a
> vmx_vmexit_handler() occurs, which __vmread()s EPTP_INDEX (on the logic
> branch I've tried to fix), compares it to MAX_ALTP2M and then proceeds
> to BUG_ON(), bringing the hypervisor down.

Thanks for the detailed analysis. With that I wonder whether it is
reasonable for a VM exit to occur in parallel with the processing of
altp2m_vcpu_destroy(). Shouldn't a domain (or vCPU) undergoing such
a mode change be paused?

I also remain unconvinced that a similar race is entirely impossible in the
non-#VE case.

Jan




* Re: [PATCH V2 2/2] x86/altp2m: Fixed domain crash with INVALID_ALTP2M EPTP index
  2018-06-25 12:28         ` Jan Beulich
@ 2018-06-25 12:32           ` Razvan Cojocaru
  2018-06-25 12:40             ` Razvan Cojocaru
  0 siblings, 1 reply; 14+ messages in thread
From: Razvan Cojocaru @ 2018-06-25 12:32 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Kevin Tian, tamas, Jun Nakajima, xen-devel

On 06/25/2018 03:28 PM, Jan Beulich wrote:
>>>> On 25.06.18 at 14:12, <rcojocaru@bitdefender.com> wrote:
>> On 06/22/2018 07:55 PM, Razvan Cojocaru wrote:
>>> On 06/22/2018 06:28 PM, Jan Beulich wrote:
>>>>>>> On 13.06.18 at 10:52, <rcojocaru@bitdefender.com> wrote:
>>>>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>>>>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>>>>> @@ -3592,7 +3592,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>>>>>              }
>>>>>          }
>>>>>  
>>>>> -        if ( idx != vcpu_altp2m(v).p2midx )
>>>>> +        if ( idx != INVALID_ALTP2M && idx != vcpu_altp2m(v).p2midx )
>>>>>          {
>>>>>              BUG_ON(idx >= MAX_ALTP2M);
>>>>
>>>> In the code immediately ahead of this there is an INVALID_ALTP2M check
>>>> already (in the else branch). If the __vmread() can legitimately produce
>>>> this value, why would the domain be crashed when getting back
>>>> INVALID_ALTP2M in the other case? I think the correctness of your change
>>>> can only be judged once both code paths behave consistently.
>>>
>>> You're right, I had somehow convinced myself that this is a #VE-specific
>>> problem, but it looks like a generic altp2m problem. I'll simulate the
>>> other branch in the code and see what it does with my small test
>>> application.
>>
>> After a bit of debugging, the issue explained in full seems to be this
>> (it indeed appears to be #VE-specific, as initially assumed): client
>> application calls xc_altp2m_set_domain_state(xci, domid, 1), followed by
>> xc_altp2m_set_vcpu_enable_notify() (with a suitable gfn), followed by
>> xc_altp2m_set_domain_state(xci, domid, 0).
>>
>> This causes Xen to go through the following steps:
>>
>> 1. altp2m_vcpu_initialise() (calls altp2m_vcpu_reset()).
>> 2. HVMOP_altp2m_vcpu_enable_notify -> vmx_vcpu_update_vmfunc_ve().
>> 3. altp2m_vcpu_destroy() (calls altp2m_vcpu_reset() and (indirectly)
>> vmx_vcpu_update_eptp()).
>> 4. Still part of the altp2m_vcpu_destroy() workflow,
>> altp2m_vcpu_update_vmfunc_ve(v) gets called.
>>
>> At step 2, vmx_vcpu_update_vmfunc_ve() modifies
>> v->arch.hvm_vmx.secondary_exec_control (from 0x1054eb to 0x1474eb -
>> which has the SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS bit set).
>>
>> At step 3, altp2m_vcpu_reset() sets av->p2midx = INVALID_ALTP2M, then
>> vmx_vcpu_update_eptp() sees that SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS
>> is set, and as a consequence calls __vmwrite(EPTP_INDEX,
>> vcpu_altp2m(v).p2midx).
>>
>> Now, at step 4 the SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS bit should now
>> become 0, because altp2m_vcpu_reset() has set veinfo_gfn to INVALID_GFN.
>> But _sometimes_, what happens is that _between_ steps 3 and 4 a
>> vmx_vmexit_handler() occurs, which __vmread()s EPTP_INDEX (on the logic
>> branch I've tried to fix), compares it to MAX_ALTP2M and then proceeds
>> to BUG_ON(), bringing the hypervisor down.
> 
> Thanks for the detailed analysis. With that I wonder whether it is
> reasonable for a VM exit to occur in parallel with the processing of
> altp2m_vcpu_destroy(). Shouldn't a domain (or vCPU) undergoing such
> a mode change be paused?
> 
> I also remain unconvinced that a similar race is entirely impossible in the
> non-#VE case.

Apologies, I seem to have misread the crash timing.

A "good run":

(XEN) [ 1923.964832] altp2m_vcpu_initialise()
(XEN) [ 1923.964836] altp2m_vcpu_reset()
(XEN) [ 1923.964837] 1 altp2m_vcpu_update_p2m()
(XEN) [ 1923.964838] vmx_vcpu_update_eptp()
(XEN) [ 1923.964876] HVMOP_altp2m_vcpu_enable_notify
(XEN) [ 1923.964878] vmx_vcpu_update_vmfunc_ve(0),
v->arch.hvm_vmx.secondary_exec_control: 0x1054eb
(XEN) [ 1923.964880] exit vmx_vcpu_update_vmfunc_ve(0),
v->arch.hvm_vmx.secondary_exec_control: 0x1474eb
(XEN) [ 1923.964986] altp2m_vcpu_destroy()
(XEN) [ 1923.964987] altp2m_vcpu_reset()
(XEN) [ 1923.964988] 2 altp2m_vcpu_update_p2m()
(XEN) [ 1923.964989] vmx_vcpu_update_eptp()
(XEN) [ 1923.964991] __vmwrite(EPTP_INDEX, 65535)
(XEN) [ 1923.964992] vmx_vcpu_update_vmfunc_ve(0),
v->arch.hvm_vmx.secondary_exec_control: 0x1474eb
(XEN) [ 1923.964993] exit vmx_vcpu_update_vmfunc_ve(0),
v->arch.hvm_vmx.secondary_exec_control: 0x1054eb

Crash:

(XEN) [ 1924.367273] altp2m_vcpu_initialise()
(XEN) [ 1924.367277] altp2m_vcpu_reset()
(XEN) [ 1924.367278] 1 altp2m_vcpu_update_p2m()
(XEN) [ 1924.367279] vmx_vcpu_update_eptp()
(XEN) [ 1924.367318] HVMOP_altp2m_vcpu_enable_notify
(XEN) [ 1924.367321] vmx_vcpu_update_vmfunc_ve(0),
v->arch.hvm_vmx.secondary_exec_control: 0x1054eb
(XEN) [ 1924.367326] exit vmx_vcpu_update_vmfunc_ve(0),
v->arch.hvm_vmx.secondary_exec_control: 0x1474eb
(XEN) [ 1924.367344] Xen BUG at vmx.c:3407

The vmx_vmexit_handler() call appears to happen right after the first
vmx_vcpu_update_vmfunc_ve() call, but still before
altp2m_vcpu_destroy(). I was also quite confused that a
vmx_vmexit_handler() run is possible in parallel with an HVMOP.

I'll keep digging.


Thanks,
Razvan


* Re: [PATCH V2 2/2] x86/altp2m: Fixed domain crash with INVALID_ALTP2M EPTP index
  2018-06-25 12:32           ` Razvan Cojocaru
@ 2018-06-25 12:40             ` Razvan Cojocaru
  2018-06-25 12:54               ` Jan Beulich
  0 siblings, 1 reply; 14+ messages in thread
From: Razvan Cojocaru @ 2018-06-25 12:40 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Kevin Tian, tamas, Jun Nakajima, xen-devel

> (XEN) [ 1923.964832] altp2m_vcpu_initialise()
> (XEN) [ 1923.964836] altp2m_vcpu_reset()
> (XEN) [ 1923.964837] 1 altp2m_vcpu_update_p2m()
> (XEN) [ 1923.964838] vmx_vcpu_update_eptp()
> (XEN) [ 1923.964876] HVMOP_altp2m_vcpu_enable_notify
> (XEN) [ 1923.964878] vmx_vcpu_update_vmfunc_ve(0),
> v->arch.hvm_vmx.secondary_exec_control: 0x1054eb
> (XEN) [ 1923.964880] exit vmx_vcpu_update_vmfunc_ve(0),
> v->arch.hvm_vmx.secondary_exec_control: 0x1474eb
> (XEN) [ 1923.964986] altp2m_vcpu_destroy()
> (XEN) [ 1923.964987] altp2m_vcpu_reset()
> (XEN) [ 1923.964988] 2 altp2m_vcpu_update_p2m()
> (XEN) [ 1923.964989] vmx_vcpu_update_eptp()
> (XEN) [ 1923.964991] __vmwrite(EPTP_INDEX, 65535)
> (XEN) [ 1923.964992] vmx_vcpu_update_vmfunc_ve(0),
> v->arch.hvm_vmx.secondary_exec_control: 0x1474eb
> (XEN) [ 1923.964993] exit vmx_vcpu_update_vmfunc_ve(0),
> v->arch.hvm_vmx.secondary_exec_control: 0x1054eb
> 
> Crash:
> 
> (XEN) [ 1924.367273] altp2m_vcpu_initialise()
> (XEN) [ 1924.367277] altp2m_vcpu_reset()
> (XEN) [ 1924.367278] 1 altp2m_vcpu_update_p2m()
> (XEN) [ 1924.367279] vmx_vcpu_update_eptp()
> (XEN) [ 1924.367318] HVMOP_altp2m_vcpu_enable_notify
> (XEN) [ 1924.367321] vmx_vcpu_update_vmfunc_ve(0),
> v->arch.hvm_vmx.secondary_exec_control: 0x1054eb
> (XEN) [ 1924.367326] exit vmx_vcpu_update_vmfunc_ve(0),
> v->arch.hvm_vmx.secondary_exec_control: 0x1474eb
> (XEN) [ 1924.367344] Xen BUG at vmx.c:3407

Actually I think this shows us the problem: 65535 (INVALID_ALTP2M) is a
stale value from a previous good run. But the EPTP_INDEX value is
ignored unless SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS is set. So at the
crash point, SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS just got set, the
"live" index is 0, and the stale INVALID_ALTP2M value is being read from
the previous run (and compared to 0 and MAX_ALTP2M).
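
The stale-value scenario can be replayed with a toy model. To be clear,
this is not Xen code: struct toy_vcpu and the toy_*() functions are
invented for illustration, and the model assumes (as described above)
that EPTP_INDEX is written and read only while the #VE control is set.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_ALTP2M      10       /* value quoted in the commit message */
#define INVALID_ALTP2M  0xffff   /* value quoted in the commit message */

/* Toy stand-in for the per-vCPU state discussed here; not Xen's layout. */
struct toy_vcpu {
    bool     ve_enabled;   /* models SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS */
    uint16_t p2midx;       /* models vcpu_altp2m(v).p2midx */
    uint16_t eptp_index;   /* models the EPTP_INDEX field in the VMCS */
};

/* Models vmx_vcpu_update_eptp(): EPTP_INDEX is written only with #VE on. */
static void toy_update_eptp(struct toy_vcpu *v)
{
    if ( v->ve_enabled )
        v->eptp_index = v->p2midx;
}

/* Models the check in vmx_vmexit_handler(); true means BUG_ON() would fire. */
static bool toy_vmexit_would_bug(const struct toy_vcpu *v)
{
    if ( !v->ve_enabled )
        return false;      /* the non-#VE lookup path is not modelled */
    return v->eptp_index != v->p2midx && v->eptp_index >= MAX_ALTP2M;
}

/* Replays the teardown of run 1 followed by the setup of run 2. */
static bool toy_second_run_bugs(void)
{
    struct toy_vcpu v = { .ve_enabled = true, .p2midx = 0, .eptp_index = 0 };

    /* Run 1 teardown: reset, then update_eptp while #VE is still on. */
    v.p2midx = INVALID_ALTP2M;
    toy_update_eptp(&v);          /* __vmwrite(EPTP_INDEX, 65535) */
    v.ve_enabled = false;         /* update_vmfunc_ve() clears the bit */

    /* Run 2 setup: index 0 is live, but the write is skipped (#VE off). */
    v.p2midx = 0;
    toy_update_eptp(&v);
    v.ve_enabled = true;          /* enable_notify sets the bit again */

    return toy_vmexit_would_bug(&v);
}
```

In this model the second run trips the check because the field still
holds 65535 while the live index is 0.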


Thanks,
Razvan


* Re: [PATCH V2 2/2] x86/altp2m: Fixed domain crash with INVALID_ALTP2M EPTP index
  2018-06-25 12:40             ` Razvan Cojocaru
@ 2018-06-25 12:54               ` Jan Beulich
  2018-06-25 12:59                 ` Razvan Cojocaru
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2018-06-25 12:54 UTC (permalink / raw)
  To: Razvan Cojocaru; +Cc: Andrew Cooper, Kevin Tian, tamas, Jun Nakajima, xen-devel

>>> On 25.06.18 at 14:40, <rcojocaru@bitdefender.com> wrote:
>> Crash:
>> 
>> (XEN) [ 1924.367273] altp2m_vcpu_initialise()
>> (XEN) [ 1924.367277] altp2m_vcpu_reset()
>> (XEN) [ 1924.367278] 1 altp2m_vcpu_update_p2m()
>> (XEN) [ 1924.367279] vmx_vcpu_update_eptp()
>> (XEN) [ 1924.367318] HVMOP_altp2m_vcpu_enable_notify
>> (XEN) [ 1924.367321] vmx_vcpu_update_vmfunc_ve(0),
>> v->arch.hvm_vmx.secondary_exec_control: 0x1054eb
>> (XEN) [ 1924.367326] exit vmx_vcpu_update_vmfunc_ve(0),
>> v->arch.hvm_vmx.secondary_exec_control: 0x1474eb
>> (XEN) [ 1924.367344] Xen BUG at vmx.c:3407
> 
> Actually I think this shows us the problem: 65535 (INVALID_ALTP2M) is a
> stale value from a previous good run. But the EPTP_INDEX value is
> ignored unless SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS is set. So at the
> crash point, SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS just got set, the
> "live" index is 0, and the stale INVALID_ALTP2M value is being read from
> the previous run (and compared to 0 and MAX_ALTP2M).

So perhaps the writing of EPTP_INDEX should be done earlier?

Jan




* Re: [PATCH V2 2/2] x86/altp2m: Fixed domain crash with INVALID_ALTP2M EPTP index
  2018-06-25 12:54               ` Jan Beulich
@ 2018-06-25 12:59                 ` Razvan Cojocaru
  2018-06-25 13:11                   ` Jan Beulich
  0 siblings, 1 reply; 14+ messages in thread
From: Razvan Cojocaru @ 2018-06-25 12:59 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Kevin Tian, tamas, Jun Nakajima, xen-devel

On 06/25/2018 03:54 PM, Jan Beulich wrote:
>>>> On 25.06.18 at 14:40, <rcojocaru@bitdefender.com> wrote:
>>> Crash:
>>>
>>> (XEN) [ 1924.367273] altp2m_vcpu_initialise()
>>> (XEN) [ 1924.367277] altp2m_vcpu_reset()
>>> (XEN) [ 1924.367278] 1 altp2m_vcpu_update_p2m()
>>> (XEN) [ 1924.367279] vmx_vcpu_update_eptp()
>>> (XEN) [ 1924.367318] HVMOP_altp2m_vcpu_enable_notify
>>> (XEN) [ 1924.367321] vmx_vcpu_update_vmfunc_ve(0),
>>> v->arch.hvm_vmx.secondary_exec_control: 0x1054eb
>>> (XEN) [ 1924.367326] exit vmx_vcpu_update_vmfunc_ve(0),
>>> v->arch.hvm_vmx.secondary_exec_control: 0x1474eb
>>> (XEN) [ 1924.367344] Xen BUG at vmx.c:3407
>>
>> Actually I think this shows us the problem: 65535 (INVALID_ALTP2M) is a
>> stale value from a previous good run. But the EPTP_INDEX value is
>> ignored unless SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS is set. So at the
>> crash point, SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS just got set, the
>> "live" index is 0, and the stale INVALID_ALTP2M value is being read from
>> the previous run (and compared to 0 and MAX_ALTP2M).
> 
> So perhaps the writing of EPTP_INDEX should be done earlier?

And indeed I can confirm this: I've added a sleep() in my test between
xc_altp2m_set_vcpu_enable_notify() and xc_altp2m_set_domain_state(xci,
domid, 0), and it _always_ crashes Xen on the second run.

Quite right, that's exactly what I've been doing: a satisfactory fix
appears to be to simply reverse the order of altp2m_vcpu_update_p2m(v)
and altp2m_vcpu_update_vmfunc_ve(v) in altp2m_vcpu_destroy().

I'll send out a patch ASAP.


Thanks,
Razvan


* Re: [PATCH V2 2/2] x86/altp2m: Fixed domain crash with INVALID_ALTP2M EPTP index
  2018-06-25 12:59                 ` Razvan Cojocaru
@ 2018-06-25 13:11                   ` Jan Beulich
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2018-06-25 13:11 UTC (permalink / raw)
  To: Razvan Cojocaru; +Cc: Andrew Cooper, Kevin Tian, tamas, Jun Nakajima, xen-devel

>>> On 25.06.18 at 14:59, <rcojocaru@bitdefender.com> wrote:
> Quite right, that's exactly what I've been doing: a satisfactory fix
> appears to be to simply reverse the order of altp2m_vcpu_update_p2m(v)
> and altp2m_vcpu_update_vmfunc_ve(v) in altp2m_vcpu_destroy().

And that's also more logical considering that
vmx_vcpu_update_vmfunc_ve() modifies
SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS which
vmx_vcpu_update_eptp() actually looks at to decide whether to
write EPTP_INDEX.
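
The effect of the ordering can be sketched with a small model. This is
not the Xen implementation: the toy_*() names and struct toy_vcpu are
hypothetical, and the model only captures the dependency described
above (update_eptp consults the bit that update_vmfunc_ve changes).

```c
#include <stdbool.h>
#include <stdint.h>

#define INVALID_ALTP2M 0xffff    /* value quoted in the commit message */

/* Toy per-vCPU state; field names are illustrative, not Xen's. */
struct toy_vcpu {
    bool     ve_enabled;     /* models SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS */
    bool     veinfo_valid;   /* models veinfo_gfn != INVALID_GFN */
    uint16_t p2midx;         /* models vcpu_altp2m(v).p2midx */
    uint16_t eptp_index;     /* models the EPTP_INDEX VMCS field */
};

/* Writes EPTP_INDEX only while the #VE control is set. */
static void toy_update_eptp(struct toy_vcpu *v)
{
    if ( v->ve_enabled )
        v->eptp_index = v->p2midx;
}

/* The #VE control follows whether a valid veinfo page is configured. */
static void toy_update_vmfunc_ve(struct toy_vcpu *v)
{
    v->ve_enabled = v->veinfo_valid;
}

/* Tear down a vCPU that had #VE enabled; return what is left in EPTP_INDEX. */
static uint16_t toy_index_after_destroy(bool ve_first)
{
    struct toy_vcpu v = { .ve_enabled = true, .veinfo_valid = true };

    /* Models altp2m_vcpu_reset(). */
    v.p2midx = INVALID_ALTP2M;
    v.veinfo_valid = false;

    if ( ve_first )
    {
        toy_update_vmfunc_ve(&v);   /* proposed order: clear #VE first */
        toy_update_eptp(&v);
    }
    else
    {
        toy_update_eptp(&v);        /* original order: index written first */
        toy_update_vmfunc_ve(&v);
    }

    return v.eptp_index;
}
```

With the original order the model leaves INVALID_ALTP2M behind in the
field; with the reversed order the invalid index is never written.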

Jan




* Re: [PATCH V2 1/2] xen/altp2m: set access_required properly for all altp2ms
  2018-06-13  8:52 [PATCH V2 1/2] xen/altp2m: set access_required properly for all altp2ms Razvan Cojocaru
  2018-06-13  8:52 ` [PATCH V2 2/2] x86/altp2m: Fixed domain crash with INVALID_ALTP2M EPTP index Razvan Cojocaru
@ 2018-06-27 14:09 ` Wei Liu
  2018-06-28  7:08   ` Razvan Cojocaru
  1 sibling, 1 reply; 14+ messages in thread
From: Wei Liu @ 2018-06-27 14:09 UTC (permalink / raw)
  To: Razvan Cojocaru
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel, Julien Grall,
	Tamas K Lengyel, Jan Beulich

On Wed, Jun 13, 2018 at 11:52:18AM +0300, Razvan Cojocaru wrote:
> ---
>  xen/arch/arm/mem_access.c    |  5 +++++
>  xen/arch/x86/mm/mem_access.c | 18 ++++++++++++++++++
>  xen/arch/x86/mm/p2m.c        |  3 ++-
>  xen/common/domctl.c          |  4 ++--
>  xen/include/xen/domain.h     |  2 ++
>  5 files changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
> index ae2686f..a59c6ef 100644
> --- a/xen/arch/arm/mem_access.c
> +++ b/xen/arch/arm/mem_access.c
> @@ -453,6 +453,11 @@ int p2m_get_mem_access(struct domain *d, gfn_t gfn,
>      return ret;
>  }
>  
> +void arch_domain_set_access_required(struct domain *d, bool access_required)
> +{
> +    p2m_get_hostp2m(d)->access_required = access_required;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/x86/mm/mem_access.c b/xen/arch/x86/mm/mem_access.c
> index c0cd017..6811572 100644
> --- a/xen/arch/x86/mm/mem_access.c
> +++ b/xen/arch/x86/mm/mem_access.c
> @@ -465,6 +465,24 @@ int p2m_get_mem_access(struct domain *d, gfn_t gfn, xenmem_access_t *access)
>      return _p2m_get_mem_access(p2m, gfn, access);
>  }
>  
> +void arch_domain_set_access_required(struct domain *d, bool access_required)

arch_p2m_set_access_required?

> +{
> +    unsigned int i;
> +
> +    p2m_get_hostp2m(d)->access_required = access_required;
> +
> +    if ( !altp2m_active(d) )
> +        return;
> +
> +    for ( i = 0; i < MAX_ALTP2M; i++ )
> +    {
> +        struct p2m_domain *p2m = d->arch.altp2m_p2m[i];
> +
> +        if ( p2m )
> +            p2m->access_required = access_required;
> +    }

It seems to me you should check for domain pause count at the beginning
of this function to avoid mistakes.

The rest of it looks fine (to my untrained eye).

Wei.

* Re: [PATCH V2 1/2] xen/altp2m: set access_required properly for all altp2ms
  2018-06-27 14:09 ` [PATCH V2 1/2] xen/altp2m: set access_required properly for all altp2ms Wei Liu
@ 2018-06-28  7:08   ` Razvan Cojocaru
  2018-06-28  7:15     ` Wei Liu
  0 siblings, 1 reply; 14+ messages in thread
From: Razvan Cojocaru @ 2018-06-28  7:08 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Tim Deegan,
	xen-devel, Julien Grall, Tamas K Lengyel, Jan Beulich,
	Ian Jackson

On 06/27/2018 05:09 PM, Wei Liu wrote:
> On Wed, Jun 13, 2018 at 11:52:18AM +0300, Razvan Cojocaru wrote:
>> ---
>>  xen/arch/arm/mem_access.c    |  5 +++++
>>  xen/arch/x86/mm/mem_access.c | 18 ++++++++++++++++++
>>  xen/arch/x86/mm/p2m.c        |  3 ++-
>>  xen/common/domctl.c          |  4 ++--
>>  xen/include/xen/domain.h     |  2 ++
>>  5 files changed, 29 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
>> index ae2686f..a59c6ef 100644
>> --- a/xen/arch/arm/mem_access.c
>> +++ b/xen/arch/arm/mem_access.c
>> @@ -453,6 +453,11 @@ int p2m_get_mem_access(struct domain *d, gfn_t gfn,
>>      return ret;
>>  }
>>  
>> +void arch_domain_set_access_required(struct domain *d, bool access_required)
>> +{
>> +    p2m_get_hostp2m(d)->access_required = access_required;
>> +}
>> +
>>  /*
>>   * Local variables:
>>   * mode: C
>> diff --git a/xen/arch/x86/mm/mem_access.c b/xen/arch/x86/mm/mem_access.c
>> index c0cd017..6811572 100644
>> --- a/xen/arch/x86/mm/mem_access.c
>> +++ b/xen/arch/x86/mm/mem_access.c
>> @@ -465,6 +465,24 @@ int p2m_get_mem_access(struct domain *d, gfn_t gfn, xenmem_access_t *access)
>>      return _p2m_get_mem_access(p2m, gfn, access);
>>  }
>>  
>> +void arch_domain_set_access_required(struct domain *d, bool access_required)
> 
> arch_p2m_set_access_required?

I'll change it.

>> +{
>> +    unsigned int i;
>> +
>> +    p2m_get_hostp2m(d)->access_required = access_required;
>> +
>> +    if ( !altp2m_active(d) )
>> +        return;
>> +
>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>> +    {
>> +        struct p2m_domain *p2m = d->arch.altp2m_p2m[i];
>> +
>> +        if ( p2m )
>> +            p2m->access_required = access_required;
>> +    }
> 
> It seems to me you should check for domain pause count at the beginning
> of this function to avoid mistakes.

Do you mean ASSERT(atomic_read(&d->pause_count)); ?

> The rest of it looks fine (to my untrained eye).

Thanks!

* Re: [PATCH V2 1/2] xen/altp2m: set access_required properly for all altp2ms
  2018-06-28  7:08   ` Razvan Cojocaru
@ 2018-06-28  7:15     ` Wei Liu
  0 siblings, 0 replies; 14+ messages in thread
From: Wei Liu @ 2018-06-28  7:15 UTC (permalink / raw)
  To: Razvan Cojocaru
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Ian Jackson,
	Tim Deegan, xen-devel, Julien Grall, Tamas K Lengyel,
	Jan Beulich, Andrew Cooper

On Thu, Jun 28, 2018 at 10:08:19AM +0300, Razvan Cojocaru wrote:
> >> +{
> >> +    unsigned int i;
> >> +
> >> +    p2m_get_hostp2m(d)->access_required = access_required;
> >> +
> >> +    if ( !altp2m_active(d) )
> >> +        return;
> >> +
> >> +    for ( i = 0; i < MAX_ALTP2M; i++ )
> >> +    {
> >> +        struct p2m_domain *p2m = d->arch.altp2m_p2m[i];
> >> +
> >> +        if ( p2m )
> >> +            p2m->access_required = access_required;
> >> +    }
> > 
> > It seems to me you should check for domain pause count at the beginning
> > of this function to avoid mistakes.
> 
> Do you mean ASSERT(atomic_read(&d->pause_count)); ?

Yes.

Wei.
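
The check agreed on above can be sketched as a stand-alone model. This is
not Xen code: the struct layouts, the plain int pause_count (Xen keeps an
atomic_t and reads it with atomic_read()), the altp2m_active field, and
the MAX_ALTP2M value are assumptions made for illustration. The function
name follows the rename suggested earlier in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ALTP2M 10 /* assumed value, illustration only */

/* Simplified stand-ins for the Xen structures (hypothetical layout). */
struct p2m_domain {
    bool access_required;
};

struct domain {
    int pause_count;           /* Xen: atomic_t, read with atomic_read() */
    bool altp2m_active;        /* stand-in for altp2m_active(d) */
    struct p2m_domain hostp2m; /* stand-in for p2m_get_hostp2m(d) */
    struct p2m_domain *altp2m_p2m[MAX_ALTP2M];
};

void arch_p2m_set_access_required(struct domain *d, bool access_required)
{
    unsigned int i;

    /* Catch callers that forgot to pause the domain first. */
    assert(d->pause_count > 0);

    d->hostp2m.access_required = access_required;

    if ( !d->altp2m_active )
        return;

    /* Propagate the value to every altp2m that has been set up. */
    for ( i = 0; i < MAX_ALTP2M; i++ )
    {
        struct p2m_domain *p2m = d->altp2m_p2m[i];

        if ( p2m )
            p2m->access_required = access_required;
    }
}
```

As with ASSERT() in Xen, the assert() here is compiled out of non-debug
builds (NDEBUG), so it documents and checks the calling convention rather
than enforcing it in production.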

Thread overview: 14+ messages
2018-06-13  8:52 [PATCH V2 1/2] xen/altp2m: set access_required properly for all altp2ms Razvan Cojocaru
2018-06-13  8:52 ` [PATCH V2 2/2] x86/altp2m: Fixed domain crash with INVALID_ALTP2M EPTP index Razvan Cojocaru
2018-06-22 15:28   ` Jan Beulich
2018-06-22 16:55     ` Razvan Cojocaru
2018-06-25 12:12       ` Razvan Cojocaru
2018-06-25 12:28         ` Jan Beulich
2018-06-25 12:32           ` Razvan Cojocaru
2018-06-25 12:40             ` Razvan Cojocaru
2018-06-25 12:54               ` Jan Beulich
2018-06-25 12:59                 ` Razvan Cojocaru
2018-06-25 13:11                   ` Jan Beulich
2018-06-27 14:09 ` [PATCH V2 1/2] xen/altp2m: set access_required properly for all altp2ms Wei Liu
2018-06-28  7:08   ` Razvan Cojocaru
2018-06-28  7:15     ` Wei Liu
