* VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()
From: Razvan Cojocaru @ 2016-08-08 10:31 UTC
  To: xen-devel; +Cc: Andrew Cooper, Tamas K Lengyel

Hello,

We've been mostly setting registers by using xc_vcpu_setcontext():

https://github.com/razvan-cojocaru/libbdvmi/blob/master/src/bdvmixendriver.cpp#L504

but lately we've been trying to push as much of that as possible into
the VM_EVENT_FLAG_SET_REGISTERS-related code (i.e. via the vm_event
replies) to save processing time.
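
For illustration, the reply side looks roughly like this (a sketch
against the public vm_event ABI; new_rip stands in for whatever value
we want execution to resume at):

    vm_event_response_t rsp = {
        .version = VM_EVENT_INTERFACE_VERSION,
        .vcpu_id = req.vcpu_id,
        .reason  = req.reason,
        /* VCPU_PAUSED: unpause the VCPU; SET_REGISTERS: apply regs. */
        .flags   = VM_EVENT_FLAG_VCPU_PAUSED | VM_EVENT_FLAG_SET_REGISTERS,
    };

    rsp.data.regs.x86 = req.data.regs.x86;  /* start from the event state */
    rsp.data.regs.x86.rip = new_rip;        /* change only what we need */

    /* ... place rsp on the ring and kick the event channel as usual */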

With the xc_vcpu_setcontext() call removed, I've found that there are
cases where vm_event_set_registers() won't work properly unless I keep
the xc_vcpu_getcontext() call. This appears to be caused not by
anything that arch_get_info_guest() does (please see the implementation
of XEN_DOMCTL_getvcpucontext), but by the vcpu_pause() call, or more
specifically, by its calling sync_vcpu_execstate().

So it would appear that a sync_vcpu_execstate(v) call is necessary at
the start of vm_event_set_registers(), so that the vcpu struct is
synchronized with the actual VCPU state.
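
Concretely, the patch would be something like this (a sketch; the
existing register assignments in xen/arch/x86/vm_event.c are elided):

    void vm_event_set_registers(struct vcpu *v, vm_event_response_t *rsp)
    {
        /*
         * Flush v's register state, possibly still live on its pCPU
         * due to lazy context switching, into v->arch.user_regs -
         * otherwise the values written below may be clobbered when
         * that state is finally saved.
         */
        sync_vcpu_execstate(v);

        v->arch.user_regs.eax = rsp->data.regs.x86.rax;
        /* ... rbx/rcx/.../rsp assignments unchanged ... */
        v->arch.user_regs.eip = rsp->data.regs.x86.rip;
    }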

Any objections to a patch with this simple modification?


Thanks,
Razvan


* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()
From: Andrew Cooper @ 2016-08-08 12:01 UTC
  To: Razvan Cojocaru, xen-devel; +Cc: Tamas K Lengyel

On 08/08/16 11:31, Razvan Cojocaru wrote:
> Hello,
>
> We've been mostly setting registers by using xc_vcpu_setcontext():
>
> https://github.com/razvan-cojocaru/libbdvmi/blob/master/src/bdvmixendriver.cpp#L504
>
> but lately we've been trying to push as much of that as possible into
> the VM_EVENT_FLAG_SET_REGISTERS-related code (i.e. via the vm_event
> replies) to save processing time.
>
> With the xc_vcpu_setcontext() call removed, I've found that there are
> cases where vm_event_set_registers() won't work properly unless I keep
> the xc_vcpu_getcontext() call. This appears to be caused not by
> anything that arch_get_info_guest() does (please see the implementation
> of XEN_DOMCTL_getvcpucontext), but by the vcpu_pause() call, or more
> specifically, by its calling sync_vcpu_execstate().
>
> So it would appear that a sync_vcpu_execstate(v) call is necessary at
> the start of vm_event_set_registers(), so that the vcpu struct is
> synchronized with the actual VCPU state.
>
> Any objections to a patch with this simple modification?

It would be helpful to identify exactly what is currently going wrong,
and why sync_vcpu_execstate(v) helps.

~Andrew


* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()
From: Razvan Cojocaru @ 2016-08-08 12:47 UTC
  To: Andrew Cooper, xen-devel; +Cc: Tamas K Lengyel

On 08/08/2016 03:01 PM, Andrew Cooper wrote:
> On 08/08/16 11:31, Razvan Cojocaru wrote:
>> [snip]
> 
> It would be helpful to identify exactly what is currently going wrong,
> and why sync_vcpu_execstate(v) helps.

I've placed a

    printk("EIP: 0x%016lx, 0x%016lx\n", v->arch.user_regs.eip,
           rsp->data.regs.x86.rip);

at the top of vm_event_set_registers(), so that I can see the old value
(v->arch.user_regs.eip) vs. the new value (rsp->data.regs.x86.rip).

I'm also logging these in my application; the difference there is that
the old value is the one that came with the vm_event request, which was
obtained from guest_cpu_user_regs()->eip (whereas
vm_event_set_registers() writes v->arch.user_regs.eip, since v != current).

Here's a short test run, with xl dmesg:

(XEN) [  395.739543] EIP: 0xfffff80001be5957, 0xfffff80001be595b
(XEN) [  409.795311] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  416.675023] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  421.475567] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  428.275125] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  435.507009] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  441.318224] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  445.514807] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  452.539190] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  454.762810] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  459.538651] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  462.027222] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  481.770470] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  483.298493] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  486.522344] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  491.042325] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  494.874468] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  497.450765] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  500.562738] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  509.179754] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  510.826236] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  512.106206] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  518.658092] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  520.450156] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  521.882088] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  523.250092] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  524.577987] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  525.962042] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  527.353942] EIP: 0xfffff80001be4812, 0xfffff80001be4812
(XEN) [  528.714089] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  530.065994] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  531.357762] EIP: 0xfffff80001be4812, 0xfffff80001be4812
(XEN) [  532.594016] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  533.849886] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  535.145879] EIP: 0xfffff80001be4812, 0xfffff80001be4812
(XEN) [  536.546846] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  537.905756] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [  539.209454] EIP: 0xfffff80001be4812, 0xfffff80001be4812

and the corresponding part in the userspace application's log:

GET EIP: 0xfffff80001be5957 SET EIP: 0xfffff80001be595b
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812

As you can see, they're generally different (and correct in the
application log), but in some hypervisor entries (for example the last
one) the old value already equals the new one, even though the request
was sent with EIP 0xfffff80001be480f. So it would appear that
guest_cpu_user_regs()->eip != v->arch.user_regs.eip at that point (and
likely even more state than that differs there).
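
To spell out the two sides of the mismatch (a sketch, not verbatim
code): the request is filled from the live per-pCPU registers, while
the reply path writes the copy cached in the vcpu struct:

    /* Request side - runs on the faulting VCPU itself (v == current),
     * so guest_cpu_user_regs() is the live, authoritative state: */
    req->data.regs.x86.rip = guest_cpu_user_regs()->eip;

    /* Reply side - runs in hypercall context where v != current, so it
     * writes the cached copy, which lazy context switching may not
     * have refreshed since the last full context switch: */
    v->arch.user_regs.eip = rsp->data.regs.x86.rip;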

These are EPT fault events, all of them, and I just reply with
VM_EVENT_FLAG_SET_REGISTERS to them here, and nothing else.


Thanks,
Razvan


* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()
From: Razvan Cojocaru @ 2016-08-08 15:10 UTC
  To: Andrew Cooper, xen-devel; +Cc: Tamas K Lengyel

On 08/08/2016 03:47 PM, Razvan Cojocaru wrote:
> On 08/08/2016 03:01 PM, Andrew Cooper wrote:
>> On 08/08/16 11:31, Razvan Cojocaru wrote:
>>> [snip]
>>
>> It would be helpful to identify exactly what is currently going wrong,
>> and why sync_vcpu_execstate(v) helps.
> 
> [snip]

I think the issue might be that vm_event_vcpu_pause() uses
vcpu_pause_nosync(), and it's being used everywhere we pause the VCPU in
the course of sending out a vm_event.

This creates the premise for a race condition: either something else
happens between sending out the vm_event and replying to it that causes
v->arch.user_regs to be synchronized - in which case everything works
(this was the case while I was reading the VCPU context via a domctl
that did vcpu_pause()) - or nothing does, in which case all bets are off.
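
For reference, this is the mechanism in question - sync_vcpu_execstate()
forces the lazily-kept state off the pCPU and into the vcpu struct
(roughly; quoting xen/arch/x86/domain.c from memory, so treat as a
sketch):

    void sync_vcpu_execstate(struct vcpu *v)
    {
        /* If v's register state is still live on this pCPU, save it. */
        if ( cpumask_test_cpu(smp_processor_id(), v->vcpu_dirty_cpumask) )
            sync_local_execstate();

        /* Other pCPUs save it from the flush IPI handler. */
        flush_tlb_mask(v->vcpu_dirty_cpumask);
    }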

We should therefore either drop vm_event_vcpu_pause() completely and use
vcpu_pause() everywhere, modify it to use vcpu_sleep_sync() (basically
turning it into vcpu_pause()), or only sync in vm_event_set_registers(),
as I've suggested.


Thanks,
Razvan


* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()
From: Tamas K Lengyel @ 2016-08-08 15:52 UTC
  To: Razvan Cojocaru; +Cc: Andrew Cooper, Xen-devel

On Mon, Aug 8, 2016 at 9:10 AM, Razvan Cojocaru
<rcojocaru@bitdefender.com> wrote:
>> [snip]
>
> I think the issue might be that vm_event_vcpu_pause() uses
> vcpu_pause_nosync(), and it's being used everywhere we pause the VCPU in
> the course of sending out a vm_event.
>
> This creates the premise for a race condition: either something else
> happens between sending out the vm_event and replying to it that causes
> v->arch.user_regs to be synchronized - in which case everything works
> (this was the case while I was reading the VCPU context via a domctl
> that did vcpu_pause()) - or nothing does, in which case all bets are off.
>
> We should therefore either drop vm_event_vcpu_pause() completely and use
> vcpu_pause() everywhere, modify it to use vcpu_sleep_sync() (basically
> turning it into vcpu_pause()), or only sync in vm_event_set_registers(),
> as I've suggested.
>

I would say just using vcpu_pause() would make sense as it's the least
complicated route. Is there any downside of doing the sync() in all
cases? Was the current way implemented perhaps the way it is for
performance reasons? If so, is it noticeable at all?

Tamas


* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()
From: Razvan Cojocaru @ 2016-08-08 16:29 UTC
  To: Tamas K Lengyel; +Cc: Andrew Cooper, Tim Deegan, Xen-devel

On 08/08/16 18:52, Tamas K Lengyel wrote:
>> [snip]
> I would say just using vcpu_pause() would make sense as it's the least
> complicated route. Is there any downside of doing the sync() in all
> cases? Was the current way implemented perhaps the way it is for
> performance reasons? If so, is it noticeable at all?

I was hoping you'd know why it's implemented like this. :) I think Tim
Deegan (CCd, hopefully the email address is not out of date) may have
done the original implementation?

That's why I proposed syncing only in vm_event_set_registers() - for all
the other cases we can then keep the current performance (if the
difference is significant). I think this is also the least complicated
route (at least as far as the patch goes), since it only requires one
new line of code in vm_event_set_registers(), whereas using vcpu_pause()
everywhere requires removing vm_event_vcpu_pause() and replacing every
call to it.


Thanks,
Razvan


* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()
From: Tamas K Lengyel @ 2016-08-08 18:01 UTC
  To: Razvan Cojocaru; +Cc: Andrew Cooper, Tim Deegan, Xen-devel

On Mon, Aug 8, 2016 at 10:29 AM, Razvan Cojocaru
<rcojocaru@bitdefender.com> wrote:
> On 08/08/16 18:52, Tamas K Lengyel wrote:
>>> [snip]
>> I would say just using vcpu_pause() would make sense as it's the least
>> complicated route. Is there any downside of doing the sync() in all
>> cases? Was the current way implemented perhaps the way it is for
>> performance reasons? If so, is it noticeable at all?
>
> I was hoping you'd know why it's implemented like this. :) I think Tim
> Deegan (CCd, hopefully the email address is not out of date) may have
> done the original implementation?
>
> That's why I proposed syncing only in vm_event_set_registers() - for all
> the other cases we can then keep the current performance (if the
> difference is significant). I think this is also the least complicated
> route (at least as far as the patch goes), since it only requires one
> new line of code in vm_event_set_registers(), whereas using vcpu_pause()
> everywhere requires removing vm_event_vcpu_pause() and replacing every
> call to it.
>

Using _nosync() predates my touching this code by a couple of years;
according to git blame, it goes all the way back to when mem_access was
introduced:

commit fbbedcae8c0c5374f8c0a869f49784b37baf04bb
Joe Epstein <jepstein98@gmail.com>
Date:   Fri Jan 7 11:54:40 2011 +0000

    mem_access: mem event additions for access

My only concern with syncing only on the set-registers route is that we
would keep a potentially buggy interface that we might run into again in
the future. IMHO just shifting all cases to sync() is safer, provided
the performance difference is negligible.

Tamas


* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()
From: Razvan Cojocaru @ 2016-08-09  8:19 UTC
  To: Tamas K Lengyel; +Cc: Andrew Cooper, Tim Deegan, Xen-devel

On 08/08/2016 09:01 PM, Tamas K Lengyel wrote:
> On Mon, Aug 8, 2016 at 10:29 AM, Razvan Cojocaru
> <rcojocaru@bitdefender.com> wrote:
>> On 08/08/16 18:52, Tamas K Lengyel wrote:
>>>> [snip]
>>> I would say just using vcpu_pause() would make sense as it's the least
>>> complicated route. Is there any downside of doing the sync() in all
>>> cases? Was the current way implemented perhaps the way it is for
>>> performance reasons? If so, is it noticeable at all?
>>
>> [snip]
> 
> Using _nosync() predates my touching this code by a couple of years;
> according to git blame, it goes all the way back to when mem_access was
> introduced:
> 
> commit fbbedcae8c0c5374f8c0a869f49784b37baf04bb
> Joe Epstein <jepstein98@gmail.com>
> Date:   Fri Jan 7 11:54:40 2011 +0000
> 
>     mem_access: mem event additions for access
> 
> My only concern with syncing only on the set-registers route is that we
> would keep a potentially buggy interface that we might run into again in
> the future. IMHO just shifting all cases to sync() is safer, provided
> the performance difference is negligible.

Actually looking at the code again, this is vcpu_pause():

void vcpu_pause(struct vcpu *v)
{
    ASSERT(v != current);
    atomic_inc(&v->pause_count);
    vcpu_sleep_sync(v);
}

void vcpu_pause_nosync(struct vcpu *v)
{
    atomic_inc(&v->pause_count);
    vcpu_sleep_nosync(v);
}

and this is vm_event_vcpu_pause():

void vm_event_vcpu_pause(struct vcpu *v)
{
    ASSERT(v == current);

    atomic_inc(&v->vm_event_pause_count);
    vcpu_pause_nosync(v);
}

If we want to preserve the vm_event-specific reference counter
(v->vm_event_pause_count), we'd have to call vcpu_pause() instead of
vcpu_pause_nosync(). But vcpu_pause() wants to be used on a VCPU !=
current (see the ASSERT()), which is likely why vcpu_pause_nosync() was
chosen here.

I'll try removing the ASSERT() and see if anything explodes, but it
increasingly looks like the smallest change is the one I initially
proposed.


Thanks,
Razvan


* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()
From: Razvan Cojocaru @ 2016-08-09  9:01 UTC
  To: Tamas K Lengyel; +Cc: Andrew Cooper, Xen-devel

On 08/09/2016 11:19 AM, Razvan Cojocaru wrote:
> On 08/08/2016 09:01 PM, Tamas K Lengyel wrote:
>> [snip]
> 
> Actually looking at the code again, this is vcpu_pause():
>
> void vcpu_pause(struct vcpu *v)
> {
>     ASSERT(v != current);
>     atomic_inc(&v->pause_count);
>     vcpu_sleep_sync(v);
> }
>
> void vcpu_pause_nosync(struct vcpu *v)
> {
>     atomic_inc(&v->pause_count);
>     vcpu_sleep_nosync(v);
> }
>
> and this is vm_event_vcpu_pause():
>
> void vm_event_vcpu_pause(struct vcpu *v)
> {
>     ASSERT(v == current);
>
>     atomic_inc(&v->vm_event_pause_count);
>     vcpu_pause_nosync(v);
> }
>
> If we want to preserve the vm_event-specific reference counter
> (v->vm_event_pause_count), we'd have to call vcpu_pause() instead of
> vcpu_pause_nosync(). But vcpu_pause() wants to be used on a VCPU !=
> current (see the ASSERT()), which is likely why vcpu_pause_nosync() was
> chosen here.
>
> I'll try removing the ASSERT() and see if anything explodes, but it
> increasingly looks like the smallest change is the one I initially
> proposed.

Predictably, it did explode: http://pastebin.com/ruMKD2f0

However, your concern is valid, and I think we can address it by doing a
sync_vcpu_execstate() as early as possible in vm_event_resume() - that
way all the custom handlers that run afterwards are guaranteed to see
the synced VCPU state.
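
Roughly like this (just a sketch of the placement; the shape of the
response loop is approximate):

    void vm_event_resume(struct domain *d, struct vm_event_domain *ved)
    {
        vm_event_response_t rsp;

        while ( vm_event_get_response(d, ved, &rsp) )
        {
            struct vcpu *v;

            if ( rsp.vcpu_id >= d->max_vcpus )
                continue;
            v = d->vcpu[rsp.vcpu_id];

            /*
             * Flush any state still live on v's pCPU into
             * v->arch.user_regs before the handlers below touch it.
             */
            sync_vcpu_execstate(v);

            /* ... existing response handling (set_registers etc.) ... */
        }
    }

I'll hack up a patch and send it in ASAP.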


Thanks,
Razvan


* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()
From: Tim Deegan @ 2016-08-09  9:41 UTC
  To: Razvan Cojocaru; +Cc: Andrew Cooper, Tamas K Lengyel, Xen-devel

At 19:29 +0300 on 08 Aug (1470684579), Razvan Cojocaru wrote:
>> [snip]
> 
> I was hoping you'd know why it's implemented like this. :) I think Tim
> Deegan (CCd, hopefully the email address is not out of date) may have
> done the original implementation?

Wasn't me!  I'm missing some context here, but it looks like
vm_event_vcpu_pause() is always called on current, which means the
vcpu:
 - is "running", i.e. scheduled on this pcpu, so vcpu_pause()
   would deadlock in vcpu_sleep_sync() (well, it would ASSERT first).
 - is not in guest mode, so any VMCx state should have been synced
   onto the local stack at the last vmexit.

If your vm_event response hypercall needs access to remote vcpu state,
then you should call vcpu_pause() on _that_ path, and
vcpu_unpause() when you're done.  If you can _guarantee_ (even with
buggy/malicious tools) that the target vcpu is already paused and
nothing can unpause it underfoot, then just calling vcpu_sleep_sync()
before accessing the state is enough.

The *_sync() functions are dangerous - you can't ever let a domain
call them on its own vcpus, or have two domains that can call them on
each other, without some other interlock to stop two vcpus deadlocking
while waiting for each other to be descheduled.
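
To make the deadlock concrete: vcpu_sleep_sync() spins waiting for the
vcpu to come off its pcpu, roughly like this (a sketch from memory of
xen/common/schedule.c):

    void vcpu_sleep_sync(struct vcpu *v)
    {
        vcpu_sleep_nosync(v);

        /* If v == current, we *are* the thing that must be descheduled
         * for this loop to terminate - so it never does. */
        while ( !vcpu_runnable(v) && v->is_running )
            cpu_relax();

        sync_vcpu_execstate(v);
    }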

Cheers,

Tim.


* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()
From: Razvan Cojocaru @ 2016-08-09  9:46 UTC
  To: Tim Deegan; +Cc: Andrew Cooper, Tamas K Lengyel, Xen-devel

On 08/09/2016 12:41 PM, Tim Deegan wrote:
> [snip]

Thanks for the reply! Indeed, all of those things hit me on the way to
the patch I submitted a couple of minutes ago. :)


Thanks again,
Razvan


* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()
From: Tamas K Lengyel @ 2016-08-09 17:19 UTC
  To: Tim Deegan; +Cc: Andrew Cooper, Razvan Cojocaru, Xen-devel

On Tue, Aug 9, 2016 at 3:41 AM, Tim Deegan <tim@xen.org> wrote:
> [snip]
>

Hi Tim,
thanks for clarifying this for us! =)

Cheers,
Tamas
