* VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()
@ 2016-08-08 10:31 UTC
From: Razvan Cojocaru
To: xen-devel; +Cc: Andrew Cooper, Tamas K Lengyel

Hello,

We've mostly been setting registers using xc_vcpu_setcontext():

https://github.com/razvan-cojocaru/libbdvmi/blob/master/src/bdvmixendriver.cpp#L504

but lately we have been trying to push as much of that as possible into
the VM_EVENT_FLAG_SET_REGISTERS-related code (i.e. via the vm_event
replies) to save processing time.

With the xc_vcpu_setcontext() call removed, I've found that there are
cases where vm_event_set_registers() won't work properly unless I keep
the xc_vcpu_getcontext() call. This appears to be caused not by anything
that arch_get_info_guest() does (please see the implementation of
XEN_DOMCTL_getvcpucontext), but by the vcpu_pause() call, or more
specifically by its calling sync_vcpu_execstate().

So it would appear that a sync_vcpu_execstate(v) call is necessary at
the start of vm_event_set_registers() for the vcpu struct instance to be
synchronized with the current VCPU state.

Any objections to a patch with this simple modification?

Thanks,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()
@ 2016-08-08 12:01 UTC
From: Andrew Cooper
To: Razvan Cojocaru, xen-devel; +Cc: Tamas K Lengyel

On 08/08/16 11:31, Razvan Cojocaru wrote:
> [...]
> So it would appear that a sync_vcpu_execstate(v) call is necessary at
> the start of vm_event_set_registers() for the vcpu struct instance to
> be synchronized with the current VCPU state.
>
> Any objections to a patch with this simple modification?

It would be helpful to identify exactly what is currently going wrong,
and why sync_vcpu_execstate(v) helps.

~Andrew
* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()
@ 2016-08-08 12:47 UTC
From: Razvan Cojocaru
To: Andrew Cooper, xen-devel; +Cc: Tamas K Lengyel

On 08/08/2016 03:01 PM, Andrew Cooper wrote:
> [...]
> It would be helpful to identify exactly what is currently going wrong,
> and why sync_vcpu_execstate(v) helps.

I've placed a

printk("EIP: 0x%016lx, 0x%016lx\n", v->arch.user_regs.eip,
       rsp->data.regs.x86.rip);

at the top of vm_event_set_registers(), so I could see what the old
value is (v->arch.user_regs.eip) vs. the new value
(rsp->data.regs.x86.rip).

I'm also logging these in my application; the difference there is that
the old value is the value that came with the vm_event request, which
has been obtained with guest_cpu_user_regs()->eip (whereas in
vm_event_set_registers() we set v->arch.user_regs.eip, since
v != current).

Here's a short test run, with xl dmesg:

(XEN) [ 395.739543] EIP: 0xfffff80001be5957, 0xfffff80001be595b
(XEN) [ 409.795311] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 416.675023] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 421.475567] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 428.275125] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 435.507009] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 441.318224] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 445.514807] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 452.539190] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 454.762810] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 459.538651] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 462.027222] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 481.770470] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 483.298493] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 486.522344] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 491.042325] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 494.874468] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 497.450765] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 500.562738] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 509.179754] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 510.826236] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 512.106206] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 518.658092] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 520.450156] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 521.882088] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 523.250092] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 524.577987] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 525.962042] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 527.353942] EIP: 0xfffff80001be4812, 0xfffff80001be4812
(XEN) [ 528.714089] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 530.065994] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 531.357762] EIP: 0xfffff80001be4812, 0xfffff80001be4812
(XEN) [ 532.594016] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 533.849886] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 535.145879] EIP: 0xfffff80001be4812, 0xfffff80001be4812
(XEN) [ 536.546846] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 537.905756] EIP: 0xfffff80001be480f, 0xfffff80001be4812
(XEN) [ 539.209454] EIP: 0xfffff80001be4812, 0xfffff80001be4812

and the corresponding part in the userspace application's log:

GET EIP: 0xfffff80001be5957 SET EIP: 0xfffff80001be595b
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812
GET EIP: 0xfffff80001be480f SET EIP: 0xfffff80001be4812

As you can see, they're different (and correct in the application log),
but (for example for the last one) they both already have the same value
in the hypervisor. So it would appear that guest_cpu_user_regs()->eip !=
v->arch.user_regs.eip at that point (and likely even more state than
that differs there).

These are EPT fault events, all of them, and I just reply with
VM_EVENT_FLAG_SET_REGISTERS to them here, and nothing else.

Thanks,
Razvan
* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()
@ 2016-08-08 15:10 UTC
From: Razvan Cojocaru
To: Andrew Cooper, xen-devel; +Cc: Tamas K Lengyel

On 08/08/2016 03:47 PM, Razvan Cojocaru wrote:
> [...]
> As you can see, they're different (and correct in the application log),
> but (for example for the last one) they both already have the same
> value in the hypervisor. So it would appear that
> guest_cpu_user_regs()->eip != v->arch.user_regs.eip at that point (and
> likely even more state than that differs there).
>
> These are EPT fault events, all of them, and I just reply with
> VM_EVENT_FLAG_SET_REGISTERS to them here, and nothing else.

I think the issue might be that vm_event_vcpu_pause() uses
vcpu_pause_nosync(), and it's being used everywhere we pause the VCPU in
the course of sending out a vm_event.

So this creates the premise for a race condition: either some more
things happen between sending out the vm_event and replying to it that
cause v->arch.user_regs to be synchronized, in which case everything
works (this has been the case when I was reading the VCPU context via a
domctl that did vcpu_pause()), or not, in which case all bets are off.

In this case, we should either drop vm_event_vcpu_pause() completely and
just use vcpu_pause() everywhere, modify it to use vcpu_sleep_sync()
(and basically turn it into vcpu_pause()), or only sync in
vm_event_set_registers() as I've suggested.

Thanks,
Razvan
* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()
@ 2016-08-08 15:52 UTC
From: Tamas K Lengyel
To: Razvan Cojocaru; +Cc: Andrew Cooper, Xen-devel

On Mon, Aug 8, 2016 at 9:10 AM, Razvan Cojocaru
<rcojocaru@bitdefender.com> wrote:
> [...]
> In this case, we should either drop vm_event_vcpu_pause() completely
> and just use vcpu_pause() everywhere, modify it to use
> vcpu_sleep_sync() (and basically turn it into vcpu_pause()), or only
> sync in vm_event_set_registers() as I've suggested.

I would say just using vcpu_pause() would make sense, as it's the least
complicated route. Is there any downside to doing the sync() in all
cases? Was the current way implemented the way it is for performance
reasons? If so, is the difference noticeable at all?

Tamas
* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()
@ 2016-08-08 16:29 UTC
From: Razvan Cojocaru
To: Tamas K Lengyel; +Cc: Andrew Cooper, Tim Deegan, Xen-devel

On 08/08/16 18:52, Tamas K Lengyel wrote:
> [...]
> I would say just using vcpu_pause() would make sense as it's the least
> complicated route. Is there any downside of doing the sync() in all
> cases? Was the current way implemented perhaps the way it is for
> performance reasons? If so, is it noticeable at all?

I was hoping you'd know why it's implemented like this :). I think maybe
Tim Deegan (CCd, hopefully the email address is not out of date) did the
original implementation?

That's why I proposed to only sync in vm_event_set_registers(): for all
the other cases we can then keep the current performance (if
significant). The least complicated route (at least as far as the patch
changes go) I think is also this one, since it only requires one new
line of code in vm_event_set_registers(); using vcpu_pause() everywhere
else requires removing vm_event_vcpu_pause() as well as replacing the
call everywhere else.

Thanks,
Razvan
* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate() 2016-08-08 16:29 ` Razvan Cojocaru @ 2016-08-08 18:01 ` Tamas K Lengyel 2016-08-09 8:19 ` Razvan Cojocaru 2016-08-09 9:41 ` Tim Deegan 1 sibling, 1 reply; 12+ messages in thread From: Tamas K Lengyel @ 2016-08-08 18:01 UTC (permalink / raw) To: Razvan Cojocaru; +Cc: Andrew Cooper, Tim Deegan, Xen-devel On Mon, Aug 8, 2016 at 10:29 AM, Razvan Cojocaru <rcojocaru@bitdefender.com> wrote: > On 08/08/16 18:52, Tamas K Lengyel wrote: >>> I think the issue might be that vm_event_vcpu_pause() uses >>> > vcpu_pause_nosync(), and it's being used everywhere we pause the VCPU in >>> > the course of sending out a vm_event. >>> > >>> > So this creates the premise for a race condition: either some more >>> > things happen between sending out the vm_event and replying to it that >>> > cause v->arch.user_regs to be synchronized - in which case everything >>> > works (this has been the case when I was reading the VCPU context via a >>> > domctl that did vcpu_pause()) -, or not, in which case all bets are off. >>> > >>> > In this case, we should either drop vm_event_vcpu_pause() completely and >>> > just use vcpu_pause() everywhere, modify it to use vcpu_sleep_sync() >>> > (and basically turn it into vcpu_pause()), or only sync in >>> > vm_event_set_registers() as I've suggested. >>> > >> I would say just using vcpu_pause() would make sense as it's the least >> complicated route. Is there any downside of doing the sync() in all >> cases? Was the current way implemented perhaps the way it is for >> performance reasons? If so, is it noticeable at all? > > I was hoping you'd know why it's implemented like this :), I think maybe > Tim Deegan (CCd, hopefully the email address is not out of date) did the > original implementation? > > That's why I proposed to only sync in vm_event_set_registers() - for all > the other cases we can then keep the current performance (if > significant). 
The least complicated route (at least as far as the patch > changes go) I think is also this one, since it only requires a new line > of code in vm_event_set_registers() - using vcpu_pause() everywhere else > requires removing vm_event_vcpu_pause() as well as replacing the call > everywhere else. > Using _nosync() predates me touching this code by a couple years, according to git blame it goes all the way back to when mem_access was introduced: commit fbbedcae8c0c5374f8c0a869f49784b37baf04bb Joe Epstein <jepstein98@gmail.com> Date: Fri Jan 7 11:54:40 2011 +0000 mem_access: mem event additions for access My only concern with changing to sync only in the set registers route is that we potentially keep a buggy interface where we might run into this problem in the future. IMHO just shifting all cases to do sync() is safer, provided the performance difference is unnoticeable. Tamas _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate() 2016-08-08 18:01 ` Tamas K Lengyel @ 2016-08-09 8:19 ` Razvan Cojocaru 2016-08-09 9:01 ` Razvan Cojocaru 0 siblings, 1 reply; 12+ messages in thread From: Razvan Cojocaru @ 2016-08-09 8:19 UTC (permalink / raw) To: Tamas K Lengyel; +Cc: Andrew Cooper, Tim Deegan, Xen-devel On 08/08/2016 09:01 PM, Tamas K Lengyel wrote: > On Mon, Aug 8, 2016 at 10:29 AM, Razvan Cojocaru > <rcojocaru@bitdefender.com> wrote: >> On 08/08/16 18:52, Tamas K Lengyel wrote: >>>> I think the issue might be that vm_event_vcpu_pause() uses >>>>> vcpu_pause_nosync(), and it's being used everywhere we pause the VCPU in >>>>> the course of sending out a vm_event. >>>>> >>>>> So this creates the premise for a race condition: either some more >>>>> things happen between sending out the vm_event and replying to it that >>>>> cause v->arch.user_regs to be synchronized - in which case everything >>>>> works (this has been the case when I was reading the VCPU context via a >>>>> domctl that did vcpu_pause()) -, or not, in which case all bets are off. >>>>> >>>>> In this case, we should either drop vm_event_vcpu_pause() completely and >>>>> just use vcpu_pause() everywhere, modify it to use vcpu_sleep_sync() >>>>> (and basically turn it into vcpu_pause()), or only sync in >>>>> vm_event_set_registers() as I've suggested. >>>>> >>> I would say just using vcpu_pause() would make sense as it's the least >>> complicated route. Is there any downside of doing the sync() in all >>> cases? Was the current way implemented perhaps the way it is for >>> performance reasons? If so, is it noticeable at all? >> >> I was hoping you'd know why it's implemented like this :), I think maybe >> Tim Deegan (CCd, hopefully the email address is not out of date) did the >> original implementation? >> >> That's why I proposed to only sync in vm_event_set_registers() - for all >> the other cases we can then keep the current performance (if >> significant).
The least complicated route (at least as far as the patch >> changes go) I think is also this one, since it only requires a new line >> of code in vm_event_set_registers() - using vcpu_pause() everywhere else >> requires removing vm_event_vcpu_pause() as well as replacing the call >> everywhere else. >> > > Using _nosync() predates me touching this code by a couple years, > according to git blame it goes all the way back to when mem_access was > introduced: > > commit fbbedcae8c0c5374f8c0a869f49784b37baf04bb > Joe Epstein <jepstein98@gmail.com> > Date: Fri Jan 7 11:54:40 2011 +0000 > > mem_access: mem event additions for access > > My only concern with changing to sync only in the set registers route > is that we potentially keep a buggy interface where we might run into > this problem in the future. IMHO just shifting all cases to do sync() > is safer, provided the performance difference is unnoticeable. Actually looking at the code again, this is vcpu_pause(): 875 void vcpu_pause(struct vcpu *v) 876 { 877 ASSERT(v != current); 878 atomic_inc(&v->pause_count); 879 vcpu_sleep_sync(v); 880 } 881 882 void vcpu_pause_nosync(struct vcpu *v) 883 { 884 atomic_inc(&v->pause_count); 885 vcpu_sleep_nosync(v); 886 } and this is vm_event_vcpu_pause(): 751 void vm_event_vcpu_pause(struct vcpu *v) 752 { 753 ASSERT(v == current); 754 755 atomic_inc(&v->vm_event_pause_count); 756 vcpu_pause_nosync(v); 757 } If we want to preserve the vm_event-specific reference counter (v->vm_event_pause_count) we'd use vcpu_pause() instead of vcpu_pause_nosync(). But vcpu_pause() wants to be used on a VCPU != current (see the ASSERT()). This is possibly why vcpu_pause_nosync() has been chosen over vcpu_pause(). I'll try to remove the ASSERT() and see if anything explodes, but it's looking increasingly like the smallest change is the one I've initially proposed.
Thanks, Razvan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate() 2016-08-09 8:19 ` Razvan Cojocaru @ 2016-08-09 9:01 ` Razvan Cojocaru 0 siblings, 0 replies; 12+ messages in thread From: Razvan Cojocaru @ 2016-08-09 9:01 UTC (permalink / raw) To: Tamas K Lengyel; +Cc: Andrew Cooper, Xen-devel On 08/09/2016 11:19 AM, Razvan Cojocaru wrote: > On 08/08/2016 09:01 PM, Tamas K Lengyel wrote: >> On Mon, Aug 8, 2016 at 10:29 AM, Razvan Cojocaru >> <rcojocaru@bitdefender.com> wrote: >>> On 08/08/16 18:52, Tamas K Lengyel wrote: >>>>> I think the issue might be that vm_event_vcpu_pause() uses >>>>>> vcpu_pause_nosync(), and it's being used everywhere we pause the VCPU in >>>>>> the course of sending out a vm_event. >>>>>> >>>>>> So this creates the premise for a race condition: either some more >>>>>> things happen between sending out the vm_event and replying to it that >>>>>> cause v->arch.user_regs to be synchronized - in which case everything >>>>>> works (this has been the case when I was reading the VCPU context via a >>>>>> domctl that did vcpu_pause()) -, or not, in which case all bets are off. >>>>>> >>>>>> In this case, we should either drop vm_event_vcpu_pause() completely and >>>>>> just use vcpu_pause() everywhere, modify it to use vcpu_sleep_sync() >>>>>> (and basically turn it into vcpu_pause()), or only sync in >>>>>> vm_event_set_registers() as I've suggested. >>>>>> >>>> I would say just using vcpu_pause() would make sense as it's the least >>>> complicated route. Is there any downside of doing the sync() in all >>>> cases? Was the current way implemented perhaps the way it is for >>>> performance reasons? If so, is it noticeable at all? >>> >>> I was hoping you'd know why it's implemented like this :), I think maybe >>> Tim Deegan (CCd, hopefully the email address is not out of date) did the >>> original implementation?
>>> >>> That's why I proposed to only sync in vm_event_set_registers() - for all >>> the other cases we can then keep the current performance (if >>> significant). The least complicated route (at least as far as the patch >>> changes go) I think is also this one, since it only requires a new line >>> of code in vm_event_set_registers() - using vcpu_pause() everywhere else >>> requires removing vm_event_vcpu_pause() as well as replacing the call >>> everywhere else. >>> >> >> Using _nosync() predates me touching this code by a couple years, >> according to git blame it goes all the way back to when mem_access was >> introduced: >> >> commit fbbedcae8c0c5374f8c0a869f49784b37baf04bb >> Joe Epstein <jepstein98@gmail.com> >> Date: Fri Jan 7 11:54:40 2011 +0000 >> >> mem_access: mem event additions for access >> >> My only concern with changing to sync only in the set registers route >> is that we potentially keep a buggy interface where we might run into >> this problem in the future. IMHO just shifting all cases to do sync() >> is safer, provided the performance difference is unnoticeable. > > Actually looking at the code again, this is vcpu_pause(): > > 875 void vcpu_pause(struct vcpu *v) > 876 { > 877 ASSERT(v != current); > 878 atomic_inc(&v->pause_count); > 879 vcpu_sleep_sync(v); > 880 } > 881 > 882 void vcpu_pause_nosync(struct vcpu *v) > 883 { > 884 atomic_inc(&v->pause_count); > 885 vcpu_sleep_nosync(v); > 886 } > > and this is vm_event_vcpu_pause(): > > 751 void vm_event_vcpu_pause(struct vcpu *v) > 752 { > 753 ASSERT(v == current); > 754 > 755 atomic_inc(&v->vm_event_pause_count); > 756 vcpu_pause_nosync(v); > 757 } > > If we want to preserve the vm_event-specific reference counter > (v->vm_event_pause_count) we'd use vcpu_pause() instead of > vcpu_pause_nosync(). But vcpu_pause() wants to be used on a VCPU != > current (see the ASSERT()). This is possibly why vcpu_pause_nosync() has > been chosen over vcpu_pause().
> > I'll try to remove the ASSERT() and see if anything explodes, but it's > looking increasingly like the smallest change is the one I've initially > proposed. Predictably, it did explode: http://pastebin.com/ruMKD2f0 However, your concern is valid, and I think we can address it by doing a sync_vcpu_execstate() as soon as possible in vm_event_resume() - that way we'll make sure that all the custom handlers running afterwards will see the synced VCPU state. I'll hack a patch and send it in ASAP. Thanks, Razvan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 12+ messages in thread
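[The change Razvan announces here, syncing as early as possible in vm_event_resume(), might look roughly like the sketch below. This is a guess at the shape of the patch, not the committed code: vm_event_get_response() and the loop structure are assumptions about the surrounding code, and the elided parts are marked with `...`.]

```c
/* Hypothetical sketch only; the real signature and ring handling
 * are elided. */
void vm_event_resume(struct domain *d, struct vm_event_domain *ved)
{
    vm_event_response_t rsp;

    while ( vm_event_get_response(d, ved, &rsp) )
    {
        struct vcpu *v = d->vcpu[rsp.vcpu_id];

        /* Sync the scheduler's lazy state into v->arch.user_regs
         * before any handler (e.g. vm_event_set_registers()) runs. */
        sync_vcpu_execstate(v);

        ... /* existing response processing, unchanged */
    }
}
```

[The point of the placement is that every custom handler reached from the resume path then sees a synchronised vCPU state, addressing Tamas's concern about keeping a buggy interface.]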
* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate() 2016-08-08 16:29 ` Razvan Cojocaru 2016-08-08 18:01 ` Tamas K Lengyel @ 2016-08-09 9:41 ` Tim Deegan 2016-08-09 9:46 ` Razvan Cojocaru 2016-08-09 17:19 ` Tamas K Lengyel 1 sibling, 2 replies; 12+ messages in thread From: Tim Deegan @ 2016-08-09 9:41 UTC (permalink / raw) To: Razvan Cojocaru; +Cc: Andrew Cooper, Tamas K Lengyel, Xen-devel At 19:29 +0300 on 08 Aug (1470684579), Razvan Cojocaru wrote: > On 08/08/16 18:52, Tamas K Lengyel wrote: > >> I think the issue might be that vm_event_vcpu_pause() uses > >> > vcpu_pause_nosync(), and it's being used everywhere we pause the VCPU in > >> > the course of sending out a vm_event. > >> > > >> > So this creates the premise for a race condition: either some more > >> > things happen between sending out the vm_event and replying to it that > >> > cause v->arch.user_regs to be synchronized - in which case everything > >> > works (this has been the case when I was reading the VCPU context via a > >> > domctl that did vcpu_pause()) -, or not, in which case all bets are off. > >> > > >> > In this case, we should either drop vm_event_vcpu_pause() completely and > >> > just use vcpu_pause() everywhere, modify it to use vcpu_sleep_sync() > >> > (and basically turn it into vcpu_pause()), or only sync in > >> > vm_event_set_registers() as I've suggested. > >> > > > I would say just using vcpu_pause() would make sense as it's the least > > complicated route. Is there any downside of doing the sync() in all > > cases? Was the current way implemented perhaps the way it is for > > performance reasons? If so, is it noticeable at all? > > I was hoping you'd know why it's implemented like this :), I think maybe > Tim Deegan (CCd, hopefully the email address is not out of date) did the > original implementation? Wasn't me! I'm missing some context here but it looks like vm_event_vcpu_pause() is always called on current, which means the vcpu: - is "running", i.e.
scheduled on this pcpu, so vcpu_pause_sync() would deadlock in vcpu_sleep_sync() (well, it would ASSERT first). - is not in guest mode, so any VMCx state should have been synced onto the local stack at the last vmexit. If your vm_event response hypercall needs access to remote vcpu state, then you should call vcpu_pause_sync() on _that_ path, and vcpu_unpause() when you're done. If you can _guarantee_ (even with buggy/malicious tools) that the target vcpu is already paused and nothing can unpause it underfoot, then just calling vcpu_sleep_sync() before accessing the state is enough. The *_sync() functions are dangerous - you can't ever let a domain call them on its own vcpus, or have two domains that can call them on each other, without some other interlock to stop two vcpus deadlocking waiting for each other to be descheduled. Cheers, Tim. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate() 2016-08-09 9:41 ` Tim Deegan @ 2016-08-09 9:46 ` Razvan Cojocaru 2016-08-09 17:19 ` Tamas K Lengyel 1 sibling, 0 replies; 12+ messages in thread From: Razvan Cojocaru @ 2016-08-09 9:46 UTC (permalink / raw) To: Tim Deegan; +Cc: Andrew Cooper, Tamas K Lengyel, Xen-devel On 08/09/2016 12:41 PM, Tim Deegan wrote: > At 19:29 +0300 on 08 Aug (1470684579), Razvan Cojocaru wrote: >> On 08/08/16 18:52, Tamas K Lengyel wrote: >>>> I think the issue might be that vm_event_vcpu_pause() uses >>>>> vcpu_pause_nosync(), and it's being used everywhere we pause the VCPU in >>>>> the course of sending out a vm_event. >>>>> >>>>> So this creates the premise for a race condition: either some more >>>>> things happen between sending out the vm_event and replying to it that >>>>> cause v->arch.user_regs to be synchronized - in which case everything >>>>> works (this has been the case when I was reading the VCPU context via a >>>>> domctl that did vcpu_pause()) -, or not, in which case all bets are off. >>>>> >>>>> In this case, we should either drop vm_event_vcpu_pause() completely and >>>>> just use vcpu_pause() everywhere, modify it to use vcpu_sleep_sync() >>>>> (and basically turn it into vcpu_pause()), or only sync in >>>>> vm_event_set_registers() as I've suggested. >>>>> >>> I would say just using vcpu_pause() would make sense as it's the least >>> complicated route. Is there any downside of doing the sync() in all >>> cases? Was the current way implemented perhaps the way it is for >>> performance reasons? If so, is it noticeable at all? >> >> I was hoping you'd know why it's implemented like this :), I think maybe >> Tim Deegan (CCd, hopefully the email address is not out of date) did the >> original implementation? > > Wasn't me! I'm missing some context here but it looks like > vm_event_vcpu_pause() is always called on current, which means the > vcpu: > - is "running", i.e.
scheduled on this pcpu, so vcpu_pause_sync() > would deadlock in vcpu_sleep_sync() (well, it would ASSERT first). > - is not in guest mode, so any VMCx state should have been synced > onto the local stack at the last vmexit. > > If your vm_event response hypercall needs access to remote vcpu state, > then you should call vcpu_pause_sync() on _that_ path, and > vcpu_unpause() when you're done. If you can _guarantee_ (even with > buggy/malicious tools) that the target vcpu is already paused and > nothing can unpause it underfoot, then just calling vcpu_sleep_sync() > before accessing the state is enough. > > The *_sync() functions are dangerous - you can't ever let a domain > call them on its own vcpus, or have two domains that can call them on > each other, without some other interlock to stop two vcpus deadlocking > waiting for each other to be descheduled. Thanks for the reply! Indeed, those things have all hit me until I got to the patch I've just submitted a couple of minutes ago. :) Thanks again, Razvan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate() 2016-08-09 9:41 ` Tim Deegan 2016-08-09 9:46 ` Razvan Cojocaru @ 2016-08-09 17:19 ` Tamas K Lengyel 1 sibling, 0 replies; 12+ messages in thread From: Tamas K Lengyel @ 2016-08-09 17:19 UTC (permalink / raw) To: Tim Deegan; +Cc: Andrew Cooper, Razvan Cojocaru, Xen-devel On Tue, Aug 9, 2016 at 3:41 AM, Tim Deegan <tim@xen.org> wrote: > At 19:29 +0300 on 08 Aug (1470684579), Razvan Cojocaru wrote: >> On 08/08/16 18:52, Tamas K Lengyel wrote: >> >> I think the issue might be that vm_event_vcpu_pause() uses >> >> > vcpu_pause_nosync(), and it's being used everywhere we pause the VCPU in >> >> > the course of sending out a vm_event. >> >> > >> >> > So this creates the premise for a race condition: either some more >> >> > things happen between sending out the vm_event and replying to it that >> >> > cause v->arch.user_regs to be synchronized - in which case everything >> >> > works (this has been the case when I was reading the VCPU context via a >> >> > domctl that did vcpu_pause()) -, or not, in which case all bets are off. >> >> > >> >> > In this case, we should either drop vm_event_vcpu_pause() completely and >> >> > just use vcpu_pause() everywhere, modify it to use vcpu_sleep_sync() >> >> > (and basically turn it into vcpu_pause()), or only sync in >> >> > vm_event_set_registers() as I've suggested. >> >> > >> > I would say just using vcpu_pause() would make sense as it's the least >> > complicated route. Is there any downside of doing the sync() in all >> > cases? Was the current way implemented perhaps the way it is for >> > performance reasons? If so, is it noticeable at all? >> >> I was hoping you'd know why it's implemented like this :), I think maybe >> Tim Deegan (CCd, hopefully the email address is not out of date) did the >> original implementation? > > Wasn't me!
I'm missing some context here but it looks like > vm_event_vcpu_pause() is always called on current, which means the > vcpu: > - is "running", i.e. scheduled on this pcpu, so vcpu_pause_sync() > would deadlock in vcpu_sleep_sync() (well, it would ASSERT first). > - is not in guest mode, so any VMCx state should have been synced > onto the local stack at the last vmexit. > > If your vm_event response hypercall needs access to remote vcpu state, > then you should call vcpu_pause_sync() on _that_ path, and > vcpu_unpause() when you're done. If you can _guarantee_ (even with > buggy/malicious tools) that the target vcpu is already paused and > nothing can unpause it underfoot, then just calling vcpu_sleep_sync() > before accessing the state is enough. > > The *_sync() functions are dangerous - you can't ever let a domain > call them on its own vcpus, or have two domains that can call them on > each other, without some other interlock to stop two vcpus deadlocking > waiting for each other to be descheduled. > Hi Tim, thanks for clarifying this for us! =) Cheers, Tamas _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread -- Thread overview: 12+ messages -- 2016-08-08 10:31 VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate() Razvan Cojocaru 2016-08-08 12:01 ` Andrew Cooper 2016-08-08 12:47 ` Razvan Cojocaru 2016-08-08 15:10 ` Razvan Cojocaru 2016-08-08 15:52 ` Tamas K Lengyel 2016-08-08 16:29 ` Razvan Cojocaru 2016-08-08 18:01 ` Tamas K Lengyel 2016-08-09 8:19 ` Razvan Cojocaru 2016-08-09 9:01 ` Razvan Cojocaru 2016-08-09 9:41 ` Tim Deegan 2016-08-09 9:46 ` Razvan Cojocaru 2016-08-09 17:19 ` Tamas K Lengyel