From: Razvan Cojocaru <rcojocaru@bitdefender.com>
To: Tamas K Lengyel <tamas@tklengyel.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
Xen-devel <xen-devel@lists.xen.org>
Subject: Re: VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()
Date: Tue, 9 Aug 2016 12:01:08 +0300 [thread overview]
Message-ID: <f45adbcf-218d-34b0-ec20-a03b87255d44@bitdefender.com> (raw)
In-Reply-To: <871af997-7ef6-6c85-2823-97447f418efb@bitdefender.com>
On 08/09/2016 11:19 AM, Razvan Cojocaru wrote:
> On 08/08/2016 09:01 PM, Tamas K Lengyel wrote:
>> On Mon, Aug 8, 2016 at 10:29 AM, Razvan Cojocaru
>> <rcojocaru@bitdefender.com> wrote:
>>> On 08/08/16 18:52, Tamas K Lengyel wrote:
>>>>> I think the issue might be that vm_event_vcpu_pause() uses
>>>>>> vcpu_pause_nosync(), and it's being used everywhere we pause the VCPU in
>>>>>> the course of sending out a vm_event.
>>>>>>
>>>>>> So this creates the premise for a race condition: either some more
>>>>>> things happen between sending out the vm_event and replying to it that
>>>>>> cause v->arch.user_regs to be synchronized - in which case everything
>>>>>> works (this has been the case when I was reading the VCPU context via a
>>>>>> domctl that did vcpu_pause()) -, or not, in which case all bets are off.
>>>>>>
>>>>>> In this case, we should either drop vm_event_vcpu_pause() completely and
>>>>>> just use vcpu_pause() everywhere, modify it to use vcpu_sleep_sync()
>>>>>> (and basically turn it into vcpu_pause()), or only sync in
>>>>>> vm_event_set_registers() as I've suggested.
>>>>>>
>>>> I would say just using vcpu_pause() would make sense as it's the least
>>>> complicated route. Is there any downside of doing the sync() in all
>>>> cases? Was the current way implemented perhaps the way it is for
>>>> performance reasons? If so, is it noticeable at all?
>>>
>>> I was hoping you'd know why it's implemented like this :), I think maybe
>>> Tim Deegan (CCd, hopefully the email address is not out of date) did the
>>> original implementation?
>>>
>>> That's why I proposed to only sync in vm_event_set_registers() - for all
>>> the other cases we can then keep the current performance (if
>>> significant). The least complicated route (at least as far as the patch
>>> changes go) I think is also this one, since it only requires a new line
>>> of code in vm_event_set_registers() - using vcpu_pause() everywhere else
>>> requires removing vm_event_pause_vcpu() as well as replacing the call
>>> everywhere else.
>>>
>>
>> Using _nosync() predates me touching this code by a couple years,
>> according to git blame it goes all the way back to when mem_access was
>> introduced:
>>
>> commit fbbedcae8c0c5374f8c0a869f49784b37baf04bb
>> Joe Epstein <jepstein98@gmail.com>
>> Date: Fri Jan 7 11:54:40 2011 +0000
>>
>> mem_access: mem event additions for access
>>
>> My only concern with changing to sync only in the set registers route
>> is that we potentially keep a buggy interface where we might run into
>> this problem in the future. IMHO just shifting all cases to do sync()
>> is safer, provided the performance difference is unnoticeable.
>
> Actually looking at the code again, this is vcpu_pause():
>
> 875 void vcpu_pause(struct vcpu *v)
> 876 {
> 877 ASSERT(v != current);
> 878 atomic_inc(&v->pause_count);
> 879 vcpu_sleep_sync(v);
> 880 }
> 881
> 882 void vcpu_pause_nosync(struct vcpu *v)
> 883 {
> 884 atomic_inc(&v->pause_count);
> 885 vcpu_sleep_nosync(v);
> 886 }
>
> and this is vm_event_vcpu_pause():
>
> 751 void vm_event_vcpu_pause(struct vcpu *v)
> 752 {
> 753 ASSERT(v == current);
> 754
> 755 atomic_inc(&v->vm_event_pause_count);
> 756 vcpu_pause_nosync(v);
> 757 }
>
> If we want to preserve the vm_event-specific reference counter
> (v->vm_event_pause_count) we'd use vcpu_pause() instead of
> vcpu_pause_nosync(). But vcpu_pause() wants to be used on a VCPU !=
> current (see the ASSERT()). This is possibly why vcpu_pause_nosync() has
> been chosen over vcpu_pause().
>
> I'll try to remove the ASSERT() and see if anything explodes, but it's
> looking increasingly like the smallest change is the one I've initially
> proposed.
Predictably, it did explode: http://pastebin.com/ruMKD2f0
However, your concern is valid, and I think we can address it by doing a
sync_vcpu_execstate() as soon as possible in vm_event_resume() - that
way we'll make sure that all the custom handlers running afterwards will
see the synced VCPU state.
I'll hack a patch and send it in ASAP.
Thanks,
Razvan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
next prev parent reply other threads:[~2016-08-09 9:01 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-08-08 10:31 VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate() Razvan Cojocaru
2016-08-08 12:01 ` Andrew Cooper
2016-08-08 12:47 ` Razvan Cojocaru
2016-08-08 15:10 ` Razvan Cojocaru
2016-08-08 15:52 ` Tamas K Lengyel
2016-08-08 16:29 ` Razvan Cojocaru
2016-08-08 18:01 ` Tamas K Lengyel
2016-08-09 8:19 ` Razvan Cojocaru
2016-08-09 9:01 ` Razvan Cojocaru [this message]
2016-08-09 9:41 ` Tim Deegan
2016-08-09 9:46 ` Razvan Cojocaru
2016-08-09 17:19 ` Tamas K Lengyel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f45adbcf-218d-34b0-ec20-a03b87255d44@bitdefender.com \
--to=rcojocaru@bitdefender.com \
--cc=andrew.cooper3@citrix.com \
--cc=tamas@tklengyel.com \
--cc=xen-devel@lists.xen.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.