xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Tamas K Lengyel <tamas@tklengyel.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: Issues with the xen-access.c model
Date: Tue, 5 Apr 2016 09:35:02 -0600	[thread overview]
Message-ID: <CABfawhmc8gqTtuQ90hf7HZBxXsEdP8JbRXnLeouT0K=e96Ff3g@mail.gmail.com> (raw)
In-Reply-To: <5703ABF3.4040709@citrix.com>


[-- Attachment #1.1: Type: text/plain, Size: 4057 bytes --]

On Tue, Apr 5, 2016 at 6:13 AM, Andrew Cooper <andrew.cooper3@citrix.com>
wrote:

> On 05/04/16 11:55, Razvan Cojocaru wrote:
> > Hello,
> >
> > xen-access.c does roughly this:
> >
> > for (;;) {
> >       poll_evt_channel();
> >
> >       if (new_events) {
> >               while (ring_buffer_has_requests) {
> >                       pull_request();
> >                       process_request();
> >                       put_response();
> >               }
> >
> >               signal_evt_channel();
> >       }
> > }
> >
> > The problems are these:
> >
> > 1. vm_event_put_request() does notify_via_xen_event_channel(d,
> > ved->xen_port); at the very end, and we seem to be using a FIFO event
> > channel (if I'm reading the code right, it's generic event channel code
> > that pre-dates the vm_event code that sits on top of it).
> >
> > This means that for a guest that's running two VCPUs, the following
> > scenario is possible:
> >
> > * VCPU1 puts an event in the ring buffer
> > * VCPU1 signals the event channel
> > * poll_evt_channel() wakes up
> > * VCPU2 puts an event in the ring buffer
> > * the loop processes the first event
> > * VCPU2 signals the event channel
> > * the loop processes the second event (since there _are_ more requests
> > in the ring buffer)
> > * the loop signals the event channel that requests have been processed
> > * poll_evt_channel() wakes up again, because of the second event channel
> > signal, but this is pointless since all events have already been
> processed
> >
> > This can be avoided by counting the requests processed vs. the number of
> > succesful poll()s, but it still does not solve the second problem:
> >
> > 2. Race conditions. At least in theory, there's nothing to stop the
> > second iteration of the request processing loop to read a partially
> > written request from the ring buffer, or garbage left over from a
> > previous request, if it hits it "just right" (i.e. the "VCPU2 puts an
> > event in the ring buffer" part can run on top of the "the loop processes
> > the second event").
> >
> > All this doesn't really show up unless we're in heavy processing
> > scenarios, but they can occur, especially in synced vm_event scenarios.
> >
> > The problems seems to be alleviated by changing the loop to this:
> >
> > for (;;) {
> >       poll_evt_channel();
> >
> >       if (new_events) {
> >               while (ring_buffer_has_requests) {
> >                       pull_request();
> >                       process_request();
> >                       put_response();
> >
> >                       /* Moved here. */
> >                       signal_evt_channel();
>

That was the original setup, I've changed it a while back by moving it
outside the loop. My reasoning was that there is no point in going lockstep
with events and issuing a hypercall after each is processed. With only a
handful of events on the ring the context switch is probably more taxing
then having all requests and responses processed in one swipe. Of course,
depending on the application this may not be the optimal case and the
signal can be moved back to the loop (with lots of vCPUs it may be the case
that the first vCPU remains paused while all events are processed for the
other vCPUs, which of course would be bad).


> >               }
> >       }
> > }
> >
> > but it's obviously not foolproof, the main problem remaining the
> > synchronization of the ring buffer from both the hypervisor side and
> > userspace.
> >
> > How should we proceed to fix this?
>
> The vm_event code in Xen must increment the producer index as the final
> action of putting a request on the ring.  If it doesn't then it is
> simply broken and needs fixing.
>
> This will ensure that the consumer never sees a partial request on the
> ring; by the time it observes a newer producer index, all data is
> already written.
>

It may worth double-checking but I'm pretty sure that's how it works right
now as that part of the code has been pretty much just been recycled from
the other Xen ring implementations.

Tamas

[-- Attachment #1.2: Type: text/html, Size: 5444 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

  reply	other threads:[~2016-04-05 15:35 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-05 10:55 Razvan Cojocaru
2016-04-05 12:13 ` Andrew Cooper
2016-04-05 15:35   ` Tamas K Lengyel [this message]
2016-04-05 15:55     ` Razvan Cojocaru
2016-04-05 16:10       ` Andrew Cooper
2016-04-06 16:08         ` Razvan Cojocaru

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CABfawhmc8gqTtuQ90hf7HZBxXsEdP8JbRXnLeouT0K=e96Ff3g@mail.gmail.com' \
    --to=tamas@tklengyel.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=rcojocaru@bitdefender.com \
    --cc=xen-devel@lists.xen.org \
    --subject='Re: Issues with the xen-access.c model' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).