Re: x86/vMSI-X emulation issue

From: "Jan Beulich" <JBeulich@suse.com>
To: xen-devel <xen-devel@lists.xenproject.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: Re: x86/vMSI-X emulation issue
Date: Thu, 24 Mar 2016 01:51:31 -0600
Message-ID: <56F3AA9302000078000DFE82@prv-mh.provo.novell.com>
In-Reply-To: <56F2DAF302000078000DFC58@prv-mh.provo.novell.com>

>>> On 23.03.16 at 18:05, <JBeulich@suse.com> wrote:
> All,
> 
> so I've just learned that Windows (at least some versions and some
> of their code paths) uses REP MOVSD to read/write the MSI-X table.
> The way at least msixtbl_write() works is not compatible with this
> (msixtbl_read() also seems affected, albeit to a lesser degree), and
> apparently it just worked by accident until the XSA-120 and 128-131
> and follow-up changes - most notably commit ad28e42bd1 ("x86/MSI:
> track host and guest masking separately"), as without the call to
> guest_mask_msi_irq() interrupts won't ever get unmasked.
> 
> The problem with emulating REP MOVSD is that msixtbl_write()
> intentionally returns X86EMUL_UNHANDLEABLE on all writes to
> words 0, 1, and 2. When in the process of emulating multiple
> writes, we therefore hand the entire batch of 3 or 4 writes to qemu,
> and the hypervisor doesn't get to see any other than the initial
> iteration.
> 
> Now I see a couple of possible solutions, but none of them look
> really neat, hence I'm seeking a second opinion (including, of
> course, further alternative ideas):
> 
> 1) Introduce another X86EMUL_* like status that's not really to be
>     used by the emulator itself, but only by the two vMSI-X functions
>     to indicate to their caller that prior to forwarding the request it
>     should be chopped to a single repetition.
> 
> 2) Do aforementioned chopping automatically on seeing
>     X86EMUL_UNHANDLEABLE, on the basis that the .check
>     handler had indicated that the full range was acceptable. That
>     would at once cover other similarly undesirable cases like the
>     vLAPIC code returning this error. However, any stdvga like
>     emulated device would clearly not want such to happen, and
>     would instead prefer the entire batch to get forwarded in one
>     go (stdvga itself sits on a different path). Otoh, with the
>     devices we have currently, this would seem to be the least
>     intrusive solution.

Having thought about it more over night, I think this indeed is
the most reasonable route, not just because it's least intrusive:
For non-buffered internally handled I/O requests, no good can
come from forwarding full batches to qemu, when the respective
range checking function has indicated that this is an acceptable
request. And in fact neither vHPET nor vIO-APIC code generates
X86EMUL_UNHANDLEABLE. And vLAPIC code only appears to do so -
I'll submit a patch to make this obvious once tested.
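
The rep count reduction discussed above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not
Xen's code: enum emul_status, struct io_req, process_batch() and
demo_msix_write() are hypothetical stand-ins, and the demo handler is
a simplified model in which dwords 0-2 of a 16-byte MSI-X table entry
are reported as unhandleable while dword 3 is handled internally.

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration; not Xen's real types. */
enum emul_status { EMUL_OKAY, EMUL_UNHANDLEABLE };

struct io_req {
    unsigned long addr;   /* address of the first repetition */
    unsigned int size;    /* bytes per repetition */
    unsigned long count;  /* repetitions requested via the REP prefix */
};

typedef enum emul_status (*internal_handler_t)(unsigned long addr,
                                               unsigned int size);

/*
 * Simplified model of a vMSI-X table write: each entry is 16 bytes
 * (4 dwords). Dwords 0-2 are reported as unhandleable (they need to
 * reach the device model); dword 3 is handled internally.
 */
static enum emul_status demo_msix_write(unsigned long addr,
                                        unsigned int size)
{
    assert(size == 4);
    return ((addr % 16) / 4 == 3) ? EMUL_OKAY : EMUL_UNHANDLEABLE;
}

/*
 * Walk the batch one repetition at a time. On the first repetition
 * the internal handler cannot complete, fill *fwd with a request for
 * exactly that one repetition (count == 1) and stop. Returns the
 * number of repetitions completed internally before that point;
 * fwd->count stays 0 if nothing needs forwarding.
 */
static unsigned long process_batch(const struct io_req *req,
                                   internal_handler_t handler,
                                   struct io_req *fwd)
{
    unsigned long done;

    fwd->addr = 0;
    fwd->size = 0;
    fwd->count = 0;

    for ( done = 0; done < req->count; done++ )
    {
        unsigned long addr = req->addr + done * req->size;

        if ( handler(addr, req->size) == EMUL_UNHANDLEABLE )
        {
            fwd->addr = addr;
            fwd->size = req->size;
            fwd->count = 1;   /* chopped to a single repetition */
            break;
        }
    }

    return done;
}
```

With this shape, a four-dword batch starting at an entry boundary
results in a forwarded request covering only the first dword; the
caller would then re-enter the internal handler for the remaining
repetitions, so it gets to see each of them.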

Otoh stdvga_intercept_pio() uses X86EMUL_UNHANDLEABLE in
a manner similar to the vMSI-X code - for internal caching and
then forwarding to qemu. Clearly that is also broken for
REP OUTS, and hence a similar rep count reduction is going to
be needed for the port I/O case.

vRTC code would misbehave too, albeit there it is quite hard to
see what use REP INS or REP OUTS could be. Yet we can't
exclude a guest using such, so we should make it behave
correctly.

For handle_pmt_io(), otoh, forwarding the full batch would be
okay, but since there shouldn't be any writes in the first place,
breaking up such batches wouldn't be a problem either. Then again,
forwarding such invalid requests to qemu is kind of pointless - we
could as well terminate them right in Xen, just like we terminate
requests of other than 4 byte width - again I'll submit a patch to
make this obvious once tested.
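
A minimal sketch of what terminating such requests in Xen could look
like, under stated assumptions: pmt_io_sketch(), enum emul_status and
enum io_dir are hypothetical names, this is not the actual
handle_pmt_io(), and returning all ones for a read of the wrong width
is an assumption made for the example only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not Xen's real types. */
enum emul_status { EMUL_OKAY, EMUL_UNHANDLEABLE };
enum io_dir { IO_READ, IO_WRITE };

/*
 * Model of a PM timer port handler that never forwards anything:
 * - accesses of other than 4 bytes are terminated here,
 * - writes (invalid for a read-only timer) are dropped here,
 * - 4-byte reads return the current timer value.
 */
static enum emul_status pmt_io_sketch(enum io_dir dir, unsigned int bytes,
                                      uint32_t timer_value, uint32_t *val)
{
    assert(val != 0);

    if ( bytes != 4 )
    {
        /* Assumption: reads of the wrong width yield all ones. */
        if ( dir == IO_READ )
            *val = ~(uint32_t)0;
        return EMUL_OKAY;
    }

    if ( dir == IO_WRITE )
        return EMUL_OKAY;   /* dropped, *val left untouched */

    *val = timer_value;
    return EMUL_OKAY;
}
```

Since no path in this sketch returns EMUL_UNHANDLEABLE, there is
nothing left to forward, and hence no question of how a batch should
be split.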

Jan

> 3) Have emulation backends provide some kind of (static) flag
>     indicating which forwarding behavior they would like.
> 
> 4) Expose the full ioreq to the emulation backends, so they can
>     fiddle with the request to their liking.
> 
> Thanks, Jan
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org 
> http://lists.xen.org/xen-devel 
