From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Laszlo Ersek <lersek@redhat.com>,
	"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>
Subject: Re: issues with emulated PCI MMIO backed by host memory under KVM
Date: Mon, 27 Jun 2016 11:47:18 +0200
Message-ID: <CAKv+Gu9u2f6+HLYMzqtkq0z=Xk5gTq=_11QvAKMaBZj_oRMwTA@mail.gmail.com>
In-Reply-To: <20160627091619.GB26498@cbox>

On 27 June 2016 at 11:16, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> Hi,
>
> I'm going to ask some stupid questions here...
>
> On Fri, Jun 24, 2016 at 04:04:45PM +0200, Ard Biesheuvel wrote:
>> Hi all,
>>
>> This old subject came up again in a discussion related to PCIe support
>> for QEMU/KVM under Tianocore. The fact that we need to map PCI MMIO
>> regions as cacheable is preventing us from reusing a significant slice
>> of the PCIe support infrastructure, and so I'd like to bring this up
>> again, perhaps just to reiterate why we're simply out of luck.
>>
>> To refresh your memories, the issue is that on ARM, PCI MMIO regions
>> for emulated devices may be backed by memory that is mapped cacheable
>> by the host. Note that this has nothing to do with the device being
>> DMA coherent or not: in this case, we are dealing with regions that
>> are not memory from the POV of the guest, and it is reasonable for the
>> guest to assume that accesses to such a region are not visible to the
>> device before they hit the actual PCI MMIO window and are translated
>> into cycles on the PCI bus.
>
> For the sake of completeness, why is this reasonable?
>

Because the whole point of accessing these regions is to communicate
with the device. It is common to use write-combining mappings for
things like framebuffers to group writes before they hit the PCI bus,
but any caching just makes it more difficult for the driver state and
device state to remain synchronized.
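
To make that concrete, here is a rough sketch (purely illustrative,
not taken from any real driver, and the names are made up) of how a
guest driver would typically map the two kinds of BARs on something
like a framebuffer device: Device memory for the control registers,
write-combining for the pixel aperture, and cacheable for neither:

/* Illustrative sketch only -- not from any driver discussed here. */
#include <linux/pci.h>
#include <linux/io.h>

struct mydev {
	void __iomem *regs;	/* control registers, BAR 0 */
	void __iomem *fb;	/* framebuffer aperture, BAR 1 */
};

static int mydev_map_bars(struct pci_dev *pdev, struct mydev *md)
{
	/* BAR 0: Device memory -- every write must become a transaction
	 * on the bus, in program order, so the device actually sees it. */
	md->regs = pci_iomap(pdev, 0, 0);
	if (!md->regs)
		return -ENOMEM;

	/* BAR 1: write-combining is fine for pixel data -- writes may be
	 * grouped, but they still drain to the device in finite time.
	 * A cacheable mapping gives no such guarantee: a dirty line can
	 * sit in the cache indefinitely and never produce a bus cycle. */
	md->fb = ioremap_wc(pci_resource_start(pdev, 1),
			    pci_resource_len(pdev, 1));
	if (!md->fb) {
		pci_iounmap(pdev, md->regs);
		return -ENOMEM;
	}

	return 0;
}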

> Is this how any real ARM system implementing PCI would actually work?
>

Yes.

>> That means that mapping such a region
>> cacheable is a strange thing to do, in fact, and it is unlikely that
>> patches implementing this against the generic PCI stack in Tianocore
>> will be accepted by the maintainers.
>>
>> Note that this issue not only affects framebuffers on PCI cards, it
>> also affects emulated USB host controllers (perhaps Alex can remind us
>> which one exactly?) and likely other emulated generic PCI devices as
>> well.
>>
>> Since the issue exists only for emulated PCI devices whose MMIO
>> regions are backed by host memory, is there any way we can already
>> distinguish such memslots from ordinary ones? If we can, is there
>> anything we could do to treat these specially? Perhaps something like
>> using read-only memslots so we can at least trap guest writes instead
>> of having main memory going out of sync with the caches unnoticed? I
>> am just brainstorming here ...
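
(For reference, a read-only memslot is just an ordinary memslot
registered with the KVM_MEM_READONLY flag, so guest writes to it exit
to userspace as MMIO instead of silently landing in host memory.
Minimal userspace sketch, error handling and the surrounding VM setup
omitted:)

/* Register a read-only memslot so guest writes to the region trap to
 * the VMM (KVM_EXIT_MMIO) rather than being absorbed by host memory.
 * Assumes vm_fd is an existing KVM VM file descriptor and that
 * KVM_CAP_READONLY_MEM has been checked. */
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static int register_readonly_slot(int vm_fd, __u32 slot, __u64 gpa,
				  __u64 size, void *hva)
{
	struct kvm_userspace_memory_region region;

	memset(&region, 0, sizeof(region));
	region.slot            = slot;
	region.flags           = KVM_MEM_READONLY;
	region.guest_phys_addr = gpa;
	region.memory_size     = size;
	region.userspace_addr  = (__u64)(unsigned long)hva;

	return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}
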
>
> I think the only sensible solution is to make sure that the guest and
> emulation mappings use the same memory type, either cached or
> non-cached, and we 'simply' have to find the best way to implement this.
>
> As Drew suggested, forcing some S2 mappings to be non-cacheable is the
> one way.
>
> The other way is to use something like what you once wrote that rewrites
> stage-1 mappings to be cacheable, does that apply here ?
>
> Do we have a clear picture of why we'd prefer one way over the other?
>

So first of all, let me reiterate that I could only find a single
instance in QEMU where a PCI MMIO region is backed by host memory,
which is vga-pci.c. I wonder if there are any other occurrences, but
if there aren't any, it makes much more sense to prohibit PCI BARs
backed by host memory rather than spend a lot of effort working around
it.

If we do decide to fix this, the best way would be to use uncached
attributes for the QEMU userland mapping, and force it uncached in the
guest via a stage 2 override (as Drew suggests). The only problem I
see here is that the host's kernel direct mapping has a cached alias
that we need to get rid of. The MAIR hack is just that, a hack, since
there are corner cases that cannot be handled (please refer to the old
thread for the details).
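
To spell out what a stage 2 override could look like from the QEMU
side: conceptually it would just be another memslot flag, along the
lines of the sketch below. Note that KVM_MEM_NONCACHED is entirely
made up -- no such flag exists today -- and the real work is what it
would imply on the kernel side, plus getting rid of that cached alias.

#include <sys/ioctl.h>
#include <linux/kvm.h>

/* KVM_MEM_NONCACHED is hypothetical -- it stands in for whatever
 * mechanism would tell KVM to install Device or Normal-NC attributes
 * in the stage 2 mapping for this slot. */
#define KVM_MEM_NONCACHED	(1UL << 2)

static int force_s2_uncached(int vm_fd, __u32 slot, __u64 bar_gpa,
			     __u64 bar_size, void *hva)
{
	struct kvm_userspace_memory_region region = {
		.slot            = slot,
		.flags           = KVM_MEM_NONCACHED,	/* hypothetical */
		.guest_phys_addr = bar_gpa,
		.memory_size     = bar_size,
		/* QEMU would also need an uncached mapping of hva itself,
		 * which is where the cached alias in the host's direct
		 * mapping gets in the way. */
		.userspace_addr  = (__u64)(unsigned long)hva,
	};

	return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}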

As for the USB case, I can't really figure out what is going on here,
but I am fairly certain it is a different issue. If this is related to
DMA, I wonder if adding the 'dma-coherent' property to the PCIe root
complex node fixes anything.

-- 
Ard.


Thread overview: 32+ messages
2016-06-24 14:04 issues with emulated PCI MMIO backed by host memory under KVM Ard Biesheuvel
2016-06-24 14:57 ` Andrew Jones
2016-06-27  8:17   ` Marc Zyngier
2016-06-24 18:16 ` Ard Biesheuvel
2016-06-25  7:15   ` Alexander Graf
2016-06-25  7:19 ` Alexander Graf
2016-06-27  8:11   ` Marc Zyngier
2016-06-27  9:16 ` Christoffer Dall
2016-06-27  9:47   ` Ard Biesheuvel [this message]
2016-06-27 10:34     ` Christoffer Dall
2016-06-27 12:30       ` Ard Biesheuvel
2016-06-27 13:35         ` Christoffer Dall
2016-06-27 13:57           ` Ard Biesheuvel
2016-06-27 14:29             ` Alexander Graf
2016-06-28 11:02               ` Laszlo Ersek
2016-06-28 10:04             ` Christoffer Dall
2016-06-28 11:06               ` Laszlo Ersek
2016-06-28 12:20                 ` Christoffer Dall
2016-06-28 13:10                   ` Catalin Marinas
2016-06-28 13:19                     ` Ard Biesheuvel
2016-06-28 13:25                       ` Catalin Marinas
2016-06-28 14:02                         ` Andrew Jones
2016-06-27 14:24       ` Alexander Graf
2016-06-28 10:55       ` Laszlo Ersek
2016-06-28 13:14         ` Ard Biesheuvel
2016-06-28 13:32           ` Laszlo Ersek
2016-06-29  7:12             ` Gerd Hoffmann
2016-06-28 15:23         ` Alexander Graf
2016-06-27 13:15     ` Peter Maydell
2016-06-27 13:49       ` Mark Rutland
2016-06-27 14:10         ` Peter Maydell
2016-06-28 10:05           ` Christoffer Dall
