From: Andrew Jones
Subject: Re: issues with emulated PCI MMIO backed by host memory under KVM
Date: Fri, 24 Jun 2016 16:57:48 +0200
Message-ID: <20160624145748.e2uwqrdv2fik46wc@hawk.localdomain>
To: Ard Biesheuvel
Cc: Marc Zyngier, Catalin Marinas, Laszlo Ersek, kvmarm@lists.cs.columbia.edu

Hi Ard,

Thanks for bringing this back up again (I think :-)

On Fri, Jun 24, 2016 at 04:04:45PM +0200, Ard Biesheuvel wrote:
> Hi all,
>
> This old subject came up again in a discussion related to PCIe support
> for QEMU/KVM under Tianocore. The fact that we need to map PCI MMIO
> regions as cacheable is preventing us from reusing a significant slice
> of the PCIe support infrastructure, and so I'd like to bring this up
> again, perhaps just to reiterate why we're simply out of luck.
>
> To refresh your memories, the issue is that on ARM, PCI MMIO regions
> for emulated devices may be backed by memory that is mapped cacheable
> by the host. Note that this has nothing to do with the device being
> DMA coherent or not: in this case, we are dealing with regions that
> are not memory from the POV of the guest, and it is reasonable for the
> guest to assume that accesses to such a region are not visible to the
> device before they hit the actual PCI MMIO window and are translated
> into cycles on the PCI bus. That means that mapping such a region
> cacheable is a strange thing to do, in fact, and it is unlikely that
> patches implementing this against the generic PCI stack in Tianocore
> will be accepted by the maintainers.
>
> Note that this issue not only affects framebuffers on PCI cards, it
> also affects emulated USB host controllers (perhaps Alex can remind us
> which one exactly?) and likely other emulated generic PCI devices as
> well.
>
> Since the issue exists only for emulated PCI devices whose MMIO
> regions are backed by host memory, is there any way we can already
> distinguish such memslots from ordinary ones? If we can, is there

When I was looking at this I didn't see any way to identify these
memslots. I wrote some patches to add a new flag, KVM_MEM_NONCACHEABLE,
allowing userspace to point them out. That was the easy part (although
I didn't like that userspace developers would have to go around finding
all memory regions that needed to be flagged, and new devices would
likely not be flagged when developed on non-ARM architectures, so we'd
always be chasing it...) However, what really slowed/stopped me was
trying to figure out what to do with those identified memslots.
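
To give an idea, flagging such a memslot from userspace would look
something like the sketch below. struct kvm_userspace_memory_region and
the KVM_SET_USER_MEMORY_REGION ioctl are the existing KVM API; the
KVM_MEM_NONCACHEABLE value itself is hypothetical (only bits 0 and 1,
KVM_MEM_LOG_DIRTY_PAGES and KVM_MEM_READONLY, are defined today), and
this is not the actual patch, just an illustration:

  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  /* Hypothetical flag; bits 0 and 1 are already taken. */
  #define KVM_MEM_NONCACHEABLE (1UL << 2)

  static int flag_noncacheable_memslot(int vm_fd, __u32 slot, __u64 gpa,
                                       __u64 size, void *hva)
  {
          struct kvm_userspace_memory_region region = {
                  .slot            = slot,
                  .flags           = KVM_MEM_NONCACHEABLE,
                  .guest_phys_addr = gpa,
                  .memory_size     = size,
                  .userspace_addr  = (unsigned long)hva,
          };

          /* vm_fd is the fd returned by the KVM_CREATE_VM ioctl. */
          return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
  }

KVM would then have to do something useful with the flagged slots,
which is where the hard part started.
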
My last idea, which had implementation issues (probably because I was
getting in over my head), was to

 1) introduce PAGE_S2_NORMAL_NC and use it when mapping the guest's
    pages, and
 2) flush the userspace pages and update all PTEs to be NC.

The reasoning was that, while we can't force a guest to use cacheable
memory, we can take advantage of the noncacheable precedence of the
architecture, forcing the memory accesses to be noncached by way of S2
attributes. And of course userspace mappings also need to become NC to
finally have coherency. (See the P.S. below for a rough sketch of what
step 1 would mean.)

> anything we could do to treat these specially? Perhaps something like
> using read-only memslots so we can at least trap guest writes instead
> of having main memory going out of sync with the caches unnoticed? I
> am just brainstorming here ...
>
> In any case, it would be good to put this to bed one way or the other
> (assuming it hasn't been put to bed already)

I'm willing to work on this again (because it's fun), but I'm a bit
overloaded right now, and last time I touched it, it sucked me into a
time hole...

drew
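
P.S. A rough sketch of step 1, modelled on the existing arm64 stage-2
definitions (PAGE_S2, PAGE_S2_DEVICE), not the actual patches. The
MT_S2_NORMAL_NC value is the architectural stage-2 MemAttr encoding
0b0101 (Normal, Inner/Outer Non-cacheable), and the memslot flag check
refers to the hypothetical KVM_MEM_NONCACHEABLE flag sketched above:

  /* next to the existing MT_S2_NORMAL / PAGE_S2 definitions */
  #define MT_S2_NORMAL_NC      0x5

  #define PAGE_S2_NORMAL_NC    __pgprot(PROT_DEFAULT | \
                                        PTE_S2_MEMATTR(MT_S2_NORMAL_NC) | \
                                        PTE_S2_RDONLY)

  /*
   * Then, in the stage-2 fault path (user_mem_abort()), roughly the
   * following: stage-2 Normal-NC combines with whatever stage-1
   * attributes the guest uses, and the result is never cacheable.
   */
          pgprot_t prot = PAGE_S2;

          if (memslot->flags & KVM_MEM_NONCACHEABLE)
                  prot = PAGE_S2_NORMAL_NC;

          pte_t new_pte = pfn_pte(pfn, prot);
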