From: Benjamin Herrenschmidt
Subject: Re: [PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API
Date: Wed, 03 Sep 2014 08:10:10 +1000
Message-ID: <1409695810.30640.57.camel@pasglop>
References: <1409609814.30640.11.camel@pasglop> <1409691213.30640.37.camel@pasglop>
To: Andy Lutomirski
Cc: "linux-s390@vger.kernel.org", Konrad Rzeszutek Wilk, "Michael S. Tsirkin",
 Linux Virtualization, Christian Borntraeger, Paolo Bonzini, "linux390@de.ibm.com"

On Tue, 2014-09-02 at 14:37 -0700, Andy Lutomirski wrote:
> Let's take a step back from the implementation. What is a driver
> for a virtio PCI device (i.e. a PCI device with vendor 0x1af4)
> supposed to do on ppc64?

Today, it's supposed to send guest physical addresses. We can make that
optional via some negotiation or capability to support more esoteric
setups, but for backward compatibility this must remain the default
behaviour.

> It can send the device physical addresses and ignore the normal PCI
> DMA semantics, which is what the current virtio_pci driver does. This
> seems like a layering violation, and this won't work if the device is
> a real PCI device.

Correct, it's an original virtio implementation choice made for maximum
performance.

> Alternatively, it can treat the device like any
> other PCI device and use the IOMMU. This is a bit slower, and it is
> also incompatible with current hypervisors.

This is potentially a LOT slower and is backward incompatible with
current qemu/KVM and kvmtool, yes.

The slowness can be alleviated using various techniques. For example, on
ppc64 we can create a DMA window that contains a permanent mapping of
the entire guest address space, so we could use such a thing for virtio.
Another thing we could do is advertise via the device-tree that such a
bus uses a direct mapping and have the guest use the appropriate
"direct map" dma_ops.

But we need to keep backward compatibility with existing
guests/hypervisors, so the default must remain as it is.

> There really are virtio devices that are pieces of silicon and not
> figments of a hypervisor's imagination [1].

I am aware of that. There are also attempts at using virtio to make two
machines communicate via a PCIe link (either with one as the endpoint of
the other or via a non-transparent switch).

Which is why I'm not objecting to what you are trying to do ;-)

My suggestion was that it might be a cleaner approach to have the
individual virtio drivers always use the dma_map_* API, and to limit the
kludgery to a combination of the virtio_pci "core" and arch code
selecting an appropriate set of dma_map_ops, defaulting to a
"transparent" (or direct) one for our current default case (and thus
overriding the iommu ones provided by the arch).
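Roughly the shape I have in mind -- just a sketch, not tested code; the
names virtio_direct_dma_ops and virtio_pci_setup_dma are made up for
illustration, a real version would also fill in map_sg, alloc, free, etc.,
and set_dma_ops here is the arch helper we have on powerpc:

#include <linux/dma-mapping.h>
#include <linux/pci.h>
#include <asm/io.h>

/* "Transparent" ops: the DMA address is just the guest physical address,
 * i.e. exactly what virtio_pci hands to the host today. */
static dma_addr_t virtio_direct_map_page(struct device *dev,
					 struct page *page,
					 unsigned long offset, size_t size,
					 enum dma_data_direction dir,
					 struct dma_attrs *attrs)
{
	return page_to_phys(page) + offset;
}

static void virtio_direct_unmap_page(struct device *dev, dma_addr_t addr,
				     size_t size,
				     enum dma_data_direction dir,
				     struct dma_attrs *attrs)
{
	/* Nothing to tear down for a 1:1 mapping. */
}

static struct dma_map_ops virtio_direct_dma_ops = {
	.map_page	= virtio_direct_map_page,
	.unmap_page	= virtio_direct_unmap_page,
	/* .map_sg, .alloc, .free, ... in the same spirit. */
};

static void virtio_pci_setup_dma(struct pci_dev *pci_dev, bool bypass_iommu)
{
	/* The kludge lives in one place: the core (or arch code) picks the
	 * ops, the individual virtio drivers just call dma_map_*() and
	 * never know the difference. */
	if (bypass_iommu)
		set_dma_ops(&pci_dev->dev, &virtio_direct_dma_ops);
	/* else: leave the arch-provided (iommu) dma_map_ops alone. */
}

That way a setup that really wants IOMMU translation simply doesn't get the
direct ops installed, and nothing in the individual drivers changes.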
> We could teach virtio_pci
> to use physical addressing on ppc64, but that seems like a pretty
> awful hack, and it'll start needing quirks as soon as someone tries to
> plug a virtio-speaking PCI card into a ppc64 machine.

But x86_64 is the same, no? The day it starts growing an iommu emulation
in qemu (and I've heard it's happening) it will still want to do direct
bypass for virtio for performance.

> Ideas? x86 and arm seem to be safe here, since AFAIK there is no such
> thing as a physically addressed virtio "PCI" device on a bus with an
> IOMMU on x86, arm, or arm64.

Today... I wouldn't bet on it remaining that way. The qemu
implementation of virtio is physically addressed, and you don't
necessarily have a choice of which device gets an iommu and which
doesn't.

Cheers,
Ben.

> [1] https://lwn.net/Articles/580186/