From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Michael S. Tsirkin" Subject: Re: [Qemu-devel] [PATCH RFC] dma_rw.h (was Re: [PATCH 0/7] AMD IOMMU emulation patchset v4) Date: Thu, 16 Sep 2010 11:35:15 +0200 Message-ID: <20100916093515.GC20864@redhat.com> References: <1283007298-10942-1-git-send-email-eduard.munteanu@linux360.ro> <20100913200120.GA20918@redhat.com> <4C8E8D6E.90800@codemonkey.ws> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Eduard - Gabriel Munteanu , kvm@vger.kernel.org, joro@8bytes.org, qemu-devel@nongnu.org, blauwirbel@gmail.com, yamahata@valinux.co.jp, paul@codesourcery.com, avi@redhat.com To: Anthony Liguori Return-path: Received: from mx1.redhat.com ([209.132.183.28]:43825 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752199Ab0IPJlr (ORCPT ); Thu, 16 Sep 2010 05:41:47 -0400 Content-Disposition: inline In-Reply-To: <4C8E8D6E.90800@codemonkey.ws> Sender: kvm-owner@vger.kernel.org List-ID: On Mon, Sep 13, 2010 at 03:45:34PM -0500, Anthony Liguori wrote: > On 09/13/2010 03:01 PM, Michael S. Tsirkin wrote: > >So I think the following will give the idea of what an API > >might look like that will let us avoid the scary hacks in > >e.g. the ide layer and other generic layers that need to do DMA, > >without either binding us to pci, adding more complexity with > >callbacks, or losing type safety with casts and void*. > > > >Basically we have DMADevice that we can use container_of on > >to get a PCIDevice from, and DMAMmu that will get instanciated > >in a specific MMU. > > > >This is not complete code - just a header - I might complete > >this later if/when there's interest or hopefully someone interested > >in iommu emulation will. > > > >Notes: > >the IOMMU_PERM_RW code seem unused, so I replaced > >this with plain is_write. Is it ever useful? > > > >It seems that invalidate callback should be able to > >get away with just a device, so I switched to that > >from a void pointer for type safety. > >Seems enough for the users I saw. > > > >I saw devices do stl_le_phys and such, these > >might need to be wrapped as well. > > > >Signed-off-by: Michael S. Tsirkin > > One of the troubles with an interface like this is that I'm not sure > a generic model universally works. > > For instance, I know some PCI busses do transparent byte swapping. > For this to work, there has to be a notion of generic memory > reads/writes vs. reads of a 32-bit, 16-bit, and 8-bit value. > > With a generic API, we lose the flexibility to do this type of bus > interface. > > Regards, > > Anthony Liguori Surely only PCI root can do such tricks. Anyway, I suspect what you refer to is byte swapping of config cycles and similar IO done by driver. If a bus byteswapped a DMA transaction, this basically breaks DMA as driver will have to go and fix up all data before passing it up to the OS. Right? We'd have to add more wrappers to emulate such insanity, as MMU intentionally only handles translation. > >--- > > > >diff --git a/hw/dma_rw.h b/hw/dma_rw.h > >new file mode 100644 > >index 0000000..d63fd17 > >--- /dev/null > >+++ b/hw/dma_rw.h > >@@ -0,0 +1,122 @@ > >+#ifndef DMA_RW_H > >+#define DMA_RW_H > >+ > >+#include "qemu-common.h" > >+ > >+/* We currently only have pci mmus, but using > >+ a generic type makes it possible to use this > >+ e.g. from the generic ide code without callbacks. */ > >+typedef uint64_t dma_addr_t; > >+ > >+typedef struct DMAMmu DMAMmu; > >+typedef struct DMADevice DMADevice; > >+ > >+typedef int DMATranslateFunc(DMAMmu *mmu, > >+ DMADevice *dev, > >+ dma_addr_t addr, > >+ dma_addr_t *paddr, > >+ dma_addr_t *len, > >+ int is_write); > >+ > >+typedef int DMAInvalidateMapFunc(DMADevice *); > >+struct DMAMmu { > >+ /* invalidate, etc. */ > >+ DmaTranslateFunc *translate; > >+}; > >+ > >+struct DMADevice { > >+ DMAMmu *mmu; > >+ DMAInvalidateMapFunc *invalidate; > >+}; > >+ > >+void dma_device_init(DMADevice *, DMAMmu *, DMAInvalidateMapFunc *); > >+ > >+static inline void dma_memory_rw(DMADevice *dev, > >+ dma_addr_t addr, > >+ void *buf, > >+ uint32_t len, > >+ int is_write) > >+{ > >+ uint32_t plen; > >+ /* Fast-path non-iommu. > >+ * More importantly, makes it obvious what this function does. */ > >+ if (!dev->mmu) { > >+ cpu_physical_memory_rw(paddr, buf, plen, is_write); > >+ return; > >+ } > >+ while (len) { > >+ err = dev->mmu->translate(iommu, dev, addr,&paddr,&plen, is_write); > >+ if (err) { > >+ return; > >+ } > >+ > >+ /* The translation might be valid for larger regions. */ > >+ if (plen> len) { > >+ plen = len; > >+ } > >+ > >+ cpu_physical_memory_rw(paddr, buf, plen, is_write); > >+ > >+ len -= plen; > >+ addr += plen; > >+ buf += plen; > >+ } > >+} > >+ > >+void *dma_memory_map(DMADevice *dev, > >+ dma_addr_t addr, > >+ uint32_t *len, > >+ int is_write); > >+void dma_memory_unmap(DMADevice *dev, > >+ void *buffer, > >+ uint32_t len, > >+ int is_write, > >+ uint32_t access_len); > >+ > >+ > >++#define DEFINE_DMA_LD(suffix, size) \ > >++uint##size##_t dma_ld##suffix(DMADevice *dev, dma_addr_t addr) \ > >++{ \ > >++ int err; \ > >++ target_phys_addr_t paddr, plen; \ > >++ if (!dev->mmu) { \ > >++ return ld##suffix##_phys(addr, val); \ > >++ } \ > >++ \ > >++ err = dev->mmu->translate(dev->bus->iommu, dev, \ > >++ addr,&paddr,&plen, IOMMU_PERM_READ); \ > >++ if (err || (plen< size / 8)) \ > >++ return 0; \ > >++ \ > >++ return ld##suffix##_phys(paddr); \ > >++} > >++ > >++#define DEFINE_DMA_ST(suffix, size) \ > >++void dma_st##suffix(DMADevice *dev, dma_addr_t addr, uint##size##_t val) \ > >++{ \ > >++ int err; \ > >++ target_phys_addr_t paddr, plen; \ > >++ \ > >++ if (!dev->mmu) { \ > >++ st##suffix##_phys(addr, val); \ > >++ return; \ > >++ } \ > >++ err = dev->mmu->translate(dev->bus->iommu, dev, \ > >++ addr,&paddr,&plen, IOMMU_PERM_WRITE); \ > >++ if (err || (plen< size / 8)) \ > >++ return; \ > >++ \ > >++ st##suffix##_phys(paddr, val); \ > >++} > >+ > >+DEFINE_DMA_LD(ub, 8) > >+DEFINE_DMA_LD(uw, 16) > >+DEFINE_DMA_LD(l, 32) > >+DEFINE_DMA_LD(q, 64) > >+ > >+DEFINE_DMA_ST(b, 8) > >+DEFINE_DMA_ST(w, 16) > >+DEFINE_DMA_ST(l, 32) > >+DEFINE_DMA_ST(q, 64) > >+ > >+#endif > >diff --git a/hw/pci.h b/hw/pci.h > >index 1c6075e..9737f0e 100644 > >--- a/hw/pci.h > >+++ b/hw/pci.h > >@@ -5,6 +5,7 @@ > > #include "qobject.h" > > > > #include "qdev.h" > >+#include "dma_rw.h" > > > > /* PCI includes legacy ISA access. */ > > #include "isa.h" > >@@ -119,6 +120,10 @@ enum { > > > > struct PCIDevice { > > DeviceState qdev; > >+ > >+ /* For devices that do DMA. */ > >+ DMADevice dma; > >+ > > /* PCI config space */ > > uint8_t *config; > > > > From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from [140.186.70.92] (port=32899 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1OwAyP-0003Qp-13 for qemu-devel@nongnu.org; Thu, 16 Sep 2010 05:41:46 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69) (envelope-from ) id 1OwAyN-0006FI-LZ for qemu-devel@nongnu.org; Thu, 16 Sep 2010 05:41:44 -0400 Received: from mx1.redhat.com ([209.132.183.28]:38424) by eggs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OwAyN-0006Ey-2G for qemu-devel@nongnu.org; Thu, 16 Sep 2010 05:41:43 -0400 Date: Thu, 16 Sep 2010 11:35:15 +0200 From: "Michael S. Tsirkin" Subject: Re: [Qemu-devel] [PATCH RFC] dma_rw.h (was Re: [PATCH 0/7] AMD IOMMU emulation patchset v4) Message-ID: <20100916093515.GC20864@redhat.com> References: <1283007298-10942-1-git-send-email-eduard.munteanu@linux360.ro> <20100913200120.GA20918@redhat.com> <4C8E8D6E.90800@codemonkey.ws> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4C8E8D6E.90800@codemonkey.ws> List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Anthony Liguori Cc: kvm@vger.kernel.org, joro@8bytes.org, qemu-devel@nongnu.org, blauwirbel@gmail.com, yamahata@valinux.co.jp, paul@codesourcery.com, Eduard - Gabriel Munteanu , avi@redhat.com On Mon, Sep 13, 2010 at 03:45:34PM -0500, Anthony Liguori wrote: > On 09/13/2010 03:01 PM, Michael S. Tsirkin wrote: > >So I think the following will give the idea of what an API > >might look like that will let us avoid the scary hacks in > >e.g. the ide layer and other generic layers that need to do DMA, > >without either binding us to pci, adding more complexity with > >callbacks, or losing type safety with casts and void*. > > > >Basically we have DMADevice that we can use container_of on > >to get a PCIDevice from, and DMAMmu that will get instanciated > >in a specific MMU. > > > >This is not complete code - just a header - I might complete > >this later if/when there's interest or hopefully someone interested > >in iommu emulation will. > > > >Notes: > >the IOMMU_PERM_RW code seem unused, so I replaced > >this with plain is_write. Is it ever useful? > > > >It seems that invalidate callback should be able to > >get away with just a device, so I switched to that > >from a void pointer for type safety. > >Seems enough for the users I saw. > > > >I saw devices do stl_le_phys and such, these > >might need to be wrapped as well. > > > >Signed-off-by: Michael S. Tsirkin > > One of the troubles with an interface like this is that I'm not sure > a generic model universally works. > > For instance, I know some PCI busses do transparent byte swapping. > For this to work, there has to be a notion of generic memory > reads/writes vs. reads of a 32-bit, 16-bit, and 8-bit value. > > With a generic API, we lose the flexibility to do this type of bus > interface. > > Regards, > > Anthony Liguori Surely only PCI root can do such tricks. Anyway, I suspect what you refer to is byte swapping of config cycles and similar IO done by driver. If a bus byteswapped a DMA transaction, this basically breaks DMA as driver will have to go and fix up all data before passing it up to the OS. Right? We'd have to add more wrappers to emulate such insanity, as MMU intentionally only handles translation. > >--- > > > >diff --git a/hw/dma_rw.h b/hw/dma_rw.h > >new file mode 100644 > >index 0000000..d63fd17 > >--- /dev/null > >+++ b/hw/dma_rw.h > >@@ -0,0 +1,122 @@ > >+#ifndef DMA_RW_H > >+#define DMA_RW_H > >+ > >+#include "qemu-common.h" > >+ > >+/* We currently only have pci mmus, but using > >+ a generic type makes it possible to use this > >+ e.g. from the generic ide code without callbacks. */ > >+typedef uint64_t dma_addr_t; > >+ > >+typedef struct DMAMmu DMAMmu; > >+typedef struct DMADevice DMADevice; > >+ > >+typedef int DMATranslateFunc(DMAMmu *mmu, > >+ DMADevice *dev, > >+ dma_addr_t addr, > >+ dma_addr_t *paddr, > >+ dma_addr_t *len, > >+ int is_write); > >+ > >+typedef int DMAInvalidateMapFunc(DMADevice *); > >+struct DMAMmu { > >+ /* invalidate, etc. */ > >+ DmaTranslateFunc *translate; > >+}; > >+ > >+struct DMADevice { > >+ DMAMmu *mmu; > >+ DMAInvalidateMapFunc *invalidate; > >+}; > >+ > >+void dma_device_init(DMADevice *, DMAMmu *, DMAInvalidateMapFunc *); > >+ > >+static inline void dma_memory_rw(DMADevice *dev, > >+ dma_addr_t addr, > >+ void *buf, > >+ uint32_t len, > >+ int is_write) > >+{ > >+ uint32_t plen; > >+ /* Fast-path non-iommu. > >+ * More importantly, makes it obvious what this function does. */ > >+ if (!dev->mmu) { > >+ cpu_physical_memory_rw(paddr, buf, plen, is_write); > >+ return; > >+ } > >+ while (len) { > >+ err = dev->mmu->translate(iommu, dev, addr,&paddr,&plen, is_write); > >+ if (err) { > >+ return; > >+ } > >+ > >+ /* The translation might be valid for larger regions. */ > >+ if (plen> len) { > >+ plen = len; > >+ } > >+ > >+ cpu_physical_memory_rw(paddr, buf, plen, is_write); > >+ > >+ len -= plen; > >+ addr += plen; > >+ buf += plen; > >+ } > >+} > >+ > >+void *dma_memory_map(DMADevice *dev, > >+ dma_addr_t addr, > >+ uint32_t *len, > >+ int is_write); > >+void dma_memory_unmap(DMADevice *dev, > >+ void *buffer, > >+ uint32_t len, > >+ int is_write, > >+ uint32_t access_len); > >+ > >+ > >++#define DEFINE_DMA_LD(suffix, size) \ > >++uint##size##_t dma_ld##suffix(DMADevice *dev, dma_addr_t addr) \ > >++{ \ > >++ int err; \ > >++ target_phys_addr_t paddr, plen; \ > >++ if (!dev->mmu) { \ > >++ return ld##suffix##_phys(addr, val); \ > >++ } \ > >++ \ > >++ err = dev->mmu->translate(dev->bus->iommu, dev, \ > >++ addr,&paddr,&plen, IOMMU_PERM_READ); \ > >++ if (err || (plen< size / 8)) \ > >++ return 0; \ > >++ \ > >++ return ld##suffix##_phys(paddr); \ > >++} > >++ > >++#define DEFINE_DMA_ST(suffix, size) \ > >++void dma_st##suffix(DMADevice *dev, dma_addr_t addr, uint##size##_t val) \ > >++{ \ > >++ int err; \ > >++ target_phys_addr_t paddr, plen; \ > >++ \ > >++ if (!dev->mmu) { \ > >++ st##suffix##_phys(addr, val); \ > >++ return; \ > >++ } \ > >++ err = dev->mmu->translate(dev->bus->iommu, dev, \ > >++ addr,&paddr,&plen, IOMMU_PERM_WRITE); \ > >++ if (err || (plen< size / 8)) \ > >++ return; \ > >++ \ > >++ st##suffix##_phys(paddr, val); \ > >++} > >+ > >+DEFINE_DMA_LD(ub, 8) > >+DEFINE_DMA_LD(uw, 16) > >+DEFINE_DMA_LD(l, 32) > >+DEFINE_DMA_LD(q, 64) > >+ > >+DEFINE_DMA_ST(b, 8) > >+DEFINE_DMA_ST(w, 16) > >+DEFINE_DMA_ST(l, 32) > >+DEFINE_DMA_ST(q, 64) > >+ > >+#endif > >diff --git a/hw/pci.h b/hw/pci.h > >index 1c6075e..9737f0e 100644 > >--- a/hw/pci.h > >+++ b/hw/pci.h > >@@ -5,6 +5,7 @@ > > #include "qobject.h" > > > > #include "qdev.h" > >+#include "dma_rw.h" > > > > /* PCI includes legacy ISA access. */ > > #include "isa.h" > >@@ -119,6 +120,10 @@ enum { > > > > struct PCIDevice { > > DeviceState qdev; > >+ > >+ /* For devices that do DMA. */ > >+ DMADevice dma; > >+ > > /* PCI config space */ > > uint8_t *config; > > > >