On Wed, 2015-10-28 at 13:10 +0200, Shamir Rabinovitch wrote:
> On Wed, Oct 28, 2015 at 03:30:01PM +0900, David Woodhouse wrote:
> > > > +For systems with IOMMU it is assumed all DMA translations use the IOMMU.
> > 
> > Not entirely true. We have per-device dma_ops on most architectures
> > already, and we were just talking about the need to add them to
> > POWER/SPARC too, because we need to avoid trying to use the IOMMU to
> > map virtio devices too.
> 
> SPARC has its implementation under arch/sparc for dma_ops (sun4v_dma_ops).
> 
> Some drivers use IOMMU under SPARC, for example ixgbe (Intel 10G ETH).
> Some, like IB, suffer from IOMMU MAP setup/tear-down & limited address range.
> On SPARC IOMMU bypass is not total bypass of the IOMMU but rather a much simpler
> translation that does not require any complex translation tables.

We have an option in the Intel IOMMU for pass-through mode too, which
basically *is* a total bypass. In practice, what's the difference
between that and a "simple translation that does not require any
[translation]"? We set up a full 1:1 mapping of all memory, and then
the map/unmap methods become no-ops.

Currently we have no way to request that mode on a per-device basis; we
only have 'iommu=pt' on the command line to set it for *all* devices.
But performance-sensitive devices might want it, while we keep doing
proper translation for others.

> > As we look at that (and make the per-device dma_ops a generic thing
> > rather than per-arch), we should probably look at folding your case in
> > too.
> 
> Whether to use IOMMU or not for DMA is up to the driver. The above example
> shows a real situation where one driver can use IOMMU and the other can't. It
> seems that the device cannot know when to use IOMMU bypass and when not.
> Even in the driver we can decide to DMA map some buffers using IOMMU
> translation and some as IOMMU bypass.
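
(To make the pass-through point from earlier in this mail concrete,
here is a rough userspace sketch of per-device ops where one device
gets a 1:1 mapping and another gets real translation. It is only a toy
model under stated assumptions: every toy_* name is made up for
illustration, the 4KiB page size and bump allocator are arbitrary, and
none of this is the kernel DMA API or the Intel IOMMU driver.)

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model; not the kernel's struct dma_map_ops. */
typedef uint64_t toy_dma_addr_t;

struct toy_device;

struct toy_dma_ops {
	toy_dma_addr_t (*map)(struct toy_device *dev, uint64_t phys, size_t len);
	void (*unmap)(struct toy_device *dev, toy_dma_addr_t addr, size_t len);
};

struct toy_device {
	const struct toy_dma_ops *ops;	/* chosen per device, not per arch */
	uint64_t next_iova;		/* bump allocator for the translated case */
	unsigned int live_mappings;	/* translations currently set up */
};

/* Pass-through: memory is mapped 1:1, so the bus address is the
 * physical address and there is nothing to set up or tear down. */
static toy_dma_addr_t toy_pt_map(struct toy_device *dev, uint64_t phys, size_t len)
{
	(void)dev;
	(void)len;
	return phys;
}

static void toy_pt_unmap(struct toy_device *dev, toy_dma_addr_t addr, size_t len)
{
	(void)dev;
	(void)addr;
	(void)len;
}

/* Translated: hand out an I/O virtual address, rounded up to an
 * assumed 4KiB page, and count the mapping as live until unmapped. */
static toy_dma_addr_t toy_xlate_map(struct toy_device *dev, uint64_t phys, size_t len)
{
	toy_dma_addr_t iova = dev->next_iova;

	(void)phys;
	dev->next_iova += ((uint64_t)len + 4095) & ~(uint64_t)4095;
	dev->live_mappings++;
	return iova;
}

static void toy_xlate_unmap(struct toy_device *dev, toy_dma_addr_t addr, size_t len)
{
	(void)addr;
	(void)len;
	assert(dev->live_mappings > 0);
	dev->live_mappings--;
}

static const struct toy_dma_ops toy_pt_ops = { toy_pt_map, toy_pt_unmap };
static const struct toy_dma_ops toy_xlate_ops = { toy_xlate_map, toy_xlate_unmap };

/* Drivers call these; the per-device ops decide what really happens. */
static toy_dma_addr_t toy_dma_map(struct toy_device *dev, uint64_t phys, size_t len)
{
	assert(dev && dev->ops);
	return dev->ops->map(dev, phys, len);
}

static void toy_dma_unmap(struct toy_device *dev, toy_dma_addr_t addr, size_t len)
{
	assert(dev && dev->ops);
	dev->ops->unmap(dev, addr, len);
}
```

(A device set up with toy_pt_ops gets back the physical address it
passed in and never touches any mapping state, while one set up with
toy_xlate_ops pays for setup and tear-down on every map/unmap. The
driver code calling toy_dma_map() is the same in both cases.)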

In practice there are multiple possibilities: there are cases where you
*must* use an IOMMU and do full translation, and there is no option of
a bypass. There are cases where there just isn't an IOMMU (and
sometimes that's a per-device fact, like with virtio). And there are
cases where you *can* use the IOMMU, but if you ask nicely you can get
away without it.

My point in linking up these two threads is that we should contemplate
all of those use cases and come up with something that addresses it
all.

> Do you agree that we need this attribute in the generic DMA API?

Yeah, I think it can be useful.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation