* [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
@ 2015-05-01 18:32 ` wdavis-DDmLM1+adcrQT0dZR+AlfA
  0 siblings, 0 replies; 34+ messages in thread
From: wdavis @ 2015-05-01 18:32 UTC (permalink / raw)
  To: joro; +Cc: iommu, linux-pci, tripperda, jhubbard, jglisse, Will Davis

From: Will Davis <wdavis@nvidia.com>

Hi,

This patch series adds DMA APIs to map and unmap a struct resource to and from
a PCI device's IOVA domain, and implements the AMD, Intel, and nommu versions
of these interfaces.

This solves a long-standing problem with the existing DMA-remapping interfaces,
which require that a struct page be given for the region to be mapped into a
device's IOVA domain. This requirement cannot support peer device BAR ranges,
for which no struct pages exist.

The underlying implementations of map_page and map_sg convert the struct page
into its physical address anyway, so we just need a way to route the physical
address of the BAR region to these implementations. The new interfaces do this
by taking the struct resource describing a device's BAR region, from which the
physical address is derived.
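[Editorial note: the address derivation described above can be modeled in a
standalone, user-space sketch. The struct fields, PAGE_SHIFT value, and
addresses below are simplified stand-ins for the kernel types, not the real
definitions.]

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel's resource/page types. */
struct resource { uint64_t start, end; };
struct page     { uint64_t pfn; };      /* page frame number */

#define PAGE_SHIFT 12

/* map_page path: struct page -> physical address, plus offset */
static uint64_t page_to_phys_off(const struct page *p, uint64_t offset)
{
	return (p->pfn << PAGE_SHIFT) + offset;
}

/* proposed map_resource path: BAR resource -> physical address, plus offset */
static uint64_t resource_to_phys_off(const struct resource *r, uint64_t offset)
{
	return r->start + offset;
}
```

Both paths end in the same kind of physical address, which is why the
series can reuse the existing map_page machinery underneath.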

The Intel and nommu versions have been verified on a dual Intel Xeon E5405
workstation. I'm in the process of obtaining hardware to test the AMD version
as well. Please review.

Thanks,
Will

Will Davis (6):
  dma-debug: add checking for map/unmap_resource
  DMA-API: Introduce dma_(un)map_resource
  dma-mapping: pci: add pci_(un)map_resource
  iommu/amd: Implement (un)map_resource
  iommu/vt-d: implement (un)map_resource
  x86: add pci-nommu implementation of map_resource

 arch/x86/kernel/pci-nommu.c              | 17 +++++++
 drivers/iommu/amd_iommu.c                | 76 ++++++++++++++++++++++++++------
 drivers/iommu/intel-iommu.c              | 18 ++++++++
 include/asm-generic/dma-mapping-broken.h |  9 ++++
 include/asm-generic/dma-mapping-common.h | 34 ++++++++++++++
 include/asm-generic/pci-dma-compat.h     | 14 ++++++
 include/linux/dma-debug.h                | 20 +++++++++
 include/linux/dma-mapping.h              |  7 +++
 lib/dma-debug.c                          | 48 ++++++++++++++++++++
 9 files changed, 230 insertions(+), 13 deletions(-)

-- 
2.3.7


^ permalink raw reply	[flat|nested] 34+ messages in thread


* [PATCH 1/6] dma-debug: add checking for map/unmap_resource
  2015-05-01 18:32 ` wdavis-DDmLM1+adcrQT0dZR+AlfA
  (?)
@ 2015-05-01 18:32 ` wdavis
  -1 siblings, 0 replies; 34+ messages in thread
From: wdavis @ 2015-05-01 18:32 UTC (permalink / raw)
  To: joro; +Cc: iommu, linux-pci, tripperda, jhubbard, jglisse, Will Davis

From: Will Davis <wdavis@nvidia.com>

Add debug callbacks for the new dma_map_resource and dma_unmap_resource
functions.
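[Editorial note: the checking added here follows dma-debug's usual pattern:
record an entry at map time, then look it up and validate at unmap time.
A minimal standalone model of that bookkeeping, with a fixed-size table and
simplified fields standing in for the real entry hash table:]

```c
#include <assert.h>
#include <stdint.h>

struct dbg_entry { uint64_t dev_addr; uint64_t size; int in_use; };

static struct dbg_entry table[16];

/* debug_dma_map_resource analogue: remember the mapping */
static int dbg_map(uint64_t dev_addr, uint64_t size)
{
	for (int i = 0; i < 16; i++) {
		if (!table[i].in_use) {
			table[i] = (struct dbg_entry){ dev_addr, size, 1 };
			return 0;
		}
	}
	return -1;	/* table full */
}

/* debug_dma_unmap_resource analogue: the unmap must match a prior map */
static int dbg_unmap(uint64_t dev_addr, uint64_t size)
{
	for (int i = 0; i < 16; i++) {
		if (table[i].in_use && table[i].dev_addr == dev_addr &&
		    table[i].size == size) {
			table[i].in_use = 0;
			return 0;	/* ok */
		}
	}
	return -1;	/* unmap without a matching map: a driver bug */
}
```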

Signed-off-by: Will Davis <wdavis@nvidia.com>
Reviewed-by: Terence Ripperda <tripperda@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
---
 include/linux/dma-debug.h | 20 ++++++++++++++++++++
 lib/dma-debug.c           | 48 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 68 insertions(+)

diff --git a/include/linux/dma-debug.h b/include/linux/dma-debug.h
index fe8cb61..19f328c 100644
--- a/include/linux/dma-debug.h
+++ b/include/linux/dma-debug.h
@@ -44,6 +44,13 @@ extern void debug_dma_mapping_error(struct device *dev, dma_addr_t dma_addr);
 extern void debug_dma_unmap_page(struct device *dev, dma_addr_t addr,
 				 size_t size, int direction, bool map_single);
 
+extern void debug_dma_map_resource(struct device *dev, struct resource *res,
+				   size_t offset, size_t size, int direction,
+				   dma_addr_t dma_addr);
+
+extern void debug_dma_unmap_resource(struct device *dev, dma_addr_t addr,
+				     size_t size, int direction);
+
 extern void debug_dma_map_sg(struct device *dev, struct scatterlist *sg,
 			     int nents, int mapped_ents, int direction);
 
@@ -120,6 +127,19 @@ static inline void debug_dma_unmap_page(struct device *dev, dma_addr_t addr,
 {
 }
 
+static inline void debug_dma_map_resource(struct device *dev,
+					  struct resource *res, size_t offset,
+					  size_t size, int direction,
+					  dma_addr_t dma_addr)
+{
+}
+
+static inline void debug_dma_unmap_resource(struct device *dev,
+					    dma_addr_t addr, size_t size,
+					    int direction)
+{
+}
+
 static inline void debug_dma_map_sg(struct device *dev, struct scatterlist *sg,
 				    int nents, int mapped_ents, int direction)
 {
diff --git a/lib/dma-debug.c b/lib/dma-debug.c
index ae4b65e..0ef092f 100644
--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -43,6 +43,7 @@ enum {
 	dma_debug_page,
 	dma_debug_sg,
 	dma_debug_coherent,
+	dma_debug_resource,
 };
 
 enum map_err_types {
@@ -1348,6 +1349,53 @@ void debug_dma_unmap_page(struct device *dev, dma_addr_t addr,
 }
 EXPORT_SYMBOL(debug_dma_unmap_page);
 
+void debug_dma_map_resource(struct device *dev, struct resource *resource,
+			    size_t offset, size_t size, int direction,
+			    dma_addr_t dma_addr)
+{
+	struct dma_debug_entry *entry;
+
+	if (unlikely(dma_debug_disabled()))
+		return;
+
+	if (dma_mapping_error(dev, dma_addr))
+		return;
+
+	entry = dma_entry_alloc();
+	if (!entry)
+		return;
+
+	entry->dev       = dev;
+	entry->type      = dma_debug_resource;
+	entry->pfn       = resource->start >> PAGE_SHIFT;
+	entry->offset    = offset;
+	entry->dev_addr  = dma_addr;
+	entry->size      = size;
+	entry->direction = direction;
+	entry->map_err_type = MAP_ERR_NOT_CHECKED;
+
+	add_dma_entry(entry);
+}
+EXPORT_SYMBOL(debug_dma_map_resource);
+
+void debug_dma_unmap_resource(struct device *dev, dma_addr_t addr,
+			      size_t size, int direction)
+{
+	struct dma_debug_entry ref = {
+		.type           = dma_debug_resource,
+		.dev            = dev,
+		.dev_addr       = addr,
+		.size           = size,
+		.direction      = direction,
+	};
+
+	if (unlikely(dma_debug_disabled()))
+		return;
+
+	check_unmap(&ref);
+}
+EXPORT_SYMBOL(debug_dma_unmap_resource);
+
 void debug_dma_map_sg(struct device *dev, struct scatterlist *sg,
 		      int nents, int mapped_ents, int direction)
 {
-- 
2.3.7


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 2/6] DMA-API: Introduce dma_(un)map_resource
  2015-05-01 18:32 ` wdavis-DDmLM1+adcrQT0dZR+AlfA
  (?)
  (?)
@ 2015-05-01 18:32 ` wdavis
  2015-05-07 15:09   ` Bjorn Helgaas
  -1 siblings, 1 reply; 34+ messages in thread
From: wdavis @ 2015-05-01 18:32 UTC (permalink / raw)
  To: joro; +Cc: iommu, linux-pci, tripperda, jhubbard, jglisse, Will Davis

From: Will Davis <wdavis@nvidia.com>

Add functions to DMA-map and -unmap a resource for a given device. This
will allow devices to DMA-map a peer device's resource (for example,
another device's BAR region on PCI) to enable peer-to-peer transactions.
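[Editorial note: the dispatch pattern used by the new wrappers — call through
the device's dma_map_ops if the backend provides a map_resource hook,
otherwise return 0 — can be sketched standalone. Types are simplified and the
0 error value is an illustration, not the real API contract.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct resource { uint64_t start, end; };

struct dma_map_ops {
	uint64_t (*map_resource)(struct resource *res, uint64_t offset,
				 size_t size);
};

/* dma_map_resource_attrs analogue: the backend hook is optional */
static uint64_t map_resource(const struct dma_map_ops *ops,
			     struct resource *res, uint64_t offset,
			     size_t size)
{
	uint64_t addr = 0;	/* 0 = no mapping / unsupported backend */

	if (ops->map_resource)
		addr = ops->map_resource(res, offset, size);
	return addr;
}

/* An identity-style backend: bus address == physical address */
static uint64_t identity_map_resource(struct resource *res, uint64_t offset,
				      size_t size)
{
	(void)size;
	return res->start + offset;
}
```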

Signed-off-by: Will Davis <wdavis@nvidia.com>
Reviewed-by: Terence Ripperda <tripperda@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
---
 include/asm-generic/dma-mapping-broken.h |  9 +++++++++
 include/asm-generic/dma-mapping-common.h | 34 ++++++++++++++++++++++++++++++++
 include/linux/dma-mapping.h              |  7 +++++++
 3 files changed, 50 insertions(+)

diff --git a/include/asm-generic/dma-mapping-broken.h b/include/asm-generic/dma-mapping-broken.h
index 6c32af9..d171f01 100644
--- a/include/asm-generic/dma-mapping-broken.h
+++ b/include/asm-generic/dma-mapping-broken.h
@@ -59,6 +59,15 @@ extern void
 dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size,
 	       enum dma_data_direction direction);
 
+extern dma_addr_t
+dma_map_resource(struct device *dev, struct resource *res,
+		 unsigned long offset, size_t size,
+		 enum dma_data_direction direction);
+
+extern void
+dma_unmap_resource(struct device *dev, dma_addr_t dma_address, size_t size,
+		   enum dma_data_direction direction);
+
 extern void
 dma_sync_single_for_cpu(struct device *dev, dma_addr_t dma_handle, size_t size,
 			enum dma_data_direction direction);
diff --git a/include/asm-generic/dma-mapping-common.h b/include/asm-generic/dma-mapping-common.h
index 940d5ec..cd9948d 100644
--- a/include/asm-generic/dma-mapping-common.h
+++ b/include/asm-generic/dma-mapping-common.h
@@ -73,6 +73,36 @@ static inline void dma_unmap_sg_attrs(struct device *dev, struct scatterlist *sg
 		ops->unmap_sg(dev, sg, nents, dir, attrs);
 }
 
+static inline dma_addr_t dma_map_resource_attrs(struct device *dev,
+						struct resource *res,
+						size_t offset, size_t size,
+						enum dma_data_direction dir,
+						struct dma_attrs *attrs)
+{
+	const struct dma_map_ops *ops = get_dma_ops(dev);
+	dma_addr_t addr = 0;
+
+	BUG_ON(!valid_dma_direction(dir));
+	if (ops->map_resource)
+		addr = ops->map_resource(dev, res, offset, size, dir, attrs);
+	debug_dma_map_resource(dev, res, offset, size, dir, addr);
+
+	return addr;
+}
+
+static inline void dma_unmap_resource_attrs(struct device *dev, dma_addr_t addr,
+					    size_t size,
+					    enum dma_data_direction dir,
+					    struct dma_attrs *attrs)
+{
+	const struct dma_map_ops *ops = get_dma_ops(dev);
+
+	BUG_ON(!valid_dma_direction(dir));
+	if (ops->unmap_resource)
+		ops->unmap_resource(dev, addr, size, dir, attrs);
+	debug_dma_unmap_resource(dev, addr, size, dir);
+}
+
 static inline dma_addr_t dma_map_page(struct device *dev, struct page *page,
 				      size_t offset, size_t size,
 				      enum dma_data_direction dir)
@@ -180,6 +210,10 @@ dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
 #define dma_unmap_single(d, a, s, r) dma_unmap_single_attrs(d, a, s, r, NULL)
 #define dma_map_sg(d, s, n, r) dma_map_sg_attrs(d, s, n, r, NULL)
 #define dma_unmap_sg(d, s, n, r) dma_unmap_sg_attrs(d, s, n, r, NULL)
+#define dma_map_resource(d, e, o, s, r) \
+	dma_map_resource_attrs(d, e, o, s, r, NULL)
+#define dma_unmap_resource(d, a, s, r) \
+	dma_unmap_resource_attrs(d, a, s, r, NULL)
 
 extern int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
 			   void *cpu_addr, dma_addr_t dma_addr, size_t size);
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index ac07ff0..05b0b51 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -34,6 +34,13 @@ struct dma_map_ops {
 	void (*unmap_page)(struct device *dev, dma_addr_t dma_handle,
 			   size_t size, enum dma_data_direction dir,
 			   struct dma_attrs *attrs);
+	dma_addr_t (*map_resource)(struct device *dev, struct resource *res,
+				   unsigned long offset, size_t size,
+				   enum dma_data_direction dir,
+				   struct dma_attrs *attrs);
+	void (*unmap_resource)(struct device *dev, dma_addr_t dma_handle,
+			       size_t size, enum dma_data_direction dir,
+			       struct dma_attrs *attrs);
 	/*
 	 * map_sg returns 0 on error and a value > 0 on success.
 	 * It should never return a value < 0.
-- 
2.3.7


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 3/6] dma-mapping: pci: add pci_(un)map_resource
  2015-05-01 18:32 ` wdavis-DDmLM1+adcrQT0dZR+AlfA
                   ` (2 preceding siblings ...)
  (?)
@ 2015-05-01 18:32 ` wdavis
  2015-05-07 15:19     ` Bjorn Helgaas
  -1 siblings, 1 reply; 34+ messages in thread
From: wdavis @ 2015-05-01 18:32 UTC (permalink / raw)
  To: joro; +Cc: iommu, linux-pci, tripperda, jhubbard, jglisse, Will Davis

From: Will Davis <wdavis@nvidia.com>

Simply route these through to the new dma_(un)map_resource APIs.

Signed-off-by: Will Davis <wdavis@nvidia.com>
Reviewed-by: Terence Ripperda <tripperda@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
---
 include/asm-generic/pci-dma-compat.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/include/asm-generic/pci-dma-compat.h b/include/asm-generic/pci-dma-compat.h
index c110843..ac4a4ad 100644
--- a/include/asm-generic/pci-dma-compat.h
+++ b/include/asm-generic/pci-dma-compat.h
@@ -61,6 +61,20 @@ pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address,
 	dma_unmap_page(hwdev == NULL ? NULL : &hwdev->dev, dma_address, size, (enum dma_data_direction)direction);
 }
 
+static inline dma_addr_t
+pci_map_resource(struct pci_dev *hwdev, struct resource *resource,
+		 unsigned long offset, size_t size, int direction)
+{
+	return dma_map_resource(hwdev == NULL ? NULL : &hwdev->dev, resource, offset, size, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_unmap_resource(struct pci_dev *hwdev, dma_addr_t dma_address, size_t size,
+		   int direction)
+{
+	dma_unmap_resource(hwdev == NULL ? NULL : &hwdev->dev, dma_address, size, (enum dma_data_direction)direction);
+}
+
 static inline int
 pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
 	   int nents, int direction)
-- 
2.3.7


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 4/6] iommu/amd: Implement (un)map_resource
  2015-05-01 18:32 ` wdavis-DDmLM1+adcrQT0dZR+AlfA
                   ` (3 preceding siblings ...)
  (?)
@ 2015-05-01 18:32 ` wdavis
  -1 siblings, 0 replies; 34+ messages in thread
From: wdavis @ 2015-05-01 18:32 UTC (permalink / raw)
  To: joro; +Cc: iommu, linux-pci, tripperda, jhubbard, jglisse, Will Davis

From: Will Davis <wdavis@nvidia.com>

Implement 'map_resource' for the AMD IOMMU driver. Generalize the existing
map_page implementation to operate on a physical address, and make both
map_page and map_resource wrappers around that helper (and similarly, for
unmap_page and unmap_resource).

This allows a device to map another's resource, to enable peer-to-peer
transactions.
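[Editorial note: the refactoring pattern — one shared physical-address helper
with thin page/resource wrappers — can be sketched standalone. The names
mirror the patch; the identity "IOVA allocation" below is a stand-in for the
real protection-domain logic.]

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12

struct page     { uint64_t pfn; };
struct resource { uint64_t start, end; };

/* __map_phys analogue: the common path both wrappers share */
static uint64_t map_phys(uint64_t paddr, uint64_t size)
{
	(void)size;
	return paddr;	/* stand-in for the real IOVA allocation */
}

static uint64_t map_page(struct page *p, uint64_t offset, uint64_t size)
{
	return map_phys((p->pfn << PAGE_SHIFT) + offset, size);
}

static uint64_t map_resource(struct resource *r, uint64_t offset,
			     uint64_t size)
{
	return map_phys(r->start + offset, size);
}
```

Only the address computation (and, in the patch, the stats counter) differs
between the two exported entry points.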

Signed-off-by: Will Davis <wdavis@nvidia.com>
Reviewed-by: Terence Ripperda <tripperda@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/iommu/amd_iommu.c | 76 +++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 63 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index e43d489..ca2dac6 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -503,6 +503,8 @@ DECLARE_STATS_COUNTER(cnt_map_single);
 DECLARE_STATS_COUNTER(cnt_unmap_single);
 DECLARE_STATS_COUNTER(cnt_map_sg);
 DECLARE_STATS_COUNTER(cnt_unmap_sg);
+DECLARE_STATS_COUNTER(cnt_map_resource);
+DECLARE_STATS_COUNTER(cnt_unmap_resource);
 DECLARE_STATS_COUNTER(cnt_alloc_coherent);
 DECLARE_STATS_COUNTER(cnt_free_coherent);
 DECLARE_STATS_COUNTER(cross_page);
@@ -541,6 +543,8 @@ static void amd_iommu_stats_init(void)
 	amd_iommu_stats_add(&cnt_unmap_single);
 	amd_iommu_stats_add(&cnt_map_sg);
 	amd_iommu_stats_add(&cnt_unmap_sg);
+	amd_iommu_stats_add(&cnt_map_resource);
+	amd_iommu_stats_add(&cnt_unmap_resource);
 	amd_iommu_stats_add(&cnt_alloc_coherent);
 	amd_iommu_stats_add(&cnt_free_coherent);
 	amd_iommu_stats_add(&cross_page);
@@ -2752,20 +2756,16 @@ static void __unmap_single(struct dma_ops_domain *dma_dom,
 }
 
 /*
- * The exported map_single function for dma_ops.
+ * Wrapper function that contains code common to mapping a physical address
+ * range from a page or a resource.
  */
-static dma_addr_t map_page(struct device *dev, struct page *page,
-			   unsigned long offset, size_t size,
-			   enum dma_data_direction dir,
-			   struct dma_attrs *attrs)
+static dma_addr_t __map_phys(struct device *dev, phys_addr_t paddr,
+			     size_t size, enum dma_data_direction dir)
 {
 	unsigned long flags;
 	struct protection_domain *domain;
 	dma_addr_t addr;
 	u64 dma_mask;
-	phys_addr_t paddr = page_to_phys(page) + offset;
-
-	INC_STATS_COUNTER(cnt_map_single);
 
 	domain = get_domain(dev);
 	if (PTR_ERR(domain) == -EINVAL)
@@ -2791,16 +2791,15 @@ out:
 }
 
 /*
- * The exported unmap_single function for dma_ops.
+ * Wrapper function that contains code common to unmapping a physical address
+ * range from a page or a resource.
  */
-static void unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size,
-		       enum dma_data_direction dir, struct dma_attrs *attrs)
+static void __unmap_phys(struct device *dev, dma_addr_t dma_addr, size_t size,
+			 enum dma_data_direction dir)
 {
 	unsigned long flags;
 	struct protection_domain *domain;
 
-	INC_STATS_COUNTER(cnt_unmap_single);
-
 	domain = get_domain(dev);
 	if (IS_ERR(domain))
 		return;
@@ -2815,6 +2814,55 @@ static void unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size,
 }
 
 /*
+ * The exported map_single function for dma_ops.
+ */
+static dma_addr_t map_page(struct device *dev, struct page *page,
+			   unsigned long offset, size_t size,
+			   enum dma_data_direction dir,
+			   struct dma_attrs *attrs)
+{
+	INC_STATS_COUNTER(cnt_map_single);
+
+	return __map_phys(dev, page_to_phys(page) + offset, size, dir);
+}
+
+/*
+ * The exported unmap_single function for dma_ops.
+ */
+static void unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size,
+		       enum dma_data_direction dir, struct dma_attrs *attrs)
+{
+	INC_STATS_COUNTER(cnt_unmap_single);
+
+	__unmap_phys(dev, dma_addr, size, dir);
+}
+
+/*
+ * The exported map_resource function for dma_ops.
+ */
+static dma_addr_t map_resource(struct device *dev, struct resource *res,
+			       unsigned long offset, size_t size,
+			       enum dma_data_direction dir,
+			       struct dma_attrs *attrs)
+{
+	INC_STATS_COUNTER(cnt_map_resource);
+
+	return __map_phys(dev, res->start + offset, size, dir);
+}
+
+/*
+ * The exported unmap_resource function for dma_ops.
+ */
+static void unmap_resource(struct device *dev, dma_addr_t dma_addr,
+			   size_t size, enum dma_data_direction dir,
+			   struct dma_attrs *attrs)
+{
+	INC_STATS_COUNTER(cnt_unmap_resource);
+
+	__unmap_phys(dev, dma_addr, size, dir);
+}
+
+/*
  * The exported map_sg function for dma_ops (handles scatter-gather
  * lists).
  */
@@ -3066,6 +3114,8 @@ static struct dma_map_ops amd_iommu_dma_ops = {
 	.unmap_page = unmap_page,
 	.map_sg = map_sg,
 	.unmap_sg = unmap_sg,
+	.map_resource = map_resource,
+	.unmap_resource = unmap_resource,
 	.dma_supported = amd_iommu_dma_supported,
 };
 
-- 
2.3.7


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 5/6] iommu/vt-d: implement (un)map_resource
  2015-05-01 18:32 ` wdavis-DDmLM1+adcrQT0dZR+AlfA
                   ` (4 preceding siblings ...)
  (?)
@ 2015-05-01 18:32 ` wdavis
  -1 siblings, 0 replies; 34+ messages in thread
From: wdavis @ 2015-05-01 18:32 UTC (permalink / raw)
  To: joro; +Cc: iommu, linux-pci, tripperda, jhubbard, jglisse, Will Davis

From: Will Davis <wdavis@nvidia.com>

Implement 'map_resource' for the Intel IOMMU driver. Simply translate the
resource to a physical address and route it to the same handlers used by
the 'map_page' API.

This allows a device to map another's resource, to enable peer-to-peer
transactions.

Signed-off-by: Will Davis <wdavis@nvidia.com>
Reviewed-by: Terence Ripperda <tripperda@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/iommu/intel-iommu.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 68d43be..0f49eff 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3095,6 +3095,15 @@ static dma_addr_t intel_map_page(struct device *dev, struct page *page,
 				  dir, *dev->dma_mask);
 }
 
+static dma_addr_t intel_map_resource(struct device *dev, struct resource *res,
+				     unsigned long offset, size_t size,
+				     enum dma_data_direction dir,
+				     struct dma_attrs *attrs)
+{
+	return __intel_map_single(dev, res->start + offset, size,
+				  dir, *dev->dma_mask);
+}
+
 static void flush_unmaps(void)
 {
 	int i, j;
@@ -3226,6 +3235,13 @@ static void intel_unmap_page(struct device *dev, dma_addr_t dev_addr,
 	intel_unmap(dev, dev_addr);
 }
 
+static void intel_unmap_resource(struct device *dev, dma_addr_t dev_addr,
+				 size_t size, enum dma_data_direction dir,
+				 struct dma_attrs *attrs)
+{
+	intel_unmap(dev, dev_addr);
+}
+
 static void *intel_alloc_coherent(struct device *dev, size_t size,
 				  dma_addr_t *dma_handle, gfp_t flags,
 				  struct dma_attrs *attrs)
@@ -3382,6 +3398,8 @@ struct dma_map_ops intel_dma_ops = {
 	.unmap_sg = intel_unmap_sg,
 	.map_page = intel_map_page,
 	.unmap_page = intel_unmap_page,
+	.map_resource = intel_map_resource,
+	.unmap_resource = intel_unmap_resource,
 	.mapping_error = intel_mapping_error,
 };
 
-- 
2.3.7


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 6/6] x86: add pci-nommu implementation of map_resource
  2015-05-01 18:32 ` wdavis-DDmLM1+adcrQT0dZR+AlfA
                   ` (5 preceding siblings ...)
  (?)
@ 2015-05-01 18:32 ` wdavis
  2015-05-07 15:08   ` Bjorn Helgaas
  -1 siblings, 1 reply; 34+ messages in thread
From: wdavis @ 2015-05-01 18:32 UTC (permalink / raw)
  To: joro; +Cc: iommu, linux-pci, tripperda, jhubbard, jglisse, Will Davis

From: Will Davis <wdavis@nvidia.com>

Simply pass through the physical address as the DMA address.
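[Editorial note: in the nommu case the bus address is the physical address
itself; the only real work is the check_addr()-style validation against the
device's DMA mask. A standalone model, with DMA_ERROR_CODE assumed to be 0 as
on x86 at the time:]

```c
#include <assert.h>
#include <stdint.h>

#define DMA_ERROR_CODE 0

struct resource { uint64_t start, end; };

/* check_addr analogue: the whole range must fit the device's DMA mask */
static int addr_ok(uint64_t mask, uint64_t bus, uint64_t size)
{
	return (bus + size - 1) <= mask;
}

/* nommu_map_resource analogue: identity mapping, validated */
static uint64_t nommu_map_res(uint64_t dma_mask, struct resource *res,
			      uint64_t offset, uint64_t size)
{
	uint64_t bus = res->start + offset;

	if (!addr_ok(dma_mask, bus, size))
		return DMA_ERROR_CODE;
	return bus;
}
```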

Signed-off-by: Will Davis <wdavis@nvidia.com>
Reviewed-by: Terence Ripperda <tripperda@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
---
 arch/x86/kernel/pci-nommu.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/arch/x86/kernel/pci-nommu.c b/arch/x86/kernel/pci-nommu.c
index da15918..6e9e66d 100644
--- a/arch/x86/kernel/pci-nommu.c
+++ b/arch/x86/kernel/pci-nommu.c
@@ -38,6 +38,22 @@ static dma_addr_t nommu_map_page(struct device *dev, struct page *page,
 	return bus;
 }
 
+static dma_addr_t nommu_map_resource(struct device *dev, struct resource *res,
+				     unsigned long offset, size_t size,
+				     enum dma_data_direction dir,
+				     struct dma_attrs *attrs)
+{
+	dma_addr_t bus = res->start + offset;
+
+	WARN_ON(size == 0);
+
+	if (!check_addr("map_resource", dev, bus, size))
+		return DMA_ERROR_CODE;
+	flush_write_buffers();
+	return bus;
+}
+
+
 /* Map a set of buffers described by scatterlist in streaming
  * mode for DMA.  This is the scatter-gather version of the
  * above pci_map_single interface.  Here the scatter gather list
@@ -93,6 +109,7 @@ struct dma_map_ops nommu_dma_ops = {
 	.free			= dma_generic_free_coherent,
 	.map_sg			= nommu_map_sg,
 	.map_page		= nommu_map_page,
+	.map_resource		= nommu_map_resource,
 	.sync_single_for_device = nommu_sync_single_for_device,
 	.sync_sg_for_device	= nommu_sync_sg_for_device,
 	.is_phys		= 1,
-- 
2.3.7


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
  2015-05-01 18:32 ` wdavis-DDmLM1+adcrQT0dZR+AlfA
                   ` (6 preceding siblings ...)
  (?)
@ 2015-05-06 22:18 ` Bjorn Helgaas
  2015-05-06 22:30   ` Alex Williamson
  2015-05-07  1:48     ` Yijing Wang
  -1 siblings, 2 replies; 34+ messages in thread
From: Bjorn Helgaas @ 2015-05-06 22:18 UTC (permalink / raw)
  To: wdavis
  Cc: joro, iommu, linux-pci, tripperda, jhubbard, jglisse,
	Yijing Wang, Dave Jiang, David S. Miller, Alex Williamson

[+cc Yijing, Dave J, Dave M, Alex]

On Fri, May 01, 2015 at 01:32:12PM -0500, wdavis@nvidia.com wrote:
> From: Will Davis <wdavis@nvidia.com>
> 
> Hi,
> 
> This patch series adds DMA APIs to map and unmap a struct resource to and from
> a PCI device's IOVA domain, and implements the AMD, Intel, and nommu versions
> of these interfaces.
> 
> This solves a long-standing problem with the existing DMA-remapping interfaces,
> which require that a struct page be given for the region to be mapped into a
> device's IOVA domain. This requirement cannot support peer device BAR ranges,
> for which no struct pages exist.
> 
> The underlying implementations of map_page and map_sg convert the struct page
> into its physical address anyway, so we just need a way to route the physical
> address of the BAR region to these implementations. The new interfaces do this
> by taking the struct resource describing a device's BAR region, from which the
> physical address is derived.
> 
> The Intel and nommu versions have been verified on a dual Intel Xeon E5405
> workstation. I'm in the process of obtaining hardware to test the AMD version
> as well. Please review.

I think we currently assume there's no peer-to-peer traffic.

I don't know whether changing that will break anything, but I'm concerned
about these:

  - PCIe MPS configuration (see pcie_bus_configure_settings()).

  - PCIe ACS, e.g., pci_acs_enabled().  My guess is that this one is OK,
    but Alex would know better.

  - dma_addr_t.  Currently dma_addr_t is big enough to hold any address
    returned from the DMA API.  That's not necessarily big enough to hold a
    PCI bus address, e.g., a raw BAR value.
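[Editorial note: this dma_addr_t concern is concrete. When the architecture
does not select a 64-bit dma_addr_t (e.g. CONFIG_ARCH_DMA_ADDR_T_64BIT
unset), dma_addr_t is 32 bits wide, and returning a raw BAR value above 4 GB
through it silently truncates. A standalone illustration, with a 32-bit
stand-in type assumed:]

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t dma_addr_t_32;	/* dma_addr_t on a 32-bit config */

/* Returning a raw 64-bit bus address through a 32-bit dma_addr_t
 * silently drops the high bits. */
static dma_addr_t_32 return_bus_addr(uint64_t bar_phys)
{
	return (dma_addr_t_32)bar_phys;
}
```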

> Will Davis (6):
>   dma-debug: add checking for map/unmap_resource
>   DMA-API: Introduce dma_(un)map_resource
>   dma-mapping: pci: add pci_(un)map_resource
>   iommu/amd: Implement (un)map_resource
>   iommu/vt-d: implement (un)map_resource
>   x86: add pci-nommu implementation of map_resource
> 
>  arch/x86/kernel/pci-nommu.c              | 17 +++++++
>  drivers/iommu/amd_iommu.c                | 76 ++++++++++++++++++++++++++------
>  drivers/iommu/intel-iommu.c              | 18 ++++++++
>  include/asm-generic/dma-mapping-broken.h |  9 ++++
>  include/asm-generic/dma-mapping-common.h | 34 ++++++++++++++
>  include/asm-generic/pci-dma-compat.h     | 14 ++++++
>  include/linux/dma-debug.h                | 20 +++++++++
>  include/linux/dma-mapping.h              |  7 +++
>  lib/dma-debug.c                          | 48 ++++++++++++++++++++
>  9 files changed, 230 insertions(+), 13 deletions(-)
> 
> -- 
> 2.3.7
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
  2015-05-06 22:18 ` [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer Bjorn Helgaas
@ 2015-05-06 22:30   ` Alex Williamson
  2015-05-07  1:48     ` Yijing Wang
  1 sibling, 0 replies; 34+ messages in thread
From: Alex Williamson @ 2015-05-06 22:30 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: wdavis, joro, iommu, linux-pci, tripperda, jhubbard, jglisse,
	Yijing Wang, Dave Jiang, David S. Miller

On Wed, 2015-05-06 at 17:18 -0500, Bjorn Helgaas wrote:
> [+cc Yijing, Dave J, Dave M, Alex]
> 
> On Fri, May 01, 2015 at 01:32:12PM -0500, wdavis@nvidia.com wrote:
> > From: Will Davis <wdavis@nvidia.com>
> > 
> > Hi,
> > 
> > This patch series adds DMA APIs to map and unmap a struct resource to and from
> > a PCI device's IOVA domain, and implements the AMD, Intel, and nommu versions
> > of these interfaces.
> > 
> > This solves a long-standing problem with the existing DMA-remapping interfaces,
> > which require that a struct page be given for the region to be mapped into a
> > device's IOVA domain. This requirement cannot support peer device BAR ranges,
> > for which no struct pages exist.
> > 
> > The underlying implementations of map_page and map_sg convert the struct page
> > into its physical address anyway, so we just need a way to route the physical
> > address of the BAR region to these implementations. The new interfaces do this
> > by taking the struct resource describing a device's BAR region, from which the
> > physical address is derived.
> > 
> > The Intel and nommu versions have been verified on a dual Intel Xeon E5405
> > workstation. I'm in the process of obtaining hardware to test the AMD version
> > as well. Please review.
> 
> I think we currently assume there's no peer-to-peer traffic.
> 
> I don't know whether changing that will break anything, but I'm concerned
> about these:
> 
>   - PCIe MPS configuration (see pcie_bus_configure_settings()).
> 
>   - PCIe ACS, e.g., pci_acs_enabled().  My guess is that this one is OK,
>     but Alex would know better.

I think it should be OK too.  ACS will force the transaction upstream
for IOMMU translation rather than possibly allowing redirection lower in
the topology, but that's sort of the price we pay for isolation.  The
p2p context entries need to be present in the IOMMU, so without actually
reading the patches, this does seem like something a driver might want
to do via the DMA API.  The IOMMU API already allows us to avoid the
struct page issue and create mappings for p2p in the IOMMU.

>   - dma_addr_t.  Currently dma_addr_t is big enough to hold any address
>     returned from the DMA API.  That's not necessarily big enough to hold a
>     PCI bus address, e.g., a raw BAR value.
> 
> > Will Davis (6):
> >   dma-debug: add checking for map/unmap_resource
> >   DMA-API: Introduce dma_(un)map_resource
> >   dma-mapping: pci: add pci_(un)map_resource
> >   iommu/amd: Implement (un)map_resource
> >   iommu/vt-d: implement (un)map_resource
> >   x86: add pci-nommu implementation of map_resource
> > 
> >  arch/x86/kernel/pci-nommu.c              | 17 +++++++
> >  drivers/iommu/amd_iommu.c                | 76 ++++++++++++++++++++++++++------
> >  drivers/iommu/intel-iommu.c              | 18 ++++++++
> >  include/asm-generic/dma-mapping-broken.h |  9 ++++
> >  include/asm-generic/dma-mapping-common.h | 34 ++++++++++++++
> >  include/asm-generic/pci-dma-compat.h     | 14 ++++++
> >  include/linux/dma-debug.h                | 20 +++++++++
> >  include/linux/dma-mapping.h              |  7 +++
> >  lib/dma-debug.c                          | 48 ++++++++++++++++++++
> >  9 files changed, 230 insertions(+), 13 deletions(-)
> > 
> > -- 
> > 2.3.7
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
  2015-05-06 22:18 ` [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer Bjorn Helgaas
@ 2015-05-07  1:48     ` Yijing Wang
  2015-05-07  1:48     ` Yijing Wang
  1 sibling, 0 replies; 34+ messages in thread
From: Yijing Wang @ 2015-05-07  1:48 UTC (permalink / raw)
  To: Bjorn Helgaas, wdavis
  Cc: joro, iommu, linux-pci, tripperda, jhubbard, jglisse, Dave Jiang,
	David S. Miller, Alex Williamson

On 2015/5/7 6:18, Bjorn Helgaas wrote:
> [+cc Yijing, Dave J, Dave M, Alex]
> 
> On Fri, May 01, 2015 at 01:32:12PM -0500, wdavis@nvidia.com wrote:
>> From: Will Davis <wdavis@nvidia.com>
>>
>> Hi,
>>
>> This patch series adds DMA APIs to map and unmap a struct resource to and from
>> a PCI device's IOVA domain, and implements the AMD, Intel, and nommu versions
>> of these interfaces.
>>
>> This solves a long-standing problem with the existing DMA-remapping interfaces,
>> which require that a struct page be given for the region to be mapped into a
>> device's IOVA domain. This requirement cannot support peer device BAR ranges,
>> for which no struct pages exist.
>>
>> The underlying implementations of map_page and map_sg convert the struct page
>> into its physical address anyway, so we just need a way to route the physical
>> address of the BAR region to these implementations. The new interfaces do this
>> by taking the struct resource describing a device's BAR region, from which the
>> physical address is derived.
>>
>> The Intel and nommu versions have been verified on a dual Intel Xeon E5405
>> workstation. I'm in the process of obtaining hardware to test the AMD version
>> as well. Please review.
> 
> I think we currently assume there's no peer-to-peer traffic.
> 
> I don't know whether changing that will break anything, but I'm concerned
> about these:
> 
>   - PCIe MPS configuration (see pcie_bus_configure_settings()).

I think it should be OK for the PCIe MPS configuration: PCIE_BUS_PEER2PEER forces
every device's MPS to 128B, and its concern is the TLP payload size. This series
seems to only map an IOVA for the device BAR region.


> 
>   - PCIe ACS, e.g., pci_acs_enabled().  My guess is that this one is OK,
>     but Alex would know better.
> 
>   - dma_addr_t.  Currently dma_addr_t is big enough to hold any address
>     returned from the DMA API.  That's not necessarily big enough to hold a
>     PCI bus address, e.g., a raw BAR value.
> 
>> Will Davis (6):
>>   dma-debug: add checking for map/unmap_resource
>>   DMA-API: Introduce dma_(un)map_resource
>>   dma-mapping: pci: add pci_(un)map_resource
>>   iommu/amd: Implement (un)map_resource
>>   iommu/vt-d: implement (un)map_resource
>>   x86: add pci-nommu implementation of map_resource
>>
>>  arch/x86/kernel/pci-nommu.c              | 17 +++++++
>>  drivers/iommu/amd_iommu.c                | 76 ++++++++++++++++++++++++++------
>>  drivers/iommu/intel-iommu.c              | 18 ++++++++
>>  include/asm-generic/dma-mapping-broken.h |  9 ++++
>>  include/asm-generic/dma-mapping-common.h | 34 ++++++++++++++
>>  include/asm-generic/pci-dma-compat.h     | 14 ++++++
>>  include/linux/dma-debug.h                | 20 +++++++++
>>  include/linux/dma-mapping.h              |  7 +++
>>  lib/dma-debug.c                          | 48 ++++++++++++++++++++
>>  9 files changed, 230 insertions(+), 13 deletions(-)
>>
>> -- 
>> 2.3.7
>>
> 
> .
> 


-- 
Thanks!
Yijing


^ permalink raw reply	[flat|nested] 34+ messages in thread


* Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
  2015-05-07  1:48     ` Yijing Wang
  (?)
@ 2015-05-07 13:13     ` Bjorn Helgaas
  2015-05-07 16:23         ` William Davis
  -1 siblings, 1 reply; 34+ messages in thread
From: Bjorn Helgaas @ 2015-05-07 13:13 UTC (permalink / raw)
  To: Yijing Wang
  Cc: wdavis, Joerg Roedel, open list:INTEL IOMMU (VT-d),
	linux-pci, tripperda, jhubbard, Jerome Glisse, Dave Jiang,
	David S. Miller, Alex Williamson

On Wed, May 6, 2015 at 8:48 PM, Yijing Wang <wangyijing@huawei.com> wrote:
> On 2015/5/7 6:18, Bjorn Helgaas wrote:
>> [+cc Yijing, Dave J, Dave M, Alex]
>>
>> On Fri, May 01, 2015 at 01:32:12PM -0500, wdavis@nvidia.com wrote:
>>> From: Will Davis <wdavis@nvidia.com>
>>>
>>> Hi,
>>>
>>> This patch series adds DMA APIs to map and unmap a struct resource to and from
>>> a PCI device's IOVA domain, and implements the AMD, Intel, and nommu versions
>>> of these interfaces.
>>>
>>> This solves a long-standing problem with the existing DMA-remapping interfaces,
>>> which require that a struct page be given for the region to be mapped into a
>>> device's IOVA domain. This requirement cannot support peer device BAR ranges,
>>> for which no struct pages exist.
>>> ...

>> I think we currently assume there's no peer-to-peer traffic.
>>
>> I don't know whether changing that will break anything, but I'm concerned
>> about these:
>>
>>   - PCIe MPS configuration (see pcie_bus_configure_settings()).
>
> I think it should be ok for PCIe MPS configuration, PCIE_BUS_PEER2PEER force every
> device's MPS to 128B, what its concern is the TLP payload size. In this series, it
> seems to only map a iova for device bar region.

MPS configuration makes assumptions about whether there will be any
peer-to-peer traffic.  If there will be none, MPS can be configured
more aggressively.

I don't think Linux has any way to detect whether a driver is doing
peer-to-peer, and there's no way to prevent a driver from doing it.
We're stuck with requiring the user to specify boot options
("pci=pcie_bus_safe", "pci=pcie_bus_perf", "pci=pcie_bus_peer2peer",
etc.) that tell the PCI core what the user expects to happen.

This is a terrible user experience.  The user has no way to tell what
drivers are going to do.  If he specifies the wrong thing, e.g.,
"assume no peer-to-peer traffic," and then loads a driver that does
peer-to-peer, the kernel will configure MPS aggressively and when the
device does a peer-to-peer transfer, it may cause a Malformed TLP
error.

Bjorn
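For reference, the options Bjorn lists are kernel command-line parameters, so today the burden falls on the user at boot time. On a GRUB-based system (file path assumed; other bootloaders differ) the peer-to-peer-safe setting would be applied like:

```shell
# /etc/default/grub -- force MPS to 128B on every device, which all devices
# are guaranteed to support, so peer-to-peer TLPs can never exceed any
# receiver's MPS (trades DMA throughput for safety):
GRUB_CMDLINE_LINUX="pci=pcie_bus_peer2peer"
# then regenerate the config, e.g.: grub-mkconfig -o /boot/grub/grub.cfg
```

This is exactly the "user must guess what drivers will do" problem described above: nothing validates the choice against the drivers actually loaded.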

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 6/6] x86: add pci-nommu implementation of map_resource
  2015-05-01 18:32 ` [PATCH 6/6] x86: add pci-nommu implementation of map_resource wdavis
@ 2015-05-07 15:08   ` Bjorn Helgaas
  2015-05-07 16:07     ` William Davis
  0 siblings, 1 reply; 34+ messages in thread
From: Bjorn Helgaas @ 2015-05-07 15:08 UTC (permalink / raw)
  To: wdavis; +Cc: joro, iommu, linux-pci, tripperda, jhubbard, jglisse

On Fri, May 01, 2015 at 01:32:18PM -0500, wdavis@nvidia.com wrote:
> From: Will Davis <wdavis@nvidia.com>
> 
> Simply pass through the physical address as the DMA address.
> 
> Signed-off-by: Will Davis <wdavis@nvidia.com>
> Reviewed-by: Terence Ripperda <tripperda@nvidia.com>
> Reviewed-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  arch/x86/kernel/pci-nommu.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/arch/x86/kernel/pci-nommu.c b/arch/x86/kernel/pci-nommu.c
> index da15918..6e9e66d 100644
> --- a/arch/x86/kernel/pci-nommu.c
> +++ b/arch/x86/kernel/pci-nommu.c
> @@ -38,6 +38,22 @@ static dma_addr_t nommu_map_page(struct device *dev, struct page *page,
>  	return bus;
>  }
>  
> +static dma_addr_t nommu_map_resource(struct device *dev, struct resource *res,
> +				     unsigned long offset, size_t size,
> +				     enum dma_data_direction dir,
> +				     struct dma_attrs *attrs)
> +{
> +	dma_addr_t bus = res->start + offset;

"res->start" is the CPU physical address, not the bus address.  There is a
pci_bus_address() interface to get the bus address.

On many, but not all, x86 platforms the CPU physical address is identical
to the PCI bus address.

> +
> +	WARN_ON(size == 0);
> +
> +	if (!check_addr("map_resource", dev, bus, size))
> +		return DMA_ERROR_CODE;
> +	flush_write_buffers();
> +	return bus;
> +}
> +
> +
>  /* Map a set of buffers described by scatterlist in streaming
>   * mode for DMA.  This is the scatter-gather version of the
>   * above pci_map_single interface.  Here the scatter gather list
> @@ -93,6 +109,7 @@ struct dma_map_ops nommu_dma_ops = {
>  	.free			= dma_generic_free_coherent,
>  	.map_sg			= nommu_map_sg,
>  	.map_page		= nommu_map_page,
> +	.map_resource		= nommu_map_resource,
>  	.sync_single_for_device = nommu_sync_single_for_device,
>  	.sync_sg_for_device	= nommu_sync_sg_for_device,
>  	.is_phys		= 1,
> -- 
> 2.3.7
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/6] DMA-API: Introduce dma_(un)map_resource
  2015-05-01 18:32 ` [PATCH 2/6] DMA-API: Introduce dma_(un)map_resource wdavis
@ 2015-05-07 15:09   ` Bjorn Helgaas
  2015-05-07 16:10     ` William Davis
  0 siblings, 1 reply; 34+ messages in thread
From: Bjorn Helgaas @ 2015-05-07 15:09 UTC (permalink / raw)
  To: wdavis; +Cc: joro, iommu, linux-pci, tripperda, jhubbard, jglisse

On Fri, May 01, 2015 at 01:32:14PM -0500, wdavis@nvidia.com wrote:
> From: Will Davis <wdavis@nvidia.com>
> 
> Add functions to DMA-map and -unmap a resource for a given device. This
> will allow devices to DMA-map a peer device's resource (for example,
> another device's BAR region on PCI) to enable peer-to-peer transactions.
> 
> Signed-off-by: Will Davis <wdavis@nvidia.com>
> Reviewed-by: Terence Ripperda <tripperda@nvidia.com>
> Reviewed-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  include/asm-generic/dma-mapping-broken.h |  9 +++++++++
>  include/asm-generic/dma-mapping-common.h | 34 ++++++++++++++++++++++++++++++++
>  include/linux/dma-mapping.h              |  7 +++++++

You should document these new interfaces in Documentation/DMA-API-*

Bjorn

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/6] dma-mapping: pci: add pci_(un)map_resource
@ 2015-05-07 15:19     ` Bjorn Helgaas
  0 siblings, 0 replies; 34+ messages in thread
From: Bjorn Helgaas @ 2015-05-07 15:19 UTC (permalink / raw)
  To: wdavis
  Cc: joro, iommu, linux-pci, tripperda, jhubbard, jglisse,
	David S. Miller, Yinghai Lu

[+cc Dave for sparc64, Yinghai]

On Fri, May 01, 2015 at 01:32:15PM -0500, wdavis@nvidia.com wrote:
> From: Will Davis <wdavis@nvidia.com>
> 
> Simply route these through to the new dma_(un)map_resource APIs.
> 
> Signed-off-by: Will Davis <wdavis@nvidia.com>
> Reviewed-by: Terence Ripperda <tripperda@nvidia.com>
> Reviewed-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  include/asm-generic/pci-dma-compat.h | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/include/asm-generic/pci-dma-compat.h b/include/asm-generic/pci-dma-compat.h
> index c110843..ac4a4ad 100644
> --- a/include/asm-generic/pci-dma-compat.h
> +++ b/include/asm-generic/pci-dma-compat.h
> @@ -61,6 +61,20 @@ pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address,
>  	dma_unmap_page(hwdev == NULL ? NULL : &hwdev->dev, dma_address, size, (enum dma_data_direction)direction);
>  }
>  
> +static inline dma_addr_t
> +pci_map_resource(struct pci_dev *hwdev, struct resource *resource,
> +		 unsigned long offset, size_t size, int direction)
> +{
> +	return dma_map_resource(hwdev == NULL ? NULL : &hwdev->dev, resource, offset, size, (enum dma_data_direction)direction);
> +}

On sparc64, PCI bus addresses, e.g., raw BAR values, can be 64 bits wide,
but dma_addr_t is only 32 bits [1].  So dma_addr_t is a bit of a problem
here.  It's likely that we will add a pci_bus_addr_t, but that hasn't
happened yet [2].

We do have existing problems already, e.g., pci_bus_address() returns a
dma_addr_t, so it has the same problem.  So I guess this is just a heads-up
that this needs to be fixed eventually.

Bjorn

[1] http://lkml.kernel.org/r/20150327.145016.86183910134380870.davem@davemloft.net
[2] http://lkml.kernel.org/r/1427857069-6789-2-git-send-email-yinghai@kernel.org

^ permalink raw reply	[flat|nested] 34+ messages in thread


* RE: [PATCH 6/6] x86: add pci-nommu implementation of map_resource
  2015-05-07 15:08   ` Bjorn Helgaas
@ 2015-05-07 16:07     ` William Davis
  0 siblings, 0 replies; 34+ messages in thread
From: William Davis @ 2015-05-07 16:07 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: joro, iommu, linux-pci, Terence Ripperda, John Hubbard, jglisse



> -----Original Message-----
> From: Bjorn Helgaas [mailto:bhelgaas@google.com]
> Sent: Thursday, May 7, 2015 10:08 AM
> To: William Davis
> Cc: joro@8bytes.org; iommu@lists.linux-foundation.org; linux-
> pci@vger.kernel.org; Terence Ripperda; John Hubbard; jglisse@redhat.com
> Subject: Re: [PATCH 6/6] x86: add pci-nommu implementation of map_resource
> 
> On Fri, May 01, 2015 at 01:32:18PM -0500, wdavis@nvidia.com wrote:
> > From: Will Davis <wdavis@nvidia.com>
> >
> > diff --git a/arch/x86/kernel/pci-nommu.c b/arch/x86/kernel/pci-nommu.c
> > index da15918..6e9e66d 100644
> > --- a/arch/x86/kernel/pci-nommu.c
> > +++ b/arch/x86/kernel/pci-nommu.c
> > @@ -38,6 +38,22 @@ static dma_addr_t nommu_map_page(struct device *dev,
> struct page *page,
> >  	return bus;
> >  }
> >
> > +static dma_addr_t nommu_map_resource(struct device *dev, struct resource
> *res,
> > +				     unsigned long offset, size_t size,
> > +				     enum dma_data_direction dir,
> > +				     struct dma_attrs *attrs)
> > +{
> > +	dma_addr_t bus = res->start + offset;
> 
> "res->start" is the CPU physical address, not the bus address.  There is a
> pci_bus_address() interface to get the bus address.
> 
> On many, but not all, x86 platforms the CPU physical address is identical
> to the PCI bus address.
> 

Thanks for pointing that out. Since we already have the resource here (and not the BAR index), I'll use pcibios_resource_to_bus(), as pci_bus_address() does.

Thanks,
Will

--
nvpublic

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH 2/6] DMA-API: Introduce dma_(un)map_resource
  2015-05-07 15:09   ` Bjorn Helgaas
@ 2015-05-07 16:10     ` William Davis
  0 siblings, 0 replies; 34+ messages in thread
From: William Davis @ 2015-05-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: joro, iommu, linux-pci, Terence Ripperda, John Hubbard, jglisse



> -----Original Message-----
> From: Bjorn Helgaas [mailto:bhelgaas@google.com]
> Sent: Thursday, May 7, 2015 10:10 AM
> To: William Davis
> Cc: joro@8bytes.org; iommu@lists.linux-foundation.org; linux-
> pci@vger.kernel.org; Terence Ripperda; John Hubbard; jglisse@redhat.com
> Subject: Re: [PATCH 2/6] DMA-API: Introduce dma_(un)map_resource
> 
> On Fri, May 01, 2015 at 01:32:14PM -0500, wdavis@nvidia.com wrote:
> > From: Will Davis <wdavis@nvidia.com>
> >
> > Add functions to DMA-map and -unmap a resource for a given device.
> > This will allow devices to DMA-map a peer device's resource (for
> > example, another device's BAR region on PCI) to enable peer-to-peer
> transactions.
> >
> > Signed-off-by: Will Davis <wdavis@nvidia.com>
> > Reviewed-by: Terence Ripperda <tripperda@nvidia.com>
> > Reviewed-by: John Hubbard <jhubbard@nvidia.com>
> > ---
> >  include/asm-generic/dma-mapping-broken.h |  9 +++++++++
> > include/asm-generic/dma-mapping-common.h | 34
> ++++++++++++++++++++++++++++++++
> >  include/linux/dma-mapping.h              |  7 +++++++
> 
> You should document these new interfaces in Documentation/DMA-API-*
> 

Will do.

Thanks,
Will

--
nvpublic

^ permalink raw reply	[flat|nested] 34+ messages in thread


* RE: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
@ 2015-05-07 16:23         ` William Davis
  0 siblings, 0 replies; 34+ messages in thread
From: William Davis @ 2015-05-07 16:23 UTC (permalink / raw)
  To: Bjorn Helgaas, Yijing Wang
  Cc: Joerg Roedel, open list:INTEL IOMMU (VT-d),
	linux-pci, Terence Ripperda, John Hubbard, Jerome Glisse,
	Dave Jiang, David S. Miller, Alex Williamson



> -----Original Message-----
> From: Bjorn Helgaas [mailto:bhelgaas@google.com]
> Sent: Thursday, May 7, 2015 8:13 AM
> To: Yijing Wang
> Cc: William Davis; Joerg Roedel; open list:INTEL IOMMU (VT-d); linux-
> pci@vger.kernel.org; Terence Ripperda; John Hubbard; Jerome Glisse; Dave
> Jiang; David S. Miller; Alex Williamson
> Subject: Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
> 
> On Wed, May 6, 2015 at 8:48 PM, Yijing Wang <wangyijing@huawei.com> wrote:
> > On 2015/5/7 6:18, Bjorn Helgaas wrote:
> >> [+cc Yijing, Dave J, Dave M, Alex]
> >>
> >> On Fri, May 01, 2015 at 01:32:12PM -0500, wdavis@nvidia.com wrote:
> >>> From: Will Davis <wdavis@nvidia.com>
> >>>
> >>> Hi,
> >>>
> >>> This patch series adds DMA APIs to map and unmap a struct resource
> >>> to and from a PCI device's IOVA domain, and implements the AMD,
> >>> Intel, and nommu versions of these interfaces.
> >>>
> >>> This solves a long-standing problem with the existing DMA-remapping
> >>> interfaces, which require that a struct page be given for the region
> >>> to be mapped into a device's IOVA domain. This requirement cannot
> >>> support peer device BAR ranges, for which no struct pages exist.
> >>> ...
> 
> >> I think we currently assume there's no peer-to-peer traffic.
> >>
> >> I don't know whether changing that will break anything, but I'm
> >> concerned about these:
> >>
> >>   - PCIe MPS configuration (see pcie_bus_configure_settings()).
> >
> > I think it should be ok for PCIe MPS configuration, PCIE_BUS_PEER2PEER
> > force every device's MPS to 128B, what its concern is the TLP payload
> > size. In this series, it seems to only map a iova for device bar region.
> 
> MPS configuration makes assumptions about whether there will be any peer-
> to-peer traffic.  If there will be none, MPS can be configured more
> aggressively.
> 
> I don't think Linux has any way to detect whether a driver is doing peer-
> to-peer, and there's no way to prevent a driver from doing it.
> We're stuck with requiring the user to specify boot options
> ("pci=pcie_bus_safe", "pci=pcie_bus_perf", "pci=pcie_bus_peer2peer",
> etc.) that tell the PCI core what the user expects to happen.
> 
> This is a terrible user experience.  The user has no way to tell what
> drivers are going to do.  If he specifies the wrong thing, e.g., "assume no
> peer-to-peer traffic," and then loads a driver that does peer-to-peer, the
> kernel will configure MPS aggressively and when the device does a peer-to-
> peer transfer, it may cause a Malformed TLP error.
> 

I agree that this isn't a great user experience, but just want to clarify that this problem is orthogonal to this patch series, correct?

Prior to this series, the MPS mismatch is still possible with p2p traffic, but when an IOMMU is enabled p2p traffic will result in DMAR faults. The aim of the series is to allow drivers to fix the latter, not the former.

Thanks,
Will

--
nvpublic

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
  2015-05-07 16:23         ` William Davis
  (?)
@ 2015-05-07 17:16         ` Bjorn Helgaas
  2015-05-07 18:11           ` Jerome Glisse
  -1 siblings, 1 reply; 34+ messages in thread
From: Bjorn Helgaas @ 2015-05-07 17:16 UTC (permalink / raw)
  To: William Davis
  Cc: Yijing Wang, Joerg Roedel, open list:INTEL IOMMU (VT-d),
	linux-pci, Terence Ripperda, John Hubbard, Jerome Glisse,
	Dave Jiang, David S. Miller, Alex Williamson

On Thu, May 7, 2015 at 11:23 AM, William Davis <wdavis@nvidia.com> wrote:
>
>
>> -----Original Message-----
>> From: Bjorn Helgaas [mailto:bhelgaas@google.com]
>> Sent: Thursday, May 7, 2015 8:13 AM
>> To: Yijing Wang
>> Cc: William Davis; Joerg Roedel; open list:INTEL IOMMU (VT-d); linux-
>> pci@vger.kernel.org; Terence Ripperda; John Hubbard; Jerome Glisse; Dave
>> Jiang; David S. Miller; Alex Williamson
>> Subject: Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
>>
>> On Wed, May 6, 2015 at 8:48 PM, Yijing Wang <wangyijing@huawei.com> wrote:
>> > On 2015/5/7 6:18, Bjorn Helgaas wrote:
>> >> [+cc Yijing, Dave J, Dave M, Alex]
>> >>
>> >> On Fri, May 01, 2015 at 01:32:12PM -0500, wdavis@nvidia.com wrote:
>> >>> From: Will Davis <wdavis@nvidia.com>
>> >>>
>> >>> Hi,
>> >>>
>> >>> This patch series adds DMA APIs to map and unmap a struct resource
>> >>> to and from a PCI device's IOVA domain, and implements the AMD,
>> >>> Intel, and nommu versions of these interfaces.
>> >>>
>> >>> This solves a long-standing problem with the existing DMA-remapping
>> >>> interfaces, which require that a struct page be given for the region
>> >>> to be mapped into a device's IOVA domain. This requirement cannot
>> >>> support peer device BAR ranges, for which no struct pages exist.
>> >>> ...
>>
>> >> I think we currently assume there's no peer-to-peer traffic.
>> >>
>> >> I don't know whether changing that will break anything, but I'm
>> >> concerned about these:
>> >>
>> >>   - PCIe MPS configuration (see pcie_bus_configure_settings()).
>> >
>> > I think it should be ok for PCIe MPS configuration: PCIE_BUS_PEER2PEER
>> > forces every device's MPS to 128B; its concern is the TLP payload
>> > size. This series seems to only map an IOVA for the device BAR region.
>>
>> MPS configuration makes assumptions about whether there will be any peer-
>> to-peer traffic.  If there will be none, MPS can be configured more
>> aggressively.
>>
>> I don't think Linux has any way to detect whether a driver is doing peer-
>> to-peer, and there's no way to prevent a driver from doing it.
>> We're stuck with requiring the user to specify boot options
>> ("pci=pcie_bus_safe", "pci=pcie_bus_perf", "pci=pcie_bus_peer2peer",
>> etc.) that tell the PCI core what the user expects to happen.
>>
>> This is a terrible user experience.  The user has no way to tell what
>> drivers are going to do.  If he specifies the wrong thing, e.g., "assume no
>> peer-to-peer traffic," and then loads a driver that does peer-to-peer, the
>> kernel will configure MPS aggressively and when the device does a peer-to-
>> peer transfer, it may cause a Malformed TLP error.
>>
>
> I agree that this isn't a great user experience, but just want to clarify that this problem is orthogonal to this patch series, correct?
>
> Prior to this series, the MPS mismatch is still possible with p2p traffic, but when an IOMMU is enabled p2p traffic will result in DMAR faults. The aim of the series is to allow drivers to fix the latter, not the former.

Prior to this series, there wasn't any infrastructure for drivers to
do p2p, so it was mostly reasonable to assume that there *was* no p2p
traffic.

I think we currently default to doing nothing to MPS.  Prior to this
series, it might have been reasonable to optimize based on a "no-p2p"
assumption, e.g., default to pcie_bus_safe or pcie_bus_perf.  After
this series, I'm not sure what we could do, because p2p will be much
more likely.

It's just an issue; I don't know what the resolution is.

Bjorn

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
  2015-05-07 17:16         ` Bjorn Helgaas
@ 2015-05-07 18:11           ` Jerome Glisse
  2015-05-11 19:21               ` Don Dutile
  0 siblings, 1 reply; 34+ messages in thread
From: Jerome Glisse @ 2015-05-07 18:11 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: William Davis, Dave Jiang, linux-pci, Jerome Glisse,
	open list:INTEL IOMMU (VT-d),
	John Hubbard, Terence Ripperda, David S. Miller

On Thu, May 07, 2015 at 12:16:30PM -0500, Bjorn Helgaas wrote:
> On Thu, May 7, 2015 at 11:23 AM, William Davis <wdavis@nvidia.com> wrote:
> >> From: Bjorn Helgaas [mailto:bhelgaas@google.com]
> >> Sent: Thursday, May 7, 2015 8:13 AM
> >> To: Yijing Wang
> >> Cc: William Davis; Joerg Roedel; open list:INTEL IOMMU (VT-d); linux-
> >> pci@vger.kernel.org; Terence Ripperda; John Hubbard; Jerome Glisse; Dave
> >> Jiang; David S. Miller; Alex Williamson
> >> Subject: Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
> >>
> >> On Wed, May 6, 2015 at 8:48 PM, Yijing Wang <wangyijing@huawei.com> wrote:
> >> > On 2015/5/7 6:18, Bjorn Helgaas wrote:
> >> >> [+cc Yijing, Dave J, Dave M, Alex]
> >> >>
> >> >> On Fri, May 01, 2015 at 01:32:12PM -0500, wdavis@nvidia.com wrote:
> >> >>> From: Will Davis <wdavis@nvidia.com>
> >> >>>
> >> >>> Hi,
> >> >>>
> >> >>> This patch series adds DMA APIs to map and unmap a struct resource
> >> >>> to and from a PCI device's IOVA domain, and implements the AMD,
> >> >>> Intel, and nommu versions of these interfaces.
> >> >>>
> >> >>> This solves a long-standing problem with the existing DMA-remapping
> >> >>> interfaces, which require that a struct page be given for the region
> >> >>> to be mapped into a device's IOVA domain. This requirement cannot
> >> >>> support peer device BAR ranges, for which no struct pages exist.
> >> >>> ...
> >>
> >> >> I think we currently assume there's no peer-to-peer traffic.
> >> >>
> >> >> I don't know whether changing that will break anything, but I'm
> >> >> concerned about these:
> >> >>
> >> >>   - PCIe MPS configuration (see pcie_bus_configure_settings()).
> >> >
> >> > I think it should be ok for PCIe MPS configuration: PCIE_BUS_PEER2PEER
> >> > forces every device's MPS to 128B; its concern is the TLP payload
> >> > size. This series seems to only map an IOVA for the device BAR region.
> >>
> >> MPS configuration makes assumptions about whether there will be any peer-
> >> to-peer traffic.  If there will be none, MPS can be configured more
> >> aggressively.
> >>
> >> I don't think Linux has any way to detect whether a driver is doing peer-
> >> to-peer, and there's no way to prevent a driver from doing it.
> >> We're stuck with requiring the user to specify boot options
> >> ("pci=pcie_bus_safe", "pci=pcie_bus_perf", "pci=pcie_bus_peer2peer",
> >> etc.) that tell the PCI core what the user expects to happen.
> >>
> >> This is a terrible user experience.  The user has no way to tell what
> >> drivers are going to do.  If he specifies the wrong thing, e.g., "assume no
> >> peer-to-peer traffic," and then loads a driver that does peer-to-peer, the
> >> kernel will configure MPS aggressively and when the device does a peer-to-
> >> peer transfer, it may cause a Malformed TLP error.
> >>
> >
> > I agree that this isn't a great user experience, but just want to clarify
> > that this problem is orthogonal to this patch series, correct?
> >
> > Prior to this series, the MPS mismatch is still possible with p2p traffic,
> > but when an IOMMU is enabled p2p traffic will result in DMAR faults. The
> > aim of the series is to allow drivers to fix the latter, not the former.
> 
> Prior to this series, there wasn't any infrastructure for drivers to
> do p2p, so it was mostly reasonable to assume that there *was* no p2p
> traffic.
> 
> I think we currently default to doing nothing to MPS.  Prior to this
> series, it might have been reasonable to optimize based on a "no-p2p"
> assumption, e.g., default to pcie_bus_safe or pcie_bus_perf.  After
> this series, I'm not sure what we could do, because p2p will be much
> more likely.
> 
> It's just an issue; I don't know what the resolution is.

Can't we just have each device update its MPS at runtime? If device A
decides to map something from device B, then A updates the MPS for both A
and B to the lowest common supported value.

Of course you need to keep track of that per device, so that if a device C
comes around and wants to exchange with device B, and both C and B support
a higher payload than A, C reprogramming B doesn't trigger issues for A.

I know we update other PCIe configuration parameters at runtime for GPUs;
dunno if it is widely tested for other devices.

Cheers,
Jérôme

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
  2015-05-01 18:32 ` wdavis-DDmLM1+adcrQT0dZR+AlfA
                   ` (7 preceding siblings ...)
  (?)
@ 2015-05-08 20:21 ` Konrad Rzeszutek Wilk
  2015-05-08 20:46   ` Mark Hounschell
  2015-05-11 19:49   ` William Davis
  -1 siblings, 2 replies; 34+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-05-08 20:21 UTC (permalink / raw)
  To: wdavis; +Cc: joro, jglisse, linux-pci, iommu, jhubbard, tripperda

On Fri, May 01, 2015 at 01:32:12PM -0500, wdavis@nvidia.com wrote:
> From: Will Davis <wdavis@nvidia.com>
> 
> Hi,
> 
> This patch series adds DMA APIs to map and unmap a struct resource to and from
> a PCI device's IOVA domain, and implements the AMD, Intel, and nommu versions
> of these interfaces.
> 
> This solves a long-standing problem with the existing DMA-remapping interfaces,
> which require that a struct page be given for the region to be mapped into a
> device's IOVA domain. This requirement cannot support peer device BAR ranges,
> for which no struct pages exist.
> 
> The underlying implementations of map_page and map_sg convert the struct page
> into its physical address anyway, so we just need a way to route the physical
> address of the BAR region to these implementations. The new interfaces do this
> by taking the struct resource describing a device's BAR region, from which the
> physical address is derived.
> 
> The Intel and nommu versions have been verified on a dual Intel Xeon E5405
> workstation. I'm in the process of obtaining hardware to test the AMD version
> as well. Please review.

Does it work if you boot with 'iommu=soft swiotlb=force' which will mandate
a strict usage of the DMA API?

> 
> Thanks,
> Will
> 
> Will Davis (6):
>   dma-debug: add checking for map/unmap_resource
>   DMA-API: Introduce dma_(un)map_resource
>   dma-mapping: pci: add pci_(un)map_resource
>   iommu/amd: Implement (un)map_resource
>   iommu/vt-d: implement (un)map_resource
>   x86: add pci-nommu implementation of map_resource
> 
>  arch/x86/kernel/pci-nommu.c              | 17 +++++++
>  drivers/iommu/amd_iommu.c                | 76 ++++++++++++++++++++++++++------
>  drivers/iommu/intel-iommu.c              | 18 ++++++++
>  include/asm-generic/dma-mapping-broken.h |  9 ++++
>  include/asm-generic/dma-mapping-common.h | 34 ++++++++++++++
>  include/asm-generic/pci-dma-compat.h     | 14 ++++++
>  include/linux/dma-debug.h                | 20 +++++++++
>  include/linux/dma-mapping.h              |  7 +++
>  lib/dma-debug.c                          | 48 ++++++++++++++++++++
>  9 files changed, 230 insertions(+), 13 deletions(-)
> 
> -- 
> 2.3.7
> 
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
  2015-05-08 20:21 ` Konrad Rzeszutek Wilk
@ 2015-05-08 20:46   ` Mark Hounschell
  2015-05-11 14:32       ` Konrad Rzeszutek Wilk
  2015-05-11 20:05     ` William Davis
  2015-05-11 19:49   ` William Davis
  1 sibling, 2 replies; 34+ messages in thread
From: Mark Hounschell @ 2015-05-08 20:46 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, wdavis-DDmLM1+adcrQT0dZR+AlfA
  Cc: linux-pci-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	tripperda-DDmLM1+adcrQT0dZR+AlfA, jglisse-H+wXaHxf7aLQT0dZR+AlfA,
	jhubbard-DDmLM1+adcrQT0dZR+AlfA

On 05/08/2015 04:21 PM, Konrad Rzeszutek Wilk wrote:
> On Fri, May 01, 2015 at 01:32:12PM -0500, wdavis-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org wrote:
>> From: Will Davis <wdavis-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>>
>> Hi,
>>
>> This patch series adds DMA APIs to map and unmap a struct resource to and from
>> a PCI device's IOVA domain, and implements the AMD, Intel, and nommu versions
>> of these interfaces.
>>
>> This solves a long-standing problem with the existing DMA-remapping interfaces,
>> which require that a struct page be given for the region to be mapped into a
>> device's IOVA domain. This requirement cannot support peer device BAR ranges,
>> for which no struct pages exist.
>>
>> The underlying implementations of map_page and map_sg convert the struct page
>> into its physical address anyway, so we just need a way to route the physical
>> address of the BAR region to these implementations. The new interfaces do this
>> by taking the struct resource describing a device's BAR region, from which the
>> physical address is derived.
>>
>> The Intel and nommu versions have been verified on a dual Intel Xeon E5405
>> workstation. I'm in the process of obtaining hardware to test the AMD version
>> as well. Please review.
>
> Does it work if you boot with 'iommu=soft swiotlb=force' which will mandate
> a strict usage of the DMA API?
>

PCIe peer2peer is borked on all motherboards I've tried. Only writes are 
possible. Reads are not supported. I suppose if you have a platform with 
only PCI and an IOMMU this would be very useful. Without both read and 
write PCIe peer2peer support, this seems unnecessary.

Mark

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/6] dma-mapping: pci: add pci_(un)map_resource
@ 2015-05-11 14:30       ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 34+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-05-11 14:30 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: wdavis, linux-pci, iommu, jglisse, jhubbard, tripperda,
	Yinghai Lu, David S. Miller

On Thu, May 07, 2015 at 10:19:05AM -0500, Bjorn Helgaas wrote:
> [+cc Dave for sparc64, Yinghai]
> 
> On Fri, May 01, 2015 at 01:32:15PM -0500, wdavis@nvidia.com wrote:
> > From: Will Davis <wdavis@nvidia.com>
> > 
> > Simply route these through to the new dma_(un)map_resource APIs.
> > 
> > Signed-off-by: Will Davis <wdavis@nvidia.com>
> > Reviewed-by: Terence Ripperda <tripperda@nvidia.com>
> > Reviewed-by: John Hubbard <jhubbard@nvidia.com>
> > ---
> >  include/asm-generic/pci-dma-compat.h | 14 ++++++++++++++
> >  1 file changed, 14 insertions(+)
> > 
> > diff --git a/include/asm-generic/pci-dma-compat.h b/include/asm-generic/pci-dma-compat.h
> > index c110843..ac4a4ad 100644
> > --- a/include/asm-generic/pci-dma-compat.h
> > +++ b/include/asm-generic/pci-dma-compat.h
> > @@ -61,6 +61,20 @@ pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address,
> >  	dma_unmap_page(hwdev == NULL ? NULL : &hwdev->dev, dma_address, size, (enum dma_data_direction)direction);
> >  }
> >  
> > +static inline dma_addr_t
> > +pci_map_resource(struct pci_dev *hwdev, struct resource *resource,
> > +		 unsigned long offset, size_t size, int direction)
> > +{
> > +	return dma_map_resource(hwdev == NULL ? NULL : &hwdev->dev, resource, offset, size, (enum dma_data_direction)direction);
> > +}
> 
> On sparc64, PCI bus addresses, e.g., raw BAR values, can be 64 bits wide,
> but dma_addr_t is only 32 bits [1].  So dma_addr_t is a bit of a problem
> here.  It's likely that we will add a pci_bus_addr_t, but that hasn't
> happened yet [2].

Why not just expand the 'dma_addr_t' to be unsigned long (if only to support
the T5-8 box)?
> 
> We do have existing problems already, e.g,. pci_bus_address() returns a
> dma_addr_t, so it has the same problem.  So I guess this is just a heads-up
> that this needs to be fixed eventually.
> 
> Bjorn
> 
> [1] http://lkml.kernel.org/r/20150327.145016.86183910134380870.davem@davemloft.net
> [2] http://lkml.kernel.org/r/1427857069-6789-2-git-send-email-yinghai@kernel.org
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
@ 2015-05-11 14:32       ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 34+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-05-11 14:32 UTC (permalink / raw)
  To: Mark Hounschell; +Cc: wdavis, linux-pci, iommu, jglisse, jhubbard, tripperda

On Fri, May 08, 2015 at 04:46:17PM -0400, Mark Hounschell wrote:
> On 05/08/2015 04:21 PM, Konrad Rzeszutek Wilk wrote:
> >On Fri, May 01, 2015 at 01:32:12PM -0500, wdavis@nvidia.com wrote:
> >>From: Will Davis <wdavis@nvidia.com>
> >>
> >>Hi,
> >>
> >>This patch series adds DMA APIs to map and unmap a struct resource to and from
> >>a PCI device's IOVA domain, and implements the AMD, Intel, and nommu versions
> >>of these interfaces.
> >>
> >>This solves a long-standing problem with the existing DMA-remapping interfaces,
> >>which require that a struct page be given for the region to be mapped into a
> >>device's IOVA domain. This requirement cannot support peer device BAR ranges,
> >>for which no struct pages exist.
> >>
> >>The underlying implementations of map_page and map_sg convert the struct page
> >>into its physical address anyway, so we just need a way to route the physical
> >>address of the BAR region to these implementations. The new interfaces do this
> >>by taking the struct resource describing a device's BAR region, from which the
> >>physical address is derived.
> >>
> >>The Intel and nommu versions have been verified on a dual Intel Xeon E5405
> >>workstation. I'm in the process of obtaining hardware to test the AMD version
> >>as well. Please review.
> >
> >Does it work if you boot with 'iommu=soft swiotlb=force' which will mandate
> >a strict usage of the DMA API?
> >
> 
> PCIe peer2peer is borked on all motherboards I've tried. Only writes are

:-(

> possible. Reads are not supported. I suppose if you have a platform with
> only PCI and an IOMMU this would be very useful. Without both read and write
> PCIe peer2peer support, this seems unnecessary.
> 

It is a perfect way to test the code to make sure the API works (or
fails in the expected failure modes) _and_ that the drivers do as well
(using pci_map_sync, and so on).
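For illustration, a driver exercising the proposed interface might look like
the sketch below. The BAR index, direction, and error handling are assumptions
on my part; the names and the (hwdev, resource, offset, size, direction)
signature follow the pci_(un)map_resource wrappers from patch 3/6, and the
unmap signature is assumed to mirror pci_unmap_page():

```c
/* Sketch: map a peer device's BAR 1 into our device's IOVA domain
 * using the series' proposed compat wrappers. */
struct resource *res = &peer_pdev->resource[1];
dma_addr_t iova;

iova = pci_map_resource(pdev, res, 0, resource_size(res),
			PCI_DMA_BIDIRECTIONAL);
if (dma_mapping_error(&pdev->dev, iova))
	return -EIO;

/* ... program pdev to DMA to/from iova ... */

pci_unmap_resource(pdev, iova, resource_size(res), PCI_DMA_BIDIRECTIONAL);
```

Even on boards where p2p reads are broken, running such a path under
'iommu=soft swiotlb=force' would exercise the map/unmap and error cases.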

> Mark
> 
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/6] dma-mapping: pci: add pci_(un)map_resource
  2015-05-11 14:30       ` Konrad Rzeszutek Wilk
  (?)
@ 2015-05-11 15:27       ` Bjorn Helgaas
  -1 siblings, 0 replies; 34+ messages in thread
From: Bjorn Helgaas @ 2015-05-11 15:27 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: William Davis, linux-pci, open list:INTEL IOMMU (VT-d),
	Jerome Glisse, John Hubbard, Terence Ripperda, Yinghai Lu,
	David S. Miller

On Mon, May 11, 2015 at 9:30 AM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
> On Thu, May 07, 2015 at 10:19:05AM -0500, Bjorn Helgaas wrote:
>> [+cc Dave for sparc64, Yinghai]
>>
>> On Fri, May 01, 2015 at 01:32:15PM -0500, wdavis@nvidia.com wrote:
>> > From: Will Davis <wdavis@nvidia.com>
>> >
>> > Simply route these through to the new dma_(un)map_resource APIs.
>> >
>> > Signed-off-by: Will Davis <wdavis@nvidia.com>
>> > Reviewed-by: Terence Ripperda <tripperda@nvidia.com>
>> > Reviewed-by: John Hubbard <jhubbard@nvidia.com>
>> > ---
>> >  include/asm-generic/pci-dma-compat.h | 14 ++++++++++++++
>> >  1 file changed, 14 insertions(+)
>> >
>> > diff --git a/include/asm-generic/pci-dma-compat.h b/include/asm-generic/pci-dma-compat.h
>> > index c110843..ac4a4ad 100644
>> > --- a/include/asm-generic/pci-dma-compat.h
>> > +++ b/include/asm-generic/pci-dma-compat.h
>> > @@ -61,6 +61,20 @@ pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address,
>> >     dma_unmap_page(hwdev == NULL ? NULL : &hwdev->dev, dma_address, size, (enum dma_data_direction)direction);
>> >  }
>> >
>> > +static inline dma_addr_t
>> > +pci_map_resource(struct pci_dev *hwdev, struct resource *resource,
>> > +            unsigned long offset, size_t size, int direction)
>> > +{
>> > +   return dma_map_resource(hwdev == NULL ? NULL : &hwdev->dev, resource, offset, size, (enum dma_data_direction)direction);
>> > +}
>>
>> On sparc64, PCI bus addresses, e.g., raw BAR values, can be 64 bits wide,
>> but dma_addr_t is only 32 bits [1].  So dma_addr_t is a bit of a problem
>> here.  It's likely that we will add a pci_bus_addr_t, but that hasn't
>> happened yet [2].
>
> Why not just expand the 'dma_addr_t' to be unsigned long (if only to support
> the T5-8 box)?

That would work, but would increase the kernel memory footprint [3].
I don't have numbers (maybe Dave does), but if it is really
significant, other platforms with IOMMUs might want to explore using a
dma_addr_t smaller than 64 bits, too.

>> [1] http://lkml.kernel.org/r/20150327.145016.86183910134380870.davem@davemloft.net
>> [2] http://lkml.kernel.org/r/1427857069-6789-2-git-send-email-yinghai@kernel.org

[3] http://lkml.kernel.org/r/20150403.124855.96516097693494126.davem@davemloft.net

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
@ 2015-05-11 19:21               ` Don Dutile
  0 siblings, 0 replies; 34+ messages in thread
From: Don Dutile @ 2015-05-11 19:21 UTC (permalink / raw)
  To: Jerome Glisse, Bjorn Helgaas
  Cc: Dave Jiang, linux-pci, William Davis,
	open list:INTEL IOMMU (VT-d),
	Jerome Glisse, John Hubbard, Terence Ripperda, David S. Miller

On 05/07/2015 02:11 PM, Jerome Glisse wrote:
> On Thu, May 07, 2015 at 12:16:30PM -0500, Bjorn Helgaas wrote:
>> On Thu, May 7, 2015 at 11:23 AM, William Davis <wdavis@nvidia.com> wrote:
>>>> From: Bjorn Helgaas [mailto:bhelgaas@google.com]
>>>> Sent: Thursday, May 7, 2015 8:13 AM
>>>> To: Yijing Wang
>>>> Cc: William Davis; Joerg Roedel; open list:INTEL IOMMU (VT-d); linux-
>>>> pci@vger.kernel.org; Terence Ripperda; John Hubbard; Jerome Glisse; Dave
>>>> Jiang; David S. Miller; Alex Williamson
>>>> Subject: Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
>>>>
>>>> On Wed, May 6, 2015 at 8:48 PM, Yijing Wang <wangyijing@huawei.com> wrote:
>>>>> On 2015/5/7 6:18, Bjorn Helgaas wrote:
>>>>>> [+cc Yijing, Dave J, Dave M, Alex]
>>>>>>
>>>>>> On Fri, May 01, 2015 at 01:32:12PM -0500, wdavis@nvidia.com wrote:
>>>>>>> From: Will Davis <wdavis@nvidia.com>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> This patch series adds DMA APIs to map and unmap a struct resource
>>>>>>> to and from a PCI device's IOVA domain, and implements the AMD,
>>>>>>> Intel, and nommu versions of these interfaces.
>>>>>>>
>>>>>>> This solves a long-standing problem with the existing DMA-remapping
>>>>>>> interfaces, which require that a struct page be given for the region
>>>>>>> to be mapped into a device's IOVA domain. This requirement cannot
>>>>>>> support peer device BAR ranges, for which no struct pages exist.
>>>>>>> ...
>>>>
>>>>>> I think we currently assume there's no peer-to-peer traffic.
>>>>>>
>>>>>> I don't know whether changing that will break anything, but I'm
>>>>>> concerned about these:
>>>>>>
>>>>>>    - PCIe MPS configuration (see pcie_bus_configure_settings()).
>>>>>
>>>>> I think it should be ok for PCIe MPS configuration: PCIE_BUS_PEER2PEER
>>>>> forces every device's MPS to 128B; its concern is the TLP payload
>>>>> size. This series seems to only map an IOVA for the device BAR region.
>>>>
>>>> MPS configuration makes assumptions about whether there will be any peer-
>>>> to-peer traffic.  If there will be none, MPS can be configured more
>>>> aggressively.
>>>>
>>>> I don't think Linux has any way to detect whether a driver is doing peer-
>>>> to-peer, and there's no way to prevent a driver from doing it.
>>>> We're stuck with requiring the user to specify boot options
>>>> ("pci=pcie_bus_safe", "pci=pcie_bus_perf", "pci=pcie_bus_peer2peer",
>>>> etc.) that tell the PCI core what the user expects to happen.
>>>>
>>>> This is a terrible user experience.  The user has no way to tell what
>>>> drivers are going to do.  If he specifies the wrong thing, e.g., "assume no
>>>> peer-to-peer traffic," and then loads a driver that does peer-to-peer, the
>>>> kernel will configure MPS aggressively and when the device does a peer-to-
>>>> peer transfer, it may cause a Malformed TLP error.
>>>>
>>>
>>> I agree that this isn't a great user experience, but just want to clarify
>>> that this problem is orthogonal to this patch series, correct?
>>>
>>> Prior to this series, the MPS mismatch is still possible with p2p traffic,
>>> but when an IOMMU is enabled p2p traffic will result in DMAR faults. The
>>> aim of the series is to allow drivers to fix the latter, not the former.
>>
>> Prior to this series, there wasn't any infrastructure for drivers to
>> do p2p, so it was mostly reasonable to assume that there *was* no p2p
>> traffic.
>>
>> I think we currently default to doing nothing to MPS.  Prior to this
>> series, it might have been reasonable to optimize based on a "no-p2p"
>> assumption, e.g., default to pcie_bus_safe or pcie_bus_perf.  After
>> this series, I'm not sure what we could do, because p2p will be much
>> more likely.
>>
>> It's just an issue; I don't know what the resolution is.
>
> Can't we just have each device update its MPS at runtime? If device A
> decides to map something from device B, then A updates the MPS for both A
> and B to the lowest common supported value.
>
> Of course you need to keep track of that per device, so that if a device C
> comes around and wants to exchange with device B, and both C and B support
> a higher payload than A, C reprogramming B doesn't trigger issues for A.
>
> I know we update other PCIe configuration parameters at runtime for GPUs;
> dunno if it is widely tested for other devices.
>
I believe all these cases are between endpoints and the upstream ports of the
PCIe port/host bridge/PCIe switch they are connected to, i.e., true wire
peers -- not across a PCIe domain, which is the context this p2p (and thus
the MPS setting) has to span.


> Cheers,
> Jérôme
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
>


^ permalink raw reply	[flat|nested] 34+ messages in thread

>>>
>>> Prior to this series, the MPS mismatch is still possible with p2p traffic,
>>> but when an IOMMU is enabled p2p traffic will result in DMAR faults. The
>>> aim of the series is to allow drivers to fix the latter, not the former.
>>
>> Prior to this series, there wasn't any infrastructure for drivers to
>> do p2p, so it was mostly reasonable to assume that there *was* no p2p
>> traffic.
>>
>> I think we currently default to doing nothing to MPS.  Prior to this
>> series, it might have been reasonable to optimize based on a "no-p2p"
>> assumption, e.g., default to pcie_bus_safe or pcie_bus_perf.  After
>> this series, I'm not sure what we could do, because p2p will be much
>> more likely.
>>
>> It's just an issue; I don't know what the resolution is.
>
> Can't we just have each device update its MPS at runtime. So if device A
> decide to map something from device B then device A update MPS for A and
> B to lowest common supported value.
>
> Of course you need to keep track of that per device so that if a device C
> comes around and want to exchange with device B and both C and B support
> higher payload than A then if C reprogram B it will trigger issue for A.
>
> I know we update other PCIE configuration parameter at runtime for GPU,
> dunno if it is widely tested for other devices.
>
I believe all these cases are between endpoints and the upstream ports of the
PCIe port/host-bridge/PCIe switch they are connected to, i.e., true wire peers
-- not across a PCIe domain, which is the peer-to-peer context that the MPS
configuration has to span.


> Cheers,
> Jérôme
> _______________________________________________
> iommu mailing list
> iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
  2015-05-08 20:21 ` Konrad Rzeszutek Wilk
  2015-05-08 20:46   ` Mark Hounschell
@ 2015-05-11 19:49   ` William Davis
  1 sibling, 0 replies; 34+ messages in thread
From: William Davis @ 2015-05-11 19:49 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: joro, jglisse, linux-pci, iommu, John Hubbard, Terence Ripperda



> -----Original Message-----
> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> Sent: Friday, May 08, 2015 3:22 PM
> To: William Davis
> Cc: joro@8bytes.org; jglisse@redhat.com; linux-pci@vger.kernel.org; iommu@lists.linux-foundation.org;
> John Hubbard; Terence Ripperda
> Subject: Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
> 
> ...
> >
> > The Intel and nommu versions have been verified on a dual Intel Xeon E5405
> > workstation. I'm in the process of obtaining hardware to test the AMD version
> > as well. Please review.
> 
> Does it work if you boot with 'iommu=soft swiotlb=force' which will mandate
> an strict usage of the DMA API?
> 

This patch series doesn't yet add a SWIOTLB implementation, and so the
dma_map_resource() call would return 0 to indicate the path is not
implemented (see patch 2/6). So no, the new interfaces would not work with
that configuration, but they're also not expected to at this point.

Thanks,
Will
--
nvpublic

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
  2015-05-08 20:46   ` Mark Hounschell
  2015-05-11 14:32       ` Konrad Rzeszutek Wilk
@ 2015-05-11 20:05     ` William Davis
  1 sibling, 0 replies; 34+ messages in thread
From: William Davis @ 2015-05-11 20:05 UTC (permalink / raw)
  To: markh, Konrad Rzeszutek Wilk
  Cc: linux-pci, iommu, jglisse, John Hubbard, Terence Ripperda



> -----Original Message-----
> From: Mark Hounschell [mailto:markh@compro.net]
> Sent: Friday, May 08, 2015 3:46 PM
> To: Konrad Rzeszutek Wilk; William Davis
> Cc: linux-pci@vger.kernel.org; iommu@lists.linux-foundation.org; jglisse@redhat.com; John Hubbard;
> Terence Ripperda
> Subject: Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
> 
> On 05/08/2015 04:21 PM, Konrad Rzeszutek Wilk wrote:
> > On Fri, May 01, 2015 at 01:32:12PM -0500, wdavis@nvidia.com wrote:
> >> From: Will Davis <wdavis@nvidia.com>
> >>
> >> Hi,
> >>
> ...
> >>
> >> This solves a long-standing problem with the existing DMA-remapping interfaces,
> >> which require that a struct page be given for the region to be mapped into a
> >> device's IOVA domain. This requirement cannot support peer device BAR ranges,
> >> for which no struct pages exist.
> >>
> ...
> >>
> >> The Intel and nommu versions have been verified on a dual Intel Xeon E5405
> >> workstation.
> 
> PCIe peer2peer is borked on all motherboards I've tried. Only writes are
> possible. Reads are not supported. I suppose if you have a platform with
> only PCI and an IOMMU this would be very useful. Without both read and
> write PCIe peer2peer support, this seems unnecessary.
> 

PCIe peer-to-peer isn't inherently broken or useless, even if a lot of its
implementations are; I've successfully tested these patches with existing
hardware (dual NVIDIA GPUs + an old-ish workstation), and it solves a
longstanding problem for us [1], so I disagree with the assessment that this
would be unnecessary.

I guess I don't see why the existence of some, or even a lot of, poor
implementations suffices as a reason to reject a generic mechanism to
support the "good", standardized implementations.

[1] http://stackoverflow.com/questions/19841815/does-the-nvidia-rdma-gpudirect-always-operate-only-physical-addresses-in-physic

Thanks,
Will

--
nvpublic

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2015-05-11 20:05 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-01 18:32 [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer wdavis
2015-05-01 18:32 ` wdavis-DDmLM1+adcrQT0dZR+AlfA
2015-05-01 18:32 ` [PATCH 1/6] dma-debug: add checking for map/unmap_resource wdavis
2015-05-01 18:32 ` [PATCH 2/6] DMA-API: Introduce dma_(un)map_resource wdavis
2015-05-07 15:09   ` Bjorn Helgaas
2015-05-07 16:10     ` William Davis
2015-05-01 18:32 ` [PATCH 3/6] dma-mapping: pci: add pci_(un)map_resource wdavis
2015-05-07 15:19   ` Bjorn Helgaas
2015-05-07 15:19     ` Bjorn Helgaas
2015-05-11 14:30     ` Konrad Rzeszutek Wilk
2015-05-11 14:30       ` Konrad Rzeszutek Wilk
2015-05-11 15:27       ` Bjorn Helgaas
2015-05-01 18:32 ` [PATCH 4/6] iommu/amd: Implement (un)map_resource wdavis
2015-05-01 18:32 ` [PATCH 5/6] iommu/vt-d: implement (un)map_resource wdavis
2015-05-01 18:32 ` [PATCH 6/6] x86: add pci-nommu implementation of map_resource wdavis
2015-05-07 15:08   ` Bjorn Helgaas
2015-05-07 16:07     ` William Davis
2015-05-06 22:18 ` [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer Bjorn Helgaas
2015-05-06 22:30   ` Alex Williamson
2015-05-07  1:48   ` Yijing Wang
2015-05-07  1:48     ` Yijing Wang
2015-05-07 13:13     ` Bjorn Helgaas
2015-05-07 16:23       ` William Davis
2015-05-07 16:23         ` William Davis
2015-05-07 17:16         ` Bjorn Helgaas
2015-05-07 18:11           ` Jerome Glisse
2015-05-11 19:21             ` Don Dutile
2015-05-11 19:21               ` Don Dutile
2015-05-08 20:21 ` Konrad Rzeszutek Wilk
2015-05-08 20:46   ` Mark Hounschell
2015-05-11 14:32     ` Konrad Rzeszutek Wilk
2015-05-11 14:32       ` Konrad Rzeszutek Wilk
2015-05-11 20:05     ` William Davis
2015-05-11 19:49   ` William Davis
