linux-kernel.vger.kernel.org archive mirror
* [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack
@ 2016-10-25 15:36 Alexander Duyck
  2016-10-25 15:36 ` [net-next PATCH 01/27] swiotlb: Drop unused function swiotlb_map_sg Alexander Duyck
                   ` (28 more replies)
  0 siblings, 29 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:36 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm; +Cc: davem, brouer

The first 22 patches in the set add support for the DMA attribute
DMA_ATTR_SKIP_CPU_SYNC on multiple platforms/architectures.  This is needed
so that we can flag the calls to dma_map/unmap_page and avoid invalidating
cache lines that do not currently belong to the device.  Instead we have to
take care of this in the driver via a call to
dma_sync_single_range_for_cpu prior to freeing the Rx page.
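
As a rough illustration of the idea (not code from the series), the sketch
below models the difference in plain userspace C.  It assumes a driver that
hands out part of a page at a time; every mock_* name and the
MOCK_ATTR_SKIP_CPU_SYNC value are hypothetical stand-ins, not kernel APIs.
The model only counts how many bytes cache maintenance would touch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for DMA_ATTR_SKIP_CPU_SYNC (value is an assumption). */
#define MOCK_ATTR_SKIP_CPU_SYNC (1UL << 0)

/* Toy model of one Rx page: tracks how many bytes were synced for the CPU. */
struct mock_rx_page {
	size_t size;         /* full size of the mapping */
	size_t synced_bytes; /* bytes touched by cache maintenance so far */
};

/* Model of unmap: syncs the whole mapping unless the skip attribute is set. */
static void mock_unmap_page(struct mock_rx_page *p, unsigned long attrs)
{
	if (!(attrs & MOCK_ATTR_SKIP_CPU_SYNC))
		p->synced_bytes += p->size;
}

/* Model of a ranged sync: touches only the region the device wrote. */
static void mock_sync_range_for_cpu(struct mock_rx_page *p,
				    size_t offset, size_t len)
{
	assert(offset + len <= p->size);
	p->synced_bytes += len;
}

/* Driver-side release: sync only the used region, then unmap without a sync. */
static size_t mock_release_rx_region(struct mock_rx_page *p,
				     size_t offset, size_t len)
{
	mock_sync_range_for_cpu(p, offset, len);
	mock_unmap_page(p, MOCK_ATTR_SKIP_CPU_SYNC);
	return p->synced_bytes;
}
```

In this model, releasing a 2048-byte region of a 4096-byte page touches only
2048 bytes, whereas a plain unmap without the attribute touches all 4096.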

Patch 23 adds support for dma_map_page_attrs and dma_unmap_page_attrs so
that we can unmap and map a page using the DMA_ATTR_SKIP_CPU_SYNC
attribute.

Patch 24 adds support for freeing a page that has multiple references being
held by a single caller.  This way we can free page fragments that were
allocated by a given driver.
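
A minimal userspace sketch of the "drop several references at once" idea,
assuming a simple atomic reference count.  The names mock_page and
mock_page_drain are hypothetical and only illustrate the concept; they are
not the helper added by the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy page with a reference count; "freed" stands in for returning the
 * page to the allocator. */
struct mock_page {
	atomic_int refcount;
	bool freed;
};

/* Drop `count` references in one step; free when none remain. */
static void mock_page_drain(struct mock_page *page, int count)
{
	/* atomic_fetch_sub returns the value held before the subtraction. */
	if (atomic_fetch_sub(&page->refcount, count) == count)
		page->freed = true;
}
```

A caller holding 63 of 64 references can release them with a single call
instead of 63 separate puts; the page is only freed once the last remaining
reference is dropped as well.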

The last 3 patches use these updates in the igb driver to allow us to
reimplement the use of build_skb.

My hope is to get the series accepted into the net-next tree as I have a
number of other Intel drivers I could then begin updating once these
patches are accepted.

v1: Split out DMA_ERROR_CODE fix for swiotlb-xen
    Minor fixes based on issues found by kernel build bot
    Few minor changes for issues found on code review
    Added Acked-by for patches that were acked and not changed

---

Alexander Duyck (27):
      swiotlb: Drop unused function swiotlb_map_sg
      swiotlb-xen: Enforce return of DMA_ERROR_CODE in mapping function
      swiotlb: Add support for DMA_ATTR_SKIP_CPU_SYNC
      arch/arc: Add option to skip sync on DMA mapping
      arch/arm: Add option to skip sync on DMA map and unmap
      arch/avr32: Add option to skip sync on DMA map
      arch/blackfin: Add option to skip sync on DMA map
      arch/c6x: Add option to skip sync on DMA map and unmap
      arch/frv: Add option to skip sync on DMA map
      arch/hexagon: Add option to skip DMA sync as a part of mapping
      arch/m68k: Add option to skip DMA sync as a part of mapping
      arch/metag: Add option to skip DMA sync as a part of map and unmap
      arch/microblaze: Add option to skip DMA sync as a part of map and unmap
      arch/mips: Add option to skip DMA sync as a part of map and unmap
      arch/nios2: Add option to skip DMA sync as a part of map and unmap
      arch/openrisc: Add option to skip DMA sync as a part of mapping
      arch/parisc: Add option to skip DMA sync as a part of map and unmap
      arch/powerpc: Add option to skip DMA sync as a part of mapping
      arch/sh: Add option to skip DMA sync as a part of mapping
      arch/sparc: Add option to skip DMA sync as a part of map and unmap
      arch/tile: Add option to skip DMA sync as a part of map and unmap
      arch/xtensa: Add option to skip DMA sync as a part of mapping
      dma: Add calls for dma_map_page_attrs and dma_unmap_page_attrs
      mm: Add support for releasing multiple instances of a page
      igb: Update driver to make use of DMA_ATTR_SKIP_CPU_SYNC
      igb: Update code to better handle incrementing page count
      igb: Revert "igb: Revert support for build_skb in igb"


 arch/arc/mm/dma.c                         |    5 +
 arch/arm/common/dmabounce.c               |   16 +-
 arch/arm/xen/mm.c                         |    1 
 arch/avr32/mm/dma-coherent.c              |    7 +
 arch/blackfin/kernel/dma-mapping.c        |    8 +
 arch/c6x/kernel/dma.c                     |   14 +-
 arch/frv/mb93090-mb00/pci-dma-nommu.c     |   14 +-
 arch/frv/mb93090-mb00/pci-dma.c           |    9 +
 arch/hexagon/kernel/dma.c                 |    6 +
 arch/m68k/kernel/dma.c                    |    8 +
 arch/metag/kernel/dma.c                   |   16 ++
 arch/microblaze/kernel/dma.c              |   10 +
 arch/mips/loongson64/common/dma-swiotlb.c |    2 
 arch/mips/mm/dma-default.c                |    8 +
 arch/nios2/mm/dma-mapping.c               |   26 +++-
 arch/openrisc/kernel/dma.c                |    3 
 arch/parisc/kernel/pci-dma.c              |   20 ++-
 arch/powerpc/kernel/dma.c                 |    9 +
 arch/sh/kernel/dma-nommu.c                |    7 +
 arch/sparc/kernel/iommu.c                 |    4 -
 arch/sparc/kernel/ioport.c                |    4 -
 arch/tile/kernel/pci-dma.c                |   12 +-
 arch/x86/xen/pci-swiotlb-xen.c            |    1 
 arch/xtensa/kernel/pci-dma.c              |    7 +
 drivers/net/ethernet/intel/igb/igb.h      |   36 ++++-
 drivers/net/ethernet/intel/igb/igb_main.c |  207 +++++++++++++++++++++++------
 drivers/xen/swiotlb-xen.c                 |   27 ++--
 include/linux/dma-mapping.h               |   20 ++-
 include/linux/gfp.h                       |    2 
 include/linux/swiotlb.h                   |   10 +
 include/xen/swiotlb-xen.h                 |    3 
 lib/swiotlb.c                             |   56 ++++----
 mm/page_alloc.c                           |   14 ++
 33 files changed, 433 insertions(+), 159 deletions(-)



* [net-next PATCH 01/27] swiotlb: Drop unused function swiotlb_map_sg
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
@ 2016-10-25 15:36 ` Alexander Duyck
  2016-10-25 15:36 ` [net-next PATCH 02/27] swiotlb-xen: Enforce return of DMA_ERROR_CODE in mapping function Alexander Duyck
                   ` (27 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:36 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: brouer, davem, Konrad Rzeszutek Wilk

There are no users for swiotlb_map_sg so we might as well just drop it.

Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 include/linux/swiotlb.h |    4 ----
 lib/swiotlb.c           |    8 --------
 2 files changed, 12 deletions(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 5f81f8a..e237b6f 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -72,10 +72,6 @@ extern void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
 			       size_t size, enum dma_data_direction dir,
 			       unsigned long attrs);
 
-extern int
-swiotlb_map_sg(struct device *hwdev, struct scatterlist *sg, int nents,
-	       enum dma_data_direction dir);
-
 extern void
 swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sg, int nents,
 		 enum dma_data_direction dir);
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 22e13a0..47aad37 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -910,14 +910,6 @@ void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
 }
 EXPORT_SYMBOL(swiotlb_map_sg_attrs);
 
-int
-swiotlb_map_sg(struct device *hwdev, struct scatterlist *sgl, int nelems,
-	       enum dma_data_direction dir)
-{
-	return swiotlb_map_sg_attrs(hwdev, sgl, nelems, dir, 0);
-}
-EXPORT_SYMBOL(swiotlb_map_sg);
-
 /*
  * Unmap a set of streaming mode DMA translations.  Again, cpu read rules
  * concerning calls here are the same as for swiotlb_unmap_page() above.


* [net-next PATCH 02/27] swiotlb-xen: Enforce return of DMA_ERROR_CODE in mapping function
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
  2016-10-25 15:36 ` [net-next PATCH 01/27] swiotlb: Drop unused function swiotlb_map_sg Alexander Duyck
@ 2016-10-25 15:36 ` Alexander Duyck
  2016-10-28 17:35   ` Konrad Rzeszutek Wilk
  2016-10-25 15:37 ` [net-next PATCH 03/27] swiotlb: Add support for DMA_ATTR_SKIP_CPU_SYNC Alexander Duyck
                   ` (26 subsequent siblings)
  28 siblings, 1 reply; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:36 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: brouer, davem, Konrad Rzeszutek Wilk

The mapping function should always return DMA_ERROR_CODE when a mapping has
failed as this is what the DMA API expects when a DMA error has occurred.
The current function for mapping a page in Xen was returning either
DMA_ERROR_CODE or 0 depending on where it failed.

On x86 DMA_ERROR_CODE is 0, but on other architectures such as ARM it is
~0. We need to make sure we return the same error value if either the
mapping failed or the device is not capable of accessing the mapping.

If we are returning DMA_ERROR_CODE as our error value we can drop the
function for checking the error code as the default is to compare the
return value against DMA_ERROR_CODE if no function is defined.
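
To make the mismatch concrete, here is a small userspace model (a sketch,
not kernel code).  It assumes an error value of ~0, as described above for
ARM; the mock_* names and types are hypothetical.  The model shows why a
hook that treats only 0 as failure misses a DMA_ERROR_CODE return, while the
default comparison catches it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t mock_dma_addr_t;

/* Assumed error value for this model: ~0, as on ARM per the text above. */
#define MOCK_DMA_ERROR_CODE (~(mock_dma_addr_t)0)

struct mock_dma_ops {
	/* Optional hook; NULL means "use the default comparison". */
	int (*mapping_error)(mock_dma_addr_t addr);
};

/* Use the hook when one is provided, else compare against the error code. */
static int mock_dma_mapping_error(const struct mock_dma_ops *ops,
				  mock_dma_addr_t addr)
{
	if (ops->mapping_error)
		return ops->mapping_error(addr);
	return addr == MOCK_DMA_ERROR_CODE;
}

/* Models the removed hook: only an address of 0 counts as a failure. */
static int mock_zero_is_error(mock_dma_addr_t addr)
{
	return !addr;
}
```

With the hook installed, a failure reported as MOCK_DMA_ERROR_CODE goes
undetected; with no hook, the default comparison flags it, which is why
returning a single error value lets the hook be dropped.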

Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/arm/xen/mm.c              |    1 -
 arch/x86/xen/pci-swiotlb-xen.c |    1 -
 drivers/xen/swiotlb-xen.c      |   18 ++++++------------
 include/xen/swiotlb-xen.h      |    3 ---
 4 files changed, 6 insertions(+), 17 deletions(-)

diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index d062f08..bd62d94 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -186,7 +186,6 @@ void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order)
 EXPORT_SYMBOL(xen_dma_ops);
 
 static struct dma_map_ops xen_swiotlb_dma_ops = {
-	.mapping_error = xen_swiotlb_dma_mapping_error,
 	.alloc = xen_swiotlb_alloc_coherent,
 	.free = xen_swiotlb_free_coherent,
 	.sync_single_for_cpu = xen_swiotlb_sync_single_for_cpu,
diff --git a/arch/x86/xen/pci-swiotlb-xen.c b/arch/x86/xen/pci-swiotlb-xen.c
index 0e98e5d..a9fafb5 100644
--- a/arch/x86/xen/pci-swiotlb-xen.c
+++ b/arch/x86/xen/pci-swiotlb-xen.c
@@ -19,7 +19,6 @@
 int xen_swiotlb __read_mostly;
 
 static struct dma_map_ops xen_swiotlb_dma_ops = {
-	.mapping_error = xen_swiotlb_dma_mapping_error,
 	.alloc = xen_swiotlb_alloc_coherent,
 	.free = xen_swiotlb_free_coherent,
 	.sync_single_for_cpu = xen_swiotlb_sync_single_for_cpu,
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 87e6035..b8014bf 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -416,11 +416,12 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
 	/*
 	 * Ensure that the address returned is DMA'ble
 	 */
-	if (!dma_capable(dev, dev_addr, size)) {
-		swiotlb_tbl_unmap_single(dev, map, size, dir);
-		dev_addr = 0;
-	}
-	return dev_addr;
+	if (dma_capable(dev, dev_addr, size))
+		return dev_addr;
+
+	swiotlb_tbl_unmap_single(dev, map, size, dir);
+
+	return DMA_ERROR_CODE;
 }
 EXPORT_SYMBOL_GPL(xen_swiotlb_map_page);
 
@@ -648,13 +649,6 @@ void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
 }
 EXPORT_SYMBOL_GPL(xen_swiotlb_sync_sg_for_device);
 
-int
-xen_swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr)
-{
-	return !dma_addr;
-}
-EXPORT_SYMBOL_GPL(xen_swiotlb_dma_mapping_error);
-
 /*
  * Return whether the given device DMA address mask can be supported
  * properly.  For example, if your device can only drive the low 24-bits
diff --git a/include/xen/swiotlb-xen.h b/include/xen/swiotlb-xen.h
index 7c35e27..a0083be 100644
--- a/include/xen/swiotlb-xen.h
+++ b/include/xen/swiotlb-xen.h
@@ -51,9 +51,6 @@ extern void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
 			       int nelems, enum dma_data_direction dir);
 
 extern int
-xen_swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr);
-
-extern int
 xen_swiotlb_dma_supported(struct device *hwdev, u64 mask);
 
 extern int


* [net-next PATCH 03/27] swiotlb: Add support for DMA_ATTR_SKIP_CPU_SYNC
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
  2016-10-25 15:36 ` [net-next PATCH 01/27] swiotlb: Drop unused function swiotlb_map_sg Alexander Duyck
  2016-10-25 15:36 ` [net-next PATCH 02/27] swiotlb-xen: Enforce return of DMA_ERROR_CODE in mapping function Alexander Duyck
@ 2016-10-25 15:37 ` Alexander Duyck
  2016-10-28 17:34   ` Konrad Rzeszutek Wilk
  2016-10-25 15:37 ` [net-next PATCH 04/27] arch/arc: Add option to skip sync on DMA mapping Alexander Duyck
                   ` (25 subsequent siblings)
  28 siblings, 1 reply; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:37 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: brouer, davem, Konrad Rzeszutek Wilk

As a first step to making DMA_ATTR_SKIP_CPU_SYNC apply to architectures
beyond just ARM I need to make it so that the swiotlb will respect the
flag.  In order to do that I also need to update the swiotlb-xen since it
makes heavy use of that functionality.
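
For readers unfamiliar with bounce buffers, the following userspace sketch
models what "respecting the flag" means: the copy between the original
buffer and the bounce buffer is skipped when the attribute is set.  All
mock_* names and the attribute value are hypothetical; this is a simplified
model, not the swiotlb implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for DMA_ATTR_SKIP_CPU_SYNC (value is an assumption). */
#define MOCK_ATTR_SKIP_CPU_SYNC (1UL << 0)

enum mock_dir { MOCK_BIDIRECTIONAL, MOCK_TO_DEVICE, MOCK_FROM_DEVICE };

/* Map: copy the original data into the bounce buffer for device-bound
 * transfers, unless the caller asked to skip the sync. */
static void mock_bounce_map(char *bounce, const char *orig, size_t size,
			    enum mock_dir dir, unsigned long attrs)
{
	if (!(attrs & MOCK_ATTR_SKIP_CPU_SYNC) &&
	    (dir == MOCK_TO_DEVICE || dir == MOCK_BIDIRECTIONAL))
		memcpy(bounce, orig, size);
}

/* Unmap: copy device-written data back to the original buffer, unless the
 * caller asked to skip the sync. */
static void mock_bounce_unmap(char *orig, const char *bounce, size_t size,
			      enum mock_dir dir, unsigned long attrs)
{
	if (!(attrs & MOCK_ATTR_SKIP_CPU_SYNC) &&
	    (dir == MOCK_FROM_DEVICE || dir == MOCK_BIDIRECTIONAL))
		memcpy(orig, bounce, size);
}
```

A caller that passes the attribute is then responsible for any copy it
still needs, which in the real API would be done through the explicit sync
calls.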

Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/xen/swiotlb-xen.c |   11 +++++++---
 include/linux/swiotlb.h   |    6 ++++--
 lib/swiotlb.c             |   48 +++++++++++++++++++++++++++------------------
 3 files changed, 40 insertions(+), 25 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index b8014bf..3d048af 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -405,7 +405,8 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
 	 */
 	trace_swiotlb_bounced(dev, dev_addr, size, swiotlb_force);
 
-	map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir);
+	map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir,
+				     attrs);
 	if (map == SWIOTLB_MAP_ERROR)
 		return DMA_ERROR_CODE;
 
@@ -419,7 +420,8 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
 	if (dma_capable(dev, dev_addr, size))
 		return dev_addr;
 
-	swiotlb_tbl_unmap_single(dev, map, size, dir);
+	swiotlb_tbl_unmap_single(dev, map, size, dir,
+				 attrs | DMA_ATTR_SKIP_CPU_SYNC);
 
 	return DMA_ERROR_CODE;
 }
@@ -445,7 +447,7 @@ static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr,
 
 	/* NOTE: We use dev_addr here, not paddr! */
 	if (is_xen_swiotlb_buffer(dev_addr)) {
-		swiotlb_tbl_unmap_single(hwdev, paddr, size, dir);
+		swiotlb_tbl_unmap_single(hwdev, paddr, size, dir, attrs);
 		return;
 	}
 
@@ -558,11 +560,12 @@ void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
 								 start_dma_addr,
 								 sg_phys(sg),
 								 sg->length,
-								 dir);
+								 dir, attrs);
 			if (map == SWIOTLB_MAP_ERROR) {
 				dev_warn(hwdev, "swiotlb buffer is full\n");
 				/* Don't panic here, we expect map_sg users
 				   to do proper error handling. */
+				attrs |= DMA_ATTR_SKIP_CPU_SYNC;
 				xen_swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir,
 							   attrs);
 				sg_dma_len(sgl) = 0;
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index e237b6f..4517be9 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -44,11 +44,13 @@ enum dma_sync_target {
 extern phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
 					  dma_addr_t tbl_dma_addr,
 					  phys_addr_t phys, size_t size,
-					  enum dma_data_direction dir);
+					  enum dma_data_direction dir,
+					  unsigned long attrs);
 
 extern void swiotlb_tbl_unmap_single(struct device *hwdev,
 				     phys_addr_t tlb_addr,
-				     size_t size, enum dma_data_direction dir);
+				     size_t size, enum dma_data_direction dir,
+				     unsigned long attrs);
 
 extern void swiotlb_tbl_sync_single(struct device *hwdev,
 				    phys_addr_t tlb_addr,
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 47aad37..b538d39 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -425,7 +425,8 @@ static void swiotlb_bounce(phys_addr_t orig_addr, phys_addr_t tlb_addr,
 phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
 				   dma_addr_t tbl_dma_addr,
 				   phys_addr_t orig_addr, size_t size,
-				   enum dma_data_direction dir)
+				   enum dma_data_direction dir,
+				   unsigned long attrs)
 {
 	unsigned long flags;
 	phys_addr_t tlb_addr;
@@ -526,7 +527,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
 	 */
 	for (i = 0; i < nslots; i++)
 		io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT);
-	if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
+	    (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
 		swiotlb_bounce(orig_addr, tlb_addr, size, DMA_TO_DEVICE);
 
 	return tlb_addr;
@@ -539,18 +541,20 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
 
 static phys_addr_t
 map_single(struct device *hwdev, phys_addr_t phys, size_t size,
-	   enum dma_data_direction dir)
+	   enum dma_data_direction dir, unsigned long attrs)
 {
 	dma_addr_t start_dma_addr = phys_to_dma(hwdev, io_tlb_start);
 
-	return swiotlb_tbl_map_single(hwdev, start_dma_addr, phys, size, dir);
+	return swiotlb_tbl_map_single(hwdev, start_dma_addr, phys, size,
+				      dir, attrs);
 }
 
 /*
  * dma_addr is the kernel virtual address of the bounce buffer to unmap.
  */
 void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
-			      size_t size, enum dma_data_direction dir)
+			      size_t size, enum dma_data_direction dir,
+			      unsigned long attrs)
 {
 	unsigned long flags;
 	int i, count, nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
@@ -561,6 +565,7 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
 	 * First, sync the memory before unmapping the entry
 	 */
 	if (orig_addr != INVALID_PHYS_ADDR &&
+	    !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
 	    ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL)))
 		swiotlb_bounce(orig_addr, tlb_addr, size, DMA_FROM_DEVICE);
 
@@ -654,7 +659,8 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
 		 * GFP_DMA memory; fall back on map_single(), which
 		 * will grab memory from the lowest available address range.
 		 */
-		phys_addr_t paddr = map_single(hwdev, 0, size, DMA_FROM_DEVICE);
+		phys_addr_t paddr = map_single(hwdev, 0, size,
+					       DMA_FROM_DEVICE, 0);
 		if (paddr == SWIOTLB_MAP_ERROR)
 			goto err_warn;
 
@@ -669,7 +675,8 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
 
 			/* DMA_TO_DEVICE to avoid memcpy in unmap_single */
 			swiotlb_tbl_unmap_single(hwdev, paddr,
-						 size, DMA_TO_DEVICE);
+						 size, DMA_TO_DEVICE,
+						 DMA_ATTR_SKIP_CPU_SYNC);
 			goto err_warn;
 		}
 	}
@@ -699,7 +706,7 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
 		free_pages((unsigned long)vaddr, get_order(size));
 	else
 		/* DMA_TO_DEVICE to avoid memcpy in swiotlb_tbl_unmap_single */
-		swiotlb_tbl_unmap_single(hwdev, paddr, size, DMA_TO_DEVICE);
+		swiotlb_tbl_unmap_single(hwdev, paddr, size, DMA_TO_DEVICE, 0);
 }
 EXPORT_SYMBOL(swiotlb_free_coherent);
 
@@ -755,7 +762,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
 	trace_swiotlb_bounced(dev, dev_addr, size, swiotlb_force);
 
 	/* Oh well, have to allocate and map a bounce buffer. */
-	map = map_single(dev, phys, size, dir);
+	map = map_single(dev, phys, size, dir, attrs);
 	if (map == SWIOTLB_MAP_ERROR) {
 		swiotlb_full(dev, size, dir, 1);
 		return phys_to_dma(dev, io_tlb_overflow_buffer);
@@ -764,12 +771,13 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
 	dev_addr = phys_to_dma(dev, map);
 
 	/* Ensure that the address returned is DMA'ble */
-	if (!dma_capable(dev, dev_addr, size)) {
-		swiotlb_tbl_unmap_single(dev, map, size, dir);
-		return phys_to_dma(dev, io_tlb_overflow_buffer);
-	}
+	if (dma_capable(dev, dev_addr, size))
+		return dev_addr;
+
+	swiotlb_tbl_unmap_single(dev, map, size, dir,
+				 attrs | DMA_ATTR_SKIP_CPU_SYNC);
 
-	return dev_addr;
+	return phys_to_dma(dev, io_tlb_overflow_buffer);
 }
 EXPORT_SYMBOL_GPL(swiotlb_map_page);
 
@@ -782,14 +790,15 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
  * whatever the device wrote there.
  */
 static void unmap_single(struct device *hwdev, dma_addr_t dev_addr,
-			 size_t size, enum dma_data_direction dir)
+			 size_t size, enum dma_data_direction dir,
+			 unsigned long attrs)
 {
 	phys_addr_t paddr = dma_to_phys(hwdev, dev_addr);
 
 	BUG_ON(dir == DMA_NONE);
 
 	if (is_swiotlb_buffer(paddr)) {
-		swiotlb_tbl_unmap_single(hwdev, paddr, size, dir);
+		swiotlb_tbl_unmap_single(hwdev, paddr, size, dir, attrs);
 		return;
 	}
 
@@ -809,7 +818,7 @@ void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
 			size_t size, enum dma_data_direction dir,
 			unsigned long attrs)
 {
-	unmap_single(hwdev, dev_addr, size, dir);
+	unmap_single(hwdev, dev_addr, size, dir, attrs);
 }
 EXPORT_SYMBOL_GPL(swiotlb_unmap_page);
 
@@ -891,7 +900,7 @@ void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
 		if (swiotlb_force ||
 		    !dma_capable(hwdev, dev_addr, sg->length)) {
 			phys_addr_t map = map_single(hwdev, sg_phys(sg),
-						     sg->length, dir);
+						     sg->length, dir, attrs);
 			if (map == SWIOTLB_MAP_ERROR) {
 				/* Don't panic here, we expect map_sg users
 				   to do proper error handling. */
@@ -925,7 +934,8 @@ void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
 	BUG_ON(dir == DMA_NONE);
 
 	for_each_sg(sgl, sg, nelems, i)
-		unmap_single(hwdev, sg->dma_address, sg_dma_len(sg), dir);
+		unmap_single(hwdev, sg->dma_address, sg_dma_len(sg), dir,
+			     attrs);
 
 }
 EXPORT_SYMBOL(swiotlb_unmap_sg_attrs);


* [net-next PATCH 04/27] arch/arc: Add option to skip sync on DMA mapping
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (2 preceding siblings ...)
  2016-10-25 15:37 ` [net-next PATCH 03/27] swiotlb: Add support for DMA_ATTR_SKIP_CPU_SYNC Alexander Duyck
@ 2016-10-25 15:37 ` Alexander Duyck
  2016-10-25 22:00   ` Vineet Gupta
  2016-10-25 15:37 ` [net-next PATCH 05/27] arch/arm: Add option to skip sync on DMA map and unmap Alexander Duyck
                   ` (24 subsequent siblings)
  28 siblings, 1 reply; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:37 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: Vineet Gupta, linux-snps-arc, davem, brouer

This change lets us pass DMA_ATTR_SKIP_CPU_SYNC so that we can avoid
invoking cache line invalidation if the driver will just handle it later
via a sync_for_cpu or sync_for_device call.

Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: linux-snps-arc@lists.infradead.org
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/arc/mm/dma.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c
index 20afc65..6303c34 100644
--- a/arch/arc/mm/dma.c
+++ b/arch/arc/mm/dma.c
@@ -133,7 +133,10 @@ static dma_addr_t arc_dma_map_page(struct device *dev, struct page *page,
 		unsigned long attrs)
 {
 	phys_addr_t paddr = page_to_phys(page) + offset;
-	_dma_cache_sync(paddr, size, dir);
+
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		_dma_cache_sync(paddr, size, dir);
+
 	return plat_phys_to_dma(dev, paddr);
 }
 


* [net-next PATCH 05/27] arch/arm: Add option to skip sync on DMA map and unmap
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (3 preceding siblings ...)
  2016-10-25 15:37 ` [net-next PATCH 04/27] arch/arc: Add option to skip sync on DMA mapping Alexander Duyck
@ 2016-10-25 15:37 ` Alexander Duyck
  2016-10-25 15:37 ` [net-next PATCH 06/27] arch/avr32: Add option to skip sync on DMA map Alexander Duyck
                   ` (23 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:37 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: brouer, Russell King, davem

The use of DMA_ATTR_SKIP_CPU_SYNC was not consistent across all of the DMA
APIs in the arch/arm folder.  This change is meant to correct that so that
we get consistent behavior.

Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/arm/common/dmabounce.c |   16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c
index 3012816..75055df 100644
--- a/arch/arm/common/dmabounce.c
+++ b/arch/arm/common/dmabounce.c
@@ -243,7 +243,8 @@ static int needs_bounce(struct device *dev, dma_addr_t dma_addr, size_t size)
 }
 
 static inline dma_addr_t map_single(struct device *dev, void *ptr, size_t size,
-		enum dma_data_direction dir)
+				    enum dma_data_direction dir,
+				    unsigned long attrs)
 {
 	struct dmabounce_device_info *device_info = dev->archdata.dmabounce;
 	struct safe_buffer *buf;
@@ -262,7 +263,8 @@ static inline dma_addr_t map_single(struct device *dev, void *ptr, size_t size,
 		__func__, buf->ptr, virt_to_dma(dev, buf->ptr),
 		buf->safe, buf->safe_dma_addr);
 
-	if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) {
+	if ((dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) &&
+	    !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
 		dev_dbg(dev, "%s: copy unsafe %p to safe %p, size %d\n",
 			__func__, ptr, buf->safe, size);
 		memcpy(buf->safe, ptr, size);
@@ -272,7 +274,8 @@ static inline dma_addr_t map_single(struct device *dev, void *ptr, size_t size,
 }
 
 static inline void unmap_single(struct device *dev, struct safe_buffer *buf,
-		size_t size, enum dma_data_direction dir)
+				size_t size, enum dma_data_direction dir,
+				unsigned long attrs)
 {
 	BUG_ON(buf->size != size);
 	BUG_ON(buf->direction != dir);
@@ -283,7 +286,8 @@ static inline void unmap_single(struct device *dev, struct safe_buffer *buf,
 
 	DO_STATS(dev->archdata.dmabounce->bounce_count++);
 
-	if (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL) {
+	if ((dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL) &&
+	    !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
 		void *ptr = buf->ptr;
 
 		dev_dbg(dev, "%s: copy back safe %p to unsafe %p size %d\n",
@@ -334,7 +338,7 @@ static dma_addr_t dmabounce_map_page(struct device *dev, struct page *page,
 		return DMA_ERROR_CODE;
 	}
 
-	return map_single(dev, page_address(page) + offset, size, dir);
+	return map_single(dev, page_address(page) + offset, size, dir, attrs);
 }
 
 /*
@@ -357,7 +361,7 @@ static void dmabounce_unmap_page(struct device *dev, dma_addr_t dma_addr, size_t
 		return;
 	}
 
-	unmap_single(dev, buf, size, dir);
+	unmap_single(dev, buf, size, dir, attrs);
 }
 
 static int __dmabounce_sync_for_cpu(struct device *dev, dma_addr_t addr,


* [net-next PATCH 06/27] arch/avr32: Add option to skip sync on DMA map
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (4 preceding siblings ...)
  2016-10-25 15:37 ` [net-next PATCH 05/27] arch/arm: Add option to skip sync on DMA map and unmap Alexander Duyck
@ 2016-10-25 15:37 ` Alexander Duyck
  2016-10-25 15:37 ` [net-next PATCH 07/27] arch/blackfin: " Alexander Duyck
                   ` (22 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:37 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: brouer, davem, Hans-Christian Noren Egtvedt

This change lets us pass DMA_ATTR_SKIP_CPU_SYNC so that we can avoid
invoking cache line invalidation if the driver will just handle it later
via a sync_for_cpu or sync_for_device call.

Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/avr32/mm/dma-coherent.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/avr32/mm/dma-coherent.c b/arch/avr32/mm/dma-coherent.c
index 58610d0..54534e5 100644
--- a/arch/avr32/mm/dma-coherent.c
+++ b/arch/avr32/mm/dma-coherent.c
@@ -146,7 +146,8 @@ static dma_addr_t avr32_dma_map_page(struct device *dev, struct page *page,
 {
 	void *cpu_addr = page_address(page) + offset;
 
-	dma_cache_sync(dev, cpu_addr, size, direction);
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		dma_cache_sync(dev, cpu_addr, size, direction);
 	return virt_to_bus(cpu_addr);
 }
 
@@ -162,6 +163,10 @@ static int avr32_dma_map_sg(struct device *dev, struct scatterlist *sglist,
 
 		sg->dma_address = page_to_bus(sg_page(sg)) + sg->offset;
 		virt = sg_virt(sg);
+
+		if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+			continue;
+
 		dma_cache_sync(dev, virt, sg->length, direction);
 	}
 


* [net-next PATCH 07/27] arch/blackfin: Add option to skip sync on DMA map
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (5 preceding siblings ...)
  2016-10-25 15:37 ` [net-next PATCH 06/27] arch/avr32: Add option to skip sync on DMA map Alexander Duyck
@ 2016-10-25 15:37 ` Alexander Duyck
  2016-10-25 15:37 ` [net-next PATCH 08/27] arch/c6x: Add option to skip sync on DMA map and unmap Alexander Duyck
                   ` (21 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:37 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: brouer, davem, Steven Miao

This change lets us pass DMA_ATTR_SKIP_CPU_SYNC so that we can avoid
invoking cache line invalidation if the driver will just handle it later
via a sync_for_cpu or sync_for_device call.

Cc: Steven Miao <realmz6@gmail.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/blackfin/kernel/dma-mapping.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/blackfin/kernel/dma-mapping.c b/arch/blackfin/kernel/dma-mapping.c
index 53fbbb6..a27a74a 100644
--- a/arch/blackfin/kernel/dma-mapping.c
+++ b/arch/blackfin/kernel/dma-mapping.c
@@ -118,6 +118,10 @@ static int bfin_dma_map_sg(struct device *dev, struct scatterlist *sg_list,
 
 	for_each_sg(sg_list, sg, nents, i) {
 		sg->dma_address = (dma_addr_t) sg_virt(sg);
+
+		if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+			continue;
+
 		__dma_sync(sg_dma_address(sg), sg_dma_len(sg), direction);
 	}
 
@@ -143,7 +147,9 @@ static dma_addr_t bfin_dma_map_page(struct device *dev, struct page *page,
 {
 	dma_addr_t handle = (dma_addr_t)(page_address(page) + offset);
 
-	_dma_sync(handle, size, dir);
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		_dma_sync(handle, size, dir);
+
 	return handle;
 }
 


* [net-next PATCH 08/27] arch/c6x: Add option to skip sync on DMA map and unmap
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (6 preceding siblings ...)
  2016-10-25 15:37 ` [net-next PATCH 07/27] arch/blackfin: " Alexander Duyck
@ 2016-10-25 15:37 ` Alexander Duyck
  2016-10-25 15:37 ` [net-next PATCH 09/27] arch/frv: Add option to skip sync on DMA map Alexander Duyck
                   ` (20 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:37 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: brouer, Mark Salter, davem, Aurelien Jacquiot

This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC so that we can avoid
invoking cache line invalidation when the driver will handle it later via
a sync_for_cpu or sync_for_device call.
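
As a rough illustration of the pattern, here is a minimal userspace sketch
of a map/unmap pair that honours a skip-sync flag.  Everything prefixed
with mock_ or MOCK_ is a hypothetical stand-in and not kernel code; the
flag value is illustrative only, and the sync helper merely counts calls
in place of real cache maintenance.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's DMA_ATTR_SKIP_CPU_SYNC flag. */
#define MOCK_ATTR_SKIP_CPU_SYNC (1UL << 5)

enum mock_dir { MOCK_TO_DEVICE, MOCK_FROM_DEVICE };

/* Counts simulated cache-maintenance operations. */
static unsigned int mock_sync_calls;

/* Stand-in for an arch helper such as c6x_dma_sync(); only counts calls. */
static void mock_dma_sync(uintptr_t handle, size_t size, enum mock_dir dir)
{
	(void)handle;
	(void)size;
	(void)dir;
	mock_sync_calls++;
}

/* Map: always compute the handle, sync only when the flag is clear. */
static uintptr_t mock_map_page(void *addr, size_t size, enum mock_dir dir,
			       unsigned long attrs)
{
	uintptr_t handle = (uintptr_t)addr;

	assert(size != 0);

	if (!(attrs & MOCK_ATTR_SKIP_CPU_SYNC))
		mock_dma_sync(handle, size, dir);

	return handle;
}

/* Unmap: the same test guards the sync on the way out. */
static void mock_unmap_page(uintptr_t handle, size_t size, enum mock_dir dir,
			    unsigned long attrs)
{
	if (!(attrs & MOCK_ATTR_SKIP_CPU_SYNC))
		mock_dma_sync(handle, size, dir);
}

/* Run one map/unmap cycle and report how many syncs it triggered. */
static unsigned int mock_cycle_sync_count(unsigned long attrs)
{
	static char buf[64];
	uintptr_t handle;

	mock_sync_calls = 0;
	handle = mock_map_page(buf, sizeof(buf), MOCK_FROM_DEVICE, attrs);
	mock_unmap_page(handle, sizeof(buf), MOCK_FROM_DEVICE, attrs);

	return mock_sync_calls;
}
```

Calling mock_cycle_sync_count(0) exercises both syncs, while passing
MOCK_ATTR_SKIP_CPU_SYNC leaves all cache maintenance to the caller.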

Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/c6x/kernel/dma.c |   14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/c6x/kernel/dma.c b/arch/c6x/kernel/dma.c
index db4a6a3..6752df3 100644
--- a/arch/c6x/kernel/dma.c
+++ b/arch/c6x/kernel/dma.c
@@ -42,14 +42,17 @@ static dma_addr_t c6x_dma_map_page(struct device *dev, struct page *page,
 {
 	dma_addr_t handle = virt_to_phys(page_address(page) + offset);
 
-	c6x_dma_sync(handle, size, dir);
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		c6x_dma_sync(handle, size, dir);
+
 	return handle;
 }
 
 static void c6x_dma_unmap_page(struct device *dev, dma_addr_t handle,
 		size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
-	c6x_dma_sync(handle, size, dir);
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		c6x_dma_sync(handle, size, dir);
 }
 
 static int c6x_dma_map_sg(struct device *dev, struct scatterlist *sglist,
@@ -60,7 +63,8 @@ static int c6x_dma_map_sg(struct device *dev, struct scatterlist *sglist,
 
 	for_each_sg(sglist, sg, nents, i) {
 		sg->dma_address = sg_phys(sg);
-		c6x_dma_sync(sg->dma_address, sg->length, dir);
+		if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+			c6x_dma_sync(sg->dma_address, sg->length, dir);
 	}
 
 	return nents;
@@ -72,9 +76,11 @@ static void c6x_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
 	struct scatterlist *sg;
 	int i;
 
+	if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+		return;
+
 	for_each_sg(sglist, sg, nents, i)
 		c6x_dma_sync(sg_dma_address(sg), sg->length, dir);
-
 }
 
 static void c6x_dma_sync_single_for_cpu(struct device *dev, dma_addr_t handle,

* [net-next PATCH 09/27] arch/frv: Add option to skip sync on DMA map
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (7 preceding siblings ...)
  2016-10-25 15:37 ` [net-next PATCH 08/27] arch/c6x: Add option to skip sync on DMA map and unmap Alexander Duyck
@ 2016-10-25 15:37 ` Alexander Duyck
  2016-10-25 15:37 ` [net-next PATCH 10/27] arch/hexagon: Add option to skip DMA sync as a part of mapping Alexander Duyck
                   ` (19 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:37 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm; +Cc: davem, brouer

This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC so that we can avoid
invoking cache line invalidation when the driver will handle it later via
a sync_for_cpu or sync_for_device call.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/frv/mb93090-mb00/pci-dma-nommu.c |   14 ++++++++++----
 arch/frv/mb93090-mb00/pci-dma.c       |    9 +++++++--
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/arch/frv/mb93090-mb00/pci-dma-nommu.c b/arch/frv/mb93090-mb00/pci-dma-nommu.c
index 90f2e4c..1876881 100644
--- a/arch/frv/mb93090-mb00/pci-dma-nommu.c
+++ b/arch/frv/mb93090-mb00/pci-dma-nommu.c
@@ -109,16 +109,19 @@ static int frv_dma_map_sg(struct device *dev, struct scatterlist *sglist,
 		int nents, enum dma_data_direction direction,
 		unsigned long attrs)
 {
-	int i;
 	struct scatterlist *sg;
+	int i;
+
+	BUG_ON(direction == DMA_NONE);
+
+	if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+		return nents;
 
 	for_each_sg(sglist, sg, nents, i) {
 		frv_cache_wback_inv(sg_dma_address(sg),
 				    sg_dma_address(sg) + sg_dma_len(sg));
 	}
 
-	BUG_ON(direction == DMA_NONE);
-
 	return nents;
 }
 
@@ -127,7 +130,10 @@ static dma_addr_t frv_dma_map_page(struct device *dev, struct page *page,
 		enum dma_data_direction direction, unsigned long attrs)
 {
 	BUG_ON(direction == DMA_NONE);
-	flush_dcache_page(page);
+
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		flush_dcache_page(page);
+
 	return (dma_addr_t) page_to_phys(page) + offset;
 }
 
diff --git a/arch/frv/mb93090-mb00/pci-dma.c b/arch/frv/mb93090-mb00/pci-dma.c
index f585745..dba7df9 100644
--- a/arch/frv/mb93090-mb00/pci-dma.c
+++ b/arch/frv/mb93090-mb00/pci-dma.c
@@ -40,13 +40,16 @@ static int frv_dma_map_sg(struct device *dev, struct scatterlist *sglist,
 		int nents, enum dma_data_direction direction,
 		unsigned long attrs)
 {
+	struct scatterlist *sg;
 	unsigned long dampr2;
 	void *vaddr;
 	int i;
-	struct scatterlist *sg;
 
 	BUG_ON(direction == DMA_NONE);
 
+	if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+		return nents;
+
 	dampr2 = __get_DAMPR(2);
 
 	for_each_sg(sglist, sg, nents, i) {
@@ -70,7 +73,9 @@ static dma_addr_t frv_dma_map_page(struct device *dev, struct page *page,
 		unsigned long offset, size_t size,
 		enum dma_data_direction direction, unsigned long attrs)
 {
-	flush_dcache_page(page);
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		flush_dcache_page(page);
+
 	return (dma_addr_t) page_to_phys(page) + offset;
 }
 

* [net-next PATCH 10/27] arch/hexagon: Add option to skip DMA sync as a part of mapping
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (8 preceding siblings ...)
  2016-10-25 15:37 ` [net-next PATCH 09/27] arch/frv: Add option to skip sync on DMA map Alexander Duyck
@ 2016-10-25 15:37 ` Alexander Duyck
  2016-10-25 15:37 ` [net-next PATCH 11/27] arch/m68k: " Alexander Duyck
                   ` (18 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:37 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: linux-hexagon, brouer, davem, Richard Kuo

This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC so that we can avoid
invoking cache line invalidation when the driver will handle it later via
a sync_for_cpu or sync_for_device call.
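
To show why deferring the sync to the driver can pay off, here is a small
userspace sketch, assuming a driver that maps a whole page but only reads
a short received frame.  The demo_ names, the 4096-byte page size and the
flag value are hypothetical; the "sync" just accumulates a byte count
rather than touching any cache.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's DMA_ATTR_SKIP_CPU_SYNC flag. */
#define DEMO_ATTR_SKIP_CPU_SYNC (1UL << 5)
#define DEMO_PAGE_SIZE 4096u

/* Total number of bytes covered by simulated cache maintenance. */
static size_t demo_bytes_synced;

/* Stand-in for an arch cache operation over [offset, offset + len). */
static void demo_cache_sync(size_t offset, size_t len)
{
	assert(offset + len <= DEMO_PAGE_SIZE);
	demo_bytes_synced += len;
}

/* Mapping syncs the whole page unless the caller asked to skip it. */
static void demo_map_page(unsigned long attrs)
{
	if (!(attrs & DEMO_ATTR_SKIP_CPU_SYNC))
		demo_cache_sync(0, DEMO_PAGE_SIZE);
}

/* Driver-side explicit sync of just the region it is about to read. */
static void demo_sync_range_for_cpu(size_t offset, size_t len)
{
	demo_cache_sync(offset, len);
}

/* Map a page, then sync only the received frame; return bytes synced. */
static size_t demo_receive(unsigned long attrs, size_t frame_len)
{
	demo_bytes_synced = 0;
	demo_map_page(attrs);
	demo_sync_range_for_cpu(0, frame_len);

	return demo_bytes_synced;
}
```

With the flag set, a 128-byte frame costs 128 bytes of simulated cache
work in this model instead of a full page plus the frame.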

Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: linux-hexagon@vger.kernel.org
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/hexagon/kernel/dma.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/hexagon/kernel/dma.c b/arch/hexagon/kernel/dma.c
index b901778..dbc4f10 100644
--- a/arch/hexagon/kernel/dma.c
+++ b/arch/hexagon/kernel/dma.c
@@ -119,6 +119,9 @@ static int hexagon_map_sg(struct device *hwdev, struct scatterlist *sg,
 
 		s->dma_length = s->length;
 
+		if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+			continue;
+
 		flush_dcache_range(dma_addr_to_virt(s->dma_address),
 				   dma_addr_to_virt(s->dma_address + s->length));
 	}
@@ -180,7 +183,8 @@ static dma_addr_t hexagon_map_page(struct device *dev, struct page *page,
 	if (!check_addr("map_single", dev, bus, size))
 		return bad_dma_address;
 
-	dma_sync(dma_addr_to_virt(bus), size, dir);
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		dma_sync(dma_addr_to_virt(bus), size, dir);
 
 	return bus;
 }

* [net-next PATCH 11/27] arch/m68k: Add option to skip DMA sync as a part of mapping
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (9 preceding siblings ...)
  2016-10-25 15:37 ` [net-next PATCH 10/27] arch/hexagon: Add option to skip DMA sync as a part of mapping Alexander Duyck
@ 2016-10-25 15:37 ` Alexander Duyck
  2016-10-25 15:37 ` [net-next PATCH 12/27] arch/metag: Add option to skip DMA sync as a part of map and unmap Alexander Duyck
                   ` (17 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:37 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: linux-m68k, Geert Uytterhoeven, davem, brouer

This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC so that we can avoid
invoking cache line invalidation when the driver will handle it later via
a sync_for_cpu or sync_for_device call.

Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: linux-m68k@lists.linux-m68k.org
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/m68k/kernel/dma.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/m68k/kernel/dma.c b/arch/m68k/kernel/dma.c
index 8cf97cb..0707006 100644
--- a/arch/m68k/kernel/dma.c
+++ b/arch/m68k/kernel/dma.c
@@ -134,7 +134,9 @@ static dma_addr_t m68k_dma_map_page(struct device *dev, struct page *page,
 {
 	dma_addr_t handle = page_to_phys(page) + offset;
 
-	dma_sync_single_for_device(dev, handle, size, dir);
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		dma_sync_single_for_device(dev, handle, size, dir);
+
 	return handle;
 }
 
@@ -146,6 +148,10 @@ static int m68k_dma_map_sg(struct device *dev, struct scatterlist *sglist,
 
 	for_each_sg(sglist, sg, nents, i) {
 		sg->dma_address = sg_phys(sg);
+
+		if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+			continue;
+
 		dma_sync_single_for_device(dev, sg->dma_address, sg->length,
 					   dir);
 	}

* [net-next PATCH 12/27] arch/metag: Add option to skip DMA sync as a part of map and unmap
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (10 preceding siblings ...)
  2016-10-25 15:37 ` [net-next PATCH 11/27] arch/m68k: " Alexander Duyck
@ 2016-10-25 15:37 ` Alexander Duyck
  2016-10-25 15:37 ` [net-next PATCH 13/27] arch/microblaze: " Alexander Duyck
                   ` (16 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:37 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: brouer, James Hogan, linux-metag, davem

This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC so that we can avoid
invoking cache line invalidation when the driver will handle it via a
sync_for_cpu or sync_for_device call.

Cc: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/metag/kernel/dma.c |   16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/metag/kernel/dma.c b/arch/metag/kernel/dma.c
index 0db31e2..91968d9 100644
--- a/arch/metag/kernel/dma.c
+++ b/arch/metag/kernel/dma.c
@@ -484,8 +484,9 @@ static dma_addr_t metag_dma_map_page(struct device *dev, struct page *page,
 		unsigned long offset, size_t size,
 		enum dma_data_direction direction, unsigned long attrs)
 {
-	dma_sync_for_device((void *)(page_to_phys(page) + offset), size,
-			    direction);
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		dma_sync_for_device((void *)(page_to_phys(page) + offset),
+				    size, direction);
 	return page_to_phys(page) + offset;
 }
 
@@ -493,7 +494,8 @@ static void metag_dma_unmap_page(struct device *dev, dma_addr_t dma_address,
 		size_t size, enum dma_data_direction direction,
 		unsigned long attrs)
 {
-	dma_sync_for_cpu(phys_to_virt(dma_address), size, direction);
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		dma_sync_for_cpu(phys_to_virt(dma_address), size, direction);
 }
 
 static int metag_dma_map_sg(struct device *dev, struct scatterlist *sglist,
@@ -507,6 +509,10 @@ static int metag_dma_map_sg(struct device *dev, struct scatterlist *sglist,
 		BUG_ON(!sg_page(sg));
 
 		sg->dma_address = sg_phys(sg);
+
+		if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+			continue;
+
 		dma_sync_for_device(sg_virt(sg), sg->length, direction);
 	}
 
@@ -525,6 +531,10 @@ static void metag_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
 		BUG_ON(!sg_page(sg));
 
 		sg->dma_address = sg_phys(sg);
+
+		if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+			continue;
+
 		dma_sync_for_cpu(sg_virt(sg), sg->length, direction);
 	}
 }

* [net-next PATCH 13/27] arch/microblaze: Add option to skip DMA sync as a part of map and unmap
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (11 preceding siblings ...)
  2016-10-25 15:37 ` [net-next PATCH 12/27] arch/metag: Add option to skip DMA sync as a part of map and unmap Alexander Duyck
@ 2016-10-25 15:37 ` Alexander Duyck
  2016-10-25 15:38 ` [net-next PATCH 14/27] arch/mips: " Alexander Duyck
                   ` (15 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:37 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: Michal Simek, davem, brouer

This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC so that we can avoid
invoking cache line invalidation when the driver will handle it via a
sync_for_cpu or sync_for_device call.

Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/microblaze/kernel/dma.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/microblaze/kernel/dma.c b/arch/microblaze/kernel/dma.c
index ec04dc1..818daf2 100644
--- a/arch/microblaze/kernel/dma.c
+++ b/arch/microblaze/kernel/dma.c
@@ -61,6 +61,10 @@ static int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl,
 	/* FIXME this part of code is untested */
 	for_each_sg(sgl, sg, nents, i) {
 		sg->dma_address = sg_phys(sg);
+
+		if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+			continue;
+
 		__dma_sync(page_to_phys(sg_page(sg)) + sg->offset,
 							sg->length, direction);
 	}
@@ -80,7 +84,8 @@ static inline dma_addr_t dma_direct_map_page(struct device *dev,
 					     enum dma_data_direction direction,
 					     unsigned long attrs)
 {
-	__dma_sync(page_to_phys(page) + offset, size, direction);
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		__dma_sync(page_to_phys(page) + offset, size, direction);
 	return page_to_phys(page) + offset;
 }
 
@@ -95,7 +100,8 @@ static inline void dma_direct_unmap_page(struct device *dev,
  * phys_to_virt is here because in __dma_sync_page is __virt_to_phys and
  * dma_address is physical address
  */
-	__dma_sync(dma_address, size, direction);
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		__dma_sync(dma_address, size, direction);
 }
 
 static inline void

* [net-next PATCH 14/27] arch/mips: Add option to skip DMA sync as a part of map and unmap
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (12 preceding siblings ...)
  2016-10-25 15:37 ` [net-next PATCH 13/27] arch/microblaze: " Alexander Duyck
@ 2016-10-25 15:38 ` Alexander Duyck
  2016-10-25 15:38 ` [net-next PATCH 15/27] arch/nios2: " Alexander Duyck
                   ` (14 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:38 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: linux-mips, Keguang Zhang, davem, Ralf Baechle, brouer

This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC so that we can avoid
invoking cache line invalidation when the driver will handle it via a
sync_for_cpu or sync_for_device call.
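
The way the new flag combines with the existing coherency and direction
checks can be sketched as two plain predicates.  This is a userspace
approximation only: the sketch_ names and the flag value are hypothetical,
and the boolean "coherent" argument stands in for a platform query such
as plat_device_is_coherent().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's DMA_ATTR_SKIP_CPU_SYNC flag. */
#define SKETCH_ATTR_SKIP_CPU_SYNC (1UL << 5)

enum sketch_dir {
	SKETCH_BIDIRECTIONAL,
	SKETCH_TO_DEVICE,
	SKETCH_FROM_DEVICE,
	SKETCH_NONE,
};

/* Map path: sync only for non-coherent devices that did not opt out. */
static bool sketch_map_needs_sync(bool coherent, unsigned long attrs)
{
	return !coherent && !(attrs & SKETCH_ATTR_SKIP_CPU_SYNC);
}

/*
 * Scatterlist unmap path: in addition, nothing needs to come back to the
 * CPU when the transfer only went towards the device.
 */
static bool sketch_unmap_sg_needs_sync(bool coherent, unsigned long attrs,
				       enum sketch_dir dir)
{
	assert(dir != SKETCH_NONE);

	return !coherent &&
	       !(attrs & SKETCH_ATTR_SKIP_CPU_SYNC) &&
	       dir != SKETCH_TO_DEVICE;
}
```

Either condition alone is enough to suppress the sync, so a coherent
device behaves the same whether or not the flag is passed.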

Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Keguang Zhang <keguang.zhang@gmail.com>
Cc: linux-mips@linux-mips.org
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/mips/loongson64/common/dma-swiotlb.c |    2 +-
 arch/mips/mm/dma-default.c                |    8 +++++---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/mips/loongson64/common/dma-swiotlb.c b/arch/mips/loongson64/common/dma-swiotlb.c
index 1a80b6f..aab4fd6 100644
--- a/arch/mips/loongson64/common/dma-swiotlb.c
+++ b/arch/mips/loongson64/common/dma-swiotlb.c
@@ -61,7 +61,7 @@ static int loongson_dma_map_sg(struct device *dev, struct scatterlist *sg,
 				int nents, enum dma_data_direction dir,
 				unsigned long attrs)
 {
-	int r = swiotlb_map_sg_attrs(dev, sg, nents, dir, 0);
+	int r = swiotlb_map_sg_attrs(dev, sg, nents, dir, attrs);
 	mb();
 
 	return r;
diff --git a/arch/mips/mm/dma-default.c b/arch/mips/mm/dma-default.c
index b2eadd6..dd998d7 100644
--- a/arch/mips/mm/dma-default.c
+++ b/arch/mips/mm/dma-default.c
@@ -293,7 +293,7 @@ static inline void __dma_sync(struct page *page,
 static void mips_dma_unmap_page(struct device *dev, dma_addr_t dma_addr,
 	size_t size, enum dma_data_direction direction, unsigned long attrs)
 {
-	if (cpu_needs_post_dma_flush(dev))
+	if (cpu_needs_post_dma_flush(dev) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
 		__dma_sync(dma_addr_to_page(dev, dma_addr),
 			   dma_addr & ~PAGE_MASK, size, direction);
 	plat_post_dma_flush(dev);
@@ -307,7 +307,8 @@ static int mips_dma_map_sg(struct device *dev, struct scatterlist *sglist,
 	struct scatterlist *sg;
 
 	for_each_sg(sglist, sg, nents, i) {
-		if (!plat_device_is_coherent(dev))
+		if (!plat_device_is_coherent(dev) &&
+		    !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
 			__dma_sync(sg_page(sg), sg->offset, sg->length,
 				   direction);
 #ifdef CONFIG_NEED_SG_DMA_LENGTH
@@ -324,7 +325,7 @@ static dma_addr_t mips_dma_map_page(struct device *dev, struct page *page,
 	unsigned long offset, size_t size, enum dma_data_direction direction,
 	unsigned long attrs)
 {
-	if (!plat_device_is_coherent(dev))
+	if (!plat_device_is_coherent(dev) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
 		__dma_sync(page, offset, size, direction);
 
 	return plat_map_dma_mem_page(dev, page) + offset;
@@ -339,6 +340,7 @@ static void mips_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
 
 	for_each_sg(sglist, sg, nhwentries, i) {
 		if (!plat_device_is_coherent(dev) &&
+		    !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
 		    direction != DMA_TO_DEVICE)
 			__dma_sync(sg_page(sg), sg->offset, sg->length,
 				   direction);

* [net-next PATCH 15/27] arch/nios2: Add option to skip DMA sync as a part of map and unmap
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (13 preceding siblings ...)
  2016-10-25 15:38 ` [net-next PATCH 14/27] arch/mips: " Alexander Duyck
@ 2016-10-25 15:38 ` Alexander Duyck
  2016-10-25 15:38 ` [net-next PATCH 16/27] arch/openrisc: Add option to skip DMA sync as a part of mapping Alexander Duyck
                   ` (13 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:38 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: Ley Foon Tan, davem, brouer

This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC so that we can avoid
invoking cache line invalidation when the driver will handle it via a
sync_for_cpu or sync_for_device call.
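
For the scatterlist case the important detail is that every entry still
gets its DMA address even when the sync is skipped.  A minimal userspace
sketch of that loop shape follows; the toy_ names, the array-based list
and the flag value are hypothetical simplifications, and the sync helper
only counts calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's DMA_ATTR_SKIP_CPU_SYNC flag. */
#define TOY_ATTR_SKIP_CPU_SYNC (1UL << 5)

/* Simplified scatterlist entry; a real one is page/offset based. */
struct toy_sg {
	void *addr;
	size_t length;
	uintptr_t dma_address;
};

static unsigned int toy_sync_calls;	/* simulated cache operations */
static int toy_last_mapped;		/* entries given an address by toy_demo() */

static void toy_sync_for_device(void *addr, size_t length)
{
	(void)addr;
	(void)length;
	toy_sync_calls++;
}

static int toy_map_sg(struct toy_sg *sgl, int nents, unsigned long attrs)
{
	int i;

	assert(sgl != NULL && nents >= 0);

	for (i = 0; i < nents; i++) {
		struct toy_sg *sg = &sgl[i];

		if (!sg->addr)
			continue;

		/* The address is assigned whether or not we sync. */
		sg->dma_address = (uintptr_t)sg->addr;

		if (attrs & TOY_ATTR_SKIP_CPU_SYNC)
			continue;

		toy_sync_for_device(sg->addr, sg->length);
	}

	return nents;
}

/* Map a three-entry list whose middle entry has no address. */
static unsigned int toy_demo(unsigned long attrs)
{
	static char a[8], b[8];
	struct toy_sg list[3] = {
		{ a, sizeof(a), 0 },
		{ NULL, 0, 0 },
		{ b, sizeof(b), 0 },
	};
	int i;

	toy_sync_calls = 0;
	toy_map_sg(list, 3, attrs);

	toy_last_mapped = 0;
	for (i = 0; i < 3; i++)
		if (list[i].dma_address)
			toy_last_mapped++;

	return toy_sync_calls;
}
```

In both modes two entries end up with an address; only the number of
simulated syncs differs.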

Cc: Ley Foon Tan <lftan@altera.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/nios2/mm/dma-mapping.c |   26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/arch/nios2/mm/dma-mapping.c b/arch/nios2/mm/dma-mapping.c
index d800fad..f6a5dcf 100644
--- a/arch/nios2/mm/dma-mapping.c
+++ b/arch/nios2/mm/dma-mapping.c
@@ -98,13 +98,17 @@ static int nios2_dma_map_sg(struct device *dev, struct scatterlist *sg,
 	int i;
 
 	for_each_sg(sg, sg, nents, i) {
-		void *addr;
+		void *addr = sg_virt(sg);
 
-		addr = sg_virt(sg);
-		if (addr) {
-			__dma_sync_for_device(addr, sg->length, direction);
-			sg->dma_address = sg_phys(sg);
-		}
+		if (!addr)
+			continue;
+
+		sg->dma_address = sg_phys(sg);
+
+		if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+			continue;
+
+		__dma_sync_for_device(addr, sg->length, direction);
 	}
 
 	return nents;
@@ -117,7 +121,9 @@ static dma_addr_t nios2_dma_map_page(struct device *dev, struct page *page,
 {
 	void *addr = page_address(page) + offset;
 
-	__dma_sync_for_device(addr, size, direction);
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		__dma_sync_for_device(addr, size, direction);
+
 	return page_to_phys(page) + offset;
 }
 
@@ -125,7 +131,8 @@ static void nios2_dma_unmap_page(struct device *dev, dma_addr_t dma_address,
 		size_t size, enum dma_data_direction direction,
 		unsigned long attrs)
 {
-	__dma_sync_for_cpu(phys_to_virt(dma_address), size, direction);
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		__dma_sync_for_cpu(phys_to_virt(dma_address), size, direction);
 }
 
 static void nios2_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
@@ -138,6 +145,9 @@ static void nios2_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
 	if (direction == DMA_TO_DEVICE)
 		return;
 
+	if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+		return;
+
 	for_each_sg(sg, sg, nhwentries, i) {
 		addr = sg_virt(sg);
 		if (addr)

* [net-next PATCH 16/27] arch/openrisc: Add option to skip DMA sync as a part of mapping
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (14 preceding siblings ...)
  2016-10-25 15:38 ` [net-next PATCH 15/27] arch/nios2: " Alexander Duyck
@ 2016-10-25 15:38 ` Alexander Duyck
  2016-10-25 15:38 ` [net-next PATCH 17/27] arch/parisc: Add option to skip DMA sync as a part of map and unmap Alexander Duyck
                   ` (12 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:38 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm; +Cc: Jonas Bonn, davem, brouer

This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC so that we can avoid
invoking cache line invalidation when the driver will handle it via a
sync_for_cpu or sync_for_device call.

Cc: Jonas Bonn <jonas@southpole.se>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/openrisc/kernel/dma.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/openrisc/kernel/dma.c b/arch/openrisc/kernel/dma.c
index 140c991..906998b 100644
--- a/arch/openrisc/kernel/dma.c
+++ b/arch/openrisc/kernel/dma.c
@@ -141,6 +141,9 @@
 	unsigned long cl;
 	dma_addr_t addr = page_to_phys(page) + offset;
 
+	if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+		return addr;
+
 	switch (dir) {
 	case DMA_TO_DEVICE:
 		/* Flush the dcache for the requested range */

* [net-next PATCH 17/27] arch/parisc: Add option to skip DMA sync as a part of map and unmap
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (15 preceding siblings ...)
  2016-10-25 15:38 ` [net-next PATCH 16/27] arch/openrisc: Add option to skip DMA sync as a part of mapping Alexander Duyck
@ 2016-10-25 15:38 ` Alexander Duyck
  2016-10-25 15:38 ` [net-next PATCH 18/27] arch/powerpc: Add option to skip DMA sync as a part of mapping Alexander Duyck
                   ` (11 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:38 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: Helge Deller, brouer, James E.J. Bottomley, linux-parisc, davem

This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC so that we can avoid
invoking cache line invalidation when the driver will handle it via a
sync_for_cpu or sync_for_device call.

Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: linux-parisc@vger.kernel.org
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/parisc/kernel/pci-dma.c |   20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c
index 02d9ed0..be55ede 100644
--- a/arch/parisc/kernel/pci-dma.c
+++ b/arch/parisc/kernel/pci-dma.c
@@ -459,7 +459,9 @@ static dma_addr_t pa11_dma_map_page(struct device *dev, struct page *page,
 	void *addr = page_address(page) + offset;
 	BUG_ON(direction == DMA_NONE);
 
-	flush_kernel_dcache_range((unsigned long) addr, size);
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		flush_kernel_dcache_range((unsigned long) addr, size);
+
 	return virt_to_phys(addr);
 }
 
@@ -469,8 +471,11 @@ static void pa11_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
 {
 	BUG_ON(direction == DMA_NONE);
 
+	if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+		return;
+
 	if (direction == DMA_TO_DEVICE)
-	    return;
+		return;
 
 	/*
 	 * For PCI_DMA_FROMDEVICE this flush is not necessary for the
@@ -479,7 +484,6 @@ static void pa11_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
 	 */
 
 	flush_kernel_dcache_range((unsigned long) phys_to_virt(dma_handle), size);
-	return;
 }
 
 static int pa11_dma_map_sg(struct device *dev, struct scatterlist *sglist,
@@ -496,6 +500,10 @@ static int pa11_dma_map_sg(struct device *dev, struct scatterlist *sglist,
 
 		sg_dma_address(sg) = (dma_addr_t) virt_to_phys(vaddr);
 		sg_dma_len(sg) = sg->length;
+
+		if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+			continue;
+
 		flush_kernel_dcache_range(vaddr, sg->length);
 	}
 	return nents;
@@ -510,14 +518,16 @@ static void pa11_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
 
 	BUG_ON(direction == DMA_NONE);
 
+	if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+		return;
+
 	if (direction == DMA_TO_DEVICE)
-	    return;
+		return;
 
 	/* once we do combining we'll need to use phys_to_virt(sg_dma_address(sglist)) */
 
 	for_each_sg(sglist, sg, nents, i)
 		flush_kernel_vmap_range(sg_virt(sg), sg->length);
-	return;
 }
 
 static void pa11_dma_sync_single_for_cpu(struct device *dev,

* [net-next PATCH 18/27] arch/powerpc: Add option to skip DMA sync as a part of mapping
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (16 preceding siblings ...)
  2016-10-25 15:38 ` [net-next PATCH 17/27] arch/parisc: Add option to skip DMA sync as a part of map and unmap Alexander Duyck
@ 2016-10-25 15:38 ` Alexander Duyck
  2016-10-25 15:38 ` [net-next PATCH 19/27] arch/sh: " Alexander Duyck
                   ` (10 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:38 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, brouer,
	linuxppc-dev, davem

This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC so that we can avoid
invoking cache line invalidation when the driver will handle it via a
sync_for_cpu or sync_for_device call.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/powerpc/kernel/dma.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index e64a601..6877e3f 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -203,6 +203,10 @@ static int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl,
 	for_each_sg(sgl, sg, nents, i) {
 		sg->dma_address = sg_phys(sg) + get_dma_offset(dev);
 		sg->dma_length = sg->length;
+
+		if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+			continue;
+
 		__dma_sync_page(sg_page(sg), sg->offset, sg->length, direction);
 	}
 
@@ -235,7 +239,10 @@ static inline dma_addr_t dma_direct_map_page(struct device *dev,
 					     unsigned long attrs)
 {
 	BUG_ON(dir == DMA_NONE);
-	__dma_sync_page(page, offset, size, dir);
+
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		__dma_sync_page(page, offset, size, dir);
+
 	return page_to_phys(page) + offset + get_dma_offset(dev);
 }
 

* [net-next PATCH 19/27] arch/sh: Add option to skip DMA sync as a part of mapping
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (17 preceding siblings ...)
  2016-10-25 15:38 ` [net-next PATCH 18/27] arch/powerpc: Add option to skip DMA sync as a part of mapping Alexander Duyck
@ 2016-10-25 15:38 ` Alexander Duyck
  2016-10-25 15:38 ` [net-next PATCH 20/27] arch/sparc: Add option to skip DMA sync as a part of map and unmap Alexander Duyck
                   ` (9 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:38 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: brouer, Rich Felker, davem, Yoshinori Sato, linux-sh

This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC so that we can avoid
invoking cache line invalidation when the driver will handle it via a
sync_for_cpu or sync_for_device call.

Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: linux-sh@vger.kernel.org
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/sh/kernel/dma-nommu.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/sh/kernel/dma-nommu.c b/arch/sh/kernel/dma-nommu.c
index eadb669..47fee3b 100644
--- a/arch/sh/kernel/dma-nommu.c
+++ b/arch/sh/kernel/dma-nommu.c
@@ -18,7 +18,9 @@ static dma_addr_t nommu_map_page(struct device *dev, struct page *page,
 	dma_addr_t addr = page_to_phys(page) + offset;
 
 	WARN_ON(size == 0);
-	dma_cache_sync(dev, page_address(page) + offset, size, dir);
+
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		dma_cache_sync(dev, page_address(page) + offset, size, dir);
 
 	return addr;
 }
@@ -35,7 +37,8 @@ static int nommu_map_sg(struct device *dev, struct scatterlist *sg,
 	for_each_sg(sg, s, nents, i) {
 		BUG_ON(!sg_page(s));
 
-		dma_cache_sync(dev, sg_virt(s), s->length, dir);
+		if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+			dma_cache_sync(dev, sg_virt(s), s->length, dir);
 
 		s->dma_address = sg_phys(s);
 		s->dma_length = s->length;

* [net-next PATCH 20/27] arch/sparc: Add option to skip DMA sync as a part of map and unmap
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (18 preceding siblings ...)
  2016-10-25 15:38 ` [net-next PATCH 19/27] arch/sh: " Alexander Duyck
@ 2016-10-25 15:38 ` Alexander Duyck
  2016-10-25 15:38 ` [net-next PATCH 21/27] arch/tile: " Alexander Duyck
                   ` (8 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:38 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm; +Cc: sparclinux, davem, brouer

This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC so that we can avoid
invoking cache line invalidation when the driver will handle it via a
sync_for_cpu or sync_for_device call.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/sparc/kernel/iommu.c  |    4 ++--
 arch/sparc/kernel/ioport.c |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
index 5c615ab..8fda4e4 100644
--- a/arch/sparc/kernel/iommu.c
+++ b/arch/sparc/kernel/iommu.c
@@ -415,7 +415,7 @@ static void dma_4u_unmap_page(struct device *dev, dma_addr_t bus_addr,
 		ctx = (iopte_val(*base) & IOPTE_CONTEXT) >> 47UL;
 
 	/* Step 1: Kick data out of streaming buffers if necessary. */
-	if (strbuf->strbuf_enabled)
+	if (strbuf->strbuf_enabled && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
 		strbuf_flush(strbuf, iommu, bus_addr, ctx,
 			     npages, direction);
 
@@ -640,7 +640,7 @@ static void dma_4u_unmap_sg(struct device *dev, struct scatterlist *sglist,
 		base = iommu->page_table + entry;
 
 		dma_handle &= IO_PAGE_MASK;
-		if (strbuf->strbuf_enabled)
+		if (strbuf->strbuf_enabled && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
 			strbuf_flush(strbuf, iommu, dma_handle, ctx,
 				     npages, direction);
 
diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c
index 2344103..6ffaec4 100644
--- a/arch/sparc/kernel/ioport.c
+++ b/arch/sparc/kernel/ioport.c
@@ -527,7 +527,7 @@ static dma_addr_t pci32_map_page(struct device *dev, struct page *page,
 static void pci32_unmap_page(struct device *dev, dma_addr_t ba, size_t size,
 			     enum dma_data_direction dir, unsigned long attrs)
 {
-	if (dir != PCI_DMA_TODEVICE)
+	if (dir != PCI_DMA_TODEVICE && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
 		dma_make_coherent(ba, PAGE_ALIGN(size));
 }
 
@@ -572,7 +572,7 @@ static void pci32_unmap_sg(struct device *dev, struct scatterlist *sgl,
 	struct scatterlist *sg;
 	int n;
 
-	if (dir != PCI_DMA_TODEVICE) {
+	if (dir != PCI_DMA_TODEVICE && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
 		for_each_sg(sgl, sg, nents, n) {
 			dma_make_coherent(sg_phys(sg), PAGE_ALIGN(sg->length));
 		}

* [net-next PATCH 21/27] arch/tile: Add option to skip DMA sync as a part of map and unmap
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (19 preceding siblings ...)
  2016-10-25 15:38 ` [net-next PATCH 20/27] arch/sparc: Add option to skip DMA sync as a part of map and unmap Alexander Duyck
@ 2016-10-25 15:38 ` Alexander Duyck
  2016-10-25 15:38 ` [net-next PATCH 22/27] arch/xtensa: Add option to skip DMA sync as a part of mapping Alexander Duyck
                   ` (7 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:38 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: brouer, Chris Metcalf, davem

This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC so that we can avoid
invoking cache line invalidation when the driver will handle it itself via
a sync_for_cpu or sync_for_device call.

Cc: Chris Metcalf <cmetcalf@mellanox.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/tile/kernel/pci-dma.c |   12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/tile/kernel/pci-dma.c b/arch/tile/kernel/pci-dma.c
index 09bb774..24e0f8c 100644
--- a/arch/tile/kernel/pci-dma.c
+++ b/arch/tile/kernel/pci-dma.c
@@ -213,10 +213,12 @@ static int tile_dma_map_sg(struct device *dev, struct scatterlist *sglist,
 
 	for_each_sg(sglist, sg, nents, i) {
 		sg->dma_address = sg_phys(sg);
-		__dma_prep_pa_range(sg->dma_address, sg->length, direction);
 #ifdef CONFIG_NEED_SG_DMA_LENGTH
 		sg->dma_length = sg->length;
 #endif
+		if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+			continue;
+		__dma_prep_pa_range(sg->dma_address, sg->length, direction);
 	}
 
 	return nents;
@@ -232,6 +234,8 @@ static void tile_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
 	BUG_ON(!valid_dma_direction(direction));
 	for_each_sg(sglist, sg, nents, i) {
 		sg->dma_address = sg_phys(sg);
+		if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+			continue;
 		__dma_complete_pa_range(sg->dma_address, sg->length,
 					direction);
 	}
@@ -245,7 +249,8 @@ static dma_addr_t tile_dma_map_page(struct device *dev, struct page *page,
 	BUG_ON(!valid_dma_direction(direction));
 
 	BUG_ON(offset + size > PAGE_SIZE);
-	__dma_prep_page(page, offset, size, direction);
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		__dma_prep_page(page, offset, size, direction);
 
 	return page_to_pa(page) + offset;
 }
@@ -256,6 +261,9 @@ static void tile_dma_unmap_page(struct device *dev, dma_addr_t dma_address,
 {
 	BUG_ON(!valid_dma_direction(direction));
 
+	if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+		return;
+
 	__dma_complete_page(pfn_to_page(PFN_DOWN(dma_address)),
 			    dma_address & (PAGE_SIZE - 1), size, direction);
 }

* [net-next PATCH 22/27] arch/xtensa: Add option to skip DMA sync as a part of mapping
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (20 preceding siblings ...)
  2016-10-25 15:38 ` [net-next PATCH 21/27] arch/tile: " Alexander Duyck
@ 2016-10-25 15:38 ` Alexander Duyck
  2016-10-25 15:38 ` [net-next PATCH 23/27] dma: Add calls for dma_map_page_attrs and dma_unmap_page_attrs Alexander Duyck
                   ` (6 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:38 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: Max Filippov, davem, brouer

This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC so that we can avoid
invoking cache line invalidation when the driver will handle it itself via
a sync_for_cpu or sync_for_device call.

Cc: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/xtensa/kernel/pci-dma.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/xtensa/kernel/pci-dma.c b/arch/xtensa/kernel/pci-dma.c
index 1e68806..6a16dec 100644
--- a/arch/xtensa/kernel/pci-dma.c
+++ b/arch/xtensa/kernel/pci-dma.c
@@ -189,7 +189,9 @@ static dma_addr_t xtensa_map_page(struct device *dev, struct page *page,
 {
 	dma_addr_t dma_handle = page_to_phys(page) + offset;
 
-	xtensa_sync_single_for_device(dev, dma_handle, size, dir);
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		xtensa_sync_single_for_device(dev, dma_handle, size, dir);
+
 	return dma_handle;
 }
 
@@ -197,7 +199,8 @@ static void xtensa_unmap_page(struct device *dev, dma_addr_t dma_handle,
 			      size_t size, enum dma_data_direction dir,
 			      unsigned long attrs)
 {
-	xtensa_sync_single_for_cpu(dev, dma_handle, size, dir);
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		xtensa_sync_single_for_cpu(dev, dma_handle, size, dir);
 }
 
 static int xtensa_map_sg(struct device *dev, struct scatterlist *sg,

* [net-next PATCH 23/27] dma: Add calls for dma_map_page_attrs and dma_unmap_page_attrs
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (21 preceding siblings ...)
  2016-10-25 15:38 ` [net-next PATCH 22/27] arch/xtensa: Add option to skip DMA sync as a part of mapping Alexander Duyck
@ 2016-10-25 15:38 ` Alexander Duyck
  2016-10-25 15:38 ` [net-next PATCH 24/27] mm: Add support for releasing multiple instances of a page Alexander Duyck
                   ` (5 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:38 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm; +Cc: davem, brouer

Add support for mapping and unmapping a page with attributes.  The primary
use for this is currently to allow us to pass the DMA_ATTR_SKIP_CPU_SYNC
attribute when mapping and unmapping a page.  On some architectures, such
as ARM, the synchronization has significant overhead, and if the driver is
already taking care of the sync_for_cpu and sync_for_device calls there
isn't much need to handle this in the map/unmap calls as well.
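
As a rough user-space illustration of the pattern (not kernel code), the
sketch below models a map op that skips its sync when a skip attribute is
set, plus a legacy wrapper that forwards an attrs value of 0.  Every name
prefixed with model_ or MODEL_ is a hypothetical stand-in, and the
attribute value is an assumption made only for this example.

```c
#include <stddef.h>

#define MODEL_ATTR_SKIP_CPU_SYNC	(1UL << 0)	/* hypothetical value */

struct model_dev {
	unsigned long syncs;	/* cache syncs performed by the map op */
};

/* Stand-in for an arch map_page op: sync unless the caller opted out. */
static unsigned long model_map_page_attrs(struct model_dev *dev,
					  unsigned long phys, size_t offset,
					  unsigned long attrs)
{
	if (!(attrs & MODEL_ATTR_SKIP_CPU_SYNC))
		dev->syncs++;

	return phys + offset;
}

/* The legacy entry point keeps its old behaviour by passing no attrs. */
#define model_map_page(d, p, o) model_map_page_attrs(d, p, o, 0)
```

Existing callers of the attribute-free form see no change in this model,
while a caller that passes the skip attribute gets the same address back
without the sync.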

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 include/linux/dma-mapping.h |   20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 08528af..10c5a17 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -243,29 +243,33 @@ static inline void dma_unmap_sg_attrs(struct device *dev, struct scatterlist *sg
 		ops->unmap_sg(dev, sg, nents, dir, attrs);
 }
 
-static inline dma_addr_t dma_map_page(struct device *dev, struct page *page,
-				      size_t offset, size_t size,
-				      enum dma_data_direction dir)
+static inline dma_addr_t dma_map_page_attrs(struct device *dev,
+					    struct page *page,
+					    size_t offset, size_t size,
+					    enum dma_data_direction dir,
+					    unsigned long attrs)
 {
 	struct dma_map_ops *ops = get_dma_ops(dev);
 	dma_addr_t addr;
 
 	kmemcheck_mark_initialized(page_address(page) + offset, size);
 	BUG_ON(!valid_dma_direction(dir));
-	addr = ops->map_page(dev, page, offset, size, dir, 0);
+	addr = ops->map_page(dev, page, offset, size, dir, attrs);
 	debug_dma_map_page(dev, page, offset, size, dir, addr, false);
 
 	return addr;
 }
 
-static inline void dma_unmap_page(struct device *dev, dma_addr_t addr,
-				  size_t size, enum dma_data_direction dir)
+static inline void dma_unmap_page_attrs(struct device *dev,
+					dma_addr_t addr, size_t size,
+					enum dma_data_direction dir,
+					unsigned long attrs)
 {
 	struct dma_map_ops *ops = get_dma_ops(dev);
 
 	BUG_ON(!valid_dma_direction(dir));
 	if (ops->unmap_page)
-		ops->unmap_page(dev, addr, size, dir, 0);
+		ops->unmap_page(dev, addr, size, dir, attrs);
 	debug_dma_unmap_page(dev, addr, size, dir, false);
 }
 
@@ -385,6 +389,8 @@ static inline void dma_sync_single_range_for_device(struct device *dev,
 #define dma_unmap_single(d, a, s, r) dma_unmap_single_attrs(d, a, s, r, 0)
 #define dma_map_sg(d, s, n, r) dma_map_sg_attrs(d, s, n, r, 0)
 #define dma_unmap_sg(d, s, n, r) dma_unmap_sg_attrs(d, s, n, r, 0)
+#define dma_map_page(d, p, o, s, r) dma_map_page_attrs(d, p, o, s, r, 0)
+#define dma_unmap_page(d, a, s, r) dma_unmap_page_attrs(d, a, s, r, 0)
 
 extern int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
 			   void *cpu_addr, dma_addr_t dma_addr, size_t size);

* [net-next PATCH 24/27] mm: Add support for releasing multiple instances of a page
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (22 preceding siblings ...)
  2016-10-25 15:38 ` [net-next PATCH 23/27] dma: Add calls for dma_map_page_attrs and dma_unmap_page_attrs Alexander Duyck
@ 2016-10-25 15:38 ` Alexander Duyck
  2016-10-25 15:39 ` [net-next PATCH 25/27] igb: Update driver to make use of DMA_ATTR_SKIP_CPU_SYNC Alexander Duyck
                   ` (4 subsequent siblings)
  28 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:38 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm; +Cc: davem, brouer

This patch adds a function that allows us to batch free a page that has
multiple references outstanding.  Specifically, this function can be used
to drop a page being used in the page frag alloc cache.  With this, drivers
can make use of functionality similar to the page frag alloc cache without
having to work around the lack of a function that frees multiple
references.
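
The idea can be sketched in user space as follows.  This is a simplified
model under stated assumptions, not the kernel implementation: the
model_page structure and both helpers are hypothetical, the counter is a
plain integer rather than an atomic, and "freeing" is reduced to setting
a flag.

```c
#include <stdbool.h>

struct model_page {
	unsigned int refcount;	/* references currently outstanding */
	bool freed;		/* set once the last reference is dropped */
};

/* Subtract 'count' references; report whether the counter reached zero. */
static bool model_ref_sub_and_test(struct model_page *page,
				   unsigned int count)
{
	page->refcount -= count;
	return page->refcount == 0;
}

/* Drop 'count' references in one step and free on the last one. */
static void model_page_frag_drain(struct model_page *page,
				  unsigned int count)
{
	if (model_ref_sub_and_test(page, count))
		page->freed = true;
}
```

A caller holding several references to one page can release all of them
with a single call instead of looping over a one-at-a-time put.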

Cc: linux-mm@kvack.org
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 include/linux/gfp.h |    2 ++
 mm/page_alloc.c     |   14 ++++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index f8041f9de..4175dca 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -506,6 +506,8 @@ extern struct page *alloc_pages_vma(gfp_t gfp_mask, int order,
 extern void free_hot_cold_page_list(struct list_head *list, bool cold);
 
 struct page_frag_cache;
+extern void __page_frag_drain(struct page *page, unsigned int order,
+			      unsigned int count);
 extern void *__alloc_page_frag(struct page_frag_cache *nc,
 			       unsigned int fragsz, gfp_t gfp_mask);
 extern void __free_page_frag(void *addr);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ca423cc..253046a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3883,6 +3883,20 @@ static struct page *__page_frag_refill(struct page_frag_cache *nc,
 	return page;
 }
 
+void __page_frag_drain(struct page *page, unsigned int order,
+		       unsigned int count)
+{
+	VM_BUG_ON_PAGE(page_ref_count(page) == 0, page);
+
+	if (page_ref_sub_and_test(page, count)) {
+		if (order == 0)
+			free_hot_cold_page(page, false);
+		else
+			__free_pages_ok(page, order);
+	}
+}
+EXPORT_SYMBOL(__page_frag_drain);
+
 void *__alloc_page_frag(struct page_frag_cache *nc,
 			unsigned int fragsz, gfp_t gfp_mask)
 {

* [net-next PATCH 25/27] igb: Update driver to make use of DMA_ATTR_SKIP_CPU_SYNC
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (23 preceding siblings ...)
  2016-10-25 15:38 ` [net-next PATCH 24/27] mm: Add support for releasing multiple instances of a page Alexander Duyck
@ 2016-10-25 15:39 ` Alexander Duyck
  2016-10-26 17:21   ` [Intel-wired-lan] " Jeff Kirsher
  2016-10-25 15:39 ` [net-next PATCH 26/27] igb: Update code to better handle incrementing page count Alexander Duyck
                   ` (3 subsequent siblings)
  28 siblings, 1 reply; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:39 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm; +Cc: davem, brouer

The ARM architecture provides a mechanism for deferring cache line
invalidation in the case of map/unmap.  This patch makes use of this
mechanism to avoid unnecessary synchronization.

A secondary effect of this change is that the portion of the page that has
been synchronized for use by the CPU should be writable and could be passed
up the stack (at least on ARM).

The last bit that occurred to me is that on architectures where the
sync_for_cpu call invalidates cache lines, we were prefetching and then
invalidating the first 128 bytes of the packet.  To avoid that, I have
moved the sync up to before we perform the prefetch and allocate the
skbuff so that we can actually make use of the prefetch.
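
To make the saving concrete, here is a small user-space model of one
non-reused Rx buffer lifetime.  It is only a sketch: it assumes a
non-coherent platform where map and unmap each sync the whole mapped
page, a 4K page, a 2K Rx buffer, and a made-up attribute value.  The
names model_stats and model_rx_cycle are hypothetical.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE			4096u
#define MODEL_RX_BUFSZ			2048u
#define MODEL_ATTR_SKIP_CPU_SYNC	(1UL << 0)	/* hypothetical value */

struct model_stats {
	size_t for_cpu;		/* bytes synced toward the CPU */
	size_t for_device;	/* bytes synced toward the device */
};

/* One buffer lifetime: map, receive pkt_size bytes, unmap. */
static void model_rx_cycle(struct model_stats *s, size_t pkt_size,
			   unsigned long attrs)
{
	int skip = !!(attrs & MODEL_ATTR_SKIP_CPU_SYNC);

	/* map: implicit whole-page sync, or an explicit buffer-sized one */
	if (!skip)
		s->for_device += MODEL_PAGE_SIZE;
	else
		s->for_device += MODEL_RX_BUFSZ;

	/* receive: sync only the bytes the device wrote */
	s->for_cpu += pkt_size;

	/* unmap: implicit whole-page sync unless the caller opted out */
	if (!skip)
		s->for_cpu += MODEL_PAGE_SIZE;
}
```

In this model a 64-byte packet costs 4160 bytes of CPU-side syncing
without the attribute and 64 bytes with it.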

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c |   53 ++++++++++++++++++-----------
 1 file changed, 33 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 4feca69..c8c458c 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3947,10 +3947,21 @@ static void igb_clean_rx_ring(struct igb_ring *rx_ring)
 		if (!buffer_info->page)
 			continue;
 
-		dma_unmap_page(rx_ring->dev,
-			       buffer_info->dma,
-			       PAGE_SIZE,
-			       DMA_FROM_DEVICE);
+		/* Invalidate cache lines that may have been written to by
+		 * device so that we avoid corrupting memory.
+		 */
+		dma_sync_single_range_for_cpu(rx_ring->dev,
+					      buffer_info->dma,
+					      buffer_info->page_offset,
+					      IGB_RX_BUFSZ,
+					      DMA_FROM_DEVICE);
+
+		/* free resources associated with mapping */
+		dma_unmap_page_attrs(rx_ring->dev,
+				     buffer_info->dma,
+				     PAGE_SIZE,
+				     DMA_FROM_DEVICE,
+				     DMA_ATTR_SKIP_CPU_SYNC);
 		__free_page(buffer_info->page);
 
 		buffer_info->page = NULL;
@@ -6808,12 +6819,6 @@ static void igb_reuse_rx_page(struct igb_ring *rx_ring,
 
 	/* transfer page from old buffer to new buffer */
 	*new_buff = *old_buff;
-
-	/* sync the buffer for use by the device */
-	dma_sync_single_range_for_device(rx_ring->dev, old_buff->dma,
-					 old_buff->page_offset,
-					 IGB_RX_BUFSZ,
-					 DMA_FROM_DEVICE);
 }
 
 static inline bool igb_page_is_reserved(struct page *page)
@@ -6934,6 +6939,13 @@ static struct sk_buff *igb_fetch_rx_buffer(struct igb_ring *rx_ring,
 	page = rx_buffer->page;
 	prefetchw(page);
 
+	/* we are reusing so sync this buffer for CPU use */
+	dma_sync_single_range_for_cpu(rx_ring->dev,
+				      rx_buffer->dma,
+				      rx_buffer->page_offset,
+				      size,
+				      DMA_FROM_DEVICE);
+
 	if (likely(!skb)) {
 		void *page_addr = page_address(page) +
 				  rx_buffer->page_offset;
@@ -6958,21 +6970,15 @@ static struct sk_buff *igb_fetch_rx_buffer(struct igb_ring *rx_ring,
 		prefetchw(skb->data);
 	}
 
-	/* we are reusing so sync this buffer for CPU use */
-	dma_sync_single_range_for_cpu(rx_ring->dev,
-				      rx_buffer->dma,
-				      rx_buffer->page_offset,
-				      size,
-				      DMA_FROM_DEVICE);
-
 	/* pull page into skb */
 	if (igb_add_rx_frag(rx_ring, rx_buffer, size, rx_desc, skb)) {
 		/* hand second half of page back to the ring */
 		igb_reuse_rx_page(rx_ring, rx_buffer);
 	} else {
 		/* we are not reusing the buffer so unmap it */
-		dma_unmap_page(rx_ring->dev, rx_buffer->dma,
-			       PAGE_SIZE, DMA_FROM_DEVICE);
+		dma_unmap_page_attrs(rx_ring->dev, rx_buffer->dma,
+				     PAGE_SIZE, DMA_FROM_DEVICE,
+				     DMA_ATTR_SKIP_CPU_SYNC);
 	}
 
 	/* clear contents of rx_buffer */
@@ -7230,7 +7236,8 @@ static bool igb_alloc_mapped_page(struct igb_ring *rx_ring,
 	}
 
 	/* map page for use */
-	dma = dma_map_page(rx_ring->dev, page, 0, PAGE_SIZE, DMA_FROM_DEVICE);
+	dma = dma_map_page_attrs(rx_ring->dev, page, 0, PAGE_SIZE,
+				 DMA_FROM_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);
 
 	/* if mapping failed free memory back to system since
 	 * there isn't much point in holding memory we can't use
@@ -7271,6 +7278,12 @@ void igb_alloc_rx_buffers(struct igb_ring *rx_ring, u16 cleaned_count)
 		if (!igb_alloc_mapped_page(rx_ring, bi))
 			break;
 
+		/* sync the buffer for use by the device */
+		dma_sync_single_range_for_device(rx_ring->dev, bi->dma,
+						 bi->page_offset,
+						 IGB_RX_BUFSZ,
+						 DMA_FROM_DEVICE);
+
 		/* Refresh the desc even if buffer_addrs didn't change
 		 * because each write-back erases this info.
 		 */

* [net-next PATCH 26/27] igb: Update code to better handle incrementing page count
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (24 preceding siblings ...)
  2016-10-25 15:39 ` [net-next PATCH 25/27] igb: Update driver to make use of DMA_ATTR_SKIP_CPU_SYNC Alexander Duyck
@ 2016-10-25 15:39 ` Alexander Duyck
  2016-10-26 17:21   ` [Intel-wired-lan] " Jeff Kirsher
  2016-10-25 15:39 ` [net-next PATCH 27/27] igb: Revert "igb: Revert support for build_skb in igb" Alexander Duyck
                   ` (2 subsequent siblings)
  28 siblings, 1 reply; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:39 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm; +Cc: davem, brouer

This patch updates the driver code so that we do bulk updates of the page
reference count instead of just incrementing it by one reference at a time.
The advantage of doing this is that we cut down on atomic operations, which
in turn should give us a slight improvement in cycles per packet.  In
addition, if we eventually move this over to using build_skb the gains will
be more noticeable.
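
The bias scheme can be modelled in user space as below.  This is a
sketch under stated assumptions rather than the driver code: the
counters are plain integers, USHRT_MAX is assumed to be 65535, the stack
is assumed to release its reference before the buffer is used again, and
all model_ names are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

struct model_page {
	unsigned int refcount;
	unsigned long atomic_ops;	/* refcount updates made by the driver */
};

struct model_rx_buffer {
	struct model_page *page;
	unsigned short pagecnt_bias;	/* references the driver still holds */
};

/* Hand one held reference to the stack; restock in bulk when drained. */
static bool model_can_reuse(struct model_rx_buffer *b)
{
	unsigned int bias = b->pagecnt_bias--;

	/* someone other than the driver still holds the page */
	if (b->page->refcount != bias)
		return false;

	if (bias == 1) {
		b->page->refcount += USHRT_MAX;
		b->page->atomic_ops++;
		b->pagecnt_bias = USHRT_MAX;
	}

	return true;
}

/* Receive 'packets' frames; the stack frees each one right away. */
static unsigned long model_run(struct model_rx_buffer *b,
			       unsigned long packets)
{
	unsigned long reused = 0;

	while (packets--) {
		if (!model_can_reuse(b))
			break;
		reused++;
		b->page->refcount--;	/* stack drops its reference */
	}

	return reused;
}
```

Starting from a refcount and bias of 1, the model touches the page
reference count only once per 65535 packets, where a one-at-a-time
scheme would increment it on every packet.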

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/igb/igb.h      |    7 ++++++-
 drivers/net/ethernet/intel/igb/igb_main.c |   24 +++++++++++++++++-------
 2 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index d11093d..acbc3ab 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -210,7 +210,12 @@ struct igb_tx_buffer {
 struct igb_rx_buffer {
 	dma_addr_t dma;
 	struct page *page;
-	unsigned int page_offset;
+#if (BITS_PER_LONG > 32) || (PAGE_SIZE >= 65536)
+	__u32 page_offset;
+#else
+	__u16 page_offset;
+#endif
+	__u16 pagecnt_bias;
 };
 
 struct igb_tx_queue_stats {
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index c8c458c..5e66cde 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3962,7 +3962,8 @@ static void igb_clean_rx_ring(struct igb_ring *rx_ring)
 				     PAGE_SIZE,
 				     DMA_FROM_DEVICE,
 				     DMA_ATTR_SKIP_CPU_SYNC);
-		__free_page(buffer_info->page);
+		__page_frag_drain(buffer_info->page, 0,
+				  buffer_info->pagecnt_bias);
 
 		buffer_info->page = NULL;
 	}
@@ -6830,13 +6831,15 @@ static bool igb_can_reuse_rx_page(struct igb_rx_buffer *rx_buffer,
 				  struct page *page,
 				  unsigned int truesize)
 {
+	unsigned int pagecnt_bias = rx_buffer->pagecnt_bias--;
+
 	/* avoid re-using remote pages */
 	if (unlikely(igb_page_is_reserved(page)))
 		return false;
 
 #if (PAGE_SIZE < 8192)
 	/* if we are only owner of page we can reuse it */
-	if (unlikely(page_count(page) != 1))
+	if (unlikely(page_ref_count(page) != pagecnt_bias))
 		return false;
 
 	/* flip page offset to other buffer */
@@ -6849,10 +6852,14 @@ static bool igb_can_reuse_rx_page(struct igb_rx_buffer *rx_buffer,
 		return false;
 #endif
 
-	/* Even if we own the page, we are not allowed to use atomic_set()
-	 * This would break get_page_unless_zero() users.
+	/* If we have drained the page fragment pool we need to update
+	 * the pagecnt_bias and page count so that we fully restock the
+	 * number of references the driver holds.
 	 */
-	page_ref_inc(page);
+	if (unlikely(pagecnt_bias == 1)) {
+		page_ref_add(page, USHRT_MAX);
+		rx_buffer->pagecnt_bias = USHRT_MAX;
+	}
 
 	return true;
 }
@@ -6904,7 +6911,6 @@ static bool igb_add_rx_frag(struct igb_ring *rx_ring,
 			return true;
 
 		/* this page cannot be reused so discard it */
-		__free_page(page);
 		return false;
 	}
 
@@ -6975,10 +6981,13 @@ static struct sk_buff *igb_fetch_rx_buffer(struct igb_ring *rx_ring,
 		/* hand second half of page back to the ring */
 		igb_reuse_rx_page(rx_ring, rx_buffer);
 	} else {
-		/* we are not reusing the buffer so unmap it */
+		/* We are not reusing the buffer so unmap it and free
+		 * any references we are holding to it
+		 */
 		dma_unmap_page_attrs(rx_ring->dev, rx_buffer->dma,
 				     PAGE_SIZE, DMA_FROM_DEVICE,
 				     DMA_ATTR_SKIP_CPU_SYNC);
+		__page_frag_drain(page, 0, rx_buffer->pagecnt_bias);
 	}
 
 	/* clear contents of rx_buffer */
@@ -7252,6 +7261,7 @@ static bool igb_alloc_mapped_page(struct igb_ring *rx_ring,
 	bi->dma = dma;
 	bi->page = page;
 	bi->page_offset = 0;
+	bi->pagecnt_bias = 1;
 
 	return true;
 }

* [net-next PATCH 27/27] igb: Revert "igb: Revert support for build_skb in igb"
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (25 preceding siblings ...)
  2016-10-25 15:39 ` [net-next PATCH 26/27] igb: Update code to better handle incrementing page count Alexander Duyck
@ 2016-10-25 15:39 ` Alexander Duyck
  2016-10-26 17:22   ` [Intel-wired-lan] " Jeff Kirsher
  2016-10-26 15:45 ` [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Jesper Dangaard Brouer
  2016-10-28 15:48 ` Alexander Duyck
  28 siblings, 1 reply; 38+ messages in thread
From: Alexander Duyck @ 2016-10-25 15:39 UTC (permalink / raw)
  To: netdev, intel-wired-lan, linux-kernel, linux-mm; +Cc: davem, brouer

This reverts commit f9d40f6a9921 ("igb: Revert support for build_skb in
igb") and adds a few changes to update it to work with the latest version
of igb.  We can now revert the removal because, with the recent changes to
the page count and the use of DMA_ATTR_SKIP_CPU_SYNC, the pages can be made
writable, so we should no longer be invalidating the additional data added
when we call build_skb.

The biggest risk with this change is that we are now not able to support
full jumbo frames when using build_skb.  Instead we can only support up to
2K minus the skb overhead and padding offset.
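
As a worked example of that limit, the sketch below redoes the
arithmetic in user space.  Every constant is an assumption for a typical
64-bit build (64-byte cache lines, a 320-byte shared info structure, a
64-byte skb pad, no IP alignment padding, a 16-byte timestamp header);
real values depend on the kernel configuration, and the MODEL_ names are
hypothetical.

```c
#include <stdbool.h>

#define MODEL_ALIGN(x, a)	(((x) + (a) - 1) & ~((a) - 1))

/* Assumed values; the real ones depend on the kernel configuration. */
#define MODEL_CACHE_BYTES	64u
#define MODEL_SHINFO_SIZE	320u	/* assumed size of skb_shared_info */
#define MODEL_NET_SKB_PAD	64u
#define MODEL_NET_IP_ALIGN	0u
#define MODEL_TS_HDR_LEN	16u	/* assumed timestamp header length */
#define MODEL_RX_BUFSZ		2048u

#define MODEL_SKB_PAD	(MODEL_NET_SKB_PAD + MODEL_NET_IP_ALIGN)
#define MODEL_WITH_OVERHEAD(x) \
	((x) - MODEL_ALIGN(MODEL_SHINFO_SIZE, MODEL_CACHE_BYTES))
#define MODEL_MAX_BUILD_SKB_SIZE \
	(MODEL_WITH_OVERHEAD(MODEL_RX_BUFSZ) - \
	 (MODEL_SKB_PAD + MODEL_TS_HDR_LEN))

/* Largest frame that still fits in half a 4K page with build_skb. */
static unsigned int model_max_build_skb_size(void)
{
	return MODEL_MAX_BUILD_SKB_SIZE;
}

/* Mirrors the ring flag decision: fall back once the frame is too big. */
static bool model_uses_build_skb(unsigned int max_frame_size)
{
	return max_frame_size <= MODEL_MAX_BUILD_SKB_SIZE;
}
```

Under these assumptions the limit works out to 2048 - 320 - 80 = 1648
bytes, so a 1522-byte frame (1500 MTU plus Ethernet header, VLAN tag and
FCS) fits while a 9000-byte jumbo frame does not.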

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/igb/igb.h      |   29 ++++++
 drivers/net/ethernet/intel/igb/igb_main.c |  130 ++++++++++++++++++++++++++---
 2 files changed, 142 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index acbc3ab..c3420f3 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -145,6 +145,10 @@ struct vf_data_storage {
 #define IGB_RX_HDR_LEN		IGB_RXBUFFER_256
 #define IGB_RX_BUFSZ		IGB_RXBUFFER_2048
 
+#define IGB_SKB_PAD		(NET_SKB_PAD + NET_IP_ALIGN)
+#define IGB_MAX_BUILD_SKB_SIZE \
+	(SKB_WITH_OVERHEAD(IGB_RX_BUFSZ) - (IGB_SKB_PAD + IGB_TS_HDR_LEN))
+
 /* How many Rx Buffers do we bundle into one write to the hardware ? */
 #define IGB_RX_BUFFER_WRITE	16 /* Must be power of 2 */
 
@@ -301,12 +305,29 @@ struct igb_q_vector {
 };
 
 enum e1000_ring_flags_t {
-	IGB_RING_FLAG_RX_SCTP_CSUM,
-	IGB_RING_FLAG_RX_LB_VLAN_BSWAP,
-	IGB_RING_FLAG_TX_CTX_IDX,
-	IGB_RING_FLAG_TX_DETECT_HANG
+	IGB_RING_FLAG_RX_SCTP_CSUM = 0,
+#if (NET_IP_ALIGN != 0)
+	IGB_RING_FLAG_RX_BUILD_SKB_ENABLED = 1,
+#endif
+	IGB_RING_FLAG_RX_LB_VLAN_BSWAP = 2,
+	IGB_RING_FLAG_TX_CTX_IDX = 3,
+	IGB_RING_FLAG_TX_DETECT_HANG = 4,
+#if (NET_IP_ALIGN == 0)
+#if (L1_CACHE_SHIFT < 5)
+	IGB_RING_FLAG_RX_BUILD_SKB_ENABLED = 5,
+#else
+	IGB_RING_FLAG_RX_BUILD_SKB_ENABLED = L1_CACHE_SHIFT,
+#endif
+#endif
 };
 
+#define ring_uses_build_skb(ring) \
+	test_bit(IGB_RING_FLAG_RX_BUILD_SKB_ENABLED, &(ring)->flags)
+#define set_ring_build_skb_enabled(ring) \
+	set_bit(IGB_RING_FLAG_RX_BUILD_SKB_ENABLED, &(ring)->flags)
+#define clear_ring_build_skb_enabled(ring) \
+	clear_bit(IGB_RING_FLAG_RX_BUILD_SKB_ENABLED, &(ring)->flags)
+
 #define IGB_TXD_DCMD (E1000_ADVTXD_DCMD_EOP | E1000_ADVTXD_DCMD_RS)
 
 #define IGB_RX_DESC(R, i)	\
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 5e66cde..e55407a 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3761,6 +3761,16 @@ void igb_configure_rx_ring(struct igb_adapter *adapter,
 	wr32(E1000_RXDCTL(reg_idx), rxdctl);
 }
 
+static void igb_set_rx_buffer_len(struct igb_adapter *adapter,
+				  struct igb_ring *rx_ring)
+{
+	/* set build_skb flag */
+	if (adapter->max_frame_size <= IGB_MAX_BUILD_SKB_SIZE)
+		set_ring_build_skb_enabled(rx_ring);
+	else
+		clear_ring_build_skb_enabled(rx_ring);
+}
+
 /**
  *  igb_configure_rx - Configure receive Unit after Reset
  *  @adapter: board private structure
@@ -3778,8 +3788,12 @@ static void igb_configure_rx(struct igb_adapter *adapter)
 	/* Setup the HW Rx Head and Tail Descriptor Pointers and
 	 * the Base and Length of the Rx Descriptor Ring
 	 */
-	for (i = 0; i < adapter->num_rx_queues; i++)
-		igb_configure_rx_ring(adapter, adapter->rx_ring[i]);
+	for (i = 0; i < adapter->num_rx_queues; i++) {
+		struct igb_ring *rx_ring = adapter->rx_ring[i];
+
+		igb_set_rx_buffer_len(adapter, rx_ring);
+		igb_configure_rx_ring(adapter, rx_ring);
+	}
 }
 
 /**
@@ -4238,7 +4252,7 @@ static void igb_set_rx_mode(struct net_device *netdev)
 	struct igb_adapter *adapter = netdev_priv(netdev);
 	struct e1000_hw *hw = &adapter->hw;
 	unsigned int vfn = adapter->vfs_allocated_count;
-	u32 rctl = 0, vmolr = 0;
+	u32 rctl = 0, vmolr = 0, rlpml = MAX_JUMBO_FRAME_SIZE;
 	int count;
 
 	/* Check for Promiscuous and All Multicast modes */
@@ -4310,12 +4324,18 @@ static void igb_set_rx_mode(struct net_device *netdev)
 	vmolr |= rd32(E1000_VMOLR(vfn)) &
 		 ~(E1000_VMOLR_ROPE | E1000_VMOLR_MPME | E1000_VMOLR_ROMPE);
 
-	/* enable Rx jumbo frames, no need for restriction */
+	/* enable Rx jumbo frames, restrict as needed to support build_skb */
 	vmolr &= ~E1000_VMOLR_RLPML_MASK;
-	vmolr |= MAX_JUMBO_FRAME_SIZE | E1000_VMOLR_LPE;
+	vmolr |= E1000_VMOLR_LPE;
+	vmolr |= (adapter->max_frame_size <= IGB_MAX_BUILD_SKB_SIZE) ?
+		 IGB_MAX_BUILD_SKB_SIZE : MAX_JUMBO_FRAME_SIZE;
+
+	if (!adapter->vfs_allocated_count &&
+	    (adapter->max_frame_size <= IGB_MAX_BUILD_SKB_SIZE))
+		rlpml = IGB_MAX_BUILD_SKB_SIZE;
 
 	wr32(E1000_VMOLR(vfn), vmolr);
-	wr32(E1000_RLPML, MAX_JUMBO_FRAME_SIZE);
+	wr32(E1000_RLPML, rlpml);
 
 	igb_restore_vf_multicasts(adapter);
 }
@@ -5046,9 +5066,9 @@ static void igb_tx_csum(struct igb_ring *tx_ring, struct igb_tx_buffer *first)
 }
 
 #define IGB_SET_FLAG(_input, _flag, _result) \
-	((_flag <= _result) ? \
-	 ((u32)(_input & _flag) * (_result / _flag)) : \
-	 ((u32)(_input & _flag) / (_flag / _result)))
+	(((_flag) <= (_result)) ? \
+	 ((u32)(_input & (_flag)) * ((_result) / (_flag))) : \
+	 ((u32)(_input & (_flag)) / ((_flag) / (_result))))
 
 static u32 igb_tx_cmd_type(struct sk_buff *skb, u32 tx_flags)
 {
@@ -6829,7 +6849,7 @@ static inline bool igb_page_is_reserved(struct page *page)
 
 static bool igb_can_reuse_rx_page(struct igb_rx_buffer *rx_buffer,
 				  struct page *page,
-				  unsigned int truesize)
+				  const unsigned int truesize)
 {
 	unsigned int pagecnt_bias = rx_buffer->pagecnt_bias--;
 
@@ -6888,7 +6908,7 @@ static bool igb_add_rx_frag(struct igb_ring *rx_ring,
 	struct page *page = rx_buffer->page;
 	unsigned char *va = page_address(page) + rx_buffer->page_offset;
 #if (PAGE_SIZE < 8192)
-	unsigned int truesize = IGB_RX_BUFSZ;
+	const unsigned int truesize = IGB_RX_BUFSZ;
 #else
 	unsigned int truesize = SKB_DATA_ALIGN(size);
 #endif
@@ -6933,6 +6953,78 @@ static bool igb_add_rx_frag(struct igb_ring *rx_ring,
 	return igb_can_reuse_rx_page(rx_buffer, page, truesize);
 }
 
+static struct sk_buff *igb_build_rx_buffer(struct igb_ring *rx_ring,
+					   union e1000_adv_rx_desc *rx_desc)
+{
+	unsigned int size = le16_to_cpu(rx_desc->wb.upper.length);
+	struct igb_rx_buffer *rx_buffer;
+	struct sk_buff *skb;
+	struct page *page;
+	void *va;
+#if (PAGE_SIZE < 8192)
+	const unsigned int truesize = IGB_RX_BUFSZ;
+#else
+	unsigned int truesize = SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) +
+				SKB_DATA_ALIGN(NET_SKB_PAD +
+					       NET_IP_ALIGN +
+					       size);
+#endif
+
+	rx_buffer = &rx_ring->rx_buffer_info[rx_ring->next_to_clean];
+	page = rx_buffer->page;
+	prefetchw(page);
+
+	/* we are reusing so sync this buffer for CPU use */
+	dma_sync_single_range_for_cpu(rx_ring->dev,
+				      rx_buffer->dma,
+				      rx_buffer->page_offset + IGB_SKB_PAD,
+				      size,
+				      DMA_FROM_DEVICE);
+
+	va = page_address(page) + rx_buffer->page_offset;
+
+	/* prefetch first cache line of first page */
+	prefetch(va + IGB_SKB_PAD);
+#if L1_CACHE_BYTES < 128
+	prefetch(va + L1_CACHE_BYTES + IGB_SKB_PAD);
+#endif
+
+	/* build an skb around the page buffer */
+	skb = build_skb(va, truesize);
+	if (unlikely(!skb)) {
+		rx_ring->rx_stats.alloc_failed++;
+		return NULL;
+	}
+
+	/* update pointers within the skb to store the data */
+	skb_reserve(skb, IGB_SKB_PAD);
+	__skb_put(skb, size);
+
+	/* pull timestamp out of packet data */
+	if (igb_test_staterr(rx_desc, E1000_RXDADV_STAT_TSIP)) {
+		igb_ptp_rx_pktstamp(rx_ring->q_vector, skb->data, skb);
+		__skb_pull(skb, IGB_TS_HDR_LEN);
+	}
+
+	if (igb_can_reuse_rx_page(rx_buffer, page, truesize)) {
+		/* hand second half of page back to the ring */
+		igb_reuse_rx_page(rx_ring, rx_buffer);
+	} else {
+		/* We are not reusing the buffer so unmap it and free
+		 * any references we are holding to it
+		 */
+		dma_unmap_page_attrs(rx_ring->dev, rx_buffer->dma,
+				     PAGE_SIZE, DMA_FROM_DEVICE,
+				     DMA_ATTR_SKIP_CPU_SYNC);
+		__page_frag_drain(page, 0, rx_buffer->pagecnt_bias);
+	}
+
+	/* clear contents of rx_buffer */
+	rx_buffer->page = NULL;
+
+	return skb;
+}
+
 static struct sk_buff *igb_fetch_rx_buffer(struct igb_ring *rx_ring,
 					   union e1000_adv_rx_desc *rx_desc,
 					   struct sk_buff *skb)
@@ -7178,7 +7270,10 @@ static int igb_clean_rx_irq(struct igb_q_vector *q_vector, const int budget)
 		dma_rmb();
 
 		/* retrieve a buffer from the ring */
-		skb = igb_fetch_rx_buffer(rx_ring, rx_desc, skb);
+		if (ring_uses_build_skb(rx_ring))
+			skb = igb_build_rx_buffer(rx_ring, rx_desc);
+		else
+			skb = igb_fetch_rx_buffer(rx_ring, rx_desc, skb);
 
 		/* exit if we failed to retrieve a buffer */
 		if (!skb)
@@ -7266,6 +7361,13 @@ static bool igb_alloc_mapped_page(struct igb_ring *rx_ring,
 	return true;
 }
 
+static inline unsigned int igb_rx_offset(struct igb_ring *rx_ring)
+{
+	return IGB_SET_FLAG(rx_ring->flags,
+			    1 << IGB_RING_FLAG_RX_BUILD_SKB_ENABLED,
+			    IGB_SKB_PAD);
+}
+
 /**
  *  igb_alloc_rx_buffers - Replace used receive buffers; packet split
  *  @adapter: address of board private structure
@@ -7297,7 +7399,9 @@ void igb_alloc_rx_buffers(struct igb_ring *rx_ring, u16 cleaned_count)
 		/* Refresh the desc even if buffer_addrs didn't change
 		 * because each write-back erases this info.
 		 */
-		rx_desc->read.pkt_addr = cpu_to_le64(bi->dma + bi->page_offset);
+		rx_desc->read.pkt_addr = cpu_to_le64(bi->dma +
+						     bi->page_offset +
+						     igb_rx_offset(rx_ring));
 
 		rx_desc++;
 		bi++;
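
For anyone following the descriptor change above, here is a rough user-space sketch of what the extra offset does. It assumes igb_rx_offset() evaluates to IGB_SKB_PAD when the build_skb ring flag is set and to 0 otherwise. The struct, the flag bit, and the pad value are stand-ins invented for illustration, not the driver's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real values come from the kernel headers. */
#define MODEL_SKB_PAD         64u        /* assumed NET_SKB_PAD + NET_IP_ALIGN */
#define MODEL_FLAG_BUILD_SKB  (1u << 3)  /* assumed flag bit position */

struct model_ring {
	uint32_t flags;
};

/* Headroom left in front of the packet when build_skb is in use. */
static unsigned int model_rx_offset(const struct model_ring *ring)
{
	return (ring->flags & MODEL_FLAG_BUILD_SKB) ? MODEL_SKB_PAD : 0;
}

/* Address written to the descriptor: DMA base + buffer offset + headroom. */
static uint64_t model_pkt_addr(const struct model_ring *ring,
			       uint64_t dma, unsigned int page_offset)
{
	assert(page_offset < 4096); /* this model assumes a 4K page */
	return dma + page_offset + model_rx_offset(ring);
}
```

With the flag set the device writes the frame MODEL_SKB_PAD bytes into the buffer, which matches the skb_reserve() of the same amount in the build_skb path.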

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [net-next PATCH 04/27] arch/arc: Add option to skip sync on DMA mapping
  2016-10-25 15:37 ` [net-next PATCH 04/27] arch/arc: Add option to skip sync on DMA mapping Alexander Duyck
@ 2016-10-25 22:00   ` Vineet Gupta
  0 siblings, 0 replies; 38+ messages in thread
From: Vineet Gupta @ 2016-10-25 22:00 UTC (permalink / raw)
  To: Alexander Duyck, netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: Vineet Gupta, linux-snps-arc, davem, brouer

On 10/25/2016 02:38 PM, Alexander Duyck wrote:
> This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
> avoid invoking cache line invalidation if the driver will just handle it
> later via a sync_for_cpu or sync_for_device call.
>
> Cc: Vineet Gupta <vgupta@synopsys.com>
> Cc: linux-snps-arc@lists.infradead.org
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> ---
>  arch/arc/mm/dma.c |    5 ++++-

Acked-by: Vineet Gupta <vgupta@synopsys.com>

>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c
> index 20afc65..6303c34 100644
> --- a/arch/arc/mm/dma.c
> +++ b/arch/arc/mm/dma.c
> @@ -133,7 +133,10 @@ static dma_addr_t arc_dma_map_page(struct device *dev, struct page *page,
>  		unsigned long attrs)
>  {
>  	phys_addr_t paddr = page_to_phys(page) + offset;
> -	_dma_cache_sync(paddr, size, dir);
> +
> +	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
> +		_dma_cache_sync(paddr, size, dir);
> +
>  	return plat_phys_to_dma(dev, paddr);
>  }
>  
>
>
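
The effect of the hunk above can be sketched in user space as follows. This is only a model: the attribute bit value, the sync counter, and the identity address translation are assumptions made for the example, not what arch/arc does.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_ATTR_SKIP_CPU_SYNC (1UL << 0) /* hypothetical bit value */

static unsigned int model_sync_calls; /* counts cache maintenance operations */

static void model_cache_sync(size_t size)
{
	assert(size > 0);
	model_sync_calls++;
}

/* Same shape as the patched map routine: sync unless told to skip. */
static unsigned long model_map_page(unsigned long paddr, size_t size,
				    unsigned long attrs)
{
	if (!(attrs & MODEL_ATTR_SKIP_CPU_SYNC))
		model_cache_sync(size);
	return paddr; /* identity translation in this model */
}
```

A caller that passes the attribute takes on the job of syncing the range itself before the CPU reads it.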

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (26 preceding siblings ...)
  2016-10-25 15:39 ` [net-next PATCH 27/27] igb: Revert "igb: Revert support for build_skb in igb" Alexander Duyck
@ 2016-10-26 15:45 ` Jesper Dangaard Brouer
  2016-10-28 15:48 ` Alexander Duyck
  28 siblings, 0 replies; 38+ messages in thread
From: Jesper Dangaard Brouer @ 2016-10-26 15:45 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: netdev, intel-wired-lan, linux-kernel, linux-mm, davem, brouer

On Tue, 25 Oct 2016 11:36:48 -0400
Alexander Duyck <alexander.h.duyck@intel.com> wrote:

> The first 22 patches in the set add support for the DMA attribute
> DMA_ATTR_SKIP_CPU_SYNC on multiple platforms/architectures.  This is needed
> so that we can flag the calls to dma_map/unmap_page so that we do not
> invalidate cache lines that do not currently belong to the device.  Instead
> we have to take care of this in the driver via a call to
> sync_single_range_for_cpu prior to freeing the Rx page.
> 
> Patch 23 adds support for dma_map_page_attrs and dma_unmap_page_attrs so
> that we can unmap and map a page using the DMA_ATTR_SKIP_CPU_SYNC
> attribute.
> 
> Patch 24 adds support for freeing a page that has multiple references being
> held by a single caller.  This way we can free page fragments that were
> allocated by a given driver.
> 
> The last 3 patches use these updates in the igb driver to allow for us to
> reimplement the use of build_skb.
> 
> My hope is to get the series accepted into the net-next tree as I have a
> number of other Intel drivers I could then begin updating once these
> patches are accepted.
> 
> v1: Split out changes DMA_ERROR_CODE fix for swiotlb-xen
>     Minor fixes based on issues found by kernel build bot
>     Few minor changes for issues found on code review
>     Added Acked-by for patches that were acked and not changed

I really appreciate that you are doing this work, Alex. I do think it
fits into my page pool plans. Thanks!

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Intel-wired-lan] [net-next PATCH 25/27] igb: Update driver to make use of DMA_ATTR_SKIP_CPU_SYNC
  2016-10-25 15:39 ` [net-next PATCH 25/27] igb: Update driver to make use of DMA_ATTR_SKIP_CPU_SYNC Alexander Duyck
@ 2016-10-26 17:21   ` Jeff Kirsher
  0 siblings, 0 replies; 38+ messages in thread
From: Jeff Kirsher @ 2016-10-26 17:21 UTC (permalink / raw)
  To: Alexander Duyck, netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: davem, brouer

[-- Attachment #1: Type: text/plain, Size: 930 bytes --]

On Tue, 2016-10-25 at 11:39 -0400, Alexander Duyck wrote:
> The ARM architecture provides a mechanism for deferring cache line
> invalidation in the case of map/unmap.  This patch makes use of this
> mechanism to avoid unnecessary synchronization.
> 
> A secondary effect of this change is that the portion of the page that has
> been synchronized for use by the CPU should be writable and could be passed
> up the stack (at least on ARM).
> 
> The last bit that occurred to me is that on architectures where the
> sync_for_cpu call invalidates cache lines we were prefetching and then
> invalidating the first 128 bytes of the packet.  To avoid that I have moved
> the sync up to before we perform the prefetch and allocate the skbuff so
> that we can actually make use of it.
> 
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> 

Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Intel-wired-lan] [net-next PATCH 26/27] igb: Update code to better handle incrementing page count
  2016-10-25 15:39 ` [net-next PATCH 26/27] igb: Update code to better handle incrementing page count Alexander Duyck
@ 2016-10-26 17:21   ` Jeff Kirsher
  0 siblings, 0 replies; 38+ messages in thread
From: Jeff Kirsher @ 2016-10-26 17:21 UTC (permalink / raw)
  To: Alexander Duyck, netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: davem, brouer

[-- Attachment #1: Type: text/plain, Size: 602 bytes --]

On Tue, 2016-10-25 at 11:39 -0400, Alexander Duyck wrote:
> This patch updates the driver code so that we do bulk updates of the page
> reference count instead of just incrementing it by one reference at a time.
> The advantage to doing this is that we cut down on atomic operations and
> this in turn should give us a slight improvement in cycles per packet.  In
> addition if we eventually move this over to using build_skb the gains will
> be more noticeable.
> 
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>

Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
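
The bulk reference idea described in the quoted commit message can be sketched in user space with C11 atomics. This is a simplified model, not the igb code: the batch size, the struct names, and the atomic_ops counter are all invented for the example. The drain step is only similar in spirit to the __page_frag_drain() helper used elsewhere in the series.

```c
#include <assert.h>
#include <stdatomic.h>

#define MODEL_BATCH 64u /* hypothetical batch size */

struct model_page {
	atomic_uint refcount;    /* shared counter, updated atomically */
	unsigned int atomic_ops; /* bookkeeping for the demonstration only */
};

struct model_rx_buffer {
	struct model_page *page;
	unsigned int bias; /* references prepaid locally, not yet handed out */
};

/* Take a whole batch of references with a single atomic add. */
static void model_refill_bias(struct model_rx_buffer *buf)
{
	atomic_fetch_add(&buf->page->refcount, MODEL_BATCH);
	buf->page->atomic_ops++;
	buf->bias += MODEL_BATCH;
}

/* Hand one reference to the stack; atomic only when the batch runs dry. */
static void model_give_ref(struct model_rx_buffer *buf)
{
	if (buf->bias == 0)
		model_refill_bias(buf);
	buf->bias--; /* plain decrement, the reference was prepaid */
}

/* Drop the unused prepaid references in one atomic subtract. */
static void model_drain(struct model_rx_buffer *buf)
{
	assert(atomic_load(&buf->page->refcount) >= buf->bias);
	atomic_fetch_sub(&buf->page->refcount, buf->bias);
	buf->page->atomic_ops++;
	buf->bias = 0;
}
```

In this model, handing out 100 references costs two atomic adds instead of 100, plus one atomic subtract when the buffer is finally released.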

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Intel-wired-lan] [net-next PATCH 27/27] igb: Revert "igb: Revert support for build_skb in igb"
  2016-10-25 15:39 ` [net-next PATCH 27/27] igb: Revert "igb: Revert support for build_skb in igb" Alexander Duyck
@ 2016-10-26 17:22   ` Jeff Kirsher
  0 siblings, 0 replies; 38+ messages in thread
From: Jeff Kirsher @ 2016-10-26 17:22 UTC (permalink / raw)
  To: Alexander Duyck, netdev, intel-wired-lan, linux-kernel, linux-mm
  Cc: davem, brouer

[-- Attachment #1: Type: text/plain, Size: 825 bytes --]

On Tue, 2016-10-25 at 11:39 -0400, Alexander Duyck wrote:
> This reverts commit f9d40f6a9921 ("igb: Revert support for build_skb in
> igb") and adds a few changes to update it to work with the latest version
> of igb. We are now able to revert the removal of this due to the fact
> that with the recent changes to the page count and the use of
> DMA_ATTR_SKIP_CPU_SYNC we can make the pages writable so we should not be
> invalidating the additional data added when we call build_skb.
> 
> The biggest risk with this change is that we are now not able to support
> full jumbo frames when using build_skb.  Instead we can only support up to
> 2K minus the skb overhead and padding offset.
> 
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>

Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
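
To put a rough number on "2K minus the skb overhead and padding offset", here is a small worked calculation. Every constant in it is an assumption chosen for illustration. In particular, the size of struct skb_shared_info and the cache line size vary by kernel configuration and architecture.

```c
#include <assert.h>

/* All values below are assumptions for illustration, not kernel constants. */
#define MODEL_CACHE_BYTES  64u   /* assumed SMP_CACHE_BYTES */
#define MODEL_BUFSZ        2048u /* half of a 4K page */
#define MODEL_SKB_PAD      64u   /* assumed NET_SKB_PAD + NET_IP_ALIGN */
#define MODEL_SHINFO_SIZE  320u  /* assumed sizeof(struct skb_shared_info) */

/* Round up to the cache line size, in the manner of SKB_DATA_ALIGN(). */
static unsigned int model_data_align(unsigned int len)
{
	return (len + MODEL_CACHE_BYTES - 1) & ~(MODEL_CACHE_BYTES - 1);
}

/* Largest frame that still fits in one buffer alongside the skb overhead. */
static unsigned int model_max_frame(void)
{
	unsigned int overhead = MODEL_SKB_PAD +
				model_data_align(MODEL_SHINFO_SIZE);

	assert(overhead < MODEL_BUFSZ);
	return MODEL_BUFSZ - overhead;
}
```

Under these assumptions the limit works out to 2048 - 64 - 320 = 1664 bytes, so a standard 1500-byte MTU frame fits in one buffer while a 9K jumbo frame does not.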

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack
  2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
                   ` (27 preceding siblings ...)
  2016-10-26 15:45 ` [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Jesper Dangaard Brouer
@ 2016-10-28 15:48 ` Alexander Duyck
  2016-10-28 17:06   ` David Miller
  28 siblings, 1 reply; 38+ messages in thread
From: Alexander Duyck @ 2016-10-28 15:48 UTC (permalink / raw)
  To: David Miller
  Cc: Netdev, intel-wired-lan, linux-kernel, linux-mm,
	Jesper Dangaard Brouer, Alexander Duyck, Konrad Rzeszutek Wilk,
	Jeff Kirsher

On Tue, Oct 25, 2016 at 8:36 AM, Alexander Duyck
<alexander.h.duyck@intel.com> wrote:
> The first 22 patches in the set add support for the DMA attribute
> DMA_ATTR_SKIP_CPU_SYNC on multiple platforms/architectures.  This is needed
> so that we can flag the calls to dma_map/unmap_page so that we do not
> invalidate cache lines that do not currently belong to the device.  Instead
> we have to take care of this in the driver via a call to
> sync_single_range_for_cpu prior to freeing the Rx page.
>
> Patch 23 adds support for dma_map_page_attrs and dma_unmap_page_attrs so
> that we can unmap and map a page using the DMA_ATTR_SKIP_CPU_SYNC
> attribute.
>
> Patch 24 adds support for freeing a page that has multiple references being
> held by a single caller.  This way we can free page fragments that were
> allocated by a given driver.
>
> The last 3 patches use these updates in the igb driver to allow for us to
> reimplement the use of build_skb.
>
> My hope is to get the series accepted into the net-next tree as I have a
> number of other Intel drivers I could then begin updating once these
> patches are accepted.
>
> v1: Split out changes DMA_ERROR_CODE fix for swiotlb-xen
>     Minor fixes based on issues found by kernel build bot
>     Few minor changes for issues found on code review
>     Added Acked-by for patches that were acked and not changed

So the feedback for this set has been mostly just a few "Acked-by"s,
and it looks like the series was marked as "Not Applicable" in
patchwork.  I was wondering what the correct merge strategy for this
patch set should be going forward?

I was wondering if I should be looking at breaking up the set and
splitting it over a few different trees, or if I should just hold onto
it and resubmit it when the merge window opens?  My preference would
be to submit it as a single set so I can know all the patches are
present to avoid any possible regressions due to only part of the set
being present.

Anyway, I am just trying to figure out how best to proceed from here
since these patch sets that touch multiple areas are always
complicated to get submitted.

Thanks.

- Alex

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack
  2016-10-28 15:48 ` Alexander Duyck
@ 2016-10-28 17:06   ` David Miller
  0 siblings, 0 replies; 38+ messages in thread
From: David Miller @ 2016-10-28 17:06 UTC (permalink / raw)
  To: alexander.duyck
  Cc: netdev, intel-wired-lan, linux-kernel, linux-mm, brouer,
	alexander.h.duyck, konrad.wilk, jeffrey.t.kirsher

From: Alexander Duyck <alexander.duyck@gmail.com>
Date: Fri, 28 Oct 2016 08:48:01 -0700

> So the feedback for this set has been mostly just a few "Acked-by"s,
> and it looks like the series was marked as "Not Applicable" in
> patchwork.  I was wondering what the correct merge strategy for this
> patch set should be going forward?

I marked it as not applicable because it's definitely not a networking
change, and merging it via my tree would be really inappropriate, even
though we need it for some infrastructure we want to build for
networking.

So you have to merge this upstream via a more appropriate path.

> I was wondering if I should be looking at breaking up the set and
> splitting it over a few different trees, or if I should just hold onto
> it and resubmit it when the merge window opens?  My preference would
> be to submit it as a single set so I can know all the patches are
> present to avoid any possible regressions due to only part of the set
> being present.

I don't think you need to split it up.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [net-next PATCH 03/27] swiotlb: Add support for DMA_ATTR_SKIP_CPU_SYNC
  2016-10-25 15:37 ` [net-next PATCH 03/27] swiotlb: Add support for DMA_ATTR_SKIP_CPU_SYNC Alexander Duyck
@ 2016-10-28 17:34   ` Konrad Rzeszutek Wilk
  2016-10-28 18:09     ` Alexander Duyck
  0 siblings, 1 reply; 38+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-10-28 17:34 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: netdev, intel-wired-lan, linux-kernel, linux-mm, brouer, davem

On Tue, Oct 25, 2016 at 11:37:03AM -0400, Alexander Duyck wrote:
> As a first step to making DMA_ATTR_SKIP_CPU_SYNC apply to architectures
> beyond just ARM I need to make it so that the swiotlb will respect the
> flag.  In order to do that I also need to update the swiotlb-xen since it
> heavily makes use of the functionality.
> 
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

I am pretty sure I acked it in the RFC. Was there a particular reason
you dropped my Ack (is this very different from the RFC)?

Thanks.

> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> ---
>  drivers/xen/swiotlb-xen.c |   11 +++++++---
>  include/linux/swiotlb.h   |    6 ++++--
>  lib/swiotlb.c             |   48 +++++++++++++++++++++++++++------------------
>  3 files changed, 40 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index b8014bf..3d048af 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -405,7 +405,8 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
>  	 */
>  	trace_swiotlb_bounced(dev, dev_addr, size, swiotlb_force);
>  
> -	map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir);
> +	map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir,
> +				     attrs);
>  	if (map == SWIOTLB_MAP_ERROR)
>  		return DMA_ERROR_CODE;
>  
> @@ -419,7 +420,8 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
>  	if (dma_capable(dev, dev_addr, size))
>  		return dev_addr;
>  
> -	swiotlb_tbl_unmap_single(dev, map, size, dir);
> +	swiotlb_tbl_unmap_single(dev, map, size, dir,
> +				 attrs | DMA_ATTR_SKIP_CPU_SYNC);
>  
>  	return DMA_ERROR_CODE;
>  }
> @@ -445,7 +447,7 @@ static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr,
>  
>  	/* NOTE: We use dev_addr here, not paddr! */
>  	if (is_xen_swiotlb_buffer(dev_addr)) {
> -		swiotlb_tbl_unmap_single(hwdev, paddr, size, dir);
> +		swiotlb_tbl_unmap_single(hwdev, paddr, size, dir, attrs);
>  		return;
>  	}
>  
> @@ -558,11 +560,12 @@ void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
>  								 start_dma_addr,
>  								 sg_phys(sg),
>  								 sg->length,
> -								 dir);
> +								 dir, attrs);
>  			if (map == SWIOTLB_MAP_ERROR) {
>  				dev_warn(hwdev, "swiotlb buffer is full\n");
>  				/* Don't panic here, we expect map_sg users
>  				   to do proper error handling. */
> +				attrs |= DMA_ATTR_SKIP_CPU_SYNC;
>  				xen_swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir,
>  							   attrs);
>  				sg_dma_len(sgl) = 0;
> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> index e237b6f..4517be9 100644
> --- a/include/linux/swiotlb.h
> +++ b/include/linux/swiotlb.h
> @@ -44,11 +44,13 @@ enum dma_sync_target {
>  extern phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
>  					  dma_addr_t tbl_dma_addr,
>  					  phys_addr_t phys, size_t size,
> -					  enum dma_data_direction dir);
> +					  enum dma_data_direction dir,
> +					  unsigned long attrs);
>  
>  extern void swiotlb_tbl_unmap_single(struct device *hwdev,
>  				     phys_addr_t tlb_addr,
> -				     size_t size, enum dma_data_direction dir);
> +				     size_t size, enum dma_data_direction dir,
> +				     unsigned long attrs);
>  
>  extern void swiotlb_tbl_sync_single(struct device *hwdev,
>  				    phys_addr_t tlb_addr,
> diff --git a/lib/swiotlb.c b/lib/swiotlb.c
> index 47aad37..b538d39 100644
> --- a/lib/swiotlb.c
> +++ b/lib/swiotlb.c
> @@ -425,7 +425,8 @@ static void swiotlb_bounce(phys_addr_t orig_addr, phys_addr_t tlb_addr,
>  phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
>  				   dma_addr_t tbl_dma_addr,
>  				   phys_addr_t orig_addr, size_t size,
> -				   enum dma_data_direction dir)
> +				   enum dma_data_direction dir,
> +				   unsigned long attrs)
>  {
>  	unsigned long flags;
>  	phys_addr_t tlb_addr;
> @@ -526,7 +527,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
>  	 */
>  	for (i = 0; i < nslots; i++)
>  		io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT);
> -	if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)
> +	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
> +	    (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
>  		swiotlb_bounce(orig_addr, tlb_addr, size, DMA_TO_DEVICE);
>  
>  	return tlb_addr;
> @@ -539,18 +541,20 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
>  
>  static phys_addr_t
>  map_single(struct device *hwdev, phys_addr_t phys, size_t size,
> -	   enum dma_data_direction dir)
> +	   enum dma_data_direction dir, unsigned long attrs)
>  {
>  	dma_addr_t start_dma_addr = phys_to_dma(hwdev, io_tlb_start);
>  
> -	return swiotlb_tbl_map_single(hwdev, start_dma_addr, phys, size, dir);
> +	return swiotlb_tbl_map_single(hwdev, start_dma_addr, phys, size,
> +				      dir, attrs);
>  }
>  
>  /*
>   * dma_addr is the kernel virtual address of the bounce buffer to unmap.
>   */
>  void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
> -			      size_t size, enum dma_data_direction dir)
> +			      size_t size, enum dma_data_direction dir,
> +			      unsigned long attrs)
>  {
>  	unsigned long flags;
>  	int i, count, nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
> @@ -561,6 +565,7 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
>  	 * First, sync the memory before unmapping the entry
>  	 */
>  	if (orig_addr != INVALID_PHYS_ADDR &&
> +	    !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
>  	    ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL)))
>  		swiotlb_bounce(orig_addr, tlb_addr, size, DMA_FROM_DEVICE);
>  
> @@ -654,7 +659,8 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
>  		 * GFP_DMA memory; fall back on map_single(), which
>  		 * will grab memory from the lowest available address range.
>  		 */
> -		phys_addr_t paddr = map_single(hwdev, 0, size, DMA_FROM_DEVICE);
> +		phys_addr_t paddr = map_single(hwdev, 0, size,
> +					       DMA_FROM_DEVICE, 0);
>  		if (paddr == SWIOTLB_MAP_ERROR)
>  			goto err_warn;
>  
> @@ -669,7 +675,8 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
>  
>  			/* DMA_TO_DEVICE to avoid memcpy in unmap_single */
>  			swiotlb_tbl_unmap_single(hwdev, paddr,
> -						 size, DMA_TO_DEVICE);
> +						 size, DMA_TO_DEVICE,
> +						 DMA_ATTR_SKIP_CPU_SYNC);
>  			goto err_warn;
>  		}
>  	}
> @@ -699,7 +706,7 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
>  		free_pages((unsigned long)vaddr, get_order(size));
>  	else
>  		/* DMA_TO_DEVICE to avoid memcpy in swiotlb_tbl_unmap_single */
> -		swiotlb_tbl_unmap_single(hwdev, paddr, size, DMA_TO_DEVICE);
> +		swiotlb_tbl_unmap_single(hwdev, paddr, size, DMA_TO_DEVICE, 0);
>  }
>  EXPORT_SYMBOL(swiotlb_free_coherent);
>  
> @@ -755,7 +762,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
>  	trace_swiotlb_bounced(dev, dev_addr, size, swiotlb_force);
>  
>  	/* Oh well, have to allocate and map a bounce buffer. */
> -	map = map_single(dev, phys, size, dir);
> +	map = map_single(dev, phys, size, dir, attrs);
>  	if (map == SWIOTLB_MAP_ERROR) {
>  		swiotlb_full(dev, size, dir, 1);
>  		return phys_to_dma(dev, io_tlb_overflow_buffer);
> @@ -764,12 +771,13 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
>  	dev_addr = phys_to_dma(dev, map);
>  
>  	/* Ensure that the address returned is DMA'ble */
> -	if (!dma_capable(dev, dev_addr, size)) {
> -		swiotlb_tbl_unmap_single(dev, map, size, dir);
> -		return phys_to_dma(dev, io_tlb_overflow_buffer);
> -	}
> +	if (dma_capable(dev, dev_addr, size))
> +		return dev_addr;
> +
> +	swiotlb_tbl_unmap_single(dev, map, size, dir,
> +				 attrs | DMA_ATTR_SKIP_CPU_SYNC);
>  
> -	return dev_addr;
> +	return phys_to_dma(dev, io_tlb_overflow_buffer);
>  }
>  EXPORT_SYMBOL_GPL(swiotlb_map_page);
>  
> @@ -782,14 +790,15 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
>   * whatever the device wrote there.
>   */
>  static void unmap_single(struct device *hwdev, dma_addr_t dev_addr,
> -			 size_t size, enum dma_data_direction dir)
> +			 size_t size, enum dma_data_direction dir,
> +			 unsigned long attrs)
>  {
>  	phys_addr_t paddr = dma_to_phys(hwdev, dev_addr);
>  
>  	BUG_ON(dir == DMA_NONE);
>  
>  	if (is_swiotlb_buffer(paddr)) {
> -		swiotlb_tbl_unmap_single(hwdev, paddr, size, dir);
> +		swiotlb_tbl_unmap_single(hwdev, paddr, size, dir, attrs);
>  		return;
>  	}
>  
> @@ -809,7 +818,7 @@ void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
>  			size_t size, enum dma_data_direction dir,
>  			unsigned long attrs)
>  {
> -	unmap_single(hwdev, dev_addr, size, dir);
> +	unmap_single(hwdev, dev_addr, size, dir, attrs);
>  }
>  EXPORT_SYMBOL_GPL(swiotlb_unmap_page);
>  
> @@ -891,7 +900,7 @@ void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
>  		if (swiotlb_force ||
>  		    !dma_capable(hwdev, dev_addr, sg->length)) {
>  			phys_addr_t map = map_single(hwdev, sg_phys(sg),
> -						     sg->length, dir);
> +						     sg->length, dir, attrs);
>  			if (map == SWIOTLB_MAP_ERROR) {
>  				/* Don't panic here, we expect map_sg users
>  				   to do proper error handling. */
> @@ -925,7 +934,8 @@ void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
>  	BUG_ON(dir == DMA_NONE);
>  
>  	for_each_sg(sgl, sg, nelems, i)
> -		unmap_single(hwdev, sg->dma_address, sg_dma_len(sg), dir);
> +		unmap_single(hwdev, sg->dma_address, sg_dma_len(sg), dir,
> +			     attrs);
>  
>  }
>  EXPORT_SYMBOL(swiotlb_unmap_sg_attrs);
> 
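
As a way of reasoning about the bounce-buffer behaviour in the patch above, here is a small user-space model. It is a sketch under stated assumptions: the attribute bit, the fixed 16-byte buffer, and every model_* name are invented for the example, and the real swiotlb code tracks slots and addresses very differently.

```c
#include <assert.h>
#include <string.h>

#define MODEL_ATTR_SKIP_CPU_SYNC (1UL << 0) /* hypothetical bit value */
#define MODEL_BUF_SIZE 16

enum model_dir { MODEL_TO_DEVICE, MODEL_FROM_DEVICE };

struct model_mapping {
	char *orig;                  /* the driver's buffer */
	char bounce[MODEL_BUF_SIZE]; /* stand-in for the swiotlb slot */
};

/* Map: copy to the bounce buffer only for device-bound data, unless skipped. */
static void model_map(struct model_mapping *m, char *orig,
		      enum model_dir dir, unsigned long attrs)
{
	m->orig = orig;
	memset(m->bounce, 0, sizeof(m->bounce));
	if (!(attrs & MODEL_ATTR_SKIP_CPU_SYNC) && dir == MODEL_TO_DEVICE)
		memcpy(m->bounce, orig, MODEL_BUF_SIZE);
}

/* Explicit partial sync, as a driver would request for the received bytes. */
static void model_sync_for_cpu(struct model_mapping *m, size_t off, size_t len)
{
	assert(off + len <= MODEL_BUF_SIZE);
	memcpy(m->orig + off, m->bounce + off, len);
}

/* Unmap: copy back from the bounce buffer unless the caller opted out. */
static void model_unmap(struct model_mapping *m, enum model_dir dir,
			unsigned long attrs)
{
	if (!(attrs & MODEL_ATTR_SKIP_CPU_SYNC) && dir == MODEL_FROM_DEVICE)
		memcpy(m->orig, m->bounce, MODEL_BUF_SIZE);
}
```

The point of the model is that an unmap without the attribute copies the whole bounce buffer back and would overwrite anything the stack has since written into the original page, whereas an unmap with the attribute leaves the page alone.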

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [net-next PATCH 02/27] swiotlb-xen: Enforce return of DMA_ERROR_CODE in mapping function
  2016-10-25 15:36 ` [net-next PATCH 02/27] swiotlb-xen: Enforce return of DMA_ERROR_CODE in mapping function Alexander Duyck
@ 2016-10-28 17:35   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 38+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-10-28 17:35 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: netdev, intel-wired-lan, linux-kernel, linux-mm, brouer, davem

On Tue, Oct 25, 2016 at 11:36:58AM -0400, Alexander Duyck wrote:
> The mapping function should always return DMA_ERROR_CODE when a mapping has
> failed as this is what the DMA API expects when a DMA error has occurred.
> The current function for mapping a page in Xen was returning either
> DMA_ERROR_CODE or 0 depending on where it failed.
> 
> On x86 DMA_ERROR_CODE is 0, but on other architectures such as ARM it is
> ~0. We need to make sure we return the same error value if either the
> mapping failed or the device is not capable of accessing the mapping.
> 
> If we are returning DMA_ERROR_CODE as our error value we can drop the
> function for checking the error code as the default is to compare the
> return value against DMA_ERROR_CODE if no function is defined.
> 
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

I am pretty sure I gave an Ack. Any particular reason for dropping it?
(If so, please add a comment under the --- giving the reason.)

Thanks.

> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> ---
>  arch/arm/xen/mm.c              |    1 -
>  arch/x86/xen/pci-swiotlb-xen.c |    1 -
>  drivers/xen/swiotlb-xen.c      |   18 ++++++------------
>  include/xen/swiotlb-xen.h      |    3 ---
>  4 files changed, 6 insertions(+), 17 deletions(-)
> 
> diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
> index d062f08..bd62d94 100644
> --- a/arch/arm/xen/mm.c
> +++ b/arch/arm/xen/mm.c
> @@ -186,7 +186,6 @@ void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order)
>  EXPORT_SYMBOL(xen_dma_ops);
>  
>  static struct dma_map_ops xen_swiotlb_dma_ops = {
> -	.mapping_error = xen_swiotlb_dma_mapping_error,
>  	.alloc = xen_swiotlb_alloc_coherent,
>  	.free = xen_swiotlb_free_coherent,
>  	.sync_single_for_cpu = xen_swiotlb_sync_single_for_cpu,
> diff --git a/arch/x86/xen/pci-swiotlb-xen.c b/arch/x86/xen/pci-swiotlb-xen.c
> index 0e98e5d..a9fafb5 100644
> --- a/arch/x86/xen/pci-swiotlb-xen.c
> +++ b/arch/x86/xen/pci-swiotlb-xen.c
> @@ -19,7 +19,6 @@
>  int xen_swiotlb __read_mostly;
>  
>  static struct dma_map_ops xen_swiotlb_dma_ops = {
> -	.mapping_error = xen_swiotlb_dma_mapping_error,
>  	.alloc = xen_swiotlb_alloc_coherent,
>  	.free = xen_swiotlb_free_coherent,
>  	.sync_single_for_cpu = xen_swiotlb_sync_single_for_cpu,
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 87e6035..b8014bf 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -416,11 +416,12 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
>  	/*
>  	 * Ensure that the address returned is DMA'ble
>  	 */
> -	if (!dma_capable(dev, dev_addr, size)) {
> -		swiotlb_tbl_unmap_single(dev, map, size, dir);
> -		dev_addr = 0;
> -	}
> -	return dev_addr;
> +	if (dma_capable(dev, dev_addr, size))
> +		return dev_addr;
> +
> +	swiotlb_tbl_unmap_single(dev, map, size, dir);
> +
> +	return DMA_ERROR_CODE;
>  }
>  EXPORT_SYMBOL_GPL(xen_swiotlb_map_page);
>  
> @@ -648,13 +649,6 @@ void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
>  }
>  EXPORT_SYMBOL_GPL(xen_swiotlb_sync_sg_for_device);
>  
> -int
> -xen_swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr)
> -{
> -	return !dma_addr;
> -}
> -EXPORT_SYMBOL_GPL(xen_swiotlb_dma_mapping_error);
> -
>  /*
>   * Return whether the given device DMA address mask can be supported
>   * properly.  For example, if your device can only drive the low 24-bits
> diff --git a/include/xen/swiotlb-xen.h b/include/xen/swiotlb-xen.h
> index 7c35e27..a0083be 100644
> --- a/include/xen/swiotlb-xen.h
> +++ b/include/xen/swiotlb-xen.h
> @@ -51,9 +51,6 @@ extern void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
>  			       int nelems, enum dma_data_direction dir);
>  
>  extern int
> -xen_swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr);
> -
> -extern int
>  xen_swiotlb_dma_supported(struct device *hwdev, u64 mask);
>  
>  extern int
> 
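
The mismatch the commit message describes can be illustrated with a short user-space model. It assumes an architecture where the error cookie is all ones, as the message says is the case on ARM. The type and function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t model_dma_addr_t;

/* Assumed to mirror an architecture whose error cookie is all ones. */
#define MODEL_DMA_ERROR_CODE (~(model_dma_addr_t)0)

/* Shape of the check the patch removes: only 0 counts as an error. */
static int model_error_old_check(model_dma_addr_t addr)
{
	return !addr;
}

/* Default check: with no per-ops callback, compare against the cookie. */
static int model_mapping_error(model_dma_addr_t addr)
{
	return addr == MODEL_DMA_ERROR_CODE;
}

/* Patched behaviour: every failure path returns the same cookie. */
static model_dma_addr_t model_map_page(int capable, model_dma_addr_t dev_addr)
{
	assert(dev_addr != 0 && dev_addr != MODEL_DMA_ERROR_CODE);
	return capable ? dev_addr : MODEL_DMA_ERROR_CODE;
}
```

When a mapping function can fail with either 0 or the cookie, neither check catches both failures. Returning one value everywhere lets the default comparison do the job.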

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [net-next PATCH 03/27] swiotlb: Add support for DMA_ATTR_SKIP_CPU_SYNC
  2016-10-28 17:34   ` Konrad Rzeszutek Wilk
@ 2016-10-28 18:09     ` Alexander Duyck
  0 siblings, 0 replies; 38+ messages in thread
From: Alexander Duyck @ 2016-10-28 18:09 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Alexander Duyck, Netdev, intel-wired-lan, linux-kernel, linux-mm,
	Jesper Dangaard Brouer, David Miller

On Fri, Oct 28, 2016 at 10:34 AM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
> On Tue, Oct 25, 2016 at 11:37:03AM -0400, Alexander Duyck wrote:
>> As a first step to making DMA_ATTR_SKIP_CPU_SYNC apply to architectures
>> beyond just ARM I need to make it so that the swiotlb will respect the
>> flag.  In order to do that I also need to update the swiotlb-xen since it
>> heavily makes use of the functionality.
>>
>> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>
> I am pretty sure I acked it in the RFC. Was there a particular reason
> you dropped my Ack (is this very different from the RFC)?
>
> Thanks.

If I recall, you had acked patch 1, but for patch 2 you had some review
comments and suggested I change a few things.  What was patch 2 in
the RFC was split out into patches 2 and 3.  That is why I didn't
include an Ack from you for those patches.

Patch 2 is a fix for Xen to address the fact that you could return
either 0 or ~0.  It was part of patch 2 originally and I pulled it out
into a separate patch.

Patch 3 does most of what patch 2 in the RFC was doing before with
fixes to address the fact that I was moving some code to avoid going
over 80 characters.  I found a different way to fix that by just
updating attrs before using it instead of ORing in the value when
passing it as a parameter.

>> @@ -558,11 +560,12 @@ void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
>>                                                                start_dma_addr,
>>                                                                sg_phys(sg),
>>                                                                sg->length,
>> -                                                              dir);
>> +                                                              dir, attrs);
>>                       if (map == SWIOTLB_MAP_ERROR) {
>>                               dev_warn(hwdev, "swiotlb buffer is full\n");
>>                               /* Don't panic here, we expect map_sg users
>>                                  to do proper error handling. */
>> +                             attrs |= DMA_ATTR_SKIP_CPU_SYNC;
>>                               xen_swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir,
>>                                                          attrs);
>>                               sg_dma_len(sgl) = 0;

The biggest difference from patch 2 in the RFC is right here.  Before,
the code moved this error handling to the end of the function and added
a label which I then jumped to.  Now I just OR DMA_ATTR_SKIP_CPU_SYNC
into attrs and avoid the problem entirely.  It should be harmless to do
it this way since attrs isn't used anywhere else once we have hit the
error.
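
The pattern can be sketched in plain userspace C.  This is only a mock
under stated assumptions, not the swiotlb-xen code: struct mock_sg,
mock_map_one, mock_unmap_sg, mock_map_sg and MOCK_ATTR_SKIP_CPU_SYNC
are hypothetical stand-ins, and the flag value is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag for this sketch; the value is arbitrary. */
#define MOCK_ATTR_SKIP_CPU_SYNC (1UL << 5)

struct mock_sg {
	int mapped;	/* 1 while the entry holds a mapping */
	int synced;	/* number of CPU syncs done while unmapping */
	int fail;	/* set to force a mapping failure */
};

static int mock_map_one(struct mock_sg *sg)
{
	if (sg->fail)
		return -1;
	sg->mapped = 1;
	return 0;
}

/* Unmap the first n entries, syncing for the CPU unless told to skip. */
static void mock_unmap_sg(struct mock_sg *sgl, size_t n, unsigned long attrs)
{
	for (size_t i = 0; i < n; i++) {
		if (!(attrs & MOCK_ATTR_SKIP_CPU_SYNC))
			sgl[i].synced++;
		sgl[i].mapped = 0;
	}
}

/* Returns the number of entries mapped, or 0 after unwinding on failure. */
static size_t mock_map_sg(struct mock_sg *sgl, size_t n, unsigned long attrs)
{
	assert(sgl != NULL);

	for (size_t i = 0; i < n; i++) {
		if (mock_map_one(&sgl[i]) < 0) {
			/*
			 * attrs is not read again after this point, so
			 * updating the local copy in place is safe and no
			 * separate error label is needed.
			 */
			attrs |= MOCK_ATTR_SKIP_CPU_SYNC;
			mock_unmap_sg(sgl, i, attrs);
			return 0;
		}
	}
	return n;
}
```

On the failure path the already-mapped entries are released without a
CPU sync, while a normal unmap with attrs == 0 still syncs each entry.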

I hope that helps to clear it up.  So if you want I will add your
Acked-by for patches 2 and 3, but I just wanted to make sure this
worked with the changes you suggested.

Thanks.

- Alex

Thread overview: 38+ messages
2016-10-25 15:36 [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Alexander Duyck
2016-10-25 15:36 ` [net-next PATCH 01/27] swiotlb: Drop unused function swiotlb_map_sg Alexander Duyck
2016-10-25 15:36 ` [net-next PATCH 02/27] swiotlb-xen: Enforce return of DMA_ERROR_CODE in mapping function Alexander Duyck
2016-10-28 17:35   ` Konrad Rzeszutek Wilk
2016-10-25 15:37 ` [net-next PATCH 03/27] swiotlb: Add support for DMA_ATTR_SKIP_CPU_SYNC Alexander Duyck
2016-10-28 17:34   ` Konrad Rzeszutek Wilk
2016-10-28 18:09     ` Alexander Duyck
2016-10-25 15:37 ` [net-next PATCH 04/27] arch/arc: Add option to skip sync on DMA mapping Alexander Duyck
2016-10-25 22:00   ` Vineet Gupta
2016-10-25 15:37 ` [net-next PATCH 05/27] arch/arm: Add option to skip sync on DMA map and unmap Alexander Duyck
2016-10-25 15:37 ` [net-next PATCH 06/27] arch/avr32: Add option to skip sync on DMA map Alexander Duyck
2016-10-25 15:37 ` [net-next PATCH 07/27] arch/blackfin: " Alexander Duyck
2016-10-25 15:37 ` [net-next PATCH 08/27] arch/c6x: Add option to skip sync on DMA map and unmap Alexander Duyck
2016-10-25 15:37 ` [net-next PATCH 09/27] arch/frv: Add option to skip sync on DMA map Alexander Duyck
2016-10-25 15:37 ` [net-next PATCH 10/27] arch/hexagon: Add option to skip DMA sync as a part of mapping Alexander Duyck
2016-10-25 15:37 ` [net-next PATCH 11/27] arch/m68k: " Alexander Duyck
2016-10-25 15:37 ` [net-next PATCH 12/27] arch/metag: Add option to skip DMA sync as a part of map and unmap Alexander Duyck
2016-10-25 15:37 ` [net-next PATCH 13/27] arch/microblaze: " Alexander Duyck
2016-10-25 15:38 ` [net-next PATCH 14/27] arch/mips: " Alexander Duyck
2016-10-25 15:38 ` [net-next PATCH 15/27] arch/nios2: " Alexander Duyck
2016-10-25 15:38 ` [net-next PATCH 16/27] arch/openrisc: Add option to skip DMA sync as a part of mapping Alexander Duyck
2016-10-25 15:38 ` [net-next PATCH 17/27] arch/parisc: Add option to skip DMA sync as a part of map and unmap Alexander Duyck
2016-10-25 15:38 ` [net-next PATCH 18/27] arch/powerpc: Add option to skip DMA sync as a part of mapping Alexander Duyck
2016-10-25 15:38 ` [net-next PATCH 19/27] arch/sh: " Alexander Duyck
2016-10-25 15:38 ` [net-next PATCH 20/27] arch/sparc: Add option to skip DMA sync as a part of map and unmap Alexander Duyck
2016-10-25 15:38 ` [net-next PATCH 21/27] arch/tile: " Alexander Duyck
2016-10-25 15:38 ` [net-next PATCH 22/27] arch/xtensa: Add option to skip DMA sync as a part of mapping Alexander Duyck
2016-10-25 15:38 ` [net-next PATCH 23/27] dma: Add calls for dma_map_page_attrs and dma_unmap_page_attrs Alexander Duyck
2016-10-25 15:38 ` [net-next PATCH 24/27] mm: Add support for releasing multiple instances of a page Alexander Duyck
2016-10-25 15:39 ` [net-next PATCH 25/27] igb: Update driver to make use of DMA_ATTR_SKIP_CPU_SYNC Alexander Duyck
2016-10-26 17:21   ` [Intel-wired-lan] " Jeff Kirsher
2016-10-25 15:39 ` [net-next PATCH 26/27] igb: Update code to better handle incrementing page count Alexander Duyck
2016-10-26 17:21   ` [Intel-wired-lan] " Jeff Kirsher
2016-10-25 15:39 ` [net-next PATCH 27/27] igb: Revert "igb: Revert support for build_skb in igb" Alexander Duyck
2016-10-26 17:22   ` [Intel-wired-lan] " Jeff Kirsher
2016-10-26 15:45 ` [net-next PATCH 00/27] Add support for DMA writable pages being writable by the network stack Jesper Dangaard Brouer
2016-10-28 15:48 ` Alexander Duyck
2016-10-28 17:06   ` David Miller
