* [RFC 0/3] Introduce usb_{alloc,free}_noncoherent API
From: Ezequiel Garcia @ 2018-08-30 17:20 UTC
  To: linux-media, linux-usb, linux-arm-kernel, linux-kernel
  Cc: Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern,
	kernel, Keiichi Watanabe, Ezequiel Garcia

Following the discussion on the PWC [1] and UVC [2] drivers, where
using non-consistent mappings for the URB transfer buffers was
shown to improve transfer speed significantly, here's a proposal
for non-coherent USB helpers.

With this patchset, it's possible to get 360x288 raw analog video
using stk1160 and an AM335x BeagleBone Black board. This isn't
possible in mainline, for the same reasons Matwey has explained [1].

The first patch is a hack, obviously incomplete, to add support for
non-consistent mappings on ARM.

The second patch introduces the usb_{alloc,free}_noncoherent API,
while the third patch converts stk1160 as an example.

I'm sending this patchset as RFC, just to get the ball rolling.

[1] https://lkml.org/lkml/2018/8/21/663
[2] https://lkml.org/lkml/2018/6/27/188

Ezequiel Garcia (3):
  HACK: ARM: dma-mapping: Get writeback memory for non-consistent
    mappings
  USB: core: Add non-coherent buffer allocation helpers
  stk1160: Use non-coherent buffers for USB transfers

 arch/arm/include/asm/pgtable.h            |  3 ++
 arch/arm/mm/dma-mapping.c                 |  9 ++--
 drivers/media/usb/stk1160/stk1160-video.c | 22 +++------
 drivers/usb/core/buffer.c                 | 29 +++++++-----
 drivers/usb/core/hcd.c                    |  5 +-
 drivers/usb/core/usb.c                    | 56 ++++++++++++++++++++++-
 include/linux/usb.h                       |  5 ++
 include/linux/usb/hcd.h                   |  4 +-
 8 files changed, 97 insertions(+), 36 deletions(-)

-- 
2.18.0



* [RFC 1/3] HACK: ARM: dma-mapping: Get writeback memory for non-consistent mappings
From: Ezequiel Garcia @ 2018-08-30 17:20 UTC
  To: linux-media, linux-usb, linux-arm-kernel, linux-kernel
  Cc: Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern,
	kernel, Keiichi Watanabe, Ezequiel Garcia

This is obviously a hack.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
---
 arch/arm/include/asm/pgtable.h | 3 +++
 arch/arm/mm/dma-mapping.c      | 9 ++++++---
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index a757401129f9..37ddd0d73434 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -122,6 +122,9 @@ extern pgprot_t		pgprot_s2_device;
 #define pgprot_writecombine(prot) \
 	__pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_BUFFERABLE)
 
+#define pgprot_writeback(prot) \
+	__pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_WRITEBACK)
+
 #define pgprot_stronglyordered(prot) \
 	__pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_UNCACHED)
 
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 66566472c153..11cca7bbb0a8 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -633,9 +633,12 @@ static void __free_from_contiguous(struct device *dev, struct page *page,
 
 static inline pgprot_t __get_dma_pgprot(unsigned long attrs, pgprot_t prot)
 {
-	prot = (attrs & DMA_ATTR_WRITE_COMBINE) ?
-			pgprot_writecombine(prot) :
-			pgprot_dmacoherent(prot);
+	if (attrs & DMA_ATTR_WRITE_COMBINE)
+		prot = pgprot_writecombine(prot);
+	else if (attrs & DMA_ATTR_NON_CONSISTENT)
+		prot = pgprot_writeback(prot);
+	else
+		prot = pgprot_dmacoherent(prot);
 	return prot;
 }
 
-- 
2.18.0



* [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers
From: Ezequiel Garcia @ 2018-08-30 17:20 UTC
  To: linux-media, linux-usb, linux-arm-kernel, linux-kernel
  Cc: Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern,
	kernel, Keiichi Watanabe, Ezequiel Garcia

As noted recently by Matwey V. Kornilov, using coherent
buffers on platforms _without_ hardware coherency results in
some devices being completely unusable, due to transfers
being too slow.

Moreover, using non-coherent buffers on platforms _with_ hardware
coherency does not show a significant impact. This has been tested
by Matwey on PWC USB cameras on x86_64 and ARM platforms.
Quoting [1] (where kmalloc-ed buffers use streaming mappings):

"""
[..] average memcpy() data transfer rate (rate) and handler
completion time (time) have been measured when running video stream at
640x480 resolution at 10fps.

x86_64 based system (Intel Core i5-3470). This platform has hardware
coherent DMA support and proposed change doesn't make big difference here.

 * kmalloc:            rate = (2.0 +- 0.4) GBps
                       time = (5.0 +- 3.0) usec
 * usb_alloc_coherent: rate = (3.4 +- 1.2) GBps
                       time = (3.5 +- 3.0) usec

armv7l based system (TI AM335x BeagleBone Black @ 300MHz). This platform
has no hardware coherent DMA support. DMA coherence is implemented via
disabled page caching that slows down memcpy() due to memory controller
behaviour.

 * kmalloc:            rate =  (114 +- 5) MBps
                       time =   (84 +- 4) usec
 * usb_alloc_coherent: rate = (28.1 +- 0.1) MBps
                       time =  (341 +- 2) usec
"""
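
As a quick check on the figures quoted above, the sketch below computes the
ratio between the two allocation strategies from the mean values only. The
helper name mean_ratio() is made up for illustration, and the "+-"
uncertainties are deliberately ignored.

```c
#include <assert.h>	/* only needed for the sanity checks described below */

/*
 * Ratio of two mean measurements, e.g. the kmalloc rate divided by the
 * usb_alloc_coherent rate. Only the quoted means are used; the error
 * bars are not taken into account in this sketch.
 */
static double mean_ratio(double numerator, double denominator)
{
	return numerator / denominator;
}
```

With the AM335x numbers, mean_ratio(114.0, 28.1) is roughly 4.06 for the
rate and mean_ratio(341.0, 84.0) is roughly 4.06 for the completion time,
so kmalloc-ed buffers come out about four times faster there. With the
x86_64 numbers, mean_ratio(2.0, 3.4) is about 0.59, but the quoted error
bars are large relative to the means.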

Introduce a pair of usb_{alloc,free}_noncoherent helper functions,
for drivers that want to use non-coherent transfer buffers.

[1]: https://lkml.org/lkml/2018/8/9/734

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
---
 drivers/usb/core/buffer.c | 29 ++++++++++++--------
 drivers/usb/core/hcd.c    |  5 ++--
 drivers/usb/core/usb.c    | 56 +++++++++++++++++++++++++++++++++++++--
 include/linux/usb.h       |  5 ++++
 include/linux/usb/hcd.h   |  4 +--
 5 files changed, 82 insertions(+), 17 deletions(-)

diff --git a/drivers/usb/core/buffer.c b/drivers/usb/core/buffer.c
index 77eef8acff94..1bc9df883337 100644
--- a/drivers/usb/core/buffer.c
+++ b/drivers/usb/core/buffer.c
@@ -119,7 +119,8 @@ void *hcd_buffer_alloc(
 	struct usb_bus		*bus,
 	size_t			size,
 	gfp_t			mem_flags,
-	dma_addr_t		*dma
+	dma_addr_t		*dma,
+	unsigned long		attrs
 )
 {
 	struct usb_hcd		*hcd = bus_to_hcd(bus);
@@ -136,18 +137,22 @@ void *hcd_buffer_alloc(
 		return kmalloc(size, mem_flags);
 	}
 
-	for (i = 0; i < HCD_BUFFER_POOLS; i++) {
-		if (size <= pool_max[i])
-			return dma_pool_alloc(hcd->pool[i], mem_flags, dma);
+	/* Only use pools for coherent buffer requests */
+	if (!attrs) {
+		for (i = 0; i < HCD_BUFFER_POOLS; i++)
+			if (size <= pool_max[i])
+				return dma_pool_alloc(hcd->pool[i],
+						mem_flags, dma);
 	}
-	return dma_alloc_coherent(hcd->self.sysdev, size, dma, mem_flags);
+	return dma_alloc_attrs(hcd->self.sysdev, size, dma, mem_flags, attrs);
 }
 
 void hcd_buffer_free(
 	struct usb_bus		*bus,
 	size_t			size,
 	void			*addr,
-	dma_addr_t		dma
+	dma_addr_t		dma,
+	unsigned long		attrs
 )
 {
 	struct usb_hcd		*hcd = bus_to_hcd(bus);
@@ -163,11 +168,13 @@ void hcd_buffer_free(
 		return;
 	}
 
-	for (i = 0; i < HCD_BUFFER_POOLS; i++) {
-		if (size <= pool_max[i]) {
-			dma_pool_free(hcd->pool[i], addr, dma);
-			return;
+	if (!attrs) {
+		for (i = 0; i < HCD_BUFFER_POOLS; i++) {
+			if (size <= pool_max[i]) {
+				dma_pool_free(hcd->pool[i], addr, dma);
+				return;
+			}
 		}
 	}
-	dma_free_coherent(hcd->self.sysdev, size, addr, dma);
+	dma_free_attrs(hcd->self.sysdev, size, addr, dma, attrs);
 }
diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
index 1c21955fe7c0..25303738eb28 100644
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -1383,7 +1383,7 @@ static int hcd_alloc_coherent(struct usb_bus *bus,
 	}
 
 	vaddr = hcd_buffer_alloc(bus, size + sizeof(vaddr),
-				 mem_flags, dma_handle);
+				 mem_flags, dma_handle, 0);
 	if (!vaddr)
 		return -ENOMEM;
 
@@ -1416,7 +1416,8 @@ static void hcd_free_coherent(struct usb_bus *bus, dma_addr_t *dma_handle,
 	if (dir == DMA_FROM_DEVICE)
 		memcpy(vaddr, *vaddr_handle, size);
 
-	hcd_buffer_free(bus, size + sizeof(vaddr), *vaddr_handle, *dma_handle);
+	hcd_buffer_free(bus, size + sizeof(vaddr),
+			*vaddr_handle, *dma_handle, 0);
 
 	*vaddr_handle = vaddr;
 	*dma_handle = 0;
diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
index 623be3174fb3..234ea5ab4bb7 100644
--- a/drivers/usb/core/usb.c
+++ b/drivers/usb/core/usb.c
@@ -858,6 +858,58 @@ int __usb_get_extra_descriptor(char *buffer, unsigned size,
 }
 EXPORT_SYMBOL_GPL(__usb_get_extra_descriptor);
 
+/**
+ * usb_alloc_noncoherent - allocate dma-non-coherent buffer
+ * @dev: device the buffer will be used with
+ * @size: requested buffer size
+ * @mem_flags: affect whether allocation may block
+ * @dma: used to return DMA address of buffer
+ *
+ * Return: Either null (indicating no buffer could be allocated), or the
+ * cpu-space pointer to a non-coherent buffer that may be used to perform
+ * DMA to the specified device. Such cpu-space buffers are returned along
+ * with the DMA address (through the pointer provided).
+ *
+ * Note:
+ * These non-coherent buffers are used with URB_NO_xxx_DMA_MAP set in
+ * urb->transfer_flags to avoid using "DMA bounce buffers". When using
+ * this API, you must have the necessary sync points. If you are unsure
+ * about this, you should be using coherent buffers via usb_alloc_coherent().
+ *
+ * When the buffer is no longer used, free it with usb_free_noncoherent().
+ */
+void *usb_alloc_noncoherent(struct usb_device *dev, size_t size, gfp_t mem_flags,
+			 dma_addr_t *dma)
+{
+	if (!dev || !dev->bus)
+		return NULL;
+	return hcd_buffer_alloc(dev->bus, size,
+			mem_flags, dma, DMA_ATTR_NON_CONSISTENT);
+}
+EXPORT_SYMBOL_GPL(usb_alloc_noncoherent);
+
+/**
+ * usb_free_noncoherent - free memory allocated with usb_alloc_noncoherent()
+ * @dev: device the buffer was used with
+ * @size: requested buffer size
+ * @addr: CPU address of buffer
+ * @dma: DMA address of buffer
+ *
+ * This reclaims an I/O buffer, letting it be reused.  The memory must have
+ * been allocated using usb_alloc_noncoherent(), and the parameters must match
+ * those provided in that allocation request.
+ */
+void usb_free_noncoherent(struct usb_device *dev, size_t size, void *addr,
+		       dma_addr_t dma)
+{
+	if (!dev || !dev->bus)
+		return;
+	if (!addr)
+		return;
+	hcd_buffer_free(dev->bus, size, addr, dma, DMA_ATTR_NON_CONSISTENT);
+}
+EXPORT_SYMBOL_GPL(usb_free_noncoherent);
+
 /**
  * usb_alloc_coherent - allocate dma-consistent buffer for URB_NO_xxx_DMA_MAP
  * @dev: device the buffer will be used with
@@ -886,7 +938,7 @@ void *usb_alloc_coherent(struct usb_device *dev, size_t size, gfp_t mem_flags,
 {
 	if (!dev || !dev->bus)
 		return NULL;
-	return hcd_buffer_alloc(dev->bus, size, mem_flags, dma);
+	return hcd_buffer_alloc(dev->bus, size, mem_flags, dma, 0);
 }
 EXPORT_SYMBOL_GPL(usb_alloc_coherent);
 
@@ -908,7 +960,7 @@ void usb_free_coherent(struct usb_device *dev, size_t size, void *addr,
 		return;
 	if (!addr)
 		return;
-	hcd_buffer_free(dev->bus, size, addr, dma);
+	hcd_buffer_free(dev->bus, size, addr, dma, 0);
 }
 EXPORT_SYMBOL_GPL(usb_free_coherent);
 
diff --git a/include/linux/usb.h b/include/linux/usb.h
index 4cdd515a4385..7fddd6c2a61e 100644
--- a/include/linux/usb.h
+++ b/include/linux/usb.h
@@ -1750,6 +1750,11 @@ static inline int usb_urb_dir_out(struct urb *urb)
 
 int usb_urb_ep_type_check(const struct urb *urb);
 
+void *usb_alloc_noncoherent(struct usb_device *dev, size_t size,
+	gfp_t mem_flags, dma_addr_t *dma);
+void usb_free_noncoherent(struct usb_device *dev, size_t size,
+	void *addr, dma_addr_t dma);
+
 void *usb_alloc_coherent(struct usb_device *dev, size_t size,
 	gfp_t mem_flags, dma_addr_t *dma);
 void usb_free_coherent(struct usb_device *dev, size_t size,
diff --git a/include/linux/usb/hcd.h b/include/linux/usb/hcd.h
index 97e2ddec18b1..41dd5f0acaad 100644
--- a/include/linux/usb/hcd.h
+++ b/include/linux/usb/hcd.h
@@ -486,9 +486,9 @@ int hcd_buffer_create(struct usb_hcd *hcd);
 void hcd_buffer_destroy(struct usb_hcd *hcd);
 
 void *hcd_buffer_alloc(struct usb_bus *bus, size_t size,
-	gfp_t mem_flags, dma_addr_t *dma);
+	gfp_t mem_flags, dma_addr_t *dma, unsigned long attrs);
 void hcd_buffer_free(struct usb_bus *bus, size_t size,
-	void *addr, dma_addr_t dma);
+	void *addr, dma_addr_t dma, unsigned long attrs);
 
 /* generic bus glue, needed for host controllers that don't use PCI */
 extern irqreturn_t usb_hcd_irq(int irq, void *__hcd);
-- 
2.18.0
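
The allocation path in the patched hcd_buffer_alloc() can be modelled in
userspace as follows. This is only a sketch: NUM_POOLS, the pool_max[]
values and pick_pool() are assumptions made for illustration, and the
early kmalloc() return for host controllers without DMA is left out.

```c
#include <assert.h>	/* only needed for the sanity checks described below */
#include <stddef.h>

#define NUM_POOLS 4

/* Assumed pool size classes in bytes; illustrative values only. */
static const size_t pool_max[NUM_POOLS] = { 32, 128, 512, 2048 };

/*
 * Returns the index of the pool that would serve the request, or -1 when
 * the request falls through to the dma_alloc_attrs() call at the end of
 * the function. Any non-zero attrs value bypasses the pools entirely.
 */
static int pick_pool(size_t size, unsigned long attrs)
{
	int i;

	if (attrs)
		return -1;
	for (i = 0; i < NUM_POOLS; i++)
		if (size <= pool_max[i])
			return i;
	return -1;
}
```

In this model a 100-byte coherent request lands in the second pool, while
the same size with a non-zero attrs value skips the pools. The free side
has to see the same size and attrs as the allocation so that both take the
same branch, which is why usb_free_noncoherent() passes
DMA_ATTR_NON_CONSISTENT again.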



* [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers
From: Ezequiel Garcia @ 2018-08-30 17:20 UTC
  To: linux-arm-kernel

As noted recently by Matwey V. Kornilov, using coherent
buffers on platforms _without_ hardware coherency results in
some devices being completely unusable, due to transfers
being too slow.

Moreover, using non-coherent buffers on platforms _with_ hardware
coherency does not show a significant impact. This has been tested
by Matwey on PWC USB cameras on x86_64 and ARM platforms.
Quoting [1] (where kmalloc-ed buffers use streaming mappings):

"""
[..] average memcpy() data transfer rate (rate) and handler
completion time (time) have been measured when running video stream at
640x480 resolution at 10fps.

x86_64 based system (Intel Core i5-3470). This platform has hardware
coherent DMA support and proposed change doesn't make big difference here.

 * kmalloc:            rate = (2.0 +- 0.4) GBps
                       time = (5.0 +- 3.0) usec
 * usb_alloc_coherent: rate = (3.4 +- 1.2) GBps
                       time = (3.5 +- 3.0) usec

armv7l based system (TI AM335x BeagleBone Black @ 300MHz). This platform
has no hardware coherent DMA support. DMA coherence is implemented via
disabled page caching that slows down memcpy() due to memory controller
behaviour.

 * kmalloc:            rate =  (114 +- 5) MBps
                       time =   (84 +- 4) usec
 * usb_alloc_coherent: rate = (28.1 +- 0.1) MBps
                       time =  (341 +- 2) usec
"""

Introduce a pair of usb_{alloc,free}_noncoherent helper functions,
for drivers that want to use non-coherent transfer buffers.

[1]: https://lkml.org/lkml/2018/8/9/734

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
---
 drivers/usb/core/buffer.c | 29 ++++++++++++--------
 drivers/usb/core/hcd.c    |  5 ++--
 drivers/usb/core/usb.c    | 56 +++++++++++++++++++++++++++++++++++++--
 include/linux/usb.h       |  5 ++++
 include/linux/usb/hcd.h   |  4 +--
 5 files changed, 82 insertions(+), 17 deletions(-)

diff --git a/drivers/usb/core/buffer.c b/drivers/usb/core/buffer.c
index 77eef8acff94..1bc9df883337 100644
--- a/drivers/usb/core/buffer.c
+++ b/drivers/usb/core/buffer.c
@@ -119,7 +119,8 @@ void *hcd_buffer_alloc(
 	struct usb_bus		*bus,
 	size_t			size,
 	gfp_t			mem_flags,
-	dma_addr_t		*dma
+	dma_addr_t		*dma,
+	unsigned long		attrs
 )
 {
 	struct usb_hcd		*hcd = bus_to_hcd(bus);
@@ -136,18 +137,22 @@ void *hcd_buffer_alloc(
 		return kmalloc(size, mem_flags);
 	}
 
-	for (i = 0; i < HCD_BUFFER_POOLS; i++) {
-		if (size <= pool_max[i])
-			return dma_pool_alloc(hcd->pool[i], mem_flags, dma);
+	/* Only use pools for coherent buffer requests */
+	if (!attrs) {
+		for (i = 0; i < HCD_BUFFER_POOLS; i++)
+			if (size <= pool_max[i])
+				return dma_pool_alloc(hcd->pool[i],
+						mem_flags, dma);
 	}
-	return dma_alloc_coherent(hcd->self.sysdev, size, dma, mem_flags);
+	return dma_alloc_attrs(hcd->self.sysdev, size, dma, mem_flags, attrs);
 }
 
 void hcd_buffer_free(
 	struct usb_bus		*bus,
 	size_t			size,
 	void			*addr,
-	dma_addr_t		dma
+	dma_addr_t		dma,
+	unsigned long		attrs
 )
 {
 	struct usb_hcd		*hcd = bus_to_hcd(bus);
@@ -163,11 +168,13 @@ void hcd_buffer_free(
 		return;
 	}
 
-	for (i = 0; i < HCD_BUFFER_POOLS; i++) {
-		if (size <= pool_max[i]) {
-			dma_pool_free(hcd->pool[i], addr, dma);
-			return;
+	if (!attrs) {
+		for (i = 0; i < HCD_BUFFER_POOLS; i++) {
+			if (size <= pool_max[i]) {
+				dma_pool_free(hcd->pool[i], addr, dma);
+				return;
+			}
 		}
 	}
-	dma_free_coherent(hcd->self.sysdev, size, addr, dma);
+	dma_free_attrs(hcd->self.sysdev, size, addr, dma, attrs);
 }
diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
index 1c21955fe7c0..25303738eb28 100644
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -1383,7 +1383,7 @@ static int hcd_alloc_coherent(struct usb_bus *bus,
 	}
 
 	vaddr = hcd_buffer_alloc(bus, size + sizeof(vaddr),
-				 mem_flags, dma_handle);
+				 mem_flags, dma_handle, 0);
 	if (!vaddr)
 		return -ENOMEM;
 
@@ -1416,7 +1416,8 @@ static void hcd_free_coherent(struct usb_bus *bus, dma_addr_t *dma_handle,
 	if (dir == DMA_FROM_DEVICE)
 		memcpy(vaddr, *vaddr_handle, size);
 
-	hcd_buffer_free(bus, size + sizeof(vaddr), *vaddr_handle, *dma_handle);
+	hcd_buffer_free(bus, size + sizeof(vaddr),
+			*vaddr_handle, *dma_handle, 0);
 
 	*vaddr_handle = vaddr;
 	*dma_handle = 0;
diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
index 623be3174fb3..234ea5ab4bb7 100644
--- a/drivers/usb/core/usb.c
+++ b/drivers/usb/core/usb.c
@@ -858,6 +858,58 @@ int __usb_get_extra_descriptor(char *buffer, unsigned size,
 }
 EXPORT_SYMBOL_GPL(__usb_get_extra_descriptor);
 
+/**
+ * usb_alloc_noncoherent - allocate dma-non-coherent buffer
+ * @dev: device the buffer will be used with
+ * @size: requested buffer size
+ * @mem_flags: affect whether allocation may block
+ * @dma: used to return DMA address of buffer
+ *
+ * Return: Either null (indicating no buffer could be allocated), or the
+ * cpu-space pointer to a non-coherent buffer that may be used to perform
+ * DMA to the specified device. Such cpu-space buffers are returned along
+ * with the DMA address (through the pointer provided).
+ *
+ * Note:
+ * These non-coherent buffers are used with URB_NO_xxx_DMA_MAP set in
+ * urb->transfer_flags to avoid using "DMA bounce buffers". When using
+ * this API, you must have the necessary sync points. If you are unsure
+ * about this, you should be using coherent buffers via usb_alloc_coherent.
+ *
+ * When the buffer is no longer used, free it with usb_free_noncoherent().
+ */
+void *usb_alloc_noncoherent(struct usb_device *dev, size_t size, gfp_t mem_flags,
+			 dma_addr_t *dma)
+{
+	if (!dev || !dev->bus)
+		return NULL;
+	return hcd_buffer_alloc(dev->bus, size,
+			mem_flags, dma, DMA_ATTR_NON_CONSISTENT);
+}
+EXPORT_SYMBOL_GPL(usb_alloc_noncoherent);
+
+/**
+ * usb_free_noncoherent - free memory allocated with usb_alloc_noncoherent()
+ * @dev: device the buffer was used with
+ * @size: requested buffer size
+ * @addr: CPU address of buffer
+ * @dma: DMA address of buffer
+ *
+ * This reclaims an I/O buffer, letting it be reused.  The memory must have
+ * been allocated using usb_alloc_noncoherent(), and the parameters must match
+ * those provided in that allocation request.
+ */
+void usb_free_noncoherent(struct usb_device *dev, size_t size, void *addr,
+		       dma_addr_t dma)
+{
+	if (!dev || !dev->bus)
+		return;
+	if (!addr)
+		return;
+	hcd_buffer_free(dev->bus, size, addr, dma, DMA_ATTR_NON_CONSISTENT);
+}
+EXPORT_SYMBOL_GPL(usb_free_noncoherent);
+
 /**
  * usb_alloc_coherent - allocate dma-consistent buffer for URB_NO_xxx_DMA_MAP
  * @dev: device the buffer will be used with
@@ -886,7 +938,7 @@ void *usb_alloc_coherent(struct usb_device *dev, size_t size, gfp_t mem_flags,
 {
 	if (!dev || !dev->bus)
 		return NULL;
-	return hcd_buffer_alloc(dev->bus, size, mem_flags, dma);
+	return hcd_buffer_alloc(dev->bus, size, mem_flags, dma, 0);
 }
 EXPORT_SYMBOL_GPL(usb_alloc_coherent);
 
@@ -908,7 +960,7 @@ void usb_free_coherent(struct usb_device *dev, size_t size, void *addr,
 		return;
 	if (!addr)
 		return;
-	hcd_buffer_free(dev->bus, size, addr, dma);
+	hcd_buffer_free(dev->bus, size, addr, dma, 0);
 }
 EXPORT_SYMBOL_GPL(usb_free_coherent);
 
diff --git a/include/linux/usb.h b/include/linux/usb.h
index 4cdd515a4385..7fddd6c2a61e 100644
--- a/include/linux/usb.h
+++ b/include/linux/usb.h
@@ -1750,6 +1750,11 @@ static inline int usb_urb_dir_out(struct urb *urb)
 
 int usb_urb_ep_type_check(const struct urb *urb);
 
+void *usb_alloc_noncoherent(struct usb_device *dev, size_t size,
+	gfp_t mem_flags, dma_addr_t *dma);
+void usb_free_noncoherent(struct usb_device *dev, size_t size,
+	void *addr, dma_addr_t dma);
+
 void *usb_alloc_coherent(struct usb_device *dev, size_t size,
 	gfp_t mem_flags, dma_addr_t *dma);
 void usb_free_coherent(struct usb_device *dev, size_t size,
diff --git a/include/linux/usb/hcd.h b/include/linux/usb/hcd.h
index 97e2ddec18b1..41dd5f0acaad 100644
--- a/include/linux/usb/hcd.h
+++ b/include/linux/usb/hcd.h
@@ -486,9 +486,9 @@ int hcd_buffer_create(struct usb_hcd *hcd);
 void hcd_buffer_destroy(struct usb_hcd *hcd);
 
 void *hcd_buffer_alloc(struct usb_bus *bus, size_t size,
-	gfp_t mem_flags, dma_addr_t *dma);
+	gfp_t mem_flags, dma_addr_t *dma, unsigned long attrs);
 void hcd_buffer_free(struct usb_bus *bus, size_t size,
-	void *addr, dma_addr_t dma);
+	void *addr, dma_addr_t dma, unsigned long attrs);
 
 /* generic bus glue, needed for host controllers that don't use PCI */
 extern irqreturn_t usb_hcd_irq(int irq, void *__hcd);
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread
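
Editor's sketch, not part of the patch: the pool bypass that patch 2/3 adds
to hcd_buffer_alloc() and hcd_buffer_free() can be modelled in user space as
below. The pool limits and the MODEL_ATTR_NON_CONSISTENT value are
assumptions made for illustration, and pick_pool() and free_matches_alloc()
are hypothetical helpers that do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define HCD_BUFFER_POOLS 4

/* Illustrative stand-in for DMA_ATTR_NON_CONSISTENT; any non-zero value works. */
#define MODEL_ATTR_NON_CONSISTENT (1UL << 3)

/* Assumed pool size limits; the real table lives in drivers/usb/core/buffer.c. */
static const size_t pool_max[HCD_BUFFER_POOLS] = { 32, 128, 512, 2048 };

/*
 * Return the index of the pool that would serve the request, or -1 when
 * the request falls through to dma_alloc_attrs()/dma_free_attrs().
 */
static int pick_pool(size_t size, unsigned long attrs)
{
	int i;

	/* Only coherent requests (attrs == 0) are served from the pools. */
	if (attrs)
		return -1;
	for (i = 0; i < HCD_BUFFER_POOLS; i++)
		if (size <= pool_max[i])
			return i;
	return -1;
}

/*
 * The free path must take the same decision as the alloc path, which is
 * why hcd_buffer_free() gains an attrs argument in the patch.
 */
static int free_matches_alloc(size_t size, unsigned long alloc_attrs,
			      unsigned long free_attrs)
{
	return pick_pool(size, alloc_attrs) == pick_pool(size, free_attrs);
}
```

In this model a small non-coherent buffer freed with attrs == 0 would be
handed back to a pool it never came from, which is the mismatch the extra
parameter is meant to prevent.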

* [RFC 3/3] stk1160: Use non-coherent buffers for USB transfers
@ 2018-08-30 17:20   ` Ezequiel Garcia
  0 siblings, 0 replies; 32+ messages in thread
From: Ezequiel Garcia @ 2018-08-30 17:20 UTC (permalink / raw)
  To: linux-media, linux-usb, linux-arm-kernel, linux-kernel
  Cc: Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern,
	kernel, Keiichi Watanabe, Ezequiel Garcia

Platforms without hardware coherency can benefit a lot
from using non-coherent buffers. Moreover, platforms
with hardware coherency aren't impacted by this change.

For instance, on AM335x, while it's still not possible
to capture full resolution frames, this patch enables
half-resolution frame streams to work.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
---
 drivers/media/usb/stk1160/stk1160-video.c | 22 ++++++----------------
 1 file changed, 6 insertions(+), 16 deletions(-)

diff --git a/drivers/media/usb/stk1160/stk1160-video.c b/drivers/media/usb/stk1160/stk1160-video.c
index 2811f612820f..aeb4264d1998 100644
--- a/drivers/media/usb/stk1160/stk1160-video.c
+++ b/drivers/media/usb/stk1160/stk1160-video.c
@@ -240,6 +240,9 @@ static void stk1160_process_isoc(struct stk1160 *dev, struct urb *urb)
 		return;
 	}
 
+	dma_sync_single_for_cpu(&urb->dev->dev, urb->transfer_dma,
+		urb->transfer_buffer_length, DMA_FROM_DEVICE);
+
 	for (i = 0; i < urb->number_of_packets; i++) {
 		status = urb->iso_frame_desc[i].status;
 		if (status < 0) {
@@ -379,16 +382,11 @@ void stk1160_free_isoc(struct stk1160 *dev)
 		urb = dev->isoc_ctl.urb[i];
 		if (urb) {
 
-			if (dev->isoc_ctl.transfer_buffer[i]) {
-#ifndef CONFIG_DMA_NONCOHERENT
-				usb_free_coherent(dev->udev,
+			if (dev->isoc_ctl.transfer_buffer[i])
+				usb_free_noncoherent(dev->udev,
 					urb->transfer_buffer_length,
 					dev->isoc_ctl.transfer_buffer[i],
 					urb->transfer_dma);
-#else
-				kfree(dev->isoc_ctl.transfer_buffer[i]);
-#endif
-			}
 			usb_free_urb(urb);
 			dev->isoc_ctl.urb[i] = NULL;
 		}
@@ -461,12 +459,8 @@ int stk1160_alloc_isoc(struct stk1160 *dev)
 			goto free_i_bufs;
 		dev->isoc_ctl.urb[i] = urb;
 
-#ifndef CONFIG_DMA_NONCOHERENT
-		dev->isoc_ctl.transfer_buffer[i] = usb_alloc_coherent(dev->udev,
+		dev->isoc_ctl.transfer_buffer[i] = usb_alloc_noncoherent(dev->udev,
 			sb_size, GFP_KERNEL, &urb->transfer_dma);
-#else
-		dev->isoc_ctl.transfer_buffer[i] = kmalloc(sb_size, GFP_KERNEL);
-#endif
 		if (!dev->isoc_ctl.transfer_buffer[i]) {
 			stk1160_err("cannot alloc %d bytes for tx[%d] buffer\n",
 				sb_size, i);
@@ -490,11 +484,7 @@ int stk1160_alloc_isoc(struct stk1160 *dev)
 		urb->interval = 1;
 		urb->start_frame = 0;
 		urb->number_of_packets = max_packets;
-#ifndef CONFIG_DMA_NONCOHERENT
 		urb->transfer_flags = URB_ISO_ASAP | URB_NO_TRANSFER_DMA_MAP;
-#else
-		urb->transfer_flags = URB_ISO_ASAP;
-#endif
 
 		k = 0;
 		for (j = 0; j < max_packets; j++) {
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers
@ 2018-08-30 17:58     ` Christoph Hellwig
  0 siblings, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2018-08-30 17:58 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: linux-media, linux-usb, linux-arm-kernel, linux-kernel,
	Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern,
	kernel, Keiichi Watanabe

Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is
a rather horrible interface, and I plan to kill it off rather sooner
than later.  I plan to post some patches for a better interface
that can reuse the normal dma_sync_single_* interfaces for ownership
transfers.  I can happily include usb in that initial patch set based
on your work here if that helps.

^ permalink raw reply	[flat|nested] 32+ messages in thread
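
Editor's sketch: the "ownership transfer" idea mentioned above can be
pictured with a toy user-space model in which a buffer belongs either to the
device or to the CPU, and the sync calls move it between the two. All names
here are hypothetical stand-ins; nothing below calls or reproduces the
kernel's dma_sync_single_* functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Who is currently allowed to touch the buffer. */
enum buf_owner { OWNER_DEVICE, OWNER_CPU };

struct model_buf {
	enum buf_owner owner;
};

/* Stand-in for a "sync for CPU" call: the CPU takes ownership. */
static void model_sync_for_cpu(struct model_buf *b)
{
	b->owner = OWNER_CPU;
}

/* Stand-in for a "sync for device" call: ownership returns to the device. */
static void model_sync_for_device(struct model_buf *b)
{
	b->owner = OWNER_DEVICE;
}

/* The CPU may only read or write the data while it owns the buffer. */
static bool cpu_may_access(const struct model_buf *b)
{
	return b->owner == OWNER_CPU;
}
```

In a URB completion handler this would correspond to taking ownership before
parsing the received packets and handing it back before resubmitting.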

* Re: [RFC 3/3] stk1160: Use non-coherent buffers for USB transfers
@ 2018-08-30 17:59     ` Christoph Hellwig
  0 siblings, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2018-08-30 17:59 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: linux-media, linux-usb, linux-arm-kernel, linux-kernel,
	Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern,
	kernel, Keiichi Watanabe

> +	dma_sync_single_for_cpu(&urb->dev->dev, urb->transfer_dma,
> +		urb->transfer_buffer_length, DMA_FROM_DEVICE);

You can't use dma_sync_single_for_cpu on non-coherent dma buffers,
which is one of the major issues with them.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers
@ 2018-08-30 22:11       ` Ezequiel Garcia
  0 siblings, 0 replies; 32+ messages in thread
From: Ezequiel Garcia @ 2018-08-30 22:11 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-media, linux-usb, linux-arm-kernel, linux-kernel,
	Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern,
	kernel, Keiichi Watanabe

On Thu, 2018-08-30 at 10:58 -0700, Christoph Hellwig wrote:
> Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is
> a rather horrible interface, and I plan to kill it off rather sooner
> than later.  I plan to post some patches for a better interface
> that can reuse the normal dma_sync_single_* interfaces for ownership
> transfers.  I can happily include usb in that initial patch set based
> on your work here if that helps.

Please do. Until we have proper allocators that go thru the DMA API,
drivers will have to kmalloc the USB transfer buffers, and have
streaming mappings, which in turn means not using the IOMMU or CMA.

Regards,
Eze

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers
@ 2018-08-31  5:50         ` Christoph Hellwig
  0 siblings, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2018-08-31  5:50 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Christoph Hellwig, linux-media, linux-usb, linux-arm-kernel,
	linux-kernel, Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov,
	Alan Stern, kernel, Keiichi Watanabe

On Thu, Aug 30, 2018 at 07:11:35PM -0300, Ezequiel Garcia wrote:
> On Thu, 2018-08-30 at 10:58 -0700, Christoph Hellwig wrote:
> > Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is
> > a rather horrible interface, and I plan to kill it off rather sooner
> > than later.  I plan to post some patches for a better interface
> > that can reuse the normal dma_sync_single_* interfaces for ownership
> > transfers.  I can happily include usb in that initial patch set based
> > on your work here if that helps.
> 
> Please do. Until we have proper allocators that go thru the DMA API,
> drivers will have to kmalloc the USB transfer buffers, and have
> streaming mappings, which in turn means not using the IOMMU or CMA.

dma_map_page will of course use the iommu.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers
@ 2018-08-31  6:51           ` Tomasz Figa
  0 siblings, 0 replies; 32+ messages in thread
From: Tomasz Figa @ 2018-08-31  6:51 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Ezequiel Garcia, Linux Media Mailing List, linux-usb,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>,,
	Linux Kernel Mailing List, Laurent Pinchart, Matwey V. Kornilov,
	Alan Stern, kernel, keiichiw

On Fri, Aug 31, 2018 at 2:50 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Thu, Aug 30, 2018 at 07:11:35PM -0300, Ezequiel Garcia wrote:
> > On Thu, 2018-08-30 at 10:58 -0700, Christoph Hellwig wrote:
> > > Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is
> > > a rather horrible interface, and I plan to kill it off rather sooner
> > > than later.  I plan to post some patches for a better interface
> > > that can reuse the normal dma_sync_single_* interfaces for ownership
> > > transfers.  I can happily include usb in that initial patch set based
> > > on your work here if that helps.
> >
> > Please do. Until we have proper allocators that go thru the DMA API,
> > drivers will have to kmalloc the USB transfer buffers, and have
> > streaming mappings, which in turn means not using the IOMMU or CMA.
>
> dma_map_page will of course use the iommu.

Sure, dma_map*() will, but using kmalloc() defeats (half of) the
purpose of it, since contiguous memory would be allocated
unnecessarily, risking failures due to fragmentation.

Best regards,
Tomasz

^ permalink raw reply	[flat|nested] 32+ messages in thread
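
Editor's sketch: the fragmentation concern can be made concrete with a
little arithmetic. A physically contiguous buffer has to come from a
power-of-two run of pages, while a buffer mapped through an IOMMU only needs
enough individual pages. The helper names and the 4 KiB page size used in
the note below are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest order such that (page_size << order) covers size bytes. */
static unsigned int contig_order(size_t size, size_t page_size)
{
	unsigned int order = 0;
	size_t span = page_size;

	assert(page_size > 0);
	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Number of single pages needed when they do not have to be contiguous. */
static size_t pages_needed(size_t size, size_t page_size)
{
	assert(page_size > 0);
	return (size + page_size - 1) / page_size;
}
```

With 4 KiB pages, a 12 KiB buffer needs an order-2 block (four contiguous
pages) in the contiguous case, but only three arbitrary pages otherwise.
Higher orders are the ones that become hard to find on a fragmented system.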

* Re: [RFC 3/3] stk1160: Use non-coherent buffers for USB transfers
@ 2018-09-07  8:54       ` Tomasz Figa
  0 siblings, 0 replies; 32+ messages in thread
From: Tomasz Figa @ 2018-09-07  8:54 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Ezequiel Garcia, Linux Media Mailing List, linux-usb,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>,,
	Linux Kernel Mailing List, Laurent Pinchart, Matwey V. Kornilov,
	Alan Stern, kernel, keiichiw

On Fri, Aug 31, 2018 at 2:59 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> > +     dma_sync_single_for_cpu(&urb->dev->dev, urb->transfer_dma,
> > +             urb->transfer_buffer_length, DMA_FROM_DEVICE);
>
> You can't use dma_sync_single_for_cpu on non-coherent dma buffers,
> which is one of the major issues with them.

It's not an issue of DMA API, but just an API mismatch. By design,
memory allocated for device (e.g. by DMA API) doesn't have to be
physically contiguous, while dma_*_single() API expects a _single_,
physically contiguous region of memory.

We need a way to allocate non-coherent memory through the DMA API that
handles the following (USB is the example here, but this applies to
virtually any class of device doing DMA):
 - DMA address range limitations (e.g. dma_mask): while a USB HCD
driver is normally aware of those, a USB device driver should have no
idea,
 - memory mapping capability, i.e. whether contiguous memory or a set of
random pages can be allocated: this is a platform integration detail,
which even a USB HCD driver may not be aware of if a SoC IOMMU is
just stuffed between the bus and the HCD,
 - platform coherency specifics: there are practical scenarios where,
on a coherent-by-default system, it's more efficient to allocate
non-coherent memory and manage caches explicitly to avoid the cost of
cache snooping.

If DMA_ATTR_NON_CONSISTENT is not the right way to do it, a new API
should definitely be introduced, closely coupled to the DMA API
implementation on the given platform, since that is the only place that
can satisfy all the constraints above.

Best regards,
Tomasz

^ permalink raw reply	[flat|nested] 32+ messages in thread
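
Editor's sketch: the first constraint in the list above, the DMA address
range limit, comes down to checking that the whole buffer sits below the
device's mask. The user-space model below shows that check. The function
names are hypothetical; model_dma_bit_mask() only mirrors the idea of the
kernel's DMA_BIT_MASK() macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low n bits set, in the spirit of DMA_BIT_MASK(n). */
static uint64_t model_dma_bit_mask(unsigned int n)
{
	return n >= 64 ? UINT64_MAX : (UINT64_C(1) << n) - 1;
}

/* True when every byte of [addr, addr + size) is addressable under mask. */
static bool model_dma_reachable(uint64_t addr, uint64_t size, uint64_t mask)
{
	if (size == 0)
		return false;
	/* Reject ranges that would wrap around the 64-bit address space. */
	if (addr > UINT64_MAX - (size - 1))
		return false;
	return addr + size - 1 <= mask;
}
```

A USB device driver has no good way to know the mask of the host controller
it happens to be plugged into, which is the argument for letting the
allocator behind the DMA API make this decision.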

* Re: [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers
@ 2018-10-31  1:55             ` Tomasz Figa
  0 siblings, 0 replies; 32+ messages in thread
From: Tomasz Figa @ 2018-10-31  1:55 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Ezequiel Garcia, Linux Media Mailing List, linux-usb,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>,
	Linux Kernel Mailing List, Laurent Pinchart, Matwey V. Kornilov,
	Alan Stern, kernel, keiichiw

Hi Christoph and everyone,

On Fri, Aug 31, 2018 at 3:51 PM Tomasz Figa <tfiga@chromium.org> wrote:
>
> On Fri, Aug 31, 2018 at 2:50 PM Christoph Hellwig <hch@infradead.org> wrote:
> >
> > On Thu, Aug 30, 2018 at 07:11:35PM -0300, Ezequiel Garcia wrote:
> > > On Thu, 2018-08-30 at 10:58 -0700, Christoph Hellwig wrote:
> > > > Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is
> > > > a rather horrible interface, and I plan to kill it off rather sooner
> > > > than later.  I plan to post some patches for a better interface
> > > > that can reuse the normal dma_sync_single_* interfaces for ownership
> > > > transfers.  I can happily include usb in that initial patch set based
> > > > on your work here if that helps.
> > >
> > > Please do. Until we have proper allocators that go through the DMA API,
> > > drivers will have to kmalloc the USB transfer buffers and use
> > > streaming mappings, which in turn means not using the IOMMU or CMA.
> >
> > dma_map_page will of course use the iommu.
>
> Sure, dma_map*() will, but using kmalloc() defeats (half of) the
> purpose of it, since contiguous memory would be allocated
> unnecessarily, risking failures due to fragmentation.

Have we reached a conclusion here?

It sounds like quite a significant problem, at least for some of the
camera (media) devices out there, and there are people interested in
solving it, so all we need here is a conclusion on how to do it. :)

Best regards,
Tomasz
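
The fragmentation concern quoted above can be illustrated with a toy
user-space model. This is only a sketch under stated assumptions, not
kernel code: memory is modelled as an array of page frames, and
can_alloc_contiguous() and can_alloc_scattered() are hypothetical
helpers invented for this example. The scattered case assumes that
something like an IOMMU can present the frames to the device as one
DMA range.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model: free_map[i] is true when page frame i is free. */

/* A contiguous allocation needs `count` adjacent free frames. */
bool can_alloc_contiguous(const bool *free_map, size_t nframes, size_t count)
{
	size_t run = 0;	/* length of the current run of free frames */
	size_t i;

	if (count == 0)
		return true;

	for (i = 0; i < nframes; i++) {
		run = free_map[i] ? run + 1 : 0;
		if (run >= count)
			return true;
	}
	return false;
}

/* A scattered allocation only needs `count` free frames anywhere. */
bool can_alloc_scattered(const bool *free_map, size_t nframes, size_t count)
{
	size_t free_frames = 0;
	size_t i;

	for (i = 0; i < nframes; i++)
		if (free_map[i])
			free_frames++;

	return free_frames >= count;
}
```

With an 8-frame map in which every other frame is free, a request for
2 contiguous frames fails in this model even though 4 frames are free,
while a scattered request for 4 frames succeeds.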

end of thread, other threads:[~2018-10-31  1:55 UTC | newest]

Thread overview: 32+ messages
2018-08-30 17:20 [RFC 0/3] Introduce usb_{alloc,free}_noncoherent API Ezequiel Garcia
2018-08-30 17:20 ` Ezequiel Garcia
2018-08-30 17:20 ` [RFC 1/3] HACK: ARM: dma-mapping: Get writeback memory for non-consistent mappings Ezequiel Garcia
2018-08-30 17:20   ` Ezequiel Garcia
2018-08-30 17:20   ` [RFC,1/3] " Ezequiel Garcia
2018-08-30 17:20 ` [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers Ezequiel Garcia
2018-08-30 17:20   ` Ezequiel Garcia
2018-08-30 17:20   ` [RFC,2/3] " Ezequiel Garcia
2018-08-30 17:58   ` [RFC 2/3] " Christoph Hellwig
2018-08-30 17:58     ` Christoph Hellwig
2018-08-30 17:58     ` [RFC,2/3] " Christoph Hellwig
2018-08-30 22:11     ` [RFC 2/3] " Ezequiel Garcia
2018-08-30 22:11       ` Ezequiel Garcia
2018-08-30 22:11       ` [RFC,2/3] " Ezequiel Garcia
2018-08-31  5:50       ` [RFC 2/3] " Christoph Hellwig
2018-08-31  5:50         ` Christoph Hellwig
2018-08-31  5:50         ` [RFC,2/3] " Christoph Hellwig
2018-08-31  6:51         ` [RFC 2/3] " Tomasz Figa
2018-08-31  6:51           ` Tomasz Figa
2018-08-31  6:51           ` [RFC,2/3] " Tomasz Figa
2018-10-31  1:55           ` [RFC 2/3] " Tomasz Figa
2018-10-31  1:55             ` Tomasz Figa
2018-10-31  1:55             ` [RFC,2/3] " Tomasz Figa
2018-08-30 17:20 ` [RFC 3/3] stk1160: Use non-coherent buffers for USB transfers Ezequiel Garcia
2018-08-30 17:20   ` Ezequiel Garcia
2018-08-30 17:20   ` [RFC,3/3] " Ezequiel Garcia
2018-08-30 17:59   ` [RFC 3/3] " Christoph Hellwig
2018-08-30 17:59     ` Christoph Hellwig
2018-08-30 17:59     ` [RFC,3/3] " Christoph Hellwig
2018-09-07  8:54     ` [RFC 3/3] " Tomasz Figa
2018-09-07  8:54       ` Tomasz Figa
2018-09-07  8:54       ` [RFC,3/3] " Tomasz Figa
