* [RFC 0/3] Introduce usb_{alloc,free}_noncoherent API @ 2018-08-30 17:20 ` Ezequiel Garcia 0 siblings, 0 replies; 32+ messages in thread From: Ezequiel Garcia @ 2018-08-30 17:20 UTC (permalink / raw) To: linux-media, linux-usb, linux-arm-kernel, linux-kernel Cc: Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern, kernel, Keiichi Watanabe, Ezequiel Garcia Following the discussion on PWC [1] and UVC [2] drivers, where use non-consistent mappings for the URB transfer buffers was shown to improve transfer speed significantly, here's a proposal for a non-coherent USB helpers. With this pachset, it's possible to get 360x288 raw analog video using stk1160 and a AM335x Beaglebone Black board. This isn't possible in mainline, for the same reasons Matwey has explained [1]. First patch is a hack, obviously incomplete, to add support for non-consistent mappings on ARM. The second patch introduces the usb_{alloc,free}_noncoherent API, while the third patch is an example on stk1160. I'm sending this patchset as RFC, just to get the ball rolling. [1] https://lkml.org/lkml/2018/8/21/663 [2] https://lkml.org/lkml/2018/6/27/188 Ezequiel Garcia (3): HACK: ARM: dma-mapping: Get writeback memory for non-consistent mappings USB: core: Add non-coherent buffer allocation helpers stk1160: Use non-coherent buffers for USB transfers arch/arm/include/asm/pgtable.h | 3 ++ arch/arm/mm/dma-mapping.c | 9 ++-- drivers/media/usb/stk1160/stk1160-video.c | 22 +++------ drivers/usb/core/buffer.c | 29 +++++++----- drivers/usb/core/hcd.c | 5 +- drivers/usb/core/usb.c | 56 ++++++++++++++++++++++- include/linux/usb.h | 5 ++ include/linux/usb/hcd.h | 4 +- 8 files changed, 97 insertions(+), 36 deletions(-) -- 2.18.0 ^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC 0/3] Introduce usb_{alloc,free}_noncoherent API @ 2018-08-30 17:20 ` Ezequiel Garcia 0 siblings, 0 replies; 32+ messages in thread From: Ezequiel Garcia @ 2018-08-30 17:20 UTC (permalink / raw) To: linux-arm-kernel Following the discussion on PWC [1] and UVC [2] drivers, where use non-consistent mappings for the URB transfer buffers was shown to improve transfer speed significantly, here's a proposal for a non-coherent USB helpers. With this pachset, it's possible to get 360x288 raw analog video using stk1160 and a AM335x Beaglebone Black board. This isn't possible in mainline, for the same reasons Matwey has explained [1]. First patch is a hack, obviously incomplete, to add support for non-consistent mappings on ARM. The second patch introduces the usb_{alloc,free}_noncoherent API, while the third patch is an example on stk1160. I'm sending this patchset as RFC, just to get the ball rolling. [1] https://lkml.org/lkml/2018/8/21/663 [2] https://lkml.org/lkml/2018/6/27/188 Ezequiel Garcia (3): HACK: ARM: dma-mapping: Get writeback memory for non-consistent mappings USB: core: Add non-coherent buffer allocation helpers stk1160: Use non-coherent buffers for USB transfers arch/arm/include/asm/pgtable.h | 3 ++ arch/arm/mm/dma-mapping.c | 9 ++-- drivers/media/usb/stk1160/stk1160-video.c | 22 +++------ drivers/usb/core/buffer.c | 29 +++++++----- drivers/usb/core/hcd.c | 5 +- drivers/usb/core/usb.c | 56 ++++++++++++++++++++++- include/linux/usb.h | 5 ++ include/linux/usb/hcd.h | 4 +- 8 files changed, 97 insertions(+), 36 deletions(-) -- 2.18.0 ^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC 1/3] HACK: ARM: dma-mapping: Get writeback memory for non-consistent mappings @ 2018-08-30 17:20 ` Ezequiel Garcia 0 siblings, 0 replies; 32+ messages in thread From: Ezequiel Garcia @ 2018-08-30 17:20 UTC (permalink / raw) To: linux-media, linux-usb, linux-arm-kernel, linux-kernel Cc: Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern, kernel, Keiichi Watanabe, Ezequiel Garcia This is obviously a hack. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> --- arch/arm/include/asm/pgtable.h | 3 +++ arch/arm/mm/dma-mapping.c | 9 ++++++--- 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h index a757401129f9..37ddd0d73434 100644 --- a/arch/arm/include/asm/pgtable.h +++ b/arch/arm/include/asm/pgtable.h @@ -122,6 +122,9 @@ extern pgprot_t pgprot_s2_device; #define pgprot_writecombine(prot) \ __pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_BUFFERABLE) +#define pgprot_writeback(prot) \ + __pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_WRITEBACK) + #define pgprot_stronglyordered(prot) \ __pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_UNCACHED) diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 66566472c153..11cca7bbb0a8 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -633,9 +633,12 @@ static void __free_from_contiguous(struct device *dev, struct page *page, static inline pgprot_t __get_dma_pgprot(unsigned long attrs, pgprot_t prot) { - prot = (attrs & DMA_ATTR_WRITE_COMBINE) ? - pgprot_writecombine(prot) : - pgprot_dmacoherent(prot); + if (attrs & DMA_ATTR_WRITE_COMBINE) + prot = pgprot_writecombine(prot); + else if (attrs & DMA_ATTR_NON_CONSISTENT) + prot = pgprot_writeback(prot); + else + prot = pgprot_dmacoherent(prot); return prot; } -- 2.18.0 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [RFC 1/3] HACK: ARM: dma-mapping: Get writeback memory for non-consistent mappings @ 2018-08-30 17:20 ` Ezequiel Garcia 0 siblings, 0 replies; 32+ messages in thread From: Ezequiel Garcia @ 2018-08-30 17:20 UTC (permalink / raw) To: linux-arm-kernel This is obviously a hack. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> --- arch/arm/include/asm/pgtable.h | 3 +++ arch/arm/mm/dma-mapping.c | 9 ++++++--- 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h index a757401129f9..37ddd0d73434 100644 --- a/arch/arm/include/asm/pgtable.h +++ b/arch/arm/include/asm/pgtable.h @@ -122,6 +122,9 @@ extern pgprot_t pgprot_s2_device; #define pgprot_writecombine(prot) \ __pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_BUFFERABLE) +#define pgprot_writeback(prot) \ + __pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_WRITEBACK) + #define pgprot_stronglyordered(prot) \ __pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_UNCACHED) diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 66566472c153..11cca7bbb0a8 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -633,9 +633,12 @@ static void __free_from_contiguous(struct device *dev, struct page *page, static inline pgprot_t __get_dma_pgprot(unsigned long attrs, pgprot_t prot) { - prot = (attrs & DMA_ATTR_WRITE_COMBINE) ? - pgprot_writecombine(prot) : - pgprot_dmacoherent(prot); + if (attrs & DMA_ATTR_WRITE_COMBINE) + prot = pgprot_writecombine(prot); + else if (attrs & DMA_ATTR_NON_CONSISTENT) + prot = pgprot_writeback(prot); + else + prot = pgprot_dmacoherent(prot); return prot; } -- 2.18.0 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [RFC,1/3] HACK: ARM: dma-mapping: Get writeback memory for non-consistent mappings @ 2018-08-30 17:20 ` Ezequiel Garcia 0 siblings, 0 replies; 32+ messages in thread From: Ezequiel Garcia @ 2018-08-30 17:20 UTC (permalink / raw) To: linux-media, linux-usb, linux-arm-kernel, linux-kernel Cc: Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern, kernel, Keiichi Watanabe, Ezequiel Garcia This is obviously a hack. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> --- arch/arm/include/asm/pgtable.h | 3 +++ arch/arm/mm/dma-mapping.c | 9 ++++++--- 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h index a757401129f9..37ddd0d73434 100644 --- a/arch/arm/include/asm/pgtable.h +++ b/arch/arm/include/asm/pgtable.h @@ -122,6 +122,9 @@ extern pgprot_t pgprot_s2_device; #define pgprot_writecombine(prot) \ __pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_BUFFERABLE) +#define pgprot_writeback(prot) \ + __pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_WRITEBACK) + #define pgprot_stronglyordered(prot) \ __pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_UNCACHED) diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 66566472c153..11cca7bbb0a8 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -633,9 +633,12 @@ static void __free_from_contiguous(struct device *dev, struct page *page, static inline pgprot_t __get_dma_pgprot(unsigned long attrs, pgprot_t prot) { - prot = (attrs & DMA_ATTR_WRITE_COMBINE) ? - pgprot_writecombine(prot) : - pgprot_dmacoherent(prot); + if (attrs & DMA_ATTR_WRITE_COMBINE) + prot = pgprot_writecombine(prot); + else if (attrs & DMA_ATTR_NON_CONSISTENT) + prot = pgprot_writeback(prot); + else + prot = pgprot_dmacoherent(prot); return prot; } ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers @ 2018-08-30 17:20 ` Ezequiel Garcia 0 siblings, 0 replies; 32+ messages in thread From: Ezequiel Garcia @ 2018-08-30 17:20 UTC (permalink / raw) To: linux-media, linux-usb, linux-arm-kernel, linux-kernel Cc: Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern, kernel, Keiichi Watanabe, Ezequiel Garcia As noted recently by Matwey V. Kornilov, using coherent buffers on platforms _without_ hardware coherency results in some devices being completely unusable, due to transfers being too slow. Moreover, using non-coherent buffers on platforms _with_ hardware coherency, do not show a significant impact. This has been tested by Matwey on PWC USB cameras on x86_64 and ARM platforms. Quoting [1] (where kmalloc-ed buffers use streaming mappings): """ [..] average memcpy() data transfer rate (rate) and handler completion time (time) have been measured when running video stream at 640x480 resolution at 10fps. x86_64 based system (Intel Core i5-3470). This platform has hardware coherent DMA support and proposed change doesn't make big difference here. * kmalloc: rate = (2.0 +- 0.4) GBps time = (5.0 +- 3.0) usec * usb_alloc_coherent: rate = (3.4 +- 1.2) GBps time = (3.5 +- 3.0) usec armv7l based system (TI AM335x BeagleBone Black @ 300MHz). This platform has no hardware coherent DMA support. DMA coherence is implemented via disabled page caching that slows down memcpy() due to memory controller behaviour. * kmalloc: rate = (114 +- 5) MBps time = (84 +- 4) usec * usb_alloc_coherent: rate = (28.1 +- 0.1) MBps time = (341 +- 2) usec "" Introduce a pair of usb_{alloc,free}_noncoherent helper functions, for drivers that want to use non-coherent transfer buffers. [1]: https://lkml.org/lkml/2018/8/9/734 Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> --- drivers/usb/core/buffer.c | 29 ++++++++++++-------- drivers/usb/core/hcd.c | 5 ++-- drivers/usb/core/usb.c | 56 +++++++++++++++++++++++++++++++++++++-- include/linux/usb.h | 5 ++++ include/linux/usb/hcd.h | 4 +-- 5 files changed, 82 insertions(+), 17 deletions(-) diff --git a/drivers/usb/core/buffer.c b/drivers/usb/core/buffer.c index 77eef8acff94..1bc9df883337 100644 --- a/drivers/usb/core/buffer.c +++ b/drivers/usb/core/buffer.c @@ -119,7 +119,8 @@ void *hcd_buffer_alloc( struct usb_bus *bus, size_t size, gfp_t mem_flags, - dma_addr_t *dma + dma_addr_t *dma, + unsigned long attrs ) { struct usb_hcd *hcd = bus_to_hcd(bus); @@ -136,18 +137,22 @@ void *hcd_buffer_alloc( return kmalloc(size, mem_flags); } - for (i = 0; i < HCD_BUFFER_POOLS; i++) { - if (size <= pool_max[i]) - return dma_pool_alloc(hcd->pool[i], mem_flags, dma); + /* Only use pools for coherent buffer requests */ + if (!attrs) { + for (i = 0; i < HCD_BUFFER_POOLS; i++) + if (size <= pool_max[i]) + return dma_pool_alloc(hcd->pool[i], + mem_flags, dma); } - return dma_alloc_coherent(hcd->self.sysdev, size, dma, mem_flags); + return dma_alloc_attrs(hcd->self.sysdev, size, dma, mem_flags, attrs); } void hcd_buffer_free( struct usb_bus *bus, size_t size, void *addr, - dma_addr_t dma + dma_addr_t dma, + unsigned long attrs ) { struct usb_hcd *hcd = bus_to_hcd(bus); @@ -163,11 +168,13 @@ void hcd_buffer_free( return; } - for (i = 0; i < HCD_BUFFER_POOLS; i++) { - if (size <= pool_max[i]) { - dma_pool_free(hcd->pool[i], addr, dma); - return; + if (!attrs) { + for (i = 0; i < HCD_BUFFER_POOLS; i++) { + if (size <= pool_max[i]) { + dma_pool_free(hcd->pool[i], addr, dma); + return; + } } } - dma_free_coherent(hcd->self.sysdev, size, addr, dma); + dma_free_attrs(hcd->self.sysdev, size, addr, dma, attrs); } diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c index 1c21955fe7c0..25303738eb28 100644 --- a/drivers/usb/core/hcd.c +++ b/drivers/usb/core/hcd.c @@ -1383,7 +1383,7 @@ static int hcd_alloc_coherent(struct usb_bus *bus, } vaddr = hcd_buffer_alloc(bus, size + sizeof(vaddr), - mem_flags, dma_handle); + mem_flags, dma_handle, 0); if (!vaddr) return -ENOMEM; @@ -1416,7 +1416,8 @@ static void hcd_free_coherent(struct usb_bus *bus, dma_addr_t *dma_handle, if (dir == DMA_FROM_DEVICE) memcpy(vaddr, *vaddr_handle, size); - hcd_buffer_free(bus, size + sizeof(vaddr), *vaddr_handle, *dma_handle); + hcd_buffer_free(bus, size + sizeof(vaddr), + *vaddr_handle, *dma_handle, 0); *vaddr_handle = vaddr; *dma_handle = 0; diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c index 623be3174fb3..234ea5ab4bb7 100644 --- a/drivers/usb/core/usb.c +++ b/drivers/usb/core/usb.c @@ -858,6 +858,58 @@ int __usb_get_extra_descriptor(char *buffer, unsigned size, } EXPORT_SYMBOL_GPL(__usb_get_extra_descriptor); +/** + * usb_alloc_noncoherent - allocate dma-non-coherent buffer + * @dev: device the buffer will be used with + * @size: requested buffer size + * @mem_flags: affect whether allocation may block + * @dma: used to return DMA address of buffer + * + * Return: Either null (indicating no buffer could be allocated), or the + * cpu-space pointer to a non-coherent buffer that may be used to perform + * DMA to the specified device. Such cpu-space buffers are returned along + * with the DMA address (through the pointer provided). + * + * Note: + * These non-conherent buffers are used with URB_NO_xxx_DMA_MAP set in + * urb->transfer_flags to avoid using "DMA bounce buffers". When using + * this API, you must have the necessary syncs points. If you are unsure + * about this, you should be using coherent buffers via usb_alloc_coherent. + * + * When the buffer is no longer used, free it with usb_free_noncoherent(). + */ +void *usb_alloc_noncoherent(struct usb_device *dev, size_t size, gfp_t mem_flags, + dma_addr_t *dma) +{ + if (!dev || !dev->bus) + return NULL; + return hcd_buffer_alloc(dev->bus, size, + mem_flags, dma, DMA_ATTR_NON_CONSISTENT); +} +EXPORT_SYMBOL_GPL(usb_alloc_noncoherent); + +/** + * usb_free_noncoherent - free memory allocated with usb_alloc_noncoherent() + * @dev: device the buffer was used with + * @size: requested buffer size + * @addr: CPU address of buffer + * @dma: DMA address of buffer + * + * This reclaims an I/O buffer, letting it be reused. The memory must have + * been allocated using usb_alloc_noncoherent(), and the parameters must match + * those provided in that allocation request. + */ +void usb_free_noncoherent(struct usb_device *dev, size_t size, void *addr, + dma_addr_t dma) +{ + if (!dev || !dev->bus) + return; + if (!addr) + return; + hcd_buffer_free(dev->bus, size, addr, dma, DMA_ATTR_NON_CONSISTENT); +} +EXPORT_SYMBOL_GPL(usb_free_noncoherent); + /** * usb_alloc_coherent - allocate dma-consistent buffer for URB_NO_xxx_DMA_MAP * @dev: device the buffer will be used with @@ -886,7 +938,7 @@ void *usb_alloc_coherent(struct usb_device *dev, size_t size, gfp_t mem_flags, { if (!dev || !dev->bus) return NULL; - return hcd_buffer_alloc(dev->bus, size, mem_flags, dma); + return hcd_buffer_alloc(dev->bus, size, mem_flags, dma, 0); } EXPORT_SYMBOL_GPL(usb_alloc_coherent); @@ -908,7 +960,7 @@ void usb_free_coherent(struct usb_device *dev, size_t size, void *addr, return; if (!addr) return; - hcd_buffer_free(dev->bus, size, addr, dma); + hcd_buffer_free(dev->bus, size, addr, dma, 0); } EXPORT_SYMBOL_GPL(usb_free_coherent); diff --git a/include/linux/usb.h b/include/linux/usb.h index 4cdd515a4385..7fddd6c2a61e 100644 --- a/include/linux/usb.h +++ b/include/linux/usb.h @@ -1750,6 +1750,11 @@ static inline int usb_urb_dir_out(struct urb *urb) int usb_urb_ep_type_check(const struct urb *urb); +void *usb_alloc_noncoherent(struct usb_device *dev, size_t size, + gfp_t mem_flags, dma_addr_t *dma); +void usb_free_noncoherent(struct usb_device *dev, size_t size, + void *addr, dma_addr_t dma); + void *usb_alloc_coherent(struct usb_device *dev, size_t size, gfp_t mem_flags, dma_addr_t *dma); void usb_free_coherent(struct usb_device *dev, size_t size, diff --git a/include/linux/usb/hcd.h b/include/linux/usb/hcd.h index 97e2ddec18b1..41dd5f0acaad 100644 --- a/include/linux/usb/hcd.h +++ b/include/linux/usb/hcd.h @@ -486,9 +486,9 @@ int hcd_buffer_create(struct usb_hcd *hcd); void hcd_buffer_destroy(struct usb_hcd *hcd); void *hcd_buffer_alloc(struct usb_bus *bus, size_t size, - gfp_t mem_flags, dma_addr_t *dma); + gfp_t mem_flags, dma_addr_t *dma, unsigned long attrs); void hcd_buffer_free(struct usb_bus *bus, size_t size, - void *addr, dma_addr_t dma); + void *addr, dma_addr_t dma, unsigned long attrs); /* generic bus glue, needed for host controllers that don't use PCI */ extern irqreturn_t usb_hcd_irq(int irq, void *__hcd); -- 2.18.0 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers @ 2018-08-30 17:20 ` Ezequiel Garcia 0 siblings, 0 replies; 32+ messages in thread From: Ezequiel Garcia @ 2018-08-30 17:20 UTC (permalink / raw) To: linux-arm-kernel As noted recently by Matwey V. Kornilov, using coherent buffers on platforms _without_ hardware coherency results in some devices being completely unusable, due to transfers being too slow. Moreover, using non-coherent buffers on platforms _with_ hardware coherency, do not show a significant impact. This has been tested by Matwey on PWC USB cameras on x86_64 and ARM platforms. Quoting [1] (where kmalloc-ed buffers use streaming mappings): """ [..] average memcpy() data transfer rate (rate) and handler completion time (time) have been measured when running video stream at 640x480 resolution at 10fps. x86_64 based system (Intel Core i5-3470). This platform has hardware coherent DMA support and proposed change doesn't make big difference here. * kmalloc: rate = (2.0 +- 0.4) GBps time = (5.0 +- 3.0) usec * usb_alloc_coherent: rate = (3.4 +- 1.2) GBps time = (3.5 +- 3.0) usec armv7l based system (TI AM335x BeagleBone Black @ 300MHz). This platform has no hardware coherent DMA support. DMA coherence is implemented via disabled page caching that slows down memcpy() due to memory controller behaviour. * kmalloc: rate = (114 +- 5) MBps time = (84 +- 4) usec * usb_alloc_coherent: rate = (28.1 +- 0.1) MBps time = (341 +- 2) usec "" Introduce a pair of usb_{alloc,free}_noncoherent helper functions, for drivers that want to use non-coherent transfer buffers. [1]: https://lkml.org/lkml/2018/8/9/734 Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> --- drivers/usb/core/buffer.c | 29 ++++++++++++-------- drivers/usb/core/hcd.c | 5 ++-- drivers/usb/core/usb.c | 56 +++++++++++++++++++++++++++++++++++++-- include/linux/usb.h | 5 ++++ include/linux/usb/hcd.h | 4 +-- 5 files changed, 82 insertions(+), 17 deletions(-) diff --git a/drivers/usb/core/buffer.c b/drivers/usb/core/buffer.c index 77eef8acff94..1bc9df883337 100644 --- a/drivers/usb/core/buffer.c +++ b/drivers/usb/core/buffer.c @@ -119,7 +119,8 @@ void *hcd_buffer_alloc( struct usb_bus *bus, size_t size, gfp_t mem_flags, - dma_addr_t *dma + dma_addr_t *dma, + unsigned long attrs ) { struct usb_hcd *hcd = bus_to_hcd(bus); @@ -136,18 +137,22 @@ void *hcd_buffer_alloc( return kmalloc(size, mem_flags); } - for (i = 0; i < HCD_BUFFER_POOLS; i++) { - if (size <= pool_max[i]) - return dma_pool_alloc(hcd->pool[i], mem_flags, dma); + /* Only use pools for coherent buffer requests */ + if (!attrs) { + for (i = 0; i < HCD_BUFFER_POOLS; i++) + if (size <= pool_max[i]) + return dma_pool_alloc(hcd->pool[i], + mem_flags, dma); } - return dma_alloc_coherent(hcd->self.sysdev, size, dma, mem_flags); + return dma_alloc_attrs(hcd->self.sysdev, size, dma, mem_flags, attrs); } void hcd_buffer_free( struct usb_bus *bus, size_t size, void *addr, - dma_addr_t dma + dma_addr_t dma, + unsigned long attrs ) { struct usb_hcd *hcd = bus_to_hcd(bus); @@ -163,11 +168,13 @@ void hcd_buffer_free( return; } - for (i = 0; i < HCD_BUFFER_POOLS; i++) { - if (size <= pool_max[i]) { - dma_pool_free(hcd->pool[i], addr, dma); - return; + if (!attrs) { + for (i = 0; i < HCD_BUFFER_POOLS; i++) { + if (size <= pool_max[i]) { + dma_pool_free(hcd->pool[i], addr, dma); + return; + } } } - dma_free_coherent(hcd->self.sysdev, size, addr, dma); + dma_free_attrs(hcd->self.sysdev, size, addr, dma, attrs); } diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c index 1c21955fe7c0..25303738eb28 100644 --- a/drivers/usb/core/hcd.c +++ b/drivers/usb/core/hcd.c @@ -1383,7 +1383,7 @@ static int hcd_alloc_coherent(struct usb_bus *bus, } vaddr = hcd_buffer_alloc(bus, size + sizeof(vaddr), - mem_flags, dma_handle); + mem_flags, dma_handle, 0); if (!vaddr) return -ENOMEM; @@ -1416,7 +1416,8 @@ static void hcd_free_coherent(struct usb_bus *bus, dma_addr_t *dma_handle, if (dir == DMA_FROM_DEVICE) memcpy(vaddr, *vaddr_handle, size); - hcd_buffer_free(bus, size + sizeof(vaddr), *vaddr_handle, *dma_handle); + hcd_buffer_free(bus, size + sizeof(vaddr), + *vaddr_handle, *dma_handle, 0); *vaddr_handle = vaddr; *dma_handle = 0; diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c index 623be3174fb3..234ea5ab4bb7 100644 --- a/drivers/usb/core/usb.c +++ b/drivers/usb/core/usb.c @@ -858,6 +858,58 @@ int __usb_get_extra_descriptor(char *buffer, unsigned size, } EXPORT_SYMBOL_GPL(__usb_get_extra_descriptor); +/** + * usb_alloc_noncoherent - allocate dma-non-coherent buffer + * @dev: device the buffer will be used with + * @size: requested buffer size + * @mem_flags: affect whether allocation may block + * @dma: used to return DMA address of buffer + * + * Return: Either null (indicating no buffer could be allocated), or the + * cpu-space pointer to a non-coherent buffer that may be used to perform + * DMA to the specified device. Such cpu-space buffers are returned along + * with the DMA address (through the pointer provided). + * + * Note: + * These non-conherent buffers are used with URB_NO_xxx_DMA_MAP set in + * urb->transfer_flags to avoid using "DMA bounce buffers". When using + * this API, you must have the necessary syncs points. If you are unsure + * about this, you should be using coherent buffers via usb_alloc_coherent. + * + * When the buffer is no longer used, free it with usb_free_noncoherent(). + */ +void *usb_alloc_noncoherent(struct usb_device *dev, size_t size, gfp_t mem_flags, + dma_addr_t *dma) +{ + if (!dev || !dev->bus) + return NULL; + return hcd_buffer_alloc(dev->bus, size, + mem_flags, dma, DMA_ATTR_NON_CONSISTENT); +} +EXPORT_SYMBOL_GPL(usb_alloc_noncoherent); + +/** + * usb_free_noncoherent - free memory allocated with usb_alloc_noncoherent() + * @dev: device the buffer was used with + * @size: requested buffer size + * @addr: CPU address of buffer + * @dma: DMA address of buffer + * + * This reclaims an I/O buffer, letting it be reused. The memory must have + * been allocated using usb_alloc_noncoherent(), and the parameters must match + * those provided in that allocation request. + */ +void usb_free_noncoherent(struct usb_device *dev, size_t size, void *addr, + dma_addr_t dma) +{ + if (!dev || !dev->bus) + return; + if (!addr) + return; + hcd_buffer_free(dev->bus, size, addr, dma, DMA_ATTR_NON_CONSISTENT); +} +EXPORT_SYMBOL_GPL(usb_free_noncoherent); + /** * usb_alloc_coherent - allocate dma-consistent buffer for URB_NO_xxx_DMA_MAP * @dev: device the buffer will be used with @@ -886,7 +938,7 @@ void *usb_alloc_coherent(struct usb_device *dev, size_t size, gfp_t mem_flags, { if (!dev || !dev->bus) return NULL; - return hcd_buffer_alloc(dev->bus, size, mem_flags, dma); + return hcd_buffer_alloc(dev->bus, size, mem_flags, dma, 0); } EXPORT_SYMBOL_GPL(usb_alloc_coherent); @@ -908,7 +960,7 @@ void usb_free_coherent(struct usb_device *dev, size_t size, void *addr, return; if (!addr) return; - hcd_buffer_free(dev->bus, size, addr, dma); + hcd_buffer_free(dev->bus, size, addr, dma, 0); } EXPORT_SYMBOL_GPL(usb_free_coherent); diff --git a/include/linux/usb.h b/include/linux/usb.h index 4cdd515a4385..7fddd6c2a61e 100644 --- a/include/linux/usb.h +++ b/include/linux/usb.h @@ -1750,6 +1750,11 @@ static inline int usb_urb_dir_out(struct urb *urb) int usb_urb_ep_type_check(const struct urb *urb); +void *usb_alloc_noncoherent(struct usb_device *dev, size_t size, + gfp_t mem_flags, dma_addr_t *dma); +void usb_free_noncoherent(struct usb_device *dev, size_t size, + void *addr, dma_addr_t dma); + void *usb_alloc_coherent(struct usb_device *dev, size_t size, gfp_t mem_flags, dma_addr_t *dma); void usb_free_coherent(struct usb_device *dev, size_t size, diff --git a/include/linux/usb/hcd.h b/include/linux/usb/hcd.h index 97e2ddec18b1..41dd5f0acaad 100644 --- a/include/linux/usb/hcd.h +++ b/include/linux/usb/hcd.h @@ -486,9 +486,9 @@ int hcd_buffer_create(struct usb_hcd *hcd); void hcd_buffer_destroy(struct usb_hcd *hcd); void *hcd_buffer_alloc(struct usb_bus *bus, size_t size, - gfp_t mem_flags, dma_addr_t *dma); + gfp_t mem_flags, dma_addr_t *dma, unsigned long attrs); void hcd_buffer_free(struct usb_bus *bus, size_t size, - void *addr, dma_addr_t dma); + void *addr, dma_addr_t dma, unsigned long attrs); /* generic bus glue, needed for host controllers that don't use PCI */ extern irqreturn_t usb_hcd_irq(int irq, void *__hcd); -- 2.18.0 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [RFC,2/3] USB: core: Add non-coherent buffer allocation helpers @ 2018-08-30 17:20 ` Ezequiel Garcia 0 siblings, 0 replies; 32+ messages in thread From: Ezequiel Garcia @ 2018-08-30 17:20 UTC (permalink / raw) To: linux-media, linux-usb, linux-arm-kernel, linux-kernel Cc: Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern, kernel, Keiichi Watanabe, Ezequiel Garcia As noted recently by Matwey V. Kornilov, using coherent buffers on platforms _without_ hardware coherency results in some devices being completely unusable, due to transfers being too slow. Moreover, using non-coherent buffers on platforms _with_ hardware coherency, do not show a significant impact. This has been tested by Matwey on PWC USB cameras on x86_64 and ARM platforms. Quoting [1] (where kmalloc-ed buffers use streaming mappings): """ [..] average memcpy() data transfer rate (rate) and handler completion time (time) have been measured when running video stream at 640x480 resolution at 10fps. x86_64 based system (Intel Core i5-3470). This platform has hardware coherent DMA support and proposed change doesn't make big difference here. * kmalloc: rate = (2.0 +- 0.4) GBps time = (5.0 +- 3.0) usec * usb_alloc_coherent: rate = (3.4 +- 1.2) GBps time = (3.5 +- 3.0) usec armv7l based system (TI AM335x BeagleBone Black @ 300MHz). This platform has no hardware coherent DMA support. DMA coherence is implemented via disabled page caching that slows down memcpy() due to memory controller behaviour. * kmalloc: rate = (114 +- 5) MBps time = (84 +- 4) usec * usb_alloc_coherent: rate = (28.1 +- 0.1) MBps time = (341 +- 2) usec "" Introduce a pair of usb_{alloc,free}_noncoherent helper functions, for drivers that want to use non-coherent transfer buffers. [1]: https://lkml.org/lkml/2018/8/9/734 Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> --- drivers/usb/core/buffer.c | 29 ++++++++++++-------- drivers/usb/core/hcd.c | 5 ++-- drivers/usb/core/usb.c | 56 +++++++++++++++++++++++++++++++++++++-- include/linux/usb.h | 5 ++++ include/linux/usb/hcd.h | 4 +-- 5 files changed, 82 insertions(+), 17 deletions(-) diff --git a/drivers/usb/core/buffer.c b/drivers/usb/core/buffer.c index 77eef8acff94..1bc9df883337 100644 --- a/drivers/usb/core/buffer.c +++ b/drivers/usb/core/buffer.c @@ -119,7 +119,8 @@ void *hcd_buffer_alloc( struct usb_bus *bus, size_t size, gfp_t mem_flags, - dma_addr_t *dma + dma_addr_t *dma, + unsigned long attrs ) { struct usb_hcd *hcd = bus_to_hcd(bus); @@ -136,18 +137,22 @@ void *hcd_buffer_alloc( return kmalloc(size, mem_flags); } - for (i = 0; i < HCD_BUFFER_POOLS; i++) { - if (size <= pool_max[i]) - return dma_pool_alloc(hcd->pool[i], mem_flags, dma); + /* Only use pools for coherent buffer requests */ + if (!attrs) { + for (i = 0; i < HCD_BUFFER_POOLS; i++) + if (size <= pool_max[i]) + return dma_pool_alloc(hcd->pool[i], + mem_flags, dma); } - return dma_alloc_coherent(hcd->self.sysdev, size, dma, mem_flags); + return dma_alloc_attrs(hcd->self.sysdev, size, dma, mem_flags, attrs); } void hcd_buffer_free( struct usb_bus *bus, size_t size, void *addr, - dma_addr_t dma + dma_addr_t dma, + unsigned long attrs ) { struct usb_hcd *hcd = bus_to_hcd(bus); @@ -163,11 +168,13 @@ void hcd_buffer_free( return; } - for (i = 0; i < HCD_BUFFER_POOLS; i++) { - if (size <= pool_max[i]) { - dma_pool_free(hcd->pool[i], addr, dma); - return; + if (!attrs) { + for (i = 0; i < HCD_BUFFER_POOLS; i++) { + if (size <= pool_max[i]) { + dma_pool_free(hcd->pool[i], addr, dma); + return; + } } } - dma_free_coherent(hcd->self.sysdev, size, addr, dma); + dma_free_attrs(hcd->self.sysdev, size, addr, dma, attrs); } diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c index 1c21955fe7c0..25303738eb28 100644 --- a/drivers/usb/core/hcd.c +++ b/drivers/usb/core/hcd.c @@ -1383,7 +1383,7 @@ static int hcd_alloc_coherent(struct usb_bus *bus, } vaddr = hcd_buffer_alloc(bus, size + sizeof(vaddr), - mem_flags, dma_handle); + mem_flags, dma_handle, 0); if (!vaddr) return -ENOMEM; @@ -1416,7 +1416,8 @@ static void hcd_free_coherent(struct usb_bus *bus, dma_addr_t *dma_handle, if (dir == DMA_FROM_DEVICE) memcpy(vaddr, *vaddr_handle, size); - hcd_buffer_free(bus, size + sizeof(vaddr), *vaddr_handle, *dma_handle); + hcd_buffer_free(bus, size + sizeof(vaddr), + *vaddr_handle, *dma_handle, 0); *vaddr_handle = vaddr; *dma_handle = 0; diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c index 623be3174fb3..234ea5ab4bb7 100644 --- a/drivers/usb/core/usb.c +++ b/drivers/usb/core/usb.c @@ -858,6 +858,58 @@ int __usb_get_extra_descriptor(char *buffer, unsigned size, } EXPORT_SYMBOL_GPL(__usb_get_extra_descriptor); +/** + * usb_alloc_noncoherent - allocate dma-non-coherent buffer + * @dev: device the buffer will be used with + * @size: requested buffer size + * @mem_flags: affect whether allocation may block + * @dma: used to return DMA address of buffer + * + * Return: Either null (indicating no buffer could be allocated), or the + * cpu-space pointer to a non-coherent buffer that may be used to perform + * DMA to the specified device. Such cpu-space buffers are returned along + * with the DMA address (through the pointer provided). + * + * Note: + * These non-conherent buffers are used with URB_NO_xxx_DMA_MAP set in + * urb->transfer_flags to avoid using "DMA bounce buffers". When using + * this API, you must have the necessary syncs points. If you are unsure + * about this, you should be using coherent buffers via usb_alloc_coherent. + * + * When the buffer is no longer used, free it with usb_free_noncoherent(). + */ +void *usb_alloc_noncoherent(struct usb_device *dev, size_t size, gfp_t mem_flags, + dma_addr_t *dma) +{ + if (!dev || !dev->bus) + return NULL; + return hcd_buffer_alloc(dev->bus, size, + mem_flags, dma, DMA_ATTR_NON_CONSISTENT); +} +EXPORT_SYMBOL_GPL(usb_alloc_noncoherent); + +/** + * usb_free_noncoherent - free memory allocated with usb_alloc_noncoherent() + * @dev: device the buffer was used with + * @size: requested buffer size + * @addr: CPU address of buffer + * @dma: DMA address of buffer + * + * This reclaims an I/O buffer, letting it be reused. The memory must have + * been allocated using usb_alloc_noncoherent(), and the parameters must match + * those provided in that allocation request. + */ +void usb_free_noncoherent(struct usb_device *dev, size_t size, void *addr, + dma_addr_t dma) +{ + if (!dev || !dev->bus) + return; + if (!addr) + return; + hcd_buffer_free(dev->bus, size, addr, dma, DMA_ATTR_NON_CONSISTENT); +} +EXPORT_SYMBOL_GPL(usb_free_noncoherent); + /** * usb_alloc_coherent - allocate dma-consistent buffer for URB_NO_xxx_DMA_MAP * @dev: device the buffer will be used with @@ -886,7 +938,7 @@ void *usb_alloc_coherent(struct usb_device *dev, size_t size, gfp_t mem_flags, { if (!dev || !dev->bus) return NULL; - return hcd_buffer_alloc(dev->bus, size, mem_flags, dma); + return hcd_buffer_alloc(dev->bus, size, mem_flags, dma, 0); } EXPORT_SYMBOL_GPL(usb_alloc_coherent); @@ -908,7 +960,7 @@ void usb_free_coherent(struct usb_device *dev, size_t size, void *addr, return; if (!addr) return; - hcd_buffer_free(dev->bus, size, addr, dma); + hcd_buffer_free(dev->bus, size, addr, dma, 0); } EXPORT_SYMBOL_GPL(usb_free_coherent); diff --git a/include/linux/usb.h b/include/linux/usb.h index 4cdd515a4385..7fddd6c2a61e 100644 --- a/include/linux/usb.h +++ b/include/linux/usb.h @@ -1750,6 +1750,11 @@ static inline int usb_urb_dir_out(struct urb *urb) int usb_urb_ep_type_check(const struct urb *urb); +void *usb_alloc_noncoherent(struct usb_device *dev, size_t size, + gfp_t mem_flags, dma_addr_t *dma); +void usb_free_noncoherent(struct usb_device *dev, size_t size, + void *addr, dma_addr_t dma); + void *usb_alloc_coherent(struct usb_device *dev, size_t size, gfp_t mem_flags, dma_addr_t *dma); void usb_free_coherent(struct usb_device *dev, size_t size, diff --git a/include/linux/usb/hcd.h b/include/linux/usb/hcd.h index 97e2ddec18b1..41dd5f0acaad 100644 --- a/include/linux/usb/hcd.h +++ b/include/linux/usb/hcd.h @@ -486,9 +486,9 @@ int hcd_buffer_create(struct usb_hcd *hcd); void hcd_buffer_destroy(struct usb_hcd *hcd); void *hcd_buffer_alloc(struct usb_bus *bus, size_t size, - gfp_t mem_flags, dma_addr_t *dma); + gfp_t mem_flags, dma_addr_t *dma, unsigned long attrs); void hcd_buffer_free(struct usb_bus *bus, size_t size, - void *addr, dma_addr_t dma); + void *addr, dma_addr_t dma, unsigned long attrs); /* generic bus glue, needed for host controllers that don't use PCI */ extern irqreturn_t usb_hcd_irq(int irq, void *__hcd); ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers @ 2018-08-30 17:58 ` Christoph Hellwig 0 siblings, 0 replies; 32+ messages in thread From: Christoph Hellwig @ 2018-08-30 17:58 UTC (permalink / raw) To: Ezequiel Garcia Cc: linux-media, linux-usb, linux-arm-kernel, linux-kernel, Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern, kernel, Keiichi Watanabe Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is a rather horrible interface, and I plan to kill it off rather sooner than later. I plan to post some patches for a better interface that can reuse the normal dma_sync_single_* interfaces for ownership transfers. I can happily include usb in that initial patch set based on your work here if that helps. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers @ 2018-08-30 17:58 ` Christoph Hellwig 0 siblings, 0 replies; 32+ messages in thread From: Christoph Hellwig @ 2018-08-30 17:58 UTC (permalink / raw) To: linux-arm-kernel Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is a rather horrible interface, and I plan to kill it off rather sooner than later. I plan to post some patches for a better interface that can reuse the normal dma_sync_single_* interfaces for ownership transfers. I can happily include usb in that initial patch set based on your work here if that helps. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC,2/3] USB: core: Add non-coherent buffer allocation helpers @ 2018-08-30 17:58 ` Christoph Hellwig 0 siblings, 0 replies; 32+ messages in thread From: Christoph Hellwig @ 2018-08-30 17:58 UTC (permalink / raw) To: Ezequiel Garcia Cc: linux-media, linux-usb, linux-arm-kernel, linux-kernel, Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern, kernel, Keiichi Watanabe Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is a rather horrible interface, and I plan to kill it off rather sooner than later. I plan to post some patches for a better interface that can reuse the normal dma_sync_single_* interfaces for ownership transfers. I can happily include usb in that initial patch set based on your work here if that helps. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers @ 2018-08-30 22:11 ` Ezequiel Garcia 0 siblings, 0 replies; 32+ messages in thread From: Ezequiel Garcia @ 2018-08-30 22:11 UTC (permalink / raw) To: Christoph Hellwig Cc: linux-media, linux-usb, linux-arm-kernel, linux-kernel, Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern, kernel, Keiichi Watanabe On Thu, 2018-08-30 at 10:58 -0700, Christoph Hellwig wrote: > Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is > a rather horrible interface, and I plan to kill it off rather sooner > than later. I plan to post some patches for a better interface > that can reuse the normal dma_sync_single_* interfaces for ownership > transfers. I can happily include usb in that initial patch set based > on your work here if that helps. Please do. Until we have proper allocators that go thru the DMA API, drivers will have to kmalloc the USB transfer buffers, and have streaming mappings. Which in turns mean not using IOMMU or CMA. Regards, Eze ^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers @ 2018-08-30 22:11 ` Ezequiel Garcia 0 siblings, 0 replies; 32+ messages in thread From: Ezequiel Garcia @ 2018-08-30 22:11 UTC (permalink / raw) To: linux-arm-kernel On Thu, 2018-08-30 at 10:58 -0700, Christoph Hellwig wrote: > Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is > a rather horrible interface, and I plan to kill it off rather sooner > than later. I plan to post some patches for a better interface > that can reuse the normal dma_sync_single_* interfaces for ownership > transfers. I can happily include usb in that initial patch set based > on your work here if that helps. Please do. Until we have proper allocators that go thru the DMA API, drivers will have to kmalloc the USB transfer buffers, and have streaming mappings. Which in turns mean not using IOMMU or CMA. Regards, Eze ^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC,2/3] USB: core: Add non-coherent buffer allocation helpers @ 2018-08-30 22:11 ` Ezequiel Garcia 0 siblings, 0 replies; 32+ messages in thread From: Ezequiel Garcia @ 2018-08-30 22:11 UTC (permalink / raw) To: Christoph Hellwig Cc: linux-media, linux-usb, linux-arm-kernel, linux-kernel, Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern, kernel, Keiichi Watanabe On Thu, 2018-08-30 at 10:58 -0700, Christoph Hellwig wrote: > Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is > a rather horrible interface, and I plan to kill it off rather sooner > than later. I plan to post some patches for a better interface > that can reuse the normal dma_sync_single_* interfaces for ownership > transfers. I can happily include usb in that initial patch set based > on your work here if that helps. Please do. Until we have proper allocators that go thru the DMA API, drivers will have to kmalloc the USB transfer buffers, and have streaming mappings. Which in turns mean not using IOMMU or CMA. Regards, Eze ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers @ 2018-08-31 5:50 ` Christoph Hellwig 0 siblings, 0 replies; 32+ messages in thread From: Christoph Hellwig @ 2018-08-31 5:50 UTC (permalink / raw) To: Ezequiel Garcia Cc: Christoph Hellwig, linux-media, linux-usb, linux-arm-kernel, linux-kernel, Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern, kernel, Keiichi Watanabe On Thu, Aug 30, 2018 at 07:11:35PM -0300, Ezequiel Garcia wrote: > On Thu, 2018-08-30 at 10:58 -0700, Christoph Hellwig wrote: > > Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is > > a rather horrible interface, and I plan to kill it off rather sooner > > than later. I plan to post some patches for a better interface > > that can reuse the normal dma_sync_single_* interfaces for ownership > > transfers. I can happily include usb in that initial patch set based > > on your work here if that helps. > > Please do. Until we have proper allocators that go thru the DMA API, > drivers will have to kmalloc the USB transfer buffers, and have > streaming mappings. Which in turns mean not using IOMMU or CMA. dma_map_page will of course use the iommu. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers @ 2018-08-31 5:50 ` Christoph Hellwig 0 siblings, 0 replies; 32+ messages in thread From: Christoph Hellwig @ 2018-08-31 5:50 UTC (permalink / raw) To: linux-arm-kernel On Thu, Aug 30, 2018 at 07:11:35PM -0300, Ezequiel Garcia wrote: > On Thu, 2018-08-30 at 10:58 -0700, Christoph Hellwig wrote: > > Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is > > a rather horrible interface, and I plan to kill it off rather sooner > > than later. I plan to post some patches for a better interface > > that can reuse the normal dma_sync_single_* interfaces for ownership > > transfers. I can happily include usb in that initial patch set based > > on your work here if that helps. > > Please do. Until we have proper allocators that go thru the DMA API, > drivers will have to kmalloc the USB transfer buffers, and have > streaming mappings. Which in turns mean not using IOMMU or CMA. dma_map_page will of course use the iommu. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC,2/3] USB: core: Add non-coherent buffer allocation helpers @ 2018-08-31 5:50 ` Christoph Hellwig 0 siblings, 0 replies; 32+ messages in thread From: Christoph Hellwig @ 2018-08-31 5:50 UTC (permalink / raw) To: Ezequiel Garcia Cc: Christoph Hellwig, linux-media, linux-usb, linux-arm-kernel, linux-kernel, Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern, kernel, Keiichi Watanabe On Thu, Aug 30, 2018 at 07:11:35PM -0300, Ezequiel Garcia wrote: > On Thu, 2018-08-30 at 10:58 -0700, Christoph Hellwig wrote: > > Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is > > a rather horrible interface, and I plan to kill it off rather sooner > > than later. I plan to post some patches for a better interface > > that can reuse the normal dma_sync_single_* interfaces for ownership > > transfers. I can happily include usb in that initial patch set based > > on your work here if that helps. > > Please do. Until we have proper allocators that go thru the DMA API, > drivers will have to kmalloc the USB transfer buffers, and have > streaming mappings. Which in turns mean not using IOMMU or CMA. dma_map_page will of course use the iommu. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers @ 2018-08-31 6:51 ` Tomasz Figa 0 siblings, 0 replies; 32+ messages in thread From: Tomasz Figa @ 2018-08-31 6:51 UTC (permalink / raw) To: Christoph Hellwig Cc: Ezequiel Garcia, Linux Media Mailing List, linux-usb, list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,, Linux Kernel Mailing List, Laurent Pinchart, Matwey V. Kornilov, Alan Stern, kernel, keiichiw On Fri, Aug 31, 2018 at 2:50 PM Christoph Hellwig <hch@infradead.org> wrote: > > On Thu, Aug 30, 2018 at 07:11:35PM -0300, Ezequiel Garcia wrote: > > On Thu, 2018-08-30 at 10:58 -0700, Christoph Hellwig wrote: > > > Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is > > > a rather horrible interface, and I plan to kill it off rather sooner > > > than later. I plan to post some patches for a better interface > > > that can reuse the normal dma_sync_single_* interfaces for ownership > > > transfers. I can happily include usb in that initial patch set based > > > on your work here if that helps. > > > > Please do. Until we have proper allocators that go thru the DMA API, > > drivers will have to kmalloc the USB transfer buffers, and have > > streaming mappings. Which in turns mean not using IOMMU or CMA. > > dma_map_page will of course use the iommu. Sure, dma_map*() will, but using kmalloc() defeats (half of) the purpose of it, since contiguous memory would be allocated unnecessarily, risking failures due to fragmentation. Best regards, Tomasz ^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers @ 2018-08-31 6:51 ` Tomasz Figa 0 siblings, 0 replies; 32+ messages in thread From: Tomasz Figa @ 2018-08-31 6:51 UTC (permalink / raw) To: linux-arm-kernel On Fri, Aug 31, 2018 at 2:50 PM Christoph Hellwig <hch@infradead.org> wrote: > > On Thu, Aug 30, 2018 at 07:11:35PM -0300, Ezequiel Garcia wrote: > > On Thu, 2018-08-30 at 10:58 -0700, Christoph Hellwig wrote: > > > Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is > > > a rather horrible interface, and I plan to kill it off rather sooner > > > than later. I plan to post some patches for a better interface > > > that can reuse the normal dma_sync_single_* interfaces for ownership > > > transfers. I can happily include usb in that initial patch set based > > > on your work here if that helps. > > > > Please do. Until we have proper allocators that go thru the DMA API, > > drivers will have to kmalloc the USB transfer buffers, and have > > streaming mappings. Which in turns mean not using IOMMU or CMA. > > dma_map_page will of course use the iommu. Sure, dma_map*() will, but using kmalloc() defeats (half of) the purpose of it, since contiguous memory would be allocated unnecessarily, risking failures due to fragmentation. Best regards, Tomasz ^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC,2/3] USB: core: Add non-coherent buffer allocation helpers @ 2018-08-31 6:51 ` Tomasz Figa 0 siblings, 0 replies; 32+ messages in thread From: Tomasz Figa @ 2018-08-31 6:51 UTC (permalink / raw) To: Christoph Hellwig Cc: Ezequiel Garcia, Linux Media Mailing List, linux-usb, list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,, Linux Kernel Mailing List, Laurent Pinchart, Matwey V. Kornilov, Alan Stern, kernel, keiichiw On Fri, Aug 31, 2018 at 2:50 PM Christoph Hellwig <hch@infradead.org> wrote: > > On Thu, Aug 30, 2018 at 07:11:35PM -0300, Ezequiel Garcia wrote: > > On Thu, 2018-08-30 at 10:58 -0700, Christoph Hellwig wrote: > > > Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is > > > a rather horrible interface, and I plan to kill it off rather sooner > > > than later. I plan to post some patches for a better interface > > > that can reuse the normal dma_sync_single_* interfaces for ownership > > > transfers. I can happily include usb in that initial patch set based > > > on your work here if that helps. > > > > Please do. Until we have proper allocators that go thru the DMA API, > > drivers will have to kmalloc the USB transfer buffers, and have > > streaming mappings. Which in turns mean not using IOMMU or CMA. > > dma_map_page will of course use the iommu. Sure, dma_map*() will, but using kmalloc() defeats (half of) the purpose of it, since contiguous memory would be allocated unnecessarily, risking failures due to fragmentation. Best regards, Tomasz ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers @ 2018-10-31 1:55 ` Tomasz Figa 0 siblings, 0 replies; 32+ messages in thread From: Tomasz Figa @ 2018-10-31 1:55 UTC (permalink / raw) To: Christoph Hellwig Cc: Ezequiel Garcia, Linux Media Mailing List, linux-usb, list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,, Linux Kernel Mailing List, Laurent Pinchart, Matwey V. Kornilov, Alan Stern, kernel, keiichiw Hi Christoph and everyone, On Fri, Aug 31, 2018 at 3:51 PM Tomasz Figa <tfiga@chromium.org> wrote: > > On Fri, Aug 31, 2018 at 2:50 PM Christoph Hellwig <hch@infradead.org> wrote: > > > > On Thu, Aug 30, 2018 at 07:11:35PM -0300, Ezequiel Garcia wrote: > > > On Thu, 2018-08-30 at 10:58 -0700, Christoph Hellwig wrote: > > > > Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is > > > > a rather horrible interface, and I plan to kill it off rather sooner > > > > than later. I plan to post some patches for a better interface > > > > that can reuse the normal dma_sync_single_* interfaces for ownership > > > > transfers. I can happily include usb in that initial patch set based > > > > on your work here if that helps. > > > > > > Please do. Until we have proper allocators that go thru the DMA API, > > > drivers will have to kmalloc the USB transfer buffers, and have > > > streaming mappings. Which in turns mean not using IOMMU or CMA. > > > > dma_map_page will of course use the iommu. > > Sure, dma_map*() will, but using kmalloc() defeats (half of) the > purpose of it, since contiguous memory would be allocated > unnecessarily, risking failures due to fragmentation. Have we reached a conclusion here? It sounds like it's a quite significant problem, at least for some of the camera (media) devices over there and there are people interested in solving it, so all we need here is a conclusion on how to do it. :) Best regards, Tomasz ^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers @ 2018-10-31 1:55 ` Tomasz Figa 0 siblings, 0 replies; 32+ messages in thread From: Tomasz Figa @ 2018-10-31 1:55 UTC (permalink / raw) To: linux-arm-kernel Hi Christoph and everyone, On Fri, Aug 31, 2018 at 3:51 PM Tomasz Figa <tfiga@chromium.org> wrote: > > On Fri, Aug 31, 2018 at 2:50 PM Christoph Hellwig <hch@infradead.org> wrote: > > > > On Thu, Aug 30, 2018 at 07:11:35PM -0300, Ezequiel Garcia wrote: > > > On Thu, 2018-08-30 at 10:58 -0700, Christoph Hellwig wrote: > > > > Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is > > > > a rather horrible interface, and I plan to kill it off rather sooner > > > > than later. I plan to post some patches for a better interface > > > > that can reuse the normal dma_sync_single_* interfaces for ownership > > > > transfers. I can happily include usb in that initial patch set based > > > > on your work here if that helps. > > > > > > Please do. Until we have proper allocators that go thru the DMA API, > > > drivers will have to kmalloc the USB transfer buffers, and have > > > streaming mappings. Which in turns mean not using IOMMU or CMA. > > > > dma_map_page will of course use the iommu. > > Sure, dma_map*() will, but using kmalloc() defeats (half of) the > purpose of it, since contiguous memory would be allocated > unnecessarily, risking failures due to fragmentation. Have we reached a conclusion here? It sounds like it's a quite significant problem, at least for some of the camera (media) devices over there and there are people interested in solving it, so all we need here is a conclusion on how to do it. :) Best regards, Tomasz ^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC,2/3] USB: core: Add non-coherent buffer allocation helpers @ 2018-10-31 1:55 ` Tomasz Figa 0 siblings, 0 replies; 32+ messages in thread From: Tomasz Figa @ 2018-10-31 1:55 UTC (permalink / raw) To: Christoph Hellwig Cc: Ezequiel Garcia, Linux Media Mailing List, linux-usb, list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,, Linux Kernel Mailing List, Laurent Pinchart, Matwey V. Kornilov, Alan Stern, kernel, keiichiw Hi Christoph and everyone, On Fri, Aug 31, 2018 at 3:51 PM Tomasz Figa <tfiga@chromium.org> wrote: > > On Fri, Aug 31, 2018 at 2:50 PM Christoph Hellwig <hch@infradead.org> wrote: > > > > On Thu, Aug 30, 2018 at 07:11:35PM -0300, Ezequiel Garcia wrote: > > > On Thu, 2018-08-30 at 10:58 -0700, Christoph Hellwig wrote: > > > > Please don't introduce new DMA_ATTR_NON_CONSISTENT users, it is > > > > a rather horrible interface, and I plan to kill it off rather sooner > > > > than later. I plan to post some patches for a better interface > > > > that can reuse the normal dma_sync_single_* interfaces for ownership > > > > transfers. I can happily include usb in that initial patch set based > > > > on your work here if that helps. > > > > > > Please do. Until we have proper allocators that go thru the DMA API, > > > drivers will have to kmalloc the USB transfer buffers, and have > > > streaming mappings. Which in turns mean not using IOMMU or CMA. > > > > dma_map_page will of course use the iommu. > > Sure, dma_map*() will, but using kmalloc() defeats (half of) the > purpose of it, since contiguous memory would be allocated > unnecessarily, risking failures due to fragmentation. Have we reached a conclusion here? It sounds like it's a quite significant problem, at least for some of the camera (media) devices over there and there are people interested in solving it, so all we need here is a conclusion on how to do it. :) Best regards, Tomasz ^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC 3/3] stk1160: Use non-coherent buffers for USB transfers @ 2018-08-30 17:20 ` Ezequiel Garcia 0 siblings, 0 replies; 32+ messages in thread From: Ezequiel Garcia @ 2018-08-30 17:20 UTC (permalink / raw) To: linux-media, linux-usb, linux-arm-kernel, linux-kernel Cc: Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern, kernel, Keiichi Watanabe, Ezequiel Garcia Platforms without hardware coherency can benefit a lot from using non-coherent buffers. Moreover, platforms with hardware coherency aren't impacted by this change. For instance, on AM335x, while it's still not possible to capture full resolution frames, this patch enables half-resolution frame streams to work. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> --- drivers/media/usb/stk1160/stk1160-video.c | 22 ++++++---------------- 1 file changed, 6 insertions(+), 16 deletions(-) diff --git a/drivers/media/usb/stk1160/stk1160-video.c b/drivers/media/usb/stk1160/stk1160-video.c index 2811f612820f..aeb4264d1998 100644 --- a/drivers/media/usb/stk1160/stk1160-video.c +++ b/drivers/media/usb/stk1160/stk1160-video.c @@ -240,6 +240,9 @@ static void stk1160_process_isoc(struct stk1160 *dev, struct urb *urb) return; } + dma_sync_single_for_cpu(&urb->dev->dev, urb->transfer_dma, + urb->transfer_buffer_length, DMA_FROM_DEVICE); + for (i = 0; i < urb->number_of_packets; i++) { status = urb->iso_frame_desc[i].status; if (status < 0) { @@ -379,16 +382,11 @@ void stk1160_free_isoc(struct stk1160 *dev) urb = dev->isoc_ctl.urb[i]; if (urb) { - if (dev->isoc_ctl.transfer_buffer[i]) { -#ifndef CONFIG_DMA_NONCOHERENT - usb_free_coherent(dev->udev, + if (dev->isoc_ctl.transfer_buffer[i]) + usb_free_noncoherent(dev->udev, urb->transfer_buffer_length, dev->isoc_ctl.transfer_buffer[i], urb->transfer_dma); -#else - kfree(dev->isoc_ctl.transfer_buffer[i]); -#endif - } usb_free_urb(urb); dev->isoc_ctl.urb[i] = NULL; } @@ -461,12 +459,8 @@ int stk1160_alloc_isoc(struct stk1160 *dev) goto free_i_bufs; dev->isoc_ctl.urb[i] = urb; -#ifndef CONFIG_DMA_NONCOHERENT - dev->isoc_ctl.transfer_buffer[i] = usb_alloc_coherent(dev->udev, + dev->isoc_ctl.transfer_buffer[i] = usb_alloc_noncoherent(dev->udev, sb_size, GFP_KERNEL, &urb->transfer_dma); -#else - dev->isoc_ctl.transfer_buffer[i] = kmalloc(sb_size, GFP_KERNEL); -#endif if (!dev->isoc_ctl.transfer_buffer[i]) { stk1160_err("cannot alloc %d bytes for tx[%d] buffer\n", sb_size, i); @@ -490,11 +484,7 @@ int stk1160_alloc_isoc(struct stk1160 *dev) urb->interval = 1; urb->start_frame = 0; urb->number_of_packets = max_packets; -#ifndef CONFIG_DMA_NONCOHERENT urb->transfer_flags = URB_ISO_ASAP | URB_NO_TRANSFER_DMA_MAP; -#else - urb->transfer_flags = URB_ISO_ASAP; -#endif k = 0; for (j = 0; j < max_packets; j++) { -- 2.18.0 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [RFC 3/3] stk1160: Use non-coherent buffers for USB transfers @ 2018-08-30 17:20 ` Ezequiel Garcia 0 siblings, 0 replies; 32+ messages in thread From: Ezequiel Garcia @ 2018-08-30 17:20 UTC (permalink / raw) To: linux-arm-kernel Platforms without hardware coherency can benefit a lot from using non-coherent buffers. Moreover, platforms with hardware coherency aren't impacted by this change. For instance, on AM335x, while it's still not possible to capture full resolution frames, this patch enables half-resolution frame streams to work. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> --- drivers/media/usb/stk1160/stk1160-video.c | 22 ++++++---------------- 1 file changed, 6 insertions(+), 16 deletions(-) diff --git a/drivers/media/usb/stk1160/stk1160-video.c b/drivers/media/usb/stk1160/stk1160-video.c index 2811f612820f..aeb4264d1998 100644 --- a/drivers/media/usb/stk1160/stk1160-video.c +++ b/drivers/media/usb/stk1160/stk1160-video.c @@ -240,6 +240,9 @@ static void stk1160_process_isoc(struct stk1160 *dev, struct urb *urb) return; } + dma_sync_single_for_cpu(&urb->dev->dev, urb->transfer_dma, + urb->transfer_buffer_length, DMA_FROM_DEVICE); + for (i = 0; i < urb->number_of_packets; i++) { status = urb->iso_frame_desc[i].status; if (status < 0) { @@ -379,16 +382,11 @@ void stk1160_free_isoc(struct stk1160 *dev) urb = dev->isoc_ctl.urb[i]; if (urb) { - if (dev->isoc_ctl.transfer_buffer[i]) { -#ifndef CONFIG_DMA_NONCOHERENT - usb_free_coherent(dev->udev, + if (dev->isoc_ctl.transfer_buffer[i]) + usb_free_noncoherent(dev->udev, urb->transfer_buffer_length, dev->isoc_ctl.transfer_buffer[i], urb->transfer_dma); -#else - kfree(dev->isoc_ctl.transfer_buffer[i]); -#endif - } usb_free_urb(urb); dev->isoc_ctl.urb[i] = NULL; } @@ -461,12 +459,8 @@ int stk1160_alloc_isoc(struct stk1160 *dev) goto free_i_bufs; dev->isoc_ctl.urb[i] = urb; -#ifndef CONFIG_DMA_NONCOHERENT - dev->isoc_ctl.transfer_buffer[i] = usb_alloc_coherent(dev->udev, + dev->isoc_ctl.transfer_buffer[i] = usb_alloc_noncoherent(dev->udev, sb_size, GFP_KERNEL, &urb->transfer_dma); -#else - dev->isoc_ctl.transfer_buffer[i] = kmalloc(sb_size, GFP_KERNEL); -#endif if (!dev->isoc_ctl.transfer_buffer[i]) { stk1160_err("cannot alloc %d bytes for tx[%d] buffer\n", sb_size, i); @@ -490,11 +484,7 @@ int stk1160_alloc_isoc(struct stk1160 *dev) urb->interval = 1; urb->start_frame = 0; urb->number_of_packets = max_packets; -#ifndef CONFIG_DMA_NONCOHERENT urb->transfer_flags = URB_ISO_ASAP | URB_NO_TRANSFER_DMA_MAP; -#else - urb->transfer_flags = URB_ISO_ASAP; -#endif k = 0; for (j = 0; j < max_packets; j++) { -- 2.18.0 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [RFC,3/3] stk1160: Use non-coherent buffers for USB transfers @ 2018-08-30 17:20 ` Ezequiel Garcia 0 siblings, 0 replies; 32+ messages in thread From: Ezequiel Garcia @ 2018-08-30 17:20 UTC (permalink / raw) To: linux-media, linux-usb, linux-arm-kernel, linux-kernel Cc: Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern, kernel, Keiichi Watanabe, Ezequiel Garcia Platforms without hardware coherency can benefit a lot from using non-coherent buffers. Moreover, platforms with hardware coherency aren't impacted by this change. For instance, on AM335x, while it's still not possible to capture full resolution frames, this patch enables half-resolution frame streams to work. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> --- drivers/media/usb/stk1160/stk1160-video.c | 22 ++++++---------------- 1 file changed, 6 insertions(+), 16 deletions(-) diff --git a/drivers/media/usb/stk1160/stk1160-video.c b/drivers/media/usb/stk1160/stk1160-video.c index 2811f612820f..aeb4264d1998 100644 --- a/drivers/media/usb/stk1160/stk1160-video.c +++ b/drivers/media/usb/stk1160/stk1160-video.c @@ -240,6 +240,9 @@ static void stk1160_process_isoc(struct stk1160 *dev, struct urb *urb) return; } + dma_sync_single_for_cpu(&urb->dev->dev, urb->transfer_dma, + urb->transfer_buffer_length, DMA_FROM_DEVICE); + for (i = 0; i < urb->number_of_packets; i++) { status = urb->iso_frame_desc[i].status; if (status < 0) { @@ -379,16 +382,11 @@ void stk1160_free_isoc(struct stk1160 *dev) urb = dev->isoc_ctl.urb[i]; if (urb) { - if (dev->isoc_ctl.transfer_buffer[i]) { -#ifndef CONFIG_DMA_NONCOHERENT - usb_free_coherent(dev->udev, + if (dev->isoc_ctl.transfer_buffer[i]) + usb_free_noncoherent(dev->udev, urb->transfer_buffer_length, dev->isoc_ctl.transfer_buffer[i], urb->transfer_dma); -#else - kfree(dev->isoc_ctl.transfer_buffer[i]); -#endif - } usb_free_urb(urb); dev->isoc_ctl.urb[i] = NULL; } @@ -461,12 +459,8 @@ int stk1160_alloc_isoc(struct stk1160 *dev) goto free_i_bufs; dev->isoc_ctl.urb[i] = urb; -#ifndef CONFIG_DMA_NONCOHERENT - dev->isoc_ctl.transfer_buffer[i] = usb_alloc_coherent(dev->udev, + dev->isoc_ctl.transfer_buffer[i] = usb_alloc_noncoherent(dev->udev, sb_size, GFP_KERNEL, &urb->transfer_dma); -#else - dev->isoc_ctl.transfer_buffer[i] = kmalloc(sb_size, GFP_KERNEL); -#endif if (!dev->isoc_ctl.transfer_buffer[i]) { stk1160_err("cannot alloc %d bytes for tx[%d] buffer\n", sb_size, i); @@ -490,11 +484,7 @@ int stk1160_alloc_isoc(struct stk1160 *dev) urb->interval = 1; urb->start_frame = 0; urb->number_of_packets = max_packets; -#ifndef CONFIG_DMA_NONCOHERENT urb->transfer_flags = URB_ISO_ASAP | URB_NO_TRANSFER_DMA_MAP; -#else - urb->transfer_flags = URB_ISO_ASAP; -#endif k = 0; for (j = 0; j < max_packets; j++) { ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [RFC 3/3] stk1160: Use non-coherent buffers for USB transfers @ 2018-08-30 17:59 ` Christoph Hellwig 0 siblings, 0 replies; 32+ messages in thread From: Christoph Hellwig @ 2018-08-30 17:59 UTC (permalink / raw) To: Ezequiel Garcia Cc: linux-media, linux-usb, linux-arm-kernel, linux-kernel, Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern, kernel, Keiichi Watanabe > + dma_sync_single_for_cpu(&urb->dev->dev, urb->transfer_dma, > + urb->transfer_buffer_length, DMA_FROM_DEVICE); You can't ue dma_sync_single_for_cpu on non-coherent dma buffers, which is one of the major issues with them. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC 3/3] stk1160: Use non-coherent buffers for USB transfers @ 2018-08-30 17:59 ` Christoph Hellwig 0 siblings, 0 replies; 32+ messages in thread From: Christoph Hellwig @ 2018-08-30 17:59 UTC (permalink / raw) To: linux-arm-kernel > + dma_sync_single_for_cpu(&urb->dev->dev, urb->transfer_dma, > + urb->transfer_buffer_length, DMA_FROM_DEVICE); You can't ue dma_sync_single_for_cpu on non-coherent dma buffers, which is one of the major issues with them. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC,3/3] stk1160: Use non-coherent buffers for USB transfers @ 2018-08-30 17:59 ` Christoph Hellwig 0 siblings, 0 replies; 32+ messages in thread From: Christoph Hellwig @ 2018-08-30 17:59 UTC (permalink / raw) To: Ezequiel Garcia Cc: linux-media, linux-usb, linux-arm-kernel, linux-kernel, Laurent Pinchart, Tomasz Figa, Matwey V . Kornilov, Alan Stern, kernel, Keiichi Watanabe > + dma_sync_single_for_cpu(&urb->dev->dev, urb->transfer_dma, > + urb->transfer_buffer_length, DMA_FROM_DEVICE); You can't ue dma_sync_single_for_cpu on non-coherent dma buffers, which is one of the major issues with them. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [RFC 3/3] stk1160: Use non-coherent buffers for USB transfers @ 2018-09-07 8:54 ` Tomasz Figa 0 siblings, 0 replies; 32+ messages in thread From: Tomasz Figa @ 2018-09-07 8:54 UTC (permalink / raw) To: Christoph Hellwig Cc: Ezequiel Garcia, Linux Media Mailing List, linux-usb, list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,, Linux Kernel Mailing List, Laurent Pinchart, Matwey V. Kornilov, Alan Stern, kernel, keiichiw On Fri, Aug 31, 2018 at 2:59 AM Christoph Hellwig <hch@infradead.org> wrote: > > > + dma_sync_single_for_cpu(&urb->dev->dev, urb->transfer_dma, > > + urb->transfer_buffer_length, DMA_FROM_DEVICE); > > You can't ue dma_sync_single_for_cpu on non-coherent dma buffers, > which is one of the major issues with them. It's not an issue of DMA API, but just an API mismatch. By design, memory allocated for device (e.g. by DMA API) doesn't have to be physically contiguous, while dma_*_single() API expects a _single_, physically contiguous region of memory. We need a way to allocate non-coherent memory using DMA API to handle (on USB example, but applies to virtually any class of devices doing DMA): - DMA address range limitations (e.g. dma_mask) - while a USB HCD driver is normally aware of those, USB device driver should have no idea, - memory mapping capability === whether contiguous memory or a set of random pages can be allocated - this is a platform integration detail, which even a USB HCD driver may not be aware of, if a SoC IOMMU is just stuffed between the bus and HCD, - platform coherency specifics - there are practical scenarios when on a coherent-by-default system it's more efficient to allocate non-coherent memory and manage caches explicitly to avoid the costs of cache snooping. If DMA_ATTR_NON_CONSISTENT is not the right way to do it, there should be definitely a new API introduced, coupled closely to DMA API implementation on given platform, since it's the only place which can solve all the constraints above. Best regards, Tomasz ^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC 3/3] stk1160: Use non-coherent buffers for USB transfers @ 2018-09-07 8:54 ` Tomasz Figa 0 siblings, 0 replies; 32+ messages in thread From: Tomasz Figa @ 2018-09-07 8:54 UTC (permalink / raw) To: linux-arm-kernel On Fri, Aug 31, 2018 at 2:59 AM Christoph Hellwig <hch@infradead.org> wrote: > > > + dma_sync_single_for_cpu(&urb->dev->dev, urb->transfer_dma, > > + urb->transfer_buffer_length, DMA_FROM_DEVICE); > > You can't ue dma_sync_single_for_cpu on non-coherent dma buffers, > which is one of the major issues with them. It's not an issue of DMA API, but just an API mismatch. By design, memory allocated for device (e.g. by DMA API) doesn't have to be physically contiguous, while dma_*_single() API expects a _single_, physically contiguous region of memory. We need a way to allocate non-coherent memory using DMA API to handle (on USB example, but applies to virtually any class of devices doing DMA): - DMA address range limitations (e.g. dma_mask) - while a USB HCD driver is normally aware of those, USB device driver should have no idea, - memory mapping capability === whether contiguous memory or a set of random pages can be allocated - this is a platform integration detail, which even a USB HCD driver may not be aware of, if a SoC IOMMU is just stuffed between the bus and HCD, - platform coherency specifics - there are practical scenarios when on a coherent-by-default system it's more efficient to allocate non-coherent memory and manage caches explicitly to avoid the costs of cache snooping. If DMA_ATTR_NON_CONSISTENT is not the right way to do it, there should be definitely a new API introduced, coupled closely to DMA API implementation on given platform, since it's the only place which can solve all the constraints above. Best regards, Tomasz ^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC,3/3] stk1160: Use non-coherent buffers for USB transfers @ 2018-09-07 8:54 ` Tomasz Figa 0 siblings, 0 replies; 32+ messages in thread From: Tomasz Figa @ 2018-09-07 8:54 UTC (permalink / raw) To: Christoph Hellwig Cc: Ezequiel Garcia, Linux Media Mailing List, linux-usb, list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,, Linux Kernel Mailing List, Laurent Pinchart, Matwey V. Kornilov, Alan Stern, kernel, keiichiw On Fri, Aug 31, 2018 at 2:59 AM Christoph Hellwig <hch@infradead.org> wrote: > > > + dma_sync_single_for_cpu(&urb->dev->dev, urb->transfer_dma, > > + urb->transfer_buffer_length, DMA_FROM_DEVICE); > > You can't ue dma_sync_single_for_cpu on non-coherent dma buffers, > which is one of the major issues with them. It's not an issue of DMA API, but just an API mismatch. By design, memory allocated for device (e.g. by DMA API) doesn't have to be physically contiguous, while dma_*_single() API expects a _single_, physically contiguous region of memory. We need a way to allocate non-coherent memory using DMA API to handle (on USB example, but applies to virtually any class of devices doing DMA): - DMA address range limitations (e.g. dma_mask) - while a USB HCD driver is normally aware of those, USB device driver should have no idea, - memory mapping capability === whether contiguous memory or a set of random pages can be allocated - this is a platform integration detail, which even a USB HCD driver may not be aware of, if a SoC IOMMU is just stuffed between the bus and HCD, - platform coherency specifics - there are practical scenarios when on a coherent-by-default system it's more efficient to allocate non-coherent memory and manage caches explicitly to avoid the costs of cache snooping. If DMA_ATTR_NON_CONSISTENT is not the right way to do it, there should be definitely a new API introduced, coupled closely to DMA API implementation on given platform, since it's the only place which can solve all the constraints above. Best regards, Tomasz ^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2018-10-31 1:55 UTC | newest] Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-08-30 17:20 [RFC 0/3] Introduce usb_{alloc,free}_noncoherent API Ezequiel Garcia 2018-08-30 17:20 ` Ezequiel Garcia 2018-08-30 17:20 ` [RFC 1/3] HACK: ARM: dma-mapping: Get writeback memory for non-consistent mappings Ezequiel Garcia 2018-08-30 17:20 ` Ezequiel Garcia 2018-08-30 17:20 ` [RFC,1/3] " Ezequiel Garcia 2018-08-30 17:20 ` [RFC 2/3] USB: core: Add non-coherent buffer allocation helpers Ezequiel Garcia 2018-08-30 17:20 ` Ezequiel Garcia 2018-08-30 17:20 ` [RFC,2/3] " Ezequiel Garcia 2018-08-30 17:58 ` [RFC 2/3] " Christoph Hellwig 2018-08-30 17:58 ` Christoph Hellwig 2018-08-30 17:58 ` [RFC,2/3] " Christoph Hellwig 2018-08-30 22:11 ` [RFC 2/3] " Ezequiel Garcia 2018-08-30 22:11 ` Ezequiel Garcia 2018-08-30 22:11 ` [RFC,2/3] " Ezequiel Garcia 2018-08-31 5:50 ` [RFC 2/3] " Christoph Hellwig 2018-08-31 5:50 ` Christoph Hellwig 2018-08-31 5:50 ` [RFC,2/3] " Christoph Hellwig 2018-08-31 6:51 ` [RFC 2/3] " Tomasz Figa 2018-08-31 6:51 ` Tomasz Figa 2018-08-31 6:51 ` [RFC,2/3] " Tomasz Figa 2018-10-31 1:55 ` [RFC 2/3] " Tomasz Figa 2018-10-31 1:55 ` Tomasz Figa 2018-10-31 1:55 ` [RFC,2/3] " Tomasz Figa 2018-08-30 17:20 ` [RFC 3/3] stk1160: Use non-coherent buffers for USB transfers Ezequiel Garcia 2018-08-30 17:20 ` Ezequiel Garcia 2018-08-30 17:20 ` [RFC,3/3] " Ezequiel Garcia 2018-08-30 17:59 ` [RFC 3/3] " Christoph Hellwig 2018-08-30 17:59 ` Christoph Hellwig 2018-08-30 17:59 ` [RFC,3/3] " Christoph Hellwig 2018-09-07 8:54 ` [RFC 3/3] " Tomasz Figa 2018-09-07 8:54 ` Tomasz Figa 2018-09-07 8:54 ` [RFC,3/3] " Tomasz Figa
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.