All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/2] [media] videobuf2-dc: Add support for cacheable MMAP
@ 2016-10-26  8:52 ` Thierry Escande
  2016-10-26  8:52   ` [PATCH v3 1/2] [media] videobuf2-dc: Move vb2_dc_get_base_sgt() above mmap callbacks Thierry Escande
                     ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Thierry Escande @ 2016-10-26  8:52 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Sakari Ailus
  Cc: linux-media, linux-kernel, Pawel Osciak, Marek Szyprowski, Kyungmin Park

Hi,

This series adds support for cacheable MMAP in DMA coherent allocator.

The first patch moves the vb2_dc_get_base_sgt() function above mmap
callbacks for calls introduced by the second patch. This avoids a
forward declaration.

Changes in v2:
- Put function move in a separate patch
- Added comments

Changes in v3:
- Remove redundant test on NO_KERNEL_MAPPING DMA attribute in mmap()

Heng-Ruey Hsu (1):
  [media] videobuf2-dc: Support cacheable MMAP

Thierry Escande (1):
  [media] videobuf2-dc: Move vb2_dc_get_base_sgt() above mmap callbacks

 drivers/media/v4l2-core/videobuf2-dma-contig.c | 60 ++++++++++++++++----------
 1 file changed, 38 insertions(+), 22 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v3 1/2] [media] videobuf2-dc: Move vb2_dc_get_base_sgt() above mmap callbacks
  2016-10-26  8:52 ` [PATCH v3 0/2] [media] videobuf2-dc: Add support for cacheable MMAP Thierry Escande
@ 2016-10-26  8:52   ` Thierry Escande
  2016-10-26  8:52   ` [PATCH v3 2/2] [media] videobuf2-dc: Support cacheable MMAP Thierry Escande
  2017-07-03  9:27   ` [PATCH v3 0/2] [media] videobuf2-dc: Add support for " Marek Szyprowski
  2 siblings, 0 replies; 11+ messages in thread
From: Thierry Escande @ 2016-10-26  8:52 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Sakari Ailus
  Cc: linux-media, linux-kernel, Pawel Osciak, Marek Szyprowski, Kyungmin Park

This patch moves vb2_dc_get_base_sgt() function above mmap buffers
callbacks, particularly vb2_dc_alloc() and vb2_dc_mmap() from where it
will be called for cacheable MMAP support introduced in the next patch.

Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 44 +++++++++++++-------------
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index a44e383..0d9665d 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -61,6 +61,28 @@ static unsigned long vb2_dc_get_contiguous_size(struct sg_table *sgt)
 	return size;
 }
 
+static struct sg_table *vb2_dc_get_base_sgt(struct vb2_dc_buf *buf)
+{
+	int ret;
+	struct sg_table *sgt;
+
+	sgt = kmalloc(sizeof(*sgt), GFP_KERNEL);
+	if (!sgt) {
+		dev_err(buf->dev, "failed to alloc sg table\n");
+		return NULL;
+	}
+
+	ret = dma_get_sgtable_attrs(buf->dev, sgt, buf->cookie, buf->dma_addr,
+		buf->size, buf->attrs);
+	if (ret < 0) {
+		dev_err(buf->dev, "failed to get scatterlist from DMA API\n");
+		kfree(sgt);
+		return NULL;
+	}
+
+	return sgt;
+}
+
 /*********************************************/
 /*         callbacks for all buffers         */
 /*********************************************/
@@ -363,28 +385,6 @@ static struct dma_buf_ops vb2_dc_dmabuf_ops = {
 	.release = vb2_dc_dmabuf_ops_release,
 };
 
-static struct sg_table *vb2_dc_get_base_sgt(struct vb2_dc_buf *buf)
-{
-	int ret;
-	struct sg_table *sgt;
-
-	sgt = kmalloc(sizeof(*sgt), GFP_KERNEL);
-	if (!sgt) {
-		dev_err(buf->dev, "failed to alloc sg table\n");
-		return NULL;
-	}
-
-	ret = dma_get_sgtable_attrs(buf->dev, sgt, buf->cookie, buf->dma_addr,
-		buf->size, buf->attrs);
-	if (ret < 0) {
-		dev_err(buf->dev, "failed to get scatterlist from DMA API\n");
-		kfree(sgt);
-		return NULL;
-	}
-
-	return sgt;
-}
-
 static struct dma_buf *vb2_dc_get_dmabuf(void *buf_priv, unsigned long flags)
 {
 	struct vb2_dc_buf *buf = buf_priv;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 2/2] [media] videobuf2-dc: Support cacheable MMAP
  2016-10-26  8:52 ` [PATCH v3 0/2] [media] videobuf2-dc: Add support for cacheable MMAP Thierry Escande
  2016-10-26  8:52   ` [PATCH v3 1/2] [media] videobuf2-dc: Move vb2_dc_get_base_sgt() above mmap callbacks Thierry Escande
@ 2016-10-26  8:52   ` Thierry Escande
  2018-09-17 14:41     ` Hans Verkuil
  2017-07-03  9:27   ` [PATCH v3 0/2] [media] videobuf2-dc: Add support for " Marek Szyprowski
  2 siblings, 1 reply; 11+ messages in thread
From: Thierry Escande @ 2016-10-26  8:52 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Sakari Ailus
  Cc: linux-media, linux-kernel, Pawel Osciak, Marek Szyprowski, Kyungmin Park

From: Heng-Ruey Hsu <henryhsu@chromium.org>

DMA allocations for MMAP type are uncached by default. But for
some cases, CPU has to access the buffers. ie: memcpy for format
converter. Supporting cacheable MMAP improves huge performance.

This patch enables cacheable memory for DMA coherent allocator in mmap
buffer allocation if non-consistent DMA attribute is set and kernel
mapping is present. Even if userspace doesn't mmap the buffer, sync
still should be happening if kernel mapping is present.
If not done in allocation, it is enabled when memory is mapped from
userspace (if non-consistent DMA attribute is set).

Signed-off-by: Heng-Ruey Hsu <henryhsu@chromium.org>
Tested-by: Heng-ruey Hsu <henryhsu@chromium.org>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index 0d9665d..89b534a 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -151,6 +151,10 @@ static void vb2_dc_put(void *buf_priv)
 		sg_free_table(buf->sgt_base);
 		kfree(buf->sgt_base);
 	}
+	if (buf->dma_sgt) {
+		sg_free_table(buf->dma_sgt);
+		kfree(buf->dma_sgt);
+	}
 	dma_free_attrs(buf->dev, buf->size, buf->cookie, buf->dma_addr,
 		       buf->attrs);
 	put_device(buf->dev);
@@ -192,6 +196,14 @@ static void *vb2_dc_alloc(struct device *dev, unsigned long attrs,
 	buf->handler.put = vb2_dc_put;
 	buf->handler.arg = buf;
 
+	/*
+	 * Enable cache maintenance. Even if userspace doesn't mmap the buffer,
+	 * sync still should be happening if kernel mapping is present.
+	 */
+	if (!(buf->attrs & DMA_ATTR_NO_KERNEL_MAPPING) &&
+	    buf->attrs & DMA_ATTR_NON_CONSISTENT)
+		buf->dma_sgt = vb2_dc_get_base_sgt(buf);
+
 	atomic_inc(&buf->refcount);
 
 	return buf;
@@ -227,6 +239,10 @@ static int vb2_dc_mmap(void *buf_priv, struct vm_area_struct *vma)
 
 	vma->vm_ops->open(vma);
 
+	/* Enable cache maintenance if not enabled in allocation. */
+	if (!buf->dma_sgt && buf->attrs & DMA_ATTR_NON_CONSISTENT)
+		buf->dma_sgt = vb2_dc_get_base_sgt(buf);
+
 	pr_debug("%s: mapped dma addr 0x%08lx at 0x%08lx, size %ld\n",
 		__func__, (unsigned long)buf->dma_addr, vma->vm_start,
 		buf->size);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 0/2] [media] videobuf2-dc: Add support for cacheable MMAP
  2016-10-26  8:52 ` [PATCH v3 0/2] [media] videobuf2-dc: Add support for cacheable MMAP Thierry Escande
  2016-10-26  8:52   ` [PATCH v3 1/2] [media] videobuf2-dc: Move vb2_dc_get_base_sgt() above mmap callbacks Thierry Escande
  2016-10-26  8:52   ` [PATCH v3 2/2] [media] videobuf2-dc: Support cacheable MMAP Thierry Escande
@ 2017-07-03  9:27   ` Marek Szyprowski
  2017-07-05 17:33     ` Christoph Hellwig
  2 siblings, 1 reply; 11+ messages in thread
From: Marek Szyprowski @ 2017-07-03  9:27 UTC (permalink / raw)
  To: Thierry Escande, Mauro Carvalho Chehab, Sakari Ailus
  Cc: linux-media, linux-kernel, Pawel Osciak, Kyungmin Park,
	Christoph Hellwig, Hans Verkuil, Shuah Khan, Russell King

Hi All,

On 2016-10-26 10:52, Thierry Escande wrote:
> This series adds support for cacheable MMAP in DMA coherent allocator.
>
> The first patch moves the vb2_dc_get_base_sgt() function above mmap
> callbacks for calls introduced by the second patch. This avoids a
> forward declaration.

I'm sorry for late review. Sylwester kicked me for pending v4l2/vb2 patches
and I've just found this thread in my TODO folder.

The main question here if we want to merge incomplete solution or not. As
for now, there is no support in ARM/ARM64 for NON_CONSISTENT attribute.
Also none of the v4l2 drivers use it. Sadly support for NON_CONSISTENT
attribute is not fully implemented nor even defined in mainline.

I know that it works fine for some vendor kernel trees, but supporting it in
mainline was a bit controversial. There is no proper way to sync cache 
for such
buffers. Calling dma_sync_sg worked so far, but it has to be first agreed as
a proper DMA API.

> Changes in v2:
> - Put function move in a separate patch
> - Added comments
>
> Changes in v3:
> - Remove redundant test on NO_KERNEL_MAPPING DMA attribute in mmap()
>
> Heng-Ruey Hsu (1):
>    [media] videobuf2-dc: Support cacheable MMAP
>
> Thierry Escande (1):
>    [media] videobuf2-dc: Move vb2_dc_get_base_sgt() above mmap callbacks
>
>   drivers/media/v4l2-core/videobuf2-dma-contig.c | 60 ++++++++++++++++----------
>   1 file changed, 38 insertions(+), 22 deletions(-)
>

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 0/2] [media] videobuf2-dc: Add support for cacheable MMAP
  2017-07-03  9:27   ` [PATCH v3 0/2] [media] videobuf2-dc: Add support for " Marek Szyprowski
@ 2017-07-05 17:33     ` Christoph Hellwig
  2017-07-13 11:13       ` Marek Szyprowski
  0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2017-07-05 17:33 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Thierry Escande, Mauro Carvalho Chehab, Sakari Ailus,
	linux-media, linux-kernel, Pawel Osciak, Kyungmin Park,
	Christoph Hellwig, Hans Verkuil, Shuah Khan, Russell King

On Mon, Jul 03, 2017 at 11:27:32AM +0200, Marek Szyprowski wrote:
> The main question here if we want to merge incomplete solution or not. As
> for now, there is no support in ARM/ARM64 for NON_CONSISTENT attribute.
> Also none of the v4l2 drivers use it. Sadly support for NON_CONSISTENT
> attribute is not fully implemented nor even defined in mainline.
>

DMA_ATTR_NON_CONSISTENT is the way to get the dma_alloc_noncoherent
semantics through the dma_alloc_attr API, and as such I think it is
pretty well defined, although the documentation in
Documentation/DMA-attributes.txt is really bad and we need to improve
it, by merging it with the dma_alloc_noncoherent description in
Documentation/DMA-API.txt. My series to remove dma_alloc_noncoherent
updates the latter to mention DMA_ATTR_NON_CONSISTENT, but
we should probably merge Documentation/DMA-API.txt,
Documentation/DMA-attributes.txt and Documentation/DMA-API-HOWTO.txt
into a single coherent document.


> I know that it works fine for some vendor kernel trees, but supporting it in
> mainline was a bit controversial. There is no proper way to sync cache for 
> such
> buffers. Calling dma_sync_sg worked so far, but it has to be first agreed as
> a proper DMA API.

As documented in Documentation/DMA-API.txt the proper way to sync
noncoherent/nonconsistent regions is to call dma_cache_sync.  It seems
like it generally is the same as dma_sync_range/sg so if we could
eventually merge these APIs that should reduce the confusion further.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 0/2] [media] videobuf2-dc: Add support for cacheable MMAP
  2017-07-05 17:33     ` Christoph Hellwig
@ 2017-07-13 11:13       ` Marek Szyprowski
  2017-07-13 13:21         ` Russell King - ARM Linux
  0 siblings, 1 reply; 11+ messages in thread
From: Marek Szyprowski @ 2017-07-13 11:13 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Thierry Escande, Mauro Carvalho Chehab, Sakari Ailus,
	linux-media, linux-kernel, Pawel Osciak, Kyungmin Park,
	Hans Verkuil, Shuah Khan, Russell King

Hi Christoph,

On 2017-07-05 19:33, Christoph Hellwig wrote:
> On Mon, Jul 03, 2017 at 11:27:32AM +0200, Marek Szyprowski wrote:
>> The main question here if we want to merge incomplete solution or not. As
>> for now, there is no support in ARM/ARM64 for NON_CONSISTENT attribute.
>> Also none of the v4l2 drivers use it. Sadly support for NON_CONSISTENT
>> attribute is not fully implemented nor even defined in mainline.
>>
> DMA_ATTR_NON_CONSISTENT is the way to get the dma_alloc_noncoherent
> semantics through the dma_alloc_attr API, and as such I think it is
> pretty well defined, although the documentation in
> Documentation/DMA-attributes.txt is really bad and we need to improve
> it, by merging it with the dma_alloc_noncoherent description in
> Documentation/DMA-API.txt. My series to remove dma_alloc_noncoherent
> updates the latter to mention DMA_ATTR_NON_CONSISTENT, but
> we should probably merge Documentation/DMA-API.txt,
> Documentation/DMA-attributes.txt and Documentation/DMA-API-HOWTO.txt
> into a single coherent document.

Right. I started conversion of dma_alloc_noncoherent to NON_CONSISTENT
DMA attribute, but later I got stuck at the details of cache 
synchronization.

>> I know that it works fine for some vendor kernel trees, but supporting it in
>> mainline was a bit controversial. There is no proper way to sync cache for
>> such
>> buffers. Calling dma_sync_sg worked so far, but it has to be first agreed as
>> a proper DMA API.
> As documented in Documentation/DMA-API.txt the proper way to sync
> noncoherent/nonconsistent regions is to call dma_cache_sync.  It seems
> like it generally is the same as dma_sync_range/sg so if we could
> eventually merge these APIs that should reduce the confusion further.

Original dma_alloc_noncoherent utilized dma_cache_sync() function, which had
some flaws, which prevented me to continue that task and introduce it to ARM
architecture. The dma_alloc_noncoherent() and dma_cache_sync() API lacks
buffer ownership and imprecisely defines how and when the caches has to be
synchronized. dma_cache_sync() also lacks DMA address argument, what also
complicates potential lightweight implementation.

IMHO it would make sense to change it to work similar to the other
dma_sync_*_for_{cpu,device} functions, but I didn't find enough time to
finally take a try.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 0/2] [media] videobuf2-dc: Add support for cacheable MMAP
  2017-07-13 11:13       ` Marek Szyprowski
@ 2017-07-13 13:21         ` Russell King - ARM Linux
  2017-07-13 13:36             ` Christoph Hellwig
  0 siblings, 1 reply; 11+ messages in thread
From: Russell King - ARM Linux @ 2017-07-13 13:21 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Christoph Hellwig, Thierry Escande, Mauro Carvalho Chehab,
	Sakari Ailus, linux-media, linux-kernel, Pawel Osciak,
	Kyungmin Park, Hans Verkuil, Shuah Khan

On Thu, Jul 13, 2017 at 01:13:28PM +0200, Marek Szyprowski wrote:
> Hi Christoph,
> 
> On 2017-07-05 19:33, Christoph Hellwig wrote:
> >On Mon, Jul 03, 2017 at 11:27:32AM +0200, Marek Szyprowski wrote:
> >>The main question here if we want to merge incomplete solution or not. As
> >>for now, there is no support in ARM/ARM64 for NON_CONSISTENT attribute.
> >>Also none of the v4l2 drivers use it. Sadly support for NON_CONSISTENT
> >>attribute is not fully implemented nor even defined in mainline.
> >>
> >DMA_ATTR_NON_CONSISTENT is the way to get the dma_alloc_noncoherent
> >semantics through the dma_alloc_attr API, and as such I think it is
> >pretty well defined, although the documentation in
> >Documentation/DMA-attributes.txt is really bad and we need to improve
> >it, by merging it with the dma_alloc_noncoherent description in
> >Documentation/DMA-API.txt. My series to remove dma_alloc_noncoherent
> >updates the latter to mention DMA_ATTR_NON_CONSISTENT, but
> >we should probably merge Documentation/DMA-API.txt,
> >Documentation/DMA-attributes.txt and Documentation/DMA-API-HOWTO.txt
> >into a single coherent document.
> 
> Right. I started conversion of dma_alloc_noncoherent to NON_CONSISTENT
> DMA attribute, but later I got stuck at the details of cache
> synchronization.
> 
> >>I know that it works fine for some vendor kernel trees, but supporting it in
> >>mainline was a bit controversial. There is no proper way to sync cache for
> >>such
> >>buffers. Calling dma_sync_sg worked so far, but it has to be first agreed as
> >>a proper DMA API.
> >As documented in Documentation/DMA-API.txt the proper way to sync
> >noncoherent/nonconsistent regions is to call dma_cache_sync.  It seems
> >like it generally is the same as dma_sync_range/sg so if we could
> >eventually merge these APIs that should reduce the confusion further.
> 
> Original dma_alloc_noncoherent utilized dma_cache_sync() function, which had
> some flaws, which prevented me to continue that task and introduce it to ARM
> architecture. The dma_alloc_noncoherent() and dma_cache_sync() API lacks
> buffer ownership and imprecisely defines how and when the caches has to be
> synchronized. dma_cache_sync() also lacks DMA address argument, what also
> complicates potential lightweight implementation.

My conclusion of the dma_alloc_noncoherent() and dma_cache_sync() API
when it was introduced is that it's basically a completely broken
interface, and I've never seen any point to it.  Maybe some of that is
because it's badly documented - which in turn makes it badly designed
(because there's no specification detailing what it's supposed to be
doing.)

I'd like to see that thing die...

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 0/2] [media] videobuf2-dc: Add support for cacheable MMAP
  2017-07-13 13:21         ` Russell King - ARM Linux
@ 2017-07-13 13:36             ` Christoph Hellwig
  0 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2017-07-13 13:36 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Marek Szyprowski, Christoph Hellwig, Thierry Escande,
	Mauro Carvalho Chehab, Sakari Ailus, linux-media, linux-kernel,
	Pawel Osciak, Kyungmin Park, Hans Verkuil, Shuah Khan,
	linux-parisc, linux-mips, linux-sh, linuxppc-dev

On Thu, Jul 13, 2017 at 02:21:53PM +0100, Russell King - ARM Linux wrote:
> My conclusion of the dma_alloc_noncoherent() and dma_cache_sync() API
> when it was introduced is that it's basically a completely broken
> interface, and I've never seen any point to it.  Maybe some of that is
> because it's badly documented - which in turn makes it badly designed
> (because there's no specification detailing what it's supposed to be
> doing.)
> 
> I'd like to see that thing die...

It's not exactly the best interface ever, so any improvement is welcome.

I've posted a series to kill dma_alloc_noncoherent in favor of
dma_alloc_attrs a short while ago, and a big chunk of it should have
made it into 4.12.  I plan to kill it off entirely for 4.13.

That leaves dma_cache_sync() - it's used by 6 drivers:

drivers/net/ethernet/i825xx/lasi_82596.c
drivers/net/ethernet/seeq/sgiseeq.c
drivers/scsi/53c700.c
drivers/scsi/sgiwd93.c
drivers/sh/maple/maple.c
drivers/tty/serial/mpsc.c

Those are used on parisc, mips for a few old SGI systems, the SH
dreamcast and powerpc marvell mv64x60 devices.

So it shouldn't be too hard to figure out if they could be moved
to the normal dma_sync_* calls.

On parisc dma_cache_sync is equivalent to dma_sync_single_for_cpu,
so that should be fine.

On mips the implementation of dma_sync_single_for_cpu is a little
more complicated, but both dma_sync_single_for_cpu and dma_cache_sync
end up calling __dma_sync_virtual, so they look like the same in
the end as well.

On SH sync_single_for_device is implemented using dma_cache_sync,
and there is no dma_sync_single_for_cpu.

On powerpc both dma_sync_single_for_cpu and dma_sync_single_for_device
are implemented using the same primitive as dma_cache_sync.

In short: killing off dma_cache_sync and using the existing and
better defined syncing primitives looks entirely feasible.

I'll add it to my TODO list for 4.13.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 0/2] [media] videobuf2-dc: Add support for cacheable MMAP
@ 2017-07-13 13:36             ` Christoph Hellwig
  0 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2017-07-13 13:36 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Marek Szyprowski, Christoph Hellwig, Thierry Escande,
	Mauro Carvalho Chehab, Sakari Ailus, linux-media, linux-kernel,
	Pawel Osciak, Kyungmin Park, Hans Verkuil, Shuah Khan,
	linux-parisc, linux-mips, linux-sh, linuxppc-dev

On Thu, Jul 13, 2017 at 02:21:53PM +0100, Russell King - ARM Linux wrote:
> My conclusion of the dma_alloc_noncoherent() and dma_cache_sync() API
> when it was introduced is that it's basically a completely broken
> interface, and I've never seen any point to it.  Maybe some of that is
> because it's badly documented - which in turn makes it badly designed
> (because there's no specification detailing what it's supposed to be
> doing.)
> 
> I'd like to see that thing die...

It's not exactly the best interface ever, so any improvement is welcome.

I've posted a series to kill dma_alloc_noncoherent in favor of
dma_alloc_attrs a short while ago, and a big chunk of it should have
made it into 4.12.  I plan to kill it off entirely for 4.13.

That leaves dma_cache_sync() - it's used by 6 drivers:

drivers/net/ethernet/i825xx/lasi_82596.c
drivers/net/ethernet/seeq/sgiseeq.c
drivers/scsi/53c700.c
drivers/scsi/sgiwd93.c
drivers/sh/maple/maple.c
drivers/tty/serial/mpsc.c

Those are used on parisc, mips for a few old SGI systems, the SH
dreamcast and powerpc marvell mv64x60 devices.

So it shouldn't be too hard to figure out if they could be moved
to the normal dma_sync_* calls.

On parisc dma_cache_sync is equivalent to dma_sync_single_for_cpu,
so that should be fine.

On mips the implementation of dma_sync_single_for_cpu is a little
more complicated, but both dma_sync_single_for_cpu and dma_cache_sync
end up calling __dma_sync_virtual, so they look like the same in
the end as well.

On SH sync_single_for_device is implemented using dma_cache_sync,
and there is no dma_sync_single_for_cpu.

On powerpc both dma_sync_single_for_cpu and dma_sync_single_for_device
are implemented using the same primitive as dma_cache_sync.

In short: killing off dma_cache_sync and using the existing and
better defined syncing primitives looks entirely feasible.

I'll add it to my TODO list for 4.13.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/2] [media] videobuf2-dc: Support cacheable MMAP
  2016-10-26  8:52   ` [PATCH v3 2/2] [media] videobuf2-dc: Support cacheable MMAP Thierry Escande
@ 2018-09-17 14:41     ` Hans Verkuil
  2018-09-18  9:41       ` Tomasz Figa
  0 siblings, 1 reply; 11+ messages in thread
From: Hans Verkuil @ 2018-09-17 14:41 UTC (permalink / raw)
  To: Thierry Escande, Mauro Carvalho Chehab, Sakari Ailus
  Cc: linux-media, linux-kernel, Pawel Osciak, Marek Szyprowski, Kyungmin Park

I'm going through old patches in patchwork that fell through the
cracks, and this is one of them.

If this is still desired, please rebase and repost.

I'm marking this series as Obsoleted in patchwork, since it no longer
applies anyway.

Regards,

	Hans


On 10/26/2016 10:52 AM, Thierry Escande wrote:
> From: Heng-Ruey Hsu <henryhsu@chromium.org>
> 
> DMA allocations for MMAP type are uncached by default. But for
> some cases, CPU has to access the buffers. ie: memcpy for format
> converter. Supporting cacheable MMAP improves huge performance.
> 
> This patch enables cacheable memory for DMA coherent allocator in mmap
> buffer allocation if non-consistent DMA attribute is set and kernel
> mapping is present. Even if userspace doesn't mmap the buffer, sync
> still should be happening if kernel mapping is present.
> If not done in allocation, it is enabled when memory is mapped from
> userspace (if non-consistent DMA attribute is set).
> 
> Signed-off-by: Heng-Ruey Hsu <henryhsu@chromium.org>
> Tested-by: Heng-ruey Hsu <henryhsu@chromium.org>
> Reviewed-by: Tomasz Figa <tfiga@chromium.org>
> Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
> ---
>  drivers/media/v4l2-core/videobuf2-dma-contig.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> index 0d9665d..89b534a 100644
> --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
> +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> @@ -151,6 +151,10 @@ static void vb2_dc_put(void *buf_priv)
>  		sg_free_table(buf->sgt_base);
>  		kfree(buf->sgt_base);
>  	}
> +	if (buf->dma_sgt) {
> +		sg_free_table(buf->dma_sgt);
> +		kfree(buf->dma_sgt);
> +	}
>  	dma_free_attrs(buf->dev, buf->size, buf->cookie, buf->dma_addr,
>  		       buf->attrs);
>  	put_device(buf->dev);
> @@ -192,6 +196,14 @@ static void *vb2_dc_alloc(struct device *dev, unsigned long attrs,
>  	buf->handler.put = vb2_dc_put;
>  	buf->handler.arg = buf;
>  
> +	/*
> +	 * Enable cache maintenance. Even if userspace doesn't mmap the buffer,
> +	 * sync still should be happening if kernel mapping is present.
> +	 */
> +	if (!(buf->attrs & DMA_ATTR_NO_KERNEL_MAPPING) &&
> +	    buf->attrs & DMA_ATTR_NON_CONSISTENT)
> +		buf->dma_sgt = vb2_dc_get_base_sgt(buf);
> +
>  	atomic_inc(&buf->refcount);
>  
>  	return buf;
> @@ -227,6 +239,10 @@ static int vb2_dc_mmap(void *buf_priv, struct vm_area_struct *vma)
>  
>  	vma->vm_ops->open(vma);
>  
> +	/* Enable cache maintenance if not enabled in allocation. */
> +	if (!buf->dma_sgt && buf->attrs & DMA_ATTR_NON_CONSISTENT)
> +		buf->dma_sgt = vb2_dc_get_base_sgt(buf);
> +
>  	pr_debug("%s: mapped dma addr 0x%08lx at 0x%08lx, size %ld\n",
>  		__func__, (unsigned long)buf->dma_addr, vma->vm_start,
>  		buf->size);
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/2] [media] videobuf2-dc: Support cacheable MMAP
  2018-09-17 14:41     ` Hans Verkuil
@ 2018-09-18  9:41       ` Tomasz Figa
  0 siblings, 0 replies; 11+ messages in thread
From: Tomasz Figa @ 2018-09-18  9:41 UTC (permalink / raw)
  To: Hans Verkuil, Sakari Ailus
  Cc: Thierry Escande, Mauro Carvalho Chehab, Linux Media Mailing List,
	Linux Kernel Mailing List, Pawel Osciak, Marek Szyprowski,
	Kyungmin Park

Hi Hans,

On Mon, Sep 17, 2018 at 11:41 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>
> I'm going through old patches in patchwork that fell through the
> cracks, and this is one of them.
>
> If this is still desired, please rebase and repost.
>
> I'm marking this series as Obsoleted in patchwork, since it no longer
> applies anyway.

The ability to have cached mappings of MMAP buffers is strongly
desired, but I'm afraid not the way this patch does it.

First of all, it's not a decision for the driver to make, but for the
user space, depending on the access pattern it does. It also isn't
something specific to vb2-dma-contig only.

I remember Sakari had a series that attempted to solve this in a more
comprehensive way[1]. I remember it had some minor problems when I
reviewed it, but generally the idea seemed sane.

Sakari, do you have any plans to revive that work?

[1] https://www.mail-archive.com/linux-media@vger.kernel.org/msg112459.html

Best regards,
Tomasz

>
> Regards,
>
>         Hans
>
>
> On 10/26/2016 10:52 AM, Thierry Escande wrote:
> > From: Heng-Ruey Hsu <henryhsu@chromium.org>
> >
> > DMA allocations for MMAP type are uncached by default. But for
> > some cases, CPU has to access the buffers. ie: memcpy for format
> > converter. Supporting cacheable MMAP improves huge performance.
> >
> > This patch enables cacheable memory for DMA coherent allocator in mmap
> > buffer allocation if non-consistent DMA attribute is set and kernel
> > mapping is present. Even if userspace doesn't mmap the buffer, sync
> > still should be happening if kernel mapping is present.
> > If not done in allocation, it is enabled when memory is mapped from
> > userspace (if non-consistent DMA attribute is set).
> >
> > Signed-off-by: Heng-Ruey Hsu <henryhsu@chromium.org>
> > Tested-by: Heng-ruey Hsu <henryhsu@chromium.org>
> > Reviewed-by: Tomasz Figa <tfiga@chromium.org>
> > Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
> > ---
> >  drivers/media/v4l2-core/videobuf2-dma-contig.c | 16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> >
> > diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> > index 0d9665d..89b534a 100644
> > --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
> > +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> > @@ -151,6 +151,10 @@ static void vb2_dc_put(void *buf_priv)
> >               sg_free_table(buf->sgt_base);
> >               kfree(buf->sgt_base);
> >       }
> > +     if (buf->dma_sgt) {
> > +             sg_free_table(buf->dma_sgt);
> > +             kfree(buf->dma_sgt);
> > +     }
> >       dma_free_attrs(buf->dev, buf->size, buf->cookie, buf->dma_addr,
> >                      buf->attrs);
> >       put_device(buf->dev);
> > @@ -192,6 +196,14 @@ static void *vb2_dc_alloc(struct device *dev, unsigned long attrs,
> >       buf->handler.put = vb2_dc_put;
> >       buf->handler.arg = buf;
> >
> > +     /*
> > +      * Enable cache maintenance. Even if userspace doesn't mmap the buffer,
> > +      * sync still should be happening if kernel mapping is present.
> > +      */
> > +     if (!(buf->attrs & DMA_ATTR_NO_KERNEL_MAPPING) &&
> > +         buf->attrs & DMA_ATTR_NON_CONSISTENT)
> > +             buf->dma_sgt = vb2_dc_get_base_sgt(buf);
> > +
> >       atomic_inc(&buf->refcount);
> >
> >       return buf;
> > @@ -227,6 +239,10 @@ static int vb2_dc_mmap(void *buf_priv, struct vm_area_struct *vma)
> >
> >       vma->vm_ops->open(vma);
> >
> > +     /* Enable cache maintenance if not enabled in allocation. */
> > +     if (!buf->dma_sgt && buf->attrs & DMA_ATTR_NON_CONSISTENT)
> > +             buf->dma_sgt = vb2_dc_get_base_sgt(buf);
> > +
> >       pr_debug("%s: mapped dma addr 0x%08lx at 0x%08lx, size %ld\n",
> >               __func__, (unsigned long)buf->dma_addr, vma->vm_start,
> >               buf->size);
> >
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-09-18  9:41 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CGME20161026085228epcas3p3895ea279d5538750a3b1c59715ad3761@epcas3p3.samsung.com>
2016-10-26  8:52 ` [PATCH v3 0/2] [media] videobuf2-dc: Add support for cacheable MMAP Thierry Escande
2016-10-26  8:52   ` [PATCH v3 1/2] [media] videobuf2-dc: Move vb2_dc_get_base_sgt() above mmap callbacks Thierry Escande
2016-10-26  8:52   ` [PATCH v3 2/2] [media] videobuf2-dc: Support cacheable MMAP Thierry Escande
2018-09-17 14:41     ` Hans Verkuil
2018-09-18  9:41       ` Tomasz Figa
2017-07-03  9:27   ` [PATCH v3 0/2] [media] videobuf2-dc: Add support for " Marek Szyprowski
2017-07-05 17:33     ` Christoph Hellwig
2017-07-13 11:13       ` Marek Szyprowski
2017-07-13 13:21         ` Russell King - ARM Linux
2017-07-13 13:36           ` Christoph Hellwig
2017-07-13 13:36             ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.