* Query on patch to be upstream ?
@ 2014-03-05 20:46 Ritesh Harjani
  2014-03-11 12:15 ` Catalin Marinas
  0 siblings, 1 reply; 18+ messages in thread
From: Ritesh Harjani @ 2014-03-05 20:46 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Catalin,

Is your patch (arm64: Implement coherent DMA API based on swiotlb) from
your devel branch going upstream (without modifications)?


Thanks
Ritesh


* Query on patch to be upstream ?
  2014-03-05 20:46 Query on patch to be upstream ? Ritesh Harjani
@ 2014-03-11 12:15 ` Catalin Marinas
  2014-03-11 18:04   ` Laura Abbott
  0 siblings, 1 reply; 18+ messages in thread
From: Catalin Marinas @ 2014-03-11 12:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Mar 05, 2014 at 08:46:24PM +0000, Ritesh Harjani wrote:
> Is your patch (arm64: Implement coherent DMA API based on swiotlb) from
> your devel branch going upstream (without modifications)?

That's the plan. But once we get the system topology or Santosh's
dma-coherent DT bindings, I plan to make whether a device is coherent or
not DT-driven. For now we need some solution, as I keep getting people
asking about cache management for DMA buffers (though none of them got
back with a Tested-by ;)).

-- 
Catalin
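
As an aside on the DT-driven plan above, a minimal sketch of how the choice
between the two sets of dma_map_ops could be keyed off a "dma-coherent"
device-tree property (the property name and the selection hook are
assumptions for illustration, not part of the bindings or patches in this
thread):

#include <linux/of.h>
#include <linux/dma-mapping.h>

extern struct dma_map_ops coherent_swiotlb_dma_ops;
extern struct dma_map_ops noncoherent_swiotlb_dma_ops;

static struct dma_map_ops *arm64_select_dma_ops(struct device_node *np)
{
	/* Devices that snoop the CPU caches can use the coherent ops. */
	if (of_property_read_bool(np, "dma-coherent"))
		return &coherent_swiotlb_dma_ops;

	/* Otherwise fall back to explicit cache maintenance around swiotlb. */
	return &noncoherent_swiotlb_dma_ops;
}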


* Query on patch to be upstream ?
  2014-03-11 12:15 ` Catalin Marinas
@ 2014-03-11 18:04   ` Laura Abbott
  2014-03-11 18:26     ` Catalin Marinas
  0 siblings, 1 reply; 18+ messages in thread
From: Laura Abbott @ 2014-03-11 18:04 UTC (permalink / raw)
  To: linux-arm-kernel

With one caveat below, you can add

Tested-by: Laura Abbott <lauraa@codeaurora.org>

for "arm64: Implement coherent DMA API based on swiotlb"

Thanks,
Laura

-- 8< --
From: Laura Abbott <lauraa@codeaurora.org>
Date: Thu, 6 Mar 2014 21:05:41 -0800
Subject: [PATCH] arm64: Use custom mmap function for noncoherent dma ops

The non-coherent dma ops remap memory with appropriate attributes.
This remapped address cannot be used with virt_to_page which
dma_common_mmap uses. Implement a custom (but very similar) function
which correctly calculates the physical address and remaps to userspace.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
---
 arch/arm64/mm/dma-mapping.c |   27 +++++++++++++++++++++++++++
 1 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index e923a5b..5ac7a14 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -194,9 +194,36 @@ static void arm64_swiotlb_sync_sg_for_device(struct device *dev,
 			       sg->length, dir);
 }
 
+int arm64_swiotlb_mmap(struct device *dev, struct vm_area_struct *vma,
+		 void *cpu_addr, dma_addr_t dma_addr, size_t size,
+		 struct dma_attrs *attrs)
+{
+	int ret = -ENXIO;
+	unsigned long nr_vma_pages = (vma->vm_end - vma->vm_start) >>
+					PAGE_SHIFT;
+	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	unsigned long pfn = dma_to_phys(dev, dma_addr) >> PAGE_SHIFT;
+	unsigned long off = vma->vm_pgoff;
+
+	vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot);
+
+	if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret))
+		return ret;
+
+	if (off < nr_pages && nr_vma_pages <= (nr_pages - off)) {
+		ret = remap_pfn_range(vma, vma->vm_start,
+				      pfn + off,
+				      vma->vm_end - vma->vm_start,
+				      vma->vm_page_prot);
+	}
+
+	return ret;
+}
+
 struct dma_map_ops noncoherent_swiotlb_dma_ops = {
 	.alloc = arm64_swiotlb_alloc_noncoherent,
 	.free = arm64_swiotlb_free_noncoherent,
+	.mmap = arm64_swiotlb_mmap,
 	.map_page = arm64_swiotlb_map_page,
 	.unmap_page = arm64_swiotlb_unmap_page,
 	.map_sg = arm64_swiotlb_map_sg_attrs,
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


* Query on patch to be upstream ?
  2014-03-11 18:04   ` Laura Abbott
@ 2014-03-11 18:26     ` Catalin Marinas
  2014-03-12 18:20       ` Laura Abbott
  0 siblings, 1 reply; 18+ messages in thread
From: Catalin Marinas @ 2014-03-11 18:26 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Mar 11, 2014 at 06:04:19PM +0000, Laura Abbott wrote:
> With one caveat below, you can add
> 
> Tested-by: Laura Abbott <lauraa@codeaurora.org>

Thanks.

> From: Laura Abbott <lauraa@codeaurora.org>
> Date: Thu, 6 Mar 2014 21:05:41 -0800
> Subject: [PATCH] arm64: Use custom mmap function for noncoherent dma ops
> 
> The non-coherent dma ops remap memory with appropriate attributes.
> This remapped address cannot be used with virt_to_page which
> dma_common_mmap uses. Implement a custom (but very similar) function
> which correctly calculates the physical address and remaps to userspace.

Looking at the coherent implementation, I think we also have a
(performance) problem with dma_common_mmap(). It uses pgprot_noncached()
by default and if the DMA is coherent we don't really need strongly
ordered memory.

Would you mind writing a __dma_common_mmap() for arm64 with separate
coherent/non-coherent dma mmap functions that set the vm_page_prot
accordingly?

Thanks.

-- 
Catalin


* Query on patch to be upstream ?
  2014-03-11 18:26     ` Catalin Marinas
@ 2014-03-12 18:20       ` Laura Abbott
  2014-03-13 17:45         ` [PATCH 1/2] arm64: Implement custom mmap functions for dma mapping Laura Abbott
  2014-03-13 17:45         ` [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott
  0 siblings, 2 replies; 18+ messages in thread
From: Laura Abbott @ 2014-03-12 18:20 UTC (permalink / raw)
  To: linux-arm-kernel

On 3/11/2014 11:26 AM, Catalin Marinas wrote:
> On Tue, Mar 11, 2014 at 06:04:19PM +0000, Laura Abbott wrote:
>> With one caveat below, you can add
>>
>> Tested-by: Laura Abbott <lauraa@codeaurora.org>
>
> Thanks.
>
>> From: Laura Abbott <lauraa@codeaurora.org>
>> Date: Thu, 6 Mar 2014 21:05:41 -0800
>> Subject: [PATCH] arm64: Use custom mmap function for noncoherent dma ops
>>
>> The non-coherent dma ops remap memory with appropriate attributes.
>> This remapped address cannot be used with virt_to_page which
>> dma_common_mmap uses. Implement a custom (but very similar) function
>> which correctly calculates the physical address and remaps to userspace.
>
> Looking at the coherent implementation, I think we also have a
> (performance) problem with dma_common_mmap(). It uses pgprot_noncached()
> by default and if the DMA is coherent we don't really need strongly
> ordered memory.
>
> Would you mind writing a __dma_common_mmap() for arm64 with separate
> coherent/non-coherent dma mmap functions that set the vm_page_prot
> accordingly?
>
> Thanks.
>

Yes, I will add that to my TODO list.

Thanks,
Laura

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


* [PATCH 1/2] arm64: Implement custom mmap functions for dma mapping
  2014-03-12 18:20       ` Laura Abbott
@ 2014-03-13 17:45         ` Laura Abbott
  2014-03-13 17:49           ` Catalin Marinas
  2014-03-13 17:45         ` [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott
  1 sibling, 1 reply; 18+ messages in thread
From: Laura Abbott @ 2014-03-13 17:45 UTC (permalink / raw)
  To: linux-arm-kernel

The current dma_ops do not specify an mmap function, so mapping
falls back to the default implementation. There are at least
two issues with using the default implementation:

1) The pgprot is always pgprot_noncached (strongly ordered)
memory, even with coherent operations.
2) dma_common_mmap calls virt_to_page on the remapped non-coherent
address, which leads to invalid memory being mapped.

Fix both of these issues by implementing a custom mmap function which
correctly accounts for remapped addresses and sets vm_page_prot
appropriately.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
---
 arch/arm64/mm/dma-mapping.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index e923a5b..9a639bf 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -194,9 +194,52 @@ static void arm64_swiotlb_sync_sg_for_device(struct device *dev,
 			       sg->length, dir);
 }
 
+/* vma->vm_page_prot must be set appropriately before calling this function */
+static int __dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
+			     void *cpu_addr, dma_addr_t dma_addr, size_t size)
+{
+	int ret = -ENXIO;
+	unsigned long nr_vma_pages = (vma->vm_end - vma->vm_start) >>
+					PAGE_SHIFT;
+	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	unsigned long pfn = dma_to_phys(dev, dma_addr) >> PAGE_SHIFT;
+	unsigned long off = vma->vm_pgoff;
+
+	if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret))
+		return ret;
+
+	if (off < nr_pages && nr_vma_pages <= (nr_pages - off)) {
+		ret = remap_pfn_range(vma, vma->vm_start,
+				      pfn + off,
+				      vma->vm_end - vma->vm_start,
+				      vma->vm_page_prot);
+	}
+
+	return ret;
+}
+
+static int arm64_swiotlb_mmap_noncoherent(struct device *dev,
+		struct vm_area_struct *vma,
+		void *cpu_addr, dma_addr_t dma_addr, size_t size,
+		struct dma_attrs *attrs)
+{
+	/* Just use whatever page_prot attributes were specified */
+	return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size);
+}
+
+static int arm64_swiotlb_mmap_coherent(struct device *dev,
+		struct vm_area_struct *vma,
+		void *cpu_addr, dma_addr_t dma_addr, size_t size,
+		struct dma_attrs *attrs)
+{
+	vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot);
+	return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size);
+}
+
 struct dma_map_ops noncoherent_swiotlb_dma_ops = {
 	.alloc = arm64_swiotlb_alloc_noncoherent,
 	.free = arm64_swiotlb_free_noncoherent,
+	.mmap = arm64_swiotlb_mmap_noncoherent,
 	.map_page = arm64_swiotlb_map_page,
 	.unmap_page = arm64_swiotlb_unmap_page,
 	.map_sg = arm64_swiotlb_map_sg_attrs,
@@ -213,6 +256,7 @@ EXPORT_SYMBOL(noncoherent_swiotlb_dma_ops);
 struct dma_map_ops coherent_swiotlb_dma_ops = {
 	.alloc = arm64_swiotlb_alloc_coherent,
 	.free = arm64_swiotlb_free_coherent,
+	.mmap = arm64_swiotlb_mmap_coherent,
 	.map_page = swiotlb_map_page,
 	.unmap_page = swiotlb_unmap_page,
 	.map_sg = swiotlb_map_sg_attrs,
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


* [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE
  2014-03-12 18:20       ` Laura Abbott
  2014-03-13 17:45         ` [PATCH 1/2] arm64: Implement custom mmap functions for dma mapping Laura Abbott
@ 2014-03-13 17:45         ` Laura Abbott
  2014-03-13 17:52           ` Catalin Marinas
                             ` (3 more replies)
  1 sibling, 4 replies; 18+ messages in thread
From: Laura Abbott @ 2014-03-13 17:45 UTC (permalink / raw)
  To: linux-arm-kernel

DMA_ATTR_WRITE_COMBINE is currently ignored. Set the pgprot
appropriately for non-coherent operations.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
---
 Potential addition on top of the coherent work as well. Writecombine
 and dmacoherent seem to be the same at the moment but it might be
 good practice to have the two be separate?

 arch/arm64/mm/dma-mapping.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 9a639bf..d2c0027 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -29,6 +29,15 @@
 struct dma_map_ops *dma_ops;
 EXPORT_SYMBOL(dma_ops);
 
+
+static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot)
+{
+	prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ?
+			    pgprot_writecombine(prot) :
+			    pgprot_dmacoherent(prot);
+	return prot;
+}
+
 static void *arm64_swiotlb_alloc_coherent(struct device *dev, size_t size,
 					  dma_addr_t *dma_handle, gfp_t flags,
 					  struct dma_attrs *attrs)
@@ -72,7 +81,7 @@ static void *arm64_swiotlb_alloc_noncoherent(struct device *dev, size_t size,
 	for (i = 0; i < (size >> PAGE_SHIFT); i++)
 		map[i] = page + i;
 	coherent_ptr = vmap(map, size >> PAGE_SHIFT, VM_MAP,
-			    pgprot_dmacoherent(pgprot_default));
+			    __get_dma_pgprot(attrs, pgprot_default));
 	kfree(map);
 	if (!coherent_ptr)
 		goto no_map;
@@ -232,7 +241,7 @@ static int arm64_swiotlb_mmap_coherent(struct device *dev,
 		void *cpu_addr, dma_addr_t dma_addr, size_t size,
 		struct dma_attrs *attrs)
 {
-	vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot);
+	vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot);
 	return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size);
 }
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


* [PATCH 1/2] arm64: Implement custom mmap functions for dma mapping
  2014-03-13 17:45         ` [PATCH 1/2] arm64: Implement custom mmap functions for dma mapping Laura Abbott
@ 2014-03-13 17:49           ` Catalin Marinas
  2014-03-14  1:53             ` Laura Abbott
  0 siblings, 1 reply; 18+ messages in thread
From: Catalin Marinas @ 2014-03-13 17:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Mar 13, 2014 at 05:45:15PM +0000, Laura Abbott wrote:
> +/* vma->vm_page_prot must be set appropriately before calling this function */
> +static int __dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
> +			     void *cpu_addr, dma_addr_t dma_addr, size_t size)
> +{
> +	int ret = -ENXIO;
> +	unsigned long nr_vma_pages = (vma->vm_end - vma->vm_start) >>
> +					PAGE_SHIFT;
> +	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
> +	unsigned long pfn = dma_to_phys(dev, dma_addr) >> PAGE_SHIFT;
> +	unsigned long off = vma->vm_pgoff;
> +
> +	if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret))
> +		return ret;
> +
> +	if (off < nr_pages && nr_vma_pages <= (nr_pages - off)) {
> +		ret = remap_pfn_range(vma, vma->vm_start,
> +				      pfn + off,
> +				      vma->vm_end - vma->vm_start,
> +				      vma->vm_page_prot);
> +	}
> +
> +	return ret;
> +}
> +
> +static int arm64_swiotlb_mmap_noncoherent(struct device *dev,
> +		struct vm_area_struct *vma,
> +		void *cpu_addr, dma_addr_t dma_addr, size_t size,
> +		struct dma_attrs *attrs)
> +{
> +	/* Just use whatever page_prot attributes were specified */
> +	return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size);
> +}
> +
> +static int arm64_swiotlb_mmap_coherent(struct device *dev,
> +		struct vm_area_struct *vma,
> +		void *cpu_addr, dma_addr_t dma_addr, size_t size,
> +		struct dma_attrs *attrs)
> +{
> +	vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot);
> +	return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size);
> +}

So I think we got the naming the other way around. The
noncoherent_dma_ops assume that the DMA is non-coherent and therefore
must change the pgprot.

-- 
Catalin


* [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE
  2014-03-13 17:45         ` [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott
@ 2014-03-13 17:52           ` Catalin Marinas
  2014-03-14  2:02             ` Laura Abbott
  2014-03-14 19:52           ` [PATCHv2 1/3] arm64: Implement custom mmap functions for dma mapping Laura Abbott
                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: Catalin Marinas @ 2014-03-13 17:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Mar 13, 2014 at 05:45:16PM +0000, Laura Abbott wrote:
> DMA_ATTR_WRITE_COMBINE is currently ignored. Set the pgprot
> appropriately for non-coherent operations.
> 
> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
> ---
>  Potential addition on top of the coherent work as well. Writecombine
>  and dmacoherent seem to be the same at the moment but it might be
>  good practice to have the two be separate?
> 
>  arch/arm64/mm/dma-mapping.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 9a639bf..d2c0027 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -29,6 +29,15 @@
>  struct dma_map_ops *dma_ops;
>  EXPORT_SYMBOL(dma_ops);
>  
> +
> +static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot)
> +{
> +	prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ?
> +			    pgprot_writecombine(prot) :
> +			    pgprot_dmacoherent(prot);
> +	return prot;
> +}

pgprot_writecombine and pgprot_dmacoherent are the same on arm64 (and
ARMv6/v7). So when the DMA is coherent on an SoC, we need to leave the
prot unchanged if !DMA_ATTR_WRITE_COMBINE.

-- 
Catalin
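
For context on the equivalence mentioned above: on arm64 both helpers end up
selecting Normal Non-cacheable memory. A simplified view of the definitions
(attribute bits such as execute-never are elided, so this is illustrative
rather than the exact kernel macros):

#define pgprot_writecombine(prot) \
	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_NC))
#define pgprot_dmacoherent(prot) \
	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_NC))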


* [PATCH 1/2] arm64: Implement custom mmap functions for dma mapping
  2014-03-13 17:49           ` Catalin Marinas
@ 2014-03-14  1:53             ` Laura Abbott
  0 siblings, 0 replies; 18+ messages in thread
From: Laura Abbott @ 2014-03-14  1:53 UTC (permalink / raw)
  To: linux-arm-kernel

On 3/13/2014 10:49 AM, Catalin Marinas wrote:
> On Thu, Mar 13, 2014 at 05:45:15PM +0000, Laura Abbott wrote:
>> +/* vma->vm_page_prot must be set appropriately before calling this function */
>> +static int __dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
>> +			     void *cpu_addr, dma_addr_t dma_addr, size_t size)
>> +{
>> +	int ret = -ENXIO;
>> +	unsigned long nr_vma_pages = (vma->vm_end - vma->vm_start) >>
>> +					PAGE_SHIFT;
>> +	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
>> +	unsigned long pfn = dma_to_phys(dev, dma_addr) >> PAGE_SHIFT;
>> +	unsigned long off = vma->vm_pgoff;
>> +
>> +	if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret))
>> +		return ret;
>> +
>> +	if (off < nr_pages && nr_vma_pages <= (nr_pages - off)) {
>> +		ret = remap_pfn_range(vma, vma->vm_start,
>> +				      pfn + off,
>> +				      vma->vm_end - vma->vm_start,
>> +				      vma->vm_page_prot);
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +static int arm64_swiotlb_mmap_noncoherent(struct device *dev,
>> +		struct vm_area_struct *vma,
>> +		void *cpu_addr, dma_addr_t dma_addr, size_t size,
>> +		struct dma_attrs *attrs)
>> +{
>> +	/* Just use whatever page_prot attributes were specified */
>> +	return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size);
>> +}
>> +
>> +static int arm64_swiotlb_mmap_coherent(struct device *dev,
>> +		struct vm_area_struct *vma,
>> +		void *cpu_addr, dma_addr_t dma_addr, size_t size,
>> +		struct dma_attrs *attrs)
>> +{
>> +	vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot);
>> +	return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size);
>> +}
>
> So I think we got the naming the other way around. The
> noncoherent_dma_ops assume that the DMA is non-coherent and therefore
> must change the pgprot.

I spent way too long convincing myself this was right and I was still 
wrong. I'll fix it up for v2 (I have one more patch I need to put in for 
CMA integration as well).

Thanks,
Laura
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


* [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE
  2014-03-13 17:52           ` Catalin Marinas
@ 2014-03-14  2:02             ` Laura Abbott
  0 siblings, 0 replies; 18+ messages in thread
From: Laura Abbott @ 2014-03-14  2:02 UTC (permalink / raw)
  To: linux-arm-kernel

On 3/13/2014 10:52 AM, Catalin Marinas wrote:
> On Thu, Mar 13, 2014 at 05:45:16PM +0000, Laura Abbott wrote:
>> DMA_ATTR_WRITE_COMBINE is currently ignored. Set the pgprot
>> appropriately for non-coherent operations.
>>
>> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
>> ---
>>   Potential addition on top of the coherent work as well. Writecombine
>>   and dmacoherent seem to be the same at the moment but it might be
>>   good practice to have the two be separate?
>>
>>   arch/arm64/mm/dma-mapping.c | 13 +++++++++++--
>>   1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
>> index 9a639bf..d2c0027 100644
>> --- a/arch/arm64/mm/dma-mapping.c
>> +++ b/arch/arm64/mm/dma-mapping.c
>> @@ -29,6 +29,15 @@
>>   struct dma_map_ops *dma_ops;
>>   EXPORT_SYMBOL(dma_ops);
>>
>> +
>> +static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot)
>> +{
>> +	prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ?
>> +			    pgprot_writecombine(prot) :
>> +			    pgprot_dmacoherent(prot);
>> +	return prot;
>> +}
>
> pgprot_writecombine and pgprot_dmacoherent are the same on arm64 (and
> ARMv6/v7). So when the DMA is coherent on an SoC, we need to leave the
> prot unchanged if !DMA_ATTR_WRITE_COMBINE.
>

Ah yes, I missed that. I'll fix it up.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


* [PATCHv2 1/3] arm64: Implement custom mmap functions for dma mapping
  2014-03-13 17:45         ` [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott
  2014-03-13 17:52           ` Catalin Marinas
@ 2014-03-14 19:52           ` Laura Abbott
  2014-03-24 10:33             ` Catalin Marinas
       [not found]             ` <CALk7dXr3cZSkQ6dTUyCjUDStOd6=ghGN9-iO5RQiTfHCciGxLg@mail.gmail.com>
  2014-03-14 19:52           ` [PATCHv2 2/3] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott
  2014-03-14 19:52           ` [PATCHv2 3/3] arm64: Use arm64 coherent APIs for non-coherent freeing Laura Abbott
  3 siblings, 2 replies; 18+ messages in thread
From: Laura Abbott @ 2014-03-14 19:52 UTC (permalink / raw)
  To: linux-arm-kernel

The current dma_ops do not specify an mmap function, so mapping
falls back to the default implementation. There are at least
two issues with using the default implementation:

1) The pgprot is always pgprot_noncached (strongly ordered)
memory, even with coherent operations.
2) dma_common_mmap calls virt_to_page on the remapped non-coherent
address, which leads to invalid memory being mapped.

Fix both of these issues by implementing a custom mmap function which
correctly accounts for remapped addresses and sets vm_page_prot
appropriately.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
---
 arch/arm64/mm/dma-mapping.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index e923a5b..0cdd2f6 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -194,9 +194,52 @@ static void arm64_swiotlb_sync_sg_for_device(struct device *dev,
 			       sg->length, dir);
 }
 
+/* vma->vm_page_prot must be set appropriately before calling this function */
+static int __dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
+			     void *cpu_addr, dma_addr_t dma_addr, size_t size)
+{
+	int ret = -ENXIO;
+	unsigned long nr_vma_pages = (vma->vm_end - vma->vm_start) >>
+					PAGE_SHIFT;
+	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	unsigned long pfn = dma_to_phys(dev, dma_addr) >> PAGE_SHIFT;
+	unsigned long off = vma->vm_pgoff;
+
+	if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret))
+		return ret;
+
+	if (off < nr_pages && nr_vma_pages <= (nr_pages - off)) {
+		ret = remap_pfn_range(vma, vma->vm_start,
+				      pfn + off,
+				      vma->vm_end - vma->vm_start,
+				      vma->vm_page_prot);
+	}
+
+	return ret;
+}
+
+static int arm64_swiotlb_mmap_noncoherent(struct device *dev,
+		struct vm_area_struct *vma,
+		void *cpu_addr, dma_addr_t dma_addr, size_t size,
+		struct dma_attrs *attrs)
+{
+	vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot);
+	return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size);
+}
+
+static int arm64_swiotlb_mmap_coherent(struct device *dev,
+		struct vm_area_struct *vma,
+		void *cpu_addr, dma_addr_t dma_addr, size_t size,
+		struct dma_attrs *attrs)
+{
+	/* Just use whatever page_prot attributes were specified */
+	return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size);
+}
+
 struct dma_map_ops noncoherent_swiotlb_dma_ops = {
 	.alloc = arm64_swiotlb_alloc_noncoherent,
 	.free = arm64_swiotlb_free_noncoherent,
+	.mmap = arm64_swiotlb_mmap_noncoherent,
 	.map_page = arm64_swiotlb_map_page,
 	.unmap_page = arm64_swiotlb_unmap_page,
 	.map_sg = arm64_swiotlb_map_sg_attrs,
@@ -213,6 +256,7 @@ EXPORT_SYMBOL(noncoherent_swiotlb_dma_ops);
 struct dma_map_ops coherent_swiotlb_dma_ops = {
 	.alloc = arm64_swiotlb_alloc_coherent,
 	.free = arm64_swiotlb_free_coherent,
+	.mmap = arm64_swiotlb_mmap_coherent,
 	.map_page = swiotlb_map_page,
 	.unmap_page = swiotlb_unmap_page,
 	.map_sg = swiotlb_map_sg_attrs,
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


* [PATCHv2 2/3] arm64: Support DMA_ATTR_WRITE_COMBINE
  2014-03-13 17:45         ` [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott
  2014-03-13 17:52           ` Catalin Marinas
  2014-03-14 19:52           ` [PATCHv2 1/3] arm64: Implement custom mmap functions for dma mapping Laura Abbott
@ 2014-03-14 19:52           ` Laura Abbott
  2014-03-14 20:24             ` Rob Herring
  2014-03-14 19:52           ` [PATCHv2 3/3] arm64: Use arm64 coherent APIs for non-coherent freeing Laura Abbott
  3 siblings, 1 reply; 18+ messages in thread
From: Laura Abbott @ 2014-03-14 19:52 UTC (permalink / raw)
  To: linux-arm-kernel

DMA_ATTR_WRITE_COMBINE is currently ignored. Set the pgprot
appropriately for non-coherent operations.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
---
 arch/arm64/mm/dma-mapping.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 0cdd2f6..608c343 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -29,6 +29,17 @@
 struct dma_map_ops *dma_ops;
 EXPORT_SYMBOL(dma_ops);
 
+
+static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot,
+					bool coherent)
+{
+	if (dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs))
+		return pgprot_writecombine(prot)
+	else if (!coherent)
+		return pgprot_dmacoherent(prot);
+	return prot;
+}
+
 static void *arm64_swiotlb_alloc_coherent(struct device *dev, size_t size,
 					  dma_addr_t *dma_handle, gfp_t flags,
 					  struct dma_attrs *attrs)
@@ -72,7 +83,7 @@ static void *arm64_swiotlb_alloc_noncoherent(struct device *dev, size_t size,
 	for (i = 0; i < (size >> PAGE_SHIFT); i++)
 		map[i] = page + i;
 	coherent_ptr = vmap(map, size >> PAGE_SHIFT, VM_MAP,
-			    pgprot_dmacoherent(pgprot_default));
+			    __get_dma_pgprot(attrs, pgprot_default, false));
 	kfree(map);
 	if (!coherent_ptr)
 		goto no_map;
@@ -223,7 +234,7 @@ static int arm64_swiotlb_mmap_noncoherent(struct device *dev,
 		void *cpu_addr, dma_addr_t dma_addr, size_t size,
 		struct dma_attrs *attrs)
 {
-	vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot);
+	vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot, false);
 	return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size);
 }
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


* [PATCHv2 3/3] arm64: Use arm64 coherent APIs for non-coherent freeing
  2014-03-13 17:45         ` [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott
                             ` (2 preceding siblings ...)
  2014-03-14 19:52           ` [PATCHv2 2/3] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott
@ 2014-03-14 19:52           ` Laura Abbott
  3 siblings, 0 replies; 18+ messages in thread
From: Laura Abbott @ 2014-03-14 19:52 UTC (permalink / raw)
  To: linux-arm-kernel

The noncoherent free function currently unconditionally frees
back to the page allocator, which is incorrect for CMA pages.
Call the coherent free function to correctly distinguish
between the two kinds of pages.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
---
 arch/arm64/mm/dma-mapping.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 608c343..99ff063 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -104,7 +104,7 @@ static void arm64_swiotlb_free_noncoherent(struct device *dev, size_t size,
 	void *swiotlb_addr = phys_to_virt(dma_to_phys(dev, dma_handle));
 
 	vunmap(vaddr);
-	swiotlb_free_coherent(dev, size, swiotlb_addr, dma_handle);
+	arm64_swiotlb_free_coherent(dev, size, swiotlb_addr, dma_handle, attrs);
 }
 
 static dma_addr_t arm64_swiotlb_map_page(struct device *dev,
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


* [PATCHv2 2/3] arm64: Support DMA_ATTR_WRITE_COMBINE
  2014-03-14 19:52           ` [PATCHv2 2/3] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott
@ 2014-03-14 20:24             ` Rob Herring
  2014-03-14 23:07               ` Catalin Marinas
  0 siblings, 1 reply; 18+ messages in thread
From: Rob Herring @ 2014-03-14 20:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Mar 14, 2014 at 2:52 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
> DMA_ATTR_WRITE_COMBINE is currently ignored. Set the pgprot
> appropriately for non-coherent operations.
>
> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
> ---
>  arch/arm64/mm/dma-mapping.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 0cdd2f6..608c343 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -29,6 +29,17 @@
>  struct dma_map_ops *dma_ops;
>  EXPORT_SYMBOL(dma_ops);
>
> +
> +static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot,
> +                                       bool coherent)
> +{
> +       if (dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs))
> +               return pgprot_writecombine(prot)

Does this compile? Missing semicolon.

> +       else if (!coherent)
> +               return pgprot_dmacoherent(prot);
> +       return prot;

Isn't DMA_ATTR_WRITE_COMBINE supposed to be an optimization over plain
non-cached? But coherent would be more optimal than just write
combine, so we would want to continue to ignore DMA_ATTR_WRITE_COMBINE
in the coherent case. Something like this:

      if (coherent)
               return prot;
      else if (dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs))
               return pgprot_writecombine(prot);
      return pgprot_dmacoherent(prot);


But then this function is never used in the coherent case, so I'm
confused by Catalin's comment. The original version seems sufficient
to me.

Rob


* [PATCHv2 2/3] arm64: Support DMA_ATTR_WRITE_COMBINE
  2014-03-14 20:24             ` Rob Herring
@ 2014-03-14 23:07               ` Catalin Marinas
  0 siblings, 0 replies; 18+ messages in thread
From: Catalin Marinas @ 2014-03-14 23:07 UTC (permalink / raw)
  To: linux-arm-kernel

On 14 Mar 2014, at 20:24, Rob Herring <robherring2@gmail.com> wrote:
> On Fri, Mar 14, 2014 at 2:52 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
>> DMA_ATTR_WRITE_COMBINE is currently ignored. Set the pgprot
>> appropriately for non-coherent operations.
>> 
>> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
>> ---
>> arch/arm64/mm/dma-mapping.c | 15 +++++++++++++--
>> 1 file changed, 13 insertions(+), 2 deletions(-)
>> 
>> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
>> index 0cdd2f6..608c343 100644
>> --- a/arch/arm64/mm/dma-mapping.c
>> +++ b/arch/arm64/mm/dma-mapping.c
>> @@ -29,6 +29,17 @@
>> struct dma_map_ops *dma_ops;
>> EXPORT_SYMBOL(dma_ops);
>> 
>> +
>> +static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot,
>> +                                       bool coherent)
>> +{
>> +       if (dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs))
>> +               return pgprot_writecombine(prot)
> 
> Does this compile? Missing semicolon.
> 
>> +       else if (!coherent)
>> +               return pgprot_dmacoherent(prot);
>> +       return prot;
> 
> Isn't DMA_ATTR_WRITE_COMBINE supposed to be an optimization over plain
> non-cached?

Since writecombine and dmacoherent are the same, we probably shouldn't
even bother with the check as it doesn't have any effect.

> But coherent would be more optimal over just write
> combine, so we would want to continue to ignore DMA_ATTR_WRITE_COMBINE
> in the coherent case. Something like this:
> 
>      if (coherent)
>               return prot;
>      else if (dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs))
>               return pgprot_writecombine(prot);
>      return pgprot_dmacoherent(prot);

I've had requests from graphics people in the past about using
writecombine even though the DMA is fully coherent. The reason is that
some (frame)buffers are more efficient to fill with non-cacheable
memory than with cacheable + write-allocate, which affects the whole CPU
cache.

> But then this function is never used in the coherent case, so I'm
> confused by Catalin's comment. The original version seems sufficient
> to me.

I think we need an mmap function for both the coherent and non-coherent
cases, since the default dma_mmap_coherent() just uses pgprot_noncached(),
which gives us Strongly Ordered memory.

So my proposal:

	if (!coherent)
		return pgprot_dmacoherent(prot);
	else if (dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs))
		return pgprot_writecombine(prot);
	return prot;

I could even use dmacoherent instead of writecombine to avoid confusion.

Catalin
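
Put together, the proposal above amounts to a helper along these lines (a
sketch assembled from the snippet in this mail, not necessarily the code
that was ultimately merged):

static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot,
					bool coherent)
{
	/* Non-coherent DMA always needs a non-cacheable mapping. */
	if (!coherent)
		return pgprot_dmacoherent(prot);
	/* Coherent DMA: only relax the mapping if write-combine was requested. */
	else if (dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs))
		return pgprot_writecombine(prot);
	return prot;
}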


* [PATCHv2 1/3] arm64: Implement custom mmap functions for dma mapping
  2014-03-14 19:52           ` [PATCHv2 1/3] arm64: Implement custom mmap functions for dma mapping Laura Abbott
@ 2014-03-24 10:33             ` Catalin Marinas
       [not found]             ` <CALk7dXr3cZSkQ6dTUyCjUDStOd6=ghGN9-iO5RQiTfHCciGxLg@mail.gmail.com>
  1 sibling, 0 replies; 18+ messages in thread
From: Catalin Marinas @ 2014-03-24 10:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Mar 14, 2014 at 07:52:23PM +0000, Laura Abbott wrote:
> The current dma_ops do not specify an mmap function, so mapping
> falls back to the default implementation. There are at least
> two issues with using the default implementation:
> 
> 1) The pgprot is always pgprot_noncached (strongly ordered)
> memory, even with coherent operations.
> 2) dma_common_mmap calls virt_to_page on the remapped non-coherent
> address, which leads to invalid memory being mapped.
> 
> Fix both of these issues by implementing a custom mmap function which
> correctly accounts for remapped addresses and sets vm_page_prot
> appropriately.
> 
> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>

I thought there was still some update needed to this series, but it turns
out that patch 2/3 was what I was expecting already, so I merged the
first two patches (with minor changes for the s/arm64_/__/ prefix). Patch 3
seems to be folded into my patch already.

Thanks.

-- 
Catalin


* [PATCHv2 1/3] arm64: Implement custom mmap functions for dma mapping
       [not found]             ` <CALk7dXr3cZSkQ6dTUyCjUDStOd6=ghGN9-iO5RQiTfHCciGxLg@mail.gmail.com>
@ 2014-03-28 10:37               ` Catalin Marinas
  0 siblings, 0 replies; 18+ messages in thread
From: Catalin Marinas @ 2014-03-28 10:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Mar 28, 2014 at 09:45:08AM +0000, Ritesh Harjani wrote:
> On Sat, Mar 15, 2014 at 1:22 AM, Laura Abbott <lauraa@codeaurora.org> wrote:
> > +static int __dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
> > +                            void *cpu_addr, dma_addr_t dma_addr, size_t size)
> > +{
> > +       int ret = -ENXIO;
> > +       unsigned long nr_vma_pages = (vma->vm_end - vma->vm_start) >>
> > +                                       PAGE_SHIFT;
> > +       unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
> > +       unsigned long pfn = dma_to_phys(dev, dma_addr) >> PAGE_SHIFT;
> 
> Why not __phys_to_pfn() here? Just asking, I know there is nothing wrong
> with this either.

Because dma_addr is a DMA address (as seen by the device from the other
side of the bus) rather than a CPU physical address. In many cases they
are the same but not always.

-- 
Catalin
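
Spelled out as code, the distinction is the two-step conversion below (an
illustrative helper; the name dma_addr_to_pfn is made up here and does not
appear in the series):

static unsigned long dma_addr_to_pfn(struct device *dev, dma_addr_t dma_addr)
{
	/* Translate the bus/DMA address into a CPU physical address first. */
	phys_addr_t phys = dma_to_phys(dev, dma_addr);

	/* Only a CPU physical address can meaningfully be turned into a pfn. */
	return __phys_to_pfn(phys);
}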


Thread overview (18 messages):
2014-03-05 20:46 Query on patch to be upstream ? Ritesh Harjani
2014-03-11 12:15 ` Catalin Marinas
2014-03-11 18:04   ` Laura Abbott
2014-03-11 18:26     ` Catalin Marinas
2014-03-12 18:20       ` Laura Abbott
2014-03-13 17:45         ` [PATCH 1/2] arm64: Implement custom mmap functions for dma mapping Laura Abbott
2014-03-13 17:49           ` Catalin Marinas
2014-03-14  1:53             ` Laura Abbott
2014-03-13 17:45         ` [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott
2014-03-13 17:52           ` Catalin Marinas
2014-03-14  2:02             ` Laura Abbott
2014-03-14 19:52           ` [PATCHv2 1/3] arm64: Implement custom mmap functions for dma mapping Laura Abbott
2014-03-24 10:33             ` Catalin Marinas
     [not found]             ` <CALk7dXr3cZSkQ6dTUyCjUDStOd6=ghGN9-iO5RQiTfHCciGxLg@mail.gmail.com>
2014-03-28 10:37               ` Catalin Marinas
2014-03-14 19:52           ` [PATCHv2 2/3] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott
2014-03-14 20:24             ` Rob Herring
2014-03-14 23:07               ` Catalin Marinas
2014-03-14 19:52           ` [PATCHv2 3/3] arm64: Use arm64 coherent APIs for non-coherent freeing Laura Abbott
