* [RFc] Map CMA pages as cached
@ 2012-07-13 18:01 ` Laura Abbott
  0 siblings, 0 replies; 16+ messages in thread
From: Laura Abbott @ 2012-07-13 18:01 UTC (permalink / raw)
  To: linaro-mm-sig, Marek Szyprowski, Russell King
  Cc: linux-arm-kernel, linux-arm-msm

Current APIs only support allocating CMA pages as either coherent or
writecombine. This seems to miss support for cached pages completely.
The following patch seems to be the first obvious solution.

More generally though, what should be the strategy for remapping with
other memory types? Add a new function call each time?

Thanks,
Laura

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH][RFC] arm: dma-mapping: Add support for allocating/mapping cached buffers
  2012-07-13 18:01 ` Laura Abbott
@ 2012-07-13 18:01   ` Laura Abbott
  -1 siblings, 0 replies; 16+ messages in thread
From: Laura Abbott @ 2012-07-13 18:01 UTC (permalink / raw)
  To: linaro-mm-sig, Marek Szyprowski, Russell King
  Cc: linux-arm-kernel, linux-arm-msm, Laura Abbott

There are currently no DMA allocation APIs that support cached
buffers. For some use cases, caching provides a significant
performance boost that beats write-combining regions. Add
APIs to allocate and map a cached DMA region.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
---
 arch/arm/include/asm/dma-mapping.h |   21 +++++++++++++++++++++
 arch/arm/mm/dma-mapping.c          |   21 +++++++++++++++++++++
 2 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index dc988ff..1565403 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -239,12 +239,33 @@ int dma_mmap_coherent(struct device *, struct vm_area_struct *,
 extern void *dma_alloc_writecombine(struct device *, size_t, dma_addr_t *,
 		gfp_t);
 
+/**
+ * dma_alloc_cached - allocate cached memory for DMA
+ * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
+ * @size: required memory size
+ * @handle: bus-specific DMA address
+ *
+ * Allocate some cached memory for a device for
+ * performing DMA.  This function allocates pages, and will
+ * return the CPU-viewed address, and sets @handle to be the
+ * device-viewed address.
+ */
+extern void *dma_alloc_cached(struct device *, size_t, dma_addr_t *,
+		gfp_t);
+
 #define dma_free_writecombine(dev,size,cpu_addr,handle) \
 	dma_free_coherent(dev,size,cpu_addr,handle)
 
+#define dma_free_cached(dev,size,cpu_addr,handle) \
+	dma_free_coherent(dev,size,cpu_addr,handle)
+
 int dma_mmap_writecombine(struct device *, struct vm_area_struct *,
 		void *, dma_addr_t, size_t);
 
+
+int dma_mmap_cached(struct device *, struct vm_area_struct *,
+		void *, dma_addr_t, size_t);
+
 /*
  * This can be called during boot to increase the size of the consistent
  * DMA region above it's default value of 2MB. It must be called before the
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index b1911c4..f396ddc 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -633,6 +633,20 @@ dma_alloc_writecombine(struct device *dev, size_t size, dma_addr_t *handle, gfp_
 }
 EXPORT_SYMBOL(dma_alloc_writecombine);
 
+/*
+ * Allocate a cached DMA region
+ */
+void *
+dma_alloc_cached(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp)
+{
+	return __dma_alloc(dev, size, handle, gfp,
+			   pgprot_kernel,
+			   __builtin_return_address(0));
+}
+EXPORT_SYMBOL(dma_alloc_cached);
+
+
+
 static int dma_mmap(struct device *dev, struct vm_area_struct *vma,
 		    void *cpu_addr, dma_addr_t dma_addr, size_t size)
 {
@@ -664,6 +678,13 @@ int dma_mmap_writecombine(struct device *dev, struct vm_area_struct *vma,
 }
 EXPORT_SYMBOL(dma_mmap_writecombine);
 
+int dma_mmap_cached(struct device *dev, struct vm_area_struct *vma,
+			  void *cpu_addr, dma_addr_t dma_addr, size_t size)
+{
+	return dma_mmap(dev, vma, cpu_addr, dma_addr, size);
+}
+EXPORT_SYMBOL(dma_mmap_cached);
+
 
 /*
  * Free a buffer as defined by the above mapping.
-- 
1.7.8.3

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [Linaro-mm-sig] [PATCH][RFC] arm: dma-mapping: Add support for allocating/mapping cached buffers
  2012-07-13 18:01   ` Laura Abbott
@ 2012-07-14 13:53     ` Clark, Rob
  -1 siblings, 0 replies; 16+ messages in thread
From: Clark, Rob @ 2012-07-14 13:53 UTC (permalink / raw)
  To: Laura Abbott
  Cc: linaro-mm-sig, Marek Szyprowski, Russell King, linux-arm-msm,
	linux-arm-kernel

On Fri, Jul 13, 2012 at 1:01 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
> There are currently no DMA allocation APIs that support cached
> buffers. For some use cases, caching provides a significant
> performance boost that beats write-combining regions. Add
> APIs to allocate and map a cached DMA region.

btw, there were recent patches for allocating dma memory without a
virtual mapping.  With this you could map however you want to
userspace (for example, cached)

I'm assuming that you are not needing it to be mapped cached to kernel?

BR,
-R

> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
> ---
>  arch/arm/include/asm/dma-mapping.h |   21 +++++++++++++++++++++
>  arch/arm/mm/dma-mapping.c          |   21 +++++++++++++++++++++
>  2 files changed, 42 insertions(+), 0 deletions(-)
>
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> index dc988ff..1565403 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -239,12 +239,33 @@ int dma_mmap_coherent(struct device *, struct vm_area_struct *,
>  extern void *dma_alloc_writecombine(struct device *, size_t, dma_addr_t *,
>                 gfp_t);
>
> +/**
> + * dma_alloc_cached - allocate cached memory for DMA
> + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> + * @size: required memory size
> + * @handle: bus-specific DMA address
> + *
> + * Allocate some cached memory for a device for
> + * performing DMA.  This function allocates pages, and will
> + * return the CPU-viewed address, and sets @handle to be the
> + * device-viewed address.
> + */
> +extern void *dma_alloc_cached(struct device *, size_t, dma_addr_t *,
> +               gfp_t);
> +
>  #define dma_free_writecombine(dev,size,cpu_addr,handle) \
>         dma_free_coherent(dev,size,cpu_addr,handle)
>
> +#define dma_free_cached(dev,size,cpu_addr,handle) \
> +       dma_free_coherent(dev,size,cpu_addr,handle)
> +
>  int dma_mmap_writecombine(struct device *, struct vm_area_struct *,
>                 void *, dma_addr_t, size_t);
>
> +
> +int dma_mmap_cached(struct device *, struct vm_area_struct *,
> +               void *, dma_addr_t, size_t);
> +
>  /*
>   * This can be called during boot to increase the size of the consistent
>   * DMA region above it's default value of 2MB. It must be called before the
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index b1911c4..f396ddc 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -633,6 +633,20 @@ dma_alloc_writecombine(struct device *dev, size_t size, dma_addr_t *handle, gfp_
>  }
>  EXPORT_SYMBOL(dma_alloc_writecombine);
>
> +/*
> + * Allocate a cached DMA region
> + */
> +void *
> +dma_alloc_cached(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp)
> +{
> +       return __dma_alloc(dev, size, handle, gfp,
> +                          pgprot_kernel,
> +                          __builtin_return_address(0));
> +}
> +EXPORT_SYMBOL(dma_alloc_cached);
> +
> +
> +
>  static int dma_mmap(struct device *dev, struct vm_area_struct *vma,
>                     void *cpu_addr, dma_addr_t dma_addr, size_t size)
>  {
> @@ -664,6 +678,13 @@ int dma_mmap_writecombine(struct device *dev, struct vm_area_struct *vma,
>  }
>  EXPORT_SYMBOL(dma_mmap_writecombine);
>
> +int dma_mmap_cached(struct device *dev, struct vm_area_struct *vma,
> +                         void *cpu_addr, dma_addr_t dma_addr, size_t size)
> +{
> +       return dma_mmap(dev, vma, cpu_addr, dma_addr, size);
> +}
> +EXPORT_SYMBOL(dma_mmap_cached);
> +
>
>  /*
>   * Free a buffer as defined by the above mapping.
> --
> 1.7.8.3
>
>
> _______________________________________________
> Linaro-mm-sig mailing list
> Linaro-mm-sig@lists.linaro.org
> http://lists.linaro.org/mailman/listinfo/linaro-mm-sig

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Linaro-mm-sig] [PATCH][RFC] arm: dma-mapping: Add support for allocating/mapping cached buffers
  2012-07-14 13:53     ` Clark, Rob
@ 2012-07-17  1:06       ` Laura Abbott
  -1 siblings, 0 replies; 16+ messages in thread
From: Laura Abbott @ 2012-07-17  1:06 UTC (permalink / raw)
  To: Clark, Rob
  Cc: linaro-mm-sig, Marek Szyprowski, Russell King, linux-arm-msm,
	linux-arm-kernel

On 7/14/2012 6:53 AM, Clark, Rob wrote:
> On Fri, Jul 13, 2012 at 1:01 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
>> There are currently no DMA allocation APIs that support cached
>> buffers. For some use cases, caching provides a significant
>> performance boost that beats write-combining regions. Add
>> APIs to allocate and map a cached DMA region.
>
> btw, there were recent patches for allocating dma memory without a
> virtual mapping.  With this you could map however you want to
> userspace (for example, cached)
>
> I'm assuming that you are not needing it to be mapped cached to kernel?
>

Thanks for reminding me about those patches. They don't quite solve the
problem as is, for two reasons: 1) I'm looking at regular CMA
allocations, not the IOMMU allocations that the patches covered; 2)
I do actually need a cached kernel mapping in addition to the userspace
mappings.

I've obviously missed the last DMA rework patches, and I should 
rebase/rework against those. Is another DMA attribute (DMA_ATTR_CACHED) 
an acceptable option?

> BR,
> -R
>

Thanks,
Laura

>> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
>> ---
>>   arch/arm/include/asm/dma-mapping.h |   21 +++++++++++++++++++++
>>   arch/arm/mm/dma-mapping.c          |   21 +++++++++++++++++++++
>>   2 files changed, 42 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
>> index dc988ff..1565403 100644
>> --- a/arch/arm/include/asm/dma-mapping.h
>> +++ b/arch/arm/include/asm/dma-mapping.h
>> @@ -239,12 +239,33 @@ int dma_mmap_coherent(struct device *, struct vm_area_struct *,
>>   extern void *dma_alloc_writecombine(struct device *, size_t, dma_addr_t *,
>>                  gfp_t);
>>
>> +/**
>> + * dma_alloc_cached - allocate cached memory for DMA
>> + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
>> + * @size: required memory size
>> + * @handle: bus-specific DMA address
>> + *
>> + * Allocate some cached memory for a device for
>> + * performing DMA.  This function allocates pages, and will
>> + * return the CPU-viewed address, and sets @handle to be the
>> + * device-viewed address.
>> + */
>> +extern void *dma_alloc_cached(struct device *, size_t, dma_addr_t *,
>> +               gfp_t);
>> +
>>   #define dma_free_writecombine(dev,size,cpu_addr,handle) \
>>          dma_free_coherent(dev,size,cpu_addr,handle)
>>
>> +#define dma_free_cached(dev,size,cpu_addr,handle) \
>> +       dma_free_coherent(dev,size,cpu_addr,handle)
>> +
>>   int dma_mmap_writecombine(struct device *, struct vm_area_struct *,
>>                  void *, dma_addr_t, size_t);
>>
>> +
>> +int dma_mmap_cached(struct device *, struct vm_area_struct *,
>> +               void *, dma_addr_t, size_t);
>> +
>>   /*
>>    * This can be called during boot to increase the size of the consistent
>>    * DMA region above it's default value of 2MB. It must be called before the
>> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
>> index b1911c4..f396ddc 100644
>> --- a/arch/arm/mm/dma-mapping.c
>> +++ b/arch/arm/mm/dma-mapping.c
>> @@ -633,6 +633,20 @@ dma_alloc_writecombine(struct device *dev, size_t size, dma_addr_t *handle, gfp_
>>   }
>>   EXPORT_SYMBOL(dma_alloc_writecombine);
>>
>> +/*
>> + * Allocate a cached DMA region
>> + */
>> +void *
>> +dma_alloc_cached(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp)
>> +{
>> +       return __dma_alloc(dev, size, handle, gfp,
>> +                          pgprot_kernel,
>> +                          __builtin_return_address(0));
>> +}
>> +EXPORT_SYMBOL(dma_alloc_cached);
>> +
>> +
>> +
>>   static int dma_mmap(struct device *dev, struct vm_area_struct *vma,
>>                      void *cpu_addr, dma_addr_t dma_addr, size_t size)
>>   {
>> @@ -664,6 +678,13 @@ int dma_mmap_writecombine(struct device *dev, struct vm_area_struct *vma,
>>   }
>>   EXPORT_SYMBOL(dma_mmap_writecombine);
>>
>> +int dma_mmap_cached(struct device *dev, struct vm_area_struct *vma,
>> +                         void *cpu_addr, dma_addr_t dma_addr, size_t size)
>> +{
>> +       return dma_mmap(dev, vma, cpu_addr, dma_addr, size);
>> +}
>> +EXPORT_SYMBOL(dma_mmap_cached);
>> +
>>
>>   /*
>>    * Free a buffer as defined by the above mapping.
>> --
>> 1.7.8.3
>>
>>
>> _______________________________________________
>> Linaro-mm-sig mailing list
>> Linaro-mm-sig@lists.linaro.org
>> http://lists.linaro.org/mailman/listinfo/linaro-mm-sig


-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH][RFC] arm: dma-mapping: Add support for allocating/mapping cached buffers
  2012-07-13 18:01   ` Laura Abbott
@ 2012-07-17  5:58     ` Marek Szyprowski
  -1 siblings, 0 replies; 16+ messages in thread
From: Marek Szyprowski @ 2012-07-17  5:58 UTC (permalink / raw)
  To: 'Laura Abbott', linaro-mm-sig, 'Russell King'
  Cc: linux-arm-kernel, linux-arm-msm

Hi Laura,

On Friday, July 13, 2012 8:02 PM Laura Abbott wrote:

> There are currently no DMA allocation APIs that support cached
> buffers. For some use cases, caching provides a significant
> performance boost that beats write-combining regions. Add
> APIs to allocate and map a cached DMA region.
> 
> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>

I agree that there is a need for cached contiguous memory blocks. I see that your patch
is based on an older version of the CMA/dma-mapping code. In v3.5-rc1, CMA was merged
into the mainline kernel together with the DMA-mapping redesign patches, so an attribute
approach can be used instead of adding new functions to the API. My original idea was to
use the dma_alloc_nonconsistent() call and DMA_ATTR_NONCONSISTENT for allocating/mapping
cached contiguous buffers, but I didn't have enough time to complete this work.
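
To illustrate the attribute approach mentioned above, here is a minimal userspace
sketch, not kernel code. It assumes a single entry point that receives an attribute
mask and derives the memory type from it, instead of one function per memory type.
Every MOCK_* name is hypothetical and invented for this illustration; the bit values
are not the kernel's DMA_ATTR_* definitions, and the "cached" bit merely stands in for
the DMA_ATTR_CACHED idea raised earlier in the thread.

```c
#include <assert.h>

/* Hypothetical CPU-side memory types (illustration only). */
enum mock_memtype {
	MOCK_MT_UNCACHED,	/* what a coherent allocation would use */
	MOCK_MT_WRITECOMBINE,
	MOCK_MT_CACHED,
};

/* Hypothetical attribute bits; NOT the kernel's DMA_ATTR_* values. */
#define MOCK_ATTR_WRITE_COMBINE	(1UL << 0)
#define MOCK_ATTR_CACHED	(1UL << 1)

/*
 * Translate an attribute mask into a memory type. A new memory type then
 * needs one more attribute bit and one more branch here, rather than a new
 * alloc/free/mmap function triple.
 */
static enum mock_memtype mock_attrs_to_memtype(unsigned long attrs)
{
	/* In this sketch the two attributes are mutually exclusive. */
	assert(!((attrs & MOCK_ATTR_WRITE_COMBINE) &&
		 (attrs & MOCK_ATTR_CACHED)));

	if (attrs & MOCK_ATTR_CACHED)
		return MOCK_MT_CACHED;
	if (attrs & MOCK_ATTR_WRITE_COMBINE)
		return MOCK_MT_WRITECOMBINE;
	return MOCK_MT_UNCACHED;
}
```

With no attribute set, the sketch falls back to the uncached type, which mirrors the
idea that a plain coherent allocation remains the default.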

The main missing piece is the API for managing cache synchronization on such buffers.
There is a dma_cache_sync() function, but it is broken from the API point of view.
Replacing it with something better requires additional work on all the drivers that
already use it. Some work is also needed to clean up the dma_alloc_nonconsistent()
implementations for all the architectures using the dma_map_ops approach. All this is on
my TODO list, but I'm currently really busy with other tasks related to CMA (mainly
bugfixes for some special use cases).
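
Why cached buffers need an explicit synchronization API can be shown with a small
userspace model, given here only as a sketch. It assumes an ownership handoff in the
style of the streaming DMA sync calls: the CPU cleans dirty cache lines before the
device takes the buffer, and invalidates stale lines before it reads the buffer back.
The thread does not say what the replacement API should look like, so all mock_* names
and the counters below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum mock_owner { MOCK_OWNER_CPU, MOCK_OWNER_DEVICE };

/* Hypothetical bookkeeping for one cached DMA buffer. */
struct mock_cached_buf {
	enum mock_owner owner;
	bool cpu_dirty;			/* CPU wrote data not yet cleaned to memory */
	unsigned int cleans;		/* number of modeled cache cleans */
	unsigned int invalidates;	/* number of modeled cache invalidates */
};

/* The CPU may only touch the buffer while it owns it. */
static void mock_cpu_write(struct mock_cached_buf *buf)
{
	assert(buf->owner == MOCK_OWNER_CPU);
	buf->cpu_dirty = true;
}

/* Hand the buffer to the device: clean dirty lines so the device sees them. */
static void mock_sync_for_device(struct mock_cached_buf *buf)
{
	assert(buf->owner == MOCK_OWNER_CPU);
	if (buf->cpu_dirty) {
		buf->cleans++;
		buf->cpu_dirty = false;
	}
	buf->owner = MOCK_OWNER_DEVICE;
}

/* Take the buffer back: invalidate so the CPU does not read stale lines. */
static void mock_sync_for_cpu(struct mock_cached_buf *buf)
{
	assert(buf->owner == MOCK_OWNER_DEVICE);
	buf->invalidates++;
	buf->owner = MOCK_OWNER_CPU;
}
```

The assertions encode the rule that only the current owner may touch the buffer; an
uncached or write-combined mapping avoids this bookkeeping at the cost of slower CPU
access.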

> ---
>  arch/arm/include/asm/dma-mapping.h |   21 +++++++++++++++++++++
>  arch/arm/mm/dma-mapping.c          |   21 +++++++++++++++++++++
>  2 files changed, 42 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> index dc988ff..1565403 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -239,12 +239,33 @@ int dma_mmap_coherent(struct device *, struct vm_area_struct *,
>  extern void *dma_alloc_writecombine(struct device *, size_t, dma_addr_t *,
>  		gfp_t);
> 
> +/**
> + * dma_alloc_cached - allocate cached memory for DMA
> + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> + * @size: required memory size
> + * @handle: bus-specific DMA address
> + *
> + * Allocate some cached memory for a device for
> + * performing DMA.  This function allocates pages, and will
> + * return the CPU-viewed address, and sets @handle to be the
> + * device-viewed address.
> + */
> +extern void *dma_alloc_cached(struct device *, size_t, dma_addr_t *,
> +		gfp_t);
> +
>  #define dma_free_writecombine(dev,size,cpu_addr,handle) \
>  	dma_free_coherent(dev,size,cpu_addr,handle)
> 
> +#define dma_free_cached(dev,size,cpu_addr,handle) \
> +	dma_free_coherent(dev,size,cpu_addr,handle)
> +
>  int dma_mmap_writecombine(struct device *, struct vm_area_struct *,
>  		void *, dma_addr_t, size_t);
> 
> +
> +int dma_mmap_cached(struct device *, struct vm_area_struct *,
> +		void *, dma_addr_t, size_t);
> +
>  /*
>   * This can be called during boot to increase the size of the consistent
>   * DMA region above it's default value of 2MB. It must be called before the
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index b1911c4..f396ddc 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -633,6 +633,20 @@ dma_alloc_writecombine(struct device *dev, size_t size, dma_addr_t
> *handle, gfp_
>  }
>  EXPORT_SYMBOL(dma_alloc_writecombine);
> 
> +/*
> + * Allocate a cached DMA region
> + */
> +void *
> +dma_alloc_cached(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp)
> +{
> +	return __dma_alloc(dev, size, handle, gfp,
> +			   pgprot_kernel,
> +			   __builtin_return_address(0));
> +}
> +EXPORT_SYMBOL(dma_alloc_cached);
> +
> +
> +
>  static int dma_mmap(struct device *dev, struct vm_area_struct *vma,
>  		    void *cpu_addr, dma_addr_t dma_addr, size_t size)
>  {
> @@ -664,6 +678,13 @@ int dma_mmap_writecombine(struct device *dev, struct vm_area_struct *vma,
>  }
>  EXPORT_SYMBOL(dma_mmap_writecombine);
> 
> +int dma_mmap_cached(struct device *dev, struct vm_area_struct *vma,
> +			  void *cpu_addr, dma_addr_t dma_addr, size_t size)
> +{
> +	return dma_mmap(dev, vma, cpu_addr, dma_addr, size);
> +}
> +EXPORT_SYMBOL(dma_mmap_cached);
> +
> 
>  /*
>   * Free a buffer as defined by the above mapping.
> --
> 1.7.8.3

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH][RFC] arm: dma-mapping: Add support for allocating/mapping cached buffers
@ 2012-07-17  5:58     ` Marek Szyprowski
  0 siblings, 0 replies; 16+ messages in thread
From: Marek Szyprowski @ 2012-07-17  5:58 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Laura,

On Friday, July 13, 2012 8:02 PM Laura Abbott wrote:

> There are currently no DMA allocation APIs that support cached
> buffers. For some use cases, caching provides a significant
> performance boost that beats write-combining regions. Add
> APIs to allocate and map a cached DMA region.
> 
> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>

I agree that there is a need for cached contiguous memory blocks. I see that your patch
is based on an older version of the CMA/dma-mapping code. In v3.5-rc1, CMA was merged
into the mainline kernel together with the DMA-mapping redesign patches, so an attribute
approach can be used instead of adding new functions to the API. My original idea was to
use the dma_alloc_nonconsistent() call and DMA_ATTR_NONCONSISTENT for allocating/mapping
cached contiguous buffers, but I didn't have enough time to complete this work.

The main missing piece is the API for managing cache synchronization on such buffers.
There is a dma_cache_sync() function, but it is broken from the API point of view.
Replacing it with something better requires additional work on all the drivers that
already use it. The dma_alloc_nonconsistent() implementations also need cleaning up on
all the architectures that use the dma_map_ops approach. All this is on my TODO list,
but I'm currently really busy with other tasks related to CMA (mainly bugfixes for some
special use cases).
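
For readers following the thread, here is a minimal userspace sketch of the
attribute idea. It is not kernel code: every toy_* name and TOY_* constant is
hypothetical, malloc() stands in for the real page allocator, and the rule that
"cached" wins when both bits are set is an assumption made only for the example.
It shows why a single entry point taking an attribute mask avoids adding one
new function per memory type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical attribute bits for illustration; not the kernel's DMA_ATTR_* values. */
#define TOY_ATTR_WRITE_COMBINE	(1u << 0)
#define TOY_ATTR_CACHED		(1u << 1)

/* Memory types the toy allocator can hand out. */
enum toy_memtype {
	TOY_MEM_COHERENT,	/* default when no attribute is set */
	TOY_MEM_WRITECOMBINE,
	TOY_MEM_CACHED,
};

struct toy_buf {
	void *cpu_addr;
	size_t size;
	enum toy_memtype type;
};

/* Translate an attribute mask into a memory type (assumed policy: cached wins). */
enum toy_memtype toy_memtype_from_attrs(unsigned int attrs)
{
	if (attrs & TOY_ATTR_CACHED)
		return TOY_MEM_CACHED;
	if (attrs & TOY_ATTR_WRITE_COMBINE)
		return TOY_MEM_WRITECOMBINE;
	return TOY_MEM_COHERENT;
}

/* One entry point for every memory type. Returns 0 on success, -1 on failure. */
int toy_alloc_attrs(struct toy_buf *buf, size_t size, unsigned int attrs)
{
	assert(buf != NULL);
	buf->cpu_addr = malloc(size);
	if (buf->cpu_addr == NULL)
		return -1;
	buf->size = size;
	buf->type = toy_memtype_from_attrs(attrs);
	return 0;
}

/* One free path too, regardless of the attributes used at allocation. */
void toy_free(struct toy_buf *buf)
{
	assert(buf != NULL);
	free(buf->cpu_addr);
	buf->cpu_addr = NULL;
	buf->size = 0;
}
```

With this shape, supporting another memory type means adding one attribute bit
and one branch, while callers keep using the same allocation and free calls.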

> ---
>  arch/arm/include/asm/dma-mapping.h |   21 +++++++++++++++++++++
>  arch/arm/mm/dma-mapping.c          |   21 +++++++++++++++++++++
>  2 files changed, 42 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> index dc988ff..1565403 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -239,12 +239,33 @@ int dma_mmap_coherent(struct device *, struct vm_area_struct *,
>  extern void *dma_alloc_writecombine(struct device *, size_t, dma_addr_t *,
>  		gfp_t);
> 
> +/**
> + * dma_alloc_cached - allocate cached memory for DMA
> + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> + * @size: required memory size
> + * @handle: bus-specific DMA address
> + *
> + * Allocate some cached memory for a device for
> + * performing DMA.  This function allocates pages, and will
> + * return the CPU-viewed address, and sets @handle to be the
> + * device-viewed address.
> + */
> +extern void *dma_alloc_cached(struct device *, size_t, dma_addr_t *,
> +		gfp_t);
> +
>  #define dma_free_writecombine(dev,size,cpu_addr,handle) \
>  	dma_free_coherent(dev,size,cpu_addr,handle)
> 
> +#define dma_free_cached(dev,size,cpu_addr,handle) \
> +	dma_free_coherent(dev,size,cpu_addr,handle)
> +
>  int dma_mmap_writecombine(struct device *, struct vm_area_struct *,
>  		void *, dma_addr_t, size_t);
> 
> +
> +int dma_mmap_cached(struct device *, struct vm_area_struct *,
> +		void *, dma_addr_t, size_t);
> +
>  /*
>   * This can be called during boot to increase the size of the consistent
>   * DMA region above it's default value of 2MB. It must be called before the
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index b1911c4..f396ddc 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -633,6 +633,20 @@ dma_alloc_writecombine(struct device *dev, size_t size, dma_addr_t
> *handle, gfp_
>  }
>  EXPORT_SYMBOL(dma_alloc_writecombine);
> 
> +/*
> + * Allocate a cached DMA region
> + */
> +void *
> +dma_alloc_cached(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp)
> +{
> +	return __dma_alloc(dev, size, handle, gfp,
> +			   pgprot_kernel,
> +			   __builtin_return_address(0));
> +}
> +EXPORT_SYMBOL(dma_alloc_cached);
> +
> +
> +
>  static int dma_mmap(struct device *dev, struct vm_area_struct *vma,
>  		    void *cpu_addr, dma_addr_t dma_addr, size_t size)
>  {
> @@ -664,6 +678,13 @@ int dma_mmap_writecombine(struct device *dev, struct vm_area_struct *vma,
>  }
>  EXPORT_SYMBOL(dma_mmap_writecombine);
> 
> +int dma_mmap_cached(struct device *dev, struct vm_area_struct *vma,
> +			  void *cpu_addr, dma_addr_t dma_addr, size_t size)
> +{
> +	return dma_mmap(dev, vma, cpu_addr, dma_addr, size);
> +}
> +EXPORT_SYMBOL(dma_mmap_cached);
> +
> 
>  /*
>   * Free a buffer as defined by the above mapping.
> --
> 1.7.8.3

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][RFC] arm: dma-mapping: Add support for allocating/mapping cached buffers
  2012-07-17  5:58     ` Marek Szyprowski
@ 2012-07-20 20:30       ` Laura Abbott
  -1 siblings, 0 replies; 16+ messages in thread
From: Laura Abbott @ 2012-07-20 20:30 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linaro-mm-sig, 'Russell King', linux-arm-msm, linux-arm-kernel

On 7/16/2012 10:58 PM, Marek Szyprowski wrote:
> Hi Laura,
>
> On Friday, July 13, 2012 8:02 PM Laura Abbott wrote:
>
>> There are currently no DMA allocation APIs that support cached
>> buffers. For some use cases, caching provides a significant
>> performance boost that beats write-combining regions. Add
>> APIs to allocate and map a cached DMA region.
>>
>> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
>
> I agree that there is a need for cached contiguous memory blocks. I see that your patch
> is based on an older version of the CMA/dma-mapping code. In v3.5-rc1, CMA was merged
> into the mainline kernel together with the DMA-mapping redesign patches, so an attribute
> approach can be used instead of adding new functions to the API. My original idea was to
> use the dma_alloc_nonconsistent() call and DMA_ATTR_NONCONSISTENT for allocating/mapping
> cached contiguous buffers, but I didn't have enough time to complete this work.
>
> The main missing piece is the API for managing cache synchronization on such buffers.
> There is a dma_cache_sync() function, but it is broken from the API point of view.
> Replacing it with something better requires additional work on all the drivers that
> already use it. The dma_alloc_nonconsistent() implementations also need cleaning up on
> all the architectures that use the dma_map_ops approach. All this is on my TODO list,
> but I'm currently really busy with other tasks related to CMA (mainly bugfixes for some
> special use cases).
>

In what way is the dma_cache_sync API broken? Just curious at this point.

Thanks,
Laura

>> ---
>>   arch/arm/include/asm/dma-mapping.h |   21 +++++++++++++++++++++
>>   arch/arm/mm/dma-mapping.c          |   21 +++++++++++++++++++++
>>   2 files changed, 42 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
>> index dc988ff..1565403 100644
>> --- a/arch/arm/include/asm/dma-mapping.h
>> +++ b/arch/arm/include/asm/dma-mapping.h
>> @@ -239,12 +239,33 @@ int dma_mmap_coherent(struct device *, struct vm_area_struct *,
>>   extern void *dma_alloc_writecombine(struct device *, size_t, dma_addr_t *,
>>   		gfp_t);
>>
>> +/**
>> + * dma_alloc_cached - allocate cached memory for DMA
>> + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
>> + * @size: required memory size
>> + * @handle: bus-specific DMA address
>> + *
>> + * Allocate some cached memory for a device for
>> + * performing DMA.  This function allocates pages, and will
>> + * return the CPU-viewed address, and sets @handle to be the
>> + * device-viewed address.
>> + */
>> +extern void *dma_alloc_cached(struct device *, size_t, dma_addr_t *,
>> +		gfp_t);
>> +
>>   #define dma_free_writecombine(dev,size,cpu_addr,handle) \
>>   	dma_free_coherent(dev,size,cpu_addr,handle)
>>
>> +#define dma_free_cached(dev,size,cpu_addr,handle) \
>> +	dma_free_coherent(dev,size,cpu_addr,handle)
>> +
>>   int dma_mmap_writecombine(struct device *, struct vm_area_struct *,
>>   		void *, dma_addr_t, size_t);
>>
>> +
>> +int dma_mmap_cached(struct device *, struct vm_area_struct *,
>> +		void *, dma_addr_t, size_t);
>> +
>>   /*
>>    * This can be called during boot to increase the size of the consistent
>>    * DMA region above it's default value of 2MB. It must be called before the
>> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
>> index b1911c4..f396ddc 100644
>> --- a/arch/arm/mm/dma-mapping.c
>> +++ b/arch/arm/mm/dma-mapping.c
>> @@ -633,6 +633,20 @@ dma_alloc_writecombine(struct device *dev, size_t size, dma_addr_t
>> *handle, gfp_
>>   }
>>   EXPORT_SYMBOL(dma_alloc_writecombine);
>>
>> +/*
>> + * Allocate a cached DMA region
>> + */
>> +void *
>> +dma_alloc_cached(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp)
>> +{
>> +	return __dma_alloc(dev, size, handle, gfp,
>> +			   pgprot_kernel,
>> +			   __builtin_return_address(0));
>> +}
>> +EXPORT_SYMBOL(dma_alloc_cached);
>> +
>> +
>> +
>>   static int dma_mmap(struct device *dev, struct vm_area_struct *vma,
>>   		    void *cpu_addr, dma_addr_t dma_addr, size_t size)
>>   {
>> @@ -664,6 +678,13 @@ int dma_mmap_writecombine(struct device *dev, struct vm_area_struct *vma,
>>   }
>>   EXPORT_SYMBOL(dma_mmap_writecombine);
>>
>> +int dma_mmap_cached(struct device *dev, struct vm_area_struct *vma,
>> +			  void *cpu_addr, dma_addr_t dma_addr, size_t size)
>> +{
>> +	return dma_mmap(dev, vma, cpu_addr, dma_addr, size);
>> +}
>> +EXPORT_SYMBOL(dma_mmap_cached);
>> +
>>
>>   /*
>>    * Free a buffer as defined by the above mapping.
>> --
>> 1.7.8.3
>
> Best regards
>


-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH][RFC] arm: dma-mapping: Add support for allocating/mapping cached buffers
  2012-07-20 20:30       ` Laura Abbott
@ 2012-07-23  7:22         ` Marek Szyprowski
  -1 siblings, 0 replies; 16+ messages in thread
From: Marek Szyprowski @ 2012-07-23  7:22 UTC (permalink / raw)
  To: 'Laura Abbott'
  Cc: linaro-mm-sig, 'Russell King',
	linux-arm-msm, linux-arm-kernel, 'Arnd Bergmann',
	'Benjamin Herrenschmidt'




> -----Original Message-----
> From: Laura Abbott [mailto:lauraa@codeaurora.org]
> Sent: Friday, July 20, 2012 10:30 PM
> To: Marek Szyprowski
> Cc: linaro-mm-sig@lists.linaro.org; 'Russell King'; linux-arm-msm@vger.kernel.org;
> linux-arm-kernel@lists.infradead.org
> Subject: Re: [PATCH][RFC] arm: dma-mapping: Add support for allocating/mapping cached buffers
> 
> On 7/16/2012 10:58 PM, Marek Szyprowski wrote:
> > Hi Laura,
> >
> > On Friday, July 13, 2012 8:02 PM Laura Abbott wrote:
> >
> >> There are currently no DMA allocation APIs that support cached
> >> buffers. For some use cases, caching provides a significant
> >> performance boost that beats write-combining regions. Add
> >> APIs to allocate and map a cached DMA region.
> >>
> >> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
> >
> > I agree that there is a need for cached contiguous memory blocks. I see that your patch
> > is based on an older version of the CMA/dma-mapping code. In v3.5-rc1, CMA was merged
> > into the mainline kernel together with the DMA-mapping redesign patches, so an attribute
> > approach can be used instead of adding new functions to the API. My original idea was to
> > use the dma_alloc_nonconsistent() call and DMA_ATTR_NONCONSISTENT for allocating/mapping
> > cached contiguous buffers, but I didn't have enough time to complete this work.
> >
> > The main missing piece is the API for managing cache synchronization on such buffers.
> > There is a dma_cache_sync() function, but it is broken from the API point of view.
> > Replacing it with something better requires additional work on all the drivers that
> > already use it. The dma_alloc_nonconsistent() implementations also need cleaning up on
> > all the architectures that use the dma_map_ops approach. All this is on my TODO list,
> > but I'm currently really busy with other tasks related to CMA (mainly bugfixes for some
> > special use cases).
> >
> 
> In what way is the dma_cache_sync API broken? Just curious at this point.

There are two issues with it:
1. It has no clear definition of buffer ownership, unlike the
   dma_sync_single_for_cpu/device() functions.
2. The DMA address argument is missing, which is required for a clean and robust
   implementation on some architectures.

I would like to completely remove dma_cache_sync() and replace it with
dma_sync_single_for_cpu/device(), but this will probably require a bit more discussion and
fixing all current clients of dma_cache_sync().
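
To illustrate the two points above, here is a small userspace model of an
ownership-based sync interface. It is only a sketch under stated assumptions:
the toy_* names are hypothetical, ownership is tracked for the whole buffer
rather than per range, and no cache maintenance is actually performed. The sync
calls take the device-viewed address, and each call moves ownership explicitly
between the CPU and the device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_owner { TOY_OWNER_CPU, TOY_OWNER_DEVICE };

struct toy_dma_buf {
	uint64_t dma_addr;	/* device-viewed start address */
	size_t size;
	enum toy_owner owner;	/* who may touch the buffer right now */
};

/* Check that [addr, addr + len) lies inside the buffer's DMA range. */
static int toy_range_ok(const struct toy_dma_buf *buf, uint64_t addr, size_t len)
{
	return addr >= buf->dma_addr &&
	       len <= buf->size &&
	       addr - buf->dma_addr <= buf->size - len;
}

/*
 * Hand the buffer to the device. Depending on the transfer direction, a real
 * implementation would typically write back dirty cache lines at this point.
 * Returns 0 on success, -1 on a bad range or if the CPU does not own the buffer.
 */
int toy_sync_for_device(struct toy_dma_buf *buf, uint64_t addr, size_t len)
{
	assert(buf != NULL);
	if (!toy_range_ok(buf, addr, len) || buf->owner != TOY_OWNER_CPU)
		return -1;
	buf->owner = TOY_OWNER_DEVICE;
	return 0;
}

/*
 * Hand the buffer back to the CPU. Depending on the transfer direction, a real
 * implementation would typically invalidate stale cache lines at this point.
 * Returns 0 on success, -1 on a bad range or if the device does not own the buffer.
 */
int toy_sync_for_cpu(struct toy_dma_buf *buf, uint64_t addr, size_t len)
{
	assert(buf != NULL);
	if (!toy_range_ok(buf, addr, len) || buf->owner != TOY_OWNER_DEVICE)
		return -1;
	buf->owner = TOY_OWNER_CPU;
	return 0;
}
```

Because every call names the DMA address and changes the owner, a misuse such
as syncing for the CPU twice in a row is detectable, which a single
"sync the cache" call with only a CPU pointer cannot express.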

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][RFC] arm: dma-mapping: Add support for allocating/mapping cached buffers
  2012-07-23  7:22         ` Marek Szyprowski
@ 2012-07-23  8:46           ` Arnd Bergmann
  -1 siblings, 0 replies; 16+ messages in thread
From: Arnd Bergmann @ 2012-07-23  8:46 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: 'Laura Abbott', linaro-mm-sig, 'Russell King',
	linux-arm-msm, linux-arm-kernel, 'Benjamin Herrenschmidt'

On Monday 23 July 2012, Marek Szyprowski wrote:
> I would like to completely remove dma_cache_sync() and replace it with
> dma_sync_single_for_cpu/device(), but this will probably require a bit more
> discussion and fixing all current clients of dma_cache_sync().

Sounds like a good idea. Fortunately, there are only a handful of such
drivers in the kernel, and even fewer machines that actually need this,
so we should be able to find all the people that care about these.

	Arnd

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2012-07-23  8:47 UTC | newest]

Thread overview: 16+ messages
2012-07-13 18:01 [RFc] Map CMA pages as cached Laura Abbott
2012-07-13 18:01 ` Laura Abbott
2012-07-13 18:01 ` [PATCH][RFC] arm: dma-mapping: Add support for allocating/mapping cached buffers Laura Abbott
2012-07-13 18:01   ` Laura Abbott
2012-07-14 13:53   ` [Linaro-mm-sig] " Clark, Rob
2012-07-14 13:53     ` Clark, Rob
2012-07-17  1:06     ` Laura Abbott
2012-07-17  1:06       ` Laura Abbott
2012-07-17  5:58   ` Marek Szyprowski
2012-07-17  5:58     ` Marek Szyprowski
2012-07-20 20:30     ` Laura Abbott
2012-07-20 20:30       ` Laura Abbott
2012-07-23  7:22       ` Marek Szyprowski
2012-07-23  7:22         ` Marek Szyprowski
2012-07-23  8:46         ` Arnd Bergmann
2012-07-23  8:46           ` Arnd Bergmann
