* [PATCHv2 0/6] ARM: DMA-mapping: new extensions for buffer sharing
@ 2012-06-13 11:50 ` Marek Szyprowski
  0 siblings, 0 replies; 45+ messages in thread
From: Marek Szyprowski @ 2012-06-13 11:50 UTC (permalink / raw)
  To: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch, linux-kernel
  Cc: Marek Szyprowski, Kyungmin Park, Arnd Bergmann,
	Russell King - ARM Linux, Chunsang Jeong, Krishna Reddy,
	Benjamin Herrenschmidt, Konrad Rzeszutek Wilk, Hiroshi Doyu,
	Subash Patel, Sumit Semwal, Abhinav Kochhar, Tomasz Stanislawski

Hello,

This is an updated version of the patch series introducing new features
to the DMA mapping subsystem that let drivers share allocated buffers
(preferably using the recently introduced dma_buf framework) easily and
efficiently.

The first extension is the DMA_ATTR_NO_KERNEL_MAPPING attribute. It is
intended for use with the dma_{alloc, mmap, free}_attrs functions. It
notifies the dma-mapping core that the driver will not use a kernel
mapping for the allocated buffer at all, so the core can skip creating
it. This saves precious kernel virtual address space. Such a buffer can
be accessed from userspace after calling dma_mmap_attrs() for it (a
typical use case for multimedia buffers). The value returned by
dma_alloc_attrs() with this attribute should be treated as a DMA
cookie, which needs to be passed to the dma_mmap_attrs() and
dma_free_attrs() functions.
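
The cookie semantics can be illustrated with a small userspace model.
This is only a sketch and not kernel code: every name in it
(model_alloc(), model_get_pages(), model_free(), the MODEL_* constants
and the fixed-size registry) is hypothetical. It assumes the cookie is
simply the page descriptor itself, which mirrors what patch 2/6 does for
the ARM IOMMU case.

```c
/* Userspace model only: every name below is made up for illustration. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096u
#define MODEL_ATTR_NO_KERNEL_MAPPING (1u << 0)
#define MODEL_MAX_AREAS 8

/* Stand-in for the page array that backs one allocation. */
struct model_pages {
	size_t count;     /* number of backing pages */
	void *kernel_va;  /* simulated kernel mapping, NULL if none exists */
};

/* Registry of simulated kernel mappings, searched by address. */
static struct model_pages *model_areas[MODEL_MAX_AREAS];
static int model_live_kernel_mappings;

/* Returns a kernel address, or an opaque cookie if the attribute is set. */
static void *model_alloc(size_t size, unsigned int attrs)
{
	struct model_pages *pages;
	size_t i;

	assert(size > 0);
	pages = calloc(1, sizeof(*pages));
	if (!pages)
		return NULL;
	pages->count = (size + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;

	if (attrs & MODEL_ATTR_NO_KERNEL_MAPPING)
		return pages; /* cookie: never dereference it as buffer memory */

	/* Find a free registry slot, then create the simulated mapping. */
	for (i = 0; i < MODEL_MAX_AREAS && model_areas[i]; i++)
		;
	if (i < MODEL_MAX_AREAS)
		pages->kernel_va = malloc(pages->count * MODEL_PAGE_SIZE);
	if (!pages->kernel_va) {
		free(pages);
		return NULL;
	}
	model_areas[i] = pages;
	model_live_kernel_mappings++;
	return pages->kernel_va;
}

/* Recover the page descriptor from whatever model_alloc() returned. */
static struct model_pages *model_get_pages(void *cpu_addr, unsigned int attrs)
{
	size_t i;

	if (attrs & MODEL_ATTR_NO_KERNEL_MAPPING)
		return cpu_addr; /* the cookie is the descriptor itself */

	for (i = 0; i < MODEL_MAX_AREAS; i++)
		if (model_areas[i] && model_areas[i]->kernel_va == cpu_addr)
			return model_areas[i];
	return NULL;
}

/* Must receive the same attrs that were used for the allocation. */
static void model_free(void *cpu_addr, unsigned int attrs)
{
	struct model_pages *pages = model_get_pages(cpu_addr, attrs);
	size_t i;

	if (!pages)
		return;
	if (!(attrs & MODEL_ATTR_NO_KERNEL_MAPPING)) {
		/* Only tear down a mapping that was actually created. */
		for (i = 0; i < MODEL_MAX_AREAS; i++)
			if (model_areas[i] == pages)
				model_areas[i] = NULL;
		free(pages->kernel_va);
		model_live_kernel_mappings--;
	}
	free(pages);
}
```

The point of the model is that the same attribute has to accompany the
cookie on every later call: without it, the lookup would treat the
cookie as a kernel address and fail to find the pages.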

The second extension is required to let drivers share the buffers
allocated by the DMA-mapping subsystem. Right now the driver gets a dma
address of the allocated buffer and the kernel virtual mapping for it.
If it wants to share the buffer with another device (= map it into that
device's dma address space), it usually hacks around kernel virtual
addresses to get pointers to pages or assumes that both devices share
the DMA address space. Both solutions are just hacks for special cases,
which should be avoided in the final version of buffer sharing. To solve
this issue in a generic way, a new call to DMA mapping has been
introduced - dma_get_sgtable(). It allocates a scatter-list which
describes the allocated buffer and lets the driver(s) use it with other
device(s) by calling dma_map_sg() on it.
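
To make the idea of "a scatter-list which describes the allocated
buffer" more concrete, here is a rough userspace sketch. It is not the
kernel implementation: struct model_sg_entry and model_build_sgtable()
are hypothetical names, pages are represented by plain page frame
numbers, and merging physically adjacent pages into one entry is a
choice made for this sketch rather than something this cover letter
specifies.

```c
/* Userspace model only: every name below is made up for illustration. */
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u

/* One physically contiguous segment of the buffer. */
struct model_sg_entry {
	unsigned long first_pfn;  /* first page frame of the segment */
	size_t length;            /* segment length in bytes */
};

/*
 * Describe a buffer given as an array of page frame numbers.
 * Physically adjacent pages are merged into a single entry.
 * Returns the number of entries written, or 0 if the buffer is empty
 * or `max_entries` is too small to describe it.
 */
static size_t model_build_sgtable(const unsigned long *pfns, size_t npages,
				  struct model_sg_entry *out,
				  size_t max_entries)
{
	size_t used = 0;
	size_t i;

	assert(pfns != NULL && out != NULL);

	for (i = 0; i < npages; i++) {
		if (used > 0) {
			struct model_sg_entry *last = &out[used - 1];
			unsigned long next_pfn = last->first_pfn +
				last->length / MODEL_PAGE_SIZE;

			/* Extend the previous segment if this page follows it. */
			if (pfns[i] == next_pfn) {
				last->length += MODEL_PAGE_SIZE;
				continue;
			}
		}
		if (used == max_entries)
			return 0;
		out[used].first_pfn = pfns[i];
		out[used].length = MODEL_PAGE_SIZE;
		used++;
	}
	return used;
}
```

A table built this way contains no device-specific addresses, which is
why the same description can be handed to a mapping call for each device
that needs to access the buffer.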

The third extension solves the performance issues which we observed with
some advanced buffer sharing use cases, which require creating a dma
mapping for the same memory buffer for more than one device. From the
DMA-mapping perspective this requires calling one of the
dma_map_{page,single,sg} functions for the given memory buffer several
times, once for each of the devices. Each dma_map_* call performs CPU
cache synchronization, which might be a time consuming operation,
especially when the buffers are large. We would like to avoid any
useless and time consuming operations, so that was the main reason for
introducing another attribute for the DMA-mapping subsystem:
DMA_ATTR_SKIP_CPU_SYNC, which lets the dma-mapping core skip CPU cache
synchronization in certain cases.
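
The cost being avoided can be shown with a toy userspace model that
only counts simulated cache maintenance passes. All names here
(struct model_buffer, model_map_for_device(), model_share(),
MODEL_ATTR_SKIP_CPU_SYNC) are hypothetical, and the policy of syncing on
the first mapping and skipping on the following ones is just one
plausible usage pattern assumed for the sketch; the text above only says
"in certain cases".

```c
/* Userspace model only: every name below is made up for illustration. */
#include <assert.h>
#include <stddef.h>

#define MODEL_ATTR_SKIP_CPU_SYNC (1u << 0)

struct model_buffer {
	size_t size;               /* buffer size in bytes */
	unsigned int cache_syncs;  /* simulated cache maintenance passes */
	unsigned int mappings;     /* number of device mappings created */
};

/* Map the buffer for one device; sync the cache unless told to skip. */
static void model_map_for_device(struct model_buffer *buf, unsigned int attrs)
{
	assert(buf != NULL);

	if (!(attrs & MODEL_ATTR_SKIP_CPU_SYNC))
		buf->cache_syncs++;
	buf->mappings++;
}

/*
 * Share one buffer with `ndevices` devices. Only the first mapping pays
 * for cache synchronization; the rest pass the skip attribute.
 */
static void model_share(struct model_buffer *buf, unsigned int ndevices)
{
	unsigned int i;

	for (i = 0; i < ndevices; i++)
		model_map_for_device(buf,
				     i == 0 ? 0 : MODEL_ATTR_SKIP_CPU_SYNC);
}
```

In this model, sharing a buffer with three devices costs one cache pass
instead of three, and the saving grows with the number of devices.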

The proposed patches have been rebased on the latest Linux kernel
v3.5-rc2 with 'ARM: replace custom consistent dma region with vmalloc'
patches applied (for more information, please refer to the 
http://www.spinics.net/lists/arm-kernel/msg179202.html thread).

The patches together with all dependencies are also available on the
following GIT branch:

git://git.linaro.org/people/mszyprowski/linux-dma-mapping.git 3.5-rc2-dma-ext-v2

Best regards
Marek Szyprowski
Samsung Poland R&D Center

Changelog:

v2:
- rebased onto v3.5-rc2 and adapted for CMA and dma-mapping changes
- renamed dma_get_sgtable() to dma_get_sgtable_attrs() to match the convention
  of the other dma-mapping calls with attributes
- added generic fallback function for dma_get_sgtable() for architectures with
  simple dma-mapping implementations

v1: http://thread.gmane.org/gmane.linux.kernel.mm/78644
    http://thread.gmane.org/gmane.linux.kernel.cross-arch/14435 (part 2)
- initial version

Patch summary:

Marek Szyprowski (6):
  common: DMA-mapping: add DMA_ATTR_NO_KERNEL_MAPPING attribute
  ARM: dma-mapping: add support for DMA_ATTR_NO_KERNEL_MAPPING
    attribute
  common: dma-mapping: introduce dma_get_sgtable() function
  ARM: dma-mapping: add support for dma_get_sgtable()
  common: DMA-mapping: add DMA_ATTR_SKIP_CPU_SYNC attribute
  ARM: dma-mapping: add support for DMA_ATTR_SKIP_CPU_SYNC attribute

 Documentation/DMA-attributes.txt         |   42 ++++++++++++++++++
 arch/arm/common/dmabounce.c              |    1 +
 arch/arm/include/asm/dma-mapping.h       |    3 +
 arch/arm/mm/dma-mapping.c                |   69 ++++++++++++++++++++++++------
 drivers/base/dma-mapping.c               |   18 ++++++++
 include/asm-generic/dma-mapping-common.h |   18 ++++++++
 include/linux/dma-attrs.h                |    2 +
 include/linux/dma-mapping.h              |    3 +
 8 files changed, 142 insertions(+), 14 deletions(-)

-- 
1.7.1.569.g6f426


* [PATCHv2 1/6] common: DMA-mapping: add DMA_ATTR_NO_KERNEL_MAPPING attribute
  2012-06-13 11:50 ` Marek Szyprowski
  (?)
@ 2012-06-13 11:50   ` Marek Szyprowski
  -1 siblings, 0 replies; 45+ messages in thread
From: Marek Szyprowski @ 2012-06-13 11:50 UTC (permalink / raw)
  To: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch, linux-kernel
  Cc: Marek Szyprowski, Kyungmin Park, Arnd Bergmann,
	Russell King - ARM Linux, Chunsang Jeong, Krishna Reddy,
	Benjamin Herrenschmidt, Konrad Rzeszutek Wilk, Hiroshi Doyu,
	Subash Patel, Sumit Semwal, Abhinav Kochhar, Tomasz Stanislawski

This patch adds the DMA_ATTR_NO_KERNEL_MAPPING attribute, which lets the
platform avoid creating a kernel virtual mapping for the allocated
buffer. On some architectures creating such a mapping is a non-trivial
task and consumes very limited resources (like kernel virtual address
space or dma consistent address space). Buffers allocated with this
attribute can only be passed to user space by calling dma_mmap_attrs().

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 Documentation/DMA-attributes.txt |   18 ++++++++++++++++++
 include/linux/dma-attrs.h        |    1 +
 2 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt
index 5c72eed..725580d 100644
--- a/Documentation/DMA-attributes.txt
+++ b/Documentation/DMA-attributes.txt
@@ -49,3 +49,21 @@ DMA_ATTR_NON_CONSISTENT lets the platform to choose to return either
 consistent or non-consistent memory as it sees fit.  By using this API,
 you are guaranteeing to the platform that you have all the correct and
 necessary sync points for this memory in the driver.
+
+DMA_ATTR_NO_KERNEL_MAPPING
+--------------------------
+
+DMA_ATTR_NO_KERNEL_MAPPING lets the platform avoid creating a kernel
+virtual mapping for the allocated buffer. On some architectures creating
+such a mapping is a non-trivial task and consumes very limited resources
+(like kernel virtual address space or dma consistent address space).
+Buffers allocated with this attribute can only be passed to user space
+by calling dma_mmap_attrs(). By using this API, you are guaranteeing
+that you won't dereference the pointer returned by dma_alloc_attrs(). You
+can treat it as a cookie that must be passed to dma_mmap_attrs() and
+dma_free_attrs(). Make sure that both of these also get this attribute
+set on each call.
+
+Since it is optional for platforms to implement
+DMA_ATTR_NO_KERNEL_MAPPING, those that do not will simply ignore the
+attribute and exhibit default behavior.
diff --git a/include/linux/dma-attrs.h b/include/linux/dma-attrs.h
index 547ab56..a37c10c 100644
--- a/include/linux/dma-attrs.h
+++ b/include/linux/dma-attrs.h
@@ -15,6 +15,7 @@ enum dma_attr {
 	DMA_ATTR_WEAK_ORDERING,
 	DMA_ATTR_WRITE_COMBINE,
 	DMA_ATTR_NON_CONSISTENT,
+	DMA_ATTR_NO_KERNEL_MAPPING,
 	DMA_ATTR_MAX,
 };
 
-- 
1.7.1.569.g6f426


* [PATCHv2 2/6] ARM: dma-mapping: add support for DMA_ATTR_NO_KERNEL_MAPPING attribute
  2012-06-13 11:50 ` Marek Szyprowski
  (?)
@ 2012-06-13 11:50   ` Marek Szyprowski
  -1 siblings, 0 replies; 45+ messages in thread
From: Marek Szyprowski @ 2012-06-13 11:50 UTC (permalink / raw)
  To: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch, linux-kernel
  Cc: Marek Szyprowski, Kyungmin Park, Arnd Bergmann,
	Russell King - ARM Linux, Chunsang Jeong, Krishna Reddy,
	Benjamin Herrenschmidt, Konrad Rzeszutek Wilk, Hiroshi Doyu,
	Subash Patel, Sumit Semwal, Abhinav Kochhar, Tomasz Stanislawski

This patch adds support for the DMA_ATTR_NO_KERNEL_MAPPING attribute for
IOMMU allocations, which lets drivers save precious kernel virtual
address space for large buffers that are intended to be accessed only
from userspace.

This patch is heavily based on initial work kindly provided by Abhinav
Kochhar <abhinav.k@samsung.com>.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 arch/arm/mm/dma-mapping.c |   18 +++++++++++++-----
 1 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index b3ffcf9..5d8b8b2 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1071,10 +1071,13 @@ static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova, size_t si
 	return 0;
 }
 
-static struct page **__iommu_get_pages(void *cpu_addr)
+static struct page **__iommu_get_pages(void *cpu_addr, struct dma_attrs *attrs)
 {
 	struct vm_struct *area;
 
+	if (dma_get_attr(DMA_ATTR_NO_KERNEL_MAPPING, attrs))
+		return cpu_addr;
+
 	area = find_vm_area(cpu_addr);
 	if (area && (area->flags & VM_DMA))
 		return area->pages;
@@ -1099,6 +1102,9 @@ static void *arm_iommu_alloc_attrs(struct device *dev, size_t size,
 	if (*handle == DMA_ERROR_CODE)
 		goto err_buffer;
 
+	if (dma_get_attr(DMA_ATTR_NO_KERNEL_MAPPING, attrs))
+		return pages;
+
 	addr = __iommu_alloc_remap(pages, size, gfp, prot,
 				   __builtin_return_address(0));
 	if (!addr)
@@ -1119,7 +1125,7 @@ static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
 {
 	unsigned long uaddr = vma->vm_start;
 	unsigned long usize = vma->vm_end - vma->vm_start;
-	struct page **pages = __iommu_get_pages(cpu_addr);
+	struct page **pages = __iommu_get_pages(cpu_addr, attrs);
 
 	vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot);
 
@@ -1146,7 +1152,7 @@ static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
 void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
 			  dma_addr_t handle, struct dma_attrs *attrs)
 {
-	struct page **pages = __iommu_get_pages(cpu_addr);
+	struct page **pages = __iommu_get_pages(cpu_addr, attrs);
 	size = PAGE_ALIGN(size);
 
 	if (!pages) {
@@ -1156,8 +1162,10 @@ void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
 		return;
 	}
 
-	unmap_kernel_range((unsigned long)cpu_addr, size);
-	vunmap(cpu_addr);
+	if (!dma_get_attr(DMA_ATTR_NO_KERNEL_MAPPING, attrs)) {
+		unmap_kernel_range((unsigned long)cpu_addr, size);
+		vunmap(cpu_addr);
+	}
 
 	__iommu_remove_mapping(dev, handle, size);
 	__iommu_free_buffer(dev, pages, size);
-- 
1.7.1.569.g6f426


* [PATCHv2 3/6] common: dma-mapping: introduce dma_get_sgtable() function
  2012-06-13 11:50 ` Marek Szyprowski
  (?)
@ 2012-06-13 11:50   ` Marek Szyprowski
  -1 siblings, 0 replies; 45+ messages in thread
From: Marek Szyprowski @ 2012-06-13 11:50 UTC (permalink / raw)
  To: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch, linux-kernel
  Cc: Marek Szyprowski, Kyungmin Park, Arnd Bergmann,
	Russell King - ARM Linux, Chunsang Jeong, Krishna Reddy,
	Benjamin Herrenschmidt, Konrad Rzeszutek Wilk, Hiroshi Doyu,
	Subash Patel, Sumit Semwal, Abhinav Kochhar, Tomasz Stanislawski

This patch adds the dma_get_sgtable() function, which is required to let
drivers share buffers allocated by the DMA-mapping subsystem. Right now
the driver gets a DMA address of the allocated buffer and the kernel
virtual mapping for it. If it wants to share the buffer with another
device (= map it into that device's DMA address space), it usually hacks
around kernel virtual addresses to get pointers to pages, or assumes that
both devices share the DMA address space. Both solutions are just hacks
for special cases, which should be avoided in the final version of
buffer sharing.

To solve this issue in a generic way, a new call to DMA mapping has been
introduced: dma_get_sgtable(). It allocates a scatter-list which
describes the allocated buffer and lets the driver(s) use it with
other device(s) by calling dma_map_sg() on it.

This patch provides a generic implementation based on the virt_to_page()
call. Architectures which require more sophisticated translation might
provide their own get_sgtable() methods.
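
The fallback scheme described above (use the per-device get_sgtable() method
if one is provided, otherwise the generic helper) can be sketched in plain
userspace C. All model_* names and the simplified structures are hypothetical
stand-ins for illustration; only the dispatch pattern and the single
page-aligned entry are taken from the diff below.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical, simplified stand-in for struct sg_table. */
struct model_sg_table {
	const void *first_page;	/* page the single entry points at */
	size_t length;		/* bytes described by the entry    */
	int filled_by;		/* 0 = generic helper, 1 = arch    */
};

/* Hypothetical stand-in for the relevant part of struct dma_map_ops. */
struct model_dma_ops {
	int (*get_sgtable)(struct model_sg_table *sgt,
			   const void *cpu_addr, size_t size);
};

static size_t model_page_align(size_t size)
{
	return (size + MODEL_PAGE_SIZE - 1) & ~(size_t)(MODEL_PAGE_SIZE - 1);
}

/* Generic path: one entry covering the page-aligned size. */
static int model_common_get_sgtable(struct model_sg_table *sgt,
				    const void *cpu_addr, size_t size)
{
	sgt->first_page = cpu_addr;
	sgt->length = model_page_align(size);
	sgt->filled_by = 0;
	return 0;
}

/* Example of an architecture-specific method. */
static int model_arch_get_sgtable(struct model_sg_table *sgt,
				  const void *cpu_addr, size_t size)
{
	sgt->first_page = cpu_addr;
	sgt->length = model_page_align(size);
	sgt->filled_by = 1;
	return 0;
}

/* Wrapper: prefer the per-device method, otherwise fall back. */
static int model_get_sgtable(const struct model_dma_ops *ops,
			     struct model_sg_table *sgt,
			     const void *cpu_addr, size_t size)
{
	if (ops->get_sgtable)
		return ops->get_sgtable(sgt, cpu_addr, size);
	return model_common_get_sgtable(sgt, cpu_addr, size);
}
```

Leaving the method pointer NULL selects the generic helper, so an
architecture only has to fill it in when the virt_to_page() based
translation does not work for its buffers.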

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 drivers/base/dma-mapping.c               |   18 ++++++++++++++++++
 include/asm-generic/dma-mapping-common.h |   18 ++++++++++++++++++
 include/linux/dma-mapping.h              |    3 +++
 3 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c
index 6f3676f..49785c1 100644
--- a/drivers/base/dma-mapping.c
+++ b/drivers/base/dma-mapping.c
@@ -217,4 +217,22 @@ void dmam_release_declared_memory(struct device *dev)
 }
 EXPORT_SYMBOL(dmam_release_declared_memory);
 
+/*
+ * Create scatter-list for the already allocated DMA buffer.
+ */
+int dma_common_get_sgtable(struct device *dev, struct sg_table *sgt,
+		 void *cpu_addr, dma_addr_t handle, size_t size)
+{
+	struct page *page = virt_to_page(cpu_addr);
+	int ret;
+
+	ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
+	if (unlikely(ret))
+		return ret;
+
+	sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
+	return 0;
+}
+EXPORT_SYMBOL(dma_common_get_sgtable);
+
 #endif
diff --git a/include/asm-generic/dma-mapping-common.h b/include/asm-generic/dma-mapping-common.h
index 2e248d8..34841c6 100644
--- a/include/asm-generic/dma-mapping-common.h
+++ b/include/asm-generic/dma-mapping-common.h
@@ -176,4 +176,22 @@ dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
 #define dma_map_sg(d, s, n, r) dma_map_sg_attrs(d, s, n, r, NULL)
 #define dma_unmap_sg(d, s, n, r) dma_unmap_sg_attrs(d, s, n, r, NULL)
 
+int
+dma_common_get_sgtable(struct device *dev, struct sg_table *sgt,
+		       void *cpu_addr, dma_addr_t dma_addr, size_t size);
+
+static inline int
+dma_get_sgtable_attrs(struct device *dev, struct sg_table *sgt, void *cpu_addr,
+		      dma_addr_t dma_addr, size_t size, struct dma_attrs *attrs)
+{
+	struct dma_map_ops *ops = get_dma_ops(dev);
+	BUG_ON(!ops);
+	if (ops->get_sgtable)
+		return ops->get_sgtable(dev, sgt, cpu_addr, dma_addr, size,
+					attrs);
+	return dma_common_get_sgtable(dev, sgt, cpu_addr, dma_addr, size);
+}
+
+#define dma_get_sgtable(d, t, v, h, s) dma_get_sgtable_attrs(d, t, v, h, s, NULL)
+
 #endif
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index dfc099e..94af418 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -18,6 +18,9 @@ struct dma_map_ops {
 	int (*mmap)(struct device *, struct vm_area_struct *,
 			  void *, dma_addr_t, size_t, struct dma_attrs *attrs);
 
+	int (*get_sgtable)(struct device *dev, struct sg_table *sgt, void *,
+			   dma_addr_t, size_t, struct dma_attrs *attrs);
+
 	dma_addr_t (*map_page)(struct device *dev, struct page *page,
 			       unsigned long offset, size_t size,
 			       enum dma_data_direction dir,
-- 
1.7.1.569.g6f426


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCHv2 4/6] ARM: dma-mapping: add support for dma_get_sgtable()
  2012-06-13 11:50 ` Marek Szyprowski
  (?)
@ 2012-06-13 11:50   ` Marek Szyprowski
  -1 siblings, 0 replies; 45+ messages in thread
From: Marek Szyprowski @ 2012-06-13 11:50 UTC (permalink / raw)
  To: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch, linux-kernel
  Cc: Marek Szyprowski, Kyungmin Park, Arnd Bergmann,
	Russell King - ARM Linux, Chunsang Jeong, Krishna Reddy,
	Benjamin Herrenschmidt, Konrad Rzeszutek Wilk, Hiroshi Doyu,
	Subash Patel, Sumit Semwal, Abhinav Kochhar, Tomasz Stanislawski

This patch adds support for the dma_get_sgtable() function, which is
required to let drivers share buffers allocated by the DMA-mapping
subsystem.

The generic implementation based on virt_to_page() is not suitable for
the ARM dma-mapping subsystem, so ARM-specific get_sgtable() methods are
provided for both the linear and the IOMMU-aware implementations.
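
The IOMMU variant in this patch computes the page count as
PAGE_ALIGN(size) >> PAGE_SHIFT and hands the page array to
sg_alloc_table_from_pages(), which squashes physically contiguous pages into
single scatter-list entries. A rough userspace model of those two steps is
given here; model_page_count(), model_coalesce() and struct model_seg are
hypothetical names, page frame numbers stand in for struct page pointers,
and a 4 KiB page size is assumed.

```c
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  ((size_t)1 << MODEL_PAGE_SHIFT)

/* Pages needed for 'size' bytes: align up to a page, then shift. */
static size_t model_page_count(size_t size)
{
	size_t aligned = (size + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1);

	return aligned >> MODEL_PAGE_SHIFT;
}

/* One scatter-list segment: a run of physically contiguous pages. */
struct model_seg {
	unsigned long first_pfn;
	size_t npages;
};

/*
 * Squash runs of consecutive page frame numbers into segments.
 * Returns the number of segments written, or 0 if 'max_segs' is too small.
 */
static size_t model_coalesce(const unsigned long *pfns, size_t count,
			     struct model_seg *segs, size_t max_segs)
{
	size_t nsegs = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		/* Extend the current segment if this page follows it directly. */
		if (nsegs > 0 &&
		    segs[nsegs - 1].first_pfn + segs[nsegs - 1].npages == pfns[i]) {
			segs[nsegs - 1].npages++;
			continue;
		}
		if (nsegs == max_segs)
			return 0;
		segs[nsegs].first_pfn = pfns[i];
		segs[nsegs].npages = 1;
		nsegs++;
	}
	return nsegs;
}
```

For a buffer backed by scattered pages this yields several segments, while
the non-IOMMU arm_dma_get_sgtable() always describes the buffer with exactly
one entry, since it assumes a physically contiguous allocation.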

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 arch/arm/common/dmabounce.c        |    1 +
 arch/arm/include/asm/dma-mapping.h |    3 +++
 arch/arm/mm/dma-mapping.c          |   31 +++++++++++++++++++++++++++++++
 3 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c
index 9d7eb53..1486124 100644
--- a/arch/arm/common/dmabounce.c
+++ b/arch/arm/common/dmabounce.c
@@ -452,6 +452,7 @@ static struct dma_map_ops dmabounce_ops = {
 	.alloc			= arm_dma_alloc,
 	.free			= arm_dma_free,
 	.mmap			= arm_dma_mmap,
+	.get_sgtable		= arm_dma_get_sgtable,
 	.map_page		= dmabounce_map_page,
 	.unmap_page		= dmabounce_unmap_page,
 	.sync_single_for_cpu	= dmabounce_sync_for_cpu,
diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 80777d87..804bf65 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -280,6 +280,9 @@ extern void arm_dma_sync_sg_for_cpu(struct device *, struct scatterlist *, int,
 		enum dma_data_direction);
 extern void arm_dma_sync_sg_for_device(struct device *, struct scatterlist *, int,
 		enum dma_data_direction);
+extern int arm_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
+		void *cpu_addr, dma_addr_t dma_addr, size_t size,
+		struct dma_attrs *attrs);
 
 #endif /* __KERNEL__ */
 #endif
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 5d8b8b2..3840997 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -125,6 +125,7 @@ struct dma_map_ops arm_dma_ops = {
 	.alloc			= arm_dma_alloc,
 	.free			= arm_dma_free,
 	.mmap			= arm_dma_mmap,
+	.get_sgtable		= arm_dma_get_sgtable,
 	.map_page		= arm_dma_map_page,
 	.unmap_page		= arm_dma_unmap_page,
 	.map_sg			= arm_dma_map_sg,
@@ -659,6 +660,21 @@ void arm_dma_free(struct device *dev, size_t size, void *cpu_addr,
 	}
 }
 
+int arm_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
+		 void *cpu_addr, dma_addr_t handle, size_t size,
+		 struct dma_attrs *attrs)
+{
+	struct page *page = pfn_to_page(dma_to_pfn(dev, handle));
+	int ret;
+
+	ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
+	if (unlikely(ret))
+		return ret;
+
+	sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
+	return 0;
+}
+
 static void dma_cache_maint_page(struct page *page, unsigned long offset,
 	size_t size, enum dma_data_direction dir,
 	void (*op)(const void *, size_t, int))
@@ -1171,6 +1187,20 @@ void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
 	__iommu_free_buffer(dev, pages, size);
 }
 
+static int arm_iommu_get_sgtable(struct device *dev, struct sg_table *sgt,
+				 void *cpu_addr, dma_addr_t dma_addr,
+				 size_t size, struct dma_attrs *attrs)
+{
+	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	struct page **pages = __iommu_get_pages(cpu_addr, attrs);
+
+	if (!pages)
+		return -ENXIO;
+
+	return sg_alloc_table_from_pages(sgt, pages, count, 0, size,
+					 GFP_KERNEL);
+}
+
 /*
  * Map a part of the scatter-gather list into contiguous io address space
  */
@@ -1430,6 +1460,7 @@ struct dma_map_ops iommu_ops = {
 	.alloc		= arm_iommu_alloc_attrs,
 	.free		= arm_iommu_free_attrs,
 	.mmap		= arm_iommu_mmap_attrs,
+	.get_sgtable	= arm_iommu_get_sgtable,
 
 	.map_page		= arm_iommu_map_page,
 	.unmap_page		= arm_iommu_unmap_page,
-- 
1.7.1.569.g6f426


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCHv2 5/6] common: DMA-mapping: add DMA_ATTR_SKIP_CPU_SYNC attribute
  2012-06-13 11:50 ` Marek Szyprowski
  (?)
@ 2012-06-13 11:50   ` Marek Szyprowski
  -1 siblings, 0 replies; 45+ messages in thread
From: Marek Szyprowski @ 2012-06-13 11:50 UTC (permalink / raw)
  To: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch, linux-kernel
  Cc: Marek Szyprowski, Kyungmin Park, Arnd Bergmann,
	Russell King - ARM Linux, Chunsang Jeong, Krishna Reddy,
	Benjamin Herrenschmidt, Konrad Rzeszutek Wilk, Hiroshi Doyu,
	Subash Patel, Sumit Semwal, Abhinav Kochhar, Tomasz Stanislawski

This patch adds the DMA_ATTR_SKIP_CPU_SYNC attribute to the DMA-mapping
subsystem.

By default, the dma_map_{single,page,sg} family of functions transfers a
given buffer from the CPU domain to the device domain. Some advanced use
cases might require sharing a buffer between more than one device. This
requires a separate mapping for each device and is usually done by
calling dma_map_{single,page,sg} more than once for the given buffer,
with the device pointer of each device taking part in the buffer
sharing. The first call transfers the buffer from the 'CPU' domain to
the 'device' domain, which synchronizes the CPU caches for the given
region (usually this means that the cache is flushed or invalidated,
depending on the DMA direction). However, subsequent calls to
dma_map_{single,page,sg}() for the other devices will perform exactly
the same synchronization operation on the CPU cache. CPU cache
synchronization might be a time-consuming operation, especially if the
buffers are large, so it is highly recommended to avoid it if possible.
DMA_ATTR_SKIP_CPU_SYNC allows platform code to skip synchronization of
the CPU cache for the given buffer, assuming that it has already been
transferred to the 'device' domain. This attribute can also be used with
the dma_unmap_{single,page,sg} family of functions to force the buffer
to stay in the device domain after releasing a mapping for it. Use this
attribute with care!
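
The saving described above can be made concrete with a small userspace
model that simply counts simulated cache maintenance operations. It is a
sketch under stated assumptions: model_dma_map() and model_share_buffer()
are hypothetical names, and the counter stands in for the real cache flush
or invalidate that a platform would perform on each mapping call.

```c
#include <stdbool.h>

/* Counts simulated CPU cache maintenance operations. */
static unsigned int model_cache_syncs;

/* Hypothetical model of a map call: syncs the CPU cache unless told to skip. */
static void model_dma_map(int dev_id, bool skip_cpu_sync)
{
	(void)dev_id;		/* the device only selects the address space */
	if (!skip_cpu_sync)
		model_cache_syncs++;
}

/*
 * Map one buffer for 'ndevs' devices. With 'use_skip_attr' set, only the
 * first mapping moves the buffer to the device domain; the remaining
 * mappings skip the redundant cache synchronization.
 */
static unsigned int model_share_buffer(int ndevs, bool use_skip_attr)
{
	int i;

	model_cache_syncs = 0;
	for (i = 0; i < ndevs; i++)
		model_dma_map(i, use_skip_attr && i > 0);
	return model_cache_syncs;
}
```

In a real driver of this kernel generation the attribute would presumably be
set with dma_set_attr(DMA_ATTR_SKIP_CPU_SYNC, &attrs) on a
DEFINE_DMA_ATTRS() object and passed to dma_map_sg_attrs() for every device
after the first; the model only shows why that reduces the number of cache
operations from one per device to one per buffer.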

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 Documentation/DMA-attributes.txt |   24 ++++++++++++++++++++++++
 include/linux/dma-attrs.h        |    1 +
 2 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt
index 725580d..f503090 100644
--- a/Documentation/DMA-attributes.txt
+++ b/Documentation/DMA-attributes.txt
@@ -67,3 +67,27 @@ set on each call.
 Since it is optional for platforms to implement
 DMA_ATTR_NO_KERNEL_MAPPING, those that do not will simply ignore the
 attribute and exhibit default behavior.
+
+DMA_ATTR_SKIP_CPU_SYNC
+----------------------
+
+By default, the dma_map_{single,page,sg} family of functions transfers a
+given buffer from the CPU domain to the device domain. Some advanced use
+cases might require sharing a buffer between more than one device. This
+requires a separate mapping for each device and is usually done by
+calling dma_map_{single,page,sg} more than once for the given buffer,
+with the device pointer of each device taking part in the buffer
+sharing. The first call transfers the buffer from the 'CPU' domain to
+the 'device' domain, which synchronizes the CPU caches for the given
+region (usually this means that the cache is flushed or invalidated,
+depending on the DMA direction). However, subsequent calls to
+dma_map_{single,page,sg}() for the other devices will perform exactly
+the same synchronization operation on the CPU cache. CPU cache
+synchronization might be a time-consuming operation, especially if the
+buffers are large, so it is highly recommended to avoid it if possible.
+DMA_ATTR_SKIP_CPU_SYNC allows platform code to skip synchronization of
+the CPU cache for the given buffer, assuming that it has already been
+transferred to the 'device' domain. This attribute can also be used with
+the dma_unmap_{single,page,sg} family of functions to force the buffer
+to stay in the device domain after releasing a mapping for it. Use this
+attribute with care!
diff --git a/include/linux/dma-attrs.h b/include/linux/dma-attrs.h
index a37c10c..f83f793 100644
--- a/include/linux/dma-attrs.h
+++ b/include/linux/dma-attrs.h
@@ -16,6 +16,7 @@ enum dma_attr {
 	DMA_ATTR_WRITE_COMBINE,
 	DMA_ATTR_NON_CONSISTENT,
 	DMA_ATTR_NO_KERNEL_MAPPING,
+	DMA_ATTR_SKIP_CPU_SYNC,
 	DMA_ATTR_MAX,
 };
 
-- 
1.7.1.569.g6f426
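
For illustration only, and not part of the patch: a minimal userspace
sketch of how an attribute set like the one extended above can be modelled
as a bitmask with one bit per enum value. All model_* names and the enum
ordering are assumptions made for this example; the kernel's own helpers
live in include/linux/dma-attrs.h and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for enum dma_attr; ordering is illustrative. */
enum model_dma_attr {
	MODEL_ATTR_WRITE_COMBINE,
	MODEL_ATTR_NON_CONSISTENT,
	MODEL_ATTR_NO_KERNEL_MAPPING,
	MODEL_ATTR_SKIP_CPU_SYNC,
	MODEL_ATTR_MAX,
};

/* Hypothetical attribute set: one bit per attribute. */
struct model_dma_attrs {
	unsigned long flags;
};

/* Set a single attribute bit in the set. */
static void model_set_attr(enum model_dma_attr attr,
			   struct model_dma_attrs *attrs)
{
	assert(attrs != NULL && attr < MODEL_ATTR_MAX);
	attrs->flags |= 1UL << attr;
}

/* Return 1 if the attribute is set, 0 otherwise. */
static int model_get_attr(enum model_dma_attr attr,
			  const struct model_dma_attrs *attrs)
{
	if (attrs == NULL)
		return 0;	/* no attribute set: default behaviour */
	assert(attr < MODEL_ATTR_MAX);
	return (int)((attrs->flags >> attr) & 1UL);
}
```

In this model a NULL attribute set reads as "nothing set", so callers
that pass no attributes get the default (synchronizing) behaviour.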


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCHv2 6/6] ARM: dma-mapping: add support for DMA_ATTR_SKIP_CPU_SYNC attribute
  2012-06-13 11:50 ` Marek Szyprowski
  (?)
@ 2012-06-13 11:50   ` Marek Szyprowski
  -1 siblings, 0 replies; 45+ messages in thread
From: Marek Szyprowski @ 2012-06-13 11:50 UTC (permalink / raw)
  To: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch, linux-kernel
  Cc: Marek Szyprowski, Kyungmin Park, Arnd Bergmann,
	Russell King - ARM Linux, Chunsang Jeong, Krishna Reddy,
	Benjamin Herrenschmidt, Konrad Rzeszutek Wilk, Hiroshi Doyu,
	Subash Patel, Sumit Semwal, Abhinav Kochhar, Tomasz Stanislawski

This patch adds support for the DMA_ATTR_SKIP_CPU_SYNC attribute to the
dma_(un)map_(single,page,sg) family of functions. It lets DMA mapping
clients create a mapping of a buffer for a given device without performing
CPU cache synchronization. CPU cache synchronization can be skipped for
buffers that are known to be in the 'device' domain already (CPU caches
have already been synchronized, or there are only coherent mappings for
the buffer). This is for advanced users only; please use it with care.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 arch/arm/mm/dma-mapping.c |   20 +++++++++++---------
 1 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 3840997..939cdc2 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -73,7 +73,7 @@ static dma_addr_t arm_dma_map_page(struct device *dev, struct page *page,
 	     unsigned long offset, size_t size, enum dma_data_direction dir,
 	     struct dma_attrs *attrs)
 {
-	if (!arch_is_coherent())
+	if (!arch_is_coherent() && !dma_get_attr(DMA_ATTR_SKIP_CPU_SYNC, attrs))
 		__dma_page_cpu_to_dev(page, offset, size, dir);
 	return pfn_to_dma(dev, page_to_pfn(page)) + offset;
 }
@@ -96,7 +96,7 @@ static void arm_dma_unmap_page(struct device *dev, dma_addr_t handle,
 		size_t size, enum dma_data_direction dir,
 		struct dma_attrs *attrs)
 {
-	if (!arch_is_coherent())
+	if (!arch_is_coherent() && !dma_get_attr(DMA_ATTR_SKIP_CPU_SYNC, attrs))
 		__dma_page_dev_to_cpu(pfn_to_page(dma_to_pfn(dev, handle)),
 				      handle & ~PAGE_MASK, size, dir);
 }
@@ -1206,7 +1206,7 @@ static int arm_iommu_get_sgtable(struct device *dev, struct sg_table *sgt,
  */
 static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
 			  size_t size, dma_addr_t *handle,
-			  enum dma_data_direction dir)
+			  enum dma_data_direction dir, struct dma_attrs *attrs)
 {
 	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
 	dma_addr_t iova, iova_base;
@@ -1225,7 +1225,8 @@ static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
 		phys_addr_t phys = page_to_phys(sg_page(s));
 		unsigned int len = PAGE_ALIGN(s->offset + s->length);
 
-		if (!arch_is_coherent())
+		if (!arch_is_coherent() &&
+		    !dma_get_attr(DMA_ATTR_SKIP_CPU_SYNC, attrs))
 			__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
 
 		ret = iommu_map(mapping->domain, iova, phys, len, 0);
@@ -1272,7 +1273,7 @@ int arm_iommu_map_sg(struct device *dev, struct scatterlist *sg, int nents,
 
 		if (s->offset || (size & ~PAGE_MASK) || size + s->length > max) {
 			if (__map_sg_chunk(dev, start, size, &dma->dma_address,
-			    dir) < 0)
+			    dir, attrs) < 0)
 				goto bad_mapping;
 
 			dma->dma_address += offset;
@@ -1285,7 +1286,7 @@ int arm_iommu_map_sg(struct device *dev, struct scatterlist *sg, int nents,
 		}
 		size += s->length;
 	}
-	if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir) < 0)
+	if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir, attrs) < 0)
 		goto bad_mapping;
 
 	dma->dma_address += offset;
@@ -1319,7 +1320,8 @@ void arm_iommu_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
 		if (sg_dma_len(s))
 			__iommu_remove_mapping(dev, sg_dma_address(s),
 					       sg_dma_len(s));
-		if (!arch_is_coherent())
+		if (!arch_is_coherent() &&
+		    !dma_get_attr(DMA_ATTR_SKIP_CPU_SYNC, attrs))
 			__dma_page_dev_to_cpu(sg_page(s), s->offset,
 					      s->length, dir);
 	}
@@ -1381,7 +1383,7 @@ static dma_addr_t arm_iommu_map_page(struct device *dev, struct page *page,
 	dma_addr_t dma_addr;
 	int ret, len = PAGE_ALIGN(size + offset);
 
-	if (!arch_is_coherent())
+	if (!arch_is_coherent() && !dma_get_attr(DMA_ATTR_SKIP_CPU_SYNC, attrs))
 		__dma_page_cpu_to_dev(page, offset, size, dir);
 
 	dma_addr = __alloc_iova(mapping, len);
@@ -1420,7 +1422,7 @@ static void arm_iommu_unmap_page(struct device *dev, dma_addr_t handle,
 	if (!iova)
 		return;
 
-	if (!arch_is_coherent())
+	if (!arch_is_coherent() && !dma_get_attr(DMA_ATTR_SKIP_CPU_SYNC, attrs))
 		__dma_page_dev_to_cpu(page, offset, size, dir);
 
 	iommu_unmap(mapping->domain, iova, len);
-- 
1.7.1.569.g6f426
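
For illustration only, and not part of the patch: a userspace sketch of
the guard added in the hunks above. It counts how many cache maintenance
passes happen when one buffer is mapped for several devices, with and
without the skip attribute on the second and later mappings. The model_*
names are hypothetical; the "coherent" flag stands in for
arch_is_coherent() and the counter for __dma_page_cpu_to_dev().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical platform state for the model. */
struct model_platform {
	bool coherent;		/* stands in for arch_is_coherent() */
	unsigned int cpu_syncs;	/* cache maintenance passes performed */
};

/* Same shape as the patched condition: sync unless coherent or skipped. */
static void model_map_page(struct model_platform *p, bool skip_cpu_sync)
{
	if (!p->coherent && !skip_cpu_sync)
		p->cpu_syncs++;
}

/*
 * Map one buffer for ndev devices. When use_skip_attr is true, only the
 * first mapping synchronizes the CPU cache; later ones pass the skip flag.
 * Returns the total number of synchronization passes performed so far.
 */
static unsigned int model_share_buffer(struct model_platform *p,
				       size_t ndev, bool use_skip_attr)
{
	size_t i;

	assert(p != NULL);
	for (i = 0; i < ndev; i++)
		model_map_page(p, use_skip_attr && i > 0);
	return p->cpu_syncs;
}
```

The sketch assumes the caller knows the buffer is already in the
'device' domain after the first mapping; if the CPU touches the buffer
in between, skipping the synchronization would be incorrect.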


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCHv2 0/6] ARM: DMA-mapping: new extensions for buffer sharing
  2012-06-13 11:50 ` Marek Szyprowski
  (?)
@ 2012-06-13 14:12   ` Konrad Rzeszutek Wilk
  -1 siblings, 0 replies; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-06-13 14:12 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-kernel, Kyungmin Park, Arnd Bergmann,
	Russell King - ARM Linux, Chunsang Jeong, Krishna Reddy,
	Benjamin Herrenschmidt, Hiroshi Doyu, Subash Patel, Sumit Semwal,
	Abhinav Kochhar, Tomasz Stanislawski

On Wed, Jun 13, 2012 at 01:50:12PM +0200, Marek Szyprowski wrote:
> Hello,
> 
> This is an updated version of the patch series introducing new
> features to the DMA mapping subsystem that let drivers share allocated
> buffers (preferably using the recently introduced dma_buf framework)
> easily and efficiently.
> 
> The first extension is DMA_ATTR_NO_KERNEL_MAPPING attribute. It is
> intended for use with dma_{alloc, mmap, free}_attrs functions. It can be
> used to notify dma-mapping core that the driver will not use kernel
> mapping for the allocated buffer at all, so the core can skip creating
> it. This saves precious kernel virtual address space. Such a buffer can
> be accessed from userspace after calling dma_mmap_attrs() for it (a
> typical use case for multimedia buffers). The value returned by
> dma_alloc_attrs() with this attribute should be considered a DMA
> cookie, which needs to be passed to the dma_mmap_attrs() and
> dma_free_attrs() functions.
> 
> The second extension is required to let drivers share the buffers
> allocated by the DMA-mapping subsystem. Right now the driver gets a DMA
> address of the allocated buffer and the kernel virtual mapping for it.
> If it wants to share it with another device (= map it into that device's
> DMA address space), it usually hacks around kernel virtual addresses to
> get pointers to pages, or assumes that both devices share the DMA address
> space. Both solutions are just hacks for special cases, which should be
> avoided in the final version of buffer sharing. To solve this issue in a
> generic way, a new call to DMA mapping has been introduced:
> dma_get_sgtable(). It allocates a scatter-list which describes the
> allocated buffer and lets the driver(s) use it with other device(s) by
> calling dma_map_sg() on it.

What about cases where the driver wants to share the buffer but there
are multiple IOMMUs, so that the DMA address returned initially would be
different on the other IOMMUs? Would the driver have to figure this out,
or would the DMA/IOMMU implementation be in charge of that?

And what about IOMMUs that don't implement DMA_ATTR_NO_KERNEL_MAPPING?
Can they just ignore it and do what they did before? (I presume yes.)

> 
> The third extension solves the performance issues which we observed with
> some advanced buffer sharing use cases, which require creating a DMA
> mapping for the same memory buffer for more than one device. From the
> DMA-mapping perspective this requires calling one of the
> dma_map_{page,single,sg} functions for the given memory buffer several
> times, once for each of the devices. Each dma_map_* call performs CPU
> cache synchronization, which might be a time-consuming operation,
> especially when the buffers are large. We would like to avoid any useless
> and time-consuming operations, so that was the main reason for
> introducing another attribute for the DMA-mapping subsystem:
> DMA_ATTR_SKIP_CPU_SYNC, which lets the dma-mapping core skip CPU cache
> synchronization in certain cases.
> 
> The proposed patches have been rebased on the latest Linux kernel
> v3.5-rc2 with 'ARM: replace custom consistent dma region with vmalloc'
> patches applied (for more information, please refer to the 
> http://www.spinics.net/lists/arm-kernel/msg179202.html thread).
> 
> The patches together with all dependences are also available on the
> following GIT branch:
> 
> git://git.linaro.org/people/mszyprowski/linux-dma-mapping.git 3.5-rc2-dma-ext-v2
> 
> Best regards
> Marek Szyprowski
> Samsung Poland R&D Center
> 
> Changelog:
> 
> v2:
> - rebased onto v3.5-rc2 and adapted for CMA and dma-mapping changes
> - renamed dma_get_sgtable() to dma_get_sgtable_attrs() to match the convention
>   of the other dma-mapping calls with attributes
> - added generic fallback function for dma_get_sgtable() for architectures with
>   simple dma-mapping implementations
> 
> v1: http://thread.gmane.org/gmane.linux.kernel.mm/78644
>     http://thread.gmane.org/gmane.linux.kernel.cross-arch/14435 (part 2)
> - initial version
> 
> Patch summary:
> 
> Marek Szyprowski (6):
>   common: DMA-mapping: add DMA_ATTR_NO_KERNEL_MAPPING attribute
>   ARM: dma-mapping: add support for DMA_ATTR_NO_KERNEL_MAPPING
>     attribute
>   common: dma-mapping: introduce dma_get_sgtable() function
>   ARM: dma-mapping: add support for dma_get_sgtable()
>   common: DMA-mapping: add DMA_ATTR_SKIP_CPU_SYNC attribute
>   ARM: dma-mapping: add support for DMA_ATTR_SKIP_CPU_SYNC attribute
> 
>  Documentation/DMA-attributes.txt         |   42 ++++++++++++++++++
>  arch/arm/common/dmabounce.c              |    1 +
>  arch/arm/include/asm/dma-mapping.h       |    3 +
>  arch/arm/mm/dma-mapping.c                |   69 ++++++++++++++++++++++++------
>  drivers/base/dma-mapping.c               |   18 ++++++++
>  include/asm-generic/dma-mapping-common.h |   18 ++++++++
>  include/linux/dma-attrs.h                |    2 +
>  include/linux/dma-mapping.h              |    3 +
>  8 files changed, 142 insertions(+), 14 deletions(-)
> 
> -- 
> 1.7.1.569.g6f426
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCHv2 0/6] ARM: DMA-mapping: new extensions for buffer sharing
@ 2012-06-13 14:12   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-06-13 14:12 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-kernel, Kyungmin Park, Arnd Bergmann,
	Russell King - ARM Linux, Chunsang Jeong, Krishna Reddy,
	Benjamin Herrenschmidt, Hiroshi Doyu, Subash Patel, Sumit Semwal,
	Abhinav Kochhar, Tomasz Stanislawski

On Wed, Jun 13, 2012 at 01:50:12PM +0200, Marek Szyprowski wrote:
> Hello,
> 
> This is an updated version of the patch series introducing new
> features to the DMA mapping subsystem that let drivers share allocated
> buffers (preferably using the recently introduced dma_buf framework)
> easily and efficiently.
> 
> The first extension is DMA_ATTR_NO_KERNEL_MAPPING attribute. It is
> intended for use with dma_{alloc, mmap, free}_attrs functions. It can be
> used to notify dma-mapping core that the driver will not use kernel
> mapping for the allocated buffer at all, so the core can skip creating
> it. This saves precious kernel virtual address space. Such a buffer can
> be accessed from userspace after calling dma_mmap_attrs() for it (a
> typical use case for multimedia buffers). The value returned by
> dma_alloc_attrs() with this attribute should be considered a DMA
> cookie, which needs to be passed to the dma_mmap_attrs() and
> dma_free_attrs() functions.
> 
> The second extension is required to let drivers share the buffers
> allocated by the DMA-mapping subsystem. Right now the driver gets a DMA
> address of the allocated buffer and the kernel virtual mapping for it.
> If it wants to share it with another device (= map it into that device's
> DMA address space), it usually hacks around kernel virtual addresses to
> get pointers to pages, or assumes that both devices share the DMA address
> space. Both solutions are just hacks for special cases, which should be
> avoided in the final version of buffer sharing. To solve this issue in a
> generic way, a new call to DMA mapping has been introduced:
> dma_get_sgtable(). It allocates a scatter-list which describes the
> allocated buffer and lets the driver(s) use it with other device(s) by
> calling dma_map_sg() on it.

What about the cases where the driver wants to share the buffer but there
are multiple IOMMUs? So the DMA address returned initially would be
different on the other IOMMUs? Would the driver have to figure this out
or would the DMA/IOMMU implementation be in charge of that?

And what about IOMMUs that don't do DMA_ATTR_NO_KERNEL_MAPPING?
Can they just ignore it and do what they did before? (I presume yes).
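
To make the multiple-IOMMU question concrete, here is a minimal userspace
sketch in C. It is only a model under stated assumptions, not kernel code:
every mock_* name, the fixed page size and the per-device IOVA counter are
hypothetical stand-ins for dma_get_sgtable() and dma_map_sg(). It shows
the same physical pages being described once per importing device and
receiving a different DMA address behind each device's IOMMU.

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_PAGE_SIZE 4096u
#define MOCK_MAX_ENTS  8u

/* Hypothetical stand-in for one scatterlist entry. */
struct mock_sg_entry {
	uint64_t phys;     /* physical address of the chunk */
	uint32_t length;   /* chunk length in bytes */
	uint64_t dma_addr; /* bus address, filled in at map time */
};

/* Hypothetical stand-in for a scatter-gather table. */
struct mock_sg_table {
	struct mock_sg_entry ents[MOCK_MAX_ENTS];
	unsigned int nents;
};

/* Hypothetical device: its IOMMU hands out addresses from iova_next up. */
struct mock_device {
	uint64_t iova_next;
};

/* Models dma_get_sgtable(): describe an already allocated buffer. */
static int mock_get_sgtable(struct mock_sg_table *sgt,
			    const uint64_t *pages, unsigned int npages)
{
	unsigned int i;

	if (npages > MOCK_MAX_ENTS)
		return -1;
	for (i = 0; i < npages; i++) {
		sgt->ents[i].phys = pages[i];
		sgt->ents[i].length = MOCK_PAGE_SIZE;
		sgt->ents[i].dma_addr = 0;
	}
	sgt->nents = npages;
	return 0;
}

/* Models dma_map_sg(): assign device-specific DMA addresses. */
static unsigned int mock_map_sg(struct mock_device *dev,
				struct mock_sg_table *sgt)
{
	unsigned int i;

	for (i = 0; i < sgt->nents; i++) {
		sgt->ents[i].dma_addr = dev->iova_next;
		dev->iova_next += sgt->ents[i].length;
	}
	return sgt->nents;
}
```

In this model each device gets its own table, so mapping for one device
never overwrites the DMA addresses of another. Whether the driver or the
DMA/IOMMU layer should hide that step is exactly the open question above.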

> 
> The third extension solves the performance issues which we observed with
> some advanced buffer sharing use cases, which require creating a DMA
> mapping for the same memory buffer for more than one device. From the
> DMA-mapping perspective this requires calling one of the
> dma_map_{page,single,sg} functions for the given memory buffer several
> times, once for each of the devices. Each dma_map_* call performs CPU
> cache synchronization, which might be a time-consuming operation,
> especially when the buffers are large. We would like to avoid any
> useless and time-consuming operations, so that was the main reason for
> introducing another attribute for the DMA-mapping subsystem:
> DMA_ATTR_SKIP_CPU_SYNC, which lets the dma-mapping core skip CPU cache
> synchronization in certain cases.
> 
> The proposed patches have been rebased on the latest Linux kernel
> v3.5-rc2 with 'ARM: replace custom consistent dma region with vmalloc'
> patches applied (for more information, please refer to the 
> http://www.spinics.net/lists/arm-kernel/msg179202.html thread).
> 
> The patches together with all dependencies are also available on the
> following GIT branch:
> 
> git://git.linaro.org/people/mszyprowski/linux-dma-mapping.git 3.5-rc2-dma-ext-v2
> 
> Best regards
> Marek Szyprowski
> Samsung Poland R&D Center
> 
> Changelog:
> 
> v2:
> - rebased onto v3.5-rc2 and adapted for CMA and dma-mapping changes
> - renamed dma_get_sgtable() to dma_get_sgtable_attrs() to match the convention
>   of the other dma-mapping calls with attributes
> - added generic fallback function for dma_get_sgtable() for architectures with
>   simple dma-mapping implementations
> 
> v1: http://thread.gmane.org/gmane.linux.kernel.mm/78644
>     http://thread.gmane.org/gmane.linux.kernel.cross-arch/14435 (part 2)
> - initial version
> 
> Patch summary:
> 
> Marek Szyprowski (6):
>   common: DMA-mapping: add DMA_ATTR_NO_KERNEL_MAPPING attribute
>   ARM: dma-mapping: add support for DMA_ATTR_NO_KERNEL_MAPPING
>     attribute
>   common: dma-mapping: introduce dma_get_sgtable() function
>   ARM: dma-mapping: add support for dma_get_sgtable()
>   common: DMA-mapping: add DMA_ATTR_SKIP_CPU_SYNC attribute
>   ARM: dma-mapping: add support for DMA_ATTR_SKIP_CPU_SYNC attribute
> 
>  Documentation/DMA-attributes.txt         |   42 ++++++++++++++++++
>  arch/arm/common/dmabounce.c              |    1 +
>  arch/arm/include/asm/dma-mapping.h       |    3 +
>  arch/arm/mm/dma-mapping.c                |   69 ++++++++++++++++++++++++------
>  drivers/base/dma-mapping.c               |   18 ++++++++
>  include/asm-generic/dma-mapping-common.h |   18 ++++++++
>  include/linux/dma-attrs.h                |    2 +
>  include/linux/dma-mapping.h              |    3 +
>  8 files changed, 142 insertions(+), 14 deletions(-)
> 
> -- 
> 1.7.1.569.g6f426
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Linaro-mm-sig] [PATCHv2 5/6] common: DMA-mapping: add DMA_ATTR_SKIP_CPU_SYNC attribute
  2012-06-13 11:50   ` Marek Szyprowski
  (?)
@ 2012-06-13 18:45     ` Daniel Vetter
  -1 siblings, 0 replies; 45+ messages in thread
From: Daniel Vetter @ 2012-06-13 18:45 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-kernel, Abhinav Kochhar, Russell King - ARM Linux,
	Arnd Bergmann, Konrad Rzeszutek Wilk, Benjamin Herrenschmidt,
	Kyungmin Park, Subash Patel

On Wed, Jun 13, 2012 at 01:50:17PM +0200, Marek Szyprowski wrote:
> This patch adds the DMA_ATTR_SKIP_CPU_SYNC attribute to the DMA-mapping
> subsystem.
> 
> By default, the dma_map_{single,page,sg} family of functions transfers a
> given buffer from the CPU domain to the device domain. Some advanced use
> cases might require sharing a buffer between more than one device. This
> requires a mapping created separately for each device and is usually
> done by calling a dma_map_{single,page,sg} function more than once for
> the given buffer, with the device pointer of each device taking part in
> the buffer sharing. The first call transfers the buffer from the 'CPU'
> domain to the 'device' domain, which synchronizes CPU caches for the
> given region (usually this means that the cache is flushed or
> invalidated depending on the DMA direction). However, subsequent calls
> to dma_map_{single,page,sg}() for other devices will perform exactly the
> same synchronization operation on the CPU cache. CPU cache
> synchronization might be a time-consuming operation, especially if the
> buffers are large, so it is highly recommended to avoid it if possible.
> DMA_ATTR_SKIP_CPU_SYNC allows platform code to skip synchronization of
> the CPU cache for the given buffer, assuming that it has already been
> transferred to the 'device' domain. This attribute can also be used with
> the dma_unmap_{single,page,sg} family of functions to force the buffer
> to stay in the device domain after releasing a mapping for it. Use this
> attribute with care!
> 
> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>

Curious question: What's the use case for this? Is this just to work
around the fact that dma-buf at the moment doesn't support streaming DMA
in a way that would let us optimize all of this (and keep the mappings
around)? Or is there a different use case that I don't see?
-Daniel
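
A rough way to see what the attribute would save, written as a userspace
model rather than kernel code. All mock_* names are hypothetical, and the
model assumes one full cache maintenance pass per mapping call, as the
commit message describes. With the skip flag, only the first mapping of a
shared buffer pays for the cache synchronization.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a shared buffer; not a kernel structure. */
struct mock_buffer {
	size_t size;
	unsigned int cache_syncs; /* simulated cache maintenance passes */
};

/* Stand-in for one dma_map_*() call made on behalf of one device. */
static void mock_dma_map(struct mock_buffer *buf, bool skip_cpu_sync)
{
	if (!skip_cpu_sync)
		buf->cache_syncs++; /* models a flush/invalidate of the region */
}

/*
 * Map one buffer for ndev devices. When use_skip_attr is set, every
 * mapping after the first one skips the cache synchronization, because
 * the buffer is assumed to be in the 'device' domain already.
 * Returns the number of cache synchronization passes performed.
 */
static unsigned int mock_share_buffer(struct mock_buffer *buf,
				      unsigned int ndev, bool use_skip_attr)
{
	unsigned int i;

	buf->cache_syncs = 0;
	for (i = 0; i < ndev; i++)
		mock_dma_map(buf, use_skip_attr && i > 0);
	return buf->cache_syncs;
}
```

For three devices the model performs three passes without the flag and
one pass with it. The flag is only safe here because nothing touches the
buffer from the CPU between the mappings, which is the "use with care"
condition from the commit message.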

> ---
>  Documentation/DMA-attributes.txt |   24 ++++++++++++++++++++++++
>  include/linux/dma-attrs.h        |    1 +
>  2 files changed, 25 insertions(+), 0 deletions(-)
> 
> diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt
> index 725580d..f503090 100644
> --- a/Documentation/DMA-attributes.txt
> +++ b/Documentation/DMA-attributes.txt
> @@ -67,3 +67,27 @@ set on each call.
>  Since it is optional for platforms to implement
>  DMA_ATTR_NO_KERNEL_MAPPING, those that do not will simply ignore the
>  attribute and exhibit default behavior.
> +
> +DMA_ATTR_SKIP_CPU_SYNC
> +----------------------
> +
> +By default dma_map_{single,page,sg} functions family transfer a given
> +buffer from CPU domain to device domain. Some advanced use cases might
> +require sharing a buffer between more than one device. This requires
> +having a mapping created separately for each device and is usually
> +performed by calling dma_map_{single,page,sg} function more than once
> +for the given buffer with device pointer to each device taking part in
> +the buffer sharing. The first call transfers a buffer from 'CPU' domain
> +to 'device' domain, which synchronizes CPU caches for the given region
> +(usually it means that the cache has been flushed or invalidated
> +depending on the dma direction). However, next calls to
> +dma_map_{single,page,sg}() for other devices will perform exactly the
> +same synchronization operation on the CPU cache. CPU cache synchronization
> +might be a time consuming operation, especially if the buffers are
> +large, so it is highly recommended to avoid it if possible.
> +DMA_ATTR_SKIP_CPU_SYNC allows platform code to skip synchronization of
> +the CPU cache for the given buffer assuming that it has been already
> +transferred to 'device' domain. This attribute can be also used for
> +dma_unmap_{single,page,sg} functions family to force buffer to stay in
> +device domain after releasing a mapping for it. Use this attribute with
> +care!
> diff --git a/include/linux/dma-attrs.h b/include/linux/dma-attrs.h
> index a37c10c..f83f793 100644
> --- a/include/linux/dma-attrs.h
> +++ b/include/linux/dma-attrs.h
> @@ -16,6 +16,7 @@ enum dma_attr {
>  	DMA_ATTR_WRITE_COMBINE,
>  	DMA_ATTR_NON_CONSISTENT,
>  	DMA_ATTR_NO_KERNEL_MAPPING,
> +	DMA_ATTR_SKIP_CPU_SYNC,
>  	DMA_ATTR_MAX,
>  };
>  
> -- 
> 1.7.1.569.g6f426
> 
> 
> _______________________________________________
> Linaro-mm-sig mailing list
> Linaro-mm-sig@lists.linaro.org
> http://lists.linaro.org/mailman/listinfo/linaro-mm-sig

-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Linaro-mm-sig] [PATCHv2 1/6] common: DMA-mapping: add DMA_ATTR_NO_KERNEL_MAPPING attribute
  2012-06-13 11:50   ` Marek Szyprowski
  (?)
@ 2012-06-13 18:52     ` Daniel Vetter
  -1 siblings, 0 replies; 45+ messages in thread
From: Daniel Vetter @ 2012-06-13 18:52 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-kernel, Abhinav Kochhar, Russell King - ARM Linux,
	Arnd Bergmann, Konrad Rzeszutek Wilk, Benjamin Herrenschmidt,
	Kyungmin Park, Subash Patel

On Wed, Jun 13, 2012 at 01:50:13PM +0200, Marek Szyprowski wrote:
> This patch adds the DMA_ATTR_NO_KERNEL_MAPPING attribute, which lets the
> platform avoid creating a kernel virtual mapping for the allocated
> buffer. On some architectures creating such a mapping is a non-trivial
> task and consumes very limited resources (like kernel virtual address
> space or DMA consistent address space). Buffers allocated with this
> attribute can only be passed to user space by calling dma_mmap_attrs().
> 
> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>

Looks like a nice little extension to support dma-buf for the common case,
so:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

One question is whether we should go right ahead and add kmap support for
this, too (with a default implementation that simply returns a pointer to
the coherent & contiguous DMA memory), but I guess that can wait until a
use case pops up.
-Daniel
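
For readers new to the cookie contract described in this patch, a small
userspace model may help. It is a sketch under assumptions, not the
kernel implementation: the mock_* functions and struct mock_alloc are
hypothetical, and the attribute mismatch check in mock_dma_free() is an
illustration of the "same attribute on each call" rule rather than
something the patch itself enforces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping hidden behind the opaque cookie. */
struct mock_alloc {
	size_t size;
	bool no_kernel_mapping;
	void *kernel_vaddr; /* NULL when no kernel mapping was created */
};

/* Models dma_alloc_attrs(): returns an opaque cookie, or NULL on failure. */
static void *mock_dma_alloc(size_t size, bool no_kernel_mapping)
{
	struct mock_alloc *a = malloc(sizeof(*a));

	if (!a)
		return NULL;
	a->size = size;
	a->no_kernel_mapping = no_kernel_mapping;
	a->kernel_vaddr = no_kernel_mapping ? NULL : malloc(size);
	if (!no_kernel_mapping && !a->kernel_vaddr) {
		free(a);
		return NULL;
	}
	return a;
}

/* Reports the kernel mapping behind a cookie, if the model created one. */
static void *mock_kernel_vaddr(void *cookie)
{
	return ((struct mock_alloc *)cookie)->kernel_vaddr;
}

/*
 * Models dma_free_attrs(): the caller passes the cookie back together
 * with the same attribute used at allocation time.
 * Returns 0 on success, -1 if the attribute does not match.
 */
static int mock_dma_free(void *cookie, bool no_kernel_mapping)
{
	struct mock_alloc *a = cookie;

	if (a->no_kernel_mapping != no_kernel_mapping)
		return -1;
	free(a->kernel_vaddr);
	free(a);
	return 0;
}
```

The point of the model is that the caller never dereferences the cookie
itself; it only hands it back to the mmap and free paths.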

> ---
>  Documentation/DMA-attributes.txt |   18 ++++++++++++++++++
>  include/linux/dma-attrs.h        |    1 +
>  2 files changed, 19 insertions(+), 0 deletions(-)
> 
> diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt
> index 5c72eed..725580d 100644
> --- a/Documentation/DMA-attributes.txt
> +++ b/Documentation/DMA-attributes.txt
> @@ -49,3 +49,21 @@ DMA_ATTR_NON_CONSISTENT lets the platform to choose to return either
>  consistent or non-consistent memory as it sees fit.  By using this API,
>  you are guaranteeing to the platform that you have all the correct and
>  necessary sync points for this memory in the driver.
> +
> +DMA_ATTR_NO_KERNEL_MAPPING
> +--------------------------
> +
> +DMA_ATTR_NO_KERNEL_MAPPING lets the platform avoid creating a kernel
> +virtual mapping for the allocated buffer. On some architectures creating
> +such a mapping is a non-trivial task and consumes very limited resources
> +(like kernel virtual address space or dma consistent address space).
> +Buffers allocated with this attribute can only be passed to user space
> +by calling dma_mmap_attrs(). By using this API, you are guaranteeing
> +that you won't dereference the pointer returned by dma_alloc_attrs(). You
> +can treat it as a cookie that must be passed to dma_mmap_attrs() and
> +dma_free_attrs(). Make sure that both of these also get this attribute
> +set on each call.
> +
> +Since it is optional for platforms to implement
> +DMA_ATTR_NO_KERNEL_MAPPING, those that do not will simply ignore the
> +attribute and exhibit default behavior.
> diff --git a/include/linux/dma-attrs.h b/include/linux/dma-attrs.h
> index 547ab56..a37c10c 100644
> --- a/include/linux/dma-attrs.h
> +++ b/include/linux/dma-attrs.h
> @@ -15,6 +15,7 @@ enum dma_attr {
>  	DMA_ATTR_WEAK_ORDERING,
>  	DMA_ATTR_WRITE_COMBINE,
>  	DMA_ATTR_NON_CONSISTENT,
> +	DMA_ATTR_NO_KERNEL_MAPPING,
>  	DMA_ATTR_MAX,
>  };
>  
> -- 
> 1.7.1.569.g6f426

-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [Linaro-mm-sig] [PATCHv2 1/6] common: DMA-mapping: add DMA_ATTR_NO_KERNEL_MAPPING attribute
@ 2012-06-13 18:52     ` Daniel Vetter
  0 siblings, 0 replies; 45+ messages in thread
From: Daniel Vetter @ 2012-06-13 18:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 13, 2012 at 01:50:13PM +0200, Marek Szyprowski wrote:
> This patch adds the DMA_ATTR_NO_KERNEL_MAPPING attribute, which lets the
> platform avoid creating a kernel virtual mapping for the allocated
> buffer. On some architectures creating such a mapping is a non-trivial
> task and consumes very limited resources (like kernel virtual address
> space or dma consistent address space). Buffers allocated with this
> attribute can only be passed to user space by calling dma_mmap_attrs().
> 
> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>

Looks like a nice little extension to support dma-buf for the common case,
so:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

One question is whether we should go right ahead and add kmap support for
this, too (with a default implementation that simply returns a pointer to
the coherent & contiguous dma mem), but I guess that can wait until a
use-case pops up.
-Daniel

> ---
>  Documentation/DMA-attributes.txt |   18 ++++++++++++++++++
>  include/linux/dma-attrs.h        |    1 +
>  2 files changed, 19 insertions(+), 0 deletions(-)
> 
> diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt
> index 5c72eed..725580d 100644
> --- a/Documentation/DMA-attributes.txt
> +++ b/Documentation/DMA-attributes.txt
> @@ -49,3 +49,21 @@ DMA_ATTR_NON_CONSISTENT lets the platform to choose to return either
>  consistent or non-consistent memory as it sees fit.  By using this API,
>  you are guaranteeing to the platform that you have all the correct and
>  necessary sync points for this memory in the driver.
> +
> +DMA_ATTR_NO_KERNEL_MAPPING
> +--------------------------
> +
> +DMA_ATTR_NO_KERNEL_MAPPING lets the platform avoid creating a kernel
> +virtual mapping for the allocated buffer. On some architectures creating
> +such a mapping is a non-trivial task and consumes very limited resources
> +(like kernel virtual address space or dma consistent address space).
> +Buffers allocated with this attribute can only be passed to user space
> +by calling dma_mmap_attrs(). By using this API, you are guaranteeing
> +that you won't dereference the pointer returned by dma_alloc_attrs(). You
> +can treat it as a cookie that must be passed to dma_mmap_attrs() and
> +dma_free_attrs(). Make sure that both of these also get this attribute
> +set on each call.
> +
> +Since it is optional for platforms to implement
> +DMA_ATTR_NO_KERNEL_MAPPING, those that do not will simply ignore the
> +attribute and exhibit default behavior.
> diff --git a/include/linux/dma-attrs.h b/include/linux/dma-attrs.h
> index 547ab56..a37c10c 100644
> --- a/include/linux/dma-attrs.h
> +++ b/include/linux/dma-attrs.h
> @@ -15,6 +15,7 @@ enum dma_attr {
>  	DMA_ATTR_WEAK_ORDERING,
>  	DMA_ATTR_WRITE_COMBINE,
>  	DMA_ATTR_NON_CONSISTENT,
> +	DMA_ATTR_NO_KERNEL_MAPPING,
>  	DMA_ATTR_MAX,
>  };
>  
> -- 
> 1.7.1.569.g6f426
> 
> 
> _______________________________________________
> Linaro-mm-sig mailing list
> Linaro-mm-sig at lists.linaro.org
> http://lists.linaro.org/mailman/listinfo/linaro-mm-sig

-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Linaro-mm-sig] [PATCHv2 3/6] common: dma-mapping: introduce dma_get_sgtable() function
  2012-06-13 11:50   ` Marek Szyprowski
  (?)
@ 2012-06-13 18:52     ` Daniel Vetter
  -1 siblings, 0 replies; 45+ messages in thread
From: Daniel Vetter @ 2012-06-13 18:52 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-kernel, Abhinav Kochhar, Russell King - ARM Linux,
	Arnd Bergmann, Konrad Rzeszutek Wilk, Benjamin Herrenschmidt,
	Kyungmin Park, Subash Patel

On Wed, Jun 13, 2012 at 01:50:15PM +0200, Marek Szyprowski wrote:
> This patch adds the dma_get_sgtable() function, which is required to let
> drivers share buffers allocated by the DMA-mapping subsystem. Right
> now the driver gets a dma address of the allocated buffer and the kernel
> virtual mapping for it. If it wants to share it with another device (= map
> into its dma address space) it usually hacks around kernel virtual
> addresses to get pointers to pages or assumes that both devices share
> the DMA address space. Both solutions are just hacks for special
> cases, which should be avoided in the final version of buffer sharing.
> 
> To solve this issue in a generic way, a new call to DMA mapping has been
> introduced - dma_get_sgtable(). It allocates a scatter-list which
> describes the allocated buffer and lets the driver(s) use it with
> other device(s) by calling dma_map_sg() on it.
> 
> This patch provides a generic implementation based on the virt_to_page()
> call. Architectures which require more sophisticated translation might
> provide their own get_sgtable() methods.
> 
> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [Linaro-mm-sig] [PATCHv2 0/6] ARM: DMA-mapping: new extensions for buffer sharing
  2012-06-13 14:12   ` Konrad Rzeszutek Wilk
  (?)
@ 2012-06-13 19:01     ` Daniel Vetter
  -1 siblings, 0 replies; 45+ messages in thread
From: Daniel Vetter @ 2012-06-13 19:01 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Marek Szyprowski, linux-arch, Abhinav Kochhar,
	Russell King - ARM Linux, Arnd Bergmann, Benjamin Herrenschmidt,
	linux-kernel, Subash Patel, linaro-mm-sig, linux-mm,
	Kyungmin Park, linux-arm-kernel

On Wed, Jun 13, 2012 at 10:12:12AM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Jun 13, 2012 at 01:50:12PM +0200, Marek Szyprowski wrote:
> > Hello,
> > 
> > This is an updated version of the patch series introducing new
> > features to the DMA mapping subsystem to let drivers share the allocated
> > buffers (preferably using the recently introduced dma_buf framework)
> > easily and efficiently.
> > 
> > The first extension is the DMA_ATTR_NO_KERNEL_MAPPING attribute. It is
> > intended for use with the dma_{alloc, mmap, free}_attrs functions. It can
> > be used to notify the dma-mapping core that the driver will not use a
> > kernel mapping for the allocated buffer at all, so the core can skip
> > creating it. This saves precious kernel virtual address space. Such a
> > buffer can be accessed from userspace after calling dma_mmap_attrs() for
> > it (a typical use case for multimedia buffers). The value returned by
> > dma_alloc_attrs() with this attribute should be considered a DMA
> > cookie, which needs to be passed to the dma_mmap_attrs() and
> > dma_free_attrs() functions.
> > 
> > The second extension is required to let drivers share the buffers
> > allocated by the DMA-mapping subsystem. Right now the driver gets a dma
> > address of the allocated buffer and the kernel virtual mapping for it.
> > If it wants to share it with another device (= map into its dma address
> > space) it usually hacks around kernel virtual addresses to get pointers
> > to pages or assumes that both devices share the DMA address space. Both
> > solutions are just hacks for special cases, which should be avoided
> > in the final version of buffer sharing. To solve this issue in a generic
> > way, a new call to DMA mapping has been introduced - dma_get_sgtable().
> > It allocates a scatter-list which describes the allocated buffer and
> > lets the driver(s) use it with other device(s) by calling
> > dma_map_sg() on it.
> 
> What about the cases where the driver wants to share the buffer but there
> are multiple IOMMUs? So the DMA address returned initially would be
> different on the other IOMMUs? Would the driver have to figure this out
> or would the DMA/IOMMU implementation be in charge of that?

You still have to map the allocated sg table into each device address
space, so I think this is all covered. The reason dma-buf specs that the
returned sg list must be mapped into device address space already is to
support special-purpose remapping units that are not handled by the core
dma api.
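
To make the per-device mapping step concrete, here is a small userspace toy
model. It is only a sketch and not kernel code: every toy_* name is made up
for illustration, and toy_map_buffer() merely stands in for what a
dma_map_sg() call would do for one device. It shows that the same buffer
ends up at a different device-visible address behind each IOMMU, and that
each address comes from the mapping call itself rather than from any guess
by the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these names exist in the kernel. */

#define TOY_PAGE_SIZE 4096u

/* One IOMMU (or other remapping unit) with its own I/O address space. */
struct toy_iommu {
	uint64_t next_iova;	/* next free I/O virtual address */
};

/* A buffer described by its page count, standing in for an sg table. */
struct toy_buffer {
	size_t nr_pages;
};

/*
 * Map the buffer into one device's address space and return the
 * device-visible address. This plays the role of a per-device
 * dma_map_sg() call; the allocation policy is a trivial bump allocator.
 */
uint64_t toy_map_buffer(struct toy_iommu *mmu, const struct toy_buffer *buf)
{
	uint64_t iova;

	assert(mmu != NULL && buf != NULL && buf->nr_pages > 0);

	iova = mmu->next_iova;
	mmu->next_iova += (uint64_t)buf->nr_pages * TOY_PAGE_SIZE;
	return iova;
}
```

Calling toy_map_buffer() once per IOMMU for the same toy_buffer gives one
address per device, which is the reason the sg table has to be mapped
separately for every importer.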

> And what about IOMMU's that don't do DMA_ATTR_NO_KERNEL_MAPPING?
> Can they just ignore it and do what they did before ? (I presume yes).
> 
> > 
> > The third extension solves the performance issues which we observed with
> > some advanced buffer sharing use cases, which require creating a dma
> > mapping for the same memory buffer for more than one device. From the
> > DMA-mapping perspective this requires calling one of the
> > dma_map_{page,single,sg} functions for the given memory buffer a few
> > times, once for each of the devices. Each dma_map_* call performs CPU
> > cache synchronization, which might be a time consuming operation,
> > especially when the buffers are large. We would like to avoid any useless
> > and time consuming operations, so that was the main reason for introducing
> > another attribute for the DMA-mapping subsystem: DMA_ATTR_SKIP_CPU_SYNC,
> > which lets the dma-mapping core skip CPU cache synchronization in certain
> > cases.

Ah, here's the use-case I've missed ;-) I'm a bit wary of totally insane
platforms that have additional caches only on the device side, and only
for some devices. Well, tlbs belong to that, but the iommu needs to handle
that anyway.

I think it would be good to add a blurb to the documentation that any
device-side flushing (of tlbs or special caches or whatever) still needs
to happen and that this is only a performance optimization to avoid the
costly cpu cache flushing. This way the dma-buf exporter could keep track
of whether it's 'device-coherent' and set that flag if the cpu caches don't
need to be flushed.
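
A rough userspace sketch of the accounting I have in mind, under the
assumption that an exporter maps one buffer for several devices in a row.
This is a toy model, not kernel code: the toy_* names are invented, and the
counters only stand in for real cache maintenance and device-side flushing.
The point is that the flag removes the repeated cpu cache work while the
device-side work still happens for every mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these names are invented and are not kernel APIs. */

struct toy_stats {
	unsigned int cpu_cache_syncs;	/* expensive CPU cache maintenance */
	unsigned int device_flushes;	/* device-side work (IOMMU tlbs etc.) */
};

/*
 * Model of one dma_map_*() call for one device. The device-side work is
 * always done; only the CPU cache synchronization can be skipped.
 */
void toy_map_for_device(struct toy_stats *st, bool skip_cpu_sync)
{
	assert(st != NULL);

	if (!skip_cpu_sync)
		st->cpu_cache_syncs++;
	st->device_flushes++;
}

/*
 * Map one buffer for nr_devices devices. With use_skip_attr set, only the
 * first mapping synchronizes the CPU caches; later mappings pass the
 * (modelled) skip flag because the caches are already clean.
 */
struct toy_stats toy_share_buffer(size_t nr_devices, bool use_skip_attr)
{
	struct toy_stats st = { 0, 0 };
	size_t i;

	for (i = 0; i < nr_devices; i++)
		toy_map_for_device(&st, use_skip_attr && i > 0);

	return st;
}
```

With three devices the model does three cpu cache syncs without the flag
and one with it, and three device-side flushes in both cases.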

Maybe also make it clear that implementing this bit is optional (like your
doc already mentions for NO_KERNEL_MAPPING).

Yours, Daniel


> > 
> > The proposed patches have been rebased on the latest Linux kernel
> > v3.5-rc2 with 'ARM: replace custom consistent dma region with vmalloc'
> > patches applied (for more information, please refer to the 
> > http://www.spinics.net/lists/arm-kernel/msg179202.html thread).
> > 
> > The patches together with all dependencies are also available on the
> > following GIT branch:
> > 
> > git://git.linaro.org/people/mszyprowski/linux-dma-mapping.git 3.5-rc2-dma-ext-v2
> > 
> > Best regards
> > Marek Szyprowski
> > Samsung Poland R&D Center
> > 
> > Changelog:
> > 
> > v2:
> > - rebased onto v3.5-rc2 and adapted for CMA and dma-mapping changes
> > - renamed dma_get_sgtable() to dma_get_sgtable_attrs() to match the convention
> >   of the other dma-mapping calls with attributes
> > - added generic fallback function for dma_get_sgtable() for architectures with
> >   simple dma-mapping implementations
> > 
> > v1: http://thread.gmane.org/gmane.linux.kernel.mm/78644
> >     http://thread.gmane.org/gmane.linux.kernel.cross-arch/14435 (part 2)
> > - initial version
> > 
> > Patch summary:
> > 
> > Marek Szyprowski (6):
> >   common: DMA-mapping: add DMA_ATTR_NO_KERNEL_MAPPING attribute
> >   ARM: dma-mapping: add support for DMA_ATTR_NO_KERNEL_MAPPING
> >     attribute
> >   common: dma-mapping: introduce dma_get_sgtable() function
> >   ARM: dma-mapping: add support for dma_get_sgtable()
> >   common: DMA-mapping: add DMA_ATTR_SKIP_CPU_SYNC attribute
> >   ARM: dma-mapping: add support for DMA_ATTR_SKIP_CPU_SYNC attribute
> > 
> >  Documentation/DMA-attributes.txt         |   42 ++++++++++++++++++
> >  arch/arm/common/dmabounce.c              |    1 +
> >  arch/arm/include/asm/dma-mapping.h       |    3 +
> >  arch/arm/mm/dma-mapping.c                |   69 ++++++++++++++++++++++++------
> >  drivers/base/dma-mapping.c               |   18 ++++++++
> >  include/asm-generic/dma-mapping-common.h |   18 ++++++++
> >  include/linux/dma-attrs.h                |    2 +
> >  include/linux/dma-mapping.h              |    3 +
> >  8 files changed, 142 insertions(+), 14 deletions(-)
> > 
> > -- 
> > 1.7.1.569.g6f426
> > 

-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCHv2 0/6] ARM: DMA-mapping: new extensions for buffer sharing
  2012-06-13 14:12   ` Konrad Rzeszutek Wilk
  (?)
@ 2012-06-14  8:39     ` Marek Szyprowski
  -1 siblings, 0 replies; 45+ messages in thread
From: Marek Szyprowski @ 2012-06-14  8:39 UTC (permalink / raw)
  To: 'Konrad Rzeszutek Wilk'
  Cc: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-kernel, 'Kyungmin Park', 'Arnd Bergmann',
	'Russell King - ARM Linux', 'Chunsang Jeong',
	'Krishna Reddy', 'Benjamin Herrenschmidt',
	'Hiroshi Doyu', 'Subash Patel',
	'Sumit Semwal', 'Abhinav Kochhar',
	Tomasz Stanislawski

Hi Konrad,

On Wednesday, June 13, 2012 4:12 PM Konrad Rzeszutek Wilk wrote:

> On Wed, Jun 13, 2012 at 01:50:12PM +0200, Marek Szyprowski wrote:
> > Hello,
> >
> > This is an updated version of the patch series introducing a new
> > features to DMA mapping subsystem to let drivers share the allocated
> > buffers (preferably using recently introduced dma_buf framework) easy
> > and efficient.
> >
> > The first extension is DMA_ATTR_NO_KERNEL_MAPPING attribute. It is
> > intended for use with dma_{alloc, mmap, free}_attrs functions. It can be
> > used to notify dma-mapping core that the driver will not use kernel
> > mapping for the allocated buffer at all, so the core can skip creating
> > it. This saves precious kernel virtual address space. Such buffer can be
> > accessed from userspace, after calling dma_mmap_attrs() for it (a
> > typical use case for multimedia buffers). The value returned by
> > dma_alloc_attrs() with this attribute should be considered as a DMA
> > cookie, which needs to be passed to dma_mmap_attrs() and
> > dma_free_attrs() functions.
> >
> > The second extension is required to let drivers to share the buffers
> > allocated by DMA-mapping subsystem. Right now the driver gets a dma
> > address of the allocated buffer and the kernel virtual mapping for it.
> > If it wants to share it with other device (= map into its dma address
> > space) it usually hacks around kernel virtual addresses to get pointers
> > to pages or assumes that both devices share the DMA address space. Both
> > solutions are just hacks for the special cases, which should be avoided
> > in the final version of buffer sharing. To solve this issue in a generic
> > way, a new call to DMA mapping has been introduced - dma_get_sgtable().
> > It allocates a scatter-list which describes the allocated buffer and
> > lets the driver(s) to use it with other device(s) by calling
> > dma_map_sg() on it.
> 
> What about the cases where the driver wants to share the buffer but there
> are multiple IOMMUs? So the DMA address returned initially would be
> different on the other IOMMUs? Would the driver have to figure this out
> or would the DMA/IOMMU implementation be in charge of that?

This extension exists exactly to solve this problem. The driver(s) don't need to
be aware of the IOMMU or IOMMUs between the devices which are sharing the buffer.
Using dma_get_sgtable() one can get a scatter list describing the buffer allocated
for device1 and then call dma_map_sg() to map that scatter list into device2's dma
area. If there is a device3, one calls dma_get_sgtable() again, gets a second
scatter list, and then maps it to device3. Whether there is a common IOMMU between
those devices or each of them has its own, it doesn't matter - it will be hidden
behind the dma mapping subsystem and the driver should not care about it.
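
To make the flow above concrete, here is a minimal userspace sketch. It is only
a toy model, not kernel code: every toy_* name, the page size and the
per-device address window are assumptions made for illustration.
toy_get_sgtable() stands in for dma_get_sgtable() and toy_map_sg() stands in
for dma_map_sg().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy userspace model. None of the toy_* names are kernel API. */
#define TOY_PAGE_SIZE 4096u
#define TOY_MAX_PAGES 8

/* A device sitting behind its own (hypothetical) IOMMU. */
struct toy_device {
	uint64_t iova_base;	/* start of this device's dma address window */
};

/* Stand-in for a scatter list: one entry per page of the buffer. */
struct toy_sgtable {
	size_t nents;
	uint64_t phys[TOY_MAX_PAGES];	/* physical pages, device independent */
	uint64_t dma[TOY_MAX_PAGES];	/* dma addresses, valid for one device */
};

/* Plays the role of dma_get_sgtable(): describe the buffer by its pages. */
static int toy_get_sgtable(struct toy_sgtable *sgt, const uint64_t *pages,
			   size_t npages)
{
	size_t i;

	assert(sgt && pages);
	if (npages > TOY_MAX_PAGES)
		return -1;
	sgt->nents = npages;
	for (i = 0; i < npages; i++) {
		sgt->phys[i] = pages[i];
		sgt->dma[i] = 0;	/* not mapped for any device yet */
	}
	return 0;
}

/* Plays the role of dma_map_sg(): map the pages into dev's dma window. */
static size_t toy_map_sg(const struct toy_device *dev, struct toy_sgtable *sgt)
{
	size_t i;

	assert(dev && sgt);
	for (i = 0; i < sgt->nents; i++)
		sgt->dma[i] = dev->iova_base + (uint64_t)i * TOY_PAGE_SIZE;
	return sgt->nents;
}
```

In this model the caller builds one table per importing device. The physical
pages are identical in every table, while the dma addresses differ per device,
and the caller never needs to know how many IOMMUs are involved.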

> And what about IOMMU's that don't do DMA_ATTR_NO_KERNEL_MAPPING?
> Can they just ignore it and do what they did before ? (I presume yes).

The main idea behind dma attributes (the beauty of them) is the fact that all
are optional to implement for the platform core. If an attribute makes no sense
for the particular hardware, it can simply be ignored. Attributes can relax some
requirements for dma mapping calls, but if the core ignores them and implements
the calls in the most restrictive way, the driver (client) will still work fine.
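
The "attributes are only hints" rule can be illustrated with a small userspace
sketch. Again this is a toy model under stated assumptions: the toy_* names and
the idea of a per-core bitmask of supported attributes are hypothetical and are
not how any real platform core is structured.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for DMA_ATTR_NO_KERNEL_MAPPING; not the kernel's definition. */
#define TOY_ATTR_NO_KERNEL_MAPPING (1u << 0)

/* A platform core and the set of attributes it chooses to implement. */
struct toy_core {
	unsigned int supported_attrs;
};

struct toy_buffer {
	bool allocated;		/* the allocation itself always succeeds here */
	bool kernel_mapping;	/* true if a kernel virtual mapping was created */
};

/*
 * Allocate a buffer. Attributes the core does not implement are silently
 * dropped, which leaves the most restrictive (always mapped) behaviour.
 */
static struct toy_buffer toy_alloc_attrs(const struct toy_core *core,
					 unsigned int attrs)
{
	struct toy_buffer buf;
	unsigned int honoured;

	assert(core);
	honoured = attrs & core->supported_attrs;
	buf.allocated = true;
	buf.kernel_mapping = !(honoured & TOY_ATTR_NO_KERNEL_MAPPING);
	return buf;
}
```

A caller that passes the attribute gets a usable buffer from both kinds of
core. The only difference is whether the kernel mapping was skipped, which is
why a driver must not rely on an attribute being honoured.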

> (snipped)

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center




^ permalink raw reply	[flat|nested] 45+ messages in thread


* RE: [Linaro-mm-sig] [PATCHv2 0/6] ARM: DMA-mapping: new extensions for buffer sharing
  2012-06-13 19:01     ` Daniel Vetter
  (?)
@ 2012-06-14  8:47       ` Marek Szyprowski
  -1 siblings, 0 replies; 45+ messages in thread
From: Marek Szyprowski @ 2012-06-14  8:47 UTC (permalink / raw)
  To: 'Daniel Vetter', 'Konrad Rzeszutek Wilk'
  Cc: linux-arch, 'Abhinav Kochhar',
	'Russell King - ARM Linux', 'Arnd Bergmann',
	'Benjamin Herrenschmidt',
	linux-kernel, 'Subash Patel',
	linaro-mm-sig, linux-mm, 'Kyungmin Park',
	linux-arm-kernel

Hello,

On Wednesday, June 13, 2012 9:02 PM Daniel Vetter wrote:

> On Wed, Jun 13, 2012 at 10:12:12AM -0400, Konrad Rzeszutek Wilk wrote:
> > On Wed, Jun 13, 2012 at 01:50:12PM +0200, Marek Szyprowski wrote:

> (snipped)

> > > The third extension solves the performance issues which we observed with
> > > some advanced buffer sharing use cases, which require creating a dma
> > > mapping for the same memory buffer for more than one device. From the
> > > DMA-mapping perspective this requires to call one of the
> > > dma_map_{page,single,sg} function for the given memory buffer a few
> > > times, for each of the devices. Each dma_map_* call performs CPU cache
> > > synchronization, what might be a time consuming operation, especially
> > > when the buffers are large. We would like to avoid any useless and time
> > > consuming operations, so that was the main reason for introducing
> > > another attribute for DMA-mapping subsystem: DMA_ATTR_SKIP_CPU_SYNC,
> > > which lets dma-mapping core to skip CPU cache synchronization in certain
> > > cases.
> 
> Ah, here's the use-case I've missed ;-) I'm a bit wary of totally insane
> platforms that have additional caches only on the device side, and only
> for some devices. Well, tlbs belong to that, but the iommu needs to handle
> that anyway.
> 
> I think it would be good to add a blurb to the documentation that any
> device-side flushing (of tlbs or special caches or whatever) still needs
> to happen and that this is only a performance optimization to avoid the
> costly cpu cache flushing. This way the dma-buf exporter could keep track
> of whether it's 'device-coherent' and set that flag if the cpu caches don't
> need to be flushed.
> 
> Maybe also make it clear that implementing this bit is optional (like your
> doc already mentions for NO_KERNEL_MAPPING).

Ok, I can add an additional comment, but support for all dma attributes is
optional (attributes are considered only as hints that might improve performance
for some use cases on some hw platforms).
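
The effect discussed in this sub-thread can be sketched as a toy userspace
model that just counts operations. The toy_* names and the two counters are
assumptions for illustration only. The model assumes a core that honours the
hint, and it keeps the device-side flush unconditional, as Daniel asks above.

```c
#include <assert.h>

/* Toy stand-in for DMA_ATTR_SKIP_CPU_SYNC; not the kernel's definition. */
#define TOY_ATTR_SKIP_CPU_SYNC (1u << 0)

/* Counters for the two kinds of work a mapping call may trigger. */
struct toy_stats {
	unsigned int cpu_cache_syncs;	/* costly on large buffers */
	unsigned int device_flushes;	/* tlbs or device-side caches */
};

/* Map the same buffer for one more device. */
static void toy_map_for_device(struct toy_stats *stats, unsigned int attrs)
{
	assert(stats);
	/* The hint only suppresses the CPU cache maintenance ... */
	if (!(attrs & TOY_ATTR_SKIP_CPU_SYNC))
		stats->cpu_cache_syncs++;
	/* ... device-side flushing still has to happen on every mapping. */
	stats->device_flushes++;
}
```

Mapping one buffer for three devices, with the hint set on the second and third
call, costs a single CPU cache sync in this model but still three device-side
flushes.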

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center




^ permalink raw reply	[flat|nested] 45+ messages in thread


* RE: [Linaro-mm-sig] [PATCHv2 1/6] common: DMA-mapping: add DMA_ATTR_NO_KERNEL_MAPPING attribute
  2012-06-13 18:52     ` Daniel Vetter
  (?)
@ 2012-06-14  9:01       ` Marek Szyprowski
  -1 siblings, 0 replies; 45+ messages in thread
From: Marek Szyprowski @ 2012-06-14  9:01 UTC (permalink / raw)
  To: 'Daniel Vetter'
  Cc: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-kernel, 'Abhinav Kochhar',
	'Russell King - ARM Linux', 'Arnd Bergmann',
	'Konrad Rzeszutek Wilk', 'Benjamin Herrenschmidt',
	'Kyungmin Park', 'Subash Patel'

Hello,

On Wednesday, June 13, 2012 8:52 PM Daniel Vetter wrote:

> On Wed, Jun 13, 2012 at 01:50:13PM +0200, Marek Szyprowski wrote:
> > This patch adds DMA_ATTR_NO_KERNEL_MAPPING attribute which lets the
> > platform to avoid creating a kernel virtual mapping for the allocated
> > buffer. On some architectures creating such mapping is non-trivial task
> > and consumes very limited resources (like kernel virtual address space
> > or dma consistent address space). Buffers allocated with this attribute
> > can be only passed to user space by calling dma_mmap_attrs().
> >
> > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> > Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
> 
> Looks like a nice little extension to support dma-buf for the common case,
> so:
> 
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> One question is whether we should go right ahead and add kmap support for
> this, too (with a default implementation that simply returns a pointer to
> the coherent&contiguous dma mem), but I guess that can wait until a
> use-case pops up.

I will wait with this until there are real use cases. Let's get the
patch into mainline first.

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center



^ permalink raw reply	[flat|nested] 45+ messages in thread


