linux-arch.vger.kernel.org archive mirror
* [PATCHv6 0/7] ARM: DMA-mapping framework redesign
@ 2012-02-10 18:58 Marek Szyprowski
  2012-02-10 18:58 ` Marek Szyprowski
                   ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Marek Szyprowski @ 2012-02-10 18:58 UTC (permalink / raw)
  To: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-samsung-soc, iommu
  Cc: Marek Szyprowski, Kyungmin Park, Arnd Bergmann, Joerg Roedel,
	Russell King - ARM Linux, Shariq Hasnain, Chunsang Jeong,
	Krishna Reddy, KyongHo Cho, Andrzej Pietrasiewicz,
	Benjamin Herrenschmidt

Hello,

This is another update on my work on the DMA-mapping framework redesign
for the ARM architecture. It includes a few minor cleanups and fixes since
the last version, posted at the end of December 2011. This patch series is
now based on the generic, cross-arch dma-mapping redesign patches posted
in the "[PATCH 00/14] DMA-mapping framework redesign preparation"
thread: http://www.spinics.net/lists/linux-sh/msg09777.html

All patches have now been rebased onto the v3.3-rc2 kernel.

All the code has been tested on a Samsung Exynos4 'UniversalC210' board
with the IOMMU driver posted by KyongHo Cho.


History of the development:

v1: (initial version of the DMA-mapping redesign patches):
http://www.spinics.net/lists/linux-mm/msg21241.html

v2:
http://lists.linaro.org/pipermail/linaro-mm-sig/2011-September/000571.html
http://lists.linaro.org/pipermail/linaro-mm-sig/2011-September/000577.html

v3:
http://www.spinics.net/lists/linux-mm/msg25490.html

v4 and v5:
http://www.spinics.net/lists/arm-kernel/msg151147.html
http://www.spinics.net/lists/arm-kernel/msg154889.html

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center


Patch summary:

Marek Szyprowski (7):
  ARM: dma-mapping: remove offset parameter to prepare for generic
    dma_ops
  ARM: dma-mapping: use asm-generic/dma-mapping-common.h
  ARM: dma-mapping: implement dma sg methods on top of any generic dma
    ops
  ARM: dma-mapping: move all dma bounce code to separate dma ops
    structure
  ARM: dma-mapping: remove redundant code and cleanup
  ARM: dma-mapping: use alloc, mmap, free from dma_ops
  ARM: dma-mapping: add support for IOMMU mapper

 arch/arm/Kconfig                   |    9 +
 arch/arm/common/dmabounce.c        |   78 +++-
 arch/arm/include/asm/device.h      |    4 +
 arch/arm/include/asm/dma-iommu.h   |   34 ++
 arch/arm/include/asm/dma-mapping.h |  404 +++++------------
 arch/arm/mm/dma-mapping.c          |  897 ++++++++++++++++++++++++++++++------
 arch/arm/mm/vmregion.h             |    2 +-
 7 files changed, 980 insertions(+), 448 deletions(-)
 create mode 100644 arch/arm/include/asm/dma-iommu.h

-- 
1.7.1.569.g6f426


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCHv6 0/7] ARM: DMA-mapping framework redesign
  2012-02-10 18:58 [PATCHv6 0/7] ARM: DMA-mapping framework redesign Marek Szyprowski
@ 2012-02-10 18:58 ` Marek Szyprowski
       [not found] ` <1328900324-20946-1-git-send-email-m.szyprowski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
  2012-02-10 18:58 ` [PATCHv6 3/7] ARM: dma-mapping: implement dma sg methods on top of any generic dma ops Marek Szyprowski
  2 siblings, 0 replies; 39+ messages in thread
From: Marek Szyprowski @ 2012-02-10 18:58 UTC (permalink / raw)
  To: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-samsung-soc, iommu
  Cc: Marek Szyprowski, Kyungmin Park, Arnd Bergmann, Joerg Roedel,
	Russell King - ARM Linux, Shariq Hasnain, Chunsang Jeong,
	Krishna Reddy, KyongHo Cho, Andrzej Pietrasiewicz,
	Benjamin Herrenschmidt

Hello,

This is another update on my work on the DMA-mapping framework redesign
for the ARM architecture. It includes a few minor cleanups and fixes since
the last version, posted at the end of December 2011. This patch series is
now based on the generic, cross-arch dma-mapping redesign patches posted
in the "[PATCH 00/14] DMA-mapping framework redesign preparation"
thread: http://www.spinics.net/lists/linux-sh/msg09777.html

All patches have now been rebased onto the v3.3-rc2 kernel.

All the code has been tested on a Samsung Exynos4 'UniversalC210' board
with the IOMMU driver posted by KyongHo Cho.


History of the development:

v1: (initial version of the DMA-mapping redesign patches):
http://www.spinics.net/lists/linux-mm/msg21241.html

v2:
http://lists.linaro.org/pipermail/linaro-mm-sig/2011-September/000571.html
http://lists.linaro.org/pipermail/linaro-mm-sig/2011-September/000577.html

v3:
http://www.spinics.net/lists/linux-mm/msg25490.html

v4 and v5:
http://www.spinics.net/lists/arm-kernel/msg151147.html
http://www.spinics.net/lists/arm-kernel/msg154889.html

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center


Patch summary:

Marek Szyprowski (7):
  ARM: dma-mapping: remove offset parameter to prepare for generic
    dma_ops
  ARM: dma-mapping: use asm-generic/dma-mapping-common.h
  ARM: dma-mapping: implement dma sg methods on top of any generic dma
    ops
  ARM: dma-mapping: move all dma bounce code to separate dma ops
    structure
  ARM: dma-mapping: remove redundant code and cleanup
  ARM: dma-mapping: use alloc, mmap, free from dma_ops
  ARM: dma-mapping: add support for IOMMU mapper

 arch/arm/Kconfig                   |    9 +
 arch/arm/common/dmabounce.c        |   78 +++-
 arch/arm/include/asm/device.h      |    4 +
 arch/arm/include/asm/dma-iommu.h   |   34 ++
 arch/arm/include/asm/dma-mapping.h |  404 +++++------------
 arch/arm/mm/dma-mapping.c          |  897 ++++++++++++++++++++++++++++++------
 arch/arm/mm/vmregion.h             |    2 +-
 7 files changed, 980 insertions(+), 448 deletions(-)
 create mode 100644 arch/arm/include/asm/dma-iommu.h

-- 
1.7.1.569.g6f426


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCHv6 1/7] ARM: dma-mapping: remove offset parameter to prepare for generic dma_ops
       [not found] ` <1328900324-20946-1-git-send-email-m.szyprowski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
@ 2012-02-10 18:58   ` Marek Szyprowski
  2012-02-10 18:58     ` Marek Szyprowski
  2012-02-10 18:58   ` [PATCHv6 2/7] ARM: dma-mapping: use asm-generic/dma-mapping-common.h Marek Szyprowski
                     ` (4 subsequent siblings)
  5 siblings, 1 reply; 39+ messages in thread
From: Marek Szyprowski @ 2012-02-10 18:58 UTC (permalink / raw)
  To: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-samsung-soc, iommu
  Cc: Shariq Hasnain, Arnd Bergmann, Benjamin Herrenschmidt,
	Krishna Reddy, Kyungmin Park, Andrzej Pietrasiewicz,
	Russell King - ARM Linux, KyongHo Cho, Chunsang Jeong

This patch removes the need for the offset parameter in the dma bounce
functions. This is required to let the dma-mapping framework on the ARM
architecture use the common, generic dma-mapping helpers.
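
For illustration only (this sketch is not part of the patch and uses
simplified user-space stand-ins for struct safe_buffer and
find_safe_buffer()), the core of the change is that the bounce layer now
matches any address inside a safe buffer and recomputes the offset
itself, so callers simply pass handle + offset:

  #include <stdio.h>

  /* simplified stand-in for struct safe_buffer */
  struct safe_buffer {
  	unsigned long safe_dma_addr;
  	unsigned long size;
  };

  /* stand-in for find_safe_buffer(): match any address that falls
   * inside the buffer, not only its start address */
  static struct safe_buffer *find_buf(struct safe_buffer *b,
  				    unsigned long addr)
  {
  	if (b->safe_dma_addr <= addr && b->safe_dma_addr + b->size > addr)
  		return b;
  	return NULL;
  }

  int main(void)
  {
  	struct safe_buffer b = { .safe_dma_addr = 0x1000, .size = 0x100 };
  	unsigned long addr = 0x1040;	/* i.e. handle + offset */
  	struct safe_buffer *buf = find_buf(&b, addr);

  	if (buf)	/* offset is recovered internally, as in the patch */
  		printf("off = %#lx\n", addr - buf->safe_dma_addr);
  	return 0;
  }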

Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
---
 arch/arm/common/dmabounce.c        |   13 +++++--
 arch/arm/include/asm/dma-mapping.h |   67 +++++++++++++++++------------------
 arch/arm/mm/dma-mapping.c          |    4 +-
 3 files changed, 45 insertions(+), 39 deletions(-)

diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c
index 595ecd29..46b4b8d 100644
--- a/arch/arm/common/dmabounce.c
+++ b/arch/arm/common/dmabounce.c
@@ -173,7 +173,8 @@ find_safe_buffer(struct dmabounce_device_info *device_info, dma_addr_t safe_dma_
 	read_lock_irqsave(&device_info->lock, flags);
 
 	list_for_each_entry(b, &device_info->safe_buffers, node)
-		if (b->safe_dma_addr == safe_dma_addr) {
+		if (b->safe_dma_addr <= safe_dma_addr &&
+		    b->safe_dma_addr + b->size > safe_dma_addr) {
 			rb = b;
 			break;
 		}
@@ -362,9 +363,10 @@ void __dma_unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size,
 EXPORT_SYMBOL(__dma_unmap_page);
 
 int dmabounce_sync_for_cpu(struct device *dev, dma_addr_t addr,
-		unsigned long off, size_t sz, enum dma_data_direction dir)
+		size_t sz, enum dma_data_direction dir)
 {
 	struct safe_buffer *buf;
+	unsigned long off;
 
 	dev_dbg(dev, "%s(dma=%#x,off=%#lx,sz=%zx,dir=%x)\n",
 		__func__, addr, off, sz, dir);
@@ -373,6 +375,8 @@ int dmabounce_sync_for_cpu(struct device *dev, dma_addr_t addr,
 	if (!buf)
 		return 1;
 
+	off = addr - buf->safe_dma_addr;
+
 	BUG_ON(buf->direction != dir);
 
 	dev_dbg(dev, "%s: unsafe buffer %p (dma=%#x) mapped to %p (dma=%#x)\n",
@@ -391,9 +395,10 @@ int dmabounce_sync_for_cpu(struct device *dev, dma_addr_t addr,
 EXPORT_SYMBOL(dmabounce_sync_for_cpu);
 
 int dmabounce_sync_for_device(struct device *dev, dma_addr_t addr,
-		unsigned long off, size_t sz, enum dma_data_direction dir)
+		size_t sz, enum dma_data_direction dir)
 {
 	struct safe_buffer *buf;
+	unsigned long off;
 
 	dev_dbg(dev, "%s(dma=%#x,off=%#lx,sz=%zx,dir=%x)\n",
 		__func__, addr, off, sz, dir);
@@ -402,6 +407,8 @@ int dmabounce_sync_for_device(struct device *dev, dma_addr_t addr,
 	if (!buf)
 		return 1;
 
+	off = addr - buf->safe_dma_addr;
+
 	BUG_ON(buf->direction != dir);
 
 	dev_dbg(dev, "%s: unsafe buffer %p (dma=%#x) mapped to %p (dma=%#x)\n",
diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index cb3b7c9..6bc056c 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -264,19 +264,17 @@ extern void __dma_unmap_page(struct device *, dma_addr_t, size_t,
 /*
  * Private functions
  */
-int dmabounce_sync_for_cpu(struct device *, dma_addr_t, unsigned long,
-		size_t, enum dma_data_direction);
-int dmabounce_sync_for_device(struct device *, dma_addr_t, unsigned long,
-		size_t, enum dma_data_direction);
+int dmabounce_sync_for_cpu(struct device *, dma_addr_t, size_t, enum dma_data_direction);
+int dmabounce_sync_for_device(struct device *, dma_addr_t, size_t, enum dma_data_direction);
 #else
 static inline int dmabounce_sync_for_cpu(struct device *d, dma_addr_t addr,
-	unsigned long offset, size_t size, enum dma_data_direction dir)
+	size_t size, enum dma_data_direction dir)
 {
 	return 1;
 }
 
 static inline int dmabounce_sync_for_device(struct device *d, dma_addr_t addr,
-	unsigned long offset, size_t size, enum dma_data_direction dir)
+	size_t size, enum dma_data_direction dir)
 {
 	return 1;
 }
@@ -399,6 +397,33 @@ static inline void dma_unmap_page(struct device *dev, dma_addr_t handle,
 	__dma_unmap_page(dev, handle, size, dir);
 }
 
+
+static inline void dma_sync_single_for_cpu(struct device *dev,
+		dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+	BUG_ON(!valid_dma_direction(dir));
+
+	debug_dma_sync_single_for_cpu(dev, handle, size, dir);
+
+	if (!dmabounce_sync_for_cpu(dev, handle, size, dir))
+		return;
+
+	__dma_single_dev_to_cpu(dma_to_virt(dev, handle), size, dir);
+}
+
+static inline void dma_sync_single_for_device(struct device *dev,
+		dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+	BUG_ON(!valid_dma_direction(dir));
+
+	debug_dma_sync_single_for_device(dev, handle, size, dir);
+
+	if (!dmabounce_sync_for_device(dev, handle, size, dir))
+		return;
+
+	__dma_single_cpu_to_dev(dma_to_virt(dev, handle), size, dir);
+}
+
 /**
  * dma_sync_single_range_for_cpu
  * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
@@ -421,40 +446,14 @@ static inline void dma_sync_single_range_for_cpu(struct device *dev,
 		dma_addr_t handle, unsigned long offset, size_t size,
 		enum dma_data_direction dir)
 {
-	BUG_ON(!valid_dma_direction(dir));
-
-	debug_dma_sync_single_for_cpu(dev, handle + offset, size, dir);
-
-	if (!dmabounce_sync_for_cpu(dev, handle, offset, size, dir))
-		return;
-
-	__dma_single_dev_to_cpu(dma_to_virt(dev, handle) + offset, size, dir);
+	dma_sync_single_for_cpu(dev, handle + offset, size, dir);
 }
 
 static inline void dma_sync_single_range_for_device(struct device *dev,
 		dma_addr_t handle, unsigned long offset, size_t size,
 		enum dma_data_direction dir)
 {
-	BUG_ON(!valid_dma_direction(dir));
-
-	debug_dma_sync_single_for_device(dev, handle + offset, size, dir);
-
-	if (!dmabounce_sync_for_device(dev, handle, offset, size, dir))
-		return;
-
-	__dma_single_cpu_to_dev(dma_to_virt(dev, handle) + offset, size, dir);
-}
-
-static inline void dma_sync_single_for_cpu(struct device *dev,
-		dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
-	dma_sync_single_range_for_cpu(dev, handle, 0, size, dir);
-}
-
-static inline void dma_sync_single_for_device(struct device *dev,
-		dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
-	dma_sync_single_range_for_device(dev, handle, 0, size, dir);
+	dma_sync_single_for_device(dev, handle + offset, size, dir);
 }
 
 /*
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 1aa664a..a5ab8bf 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -654,7 +654,7 @@ void dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
 	int i;
 
 	for_each_sg(sg, s, nents, i) {
-		if (!dmabounce_sync_for_cpu(dev, sg_dma_address(s), 0,
+		if (!dmabounce_sync_for_cpu(dev, sg_dma_address(s),
 					    sg_dma_len(s), dir))
 			continue;
 
@@ -680,7 +680,7 @@ void dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
 	int i;
 
 	for_each_sg(sg, s, nents, i) {
-		if (!dmabounce_sync_for_device(dev, sg_dma_address(s), 0,
+		if (!dmabounce_sync_for_device(dev, sg_dma_address(s),
 					sg_dma_len(s), dir))
 			continue;
 
-- 
1.7.1.569.g6f426

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCHv6 1/7] ARM: dma-mapping: remove offset parameter to prepare for generic dma_ops
  2012-02-10 18:58   ` [PATCHv6 1/7] ARM: dma-mapping: remove offset parameter to prepare for generic dma_ops Marek Szyprowski
@ 2012-02-10 18:58     ` Marek Szyprowski
  0 siblings, 0 replies; 39+ messages in thread
From: Marek Szyprowski @ 2012-02-10 18:58 UTC (permalink / raw)
  To: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-samsung-soc, iommu
  Cc: Marek Szyprowski, Kyungmin Park, Arnd Bergmann, Joerg Roedel,
	Russell King - ARM Linux, Shariq Hasnain, Chunsang Jeong,
	Krishna Reddy, KyongHo Cho, Andrzej Pietrasiewicz,
	Benjamin Herrenschmidt

This patch removes the need for the offset parameter in the dma bounce
functions. This is required to let the dma-mapping framework on the ARM
architecture use the common, generic dma-mapping helpers.

Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
---
 arch/arm/common/dmabounce.c        |   13 +++++--
 arch/arm/include/asm/dma-mapping.h |   67 +++++++++++++++++------------------
 arch/arm/mm/dma-mapping.c          |    4 +-
 3 files changed, 45 insertions(+), 39 deletions(-)

diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c
index 595ecd29..46b4b8d 100644
--- a/arch/arm/common/dmabounce.c
+++ b/arch/arm/common/dmabounce.c
@@ -173,7 +173,8 @@ find_safe_buffer(struct dmabounce_device_info *device_info, dma_addr_t safe_dma_
 	read_lock_irqsave(&device_info->lock, flags);
 
 	list_for_each_entry(b, &device_info->safe_buffers, node)
-		if (b->safe_dma_addr == safe_dma_addr) {
+		if (b->safe_dma_addr <= safe_dma_addr &&
+		    b->safe_dma_addr + b->size > safe_dma_addr) {
 			rb = b;
 			break;
 		}
@@ -362,9 +363,10 @@ void __dma_unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size,
 EXPORT_SYMBOL(__dma_unmap_page);
 
 int dmabounce_sync_for_cpu(struct device *dev, dma_addr_t addr,
-		unsigned long off, size_t sz, enum dma_data_direction dir)
+		size_t sz, enum dma_data_direction dir)
 {
 	struct safe_buffer *buf;
+	unsigned long off;
 
 	dev_dbg(dev, "%s(dma=%#x,off=%#lx,sz=%zx,dir=%x)\n",
 		__func__, addr, off, sz, dir);
@@ -373,6 +375,8 @@ int dmabounce_sync_for_cpu(struct device *dev, dma_addr_t addr,
 	if (!buf)
 		return 1;
 
+	off = addr - buf->safe_dma_addr;
+
 	BUG_ON(buf->direction != dir);
 
 	dev_dbg(dev, "%s: unsafe buffer %p (dma=%#x) mapped to %p (dma=%#x)\n",
@@ -391,9 +395,10 @@ int dmabounce_sync_for_cpu(struct device *dev, dma_addr_t addr,
 EXPORT_SYMBOL(dmabounce_sync_for_cpu);
 
 int dmabounce_sync_for_device(struct device *dev, dma_addr_t addr,
-		unsigned long off, size_t sz, enum dma_data_direction dir)
+		size_t sz, enum dma_data_direction dir)
 {
 	struct safe_buffer *buf;
+	unsigned long off;
 
 	dev_dbg(dev, "%s(dma=%#x,off=%#lx,sz=%zx,dir=%x)\n",
 		__func__, addr, off, sz, dir);
@@ -402,6 +407,8 @@ int dmabounce_sync_for_device(struct device *dev, dma_addr_t addr,
 	if (!buf)
 		return 1;
 
+	off = addr - buf->safe_dma_addr;
+
 	BUG_ON(buf->direction != dir);
 
 	dev_dbg(dev, "%s: unsafe buffer %p (dma=%#x) mapped to %p (dma=%#x)\n",
diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index cb3b7c9..6bc056c 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -264,19 +264,17 @@ extern void __dma_unmap_page(struct device *, dma_addr_t, size_t,
 /*
  * Private functions
  */
-int dmabounce_sync_for_cpu(struct device *, dma_addr_t, unsigned long,
-		size_t, enum dma_data_direction);
-int dmabounce_sync_for_device(struct device *, dma_addr_t, unsigned long,
-		size_t, enum dma_data_direction);
+int dmabounce_sync_for_cpu(struct device *, dma_addr_t, size_t, enum dma_data_direction);
+int dmabounce_sync_for_device(struct device *, dma_addr_t, size_t, enum dma_data_direction);
 #else
 static inline int dmabounce_sync_for_cpu(struct device *d, dma_addr_t addr,
-	unsigned long offset, size_t size, enum dma_data_direction dir)
+	size_t size, enum dma_data_direction dir)
 {
 	return 1;
 }
 
 static inline int dmabounce_sync_for_device(struct device *d, dma_addr_t addr,
-	unsigned long offset, size_t size, enum dma_data_direction dir)
+	size_t size, enum dma_data_direction dir)
 {
 	return 1;
 }
@@ -399,6 +397,33 @@ static inline void dma_unmap_page(struct device *dev, dma_addr_t handle,
 	__dma_unmap_page(dev, handle, size, dir);
 }
 
+
+static inline void dma_sync_single_for_cpu(struct device *dev,
+		dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+	BUG_ON(!valid_dma_direction(dir));
+
+	debug_dma_sync_single_for_cpu(dev, handle, size, dir);
+
+	if (!dmabounce_sync_for_cpu(dev, handle, size, dir))
+		return;
+
+	__dma_single_dev_to_cpu(dma_to_virt(dev, handle), size, dir);
+}
+
+static inline void dma_sync_single_for_device(struct device *dev,
+		dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+	BUG_ON(!valid_dma_direction(dir));
+
+	debug_dma_sync_single_for_device(dev, handle, size, dir);
+
+	if (!dmabounce_sync_for_device(dev, handle, size, dir))
+		return;
+
+	__dma_single_cpu_to_dev(dma_to_virt(dev, handle), size, dir);
+}
+
 /**
  * dma_sync_single_range_for_cpu
  * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
@@ -421,40 +446,14 @@ static inline void dma_sync_single_range_for_cpu(struct device *dev,
 		dma_addr_t handle, unsigned long offset, size_t size,
 		enum dma_data_direction dir)
 {
-	BUG_ON(!valid_dma_direction(dir));
-
-	debug_dma_sync_single_for_cpu(dev, handle + offset, size, dir);
-
-	if (!dmabounce_sync_for_cpu(dev, handle, offset, size, dir))
-		return;
-
-	__dma_single_dev_to_cpu(dma_to_virt(dev, handle) + offset, size, dir);
+	dma_sync_single_for_cpu(dev, handle + offset, size, dir);
 }
 
 static inline void dma_sync_single_range_for_device(struct device *dev,
 		dma_addr_t handle, unsigned long offset, size_t size,
 		enum dma_data_direction dir)
 {
-	BUG_ON(!valid_dma_direction(dir));
-
-	debug_dma_sync_single_for_device(dev, handle + offset, size, dir);
-
-	if (!dmabounce_sync_for_device(dev, handle, offset, size, dir))
-		return;
-
-	__dma_single_cpu_to_dev(dma_to_virt(dev, handle) + offset, size, dir);
-}
-
-static inline void dma_sync_single_for_cpu(struct device *dev,
-		dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
-	dma_sync_single_range_for_cpu(dev, handle, 0, size, dir);
-}
-
-static inline void dma_sync_single_for_device(struct device *dev,
-		dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
-	dma_sync_single_range_for_device(dev, handle, 0, size, dir);
+	dma_sync_single_for_device(dev, handle + offset, size, dir);
 }
 
 /*
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 1aa664a..a5ab8bf 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -654,7 +654,7 @@ void dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
 	int i;
 
 	for_each_sg(sg, s, nents, i) {
-		if (!dmabounce_sync_for_cpu(dev, sg_dma_address(s), 0,
+		if (!dmabounce_sync_for_cpu(dev, sg_dma_address(s),
 					    sg_dma_len(s), dir))
 			continue;
 
@@ -680,7 +680,7 @@ void dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
 	int i;
 
 	for_each_sg(sg, s, nents, i) {
-		if (!dmabounce_sync_for_device(dev, sg_dma_address(s), 0,
+		if (!dmabounce_sync_for_device(dev, sg_dma_address(s),
 					sg_dma_len(s), dir))
 			continue;
 
-- 
1.7.1.569.g6f426


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCHv6 2/7] ARM: dma-mapping: use asm-generic/dma-mapping-common.h
       [not found] ` <1328900324-20946-1-git-send-email-m.szyprowski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
  2012-02-10 18:58   ` [PATCHv6 1/7] ARM: dma-mapping: remove offset parameter to prepare for generic dma_ops Marek Szyprowski
@ 2012-02-10 18:58   ` Marek Szyprowski
  2012-02-10 18:58     ` Marek Szyprowski
  2012-02-14 15:01     ` Konrad Rzeszutek Wilk
  2012-02-10 18:58   ` [PATCHv6 4/7] ARM: dma-mapping: move all dma bounce code to separate dma ops structure Marek Szyprowski
                     ` (3 subsequent siblings)
  5 siblings, 2 replies; 39+ messages in thread
From: Marek Szyprowski @ 2012-02-10 18:58 UTC (permalink / raw)
  To: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-samsung-soc, iommu
  Cc: Shariq Hasnain, Arnd Bergmann, Benjamin Herrenschmidt,
	Krishna Reddy, Kyungmin Park, Andrzej Pietrasiewicz,
	Russell King - ARM Linux, KyongHo Cho, Chunsang Jeong

This patch modifies the dma-mapping implementation on the ARM
architecture to use the common dma_map_ops structure and the
asm-generic/dma-mapping-common.h helpers.
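
For illustration only (a simplified user-space model, not part of the
patch), the new dispatch works like this: a device may carry its own
dma_map_ops pointer in archdata, and the generic helpers fall back to the
global arm_dma_ops when none is set:

  #include <stdio.h>
  #include <stddef.h>

  /* simplified stand-ins for struct dma_map_ops and struct device */
  struct dma_ops { const char *name; };
  struct device  { struct dma_ops *dma_ops; };

  static struct dma_ops default_ops = { "arm_dma_ops" };

  /* models get_dma_ops(): per-device override or the global default */
  static struct dma_ops *get_ops(struct device *dev)
  {
  	if (dev && dev->dma_ops)
  		return dev->dma_ops;
  	return &default_ops;
  }

  int main(void)
  {
  	struct dma_ops iommu_ops = { "iommu_dma_ops" };	/* hypothetical */
  	struct device plain  = { NULL };
  	struct device mapped = { &iommu_ops };

  	printf("%s\n", get_ops(&plain)->name);	/* arm_dma_ops */
  	printf("%s\n", get_ops(&mapped)->name);	/* iommu_dma_ops */
  	return 0;
  }

A driver or bus layer that needs a private implementation would call
set_dma_ops(dev, &its_ops) once at setup time; everything that goes
through the asm-generic helpers then picks the right table automatically.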

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 arch/arm/Kconfig                   |    1 +
 arch/arm/include/asm/device.h      |    1 +
 arch/arm/include/asm/dma-mapping.h |  197 +++++-------------------------------
 arch/arm/mm/dma-mapping.c          |  149 ++++++++++++++++-----------
 4 files changed, 117 insertions(+), 231 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index a48aecc..59102fb 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -4,6 +4,7 @@ config ARM
 	select HAVE_AOUT
 	select HAVE_DMA_API_DEBUG
 	select HAVE_IDE if PCI || ISA || PCMCIA
+	select HAVE_DMA_ATTRS
 	select HAVE_MEMBLOCK
 	select RTC_LIB
 	select SYS_SUPPORTS_APM_EMULATION
diff --git a/arch/arm/include/asm/device.h b/arch/arm/include/asm/device.h
index 7aa3680..6e2cb0e 100644
--- a/arch/arm/include/asm/device.h
+++ b/arch/arm/include/asm/device.h
@@ -7,6 +7,7 @@
 #define ASMARM_DEVICE_H
 
 struct dev_archdata {
+	struct dma_map_ops	*dma_ops;
 #ifdef CONFIG_DMABOUNCE
 	struct dmabounce_device_info *dmabounce;
 #endif
diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 6bc056c..cf7b77c 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -10,6 +10,28 @@
 #include <asm-generic/dma-coherent.h>
 #include <asm/memory.h>
 
+extern struct dma_map_ops arm_dma_ops;
+
+static inline struct dma_map_ops *get_dma_ops(struct device *dev)
+{
+	if (dev && dev->archdata.dma_ops)
+		return dev->archdata.dma_ops;
+	return &arm_dma_ops;
+}
+
+static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
+{
+	BUG_ON(!dev);
+	dev->archdata.dma_ops = ops;
+}
+
+#include <asm-generic/dma-mapping-common.h>
+
+static inline int dma_set_mask(struct device *dev, u64 mask)
+{
+	return get_dma_ops(dev)->set_dma_mask(dev, mask);
+}
+
 #ifdef __arch_page_to_dma
 #error Please update to __arch_pfn_to_dma
 #endif
@@ -117,7 +139,6 @@ static inline void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
 
 extern int dma_supported(struct device *, u64);
 extern int dma_set_mask(struct device *, u64);
-
 /*
  * DMA errors are defined by all-bits-set in the DMA address.
  */
@@ -295,179 +316,17 @@ static inline void __dma_unmap_page(struct device *dev, dma_addr_t handle,
 }
 #endif /* CONFIG_DMABOUNCE */
 
-/**
- * dma_map_single - map a single buffer for streaming DMA
- * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
- * @cpu_addr: CPU direct mapped address of buffer
- * @size: size of buffer to map
- * @dir: DMA transfer direction
- *
- * Ensure that any data held in the cache is appropriately discarded
- * or written back.
- *
- * The device owns this memory once this call has completed.  The CPU
- * can regain ownership by calling dma_unmap_single() or
- * dma_sync_single_for_cpu().
- */
-static inline dma_addr_t dma_map_single(struct device *dev, void *cpu_addr,
-		size_t size, enum dma_data_direction dir)
-{
-	unsigned long offset;
-	struct page *page;
-	dma_addr_t addr;
-
-	BUG_ON(!virt_addr_valid(cpu_addr));
-	BUG_ON(!virt_addr_valid(cpu_addr + size - 1));
-	BUG_ON(!valid_dma_direction(dir));
-
-	page = virt_to_page(cpu_addr);
-	offset = (unsigned long)cpu_addr & ~PAGE_MASK;
-	addr = __dma_map_page(dev, page, offset, size, dir);
-	debug_dma_map_page(dev, page, offset, size, dir, addr, true);
-
-	return addr;
-}
-
-/**
- * dma_map_page - map a portion of a page for streaming DMA
- * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
- * @page: page that buffer resides in
- * @offset: offset into page for start of buffer
- * @size: size of buffer to map
- * @dir: DMA transfer direction
- *
- * Ensure that any data held in the cache is appropriately discarded
- * or written back.
- *
- * The device owns this memory once this call has completed.  The CPU
- * can regain ownership by calling dma_unmap_page().
- */
-static inline dma_addr_t dma_map_page(struct device *dev, struct page *page,
-	     unsigned long offset, size_t size, enum dma_data_direction dir)
-{
-	dma_addr_t addr;
-
-	BUG_ON(!valid_dma_direction(dir));
-
-	addr = __dma_map_page(dev, page, offset, size, dir);
-	debug_dma_map_page(dev, page, offset, size, dir, addr, false);
-
-	return addr;
-}
-
-/**
- * dma_unmap_single - unmap a single buffer previously mapped
- * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
- * @handle: DMA address of buffer
- * @size: size of buffer (same as passed to dma_map_single)
- * @dir: DMA transfer direction (same as passed to dma_map_single)
- *
- * Unmap a single streaming mode DMA translation.  The handle and size
- * must match what was provided in the previous dma_map_single() call.
- * All other usages are undefined.
- *
- * After this call, reads by the CPU to the buffer are guaranteed to see
- * whatever the device wrote there.
- */
-static inline void dma_unmap_single(struct device *dev, dma_addr_t handle,
-		size_t size, enum dma_data_direction dir)
-{
-	debug_dma_unmap_page(dev, handle, size, dir, true);
-	__dma_unmap_page(dev, handle, size, dir);
-}
-
-/**
- * dma_unmap_page - unmap a buffer previously mapped through dma_map_page()
- * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
- * @handle: DMA address of buffer
- * @size: size of buffer (same as passed to dma_map_page)
- * @dir: DMA transfer direction (same as passed to dma_map_page)
- *
- * Unmap a page streaming mode DMA translation.  The handle and size
- * must match what was provided in the previous dma_map_page() call.
- * All other usages are undefined.
- *
- * After this call, reads by the CPU to the buffer are guaranteed to see
- * whatever the device wrote there.
- */
-static inline void dma_unmap_page(struct device *dev, dma_addr_t handle,
-		size_t size, enum dma_data_direction dir)
-{
-	debug_dma_unmap_page(dev, handle, size, dir, false);
-	__dma_unmap_page(dev, handle, size, dir);
-}
-
-
-static inline void dma_sync_single_for_cpu(struct device *dev,
-		dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
-	BUG_ON(!valid_dma_direction(dir));
-
-	debug_dma_sync_single_for_cpu(dev, handle, size, dir);
-
-	if (!dmabounce_sync_for_cpu(dev, handle, size, dir))
-		return;
-
-	__dma_single_dev_to_cpu(dma_to_virt(dev, handle), size, dir);
-}
-
-static inline void dma_sync_single_for_device(struct device *dev,
-		dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
-	BUG_ON(!valid_dma_direction(dir));
-
-	debug_dma_sync_single_for_device(dev, handle, size, dir);
-
-	if (!dmabounce_sync_for_device(dev, handle, size, dir))
-		return;
-
-	__dma_single_cpu_to_dev(dma_to_virt(dev, handle), size, dir);
-}
-
-/**
- * dma_sync_single_range_for_cpu
- * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
- * @handle: DMA address of buffer
- * @offset: offset of region to start sync
- * @size: size of region to sync
- * @dir: DMA transfer direction (same as passed to dma_map_single)
- *
- * Make physical memory consistent for a single streaming mode DMA
- * translation after a transfer.
- *
- * If you perform a dma_map_single() but wish to interrogate the
- * buffer using the cpu, yet do not wish to teardown the PCI dma
- * mapping, you must call this function before doing so.  At the
- * next point you give the PCI dma address back to the card, you
- * must first the perform a dma_sync_for_device, and then the
- * device again owns the buffer.
- */
-static inline void dma_sync_single_range_for_cpu(struct device *dev,
-		dma_addr_t handle, unsigned long offset, size_t size,
-		enum dma_data_direction dir)
-{
-	dma_sync_single_for_cpu(dev, handle + offset, size, dir);
-}
-
-static inline void dma_sync_single_range_for_device(struct device *dev,
-		dma_addr_t handle, unsigned long offset, size_t size,
-		enum dma_data_direction dir)
-{
-	dma_sync_single_for_device(dev, handle + offset, size, dir);
-}
-
 /*
  * The scatter list versions of the above methods.
  */
-extern int dma_map_sg(struct device *, struct scatterlist *, int,
-		enum dma_data_direction);
-extern void dma_unmap_sg(struct device *, struct scatterlist *, int,
+extern int arm_dma_map_sg(struct device *, struct scatterlist *, int,
+		enum dma_data_direction, struct dma_attrs *attrs);
+extern void arm_dma_unmap_sg(struct device *, struct scatterlist *, int,
+		enum dma_data_direction, struct dma_attrs *attrs);
+extern void arm_dma_sync_sg_for_cpu(struct device *, struct scatterlist *, int,
 		enum dma_data_direction);
-extern void dma_sync_sg_for_cpu(struct device *, struct scatterlist *, int,
+extern void arm_dma_sync_sg_for_device(struct device *, struct scatterlist *, int,
 		enum dma_data_direction);
-extern void dma_sync_sg_for_device(struct device *, struct scatterlist *, int,
-		enum dma_data_direction);
-
 
 #endif /* __KERNEL__ */
 #endif
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index a5ab8bf..91fe436 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -29,6 +29,86 @@
 
 #include "mm.h"
 
+/**
+ * dma_map_page - map a portion of a page for streaming DMA
+ * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
+ * @page: page that buffer resides in
+ * @offset: offset into page for start of buffer
+ * @size: size of buffer to map
+ * @dir: DMA transfer direction
+ *
+ * Ensure that any data held in the cache is appropriately discarded
+ * or written back.
+ *
+ * The device owns this memory once this call has completed.  The CPU
+ * can regain ownership by calling dma_unmap_page().
+ */
+static inline dma_addr_t arm_dma_map_page(struct device *dev, struct page *page,
+	     unsigned long offset, size_t size, enum dma_data_direction dir,
+	     struct dma_attrs *attrs)
+{
+	return __dma_map_page(dev, page, offset, size, dir);
+}
+
+/**
+ * dma_unmap_page - unmap a buffer previously mapped through dma_map_page()
+ * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
+ * @handle: DMA address of buffer
+ * @size: size of buffer (same as passed to dma_map_page)
+ * @dir: DMA transfer direction (same as passed to dma_map_page)
+ *
+ * Unmap a page streaming mode DMA translation.  The handle and size
+ * must match what was provided in the previous dma_map_page() call.
+ * All other usages are undefined.
+ *
+ * After this call, reads by the CPU to the buffer are guaranteed to see
+ * whatever the device wrote there.
+ */
+
+static inline void arm_dma_unmap_page(struct device *dev, dma_addr_t handle,
+		size_t size, enum dma_data_direction dir,
+		struct dma_attrs *attrs)
+{
+	__dma_unmap_page(dev, handle, size, dir);
+}
+
+static inline void arm_dma_sync_single_for_cpu(struct device *dev,
+		dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+	unsigned int offset = handle & (PAGE_SIZE - 1);
+	struct page *page = pfn_to_page(dma_to_pfn(dev, handle-offset));
+	if (!dmabounce_sync_for_cpu(dev, handle, size, dir))
+		return;
+
+	__dma_page_dev_to_cpu(page, offset, size, dir);
+}
+
+static inline void arm_dma_sync_single_for_device(struct device *dev,
+		dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+	unsigned int offset = handle & (PAGE_SIZE - 1);
+	struct page *page = pfn_to_page(dma_to_pfn(dev, handle-offset));
+	if (!dmabounce_sync_for_device(dev, handle, size, dir))
+		return;
+
+	__dma_page_cpu_to_dev(page, offset, size, dir);
+}
+
+static int arm_dma_set_mask(struct device *dev, u64 dma_mask);
+
+struct dma_map_ops arm_dma_ops = {
+	.map_page		= arm_dma_map_page,
+	.unmap_page		= arm_dma_unmap_page,
+	.map_sg			= arm_dma_map_sg,
+	.unmap_sg		= arm_dma_unmap_sg,
+	.sync_single_for_cpu	= arm_dma_sync_single_for_cpu,
+	.sync_single_for_device	= arm_dma_sync_single_for_device,
+	.sync_sg_for_cpu	= arm_dma_sync_sg_for_cpu,
+	.sync_sg_for_device	= arm_dma_sync_sg_for_device,
+	.set_dma_mask		= arm_dma_set_mask,
+};
+EXPORT_SYMBOL(arm_dma_ops);
+
 static u64 get_coherent_dma_mask(struct device *dev)
 {
 	u64 mask = (u64)arm_dma_limit;
@@ -455,47 +535,6 @@ void dma_free_coherent(struct device *dev, size_t size, void *cpu_addr, dma_addr
 }
 EXPORT_SYMBOL(dma_free_coherent);
 
-/*
- * Make an area consistent for devices.
- * Note: Drivers should NOT use this function directly, as it will break
- * platforms with CONFIG_DMABOUNCE.
- * Use the driver DMA support - see dma-mapping.h (dma_sync_*)
- */
-void ___dma_single_cpu_to_dev(const void *kaddr, size_t size,
-	enum dma_data_direction dir)
-{
-	unsigned long paddr;
-
-	BUG_ON(!virt_addr_valid(kaddr) || !virt_addr_valid(kaddr + size - 1));
-
-	dmac_map_area(kaddr, size, dir);
-
-	paddr = __pa(kaddr);
-	if (dir == DMA_FROM_DEVICE) {
-		outer_inv_range(paddr, paddr + size);
-	} else {
-		outer_clean_range(paddr, paddr + size);
-	}
-	/* FIXME: non-speculating: flush on bidirectional mappings? */
-}
-EXPORT_SYMBOL(___dma_single_cpu_to_dev);
-
-void ___dma_single_dev_to_cpu(const void *kaddr, size_t size,
-	enum dma_data_direction dir)
-{
-	BUG_ON(!virt_addr_valid(kaddr) || !virt_addr_valid(kaddr + size - 1));
-
-	/* FIXME: non-speculating: not required */
-	/* don't bother invalidating if DMA to device */
-	if (dir != DMA_TO_DEVICE) {
-		unsigned long paddr = __pa(kaddr);
-		outer_inv_range(paddr, paddr + size);
-	}
-
-	dmac_unmap_area(kaddr, size, dir);
-}
-EXPORT_SYMBOL(___dma_single_dev_to_cpu);
-
 static void dma_cache_maint_page(struct page *page, unsigned long offset,
 	size_t size, enum dma_data_direction dir,
 	void (*op)(const void *, size_t, int))
@@ -593,21 +632,18 @@ EXPORT_SYMBOL(___dma_page_dev_to_cpu);
  * Device ownership issues as mentioned for dma_map_single are the same
  * here.
  */
-int dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
-		enum dma_data_direction dir)
+int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
+		enum dma_data_direction dir, struct dma_attrs *attrs)
 {
 	struct scatterlist *s;
 	int i, j;
 
-	BUG_ON(!valid_dma_direction(dir));
-
 	for_each_sg(sg, s, nents, i) {
 		s->dma_address = __dma_map_page(dev, sg_page(s), s->offset,
 						s->length, dir);
 		if (dma_mapping_error(dev, s->dma_address))
 			goto bad_mapping;
 	}
-	debug_dma_map_sg(dev, sg, nents, nents, dir);
 	return nents;
 
  bad_mapping:
@@ -615,7 +651,6 @@ int dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
 		__dma_unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir);
 	return 0;
 }
-EXPORT_SYMBOL(dma_map_sg);
 
 /**
  * dma_unmap_sg - unmap a set of SG buffers mapped by dma_map_sg
@@ -627,18 +662,15 @@ EXPORT_SYMBOL(dma_map_sg);
  * Unmap a set of streaming mode DMA translations.  Again, CPU access
  * rules concerning calls here are the same as for dma_unmap_single().
  */
-void dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
-		enum dma_data_direction dir)
+void arm_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
+		enum dma_data_direction dir, struct dma_attrs *attrs)
 {
 	struct scatterlist *s;
 	int i;
 
-	debug_dma_unmap_sg(dev, sg, nents, dir);
-
 	for_each_sg(sg, s, nents, i)
 		__dma_unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir);
 }
-EXPORT_SYMBOL(dma_unmap_sg);
 
 /**
  * dma_sync_sg_for_cpu
@@ -647,7 +679,7 @@ EXPORT_SYMBOL(dma_unmap_sg);
  * @nents: number of buffers to map (returned from dma_map_sg)
  * @dir: DMA transfer direction (same as was passed to dma_map_sg)
  */
-void dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
+void arm_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
 			int nents, enum dma_data_direction dir)
 {
 	struct scatterlist *s;
@@ -661,10 +693,7 @@ void dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
 		__dma_page_dev_to_cpu(sg_page(s), s->offset,
 				      s->length, dir);
 	}
-
-	debug_dma_sync_sg_for_cpu(dev, sg, nents, dir);
 }
-EXPORT_SYMBOL(dma_sync_sg_for_cpu);
 
 /**
  * dma_sync_sg_for_device
@@ -673,7 +702,7 @@ EXPORT_SYMBOL(dma_sync_sg_for_cpu);
  * @nents: number of buffers to map (returned from dma_map_sg)
  * @dir: DMA transfer direction (same as was passed to dma_map_sg)
  */
-void dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
+void arm_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
 			int nents, enum dma_data_direction dir)
 {
 	struct scatterlist *s;
@@ -687,10 +716,7 @@ void dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
 		__dma_page_cpu_to_dev(sg_page(s), s->offset,
 				      s->length, dir);
 	}
-
-	debug_dma_sync_sg_for_device(dev, sg, nents, dir);
 }
-EXPORT_SYMBOL(dma_sync_sg_for_device);
 
 /*
  * Return whether the given device DMA address mask can be supported
@@ -706,7 +732,7 @@ int dma_supported(struct device *dev, u64 mask)
 }
 EXPORT_SYMBOL(dma_supported);
 
-int dma_set_mask(struct device *dev, u64 dma_mask)
+static int arm_dma_set_mask(struct device *dev, u64 dma_mask)
 {
 	if (!dev->dma_mask || !dma_supported(dev, dma_mask))
 		return -EIO;
@@ -717,7 +743,6 @@ int dma_set_mask(struct device *dev, u64 dma_mask)
 
 	return 0;
 }
-EXPORT_SYMBOL(dma_set_mask);
 
 #define PREALLOC_DMA_DEBUG_ENTRIES	4096
 
-- 
1.7.1.569.g6f426

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCHv6 2/7] ARM: dma-mapping: use asm-generic/dma-mapping-common.h
  2012-02-10 18:58   ` [PATCHv6 2/7] ARM: dma-mapping: use asm-generic/dma-mapping-common.h Marek Szyprowski
@ 2012-02-10 18:58     ` Marek Szyprowski
  2012-02-14 15:01     ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 39+ messages in thread
From: Marek Szyprowski @ 2012-02-10 18:58 UTC (permalink / raw)
  To: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-samsung-soc, iommu
  Cc: Marek Szyprowski, Kyungmin Park, Arnd Bergmann, Joerg Roedel,
	Russell King - ARM Linux, Shariq Hasnain, Chunsang Jeong,
	Krishna Reddy, KyongHo Cho, Andrzej Pietrasiewicz,
	Benjamin Herrenschmidt

This patch modifies the dma-mapping implementation on the ARM
architecture to use the common dma_map_ops structure and the
asm-generic/dma-mapping-common.h helpers.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 arch/arm/Kconfig                   |    1 +
 arch/arm/include/asm/device.h      |    1 +
 arch/arm/include/asm/dma-mapping.h |  197 +++++-------------------------------
 arch/arm/mm/dma-mapping.c          |  149 ++++++++++++++++-----------
 4 files changed, 117 insertions(+), 231 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index a48aecc..59102fb 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -4,6 +4,7 @@ config ARM
 	select HAVE_AOUT
 	select HAVE_DMA_API_DEBUG
 	select HAVE_IDE if PCI || ISA || PCMCIA
+	select HAVE_DMA_ATTRS
 	select HAVE_MEMBLOCK
 	select RTC_LIB
 	select SYS_SUPPORTS_APM_EMULATION
diff --git a/arch/arm/include/asm/device.h b/arch/arm/include/asm/device.h
index 7aa3680..6e2cb0e 100644
--- a/arch/arm/include/asm/device.h
+++ b/arch/arm/include/asm/device.h
@@ -7,6 +7,7 @@
 #define ASMARM_DEVICE_H
 
 struct dev_archdata {
+	struct dma_map_ops	*dma_ops;
 #ifdef CONFIG_DMABOUNCE
 	struct dmabounce_device_info *dmabounce;
 #endif
diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 6bc056c..cf7b77c 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -10,6 +10,28 @@
 #include <asm-generic/dma-coherent.h>
 #include <asm/memory.h>
 
+extern struct dma_map_ops arm_dma_ops;
+
+static inline struct dma_map_ops *get_dma_ops(struct device *dev)
+{
+	if (dev && dev->archdata.dma_ops)
+		return dev->archdata.dma_ops;
+	return &arm_dma_ops;
+}
+
+static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
+{
+	BUG_ON(!dev);
+	dev->archdata.dma_ops = ops;
+}
+
+#include <asm-generic/dma-mapping-common.h>
+
+static inline int dma_set_mask(struct device *dev, u64 mask)
+{
+	return get_dma_ops(dev)->set_dma_mask(dev, mask);
+}
+
 #ifdef __arch_page_to_dma
 #error Please update to __arch_pfn_to_dma
 #endif
@@ -117,7 +139,6 @@ static inline void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
 
 extern int dma_supported(struct device *, u64);
 extern int dma_set_mask(struct device *, u64);
-
 /*
  * DMA errors are defined by all-bits-set in the DMA address.
  */
@@ -295,179 +316,17 @@ static inline void __dma_unmap_page(struct device *dev, dma_addr_t handle,
 }
 #endif /* CONFIG_DMABOUNCE */
 
-/**
- * dma_map_single - map a single buffer for streaming DMA
- * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
- * @cpu_addr: CPU direct mapped address of buffer
- * @size: size of buffer to map
- * @dir: DMA transfer direction
- *
- * Ensure that any data held in the cache is appropriately discarded
- * or written back.
- *
- * The device owns this memory once this call has completed.  The CPU
- * can regain ownership by calling dma_unmap_single() or
- * dma_sync_single_for_cpu().
- */
-static inline dma_addr_t dma_map_single(struct device *dev, void *cpu_addr,
-		size_t size, enum dma_data_direction dir)
-{
-	unsigned long offset;
-	struct page *page;
-	dma_addr_t addr;
-
-	BUG_ON(!virt_addr_valid(cpu_addr));
-	BUG_ON(!virt_addr_valid(cpu_addr + size - 1));
-	BUG_ON(!valid_dma_direction(dir));
-
-	page = virt_to_page(cpu_addr);
-	offset = (unsigned long)cpu_addr & ~PAGE_MASK;
-	addr = __dma_map_page(dev, page, offset, size, dir);
-	debug_dma_map_page(dev, page, offset, size, dir, addr, true);
-
-	return addr;
-}
-
-/**
- * dma_map_page - map a portion of a page for streaming DMA
- * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
- * @page: page that buffer resides in
- * @offset: offset into page for start of buffer
- * @size: size of buffer to map
- * @dir: DMA transfer direction
- *
- * Ensure that any data held in the cache is appropriately discarded
- * or written back.
- *
- * The device owns this memory once this call has completed.  The CPU
- * can regain ownership by calling dma_unmap_page().
- */
-static inline dma_addr_t dma_map_page(struct device *dev, struct page *page,
-	     unsigned long offset, size_t size, enum dma_data_direction dir)
-{
-	dma_addr_t addr;
-
-	BUG_ON(!valid_dma_direction(dir));
-
-	addr = __dma_map_page(dev, page, offset, size, dir);
-	debug_dma_map_page(dev, page, offset, size, dir, addr, false);
-
-	return addr;
-}
-
-/**
- * dma_unmap_single - unmap a single buffer previously mapped
- * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
- * @handle: DMA address of buffer
- * @size: size of buffer (same as passed to dma_map_single)
- * @dir: DMA transfer direction (same as passed to dma_map_single)
- *
- * Unmap a single streaming mode DMA translation.  The handle and size
- * must match what was provided in the previous dma_map_single() call.
- * All other usages are undefined.
- *
- * After this call, reads by the CPU to the buffer are guaranteed to see
- * whatever the device wrote there.
- */
-static inline void dma_unmap_single(struct device *dev, dma_addr_t handle,
-		size_t size, enum dma_data_direction dir)
-{
-	debug_dma_unmap_page(dev, handle, size, dir, true);
-	__dma_unmap_page(dev, handle, size, dir);
-}
-
-/**
- * dma_unmap_page - unmap a buffer previously mapped through dma_map_page()
- * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
- * @handle: DMA address of buffer
- * @size: size of buffer (same as passed to dma_map_page)
- * @dir: DMA transfer direction (same as passed to dma_map_page)
- *
- * Unmap a page streaming mode DMA translation.  The handle and size
- * must match what was provided in the previous dma_map_page() call.
- * All other usages are undefined.
- *
- * After this call, reads by the CPU to the buffer are guaranteed to see
- * whatever the device wrote there.
- */
-static inline void dma_unmap_page(struct device *dev, dma_addr_t handle,
-		size_t size, enum dma_data_direction dir)
-{
-	debug_dma_unmap_page(dev, handle, size, dir, false);
-	__dma_unmap_page(dev, handle, size, dir);
-}
-
-
-static inline void dma_sync_single_for_cpu(struct device *dev,
-		dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
-	BUG_ON(!valid_dma_direction(dir));
-
-	debug_dma_sync_single_for_cpu(dev, handle, size, dir);
-
-	if (!dmabounce_sync_for_cpu(dev, handle, size, dir))
-		return;
-
-	__dma_single_dev_to_cpu(dma_to_virt(dev, handle), size, dir);
-}
-
-static inline void dma_sync_single_for_device(struct device *dev,
-		dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
-	BUG_ON(!valid_dma_direction(dir));
-
-	debug_dma_sync_single_for_device(dev, handle, size, dir);
-
-	if (!dmabounce_sync_for_device(dev, handle, size, dir))
-		return;
-
-	__dma_single_cpu_to_dev(dma_to_virt(dev, handle), size, dir);
-}
-
-/**
- * dma_sync_single_range_for_cpu
- * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
- * @handle: DMA address of buffer
- * @offset: offset of region to start sync
- * @size: size of region to sync
- * @dir: DMA transfer direction (same as passed to dma_map_single)
- *
- * Make physical memory consistent for a single streaming mode DMA
- * translation after a transfer.
- *
- * If you perform a dma_map_single() but wish to interrogate the
- * buffer using the cpu, yet do not wish to teardown the PCI dma
- * mapping, you must call this function before doing so.  At the
- * next point you give the PCI dma address back to the card, you
- * must first the perform a dma_sync_for_device, and then the
- * device again owns the buffer.
- */
-static inline void dma_sync_single_range_for_cpu(struct device *dev,
-		dma_addr_t handle, unsigned long offset, size_t size,
-		enum dma_data_direction dir)
-{
-	dma_sync_single_for_cpu(dev, handle + offset, size, dir);
-}
-
-static inline void dma_sync_single_range_for_device(struct device *dev,
-		dma_addr_t handle, unsigned long offset, size_t size,
-		enum dma_data_direction dir)
-{
-	dma_sync_single_for_device(dev, handle + offset, size, dir);
-}
-
 /*
  * The scatter list versions of the above methods.
  */
-extern int dma_map_sg(struct device *, struct scatterlist *, int,
-		enum dma_data_direction);
-extern void dma_unmap_sg(struct device *, struct scatterlist *, int,
+extern int arm_dma_map_sg(struct device *, struct scatterlist *, int,
+		enum dma_data_direction, struct dma_attrs *attrs);
+extern void arm_dma_unmap_sg(struct device *, struct scatterlist *, int,
+		enum dma_data_direction, struct dma_attrs *attrs);
+extern void arm_dma_sync_sg_for_cpu(struct device *, struct scatterlist *, int,
 		enum dma_data_direction);
-extern void dma_sync_sg_for_cpu(struct device *, struct scatterlist *, int,
+extern void arm_dma_sync_sg_for_device(struct device *, struct scatterlist *, int,
 		enum dma_data_direction);
-extern void dma_sync_sg_for_device(struct device *, struct scatterlist *, int,
-		enum dma_data_direction);
-
 
 #endif /* __KERNEL__ */
 #endif
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index a5ab8bf..91fe436 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -29,6 +29,86 @@
 
 #include "mm.h"
 
+/**
+ * dma_map_page - map a portion of a page for streaming DMA
+ * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
+ * @page: page that buffer resides in
+ * @offset: offset into page for start of buffer
+ * @size: size of buffer to map
+ * @dir: DMA transfer direction
+ *
+ * Ensure that any data held in the cache is appropriately discarded
+ * or written back.
+ *
+ * The device owns this memory once this call has completed.  The CPU
+ * can regain ownership by calling dma_unmap_page().
+ */
+static inline dma_addr_t arm_dma_map_page(struct device *dev, struct page *page,
+	     unsigned long offset, size_t size, enum dma_data_direction dir,
+	     struct dma_attrs *attrs)
+{
+	return __dma_map_page(dev, page, offset, size, dir);
+}
+
+/**
+ * dma_unmap_page - unmap a buffer previously mapped through dma_map_page()
+ * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
+ * @handle: DMA address of buffer
+ * @size: size of buffer (same as passed to dma_map_page)
+ * @dir: DMA transfer direction (same as passed to dma_map_page)
+ *
+ * Unmap a page streaming mode DMA translation.  The handle and size
+ * must match what was provided in the previous dma_map_page() call.
+ * All other usages are undefined.
+ *
+ * After this call, reads by the CPU to the buffer are guaranteed to see
+ * whatever the device wrote there.
+ */
+
+static inline void arm_dma_unmap_page(struct device *dev, dma_addr_t handle,
+		size_t size, enum dma_data_direction dir,
+		struct dma_attrs *attrs)
+{
+	__dma_unmap_page(dev, handle, size, dir);
+}
+
+static inline void arm_dma_sync_single_for_cpu(struct device *dev,
+		dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+	unsigned int offset = handle & (PAGE_SIZE - 1);
+	struct page *page = pfn_to_page(dma_to_pfn(dev, handle-offset));
+	if (!dmabounce_sync_for_cpu(dev, handle, size, dir))
+		return;
+
+	__dma_page_dev_to_cpu(page, offset, size, dir);
+}
+
+static inline void arm_dma_sync_single_for_device(struct device *dev,
+		dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+	unsigned int offset = handle & (PAGE_SIZE - 1);
+	struct page *page = pfn_to_page(dma_to_pfn(dev, handle-offset));
+	if (!dmabounce_sync_for_device(dev, handle, size, dir))
+		return;
+
+	__dma_page_cpu_to_dev(page, offset, size, dir);
+}
+
+static int arm_dma_set_mask(struct device *dev, u64 dma_mask);
+
+struct dma_map_ops arm_dma_ops = {
+	.map_page		= arm_dma_map_page,
+	.unmap_page		= arm_dma_unmap_page,
+	.map_sg			= arm_dma_map_sg,
+	.unmap_sg		= arm_dma_unmap_sg,
+	.sync_single_for_cpu	= arm_dma_sync_single_for_cpu,
+	.sync_single_for_device	= arm_dma_sync_single_for_device,
+	.sync_sg_for_cpu	= arm_dma_sync_sg_for_cpu,
+	.sync_sg_for_device	= arm_dma_sync_sg_for_device,
+	.set_dma_mask		= arm_dma_set_mask,
+};
+EXPORT_SYMBOL(arm_dma_ops);
+
 static u64 get_coherent_dma_mask(struct device *dev)
 {
 	u64 mask = (u64)arm_dma_limit;
@@ -455,47 +535,6 @@ void dma_free_coherent(struct device *dev, size_t size, void *cpu_addr, dma_addr
 }
 EXPORT_SYMBOL(dma_free_coherent);
 
-/*
- * Make an area consistent for devices.
- * Note: Drivers should NOT use this function directly, as it will break
- * platforms with CONFIG_DMABOUNCE.
- * Use the driver DMA support - see dma-mapping.h (dma_sync_*)
- */
-void ___dma_single_cpu_to_dev(const void *kaddr, size_t size,
-	enum dma_data_direction dir)
-{
-	unsigned long paddr;
-
-	BUG_ON(!virt_addr_valid(kaddr) || !virt_addr_valid(kaddr + size - 1));
-
-	dmac_map_area(kaddr, size, dir);
-
-	paddr = __pa(kaddr);
-	if (dir == DMA_FROM_DEVICE) {
-		outer_inv_range(paddr, paddr + size);
-	} else {
-		outer_clean_range(paddr, paddr + size);
-	}
-	/* FIXME: non-speculating: flush on bidirectional mappings? */
-}
-EXPORT_SYMBOL(___dma_single_cpu_to_dev);
-
-void ___dma_single_dev_to_cpu(const void *kaddr, size_t size,
-	enum dma_data_direction dir)
-{
-	BUG_ON(!virt_addr_valid(kaddr) || !virt_addr_valid(kaddr + size - 1));
-
-	/* FIXME: non-speculating: not required */
-	/* don't bother invalidating if DMA to device */
-	if (dir != DMA_TO_DEVICE) {
-		unsigned long paddr = __pa(kaddr);
-		outer_inv_range(paddr, paddr + size);
-	}
-
-	dmac_unmap_area(kaddr, size, dir);
-}
-EXPORT_SYMBOL(___dma_single_dev_to_cpu);
-
 static void dma_cache_maint_page(struct page *page, unsigned long offset,
 	size_t size, enum dma_data_direction dir,
 	void (*op)(const void *, size_t, int))
@@ -593,21 +632,18 @@ EXPORT_SYMBOL(___dma_page_dev_to_cpu);
  * Device ownership issues as mentioned for dma_map_single are the same
  * here.
  */
-int dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
-		enum dma_data_direction dir)
+int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
+		enum dma_data_direction dir, struct dma_attrs *attrs)
 {
 	struct scatterlist *s;
 	int i, j;
 
-	BUG_ON(!valid_dma_direction(dir));
-
 	for_each_sg(sg, s, nents, i) {
 		s->dma_address = __dma_map_page(dev, sg_page(s), s->offset,
 						s->length, dir);
 		if (dma_mapping_error(dev, s->dma_address))
 			goto bad_mapping;
 	}
-	debug_dma_map_sg(dev, sg, nents, nents, dir);
 	return nents;
 
  bad_mapping:
@@ -615,7 +651,6 @@ int dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
 		__dma_unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir);
 	return 0;
 }
-EXPORT_SYMBOL(dma_map_sg);
 
 /**
  * dma_unmap_sg - unmap a set of SG buffers mapped by dma_map_sg
@@ -627,18 +662,15 @@ EXPORT_SYMBOL(dma_map_sg);
  * Unmap a set of streaming mode DMA translations.  Again, CPU access
  * rules concerning calls here are the same as for dma_unmap_single().
  */
-void dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
-		enum dma_data_direction dir)
+void arm_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
+		enum dma_data_direction dir, struct dma_attrs *attrs)
 {
 	struct scatterlist *s;
 	int i;
 
-	debug_dma_unmap_sg(dev, sg, nents, dir);
-
 	for_each_sg(sg, s, nents, i)
 		__dma_unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir);
 }
-EXPORT_SYMBOL(dma_unmap_sg);
 
 /**
  * dma_sync_sg_for_cpu
@@ -647,7 +679,7 @@ EXPORT_SYMBOL(dma_unmap_sg);
  * @nents: number of buffers to map (returned from dma_map_sg)
  * @dir: DMA transfer direction (same as was passed to dma_map_sg)
  */
-void dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
+void arm_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
 			int nents, enum dma_data_direction dir)
 {
 	struct scatterlist *s;
@@ -661,10 +693,7 @@ void dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
 		__dma_page_dev_to_cpu(sg_page(s), s->offset,
 				      s->length, dir);
 	}
-
-	debug_dma_sync_sg_for_cpu(dev, sg, nents, dir);
 }
-EXPORT_SYMBOL(dma_sync_sg_for_cpu);
 
 /**
  * dma_sync_sg_for_device
@@ -673,7 +702,7 @@ EXPORT_SYMBOL(dma_sync_sg_for_cpu);
  * @nents: number of buffers to map (returned from dma_map_sg)
  * @dir: DMA transfer direction (same as was passed to dma_map_sg)
  */
-void dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
+void arm_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
 			int nents, enum dma_data_direction dir)
 {
 	struct scatterlist *s;
@@ -687,10 +716,7 @@ void dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
 		__dma_page_cpu_to_dev(sg_page(s), s->offset,
 				      s->length, dir);
 	}
-
-	debug_dma_sync_sg_for_device(dev, sg, nents, dir);
 }
-EXPORT_SYMBOL(dma_sync_sg_for_device);
 
 /*
  * Return whether the given device DMA address mask can be supported
@@ -706,7 +732,7 @@ int dma_supported(struct device *dev, u64 mask)
 }
 EXPORT_SYMBOL(dma_supported);
 
-int dma_set_mask(struct device *dev, u64 dma_mask)
+static int arm_dma_set_mask(struct device *dev, u64 dma_mask)
 {
 	if (!dev->dma_mask || !dma_supported(dev, dma_mask))
 		return -EIO;
@@ -717,7 +743,6 @@ int dma_set_mask(struct device *dev, u64 dma_mask)
 
 	return 0;
 }
-EXPORT_SYMBOL(dma_set_mask);
 
 #define PREALLOC_DMA_DEBUG_ENTRIES	4096
 
-- 
1.7.1.569.g6f426
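
The exported dma_* entry points and the debug_dma_* hooks that disappear in
the hunks above are not lost: once asm-generic/dma-mapping-common.h is in
use, the driver-visible calls become thin inline wrappers that look up the
per-device dma_map_ops and re-add the dma-debug accounting. A simplified
sketch of such a wrapper (not a verbatim copy of the generic header):

static inline int dma_map_sg_attrs(struct device *dev, struct scatterlist *sg,
				   int nents, enum dma_data_direction dir,
				   struct dma_attrs *attrs)
{
	struct dma_map_ops *ops = get_dma_ops(dev);
	int ents;

	BUG_ON(!valid_dma_direction(dir));

	ents = ops->map_sg(dev, sg, nents, dir, attrs);	/* e.g. arm_dma_map_sg() */
	debug_dma_map_sg(dev, sg, nents, ents, dir);

	return ents;
}

#define dma_map_sg(d, s, n, r)	dma_map_sg_attrs(d, s, n, r, NULL)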


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCHv6 3/7] ARM: dma-mapping: implement dma sg methods on top of any generic dma ops
  2012-02-10 18:58 [PATCHv6 0/7] ARM: DMA-mapping framework redesign Marek Szyprowski
  2012-02-10 18:58 ` Marek Szyprowski
       [not found] ` <1328900324-20946-1-git-send-email-m.szyprowski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
@ 2012-02-10 18:58 ` Marek Szyprowski
  2012-02-10 18:58   ` Marek Szyprowski
  2012-02-14 15:02   ` Konrad Rzeszutek Wilk
  2 siblings, 2 replies; 39+ messages in thread
From: Marek Szyprowski @ 2012-02-10 18:58 UTC (permalink / raw)
  To: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-samsung-soc, iommu
  Cc: Marek Szyprowski, Kyungmin Park, Arnd Bergmann, Joerg Roedel,
	Russell King - ARM Linux, Shariq Hasnain, Chunsang Jeong,
	Krishna Reddy, KyongHo Cho, Andrzej Pietrasiewicz,
	Benjamin Herrenschmidt

This patch converts all dma_sg methods to be generic (independent of the
current DMA-mapping implementation for the ARM architecture). All dma sg
operations are now implemented on top of the respective
dma_map_page/dma_sync_single_for* operations from the dma_map_ops structure.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 arch/arm/mm/dma-mapping.c |   35 +++++++++++++++--------------------
 1 files changed, 15 insertions(+), 20 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 91fe436..31ff699 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -635,12 +635,13 @@ EXPORT_SYMBOL(___dma_page_dev_to_cpu);
 int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
 		enum dma_data_direction dir, struct dma_attrs *attrs)
 {
+	struct dma_map_ops *ops = get_dma_ops(dev);
 	struct scatterlist *s;
 	int i, j;
 
 	for_each_sg(sg, s, nents, i) {
-		s->dma_address = __dma_map_page(dev, sg_page(s), s->offset,
-						s->length, dir);
+		s->dma_address = ops->map_page(dev, sg_page(s), s->offset,
+						s->length, dir, attrs);
 		if (dma_mapping_error(dev, s->dma_address))
 			goto bad_mapping;
 	}
@@ -648,7 +649,7 @@ int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
 
  bad_mapping:
 	for_each_sg(sg, s, i, j)
-		__dma_unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir);
+		ops->unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir, attrs);
 	return 0;
 }
 
@@ -665,11 +666,13 @@ int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
 void arm_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
 		enum dma_data_direction dir, struct dma_attrs *attrs)
 {
+	struct dma_map_ops *ops = get_dma_ops(dev);
 	struct scatterlist *s;
+
 	int i;
 
 	for_each_sg(sg, s, nents, i)
-		__dma_unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir);
+		ops->unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir, attrs);
 }
 
 /**
@@ -682,17 +685,13 @@ void arm_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
 void arm_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
 			int nents, enum dma_data_direction dir)
 {
+	struct dma_map_ops *ops = get_dma_ops(dev);
 	struct scatterlist *s;
 	int i;
 
-	for_each_sg(sg, s, nents, i) {
-		if (!dmabounce_sync_for_cpu(dev, sg_dma_address(s),
-					    sg_dma_len(s), dir))
-			continue;
-
-		__dma_page_dev_to_cpu(sg_page(s), s->offset,
-				      s->length, dir);
-	}
+	for_each_sg(sg, s, nents, i)
+		ops->sync_single_for_cpu(dev, sg_dma_address(s), s->length,
+					 dir);
 }
 
 /**
@@ -705,17 +704,13 @@ void arm_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
 void arm_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
 			int nents, enum dma_data_direction dir)
 {
+	struct dma_map_ops *ops = get_dma_ops(dev);
 	struct scatterlist *s;
 	int i;
 
-	for_each_sg(sg, s, nents, i) {
-		if (!dmabounce_sync_for_device(dev, sg_dma_address(s),
-					sg_dma_len(s), dir))
-			continue;
-
-		__dma_page_cpu_to_dev(sg_page(s), s->offset,
-				      s->length, dir);
-	}
+	for_each_sg(sg, s, nents, i)
+		ops->sync_single_for_device(dev, sg_dma_address(s), s->length,
+					    dir);
 }
 
 /*
-- 
1.7.1.569.g6f426
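
The calling convention seen by drivers does not change with this rework; a
minimal, purely illustrative driver fragment using the scatter-gather API
(my_start_dma() and program_hw_descriptor() are made-up names) would look
roughly like this:

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

static int my_start_dma(struct device *dev, struct scatterlist *sgl, int nsegs)
{
	struct scatterlist *s;
	int mapped, i;

	mapped = dma_map_sg(dev, sgl, nsegs, DMA_TO_DEVICE);
	if (!mapped)
		return -ENOMEM;

	/* program the hardware with the (possibly merged) DMA segments */
	for_each_sg(sgl, s, mapped, i)
		program_hw_descriptor(sg_dma_address(s), sg_dma_len(s));

	/* ... wait for the transfer to complete ... */

	/* unmap with the original nsegs, not the count returned by dma_map_sg() */
	dma_unmap_sg(dev, sgl, nsegs, DMA_TO_DEVICE);
	return 0;
}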

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCHv6 4/7] ARM: dma-mapping: move all dma bounce code to separate dma ops structure
       [not found] ` <1328900324-20946-1-git-send-email-m.szyprowski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
  2012-02-10 18:58   ` [PATCHv6 1/7] ARM: dma-mapping: remove offset parameter to prepare for generic dma_ops Marek Szyprowski
  2012-02-10 18:58   ` [PATCHv6 2/7] ARM: dma-mapping: use asm-generic/dma-mapping-common.h Marek Szyprowski
@ 2012-02-10 18:58   ` Marek Szyprowski
  2012-02-10 18:58     ` Marek Szyprowski
  2012-02-10 18:58   ` [PATCHv6 5/7] ARM: dma-mapping: remove redundant code and cleanup Marek Szyprowski
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 39+ messages in thread
From: Marek Szyprowski @ 2012-02-10 18:58 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linaro-mm-sig-cunTk1MwBs8s++Sfvej+rw,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-arch-u79uwXL29TY76Z2rM5mHXA,
	linux-samsung-soc-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: Shariq Hasnain, Arnd Bergmann, Benjamin Herrenschmidt,
	Krishna Reddy, Kyungmin Park, Andrzej Pietrasiewicz,
	Russell King - ARM Linux, KyongHo Cho, Chunsang Jeong

This patch removes the dma bounce hooks from the common dma mapping
implementation on the ARM architecture and creates a separate set of
dma_map_ops for dma bounce devices.

Signed-off-by: Marek Szyprowski <m.szyprowski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
Signed-off-by: Kyungmin Park <kyungmin.park-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
---
 arch/arm/common/dmabounce.c        |   62 ++++++++++++++++++-----
 arch/arm/include/asm/dma-mapping.h |   99 +-----------------------------------
 arch/arm/mm/dma-mapping.c          |   79 +++++++++++++++++++++++++----
 3 files changed, 120 insertions(+), 120 deletions(-)

diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c
index 46b4b8d..5e7ba61 100644
--- a/arch/arm/common/dmabounce.c
+++ b/arch/arm/common/dmabounce.c
@@ -308,8 +308,9 @@ static inline void unmap_single(struct device *dev, struct safe_buffer *buf,
  * substitute the safe buffer for the unsafe one.
  * (basically move the buffer from an unsafe area to a safe one)
  */
-dma_addr_t __dma_map_page(struct device *dev, struct page *page,
-		unsigned long offset, size_t size, enum dma_data_direction dir)
+static dma_addr_t dmabounce_map_page(struct device *dev, struct page *page,
+		unsigned long offset, size_t size, enum dma_data_direction dir,
+		struct dma_attrs *attrs)
 {
 	dma_addr_t dma_addr;
 	int ret;
@@ -324,7 +325,7 @@ dma_addr_t __dma_map_page(struct device *dev, struct page *page,
 		return ~0;
 
 	if (ret == 0) {
-		__dma_page_cpu_to_dev(page, offset, size, dir);
+		arm_dma_ops.sync_single_for_device(dev, dma_addr, size, dir);
 		return dma_addr;
 	}
 
@@ -335,7 +336,6 @@ dma_addr_t __dma_map_page(struct device *dev, struct page *page,
 
 	return map_single(dev, page_address(page) + offset, size, dir);
 }
-EXPORT_SYMBOL(__dma_map_page);
 
 /*
  * see if a mapped address was really a "safe" buffer and if so, copy
@@ -343,8 +343,8 @@ EXPORT_SYMBOL(__dma_map_page);
  * the safe buffer.  (basically return things back to the way they
  * should be)
  */
-void __dma_unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size,
-		enum dma_data_direction dir)
+static void dmabounce_unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size,
+		enum dma_data_direction dir, struct dma_attrs *attrs)
 {
 	struct safe_buffer *buf;
 
@@ -353,16 +353,14 @@ void __dma_unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size,
 
 	buf = find_safe_buffer_dev(dev, dma_addr, __func__);
 	if (!buf) {
-		__dma_page_dev_to_cpu(pfn_to_page(dma_to_pfn(dev, dma_addr)),
-			dma_addr & ~PAGE_MASK, size, dir);
+		arm_dma_ops.sync_single_for_cpu(dev, dma_addr, size, dir);
 		return;
 	}
 
 	unmap_single(dev, buf, size, dir);
 }
-EXPORT_SYMBOL(__dma_unmap_page);
 
-int dmabounce_sync_for_cpu(struct device *dev, dma_addr_t addr,
+static int __dmabounce_sync_for_cpu(struct device *dev, dma_addr_t addr,
 		size_t sz, enum dma_data_direction dir)
 {
 	struct safe_buffer *buf;
@@ -392,9 +390,17 @@ int dmabounce_sync_for_cpu(struct device *dev, dma_addr_t addr,
 	}
 	return 0;
 }
-EXPORT_SYMBOL(dmabounce_sync_for_cpu);
 
-int dmabounce_sync_for_device(struct device *dev, dma_addr_t addr,
+static void dmabounce_sync_for_cpu(struct device *dev,
+		dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+	if (!__dmabounce_sync_for_cpu(dev, handle, size, dir))
+		return;
+
+	arm_dma_ops.sync_single_for_cpu(dev, handle, size, dir);
+}
+
+static int __dmabounce_sync_for_device(struct device *dev, dma_addr_t addr,
 		size_t sz, enum dma_data_direction dir)
 {
 	struct safe_buffer *buf;
@@ -424,7 +430,35 @@ int dmabounce_sync_for_device(struct device *dev, dma_addr_t addr,
 	}
 	return 0;
 }
-EXPORT_SYMBOL(dmabounce_sync_for_device);
+
+static void dmabounce_sync_for_device(struct device *dev,
+		dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+	if (!__dmabounce_sync_for_device(dev, handle, size, dir))
+		return;
+
+	arm_dma_ops.sync_single_for_device(dev, handle, size, dir);
+}
+
+static int dmabounce_set_mask(struct device *dev, u64 dma_mask)
+{
+	if (dev->archdata.dmabounce)
+		return 0;
+
+	return arm_dma_ops.set_dma_mask(dev, dma_mask);
+}
+
+static struct dma_map_ops dmabounce_ops = {
+	.map_page		= dmabounce_map_page,
+	.unmap_page		= dmabounce_unmap_page,
+	.sync_single_for_cpu	= dmabounce_sync_for_cpu,
+	.sync_single_for_device	= dmabounce_sync_for_device,
+	.map_sg			= generic_dma_map_sg,
+	.unmap_sg		= generic_dma_unmap_sg,
+	.sync_sg_for_cpu	= generic_dma_sync_sg_for_cpu,
+	.sync_sg_for_device	= generic_dma_sync_sg_for_device,
+	.set_dma_mask		= dmabounce_set_mask,
+};
 
 static int dmabounce_init_pool(struct dmabounce_pool *pool, struct device *dev,
 		const char *name, unsigned long size)
@@ -486,6 +520,7 @@ int dmabounce_register_dev(struct device *dev, unsigned long small_buffer_size,
 #endif
 
 	dev->archdata.dmabounce = device_info;
+	set_dma_ops(dev, &dmabounce_ops);
 
 	dev_info(dev, "dmabounce: registered device\n");
 
@@ -504,6 +539,7 @@ void dmabounce_unregister_dev(struct device *dev)
 	struct dmabounce_device_info *device_info = dev->archdata.dmabounce;
 
 	dev->archdata.dmabounce = NULL;
+	set_dma_ops(dev, NULL);
 
 	if (!device_info) {
 		dev_warn(dev,
diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index cf7b77c..0016bff 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -84,62 +84,6 @@ static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
 #endif
 
 /*
- * The DMA API is built upon the notion of "buffer ownership".  A buffer
- * is either exclusively owned by the CPU (and therefore may be accessed
- * by it) or exclusively owned by the DMA device.  These helper functions
- * represent the transitions between these two ownership states.
- *
- * Note, however, that on later ARMs, this notion does not work due to
- * speculative prefetches.  We model our approach on the assumption that
- * the CPU does do speculative prefetches, which means we clean caches
- * before transfers and delay cache invalidation until transfer completion.
- *
- * Private support functions: these are not part of the API and are
- * liable to change.  Drivers must not use these.
- */
-static inline void __dma_single_cpu_to_dev(const void *kaddr, size_t size,
-	enum dma_data_direction dir)
-{
-	extern void ___dma_single_cpu_to_dev(const void *, size_t,
-		enum dma_data_direction);
-
-	if (!arch_is_coherent())
-		___dma_single_cpu_to_dev(kaddr, size, dir);
-}
-
-static inline void __dma_single_dev_to_cpu(const void *kaddr, size_t size,
-	enum dma_data_direction dir)
-{
-	extern void ___dma_single_dev_to_cpu(const void *, size_t,
-		enum dma_data_direction);
-
-	if (!arch_is_coherent())
-		___dma_single_dev_to_cpu(kaddr, size, dir);
-}
-
-static inline void __dma_page_cpu_to_dev(struct page *page, unsigned long off,
-	size_t size, enum dma_data_direction dir)
-{
-	extern void ___dma_page_cpu_to_dev(struct page *, unsigned long,
-		size_t, enum dma_data_direction);
-
-	if (!arch_is_coherent())
-		___dma_page_cpu_to_dev(page, off, size, dir);
-}
-
-static inline void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
-	size_t size, enum dma_data_direction dir)
-{
-	extern void ___dma_page_dev_to_cpu(struct page *, unsigned long,
-		size_t, enum dma_data_direction);
-
-	if (!arch_is_coherent())
-		___dma_page_dev_to_cpu(page, off, size, dir);
-}
-
-extern int dma_supported(struct device *, u64);
-extern int dma_set_mask(struct device *, u64);
-/*
  * DMA errors are defined by all-bits-set in the DMA address.
  */
 static inline int dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
@@ -162,6 +106,8 @@ static inline void dma_free_noncoherent(struct device *dev, size_t size,
 {
 }
 
+extern int dma_supported(struct device *dev, u64 mask);
+
 /**
  * dma_alloc_coherent - allocate consistent memory for DMA
  * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
@@ -234,7 +180,6 @@ int dma_mmap_writecombine(struct device *, struct vm_area_struct *,
 extern void __init init_consistent_dma_size(unsigned long size);
 
 
-#ifdef CONFIG_DMABOUNCE
 /*
  * For SA-1111, IXP425, and ADI systems  the dma-mapping functions are "magic"
  * and utilize bounce buffers as needed to work around limited DMA windows.
@@ -274,47 +219,7 @@ extern int dmabounce_register_dev(struct device *, unsigned long,
  */
 extern void dmabounce_unregister_dev(struct device *);
 
-/*
- * The DMA API, implemented by dmabounce.c.  See below for descriptions.
- */
-extern dma_addr_t __dma_map_page(struct device *, struct page *,
-		unsigned long, size_t, enum dma_data_direction);
-extern void __dma_unmap_page(struct device *, dma_addr_t, size_t,
-		enum dma_data_direction);
-
-/*
- * Private functions
- */
-int dmabounce_sync_for_cpu(struct device *, dma_addr_t, size_t, enum dma_data_direction);
-int dmabounce_sync_for_device(struct device *, dma_addr_t, size_t, enum dma_data_direction);
-#else
-static inline int dmabounce_sync_for_cpu(struct device *d, dma_addr_t addr,
-	size_t size, enum dma_data_direction dir)
-{
-	return 1;
-}
-
-static inline int dmabounce_sync_for_device(struct device *d, dma_addr_t addr,
-	size_t size, enum dma_data_direction dir)
-{
-	return 1;
-}
-
 
-static inline dma_addr_t __dma_map_page(struct device *dev, struct page *page,
-	     unsigned long offset, size_t size, enum dma_data_direction dir)
-{
-	__dma_page_cpu_to_dev(page, offset, size, dir);
-	return pfn_to_dma(dev, page_to_pfn(page)) + offset;
-}
-
-static inline void __dma_unmap_page(struct device *dev, dma_addr_t handle,
-		size_t size, enum dma_data_direction dir)
-{
-	__dma_page_dev_to_cpu(pfn_to_page(dma_to_pfn(dev, handle)),
-		handle & ~PAGE_MASK, size, dir);
-}
-#endif /* CONFIG_DMABOUNCE */
 
 /*
  * The scatter list versions of the above methods.
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 31ff699..5715e2e 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -29,6 +29,75 @@
 
 #include "mm.h"
 
+/*
+ * The DMA API is built upon the notion of "buffer ownership".  A buffer
+ * is either exclusively owned by the CPU (and therefore may be accessed
+ * by it) or exclusively owned by the DMA device.  These helper functions
+ * represent the transitions between these two ownership states.
+ *
+ * Note, however, that on later ARMs, this notion does not work due to
+ * speculative prefetches.  We model our approach on the assumption that
+ * the CPU does do speculative prefetches, which means we clean caches
+ * before transfers and delay cache invalidation until transfer completion.
+ *
+ * Private support functions: these are not part of the API and are
+ * liable to change.  Drivers must not use these.
+ */
+static inline void __dma_single_cpu_to_dev(const void *kaddr, size_t size,
+	enum dma_data_direction dir)
+{
+	extern void ___dma_single_cpu_to_dev(const void *, size_t,
+		enum dma_data_direction);
+
+	if (!arch_is_coherent())
+		___dma_single_cpu_to_dev(kaddr, size, dir);
+}
+
+static inline void __dma_single_dev_to_cpu(const void *kaddr, size_t size,
+	enum dma_data_direction dir)
+{
+	extern void ___dma_single_dev_to_cpu(const void *, size_t,
+		enum dma_data_direction);
+
+	if (!arch_is_coherent())
+		___dma_single_dev_to_cpu(kaddr, size, dir);
+}
+
+static inline void __dma_page_cpu_to_dev(struct page *page, unsigned long off,
+	size_t size, enum dma_data_direction dir)
+{
+	extern void ___dma_page_cpu_to_dev(struct page *, unsigned long,
+		size_t, enum dma_data_direction);
+
+	if (!arch_is_coherent())
+		___dma_page_cpu_to_dev(page, off, size, dir);
+}
+
+static inline void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
+	size_t size, enum dma_data_direction dir)
+{
+	extern void ___dma_page_dev_to_cpu(struct page *, unsigned long,
+		size_t, enum dma_data_direction);
+
+	if (!arch_is_coherent())
+		___dma_page_dev_to_cpu(page, off, size, dir);
+}
+
+
+static inline dma_addr_t __dma_map_page(struct device *dev, struct page *page,
+	     unsigned long offset, size_t size, enum dma_data_direction dir)
+{
+	__dma_page_cpu_to_dev(page, offset, size, dir);
+	return pfn_to_dma(dev, page_to_pfn(page)) + offset;
+}
+
+static inline void __dma_unmap_page(struct device *dev, dma_addr_t handle,
+		size_t size, enum dma_data_direction dir)
+{
+	__dma_page_dev_to_cpu(pfn_to_page(dma_to_pfn(dev, handle)),
+		handle & ~PAGE_MASK, size, dir);
+}
+
 /**
  * dma_map_page - map a portion of a page for streaming DMA
  * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
@@ -77,9 +146,6 @@ static inline void arm_dma_sync_single_for_cpu(struct device *dev,
 {
 	unsigned int offset = handle & (PAGE_SIZE - 1);
 	struct page *page = pfn_to_page(dma_to_pfn(dev, handle-offset));
-	if (!dmabounce_sync_for_cpu(dev, handle, size, dir))
-		return;
-
 	__dma_page_dev_to_cpu(page, offset, size, dir);
 }
 
@@ -88,9 +154,6 @@ static inline void arm_dma_sync_single_for_device(struct device *dev,
 {
 	unsigned int offset = handle & (PAGE_SIZE - 1);
 	struct page *page = pfn_to_page(dma_to_pfn(dev, handle-offset));
-	if (!dmabounce_sync_for_device(dev, handle, size, dir))
-		return;
-
 	__dma_page_cpu_to_dev(page, offset, size, dir);
 }
 
@@ -594,7 +657,6 @@ void ___dma_page_cpu_to_dev(struct page *page, unsigned long off,
 	}
 	/* FIXME: non-speculating: flush on bidirectional mappings? */
 }
-EXPORT_SYMBOL(___dma_page_cpu_to_dev);
 
 void ___dma_page_dev_to_cpu(struct page *page, unsigned long off,
 	size_t size, enum dma_data_direction dir)
@@ -614,7 +676,6 @@ void ___dma_page_dev_to_cpu(struct page *page, unsigned long off,
 	if (dir != DMA_TO_DEVICE && off == 0 && size >= PAGE_SIZE)
 		set_bit(PG_dcache_clean, &page->flags);
 }
-EXPORT_SYMBOL(___dma_page_dev_to_cpu);
 
 /**
  * dma_map_sg - map a set of SG buffers for streaming mode DMA
@@ -732,9 +793,7 @@ static int arm_dma_set_mask(struct device *dev, u64 dma_mask)
 	if (!dev->dma_mask || !dma_supported(dev, dma_mask))
 		return -EIO;
 
-#ifndef CONFIG_DMABOUNCE
 	*dev->dma_mask = dma_mask;
-#endif
 
 	return 0;
 }
-- 
1.7.1.569.g6f426
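
The set_dma_ops() calls added above are what make the CONFIG_DMABOUNCE
ifdefs unnecessary: the generic entry points simply pick the bounce
implementation per device. Conceptually the lookup is just the following
(a sketch of how the series wires up dev_archdata, not a quote from the
patch):

static inline struct dma_map_ops *get_dma_ops(struct device *dev)
{
	if (dev && dev->archdata.dma_ops)
		return dev->archdata.dma_ops;	/* &dmabounce_ops after dmabounce_register_dev() */
	return &arm_dma_ops;			/* default streaming implementation */
}

static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
{
	dev->archdata.dma_ops = ops;
}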

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCHv6 5/7] ARM: dma-mapping: remove redundant code and cleanup
       [not found] ` <1328900324-20946-1-git-send-email-m.szyprowski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
                     ` (2 preceding siblings ...)
  2012-02-10 18:58   ` [PATCHv6 4/7] ARM: dma-mapping: move all dma bounce code to separate dma ops structure Marek Szyprowski
@ 2012-02-10 18:58   ` Marek Szyprowski
  2012-02-10 18:58     ` Marek Szyprowski
  2012-02-10 18:58   ` [PATCHv6 6/7] ARM: dma-mapping: use alloc, mmap, free from dma_ops Marek Szyprowski
  2012-02-10 18:58   ` [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper Marek Szyprowski
  5 siblings, 1 reply; 39+ messages in thread
From: Marek Szyprowski @ 2012-02-10 18:58 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linaro-mm-sig-cunTk1MwBs8s++Sfvej+rw,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-arch-u79uwXL29TY76Z2rM5mHXA,
	linux-samsung-soc-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: Shariq Hasnain, Arnd Bergmann, Benjamin Herrenschmidt,
	Krishna Reddy, Kyungmin Park, Andrzej Pietrasiewicz,
	Russell King - ARM Linux, KyongHo Cho, Chunsang Jeong

This patch just performs a global cleanup of the DMA mapping implementation
for the ARM architecture. Some of the tiny helper functions have been moved
to the caller code, and some have been merged together.

Signed-off-by: Marek Szyprowski <m.szyprowski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
Signed-off-by: Kyungmin Park <kyungmin.park-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
---
 arch/arm/mm/dma-mapping.c |   88 ++++++++++++--------------------------------
 1 files changed, 24 insertions(+), 64 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 5715e2e..7c0e68b 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -40,64 +40,12 @@
  * the CPU does do speculative prefetches, which means we clean caches
  * before transfers and delay cache invalidation until transfer completion.
  *
- * Private support functions: these are not part of the API and are
- * liable to change.  Drivers must not use these.
  */
-static inline void __dma_single_cpu_to_dev(const void *kaddr, size_t size,
-	enum dma_data_direction dir)
-{
-	extern void ___dma_single_cpu_to_dev(const void *, size_t,
-		enum dma_data_direction);
-
-	if (!arch_is_coherent())
-		___dma_single_cpu_to_dev(kaddr, size, dir);
-}
-
-static inline void __dma_single_dev_to_cpu(const void *kaddr, size_t size,
-	enum dma_data_direction dir)
-{
-	extern void ___dma_single_dev_to_cpu(const void *, size_t,
-		enum dma_data_direction);
-
-	if (!arch_is_coherent())
-		___dma_single_dev_to_cpu(kaddr, size, dir);
-}
-
-static inline void __dma_page_cpu_to_dev(struct page *page, unsigned long off,
-	size_t size, enum dma_data_direction dir)
-{
-	extern void ___dma_page_cpu_to_dev(struct page *, unsigned long,
+static void __dma_page_cpu_to_dev(struct page *, unsigned long,
 		size_t, enum dma_data_direction);
-
-	if (!arch_is_coherent())
-		___dma_page_cpu_to_dev(page, off, size, dir);
-}
-
-static inline void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
-	size_t size, enum dma_data_direction dir)
-{
-	extern void ___dma_page_dev_to_cpu(struct page *, unsigned long,
+static void __dma_page_dev_to_cpu(struct page *, unsigned long,
 		size_t, enum dma_data_direction);
 
-	if (!arch_is_coherent())
-		___dma_page_dev_to_cpu(page, off, size, dir);
-}
-
-
-static inline dma_addr_t __dma_map_page(struct device *dev, struct page *page,
-	     unsigned long offset, size_t size, enum dma_data_direction dir)
-{
-	__dma_page_cpu_to_dev(page, offset, size, dir);
-	return pfn_to_dma(dev, page_to_pfn(page)) + offset;
-}
-
-static inline void __dma_unmap_page(struct device *dev, dma_addr_t handle,
-		size_t size, enum dma_data_direction dir)
-{
-	__dma_page_dev_to_cpu(pfn_to_page(dma_to_pfn(dev, handle)),
-		handle & ~PAGE_MASK, size, dir);
-}
-
 /**
  * dma_map_page - map a portion of a page for streaming DMA
  * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
@@ -112,11 +60,13 @@ static inline void __dma_unmap_page(struct device *dev, dma_addr_t handle,
  * The device owns this memory once this call has completed.  The CPU
  * can regain ownership by calling dma_unmap_page().
  */
-static inline dma_addr_t arm_dma_map_page(struct device *dev, struct page *page,
+static dma_addr_t arm_dma_map_page(struct device *dev, struct page *page,
 	     unsigned long offset, size_t size, enum dma_data_direction dir,
 	     struct dma_attrs *attrs)
 {
-	return __dma_map_page(dev, page, offset, size, dir);
+	if (!arch_is_coherent())
+		__dma_page_cpu_to_dev(page, offset, size, dir);
+	return pfn_to_dma(dev, page_to_pfn(page)) + offset;
 }
 
 /**
@@ -134,27 +84,31 @@ static inline dma_addr_t arm_dma_map_page(struct device *dev, struct page *page,
  * whatever the device wrote there.
  */
 
-static inline void arm_dma_unmap_page(struct device *dev, dma_addr_t handle,
+static void arm_dma_unmap_page(struct device *dev, dma_addr_t handle,
 		size_t size, enum dma_data_direction dir,
 		struct dma_attrs *attrs)
 {
-	__dma_unmap_page(dev, handle, size, dir);
+	if (!arch_is_coherent())
+		__dma_page_dev_to_cpu(pfn_to_page(dma_to_pfn(dev, handle)),
+				      handle & ~PAGE_MASK, size, dir);
 }
 
-static inline void arm_dma_sync_single_for_cpu(struct device *dev,
+static void arm_dma_sync_single_for_cpu(struct device *dev,
 		dma_addr_t handle, size_t size, enum dma_data_direction dir)
 {
 	unsigned int offset = handle & (PAGE_SIZE - 1);
 	struct page *page = pfn_to_page(dma_to_pfn(dev, handle-offset));
-	__dma_page_dev_to_cpu(page, offset, size, dir);
+	if (!arch_is_coherent())
+		__dma_page_dev_to_cpu(page, offset, size, dir);
 }
 
-static inline void arm_dma_sync_single_for_device(struct device *dev,
+static void arm_dma_sync_single_for_device(struct device *dev,
 		dma_addr_t handle, size_t size, enum dma_data_direction dir)
 {
 	unsigned int offset = handle & (PAGE_SIZE - 1);
 	struct page *page = pfn_to_page(dma_to_pfn(dev, handle-offset));
-	__dma_page_cpu_to_dev(page, offset, size, dir);
+	if (!arch_is_coherent())
+		__dma_page_cpu_to_dev(page, offset, size, dir);
 }
 
 static int arm_dma_set_mask(struct device *dev, u64 dma_mask);
@@ -642,7 +596,13 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset,
 	} while (left);
 }
 
-void ___dma_page_cpu_to_dev(struct page *page, unsigned long off,
+/*
+ * Make an area consistent for devices.
+ * Note: Drivers should NOT use this function directly, as it will break
+ * platforms with CONFIG_DMABOUNCE.
+ * Use the driver DMA support - see dma-mapping.h (dma_sync_*)
+ */
+static void __dma_page_cpu_to_dev(struct page *page, unsigned long off,
 	size_t size, enum dma_data_direction dir)
 {
 	unsigned long paddr;
@@ -658,7 +618,7 @@ void ___dma_page_cpu_to_dev(struct page *page, unsigned long off,
 	/* FIXME: non-speculating: flush on bidirectional mappings? */
 }
 
-void ___dma_page_dev_to_cpu(struct page *page, unsigned long off,
+static void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
 	size_t size, enum dma_data_direction dir)
 {
 	unsigned long paddr = page_to_phys(page) + off;
-- 
1.7.1.569.g6f426
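
The buffer-ownership rules referred to in the comment block this patch trims
are easiest to see from the driver side; a small, purely illustrative example
for a device that writes into a kernel buffer (start_device_write(),
wait_for_device_irq() and consume() are placeholders):

#include <linux/dma-mapping.h>

static int my_receive(struct device *dev, void *buf, size_t len)
{
	dma_addr_t handle;

	handle = dma_map_single(dev, buf, len, DMA_FROM_DEVICE);
	if (dma_mapping_error(dev, handle))
		return -ENOMEM;

	start_device_write(handle, len);	/* device owns the buffer */
	wait_for_device_irq();

	/* give ownership back to the CPU before touching the data */
	dma_sync_single_for_cpu(dev, handle, len, DMA_FROM_DEVICE);
	consume(buf, len);

	dma_unmap_single(dev, handle, len, DMA_FROM_DEVICE);
	return 0;
}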

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCHv6 6/7] ARM: dma-mapping: use alloc, mmap, free from dma_ops
       [not found] ` <1328900324-20946-1-git-send-email-m.szyprowski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
                     ` (3 preceding siblings ...)
  2012-02-10 18:58   ` [PATCHv6 5/7] ARM: dma-mapping: remove redundant code and cleanup Marek Szyprowski
@ 2012-02-10 18:58   ` Marek Szyprowski
  2012-02-10 18:58     ` Marek Szyprowski
  2012-02-10 18:58   ` [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper Marek Szyprowski
  5 siblings, 1 reply; 39+ messages in thread
From: Marek Szyprowski @ 2012-02-10 18:58 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linaro-mm-sig-cunTk1MwBs8s++Sfvej+rw,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-arch-u79uwXL29TY76Z2rM5mHXA,
	linux-samsung-soc-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: Shariq Hasnain, Arnd Bergmann, Benjamin Herrenschmidt,
	Krishna Reddy, Kyungmin Park, Andrzej Pietrasiewicz,
	Russell King - ARM Linux, KyongHo Cho, Chunsang Jeong

This patch converts the dma_alloc/free/mmap_{coherent,writecombine}
functions to use the generic alloc/free/mmap methods from the dma_map_ops
structure. A new DMA_ATTR_WRITE_COMBINE DMA attribute has been
introduced to implement the writecombine methods.
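
As a sketch (not part of the patch itself), a caller can then request
write-combined memory through the generic attrs path like this, with the
existing writecombine helpers becoming thin wrappers that set the same
attribute:

static void *alloc_wc_buffer(struct device *dev, size_t size, dma_addr_t *handle)
{
	DEFINE_DMA_ATTRS(attrs);

	dma_set_attr(DMA_ATTR_WRITE_COMBINE, &attrs);
	return dma_alloc_attrs(dev, size, handle, GFP_KERNEL, &attrs);
}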

Signed-off-by: Marek Szyprowski <m.szyprowski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
Signed-off-by: Kyungmin Park <kyungmin.park-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
---
 arch/arm/common/dmabounce.c        |    3 +
 arch/arm/include/asm/dma-mapping.h |  107 ++++++++++++++++++++++++++----------
 arch/arm/mm/dma-mapping.c          |   53 ++++++------------
 3 files changed, 98 insertions(+), 65 deletions(-)

diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c
index 5e7ba61..739407e 100644
--- a/arch/arm/common/dmabounce.c
+++ b/arch/arm/common/dmabounce.c
@@ -449,6 +449,9 @@ static int dmabounce_set_mask(struct device *dev, u64 dma_mask)
 }
 
 static struct dma_map_ops dmabounce_ops = {
+	.alloc			= arm_dma_alloc,
+	.free			= arm_dma_free,
+	.mmap			= arm_dma_mmap,
 	.map_page		= dmabounce_map_page,
 	.unmap_page		= dmabounce_unmap_page,
 	.sync_single_for_cpu	= dmabounce_sync_for_cpu,
diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 0016bff..ca7a378 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -5,6 +5,7 @@
 
 #include <linux/mm_types.h>
 #include <linux/scatterlist.h>
+#include <linux/dma-attrs.h>
 #include <linux/dma-debug.h>
 
 #include <asm-generic/dma-coherent.h>
@@ -109,68 +110,115 @@ static inline void dma_free_noncoherent(struct device *dev, size_t size,
 extern int dma_supported(struct device *dev, u64 mask);
 
 /**
- * dma_alloc_coherent - allocate consistent memory for DMA
+ * arm_dma_alloc - allocate consistent memory for DMA
  * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
  * @size: required memory size
  * @handle: bus-specific DMA address
+ * @attrs: optional attributes that specify mapping properties
  *
- * Allocate some uncached, unbuffered memory for a device for
- * performing DMA.  This function allocates pages, and will
- * return the CPU-viewed address, and sets @handle to be the
- * device-viewed address.
+ * Allocate some memory for a device for performing DMA.  This function
+ * allocates pages, and will return the CPU-viewed address, and sets @handle
+ * to be the device-viewed address.
  */
-extern void *dma_alloc_coherent(struct device *, size_t, dma_addr_t *, gfp_t);
+extern void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
+			   gfp_t gfp, struct dma_attrs *attrs);
+
+#define dma_alloc_coherent(d,s,h,f) dma_alloc_attrs(d,s,h,f,NULL)
+
+static inline void *dma_alloc_attrs(struct device *dev, size_t size,
+				       dma_addr_t *dma_handle, gfp_t flag,
+				       struct dma_attrs *attrs)
+{
+	struct dma_map_ops *ops = get_dma_ops(dev);
+	void *cpu_addr;
+	BUG_ON(!ops);
+
+	cpu_addr = ops->alloc(dev, size, dma_handle, flag, attrs);
+	debug_dma_alloc_coherent(dev, size, *dma_handle, cpu_addr);
+	return cpu_addr;
+}
 
 /**
- * dma_free_coherent - free memory allocated by dma_alloc_coherent
+ * arm_dma_free - free memory allocated by arm_dma_alloc
  * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
  * @size: size of memory originally requested in dma_alloc_coherent
  * @cpu_addr: CPU-view address returned from dma_alloc_coherent
  * @handle: device-view address returned from dma_alloc_coherent
+ * @attrs: optional attributes that specify mapping properties
  *
  * Free (and unmap) a DMA buffer previously allocated by
- * dma_alloc_coherent().
+ * arm_dma_alloc().
  *
  * References to memory and mappings associated with cpu_addr/handle
  * during and after this call executing are illegal.
  */
-extern void dma_free_coherent(struct device *, size_t, void *, dma_addr_t);
+extern void arm_dma_free(struct device *dev, size_t size, void *cpu_addr,
+			 dma_addr_t handle, struct dma_attrs *attrs);
+
+#define dma_free_coherent(d,s,c,h) dma_free_attrs(d,s,c,h,NULL)
+
+static inline void dma_free_attrs(struct device *dev, size_t size,
+				     void *cpu_addr, dma_addr_t dma_handle,
+				     struct dma_attrs *attrs)
+{
+	struct dma_map_ops *ops = get_dma_ops(dev);
+	BUG_ON(!ops);
+
+	debug_dma_free_coherent(dev, size, cpu_addr, dma_handle);
+	ops->free(dev, size, cpu_addr, dma_handle, attrs);
+}
 
 /**
- * dma_mmap_coherent - map a coherent DMA allocation into user space
+ * arm_dma_mmap - map a coherent DMA allocation into user space
  * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
  * @vma: vm_area_struct describing requested user mapping
  * @cpu_addr: kernel CPU-view address returned from dma_alloc_coherent
  * @handle: device-view address returned from dma_alloc_coherent
  * @size: size of memory originally requested in dma_alloc_coherent
+ * @attrs: optional attributes that specify mapping properties
  *
  * Map a coherent DMA buffer previously allocated by dma_alloc_coherent
  * into user space.  The coherent DMA buffer must not be freed by the
  * driver until the user space mapping has been released.
  */
-int dma_mmap_coherent(struct device *, struct vm_area_struct *,
-		void *, dma_addr_t, size_t);
+extern int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma,
+			void *cpu_addr, dma_addr_t dma_addr, size_t size,
+			struct dma_attrs *attrs);
 
+#define dma_mmap_coherent(d,v,c,h,s) dma_mmap_attrs(d,v,c,h,s,NULL)
 
-/**
- * dma_alloc_writecombine - allocate writecombining memory for DMA
- * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
- * @size: required memory size
- * @handle: bus-specific DMA address
- *
- * Allocate some uncached, buffered memory for a device for
- * performing DMA.  This function allocates pages, and will
- * return the CPU-viewed address, and sets @handle to be the
- * device-viewed address.
- */
-extern void *dma_alloc_writecombine(struct device *, size_t, dma_addr_t *,
-		gfp_t);
+static inline int dma_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
+				  void *cpu_addr, dma_addr_t dma_addr,
+				  size_t size, struct dma_attrs *attrs)
+{
+	struct dma_map_ops *ops = get_dma_ops(dev);
+	BUG_ON(!ops);
+	return ops->mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
+}
 
-#define dma_free_writecombine(dev,size,cpu_addr,handle) \
-	dma_free_coherent(dev,size,cpu_addr,handle)
+static inline void *dma_alloc_writecombine(struct device *dev, size_t size,
+				       dma_addr_t *dma_handle, gfp_t flag)
+{
+	DEFINE_DMA_ATTRS(attrs);
+	dma_set_attr(DMA_ATTR_WRITE_COMBINE, &attrs);
+	return dma_alloc_attrs(dev, size, dma_handle, flag, &attrs);
+}
 
-int dma_mmap_writecombine(struct device *, struct vm_area_struct *,
-		void *, dma_addr_t, size_t);
+static inline void dma_free_writecombine(struct device *dev, size_t size,
+				     void *cpu_addr, dma_addr_t dma_handle)
+{
+	DEFINE_DMA_ATTRS(attrs);
+	dma_set_attr(DMA_ATTR_WRITE_COMBINE, &attrs);
+	return dma_free_attrs(dev, size, cpu_addr, dma_handle, &attrs);
+}
+
+static inline int dma_mmap_writecombine(struct device *dev, struct vm_area_struct *vma,
+		      void *cpu_addr, dma_addr_t dma_addr, size_t size)
+{
+	DEFINE_DMA_ATTRS(attrs);
+	dma_set_attr(DMA_ATTR_WRITE_COMBINE, &attrs);
+	return dma_mmap_attrs(dev, vma, cpu_addr, dma_addr, size, &attrs);
+}
 
 /*
  * This can be called during boot to increase the size of the consistent
@@ -179,7 +227,6 @@ int dma_mmap_writecombine(struct device *, struct vm_area_struct *,
  */
 extern void __init init_consistent_dma_size(unsigned long size);
 
-
 /*
  * For SA-1111, IXP425, and ADI systems  the dma-mapping functions are "magic"
  * and utilize bounce buffers as needed to work around limited DMA windows.
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 7c0e68b..4845c09 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -114,6 +114,9 @@ static void arm_dma_sync_single_for_device(struct device *dev,
 static int arm_dma_set_mask(struct device *dev, u64 dma_mask);
 
 struct dma_map_ops arm_dma_ops = {
+	.alloc			= arm_dma_alloc,
+	.free			= arm_dma_free,
+	.mmap			= arm_dma_mmap,
 	.map_page		= arm_dma_map_page,
 	.unmap_page		= arm_dma_unmap_page,
 	.map_sg			= arm_dma_map_sg,
@@ -462,33 +465,26 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp,
  * Allocate DMA-coherent memory space and return both the kernel remapped
  * virtual and bus address for that space.
  */
-void *
-dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp)
+void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
+		    gfp_t gfp, struct dma_attrs *attrs)
 {
+	pgprot_t prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ?
+			pgprot_writecombine(pgprot_kernel) :
+			pgprot_dmacoherent(pgprot_kernel);
 	void *memory;
 
 	if (dma_alloc_from_coherent(dev, size, handle, &memory))
 		return memory;
 
-	return __dma_alloc(dev, size, handle, gfp,
-			   pgprot_dmacoherent(pgprot_kernel));
+	return __dma_alloc(dev, size, handle, gfp, prot);
 }
-EXPORT_SYMBOL(dma_alloc_coherent);
 
 /*
- * Allocate a writecombining region, in much the same way as
- * dma_alloc_coherent above.
+ * Create userspace mapping for the DMA-coherent memory.
  */
-void *
-dma_alloc_writecombine(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp)
-{
-	return __dma_alloc(dev, size, handle, gfp,
-			   pgprot_writecombine(pgprot_kernel));
-}
-EXPORT_SYMBOL(dma_alloc_writecombine);
-
-static int dma_mmap(struct device *dev, struct vm_area_struct *vma,
-		    void *cpu_addr, dma_addr_t dma_addr, size_t size)
+int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma,
+		 void *cpu_addr, dma_addr_t dma_addr, size_t size,
+		 struct dma_attrs *attrs)
 {
 	int ret = -ENXIO;
 #ifdef CONFIG_MMU
@@ -496,6 +492,9 @@ static int dma_mmap(struct device *dev, struct vm_area_struct *vma,
 	struct arm_vmregion *c;
 
 	user_size = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
+	vma->vm_page_prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ?
+			    pgprot_writecombine(vma->vm_page_prot) :
+			    pgprot_dmacoherent(vma->vm_page_prot);
 
 	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);
 	if (c) {
@@ -516,27 +515,12 @@ static int dma_mmap(struct device *dev, struct vm_area_struct *vma,
 	return ret;
 }
 
-int dma_mmap_coherent(struct device *dev, struct vm_area_struct *vma,
-		      void *cpu_addr, dma_addr_t dma_addr, size_t size)
-{
-	vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot);
-	return dma_mmap(dev, vma, cpu_addr, dma_addr, size);
-}
-EXPORT_SYMBOL(dma_mmap_coherent);
-
-int dma_mmap_writecombine(struct device *dev, struct vm_area_struct *vma,
-			  void *cpu_addr, dma_addr_t dma_addr, size_t size)
-{
-	vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
-	return dma_mmap(dev, vma, cpu_addr, dma_addr, size);
-}
-EXPORT_SYMBOL(dma_mmap_writecombine);
-
 /*
  * free a page as defined by the above mapping.
  * Must not be called with IRQs disabled.
  */
-void dma_free_coherent(struct device *dev, size_t size, void *cpu_addr, dma_addr_t handle)
+void arm_dma_free(struct device *dev, size_t size, void *cpu_addr,
+		  dma_addr_t handle, struct dma_attrs *attrs)
 {
 	WARN_ON(irqs_disabled());
 
@@ -550,7 +534,6 @@ void dma_free_coherent(struct device *dev, size_t size, void *cpu_addr, dma_addr
 
 	__dma_free_buffer(pfn_to_page(dma_to_pfn(dev, handle)), size);
 }
-EXPORT_SYMBOL(dma_free_coherent);
 
 static void dma_cache_maint_page(struct page *page, unsigned long offset,
 	size_t size, enum dma_data_direction dir,
-- 
1.7.1.569.g6f426

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper
       [not found] ` <1328900324-20946-1-git-send-email-m.szyprowski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
                     ` (4 preceding siblings ...)
  2012-02-10 18:58   ` [PATCHv6 6/7] ARM: dma-mapping: use alloc, mmap, free from dma_ops Marek Szyprowski
@ 2012-02-10 18:58   ` Marek Szyprowski
  2012-02-10 18:58     ` Marek Szyprowski
                       ` (3 more replies)
  5 siblings, 4 replies; 39+ messages in thread
From: Marek Szyprowski @ 2012-02-10 18:58 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linaro-mm-sig-cunTk1MwBs8s++Sfvej+rw,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-arch-u79uwXL29TY76Z2rM5mHXA,
	linux-samsung-soc-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: Shariq Hasnain, Arnd Bergmann, Benjamin Herrenschmidt,
	Krishna Reddy, Kyungmin Park, Andrzej Pietrasiewicz,
	Russell King - ARM Linux, KyongHo Cho, Chunsang Jeong

This patch adds a complete implementation of the DMA-mapping API for
devices that have IOMMU support. All DMA-mapping calls are supported.
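
[Illustration, not part of the patch: platform code is expected to create
a mapping and attach it to a device roughly as below.  The bus type, IOVA
base address and window size are made-up example values.]

#include <linux/err.h>
#include <linux/platform_device.h>
#include <asm/dma-iommu.h>
#include <asm/sizes.h>

static int example_attach_iommu(struct device *dev)
{
        struct dma_iommu_mapping *mapping;
        int ret;

        /* 128MiB of IO virtual address space at 0x80000000, order-0 pages. */
        mapping = arm_iommu_create_mapping(&platform_bus_type,
                                           0x80000000, SZ_128M, 0);
        if (IS_ERR(mapping))
                return PTR_ERR(mapping);

        ret = arm_iommu_attach_device(dev, mapping);
        if (ret < 0) {
                arm_iommu_release_mapping(mapping);
                return ret;
        }

        /* From now on all dma_* calls for this device use iommu_ops. */
        return 0;
}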

This patch contains some of the code kindly provided by Krishna Reddy
<vdumpa-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org> and Andrzej Pietrasiewicz <andrzej.p-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
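
[Illustration, not part of the patch: drivers keep using dma_map_sg() and
the sg_dma_address()/sg_dma_len() accessors as before; the IOMMU backend
merely merges pages into contiguous IOVA ranges behind them.  Error
handling is trimmed and the device programming is only hinted at.]

#include <linux/kernel.h>
#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

static int example_map_sg(struct device *dev, struct scatterlist *sgl,
                          int nents)
{
        struct scatterlist *sg;
        int i, count;

        count = dma_map_sg(dev, sgl, nents, DMA_TO_DEVICE);
        if (count == 0)
                return -ENOMEM;

        /* Program the device with each (possibly merged) DMA segment. */
        for_each_sg(sgl, sg, count, i)
                pr_info("seg %d: addr %#llx len %u\n", i,
                        (unsigned long long)sg_dma_address(sg),
                        sg_dma_len(sg));

        dma_unmap_sg(dev, sgl, nents, DMA_TO_DEVICE);
        return 0;
}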

Signed-off-by: Marek Szyprowski <m.szyprowski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
Signed-off-by: Kyungmin Park <kyungmin.park-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
---
 arch/arm/Kconfig                 |    8 +
 arch/arm/include/asm/device.h    |    3 +
 arch/arm/include/asm/dma-iommu.h |   34 ++
 arch/arm/mm/dma-mapping.c        |  635 +++++++++++++++++++++++++++++++++++++-
 arch/arm/mm/vmregion.h           |    2 +-
 5 files changed, 667 insertions(+), 15 deletions(-)
 create mode 100644 arch/arm/include/asm/dma-iommu.h

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 59102fb..5d9a0b6 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -44,6 +44,14 @@ config ARM
 config ARM_HAS_SG_CHAIN
 	bool
 
+config NEED_SG_DMA_LENGTH
+	bool
+
+config ARM_DMA_USE_IOMMU
+	select NEED_SG_DMA_LENGTH
+	select ARM_HAS_SG_CHAIN
+	bool
+
 config HAVE_PWM
 	bool
 
diff --git a/arch/arm/include/asm/device.h b/arch/arm/include/asm/device.h
index 6e2cb0e..b69c0d3 100644
--- a/arch/arm/include/asm/device.h
+++ b/arch/arm/include/asm/device.h
@@ -14,6 +14,9 @@ struct dev_archdata {
 #ifdef CONFIG_IOMMU_API
 	void *iommu; /* private IOMMU data */
 #endif
+#ifdef CONFIG_ARM_DMA_USE_IOMMU
+	struct dma_iommu_mapping	*mapping;
+#endif
 };
 
 struct omap_device;
diff --git a/arch/arm/include/asm/dma-iommu.h b/arch/arm/include/asm/dma-iommu.h
new file mode 100644
index 0000000..799b094
--- /dev/null
+++ b/arch/arm/include/asm/dma-iommu.h
@@ -0,0 +1,34 @@
+#ifndef ASMARM_DMA_IOMMU_H
+#define ASMARM_DMA_IOMMU_H
+
+#ifdef __KERNEL__
+
+#include <linux/mm_types.h>
+#include <linux/scatterlist.h>
+#include <linux/dma-debug.h>
+#include <linux/kmemcheck.h>
+
+struct dma_iommu_mapping {
+	/* iommu specific data */
+	struct iommu_domain	*domain;
+
+	void			*bitmap;
+	size_t			bits;
+	unsigned int		order;
+	dma_addr_t		base;
+
+	spinlock_t		lock;
+	struct kref		kref;
+};
+
+struct dma_iommu_mapping *
+arm_iommu_create_mapping(struct bus_type *bus, dma_addr_t base, size_t size,
+			 int order);
+
+void arm_iommu_release_mapping(struct dma_iommu_mapping *mapping);
+
+int arm_iommu_attach_device(struct device *dev,
+					struct dma_iommu_mapping *mapping);
+
+#endif /* __KERNEL__ */
+#endif
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 4845c09..4163691 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -19,6 +19,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/highmem.h>
 #include <linux/slab.h>
+#include <linux/iommu.h>
 
 #include <asm/memory.h>
 #include <asm/highmem.h>
@@ -26,6 +27,7 @@
 #include <asm/tlbflush.h>
 #include <asm/sizes.h>
 #include <asm/mach/arch.h>
+#include <asm/dma-iommu.h>
 
 #include "mm.h"
 
@@ -156,6 +158,19 @@ static u64 get_coherent_dma_mask(struct device *dev)
 	return mask;
 }
 
+static void __dma_clear_buffer(struct page *page, size_t size)
+{
+	void *ptr;
+	/*
+	 * Ensure that the allocated pages are zeroed, and that any data
+	 * lurking in the kernel direct-mapped region is invalidated.
+	 */
+	ptr = page_address(page);
+	memset(ptr, 0, size);
+	dmac_flush_range(ptr, ptr + size);
+	outer_flush_range(__pa(ptr), __pa(ptr) + size);
+}
+
 /*
  * Allocate a DMA buffer for 'dev' of size 'size' using the
  * specified gfp mask.  Note that 'size' must be page aligned.
@@ -164,7 +179,6 @@ static struct page *__dma_alloc_buffer(struct device *dev, size_t size, gfp_t gf
 {
 	unsigned long order = get_order(size);
 	struct page *page, *p, *e;
-	void *ptr;
 	u64 mask = get_coherent_dma_mask(dev);
 
 #ifdef CONFIG_DMA_API_DEBUG
@@ -193,14 +207,7 @@ static struct page *__dma_alloc_buffer(struct device *dev, size_t size, gfp_t gf
 	for (p = page + (size >> PAGE_SHIFT), e = page + (1 << order); p < e; p++)
 		__free_page(p);
 
-	/*
-	 * Ensure that the allocated pages are zeroed, and that any data
-	 * lurking in the kernel direct-mapped region is invalidated.
-	 */
-	ptr = page_address(page);
-	memset(ptr, 0, size);
-	dmac_flush_range(ptr, ptr + size);
-	outer_flush_range(__pa(ptr), __pa(ptr) + size);
+	__dma_clear_buffer(page, size);
 
 	return page;
 }
@@ -348,7 +355,7 @@ __dma_alloc_remap(struct page *page, size_t size, gfp_t gfp, pgprot_t prot)
 		u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1);
 
 		pte = consistent_pte[idx] + off;
-		c->vm_pages = page;
+		c->priv = page;
 
 		do {
 			BUG_ON(!pte_none(*pte));
@@ -461,6 +468,14 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp,
 	return addr;
 }
 
+static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot)
+{
+	prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ?
+			    pgprot_writecombine(prot) :
+			    pgprot_dmacoherent(prot);
+	return prot;
+}
+
 /*
  * Allocate DMA-coherent memory space and return both the kernel remapped
  * virtual and bus address for that space.
@@ -468,9 +483,7 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp,
 void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
 		    gfp_t gfp, struct dma_attrs *attrs)
 {
-	pgprot_t prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ?
-			pgprot_writecombine(pgprot_kernel) :
-			pgprot_dmacoherent(pgprot_kernel);
+	pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel);
 	void *memory;
 
 	if (dma_alloc_from_coherent(dev, size, handle, &memory))
@@ -499,13 +512,14 @@ int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);
 	if (c) {
 		unsigned long off = vma->vm_pgoff;
+		struct page *pages = c->priv;
 
 		kern_size = (c->vm_end - c->vm_start) >> PAGE_SHIFT;
 
 		if (off < kern_size &&
 		    user_size <= (kern_size - off)) {
 			ret = remap_pfn_range(vma, vma->vm_start,
-					      page_to_pfn(c->vm_pages) + off,
+					      page_to_pfn(pages) + off,
 					      user_size << PAGE_SHIFT,
 					      vma->vm_page_prot);
 		}
@@ -644,6 +658,9 @@ int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
 	int i, j;
 
 	for_each_sg(sg, s, nents, i) {
+#ifdef CONFIG_NEED_SG_DMA_LENGTH
+		s->dma_length = s->length;
+#endif
 		s->dma_address = ops->map_page(dev, sg_page(s), s->offset,
 						s->length, dir, attrs);
 		if (dma_mapping_error(dev, s->dma_address))
@@ -749,3 +766,593 @@ static int __init dma_debug_do_init(void)
 	return 0;
 }
 fs_initcall(dma_debug_do_init);
+
+#ifdef CONFIG_ARM_DMA_USE_IOMMU
+
+/* IOMMU */
+
+static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping *mapping,
+				      size_t size)
+{
+	unsigned int order = get_order(size);
+	unsigned int align = 0;
+	unsigned int count, start;
+	unsigned long flags;
+
+	count = ((PAGE_ALIGN(size) >> PAGE_SHIFT) +
+		 (1 << mapping->order) - 1) >> mapping->order;
+
+	if (order > mapping->order)
+		align = (1 << (order - mapping->order)) - 1;
+
+	spin_lock_irqsave(&mapping->lock, flags);
+	start = bitmap_find_next_zero_area(mapping->bitmap, mapping->bits, 0,
+					   count, align);
+	if (start > mapping->bits) {
+		spin_unlock_irqrestore(&mapping->lock, flags);
+		return ~0;
+	}
+
+	bitmap_set(mapping->bitmap, start, count);
+	spin_unlock_irqrestore(&mapping->lock, flags);
+
+	return mapping->base + (start << (mapping->order + PAGE_SHIFT));
+}
+
+static inline void __free_iova(struct dma_iommu_mapping *mapping,
+			       dma_addr_t addr, size_t size)
+{
+	unsigned int start = (addr - mapping->base) >>
+			     (mapping->order + PAGE_SHIFT);
+	unsigned int count = ((size >> PAGE_SHIFT) +
+			      (1 << mapping->order) - 1) >> mapping->order;
+	unsigned long flags;
+
+	spin_lock_irqsave(&mapping->lock, flags);
+	bitmap_clear(mapping->bitmap, start, count);
+	spin_unlock_irqrestore(&mapping->lock, flags);
+}
+
+static struct page **__iommu_alloc_buffer(struct device *dev, size_t size, gfp_t gfp)
+{
+	struct page **pages;
+	int count = size >> PAGE_SHIFT;
+	int i=0;
+
+	pages = kzalloc(count * sizeof(struct page*), gfp);
+	if (!pages)
+		return NULL;
+
+	while (count) {
+		int j, order = __ffs(count);
+
+		pages[i] = alloc_pages(gfp | __GFP_NOWARN, order);
+		while (!pages[i] && order)
+			pages[i] = alloc_pages(gfp | __GFP_NOWARN, --order);
+		if (!pages[i])
+			goto error;
+
+		if (order)
+			split_page(pages[i], order);
+		j = 1 << order;
+		while (--j)
+			pages[i + j] = pages[i] + j;
+
+		__dma_clear_buffer(pages[i], PAGE_SIZE << order);
+		i += 1 << order;
+		count -= 1 << order;
+	}
+
+	return pages;
+error:
+	while (--i)
+		if (pages[i])
+			__free_pages(pages[i], 0);
+	kfree(pages);
+	return NULL;
+}
+
+static int __iommu_free_buffer(struct device *dev, struct page **pages, size_t size)
+{
+	int count = size >> PAGE_SHIFT;
+	int i;
+	for (i=0; i< count; i++)
+		if (pages[i])
+			__free_pages(pages[i], 0);
+	kfree(pages);
+	return 0;
+}
+
+static void *
+__iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot)
+{
+	struct arm_vmregion *c;
+	size_t align;
+	size_t count = size >> PAGE_SHIFT;
+	int bit;
+
+	if (!consistent_pte[0]) {
+		printk(KERN_ERR "%s: not initialised\n", __func__);
+		dump_stack();
+		return NULL;
+	}
+
+	/*
+	 * Align the virtual region allocation - maximum alignment is
+	 * a section size, minimum is a page size.  This helps reduce
+	 * fragmentation of the DMA space, and also prevents allocations
+	 * smaller than a section from crossing a section boundary.
+	 */
+	bit = fls(size - 1);
+	if (bit > SECTION_SHIFT)
+		bit = SECTION_SHIFT;
+	align = 1 << bit;
+
+	/*
+	 * Allocate a virtual address in the consistent mapping region.
+	 */
+	c = arm_vmregion_alloc(&consistent_head, align, size,
+			    gfp & ~(__GFP_DMA | __GFP_HIGHMEM));
+	if (c) {
+		pte_t *pte;
+		int idx = CONSISTENT_PTE_INDEX(c->vm_start);
+		int i = 0;
+		u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1);
+
+		pte = consistent_pte[idx] + off;
+		c->priv = pages;
+
+		do {
+			BUG_ON(!pte_none(*pte));
+
+			set_pte_ext(pte, mk_pte(pages[i], prot), 0);
+			pte++;
+			off++;
+			i++;
+			if (off >= PTRS_PER_PTE) {
+				off = 0;
+				pte = consistent_pte[++idx];
+			}
+		} while (i < count);
+
+		dsb();
+
+		return (void *)c->vm_start;
+	}
+	return NULL;
+}
+
+static dma_addr_t __iommu_create_mapping(struct device *dev, struct page **pages, size_t size)
+{
+	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
+	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	dma_addr_t dma_addr, iova;
+	int i, ret = ~0;
+
+	dma_addr = __alloc_iova(mapping, size);
+	if (dma_addr == ~0)
+		goto fail;
+
+	iova = dma_addr;
+	for (i=0; i<count; ) {
+		unsigned int phys = page_to_phys(pages[i]);
+		int j = i + 1;
+
+		while (j < count) {
+			if (page_to_phys(pages[j]) != phys + (j - i) * PAGE_SIZE)
+				break;
+			j++;
+		}
+
+		ret = iommu_map(mapping->domain, iova, phys, (j - i) * PAGE_SIZE, 0);
+		if (ret < 0)
+			goto fail;
+		iova += (j - i) * PAGE_SIZE;
+		i = j;
+	}
+
+	return dma_addr;
+fail:
+	return ~0;
+}
+
+static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova, size_t size)
+{
+	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
+	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+
+	iova &= PAGE_MASK;
+
+	iommu_unmap(mapping->domain, iova, count * PAGE_SIZE);
+
+	__free_iova(mapping, iova, size);
+	return 0;
+}
+
+static void *arm_iommu_alloc_attrs(struct device *dev, size_t size,
+	    dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs)
+{
+	pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel);
+	struct page **pages;
+	void *addr = NULL;
+
+	*handle = ~0;
+	size = PAGE_ALIGN(size);
+
+	pages = __iommu_alloc_buffer(dev, size, gfp);
+	if (!pages)
+		return NULL;
+
+	*handle = __iommu_create_mapping(dev, pages, size);
+	if (*handle == ~0)
+		goto err_buffer;
+
+	addr = __iommu_alloc_remap(pages, size, gfp, prot);
+	if (!addr)
+		goto err_mapping;
+
+	return addr;
+
+err_mapping:
+	__iommu_remove_mapping(dev, *handle, size);
+err_buffer:
+	__iommu_free_buffer(dev, pages, size);
+	return NULL;
+}
+
+static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
+		    void *cpu_addr, dma_addr_t dma_addr, size_t size,
+		    struct dma_attrs *attrs)
+{
+	struct arm_vmregion *c;
+
+	vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot);
+	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);
+
+	if (c) {
+		struct page **pages = c->priv;
+
+		unsigned long uaddr = vma->vm_start;
+		unsigned long usize = vma->vm_end - vma->vm_start;
+		int i = 0;
+
+		do {
+			int ret;
+
+			ret = vm_insert_page(vma, uaddr, pages[i++]);
+			if (ret) {
+				printk(KERN_ERR "Remapping memory, error: %d\n", ret);
+				return ret;
+			}
+
+			uaddr += PAGE_SIZE;
+			usize -= PAGE_SIZE;
+		} while (usize > 0);
+	}
+	return 0;
+}
+
+/*
+ * free a page as defined by the above mapping.
+ * Must not be called with IRQs disabled.
+ */
+void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
+			  dma_addr_t handle, struct dma_attrs *attrs)
+{
+	struct arm_vmregion *c;
+	size = PAGE_ALIGN(size);
+
+	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);
+	if (c) {
+		struct page **pages = c->priv;
+		__dma_free_remap(cpu_addr, size);
+		__iommu_remove_mapping(dev, handle, size);
+		__iommu_free_buffer(dev, pages, size);
+	}
+}
+
+static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
+			  size_t size, dma_addr_t *handle,
+			  enum dma_data_direction dir)
+{
+	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
+	dma_addr_t iova, iova_base;
+	int ret = 0;
+	unsigned int count;
+	struct scatterlist *s;
+
+	size = PAGE_ALIGN(size);
+	*handle = ~0;
+
+	iova_base = iova = __alloc_iova(mapping, size);
+	if (iova == ~0)
+		return -ENOMEM;
+
+	for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s))
+	{
+		phys_addr_t phys = page_to_phys(sg_page(s));
+		unsigned int len = PAGE_ALIGN(s->offset + s->length);
+
+		if (!arch_is_coherent())
+			__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
+
+		ret = iommu_map(mapping->domain, iova, phys, len, 0);
+		if (ret < 0)
+			goto fail;
+		count += len >> PAGE_SHIFT;
+		iova += len;
+	}
+	*handle = iova_base;
+
+	return 0;
+fail:
+	iommu_unmap(mapping->domain, iova_base, count * PAGE_SIZE);
+	__free_iova(mapping, iova_base, size);
+	return ret;
+}
+
+int arm_iommu_map_sg(struct device *dev, struct scatterlist *sg, int nents,
+		     enum dma_data_direction dir, struct dma_attrs *attrs)
+{
+	struct scatterlist *s = sg, *dma = sg, *start = sg;
+	int i, count = 0;
+	unsigned int offset = s->offset;
+	unsigned int size = s->offset + s->length;
+	unsigned int max = dma_get_max_seg_size(dev);
+
+	s->dma_address = ~0;
+	s->dma_length = 0;
+
+	for (i = 1; i < nents; i++) {
+		s->dma_address = ~0;
+		s->dma_length = 0;
+
+		s = sg_next(s);
+
+		if (s->offset || (size & ~PAGE_MASK) || size + s->length > max) {
+			if (__map_sg_chunk(dev, start, size, &dma->dma_address,
+			    dir) < 0)
+				goto bad_mapping;
+
+			dma->dma_address += offset;
+			dma->dma_length = size - offset;
+
+			size = offset = s->offset;
+			start = s;
+			dma = sg_next(dma);
+			count += 1;
+		}
+		size += s->length;
+	}
+	if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir) < 0)
+		goto bad_mapping;
+
+	dma->dma_address += offset;
+	dma->dma_length = size - offset;
+
+	return count+1;
+
+bad_mapping:
+	for_each_sg(sg, s, count, i)
+		__iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s));
+	return 0;
+}
+
+void arm_iommu_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
+			enum dma_data_direction dir, struct dma_attrs *attrs)
+{
+	struct scatterlist *s;
+	int i;
+
+	for_each_sg(sg, s, nents, i) {
+		if (sg_dma_len(s))
+			__iommu_remove_mapping(dev, sg_dma_address(s),
+					       sg_dma_len(s));
+		if (!arch_is_coherent())
+			__dma_page_dev_to_cpu(sg_page(s), s->offset,
+					      s->length, dir);
+	}
+}
+
+
+/**
+ * dma_sync_sg_for_cpu
+ * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
+ * @sg: list of buffers
+ * @nents: number of buffers to map (returned from dma_map_sg)
+ * @dir: DMA transfer direction (same as was passed to dma_map_sg)
+ */
+void arm_iommu_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
+			int nents, enum dma_data_direction dir)
+{
+	struct scatterlist *s;
+	int i;
+
+	for_each_sg(sg, s, nents, i)
+		if (!arch_is_coherent())
+			__dma_page_dev_to_cpu(sg_page(s), s->offset, s->length, dir);
+
+}
+
+/**
+ * dma_sync_sg_for_device
+ * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
+ * @sg: list of buffers
+ * @nents: number of buffers to map (returned from dma_map_sg)
+ * @dir: DMA transfer direction (same as was passed to dma_map_sg)
+ */
+void arm_iommu_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
+			int nents, enum dma_data_direction dir)
+{
+	struct scatterlist *s;
+	int i;
+
+	for_each_sg(sg, s, nents, i)
+		if (!arch_is_coherent())
+			__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
+}
+
+static dma_addr_t arm_iommu_map_page(struct device *dev, struct page *page,
+	     unsigned long offset, size_t size, enum dma_data_direction dir,
+	     struct dma_attrs *attrs)
+{
+	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
+	dma_addr_t dma_addr, iova;
+	unsigned int phys;
+	int ret, len = PAGE_ALIGN(size + offset);
+
+	if (!arch_is_coherent())
+		__dma_page_cpu_to_dev(page, offset, size, dir);
+
+	dma_addr = iova = __alloc_iova(mapping, len);
+	if (iova == ~0)
+		goto fail;
+
+	dma_addr += offset;
+	phys = page_to_phys(page);
+	ret = iommu_map(mapping->domain, iova, phys, size, 0);
+	if (ret < 0)
+		goto fail;
+
+	return dma_addr;
+fail:
+	return ~0;
+}
+
+static void arm_iommu_unmap_page(struct device *dev, dma_addr_t handle,
+		size_t size, enum dma_data_direction dir,
+		struct dma_attrs *attrs)
+{
+	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
+	dma_addr_t iova = handle & PAGE_MASK;
+	struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
+	int offset = handle & ~PAGE_MASK;
+
+	if (!iova)
+		return;
+
+	if (!arch_is_coherent())
+		__dma_page_dev_to_cpu(page, offset, size, dir);
+
+	iommu_unmap(mapping->domain, iova, size);
+	__free_iova(mapping, iova, size);
+}
+
+static void arm_iommu_sync_single_for_cpu(struct device *dev,
+		dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
+	dma_addr_t iova = handle & PAGE_MASK;
+	struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
+	unsigned int offset = handle & ~PAGE_MASK;
+
+	if (!iova)
+		return;
+
+	if (!arch_is_coherent())
+		__dma_page_dev_to_cpu(page, offset, size, dir);
+}
+
+static void arm_iommu_sync_single_for_device(struct device *dev,
+		dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
+	dma_addr_t iova = handle & PAGE_MASK;
+	struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
+	unsigned int offset = handle & ~PAGE_MASK;
+
+	if (!iova)
+		return;
+
+	__dma_page_cpu_to_dev(page, offset, size, dir);
+}
+
+struct dma_map_ops iommu_ops = {
+	.alloc		= arm_iommu_alloc_attrs,
+	.free		= arm_iommu_free_attrs,
+	.mmap		= arm_iommu_mmap_attrs,
+
+	.map_page		= arm_iommu_map_page,
+	.unmap_page		= arm_iommu_unmap_page,
+	.sync_single_for_cpu	= arm_iommu_sync_single_for_cpu,
+	.sync_single_for_device	= arm_iommu_sync_single_for_device,
+
+	.map_sg			= arm_iommu_map_sg,
+	.unmap_sg		= arm_iommu_unmap_sg,
+	.sync_sg_for_cpu	= arm_iommu_sync_sg_for_cpu,
+	.sync_sg_for_device	= arm_iommu_sync_sg_for_device,
+};
+
+struct dma_iommu_mapping *
+arm_iommu_create_mapping(struct bus_type *bus, dma_addr_t base, size_t size,
+			 int order)
+{
+	unsigned int count = (size >> PAGE_SHIFT) - order;
+	unsigned int bitmap_size = BITS_TO_LONGS(count) * sizeof(long);
+	struct dma_iommu_mapping *mapping;
+	int err = -ENOMEM;
+
+	mapping = kzalloc(sizeof(struct dma_iommu_mapping), GFP_KERNEL);
+	if (!mapping)
+		goto err;
+
+	mapping->bitmap = kzalloc(bitmap_size, GFP_KERNEL);
+	if (!mapping->bitmap)
+		goto err2;
+
+	mapping->base = base;
+	mapping->bits = bitmap_size;
+	mapping->order = order;
+	spin_lock_init(&mapping->lock);
+
+	mapping->domain = iommu_domain_alloc(bus);
+	if (!mapping->domain)
+		goto err3;
+
+	kref_init(&mapping->kref);
+	return mapping;
+err3:
+	kfree(mapping->bitmap);
+err2:
+	kfree(mapping);
+err:
+	return ERR_PTR(err);
+}
+EXPORT_SYMBOL(arm_iommu_create_mapping);
+
+static void release_iommu_mapping(struct kref *kref)
+{
+	struct dma_iommu_mapping *mapping =
+		container_of(kref, struct dma_iommu_mapping, kref);
+
+	iommu_domain_free(mapping->domain);
+	kfree(mapping->bitmap);
+	kfree(mapping);
+}
+
+void arm_iommu_release_mapping(struct dma_iommu_mapping *mapping)
+{
+	if (mapping)
+		kref_put(&mapping->kref, release_iommu_mapping);
+}
+EXPORT_SYMBOL(arm_iommu_release_mapping);
+
+int arm_iommu_attach_device(struct device *dev,
+			    struct dma_iommu_mapping *mapping)
+{
+	int err;
+
+	err = iommu_attach_device(mapping->domain, dev);
+	if (err)
+		return err;
+
+	kref_get(&mapping->kref);
+	dev->archdata.mapping = mapping;
+	set_dma_ops(dev, &iommu_ops);
+
+	printk(KERN_INFO "Attached IOMMU controller to %s device.\n", dev_name(dev));
+	return 0;
+}
+EXPORT_SYMBOL(arm_iommu_attach_device);
+
+#endif
diff --git a/arch/arm/mm/vmregion.h b/arch/arm/mm/vmregion.h
index 15e9f04..6bbc402 100644
--- a/arch/arm/mm/vmregion.h
+++ b/arch/arm/mm/vmregion.h
@@ -17,7 +17,7 @@ struct arm_vmregion {
 	struct list_head	vm_list;
 	unsigned long		vm_start;
 	unsigned long		vm_end;
-	struct page		*vm_pages;
+	void			*priv;
 	int			vm_active;
 };
 
-- 
1.7.1.569.g6f426

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper
  2012-02-10 18:58   ` [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper Marek Szyprowski
@ 2012-02-10 18:58     ` Marek Szyprowski
  2012-02-13 18:18     ` Krishna Reddy
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 39+ messages in thread
From: Marek Szyprowski @ 2012-02-10 18:58 UTC (permalink / raw)
  To: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-samsung-soc, iommu
  Cc: Marek Szyprowski, Kyungmin Park, Arnd Bergmann, Joerg Roedel,
	Russell King - ARM Linux, Shariq Hasnain, Chunsang Jeong,
	Krishna Reddy, KyongHo Cho, Andrzej Pietrasiewicz,
	Benjamin Herrenschmidt

This patch adds a complete implementation of the DMA-mapping API for
devices that have IOMMU support. All DMA-mapping calls are supported.

This patch contains some of the code kindly provided by Krishna Reddy
<vdumpa@nvidia.com> and Andrzej Pietrasiewicz <andrzej.p@samsung.com>

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 arch/arm/Kconfig                 |    8 +
 arch/arm/include/asm/device.h    |    3 +
 arch/arm/include/asm/dma-iommu.h |   34 ++
 arch/arm/mm/dma-mapping.c        |  635 +++++++++++++++++++++++++++++++++++++-
 arch/arm/mm/vmregion.h           |    2 +-
 5 files changed, 667 insertions(+), 15 deletions(-)
 create mode 100644 arch/arm/include/asm/dma-iommu.h

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 59102fb..5d9a0b6 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -44,6 +44,14 @@ config ARM
 config ARM_HAS_SG_CHAIN
 	bool
 
+config NEED_SG_DMA_LENGTH
+	bool
+
+config ARM_DMA_USE_IOMMU
+	select NEED_SG_DMA_LENGTH
+	select ARM_HAS_SG_CHAIN
+	bool
+
 config HAVE_PWM
 	bool
 
diff --git a/arch/arm/include/asm/device.h b/arch/arm/include/asm/device.h
index 6e2cb0e..b69c0d3 100644
--- a/arch/arm/include/asm/device.h
+++ b/arch/arm/include/asm/device.h
@@ -14,6 +14,9 @@ struct dev_archdata {
 #ifdef CONFIG_IOMMU_API
 	void *iommu; /* private IOMMU data */
 #endif
+#ifdef CONFIG_ARM_DMA_USE_IOMMU
+	struct dma_iommu_mapping	*mapping;
+#endif
 };
 
 struct omap_device;
diff --git a/arch/arm/include/asm/dma-iommu.h b/arch/arm/include/asm/dma-iommu.h
new file mode 100644
index 0000000..799b094
--- /dev/null
+++ b/arch/arm/include/asm/dma-iommu.h
@@ -0,0 +1,34 @@
+#ifndef ASMARM_DMA_IOMMU_H
+#define ASMARM_DMA_IOMMU_H
+
+#ifdef __KERNEL__
+
+#include <linux/mm_types.h>
+#include <linux/scatterlist.h>
+#include <linux/dma-debug.h>
+#include <linux/kmemcheck.h>
+
+struct dma_iommu_mapping {
+	/* iommu specific data */
+	struct iommu_domain	*domain;
+
+	void			*bitmap;
+	size_t			bits;
+	unsigned int		order;
+	dma_addr_t		base;
+
+	spinlock_t		lock;
+	struct kref		kref;
+};
+
+struct dma_iommu_mapping *
+arm_iommu_create_mapping(struct bus_type *bus, dma_addr_t base, size_t size,
+			 int order);
+
+void arm_iommu_release_mapping(struct dma_iommu_mapping *mapping);
+
+int arm_iommu_attach_device(struct device *dev,
+					struct dma_iommu_mapping *mapping);
+
+#endif /* __KERNEL__ */
+#endif
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 4845c09..4163691 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -19,6 +19,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/highmem.h>
 #include <linux/slab.h>
+#include <linux/iommu.h>
 
 #include <asm/memory.h>
 #include <asm/highmem.h>
@@ -26,6 +27,7 @@
 #include <asm/tlbflush.h>
 #include <asm/sizes.h>
 #include <asm/mach/arch.h>
+#include <asm/dma-iommu.h>
 
 #include "mm.h"
 
@@ -156,6 +158,19 @@ static u64 get_coherent_dma_mask(struct device *dev)
 	return mask;
 }
 
+static void __dma_clear_buffer(struct page *page, size_t size)
+{
+	void *ptr;
+	/*
+	 * Ensure that the allocated pages are zeroed, and that any data
+	 * lurking in the kernel direct-mapped region is invalidated.
+	 */
+	ptr = page_address(page);
+	memset(ptr, 0, size);
+	dmac_flush_range(ptr, ptr + size);
+	outer_flush_range(__pa(ptr), __pa(ptr) + size);
+}
+
 /*
  * Allocate a DMA buffer for 'dev' of size 'size' using the
  * specified gfp mask.  Note that 'size' must be page aligned.
@@ -164,7 +179,6 @@ static struct page *__dma_alloc_buffer(struct device *dev, size_t size, gfp_t gf
 {
 	unsigned long order = get_order(size);
 	struct page *page, *p, *e;
-	void *ptr;
 	u64 mask = get_coherent_dma_mask(dev);
 
 #ifdef CONFIG_DMA_API_DEBUG
@@ -193,14 +207,7 @@ static struct page *__dma_alloc_buffer(struct device *dev, size_t size, gfp_t gf
 	for (p = page + (size >> PAGE_SHIFT), e = page + (1 << order); p < e; p++)
 		__free_page(p);
 
-	/*
-	 * Ensure that the allocated pages are zeroed, and that any data
-	 * lurking in the kernel direct-mapped region is invalidated.
-	 */
-	ptr = page_address(page);
-	memset(ptr, 0, size);
-	dmac_flush_range(ptr, ptr + size);
-	outer_flush_range(__pa(ptr), __pa(ptr) + size);
+	__dma_clear_buffer(page, size);
 
 	return page;
 }
@@ -348,7 +355,7 @@ __dma_alloc_remap(struct page *page, size_t size, gfp_t gfp, pgprot_t prot)
 		u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1);
 
 		pte = consistent_pte[idx] + off;
-		c->vm_pages = page;
+		c->priv = page;
 
 		do {
 			BUG_ON(!pte_none(*pte));
@@ -461,6 +468,14 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp,
 	return addr;
 }
 
+static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot)
+{
+	prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ?
+			    pgprot_writecombine(prot) :
+			    pgprot_dmacoherent(prot);
+	return prot;
+}
+
 /*
  * Allocate DMA-coherent memory space and return both the kernel remapped
  * virtual and bus address for that space.
@@ -468,9 +483,7 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp,
 void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
 		    gfp_t gfp, struct dma_attrs *attrs)
 {
-	pgprot_t prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ?
-			pgprot_writecombine(pgprot_kernel) :
-			pgprot_dmacoherent(pgprot_kernel);
+	pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel);
 	void *memory;
 
 	if (dma_alloc_from_coherent(dev, size, handle, &memory))
@@ -499,13 +512,14 @@ int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);
 	if (c) {
 		unsigned long off = vma->vm_pgoff;
+		struct page *pages = c->priv;
 
 		kern_size = (c->vm_end - c->vm_start) >> PAGE_SHIFT;
 
 		if (off < kern_size &&
 		    user_size <= (kern_size - off)) {
 			ret = remap_pfn_range(vma, vma->vm_start,
-					      page_to_pfn(c->vm_pages) + off,
+					      page_to_pfn(pages) + off,
 					      user_size << PAGE_SHIFT,
 					      vma->vm_page_prot);
 		}
@@ -644,6 +658,9 @@ int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
 	int i, j;
 
 	for_each_sg(sg, s, nents, i) {
+#ifdef CONFIG_NEED_SG_DMA_LENGTH
+		s->dma_length = s->length;
+#endif
 		s->dma_address = ops->map_page(dev, sg_page(s), s->offset,
 						s->length, dir, attrs);
 		if (dma_mapping_error(dev, s->dma_address))
@@ -749,3 +766,593 @@ static int __init dma_debug_do_init(void)
 	return 0;
 }
 fs_initcall(dma_debug_do_init);
+
+#ifdef CONFIG_ARM_DMA_USE_IOMMU
+
+/* IOMMU */
+
+static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping *mapping,
+				      size_t size)
+{
+	unsigned int order = get_order(size);
+	unsigned int align = 0;
+	unsigned int count, start;
+	unsigned long flags;
+
+	count = ((PAGE_ALIGN(size) >> PAGE_SHIFT) +
+		 (1 << mapping->order) - 1) >> mapping->order;
+
+	if (order > mapping->order)
+		align = (1 << (order - mapping->order)) - 1;
+
+	spin_lock_irqsave(&mapping->lock, flags);
+	start = bitmap_find_next_zero_area(mapping->bitmap, mapping->bits, 0,
+					   count, align);
+	if (start > mapping->bits) {
+		spin_unlock_irqrestore(&mapping->lock, flags);
+		return ~0;
+	}
+
+	bitmap_set(mapping->bitmap, start, count);
+	spin_unlock_irqrestore(&mapping->lock, flags);
+
+	return mapping->base + (start << (mapping->order + PAGE_SHIFT));
+}
+
+static inline void __free_iova(struct dma_iommu_mapping *mapping,
+			       dma_addr_t addr, size_t size)
+{
+	unsigned int start = (addr - mapping->base) >>
+			     (mapping->order + PAGE_SHIFT);
+	unsigned int count = ((size >> PAGE_SHIFT) +
+			      (1 << mapping->order) - 1) >> mapping->order;
+	unsigned long flags;
+
+	spin_lock_irqsave(&mapping->lock, flags);
+	bitmap_clear(mapping->bitmap, start, count);
+	spin_unlock_irqrestore(&mapping->lock, flags);
+}
+
+static struct page **__iommu_alloc_buffer(struct device *dev, size_t size, gfp_t gfp)
+{
+	struct page **pages;
+	int count = size >> PAGE_SHIFT;
+	int i=0;
+
+	pages = kzalloc(count * sizeof(struct page*), gfp);
+	if (!pages)
+		return NULL;
+
+	while (count) {
+		int j, order = __ffs(count);
+
+		pages[i] = alloc_pages(gfp | __GFP_NOWARN, order);
+		while (!pages[i] && order)
+			pages[i] = alloc_pages(gfp | __GFP_NOWARN, --order);
+		if (!pages[i])
+			goto error;
+
+		if (order)
+			split_page(pages[i], order);
+		j = 1 << order;
+		while (--j)
+			pages[i + j] = pages[i] + j;
+
+		__dma_clear_buffer(pages[i], PAGE_SIZE << order);
+		i += 1 << order;
+		count -= 1 << order;
+	}
+
+	return pages;
+error:
+	while (--i)
+		if (pages[i])
+			__free_pages(pages[i], 0);
+	kfree(pages);
+	return NULL;
+}
+
+static int __iommu_free_buffer(struct device *dev, struct page **pages, size_t size)
+{
+	int count = size >> PAGE_SHIFT;
+	int i;
+	for (i=0; i< count; i++)
+		if (pages[i])
+			__free_pages(pages[i], 0);
+	kfree(pages);
+	return 0;
+}
+
+static void *
+__iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot)
+{
+	struct arm_vmregion *c;
+	size_t align;
+	size_t count = size >> PAGE_SHIFT;
+	int bit;
+
+	if (!consistent_pte[0]) {
+		printk(KERN_ERR "%s: not initialised\n", __func__);
+		dump_stack();
+		return NULL;
+	}
+
+	/*
+	 * Align the virtual region allocation - maximum alignment is
+	 * a section size, minimum is a page size.  This helps reduce
+	 * fragmentation of the DMA space, and also prevents allocations
+	 * smaller than a section from crossing a section boundary.
+	 */
+	bit = fls(size - 1);
+	if (bit > SECTION_SHIFT)
+		bit = SECTION_SHIFT;
+	align = 1 << bit;
+
+	/*
+	 * Allocate a virtual address in the consistent mapping region.
+	 */
+	c = arm_vmregion_alloc(&consistent_head, align, size,
+			    gfp & ~(__GFP_DMA | __GFP_HIGHMEM));
+	if (c) {
+		pte_t *pte;
+		int idx = CONSISTENT_PTE_INDEX(c->vm_start);
+		int i = 0;
+		u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1);
+
+		pte = consistent_pte[idx] + off;
+		c->priv = pages;
+
+		do {
+			BUG_ON(!pte_none(*pte));
+
+			set_pte_ext(pte, mk_pte(pages[i], prot), 0);
+			pte++;
+			off++;
+			i++;
+			if (off >= PTRS_PER_PTE) {
+				off = 0;
+				pte = consistent_pte[++idx];
+			}
+		} while (i < count);
+
+		dsb();
+
+		return (void *)c->vm_start;
+	}
+	return NULL;
+}
+
+static dma_addr_t __iommu_create_mapping(struct device *dev, struct page **pages, size_t size)
+{
+	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
+	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	dma_addr_t dma_addr, iova;
+	int i, ret = ~0;
+
+	dma_addr = __alloc_iova(mapping, size);
+	if (dma_addr == ~0)
+		goto fail;
+
+	iova = dma_addr;
+	for (i=0; i<count; ) {
+		unsigned int phys = page_to_phys(pages[i]);
+		int j = i + 1;
+
+		while (j < count) {
+			if (page_to_phys(pages[j]) != phys + (j - i) * PAGE_SIZE)
+				break;
+			j++;
+		}
+
+		ret = iommu_map(mapping->domain, iova, phys, (j - i) * PAGE_SIZE, 0);
+		if (ret < 0)
+			goto fail;
+		iova += (j - i) * PAGE_SIZE;
+		i = j;
+	}
+
+	return dma_addr;
+fail:
+	return ~0;
+}
+
+static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova, size_t size)
+{
+	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
+	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+
+	iova &= PAGE_MASK;
+
+	iommu_unmap(mapping->domain, iova, count * PAGE_SIZE);
+
+	__free_iova(mapping, iova, size);
+	return 0;
+}
+
+static void *arm_iommu_alloc_attrs(struct device *dev, size_t size,
+	    dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs)
+{
+	pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel);
+	struct page **pages;
+	void *addr = NULL;
+
+	*handle = ~0;
+	size = PAGE_ALIGN(size);
+
+	pages = __iommu_alloc_buffer(dev, size, gfp);
+	if (!pages)
+		return NULL;
+
+	*handle = __iommu_create_mapping(dev, pages, size);
+	if (*handle == ~0)
+		goto err_buffer;
+
+	addr = __iommu_alloc_remap(pages, size, gfp, prot);
+	if (!addr)
+		goto err_mapping;
+
+	return addr;
+
+err_mapping:
+	__iommu_remove_mapping(dev, *handle, size);
+err_buffer:
+	__iommu_free_buffer(dev, pages, size);
+	return NULL;
+}
+
+static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
+		    void *cpu_addr, dma_addr_t dma_addr, size_t size,
+		    struct dma_attrs *attrs)
+{
+	struct arm_vmregion *c;
+
+	vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot);
+	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);
+
+	if (c) {
+		struct page **pages = c->priv;
+
+		unsigned long uaddr = vma->vm_start;
+		unsigned long usize = vma->vm_end - vma->vm_start;
+		int i = 0;
+
+		do {
+			int ret;
+
+			ret = vm_insert_page(vma, uaddr, pages[i++]);
+			if (ret) {
+				printk(KERN_ERR "Remapping memory, error: %d\n", ret);
+				return ret;
+			}
+
+			uaddr += PAGE_SIZE;
+			usize -= PAGE_SIZE;
+		} while (usize > 0);
+	}
+	return 0;
+}
+
+/*
+ * free a page as defined by the above mapping.
+ * Must not be called with IRQs disabled.
+ */
+void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
+			  dma_addr_t handle, struct dma_attrs *attrs)
+{
+	struct arm_vmregion *c;
+	size = PAGE_ALIGN(size);
+
+	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);
+	if (c) {
+		struct page **pages = c->priv;
+		__dma_free_remap(cpu_addr, size);
+		__iommu_remove_mapping(dev, handle, size);
+		__iommu_free_buffer(dev, pages, size);
+	}
+}
+
+static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
+			  size_t size, dma_addr_t *handle,
+			  enum dma_data_direction dir)
+{
+	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
+	dma_addr_t iova, iova_base;
+	int ret = 0;
+	unsigned int count;
+	struct scatterlist *s;
+
+	size = PAGE_ALIGN(size);
+	*handle = ~0;
+
+	iova_base = iova = __alloc_iova(mapping, size);
+	if (iova == ~0)
+		return -ENOMEM;
+
+	for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s))
+	{
+		phys_addr_t phys = page_to_phys(sg_page(s));
+		unsigned int len = PAGE_ALIGN(s->offset + s->length);
+
+		if (!arch_is_coherent())
+			__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
+
+		ret = iommu_map(mapping->domain, iova, phys, len, 0);
+		if (ret < 0)
+			goto fail;
+		count += len >> PAGE_SHIFT;
+		iova += len;
+	}
+	*handle = iova_base;
+
+	return 0;
+fail:
+	iommu_unmap(mapping->domain, iova_base, count * PAGE_SIZE);
+	__free_iova(mapping, iova_base, size);
+	return ret;
+}
+
+int arm_iommu_map_sg(struct device *dev, struct scatterlist *sg, int nents,
+		     enum dma_data_direction dir, struct dma_attrs *attrs)
+{
+	struct scatterlist *s = sg, *dma = sg, *start = sg;
+	int i, count = 0;
+	unsigned int offset = s->offset;
+	unsigned int size = s->offset + s->length;
+	unsigned int max = dma_get_max_seg_size(dev);
+
+	s->dma_address = ~0;
+	s->dma_length = 0;
+
+	for (i = 1; i < nents; i++) {
+		s->dma_address = ~0;
+		s->dma_length = 0;
+
+		s = sg_next(s);
+
+		if (s->offset || (size & ~PAGE_MASK) || size + s->length > max) {
+			if (__map_sg_chunk(dev, start, size, &dma->dma_address,
+			    dir) < 0)
+				goto bad_mapping;
+
+			dma->dma_address += offset;
+			dma->dma_length = size - offset;
+
+			size = offset = s->offset;
+			start = s;
+			dma = sg_next(dma);
+			count += 1;
+		}
+		size += s->length;
+	}
+	if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir) < 0)
+		goto bad_mapping;
+
+	dma->dma_address += offset;
+	dma->dma_length = size - offset;
+
+	return count+1;
+
+bad_mapping:
+	for_each_sg(sg, s, count, i)
+		__iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s));
+	return 0;
+}
+
+void arm_iommu_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
+			enum dma_data_direction dir, struct dma_attrs *attrs)
+{
+	struct scatterlist *s;
+	int i;
+
+	for_each_sg(sg, s, nents, i) {
+		if (sg_dma_len(s))
+			__iommu_remove_mapping(dev, sg_dma_address(s),
+					       sg_dma_len(s));
+		if (!arch_is_coherent())
+			__dma_page_dev_to_cpu(sg_page(s), s->offset,
+					      s->length, dir);
+	}
+}
+
+
+/**
+ * dma_sync_sg_for_cpu
+ * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
+ * @sg: list of buffers
+ * @nents: number of buffers to map (returned from dma_map_sg)
+ * @dir: DMA transfer direction (same as was passed to dma_map_sg)
+ */
+void arm_iommu_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
+			int nents, enum dma_data_direction dir)
+{
+	struct scatterlist *s;
+	int i;
+
+	for_each_sg(sg, s, nents, i)
+		if (!arch_is_coherent())
+			__dma_page_dev_to_cpu(sg_page(s), s->offset, s->length, dir);
+
+}
+
+/**
+ * dma_sync_sg_for_device
+ * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
+ * @sg: list of buffers
+ * @nents: number of buffers to map (returned from dma_map_sg)
+ * @dir: DMA transfer direction (same as was passed to dma_map_sg)
+ */
+void arm_iommu_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
+			int nents, enum dma_data_direction dir)
+{
+	struct scatterlist *s;
+	int i;
+
+	for_each_sg(sg, s, nents, i)
+		if (!arch_is_coherent())
+			__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
+}
+
+static dma_addr_t arm_iommu_map_page(struct device *dev, struct page *page,
+	     unsigned long offset, size_t size, enum dma_data_direction dir,
+	     struct dma_attrs *attrs)
+{
+	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
+	dma_addr_t dma_addr, iova;
+	unsigned int phys;
+	int ret, len = PAGE_ALIGN(size + offset);
+
+	if (!arch_is_coherent())
+		__dma_page_cpu_to_dev(page, offset, size, dir);
+
+	dma_addr = iova = __alloc_iova(mapping, len);
+	if (iova == ~0)
+		goto fail;
+
+	dma_addr += offset;
+	phys = page_to_phys(page);
+	ret = iommu_map(mapping->domain, iova, phys, size, 0);
+	if (ret < 0)
+		goto fail;
+
+	return dma_addr;
+fail:
+	return ~0;
+}
+
+static void arm_iommu_unmap_page(struct device *dev, dma_addr_t handle,
+		size_t size, enum dma_data_direction dir,
+		struct dma_attrs *attrs)
+{
+	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
+	dma_addr_t iova = handle & PAGE_MASK;
+	struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
+	int offset = handle & ~PAGE_MASK;
+
+	if (!iova)
+		return;
+
+	if (!arch_is_coherent())
+		__dma_page_dev_to_cpu(page, offset, size, dir);
+
+	iommu_unmap(mapping->domain, iova, size);
+	__free_iova(mapping, iova, size);
+}
+
+static void arm_iommu_sync_single_for_cpu(struct device *dev,
+		dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
+	dma_addr_t iova = handle & PAGE_MASK;
+	struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
+	unsigned int offset = handle & ~PAGE_MASK;
+
+	if (!iova)
+		return;
+
+	if (!arch_is_coherent())
+		__dma_page_dev_to_cpu(page, offset, size, dir);
+}
+
+static void arm_iommu_sync_single_for_device(struct device *dev,
+		dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
+	dma_addr_t iova = handle & PAGE_MASK;
+	struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
+	unsigned int offset = handle & ~PAGE_MASK;
+
+	if (!iova)
+		return;
+
+	__dma_page_cpu_to_dev(page, offset, size, dir);
+}
+
+struct dma_map_ops iommu_ops = {
+	.alloc		= arm_iommu_alloc_attrs,
+	.free		= arm_iommu_free_attrs,
+	.mmap		= arm_iommu_mmap_attrs,
+
+	.map_page		= arm_iommu_map_page,
+	.unmap_page		= arm_iommu_unmap_page,
+	.sync_single_for_cpu	= arm_iommu_sync_single_for_cpu,
+	.sync_single_for_device	= arm_iommu_sync_single_for_device,
+
+	.map_sg			= arm_iommu_map_sg,
+	.unmap_sg		= arm_iommu_unmap_sg,
+	.sync_sg_for_cpu	= arm_iommu_sync_sg_for_cpu,
+	.sync_sg_for_device	= arm_iommu_sync_sg_for_device,
+};
+
+struct dma_iommu_mapping *
+arm_iommu_create_mapping(struct bus_type *bus, dma_addr_t base, size_t size,
+			 int order)
+{
+	unsigned int count = (size >> PAGE_SHIFT) - order;
+	unsigned int bitmap_size = BITS_TO_LONGS(count) * sizeof(long);
+	struct dma_iommu_mapping *mapping;
+	int err = -ENOMEM;
+
+	mapping = kzalloc(sizeof(struct dma_iommu_mapping), GFP_KERNEL);
+	if (!mapping)
+		goto err;
+
+	mapping->bitmap = kzalloc(bitmap_size, GFP_KERNEL);
+	if (!mapping->bitmap)
+		goto err2;
+
+	mapping->base = base;
+	mapping->bits = bitmap_size;
+	mapping->order = order;
+	spin_lock_init(&mapping->lock);
+
+	mapping->domain = iommu_domain_alloc(bus);
+	if (!mapping->domain)
+		goto err3;
+
+	kref_init(&mapping->kref);
+	return mapping;
+err3:
+	kfree(mapping->bitmap);
+err2:
+	kfree(mapping);
+err:
+	return ERR_PTR(err);
+}
+EXPORT_SYMBOL(arm_iommu_create_mapping);
+
+static void release_iommu_mapping(struct kref *kref)
+{
+	struct dma_iommu_mapping *mapping =
+		container_of(kref, struct dma_iommu_mapping, kref);
+
+	iommu_domain_free(mapping->domain);
+	kfree(mapping->bitmap);
+	kfree(mapping);
+}
+
+void arm_iommu_release_mapping(struct dma_iommu_mapping *mapping)
+{
+	if (mapping)
+		kref_put(&mapping->kref, release_iommu_mapping);
+}
+EXPORT_SYMBOL(arm_iommu_release_mapping);
+
+int arm_iommu_attach_device(struct device *dev,
+			    struct dma_iommu_mapping *mapping)
+{
+	int err;
+
+	err = iommu_attach_device(mapping->domain, dev);
+	if (err)
+		return err;
+
+	kref_get(&mapping->kref);
+	dev->archdata.mapping = mapping;
+	set_dma_ops(dev, &iommu_ops);
+
+	printk(KERN_INFO "Attached IOMMU controller to %s device.\n", dev_name(dev));
+	return 0;
+}
+EXPORT_SYMBOL(arm_iommu_attach_device);
+
+#endif
diff --git a/arch/arm/mm/vmregion.h b/arch/arm/mm/vmregion.h
index 15e9f04..6bbc402 100644
--- a/arch/arm/mm/vmregion.h
+++ b/arch/arm/mm/vmregion.h
@@ -17,7 +17,7 @@ struct arm_vmregion {
 	struct list_head	vm_list;
 	unsigned long		vm_start;
 	unsigned long		vm_end;
-	struct page		*vm_pages;
+	void			*priv;
 	int			vm_active;
 };
 
-- 
1.7.1.569.g6f426


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* RE: [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper
  2012-02-10 18:58   ` [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper Marek Szyprowski
  2012-02-10 18:58     ` Marek Szyprowski
@ 2012-02-13 18:18     ` Krishna Reddy
  2012-02-13 18:18       ` Krishna Reddy
  2012-02-13 19:58     ` Krishna Reddy
  2012-02-14 14:55     ` Konrad Rzeszutek Wilk
  3 siblings, 1 reply; 39+ messages in thread
From: Krishna Reddy @ 2012-02-13 18:18 UTC (permalink / raw)
  To: Marek Szyprowski, linux-arm-kernel, linaro-mm-sig, linux-mm,
	linux-arch, linux-samsung-soc, iommu
  Cc: Kyungmin Park, Arnd Bergmann, Joerg Roedel,
	Russell King - ARM Linux, Shariq Hasnain, Chunsang Jeong,
	KyongHo Cho, Andrzej Pietrasiewicz, Benjamin Herrenschmidt

scripts/checkpatch.pl needs to be run on your patches. Out of the 7 patches,
6 (all except patch 5) have coding-standard violations.
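For reference, running it from the top of the kernel tree over the whole
series, e.g.:

	$ ./scripts/checkpatch.pl 00*.patch

(the file name pattern is just whatever git format-patch produced) will list
all of them at once.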

-KR

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper
  2012-02-10 18:58   ` [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper Marek Szyprowski
  2012-02-10 18:58     ` Marek Szyprowski
  2012-02-13 18:18     ` Krishna Reddy
@ 2012-02-13 19:58     ` Krishna Reddy
  2012-02-13 19:58       ` Krishna Reddy
       [not found]       ` <401E54CE964CD94BAE1EB4A729C7087E378E42AE18-wAPRp6hVlRhDw2glCA4ptUEOCMrvLtNR@public.gmane.org>
  2012-02-14 14:55     ` Konrad Rzeszutek Wilk
  3 siblings, 2 replies; 39+ messages in thread
From: Krishna Reddy @ 2012-02-13 19:58 UTC (permalink / raw)
  To: Marek Szyprowski, linux-arm-kernel, linaro-mm-sig, linux-mm,
	linux-arch, linux-samsung-soc, iommu
  Cc: Kyungmin Park, Arnd Bergmann, Joerg Roedel,
	Russell King - ARM Linux, Shariq Hasnain, Chunsang Jeong,
	KyongHo Cho, Andrzej Pietrasiewicz, Benjamin Herrenschmidt

The implementation looks nice overall. I have a few comments.

> +static struct page **__iommu_alloc_buffer(struct device *dev, size_t
> +size, gfp_t gfp) {
> +     struct page **pages;
> +     int count = size >> PAGE_SHIFT;
> +     int i=0;
> +
> +     pages = kzalloc(count * sizeof(struct page*), gfp);
> +     if (!pages)
> +             return NULL;

kzalloc can fail for any allocation bigger than PAGE_SIZE if system memory
is fully fragmented. If the requested buffer is bigger than 4MB, the pages
pointer array no longer fits in one page and kzalloc may fail. We should use
vzalloc()/vfree() when the pages pointer array itself is bigger than
PAGE_SIZE.
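Something along these lines would do (untested sketch, just to show the
idea; names are unchanged from the patch):

	int count = size >> PAGE_SHIFT;
	size_t array_size = count * sizeof(struct page *);
	struct page **pages;

	/*
	 * Fall back to vmalloc space once the pointer array no longer
	 * fits in a single page (vzalloc ignores the caller's gfp and
	 * always zeroes, which is fine here).
	 */
	if (array_size <= PAGE_SIZE)
		pages = kzalloc(array_size, gfp);
	else
		pages = vzalloc(array_size);	/* needs <linux/vmalloc.h> */
	if (!pages)
		return NULL;

and the matching __iommu_free_buffer() would then do
is_vmalloc_addr(pages) ? vfree(pages) : kfree(pages).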


> +static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping
> *mapping,
> +                                   size_t size)
> +{
> +     unsigned int order = get_order(size);
> +     unsigned int align = 0;
> +     unsigned int count, start;
> +     unsigned long flags;
> +
> +     count = ((PAGE_ALIGN(size) >> PAGE_SHIFT) +
> +              (1 << mapping->order) - 1) >> mapping->order;
> +
> +     if (order > mapping->order)
> +             align = (1 << (order - mapping->order)) - 1;
> +
> +     spin_lock_irqsave(&mapping->lock, flags);
> +     start = bitmap_find_next_zero_area(mapping->bitmap, mapping-
> >bits, 0,
> +                                        count, align);

Do we need "align" here? Why is it trying to align the allocation to the
size of the memory requested? When mapping->order is zero and the size
requested is 4MB, order becomes 10 and align is set to 1023, so
bitmap_find_next_zero_area() only looks for a free area starting at an index
that is a multiple of 1024. Why can't we pass an align mask of 0 and let it
allocate from the next free index? Doesn't mapping->order already take care
of the minimum alignment needed for the device?
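i.e. (untested) simply:

	start = bitmap_find_next_zero_area(mapping->bitmap, mapping->bits, 0,
					   count, 0);	/* no size-based align */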


> +static dma_addr_t __iommu_create_mapping(struct device *dev, struct
> +page **pages, size_t size) {
> +     struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +     unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> +     dma_addr_t dma_addr, iova;
> +     int i, ret = ~0;
> +
> +     dma_addr = __alloc_iova(mapping, size);
> +     if (dma_addr == ~0)
> +             goto fail;
> +
> +     iova = dma_addr;
> +     for (i=0; i<count; ) {
> +             unsigned int phys = page_to_phys(pages[i]);
> +             int j = i + 1;
> +
> +             while (j < count) {
> +                     if (page_to_phys(pages[j]) != phys + (j - i) *
> PAGE_SIZE)
> +                             break;
> +                     j++;
> +             }
> +
> +             ret = iommu_map(mapping->domain, iova, phys, (j - i) *
> PAGE_SIZE, 0);
> +             if (ret < 0)
> +                     goto fail;
> +             iova += (j - i) * PAGE_SIZE;
> +             i = j;
> +     }
> +
> +     return dma_addr;
> +fail:
> +     return ~0;
> +}

iommu_map failure should release the iova space allocated using __alloc_iova.
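E.g. the __alloc_iova() failure could return ~0 directly and the fail label
could undo whatever has already been mapped, roughly (untested sketch on top
of this patch):

	dma_addr = __alloc_iova(mapping, size);
	if (dma_addr == ~0)
		return ~0;
	...
fail:
	/* tear down the partial mapping and give back the iova range */
	if (iova > dma_addr)
		iommu_unmap(mapping->domain, dma_addr, iova - dma_addr);
	__free_iova(mapping, dma_addr, size);
	return ~0;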

> +static dma_addr_t arm_iommu_map_page(struct device *dev, struct page
> *page,
> +          unsigned long offset, size_t size, enum dma_data_direction dir,
> +          struct dma_attrs *attrs)
> +{
> +     struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +     dma_addr_t dma_addr, iova;
> +     unsigned int phys;
> +     int ret, len = PAGE_ALIGN(size + offset);
> +
> +     if (!arch_is_coherent())
> +             __dma_page_cpu_to_dev(page, offset, size, dir);
> +
> +     dma_addr = iova = __alloc_iova(mapping, len);
> +     if (iova == ~0)
> +             goto fail;
> +
> +     dma_addr += offset;
> +     phys = page_to_phys(page);
> +     ret = iommu_map(mapping->domain, iova, phys, size, 0);
> +     if (ret < 0)
> +             goto fail;
> +
> +     return dma_addr;
> +fail:
> +     return ~0;
> +}

iommu_map failure should release the iova space allocated using __alloc_iova.
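Same pattern here, e.g. (untested):

	ret = iommu_map(mapping->domain, iova, phys, size, 0);
	if (ret < 0) {
		__free_iova(mapping, iova, len);
		return ~0;
	}

	return dma_addr;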

>+      printk(KERN_INFO "Attached IOMMU controller to %s device.\n", dev_name(dev));
Just nit-picking. Should use pr_info().
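i.e. something like:

	pr_info("Attached IOMMU controller to %s device.\n", dev_name(dev));

(or dev_info(dev, ...), which also tags the message with the device name).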

--nvpublic
-KR


> -----Original Message-----
> From: Marek Szyprowski [mailto:m.szyprowski@samsung.com]
> Sent: Friday, February 10, 2012 10:59 AM
> To: linux-arm-kernel@lists.infradead.org; linaro-mm-sig@lists.linaro.org;
> linux-mm@kvack.org; linux-arch@vger.kernel.org; linux-samsung-
> soc@vger.kernel.org; iommu@lists.linux-foundation.org
> Cc: Marek Szyprowski; Kyungmin Park; Arnd Bergmann; Joerg Roedel; Russell
> King - ARM Linux; Shariq Hasnain; Chunsang Jeong; Krishna Reddy; KyongHo
> Cho; Andrzej Pietrasiewicz; Benjamin Herrenschmidt
> Subject: [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU
> mapper
>
> This patch add a complete implementation of DMA-mapping API for devices
> that have IOMMU support. All DMA-mapping calls are supported.
>
> This patch contains some of the code kindly provided by Krishna Reddy
> <vdumpa@nvidia.com> and Andrzej Pietrasiewicz
> <andrzej.p@samsung.com>
>
> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> ---
>  arch/arm/Kconfig                 |    8 +
>  arch/arm/include/asm/device.h    |    3 +
>  arch/arm/include/asm/dma-iommu.h |   34 ++
>  arch/arm/mm/dma-mapping.c        |  635
> +++++++++++++++++++++++++++++++++++++-
>  arch/arm/mm/vmregion.h           |    2 +-
>  5 files changed, 667 insertions(+), 15 deletions(-)  create mode 100644
> arch/arm/include/asm/dma-iommu.h
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 59102fb..5d9a0b6
> 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -44,6 +44,14 @@ config ARM
>  config ARM_HAS_SG_CHAIN
>       bool
>
> +config NEED_SG_DMA_LENGTH
> +     bool
> +
> +config ARM_DMA_USE_IOMMU
> +     select NEED_SG_DMA_LENGTH
> +     select ARM_HAS_SG_CHAIN
> +     bool
> +
>  config HAVE_PWM
>       bool
>
> diff --git a/arch/arm/include/asm/device.h
> b/arch/arm/include/asm/device.h index 6e2cb0e..b69c0d3 100644
> --- a/arch/arm/include/asm/device.h
> +++ b/arch/arm/include/asm/device.h
> @@ -14,6 +14,9 @@ struct dev_archdata {
>  #ifdef CONFIG_IOMMU_API
>       void *iommu; /* private IOMMU data */
>  #endif
> +#ifdef CONFIG_ARM_DMA_USE_IOMMU
> +     struct dma_iommu_mapping        *mapping;
> +#endif
>  };
>
>  struct omap_device;
> diff --git a/arch/arm/include/asm/dma-iommu.h
> b/arch/arm/include/asm/dma-iommu.h
> new file mode 100644
> index 0000000..799b094
> --- /dev/null
> +++ b/arch/arm/include/asm/dma-iommu.h
> @@ -0,0 +1,34 @@
> +#ifndef ASMARM_DMA_IOMMU_H
> +#define ASMARM_DMA_IOMMU_H
> +
> +#ifdef __KERNEL__
> +
> +#include <linux/mm_types.h>
> +#include <linux/scatterlist.h>
> +#include <linux/dma-debug.h>
> +#include <linux/kmemcheck.h>
> +
> +struct dma_iommu_mapping {
> +     /* iommu specific data */
> +     struct iommu_domain     *domain;
> +
> +     void                    *bitmap;
> +     size_t                  bits;
> +     unsigned int            order;
> +     dma_addr_t              base;
> +
> +     spinlock_t              lock;
> +     struct kref             kref;
> +};
> +
> +struct dma_iommu_mapping *
> +arm_iommu_create_mapping(struct bus_type *bus, dma_addr_t base,
> size_t size,
> +                      int order);
> +
> +void arm_iommu_release_mapping(struct dma_iommu_mapping
> *mapping);
> +
> +int arm_iommu_attach_device(struct device *dev,
> +                                     struct dma_iommu_mapping
> *mapping);
> +
> +#endif /* __KERNEL__ */
> +#endif
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index 4845c09..4163691 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -19,6 +19,7 @@
>  #include <linux/dma-mapping.h>
>  #include <linux/highmem.h>
>  #include <linux/slab.h>
> +#include <linux/iommu.h>
>
>  #include <asm/memory.h>
>  #include <asm/highmem.h>
> @@ -26,6 +27,7 @@
>  #include <asm/tlbflush.h>
>  #include <asm/sizes.h>
>  #include <asm/mach/arch.h>
> +#include <asm/dma-iommu.h>
>
>  #include "mm.h"
>
> @@ -156,6 +158,19 @@ static u64 get_coherent_dma_mask(struct device
> *dev)
>       return mask;
>  }
>
> +static void __dma_clear_buffer(struct page *page, size_t size) {
> +     void *ptr;
> +     /*
> +      * Ensure that the allocated pages are zeroed, and that any data
> +      * lurking in the kernel direct-mapped region is invalidated.
> +      */
> +     ptr = page_address(page);
> +     memset(ptr, 0, size);
> +     dmac_flush_range(ptr, ptr + size);
> +     outer_flush_range(__pa(ptr), __pa(ptr) + size); }
> +
>  /*
>   * Allocate a DMA buffer for 'dev' of size 'size' using the
>   * specified gfp mask.  Note that 'size' must be page aligned.
> @@ -164,7 +179,6 @@ static struct page *__dma_alloc_buffer(struct device
> *dev, size_t size, gfp_t gf  {
>       unsigned long order = get_order(size);
>       struct page *page, *p, *e;
> -     void *ptr;
>       u64 mask = get_coherent_dma_mask(dev);
>
>  #ifdef CONFIG_DMA_API_DEBUG
> @@ -193,14 +207,7 @@ static struct page *__dma_alloc_buffer(struct device
> *dev, size_t size, gfp_t gf
>       for (p = page + (size >> PAGE_SHIFT), e = page + (1 << order); p < e;
> p++)
>               __free_page(p);
>
> -     /*
> -      * Ensure that the allocated pages are zeroed, and that any data
> -      * lurking in the kernel direct-mapped region is invalidated.
> -      */
> -     ptr = page_address(page);
> -     memset(ptr, 0, size);
> -     dmac_flush_range(ptr, ptr + size);
> -     outer_flush_range(__pa(ptr), __pa(ptr) + size);
> +     __dma_clear_buffer(page, size);
>
>       return page;
>  }
> @@ -348,7 +355,7 @@ __dma_alloc_remap(struct page *page, size_t size,
> gfp_t gfp, pgprot_t prot)
>               u32 off = CONSISTENT_OFFSET(c->vm_start) &
> (PTRS_PER_PTE-1);
>
>               pte = consistent_pte[idx] + off;
> -             c->vm_pages = page;
> +             c->priv = page;
>
>               do {
>                       BUG_ON(!pte_none(*pte));
> @@ -461,6 +468,14 @@ __dma_alloc(struct device *dev, size_t size,
> dma_addr_t *handle, gfp_t gfp,
>       return addr;
>  }
>
> +static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs,
> +pgprot_t prot) {
> +     prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ?
> +                         pgprot_writecombine(prot) :
> +                         pgprot_dmacoherent(prot);
> +     return prot;
> +}
> +
>  /*
>   * Allocate DMA-coherent memory space and return both the kernel
> remapped
>   * virtual and bus address for that space.
> @@ -468,9 +483,7 @@ __dma_alloc(struct device *dev, size_t size,
> dma_addr_t *handle, gfp_t gfp,  void *arm_dma_alloc(struct device *dev,
> size_t size, dma_addr_t *handle,
>                   gfp_t gfp, struct dma_attrs *attrs)  {
> -     pgprot_t prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ?
> -                     pgprot_writecombine(pgprot_kernel) :
> -                     pgprot_dmacoherent(pgprot_kernel);
> +     pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel);
>       void *memory;
>
>       if (dma_alloc_from_coherent(dev, size, handle, &memory)) @@ -
> 499,13 +512,14 @@ int arm_dma_mmap(struct device *dev, struct
> vm_area_struct *vma,
>       c = arm_vmregion_find(&consistent_head, (unsigned
> long)cpu_addr);
>       if (c) {
>               unsigned long off = vma->vm_pgoff;
> +             struct page *pages = c->priv;
>
>               kern_size = (c->vm_end - c->vm_start) >> PAGE_SHIFT;
>
>               if (off < kern_size &&
>                   user_size <= (kern_size - off)) {
>                       ret = remap_pfn_range(vma, vma->vm_start,
> -                                           page_to_pfn(c->vm_pages) + off,
> +                                           page_to_pfn(pages) + off,
>                                             user_size << PAGE_SHIFT,
>                                             vma->vm_page_prot);
>               }
> @@ -644,6 +658,9 @@ int arm_dma_map_sg(struct device *dev, struct
> scatterlist *sg, int nents,
>       int i, j;
>
>       for_each_sg(sg, s, nents, i) {
> +#ifdef CONFIG_NEED_SG_DMA_LENGTH
> +             s->dma_length = s->length;
> +#endif
>               s->dma_address = ops->map_page(dev, sg_page(s), s-
> >offset,
>                                               s->length, dir, attrs);
>               if (dma_mapping_error(dev, s->dma_address)) @@ -749,3
> +766,593 @@ static int __init dma_debug_do_init(void)
>       return 0;
>  }
>  fs_initcall(dma_debug_do_init);
> +
> +#ifdef CONFIG_ARM_DMA_USE_IOMMU
> +
> +/* IOMMU */
> +
> +static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping
> *mapping,
> +                                   size_t size)
> +{
> +     unsigned int order = get_order(size);
> +     unsigned int align = 0;
> +     unsigned int count, start;
> +     unsigned long flags;
> +
> +     count = ((PAGE_ALIGN(size) >> PAGE_SHIFT) +
> +              (1 << mapping->order) - 1) >> mapping->order;
> +
> +     if (order > mapping->order)
> +             align = (1 << (order - mapping->order)) - 1;
> +
> +     spin_lock_irqsave(&mapping->lock, flags);
> +     start = bitmap_find_next_zero_area(mapping->bitmap, mapping-
> >bits, 0,
> +                                        count, align);
> +     if (start > mapping->bits) {
> +             spin_unlock_irqrestore(&mapping->lock, flags);
> +             return ~0;
> +     }
> +
> +     bitmap_set(mapping->bitmap, start, count);
> +     spin_unlock_irqrestore(&mapping->lock, flags);
> +
> +     return mapping->base + (start << (mapping->order + PAGE_SHIFT));
> }
> +
> +static inline void __free_iova(struct dma_iommu_mapping *mapping,
> +                            dma_addr_t addr, size_t size) {
> +     unsigned int start = (addr - mapping->base) >>
> +                          (mapping->order + PAGE_SHIFT);
> +     unsigned int count = ((size >> PAGE_SHIFT) +
> +                           (1 << mapping->order) - 1) >> mapping->order;
> +     unsigned long flags;
> +
> +     spin_lock_irqsave(&mapping->lock, flags);
> +     bitmap_clear(mapping->bitmap, start, count);
> +     spin_unlock_irqrestore(&mapping->lock, flags); }
> +
> +static struct page **__iommu_alloc_buffer(struct device *dev, size_t
> +size, gfp_t gfp) {
> +     struct page **pages;
> +     int count = size >> PAGE_SHIFT;
> +     int i=0;
> +
> +     pages = kzalloc(count * sizeof(struct page*), gfp);
> +     if (!pages)
> +             return NULL;
> +
> +     while (count) {
> +             int j, order = __ffs(count);
> +
> +             pages[i] = alloc_pages(gfp | __GFP_NOWARN, order);
> +             while (!pages[i] && order)
> +                     pages[i] = alloc_pages(gfp | __GFP_NOWARN, --
> order);
> +             if (!pages[i])
> +                     goto error;
> +
> +             if (order)
> +                     split_page(pages[i], order);
> +             j = 1 << order;
> +             while (--j)
> +                     pages[i + j] = pages[i] + j;
> +
> +             __dma_clear_buffer(pages[i], PAGE_SIZE << order);
> +             i += 1 << order;
> +             count -= 1 << order;
> +     }
> +
> +     return pages;
> +error:
> +     while (--i)
> +             if (pages[i])
> +                     __free_pages(pages[i], 0);
> +     kfree(pages);
> +     return NULL;
> +}
> +
> +static int __iommu_free_buffer(struct device *dev, struct page **pages,
> +size_t size) {
> +     int count = size >> PAGE_SHIFT;
> +     int i;
> +     for (i=0; i< count; i++)
> +             if (pages[i])
> +                     __free_pages(pages[i], 0);
> +     kfree(pages);
> +     return 0;
> +}
> +
> +static void *
> +__iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp,
> +pgprot_t prot) {
> +     struct arm_vmregion *c;
> +     size_t align;
> +     size_t count = size >> PAGE_SHIFT;
> +     int bit;
> +
> +     if (!consistent_pte[0]) {
> +             printk(KERN_ERR "%s: not initialised\n", __func__);
> +             dump_stack();
> +             return NULL;
> +     }
> +
> +     /*
> +      * Align the virtual region allocation - maximum alignment is
> +      * a section size, minimum is a page size.  This helps reduce
> +      * fragmentation of the DMA space, and also prevents allocations
> +      * smaller than a section from crossing a section boundary.
> +      */
> +     bit = fls(size - 1);
> +     if (bit > SECTION_SHIFT)
> +             bit = SECTION_SHIFT;
> +     align = 1 << bit;
> +
> +     /*
> +      * Allocate a virtual address in the consistent mapping region.
> +      */
> +     c = arm_vmregion_alloc(&consistent_head, align, size,
> +                         gfp & ~(__GFP_DMA | __GFP_HIGHMEM));
> +     if (c) {
> +             pte_t *pte;
> +             int idx = CONSISTENT_PTE_INDEX(c->vm_start);
> +             int i = 0;
> +             u32 off = CONSISTENT_OFFSET(c->vm_start) &
> (PTRS_PER_PTE-1);
> +
> +             pte = consistent_pte[idx] + off;
> +             c->priv = pages;
> +
> +             do {
> +                     BUG_ON(!pte_none(*pte));
> +
> +                     set_pte_ext(pte, mk_pte(pages[i], prot), 0);
> +                     pte++;
> +                     off++;
> +                     i++;
> +                     if (off >= PTRS_PER_PTE) {
> +                             off = 0;
> +                             pte = consistent_pte[++idx];
> +                     }
> +             } while (i < count);
> +
> +             dsb();
> +
> +             return (void *)c->vm_start;
> +     }
> +     return NULL;
> +}
> +
> +static dma_addr_t __iommu_create_mapping(struct device *dev, struct
> +page **pages, size_t size) {
> +     struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +     unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> +     dma_addr_t dma_addr, iova;
> +     int i, ret = ~0;
> +
> +     dma_addr = __alloc_iova(mapping, size);
> +     if (dma_addr == ~0)
> +             goto fail;
> +
> +     iova = dma_addr;
> +     for (i=0; i<count; ) {
> +             unsigned int phys = page_to_phys(pages[i]);
> +             int j = i + 1;
> +
> +             while (j < count) {
> +                     if (page_to_phys(pages[j]) != phys + (j - i) *
> PAGE_SIZE)
> +                             break;
> +                     j++;
> +             }
> +
> +             ret = iommu_map(mapping->domain, iova, phys, (j - i) *
> PAGE_SIZE, 0);
> +             if (ret < 0)
> +                     goto fail;
> +             iova += (j - i) * PAGE_SIZE;
> +             i = j;
> +     }
> +
> +     return dma_addr;
> +fail:
> +     return ~0;
> +}
> +
> +static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova,
> +size_t size) {
> +     struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +     unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> +
> +     iova &= PAGE_MASK;
> +
> +     iommu_unmap(mapping->domain, iova, count * PAGE_SIZE);
> +
> +     __free_iova(mapping, iova, size);
> +     return 0;
> +}
> +
> +static void *arm_iommu_alloc_attrs(struct device *dev, size_t size,
> +         dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs) {
> +     pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel);
> +     struct page **pages;
> +     void *addr = NULL;
> +
> +     *handle = ~0;
> +     size = PAGE_ALIGN(size);
> +
> +     pages = __iommu_alloc_buffer(dev, size, gfp);
> +     if (!pages)
> +             return NULL;
> +
> +     *handle = __iommu_create_mapping(dev, pages, size);
> +     if (*handle == ~0)
> +             goto err_buffer;
> +
> +     addr = __iommu_alloc_remap(pages, size, gfp, prot);
> +     if (!addr)
> +             goto err_mapping;
> +
> +     return addr;
> +
> +err_mapping:
> +     __iommu_remove_mapping(dev, *handle, size);
> +err_buffer:
> +     __iommu_free_buffer(dev, pages, size);
> +     return NULL;
> +}
> +
> +static int arm_iommu_mmap_attrs(struct device *dev, struct
> vm_area_struct *vma,
> +                 void *cpu_addr, dma_addr_t dma_addr, size_t size,
> +                 struct dma_attrs *attrs)
> +{
> +     struct arm_vmregion *c;
> +
> +     vma->vm_page_prot = __get_dma_pgprot(attrs, vma-
> >vm_page_prot);
> +     c = arm_vmregion_find(&consistent_head, (unsigned
> long)cpu_addr);
> +
> +     if (c) {
> +             struct page **pages = c->priv;
> +
> +             unsigned long uaddr = vma->vm_start;
> +             unsigned long usize = vma->vm_end - vma->vm_start;
> +             int i = 0;
> +
> +             do {
> +                     int ret;
> +
> +                     ret = vm_insert_page(vma, uaddr, pages[i++]);
> +                     if (ret) {
> +                             printk(KERN_ERR "Remapping memory,
> error: %d\n", ret);
> +                             return ret;
> +                     }
> +
> +                     uaddr += PAGE_SIZE;
> +                     usize -= PAGE_SIZE;
> +             } while (usize > 0);
> +     }
> +     return 0;
> +}
> +
> +/*
> + * free a page as defined by the above mapping.
> + * Must not be called with IRQs disabled.
> + */
> +void arm_iommu_free_attrs(struct device *dev, size_t size, void
> *cpu_addr,
> +                       dma_addr_t handle, struct dma_attrs *attrs) {
> +     struct arm_vmregion *c;
> +     size = PAGE_ALIGN(size);
> +
> +     c = arm_vmregion_find(&consistent_head, (unsigned
> long)cpu_addr);
> +     if (c) {
> +             struct page **pages = c->priv;
> +             __dma_free_remap(cpu_addr, size);
> +             __iommu_remove_mapping(dev, handle, size);
> +             __iommu_free_buffer(dev, pages, size);
> +     }
> +}
> +
> +static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
> +                       size_t size, dma_addr_t *handle,
> +                       enum dma_data_direction dir)
> +{
> +     struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +     dma_addr_t iova, iova_base;
> +     int ret = 0;
> +     unsigned int count;
> +     struct scatterlist *s;
> +
> +     size = PAGE_ALIGN(size);
> +     *handle = ~0;
> +
> +     iova_base = iova = __alloc_iova(mapping, size);
> +     if (iova == ~0)
> +             return -ENOMEM;
> +
> +     for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s))
> +     {
> +             phys_addr_t phys = page_to_phys(sg_page(s));
> +             unsigned int len = PAGE_ALIGN(s->offset + s->length);
> +
> +             if (!arch_is_coherent())
> +                     __dma_page_cpu_to_dev(sg_page(s), s->offset, s-
> >length, dir);
> +
> +             ret = iommu_map(mapping->domain, iova, phys, len, 0);
> +             if (ret < 0)
> +                     goto fail;
> +             count += len >> PAGE_SHIFT;
> +             iova += len;
> +     }
> +     *handle = iova_base;
> +
> +     return 0;
> +fail:
> +     iommu_unmap(mapping->domain, iova_base, count * PAGE_SIZE);
> +     __free_iova(mapping, iova_base, size);
> +     return ret;
> +}
> +
> +int arm_iommu_map_sg(struct device *dev, struct scatterlist *sg, int nents,
> +                  enum dma_data_direction dir, struct dma_attrs *attrs) {
> +     struct scatterlist *s = sg, *dma = sg, *start = sg;
> +     int i, count = 0;
> +     unsigned int offset = s->offset;
> +     unsigned int size = s->offset + s->length;
> +     unsigned int max = dma_get_max_seg_size(dev);
> +
> +     s->dma_address = ~0;
> +     s->dma_length = 0;
> +
> +     for (i = 1; i < nents; i++) {
> +             s->dma_address = ~0;
> +             s->dma_length = 0;
> +
> +             s = sg_next(s);
> +
> +             if (s->offset || (size & ~PAGE_MASK) || size + s->length >
> max) {
> +                     if (__map_sg_chunk(dev, start, size, &dma-
> >dma_address,
> +                         dir) < 0)
> +                             goto bad_mapping;
> +
> +                     dma->dma_address += offset;
> +                     dma->dma_length = size - offset;
> +
> +                     size = offset = s->offset;
> +                     start = s;
> +                     dma = sg_next(dma);
> +                     count += 1;
> +             }
> +             size += s->length;
> +     }
> +     if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir) < 0)
> +             goto bad_mapping;
> +
> +     dma->dma_address += offset;
> +     dma->dma_length = size - offset;
> +
> +     return count+1;
> +
> +bad_mapping:
> +     for_each_sg(sg, s, count, i)
> +             __iommu_remove_mapping(dev, sg_dma_address(s),
> sg_dma_len(s));
> +     return 0;
> +}
> +
> +void arm_iommu_unmap_sg(struct device *dev, struct scatterlist *sg, int
> nents,
> +                     enum dma_data_direction dir, struct dma_attrs
> *attrs) {
> +     struct scatterlist *s;
> +     int i;
> +
> +     for_each_sg(sg, s, nents, i) {
> +             if (sg_dma_len(s))
> +                     __iommu_remove_mapping(dev,
> sg_dma_address(s),
> +                                            sg_dma_len(s));
> +             if (!arch_is_coherent())
> +                     __dma_page_dev_to_cpu(sg_page(s), s->offset,
> +                                           s->length, dir);
> +     }
> +}
> +
> +
> +/**
> + * dma_sync_sg_for_cpu
> + * @dev: valid struct device pointer, or NULL for ISA and EISA-like
> +devices
> + * @sg: list of buffers
> + * @nents: number of buffers to map (returned from dma_map_sg)
> + * @dir: DMA transfer direction (same as was passed to dma_map_sg)  */
> +void arm_iommu_sync_sg_for_cpu(struct device *dev, struct scatterlist
> *sg,
> +                     int nents, enum dma_data_direction dir) {
> +     struct scatterlist *s;
> +     int i;
> +
> +     for_each_sg(sg, s, nents, i)
> +             if (!arch_is_coherent())
> +                     __dma_page_dev_to_cpu(sg_page(s), s->offset, s-
> >length, dir);
> +
> +}
> +
> +/**
> + * dma_sync_sg_for_device
> + * @dev: valid struct device pointer, or NULL for ISA and EISA-like
> +devices
> + * @sg: list of buffers
> + * @nents: number of buffers to map (returned from dma_map_sg)
> + * @dir: DMA transfer direction (same as was passed to dma_map_sg)  */
> +void arm_iommu_sync_sg_for_device(struct device *dev, struct scatterlist
> *sg,
> +                     int nents, enum dma_data_direction dir) {
> +     struct scatterlist *s;
> +     int i;
> +
> +     for_each_sg(sg, s, nents, i)
> +             if (!arch_is_coherent())
> +                     __dma_page_cpu_to_dev(sg_page(s), s->offset, s-
> >length, dir); }
> +
> +static dma_addr_t arm_iommu_map_page(struct device *dev, struct page
> *page,
> +          unsigned long offset, size_t size, enum dma_data_direction dir,
> +          struct dma_attrs *attrs)
> +{
> +     struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +     dma_addr_t dma_addr, iova;
> +     unsigned int phys;
> +     int ret, len = PAGE_ALIGN(size + offset);
> +
> +     if (!arch_is_coherent())
> +             __dma_page_cpu_to_dev(page, offset, size, dir);
> +
> +     dma_addr = iova = __alloc_iova(mapping, len);
> +     if (iova == ~0)
> +             goto fail;
> +
> +     dma_addr += offset;
> +     phys = page_to_phys(page);
> +     ret = iommu_map(mapping->domain, iova, phys, size, 0);
> +     if (ret < 0)
> +             goto fail;
> +
> +     return dma_addr;
> +fail:
> +     return ~0;
> +}
> +
> +static void arm_iommu_unmap_page(struct device *dev, dma_addr_t
> handle,
> +             size_t size, enum dma_data_direction dir,
> +             struct dma_attrs *attrs)
> +{
> +     struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +     dma_addr_t iova = handle & PAGE_MASK;
> +     struct page *page = phys_to_page(iommu_iova_to_phys(mapping-
> >domain, iova));
> +     int offset = handle & ~PAGE_MASK;
> +
> +     if (!iova)
> +             return;
> +
> +     if (!arch_is_coherent())
> +             __dma_page_dev_to_cpu(page, offset, size, dir);
> +
> +     iommu_unmap(mapping->domain, iova, size);
> +     __free_iova(mapping, iova, size);
> +}
> +
> +static void arm_iommu_sync_single_for_cpu(struct device *dev,
> +             dma_addr_t handle, size_t size, enum dma_data_direction
> dir) {
> +     struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +     dma_addr_t iova = handle & PAGE_MASK;
> +     struct page *page = phys_to_page(iommu_iova_to_phys(mapping-
> >domain, iova));
> +     unsigned int offset = handle & ~PAGE_MASK;
> +
> +     if (!iova)
> +             return;
> +
> +     if (!arch_is_coherent())
> +             __dma_page_dev_to_cpu(page, offset, size, dir); }
> +
> +static void arm_iommu_sync_single_for_device(struct device *dev,
> +             dma_addr_t handle, size_t size, enum dma_data_direction
> dir) {
> +     struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +     dma_addr_t iova = handle & PAGE_MASK;
> +     struct page *page = phys_to_page(iommu_iova_to_phys(mapping-
> >domain, iova));
> +     unsigned int offset = handle & ~PAGE_MASK;
> +
> +     if (!iova)
> +             return;
> +
> +     __dma_page_cpu_to_dev(page, offset, size, dir); }
> +
> +struct dma_map_ops iommu_ops = {
> +     .alloc          = arm_iommu_alloc_attrs,
> +     .free           = arm_iommu_free_attrs,
> +     .mmap           = arm_iommu_mmap_attrs,
> +
> +     .map_page               = arm_iommu_map_page,
> +     .unmap_page             = arm_iommu_unmap_page,
> +     .sync_single_for_cpu    = arm_iommu_sync_single_for_cpu,
> +     .sync_single_for_device =
> arm_iommu_sync_single_for_device,
> +
> +     .map_sg                 = arm_iommu_map_sg,
> +     .unmap_sg               = arm_iommu_unmap_sg,
> +     .sync_sg_for_cpu        = arm_iommu_sync_sg_for_cpu,
> +     .sync_sg_for_device     = arm_iommu_sync_sg_for_device,
> +};
> +
> +struct dma_iommu_mapping *
> +arm_iommu_create_mapping(struct bus_type *bus, dma_addr_t base,
> size_t size,
> +                      int order)
> +{
> +     unsigned int count = (size >> PAGE_SHIFT) - order;
> +     unsigned int bitmap_size = BITS_TO_LONGS(count) * sizeof(long);
> +     struct dma_iommu_mapping *mapping;
> +     int err = -ENOMEM;
> +
> +     mapping = kzalloc(sizeof(struct dma_iommu_mapping),
> GFP_KERNEL);
> +     if (!mapping)
> +             goto err;
> +
> +     mapping->bitmap = kzalloc(bitmap_size, GFP_KERNEL);
> +     if (!mapping->bitmap)
> +             goto err2;
> +
> +     mapping->base = base;
> +     mapping->bits = bitmap_size;
> +     mapping->order = order;
> +     spin_lock_init(&mapping->lock);
> +
> +     mapping->domain = iommu_domain_alloc(bus);
> +     if (!mapping->domain)
> +             goto err3;
> +
> +     kref_init(&mapping->kref);
> +     return mapping;
> +err3:
> +     kfree(mapping->bitmap);
> +err2:
> +     kfree(mapping);
> +err:
> +     return ERR_PTR(err);
> +}
> +EXPORT_SYMBOL(arm_iommu_create_mapping);
> +
> +static void release_iommu_mapping(struct kref *kref) {
> +     struct dma_iommu_mapping *mapping =
> +             container_of(kref, struct dma_iommu_mapping, kref);
> +
> +     iommu_domain_free(mapping->domain);
> +     kfree(mapping->bitmap);
> +     kfree(mapping);
> +}
> +
> +void arm_iommu_release_mapping(struct dma_iommu_mapping
> *mapping) {
> +     if (mapping)
> +             kref_put(&mapping->kref, release_iommu_mapping); }
> +EXPORT_SYMBOL(arm_iommu_release_mapping);
> +
> +int arm_iommu_attach_device(struct device *dev,
> +                         struct dma_iommu_mapping *mapping) {
> +     int err;
> +
> +     err = iommu_attach_device(mapping->domain, dev);
> +     if (err)
> +             return err;
> +
> +     kref_get(&mapping->kref);
> +     dev->archdata.mapping = mapping;
> +     set_dma_ops(dev, &iommu_ops);
> +
> +     printk(KERN_INFO "Attached IOMMU controller to %s device.\n",
> dev_name(dev));
> +     return 0;
> +}
> +EXPORT_SYMBOL(arm_iommu_attach_device);
> +
> +#endif
> diff --git a/arch/arm/mm/vmregion.h b/arch/arm/mm/vmregion.h index
> 15e9f04..6bbc402 100644
> --- a/arch/arm/mm/vmregion.h
> +++ b/arch/arm/mm/vmregion.h
> @@ -17,7 +17,7 @@ struct arm_vmregion {
>       struct list_head        vm_list;
>       unsigned long           vm_start;
>       unsigned long           vm_end;
> -     struct page             *vm_pages;
> +     void                    *priv;
>       int                     vm_active;
>  };
>
> --
> 1.7.1.569.g6f426


^ permalink raw reply	[flat|nested] 39+ messages in thread

> +
> +     return mapping->base + (start << (mapping->order + PAGE_SHIFT));
> }
> +
> +static inline void __free_iova(struct dma_iommu_mapping *mapping,
> +                            dma_addr_t addr, size_t size) {
> +     unsigned int start = (addr - mapping->base) >>
> +                          (mapping->order + PAGE_SHIFT);
> +     unsigned int count = ((size >> PAGE_SHIFT) +
> +                           (1 << mapping->order) - 1) >> mapping->order;
> +     unsigned long flags;
> +
> +     spin_lock_irqsave(&mapping->lock, flags);
> +     bitmap_clear(mapping->bitmap, start, count);
> +     spin_unlock_irqrestore(&mapping->lock, flags); }
> +
> +static struct page **__iommu_alloc_buffer(struct device *dev, size_t
> +size, gfp_t gfp) {
> +     struct page **pages;
> +     int count = size >> PAGE_SHIFT;
> +     int i=0;
> +
> +     pages = kzalloc(count * sizeof(struct page*), gfp);
> +     if (!pages)
> +             return NULL;
> +
> +     while (count) {
> +             int j, order = __ffs(count);
> +
> +             pages[i] = alloc_pages(gfp | __GFP_NOWARN, order);
> +             while (!pages[i] && order)
> +                     pages[i] = alloc_pages(gfp | __GFP_NOWARN, --
> order);
> +             if (!pages[i])
> +                     goto error;
> +
> +             if (order)
> +                     split_page(pages[i], order);
> +             j = 1 << order;
> +             while (--j)
> +                     pages[i + j] = pages[i] + j;
> +
> +             __dma_clear_buffer(pages[i], PAGE_SIZE << order);
> +             i += 1 << order;
> +             count -= 1 << order;
> +     }
> +
> +     return pages;
> +error:
> +     while (--i)
> +             if (pages[i])
> +                     __free_pages(pages[i], 0);
> +     kfree(pages);
> +     return NULL;
> +}
> +
> +static int __iommu_free_buffer(struct device *dev, struct page **pages,
> +size_t size) {
> +     int count = size >> PAGE_SHIFT;
> +     int i;
> +     for (i=0; i< count; i++)
> +             if (pages[i])
> +                     __free_pages(pages[i], 0);
> +     kfree(pages);
> +     return 0;
> +}
> +
> +static void *
> +__iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp,
> +pgprot_t prot) {
> +     struct arm_vmregion *c;
> +     size_t align;
> +     size_t count = size >> PAGE_SHIFT;
> +     int bit;
> +
> +     if (!consistent_pte[0]) {
> +             printk(KERN_ERR "%s: not initialised\n", __func__);
> +             dump_stack();
> +             return NULL;
> +     }
> +
> +     /*
> +      * Align the virtual region allocation - maximum alignment is
> +      * a section size, minimum is a page size.  This helps reduce
> +      * fragmentation of the DMA space, and also prevents allocations
> +      * smaller than a section from crossing a section boundary.
> +      */
> +     bit = fls(size - 1);
> +     if (bit > SECTION_SHIFT)
> +             bit = SECTION_SHIFT;
> +     align = 1 << bit;
> +
> +     /*
> +      * Allocate a virtual address in the consistent mapping region.
> +      */
> +     c = arm_vmregion_alloc(&consistent_head, align, size,
> +                         gfp & ~(__GFP_DMA | __GFP_HIGHMEM));
> +     if (c) {
> +             pte_t *pte;
> +             int idx = CONSISTENT_PTE_INDEX(c->vm_start);
> +             int i = 0;
> +             u32 off = CONSISTENT_OFFSET(c->vm_start) &
> (PTRS_PER_PTE-1);
> +
> +             pte = consistent_pte[idx] + off;
> +             c->priv = pages;
> +
> +             do {
> +                     BUG_ON(!pte_none(*pte));
> +
> +                     set_pte_ext(pte, mk_pte(pages[i], prot), 0);
> +                     pte++;
> +                     off++;
> +                     i++;
> +                     if (off >= PTRS_PER_PTE) {
> +                             off = 0;
> +                             pte = consistent_pte[++idx];
> +                     }
> +             } while (i < count);
> +
> +             dsb();
> +
> +             return (void *)c->vm_start;
> +     }
> +     return NULL;
> +}
> +
> +static dma_addr_t __iommu_create_mapping(struct device *dev, struct
> +page **pages, size_t size) {
> +     struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +     unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> +     dma_addr_t dma_addr, iova;
> +     int i, ret = ~0;
> +
> +     dma_addr = __alloc_iova(mapping, size);
> +     if (dma_addr == ~0)
> +             goto fail;
> +
> +     iova = dma_addr;
> +     for (i=0; i<count; ) {
> +             unsigned int phys = page_to_phys(pages[i]);
> +             int j = i + 1;
> +
> +             while (j < count) {
> +                     if (page_to_phys(pages[j]) != phys + (j - i) *
> PAGE_SIZE)
> +                             break;
> +                     j++;
> +             }
> +
> +             ret = iommu_map(mapping->domain, iova, phys, (j - i) *
> PAGE_SIZE, 0);
> +             if (ret < 0)
> +                     goto fail;
> +             iova += (j - i) * PAGE_SIZE;
> +             i = j;
> +     }
> +
> +     return dma_addr;
> +fail:
> +     return ~0;
> +}
> +
> +static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova,
> +size_t size) {
> +     struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +     unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> +
> +     iova &= PAGE_MASK;
> +
> +     iommu_unmap(mapping->domain, iova, count * PAGE_SIZE);
> +
> +     __free_iova(mapping, iova, size);
> +     return 0;
> +}
> +
> +static void *arm_iommu_alloc_attrs(struct device *dev, size_t size,
> +         dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs) {
> +     pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel);
> +     struct page **pages;
> +     void *addr = NULL;
> +
> +     *handle = ~0;
> +     size = PAGE_ALIGN(size);
> +
> +     pages = __iommu_alloc_buffer(dev, size, gfp);
> +     if (!pages)
> +             return NULL;
> +
> +     *handle = __iommu_create_mapping(dev, pages, size);
> +     if (*handle == ~0)
> +             goto err_buffer;
> +
> +     addr = __iommu_alloc_remap(pages, size, gfp, prot);
> +     if (!addr)
> +             goto err_mapping;
> +
> +     return addr;
> +
> +err_mapping:
> +     __iommu_remove_mapping(dev, *handle, size);
> +err_buffer:
> +     __iommu_free_buffer(dev, pages, size);
> +     return NULL;
> +}
> +
> +static int arm_iommu_mmap_attrs(struct device *dev, struct
> vm_area_struct *vma,
> +                 void *cpu_addr, dma_addr_t dma_addr, size_t size,
> +                 struct dma_attrs *attrs)
> +{
> +     struct arm_vmregion *c;
> +
> +     vma->vm_page_prot = __get_dma_pgprot(attrs, vma-
> >vm_page_prot);
> +     c = arm_vmregion_find(&consistent_head, (unsigned
> long)cpu_addr);
> +
> +     if (c) {
> +             struct page **pages = c->priv;
> +
> +             unsigned long uaddr = vma->vm_start;
> +             unsigned long usize = vma->vm_end - vma->vm_start;
> +             int i = 0;
> +
> +             do {
> +                     int ret;
> +
> +                     ret = vm_insert_page(vma, uaddr, pages[i++]);
> +                     if (ret) {
> +                             printk(KERN_ERR "Remapping memory,
> error: %d\n", ret);
> +                             return ret;
> +                     }
> +
> +                     uaddr += PAGE_SIZE;
> +                     usize -= PAGE_SIZE;
> +             } while (usize > 0);
> +     }
> +     return 0;
> +}
> +
> +/*
> + * free a page as defined by the above mapping.
> + * Must not be called with IRQs disabled.
> + */
> +void arm_iommu_free_attrs(struct device *dev, size_t size, void
> *cpu_addr,
> +                       dma_addr_t handle, struct dma_attrs *attrs) {
> +     struct arm_vmregion *c;
> +     size = PAGE_ALIGN(size);
> +
> +     c = arm_vmregion_find(&consistent_head, (unsigned
> long)cpu_addr);
> +     if (c) {
> +             struct page **pages = c->priv;
> +             __dma_free_remap(cpu_addr, size);
> +             __iommu_remove_mapping(dev, handle, size);
> +             __iommu_free_buffer(dev, pages, size);
> +     }
> +}
> +
> +static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
> +                       size_t size, dma_addr_t *handle,
> +                       enum dma_data_direction dir)
> +{
> +     struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +     dma_addr_t iova, iova_base;
> +     int ret = 0;
> +     unsigned int count;
> +     struct scatterlist *s;
> +
> +     size = PAGE_ALIGN(size);
> +     *handle = ~0;
> +
> +     iova_base = iova = __alloc_iova(mapping, size);
> +     if (iova == ~0)
> +             return -ENOMEM;
> +
> +     for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s))
> +     {
> +             phys_addr_t phys = page_to_phys(sg_page(s));
> +             unsigned int len = PAGE_ALIGN(s->offset + s->length);
> +
> +             if (!arch_is_coherent())
> +                     __dma_page_cpu_to_dev(sg_page(s), s->offset, s-
> >length, dir);
> +
> +             ret = iommu_map(mapping->domain, iova, phys, len, 0);
> +             if (ret < 0)
> +                     goto fail;
> +             count += len >> PAGE_SHIFT;
> +             iova += len;
> +     }
> +     *handle = iova_base;
> +
> +     return 0;
> +fail:
> +     iommu_unmap(mapping->domain, iova_base, count * PAGE_SIZE);
> +     __free_iova(mapping, iova_base, size);
> +     return ret;
> +}
> +
> +int arm_iommu_map_sg(struct device *dev, struct scatterlist *sg, int nents,
> +                  enum dma_data_direction dir, struct dma_attrs *attrs) {
> +     struct scatterlist *s = sg, *dma = sg, *start = sg;
> +     int i, count = 0;
> +     unsigned int offset = s->offset;
> +     unsigned int size = s->offset + s->length;
> +     unsigned int max = dma_get_max_seg_size(dev);
> +
> +     s->dma_address = ~0;
> +     s->dma_length = 0;
> +
> +     for (i = 1; i < nents; i++) {
> +             s->dma_address = ~0;
> +             s->dma_length = 0;
> +
> +             s = sg_next(s);
> +
> +             if (s->offset || (size & ~PAGE_MASK) || size + s->length >
> max) {
> +                     if (__map_sg_chunk(dev, start, size, &dma-
> >dma_address,
> +                         dir) < 0)
> +                             goto bad_mapping;
> +
> +                     dma->dma_address += offset;
> +                     dma->dma_length = size - offset;
> +
> +                     size = offset = s->offset;
> +                     start = s;
> +                     dma = sg_next(dma);
> +                     count += 1;
> +             }
> +             size += s->length;
> +     }
> +     if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir) < 0)
> +             goto bad_mapping;
> +
> +     dma->dma_address += offset;
> +     dma->dma_length = size - offset;
> +
> +     return count+1;
> +
> +bad_mapping:
> +     for_each_sg(sg, s, count, i)
> +             __iommu_remove_mapping(dev, sg_dma_address(s),
> sg_dma_len(s));
> +     return 0;
> +}
> +
> +void arm_iommu_unmap_sg(struct device *dev, struct scatterlist *sg, int
> nents,
> +                     enum dma_data_direction dir, struct dma_attrs
> *attrs) {
> +     struct scatterlist *s;
> +     int i;
> +
> +     for_each_sg(sg, s, nents, i) {
> +             if (sg_dma_len(s))
> +                     __iommu_remove_mapping(dev,
> sg_dma_address(s),
> +                                            sg_dma_len(s));
> +             if (!arch_is_coherent())
> +                     __dma_page_dev_to_cpu(sg_page(s), s->offset,
> +                                           s->length, dir);
> +     }
> +}
> +
> +
> +/**
> + * dma_sync_sg_for_cpu
> + * @dev: valid struct device pointer, or NULL for ISA and EISA-like
> +devices
> + * @sg: list of buffers
> + * @nents: number of buffers to map (returned from dma_map_sg)
> + * @dir: DMA transfer direction (same as was passed to dma_map_sg)  */
> +void arm_iommu_sync_sg_for_cpu(struct device *dev, struct scatterlist
> *sg,
> +                     int nents, enum dma_data_direction dir) {
> +     struct scatterlist *s;
> +     int i;
> +
> +     for_each_sg(sg, s, nents, i)
> +             if (!arch_is_coherent())
> +                     __dma_page_dev_to_cpu(sg_page(s), s->offset, s-
> >length, dir);
> +
> +}
> +
> +/**
> + * dma_sync_sg_for_device
> + * @dev: valid struct device pointer, or NULL for ISA and EISA-like
> +devices
> + * @sg: list of buffers
> + * @nents: number of buffers to map (returned from dma_map_sg)
> + * @dir: DMA transfer direction (same as was passed to dma_map_sg)  */
> +void arm_iommu_sync_sg_for_device(struct device *dev, struct scatterlist
> *sg,
> +                     int nents, enum dma_data_direction dir) {
> +     struct scatterlist *s;
> +     int i;
> +
> +     for_each_sg(sg, s, nents, i)
> +             if (!arch_is_coherent())
> +                     __dma_page_cpu_to_dev(sg_page(s), s->offset, s-
> >length, dir); }
> +
> +static dma_addr_t arm_iommu_map_page(struct device *dev, struct page
> *page,
> +          unsigned long offset, size_t size, enum dma_data_direction dir,
> +          struct dma_attrs *attrs)
> +{
> +     struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +     dma_addr_t dma_addr, iova;
> +     unsigned int phys;
> +     int ret, len = PAGE_ALIGN(size + offset);
> +
> +     if (!arch_is_coherent())
> +             __dma_page_cpu_to_dev(page, offset, size, dir);
> +
> +     dma_addr = iova = __alloc_iova(mapping, len);
> +     if (iova == ~0)
> +             goto fail;
> +
> +     dma_addr += offset;
> +     phys = page_to_phys(page);
> +     ret = iommu_map(mapping->domain, iova, phys, size, 0);
> +     if (ret < 0)
> +             goto fail;
> +
> +     return dma_addr;
> +fail:
> +     return ~0;
> +}
> +
> +static void arm_iommu_unmap_page(struct device *dev, dma_addr_t
> handle,
> +             size_t size, enum dma_data_direction dir,
> +             struct dma_attrs *attrs)
> +{
> +     struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +     dma_addr_t iova = handle & PAGE_MASK;
> +     struct page *page = phys_to_page(iommu_iova_to_phys(mapping-
> >domain, iova));
> +     int offset = handle & ~PAGE_MASK;
> +
> +     if (!iova)
> +             return;
> +
> +     if (!arch_is_coherent())
> +             __dma_page_dev_to_cpu(page, offset, size, dir);
> +
> +     iommu_unmap(mapping->domain, iova, size);
> +     __free_iova(mapping, iova, size);
> +}
> +
> +static void arm_iommu_sync_single_for_cpu(struct device *dev,
> +             dma_addr_t handle, size_t size, enum dma_data_direction
> dir) {
> +     struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +     dma_addr_t iova = handle & PAGE_MASK;
> +     struct page *page = phys_to_page(iommu_iova_to_phys(mapping-
> >domain, iova));
> +     unsigned int offset = handle & ~PAGE_MASK;
> +
> +     if (!iova)
> +             return;
> +
> +     if (!arch_is_coherent())
> +             __dma_page_dev_to_cpu(page, offset, size, dir); }
> +
> +static void arm_iommu_sync_single_for_device(struct device *dev,
> +             dma_addr_t handle, size_t size, enum dma_data_direction
> dir) {
> +     struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +     dma_addr_t iova = handle & PAGE_MASK;
> +     struct page *page = phys_to_page(iommu_iova_to_phys(mapping-
> >domain, iova));
> +     unsigned int offset = handle & ~PAGE_MASK;
> +
> +     if (!iova)
> +             return;
> +
> +     __dma_page_cpu_to_dev(page, offset, size, dir); }
> +
> +struct dma_map_ops iommu_ops = {
> +     .alloc          = arm_iommu_alloc_attrs,
> +     .free           = arm_iommu_free_attrs,
> +     .mmap           = arm_iommu_mmap_attrs,
> +
> +     .map_page               = arm_iommu_map_page,
> +     .unmap_page             = arm_iommu_unmap_page,
> +     .sync_single_for_cpu    = arm_iommu_sync_single_for_cpu,
> +     .sync_single_for_device =
> arm_iommu_sync_single_for_device,
> +
> +     .map_sg                 = arm_iommu_map_sg,
> +     .unmap_sg               = arm_iommu_unmap_sg,
> +     .sync_sg_for_cpu        = arm_iommu_sync_sg_for_cpu,
> +     .sync_sg_for_device     = arm_iommu_sync_sg_for_device,
> +};
> +
> +struct dma_iommu_mapping *
> +arm_iommu_create_mapping(struct bus_type *bus, dma_addr_t base,
> size_t size,
> +                      int order)
> +{
> +     unsigned int count = (size >> PAGE_SHIFT) - order;
> +     unsigned int bitmap_size = BITS_TO_LONGS(count) * sizeof(long);
> +     struct dma_iommu_mapping *mapping;
> +     int err = -ENOMEM;
> +
> +     mapping = kzalloc(sizeof(struct dma_iommu_mapping),
> GFP_KERNEL);
> +     if (!mapping)
> +             goto err;
> +
> +     mapping->bitmap = kzalloc(bitmap_size, GFP_KERNEL);
> +     if (!mapping->bitmap)
> +             goto err2;
> +
> +     mapping->base = base;
> +     mapping->bits = bitmap_size;
> +     mapping->order = order;
> +     spin_lock_init(&mapping->lock);
> +
> +     mapping->domain = iommu_domain_alloc(bus);
> +     if (!mapping->domain)
> +             goto err3;
> +
> +     kref_init(&mapping->kref);
> +     return mapping;
> +err3:
> +     kfree(mapping->bitmap);
> +err2:
> +     kfree(mapping);
> +err:
> +     return ERR_PTR(err);
> +}
> +EXPORT_SYMBOL(arm_iommu_create_mapping);
> +
> +static void release_iommu_mapping(struct kref *kref) {
> +     struct dma_iommu_mapping *mapping =
> +             container_of(kref, struct dma_iommu_mapping, kref);
> +
> +     iommu_domain_free(mapping->domain);
> +     kfree(mapping->bitmap);
> +     kfree(mapping);
> +}
> +
> +void arm_iommu_release_mapping(struct dma_iommu_mapping
> *mapping) {
> +     if (mapping)
> +             kref_put(&mapping->kref, release_iommu_mapping); }
> +EXPORT_SYMBOL(arm_iommu_release_mapping);
> +
> +int arm_iommu_attach_device(struct device *dev,
> +                         struct dma_iommu_mapping *mapping) {
> +     int err;
> +
> +     err = iommu_attach_device(mapping->domain, dev);
> +     if (err)
> +             return err;
> +
> +     kref_get(&mapping->kref);
> +     dev->archdata.mapping = mapping;
> +     set_dma_ops(dev, &iommu_ops);
> +
> +     printk(KERN_INFO "Attached IOMMU controller to %s device.\n",
> dev_name(dev));
> +     return 0;
> +}
> +EXPORT_SYMBOL(arm_iommu_attach_device);
> +
> +#endif
> diff --git a/arch/arm/mm/vmregion.h b/arch/arm/mm/vmregion.h index
> 15e9f04..6bbc402 100644
> --- a/arch/arm/mm/vmregion.h
> +++ b/arch/arm/mm/vmregion.h
> @@ -17,7 +17,7 @@ struct arm_vmregion {
>       struct list_head        vm_list;
>       unsigned long           vm_start;
>       unsigned long           vm_end;
> -     struct page             *vm_pages;
> +     void                    *priv;
>       int                     vm_active;
>  };
>
> --
> 1.7.1.569.g6f426


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper
  2012-02-10 18:58   ` [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper Marek Szyprowski
                       ` (2 preceding siblings ...)
  2012-02-13 19:58     ` Krishna Reddy
@ 2012-02-14 14:55     ` Konrad Rzeszutek Wilk
  2012-02-14 14:55       ` Konrad Rzeszutek Wilk
  2012-02-24 13:12       ` Marek Szyprowski
  3 siblings, 2 replies; 39+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-02-14 14:55 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-samsung-soc, iommu, Shariq Hasnain, Arnd Bergmann,
	Benjamin Herrenschmidt, Krishna Reddy, Kyungmin Park,
	Andrzej Pietrasiewicz, Russell King - ARM Linux, KyongHo Cho,
	Chunsang Jeong

> +static void __dma_clear_buffer(struct page *page, size_t size)
> +{
> +	void *ptr;
> +	/*
> +	 * Ensure that the allocated pages are zeroed, and that any data
> +	 * lurking in the kernel direct-mapped region is invalidated.
> +	 */
> +	ptr = page_address(page);

Should you check to see if the ptr is valid?
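A minimal sketch of such a guard, in case the page could ever be a highmem page
without a kernel mapping (purely illustrative, not part of the patch):

	ptr = page_address(page);
	if (WARN_ON(!ptr))	/* nothing mapped in lowmem to clear through */
		return;
	memset(ptr, 0, size);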

> +	memset(ptr, 0, size);
> +	dmac_flush_range(ptr, ptr + size);
> +	outer_flush_range(__pa(ptr), __pa(ptr) + size);
> +}
> +
>  /*
>   * Allocate a DMA buffer for 'dev' of size 'size' using the
>   * specified gfp mask.  Note that 'size' must be page aligned.
> @@ -164,7 +179,6 @@ static struct page *__dma_alloc_buffer(struct device *dev, size_t size, gfp_t gf
>  {
>  	unsigned long order = get_order(size);
>  	struct page *page, *p, *e;
> -	void *ptr;
>  	u64 mask = get_coherent_dma_mask(dev);
>  
>  #ifdef CONFIG_DMA_API_DEBUG
> @@ -193,14 +207,7 @@ static struct page *__dma_alloc_buffer(struct device *dev, size_t size, gfp_t gf
>  	for (p = page + (size >> PAGE_SHIFT), e = page + (1 << order); p < e; p++)
>  		__free_page(p);
>  
> -	/*
> -	 * Ensure that the allocated pages are zeroed, and that any data
> -	 * lurking in the kernel direct-mapped region is invalidated.
> -	 */
> -	ptr = page_address(page);
> -	memset(ptr, 0, size);
> -	dmac_flush_range(ptr, ptr + size);
> -	outer_flush_range(__pa(ptr), __pa(ptr) + size);
> +	__dma_clear_buffer(page, size);
>  
>  	return page;
>  }
> @@ -348,7 +355,7 @@ __dma_alloc_remap(struct page *page, size_t size, gfp_t gfp, pgprot_t prot)
>  		u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1);
>  
>  		pte = consistent_pte[idx] + off;
> -		c->vm_pages = page;
> +		c->priv = page;
>  
>  		do {
>  			BUG_ON(!pte_none(*pte));
> @@ -461,6 +468,14 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp,
>  	return addr;
>  }
>  
> +static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot)
> +{
> +	prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ?
> +			    pgprot_writecombine(prot) :
> +			    pgprot_dmacoherent(prot);
> +	return prot;
> +}
> +
>  /*
>   * Allocate DMA-coherent memory space and return both the kernel remapped
>   * virtual and bus address for that space.
> @@ -468,9 +483,7 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp,
>  void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
>  		    gfp_t gfp, struct dma_attrs *attrs)
>  {
> -	pgprot_t prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ?
> -			pgprot_writecombine(pgprot_kernel) :
> -			pgprot_dmacoherent(pgprot_kernel);
> +	pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel);
>  	void *memory;
>  
>  	if (dma_alloc_from_coherent(dev, size, handle, &memory))
> @@ -499,13 +512,14 @@ int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma,
>  	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);
>  	if (c) {
>  		unsigned long off = vma->vm_pgoff;
> +		struct page *pages = c->priv;
>  
>  		kern_size = (c->vm_end - c->vm_start) >> PAGE_SHIFT;
>  
>  		if (off < kern_size &&
>  		    user_size <= (kern_size - off)) {
>  			ret = remap_pfn_range(vma, vma->vm_start,
> -					      page_to_pfn(c->vm_pages) + off,
> +					      page_to_pfn(pages) + off,
>  					      user_size << PAGE_SHIFT,
>  					      vma->vm_page_prot);
>  		}
> @@ -644,6 +658,9 @@ int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
>  	int i, j;
>  
>  	for_each_sg(sg, s, nents, i) {
> +#ifdef CONFIG_NEED_SG_DMA_LENGTH
> +		s->dma_length = s->length;
> +#endif
>  		s->dma_address = ops->map_page(dev, sg_page(s), s->offset,
>  						s->length, dir, attrs);
>  		if (dma_mapping_error(dev, s->dma_address))
> @@ -749,3 +766,593 @@ static int __init dma_debug_do_init(void)
>  	return 0;
>  }
>  fs_initcall(dma_debug_do_init);
> +
> +#ifdef CONFIG_ARM_DMA_USE_IOMMU
> +
> +/* IOMMU */
> +
> +static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping *mapping,
> +				      size_t size)
> +{
> +	unsigned int order = get_order(size);
> +	unsigned int align = 0;
> +	unsigned int count, start;
> +	unsigned long flags;
> +
> +	count = ((PAGE_ALIGN(size) >> PAGE_SHIFT) +
> +		 (1 << mapping->order) - 1) >> mapping->order;
> +
> +	if (order > mapping->order)
> +		align = (1 << (order - mapping->order)) - 1;
> +
> +	spin_lock_irqsave(&mapping->lock, flags);
> +	start = bitmap_find_next_zero_area(mapping->bitmap, mapping->bits, 0,
> +					   count, align);
> +	if (start > mapping->bits) {
> +		spin_unlock_irqrestore(&mapping->lock, flags);
> +		return ~0;

Would it make sense to use DMA_ERROR_CODE? Or an ARM variant of it.
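For instance - assuming DMA_ERROR_CODE (or an ARM-specific equivalent) is
visible in this file - the bail-out path could read (illustrative sketch only):

	if (start > mapping->bits) {
		spin_unlock_irqrestore(&mapping->lock, flags);
		return DMA_ERROR_CODE;
	}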

> +	}
> +
> +	bitmap_set(mapping->bitmap, start, count);
> +	spin_unlock_irqrestore(&mapping->lock, flags);
> +
> +	return mapping->base + (start << (mapping->order + PAGE_SHIFT));
> +}
> +
> +static inline void __free_iova(struct dma_iommu_mapping *mapping,
> +			       dma_addr_t addr, size_t size)
> +{
> +	unsigned int start = (addr - mapping->base) >>
> +			     (mapping->order + PAGE_SHIFT);
> +	unsigned int count = ((size >> PAGE_SHIFT) +
> +			      (1 << mapping->order) - 1) >> mapping->order;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&mapping->lock, flags);
> +	bitmap_clear(mapping->bitmap, start, count);
> +	spin_unlock_irqrestore(&mapping->lock, flags);
> +}
> +
> +static struct page **__iommu_alloc_buffer(struct device *dev, size_t size, gfp_t gfp)
> +{
> +	struct page **pages;
> +	int count = size >> PAGE_SHIFT;
> +	int i=0;
> +
> +	pages = kzalloc(count * sizeof(struct page*), gfp);
> +	if (!pages)
> +		return NULL;
> +
> +	while (count) {
> +		int j, order = __ffs(count);
> +
> +		pages[i] = alloc_pages(gfp | __GFP_NOWARN, order);
> +		while (!pages[i] && order)
> +			pages[i] = alloc_pages(gfp | __GFP_NOWARN, --order);
> +		if (!pages[i])
> +			goto error;
> +
> +		if (order)
> +			split_page(pages[i], order);
> +		j = 1 << order;
> +		while (--j)
> +			pages[i + j] = pages[i] + j;
> +
> +		__dma_clear_buffer(pages[i], PAGE_SIZE << order);
> +		i += 1 << order;
> +		count -= 1 << order;
> +	}
> +
> +	return pages;
> +error:
> +	while (--i)
> +		if (pages[i])
> +			__free_pages(pages[i], 0);
> +	kfree(pages);
> +	return NULL;
> +}
> +
> +static int __iommu_free_buffer(struct device *dev, struct page **pages, size_t size)
> +{
> +	int count = size >> PAGE_SHIFT;
> +	int i;
> +	for (i=0; i< count; i++)

That 'i< count' looks odd. Did checkpatch miss that one?
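i.e. the usual kernel style would presumably be:

	for (i = 0; i < count; i++)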

> +		if (pages[i])
> +			__free_pages(pages[i], 0);
> +	kfree(pages);
> +	return 0;
> +}
> +
> +static void *
> +__iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot)
> +{
> +	struct arm_vmregion *c;
> +	size_t align;
> +	size_t count = size >> PAGE_SHIFT;
> +	int bit;
> +
> +	if (!consistent_pte[0]) {
> +		printk(KERN_ERR "%s: not initialised\n", __func__);
> +		dump_stack();
> +		return NULL;
> +	}
> +
> +	/*
> +	 * Align the virtual region allocation - maximum alignment is
> +	 * a section size, minimum is a page size.  This helps reduce
> +	 * fragmentation of the DMA space, and also prevents allocations
> +	 * smaller than a section from crossing a section boundary.
> +	 */
> +	bit = fls(size - 1);
> +	if (bit > SECTION_SHIFT)
> +		bit = SECTION_SHIFT;
> +	align = 1 << bit;
> +
> +	/*
> +	 * Allocate a virtual address in the consistent mapping region.
> +	 */
> +	c = arm_vmregion_alloc(&consistent_head, align, size,
> +			    gfp & ~(__GFP_DMA | __GFP_HIGHMEM));
> +	if (c) {
> +		pte_t *pte;
> +		int idx = CONSISTENT_PTE_INDEX(c->vm_start);
> +		int i = 0;
> +		u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1);
> +
> +		pte = consistent_pte[idx] + off;
> +		c->priv = pages;
> +
> +		do {
> +			BUG_ON(!pte_none(*pte));
> +
> +			set_pte_ext(pte, mk_pte(pages[i], prot), 0);
> +			pte++;
> +			off++;
> +			i++;
> +			if (off >= PTRS_PER_PTE) {
> +				off = 0;
> +				pte = consistent_pte[++idx];
> +			}
> +		} while (i < count);
> +
> +		dsb();
> +
> +		return (void *)c->vm_start;
> +	}
> +	return NULL;
> +}
> +
> +static dma_addr_t __iommu_create_mapping(struct device *dev, struct page **pages, size_t size)
> +{
> +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> +	dma_addr_t dma_addr, iova;
> +	int i, ret = ~0;
> +
> +	dma_addr = __alloc_iova(mapping, size);
> +	if (dma_addr == ~0)
> +		goto fail;
> +
> +	iova = dma_addr;
> +	for (i=0; i<count; ) {
> +		unsigned int phys = page_to_phys(pages[i]);

phys_addr_t ?

> +		int j = i + 1;
> +
> +		while (j < count) {
> +			if (page_to_phys(pages[j]) != phys + (j - i) * PAGE_SIZE)
> +				break;

How about just using pfn values?
So:

	unsigned long next_pfn = page_to_pfn(pages[i]);
	unsigned int pfn = i;

	for (j = 1; j < count; j++)
		if (page_to_pfn(pages[++pfn]) != ++next_pfn)
			break;

IMHO it looks easier to read.
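A rough, untested sketch of the whole loop along those lines (also using
phys_addr_t for the physical address, as suggested above) might be:

	for (i = 0; i < count; ) {
		phys_addr_t phys = page_to_phys(pages[i]);
		unsigned long next_pfn = page_to_pfn(pages[i]) + 1;
		int j;

		/* extend the chunk while the pages stay physically contiguous */
		for (j = i + 1; j < count; j++) {
			if (page_to_pfn(pages[j]) != next_pfn++)
				break;
		}

		ret = iommu_map(mapping->domain, iova, phys,
				(j - i) * PAGE_SIZE, 0);
		if (ret < 0)
			goto fail;
		iova += (j - i) * PAGE_SIZE;
		i = j;
	}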

> +			j++;
> +		}
> +
> +		ret = iommu_map(mapping->domain, iova, phys, (j - i) * PAGE_SIZE, 0);
> +		if (ret < 0)
> +			goto fail;
> +		iova += (j - i) * PAGE_SIZE;
> +		i = j;

Granted you would have to rework this a bit.
> +	}
> +
> +	return dma_addr;
> +fail:
> +	return ~0;

DMA_ERROR_CODE

> +}
> +
> +static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova, size_t size)
> +{
> +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> +
> +	iova &= PAGE_MASK;
> +
> +	iommu_unmap(mapping->domain, iova, count * PAGE_SIZE);
> +
> +	__free_iova(mapping, iova, size);
> +	return 0;
> +}
> +
> +static void *arm_iommu_alloc_attrs(struct device *dev, size_t size,
> +	    dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs)
> +{
> +	pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel);
> +	struct page **pages;
> +	void *addr = NULL;
> +
> +	*handle = ~0;
> +	size = PAGE_ALIGN(size);
> +
> +	pages = __iommu_alloc_buffer(dev, size, gfp);
> +	if (!pages)
> +		return NULL;
> +
> +	*handle = __iommu_create_mapping(dev, pages, size);
> +	if (*handle == ~0)
> +		goto err_buffer;
> +
> +	addr = __iommu_alloc_remap(pages, size, gfp, prot);
> +	if (!addr)
> +		goto err_mapping;
> +
> +	return addr;
> +
> +err_mapping:
> +	__iommu_remove_mapping(dev, *handle, size);
> +err_buffer:
> +	__iommu_free_buffer(dev, pages, size);
> +	return NULL;
> +}
> +
> +static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
> +		    void *cpu_addr, dma_addr_t dma_addr, size_t size,
> +		    struct dma_attrs *attrs)
> +{
> +	struct arm_vmregion *c;
> +
> +	vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot);
> +	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);
> +
> +	if (c) {
> +		struct page **pages = c->priv;
> +
> +		unsigned long uaddr = vma->vm_start;
> +		unsigned long usize = vma->vm_end - vma->vm_start;
> +		int i = 0;
> +
> +		do {
> +			int ret;
> +
> +			ret = vm_insert_page(vma, uaddr, pages[i++]);
> +			if (ret) {
> +				printk(KERN_ERR "Remapping memory, error: %d\n", ret);
> +				return ret;
> +			}
> +
> +			uaddr += PAGE_SIZE;
> +			usize -= PAGE_SIZE;
> +		} while (usize > 0);
> +	}
> +	return 0;
> +}
> +
> +/*
> + * free a page as defined by the above mapping.
> + * Must not be called with IRQs disabled.
> + */
> +void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
> +			  dma_addr_t handle, struct dma_attrs *attrs)
> +{
> +	struct arm_vmregion *c;
> +	size = PAGE_ALIGN(size);
> +
> +	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);
> +	if (c) {
> +		struct page **pages = c->priv;
> +		__dma_free_remap(cpu_addr, size);
> +		__iommu_remove_mapping(dev, handle, size);
> +		__iommu_free_buffer(dev, pages, size);
> +	}
> +}
> +
> +static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
> +			  size_t size, dma_addr_t *handle,
> +			  enum dma_data_direction dir)
> +{
> +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +	dma_addr_t iova, iova_base;
> +	int ret = 0;
> +	unsigned int count;
> +	struct scatterlist *s;
> +
> +	size = PAGE_ALIGN(size);
> +	*handle = ~0;
> +
> +	iova_base = iova = __alloc_iova(mapping, size);
> +	if (iova == ~0)
> +		return -ENOMEM;
> +
> +	for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s))
> +	{
> +		phys_addr_t phys = page_to_phys(sg_page(s));
> +		unsigned int len = PAGE_ALIGN(s->offset + s->length);
> +
> +		if (!arch_is_coherent())
> +			__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
> +
> +		ret = iommu_map(mapping->domain, iova, phys, len, 0);
> +		if (ret < 0)
> +			goto fail;
> +		count += len >> PAGE_SHIFT;
> +		iova += len;
> +	}
> +	*handle = iova_base;
> +
> +	return 0;
> +fail:
> +	iommu_unmap(mapping->domain, iova_base, count * PAGE_SIZE);
> +	__free_iova(mapping, iova_base, size);
> +	return ret;
> +}
> +
> +int arm_iommu_map_sg(struct device *dev, struct scatterlist *sg, int nents,
> +		     enum dma_data_direction dir, struct dma_attrs *attrs)
> +{
> +	struct scatterlist *s = sg, *dma = sg, *start = sg;
> +	int i, count = 0;
> +	unsigned int offset = s->offset;
> +	unsigned int size = s->offset + s->length;
> +	unsigned int max = dma_get_max_seg_size(dev);
> +
> +	s->dma_address = ~0;
> +	s->dma_length = 0;

Not zero just in case somebody does not check the values and tries to use them?
> +
> +	for (i = 1; i < nents; i++) {
> +		s->dma_address = ~0;
> +		s->dma_length = 0;
> +
> +		s = sg_next(s);
> +
> +		if (s->offset || (size & ~PAGE_MASK) || size + s->length > max) {
> +			if (__map_sg_chunk(dev, start, size, &dma->dma_address,
> +			    dir) < 0)
> +				goto bad_mapping;
> +
> +			dma->dma_address += offset;
> +			dma->dma_length = size - offset;
> +
> +			size = offset = s->offset;
> +			start = s;
> +			dma = sg_next(dma);
> +			count += 1;
> +		}
> +		size += s->length;
> +	}
> +	if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir) < 0)
> +		goto bad_mapping;
> +
> +	dma->dma_address += offset;
> +	dma->dma_length = size - offset;
> +
> +	return count+1;
> +
> +bad_mapping:
> +	for_each_sg(sg, s, count, i)
> +		__iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s));
> +	return 0;
> +}
> +
> +void arm_iommu_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
> +			enum dma_data_direction dir, struct dma_attrs *attrs)
> +{
> +	struct scatterlist *s;
> +	int i;
> +
> +	for_each_sg(sg, s, nents, i) {
> +		if (sg_dma_len(s))
> +			__iommu_remove_mapping(dev, sg_dma_address(s),
> +					       sg_dma_len(s));
> +		if (!arch_is_coherent())
> +			__dma_page_dev_to_cpu(sg_page(s), s->offset,
> +					      s->length, dir);
> +	}
> +}
> +
> +
> +/**
> + * dma_sync_sg_for_cpu
> + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices

Uhhh, Won't that conflict with patch #1 which BUGs if dev != NULL?

> + * @sg: list of buffers
> + * @nents: number of buffers to map (returned from dma_map_sg)
> + * @dir: DMA transfer direction (same as was passed to dma_map_sg)
> + */
> +void arm_iommu_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
> +			int nents, enum dma_data_direction dir)
> +{
> +	struct scatterlist *s;
> +	int i;
> +
> +	for_each_sg(sg, s, nents, i)
> +		if (!arch_is_coherent())
> +			__dma_page_dev_to_cpu(sg_page(s), s->offset, s->length, dir);

Uh, I thought you would need to pass in the 'dev'?

> +
> +}
> +
> +/**
> + * dma_sync_sg_for_device
> + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> + * @sg: list of buffers
> + * @nents: number of buffers to map (returned from dma_map_sg)
> + * @dir: DMA transfer direction (same as was passed to dma_map_sg)
> + */
> +void arm_iommu_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
> +			int nents, enum dma_data_direction dir)
> +{
> +	struct scatterlist *s;
> +	int i;
> +
> +	for_each_sg(sg, s, nents, i)
> +		if (!arch_is_coherent())
> +			__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
> +}
> +
> +static dma_addr_t arm_iommu_map_page(struct device *dev, struct page *page,
> +	     unsigned long offset, size_t size, enum dma_data_direction dir,
> +	     struct dma_attrs *attrs)
> +{
> +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +	dma_addr_t dma_addr, iova;
> +	unsigned int phys;
> +	int ret, len = PAGE_ALIGN(size + offset);
> +
> +	if (!arch_is_coherent())
> +		__dma_page_cpu_to_dev(page, offset, size, dir);
> +
> +	dma_addr = iova = __alloc_iova(mapping, len);
> +	if (iova == ~0)
> +		goto fail;
> +
> +	dma_addr += offset;
> +	phys = page_to_phys(page);
> +	ret = iommu_map(mapping->domain, iova, phys, size, 0);
> +	if (ret < 0)
> +		goto fail;
> +
> +	return dma_addr;
> +fail:
> +	return ~0;
> +}
> +
> +static void arm_iommu_unmap_page(struct device *dev, dma_addr_t handle,
> +		size_t size, enum dma_data_direction dir,
> +		struct dma_attrs *attrs)
> +{
> +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +	dma_addr_t iova = handle & PAGE_MASK;
> +	struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
> +	int offset = handle & ~PAGE_MASK;
> +
> +	if (!iova)
> +		return;
> +
> +	if (!arch_is_coherent())
> +		__dma_page_dev_to_cpu(page, offset, size, dir);
> +
> +	iommu_unmap(mapping->domain, iova, size);
> +	__free_iova(mapping, iova, size);
> +}
> +
> +static void arm_iommu_sync_single_for_cpu(struct device *dev,
> +		dma_addr_t handle, size_t size, enum dma_data_direction dir)
> +{
> +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +	dma_addr_t iova = handle & PAGE_MASK;
> +	struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
> +	unsigned int offset = handle & ~PAGE_MASK;
> +
> +	if (!iova)
> +		return;
> +
> +	if (!arch_is_coherent())
> +		__dma_page_dev_to_cpu(page, offset, size, dir);
> +}
> +
> +static void arm_iommu_sync_single_for_device(struct device *dev,
> +		dma_addr_t handle, size_t size, enum dma_data_direction dir)
> +{
> +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +	dma_addr_t iova = handle & PAGE_MASK;
> +	struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
> +	unsigned int offset = handle & ~PAGE_MASK;
> +
> +	if (!iova)
> +		return;
> +
> +	__dma_page_cpu_to_dev(page, offset, size, dir);
> +}
> +
> +struct dma_map_ops iommu_ops = {
> +	.alloc		= arm_iommu_alloc_attrs,
> +	.free		= arm_iommu_free_attrs,
> +	.mmap		= arm_iommu_mmap_attrs,
> +
> +	.map_page		= arm_iommu_map_page,
> +	.unmap_page		= arm_iommu_unmap_page,
> +	.sync_single_for_cpu	= arm_iommu_sync_single_for_cpu,
> +	.sync_single_for_device	= arm_iommu_sync_single_for_device,
> +
> +	.map_sg			= arm_iommu_map_sg,
> +	.unmap_sg		= arm_iommu_unmap_sg,
> +	.sync_sg_for_cpu	= arm_iommu_sync_sg_for_cpu,
> +	.sync_sg_for_device	= arm_iommu_sync_sg_for_device,
> +};
> +
> +struct dma_iommu_mapping *
> +arm_iommu_create_mapping(struct bus_type *bus, dma_addr_t base, size_t size,
> +			 int order)
> +{
> +	unsigned int count = (size >> PAGE_SHIFT) - order;
> +	unsigned int bitmap_size = BITS_TO_LONGS(count) * sizeof(long);
> +	struct dma_iommu_mapping *mapping;
> +	int err = -ENOMEM;
> +
> +	mapping = kzalloc(sizeof(struct dma_iommu_mapping), GFP_KERNEL);
> +	if (!mapping)
> +		goto err;
> +
> +	mapping->bitmap = kzalloc(bitmap_size, GFP_KERNEL);
> +	if (!mapping->bitmap)
> +		goto err2;
> +
> +	mapping->base = base;
> +	mapping->bits = bitmap_size;
> +	mapping->order = order;
> +	spin_lock_init(&mapping->lock);
> +
> +	mapping->domain = iommu_domain_alloc(bus);
> +	if (!mapping->domain)
> +		goto err3;
> +
> +	kref_init(&mapping->kref);
> +	return mapping;
> +err3:
> +	kfree(mapping->bitmap);
> +err2:
> +	kfree(mapping);
> +err:
> +	return ERR_PTR(err);
> +}
> +EXPORT_SYMBOL(arm_iommu_create_mapping);
> +
> +static void release_iommu_mapping(struct kref *kref)
> +{
> +	struct dma_iommu_mapping *mapping =
> +		container_of(kref, struct dma_iommu_mapping, kref);
> +
> +	iommu_domain_free(mapping->domain);
> +	kfree(mapping->bitmap);
> +	kfree(mapping);
> +}
> +
> +void arm_iommu_release_mapping(struct dma_iommu_mapping *mapping)
> +{
> +	if (mapping)
> +		kref_put(&mapping->kref, release_iommu_mapping);
> +}
> +EXPORT_SYMBOL(arm_iommu_release_mapping);
> +
> +int arm_iommu_attach_device(struct device *dev,
> +			    struct dma_iommu_mapping *mapping)
> +{
> +	int err;
> +
> +	err = iommu_attach_device(mapping->domain, dev);
> +	if (err)
> +		return err;
> +
> +	kref_get(&mapping->kref);
> +	dev->archdata.mapping = mapping;
> +	set_dma_ops(dev, &iommu_ops);
> +
> +	printk(KERN_INFO "Attached IOMMU controller to %s device.\n", dev_name(dev));

pr_debug?
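or, to keep the device name without being chatty on every attach, something
like (just a sketch):

	dev_dbg(dev, "attached to IOMMU mapping %p\n", mapping);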

> +	return 0;
> +}
> +EXPORT_SYMBOL(arm_iommu_attach_device);
> +
> +#endif
> diff --git a/arch/arm/mm/vmregion.h b/arch/arm/mm/vmregion.h
> index 15e9f04..6bbc402 100644
> --- a/arch/arm/mm/vmregion.h
> +++ b/arch/arm/mm/vmregion.h
> @@ -17,7 +17,7 @@ struct arm_vmregion {
>  	struct list_head	vm_list;
>  	unsigned long		vm_start;
>  	unsigned long		vm_end;
> -	struct page		*vm_pages;
> +	void			*priv;
>  	int			vm_active;
>  };

You might want to CC the ARM MM maintainers here to get their feedback.

Besides the comments I made, it looks good. You can stick Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
on the patch if you would like.

>  
> -- 
> 1.7.1.569.g6f426
> 


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper
  2012-02-14 14:55     ` Konrad Rzeszutek Wilk
@ 2012-02-14 14:55       ` Konrad Rzeszutek Wilk
  2012-02-24 13:12       ` Marek Szyprowski
  1 sibling, 0 replies; 39+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-02-14 14:55 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-samsung-soc, iommu, Shariq Hasnain, Arnd Bergmann,
	Benjamin Herrenschmidt, Krishna Reddy, Kyungmin Park,
	Andrzej Pietrasiewicz, Russell King - ARM Linux, KyongHo Cho,
	Chunsang Jeong

> +static void __dma_clear_buffer(struct page *page, size_t size)
> +{
> +	void *ptr;
> +	/*
> +	 * Ensure that the allocated pages are zeroed, and that any data
> +	 * lurking in the kernel direct-mapped region is invalidated.
> +	 */
> +	ptr = page_address(page);

Should you check to see if the ptr is valid?

> +	memset(ptr, 0, size);
> +	dmac_flush_range(ptr, ptr + size);
> +	outer_flush_range(__pa(ptr), __pa(ptr) + size);
> +}
> +
>  /*
>   * Allocate a DMA buffer for 'dev' of size 'size' using the
>   * specified gfp mask.  Note that 'size' must be page aligned.
> @@ -164,7 +179,6 @@ static struct page *__dma_alloc_buffer(struct device *dev, size_t size, gfp_t gf
>  {
>  	unsigned long order = get_order(size);
>  	struct page *page, *p, *e;
> -	void *ptr;
>  	u64 mask = get_coherent_dma_mask(dev);
>  
>  #ifdef CONFIG_DMA_API_DEBUG
> @@ -193,14 +207,7 @@ static struct page *__dma_alloc_buffer(struct device *dev, size_t size, gfp_t gf
>  	for (p = page + (size >> PAGE_SHIFT), e = page + (1 << order); p < e; p++)
>  		__free_page(p);
>  
> -	/*
> -	 * Ensure that the allocated pages are zeroed, and that any data
> -	 * lurking in the kernel direct-mapped region is invalidated.
> -	 */
> -	ptr = page_address(page);
> -	memset(ptr, 0, size);
> -	dmac_flush_range(ptr, ptr + size);
> -	outer_flush_range(__pa(ptr), __pa(ptr) + size);
> +	__dma_clear_buffer(page, size);
>  
>  	return page;
>  }
> @@ -348,7 +355,7 @@ __dma_alloc_remap(struct page *page, size_t size, gfp_t gfp, pgprot_t prot)
>  		u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1);
>  
>  		pte = consistent_pte[idx] + off;
> -		c->vm_pages = page;
> +		c->priv = page;
>  
>  		do {
>  			BUG_ON(!pte_none(*pte));
> @@ -461,6 +468,14 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp,
>  	return addr;
>  }
>  
> +static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot)
> +{
> +	prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ?
> +			    pgprot_writecombine(prot) :
> +			    pgprot_dmacoherent(prot);
> +	return prot;
> +}
> +
>  /*
>   * Allocate DMA-coherent memory space and return both the kernel remapped
>   * virtual and bus address for that space.
> @@ -468,9 +483,7 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp,
>  void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
>  		    gfp_t gfp, struct dma_attrs *attrs)
>  {
> -	pgprot_t prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ?
> -			pgprot_writecombine(pgprot_kernel) :
> -			pgprot_dmacoherent(pgprot_kernel);
> +	pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel);
>  	void *memory;
>  
>  	if (dma_alloc_from_coherent(dev, size, handle, &memory))
> @@ -499,13 +512,14 @@ int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma,
>  	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);
>  	if (c) {
>  		unsigned long off = vma->vm_pgoff;
> +		struct page *pages = c->priv;
>  
>  		kern_size = (c->vm_end - c->vm_start) >> PAGE_SHIFT;
>  
>  		if (off < kern_size &&
>  		    user_size <= (kern_size - off)) {
>  			ret = remap_pfn_range(vma, vma->vm_start,
> -					      page_to_pfn(c->vm_pages) + off,
> +					      page_to_pfn(pages) + off,
>  					      user_size << PAGE_SHIFT,
>  					      vma->vm_page_prot);
>  		}
> @@ -644,6 +658,9 @@ int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
>  	int i, j;
>  
>  	for_each_sg(sg, s, nents, i) {
> +#ifdef CONFIG_NEED_SG_DMA_LENGTH
> +		s->dma_length = s->length;
> +#endif
>  		s->dma_address = ops->map_page(dev, sg_page(s), s->offset,
>  						s->length, dir, attrs);
>  		if (dma_mapping_error(dev, s->dma_address))
> @@ -749,3 +766,593 @@ static int __init dma_debug_do_init(void)
>  	return 0;
>  }
>  fs_initcall(dma_debug_do_init);
> +
> +#ifdef CONFIG_ARM_DMA_USE_IOMMU
> +
> +/* IOMMU */
> +
> +static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping *mapping,
> +				      size_t size)
> +{
> +	unsigned int order = get_order(size);
> +	unsigned int align = 0;
> +	unsigned int count, start;
> +	unsigned long flags;
> +
> +	count = ((PAGE_ALIGN(size) >> PAGE_SHIFT) +
> +		 (1 << mapping->order) - 1) >> mapping->order;
> +
> +	if (order > mapping->order)
> +		align = (1 << (order - mapping->order)) - 1;
> +
> +	spin_lock_irqsave(&mapping->lock, flags);
> +	start = bitmap_find_next_zero_area(mapping->bitmap, mapping->bits, 0,
> +					   count, align);
> +	if (start > mapping->bits) {
> +		spin_unlock_irqrestore(&mapping->lock, flags);
> +		return ~0;

Would it make sense to use DMA_ERROR_CODE? Or an ARM variant of it.

> +	}
> +
> +	bitmap_set(mapping->bitmap, start, count);
> +	spin_unlock_irqrestore(&mapping->lock, flags);
> +
> +	return mapping->base + (start << (mapping->order + PAGE_SHIFT));
> +}
> +
> +static inline void __free_iova(struct dma_iommu_mapping *mapping,
> +			       dma_addr_t addr, size_t size)
> +{
> +	unsigned int start = (addr - mapping->base) >>
> +			     (mapping->order + PAGE_SHIFT);
> +	unsigned int count = ((size >> PAGE_SHIFT) +
> +			      (1 << mapping->order) - 1) >> mapping->order;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&mapping->lock, flags);
> +	bitmap_clear(mapping->bitmap, start, count);
> +	spin_unlock_irqrestore(&mapping->lock, flags);
> +}
> +
> +static struct page **__iommu_alloc_buffer(struct device *dev, size_t size, gfp_t gfp)
> +{
> +	struct page **pages;
> +	int count = size >> PAGE_SHIFT;
> +	int i=0;
> +
> +	pages = kzalloc(count * sizeof(struct page*), gfp);
> +	if (!pages)
> +		return NULL;
> +
> +	while (count) {
> +		int j, order = __ffs(count);
> +
> +		pages[i] = alloc_pages(gfp | __GFP_NOWARN, order);
> +		while (!pages[i] && order)
> +			pages[i] = alloc_pages(gfp | __GFP_NOWARN, --order);
> +		if (!pages[i])
> +			goto error;
> +
> +		if (order)
> +			split_page(pages[i], order);
> +		j = 1 << order;
> +		while (--j)
> +			pages[i + j] = pages[i] + j;
> +
> +		__dma_clear_buffer(pages[i], PAGE_SIZE << order);
> +		i += 1 << order;
> +		count -= 1 << order;
> +	}
> +
> +	return pages;
> +error:
> +	while (--i)
> +		if (pages[i])
> +			__free_pages(pages[i], 0);
> +	kfree(pages);
> +	return NULL;
> +}
> +
> +static int __iommu_free_buffer(struct device *dev, struct page **pages, size_t size)
> +{
> +	int count = size >> PAGE_SHIFT;
> +	int i;
> +	for (i=0; i< count; i++)

That 'i< count' looks odd. Did checkpatch miss that one?

> +		if (pages[i])
> +			__free_pages(pages[i], 0);
> +	kfree(pages);
> +	return 0;
> +}
> +
> +static void *
> +__iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot)
> +{
> +	struct arm_vmregion *c;
> +	size_t align;
> +	size_t count = size >> PAGE_SHIFT;
> +	int bit;
> +
> +	if (!consistent_pte[0]) {
> +		printk(KERN_ERR "%s: not initialised\n", __func__);
> +		dump_stack();
> +		return NULL;
> +	}
> +
> +	/*
> +	 * Align the virtual region allocation - maximum alignment is
> +	 * a section size, minimum is a page size.  This helps reduce
> +	 * fragmentation of the DMA space, and also prevents allocations
> +	 * smaller than a section from crossing a section boundary.
> +	 */
> +	bit = fls(size - 1);
> +	if (bit > SECTION_SHIFT)
> +		bit = SECTION_SHIFT;
> +	align = 1 << bit;
> +
> +	/*
> +	 * Allocate a virtual address in the consistent mapping region.
> +	 */
> +	c = arm_vmregion_alloc(&consistent_head, align, size,
> +			    gfp & ~(__GFP_DMA | __GFP_HIGHMEM));
> +	if (c) {
> +		pte_t *pte;
> +		int idx = CONSISTENT_PTE_INDEX(c->vm_start);
> +		int i = 0;
> +		u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1);
> +
> +		pte = consistent_pte[idx] + off;
> +		c->priv = pages;
> +
> +		do {
> +			BUG_ON(!pte_none(*pte));
> +
> +			set_pte_ext(pte, mk_pte(pages[i], prot), 0);
> +			pte++;
> +			off++;
> +			i++;
> +			if (off >= PTRS_PER_PTE) {
> +				off = 0;
> +				pte = consistent_pte[++idx];
> +			}
> +		} while (i < count);
> +
> +		dsb();
> +
> +		return (void *)c->vm_start;
> +	}
> +	return NULL;
> +}
> +
> +static dma_addr_t __iommu_create_mapping(struct device *dev, struct page **pages, size_t size)
> +{
> +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> +	dma_addr_t dma_addr, iova;
> +	int i, ret = ~0;
> +
> +	dma_addr = __alloc_iova(mapping, size);
> +	if (dma_addr == ~0)
> +		goto fail;
> +
> +	iova = dma_addr;
> +	for (i=0; i<count; ) {
> +		unsigned int phys = page_to_phys(pages[i]);

phys_addr_t ?

> +		int j = i + 1;
> +
> +		while (j < count) {
> +			if (page_to_phys(pages[j]) != phys + (j - i) * PAGE_SIZE)
> +				break;

How about just using pfn values?
So:

	unsigned long next_pfn = page_to_pfn(pages[i]);
	unsigned int pfn = i;

	for (j = 1; j < count; j++)
		if (page_to_pfn(pages[++pfn]) != ++next_pfn)
			break;

IMHO it looks easier to read.

> +			j++;
> +		}
> +
> +		ret = iommu_map(mapping->domain, iova, phys, (j - i) * PAGE_SIZE, 0);
> +		if (ret < 0)
> +			goto fail;
> +		iova += (j - i) * PAGE_SIZE;
> +		i = j;

Granted you would have to rework this a bit.
> +	}
> +
> +	return dma_addr;
> +fail:
> +	return ~0;

DMA_ERROR_CODE

> +}
> +
> +static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova, size_t size)
> +{
> +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> +
> +	iova &= PAGE_MASK;
> +
> +	iommu_unmap(mapping->domain, iova, count * PAGE_SIZE);
> +
> +	__free_iova(mapping, iova, size);
> +	return 0;
> +}
> +
> +static void *arm_iommu_alloc_attrs(struct device *dev, size_t size,
> +	    dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs)
> +{
> +	pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel);
> +	struct page **pages;
> +	void *addr = NULL;
> +
> +	*handle = ~0;
> +	size = PAGE_ALIGN(size);
> +
> +	pages = __iommu_alloc_buffer(dev, size, gfp);
> +	if (!pages)
> +		return NULL;
> +
> +	*handle = __iommu_create_mapping(dev, pages, size);
> +	if (*handle == ~0)
> +		goto err_buffer;
> +
> +	addr = __iommu_alloc_remap(pages, size, gfp, prot);
> +	if (!addr)
> +		goto err_mapping;
> +
> +	return addr;
> +
> +err_mapping:
> +	__iommu_remove_mapping(dev, *handle, size);
> +err_buffer:
> +	__iommu_free_buffer(dev, pages, size);
> +	return NULL;
> +}
> +
> +static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
> +		    void *cpu_addr, dma_addr_t dma_addr, size_t size,
> +		    struct dma_attrs *attrs)
> +{
> +	struct arm_vmregion *c;
> +
> +	vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot);
> +	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);
> +
> +	if (c) {
> +		struct page **pages = c->priv;
> +
> +		unsigned long uaddr = vma->vm_start;
> +		unsigned long usize = vma->vm_end - vma->vm_start;
> +		int i = 0;
> +
> +		do {
> +			int ret;
> +
> +			ret = vm_insert_page(vma, uaddr, pages[i++]);
> +			if (ret) {
> +				printk(KERN_ERR "Remapping memory, error: %d\n", ret);
> +				return ret;
> +			}
> +
> +			uaddr += PAGE_SIZE;
> +			usize -= PAGE_SIZE;
> +		} while (usize > 0);
> +	}
> +	return 0;
> +}
> +
> +/*
> + * free a page as defined by the above mapping.
> + * Must not be called with IRQs disabled.
> + */
> +void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
> +			  dma_addr_t handle, struct dma_attrs *attrs)
> +{
> +	struct arm_vmregion *c;
> +	size = PAGE_ALIGN(size);
> +
> +	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);
> +	if (c) {
> +		struct page **pages = c->priv;
> +		__dma_free_remap(cpu_addr, size);
> +		__iommu_remove_mapping(dev, handle, size);
> +		__iommu_free_buffer(dev, pages, size);
> +	}
> +}
> +
> +static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
> +			  size_t size, dma_addr_t *handle,
> +			  enum dma_data_direction dir)
> +{
> +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +	dma_addr_t iova, iova_base;
> +	int ret = 0;
> +	unsigned int count;
> +	struct scatterlist *s;
> +
> +	size = PAGE_ALIGN(size);
> +	*handle = ~0;
> +
> +	iova_base = iova = __alloc_iova(mapping, size);
> +	if (iova == ~0)
> +		return -ENOMEM;
> +
> +	for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s))
> +	{
> +		phys_addr_t phys = page_to_phys(sg_page(s));
> +		unsigned int len = PAGE_ALIGN(s->offset + s->length);
> +
> +		if (!arch_is_coherent())
> +			__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
> +
> +		ret = iommu_map(mapping->domain, iova, phys, len, 0);
> +		if (ret < 0)
> +			goto fail;
> +		count += len >> PAGE_SHIFT;
> +		iova += len;
> +	}
> +	*handle = iova_base;
> +
> +	return 0;
> +fail:
> +	iommu_unmap(mapping->domain, iova_base, count * PAGE_SIZE);
> +	__free_iova(mapping, iova_base, size);
> +	return ret;
> +}
> +
> +int arm_iommu_map_sg(struct device *dev, struct scatterlist *sg, int nents,
> +		     enum dma_data_direction dir, struct dma_attrs *attrs)
> +{
> +	struct scatterlist *s = sg, *dma = sg, *start = sg;
> +	int i, count = 0;
> +	unsigned int offset = s->offset;
> +	unsigned int size = s->offset + s->length;
> +	unsigned int max = dma_get_max_seg_size(dev);
> +
> +	s->dma_address = ~0;
> +	s->dma_length = 0;

Set to ~0 rather than zero just in case somebody does not check the values and tries to use them?
> +
> +	for (i = 1; i < nents; i++) {
> +		s->dma_address = ~0;
> +		s->dma_length = 0;
> +
> +		s = sg_next(s);
> +
> +		if (s->offset || (size & ~PAGE_MASK) || size + s->length > max) {
> +			if (__map_sg_chunk(dev, start, size, &dma->dma_address,
> +			    dir) < 0)
> +				goto bad_mapping;
> +
> +			dma->dma_address += offset;
> +			dma->dma_length = size - offset;
> +
> +			size = offset = s->offset;
> +			start = s;
> +			dma = sg_next(dma);
> +			count += 1;
> +		}
> +		size += s->length;
> +	}
> +	if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir) < 0)
> +		goto bad_mapping;
> +
> +	dma->dma_address += offset;
> +	dma->dma_length = size - offset;
> +
> +	return count+1;
> +
> +bad_mapping:
> +	for_each_sg(sg, s, count, i)
> +		__iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s));
> +	return 0;
> +}
> +
> +void arm_iommu_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
> +			enum dma_data_direction dir, struct dma_attrs *attrs)
> +{
> +	struct scatterlist *s;
> +	int i;
> +
> +	for_each_sg(sg, s, nents, i) {
> +		if (sg_dma_len(s))
> +			__iommu_remove_mapping(dev, sg_dma_address(s),
> +					       sg_dma_len(s));
> +		if (!arch_is_coherent())
> +			__dma_page_dev_to_cpu(sg_page(s), s->offset,
> +					      s->length, dir);
> +	}
> +}
> +
> +
> +/**
> + * dma_sync_sg_for_cpu
> + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices

Uhhh, won't that conflict with patch #1, which BUGs if dev != NULL?

> + * @sg: list of buffers
> + * @nents: number of buffers to map (returned from dma_map_sg)
> + * @dir: DMA transfer direction (same as was passed to dma_map_sg)
> + */
> +void arm_iommu_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
> +			int nents, enum dma_data_direction dir)
> +{
> +	struct scatterlist *s;
> +	int i;
> +
> +	for_each_sg(sg, s, nents, i)
> +		if (!arch_is_coherent())
> +			__dma_page_dev_to_cpu(sg_page(s), s->offset, s->length, dir);

Uh, I thought you would need to pass in the 'dev'?

> +
> +}
> +
> +/**
> + * dma_sync_sg_for_device
> + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> + * @sg: list of buffers
> + * @nents: number of buffers to map (returned from dma_map_sg)
> + * @dir: DMA transfer direction (same as was passed to dma_map_sg)
> + */
> +void arm_iommu_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
> +			int nents, enum dma_data_direction dir)
> +{
> +	struct scatterlist *s;
> +	int i;
> +
> +	for_each_sg(sg, s, nents, i)
> +		if (!arch_is_coherent())
> +			__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
> +}
> +
> +static dma_addr_t arm_iommu_map_page(struct device *dev, struct page *page,
> +	     unsigned long offset, size_t size, enum dma_data_direction dir,
> +	     struct dma_attrs *attrs)
> +{
> +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +	dma_addr_t dma_addr, iova;
> +	unsigned int phys;
> +	int ret, len = PAGE_ALIGN(size + offset);
> +
> +	if (!arch_is_coherent())
> +		__dma_page_cpu_to_dev(page, offset, size, dir);
> +
> +	dma_addr = iova = __alloc_iova(mapping, len);
> +	if (iova == ~0)
> +		goto fail;
> +
> +	dma_addr += offset;
> +	phys = page_to_phys(page);
> +	ret = iommu_map(mapping->domain, iova, phys, size, 0);
> +	if (ret < 0)
> +		goto fail;
> +
> +	return dma_addr;
> +fail:
> +	return ~0;
> +}
> +
> +static void arm_iommu_unmap_page(struct device *dev, dma_addr_t handle,
> +		size_t size, enum dma_data_direction dir,
> +		struct dma_attrs *attrs)
> +{
> +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +	dma_addr_t iova = handle & PAGE_MASK;
> +	struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
> +	int offset = handle & ~PAGE_MASK;
> +
> +	if (!iova)
> +		return;
> +
> +	if (!arch_is_coherent())
> +		__dma_page_dev_to_cpu(page, offset, size, dir);
> +
> +	iommu_unmap(mapping->domain, iova, size);
> +	__free_iova(mapping, iova, size);
> +}
> +
> +static void arm_iommu_sync_single_for_cpu(struct device *dev,
> +		dma_addr_t handle, size_t size, enum dma_data_direction dir)
> +{
> +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +	dma_addr_t iova = handle & PAGE_MASK;
> +	struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
> +	unsigned int offset = handle & ~PAGE_MASK;
> +
> +	if (!iova)
> +		return;
> +
> +	if (!arch_is_coherent())
> +		__dma_page_dev_to_cpu(page, offset, size, dir);
> +}
> +
> +static void arm_iommu_sync_single_for_device(struct device *dev,
> +		dma_addr_t handle, size_t size, enum dma_data_direction dir)
> +{
> +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +	dma_addr_t iova = handle & PAGE_MASK;
> +	struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
> +	unsigned int offset = handle & ~PAGE_MASK;
> +
> +	if (!iova)
> +		return;
> +
> +	__dma_page_cpu_to_dev(page, offset, size, dir);
> +}
> +
> +struct dma_map_ops iommu_ops = {
> +	.alloc		= arm_iommu_alloc_attrs,
> +	.free		= arm_iommu_free_attrs,
> +	.mmap		= arm_iommu_mmap_attrs,
> +
> +	.map_page		= arm_iommu_map_page,
> +	.unmap_page		= arm_iommu_unmap_page,
> +	.sync_single_for_cpu	= arm_iommu_sync_single_for_cpu,
> +	.sync_single_for_device	= arm_iommu_sync_single_for_device,
> +
> +	.map_sg			= arm_iommu_map_sg,
> +	.unmap_sg		= arm_iommu_unmap_sg,
> +	.sync_sg_for_cpu	= arm_iommu_sync_sg_for_cpu,
> +	.sync_sg_for_device	= arm_iommu_sync_sg_for_device,
> +};
> +
> +struct dma_iommu_mapping *
> +arm_iommu_create_mapping(struct bus_type *bus, dma_addr_t base, size_t size,
> +			 int order)
> +{
> +	unsigned int count = (size >> PAGE_SHIFT) - order;
> +	unsigned int bitmap_size = BITS_TO_LONGS(count) * sizeof(long);
> +	struct dma_iommu_mapping *mapping;
> +	int err = -ENOMEM;
> +
> +	mapping = kzalloc(sizeof(struct dma_iommu_mapping), GFP_KERNEL);
> +	if (!mapping)
> +		goto err;
> +
> +	mapping->bitmap = kzalloc(bitmap_size, GFP_KERNEL);
> +	if (!mapping->bitmap)
> +		goto err2;
> +
> +	mapping->base = base;
> +	mapping->bits = bitmap_size;
> +	mapping->order = order;
> +	spin_lock_init(&mapping->lock);
> +
> +	mapping->domain = iommu_domain_alloc(bus);
> +	if (!mapping->domain)
> +		goto err3;
> +
> +	kref_init(&mapping->kref);
> +	return mapping;
> +err3:
> +	kfree(mapping->bitmap);
> +err2:
> +	kfree(mapping);
> +err:
> +	return ERR_PTR(err);
> +}
> +EXPORT_SYMBOL(arm_iommu_create_mapping);
> +
> +static void release_iommu_mapping(struct kref *kref)
> +{
> +	struct dma_iommu_mapping *mapping =
> +		container_of(kref, struct dma_iommu_mapping, kref);
> +
> +	iommu_domain_free(mapping->domain);
> +	kfree(mapping->bitmap);
> +	kfree(mapping);
> +}
> +
> +void arm_iommu_release_mapping(struct dma_iommu_mapping *mapping)
> +{
> +	if (mapping)
> +		kref_put(&mapping->kref, release_iommu_mapping);
> +}
> +EXPORT_SYMBOL(arm_iommu_release_mapping);
> +
> +int arm_iommu_attach_device(struct device *dev,
> +			    struct dma_iommu_mapping *mapping)
> +{
> +	int err;
> +
> +	err = iommu_attach_device(mapping->domain, dev);
> +	if (err)
> +		return err;
> +
> +	kref_get(&mapping->kref);
> +	dev->archdata.mapping = mapping;
> +	set_dma_ops(dev, &iommu_ops);
> +
> +	printk(KERN_INFO "Attached IOMMU controller to %s device.\n", dev_name(dev));

pr_debug?
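
i.e. something along these lines (just a sketch):

	/* quiet by default ... */
	pr_debug("Attached IOMMU controller to %s device.\n", dev_name(dev));

	/* ... or keep the device context and let dynamic debug handle it */
	dev_dbg(dev, "attached IOMMU controller\n");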

> +	return 0;
> +}
> +EXPORT_SYMBOL(arm_iommu_attach_device);
> +
> +#endif
> diff --git a/arch/arm/mm/vmregion.h b/arch/arm/mm/vmregion.h
> index 15e9f04..6bbc402 100644
> --- a/arch/arm/mm/vmregion.h
> +++ b/arch/arm/mm/vmregion.h
> @@ -17,7 +17,7 @@ struct arm_vmregion {
>  	struct list_head	vm_list;
>  	unsigned long		vm_start;
>  	unsigned long		vm_end;
> -	struct page		*vm_pages;
> +	void			*priv;
>  	int			vm_active;
>  };

You might want to CC the ARM MM maintainers here to get their feedback.

Besides the comments I made, it looks good. You can stick Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
on the patch if you would like.

>  
> -- 
> 1.7.1.569.g6f426
> 
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCHv6 2/7] ARM: dma-mapping: use asm-generic/dma-mapping-common.h
  2012-02-10 18:58   ` [PATCHv6 2/7] ARM: dma-mapping: use asm-generic/dma-mapping-common.h Marek Szyprowski
  2012-02-10 18:58     ` Marek Szyprowski
@ 2012-02-14 15:01     ` Konrad Rzeszutek Wilk
  2012-02-14 15:01       ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 39+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-02-14 15:01 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-samsung-soc, iommu, Shariq Hasnain, Arnd Bergmann,
	Benjamin Herrenschmidt, Krishna Reddy, Kyungmin Park,
	Andrzej Pietrasiewicz, Russell King - ARM Linux, KyongHo Cho,
	Chunsang Jeong

On Fri, Feb 10, 2012 at 07:58:39PM +0100, Marek Szyprowski wrote:
> This patch modifies dma-mapping implementation on ARM architecture to
> use common dma_map_ops structure and asm-generic/dma-mapping-common.h
> helpers.

The patch looks good, but I am not sure about the dma_debug API calls.

I am not seeing them being reintroduced in common/dmabounce.c, which
is where the __dma_*_page calls are now, right?
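
For reference, the pattern I would expect the generic header to provide is
roughly the following (quoted from memory of asm-generic/dma-mapping-common.h,
not from this series, so please double-check):

	static inline dma_addr_t dma_map_page(struct device *dev, struct page *page,
					      size_t offset, size_t size,
					      enum dma_data_direction dir)
	{
		struct dma_map_ops *ops = get_dma_ops(dev);
		dma_addr_t addr;

		kmemcheck_mark_initialized(page_address(page) + offset, size);
		BUG_ON(!valid_dma_direction(dir));
		addr = ops->map_page(dev, page, offset, size, dir, NULL);
		debug_dma_map_page(dev, page, offset, size, dir, addr, false);

		return addr;
	}

If dmabounce ends up being called through these same inline wrappers once it
becomes a separate dma_map_ops structure, the debug reporting should be
preserved, but it would be good to confirm that.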


> 
> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> ---
>  arch/arm/Kconfig                   |    1 +
>  arch/arm/include/asm/device.h      |    1 +
>  arch/arm/include/asm/dma-mapping.h |  197 +++++-------------------------------
>  arch/arm/mm/dma-mapping.c          |  149 ++++++++++++++++-----------
>  4 files changed, 117 insertions(+), 231 deletions(-)
> 
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index a48aecc..59102fb 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -4,6 +4,7 @@ config ARM
>  	select HAVE_AOUT
>  	select HAVE_DMA_API_DEBUG
>  	select HAVE_IDE if PCI || ISA || PCMCIA
> +	select HAVE_DMA_ATTRS
>  	select HAVE_MEMBLOCK
>  	select RTC_LIB
>  	select SYS_SUPPORTS_APM_EMULATION
> diff --git a/arch/arm/include/asm/device.h b/arch/arm/include/asm/device.h
> index 7aa3680..6e2cb0e 100644
> --- a/arch/arm/include/asm/device.h
> +++ b/arch/arm/include/asm/device.h
> @@ -7,6 +7,7 @@
>  #define ASMARM_DEVICE_H
>  
>  struct dev_archdata {
> +	struct dma_map_ops	*dma_ops;
>  #ifdef CONFIG_DMABOUNCE
>  	struct dmabounce_device_info *dmabounce;
>  #endif
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> index 6bc056c..cf7b77c 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -10,6 +10,28 @@
>  #include <asm-generic/dma-coherent.h>
>  #include <asm/memory.h>
>  
> +extern struct dma_map_ops arm_dma_ops;
> +
> +static inline struct dma_map_ops *get_dma_ops(struct device *dev)
> +{
> +	if (dev && dev->archdata.dma_ops)
> +		return dev->archdata.dma_ops;
> +	return &arm_dma_ops;
> +}
> +
> +static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
> +{
> +	BUG_ON(!dev);
> +	dev->archdata.dma_ops = ops;
> +}
> +
> +#include <asm-generic/dma-mapping-common.h>
> +
> +static inline int dma_set_mask(struct device *dev, u64 mask)
> +{
> +	return get_dma_ops(dev)->set_dma_mask(dev, mask);
> +}
> +
>  #ifdef __arch_page_to_dma
>  #error Please update to __arch_pfn_to_dma
>  #endif
> @@ -117,7 +139,6 @@ static inline void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
>  
>  extern int dma_supported(struct device *, u64);
>  extern int dma_set_mask(struct device *, u64);
> -
>  /*
>   * DMA errors are defined by all-bits-set in the DMA address.
>   */
> @@ -295,179 +316,17 @@ static inline void __dma_unmap_page(struct device *dev, dma_addr_t handle,
>  }
>  #endif /* CONFIG_DMABOUNCE */
>  
> -/**
> - * dma_map_single - map a single buffer for streaming DMA
> - * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> - * @cpu_addr: CPU direct mapped address of buffer
> - * @size: size of buffer to map
> - * @dir: DMA transfer direction
> - *
> - * Ensure that any data held in the cache is appropriately discarded
> - * or written back.
> - *
> - * The device owns this memory once this call has completed.  The CPU
> - * can regain ownership by calling dma_unmap_single() or
> - * dma_sync_single_for_cpu().
> - */
> -static inline dma_addr_t dma_map_single(struct device *dev, void *cpu_addr,
> -		size_t size, enum dma_data_direction dir)
> -{
> -	unsigned long offset;
> -	struct page *page;
> -	dma_addr_t addr;
> -
> -	BUG_ON(!virt_addr_valid(cpu_addr));
> -	BUG_ON(!virt_addr_valid(cpu_addr + size - 1));
> -	BUG_ON(!valid_dma_direction(dir));
> -
> -	page = virt_to_page(cpu_addr);
> -	offset = (unsigned long)cpu_addr & ~PAGE_MASK;
> -	addr = __dma_map_page(dev, page, offset, size, dir);
> -	debug_dma_map_page(dev, page, offset, size, dir, addr, true);
> -
> -	return addr;
> -}
> -
> -/**
> - * dma_map_page - map a portion of a page for streaming DMA
> - * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> - * @page: page that buffer resides in
> - * @offset: offset into page for start of buffer
> - * @size: size of buffer to map
> - * @dir: DMA transfer direction
> - *
> - * Ensure that any data held in the cache is appropriately discarded
> - * or written back.
> - *
> - * The device owns this memory once this call has completed.  The CPU
> - * can regain ownership by calling dma_unmap_page().
> - */
> -static inline dma_addr_t dma_map_page(struct device *dev, struct page *page,
> -	     unsigned long offset, size_t size, enum dma_data_direction dir)
> -{
> -	dma_addr_t addr;
> -
> -	BUG_ON(!valid_dma_direction(dir));
> -
> -	addr = __dma_map_page(dev, page, offset, size, dir);
> -	debug_dma_map_page(dev, page, offset, size, dir, addr, false);
> -
> -	return addr;
> -}
> -
> -/**
> - * dma_unmap_single - unmap a single buffer previously mapped
> - * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> - * @handle: DMA address of buffer
> - * @size: size of buffer (same as passed to dma_map_single)
> - * @dir: DMA transfer direction (same as passed to dma_map_single)
> - *
> - * Unmap a single streaming mode DMA translation.  The handle and size
> - * must match what was provided in the previous dma_map_single() call.
> - * All other usages are undefined.
> - *
> - * After this call, reads by the CPU to the buffer are guaranteed to see
> - * whatever the device wrote there.
> - */
> -static inline void dma_unmap_single(struct device *dev, dma_addr_t handle,
> -		size_t size, enum dma_data_direction dir)
> -{
> -	debug_dma_unmap_page(dev, handle, size, dir, true);
> -	__dma_unmap_page(dev, handle, size, dir);
> -}
> -
> -/**
> - * dma_unmap_page - unmap a buffer previously mapped through dma_map_page()
> - * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> - * @handle: DMA address of buffer
> - * @size: size of buffer (same as passed to dma_map_page)
> - * @dir: DMA transfer direction (same as passed to dma_map_page)
> - *
> - * Unmap a page streaming mode DMA translation.  The handle and size
> - * must match what was provided in the previous dma_map_page() call.
> - * All other usages are undefined.
> - *
> - * After this call, reads by the CPU to the buffer are guaranteed to see
> - * whatever the device wrote there.
> - */
> -static inline void dma_unmap_page(struct device *dev, dma_addr_t handle,
> -		size_t size, enum dma_data_direction dir)
> -{
> -	debug_dma_unmap_page(dev, handle, size, dir, false);
> -	__dma_unmap_page(dev, handle, size, dir);
> -}
> -
> -
> -static inline void dma_sync_single_for_cpu(struct device *dev,
> -		dma_addr_t handle, size_t size, enum dma_data_direction dir)
> -{
> -	BUG_ON(!valid_dma_direction(dir));
> -
> -	debug_dma_sync_single_for_cpu(dev, handle, size, dir);
> -
> -	if (!dmabounce_sync_for_cpu(dev, handle, size, dir))
> -		return;
> -
> -	__dma_single_dev_to_cpu(dma_to_virt(dev, handle), size, dir);
> -}
> -
> -static inline void dma_sync_single_for_device(struct device *dev,
> -		dma_addr_t handle, size_t size, enum dma_data_direction dir)
> -{
> -	BUG_ON(!valid_dma_direction(dir));
> -
> -	debug_dma_sync_single_for_device(dev, handle, size, dir);
> -
> -	if (!dmabounce_sync_for_device(dev, handle, size, dir))
> -		return;
> -
> -	__dma_single_cpu_to_dev(dma_to_virt(dev, handle), size, dir);
> -}
> -
> -/**
> - * dma_sync_single_range_for_cpu
> - * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> - * @handle: DMA address of buffer
> - * @offset: offset of region to start sync
> - * @size: size of region to sync
> - * @dir: DMA transfer direction (same as passed to dma_map_single)
> - *
> - * Make physical memory consistent for a single streaming mode DMA
> - * translation after a transfer.
> - *
> - * If you perform a dma_map_single() but wish to interrogate the
> - * buffer using the cpu, yet do not wish to teardown the PCI dma
> - * mapping, you must call this function before doing so.  At the
> - * next point you give the PCI dma address back to the card, you
> - * must first the perform a dma_sync_for_device, and then the
> - * device again owns the buffer.
> - */
> -static inline void dma_sync_single_range_for_cpu(struct device *dev,
> -		dma_addr_t handle, unsigned long offset, size_t size,
> -		enum dma_data_direction dir)
> -{
> -	dma_sync_single_for_cpu(dev, handle + offset, size, dir);
> -}
> -
> -static inline void dma_sync_single_range_for_device(struct device *dev,
> -		dma_addr_t handle, unsigned long offset, size_t size,
> -		enum dma_data_direction dir)
> -{
> -	dma_sync_single_for_device(dev, handle + offset, size, dir);
> -}
> -
>  /*
>   * The scatter list versions of the above methods.
>   */
> -extern int dma_map_sg(struct device *, struct scatterlist *, int,
> -		enum dma_data_direction);
> -extern void dma_unmap_sg(struct device *, struct scatterlist *, int,
> +extern int arm_dma_map_sg(struct device *, struct scatterlist *, int,
> +		enum dma_data_direction, struct dma_attrs *attrs);
> +extern void arm_dma_unmap_sg(struct device *, struct scatterlist *, int,
> +		enum dma_data_direction, struct dma_attrs *attrs);
> +extern void arm_dma_sync_sg_for_cpu(struct device *, struct scatterlist *, int,
>  		enum dma_data_direction);
> -extern void dma_sync_sg_for_cpu(struct device *, struct scatterlist *, int,
> +extern void arm_dma_sync_sg_for_device(struct device *, struct scatterlist *, int,
>  		enum dma_data_direction);
> -extern void dma_sync_sg_for_device(struct device *, struct scatterlist *, int,
> -		enum dma_data_direction);
> -
>  
>  #endif /* __KERNEL__ */
>  #endif
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index a5ab8bf..91fe436 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -29,6 +29,86 @@
>  
>  #include "mm.h"
>  
> +/**
> + * dma_map_page - map a portion of a page for streaming DMA
> + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> + * @page: page that buffer resides in
> + * @offset: offset into page for start of buffer
> + * @size: size of buffer to map
> + * @dir: DMA transfer direction
> + *
> + * Ensure that any data held in the cache is appropriately discarded
> + * or written back.
> + *
> + * The device owns this memory once this call has completed.  The CPU
> + * can regain ownership by calling dma_unmap_page().
> + */
> +static inline dma_addr_t arm_dma_map_page(struct device *dev, struct page *page,
> +	     unsigned long offset, size_t size, enum dma_data_direction dir,
> +	     struct dma_attrs *attrs)
> +{
> +	return __dma_map_page(dev, page, offset, size, dir);
> +}
> +
> +/**
> + * dma_unmap_page - unmap a buffer previously mapped through dma_map_page()
> + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> + * @handle: DMA address of buffer
> + * @size: size of buffer (same as passed to dma_map_page)
> + * @dir: DMA transfer direction (same as passed to dma_map_page)
> + *
> + * Unmap a page streaming mode DMA translation.  The handle and size
> + * must match what was provided in the previous dma_map_page() call.
> + * All other usages are undefined.
> + *
> + * After this call, reads by the CPU to the buffer are guaranteed to see
> + * whatever the device wrote there.
> + */
> +
> +static inline void arm_dma_unmap_page(struct device *dev, dma_addr_t handle,
> +		size_t size, enum dma_data_direction dir,
> +		struct dma_attrs *attrs)
> +{
> +	__dma_unmap_page(dev, handle, size, dir);
> +}
> +
> +static inline void arm_dma_sync_single_for_cpu(struct device *dev,
> +		dma_addr_t handle, size_t size, enum dma_data_direction dir)
> +{
> +	unsigned int offset = handle & (PAGE_SIZE - 1);
> +	struct page *page = pfn_to_page(dma_to_pfn(dev, handle-offset));
> +	if (!dmabounce_sync_for_cpu(dev, handle, size, dir))
> +		return;
> +
> +	__dma_page_dev_to_cpu(page, offset, size, dir);
> +}
> +
> +static inline void arm_dma_sync_single_for_device(struct device *dev,
> +		dma_addr_t handle, size_t size, enum dma_data_direction dir)
> +{
> +	unsigned int offset = handle & (PAGE_SIZE - 1);
> +	struct page *page = pfn_to_page(dma_to_pfn(dev, handle-offset));
> +	if (!dmabounce_sync_for_device(dev, handle, size, dir))
> +		return;
> +
> +	__dma_page_cpu_to_dev(page, offset, size, dir);
> +}
> +
> +static int arm_dma_set_mask(struct device *dev, u64 dma_mask);
> +
> +struct dma_map_ops arm_dma_ops = {
> +	.map_page		= arm_dma_map_page,
> +	.unmap_page		= arm_dma_unmap_page,
> +	.map_sg			= arm_dma_map_sg,
> +	.unmap_sg		= arm_dma_unmap_sg,
> +	.sync_single_for_cpu	= arm_dma_sync_single_for_cpu,
> +	.sync_single_for_device	= arm_dma_sync_single_for_device,
> +	.sync_sg_for_cpu	= arm_dma_sync_sg_for_cpu,
> +	.sync_sg_for_device	= arm_dma_sync_sg_for_device,
> +	.set_dma_mask		= arm_dma_set_mask,
> +};
> +EXPORT_SYMBOL(arm_dma_ops);
> +
>  static u64 get_coherent_dma_mask(struct device *dev)
>  {
>  	u64 mask = (u64)arm_dma_limit;
> @@ -455,47 +535,6 @@ void dma_free_coherent(struct device *dev, size_t size, void *cpu_addr, dma_addr
>  }
>  EXPORT_SYMBOL(dma_free_coherent);
>  
> -/*
> - * Make an area consistent for devices.
> - * Note: Drivers should NOT use this function directly, as it will break
> - * platforms with CONFIG_DMABOUNCE.
> - * Use the driver DMA support - see dma-mapping.h (dma_sync_*)
> - */
> -void ___dma_single_cpu_to_dev(const void *kaddr, size_t size,
> -	enum dma_data_direction dir)
> -{
> -	unsigned long paddr;
> -
> -	BUG_ON(!virt_addr_valid(kaddr) || !virt_addr_valid(kaddr + size - 1));
> -
> -	dmac_map_area(kaddr, size, dir);
> -
> -	paddr = __pa(kaddr);
> -	if (dir == DMA_FROM_DEVICE) {
> -		outer_inv_range(paddr, paddr + size);
> -	} else {
> -		outer_clean_range(paddr, paddr + size);
> -	}
> -	/* FIXME: non-speculating: flush on bidirectional mappings? */
> -}
> -EXPORT_SYMBOL(___dma_single_cpu_to_dev);
> -
> -void ___dma_single_dev_to_cpu(const void *kaddr, size_t size,
> -	enum dma_data_direction dir)
> -{
> -	BUG_ON(!virt_addr_valid(kaddr) || !virt_addr_valid(kaddr + size - 1));
> -
> -	/* FIXME: non-speculating: not required */
> -	/* don't bother invalidating if DMA to device */
> -	if (dir != DMA_TO_DEVICE) {
> -		unsigned long paddr = __pa(kaddr);
> -		outer_inv_range(paddr, paddr + size);
> -	}
> -
> -	dmac_unmap_area(kaddr, size, dir);
> -}
> -EXPORT_SYMBOL(___dma_single_dev_to_cpu);
> -
>  static void dma_cache_maint_page(struct page *page, unsigned long offset,
>  	size_t size, enum dma_data_direction dir,
>  	void (*op)(const void *, size_t, int))
> @@ -593,21 +632,18 @@ EXPORT_SYMBOL(___dma_page_dev_to_cpu);
>   * Device ownership issues as mentioned for dma_map_single are the same
>   * here.
>   */
> -int dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
> -		enum dma_data_direction dir)
> +int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
> +		enum dma_data_direction dir, struct dma_attrs *attrs)
>  {
>  	struct scatterlist *s;
>  	int i, j;
>  
> -	BUG_ON(!valid_dma_direction(dir));
> -
>  	for_each_sg(sg, s, nents, i) {
>  		s->dma_address = __dma_map_page(dev, sg_page(s), s->offset,
>  						s->length, dir);
>  		if (dma_mapping_error(dev, s->dma_address))
>  			goto bad_mapping;
>  	}
> -	debug_dma_map_sg(dev, sg, nents, nents, dir);
>  	return nents;
>  
>   bad_mapping:
> @@ -615,7 +651,6 @@ int dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
>  		__dma_unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir);
>  	return 0;
>  }
> -EXPORT_SYMBOL(dma_map_sg);
>  
>  /**
>   * dma_unmap_sg - unmap a set of SG buffers mapped by dma_map_sg
> @@ -627,18 +662,15 @@ EXPORT_SYMBOL(dma_map_sg);
>   * Unmap a set of streaming mode DMA translations.  Again, CPU access
>   * rules concerning calls here are the same as for dma_unmap_single().
>   */
> -void dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
> -		enum dma_data_direction dir)
> +void arm_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
> +		enum dma_data_direction dir, struct dma_attrs *attrs)
>  {
>  	struct scatterlist *s;
>  	int i;
>  
> -	debug_dma_unmap_sg(dev, sg, nents, dir);
> -
>  	for_each_sg(sg, s, nents, i)
>  		__dma_unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir);
>  }
> -EXPORT_SYMBOL(dma_unmap_sg);
>  
>  /**
>   * dma_sync_sg_for_cpu
> @@ -647,7 +679,7 @@ EXPORT_SYMBOL(dma_unmap_sg);
>   * @nents: number of buffers to map (returned from dma_map_sg)
>   * @dir: DMA transfer direction (same as was passed to dma_map_sg)
>   */
> -void dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
> +void arm_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
>  			int nents, enum dma_data_direction dir)
>  {
>  	struct scatterlist *s;
> @@ -661,10 +693,7 @@ void dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
>  		__dma_page_dev_to_cpu(sg_page(s), s->offset,
>  				      s->length, dir);
>  	}
> -
> -	debug_dma_sync_sg_for_cpu(dev, sg, nents, dir);
>  }
> -EXPORT_SYMBOL(dma_sync_sg_for_cpu);
>  
>  /**
>   * dma_sync_sg_for_device
> @@ -673,7 +702,7 @@ EXPORT_SYMBOL(dma_sync_sg_for_cpu);
>   * @nents: number of buffers to map (returned from dma_map_sg)
>   * @dir: DMA transfer direction (same as was passed to dma_map_sg)
>   */
> -void dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
> +void arm_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
>  			int nents, enum dma_data_direction dir)
>  {
>  	struct scatterlist *s;
> @@ -687,10 +716,7 @@ void dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
>  		__dma_page_cpu_to_dev(sg_page(s), s->offset,
>  				      s->length, dir);
>  	}
> -
> -	debug_dma_sync_sg_for_device(dev, sg, nents, dir);
>  }
> -EXPORT_SYMBOL(dma_sync_sg_for_device);
>  
>  /*
>   * Return whether the given device DMA address mask can be supported
> @@ -706,7 +732,7 @@ int dma_supported(struct device *dev, u64 mask)
>  }
>  EXPORT_SYMBOL(dma_supported);
>  
> -int dma_set_mask(struct device *dev, u64 dma_mask)
> +static int arm_dma_set_mask(struct device *dev, u64 dma_mask)
>  {
>  	if (!dev->dma_mask || !dma_supported(dev, dma_mask))
>  		return -EIO;
> @@ -717,7 +743,6 @@ int dma_set_mask(struct device *dev, u64 dma_mask)
>  
>  	return 0;
>  }
> -EXPORT_SYMBOL(dma_set_mask);
>  
>  #define PREALLOC_DMA_DEBUG_ENTRIES	4096
>  
> -- 
> 1.7.1.569.g6f426
> 
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCHv6 3/7] ARM: dma-mapping: implement dma sg methods on top of any generic dma ops
  2012-02-10 18:58 ` [PATCHv6 3/7] ARM: dma-mapping: implement dma sg methods on top of any generic dma ops Marek Szyprowski
  2012-02-10 18:58   ` Marek Szyprowski
@ 2012-02-14 15:02   ` Konrad Rzeszutek Wilk
  2012-02-14 15:02     ` Konrad Rzeszutek Wilk
  2012-02-24 13:24     ` Marek Szyprowski
  1 sibling, 2 replies; 39+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-02-14 15:02 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-samsung-soc, iommu, Shariq Hasnain, Arnd Bergmann,
	Benjamin Herrenschmidt, Krishna Reddy, Kyungmin Park,
	Andrzej Pietrasiewicz, Russell King - ARM Linux, KyongHo Cho,
	Chunsang Jeong

On Fri, Feb 10, 2012 at 07:58:40PM +0100, Marek Szyprowski wrote:
> This patch converts all dma_sg methods to be generic (independent of the
> current DMA mapping implementation for ARM architecture). All dma sg
> operations are now implemented on top of respective
> dma_map_page/dma_sync_single_for* operations from dma_map_ops structure.

Looks good, except for my worry that the DMA debug API calls are now
lost.
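
Same worry as on patch 2/7; for the sg path I would expect the hook to move
into the generic dma_map_sg_attrs() wrapper, roughly like this (again quoted
from memory of asm-generic/dma-mapping-common.h, so treat it as a sketch):

	static inline int dma_map_sg_attrs(struct device *dev, struct scatterlist *sg,
					   int nents, enum dma_data_direction dir,
					   struct dma_attrs *attrs)
	{
		struct dma_map_ops *ops = get_dma_ops(dev);
		int ents;

		BUG_ON(!valid_dma_direction(dir));
		ents = ops->map_sg(dev, sg, nents, dir, attrs);
		debug_dma_map_sg(dev, sg, nents, ents, dir);

		return ents;
	}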
> 
> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> ---
>  arch/arm/mm/dma-mapping.c |   35 +++++++++++++++--------------------
>  1 files changed, 15 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index 91fe436..31ff699 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -635,12 +635,13 @@ EXPORT_SYMBOL(___dma_page_dev_to_cpu);
>  int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
>  		enum dma_data_direction dir, struct dma_attrs *attrs)
>  {
> +	struct dma_map_ops *ops = get_dma_ops(dev);
>  	struct scatterlist *s;
>  	int i, j;
>  
>  	for_each_sg(sg, s, nents, i) {
> -		s->dma_address = __dma_map_page(dev, sg_page(s), s->offset,
> -						s->length, dir);
> +		s->dma_address = ops->map_page(dev, sg_page(s), s->offset,
> +						s->length, dir, attrs);
>  		if (dma_mapping_error(dev, s->dma_address))
>  			goto bad_mapping;
>  	}
> @@ -648,7 +649,7 @@ int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
>  
>   bad_mapping:
>  	for_each_sg(sg, s, i, j)
> -		__dma_unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir);
> +		ops->unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir, attrs);
>  	return 0;
>  }
>  
> @@ -665,11 +666,13 @@ int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
>  void arm_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
>  		enum dma_data_direction dir, struct dma_attrs *attrs)
>  {
> +	struct dma_map_ops *ops = get_dma_ops(dev);
>  	struct scatterlist *s;
> +
>  	int i;
>  
>  	for_each_sg(sg, s, nents, i)
> -		__dma_unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir);
> +		ops->unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir, attrs);
>  }
>  
>  /**
> @@ -682,17 +685,13 @@ void arm_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
>  void arm_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
>  			int nents, enum dma_data_direction dir)
>  {
> +	struct dma_map_ops *ops = get_dma_ops(dev);
>  	struct scatterlist *s;
>  	int i;
>  
> -	for_each_sg(sg, s, nents, i) {
> -		if (!dmabounce_sync_for_cpu(dev, sg_dma_address(s),
> -					    sg_dma_len(s), dir))
> -			continue;
> -
> -		__dma_page_dev_to_cpu(sg_page(s), s->offset,
> -				      s->length, dir);
> -	}
> +	for_each_sg(sg, s, nents, i)
> +		ops->sync_single_for_cpu(dev, sg_dma_address(s), s->length,
> +					 dir);
>  }
>  
>  /**
> @@ -705,17 +704,13 @@ void arm_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
>  void arm_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
>  			int nents, enum dma_data_direction dir)
>  {
> +	struct dma_map_ops *ops = get_dma_ops(dev);
>  	struct scatterlist *s;
>  	int i;
>  
> -	for_each_sg(sg, s, nents, i) {
> -		if (!dmabounce_sync_for_device(dev, sg_dma_address(s),
> -					sg_dma_len(s), dir))
> -			continue;
> -
> -		__dma_page_cpu_to_dev(sg_page(s), s->offset,
> -				      s->length, dir);
> -	}
> +	for_each_sg(sg, s, nents, i)
> +		ops->sync_single_for_device(dev, sg_dma_address(s), s->length,
> +					    dir);
>  }
>  
>  /*
> -- 
> 1.7.1.569.g6f426
> 
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCHv6 3/7] ARM: dma-mapping: implement dma sg methods on top of any generic dma ops
  2012-02-14 15:02   ` Konrad Rzeszutek Wilk
@ 2012-02-14 15:02     ` Konrad Rzeszutek Wilk
  2012-02-24 13:24     ` Marek Szyprowski
  1 sibling, 0 replies; 39+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-02-14 15:02 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-samsung-soc, iommu, Shariq Hasnain, Arnd Bergmann,
	Benjamin Herrenschmidt, Krishna Reddy, Kyungmin Park,
	Andrzej Pietrasiewicz, Russell King - ARM Linux, KyongHo Cho,
	Chunsang Jeong

On Fri, Feb 10, 2012 at 07:58:40PM +0100, Marek Szyprowski wrote:
> This patch converts all dma_sg methods to be generic (independent of the
> current DMA mapping implementation for ARM architecture). All dma sg
> operations are now implemented on top of respective
> dma_map_page/dma_sync_single_for* operations from dma_map_ops structure.

Looks good, except the worry I've that the DMA debug API calls are now
lost.
> 
> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> ---
>  arch/arm/mm/dma-mapping.c |   35 +++++++++++++++--------------------
>  1 files changed, 15 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index 91fe436..31ff699 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -635,12 +635,13 @@ EXPORT_SYMBOL(___dma_page_dev_to_cpu);
>  int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
>  		enum dma_data_direction dir, struct dma_attrs *attrs)
>  {
> +	struct dma_map_ops *ops = get_dma_ops(dev);
>  	struct scatterlist *s;
>  	int i, j;
>  
>  	for_each_sg(sg, s, nents, i) {
> -		s->dma_address = __dma_map_page(dev, sg_page(s), s->offset,
> -						s->length, dir);
> +		s->dma_address = ops->map_page(dev, sg_page(s), s->offset,
> +						s->length, dir, attrs);
>  		if (dma_mapping_error(dev, s->dma_address))
>  			goto bad_mapping;
>  	}
> @@ -648,7 +649,7 @@ int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
>  
>   bad_mapping:
>  	for_each_sg(sg, s, i, j)
> -		__dma_unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir);
> +		ops->unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir, attrs);
>  	return 0;
>  }
>  
> @@ -665,11 +666,13 @@ int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
>  void arm_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
>  		enum dma_data_direction dir, struct dma_attrs *attrs)
>  {
> +	struct dma_map_ops *ops = get_dma_ops(dev);
>  	struct scatterlist *s;
> +
>  	int i;
>  
>  	for_each_sg(sg, s, nents, i)
> -		__dma_unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir);
> +		ops->unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir, attrs);
>  }
>  
>  /**
> @@ -682,17 +685,13 @@ void arm_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
>  void arm_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
>  			int nents, enum dma_data_direction dir)
>  {
> +	struct dma_map_ops *ops = get_dma_ops(dev);
>  	struct scatterlist *s;
>  	int i;
>  
> -	for_each_sg(sg, s, nents, i) {
> -		if (!dmabounce_sync_for_cpu(dev, sg_dma_address(s),
> -					    sg_dma_len(s), dir))
> -			continue;
> -
> -		__dma_page_dev_to_cpu(sg_page(s), s->offset,
> -				      s->length, dir);
> -	}
> +	for_each_sg(sg, s, nents, i)
> +		ops->sync_single_for_cpu(dev, sg_dma_address(s), s->length,
> +					 dir);
>  }
>  
>  /**
> @@ -705,17 +704,13 @@ void arm_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
>  void arm_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
>  			int nents, enum dma_data_direction dir)
>  {
> +	struct dma_map_ops *ops = get_dma_ops(dev);
>  	struct scatterlist *s;
>  	int i;
>  
> -	for_each_sg(sg, s, nents, i) {
> -		if (!dmabounce_sync_for_device(dev, sg_dma_address(s),
> -					sg_dma_len(s), dir))
> -			continue;
> -
> -		__dma_page_cpu_to_dev(sg_page(s), s->offset,
> -				      s->length, dir);
> -	}
> +	for_each_sg(sg, s, nents, i)
> +		ops->sync_single_for_device(dev, sg_dma_address(s), s->length,
> +					    dir);
>  }
>  
>  /*
> -- 
> 1.7.1.569.g6f426

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper
       [not found]       ` <401E54CE964CD94BAE1EB4A729C7087E378E42AE18-wAPRp6hVlRhDw2glCA4ptUEOCMrvLtNR@public.gmane.org>
@ 2012-02-24  9:35         ` Marek Szyprowski
  2012-02-24  9:35           ` Marek Szyprowski
  2012-02-24 12:49           ` Arnd Bergmann
  0 siblings, 2 replies; 39+ messages in thread
From: Marek Szyprowski @ 2012-02-24  9:35 UTC (permalink / raw)
  To: 'Krishna Reddy',
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linaro-mm-sig-cunTk1MwBs8s++Sfvej+rw,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-arch-u79uwXL29TY76Z2rM5mHXA,
	linux-samsung-soc-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: 'Shariq Hasnain', 'Arnd Bergmann',
	'Benjamin Herrenschmidt', 'Kyungmin Park',
	Andrzej Pietrasiewicz, 'Russell King - ARM Linux',
	'KyongHo Cho', 'Chunsang Jeong'

Hello,

On Monday, February 13, 2012 8:59 PM Krishna Reddy wrote:

> The implementation looks nice overall. Have few comments.
> 
> > +static struct page **__iommu_alloc_buffer(struct device *dev, size_t
> > +size, gfp_t gfp) {
> > +     struct page **pages;
> > +     int count = size >> PAGE_SHIFT;
> > +     int i=0;
> > +
> > +     pages = kzalloc(count * sizeof(struct page*), gfp);
> > +     if (!pages)
> > +             return NULL;
> 
> kzalloc can fail for any size bigger than PAGE_SIZE, if the system memory is
> fully fragmented.
> If there is a request for size bigger than 4MB, then the pages pointer array won't
> Fit in one page and kzalloc may fail. we should use vzalloc()/vfree()
> when pages pointer array size needed is bigger than PAGE_SIZE.

Right, thanks for spotting this. I will fix this in the next version.

> > +static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping
> > *mapping,
> > +                                   size_t size)
> > +{
> > +     unsigned int order = get_order(size);
> > +     unsigned int align = 0;
> > +     unsigned int count, start;
> > +     unsigned long flags;
> > +
> > +     count = ((PAGE_ALIGN(size) >> PAGE_SHIFT) +
> > +              (1 << mapping->order) - 1) >> mapping->order;
> > +
> > +     if (order > mapping->order)
> > +             align = (1 << (order - mapping->order)) - 1;
> > +
> > +     spin_lock_irqsave(&mapping->lock, flags);
> > +     start = bitmap_find_next_zero_area(mapping->bitmap, mapping-
> > >bits, 0,
> > +                                        count, align);
> 
> Do we need "align" here? Why is it trying to align the memory request to
> size of memory requested? When mapping->order is zero and if the size
> requested is 4MB, order becomes 10.  align is set to 1023.
>  bitmap_find_next_zero_area looks searching for free area from index, which
> is multiple of 1024. Why we can't we say align mask  as 0 and let it allocate from
> next free index? Doesn't mapping->order take care of min alignment needed for dev?

Aligning the IO address to the nearest power of 2 of the buffer size matches the behavior 
of the other kernel allocators. alloc_pages does exactly the same thing - the physical 
addresses are aligned to a power of 2. Some drivers also depend on this feature and 
allocate buffers of particular sizes just to get memory aligned to certain values. This is 
considered a feature, not a side effect of the internal buddy allocator implementation.

Keeping IO addresses aligned also enables the use of pages larger than 4KiB for the IOMMU
mappings when the physical memory has been allocated in larger chunks (for example, 64KiB 
mappings can be used only if both the IO address and the physical address are 64KiB aligned).
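
To give a concrete example (just an illustration, not part of the patch): for a 4MiB
request with mapping->order == 0 the allocator behaves like this:

	unsigned int order = get_order(SZ_4M);		/* 10 */
	unsigned int count = SZ_4M >> PAGE_SHIFT;	/* 1024 pages */
	unsigned int align = (1 << order) - 1;		/* mask of 1023 */

	/*
	 * bitmap_find_next_zero_area() only returns indices with
	 * (start & align) == 0, i.e. multiples of 1024 pages, so the
	 * resulting IO address is 4MiB aligned and the IOMMU can use
	 * section or 64KiB mappings for it.
	 */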

> > +static dma_addr_t __iommu_create_mapping(struct device *dev, struct
> > +page **pages, size_t size) {
> > +     struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> > +     unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> > +     dma_addr_t dma_addr, iova;
> > +     int i, ret = ~0;
> > +
> > +     dma_addr = __alloc_iova(mapping, size);
> > +     if (dma_addr == ~0)
> > +             goto fail;
> > +
> > +     iova = dma_addr;
> > +     for (i=0; i<count; ) {
> > +             unsigned int phys = page_to_phys(pages[i]);
> > +             int j = i + 1;
> > +
> > +             while (j < count) {
> > +                     if (page_to_phys(pages[j]) != phys + (j - i) *
> > PAGE_SIZE)
> > +                             break;
> > +                     j++;
> > +             }
> > +
> > +             ret = iommu_map(mapping->domain, iova, phys, (j - i) *
> > PAGE_SIZE, 0);
> > +             if (ret < 0)
> > +                     goto fail;
> > +             iova += (j - i) * PAGE_SIZE;
> > +             i = j;
> > +     }
> > +
> > +     return dma_addr;
> > +fail:
> > +     return ~0;
> > +}
> 
> iommu_map failure should release the iova space allocated using __alloc_iova.

Right, thanks for spotting the bug.
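Something along these lines in the error path should do (untested sketch):

	fail:
		/* undo the partially created mapping and give the IOVA range back */
		if (iova != dma_addr)
			iommu_unmap(mapping->domain, dma_addr, iova - dma_addr);
		__free_iova(mapping, dma_addr, size);
		return ~0;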

> 
> > +static dma_addr_t arm_iommu_map_page(struct device *dev, struct page
> > *page,
> > +          unsigned long offset, size_t size, enum dma_data_direction dir,
> > +          struct dma_attrs *attrs)
> > +{
> > +     struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> > +     dma_addr_t dma_addr, iova;
> > +     unsigned int phys;
> > +     int ret, len = PAGE_ALIGN(size + offset);
> > +
> > +     if (!arch_is_coherent())
> > +             __dma_page_cpu_to_dev(page, offset, size, dir);
> > +
> > +     dma_addr = iova = __alloc_iova(mapping, len);
> > +     if (iova == ~0)
> > +             goto fail;
> > +
> > +     dma_addr += offset;
> > +     phys = page_to_phys(page);
> > +     ret = iommu_map(mapping->domain, iova, phys, size, 0);
> > +     if (ret < 0)
> > +             goto fail;
> > +
> > +     return dma_addr;
> > +fail:
> > +     return ~0;
> > +}
> 
> iommu_map failure should release the iova space allocated using __alloc_iova.
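
Right, the same kind of fix applies here (sketch):

	ret = iommu_map(mapping->domain, iova, phys, size, 0);
	if (ret < 0) {
		__free_iova(mapping, iova, len);
		return ~0;
	}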
 
Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper
  2012-02-24  9:35         ` Marek Szyprowski
  2012-02-24  9:35           ` Marek Szyprowski
@ 2012-02-24 12:49           ` Arnd Bergmann
  2012-02-24 12:49             ` Arnd Bergmann
  2012-02-24 13:18             ` Marek Szyprowski
  1 sibling, 2 replies; 39+ messages in thread
From: Arnd Bergmann @ 2012-02-24 12:49 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: 'Krishna Reddy',
	linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-samsung-soc, iommu, 'Kyungmin Park',
	'Joerg Roedel', 'Russell King - ARM Linux',
	'Shariq Hasnain', 'Chunsang Jeong',
	'KyongHo Cho',
	Andrzej Pietrasiewicz, 'Benjamin Herrenschmidt'

On Friday 24 February 2012, Marek Szyprowski wrote:
> > > +static struct page **__iommu_alloc_buffer(struct device *dev, size_t
> > > +size, gfp_t gfp) {
> > > +     struct page **pages;
> > > +     int count = size >> PAGE_SHIFT;
> > > +     int i=0;
> > > +
> > > +     pages = kzalloc(count * sizeof(struct page*), gfp);
> > > +     if (!pages)
> > > +             return NULL;
> > 
> > kzalloc can fail for any size bigger than PAGE_SIZE, if the system memory is
> > fully fragmented.
> > If there is a request for size bigger than 4MB, then the pages pointer array won't
> > Fit in one page and kzalloc may fail. we should use vzalloc()/vfree()
> > when pages pointer array size needed is bigger than PAGE_SIZE.
> 
> Right, thanks for spotting this. I will fix this in the next version.

It's not clear though if that is the best solution. vzalloc comes at the
price of using up space in the vmalloc area as well as extra TLB entries,
so we try to limit its use where possible. The current code, on the other hand,
might fail in out-of-memory situations, but if a user wants to allocate a >4MB buffer
(using up more than one physically contiguous page of pointers to pages), the
following allocation of >1024 pages will likely fail as well, so we might
just fail early.
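
For reference, the fallback being discussed would look roughly like this (only a
sketch of the trade-off, not a concrete proposal):

	size_t array_size = count * sizeof(struct page *);
	struct page **pages;

	if (array_size <= PAGE_SIZE)
		pages = kzalloc(array_size, gfp);
	else
		pages = vzalloc(array_size);	/* costs vmalloc space and TLB entries */
	if (!pages)
		return NULL;

and the free path would then need an is_vmalloc_addr() check to pick kfree() or
vfree() accordingly.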

	Arnd

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper
  2012-02-14 14:55     ` Konrad Rzeszutek Wilk
  2012-02-14 14:55       ` Konrad Rzeszutek Wilk
@ 2012-02-24 13:12       ` Marek Szyprowski
  2012-02-24 13:12         ` Marek Szyprowski
  1 sibling, 1 reply; 39+ messages in thread
From: Marek Szyprowski @ 2012-02-24 13:12 UTC (permalink / raw)
  To: 'Konrad Rzeszutek Wilk'
  Cc: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-samsung-soc, iommu, 'Shariq Hasnain',
	'Arnd Bergmann', 'Benjamin Herrenschmidt',
	'Krishna Reddy', 'Kyungmin Park',
	Andrzej Pietrasiewicz, 'Russell King - ARM Linux',
	'KyongHo Cho', 'Chunsang Jeong'

Hello,

On Tuesday, February 14, 2012 3:56 PM Konrad Rzeszutek Wilk wrote:

> > +static void __dma_clear_buffer(struct page *page, size_t size)
> > +{
> > +	void *ptr;
> > +	/*
> > +	 * Ensure that the allocated pages are zeroed, and that any data
> > +	 * lurking in the kernel direct-mapped region is invalidated.
> > +	 */
> > +	ptr = page_address(page);
> 
> Should you check to see if the ptr is valid?

Ok.
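Something like (sketch):

	ptr = page_address(page);
	if (!ptr)
		return;		/* e.g. a highmem page without a kernel mapping */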

> > +	memset(ptr, 0, size);
> > +	dmac_flush_range(ptr, ptr + size);
> > +	outer_flush_range(__pa(ptr), __pa(ptr) + size);
> > +}
> > +
> >  /*
> >   * Allocate a DMA buffer for 'dev' of size 'size' using the
> >   * specified gfp mask.  Note that 'size' must be page aligned.
> > @@ -164,7 +179,6 @@ static struct page *__dma_alloc_buffer(struct device *dev, size_t size,
> gfp_t gf
> >  {
> >  	unsigned long order = get_order(size);
> >  	struct page *page, *p, *e;
> > -	void *ptr;
> >  	u64 mask = get_coherent_dma_mask(dev);
> >
> >  #ifdef CONFIG_DMA_API_DEBUG
> > @@ -193,14 +207,7 @@ static struct page *__dma_alloc_buffer(struct device *dev, size_t size,
> gfp_t gf
> >  	for (p = page + (size >> PAGE_SHIFT), e = page + (1 << order); p < e; p++)
> >  		__free_page(p);
> >
> > -	/*
> > -	 * Ensure that the allocated pages are zeroed, and that any data
> > -	 * lurking in the kernel direct-mapped region is invalidated.
> > -	 */
> > -	ptr = page_address(page);
> > -	memset(ptr, 0, size);
> > -	dmac_flush_range(ptr, ptr + size);
> > -	outer_flush_range(__pa(ptr), __pa(ptr) + size);
> > +	__dma_clear_buffer(page, size);
> >
> >  	return page;
> >  }
> > @@ -348,7 +355,7 @@ __dma_alloc_remap(struct page *page, size_t size, gfp_t gfp, pgprot_t
> prot)
> >  		u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1);
> >
> >  		pte = consistent_pte[idx] + off;
> > -		c->vm_pages = page;
> > +		c->priv = page;
> >
> >  		do {
> >  			BUG_ON(!pte_none(*pte));
> > @@ -461,6 +468,14 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t
> gfp,
> >  	return addr;
> >  }
> >
> > +static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot)
> > +{
> > +	prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ?
> > +			    pgprot_writecombine(prot) :
> > +			    pgprot_dmacoherent(prot);
> > +	return prot;
> > +}
> > +
> >  /*
> >   * Allocate DMA-coherent memory space and return both the kernel remapped
> >   * virtual and bus address for that space.
> > @@ -468,9 +483,7 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t
> gfp,
> >  void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
> >  		    gfp_t gfp, struct dma_attrs *attrs)
> >  {
> > -	pgprot_t prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ?
> > -			pgprot_writecombine(pgprot_kernel) :
> > -			pgprot_dmacoherent(pgprot_kernel);
> > +	pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel);
> >  	void *memory;
> >
> >  	if (dma_alloc_from_coherent(dev, size, handle, &memory))
> > @@ -499,13 +512,14 @@ int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma,
> >  	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);
> >  	if (c) {
> >  		unsigned long off = vma->vm_pgoff;
> > +		struct page *pages = c->priv;
> >
> >  		kern_size = (c->vm_end - c->vm_start) >> PAGE_SHIFT;
> >
> >  		if (off < kern_size &&
> >  		    user_size <= (kern_size - off)) {
> >  			ret = remap_pfn_range(vma, vma->vm_start,
> > -					      page_to_pfn(c->vm_pages) + off,
> > +					      page_to_pfn(pages) + off,
> >  					      user_size << PAGE_SHIFT,
> >  					      vma->vm_page_prot);
> >  		}
> > @@ -644,6 +658,9 @@ int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
> >  	int i, j;
> >
> >  	for_each_sg(sg, s, nents, i) {
> > +#ifdef CONFIG_NEED_SG_DMA_LENGTH
> > +		s->dma_length = s->length;
> > +#endif
> >  		s->dma_address = ops->map_page(dev, sg_page(s), s->offset,
> >  						s->length, dir, attrs);
> >  		if (dma_mapping_error(dev, s->dma_address))
> > @@ -749,3 +766,593 @@ static int __init dma_debug_do_init(void)
> >  	return 0;
> >  }
> >  fs_initcall(dma_debug_do_init);
> > +
> > +#ifdef CONFIG_ARM_DMA_USE_IOMMU
> > +
> > +/* IOMMU */
> > +
> > +static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping *mapping,
> > +				      size_t size)
> > +{
> > +	unsigned int order = get_order(size);
> > +	unsigned int align = 0;
> > +	unsigned int count, start;
> > +	unsigned long flags;
> > +
> > +	count = ((PAGE_ALIGN(size) >> PAGE_SHIFT) +
> > +		 (1 << mapping->order) - 1) >> mapping->order;
> > +
> > +	if (order > mapping->order)
> > +		align = (1 << (order - mapping->order)) - 1;
> > +
> > +	spin_lock_irqsave(&mapping->lock, flags);
> > +	start = bitmap_find_next_zero_area(mapping->bitmap, mapping->bits, 0,
> > +					   count, align);
> > +	if (start > mapping->bits) {
> > +		spin_unlock_irqrestore(&mapping->lock, flags);
> > +		return ~0;
> 
> Would it make sense to use DMA_ERROR_CODE? Or a ARM variant of it.

Right, the code will be easier to understand if I add an ARM_DMA_ERROR define.
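I.e. something along these lines (sketch):

	/* named error value instead of the bare ~0 */
	#define ARM_DMA_ERROR	((dma_addr_t)~0)

so that __alloc_iova() and its callers can return and compare against ARM_DMA_ERROR.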

> > +	}
> > +
> > +	bitmap_set(mapping->bitmap, start, count);
> > +	spin_unlock_irqrestore(&mapping->lock, flags);
> > +
> > +	return mapping->base + (start << (mapping->order + PAGE_SHIFT));
> > +}
> > +
> > +static inline void __free_iova(struct dma_iommu_mapping *mapping,
> > +			       dma_addr_t addr, size_t size)
> > +{
> > +	unsigned int start = (addr - mapping->base) >>
> > +			     (mapping->order + PAGE_SHIFT);
> > +	unsigned int count = ((size >> PAGE_SHIFT) +
> > +			      (1 << mapping->order) - 1) >> mapping->order;
> > +	unsigned long flags;
> > +
> > +	spin_lock_irqsave(&mapping->lock, flags);
> > +	bitmap_clear(mapping->bitmap, start, count);
> > +	spin_unlock_irqrestore(&mapping->lock, flags);
> > +}
> > +
> > +static struct page **__iommu_alloc_buffer(struct device *dev, size_t size, gfp_t gfp)
> > +{
> > +	struct page **pages;
> > +	int count = size >> PAGE_SHIFT;
> > +	int i=0;
> > +
> > +	pages = kzalloc(count * sizeof(struct page*), gfp);
> > +	if (!pages)
> > +		return NULL;
> > +
> > +	while (count) {
> > +		int j, order = __ffs(count);
> > +
> > +		pages[i] = alloc_pages(gfp | __GFP_NOWARN, order);
> > +		while (!pages[i] && order)
> > +			pages[i] = alloc_pages(gfp | __GFP_NOWARN, --order);
> > +		if (!pages[i])
> > +			goto error;
> > +
> > +		if (order)
> > +			split_page(pages[i], order);
> > +		j = 1 << order;
> > +		while (--j)
> > +			pages[i + j] = pages[i] + j;
> > +
> > +		__dma_clear_buffer(pages[i], PAGE_SIZE << order);
> > +		i += 1 << order;
> > +		count -= 1 << order;
> > +	}
> > +
> > +	return pages;
> > +error:
> > +	while (--i)
> > +		if (pages[i])
> > +			__free_pages(pages[i], 0);
> > +	kfree(pages);
> > +	return NULL;
> > +}
> > +
> > +static int __iommu_free_buffer(struct device *dev, struct page **pages, size_t size)
> > +{
> > +	int count = size >> PAGE_SHIFT;
> > +	int i;
> > +	for (i=0; i< count; i++)
> 
> That 'i< count' looks odd. Did checkpath miss that one?
> 
> > +		if (pages[i])
> > +			__free_pages(pages[i], 0);
> > +	kfree(pages);
> > +	return 0;
> > +}
> > +
> > +static void *
> > +__iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot)
> > +{
> > +	struct arm_vmregion *c;
> > +	size_t align;
> > +	size_t count = size >> PAGE_SHIFT;
> > +	int bit;
> > +
> > +	if (!consistent_pte[0]) {
> > +		printk(KERN_ERR "%s: not initialised\n", __func__);
> > +		dump_stack();
> > +		return NULL;
> > +	}
> > +
> > +	/*
> > +	 * Align the virtual region allocation - maximum alignment is
> > +	 * a section size, minimum is a page size.  This helps reduce
> > +	 * fragmentation of the DMA space, and also prevents allocations
> > +	 * smaller than a section from crossing a section boundary.
> > +	 */
> > +	bit = fls(size - 1);
> > +	if (bit > SECTION_SHIFT)
> > +		bit = SECTION_SHIFT;
> > +	align = 1 << bit;
> > +
> > +	/*
> > +	 * Allocate a virtual address in the consistent mapping region.
> > +	 */
> > +	c = arm_vmregion_alloc(&consistent_head, align, size,
> > +			    gfp & ~(__GFP_DMA | __GFP_HIGHMEM));
> > +	if (c) {
> > +		pte_t *pte;
> > +		int idx = CONSISTENT_PTE_INDEX(c->vm_start);
> > +		int i = 0;
> > +		u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1);
> > +
> > +		pte = consistent_pte[idx] + off;
> > +		c->priv = pages;
> > +
> > +		do {
> > +			BUG_ON(!pte_none(*pte));
> > +
> > +			set_pte_ext(pte, mk_pte(pages[i], prot), 0);
> > +			pte++;
> > +			off++;
> > +			i++;
> > +			if (off >= PTRS_PER_PTE) {
> > +				off = 0;
> > +				pte = consistent_pte[++idx];
> > +			}
> > +		} while (i < count);
> > +
> > +		dsb();
> > +
> > +		return (void *)c->vm_start;
> > +	}
> > +	return NULL;
> > +}
> > +
> > +static dma_addr_t __iommu_create_mapping(struct device *dev, struct page **pages, size_t
> size)
> > +{
> > +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> > +	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> > +	dma_addr_t dma_addr, iova;
> > +	int i, ret = ~0;
> > +
> > +	dma_addr = __alloc_iova(mapping, size);
> > +	if (dma_addr == ~0)
> > +		goto fail;
> > +
> > +	iova = dma_addr;
> > +	for (i=0; i<count; ) {
> > +		unsigned int phys = page_to_phys(pages[i]);
> 
> phys_addr_t ?

Right

> > +		int j = i + 1;
> > +
> > +		while (j < count) {
> > +			if (page_to_phys(pages[j]) != phys + (j - i) * PAGE_SIZE)
> > +				break;
> 
> How about just using pfn values?
> So:
> 
> 	unsigned int next_pfn = page_to_pfn(pages[i])
> 	unsigned int pfn = i;
> 
> 	for (j = 1; j < count; j++)
> 		if (page_to_pfn(pages[++pfn]) != ++next_pfn)
> 			break;
> 
> IMHO it looks easier to read.

Right, this one looks much better.
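Roughly like this, I suppose (untested sketch of the reworked loop):

	for (i = 0; i < count; ) {
		unsigned int next_pfn = page_to_pfn(pages[i]) + 1;
		phys_addr_t phys = page_to_phys(pages[i]);
		unsigned int len, j;

		/* merge physically contiguous pages into a single iommu_map() */
		for (j = i + 1; j < count; j++) {
			if (page_to_pfn(pages[j]) != next_pfn++)
				break;
		}

		len = (j - i) << PAGE_SHIFT;
		ret = iommu_map(mapping->domain, iova, phys, len, 0);
		if (ret < 0)
			goto fail;
		iova += len;
		i = j;
	}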

> > +			j++;
> > +		}
> > +
> > +		ret = iommu_map(mapping->domain, iova, phys, (j - i) * PAGE_SIZE, 0);
> > +		if (ret < 0)
> > +			goto fail;
> > +		iova += (j - i) * PAGE_SIZE;
> > +		i = j;
> 
> Granted you would have to rework this a bit.
> > +	}
> > +
> > +	return dma_addr;
> > +fail:
> > +	return ~0;
> 
> DMA_ERROR_CODE
> 
> > +}
> > +
> > +static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova, size_t size)
> > +{
> > +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> > +	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> > +
> > +	iova &= PAGE_MASK;
> > +
> > +	iommu_unmap(mapping->domain, iova, count * PAGE_SIZE);
> > +
> > +	__free_iova(mapping, iova, size);
> > +	return 0;
> > +}
> > +
> > +static void *arm_iommu_alloc_attrs(struct device *dev, size_t size,
> > +	    dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs)
> > +{
> > +	pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel);
> > +	struct page **pages;
> > +	void *addr = NULL;
> > +
> > +	*handle = ~0;
> > +	size = PAGE_ALIGN(size);
> > +
> > +	pages = __iommu_alloc_buffer(dev, size, gfp);
> > +	if (!pages)
> > +		return NULL;
> > +
> > +	*handle = __iommu_create_mapping(dev, pages, size);
> > +	if (*handle == ~0)
> > +		goto err_buffer;
> > +
> > +	addr = __iommu_alloc_remap(pages, size, gfp, prot);
> > +	if (!addr)
> > +		goto err_mapping;
> > +
> > +	return addr;
> > +
> > +err_mapping:
> > +	__iommu_remove_mapping(dev, *handle, size);
> > +err_buffer:
> > +	__iommu_free_buffer(dev, pages, size);
> > +	return NULL;
> > +}
> > +
> > +static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
> > +		    void *cpu_addr, dma_addr_t dma_addr, size_t size,
> > +		    struct dma_attrs *attrs)
> > +{
> > +	struct arm_vmregion *c;
> > +
> > +	vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot);
> > +	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);
> > +
> > +	if (c) {
> > +		struct page **pages = c->priv;
> > +
> > +		unsigned long uaddr = vma->vm_start;
> > +		unsigned long usize = vma->vm_end - vma->vm_start;
> > +		int i = 0;
> > +
> > +		do {
> > +			int ret;
> > +
> > +			ret = vm_insert_page(vma, uaddr, pages[i++]);
> > +			if (ret) {
> > +				printk(KERN_ERR "Remapping memory, error: %d\n", ret);
> > +				return ret;
> > +			}
> > +
> > +			uaddr += PAGE_SIZE;
> > +			usize -= PAGE_SIZE;
> > +		} while (usize > 0);
> > +	}
> > +	return 0;
> > +}
> > +
> > +/*
> > + * free a page as defined by the above mapping.
> > + * Must not be called with IRQs disabled.
> > + */
> > +void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
> > +			  dma_addr_t handle, struct dma_attrs *attrs)
> > +{
> > +	struct arm_vmregion *c;
> > +	size = PAGE_ALIGN(size);
> > +
> > +	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);
> > +	if (c) {
> > +		struct page **pages = c->priv;
> > +		__dma_free_remap(cpu_addr, size);
> > +		__iommu_remove_mapping(dev, handle, size);
> > +		__iommu_free_buffer(dev, pages, size);
> > +	}
> > +}
> > +
> > +static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
> > +			  size_t size, dma_addr_t *handle,
> > +			  enum dma_data_direction dir)
> > +{
> > +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> > +	dma_addr_t iova, iova_base;
> > +	int ret = 0;
> > +	unsigned int count;
> > +	struct scatterlist *s;
> > +
> > +	size = PAGE_ALIGN(size);
> > +	*handle = ~0;
> > +
> > +	iova_base = iova = __alloc_iova(mapping, size);
> > +	if (iova == ~0)
> > +		return -ENOMEM;
> > +
> > +	for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s))
> > +	{
> > +		phys_addr_t phys = page_to_phys(sg_page(s));
> > +		unsigned int len = PAGE_ALIGN(s->offset + s->length);
> > +
> > +		if (!arch_is_coherent())
> > +			__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
> > +
> > +		ret = iommu_map(mapping->domain, iova, phys, len, 0);
> > +		if (ret < 0)
> > +			goto fail;
> > +		count += len >> PAGE_SHIFT;
> > +		iova += len;
> > +	}
> > +	*handle = iova_base;
> > +
> > +	return 0;
> > +fail:
> > +	iommu_unmap(mapping->domain, iova_base, count * PAGE_SIZE);
> > +	__free_iova(mapping, iova_base, size);
> > +	return ret;
> > +}
> > +
> > +int arm_iommu_map_sg(struct device *dev, struct scatterlist *sg, int nents,
> > +		     enum dma_data_direction dir, struct dma_attrs *attrs)
> > +{
> > +	struct scatterlist *s = sg, *dma = sg, *start = sg;
> > +	int i, count = 0;
> > +	unsigned int offset = s->offset;
> > +	unsigned int size = s->offset + s->length;
> > +	unsigned int max = dma_get_max_seg_size(dev);
> > +
> > +	s->dma_address = ~0;
> > +	s->dma_length = 0;
> 
> Not zero just in case somebody does not check the values and tries to use them?

Well, I've read the whole function again and now I see that it is not really needed. 
We shouldn't care about broken clients who don't check the return value.

> > +
> > +	for (i = 1; i < nents; i++) {
> > +		s->dma_address = ~0;
> > +		s->dma_length = 0;
> > +
> > +		s = sg_next(s);
> > +
> > +		if (s->offset || (size & ~PAGE_MASK) || size + s->length > max) {
> > +			if (__map_sg_chunk(dev, start, size, &dma->dma_address,
> > +			    dir) < 0)
> > +				goto bad_mapping;
> > +
> > +			dma->dma_address += offset;
> > +			dma->dma_length = size - offset;
> > +
> > +			size = offset = s->offset;
> > +			start = s;
> > +			dma = sg_next(dma);
> > +			count += 1;
> > +		}
> > +		size += s->length;
> > +	}
> > +	if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir) < 0)
> > +		goto bad_mapping;
> > +
> > +	dma->dma_address += offset;
> > +	dma->dma_length = size - offset;
> > +
> > +	return count+1;
> > +
> > +bad_mapping:
> > +	for_each_sg(sg, s, count, i)
> > +		__iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s));
> > +	return 0;
> > +}
> > +
> > +void arm_iommu_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
> > +			enum dma_data_direction dir, struct dma_attrs *attrs)
> > +{
> > +	struct scatterlist *s;
> > +	int i;
> > +
> > +	for_each_sg(sg, s, nents, i) {
> > +		if (sg_dma_len(s))
> > +			__iommu_remove_mapping(dev, sg_dma_address(s),
> > +					       sg_dma_len(s));
> > +		if (!arch_is_coherent())
> > +			__dma_page_dev_to_cpu(sg_page(s), s->offset,
> > +					      s->length, dir);
> > +	}
> > +}
> > +
> > +
> > +/**
> > + * dma_sync_sg_for_cpu
> > + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> 
> Uhhh, Won't that conflict with patch #1 which BUGs if dev != NULL?

Ok, I forgot to update the comment when I copied this function; it should refer to
iommu_sync_sg_for_cpu, and the note about a NULL device pointer should be removed.

I assume you also wanted to ask what happens if dev == NULL? Such a case is not
possible. If the caller provides a NULL device pointer, get_dma_ops() always returns
the generic arm_dma_ops, which calls dma_sync_sg_for_cpu, not iommu_sync_sg_for_cpu.
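
The dispatch in question is roughly the following (simplified sketch of the helper
introduced by the earlier patches in the series):

	static inline struct dma_map_ops *get_dma_ops(struct device *dev)
	{
		if (dev && dev->archdata.dma_ops)
			return dev->archdata.dma_ops;
		return &arm_dma_ops;	/* a NULL dev always ends up here */
	}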

> > + * @sg: list of buffers
> > + * @nents: number of buffers to map (returned from dma_map_sg)
> > + * @dir: DMA transfer direction (same as was passed to dma_map_sg)
> > + */
> > +void arm_iommu_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
> > +			int nents, enum dma_data_direction dir)
> > +{
> > +	struct scatterlist *s;
> > +	int i;
> > +
> > +	for_each_sg(sg, s, nents, i)
> > +		if (!arch_is_coherent())
> > +			__dma_page_dev_to_cpu(sg_page(s), s->offset, s->length, dir);
> 
> Uh, I thought you would need to pass in the 'dev'?

There is no such need. __dma_page_dev_to_cpu only performs CPU cache maintenance
(flush or invalidate) and does not need to access the device pointer. The implementation
will be even easier to understand once the arm_coherent_dma_ops set is created
for coherent architectures, which do not need any CPU cache handling.

> 
> > +
> > +}
> > +
> > +/**
> > + * dma_sync_sg_for_device
> > + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> > + * @sg: list of buffers
> > + * @nents: number of buffers to map (returned from dma_map_sg)
> > + * @dir: DMA transfer direction (same as was passed to dma_map_sg)
> > + */
> > +void arm_iommu_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
> > +			int nents, enum dma_data_direction dir)
> > +{
> > +	struct scatterlist *s;
> > +	int i;
> > +
> > +	for_each_sg(sg, s, nents, i)
> > +		if (!arch_is_coherent())
> > +			__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
> > +}
> > +
> > +static dma_addr_t arm_iommu_map_page(struct device *dev, struct page *page,
> > +	     unsigned long offset, size_t size, enum dma_data_direction dir,
> > +	     struct dma_attrs *attrs)
> > +{
> > +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> > +	dma_addr_t dma_addr, iova;
> > +	unsigned int phys;
> > +	int ret, len = PAGE_ALIGN(size + offset);
> > +
> > +	if (!arch_is_coherent())
> > +		__dma_page_cpu_to_dev(page, offset, size, dir);
> > +
> > +	dma_addr = iova = __alloc_iova(mapping, len);
> > +	if (iova == ~0)
> > +		goto fail;
> > +
> > +	dma_addr += offset;
> > +	phys = page_to_phys(page);
> > +	ret = iommu_map(mapping->domain, iova, phys, size, 0);
> > +	if (ret < 0)
> > +		goto fail;
> > +
> > +	return dma_addr;
> > +fail:
> > +	return ~0;
> > +}
> > +
> > +static void arm_iommu_unmap_page(struct device *dev, dma_addr_t handle,
> > +		size_t size, enum dma_data_direction dir,
> > +		struct dma_attrs *attrs)
> > +{
> > +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> > +	dma_addr_t iova = handle & PAGE_MASK;
> > +	struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
> > +	int offset = handle & ~PAGE_MASK;
> > +
> > +	if (!iova)
> > +		return;
> > +
> > +	if (!arch_is_coherent())
> > +		__dma_page_dev_to_cpu(page, offset, size, dir);
> > +
> > +	iommu_unmap(mapping->domain, iova, size);
> > +	__free_iova(mapping, iova, size);
> > +}
> > +
> > +static void arm_iommu_sync_single_for_cpu(struct device *dev,
> > +		dma_addr_t handle, size_t size, enum dma_data_direction dir)
> > +{
> > +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> > +	dma_addr_t iova = handle & PAGE_MASK;
> > +	struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
> > +	unsigned int offset = handle & ~PAGE_MASK;
> > +
> > +	if (!iova)
> > +		return;
> > +
> > +	if (!arch_is_coherent())
> > +		__dma_page_dev_to_cpu(page, offset, size, dir);
> > +}
> > +
> > +static void arm_iommu_sync_single_for_device(struct device *dev,
> > +		dma_addr_t handle, size_t size, enum dma_data_direction dir)
> > +{
> > +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> > +	dma_addr_t iova = handle & PAGE_MASK;
> > +	struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
> > +	unsigned int offset = handle & ~PAGE_MASK;
> > +
> > +	if (!iova)
> > +		return;
> > +
> > +	__dma_page_cpu_to_dev(page, offset, size, dir);
> > +}
> > +
> > +struct dma_map_ops iommu_ops = {
> > +	.alloc		= arm_iommu_alloc_attrs,
> > +	.free		= arm_iommu_free_attrs,
> > +	.mmap		= arm_iommu_mmap_attrs,
> > +
> > +	.map_page		= arm_iommu_map_page,
> > +	.unmap_page		= arm_iommu_unmap_page,
> > +	.sync_single_for_cpu	= arm_iommu_sync_single_for_cpu,
> > +	.sync_single_for_device	= arm_iommu_sync_single_for_device,
> > +
> > +	.map_sg			= arm_iommu_map_sg,
> > +	.unmap_sg		= arm_iommu_unmap_sg,
> > +	.sync_sg_for_cpu	= arm_iommu_sync_sg_for_cpu,
> > +	.sync_sg_for_device	= arm_iommu_sync_sg_for_device,
> > +};
> > +
> > +struct dma_iommu_mapping *
> > +arm_iommu_create_mapping(struct bus_type *bus, dma_addr_t base, size_t size,
> > +			 int order)
> > +{
> > +	unsigned int count = (size >> PAGE_SHIFT) - order;
> > +	unsigned int bitmap_size = BITS_TO_LONGS(count) * sizeof(long);
> > +	struct dma_iommu_mapping *mapping;
> > +	int err = -ENOMEM;
> > +
> > +	mapping = kzalloc(sizeof(struct dma_iommu_mapping), GFP_KERNEL);
> > +	if (!mapping)
> > +		goto err;
> > +
> > +	mapping->bitmap = kzalloc(bitmap_size, GFP_KERNEL);
> > +	if (!mapping->bitmap)
> > +		goto err2;
> > +
> > +	mapping->base = base;
> > +	mapping->bits = bitmap_size;
> > +	mapping->order = order;
> > +	spin_lock_init(&mapping->lock);
> > +
> > +	mapping->domain = iommu_domain_alloc(bus);
> > +	if (!mapping->domain)
> > +		goto err3;
> > +
> > +	kref_init(&mapping->kref);
> > +	return mapping;
> > +err3:
> > +	kfree(mapping->bitmap);
> > +err2:
> > +	kfree(mapping);
> > +err:
> > +	return ERR_PTR(err);
> > +}
> > +EXPORT_SYMBOL(arm_iommu_create_mapping);
> > +
> > +static void release_iommu_mapping(struct kref *kref)
> > +{
> > +	struct dma_iommu_mapping *mapping =
> > +		container_of(kref, struct dma_iommu_mapping, kref);
> > +
> > +	iommu_domain_free(mapping->domain);
> > +	kfree(mapping->bitmap);
> > +	kfree(mapping);
> > +}
> > +
> > +void arm_iommu_release_mapping(struct dma_iommu_mapping *mapping)
> > +{
> > +	if (mapping)
> > +		kref_put(&mapping->kref, release_iommu_mapping);
> > +}
> > +EXPORT_SYMBOL(arm_iommu_release_mapping);
> > +
> > +int arm_iommu_attach_device(struct device *dev,
> > +			    struct dma_iommu_mapping *mapping)
> > +{
> > +	int err;
> > +
> > +	err = iommu_attach_device(mapping->domain, dev);
> > +	if (err)
> > +		return err;
> > +
> > +	kref_get(&mapping->kref);
> > +	dev->archdata.mapping = mapping;
> > +	set_dma_ops(dev, &iommu_ops);
> > +
> > +	printk(KERN_INFO "Attached IOMMU controller to %s device.\n", dev_name(dev));
> 
> pr_debug?
> 
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL(arm_iommu_attach_device);
> > +
> > +#endif
> > diff --git a/arch/arm/mm/vmregion.h b/arch/arm/mm/vmregion.h
> > index 15e9f04..6bbc402 100644
> > --- a/arch/arm/mm/vmregion.h
> > +++ b/arch/arm/mm/vmregion.h
> > @@ -17,7 +17,7 @@ struct arm_vmregion {
> >  	struct list_head	vm_list;
> >  	unsigned long		vm_start;
> >  	unsigned long		vm_end;
> > -	struct page		*vm_pages;
> > +	void			*priv;
> >  	int			vm_active;
> >  };
> 
> You might want to CC the ARM MM maintainers here to get their feedback.

Russell King is already on the CC list; who else should I add?

> Besides the comments I made, it looks good. You can stick Reviewed-by: Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>
> if you would like on the patch.

Thanks for your comments!

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center



^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper
  2012-02-24 13:12       ` Marek Szyprowski
@ 2012-02-24 13:12         ` Marek Szyprowski
  0 siblings, 0 replies; 39+ messages in thread
From: Marek Szyprowski @ 2012-02-24 13:12 UTC (permalink / raw)
  To: 'Konrad Rzeszutek Wilk'
  Cc: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-samsung-soc, iommu, 'Shariq Hasnain',
	'Arnd Bergmann', 'Benjamin Herrenschmidt',
	'Krishna Reddy', 'Kyungmin Park',
	Andrzej Pietrasiewicz, 'Russell King - ARM Linux',
	'KyongHo Cho', 'Chunsang Jeong'

Hello,

On Tuesday, February 14, 2012 3:56 PM Konrad Rzeszutek Wilk wrote:

> > +static void __dma_clear_buffer(struct page *page, size_t size)
> > +{
> > +	void *ptr;
> > +	/*
> > +	 * Ensure that the allocated pages are zeroed, and that any data
> > +	 * lurking in the kernel direct-mapped region is invalidated.
> > +	 */
> > +	ptr = page_address(page);
> 
> Should you check to see if the ptr is valid?

Ok.

> > +	memset(ptr, 0, size);
> > +	dmac_flush_range(ptr, ptr + size);
> > +	outer_flush_range(__pa(ptr), __pa(ptr) + size);
> > +}
> > +
> >  /*
> >   * Allocate a DMA buffer for 'dev' of size 'size' using the
> >   * specified gfp mask.  Note that 'size' must be page aligned.
> > @@ -164,7 +179,6 @@ static struct page *__dma_alloc_buffer(struct device *dev, size_t size,
> gfp_t gf
> >  {
> >  	unsigned long order = get_order(size);
> >  	struct page *page, *p, *e;
> > -	void *ptr;
> >  	u64 mask = get_coherent_dma_mask(dev);
> >
> >  #ifdef CONFIG_DMA_API_DEBUG
> > @@ -193,14 +207,7 @@ static struct page *__dma_alloc_buffer(struct device *dev, size_t size,
> gfp_t gf
> >  	for (p = page + (size >> PAGE_SHIFT), e = page + (1 << order); p < e; p++)
> >  		__free_page(p);
> >
> > -	/*
> > -	 * Ensure that the allocated pages are zeroed, and that any data
> > -	 * lurking in the kernel direct-mapped region is invalidated.
> > -	 */
> > -	ptr = page_address(page);
> > -	memset(ptr, 0, size);
> > -	dmac_flush_range(ptr, ptr + size);
> > -	outer_flush_range(__pa(ptr), __pa(ptr) + size);
> > +	__dma_clear_buffer(page, size);
> >
> >  	return page;
> >  }
> > @@ -348,7 +355,7 @@ __dma_alloc_remap(struct page *page, size_t size, gfp_t gfp, pgprot_t
> prot)
> >  		u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1);
> >
> >  		pte = consistent_pte[idx] + off;
> > -		c->vm_pages = page;
> > +		c->priv = page;
> >
> >  		do {
> >  			BUG_ON(!pte_none(*pte));
> > @@ -461,6 +468,14 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t
> gfp,
> >  	return addr;
> >  }
> >
> > +static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot)
> > +{
> > +	prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ?
> > +			    pgprot_writecombine(prot) :
> > +			    pgprot_dmacoherent(prot);
> > +	return prot;
> > +}
> > +
> >  /*
> >   * Allocate DMA-coherent memory space and return both the kernel remapped
> >   * virtual and bus address for that space.
> > @@ -468,9 +483,7 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t
> gfp,
> >  void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
> >  		    gfp_t gfp, struct dma_attrs *attrs)
> >  {
> > -	pgprot_t prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ?
> > -			pgprot_writecombine(pgprot_kernel) :
> > -			pgprot_dmacoherent(pgprot_kernel);
> > +	pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel);
> >  	void *memory;
> >
> >  	if (dma_alloc_from_coherent(dev, size, handle, &memory))
> > @@ -499,13 +512,14 @@ int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma,
> >  	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);
> >  	if (c) {
> >  		unsigned long off = vma->vm_pgoff;
> > +		struct page *pages = c->priv;
> >
> >  		kern_size = (c->vm_end - c->vm_start) >> PAGE_SHIFT;
> >
> >  		if (off < kern_size &&
> >  		    user_size <= (kern_size - off)) {
> >  			ret = remap_pfn_range(vma, vma->vm_start,
> > -					      page_to_pfn(c->vm_pages) + off,
> > +					      page_to_pfn(pages) + off,
> >  					      user_size << PAGE_SHIFT,
> >  					      vma->vm_page_prot);
> >  		}
> > @@ -644,6 +658,9 @@ int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
> >  	int i, j;
> >
> >  	for_each_sg(sg, s, nents, i) {
> > +#ifdef CONFIG_NEED_SG_DMA_LENGTH
> > +		s->dma_length = s->length;
> > +#endif
> >  		s->dma_address = ops->map_page(dev, sg_page(s), s->offset,
> >  						s->length, dir, attrs);
> >  		if (dma_mapping_error(dev, s->dma_address))
> > @@ -749,3 +766,593 @@ static int __init dma_debug_do_init(void)
> >  	return 0;
> >  }
> >  fs_initcall(dma_debug_do_init);
> > +
> > +#ifdef CONFIG_ARM_DMA_USE_IOMMU
> > +
> > +/* IOMMU */
> > +
> > +static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping *mapping,
> > +				      size_t size)
> > +{
> > +	unsigned int order = get_order(size);
> > +	unsigned int align = 0;
> > +	unsigned int count, start;
> > +	unsigned long flags;
> > +
> > +	count = ((PAGE_ALIGN(size) >> PAGE_SHIFT) +
> > +		 (1 << mapping->order) - 1) >> mapping->order;
> > +
> > +	if (order > mapping->order)
> > +		align = (1 << (order - mapping->order)) - 1;
> > +
> > +	spin_lock_irqsave(&mapping->lock, flags);
> > +	start = bitmap_find_next_zero_area(mapping->bitmap, mapping->bits, 0,
> > +					   count, align);
> > +	if (start > mapping->bits) {
> > +		spin_unlock_irqrestore(&mapping->lock, flags);
> > +		return ~0;
> 
> Would it make sense to use DMA_ERROR_CODE? Or a ARM variant of it.

Right, the code will be easier to understand if I add an ARM_DMA_ERROR define.

> > +	}
> > +
> > +	bitmap_set(mapping->bitmap, start, count);
> > +	spin_unlock_irqrestore(&mapping->lock, flags);
> > +
> > +	return mapping->base + (start << (mapping->order + PAGE_SHIFT));
> > +}
> > +
> > +static inline void __free_iova(struct dma_iommu_mapping *mapping,
> > +			       dma_addr_t addr, size_t size)
> > +{
> > +	unsigned int start = (addr - mapping->base) >>
> > +			     (mapping->order + PAGE_SHIFT);
> > +	unsigned int count = ((size >> PAGE_SHIFT) +
> > +			      (1 << mapping->order) - 1) >> mapping->order;
> > +	unsigned long flags;
> > +
> > +	spin_lock_irqsave(&mapping->lock, flags);
> > +	bitmap_clear(mapping->bitmap, start, count);
> > +	spin_unlock_irqrestore(&mapping->lock, flags);
> > +}
> > +
> > +static struct page **__iommu_alloc_buffer(struct device *dev, size_t size, gfp_t gfp)
> > +{
> > +	struct page **pages;
> > +	int count = size >> PAGE_SHIFT;
> > +	int i=0;
> > +
> > +	pages = kzalloc(count * sizeof(struct page*), gfp);
> > +	if (!pages)
> > +		return NULL;
> > +
> > +	while (count) {
> > +		int j, order = __ffs(count);
> > +
> > +		pages[i] = alloc_pages(gfp | __GFP_NOWARN, order);
> > +		while (!pages[i] && order)
> > +			pages[i] = alloc_pages(gfp | __GFP_NOWARN, --order);
> > +		if (!pages[i])
> > +			goto error;
> > +
> > +		if (order)
> > +			split_page(pages[i], order);
> > +		j = 1 << order;
> > +		while (--j)
> > +			pages[i + j] = pages[i] + j;
> > +
> > +		__dma_clear_buffer(pages[i], PAGE_SIZE << order);
> > +		i += 1 << order;
> > +		count -= 1 << order;
> > +	}
> > +
> > +	return pages;
> > +error:
> > +	while (--i)
> > +		if (pages[i])
> > +			__free_pages(pages[i], 0);
> > +	kfree(pages);
> > +	return NULL;
> > +}
> > +
> > +static int __iommu_free_buffer(struct device *dev, struct page **pages, size_t size)
> > +{
> > +	int count = size >> PAGE_SHIFT;
> > +	int i;
> > +	for (i=0; i< count; i++)
> 
> That 'i< count' looks odd. Did checkpath miss that one?
> 
> > +		if (pages[i])
> > +			__free_pages(pages[i], 0);
> > +	kfree(pages);
> > +	return 0;
> > +}
> > +
> > +static void *
> > +__iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot)
> > +{
> > +	struct arm_vmregion *c;
> > +	size_t align;
> > +	size_t count = size >> PAGE_SHIFT;
> > +	int bit;
> > +
> > +	if (!consistent_pte[0]) {
> > +		printk(KERN_ERR "%s: not initialised\n", __func__);
> > +		dump_stack();
> > +		return NULL;
> > +	}
> > +
> > +	/*
> > +	 * Align the virtual region allocation - maximum alignment is
> > +	 * a section size, minimum is a page size.  This helps reduce
> > +	 * fragmentation of the DMA space, and also prevents allocations
> > +	 * smaller than a section from crossing a section boundary.
> > +	 */
> > +	bit = fls(size - 1);
> > +	if (bit > SECTION_SHIFT)
> > +		bit = SECTION_SHIFT;
> > +	align = 1 << bit;
> > +
> > +	/*
> > +	 * Allocate a virtual address in the consistent mapping region.
> > +	 */
> > +	c = arm_vmregion_alloc(&consistent_head, align, size,
> > +			    gfp & ~(__GFP_DMA | __GFP_HIGHMEM));
> > +	if (c) {
> > +		pte_t *pte;
> > +		int idx = CONSISTENT_PTE_INDEX(c->vm_start);
> > +		int i = 0;
> > +		u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1);
> > +
> > +		pte = consistent_pte[idx] + off;
> > +		c->priv = pages;
> > +
> > +		do {
> > +			BUG_ON(!pte_none(*pte));
> > +
> > +			set_pte_ext(pte, mk_pte(pages[i], prot), 0);
> > +			pte++;
> > +			off++;
> > +			i++;
> > +			if (off >= PTRS_PER_PTE) {
> > +				off = 0;
> > +				pte = consistent_pte[++idx];
> > +			}
> > +		} while (i < count);
> > +
> > +		dsb();
> > +
> > +		return (void *)c->vm_start;
> > +	}
> > +	return NULL;
> > +}
> > +
> > +static dma_addr_t __iommu_create_mapping(struct device *dev, struct page **pages, size_t
> size)
> > +{
> > +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> > +	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> > +	dma_addr_t dma_addr, iova;
> > +	int i, ret = ~0;
> > +
> > +	dma_addr = __alloc_iova(mapping, size);
> > +	if (dma_addr == ~0)
> > +		goto fail;
> > +
> > +	iova = dma_addr;
> > +	for (i=0; i<count; ) {
> > +		unsigned int phys = page_to_phys(pages[i]);
> 
> phys_addr_t ?

Right

> > +		int j = i + 1;
> > +
> > +		while (j < count) {
> > +			if (page_to_phys(pages[j]) != phys + (j - i) * PAGE_SIZE)
> > +				break;
> 
> How about just using pfn values?
> So:
> 
> 	unsigned int next_pfn = page_to_pfn(pages[i])
> 	unsigned int pfn = i;
> 
> 	for (j = 1; j < count; j++)
> 		if (page_to_pfn(pages[++pfn]) != ++next_pfn)
> 			break;
> 
> IMHO it looks easier to read.

Right, this one looks much better.

> > +			j++;
> > +		}
> > +
> > +		ret = iommu_map(mapping->domain, iova, phys, (j - i) * PAGE_SIZE, 0);
> > +		if (ret < 0)
> > +			goto fail;
> > +		iova += (j - i) * PAGE_SIZE;
> > +		i = j;
> 
> Granted you would have to rework this a bit.
> > +	}
> > +
> > +	return dma_addr;
> > +fail:
> > +	return ~0;
> 
> DMA_ERROR_CODE
> 
> > +}
> > +
> > +static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova, size_t size)
> > +{
> > +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> > +	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> > +
> > +	iova &= PAGE_MASK;
> > +
> > +	iommu_unmap(mapping->domain, iova, count * PAGE_SIZE);
> > +
> > +	__free_iova(mapping, iova, size);
> > +	return 0;
> > +}
> > +
> > +static void *arm_iommu_alloc_attrs(struct device *dev, size_t size,
> > +	    dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs)
> > +{
> > +	pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel);
> > +	struct page **pages;
> > +	void *addr = NULL;
> > +
> > +	*handle = ~0;
> > +	size = PAGE_ALIGN(size);
> > +
> > +	pages = __iommu_alloc_buffer(dev, size, gfp);
> > +	if (!pages)
> > +		return NULL;
> > +
> > +	*handle = __iommu_create_mapping(dev, pages, size);
> > +	if (*handle == ~0)
> > +		goto err_buffer;
> > +
> > +	addr = __iommu_alloc_remap(pages, size, gfp, prot);
> > +	if (!addr)
> > +		goto err_mapping;
> > +
> > +	return addr;
> > +
> > +err_mapping:
> > +	__iommu_remove_mapping(dev, *handle, size);
> > +err_buffer:
> > +	__iommu_free_buffer(dev, pages, size);
> > +	return NULL;
> > +}
> > +
> > +static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
> > +		    void *cpu_addr, dma_addr_t dma_addr, size_t size,
> > +		    struct dma_attrs *attrs)
> > +{
> > +	struct arm_vmregion *c;
> > +
> > +	vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot);
> > +	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);
> > +
> > +	if (c) {
> > +		struct page **pages = c->priv;
> > +
> > +		unsigned long uaddr = vma->vm_start;
> > +		unsigned long usize = vma->vm_end - vma->vm_start;
> > +		int i = 0;
> > +
> > +		do {
> > +			int ret;
> > +
> > +			ret = vm_insert_page(vma, uaddr, pages[i++]);
> > +			if (ret) {
> > +				printk(KERN_ERR "Remapping memory, error: %d\n", ret);
> > +				return ret;
> > +			}
> > +
> > +			uaddr += PAGE_SIZE;
> > +			usize -= PAGE_SIZE;
> > +		} while (usize > 0);
> > +	}
> > +	return 0;
> > +}
> > +
> > +/*
> > + * free a page as defined by the above mapping.
> > + * Must not be called with IRQs disabled.
> > + */
> > +void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
> > +			  dma_addr_t handle, struct dma_attrs *attrs)
> > +{
> > +	struct arm_vmregion *c;
> > +	size = PAGE_ALIGN(size);
> > +
> > +	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);
> > +	if (c) {
> > +		struct page **pages = c->priv;
> > +		__dma_free_remap(cpu_addr, size);
> > +		__iommu_remove_mapping(dev, handle, size);
> > +		__iommu_free_buffer(dev, pages, size);
> > +	}
> > +}
> > +
> > +static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
> > +			  size_t size, dma_addr_t *handle,
> > +			  enum dma_data_direction dir)
> > +{
> > +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> > +	dma_addr_t iova, iova_base;
> > +	int ret = 0;
> > +	unsigned int count;
> > +	struct scatterlist *s;
> > +
> > +	size = PAGE_ALIGN(size);
> > +	*handle = ~0;
> > +
> > +	iova_base = iova = __alloc_iova(mapping, size);
> > +	if (iova == ~0)
> > +		return -ENOMEM;
> > +
> > +	for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s))
> > +	{
> > +		phys_addr_t phys = page_to_phys(sg_page(s));
> > +		unsigned int len = PAGE_ALIGN(s->offset + s->length);
> > +
> > +		if (!arch_is_coherent())
> > +			__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
> > +
> > +		ret = iommu_map(mapping->domain, iova, phys, len, 0);
> > +		if (ret < 0)
> > +			goto fail;
> > +		count += len >> PAGE_SHIFT;
> > +		iova += len;
> > +	}
> > +	*handle = iova_base;
> > +
> > +	return 0;
> > +fail:
> > +	iommu_unmap(mapping->domain, iova_base, count * PAGE_SIZE);
> > +	__free_iova(mapping, iova_base, size);
> > +	return ret;
> > +}
> > +
> > +int arm_iommu_map_sg(struct device *dev, struct scatterlist *sg, int nents,
> > +		     enum dma_data_direction dir, struct dma_attrs *attrs)
> > +{
> > +	struct scatterlist *s = sg, *dma = sg, *start = sg;
> > +	int i, count = 0;
> > +	unsigned int offset = s->offset;
> > +	unsigned int size = s->offset + s->length;
> > +	unsigned int max = dma_get_max_seg_size(dev);
> > +
> > +	s->dma_address = ~0;
> > +	s->dma_length = 0;
> 
> Not zero just in case somebody does not check the values and tries to use them?

Well, I've read the whole function again and now I see that it is not really needed. 
We shouldn't care about broken clients who don't check the return value.
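For reference, the contract a well-behaved caller has to follow anyway is roughly
this (a sketch of the generic DMA API usage, not taken from any particular driver):

	int mapped = dma_map_sg(dev, sgl, nents, DMA_TO_DEVICE);

	if (mapped == 0)
		return -ENOMEM;	/* must not touch sg_dma_address()/sg_dma_len() */

	/* only the first 'mapped' entries carry a valid dma_address/dma_length */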

> > +
> > +	for (i = 1; i < nents; i++) {
> > +		s->dma_address = ~0;
> > +		s->dma_length = 0;
> > +
> > +		s = sg_next(s);
> > +
> > +		if (s->offset || (size & ~PAGE_MASK) || size + s->length > max) {
> > +			if (__map_sg_chunk(dev, start, size, &dma->dma_address,
> > +			    dir) < 0)
> > +				goto bad_mapping;
> > +
> > +			dma->dma_address += offset;
> > +			dma->dma_length = size - offset;
> > +
> > +			size = offset = s->offset;
> > +			start = s;
> > +			dma = sg_next(dma);
> > +			count += 1;
> > +		}
> > +		size += s->length;
> > +	}
> > +	if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir) < 0)
> > +		goto bad_mapping;
> > +
> > +	dma->dma_address += offset;
> > +	dma->dma_length = size - offset;
> > +
> > +	return count+1;
> > +
> > +bad_mapping:
> > +	for_each_sg(sg, s, count, i)
> > +		__iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s));
> > +	return 0;
> > +}
> > +
> > +void arm_iommu_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
> > +			enum dma_data_direction dir, struct dma_attrs *attrs)
> > +{
> > +	struct scatterlist *s;
> > +	int i;
> > +
> > +	for_each_sg(sg, s, nents, i) {
> > +		if (sg_dma_len(s))
> > +			__iommu_remove_mapping(dev, sg_dma_address(s),
> > +					       sg_dma_len(s));
> > +		if (!arch_is_coherent())
> > +			__dma_page_dev_to_cpu(sg_page(s), s->offset,
> > +					      s->length, dir);
> > +	}
> > +}
> > +
> > +
> > +/**
> > + * dma_sync_sg_for_cpu
> > + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> 
> Uhhh, Won't that conflict with patch #1 which BUGs if dev != NULL?

Ok, I forgot to update the comment when I copied this function; it should refer to
iommu_sync_sg_for_cpu, and the note about a NULL device pointer should be removed.

I assume you also wanted to ask what happens if dev == NULL? Such a case is not
possible: if the caller provides a NULL device pointer, get_dma_ops() always returns
the generic arm_dma_ops, which calls dma_sync_sg_for_cpu, not iommu_sync_sg_for_cpu.
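For reference, the ops selection boils down to something like this (a simplified
sketch of the dispatch, not the exact code from this series):

	static inline struct dma_map_ops *get_dma_ops(struct device *dev)
	{
		/* a device gets iommu_ops only via arm_iommu_attach_device() */
		if (dev && dev->archdata.dma_ops)
			return dev->archdata.dma_ops;

		/* dev == NULL always falls back to the generic ops */
		return &arm_dma_ops;
	}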

> > + * @sg: list of buffers
> > + * @nents: number of buffers to map (returned from dma_map_sg)
> > + * @dir: DMA transfer direction (same as was passed to dma_map_sg)
> > + */
> > +void arm_iommu_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
> > +			int nents, enum dma_data_direction dir)
> > +{
> > +	struct scatterlist *s;
> > +	int i;
> > +
> > +	for_each_sg(sg, s, nents, i)
> > +		if (!arch_is_coherent())
> > +			__dma_page_dev_to_cpu(sg_page(s), s->offset, s->length, dir);
> 
> Uh, I thought you would need to pass in the 'dev'?

There is no such need. __dma_page_dev_to_cpu only performs CPU cache maintenance
(flush or invalidate) and does not need to access the device pointer. The implementation
will be even easier to understand once the arm_coherent_dma_ops set is created
for coherent architectures, which do not need any CPU cache handling.
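To illustrate, the helper boils down to cache maintenance only (a rough sketch of
what it does, not the exact arch/arm/mm/dma-mapping.c code):

	static void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
					  size_t size, enum dma_data_direction dir)
	{
		phys_addr_t paddr = page_to_phys(page) + off;

		/* no invalidate needed if the DMA was to the device only */
		if (dir != DMA_TO_DEVICE)
			outer_inv_range(paddr, paddr + size);

		/* inner (CPU) cache maintenance; no struct device anywhere */
		dma_cache_maint_page(page, off, size, dir, dmac_unmap_area);
	}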

> 
> > +
> > +}
> > +
> > +/**
> > + * dma_sync_sg_for_device
> > + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> > + * @sg: list of buffers
> > + * @nents: number of buffers to map (returned from dma_map_sg)
> > + * @dir: DMA transfer direction (same as was passed to dma_map_sg)
> > + */
> > +void arm_iommu_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
> > +			int nents, enum dma_data_direction dir)
> > +{
> > +	struct scatterlist *s;
> > +	int i;
> > +
> > +	for_each_sg(sg, s, nents, i)
> > +		if (!arch_is_coherent())
> > +			__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
> > +}
> > +
> > +static dma_addr_t arm_iommu_map_page(struct device *dev, struct page *page,
> > +	     unsigned long offset, size_t size, enum dma_data_direction dir,
> > +	     struct dma_attrs *attrs)
> > +{
> > +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> > +	dma_addr_t dma_addr, iova;
> > +	unsigned int phys;
> > +	int ret, len = PAGE_ALIGN(size + offset);
> > +
> > +	if (!arch_is_coherent())
> > +		__dma_page_cpu_to_dev(page, offset, size, dir);
> > +
> > +	dma_addr = iova = __alloc_iova(mapping, len);
> > +	if (iova == ~0)
> > +		goto fail;
> > +
> > +	dma_addr += offset;
> > +	phys = page_to_phys(page);
> > +	ret = iommu_map(mapping->domain, iova, phys, size, 0);
> > +	if (ret < 0)
> > +		goto fail;
> > +
> > +	return dma_addr;
> > +fail:
> > +	return ~0;
> > +}
> > +
> > +static void arm_iommu_unmap_page(struct device *dev, dma_addr_t handle,
> > +		size_t size, enum dma_data_direction dir,
> > +		struct dma_attrs *attrs)
> > +{
> > +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> > +	dma_addr_t iova = handle & PAGE_MASK;
> > +	struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
> > +	int offset = handle & ~PAGE_MASK;
> > +
> > +	if (!iova)
> > +		return;
> > +
> > +	if (!arch_is_coherent())
> > +		__dma_page_dev_to_cpu(page, offset, size, dir);
> > +
> > +	iommu_unmap(mapping->domain, iova, size);
> > +	__free_iova(mapping, iova, size);
> > +}
> > +
> > +static void arm_iommu_sync_single_for_cpu(struct device *dev,
> > +		dma_addr_t handle, size_t size, enum dma_data_direction dir)
> > +{
> > +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> > +	dma_addr_t iova = handle & PAGE_MASK;
> > +	struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
> > +	unsigned int offset = handle & ~PAGE_MASK;
> > +
> > +	if (!iova)
> > +		return;
> > +
> > +	if (!arch_is_coherent())
> > +		__dma_page_dev_to_cpu(page, offset, size, dir);
> > +}
> > +
> > +static void arm_iommu_sync_single_for_device(struct device *dev,
> > +		dma_addr_t handle, size_t size, enum dma_data_direction dir)
> > +{
> > +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> > +	dma_addr_t iova = handle & PAGE_MASK;
> > +	struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
> > +	unsigned int offset = handle & ~PAGE_MASK;
> > +
> > +	if (!iova)
> > +		return;
> > +
> > +	__dma_page_cpu_to_dev(page, offset, size, dir);
> > +}
> > +
> > +struct dma_map_ops iommu_ops = {
> > +	.alloc		= arm_iommu_alloc_attrs,
> > +	.free		= arm_iommu_free_attrs,
> > +	.mmap		= arm_iommu_mmap_attrs,
> > +
> > +	.map_page		= arm_iommu_map_page,
> > +	.unmap_page		= arm_iommu_unmap_page,
> > +	.sync_single_for_cpu	= arm_iommu_sync_single_for_cpu,
> > +	.sync_single_for_device	= arm_iommu_sync_single_for_device,
> > +
> > +	.map_sg			= arm_iommu_map_sg,
> > +	.unmap_sg		= arm_iommu_unmap_sg,
> > +	.sync_sg_for_cpu	= arm_iommu_sync_sg_for_cpu,
> > +	.sync_sg_for_device	= arm_iommu_sync_sg_for_device,
> > +};
> > +
> > +struct dma_iommu_mapping *
> > +arm_iommu_create_mapping(struct bus_type *bus, dma_addr_t base, size_t size,
> > +			 int order)
> > +{
> > +	unsigned int count = (size >> PAGE_SHIFT) - order;
> > +	unsigned int bitmap_size = BITS_TO_LONGS(count) * sizeof(long);
> > +	struct dma_iommu_mapping *mapping;
> > +	int err = -ENOMEM;
> > +
> > +	mapping = kzalloc(sizeof(struct dma_iommu_mapping), GFP_KERNEL);
> > +	if (!mapping)
> > +		goto err;
> > +
> > +	mapping->bitmap = kzalloc(bitmap_size, GFP_KERNEL);
> > +	if (!mapping->bitmap)
> > +		goto err2;
> > +
> > +	mapping->base = base;
> > +	mapping->bits = bitmap_size;
> > +	mapping->order = order;
> > +	spin_lock_init(&mapping->lock);
> > +
> > +	mapping->domain = iommu_domain_alloc(bus);
> > +	if (!mapping->domain)
> > +		goto err3;
> > +
> > +	kref_init(&mapping->kref);
> > +	return mapping;
> > +err3:
> > +	kfree(mapping->bitmap);
> > +err2:
> > +	kfree(mapping);
> > +err:
> > +	return ERR_PTR(err);
> > +}
> > +EXPORT_SYMBOL(arm_iommu_create_mapping);
> > +
> > +static void release_iommu_mapping(struct kref *kref)
> > +{
> > +	struct dma_iommu_mapping *mapping =
> > +		container_of(kref, struct dma_iommu_mapping, kref);
> > +
> > +	iommu_domain_free(mapping->domain);
> > +	kfree(mapping->bitmap);
> > +	kfree(mapping);
> > +}
> > +
> > +void arm_iommu_release_mapping(struct dma_iommu_mapping *mapping)
> > +{
> > +	if (mapping)
> > +		kref_put(&mapping->kref, release_iommu_mapping);
> > +}
> > +EXPORT_SYMBOL(arm_iommu_release_mapping);
> > +
> > +int arm_iommu_attach_device(struct device *dev,
> > +			    struct dma_iommu_mapping *mapping)
> > +{
> > +	int err;
> > +
> > +	err = iommu_attach_device(mapping->domain, dev);
> > +	if (err)
> > +		return err;
> > +
> > +	kref_get(&mapping->kref);
> > +	dev->archdata.mapping = mapping;
> > +	set_dma_ops(dev, &iommu_ops);
> > +
> > +	printk(KERN_INFO "Attached IOMMU controller to %s device.\n", dev_name(dev));
> 
> pr_debug?
> 
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL(arm_iommu_attach_device);
> > +
> > +#endif
> > diff --git a/arch/arm/mm/vmregion.h b/arch/arm/mm/vmregion.h
> > index 15e9f04..6bbc402 100644
> > --- a/arch/arm/mm/vmregion.h
> > +++ b/arch/arm/mm/vmregion.h
> > @@ -17,7 +17,7 @@ struct arm_vmregion {
> >  	struct list_head	vm_list;
> >  	unsigned long		vm_start;
> >  	unsigned long		vm_end;
> > -	struct page		*vm_pages;
> > +	void			*priv;
> >  	int			vm_active;
> >  };
> 
> You might want to CC the ARM MM maintainers here to get their feedback.

Russell King is already on the CC list; who else should I add?

> Besides the comments I made, it looks good. You can stick Reviewed-by: Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>
> if you would like on the patch.

Thanks for your comments!

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center



^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper
  2012-02-24 12:49           ` Arnd Bergmann
  2012-02-24 12:49             ` Arnd Bergmann
@ 2012-02-24 13:18             ` Marek Szyprowski
  2012-02-24 13:18               ` Marek Szyprowski
  2012-02-24 14:31               ` Arnd Bergmann
  1 sibling, 2 replies; 39+ messages in thread
From: Marek Szyprowski @ 2012-02-24 13:18 UTC (permalink / raw)
  To: 'Arnd Bergmann'
  Cc: 'Krishna Reddy',
	linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-samsung-soc, iommu, 'Kyungmin Park',
	'Joerg Roedel', 'Russell King - ARM Linux',
	'Chunsang Jeong', 'KyongHo Cho',
	Andrzej Pietrasiewicz, 'Benjamin Herrenschmidt'

Hello,

On Friday, February 24, 2012 1:50 PM Arnd Bergmann wrote:

> On Friday 24 February 2012, Marek Szyprowski wrote:
> > > > +static struct page **__iommu_alloc_buffer(struct device *dev, size_t
> > > > +size, gfp_t gfp) {
> > > > +     struct page **pages;
> > > > +     int count = size >> PAGE_SHIFT;
> > > > +     int i=0;
> > > > +
> > > > +     pages = kzalloc(count * sizeof(struct page*), gfp);
> > > > +     if (!pages)
> > > > +             return NULL;
> > >
> > > kzalloc can fail for any size bigger than PAGE_SIZE, if the system memory is
> > > fully fragmented.
> > > If there is a request for size bigger than 4MB, then the pages pointer array won't
> > > Fit in one page and kzalloc may fail. we should use vzalloc()/vfree()
> > > when pages pointer array size needed is bigger than PAGE_SIZE.
> >
> > Right, thanks for spotting this. I will fix this in the next version.
> 
> It's not clear though if that is the best solution. vzalloc comes at the
> price of using up space in the vmalloc area and as well as extra TLB entries,
> so we try to limit its use where possible. The other current code might fail
> in out of memory situations, but if a user wants to allocate a >4MB buffer
> (using up more than one physically contiguous page of pointers to pages), the
> following allocation of >1024 pages will likely fail as well, so we might
> just fail early.

I want to use some kind of chained arrays, each at most PAGE_SIZE. This code
doesn't really need to keep the page pointers in a contiguous virtual memory area, so
it will not be a problem here.
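Something along these lines (just a sketch of the idea, every chunk is a plain
order-0 allocation; the names here are made up, not the final patch):

	#define PTRS_PER_CHUNK	((PAGE_SIZE / sizeof(struct page *)) - 1)

	struct page_chunk {
		struct page *pages[PTRS_PER_CHUNK];
		struct page_chunk *next;	/* pads the chunk to exactly PAGE_SIZE */
	};

	static struct page_chunk *alloc_page_array(unsigned int count, gfp_t gfp)
	{
		struct page_chunk *head = NULL, **link = &head;

		while (count) {
			struct page_chunk *c = kzalloc(sizeof(*c), gfp);

			if (!c)
				goto err;
			*link = c;
			link = &c->next;
			count -= min_t(unsigned int, count, PTRS_PER_CHUNK);
		}
		return head;
	err:
		while (head) {
			struct page_chunk *c = head->next;

			kfree(head);
			head = c;
		}
		return NULL;
	}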

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: [PATCHv6 3/7] ARM: dma-mapping: implement dma sg methods on top of any generic dma ops
  2012-02-14 15:02   ` Konrad Rzeszutek Wilk
  2012-02-14 15:02     ` Konrad Rzeszutek Wilk
@ 2012-02-24 13:24     ` Marek Szyprowski
  2012-02-24 13:24       ` Marek Szyprowski
  1 sibling, 1 reply; 39+ messages in thread
From: Marek Szyprowski @ 2012-02-24 13:24 UTC (permalink / raw)
  To: 'Konrad Rzeszutek Wilk'
  Cc: linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-samsung-soc, iommu, 'Shariq Hasnain',
	'Arnd Bergmann', 'Benjamin Herrenschmidt',
	'Krishna Reddy', 'Kyungmin Park',
	Andrzej Pietrasiewicz, 'Russell King - ARM Linux',
	'KyongHo Cho', 'Chunsang Jeong'

Hello,

On Tuesday, February 14, 2012 4:03 PM Konrad Rzeszutek Wilk wrote:
 
> On Fri, Feb 10, 2012 at 07:58:40PM +0100, Marek Szyprowski wrote:
> > This patch converts all dma_sg methods to be generic (independent of the
> > current DMA mapping implementation for ARM architecture). All dma sg
> > operations are now implemented on top of respective
> > dma_map_page/dma_sync_single_for* operations from dma_map_ops structure.
> 
> Looks good, except the worry I've that the DMA debug API calls are now
> lost.

Could you point me to the DMA debug API calls that are lost? The inline functions
from include/asm-generic/dma-mapping-common.h already contain all the required
DMA debug calls, which replaced the previous calls in
arch/arm/include/asm/dma-mapping.h.
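For example, the generic dma_map_page() wrapper there looks roughly like this
(a simplified sketch of asm-generic/dma-mapping-common.h, some details omitted):

	static inline dma_addr_t dma_map_page(struct device *dev, struct page *page,
					      size_t offset, size_t size,
					      enum dma_data_direction dir)
	{
		struct dma_map_ops *ops = get_dma_ops(dev);
		dma_addr_t addr;

		BUG_ON(!valid_dma_direction(dir));
		addr = ops->map_page(dev, page, offset, size, dir, NULL);

		/* the dma-debug hook is issued here, for every implementation */
		debug_dma_map_page(dev, page, offset, size, dir, addr, false);

		return addr;
	}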

(snipped)

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper
  2012-02-24 13:18             ` Marek Szyprowski
  2012-02-24 13:18               ` Marek Szyprowski
@ 2012-02-24 14:31               ` Arnd Bergmann
       [not found]                 ` <201202241431.02170.arnd-r2nGTMty4D4@public.gmane.org>
  1 sibling, 1 reply; 39+ messages in thread
From: Arnd Bergmann @ 2012-02-24 14:31 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: 'Krishna Reddy',
	linux-arm-kernel, linaro-mm-sig, linux-mm, linux-arch,
	linux-samsung-soc, iommu, 'Kyungmin Park',
	'Joerg Roedel', 'Russell King - ARM Linux',
	'Chunsang Jeong', 'KyongHo Cho',
	Andrzej Pietrasiewicz, 'Benjamin Herrenschmidt'

On Friday 24 February 2012, Marek Szyprowski wrote:
> I want to use some kind of chained arrays, each of at most of PAGE_SIZE. This code 
> doesn't really need to keep these page pointers in contiguous virtual memory area, so
> it will not be a problem here.
> 
Sounds like sg_alloc_table(), could you reuse that instead of rolling your own?

	Arnd

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper
       [not found]                 ` <201202241431.02170.arnd-r2nGTMty4D4@public.gmane.org>
@ 2012-02-24 15:30                   ` Marek Szyprowski
  2012-02-24 15:30                     ` Marek Szyprowski
  0 siblings, 1 reply; 39+ messages in thread
From: Marek Szyprowski @ 2012-02-24 15:30 UTC (permalink / raw)
  To: 'Arnd Bergmann'
  Cc: linux-arch-u79uwXL29TY76Z2rM5mHXA,
	linux-samsung-soc-u79uwXL29TY76Z2rM5mHXA,
	'Russell King - ARM Linux',
	'Benjamin Herrenschmidt',
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linaro-mm-sig-cunTk1MwBs8s++Sfvej+rw,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, 'Krishna Reddy',
	Andrzej Pietrasiewicz, 'Kyungmin Park',
	'KyongHo Cho', 'Chunsang Jeong',
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Hello,

On Friday, February 24, 2012 3:31 PM Arnd Bergmann wrote:

> On Friday 24 February 2012, Marek Szyprowski wrote:
> > I want to use some kind of chained arrays, each of at most of PAGE_SIZE. This code
> > doesn't really need to keep these page pointers in contiguous virtual memory area, so
> > it will not be a problem here.
> >
> Sounds like sg_alloc_table(), could you reuse that instead of rolling your own?

I only need to store 'struct page *' pointers there. sg_alloc_table() operates on 'struct scatterlist'
entries, which are 4 to 6 times larger than a simple 'struct page *' entry. I don't want to waste
that much memory just to reuse two functions. Implementing the same idea with plain
'struct page *' pointers will be just a matter of a few lines.
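Back-of-the-envelope, for a 4 MiB buffer (1024 pages) on 32-bit ARM: 1024 *
sizeof(struct page *) is 4 KiB, i.e. a single page of pointers, while 1024 struct
scatterlist entries (roughly 16-24 bytes each, depending on the config) would take
16-24 KiB just to carry the same information.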

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2012-02-24 15:30 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-10 18:58 [PATCHv6 0/7] ARM: DMA-mapping framework redesign Marek Szyprowski
2012-02-10 18:58 ` Marek Szyprowski
     [not found] ` <1328900324-20946-1-git-send-email-m.szyprowski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
2012-02-10 18:58   ` [PATCHv6 1/7] ARM: dma-mapping: remove offset parameter to prepare for generic dma_ops Marek Szyprowski
2012-02-10 18:58     ` Marek Szyprowski
2012-02-10 18:58   ` [PATCHv6 2/7] ARM: dma-mapping: use asm-generic/dma-mapping-common.h Marek Szyprowski
2012-02-10 18:58     ` Marek Szyprowski
2012-02-14 15:01     ` Konrad Rzeszutek Wilk
2012-02-14 15:01       ` Konrad Rzeszutek Wilk
2012-02-10 18:58   ` [PATCHv6 4/7] ARM: dma-mapping: move all dma bounce code to separate dma ops structure Marek Szyprowski
2012-02-10 18:58     ` Marek Szyprowski
2012-02-10 18:58   ` [PATCHv6 5/7] ARM: dma-mapping: remove redundant code and cleanup Marek Szyprowski
2012-02-10 18:58     ` Marek Szyprowski
2012-02-10 18:58   ` [PATCHv6 6/7] ARM: dma-mapping: use alloc, mmap, free from dma_ops Marek Szyprowski
2012-02-10 18:58     ` Marek Szyprowski
2012-02-10 18:58   ` [PATCHv6 7/7] ARM: dma-mapping: add support for IOMMU mapper Marek Szyprowski
2012-02-10 18:58     ` Marek Szyprowski
2012-02-13 18:18     ` Krishna Reddy
2012-02-13 18:18       ` Krishna Reddy
2012-02-13 19:58     ` Krishna Reddy
2012-02-13 19:58       ` Krishna Reddy
     [not found]       ` <401E54CE964CD94BAE1EB4A729C7087E378E42AE18-wAPRp6hVlRhDw2glCA4ptUEOCMrvLtNR@public.gmane.org>
2012-02-24  9:35         ` Marek Szyprowski
2012-02-24  9:35           ` Marek Szyprowski
2012-02-24 12:49           ` Arnd Bergmann
2012-02-24 12:49             ` Arnd Bergmann
2012-02-24 13:18             ` Marek Szyprowski
2012-02-24 13:18               ` Marek Szyprowski
2012-02-24 14:31               ` Arnd Bergmann
     [not found]                 ` <201202241431.02170.arnd-r2nGTMty4D4@public.gmane.org>
2012-02-24 15:30                   ` Marek Szyprowski
2012-02-24 15:30                     ` Marek Szyprowski
2012-02-14 14:55     ` Konrad Rzeszutek Wilk
2012-02-14 14:55       ` Konrad Rzeszutek Wilk
2012-02-24 13:12       ` Marek Szyprowski
2012-02-24 13:12         ` Marek Szyprowski
2012-02-10 18:58 ` [PATCHv6 3/7] ARM: dma-mapping: implement dma sg methods on top of any generic dma ops Marek Szyprowski
2012-02-10 18:58   ` Marek Szyprowski
2012-02-14 15:02   ` Konrad Rzeszutek Wilk
2012-02-14 15:02     ` Konrad Rzeszutek Wilk
2012-02-24 13:24     ` Marek Szyprowski
2012-02-24 13:24       ` Marek Szyprowski
