* [RFC/PATCH v3 0/7] Samsung IOMMU videobuf2 allocator and s5p-fimc update
@ 2011-04-18  9:26 ` Marek Szyprowski
  0 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-18  9:26 UTC (permalink / raw)
  To: linux-arm-kernel, linux-samsung-soc, linux-media
  Cc: Marek Szyprowski, Kyungmin Park, Andrzej Pietrasiewicz,
	Sylwester Nawrocki, Arnd Bergmann, Kukjin Kim

Hello,

This is the third version of the Samsung IOMMU driver (see patch #2) and
the videobuf2 allocator for IOMMU-mapped memory (see patch #4), together
with a FIMC driver update. This update brings some minor bugfixes to the
Samsung IOMMU (SYSMMU) driver and adds support for pages larger than 4KiB
to the videobuf2-dma-iommu allocator.

The main change from the first version of the Samsung IOMMU patches is a
complete rewrite of the IOMMU driver API. As suggested by Arnd Bergmann,
we decided to drop the custom interface and use the kernel-wide, common
IOMMU API defined in include/linux/iommu.h. This way the videobuf2 IOMMU
allocator becomes a much more generic framework that is no longer tied to
any particular IOMMU implementation.
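
To illustrate why a common interface makes the allocator generic, here is
a minimal user-space sketch. It is not the kernel API and not code from
this series: every name in it (mock_iommu_ops, mock_domain,
mock_map_buffer, counting_backend) is hypothetical. The mapping loop only
sees a table of callbacks, so any backend that fills in the table can be
plugged in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the callbacks an IOMMU driver would provide.
 * Illustrative names only; this is not the kernel's iommu API. */
struct mock_iommu_ops {
	int (*map)(void *priv, uint32_t iova, uint32_t paddr, size_t size);
	int (*unmap)(void *priv, uint32_t iova, size_t size);
};

struct mock_domain {
	const struct mock_iommu_ops *ops;
	void *priv;		/* backend-private state */
};

/* Generic code: maps npages pages contiguously in device address space.
 * It knows the ops table only, never the hardware behind it. */
int mock_map_buffer(struct mock_domain *d, uint32_t iova,
		    const uint32_t *pages, size_t npages, size_t page_size)
{
	size_t i;

	assert(d && d->ops && d->ops->map && d->ops->unmap);

	for (i = 0; i < npages; i++) {
		uint32_t addr = iova + (uint32_t)(i * page_size);
		int ret = d->ops->map(d->priv, addr, pages[i], page_size);

		if (ret) {
			/* roll back the pages mapped so far */
			while (i--)
				d->ops->unmap(d->priv,
					      iova + (uint32_t)(i * page_size),
					      page_size);
			return ret;
		}
	}
	return 0;
}

/* One possible backend: counts live mappings and can inject a failure. */
struct counting_backend {
	int mapped;	/* number of currently mapped pages */
	int fail_at;	/* fail when 'mapped' reaches this value; -1 = never */
};

int counting_map(void *priv, uint32_t iova, uint32_t paddr, size_t size)
{
	struct counting_backend *b = priv;

	(void)iova; (void)paddr; (void)size;
	if (b->mapped == b->fail_at)
		return -1;
	b->mapped++;
	return 0;
}

int counting_unmap(void *priv, uint32_t iova, size_t size)
{
	struct counting_backend *b = priv;

	(void)iova; (void)size;
	b->mapped--;
	return 0;
}

const struct mock_iommu_ops counting_ops = {
	.map	= counting_map,
	.unmap	= counting_unmap,
};
```

Swapping counting_ops for another table changes the backend without
touching mock_map_buffer, which is the property the paragraph above
describes.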

This patch series introduces a new type of videobuf2 memory allocator,
vb2-dma-iommu. The allocator can be used on platforms that provide an
include/linux/iommu.h style IOMMU driver. An IOMMU driver for the Samsung
EXYNOS4 platform (called SYSMMU) is also included. The allocator and
IOMMU driver are then used by the s5p-fimc driver. To make this possible,
some additional changes are required, mainly defining the platform
support for s5p-fimc on EXYNOS4 machines. The proposed solution has been
tested on a Universal C210 board (Samsung S5PC210/EXYNOS4 based).
The IOMMU allocator has no dependencies on any external subsystems.

We have also ported the s5p-mfc and s5p-tv drivers to this allocator; they
will be posted in a separate patch series. This will make it possible to
get them working on the mainline Linux kernel for the EXYNOS4 platform.
Support for the S5PV210/S5PC110 platform still depends on the CMA
allocator, which needs more discussion on the memory management mailing
list and further development.
The patches with updated s5p-mfc and s5p-tv drivers will follow.

To get the FIMC module working on the EXYNOS4 platform on the Universal
C210 board, we also added support for power domains and power gating.

This patch series contains a collection of patches for various platform
subsystems. Here is a detailed list:

[PATCH 1/7] ARM: EXYNOS4: power domains: fixes and code cleanup
- adds support for block gating in Samsung power domain driver and
  performs some cleanup

[PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
- a complete rewrite of the sysmmu driver for the Samsung platform; it now
  uses the include/linux/iommu.h API (key patch in this series)

[PATCH 3/7] v4l: videobuf2: dma-sg: move some generic functions to memops
- a little cleanup and preparations for the dma-iommu allocator

[PATCH 4/7] v4l: videobuf2: add IOMMU based DMA memory allocator
- introduces new memory allocator for videobuf2 for drivers that support
  iommu dma memory mappings (key patch in this series)

[PATCH 5/7] v4l: s5p-fimc: add pm_runtime support
- adds support for pm_runtime in s5p-fimc driver

[PATCH 6/7] v4l: s5p-fimc: Add support for vb2-dma-iommu allocator
- adds support for the newly introduced videobuf2-dma-iommu allocator
  on the EXYNOS4 platform

[PATCH 7/7] ARM: EXYNOS4: enable FIMC on Universal_C210
- adds all required machine definitions to get FIMC modules working
  on Universal C210 boards


Changelog:

V3:
 - minor bugfixes in SYSMMU driver
 - added complete support for 64KiB and 1MiB pages to videobuf2-dma-iommu
   allocator

V2: http://68.183.106.108/lists/arm-kernel/msg120636.html
 - the custom SYSMMU interface has been dropped in favour of
   include/linux/iommu.h and the SYSMMU driver has been rewritten again
 - added support to SYSMMU for mapping pages larger than 4KiB
 - dropped ARM shared mode
 - the videobuf2-s5p-iommu allocator has been renamed to videobuf2-dma-iommu
   because it has no dependency on any Samsung platform specific API;
   the allocator still uses only 4KiB pages, but this will be changed in the
   next version
 - dropped the FIMC platform patch that has been merged into mainline
 - rebased all patches onto Linux kernel v2.6.39-rc1

V1: http://www.spinics.net/lists/linux-media/msg29751.html
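
As a rough illustration of what mixing 4KiB, 64KiB and 1MiB pages (see
the V3 entry above) can involve, the sketch below picks the largest page
size that fits the current alignment and remaining length. This is an
assumption about how such a selection could be done, not the code from
patch #4; pick_page_size and count_mappings are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_4K	0x00001000u
#define SZ_64K	0x00010000u
#define SZ_1M	0x00100000u

/* Return the largest supported page size that can be used at this point:
 * both addresses must be aligned to it and enough bytes must remain.
 * Returns 0 if not even a 4KiB page fits. */
size_t pick_page_size(uint32_t iova, uint32_t paddr, size_t remaining)
{
	static const size_t sizes[] = { SZ_1M, SZ_64K, SZ_4K };
	size_t i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
		size_t sz = sizes[i];

		if (remaining >= sz && ((iova | paddr) & (sz - 1)) == 0)
			return sz;
	}
	return 0;
}

/* Count how many page mappings a physically contiguous region needs
 * when the largest fitting page is chosen at every step.
 * Returns 0 if the region cannot be covered with the supported sizes. */
size_t count_mappings(uint32_t iova, uint32_t paddr, size_t len)
{
	size_t n = 0;

	while (len) {
		size_t sz = pick_page_size(iova, paddr, len);

		if (!sz)
			return 0;
		iova += (uint32_t)sz;
		paddr += (uint32_t)sz;
		len -= sz;
		n++;
	}
	return n;
}
```

With this greedy choice, a region of 1MiB + 64KiB + 4KiB that starts on a
1MiB boundary needs three mappings instead of 273 mappings of 4KiB each.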

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center



Complete patch summary:

Andrzej Pietrasiewicz (3):
  ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  v4l: videobuf2: dma-sg: move some generic functions to memops
  v4l: videobuf2: add IOMMU based DMA memory allocator

Marek Szyprowski (3):
  v4l: s5p-fimc: add pm_runtime support
  v4l: s5p-fimc: Add support for vb2-dma-iommu allocator
  ARM: EXYNOS4: enable FIMC on Universal_C210

Tomasz Stanislawski (1):
  ARM: EXYNOS4: power domains: fixes and code cleanup

 arch/arm/mach-exynos4/Kconfig                   |    6 +
 arch/arm/mach-exynos4/clock.c                   |   68 +-
 arch/arm/mach-exynos4/dev-pd.c                  |   93 ++-
 arch/arm/mach-exynos4/dev-sysmmu.c              |  615 ++++++++-----
 arch/arm/mach-exynos4/include/mach/irqs.h       |   35 +-
 arch/arm/mach-exynos4/include/mach/regs-clock.h |    7 +
 arch/arm/mach-exynos4/include/mach/sysmmu.h     |   46 -
 arch/arm/mach-exynos4/mach-universal_c210.c     |   22 +
 arch/arm/plat-s5p/Kconfig                       |   20 +-
 arch/arm/plat-s5p/include/plat/sysmmu.h         |  241 +++--
 arch/arm/plat-s5p/sysmmu.c                      | 1191 +++++++++++++++++------
 arch/arm/plat-samsung/include/plat/devs.h       |    2 +-
 arch/arm/plat-samsung/include/plat/pd.h         |    1 +
 drivers/media/video/Kconfig                     |   11 +-
 drivers/media/video/Makefile                    |    1 +
 drivers/media/video/s5p-fimc/fimc-capture.c     |    9 +-
 drivers/media/video/s5p-fimc/fimc-core.c        |   38 +-
 drivers/media/video/s5p-fimc/fimc-core.h        |    1 +
 drivers/media/video/s5p-fimc/fimc-mem.h         |  104 ++
 drivers/media/video/videobuf2-dma-iommu.c       |  762 +++++++++++++++
 drivers/media/video/videobuf2-dma-sg.c          |   37 +-
 drivers/media/video/videobuf2-memops.c          |   76 ++
 include/media/videobuf2-dma-iommu.h             |   48 +
 include/media/videobuf2-memops.h                |    5 +
 24 files changed, 2638 insertions(+), 801 deletions(-)
 rewrite arch/arm/mach-exynos4/dev-sysmmu.c (88%)
 delete mode 100644 arch/arm/mach-exynos4/include/mach/sysmmu.h
 rewrite arch/arm/plat-s5p/include/plat/sysmmu.h (83%)
 rewrite arch/arm/plat-s5p/sysmmu.c (87%)
 create mode 100644 drivers/media/video/s5p-fimc/fimc-mem.h
 create mode 100644 drivers/media/video/videobuf2-dma-iommu.c
 create mode 100644 include/media/videobuf2-dma-iommu.h

-- 
1.7.1.569.g6f426

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 1/7] ARM: EXYNOS4: power domains: fixes and code cleanup
  2011-04-18  9:26 ` Marek Szyprowski
@ 2011-04-18  9:26   ` Marek Szyprowski
  -1 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-18  9:26 UTC (permalink / raw)
  To: linux-arm-kernel, linux-samsung-soc, linux-media
  Cc: Marek Szyprowski, Kyungmin Park, Andrzej Pietrasiewicz,
	Sylwester Nawrocki, Arnd Bergmann, Kukjin Kim,
	Tomasz Stanislawski

From: Tomasz Stanislawski <t.stanislaws@samsung.com>

This patch extends the power domain driver with support for enabling and
disabling modules in the S5P_CLKGATE_BLOCK register. It also performs a
little code cleanup to avoid confusion between the exynos4_device_pd array
index and the power domain id.
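
A small user-space sketch of the two ideas above, for readers who do not
want to walk the whole diff. The bit values are taken from the
regs-clock.h hunk below; the enum order is an assumption inferred from
the ids replaced in dev-pd.c, and fake_clkgate_block and pd_gate are
hypothetical stand-ins for the register and the driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed order, inferred from the numeric ids replaced in the diff. */
enum pd_block {
	PD_MFC, PD_G3D, PD_LCD0, PD_LCD1, PD_TV, PD_CAM, PD_GPS, PD_COUNT
};

/* Designated initializers tie each entry to its id, so the table stays
 * correct even if entries are reordered. Note that the gate bit position
 * differs from the array index. */
static const uint32_t gate_mask[PD_COUNT] = {
	[PD_MFC]  = 1u << 2,
	[PD_G3D]  = 1u << 3,
	[PD_LCD0] = 1u << 4,
	[PD_LCD1] = 1u << 5,
	[PD_TV]   = 1u << 1,
	[PD_CAM]  = 1u << 0,
	/* PD_GPS: no gate bit appears in the visible part of the patch,
	 * so it is left as 0 here */
};

/* Plain variable standing in for the S5P_CLKGATE_BLOCK register. */
uint32_t fake_clkgate_block;

/* Read-modify-write of the shared gate register. The real driver does
 * this under a spinlock because all domains share one register. */
void pd_gate(enum pd_block id, int enable)
{
	uint32_t mask, value;

	assert((unsigned int)id < PD_COUNT);
	mask = gate_mask[id];
	if (!mask)
		return;		/* nothing to gate for this domain */

	value = fake_clkgate_block;
	if (enable)
		value |= mask;
	else
		value &= ~mask;
	fake_clkgate_block = value;
}
```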

Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
---
 arch/arm/mach-exynos4/dev-pd.c                  |   93 +++++++++++++++++------
 arch/arm/mach-exynos4/include/mach/regs-clock.h |    7 ++
 arch/arm/plat-samsung/include/plat/pd.h         |    1 +
 3 files changed, 79 insertions(+), 22 deletions(-)

diff --git a/arch/arm/mach-exynos4/dev-pd.c b/arch/arm/mach-exynos4/dev-pd.c
index 3273f25..44c6597 100644
--- a/arch/arm/mach-exynos4/dev-pd.c
+++ b/arch/arm/mach-exynos4/dev-pd.c
@@ -16,13 +16,17 @@
 #include <linux/delay.h>
 
 #include <mach/regs-pmu.h>
+#include <mach/regs-clock.h>
 
 #include <plat/pd.h>
 
+static DEFINE_SPINLOCK(gate_block_slock);
+
 static int exynos4_pd_enable(struct device *dev)
 {
 	struct samsung_pd_info *pdata =  dev->platform_data;
 	u32 timeout;
+	int ret = 0;
 
 	__raw_writel(S5P_INT_LOCAL_PWR_EN, pdata->base);
 
@@ -31,21 +35,39 @@ static int exynos4_pd_enable(struct device *dev)
 	while ((__raw_readl(pdata->base + 0x4) & S5P_INT_LOCAL_PWR_EN)
 		!= S5P_INT_LOCAL_PWR_EN) {
 		if (timeout == 0) {
-			printk(KERN_ERR "Power domain %s enable failed.\n",
-				dev_name(dev));
-			return -ETIMEDOUT;
+			dev_err(dev, "enable failed\n");
+			ret = -ETIMEDOUT;
+			goto done;
 		}
 		timeout--;
 		udelay(100);
 	}
 
-	return 0;
+	/* configure clk gate mask if it is present */
+	if (pdata->gate_mask) {
+		unsigned long flags;
+		unsigned long value;
+
+		spin_lock_irqsave(&gate_block_slock, flags);
+
+		value  = __raw_readl(S5P_CLKGATE_BLOCK);
+		value |= pdata->gate_mask;
+		__raw_writel(value, S5P_CLKGATE_BLOCK);
+
+		spin_unlock_irqrestore(&gate_block_slock, flags);
+	}
+
+done:
+	dev_info(dev, "enable finished\n");
+
+	return ret;
 }
 
 static int exynos4_pd_disable(struct device *dev)
 {
 	struct samsung_pd_info *pdata =  dev->platform_data;
 	u32 timeout;
+	int ret = 0;
 
 	__raw_writel(0, pdata->base);
 
@@ -53,81 +75,108 @@ static int exynos4_pd_disable(struct device *dev)
 	timeout = 10;
 	while (__raw_readl(pdata->base + 0x4) & S5P_INT_LOCAL_PWR_EN) {
 		if (timeout == 0) {
-			printk(KERN_ERR "Power domain %s disable failed.\n",
-				dev_name(dev));
-			return -ETIMEDOUT;
+			dev_err(dev, "disable failed\n");
+			ret = -ETIMEDOUT;
+			goto done;
 		}
 		timeout--;
 		udelay(100);
 	}
 
-	return 0;
+	if (pdata->gate_mask) {
+		unsigned long flags;
+		unsigned long value;
+
+		spin_lock_irqsave(&gate_block_slock, flags);
+
+		value  = __raw_readl(S5P_CLKGATE_BLOCK);
+		value &= ~pdata->gate_mask;
+		__raw_writel(value, S5P_CLKGATE_BLOCK);
+
+		spin_unlock_irqrestore(&gate_block_slock, flags);
+	}
+done:
+	dev_info(dev, "disable finished\n");
+
+	return ret;
 }
 
 struct platform_device exynos4_device_pd[] = {
-	{
+	[PD_MFC] = {
 		.name		= "samsung-pd",
-		.id		= 0,
+		.id		= PD_MFC,
 		.dev = {
 			.platform_data = &(struct samsung_pd_info) {
 				.enable		= exynos4_pd_enable,
 				.disable	= exynos4_pd_disable,
 				.base		= S5P_PMU_MFC_CONF,
+				.gate_mask	= S5P_CLKGATE_BLOCK_MFC,
 			},
 		},
-	}, {
+	},
+	[PD_G3D] = {
 		.name		= "samsung-pd",
-		.id		= 1,
+		.id		= PD_G3D,
 		.dev = {
 			.platform_data = &(struct samsung_pd_info) {
 				.enable		= exynos4_pd_enable,
 				.disable	= exynos4_pd_disable,
 				.base		= S5P_PMU_G3D_CONF,
+				.gate_mask	= S5P_CLKGATE_BLOCK_G3D,
 			},
 		},
-	}, {
+	},
+	[PD_LCD0] = {
 		.name		= "samsung-pd",
-		.id		= 2,
+		.id		= PD_LCD0,
 		.dev = {
 			.platform_data = &(struct samsung_pd_info) {
 				.enable		= exynos4_pd_enable,
 				.disable	= exynos4_pd_disable,
 				.base		= S5P_PMU_LCD0_CONF,
+				.gate_mask	= S5P_CLKGATE_BLOCK_LCD0,
 			},
 		},
-	}, {
+	},
+	[PD_LCD1] = {
 		.name		= "samsung-pd",
-		.id		= 3,
+		.id		= PD_LCD1,
 		.dev = {
 			.platform_data = &(struct samsung_pd_info) {
 				.enable		= exynos4_pd_enable,
 				.disable	= exynos4_pd_disable,
 				.base		= S5P_PMU_LCD1_CONF,
+				.gate_mask	= S5P_CLKGATE_BLOCK_LCD1,
 			},
 		},
-	}, {
+	},
+	[PD_TV] = {
 		.name		= "samsung-pd",
-		.id		= 4,
+		.id		= PD_TV,
 		.dev = {
 			.platform_data = &(struct samsung_pd_info) {
 				.enable		= exynos4_pd_enable,
 				.disable	= exynos4_pd_disable,
 				.base		= S5P_PMU_TV_CONF,
+				.gate_mask	= S5P_CLKGATE_BLOCK_TV,
 			},
 		},
-	}, {
+	},
+	[PD_CAM] = {
 		.name		= "samsung-pd",
-		.id		= 5,
+		.id		= PD_CAM,
 		.dev = {
 			.platform_data = &(struct samsung_pd_info) {
 				.enable		= exynos4_pd_enable,
 				.disable	= exynos4_pd_disable,
 				.base		= S5P_PMU_CAM_CONF,
+				.gate_mask	= S5P_CLKGATE_BLOCK_CAM,
 			},
 		},
-	}, {
+	},
+	[PD_GPS] = {
 		.name		= "samsung-pd",
-		.id		= 6,
+		.id		= PD_GPS,
 		.dev = {
 			.platform_data = &(struct samsung_pd_info) {
 				.enable		= exynos4_pd_enable,
diff --git a/arch/arm/mach-exynos4/include/mach/regs-clock.h b/arch/arm/mach-exynos4/include/mach/regs-clock.h
index 6e311c1..2c1472b 100644
--- a/arch/arm/mach-exynos4/include/mach/regs-clock.h
+++ b/arch/arm/mach-exynos4/include/mach/regs-clock.h
@@ -171,6 +171,13 @@
 #define S5P_CLKDIV_BUS_GPLR_SHIFT	(4)
 #define S5P_CLKDIV_BUS_GPLR_MASK	(0x7 << S5P_CLKDIV_BUS_GPLR_SHIFT)
 
+#define S5P_CLKGATE_BLOCK_CAM		(1 << 0)
+#define S5P_CLKGATE_BLOCK_TV		(1 << 1)
+#define S5P_CLKGATE_BLOCK_MFC		(1 << 2)
+#define S5P_CLKGATE_BLOCK_G3D		(1 << 3)
+#define S5P_CLKGATE_BLOCK_LCD0		(1 << 4)
+#define S5P_CLKGATE_BLOCK_LCD1		(1 << 5)
+
 /* Compatibility defines and inclusion */
 
 #include <mach/regs-pmu.h>
diff --git a/arch/arm/plat-samsung/include/plat/pd.h b/arch/arm/plat-samsung/include/plat/pd.h
index abb4bc3..ef545ed 100644
--- a/arch/arm/plat-samsung/include/plat/pd.h
+++ b/arch/arm/plat-samsung/include/plat/pd.h
@@ -15,6 +15,7 @@ struct samsung_pd_info {
 	int (*enable)(struct device *dev);
 	int (*disable)(struct device *dev);
 	void __iomem *base;
+	unsigned long gate_mask;
 };
 
 enum exynos4_pd_block {
-- 
1.7.1.569.g6f426

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-18  9:26 ` Marek Szyprowski
@ 2011-04-18  9:26   ` Marek Szyprowski
  -1 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-18  9:26 UTC (permalink / raw)
  To: linux-arm-kernel, linux-samsung-soc, linux-media
  Cc: Marek Szyprowski, Kyungmin Park, Andrzej Pietrasiewicz,
	Sylwester Nawrocki, Arnd Bergmann, Kukjin Kim

From: Andrzej Pietrasiewicz <andrzej.p@samsung.com>

This patch performs a complete rewrite of the sysmmu driver for the Samsung
platform:
- simplified the resource management: a single platform device with 32
  resources is no longer needed; each sysmmu instance has its own resource
  definition, which fits better into the Linux driver model
- the new version uses the kernel-wide common IOMMU API defined in
  include/linux/iommu.h
- cleaned up support for sysmmu clocks
- added support for custom fault handlers and tlb replacement policy
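
The clock cleanup can be pictured with the following user-space sketch.
In clock.c the per-instance clock names ("SYSMMU_FIMC0", id -1) become a
single name "sysmmu" with a per-instance id, so a clock is identified by
the (name, id) pair. The lookup below is only an illustration: mock_clk,
mock_clk_find and the MOCK_SYSMMU_* values are hypothetical, and only the
ctrlbit values are copied from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical ids; the real S5P_SYSMMU_* values live in plat/sysmmu.h. */
enum mock_sysmmu_id {
	MOCK_SYSMMU_MDMA,
	MOCK_SYSMMU_FIMC0,
	MOCK_SYSMMU_FIMC1,
};

struct mock_clk {
	const char	*name;
	int		id;
	unsigned int	ctrlbit;
};

/* All instances share one name and are told apart by id. */
static const struct mock_clk mock_clocks[] = {
	{ .name = "sysmmu", .id = MOCK_SYSMMU_MDMA,  .ctrlbit = 1u << 5 },
	{ .name = "sysmmu", .id = MOCK_SYSMMU_FIMC0, .ctrlbit = 1u << 7 },
	{ .name = "sysmmu", .id = MOCK_SYSMMU_FIMC1, .ctrlbit = 1u << 8 },
};

/* Find a clock by (name, id); returns NULL when there is no match. */
const struct mock_clk *mock_clk_find(const char *name, int id)
{
	size_t i;

	for (i = 0; i < sizeof(mock_clocks) / sizeof(mock_clocks[0]); i++) {
		const struct mock_clk *c = &mock_clocks[i];

		if (c->id == id && strcmp(c->name, name) == 0)
			return c;
	}
	return NULL;
}
```

One consequence of this layout is that each sysmmu instance can request
its clock with the same name, which matches the move to one resource
definition per instance.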

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
---
 arch/arm/mach-exynos4/clock.c               |   68 +-
 arch/arm/mach-exynos4/dev-sysmmu.c          |  615 +++++++++------
 arch/arm/mach-exynos4/include/mach/irqs.h   |   35 +-
 arch/arm/mach-exynos4/include/mach/sysmmu.h |   46 -
 arch/arm/plat-s5p/Kconfig                   |   20 +-
 arch/arm/plat-s5p/include/plat/sysmmu.h     |  241 ++++---
 arch/arm/plat-s5p/sysmmu.c                  | 1191 ++++++++++++++++++++-------
 arch/arm/plat-samsung/include/plat/devs.h   |    2 +-
 8 files changed, 1478 insertions(+), 740 deletions(-)
 rewrite arch/arm/mach-exynos4/dev-sysmmu.c (88%)
 delete mode 100644 arch/arm/mach-exynos4/include/mach/sysmmu.h
 rewrite arch/arm/plat-s5p/include/plat/sysmmu.h (83%)
 rewrite arch/arm/plat-s5p/sysmmu.c (87%)

diff --git a/arch/arm/mach-exynos4/clock.c b/arch/arm/mach-exynos4/clock.c
index 871f9d5..963195e 100644
--- a/arch/arm/mach-exynos4/clock.c
+++ b/arch/arm/mach-exynos4/clock.c
@@ -20,10 +20,10 @@
 #include <plat/pll.h>
 #include <plat/s5p-clock.h>
 #include <plat/clock-clksrc.h>
+#include <plat/sysmmu.h>
 
 #include <mach/map.h>
 #include <mach/regs-clock.h>
-#include <mach/sysmmu.h>
 
 static struct clk clk_sclk_hdmi27m = {
 	.name		= "sclk_hdmi27m",
@@ -127,6 +127,11 @@ static int exynos4_clk_ip_perir_ctrl(struct clk *clk, int enable)
 	return s5p_gatectrl(S5P_CLKGATE_IP_PERIR, clk, enable);
 }
 
+static int exynos4_clk_ip_dmc_ctrl(struct clk *clk, int enable)
+{
+	return s5p_gatectrl(S5P_CLKGATE_IP_DMC, clk, enable);
+}
+
 /* Core list of CMU_CPU side */
 
 static struct clksrc_clk clk_mout_apll = {
@@ -614,75 +619,80 @@ static struct clk init_clocks_off[] = {
 		.enable		= exynos4_clk_ip_peril_ctrl,
 		.ctrlbit	= (1 << 13),
 	}, {
-		.name		= "SYSMMU_MDMA",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_MDMA,
 		.enable		= exynos4_clk_ip_image_ctrl,
 		.ctrlbit	= (1 << 5),
 	}, {
-		.name		= "SYSMMU_FIMC0",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_FIMC0,
 		.enable		= exynos4_clk_ip_cam_ctrl,
 		.ctrlbit	= (1 << 7),
 	}, {
-		.name		= "SYSMMU_FIMC1",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_FIMC1,
 		.enable		= exynos4_clk_ip_cam_ctrl,
 		.ctrlbit	= (1 << 8),
 	}, {
-		.name		= "SYSMMU_FIMC2",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_FIMC2,
 		.enable		= exynos4_clk_ip_cam_ctrl,
 		.ctrlbit	= (1 << 9),
 	}, {
-		.name		= "SYSMMU_FIMC3",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_FIMC3,
 		.enable		= exynos4_clk_ip_cam_ctrl,
 		.ctrlbit	= (1 << 10),
 	}, {
-		.name		= "SYSMMU_JPEG",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_JPEG,
 		.enable		= exynos4_clk_ip_cam_ctrl,
 		.ctrlbit	= (1 << 11),
 	}, {
-		.name		= "SYSMMU_FIMD0",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_FIMD0,
 		.enable		= exynos4_clk_ip_lcd0_ctrl,
 		.ctrlbit	= (1 << 4),
 	}, {
-		.name		= "SYSMMU_FIMD1",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_FIMD1,
 		.enable		= exynos4_clk_ip_lcd1_ctrl,
 		.ctrlbit	= (1 << 4),
 	}, {
-		.name		= "SYSMMU_PCIe",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_PCIe,
 		.enable		= exynos4_clk_ip_fsys_ctrl,
 		.ctrlbit	= (1 << 18),
 	}, {
-		.name		= "SYSMMU_G2D",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_G2D,
 		.enable		= exynos4_clk_ip_image_ctrl,
 		.ctrlbit	= (1 << 3),
 	}, {
-		.name		= "SYSMMU_ROTATOR",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_ROTATOR,
 		.enable		= exynos4_clk_ip_image_ctrl,
 		.ctrlbit	= (1 << 4),
 	}, {
-		.name		= "SYSMMU_TV",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_TV,
 		.enable		= exynos4_clk_ip_tv_ctrl,
 		.ctrlbit	= (1 << 4),
 	}, {
-		.name		= "SYSMMU_MFC_L",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_MFC_L,
 		.enable		= exynos4_clk_ip_mfc_ctrl,
 		.ctrlbit	= (1 << 1),
 	}, {
-		.name		= "SYSMMU_MFC_R",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_MFC_R,
 		.enable		= exynos4_clk_ip_mfc_ctrl,
 		.ctrlbit	= (1 << 2),
+	}, {
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_SSS,
+		.enable		= exynos4_clk_ip_dmc_ctrl,
+		.ctrlbit	= (1 << 12),
 	}
 };
 
diff --git a/arch/arm/mach-exynos4/dev-sysmmu.c b/arch/arm/mach-exynos4/dev-sysmmu.c
dissimilarity index 88%
index 3b7cae0..23c3a6e 100644
--- a/arch/arm/mach-exynos4/dev-sysmmu.c
+++ b/arch/arm/mach-exynos4/dev-sysmmu.c
@@ -1,232 +1,383 @@
-/* linux/arch/arm/mach-exynos4/dev-sysmmu.c
- *
- * Copyright (c) 2010 Samsung Electronics Co., Ltd.
- *		http://www.samsung.com
- *
- * EXYNOS4 - System MMU support
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-
-#include <linux/platform_device.h>
-#include <linux/dma-mapping.h>
-
-#include <mach/map.h>
-#include <mach/irqs.h>
-#include <mach/sysmmu.h>
-#include <plat/s5p-clock.h>
-
-/* These names must be equal to the clock names in mach-exynos4/clock.c */
-const char *sysmmu_ips_name[EXYNOS4_SYSMMU_TOTAL_IPNUM] = {
-	"SYSMMU_MDMA"	,
-	"SYSMMU_SSS"	,
-	"SYSMMU_FIMC0"	,
-	"SYSMMU_FIMC1"	,
-	"SYSMMU_FIMC2"	,
-	"SYSMMU_FIMC3"	,
-	"SYSMMU_JPEG"	,
-	"SYSMMU_FIMD0"	,
-	"SYSMMU_FIMD1"	,
-	"SYSMMU_PCIe"	,
-	"SYSMMU_G2D"	,
-	"SYSMMU_ROTATOR",
-	"SYSMMU_MDMA2"	,
-	"SYSMMU_TV"	,
-	"SYSMMU_MFC_L"	,
-	"SYSMMU_MFC_R"	,
-};
-
-static struct resource exynos4_sysmmu_resource[] = {
-	[0] = {
-		.start	= EXYNOS4_PA_SYSMMU_MDMA,
-		.end	= EXYNOS4_PA_SYSMMU_MDMA + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[1] = {
-		.start	= IRQ_SYSMMU_MDMA0_0,
-		.end	= IRQ_SYSMMU_MDMA0_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[2] = {
-		.start	= EXYNOS4_PA_SYSMMU_SSS,
-		.end	= EXYNOS4_PA_SYSMMU_SSS + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[3] = {
-		.start	= IRQ_SYSMMU_SSS_0,
-		.end	= IRQ_SYSMMU_SSS_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[4] = {
-		.start	= EXYNOS4_PA_SYSMMU_FIMC0,
-		.end	= EXYNOS4_PA_SYSMMU_FIMC0 + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[5] = {
-		.start	= IRQ_SYSMMU_FIMC0_0,
-		.end	= IRQ_SYSMMU_FIMC0_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[6] = {
-		.start	= EXYNOS4_PA_SYSMMU_FIMC1,
-		.end	= EXYNOS4_PA_SYSMMU_FIMC1 + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[7] = {
-		.start	= IRQ_SYSMMU_FIMC1_0,
-		.end	= IRQ_SYSMMU_FIMC1_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[8] = {
-		.start	= EXYNOS4_PA_SYSMMU_FIMC2,
-		.end	= EXYNOS4_PA_SYSMMU_FIMC2 + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[9] = {
-		.start	= IRQ_SYSMMU_FIMC2_0,
-		.end	= IRQ_SYSMMU_FIMC2_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[10] = {
-		.start	= EXYNOS4_PA_SYSMMU_FIMC3,
-		.end	= EXYNOS4_PA_SYSMMU_FIMC3 + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[11] = {
-		.start	= IRQ_SYSMMU_FIMC3_0,
-		.end	= IRQ_SYSMMU_FIMC3_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[12] = {
-		.start	= EXYNOS4_PA_SYSMMU_JPEG,
-		.end	= EXYNOS4_PA_SYSMMU_JPEG + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[13] = {
-		.start	= IRQ_SYSMMU_JPEG_0,
-		.end	= IRQ_SYSMMU_JPEG_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[14] = {
-		.start	= EXYNOS4_PA_SYSMMU_FIMD0,
-		.end	= EXYNOS4_PA_SYSMMU_FIMD0 + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[15] = {
-		.start	= IRQ_SYSMMU_LCD0_M0_0,
-		.end	= IRQ_SYSMMU_LCD0_M0_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[16] = {
-		.start	= EXYNOS4_PA_SYSMMU_FIMD1,
-		.end	= EXYNOS4_PA_SYSMMU_FIMD1 + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[17] = {
-		.start	= IRQ_SYSMMU_LCD1_M1_0,
-		.end	= IRQ_SYSMMU_LCD1_M1_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[18] = {
-		.start	= EXYNOS4_PA_SYSMMU_PCIe,
-		.end	= EXYNOS4_PA_SYSMMU_PCIe + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[19] = {
-		.start	= IRQ_SYSMMU_PCIE_0,
-		.end	= IRQ_SYSMMU_PCIE_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[20] = {
-		.start	= EXYNOS4_PA_SYSMMU_G2D,
-		.end	= EXYNOS4_PA_SYSMMU_G2D + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[21] = {
-		.start	= IRQ_SYSMMU_2D_0,
-		.end	= IRQ_SYSMMU_2D_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[22] = {
-		.start	= EXYNOS4_PA_SYSMMU_ROTATOR,
-		.end	= EXYNOS4_PA_SYSMMU_ROTATOR + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[23] = {
-		.start	= IRQ_SYSMMU_ROTATOR_0,
-		.end	= IRQ_SYSMMU_ROTATOR_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[24] = {
-		.start	= EXYNOS4_PA_SYSMMU_MDMA2,
-		.end	= EXYNOS4_PA_SYSMMU_MDMA2 + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[25] = {
-		.start	= IRQ_SYSMMU_MDMA1_0,
-		.end	= IRQ_SYSMMU_MDMA1_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[26] = {
-		.start	= EXYNOS4_PA_SYSMMU_TV,
-		.end	= EXYNOS4_PA_SYSMMU_TV + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[27] = {
-		.start	= IRQ_SYSMMU_TV_M0_0,
-		.end	= IRQ_SYSMMU_TV_M0_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[28] = {
-		.start	= EXYNOS4_PA_SYSMMU_MFC_L,
-		.end	= EXYNOS4_PA_SYSMMU_MFC_L + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[29] = {
-		.start	= IRQ_SYSMMU_MFC_M0_0,
-		.end	= IRQ_SYSMMU_MFC_M0_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[30] = {
-		.start	= EXYNOS4_PA_SYSMMU_MFC_R,
-		.end	= EXYNOS4_PA_SYSMMU_MFC_R + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[31] = {
-		.start	= IRQ_SYSMMU_MFC_M1_0,
-		.end	= IRQ_SYSMMU_MFC_M1_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-};
-
-struct platform_device exynos4_device_sysmmu = {
-	.name		= "s5p-sysmmu",
-	.id		= 32,
-	.num_resources	= ARRAY_SIZE(exynos4_sysmmu_resource),
-	.resource	= exynos4_sysmmu_resource,
-};
-EXPORT_SYMBOL(exynos4_device_sysmmu);
-
-static struct clk *sysmmu_clk[S5P_SYSMMU_TOTAL_IPNUM];
-void sysmmu_clk_init(struct device *dev, sysmmu_ips ips)
-{
-	sysmmu_clk[ips] = clk_get(dev, sysmmu_ips_name[ips]);
-	if (IS_ERR(sysmmu_clk[ips]))
-		sysmmu_clk[ips] = NULL;
-	else
-		clk_put(sysmmu_clk[ips]);
-}
-
-void sysmmu_clk_enable(sysmmu_ips ips)
-{
-	if (sysmmu_clk[ips])
-		clk_enable(sysmmu_clk[ips]);
-}
-
-void sysmmu_clk_disable(sysmmu_ips ips)
-{
-	if (sysmmu_clk[ips])
-		clk_disable(sysmmu_clk[ips]);
-}
+/* linux/arch/arm/mach-exynos4/dev-sysmmu.c
+ *
+ * Copyright (c) 2010 Samsung Electronics Co., Ltd.
+ *		http://www.samsung.com
+ *
+ * EXYNOS4 - System MMU support
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/platform_device.h>
+#include <linux/dma-mapping.h>
+
+#include <mach/map.h>
+#include <mach/irqs.h>
+
+#include <plat/devs.h>
+#include <plat/cpu.h>
+#include <plat/sysmmu.h>
+
+#define EXYNOS4_NUM_RESOURCES (2)
+
+static struct resource exynos4_sysmmu_resource[][EXYNOS4_NUM_RESOURCES] = {
+	[S5P_SYSMMU_MDMA] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_MDMA,
+			.end	= EXYNOS4_PA_SYSMMU_MDMA + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_MDMA0,
+			.end	= IRQ_SYSMMU_MDMA0,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_SSS] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_SSS,
+			.end	= EXYNOS4_PA_SYSMMU_SSS + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_SSS,
+			.end	= IRQ_SYSMMU_SSS,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_FIMC0] = {
+		[0] = {
+			.start = EXYNOS4_PA_SYSMMU_FIMC0,
+			.end   = EXYNOS4_PA_SYSMMU_FIMC0 + SZ_4K - 1,
+			.flags = IORESOURCE_MEM,
+		},
+		[1] = {
+			.start = IRQ_SYSMMU_FIMC0,
+			.end   = IRQ_SYSMMU_FIMC0,
+			.flags = IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_FIMC1] = {
+		[0] = {
+			.start = EXYNOS4_PA_SYSMMU_FIMC1,
+			.end   = EXYNOS4_PA_SYSMMU_FIMC1 + SZ_4K - 1,
+			.flags = IORESOURCE_MEM,
+		},
+		[1] = {
+			.start = IRQ_SYSMMU_FIMC1,
+			.end   = IRQ_SYSMMU_FIMC1,
+			.flags = IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_FIMC2] = {
+		[0] = {
+			.start = EXYNOS4_PA_SYSMMU_FIMC2,
+			.end   = EXYNOS4_PA_SYSMMU_FIMC2 + SZ_4K - 1,
+			.flags = IORESOURCE_MEM,
+		},
+		[1] = {
+			.start = IRQ_SYSMMU_FIMC2,
+			.end   = IRQ_SYSMMU_FIMC2,
+			.flags = IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_FIMC3] = {
+		[0] = {
+			.start = EXYNOS4_PA_SYSMMU_FIMC3,
+			.end   = EXYNOS4_PA_SYSMMU_FIMC3 + SZ_4K - 1,
+			.flags = IORESOURCE_MEM,
+		},
+		[1] = {
+			.start = IRQ_SYSMMU_FIMC3,
+			.end   = IRQ_SYSMMU_FIMC3,
+			.flags = IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_JPEG] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_JPEG,
+			.end	= EXYNOS4_PA_SYSMMU_JPEG + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_JPEG,
+			.end	= IRQ_SYSMMU_JPEG,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_FIMD0] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_FIMD0,
+			.end	= EXYNOS4_PA_SYSMMU_FIMD0 + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_FIMD0,
+			.end	= IRQ_SYSMMU_FIMD0,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_FIMD1] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_FIMD1,
+			.end	= EXYNOS4_PA_SYSMMU_FIMD1 + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_FIMD1,
+			.end	= IRQ_SYSMMU_FIMD1,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_PCIe] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_PCIe,
+			.end	= EXYNOS4_PA_SYSMMU_PCIe + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_PCIE,
+			.end	= IRQ_SYSMMU_PCIE,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_G2D] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_G2D,
+			.end	= EXYNOS4_PA_SYSMMU_G2D + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_2D,
+			.end	= IRQ_SYSMMU_2D,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_ROTATOR] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_ROTATOR,
+			.end	= EXYNOS4_PA_SYSMMU_ROTATOR + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_ROTATOR,
+			.end	= IRQ_SYSMMU_ROTATOR,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_MDMA2] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_MDMA2,
+			.end	= EXYNOS4_PA_SYSMMU_MDMA2 + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_MDMA1,
+			.end	= IRQ_SYSMMU_MDMA1,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_TV] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_TV,
+			.end	= EXYNOS4_PA_SYSMMU_TV + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_TV,
+			.end	= IRQ_SYSMMU_TV,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_MFC_L] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_MFC_L,
+			.end	= EXYNOS4_PA_SYSMMU_MFC_L + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_MFC_L,
+			.end	= IRQ_SYSMMU_MFC_L,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_MFC_R] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_MFC_R,
+			.end	= EXYNOS4_PA_SYSMMU_MFC_R + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_MFC_R,
+			.end	= IRQ_SYSMMU_MFC_R,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+};
+
+static u64 exynos4_sysmmu_dma_mask = DMA_BIT_MASK(32);
+
+struct platform_device exynos4_device_sysmmu[] = {
+	[S5P_SYSMMU_MDMA] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_MDMA,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_MDMA],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_SSS] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_SSS,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_SSS],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_FIMC0] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_FIMC0,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_FIMC0],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_FIMC1] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_FIMC1,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_FIMC1],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_FIMC2] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_FIMC2,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_FIMC2],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_FIMC3] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_FIMC3,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_FIMC3],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_JPEG] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_JPEG,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_JPEG],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_FIMD0] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_FIMD0,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_FIMD0],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_FIMD1] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_FIMD1,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_FIMD1],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_PCIe] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_PCIe,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_PCIe],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_G2D] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_G2D,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_G2D],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_ROTATOR] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_ROTATOR,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_ROTATOR],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_MDMA2] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_MDMA2,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_MDMA2],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_TV] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_TV,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_TV],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_MFC_L] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_MFC_L,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_MFC_L],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_MFC_R] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_MFC_R,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_MFC_R],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+};
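
Note on the tables above: each System MMU is now its own "s5p-sysmmu" platform device whose .id is the enum s5p_sysmmu_ip value, and the clocks in clock.c all share the name "sysmmu" with matching ids. The userspace sketch below shows how a (name, id) pair can single out one clock. It assumes the clock lookup compares the requesting device's id with the clock's .id; that lookup code is not part of this patch. struct demo_clk, demo_clk_get() and the DEMO_* ids are hypothetical stand-ins, not kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a clock entry: only the two lookup keys. */
struct demo_clk {
	const char *name;
	int id;
};

/* Illustrative ids mirroring the positions in enum s5p_sysmmu_ip. */
enum { DEMO_SYSMMU_FIMC0 = 2, DEMO_SYSMMU_FIMC1 = 3, DEMO_SYSMMU_JPEG = 6 };

static const struct demo_clk demo_clks[] = {
	{ "sysmmu", DEMO_SYSMMU_FIMC0 },
	{ "sysmmu", DEMO_SYSMMU_FIMC1 },
	{ "sysmmu", DEMO_SYSMMU_JPEG },
};

/* Return the first clock whose name and id both match, or NULL. */
const struct demo_clk *demo_clk_get(const char *name, int dev_id)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(demo_clks) / sizeof(demo_clks[0]); i++) {
		if (demo_clks[i].id == dev_id &&
		    strcmp(demo_clks[i].name, name) == 0)
			return &demo_clks[i];
	}
	return NULL;
}
```

Under this assumption the old per-IP names such as "SYSMMU_FIMC1" no longer match anything, and a device with an id that has no clock entry gets NULL back.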
diff --git a/arch/arm/mach-exynos4/include/mach/irqs.h b/arch/arm/mach-exynos4/include/mach/irqs.h
index 5d03730..ad1d00c 100644
--- a/arch/arm/mach-exynos4/include/mach/irqs.h
+++ b/arch/arm/mach-exynos4/include/mach/irqs.h
@@ -55,23 +55,23 @@
 #define COMBINER_GROUP(x)	((x) * MAX_IRQ_IN_COMBINER + IRQ_SPI(64))
 #define COMBINER_IRQ(x, y)	(COMBINER_GROUP(x) + y)
 
-#define IRQ_SYSMMU_MDMA0_0	COMBINER_IRQ(4, 0)
-#define IRQ_SYSMMU_SSS_0	COMBINER_IRQ(4, 1)
-#define IRQ_SYSMMU_FIMC0_0	COMBINER_IRQ(4, 2)
-#define IRQ_SYSMMU_FIMC1_0	COMBINER_IRQ(4, 3)
-#define IRQ_SYSMMU_FIMC2_0	COMBINER_IRQ(4, 4)
-#define IRQ_SYSMMU_FIMC3_0	COMBINER_IRQ(4, 5)
-#define IRQ_SYSMMU_JPEG_0	COMBINER_IRQ(4, 6)
-#define IRQ_SYSMMU_2D_0		COMBINER_IRQ(4, 7)
-
-#define IRQ_SYSMMU_ROTATOR_0	COMBINER_IRQ(5, 0)
-#define IRQ_SYSMMU_MDMA1_0	COMBINER_IRQ(5, 1)
-#define IRQ_SYSMMU_LCD0_M0_0	COMBINER_IRQ(5, 2)
-#define IRQ_SYSMMU_LCD1_M1_0	COMBINER_IRQ(5, 3)
-#define IRQ_SYSMMU_TV_M0_0	COMBINER_IRQ(5, 4)
-#define IRQ_SYSMMU_MFC_M0_0	COMBINER_IRQ(5, 5)
-#define IRQ_SYSMMU_MFC_M1_0	COMBINER_IRQ(5, 6)
-#define IRQ_SYSMMU_PCIE_0	COMBINER_IRQ(5, 7)
+#define IRQ_SYSMMU_MDMA0	COMBINER_IRQ(4, 0)
+#define IRQ_SYSMMU_SSS		COMBINER_IRQ(4, 1)
+#define IRQ_SYSMMU_FIMC0	COMBINER_IRQ(4, 2)
+#define IRQ_SYSMMU_FIMC1	COMBINER_IRQ(4, 3)
+#define IRQ_SYSMMU_FIMC2	COMBINER_IRQ(4, 4)
+#define IRQ_SYSMMU_FIMC3	COMBINER_IRQ(4, 5)
+#define IRQ_SYSMMU_JPEG		COMBINER_IRQ(4, 6)
+#define IRQ_SYSMMU_2D		COMBINER_IRQ(4, 7)
+
+#define IRQ_SYSMMU_ROTATOR	COMBINER_IRQ(5, 0)
+#define IRQ_SYSMMU_MDMA1	COMBINER_IRQ(5, 1)
+#define IRQ_SYSMMU_FIMD0	COMBINER_IRQ(5, 2)
+#define IRQ_SYSMMU_FIMD1	COMBINER_IRQ(5, 3)
+#define IRQ_SYSMMU_TV		COMBINER_IRQ(5, 4)
+#define IRQ_SYSMMU_MFC_L	COMBINER_IRQ(5, 5)
+#define IRQ_SYSMMU_MFC_R	COMBINER_IRQ(5, 6)
+#define IRQ_SYSMMU_PCIE		COMBINER_IRQ(5, 7)
 
 #define IRQ_PDMA0		COMBINER_IRQ(21, 0)
 #define IRQ_PDMA1		COMBINER_IRQ(21, 1)
@@ -157,4 +157,5 @@
 /* Set the default NR_IRQS */
 #define NR_IRQS			(IRQ_GPIO_END)
 
+
 #endif /* __ASM_ARCH_IRQS_H */
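
Note on the renamed IRQ macros: only the names change, the combiner slots stay the same (group 4 carries eight System MMU sources, group 5 the other eight). The userspace sketch below reproduces the shape of COMBINER_GROUP()/COMBINER_IRQ() from the context lines above. DEMO_MAX_IRQ_IN_COMBINER and DEMO_IRQ_SPI_64 are placeholder values assumed for illustration only; the real definitions live elsewhere in the tree and are not shown in this hunk.

```c
#include <assert.h>

/* Assumed placeholders, NOT taken from this patch. */
#define DEMO_MAX_IRQ_IN_COMBINER	8
#define DEMO_IRQ_SPI_64			128

/* Same shape as COMBINER_GROUP()/COMBINER_IRQ() in mach/irqs.h. */
#define DEMO_COMBINER_GROUP(x) \
	((x) * DEMO_MAX_IRQ_IN_COMBINER + DEMO_IRQ_SPI_64)
#define DEMO_COMBINER_IRQ(x, y)	(DEMO_COMBINER_GROUP(x) + (y))

/* Offset of a combiner slot relative to the first System MMU IRQ (4, 0). */
int demo_sysmmu_irq_offset(int group, int bit)
{
	assert(bit >= 0 && bit < DEMO_MAX_IRQ_IN_COMBINER);
	return DEMO_COMBINER_IRQ(group, bit) - DEMO_COMBINER_IRQ(4, 0);
}
```

If a combiner group really holds eight sources, as assumed here, the slot (5, 0) used by IRQ_SYSMMU_ROTATOR directly follows the slot (4, 7) used by IRQ_SYSMMU_2D, so the sixteen System MMU interrupts form one contiguous range.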
diff --git a/arch/arm/mach-exynos4/include/mach/sysmmu.h b/arch/arm/mach-exynos4/include/mach/sysmmu.h
deleted file mode 100644
index 6a5fbb5..0000000
--- a/arch/arm/mach-exynos4/include/mach/sysmmu.h
+++ /dev/null
@@ -1,46 +0,0 @@
-/* linux/arch/arm/mach-exynos4/include/mach/sysmmu.h
- *
- * Copyright (c) 2010-2011 Samsung Electronics Co., Ltd.
- *		http://www.samsung.com
- *
- * Samsung sysmmu driver for EXYNOS4
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
-*/
-
-#ifndef __ASM_ARM_ARCH_SYSMMU_H
-#define __ASM_ARM_ARCH_SYSMMU_H __FILE__
-
-enum exynos4_sysmmu_ips {
-	SYSMMU_MDMA,
-	SYSMMU_SSS,
-	SYSMMU_FIMC0,
-	SYSMMU_FIMC1,
-	SYSMMU_FIMC2,
-	SYSMMU_FIMC3,
-	SYSMMU_JPEG,
-	SYSMMU_FIMD0,
-	SYSMMU_FIMD1,
-	SYSMMU_PCIe,
-	SYSMMU_G2D,
-	SYSMMU_ROTATOR,
-	SYSMMU_MDMA2,
-	SYSMMU_TV,
-	SYSMMU_MFC_L,
-	SYSMMU_MFC_R,
-	EXYNOS4_SYSMMU_TOTAL_IPNUM,
-};
-
-#define S5P_SYSMMU_TOTAL_IPNUM		EXYNOS4_SYSMMU_TOTAL_IPNUM
-
-extern const char *sysmmu_ips_name[EXYNOS4_SYSMMU_TOTAL_IPNUM];
-
-typedef enum exynos4_sysmmu_ips sysmmu_ips;
-
-void sysmmu_clk_init(struct device *dev, sysmmu_ips ips);
-void sysmmu_clk_enable(sysmmu_ips ips);
-void sysmmu_clk_disable(sysmmu_ips ips);
-
-#endif /* __ASM_ARM_ARCH_SYSMMU_H */
diff --git a/arch/arm/plat-s5p/Kconfig b/arch/arm/plat-s5p/Kconfig
index 8492297..9a7805b 100644
--- a/arch/arm/plat-s5p/Kconfig
+++ b/arch/arm/plat-s5p/Kconfig
@@ -42,14 +42,6 @@ config S5P_HRT
 	help
 	  Use the High Resolution timer support
 
-comment "System MMU"
-
-config S5P_SYSTEM_MMU
-	bool "S5P SYSTEM MMU"
-	depends on ARCH_EXYNOS4
-	help
-	  Say Y here if you want to enable System MMU
-
 config S5P_DEV_FIMC0
 	bool
 	help
@@ -89,3 +81,15 @@ config S5P_SETUP_MIPIPHY
 	bool
 	help
 	  Compile in common setup code for MIPI-CSIS and MIPI-DSIM devices
+
+comment "System MMU"
+
+config IOMMU_API
+	bool
+
+config S5P_SYSTEM_MMU
+	bool "S5P SYSTEM MMU"
+	depends on ARCH_EXYNOS4
+	select IOMMU_API
+	help
+	  Say Y here if you want to enable System MMU
diff --git a/arch/arm/plat-s5p/include/plat/sysmmu.h b/arch/arm/plat-s5p/include/plat/sysmmu.h
dissimilarity index 83%
index bf5283c..ee9e6d0 100644
--- a/arch/arm/plat-s5p/include/plat/sysmmu.h
+++ b/arch/arm/plat-s5p/include/plat/sysmmu.h
@@ -1,95 +1,146 @@
-/* linux/arch/arm/plat-s5p/include/plat/sysmmu.h
- *
- * Copyright (c) 2010-2011 Samsung Electronics Co., Ltd.
- *		http://www.samsung.com
- *
- * Samsung System MMU driver for S5P platform
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
-*/
-
-#ifndef __ASM__PLAT_SYSMMU_H
-#define __ASM__PLAT_SYSMMU_H __FILE__
-
-enum S5P_SYSMMU_INTERRUPT_TYPE {
-	SYSMMU_PAGEFAULT,
-	SYSMMU_AR_MULTIHIT,
-	SYSMMU_AW_MULTIHIT,
-	SYSMMU_BUSERROR,
-	SYSMMU_AR_SECURITY,
-	SYSMMU_AR_ACCESS,
-	SYSMMU_AW_SECURITY,
-	SYSMMU_AW_PROTECTION, /* 7 */
-	SYSMMU_FAULTS_NUM
-};
-
-#ifdef CONFIG_S5P_SYSTEM_MMU
-
-#include <mach/sysmmu.h>
-
-/**
- * s5p_sysmmu_enable() - enable system mmu of ip
- * @ips: The ip connected system mmu.
- * #pgd: Base physical address of the 1st level page table
- *
- * This function enable system mmu to transfer address
- * from virtual address to physical address
- */
-void s5p_sysmmu_enable(sysmmu_ips ips, unsigned long pgd);
-
-/**
- * s5p_sysmmu_disable() - disable sysmmu mmu of ip
- * @ips: The ip connected system mmu.
- *
- * This function disable system mmu to transfer address
- * from virtual address to physical address
- */
-void s5p_sysmmu_disable(sysmmu_ips ips);
-
-/**
- * s5p_sysmmu_set_tablebase_pgd() - set page table base address to refer page table
- * @ips: The ip connected system mmu.
- * @pgd: The page table base address.
- *
- * This function set page table base address
- * When system mmu transfer address from virtaul address to physical address,
- * system mmu refer address information from page table
- */
-void s5p_sysmmu_set_tablebase_pgd(sysmmu_ips ips, unsigned long pgd);
-
-/**
- * s5p_sysmmu_tlb_invalidate() - flush all TLB entry in system mmu
- * @ips: The ip connected system mmu.
- *
- * This function flush all TLB entry in system mmu
- */
-void s5p_sysmmu_tlb_invalidate(sysmmu_ips ips);
-
-/** s5p_sysmmu_set_fault_handler() - Fault handler for System MMUs
- * @itype: type of fault.
- * @pgtable_base: the physical address of page table base. This is 0 if @ips is
- *               SYSMMU_BUSERROR.
- * @fault_addr: the device (virtual) address that the System MMU tried to
- *             translated. This is 0 if @ips is SYSMMU_BUSERROR.
- * Called when interrupt occurred by the System MMUs
- * The device drivers of peripheral devices that has a System MMU can implement
- * a fault handler to resolve address translation fault by System MMU.
- * The meanings of return value and parameters are described below.
-
- * return value: non-zero if the fault is correctly resolved.
- *         zero if the fault is not handled.
- */
-void s5p_sysmmu_set_fault_handler(sysmmu_ips ips,
-			int (*handler)(enum S5P_SYSMMU_INTERRUPT_TYPE itype,
-					unsigned long pgtable_base,
-					unsigned long fault_addr));
-#else
-#define s5p_sysmmu_enable(ips, pgd) do { } while (0)
-#define s5p_sysmmu_disable(ips) do { } while (0)
-#define s5p_sysmmu_set_tablebase_pgd(ips, pgd) do { } while (0)
-#define s5p_sysmmu_tlb_invalidate(ips) do { } while (0)
-#define s5p_sysmmu_set_fault_handler(ips, handler) do { } while (0)
-#endif
-#endif /* __ASM_PLAT_SYSMMU_H */
+/* linux/arch/arm/plat-s5p/include/plat/sysmmu.h
+ *
+ * Copyright (c) 2010-2011 Samsung Electronics Co., Ltd.
+ *		http://www.samsung.com
+ * Author: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
+ *
+ * Samsung System MMU driver for S5P platform
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+*/
+
+#ifndef __ASM__PLAT_SYSMMU_H
+#define __ASM__PLAT_SYSMMU_H __FILE__
+
+struct device;
+struct iommu_domain;
+
+/**
+ * enum s5p_sysmmu_ip - integrated peripheral identifiers
+ * @S5P_SYSMMU_MDMA:	MDMA
+ * @S5P_SYSMMU_SSS:	SSS
+ * @S5P_SYSMMU_FIMC0:	FIMC0
+ * @S5P_SYSMMU_FIMC1:	FIMC1
+ * @S5P_SYSMMU_FIMC2:	FIMC2
+ * @S5P_SYSMMU_FIMC3:	FIMC3
+ * @S5P_SYSMMU_JPEG:	JPEG
+ * @S5P_SYSMMU_FIMD0:	FIMD0
+ * @S5P_SYSMMU_FIMD1:	FIMD1
+ * @S5P_SYSMMU_PCIe:	PCIe
+ * @S5P_SYSMMU_G2D:	G2D
+ * @S5P_SYSMMU_ROTATOR:	ROTATOR
+ * @S5P_SYSMMU_MDMA2:	MDMA2
+ * @S5P_SYSMMU_TV:	TV
+ * @S5P_SYSMMU_MFC_L:	MFC_L
+ * @S5P_SYSMMU_MFC_R:	MFC_R
+ */
+enum s5p_sysmmu_ip {
+	S5P_SYSMMU_MDMA,
+	S5P_SYSMMU_SSS,
+	S5P_SYSMMU_FIMC0,
+	S5P_SYSMMU_FIMC1,
+	S5P_SYSMMU_FIMC2,
+	S5P_SYSMMU_FIMC3,
+	S5P_SYSMMU_JPEG,
+	S5P_SYSMMU_FIMD0,
+	S5P_SYSMMU_FIMD1,
+	S5P_SYSMMU_PCIe,
+	S5P_SYSMMU_G2D,
+	S5P_SYSMMU_ROTATOR,
+	S5P_SYSMMU_MDMA2,
+	S5P_SYSMMU_TV,
+	S5P_SYSMMU_MFC_L,
+	S5P_SYSMMU_MFC_R,
+	S5P_SYSMMU_TOTAL_IP_NUM,
+};
+
+/**
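+ * enum s5p_sysmmu_fault - reason for the raised sysmmu irq
+ * @S5P_SYSMMU_PAGE_FAULT:	page fault
+ * @S5P_SYSMMU_AR_FAULT:	ar multi-hit fault
+ * @S5P_SYSMMU_AW_FAULT:	aw multi-hit fault
+ * @S5P_SYSMMU_BUS_ERROR:	bus error
+ * @S5P_SYSMMU_AR_SECURITY:	ar security protection fault
+ * @S5P_SYSMMU_AR_PROT:	ar access protection fault
+ * @S5P_SYSMMU_AW_SECURITY:	aw security protection fault
+ * @S5P_SYSMMU_AW_PROT:	aw access protection fault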
+ */
+enum s5p_sysmmu_fault {
+	S5P_SYSMMU_PAGE_FAULT,
+	S5P_SYSMMU_AR_FAULT,
+	S5P_SYSMMU_AW_FAULT,
+	S5P_SYSMMU_BUS_ERROR,
+	S5P_SYSMMU_AR_SECURITY,
+	S5P_SYSMMU_AR_PROT,
+	S5P_SYSMMU_AW_SECURITY,
+	S5P_SYSMMU_AW_PROT,
+};
+
+/**
+ * enum s5p_sysmmu_tlb_policy - policy of using the tlb
+ * @S5P_SYSMMU_TLB_RR:	round robin policy
+ * @S5P_SYSMMU_TLB_LRU: least recently used policy
+ */
+enum s5p_sysmmu_tlb_policy {
+	S5P_SYSMMU_TLB_RR,
+	S5P_SYSMMU_TLB_LRU,
+};
+
+#define S5P_IRQ_CB(name) \
+	void (*name)(struct iommu_domain *domain, int reason, \
+		     unsigned long addr, void *prv)
+
+/**
+ * struct s5p_sysmmu_irq_callb - callback operations for irq routine
+ * @page_fault:	called when page fault occurs
+ * @ar_fault:	called when ar multi-hit fault occurs
+ * @aw_fault:	called when aw multi-hit fault occurs
+ * @bus_error:	called when bus error occurs
+ * @ar_security: called when ar security protection fault occurs
+ * @ar_prot:	called when ar access protection fault occurs
+ * @aw_security: called when aw security protection fault occurs
+ * @aw_prot:	called when aw access protection fault occurs
+ */
+struct s5p_sysmmu_irq_callb {
+	S5P_IRQ_CB(page_fault);
+	S5P_IRQ_CB(ar_fault);
+	S5P_IRQ_CB(aw_fault);
+	S5P_IRQ_CB(bus_error);
+	S5P_IRQ_CB(ar_security);
+	S5P_IRQ_CB(ar_prot);
+	S5P_IRQ_CB(aw_security);
+	S5P_IRQ_CB(aw_prot);
+};
+
+/**
+ * s5p_sysmmu_get() - get sysmmu device instance
+ * @ip:		integrated peripheral identifier of the device
+ */
+struct device *s5p_sysmmu_get(enum s5p_sysmmu_ip ip);
+
+/**
+ * s5p_sysmmu_put() - release sysmmu handle for a device
+ * @dev:	sysmmu handle obtained from s5p_sysmmu_get()
+ */
+void s5p_sysmmu_put(void *dev);
+
+/**
+ * s5p_sysmmu_domain_irq_callb() - set non-default per-domain ops to be called
+ * from irq handling routine
+ * @domain:	iommu domain for which to set the ops
+ * @ops:	non-default operations to be set
+ * @priv:	private data to be passed to the op when it is called
+ */
+void s5p_sysmmu_domain_irq_callb(struct iommu_domain *domain,
+			    struct s5p_sysmmu_irq_callb *ops, void *priv);
+
+/**
+ * s5p_sysmmu_domain_tlb_policy() - set per-domain tlb policy
+ * @domain:	iommu domain for which to set the tlb policy
+ * @policy:	enum s5p_sysmmu_tlb_policy value (0 round robin, 1 lru)
+ */
+void s5p_sysmmu_domain_tlb_policy(struct iommu_domain *domain, int policy);
+
+#endif /* __ASM_PLAT_SYSMMU_H */
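
Note on the callback interface declared above: a client fills in only the ops it cares about and registers them per domain with s5p_sysmmu_domain_irq_callb(), together with a private pointer. The userspace sketch below copies the enum, the S5P_IRQ_CB() macro and the ops structure from the new header and shows one plausible way to pick an op by fault reason. demo_stats, demo_page_fault() and demo_dispatch() are hypothetical names introduced for illustration; the dispatch done by the real driver in sysmmu.c may differ.

```c
#include <assert.h>
#include <stddef.h>

struct iommu_domain;	/* opaque here, as in the header */

/* Copied from the new plat/sysmmu.h so the sketch builds on its own. */
enum s5p_sysmmu_fault {
	S5P_SYSMMU_PAGE_FAULT,
	S5P_SYSMMU_AR_FAULT,
	S5P_SYSMMU_AW_FAULT,
	S5P_SYSMMU_BUS_ERROR,
	S5P_SYSMMU_AR_SECURITY,
	S5P_SYSMMU_AR_PROT,
	S5P_SYSMMU_AW_SECURITY,
	S5P_SYSMMU_AW_PROT,
};

#define S5P_IRQ_CB(name) \
	void (*name)(struct iommu_domain *domain, int reason, \
		     unsigned long addr, void *prv)

struct s5p_sysmmu_irq_callb {
	S5P_IRQ_CB(page_fault);
	S5P_IRQ_CB(ar_fault);
	S5P_IRQ_CB(aw_fault);
	S5P_IRQ_CB(bus_error);
	S5P_IRQ_CB(ar_security);
	S5P_IRQ_CB(ar_prot);
	S5P_IRQ_CB(aw_security);
	S5P_IRQ_CB(aw_prot);
};

/* Hypothetical client state handed back through the priv pointer. */
struct demo_stats {
	int page_faults;
	unsigned long last_addr;
};

/* Hypothetical client op: count page faults, remember the last address. */
void demo_page_fault(struct iommu_domain *domain, int reason,
		     unsigned long addr, void *prv)
{
	struct demo_stats *stats = prv;

	(void)domain;
	assert(reason == S5P_SYSMMU_PAGE_FAULT);
	stats->page_faults++;
	stats->last_addr = addr;
}

/*
 * Hypothetical dispatcher: run the op matching @reason if the client set
 * one. Returns 1 if an op ran and 0 otherwise.
 */
int demo_dispatch(const struct s5p_sysmmu_irq_callb *ops, int reason,
		  unsigned long addr, void *priv)
{
	S5P_IRQ_CB(cb) = NULL;

	switch (reason) {
	case S5P_SYSMMU_PAGE_FAULT:	cb = ops->page_fault;	break;
	case S5P_SYSMMU_AR_FAULT:	cb = ops->ar_fault;	break;
	case S5P_SYSMMU_AW_FAULT:	cb = ops->aw_fault;	break;
	case S5P_SYSMMU_BUS_ERROR:	cb = ops->bus_error;	break;
	case S5P_SYSMMU_AR_SECURITY:	cb = ops->ar_security;	break;
	case S5P_SYSMMU_AR_PROT:	cb = ops->ar_prot;	break;
	case S5P_SYSMMU_AW_SECURITY:	cb = ops->aw_security;	break;
	case S5P_SYSMMU_AW_PROT:	cb = ops->aw_prot;	break;
	}
	if (!cb)
		return 0;
	cb(NULL, reason, addr, priv);	/* no real domain in this sketch */
	return 1;
}
```

A client that sets only .page_fault leaves the remaining members NULL, so in this sketch every other fault reason falls through and reports that no op ran.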
diff --git a/arch/arm/plat-s5p/sysmmu.c b/arch/arm/plat-s5p/sysmmu.c
dissimilarity index 87%
index 54f5edd..905bb2b 100644
--- a/arch/arm/plat-s5p/sysmmu.c
+++ b/arch/arm/plat-s5p/sysmmu.c
@@ -1,312 +1,879 @@
-/* linux/arch/arm/plat-s5p/sysmmu.c
- *
- * Copyright (c) 2010 Samsung Electronics Co., Ltd.
- *		http://www.samsung.com
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-
-#include <linux/io.h>
-#include <linux/interrupt.h>
-#include <linux/platform_device.h>
-
-#include <asm/pgtable.h>
-
-#include <mach/map.h>
-#include <mach/regs-sysmmu.h>
-#include <plat/sysmmu.h>
-
-#define CTRL_ENABLE	0x5
-#define CTRL_BLOCK	0x7
-#define CTRL_DISABLE	0x0
-
-static struct device *dev;
-
-static unsigned short fault_reg_offset[SYSMMU_FAULTS_NUM] = {
-	S5P_PAGE_FAULT_ADDR,
-	S5P_AR_FAULT_ADDR,
-	S5P_AW_FAULT_ADDR,
-	S5P_DEFAULT_SLAVE_ADDR,
-	S5P_AR_FAULT_ADDR,
-	S5P_AR_FAULT_ADDR,
-	S5P_AW_FAULT_ADDR,
-	S5P_AW_FAULT_ADDR
-};
-
-static char *sysmmu_fault_name[SYSMMU_FAULTS_NUM] = {
-	"PAGE FAULT",
-	"AR MULTI-HIT FAULT",
-	"AW MULTI-HIT FAULT",
-	"BUS ERROR",
-	"AR SECURITY PROTECTION FAULT",
-	"AR ACCESS PROTECTION FAULT",
-	"AW SECURITY PROTECTION FAULT",
-	"AW ACCESS PROTECTION FAULT"
-};
-
-static int (*fault_handlers[S5P_SYSMMU_TOTAL_IPNUM])(
-		enum S5P_SYSMMU_INTERRUPT_TYPE itype,
-		unsigned long pgtable_base,
-		unsigned long fault_addr);
-
-/*
- * If adjacent 2 bits are true, the system MMU is enabled.
- * The system MMU is disabled, otherwise.
- */
-static unsigned long sysmmu_states;
-
-static inline void set_sysmmu_active(sysmmu_ips ips)
-{
-	sysmmu_states |= 3 << (ips * 2);
-}
-
-static inline void set_sysmmu_inactive(sysmmu_ips ips)
-{
-	sysmmu_states &= ~(3 << (ips * 2));
-}
-
-static inline int is_sysmmu_active(sysmmu_ips ips)
-{
-	return sysmmu_states & (3 << (ips * 2));
-}
-
-static void __iomem *sysmmusfrs[S5P_SYSMMU_TOTAL_IPNUM];
-
-static inline void sysmmu_block(sysmmu_ips ips)
-{
-	__raw_writel(CTRL_BLOCK, sysmmusfrs[ips] + S5P_MMU_CTRL);
-	dev_dbg(dev, "%s is blocked.\n", sysmmu_ips_name[ips]);
-}
-
-static inline void sysmmu_unblock(sysmmu_ips ips)
-{
-	__raw_writel(CTRL_ENABLE, sysmmusfrs[ips] + S5P_MMU_CTRL);
-	dev_dbg(dev, "%s is unblocked.\n", sysmmu_ips_name[ips]);
-}
-
-static inline void __sysmmu_tlb_invalidate(sysmmu_ips ips)
-{
-	__raw_writel(0x1, sysmmusfrs[ips] + S5P_MMU_FLUSH);
-	dev_dbg(dev, "TLB of %s is invalidated.\n", sysmmu_ips_name[ips]);
-}
-
-static inline void __sysmmu_set_ptbase(sysmmu_ips ips, unsigned long pgd)
-{
-	if (unlikely(pgd == 0)) {
-		pgd = (unsigned long)ZERO_PAGE(0);
-		__raw_writel(0x20, sysmmusfrs[ips] + S5P_MMU_CFG); /* 4KB LV1 */
-	} else {
-		__raw_writel(0x0, sysmmusfrs[ips] + S5P_MMU_CFG); /* 16KB LV1 */
-	}
-
-	__raw_writel(pgd, sysmmusfrs[ips] + S5P_PT_BASE_ADDR);
-
-	dev_dbg(dev, "Page table base of %s is initialized with 0x%08lX.\n",
-						sysmmu_ips_name[ips], pgd);
-	__sysmmu_tlb_invalidate(ips);
-}
-
-void sysmmu_set_fault_handler(sysmmu_ips ips,
-			int (*handler)(enum S5P_SYSMMU_INTERRUPT_TYPE itype,
-					unsigned long pgtable_base,
-					unsigned long fault_addr))
-{
-	BUG_ON(!((ips >= SYSMMU_MDMA) && (ips < S5P_SYSMMU_TOTAL_IPNUM)));
-	fault_handlers[ips] = handler;
-}
-
-static irqreturn_t s5p_sysmmu_irq(int irq, void *dev_id)
-{
-	/* SYSMMU is in blocked when interrupt occurred. */
-	unsigned long base = 0;
-	sysmmu_ips ips = (sysmmu_ips)dev_id;
-	enum S5P_SYSMMU_INTERRUPT_TYPE itype;
-
-	itype = (enum S5P_SYSMMU_INTERRUPT_TYPE)
-		__ffs(__raw_readl(sysmmusfrs[ips] + S5P_INT_STATUS));
-
-	BUG_ON(!((itype >= 0) && (itype < 8)));
-
-	dev_alert(dev, "%s occurred by %s.\n", sysmmu_fault_name[itype],
-							sysmmu_ips_name[ips]);
-
-	if (fault_handlers[ips]) {
-		unsigned long addr;
-
-		base = __raw_readl(sysmmusfrs[ips] + S5P_PT_BASE_ADDR);
-		addr = __raw_readl(sysmmusfrs[ips] + fault_reg_offset[itype]);
-
-		if (fault_handlers[ips](itype, base, addr)) {
-			__raw_writel(1 << itype,
-					sysmmusfrs[ips] + S5P_INT_CLEAR);
-			dev_notice(dev, "%s from %s is resolved."
-					" Retrying translation.\n",
-				sysmmu_fault_name[itype], sysmmu_ips_name[ips]);
-		} else {
-			base = 0;
-		}
-	}
-
-	sysmmu_unblock(ips);
-
-	if (!base)
-		dev_notice(dev, "%s from %s is not handled.\n",
-			sysmmu_fault_name[itype], sysmmu_ips_name[ips]);
-
-	return IRQ_HANDLED;
-}
-
-void s5p_sysmmu_set_tablebase_pgd(sysmmu_ips ips, unsigned long pgd)
-{
-	if (is_sysmmu_active(ips)) {
-		sysmmu_block(ips);
-		__sysmmu_set_ptbase(ips, pgd);
-		sysmmu_unblock(ips);
-	} else {
-		dev_dbg(dev, "%s is disabled. "
-			"Skipping initializing page table base.\n",
-						sysmmu_ips_name[ips]);
-	}
-}
-
-void s5p_sysmmu_enable(sysmmu_ips ips, unsigned long pgd)
-{
-	if (!is_sysmmu_active(ips)) {
-		sysmmu_clk_enable(ips);
-
-		__sysmmu_set_ptbase(ips, pgd);
-
-		__raw_writel(CTRL_ENABLE, sysmmusfrs[ips] + S5P_MMU_CTRL);
-
-		set_sysmmu_active(ips);
-		dev_dbg(dev, "%s is enabled.\n", sysmmu_ips_name[ips]);
-	} else {
-		dev_dbg(dev, "%s is already enabled.\n", sysmmu_ips_name[ips]);
-	}
-}
-
-void s5p_sysmmu_disable(sysmmu_ips ips)
-{
-	if (is_sysmmu_active(ips)) {
-		__raw_writel(CTRL_DISABLE, sysmmusfrs[ips] + S5P_MMU_CTRL);
-		set_sysmmu_inactive(ips);
-		sysmmu_clk_disable(ips);
-		dev_dbg(dev, "%s is disabled.\n", sysmmu_ips_name[ips]);
-	} else {
-		dev_dbg(dev, "%s is already disabled.\n", sysmmu_ips_name[ips]);
-	}
-}
-
-void s5p_sysmmu_tlb_invalidate(sysmmu_ips ips)
-{
-	if (is_sysmmu_active(ips)) {
-		sysmmu_block(ips);
-		__sysmmu_tlb_invalidate(ips);
-		sysmmu_unblock(ips);
-	} else {
-		dev_dbg(dev, "%s is disabled. "
-			"Skipping invalidating TLB.\n", sysmmu_ips_name[ips]);
-	}
-}
-
-static int s5p_sysmmu_probe(struct platform_device *pdev)
-{
-	int i, ret;
-	struct resource *res, *mem;
-
-	dev = &pdev->dev;
-
-	for (i = 0; i < S5P_SYSMMU_TOTAL_IPNUM; i++) {
-		int irq;
-
-		sysmmu_clk_init(dev, i);
-		sysmmu_clk_disable(i);
-
-		res = platform_get_resource(pdev, IORESOURCE_MEM, i);
-		if (!res) {
-			dev_err(dev, "Failed to get the resource of %s.\n",
-							sysmmu_ips_name[i]);
-			ret = -ENODEV;
-			goto err_res;
-		}
-
-		mem = request_mem_region(res->start,
-				((res->end) - (res->start)) + 1, pdev->name);
-		if (!mem) {
-			dev_err(dev, "Failed to request the memory region of %s.\n",
-							sysmmu_ips_name[i]);
-			ret = -EBUSY;
-			goto err_res;
-		}
-
-		sysmmusfrs[i] = ioremap(res->start, res->end - res->start + 1);
-		if (!sysmmusfrs[i]) {
-			dev_err(dev, "Failed to ioremap() for %s.\n",
-							sysmmu_ips_name[i]);
-			ret = -ENXIO;
-			goto err_reg;
-		}
-
-		irq = platform_get_irq(pdev, i);
-		if (irq <= 0) {
-			dev_err(dev, "Failed to get the IRQ resource of %s.\n",
-							sysmmu_ips_name[i]);
-			ret = -ENOENT;
-			goto err_map;
-		}
-
-		if (request_irq(irq, s5p_sysmmu_irq, IRQF_DISABLED,
-						pdev->name, (void *)i)) {
-			dev_err(dev, "Failed to request IRQ for %s.\n",
-							sysmmu_ips_name[i]);
-			ret = -ENOENT;
-			goto err_map;
-		}
-	}
-
-	return 0;
-
-err_map:
-	iounmap(sysmmusfrs[i]);
-err_reg:
-	release_mem_region(mem->start, resource_size(mem));
-err_res:
-	return ret;
-}
-
-static int s5p_sysmmu_remove(struct platform_device *pdev)
-{
-	return 0;
-}
-int s5p_sysmmu_runtime_suspend(struct device *dev)
-{
-	return 0;
-}
-
-int s5p_sysmmu_runtime_resume(struct device *dev)
-{
-	return 0;
-}
-
-const struct dev_pm_ops s5p_sysmmu_pm_ops = {
-	.runtime_suspend	= s5p_sysmmu_runtime_suspend,
-	.runtime_resume		= s5p_sysmmu_runtime_resume,
-};
-
-static struct platform_driver s5p_sysmmu_driver = {
-	.probe		= s5p_sysmmu_probe,
-	.remove		= s5p_sysmmu_remove,
-	.driver		= {
-		.owner		= THIS_MODULE,
-		.name		= "s5p-sysmmu",
-		.pm		= &s5p_sysmmu_pm_ops,
-	}
-};
-
-static int __init s5p_sysmmu_init(void)
-{
-	return platform_driver_register(&s5p_sysmmu_driver);
-}
-arch_initcall(s5p_sysmmu_init);
+/* linux/arch/arm/plat-s5p/sysmmu.c
+ *
+ * Copyright (c) 2010-2011 Samsung Electronics Co., Ltd.
+ *		http://www.samsung.com
+ *
+ * Author: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/gfp.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/spinlock.h>
+#include <linux/mm.h>
+#include <linux/pagemap.h>
+#include <linux/module.h>
+#include <linux/clk.h>
+#include <linux/pm_runtime.h>
+#include <linux/iommu.h>
+
+#include <asm/memory.h>
+
+#include <plat/irqs.h>
+#include <plat/devs.h>
+#include <plat/cpu.h>
+#include <plat/sysmmu.h>
+
+#include <mach/map.h>
+#include <mach/regs-sysmmu.h>
+
+static int debug;
+module_param(debug, int, 0644);
+
+#define sysmmu_debug(level, fmt, arg...)				 \
+	do {								 \
+		if (debug >= level)					 \
+			printk(KERN_DEBUG "[%s] " fmt, __func__, ## arg);\
+	} while (0)
+
+#define FLPT_ENTRIES		4096
+#define FLPT_4K_64K_MASK	(~0x3FF)
+#define FLPT_1M_MASK		(~0xFFFFF)
+#define FLPT_16M_MASK		(~0xFFFFFF)
+#define SLPT_4K_MASK		(~0xFFF)
+#define SLPT_64K_MASK		(~0xFFFF)
+#define PAGE_4K_64K		0x1
+#define PAGE_1M			0x2
+#define PAGE_16M		0x40002
+#define PAGE_4K			0x2
+#define PAGE_64K		0x1
+#define FLPT_IDX_SHIFT		20
+#define FLPT_IDX_MASK		0xFFF
+#define FLPT_OFFS_SHIFT		(FLPT_IDX_SHIFT - 2)
+#define FLPT_OFFS_MASK		(FLPT_IDX_MASK << 2)
+#define SLPT_IDX_SHIFT		12
+#define SLPT_IDX_MASK		0xFF
+#define SLPT_OFFS_SHIFT		(SLPT_IDX_SHIFT - 2)
+#define SLPT_OFFS_MASK		(SLPT_IDX_MASK << 2)
+
+#define deref_va(va)		(*((unsigned long *)(va)))
+
+#define generic_extract(l, s, entry) \
+				((entry) & l##LPT_##s##_MASK)
+#define flpt_get_1m(entry)	generic_extract(F, 1M, deref_va(entry))
+#define flpt_get_16m(entry)	generic_extract(F, 16M, deref_va(entry))
+#define slpt_get_4k(entry)	generic_extract(S, 4K, deref_va(entry))
+#define slpt_get_64k(entry)	generic_extract(S, 64K, deref_va(entry))
+
+#define generic_entry(l, s, entry) \
+				(generic_extract(l, s, entry)  | PAGE_##s)
+#define flpt_ent_4k_64k(entry)	generic_entry(F, 4K_64K, entry)
+#define flpt_ent_1m(entry)	generic_entry(F, 1M, entry)
+#define flpt_ent_16m(entry)	generic_entry(F, 16M, entry)
+#define slpt_ent_4k(entry)	generic_entry(S, 4K, entry)
+#define slpt_ent_64k(entry)	generic_entry(S, 64K, entry)
+
+#define page_4k_64k(entry)	(deref_va(entry) & PAGE_4K_64K)
+#define page_1m(entry)		(deref_va(entry) & PAGE_1M)
+#define page_16m(entry)		((deref_va(entry) & PAGE_16M) == PAGE_16M)
+#define page_4k(entry)		(deref_va(entry) & PAGE_4K)
+#define page_64k(entry)		(deref_va(entry) & PAGE_64K)
+
+#define generic_pg_offs(l, s, va) \
+				(va & ~l##LPT_##s##_MASK)
+#define pg_offs_1m(va)		generic_pg_offs(F, 1M, va)
+#define pg_offs_16m(va)		generic_pg_offs(F, 16M, va)
+#define pg_offs_4k(va)		generic_pg_offs(S, 4K, va)
+#define pg_offs_64k(va)		generic_pg_offs(S, 64K, va)
+
+#define flpt_index(va)		(((va) >> FLPT_IDX_SHIFT) & FLPT_IDX_MASK)
+
+#define generic_offset(l, va)	(((va) >> l##LPT_OFFS_SHIFT) & l##LPT_OFFS_MASK)
+#define flpt_offs(va)		generic_offset(F, va)
+#define slpt_offs(va)		generic_offset(S, va)
+
+#define invalidate_slpt_ent(slpt_va) (deref_va(slpt_va) = 0UL)
+
+#define get_irq_callb(cb) \
+				(s5p_domain->irq_callb ? \
+					(s5p_domain->irq_callb->cb ? \
+					s5p_domain->irq_callb->cb : \
+					s5p_sysmmu_irq_callb.cb) \
+				: s5p_sysmmu_irq_callb.cb)
+
+struct s5p_sysmmu_info {
+	struct resource			*ioarea;
+	void __iomem			*regs;
+	unsigned int			irq;
+	struct clk			*clk;
+	bool				enabled;
+	enum s5p_sysmmu_ip		ip;
+	struct device			*dev;
+	struct iommu_domain		*domain;
+};
+
+/*
+ * iommu domain is a virtual address space of an I/O device driver.
+ * It contains kernel virtual and physical addresses of the first level
+ * page table and owns the memory in which the page tables are stored.
+ * It contains a table of kernel virtual addresses of second level
+ * page tables.
+ *
+ * In order to be used the iommu domain must be bound to an iommu device.
+ * This is accomplished with s5p_sysmmu_attach_dev, which is called through
+ * s5p_sysmmu_ops by drivers/base/iommu.c.
+ */
+struct s5p_sysmmu_domain {
+	unsigned long			flpt;
+	void				*flpt_va;
+	void				**slpt_va;
+	unsigned short			*refcount;
+	struct s5p_sysmmu_info		*sysmmu;
+	struct s5p_sysmmu_irq_callb	*irq_callb;
+	void				*irq_callb_priv;
+	int				policy;
+};
+
+static struct s5p_sysmmu_info *sysmmu_table[S5P_SYSMMU_TOTAL_IP_NUM];
+static DEFINE_SPINLOCK(sysmmu_slock);
+
+static struct kmem_cache *slpt_cache;
+
+static const char *irq_reasons[] = {
+	"sysmmu irq:page fault",
+	"sysmmu irq:ar multi hit",
+	"sysmmu irq:aw multi hit",
+	"sysmmu irq:bus error",
+	"sysmmu irq:ar security protection fault",
+	"sysmmu irq:ar access protection fault",
+	"sysmmu irq:aw security protection fault",
+	"sysmmu irq:aw access protection fault"
+};
+
+static void flush_cache(const void *start, unsigned long size)
+{
+	dmac_flush_range(start, start + size);
+	outer_flush_range(virt_to_phys(start), virt_to_phys(start + size));
+}
+
+static int s5p_sysmmu_domain_init(struct iommu_domain *domain)
+{
+	struct s5p_sysmmu_domain *s5p_domain;
+
+	s5p_domain = kzalloc(sizeof(struct s5p_sysmmu_domain), GFP_KERNEL);
+	if (!s5p_domain) {
+		sysmmu_debug(3, "no memory for state\n");
+		return -ENOMEM;
+	}
+	domain->priv = s5p_domain;
+
+	/*
+	 * first-level page table holds
+	 * 4k second-level descriptors == 16kB == 4 pages
+	 */
+	s5p_domain->flpt_va = kzalloc(FLPT_ENTRIES * sizeof(unsigned long),
+					 GFP_KERNEL);
+	if (!s5p_domain->flpt_va)
+		return -ENOMEM;
+	s5p_domain->flpt = virt_to_phys(s5p_domain->flpt_va);
+
+	s5p_domain->refcount = kzalloc(FLPT_ENTRIES * sizeof(u16), GFP_KERNEL);
+	if (!s5p_domain->refcount) {
+		kfree(s5p_domain->flpt_va);
+		return -ENOMEM;
+	}
+
+	s5p_domain->slpt_va = kzalloc(FLPT_ENTRIES * sizeof(void *),
+				      GFP_KERNEL);
+	if (!s5p_domain->slpt_va) {
+		kfree(s5p_domain->refcount);
+		kfree(s5p_domain->flpt_va);
+		return -ENOMEM;
+	}
+	flush_cache(s5p_domain->flpt_va, 4 * PAGE_SIZE);
+	return 0;
+}
+
+static void s5p_sysmmu_domain_destroy(struct iommu_domain *domain)
+{
+	struct s5p_sysmmu_domain *s5p_domain = domain->priv;
+	int i;
+	for (i = FLPT_ENTRIES - 1; i >= 0; --i)
+		if (s5p_domain->refcount[i])
+			kmem_cache_free(slpt_cache, s5p_domain->slpt_va[i]);
+
+	kfree(s5p_domain->slpt_va);
+	kfree(s5p_domain->refcount);
+	kfree(s5p_domain->flpt_va);
+	kfree(domain->priv);
+	domain->priv = NULL;
+}
+
+static int s5p_sysmmu_attach_dev(struct iommu_domain *domain,
+				 struct device *dev)
+{
+	struct platform_device *pdev =
+		container_of(dev, struct platform_device, dev);
+	struct s5p_sysmmu_info *sysmmu = platform_get_drvdata(pdev);
+	struct s5p_sysmmu_domain *s5p_domain = domain->priv;
+	unsigned int reg;
+
+	s5p_domain->sysmmu = sysmmu;
+	sysmmu->domain = domain;
+
+	pm_runtime_get_sync(sysmmu->dev);
+	clk_enable(sysmmu->clk);
+
+	/* configure first level page table base address */
+	writel(s5p_domain->flpt, sysmmu->regs + S5P_PT_BASE_ADDR);
+
+	reg = readl(sysmmu->regs + S5P_MMU_CFG);
+	if (s5p_domain->policy)
+		reg |= (0x1<<0);		/* replacement policy : LRU */
+	else
+		reg &= ~(0x1<<0);		/* replacement policy: RR */
+	writel(reg, sysmmu->regs + S5P_MMU_CFG);
+
+	reg = readl(sysmmu->regs + S5P_MMU_CTRL);
+	reg |= ((0x1<<2)|(0x1<<0));	/* Enable interrupt, Enable MMU */
+	writel(reg, sysmmu->regs + S5P_MMU_CTRL);
+
+	sysmmu->enabled = true;
+
+	return 0;
+}
+
+static void s5p_sysmmu_detach_dev(struct iommu_domain *domain,
+				  struct device *dev)
+{
+	struct platform_device *pdev =
+		container_of(dev, struct platform_device, dev);
+	struct s5p_sysmmu_info *sysmmu = platform_get_drvdata(pdev);
+	struct s5p_sysmmu_domain *s5p_domain = domain->priv;
+	unsigned int reg;
+
+	/* SYSMMU disable */
+	reg = readl(sysmmu->regs + S5P_MMU_CFG);
+	reg |= (0x1<<0);		/* replacement policy : LRU */
+	writel(reg, sysmmu->regs + S5P_MMU_CFG);
+
+	reg = readl(sysmmu->regs + S5P_MMU_CTRL);
+	reg &= ~(0x1);			/* Disable MMU */
+	writel(reg, sysmmu->regs + S5P_MMU_CTRL);
+
+	sysmmu->enabled = false;
+
+	clk_disable(sysmmu->clk);
+	pm_runtime_put_sync(sysmmu->dev);
+
+	sysmmu->domain = NULL;
+	s5p_domain->sysmmu = NULL;
+}
+
+#define bug_mapping_prohibited(iova, len) \
+		s5p_mapping_prohibited_impl(iova, len, __FILE__, __LINE__)
+
+static void s5p_mapping_prohibited_impl(unsigned long iova, size_t len,
+				   const char *file, int line)
+{
+	sysmmu_debug(3, "%s:%d Attempting to map %zu@0x%lx over existing "
+		     "mapping\n", file, line, len, iova);
+	BUG();
+}
+
+/*
+ * Map an area of length corresponding to gfp_order, starting at iova.
+ * gfp_order is an order of units of 4kB: 0 -> 1 unit, 1 -> 2 units,
+ * 2 -> 4 units, 3 -> 8 units and so on.
+ *
+ * The act of mapping is all about deciding how to interpret in the MMU the
+ * virtual addresses belonging to the mapped range. Mapping can be done with
+ * 4kB, 64kB, 1MB and 16MB pages, so only orders of 0, 4, 8, 12 are valid.
+ *
+ * iova must be aligned on a 4kB, 64kB, 1MB or 16MB boundary, respectively.
+ */
+static int s5p_sysmmu_map(struct iommu_domain *domain, unsigned long iova,
+			  phys_addr_t paddr, int gfp_order, int prot)
+{
+	struct s5p_sysmmu_domain *s5p_domain = domain->priv;
+	int flpt_idx = flpt_index(iova);
+	size_t len = 0x1000UL << gfp_order;
+	void *flpt_va, *slpt_va;
+
+	if (len != SZ_16M && len != SZ_1M && len != SZ_64K && len != SZ_4K) {
+		sysmmu_debug(3, "bad order: %d\n", gfp_order);
+		return -EINVAL;
+	}
+
+	flpt_va = s5p_domain->flpt_va + flpt_offs(iova);
+
+	if (SZ_1M == len) {
+		if (deref_va(flpt_va))
+			bug_mapping_prohibited(iova, len);
+		deref_va(flpt_va) = flpt_ent_1m(paddr);
+		flush_cache(flpt_va, 4); /* one 4-byte entry */
+
+		return 0;
+	} else if (SZ_16M == len) {
+		int i = 0;
+		/* first loop to verify mapping allowed */
+		for (i = 0; i < 16; ++i)
+			if (deref_va(flpt_va + 4 * i))
+				bug_mapping_prohibited(iova, len);
+		/* actually map only if allowed */
+		for (i = 0; i < 16; ++i)
+			deref_va(flpt_va + 4 * i) = flpt_ent_16m(paddr);
+		flush_cache(flpt_va, 4 * 16); /* 16 4-byte entries */
+
+		return 0;
+	}
+
+	/* for 4K and 64K pages only */
+	if (page_1m(flpt_va) || page_16m(flpt_va))
+		bug_mapping_prohibited(iova, len);
+
+	/* need to allocate a new second level page table */
+	if (0 == deref_va(flpt_va)) {
+		void *slpt = kmem_cache_zalloc(slpt_cache, GFP_KERNEL);
+		if (!slpt) {
+			sysmmu_debug(3, "cannot allocate slpt\n");
+			return -ENOMEM;
+		}
+
+		s5p_domain->slpt_va[flpt_idx] = slpt;
+		deref_va(flpt_va) = flpt_ent_4k_64k(virt_to_phys(slpt));
+		flush_cache(flpt_va, 4);
+	}
+	slpt_va = s5p_domain->slpt_va[flpt_idx] + slpt_offs(iova);
+
+	if (SZ_4K == len) {
+		if (deref_va(slpt_va))
+			bug_mapping_prohibited(iova, len);
+		deref_va(slpt_va) = slpt_ent_4k(paddr);
+		flush_cache(slpt_va, 4); /* one 4-byte entry */
+		s5p_domain->refcount[flpt_idx]++;
+	} else {
+		int i;
+		/* first loop to verify mapping allowed */
+		for (i = 0; i < 16; ++i)
+			if (deref_va(slpt_va + 4 * i))
+				bug_mapping_prohibited(iova, len);
+		/* actually map only if allowed */
+		for (i = 0; i < 16; ++i) {
+			deref_va(slpt_va + 4 * i) = slpt_ent_64k(paddr);
+			s5p_domain->refcount[flpt_idx]++;
+		}
+		flush_cache(slpt_va, 4 * 16); /* 16 4-byte entries */
+	}
+
+	return 0;
+}
+
+static void s5p_tlb_invalidate(struct s5p_sysmmu_domain *domain)
+{
+	unsigned int reg;
+	void __iomem *regs;
+
+	if (!domain->sysmmu)
+		return;
+
+	regs = domain->sysmmu->regs;
+
+	/* TLB invalidate */
+	reg = readl(regs + S5P_MMU_CTRL);
+	reg |= (0x1<<1);		/* Block MMU */
+	writel(reg, regs + S5P_MMU_CTRL);
+
+	writel(0x1, regs + S5P_MMU_FLUSH);
+					/* Flush_entry */
+
+	reg = readl(regs + S5P_MMU_CTRL);
+	reg &= ~(0x1<<1);		/* Un-block MMU */
+	writel(reg, regs + S5P_MMU_CTRL);
+}
+
+#define bug_unmapping_prohibited(iova, len) \
+		s5p_unmapping_prohibited_impl(iova, len, __FILE__, __LINE__)
+
+static void s5p_unmapping_prohibited_impl(unsigned long iova, size_t len,
+				     const char *file, int line)
+{
+	sysmmu_debug(3, "%s:%d Attempting to unmap different size or "
+		     "non-existing mapping %zu@0x%lx\n", file, line, len, iova);
+	BUG();
+}
+
+static int s5p_sysmmu_unmap(struct iommu_domain *domain, unsigned long iova,
+			    int gfp_order)
+{
+	struct s5p_sysmmu_domain *s5p_domain = domain->priv;
+	int flpt_idx = flpt_index(iova);
+	size_t len = 0x1000UL << gfp_order;
+	void *flpt_va, *slpt_va;
+
+	if (len != SZ_16M && len != SZ_1M && len != SZ_64K && len != SZ_4K) {
+		sysmmu_debug(3, "bad order: %d\n", gfp_order);
+		return -EINVAL;
+	}
+
+	flpt_va = s5p_domain->flpt_va + flpt_offs(iova);
+
+	/* check if there is any mapping at all */
+	if (!deref_va(flpt_va))
+		bug_unmapping_prohibited(iova, len);
+
+	if (SZ_1M == len) {
+		if (!page_1m(flpt_va))
+			bug_unmapping_prohibited(iova, len);
+		deref_va(flpt_va) = 0;
+		flush_cache(flpt_va, 4); /* one 4-byte entry */
+		s5p_tlb_invalidate(s5p_domain);
+
+		return 0;
+	} else if (SZ_16M == len) {
+		int i;
+		/* first loop to verify it actually is 16M mapping */
+		for (i = 0; i < 16; ++i)
+			if (!page_16m(flpt_va + 4 * i))
+				bug_unmapping_prohibited(iova, len);
+		/* actually unmap */
+		for (i = 0; i < 16; ++i)
+			deref_va(flpt_va + 4 * i) = 0;
+		flush_cache(flpt_va, 4 * 16); /* 16 4-byte entries */
+		s5p_tlb_invalidate(s5p_domain);
+
+		return 0;
+	}
+
+	if (!page_4k_64k(flpt_va))
+		bug_unmapping_prohibited(iova, len);
+
+	slpt_va = s5p_domain->slpt_va[flpt_idx] + slpt_offs(iova);
+
+	/* verify that we attempt to unmap a matching mapping */
+	if (SZ_4K == len) {
+		if (!page_4k(slpt_va))
+			bug_unmapping_prohibited(iova, len);
+	} else if (SZ_64K == len) {
+		int i;
+		for (i = 0; i < 16; ++i)
+			if (!page_64k(slpt_va + 4 * i))
+				bug_unmapping_prohibited(iova, len);
+	}
+
+	if (SZ_64K == len)
+		s5p_domain->refcount[flpt_idx] -= 15;
+
+	if (--s5p_domain->refcount[flpt_idx]) {
+		if (SZ_4K == len) {
+			invalidate_slpt_ent(slpt_va);
+			flush_cache(slpt_va, 4);
+		} else {
+			int i;
+			for (i = 0; i < 16; ++i)
+				invalidate_slpt_ent(slpt_va + 4 * i);
+			flush_cache(slpt_va, 4 * 16);
+		}
+	} else {
+		kmem_cache_free(slpt_cache, s5p_domain->slpt_va[flpt_idx]);
+		s5p_domain->slpt_va[flpt_idx] = 0;
+		memset(flpt_va, 0, 4);
+		flush_cache(flpt_va, 4);
+	}
+
+	s5p_tlb_invalidate(s5p_domain);
+
+	return 0;
+}
+
+phys_addr_t s5p_iova_to_phys(struct iommu_domain *domain, unsigned long iova)
+{
+	struct s5p_sysmmu_domain *s5p_domain = domain->priv;
+	int flpt_idx = flpt_index(iova);
+	unsigned long flpt_va, slpt_va;
+
+	flpt_va = (unsigned long)s5p_domain->flpt_va + flpt_offs(iova);
+
+	if (!deref_va(flpt_va))
+		return 0;
+
+	if (page_16m(flpt_va))
+		return flpt_get_16m(flpt_va) | pg_offs_16m(iova);
+	else if (page_1m(flpt_va))
+		return flpt_get_1m(flpt_va) | pg_offs_1m(iova);
+
+	if (!page_4k_64k(flpt_va))
+		return 0;
+
+	slpt_va = (unsigned long)s5p_domain->slpt_va[flpt_idx] +
+		  slpt_offs(iova);
+
+	if (!deref_va(slpt_va))
+		return 0;
+
+	if (page_4k(slpt_va))
+		return slpt_get_4k(slpt_va) | pg_offs_4k(iova);
+	else if (page_64k(slpt_va))
+		return slpt_get_64k(slpt_va) | pg_offs_64k(iova);
+
+	return 0;
+}
+
+static struct iommu_ops s5p_sysmmu_ops = {
+	.domain_init = s5p_sysmmu_domain_init,
+	.domain_destroy = s5p_sysmmu_domain_destroy,
+	.attach_dev = s5p_sysmmu_attach_dev,
+	.detach_dev = s5p_sysmmu_detach_dev,
+	.map = s5p_sysmmu_map,
+	.unmap = s5p_sysmmu_unmap,
+	.iova_to_phys = s5p_iova_to_phys,
+};
+
+struct device *s5p_sysmmu_get(enum s5p_sysmmu_ip ip)
+{
+	struct device *ret = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&sysmmu_slock, flags);
+	if (sysmmu_table[ip]) {
+		try_module_get(THIS_MODULE);
+		ret = sysmmu_table[ip]->dev;
+	}
+	spin_unlock_irqrestore(&sysmmu_slock, flags);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(s5p_sysmmu_get);
+
+void s5p_sysmmu_put(void *dev)
+{
+	BUG_ON(!dev);
+	module_put(THIS_MODULE);
+}
+EXPORT_SYMBOL_GPL(s5p_sysmmu_put);
+
+void s5p_sysmmu_domain_irq_callb(struct iommu_domain *domain,
+			    struct s5p_sysmmu_irq_callb *ops, void *priv)
+{
+	struct s5p_sysmmu_domain *s5p_domain = domain->priv;
+	s5p_domain->irq_callb = ops;
+	s5p_domain->irq_callb_priv = priv;
+}
+EXPORT_SYMBOL(s5p_sysmmu_domain_irq_callb);
+
+
+void s5p_sysmmu_domain_tlb_policy(struct iommu_domain *domain, int policy)
+{
+	struct s5p_sysmmu_domain *s5p_domain = domain->priv;
+	s5p_domain->policy = policy;
+}
+EXPORT_SYMBOL(s5p_sysmmu_domain_tlb_policy);
+
+static void s5p_sysmmu_irq_page_fault(struct iommu_domain *domain, int reason,
+				      unsigned long addr, void *priv)
+{
+	sysmmu_debug(3, "%s: Faulting virtual address: 0x%08lx\n",
+		     irq_reasons[reason], addr);
+	BUG();
+}
+
+static void s5p_sysmmu_irq_generic_callb(struct iommu_domain *domain,
+					 int reason, unsigned long addr,
+					 void *priv)
+{
+	sysmmu_debug(3, "%s\n", irq_reasons[reason]);
+	BUG();
+}
+
+static struct s5p_sysmmu_irq_callb s5p_sysmmu_irq_callb = {
+	.page_fault = s5p_sysmmu_irq_page_fault,
+	.ar_fault = s5p_sysmmu_irq_generic_callb,
+	.aw_fault = s5p_sysmmu_irq_generic_callb,
+	.bus_error = s5p_sysmmu_irq_generic_callb,
+	.ar_security = s5p_sysmmu_irq_generic_callb,
+	.ar_prot = s5p_sysmmu_irq_generic_callb,
+	.aw_security = s5p_sysmmu_irq_generic_callb,
+	.aw_prot = s5p_sysmmu_irq_generic_callb,
+};
+
+static irqreturn_t s5p_sysmmu_irq(int irq, void *dev_id)
+{
+	struct s5p_sysmmu_info *sysmmu = dev_id;
+	struct s5p_sysmmu_domain *s5p_domain = sysmmu->domain->priv;
+	unsigned int reg_INT_STATUS;
+
+	if (false == sysmmu->enabled)
+		return IRQ_HANDLED;
+
+	reg_INT_STATUS = readl(sysmmu->regs + S5P_INT_STATUS);
+	if (reg_INT_STATUS & 0xFF) {
+		S5P_IRQ_CB(cb);
+		enum s5p_sysmmu_fault reason = 0;
+		unsigned long fault = 0;
+		unsigned reg = 0;
+		cb = NULL;
+		switch (reg_INT_STATUS & 0xFF) {
+		case 0x1:
+			cb = get_irq_callb(page_fault);
+			reason = S5P_SYSMMU_PAGE_FAULT;
+			reg = S5P_PAGE_FAULT_ADDR;
+			break;
+		case 0x2:
+			cb = get_irq_callb(ar_fault);
+			reason = S5P_SYSMMU_AR_FAULT;
+			reg = S5P_AR_FAULT_ADDR;
+			break;
+		case 0x4:
+			cb = get_irq_callb(aw_fault);
+			reason = S5P_SYSMMU_AW_FAULT;
+			reg = S5P_AW_FAULT_ADDR;
+			break;
+		case 0x8:
+			cb = get_irq_callb(bus_error);
+			reason = S5P_SYSMMU_BUS_ERROR;
+			/* register common to page fault and bus error */
+			reg = S5P_PAGE_FAULT_ADDR;
+			break;
+		case 0x10:
+			cb = get_irq_callb(ar_security);
+			reason = S5P_SYSMMU_AR_SECURITY;
+			reg = S5P_AR_FAULT_ADDR;
+			break;
+		case 0x20:
+			cb = get_irq_callb(ar_prot);
+			reason = S5P_SYSMMU_AR_PROT;
+			reg = S5P_AR_FAULT_ADDR;
+			break;
+		case 0x40:
+			cb = get_irq_callb(aw_security);
+			reason = S5P_SYSMMU_AW_SECURITY;
+			reg = S5P_AW_FAULT_ADDR;
+			break;
+		case 0x80:
+			cb = get_irq_callb(aw_prot);
+			reason = S5P_SYSMMU_AW_PROT;
+			reg = S5P_AW_FAULT_ADDR;
+			break;
+		}
+		fault = readl(sysmmu->regs + reg);
+		cb(sysmmu->domain, reason, fault, s5p_domain->irq_callb_priv);
+		writel(reg_INT_STATUS, sysmmu->regs + S5P_INT_CLEAR);
+	}
+	return IRQ_HANDLED;
+}
+
+static int s5p_sysmmu_probe(struct platform_device *pdev)
+{
+	struct s5p_sysmmu_info *sysmmu;
+	struct resource *res;
+	int ret;
+	unsigned long flags;
+
+	sysmmu = kzalloc(sizeof(struct s5p_sysmmu_info), GFP_KERNEL);
+	if (!sysmmu) {
+		dev_err(&pdev->dev, "no memory for state\n");
+		return -ENOMEM;
+	}
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (NULL == res) {
+		dev_err(&pdev->dev, "cannot find IO resource\n");
+		ret = -ENOENT;
+		goto err_s5p_sysmmu_info_allocated;
+	}
+
+	sysmmu->ioarea = request_mem_region(res->start, resource_size(res),
+					 pdev->name);
+
+	if (NULL == sysmmu->ioarea) {
+		dev_err(&pdev->dev, "cannot request IO\n");
+		ret = -ENXIO;
+		goto err_s5p_sysmmu_info_allocated;
+	}
+
+	sysmmu->regs = ioremap(res->start, resource_size(res));
+
+	if (NULL == sysmmu->regs) {
+		dev_err(&pdev->dev, "cannot map IO\n");
+		ret = -ENXIO;
+		goto err_ioarea_requested;
+	}
+
+	dev_dbg(&pdev->dev, "registers %p (%p, %p)\n",
+		sysmmu->regs, sysmmu->ioarea, res);
+
+	sysmmu->irq = ret = platform_get_irq(pdev, 0);
+	if (ret <= 0) {
+		dev_err(&pdev->dev, "cannot find IRQ\n");
+		goto err_iomap_done;
+	}
+
+	ret = request_irq(sysmmu->irq, s5p_sysmmu_irq, 0,
+			  dev_name(&pdev->dev), sysmmu);
+
+	if (ret != 0) {
+		dev_err(&pdev->dev, "cannot claim IRQ %d\n", sysmmu->irq);
+		goto err_iomap_done;
+	}
+
+	sysmmu->clk = clk_get(&pdev->dev, "sysmmu");
+	if (IS_ERR_OR_NULL(sysmmu->clk)) {
+		dev_err(&pdev->dev, "cannot get clock\n");
+		ret = -ENOENT;
+		goto err_request_irq_done;
+	}
+	dev_dbg(&pdev->dev, "clock source %p\n", sysmmu->clk);
+
+	sysmmu->ip = pdev->id;
+
+	spin_lock_irqsave(&sysmmu_slock, flags);
+	sysmmu_table[pdev->id] = sysmmu;
+	spin_unlock_irqrestore(&sysmmu_slock, flags);
+
+	sysmmu->dev = &pdev->dev;
+
+	platform_set_drvdata(pdev, sysmmu);
+
+	pm_runtime_set_active(&pdev->dev);
+	pm_runtime_enable(&pdev->dev);
+
+	dev_info(&pdev->dev, "Samsung S5P SYSMMU (IOMMU)\n");
+	return 0;
+
+err_request_irq_done:
+	free_irq(sysmmu->irq, sysmmu);
+
+err_iomap_done:
+	iounmap(sysmmu->regs);
+
+err_ioarea_requested:
+	release_resource(sysmmu->ioarea);
+	kfree(sysmmu->ioarea);
+
+err_s5p_sysmmu_info_allocated:
+	kfree(sysmmu);
+	return ret;
+}
+
+static int s5p_sysmmu_remove(struct platform_device *pdev)
+{
+	struct s5p_sysmmu_info *sysmmu = platform_get_drvdata(pdev);
+	unsigned long flags;
+
+	pm_runtime_disable(sysmmu->dev);
+
+	spin_lock_irqsave(&sysmmu_slock, flags);
+	sysmmu_table[pdev->id] = NULL;
+	spin_unlock_irqrestore(&sysmmu_slock, flags);
+
+	clk_disable(sysmmu->clk);
+	clk_put(sysmmu->clk);
+
+	free_irq(sysmmu->irq, sysmmu);
+
+	iounmap(sysmmu->regs);
+
+	release_resource(sysmmu->ioarea);
+	kfree(sysmmu->ioarea);
+
+	kfree(sysmmu);
+
+	return 0;
+}
+
+static int
+s5p_sysmmu_suspend(struct platform_device *pdev, pm_message_t state)
+{
+	int ret = 0;
+	sysmmu_debug(3, "begin\n");
+
+	return ret;
+}
+
+static int s5p_sysmmu_resume(struct platform_device *pdev)
+{
+	int ret = 0;
+	sysmmu_debug(3, "begin\n");
+
+	return ret;
+}
+
+static int s5p_sysmmu_runtime_suspend(struct device *dev)
+{
+	sysmmu_debug(3, "begin\n");
+	return 0;
+}
+
+static int s5p_sysmmu_runtime_resume(struct device *dev)
+{
+	sysmmu_debug(3, "begin\n");
+	return 0;
+}
+
+static const struct dev_pm_ops s5p_sysmmu_pm_ops = {
+	.runtime_suspend = s5p_sysmmu_runtime_suspend,
+	.runtime_resume	 = s5p_sysmmu_runtime_resume,
+};
+
+static struct platform_driver s5p_sysmmu_driver = {
+	.probe = s5p_sysmmu_probe,
+	.remove = s5p_sysmmu_remove,
+	.suspend = s5p_sysmmu_suspend,
+	.resume = s5p_sysmmu_resume,
+	.driver = {
+		.owner = THIS_MODULE,
+		.name = "s5p-sysmmu",
+		.pm = &s5p_sysmmu_pm_ops,
+	},
+};
+
+static int __init
+s5p_sysmmu_register(void)
+{
+	int ret;
+
+	sysmmu_debug(3, "Registering sysmmu driver...\n");
+
+	slpt_cache = kmem_cache_create("slpt_cache", 1024, 1024,
+				       SLAB_HWCACHE_ALIGN, NULL);
+	if (!slpt_cache) {
+		printk(KERN_ERR
+			"%s: failed to allocate slpt cache\n", __func__);
+		return -ENOMEM;
+	}
+
+	ret = platform_driver_register(&s5p_sysmmu_driver);
+
+	if (ret) {
+		printk(KERN_ERR
+			"%s: failed to register sysmmu driver\n", __func__);
+		return -EINVAL;
+	}
+
+	register_iommu(&s5p_sysmmu_ops);
+
+	return ret;
+}
+
+static void __exit
+s5p_sysmmu_unregister(void)
+{
+	kmem_cache_destroy(slpt_cache);
+	platform_driver_unregister(&s5p_sysmmu_driver);
+}
+
+module_init(s5p_sysmmu_register);
+module_exit(s5p_sysmmu_unregister);
+
+MODULE_AUTHOR("Andrzej Pietrasiewicz <andrzej.p@samsung.com>");
+MODULE_DESCRIPTION("Samsung System MMU (IOMMU) driver");
+MODULE_LICENSE("GPL");
+
diff --git a/arch/arm/plat-samsung/include/plat/devs.h b/arch/arm/plat-samsung/include/plat/devs.h
index f0da6b7..0ae5dd0 100644
--- a/arch/arm/plat-samsung/include/plat/devs.h
+++ b/arch/arm/plat-samsung/include/plat/devs.h
@@ -142,7 +142,7 @@ extern struct platform_device s5p_device_fimc3;
 extern struct platform_device s5p_device_mipi_csis0;
 extern struct platform_device s5p_device_mipi_csis1;
 
-extern struct platform_device exynos4_device_sysmmu;
+extern struct platform_device exynos4_device_sysmmu[];
 
 /* s3c2440 specific devices */
 
-- 
1.7.1.569.g6f426


* [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
@ 2011-04-18  9:26   ` Marek Szyprowski
From: Marek Szyprowski @ 2011-04-18  9:26 UTC
  To: linux-arm-kernel

From: Andrzej Pietrasiewicz <andrzej.p@samsung.com>

This patch performs a complete rewrite of the sysmmu driver for the Samsung
platform:
- simplified resource management: a single platform device with 32
  resources is no longer needed; each sysmmu instance has its own resource
  definition, which fits the Linux driver model better
- the new version uses the kernel-wide common iommu API defined in
  include/linux/iommu.h
- cleaned up support for sysmmu clocks
- added support for custom fault handlers and TLB replacement policy

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
---
 arch/arm/mach-exynos4/clock.c               |   68 +-
 arch/arm/mach-exynos4/dev-sysmmu.c          |  615 +++++++++------
 arch/arm/mach-exynos4/include/mach/irqs.h   |   35 +-
 arch/arm/mach-exynos4/include/mach/sysmmu.h |   46 -
 arch/arm/plat-s5p/Kconfig                   |   20 +-
 arch/arm/plat-s5p/include/plat/sysmmu.h     |  241 ++++---
 arch/arm/plat-s5p/sysmmu.c                  | 1191 ++++++++++++++++++++-------
 arch/arm/plat-samsung/include/plat/devs.h   |    2 +-
 8 files changed, 1478 insertions(+), 740 deletions(-)
 rewrite arch/arm/mach-exynos4/dev-sysmmu.c (88%)
 delete mode 100644 arch/arm/mach-exynos4/include/mach/sysmmu.h
 rewrite arch/arm/plat-s5p/include/plat/sysmmu.h (83%)
 rewrite arch/arm/plat-s5p/sysmmu.c (87%)
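
Note for readers of the page table code in plat-s5p/sysmmu.c: the address
split and the order check used by the map/unmap callbacks can be sketched
in plain userspace C as below. This is only an illustration, not part of
the patch. The shift and mask values are copied from the macros the patch
adds; slpt_index(), order_to_len() and iova_aligned() are hypothetical
helper names that do not exist in the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Values copied from the macros this patch adds to plat-s5p/sysmmu.c. */
#define FLPT_IDX_SHIFT	20
#define FLPT_IDX_MASK	0xFFFu
#define SLPT_IDX_SHIFT	12
#define SLPT_IDX_MASK	0xFFu

/* Index of the first-level entry covering iova (one entry per 1MB). */
static unsigned int flpt_index(uint32_t iova)
{
	return (iova >> FLPT_IDX_SHIFT) & FLPT_IDX_MASK;
}

/* Byte offset of that entry in the first-level table (4-byte entries). */
static unsigned int flpt_offs(uint32_t iova)
{
	return (iova >> (FLPT_IDX_SHIFT - 2)) & (FLPT_IDX_MASK << 2);
}

/* Hypothetical helper: index of the second-level entry (one per 4kB). */
static unsigned int slpt_index(uint32_t iova)
{
	return (iova >> SLPT_IDX_SHIFT) & SLPT_IDX_MASK;
}

/* Byte offset of that entry in the second-level table (4-byte entries). */
static unsigned int slpt_offs(uint32_t iova)
{
	return (iova >> (SLPT_IDX_SHIFT - 2)) & (SLPT_IDX_MASK << 2);
}

/*
 * Hypothetical helper: mapping length in bytes for gfp_order, or 0 for an
 * order the driver would reject. Only 4kB, 64kB, 1MB and 16MB are accepted,
 * i.e. orders 0, 4, 8 and 12.
 */
static size_t order_to_len(int gfp_order)
{
	switch (gfp_order) {
	case 0:
	case 4:
	case 8:
	case 12:
		return (size_t)0x1000 << gfp_order;
	default:
		return 0;
	}
}

/* Hypothetical helper: non-zero if iova is aligned to a power-of-two len. */
static int iova_aligned(uint32_t iova, size_t len)
{
	return len != 0 && (iova & (uint32_t)(len - 1)) == 0;
}
```

For example, iova 0x12345678 selects first-level entry 0x123 (byte offset
0x48C) and second-level entry 0x45 (byte offset 0x114). The byte offsets
are simply the indices multiplied by the 4-byte entry size.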

diff --git a/arch/arm/mach-exynos4/clock.c b/arch/arm/mach-exynos4/clock.c
index 871f9d5..963195e 100644
--- a/arch/arm/mach-exynos4/clock.c
+++ b/arch/arm/mach-exynos4/clock.c
@@ -20,10 +20,10 @@
 #include <plat/pll.h>
 #include <plat/s5p-clock.h>
 #include <plat/clock-clksrc.h>
+#include <plat/sysmmu.h>
 
 #include <mach/map.h>
 #include <mach/regs-clock.h>
-#include <mach/sysmmu.h>
 
 static struct clk clk_sclk_hdmi27m = {
 	.name		= "sclk_hdmi27m",
@@ -127,6 +127,11 @@ static int exynos4_clk_ip_perir_ctrl(struct clk *clk, int enable)
 	return s5p_gatectrl(S5P_CLKGATE_IP_PERIR, clk, enable);
 }
 
+static int exynos4_clk_ip_dmc_ctrl(struct clk *clk, int enable)
+{
+	return s5p_gatectrl(S5P_CLKGATE_IP_DMC, clk, enable);
+}
+
 /* Core list of CMU_CPU side */
 
 static struct clksrc_clk clk_mout_apll = {
@@ -614,75 +619,80 @@ static struct clk init_clocks_off[] = {
 		.enable		= exynos4_clk_ip_peril_ctrl,
 		.ctrlbit	= (1 << 13),
 	}, {
-		.name		= "SYSMMU_MDMA",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_MDMA,
 		.enable		= exynos4_clk_ip_image_ctrl,
 		.ctrlbit	= (1 << 5),
 	}, {
-		.name		= "SYSMMU_FIMC0",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_FIMC0,
 		.enable		= exynos4_clk_ip_cam_ctrl,
 		.ctrlbit	= (1 << 7),
 	}, {
-		.name		= "SYSMMU_FIMC1",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_FIMC1,
 		.enable		= exynos4_clk_ip_cam_ctrl,
 		.ctrlbit	= (1 << 8),
 	}, {
-		.name		= "SYSMMU_FIMC2",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_FIMC2,
 		.enable		= exynos4_clk_ip_cam_ctrl,
 		.ctrlbit	= (1 << 9),
 	}, {
-		.name		= "SYSMMU_FIMC3",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_FIMC3,
 		.enable		= exynos4_clk_ip_cam_ctrl,
 		.ctrlbit	= (1 << 10),
 	}, {
-		.name		= "SYSMMU_JPEG",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_JPEG,
 		.enable		= exynos4_clk_ip_cam_ctrl,
 		.ctrlbit	= (1 << 11),
 	}, {
-		.name		= "SYSMMU_FIMD0",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_FIMD0,
 		.enable		= exynos4_clk_ip_lcd0_ctrl,
 		.ctrlbit	= (1 << 4),
 	}, {
-		.name		= "SYSMMU_FIMD1",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_FIMD1,
 		.enable		= exynos4_clk_ip_lcd1_ctrl,
 		.ctrlbit	= (1 << 4),
 	}, {
-		.name		= "SYSMMU_PCIe",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_PCIe,
 		.enable		= exynos4_clk_ip_fsys_ctrl,
 		.ctrlbit	= (1 << 18),
 	}, {
-		.name		= "SYSMMU_G2D",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_G2D,
 		.enable		= exynos4_clk_ip_image_ctrl,
 		.ctrlbit	= (1 << 3),
 	}, {
-		.name		= "SYSMMU_ROTATOR",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_ROTATOR,
 		.enable		= exynos4_clk_ip_image_ctrl,
 		.ctrlbit	= (1 << 4),
 	}, {
-		.name		= "SYSMMU_TV",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_TV,
 		.enable		= exynos4_clk_ip_tv_ctrl,
 		.ctrlbit	= (1 << 4),
 	}, {
-		.name		= "SYSMMU_MFC_L",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_MFC_L,
 		.enable		= exynos4_clk_ip_mfc_ctrl,
 		.ctrlbit	= (1 << 1),
 	}, {
-		.name		= "SYSMMU_MFC_R",
-		.id		= -1,
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_MFC_R,
 		.enable		= exynos4_clk_ip_mfc_ctrl,
 		.ctrlbit	= (1 << 2),
+	}, {
+		.name		= "sysmmu",
+		.id		= S5P_SYSMMU_SSS,
+		.enable		= exynos4_clk_ip_dmc_ctrl,
+		.ctrlbit	= (1 << 12),
 	}
 };
 
diff --git a/arch/arm/mach-exynos4/dev-sysmmu.c b/arch/arm/mach-exynos4/dev-sysmmu.c
dissimilarity index 88%
index 3b7cae0..23c3a6e 100644
--- a/arch/arm/mach-exynos4/dev-sysmmu.c
+++ b/arch/arm/mach-exynos4/dev-sysmmu.c
@@ -1,232 +1,383 @@
-/* linux/arch/arm/mach-exynos4/dev-sysmmu.c
- *
- * Copyright (c) 2010 Samsung Electronics Co., Ltd.
- *		http://www.samsung.com
- *
- * EXYNOS4 - System MMU support
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-
-#include <linux/platform_device.h>
-#include <linux/dma-mapping.h>
-
-#include <mach/map.h>
-#include <mach/irqs.h>
-#include <mach/sysmmu.h>
-#include <plat/s5p-clock.h>
-
-/* These names must be equal to the clock names in mach-exynos4/clock.c */
-const char *sysmmu_ips_name[EXYNOS4_SYSMMU_TOTAL_IPNUM] = {
-	"SYSMMU_MDMA"	,
-	"SYSMMU_SSS"	,
-	"SYSMMU_FIMC0"	,
-	"SYSMMU_FIMC1"	,
-	"SYSMMU_FIMC2"	,
-	"SYSMMU_FIMC3"	,
-	"SYSMMU_JPEG"	,
-	"SYSMMU_FIMD0"	,
-	"SYSMMU_FIMD1"	,
-	"SYSMMU_PCIe"	,
-	"SYSMMU_G2D"	,
-	"SYSMMU_ROTATOR",
-	"SYSMMU_MDMA2"	,
-	"SYSMMU_TV"	,
-	"SYSMMU_MFC_L"	,
-	"SYSMMU_MFC_R"	,
-};
-
-static struct resource exynos4_sysmmu_resource[] = {
-	[0] = {
-		.start	= EXYNOS4_PA_SYSMMU_MDMA,
-		.end	= EXYNOS4_PA_SYSMMU_MDMA + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[1] = {
-		.start	= IRQ_SYSMMU_MDMA0_0,
-		.end	= IRQ_SYSMMU_MDMA0_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[2] = {
-		.start	= EXYNOS4_PA_SYSMMU_SSS,
-		.end	= EXYNOS4_PA_SYSMMU_SSS + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[3] = {
-		.start	= IRQ_SYSMMU_SSS_0,
-		.end	= IRQ_SYSMMU_SSS_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[4] = {
-		.start	= EXYNOS4_PA_SYSMMU_FIMC0,
-		.end	= EXYNOS4_PA_SYSMMU_FIMC0 + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[5] = {
-		.start	= IRQ_SYSMMU_FIMC0_0,
-		.end	= IRQ_SYSMMU_FIMC0_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[6] = {
-		.start	= EXYNOS4_PA_SYSMMU_FIMC1,
-		.end	= EXYNOS4_PA_SYSMMU_FIMC1 + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[7] = {
-		.start	= IRQ_SYSMMU_FIMC1_0,
-		.end	= IRQ_SYSMMU_FIMC1_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[8] = {
-		.start	= EXYNOS4_PA_SYSMMU_FIMC2,
-		.end	= EXYNOS4_PA_SYSMMU_FIMC2 + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[9] = {
-		.start	= IRQ_SYSMMU_FIMC2_0,
-		.end	= IRQ_SYSMMU_FIMC2_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[10] = {
-		.start	= EXYNOS4_PA_SYSMMU_FIMC3,
-		.end	= EXYNOS4_PA_SYSMMU_FIMC3 + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[11] = {
-		.start	= IRQ_SYSMMU_FIMC3_0,
-		.end	= IRQ_SYSMMU_FIMC3_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[12] = {
-		.start	= EXYNOS4_PA_SYSMMU_JPEG,
-		.end	= EXYNOS4_PA_SYSMMU_JPEG + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[13] = {
-		.start	= IRQ_SYSMMU_JPEG_0,
-		.end	= IRQ_SYSMMU_JPEG_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[14] = {
-		.start	= EXYNOS4_PA_SYSMMU_FIMD0,
-		.end	= EXYNOS4_PA_SYSMMU_FIMD0 + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[15] = {
-		.start	= IRQ_SYSMMU_LCD0_M0_0,
-		.end	= IRQ_SYSMMU_LCD0_M0_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[16] = {
-		.start	= EXYNOS4_PA_SYSMMU_FIMD1,
-		.end	= EXYNOS4_PA_SYSMMU_FIMD1 + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[17] = {
-		.start	= IRQ_SYSMMU_LCD1_M1_0,
-		.end	= IRQ_SYSMMU_LCD1_M1_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[18] = {
-		.start	= EXYNOS4_PA_SYSMMU_PCIe,
-		.end	= EXYNOS4_PA_SYSMMU_PCIe + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[19] = {
-		.start	= IRQ_SYSMMU_PCIE_0,
-		.end	= IRQ_SYSMMU_PCIE_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[20] = {
-		.start	= EXYNOS4_PA_SYSMMU_G2D,
-		.end	= EXYNOS4_PA_SYSMMU_G2D + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[21] = {
-		.start	= IRQ_SYSMMU_2D_0,
-		.end	= IRQ_SYSMMU_2D_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[22] = {
-		.start	= EXYNOS4_PA_SYSMMU_ROTATOR,
-		.end	= EXYNOS4_PA_SYSMMU_ROTATOR + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[23] = {
-		.start	= IRQ_SYSMMU_ROTATOR_0,
-		.end	= IRQ_SYSMMU_ROTATOR_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[24] = {
-		.start	= EXYNOS4_PA_SYSMMU_MDMA2,
-		.end	= EXYNOS4_PA_SYSMMU_MDMA2 + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[25] = {
-		.start	= IRQ_SYSMMU_MDMA1_0,
-		.end	= IRQ_SYSMMU_MDMA1_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[26] = {
-		.start	= EXYNOS4_PA_SYSMMU_TV,
-		.end	= EXYNOS4_PA_SYSMMU_TV + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[27] = {
-		.start	= IRQ_SYSMMU_TV_M0_0,
-		.end	= IRQ_SYSMMU_TV_M0_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[28] = {
-		.start	= EXYNOS4_PA_SYSMMU_MFC_L,
-		.end	= EXYNOS4_PA_SYSMMU_MFC_L + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[29] = {
-		.start	= IRQ_SYSMMU_MFC_M0_0,
-		.end	= IRQ_SYSMMU_MFC_M0_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-	[30] = {
-		.start	= EXYNOS4_PA_SYSMMU_MFC_R,
-		.end	= EXYNOS4_PA_SYSMMU_MFC_R + SZ_64K - 1,
-		.flags	= IORESOURCE_MEM,
-	},
-	[31] = {
-		.start	= IRQ_SYSMMU_MFC_M1_0,
-		.end	= IRQ_SYSMMU_MFC_M1_0,
-		.flags	= IORESOURCE_IRQ,
-	},
-};
-
-struct platform_device exynos4_device_sysmmu = {
-	.name		= "s5p-sysmmu",
-	.id		= 32,
-	.num_resources	= ARRAY_SIZE(exynos4_sysmmu_resource),
-	.resource	= exynos4_sysmmu_resource,
-};
-EXPORT_SYMBOL(exynos4_device_sysmmu);
-
-static struct clk *sysmmu_clk[S5P_SYSMMU_TOTAL_IPNUM];
-void sysmmu_clk_init(struct device *dev, sysmmu_ips ips)
-{
-	sysmmu_clk[ips] = clk_get(dev, sysmmu_ips_name[ips]);
-	if (IS_ERR(sysmmu_clk[ips]))
-		sysmmu_clk[ips] = NULL;
-	else
-		clk_put(sysmmu_clk[ips]);
-}
-
-void sysmmu_clk_enable(sysmmu_ips ips)
-{
-	if (sysmmu_clk[ips])
-		clk_enable(sysmmu_clk[ips]);
-}
-
-void sysmmu_clk_disable(sysmmu_ips ips)
-{
-	if (sysmmu_clk[ips])
-		clk_disable(sysmmu_clk[ips]);
-}
+/* linux/arch/arm/mach-exynos4/dev-sysmmu.c
+ *
+ * Copyright (c) 2010 Samsung Electronics Co., Ltd.
+ *		http://www.samsung.com
+ *
+ * EXYNOS4 - System MMU support
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/platform_device.h>
+#include <linux/dma-mapping.h>
+
+#include <mach/map.h>
+#include <mach/irqs.h>
+
+#include <plat/devs.h>
+#include <plat/cpu.h>
+#include <plat/sysmmu.h>
+
+#define EXYNOS4_NUM_RESOURCES (2)
+
+static struct resource exynos4_sysmmu_resource[][EXYNOS4_NUM_RESOURCES] = {
+	[S5P_SYSMMU_MDMA] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_MDMA,
+			.end	= EXYNOS4_PA_SYSMMU_MDMA + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_MDMA0,
+			.end	= IRQ_SYSMMU_MDMA0,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_SSS] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_SSS,
+			.end	= EXYNOS4_PA_SYSMMU_SSS + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_SSS,
+			.end	= IRQ_SYSMMU_SSS,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_FIMC0] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_FIMC0,
+			.end	= EXYNOS4_PA_SYSMMU_FIMC0 + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_FIMC0,
+			.end	= IRQ_SYSMMU_FIMC0,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_FIMC1] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_FIMC1,
+			.end	= EXYNOS4_PA_SYSMMU_FIMC1 + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_FIMC1,
+			.end	= IRQ_SYSMMU_FIMC1,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_FIMC2] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_FIMC2,
+			.end	= EXYNOS4_PA_SYSMMU_FIMC2 + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_FIMC2,
+			.end	= IRQ_SYSMMU_FIMC2,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_FIMC3] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_FIMC3,
+			.end	= EXYNOS4_PA_SYSMMU_FIMC3 + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_FIMC3,
+			.end	= IRQ_SYSMMU_FIMC3,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_JPEG] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_JPEG,
+			.end	= EXYNOS4_PA_SYSMMU_JPEG + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_JPEG,
+			.end	= IRQ_SYSMMU_JPEG,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_FIMD0] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_FIMD0,
+			.end	= EXYNOS4_PA_SYSMMU_FIMD0 + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_FIMD0,
+			.end	= IRQ_SYSMMU_FIMD0,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_FIMD1] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_FIMD1,
+			.end	= EXYNOS4_PA_SYSMMU_FIMD1 + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_FIMD1,
+			.end	= IRQ_SYSMMU_FIMD1,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_PCIe] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_PCIe,
+			.end	= EXYNOS4_PA_SYSMMU_PCIe + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_PCIE,
+			.end	= IRQ_SYSMMU_PCIE,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_G2D] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_G2D,
+			.end	= EXYNOS4_PA_SYSMMU_G2D + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_2D,
+			.end	= IRQ_SYSMMU_2D,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_ROTATOR] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_ROTATOR,
+			.end	= EXYNOS4_PA_SYSMMU_ROTATOR + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_ROTATOR,
+			.end	= IRQ_SYSMMU_ROTATOR,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_MDMA2] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_MDMA2,
+			.end	= EXYNOS4_PA_SYSMMU_MDMA2 + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_MDMA1,
+			.end	= IRQ_SYSMMU_MDMA1,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_TV] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_TV,
+			.end	= EXYNOS4_PA_SYSMMU_TV + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_TV,
+			.end	= IRQ_SYSMMU_TV,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_MFC_L] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_MFC_L,
+			.end	= EXYNOS4_PA_SYSMMU_MFC_L + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_MFC_L,
+			.end	= IRQ_SYSMMU_MFC_L,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+	[S5P_SYSMMU_MFC_R] = {
+		[0] = {
+			.start	= EXYNOS4_PA_SYSMMU_MFC_R,
+			.end	= EXYNOS4_PA_SYSMMU_MFC_R + SZ_4K - 1,
+			.flags	= IORESOURCE_MEM,
+		},
+		[1] = {
+			.start	= IRQ_SYSMMU_MFC_R,
+			.end	= IRQ_SYSMMU_MFC_R,
+			.flags	= IORESOURCE_IRQ,
+		},
+	},
+};
+
+static u64 exynos4_sysmmu_dma_mask = DMA_BIT_MASK(32);
+
+struct platform_device exynos4_device_sysmmu[] = {
+	[S5P_SYSMMU_MDMA] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_MDMA,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_MDMA],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_SSS] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_SSS,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_SSS],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_FIMC0] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_FIMC0,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_FIMC0],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_FIMC1] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_FIMC1,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_FIMC1],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_FIMC2] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_FIMC2,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_FIMC2],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_FIMC3] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_FIMC3,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_FIMC3],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_JPEG] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_JPEG,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_JPEG],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_FIMD0] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_FIMD0,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_FIMD0],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_FIMD1] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_FIMD1,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_FIMD1],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_PCIe] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_PCIe,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_PCIe],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_G2D] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_G2D,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_G2D],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_ROTATOR] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_ROTATOR,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_ROTATOR],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_MDMA2] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_MDMA2,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_MDMA2],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_TV] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_TV,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_TV],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_MFC_L] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_MFC_L,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_MFC_L],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+	[S5P_SYSMMU_MFC_R] = {
+		.name		= "s5p-sysmmu",
+		.id		= S5P_SYSMMU_MFC_R,
+		.num_resources	= EXYNOS4_NUM_RESOURCES,
+		.resource	= exynos4_sysmmu_resource[S5P_SYSMMU_MFC_R],
+		.dev		= {
+			.dma_mask		= &exynos4_sysmmu_dma_mask,
+			.coherent_dma_mask	= DMA_BIT_MASK(32),
+		},
+	},
+};
diff --git a/arch/arm/mach-exynos4/include/mach/irqs.h b/arch/arm/mach-exynos4/include/mach/irqs.h
index 5d03730..ad1d00c 100644
--- a/arch/arm/mach-exynos4/include/mach/irqs.h
+++ b/arch/arm/mach-exynos4/include/mach/irqs.h
@@ -55,23 +55,23 @@
 #define COMBINER_GROUP(x)	((x) * MAX_IRQ_IN_COMBINER + IRQ_SPI(64))
 #define COMBINER_IRQ(x, y)	(COMBINER_GROUP(x) + y)
 
-#define IRQ_SYSMMU_MDMA0_0	COMBINER_IRQ(4, 0)
-#define IRQ_SYSMMU_SSS_0	COMBINER_IRQ(4, 1)
-#define IRQ_SYSMMU_FIMC0_0	COMBINER_IRQ(4, 2)
-#define IRQ_SYSMMU_FIMC1_0	COMBINER_IRQ(4, 3)
-#define IRQ_SYSMMU_FIMC2_0	COMBINER_IRQ(4, 4)
-#define IRQ_SYSMMU_FIMC3_0	COMBINER_IRQ(4, 5)
-#define IRQ_SYSMMU_JPEG_0	COMBINER_IRQ(4, 6)
-#define IRQ_SYSMMU_2D_0		COMBINER_IRQ(4, 7)
-
-#define IRQ_SYSMMU_ROTATOR_0	COMBINER_IRQ(5, 0)
-#define IRQ_SYSMMU_MDMA1_0	COMBINER_IRQ(5, 1)
-#define IRQ_SYSMMU_LCD0_M0_0	COMBINER_IRQ(5, 2)
-#define IRQ_SYSMMU_LCD1_M1_0	COMBINER_IRQ(5, 3)
-#define IRQ_SYSMMU_TV_M0_0	COMBINER_IRQ(5, 4)
-#define IRQ_SYSMMU_MFC_M0_0	COMBINER_IRQ(5, 5)
-#define IRQ_SYSMMU_MFC_M1_0	COMBINER_IRQ(5, 6)
-#define IRQ_SYSMMU_PCIE_0	COMBINER_IRQ(5, 7)
+#define IRQ_SYSMMU_MDMA0	COMBINER_IRQ(4, 0)
+#define IRQ_SYSMMU_SSS		COMBINER_IRQ(4, 1)
+#define IRQ_SYSMMU_FIMC0	COMBINER_IRQ(4, 2)
+#define IRQ_SYSMMU_FIMC1	COMBINER_IRQ(4, 3)
+#define IRQ_SYSMMU_FIMC2	COMBINER_IRQ(4, 4)
+#define IRQ_SYSMMU_FIMC3	COMBINER_IRQ(4, 5)
+#define IRQ_SYSMMU_JPEG		COMBINER_IRQ(4, 6)
+#define IRQ_SYSMMU_2D		COMBINER_IRQ(4, 7)
+
+#define IRQ_SYSMMU_ROTATOR	COMBINER_IRQ(5, 0)
+#define IRQ_SYSMMU_MDMA1	COMBINER_IRQ(5, 1)
+#define IRQ_SYSMMU_FIMD0	COMBINER_IRQ(5, 2)
+#define IRQ_SYSMMU_FIMD1	COMBINER_IRQ(5, 3)
+#define IRQ_SYSMMU_TV		COMBINER_IRQ(5, 4)
+#define IRQ_SYSMMU_MFC_L	COMBINER_IRQ(5, 5)
+#define IRQ_SYSMMU_MFC_R	COMBINER_IRQ(5, 6)
+#define IRQ_SYSMMU_PCIE		COMBINER_IRQ(5, 7)
 
 #define IRQ_PDMA0		COMBINER_IRQ(21, 0)
 #define IRQ_PDMA1		COMBINER_IRQ(21, 1)
@@ -157,4 +157,5 @@
 /* Set the default NR_IRQS */
 #define NR_IRQS			(IRQ_GPIO_END)
 
+
 #endif /* __ASM_ARCH_IRQS_H */
diff --git a/arch/arm/mach-exynos4/include/mach/sysmmu.h b/arch/arm/mach-exynos4/include/mach/sysmmu.h
deleted file mode 100644
index 6a5fbb5..0000000
--- a/arch/arm/mach-exynos4/include/mach/sysmmu.h
+++ /dev/null
@@ -1,46 +0,0 @@
-/* linux/arch/arm/mach-exynos4/include/mach/sysmmu.h
- *
- * Copyright (c) 2010-2011 Samsung Electronics Co., Ltd.
- *		http://www.samsung.com
- *
- * Samsung sysmmu driver for EXYNOS4
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
-*/
-
-#ifndef __ASM_ARM_ARCH_SYSMMU_H
-#define __ASM_ARM_ARCH_SYSMMU_H __FILE__
-
-enum exynos4_sysmmu_ips {
-	SYSMMU_MDMA,
-	SYSMMU_SSS,
-	SYSMMU_FIMC0,
-	SYSMMU_FIMC1,
-	SYSMMU_FIMC2,
-	SYSMMU_FIMC3,
-	SYSMMU_JPEG,
-	SYSMMU_FIMD0,
-	SYSMMU_FIMD1,
-	SYSMMU_PCIe,
-	SYSMMU_G2D,
-	SYSMMU_ROTATOR,
-	SYSMMU_MDMA2,
-	SYSMMU_TV,
-	SYSMMU_MFC_L,
-	SYSMMU_MFC_R,
-	EXYNOS4_SYSMMU_TOTAL_IPNUM,
-};
-
-#define S5P_SYSMMU_TOTAL_IPNUM		EXYNOS4_SYSMMU_TOTAL_IPNUM
-
-extern const char *sysmmu_ips_name[EXYNOS4_SYSMMU_TOTAL_IPNUM];
-
-typedef enum exynos4_sysmmu_ips sysmmu_ips;
-
-void sysmmu_clk_init(struct device *dev, sysmmu_ips ips);
-void sysmmu_clk_enable(sysmmu_ips ips);
-void sysmmu_clk_disable(sysmmu_ips ips);
-
-#endif /* __ASM_ARM_ARCH_SYSMMU_H */
diff --git a/arch/arm/plat-s5p/Kconfig b/arch/arm/plat-s5p/Kconfig
index 8492297..9a7805b 100644
--- a/arch/arm/plat-s5p/Kconfig
+++ b/arch/arm/plat-s5p/Kconfig
@@ -42,14 +42,6 @@ config S5P_HRT
 	help
 	  Use the High Resolution timer support
 
-comment "System MMU"
-
-config S5P_SYSTEM_MMU
-	bool "S5P SYSTEM MMU"
-	depends on ARCH_EXYNOS4
-	help
-	  Say Y here if you want to enable System MMU
-
 config S5P_DEV_FIMC0
 	bool
 	help
@@ -89,3 +81,15 @@ config S5P_SETUP_MIPIPHY
 	bool
 	help
 	  Compile in common setup code for MIPI-CSIS and MIPI-DSIM devices
+
+comment "System MMU"
+
+config IOMMU_API
+	bool
+
+config S5P_SYSTEM_MMU
+	bool "S5P SYSTEM MMU"
+	depends on ARCH_EXYNOS4
+	select IOMMU_API
+	help
+	  Say Y here if you want to enable System MMU
diff --git a/arch/arm/plat-s5p/include/plat/sysmmu.h b/arch/arm/plat-s5p/include/plat/sysmmu.h
dissimilarity index 83%
index bf5283c..ee9e6d0 100644
--- a/arch/arm/plat-s5p/include/plat/sysmmu.h
+++ b/arch/arm/plat-s5p/include/plat/sysmmu.h
@@ -1,95 +1,146 @@
-/* linux/arch/arm/plat-s5p/include/plat/sysmmu.h
- *
- * Copyright (c) 2010-2011 Samsung Electronics Co., Ltd.
- *		http://www.samsung.com
- *
- * Samsung System MMU driver for S5P platform
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
-*/
-
-#ifndef __ASM__PLAT_SYSMMU_H
-#define __ASM__PLAT_SYSMMU_H __FILE__
-
-enum S5P_SYSMMU_INTERRUPT_TYPE {
-	SYSMMU_PAGEFAULT,
-	SYSMMU_AR_MULTIHIT,
-	SYSMMU_AW_MULTIHIT,
-	SYSMMU_BUSERROR,
-	SYSMMU_AR_SECURITY,
-	SYSMMU_AR_ACCESS,
-	SYSMMU_AW_SECURITY,
-	SYSMMU_AW_PROTECTION, /* 7 */
-	SYSMMU_FAULTS_NUM
-};
-
-#ifdef CONFIG_S5P_SYSTEM_MMU
-
-#include <mach/sysmmu.h>
-
-/**
- * s5p_sysmmu_enable() - enable system mmu of ip
- * @ips: The ip connected system mmu.
- * #pgd: Base physical address of the 1st level page table
- *
- * This function enable system mmu to transfer address
- * from virtual address to physical address
- */
-void s5p_sysmmu_enable(sysmmu_ips ips, unsigned long pgd);
-
-/**
- * s5p_sysmmu_disable() - disable sysmmu mmu of ip
- * @ips: The ip connected system mmu.
- *
- * This function disable system mmu to transfer address
- * from virtual address to physical address
- */
-void s5p_sysmmu_disable(sysmmu_ips ips);
-
-/**
- * s5p_sysmmu_set_tablebase_pgd() - set page table base address to refer page table
- * @ips: The ip connected system mmu.
- * @pgd: The page table base address.
- *
- * This function set page table base address
- * When system mmu transfer address from virtaul address to physical address,
- * system mmu refer address information from page table
- */
-void s5p_sysmmu_set_tablebase_pgd(sysmmu_ips ips, unsigned long pgd);
-
-/**
- * s5p_sysmmu_tlb_invalidate() - flush all TLB entry in system mmu
- * @ips: The ip connected system mmu.
- *
- * This function flush all TLB entry in system mmu
- */
-void s5p_sysmmu_tlb_invalidate(sysmmu_ips ips);
-
-/** s5p_sysmmu_set_fault_handler() - Fault handler for System MMUs
- * @itype: type of fault.
- * @pgtable_base: the physical address of page table base. This is 0 if @ips is
- *               SYSMMU_BUSERROR.
- * @fault_addr: the device (virtual) address that the System MMU tried to
- *             translated. This is 0 if @ips is SYSMMU_BUSERROR.
- * Called when interrupt occurred by the System MMUs
- * The device drivers of peripheral devices that has a System MMU can implement
- * a fault handler to resolve address translation fault by System MMU.
- * The meanings of return value and parameters are described below.
-
- * return value: non-zero if the fault is correctly resolved.
- *         zero if the fault is not handled.
- */
-void s5p_sysmmu_set_fault_handler(sysmmu_ips ips,
-			int (*handler)(enum S5P_SYSMMU_INTERRUPT_TYPE itype,
-					unsigned long pgtable_base,
-					unsigned long fault_addr));
-#else
-#define s5p_sysmmu_enable(ips, pgd) do { } while (0)
-#define s5p_sysmmu_disable(ips) do { } while (0)
-#define s5p_sysmmu_set_tablebase_pgd(ips, pgd) do { } while (0)
-#define s5p_sysmmu_tlb_invalidate(ips) do { } while (0)
-#define s5p_sysmmu_set_fault_handler(ips, handler) do { } while (0)
-#endif
-#endif /* __ASM_PLAT_SYSMMU_H */
+/* linux/arch/arm/plat-s5p/include/plat/sysmmu.h
+ *
+ * Copyright (c) 2010-2011 Samsung Electronics Co., Ltd.
+ *		http://www.samsung.com
+ * Author: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
+ *
+ * Samsung System MMU driver for S5P platform
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+*/
+
+#ifndef __ASM__PLAT_SYSMMU_H
+#define __ASM__PLAT_SYSMMU_H __FILE__
+
+struct device;
+struct iommu_domain;
+
+/**
+ * enum s5p_sysmmu_ip - integrated peripheral identifiers
+ * @S5P_SYSMMU_MDMA:	MDMA
+ * @S5P_SYSMMU_SSS:	SSS
+ * @S5P_SYSMMU_FIMC0:	FIMC0
+ * @S5P_SYSMMU_FIMC1:	FIMC1
+ * @S5P_SYSMMU_FIMC2:	FIMC2
+ * @S5P_SYSMMU_FIMC3:	FIMC3
+ * @S5P_SYSMMU_JPEG:	JPEG
+ * @S5P_SYSMMU_FIMD0:	FIMD0
+ * @S5P_SYSMMU_FIMD1:	FIMD1
+ * @S5P_SYSMMU_PCIe:	PCIe
+ * @S5P_SYSMMU_G2D:	G2D
+ * @S5P_SYSMMU_ROTATOR:	ROTATOR
+ * @S5P_SYSMMU_MDMA2:	MDMA2
+ * @S5P_SYSMMU_TV:	TV
+ * @S5P_SYSMMU_MFC_L:	MFC_L
+ * @S5P_SYSMMU_MFC_R:	MFC_R
+ */
+enum s5p_sysmmu_ip {
+	S5P_SYSMMU_MDMA,
+	S5P_SYSMMU_SSS,
+	S5P_SYSMMU_FIMC0,
+	S5P_SYSMMU_FIMC1,
+	S5P_SYSMMU_FIMC2,
+	S5P_SYSMMU_FIMC3,
+	S5P_SYSMMU_JPEG,
+	S5P_SYSMMU_FIMD0,
+	S5P_SYSMMU_FIMD1,
+	S5P_SYSMMU_PCIe,
+	S5P_SYSMMU_G2D,
+	S5P_SYSMMU_ROTATOR,
+	S5P_SYSMMU_MDMA2,
+	S5P_SYSMMU_TV,
+	S5P_SYSMMU_MFC_L,
+	S5P_SYSMMU_MFC_R,
+	S5P_SYSMMU_TOTAL_IP_NUM,
+};
+
+/**
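+ * enum s5p_sysmmu_fault - reason for the raised sysmmu irq
+ * @S5P_SYSMMU_PAGE_FAULT:	page fault
+ * @S5P_SYSMMU_AR_FAULT:	ar multi-hit fault
+ * @S5P_SYSMMU_AW_FAULT:	aw multi-hit fault
+ * @S5P_SYSMMU_BUS_ERROR:	bus error
+ * @S5P_SYSMMU_AR_SECURITY:	ar security protection fault
+ * @S5P_SYSMMU_AR_PROT:	ar access protection fault
+ * @S5P_SYSMMU_AW_SECURITY:	aw security protection fault
+ * @S5P_SYSMMU_AW_PROT:	aw access protection fault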
+ */
+enum s5p_sysmmu_fault {
+	S5P_SYSMMU_PAGE_FAULT,
+	S5P_SYSMMU_AR_FAULT,
+	S5P_SYSMMU_AW_FAULT,
+	S5P_SYSMMU_BUS_ERROR,
+	S5P_SYSMMU_AR_SECURITY,
+	S5P_SYSMMU_AR_PROT,
+	S5P_SYSMMU_AW_SECURITY,
+	S5P_SYSMMU_AW_PROT,
+};
+
+/**
+ * enum s5p_sysmmu_tlb_policy - policy of using the tlb
+ * @S5P_SYSMMU_TLB_RR:	round robin policy
+ * @S5P_SYSMMU_TLB_LRU: least recently used policy
+ */
+enum s5p_sysmmu_tlb_policy {
+	S5P_SYSMMU_TLB_RR,
+	S5P_SYSMMU_TLB_LRU,
+};
+
+#define S5P_IRQ_CB(name) \
+	void (*name)(struct iommu_domain *domain, int reason, \
+		     unsigned long addr, void *prv)
+
+/**
+ * struct s5p_sysmmu_irq_callb - callback operations for irq routine
+ * @page_fault:	called when page fault occurs
+ * @ar_fault:	called when ar multi-hit fault occurs
+ * @aw_fault:	called when aw multi-hit fault occurs
+ * @bus_error:	called when bus error occurs
+ * @ar_security: called when ar security protection fault occurs
+ * @ar_prot:	called when ar access protection fault occurs
+ * @aw_security: called when aw security protection fault occurs
+ * @aw_prot:	called when aw access protection fault occurs
+ */
+struct s5p_sysmmu_irq_callb {
+	S5P_IRQ_CB(page_fault);
+	S5P_IRQ_CB(ar_fault);
+	S5P_IRQ_CB(aw_fault);
+	S5P_IRQ_CB(bus_error);
+	S5P_IRQ_CB(ar_security);
+	S5P_IRQ_CB(ar_prot);
+	S5P_IRQ_CB(aw_security);
+	S5P_IRQ_CB(aw_prot);
+};
+
+/**
+ * s5p_sysmmu_get() - get sysmmu device instance
+ * @ip:		integrated peripheral identifier of the device
+ */
+struct device *s5p_sysmmu_get(enum s5p_sysmmu_ip ip);
+
+/**
+ * s5p_sysmmu_put() - release sysmmu handle for a device
+ * @dev:	sysmmu handle obtained from s5p_sysmmu_get()
+ */
+void s5p_sysmmu_put(void *dev);
+
+/**
+ * s5p_sysmmu_domain_irq_callb() - set non-default per-domain ops to be called
+ * from irq handling routine
+ * @domain:	iommu domain for which to set the ops
+ * @ops:	non-default operations to be set
+ * @priv:	private data to be passed to the op when it is called
+ */
+void s5p_sysmmu_domain_irq_callb(struct iommu_domain *domain,
+			    struct s5p_sysmmu_irq_callb *ops, void *priv);
+
+/**
+ * s5p_sysmmu_domain_tlb_policy() - set per-domain tlb policy
+ * @domain:	iommu domain for which to set the tlb policy
+ * @policy:	tlb policy specifier (0 round robin, 1 lru)
+ */
+void s5p_sysmmu_domain_tlb_policy(struct iommu_domain *domain, int policy);
+
+#endif /* __ASM_PLAT_SYSMMU_H */
diff --git a/arch/arm/plat-s5p/sysmmu.c b/arch/arm/plat-s5p/sysmmu.c
dissimilarity index 87%
index 54f5edd..905bb2b 100644
--- a/arch/arm/plat-s5p/sysmmu.c
+++ b/arch/arm/plat-s5p/sysmmu.c
@@ -1,312 +1,879 @@
-/* linux/arch/arm/plat-s5p/sysmmu.c
- *
- * Copyright (c) 2010 Samsung Electronics Co., Ltd.
- *		http://www.samsung.com
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-
-#include <linux/io.h>
-#include <linux/interrupt.h>
-#include <linux/platform_device.h>
-
-#include <asm/pgtable.h>
-
-#include <mach/map.h>
-#include <mach/regs-sysmmu.h>
-#include <plat/sysmmu.h>
-
-#define CTRL_ENABLE	0x5
-#define CTRL_BLOCK	0x7
-#define CTRL_DISABLE	0x0
-
-static struct device *dev;
-
-static unsigned short fault_reg_offset[SYSMMU_FAULTS_NUM] = {
-	S5P_PAGE_FAULT_ADDR,
-	S5P_AR_FAULT_ADDR,
-	S5P_AW_FAULT_ADDR,
-	S5P_DEFAULT_SLAVE_ADDR,
-	S5P_AR_FAULT_ADDR,
-	S5P_AR_FAULT_ADDR,
-	S5P_AW_FAULT_ADDR,
-	S5P_AW_FAULT_ADDR
-};
-
-static char *sysmmu_fault_name[SYSMMU_FAULTS_NUM] = {
-	"PAGE FAULT",
-	"AR MULTI-HIT FAULT",
-	"AW MULTI-HIT FAULT",
-	"BUS ERROR",
-	"AR SECURITY PROTECTION FAULT",
-	"AR ACCESS PROTECTION FAULT",
-	"AW SECURITY PROTECTION FAULT",
-	"AW ACCESS PROTECTION FAULT"
-};
-
-static int (*fault_handlers[S5P_SYSMMU_TOTAL_IPNUM])(
-		enum S5P_SYSMMU_INTERRUPT_TYPE itype,
-		unsigned long pgtable_base,
-		unsigned long fault_addr);
-
-/*
- * If adjacent 2 bits are true, the system MMU is enabled.
- * The system MMU is disabled, otherwise.
- */
-static unsigned long sysmmu_states;
-
-static inline void set_sysmmu_active(sysmmu_ips ips)
-{
-	sysmmu_states |= 3 << (ips * 2);
-}
-
-static inline void set_sysmmu_inactive(sysmmu_ips ips)
-{
-	sysmmu_states &= ~(3 << (ips * 2));
-}
-
-static inline int is_sysmmu_active(sysmmu_ips ips)
-{
-	return sysmmu_states & (3 << (ips * 2));
-}
-
-static void __iomem *sysmmusfrs[S5P_SYSMMU_TOTAL_IPNUM];
-
-static inline void sysmmu_block(sysmmu_ips ips)
-{
-	__raw_writel(CTRL_BLOCK, sysmmusfrs[ips] + S5P_MMU_CTRL);
-	dev_dbg(dev, "%s is blocked.\n", sysmmu_ips_name[ips]);
-}
-
-static inline void sysmmu_unblock(sysmmu_ips ips)
-{
-	__raw_writel(CTRL_ENABLE, sysmmusfrs[ips] + S5P_MMU_CTRL);
-	dev_dbg(dev, "%s is unblocked.\n", sysmmu_ips_name[ips]);
-}
-
-static inline void __sysmmu_tlb_invalidate(sysmmu_ips ips)
-{
-	__raw_writel(0x1, sysmmusfrs[ips] + S5P_MMU_FLUSH);
-	dev_dbg(dev, "TLB of %s is invalidated.\n", sysmmu_ips_name[ips]);
-}
-
-static inline void __sysmmu_set_ptbase(sysmmu_ips ips, unsigned long pgd)
-{
-	if (unlikely(pgd == 0)) {
-		pgd = (unsigned long)ZERO_PAGE(0);
-		__raw_writel(0x20, sysmmusfrs[ips] + S5P_MMU_CFG); /* 4KB LV1 */
-	} else {
-		__raw_writel(0x0, sysmmusfrs[ips] + S5P_MMU_CFG); /* 16KB LV1 */
-	}
-
-	__raw_writel(pgd, sysmmusfrs[ips] + S5P_PT_BASE_ADDR);
-
-	dev_dbg(dev, "Page table base of %s is initialized with 0x%08lX.\n",
-						sysmmu_ips_name[ips], pgd);
-	__sysmmu_tlb_invalidate(ips);
-}
-
-void sysmmu_set_fault_handler(sysmmu_ips ips,
-			int (*handler)(enum S5P_SYSMMU_INTERRUPT_TYPE itype,
-					unsigned long pgtable_base,
-					unsigned long fault_addr))
-{
-	BUG_ON(!((ips >= SYSMMU_MDMA) && (ips < S5P_SYSMMU_TOTAL_IPNUM)));
-	fault_handlers[ips] = handler;
-}
-
-static irqreturn_t s5p_sysmmu_irq(int irq, void *dev_id)
-{
-	/* SYSMMU is in blocked when interrupt occurred. */
-	unsigned long base = 0;
-	sysmmu_ips ips = (sysmmu_ips)dev_id;
-	enum S5P_SYSMMU_INTERRUPT_TYPE itype;
-
-	itype = (enum S5P_SYSMMU_INTERRUPT_TYPE)
-		__ffs(__raw_readl(sysmmusfrs[ips] + S5P_INT_STATUS));
-
-	BUG_ON(!((itype >= 0) && (itype < 8)));
-
-	dev_alert(dev, "%s occurred by %s.\n", sysmmu_fault_name[itype],
-							sysmmu_ips_name[ips]);
-
-	if (fault_handlers[ips]) {
-		unsigned long addr;
-
-		base = __raw_readl(sysmmusfrs[ips] + S5P_PT_BASE_ADDR);
-		addr = __raw_readl(sysmmusfrs[ips] + fault_reg_offset[itype]);
-
-		if (fault_handlers[ips](itype, base, addr)) {
-			__raw_writel(1 << itype,
-					sysmmusfrs[ips] + S5P_INT_CLEAR);
-			dev_notice(dev, "%s from %s is resolved."
-					" Retrying translation.\n",
-				sysmmu_fault_name[itype], sysmmu_ips_name[ips]);
-		} else {
-			base = 0;
-		}
-	}
-
-	sysmmu_unblock(ips);
-
-	if (!base)
-		dev_notice(dev, "%s from %s is not handled.\n",
-			sysmmu_fault_name[itype], sysmmu_ips_name[ips]);
-
-	return IRQ_HANDLED;
-}
-
-void s5p_sysmmu_set_tablebase_pgd(sysmmu_ips ips, unsigned long pgd)
-{
-	if (is_sysmmu_active(ips)) {
-		sysmmu_block(ips);
-		__sysmmu_set_ptbase(ips, pgd);
-		sysmmu_unblock(ips);
-	} else {
-		dev_dbg(dev, "%s is disabled. "
-			"Skipping initializing page table base.\n",
-						sysmmu_ips_name[ips]);
-	}
-}
-
-void s5p_sysmmu_enable(sysmmu_ips ips, unsigned long pgd)
-{
-	if (!is_sysmmu_active(ips)) {
-		sysmmu_clk_enable(ips);
-
-		__sysmmu_set_ptbase(ips, pgd);
-
-		__raw_writel(CTRL_ENABLE, sysmmusfrs[ips] + S5P_MMU_CTRL);
-
-		set_sysmmu_active(ips);
-		dev_dbg(dev, "%s is enabled.\n", sysmmu_ips_name[ips]);
-	} else {
-		dev_dbg(dev, "%s is already enabled.\n", sysmmu_ips_name[ips]);
-	}
-}
-
-void s5p_sysmmu_disable(sysmmu_ips ips)
-{
-	if (is_sysmmu_active(ips)) {
-		__raw_writel(CTRL_DISABLE, sysmmusfrs[ips] + S5P_MMU_CTRL);
-		set_sysmmu_inactive(ips);
-		sysmmu_clk_disable(ips);
-		dev_dbg(dev, "%s is disabled.\n", sysmmu_ips_name[ips]);
-	} else {
-		dev_dbg(dev, "%s is already disabled.\n", sysmmu_ips_name[ips]);
-	}
-}
-
-void s5p_sysmmu_tlb_invalidate(sysmmu_ips ips)
-{
-	if (is_sysmmu_active(ips)) {
-		sysmmu_block(ips);
-		__sysmmu_tlb_invalidate(ips);
-		sysmmu_unblock(ips);
-	} else {
-		dev_dbg(dev, "%s is disabled. "
-			"Skipping invalidating TLB.\n", sysmmu_ips_name[ips]);
-	}
-}
-
-static int s5p_sysmmu_probe(struct platform_device *pdev)
-{
-	int i, ret;
-	struct resource *res, *mem;
-
-	dev = &pdev->dev;
-
-	for (i = 0; i < S5P_SYSMMU_TOTAL_IPNUM; i++) {
-		int irq;
-
-		sysmmu_clk_init(dev, i);
-		sysmmu_clk_disable(i);
-
-		res = platform_get_resource(pdev, IORESOURCE_MEM, i);
-		if (!res) {
-			dev_err(dev, "Failed to get the resource of %s.\n",
-							sysmmu_ips_name[i]);
-			ret = -ENODEV;
-			goto err_res;
-		}
-
-		mem = request_mem_region(res->start,
-				((res->end) - (res->start)) + 1, pdev->name);
-		if (!mem) {
-			dev_err(dev, "Failed to request the memory region of %s.\n",
-							sysmmu_ips_name[i]);
-			ret = -EBUSY;
-			goto err_res;
-		}
-
-		sysmmusfrs[i] = ioremap(res->start, res->end - res->start + 1);
-		if (!sysmmusfrs[i]) {
-			dev_err(dev, "Failed to ioremap() for %s.\n",
-							sysmmu_ips_name[i]);
-			ret = -ENXIO;
-			goto err_reg;
-		}
-
-		irq = platform_get_irq(pdev, i);
-		if (irq <= 0) {
-			dev_err(dev, "Failed to get the IRQ resource of %s.\n",
-							sysmmu_ips_name[i]);
-			ret = -ENOENT;
-			goto err_map;
-		}
-
-		if (request_irq(irq, s5p_sysmmu_irq, IRQF_DISABLED,
-						pdev->name, (void *)i)) {
-			dev_err(dev, "Failed to request IRQ for %s.\n",
-							sysmmu_ips_name[i]);
-			ret = -ENOENT;
-			goto err_map;
-		}
-	}
-
-	return 0;
-
-err_map:
-	iounmap(sysmmusfrs[i]);
-err_reg:
-	release_mem_region(mem->start, resource_size(mem));
-err_res:
-	return ret;
-}
-
-static int s5p_sysmmu_remove(struct platform_device *pdev)
-{
-	return 0;
-}
-int s5p_sysmmu_runtime_suspend(struct device *dev)
-{
-	return 0;
-}
-
-int s5p_sysmmu_runtime_resume(struct device *dev)
-{
-	return 0;
-}
-
-const struct dev_pm_ops s5p_sysmmu_pm_ops = {
-	.runtime_suspend	= s5p_sysmmu_runtime_suspend,
-	.runtime_resume		= s5p_sysmmu_runtime_resume,
-};
-
-static struct platform_driver s5p_sysmmu_driver = {
-	.probe		= s5p_sysmmu_probe,
-	.remove		= s5p_sysmmu_remove,
-	.driver		= {
-		.owner		= THIS_MODULE,
-		.name		= "s5p-sysmmu",
-		.pm		= &s5p_sysmmu_pm_ops,
-	}
-};
-
-static int __init s5p_sysmmu_init(void)
-{
-	return platform_driver_register(&s5p_sysmmu_driver);
-}
-arch_initcall(s5p_sysmmu_init);
+/* linux/arch/arm/plat-s5p/sysmmu.c
+ *
+ * Copyright (c) 2010-2011 Samsung Electronics Co., Ltd.
+ *		http://www.samsung.com
+ *
+ * Author: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/gfp.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/spinlock.h>
+#include <linux/mm.h>
+#include <linux/pagemap.h>
+#include <linux/module.h>
+#include <linux/clk.h>
+#include <linux/pm_runtime.h>
+#include <linux/iommu.h>
+
+#include <asm/memory.h>
+
+#include <plat/irqs.h>
+#include <plat/devs.h>
+#include <plat/cpu.h>
+#include <plat/sysmmu.h>
+
+#include <mach/map.h>
+#include <mach/regs-sysmmu.h>
+
+static int debug;
+module_param(debug, int, 0644);
+
+#define sysmmu_debug(level, fmt, arg...)				 \
+	do {								 \
+		if (debug >= level)					 \
+			printk(KERN_DEBUG "[%s] " fmt, __func__, ## arg);\
+	} while (0)
+
+#define FLPT_ENTRIES		4096
+#define FLPT_4K_64K_MASK	(~0x3FF)
+#define FLPT_1M_MASK		(~0xFFFFF)
+#define FLPT_16M_MASK		(~0xFFFFFF)
+#define SLPT_4K_MASK		(~0xFFF)
+#define SLPT_64K_MASK		(~0xFFFF)
+#define PAGE_4K_64K		0x1
+#define PAGE_1M			0x2
+#define PAGE_16M		0x40002
+#define PAGE_4K			0x2
+#define PAGE_64K		0x1
+#define FLPT_IDX_SHIFT		20
+#define FLPT_IDX_MASK		0xFFF
+#define FLPT_OFFS_SHIFT		(FLPT_IDX_SHIFT - 2)
+#define FLPT_OFFS_MASK		(FLPT_IDX_MASK << 2)
+#define SLPT_IDX_SHIFT		12
+#define SLPT_IDX_MASK		0xFF
+#define SLPT_OFFS_SHIFT		(SLPT_IDX_SHIFT - 2)
+#define SLPT_OFFS_MASK		(SLPT_IDX_MASK << 2)
+
+#define deref_va(va)		(*((unsigned long *)(va)))
+
+#define generic_extract(l, s, entry) \
+				((entry) & l##LPT_##s##_MASK)
+#define flpt_get_1m(entry)	generic_extract(F, 1M, deref_va(entry))
+#define flpt_get_16m(entry)	generic_extract(F, 16M, deref_va(entry))
+#define slpt_get_4k(entry)	generic_extract(S, 4K, deref_va(entry))
+#define slpt_get_64k(entry)	generic_extract(S, 64K, deref_va(entry))
+
+#define generic_entry(l, s, entry) \
+				(generic_extract(l, s, entry)  | PAGE_##s)
+#define flpt_ent_4k_64k(entry)	generic_entry(F, 4K_64K, entry)
+#define flpt_ent_1m(entry)	generic_entry(F, 1M, entry)
+#define flpt_ent_16m(entry)	generic_entry(F, 16M, entry)
+#define slpt_ent_4k(entry)	generic_entry(S, 4K, entry)
+#define slpt_ent_64k(entry)	generic_entry(S, 64K, entry)
+
+#define page_4k_64k(entry)	(deref_va(entry) & PAGE_4K_64K)
+#define page_1m(entry)		(deref_va(entry) & PAGE_1M)
+#define page_16m(entry)		((deref_va(entry) & PAGE_16M) == PAGE_16M)
+#define page_4k(entry)		(deref_va(entry) & PAGE_4K)
+#define page_64k(entry)		(deref_va(entry) & PAGE_64K)
+
+#define generic_pg_offs(l, s, va) \
+				(va & ~l##LPT_##s##_MASK)
+#define pg_offs_1m(va)		generic_pg_offs(F, 1M, va)
+#define pg_offs_16m(va)		generic_pg_offs(F, 16M, va)
+#define pg_offs_4k(va)		generic_pg_offs(S, 4K, va)
+#define pg_offs_64k(va)		generic_pg_offs(S, 64K, va)
+
+#define flpt_index(va)		(((va) >> FLPT_IDX_SHIFT) & FLPT_IDX_MASK)
+
+#define generic_offset(l, va)	(((va) >> l##LPT_OFFS_SHIFT) & l##LPT_OFFS_MASK)
+#define flpt_offs(va)		generic_offset(F, va)
+#define slpt_offs(va)		generic_offset(S, va)
+
+#define invalidate_slpt_ent(slpt_va) (deref_va(slpt_va) = 0UL)
+
+#define get_irq_callb(cb) \
+				(s5p_domain->irq_callb ? \
+					(s5p_domain->irq_callb->cb ? \
+					s5p_domain->irq_callb->cb : \
+					s5p_sysmmu_irq_callb.cb) \
+				: s5p_sysmmu_irq_callb.cb)
+
+struct s5p_sysmmu_info {
+	struct resource			*ioarea;
+	void __iomem			*regs;
+	unsigned int			irq;
+	struct clk			*clk;
+	bool				enabled;
+	enum s5p_sysmmu_ip		ip;
+	struct device			*dev;
+	struct iommu_domain		*domain;
+};
+
+/*
+ * iommu domain is a virtual address space of an I/O device driver.
+ * It contains kernel virtual and physical addresses of the first level
+ * page table and owns the memory in which the page tables are stored.
+ * It contains a table of kernel virtual addresses of second level
+ * page tables.
+ *
+ * In order to be used the iommu domain must be bound to an iommu device.
+ * This is accomplished with s5p_sysmmu_attach_dev, which is called through
+ * s5p_sysmmu_ops by drivers/base/iommu.c.
+ */
+struct s5p_sysmmu_domain {
+	unsigned long			flpt;
+	void				*flpt_va;
+	void				**slpt_va;
+	unsigned short			*refcount;
+	struct s5p_sysmmu_info		*sysmmu;
+	struct s5p_sysmmu_irq_callb	*irq_callb;
+	void				*irq_callb_priv;
+	int				policy;
+};
+
+static struct s5p_sysmmu_info *sysmmu_table[S5P_SYSMMU_TOTAL_IP_NUM];
+static DEFINE_SPINLOCK(sysmmu_slock);
+
+static struct kmem_cache *slpt_cache;
+
+static const char *irq_reasons[] = {
+	"sysmmu irq:page fault",
+	"sysmmu irq:ar multi hit",
+	"sysmmu irq:aw multi hit",
+	"sysmmu irq:bus error",
+	"sysmmu irq:ar security protection fault",
+	"sysmmu irq:ar access protection fault",
+	"sysmmu irq:aw security protection fault",
+	"sysmmu irq:aw access protection fault"
+};
+
+static void flush_cache(const void *start, unsigned long size)
+{
+	dmac_flush_range(start, start + size);
+	outer_flush_range(virt_to_phys(start), virt_to_phys(start + size));
+}
+
+static int s5p_sysmmu_domain_init(struct iommu_domain *domain)
+{
+	struct s5p_sysmmu_domain *s5p_domain;
+
+	s5p_domain = kzalloc(sizeof(struct s5p_sysmmu_domain), GFP_KERNEL);
+	if (!s5p_domain) {
+		sysmmu_debug(3, "no memory for state\n");
+		return -ENOMEM;
+	}
+	domain->priv = s5p_domain;
+
+	/*
+	 * first-level page table holds
+	 * 4k second-level descriptors == 16kB == 4 pages
+	 */
+	s5p_domain->flpt_va = kzalloc(FLPT_ENTRIES * sizeof(unsigned long),
+					 GFP_KERNEL);
+	if (!s5p_domain->flpt_va)
+		return -ENOMEM;
+	s5p_domain->flpt = virt_to_phys(s5p_domain->flpt_va);
+
+	s5p_domain->refcount = kzalloc(FLPT_ENTRIES * sizeof(u16), GFP_KERNEL);
+	if (!s5p_domain->refcount) {
+		kfree(s5p_domain->flpt_va);
+		return -ENOMEM;
+	}
+
+	s5p_domain->slpt_va = kzalloc(FLPT_ENTRIES * sizeof(void *),
+				      GFP_KERNEL);
+	if (!s5p_domain->slpt_va) {
+		kfree(s5p_domain->refcount);
+		kfree(s5p_domain->flpt_va);
+		return -ENOMEM;
+	}
+	flush_cache(s5p_domain->flpt_va, 4 * PAGE_SIZE);
+	return 0;
+}
+
+static void s5p_sysmmu_domain_destroy(struct iommu_domain *domain)
+{
+	struct s5p_sysmmu_domain *s5p_domain = domain->priv;
+	int i;
+	for (i = FLPT_ENTRIES - 1; i >= 0; --i)
+		if (s5p_domain->refcount[i])
+			kmem_cache_free(slpt_cache, s5p_domain->slpt_va[i]);
+
+	kfree(s5p_domain->slpt_va);
+	kfree(s5p_domain->refcount);
+	kfree(s5p_domain->flpt_va);
+	kfree(domain->priv);
+	domain->priv = NULL;
+}
+
+static int s5p_sysmmu_attach_dev(struct iommu_domain *domain,
+				 struct device *dev)
+{
+	struct platform_device *pdev =
+		container_of(dev, struct platform_device, dev);
+	struct s5p_sysmmu_info *sysmmu = platform_get_drvdata(pdev);
+	struct s5p_sysmmu_domain *s5p_domain = domain->priv;
+	unsigned int reg;
+
+	s5p_domain->sysmmu = sysmmu;
+	sysmmu->domain = domain;
+
+	pm_runtime_get_sync(sysmmu->dev);
+	clk_enable(sysmmu->clk);
+
+	/* configure first level page table base address */
+	writel(s5p_domain->flpt, sysmmu->regs + S5P_PT_BASE_ADDR);
+
+	reg = readl(sysmmu->regs + S5P_MMU_CFG);
+	if (s5p_domain->policy)
+		reg |= (0x1<<0);		/* replacement policy : LRU */
+	else
+		reg &= ~(0x1<<0);		/* replacement policy: RR */
+	writel(reg, sysmmu->regs + S5P_MMU_CFG);
+
+	reg = readl(sysmmu->regs + S5P_MMU_CTRL);
+	reg |= ((0x1<<2)|(0x1<<0));	/* Enable interrupt, Enable MMU */
+	writel(reg, sysmmu->regs + S5P_MMU_CTRL);
+
+	sysmmu->enabled = true;
+
+	return 0;
+}
+
+static void s5p_sysmmu_detach_dev(struct iommu_domain *domain,
+				  struct device *dev)
+{
+	struct platform_device *pdev =
+		container_of(dev, struct platform_device, dev);
+	struct s5p_sysmmu_info *sysmmu = platform_get_drvdata(pdev);
+	struct s5p_sysmmu_domain *s5p_domain = domain->priv;
+	unsigned int reg;
+
+	/* SYSMMU disable */
+	reg = readl(sysmmu->regs + S5P_MMU_CFG);
+	reg |= (0x1<<0);		/* replacement policy : LRU */
+	writel(reg, sysmmu->regs + S5P_MMU_CFG);
+
+	reg = readl(sysmmu->regs + S5P_MMU_CTRL);
+	reg &= ~(0x1);			/* Disable MMU */
+	writel(reg, sysmmu->regs + S5P_MMU_CTRL);
+
+	sysmmu->enabled = false;
+
+	clk_disable(sysmmu->clk);
+	pm_runtime_put_sync(sysmmu->dev);
+
+	sysmmu->domain = NULL;
+	s5p_domain->sysmmu = NULL;
+}
+
+#define bug_mapping_prohibited(iova, len) \
+		s5p_mapping_prohibited_impl(iova, len, __FILE__, __LINE__)
+
+static void s5p_mapping_prohibited_impl(unsigned long iova, size_t len,
+				   const char *file, int line)
+{
+	sysmmu_debug(3, "%s:%d Attempting to map %zu at 0x%lx over existing "
+		     "mapping\n", file, line, len, iova);
+	BUG();
+}
+
+/*
+ * Map an area of length corresponding to gfp_order, starting at iova.
+ * gfp_order is an order of units of 4kB: 0 -> 1 unit, 1 -> 2 units,
+ * 2 -> 4 units, 3 -> 8 units and so on.
+ *
+ * The act of mapping is all about deciding how to interpret in the MMU the
+ * virtual addresses belonging to the mapped range. Mapping can be done with
+ * 4kB, 64kB, 1MB and 16MB pages, so only orders of 0, 4, 8, 12 are valid.
+ *
+ * iova must be aligned on a 4kB, 64kB, 1MB or 16MB boundary, respectively.
+ */
+static int s5p_sysmmu_map(struct iommu_domain *domain, unsigned long iova,
+			  phys_addr_t paddr, int gfp_order, int prot)
+{
+	struct s5p_sysmmu_domain *s5p_domain = domain->priv;
+	int flpt_idx = flpt_index(iova);
+	size_t len = 0x1000UL << gfp_order;
+	void *flpt_va, *slpt_va;
+
+	if (len != SZ_16M && len != SZ_1M && len != SZ_64K && len != SZ_4K) {
+		sysmmu_debug(3, "bad order: %d\n", gfp_order);
+		return -EINVAL;
+	}
+
+	flpt_va = s5p_domain->flpt_va + flpt_offs(iova);
+
+	if (SZ_1M == len) {
+		if (deref_va(flpt_va))
+			bug_mapping_prohibited(iova, len);
+		deref_va(flpt_va) = flpt_ent_1m(paddr);
+		flush_cache(flpt_va, 4); /* one 4-byte entry */
+
+		return 0;
+	} else if (SZ_16M == len) {
+		int i = 0;
+		/* first loop to verify mapping allowed */
+		for (i = 0; i < 16; ++i)
+			if (deref_va(flpt_va + 4 * i))
+				bug_mapping_prohibited(iova, len);
+		/* actually map only if allowed */
+		for (i = 0; i < 16; ++i)
+			deref_va(flpt_va + 4 * i) = flpt_ent_16m(paddr);
+		flush_cache(flpt_va, 4 * 16); /* 16 4-byte entries */
+
+		return 0;
+	}
+
+	/* for 4K and 64K pages only */
+	if (page_1m(flpt_va) || page_16m(flpt_va))
+		bug_mapping_prohibited(iova, len);
+
+	/* need to allocate a new second level page table */
+	if (0 == deref_va(flpt_va)) {
+		void *slpt = kmem_cache_zalloc(slpt_cache, GFP_KERNEL);
+		if (!slpt) {
+			sysmmu_debug(3, "cannot allocate slpt\n");
+			return -ENOMEM;
+		}
+
+		s5p_domain->slpt_va[flpt_idx] = slpt;
+		deref_va(flpt_va) = flpt_ent_4k_64k(virt_to_phys(slpt));
+		flush_cache(flpt_va, 4);
+	}
+	slpt_va = s5p_domain->slpt_va[flpt_idx] + slpt_offs(iova);
+
+	if (SZ_4K == len) {
+		if (deref_va(slpt_va))
+			bug_mapping_prohibited(iova, len);
+		deref_va(slpt_va) = slpt_ent_4k(paddr);
+		flush_cache(slpt_va, 4); /* one 4-byte entry */
+		s5p_domain->refcount[flpt_idx]++;
+	} else {
+		int i;
+		/* first loop to verify mapping allowed */
+		for (i = 0; i < 16; ++i)
+			if (deref_va(slpt_va + 4 * i))
+				bug_mapping_prohibited(iova, len);
+		/* actually map only if allowed */
+		for (i = 0; i < 16; ++i) {
+			deref_va(slpt_va + 4 * i) = slpt_ent_64k(paddr);
+			s5p_domain->refcount[flpt_idx]++;
+		}
+		flush_cache(slpt_va, 4 * 16); /* 16 4-byte entries */
+	}
+
+	return 0;
+}
+
+static void s5p_tlb_invalidate(struct s5p_sysmmu_domain *domain)
+{
+	unsigned int reg;
+	void __iomem *regs;
+
+	if (!domain->sysmmu)
+		return;
+
+	regs = domain->sysmmu->regs;
+
+	/* TLB invalidate */
+	reg = readl(regs + S5P_MMU_CTRL);
+	reg |= (0x1<<1);		/* Block MMU */
+	writel(reg, regs + S5P_MMU_CTRL);
+
+	writel(0x1, regs + S5P_MMU_FLUSH);
+					/* Flush_entry */
+
+	reg = readl(regs + S5P_MMU_CTRL);
+	reg &= ~(0x1<<1);		/* Un-block MMU */
+	writel(reg, regs + S5P_MMU_CTRL);
+}
+
+#define bug_unmapping_prohibited(iova, len) \
+		s5p_unmapping_prohibited_impl(iova, len, __FILE__, __LINE__)
+
+static void s5p_unmapping_prohibited_impl(unsigned long iova, size_t len,
+				     const char *file, int line)
+{
+	sysmmu_debug(3, "%s:%d Attempting to unmap different size or "
+		"non-existing mapping %zu@0x%lx\n", file, line, len, iova);
+	BUG();
+}
+
+static int s5p_sysmmu_unmap(struct iommu_domain *domain, unsigned long iova,
+			    int gfp_order)
+{
+	struct s5p_sysmmu_domain *s5p_domain = domain->priv;
+	int flpt_idx = flpt_index(iova);
+	size_t len = 0x1000UL << gfp_order;
+	void *flpt_va, *slpt_va;
+
+	if (len != SZ_16M && len != SZ_1M && len != SZ_64K && len != SZ_4K) {
+		sysmmu_debug(3, "bad order: %d\n", gfp_order);
+		return -EINVAL;
+	}
+
+	flpt_va = s5p_domain->flpt_va + flpt_offs(iova);
+
+	/* check if there is any mapping at all */
+	if (!deref_va(flpt_va))
+		bug_unmapping_prohibited(iova, len);
+
+	if (SZ_1M == len) {
+		if (!page_1m(flpt_va))
+			bug_unmapping_prohibited(iova, len);
+		deref_va(flpt_va) = 0;
+		flush_cache(flpt_va, 4); /* one 4-byte entry */
+		s5p_tlb_invalidate(s5p_domain);
+
+		return 0;
+	} else if (SZ_16M == len) {
+		int i;
+		/* first loop to verify it actually is 16M mapping */
+		for (i = 0; i < 16; ++i)
+			if (!page_16m(flpt_va + 4 * i))
+				bug_unmapping_prohibited(iova, len);
+		/* actually unmap */
+		for (i = 0; i < 16; ++i)
+			deref_va(flpt_va + 4 * i) = 0;
+		flush_cache(flpt_va, 4 * 16); /* 16 4-byte entries */
+		s5p_tlb_invalidate(s5p_domain);
+
+		return 0;
+	}
+
+	if (!page_4k_64k(flpt_va))
+		bug_unmapping_prohibited(iova, len);
+
+	slpt_va = s5p_domain->slpt_va[flpt_idx] + slpt_offs(iova);
+
+	/* verify that we attempt to unmap a matching mapping */
+	if (SZ_4K == len) {
+		if (!page_4k(slpt_va))
+			bug_unmapping_prohibited(iova, len);
+	} else if (SZ_64K == len) {
+		int i;
+		for (i = 0; i < 16; ++i)
+			if (!page_64k(slpt_va + 4 * i))
+				bug_unmapping_prohibited(iova, len);
+	}
+
+	if (SZ_64K == len)
+		s5p_domain->refcount[flpt_idx] -= 15;
+
+	if (--s5p_domain->refcount[flpt_idx]) {
+		if (SZ_4K == len) {
+			invalidate_slpt_ent(slpt_va);
+			flush_cache(slpt_va, 4);
+		} else {
+			int i;
+			for (i = 0; i < 16; ++i)
+				invalidate_slpt_ent(slpt_va + 4 * i);
+			flush_cache(slpt_va, 4 * 16);
+		}
+	} else {
+		kmem_cache_free(slpt_cache, s5p_domain->slpt_va[flpt_idx]);
+		s5p_domain->slpt_va[flpt_idx] = 0;
+		memset(flpt_va, 0, 4);
+		flush_cache(flpt_va, 4);
+	}
+
+	s5p_tlb_invalidate(s5p_domain);
+
+	return 0;
+}
+
+phys_addr_t s5p_iova_to_phys(struct iommu_domain *domain, unsigned long iova)
+{
+	struct s5p_sysmmu_domain *s5p_domain = domain->priv;
+	int flpt_idx = flpt_index(iova);
+	unsigned long flpt_va, slpt_va;
+
+	flpt_va = (unsigned long)s5p_domain->flpt_va + flpt_offs(iova);
+
+	if (!deref_va(flpt_va))
+		return 0;
+
+	if (page_16m(flpt_va))
+		return flpt_get_16m(flpt_va) | pg_offs_16m(iova);
+	else if (page_1m(flpt_va))
+		return flpt_get_1m(flpt_va) | pg_offs_1m(iova);
+
+	if (!page_4k_64k(flpt_va))
+		return 0;
+
+	slpt_va = (unsigned long)s5p_domain->slpt_va[flpt_idx] +
+		  slpt_offs(iova);
+
+	if (!deref_va(slpt_va))
+		return 0;
+
+	if (page_4k(slpt_va))
+		return slpt_get_4k(slpt_va) | pg_offs_4k(iova);
+	else if (page_64k(slpt_va))
+		return slpt_get_64k(slpt_va) | pg_offs_64k(iova);
+
+	return 0;
+}
+
+static struct iommu_ops s5p_sysmmu_ops = {
+	.domain_init = s5p_sysmmu_domain_init,
+	.domain_destroy = s5p_sysmmu_domain_destroy,
+	.attach_dev = s5p_sysmmu_attach_dev,
+	.detach_dev = s5p_sysmmu_detach_dev,
+	.map = s5p_sysmmu_map,
+	.unmap = s5p_sysmmu_unmap,
+	.iova_to_phys = s5p_iova_to_phys,
+};
+
+struct device *s5p_sysmmu_get(enum s5p_sysmmu_ip ip)
+{
+	struct device *ret = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&sysmmu_slock, flags);
+	if (sysmmu_table[ip]) {
+		try_module_get(THIS_MODULE);
+		ret = sysmmu_table[ip]->dev;
+	}
+	spin_unlock_irqrestore(&sysmmu_slock, flags);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(s5p_sysmmu_get);
+
+void s5p_sysmmu_put(void *dev)
+{
+	BUG_ON(!dev);
+	module_put(THIS_MODULE);
+}
+EXPORT_SYMBOL_GPL(s5p_sysmmu_put);
+
+void s5p_sysmmu_domain_irq_callb(struct iommu_domain *domain,
+			    struct s5p_sysmmu_irq_callb *ops, void *priv)
+{
+	struct s5p_sysmmu_domain *s5p_domain = domain->priv;
+	s5p_domain->irq_callb = ops;
+	s5p_domain->irq_callb_priv = priv;
+}
+EXPORT_SYMBOL(s5p_sysmmu_domain_irq_callb);
+
+
+void s5p_sysmmu_domain_tlb_policy(struct iommu_domain *domain, int policy)
+{
+	struct s5p_sysmmu_domain *s5p_domain = domain->priv;
+	s5p_domain->policy = policy;
+}
+EXPORT_SYMBOL(s5p_sysmmu_domain_tlb_policy);
+
+static void s5p_sysmmu_irq_page_fault(struct iommu_domain *domain, int reason,
+				      unsigned long addr, void *priv)
+{
+	sysmmu_debug(3, "%s: Faulting virtual address: 0x%08lx\n",
+		     irq_reasons[reason], addr);
+	BUG();
+}
+
+static void s5p_sysmmu_irq_generic_callb(struct iommu_domain *domain,
+					 int reason, unsigned long addr,
+					 void *priv)
+{
+	sysmmu_debug(3, "%s\n", irq_reasons[reason]);
+	BUG();
+}
+
+static struct s5p_sysmmu_irq_callb s5p_sysmmu_irq_callb = {
+	.page_fault = s5p_sysmmu_irq_page_fault,
+	.ar_fault = s5p_sysmmu_irq_generic_callb,
+	.aw_fault = s5p_sysmmu_irq_generic_callb,
+	.bus_error = s5p_sysmmu_irq_generic_callb,
+	.ar_security = s5p_sysmmu_irq_generic_callb,
+	.ar_prot = s5p_sysmmu_irq_generic_callb,
+	.aw_security = s5p_sysmmu_irq_generic_callb,
+	.aw_prot = s5p_sysmmu_irq_generic_callb,
+};
+
+static irqreturn_t s5p_sysmmu_irq(int irq, void *dev_id)
+{
+	struct s5p_sysmmu_info *sysmmu = dev_id;
+	struct s5p_sysmmu_domain *s5p_domain = sysmmu->domain->priv;
+	unsigned int reg_INT_STATUS;
+
+	if (false == sysmmu->enabled)
+		return IRQ_HANDLED;
+
+	reg_INT_STATUS = readl(sysmmu->regs + S5P_INT_STATUS);
+	if (reg_INT_STATUS & 0xFF) {
+		S5P_IRQ_CB(cb);
+		enum s5p_sysmmu_fault reason = 0;
+		unsigned long fault = 0;
+		unsigned reg = 0;
+		cb = NULL;
+		switch (reg_INT_STATUS & 0xFF) {
+		case 0x1:
+			cb = get_irq_callb(page_fault);
+			reason = S5P_SYSMMU_PAGE_FAULT;
+			reg = S5P_PAGE_FAULT_ADDR;
+			break;
+		case 0x2:
+			cb = get_irq_callb(ar_fault);
+			reason = S5P_SYSMMU_AR_FAULT;
+			reg = S5P_AR_FAULT_ADDR;
+			break;
+		case 0x4:
+			cb = get_irq_callb(aw_fault);
+			reason = S5P_SYSMMU_AW_FAULT;
+			reg = S5P_AW_FAULT_ADDR;
+			break;
+		case 0x8:
+			cb = get_irq_callb(bus_error);
+			reason = S5P_SYSMMU_BUS_ERROR;
+			/* register common to page fault and bus error */
+			reg = S5P_PAGE_FAULT_ADDR;
+			break;
+		case 0x10:
+			cb = get_irq_callb(ar_security);
+			reason = S5P_SYSMMU_AR_SECURITY;
+			reg = S5P_AR_FAULT_ADDR;
+			break;
+		case 0x20:
+			cb = get_irq_callb(ar_prot);
+			reason = S5P_SYSMMU_AR_PROT;
+			reg = S5P_AR_FAULT_ADDR;
+			break;
+		case 0x40:
+			cb = get_irq_callb(aw_security);
+			reason = S5P_SYSMMU_AW_SECURITY;
+			reg = S5P_AW_FAULT_ADDR;
+			break;
+		case 0x80:
+			cb = get_irq_callb(aw_prot);
+			reason = S5P_SYSMMU_AW_PROT;
+			reg = S5P_AW_FAULT_ADDR;
+			break;
+		}
+		fault = readl(sysmmu->regs + reg);
+		cb(sysmmu->domain, reason, fault, s5p_domain->irq_callb_priv);
+		writel(reg_INT_STATUS, sysmmu->regs + S5P_INT_CLEAR);
+	}
+	return IRQ_HANDLED;
+}
+
+static int s5p_sysmmu_probe(struct platform_device *pdev)
+{
+	struct s5p_sysmmu_info *sysmmu;
+	struct resource *res;
+	int ret;
+	unsigned long flags;
+
+	sysmmu = kzalloc(sizeof(struct s5p_sysmmu_info), GFP_KERNEL);
+	if (!sysmmu) {
+		dev_err(&pdev->dev, "no memory for state\n");
+		return -ENOMEM;
+	}
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (NULL == res) {
+		dev_err(&pdev->dev, "cannot find IO resource\n");
+		ret = -ENOENT;
+		goto err_s5p_sysmmu_info_allocated;
+	}
+
+	sysmmu->ioarea = request_mem_region(res->start, resource_size(res),
+					 pdev->name);
+
+	if (NULL == sysmmu->ioarea) {
+		dev_err(&pdev->dev, "cannot request IO\n");
+		ret = -ENXIO;
+		goto err_s5p_sysmmu_info_allocated;
+	}
+
+	sysmmu->regs = ioremap(res->start, resource_size(res));
+
+	if (NULL == sysmmu->regs) {
+		dev_err(&pdev->dev, "cannot map IO\n");
+		ret = -ENXIO;
+		goto err_ioarea_requested;
+	}
+
+	dev_dbg(&pdev->dev, "registers %p (%p, %p)\n",
+		sysmmu->regs, sysmmu->ioarea, res);
+
+	sysmmu->irq = ret = platform_get_irq(pdev, 0);
+	if (ret <= 0) {
+		dev_err(&pdev->dev, "cannot find IRQ\n");
+		goto err_iomap_done;
+	}
+
+	ret = request_irq(sysmmu->irq, s5p_sysmmu_irq, 0,
+			  dev_name(&pdev->dev), sysmmu);
+
+	if (ret != 0) {
+		dev_err(&pdev->dev, "cannot claim IRQ %d\n", sysmmu->irq);
+		goto err_iomap_done;
+	}
+
+	sysmmu->clk = clk_get(&pdev->dev, "sysmmu");
+	if (IS_ERR_OR_NULL(sysmmu->clk)) {
+		dev_err(&pdev->dev, "cannot get clock\n");
+		ret = -ENOENT;
+		goto err_request_irq_done;
+	}
+	dev_dbg(&pdev->dev, "clock source %p\n", sysmmu->clk);
+
+	sysmmu->ip = pdev->id;
+
+	spin_lock_irqsave(&sysmmu_slock, flags);
+	sysmmu_table[pdev->id] = sysmmu;
+	spin_unlock_irqrestore(&sysmmu_slock, flags);
+
+	sysmmu->dev = &pdev->dev;
+
+	platform_set_drvdata(pdev, sysmmu);
+
+	pm_runtime_set_active(&pdev->dev);
+	pm_runtime_enable(&pdev->dev);
+
+	dev_info(&pdev->dev, "Samsung S5P SYSMMU (IOMMU)\n");
+	return 0;
+
+err_request_irq_done:
+	free_irq(sysmmu->irq, sysmmu);
+
+err_iomap_done:
+	iounmap(sysmmu->regs);
+
+err_ioarea_requested:
+	release_resource(sysmmu->ioarea);
+	kfree(sysmmu->ioarea);
+
+err_s5p_sysmmu_info_allocated:
+	kfree(sysmmu);
+	return ret;
+}
+
+static int s5p_sysmmu_remove(struct platform_device *pdev)
+{
+	struct s5p_sysmmu_info *sysmmu = platform_get_drvdata(pdev);
+	unsigned long flags;
+
+	pm_runtime_disable(sysmmu->dev);
+
+	spin_lock_irqsave(&sysmmu_slock, flags);
+	sysmmu_table[pdev->id] = NULL;
+	spin_unlock_irqrestore(&sysmmu_slock, flags);
+
+	clk_disable(sysmmu->clk);
+	clk_put(sysmmu->clk);
+
+	free_irq(sysmmu->irq, sysmmu);
+
+	iounmap(sysmmu->regs);
+
+	release_resource(sysmmu->ioarea);
+	kfree(sysmmu->ioarea);
+
+	kfree(sysmmu);
+
+	return 0;
+}
+
+static int
+s5p_sysmmu_suspend(struct platform_device *pdev, pm_message_t state)
+{
+	int ret = 0;
+	sysmmu_debug(3, "begin\n");
+
+	return ret;
+}
+
+static int s5p_sysmmu_resume(struct platform_device *pdev)
+{
+	int ret = 0;
+	sysmmu_debug(3, "begin\n");
+
+	return ret;
+}
+
+static int s5p_sysmmu_runtime_suspend(struct device *dev)
+{
+	sysmmu_debug(3, "begin\n");
+	return 0;
+}
+
+static int s5p_sysmmu_runtime_resume(struct device *dev)
+{
+	sysmmu_debug(3, "begin\n");
+	return 0;
+}
+
+static const struct dev_pm_ops s5p_sysmmu_pm_ops = {
+	.runtime_suspend = s5p_sysmmu_runtime_suspend,
+	.runtime_resume	 = s5p_sysmmu_runtime_resume,
+};
+
+static struct platform_driver s5p_sysmmu_driver = {
+	.probe = s5p_sysmmu_probe,
+	.remove = s5p_sysmmu_remove,
+	.suspend = s5p_sysmmu_suspend,
+	.resume = s5p_sysmmu_resume,
+	.driver = {
+		.owner = THIS_MODULE,
+		.name = "s5p-sysmmu",
+		.pm = &s5p_sysmmu_pm_ops,
+	},
+};
+
+static int __init
+s5p_sysmmu_register(void)
+{
+	int ret;
+
+	sysmmu_debug(3, "Registering sysmmu driver...\n");
+
+	slpt_cache = kmem_cache_create("slpt_cache", 1024, 1024,
+				       SLAB_HWCACHE_ALIGN, NULL);
+	if (!slpt_cache) {
+		printk(KERN_ERR
+			"%s: failed to allocate slpt cache\n", __func__);
+		return -ENOMEM;
+	}
+
+	ret = platform_driver_register(&s5p_sysmmu_driver);
+
+	if (ret) {
+		printk(KERN_ERR
+			"%s: failed to register sysmmu driver\n", __func__);
+		return -EINVAL;
+	}
+
+	register_iommu(&s5p_sysmmu_ops);
+
+	return ret;
+}
+
+static void __exit
+s5p_sysmmu_unregister(void)
+{
+	kmem_cache_destroy(slpt_cache);
+	platform_driver_unregister(&s5p_sysmmu_driver);
+}
+
+module_init(s5p_sysmmu_register);
+module_exit(s5p_sysmmu_unregister);
+
+MODULE_AUTHOR("Andrzej Pietrasiewicz <andrzej.p@samsung.com>");
+MODULE_DESCRIPTION("Samsung System MMU (IOMMU) driver");
+MODULE_LICENSE("GPL");
+
diff --git a/arch/arm/plat-samsung/include/plat/devs.h b/arch/arm/plat-samsung/include/plat/devs.h
index f0da6b7..0ae5dd0 100644
--- a/arch/arm/plat-samsung/include/plat/devs.h
+++ b/arch/arm/plat-samsung/include/plat/devs.h
@@ -142,7 +142,7 @@ extern struct platform_device s5p_device_fimc3;
 extern struct platform_device s5p_device_mipi_csis0;
 extern struct platform_device s5p_device_mipi_csis1;
 
-extern struct platform_device exynos4_device_sysmmu;
+extern struct platform_device exynos4_device_sysmmu[];
 
 /* s3c2440 specific devices */
 
-- 
1.7.1.569.g6f426


* [PATCH 3/7] v4l: videobuf2: dma-sg: move some generic functions to memops
  2011-04-18  9:26 ` Marek Szyprowski
@ 2011-04-18  9:26   ` Marek Szyprowski
  -1 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-18  9:26 UTC (permalink / raw)
  To: linux-arm-kernel, linux-samsung-soc, linux-media
  Cc: Marek Szyprowski, Kyungmin Park, Andrzej Pietrasiwiecz,
	Sylwester Nawrocki, Arnd Bergmann, Kukjin Kim, Pawel Osciak

From: Andrzej Pietrasiewicz <andrzej.p@samsung.com>

This patch moves some generic code from videobuf2-dma-sg to
videobuf2-memops so that the iommu allocator can reuse it later. It also
adds vma locking in user pointer mode.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
CC: Pawel Osciak <pawel@osciak.com>
---
 drivers/media/video/videobuf2-dma-sg.c |   37 +++++----------
 drivers/media/video/videobuf2-memops.c |   76 ++++++++++++++++++++++++++++++++
 include/media/videobuf2-memops.h       |    5 ++
 3 files changed, 93 insertions(+), 25 deletions(-)

diff --git a/drivers/media/video/videobuf2-dma-sg.c b/drivers/media/video/videobuf2-dma-sg.c
index b2d9485..240abaa 100644
--- a/drivers/media/video/videobuf2-dma-sg.c
+++ b/drivers/media/video/videobuf2-dma-sg.c
@@ -29,6 +29,7 @@ struct vb2_dma_sg_buf {
 	struct vb2_dma_sg_desc		sg_desc;
 	atomic_t			refcount;
 	struct vb2_vmarea_handler	handler;
+	struct vm_area_struct		*vma;
 };
 
 static void vb2_dma_sg_put(void *buf_priv);
@@ -150,15 +151,9 @@ static void *vb2_dma_sg_get_userptr(void *alloc_ctx, unsigned long vaddr,
 	if (!buf->pages)
 		goto userptr_fail_pages_array_alloc;
 
-	down_read(&current->mm->mmap_sem);
-	num_pages_from_user = get_user_pages(current, current->mm,
-					     vaddr & PAGE_MASK,
-					     buf->sg_desc.num_pages,
-					     write,
-					     1, /* force */
-					     buf->pages,
-					     NULL);
-	up_read(&current->mm->mmap_sem);
+	num_pages_from_user = vb2_get_user_pages(vaddr, buf->sg_desc.num_pages,
+					     buf->pages, write, &buf->vma);
+
 	if (num_pages_from_user != buf->sg_desc.num_pages)
 		goto userptr_fail_get_user_pages;
 
@@ -177,6 +172,8 @@ userptr_fail_get_user_pages:
 	       num_pages_from_user, buf->sg_desc.num_pages);
 	while (--num_pages_from_user >= 0)
 		put_page(buf->pages[num_pages_from_user]);
+	if (buf->vma)
+		vb2_put_vma(buf->vma);
 	kfree(buf->pages);
 
 userptr_fail_pages_array_alloc:
@@ -200,6 +197,8 @@ static void vb2_dma_sg_put_userptr(void *buf_priv)
 	       __func__, buf->sg_desc.num_pages);
 	if (buf->vaddr)
 		vm_unmap_ram(buf->vaddr, buf->sg_desc.num_pages);
+	if (buf->vma)
+		vb2_put_vma(buf->vma);
 	while (--i >= 0) {
 		if (buf->write)
 			set_page_dirty_lock(buf->pages[i]);
@@ -236,28 +235,16 @@ static unsigned int vb2_dma_sg_num_users(void *buf_priv)
 static int vb2_dma_sg_mmap(void *buf_priv, struct vm_area_struct *vma)
 {
 	struct vb2_dma_sg_buf *buf = buf_priv;
-	unsigned long uaddr = vma->vm_start;
-	unsigned long usize = vma->vm_end - vma->vm_start;
-	int i = 0;
+	int ret;
 
 	if (!buf) {
 		printk(KERN_ERR "No memory to map\n");
 		return -EINVAL;
 	}
 
-	do {
-		int ret;
-
-		ret = vm_insert_page(vma, uaddr, buf->pages[i++]);
-		if (ret) {
-			printk(KERN_ERR "Remapping memory, error: %d\n", ret);
-			return ret;
-		}
-
-		uaddr += PAGE_SIZE;
-		usize -= PAGE_SIZE;
-	} while (usize > 0);
-
+	ret = vb2_insert_pages(vma, buf->pages);
+	if (ret)
+		return ret;
 
 	/*
 	 * Use common vm_area operations to track buffer refcount.
diff --git a/drivers/media/video/videobuf2-memops.c b/drivers/media/video/videobuf2-memops.c
index 5370a3a..9d44473 100644
--- a/drivers/media/video/videobuf2-memops.c
+++ b/drivers/media/video/videobuf2-memops.c
@@ -185,6 +185,82 @@ int vb2_mmap_pfn_range(struct vm_area_struct *vma, unsigned long paddr,
 EXPORT_SYMBOL_GPL(vb2_mmap_pfn_range);
 
 /**
+ * vb2_get_user_pages() - pin user pages
+ * @vaddr:	virtual address from which to start
+ * @num_pages:	number of pages to pin
+ * @pages:	table of pointers to struct pages to pin
+ * @write:	if 0, the pages must not be written to
+ * @vma:	output parameter, copy of the vma or NULL
+ *		if get_user_pages fails
+ *
+ * This function wraps get_user_pages(): it takes mmap_sem and keeps a copy
+ * of the vma, which eases using get_user_pages() in videobuf2 allocators.
+ */
+int vb2_get_user_pages(unsigned long vaddr, unsigned int num_pages,
+		       struct page **pages, int write, struct vm_area_struct **vma)
+{
+	struct vm_area_struct *found_vma;
+	struct mm_struct *mm = current->mm;
+	int ret = -EFAULT;
+
+	down_read(&current->mm->mmap_sem);
+
+	found_vma = find_vma(mm, vaddr);
+	if (NULL == found_vma || found_vma->vm_end < (vaddr + num_pages * PAGE_SIZE))
+		goto done;
+
+	*vma = vb2_get_vma(found_vma);
+	if (NULL == *vma) {
+		ret = -ENOMEM;
+		goto done;
+	}
+
+	ret = get_user_pages(current, current->mm, vaddr & PAGE_MASK, num_pages,
+			     write, 1 /* force */, pages, NULL);
+
+	if (ret != num_pages) {
+		vb2_put_vma(*vma);
+		*vma = NULL;
+	}
+
+done:
+	up_read(&current->mm->mmap_sem);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(vb2_get_user_pages);
+
+/**
+ * vb2_insert_pages - insert pages into user vma
+ * @vma:	virtual memory region for the mapping
+ * @pages:	table of pointers to struct pages to be inserted
+ *
+ * This function calls vm_insert_page() for each page to be inserted.
+ */
+int vb2_insert_pages(struct vm_area_struct *vma, struct page **pages)
+{
+	unsigned long uaddr = vma->vm_start;
+	unsigned long usize = vma->vm_end - vma->vm_start;
+	int i = 0;
+
+	do {
+		int ret;
+
+		ret = vm_insert_page(vma, uaddr, pages[i++]);
+		if (ret) {
+			printk(KERN_ERR "Remapping memory, error: %d\n", ret);
+			return ret;
+		}
+
+		uaddr += PAGE_SIZE;
+		usize -= PAGE_SIZE;
+	} while (usize > 0);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vb2_insert_pages);
+
+/**
  * vb2_common_vm_open() - increase refcount of the vma
  * @vma:	virtual memory region for the mapping
  *
diff --git a/include/media/videobuf2-memops.h b/include/media/videobuf2-memops.h
index 84e1f6c..f8a0886 100644
--- a/include/media/videobuf2-memops.h
+++ b/include/media/videobuf2-memops.h
@@ -41,5 +41,10 @@ int vb2_mmap_pfn_range(struct vm_area_struct *vma, unsigned long paddr,
 struct vm_area_struct *vb2_get_vma(struct vm_area_struct *vma);
 void vb2_put_vma(struct vm_area_struct *vma);
 
+int vb2_get_user_pages(unsigned long vaddr, unsigned int num_pages,
+		       struct page **pages, int write,
+		       struct vm_area_struct **vma);
+
+int vb2_insert_pages(struct vm_area_struct *vma, struct page **pages);
 
 #endif
-- 
1.7.1.569.g6f426

* [PATCH 3/7] v4l: videobuf2: dma-sg: move some generic functions to memops
@ 2011-04-18  9:26   ` Marek Szyprowski
  0 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-18  9:26 UTC (permalink / raw)
  To: linux-arm-kernel

From: Andrzej Pietrasiewicz <andrzej.p@samsung.com>

This patch moves some generic code from videobuf2-dma-sg to
videobuf2-memops so that the iommu allocator can reuse it later. It also
adds vma locking in user pointer mode.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
CC: Pawel Osciak <pawel@osciak.com>
---
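Note for readers, not part of the commit: below is a minimal user-space
sketch of the arithmetic behind the user pointer path, assuming 4 KiB
pages. span_pages() uses the same first/last page computation as
vb2_dma_iommu_get_userptr() in patch 4/7. region_covers() is a
hypothetical, stricter variant of the vma range check done in
vb2_get_user_pages(): it also checks the region start and page-aligns
the address, which the patch itself does not do. All sketch_* names and
both helpers are illustrative assumptions, not kernel API.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT	12
#define SKETCH_PAGE_SIZE	(1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK	(~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical stand-in for a vma: covers addresses in [start, end). */
struct sketch_region {
	unsigned long start;
	unsigned long end;
};

/* Number of pages touched by the byte range [vaddr, vaddr + size). */
static unsigned long span_pages(unsigned long vaddr, unsigned long size)
{
	unsigned long first, last;

	assert(size > 0);
	first = vaddr >> SKETCH_PAGE_SHIFT;
	last = (vaddr + size - 1) >> SKETCH_PAGE_SHIFT;
	return last - first + 1;
}

/* Returns 1 if num_pages pages starting at vaddr's page lie inside r. */
static int region_covers(const struct sketch_region *r,
			 unsigned long vaddr, unsigned long num_pages)
{
	unsigned long base = vaddr & SKETCH_PAGE_MASK;

	return r->start <= base &&
	       base + num_pages * SKETCH_PAGE_SIZE <= r->end;
}
```

A buffer of 0x20 bytes at 0x1ff0 straddles a page boundary, so it needs
two pages even though it is much smaller than one page.
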
 drivers/media/video/videobuf2-dma-sg.c |   37 +++++----------
 drivers/media/video/videobuf2-memops.c |   76 ++++++++++++++++++++++++++++++++
 include/media/videobuf2-memops.h       |    5 ++
 3 files changed, 93 insertions(+), 25 deletions(-)

diff --git a/drivers/media/video/videobuf2-dma-sg.c b/drivers/media/video/videobuf2-dma-sg.c
index b2d9485..240abaa 100644
--- a/drivers/media/video/videobuf2-dma-sg.c
+++ b/drivers/media/video/videobuf2-dma-sg.c
@@ -29,6 +29,7 @@ struct vb2_dma_sg_buf {
 	struct vb2_dma_sg_desc		sg_desc;
 	atomic_t			refcount;
 	struct vb2_vmarea_handler	handler;
+	struct vm_area_struct		*vma;
 };
 
 static void vb2_dma_sg_put(void *buf_priv);
@@ -150,15 +151,9 @@ static void *vb2_dma_sg_get_userptr(void *alloc_ctx, unsigned long vaddr,
 	if (!buf->pages)
 		goto userptr_fail_pages_array_alloc;
 
-	down_read(&current->mm->mmap_sem);
-	num_pages_from_user = get_user_pages(current, current->mm,
-					     vaddr & PAGE_MASK,
-					     buf->sg_desc.num_pages,
-					     write,
-					     1, /* force */
-					     buf->pages,
-					     NULL);
-	up_read(&current->mm->mmap_sem);
+	num_pages_from_user = vb2_get_user_pages(vaddr, buf->sg_desc.num_pages,
+					     buf->pages, write, &buf->vma);
+
 	if (num_pages_from_user != buf->sg_desc.num_pages)
 		goto userptr_fail_get_user_pages;
 
@@ -177,6 +172,8 @@ userptr_fail_get_user_pages:
 	       num_pages_from_user, buf->sg_desc.num_pages);
 	while (--num_pages_from_user >= 0)
 		put_page(buf->pages[num_pages_from_user]);
+	if (buf->vma)
+		vb2_put_vma(buf->vma);
 	kfree(buf->pages);
 
 userptr_fail_pages_array_alloc:
@@ -200,6 +197,8 @@ static void vb2_dma_sg_put_userptr(void *buf_priv)
 	       __func__, buf->sg_desc.num_pages);
 	if (buf->vaddr)
 		vm_unmap_ram(buf->vaddr, buf->sg_desc.num_pages);
+	if (buf->vma)
+		vb2_put_vma(buf->vma);
 	while (--i >= 0) {
 		if (buf->write)
 			set_page_dirty_lock(buf->pages[i]);
@@ -236,28 +235,16 @@ static unsigned int vb2_dma_sg_num_users(void *buf_priv)
 static int vb2_dma_sg_mmap(void *buf_priv, struct vm_area_struct *vma)
 {
 	struct vb2_dma_sg_buf *buf = buf_priv;
-	unsigned long uaddr = vma->vm_start;
-	unsigned long usize = vma->vm_end - vma->vm_start;
-	int i = 0;
+	int ret;
 
 	if (!buf) {
 		printk(KERN_ERR "No memory to map\n");
 		return -EINVAL;
 	}
 
-	do {
-		int ret;
-
-		ret = vm_insert_page(vma, uaddr, buf->pages[i++]);
-		if (ret) {
-			printk(KERN_ERR "Remapping memory, error: %d\n", ret);
-			return ret;
-		}
-
-		uaddr += PAGE_SIZE;
-		usize -= PAGE_SIZE;
-	} while (usize > 0);
-
+	ret = vb2_insert_pages(vma, buf->pages);
+	if (ret)
+		return ret;
 
 	/*
 	 * Use common vm_area operations to track buffer refcount.
diff --git a/drivers/media/video/videobuf2-memops.c b/drivers/media/video/videobuf2-memops.c
index 5370a3a..9d44473 100644
--- a/drivers/media/video/videobuf2-memops.c
+++ b/drivers/media/video/videobuf2-memops.c
@@ -185,6 +185,82 @@ int vb2_mmap_pfn_range(struct vm_area_struct *vma, unsigned long paddr,
 EXPORT_SYMBOL_GPL(vb2_mmap_pfn_range);
 
 /**
+ * vb2_get_user_pages() - pin user pages
+ * @vaddr:	virtual address from which to start
+ * @num_pages:	number of pages to pin
+ * @pages:	table of pointers to struct pages to pin
+ * @write:	if 0, the pages must not be written to
+ * @vma:	output parameter, copy of the vma or NULL
+ *		if get_user_pages fails
+ *
+ * This function wraps get_user_pages(): it takes mmap_sem and keeps a copy
+ * of the vma, which eases using get_user_pages() in videobuf2 allocators.
+ */
+int vb2_get_user_pages(unsigned long vaddr, unsigned int num_pages,
+		       struct page **pages, int write, struct vm_area_struct **vma)
+{
+	struct vm_area_struct *found_vma;
+	struct mm_struct *mm = current->mm;
+	int ret = -EFAULT;
+
+	down_read(&current->mm->mmap_sem);
+
+	found_vma = find_vma(mm, vaddr);
+	if (NULL == found_vma || found_vma->vm_end < (vaddr + num_pages * PAGE_SIZE))
+		goto done;
+
+	*vma = vb2_get_vma(found_vma);
+	if (NULL == *vma) {
+		ret = -ENOMEM;
+		goto done;
+	}
+
+	ret = get_user_pages(current, current->mm, vaddr & PAGE_MASK, num_pages,
+			     write, 1 /* force */, pages, NULL);
+
+	if (ret != num_pages) {
+		vb2_put_vma(*vma);
+		*vma = NULL;
+	}
+
+done:
+	up_read(&current->mm->mmap_sem);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(vb2_get_user_pages);
+
+/**
+ * vb2_insert_pages - insert pages into user vma
+ * @vma:	virtual memory region for the mapping
+ * @pages:	table of pointers to struct pages to be inserted
+ *
+ * This function calls vm_insert_page() for each page to be inserted.
+ */
+int vb2_insert_pages(struct vm_area_struct *vma, struct page **pages)
+{
+	unsigned long uaddr = vma->vm_start;
+	unsigned long usize = vma->vm_end - vma->vm_start;
+	int i = 0;
+
+	do {
+		int ret;
+
+		ret = vm_insert_page(vma, uaddr, pages[i++]);
+		if (ret) {
+			printk(KERN_ERR "Remapping memory, error: %d\n", ret);
+			return ret;
+		}
+
+		uaddr += PAGE_SIZE;
+		usize -= PAGE_SIZE;
+	} while (usize > 0);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vb2_insert_pages);
+
+/**
  * vb2_common_vm_open() - increase refcount of the vma
  * @vma:	virtual memory region for the mapping
  *
diff --git a/include/media/videobuf2-memops.h b/include/media/videobuf2-memops.h
index 84e1f6c..f8a0886 100644
--- a/include/media/videobuf2-memops.h
+++ b/include/media/videobuf2-memops.h
@@ -41,5 +41,10 @@ int vb2_mmap_pfn_range(struct vm_area_struct *vma, unsigned long paddr,
 struct vm_area_struct *vb2_get_vma(struct vm_area_struct *vma);
 void vb2_put_vma(struct vm_area_struct *vma);
 
+int vb2_get_user_pages(unsigned long vaddr, unsigned int num_pages,
+		       struct page **pages, int write,
+		       struct vm_area_struct **vma);
+
+int vb2_insert_pages(struct vm_area_struct *vma, struct page **pages);
 
 #endif
-- 
1.7.1.569.g6f426

* [PATCH 4/7] v4l: videobuf2: add IOMMU based DMA memory allocator
  2011-04-18  9:26 ` Marek Szyprowski
@ 2011-04-18  9:26   ` Marek Szyprowski
  -1 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-18  9:26 UTC (permalink / raw)
  To: linux-arm-kernel, linux-samsung-soc, linux-media
  Cc: Marek Szyprowski, Kyungmin Park, Andrzej Pietrasiwiecz,
	Sylwester Nawrocki, Arnd Bergmann, Kukjin Kim

From: Andrzej Pietrasiewicz <andrzej.p@samsung.com>

This patch adds a new videobuf2 memory allocator dedicated to devices that
support IOMMU DMA mappings. It requires a device with an IOMMU module and
an IOMMU driver that implements the include/linux/iommu.h interface. The
allocator acquires memory with standard alloc_pages() calls, so it doesn't
suffer from memory fragmentation issues. To reduce IOMMU translation
overhead it supports the following page sizes: 4KiB, 64KiB, 1MiB and 16MiB.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
---
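Note for readers, not part of the commit: the sketch below shows, in
plain user-space C, how a page-aligned buffer size can be split greedily
into 16MiB, 1MiB, 64KiB and 4KiB blocks (orders 12, 8, 4 and 0). The
thresholds mirror vb2_dma_iommu_max_order(), but this is a simplified
model under stated assumptions: it assumes 4 KiB base pages and that
every allocation succeeds, so it leaves out the fallback to smaller
orders that the allocator performs when alloc_pages() fails.
max_order_for() and split_size() are hypothetical names.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT	12
#define SKETCH_PAGE_SIZE	(1UL << SKETCH_PAGE_SHIFT)

/* Largest supported order (0, 4, 8 or 12) whose block fits in size. */
static int max_order_for(unsigned long size)
{
	if (size < (SKETCH_PAGE_SIZE << 4))	/* < 64KiB */
		return 0;
	if (size < (SKETCH_PAGE_SIZE << 8))	/* < 1MiB */
		return 4;
	if (size < (SKETCH_PAGE_SIZE << 12))	/* < 16MiB */
		return 8;
	return 12;
}

/*
 * Split size into blocks, largest first. counts[k] receives the number
 * of blocks of order 4 * k. Returns the total number of blocks.
 */
static unsigned long split_size(unsigned long size, unsigned long counts[4])
{
	unsigned long total = 0;
	int k;

	assert(size % SKETCH_PAGE_SIZE == 0);

	for (k = 0; k < 4; k++)
		counts[k] = 0;

	while (size > 0) {
		int order = max_order_for(size);
		unsigned long block = SKETCH_PAGE_SIZE << order;
		unsigned long n = size / block;

		counts[order / 4] += n;
		total += n;
		size -= n * block;
	}
	return total;
}
```

For a 0x123000 byte buffer this yields one 1MiB block, two 64KiB blocks
and three 4KiB pages, so six IOMMU mappings instead of 291 single-page
ones.
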
 drivers/media/video/Kconfig               |    8 +-
 drivers/media/video/Makefile              |    1 +
 drivers/media/video/videobuf2-dma-iommu.c |  762 +++++++++++++++++++++++++++++
 include/media/videobuf2-dma-iommu.h       |   48 ++
 4 files changed, 818 insertions(+), 1 deletions(-)
 create mode 100644 drivers/media/video/videobuf2-dma-iommu.c
 create mode 100644 include/media/videobuf2-dma-iommu.h
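
Note for readers, not part of the commit: the allocator records block
boundaries in desc->pg_map, where a set bit marks the first 4KiB page of
each (compound) page and the distance to the next set bit gives the
block length. The user-space sketch below models that encoding with one
byte per flag instead of a real bitmap. next_start(), run_to_order() and
decode_runs() are hypothetical helpers, and unlike the patch the sketch
rejects run lengths other than 1, 16, 256 or 4096.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the next set flag at or after from, or n if there is none. */
static size_t next_start(const unsigned char *starts, size_t n, size_t from)
{
	while (from < n && !starts[from])
		from++;
	return from;
}

/* Run length in 4KiB pages -> allocation order, or -1 if unsupported. */
static int run_to_order(size_t run)
{
	switch (run) {
	case 1:
		return 0;
	case 16:
		return 4;
	case 256:
		return 8;
	case 4096:
		return 12;
	default:
		return -1;
	}
}

/*
 * Walk the flags and store the order of every block in orders[].
 * Returns the number of blocks, or -1 if the flags are malformed or
 * orders[] is too small.
 */
static int decode_runs(const unsigned char *starts, size_t n,
		       int *orders, size_t max_runs)
{
	size_t i = next_start(starts, n, 0);
	int count = 0;

	if (n > 0 && i != 0)
		return -1;	/* the first page must start a block */

	while (i < n) {
		size_t next = next_start(starts, n, i + 1);
		int order = run_to_order(next - i);

		if (order < 0 || (size_t)count >= max_runs)
			return -1;
		orders[count++] = order;
		i = next;
	}
	return count;
}
```

With 18 pages flagged at indices 0, 16 and 17, the walk recovers one
64KiB block followed by two 4KiB pages.
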

diff --git a/drivers/media/video/Kconfig b/drivers/media/video/Kconfig
index 4498b94..40d7bcc 100644
--- a/drivers/media/video/Kconfig
+++ b/drivers/media/video/Kconfig
@@ -60,12 +60,18 @@ config VIDEOBUF2_VMALLOC
 	select VIDEOBUF2_MEMOPS
 	tristate
 
-
 config VIDEOBUF2_DMA_SG
 	#depends on HAS_DMA
 	select VIDEOBUF2_CORE
 	select VIDEOBUF2_MEMOPS
 	tristate
+
+config VIDEOBUF2_DMA_IOMMU
+	select GENERIC_ALLOCATOR
+	select VIDEOBUF2_CORE
+	select VIDEOBUF2_MEMOPS
+	tristate
+
 #
 # Multimedia Video device configuration
 #
diff --git a/drivers/media/video/Makefile b/drivers/media/video/Makefile
index ace5d8b..04136f6 100644
--- a/drivers/media/video/Makefile
+++ b/drivers/media/video/Makefile
@@ -118,6 +118,7 @@ obj-$(CONFIG_VIDEOBUF2_MEMOPS)		+= videobuf2-memops.o
 obj-$(CONFIG_VIDEOBUF2_VMALLOC)		+= videobuf2-vmalloc.o
 obj-$(CONFIG_VIDEOBUF2_DMA_CONTIG)	+= videobuf2-dma-contig.o
 obj-$(CONFIG_VIDEOBUF2_DMA_SG)		+= videobuf2-dma-sg.o
+obj-$(CONFIG_VIDEOBUF2_DMA_IOMMU)	+= videobuf2-dma-iommu.o
 
 obj-$(CONFIG_V4L2_MEM2MEM_DEV) += v4l2-mem2mem.o
 
diff --git a/drivers/media/video/videobuf2-dma-iommu.c b/drivers/media/video/videobuf2-dma-iommu.c
new file mode 100644
index 0000000..7ccb51a
--- /dev/null
+++ b/drivers/media/video/videobuf2-dma-iommu.c
@@ -0,0 +1,762 @@
+/*
+ * videobuf2-dma-iommu.c - IOMMU based memory allocator for videobuf2
+ *
+ * Copyright (C) 2011 Samsung Electronics
+ *
+ * Author: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/mm.h>
+#include <linux/scatterlist.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <linux/genalloc.h>
+#include <linux/device.h>
+#include <linux/iommu.h>
+#include <asm/cacheflush.h>
+#include <asm/page.h>
+
+#include <media/videobuf2-core.h>
+#include <media/videobuf2-memops.h>
+#include <media/videobuf2-dma-iommu.h>
+
+/*
+ * 17: single piece of memory (one bitmap entry) equals 128k,
+ * so by default the genalloc's bitmap occupies 4kB (one page
+ * for a number of architectures)
+ */
+#define VB2_DMA_IOMMU_PIECE_ORDER	17
+
+/* -1: use default node id to allocate gen_pool/gen_pool_chunk structure from */
+#define VB2_DMA_IOMMU_NODE_ID		-1
+
+/*
+ * starting address of the virtual address space of the client device
+ * must not be zero
+ */
+#define VB2_DMA_IOMMU_MEM_BASE		0x30000000
+
+/* size of the virtual address space of the client device */
+#define VB2_DMA_IOMMU_MEM_SIZE		0x40000000
+
+struct vb2_dma_iommu_alloc_ctx {
+	struct device		*dev;
+	struct gen_pool		*pool;
+	unsigned int		order;
+	struct iommu_domain	*domain;
+};
+
+struct vb2_dma_iommu_desc {
+	unsigned long		size;
+	unsigned int		num_pages;
+	struct page		**pages;
+	unsigned long		*pg_map;
+	bool			contig;
+};
+
+struct vb2_dma_iommu_buf {
+	unsigned long			drv_addr;
+	unsigned long			vaddr;
+
+	struct vb2_dma_iommu_desc	info;
+	int				offset;
+	atomic_t			refcount;
+	int				write;
+	struct vm_area_struct		*vma;
+
+	struct vb2_vmarea_handler	handler;
+
+	struct vb2_dma_iommu_alloc_ctx	*ctx;
+};
+
+#define pages_4k(size) \
+			(((size) + PAGE_SIZE - 1) >> PAGE_SHIFT)
+
+#define pages_order(size, order) \
+			((pages_4k(size) >> (order)) & 0xF)
+
+#define for_each_compound_page(bitmap, size, idx) \
+			for ((idx) = find_first_bit((bitmap), (size)); \
+			     (idx) < (size); \
+			     (idx) = find_next_bit((bitmap), (size), (idx) + 1))
+
+static int vb2_dma_iommu_max_order(unsigned long size)
+{
+	if ((size & 0xFFFF) == size) /* < 64k */
+		return 0;
+	if ((size & 0xFFFFF) == size) /* < 1M */
+		return 4;
+	if ((size & 0xFFFFFF) == size) /* < 16M */
+		return 8;
+	return 12; /* >= 16M */
+}
+
+/*
+ * num_pg must be 1, 16, 256 or 4096
+ */
+static int vb2_dma_iommu_pg_order(int num_pg)
+{
+	if (num_pg & 0x1)
+		return 0;
+	if (num_pg & 0x10)
+		return 4;
+	if (num_pg & 0x100)
+		return 8;
+	return 12;
+}
+
+/*
+ * size must be multiple of PAGE_SIZE
+ */
+static int vb2_dma_iommu_get_pages(struct vb2_dma_iommu_desc *desc,
+				   unsigned long size)
+{
+	int order, num_pg_order, curr_4k_page, bit, max_order_ret;
+	unsigned long curr_size;
+
+	curr_4k_page = 0;
+	max_order_ret = 0;
+	curr_size = size; /* allocate (compound) pages until nothing remains */
+
+	order = vb2_dma_iommu_max_order(curr_size);
+	num_pg_order = pages_order(curr_size, order);
+
+	while (curr_size > 0 && order >= 0) {
+		int i, max_order;
+
+		printk(KERN_DEBUG "%s %d page(s) of %d order\n", __func__,
+		       num_pg_order, order);
+
+		for (i = 0; i < num_pg_order; ++i) {
+			struct page *pg;
+			int j, compound_sz;
+
+			pg = alloc_pages(GFP_KERNEL | __GFP_ZERO | __GFP_COMP,
+					 order);
+			if (!pg)
+				break;
+
+
+			if (order > max_order_ret)
+				max_order_ret = order;
+
+			compound_sz = 0x1 << order;
+			/* need to zero bitmap parts only for orders > 0 */
+			if (order)
+				bitmap_clear(desc->pg_map, curr_4k_page + 1,
+					     compound_sz - 1);
+			for (j = 0; j < compound_sz; ++j)
+				desc->pages[curr_4k_page + j] = (pg + j);
+			curr_4k_page += compound_sz;
+		}
+		/*
+		 * after the above for ends either way (loop condition not
+		 * fulfilled/break) the i contains number of (compound) pages
+		 * we managed to allocate
+		 */
+		curr_size -= i * (PAGE_SIZE << order);
+		max_order = vb2_dma_iommu_max_order(curr_size);
+		/*
+		 * max_order >= current order means that some allocations
+		 * with order >= current order have failed, so we cannot attempt
+		 * any greater orders again, we need to try an order smaller
+		 * than the current order instead
+		 */
+		if (max_order >= order)
+			max_order = order - 4;
+		order = max_order;
+		num_pg_order = pages_order(curr_size, order);
+	}
+
+	if (curr_size != 0)
+		goto get_pages_rollback;
+
+	return max_order_ret;
+
+get_pages_rollback:
+	for_each_compound_page(desc->pg_map, curr_4k_page, bit) {
+		int next_bit;
+
+		next_bit = find_next_bit(desc->pg_map, curr_4k_page, bit + 1);
+		order = vb2_dma_iommu_pg_order(next_bit - bit);
+		__free_pages(desc->pages[bit], order);
+	}
+
+	return -1;
+}
+
+static int vb2_dma_iommu_pg_sizes(struct vb2_dma_iommu_desc *desc)
+{
+	int i, order, max_order;
+
+	i = 0;
+	/* max order is 12, set to something greater */
+	order = 12 + 1;
+	max_order = 0;
+	while (i < desc->num_pages) {
+		unsigned long first, curr, next, curr_size;
+		int adjacent, j, new_order, num_pg_order;
+
+		first = 0;
+		j = i;
+		if (desc->contig) {
+			first = page_to_phys(desc->pages[0]);
+			adjacent = desc->num_pages;
+		} else {
+			curr = page_to_phys(desc->pages[i]);
+			if (order > 12)
+				first = curr;
+			while (++j < desc->num_pages) {
+				next = page_to_phys(desc->pages[j]);
+				if (curr + PAGE_SIZE != next)
+					break;
+				curr = next;
+			}
+			adjacent = j - i;
+		}
+		curr_size = adjacent << PAGE_SHIFT;
+		new_order = vb2_dma_iommu_max_order(curr_size);
+		/*
+		 * by design decision max order in a sequence of blocks of
+		 * zero-order pages must be monotonically decreasing
+		 */
+		if (new_order > order) {
+			bitmap_fill(desc->pg_map, desc->num_pages);
+			return 0;
+		}
+		/*
+		 * by design decision the first compound page of the buffer
+		 * must be aligned according to its size
+		 */
+		if (order > 12)
+			if (first & ((PAGE_SIZE << new_order) - 1)) {
+				bitmap_fill(desc->pg_map, desc->num_pages);
+				return 0;
+			}
+		order = new_order;
+		if (order > max_order)
+			max_order = order;
+		num_pg_order = pages_order(curr_size, order);
+		while (curr_size > 0) {
+			int compound_sz;
+
+			printk(KERN_DEBUG "%s %d page(s) of %d order\n",
+			       __func__, num_pg_order, order);
+			compound_sz = 0x1 << order;
+			/* need to zero bitmap parts only for orders > 0 */
+			if (order)
+				for (j = 0; j < num_pg_order; ++j)
+					bitmap_clear(desc->pg_map,
+						     i + j * compound_sz + 1,
+						     compound_sz - 1);
+			i += num_pg_order * compound_sz;
+			curr_size -= num_pg_order * (PAGE_SIZE << order);
+			if (curr_size) {
+				order = vb2_dma_iommu_max_order(curr_size);
+				num_pg_order = pages_order(curr_size, order);
+			}
+		}
+	}
+	return max_order;
+}
+
+static int vb2_dma_iommu_map(struct iommu_domain *domain,
+			     unsigned long drv_addr,
+			     struct vb2_dma_iommu_desc *desc)
+{
+	int i, j, ret, order;
+	unsigned long pg_addr;
+
+	pg_addr = drv_addr;
+	ret = 0;
+	for_each_compound_page(desc->pg_map, desc->num_pages, i) {
+		int next_bit;
+		unsigned long paddr, compound_sz;
+
+		next_bit = find_next_bit(desc->pg_map, desc->num_pages, i + 1);
+		order = vb2_dma_iommu_pg_order(next_bit - i);
+		compound_sz = 0x1 << order << PAGE_SHIFT;
+		paddr = page_to_phys(desc->pages[i]);
+		ret = iommu_map(domain, pg_addr, paddr, order, 0);
+		if (ret < 0)
+			goto fail_map_area;
+		pg_addr += compound_sz;
+	}
+
+	return ret;
+
+fail_map_area:
+	pg_addr = drv_addr;
+	for_each_compound_page(desc->pg_map, i, j) {
+		int next_bit;
+
+		next_bit = find_next_bit(desc->pg_map, i, j + 1);
+		order = vb2_dma_iommu_pg_order(next_bit - j);
+		iommu_unmap(domain, pg_addr, order);
+		pg_addr += 0x1 << order << PAGE_SHIFT;
+	}
+	return ret;
+}
+
+static void vb2_dma_iommu_unmap(struct iommu_domain *domain,
+			       unsigned long drv_addr,
+			       struct vb2_dma_iommu_desc *desc)
+{
+	int i;
+
+	for_each_compound_page(desc->pg_map, desc->num_pages, i) {
+		int next_bit, order;
+
+		next_bit = find_next_bit(desc->pg_map, desc->num_pages, i + 1);
+		order = vb2_dma_iommu_pg_order(next_bit - i);
+		iommu_unmap(domain, drv_addr, order);
+		drv_addr += 0x1 << order << PAGE_SHIFT;
+	}
+}
+
+static void vb2_dma_iommu_put(void *buf_priv);
+
+static void *vb2_dma_iommu_alloc(void *alloc_ctx, unsigned long size)
+{
+	struct vb2_dma_iommu_alloc_ctx *ctx = alloc_ctx;
+	struct vb2_dma_iommu_buf *buf;
+	unsigned long size_pg, pg_map_size;
+	int i, ret, max_order;
+	void *rv;
+
+	BUG_ON(NULL == alloc_ctx);
+
+	rv = NULL;
+
+	buf = kzalloc(sizeof *buf, GFP_KERNEL);
+	if (!buf)
+		return ERR_PTR(-ENOMEM);
+
+	buf->ctx = ctx;
+	buf->info.size = size;
+	buf->info.num_pages = size_pg = pages_4k(size);
+
+	buf->info.pages = kzalloc(size_pg * sizeof(struct page *), GFP_KERNEL);
+	if (!buf->info.pages) {
+		rv = ERR_PTR(-ENOMEM);
+		goto buf_alloc_rollback;
+	}
+
+	pg_map_size = BITS_TO_LONGS(size_pg) * sizeof(unsigned long);
+	buf->info.pg_map = kzalloc(pg_map_size, GFP_KERNEL);
+	if (!buf->info.pg_map) {
+		rv = ERR_PTR(-ENOMEM);
+		goto pg_array_alloc_rollback;
+	}
+	bitmap_fill(buf->info.pg_map, size_pg);
+
+	max_order = vb2_dma_iommu_get_pages(&buf->info, size_pg * PAGE_SIZE);
+	if (max_order < 0) {
+		rv = ERR_PTR(-ENOMEM);
+		goto pg_map_alloc_rollback;
+	}
+
+	/* max_order is for number of pages; order of bytes: += 12 */
+	max_order += PAGE_SHIFT;
+	/* we need to keep the contract of vb2_dma_iommu_request */
+	if (max_order < ctx->order)
+		max_order = ctx->order;
+	buf->drv_addr = gen_pool_alloc_aligned(ctx->pool, size, max_order);
+	if (0 == buf->drv_addr) {
+		rv = ERR_PTR(-ENOMEM);
+		goto pages_alloc_rollback;
+	}
+
+	buf->handler.refcount = &buf->refcount;
+	buf->handler.put = vb2_dma_iommu_put;
+	buf->handler.arg = buf;
+
+	atomic_inc(&buf->refcount);
+
+	printk(KERN_DEBUG
+	       "%s: Context 0x%lx mapping buffer of %d pages @0x%lx\n",
+	       __func__ , (unsigned long)ctx, buf->info.num_pages,
+	       buf->drv_addr);
+	ret = vb2_dma_iommu_map(ctx->domain, buf->drv_addr, &buf->info);
+	if (ret < 0) {
+		rv = ERR_PTR(ret);
+		goto gen_pool_alloc_rollback;
+	}
+
+	/*
+	 * TODO: Ensure no one else flushes the cache later onto our memory
+	 * which already contains important data.
+	 * Perhaps find a better way to do it.
+	 */
+	flush_cache_all();
+	outer_flush_all();
+	return buf;
+
+gen_pool_alloc_rollback:
+	gen_pool_free(ctx->pool, buf->drv_addr, size);
+
+pages_alloc_rollback:
+	for_each_compound_page(buf->info.pg_map, buf->info.num_pages, i) {
+		int next_bit;
+
+		next_bit = find_next_bit(buf->info.pg_map, buf->info.num_pages,
+					 i + 1);
+		max_order = vb2_dma_iommu_pg_order(next_bit - i);
+		__free_pages(buf->info.pages[i], max_order);
+	}
+
+pg_map_alloc_rollback:
+	kfree(buf->info.pg_map);
+
+pg_array_alloc_rollback:
+	kfree(buf->info.pages);
+
+buf_alloc_rollback:
+	kfree(buf);
+	return rv;
+}
+
+static void vb2_dma_iommu_put(void *buf_priv)
+{
+	struct vb2_dma_iommu_buf *buf = buf_priv;
+
+	if (atomic_dec_and_test(&buf->refcount)) {
+		int i, order;
+
+		printk(KERN_DEBUG
+		"%s: Context 0x%lx releasing buffer of %d pages @0x%lx\n",
+		       __func__, (unsigned long)buf->ctx, buf->info.num_pages,
+		       buf->drv_addr);
+
+		vb2_dma_iommu_unmap(buf->ctx->domain, buf->drv_addr,
+				    &buf->info);
+		if (buf->vaddr)
+			vm_unmap_ram((void *)buf->vaddr, buf->info.num_pages);
+
+		gen_pool_free(buf->ctx->pool, buf->drv_addr, buf->info.size);
+		for_each_compound_page(buf->info.pg_map,
+				       buf->info.num_pages, i) {
+			int next_bit;
+
+			next_bit = find_next_bit(buf->info.pg_map,
+						 buf->info.num_pages, i + 1);
+			order = vb2_dma_iommu_pg_order(next_bit - i);
+			__free_pages(buf->info.pages[i], order);
+		}
+		kfree(buf->info.pg_map);
+		kfree(buf->info.pages);
+		kfree(buf);
+	}
+}
+
+static void *vb2_dma_iommu_get_userptr(void *alloc_ctx, unsigned long vaddr,
+				    unsigned long size, int write)
+{
+	struct vb2_dma_iommu_alloc_ctx *ctx = alloc_ctx;
+	struct vb2_dma_iommu_buf *buf;
+	unsigned long first, last, size_pg, pg_map_size;
+	int num_pages_from_user, max_order, ret;
+	void *rv;
+
+	BUG_ON(NULL == alloc_ctx);
+
+	rv = NULL;
+
+	buf = kzalloc(sizeof *buf, GFP_KERNEL);
+	if (!buf)
+		return ERR_PTR(-ENOMEM);
+
+	buf->ctx = ctx;
+	buf->info.size = size;
+	/*
+	 * Page numbers of the first and the last byte of the buffer
+	 */
+	first = vaddr >> PAGE_SHIFT;
+	last  = (vaddr + size - 1) >> PAGE_SHIFT;
+	buf->info.num_pages = size_pg = last - first + 1;
+	buf->offset = vaddr & ~PAGE_MASK;
+	buf->write = write;
+
+	buf->info.pages = kzalloc(size_pg * sizeof(struct page *), GFP_KERNEL);
+	if (!buf->info.pages) {
+		rv = ERR_PTR(-ENOMEM);
+		goto buf_alloc_rollback;
+	}
+
+	pg_map_size = BITS_TO_LONGS(size_pg) * sizeof(unsigned long);
+	buf->info.pg_map = kzalloc(pg_map_size, GFP_KERNEL);
+	if (!buf->info.pg_map) {
+		rv = ERR_PTR(-ENOMEM);
+		goto pg_array_alloc_rollback;
+	}
+	bitmap_fill(buf->info.pg_map, size_pg);
+
+	num_pages_from_user = vb2_get_user_pages(vaddr, buf->info.num_pages,
+						 buf->info.pages, write,
+						 &buf->vma);
+
+	/* do not accept partial success */
+	if (num_pages_from_user >= 0 && num_pages_from_user < size_pg) {
+		rv = ERR_PTR(-EFAULT);
+		goto get_user_pages_rollback;
+	}
+
+	if (num_pages_from_user < 0) {
+		struct vm_area_struct *vma;
+		int i;
+		dma_addr_t paddr;
+
+		paddr = 0;
+		ret = vb2_get_contig_userptr(vaddr, size, &vma, &paddr);
+		if (ret) {
+			rv = ERR_PTR(ret);
+			goto get_user_pages_rollback;
+		}
+
+		buf->vma = vma;
+		buf->info.contig = true;
+		paddr -= buf->offset;
+
+		for (i = 0; i < size_pg; paddr += PAGE_SIZE, ++i)
+			buf->info.pages[i] = phys_to_page(paddr);
+	}
+	max_order = vb2_dma_iommu_pg_sizes(&buf->info);
+
+	/* max_order is for number of pages; order of bytes: += 12 */
+	max_order += PAGE_SHIFT;
+	/* we need to keep the contract of vb2_dma_iommu_request */
+	if (max_order < ctx->order)
+		max_order = ctx->order;
+
+	buf->drv_addr = gen_pool_alloc_aligned(ctx->pool, size, max_order);
+	if (0 == buf->drv_addr) {
+		rv = ERR_PTR(-ENOMEM);
+		goto get_user_pages_rollback;
+	}
+	printk(KERN_DEBUG
+	"%s: Context 0x%lx mapping buffer of %ld user pages @0x%lx\n",
+	       __func__ , (unsigned long)ctx, size_pg, buf->drv_addr);
+	ret = vb2_dma_iommu_map(ctx->domain, buf->drv_addr, &buf->info);
+	if (ret < 0) {
+		rv = ERR_PTR(ret);
+		goto gen_pool_alloc_rollback;
+	}
+
+	return buf;
+
+gen_pool_alloc_rollback:
+	gen_pool_free(ctx->pool, buf->drv_addr, size);
+
+get_user_pages_rollback:
+	while (--num_pages_from_user >= 0)
+		put_page(buf->info.pages[num_pages_from_user]);
+	if (buf->vma)
+		vb2_put_vma(buf->vma);
+	kfree(buf->info.pg_map);
+
+pg_array_alloc_rollback:
+	kfree(buf->info.pages);
+
+buf_alloc_rollback:
+	kfree(buf);
+	return rv;
+}
+
+/*
+ * @put_userptr: inform the allocator that a USERPTR buffer will no longer
+ *		 be used
+ */
+static void vb2_dma_iommu_put_userptr(void *buf_priv)
+{
+	struct vb2_dma_iommu_buf *buf = buf_priv;
+	int i;
+
+	printk(KERN_DEBUG
+	       "%s: Context 0x%lx releasing buffer of %d user pages @0x%lx\n",
+	       __func__, (unsigned long)buf->ctx, buf->info.num_pages,
+	       buf->drv_addr);
+	vb2_dma_iommu_unmap(buf->ctx->domain, buf->drv_addr, &buf->info);
+	if (buf->vaddr)
+		vm_unmap_ram((void *)buf->vaddr, buf->info.num_pages);
+
+	gen_pool_free(buf->ctx->pool, buf->drv_addr, buf->info.size);
+
+	if (buf->vma)
+		vb2_put_vma(buf->vma);
+
+	i = buf->info.num_pages;
+	if (!buf->info.contig) {
+		while (--i >= 0) {
+			if (buf->write)
+				set_page_dirty_lock(buf->info.pages[i]);
+			put_page(buf->info.pages[i]);
+		}
+	}
+	kfree(buf->info.pg_map);
+	kfree(buf->info.pages);
+	kfree(buf);
+}
+
+static void *vb2_dma_iommu_vaddr(void *buf_priv)
+{
+	struct vb2_dma_iommu_buf *buf = buf_priv;
+
+	BUG_ON(!buf);
+
+	if (!buf->vaddr)
+		buf->vaddr = (unsigned long)vm_map_ram(buf->info.pages,
+					buf->info.num_pages,
+					-1,
+					pgprot_dmacoherent(PAGE_KERNEL));
+
+	/* add offset in case userptr is not page-aligned */
+	return (void *)(buf->vaddr + buf->offset);
+}
+
+static unsigned int vb2_dma_iommu_num_users(void *buf_priv)
+{
+	struct vb2_dma_iommu_buf *buf = buf_priv;
+
+	return atomic_read(&buf->refcount);
+}
+
+static int vb2_dma_iommu_mmap(void *buf_priv, struct vm_area_struct *vma)
+{
+	struct vb2_dma_iommu_buf *buf = buf_priv;
+	int ret;
+
+	if (!buf) {
+		printk(KERN_ERR "No memory to map\n");
+		return -EINVAL;
+	}
+
+	vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot);
+
+	ret = vb2_insert_pages(vma, buf->info.pages);
+	if (ret)
+		return ret;
+
+	/*
+	 * Use common vm_area operations to track buffer refcount.
+	 */
+	vma->vm_private_data	= &buf->handler;
+	vma->vm_ops		= &vb2_common_vm_ops;
+
+	vma->vm_ops->open(vma);
+
+	return 0;
+}
+
+static void *vb2_dma_iommu_cookie(void *buf_priv)
+{
+	struct vb2_dma_iommu_buf *buf = buf_priv;
+
+	return (void *)buf->drv_addr + buf->offset;
+}
+
+const struct vb2_mem_ops vb2_dma_iommu_memops = {
+	.alloc		= vb2_dma_iommu_alloc,
+	.put		= vb2_dma_iommu_put,
+	.get_userptr	= vb2_dma_iommu_get_userptr,
+	.put_userptr	= vb2_dma_iommu_put_userptr,
+	.vaddr		= vb2_dma_iommu_vaddr,
+	.mmap		= vb2_dma_iommu_mmap,
+	.num_users	= vb2_dma_iommu_num_users,
+	.cookie		= vb2_dma_iommu_cookie,
+};
+EXPORT_SYMBOL_GPL(vb2_dma_iommu_memops);
+
+void *vb2_dma_iommu_init(struct device *dev, struct device *iommu_dev,
+			 struct vb2_dma_iommu_request *iommu_req)
+{
+	struct vb2_dma_iommu_alloc_ctx *ctx;
+	unsigned long mem_base, mem_size;
+	int align_order;
+
+	ctx = kzalloc(sizeof *ctx, GFP_KERNEL);
+	if (!ctx)
+		return ERR_PTR(-ENOMEM);
+
+	align_order = VB2_DMA_IOMMU_PIECE_ORDER;
+	mem_base = VB2_DMA_IOMMU_MEM_BASE;
+	mem_size = VB2_DMA_IOMMU_MEM_SIZE;
+
+	if (iommu_req) {
+		if (iommu_req->align_order)
+			align_order = iommu_req->align_order;
+		if (iommu_req->mem_base)
+			mem_base = iommu_req->mem_base;
+		if (iommu_req->mem_size)
+			mem_size = iommu_req->mem_size;
+	}
+
+	ctx->order = align_order;
+	ctx->pool = gen_pool_create(align_order, VB2_DMA_IOMMU_NODE_ID);
+
+	if (!ctx->pool)
+		goto pool_alloc_fail;
+
+	if (gen_pool_add(ctx->pool, mem_base, mem_size, VB2_DMA_IOMMU_NODE_ID))
+		goto chunk_add_fail;
+
+	ctx->domain = iommu_domain_alloc();
+	if (!ctx->domain) {
+		dev_err(dev, "IOMMU domain alloc failed\n");
+		goto chunk_add_fail;
+	}
+
+	ctx->dev = iommu_dev;
+
+	return ctx;
+
+chunk_add_fail:
+	gen_pool_destroy(ctx->pool);
+pool_alloc_fail:
+	kfree(ctx);
+	return ERR_PTR(-ENOMEM);
+}
+EXPORT_SYMBOL_GPL(vb2_dma_iommu_init);
+
+void vb2_dma_iommu_cleanup(void *alloc_ctx)
+{
+	struct vb2_dma_iommu_alloc_ctx *ctx = alloc_ctx;
+
+	BUG_ON(NULL == alloc_ctx);
+
+	iommu_domain_free(ctx->domain);
+	gen_pool_destroy(ctx->pool);
+	kfree(alloc_ctx);
+}
+EXPORT_SYMBOL_GPL(vb2_dma_iommu_cleanup);
+
+int vb2_dma_iommu_enable(void *alloc_ctx)
+{
+	struct vb2_dma_iommu_alloc_ctx *ctx = alloc_ctx;
+
+	BUG_ON(NULL == alloc_ctx);
+
+	return iommu_attach_device(ctx->domain, ctx->dev);
+}
+EXPORT_SYMBOL_GPL(vb2_dma_iommu_enable);
+
+int vb2_dma_iommu_disable(void *alloc_ctx)
+{
+	struct vb2_dma_iommu_alloc_ctx *ctx = alloc_ctx;
+
+	BUG_ON(NULL == alloc_ctx);
+
+	iommu_detach_device(ctx->domain, ctx->dev);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vb2_dma_iommu_disable);
+
+MODULE_DESCRIPTION("iommu memory handling routines for videobuf2");
+MODULE_AUTHOR("Andrzej Pietrasiewicz");
+MODULE_LICENSE("GPL");
diff --git a/include/media/videobuf2-dma-iommu.h b/include/media/videobuf2-dma-iommu.h
new file mode 100644
index 0000000..02d3b14
--- /dev/null
+++ b/include/media/videobuf2-dma-iommu.h
@@ -0,0 +1,48 @@
+/*
+ * videobuf2-dma-iommu.h - IOMMU based memory allocator for videobuf2
+ *
+ * Copyright (C) 2011 Samsung Electronics
+ *
+ * Author: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ */
+
+#ifndef _MEDIA_VIDEOBUF2_DMA_IOMMU_H
+#define _MEDIA_VIDEOBUF2_DMA_IOMMU_H
+
+#include <media/videobuf2-core.h>
+
+struct device;
+
+struct vb2_dma_iommu_request {
+	/* mem_base and mem_size both 0 => use allocator's default */
+	unsigned long		mem_base;
+	unsigned long		mem_size;
+	/*
+	 * align_order 0 => use allocator's default
+	 * 0 < align_order < PAGE_SHIFT => rounded to PAGE_SHIFT by allocator
+	 */
+	int			align_order;
+};
+
+static inline unsigned long vb2_dma_iommu_plane_addr(
+		struct vb2_buffer *vb, unsigned int plane_no)
+{
+	return (unsigned long)vb2_plane_cookie(vb, plane_no);
+}
+
+extern const struct vb2_mem_ops vb2_dma_iommu_memops;
+
+void *vb2_dma_iommu_init(struct device *dev, struct device *iommu_dev,
+			 struct vb2_dma_iommu_request *req);
+
+void vb2_dma_iommu_cleanup(void *alloc_ctx);
+
+int vb2_dma_iommu_enable(void *alloc_ctx);
+
+int vb2_dma_iommu_disable(void *alloc_ctx);
+
+#endif
-- 
1.7.1.569.g6f426

* [PATCH 4/7] v4l: videobuf2: add IOMMU based DMA memory allocator
@ 2011-04-18  9:26   ` Marek Szyprowski
  0 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-18  9:26 UTC (permalink / raw)
  To: linux-arm-kernel

From: Andrzej Pietrasiewicz <andrzej.p@samsung.com>

This patch adds a new videobuf2 memory allocator dedicated to devices
that support IOMMU DMA mappings. A device with an IOMMU module and a
driver with an include/linux/iommu.h compatible interface is required.
This allocator acquires memory with standard alloc_pages() calls and
doesn't suffer from memory fragmentation issues. The allocator supports
the following page sizes: 4KiB, 64KiB, 1MiB and 16MiB, to reduce IOMMU
translation overhead.
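
As a rough illustration of the page size selection (a userspace sketch,
not part of the patch; the sketch_* names are made up for this note),
a buffer size can be split greedily into the four supported page sizes,
largest first. The real allocator additionally falls back to smaller
orders when a large allocation fails, which this sketch ignores.

```c
#include <stdio.h>

#define SKETCH_PAGE_SHIFT 12UL	/* assumes 4KiB base pages */

/* Page orders for 16MiB, 1MiB, 64KiB and 4KiB, largest first. */
static const unsigned int sketch_orders[4] = { 12, 8, 4, 0 };

/*
 * Split size (assumed to be a multiple of 4KiB) into page counts.
 * counts[i] receives the number of pages of order sketch_orders[i];
 * each step takes as many pages of the current size as still fit.
 */
static void sketch_split(unsigned long size, unsigned long counts[4])
{
	unsigned int i;

	for (i = 0; i < 4; i++) {
		unsigned long chunk = 1UL << (sketch_orders[i] + SKETCH_PAGE_SHIFT);

		counts[i] = size / chunk;
		size -= counts[i] * chunk;
	}
}
```

For example, a 0x1235000 byte buffer would be covered by one 16MiB,
two 1MiB, three 64KiB and five 4KiB pages, so the IOMMU needs 11
translations instead of 4661 single 4KiB entries.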

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
---
 drivers/media/video/Kconfig               |    8 +-
 drivers/media/video/Makefile              |    1 +
 drivers/media/video/videobuf2-dma-iommu.c |  762 +++++++++++++++++++++++++++++
 include/media/videobuf2-dma-iommu.h       |   48 ++
 4 files changed, 818 insertions(+), 1 deletions(-)
 create mode 100644 drivers/media/video/videobuf2-dma-iommu.c
 create mode 100644 include/media/videobuf2-dma-iommu.h

diff --git a/drivers/media/video/Kconfig b/drivers/media/video/Kconfig
index 4498b94..40d7bcc 100644
--- a/drivers/media/video/Kconfig
+++ b/drivers/media/video/Kconfig
@@ -60,12 +60,18 @@ config VIDEOBUF2_VMALLOC
 	select VIDEOBUF2_MEMOPS
 	tristate
 
-
 config VIDEOBUF2_DMA_SG
 	#depends on HAS_DMA
 	select VIDEOBUF2_CORE
 	select VIDEOBUF2_MEMOPS
 	tristate
+
+config VIDEOBUF2_DMA_IOMMU
+	select GENERIC_ALLOCATOR
+	select VIDEOBUF2_CORE
+	select VIDEOBUF2_MEMOPS
+	tristate
+
 #
 # Multimedia Video device configuration
 #
diff --git a/drivers/media/video/Makefile b/drivers/media/video/Makefile
index ace5d8b..04136f6 100644
--- a/drivers/media/video/Makefile
+++ b/drivers/media/video/Makefile
@@ -118,6 +118,7 @@ obj-$(CONFIG_VIDEOBUF2_MEMOPS)		+= videobuf2-memops.o
 obj-$(CONFIG_VIDEOBUF2_VMALLOC)		+= videobuf2-vmalloc.o
 obj-$(CONFIG_VIDEOBUF2_DMA_CONTIG)	+= videobuf2-dma-contig.o
 obj-$(CONFIG_VIDEOBUF2_DMA_SG)		+= videobuf2-dma-sg.o
+obj-$(CONFIG_VIDEOBUF2_DMA_IOMMU)	+= videobuf2-dma-iommu.o
 
 obj-$(CONFIG_V4L2_MEM2MEM_DEV) += v4l2-mem2mem.o
 
diff --git a/drivers/media/video/videobuf2-dma-iommu.c b/drivers/media/video/videobuf2-dma-iommu.c
new file mode 100644
index 0000000..7ccb51a
--- /dev/null
+++ b/drivers/media/video/videobuf2-dma-iommu.c
@@ -0,0 +1,762 @@
+/*
+ * videobuf2-dma-iommu.c - IOMMU based memory allocator for videobuf2
+ *
+ * Copyright (C) 2011 Samsung Electronics
+ *
+ * Author: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/mm.h>
+#include <linux/scatterlist.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <linux/genalloc.h>
+#include <linux/device.h>
+#include <linux/iommu.h>
+#include <asm/cacheflush.h>
+#include <asm/page.h>
+
+#include <media/videobuf2-core.h>
+#include <media/videobuf2-memops.h>
+#include <media/videobuf2-dma-iommu.h>
+
+/*
+ * 17: single piece of memory (one bitmap entry) equals 128k,
+ * so by default the genalloc's bitmap occupies 4kB (one page
+ * for a number of architectures)
+ */
+#define VB2_DMA_IOMMU_PIECE_ORDER	17
+
+/* -1: use default node id to allocate gen_pool/gen_pool_chunk structure from */
+#define VB2_DMA_IOMMU_NODE_ID		-1
+
+/*
+ * starting address of the virtual address space of the client device
+ * must not be zero
+ */
+#define VB2_DMA_IOMMU_MEM_BASE		0x30000000
+
+/* size of the virtual address space of the client device */
+#define VB2_DMA_IOMMU_MEM_SIZE		0x40000000
+
+struct vb2_dma_iommu_alloc_ctx {
+	struct device		*dev;
+	struct gen_pool		*pool;
+	unsigned int		order;
+	struct iommu_domain	*domain;
+};
+
+struct vb2_dma_iommu_desc {
+	unsigned long		size;
+	unsigned int		num_pages;
+	struct page		**pages;
+	unsigned long		*pg_map;
+	bool			contig;
+};
+
+struct vb2_dma_iommu_buf {
+	unsigned long			drv_addr;
+	unsigned long			vaddr;
+
+	struct vb2_dma_iommu_desc	info;
+	int				offset;
+	atomic_t			refcount;
+	int				write;
+	struct vm_area_struct		*vma;
+
+	struct vb2_vmarea_handler	handler;
+
+	struct vb2_dma_iommu_alloc_ctx	*ctx;
+};
+
+#define pages_4k(size) \
+			(((size) + PAGE_SIZE - 1) >> PAGE_SHIFT)
+
+#define pages_order(size, order) \
+			((pages_4k(size) >> (order)) & 0xF)
+
+#define for_each_compound_page(bitmap, size, idx) \
+			for ((idx) = find_first_bit((bitmap), (size)); \
+			     (idx) < (size); \
+			     (idx) = find_next_bit((bitmap), (size), (idx) + 1))
+
+static int vb2_dma_iommu_max_order(unsigned long size)
+{
+	if ((size & 0xFFFF) == size) /* < 64k */
+		return 0;
+	if ((size & 0xFFFFF) == size) /* < 1M */
+		return 4;
+	if ((size & 0xFFFFFF) == size) /* < 16M */
+		return 8;
+	return 12; /* >= 16M */
+}
+
+/*
+ * num_pg must be 1, 16, 256 or 4096
+ */
+static int vb2_dma_iommu_pg_order(int num_pg)
+{
+	if (num_pg & 0x1)
+		return 0;
+	if (num_pg & 0x10)
+		return 4;
+	if (num_pg & 0x100)
+		return 8;
+	return 12;
+}
+
+/*
+ * size must be multiple of PAGE_SIZE
+ */
+static int vb2_dma_iommu_get_pages(struct vb2_dma_iommu_desc *desc,
+				   unsigned long size)
+{
+	int order, num_pg_order, curr_4k_page, bit, max_order_ret;
+	unsigned long curr_size;
+
+	curr_4k_page = 0;
+	max_order_ret = 0;
+	curr_size = size; /* allocate (compound) pages until nothing remains */
+
+	order = vb2_dma_iommu_max_order(curr_size);
+	num_pg_order = pages_order(curr_size, order);
+
+	while (curr_size > 0 && order >= 0) {
+		int i, max_order;
+
+		printk(KERN_DEBUG "%s %d page(s) of %d order\n", __func__,
+		       num_pg_order, order);
+
+		for (i = 0; i < num_pg_order; ++i) {
+			struct page *pg;
+			int j, compound_sz;
+
+			pg = alloc_pages(GFP_KERNEL | __GFP_ZERO | __GFP_COMP,
+					 order);
+			if (!pg)
+				break;
+
+
+			if (order > max_order_ret)
+				max_order_ret = order;
+
+			compound_sz = 0x1 << order;
+			/* need to zero bitmap parts only for orders > 0 */
+			if (order)
+				bitmap_clear(desc->pg_map, curr_4k_page + 1,
+					     compound_sz - 1);
+			for (j = 0; j < compound_sz; ++j)
+				desc->pages[curr_4k_page + j] = (pg + j);
+			curr_4k_page += compound_sz;
+		}
+		/*
+		 * after the above for ends either way (loop condition not
+		 * fulfilled/break) the i contains number of (compound) pages
+		 * we managed to allocate
+		 */
+		curr_size -= i * (PAGE_SIZE << order);
+		max_order = vb2_dma_iommu_max_order(curr_size);
+		/*
+		 * max_order >= current order means that some allocations
+		 * with order >= current order have failed, so we cannot attempt
+		 * any greater orders again, we need to try an order smaller
+		 * than the current order instead
+		 */
+		if (max_order >= order)
+			max_order = order - 4;
+		order = max_order;
+		num_pg_order = pages_order(curr_size, order);
+	}
+
+	if (curr_size != 0)
+		goto get_pages_rollback;
+
+	return max_order_ret;
+
+get_pages_rollback:
+	for_each_compound_page(desc->pg_map, curr_4k_page, bit) {
+		int next_bit;
+
+		next_bit = find_next_bit(desc->pg_map, curr_4k_page, bit + 1);
+		order = vb2_dma_iommu_pg_order(next_bit - bit);
+		__free_pages(desc->pages[bit], order);
+	}
+
+	return -1;
+}
+
+static int vb2_dma_iommu_pg_sizes(struct vb2_dma_iommu_desc *desc)
+{
+	int i, order, max_order;
+
+	i = 0;
+	/* max order is 12, set to something greater */
+	order = 12 + 1;
+	max_order = 0;
+	while (i < desc->num_pages) {
+		unsigned long first, curr, next, curr_size;
+		int adjacent, j, new_order, num_pg_order;
+
+		first = 0;
+		j = i;
+		if (desc->contig) {
+			first = page_to_phys(desc->pages[0]);
+			adjacent = desc->num_pages;
+		} else {
+			curr = page_to_phys(desc->pages[i]);
+			if (order > 12)
+				first = curr;
+			while (++j < desc->num_pages) {
+				next = page_to_phys(desc->pages[j]);
+				if (curr + PAGE_SIZE != next)
+					break;
+				curr = next;
+			}
+			adjacent = j - i;
+		}
+		curr_size = adjacent << PAGE_SHIFT;
+		new_order = vb2_dma_iommu_max_order(curr_size);
+		/*
+		 * by design decision max order in a sequence of blocks of
+		 * zero-order pages must be monotonically decreasing
+		 */
+		if (new_order > order) {
+			bitmap_fill(desc->pg_map, desc->num_pages);
+			return 0;
+		}
+		/*
+		 * by design decision the first compound page of the buffer
+		 * must be aligned according to its size
+		 */
+		if (order > 12)
+			if (first & ((PAGE_SIZE << new_order) - 1)) {
+				bitmap_fill(desc->pg_map, desc->num_pages);
+				return 0;
+			}
+		order = new_order;
+		if (order > max_order)
+			max_order = order;
+		num_pg_order = pages_order(curr_size, order);
+		while (curr_size > 0) {
+			int compound_sz;
+
+			printk(KERN_DEBUG "%s %d page(s) of %d order\n",
+			       __func__, num_pg_order, order);
+			compound_sz = 0x1 << order;
+			/* need to zero bitmap parts only for orders > 0 */
+			if (order)
+				for (j = 0; j < num_pg_order; ++j)
+					bitmap_clear(desc->pg_map,
+						     i + j * compound_sz + 1,
+						     compound_sz - 1);
+			i += num_pg_order * compound_sz;
+			curr_size -= num_pg_order * (PAGE_SIZE << order);
+			if (curr_size) {
+				order = vb2_dma_iommu_max_order(curr_size);
+				num_pg_order = pages_order(curr_size, order);
+			}
+		}
+	}
+	return max_order;
+}
+
+static int vb2_dma_iommu_map(struct iommu_domain *domain,
+			     unsigned long drv_addr,
+			     struct vb2_dma_iommu_desc *desc)
+{
+	int i, j, ret, order;
+	unsigned long pg_addr;
+
+	pg_addr = drv_addr;
+	ret = 0;
+	for_each_compound_page(desc->pg_map, desc->num_pages, i) {
+		int next_bit;
+		unsigned long paddr, compound_sz;
+
+		next_bit = find_next_bit(desc->pg_map, desc->num_pages, i + 1);
+		order = vb2_dma_iommu_pg_order(next_bit - i);
+		compound_sz = 0x1 << order << PAGE_SHIFT;
+		paddr = page_to_phys(desc->pages[i]);
+		ret = iommu_map(domain, pg_addr, paddr, order, 0);
+		if (ret < 0)
+			goto fail_map_area;
+		pg_addr += compound_sz;
+	}
+
+	return ret;
+
+fail_map_area:
+	pg_addr = drv_addr;
+	for_each_compound_page(desc->pg_map, i, j) {
+		int next_bit;
+
+		next_bit = find_next_bit(desc->pg_map, i, j + 1);
+		order = vb2_dma_iommu_pg_order(next_bit - j);
+		iommu_unmap(domain, pg_addr, order);
+		pg_addr += 0x1 << order << PAGE_SHIFT;
+	}
+	return ret;
+}
+
+static void vb2_dma_iommu_unmap(struct iommu_domain *domain,
+			       unsigned long drv_addr,
+			       struct vb2_dma_iommu_desc *desc)
+{
+	int i;
+
+	for_each_compound_page(desc->pg_map, desc->num_pages, i) {
+		int next_bit, order;
+
+		next_bit = find_next_bit(desc->pg_map, desc->num_pages, i + 1);
+		order = vb2_dma_iommu_pg_order(next_bit - i);
+		iommu_unmap(domain, drv_addr, order);
+		drv_addr += 0x1 << order << PAGE_SHIFT;
+	}
+}
+
+static void vb2_dma_iommu_put(void *buf_priv);
+
+static void *vb2_dma_iommu_alloc(void *alloc_ctx, unsigned long size)
+{
+	struct vb2_dma_iommu_alloc_ctx *ctx = alloc_ctx;
+	struct vb2_dma_iommu_buf *buf;
+	unsigned long size_pg, pg_map_size;
+	int i, ret, max_order;
+	void *rv;
+
+	BUG_ON(NULL == alloc_ctx);
+
+	rv = NULL;
+
+	buf = kzalloc(sizeof *buf, GFP_KERNEL);
+	if (!buf)
+		return ERR_PTR(-ENOMEM);
+
+	buf->ctx = ctx;
+	buf->info.size = size;
+	buf->info.num_pages = size_pg = pages_4k(size);
+
+	buf->info.pages = kzalloc(size_pg * sizeof(struct page *), GFP_KERNEL);
+	if (!buf->info.pages) {
+		rv = ERR_PTR(-ENOMEM);
+		goto buf_alloc_rollback;
+	}
+
+	pg_map_size = BITS_TO_LONGS(size_pg) * sizeof(unsigned long);
+	buf->info.pg_map = kzalloc(pg_map_size, GFP_KERNEL);
+	if (!buf->info.pg_map) {
+		rv = ERR_PTR(-ENOMEM);
+		goto pg_array_alloc_rollback;
+	}
+	bitmap_fill(buf->info.pg_map, size_pg);
+
+	max_order = vb2_dma_iommu_get_pages(&buf->info, size_pg * PAGE_SIZE);
+	if (max_order < 0) {
+		rv = ERR_PTR(-ENOMEM);
+		goto pg_map_alloc_rollback;
+	}
+
+	/* max_order is for number of pages; order of bytes: += 12 */
+	max_order += PAGE_SHIFT;
+	/* we need to keep the contract of vb2_dma_iommu_request */
+	if (max_order < ctx->order)
+		max_order = ctx->order;
+	buf->drv_addr = gen_pool_alloc_aligned(ctx->pool, size, max_order);
+	if (0 == buf->drv_addr) {
+		rv = ERR_PTR(-ENOMEM);
+		goto pages_alloc_rollback;
+	}
+
+	buf->handler.refcount = &buf->refcount;
+	buf->handler.put = vb2_dma_iommu_put;
+	buf->handler.arg = buf;
+
+	atomic_inc(&buf->refcount);
+
+	printk(KERN_DEBUG
+	       "%s: Context 0x%lx mapping buffer of %d pages @0x%lx\n",
+	       __func__ , (unsigned long)ctx, buf->info.num_pages,
+	       buf->drv_addr);
+	ret = vb2_dma_iommu_map(ctx->domain, buf->drv_addr, &buf->info);
+	if (ret < 0) {
+		rv = ERR_PTR(ret);
+		goto gen_pool_alloc_rollback;
+	}
+
+	/*
+	 * TODO: Ensure no one else flushes the cache later onto our memory
+	 * which already contains important data.
+	 * Perhaps find a better way to do it.
+	 */
+	flush_cache_all();
+	outer_flush_all();
+	return buf;
+
+gen_pool_alloc_rollback:
+	gen_pool_free(ctx->pool, buf->drv_addr, size);
+
+pages_alloc_rollback:
+	for_each_compound_page(buf->info.pg_map, buf->info.num_pages, i) {
+		int next_bit;
+
+		next_bit = find_next_bit(buf->info.pg_map, buf->info.num_pages,
+					 i + 1);
+		max_order = vb2_dma_iommu_pg_order(next_bit - i);
+		__free_pages(buf->info.pages[i], max_order);
+	}
+
+pg_map_alloc_rollback:
+	kfree(buf->info.pg_map);
+
+pg_array_alloc_rollback:
+	kfree(buf->info.pages);
+
+buf_alloc_rollback:
+	kfree(buf);
+	return rv;
+}
+
+static void vb2_dma_iommu_put(void *buf_priv)
+{
+	struct vb2_dma_iommu_buf *buf = buf_priv;
+
+	if (atomic_dec_and_test(&buf->refcount)) {
+		int i, order;
+
+		printk(KERN_DEBUG
+		"%s: Context 0x%lx releasing buffer of %d pages @0x%lx\n",
+		       __func__, (unsigned long)buf->ctx, buf->info.num_pages,
+		       buf->drv_addr);
+
+		vb2_dma_iommu_unmap(buf->ctx->domain, buf->drv_addr,
+				    &buf->info);
+		if (buf->vaddr)
+			vm_unmap_ram((void *)buf->vaddr, buf->info.num_pages);
+
+		gen_pool_free(buf->ctx->pool, buf->drv_addr, buf->info.size);
+		for_each_compound_page(buf->info.pg_map,
+				       buf->info.num_pages, i) {
+			int next_bit;
+
+			next_bit = find_next_bit(buf->info.pg_map,
+						 buf->info.num_pages, i + 1);
+			order = vb2_dma_iommu_pg_order(next_bit - i);
+			__free_pages(buf->info.pages[i], order);
+		}
+		kfree(buf->info.pg_map);
+		kfree(buf->info.pages);
+		kfree(buf);
+	}
+}
+
+static void *vb2_dma_iommu_get_userptr(void *alloc_ctx, unsigned long vaddr,
+				    unsigned long size, int write)
+{
+	struct vb2_dma_iommu_alloc_ctx *ctx = alloc_ctx;
+	struct vb2_dma_iommu_buf *buf;
+	unsigned long first, last, size_pg, pg_map_size;
+	int num_pages_from_user, max_order, ret;
+	void *rv;
+
+	BUG_ON(NULL == alloc_ctx);
+
+	rv = NULL;
+
+	buf = kzalloc(sizeof *buf, GFP_KERNEL);
+	if (!buf)
+		return ERR_PTR(-ENOMEM);
+
+	buf->ctx = ctx;
+	buf->info.size = size;
+	/*
+	 * Page numbers of the first and the last byte of the buffer
+	 */
+	first = vaddr >> PAGE_SHIFT;
+	last  = (vaddr + size - 1) >> PAGE_SHIFT;
+	buf->info.num_pages = size_pg = last - first + 1;
+	buf->offset = vaddr & ~PAGE_MASK;
+	buf->write = write;
+
+	buf->info.pages = kzalloc(size_pg * sizeof(struct page *), GFP_KERNEL);
+	if (!buf->info.pages) {
+		rv = ERR_PTR(-ENOMEM);
+		goto buf_alloc_rollback;
+	}
+
+	pg_map_size = BITS_TO_LONGS(size_pg) * sizeof(unsigned long);
+	buf->info.pg_map = kzalloc(pg_map_size, GFP_KERNEL);
+	if (!buf->info.pg_map) {
+		rv = ERR_PTR(-ENOMEM);
+		goto pg_array_alloc_rollback;
+	}
+	bitmap_fill(buf->info.pg_map, size_pg);
+
+	num_pages_from_user = vb2_get_user_pages(vaddr, buf->info.num_pages,
+						 buf->info.pages, write,
+						 &buf->vma);
+
+	/* do not accept partial success */
+	if (num_pages_from_user >= 0 && num_pages_from_user < size_pg) {
+		rv = ERR_PTR(-EFAULT);
+		goto get_user_pages_rollback;
+	}
+
+	if (num_pages_from_user < 0) {
+		struct vm_area_struct *vma;
+		int i;
+		dma_addr_t paddr;
+
+		paddr = 0;
+		ret = vb2_get_contig_userptr(vaddr, size, &vma, &paddr);
+		if (ret) {
+			rv = ERR_PTR(ret);
+			goto get_user_pages_rollback;
+		}
+
+		buf->vma = vma;
+		buf->info.contig = true;
+		paddr -= buf->offset;
+
+		for (i = 0; i < size_pg; paddr += PAGE_SIZE, ++i)
+			buf->info.pages[i] = phys_to_page(paddr);
+	}
+	max_order = vb2_dma_iommu_pg_sizes(&buf->info);
+
+	/* max_order is for number of pages; order of bytes: += 12 */
+	max_order += PAGE_SHIFT;
+	/* we need to keep the contract of vb2_dma_iommu_request */
+	if (max_order < ctx->order)
+		max_order = ctx->order;
+
+	buf->drv_addr = gen_pool_alloc_aligned(ctx->pool, size, max_order);
+	if (0 == buf->drv_addr) {
+		rv = ERR_PTR(-ENOMEM);
+		goto get_user_pages_rollback;
+	}
+	printk(KERN_DEBUG
+	"%s: Context 0x%lx mapping buffer of %ld user pages @0x%lx\n",
+	       __func__ , (unsigned long)ctx, size_pg, buf->drv_addr);
+	ret = vb2_dma_iommu_map(ctx->domain, buf->drv_addr, &buf->info);
+	if (ret < 0) {
+		rv = ERR_PTR(ret);
+		goto gen_pool_alloc_rollback;
+	}
+
+	return buf;
+
+gen_pool_alloc_rollback:
+	gen_pool_free(ctx->pool, buf->drv_addr, size);
+
+get_user_pages_rollback:
+	while (--num_pages_from_user >= 0)
+		put_page(buf->info.pages[num_pages_from_user]);
+	if (buf->vma)
+		vb2_put_vma(buf->vma);
+	kfree(buf->info.pg_map);
+
+pg_array_alloc_rollback:
+	kfree(buf->info.pages);
+
+buf_alloc_rollback:
+	kfree(buf);
+	return rv;
+}
+
+/*
+ * @put_userptr: inform the allocator that a USERPTR buffer will no longer
+ *		 be used
+ */
+static void vb2_dma_iommu_put_userptr(void *buf_priv)
+{
+	struct vb2_dma_iommu_buf *buf = buf_priv;
+	int i;
+
+	printk(KERN_DEBUG
+	       "%s: Context 0x%lx releasing buffer of %d user pages @0x%lx\n",
+	       __func__, (unsigned long)buf->ctx, buf->info.num_pages,
+	       buf->drv_addr);
+	vb2_dma_iommu_unmap(buf->ctx->domain, buf->drv_addr, &buf->info);
+	if (buf->vaddr)
+		vm_unmap_ram((void *)buf->vaddr, buf->info.num_pages);
+
+	gen_pool_free(buf->ctx->pool, buf->drv_addr, buf->info.size);
+
+	if (buf->vma)
+		vb2_put_vma(buf->vma);
+
+	i = buf->info.num_pages;
+	if (!buf->info.contig) {
+		while (--i >= 0) {
+			if (buf->write)
+				set_page_dirty_lock(buf->info.pages[i]);
+			put_page(buf->info.pages[i]);
+		}
+	}
+	kfree(buf->info.pg_map);
+	kfree(buf->info.pages);
+	kfree(buf);
+}
+
+static void *vb2_dma_iommu_vaddr(void *buf_priv)
+{
+	struct vb2_dma_iommu_buf *buf = buf_priv;
+
+	BUG_ON(!buf);
+
+	if (!buf->vaddr)
+		buf->vaddr = (unsigned long)vm_map_ram(buf->info.pages,
+					buf->info.num_pages,
+					-1,
+					pgprot_dmacoherent(PAGE_KERNEL));
+
+	/* add offset in case userptr is not page-aligned */
+	return (void *)(buf->vaddr + buf->offset);
+}
+
+static unsigned int vb2_dma_iommu_num_users(void *buf_priv)
+{
+	struct vb2_dma_iommu_buf *buf = buf_priv;
+
+	return atomic_read(&buf->refcount);
+}
+
+static int vb2_dma_iommu_mmap(void *buf_priv, struct vm_area_struct *vma)
+{
+	struct vb2_dma_iommu_buf *buf = buf_priv;
+	int ret;
+
+	if (!buf) {
+		printk(KERN_ERR "No memory to map\n");
+		return -EINVAL;
+	}
+
+	vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot);
+
+	ret = vb2_insert_pages(vma, buf->info.pages);
+	if (ret)
+		return ret;
+
+	/*
+	 * Use common vm_area operations to track buffer refcount.
+	 */
+	vma->vm_private_data	= &buf->handler;
+	vma->vm_ops		= &vb2_common_vm_ops;
+
+	vma->vm_ops->open(vma);
+
+	return 0;
+}
+
+static void *vb2_dma_iommu_cookie(void *buf_priv)
+{
+	struct vb2_dma_iommu_buf *buf = buf_priv;
+
+	return (void *)buf->drv_addr + buf->offset;
+}
+
+const struct vb2_mem_ops vb2_dma_iommu_memops = {
+	.alloc		= vb2_dma_iommu_alloc,
+	.put		= vb2_dma_iommu_put,
+	.get_userptr	= vb2_dma_iommu_get_userptr,
+	.put_userptr	= vb2_dma_iommu_put_userptr,
+	.vaddr		= vb2_dma_iommu_vaddr,
+	.mmap		= vb2_dma_iommu_mmap,
+	.num_users	= vb2_dma_iommu_num_users,
+	.cookie		= vb2_dma_iommu_cookie,
+};
+EXPORT_SYMBOL_GPL(vb2_dma_iommu_memops);
+
+void *vb2_dma_iommu_init(struct device *dev, struct device *iommu_dev,
+			 struct vb2_dma_iommu_request *iommu_req)
+{
+	struct vb2_dma_iommu_alloc_ctx *ctx;
+	unsigned long mem_base, mem_size;
+	int align_order;
+
+	ctx = kzalloc(sizeof *ctx, GFP_KERNEL);
+	if (!ctx)
+		return ERR_PTR(-ENOMEM);
+
+	align_order = VB2_DMA_IOMMU_PIECE_ORDER;
+	mem_base = VB2_DMA_IOMMU_MEM_BASE;
+	mem_size = VB2_DMA_IOMMU_MEM_SIZE;
+
+	if (iommu_req) {
+		if (iommu_req->align_order)
+			align_order = iommu_req->align_order;
+		if (iommu_req->mem_base)
+			mem_base = iommu_req->mem_base;
+		if (iommu_req->mem_size)
+			mem_size = iommu_req->mem_size;
+	}
+
+	ctx->order = align_order;
+	ctx->pool = gen_pool_create(align_order, VB2_DMA_IOMMU_NODE_ID);
+
+	if (!ctx->pool)
+		goto pool_alloc_fail;
+
+	if (gen_pool_add(ctx->pool, mem_base, mem_size, VB2_DMA_IOMMU_NODE_ID))
+		goto chunk_add_fail;
+
+	ctx->domain = iommu_domain_alloc();
+	if (!ctx->domain) {
+		dev_err(dev, "IOMMU domain alloc failed\n");
+		goto chunk_add_fail;
+	}
+
+	ctx->dev = iommu_dev;
+
+	return ctx;
+
+chunk_add_fail:
+	gen_pool_destroy(ctx->pool);
+pool_alloc_fail:
+	kfree(ctx);
+	return ERR_PTR(-ENOMEM);
+}
+EXPORT_SYMBOL_GPL(vb2_dma_iommu_init);
+
+void vb2_dma_iommu_cleanup(void *alloc_ctx)
+{
+	struct vb2_dma_iommu_alloc_ctx *ctx = alloc_ctx;
+
+	BUG_ON(NULL == alloc_ctx);
+
+	iommu_domain_free(ctx->domain);
+	gen_pool_destroy(ctx->pool);
+	kfree(alloc_ctx);
+}
+EXPORT_SYMBOL_GPL(vb2_dma_iommu_cleanup);
+
+int vb2_dma_iommu_enable(void *alloc_ctx)
+{
+	struct vb2_dma_iommu_alloc_ctx *ctx = alloc_ctx;
+
+	BUG_ON(NULL == alloc_ctx);
+
+	return iommu_attach_device(ctx->domain, ctx->dev);
+}
+EXPORT_SYMBOL_GPL(vb2_dma_iommu_enable);
+
+int vb2_dma_iommu_disable(void *alloc_ctx)
+{
+	struct vb2_dma_iommu_alloc_ctx *ctx = alloc_ctx;
+
+	BUG_ON(NULL == alloc_ctx);
+
+	iommu_detach_device(ctx->domain, ctx->dev);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vb2_dma_iommu_disable);
+
+MODULE_DESCRIPTION("iommu memory handling routines for videobuf2");
+MODULE_AUTHOR("Andrzej Pietrasiewicz");
+MODULE_LICENSE("GPL");
diff --git a/include/media/videobuf2-dma-iommu.h b/include/media/videobuf2-dma-iommu.h
new file mode 100644
index 0000000..02d3b14
--- /dev/null
+++ b/include/media/videobuf2-dma-iommu.h
@@ -0,0 +1,48 @@
+/*
+ * videobuf2-dma-iommu.h - IOMMU based memory allocator for videobuf2
+ *
+ * Copyright (C) 2011 Samsung Electronics
+ *
+ * Author: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ */
+
+#ifndef _MEDIA_VIDEOBUF2_DMA_IOMMU_H
+#define _MEDIA_VIDEOBUF2_DMA_IOMMU_H
+
+#include <media/videobuf2-core.h>
+
+struct device;
+
+struct vb2_dma_iommu_request {
+	/* mem_base and mem_size both 0 => use allocator's default */
+	unsigned long		mem_base;
+	unsigned long		mem_size;
+	/*
+	 * align_order 0 => use allocator's default
+	 * 0 < align_order < PAGE_SHIFT => rounded to PAGE_SHIFT by allocator
+	 */
+	int			align_order;
+};
+
+static inline unsigned long vb2_dma_iommu_plane_addr(
+		struct vb2_buffer *vb, unsigned int plane_no)
+{
+	return (unsigned long)vb2_plane_cookie(vb, plane_no);
+}
+
+extern const struct vb2_mem_ops vb2_dma_iommu_memops;
+
+void *vb2_dma_iommu_init(struct device *dev, struct device *iommu_dev,
+			 struct vb2_dma_iommu_request *req);
+
+void vb2_dma_iommu_cleanup(void *alloc_ctx);
+
+int vb2_dma_iommu_enable(void *alloc_ctx);
+
+int vb2_dma_iommu_disable(void *alloc_ctx);
+
+#endif
-- 
1.7.1.569.g6f426
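
A note for readers following the pg_map handling above: the bitmap has
one bit per 4KiB page, and a set bit marks the first 4KiB page of each
(compound) page, so the distance between two set bits gives the page
size. The following is only a userspace sketch of that encoding, with
hypothetical sketch_* helpers standing in for the kernel's
find_next_bit() and friends.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

static int sketch_test_bit(const unsigned long *map, size_t bit)
{
	return (map[bit / SKETCH_BITS_PER_LONG] >>
		(bit % SKETCH_BITS_PER_LONG)) & 1UL;
}

static void sketch_set_bit(unsigned long *map, size_t bit)
{
	map[bit / SKETCH_BITS_PER_LONG] |= 1UL << (bit % SKETCH_BITS_PER_LONG);
}

/* Index of the next set bit at or after start, or nbits if there is none. */
static size_t sketch_next_bit(const unsigned long *map, size_t nbits,
			      size_t start)
{
	while (start < nbits && !sketch_test_bit(map, start))
		start++;
	return start;
}

/*
 * Walk the map and store the length (in 4KiB pages) of each compound
 * page into lens[]; returns the number of compound pages found.
 */
static size_t sketch_compound_lengths(const unsigned long *map, size_t nbits,
				      size_t *lens, size_t max_lens)
{
	size_t n = 0;
	size_t bit = sketch_next_bit(map, nbits, 0);

	while (bit < nbits && n < max_lens) {
		size_t next = sketch_next_bit(map, nbits, bit + 1);

		lens[n++] = next - bit;	/* last page ends at nbits */
		bit = next;
	}
	return n;
}
```

With bits 0, 16 and 17 set in an 18-bit map, the walk reports one
16-page (64KiB) compound page followed by two single 4KiB pages.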

* [PATCH 5/7] v4l: s5p-fimc: add pm_runtime support
  2011-04-18  9:26 ` Marek Szyprowski
@ 2011-04-18  9:26   ` Marek Szyprowski
  -1 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-18  9:26 UTC (permalink / raw)
  To: linux-arm-kernel, linux-samsung-soc, linux-media
  Cc: Marek Szyprowski, Kyungmin Park, Andrzej Pietrasiwiecz,
	Sylwester Nawrocki, Arnd Bergmann, Kukjin Kim

This patch adds basic pm_runtime support to the s5p-fimc driver. PM
runtime support is required to enable the driver on the S5PV310 series
when the power domain driver is enabled.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 drivers/media/video/s5p-fimc/fimc-capture.c |    5 +++++
 drivers/media/video/s5p-fimc/fimc-core.c    |   14 ++++++++++++++
 2 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/drivers/media/video/s5p-fimc/fimc-capture.c b/drivers/media/video/s5p-fimc/fimc-capture.c
index 95f8b4e1..f697ed1 100644
--- a/drivers/media/video/s5p-fimc/fimc-capture.c
+++ b/drivers/media/video/s5p-fimc/fimc-capture.c
@@ -18,6 +18,7 @@
 #include <linux/interrupt.h>
 #include <linux/device.h>
 #include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
 #include <linux/list.h>
 #include <linux/slab.h>
 #include <linux/clk.h>
@@ -398,6 +399,8 @@ static int fimc_capture_open(struct file *file)
 	if (fimc_m2m_active(fimc))
 		return -EBUSY;
 
+	pm_runtime_get_sync(&fimc->pdev->dev);
+
 	if (++fimc->vid_cap.refcnt == 1) {
 		ret = fimc_isp_subdev_init(fimc, 0);
 		if (ret) {
@@ -428,6 +431,8 @@ static int fimc_capture_close(struct file *file)
 		fimc_subdev_unregister(fimc);
 	}
 
+	pm_runtime_put_sync(&fimc->pdev->dev);
+
 	return 0;
 }
 
diff --git a/drivers/media/video/s5p-fimc/fimc-core.c b/drivers/media/video/s5p-fimc/fimc-core.c
index 6c919b3..ead5c0a 100644
--- a/drivers/media/video/s5p-fimc/fimc-core.c
+++ b/drivers/media/video/s5p-fimc/fimc-core.c
@@ -20,6 +20,7 @@
 #include <linux/interrupt.h>
 #include <linux/device.h>
 #include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
 #include <linux/list.h>
 #include <linux/io.h>
 #include <linux/slab.h>
@@ -1410,6 +1411,8 @@ static int fimc_m2m_open(struct file *file)
 	if (fimc->vid_cap.refcnt > 0)
 		return -EBUSY;
 
+	pm_runtime_get_sync(&fimc->pdev->dev);
+
 	fimc->m2m.refcnt++;
 	set_bit(ST_OUTDMA_RUN, &fimc->state);
 
@@ -1452,6 +1455,8 @@ static int fimc_m2m_release(struct file *file)
 	if (--fimc->m2m.refcnt <= 0)
 		clear_bit(ST_OUTDMA_RUN, &fimc->state);
 
+	pm_runtime_put_sync(&fimc->pdev->dev);
+
 	return 0;
 }
 
@@ -1649,6 +1654,11 @@ static int fimc_probe(struct platform_device *pdev)
 		goto err_req_region;
 	}
 
+	pm_runtime_set_active(&pdev->dev);
+	pm_runtime_enable(&pdev->dev);
+
+	pm_runtime_get_sync(&pdev->dev);
+
 	fimc->num_clocks = MAX_FIMC_CLOCKS - 1;
 
 	/* Check if a video capture node needs to be registered. */
@@ -1706,6 +1716,8 @@ static int fimc_probe(struct platform_device *pdev)
 	dev_dbg(&pdev->dev, "%s(): fimc-%d registered successfully\n",
 		__func__, fimc->id);
 
+	pm_runtime_put_sync(&pdev->dev);
+
 	return 0;
 
 err_m2m:
@@ -1740,6 +1752,8 @@ static int __devexit fimc_remove(struct platform_device *pdev)
 
 	vb2_dma_contig_cleanup_ctx(fimc->alloc_ctx);
 
+	pm_runtime_disable(&pdev->dev);
+
 	iounmap(fimc->regs);
 	release_resource(fimc->regs_res);
 	kfree(fimc->regs_res);
-- 
1.7.1.569.g6f426
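
For readers unfamiliar with the get/put pairing used in this patch,
here is a toy userspace model of the usage counting (an assumption-laden
sketch, not the kernel API: the sketch_* names are invented, and real
runtime PM also involves idle callbacks, autosuspend and error returns
that are ignored here).

```c
#include <stdbool.h>

/* Hypothetical stand-in for a device's runtime PM state. */
struct sketch_dev {
	int usage;	/* outstanding "get" references */
	bool active;	/* true while the device is powered */
};

/* Models pm_runtime_get_sync(): take a reference and resume if needed. */
static void sketch_get_sync(struct sketch_dev *d)
{
	d->usage++;
	d->active = true;
}

/* Models pm_runtime_put_sync(): drop a reference, suspend on the last one. */
static void sketch_put_sync(struct sketch_dev *d)
{
	if (d->usage > 0 && --d->usage == 0)
		d->active = false;
}
```

In this model the probe path (get, register, put) leaves the device
suspended, and it stays powered only between the first open and the
last close.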

* [PATCH 5/7] v4l: s5p-fimc: add pm_runtime support
@ 2011-04-18  9:26   ` Marek Szyprowski
  0 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-18  9:26 UTC (permalink / raw)
  To: linux-arm-kernel

This patch adds basic support for pm_runtime to s5p-fimc driver. PM
runtime support is required to enable the driver on S5PV310 series with
power domain driver enabled.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 drivers/media/video/s5p-fimc/fimc-capture.c |    5 +++++
 drivers/media/video/s5p-fimc/fimc-core.c    |   14 ++++++++++++++
 2 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/drivers/media/video/s5p-fimc/fimc-capture.c b/drivers/media/video/s5p-fimc/fimc-capture.c
index 95f8b4e1..f697ed1 100644
--- a/drivers/media/video/s5p-fimc/fimc-capture.c
+++ b/drivers/media/video/s5p-fimc/fimc-capture.c
@@ -18,6 +18,7 @@
 #include <linux/interrupt.h>
 #include <linux/device.h>
 #include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
 #include <linux/list.h>
 #include <linux/slab.h>
 #include <linux/clk.h>
@@ -398,6 +399,8 @@ static int fimc_capture_open(struct file *file)
 	if (fimc_m2m_active(fimc))
 		return -EBUSY;
 
+	pm_runtime_get_sync(&fimc->pdev->dev);
+
 	if (++fimc->vid_cap.refcnt == 1) {
 		ret = fimc_isp_subdev_init(fimc, 0);
 		if (ret) {
@@ -428,6 +431,8 @@ static int fimc_capture_close(struct file *file)
 		fimc_subdev_unregister(fimc);
 	}
 
+	pm_runtime_put_sync(&fimc->pdev->dev);
+
 	return 0;
 }
 
diff --git a/drivers/media/video/s5p-fimc/fimc-core.c b/drivers/media/video/s5p-fimc/fimc-core.c
index 6c919b3..ead5c0a 100644
--- a/drivers/media/video/s5p-fimc/fimc-core.c
+++ b/drivers/media/video/s5p-fimc/fimc-core.c
@@ -20,6 +20,7 @@
 #include <linux/interrupt.h>
 #include <linux/device.h>
 #include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
 #include <linux/list.h>
 #include <linux/io.h>
 #include <linux/slab.h>
@@ -1410,6 +1411,8 @@ static int fimc_m2m_open(struct file *file)
 	if (fimc->vid_cap.refcnt > 0)
 		return -EBUSY;
 
+	pm_runtime_get_sync(&fimc->pdev->dev);
+
 	fimc->m2m.refcnt++;
 	set_bit(ST_OUTDMA_RUN, &fimc->state);
 
@@ -1452,6 +1455,8 @@ static int fimc_m2m_release(struct file *file)
 	if (--fimc->m2m.refcnt <= 0)
 		clear_bit(ST_OUTDMA_RUN, &fimc->state);
 
+	pm_runtime_put_sync(&fimc->pdev->dev);
+
 	return 0;
 }
 
@@ -1649,6 +1654,11 @@ static int fimc_probe(struct platform_device *pdev)
 		goto err_req_region;
 	}
 
+	pm_runtime_set_active(&pdev->dev);
+	pm_runtime_enable(&pdev->dev);
+
+	pm_runtime_get_sync(&pdev->dev);
+
 	fimc->num_clocks = MAX_FIMC_CLOCKS - 1;
 
 	/* Check if a video capture node needs to be registered. */
@@ -1706,6 +1716,8 @@ static int fimc_probe(struct platform_device *pdev)
 	dev_dbg(&pdev->dev, "%s(): fimc-%d registered successfully\n",
 		__func__, fimc->id);
 
+	pm_runtime_put_sync(&pdev->dev);
+
 	return 0;
 
 err_m2m:
@@ -1740,6 +1752,8 @@ static int __devexit fimc_remove(struct platform_device *pdev)
 
 	vb2_dma_contig_cleanup_ctx(fimc->alloc_ctx);
 
+	pm_runtime_disable(&pdev->dev);
+
 	iounmap(fimc->regs);
 	release_resource(fimc->regs_res);
 	kfree(fimc->regs_res);
-- 
1.7.1.569.g6f426
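
The runtime PM calls added in this patch follow a get/put pairing: each
open() takes a reference and each close()/release() drops it, while probe()
holds one only while it sets up the hardware. Below is a minimal userspace
model of that pairing, assuming a plain usage counter in place of the
kernel's per-device PM state. The mock_* names are hypothetical and are not
kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the runtime PM state of one device. */
struct mock_dev {
	int usage_count;	/* number of outstanding "get" references */
	bool powered;		/* true while the device is resumed */
};

/* Models pm_runtime_get_sync(): take a reference, resume on first user. */
static int mock_pm_get_sync(struct mock_dev *d)
{
	if (d->usage_count++ == 0)
		d->powered = true;
	return 0;
}

/* Models pm_runtime_put_sync(): drop a reference, suspend on last user. */
static void mock_pm_put_sync(struct mock_dev *d)
{
	assert(d->usage_count > 0);	/* an unbalanced put is a driver bug */
	if (--d->usage_count == 0)
		d->powered = false;
}

/* open()/close() pairing, shaped like the capture and m2m file ops. */
static void mock_open(struct mock_dev *d)
{
	mock_pm_get_sync(d);
}

static void mock_close(struct mock_dev *d)
{
	mock_pm_put_sync(d);
}
```

In this model the device stays powered until the last close(). Any error
path taken after a get needs a matching put to keep the counter balanced.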

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 6/7] v4l: s5p-fimc: Add support for vb2-dma-iommu allocator
  2011-04-18  9:26 ` Marek Szyprowski
@ 2011-04-18  9:26   ` Marek Szyprowski
  -1 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-18  9:26 UTC (permalink / raw)
  To: linux-arm-kernel, linux-samsung-soc, linux-media
  Cc: Marek Szyprowski, Kyungmin Park, Andrzej Pietrasiwiecz,
	Sylwester Nawrocki, Arnd Bergmann, Kukjin Kim

This patch adds support for the videobuf2-dma-iommu allocator to the
s5p-fimc driver. The allocator is selected only on systems that include
support for the S5P SYSMMU module (such as the EXYNOS4 platform);
otherwise the standard videobuf2-dma-contig allocator is used.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 drivers/media/video/Kconfig                 |    3 +-
 drivers/media/video/s5p-fimc/fimc-capture.c |    4 +-
 drivers/media/video/s5p-fimc/fimc-core.c    |   24 ++++---
 drivers/media/video/s5p-fimc/fimc-core.h    |    1 +
 drivers/media/video/s5p-fimc/fimc-mem.h     |  104 +++++++++++++++++++++++++++
 5 files changed, 123 insertions(+), 13 deletions(-)
 create mode 100644 drivers/media/video/s5p-fimc/fimc-mem.h
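
For readers skimming the diff: fimc-mem.h picks the allocator at build time
and exposes it under one set of fimc_vb2_* names, so the driver code needs
no #ifdefs of its own. A stripped-down userspace sketch of the same pattern
follows. USE_IOMMU stands in for CONFIG_S5P_SYSTEM_MMU, and the demo_* names
and address values are invented purely for illustration.

```c
#include <assert.h>
#include <string.h>

/* Build with -DUSE_IOMMU to select the IOMMU-style backend. */

#ifdef USE_IOMMU

/* Backend A: addresses are device-virtual (offset into an IOMMU window). */
static const char *demo_allocator_name(void) { return "dma-iommu"; }

static unsigned long demo_plane_addr(unsigned long base, int plane)
{
	return 0x20000000UL + base + (unsigned long)plane * 0x1000UL;
}

static int demo_allocator_enable(void)
{
	return 0;	/* a real backend would switch the MMU on here */
}

#else	/* physically contiguous fallback */

static const char *demo_allocator_name(void) { return "dma-contig"; }

static unsigned long demo_plane_addr(unsigned long base, int plane)
{
	return base + (unsigned long)plane * 0x1000UL;
}

static int demo_allocator_enable(void)
{
	return 0;	/* nothing to enable for contiguous memory */
}

#endif

/* Caller code is identical for both backends. */
static unsigned long demo_prepare_addr(unsigned long base, int plane)
{
	if (demo_allocator_enable())
		return 0;
	return demo_plane_addr(base, plane);
}
```

The point of the design is that callers such as demo_prepare_addr() compile
unchanged against either backend; only the header decides which one is used.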

diff --git a/drivers/media/video/Kconfig b/drivers/media/video/Kconfig
index 40d7bcc..bf2d55d 100644
--- a/drivers/media/video/Kconfig
+++ b/drivers/media/video/Kconfig
@@ -1031,7 +1031,8 @@ config VIDEO_MEM2MEM_TESTDEV
 config  VIDEO_SAMSUNG_S5P_FIMC
 	tristate "Samsung S5P FIMC (video postprocessor) driver"
 	depends on VIDEO_DEV && VIDEO_V4L2 && PLAT_S5P
-	select VIDEOBUF2_DMA_CONTIG
+	select VIDEOBUF2_DMA_IOMMU if S5P_SYSTEM_MMU
+	select VIDEOBUF2_DMA_CONTIG if !S5P_SYSTEM_MMU
 	select V4L2_MEM2MEM_DEV
 	help
 	  This is a v4l2 driver for the S5P camera interface
diff --git a/drivers/media/video/s5p-fimc/fimc-capture.c b/drivers/media/video/s5p-fimc/fimc-capture.c
index f697ed1..714f0df 100644
--- a/drivers/media/video/s5p-fimc/fimc-capture.c
+++ b/drivers/media/video/s5p-fimc/fimc-capture.c
@@ -29,9 +29,9 @@
 #include <media/v4l2-ioctl.h>
 #include <media/v4l2-mem2mem.h>
 #include <media/videobuf2-core.h>
-#include <media/videobuf2-dma-contig.h>
 
 #include "fimc-core.h"
+#include "fimc-mem.h"
 
 static struct v4l2_subdev *fimc_subdev_register(struct fimc_dev *fimc,
 					    struct s5p_fimc_isp_info *isp_info)
@@ -884,7 +884,7 @@ int fimc_register_capture_device(struct fimc_dev *fimc)
 	q->io_modes = VB2_MMAP | VB2_USERPTR;
 	q->drv_priv = fimc->vid_cap.ctx;
 	q->ops = &fimc_capture_qops;
-	q->mem_ops = &vb2_dma_contig_memops;
+	q->mem_ops = &fimc_vb2_allocator_memops;
 	q->buf_struct_size = sizeof(struct fimc_vid_buffer);
 
 	vb2_queue_init(q);
diff --git a/drivers/media/video/s5p-fimc/fimc-core.c b/drivers/media/video/s5p-fimc/fimc-core.c
index ead5c0a..594c471 100644
--- a/drivers/media/video/s5p-fimc/fimc-core.c
+++ b/drivers/media/video/s5p-fimc/fimc-core.c
@@ -27,9 +27,9 @@
 #include <linux/clk.h>
 #include <media/v4l2-ioctl.h>
 #include <media/videobuf2-core.h>
-#include <media/videobuf2-dma-contig.h>
 
 #include "fimc-core.h"
+#include "fimc-mem.h"
 
 static char *fimc_clocks[MAX_FIMC_CLOCKS] = {
 	"sclk_fimc", "fimc", "sclk_cam"
@@ -457,7 +457,7 @@ int fimc_prepare_addr(struct fimc_ctx *ctx, struct vb2_buffer *vb,
 	dbg("memplanes= %d, colplanes= %d, pix_size= %d",
 		frame->fmt->memplanes, frame->fmt->colplanes, pix_size);
 
-	paddr->y = vb2_dma_contig_plane_paddr(vb, 0);
+	paddr->y = fimc_vb2_plane_addr(vb, 0);
 
 	if (frame->fmt->memplanes == 1) {
 		switch (frame->fmt->colplanes) {
@@ -485,10 +485,10 @@ int fimc_prepare_addr(struct fimc_ctx *ctx, struct vb2_buffer *vb,
 		}
 	} else {
 		if (frame->fmt->memplanes >= 2)
-			paddr->cb = vb2_dma_contig_plane_paddr(vb, 1);
+			paddr->cb = fimc_vb2_plane_addr(vb, 1);
 
 		if (frame->fmt->memplanes == 3)
-			paddr->cr = vb2_dma_contig_plane_paddr(vb, 2);
+			paddr->cr = fimc_vb2_plane_addr(vb, 2);
 	}
 
 	dbg("PHYS_ADDR: y= 0x%X  cb= 0x%X cr= 0x%X ret= %d",
@@ -1378,7 +1378,7 @@ static int queue_init(void *priv, struct vb2_queue *src_vq,
 	src_vq->io_modes = VB2_MMAP | VB2_USERPTR;
 	src_vq->drv_priv = ctx;
 	src_vq->ops = &fimc_qops;
-	src_vq->mem_ops = &vb2_dma_contig_memops;
+	src_vq->mem_ops = &fimc_vb2_allocator_memops;
 	src_vq->buf_struct_size = sizeof(struct v4l2_m2m_buffer);
 
 	ret = vb2_queue_init(src_vq);
@@ -1390,7 +1390,7 @@ static int queue_init(void *priv, struct vb2_queue *src_vq,
 	dst_vq->io_modes = VB2_MMAP | VB2_USERPTR;
 	dst_vq->drv_priv = ctx;
 	dst_vq->ops = &fimc_qops;
-	dst_vq->mem_ops = &vb2_dma_contig_memops;
+	dst_vq->mem_ops = &fimc_vb2_allocator_memops;
 	dst_vq->buf_struct_size = sizeof(struct v4l2_m2m_buffer);
 
 	return vb2_queue_init(dst_vq);
@@ -1688,12 +1688,15 @@ static int fimc_probe(struct platform_device *pdev)
 		goto err_clk;
 	}
 
-	/* Initialize contiguous memory allocator */
-	fimc->alloc_ctx = vb2_dma_contig_init_ctx(&fimc->pdev->dev);
+	/* Initialize memory allocator */
+	fimc->alloc_ctx = fimc_vb2_allocator_init(pdev, fimc);
 	if (IS_ERR(fimc->alloc_ctx)) {
 		ret = PTR_ERR(fimc->alloc_ctx);
 		goto err_irq;
 	}
+	ret = fimc_vb2_allocator_enable(fimc->alloc_ctx);
+	if (ret)
+		goto err_irq;
 
 	ret = fimc_register_m2m_device(fimc);
 	if (ret)
@@ -1750,7 +1753,8 @@ static int __devexit fimc_remove(struct platform_device *pdev)
 
 	fimc_clk_release(fimc);
 
-	vb2_dma_contig_cleanup_ctx(fimc->alloc_ctx);
+	fimc_vb2_allocator_disable(fimc->alloc_ctx);
+	fimc_vb2_allocator_cleanup(fimc->alloc_ctx, fimc);
 
 	pm_runtime_disable(&pdev->dev);
 
@@ -1907,7 +1911,7 @@ static struct platform_device_id fimc_driver_ids[] = {
 		.name		= "s5pv210-fimc",
 		.driver_data	= (unsigned long)&fimc_drvdata_s5pv210,
 	}, {
-		.name		= "s5pv310-fimc",
+		.name		= "exynos4-fimc",
 		.driver_data	= (unsigned long)&fimc_drvdata_s5pv310,
 	},
 	{},
diff --git a/drivers/media/video/s5p-fimc/fimc-core.h b/drivers/media/video/s5p-fimc/fimc-core.h
index 3beb1e5..0f23547 100644
--- a/drivers/media/video/s5p-fimc/fimc-core.h
+++ b/drivers/media/video/s5p-fimc/fimc-core.h
@@ -423,6 +423,7 @@ struct fimc_dev {
 	struct fimc_vid_cap		vid_cap;
 	unsigned long			state;
 	struct vb2_alloc_ctx		*alloc_ctx;
+	struct device			*iommu_dev;
 };
 
 /**
diff --git a/drivers/media/video/s5p-fimc/fimc-mem.h b/drivers/media/video/s5p-fimc/fimc-mem.h
new file mode 100644
index 0000000..7b920a8
--- /dev/null
+++ b/drivers/media/video/s5p-fimc/fimc-mem.h
@@ -0,0 +1,104 @@
+/*
+ * Copyright (c) 2011 Samsung Electronics
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef FIMC_MEM_H_
+#define FIMC_MEM_H_
+
+/*
+ * fimc-mem.h is the interface for the videobuf2 allocator. It is a proxy
+ * to the real allocator, chosen according to system capabilities.
+ * 1. on S5PC100 & S5PV210/S5PC110 systems vb2-dma-contig is used
+ * 2. on EXYNOS4 systems vb2-dma-iommu allocator is selected.
+ *
+ */
+
+#ifdef CONFIG_S5P_SYSTEM_MMU
+
+#include <plat/sysmmu.h>
+#include <media/videobuf2-dma-iommu.h>
+
+#define fimc_vb2_allocator_memops vb2_dma_iommu_memops
+
+static inline void *fimc_vb2_allocator_init(struct platform_device *pdev,
+					    struct fimc_dev *fimc)
+{
+	struct device *iommu_dev = s5p_sysmmu_get(S5P_SYSMMU_FIMC0 + pdev->id);
+	void *ret;
+
+	if (!iommu_dev) {
+		dev_err(&pdev->dev, "SYSMMU get failed\n");
+		return ERR_PTR(-ENODEV);
+	}
+
+	ret = vb2_dma_iommu_init(&pdev->dev, iommu_dev, NULL);
+	if (IS_ERR(ret)) {
+		s5p_sysmmu_put(iommu_dev);
+		return ret;
+	}
+	fimc->iommu_dev = iommu_dev;
+	return ret;
+}
+
+static inline void fimc_vb2_allocator_cleanup(void *alloc_ctx,
+					      struct fimc_dev *fimc)
+{
+	vb2_dma_iommu_cleanup(alloc_ctx);
+	s5p_sysmmu_put(fimc->iommu_dev);
+}
+
+static inline unsigned long fimc_vb2_plane_addr(struct vb2_buffer *b, int n)
+{
+	return vb2_dma_iommu_plane_addr(b, n);
+}
+
+static inline int fimc_vb2_allocator_enable(void *alloc_ctx)
+{
+	return vb2_dma_iommu_enable(alloc_ctx);
+}
+
+static inline int fimc_vb2_allocator_disable(void *alloc_ctx)
+{
+	return vb2_dma_iommu_disable(alloc_ctx);
+}
+
+#else	/* use vb2-dma-contig allocator */
+
+#include <media/videobuf2-dma-contig.h>
+
+#define fimc_vb2_allocator_memops vb2_dma_contig_memops
+
+static inline void *fimc_vb2_allocator_init(struct platform_device *pdev,
+					    struct fimc_dev *fimc)
+{
+	return vb2_dma_contig_init_ctx(&pdev->dev);
+}
+
+static inline void fimc_vb2_allocator_cleanup(void *alloc_ctx,
+					      struct fimc_dev *fimc)
+{
+	vb2_dma_contig_cleanup_ctx(alloc_ctx);
+}
+
+static inline unsigned long fimc_vb2_plane_addr(struct vb2_buffer *b, int n)
+{
+	return vb2_dma_contig_plane_paddr(b, n);
+}
+
+static inline int fimc_vb2_allocator_enable(void *alloc_ctx)
+{
+	return 0;
+}
+
+static inline int fimc_vb2_allocator_disable(void *alloc_ctx)
+{
+	return 0;
+}
+
+#endif
+
+#endif /* FIMC_MEM_H_ */
-- 
1.7.1.569.g6f426

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 7/7] ARM: EXYNOS4: enable FIMC on Universal_C210
  2011-04-18  9:26 ` Marek Szyprowski
@ 2011-04-18  9:26   ` Marek Szyprowski
  -1 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-18  9:26 UTC (permalink / raw)
  To: linux-arm-kernel, linux-samsung-soc, linux-media
  Cc: Marek Szyprowski, Kyungmin Park, Andrzej Pietrasiwiecz,
	Sylwester Nawrocki, Arnd Bergmann, Kukjin Kim

This patch adds the definitions needed to enable the s5p-fimc driver,
together with the required power domain and SYSMMU controllers, on the
Universal C210 board.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 arch/arm/mach-exynos4/Kconfig               |    6 ++++++
 arch/arm/mach-exynos4/mach-universal_c210.c |   22 ++++++++++++++++++++++
 2 files changed, 28 insertions(+), 0 deletions(-)
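
The board code below makes the CAM power domain device the parent of every
FIMC and FIMC SYSMMU device, presumably so that runtime PM activity on any
child keeps the domain powered. Here is a rough userspace model of that
parent/child propagation, assuming plain usage counters. The node_* names
are hypothetical and this is not the kernel's runtime PM implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device node: active while it or any child is in use. */
struct node {
	struct node *parent;
	int active_users;	/* own references plus active children */
};

/* Take a reference; the first one also pins the parent (power domain). */
static void node_get(struct node *n)
{
	if (n->active_users++ == 0 && n->parent)
		node_get(n->parent);
}

/* Drop a reference; the last one releases the parent again. */
static void node_put(struct node *n)
{
	assert(n->active_users > 0);
	if (--n->active_users == 0 && n->parent)
		node_put(n->parent);
}

static int node_is_active(const struct node *n)
{
	return n->active_users > 0;
}
```

With one domain node and two children, the domain stays active until the
last child has been released.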

diff --git a/arch/arm/mach-exynos4/Kconfig b/arch/arm/mach-exynos4/Kconfig
index e849f67..544a594 100644
--- a/arch/arm/mach-exynos4/Kconfig
+++ b/arch/arm/mach-exynos4/Kconfig
@@ -148,12 +148,18 @@ config MACH_ARMLEX4210
 config MACH_UNIVERSAL_C210
 	bool "Mobile UNIVERSAL_C210 Board"
 	select CPU_EXYNOS4210
+	select S5P_DEV_FIMC0
+	select S5P_DEV_FIMC1
+	select S5P_DEV_FIMC2
+	select S5P_DEV_FIMC3
 	select S3C_DEV_HSMMC
 	select S3C_DEV_HSMMC2
 	select S3C_DEV_HSMMC3
 	select S3C_DEV_I2C1
 	select S3C_DEV_I2C5
 	select S5P_DEV_ONENAND
+	select EXYNOS4_DEV_PD
+	select EXYNOS4_DEV_SYSMMU
 	select EXYNOS4_SETUP_I2C1
 	select EXYNOS4_SETUP_I2C5
 	select EXYNOS4_SETUP_SDHCI
diff --git a/arch/arm/mach-exynos4/mach-universal_c210.c b/arch/arm/mach-exynos4/mach-universal_c210.c
index 97d329f..7ff2f5f 100644
--- a/arch/arm/mach-exynos4/mach-universal_c210.c
+++ b/arch/arm/mach-exynos4/mach-universal_c210.c
@@ -27,9 +27,12 @@
 #include <plat/cpu.h>
 #include <plat/devs.h>
 #include <plat/iic.h>
+#include <plat/pd.h>
 #include <plat/sdhci.h>
+#include <plat/sysmmu.h>
 
 #include <mach/map.h>
+#include <mach/regs-clock.h>
 
 /* Following are default values for UCON, ULCON and UFCON UART registers */
 #define UNIVERSAL_UCON_DEFAULT	(S3C2410_UCON_TXILEVEL |	\
@@ -613,6 +616,15 @@ static struct platform_device *universal_devices[] __initdata = {
 	&s3c_device_hsmmc2,
 	&s3c_device_hsmmc3,
 	&s3c_device_i2c5,
+	&s5p_device_fimc0,
+	&s5p_device_fimc1,
+	&s5p_device_fimc2,
+	&s5p_device_fimc3,
+	&exynos4_device_pd[PD_CAM],
+	&exynos4_device_sysmmu[S5P_SYSMMU_FIMC0],
+	&exynos4_device_sysmmu[S5P_SYSMMU_FIMC1],
+	&exynos4_device_sysmmu[S5P_SYSMMU_FIMC2],
+	&exynos4_device_sysmmu[S5P_SYSMMU_FIMC3],
 
 	/* Universal Devices */
 	&universal_gpio_keys,
@@ -638,6 +650,16 @@ static void __init universal_machine_init(void)
 
 	/* Last */
 	platform_add_devices(universal_devices, ARRAY_SIZE(universal_devices));
+
+	s5p_device_fimc0.dev.parent = &exynos4_device_pd[PD_CAM].dev;
+	s5p_device_fimc1.dev.parent = &exynos4_device_pd[PD_CAM].dev;
+	s5p_device_fimc2.dev.parent = &exynos4_device_pd[PD_CAM].dev;
+	s5p_device_fimc3.dev.parent = &exynos4_device_pd[PD_CAM].dev;
+	exynos4_device_sysmmu[S5P_SYSMMU_FIMC0].dev.parent = &exynos4_device_pd[PD_CAM].dev;
+	exynos4_device_sysmmu[S5P_SYSMMU_FIMC1].dev.parent = &exynos4_device_pd[PD_CAM].dev;
+	exynos4_device_sysmmu[S5P_SYSMMU_FIMC2].dev.parent = &exynos4_device_pd[PD_CAM].dev;
+	exynos4_device_sysmmu[S5P_SYSMMU_FIMC3].dev.parent = &exynos4_device_pd[PD_CAM].dev;
+
 }
 
 MACHINE_START(UNIVERSAL_C210, "UNIVERSAL_C210")
-- 
1.7.1.569.g6f426

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* RE: [RFC/PATCH v3 0/7] Samsung IOMMU videobuf2 allocator and s5p-fimc update
  2011-04-18  9:26 ` Marek Szyprowski
@ 2011-04-18 13:24   ` Marek Szyprowski
  -1 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-18 13:24 UTC (permalink / raw)
  To: Marek Szyprowski, linux-arm-kernel, linux-samsung-soc, linux-media
  Cc: 'Kyungmin Park',
	Andrzej Pietrasiewicz, Sylwester Nawrocki,
	'Arnd Bergmann', 'Kukjin Kim'

Hello,

On Monday, April 18, 2011 11:27 AM Marek Szyprowski wrote:

> This is a third version of the Samsung IOMMU driver (see patch #2) and
> videobuf2 allocator for IOMMU mapped memory (see patch #4) as well as
> FIMC driver update. This update brings some minor bugfixes to Samsung
> IOMMU (SYSMMU) driver and support for pages larger than 4KiB in
> videobuf2-dma-iommu allocator.

snip

> This patch series contains a collection of patches for various platform
> subsystems. Here is a detailed list:
> 
> [PATCH 1/7] ARM: EXYNOS4: power domains: fixes and code cleanup
> - adds support for block gating in Samsung power domain driver and
>   performs some cleanup
> 
> [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
> - a complete rewrite of sysmmu driver for Samsung platform, now uses
>   linux/include/iommu.h api (key patch in this series)
> 
> [PATCH 3/7] v4l: videobuf2: dma-sg: move some generic functions to memops
> - a little cleanup and preparations for the dma-iommu allocator
> 
> [PATCH 4/7] v4l: videobuf2: add IOMMU based DMA memory allocator
> - introduces new memory allocator for videobuf2 for drivers that support
>   iommu dma memory mappings (key patch in this series)

I was in a bit of a hurry and forgot to mention that the above patch relies
on some improvements to the genalloc framework. The two required patches can
be found in the following patch series:
https://lkml.org/lkml/2011/3/31/213

"[PATCH 01/12] lib: bitmap: Added alignment offset for
bitmap_find_next_zero_area()"
https://lkml.org/lkml/2011/3/31/211

"[PATCH 02/12] lib: genalloc: Generic allocator improvements"
https://lkml.org/lkml/2011/3/31/207

For easier testing, I've uploaded the whole IOMMU patch series and its
prerequisites to the vb2-iommu branch of a public Git repository (it will
be available in about 2 hours):

http://git.infradead.org/users/kmpark/linux-2.6-samsung/shortlog/refs/heads/vb2-iommu

git://git.infradead.org/users/kmpark/linux-2.6-samsung vb2-iommu

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-18  9:26   ` Marek Szyprowski
@ 2011-04-18 14:12     ` Arnd Bergmann
  -1 siblings, 0 replies; 64+ messages in thread
From: Arnd Bergmann @ 2011-04-18 14:12 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-arm-kernel, linux-samsung-soc, linux-media, Kyungmin Park,
	Andrzej Pietrasiwiecz, Sylwester Nawrocki, Kukjin Kim

On Monday 18 April 2011, Marek Szyprowski wrote:
> From: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
> 
> This patch performs a complete rewrite of sysmmu driver for Samsung platform:
> - simplified the resource management: no more single platform
>   device with 32 resources is needed, better fits into linux driver model,
>   each sysmmu instance has it's own resource definition
> - the new version uses kernel wide common iommu api defined in include/iommu.h
> - cleaned support for sysmmu clocks
> - added support for custom fault handlers and tlb replacement policy

Looks like good progress, but I fear that there is still quite a bit more
work needed here.

> +static int debug;
> +module_param(debug, int, 0644);
> +
> +#define sysmmu_debug(level, fmt, arg...)                                \
> +       do {                                                             \
> +               if (debug >= level)                                      \
> +                       printk(KERN_DEBUG "[%s] " fmt, __func__, ## arg);\
> +       } while (0)

Just use dev_dbg() here, the kernel already has all the infrastructure.
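
For illustration only, here is a userspace stand-in for what such call sites
could look like. The real dev_dbg() comes from <linux/device.h> and is
switched on via DEBUG or dynamic debug rather than a private module
parameter. The demo_* names and the device name used here are made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical miniature of struct device: only a name. */
struct demo_device {
	const char *name;
};

static char demo_log[128];	/* last formatted message, for inspection */

/* Userspace stand-in mirroring the dev_dbg(dev, fmt, ...) calling shape. */
#define demo_dev_dbg(dev, fmt, ...) \
	snprintf(demo_log, sizeof(demo_log), "%s: " fmt, \
		 (dev)->name, ##__VA_ARGS__)

/* A call site needs no private debug level and no manual prefix. */
static void demo_map_page(struct demo_device *dev, unsigned long iova)
{
	demo_dev_dbg(dev, "mapping iova 0x%lx\n", iova);
}
```

Each message is prefixed with the device name, so output from several
sysmmu instances can be told apart without a custom macro.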

> +
> +#define generic_extract(l, s, entry) \
> +                               ((entry) & l##LPT_##s##_MASK)
> +#define flpt_get_1m(entry)     generic_extract(F, 1M, deref_va(entry))
> +#define flpt_get_16m(entry)    generic_extract(F, 16M, deref_va(entry))
> +#define slpt_get_4k(entry)     generic_extract(S, 4K, deref_va(entry))
> +#define slpt_get_64k(entry)    generic_extract(S, 64K, deref_va(entry))
> +
> +#define generic_entry(l, s, entry) \
> +                               (generic_extract(l, s, entry)  | PAGE_##s)
> +#define flpt_ent_4k_64k(entry) generic_entry(F, 4K_64K, entry)
> +#define flpt_ent_1m(entry)     generic_entry(F, 1M, entry)
> +#define flpt_ent_16m(entry)    generic_entry(F, 16M, entry)
> +#define slpt_ent_4k(entry)     generic_entry(S, 4K, entry)
> +#define slpt_ent_64k(entry)    generic_entry(S, 64K, entry)
> +
> +#define page_4k_64k(entry)     (deref_va(entry) & PAGE_4K_64K)
> +#define page_1m(entry)         (deref_va(entry) & PAGE_1M)
> +#define page_16m(entry)                ((deref_va(entry) & PAGE_16M) == PAGE_16M)
> +#define page_4k(entry)         (deref_va(entry) & PAGE_4K)
> +#define page_64k(entry)                (deref_va(entry) & PAGE_64K)
> +
> +#define generic_pg_offs(l, s, va) \
> +                               (va & ~l##LPT_##s##_MASK)
> +#define pg_offs_1m(va)         generic_pg_offs(F, 1M, va)
> +#define pg_offs_16m(va)                generic_pg_offs(F, 16M, va)
> +#define pg_offs_4k(va)         generic_pg_offs(S, 4K, va)
> +#define pg_offs_64k(va)                generic_pg_offs(S, 64K, va)
> +
> +#define flpt_index(va)         (((va) >> FLPT_IDX_SHIFT) & FLPT_IDX_MASK)
> +
> +#define generic_offset(l, va)  (((va) >> l##LPT_OFFS_SHIFT) & l##LPT_OFFS_MASK)
> +#define flpt_offs(va)          generic_offset(F, va)
> +#define slpt_offs(va)          generic_offset(S, va)
> +
> +#define invalidate_slpt_ent(slpt_va) (deref_va(slpt_va) = 0UL)
> +
> +#define get_irq_callb(cb) \
> +                               (s5p_domain->irq_callb ? \
> +                                       (s5p_domain->irq_callb->cb ? \
> +                                       s5p_domain->irq_callb->cb : \
> +                                       s5p_sysmmu_irq_callb.cb) \
> +                               : s5p_sysmmu_irq_callb.cb)

These macros are really confusing, especially the ones that implicitly
access variables with a specific name. How about converting them into
inline functions?
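
As a rough illustration of what that could look like (a userspace sketch only: the struct names, the callback signature and the values of FLPT_IDX_SHIFT/FLPT_IDX_MASK are placeholders assumed for the example, not taken from the patch), the domain becomes an explicit argument instead of a variable that happens to be in scope:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types; only the shape matters. */
typedef void (*fault_cb_t)(int reason, unsigned long addr, void *priv);

struct irq_callb {
	fault_cb_t page_fault;
	fault_cb_t bus_error;
};

struct sysmmu_domain {
	const struct irq_callb *irq_callb;	/* optional per-domain overrides */
	void *irq_callb_priv;
};

static void default_fault(int reason, unsigned long addr, void *priv)
{
	(void)reason;
	(void)addr;
	(void)priv;
}

static const struct irq_callb default_callb = {
	.page_fault = default_fault,
	.bus_error = default_fault,
};

/* Replaces get_irq_callb(page_fault); the domain is passed in explicitly. */
static inline fault_cb_t get_page_fault_cb(const struct sysmmu_domain *d)
{
	if (d->irq_callb && d->irq_callb->page_fault)
		return d->irq_callb->page_fault;
	return default_callb.page_fault;
}

/* Assumed values: 1 MiB sections, so the first-level index is va[31:20]. */
#define FLPT_IDX_SHIFT	20
#define FLPT_IDX_MASK	0xfffUL

/* Same arithmetic as the flpt_index() macro, but with a typed argument. */
static inline unsigned long flpt_index(unsigned long va)
{
	return (va >> FLPT_IDX_SHIFT) & FLPT_IDX_MASK;
}
```

The compiler then checks the argument types, and a reader can see where the domain comes from at each call site.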

> +phys_addr_t s5p_iova_to_phys(struct iommu_domain *domain, unsigned long iova)

This should be static.

> +struct device *s5p_sysmmu_get(enum s5p_sysmmu_ip ip)
> +{
> +       struct device *ret = NULL;
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&sysmmu_slock, flags);
> +       if (sysmmu_table[ip]) {
> +               try_module_get(THIS_MODULE);
> +               ret = sysmmu_table[ip]->dev;
> +       }
> +       spin_unlock_irqrestore(&sysmmu_slock, flags);
> +
> +       return ret;
> +}
> +EXPORT_SYMBOL_GPL(s5p_sysmmu_get);
> +
> +void s5p_sysmmu_put(void *dev)
> +{
> +       BUG_ON(!dev);
> +       module_put(THIS_MODULE);
> +}
> +EXPORT_SYMBOL_GPL(s5p_sysmmu_put);

These look wrong for a number of reasons:

* try_module_get(THIS_MODULE) makes no sense at all, the idea of the
  try_module_get is to pin down another module that was calling down,
  which I suppose is not needed here.

* This extends the generic IOMMU API in platform specific ways, don't
  do that.

* I think you can do without these functions by including a pointer
  to the iommu structure in dev_archdata, see
  arch/powerpc/include/asm/device.h for an example.
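
To make the last point a bit more concrete, here is a minimal userspace model of the idea. Everything in it is an assumption made for the example: the `iommu` field name, the reduced `struct device` and the two helpers are hypothetical and not an existing kernel interface.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the driver's per-instance state. */
struct sysmmu_info {
	int ip;
};

/* Hypothetical: an arch-specific per-device area with an iommu slot. */
struct dev_archdata {
	void *iommu;
};

/* Heavily reduced model of struct device. */
struct device {
	const char *name;
	struct dev_archdata archdata;
};

/* Platform setup code ties a master device to its IOMMU once. */
static inline void dev_set_sysmmu(struct device *dev, struct sysmmu_info *mmu)
{
	dev->archdata.iommu = mmu;
}

/* The IOMMU driver finds its instance from the master's struct device. */
static inline struct sysmmu_info *dev_to_sysmmu(const struct device *dev)
{
	return dev->archdata.iommu;
}
```

With something along these lines, the attach path can look up the right instance from the struct device it is handed, so no exported get/put pair is needed.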

> +void s5p_sysmmu_domain_irq_callb(struct iommu_domain *domain,
> +                           struct s5p_sysmmu_irq_callb *ops, void *priv)
> +{
> +       struct s5p_sysmmu_domain *s5p_domain = domain->priv;
> +       s5p_domain->irq_callb = ops;
> +       s5p_domain->irq_callb_priv = priv;
> +}
> +EXPORT_SYMBOL(s5p_sysmmu_domain_irq_callb);
> +
> +
> +void s5p_sysmmu_domain_tlb_policy(struct iommu_domain *domain, int policy)
> +{
> +       struct s5p_sysmmu_domain *s5p_domain = domain->priv;
> +       s5p_domain->policy = policy;
> +}
> +EXPORT_SYMBOL(s5p_sysmmu_domain_tlb_policy);

More private extensions that should not be here. Better extend the generic
IOMMU API to contain callbacks for these if they are required, and document
them in a generic location.

> +static void s5p_sysmmu_irq_page_fault(struct iommu_domain *domain, int reason,
> +                                     unsigned long addr, void *priv)
> +{
> +       sysmmu_debug(3, "%s: Faulting virtual address: 0x%08lx\n",
> +                    irq_reasons[reason], addr);
> +       BUG();
> +}
> +
> +static void s5p_sysmmu_irq_generic_callb(struct iommu_domain *domain,
> +                                        int reason, unsigned long addr,
> +                                        void *priv)
> +{
> +       sysmmu_debug(3, "%s\n", irq_reasons[reason]);
> +       BUG();
> +}

Why BUG() here? The backtrace of an interrupt handler should be rather
uninteresting, and you just end up in panic() since this is run
from an interrupt handler.

> +static struct s5p_sysmmu_irq_callb s5p_sysmmu_irq_callb = {
> +       .page_fault = s5p_sysmmu_irq_page_fault,
> +       .ar_fault = s5p_sysmmu_irq_generic_callb,
> +       .aw_fault = s5p_sysmmu_irq_generic_callb,
> +       .bus_error = s5p_sysmmu_irq_generic_callb,
> +       .ar_security = s5p_sysmmu_irq_generic_callb,
> +       .ar_prot = s5p_sysmmu_irq_generic_callb,
> +       .aw_security = s5p_sysmmu_irq_generic_callb,
> +       .aw_prot = s5p_sysmmu_irq_generic_callb,
> +};
> +
> +static irqreturn_t s5p_sysmmu_irq(int irq, void *dev_id)
> +{
> +       struct s5p_sysmmu_info *sysmmu = dev_id;
> +       struct s5p_sysmmu_domain *s5p_domain = sysmmu->domain->priv;
> +       unsigned int reg_INT_STATUS;
> +
> +       if (false == sysmmu->enabled)
> +               return IRQ_HANDLED;
> +
> +       reg_INT_STATUS = readl(sysmmu->regs + S5P_INT_STATUS);
> +       if (reg_INT_STATUS & 0xFF) {
> +               S5P_IRQ_CB(cb);
> +               enum s5p_sysmmu_fault reason = 0;
> +               unsigned long fault = 0;
> +               unsigned reg = 0;
> +               cb = NULL;
> +               switch (reg_INT_STATUS & 0xFF) {
> +               case 0x1:
> +                       cb = get_irq_callb(page_fault);
> +                       reason = S5P_SYSMMU_PAGE_FAULT;
> +                       reg = S5P_PAGE_FAULT_ADDR;
> +                       break;
> +               case 0x2:
> +                       cb = get_irq_callb(ar_fault);
> +                       reason = S5P_SYSMMU_AR_FAULT;
> +                       reg = S5P_AR_FAULT_ADDR;
> +                       break;
> +               case 0x4:
> +                       cb = get_irq_callb(aw_fault);
> +                       reason = S5P_SYSMMU_AW_FAULT;
> +                       reg = S5P_AW_FAULT_ADDR;
> +                       break;
> +               case 0x8:
> +                       cb = get_irq_callb(bus_error);
> +                       reason = S5P_SYSMMU_BUS_ERROR;
> +                       /* register common to page fault and bus error */
> +                       reg = S5P_PAGE_FAULT_ADDR;
> +                       break;
> +               case 0x10:
> +                       cb = get_irq_callb(ar_security);
> +                       reason = S5P_SYSMMU_AR_SECURITY;
> +                       reg = S5P_AR_FAULT_ADDR;
> +                       break;
> +               case 0x20:
> +                       cb = get_irq_callb(ar_prot);
> +                       reason = S5P_SYSMMU_AR_PROT;
> +                       reg = S5P_AR_FAULT_ADDR;
> +                       break;
> +               case 0x40:
> +                       cb = get_irq_callb(aw_security);
> +                       reason = S5P_SYSMMU_AW_SECURITY;
> +                       reg = S5P_AW_FAULT_ADDR;
> +                       break;
> +               case 0x80:
> +                       cb = get_irq_callb(aw_prot);
> +                       reason = S5P_SYSMMU_AW_PROT;
> +                       reg = S5P_AW_FAULT_ADDR;
> +                       break;
> +               }
> +               fault = readl(sysmmu->regs + reg);
> +               cb(sysmmu->domain, reason, fault, s5p_domain->irq_callb_priv);
> +               writel(reg_INT_STATUS, sysmmu->regs + S5P_INT_CLEAR);
> +       }
> +       return IRQ_HANDLED;
> +}

I think it would be more readable and more efficient to just use a lookup
table here instead of the long switch/case statement.
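
A possible shape for such a table, as a userspace sketch. The register offsets are placeholder values and the enum and function names are invented for the example. Note one deliberate difference from the switch above: if several status bits are set, this picks the lowest one instead of leaving the callback NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder register offsets; the real ones come from the driver. */
enum { PAGE_FAULT_ADDR = 0x24, AR_FAULT_ADDR = 0x28, AW_FAULT_ADDR = 0x2c };

enum fault {
	PAGE_FAULT, AR_FAULT, AW_FAULT, BUS_ERROR,
	AR_SECURITY, AR_PROT, AW_SECURITY, AW_PROT, NR_FAULTS
};

struct fault_desc {
	enum fault reason;
	unsigned int addr_reg;
};

/* Indexed by the bit number set in INT_STATUS (bit 0 = 0x1 ... bit 7 = 0x80). */
static const struct fault_desc fault_table[NR_FAULTS] = {
	{ PAGE_FAULT,  PAGE_FAULT_ADDR },
	{ AR_FAULT,    AR_FAULT_ADDR },
	{ AW_FAULT,    AW_FAULT_ADDR },
	{ BUS_ERROR,   PAGE_FAULT_ADDR },	/* shares the page fault register */
	{ AR_SECURITY, AR_FAULT_ADDR },
	{ AR_PROT,     AR_FAULT_ADDR },
	{ AW_SECURITY, AW_FAULT_ADDR },
	{ AW_PROT,     AW_FAULT_ADDR },
};

/* Returns the entry for the lowest set status bit, or NULL if none is set. */
static const struct fault_desc *lookup_fault(unsigned int status)
{
	unsigned int bit;

	status &= 0xff;
	for (bit = 0; bit < NR_FAULTS; bit++)
		if (status & (1u << bit))
			return &fault_table[bit];
	return NULL;
}
```

In the driver itself the loop could presumably be replaced by one of the existing bit-search helpers, and the per-fault callback could live in the same table entry.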

> +static int
> +s5p_sysmmu_suspend(struct platform_device *pdev, pm_message_t state)
> +{
> +       int ret = 0;
> +       sysmmu_debug(3, "begin\n");
> +
> +       return ret;
> +}
> +
> +static int s5p_sysmmu_resume(struct platform_device *pdev)
> +{
> +       int ret = 0;
> +       sysmmu_debug(3, "begin\n");
> +
> +       return ret;
> +}
> +
> +static int s5p_sysmmu_runtime_suspend(struct device *dev)
> +{
> +       sysmmu_debug(3, "begin\n");
> +       return 0;
> +}
> +
> +static int s5p_sysmmu_runtime_resume(struct device *dev)
> +{
> +       sysmmu_debug(3, "begin\n");
> +       return 0;
> +}

Why even provide these when they don't do anything?

> +static int __init
> +s5p_sysmmu_register(void)
> +{
> +       int ret;
> +
> +       sysmmu_debug(3, "Registering sysmmu driver...\n");
> +
> +       slpt_cache = kmem_cache_create("slpt_cache", 1024, 1024,
> +                                      SLAB_HWCACHE_ALIGN, NULL);
> +       if (!slpt_cache) {
> +               printk(KERN_ERR
> +                       "%s: failed to allocated slpt cache\n", __func__);
> +               return -ENOMEM;
> +       }
> +
> +       ret = platform_driver_register(&s5p_sysmmu_driver);
> +
> +       if (ret) {
> +               printk(KERN_ERR
> +                       "%s: failed to register sysmmu driver\n", __func__);
> +               return -EINVAL;
> +       }
> +
> +       register_iommu(&s5p_sysmmu_ops);
> +
> +       return ret;
> +}

When you register the iommu unconditionally, it becomes impossible for
this driver to coexist with other iommu drivers in the same kernel,
which goes against the concept of having a platform driver for this.

It might be better to call the s5p_sysmmu_register function from
the board files and have no platform devices at all if each IOMMU
is always bound to a specific device anyway. 

> diff --git a/arch/arm/plat-samsung/include/plat/devs.h b/arch/arm/plat-samsung/include/plat/devs.h
> index f0da6b7..0ae5dd0 100644
> --- a/arch/arm/plat-samsung/include/plat/devs.h
> +++ b/arch/arm/plat-samsung/include/plat/devs.h
> @@ -142,7 +142,7 @@ extern struct platform_device s5p_device_fimc3;
>  extern struct platform_device s5p_device_mipi_csis0;
>  extern struct platform_device s5p_device_mipi_csis1;
>  
> -extern struct platform_device exynos4_device_sysmmu;
> +extern struct platform_device exynos4_device_sysmmu[];
  
Why is this a global variable? I would expect this to be private to the
implementation.

	Arnd

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 4/7] v4l: videobuf2: add IOMMU based DMA memory allocator
  2011-04-18  9:26   ` Marek Szyprowski
@ 2011-04-18 14:15     ` Arnd Bergmann
  -1 siblings, 0 replies; 64+ messages in thread
From: Arnd Bergmann @ 2011-04-18 14:15 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-arm-kernel, linux-samsung-soc, linux-media, Kyungmin Park,
	Andrzej Pietrasiwiecz, Sylwester Nawrocki, Kukjin Kim

On Monday 18 April 2011, Marek Szyprowski wrote:
> From: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
> 
> This patch adds new videobuf2 memory allocator dedicated to devices that
> supports IOMMU DMA mappings. A device with IOMMU module and a driver
> with include/iommu.h compatible interface is required. This allocator
> aquires memory with standard alloc_page() call and doesn't suffer from
> memory fragmentation issues. The allocator support following page sizes:
> 4KiB, 64KiB, 1MiB and 16MiB to reduce iommu translation overhead.

My feeling is that this is not the right abstraction. Why can't you
just implement the regular dma-mapping.h interfaces for your IOMMU
so that the videobuf code can use the existing allocators?

	Arnd

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-18 14:12     ` Arnd Bergmann
@ 2011-04-19  8:23       ` Marek Szyprowski
  -1 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-19  8:23 UTC (permalink / raw)
  To: 'Arnd Bergmann'
  Cc: linux-arm-kernel, linux-samsung-soc, linux-media,
	'Kyungmin Park',
	Andrzej Pietrasiewicz, Sylwester Nawrocki, 'Kukjin Kim'

Hello,

On Monday, April 18, 2011 4:13 PM Arnd Bergmann wrote:

> On Monday 18 April 2011, Marek Szyprowski wrote:
> > From: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
> >
> > This patch performs a complete rewrite of sysmmu driver for Samsung
> platform:
> > - simplified the resource management: no more single platform
> >   device with 32 resources is needed, better fits into linux driver model,
> >   each sysmmu instance has it's own resource definition
> > - the new version uses kernel wide common iommu api defined in
> include/iommu.h
> > - cleaned support for sysmmu clocks
> > - added support for custom fault handlers and tlb replacement policy
> 
> Looks like good progress, but I fear that there is still quite a bit more
> work needed here.

Thanks for your comments! I've snipped the minor implementation comments
and focused only on the core iommu API.

(snipped)

> > +struct device *s5p_sysmmu_get(enum s5p_sysmmu_ip ip)
> > +{
> > +       struct device *ret = NULL;
> > +       unsigned long flags;
> > +
> > +       spin_lock_irqsave(&sysmmu_slock, flags);
> > +       if (sysmmu_table[ip]) {
> > +               try_module_get(THIS_MODULE);
> > +               ret = sysmmu_table[ip]->dev;
> > +       }
> > +       spin_unlock_irqrestore(&sysmmu_slock, flags);
> > +
> > +       return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(s5p_sysmmu_get);
> > +
> > +void s5p_sysmmu_put(void *dev)
> > +{
> > +       BUG_ON(!dev);
> > +       module_put(THIS_MODULE);
> > +}
> > +EXPORT_SYMBOL_GPL(s5p_sysmmu_put);
> 
> These look wrong for a number of reasons:
> 
> * try_module_get(THIS_MODULE) makes no sense at all, the idea of the
>   try_module_get is to pin down another module that was calling down,
>   which I suppose is not needed here.
> 
> * This extends the generic IOMMU API in platform specific ways, don't
>   do that.
> 
> * I think you can do without these functions by including a pointer
>   to the iommu structure in dev_archdata, see
>   arch/powerpc/include/asm/device.h for an example.

We heavily based our solution on the iommu implementation found in 
arch/arm/mach-msm/{devices-iommu,iommu,iommu_dev}.c

The s5p_sysmmu_get/put functions are the equivalents of msm_iommu_{get,put}_ctx.

(snipped)

> > +static int
> > +s5p_sysmmu_suspend(struct platform_device *pdev, pm_message_t state)
> > +{
> > +       int ret = 0;
> > +       sysmmu_debug(3, "begin\n");
> > +
> > +       return ret;
> > +}
> > +
> > +static int s5p_sysmmu_resume(struct platform_device *pdev)
> > +{
> > +       int ret = 0;
> > +       sysmmu_debug(3, "begin\n");
> > +
> > +       return ret;
> > +}
> > +
> > +static int s5p_sysmmu_runtime_suspend(struct device *dev)
> > +{
> > +       sysmmu_debug(3, "begin\n");
> > +       return 0;
> > +}
> > +
> > +static int s5p_sysmmu_runtime_resume(struct device *dev)
> > +{
> > +       sysmmu_debug(3, "begin\n");
> > +       return 0;
> > +}
> 
> Why even provide these when they don't do anything?

Because they are required by pm_runtime. If no runtime_{suspend,resume}
methods are provided, the pm_runtime core will not call the proper methods
on the parent device for pm_runtime_{get,put}_sync(). The parent device of
each sysmmu platform device is the power domain the sysmmu belongs to.

I know this is crazy, but this is the only way it can be handled now
with runtime_pm.

> > +static int __init
> > +s5p_sysmmu_register(void)
> > +{
> > +       int ret;
> > +
> > +       sysmmu_debug(3, "Registering sysmmu driver...\n");
> > +
> > +       slpt_cache = kmem_cache_create("slpt_cache", 1024, 1024,
> > +                                      SLAB_HWCACHE_ALIGN, NULL);
> > +       if (!slpt_cache) {
> > +               printk(KERN_ERR
> > +                       "%s: failed to allocated slpt cache\n",
> __func__);
> > +               return -ENOMEM;
> > +       }
> > +
> > +       ret = platform_driver_register(&s5p_sysmmu_driver);
> > +
> > +       if (ret) {
> > +               printk(KERN_ERR
> > +                       "%s: failed to register sysmmu driver\n",
> __func__);
> > +               return -EINVAL;
> > +       }
> > +
> > +       register_iommu(&s5p_sysmmu_ops);
> > +
> > +       return ret;
> > +}
> 
> When you register the iommu unconditionally, it becomes impossible for
> this driver to coexist with other iommu drivers in the same kernel,
> which does against the concept of having a platform driver for this.

> It might be better to call the s5p_sysmmu_register function from
> the board files and have no platform devices at all if each IOMMU
> is always bound to a specific device anyway.

OK, it looks like I don't fully understand how iommu.h should be used. It
seems that only one instance of iommu ops can be registered in the system,
so only one iommu driver can be active. You are right that the iommu
driver has to be registered on first probe().

I think it might be beneficial to describe our hardware (the Exynos4
platform) in a bit more detail. There are a number of multimedia blocks.
Each has its own IOMMU controller. Each IOMMU controller has its own set of
hardware registers and its own irq. There is also a GPU unit (Mali) which has
its own IOMMU hardware, incompatible with the SYSMMU, so right now it is ignored.

The multimedia blocks are modeled as platform devices and are independent
of platform type (same multimedia blocks can be found on other Samsung
machines, like for example s5pv210/s5pc110), see arch/arm/plat-s5p/dev-*.c
and arch/arm/plat-samsung/dev-*.c.

Platform driver data defined in the above files is registered by each
board's startup code, usually by platform_add_devices(); for more details
please check arch/arm/mach-s5pv210/mach-goni.c. There is a
struct platform_device *goni_devices[] array which gets registered in the
last line of the goni_machine_init() function.

For IOMMU controllers on Exynos4 we created an array of platform devices:
extern struct platform_device exynos4_device_sysmmu[];

Now the board startup code registers only those sysmmu controllers
(instances) that are required on the particular board. See "[PATCH 7/7]
ARM: EXYNOS4: enable FIMC on Universal_C210":
@@ -613,6 +616,15 @@ static struct platform_device *universal_devices[] __initdata = {
        &s3c_device_hsmmc2,
        &s3c_device_hsmmc3,
        &s3c_device_i2c5,
+       &s5p_device_fimc0,
+       &s5p_device_fimc1,
+       &s5p_device_fimc2,
+       &s5p_device_fimc3,
+       &exynos4_device_pd[PD_CAM],
+       &exynos4_device_sysmmu[S5P_SYSMMU_FIMC0],
+       &exynos4_device_sysmmu[S5P_SYSMMU_FIMC1],
+       &exynos4_device_sysmmu[S5P_SYSMMU_FIMC2],
+       &exynos4_device_sysmmu[S5P_SYSMMU_FIMC3],

We need to map the above structure onto the linux/iommu.h API.

The domains defined in the iommu API are quite straightforward. Each domain
is just a set of mappings between physical addresses (phys) and io addresses
(iova).

For the drivers the most important are the following functions:
iommu_{attach,detach}_device(struct iommu_domain *domain, struct device *dev);

We assumed that they just assign the domain (mapping) to a particular instance
of the iommu. However, the driver needs to somehow get a pointer to the iommu
instance. That's why we added the s5p_sysmmu_{get,put} functions.

Now I see that you want the clients (drivers) to pass their own
struct device pointer to the iommu_{attach,detach}_device() function instead of
passing a pointer to the iommu device. Am I right? We will need some kind of
mapping between multimedia devices and particular instances of sysmmu
controllers.

There will also be some problems with such an approach. Mainly, we have a
multimedia codec module which has 2 memory controllers (for faster transfers)
and 2 iommu controllers. How can we handle such a case?
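
One possible shape for such a mapping, sketched as a plain userspace C model
rather than kernel code: a static table that lists, per client device, the
sysmmu instances sitting in front of it. Everything here is an assumption made
for illustration (struct client_map, sysmmu_for_client(), the client names and
the SYSMMU_CODEC_* ids); only the FIMC instances are named in the patch. The
two-entry row shows how a client with two iommu controllers might be
described, without claiming that this is how the iommu API would want it
handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sysmmu instance ids; only the FIMC ones appear in the patch. */
enum sysmmu_id {
	SYSMMU_FIMC0,
	SYSMMU_FIMC1,
	SYSMMU_CODEC_A,	/* assumed: first codec memory port */
	SYSMMU_CODEC_B,	/* assumed: second codec memory port */
};

#define MAX_SYSMMU_PER_CLIENT 2

/* One row per client device: which sysmmu instances sit in front of it. */
struct client_map {
	const char *client;	/* client device name (illustrative) */
	size_t count;		/* number of valid entries in ids[] */
	enum sysmmu_id ids[MAX_SYSMMU_PER_CLIENT];
};

static const struct client_map client_table[] = {
	{ "fimc.0", 1, { SYSMMU_FIMC0 } },
	{ "fimc.1", 1, { SYSMMU_FIMC1 } },
	{ "codec",  2, { SYSMMU_CODEC_A, SYSMMU_CODEC_B } },
};

/* Return the table row for a client device, or NULL if it has no sysmmu. */
static const struct client_map *sysmmu_for_client(const char *client)
{
	size_t i;

	assert(client != NULL);
	for (i = 0; i < sizeof(client_table) / sizeof(client_table[0]); i++)
		if (strcmp(client_table[i].client, client) == 0)
			return &client_table[i];
	return NULL;
}
```

With a table like this, an attach path could walk every entry of the row for
the client it is given, which is one way the two-controller codec case might
be covered.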

> > diff --git a/arch/arm/plat-samsung/include/plat/devs.h b/arch/arm/plat-
> samsung/include/plat/devs.h
> > index f0da6b7..0ae5dd0 100644
> > --- a/arch/arm/plat-samsung/include/plat/devs.h
> > +++ b/arch/arm/plat-samsung/include/plat/devs.h
> > @@ -142,7 +142,7 @@ extern struct platform_device s5p_device_fimc3;
> >  extern struct platform_device s5p_device_mipi_csis0;
> >  extern struct platform_device s5p_device_mipi_csis1;
> >
> > -extern struct platform_device exynos4_device_sysmmu;
> > +extern struct platform_device exynos4_device_sysmmu[];
> 
> Why is this a global variable? I would expect this to be private to the
> implementation.

To allow each board to register only particular instances of sysmmu controllers.

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 4/7] v4l: videobuf2: add IOMMU based DMA memory allocator
  2011-04-18 14:15     ` Arnd Bergmann
@ 2011-04-19  9:02       ` Marek Szyprowski
  -1 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-19  9:02 UTC (permalink / raw)
  To: 'Arnd Bergmann'
  Cc: linux-arm-kernel, linux-samsung-soc, linux-media,
	'Kyungmin Park',
	Andrzej Pietrasiewicz, Sylwester Nawrocki, 'Kukjin Kim'

Hello,

On Monday, April 18, 2011 4:16 PM Arnd Bergmann wrote:

> On Monday 18 April 2011, Marek Szyprowski wrote:
> > From: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
> >
> > This patch adds a new videobuf2 memory allocator dedicated to devices that
> > support IOMMU DMA mappings. A device with an IOMMU module and a driver
> > with an include/iommu.h compatible interface is required. This allocator
> > acquires memory with the standard alloc_page() call and doesn't suffer from
> > memory fragmentation issues. The allocator supports the following page sizes:
> > 4KiB, 64KiB, 1MiB and 16MiB to reduce iommu translation overhead.
> 
> My feeling is that this is not the right abstraction. Why can't you
> just implement the regular dma-mapping.h interfaces for your IOMMU
> so that the videobuf code can use the existing allocators?

I'm not really sure which existing videobuf2 allocators might transparently
support an IOMMU interface yet.

Do you think that all iommu operations can be hidden behind dma_map_single 
and dma_unmap_single?
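
As an illustration of the page sizes listed in the quoted patch description,
the following is a minimal sketch of how a mapping loop might pick the largest
usable page at each step. It is not taken from the vb2-dma-iommu patch;
pick_page_size() and count_entries() are hypothetical helpers, and the rule
that both the io address and the physical address must be aligned to the page
size is an assumption about the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Page sizes listed in the patch description, largest first. */
static const uint32_t page_sizes[] = {
	16u << 20,	/* 16MiB */
	1u << 20,	/* 1MiB */
	64u << 10,	/* 64KiB */
	4u << 10,	/* 4KiB */
};

/*
 * Pick the largest page size usable for the next mapping step: both the
 * io address and the physical address must be aligned to it, and it must
 * not exceed the remaining length. Returns 0 if not even 4KiB fits.
 */
static uint32_t pick_page_size(uint32_t iova, uint32_t phys, uint32_t len)
{
	size_t i;

	for (i = 0; i < sizeof(page_sizes) / sizeof(page_sizes[0]); i++) {
		uint32_t sz = page_sizes[i];

		if (((iova | phys) & (sz - 1)) == 0 && len >= sz)
			return sz;
	}
	return 0;
}

/* Count how many page-table entries a contiguous chunk would need. */
static unsigned int count_entries(uint32_t iova, uint32_t phys, uint32_t len)
{
	unsigned int n = 0;

	while (len) {
		uint32_t sz = pick_page_size(iova, phys, len);

		assert(sz != 0);	/* input must be 4KiB-aligned */
		iova += sz;
		phys += sz;
		len -= sz;
		n++;
	}
	return n;
}
```

Under these assumptions a well-aligned buffer needs far fewer entries than one
mapped purely with 4KiB pages, which is where the reduced translation overhead
would come from.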

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 4/7] v4l: videobuf2: add IOMMU based DMA memory allocator
  2011-04-19  9:02       ` Marek Szyprowski
@ 2011-04-19  9:21         ` Russell King - ARM Linux
  -1 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-04-19  9:21 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: 'Arnd Bergmann',
	linux-samsung-soc, 'Kyungmin Park', 'Kukjin Kim',
	Sylwester Nawrocki, Andrzej Pietrasiewicz, linux-arm-kernel,
	linux-media

On Tue, Apr 19, 2011 at 11:02:34AM +0200, Marek Szyprowski wrote:
> On Monday, April 18, 2011 4:16 PM Arnd Bergmann wrote:
> > My feeling is that this is not the right abstraction. Why can't you
> > just implement the regular dma-mapping.h interfaces for your IOMMU
> > so that the videobuf code can use the existing allocators?
> 
> I'm not really sure which existing videobuf2 allocators might transparently
> support IOMMU interface yet
> 
> Do you think that all iommu operations can be hidden behind dma_map_single 
> and dma_unmap_single?

That is one of the intentions of the DMA API.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 4/7] v4l: videobuf2: add IOMMU based DMA memory allocator
  2011-04-19  9:21         ` Russell King - ARM Linux
@ 2011-04-19 12:00           ` Arnd Bergmann
  -1 siblings, 0 replies; 64+ messages in thread
From: Arnd Bergmann @ 2011-04-19 12:00 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Marek Szyprowski, linux-samsung-soc, 'Kyungmin Park',
	'Kukjin Kim',
	Sylwester Nawrocki, Andrzej Pietrasiewicz, linux-arm-kernel,
	linux-media

On Tuesday 19 April 2011, Russell King - ARM Linux wrote:
> On Tue, Apr 19, 2011 at 11:02:34AM +0200, Marek Szyprowski wrote:
> > On Monday, April 18, 2011 4:16 PM Arnd Bergmann wrote:
> > > My feeling is that this is not the right abstraction. Why can't you
> > > just implement the regular dma-mapping.h interfaces for your IOMMU
> > > so that the videobuf code can use the existing allocators?
> > 
> > I'm not really sure which existing videobuf2 allocators might transparently
> > support IOMMU interface yet
> > 
> > Do you think that all iommu operations can be hidden behind dma_map_single 
> > and dma_unmap_single?
> 
> That is one of the intentions of the DMA API.

Exactly.

All architectures that support IOMMUs today do that, see:

arch/alpha/kernel/pci_iommu.c
arch/ia64/hp/common/sba_iommu.c
arch/powerpc/kernel/dma-iommu.c
arch/sparc/kernel/iommu.c
arch/x86/kernel/amd_iommu.c

ARM would be the first one to combine an IOMMU with potentially
noncoherent DMA, but there is no fundamental reason why we shouldn't
be able to transparently support an IOMMU.

Ideally, I think we should first find an architecture-independent
way to define an IOMMU in one place instead of having to do both
the iommu.h and dma-mapping.h interfaces, but I wouldn't require
Samsung to do that in order to support their IOMMU. Supporting
the dma-mapping.h interface should be sufficient there.
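
The idea can be sketched in userspace C: the driver-facing call stays the
same, and a per-device ops table decides whether an IOMMU is programmed behind
it. This is only a toy model under stated assumptions, not the kernel's
dma-mapping implementation; struct toy_iommu, the toy_dma_* helpers, the fixed
IOVA_BASE and the single-page slots are all invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t dma_addr_t;

struct device;

/* Per-device mapping operations, loosely modelled on dma-mapping.h. */
struct dma_ops {
	dma_addr_t (*map)(struct device *dev, uint32_t phys, size_t size);
	void (*unmap)(struct device *dev, dma_addr_t addr, size_t size);
};

#define IOMMU_SLOTS	4
#define IOMMU_PAGE	4096u
#define IOVA_BASE	0x20000000u

/* Toy IOMMU: a handful of single-page slots at fixed io addresses. */
struct toy_iommu {
	uint32_t phys[IOMMU_SLOTS];
	int used[IOMMU_SLOTS];
};

struct device {
	const struct dma_ops *ops;
	struct toy_iommu *iommu;	/* NULL for devices without an IOMMU */
};

/* No IOMMU: the bus address is simply the physical address. */
static dma_addr_t direct_map(struct device *dev, uint32_t phys, size_t size)
{
	(void)dev; (void)size;
	return phys;
}

static void direct_unmap(struct device *dev, dma_addr_t addr, size_t size)
{
	(void)dev; (void)addr; (void)size;
}

/* With IOMMU: claim a free slot and hand back its io address. */
static dma_addr_t iommu_map(struct device *dev, uint32_t phys, size_t size)
{
	int i;

	assert(dev->iommu && size <= IOMMU_PAGE);
	for (i = 0; i < IOMMU_SLOTS; i++) {
		if (!dev->iommu->used[i]) {
			dev->iommu->used[i] = 1;
			dev->iommu->phys[i] = phys;
			return IOVA_BASE + i * IOMMU_PAGE;
		}
	}
	return 0;	/* toy error value: no free slot */
}

static void iommu_unmap(struct device *dev, dma_addr_t addr, size_t size)
{
	(void)size;
	dev->iommu->used[(addr - IOVA_BASE) / IOMMU_PAGE] = 0;
}

static const struct dma_ops direct_ops = { direct_map, direct_unmap };
static const struct dma_ops iommu_ops = { iommu_map, iommu_unmap };

/* What a driver calls; it never needs to know which ops are behind it. */
static dma_addr_t toy_dma_map_single(struct device *dev, uint32_t phys,
				     size_t size)
{
	return dev->ops->map(dev, phys, size);
}

static void toy_dma_unmap_single(struct device *dev, dma_addr_t addr,
				 size_t size)
{
	dev->ops->unmap(dev, addr, size);
}
```

In this model a driver written against toy_dma_map_single() works unchanged
whether its device has direct_ops or iommu_ops installed.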

	Arnd

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-19  8:23       ` Marek Szyprowski
@ 2011-04-19 12:49         ` Arnd Bergmann
  -1 siblings, 0 replies; 64+ messages in thread
From: Arnd Bergmann @ 2011-04-19 12:49 UTC (permalink / raw)
  To: Marek Szyprowski, Joerg Roedel
  Cc: linux-arm-kernel, linux-samsung-soc, linux-media,
	'Kyungmin Park',
	Andrzej Pietrasiewicz, Sylwester Nawrocki, 'Kukjin Kim'

(adding Joerg to Cc)

On Tuesday 19 April 2011, Marek Szyprowski wrote:

> > These look wrong for a number of reasons:
> > 
> > * try_module_get(THIS_MODULE) makes no sense at all, the idea of the
> >   try_module_get is to pin down another module that was calling down,
> >   which I suppose is not needed here.
> > 
> > * This extends the generic IOMMU API in platform specific ways, don't
> >   do that.
> > 
> > * I think you can do without these functions by including a pointer
> >   to the iommu structure in dev_archdata, see
> >   arch/powerpc/include/asm/device.h for an example.
> 
> We heavily based our solution on the iommu implementation found in 
> arch/arm/mach-msm/{devices-iommu,iommu,iommu_dev}.c
> 
> The s5p_sysmmu_get/put functions are equivalent to msm_iommu_{get,put}_ctx.
> 
> (snipped)

Yes, I'm sorry about this. I commented on the early versions of the MSM
driver, but then did not do another review of the version that actually
got merged. That should also be fixed; ideally we can come up with a
way that works for both drivers.

> > Why even provide these when they don't do anything?
> 
> Because they are required by pm_runtime. If no runtime_{suspend,resume}
> methods are provided, the pm_runtime core will not call proper methods
> on parent device for pm_runtime_{get,put}_sync(). The parent device for
> each sysmmu platform device is the power domain the sysmmu belongs to.
> 
> I know this is crazy, but this is the only way it can be handled now
> with runtime_pm.

Please don't try to work around kernel features when they don't fit
what you are doing. The intent of the way that runtime_pm works is
to make life easier for driver writers, not harder ;-)

I can see three ways that would be better solutions:

1. change the runtime_pm subsystem to allow it to ignore some devices
in an easy way.

2. change the device layout of the sysmmu. If the iommu device is
a child of the device that it is responsible for, I guess you don't
have this problem.

3. Not represent the iommu as a device at all, just as a property
of another device.

> > When you register the iommu unconditionally, it becomes impossible for
> > this driver to coexist with other iommu drivers in the same kernel,
> > which goes against the concept of having a platform driver for this.
> 
> > It might be better to call the s5p_sysmmu_register function from
> > the board files and have no platform devices at all if each IOMMU
> > is always bound to a specific device anyway.
> 
> Ok, it looks I don't fully get how this iommu.h should be used. It looks
> that there can be only one instance of iommu ops registered in the system,
> so only one iommu driver can be activated. You are right that the iommu
> driver has to be registered on first probe().

That is a limitation of the current implementation. We might want to
change that anyway, e.g. to handle the mali IOMMU along with yours.
I believe the reason for allowing only one IOMMU type so far has been
that nobody required more than one. As I mentioned, the IOMMU API is
rather new and has not been ported to a wide variety of hardware, unlike
the dma-mapping API, which does support multiple different IOMMUs
in a single system.
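
A toy userspace model of that limitation, and of one conceivable way around
it, might look as follows. None of this is the actual iommu.h code:
toy_register_iommu(), struct toy_device and the per-device ops pointer are
assumptions made for illustration, and returning -1 on a second registration
is a choice of this sketch only.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for an IOMMU driver's operations table. */
struct toy_iommu_ops {
	const char *name;
};

/* Model of a single, system-wide ops pointer. */
static const struct toy_iommu_ops *global_ops;

/* Returns 0 on success, -1 if another driver already registered. */
static int toy_register_iommu(const struct toy_iommu_ops *ops)
{
	assert(ops != NULL);
	if (global_ops)
		return -1;
	global_ops = ops;
	return 0;
}

/* Alternative: let each device carry its own ops pointer. */
struct toy_device {
	const char *name;
	const struct toy_iommu_ops *iommu_ops;
};

static void toy_device_set_iommu(struct toy_device *dev,
				 const struct toy_iommu_ops *ops)
{
	dev->iommu_ops = ops;
}
```

With the global pointer only the first driver wins; with a per-device pointer
two incompatible IOMMU types could be described side by side.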

> I think it might be beneficial to describe a bit more our hardware 
> (Exynos4 platform). There are a number of multimedia blocks. Each has its
> own IOMMU controller. Each IOMMU controller has its own set of hardware
> registers and irq. There is also a GPU unit (Mali) which has its own
> IOMMU hardware, incompatible with the SYSMMU, so right now it is ignored.
> 
> The multimedia blocks are modeled as platform devices and are independent
> of platform type (same multimedia blocks can be found on other Samsung
> machines, like for example s5pv210/s5pc110), see arch/arm/plat-s5p/dev-*.c
> and arch/arm/plat-samsung/dev-*.c.
> 
> Platform driver data defined in the above files are registered by each
> board startup code, usually by platform_add_devices(), for more details
> please check arch/arm/mach-s5pv210/mach-goni.c. There is
> struct platform_device *goni_devices[] array which get registered in the
> last line in goni_machine_init() function.
> 
> For IOMMU controllers on Exynos4 we created an array of platform devices:
> extern struct platform_device exynos4_device_sysmmu[];
> 
> Now the board startup code registers only these sysmmu controllers
> (instances) that are required on the particular board. See "[PATCH 7/7]
> ARM: EXYNOS4: enable FIMC on Universal_C210":
> @@ -613,6 +616,15 @@ static struct platform_device *universal_devices[]
> __initdata = {
>         &s3c_device_hsmmc2,
>         &s3c_device_hsmmc3,
>         &s3c_device_i2c5,
> +       &s5p_device_fimc0,
> +       &s5p_device_fimc1,
> +       &s5p_device_fimc2,
> +       &s5p_device_fimc3,
> +       &exynos4_device_pd[PD_CAM],
> +       &exynos4_device_sysmmu[S5P_SYSMMU_FIMC0],
> +       &exynos4_device_sysmmu[S5P_SYSMMU_FIMC1],
> +       &exynos4_device_sysmmu[S5P_SYSMMU_FIMC2],
> +       &exynos4_device_sysmmu[S5P_SYSMMU_FIMC3],
> 
> We need to map the above structure into linux/iommu.h api.

Thanks for the background information.

> The domains defined in the iommu api are quite straightforward. Each domain
> is just a set of mappings between physical addresses (phys) and io addresses
> (iova).
> 
> For the drivers the most important are the following functions:
> iommu_{attach,detach}_device(struct iommu_domain *domain, struct device *dev);
> 
> We assumed that they just assign the domain (mapping) to particular instance
> of iommu. However, the driver needs to somehow get the pointer to the iommu
> instance. That's why we added the s5p_sysmmu_{get,put} functions. 
> 
> Now I see that you want to make the clients (drivers) to provide their own
> struct device pointer to the iommu_{attach,detach}_device() function instead of
> giving there a pointer to iommu device. Am I right? We will need some kind of
> mapping between multimedia devices and particular instances of sysmmu
> controllers.
> 
> There will also be some problems with such an approach. Mainly we have a
> multimedia codec module, which has 2 memory controllers (for faster transfers)
> and 2 iommu controllers. How can we handle such case?

It's not quite how the domains are meant to be used. In the AMD IOMMU
that the API is based on, any number of devices can share one domain,
and devices might be able to have mappings in multiple domains.

The domain really reflects the user, not the device here, which makes more
sense if you think of virtual machines than of multimedia devices.

I would suggest that you just use a single iommu_domain globally for
all in-kernel users.
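
As a rough userspace sketch of that suggestion, several client devices can
point at one shared domain, so a mapping created once is visible to all of
them. The types and helpers (struct toy_domain, toy_map(), toy_attach(),
toy_translate()) are hypothetical and only model the idea; they are not the
iommu.h API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MAPPINGS 8

/* One iova -> phys range. */
struct toy_mapping {
	uint32_t iova;
	uint32_t phys;
	uint32_t size;
};

/* A domain is just a set of mappings. */
struct toy_domain {
	size_t count;
	struct toy_mapping map[MAX_MAPPINGS];
};

/* A client device only remembers which domain it is attached to. */
struct toy_client {
	const char *name;
	struct toy_domain *domain;
};

/* Add a mapping to the domain; returns 0 on success, -1 if it is full. */
static int toy_map(struct toy_domain *d, uint32_t iova, uint32_t phys,
		   uint32_t size)
{
	assert(d != NULL && size > 0);
	if (d->count == MAX_MAPPINGS)
		return -1;
	d->map[d->count].iova = iova;
	d->map[d->count].phys = phys;
	d->map[d->count].size = size;
	d->count++;
	return 0;
}

static void toy_attach(struct toy_client *c, struct toy_domain *d)
{
	c->domain = d;
}

/* Translate an io address as seen by a client; 0 stands for a fault. */
static uint32_t toy_translate(const struct toy_client *c, uint32_t iova)
{
	size_t i;

	if (!c->domain)
		return 0;
	for (i = 0; i < c->domain->count; i++) {
		const struct toy_mapping *m = &c->domain->map[i];

		if (iova >= m->iova && iova - m->iova < m->size)
			return m->phys + (iova - m->iova);
	}
	return 0;
}
```

In this model the domain belongs to the user of the mappings (here, the
kernel as a whole) rather than to any single multimedia block.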

> > > diff --git a/arch/arm/plat-samsung/include/plat/devs.h b/arch/arm/plat-
> > samsung/include/plat/devs.h
> > > index f0da6b7..0ae5dd0 100644
> > > --- a/arch/arm/plat-samsung/include/plat/devs.h
> > > +++ b/arch/arm/plat-samsung/include/plat/devs.h
> > > @@ -142,7 +142,7 @@ extern struct platform_device s5p_device_fimc3;
> > >  extern struct platform_device s5p_device_mipi_csis0;
> > >  extern struct platform_device s5p_device_mipi_csis1;
> > >
> > > -extern struct platform_device exynos4_device_sysmmu;
> > > +extern struct platform_device exynos4_device_sysmmu[];
> > 
> > Why is this a global variable? I would expect this to be private to the
> > implementation.
> 
> To allow each board to register only particular instances of sysmmu controllers.

That sounds like an unnecessarily complicated way of doing it. This would
be another reason to not make each one a device, but have something else
in struct device take care of it.

	Arnd

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
@ 2011-04-19 12:49         ` Arnd Bergmann
  0 siblings, 0 replies; 64+ messages in thread
From: Arnd Bergmann @ 2011-04-19 12:49 UTC (permalink / raw)
  To: linux-arm-kernel

(adding Joerg to Cc)

On Tuesday 19 April 2011, Marek Szyprowski wrote:

> > These look wrong for a number of reasons:
> > 
> > * try_module_get(THIS_MODULE) makes no sense at all, the idea of the
> >   try_module_get is to pin down another module that was calling down,
> >   which I suppose is not needed here.
> > 
> > * This extends the generic IOMMU API in platform specific ways, don't
> >   do that.
> > 
> > * I think you can do without these functions by including a pointer
> >   to the iommu structure in dev_archdata, see
> >   arch/powerpc/include/asm/device.h for an example.
> 
> We heavily based our solution on the iommu implementation found in 
> arch/arm/mach-msm/{devices-iommu,iommu,iommu_dev}.c
> 
> The s5p_sysmmu_get/put functions are equivalent to msm_iommu_{get,put}_ctx.
> 
> (snipped)

Yes, I'm sorry about this. I commented on the early versions of the MSM
driver, but then did not do another review of the version that actually
got merged. That should also be fixed; ideally we can come up with a
way that works for both drivers.

> > Why even provide these when they don't do anything?
> 
> Because they are required by pm_runtime. If no runtime_{suspend,resume}
> methods are provided, the pm_runtime core will not call proper methods
> on parent device for pm_runtime_{get,put}_sync(). The parent device for
> each sysmmu platform device is the power domain the sysmmu belongs to.
> 
> I know this is crazy, but this is the only way it can be handled now
> with runtime_pm.

Please don't try to work around kernel features when they don't fit
what you are doing. The intent of the way that runtime_pm works is
to make life easier for driver writers, not harder ;-)

I can see three ways that would be better solutions:

1. change the runtime_pm subsystem to allow it to ignore some devices
in an easy way.

2. change the device layout of the sysmmu. If the iommu device is
a child of the device that it is responsible for, I guess you don't
have this problem.

3. Not represent the iommu as a device at all, just as a property
of another device.

> > When you register the iommu unconditionally, it becomes impossible for
> > this driver to coexist with other iommu drivers in the same kernel,
> > which goes against the concept of having a platform driver for this.
> 
> > It might be better to call the s5p_sysmmu_register function from
> > the board files and have no platform devices at all if each IOMMU
> > is always bound to a specific device anyway.
> 
> Ok, it looks I don't fully get how this iommu.h should be used. It looks
> that there can be only one instance of iommu ops registered in the system,
> so only one iommu driver can be activated. You are right that the iommu
> driver has to be registered on first probe().

That is a limitation of the current implementation. We might want to
change that anyway, e.g. to handle the mali IOMMU along with yours.
I believe the reason for allowing only one IOMMU type so far has been
that nobody required more than one. As I mentioned, the IOMMU API is
rather new and has not been ported to a wide variety of hardware, unlike
the dma-mapping API, which does support multiple different IOMMUs
in a single system.

> I think it might be beneficial to describe a bit more our hardware 
> (Exynos4 platform). There are a number of multimedia blocks. Each has its
> own IOMMU controller. Each IOMMU controller has its own set of hardware
> registers and irq. There is also a GPU unit (Mali) which has its own
> IOMMU hardware, incompatible with the SYSMMU, so right now it is ignored.
> 
> The multimedia blocks are modeled as platform devices and are independent
> of platform type (same multimedia blocks can be found on other Samsung
> machines, like for example s5pv210/s5pc110), see arch/arm/plat-s5p/dev-*.c
> and arch/arm/plat-samsung/dev-*.c.
> 
> Platform driver data defined in the above files are registered by each
> board startup code, usually by platform_add_devices(), for more details
> please check arch/arm/mach-s5pv210/mach-goni.c. There is
> struct platform_device *goni_devices[] array which get registered in the
> last line in goni_machine_init() function.
> 
> For IOMMU controllers on Exynos4 we created an array of platform devices:
> extern struct platform_device exynos4_device_sysmmu[];
> 
> Now the board startup code registers only these sysmmu controllers
> (instances) that are required on the particular board. See "[PATCH 7/7]
> ARM: EXYNOS4: enable FIMC on Universal_C210":
> @@ -613,6 +616,15 @@ static struct platform_device *universal_devices[]
> __initdata = {
>         &s3c_device_hsmmc2,
>         &s3c_device_hsmmc3,
>         &s3c_device_i2c5,
> +       &s5p_device_fimc0,
> +       &s5p_device_fimc1,
> +       &s5p_device_fimc2,
> +       &s5p_device_fimc3,
> +       &exynos4_device_pd[PD_CAM],
> +       &exynos4_device_sysmmu[S5P_SYSMMU_FIMC0],
> +       &exynos4_device_sysmmu[S5P_SYSMMU_FIMC1],
> +       &exynos4_device_sysmmu[S5P_SYSMMU_FIMC2],
> +       &exynos4_device_sysmmu[S5P_SYSMMU_FIMC3],
> 
> We need to map the above structure into linux/iommu.h api.

Thanks for the background information.

> The domain defined in iommu api are quite straightforward. Each domain 
> is just a set of mappings between physical addresses (phys) and io addresses
> (iova).
> 
> For the drivers the most important are the following functions:
> iommu_{attach,detach}_device(struct iommu_domain *domain, struct device *dev);
> 
> We assumed that they just assign the domain (mapping) to particular instance
> of iommu. However the driver need to get somehow the pointer to the iommu 
> instance. That's why we added the s5p_sysmmu_{get,put} functions. 
> 
> Now I see that you want to make the clients (drivers) to provide their own
> struct device pointer to the iommu_{attach,detach}_device() function instead of
> giving there a pointer to iommu device. Am I right? We will need some kind of
> mapping between multimedia devices and particular instanced of sysmmu
> controllers.
> 
> There will be also some problems with such approach. Mainly we have a
> multimedia codec module, which have 2 memory controllers (for faster transfers)
> and 2 iommu controllers. How can we handle such case?

It's not quite how the domains are meant to be used. In the AMD IOMMU
that the API is based on, any number of devices can share one domain,
and devices might be able to have mappings in multiple domains.

The domain really reflects the user, not the device here, which makes more
sense if you think of virtual machines than of multimedia devices.

I would suggest that you just use a single iommu_domain globally for
all in-kernel users.

> > > diff --git a/arch/arm/plat-samsung/include/plat/devs.h b/arch/arm/plat-
> > samsung/include/plat/devs.h
> > > index f0da6b7..0ae5dd0 100644
> > > --- a/arch/arm/plat-samsung/include/plat/devs.h
> > > +++ b/arch/arm/plat-samsung/include/plat/devs.h
> > > @@ -142,7 +142,7 @@ extern struct platform_device s5p_device_fimc3;
> > >  extern struct platform_device s5p_device_mipi_csis0;
> > >  extern struct platform_device s5p_device_mipi_csis1;
> > >
> > > -extern struct platform_device exynos4_device_sysmmu;
> > > +extern struct platform_device exynos4_device_sysmmu[];
> > 
> > Why is this a global variable? I would expect this to be private to the
> > implementation.
> 
> To allow each board to register only particular instances of sysmmu controllers.

That sounds like an unnecessarily complicated way of doing it. This would
be another reason to not make each one a device, but have something else
in struct device take care of it.

	Arnd

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-19 12:49         ` Arnd Bergmann
@ 2011-04-19 13:50           ` Roedel, Joerg
  -1 siblings, 0 replies; 64+ messages in thread
From: Roedel, Joerg @ 2011-04-19 13:50 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Marek Szyprowski, linux-arm-kernel, linux-samsung-soc,
	linux-media, 'Kyungmin Park',
	Andrzej Pietrasiewicz, Sylwester Nawrocki, 'Kukjin Kim'

On Tue, Apr 19, 2011 at 08:49:50AM -0400, Arnd Bergmann wrote:
> (adding Joerg to Cc)
> 
> On Tuesday 19 April 2011, Marek Szyprowski wrote:
> 
> > > * This extends the generic IOMMU API in platform specific ways, don't
> > >   do that.

This is certainly not a good idea. Please Cc me on IOMMU-API changes in
the future so that I can have a look at them. This is also a good way to
prevent misunderstandings (of which I found some in this mail).

> > Ok, it looks I don't fully get how this iommu.h should be used. It looks
> > that there can be only one instance of iommu ops registered in the system,
> > so only one iommu driver can be activated. You are right that the iommu
> > driver has to be registered on first probe().
> 
> That is a limitation of the current implementation. We might want to
> change that anyway, e.g. to handle the mali IOMMU along with yours.
> I believe the reason for allowing only one IOMMU type so far has been
> that nobody required more than one. As I mentioned, the IOMMU API is
> rather new and has not been ported to much variety of hardware, unlike
> the dma-mapping API, which does support multiple different IOMMUs
> in a single system.

The current IOMMU-API interface is very simple. It delegates the
selection of the particular IOMMU device to the IOMMU driver. Handling
this selection above the IOMMU driver is a complex thing to do. We will
need some kind of generic IOMMU support in the device-core and
attach IOMMUs to device sub-trees.

A simpler and less intrusive solution is to implement some wrapper code
which dispatches the IOMMU-API calls to the IOMMU driver implementation
required for that device.
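
To make the wrapper idea a bit more concrete, here is a minimal
userspace sketch. It is only a model under stated assumptions: every
demo_* name is hypothetical, and the per-device ops pointer is one
possible way to pick the implementation, not something the current
IOMMU-API provides.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct demo_domain;
struct demo_device;

struct demo_iommu_ops {
	int (*attach_dev)(struct demo_domain *dom, struct demo_device *dev);
};

struct demo_domain {
	int nr_devices;		/* devices currently attached */
};

struct demo_device {
	const char *name;
	const struct demo_iommu_ops *iommu_ops;	/* NULL: no IOMMU behind it */
};

/* Call counters, so it is visible which implementation was reached. */
int demo_sysmmu_calls;
int demo_mali_calls;

static int demo_sysmmu_attach(struct demo_domain *dom, struct demo_device *dev)
{
	(void)dev;
	demo_sysmmu_calls++;
	dom->nr_devices++;
	return 0;
}

static int demo_mali_attach(struct demo_domain *dom, struct demo_device *dev)
{
	(void)dev;
	demo_mali_calls++;
	dom->nr_devices++;
	return 0;
}

const struct demo_iommu_ops demo_sysmmu_ops = { .attach_dev = demo_sysmmu_attach };
const struct demo_iommu_ops demo_mali_ops = { .attach_dev = demo_mali_attach };

/* The wrapper: one entry point, dispatched to the driver for this device. */
int demo_iommu_attach_device(struct demo_domain *dom, struct demo_device *dev)
{
	assert(dom && dev);
	if (!dev->iommu_ops)
		return -ENODEV;
	return dev->iommu_ops->attach_dev(dom, dev);
}
```

Callers only ever see demo_iommu_attach_device(); which driver handles
the call is decided per device, so two incompatible IOMMU types could
coexist without the callers noticing.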

> > I think it might be beneficial to describe a bit more our hardware 
> > (Exynos4 platform). There are a number of multimedia blocks. Each has it's
> > own IOMMU controller. Each IOMMU controller has his own set of hardware
> > registers and irq. There is also a GPU unit (Mali) which has it's own
> > IOMMU hardware, incompatible with the SYSMMU, so right now it is ignored.
> > 
> > The multimedia blocks are modeled as platform devices and are independent
> > of platform type (same multimedia blocks can be found on other Samsung
> > machines, like for example s5pv210/s5pc110), see arch/arm/plat-s5p/dev-*.c
> > and arch/arm/plat-samsung/dev-*.c.

Question: Does every platform device have a different type of IOMMU? Or
are the IOMMUs on all of these platform devices similar enough to be
handled by a single driver?

> > The domain defined in iommu api are quite straightforward. Each domain 
> > is just a set of mappings between physical addresses (phys) and io addresses
> > (iova).

Each domain represents an address space. In the AMD IOMMU this is
basically a page-table.

> > For the drivers the most important are the following functions:
> > iommu_{attach,detach}_device(struct iommu_domain *domain, struct device *dev);

Right, and each driver can allocate its own domains.

> > We assumed that they just assign the domain (mapping) to particular instance
> > of iommu. However the driver need to get somehow the pointer to the iommu 
> > instance. That's why we added the s5p_sysmmu_{get,put} functions.

The functions attach a specific device to an IOMMU domain (== address
space). These devices might be behind different IOMMUs and it is the
responsibility of the IOMMU driver to set up everything correctly.

> It's not quite how the domains are meant to be used. In the AMD IOMMU
> that the API is based on, any number of devices can share one domain,
> and devices might be able to have mappings in multiple domains.

Yes, any number of devices can be assigned to one domain, but each
device only belongs to one domain at each point in time. But it is
possible to detach a device from one domain and attach it to another.
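
As a rough illustration of that rule, the following userspace model
tracks the current domain in the device itself. The demo_* names are
made up for this sketch, and returning -EBUSY for an already attached
device is an assumption of the model, not a statement about what any
particular IOMMU driver returns.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model: a domain is one address space shared by N devices. */
struct demo_domain {
	int nr_devices;
};

struct demo_device {
	struct demo_domain *domain;	/* at most one domain at a time */
};

int demo_attach(struct demo_domain *dom, struct demo_device *dev)
{
	assert(dom && dev);
	if (dev->domain)
		return -EBUSY;		/* detach from the old domain first */
	dev->domain = dom;
	dom->nr_devices++;
	return 0;
}

void demo_detach(struct demo_domain *dom, struct demo_device *dev)
{
	assert(dom && dev);
	if (dev->domain != dom)
		return;			/* not attached to this domain */
	dev->domain = NULL;
	dom->nr_devices--;
}
```

Two devices can be attached to the same demo_domain, but moving a
device to another domain requires a demo_detach() followed by a new
demo_attach().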

Regards,

	Joerg

-- 
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632


^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-19 12:49         ` Arnd Bergmann
@ 2011-04-19 14:03           ` Marek Szyprowski
  -1 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-19 14:03 UTC (permalink / raw)
  To: 'Arnd Bergmann', 'Joerg Roedel'
  Cc: linux-samsung-soc, 'Kyungmin Park', 'Kukjin Kim',
	'Sylwester Nawrocki', 'Andrzej Pietrasiewicz',
	linux-arm-kernel, linux-media

Hello,

On Tuesday, April 19, 2011 2:50 PM Arnd Bergmann wrote:

> On Tuesday 19 April 2011, Marek Szyprowski wrote:
> 
> > > These look wrong for a number of reasons:
> > >
> > > * try_module_get(THIS_MODULE) makes no sense at all, the idea of the
> > >   try_module_get is to pin down another module that was calling down,
> > >   which I suppose is not needed here.
> > >
> > > * This extends the generic IOMMU API in platform specific ways, don't
> > >   do that.
> > >
> > > * I think you can do without these functions by including a pointer
> > >   to the iommu structure in dev_archdata, see
> > >   arch/powerpc/include/asm/device.h for an example.
> >
> > We heavily based our solution on the iommu implementation found in
> > arch/arm/mach-msm/{devices-iommu,iommu,iommu_dev}.c
> >
> > The s5p_sysmmu_get/put functions are equivalent for
> msm_iommu_{get,put}_ctx.
> >
> > (snipped)
> 
> Yes, I'm sorry about this. I commented on the early versions of the MSM
> driver, but then did not do another review of the version that actually
> got merged. That should also be fixed, ideally we can come up with a
> way that works for both drivers.

OK, so it looks like we misunderstood the IOMMU API by basing our work on
one particular implementation.

> > > Why even provide these when they don't do anything?
> >
> > Because they are required by pm_runtime. If no runtime_{suspend,resume}
> > methods are provided, the pm_runtime core will not call proper methods
> > on parent device for pmruntime_{get,put}_sync(). The parent device for
> > each sysmmu platform device is the power domain the sysmmu belongs to.
> >
> > I know this is crazy, but this is the only way it can be handled now
> > with runtime_pm.
> 
> Please don't try to work around kernel features when they don't fit
> what you are doing. The intent of the way that runtime_pm works is
> to make life easier for driver writers, not harder ;-)
> 
> I can see three ways that would be better solutions:
> 
> 1. change the runtime_pm subsystem to allow it to ignore some devices
> in an easy way.
> 
> 2. change the device layout if the sysmmu. If the iommu device is
> a child of the device that it is responsible for, I guess you don't
> have this problem.
> 
> 3. Not represent the iommu as a device at all, just as a property
> of another device.

Ok, we will handle this issue somehow. I consider this a minor issue and I
would like to focus on the IOMMU/dma-mapping APIs first.
 
> > > When you register the iommu unconditionally, it becomes impossible for
> > > this driver to coexist with other iommu drivers in the same kernel,
> > > which does against the concept of having a platform driver for this.
> >
> > > It might be better to call the s5p_sysmmu_register function from
> > > the board files and have no platform devices at all if each IOMMU
> > > is always bound to a specific device anyway.
> >
> > Ok, it looks I don't fully get how this iommu.h should be used. It looks
> > that there can be only one instance of iommu ops registered in the system,
> > so only one iommu driver can be activated. You are right that the iommu
> > driver has to be registered on first probe().
> 
> That is a limitation of the current implementation. We might want to
> change that anyway, e.g. to handle the mali IOMMU along with yours.
> I believe the reason for allowing only one IOMMU type so far has been
> that nobody required more than one. As I mentioned, the IOMMU API is
> rather new and has not been ported to much variety of hardware, unlike
> the dma-mapping API, which does support multiple different IOMMUs
> in a single system.

Ok. I understand. The IOMMU API is quite a nice abstraction of the IOMMU
chip. The dma-mapping API is something much more complex that creates the
actual mappings for various sets of devices. IMHO the right direction
will be to create a dma-mapping implementation that is just a client of
the IOMMU API. What's your opinion?
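
A minimal userspace sketch of that layering, just to show the split of
responsibilities: the lower level only knows how to enter a translation
into one address space, and the upper level chooses the IO addresses and
uses nothing but the lower-level calls. All demo_* names, the 16-page
address space and the bump allocator are assumptions made for the
example; this is not the kernel's dma-mapping or IOMMU code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT	12
#define DEMO_PAGE_SIZE	(1u << DEMO_PAGE_SHIFT)
#define DEMO_NR_PAGES	16u		/* tiny IO address space: 64 KiB */
#define DEMO_DMA_ERROR	UINT32_MAX

/* "IOMMU API" level: one address space, i.e. one page table. */
struct demo_domain {
	uint32_t pte[DEMO_NR_PAGES];	/* physical page base per IO page */
	uint8_t valid[DEMO_NR_PAGES];
};

int demo_iommu_map(struct demo_domain *dom, uint32_t iova, uint32_t paddr)
{
	uint32_t idx = iova >> DEMO_PAGE_SHIFT;

	assert(dom);
	if (idx >= DEMO_NR_PAGES || dom->valid[idx])
		return -1;
	dom->pte[idx] = paddr & ~(DEMO_PAGE_SIZE - 1);
	dom->valid[idx] = 1;
	return 0;
}

uint32_t demo_iommu_iova_to_phys(const struct demo_domain *dom, uint32_t iova)
{
	uint32_t idx = iova >> DEMO_PAGE_SHIFT;

	if (idx >= DEMO_NR_PAGES || !dom->valid[idx])
		return DEMO_DMA_ERROR;
	return dom->pte[idx] | (iova & (DEMO_PAGE_SIZE - 1));
}

/* "dma-mapping" level: picks IO addresses, then calls the level above. */
struct demo_dma_mapping {
	struct demo_domain *dom;
	uint32_t next_page;		/* bump allocator, never frees */
};

uint32_t demo_dma_map(struct demo_dma_mapping *m, uint32_t paddr, uint32_t len)
{
	uint32_t offset = paddr & (DEMO_PAGE_SIZE - 1);
	uint32_t first = m->next_page;
	uint32_t pages, i;

	if (len == 0 || len > DEMO_NR_PAGES * DEMO_PAGE_SIZE)
		return DEMO_DMA_ERROR;
	pages = (offset + len + DEMO_PAGE_SIZE - 1) >> DEMO_PAGE_SHIFT;
	if (pages > DEMO_NR_PAGES - first)
		return DEMO_DMA_ERROR;	/* IO address space exhausted */

	for (i = 0; i < pages; i++)
		if (demo_iommu_map(m->dom, (first + i) << DEMO_PAGE_SHIFT,
				   (paddr - offset) + (i << DEMO_PAGE_SHIFT)))
			return DEMO_DMA_ERROR;

	m->next_page = first + pages;
	return (first << DEMO_PAGE_SHIFT) | offset;	/* address for the device */
}
```

The device driver would only ever call demo_dma_map() and hand the
returned address to the hardware; the domain and its page table stay
hidden behind the mapping layer.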

(snipped)

> > The domain defined in iommu api are quite straightforward. Each domain
> > is just a set of mappings between physical addresses (phys) and io
> addresses
> > (iova).
> >
> > For the drivers the most important are the following functions:
> > iommu_{attach,detach}_device(struct iommu_domain *domain, struct device
> *dev);
> >
> > We assumed that they just assign the domain (mapping) to particular
> instance
> > of iommu. However the driver need to get somehow the pointer to the iommu
> > instance. That's why we added the s5p_sysmmu_{get,put} functions.
> >
> > Now I see that you want to make the clients (drivers) to provide their
> own
> > struct device pointer to the iommu_{attach,detach}_device() function
> instead of
> > giving there a pointer to iommu device. Am I right? We will need some
> kind of
> > mapping between multimedia devices and particular instanced of sysmmu
> > controllers.
> >
> > There will be also some problems with such approach. Mainly we have a
> > multimedia codec module, which have 2 memory controllers (for faster
> transfers)
> > and 2 iommu controllers. How can we handle such case?
> 
> It's not quite how the domains are meant to be used. In the AMD IOMMU
> that the API is based on, any number of devices can share one domain,
> and devices might be able to have mappings in multiple domains.
> 
> The domain really reflects the user, not the device here, which makes more
> sense if you think of virtual machines than of multimedia devices.
>
> I would suggest that you just use a single iommu_domain globally for
> all in-kernel users.

There are cases where having a separate mapping for each device makes sense.
It definitely increases the security and helps to find some bugs in
the drivers.

Getting back to our video codec - it has 2 IOMMU controllers. The codec
hardware is able to address only 256MiB of space. Do you have an idea how
this can be handled with dma-mapping API? The only idea that comes to my
mind is to provide a second, fake 'struct device' and use it for allocations
for the second IOMMU controller.

> > > > diff --git a/arch/arm/plat-samsung/include/plat/devs.h
> b/arch/arm/plat-
> > > samsung/include/plat/devs.h
> > > > index f0da6b7..0ae5dd0 100644
> > > > --- a/arch/arm/plat-samsung/include/plat/devs.h
> > > > +++ b/arch/arm/plat-samsung/include/plat/devs.h
> > > > @@ -142,7 +142,7 @@ extern struct platform_device s5p_device_fimc3;
> > > >  extern struct platform_device s5p_device_mipi_csis0;
> > > >  extern struct platform_device s5p_device_mipi_csis1;
> > > >
> > > > -extern struct platform_device exynos4_device_sysmmu;
> > > > +extern struct platform_device exynos4_device_sysmmu[];
> > >
> > > Why is this a global variable? I would expect this to be private to the
> > > implementation.
> >
> > To allow each board to register only particular instances of sysmmu
> controllers.
> 
> That sounds like an unnecessarily complicated way of doing it. This would
> be another reason to not make each one a device, but have something else
> in struct device take care of it.

Well, having each iommu instantiated as a separate device has some advantages,
but I agree that having it automatically registered together with the
corresponding multimedia (client) device will simplify a lot of things. Same
rules should imho apply to power domain drivers.

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-19 13:50           ` Roedel, Joerg
@ 2011-04-19 14:28             ` Arnd Bergmann
  -1 siblings, 0 replies; 64+ messages in thread
From: Arnd Bergmann @ 2011-04-19 14:28 UTC (permalink / raw)
  To: Roedel, Joerg
  Cc: Marek Szyprowski, linux-arm-kernel, linux-samsung-soc,
	linux-media, 'Kyungmin Park',
	Andrzej Pietrasiewicz, Sylwester Nawrocki, 'Kukjin Kim'

On Tuesday 19 April 2011, Roedel, Joerg wrote:
> On Tue, Apr 19, 2011 at 08:49:50AM -0400, Arnd Bergmann wrote:
> > > Ok, it looks I don't fully get how this iommu.h should be used. It looks
> > > that there can be only one instance of iommu ops registered in the system,
> > > so only one iommu driver can be activated. You are right that the iommu
> > > driver has to be registered on first probe().
> > 
> > That is a limitation of the current implementation. We might want to
> > change that anyway, e.g. to handle the mali IOMMU along with yours.
> > I believe the reason for allowing only one IOMMU type so far has been
> > that nobody required more than one. As I mentioned, the IOMMU API is
> > rather new and has not been ported to much variety of hardware, unlike
> > the dma-mapping API, which does support multiple different IOMMUs
> > in a single system.
> 
> The current IOMMU-API interface is very simple. It delegates the
> selection of the particular IOMMU device to the IOMMU driver. Handle
> this selection above the IOMMU driver is a complex thing to do. We will
> need some kind of generic IOMMU support in the device-core and
> attach IOMMUs to device sub-trees.
> 
> A simpler and less intrusive solution is to implement some wrapper code
> which dispatches the IOMMU-API calls to the IOMMU driver implementation
> required for that device.

Right. We already do that for the dma-mapping API on some architectures,
and I suppose we can consolidate the mechanism here, possibly into
something that ends up in the common struct device rather than in
the archdata.

> > > I think it might be beneficial to describe a bit more our hardware 
> > > (Exynos4 platform). There are a number of multimedia blocks. Each has it's
> > > own IOMMU controller. Each IOMMU controller has his own set of hardware
> > > registers and irq. There is also a GPU unit (Mali) which has it's own
> > > IOMMU hardware, incompatible with the SYSMMU, so right now it is ignored.
> > > 
> > > The multimedia blocks are modeled as platform devices and are independent
> > > of platform type (same multimedia blocks can be found on other Samsung
> > > machines, like for example s5pv210/s5pc110), see arch/arm/plat-s5p/dev-*.c
> > > and arch/arm/plat-samsung/dev-*.c.
> 
> Question: Does every platform device has a different type of IOMMU? Or
> are the IOMMUs on all of these platform devices similar enough to be
> handled by a single driver?

As Marek explained in the thread before you got on Cc, they are all the
same, except for the graphics core (Mali), which has a different one that
is currently ignored.

> > > For the drivers the most important are the following functions:
> > > iommu_{attach,detach}_device(struct iommu_domain *domain, struct device *dev);
> 
> Right, and each driver can allocate its own domains.

For the cases that use the normal dma-mapping API, I guess there only
needs to be one domain to cover the kernel, which can then be hidden
in the driver that provides the dma_map_ops based on an iommu_ops.

> > It's not quite how the domains are meant to be used. In the AMD IOMMU
> > that the API is based on, any number of devices can share one domain,
> > and devices might be able to have mappings in multiple domains.
> 
> Yes, any number of devices can be assigned to one domain, but each
> device only belongs to one domain at each point in time. But it is
> possible to detach a device from one domain and attach it to another.

I was thinking of the SR-IOV case, where a single hardware device is
represented as multiple logical devices. As far as I understand, each
logical device can only belong to one domain, but they don't all have to
be the same.

	Arnd

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-19 14:03           ` Marek Szyprowski
@ 2011-04-19 14:29             ` Arnd Bergmann
  -1 siblings, 0 replies; 64+ messages in thread
From: Arnd Bergmann @ 2011-04-19 14:29 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: 'Joerg Roedel',
	linux-samsung-soc, 'Kyungmin Park', 'Kukjin Kim',
	'Sylwester Nawrocki', 'Andrzej Pietrasiewicz',
	linux-arm-kernel, linux-media

On Tuesday 19 April 2011, Marek Szyprowski wrote:

> > 
> > 1. change the runtime_pm subsystem to allow it to ignore some devices
> > in an easy way.
> > 
> > 2. change the device layout if the sysmmu. If the iommu device is
> > a child of the device that it is responsible for, I guess you don't
> > have this problem.
> > 
> > 3. Not represent the iommu as a device at all, just as a property
> > of another device.
> 
> Ok, we will handle this issue somehow. I consider this a minor issue and I
> would like to focus on the IOMMU/dma-mapping APIs first.

Yes, agreed.

> > That is a limitation of the current implementation. We might want to
> > change that anyway, e.g. to handle the mali IOMMU along with yours.
> > I believe the reason for allowing only one IOMMU type so far has been
> > that nobody required more than one. As I mentioned, the IOMMU API is
> > rather new and has not been ported to much variety of hardware, unlike
> > the dma-mapping API, which does support multiple different IOMMUs
> > in a single system.
> 
> Ok. I understand. IOMMU API is quite nice abstraction of the IOMMU chip.
> dma-mapping API is something much more complex that creates the actual
> mapping for various sets of the devices. IMHO the right direction will
> be to create dma-mapping implementation that will be just a client of
> the IOMMU API. What's your opinion?
 
Sounds good. I think we should put it into a new drivers/iommu, along
with your specific iommu implementation, and then we can convert the
existing ones over to use that.

Note that this also requires using dma-mapping-common.h, which we currently
don't use on ARM.

> > The domain really reflects the user, not the device here, which makes more
> > sense if you think of virtual machines than of multimedia devices.
> >
> > I would suggest that you just use a single iommu_domain globally for
> > all in-kernel users.
> 
> There are cases where having a separate mapping for each device makes sense.
> It definitely increases the security and helps to find some bugs in
> the drivers.
> 
> Getting back to our video codec - it has 2 IOMMU controllers. The codec
> hardware is able to address only 256MiB of space. Do you have an idea how
> this can be handled with dma-mapping API? The only idea that comes to my
> mind is to provide a second, fake 'struct device' and use it for allocations
> for the second IOMMU controller.

Good question. 

How do you even decide which controller to use from the driver?
I would need to understand better what you are trying to do to
give a good recommendation.

	Arnd

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-19 14:28             ` Arnd Bergmann
@ 2011-04-19 14:51               ` Roedel, Joerg
  -1 siblings, 0 replies; 64+ messages in thread
From: Roedel, Joerg @ 2011-04-19 14:51 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Marek Szyprowski, linux-arm-kernel, linux-samsung-soc,
	linux-media, 'Kyungmin Park',
	Andrzej Pietrasiewicz, Sylwester Nawrocki, 'Kukjin Kim'

On Tue, Apr 19, 2011 at 10:28:39AM -0400, Arnd Bergmann wrote:
> On Tuesday 19 April 2011, Roedel, Joerg wrote:
> > On Tue, Apr 19, 2011 at 08:49:50AM -0400, Arnd Bergmann wrote:
> > > > Ok, it looks I don't fully get how this iommu.h should be used. It looks
> > > > that there can be only one instance of iommu ops registered in the system,
> > > > so only one iommu driver can be activated. You are right that the iommu
> > > > driver has to be registered on first probe().
> > > 
> > > That is a limitation of the current implementation. We might want to
> > > change that anyway, e.g. to handle the mali IOMMU along with yours.
> > > I believe the reason for allowing only one IOMMU type so far has been
> > > that nobody required more than one. As I mentioned, the IOMMU API is
> > > rather new and has not been ported to much variety of hardware, unlike
> > > the dma-mapping API, which does support multiple different IOMMUs
> > > in a single system.
> > 
> > The current IOMMU-API interface is very simple. It delegates the
> > selection of the particular IOMMU device to the IOMMU driver. Handle
> > this selection above the IOMMU driver is a complex thing to do. We will
> > need some kind of generic IOMMU support in the device-core and
> > attach IOMMUs to device sub-trees.
> > 
> > A simpler and less intrusive solution is to implement some wrapper code
> > which dispatches the IOMMU-API calls to the IOMMU driver implementation
> > required for that device.
> 
> Right. We already do that for the dma-mapping API on some architectures,
> and I suppose we can consolidate the mechanism here, possibly into
> something that ends up in the common struct device rather than in
> the archdata.

The struct device solution is very much what I meant by adding this into
the device-core code :)

> > Question: Does every platform device has a different type of IOMMU? Or
> > are the IOMMUs on all of these platform devices similar enough to be
> > handled by a single driver?
> 
> As Marek explained in the thread before you got on Cc, they are all the
> same, except for the graphics core (Mali) that has a different one but
> currently disables that.

Then it is no problem at all. The IOMMU driver can find out itself which
IOMMU needs to be used for which device. The x86 implementations already
do this.

> > > > For the drivers the most important are the following functions:
> > > > iommu_{attach,detach}_device(struct iommu_domain *domain, struct device *dev);
> > 
> > Right, and each driver can allocate its own domains.
> 
> For the cases that use the normal dma-mapping API, I guess there only
> needs to be one domain to cover the kernel, which can then be hidden
> in the driver provides the dma_map_ops based on an iommu_ops.

Yes, for dma-api usage one domain is sufficient. But using one domain
for each device has benefits too. It reduces lock-contention on the
domain side and also increases security by isolating the devices from
each other.

> > > It's not quite how the domains are meant to be used. In the AMD IOMMU
> > > that the API is based on, any number of devices can share one domain,
> > > and devices might be able to have mappings in multiple domains.
> > 
> > Yes, any number of devices can be assigned to one domain, but each
> > device only belongs to one domain at each point in time. But it is
> > possible to detach a device from one domain and attach it to another.
> 
> I was thinking of the SR-IOV case, where a single hardware device is
> represented as multiple logical devices. As far as I understand, each
> logical devices can only belong to one domain, but they don't all have to
> be the same.

Well, right, the IOMMU-API makes no distinction between PF and VF. Each
function is just a pci_dev which can be independently assigned to a domain.
So if 'device' means a physical card with virtual functions then yes, a
device can be attached to multiple domains, one domain per VF.
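
A small userspace model of these attach rules, as a sketch only. The types
and functions (domain_sketch, function_sketch, attach_sketch(),
detach_sketch()) are hypothetical, and returning -EBUSY for a second attach
is an illustrative choice, not a statement about what any IOMMU driver does.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of an IOMMU domain (one address space). */
struct domain_sketch {
	int nr_functions;	/* how many functions are attached */
};

/* Hypothetical model of one PCI function; PF and VF look the same here. */
struct function_sketch {
	struct domain_sketch *domain;	/* NULL while detached */
};

/* A function belongs to at most one domain at any point in time. */
int attach_sketch(struct domain_sketch *dom, struct function_sketch *fn)
{
	if (fn->domain)
		return -EBUSY;	/* must be detached first */
	fn->domain = dom;
	dom->nr_functions++;
	return 0;
}

/* Detaching makes the function free to join another domain. */
void detach_sketch(struct function_sketch *fn)
{
	if (fn->domain) {
		fn->domain->nr_functions--;
		fn->domain = NULL;
	}
}
```

Two VFs of the same card can sit in two different domains, while any number
of functions may share one domain; moving a function is a detach followed by
an attach.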

	Joerg

-- 
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-19 14:03           ` Marek Szyprowski
@ 2011-04-19 15:00             ` Roedel, Joerg
  -1 siblings, 0 replies; 64+ messages in thread
From: Roedel, Joerg @ 2011-04-19 15:00 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: 'Arnd Bergmann',
	linux-samsung-soc, 'Kyungmin Park', 'Kukjin Kim',
	'Sylwester Nawrocki', 'Andrzej Pietrasiewicz',
	linux-arm-kernel, linux-media

On Tue, Apr 19, 2011 at 10:03:27AM -0400, Marek Szyprowski wrote:

> Ok. I understand. IOMMU API is quite nice abstraction of the IOMMU chip.
> dma-mapping API is something much more complex that creates the actual
> mapping for various sets of the devices. IMHO the right direction will
> be to create dma-mapping implementation that will be just a client of
> the IOMMU API. What's your opinion?

Definitely agreed. I have been planning this for some time but never found
the time to implement it. In the end we can have a generic dma-ops
implementation that works for all iommu-api implementations.
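
To make the layering concrete, here is a rough userspace sketch of a
dma-mapping routine that is only a client of an IOMMU map operation. It is
not kernel code: the domain structure, the bump allocator for device
addresses, and all *_sketch names are assumptions made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096u
#define SKETCH_MAX_PAGES	64u
#define SKETCH_MAPPING_ERROR	UINT64_MAX

/* Hypothetical IOMMU domain: a flat table indexed by device page number. */
struct iommu_domain_sketch {
	uint64_t phys[SKETCH_MAX_PAGES];
	uint8_t valid[SKETCH_MAX_PAGES];
	uint32_t next_free;	/* trivial bump allocator for device pages */
};

/* Stand-in for the IOMMU driver's map operation (one 4KiB page). */
static int iommu_map_sketch(struct iommu_domain_sketch *dom,
			    uint32_t iova_pfn, uint64_t paddr)
{
	if (iova_pfn >= SKETCH_MAX_PAGES || dom->valid[iova_pfn])
		return -1;
	dom->phys[iova_pfn] = paddr;
	dom->valid[iova_pfn] = 1;
	return 0;
}

/*
 * Generic "dma map": reserve a contiguous device address range, then ask
 * the IOMMU layer to map every (possibly scattered) physical page into it.
 * Returns the device address, or SKETCH_MAPPING_ERROR on failure.
 */
uint64_t dma_map_pages_sketch(struct iommu_domain_sketch *dom,
			      const uint64_t *pages, uint32_t n)
{
	uint32_t base = dom->next_free;
	uint32_t i;

	if (n == 0 || n > SKETCH_MAX_PAGES - base)
		return SKETCH_MAPPING_ERROR;

	for (i = 0; i < n; i++)
		if (iommu_map_sketch(dom, base + i, pages[i]))
			return SKETCH_MAPPING_ERROR;

	dom->next_free = base + n;
	return (uint64_t)base * SKETCH_PAGE_SIZE;
}

/* Translate a device address back to a physical address. */
uint64_t iova_to_phys_sketch(const struct iommu_domain_sketch *dom,
			     uint64_t iova)
{
	uint64_t pfn = iova / SKETCH_PAGE_SIZE;

	if (pfn >= SKETCH_MAX_PAGES || !dom->valid[pfn])
		return SKETCH_MAPPING_ERROR;
	return dom->phys[pfn] + iova % SKETCH_PAGE_SIZE;
}
```

The dma-mapping side only decides where in the device address space a
buffer goes; everything hardware-specific stays behind the map operation,
which is why one such implementation could serve every IOMMU driver.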

> Getting back to our video codec - it has 2 IOMMU controllers. The codec
> hardware is able to address only 256MiB of space. Do you have an idea how
> this can be handled with dma-mapping API? The only idea that comes to my
> mind is to provide a second, fake 'struct device' and use it for allocations
> for the second IOMMU controller.

The GPU IOMMUs can probably be handled in the GPU driver if they are
that different. Recent PCIe GPUs on x86 have their own IOMMUs too which
are very device specific and are handled in the device driver.

Regards,

	Joerg

-- 
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-19 15:00             ` Roedel, Joerg
@ 2011-04-19 15:37               ` Arnd Bergmann
  -1 siblings, 0 replies; 64+ messages in thread
From: Arnd Bergmann @ 2011-04-19 15:37 UTC (permalink / raw)
  To: Roedel, Joerg
  Cc: Marek Szyprowski, linux-samsung-soc, 'Kyungmin Park',
	'Kukjin Kim', 'Sylwester Nawrocki',
	'Andrzej Pietrasiewicz',
	linux-arm-kernel, linux-media

On Tuesday 19 April 2011, Roedel, Joerg wrote:
> > Getting back to our video codec - it has 2 IOMMU controllers. The codec
> > hardware is able to address only 256MiB of space. Do you have an idea how
> > this can be handled with dma-mapping API? The only idea that comes to my
> > mind is to provide a second, fake 'struct device' and use it for allocations
> > for the second IOMMU controller.
> 
> The GPU IOMMUs can probably be handled in the GPU driver if they are
> that different. Recent PCIe GPUs on x86 have their own IOMMUs too which
> are very device specific and are handled in the device driver.

I tend to disagree with this one, and would suggest that the GPUs should
actually provide their own iommu_ops, even if they are the only users
of these.

However, this is a minor point that we don't need to worry about today.

	Arnd

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-19 14:29             ` Arnd Bergmann
@ 2011-04-20 14:55               ` Marek Szyprowski
  -1 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-20 14:55 UTC (permalink / raw)
  To: 'Arnd Bergmann'
  Cc: 'Joerg Roedel',
	linux-samsung-soc, 'Kyungmin Park', 'Kukjin Kim',
	Sylwester Nawrocki, Andrzej Pietrasiewicz, linux-arm-kernel,
	linux-media

Hello,

On Tuesday, April 19, 2011 4:30 PM Arnd Bergmann wrote:

> > > That is a limitation of the current implementation. We might want to
> > > change that anyway, e.g. to handle the mali IOMMU along with yours.
> > > I believe the reason for allowing only one IOMMU type so far has been
> > > that nobody required more than one. As I mentioned, the IOMMU API is
> > > rather new and has not been ported to much variety of hardware, unlike
> > > the dma-mapping API, which does support multiple different IOMMUs
> > > in a single system.
> >
> > Ok. I understand. IOMMU API is quite nice abstraction of the IOMMU chip.
> > dma-mapping API is something much more complex that creates the actual
> > mapping for various sets of the devices. IMHO the right direction will
> > be to create dma-mapping implementation that will be just a client of
> > the IOMMU API. What's your opinion?
> 
> Sounds good. I think we should put it into a new drivers/iommu, along
> with your specific iommu implementation, and then we can convert the
> existing ones over to use that.

I see, this sounds quite reasonable. I think I finally got how this should
be implemented. 

The only question is how a device can allocate a buffer that will be most
convenient for IOMMU mapping (i.e. will require the fewest entries to map)?

IOMMU can create a contiguous mapping for ANY set of pages, but it performs
much better if the pages are grouped into 64KiB or 1MiB areas.
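
A back-of-the-envelope sketch of why the grouping matters. It assumes the
three usable page sizes are 4KiB, 64KiB and 1MiB, and that a larger page can
only be used when both the device address and the physical address are
aligned to it; that alignment rule is an assumption of this example, not
something taken from SYSMMU documentation. count_mappings_sketch() is a
hypothetical helper that counts translations, not page-table slots.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed page sizes, largest first: 1MiB, 64KiB, 4KiB. */
static const uint32_t sketch_page_sizes[] = { 1u << 20, 64u << 10, 4u << 10 };

/*
 * Count the mappings needed for one physically contiguous chunk, greedily
 * picking the largest page size that fits the remaining length and the
 * alignment of both addresses. Returns 0 if the chunk is empty or if
 * addresses or size are not 4KiB aligned.
 */
uint32_t count_mappings_sketch(uint32_t iova, uint32_t phys, uint32_t size)
{
	const size_t nr_sizes =
		sizeof(sketch_page_sizes) / sizeof(sketch_page_sizes[0]);
	uint32_t count = 0;

	while (size) {
		uint32_t chosen = 0;
		size_t i;

		for (i = 0; i < nr_sizes; i++) {
			uint32_t pgsize = sketch_page_sizes[i];

			if (size >= pgsize &&
			    ((iova | phys) & (pgsize - 1)) == 0) {
				chosen = pgsize;
				break;
			}
		}
		if (!chosen)
			return 0;	/* not mappable with 4KiB granularity */

		iova += chosen;
		phys += chosen;
		size -= chosen;
		count++;
	}
	return count;
}
```

For a 1MiB buffer this gives 1 mapping when the memory is 1MiB aligned, 16
when it is only 64KiB aligned, and 256 when it is only 4KiB aligned, which is
the difference an allocator that groups pages can make.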

Can a device allocate a buffer without mapping it into kernel space?

The problem that is still left to be resolved is that
dma_alloc_coherent() should also be able to use the IOMMU. This would however
trigger the problem of double mappings with different cache attributes:
the dma api might need to create coherent (== non-cached) mappings, while
all low-memory is still mapped with (super)sections as cached, which is
against the ARM CPU specification and might cause unpredictable behavior,
especially on CPUs that do speculative prefetch. Right now this problem
has been ignored in the dma-mapping implementation, but there have been some
patches posted to resolve this by reserving some area exclusively for dma
coherent mappings:
http://thread.gmane.org/gmane.linux.ports.arm.kernel/100822/focus=100913

Right now I would like to postpone resolving this issue because the Samsung
iommu task has already become really big.

> Note that this also requires using dma-mapping-common.h, which we currently
> don't on ARM.

Yes, I noticed this, it shouldn't be much of a problem, imho.

> > > The domain really reflects the user, not the device here, which makes more
> > > sense if you think of virtual machines than of multimedia devices.
> > >
> > > I would suggest that you just use a single iommu_domain globally for
> > > all in-kernel users.
> >
> > There are cases where having a separate mapping for each device makes sense.
> > It definitely increases the security and helps to find some bugs in
> > the drivers.
> >
> > Getting back to our video codec - it has 2 IOMMU controllers. The codec
> > hardware is able to address only 256MiB of space. Do you have an idea how
> > this can be handled with dma-mapping API? The only idea that comes to my
> > mind is to provide a second, fake 'struct device' and use it for allocations
> > for the second IOMMU controller.
> 
> Good question.
> 
> How do you even decide which controller to use from the driver?
> I would need to understand better what you are trying to do to
> give a good recommendation.

Both controllers are used by the hardware depending on the buffer type.
For example, buffers with chroma video data are accessed by the first
(called 'left') memory channel, the others (with luma video data) by the
second channel (called 'right'). Each memory channel is limited to a 256MiB
address space and the best performance is achieved when buffers are
allocated in separate physical memory banks (the boards usually have 2 or
more memory banks, memory is not interleaved).
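
Expressed as a sketch in userspace C, the selection rule looks roughly like
this. The enum names, the helper functions and the idea of a per-channel
window base address are assumptions made up for illustration; only the
chroma/left, luma/right split and the 256MiB limit come from the description
above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the two buffer kinds and the two memory channels. */
enum buf_type_sketch { BUF_CHROMA, BUF_LUMA };
enum mem_channel_sketch { CHANNEL_LEFT, CHANNEL_RIGHT };

/* Each channel can address only 256MiB. */
#define CHANNEL_WINDOW_SIZE	(256u << 20)

/* Chroma buffers go through the left channel, the others through the right. */
enum mem_channel_sketch channel_for_buffer_sketch(enum buf_type_sketch type)
{
	return type == BUF_CHROMA ? CHANNEL_LEFT : CHANNEL_RIGHT;
}

/*
 * Check that [addr, addr + size) lies inside the 256MiB window starting at
 * base (the window base is an assumption of this sketch).
 */
bool fits_channel_window_sketch(uint32_t base, uint32_t addr, uint32_t size)
{
	uint32_t offset;

	if (addr < base || size > CHANNEL_WINDOW_SIZE)
		return false;
	offset = addr - base;
	return offset <= CHANNEL_WINDOW_SIZE - size;
}
```

Since the channel follows from the buffer type, whatever allocates a buffer
needs to be told the type up front so that it can pick the matching IOMMU
controller and keep the mapping inside that channel's window.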

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
@ 2011-04-20 14:55               ` Marek Szyprowski
  0 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-20 14:55 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On Tuesday, April 19, 2011 4:30 PM Arnd Bergmann wrote:

> > > That is a limitation of the current implementation. We might want to
> > > change that anyway, e.g. to handle the mali IOMMU along with yours.
> > > I believe the reason for allowing only one IOMMU type so far has been
> > > that nobody required more than one. As I mentioned, the IOMMU API is
> > > rather new and has not been ported to much variety of hardware, unlike
> > > the dma-mapping API, which does support multiple different IOMMUs
> > > in a single system.
> >
> > Ok. I understand. IOMMU API is quite nice abstraction of the IOMMU chip.
> > dma-mapping API is something much more complex that creates the actual
> > mapping for various sets of the devices. IMHO the right direction will
> > be to create dma-mapping implementation that will be just a client of
> > the IOMMU API. What's your opinion?
> 
> Sounds good. I think we should put it into a new drivers/iommu, along
> with your specific iommu implementation, and then we can convert the
> existing ones over to use that.

I see, this sounds quite reasonable. I think I finally got how this should
be implemented. 

The only question is how a device can allocate a buffer that will be most
convenient for IOMMU mapping (i.e. will require the fewest entries to map)?

IOMMU can create a contiguous mapping for ANY set of pages, but it performs
much better if the pages are grouped into 64KiB or 1MiB areas.

Can a device allocate a buffer without mapping it into kernel space?

The problem that is still left to be resolved is the fact that
dma_alloc_coherent() should also be able to use the IOMMU. This would however
trigger the problem of double mappings with different cache attributes:
the dma api might require creating coherent (== non-cached) mappings, while
all low-memory is still mapped with (super)sections as cached, which is
against the ARM CPU specification and might cause unpredictable behavior,
especially on CPUs that do speculative prefetch. Right now this problem
has been ignored in the dma-mapping implementation, but there have been some
patches posted to resolve this by reserving some area exclusively for dma
coherent mappings:
http://thread.gmane.org/gmane.linux.ports.arm.kernel/100822/focus=100913

Right now I would like to postpone resolving this issue because the Samsung
iommu task has already become really big.

> Note that this also requires using dma-mapping-common.h, which we currently
> don't on ARM.

Yes, I noticed this; it shouldn't be much of a problem, imho.

> > > The domain really reflects the user, not the device here, which makes
> > > more sense if you think of virtual machines than of multimedia devices.
> > >
> > > I would suggest that you just use a single iommu_domain globally for
> > > all in-kernel users.
> >
> > There are cases where having a separate mapping for each device makes
> > sense. It definitely increases the security and helps to find some bugs
> > in the drivers.
> >
> > Getting back to our video codec - it has 2 IOMMU controllers. The codec
> > hardware is able to address only 256MiB of space. Do you have an idea how
> > this can be handled with dma-mapping API? The only idea that comes to my
> > mind is to provide a second, fake 'struct device' and use it for
> > allocations for the second IOMMU controller.
> 
> Good question.
> 
> How do you even decide which controller to use from the driver?
> I would need to understand better what you are trying to do to
> give a good recommendation.

Both controllers are used by the hardware depending on the buffer type.
For example, buffers with chroma video data are accessed by the first memory
channel (called 'left'), the others (with luma video data) by the second
channel (called 'right'). Each memory channel is limited to 256MiB address
space and best performance is achieved when buffers are allocated in 
separate physical memory banks (the boards usually have 2 or more memory banks,
memory is not interleaved).

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-20 14:55               ` Marek Szyprowski
@ 2011-04-20 16:07                 ` Arnd Bergmann
  -1 siblings, 0 replies; 64+ messages in thread
From: Arnd Bergmann @ 2011-04-20 16:07 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: 'Joerg Roedel',
	linux-samsung-soc, 'Kyungmin Park', 'Kukjin Kim',
	Sylwester Nawrocki, Andrzej Pietrasiewicz, linux-arm-kernel,
	linux-media

On Wednesday 20 April 2011, Marek Szyprowski wrote:
> On Tuesday, April 19, 2011 4:30 PM Arnd Bergmann wrote:

> > Sounds good. I think we should put it into a new drivers/iommu, along
> > with your specific iommu implementation, and then we can convert the
> > existing ones over to use that.
> 
> I see, this sounds quite reasonable. I think I finally got how this should
> be implemented. 
> 
> The only question is how a device can allocate a buffer that will be most
> convenient for IOMMU mapping (i.e. will require least entries to map)?
> 
> IOMMU can create a contiguous mapping for ANY set of pages, but it performs
> much better if the pages are grouped into 64KiB or 1MiB areas.
> 
> Can device allocate a buffer without mapping it into kernel space?

Not today as far as I know. You can register coherent memory per device
using dma_declare_coherent_memory(), which will be used to back
dma_alloc_coherent(), but I believe it is always mapped right now.

This can of course be changed. 

> The problem that still left to be resolved is the fact the
> dma_coherent_alloc() should also be able to use IOMMU. This would however
> trigger the problem of double mappings with different cache attributes: 
> dma api might require to create coherent (==non-cached mappings), while 
> all low-memory is still mapped with (super)sections as cached, what is 
> against ARM CPU specification and might cause unpredicted behavior
> especially on CPUs that do speculative prefetch. Right now this problem
> has been ignored in dma-mappings implementation, but there have been some
> patches posted to resolve this by reserving some area exclusively for dma
> coherent mappings: 
> http://thread.gmane.org/gmane.linux.ports.arm.kernel/100822/focus=100913
> 
> Right now I would like to postpone resolving this issue because the Samsung
> iommu task already became really big.

Agreed.

> > > Getting back to our video codec - it has 2 IOMMU controllers. The codec
> > > hardware is able to address only 256MiB of space. Do you have an idea how
> > > this can be handled with dma-mapping API? The only idea that comes to my
> > > mind is to provide a second, fake 'struct device' and use it for
> > > allocations for the second IOMMU controller.
> > 
> > Good question.
> > 
> > How do you even decide which controller to use from the driver?
> > I would need to understand better what you are trying to do to
> > give a good recommendation.
> 
> Both controllers are used by the hardware depending on the buffer type.
> For example, buffers with chroma video data are accessed by first (called
> 'left') memory channel, the others (with luma video data) - by the second
> channel (called 'right'). Each memory channel is limited to 256MiB address
> space and best performance is achieved when buffers are allocated in 
> separate physical memory banks (the boards usually have 2 or more memory banks,
> memory is not interleaved).

Ok, I see. Having one device per channel as you suggested could probably
work around this, and it's at least consistent with how you'd represent
IOMMUs in the device tree. It is not ideal because it makes the video
driver more complex when it now has to deal with multiple struct device
that it binds to, but I can't think of any nicer way either.

	Arnd

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-20 16:07                 ` Arnd Bergmann
@ 2011-04-21 11:32                   ` Marek Szyprowski
  -1 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-21 11:32 UTC (permalink / raw)
  To: 'Arnd Bergmann'
  Cc: 'Joerg Roedel',
	linux-samsung-soc, 'Kyungmin Park', 'Kukjin Kim',
	Sylwester Nawrocki, Andrzej Pietrasiewicz, linux-arm-kernel,
	linux-media

Hello,

On Wednesday, April 20, 2011 6:07 PM Arnd Bergmann wrote:

> On Wednesday 20 April 2011, Marek Szyprowski wrote:
> > On Tuesday, April 19, 2011 4:30 PM Arnd Bergmann wrote:
> 
> > > Sounds good. I think we should put it into a new drivers/iommu, along
> > > with your specific iommu implementation, and then we can convert the
> > > existing ones over to use that.
> >
> > I see, this sounds quite reasonable. I think I finally got how this
> > should be implemented.
> >
> > The only question is how a device can allocate a buffer that will be most
> > convenient for IOMMU mapping (i.e. will require least entries to map)?
> >
> > IOMMU can create a contiguous mapping for ANY set of pages, but it
> > performs much better if the pages are grouped into 64KiB or 1MiB areas.
> >
> > Can device allocate a buffer without mapping it into kernel space?
> 
> Not today as far as I know. You can register coherent memory per device
> using dma_declare_coherent_memory(), which will be used to back
> dma_alloc_coherent(), but I believe it is always mapped right now.

This is not exactly what I meant.

As we have an IOMMU, the device driver can access any system memory. However,
the performance will be better if the buffer is composed of larger contiguous
parts (like 64KiB or 1MiB). I would like to avoid putting logic that manages
buffer allocation into the device drivers. It would be best if such buffers
could be allocated by a single call to the dma-mapping API.

Right now there is the dma_alloc_coherent() function, which is used by
drivers to allocate a contiguous block of memory and map it to DMA addresses.
With an IOMMU implementation it is quite easy to provide a replacement for it
that will allocate some set of pages and map them into device virtual address
space as a contiguous buffer. 
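
A toy userspace sketch of that idea, under stated assumptions: the "IOMMU"
here is just a flat array of 4KiB page entries, and toy_map_pages() and
toy_translate() are invented names, not part of the dma-mapping or IOMMU API.
It only shows how scattered physical pages can appear to the device as one
contiguous range of addresses.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT	12
#define TOY_PAGE_SIZE	(1u << TOY_PAGE_SHIFT)
#define TOY_TABLE_PAGES	1024	/* toy 4MiB device address space */

/* Hypothetical one-level page table; an entry of 0 means "unmapped". */
struct toy_iommu {
	uint64_t pte[TOY_TABLE_PAGES];
};

/*
 * Map n scattered physical pages at consecutive device addresses starting
 * at iova. Returns 0 on success, -1 on bad alignment or out-of-range input.
 */
static int toy_map_pages(struct toy_iommu *mmu, uint32_t iova,
			 const uint64_t *phys, size_t n)
{
	size_t first = iova >> TOY_PAGE_SHIFT;

	if ((iova & (TOY_PAGE_SIZE - 1)) || first > TOY_TABLE_PAGES ||
	    n > TOY_TABLE_PAGES - first)
		return -1;

	/* Validate everything first so a failure leaves the table untouched. */
	for (size_t i = 0; i < n; i++)
		if (!phys[i] || (phys[i] & (TOY_PAGE_SIZE - 1)))
			return -1;

	for (size_t i = 0; i < n; i++)
		mmu->pte[first + i] = phys[i];
	return 0;
}

/* Translate a device address to a physical one; returns 0 if unmapped. */
static uint64_t toy_translate(const struct toy_iommu *mmu, uint32_t addr)
{
	size_t idx = addr >> TOY_PAGE_SHIFT;

	if (idx >= TOY_TABLE_PAGES || !mmu->pte[idx])
		return 0;
	return mmu->pte[idx] | (addr & (TOY_PAGE_SIZE - 1));
}
```

The device-side buffer is contiguous even though the backing pages are not,
which is what would let the same driver code run with or without an IOMMU.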

This will have the advantage that the same multimedia device driver
will work on both systems - Samsung S5PC110 (without IOMMU) and Exynos4
(with IOMMU).

However, dma_alloc_coherent(), besides allocating memory, also implies some
particular type of memory mapping for it. IMHO it might be a good idea to
separate these 2 things (allocation and mapping) sometime in the future.

On systems with an IOMMU, dma_map_sg() can also be used to create a mapping
in device virtual address space, but the driver will still need to allocate
the memory by itself.

(snipped)

> > > > Getting back to our video codec - it has 2 IOMMU controllers. The
> > > > codec hardware is able to address only 256MiB of space. Do you have
> > > > an idea how this can be handled with dma-mapping API? The only idea
> > > > that comes to my mind is to provide a second, fake 'struct device'
> > > > and use it for allocations for the second IOMMU controller.
> > >
> > > Good question.
> > >
> > > How do you even decide which controller to use from the driver?
> > > I would need to understand better what you are trying to do to
> > > give a good recommendation.
> >
> > Both controllers are used by the hardware depending on the buffer type.
> > For example, buffers with chroma video data are accessed by first (called
> > 'left') memory channel, the others (with luma video data) - by the second
> > channel (called 'right'). Each memory channel is limited to 256MiB
> > address space and best performance is achieved when buffers are allocated
> > in separate physical memory banks (the boards usually have 2 or more
> > memory banks, memory is not interleaved).
> 
> Ok, I see. Having one device per channel as you suggested could probably
> work around this, and it's at least consistent with how you'd represent
> IOMMUs in the device tree. It is not ideal because it makes the video
> driver more complex when it now has to deal with multiple struct device
> that it binds to, but I can't think of any nicer way either.

Well, this will definitely complicate the codec driver. I wonder if allowing
the driver to kmalloc(sizeof(struct device)) and copy the relevant data
from the 'proper' struct device would be a better idea. It is still a hack,
but definitely less intrusive for the driver.

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-21 11:32                   ` Marek Szyprowski
@ 2011-04-21 12:00                     ` Arnd Bergmann
  -1 siblings, 0 replies; 64+ messages in thread
From: Arnd Bergmann @ 2011-04-21 12:00 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: 'Joerg Roedel',
	linux-samsung-soc, 'Kyungmin Park', 'Kukjin Kim',
	Sylwester Nawrocki, Andrzej Pietrasiewicz, linux-arm-kernel,
	linux-media

On Thursday 21 April 2011, Marek Szyprowski wrote:
> On Wednesday, April 20, 2011 6:07 PM Arnd Bergmann wrote:
> > On Wednesday 20 April 2011, Marek Szyprowski wrote:
> > > The only question is how a device can allocate a buffer that will be most
> > > convenient for IOMMU mapping (i.e. will require least entries to map)?
> > >
> > > IOMMU can create a contiguous mapping for ANY set of pages, but it performs
> > > much better if the pages are grouped into 64KiB or 1MiB areas.
> > >
> > > Can device allocate a buffer without mapping it into kernel space?
> > 
> > Not today as far as I know. You can register coherent memory per device
> > using dma_declare_coherent_memory(), which will be used to back
> > dma_alloc_coherent(), but I believe it is always mapped right now.
> 
> This is not exactly what I meant.
> 
> As we have IOMMU, the device driver can access any system memory. However
> the performance will be better if the buffer is composed of larger contiguous
> parts (like 64KiB or 1MiB). I would like to avoid putting logic that manages
> buffer allocation into the device drivers. It would be best if such buffers
> could be allocated by a single call to dma-mapping API.
> 
> Right now there is dma_alloc_coherent() function, which is used by the
> drivers to allocate a contiguous block of memory and map it to DMA addresses.
> With IOMMU implementation it is quite easy to provide a replacement for it
> that will allocate some set of pages and map into device virtual address
> space as a contiguous buffer. 
>
> This will have the advantage that the same multimedia device driver
> will work on both systems - Samsung S5PC110 (without IOMMU) and Exynos4
> (with IOMMU).

Right.
 
> However dma_alloc_coherent() besides allocating memory also implies some
> particular type of memory mapping for it. IMHO it might be a good idea to
> separate these 2 things (allocation and mapping) somewhere in the future.
> 
> On systems with IOMMU the dma_map_sg() can be also used to create a mapping
> in device virtual address space, but the driver will still need to allocate
> the memory by itself.

Note that dma_map_sg() is the "streaming mapping", which provides a cacheable
buffer all the time, while dma_alloc_coherent() is the "coherent mapping".

There is also dma_alloc_noncoherent(), which you can use to allocate a buffer
for the streaming mapping. This is currently not implemented on ARM, but if
I understand you correctly, adding this would do what you want.

> > Ok, I see. Having one device per channel as you suggested could probably
> > work around this, and it's at least consistent with how you'd represent
> > IOMMUs in the device tree. It is not ideal because it makes the video
> > driver more complex when it now has to deal with multiple struct device
> > that it binds to, but I can't think of any nicer way either.
> 
> Well, this will definitely complicate the codec driver. I wonder if allowing
> the driver to kmalloc(sizeof(struct device))) and copy the relevant data
> from the 'proper' struct device will be better idea. It is still hack but 
> definitely less intrusive for the driver.

No, I think that would be much worse: it definitely destroys all kinds of
assumptions that the core code makes about devices. However, I don't think
it's much of a problem to just create two child devices and use them
from the main driver; you don't really need to create a device_driver
to bind to each of them.

	Arnd

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-21 12:00                     ` Arnd Bergmann
@ 2011-04-21 14:03                       ` Marek Szyprowski
  -1 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-21 14:03 UTC (permalink / raw)
  To: 'Arnd Bergmann'
  Cc: 'Joerg Roedel',
	linux-samsung-soc, 'Kyungmin Park', 'Kukjin Kim',
	Sylwester Nawrocki, Andrzej Pietrasiewicz, linux-arm-kernel,
	linux-media

Hello,

On Thursday, April 21, 2011 2:00 PM Arnd Bergmann wrote:

> On Thursday 21 April 2011, Marek Szyprowski wrote:
> > On Wednesday, April 20, 2011 6:07 PM Arnd Bergmann wrote:
> > > On Wednesday 20 April 2011, Marek Szyprowski wrote:
> > > > The only question is how a device can allocate a buffer that will be
> > > > most convenient for IOMMU mapping (i.e. will require least entries to
> > > > map)?
> > > >
> > > > IOMMU can create a contiguous mapping for ANY set of pages, but it
> > > > performs much better if the pages are grouped into 64KiB or 1MiB areas.
> > > >
> > > > Can device allocate a buffer without mapping it into kernel space?
> > >
> > > Not today as far as I know. You can register coherent memory per device
> > > using dma_declare_coherent_memory(), which will be used to back
> > > dma_alloc_coherent(), but I believe it is always mapped right now.
> >
> > This is not exactly what I meant.
> >
> > As we have IOMMU, the device driver can access any system memory. However
> > the performance will be better if the buffer is composed of larger
> > contiguous parts (like 64KiB or 1MiB). I would like to avoid putting logic
> > that manages buffer allocation into the device drivers. It would be best
> > if such buffers could be allocated by a single call to dma-mapping API.
> >
> > Right now there is dma_alloc_coherent() function, which is used by the
> > drivers to allocate a contiguous block of memory and map it to DMA
> > addresses. With IOMMU implementation it is quite easy to provide a
> > replacement for it that will allocate some set of pages and map into
> > device virtual address space as a contiguous buffer.
> >
> > This will have the advantage that the same multimedia device driver
> > will work on both systems - Samsung S5PC110 (without IOMMU) and Exynos4
> > (with IOMMU).
> 
> Right.
> 
> > However dma_alloc_coherent() besides allocating memory also implies some
> > particular type of memory mapping for it. IMHO it might be a good idea to
> > separate these 2 things (allocation and mapping) somewhere in the future.
> >
> > On systems with IOMMU the dma_map_sg() can be also used to create a
> > mapping in device virtual address space, but the driver will still need
> > to allocate the memory by itself.
>
> Note that dma_map_sg() is the "streaming mapping", which provides a
> cacheable buffer all the time, while dma_alloc_coherent() and is the
> "coherent mapping".

Ok. 

> There is also dma_alloc_noncoherent(), which you can use to allocate a
> buffer for the streaming mapping. This is currently not implemented on ARM,
> but if I understand you correctly, adding this would do what you want.

Ok, I got it. Implementing dma_alloc_noncoherent() as well as dma_map_sg()
for non-IOMMU cases also makes some sense and will simplify the drivers imho.

> > > Ok, I see. Having one device per channel as you suggested could
> > > probably work around this, and it's at least consistent with how you'd
> > > represent IOMMUs in the device tree. It is not ideal because it makes
> > > the video driver more complex when it now has to deal with multiple
> > > struct device that it binds to, but I can't think of any nicer way
> > > either.
> >
> > Well, this will definitely complicate the codec driver. I wonder if
> > allowing the driver to kmalloc(sizeof(struct device))) and copy the
> > relevant data from the 'proper' struct device will be better idea. It is
> > still hack but definitely less intrusive for the driver.
> 
> No, I think that would be much worse, it definitely destroys all kinds of
> assumptions that the core code makes about devices. However, I don't think
> it's much of a problem to just create two child devices and use them
> from the main driver, you don't really need to create a device_driver
> to bind to each of them.

I must have missed something. The video codec is a platform device and its
struct device pointer is taken from it (&pdev->dev). How can I define child
devices and attach them to the platform device?

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-21 14:03                       ` Marek Szyprowski
@ 2011-04-21 14:18                         ` Arnd Bergmann
  -1 siblings, 0 replies; 64+ messages in thread
From: Arnd Bergmann @ 2011-04-21 14:18 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: 'Joerg Roedel',
	linux-samsung-soc, 'Kyungmin Park', 'Kukjin Kim',
	Sylwester Nawrocki, Andrzej Pietrasiewicz, linux-arm-kernel,
	linux-media

On Thursday 21 April 2011, Marek Szyprowski wrote:
> > No, I think that would be much worse, it definitely destroys all kinds of
> > assumptions that the core code makes about devices. However, I don't think
> > it's much of a problem to just create two child devices and use them
> > from the main driver, you don't really need to create a device_driver
> > to bind to each of them.
> 
> I must have missed something. Video codec is a platform device and struct
> device pointer is gathered from it (&pdev->dev). How can I define child
> devices and attach them to the platform device?

There are a number of ways:

* Do device_create() with &pdev->dev as the parent, inside of the
  codec driver, with a new class you create for this purpose
* Do device_register() for a device, in the same way
* Create the additional platform devices in the platform code,
  with their parents pointing to the codec device, then
  look for them using device_for_each_child() in the driver
* Create two codec devices in parallel and bind to both with your
  driver, ideally splitting up the resources between the two
  devices in a meaningful way.

None of them are extremely nice, but it's not that hard either.
You should probably prototype a few of these approaches to see
which one is the least ugly one.
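
As a rough illustration of the third option, here is a userspace sketch of
the lookup pattern: the parent walks its children with a callback and stops
at the first match. Everything in it (struct mock_device, mock_add_child(),
mock_for_each_child(), find_child()) is a hypothetical stand-in, not kernel
API. It only mirrors the callback contract of device_for_each_child(), where
iteration stops at the first non-zero return value and that value is
returned to the caller.

```c
#include <stddef.h>
#include <string.h>

#define MOCK_MAX_CHILDREN 4

/* Userspace stand-in for struct device; NOT the kernel structure. */
struct mock_device {
	const char *name;
	struct mock_device *parent;
	struct mock_device *children[MOCK_MAX_CHILDREN];
	size_t nchildren;
};

/* Attach child to parent, as platform code would when setting the parent. */
static inline int mock_add_child(struct mock_device *parent,
				 struct mock_device *child)
{
	if (parent->nchildren >= MOCK_MAX_CHILDREN)
		return -1;
	child->parent = parent;
	parent->children[parent->nchildren++] = child;
	return 0;
}

/*
 * Call fn for each child. Stop at the first non-zero return value and
 * hand it back, which is the contract device_for_each_child() follows.
 */
static inline int mock_for_each_child(struct mock_device *parent, void *data,
				      int (*fn)(struct mock_device *dev,
						void *data))
{
	int ret = 0;
	size_t i;

	for (i = 0; i < parent->nchildren && !ret; i++)
		ret = fn(parent->children[i], data);
	return ret;
}

/* Lookup state passed through the opaque data pointer. */
struct child_lookup {
	const char *name;
	struct mock_device *found;
};

/* Callback: record the child and stop iterating when the name matches. */
static inline int match_child_name(struct mock_device *dev, void *data)
{
	struct child_lookup *lookup = data;

	if (strcmp(dev->name, lookup->name) != 0)
		return 0;
	lookup->found = dev;
	return 1;
}

/* Return the child with the given name, or NULL if there is none. */
static inline struct mock_device *find_child(struct mock_device *parent,
					     const char *name)
{
	struct child_lookup lookup = { .name = name, .found = NULL };

	mock_for_each_child(parent, &lookup, match_child_name);
	return lookup.found;
}
```

In a real driver the two children would be registered by the platform code
and the codec driver would keep the two pointers it finds, one per memory
channel, for its DMA mapping calls.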

	Arnd

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-21 14:18                         ` Arnd Bergmann
@ 2011-04-22  7:33                           ` Marek Szyprowski
  -1 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-22  7:33 UTC (permalink / raw)
  To: 'Arnd Bergmann'
  Cc: 'Joerg Roedel',
	linux-samsung-soc, 'Kyungmin Park', 'Kukjin Kim',
	Sylwester Nawrocki, Andrzej Pietrasiewicz, linux-arm-kernel,
	linux-media

Hello,

On Thursday, April 21, 2011 4:19 PM Arnd Bergmann wrote:

> On Thursday 21 April 2011, Marek Szyprowski wrote:
> > > No, I think that would be much worse, it definitely destroys all kinds of
> > > assumptions that the core code makes about devices. However, I don't think
> > > it's much of a problem to just create two child devices and use them
> > > from the main driver, you don't really need to create a device_driver
> > > to bind to each of them.
> >
> > I must have missed something. Video codec is a platform device and struct
> > device pointer is gathered from it (&pdev->dev). How can I define child
> > devices and attach them to the platform device?
> 
> There are a number of ways:
> 
> * Do device_create() with &pdev->dev as the parent, inside of the
>   codec driver, with a new class you create for this purpose
> * Do device_register() for a device, in the same way
> * Create the additional platform devices in the platform code,
>   with their parents pointing to the codec device, then
>   look for them using device_for_each_child() in the driver

IMHO this will be the cleanest way. Thanks for the idea.

> * Create two codec devices in parallel and bind to both with your
>   driver, ideally splitting up the resources between the two
>   devices in a meaningful way.

The video codec has only the 2 standard resources, ioregs and irq, so there
is not much left to split.

> None of them are extremely nice, but it's not that hard either.
> You should probably prototype a few of these approaches to see
> which one is the least ugly one.

Ok. Today, while going over the hardware requirements, I noticed
one more thing. Our codec hardware has an additional, odd requirement for
video buffers: the DMA addresses need to be aligned to 8KiB or 16KiB
(depending on the buffer type). Do you have any idea how this can be
handled in a generic way?

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-22  7:33                           ` Marek Szyprowski
@ 2011-04-26 14:10                             ` Arnd Bergmann
  -1 siblings, 0 replies; 64+ messages in thread
From: Arnd Bergmann @ 2011-04-26 14:10 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: 'Joerg Roedel',
	linux-samsung-soc, 'Kyungmin Park', 'Kukjin Kim',
	Sylwester Nawrocki, Andrzej Pietrasiewicz, linux-arm-kernel,
	linux-media

On Friday 22 April 2011, Marek Szyprowski wrote:
> > * Create two codec devices in parallel and bind to both with your
> >   driver, ideally splitting up the resources between the two
> >   devices in a meaningful way.
> 
> Video codec has only standard 2 resources - ioregs and irq, so there
> is not much left for such splitting.

Ok, I see.

> > None of them are extremely nice, but it's not that hard either.
> > You should probably prototype a few of these approaches to see
> > which one is the least ugly one.
> 
> Ok. Today while iterating over the hardware requirements I noticed
> one more thing. Our codec hardware has one more, odd requirement for
> video buffers. The DMA addresses need to be aligned to 8KiB or 16KiB
> (depending on buffer type). Do you have any idea how this can be
> handled in a generic way?

I don't think you can force the mappings to be aligned to that size
in the streaming mapping, but you should be able to just align inside
of dma_map_single etc and map a larger region.

For the allocation functions (dma_alloc_coherent, dma_alloc_noncoherent),
using alloc_pages to allocate multiples of the size you need should
always give you aligned buffers because of the way that the underlying
buddy allocator works.
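
A small userspace sketch of the arithmetic behind both suggestions, assuming
power-of-two alignments and a mapping that can start on any page boundary.
The helper names (align_up(), padded_size(), order_for(),
buddy_block_align()) are made up for illustration and are not kernel
functions.

```c
#include <stdint.h>

/* Round addr up to the next multiple of align (align: power of two). */
static inline uint64_t align_up(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Bytes to map so that an align-aligned window of size bytes always fits,
 * assuming the mapping itself starts on an arbitrary page boundary.
 * The worst case wastes (align - page_size) bytes in front of the window.
 */
static inline uint64_t padded_size(uint64_t size, uint64_t align,
				   uint64_t page_size)
{
	return size + (align > page_size ? align - page_size : 0);
}

/* Smallest page order whose block (page_size << order) covers size. */
static inline unsigned int order_for(uint64_t size, uint64_t page_size)
{
	unsigned int order = 0;

	while ((page_size << order) < size)
		order++;
	return order;
}

/*
 * A buddy block of a given order is naturally aligned to its own size,
 * so this is the alignment such an allocation provides.
 */
static inline uint64_t buddy_block_align(unsigned int order,
					 uint64_t page_size)
{
	return page_size << order;
}
```

With 4KiB pages, a 16KiB buffer needs an order-2 block, which is 16KiB
aligned by construction. For the streaming case, a 64KiB buffer with 16KiB
alignment needs a 76KiB mapping in the worst case.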

	Arnd

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver
  2011-04-26 14:10                             ` Arnd Bergmann
@ 2011-04-26 14:23                               ` Marek Szyprowski
  -1 siblings, 0 replies; 64+ messages in thread
From: Marek Szyprowski @ 2011-04-26 14:23 UTC (permalink / raw)
  To: 'Arnd Bergmann'
  Cc: 'Joerg Roedel',
	linux-samsung-soc, 'Kyungmin Park', 'Kukjin Kim',
	Sylwester Nawrocki, Andrzej Pietrasiewicz, linux-arm-kernel,
	linux-media

Hello,

On Tuesday, April 26, 2011 4:10 PM Arnd Bergmann wrote:

> On Friday 22 April 2011, Marek Szyprowski wrote:
> > > * Create two codec devices in parallel and bind to both with your
> > >   driver, ideally splitting up the resources between the two
> > >   devices in a meaningful way.
> >
> > Video codec has only standard 2 resources - ioregs and irq, so there
> > is not much left for such splitting.
> 
> Ok, I see.
> 
> > > None of them are extremely nice, but it's not that hard either.
> > > You should probably prototype a few of these approaches to see
> > > which one is the least ugly one.
> >
> > Ok. Today while iterating over the hardware requirements I noticed
> > one more thing. Our codec hardware has one more, odd requirement for
> > video buffers. The DMA addresses need to be aligned to 8KiB or 16KiB
> > (depending on buffer type). Do you have any idea how this can be
> > handled in a generic way?
> 
> I don't think you can force the mappings to be aligned to that size
> in the streaming mapping, but you should be able to just align inside
> of dma_map_single etc and map a larger region.
> 
> For the allocation functions (dma_alloc_coherent, dma_alloc_noncoherent),
> using alloc_pages to allocate multiples of the size you need should
> always give you aligned buffers because of the way that the underlying
> buddy allocator works.

Well, I was thinking about the alignment of the IOVA mapping. I will probably
handle it with some additional archdata stuff.

I've started hacking on the ARM dma-mapping interface to get support for
dma-mapping-common.h and then to integrate it with the Samsung IOMMU driver.
I hope to post the initial version before the Linaro meeting in Budapest.

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center


^ permalink raw reply	[flat|nested] 64+ messages in thread

Thread overview:
2011-04-18  9:26 [RFC/PATCH v3 0/7] Samsung IOMMU videobuf2 allocator and s5p-fimc update Marek Szyprowski
2011-04-18  9:26 ` [PATCH 1/7] ARM: EXYNOS4: power domains: fixes and code cleanup Marek Szyprowski
2011-04-18  9:26 ` [PATCH 2/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver Marek Szyprowski
2011-04-18 14:12   ` Arnd Bergmann
2011-04-19  8:23     ` Marek Szyprowski
2011-04-19 12:49       ` Arnd Bergmann
2011-04-19 13:50         ` Roedel, Joerg
2011-04-19 14:28           ` Arnd Bergmann
2011-04-19 14:51             ` Roedel, Joerg
2011-04-19 14:03         ` Marek Szyprowski
2011-04-19 14:29           ` Arnd Bergmann
2011-04-20 14:55             ` Marek Szyprowski
2011-04-20 16:07               ` Arnd Bergmann
2011-04-21 11:32                 ` Marek Szyprowski
2011-04-21 12:00                   ` Arnd Bergmann
2011-04-21 14:03                     ` Marek Szyprowski
2011-04-21 14:18                       ` Arnd Bergmann
2011-04-22  7:33                         ` Marek Szyprowski
2011-04-26 14:10                           ` Arnd Bergmann
2011-04-26 14:23                             ` Marek Szyprowski
2011-04-19 15:00           ` Roedel, Joerg
2011-04-19 15:37             ` Arnd Bergmann
2011-04-18  9:26 ` [PATCH 3/7] v4l: videobuf2: dma-sg: move some generic functions to memops Marek Szyprowski
2011-04-18  9:26 ` [PATCH 4/7] v4l: videobuf2: add IOMMU based DMA memory allocator Marek Szyprowski
2011-04-18 14:15   ` Arnd Bergmann
2011-04-19  9:02     ` Marek Szyprowski
2011-04-19  9:21       ` Russell King - ARM Linux
2011-04-19 12:00         ` Arnd Bergmann
2011-04-18  9:26 ` [PATCH 5/7] v4l: s5p-fimc: add pm_runtime support Marek Szyprowski
2011-04-18  9:26 ` [PATCH 6/7] v4l: s5p-fimc: Add support for vb2-dma-iommu allocator Marek Szyprowski
2011-04-18  9:26 ` [PATCH 7/7] ARM: EXYNOS4: enable FIMC on Universal_C210 Marek Szyprowski
2011-04-18 13:24 ` [RFC/PATCH v3 0/7] Samsung IOMMU videobuf2 allocator and s5p-fimc update Marek Szyprowski
