* [PATCH v5 1/3] iommu/rockchip: rk3288 iommu driver
@ 2014-10-14  8:02   ` Daniel Kurtz
  0 siblings, 0 replies; 26+ messages in thread
From: Daniel Kurtz @ 2014-10-14  8:02 UTC (permalink / raw)
  Cc: Grant Grundler, Stéphane Marchesin, Daniel Kurtz, Simon Xue,
	Joerg Roedel, Heiko Stuebner, Grant Likely, Rob Herring,
	open list, open list:IOMMU DRIVERS,
	moderated list:ARM/Rockchip SoC..., open list:ARM/Rockchip SoC...,
	open list:OPEN FIRMWARE AND...

The rk3288 has several iommus.  Each iommu belongs to a single master
device.  There is one device (ISP) that has two slave iommus, but that
case is not yet supported by this driver.

At subsys init, the iommu driver registers itself as the iommu driver for
the platform bus.  The master devices find their slave iommus using the
"iommus" field in their devicetree description.  Since each slave iommu
belongs to exactly one master, there is no additional data needed at probe
to associate a slave with its master.

An iommu device's power domain, clock and irq are all shared with its
master device, and the master device must be careful to attach to the
iommu only after powering and clocking it (and to keep it powered and
clocked until after detaching).  Because there is no guarantee what the
status of the iommu is at probe, and since the driver does not even know if
the device is powered, we delay requesting its irq until the master device
attaches, at which point we have a guarantee that the device is powered
and clocked and we can reset it and program its interrupt mask.

An iommu_domain describes a virtual iova address space.  Each iommu_domain
has a corresponding page table that lists the mappings from iova to
physical address.

For the rk3288 iommu, the page table has two levels:
 The Level 1 "directory_table" has 1024 4-byte dte entries.
 Each dte points to a level 2 "page_table".
 Each level 2 page_table has 1024 4-byte pte entries.
 Each pte points to a 4 KiB page of memory.
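
The two-level lookup described above can be sketched in user space.  This
is only a model, not the driver code: the bit positions (dte index in
31:22, pte index in 21:12, page offset in 11:0, valid bit in bit 0) follow
the comments in rockchip-iommu.c, while pool[], table_at(), mk_entry() and
iova_to_phys() are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRIES_PER_TABLE 1024u
#define ENTRY_VALID       0x1u          /* bit 0 of a dte or pte */
#define ENTRY_ADDR_MASK   0xfffff000u   /* bits 31:12: 4 KiB-aligned address */

/* Hypothetical stand-in for RAM: "physical address" (N << 12) is pool[N]. */
#define POOL_PAGES 4u
static uint32_t pool[POOL_PAGES][ENTRIES_PER_TABLE];

/* Model of phys_to_virt() for table pages that live inside the pool. */
static uint32_t *table_at(uint32_t phys)
{
	assert((phys >> 12) < POOL_PAGES);
	return pool[phys >> 12];
}

static uint32_t dte_index(uint32_t iova)   { return iova >> 22; }
static uint32_t pte_index(uint32_t iova)   { return (iova >> 12) & 0x3ffu; }
static uint32_t page_offset(uint32_t iova) { return iova & 0xfffu; }

/* Build a valid dte or pte (permission flags omitted in this model). */
static uint32_t mk_entry(uint32_t phys)
{
	return (phys & ENTRY_ADDR_MASK) | ENTRY_VALID;
}

/* Walk both levels; return 0 and fill *phys, or -1 if the iova is unmapped. */
static int iova_to_phys(uint32_t dt_phys, uint32_t iova, uint32_t *phys)
{
	uint32_t dte = table_at(dt_phys)[dte_index(iova)];
	uint32_t pte;

	if (!(dte & ENTRY_VALID))
		return -1;

	pte = table_at(dte & ENTRY_ADDR_MASK)[pte_index(iova)];
	if (!(pte & ENTRY_VALID))
		return -1;

	*phys = (pte & ENTRY_ADDR_MASK) + page_offset(iova);
	return 0;
}
```

With 1024 dtes, 1024 ptes per page table and 4 KiB pages, one directory
table spans the full 32-bit iova space (1024 * 1024 * 4 KiB = 4 GiB).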

An iommu_domain is created when a dma_iommu_mapping is created via
arm_iommu_create_mapping().  Master devices can then attach themselves to
this mapping (or attach the mapping to themselves?) by calling
arm_iommu_attach_device().  This in turn instructs the iommu driver to
write the directory table's physical address into the slave iommu's
"Directory Table Address" (RK_MMU_DTE_ADDR) register.

In fact multiple master devices, each with their own slave iommu device,
can all attach to the same mapping.  The iommus for these devices will
share the same iommu_domain and therefore point to the same page table.
Thus, the iommu domain maintains a list of iommu devices which are
attached.  This driver relies on the iommu core to ensure that all devices
have detached before destroying a domain.
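
The bookkeeping for a shared domain can be modelled as below.  This is a
simplified user-space sketch, not the driver code: the driver uses a
struct list_head protected by a spinlock, whereas the model_* names, the
singly linked list and the zap counter here are assumptions made purely
for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct model_iommu {
	struct model_iommu *next;  /* link in the domain's list */
	int attached;              /* nonzero while on a domain's list */
	unsigned int zap_count;    /* stands in for writes to the zap register */
};

struct model_domain {
	struct model_iommu *iommus; /* every iommu currently attached */
};

static void model_attach(struct model_domain *d, struct model_iommu *m)
{
	assert(!m->attached);       /* one iommu belongs to at most one domain */
	m->next = d->iommus;
	m->attached = 1;
	d->iommus = m;
}

static void model_detach(struct model_domain *d, struct model_iommu *m)
{
	struct model_iommu **pp;

	for (pp = &d->iommus; *pp; pp = &(*pp)->next) {
		if (*pp == m) {
			*pp = m->next;
			m->next = NULL;
			m->attached = 0;
			return;
		}
	}
}

/* Shoot down a mapping on every attached iommu; returns how many were hit. */
static unsigned int model_zap(struct model_domain *d)
{
	struct model_iommu *m;
	unsigned int n = 0;

	for (m = d->iommus; m; m = m->next) {
		m->zap_count++;
		n++;
	}
	return n;
}

/* A domain may only be destroyed once nothing is attached to it. */
static int model_can_destroy(const struct model_domain *d)
{
	return d->iommus == NULL;
}
```

Because all attached iommus walk the same page table, an unmap has to
invalidate the iotlb of each of them, which is why the domain needs the
list in the first place.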

Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Simon Xue <xxm@rock-chips.com>
Reviewed-by: Grant Grundler <grundler@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
---
 drivers/iommu/Kconfig          |  12 +
 drivers/iommu/Makefile         |   1 +
 drivers/iommu/rockchip-iommu.c | 924 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 937 insertions(+)
 create mode 100644 drivers/iommu/rockchip-iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index dd51122..d0a1261 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -152,6 +152,18 @@ config OMAP_IOMMU_DEBUG
 
          Say N unless you know you need this.
 
+config ROCKCHIP_IOMMU
+	bool "Rockchip IOMMU Support"
+	depends on ARCH_ROCKCHIP
+	select IOMMU_API
+	select ARM_DMA_USE_IOMMU
+	help
+	  Support for IOMMUs found on Rockchip rk32xx SoCs.
+	  These IOMMUs allow virtualization of the address space used by most
+	  cores within the multimedia subsystem.
+	  Say Y here if you are using a Rockchip SoC that includes an IOMMU
+	  device.
+
 config TEGRA_IOMMU_GART
 	bool "Tegra GART IOMMU Support"
 	depends on ARCH_TEGRA_2x_SOC
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 16edef7..3e47ef3 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -13,6 +13,7 @@ obj-$(CONFIG_IRQ_REMAP) += intel_irq_remapping.o irq_remapping.o
 obj-$(CONFIG_OMAP_IOMMU) += omap-iommu.o
 obj-$(CONFIG_OMAP_IOMMU) += omap-iommu2.o
 obj-$(CONFIG_OMAP_IOMMU_DEBUG) += omap-iommu-debug.o
+obj-$(CONFIG_ROCKCHIP_IOMMU) += rockchip-iommu.o
 obj-$(CONFIG_TEGRA_IOMMU_GART) += tegra-gart.o
 obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
new file mode 100644
index 0000000..08e50fc
--- /dev/null
+++ b/drivers/iommu/rockchip-iommu.c
@@ -0,0 +1,924 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <asm/cacheflush.h>
+#include <asm/pgtable.h>
+#include <linux/compiler.h>
+#include <linux/delay.h>
+#include <linux/device.h>
+#include <linux/errno.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/jiffies.h>
+#include <linux/list.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+/* MMU register offsets */
+#define RK_MMU_DTE_ADDR		0x00	/* Directory table address */
+#define RK_MMU_STATUS		0x04
+#define RK_MMU_COMMAND		0x08
+#define RK_MMU_PAGE_FAULT_ADDR	0x0C	/* IOVA of last page fault */
+#define RK_MMU_ZAP_ONE_LINE	0x10	/* Shootdown one IOTLB entry */
+#define RK_MMU_INT_RAWSTAT	0x14	/* IRQ status ignoring mask */
+#define RK_MMU_INT_CLEAR	0x18	/* Acknowledge and re-arm irq */
+#define RK_MMU_INT_MASK		0x1C	/* IRQ enable */
+#define RK_MMU_INT_STATUS	0x20	/* IRQ status after masking */
+#define RK_MMU_AUTO_GATING	0x24
+
+#define DTE_ADDR_DUMMY		0xCAFEBABE
+#define FORCE_RESET_TIMEOUT	100	/* ms */
+
+/* RK_MMU_STATUS fields */
+#define RK_MMU_STATUS_PAGING_ENABLED       BIT(0)
+#define RK_MMU_STATUS_PAGE_FAULT_ACTIVE    BIT(1)
+#define RK_MMU_STATUS_STALL_ACTIVE         BIT(2)
+#define RK_MMU_STATUS_IDLE                 BIT(3)
+#define RK_MMU_STATUS_REPLAY_BUFFER_EMPTY  BIT(4)
+#define RK_MMU_STATUS_PAGE_FAULT_IS_WRITE  BIT(5)
+#define RK_MMU_STATUS_STALL_NOT_ACTIVE     BIT(31)
+
+/* RK_MMU_COMMAND command values */
+#define RK_MMU_CMD_ENABLE_PAGING    0  /* Enable memory translation */
+#define RK_MMU_CMD_DISABLE_PAGING   1  /* Disable memory translation */
+#define RK_MMU_CMD_ENABLE_STALL     2  /* Stall paging to allow other cmds */
+#define RK_MMU_CMD_DISABLE_STALL    3  /* Stop stall re-enables paging */
+#define RK_MMU_CMD_ZAP_CACHE        4  /* Shoot down entire IOTLB */
+#define RK_MMU_CMD_PAGE_FAULT_DONE  5  /* Clear page fault */
+#define RK_MMU_CMD_FORCE_RESET      6  /* Reset all registers */
+
+/* RK_MMU_INT_* register fields */
+#define RK_MMU_IRQ_PAGE_FAULT    0x01  /* page fault */
+#define RK_MMU_IRQ_BUS_ERROR     0x02  /* bus read error */
+#define RK_MMU_IRQ_MASK          (RK_MMU_IRQ_PAGE_FAULT | RK_MMU_IRQ_BUS_ERROR)
+
+#define NUM_DT_ENTRIES 1024
+#define NUM_PT_ENTRIES 1024
+
+#define SPAGE_ORDER 12
+#define SPAGE_SIZE (1 << SPAGE_ORDER)
+
+/*
+ * Support mapping any size that fits in one page table:
+ *   4 KiB to 4 MiB
+ */
+#define RK_IOMMU_PGSIZE_BITMAP 0x007ff000
+
+#define IOMMU_REG_POLL_COUNT_FAST 1000
+
+struct rk_iommu_domain {
+	struct list_head iommus;
+	u32 *dt; /* page directory table */
+	spinlock_t iommus_lock; /* lock for iommus list */
+	spinlock_t dt_lock; /* lock for modifying page directory table */
+};
+
+struct rk_iommu {
+	struct device *dev;
+	void __iomem *base;
+	int irq;
+	struct list_head node; /* entry in rk_iommu_domain.iommus */
+	struct iommu_domain *domain; /* domain to which iommu is attached */
+};
+
+static inline void rk_table_flush(u32 *va, unsigned int count)
+{
+	phys_addr_t pa_start = virt_to_phys(va);
+	phys_addr_t pa_end = virt_to_phys(va + count);
+	size_t size = pa_end - pa_start;
+
+	__cpuc_flush_dcache_area(va, size);
+	outer_flush_range(pa_start, pa_end);
+}
+
+/*
+ * Inspired by _wait_for in intel_drv.h
+ * This is NOT safe for use in interrupt context.
+ *
+ * Note that it's important that we check the condition again after having
+ * timed out, since the timeout could be due to preemption or similar and
+ * we've never had a chance to check the condition before the timeout.
+ */
+#define rk_wait_for(COND, MS) ({ \
+	unsigned long timeout__ = jiffies + msecs_to_jiffies(MS) + 1;	\
+	int ret__ = 0;							\
+	while (!(COND)) {						\
+		if (time_after(jiffies, timeout__)) {			\
+			ret__ = (COND) ? 0 : -ETIMEDOUT;		\
+			break;						\
+		}							\
+		usleep_range(50, 100);					\
+	}								\
+	ret__;								\
+})
+
+/*
+ * The Rockchip rk3288 iommu uses a 2-level page table.
+ * The first level is the "Directory Table" (DT).
+ * The DT consists of 1024 4-byte Directory Table Entries (DTEs), each pointing
+ * to a "Page Table".
+ * The second level is the 1024 Page Tables (PT).
+ * Each PT consists of 1024 4-byte Page Table Entries (PTEs), each pointing to
+ * a 4 KB page of physical memory.
+ *
+ * The DT and each PT fits in a single 4 KB page (4-bytes * 1024 entries).
+ * Each iommu device has a MMU_DTE_ADDR register that contains the physical
+ * address of the start of the DT page.
+ *
+ * The structure of the page table is as follows:
+ *
+ *                   DT
+ * MMU_DTE_ADDR -> +-----+
+ *                 |     |
+ *                 +-----+     PT
+ *                 | DTE | -> +-----+
+ *                 +-----+    |     |     Memory
+ *                 |     |    +-----+     Page
+ *                 |     |    | PTE | -> +-----+
+ *                 +-----+    +-----+    |     |
+ *                            |     |    |     |
+ *                            |     |    |     |
+ *                            +-----+    |     |
+ *                                       |     |
+ *                                       |     |
+ *                                       +-----+
+ */
+
+/*
+ * Each DTE has a PT address and a valid bit:
+ * +---------------------+-----------+-+
+ * | PT address          | Reserved  |V|
+ * +---------------------+-----------+-+
+ *  31:12 - PT address (PTs always start on a 4 KB boundary)
+ *  11: 1 - Reserved
+ *      0 - 1 if PT @ PT address is valid
+ */
+#define RK_DTE_PT_ADDRESS_MASK    0xfffff000
+#define RK_DTE_PT_VALID           BIT(0)
+
+static inline phys_addr_t rk_dte_pt_address(u32 dte)
+{
+	return (phys_addr_t)dte & RK_DTE_PT_ADDRESS_MASK;
+}
+
+static inline bool rk_dte_is_pt_valid(u32 dte)
+{
+	return dte & RK_DTE_PT_VALID;
+}
+
+static u32 rk_mk_dte(u32 *pt)
+{
+	phys_addr_t pt_phys = virt_to_phys(pt);
+	return (pt_phys & RK_DTE_PT_ADDRESS_MASK) | RK_DTE_PT_VALID;
+}
+
+/*
+ * Each PTE has a Page address, some flags and a valid bit:
+ * +---------------------+---+-------+-+
+ * | Page address        |Rsv| Flags |V|
+ * +---------------------+---+-------+-+
+ *  31:12 - Page address (Pages always start on a 4 KB boundary)
+ *  11: 9 - Reserved
+ *   8: 1 - Flags
+ *      8 - Read allocate - allocate cache space on read misses
+ *      7 - Read cache - enable cache & prefetch of data
+ *      6 - Write buffer - enable delaying writes on their way to memory
+ *      5 - Write allocate - allocate cache space on write misses
+ *      4 - Write cache - different writes can be merged together
+ *      3 - Override cache attributes
+ *          if 1, bits 4-8 control cache attributes
+ *          if 0, the system bus defaults are used
+ *      2 - Writable
+ *      1 - Readable
+ *      0 - 1 if Page @ Page address is valid
+ */
+#define RK_PTE_PAGE_ADDRESS_MASK  0xfffff000
+#define RK_PTE_PAGE_FLAGS_MASK    0x000001fe
+#define RK_PTE_PAGE_WRITABLE      BIT(2)
+#define RK_PTE_PAGE_READABLE      BIT(1)
+#define RK_PTE_PAGE_VALID         BIT(0)
+
+static inline phys_addr_t rk_pte_page_address(u32 pte)
+{
+	return (phys_addr_t)pte & RK_PTE_PAGE_ADDRESS_MASK;
+}
+
+static inline bool rk_pte_is_page_valid(u32 pte)
+{
+	return pte & RK_PTE_PAGE_VALID;
+}
+
+/* TODO: set cache flags per prot IOMMU_CACHE */
+static u32 rk_mk_pte(phys_addr_t page, int prot)
+{
+	u32 flags = 0;
+	flags |= (prot & IOMMU_READ) ? RK_PTE_PAGE_READABLE : 0;
+	flags |= (prot & IOMMU_WRITE) ? RK_PTE_PAGE_WRITABLE : 0;
+	page &= RK_PTE_PAGE_ADDRESS_MASK;
+	return page | flags | RK_PTE_PAGE_VALID;
+}
+
+static u32 rk_mk_pte_invalid(u32 pte)
+{
+	return pte & ~RK_PTE_PAGE_VALID;
+}
+
+/*
+ * rk3288 iova (IOMMU Virtual Address) format
+ *  31       22.21       12.11          0
+ * +-----------+-----------+-------------+
+ * | DTE index | PTE index | Page offset |
+ * +-----------+-----------+-------------+
+ *  31:22 - DTE index   - index of DTE in DT
+ *  21:12 - PTE index   - index of PTE in PT @ DTE.pt_address
+ *  11: 0 - Page offset - offset into page @ PTE.page_address
+ */
+#define RK_IOVA_DTE_MASK    0xffc00000
+#define RK_IOVA_DTE_SHIFT   22
+#define RK_IOVA_PTE_MASK    0x003ff000
+#define RK_IOVA_PTE_SHIFT   12
+#define RK_IOVA_PAGE_MASK   0x00000fff
+#define RK_IOVA_PAGE_SHIFT  0
+
+static u32 rk_iova_dte_index(dma_addr_t iova)
+{
+	return (u32)(iova & RK_IOVA_DTE_MASK) >> RK_IOVA_DTE_SHIFT;
+}
+
+static u32 rk_iova_pte_index(dma_addr_t iova)
+{
+	return (u32)(iova & RK_IOVA_PTE_MASK) >> RK_IOVA_PTE_SHIFT;
+}
+
+static u32 rk_iova_page_offset(dma_addr_t iova)
+{
+	return (u32)(iova & RK_IOVA_PAGE_MASK) >> RK_IOVA_PAGE_SHIFT;
+}
+
+static u32 rk_iommu_read(struct rk_iommu *iommu, u32 offset)
+{
+	return readl(iommu->base + offset);
+}
+
+static void rk_iommu_write(struct rk_iommu *iommu, u32 offset, u32 value)
+{
+	writel(value, iommu->base + offset);
+}
+
+static void rk_iommu_command(struct rk_iommu *iommu, u32 command)
+{
+	writel(command, iommu->base + RK_MMU_COMMAND);
+}
+
+static void rk_iommu_zap_lines(struct rk_iommu *iommu, dma_addr_t iova,
+			       size_t size)
+{
+	dma_addr_t iova_end = iova + size;
+	/*
+	 * TODO(djkurtz): Figure out when it is more efficient to shootdown the
+	 * entire iotlb rather than iterate over individual iovas.
+	 */
+	for (; iova < iova_end; iova += SPAGE_SIZE)
+		rk_iommu_write(iommu, RK_MMU_ZAP_ONE_LINE, iova);
+}
+
+static bool rk_iommu_is_stall_active(struct rk_iommu *iommu)
+{
+	return rk_iommu_read(iommu, RK_MMU_STATUS) & RK_MMU_STATUS_STALL_ACTIVE;
+}
+
+static bool rk_iommu_is_paging_enabled(struct rk_iommu *iommu)
+{
+	return rk_iommu_read(iommu, RK_MMU_STATUS) &
+			     RK_MMU_STATUS_PAGING_ENABLED;
+}
+
+static int rk_iommu_enable_stall(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (rk_iommu_is_stall_active(iommu))
+		return 0;
+
+	/* Stall can only be enabled if paging is enabled */
+	if (!rk_iommu_is_paging_enabled(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_ENABLE_STALL);
+
+	ret = rk_wait_for(rk_iommu_is_stall_active(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Enable stall request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_disable_stall(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (!rk_iommu_is_stall_active(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_DISABLE_STALL);
+
+	ret = rk_wait_for(!rk_iommu_is_stall_active(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Disable stall request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_enable_paging(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (rk_iommu_is_paging_enabled(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_ENABLE_PAGING);
+
+	ret = rk_wait_for(rk_iommu_is_paging_enabled(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Enable paging request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_disable_paging(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (!rk_iommu_is_paging_enabled(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_DISABLE_PAGING);
+
+	ret = rk_wait_for(!rk_iommu_is_paging_enabled(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Disable paging request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_force_reset(struct rk_iommu *iommu)
+{
+	int ret;
+	u32 dte_addr;
+
+	/*
+	 * Check if register DTE_ADDR is working by writing DTE_ADDR_DUMMY
+	 * and verifying that upper 5 nybbles are read back.
+	 */
+	rk_iommu_write(iommu, RK_MMU_DTE_ADDR, DTE_ADDR_DUMMY);
+
+	dte_addr = rk_iommu_read(iommu, RK_MMU_DTE_ADDR);
+	if (dte_addr != (DTE_ADDR_DUMMY & RK_DTE_PT_ADDRESS_MASK)) {
+		dev_err(iommu->dev, "Error during raw reset. MMU_DTE_ADDR is not functioning\n");
+		return -EFAULT;
+	}
+
+	rk_iommu_command(iommu, RK_MMU_CMD_FORCE_RESET);
+
+	ret = rk_wait_for(rk_iommu_read(iommu, RK_MMU_DTE_ADDR) == 0x00000000,
+			  FORCE_RESET_TIMEOUT);
+	if (ret)
+		dev_err(iommu->dev, "FORCE_RESET command timed out\n");
+
+	return ret;
+}
+
+static void log_iova(struct rk_iommu *iommu, dma_addr_t iova)
+{
+	u32 dte_index, pte_index, page_offset;
+	u32 mmu_dte_addr;
+	phys_addr_t mmu_dte_addr_phys, dte_addr_phys;
+	u32 *dte_addr;
+	u32 dte;
+	phys_addr_t pte_addr_phys = 0;
+	u32 *pte_addr = NULL;
+	u32 pte = 0;
+	phys_addr_t page_addr_phys = 0;
+	u32 page_flags = 0;
+
+	dte_index = rk_iova_dte_index(iova);
+	pte_index = rk_iova_pte_index(iova);
+	page_offset = rk_iova_page_offset(iova);
+
+	mmu_dte_addr = rk_iommu_read(iommu, RK_MMU_DTE_ADDR);
+	mmu_dte_addr_phys = (phys_addr_t)mmu_dte_addr;
+
+	dte_addr_phys = mmu_dte_addr_phys + (4 * dte_index);
+	dte_addr = phys_to_virt(dte_addr_phys);
+	dte = *dte_addr;
+
+	if (!rk_dte_is_pt_valid(dte))
+		goto print_it;
+
+	pte_addr_phys = rk_dte_pt_address(dte) + (pte_index * 4);
+	pte_addr = phys_to_virt(pte_addr_phys);
+	pte = *pte_addr;
+
+	if (!rk_pte_is_page_valid(pte))
+		goto print_it;
+
+	page_addr_phys = rk_pte_page_address(pte) + page_offset;
+	page_flags = pte & RK_PTE_PAGE_FLAGS_MASK;
+
+print_it:
+	dev_err(iommu->dev, "iova = %pad: dte_index: 0x%03x pte_index: 0x%03x page_offset: 0x%03x\n",
+		&iova, dte_index, pte_index, page_offset);
+	dev_err(iommu->dev, "mmu_dte_addr: %pa dte@%pa: %#08x valid: %u pte@%pa: %#08x valid: %u page@%pa flags: %#03x\n",
+		&mmu_dte_addr_phys, &dte_addr_phys, dte,
+		rk_dte_is_pt_valid(dte), &pte_addr_phys, pte,
+		rk_pte_is_page_valid(pte), &page_addr_phys, page_flags);
+}
+
+static irqreturn_t rk_iommu_irq(int irq, void *dev_id)
+{
+	struct rk_iommu *iommu = dev_id;
+	u32 status;
+	u32 int_status;
+	dma_addr_t iova;
+
+	int_status = rk_iommu_read(iommu, RK_MMU_INT_STATUS);
+	if (int_status == 0)
+		return IRQ_NONE;
+
+	iova = rk_iommu_read(iommu, RK_MMU_PAGE_FAULT_ADDR);
+
+	if (int_status & RK_MMU_IRQ_PAGE_FAULT) {
+		int flags;
+
+		status = rk_iommu_read(iommu, RK_MMU_STATUS);
+		flags = (status & RK_MMU_STATUS_PAGE_FAULT_IS_WRITE) ?
+				IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
+
+		dev_err(iommu->dev, "Page fault at %pad of type %s\n",
+			&iova,
+			(flags == IOMMU_FAULT_WRITE) ? "write" : "read");
+
+		log_iova(iommu, iova);
+
+		/*
+		 * Report page fault to any installed handlers.
+		 * Ignore the return code, though, since we always zap cache
+		 * and clear the page fault anyway.
+		 */
+		if (iommu->domain)
+			report_iommu_fault(iommu->domain, iommu->dev, iova,
+					   flags);
+		else
+			dev_err(iommu->dev, "Page fault while iommu not attached to domain?\n");
+
+		rk_iommu_command(iommu, RK_MMU_CMD_ZAP_CACHE);
+		rk_iommu_command(iommu, RK_MMU_CMD_PAGE_FAULT_DONE);
+	}
+
+	if (int_status & RK_MMU_IRQ_BUS_ERROR)
+		dev_err(iommu->dev, "BUS_ERROR occurred at %pad\n", &iova);
+
+	if (int_status & ~RK_MMU_IRQ_MASK)
+		dev_err(iommu->dev, "unexpected int_status: %#08x\n",
+			int_status);
+
+	rk_iommu_write(iommu, RK_MMU_INT_CLEAR, int_status);
+
+	return IRQ_HANDLED;
+}
+
+static phys_addr_t rk_iommu_iova_to_phys(struct iommu_domain *domain,
+					 dma_addr_t iova)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	phys_addr_t pt_phys, phys = 0;
+	u32 dte, pte;
+	u32 *page_table;
+
+	spin_lock_irqsave(&rk_domain->dt_lock, flags);
+
+	dte = rk_domain->dt[rk_iova_dte_index(iova)];
+	if (!rk_dte_is_pt_valid(dte))
+		goto out;
+
+	pt_phys = rk_dte_pt_address(dte);
+	page_table = (u32 *)phys_to_virt(pt_phys);
+	pte = page_table[rk_iova_pte_index(iova)];
+	if (!rk_pte_is_page_valid(pte))
+		goto out;
+
+	phys = rk_pte_page_address(pte) + rk_iova_page_offset(iova);
+out:
+	spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+
+	return phys;
+}
+
+static void rk_iommu_zap_iova(struct rk_iommu_domain *rk_domain,
+			      dma_addr_t iova, size_t size)
+{
+	struct list_head *pos;
+	unsigned long flags;
+
+	/* shootdown these iova from all iommus using this domain */
+	spin_lock_irqsave(&rk_domain->iommus_lock, flags);
+	list_for_each(pos, &rk_domain->iommus) {
+		struct rk_iommu *iommu;
+		iommu = list_entry(pos, struct rk_iommu, node);
+		rk_iommu_zap_lines(iommu, iova, size);
+	}
+	spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
+}
+
+static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain,
+				  dma_addr_t iova)
+{
+	u32 *page_table, *dte_addr;
+	u32 dte;
+	phys_addr_t pt_phys;
+
+	assert_spin_locked(&rk_domain->dt_lock);
+
+	dte_addr = &rk_domain->dt[rk_iova_dte_index(iova)];
+	dte = *dte_addr;
+	if (rk_dte_is_pt_valid(dte))
+		goto done;
+
+	page_table = (u32 *)get_zeroed_page(GFP_ATOMIC | GFP_DMA32);
+	if (!page_table)
+		return ERR_PTR(-ENOMEM);
+
+	dte = rk_mk_dte(page_table);
+	*dte_addr = dte;
+
+	rk_table_flush(page_table, NUM_PT_ENTRIES);
+	rk_table_flush(dte_addr, 1);
+
+	/*
+	 * Zap the first iova of newly allocated page table so iommu evicts
+	 * old cached value of new dte from the iotlb.
+	 */
+	rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);
+
+done:
+	pt_phys = rk_dte_pt_address(dte);
+	return (u32 *)phys_to_virt(pt_phys);
+}
+
+static size_t rk_iommu_unmap_iova(struct rk_iommu_domain *rk_domain,
+				  u32 *pte_addr, dma_addr_t iova, size_t size)
+{
+	unsigned int pte_count;
+	unsigned int pte_total = size / SPAGE_SIZE;
+
+	assert_spin_locked(&rk_domain->dt_lock);
+
+	for (pte_count = 0; pte_count < pte_total; pte_count++) {
+		u32 pte = pte_addr[pte_count];
+		if (!rk_pte_is_page_valid(pte))
+			break;
+
+		pte_addr[pte_count] = rk_mk_pte_invalid(pte);
+	}
+
+	rk_table_flush(pte_addr, pte_count);
+
+	return pte_count * SPAGE_SIZE;
+}
+
+static int rk_iommu_map_iova(struct rk_iommu_domain *rk_domain, u32 *pte_addr,
+			     dma_addr_t iova, phys_addr_t paddr, size_t size,
+			     int prot)
+{
+	unsigned int pte_count;
+	unsigned int pte_total = size / SPAGE_SIZE;
+	phys_addr_t page_phys;
+
+	assert_spin_locked(&rk_domain->dt_lock);
+
+	for (pte_count = 0; pte_count < pte_total; pte_count++) {
+		u32 pte = pte_addr[pte_count];
+
+		if (rk_pte_is_page_valid(pte))
+			goto unwind;
+
+		pte_addr[pte_count] = rk_mk_pte(paddr, prot);
+
+		paddr += SPAGE_SIZE;
+	}
+
+	rk_table_flush(pte_addr, pte_count);
+
+	return 0;
+unwind:
+	/* Unmap the range of iovas that we just mapped */
+	rk_iommu_unmap_iova(rk_domain, pte_addr, iova, pte_count * SPAGE_SIZE);
+
+	iova += pte_count * SPAGE_SIZE;
+	page_phys = rk_pte_page_address(pte_addr[pte_count]);
+	pr_err("iova: %pad already mapped to %pa cannot remap to phys: %pa prot:%#x\n",
+	       &iova, &page_phys, &paddr, prot);
+
+	return -EADDRINUSE;
+}
+
+static int rk_iommu_map(struct iommu_domain *domain, unsigned long _iova,
+			phys_addr_t paddr, size_t size, int prot)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	dma_addr_t iova = (dma_addr_t)_iova;
+	u32 *page_table, *pte_addr;
+	int ret;
+
+	spin_lock_irqsave(&rk_domain->dt_lock, flags);
+
+	/*
+	 * pgsize_bitmap specifies iova sizes that fit in one page table
+	 * (1024 4-KiB pages = 4 MiB).
+	 * So, size will always be 4096 <= size <= 4194304.
+	 * Since iommu_map() guarantees that both iova and size will be
+	 * aligned, we will always only be mapping from a single dte here.
+	 */
+	page_table = rk_dte_get_page_table(rk_domain, iova);
+	if (IS_ERR(page_table)) {
+		spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+		return PTR_ERR(page_table);
+	}
+
+	pte_addr = &page_table[rk_iova_pte_index(iova)];
+	ret = rk_iommu_map_iova(rk_domain, pte_addr, iova, paddr, size, prot);
+	spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+
+	return ret;
+}
+
+static size_t rk_iommu_unmap(struct iommu_domain *domain, unsigned long _iova,
+			     size_t size)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	dma_addr_t iova = (dma_addr_t)_iova;
+	phys_addr_t pt_phys;
+	u32 dte;
+	u32 *pte_addr;
+	size_t unmap_size;
+
+	spin_lock_irqsave(&rk_domain->dt_lock, flags);
+
+	/*
+	 * pgsize_bitmap specifies iova sizes that fit in one page table
+	 * (1024 4-KiB pages = 4 MiB).
+	 * So, size will always be 4096 <= size <= 4194304.
+	 * Since iommu_unmap() guarantees that both iova and size will be
+	 * aligned, we will always only be unmapping from a single dte here.
+	 */
+	dte = rk_domain->dt[rk_iova_dte_index(iova)];
+	/* Just return 0 if iova is unmapped */
+	if (!rk_dte_is_pt_valid(dte)) {
+		spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+		return 0;
+	}
+
+	pt_phys = rk_dte_pt_address(dte);
+	pte_addr = (u32 *)phys_to_virt(pt_phys) + rk_iova_pte_index(iova);
+	unmap_size = rk_iommu_unmap_iova(rk_domain, pte_addr, iova, size);
+
+	spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+
+	/* Shootdown iotlb entries for iova range that was just unmapped */
+	rk_iommu_zap_iova(rk_domain, iova, unmap_size);
+
+	return unmap_size;
+}
+
+static int rk_iommu_attach_device(struct iommu_domain *domain,
+				  struct device *dev)
+{
+	struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	int ret;
+	phys_addr_t dte_addr;
+
+	/*
+	 * Allow 'virtual devices' (e.g., drm) to attach to domain.
+	 * Such a device has a NULL archdata.iommu.
+	 */
+	if (!iommu)
+		return 0;
+
+	ret = rk_iommu_enable_stall(iommu);
+	if (ret)
+		return ret;
+
+	ret = rk_iommu_force_reset(iommu);
+	if (ret)
+		return ret;
+
+	iommu->domain = domain;
+
+	ret = devm_request_irq(dev, iommu->irq, rk_iommu_irq,
+			       IRQF_SHARED, dev_name(dev), iommu);
+	if (ret)
+		return ret;
+
+	dte_addr = virt_to_phys(rk_domain->dt);
+	rk_iommu_write(iommu, RK_MMU_DTE_ADDR, dte_addr);
+	rk_iommu_command(iommu, RK_MMU_CMD_ZAP_CACHE);
+	rk_iommu_write(iommu, RK_MMU_INT_MASK, RK_MMU_IRQ_MASK);
+
+	ret = rk_iommu_enable_paging(iommu);
+	if (ret)
+		return ret;
+
+	spin_lock_irqsave(&rk_domain->iommus_lock, flags);
+	list_add_tail(&iommu->node, &rk_domain->iommus);
+	spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
+
+	dev_info(dev, "Attached to iommu domain\n");
+
+	rk_iommu_disable_stall(iommu);
+
+	return 0;
+}
+
+static void rk_iommu_detach_device(struct iommu_domain *domain,
+				   struct device *dev)
+{
+	struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+
+	/* Allow 'virtual devices' (e.g., drm) to detach from domain */
+	if (!iommu)
+		return;
+
+	iommu->domain = NULL;
+
+	spin_lock_irqsave(&rk_domain->iommus_lock, flags);
+	list_del_init(&iommu->node);
+	spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
+
+	devm_free_irq(dev, iommu->irq, iommu);
+
+	iommu->domain = NULL;
+
+	/* Ignore error while disabling, just keep going */
+	rk_iommu_enable_stall(iommu);
+	rk_iommu_disable_paging(iommu);
+	rk_iommu_write(iommu, RK_MMU_INT_MASK, 0);
+	rk_iommu_write(iommu, RK_MMU_DTE_ADDR, 0);
+	rk_iommu_disable_stall(iommu);
+
+	dev_info(dev, "Detached from iommu domain\n");
+}
+
+static int rk_iommu_domain_init(struct iommu_domain *domain)
+{
+	struct rk_iommu_domain *rk_domain;
+
+	rk_domain = kzalloc(sizeof(*rk_domain), GFP_KERNEL);
+	if (!rk_domain)
+		return -ENOMEM;
+
+	/*
+	 * rk32xx iommus use a 2 level pagetable.
+	 * Each level1 (dt) and level2 (pt) table has 1024 4-byte entries.
+	 * Allocate one 4 KiB page for each table.
+	 */
+	rk_domain->dt = (u32 *)get_zeroed_page(GFP_KERNEL | GFP_DMA32);
+	if (!rk_domain->dt)
+		goto err_dt;
+
+	rk_table_flush(rk_domain->dt, NUM_DT_ENTRIES);
+
+	spin_lock_init(&rk_domain->iommus_lock);
+	spin_lock_init(&rk_domain->dt_lock);
+	INIT_LIST_HEAD(&rk_domain->iommus);
+
+	domain->priv = rk_domain;
+
+	return 0;
+err_dt:
+	kfree(rk_domain);
+	return -ENOMEM;
+}
+
+static void rk_iommu_domain_destroy(struct iommu_domain *domain)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	int i;
+
+	WARN_ON(!list_empty(&rk_domain->iommus));
+
+	for (i = 0; i < NUM_DT_ENTRIES; i++) {
+		u32 dte = rk_domain->dt[i];
+		if (rk_dte_is_pt_valid(dte)) {
+			phys_addr_t pt_phys = rk_dte_pt_address(dte);
+			u32 *page_table = phys_to_virt(pt_phys);
+			free_page((unsigned long)page_table);
+		}
+	}
+
+	free_page((unsigned long)rk_domain->dt);
+	kfree(domain->priv);
+	domain->priv = NULL;
+}
+
+static const struct iommu_ops rk_iommu_ops = {
+	.domain_init = rk_iommu_domain_init,
+	.domain_destroy = rk_iommu_domain_destroy,
+	.attach_dev = rk_iommu_attach_device,
+	.detach_dev = rk_iommu_detach_device,
+	.map = rk_iommu_map,
+	.unmap = rk_iommu_unmap,
+	.iova_to_phys = rk_iommu_iova_to_phys,
+	.pgsize_bitmap = RK_IOMMU_PGSIZE_BITMAP,
+};
+
+static int rk_iommu_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct rk_iommu *iommu;
+	struct resource *res;
+
+	iommu = devm_kzalloc(dev, sizeof(*iommu), GFP_KERNEL);
+	if (!iommu)
+		return -ENOMEM;
+
+	platform_set_drvdata(pdev, iommu);
+	iommu->dev = dev;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	iommu->base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(iommu->base))
+		return PTR_ERR(iommu->base);
+
+	iommu->irq = platform_get_irq(pdev, 0);
+	if (iommu->irq < 0) {
+		dev_err(dev, "Failed to get IRQ, %d\n", iommu->irq);
+		return -ENXIO;
+	}
+
+	return 0;
+}
+
+static int rk_iommu_remove(struct platform_device *pdev)
+{
+	return 0;
+}
+
+#ifdef CONFIG_OF
+static const struct of_device_id rk_iommu_dt_ids[] = {
+	{ .compatible = "rockchip,iommu" },
+	{ /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, rk_iommu_dt_ids);
+#endif
+
+static struct platform_driver rk_iommu_driver = {
+	.probe = rk_iommu_probe,
+	.remove = rk_iommu_remove,
+	.driver = {
+		   .name = "rk_iommu",
+		   .owner = THIS_MODULE,
+		   .of_match_table = of_match_ptr(rk_iommu_dt_ids),
+	},
+};
+
+static int __init rk_iommu_init(void)
+{
+	int ret;
+
+	ret = bus_set_iommu(&platform_bus_type, &rk_iommu_ops);
+	if (ret)
+		return ret;
+
+	return platform_driver_register(&rk_iommu_driver);
+}
+static void __exit rk_iommu_exit(void)
+{
+	platform_driver_unregister(&rk_iommu_driver);
+}
+
+subsys_initcall(rk_iommu_init);
+module_exit(rk_iommu_exit);
+
+MODULE_DESCRIPTION("IOMMU API for Rockchip");
+MODULE_AUTHOR("Simon Xue <xxm@rock-chips.com> and Daniel Kurtz <djkurtz@chromium.org>");
+MODULE_ALIAS("platform:rockchip-iommu");
+MODULE_LICENSE("GPL v2");
-- 
2.1.0.rc2.206.gedb03e5


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 1/3] iommu/rockchip: rk3288 iommu driver
@ 2014-10-14  8:02   ` Daniel Kurtz
  0 siblings, 0 replies; 26+ messages in thread
From: Daniel Kurtz @ 2014-10-14  8:02 UTC (permalink / raw)
  To: linux-arm-kernel

The rk3288 has several iommus.  Each iommu belongs to a single master
device.  There is one device (ISP) that has two slave iommus, but that
case is not yet supported by this driver.

At subsys init, the iommu driver registers itself as the iommu driver for
the platform bus.  The master devices find their slave iommus using the
"iommus" field in their devicetree description.  Since each slave iommu
belongs to exactly one master, there is no additional data needed at probe
to associate a slave with its master.

An iommu device's power domain, clock and irq are all shared with its
master device, and the master device must be careful to attach to the
iommu only after powering and clocking it (and to keep it powered and
clocked until after detaching).  Because there is no guarantee what the status
of the iommu is at probe, and since the driver does not even know if the
device is powered, we delay requesting its irq until the master device
attaches, at which point we have a guarantee that the device is powered
and clocked and we can reset it and disable its interrupt mask.

An iommu_domain describes a virtual iova address space.  Each iommu_domain
has a corresponding page table that lists the mappings from iova to
physical address.

For the rk3288 iommu, the page table has two levels:
 The Level 1 "directory_table" has 1024 4-byte dte entries.
 Each dte points to a level 2 "page_table".
 Each level 2 page_table has 1024 4-byte pte entries.
 Each pte points to a 4 KiB page of memory.
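
As a rough illustration of the two-level layout above, the following
userspace sketch splits a 32-bit iova into its directory index, page table
index and page offset.  The field positions follow the RK_IOVA_* masks and
shifts in the patch; the struct and function names (iova_fields, iova_split,
iova_join, iova_space_size) are hypothetical and exist only for this
example, they are not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout as described by the patch's RK_IOVA_* masks and shifts. */
#define IOVA_DTE_SHIFT   22
#define IOVA_PTE_SHIFT   12
#define IOVA_INDEX_MASK  0x3ffu	/* 10 bits: 1024 entries per table */
#define IOVA_OFFSET_MASK 0xfffu	/* 12 bits: 4 KiB page */

/* Hypothetical helper type, used only for this illustration. */
struct iova_fields {
	uint32_t dte_index;	/* index into the level 1 directory table */
	uint32_t pte_index;	/* index into the level 2 page table */
	uint32_t page_offset;	/* byte offset into the 4 KiB page */
};

/* Split an iova into the three fields the iommu uses for its table walk. */
static struct iova_fields iova_split(uint32_t iova)
{
	struct iova_fields f = {
		.dte_index   = (iova >> IOVA_DTE_SHIFT) & IOVA_INDEX_MASK,
		.pte_index   = (iova >> IOVA_PTE_SHIFT) & IOVA_INDEX_MASK,
		.page_offset = iova & IOVA_OFFSET_MASK,
	};
	return f;
}

/* Inverse of iova_split(): rebuild the iova from its three fields. */
static uint32_t iova_join(struct iova_fields f)
{
	assert(f.dte_index <= IOVA_INDEX_MASK);
	assert(f.pte_index <= IOVA_INDEX_MASK);
	assert(f.page_offset <= IOVA_OFFSET_MASK);
	return (f.dte_index << IOVA_DTE_SHIFT) |
	       (f.pte_index << IOVA_PTE_SHIFT) |
	       f.page_offset;
}

/* Bytes of iova space covered by one fully populated directory table. */
static uint64_t iova_space_size(void)
{
	return 1024ull * 1024ull * 4096ull;
}
```

With 1024 dtes, 1024 ptes per page table and 4 KiB pages, one directory
table spans 1024 * 1024 * 4 KiB = 4 GiB, i.e. the whole 32-bit iova space.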

An iommu_domain is created when a dma_iommu_mapping is created via
arm_iommu_create_mapping().  Master devices can then attach themselves to
this mapping (or attach the mapping to themselves?) by calling
arm_iommu_attach_device().  This in turn instructs the iommu driver to
write the physical address of the page table's level 1 directory table
into the slave iommu's directory table address register (MMU_DTE_ADDR).

In fact multiple master devices, each with their own slave iommu device,
can all attach to the same mapping.  The iommus for these devices will
share the same iommu_domain and therefore point to the same page table.
Thus, the iommu domain maintains a list of iommu devices which are
attached.  This driver relies on the iommu core to ensure that all devices
have detached before destroying a domain.
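
The sharing described above can be modelled with a small userspace sketch:
one domain owns a single directory table and a list of attached iommus, and
an invalidation of an iova range is repeated on every iommu in that list.
All names here (fake_iommu, fake_domain, domain_attach, domain_detach,
domain_zap) are hypothetical stand-ins rather than the driver's structures,
the register write is simulated by a counter, and the locking that the
driver does around its list is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u

/* Hypothetical stand-in for one slave iommu; not the driver's struct. */
struct fake_iommu {
	struct fake_iommu *next;	/* next iommu attached to the domain */
	unsigned int zapped;		/* simulated per-page zap writes */
	uint32_t last_zapped;		/* last iova that was zapped */
};

/* Hypothetical stand-in for a domain: one directory table, many iommus. */
struct fake_domain {
	uint32_t *dt;			/* shared level 1 directory table */
	struct fake_iommu *iommus;	/* list of attached iommus */
};

/* Add an iommu to the domain's list; it must not already be on a list. */
static void domain_attach(struct fake_domain *d, struct fake_iommu *m)
{
	assert(m->next == NULL);
	m->next = d->iommus;
	d->iommus = m;
}

/* Remove an iommu from the list; returns 0, or -1 if it was not attached. */
static int domain_detach(struct fake_domain *d, struct fake_iommu *m)
{
	struct fake_iommu **pp;

	for (pp = &d->iommus; *pp; pp = &(*pp)->next) {
		if (*pp == m) {
			*pp = m->next;
			m->next = NULL;
			return 0;
		}
	}
	return -1;
}

/* Zap [iova, iova + size) page by page on every attached iommu. */
static void domain_zap(struct fake_domain *d, uint32_t iova, uint32_t size)
{
	struct fake_iommu *m;
	uint32_t off;

	for (m = d->iommus; m; m = m->next) {
		for (off = 0; off < size; off += PAGE_SZ) {
			m->last_zapped = iova + off;
			m->zapped++;
		}
	}
}
```

Because every attached iommu walks the same tables, a page table change has
to be followed by an invalidation on each of them, which is why the domain
needs to keep the list at all.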

Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Simon Xue <xxm@rock-chips.com>
Reviewed-by: Grant Grundler <grundler@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
---
 drivers/iommu/Kconfig          |  12 +
 drivers/iommu/Makefile         |   1 +
 drivers/iommu/rockchip-iommu.c | 924 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 937 insertions(+)
 create mode 100644 drivers/iommu/rockchip-iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index dd51122..d0a1261 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -152,6 +152,18 @@ config OMAP_IOMMU_DEBUG
 
          Say N unless you know you need this.
 
+config ROCKCHIP_IOMMU
+	bool "Rockchip IOMMU Support"
+	depends on ARCH_ROCKCHIP
+	select IOMMU_API
+	select ARM_DMA_USE_IOMMU
+	help
+	  Support for IOMMUs found on Rockchip rk32xx SoCs.
+	  These IOMMUs allow virtualization of the address space used by most
+	  cores within the multimedia subsystem.
+	  Say Y here if you are using a Rockchip SoC that includes an IOMMU
+	  device.
+
 config TEGRA_IOMMU_GART
 	bool "Tegra GART IOMMU Support"
 	depends on ARCH_TEGRA_2x_SOC
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 16edef7..3e47ef3 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -13,6 +13,7 @@ obj-$(CONFIG_IRQ_REMAP) += intel_irq_remapping.o irq_remapping.o
 obj-$(CONFIG_OMAP_IOMMU) += omap-iommu.o
 obj-$(CONFIG_OMAP_IOMMU) += omap-iommu2.o
 obj-$(CONFIG_OMAP_IOMMU_DEBUG) += omap-iommu-debug.o
+obj-$(CONFIG_ROCKCHIP_IOMMU) += rockchip-iommu.o
 obj-$(CONFIG_TEGRA_IOMMU_GART) += tegra-gart.o
 obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
new file mode 100644
index 0000000..08e50fc
--- /dev/null
+++ b/drivers/iommu/rockchip-iommu.c
@@ -0,0 +1,924 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <asm/cacheflush.h>
+#include <asm/pgtable.h>
+#include <linux/compiler.h>
+#include <linux/delay.h>
+#include <linux/device.h>
+#include <linux/errno.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/jiffies.h>
+#include <linux/list.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+/** MMU register offsets */
+#define RK_MMU_DTE_ADDR		0x00	/* Directory table address */
+#define RK_MMU_STATUS		0x04
+#define RK_MMU_COMMAND		0x08
+#define RK_MMU_PAGE_FAULT_ADDR	0x0C	/* IOVA of last page fault */
+#define RK_MMU_ZAP_ONE_LINE	0x10	/* Shootdown one IOTLB entry */
+#define RK_MMU_INT_RAWSTAT	0x14	/* IRQ status ignoring mask */
+#define RK_MMU_INT_CLEAR	0x18	/* Acknowledge and re-arm irq */
+#define RK_MMU_INT_MASK		0x1C	/* IRQ enable */
+#define RK_MMU_INT_STATUS	0x20	/* IRQ status after masking */
+#define RK_MMU_AUTO_GATING	0x24
+
+#define DTE_ADDR_DUMMY		0xCAFEBABE
+#define FORCE_RESET_TIMEOUT	100	/* ms */
+
+/* RK_MMU_STATUS fields */
+#define RK_MMU_STATUS_PAGING_ENABLED       BIT(0)
+#define RK_MMU_STATUS_PAGE_FAULT_ACTIVE    BIT(1)
+#define RK_MMU_STATUS_STALL_ACTIVE         BIT(2)
+#define RK_MMU_STATUS_IDLE                 BIT(3)
+#define RK_MMU_STATUS_REPLAY_BUFFER_EMPTY  BIT(4)
+#define RK_MMU_STATUS_PAGE_FAULT_IS_WRITE  BIT(5)
+#define RK_MMU_STATUS_STALL_NOT_ACTIVE     BIT(31)
+
+/* RK_MMU_COMMAND command values */
+#define RK_MMU_CMD_ENABLE_PAGING    0  /* Enable memory translation */
+#define RK_MMU_CMD_DISABLE_PAGING   1  /* Disable memory translation */
+#define RK_MMU_CMD_ENABLE_STALL     2  /* Stall paging to allow other cmds */
+#define RK_MMU_CMD_DISABLE_STALL    3  /* Stop stall re-enables paging */
+#define RK_MMU_CMD_ZAP_CACHE        4  /* Shoot down entire IOTLB */
+#define RK_MMU_CMD_PAGE_FAULT_DONE  5  /* Clear page fault */
+#define RK_MMU_CMD_FORCE_RESET      6  /* Reset all registers */
+
+/* RK_MMU_INT_* register fields */
+#define RK_MMU_IRQ_PAGE_FAULT    0x01  /* page fault */
+#define RK_MMU_IRQ_BUS_ERROR     0x02  /* bus read error */
+#define RK_MMU_IRQ_MASK          (RK_MMU_IRQ_PAGE_FAULT | RK_MMU_IRQ_BUS_ERROR)
+
+#define NUM_DT_ENTRIES 1024
+#define NUM_PT_ENTRIES 1024
+
+#define SPAGE_ORDER 12
+#define SPAGE_SIZE (1 << SPAGE_ORDER)
+
+/*
+ * Support mapping any size that fits in one page table:
+ *   4 KiB to 4 MiB
+ */
+#define RK_IOMMU_PGSIZE_BITMAP 0x007ff000
+
+#define IOMMU_REG_POLL_COUNT_FAST 1000
+
+struct rk_iommu_domain {
+	struct list_head iommus;
+	u32 *dt; /* page directory table */
+	spinlock_t iommus_lock; /* lock for iommus list */
+	spinlock_t dt_lock; /* lock for modifying page directory table */
+};
+
+struct rk_iommu {
+	struct device *dev;
+	void __iomem *base;
+	int irq;
+	struct list_head node; /* entry in rk_iommu_domain.iommus */
+	struct iommu_domain *domain; /* domain to which iommu is attached */
+};
+
+static inline void rk_table_flush(u32 *va, unsigned int count)
+{
+	phys_addr_t pa_start = virt_to_phys(va);
+	phys_addr_t pa_end = virt_to_phys(va + count);
+	size_t size = pa_end - pa_start;
+
+	__cpuc_flush_dcache_area(va, size);
+	outer_flush_range(pa_start, pa_end);
+}
+
+/**
+ * Inspired by _wait_for in intel_drv.h
+ * This is NOT safe for use in interrupt context.
+ *
+ * Note that it's important that we check the condition again after having
+ * timed out, since the timeout could be due to preemption or similar and
+ * we've never had a chance to check the condition before the timeout.
+ */
+#define rk_wait_for(COND, MS) ({ \
+	unsigned long timeout__ = jiffies + msecs_to_jiffies(MS) + 1;	\
+	int ret__ = 0;							\
+	while (!(COND)) {						\
+		if (time_after(jiffies, timeout__)) {			\
+			ret__ = (COND) ? 0 : -ETIMEDOUT;		\
+			break;						\
+		}							\
+		usleep_range(50, 100);					\
+	}								\
+	ret__;								\
+})
+
+/*
+ * The Rockchip rk3288 iommu uses a 2-level page table.
+ * The first level is the "Directory Table" (DT).
+ * The DT consists of 1024 4-byte Directory Table Entries (DTEs), each pointing
+ * to a "Page Table".
+ * The second level is the 1024 Page Tables (PT).
+ * Each PT consists of 1024 4-byte Page Table Entries (PTEs), each pointing to
+ * a 4 KB page of physical memory.
+ *
+ * The DT and each PT fits in a single 4 KB page (4-bytes * 1024 entries).
+ * Each iommu device has a MMU_DTE_ADDR register that contains the physical
+ * address of the start of the DT page.
+ *
+ * The structure of the page table is as follows:
+ *
+ *                   DT
+ * MMU_DTE_ADDR -> +-----+
+ *                 |     |
+ *                 +-----+     PT
+ *                 | DTE | -> +-----+
+ *                 +-----+    |     |     Memory
+ *                 |     |    +-----+     Page
+ *                 |     |    | PTE | -> +-----+
+ *                 +-----+    +-----+    |     |
+ *                            |     |    |     |
+ *                            |     |    |     |
+ *                            +-----+    |     |
+ *                                       |     |
+ *                                       |     |
+ *                                       +-----+
+ */
+
+/*
+ * Each DTE has a PT address and a valid bit:
+ * +---------------------+-----------+-+
+ * | PT address          | Reserved  |V|
+ * +---------------------+-----------+-+
+ *  31:12 - PT address (PTs always start on a 4 KB boundary)
+ *  11: 1 - Reserved
+ *      0 - 1 if PT @ PT address is valid
+ */
+#define RK_DTE_PT_ADDRESS_MASK    0xfffff000
+#define RK_DTE_PT_VALID           BIT(0)
+
+static inline phys_addr_t rk_dte_pt_address(u32 dte)
+{
+	return (phys_addr_t)dte & RK_DTE_PT_ADDRESS_MASK;
+}
+
+static inline bool rk_dte_is_pt_valid(u32 dte)
+{
+	return dte & RK_DTE_PT_VALID;
+}
+
+static u32 rk_mk_dte(u32 *pt)
+{
+	phys_addr_t pt_phys = virt_to_phys(pt);
+	return (pt_phys & RK_DTE_PT_ADDRESS_MASK) | RK_DTE_PT_VALID;
+}
+
+/*
+ * Each PTE has a Page address, some flags and a valid bit:
+ * +---------------------+---+-------+-+
+ * | Page address        |Rsv| Flags |V|
+ * +---------------------+---+-------+-+
+ *  31:12 - Page address (Pages always start on a 4 KB boundary)
+ *  11: 9 - Reserved
+ *   8: 1 - Flags
+ *      8 - Read allocate - allocate cache space on read misses
+ *      7 - Read cache - enable cache & prefetch of data
+ *      6 - Write buffer - enable delaying writes on their way to memory
+ *      5 - Write allocate - allocate cache space on write misses
+ *      4 - Write cache - different writes can be merged together
+ *      3 - Override cache attributes
+ *          if 1, bits 4-8 control cache attributes
+ *          if 0, the system bus defaults are used
+ *      2 - Writable
+ *      1 - Readable
+ *      0 - 1 if Page @ Page address is valid
+ */
+#define RK_PTE_PAGE_ADDRESS_MASK  0xfffff000
+#define RK_PTE_PAGE_FLAGS_MASK    0x000001fe
+#define RK_PTE_PAGE_WRITABLE      BIT(2)
+#define RK_PTE_PAGE_READABLE      BIT(1)
+#define RK_PTE_PAGE_VALID         BIT(0)
+
+static inline phys_addr_t rk_pte_page_address(u32 pte)
+{
+	return (phys_addr_t)pte & RK_PTE_PAGE_ADDRESS_MASK;
+}
+
+static inline bool rk_pte_is_page_valid(u32 pte)
+{
+	return pte & RK_PTE_PAGE_VALID;
+}
+
+/* TODO: set cache flags per prot IOMMU_CACHE */
+static u32 rk_mk_pte(phys_addr_t page, int prot)
+{
+	u32 flags = 0;
+	flags |= (prot & IOMMU_READ) ? RK_PTE_PAGE_READABLE : 0;
+	flags |= (prot & IOMMU_WRITE) ? RK_PTE_PAGE_WRITABLE : 0;
+	page &= RK_PTE_PAGE_ADDRESS_MASK;
+	return page | flags | RK_PTE_PAGE_VALID;
+}
+
+static u32 rk_mk_pte_invalid(u32 pte)
+{
+	return pte & ~RK_PTE_PAGE_VALID;
+}
+
+/*
+ * rk3288 iova (IOMMU Virtual Address) format
+ *  31       22.21       12.11          0
+ * +-----------+-----------+-------------+
+ * | DTE index | PTE index | Page offset |
+ * +-----------+-----------+-------------+
+ *  31:22 - DTE index   - index of DTE in DT
+ *  21:12 - PTE index   - index of PTE in PT @ DTE.pt_address
+ *  11: 0 - Page offset - offset into page @ PTE.page_address
+ */
+#define RK_IOVA_DTE_MASK    0xffc00000
+#define RK_IOVA_DTE_SHIFT   22
+#define RK_IOVA_PTE_MASK    0x003ff000
+#define RK_IOVA_PTE_SHIFT   12
+#define RK_IOVA_PAGE_MASK   0x00000fff
+#define RK_IOVA_PAGE_SHIFT  0
+
+static u32 rk_iova_dte_index(dma_addr_t iova)
+{
+	return (u32)(iova & RK_IOVA_DTE_MASK) >> RK_IOVA_DTE_SHIFT;
+}
+
+static u32 rk_iova_pte_index(dma_addr_t iova)
+{
+	return (u32)(iova & RK_IOVA_PTE_MASK) >> RK_IOVA_PTE_SHIFT;
+}
+
+static u32 rk_iova_page_offset(dma_addr_t iova)
+{
+	return (u32)(iova & RK_IOVA_PAGE_MASK) >> RK_IOVA_PAGE_SHIFT;
+}
+
+static u32 rk_iommu_read(struct rk_iommu *iommu, u32 offset)
+{
+	return readl(iommu->base + offset);
+}
+
+static void rk_iommu_write(struct rk_iommu *iommu, u32 offset, u32 value)
+{
+	writel(value, iommu->base + offset);
+}
+
+static void rk_iommu_command(struct rk_iommu *iommu, u32 command)
+{
+	writel(command, iommu->base + RK_MMU_COMMAND);
+}
+
+static void rk_iommu_zap_lines(struct rk_iommu *iommu, dma_addr_t iova,
+			       size_t size)
+{
+	dma_addr_t iova_end = iova + size;
+	/*
+	 * TODO(djkurtz): Figure out when it is more efficient to shootdown the
+	 * entire iotlb rather than iterate over individual iovas.
+	 */
+	for (; iova < iova_end; iova += SPAGE_SIZE)
+		rk_iommu_write(iommu, RK_MMU_ZAP_ONE_LINE, iova);
+}
+
+static bool rk_iommu_is_stall_active(struct rk_iommu *iommu)
+{
+	return rk_iommu_read(iommu, RK_MMU_STATUS) & RK_MMU_STATUS_STALL_ACTIVE;
+}
+
+static bool rk_iommu_is_paging_enabled(struct rk_iommu *iommu)
+{
+	return rk_iommu_read(iommu, RK_MMU_STATUS) &
+			     RK_MMU_STATUS_PAGING_ENABLED;
+}
+
+static int rk_iommu_enable_stall(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (rk_iommu_is_stall_active(iommu))
+		return 0;
+
+	/* Stall can only be enabled if paging is enabled */
+	if (!rk_iommu_is_paging_enabled(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_ENABLE_STALL);
+
+	ret = rk_wait_for(rk_iommu_is_stall_active(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Enable stall request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_disable_stall(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (!rk_iommu_is_stall_active(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_DISABLE_STALL);
+
+	ret = rk_wait_for(!rk_iommu_is_stall_active(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Disable stall request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_enable_paging(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (rk_iommu_is_paging_enabled(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_ENABLE_PAGING);
+
+	ret = rk_wait_for(rk_iommu_is_paging_enabled(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Enable paging request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_disable_paging(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (!rk_iommu_is_paging_enabled(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_DISABLE_PAGING);
+
+	ret = rk_wait_for(!rk_iommu_is_paging_enabled(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Disable paging request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_force_reset(struct rk_iommu *iommu)
+{
+	int ret;
+	u32 dte_addr;
+
+	/*
+	 * Check if register DTE_ADDR is working by writing DTE_ADDR_DUMMY
+	 * and verifying that upper 5 nybbles are read back.
+	 */
+	rk_iommu_write(iommu, RK_MMU_DTE_ADDR, DTE_ADDR_DUMMY);
+
+	dte_addr = rk_iommu_read(iommu, RK_MMU_DTE_ADDR);
+	if (dte_addr != (DTE_ADDR_DUMMY & RK_DTE_PT_ADDRESS_MASK)) {
+		dev_err(iommu->dev, "Error during raw reset. MMU_DTE_ADDR is not functioning\n");
+		return -EFAULT;
+	}
+
+	rk_iommu_command(iommu, RK_MMU_CMD_FORCE_RESET);
+
+	ret = rk_wait_for(rk_iommu_read(iommu, RK_MMU_DTE_ADDR) == 0x00000000,
+			  FORCE_RESET_TIMEOUT);
+	if (ret)
+		dev_err(iommu->dev, "FORCE_RESET command timed out\n");
+
+	return ret;
+}
+
+static void log_iova(struct rk_iommu *iommu, dma_addr_t iova)
+{
+	u32 dte_index, pte_index, page_offset;
+	u32 mmu_dte_addr;
+	phys_addr_t mmu_dte_addr_phys, dte_addr_phys;
+	u32 *dte_addr;
+	u32 dte;
+	phys_addr_t pte_addr_phys = 0;
+	u32 *pte_addr = NULL;
+	u32 pte = 0;
+	phys_addr_t page_addr_phys = 0;
+	u32 page_flags = 0;
+
+	dte_index = rk_iova_dte_index(iova);
+	pte_index = rk_iova_pte_index(iova);
+	page_offset = rk_iova_page_offset(iova);
+
+	mmu_dte_addr = rk_iommu_read(iommu, RK_MMU_DTE_ADDR);
+	mmu_dte_addr_phys = (phys_addr_t)mmu_dte_addr;
+
+	dte_addr_phys = mmu_dte_addr_phys + (4 * dte_index);
+	dte_addr = phys_to_virt(dte_addr_phys);
+	dte = *dte_addr;
+
+	if (!rk_dte_is_pt_valid(dte))
+		goto print_it;
+
+	pte_addr_phys = rk_dte_pt_address(dte) + (pte_index * 4);
+	pte_addr = phys_to_virt(pte_addr_phys);
+	pte = *pte_addr;
+
+	if (!rk_pte_is_page_valid(pte))
+		goto print_it;
+
+	page_addr_phys = rk_pte_page_address(pte) + page_offset;
+	page_flags = pte & RK_PTE_PAGE_FLAGS_MASK;
+
+print_it:
+	dev_err(iommu->dev, "iova = %pad: dte_index: 0x%03x pte_index: 0x%03x page_offset: 0x%03x\n",
+		&iova, dte_index, pte_index, page_offset);
+	dev_err(iommu->dev, "mmu_dte_addr: %pa dte@%pa: %#08x valid: %u pte@%pa: %#08x valid: %u page@%pa flags: %#03x\n",
+		&mmu_dte_addr_phys, &dte_addr_phys, dte,
+		rk_dte_is_pt_valid(dte), &pte_addr_phys, pte,
+		rk_pte_is_page_valid(pte), &page_addr_phys, page_flags);
+}
+
+static irqreturn_t rk_iommu_irq(int irq, void *dev_id)
+{
+	struct rk_iommu *iommu = dev_id;
+	u32 status;
+	u32 int_status;
+	dma_addr_t iova;
+
+	int_status = rk_iommu_read(iommu, RK_MMU_INT_STATUS);
+	if (int_status == 0)
+		return IRQ_NONE;
+
+	iova = rk_iommu_read(iommu, RK_MMU_PAGE_FAULT_ADDR);
+
+	if (int_status & RK_MMU_IRQ_PAGE_FAULT) {
+		int flags;
+
+		status = rk_iommu_read(iommu, RK_MMU_STATUS);
+		flags = (status & RK_MMU_STATUS_PAGE_FAULT_IS_WRITE) ?
+				IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
+
+		dev_err(iommu->dev, "Page fault at %pad of type %s\n",
+			&iova,
+			(flags == IOMMU_FAULT_WRITE) ? "write" : "read");
+
+		log_iova(iommu, iova);
+
+		/*
+		 * Report page fault to any installed handlers.
+		 * Ignore the return code, though, since we always zap cache
+		 * and clear the page fault anyway.
+		 */
+		if (iommu->domain)
+			report_iommu_fault(iommu->domain, iommu->dev, iova,
+					   flags);
+		else
+			dev_err(iommu->dev, "Page fault while iommu not attached to domain?\n");
+
+		rk_iommu_command(iommu, RK_MMU_CMD_ZAP_CACHE);
+		rk_iommu_command(iommu, RK_MMU_CMD_PAGE_FAULT_DONE);
+	}
+
+	if (int_status & RK_MMU_IRQ_BUS_ERROR)
+		dev_err(iommu->dev, "BUS_ERROR occurred at %pad\n", &iova);
+
+	if (int_status & ~RK_MMU_IRQ_MASK)
+		dev_err(iommu->dev, "unexpected int_status: %#08x\n",
+			int_status);
+
+	rk_iommu_write(iommu, RK_MMU_INT_CLEAR, int_status);
+
+	return IRQ_HANDLED;
+}
+
+static phys_addr_t rk_iommu_iova_to_phys(struct iommu_domain *domain,
+					 dma_addr_t iova)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	phys_addr_t pt_phys, phys = 0;
+	u32 dte, pte;
+	u32 *page_table;
+
+	spin_lock_irqsave(&rk_domain->dt_lock, flags);
+
+	dte = rk_domain->dt[rk_iova_dte_index(iova)];
+	if (!rk_dte_is_pt_valid(dte))
+		goto out;
+
+	pt_phys = rk_dte_pt_address(dte);
+	page_table = (u32 *)phys_to_virt(pt_phys);
+	pte = page_table[rk_iova_pte_index(iova)];
+	if (!rk_pte_is_page_valid(pte))
+		goto out;
+
+	phys = rk_pte_page_address(pte) + rk_iova_page_offset(iova);
+out:
+	spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+
+	return phys;
+}
+
+static void rk_iommu_zap_iova(struct rk_iommu_domain *rk_domain,
+			      dma_addr_t iova, size_t size)
+{
+	struct list_head *pos;
+	unsigned long flags;
+
+	/* shootdown these iova from all iommus using this domain */
+	spin_lock_irqsave(&rk_domain->iommus_lock, flags);
+	list_for_each(pos, &rk_domain->iommus) {
+		struct rk_iommu *iommu;
+		iommu = list_entry(pos, struct rk_iommu, node);
+		rk_iommu_zap_lines(iommu, iova, size);
+	}
+	spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
+}
+
+static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain,
+				  dma_addr_t iova)
+{
+	u32 *page_table, *dte_addr;
+	u32 dte;
+	phys_addr_t pt_phys;
+
+	assert_spin_locked(&rk_domain->dt_lock);
+
+	dte_addr = &rk_domain->dt[rk_iova_dte_index(iova)];
+	dte = *dte_addr;
+	if (rk_dte_is_pt_valid(dte))
+		goto done;
+
+	page_table = (u32 *)get_zeroed_page(GFP_ATOMIC | GFP_DMA32);
+	if (!page_table)
+		return ERR_PTR(-ENOMEM);
+
+	dte = rk_mk_dte(page_table);
+	*dte_addr = dte;
+
+	rk_table_flush(page_table, NUM_PT_ENTRIES);
+	rk_table_flush(dte_addr, 1);
+
+	/*
+	 * Zap the first iova of newly allocated page table so iommu evicts
+	 * old cached value of new dte from the iotlb.
+	 */
+	rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);
+
+done:
+	pt_phys = rk_dte_pt_address(dte);
+	return (u32 *)phys_to_virt(pt_phys);
+}
+
+static size_t rk_iommu_unmap_iova(struct rk_iommu_domain *rk_domain,
+				  u32 *pte_addr, dma_addr_t iova, size_t size)
+{
+	unsigned int pte_count;
+	unsigned int pte_total = size / SPAGE_SIZE;
+
+	assert_spin_locked(&rk_domain->dt_lock);
+
+	for (pte_count = 0; pte_count < pte_total; pte_count++) {
+		u32 pte = pte_addr[pte_count];
+		if (!rk_pte_is_page_valid(pte))
+			break;
+
+		pte_addr[pte_count] = rk_mk_pte_invalid(pte);
+	}
+
+	rk_table_flush(pte_addr, pte_count);
+
+	return pte_count * SPAGE_SIZE;
+}
+
+static int rk_iommu_map_iova(struct rk_iommu_domain *rk_domain, u32 *pte_addr,
+			     dma_addr_t iova, phys_addr_t paddr, size_t size,
+			     int prot)
+{
+	unsigned int pte_count;
+	unsigned int pte_total = size / SPAGE_SIZE;
+	phys_addr_t page_phys;
+
+	assert_spin_locked(&rk_domain->dt_lock);
+
+	for (pte_count = 0; pte_count < pte_total; pte_count++) {
+		u32 pte = pte_addr[pte_count];
+
+		if (rk_pte_is_page_valid(pte))
+			goto unwind;
+
+		pte_addr[pte_count] = rk_mk_pte(paddr, prot);
+
+		paddr += SPAGE_SIZE;
+	}
+
+	rk_table_flush(pte_addr, pte_count);
+
+	return 0;
+unwind:
+	/* Unmap the range of iovas that we just mapped */
+	rk_iommu_unmap_iova(rk_domain, pte_addr, iova, pte_count * SPAGE_SIZE);
+
+	iova += pte_count * SPAGE_SIZE;
+	page_phys = rk_pte_page_address(pte_addr[pte_count]);
+	pr_err("iova: %pad already mapped to %pa cannot remap to phys: %pa prot:%#x\n",
+	       &iova, &page_phys, &paddr, prot);
+
+	return -EADDRINUSE;
+}
+
+static int rk_iommu_map(struct iommu_domain *domain, unsigned long _iova,
+			phys_addr_t paddr, size_t size, int prot)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	dma_addr_t iova = (dma_addr_t)_iova;
+	u32 *page_table, *pte_addr;
+	int ret;
+
+	spin_lock_irqsave(&rk_domain->dt_lock, flags);
+
+	/*
+	 * pgsize_bitmap specifies iova sizes that fit in one page table
+	 * (1024 4-KiB pages = 4 MiB).
+	 * So, size will always be 4096 <= size <= 4194304.
+	 * Since iommu_map() guarantees that both iova and size will be
+	 * aligned, we will always only be mapping from a single dte here.
+	 */
+	page_table = rk_dte_get_page_table(rk_domain, iova);
+	if (IS_ERR(page_table)) {
+		spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+		return PTR_ERR(page_table);
+	}
+
+	pte_addr = &page_table[rk_iova_pte_index(iova)];
+	ret = rk_iommu_map_iova(rk_domain, pte_addr, iova, paddr, size, prot);
+	spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+
+	return ret;
+}
+
+static size_t rk_iommu_unmap(struct iommu_domain *domain, unsigned long _iova,
+			     size_t size)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	dma_addr_t iova = (dma_addr_t)_iova;
+	phys_addr_t pt_phys;
+	u32 dte;
+	u32 *pte_addr;
+	size_t unmap_size;
+
+	spin_lock_irqsave(&rk_domain->dt_lock, flags);
+
+	/*
+	 * pgsize_bitmap specifies iova sizes that fit in one page table
+	 * (1024 4-KiB pages = 4 MiB).
+	 * So, size will always be 4096 <= size <= 4194304.
+	 * Since iommu_unmap() guarantees that both iova and size will be
+	 * aligned, we will always only be unmapping from a single dte here.
+	 */
+	dte = rk_domain->dt[rk_iova_dte_index(iova)];
+	/* Just return 0 if iova is unmapped */
+	if (!rk_dte_is_pt_valid(dte)) {
+		spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+		return 0;
+	}
+
+	pt_phys = rk_dte_pt_address(dte);
+	pte_addr = (u32 *)phys_to_virt(pt_phys) + rk_iova_pte_index(iova);
+	unmap_size = rk_iommu_unmap_iova(rk_domain, pte_addr, iova, size);
+
+	spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+
+	/* Shootdown iotlb entries for iova range that was just unmapped */
+	rk_iommu_zap_iova(rk_domain, iova, unmap_size);
+
+	return unmap_size;
+}
+
+static int rk_iommu_attach_device(struct iommu_domain *domain,
+				  struct device *dev)
+{
+	struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	int ret;
+	phys_addr_t dte_addr;
+
+	/*
+	 * Allow 'virtual devices' (e.g., drm) to attach to domain.
+	 * Such a device has a NULL archdata.iommu.
+	 */
+	if (!iommu)
+		return 0;
+
+	ret = rk_iommu_enable_stall(iommu);
+	if (ret)
+		return ret;
+
+	ret = rk_iommu_force_reset(iommu);
+	if (ret)
+		return ret;
+
+	iommu->domain = domain;
+
+	ret = devm_request_irq(dev, iommu->irq, rk_iommu_irq,
+			       IRQF_SHARED, dev_name(dev), iommu);
+	if (ret)
+		return ret;
+
+	dte_addr = virt_to_phys(rk_domain->dt);
+	rk_iommu_write(iommu, RK_MMU_DTE_ADDR, dte_addr);
+	rk_iommu_command(iommu, RK_MMU_CMD_ZAP_CACHE);
+	rk_iommu_write(iommu, RK_MMU_INT_MASK, RK_MMU_IRQ_MASK);
+
+	ret = rk_iommu_enable_paging(iommu);
+	if (ret)
+		return ret;
+
+	spin_lock_irqsave(&rk_domain->iommus_lock, flags);
+	list_add_tail(&iommu->node, &rk_domain->iommus);
+	spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
+
+	dev_info(dev, "Attached to iommu domain\n");
+
+	rk_iommu_disable_stall(iommu);
+
+	return 0;
+}
+
+static void rk_iommu_detach_device(struct iommu_domain *domain,
+				   struct device *dev)
+{
+	struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+
+	/* Allow 'virtual devices' (e.g., drm) to detach from domain */
+	if (!iommu)
+		return;
+
+	iommu->domain = NULL;
+
+	spin_lock_irqsave(&rk_domain->iommus_lock, flags);
+	list_del_init(&iommu->node);
+	spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
+
+	devm_free_irq(dev, iommu->irq, iommu);
+
+	iommu->domain = NULL;
+
+	/* Ignore error while disabling, just keep going */
+	rk_iommu_enable_stall(iommu);
+	rk_iommu_disable_paging(iommu);
+	rk_iommu_write(iommu, RK_MMU_INT_MASK, 0);
+	rk_iommu_write(iommu, RK_MMU_DTE_ADDR, 0);
+	rk_iommu_disable_stall(iommu);
+
+	dev_info(dev, "Detached from iommu domain\n");
+}
+
+static int rk_iommu_domain_init(struct iommu_domain *domain)
+{
+	struct rk_iommu_domain *rk_domain;
+
+	rk_domain = kzalloc(sizeof(*rk_domain), GFP_KERNEL);
+	if (!rk_domain)
+		return -ENOMEM;
+
+	/*
+	 * rk32xx iommus use a 2 level pagetable.
+	 * Each level1 (dt) and level2 (pt) table has 1024 4-byte entries.
+	 * Allocate one 4 KiB page for each table.
+	 */
+	rk_domain->dt = (u32 *)get_zeroed_page(GFP_KERNEL | GFP_DMA32);
+	if (!rk_domain->dt)
+		goto err_dt;
+
+	rk_table_flush(rk_domain->dt, NUM_DT_ENTRIES);
+
+	spin_lock_init(&rk_domain->iommus_lock);
+	spin_lock_init(&rk_domain->dt_lock);
+	INIT_LIST_HEAD(&rk_domain->iommus);
+
+	domain->priv = rk_domain;
+
+	return 0;
+err_dt:
+	kfree(rk_domain);
+	return -ENOMEM;
+}
+
+static void rk_iommu_domain_destroy(struct iommu_domain *domain)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	int i;
+
+	WARN_ON(!list_empty(&rk_domain->iommus));
+
+	for (i = 0; i < NUM_DT_ENTRIES; i++) {
+		u32 dte = rk_domain->dt[i];
+		if (rk_dte_is_pt_valid(dte)) {
+			phys_addr_t pt_phys = rk_dte_pt_address(dte);
+			u32 *page_table = phys_to_virt(pt_phys);
+			free_page((unsigned long)page_table);
+		}
+	}
+
+	free_page((unsigned long)rk_domain->dt);
+	kfree(domain->priv);
+	domain->priv = NULL;
+}
+
+static const struct iommu_ops rk_iommu_ops = {
+	.domain_init = rk_iommu_domain_init,
+	.domain_destroy = rk_iommu_domain_destroy,
+	.attach_dev = rk_iommu_attach_device,
+	.detach_dev = rk_iommu_detach_device,
+	.map = rk_iommu_map,
+	.unmap = rk_iommu_unmap,
+	.iova_to_phys = rk_iommu_iova_to_phys,
+	.pgsize_bitmap = RK_IOMMU_PGSIZE_BITMAP,
+};
+
+static int rk_iommu_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct rk_iommu *iommu;
+	struct resource *res;
+
+	iommu = devm_kzalloc(dev, sizeof(*iommu), GFP_KERNEL);
+	if (!iommu)
+		return -ENOMEM;
+
+	platform_set_drvdata(pdev, iommu);
+	iommu->dev = dev;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	iommu->base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(iommu->base))
+		return PTR_ERR(iommu->base);
+
+	iommu->irq = platform_get_irq(pdev, 0);
+	if (iommu->irq < 0) {
+		dev_err(dev, "Failed to get IRQ, %d\n", iommu->irq);
+		return -ENXIO;
+	}
+
+	return 0;
+}
+
+static int rk_iommu_remove(struct platform_device *pdev)
+{
+	return 0;
+}
+
+#ifdef CONFIG_OF
+static const struct of_device_id rk_iommu_dt_ids[] = {
+	{ .compatible = "rockchip,iommu" },
+	{ /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, rk_iommu_dt_ids);
+#endif
+
+static struct platform_driver rk_iommu_driver = {
+	.probe = rk_iommu_probe,
+	.remove = rk_iommu_remove,
+	.driver = {
+		   .name = "rk_iommu",
+		   .owner = THIS_MODULE,
+		   .of_match_table = of_match_ptr(rk_iommu_dt_ids),
+	},
+};
+
+static int __init rk_iommu_init(void)
+{
+	int ret;
+
+	ret = bus_set_iommu(&platform_bus_type, &rk_iommu_ops);
+	if (ret)
+		return ret;
+
+	return platform_driver_register(&rk_iommu_driver);
+}
+static void __exit rk_iommu_exit(void)
+{
+	platform_driver_unregister(&rk_iommu_driver);
+}
+
+subsys_initcall(rk_iommu_init);
+module_exit(rk_iommu_exit);
+
+MODULE_DESCRIPTION("IOMMU API for Rockchip");
+MODULE_AUTHOR("Simon Xue <xxm@rock-chips.com> and Daniel Kurtz <djkurtz@chromium.org>");
+MODULE_ALIAS("platform:rockchip-iommu");
+MODULE_LICENSE("GPL v2");
-- 
2.1.0.rc2.206.gedb03e5

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 2/3] dt-bindings: iommu: Add documentation for rockchip iommu
       [not found] <1413273762-32489-1-git-send-email-djkurtz@chromium.org>
@ 2014-10-14  8:02   ` Daniel Kurtz
  2014-10-14  8:02   ` Daniel Kurtz
  2014-10-14  8:02   ` Daniel Kurtz
  2 siblings, 0 replies; 26+ messages in thread
From: Daniel Kurtz @ 2014-10-14  8:02 UTC (permalink / raw)
  Cc: Grant Grundler, Stéphane Marchesin, Daniel Kurtz, Simon Xue,
	Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	open list:OPEN FIRMWARE AND...,
	open list

Add binding documentation for Rockchip IOMMU.

Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Simon Xue <xxm@rock-chips.com>
---
 .../devicetree/bindings/iommu/rockchip,iommu.txt   | 26 ++++++++++++++++++++++
 1 file changed, 26 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/rockchip,iommu.txt

diff --git a/Documentation/devicetree/bindings/iommu/rockchip,iommu.txt b/Documentation/devicetree/bindings/iommu/rockchip,iommu.txt
new file mode 100644
index 0000000..9a55ac3
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/rockchip,iommu.txt
@@ -0,0 +1,26 @@
+Rockchip IOMMU
+==============
+
+A Rockchip DRM iommu translates io virtual addresses to physical addresses for
+its master device.  Each slave device is bound to a single master device, and
+shares its clocks, power domain and irq.
+
+Required properties:
+- compatible      : Should be "rockchip,iommu"
+- reg             : Address space for the configuration registers
+- interrupts      : Interrupt specifier for the IOMMU instance
+- interrupt-names : Interrupt name for the IOMMU instance
+- #iommu-cells    : Should be <0>.  This indicates the iommu is a
+                    "single-master" device, and needs no additional information
+                    to associate with its master device.  See:
+                    Documentation/devicetree/bindings/iommu/iommu.txt
+
+Example:
+
+	vopl_mmu: iommu@ff940300 {
+		compatible = "rockchip,iommu";
+		reg = <0xff940300 0x100>;
+		interrupts = <GIC_SPI 16 IRQ_TYPE_LEVEL_HIGH>;
+		interrupt-names = "vopl_mmu";
+		#iommu-cells = <0>;
+	};
-- 
2.1.0.rc2.206.gedb03e5


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 3/3] ARM: dts: rk3288: add VOP iommu nodes
       [not found] <1413273762-32489-1-git-send-email-djkurtz@chromium.org>
  2014-10-14  8:02   ` Daniel Kurtz
@ 2014-10-14  8:02   ` Daniel Kurtz
  2014-10-14  8:02   ` Daniel Kurtz
  2 siblings, 0 replies; 26+ messages in thread
From: Daniel Kurtz @ 2014-10-14  8:02 UTC (permalink / raw)
  Cc: Grant Grundler, Stéphane Marchesin, Daniel Kurtz, Simon Xue,
	Heiko Stuebner, Rob Herring, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, Russell King,
	moderated list:ARM/Rockchip SoC..., open list:ARM/Rockchip SoC...,
	open list:OPEN FIRMWARE AND...,
	open list

Add device nodes for the VOP iommus.
Device nodes for other iommus will be added in later patches.

The iommu nodes use the #iommu-cells property as described in:
  Documentation/devicetree/bindings/iommu/iommu.txt

Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Simon Xue <xxm@rock-chips.com>
---
 arch/arm/boot/dts/rk3288.dtsi | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/arch/arm/boot/dts/rk3288.dtsi b/arch/arm/boot/dts/rk3288.dtsi
index 5950b0a..df1170c 100644
--- a/arch/arm/boot/dts/rk3288.dtsi
+++ b/arch/arm/boot/dts/rk3288.dtsi
@@ -271,6 +271,24 @@
 		status = "disabled";
 	};
 
+	vopb_mmu: iommu@ff930300 {
+		compatible = "rockchip,iommu";
+		reg = <0xff930300 0x100>;
+		interrupts = <GIC_SPI 15 IRQ_TYPE_LEVEL_HIGH>;
+		interrupt-names = "vopb_mmu";
+		#iommu-cells = <0>;
+		status = "disabled";
+	};
+
+	vopl_mmu: iommu@ff940300 {
+		compatible = "rockchip,iommu";
+		reg = <0xff940300 0x100>;
+		interrupts = <GIC_SPI 16 IRQ_TYPE_LEVEL_HIGH>;
+		interrupt-names = "vopl_mmu";
+		#iommu-cells = <0>;
+		status = "disabled";
+	};
+
 	gic: interrupt-controller@ffc01000 {
 		compatible = "arm,gic-400";
 		interrupt-controller;
-- 
2.1.0.rc2.206.gedb03e5


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v5 1/3] iommu/rockchip: rk3288 iommu driver
@ 2014-10-17  2:22     ` Daniel Kurtz
  0 siblings, 0 replies; 26+ messages in thread
From: Daniel Kurtz @ 2014-10-17  2:22 UTC (permalink / raw)
  To: Daniel Kurtz
  Cc: Grant Grundler, Stéphane Marchesin, Simon Xue, Joerg Roedel,
	Heiko Stuebner, Grant Likely, Rob Herring, open list,
	open list:IOMMU DRIVERS, moderated list:ARM/Rockchip SoC...,
	open list:ARM/Rockchip SoC..., open list:OPEN FIRMWARE AND...

On Tue, Oct 14, 2014 at 4:02 PM, Daniel Kurtz <djkurtz@chromium.org> wrote:
> The rk3288 has several iommus.  Each iommu belongs to a single master
> device.  There is one device (ISP) that has two slave iommus, but that
> case is not yet supported by this driver.
>
> At subsys init, the iommu driver registers itself as the iommu driver for
> the platform bus.  The master devices find their slave iommus using the
> "iommus" field in their devicetree description.  Since each slave iommu
> belongs to exactly one master, there is no additional data needed at probe
> to associate a slave with its master.
>
> An iommu device's power domain, clock and irq are all shared with its
> master device, and the master device must be careful to attach to the
> iommu only after powering and clocking it (and leave it powered and
> clocked before detaching).  Because there is no guarantee what the status
> of the iommu is at probe, and since the driver does not even know if the
> device is powered, we delay requesting its irq until the master device
> attaches, at which point we have a guarantee that the device is powered
> and clocked and we can reset it and disable its interrupt mask.
>
> An iommu_domain describes a virtual iova address space.  Each iommu_domain
> has a corresponding page table that lists the mappings from iova to
> physical address.
>
> For the rk3288 iommu, the page table has two levels:
>  The Level 1 "directory_table" has 1024 4-byte dte entries.
>  Each dte points to a level 2 "page_table".
>  Each level 2 page_table has 1024 4-byte pte entries.
>  Each pte points to a 4 KiB page of memory.
>
> An iommu_domain is created when a dma_iommu_mapping is created via
> arm_iommu_create_mapping.  Master devices can then attach themselves to
> this mapping (or attach the mapping to themselves?) by calling
> arm_iommu_attach_device().  This in turn instructs the iommu driver to
> write the page table's physical address into the slave iommu's "Directory
> Table Entry" (DTE) register.
>
> In fact multiple master devices, each with their own slave iommu device,
> can all attach to the same mapping.  The iommus for these devices will
> share the same iommu_domain and therefore point to the same page table.
> Thus, the iommu domain maintains a list of iommu devices which are
> attached.  This driver relies on the iommu core to ensure that all devices
> have detached before destroying a domain.
>
> Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
> Signed-off-by: Simon Xue <xxm@rock-chips.com>
> Reviewed-by: Grant Grundler <grundler@chromium.org>
> Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
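
The two-level layout described in the quoted commit message can be sketched
in plain userspace C.  This is only an illustration, not the driver's code:
every sketch_* name is hypothetical, and the pool of page tables with fake
"physical" addresses is an assumption standing in for real memory.  The field
widths (10-bit DTE index, 10-bit PTE index, 12-bit page offset) and the valid
bit in bit 0 follow the description in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Layout from the patch description: 1024 DTEs, 1024 PTEs per table, 4 KiB pages. */
#define SKETCH_ENTRIES      1024u
#define SKETCH_PAGE_SHIFT   12u
#define SKETCH_VALID        0x1u
#define SKETCH_ADDR_MASK    0xfffff000u

/* Hypothetical memory model: page table n lives at fake address (n + 1) << 12. */
#define SKETCH_POOL_TABLES  4u

struct sketch_tables {
	uint32_t dt[SKETCH_ENTRIES];                     /* level 1 */
	uint32_t pt[SKETCH_POOL_TABLES][SKETCH_ENTRIES]; /* level 2 pool */
};

static uint32_t sketch_dte_index(uint32_t iova)   { return iova >> 22; }
static uint32_t sketch_pte_index(uint32_t iova)   { return (iova >> 12) & 0x3ffu; }
static uint32_t sketch_page_offset(uint32_t iova) { return iova & 0xfffu; }

/* 1024 tables * 1024 pages * 4 KiB: the full 32-bit iova space. */
static uint64_t sketch_address_space_bytes(void)
{
	return ((uint64_t)SKETCH_ENTRIES * SKETCH_ENTRIES) << SKETCH_PAGE_SHIFT;
}

/* Resolve the fake address stored in a DTE back to a table in the pool. */
static uint32_t *sketch_pt_from_dte(struct sketch_tables *t, uint32_t dte)
{
	uint32_t n = ((dte & SKETCH_ADDR_MASK) >> SKETCH_PAGE_SHIFT) - 1u;

	assert(n < SKETCH_POOL_TABLES);
	return t->pt[n];
}

/* Install one 4 KiB mapping, using pool slot 'slot' as the level 2 table. */
static void sketch_map_page(struct sketch_tables *t, uint32_t slot,
			    uint32_t iova, uint32_t page_phys)
{
	assert(slot < SKETCH_POOL_TABLES);
	t->dt[sketch_dte_index(iova)] =
		((slot + 1u) << SKETCH_PAGE_SHIFT) | SKETCH_VALID;
	t->pt[slot][sketch_pte_index(iova)] =
		(page_phys & SKETCH_ADDR_MASK) | SKETCH_VALID;
}

/* Walk both levels; returns 0 and fills *phys, or -1 if either entry is invalid. */
static int sketch_translate(struct sketch_tables *t, uint32_t iova,
			    uint32_t *phys)
{
	uint32_t dte = t->dt[sketch_dte_index(iova)];
	uint32_t pte;

	if (!(dte & SKETCH_VALID))
		return -1;

	pte = sketch_pt_from_dte(t, dte)[sketch_pte_index(iova)];
	if (!(pte & SKETCH_VALID))
		return -1;

	*phys = (pte & SKETCH_ADDR_MASK) | sketch_page_offset(iova);
	return 0;
}
```

For example, iova 0x12345678 splits into DTE index 0x048, PTE index 0x345 and
page offset 0x678, so mapping that page to 0x80001000 makes the walk return
0x80001678.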

Gentle ping.

Any more feedback on the rockchip iommu driver?

Thanks,
-Daniel
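
For readers of the quoted patch below: RK_IOMMU_PGSIZE_BITMAP is 0x007ff000,
which the comment describes as "any size that fits in one page table".  A
small userspace sketch of how that bitmap can be read follows.  The sketch_*
helpers are hypothetical and not part of the driver; they assume, as is usual
for such bitmaps, that each set bit n advertises a mappable size of 2^n bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PGSIZE_BITMAP  0x007ff000u  /* value taken from the patch */
#define SKETCH_SPAGE_SHIFT    12u          /* 4 KiB small page */

/* A size is supported if it is one power of two whose bit is in the bitmap. */
static bool sketch_size_supported(uint32_t size)
{
	bool is_pow2 = size != 0 && (size & (size - 1u)) == 0;

	return is_pow2 && (size & SKETCH_PGSIZE_BITMAP) != 0;
}

/* Lowest set bit of the bitmap: the smallest mappable size. */
static uint32_t sketch_smallest_size(void)
{
	return SKETCH_PGSIZE_BITMAP & (~SKETCH_PGSIZE_BITMAP + 1u);
}

/* Highest set bit of the bitmap: the largest mappable size. */
static uint32_t sketch_largest_size(void)
{
	uint32_t size = 0x80000000u;

	while (size != 0 && !(size & SKETCH_PGSIZE_BITMAP))
		size >>= 1;
	return size;
}

/* Number of 4 KiB PTEs needed to back one mapping of a supported size. */
static uint32_t sketch_pte_count(uint32_t size)
{
	assert(sketch_size_supported(size));
	return size >> SKETCH_SPAGE_SHIFT;
}
```

Read this way, the smallest size is 4 KiB (one PTE) and the largest is 4 MiB
(1024 PTEs, exactly one full page table), which matches the comment above the
define.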

> ---
>  drivers/iommu/Kconfig          |  12 +
>  drivers/iommu/Makefile         |   1 +
>  drivers/iommu/rockchip-iommu.c | 924 +++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 937 insertions(+)
>  create mode 100644 drivers/iommu/rockchip-iommu.c
>
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index dd51122..d0a1261 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -152,6 +152,18 @@ config OMAP_IOMMU_DEBUG
>
>           Say N unless you know you need this.
>
> +config ROCKCHIP_IOMMU
> +       bool "Rockchip IOMMU Support"
> +       depends on ARCH_ROCKCHIP
> +       select IOMMU_API
> +       select ARM_DMA_USE_IOMMU
> +       help
> +         Support for IOMMUs found on Rockchip rk32xx SOCs.
> +         These IOMMUs allow virtualization of the address space used by most
> +         cores within the multimedia subsystem.
> +         Say Y here if you are using a Rockchip SoC that includes an IOMMU
> +         device.
> +
>  config TEGRA_IOMMU_GART
>         bool "Tegra GART IOMMU Support"
>         depends on ARCH_TEGRA_2x_SOC
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 16edef7..3e47ef3 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -13,6 +13,7 @@ obj-$(CONFIG_IRQ_REMAP) += intel_irq_remapping.o irq_remapping.o
>  obj-$(CONFIG_OMAP_IOMMU) += omap-iommu.o
>  obj-$(CONFIG_OMAP_IOMMU) += omap-iommu2.o
>  obj-$(CONFIG_OMAP_IOMMU_DEBUG) += omap-iommu-debug.o
> +obj-$(CONFIG_ROCKCHIP_IOMMU) += rockchip-iommu.o
>  obj-$(CONFIG_TEGRA_IOMMU_GART) += tegra-gart.o
>  obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
>  obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> new file mode 100644
> index 0000000..08e50fc
> --- /dev/null
> +++ b/drivers/iommu/rockchip-iommu.c
> @@ -0,0 +1,924 @@
> +/*
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <asm/cacheflush.h>
> +#include <asm/pgtable.h>
> +#include <linux/compiler.h>
> +#include <linux/delay.h>
> +#include <linux/device.h>
> +#include <linux/errno.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/iommu.h>
> +#include <linux/jiffies.h>
> +#include <linux/list.h>
> +#include <linux/mm.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +#include <linux/spinlock.h>
> +
> +/** MMU register offsets */
> +#define RK_MMU_DTE_ADDR                0x00    /* Directory table address */
> +#define RK_MMU_STATUS          0x04
> +#define RK_MMU_COMMAND         0x08
> +#define RK_MMU_PAGE_FAULT_ADDR 0x0C    /* IOVA of last page fault */
> +#define RK_MMU_ZAP_ONE_LINE    0x10    /* Shootdown one IOTLB entry */
> +#define RK_MMU_INT_RAWSTAT     0x14    /* IRQ status ignoring mask */
> +#define RK_MMU_INT_CLEAR       0x18    /* Acknowledge and re-arm irq */
> +#define RK_MMU_INT_MASK                0x1C    /* IRQ enable */
> +#define RK_MMU_INT_STATUS      0x20    /* IRQ status after masking */
> +#define RK_MMU_AUTO_GATING     0x24
> +
> +#define DTE_ADDR_DUMMY         0xCAFEBABE
> +#define FORCE_RESET_TIMEOUT    100     /* ms */
> +
> +/* RK_MMU_STATUS fields */
> +#define RK_MMU_STATUS_PAGING_ENABLED       BIT(0)
> +#define RK_MMU_STATUS_PAGE_FAULT_ACTIVE    BIT(1)
> +#define RK_MMU_STATUS_STALL_ACTIVE         BIT(2)
> +#define RK_MMU_STATUS_IDLE                 BIT(3)
> +#define RK_MMU_STATUS_REPLAY_BUFFER_EMPTY  BIT(4)
> +#define RK_MMU_STATUS_PAGE_FAULT_IS_WRITE  BIT(5)
> +#define RK_MMU_STATUS_STALL_NOT_ACTIVE     BIT(31)
> +
> +/* RK_MMU_COMMAND command values */
> +#define RK_MMU_CMD_ENABLE_PAGING    0  /* Enable memory translation */
> +#define RK_MMU_CMD_DISABLE_PAGING   1  /* Disable memory translation */
> +#define RK_MMU_CMD_ENABLE_STALL     2  /* Stall paging to allow other cmds */
> +#define RK_MMU_CMD_DISABLE_STALL    3  /* Stop stall re-enables paging */
> +#define RK_MMU_CMD_ZAP_CACHE        4  /* Shoot down entire IOTLB */
> +#define RK_MMU_CMD_PAGE_FAULT_DONE  5  /* Clear page fault */
> +#define RK_MMU_CMD_FORCE_RESET      6  /* Reset all registers */
> +
> +/* RK_MMU_INT_* register fields */
> +#define RK_MMU_IRQ_PAGE_FAULT    0x01  /* page fault */
> +#define RK_MMU_IRQ_BUS_ERROR     0x02  /* bus read error */
> +#define RK_MMU_IRQ_MASK          (RK_MMU_IRQ_PAGE_FAULT | RK_MMU_IRQ_BUS_ERROR)
> +
> +#define NUM_DT_ENTRIES 1024
> +#define NUM_PT_ENTRIES 1024
> +
> +#define SPAGE_ORDER 12
> +#define SPAGE_SIZE (1 << SPAGE_ORDER)
> +
> + /*
> +  * Support mapping any size that fits in one page table:
> +  *   4 KiB to 4 MiB
> +  */
> +#define RK_IOMMU_PGSIZE_BITMAP 0x007ff000
> +
> +#define IOMMU_REG_POLL_COUNT_FAST 1000
> +
> +struct rk_iommu_domain {
> +       struct list_head iommus;
> +       u32 *dt; /* page directory table */
> +       spinlock_t iommus_lock; /* lock for iommus list */
> +       spinlock_t dt_lock; /* lock for modifying page directory table */
> +};
> +
> +struct rk_iommu {
> +       struct device *dev;
> +       void __iomem *base;
> +       int irq;
> +       struct list_head node; /* entry in rk_iommu_domain.iommus */
> +       struct iommu_domain *domain; /* domain to which iommu is attached */
> +};
> +
> +static inline void rk_table_flush(u32 *va, unsigned int count)
> +{
> +       phys_addr_t pa_start = virt_to_phys(va);
> +       phys_addr_t pa_end = virt_to_phys(va + count);
> +       size_t size = pa_end - pa_start;
> +
> +       __cpuc_flush_dcache_area(va, size);
> +       outer_flush_range(pa_start, pa_end);
> +}
> +
> +/**
> + * Inspired by _wait_for in intel_drv.h
> + * This is NOT safe for use in interrupt context.
> + *
> + * Note that it's important that we check the condition again after having
> + * timed out, since the timeout could be due to preemption or similar and
> + * we've never had a chance to check the condition before the timeout.
> + */
> +#define rk_wait_for(COND, MS) ({ \
> +       unsigned long timeout__ = jiffies + msecs_to_jiffies(MS) + 1;   \
> +       int ret__ = 0;                                                  \
> +       while (!(COND)) {                                               \
> +               if (time_after(jiffies, timeout__)) {                   \
> +                       ret__ = (COND) ? 0 : -ETIMEDOUT;                \
> +                       break;                                          \
> +               }                                                       \
> +               usleep_range(50, 100);                                  \
> +       }                                                               \
> +       ret__;                                                          \
> +})
> +
> +/*
> + * The Rockchip rk3288 iommu uses a 2-level page table.
> + * The first level is the "Directory Table" (DT).
> + * The DT consists of 1024 4-byte Directory Table Entries (DTEs), each pointing
> + * to a "Page Table".
> + * The second level is the 1024 Page Tables (PT).
> + * Each PT consists of 1024 4-byte Page Table Entries (PTEs), each pointing to
> + * a 4 KB page of physical memory.
> + *
> + * The DT and each PT fit in a single 4 KB page (4-bytes * 1024 entries).
> + * Each iommu device has a MMU_DTE_ADDR register that contains the physical
> + * address of the start of the DT page.
> + *
> + * The structure of the page table is as follows:
> + *
> + *                   DT
> + * MMU_DTE_ADDR -> +-----+
> + *                 |     |
> + *                 +-----+     PT
> + *                 | DTE | -> +-----+
> + *                 +-----+    |     |     Memory
> + *                 |     |    +-----+     Page
> + *                 |     |    | PTE | -> +-----+
> + *                 +-----+    +-----+    |     |
> + *                            |     |    |     |
> + *                            |     |    |     |
> + *                            +-----+    |     |
> + *                                       |     |
> + *                                       |     |
> + *                                       +-----+
> + */
> +
> +/*
> + * Each DTE has a PT address and a valid bit:
> + * +---------------------+-----------+-+
> + * | PT address          | Reserved  |V|
> + * +---------------------+-----------+-+
> + *  31:12 - PT address (PTs always start on a 4 KB boundary)
> + *  11: 1 - Reserved
> + *      0 - 1 if PT @ PT address is valid
> + */
> +#define RK_DTE_PT_ADDRESS_MASK    0xfffff000
> +#define RK_DTE_PT_VALID           BIT(0)
> +
> +static inline phys_addr_t rk_dte_pt_address(u32 dte)
> +{
> +       return (phys_addr_t)dte & RK_DTE_PT_ADDRESS_MASK;
> +}
> +
> +static inline bool rk_dte_is_pt_valid(u32 dte)
> +{
> +       return dte & RK_DTE_PT_VALID;
> +}
> +
> +static u32 rk_mk_dte(u32 *pt)
> +{
> +       phys_addr_t pt_phys = virt_to_phys(pt);
> +       return (pt_phys & RK_DTE_PT_ADDRESS_MASK) | RK_DTE_PT_VALID;
> +}
> +
> +/*
> + * Each PTE has a Page address, some flags and a valid bit:
> + * +---------------------+---+-------+-+
> + * | Page address        |Rsv| Flags |V|
> + * +---------------------+---+-------+-+
> + *  31:12 - Page address (Pages always start on a 4 KB boundary)
> + *  11: 9 - Reserved
> + *   8: 1 - Flags
> + *      8 - Read allocate - allocate cache space on read misses
> + *      7 - Read cache - enable cache & prefetch of data
> + *      6 - Write buffer - enable delaying writes on their way to memory
> + *      5 - Write allocate - allocate cache space on write misses
> + *      4 - Write cache - different writes can be merged together
> + *      3 - Override cache attributes
> + *          if 1, bits 4-8 control cache attributes
> + *          if 0, the system bus defaults are used
> + *      2 - Writable
> + *      1 - Readable
> + *      0 - 1 if Page @ Page address is valid
> + */
> +#define RK_PTE_PAGE_ADDRESS_MASK  0xfffff000
> +#define RK_PTE_PAGE_FLAGS_MASK    0x000001fe
> +#define RK_PTE_PAGE_WRITABLE      BIT(2)
> +#define RK_PTE_PAGE_READABLE      BIT(1)
> +#define RK_PTE_PAGE_VALID         BIT(0)
> +
> +static inline phys_addr_t rk_pte_page_address(u32 pte)
> +{
> +       return (phys_addr_t)pte & RK_PTE_PAGE_ADDRESS_MASK;
> +}
> +
> +static inline bool rk_pte_is_page_valid(u32 pte)
> +{
> +       return pte & RK_PTE_PAGE_VALID;
> +}
> +
> +/* TODO: set cache flags per prot IOMMU_CACHE */
> +static u32 rk_mk_pte(phys_addr_t page, int prot)
> +{
> +       u32 flags = 0;
> +       flags |= (prot & IOMMU_READ) ? RK_PTE_PAGE_READABLE : 0;
> +       flags |= (prot & IOMMU_WRITE) ? RK_PTE_PAGE_WRITABLE : 0;
> +       page &= RK_PTE_PAGE_ADDRESS_MASK;
> +       return page | flags | RK_PTE_PAGE_VALID;
> +}
> +
> +static u32 rk_mk_pte_invalid(u32 pte)
> +{
> +       return pte & ~RK_PTE_PAGE_VALID;
> +}
> +
> +/*
> + * rk3288 iova (IOMMU Virtual Address) format
> + *  31       22.21       12.11          0
> + * +-----------+-----------+-------------+
> + * | DTE index | PTE index | Page offset |
> + * +-----------+-----------+-------------+
> + *  31:22 - DTE index   - index of DTE in DT
> + *  21:12 - PTE index   - index of PTE in PT @ DTE.pt_address
> + *  11: 0 - Page offset - offset into page @ PTE.page_address
> + */
> +#define RK_IOVA_DTE_MASK    0xffc00000
> +#define RK_IOVA_DTE_SHIFT   22
> +#define RK_IOVA_PTE_MASK    0x003ff000
> +#define RK_IOVA_PTE_SHIFT   12
> +#define RK_IOVA_PAGE_MASK   0x00000fff
> +#define RK_IOVA_PAGE_SHIFT  0
> +
> +static u32 rk_iova_dte_index(dma_addr_t iova)
> +{
> +       return (u32)(iova & RK_IOVA_DTE_MASK) >> RK_IOVA_DTE_SHIFT;
> +}
> +
> +static u32 rk_iova_pte_index(dma_addr_t iova)
> +{
> +       return (u32)(iova & RK_IOVA_PTE_MASK) >> RK_IOVA_PTE_SHIFT;
> +}
> +
> +static u32 rk_iova_page_offset(dma_addr_t iova)
> +{
> +       return (u32)(iova & RK_IOVA_PAGE_MASK) >> RK_IOVA_PAGE_SHIFT;
> +}
> +
> +static u32 rk_iommu_read(struct rk_iommu *iommu, u32 offset)
> +{
> +       return readl(iommu->base + offset);
> +}
> +
> +static void rk_iommu_write(struct rk_iommu *iommu, u32 offset, u32 value)
> +{
> +       writel(value, iommu->base + offset);
> +}
> +
> +static void rk_iommu_command(struct rk_iommu *iommu, u32 command)
> +{
> +       writel(command, iommu->base + RK_MMU_COMMAND);
> +}
> +
> +static void rk_iommu_zap_lines(struct rk_iommu *iommu, dma_addr_t iova,
> +                              size_t size)
> +{
> +       dma_addr_t iova_end = iova + size;
> +       /*
> +        * TODO(djkurtz): Figure out when it is more efficient to shootdown the
> +        * entire iotlb rather than iterate over individual iovas.
> +        */
> +       for (; iova < iova_end; iova += SPAGE_SIZE)
> +               rk_iommu_write(iommu, RK_MMU_ZAP_ONE_LINE, iova);
> +}
> +
> +static bool rk_iommu_is_stall_active(struct rk_iommu *iommu)
> +{
> +       return rk_iommu_read(iommu, RK_MMU_STATUS) & RK_MMU_STATUS_STALL_ACTIVE;
> +}
> +
> +static bool rk_iommu_is_paging_enabled(struct rk_iommu *iommu)
> +{
> +       return rk_iommu_read(iommu, RK_MMU_STATUS) &
> +                            RK_MMU_STATUS_PAGING_ENABLED;
> +}
> +
> +static int rk_iommu_enable_stall(struct rk_iommu *iommu)
> +{
> +       int ret;
> +
> +       if (rk_iommu_is_stall_active(iommu))
> +               return 0;
> +
> +       /* Stall can only be enabled if paging is enabled */
> +       if (!rk_iommu_is_paging_enabled(iommu))
> +               return 0;
> +
> +       rk_iommu_command(iommu, RK_MMU_CMD_ENABLE_STALL);
> +
> +       ret = rk_wait_for(rk_iommu_is_stall_active(iommu), 1);
> +       if (ret)
> +               dev_err(iommu->dev, "Enable stall request timed out, status: %#08x\n",
> +                       rk_iommu_read(iommu, RK_MMU_STATUS));
> +
> +       return ret;
> +}
> +
> +static int rk_iommu_disable_stall(struct rk_iommu *iommu)
> +{
> +       int ret;
> +
> +       if (!rk_iommu_is_stall_active(iommu))
> +               return 0;
> +
> +       rk_iommu_command(iommu, RK_MMU_CMD_DISABLE_STALL);
> +
> +       ret = rk_wait_for(!rk_iommu_is_stall_active(iommu), 1);
> +       if (ret)
> +               dev_err(iommu->dev, "Disable stall request timed out, status: %#08x\n",
> +                       rk_iommu_read(iommu, RK_MMU_STATUS));
> +
> +       return ret;
> +}
> +
> +static int rk_iommu_enable_paging(struct rk_iommu *iommu)
> +{
> +       int ret;
> +
> +       if (rk_iommu_is_paging_enabled(iommu))
> +               return 0;
> +
> +       rk_iommu_command(iommu, RK_MMU_CMD_ENABLE_PAGING);
> +
> +       ret = rk_wait_for(rk_iommu_is_paging_enabled(iommu), 1);
> +       if (ret)
> +               dev_err(iommu->dev, "Enable paging request timed out, status: %#08x\n",
> +                       rk_iommu_read(iommu, RK_MMU_STATUS));
> +
> +       return ret;
> +}
> +
> +static int rk_iommu_disable_paging(struct rk_iommu *iommu)
> +{
> +       int ret;
> +
> +       if (!rk_iommu_is_paging_enabled(iommu))
> +               return 0;
> +
> +       rk_iommu_command(iommu, RK_MMU_CMD_DISABLE_PAGING);
> +
> +       ret = rk_wait_for(!rk_iommu_is_paging_enabled(iommu), 1);
> +       if (ret)
> +               dev_err(iommu->dev, "Disable paging request timed out, status: %#08x\n",
> +                       rk_iommu_read(iommu, RK_MMU_STATUS));
> +
> +       return ret;
> +}
> +
> +static int rk_iommu_force_reset(struct rk_iommu *iommu)
> +{
> +       int ret;
> +       u32 dte_addr;
> +
> +       /*
> +        * Check if register DTE_ADDR is working by writing DTE_ADDR_DUMMY
> +        * and verifying that upper 5 nybbles are read back.
> +        */
> +       rk_iommu_write(iommu, RK_MMU_DTE_ADDR, DTE_ADDR_DUMMY);
> +
> +       dte_addr = rk_iommu_read(iommu, RK_MMU_DTE_ADDR);
> +       if (dte_addr != (DTE_ADDR_DUMMY & RK_DTE_PT_ADDRESS_MASK)) {
> +               dev_err(iommu->dev, "Error during raw reset. MMU_DTE_ADDR is not functioning\n");
> +               return -EFAULT;
> +       }
> +
> +       rk_iommu_command(iommu, RK_MMU_CMD_FORCE_RESET);
> +
> +       ret = rk_wait_for(rk_iommu_read(iommu, RK_MMU_DTE_ADDR) == 0x00000000,
> +                         FORCE_RESET_TIMEOUT);
> +       if (ret)
> +               dev_err(iommu->dev, "FORCE_RESET command timed out\n");
> +
> +       return ret;
> +}
> +
> +static void log_iova(struct rk_iommu *iommu, dma_addr_t iova)
> +{
> +       u32 dte_index, pte_index, page_offset;
> +       u32 mmu_dte_addr;
> +       phys_addr_t mmu_dte_addr_phys, dte_addr_phys;
> +       u32 *dte_addr;
> +       u32 dte;
> +       phys_addr_t pte_addr_phys = 0;
> +       u32 *pte_addr = NULL;
> +       u32 pte = 0;
> +       phys_addr_t page_addr_phys = 0;
> +       u32 page_flags = 0;
> +
> +       dte_index = rk_iova_dte_index(iova);
> +       pte_index = rk_iova_pte_index(iova);
> +       page_offset = rk_iova_page_offset(iova);
> +
> +       mmu_dte_addr = rk_iommu_read(iommu, RK_MMU_DTE_ADDR);
> +       mmu_dte_addr_phys = (phys_addr_t)mmu_dte_addr;
> +
> +       dte_addr_phys = mmu_dte_addr_phys + (4 * dte_index);
> +       dte_addr = phys_to_virt(dte_addr_phys);
> +       dte = *dte_addr;
> +
> +       if (!rk_dte_is_pt_valid(dte))
> +               goto print_it;
> +
> +       pte_addr_phys = rk_dte_pt_address(dte) + (pte_index * 4);
> +       pte_addr = phys_to_virt(pte_addr_phys);
> +       pte = *pte_addr;
> +
> +       if (!rk_pte_is_page_valid(pte))
> +               goto print_it;
> +
> +       page_addr_phys = rk_pte_page_address(pte) + page_offset;
> +       page_flags = pte & RK_PTE_PAGE_FLAGS_MASK;
> +
> +print_it:
> +       dev_err(iommu->dev, "iova = %pad: dte_index: 0x%03x pte_index: 0x%03x page_offset: 0x%03x\n",
> +               &iova, dte_index, pte_index, page_offset);
> +       dev_err(iommu->dev, "mmu_dte_addr: %pa dte@%pa: %#08x valid: %u pte@%pa: %#08x valid: %u page@%pa flags: %#03x\n",
> +               &mmu_dte_addr_phys, &dte_addr_phys, dte,
> +               rk_dte_is_pt_valid(dte), &pte_addr_phys, pte,
> +               rk_pte_is_page_valid(pte), &page_addr_phys, page_flags);
> +}
> +
> +static irqreturn_t rk_iommu_irq(int irq, void *dev_id)
> +{
> +       struct rk_iommu *iommu = dev_id;
> +       u32 status;
> +       u32 int_status;
> +       dma_addr_t iova;
> +
> +       int_status = rk_iommu_read(iommu, RK_MMU_INT_STATUS);
> +       if (int_status == 0)
> +               return IRQ_NONE;
> +
> +       iova = rk_iommu_read(iommu, RK_MMU_PAGE_FAULT_ADDR);
> +
> +       if (int_status & RK_MMU_IRQ_PAGE_FAULT) {
> +               int flags;
> +
> +               status = rk_iommu_read(iommu, RK_MMU_STATUS);
> +               flags = (status & RK_MMU_STATUS_PAGE_FAULT_IS_WRITE) ?
> +                               IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
> +
> +               dev_err(iommu->dev, "Page fault at %pad of type %s\n",
> +                       &iova,
> +                       (flags == IOMMU_FAULT_WRITE) ? "write" : "read");
> +
> +               log_iova(iommu, iova);
> +
> +               /*
> +                * Report page fault to any installed handlers.
> +                * Ignore the return code, though, since we always zap cache
> +                * and clear the page fault anyway.
> +                */
> +               if (iommu->domain)
> +                       report_iommu_fault(iommu->domain, iommu->dev, iova,
> +                                          flags);
> +               else
> +                       dev_err(iommu->dev, "Page fault while iommu not attached to domain?\n");
> +
> +               rk_iommu_command(iommu, RK_MMU_CMD_ZAP_CACHE);
> +               rk_iommu_command(iommu, RK_MMU_CMD_PAGE_FAULT_DONE);
> +       }
> +
> +       if (int_status & RK_MMU_IRQ_BUS_ERROR)
> +               dev_err(iommu->dev, "BUS_ERROR occurred at %pad\n", &iova);
> +
> +       if (int_status & ~RK_MMU_IRQ_MASK)
> +               dev_err(iommu->dev, "unexpected int_status: %#08x\n",
> +                       int_status);
> +
> +       rk_iommu_write(iommu, RK_MMU_INT_CLEAR, int_status);
> +
> +       return IRQ_HANDLED;
> +}
> +
> +static phys_addr_t rk_iommu_iova_to_phys(struct iommu_domain *domain,
> +                                        dma_addr_t iova)
> +{
> +       struct rk_iommu_domain *rk_domain = domain->priv;
> +       unsigned long flags;
> +       phys_addr_t pt_phys, phys = 0;
> +       u32 dte, pte;
> +       u32 *page_table;
> +
> +       spin_lock_irqsave(&rk_domain->dt_lock, flags);
> +
> +       dte = rk_domain->dt[rk_iova_dte_index(iova)];
> +       if (!rk_dte_is_pt_valid(dte))
> +               goto out;
> +
> +       pt_phys = rk_dte_pt_address(dte);
> +       page_table = (u32 *)phys_to_virt(pt_phys);
> +       pte = page_table[rk_iova_pte_index(iova)];
> +       if (!rk_pte_is_page_valid(pte))
> +               goto out;
> +
> +       phys = rk_pte_page_address(pte) + rk_iova_page_offset(iova);
> +out:
> +       spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
> +
> +       return phys;
> +}
> +
> +static void rk_iommu_zap_iova(struct rk_iommu_domain *rk_domain,
> +                             dma_addr_t iova, size_t size)
> +{
> +       struct list_head *pos;
> +       unsigned long flags;
> +
> +       /* shootdown these iova from all iommus using this domain */
> +       spin_lock_irqsave(&rk_domain->iommus_lock, flags);
> +       list_for_each(pos, &rk_domain->iommus) {
> +               struct rk_iommu *iommu;
> +               iommu = list_entry(pos, struct rk_iommu, node);
> +               rk_iommu_zap_lines(iommu, iova, size);
> +       }
> +       spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
> +}
> +
> +static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain,
> +                                 dma_addr_t iova)
> +{
> +       u32 *page_table, *dte_addr;
> +       u32 dte;
> +       phys_addr_t pt_phys;
> +
> +       assert_spin_locked(&rk_domain->dt_lock);
> +
> +       dte_addr = &rk_domain->dt[rk_iova_dte_index(iova)];
> +       dte = *dte_addr;
> +       if (rk_dte_is_pt_valid(dte))
> +               goto done;
> +
> +       page_table = (u32 *)get_zeroed_page(GFP_ATOMIC | GFP_DMA32);
> +       if (!page_table)
> +               return ERR_PTR(-ENOMEM);
> +
> +       dte = rk_mk_dte(page_table);
> +       *dte_addr = dte;
> +
> +       rk_table_flush(page_table, NUM_PT_ENTRIES);
> +       rk_table_flush(dte_addr, 1);
> +
> +       /*
> +        * Zap the first iova of newly allocated page table so iommu evicts
> +        * old cached value of new dte from the iotlb.
> +        */
> +       rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);
> +
> +done:
> +       pt_phys = rk_dte_pt_address(dte);
> +       return (u32 *)phys_to_virt(pt_phys);
> +}
> +
> +static size_t rk_iommu_unmap_iova(struct rk_iommu_domain *rk_domain,
> +                                 u32 *pte_addr, dma_addr_t iova, size_t size)
> +{
> +       unsigned int pte_count;
> +       unsigned int pte_total = size / SPAGE_SIZE;
> +
> +       assert_spin_locked(&rk_domain->dt_lock);
> +
> +       for (pte_count = 0; pte_count < pte_total; pte_count++) {
> +               u32 pte = pte_addr[pte_count];
> +               if (!rk_pte_is_page_valid(pte))
> +                       break;
> +
> +               pte_addr[pte_count] = rk_mk_pte_invalid(pte);
> +       }
> +
> +       rk_table_flush(pte_addr, pte_count);
> +
> +       return pte_count * SPAGE_SIZE;
> +}
> +
> +static int rk_iommu_map_iova(struct rk_iommu_domain *rk_domain, u32 *pte_addr,
> +                            dma_addr_t iova, phys_addr_t paddr, size_t size,
> +                            int prot)
> +{
> +       unsigned int pte_count;
> +       unsigned int pte_total = size / SPAGE_SIZE;
> +       phys_addr_t page_phys;
> +
> +       assert_spin_locked(&rk_domain->dt_lock);
> +
> +       for (pte_count = 0; pte_count < pte_total; pte_count++) {
> +               u32 pte = pte_addr[pte_count];
> +
> +               if (rk_pte_is_page_valid(pte))
> +                       goto unwind;
> +
> +               pte_addr[pte_count] = rk_mk_pte(paddr, prot);
> +
> +               paddr += SPAGE_SIZE;
> +       }
> +
> +       rk_table_flush(pte_addr, pte_count);
> +
> +       return 0;
> +unwind:
> +       /* Unmap the range of iovas that we just mapped */
> +       rk_iommu_unmap_iova(rk_domain, pte_addr, iova, pte_count * SPAGE_SIZE);
> +
> +       iova += pte_count * SPAGE_SIZE;
> +       page_phys = rk_pte_page_address(pte_addr[pte_count]);
> +       pr_err("iova: %pad already mapped to %pa cannot remap to phys: %pa prot:%#x\n",
> +              &iova, &page_phys, &paddr, prot);
> +
> +       return -EADDRINUSE;
> +}
> +
> +static int rk_iommu_map(struct iommu_domain *domain, unsigned long _iova,
> +                       phys_addr_t paddr, size_t size, int prot)
> +{
> +       struct rk_iommu_domain *rk_domain = domain->priv;
> +       unsigned long flags;
> +       dma_addr_t iova = (dma_addr_t)_iova;
> +       u32 *page_table, *pte_addr;
> +       int ret;
> +
> +       spin_lock_irqsave(&rk_domain->dt_lock, flags);
> +
> +       /*
> +        * pgsize_bitmap specifies iova sizes that fit in one page table
> +        * (1024 4-KiB pages = 4 MiB).
> +        * So, size will always be 4096 <= size <= 4194304.
> +        * Since iommu_map() guarantees that both iova and size will be
> +        * aligned, we will always only be mapping from a single dte here.
> +        */
> +       page_table = rk_dte_get_page_table(rk_domain, iova);
> +       if (IS_ERR(page_table)) {
> +               spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
> +               return PTR_ERR(page_table);
> +       }
> +
> +       pte_addr = &page_table[rk_iova_pte_index(iova)];
> +       ret = rk_iommu_map_iova(rk_domain, pte_addr, iova, paddr, size, prot);
> +       spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
> +
> +       return ret;
> +}
> +
> +static size_t rk_iommu_unmap(struct iommu_domain *domain, unsigned long _iova,
> +                            size_t size)
> +{
> +       struct rk_iommu_domain *rk_domain = domain->priv;
> +       unsigned long flags;
> +       dma_addr_t iova = (dma_addr_t)_iova;
> +       phys_addr_t pt_phys;
> +       u32 dte;
> +       u32 *pte_addr;
> +       size_t unmap_size;
> +
> +       spin_lock_irqsave(&rk_domain->dt_lock, flags);
> +
> +       /*
> +        * pgsize_bitmap specifies iova sizes that fit in one page table
> +        * (1024 4-KiB pages = 4 MiB).
> +        * So, size will always be 4096 <= size <= 4194304.
> +        * Since iommu_unmap() guarantees that both iova and size will be
> +        * aligned, we will always only be unmapping from a single dte here.
> +        */
> +       dte = rk_domain->dt[rk_iova_dte_index(iova)];
> +       /* Just return 0 if iova is unmapped */
> +       if (!rk_dte_is_pt_valid(dte)) {
> +               spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
> +               return 0;
> +       }
> +
> +       pt_phys = rk_dte_pt_address(dte);
> +       pte_addr = (u32 *)phys_to_virt(pt_phys) + rk_iova_pte_index(iova);
> +       unmap_size = rk_iommu_unmap_iova(rk_domain, pte_addr, iova, size);
> +
> +       spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
> +
> +       /* Shootdown iotlb entries for iova range that was just unmapped */
> +       rk_iommu_zap_iova(rk_domain, iova, unmap_size);
> +
> +       return unmap_size;
> +}
> +
> +static int rk_iommu_attach_device(struct iommu_domain *domain,
> +                                 struct device *dev)
> +{
> +       struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
> +       struct rk_iommu_domain *rk_domain = domain->priv;
> +       unsigned long flags;
> +       int ret;
> +       phys_addr_t dte_addr;
> +
> +       /*
> +        * Allow 'virtual devices' (e.g., drm) to attach to domain.
> +        * Such a device has a NULL archdata.iommu.
> +        */
> +       if (!iommu)
> +               return 0;
> +
> +       ret = rk_iommu_enable_stall(iommu);
> +       if (ret)
> +               return ret;
> +
> +       ret = rk_iommu_force_reset(iommu);
> +       if (ret)
> +               return ret;
> +
> +       iommu->domain = domain;
> +
> +       ret = devm_request_irq(dev, iommu->irq, rk_iommu_irq,
> +                              IRQF_SHARED, dev_name(dev), iommu);
> +       if (ret)
> +               return ret;
> +
> +       dte_addr = virt_to_phys(rk_domain->dt);
> +       rk_iommu_write(iommu, RK_MMU_DTE_ADDR, dte_addr);
> +       rk_iommu_command(iommu, RK_MMU_CMD_ZAP_CACHE);
> +       rk_iommu_write(iommu, RK_MMU_INT_MASK, RK_MMU_IRQ_MASK);
> +
> +       ret = rk_iommu_enable_paging(iommu);
> +       if (ret)
> +               return ret;
> +
> +       spin_lock_irqsave(&rk_domain->iommus_lock, flags);
> +       list_add_tail(&iommu->node, &rk_domain->iommus);
> +       spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
> +
> +       dev_info(dev, "Attached to iommu domain\n");
> +
> +       rk_iommu_disable_stall(iommu);
> +
> +       return 0;
> +}
> +
> +static void rk_iommu_detach_device(struct iommu_domain *domain,
> +                                  struct device *dev)
> +{
> +       struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
> +       struct rk_iommu_domain *rk_domain = domain->priv;
> +       unsigned long flags;
> +
> +       /* Allow 'virtual devices' (e.g., drm) to detach from domain */
> +       if (!iommu)
> +               return;
> +
> +       iommu->domain = NULL;
> +
> +       spin_lock_irqsave(&rk_domain->iommus_lock, flags);
> +       list_del_init(&iommu->node);
> +       spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
> +
> +       devm_free_irq(dev, iommu->irq, iommu);
> +
> +       iommu->domain = NULL;
> +
> +       /* Ignore error while disabling, just keep going */
> +       rk_iommu_enable_stall(iommu);
> +       rk_iommu_disable_paging(iommu);
> +       rk_iommu_write(iommu, RK_MMU_INT_MASK, 0);
> +       rk_iommu_write(iommu, RK_MMU_DTE_ADDR, 0);
> +       rk_iommu_disable_stall(iommu);
> +
> +       dev_info(dev, "Detached from iommu domain\n");
> +}
> +
> +static int rk_iommu_domain_init(struct iommu_domain *domain)
> +{
> +       struct rk_iommu_domain *rk_domain;
> +
> +       rk_domain = kzalloc(sizeof(*rk_domain), GFP_KERNEL);
> +       if (!rk_domain)
> +               return -ENOMEM;
> +
> +       /*
> +        * rk32xx iommus use a 2 level pagetable.
> +        * Each level1 (dt) and level2 (pt) table has 1024 4-byte entries.
> +        * Allocate one 4 KiB page for each table.
> +        */
> +       rk_domain->dt = (u32 *)get_zeroed_page(GFP_KERNEL | GFP_DMA32);
> +       if (!rk_domain->dt)
> +               goto err_dt;
> +
> +       rk_table_flush(rk_domain->dt, NUM_DT_ENTRIES);
> +
> +       spin_lock_init(&rk_domain->iommus_lock);
> +       spin_lock_init(&rk_domain->dt_lock);
> +       INIT_LIST_HEAD(&rk_domain->iommus);
> +
> +       domain->priv = rk_domain;
> +
> +       return 0;
> +err_dt:
> +       kfree(rk_domain);
> +       return -ENOMEM;
> +}
> +
> +static void rk_iommu_domain_destroy(struct iommu_domain *domain)
> +{
> +       struct rk_iommu_domain *rk_domain = domain->priv;
> +       int i;
> +
> +       WARN_ON(!list_empty(&rk_domain->iommus));
> +
> +       for (i = 0; i < NUM_DT_ENTRIES; i++) {
> +               u32 dte = rk_domain->dt[i];
> +               if (rk_dte_is_pt_valid(dte)) {
> +                       phys_addr_t pt_phys = rk_dte_pt_address(dte);
> +                       u32 *page_table = phys_to_virt(pt_phys);
> +                       free_page((unsigned long)page_table);
> +               }
> +       }
> +
> +       free_page((unsigned long)rk_domain->dt);
> +       kfree(domain->priv);
> +       domain->priv = NULL;
> +}
> +
> +static const struct iommu_ops rk_iommu_ops = {
> +       .domain_init = rk_iommu_domain_init,
> +       .domain_destroy = rk_iommu_domain_destroy,
> +       .attach_dev = rk_iommu_attach_device,
> +       .detach_dev = rk_iommu_detach_device,
> +       .map = rk_iommu_map,
> +       .unmap = rk_iommu_unmap,
> +       .iova_to_phys = rk_iommu_iova_to_phys,
> +       .pgsize_bitmap = RK_IOMMU_PGSIZE_BITMAP,
> +};
> +
> +static int rk_iommu_probe(struct platform_device *pdev)
> +{
> +       struct device *dev = &pdev->dev;
> +       struct rk_iommu *iommu;
> +       struct resource *res;
> +
> +       iommu = devm_kzalloc(dev, sizeof(*iommu), GFP_KERNEL);
> +       if (!iommu)
> +               return -ENOMEM;
> +
> +       platform_set_drvdata(pdev, iommu);
> +       iommu->dev = dev;
> +
> +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +       iommu->base = devm_ioremap_resource(&pdev->dev, res);
> +       if (IS_ERR(iommu->base))
> +               return PTR_ERR(iommu->base);
> +
> +       iommu->irq = platform_get_irq(pdev, 0);
> +       if (iommu->irq < 0) {
> +               dev_err(dev, "Failed to get IRQ, %d\n", iommu->irq);
> +               return -ENXIO;
> +       }
> +
> +       return 0;
> +}
> +
> +static int rk_iommu_remove(struct platform_device *pdev)
> +{
> +       return 0;
> +}
> +
> +#ifdef CONFIG_OF
> +static const struct of_device_id rk_iommu_dt_ids[] = {
> +       { .compatible = "rockchip,iommu" },
> +       { /* sentinel */ }
> +};
> +MODULE_DEVICE_TABLE(of, rk_iommu_dt_ids);
> +#endif
> +
> +static struct platform_driver rk_iommu_driver = {
> +       .probe = rk_iommu_probe,
> +       .remove = rk_iommu_remove,
> +       .driver = {
> +                  .name = "rk_iommu",
> +                  .owner = THIS_MODULE,
> +                  .of_match_table = of_match_ptr(rk_iommu_dt_ids),
> +       },
> +};
> +
> +static int __init rk_iommu_init(void)
> +{
> +       int ret;
> +
> +       ret = bus_set_iommu(&platform_bus_type, &rk_iommu_ops);
> +       if (ret)
> +               return ret;
> +
> +       return platform_driver_register(&rk_iommu_driver);
> +}
> +static void __exit rk_iommu_exit(void)
> +{
> +       platform_driver_unregister(&rk_iommu_driver);
> +}
> +
> +subsys_initcall(rk_iommu_init);
> +module_exit(rk_iommu_exit);
> +
> +MODULE_DESCRIPTION("IOMMU API for Rockchip");
> +MODULE_AUTHOR("Simon Xue <xxm@rock-chips.com> and Daniel Kurtz <djkurtz@chromium.org>");
> +MODULE_ALIAS("platform:rockchip-iommu");
> +MODULE_LICENSE("GPL v2");
> --
> 2.1.0.rc2.206.gedb03e5
>
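
The comment in rk_iommu_map() above says the request size always lies between 4096 and 4194304 and never crosses a DTE boundary. The arithmetic behind that can be checked with a small userspace sketch. The two constants are copied from the patch; the helper names (rk_min_pgsize, rk_max_pgsize, rk_pgsize_supported, rk_range_in_one_dte) are hypothetical and exist only for this illustration. The sketch assumes, as the patch comment does, that the iommu core hands the driver naturally aligned power-of-two chunks.

```c
#include <assert.h>
#include <stdint.h>

#define RK_IOMMU_PGSIZE_BITMAP 0x007ff000u /* value from the patch */
#define RK_IOVA_DTE_SHIFT      22          /* value from the patch */

/* Smallest advertised size: the lowest set bit of the bitmap. */
uint32_t rk_min_pgsize(void)
{
	return RK_IOMMU_PGSIZE_BITMAP & (~RK_IOMMU_PGSIZE_BITMAP + 1u);
}

/* Largest advertised size: the highest set bit of the bitmap. */
uint32_t rk_max_pgsize(void)
{
	uint32_t bit = 0x80000000u;

	while (bit && !(bit & RK_IOMMU_PGSIZE_BITMAP))
		bit >>= 1;
	return bit;
}

/* Nonzero if size is a single power of two that the bitmap advertises. */
int rk_pgsize_supported(uint32_t size)
{
	return size != 0 && (size & (size - 1)) == 0 &&
	       (size & RK_IOMMU_PGSIZE_BITMAP) != 0;
}

/* Nonzero if [iova, iova + size) is covered by a single DTE. */
int rk_range_in_one_dte(uint32_t iova, uint32_t size)
{
	assert(size != 0);
	return (iova >> RK_IOVA_DTE_SHIFT) ==
	       ((iova + size - 1) >> RK_IOVA_DTE_SHIFT);
}
```

The bitmap has bits 12 through 22 set, so the advertised sizes are the powers of two from 4 KiB to 4 MiB. A 4 MiB chunk aligned to 4 MiB fills exactly one page table, while the same size at a 2 MiB offset would straddle two DTEs, which is why the alignment guarantee matters.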

* Re: [PATCH v5 1/3] iommu/rockchip: rk3288 iommu driver
@ 2014-10-17  2:22     ` Daniel Kurtz
  0 siblings, 0 replies; 26+ messages in thread
From: Daniel Kurtz @ 2014-10-17  2:22 UTC (permalink / raw)
  To: Daniel Kurtz
  Cc: open list:OPEN FIRMWARE AND...,
	Simon Xue, Grant Grundler, open list,
	open list:ARM/Rockchip SoC...,
	open list:IOMMU DRIVERS, Rob Herring, Grant Likely,
	Stéphane Marchesin, moderated list:ARM/Rockchip SoC...,
	Heiko Stuebner

On Tue, Oct 14, 2014 at 4:02 PM, Daniel Kurtz <djkurtz@chromium.org> wrote:
> The rk3288 has several iommus.  Each iommu belongs to a single master
> device.  There is one device (ISP) that has two slave iommus, but that
> case is not yet supported by this driver.
>
> At subsys init, the iommu driver registers itself as the iommu driver for
> the platform bus.  The master devices find their slave iommus using the
> "iommus" field in their devicetree description.  Since each slave iommu
> belongs to exactly one master, there is no additional data needed at probe
> to associate a slave with its master.
>
> An iommu device's power domain, clock and irq are all shared with its
> master device, and the master device must be careful to attach to the
> iommu only after powering and clocking it (and to keep it powered and
> clocked until after detaching).  Because there is no guarantee what the status
> of the iommu is at probe, and since the driver does not even know if the
> device is powered, we delay requesting its irq until the master device
> attaches, at which point we have a guarantee that the device is powered
> and clocked and we can reset it and disable its interrupt mask.
>
> An iommu_domain describes a virtual iova address space.  Each iommu_domain
> has a corresponding page table that lists the mappings from iova to
> physical address.
>
> For the rk3288 iommu, the page table has two levels:
>  The Level 1 "directory_table" has 1024 4-byte dte entries.
>  Each dte points to a level 2 "page_table".
>  Each level 2 page_table has 1024 4-byte pte entries.
>  Each pte points to a 4 KiB page of memory.
>
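
To make the two-level layout above concrete, here is a minimal userspace sketch of how a 32-bit iova splits into a DTE index, a PTE index and a page offset. The masks and shifts are the ones defined in the patch; struct iova_parts and the functions iova_split, iova_join and iova_print are hypothetical names used only for illustration, not part of the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Masks and shifts as defined in the patch. */
#define RK_IOVA_DTE_MASK    0xffc00000u
#define RK_IOVA_DTE_SHIFT   22
#define RK_IOVA_PTE_MASK    0x003ff000u
#define RK_IOVA_PTE_SHIFT   12
#define RK_IOVA_PAGE_MASK   0x00000fffu

/* Hypothetical helper type, not part of the driver. */
struct iova_parts {
	uint32_t dte_index;   /* bits 31:22, selects 1 of 1024 DTEs */
	uint32_t pte_index;   /* bits 21:12, selects 1 of 1024 PTEs */
	uint32_t page_offset; /* bits 11:0, byte offset into the 4 KiB page */
};

/* Split an iova into its three fields. */
struct iova_parts iova_split(uint32_t iova)
{
	struct iova_parts p;

	p.dte_index = (iova & RK_IOVA_DTE_MASK) >> RK_IOVA_DTE_SHIFT;
	p.pte_index = (iova & RK_IOVA_PTE_MASK) >> RK_IOVA_PTE_SHIFT;
	p.page_offset = iova & RK_IOVA_PAGE_MASK;
	return p;
}

/* Reassemble an iova from its three fields. */
uint32_t iova_join(struct iova_parts p)
{
	return (p.dte_index << RK_IOVA_DTE_SHIFT) |
	       (p.pte_index << RK_IOVA_PTE_SHIFT) |
	       p.page_offset;
}

/* Print the fields of one iova. */
void iova_print(uint32_t iova)
{
	struct iova_parts p = iova_split(iova);

	/* The three fields cover all 32 bits, so the split is lossless. */
	assert(iova_join(p) == iova);
	printf("iova 0x%08x: dte_index 0x%03x pte_index 0x%03x page_offset 0x%03x\n",
	       (unsigned)iova, (unsigned)p.dte_index,
	       (unsigned)p.pte_index, (unsigned)p.page_offset);
}
```

For example, iova 0x12345678 splits into dte_index 0x048, pte_index 0x345 and page_offset 0x678. Since 10 + 10 + 12 = 32 bits, one directory table spans the full 4 GiB iova space.
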
> An iommu_domain is created when a dma_iommu_mapping is created via
> arm_iommu_create_mapping.  Master devices can then attach themselves to
> this mapping (or attach the mapping to themselves?) by calling
> arm_iommu_attach_device().  This in turn instructs the iommu driver to
> write the page table's physical address into the slave iommu's "Directory
> Table Entry" (DTE) register.
>
> In fact multiple master devices, each with their own slave iommu device,
> can all attach to the same mapping.  The iommus for these devices will
> share the same iommu_domain and therefore point to the same page table.
> Thus, the iommu domain maintains a list of iommu devices which are
> attached.  This driver relies on the iommu core to ensure that all devices
> have detached before destroying a domain.
>
> Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
> Signed-off-by: Simon Xue <xxm@rock-chips.com>
> Reviewed-by: Grant Grundler <grundler@chromium.org>
> Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>

Gentle ping.

Any more feedback on the rockchip iommu driver?

Thanks,
-Daniel

> ---
>  drivers/iommu/Kconfig          |  12 +
>  drivers/iommu/Makefile         |   1 +
>  drivers/iommu/rockchip-iommu.c | 924 +++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 937 insertions(+)
>  create mode 100644 drivers/iommu/rockchip-iommu.c
>
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index dd51122..d0a1261 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -152,6 +152,18 @@ config OMAP_IOMMU_DEBUG
>
>           Say N unless you know you need this.
>
> +config ROCKCHIP_IOMMU
> +       bool "Rockchip IOMMU Support"
> +       depends on ARCH_ROCKCHIP
> +       select IOMMU_API
> +       select ARM_DMA_USE_IOMMU
> +       help
> +         Support for IOMMUs found on Rockchip rk32xx SoCs.
> +         These IOMMUs allow virtualization of the address space used by most
> +         cores within the multimedia subsystem.
> +         Say Y here if you are using a Rockchip SoC that includes an IOMMU
> +         device.
> +
>  config TEGRA_IOMMU_GART
>         bool "Tegra GART IOMMU Support"
>         depends on ARCH_TEGRA_2x_SOC
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 16edef7..3e47ef3 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -13,6 +13,7 @@ obj-$(CONFIG_IRQ_REMAP) += intel_irq_remapping.o irq_remapping.o
>  obj-$(CONFIG_OMAP_IOMMU) += omap-iommu.o
>  obj-$(CONFIG_OMAP_IOMMU) += omap-iommu2.o
>  obj-$(CONFIG_OMAP_IOMMU_DEBUG) += omap-iommu-debug.o
> +obj-$(CONFIG_ROCKCHIP_IOMMU) += rockchip-iommu.o
>  obj-$(CONFIG_TEGRA_IOMMU_GART) += tegra-gart.o
>  obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
>  obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> new file mode 100644
> index 0000000..08e50fc
> --- /dev/null
> +++ b/drivers/iommu/rockchip-iommu.c
> @@ -0,0 +1,924 @@
> +/*
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <asm/cacheflush.h>
> +#include <asm/pgtable.h>
> +#include <linux/compiler.h>
> +#include <linux/delay.h>
> +#include <linux/device.h>
> +#include <linux/errno.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/iommu.h>
> +#include <linux/jiffies.h>
> +#include <linux/list.h>
> +#include <linux/mm.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +#include <linux/spinlock.h>
> +
> +/** MMU register offsets */
> +#define RK_MMU_DTE_ADDR                0x00    /* Directory table address */
> +#define RK_MMU_STATUS          0x04
> +#define RK_MMU_COMMAND         0x08
> +#define RK_MMU_PAGE_FAULT_ADDR 0x0C    /* IOVA of last page fault */
> +#define RK_MMU_ZAP_ONE_LINE    0x10    /* Shootdown one IOTLB entry */
> +#define RK_MMU_INT_RAWSTAT     0x14    /* IRQ status ignoring mask */
> +#define RK_MMU_INT_CLEAR       0x18    /* Acknowledge and re-arm irq */
> +#define RK_MMU_INT_MASK                0x1C    /* IRQ enable */
> +#define RK_MMU_INT_STATUS      0x20    /* IRQ status after masking */
> +#define RK_MMU_AUTO_GATING     0x24
> +
> +#define DTE_ADDR_DUMMY         0xCAFEBABE
> +#define FORCE_RESET_TIMEOUT    100     /* ms */
> +
> +/* RK_MMU_STATUS fields */
> +#define RK_MMU_STATUS_PAGING_ENABLED       BIT(0)
> +#define RK_MMU_STATUS_PAGE_FAULT_ACTIVE    BIT(1)
> +#define RK_MMU_STATUS_STALL_ACTIVE         BIT(2)
> +#define RK_MMU_STATUS_IDLE                 BIT(3)
> +#define RK_MMU_STATUS_REPLAY_BUFFER_EMPTY  BIT(4)
> +#define RK_MMU_STATUS_PAGE_FAULT_IS_WRITE  BIT(5)
> +#define RK_MMU_STATUS_STALL_NOT_ACTIVE     BIT(31)
> +
> +/* RK_MMU_COMMAND command values */
> +#define RK_MMU_CMD_ENABLE_PAGING    0  /* Enable memory translation */
> +#define RK_MMU_CMD_DISABLE_PAGING   1  /* Disable memory translation */
> +#define RK_MMU_CMD_ENABLE_STALL     2  /* Stall paging to allow other cmds */
> +#define RK_MMU_CMD_DISABLE_STALL    3  /* Stopping stall re-enables paging */
> +#define RK_MMU_CMD_ZAP_CACHE        4  /* Shoot down entire IOTLB */
> +#define RK_MMU_CMD_PAGE_FAULT_DONE  5  /* Clear page fault */
> +#define RK_MMU_CMD_FORCE_RESET      6  /* Reset all registers */
> +
> +/* RK_MMU_INT_* register fields */
> +#define RK_MMU_IRQ_PAGE_FAULT    0x01  /* page fault */
> +#define RK_MMU_IRQ_BUS_ERROR     0x02  /* bus read error */
> +#define RK_MMU_IRQ_MASK          (RK_MMU_IRQ_PAGE_FAULT | RK_MMU_IRQ_BUS_ERROR)
> +
> +#define NUM_DT_ENTRIES 1024
> +#define NUM_PT_ENTRIES 1024
> +
> +#define SPAGE_ORDER 12
> +#define SPAGE_SIZE (1 << SPAGE_ORDER)
> +
> +/*
> + * Support mapping any size that fits in one page table:
> + *   4 KiB to 4 MiB
> + */
> +#define RK_IOMMU_PGSIZE_BITMAP 0x007ff000
> +
> +#define IOMMU_REG_POLL_COUNT_FAST 1000
> +
> +struct rk_iommu_domain {
> +       struct list_head iommus;
> +       u32 *dt; /* page directory table */
> +       spinlock_t iommus_lock; /* lock for iommus list */
> +       spinlock_t dt_lock; /* lock for modifying page directory table */
> +};
> +
> +struct rk_iommu {
> +       struct device *dev;
> +       void __iomem *base;
> +       int irq;
> +       struct list_head node; /* entry in rk_iommu_domain.iommus */
> +       struct iommu_domain *domain; /* domain to which iommu is attached */
> +};
> +
> +static inline void rk_table_flush(u32 *va, unsigned int count)
> +{
> +       phys_addr_t pa_start = virt_to_phys(va);
> +       phys_addr_t pa_end = virt_to_phys(va + count);
> +       size_t size = pa_end - pa_start;
> +
> +       __cpuc_flush_dcache_area(va, size);
> +       outer_flush_range(pa_start, pa_end);
> +}
> +
> +/**
> + * Inspired by _wait_for in intel_drv.h
> + * This is NOT safe for use in interrupt context.
> + *
> + * Note that it's important that we check the condition again after having
> + * timed out, since the timeout could be due to preemption or similar and
> + * we've never had a chance to check the condition before the timeout.
> + */
> +#define rk_wait_for(COND, MS) ({ \
> +       unsigned long timeout__ = jiffies + msecs_to_jiffies(MS) + 1;   \
> +       int ret__ = 0;                                                  \
> +       while (!(COND)) {                                               \
> +               if (time_after(jiffies, timeout__)) {                   \
> +                       ret__ = (COND) ? 0 : -ETIMEDOUT;                \
> +                       break;                                          \
> +               }                                                       \
> +               usleep_range(50, 100);                                  \
> +       }                                                               \
> +       ret__;                                                          \
> +})
> +
> +/*
> + * The Rockchip rk3288 iommu uses a 2-level page table.
> + * The first level is the "Directory Table" (DT).
> + * The DT consists of 1024 4-byte Directory Table Entries (DTEs), each pointing
> + * to a "Page Table".
> + * The second level is the 1024 Page Tables (PT).
> + * Each PT consists of 1024 4-byte Page Table Entries (PTEs), each pointing to
> + * a 4 KB page of physical memory.
> + *
> + * The DT and each PT fit in a single 4 KB page (4 bytes * 1024 entries).
> + * Each iommu device has a MMU_DTE_ADDR register that contains the physical
> + * address of the start of the DT page.
> + *
> + * The structure of the page table is as follows:
> + *
> + *                   DT
> + * MMU_DTE_ADDR -> +-----+
> + *                 |     |
> + *                 +-----+     PT
> + *                 | DTE | -> +-----+
> + *                 +-----+    |     |     Memory
> + *                 |     |    +-----+     Page
> + *                 |     |    | PTE | -> +-----+
> + *                 +-----+    +-----+    |     |
> + *                            |     |    |     |
> + *                            |     |    |     |
> + *                            +-----+    |     |
> + *                                       |     |
> + *                                       |     |
> + *                                       +-----+
> + */
> +
> +/*
> + * Each DTE has a PT address and a valid bit:
> + * +---------------------+-----------+-+
> + * | PT address          | Reserved  |V|
> + * +---------------------+-----------+-+
> + *  31:12 - PT address (PTs always start on a 4 KB boundary)
> + *  11: 1 - Reserved
> + *      0 - 1 if PT @ PT address is valid
> + */
> +#define RK_DTE_PT_ADDRESS_MASK    0xfffff000
> +#define RK_DTE_PT_VALID           BIT(0)
> +
> +static inline phys_addr_t rk_dte_pt_address(u32 dte)
> +{
> +       return (phys_addr_t)dte & RK_DTE_PT_ADDRESS_MASK;
> +}
> +
> +static inline bool rk_dte_is_pt_valid(u32 dte)
> +{
> +       return dte & RK_DTE_PT_VALID;
> +}
> +
> +static u32 rk_mk_dte(u32 *pt)
> +{
> +       phys_addr_t pt_phys = virt_to_phys(pt);
> +       return (pt_phys & RK_DTE_PT_ADDRESS_MASK) | RK_DTE_PT_VALID;
> +}
> +
> +/*
> + * Each PTE has a Page address, some flags and a valid bit:
> + * +---------------------+---+-------+-+
> + * | Page address        |Rsv| Flags |V|
> + * +---------------------+---+-------+-+
> + *  31:12 - Page address (Pages always start on a 4 KB boundary)
> + *  11: 9 - Reserved
> + *   8: 1 - Flags
> + *      8 - Read allocate - allocate cache space on read misses
> + *      7 - Read cache - enable cache & prefetch of data
> + *      6 - Write buffer - enable delaying writes on their way to memory
> + *      5 - Write allocate - allocate cache space on write misses
> + *      4 - Write cache - different writes can be merged together
> + *      3 - Override cache attributes
> + *          if 1, bits 4-8 control cache attributes
> + *          if 0, the system bus defaults are used
> + *      2 - Writable
> + *      1 - Readable
> + *      0 - 1 if Page @ Page address is valid
> + */
> +#define RK_PTE_PAGE_ADDRESS_MASK  0xfffff000
> +#define RK_PTE_PAGE_FLAGS_MASK    0x000001fe
> +#define RK_PTE_PAGE_WRITABLE      BIT(2)
> +#define RK_PTE_PAGE_READABLE      BIT(1)
> +#define RK_PTE_PAGE_VALID         BIT(0)
> +
> +static inline phys_addr_t rk_pte_page_address(u32 pte)
> +{
> +       return (phys_addr_t)pte & RK_PTE_PAGE_ADDRESS_MASK;
> +}
> +
> +static inline bool rk_pte_is_page_valid(u32 pte)
> +{
> +       return pte & RK_PTE_PAGE_VALID;
> +}
> +
> +/* TODO: set cache flags per prot IOMMU_CACHE */
> +static u32 rk_mk_pte(phys_addr_t page, int prot)
> +{
> +       u32 flags = 0;
> +       flags |= (prot & IOMMU_READ) ? RK_PTE_PAGE_READABLE : 0;
> +       flags |= (prot & IOMMU_WRITE) ? RK_PTE_PAGE_WRITABLE : 0;
> +       page &= RK_PTE_PAGE_ADDRESS_MASK;
> +       return page | flags | RK_PTE_PAGE_VALID;
> +}
> +
> +static u32 rk_mk_pte_invalid(u32 pte)
> +{
> +       return pte & ~RK_PTE_PAGE_VALID;
> +}
> +
> +/*
> + * rk3288 iova (IOMMU Virtual Address) format
> + *  31       22.21       12.11          0
> + * +-----------+-----------+-------------+
> + * | DTE index | PTE index | Page offset |
> + * +-----------+-----------+-------------+
> + *  31:22 - DTE index   - index of DTE in DT
> + *  21:12 - PTE index   - index of PTE in PT @ DTE.pt_address
> + *  11: 0 - Page offset - offset into page @ PTE.page_address
> + */
> +#define RK_IOVA_DTE_MASK    0xffc00000
> +#define RK_IOVA_DTE_SHIFT   22
> +#define RK_IOVA_PTE_MASK    0x003ff000
> +#define RK_IOVA_PTE_SHIFT   12
> +#define RK_IOVA_PAGE_MASK   0x00000fff
> +#define RK_IOVA_PAGE_SHIFT  0
> +
> +static u32 rk_iova_dte_index(dma_addr_t iova)
> +{
> +       return (u32)(iova & RK_IOVA_DTE_MASK) >> RK_IOVA_DTE_SHIFT;
> +}
> +
> +static u32 rk_iova_pte_index(dma_addr_t iova)
> +{
> +       return (u32)(iova & RK_IOVA_PTE_MASK) >> RK_IOVA_PTE_SHIFT;
> +}
> +
> +static u32 rk_iova_page_offset(dma_addr_t iova)
> +{
> +       return (u32)(iova & RK_IOVA_PAGE_MASK) >> RK_IOVA_PAGE_SHIFT;
> +}
> +
> +static u32 rk_iommu_read(struct rk_iommu *iommu, u32 offset)
> +{
> +       return readl(iommu->base + offset);
> +}
> +
> +static void rk_iommu_write(struct rk_iommu *iommu, u32 offset, u32 value)
> +{
> +       writel(value, iommu->base + offset);
> +}
> +
> +static void rk_iommu_command(struct rk_iommu *iommu, u32 command)
> +{
> +       writel(command, iommu->base + RK_MMU_COMMAND);
> +}
> +
> +static void rk_iommu_zap_lines(struct rk_iommu *iommu, dma_addr_t iova,
> +                              size_t size)
> +{
> +       dma_addr_t iova_end = iova + size;
> +       /*
> +        * TODO(djkurtz): Figure out when it is more efficient to shootdown the
> +        * entire iotlb rather than iterate over individual iovas.
> +        */
> +       for (; iova < iova_end; iova += SPAGE_SIZE)
> +               rk_iommu_write(iommu, RK_MMU_ZAP_ONE_LINE, iova);
> +}
> +
> +static bool rk_iommu_is_stall_active(struct rk_iommu *iommu)
> +{
> +       return rk_iommu_read(iommu, RK_MMU_STATUS) & RK_MMU_STATUS_STALL_ACTIVE;
> +}
> +
> +static bool rk_iommu_is_paging_enabled(struct rk_iommu *iommu)
> +{
> +       return rk_iommu_read(iommu, RK_MMU_STATUS) &
> +                            RK_MMU_STATUS_PAGING_ENABLED;
> +}
> +
> +static int rk_iommu_enable_stall(struct rk_iommu *iommu)
> +{
> +       int ret;
> +
> +       if (rk_iommu_is_stall_active(iommu))
> +               return 0;
> +
> +       /* Stall can only be enabled if paging is enabled */
> +       if (!rk_iommu_is_paging_enabled(iommu))
> +               return 0;
> +
> +       rk_iommu_command(iommu, RK_MMU_CMD_ENABLE_STALL);
> +
> +       ret = rk_wait_for(rk_iommu_is_stall_active(iommu), 1);
> +       if (ret)
> +               dev_err(iommu->dev, "Enable stall request timed out, status: %#08x\n",
> +                       rk_iommu_read(iommu, RK_MMU_STATUS));
> +
> +       return ret;
> +}
> +
> +static int rk_iommu_disable_stall(struct rk_iommu *iommu)
> +{
> +       int ret;
> +
> +       if (!rk_iommu_is_stall_active(iommu))
> +               return 0;
> +
> +       rk_iommu_command(iommu, RK_MMU_CMD_DISABLE_STALL);
> +
> +       ret = rk_wait_for(!rk_iommu_is_stall_active(iommu), 1);
> +       if (ret)
> +               dev_err(iommu->dev, "Disable stall request timed out, status: %#08x\n",
> +                       rk_iommu_read(iommu, RK_MMU_STATUS));
> +
> +       return ret;
> +}
> +
> +static int rk_iommu_enable_paging(struct rk_iommu *iommu)
> +{
> +       int ret;
> +
> +       if (rk_iommu_is_paging_enabled(iommu))
> +               return 0;
> +
> +       rk_iommu_command(iommu, RK_MMU_CMD_ENABLE_PAGING);
> +
> +       ret = rk_wait_for(rk_iommu_is_paging_enabled(iommu), 1);
> +       if (ret)
> +               dev_err(iommu->dev, "Enable paging request timed out, status: %#08x\n",
> +                       rk_iommu_read(iommu, RK_MMU_STATUS));
> +
> +       return ret;
> +}
> +
> +static int rk_iommu_disable_paging(struct rk_iommu *iommu)
> +{
> +       int ret;
> +
> +       if (!rk_iommu_is_paging_enabled(iommu))
> +               return 0;
> +
> +       rk_iommu_command(iommu, RK_MMU_CMD_DISABLE_PAGING);
> +
> +       ret = rk_wait_for(!rk_iommu_is_paging_enabled(iommu), 1);
> +       if (ret)
> +               dev_err(iommu->dev, "Disable paging request timed out, status: %#08x\n",
> +                       rk_iommu_read(iommu, RK_MMU_STATUS));
> +
> +       return ret;
> +}
> +
> +static int rk_iommu_force_reset(struct rk_iommu *iommu)
> +{
> +       int ret;
> +       u32 dte_addr;
> +
> +       /*
> +        * Check if register DTE_ADDR is working by writing DTE_ADDR_DUMMY
> +        * and verifying that upper 5 nybbles are read back.
> +        */
> +       rk_iommu_write(iommu, RK_MMU_DTE_ADDR, DTE_ADDR_DUMMY);
> +
> +       dte_addr = rk_iommu_read(iommu, RK_MMU_DTE_ADDR);
> +       if (dte_addr != (DTE_ADDR_DUMMY & RK_DTE_PT_ADDRESS_MASK)) {
> +               dev_err(iommu->dev, "Error during raw reset. MMU_DTE_ADDR is not functioning\n");
> +               return -EFAULT;
> +       }
> +
> +       rk_iommu_command(iommu, RK_MMU_CMD_FORCE_RESET);
> +
> +       ret = rk_wait_for(rk_iommu_read(iommu, RK_MMU_DTE_ADDR) == 0x00000000,
> +                         FORCE_RESET_TIMEOUT);
> +       if (ret)
> +               dev_err(iommu->dev, "FORCE_RESET command timed out\n");
> +
> +       return ret;
> +}
> +
> +static void log_iova(struct rk_iommu *iommu, dma_addr_t iova)
> +{
> +       u32 dte_index, pte_index, page_offset;
> +       u32 mmu_dte_addr;
> +       phys_addr_t mmu_dte_addr_phys, dte_addr_phys;
> +       u32 *dte_addr;
> +       u32 dte;
> +       phys_addr_t pte_addr_phys = 0;
> +       u32 *pte_addr = NULL;
> +       u32 pte = 0;
> +       phys_addr_t page_addr_phys = 0;
> +       u32 page_flags = 0;
> +
> +       dte_index = rk_iova_dte_index(iova);
> +       pte_index = rk_iova_pte_index(iova);
> +       page_offset = rk_iova_page_offset(iova);
> +
> +       mmu_dte_addr = rk_iommu_read(iommu, RK_MMU_DTE_ADDR);
> +       mmu_dte_addr_phys = (phys_addr_t)mmu_dte_addr;
> +
> +       dte_addr_phys = mmu_dte_addr_phys + (4 * dte_index);
> +       dte_addr = phys_to_virt(dte_addr_phys);
> +       dte = *dte_addr;
> +
> +       if (!rk_dte_is_pt_valid(dte))
> +               goto print_it;
> +
> +       pte_addr_phys = rk_dte_pt_address(dte) + (pte_index * 4);
> +       pte_addr = phys_to_virt(pte_addr_phys);
> +       pte = *pte_addr;
> +
> +       if (!rk_pte_is_page_valid(pte))
> +               goto print_it;
> +
> +       page_addr_phys = rk_pte_page_address(pte) + page_offset;
> +       page_flags = pte & RK_PTE_PAGE_FLAGS_MASK;
> +
> +print_it:
> +       dev_err(iommu->dev, "iova = %pad: dte_index: 0x%03x pte_index: 0x%03x page_offset: 0x%03x\n",
> +               &iova, dte_index, pte_index, page_offset);
> +       dev_err(iommu->dev, "mmu_dte_addr: %pa dte@%pa: %#08x valid: %u pte@%pa: %#08x valid: %u page@%pa flags: %#03x\n",
> +               &mmu_dte_addr_phys, &dte_addr_phys, dte,
> +               rk_dte_is_pt_valid(dte), &pte_addr_phys, pte,
> +               rk_pte_is_page_valid(pte), &page_addr_phys, page_flags);
> +}
> +
> +static irqreturn_t rk_iommu_irq(int irq, void *dev_id)
> +{
> +       struct rk_iommu *iommu = dev_id;
> +       u32 status;
> +       u32 int_status;
> +       dma_addr_t iova;
> +
> +       int_status = rk_iommu_read(iommu, RK_MMU_INT_STATUS);
> +       if (int_status == 0)
> +               return IRQ_NONE;
> +
> +       iova = rk_iommu_read(iommu, RK_MMU_PAGE_FAULT_ADDR);
> +
> +       if (int_status & RK_MMU_IRQ_PAGE_FAULT) {
> +               int flags;
> +
> +               status = rk_iommu_read(iommu, RK_MMU_STATUS);
> +               flags = (status & RK_MMU_STATUS_PAGE_FAULT_IS_WRITE) ?
> +                               IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
> +
> +               dev_err(iommu->dev, "Page fault at %pad of type %s\n",
> +                       &iova,
> +                       (flags == IOMMU_FAULT_WRITE) ? "write" : "read");
> +
> +               log_iova(iommu, iova);
> +
> +               /*
> +                * Report page fault to any installed handlers.
> +                * Ignore the return code, though, since we always zap cache
> +                * and clear the page fault anyway.
> +                */
> +               if (iommu->domain)
> +                       report_iommu_fault(iommu->domain, iommu->dev, iova,
> +                                          flags);
> +               else
> +                       dev_err(iommu->dev, "Page fault while iommu not attached to domain?\n");
> +
> +               rk_iommu_command(iommu, RK_MMU_CMD_ZAP_CACHE);
> +               rk_iommu_command(iommu, RK_MMU_CMD_PAGE_FAULT_DONE);
> +       }
> +
> +       if (int_status & RK_MMU_IRQ_BUS_ERROR)
> +               dev_err(iommu->dev, "BUS_ERROR occurred at %pad\n", &iova);
> +
> +       if (int_status & ~RK_MMU_IRQ_MASK)
> +               dev_err(iommu->dev, "unexpected int_status: %#08x\n",
> +                       int_status);
> +
> +       rk_iommu_write(iommu, RK_MMU_INT_CLEAR, int_status);
> +
> +       return IRQ_HANDLED;
> +}
> +
> +static phys_addr_t rk_iommu_iova_to_phys(struct iommu_domain *domain,
> +                                        dma_addr_t iova)
> +{
> +       struct rk_iommu_domain *rk_domain = domain->priv;
> +       unsigned long flags;
> +       phys_addr_t pt_phys, phys = 0;
> +       u32 dte, pte;
> +       u32 *page_table;
> +
> +       spin_lock_irqsave(&rk_domain->dt_lock, flags);
> +
> +       dte = rk_domain->dt[rk_iova_dte_index(iova)];
> +       if (!rk_dte_is_pt_valid(dte))
> +               goto out;
> +
> +       pt_phys = rk_dte_pt_address(dte);
> +       page_table = (u32 *)phys_to_virt(pt_phys);
> +       pte = page_table[rk_iova_pte_index(iova)];
> +       if (!rk_pte_is_page_valid(pte))
> +               goto out;
> +
> +       phys = rk_pte_page_address(pte) + rk_iova_page_offset(iova);
> +out:
> +       spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
> +
> +       return phys;
> +}
> +
> +static void rk_iommu_zap_iova(struct rk_iommu_domain *rk_domain,
> +                             dma_addr_t iova, size_t size)
> +{
> +       struct list_head *pos;
> +       unsigned long flags;
> +
> +       /* shootdown these iova from all iommus using this domain */
> +       spin_lock_irqsave(&rk_domain->iommus_lock, flags);
> +       list_for_each(pos, &rk_domain->iommus) {
> +               struct rk_iommu *iommu;
> +               iommu = list_entry(pos, struct rk_iommu, node);
> +               rk_iommu_zap_lines(iommu, iova, size);
> +       }
> +       spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
> +}
> +
> +static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain,
> +                                 dma_addr_t iova)
> +{
> +       u32 *page_table, *dte_addr;
> +       u32 dte;
> +       phys_addr_t pt_phys;
> +
> +       assert_spin_locked(&rk_domain->dt_lock);
> +
> +       dte_addr = &rk_domain->dt[rk_iova_dte_index(iova)];
> +       dte = *dte_addr;
> +       if (rk_dte_is_pt_valid(dte))
> +               goto done;
> +
> +       page_table = (u32 *)get_zeroed_page(GFP_ATOMIC | GFP_DMA32);
> +       if (!page_table)
> +               return ERR_PTR(-ENOMEM);
> +
> +       dte = rk_mk_dte(page_table);
> +       *dte_addr = dte;
> +
> +       rk_table_flush(page_table, NUM_PT_ENTRIES);
> +       rk_table_flush(dte_addr, 1);
> +
> +       /*
> +        * Zap the first iova of newly allocated page table so iommu evicts
> +        * old cached value of new dte from the iotlb.
> +        */
> +       rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);
> +
> +done:
> +       pt_phys = rk_dte_pt_address(dte);
> +       return (u32 *)phys_to_virt(pt_phys);
> +}
> +
> +static size_t rk_iommu_unmap_iova(struct rk_iommu_domain *rk_domain,
> +                                 u32 *pte_addr, dma_addr_t iova, size_t size)
> +{
> +       unsigned int pte_count;
> +       unsigned int pte_total = size / SPAGE_SIZE;
> +
> +       assert_spin_locked(&rk_domain->dt_lock);
> +
> +       for (pte_count = 0; pte_count < pte_total; pte_count++) {
> +               u32 pte = pte_addr[pte_count];
> +               if (!rk_pte_is_page_valid(pte))
> +                       break;
> +
> +               pte_addr[pte_count] = rk_mk_pte_invalid(pte);
> +       }
> +
> +       rk_table_flush(pte_addr, pte_count);
> +
> +       return pte_count * SPAGE_SIZE;
> +}
> +
> +static int rk_iommu_map_iova(struct rk_iommu_domain *rk_domain, u32 *pte_addr,
> +                            dma_addr_t iova, phys_addr_t paddr, size_t size,
> +                            int prot)
> +{
> +       unsigned int pte_count;
> +       unsigned int pte_total = size / SPAGE_SIZE;
> +       phys_addr_t page_phys;
> +
> +       assert_spin_locked(&rk_domain->dt_lock);
> +
> +       for (pte_count = 0; pte_count < pte_total; pte_count++) {
> +               u32 pte = pte_addr[pte_count];
> +
> +               if (rk_pte_is_page_valid(pte))
> +                       goto unwind;
> +
> +               pte_addr[pte_count] = rk_mk_pte(paddr, prot);
> +
> +               paddr += SPAGE_SIZE;
> +       }
> +
> +       rk_table_flush(pte_addr, pte_count);
> +
> +       return 0;
> +unwind:
> +       /* Unmap the range of iovas that we just mapped */
> +       rk_iommu_unmap_iova(rk_domain, pte_addr, iova, pte_count * SPAGE_SIZE);
> +
> +       iova += pte_count * SPAGE_SIZE;
> +       page_phys = rk_pte_page_address(pte_addr[pte_count]);
> +       pr_err("iova: %pad already mapped to %pa cannot remap to phys: %pa prot:%#x\n",
> +              &iova, &page_phys, &paddr, prot);
> +
> +       return -EADDRINUSE;
> +}
> +
> +static int rk_iommu_map(struct iommu_domain *domain, unsigned long _iova,
> +                       phys_addr_t paddr, size_t size, int prot)
> +{
> +       struct rk_iommu_domain *rk_domain = domain->priv;
> +       unsigned long flags;
> +       dma_addr_t iova = (dma_addr_t)_iova;
> +       u32 *page_table, *pte_addr;
> +       int ret;
> +
> +       spin_lock_irqsave(&rk_domain->dt_lock, flags);
> +
> +       /*
> +        * pgsize_bitmap specifies iova sizes that fit in one page table
> +        * (1024 4-KiB pages = 4 MiB).
> +        * So, size will always be 4096 <= size <= 4194304.
> +        * Since iommu_map() guarantees that both iova and size will be
> +        * aligned, we will always only be mapping from a single dte here.
> +        */
> +       page_table = rk_dte_get_page_table(rk_domain, iova);
> +       if (IS_ERR(page_table)) {
> +               spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
> +               return PTR_ERR(page_table);
> +       }
> +
> +       pte_addr = &page_table[rk_iova_pte_index(iova)];
> +       ret = rk_iommu_map_iova(rk_domain, pte_addr, iova, paddr, size, prot);
> +       spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
> +
> +       return ret;
> +}
> +
> +static size_t rk_iommu_unmap(struct iommu_domain *domain, unsigned long _iova,
> +                            size_t size)
> +{
> +       struct rk_iommu_domain *rk_domain = domain->priv;
> +       unsigned long flags;
> +       dma_addr_t iova = (dma_addr_t)_iova;
> +       phys_addr_t pt_phys;
> +       u32 dte;
> +       u32 *pte_addr;
> +       size_t unmap_size;
> +
> +       spin_lock_irqsave(&rk_domain->dt_lock, flags);
> +
> +       /*
> +        * pgsize_bitmap specifies iova sizes that fit in one page table
> +        * (1024 4-KiB pages = 4 MiB).
> +        * So, size will always be 4096 <= size <= 4194304.
> +        * Since iommu_unmap() guarantees that both iova and size will be
> +        * aligned, we will always only be unmapping from a single dte here.
> +        */
> +       dte = rk_domain->dt[rk_iova_dte_index(iova)];
> +       /* Just return 0 if iova is unmapped */
> +       if (!rk_dte_is_pt_valid(dte)) {
> +               spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
> +               return 0;
> +       }
> +
> +       pt_phys = rk_dte_pt_address(dte);
> +       pte_addr = (u32 *)phys_to_virt(pt_phys) + rk_iova_pte_index(iova);
> +       unmap_size = rk_iommu_unmap_iova(rk_domain, pte_addr, iova, size);
> +
> +       spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
> +
> +       /* Shootdown iotlb entries for iova range that was just unmapped */
> +       rk_iommu_zap_iova(rk_domain, iova, unmap_size);
> +
> +       return unmap_size;
> +}
> +
> +static int rk_iommu_attach_device(struct iommu_domain *domain,
> +                                 struct device *dev)
> +{
> +       struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
> +       struct rk_iommu_domain *rk_domain = domain->priv;
> +       unsigned long flags;
> +       int ret;
> +       phys_addr_t dte_addr;
> +
> +       /*
> +        * Allow 'virtual devices' (e.g., drm) to attach to domain.
> +        * Such a device has a NULL archdata.iommu.
> +        */
> +       if (!iommu)
> +               return 0;
> +
> +       ret = rk_iommu_enable_stall(iommu);
> +       if (ret)
> +               return ret;
> +
> +       ret = rk_iommu_force_reset(iommu);
> +       if (ret)
> +               return ret;
> +
> +       iommu->domain = domain;
> +
> +       ret = devm_request_irq(dev, iommu->irq, rk_iommu_irq,
> +                              IRQF_SHARED, dev_name(dev), iommu);
> +       if (ret)
> +               return ret;
> +
> +       dte_addr = virt_to_phys(rk_domain->dt);
> +       rk_iommu_write(iommu, RK_MMU_DTE_ADDR, dte_addr);
> +       rk_iommu_command(iommu, RK_MMU_CMD_ZAP_CACHE);
> +       rk_iommu_write(iommu, RK_MMU_INT_MASK, RK_MMU_IRQ_MASK);
> +
> +       ret = rk_iommu_enable_paging(iommu);
> +       if (ret)
> +               return ret;
> +
> +       spin_lock_irqsave(&rk_domain->iommus_lock, flags);
> +       list_add_tail(&iommu->node, &rk_domain->iommus);
> +       spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
> +
> +       dev_info(dev, "Attached to iommu domain\n");
> +
> +       rk_iommu_disable_stall(iommu);
> +
> +       return 0;
> +}
> +
> +static void rk_iommu_detach_device(struct iommu_domain *domain,
> +                                  struct device *dev)
> +{
> +       struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
> +       struct rk_iommu_domain *rk_domain = domain->priv;
> +       unsigned long flags;
> +
> +       /* Allow 'virtual devices' (e.g., drm) to detach from domain */
> +       if (!iommu)
> +               return;
> +
> +       iommu->domain = NULL;
> +
> +       spin_lock_irqsave(&rk_domain->iommus_lock, flags);
> +       list_del_init(&iommu->node);
> +       spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
> +
> +       devm_free_irq(dev, iommu->irq, iommu);
> +
> +       iommu->domain = NULL;
> +
> +       /* Ignore error while disabling, just keep going */
> +       rk_iommu_enable_stall(iommu);
> +       rk_iommu_disable_paging(iommu);
> +       rk_iommu_write(iommu, RK_MMU_INT_MASK, 0);
> +       rk_iommu_write(iommu, RK_MMU_DTE_ADDR, 0);
> +       rk_iommu_disable_stall(iommu);
> +
> +       dev_info(dev, "Detached from iommu domain\n");
> +}
> +
> +static int rk_iommu_domain_init(struct iommu_domain *domain)
> +{
> +       struct rk_iommu_domain *rk_domain;
> +
> +       rk_domain = kzalloc(sizeof(*rk_domain), GFP_KERNEL);
> +       if (!rk_domain)
> +               return -ENOMEM;
> +
> +       /*
> +        * rk32xx iommus use a 2 level pagetable.
> +        * Each level1 (dt) and level2 (pt) table has 1024 4-byte entries.
> +        * Allocate one 4 KiB page for each table.
> +        */
> +       rk_domain->dt = (u32 *)get_zeroed_page(GFP_KERNEL | GFP_DMA32);
> +       if (!rk_domain->dt)
> +               goto err_dt;
> +
> +       rk_table_flush(rk_domain->dt, NUM_DT_ENTRIES);
> +
> +       spin_lock_init(&rk_domain->iommus_lock);
> +       spin_lock_init(&rk_domain->dt_lock);
> +       INIT_LIST_HEAD(&rk_domain->iommus);
> +
> +       domain->priv = rk_domain;
> +
> +       return 0;
> +err_dt:
> +       kfree(rk_domain);
> +       return -ENOMEM;
> +}
> +
> +static void rk_iommu_domain_destroy(struct iommu_domain *domain)
> +{
> +       struct rk_iommu_domain *rk_domain = domain->priv;
> +       int i;
> +
> +       WARN_ON(!list_empty(&rk_domain->iommus));
> +
> +       for (i = 0; i < NUM_DT_ENTRIES; i++) {
> +               u32 dte = rk_domain->dt[i];
> +               if (rk_dte_is_pt_valid(dte)) {
> +                       phys_addr_t pt_phys = rk_dte_pt_address(dte);
> +                       u32 *page_table = phys_to_virt(pt_phys);
> +                       free_page((unsigned long)page_table);
> +               }
> +       }
> +
> +       free_page((unsigned long)rk_domain->dt);
> +       kfree(domain->priv);
> +       domain->priv = NULL;
> +}
> +
> +static const struct iommu_ops rk_iommu_ops = {
> +       .domain_init = rk_iommu_domain_init,
> +       .domain_destroy = rk_iommu_domain_destroy,
> +       .attach_dev = rk_iommu_attach_device,
> +       .detach_dev = rk_iommu_detach_device,
> +       .map = rk_iommu_map,
> +       .unmap = rk_iommu_unmap,
> +       .iova_to_phys = rk_iommu_iova_to_phys,
> +       .pgsize_bitmap = RK_IOMMU_PGSIZE_BITMAP,
> +};
> +
> +static int rk_iommu_probe(struct platform_device *pdev)
> +{
> +       struct device *dev = &pdev->dev;
> +       struct rk_iommu *iommu;
> +       struct resource *res;
> +
> +       iommu = devm_kzalloc(dev, sizeof(*iommu), GFP_KERNEL);
> +       if (!iommu)
> +               return -ENOMEM;
> +
> +       platform_set_drvdata(pdev, iommu);
> +       iommu->dev = dev;
> +
> +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +       iommu->base = devm_ioremap_resource(&pdev->dev, res);
> +       if (IS_ERR(iommu->base))
> +               return PTR_ERR(iommu->base);
> +
> +       iommu->irq = platform_get_irq(pdev, 0);
> +       if (iommu->irq < 0) {
> +               dev_err(dev, "Failed to get IRQ, %d\n", iommu->irq);
> +               return -ENXIO;
> +       }
> +
> +       return 0;
> +}
> +
> +static int rk_iommu_remove(struct platform_device *pdev)
> +{
> +       return 0;
> +}
> +
> +#ifdef CONFIG_OF
> +static const struct of_device_id rk_iommu_dt_ids[] = {
> +       { .compatible = "rockchip,iommu" },
> +       { /* sentinel */ }
> +};
> +MODULE_DEVICE_TABLE(of, rk_iommu_dt_ids);
> +#endif
> +
> +static struct platform_driver rk_iommu_driver = {
> +       .probe = rk_iommu_probe,
> +       .remove = rk_iommu_remove,
> +       .driver = {
> +                  .name = "rk_iommu",
> +                  .owner = THIS_MODULE,
> +                  .of_match_table = of_match_ptr(rk_iommu_dt_ids),
> +       },
> +};
> +
> +static int __init rk_iommu_init(void)
> +{
> +       int ret;
> +
> +       ret = bus_set_iommu(&platform_bus_type, &rk_iommu_ops);
> +       if (ret)
> +               return ret;
> +
> +       return platform_driver_register(&rk_iommu_driver);
> +}
> +static void __exit rk_iommu_exit(void)
> +{
> +       platform_driver_unregister(&rk_iommu_driver);
> +}
> +
> +subsys_initcall(rk_iommu_init);
> +module_exit(rk_iommu_exit);
> +
> +MODULE_DESCRIPTION("IOMMU API for Rockchip");
> +MODULE_AUTHOR("Simon Xue <xxm@rock-chips.com> and Daniel Kurtz <djkurtz@chromium.org>");
> +MODULE_ALIAS("platform:rockchip-iommu");
> +MODULE_LICENSE("GPL v2");
> --
> 2.1.0.rc2.206.gedb03e5
>
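On the "single dte" reasoning in the rk_iommu_map() and rk_iommu_unmap()
comments above: the arithmetic can be checked with a small user-space
sketch. This is illustrative only and not part of the patch. The bitmap
value is copied from RK_IOMMU_PGSIZE_BITMAP; the ex_* helper names are
hypothetical, and the check assumes the caller passes a naturally aligned
iova and a power-of-two size, as the comment says iommu_map() guarantees.

```c
#include <assert.h>
#include <stdint.h>

/* Value copied from RK_IOMMU_PGSIZE_BITMAP in the patch. */
#define EX_PGSIZE_BITMAP 0x007ff000u

/* Smallest supported mapping size: the lowest set bit of the bitmap. */
static uint32_t ex_min_pgsize(uint32_t bitmap)
{
    return bitmap & (~bitmap + 1u);
}

/* Largest supported mapping size: the highest set bit of the bitmap. */
static uint32_t ex_max_pgsize(uint32_t bitmap)
{
    uint32_t size = 0;

    while (bitmap) {
        size = bitmap & (~bitmap + 1u); /* current lowest set bit */
        bitmap &= bitmap - 1;           /* clear it and continue */
    }
    return size;
}

/* Number of 4 KiB PTEs needed for a mapping of the given size. */
static uint32_t ex_pte_count(uint32_t size)
{
    return size >> 12;
}

/*
 * Nonzero if [iova, iova + size) stays inside one 4 MiB region, i.e. all
 * of its PTEs live in the page table of a single DTE. size must be > 0.
 */
static int ex_fits_one_dte(uint32_t iova, uint32_t size)
{
    return (iova >> 22) == ((iova + size - 1) >> 22);
}
```

With the bitmap above, the smallest size is 4096 bytes (one PTE) and the
largest is 4194304 bytes (all 1024 PTEs of one page table), so an aligned
request never crosses into a second page table.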

* [PATCH v5 1/3] iommu/rockchip: rk3288 iommu driver
@ 2014-10-17  2:22     ` Daniel Kurtz
  0 siblings, 0 replies; 26+ messages in thread
From: Daniel Kurtz @ 2014-10-17  2:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Oct 14, 2014 at 4:02 PM, Daniel Kurtz <djkurtz@chromium.org> wrote:
> The rk3288 has several iommus.  Each iommu belongs to a single master
> device.  There is one device (ISP) that has two slave iommus, but that
> case is not yet supported by this driver.
>
> At subsys init, the iommu driver registers itself as the iommu driver for
> the platform bus.  The master devices find their slave iommus using the
> "iommus" field in their devicetree description.  Since each slave iommu
> belongs to exactly one master, there is no additional data needed at probe
> to associate a slave with its master.
>
> An iommu device's power domain, clock and irq are all shared with its
> master device, and the master device must be careful to attach to the
> iommu only after powering and clocking it (and to keep it powered and
> clocked until after detaching).  Because there is no guarantee what the
> status of the iommu is at probe, and since the driver does not even know
> if the device is powered, we delay requesting its irq until the master
> device attaches, at which point we have a guarantee that the device is
> powered and clocked and we can reset it and disable its interrupt mask.
>
> An iommu_domain describes a virtual iova address space.  Each iommu_domain
> has a corresponding page table that lists the mappings from iova to
> physical address.
>
> For the rk3288 iommu, the page table has two levels:
>  The Level 1 "directory_table" has 1024 4-byte dte entries.
>  Each dte points to a level 2 "page_table".
>  Each level 2 page_table has 1024 4-byte pte entries.
>  Each pte points to a 4 KiB page of memory.
>
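As an illustration of the two-level layout described above, a 32-bit iova
splits into a DTE index, a PTE index and a page offset. The following is a
minimal user-space sketch, not driver code. The ex_* names are made up for
this example, and the shifts are assumed from the "1024 entries, 4 KiB
pages" description.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants derived from the layout described above. */
#define EX_ENTRIES_PER_TABLE 1024u  /* DTEs per DT, PTEs per PT */
#define EX_PAGE_SHIFT        12u    /* 4 KiB pages */
#define EX_PTE_SHIFT         12u
#define EX_DTE_SHIFT         22u

/* Hypothetical helper: bits 31:22 select the DTE. */
static uint32_t ex_dte_index(uint32_t iova)
{
    return iova >> EX_DTE_SHIFT;
}

/* Hypothetical helper: bits 21:12 select the PTE within the page table. */
static uint32_t ex_pte_index(uint32_t iova)
{
    return (iova >> EX_PTE_SHIFT) & (EX_ENTRIES_PER_TABLE - 1);
}

/* Hypothetical helper: bits 11:0 are the byte offset into the page. */
static uint32_t ex_page_offset(uint32_t iova)
{
    return iova & ((1u << EX_PAGE_SHIFT) - 1);
}

/* Bytes of iova space covered by one page table: 1024 * 4 KiB = 4 MiB. */
static uint64_t ex_pt_span(void)
{
    return (uint64_t)EX_ENTRIES_PER_TABLE << EX_PAGE_SHIFT;
}

/* Rebuild an iova from its three fields; the inverse of the helpers above. */
static uint32_t ex_iova_join(uint32_t dte, uint32_t pte, uint32_t off)
{
    assert(dte < EX_ENTRIES_PER_TABLE && pte < EX_ENTRIES_PER_TABLE);
    assert(off < (1u << EX_PAGE_SHIFT));
    return (dte << EX_DTE_SHIFT) | (pte << EX_PTE_SHIFT) | off;
}
```

For example, iova 0x12345678 has DTE index 0x048, PTE index 0x345 and page
offset 0x678. One directory table of 1024 page tables spans the full 4 GiB
iova space.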
> An iommu_domain is created when a dma_iommu_mapping is created via
> arm_iommu_create_mapping.  Master devices can then attach themselves to
> this mapping (or attach the mapping to themselves?) by calling
> arm_iommu_attach_device().  This in turn instructs the iommu driver to
> write the page table's physical address into the slave iommu's "Directory
> Table Entry" (DTE) register.
>
> In fact multiple master devices, each with their own slave iommu device,
> can all attach to the same mapping.  The iommus for these devices will
> share the same iommu_domain and therefore point to the same page table.
> Thus, the iommu domain maintains a list of iommu devices which are
> attached.  This driver relies on the iommu core to ensure that all devices
> have detached before destroying a domain.
>
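To illustrate the shared-domain point above: when several iommus point at
the same page table, a mapping made once in the domain is visible through
every attached iommu. The following is a simplified user-space model under
stated assumptions, not the driver's implementation. The ex_* types and
functions are hypothetical, page tables are reached through C pointers
rather than physical addresses, and the error values are placeholders.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define EX_ENTRIES   1024u
#define EX_VALID     0x1u
#define EX_ADDR_MASK 0xfffff000u

/* Simulated domain: one slot per DTE, pointing at a page table or NULL. */
struct ex_domain {
    uint32_t *pt[EX_ENTRIES];
};

/* Simulated iommu: the pointer stands in for the DTE address register. */
struct ex_iommu {
    struct ex_domain *domain;
};

/* Map one 4 KiB page; allocate the page table on first use. */
static int ex_map_page(struct ex_domain *d, uint32_t iova, uint32_t phys)
{
    uint32_t di = iova >> 22;
    uint32_t pi = (iova >> 12) & (EX_ENTRIES - 1);

    if (!d->pt[di]) {
        d->pt[di] = calloc(EX_ENTRIES, sizeof(uint32_t));
        if (!d->pt[di])
            return -1;  /* placeholder for an out-of-memory error */
    }
    if (d->pt[di][pi] & EX_VALID)
        return -2;      /* placeholder for "already mapped" */

    d->pt[di][pi] = (phys & EX_ADDR_MASK) | EX_VALID;
    return 0;
}

/* Walk the tables of the attached domain; returns 0 if iova is unmapped. */
static uint32_t ex_translate(const struct ex_iommu *mmu, uint32_t iova)
{
    const struct ex_domain *d = mmu->domain;
    uint32_t di = iova >> 22;
    uint32_t pi = (iova >> 12) & (EX_ENTRIES - 1);
    uint32_t pte;

    if (!d || !d->pt[di])
        return 0;

    pte = d->pt[di][pi];
    if (!(pte & EX_VALID))
        return 0;

    return (pte & EX_ADDR_MASK) | (iova & 0xfffu);
}
```

Two ex_iommu instances initialised with the same ex_domain pointer translate
an iova identically after a single ex_map_page() call, which is why the real
domain has to track every attached iommu when it needs to shoot down stale
iotlb entries.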
> Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
> Signed-off-by: Simon Xue <xxm@rock-chips.com>
> Reviewed-by: Grant Grundler <grundler@chromium.org>
> Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>

Gentle ping.

Any more feedback on the rockchip iommu driver?

Thanks,
-Daniel

> ---
>  drivers/iommu/Kconfig          |  12 +
>  drivers/iommu/Makefile         |   1 +
>  drivers/iommu/rockchip-iommu.c | 924 +++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 937 insertions(+)
>  create mode 100644 drivers/iommu/rockchip-iommu.c
>
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index dd51122..d0a1261 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -152,6 +152,18 @@ config OMAP_IOMMU_DEBUG
>
>           Say N unless you know you need this.
>
> +config ROCKCHIP_IOMMU
> +       bool "Rockchip IOMMU Support"
> +       depends on ARCH_ROCKCHIP
> +       select IOMMU_API
> +       select ARM_DMA_USE_IOMMU
> +       help
> +         Support for IOMMUs found on Rockchip rk32xx SoCs.
> +         These IOMMUs allow virtualization of the address space used by most
> +         cores within the multimedia subsystem.
> +         Say Y here if you are using a Rockchip SoC that includes an IOMMU
> +         device.
> +
>  config TEGRA_IOMMU_GART
>         bool "Tegra GART IOMMU Support"
>         depends on ARCH_TEGRA_2x_SOC
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 16edef7..3e47ef3 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -13,6 +13,7 @@ obj-$(CONFIG_IRQ_REMAP) += intel_irq_remapping.o irq_remapping.o
>  obj-$(CONFIG_OMAP_IOMMU) += omap-iommu.o
>  obj-$(CONFIG_OMAP_IOMMU) += omap-iommu2.o
>  obj-$(CONFIG_OMAP_IOMMU_DEBUG) += omap-iommu-debug.o
> +obj-$(CONFIG_ROCKCHIP_IOMMU) += rockchip-iommu.o
>  obj-$(CONFIG_TEGRA_IOMMU_GART) += tegra-gart.o
>  obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
>  obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> new file mode 100644
> index 0000000..08e50fc
> --- /dev/null
> +++ b/drivers/iommu/rockchip-iommu.c
> @@ -0,0 +1,924 @@
> +/*
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <asm/cacheflush.h>
> +#include <asm/pgtable.h>
> +#include <linux/compiler.h>
> +#include <linux/delay.h>
> +#include <linux/device.h>
> +#include <linux/errno.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/iommu.h>
> +#include <linux/jiffies.h>
> +#include <linux/list.h>
> +#include <linux/mm.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +#include <linux/spinlock.h>
> +
> +/** MMU register offsets */
> +#define RK_MMU_DTE_ADDR                0x00    /* Directory table address */
> +#define RK_MMU_STATUS          0x04
> +#define RK_MMU_COMMAND         0x08
> +#define RK_MMU_PAGE_FAULT_ADDR 0x0C    /* IOVA of last page fault */
> +#define RK_MMU_ZAP_ONE_LINE    0x10    /* Shootdown one IOTLB entry */
> +#define RK_MMU_INT_RAWSTAT     0x14    /* IRQ status ignoring mask */
> +#define RK_MMU_INT_CLEAR       0x18    /* Acknowledge and re-arm irq */
> +#define RK_MMU_INT_MASK                0x1C    /* IRQ enable */
> +#define RK_MMU_INT_STATUS      0x20    /* IRQ status after masking */
> +#define RK_MMU_AUTO_GATING     0x24
> +
> +#define DTE_ADDR_DUMMY         0xCAFEBABE
> +#define FORCE_RESET_TIMEOUT    100     /* ms */
> +
> +/* RK_MMU_STATUS fields */
> +#define RK_MMU_STATUS_PAGING_ENABLED       BIT(0)
> +#define RK_MMU_STATUS_PAGE_FAULT_ACTIVE    BIT(1)
> +#define RK_MMU_STATUS_STALL_ACTIVE         BIT(2)
> +#define RK_MMU_STATUS_IDLE                 BIT(3)
> +#define RK_MMU_STATUS_REPLAY_BUFFER_EMPTY  BIT(4)
> +#define RK_MMU_STATUS_PAGE_FAULT_IS_WRITE  BIT(5)
> +#define RK_MMU_STATUS_STALL_NOT_ACTIVE     BIT(31)
> +
> +/* RK_MMU_COMMAND command values */
> +#define RK_MMU_CMD_ENABLE_PAGING    0  /* Enable memory translation */
> +#define RK_MMU_CMD_DISABLE_PAGING   1  /* Disable memory translation */
> +#define RK_MMU_CMD_ENABLE_STALL     2  /* Stall paging to allow other cmds */
> +#define RK_MMU_CMD_DISABLE_STALL    3  /* Stop stall re-enables paging */
> +#define RK_MMU_CMD_ZAP_CACHE        4  /* Shoot down entire IOTLB */
> +#define RK_MMU_CMD_PAGE_FAULT_DONE  5  /* Clear page fault */
> +#define RK_MMU_CMD_FORCE_RESET      6  /* Reset all registers */
> +
> +/* RK_MMU_INT_* register fields */
> +#define RK_MMU_IRQ_PAGE_FAULT    0x01  /* page fault */
> +#define RK_MMU_IRQ_BUS_ERROR     0x02  /* bus read error */
> +#define RK_MMU_IRQ_MASK          (RK_MMU_IRQ_PAGE_FAULT | RK_MMU_IRQ_BUS_ERROR)
> +
> +#define NUM_DT_ENTRIES 1024
> +#define NUM_PT_ENTRIES 1024
> +
> +#define SPAGE_ORDER 12
> +#define SPAGE_SIZE (1 << SPAGE_ORDER)
> +
> + /*
> +  * Support mapping any size that fits in one page table:
> +  *   4 KiB to 4 MiB
> +  */
> +#define RK_IOMMU_PGSIZE_BITMAP 0x007ff000
> +
> +#define IOMMU_REG_POLL_COUNT_FAST 1000
> +
> +struct rk_iommu_domain {
> +       struct list_head iommus;
> +       u32 *dt; /* page directory table */
> +       spinlock_t iommus_lock; /* lock for iommus list */
> +       spinlock_t dt_lock; /* lock for modifying page directory table */
> +};
> +
> +struct rk_iommu {
> +       struct device *dev;
> +       void __iomem *base;
> +       int irq;
> +       struct list_head node; /* entry in rk_iommu_domain.iommus */
> +       struct iommu_domain *domain; /* domain to which iommu is attached */
> +};
> +
> +static inline void rk_table_flush(u32 *va, unsigned int count)
> +{
> +       phys_addr_t pa_start = virt_to_phys(va);
> +       phys_addr_t pa_end = virt_to_phys(va + count);
> +       size_t size = pa_end - pa_start;
> +
> +       __cpuc_flush_dcache_area(va, size);
> +       outer_flush_range(pa_start, pa_end);
> +}
> +
> +/**
> + * Inspired by _wait_for in intel_drv.h
> + * This is NOT safe for use in interrupt context.
> + *
> + * Note that it's important that we check the condition again after having
> + * timed out, since the timeout could be due to preemption or similar and
> + * we've never had a chance to check the condition before the timeout.
> + */
> +#define rk_wait_for(COND, MS) ({ \
> +       unsigned long timeout__ = jiffies + msecs_to_jiffies(MS) + 1;   \
> +       int ret__ = 0;                                                  \
> +       while (!(COND)) {                                               \
> +               if (time_after(jiffies, timeout__)) {                   \
> +                       ret__ = (COND) ? 0 : -ETIMEDOUT;                \
> +                       break;                                          \
> +               }                                                       \
> +               usleep_range(50, 100);                                  \
> +       }                                                               \
> +       ret__;                                                          \
> +})
> +
> +/*
> + * The Rockchip rk3288 iommu uses a 2-level page table.
> + * The first level is the "Directory Table" (DT).
> + * The DT consists of 1024 4-byte Directory Table Entries (DTEs), each pointing
> + * to a "Page Table".
> + * The second level is the 1024 Page Tables (PT).
> + * Each PT consists of 1024 4-byte Page Table Entries (PTEs), each pointing to
> + * a 4 KB page of physical memory.
> + *
> + * The DT and each PT fits in a single 4 KB page (4-bytes * 1024 entries).
> + * Each iommu device has a MMU_DTE_ADDR register that contains the physical
> + * address of the start of the DT page.
> + *
> + * The structure of the page table is as follows:
> + *
> + *                   DT
> + * MMU_DTE_ADDR -> +-----+
> + *                 |     |
> + *                 +-----+     PT
> + *                 | DTE | -> +-----+
> + *                 +-----+    |     |     Memory
> + *                 |     |    +-----+     Page
> + *                 |     |    | PTE | -> +-----+
> + *                 +-----+    +-----+    |     |
> + *                            |     |    |     |
> + *                            |     |    |     |
> + *                            +-----+    |     |
> + *                                       |     |
> + *                                       |     |
> + *                                       +-----+
> + */
> +
> +/*
> + * Each DTE has a PT address and a valid bit:
> + * +---------------------+-----------+-+
> + * | PT address          | Reserved  |V|
> + * +---------------------+-----------+-+
> + *  31:12 - PT address (PTs always start on a 4 KB boundary)
> + *  11: 1 - Reserved
> + *      0 - 1 if PT @ PT address is valid
> + */
> +#define RK_DTE_PT_ADDRESS_MASK    0xfffff000
> +#define RK_DTE_PT_VALID           BIT(0)
> +
> +static inline phys_addr_t rk_dte_pt_address(u32 dte)
> +{
> +       return (phys_addr_t)dte & RK_DTE_PT_ADDRESS_MASK;
> +}
> +
> +static inline bool rk_dte_is_pt_valid(u32 dte)
> +{
> +       return dte & RK_DTE_PT_VALID;
> +}
> +
> +static u32 rk_mk_dte(u32 *pt)
> +{
> +       phys_addr_t pt_phys = virt_to_phys(pt);
> +       return (pt_phys & RK_DTE_PT_ADDRESS_MASK) | RK_DTE_PT_VALID;
> +}
> +
> +/*
> + * Each PTE has a Page address, some flags and a valid bit:
> + * +---------------------+---+-------+-+
> + * | Page address        |Rsv| Flags |V|
> + * +---------------------+---+-------+-+
> + *  31:12 - Page address (Pages always start on a 4 KB boundary)
> + *  11: 9 - Reserved
> + *   8: 1 - Flags
> + *      8 - Read allocate - allocate cache space on read misses
> + *      7 - Read cache - enable cache & prefetch of data
> + *      6 - Write buffer - enable delaying writes on their way to memory
> + *      5 - Write allocate - allocate cache space on write misses
> + *      4 - Write cache - different writes can be merged together
> + *      3 - Override cache attributes
> + *          if 1, bits 4-8 control cache attributes
> + *          if 0, the system bus defaults are used
> + *      2 - Writable
> + *      1 - Readable
> + *      0 - 1 if Page @ Page address is valid
> + */
> +#define RK_PTE_PAGE_ADDRESS_MASK  0xfffff000
> +#define RK_PTE_PAGE_FLAGS_MASK    0x000001fe
> +#define RK_PTE_PAGE_WRITABLE      BIT(2)
> +#define RK_PTE_PAGE_READABLE      BIT(1)
> +#define RK_PTE_PAGE_VALID         BIT(0)
> +
> +static inline phys_addr_t rk_pte_page_address(u32 pte)
> +{
> +       return (phys_addr_t)pte & RK_PTE_PAGE_ADDRESS_MASK;
> +}
> +
> +static inline bool rk_pte_is_page_valid(u32 pte)
> +{
> +       return pte & RK_PTE_PAGE_VALID;
> +}
> +
> +/* TODO: set cache flags per prot IOMMU_CACHE */
> +static u32 rk_mk_pte(phys_addr_t page, int prot)
> +{
> +       u32 flags = 0;
> +       flags |= (prot & IOMMU_READ) ? RK_PTE_PAGE_READABLE : 0;
> +       flags |= (prot & IOMMU_WRITE) ? RK_PTE_PAGE_WRITABLE : 0;
> +       page &= RK_PTE_PAGE_ADDRESS_MASK;
> +       return page | flags | RK_PTE_PAGE_VALID;
> +}
> +
> +static u32 rk_mk_pte_invalid(u32 pte)
> +{
> +       return pte & ~RK_PTE_PAGE_VALID;
> +}
> +
> +/*
> + * rk3288 iova (IOMMU Virtual Address) format
> + *  31       22.21       12.11          0
> + * +-----------+-----------+-------------+
> + * | DTE index | PTE index | Page offset |
> + * +-----------+-----------+-------------+
> + *  31:22 - DTE index   - index of DTE in DT
> + *  21:12 - PTE index   - index of PTE in PT @ DTE.pt_address
> + *  11: 0 - Page offset - offset into page @ PTE.page_address
> + */
> +#define RK_IOVA_DTE_MASK    0xffc00000
> +#define RK_IOVA_DTE_SHIFT   22
> +#define RK_IOVA_PTE_MASK    0x003ff000
> +#define RK_IOVA_PTE_SHIFT   12
> +#define RK_IOVA_PAGE_MASK   0x00000fff
> +#define RK_IOVA_PAGE_SHIFT  0
> +
> +static u32 rk_iova_dte_index(dma_addr_t iova)
> +{
> +       return (u32)(iova & RK_IOVA_DTE_MASK) >> RK_IOVA_DTE_SHIFT;
> +}
> +
> +static u32 rk_iova_pte_index(dma_addr_t iova)
> +{
> +       return (u32)(iova & RK_IOVA_PTE_MASK) >> RK_IOVA_PTE_SHIFT;
> +}
> +
> +static u32 rk_iova_page_offset(dma_addr_t iova)
> +{
> +       return (u32)(iova & RK_IOVA_PAGE_MASK) >> RK_IOVA_PAGE_SHIFT;
> +}
> +
> +static u32 rk_iommu_read(struct rk_iommu *iommu, u32 offset)
> +{
> +       return readl(iommu->base + offset);
> +}
> +
> +static void rk_iommu_write(struct rk_iommu *iommu, u32 offset, u32 value)
> +{
> +       writel(value, iommu->base + offset);
> +}
> +
> +static void rk_iommu_command(struct rk_iommu *iommu, u32 command)
> +{
> +       writel(command, iommu->base + RK_MMU_COMMAND);
> +}
> +
> +static void rk_iommu_zap_lines(struct rk_iommu *iommu, dma_addr_t iova,
> +                              size_t size)
> +{
> +       dma_addr_t iova_end = iova + size;
> +       /*
> +        * TODO(djkurtz): Figure out when it is more efficient to shootdown the
> +        * entire iotlb rather than iterate over individual iovas.
> +        */
> +       for (; iova < iova_end; iova += SPAGE_SIZE)
> +               rk_iommu_write(iommu, RK_MMU_ZAP_ONE_LINE, iova);
> +}
> +
> +static bool rk_iommu_is_stall_active(struct rk_iommu *iommu)
> +{
> +       return rk_iommu_read(iommu, RK_MMU_STATUS) & RK_MMU_STATUS_STALL_ACTIVE;
> +}
> +
> +static bool rk_iommu_is_paging_enabled(struct rk_iommu *iommu)
> +{
> +       return rk_iommu_read(iommu, RK_MMU_STATUS) &
> +                            RK_MMU_STATUS_PAGING_ENABLED;
> +}
> +
> +static int rk_iommu_enable_stall(struct rk_iommu *iommu)
> +{
> +       int ret;
> +
> +       if (rk_iommu_is_stall_active(iommu))
> +               return 0;
> +
> +       /* Stall can only be enabled if paging is enabled */
> +       if (!rk_iommu_is_paging_enabled(iommu))
> +               return 0;
> +
> +       rk_iommu_command(iommu, RK_MMU_CMD_ENABLE_STALL);
> +
> +       ret = rk_wait_for(rk_iommu_is_stall_active(iommu), 1);
> +       if (ret)
> +               dev_err(iommu->dev, "Enable stall request timed out, status: %#08x\n",
> +                       rk_iommu_read(iommu, RK_MMU_STATUS));
> +
> +       return ret;
> +}
> +
> +static int rk_iommu_disable_stall(struct rk_iommu *iommu)
> +{
> +       int ret;
> +
> +       if (!rk_iommu_is_stall_active(iommu))
> +               return 0;
> +
> +       rk_iommu_command(iommu, RK_MMU_CMD_DISABLE_STALL);
> +
> +       ret = rk_wait_for(!rk_iommu_is_stall_active(iommu), 1);
> +       if (ret)
> +               dev_err(iommu->dev, "Disable stall request timed out, status: %#08x\n",
> +                       rk_iommu_read(iommu, RK_MMU_STATUS));
> +
> +       return ret;
> +}
> +
> +static int rk_iommu_enable_paging(struct rk_iommu *iommu)
> +{
> +       int ret;
> +
> +       if (rk_iommu_is_paging_enabled(iommu))
> +               return 0;
> +
> +       rk_iommu_command(iommu, RK_MMU_CMD_ENABLE_PAGING);
> +
> +       ret = rk_wait_for(rk_iommu_is_paging_enabled(iommu), 1);
> +       if (ret)
> +               dev_err(iommu->dev, "Enable paging request timed out, status: %#08x\n",
> +                       rk_iommu_read(iommu, RK_MMU_STATUS));
> +
> +       return ret;
> +}
> +
> +static int rk_iommu_disable_paging(struct rk_iommu *iommu)
> +{
> +       int ret;
> +
> +       if (!rk_iommu_is_paging_enabled(iommu))
> +               return 0;
> +
> +       rk_iommu_command(iommu, RK_MMU_CMD_DISABLE_PAGING);
> +
> +       ret = rk_wait_for(!rk_iommu_is_paging_enabled(iommu), 1);
> +       if (ret)
> +               dev_err(iommu->dev, "Disable paging request timed out, status: %#08x\n",
> +                       rk_iommu_read(iommu, RK_MMU_STATUS));
> +
> +       return ret;
> +}
> +
> +static int rk_iommu_force_reset(struct rk_iommu *iommu)
> +{
> +       int ret;
> +       u32 dte_addr;
> +
> +       /*
> +        * Check if register DTE_ADDR is working by writing DTE_ADDR_DUMMY
> +        * and verifying that upper 5 nybbles are read back.
> +        */
> +       rk_iommu_write(iommu, RK_MMU_DTE_ADDR, DTE_ADDR_DUMMY);
> +
> +       dte_addr = rk_iommu_read(iommu, RK_MMU_DTE_ADDR);
> +       if (dte_addr != (DTE_ADDR_DUMMY & RK_DTE_PT_ADDRESS_MASK)) {
> +               dev_err(iommu->dev, "Error during raw reset. MMU_DTE_ADDR is not functioning\n");
> +               return -EFAULT;
> +       }
> +
> +       rk_iommu_command(iommu, RK_MMU_CMD_FORCE_RESET);
> +
> +       ret = rk_wait_for(rk_iommu_read(iommu, RK_MMU_DTE_ADDR) == 0x00000000,
> +                         FORCE_RESET_TIMEOUT);
> +       if (ret)
> +               dev_err(iommu->dev, "FORCE_RESET command timed out\n");
> +
> +       return ret;
> +}
> +
> +static void log_iova(struct rk_iommu *iommu, dma_addr_t iova)
> +{
> +       u32 dte_index, pte_index, page_offset;
> +       u32 mmu_dte_addr;
> +       phys_addr_t mmu_dte_addr_phys, dte_addr_phys;
> +       u32 *dte_addr;
> +       u32 dte;
> +       phys_addr_t pte_addr_phys = 0;
> +       u32 *pte_addr = NULL;
> +       u32 pte = 0;
> +       phys_addr_t page_addr_phys = 0;
> +       u32 page_flags = 0;
> +
> +       dte_index = rk_iova_dte_index(iova);
> +       pte_index = rk_iova_pte_index(iova);
> +       page_offset = rk_iova_page_offset(iova);
> +
> +       mmu_dte_addr = rk_iommu_read(iommu, RK_MMU_DTE_ADDR);
> +       mmu_dte_addr_phys = (phys_addr_t)mmu_dte_addr;
> +
> +       dte_addr_phys = mmu_dte_addr_phys + (4 * dte_index);
> +       dte_addr = phys_to_virt(dte_addr_phys);
> +       dte = *dte_addr;
> +
> +       if (!rk_dte_is_pt_valid(dte))
> +               goto print_it;
> +
> +       pte_addr_phys = rk_dte_pt_address(dte) + (pte_index * 4);
> +       pte_addr = phys_to_virt(pte_addr_phys);
> +       pte = *pte_addr;
> +
> +       if (!rk_pte_is_page_valid(pte))
> +               goto print_it;
> +
> +       page_addr_phys = rk_pte_page_address(pte) + page_offset;
> +       page_flags = pte & RK_PTE_PAGE_FLAGS_MASK;
> +
> +print_it:
> +       dev_err(iommu->dev, "iova = %pad: dte_index: 0x%03x pte_index: 0x%03x page_offset: 0x%03x\n",
> +               &iova, dte_index, pte_index, page_offset);
> +       dev_err(iommu->dev, "mmu_dte_addr: %pa dte@%pa: %#08x valid: %u pte@%pa: %#08x valid: %u page@%pa flags: %#03x\n",
> +               &mmu_dte_addr_phys, &dte_addr_phys, dte,
> +               rk_dte_is_pt_valid(dte), &pte_addr_phys, pte,
> +               rk_pte_is_page_valid(pte), &page_addr_phys, page_flags);
> +}
> +
> +static irqreturn_t rk_iommu_irq(int irq, void *dev_id)
> +{
> +       struct rk_iommu *iommu = dev_id;
> +       u32 status;
> +       u32 int_status;
> +       dma_addr_t iova;
> +
> +       int_status = rk_iommu_read(iommu, RK_MMU_INT_STATUS);
> +       if (int_status == 0)
> +               return IRQ_NONE;
> +
> +       iova = rk_iommu_read(iommu, RK_MMU_PAGE_FAULT_ADDR);
> +
> +       if (int_status & RK_MMU_IRQ_PAGE_FAULT) {
> +               int flags;
> +
> +               status = rk_iommu_read(iommu, RK_MMU_STATUS);
> +               flags = (status & RK_MMU_STATUS_PAGE_FAULT_IS_WRITE) ?
> +                               IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
> +
> +               dev_err(iommu->dev, "Page fault at %pad of type %s\n",
> +                       &iova,
> +                       (flags == IOMMU_FAULT_WRITE) ? "write" : "read");
> +
> +               log_iova(iommu, iova);
> +
> +               /*
> +                * Report page fault to any installed handlers.
> +                * Ignore the return code, though, since we always zap cache
> +                * and clear the page fault anyway.
> +                */
> +               if (iommu->domain)
> +                       report_iommu_fault(iommu->domain, iommu->dev, iova,
> +                                          flags);
> +               else
> +                       dev_err(iommu->dev, "Page fault while iommu not attached to domain?\n");
> +
> +               rk_iommu_command(iommu, RK_MMU_CMD_ZAP_CACHE);
> +               rk_iommu_command(iommu, RK_MMU_CMD_PAGE_FAULT_DONE);
> +       }
> +
> +       if (int_status & RK_MMU_IRQ_BUS_ERROR)
> +               dev_err(iommu->dev, "BUS_ERROR occurred at %pad\n", &iova);
> +
> +       if (int_status & ~RK_MMU_IRQ_MASK)
> +               dev_err(iommu->dev, "unexpected int_status: %#08x\n",
> +                       int_status);
> +
> +       rk_iommu_write(iommu, RK_MMU_INT_CLEAR, int_status);
> +
> +       return IRQ_HANDLED;
> +}
> +
> +static phys_addr_t rk_iommu_iova_to_phys(struct iommu_domain *domain,
> +                                        dma_addr_t iova)
> +{
> +       struct rk_iommu_domain *rk_domain = domain->priv;
> +       unsigned long flags;
> +       phys_addr_t pt_phys, phys = 0;
> +       u32 dte, pte;
> +       u32 *page_table;
> +
> +       spin_lock_irqsave(&rk_domain->dt_lock, flags);
> +
> +       dte = rk_domain->dt[rk_iova_dte_index(iova)];
> +       if (!rk_dte_is_pt_valid(dte))
> +               goto out;
> +
> +       pt_phys = rk_dte_pt_address(dte);
> +       page_table = (u32 *)phys_to_virt(pt_phys);
> +       pte = page_table[rk_iova_pte_index(iova)];
> +       if (!rk_pte_is_page_valid(pte))
> +               goto out;
> +
> +       phys = rk_pte_page_address(pte) + rk_iova_page_offset(iova);
> +out:
> +       spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
> +
> +       return phys;
> +}
> +
> +static void rk_iommu_zap_iova(struct rk_iommu_domain *rk_domain,
> +                             dma_addr_t iova, size_t size)
> +{
> +       struct list_head *pos;
> +       unsigned long flags;
> +
> +       /* shootdown these iova from all iommus using this domain */
> +       spin_lock_irqsave(&rk_domain->iommus_lock, flags);
> +       list_for_each(pos, &rk_domain->iommus) {
> +               struct rk_iommu *iommu;
> +               iommu = list_entry(pos, struct rk_iommu, node);
> +               rk_iommu_zap_lines(iommu, iova, size);
> +       }
> +       spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
> +}
> +
> +static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain,
> +                                 dma_addr_t iova)
> +{
> +       u32 *page_table, *dte_addr;
> +       u32 dte;
> +       phys_addr_t pt_phys;
> +
> +       assert_spin_locked(&rk_domain->dt_lock);
> +
> +       dte_addr = &rk_domain->dt[rk_iova_dte_index(iova)];
> +       dte = *dte_addr;
> +       if (rk_dte_is_pt_valid(dte))
> +               goto done;
> +
> +       page_table = (u32 *)get_zeroed_page(GFP_ATOMIC | GFP_DMA32);
> +       if (!page_table)
> +               return ERR_PTR(-ENOMEM);
> +
> +       dte = rk_mk_dte(page_table);
> +       *dte_addr = dte;
> +
> +       rk_table_flush(page_table, NUM_PT_ENTRIES);
> +       rk_table_flush(dte_addr, 1);
> +
> +       /*
> +        * Zap the first iova of newly allocated page table so iommu evicts
> +        * old cached value of new dte from the iotlb.
> +        */
> +       rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);
> +
> +done:
> +       pt_phys = rk_dte_pt_address(dte);
> +       return (u32 *)phys_to_virt(pt_phys);
> +}
> +
> +static size_t rk_iommu_unmap_iova(struct rk_iommu_domain *rk_domain,
> +                                 u32 *pte_addr, dma_addr_t iova, size_t size)
> +{
> +       unsigned int pte_count;
> +       unsigned int pte_total = size / SPAGE_SIZE;
> +
> +       assert_spin_locked(&rk_domain->dt_lock);
> +
> +       for (pte_count = 0; pte_count < pte_total; pte_count++) {
> +               u32 pte = pte_addr[pte_count];
> +               if (!rk_pte_is_page_valid(pte))
> +                       break;
> +
> +               pte_addr[pte_count] = rk_mk_pte_invalid(pte);
> +       }
> +
> +       rk_table_flush(pte_addr, pte_count);
> +
> +       return pte_count * SPAGE_SIZE;
> +}
> +
> +static int rk_iommu_map_iova(struct rk_iommu_domain *rk_domain, u32 *pte_addr,
> +                            dma_addr_t iova, phys_addr_t paddr, size_t size,
> +                            int prot)
> +{
> +       unsigned int pte_count;
> +       unsigned int pte_total = size / SPAGE_SIZE;
> +       phys_addr_t page_phys;
> +
> +       assert_spin_locked(&rk_domain->dt_lock);
> +
> +       for (pte_count = 0; pte_count < pte_total; pte_count++) {
> +               u32 pte = pte_addr[pte_count];
> +
> +               if (rk_pte_is_page_valid(pte))
> +                       goto unwind;
> +
> +               pte_addr[pte_count] = rk_mk_pte(paddr, prot);
> +
> +               paddr += SPAGE_SIZE;
> +       }
> +
> +       rk_table_flush(pte_addr, pte_count);
> +
> +       return 0;
> +unwind:
> +       /* Unmap the range of iovas that we just mapped */
> +       rk_iommu_unmap_iova(rk_domain, pte_addr, iova, pte_count * SPAGE_SIZE);
> +
> +       iova += pte_count * SPAGE_SIZE;
> +       page_phys = rk_pte_page_address(pte_addr[pte_count]);
> +       pr_err("iova: %pad already mapped to %pa cannot remap to phys: %pa prot:%#x\n",
> +              &iova, &page_phys, &paddr, prot);
> +
> +       return -EADDRINUSE;
> +}
> +
> +static int rk_iommu_map(struct iommu_domain *domain, unsigned long _iova,
> +                       phys_addr_t paddr, size_t size, int prot)
> +{
> +       struct rk_iommu_domain *rk_domain = domain->priv;
> +       unsigned long flags;
> +       dma_addr_t iova = (dma_addr_t)_iova;
> +       u32 *page_table, *pte_addr;
> +       int ret;
> +
> +       spin_lock_irqsave(&rk_domain->dt_lock, flags);
> +
> +       /*
> +        * pgsize_bitmap specifies iova sizes that fit in one page table
> +        * (1024 4-KiB pages = 4 MiB).
> +        * So, size will always be 4096 <= size <= 4194304.
> +        * Since iommu_map() guarantees that both iova and size will be
> +        * aligned, we will always only be mapping from a single dte here.
> +        */
> +       page_table = rk_dte_get_page_table(rk_domain, iova);
> +       if (IS_ERR(page_table)) {
> +               spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
> +               return PTR_ERR(page_table);
> +       }
> +
> +       pte_addr = &page_table[rk_iova_pte_index(iova)];
> +       ret = rk_iommu_map_iova(rk_domain, pte_addr, iova, paddr, size, prot);
> +       spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
> +
> +       return ret;
> +}
> +
> +static size_t rk_iommu_unmap(struct iommu_domain *domain, unsigned long _iova,
> +                            size_t size)
> +{
> +       struct rk_iommu_domain *rk_domain = domain->priv;
> +       unsigned long flags;
> +       dma_addr_t iova = (dma_addr_t)_iova;
> +       phys_addr_t pt_phys;
> +       u32 dte;
> +       u32 *pte_addr;
> +       size_t unmap_size;
> +
> +       spin_lock_irqsave(&rk_domain->dt_lock, flags);
> +
> +       /*
> +        * pgsize_bitmap specifies iova sizes that fit in one page table
> +        * (1024 4-KiB pages = 4 MiB).
> +        * So, size will always be 4096 <= size <= 4194304.
> +        * Since iommu_unmap() guarantees that both iova and size will be
> +        * aligned, we will always only be unmapping from a single dte here.
> +        */
> +       dte = rk_domain->dt[rk_iova_dte_index(iova)];
> +       /* Just return 0 if iova is unmapped */
> +       if (!rk_dte_is_pt_valid(dte)) {
> +               spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
> +               return 0;
> +       }
> +
> +       pt_phys = rk_dte_pt_address(dte);
> +       pte_addr = (u32 *)phys_to_virt(pt_phys) + rk_iova_pte_index(iova);
> +       unmap_size = rk_iommu_unmap_iova(rk_domain, pte_addr, iova, size);
> +
> +       spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
> +
> +       /* Shootdown iotlb entries for iova range that was just unmapped */
> +       rk_iommu_zap_iova(rk_domain, iova, unmap_size);
> +
> +       return unmap_size;
> +}
> +
> +static int rk_iommu_attach_device(struct iommu_domain *domain,
> +                                 struct device *dev)
> +{
> +       struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
> +       struct rk_iommu_domain *rk_domain = domain->priv;
> +       unsigned long flags;
> +       int ret;
> +       phys_addr_t dte_addr;
> +
> +       /*
> +        * Allow 'virtual devices' (e.g., drm) to attach to domain.
> +        * Such a device has a NULL archdata.iommu.
> +        */
> +       if (!iommu)
> +               return 0;
> +
> +       ret = rk_iommu_enable_stall(iommu);
> +       if (ret)
> +               return ret;
> +
> +       ret = rk_iommu_force_reset(iommu);
> +       if (ret)
> +               return ret;
> +
> +       iommu->domain = domain;
> +
> +       ret = devm_request_irq(dev, iommu->irq, rk_iommu_irq,
> +                              IRQF_SHARED, dev_name(dev), iommu);
> +       if (ret)
> +               return ret;
> +
> +       dte_addr = virt_to_phys(rk_domain->dt);
> +       rk_iommu_write(iommu, RK_MMU_DTE_ADDR, dte_addr);
> +       rk_iommu_command(iommu, RK_MMU_CMD_ZAP_CACHE);
> +       rk_iommu_write(iommu, RK_MMU_INT_MASK, RK_MMU_IRQ_MASK);
> +
> +       ret = rk_iommu_enable_paging(iommu);
> +       if (ret)
> +               return ret;
> +
> +       spin_lock_irqsave(&rk_domain->iommus_lock, flags);
> +       list_add_tail(&iommu->node, &rk_domain->iommus);
> +       spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
> +
> +       dev_info(dev, "Attached to iommu domain\n");
> +
> +       rk_iommu_disable_stall(iommu);
> +
> +       return 0;
> +}
> +
> +static void rk_iommu_detach_device(struct iommu_domain *domain,
> +                                  struct device *dev)
> +{
> +       struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
> +       struct rk_iommu_domain *rk_domain = domain->priv;
> +       unsigned long flags;
> +
> +       /* Allow 'virtual devices' (eg drm) to detach from domain */
> +       if (!iommu)
> +               return;
> +
> +       iommu->domain = NULL;
> +
> +       spin_lock_irqsave(&rk_domain->iommus_lock, flags);
> +       list_del_init(&iommu->node);
> +       spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
> +
> +       devm_free_irq(dev, iommu->irq, iommu);
> +
> +       iommu->domain = NULL;
> +
> +       /* Ignore error while disabling, just keep going */
> +       rk_iommu_enable_stall(iommu);
> +       rk_iommu_disable_paging(iommu);
> +       rk_iommu_write(iommu, RK_MMU_INT_MASK, 0);
> +       rk_iommu_write(iommu, RK_MMU_DTE_ADDR, 0);
> +       rk_iommu_disable_stall(iommu);
> +
> +       dev_info(dev, "Detached from iommu domain\n");
> +}
> +
> +static int rk_iommu_domain_init(struct iommu_domain *domain)
> +{
> +       struct rk_iommu_domain *rk_domain;
> +
> +       rk_domain = kzalloc(sizeof(*rk_domain), GFP_KERNEL);
> +       if (!rk_domain)
> +               return -ENOMEM;
> +
> +       /*
> +        * rk32xx iommus use a 2 level pagetable.
> +        * Each level1 (dt) and level2 (pt) table has 1024 4-byte entries.
> +        * Allocate one 4 KiB page for each table.
> +        */
> +       rk_domain->dt = (u32 *)get_zeroed_page(GFP_KERNEL | GFP_DMA32);
> +       if (!rk_domain->dt)
> +               goto err_dt;
> +
> +       rk_table_flush(rk_domain->dt, NUM_DT_ENTRIES);
> +
> +       spin_lock_init(&rk_domain->iommus_lock);
> +       spin_lock_init(&rk_domain->dt_lock);
> +       INIT_LIST_HEAD(&rk_domain->iommus);
> +
> +       domain->priv = rk_domain;
> +
> +       return 0;
> +err_dt:
> +       kfree(rk_domain);
> +       return -ENOMEM;
> +}
> +
> +static void rk_iommu_domain_destroy(struct iommu_domain *domain)
> +{
> +       struct rk_iommu_domain *rk_domain = domain->priv;
> +       int i;
> +
> +       WARN_ON(!list_empty(&rk_domain->iommus));
> +
> +       for (i = 0; i < NUM_DT_ENTRIES; i++) {
> +               u32 dte = rk_domain->dt[i];
> +               if (rk_dte_is_pt_valid(dte)) {
> +                       phys_addr_t pt_phys = rk_dte_pt_address(dte);
> +                       u32 *page_table = phys_to_virt(pt_phys);
> +                       free_page((unsigned long)page_table);
> +               }
> +       }
> +
> +       free_page((unsigned long)rk_domain->dt);
> +       kfree(domain->priv);
> +       domain->priv = NULL;
> +}
> +
> +static const struct iommu_ops rk_iommu_ops = {
> +       .domain_init = rk_iommu_domain_init,
> +       .domain_destroy = rk_iommu_domain_destroy,
> +       .attach_dev = rk_iommu_attach_device,
> +       .detach_dev = rk_iommu_detach_device,
> +       .map = rk_iommu_map,
> +       .unmap = rk_iommu_unmap,
> +       .iova_to_phys = rk_iommu_iova_to_phys,
> +       .pgsize_bitmap = RK_IOMMU_PGSIZE_BITMAP,
> +};
> +
> +static int rk_iommu_probe(struct platform_device *pdev)
> +{
> +       struct device *dev = &pdev->dev;
> +       struct rk_iommu *iommu;
> +       struct resource *res;
> +
> +       iommu = devm_kzalloc(dev, sizeof(*iommu), GFP_KERNEL);
> +       if (!iommu)
> +               return -ENOMEM;
> +
> +       platform_set_drvdata(pdev, iommu);
> +       iommu->dev = dev;
> +
> +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +       iommu->base = devm_ioremap_resource(&pdev->dev, res);
> +       if (IS_ERR(iommu->base))
> +               return PTR_ERR(iommu->base);
> +
> +       iommu->irq = platform_get_irq(pdev, 0);
> +       if (iommu->irq < 0) {
> +               dev_err(dev, "Failed to get IRQ, %d\n", iommu->irq);
> +               return -ENXIO;
> +       }
> +
> +       return 0;
> +}
> +
> +static int rk_iommu_remove(struct platform_device *pdev)
> +{
> +       return 0;
> +}
> +
> +#ifdef CONFIG_OF
> +static const struct of_device_id rk_iommu_dt_ids[] = {
> +       { .compatible = "rockchip,iommu" },
> +       { /* sentinel */ }
> +};
> +MODULE_DEVICE_TABLE(of, rk_iommu_dt_ids);
> +#endif
> +
> +static struct platform_driver rk_iommu_driver = {
> +       .probe = rk_iommu_probe,
> +       .remove = rk_iommu_remove,
> +       .driver = {
> +                  .name = "rk_iommu",
> +                  .owner = THIS_MODULE,
> +                  .of_match_table = of_match_ptr(rk_iommu_dt_ids),
> +       },
> +};
> +
> +static int __init rk_iommu_init(void)
> +{
> +       int ret;
> +
> +       ret = bus_set_iommu(&platform_bus_type, &rk_iommu_ops);
> +       if (ret)
> +               return ret;
> +
> +       return platform_driver_register(&rk_iommu_driver);
> +}
> +static void __exit rk_iommu_exit(void)
> +{
> +       platform_driver_unregister(&rk_iommu_driver);
> +}
> +
> +subsys_initcall(rk_iommu_init);
> +module_exit(rk_iommu_exit);
> +
> +MODULE_DESCRIPTION("IOMMU API for Rockchip");
> +MODULE_AUTHOR("Simon Xue <xxm@rock-chips.com> and Daniel Kurtz <djkurtz@chromium.org>");
> +MODULE_ALIAS("platform:rockchip-iommu");
> +MODULE_LICENSE("GPL v2");
> --
> 2.1.0.rc2.206.gedb03e5
>
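
For anyone following the bit layouts in the quoted patch, here is a minimal
userspace sketch of the iova split and the PTE construction.  The masks and
shifts are copied from the patch; the demo_* helpers and DEMO_PROT_* flags are
hypothetical stand-ins (the driver itself uses IOMMU_READ/IOMMU_WRITE and
phys_addr_t/dma_addr_t), so treat this as an illustration under those
assumptions, not as driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Masks and shifts copied from the quoted patch. */
#define RK_PTE_PAGE_ADDRESS_MASK  0xfffff000u
#define RK_PTE_PAGE_WRITABLE      (1u << 2)
#define RK_PTE_PAGE_READABLE      (1u << 1)
#define RK_PTE_PAGE_VALID         (1u << 0)

#define RK_IOVA_DTE_MASK    0xffc00000u
#define RK_IOVA_DTE_SHIFT   22
#define RK_IOVA_PTE_MASK    0x003ff000u
#define RK_IOVA_PTE_SHIFT   12
#define RK_IOVA_PAGE_MASK   0x00000fffu

/* Hypothetical stand-ins for the kernel's IOMMU_READ / IOMMU_WRITE. */
#define DEMO_PROT_READ   (1 << 0)
#define DEMO_PROT_WRITE  (1 << 1)

/* Bits 31:22 select the DTE in the directory table. */
static uint32_t demo_iova_dte_index(uint32_t iova)
{
	return (iova & RK_IOVA_DTE_MASK) >> RK_IOVA_DTE_SHIFT;
}

/* Bits 21:12 select the PTE in the page table the DTE points to. */
static uint32_t demo_iova_pte_index(uint32_t iova)
{
	return (iova & RK_IOVA_PTE_MASK) >> RK_IOVA_PTE_SHIFT;
}

/* Bits 11:0 are the offset into the 4 KiB page. */
static uint32_t demo_iova_page_offset(uint32_t iova)
{
	return iova & RK_IOVA_PAGE_MASK;
}

/* Build a valid PTE: page address, read/write flags, valid bit. */
static uint32_t demo_mk_pte(uint32_t page, int prot)
{
	uint32_t flags = 0;

	if (prot & DEMO_PROT_READ)
		flags |= RK_PTE_PAGE_READABLE;
	if (prot & DEMO_PROT_WRITE)
		flags |= RK_PTE_PAGE_WRITABLE;

	return (page & RK_PTE_PAGE_ADDRESS_MASK) | flags | RK_PTE_PAGE_VALID;
}

/* True if [iova, iova + size) is covered by a single DTE (4 MiB window). */
static int demo_range_in_one_dte(uint32_t iova, uint32_t size)
{
	assert(size > 0);
	return demo_iova_dte_index(iova) ==
	       demo_iova_dte_index(iova + size - 1);
}
```

For example, iova 0x12345678 splits into DTE index 0x048, PTE index 0x345 and
page offset 0x678.  A 4 MiB range starting at 0x00400000 stays within one DTE,
which is the property the comments in rk_iommu_map()/rk_iommu_unmap() rely on.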

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v5 1/3] iommu/rockchip: rk3288 iommu driver
@ 2014-10-17  8:36       ` Joerg Roedel
  0 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2014-10-17  8:36 UTC (permalink / raw)
  To: Daniel Kurtz
  Cc: Grant Grundler, Stéphane Marchesin, Simon Xue,
	Heiko Stuebner, Grant Likely, Rob Herring, open list,
	open list:IOMMU DRIVERS, moderated list:ARM/Rockchip SoC...,
	open list:ARM/Rockchip SoC..., open list:OPEN FIRMWARE AND...

On Fri, Oct 17, 2014 at 10:22:13AM +0800, Daniel Kurtz wrote:
> Gentle ping.
> 
> Any more feedback on the rockchip iommu driver?

I'll look at it in more detail when the merge window is over.


	Joerg


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v5 1/3] iommu/rockchip: rk3288 iommu driver
@ 2014-10-22 15:12     ` Joerg Roedel
  0 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2014-10-22 15:12 UTC (permalink / raw)
  To: Daniel Kurtz
  Cc: Grant Grundler, Stéphane Marchesin, Simon Xue,
	Heiko Stuebner, Grant Likely, Rob Herring, open list,
	open list:IOMMU DRIVERS, moderated list:ARM/Rockchip SoC...,
	open list:ARM/Rockchip SoC..., open list:OPEN FIRMWARE AND...

On Tue, Oct 14, 2014 at 04:02:40PM +0800, Daniel Kurtz wrote:
> +static void rk_iommu_detach_device(struct iommu_domain *domain,
> +				   struct device *dev)
> +{
> +	struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
> +	struct rk_iommu_domain *rk_domain = domain->priv;
> +	unsigned long flags;
> +
> +	/* Allow 'virtual devices' (eg drm) to detach from domain */
> +	if (!iommu)
> +		return;
> +
> +	iommu->domain = NULL;

I guess this line is a left-over? Setting iommu->domain to NULL here
before you disabled the IOMMU interrupt is racy. To be fully secure, you
should make sure that no interrupt handler is still running after you
disabled the IOMMU irq and before setting iommu->domain = NULL.
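
The ordering being asked for can be sketched as a small userspace model.  This
is only an illustration: the demo_* types and functions are hypothetical
stand-ins, not the driver's code.  The one kernel fact it leans on is that
devm_free_irq() goes through free_irq(), which does not return until any
handler already running for that irq has completed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the driver's structures. */
struct demo_domain {
	int id;
};

struct demo_iommu {
	struct demo_domain *domain;  /* read by the irq handler */
	int irq_installed;           /* 1 while the handler may still run */
	int cleared_while_irq_live;  /* set if the racy ordering was used */
};

/* Models devm_free_irq(): after this, no handler can be running. */
static void demo_free_irq(struct demo_iommu *iommu)
{
	assert(iommu != NULL);
	iommu->irq_installed = 0;
}

/* Models "iommu->domain = NULL" and records whether it was racy. */
static void demo_clear_domain(struct demo_iommu *iommu)
{
	assert(iommu != NULL);
	if (iommu->irq_installed)
		iommu->cleared_while_irq_live = 1;
	iommu->domain = NULL;
}

/* Suggested ordering: quiesce the irq first, then drop the pointer. */
static void demo_detach(struct demo_iommu *iommu)
{
	demo_free_irq(iommu);
	demo_clear_domain(iommu);
}
```

In this model, calling demo_clear_domain() before demo_free_irq() sets
cleared_while_irq_live, which corresponds to the window in which a handler
could still dereference iommu->domain.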

Other than that the code looks good.


	Joerg


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v5 1/3] iommu/rockchip: rk3288 iommu driver
  2014-10-26 20:32     ` Heiko Stübner
  (?)
@ 2014-10-27 10:08       ` Daniel Kurtz
  -1 siblings, 0 replies; 26+ messages in thread
From: Daniel Kurtz @ 2014-10-27 10:08 UTC (permalink / raw)
  To: Heiko Stübner
  Cc: Joerg Roedel, Simon Xue, Grant Likely, Rob Herring, open list,
	open list:IOMMU DRIVERS, moderated list:ARM/Rockchip SoC...,
	open list:OPEN FIRMWARE AND...

On Mon, Oct 27, 2014 at 4:32 AM, Heiko Stübner <heiko@sntech.de> wrote:
> Hi Daniel,
>
> Am Freitag, 24. Oktober 2014, 15:33:47 schrieb Daniel Kurtz:
>
> [...]
>
>> +static int rk_iommu_attach_device(struct iommu_domain *domain,
>> +                               struct device *dev)
>> +{
>> +     struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
>
> Here I get a null-ptr dereference [0] when using the iommu driver with the
> pending drm changes.

That's what I get for testing against a heavily modified v3.14-based kernel...

In v3.14, dev_get_drvdata() would happily return NULL if dev=NULL.
This "feature" was removed in v3.15 by this patch:

commit d4332013919aa87dbdede67d677e4cf2cd32e898
Author: Jean Delvare <jdelvare@suse.de>
Date:   Mon Apr 14 12:57:43 2014 +0200

    driver core: dev_get_drvdata: Don't check for NULL dev

>
>> +     struct rk_iommu_domain *rk_domain = domain->priv;
>> +     unsigned long flags;
>> +     int ret;
>> +     phys_addr_t dte_addr;
>> +
>> +     /*
>> +      * Allow 'virtual devices' (e.g., drm) to attach to domain.
>> +      * Such a device has a NULL archdata.iommu.
>> +      */
>> +     if (!iommu)
>
> When the comment is correct, the code should probably do something like
> the following?
>
> if (!dev->archdata.iommu)
>         return 0;
>
> iommu = dev_get_drvdata(dev->archdata.iommu);
>

Yes, that looks reasonable.
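
A minimal userspace model of that check, assuming a simplified device
structure: demo_device, demo_get_drvdata() and demo_attach() are hypothetical
names used only for illustration.  The point is that the NULL test has to
happen on archdata.iommu itself, before anything dereferences it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct device. */
struct demo_device {
	void *driver_data;
	struct {
		struct demo_device *iommu;  /* NULL for 'virtual' masters */
	} archdata;
};

/* Like dev_get_drvdata() since v3.15: dereferences dev unconditionally. */
static void *demo_get_drvdata(const struct demo_device *dev)
{
	return dev->driver_data;
}

/*
 * Returns 0 in both cases, as the driver does.  *iommu_out stays NULL for
 * a virtual device and points at the iommu's private data otherwise.
 */
static int demo_attach(struct demo_device *dev, void **iommu_out)
{
	assert(dev != NULL && iommu_out != NULL);

	*iommu_out = NULL;
	if (!dev->archdata.iommu)
		return 0;

	*iommu_out = demo_get_drvdata(dev->archdata.iommu);
	return 0;
}
```

The same reordering would apply to rk_iommu_detach_device(), which starts with
the same dev_get_drvdata(dev->archdata.iommu) initializer.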

>
>> +             return 0;
>> +
>> +     ret = rk_iommu_enable_stall(iommu);
>> +     if (ret)
>> +             return ret;
>> +
>> +     ret = rk_iommu_force_reset(iommu);
>> +     if (ret)
>> +             return ret;
>> +
>> +     iommu->domain = domain;
>> +
>> +     ret = devm_request_irq(dev, iommu->irq, rk_iommu_irq,
>> +                            IRQF_SHARED, dev_name(dev), iommu);
>> +     if (ret)
>> +             return ret;
>> +
>> +     dte_addr = virt_to_phys(rk_domain->dt);
>> +     rk_iommu_write(iommu, RK_MMU_DTE_ADDR, dte_addr);
>> +     rk_iommu_command(iommu, RK_MMU_CMD_ZAP_CACHE);
>> +     rk_iommu_write(iommu, RK_MMU_INT_MASK, RK_MMU_IRQ_MASK);
>> +
>> +     ret = rk_iommu_enable_paging(iommu);
>> +     if (ret)
>> +             return ret;
>> +
>> +     spin_lock_irqsave(&rk_domain->iommus_lock, flags);
>> +     list_add_tail(&iommu->node, &rk_domain->iommus);
>> +     spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
>> +
>> +     dev_info(dev, "Attached to iommu domain\n");
>> +
>> +     rk_iommu_disable_stall(iommu);
>> +
>> +     return 0;
>> +}
>
> [...]
>
>> +
>> +static struct platform_driver rk_iommu_driver = {
>> +     .probe = rk_iommu_probe,
>> +     .remove = rk_iommu_remove,
>> +     .driver = {
>> +                .name = "rk_iommu",
>> +                .owner = THIS_MODULE,
>> +                .of_match_table = of_match_ptr(rk_iommu_dt_ids),
>> +     },
>> +};
>> +
>> +static int __init rk_iommu_init(void)
>> +{
>> +     int ret;
>> +
>> +     ret = bus_set_iommu(&platform_bus_type, &rk_iommu_ops);
>
> on 3.18-rc1 this fails with -ENODEV, as add_iommu_group() is missing the
> add_device callback in rk_iommu_ops, so the iommu driver actually never
> gets registered.

v3.18-rc1 has patch [0] which changes
bus_set_iommu()->iommu_bus_init() to propagate the return value of
add_iommu_group(), whereas it was ignored in v3.17.

[0] commit fb3e306515ba6a012364b698b8ca71c337424ed3
Author: Mark Salter <msalter@redhat.com>
Date:   Sun Sep 21 13:58:24 2014 -0400

    iommu: Fix bus notifier breakage
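
As a rough illustration of the difference, the following userspace sketch models the two behaviours.  It is an assumption-laden simplification: the mock_* names are hypothetical, and the real code walks every device on the bus rather than a single one.  The only point is that an ops table without add_device yields -ENODEV, which the v3.17-style path drops and the v3.18-rc1-style path returns to the caller:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced ops table: only the callback discussed here. */
struct mock_iommu_ops {
	int (*add_device)(void *dev);
};

/* No add_device callback means -ENODEV, as reported in the thread. */
int mock_add_iommu_group(void *dev, const struct mock_iommu_ops *ops)
{
	if (!ops->add_device)
		return -ENODEV;
	return ops->add_device(dev);
}

/* v3.17-style init: the result of the group setup is ignored. */
int mock_bus_init_v317(const struct mock_iommu_ops *ops)
{
	(void)mock_add_iommu_group(NULL, ops);
	return 0;
}

/* v3.18-rc1-style init: the result is propagated to the caller. */
int mock_bus_init_v318(const struct mock_iommu_ops *ops)
{
	return mock_add_iommu_group(NULL, ops);
}

/* Dummy callback of the kind floated later in this mail. */
int mock_noop_add_device(void *dev)
{
	(void)dev;
	return 0;
}
```

In this model, registering ops without add_device succeeds under the v3.17-style path and fails with -ENODEV under the v3.18-rc1-style path, while a no-op callback is enough to make both succeed.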


This patch made it mandatory that iommu drivers provide an add_device
callback.  I'm not exactly sure why.  Iommu groups do not seem to be
a good fit for the rockchip iommus, since the iommus are all 1:1 with
their master device.

The exynos add_device() is a possibility; however, it causes an
iommu_group to be allocated for every single platform_device, even
those that do not use an iommu.  This seems very wasteful.  Instead we
can check the device's dt node for an "iommus" property whose phandle
points to a node with an "#iommu-cells" property.

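The filtering idea could look roughly like the userspace sketch below.  Everything here is hypothetical: struct mock_node, struct mock_master and the mock_* helpers merely model a device-tree node as a list of property names plus the node its phandle points to, and "allocating a group" is reduced to a counter.  A real implementation would use the kernel's OF and iommu_group helpers instead:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a dt node: property names plus phandle target. */
struct mock_node {
	const char *const *props;
	size_t nprops;
	const struct mock_node *iommus_target;
};

/* Hypothetical model of a platform device. */
struct mock_master {
	const struct mock_node *of_node;
	int has_group;
};

/* Counts how many groups the sketch would have allocated. */
int mock_groups_allocated;

int mock_node_has_prop(const struct mock_node *np, const char *name)
{
	size_t i;

	if (!np)
		return 0;
	for (i = 0; i < np->nprops; i++)
		if (strcmp(np->props[i], name) == 0)
			return 1;
	return 0;
}

/* A master has "iommus" pointing at a node carrying "#iommu-cells". */
int mock_uses_iommu(const struct mock_master *dev)
{
	return mock_node_has_prop(dev->of_node, "iommus") &&
	       mock_node_has_prop(dev->of_node->iommus_target, "#iommu-cells");
}

int mock_add_device(struct mock_master *dev)
{
	/* Return 0 so that non-masters do not fail the bus registration. */
	if (!mock_uses_iommu(dev))
		return 0;
	dev->has_group = 1;
	mock_groups_allocated++;
	return 0;
}
```

Devices without the property are skipped with a success return, so only real iommu masters would cost a group allocation.
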
Also, perhaps add_device() is a good place to put other generic
device initialization code, which we are currently sprinkling in the
drivers of rockchip iommu masters (drm/codec).  Other drivers do this:
 * shmobile: sets up the iommu mapping with arm_iommu_create_mapping() /
   arm_iommu_attach_device()
 * omap: uses of_parse_phandle()/of_find_device_by_node() to set a
   master device's dev->archdata.iommu.

Or, perhaps we can just ignore iommu groups entirely and use dummy functions:
 static int rk_iommu_add_device(struct device *dev) { return 0; }
 static void rk_iommu_remove_device(struct device *dev) { }

I'll investigate more.

-Dan

>
> I've stolen the generic add_device and remove_device callbacks from the
> exynos iommu driver which makes the rk one at least probe.
>
> Can't say how far it goes, as I'm still struggling with the floating display
> subsystem parts. My current diff against this version can be found in [1].
>
> Maybe the issue I had in attach_device also simply resulted from this one,
> not sure right now.
>
>
> Heiko
>
>> +     if (ret)
>> +             return ret;
>> +
>> +     return platform_driver_register(&rk_iommu_driver);
>> +}
>> +static void __exit rk_iommu_exit(void)
>> +{
>> +     platform_driver_unregister(&rk_iommu_driver);
>> +}
>> +
>> +subsys_initcall(rk_iommu_init);
>> +module_exit(rk_iommu_exit);
>> +
>> +MODULE_DESCRIPTION("IOMMU API for Rockchip");
>> +MODULE_AUTHOR("Simon Xue <xxm@rock-chips.com> and Daniel Kurtz <djkurtz@chromium.org>");
>> +MODULE_ALIAS("platform:rockchip-iommu");
>> +MODULE_LICENSE("GPL v2");
>
>
>
> [0]
>
> [drm] Initialized drm 1.1.0 20060810
> Unable to handle kernel NULL pointer dereference at virtual address 00000058
> pgd = c0004000
> [00000058] *pgd=00000000
> Internal error: Oops: 5 [#1] SMP ARM
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc1+ #1274
> task: ee067b40 ti: ee068000 task.ti: ee068000
> PC is at rk_iommu_attach_device+0x3c/0x29c
> LR is at rk_iommu_attach_device+0x30/0x29c
> pc : [<c03686d4>]    lr : [<c03686c8>]    psr: 60000153
> sp : ee069db8  ip : 00000000  fp : 00000000
> r10: ee276f00  r9 : 00000000  r8 : ee27cc00
> r7 : ee27cf80  r6 : ee11b610  r5 : ee11b610  r4 : ee27cf80
> r3 : 00000000  r2 : c07bb588  r1 : c045b6d0  r0 : c054a27f
> Flags: nZCv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment kernel
> Control: 10c5387d  Table: 0000406a  DAC: 00000015
> Process swapper/0 (pid: 1, stack limit = 0xee068240)
> Stack: (0xee069db8 to 0xee06a000)
> 9da0:                                                       c07c75a8 c0367fec
> 9dc0: c0367fc4 ee276f00 c07c75a8 ee27cf80 ee11b610 00000000 ee27cf80 ee27cc00
> 9de0: 00000000 c0366c48 ee27c250 c001a470 ee27c250 ee11b610 ee2cf400 00000000
> 9e00: ee27cf80 c0225388 c0225298 ee2cf400 00000000 00000000 00000000 c0213cb0
> 9e20: ee11ca40 ee11b610 ee2cf400 00000000 ee2db130 c022522c ee2dbf80 00000004
> 9e40: ee2db110 c022a9e4 ee27cc00 c07c73f8 ee2dbf80 c07da7c0 c082633c c022abe4
> 9e60: c0446710 ee127010 ffffffed c07c7354 c07da7c0 c0230174 ee127010 c07c7354
> 9e80: 00000000 c022ebc0 c07c7354 ee127010 ee069ea8 ee127010 ee127044 c07c7354
> 9ea0: 00000000 c05be4a0 ee068000 c022ee38 00000000 c07c7354 c022edd0 c022d28c
> 9ec0: ee04675c ee1226b4 c07c7354 ee2d6a80 c07c75a8 c022e2b4 c0512e6d c0512e6d
> 9ee0: 00000072 c07c7354 c07b6e18 c07dfac0 c05dd274 c022f4b0 00000000 ee27cd00
> 9f00: c07b6e18 c00089d0 ee0e9300 c0100018 ee0e9300 ee0e9080 ee0e9000 c0420e38
> 9f20: c0802444 00000000 c057d980 c01001a4 c05a7594 ef7fcc05 00000000 c0036b68
> 9f40: 00000000 00000000 c057d980 c057cd90 000000bf 00000006 c07ba5f0 00000006
> 9f60: c05d1568 c07dfac0 c05dd274 000000bf 00000000 00000000 00000000 c05a7d20
> 9f80: 00000006 00000006 c05a7594 ee068000 00000000 c04171ac 00000000 00000000
> 9fa0: 00000000 c04171b4 00000000 c000e878 00000000 00000000 00000000 00000000
> 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
> [<c03686d4>] (rk_iommu_attach_device) from [<c0366c48>] (iommu_attach_device+0x18/0x24)
> [<c0366c48>] (iommu_attach_device) from [<c001a470>] (arm_iommu_attach_device+0x18/0xd0)
> [<c001a470>] (arm_iommu_attach_device) from [<c0225388>] (rockchip_drm_load+0xf0/0x198)
> [<c0225388>] (rockchip_drm_load) from [<c0213cb0>] (drm_dev_register+0x80/0x100)
> [<c0213cb0>] (drm_dev_register) from [<c022522c>] (rockchip_drm_bind+0x48/0x74)
> [<c022522c>] (rockchip_drm_bind) from [<c022a9e4>] (try_to_bring_up_master.part.2+0xa4/0xf4)
> [<c022a9e4>] (try_to_bring_up_master.part.2) from [<c022abe4>] (component_add+0x9c/0x104)
> [<c022abe4>] (component_add) from [<c0230174>] (platform_drv_probe+0x48/0x90)
> [<c0230174>] (platform_drv_probe) from [<c022ebc0>] (driver_probe_device+0x130/0x340)
> [<c022ebc0>] (driver_probe_device) from [<c022ee38>] (__driver_attach+0x68/0x8c)
> [<c022ee38>] (__driver_attach) from [<c022d28c>] (bus_for_each_dev+0x6c/0x80)
> [<c022d28c>] (bus_for_each_dev) from [<c022e2b4>] (bus_add_driver+0xfc/0x1f0)
> [<c022e2b4>] (bus_add_driver) from [<c022f4b0>] (driver_register+0x9c/0xe0)
> [<c022f4b0>] (driver_register) from [<c00089d0>] (do_one_initcall+0x110/0x1bc)
> [<c00089d0>] (do_one_initcall) from [<c05a7d20>] (kernel_init_freeable+0xfc/0x1c8)
> [<c05a7d20>] (kernel_init_freeable) from [<c04171b4>] (kernel_init+0x8/0xe4)
> [<c04171b4>] (kernel_init) from [<c000e878>] (ret_from_fork+0x14/0x3c)
> Code: eb02c07d e5963140 e59f1234 e59f0234 (e5934058)
> ---[ end trace 41e4f8e55e7119af ]---
> Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>
> ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>
>
>
> [1]
>
> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> index 56ffb76..959348f 100644
> --- a/drivers/iommu/rockchip-iommu.c
> +++ b/drivers/iommu/rockchip-iommu.c
> @@ -708,8 +708,8 @@ static size_t rk_iommu_unmap(struct iommu_domain *domain, unsigned long _iova,
>  static int rk_iommu_attach_device(struct iommu_domain *domain,
>                                   struct device *dev)
>  {
> -       struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
>         struct rk_iommu_domain *rk_domain = domain->priv;
> +       struct rk_iommu *iommu;
>         unsigned long flags;
>         int ret;
>         phys_addr_t dte_addr;
> @@ -718,9 +718,11 @@ static int rk_iommu_attach_device(struct iommu_domain *domain,
>          * Allow 'virtual devices' (e.g., drm) to attach to domain.
>          * Such a device has a NULL archdata.iommu.
>          */
> -       if (!iommu)
> +       if (!dev->archdata.iommu)
>                 return 0;
>
> +       iommu = dev_get_drvdata(dev->archdata.iommu);
> +
>         ret = rk_iommu_enable_stall(iommu);
>         if (ret)
>                 return ret;
> @@ -837,6 +839,32 @@ static void rk_iommu_domain_destroy(struct iommu_domain *domain)
>         domain->priv = NULL;
>  }
>
> +static int rk_iommu_add_device(struct device *dev)
> +{
> +       struct iommu_group *group;
> +       int ret;
> +
> +       group = iommu_group_get(dev);
> +
> +       if (!group) {
> +               group = iommu_group_alloc();
> +               if (IS_ERR(group)) {
> +                       dev_err(dev, "Failed to allocate IOMMU group\n");
> +                       return PTR_ERR(group);
> +               }
> +       }
> +
> +       ret = iommu_group_add_device(group, dev);
> +       iommu_group_put(group);
> +
> +       return ret;
> +}
> +
> +static void rk_iommu_remove_device(struct device *dev)
> +{
> +       iommu_group_remove_device(dev);
> +}
> +
>  static const struct iommu_ops rk_iommu_ops = {
>         .domain_init = rk_iommu_domain_init,
>         .domain_destroy = rk_iommu_domain_destroy,
> @@ -845,6 +873,8 @@ static const struct iommu_ops rk_iommu_ops = {
>         .map = rk_iommu_map,
>         .unmap = rk_iommu_unmap,
>         .iova_to_phys = rk_iommu_iova_to_phys,
> +       .add_device = rk_iommu_add_device,
> +       .remove_device = rk_iommu_remove_device,
>         .pgsize_bitmap = RK_IOMMU_PGSIZE_BITMAP,
>  };
>
>
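
The oops quoted under [0] can be cross-checked by decoding the faulting instruction, which is the word in parentheses on the "Code:" line (e5934058).  The sketch below is a small standalone decoder for the ARM immediate-offset load/store encoding; struct arm_ldr_imm and decode_ldr_str_imm() are names made up for this illustration, and only the fields needed here are extracted:

```c
#include <stdint.h>

/* Fields of an ARM LDR/STR with a 12-bit immediate offset. */
struct arm_ldr_imm {
	int is_load;		/* L bit (20): 1 = load, 0 = store */
	int add_offset;		/* U bit (23): 1 = base + imm, 0 = base - imm */
	unsigned int rn;	/* base register number */
	unsigned int rd;	/* destination/source register number */
	uint32_t imm12;		/* immediate offset */
};

/* Returns 0 on success, -1 if the word is not an immediate LDR/STR. */
int decode_ldr_str_imm(uint32_t insn, struct arm_ldr_imm *out)
{
	/* Bits 27:25 are 010 for the immediate-offset form. */
	if (((insn >> 25) & 0x7) != 0x2)
		return -1;

	out->is_load = (insn >> 20) & 1;
	out->add_offset = (insn >> 23) & 1;
	out->rn = (insn >> 16) & 0xf;
	out->rd = (insn >> 12) & 0xf;
	out->imm12 = insn & 0xfff;
	return 0;
}
```

For 0xe5934058 this gives a load into r4 from r3 plus 0x58.  The register dump shows r3 : 00000000, which matches the reported fault address 00000058 and is consistent with a field being read through a NULL pointer at the start of rk_iommu_attach_device().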

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v5 1/3] iommu/rockchip: rk3288 iommu driver
@ 2014-10-27 10:08       ` Daniel Kurtz
  0 siblings, 0 replies; 26+ messages in thread
From: Daniel Kurtz @ 2014-10-27 10:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Oct 27, 2014 at 4:32 AM, Heiko Stübner <heiko@sntech.de> wrote:
> Hi Daniel,
>
> Am Freitag, 24. Oktober 2014, 15:33:47 schrieb Daniel Kurtz:
>
> [...]
>
>> +static int rk_iommu_attach_device(struct iommu_domain *domain,
>> +                               struct device *dev)
>> +{
>> +     struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
>
> Here I get a null-ptr dereference [0] when using the iommu driver with the
> pending drm changes.

That's what I get for testing against a heavily modified v3.14-based kernel...

In v3.14, dev_get_drvdata() would happily return NULL if dev=NULL.
This "feature" was removed in v3.15 by this patch:

commit d4332013919aa87dbdede67d677e4cf2cd32e898
Author: Jean Delvare <jdelvare@suse.de>
Date:   Mon Apr 14 12:57:43 2014 +0200

    driver core: dev_get_drvdata: Don't check for NULL dev

>
>> +     struct rk_iommu_domain *rk_domain = domain->priv;
>> +     unsigned long flags;
>> +     int ret;
>> +     phys_addr_t dte_addr;
>> +
>> +     /*
>> +      * Allow 'virtual devices' (e.g., drm) to attach to domain.
>> +      * Such a device has a NULL archdata.iommu.
>> +      */
>> +     if (!iommu)
>
> If the comment is correct, the code should probably do something like
> the following?
>
> if (!dev->archdata.iommu)
>         return 0;
>
> iommu = dev_get_drvdata(dev->archdata.iommu);
>

Yes, that looks reasonable.

>
>> +             return 0;
>> +
>> +     ret = rk_iommu_enable_stall(iommu);
>> +     if (ret)
>> +             return ret;
>> +
>> +     ret = rk_iommu_force_reset(iommu);
>> +     if (ret)
>> +             return ret;
>> +
>> +     iommu->domain = domain;
>> +
>> +     ret = devm_request_irq(dev, iommu->irq, rk_iommu_irq,
>> +                            IRQF_SHARED, dev_name(dev), iommu);
>> +     if (ret)
>> +             return ret;
>> +
>> +     dte_addr = virt_to_phys(rk_domain->dt);
>> +     rk_iommu_write(iommu, RK_MMU_DTE_ADDR, dte_addr);
>> +     rk_iommu_command(iommu, RK_MMU_CMD_ZAP_CACHE);
>> +     rk_iommu_write(iommu, RK_MMU_INT_MASK, RK_MMU_IRQ_MASK);
>> +
>> +     ret = rk_iommu_enable_paging(iommu);
>> +     if (ret)
>> +             return ret;
>> +
>> +     spin_lock_irqsave(&rk_domain->iommus_lock, flags);
>> +     list_add_tail(&iommu->node, &rk_domain->iommus);
>> +     spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
>> +
>> +     dev_info(dev, "Attached to iommu domain\n");
>> +
>> +     rk_iommu_disable_stall(iommu);
>> +
>> +     return 0;
>> +}
>
> [...]
>
>> +
>> +static struct platform_driver rk_iommu_driver = {
>> +     .probe = rk_iommu_probe,
>> +     .remove = rk_iommu_remove,
>> +     .driver = {
>> +                .name = "rk_iommu",
>> +                .owner = THIS_MODULE,
>> +                .of_match_table = of_match_ptr(rk_iommu_dt_ids),
>> +     },
>> +};
>> +
>> +static int __init rk_iommu_init(void)
>> +{
>> +     int ret;
>> +
>> +     ret = bus_set_iommu(&platform_bus_type, &rk_iommu_ops);
>
> on 3.18-rc1 this fails with -ENODEV, as add_iommu_group() is missing the
> add_device callback in rk_iommu_ops, so the iommu driver actually never
> gets registered.

v3.18-rc1 has patch [0] which changes
bus_set_iommu()->iommu_bus_init() to propagate the return value of
add_iommu_group(), whereas it was ignored in v3.17.

[0] commit fb3e306515ba6a012364b698b8ca71c337424ed3
Author: Mark Salter <msalter@redhat.com>
Date:   Sun Sep 21 13:58:24 2014 -0400

    iommu: Fix bus notifier breakage


This patch made it mandatory that iommu drivers provide an add_device
callback.  I'm not exactly sure why.  Iommu groups do not seem to be
a good fit for the rockchip iommus, since the iommus are all 1:1 with
their master device.

The exynos add_device() is a possibility; however, it causes an
iommu_group to be allocated for every single platform_device, even
those that do not use an iommu.  This seems very wasteful.  Instead we
can check the device's dt node for an "iommus" property whose phandle
points to a node with an "#iommu-cells" property.

Also, perhaps add_device() is a good place to put other generic
device initialization code, which we are currently sprinkling in the
drivers of rockchip iommu masters (drm/codec).  Other drivers do this:
 * shmobile: sets up the iommu mapping with arm_iommu_create_mapping() /
   arm_iommu_attach_device()
 * omap: uses of_parse_phandle()/of_find_device_by_node() to set a
   master device's dev->archdata.iommu.

Or, perhaps we can just ignore iommu groups entirely and use dummy functions:
 static int rk_iommu_add_device(struct device *dev) { return 0; }
 static void rk_iommu_remove_device(struct device *dev) { }

I'll investigate more.

-Dan

>
> I've stolen the generic add_device and remove_device callbacks from the
> exynos iommu driver which makes the rk one at least probe.
>
> Can't say how far it goes, as I'm still struggling with the floating display
> subsystem parts. My current diff against this version can be found in [1].
>
> Maybe the issue I had in attach_device also simply resulted from this one,
> not sure right now.
>
>
> Heiko
>
>> +     if (ret)
>> +             return ret;
>> +
>> +     return platform_driver_register(&rk_iommu_driver);
>> +}
>> +static void __exit rk_iommu_exit(void)
>> +{
>> +     platform_driver_unregister(&rk_iommu_driver);
>> +}
>> +
>> +subsys_initcall(rk_iommu_init);
>> +module_exit(rk_iommu_exit);
>> +
>> +MODULE_DESCRIPTION("IOMMU API for Rockchip");
>> +MODULE_AUTHOR("Simon Xue <xxm@rock-chips.com> and Daniel Kurtz <djkurtz@chromium.org>");
>> +MODULE_ALIAS("platform:rockchip-iommu");
>> +MODULE_LICENSE("GPL v2");
>
>
>
> [0]
>
> [drm] Initialized drm 1.1.0 20060810
> Unable to handle kernel NULL pointer dereference at virtual address 00000058
> pgd = c0004000
> [00000058] *pgd=00000000
> Internal error: Oops: 5 [#1] SMP ARM
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc1+ #1274
> task: ee067b40 ti: ee068000 task.ti: ee068000
> PC is at rk_iommu_attach_device+0x3c/0x29c
> LR is at rk_iommu_attach_device+0x30/0x29c
> pc : [<c03686d4>]    lr : [<c03686c8>]    psr: 60000153
> sp : ee069db8  ip : 00000000  fp : 00000000
> r10: ee276f00  r9 : 00000000  r8 : ee27cc00
> r7 : ee27cf80  r6 : ee11b610  r5 : ee11b610  r4 : ee27cf80
> r3 : 00000000  r2 : c07bb588  r1 : c045b6d0  r0 : c054a27f
> Flags: nZCv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment kernel
> Control: 10c5387d  Table: 0000406a  DAC: 00000015
> Process swapper/0 (pid: 1, stack limit = 0xee068240)
> Stack: (0xee069db8 to 0xee06a000)
> 9da0:                                                       c07c75a8 c0367fec
> 9dc0: c0367fc4 ee276f00 c07c75a8 ee27cf80 ee11b610 00000000 ee27cf80 ee27cc00
> 9de0: 00000000 c0366c48 ee27c250 c001a470 ee27c250 ee11b610 ee2cf400 00000000
> 9e00: ee27cf80 c0225388 c0225298 ee2cf400 00000000 00000000 00000000 c0213cb0
> 9e20: ee11ca40 ee11b610 ee2cf400 00000000 ee2db130 c022522c ee2dbf80 00000004
> 9e40: ee2db110 c022a9e4 ee27cc00 c07c73f8 ee2dbf80 c07da7c0 c082633c c022abe4
> 9e60: c0446710 ee127010 ffffffed c07c7354 c07da7c0 c0230174 ee127010 c07c7354
> 9e80: 00000000 c022ebc0 c07c7354 ee127010 ee069ea8 ee127010 ee127044 c07c7354
> 9ea0: 00000000 c05be4a0 ee068000 c022ee38 00000000 c07c7354 c022edd0 c022d28c
> 9ec0: ee04675c ee1226b4 c07c7354 ee2d6a80 c07c75a8 c022e2b4 c0512e6d c0512e6d
> 9ee0: 00000072 c07c7354 c07b6e18 c07dfac0 c05dd274 c022f4b0 00000000 ee27cd00
> 9f00: c07b6e18 c00089d0 ee0e9300 c0100018 ee0e9300 ee0e9080 ee0e9000 c0420e38
> 9f20: c0802444 00000000 c057d980 c01001a4 c05a7594 ef7fcc05 00000000 c0036b68
> 9f40: 00000000 00000000 c057d980 c057cd90 000000bf 00000006 c07ba5f0 00000006
> 9f60: c05d1568 c07dfac0 c05dd274 000000bf 00000000 00000000 00000000 c05a7d20
> 9f80: 00000006 00000006 c05a7594 ee068000 00000000 c04171ac 00000000 00000000
> 9fa0: 00000000 c04171b4 00000000 c000e878 00000000 00000000 00000000 00000000
> 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
> [<c03686d4>] (rk_iommu_attach_device) from [<c0366c48>] (iommu_attach_device+0x18/0x24)
> [<c0366c48>] (iommu_attach_device) from [<c001a470>] (arm_iommu_attach_device+0x18/0xd0)
> [<c001a470>] (arm_iommu_attach_device) from [<c0225388>] (rockchip_drm_load+0xf0/0x198)
> [<c0225388>] (rockchip_drm_load) from [<c0213cb0>] (drm_dev_register+0x80/0x100)
> [<c0213cb0>] (drm_dev_register) from [<c022522c>] (rockchip_drm_bind+0x48/0x74)
> [<c022522c>] (rockchip_drm_bind) from [<c022a9e4>] (try_to_bring_up_master.part.2+0xa4/0xf4)
> [<c022a9e4>] (try_to_bring_up_master.part.2) from [<c022abe4>] (component_add+0x9c/0x104)
> [<c022abe4>] (component_add) from [<c0230174>] (platform_drv_probe+0x48/0x90)
> [<c0230174>] (platform_drv_probe) from [<c022ebc0>] (driver_probe_device+0x130/0x340)
> [<c022ebc0>] (driver_probe_device) from [<c022ee38>] (__driver_attach+0x68/0x8c)
> [<c022ee38>] (__driver_attach) from [<c022d28c>] (bus_for_each_dev+0x6c/0x80)
> [<c022d28c>] (bus_for_each_dev) from [<c022e2b4>] (bus_add_driver+0xfc/0x1f0)
> [<c022e2b4>] (bus_add_driver) from [<c022f4b0>] (driver_register+0x9c/0xe0)
> [<c022f4b0>] (driver_register) from [<c00089d0>] (do_one_initcall+0x110/0x1bc)
> [<c00089d0>] (do_one_initcall) from [<c05a7d20>] (kernel_init_freeable+0xfc/0x1c8)
> [<c05a7d20>] (kernel_init_freeable) from [<c04171b4>] (kernel_init+0x8/0xe4)
> [<c04171b4>] (kernel_init) from [<c000e878>] (ret_from_fork+0x14/0x3c)
> Code: eb02c07d e5963140 e59f1234 e59f0234 (e5934058)
> ---[ end trace 41e4f8e55e7119af ]---
> Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>
> ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>
>
>
> [1]
>
> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> index 56ffb76..959348f 100644
> --- a/drivers/iommu/rockchip-iommu.c
> +++ b/drivers/iommu/rockchip-iommu.c
> @@ -708,8 +708,8 @@ static size_t rk_iommu_unmap(struct iommu_domain *domain, unsigned long _iova,
>  static int rk_iommu_attach_device(struct iommu_domain *domain,
>                                   struct device *dev)
>  {
> -       struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
>         struct rk_iommu_domain *rk_domain = domain->priv;
> +       struct rk_iommu *iommu;
>         unsigned long flags;
>         int ret;
>         phys_addr_t dte_addr;
> @@ -718,9 +718,11 @@ static int rk_iommu_attach_device(struct iommu_domain *domain,
>          * Allow 'virtual devices' (e.g., drm) to attach to domain.
>          * Such a device has a NULL archdata.iommu.
>          */
> -       if (!iommu)
> +       if (!dev->archdata.iommu)
>                 return 0;
>
> +       iommu = dev_get_drvdata(dev->archdata.iommu);
> +
>         ret = rk_iommu_enable_stall(iommu);
>         if (ret)
>                 return ret;
> @@ -837,6 +839,32 @@ static void rk_iommu_domain_destroy(struct iommu_domain *domain)
>         domain->priv = NULL;
>  }
>
> +static int rk_iommu_add_device(struct device *dev)
> +{
> +       struct iommu_group *group;
> +       int ret;
> +
> +       group = iommu_group_get(dev);
> +
> +       if (!group) {
> +               group = iommu_group_alloc();
> +               if (IS_ERR(group)) {
> +                       dev_err(dev, "Failed to allocate IOMMU group\n");
> +                       return PTR_ERR(group);
> +               }
> +       }
> +
> +       ret = iommu_group_add_device(group, dev);
> +       iommu_group_put(group);
> +
> +       return ret;
> +}
> +
> +static void rk_iommu_remove_device(struct device *dev)
> +{
> +       iommu_group_remove_device(dev);
> +}
> +
>  static const struct iommu_ops rk_iommu_ops = {
>         .domain_init = rk_iommu_domain_init,
>         .domain_destroy = rk_iommu_domain_destroy,
> @@ -845,6 +873,8 @@ static const struct iommu_ops rk_iommu_ops = {
>         .map = rk_iommu_map,
>         .unmap = rk_iommu_unmap,
>         .iova_to_phys = rk_iommu_iova_to_phys,
> +       .add_device = rk_iommu_add_device,
> +       .remove_device = rk_iommu_remove_device,
>         .pgsize_bitmap = RK_IOMMU_PGSIZE_BITMAP,
>  };
>
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v5 1/3] iommu/rockchip: rk3288 iommu driver
@ 2014-10-26 20:32     ` Heiko Stübner
  0 siblings, 0 replies; 26+ messages in thread
From: Heiko Stübner @ 2014-10-26 20:32 UTC (permalink / raw)
  To: Daniel Kurtz
  Cc: Joerg Roedel, Simon Xue, Grant Likely, Rob Herring, open list,
	open list:IOMMU DRIVERS, moderated list:ARM/Rockchip SoC...,
	open list:OPEN FIRMWARE AND...

Hi Daniel,

On Friday, 24 October 2014, 15:33:47, Daniel Kurtz wrote:

[...]

> +static int rk_iommu_attach_device(struct iommu_domain *domain,
> +				  struct device *dev)
> +{
> +	struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);

Here I get a null-ptr dereference [0] when using the iommu driver with the
pending drm changes.

> +	struct rk_iommu_domain *rk_domain = domain->priv;
> +	unsigned long flags;
> +	int ret;
> +	phys_addr_t dte_addr;
> +
> +	/*
> +	 * Allow 'virtual devices' (e.g., drm) to attach to domain.
> +	 * Such a device has a NULL archdata.iommu.
> +	 */
> +	if (!iommu)

If the comment is correct, the code should probably do something like
the following?

if (!dev->archdata.iommu)
	return 0;
 
iommu = dev_get_drvdata(dev->archdata.iommu);
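
As a minimal userspace sketch of why the order matters: the faulting
address 0x58 with r3 = 0 in [0] is consistent with dev_get_drvdata()
loading a field from a NULL struct device, before the !iommu test ever
runs.  The struct layouts and the helper rk_iommu_from_master() below are
hypothetical stand-ins, not part of the patch.

```c
#include <stddef.h>

/* Userspace stand-ins for the kernel types; layouts are illustrative only. */
struct device {
	void *driver_data;			/* what dev_get_drvdata() returns */
	struct { void *iommu; } archdata;	/* NULL for 'virtual' devices */
};

struct rk_iommu { int irq; };

/* Assumption: like the kernel helper, this dereferences dev unconditionally. */
static void *dev_get_drvdata(const struct device *dev)
{
	return dev->driver_data;
}

/*
 * Hypothetical helper: return the rk_iommu behind a master device, or NULL
 * if the master has no iommu.  The NULL check comes before the lookup.
 */
static struct rk_iommu *rk_iommu_from_master(const struct device *master)
{
	const struct device *iommu_dev = master->archdata.iommu;

	if (!iommu_dev)
		return NULL;

	return dev_get_drvdata(iommu_dev);
}
```

With the check first, a device without an iommu takes the early return
instead of reaching dev_get_drvdata() with a NULL pointer.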


> +		return 0;
> +
> +	ret = rk_iommu_enable_stall(iommu);
> +	if (ret)
> +		return ret;
> +
> +	ret = rk_iommu_force_reset(iommu);
> +	if (ret)
> +		return ret;
> +
> +	iommu->domain = domain;
> +
> +	ret = devm_request_irq(dev, iommu->irq, rk_iommu_irq,
> +			       IRQF_SHARED, dev_name(dev), iommu);
> +	if (ret)
> +		return ret;
> +
> +	dte_addr = virt_to_phys(rk_domain->dt);
> +	rk_iommu_write(iommu, RK_MMU_DTE_ADDR, dte_addr);
> +	rk_iommu_command(iommu, RK_MMU_CMD_ZAP_CACHE);
> +	rk_iommu_write(iommu, RK_MMU_INT_MASK, RK_MMU_IRQ_MASK);
> +
> +	ret = rk_iommu_enable_paging(iommu);
> +	if (ret)
> +		return ret;
> +
> +	spin_lock_irqsave(&rk_domain->iommus_lock, flags);
> +	list_add_tail(&iommu->node, &rk_domain->iommus);
> +	spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
> +
> +	dev_info(dev, "Attached to iommu domain\n");
> +
> +	rk_iommu_disable_stall(iommu);
> +
> +	return 0;
> +}

[...]

> +
> +static struct platform_driver rk_iommu_driver = {
> +	.probe = rk_iommu_probe,
> +	.remove = rk_iommu_remove,
> +	.driver = {
> +		   .name = "rk_iommu",
> +		   .owner = THIS_MODULE,
> +		   .of_match_table = of_match_ptr(rk_iommu_dt_ids),
> +	},
> +};
> +
> +static int __init rk_iommu_init(void)
> +{
> +	int ret;
> +
> +	ret = bus_set_iommu(&platform_bus_type, &rk_iommu_ops);

On 3.18-rc1 this fails with -ENODEV, because add_iommu_group() finds no
add_device callback in rk_iommu_ops, so the iommu driver never actually
gets registered.

I've stolen the generic add_device and remove_device callbacks from the
exynos iommu driver which makes the rk one at least probe.
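
The failure mode can be modelled in userspace roughly as follows.  This
is a simplified sketch, assuming the per-device registration step does
nothing more than check for the callback and call it; the reduced
struct iommu_ops and the model_* names are hypothetical and only mirror
the behaviour described above.

```c
#include <errno.h>
#include <stddef.h>

struct device;

/* Reduced stand-in for struct iommu_ops: only the callbacks discussed here. */
struct iommu_ops {
	int  (*add_device)(struct device *dev);
	void (*remove_device)(struct device *dev);
};

/*
 * Simplified model of the per-device step run while an iommu is registered
 * for a bus: without an add_device callback there is nothing to call, and
 * the step reports -ENODEV.
 */
static int model_add_iommu_group(struct device *dev, const struct iommu_ops *ops)
{
	if (!ops->add_device)
		return -ENODEV;

	return ops->add_device(dev);
}

/* Hypothetical no-op callback standing in for rk_iommu_add_device(). */
static int model_add_device(struct device *dev)
{
	(void)dev;
	return 0;
}
```

An ops table that leaves .add_device unset yields -ENODEV in this model,
while one that sets it lets the registration step proceed.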

Can't say how far it goes, as I'm still struggling with the floating display
subsystem parts. My current diff against this version can be found in [1].

Maybe the issue I had in attach_device also simply resulted from this one,
not sure right now.


Heiko

> +	if (ret)
> +		return ret;
> +
> +	return platform_driver_register(&rk_iommu_driver);
> +}
> +static void __exit rk_iommu_exit(void)
> +{
> +	platform_driver_unregister(&rk_iommu_driver);
> +}
> +
> +subsys_initcall(rk_iommu_init);
> +module_exit(rk_iommu_exit);
> +
> +MODULE_DESCRIPTION("IOMMU API for Rockchip");
> +MODULE_AUTHOR("Simon Xue <xxm@rock-chips.com> and Daniel Kurtz <djkurtz@chromium.org>");
> +MODULE_ALIAS("platform:rockchip-iommu");
> +MODULE_LICENSE("GPL v2");



[0]

[drm] Initialized drm 1.1.0 20060810
Unable to handle kernel NULL pointer dereference at virtual address 00000058
pgd = c0004000
[00000058] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc1+ #1274
task: ee067b40 ti: ee068000 task.ti: ee068000
PC is at rk_iommu_attach_device+0x3c/0x29c
LR is at rk_iommu_attach_device+0x30/0x29c
pc : [<c03686d4>]    lr : [<c03686c8>]    psr: 60000153
sp : ee069db8  ip : 00000000  fp : 00000000
r10: ee276f00  r9 : 00000000  r8 : ee27cc00
r7 : ee27cf80  r6 : ee11b610  r5 : ee11b610  r4 : ee27cf80
r3 : 00000000  r2 : c07bb588  r1 : c045b6d0  r0 : c054a27f
Flags: nZCv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment kernel
Control: 10c5387d  Table: 0000406a  DAC: 00000015
Process swapper/0 (pid: 1, stack limit = 0xee068240)
Stack: (0xee069db8 to 0xee06a000)
9da0:                                                       c07c75a8 c0367fec
9dc0: c0367fc4 ee276f00 c07c75a8 ee27cf80 ee11b610 00000000 ee27cf80 ee27cc00
9de0: 00000000 c0366c48 ee27c250 c001a470 ee27c250 ee11b610 ee2cf400 00000000
9e00: ee27cf80 c0225388 c0225298 ee2cf400 00000000 00000000 00000000 c0213cb0
9e20: ee11ca40 ee11b610 ee2cf400 00000000 ee2db130 c022522c ee2dbf80 00000004
9e40: ee2db110 c022a9e4 ee27cc00 c07c73f8 ee2dbf80 c07da7c0 c082633c c022abe4
9e60: c0446710 ee127010 ffffffed c07c7354 c07da7c0 c0230174 ee127010 c07c7354
9e80: 00000000 c022ebc0 c07c7354 ee127010 ee069ea8 ee127010 ee127044 c07c7354
9ea0: 00000000 c05be4a0 ee068000 c022ee38 00000000 c07c7354 c022edd0 c022d28c
9ec0: ee04675c ee1226b4 c07c7354 ee2d6a80 c07c75a8 c022e2b4 c0512e6d c0512e6d
9ee0: 00000072 c07c7354 c07b6e18 c07dfac0 c05dd274 c022f4b0 00000000 ee27cd00
9f00: c07b6e18 c00089d0 ee0e9300 c0100018 ee0e9300 ee0e9080 ee0e9000 c0420e38
9f20: c0802444 00000000 c057d980 c01001a4 c05a7594 ef7fcc05 00000000 c0036b68
9f40: 00000000 00000000 c057d980 c057cd90 000000bf 00000006 c07ba5f0 00000006
9f60: c05d1568 c07dfac0 c05dd274 000000bf 00000000 00000000 00000000 c05a7d20
9f80: 00000006 00000006 c05a7594 ee068000 00000000 c04171ac 00000000 00000000
9fa0: 00000000 c04171b4 00000000 c000e878 00000000 00000000 00000000 00000000
9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[<c03686d4>] (rk_iommu_attach_device) from [<c0366c48>] (iommu_attach_device+0x18/0x24)
[<c0366c48>] (iommu_attach_device) from [<c001a470>] (arm_iommu_attach_device+0x18/0xd0)
[<c001a470>] (arm_iommu_attach_device) from [<c0225388>] (rockchip_drm_load+0xf0/0x198)
[<c0225388>] (rockchip_drm_load) from [<c0213cb0>] (drm_dev_register+0x80/0x100)
[<c0213cb0>] (drm_dev_register) from [<c022522c>] (rockchip_drm_bind+0x48/0x74)
[<c022522c>] (rockchip_drm_bind) from [<c022a9e4>] (try_to_bring_up_master.part.2+0xa4/0xf4)
[<c022a9e4>] (try_to_bring_up_master.part.2) from [<c022abe4>] (component_add+0x9c/0x104)
[<c022abe4>] (component_add) from [<c0230174>] (platform_drv_probe+0x48/0x90)
[<c0230174>] (platform_drv_probe) from [<c022ebc0>] (driver_probe_device+0x130/0x340)
[<c022ebc0>] (driver_probe_device) from [<c022ee38>] (__driver_attach+0x68/0x8c)
[<c022ee38>] (__driver_attach) from [<c022d28c>] (bus_for_each_dev+0x6c/0x80)
[<c022d28c>] (bus_for_each_dev) from [<c022e2b4>] (bus_add_driver+0xfc/0x1f0)
[<c022e2b4>] (bus_add_driver) from [<c022f4b0>] (driver_register+0x9c/0xe0)
[<c022f4b0>] (driver_register) from [<c00089d0>] (do_one_initcall+0x110/0x1bc)
[<c00089d0>] (do_one_initcall) from [<c05a7d20>] (kernel_init_freeable+0xfc/0x1c8)
[<c05a7d20>] (kernel_init_freeable) from [<c04171b4>] (kernel_init+0x8/0xe4)
[<c04171b4>] (kernel_init) from [<c000e878>] (ret_from_fork+0x14/0x3c)
Code: eb02c07d e5963140 e59f1234 e59f0234 (e5934058) 
---[ end trace 41e4f8e55e7119af ]---
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b



[1]

diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
index 56ffb76..959348f 100644
--- a/drivers/iommu/rockchip-iommu.c
+++ b/drivers/iommu/rockchip-iommu.c
@@ -708,8 +708,8 @@ static size_t rk_iommu_unmap(struct iommu_domain *domain, unsigned long _iova,
 static int rk_iommu_attach_device(struct iommu_domain *domain,
 				  struct device *dev)
 {
-	struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
 	struct rk_iommu_domain *rk_domain = domain->priv;
+	struct rk_iommu *iommu;
 	unsigned long flags;
 	int ret;
 	phys_addr_t dte_addr;
@@ -718,9 +718,11 @@ static int rk_iommu_attach_device(struct iommu_domain *domain,
 	 * Allow 'virtual devices' (e.g., drm) to attach to domain.
 	 * Such a device has a NULL archdata.iommu.
 	 */
-	if (!iommu)
+	if (!dev->archdata.iommu)
 		return 0;
 
+	iommu = dev_get_drvdata(dev->archdata.iommu);
+
 	ret = rk_iommu_enable_stall(iommu);
 	if (ret)
 		return ret;
@@ -837,6 +839,32 @@ static void rk_iommu_domain_destroy(struct iommu_domain *domain)
 	domain->priv = NULL;
 }
 
+static int rk_iommu_add_device(struct device *dev)
+{
+	struct iommu_group *group;
+	int ret;
+
+	group = iommu_group_get(dev);
+
+	if (!group) {
+		group = iommu_group_alloc();
+		if (IS_ERR(group)) {
+			dev_err(dev, "Failed to allocate IOMMU group\n");
+			return PTR_ERR(group);
+		}
+	}
+
+	ret = iommu_group_add_device(group, dev);
+	iommu_group_put(group);
+
+	return ret;
+}
+
+static void rk_iommu_remove_device(struct device *dev)
+{
+	iommu_group_remove_device(dev);
+}
+
 static const struct iommu_ops rk_iommu_ops = {
 	.domain_init = rk_iommu_domain_init,
 	.domain_destroy = rk_iommu_domain_destroy,
@@ -845,6 +873,8 @@ static const struct iommu_ops rk_iommu_ops = {
 	.map = rk_iommu_map,
 	.unmap = rk_iommu_unmap,
 	.iova_to_phys = rk_iommu_iova_to_phys,
+	.add_device = rk_iommu_add_device,
+	.remove_device = rk_iommu_remove_device,
 	.pgsize_bitmap = RK_IOMMU_PGSIZE_BITMAP,
 };
 


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 1/3] iommu/rockchip: rk3288 iommu driver
@ 2014-10-24  7:33   ` Daniel Kurtz
  0 siblings, 0 replies; 26+ messages in thread
From: Daniel Kurtz @ 2014-10-24  7:33 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Heiko Stuebner, Daniel Kurtz, Simon Xue, Grant Likely,
	Rob Herring, open list, open list:IOMMU DRIVERS,
	moderated list:ARM/Rockchip SoC...,
	open list:OPEN FIRMWARE AND...

The rk3288 has several iommus.  Each iommu belongs to a single master
device.  There is one device (ISP) that has two slave iommus, but that
case is not yet supported by this driver.

At subsys init, the iommu driver registers itself as the iommu driver for
the platform bus.  The master devices find their slave iommus using the
"iommus" field in their devicetree description.  Since each slave iommu
belongs to exactly one master, there is no additional data needed at probe
to associate a slave with its master.

An iommu device's power domain, clock and irq are all shared with its
master device, and the master device must be careful to attach to the
iommu only after powering and clocking it (and to keep it powered and
clocked until it has detached).  Because there is no guarantee what the
status of the iommu is at probe, and since the driver does not even know
if the device is powered, we delay requesting its irq until the master
device attaches, at which point we have a guarantee that the device is
powered and clocked and we can reset it and disable its interrupt mask.

An iommu_domain describes a virtual iova address space.  Each iommu_domain
has a corresponding page table that lists the mappings from iova to
physical address.

For the rk3288 iommu, the page table has two levels:
 The Level 1 "directory_table" has 1024 4-byte dte entries.
 Each dte points to a level 2 "page_table".
 Each level 2 page_table has 1024 4-byte pte entries.
 Each pte points to a 4 KiB page of memory.
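
The two-level layout above implies how a 32-bit iova is decoded: 10 bits
select the dte, 10 bits select the pte, and the low 12 bits are the offset
into the 4 KiB page.  A minimal user-space sketch of that split follows.
The ex_* names are hypothetical and exist only for illustration; the
driver in this patch uses its own rk_iova_* helpers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * iova field layout implied by the two-level table:
 *   31:22 dte index, 21:12 pte index, 11:0 page offset.
 */
#define EX_DTE_SHIFT	22
#define EX_PTE_SHIFT	12
#define EX_INDEX_MASK	0x3ffu	/* 1024 entries per table */
#define EX_OFFSET_MASK	0xfffu	/* 4 KiB page */

/* Hypothetical container for the three fields of an iova. */
struct ex_iova_parts {
	uint32_t dte_index;
	uint32_t pte_index;
	uint32_t page_offset;
};

/* Split an iova into its directory index, table index and page offset. */
static struct ex_iova_parts ex_iova_split(uint32_t iova)
{
	struct ex_iova_parts p;

	p.dte_index = (iova >> EX_DTE_SHIFT) & EX_INDEX_MASK;
	p.pte_index = (iova >> EX_PTE_SHIFT) & EX_INDEX_MASK;
	p.page_offset = iova & EX_OFFSET_MASK;
	return p;
}

/* Inverse of ex_iova_split(); each field must fit its bit width. */
static uint32_t ex_iova_join(struct ex_iova_parts p)
{
	assert(p.dte_index <= EX_INDEX_MASK);
	assert(p.pte_index <= EX_INDEX_MASK);
	assert(p.page_offset <= EX_OFFSET_MASK);

	return (p.dte_index << EX_DTE_SHIFT) |
	       (p.pte_index << EX_PTE_SHIFT) |
	       p.page_offset;
}
```

With this layout one page_table covers 1024 * 4 KiB = 4 MiB of iova space,
and the full directory_table covers 1024 * 4 MiB = 4 GiB.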

An iommu_domain is created when a dma_iommu_mapping is created via
arm_iommu_create_mapping().  Master devices can then attach themselves to
this mapping (or attach the mapping to themselves?) by calling
arm_iommu_attach_device().  This in turn instructs the iommu driver to
write the directory table's physical address into the slave iommu's
"Directory Table Address" (MMU_DTE_ADDR) register.
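
As a rough illustration of the register programming at attach time, the
sketch below models the iommu as a plain array of registers and replays
the order of writes used by rk_iommu_attach_device() in this patch: table
address, IOTLB zap, interrupt unmask, then enable paging.  The offsets and
command values are copied from the patch; struct ex_mmu and the ex_*
functions are hypothetical stand-ins, and the stall, reset and irq request
steps are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Register offsets and command values as defined in the patch. */
#define EX_MMU_DTE_ADDR		0x00
#define EX_MMU_COMMAND		0x08
#define EX_MMU_INT_MASK		0x1C
#define EX_CMD_ENABLE_PAGING	0
#define EX_CMD_ZAP_CACHE	4
#define EX_IRQ_MASK		0x03	/* page fault | bus error */

/* Hypothetical stand-in for the memory-mapped register block. */
struct ex_mmu {
	uint32_t regs[16];
	int paging_enabled;
	int zap_count;
};

/* Model a register write; writes to COMMAND have side effects. */
static void ex_mmu_write(struct ex_mmu *mmu, uint32_t offset, uint32_t value)
{
	mmu->regs[offset / 4] = value;
	if (offset != EX_MMU_COMMAND)
		return;
	if (value == EX_CMD_ENABLE_PAGING)
		mmu->paging_enabled = 1;
	else if (value == EX_CMD_ZAP_CACHE)
		mmu->zap_count++;
}

/* Point the iommu at a directory table and turn translation on. */
static void ex_mmu_attach(struct ex_mmu *mmu, uint32_t dt_phys)
{
	/* The directory table occupies one 4 KiB-aligned page. */
	assert((dt_phys & 0xfffu) == 0);

	ex_mmu_write(mmu, EX_MMU_DTE_ADDR, dt_phys);
	ex_mmu_write(mmu, EX_MMU_COMMAND, EX_CMD_ZAP_CACHE);
	ex_mmu_write(mmu, EX_MMU_INT_MASK, EX_IRQ_MASK);
	ex_mmu_write(mmu, EX_MMU_COMMAND, EX_CMD_ENABLE_PAGING);
}
```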

In fact multiple master devices, each with their own slave iommu device,
can all attach to the same mapping.  The iommus for these devices will
share the same iommu_domain and therefore point to the same page table.
Thus, the iommu domain maintains a list of iommu devices which are
attached.  This driver relies on the iommu core to ensure that all devices
have detached before destroying a domain.
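
The sharing described above can be sketched in user space as follows.
This is only a model under simplifying assumptions: the attached iommus
are kept in a small fixed array rather than the locked list_head the
driver uses, and all ex_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_MAX_IOMMUS 4	/* arbitrary bound for this sketch */

struct ex_domain;

struct ex_iommu {
	uint32_t dte_addr;		/* models the table address register */
	struct ex_domain *domain;	/* NULL while detached */
};

struct ex_domain {
	uint32_t dt_phys;		/* shared directory table address */
	struct ex_iommu *iommus[EX_MAX_IOMMUS];
	size_t count;
};

/* Attach an iommu: it now translates through the domain's table. */
static int ex_attach(struct ex_domain *d, struct ex_iommu *m)
{
	if (d->count == EX_MAX_IOMMUS)
		return -1;
	m->dte_addr = d->dt_phys;
	m->domain = d;
	d->iommus[d->count++] = m;
	return 0;
}

/* Detach an iommu and drop it from the domain's bookkeeping. */
static void ex_detach(struct ex_domain *d, struct ex_iommu *m)
{
	for (size_t i = 0; i < d->count; i++) {
		if (d->iommus[i] != m)
			continue;
		d->iommus[i] = d->iommus[--d->count];
		m->dte_addr = 0;
		m->domain = NULL;
		return;
	}
}

/* Destroying a domain is only legal once every iommu has detached. */
static void ex_domain_destroy(struct ex_domain *d)
{
	assert(d->count == 0);
	d->dt_phys = 0;
}
```

Because every attached iommu holds the same table address, a mapping
change in the domain is visible to all of them, which is why an unmap has
to shoot down IOTLB entries on each iommu in the list.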

Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Simon Xue <xxm@rock-chips.com>
Reviewed-by: Grant Grundler <grundler@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
---
 drivers/iommu/Kconfig          |  12 +
 drivers/iommu/Makefile         |   1 +
 drivers/iommu/rockchip-iommu.c | 922 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 935 insertions(+)
 create mode 100644 drivers/iommu/rockchip-iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index dd51122..d0a1261 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -152,6 +152,18 @@ config OMAP_IOMMU_DEBUG
 
          Say N unless you know you need this.
 
+config ROCKCHIP_IOMMU
+	bool "Rockchip IOMMU Support"
+	depends on ARCH_ROCKCHIP
+	select IOMMU_API
+	select ARM_DMA_USE_IOMMU
+	help
+	  Support for IOMMUs found on Rockchip rk32xx SOCs.
+	  These IOMMUs allow virtualization of the address space used by most
+	  cores within the multimedia subsystem.
+	  Say Y here if you are using a Rockchip SoC that includes an IOMMU
+	  device.
+
 config TEGRA_IOMMU_GART
 	bool "Tegra GART IOMMU Support"
 	depends on ARCH_TEGRA_2x_SOC
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 16edef7..3e47ef3 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -13,6 +13,7 @@ obj-$(CONFIG_IRQ_REMAP) += intel_irq_remapping.o irq_remapping.o
 obj-$(CONFIG_OMAP_IOMMU) += omap-iommu.o
 obj-$(CONFIG_OMAP_IOMMU) += omap-iommu2.o
 obj-$(CONFIG_OMAP_IOMMU_DEBUG) += omap-iommu-debug.o
+obj-$(CONFIG_ROCKCHIP_IOMMU) += rockchip-iommu.o
 obj-$(CONFIG_TEGRA_IOMMU_GART) += tegra-gart.o
 obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
new file mode 100644
index 0000000..56ffb76
--- /dev/null
+++ b/drivers/iommu/rockchip-iommu.c
@@ -0,0 +1,922 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <asm/cacheflush.h>
+#include <asm/pgtable.h>
+#include <linux/compiler.h>
+#include <linux/delay.h>
+#include <linux/device.h>
+#include <linux/errno.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/jiffies.h>
+#include <linux/list.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+/** MMU register offsets */
+#define RK_MMU_DTE_ADDR		0x00	/* Directory table address */
+#define RK_MMU_STATUS		0x04
+#define RK_MMU_COMMAND		0x08
+#define RK_MMU_PAGE_FAULT_ADDR	0x0C	/* IOVA of last page fault */
+#define RK_MMU_ZAP_ONE_LINE	0x10	/* Shootdown one IOTLB entry */
+#define RK_MMU_INT_RAWSTAT	0x14	/* IRQ status ignoring mask */
+#define RK_MMU_INT_CLEAR	0x18	/* Acknowledge and re-arm irq */
+#define RK_MMU_INT_MASK		0x1C	/* IRQ enable */
+#define RK_MMU_INT_STATUS	0x20	/* IRQ status after masking */
+#define RK_MMU_AUTO_GATING	0x24
+
+#define DTE_ADDR_DUMMY		0xCAFEBABE
+#define FORCE_RESET_TIMEOUT	100	/* ms */
+
+/* RK_MMU_STATUS fields */
+#define RK_MMU_STATUS_PAGING_ENABLED       BIT(0)
+#define RK_MMU_STATUS_PAGE_FAULT_ACTIVE    BIT(1)
+#define RK_MMU_STATUS_STALL_ACTIVE         BIT(2)
+#define RK_MMU_STATUS_IDLE                 BIT(3)
+#define RK_MMU_STATUS_REPLAY_BUFFER_EMPTY  BIT(4)
+#define RK_MMU_STATUS_PAGE_FAULT_IS_WRITE  BIT(5)
+#define RK_MMU_STATUS_STALL_NOT_ACTIVE     BIT(31)
+
+/* RK_MMU_COMMAND command values */
+#define RK_MMU_CMD_ENABLE_PAGING    0  /* Enable memory translation */
+#define RK_MMU_CMD_DISABLE_PAGING   1  /* Disable memory translation */
+#define RK_MMU_CMD_ENABLE_STALL     2  /* Stall paging to allow other cmds */
+#define RK_MMU_CMD_DISABLE_STALL    3  /* Stop stall re-enables paging */
+#define RK_MMU_CMD_ZAP_CACHE        4  /* Shoot down entire IOTLB */
+#define RK_MMU_CMD_PAGE_FAULT_DONE  5  /* Clear page fault */
+#define RK_MMU_CMD_FORCE_RESET      6  /* Reset all registers */
+
+/* RK_MMU_INT_* register fields */
+#define RK_MMU_IRQ_PAGE_FAULT    0x01  /* page fault */
+#define RK_MMU_IRQ_BUS_ERROR     0x02  /* bus read error */
+#define RK_MMU_IRQ_MASK          (RK_MMU_IRQ_PAGE_FAULT | RK_MMU_IRQ_BUS_ERROR)
+
+#define NUM_DT_ENTRIES 1024
+#define NUM_PT_ENTRIES 1024
+
+#define SPAGE_ORDER 12
+#define SPAGE_SIZE (1 << SPAGE_ORDER)
+
+ /*
+  * Support mapping any size that fits in one page table:
+  *   4 KiB to 4 MiB
+  */
+#define RK_IOMMU_PGSIZE_BITMAP 0x007ff000
+
+#define IOMMU_REG_POLL_COUNT_FAST 1000
+
+struct rk_iommu_domain {
+	struct list_head iommus;
+	u32 *dt; /* page directory table */
+	spinlock_t iommus_lock; /* lock for iommus list */
+	spinlock_t dt_lock; /* lock for modifying page directory table */
+};
+
+struct rk_iommu {
+	struct device *dev;
+	void __iomem *base;
+	int irq;
+	struct list_head node; /* entry in rk_iommu_domain.iommus */
+	struct iommu_domain *domain; /* domain to which iommu is attached */
+};
+
+static inline void rk_table_flush(u32 *va, unsigned int count)
+{
+	phys_addr_t pa_start = virt_to_phys(va);
+	phys_addr_t pa_end = virt_to_phys(va + count);
+	size_t size = pa_end - pa_start;
+
+	__cpuc_flush_dcache_area(va, size);
+	outer_flush_range(pa_start, pa_end);
+}
+
+/**
+ * Inspired by _wait_for in intel_drv.h
+ * This is NOT safe for use in interrupt context.
+ *
+ * Note that it's important that we check the condition again after having
+ * timed out, since the timeout could be due to preemption or similar and
+ * we've never had a chance to check the condition before the timeout.
+ */
+#define rk_wait_for(COND, MS) ({ \
+	unsigned long timeout__ = jiffies + msecs_to_jiffies(MS) + 1;	\
+	int ret__ = 0;							\
+	while (!(COND)) {						\
+		if (time_after(jiffies, timeout__)) {			\
+			ret__ = (COND) ? 0 : -ETIMEDOUT;		\
+			break;						\
+		}							\
+		usleep_range(50, 100);					\
+	}								\
+	ret__;								\
+})
+
+/*
+ * The Rockchip rk3288 iommu uses a 2-level page table.
+ * The first level is the "Directory Table" (DT).
+ * The DT consists of 1024 4-byte Directory Table Entries (DTEs), each pointing
+ * to a "Page Table".
+ * The second level is the 1024 Page Tables (PT).
+ * Each PT consists of 1024 4-byte Page Table Entries (PTEs), each pointing to
+ * a 4 KB page of physical memory.
+ *
+ * The DT and each PT fits in a single 4 KB page (4-bytes * 1024 entries).
+ * Each iommu device has a MMU_DTE_ADDR register that contains the physical
+ * address of the start of the DT page.
+ *
+ * The structure of the page table is as follows:
+ *
+ *                   DT
+ * MMU_DTE_ADDR -> +-----+
+ *                 |     |
+ *                 +-----+     PT
+ *                 | DTE | -> +-----+
+ *                 +-----+    |     |     Memory
+ *                 |     |    +-----+     Page
+ *                 |     |    | PTE | -> +-----+
+ *                 +-----+    +-----+    |     |
+ *                            |     |    |     |
+ *                            |     |    |     |
+ *                            +-----+    |     |
+ *                                       |     |
+ *                                       |     |
+ *                                       +-----+
+ */
+
+/*
+ * Each DTE has a PT address and a valid bit:
+ * +---------------------+-----------+-+
+ * | PT address          | Reserved  |V|
+ * +---------------------+-----------+-+
+ *  31:12 - PT address (PTs always start on a 4 KB boundary)
+ *  11: 1 - Reserved
+ *      0 - 1 if PT @ PT address is valid
+ */
+#define RK_DTE_PT_ADDRESS_MASK    0xfffff000
+#define RK_DTE_PT_VALID           BIT(0)
+
+static inline phys_addr_t rk_dte_pt_address(u32 dte)
+{
+	return (phys_addr_t)dte & RK_DTE_PT_ADDRESS_MASK;
+}
+
+static inline bool rk_dte_is_pt_valid(u32 dte)
+{
+	return dte & RK_DTE_PT_VALID;
+}
+
+static u32 rk_mk_dte(u32 *pt)
+{
+	phys_addr_t pt_phys = virt_to_phys(pt);
+	return (pt_phys & RK_DTE_PT_ADDRESS_MASK) | RK_DTE_PT_VALID;
+}
+
+/*
+ * Each PTE has a Page address, some flags and a valid bit:
+ * +---------------------+---+-------+-+
+ * | Page address        |Rsv| Flags |V|
+ * +---------------------+---+-------+-+
+ *  31:12 - Page address (Pages always start on a 4 KB boundary)
+ *  11: 9 - Reserved
+ *   8: 1 - Flags
+ *      8 - Read allocate - allocate cache space on read misses
+ *      7 - Read cache - enable cache & prefetch of data
+ *      6 - Write buffer - enable delaying writes on their way to memory
+ *      5 - Write allocate - allocate cache space on write misses
+ *      4 - Write cache - different writes can be merged together
+ *      3 - Override cache attributes
+ *          if 1, bits 4-8 control cache attributes
+ *          if 0, the system bus defaults are used
+ *      2 - Writable
+ *      1 - Readable
+ *      0 - 1 if Page @ Page address is valid
+ */
+#define RK_PTE_PAGE_ADDRESS_MASK  0xfffff000
+#define RK_PTE_PAGE_FLAGS_MASK    0x000001fe
+#define RK_PTE_PAGE_WRITABLE      BIT(2)
+#define RK_PTE_PAGE_READABLE      BIT(1)
+#define RK_PTE_PAGE_VALID         BIT(0)
+
+static inline phys_addr_t rk_pte_page_address(u32 pte)
+{
+	return (phys_addr_t)pte & RK_PTE_PAGE_ADDRESS_MASK;
+}
+
+static inline bool rk_pte_is_page_valid(u32 pte)
+{
+	return pte & RK_PTE_PAGE_VALID;
+}
+
+/* TODO: set cache flags per prot IOMMU_CACHE */
+static u32 rk_mk_pte(phys_addr_t page, int prot)
+{
+	u32 flags = 0;
+	flags |= (prot & IOMMU_READ) ? RK_PTE_PAGE_READABLE : 0;
+	flags |= (prot & IOMMU_WRITE) ? RK_PTE_PAGE_WRITABLE : 0;
+	page &= RK_PTE_PAGE_ADDRESS_MASK;
+	return page | flags | RK_PTE_PAGE_VALID;
+}
+
+static u32 rk_mk_pte_invalid(u32 pte)
+{
+	return pte & ~RK_PTE_PAGE_VALID;
+}
+
+/*
+ * rk3288 iova (IOMMU Virtual Address) format
+ *  31       22.21       12.11          0
+ * +-----------+-----------+-------------+
+ * | DTE index | PTE index | Page offset |
+ * +-----------+-----------+-------------+
+ *  31:22 - DTE index   - index of DTE in DT
+ *  21:12 - PTE index   - index of PTE in PT @ DTE.pt_address
+ *  11: 0 - Page offset - offset into page @ PTE.page_address
+ */
+#define RK_IOVA_DTE_MASK    0xffc00000
+#define RK_IOVA_DTE_SHIFT   22
+#define RK_IOVA_PTE_MASK    0x003ff000
+#define RK_IOVA_PTE_SHIFT   12
+#define RK_IOVA_PAGE_MASK   0x00000fff
+#define RK_IOVA_PAGE_SHIFT  0
+
+static u32 rk_iova_dte_index(dma_addr_t iova)
+{
+	return (u32)(iova & RK_IOVA_DTE_MASK) >> RK_IOVA_DTE_SHIFT;
+}
+
+static u32 rk_iova_pte_index(dma_addr_t iova)
+{
+	return (u32)(iova & RK_IOVA_PTE_MASK) >> RK_IOVA_PTE_SHIFT;
+}
+
+static u32 rk_iova_page_offset(dma_addr_t iova)
+{
+	return (u32)(iova & RK_IOVA_PAGE_MASK) >> RK_IOVA_PAGE_SHIFT;
+}
+
+static u32 rk_iommu_read(struct rk_iommu *iommu, u32 offset)
+{
+	return readl(iommu->base + offset);
+}
+
+static void rk_iommu_write(struct rk_iommu *iommu, u32 offset, u32 value)
+{
+	writel(value, iommu->base + offset);
+}
+
+static void rk_iommu_command(struct rk_iommu *iommu, u32 command)
+{
+	writel(command, iommu->base + RK_MMU_COMMAND);
+}
+
+static void rk_iommu_zap_lines(struct rk_iommu *iommu, dma_addr_t iova,
+			       size_t size)
+{
+	dma_addr_t iova_end = iova + size;
+	/*
+	 * TODO(djkurtz): Figure out when it is more efficient to shootdown the
+	 * entire iotlb rather than iterate over individual iovas.
+	 */
+	for (; iova < iova_end; iova += SPAGE_SIZE)
+		rk_iommu_write(iommu, RK_MMU_ZAP_ONE_LINE, iova);
+}
+
+static bool rk_iommu_is_stall_active(struct rk_iommu *iommu)
+{
+	return rk_iommu_read(iommu, RK_MMU_STATUS) & RK_MMU_STATUS_STALL_ACTIVE;
+}
+
+static bool rk_iommu_is_paging_enabled(struct rk_iommu *iommu)
+{
+	return rk_iommu_read(iommu, RK_MMU_STATUS) &
+			     RK_MMU_STATUS_PAGING_ENABLED;
+}
+
+static int rk_iommu_enable_stall(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (rk_iommu_is_stall_active(iommu))
+		return 0;
+
+	/* Stall can only be enabled if paging is enabled */
+	if (!rk_iommu_is_paging_enabled(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_ENABLE_STALL);
+
+	ret = rk_wait_for(rk_iommu_is_stall_active(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Enable stall request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_disable_stall(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (!rk_iommu_is_stall_active(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_DISABLE_STALL);
+
+	ret = rk_wait_for(!rk_iommu_is_stall_active(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Disable stall request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_enable_paging(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (rk_iommu_is_paging_enabled(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_ENABLE_PAGING);
+
+	ret = rk_wait_for(rk_iommu_is_paging_enabled(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Enable paging request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_disable_paging(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (!rk_iommu_is_paging_enabled(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_DISABLE_PAGING);
+
+	ret = rk_wait_for(!rk_iommu_is_paging_enabled(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Disable paging request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_force_reset(struct rk_iommu *iommu)
+{
+	int ret;
+	u32 dte_addr;
+
+	/*
+	 * Check if register DTE_ADDR is working by writing DTE_ADDR_DUMMY
+	 * and verifying that upper 5 nybbles are read back.
+	 */
+	rk_iommu_write(iommu, RK_MMU_DTE_ADDR, DTE_ADDR_DUMMY);
+
+	dte_addr = rk_iommu_read(iommu, RK_MMU_DTE_ADDR);
+	if (dte_addr != (DTE_ADDR_DUMMY & RK_DTE_PT_ADDRESS_MASK)) {
+		dev_err(iommu->dev, "Error during raw reset. MMU_DTE_ADDR is not functioning\n");
+		return -EFAULT;
+	}
+
+	rk_iommu_command(iommu, RK_MMU_CMD_FORCE_RESET);
+
+	ret = rk_wait_for(rk_iommu_read(iommu, RK_MMU_DTE_ADDR) == 0x00000000,
+			  FORCE_RESET_TIMEOUT);
+	if (ret)
+		dev_err(iommu->dev, "FORCE_RESET command timed out\n");
+
+	return ret;
+}
+
+static void log_iova(struct rk_iommu *iommu, dma_addr_t iova)
+{
+	u32 dte_index, pte_index, page_offset;
+	u32 mmu_dte_addr;
+	phys_addr_t mmu_dte_addr_phys, dte_addr_phys;
+	u32 *dte_addr;
+	u32 dte;
+	phys_addr_t pte_addr_phys = 0;
+	u32 *pte_addr = NULL;
+	u32 pte = 0;
+	phys_addr_t page_addr_phys = 0;
+	u32 page_flags = 0;
+
+	dte_index = rk_iova_dte_index(iova);
+	pte_index = rk_iova_pte_index(iova);
+	page_offset = rk_iova_page_offset(iova);
+
+	mmu_dte_addr = rk_iommu_read(iommu, RK_MMU_DTE_ADDR);
+	mmu_dte_addr_phys = (phys_addr_t)mmu_dte_addr;
+
+	dte_addr_phys = mmu_dte_addr_phys + (4 * dte_index);
+	dte_addr = phys_to_virt(dte_addr_phys);
+	dte = *dte_addr;
+
+	if (!rk_dte_is_pt_valid(dte))
+		goto print_it;
+
+	pte_addr_phys = rk_dte_pt_address(dte) + (pte_index * 4);
+	pte_addr = phys_to_virt(pte_addr_phys);
+	pte = *pte_addr;
+
+	if (!rk_pte_is_page_valid(pte))
+		goto print_it;
+
+	page_addr_phys = rk_pte_page_address(pte) + page_offset;
+	page_flags = pte & RK_PTE_PAGE_FLAGS_MASK;
+
+print_it:
+	dev_err(iommu->dev, "iova = %pad: dte_index: 0x%03x pte_index: 0x%03x page_offset: 0x%03x\n",
+		&iova, dte_index, pte_index, page_offset);
+	dev_err(iommu->dev, "mmu_dte_addr: %pa dte@%pa: %#08x valid: %u pte@%pa: %#08x valid: %u page@%pa flags: %#03x\n",
+		&mmu_dte_addr_phys, &dte_addr_phys, dte,
+		rk_dte_is_pt_valid(dte), &pte_addr_phys, pte,
+		rk_pte_is_page_valid(pte), &page_addr_phys, page_flags);
+}
+
+static irqreturn_t rk_iommu_irq(int irq, void *dev_id)
+{
+	struct rk_iommu *iommu = dev_id;
+	u32 status;
+	u32 int_status;
+	dma_addr_t iova;
+
+	int_status = rk_iommu_read(iommu, RK_MMU_INT_STATUS);
+	if (int_status == 0)
+		return IRQ_NONE;
+
+	iova = rk_iommu_read(iommu, RK_MMU_PAGE_FAULT_ADDR);
+
+	if (int_status & RK_MMU_IRQ_PAGE_FAULT) {
+		int flags;
+
+		status = rk_iommu_read(iommu, RK_MMU_STATUS);
+		flags = (status & RK_MMU_STATUS_PAGE_FAULT_IS_WRITE) ?
+				IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
+
+		dev_err(iommu->dev, "Page fault at %pad of type %s\n",
+			&iova,
+			(flags == IOMMU_FAULT_WRITE) ? "write" : "read");
+
+		log_iova(iommu, iova);
+
+		/*
+		 * Report page fault to any installed handlers.
+		 * Ignore the return code, though, since we always zap cache
+		 * and clear the page fault anyway.
+		 */
+		if (iommu->domain)
+			report_iommu_fault(iommu->domain, iommu->dev, iova,
+					   flags);
+		else
+			dev_err(iommu->dev, "Page fault while iommu not attached to domain?\n");
+
+		rk_iommu_command(iommu, RK_MMU_CMD_ZAP_CACHE);
+		rk_iommu_command(iommu, RK_MMU_CMD_PAGE_FAULT_DONE);
+	}
+
+	if (int_status & RK_MMU_IRQ_BUS_ERROR)
+		dev_err(iommu->dev, "BUS_ERROR occurred at %pad\n", &iova);
+
+	if (int_status & ~RK_MMU_IRQ_MASK)
+		dev_err(iommu->dev, "unexpected int_status: %#08x\n",
+			int_status);
+
+	rk_iommu_write(iommu, RK_MMU_INT_CLEAR, int_status);
+
+	return IRQ_HANDLED;
+}
+
+static phys_addr_t rk_iommu_iova_to_phys(struct iommu_domain *domain,
+					 dma_addr_t iova)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	phys_addr_t pt_phys, phys = 0;
+	u32 dte, pte;
+	u32 *page_table;
+
+	spin_lock_irqsave(&rk_domain->dt_lock, flags);
+
+	dte = rk_domain->dt[rk_iova_dte_index(iova)];
+	if (!rk_dte_is_pt_valid(dte))
+		goto out;
+
+	pt_phys = rk_dte_pt_address(dte);
+	page_table = (u32 *)phys_to_virt(pt_phys);
+	pte = page_table[rk_iova_pte_index(iova)];
+	if (!rk_pte_is_page_valid(pte))
+		goto out;
+
+	phys = rk_pte_page_address(pte) + rk_iova_page_offset(iova);
+out:
+	spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+
+	return phys;
+}
+
+static void rk_iommu_zap_iova(struct rk_iommu_domain *rk_domain,
+			      dma_addr_t iova, size_t size)
+{
+	struct list_head *pos;
+	unsigned long flags;
+
+	/* shootdown these iova from all iommus using this domain */
+	spin_lock_irqsave(&rk_domain->iommus_lock, flags);
+	list_for_each(pos, &rk_domain->iommus) {
+		struct rk_iommu *iommu;
+		iommu = list_entry(pos, struct rk_iommu, node);
+		rk_iommu_zap_lines(iommu, iova, size);
+	}
+	spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
+}
+
+static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain,
+				  dma_addr_t iova)
+{
+	u32 *page_table, *dte_addr;
+	u32 dte;
+	phys_addr_t pt_phys;
+
+	assert_spin_locked(&rk_domain->dt_lock);
+
+	dte_addr = &rk_domain->dt[rk_iova_dte_index(iova)];
+	dte = *dte_addr;
+	if (rk_dte_is_pt_valid(dte))
+		goto done;
+
+	page_table = (u32 *)get_zeroed_page(GFP_ATOMIC | GFP_DMA32);
+	if (!page_table)
+		return ERR_PTR(-ENOMEM);
+
+	dte = rk_mk_dte(page_table);
+	*dte_addr = dte;
+
+	rk_table_flush(page_table, NUM_PT_ENTRIES);
+	rk_table_flush(dte_addr, 1);
+
+	/*
+	 * Zap the first iova of newly allocated page table so iommu evicts
+	 * old cached value of new dte from the iotlb.
+	 */
+	rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);
+
+done:
+	pt_phys = rk_dte_pt_address(dte);
+	return (u32 *)phys_to_virt(pt_phys);
+}
+
+static size_t rk_iommu_unmap_iova(struct rk_iommu_domain *rk_domain,
+				  u32 *pte_addr, dma_addr_t iova, size_t size)
+{
+	unsigned int pte_count;
+	unsigned int pte_total = size / SPAGE_SIZE;
+
+	assert_spin_locked(&rk_domain->dt_lock);
+
+	for (pte_count = 0; pte_count < pte_total; pte_count++) {
+		u32 pte = pte_addr[pte_count];
+		if (!rk_pte_is_page_valid(pte))
+			break;
+
+		pte_addr[pte_count] = rk_mk_pte_invalid(pte);
+	}
+
+	rk_table_flush(pte_addr, pte_count);
+
+	return pte_count * SPAGE_SIZE;
+}
+
+static int rk_iommu_map_iova(struct rk_iommu_domain *rk_domain, u32 *pte_addr,
+			     dma_addr_t iova, phys_addr_t paddr, size_t size,
+			     int prot)
+{
+	unsigned int pte_count;
+	unsigned int pte_total = size / SPAGE_SIZE;
+	phys_addr_t page_phys;
+
+	assert_spin_locked(&rk_domain->dt_lock);
+
+	for (pte_count = 0; pte_count < pte_total; pte_count++) {
+		u32 pte = pte_addr[pte_count];
+
+		if (rk_pte_is_page_valid(pte))
+			goto unwind;
+
+		pte_addr[pte_count] = rk_mk_pte(paddr, prot);
+
+		paddr += SPAGE_SIZE;
+	}
+
+	rk_table_flush(pte_addr, pte_count);
+
+	return 0;
+unwind:
+	/* Unmap the range of iovas that we just mapped */
+	rk_iommu_unmap_iova(rk_domain, pte_addr, iova, pte_count * SPAGE_SIZE);
+
+	iova += pte_count * SPAGE_SIZE;
+	page_phys = rk_pte_page_address(pte_addr[pte_count]);
+	pr_err("iova: %pad already mapped to %pa cannot remap to phys: %pa prot:%#x\n",
+	       &iova, &page_phys, &paddr, prot);
+
+	return -EADDRINUSE;
+}
+
+static int rk_iommu_map(struct iommu_domain *domain, unsigned long _iova,
+			phys_addr_t paddr, size_t size, int prot)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	dma_addr_t iova = (dma_addr_t)_iova;
+	u32 *page_table, *pte_addr;
+	int ret;
+
+	spin_lock_irqsave(&rk_domain->dt_lock, flags);
+
+	/*
+	 * pgsize_bitmap specifies iova sizes that fit in one page table
+	 * (1024 4-KiB pages = 4 MiB).
+	 * So, size will always be 4096 <= size <= 4194304.
+	 * Since iommu_map() guarantees that both iova and size will be
+	 * aligned, we will always only be mapping from a single dte here.
+	 */
+	page_table = rk_dte_get_page_table(rk_domain, iova);
+	if (IS_ERR(page_table)) {
+		spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+		return PTR_ERR(page_table);
+	}
+
+	pte_addr = &page_table[rk_iova_pte_index(iova)];
+	ret = rk_iommu_map_iova(rk_domain, pte_addr, iova, paddr, size, prot);
+	spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+
+	return ret;
+}
+
+static size_t rk_iommu_unmap(struct iommu_domain *domain, unsigned long _iova,
+			     size_t size)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	dma_addr_t iova = (dma_addr_t)_iova;
+	phys_addr_t pt_phys;
+	u32 dte;
+	u32 *pte_addr;
+	size_t unmap_size;
+
+	spin_lock_irqsave(&rk_domain->dt_lock, flags);
+
+	/*
+	 * pgsize_bitmap specifies iova sizes that fit in one page table
+	 * (1024 4-KiB pages = 4 MiB).
+	 * So, size will always be 4096 <= size <= 4194304.
+	 * Since iommu_unmap() guarantees that both iova and size will be
+	 * aligned, we will always only be unmapping from a single dte here.
+	 */
+	dte = rk_domain->dt[rk_iova_dte_index(iova)];
+	/* Just return 0 if iova is unmapped */
+	if (!rk_dte_is_pt_valid(dte)) {
+		spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+		return 0;
+	}
+
+	pt_phys = rk_dte_pt_address(dte);
+	pte_addr = (u32 *)phys_to_virt(pt_phys) + rk_iova_pte_index(iova);
+	unmap_size = rk_iommu_unmap_iova(rk_domain, pte_addr, iova, size);
+
+	spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+
+	/* Shootdown iotlb entries for iova range that was just unmapped */
+	rk_iommu_zap_iova(rk_domain, iova, unmap_size);
+
+	return unmap_size;
+}
+
+static int rk_iommu_attach_device(struct iommu_domain *domain,
+				  struct device *dev)
+{
+	struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	int ret;
+	phys_addr_t dte_addr;
+
+	/*
+	 * Allow 'virtual devices' (e.g., drm) to attach to domain.
+	 * Such a device has a NULL archdata.iommu.
+	 */
+	if (!iommu)
+		return 0;
+
+	ret = rk_iommu_enable_stall(iommu);
+	if (ret)
+		return ret;
+
+	ret = rk_iommu_force_reset(iommu);
+	if (ret)
+		return ret;
+
+	iommu->domain = domain;
+
+	ret = devm_request_irq(dev, iommu->irq, rk_iommu_irq,
+			       IRQF_SHARED, dev_name(dev), iommu);
+	if (ret)
+		return ret;
+
+	dte_addr = virt_to_phys(rk_domain->dt);
+	rk_iommu_write(iommu, RK_MMU_DTE_ADDR, dte_addr);
+	rk_iommu_command(iommu, RK_MMU_CMD_ZAP_CACHE);
+	rk_iommu_write(iommu, RK_MMU_INT_MASK, RK_MMU_IRQ_MASK);
+
+	ret = rk_iommu_enable_paging(iommu);
+	if (ret)
+		return ret;
+
+	spin_lock_irqsave(&rk_domain->iommus_lock, flags);
+	list_add_tail(&iommu->node, &rk_domain->iommus);
+	spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
+
+	dev_info(dev, "Attached to iommu domain\n");
+
+	rk_iommu_disable_stall(iommu);
+
+	return 0;
+}
+
+static void rk_iommu_detach_device(struct iommu_domain *domain,
+				   struct device *dev)
+{
+	struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+
+	/* Allow 'virtual devices' (eg drm) to detach from domain */
+	if (!iommu)
+		return;
+
+	spin_lock_irqsave(&rk_domain->iommus_lock, flags);
+	list_del_init(&iommu->node);
+	spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
+
+	/* Ignore error while disabling, just keep going */
+	rk_iommu_enable_stall(iommu);
+	rk_iommu_disable_paging(iommu);
+	rk_iommu_write(iommu, RK_MMU_INT_MASK, 0);
+	rk_iommu_write(iommu, RK_MMU_DTE_ADDR, 0);
+	rk_iommu_disable_stall(iommu);
+
+	devm_free_irq(dev, iommu->irq, iommu);
+
+	iommu->domain = NULL;
+
+	dev_info(dev, "Detached from iommu domain\n");
+}
+
+static int rk_iommu_domain_init(struct iommu_domain *domain)
+{
+	struct rk_iommu_domain *rk_domain;
+
+	rk_domain = kzalloc(sizeof(*rk_domain), GFP_KERNEL);
+	if (!rk_domain)
+		return -ENOMEM;
+
+	/*
+	 * rk32xx iommus use a 2 level pagetable.
+	 * Each level1 (dt) and level2 (pt) table has 1024 4-byte entries.
+	 * Allocate one 4 KiB page for each table.
+	 */
+	rk_domain->dt = (u32 *)get_zeroed_page(GFP_KERNEL | GFP_DMA32);
+	if (!rk_domain->dt)
+		goto err_dt;
+
+	rk_table_flush(rk_domain->dt, NUM_DT_ENTRIES);
+
+	spin_lock_init(&rk_domain->iommus_lock);
+	spin_lock_init(&rk_domain->dt_lock);
+	INIT_LIST_HEAD(&rk_domain->iommus);
+
+	domain->priv = rk_domain;
+
+	return 0;
+err_dt:
+	kfree(rk_domain);
+	return -ENOMEM;
+}
+
+static void rk_iommu_domain_destroy(struct iommu_domain *domain)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	int i;
+
+	WARN_ON(!list_empty(&rk_domain->iommus));
+
+	for (i = 0; i < NUM_DT_ENTRIES; i++) {
+		u32 dte = rk_domain->dt[i];
+		if (rk_dte_is_pt_valid(dte)) {
+			phys_addr_t pt_phys = rk_dte_pt_address(dte);
+			u32 *page_table = phys_to_virt(pt_phys);
+			free_page((unsigned long)page_table);
+		}
+	}
+
+	free_page((unsigned long)rk_domain->dt);
+	kfree(domain->priv);
+	domain->priv = NULL;
+}
+
+static const struct iommu_ops rk_iommu_ops = {
+	.domain_init = rk_iommu_domain_init,
+	.domain_destroy = rk_iommu_domain_destroy,
+	.attach_dev = rk_iommu_attach_device,
+	.detach_dev = rk_iommu_detach_device,
+	.map = rk_iommu_map,
+	.unmap = rk_iommu_unmap,
+	.iova_to_phys = rk_iommu_iova_to_phys,
+	.pgsize_bitmap = RK_IOMMU_PGSIZE_BITMAP,
+};
+
+static int rk_iommu_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct rk_iommu *iommu;
+	struct resource *res;
+
+	iommu = devm_kzalloc(dev, sizeof(*iommu), GFP_KERNEL);
+	if (!iommu)
+		return -ENOMEM;
+
+	platform_set_drvdata(pdev, iommu);
+	iommu->dev = dev;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	iommu->base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(iommu->base))
+		return PTR_ERR(iommu->base);
+
+	iommu->irq = platform_get_irq(pdev, 0);
+	if (iommu->irq < 0) {
+		dev_err(dev, "Failed to get IRQ, %d\n", iommu->irq);
+		return -ENXIO;
+	}
+
+	return 0;
+}
+
+static int rk_iommu_remove(struct platform_device *pdev)
+{
+	return 0;
+}
+
+#ifdef CONFIG_OF
+static const struct of_device_id rk_iommu_dt_ids[] = {
+	{ .compatible = "rockchip,iommu" },
+	{ /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, rk_iommu_dt_ids);
+#endif
+
+static struct platform_driver rk_iommu_driver = {
+	.probe = rk_iommu_probe,
+	.remove = rk_iommu_remove,
+	.driver = {
+		   .name = "rk_iommu",
+		   .owner = THIS_MODULE,
+		   .of_match_table = of_match_ptr(rk_iommu_dt_ids),
+	},
+};
+
+static int __init rk_iommu_init(void)
+{
+	int ret;
+
+	ret = bus_set_iommu(&platform_bus_type, &rk_iommu_ops);
+	if (ret)
+		return ret;
+
+	return platform_driver_register(&rk_iommu_driver);
+}
+static void __exit rk_iommu_exit(void)
+{
+	platform_driver_unregister(&rk_iommu_driver);
+}
+
+subsys_initcall(rk_iommu_init);
+module_exit(rk_iommu_exit);
+
+MODULE_DESCRIPTION("IOMMU API for Rockchip");
+MODULE_AUTHOR("Simon Xue <xxm@rock-chips.com> and Daniel Kurtz <djkurtz@chromium.org>");
+MODULE_ALIAS("platform:rockchip-iommu");
+MODULE_LICENSE("GPL v2");
-- 
2.1.0.rc2.206.gedb03e5



* [PATCH v5 1/3] iommu/rockchip: rk3288 iommu driver
@ 2014-10-24  7:33   ` Daniel Kurtz
  0 siblings, 0 replies; 26+ messages in thread
From: Daniel Kurtz @ 2014-10-24  7:33 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: open list:OPEN FIRMWARE AND...,
	Heiko Stuebner, open list, Daniel Kurtz, open list:IOMMU DRIVERS,
	Rob Herring, Grant Likely, moderated list:ARM/Rockchip SoC...,
	Simon Xue

The rk3288 has several iommus.  Each iommu belongs to a single master
device.  There is one device (ISP) that has two slave iommus, but that
case is not yet supported by this driver.

At subsys init, the iommu driver registers itself as the iommu driver for
the platform bus.  The master devices find their slave iommus using the
"iommus" field in their devicetree description.  Since each slave iommu
belongs to exactly one master, there is no additional data needed at probe
to associate a slave with its master.

An iommu device's power domain, clock and irq are all shared with its
master device, and the master device must be careful to attach to the
iommu only after powering and clocking it (and to keep it powered and
clocked until it has detached).  Because there is no guarantee what the
status of the iommu is at probe, and since the driver does not even know
if the device is powered, we delay requesting its irq until the master
device attaches, at which point we have a guarantee that the device is
powered and clocked and we can reset it and disable its interrupt mask.

An iommu_domain describes a virtual iova address space.  Each iommu_domain
has a corresponding page table that lists the mappings from iova to
physical address.

For the rk3288 iommu, the page table has two levels:
 The Level 1 "directory_table" has 1024 4-byte dte entries.
 Each dte points to a level 2 "page_table".
 Each level 2 page_table has 1024 4-byte pte entries.
 Each pte points to a 4 KiB page of memory.

An iommu_domain is created when a dma_iommu_mapping is created via
arm_iommu_create_mapping().  Master devices can then attach themselves to
this mapping (or attach the mapping to themselves?) by calling
arm_iommu_attach_device().  This in turn instructs the iommu driver to
write the page table's physical address into the slave iommu's Directory
Table address (MMU_DTE_ADDR) register.

In fact, multiple master devices, each with its own slave iommu device,
can all attach to the same mapping.  The iommus for these devices will
share the same iommu_domain and therefore point to the same page table.
Thus, the iommu domain maintains a list of iommu devices which are
attached.  This driver relies on the iommu core to ensure that all devices
have detached before destroying a domain.

Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Simon Xue <xxm@rock-chips.com>
Reviewed-by: Grant Grundler <grundler@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
---
 drivers/iommu/Kconfig          |  12 +
 drivers/iommu/Makefile         |   1 +
 drivers/iommu/rockchip-iommu.c | 922 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 935 insertions(+)
 create mode 100644 drivers/iommu/rockchip-iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index dd51122..d0a1261 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -152,6 +152,18 @@ config OMAP_IOMMU_DEBUG
 
          Say N unless you know you need this.
 
+config ROCKCHIP_IOMMU
+	bool "Rockchip IOMMU Support"
+	depends on ARCH_ROCKCHIP
+	select IOMMU_API
+	select ARM_DMA_USE_IOMMU
+	help
+	  Support for IOMMUs found on Rockchip rk32xx SoCs.
+	  These IOMMUs allow virtualization of the address space used by most
+	  cores within the multimedia subsystem.
+	  Say Y here if you are using a Rockchip SoC that includes an IOMMU
+	  device.
+
 config TEGRA_IOMMU_GART
 	bool "Tegra GART IOMMU Support"
 	depends on ARCH_TEGRA_2x_SOC
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 16edef7..3e47ef3 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -13,6 +13,7 @@ obj-$(CONFIG_IRQ_REMAP) += intel_irq_remapping.o irq_remapping.o
 obj-$(CONFIG_OMAP_IOMMU) += omap-iommu.o
 obj-$(CONFIG_OMAP_IOMMU) += omap-iommu2.o
 obj-$(CONFIG_OMAP_IOMMU_DEBUG) += omap-iommu-debug.o
+obj-$(CONFIG_ROCKCHIP_IOMMU) += rockchip-iommu.o
 obj-$(CONFIG_TEGRA_IOMMU_GART) += tegra-gart.o
 obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
new file mode 100644
index 0000000..56ffb76
--- /dev/null
+++ b/drivers/iommu/rockchip-iommu.c
@@ -0,0 +1,922 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <asm/cacheflush.h>
+#include <asm/pgtable.h>
+#include <linux/compiler.h>
+#include <linux/delay.h>
+#include <linux/device.h>
+#include <linux/errno.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/jiffies.h>
+#include <linux/list.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+/* MMU register offsets */
+#define RK_MMU_DTE_ADDR		0x00	/* Directory table address */
+#define RK_MMU_STATUS		0x04
+#define RK_MMU_COMMAND		0x08
+#define RK_MMU_PAGE_FAULT_ADDR	0x0C	/* IOVA of last page fault */
+#define RK_MMU_ZAP_ONE_LINE	0x10	/* Shootdown one IOTLB entry */
+#define RK_MMU_INT_RAWSTAT	0x14	/* IRQ status ignoring mask */
+#define RK_MMU_INT_CLEAR	0x18	/* Acknowledge and re-arm irq */
+#define RK_MMU_INT_MASK		0x1C	/* IRQ enable */
+#define RK_MMU_INT_STATUS	0x20	/* IRQ status after masking */
+#define RK_MMU_AUTO_GATING	0x24
+
+#define DTE_ADDR_DUMMY		0xCAFEBABE
+#define FORCE_RESET_TIMEOUT	100	/* ms */
+
+/* RK_MMU_STATUS fields */
+#define RK_MMU_STATUS_PAGING_ENABLED       BIT(0)
+#define RK_MMU_STATUS_PAGE_FAULT_ACTIVE    BIT(1)
+#define RK_MMU_STATUS_STALL_ACTIVE         BIT(2)
+#define RK_MMU_STATUS_IDLE                 BIT(3)
+#define RK_MMU_STATUS_REPLAY_BUFFER_EMPTY  BIT(4)
+#define RK_MMU_STATUS_PAGE_FAULT_IS_WRITE  BIT(5)
+#define RK_MMU_STATUS_STALL_NOT_ACTIVE     BIT(31)
+
+/* RK_MMU_COMMAND command values */
+#define RK_MMU_CMD_ENABLE_PAGING    0  /* Enable memory translation */
+#define RK_MMU_CMD_DISABLE_PAGING   1  /* Disable memory translation */
+#define RK_MMU_CMD_ENABLE_STALL     2  /* Stall paging to allow other cmds */
+#define RK_MMU_CMD_DISABLE_STALL    3  /* Stop stall re-enables paging */
+#define RK_MMU_CMD_ZAP_CACHE        4  /* Shoot down entire IOTLB */
+#define RK_MMU_CMD_PAGE_FAULT_DONE  5  /* Clear page fault */
+#define RK_MMU_CMD_FORCE_RESET      6  /* Reset all registers */
+
+/* RK_MMU_INT_* register fields */
+#define RK_MMU_IRQ_PAGE_FAULT    0x01  /* page fault */
+#define RK_MMU_IRQ_BUS_ERROR     0x02  /* bus read error */
+#define RK_MMU_IRQ_MASK          (RK_MMU_IRQ_PAGE_FAULT | RK_MMU_IRQ_BUS_ERROR)
+
+#define NUM_DT_ENTRIES 1024
+#define NUM_PT_ENTRIES 1024
+
+#define SPAGE_ORDER 12
+#define SPAGE_SIZE (1 << SPAGE_ORDER)
+
+/*
+ * Support mapping any size that fits in one page table:
+ *   4 KiB to 4 MiB
+ */
+#define RK_IOMMU_PGSIZE_BITMAP 0x007ff000
+
+#define IOMMU_REG_POLL_COUNT_FAST 1000
+
+struct rk_iommu_domain {
+	struct list_head iommus;
+	u32 *dt; /* page directory table */
+	spinlock_t iommus_lock; /* lock for iommus list */
+	spinlock_t dt_lock; /* lock for modifying page directory table */
+};
+
+struct rk_iommu {
+	struct device *dev;
+	void __iomem *base;
+	int irq;
+	struct list_head node; /* entry in rk_iommu_domain.iommus */
+	struct iommu_domain *domain; /* domain to which iommu is attached */
+};
+
+static inline void rk_table_flush(u32 *va, unsigned int count)
+{
+	phys_addr_t pa_start = virt_to_phys(va);
+	phys_addr_t pa_end = virt_to_phys(va + count);
+	size_t size = pa_end - pa_start;
+
+	__cpuc_flush_dcache_area(va, size);
+	outer_flush_range(pa_start, pa_end);
+}
+
+/*
+ * Inspired by _wait_for in intel_drv.h
+ * This is NOT safe for use in interrupt context.
+ *
+ * Note that it's important that we check the condition again after having
+ * timed out, since the timeout could be due to preemption or similar and
+ * we've never had a chance to check the condition before the timeout.
+ */
+#define rk_wait_for(COND, MS) ({ \
+	unsigned long timeout__ = jiffies + msecs_to_jiffies(MS) + 1;	\
+	int ret__ = 0;							\
+	while (!(COND)) {						\
+		if (time_after(jiffies, timeout__)) {			\
+			ret__ = (COND) ? 0 : -ETIMEDOUT;		\
+			break;						\
+		}							\
+		usleep_range(50, 100);					\
+	}								\
+	ret__;								\
+})
+
+/*
+ * The Rockchip rk3288 iommu uses a 2-level page table.
+ * The first level is the "Directory Table" (DT).
+ * The DT consists of 1024 4-byte Directory Table Entries (DTEs), each pointing
+ * to a "Page Table".
+ * The second level is the 1024 Page Tables (PT).
+ * Each PT consists of 1024 4-byte Page Table Entries (PTEs), each pointing to
+ * a 4 KB page of physical memory.
+ *
+ * The DT and each PT fit in a single 4 KB page (4 bytes * 1024 entries).
+ * Each iommu device has a MMU_DTE_ADDR register that contains the physical
+ * address of the start of the DT page.
+ *
+ * The structure of the page table is as follows:
+ *
+ *                   DT
+ * MMU_DTE_ADDR -> +-----+
+ *                 |     |
+ *                 +-----+     PT
+ *                 | DTE | -> +-----+
+ *                 +-----+    |     |     Memory
+ *                 |     |    +-----+     Page
+ *                 |     |    | PTE | -> +-----+
+ *                 +-----+    +-----+    |     |
+ *                            |     |    |     |
+ *                            |     |    |     |
+ *                            +-----+    |     |
+ *                                       |     |
+ *                                       |     |
+ *                                       +-----+
+ */
+
+/*
+ * Each DTE has a PT address and a valid bit:
+ * +---------------------+-----------+-+
+ * | PT address          | Reserved  |V|
+ * +---------------------+-----------+-+
+ *  31:12 - PT address (PTs always start on a 4 KB boundary)
+ *  11: 1 - Reserved
+ *      0 - 1 if PT @ PT address is valid
+ */
+#define RK_DTE_PT_ADDRESS_MASK    0xfffff000
+#define RK_DTE_PT_VALID           BIT(0)
+
+static inline phys_addr_t rk_dte_pt_address(u32 dte)
+{
+	return (phys_addr_t)dte & RK_DTE_PT_ADDRESS_MASK;
+}
+
+static inline bool rk_dte_is_pt_valid(u32 dte)
+{
+	return dte & RK_DTE_PT_VALID;
+}
+
+static u32 rk_mk_dte(u32 *pt)
+{
+	phys_addr_t pt_phys = virt_to_phys(pt);
+	return (pt_phys & RK_DTE_PT_ADDRESS_MASK) | RK_DTE_PT_VALID;
+}
+
+/*
+ * Each PTE has a Page address, some flags and a valid bit:
+ * +---------------------+---+-------+-+
+ * | Page address        |Rsv| Flags |V|
+ * +---------------------+---+-------+-+
+ *  31:12 - Page address (Pages always start on a 4 KB boundary)
+ *  11: 9 - Reserved
+ *   8: 1 - Flags
+ *      8 - Read allocate - allocate cache space on read misses
+ *      7 - Read cache - enable cache & prefetch of data
+ *      6 - Write buffer - enable delaying writes on their way to memory
+ *      5 - Write allocate - allocate cache space on write misses
+ *      4 - Write cache - different writes can be merged together
+ *      3 - Override cache attributes
+ *          if 1, bits 4-8 control cache attributes
+ *          if 0, the system bus defaults are used
+ *      2 - Writable
+ *      1 - Readable
+ *      0 - 1 if Page @ Page address is valid
+ */
+#define RK_PTE_PAGE_ADDRESS_MASK  0xfffff000
+#define RK_PTE_PAGE_FLAGS_MASK    0x000001fe
+#define RK_PTE_PAGE_WRITABLE      BIT(2)
+#define RK_PTE_PAGE_READABLE      BIT(1)
+#define RK_PTE_PAGE_VALID         BIT(0)
+
+static inline phys_addr_t rk_pte_page_address(u32 pte)
+{
+	return (phys_addr_t)pte & RK_PTE_PAGE_ADDRESS_MASK;
+}
+
+static inline bool rk_pte_is_page_valid(u32 pte)
+{
+	return pte & RK_PTE_PAGE_VALID;
+}
+
+/* TODO: set cache flags per prot IOMMU_CACHE */
+static u32 rk_mk_pte(phys_addr_t page, int prot)
+{
+	u32 flags = 0;
+	flags |= (prot & IOMMU_READ) ? RK_PTE_PAGE_READABLE : 0;
+	flags |= (prot & IOMMU_WRITE) ? RK_PTE_PAGE_WRITABLE : 0;
+	page &= RK_PTE_PAGE_ADDRESS_MASK;
+	return page | flags | RK_PTE_PAGE_VALID;
+}
+
+static u32 rk_mk_pte_invalid(u32 pte)
+{
+	return pte & ~RK_PTE_PAGE_VALID;
+}
+
+/*
+ * rk3288 iova (IOMMU Virtual Address) format
+ *  31       22.21       12.11          0
+ * +-----------+-----------+-------------+
+ * | DTE index | PTE index | Page offset |
+ * +-----------+-----------+-------------+
+ *  31:22 - DTE index   - index of DTE in DT
+ *  21:12 - PTE index   - index of PTE in PT @ DTE.pt_address
+ *  11: 0 - Page offset - offset into page @ PTE.page_address
+ */
+#define RK_IOVA_DTE_MASK    0xffc00000
+#define RK_IOVA_DTE_SHIFT   22
+#define RK_IOVA_PTE_MASK    0x003ff000
+#define RK_IOVA_PTE_SHIFT   12
+#define RK_IOVA_PAGE_MASK   0x00000fff
+#define RK_IOVA_PAGE_SHIFT  0
+
+static u32 rk_iova_dte_index(dma_addr_t iova)
+{
+	return (u32)(iova & RK_IOVA_DTE_MASK) >> RK_IOVA_DTE_SHIFT;
+}
+
+static u32 rk_iova_pte_index(dma_addr_t iova)
+{
+	return (u32)(iova & RK_IOVA_PTE_MASK) >> RK_IOVA_PTE_SHIFT;
+}
+
+static u32 rk_iova_page_offset(dma_addr_t iova)
+{
+	return (u32)(iova & RK_IOVA_PAGE_MASK) >> RK_IOVA_PAGE_SHIFT;
+}
+
+static u32 rk_iommu_read(struct rk_iommu *iommu, u32 offset)
+{
+	return readl(iommu->base + offset);
+}
+
+static void rk_iommu_write(struct rk_iommu *iommu, u32 offset, u32 value)
+{
+	writel(value, iommu->base + offset);
+}
+
+static void rk_iommu_command(struct rk_iommu *iommu, u32 command)
+{
+	writel(command, iommu->base + RK_MMU_COMMAND);
+}
+
+static void rk_iommu_zap_lines(struct rk_iommu *iommu, dma_addr_t iova,
+			       size_t size)
+{
+	dma_addr_t iova_end = iova + size;
+	/*
+	 * TODO(djkurtz): Figure out when it is more efficient to shootdown the
+	 * entire iotlb rather than iterate over individual iovas.
+	 */
+	for (; iova < iova_end; iova += SPAGE_SIZE)
+		rk_iommu_write(iommu, RK_MMU_ZAP_ONE_LINE, iova);
+}
+
+static bool rk_iommu_is_stall_active(struct rk_iommu *iommu)
+{
+	return rk_iommu_read(iommu, RK_MMU_STATUS) & RK_MMU_STATUS_STALL_ACTIVE;
+}
+
+static bool rk_iommu_is_paging_enabled(struct rk_iommu *iommu)
+{
+	return rk_iommu_read(iommu, RK_MMU_STATUS) &
+			     RK_MMU_STATUS_PAGING_ENABLED;
+}
+
+static int rk_iommu_enable_stall(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (rk_iommu_is_stall_active(iommu))
+		return 0;
+
+	/* Stall can only be enabled if paging is enabled */
+	if (!rk_iommu_is_paging_enabled(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_ENABLE_STALL);
+
+	ret = rk_wait_for(rk_iommu_is_stall_active(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Enable stall request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_disable_stall(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (!rk_iommu_is_stall_active(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_DISABLE_STALL);
+
+	ret = rk_wait_for(!rk_iommu_is_stall_active(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Disable stall request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_enable_paging(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (rk_iommu_is_paging_enabled(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_ENABLE_PAGING);
+
+	ret = rk_wait_for(rk_iommu_is_paging_enabled(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Enable paging request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_disable_paging(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (!rk_iommu_is_paging_enabled(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_DISABLE_PAGING);
+
+	ret = rk_wait_for(!rk_iommu_is_paging_enabled(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Disable paging request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_force_reset(struct rk_iommu *iommu)
+{
+	int ret;
+	u32 dte_addr;
+
+	/*
+	 * Check if register DTE_ADDR is working by writing DTE_ADDR_DUMMY
+	 * and verifying that upper 5 nybbles are read back.
+	 */
+	rk_iommu_write(iommu, RK_MMU_DTE_ADDR, DTE_ADDR_DUMMY);
+
+	dte_addr = rk_iommu_read(iommu, RK_MMU_DTE_ADDR);
+	if (dte_addr != (DTE_ADDR_DUMMY & RK_DTE_PT_ADDRESS_MASK)) {
+		dev_err(iommu->dev, "Error during raw reset. MMU_DTE_ADDR is not functioning\n");
+		return -EFAULT;
+	}
+
+	rk_iommu_command(iommu, RK_MMU_CMD_FORCE_RESET);
+
+	ret = rk_wait_for(rk_iommu_read(iommu, RK_MMU_DTE_ADDR) == 0x00000000,
+			  FORCE_RESET_TIMEOUT);
+	if (ret)
+		dev_err(iommu->dev, "FORCE_RESET command timed out\n");
+
+	return ret;
+}
+
+static void log_iova(struct rk_iommu *iommu, dma_addr_t iova)
+{
+	u32 dte_index, pte_index, page_offset;
+	u32 mmu_dte_addr;
+	phys_addr_t mmu_dte_addr_phys, dte_addr_phys;
+	u32 *dte_addr;
+	u32 dte;
+	phys_addr_t pte_addr_phys = 0;
+	u32 *pte_addr = NULL;
+	u32 pte = 0;
+	phys_addr_t page_addr_phys = 0;
+	u32 page_flags = 0;
+
+	dte_index = rk_iova_dte_index(iova);
+	pte_index = rk_iova_pte_index(iova);
+	page_offset = rk_iova_page_offset(iova);
+
+	mmu_dte_addr = rk_iommu_read(iommu, RK_MMU_DTE_ADDR);
+	mmu_dte_addr_phys = (phys_addr_t)mmu_dte_addr;
+
+	dte_addr_phys = mmu_dte_addr_phys + (4 * dte_index);
+	dte_addr = phys_to_virt(dte_addr_phys);
+	dte = *dte_addr;
+
+	if (!rk_dte_is_pt_valid(dte))
+		goto print_it;
+
+	pte_addr_phys = rk_dte_pt_address(dte) + (pte_index * 4);
+	pte_addr = phys_to_virt(pte_addr_phys);
+	pte = *pte_addr;
+
+	if (!rk_pte_is_page_valid(pte))
+		goto print_it;
+
+	page_addr_phys = rk_pte_page_address(pte) + page_offset;
+	page_flags = pte & RK_PTE_PAGE_FLAGS_MASK;
+
+print_it:
+	dev_err(iommu->dev, "iova = %pad: dte_index: 0x%03x pte_index: 0x%03x page_offset: 0x%03x\n",
+		&iova, dte_index, pte_index, page_offset);
+	dev_err(iommu->dev, "mmu_dte_addr: %pa dte@%pa: %#08x valid: %u pte@%pa: %#08x valid: %u page@%pa flags: %#03x\n",
+		&mmu_dte_addr_phys, &dte_addr_phys, dte,
+		rk_dte_is_pt_valid(dte), &pte_addr_phys, pte,
+		rk_pte_is_page_valid(pte), &page_addr_phys, page_flags);
+}
+
+static irqreturn_t rk_iommu_irq(int irq, void *dev_id)
+{
+	struct rk_iommu *iommu = dev_id;
+	u32 status;
+	u32 int_status;
+	dma_addr_t iova;
+
+	int_status = rk_iommu_read(iommu, RK_MMU_INT_STATUS);
+	if (int_status == 0)
+		return IRQ_NONE;
+
+	iova = rk_iommu_read(iommu, RK_MMU_PAGE_FAULT_ADDR);
+
+	if (int_status & RK_MMU_IRQ_PAGE_FAULT) {
+		int flags;
+
+		status = rk_iommu_read(iommu, RK_MMU_STATUS);
+		flags = (status & RK_MMU_STATUS_PAGE_FAULT_IS_WRITE) ?
+				IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
+
+		dev_err(iommu->dev, "Page fault at %pad of type %s\n",
+			&iova,
+			(flags == IOMMU_FAULT_WRITE) ? "write" : "read");
+
+		log_iova(iommu, iova);
+
+		/*
+		 * Report page fault to any installed handlers.
+		 * Ignore the return code, though, since we always zap cache
+		 * and clear the page fault anyway.
+		 */
+		if (iommu->domain)
+			report_iommu_fault(iommu->domain, iommu->dev, iova,
+					   flags);
+		else
+			dev_err(iommu->dev, "Page fault while iommu not attached to domain?\n");
+
+		rk_iommu_command(iommu, RK_MMU_CMD_ZAP_CACHE);
+		rk_iommu_command(iommu, RK_MMU_CMD_PAGE_FAULT_DONE);
+	}
+
+	if (int_status & RK_MMU_IRQ_BUS_ERROR)
+		dev_err(iommu->dev, "BUS_ERROR occurred at %pad\n", &iova);
+
+	if (int_status & ~RK_MMU_IRQ_MASK)
+		dev_err(iommu->dev, "unexpected int_status: %#08x\n",
+			int_status);
+
+	rk_iommu_write(iommu, RK_MMU_INT_CLEAR, int_status);
+
+	return IRQ_HANDLED;
+}
+
+static phys_addr_t rk_iommu_iova_to_phys(struct iommu_domain *domain,
+					 dma_addr_t iova)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	phys_addr_t pt_phys, phys = 0;
+	u32 dte, pte;
+	u32 *page_table;
+
+	spin_lock_irqsave(&rk_domain->dt_lock, flags);
+
+	dte = rk_domain->dt[rk_iova_dte_index(iova)];
+	if (!rk_dte_is_pt_valid(dte))
+		goto out;
+
+	pt_phys = rk_dte_pt_address(dte);
+	page_table = (u32 *)phys_to_virt(pt_phys);
+	pte = page_table[rk_iova_pte_index(iova)];
+	if (!rk_pte_is_page_valid(pte))
+		goto out;
+
+	phys = rk_pte_page_address(pte) + rk_iova_page_offset(iova);
+out:
+	spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+
+	return phys;
+}
+
+static void rk_iommu_zap_iova(struct rk_iommu_domain *rk_domain,
+			      dma_addr_t iova, size_t size)
+{
+	struct list_head *pos;
+	unsigned long flags;
+
+	/* shootdown these iova from all iommus using this domain */
+	spin_lock_irqsave(&rk_domain->iommus_lock, flags);
+	list_for_each(pos, &rk_domain->iommus) {
+		struct rk_iommu *iommu;
+		iommu = list_entry(pos, struct rk_iommu, node);
+		rk_iommu_zap_lines(iommu, iova, size);
+	}
+	spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
+}
+
+static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain,
+				  dma_addr_t iova)
+{
+	u32 *page_table, *dte_addr;
+	u32 dte;
+	phys_addr_t pt_phys;
+
+	assert_spin_locked(&rk_domain->dt_lock);
+
+	dte_addr = &rk_domain->dt[rk_iova_dte_index(iova)];
+	dte = *dte_addr;
+	if (rk_dte_is_pt_valid(dte))
+		goto done;
+
+	page_table = (u32 *)get_zeroed_page(GFP_ATOMIC | GFP_DMA32);
+	if (!page_table)
+		return ERR_PTR(-ENOMEM);
+
+	dte = rk_mk_dte(page_table);
+	*dte_addr = dte;
+
+	rk_table_flush(page_table, NUM_PT_ENTRIES);
+	rk_table_flush(dte_addr, 1);
+
+	/*
+	 * Zap the first iova of newly allocated page table so iommu evicts
+	 * old cached value of new dte from the iotlb.
+	 */
+	rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);
+
+done:
+	pt_phys = rk_dte_pt_address(dte);
+	return (u32 *)phys_to_virt(pt_phys);
+}
+
+static size_t rk_iommu_unmap_iova(struct rk_iommu_domain *rk_domain,
+				  u32 *pte_addr, dma_addr_t iova, size_t size)
+{
+	unsigned int pte_count;
+	unsigned int pte_total = size / SPAGE_SIZE;
+
+	assert_spin_locked(&rk_domain->dt_lock);
+
+	for (pte_count = 0; pte_count < pte_total; pte_count++) {
+		u32 pte = pte_addr[pte_count];
+		if (!rk_pte_is_page_valid(pte))
+			break;
+
+		pte_addr[pte_count] = rk_mk_pte_invalid(pte);
+	}
+
+	rk_table_flush(pte_addr, pte_count);
+
+	return pte_count * SPAGE_SIZE;
+}
+
+static int rk_iommu_map_iova(struct rk_iommu_domain *rk_domain, u32 *pte_addr,
+			     dma_addr_t iova, phys_addr_t paddr, size_t size,
+			     int prot)
+{
+	unsigned int pte_count;
+	unsigned int pte_total = size / SPAGE_SIZE;
+	phys_addr_t page_phys;
+
+	assert_spin_locked(&rk_domain->dt_lock);
+
+	for (pte_count = 0; pte_count < pte_total; pte_count++) {
+		u32 pte = pte_addr[pte_count];
+
+		if (rk_pte_is_page_valid(pte))
+			goto unwind;
+
+		pte_addr[pte_count] = rk_mk_pte(paddr, prot);
+
+		paddr += SPAGE_SIZE;
+	}
+
+	rk_table_flush(pte_addr, pte_count);
+
+	return 0;
+unwind:
+	/* Unmap the range of iovas that we just mapped */
+	rk_iommu_unmap_iova(rk_domain, pte_addr, iova, pte_count * SPAGE_SIZE);
+
+	iova += pte_count * SPAGE_SIZE;
+	page_phys = rk_pte_page_address(pte_addr[pte_count]);
+	pr_err("iova: %pad already mapped to %pa cannot remap to phys: %pa prot:%#x\n",
+	       &iova, &page_phys, &paddr, prot);
+
+	return -EADDRINUSE;
+}
+
+static int rk_iommu_map(struct iommu_domain *domain, unsigned long _iova,
+			phys_addr_t paddr, size_t size, int prot)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	dma_addr_t iova = (dma_addr_t)_iova;
+	u32 *page_table, *pte_addr;
+	int ret;
+
+	spin_lock_irqsave(&rk_domain->dt_lock, flags);
+
+	/*
+	 * pgsize_bitmap specifies iova sizes that fit in one page table
+	 * (1024 4-KiB pages = 4 MiB).
+	 * So, size will always be 4096 <= size <= 4194304.
+	 * Since iommu_map() guarantees that both iova and size will be
+	 * aligned, we will always only be mapping from a single dte here.
+	 */
+	page_table = rk_dte_get_page_table(rk_domain, iova);
+	if (IS_ERR(page_table)) {
+		spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+		return PTR_ERR(page_table);
+	}
+
+	pte_addr = &page_table[rk_iova_pte_index(iova)];
+	ret = rk_iommu_map_iova(rk_domain, pte_addr, iova, paddr, size, prot);
+	spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+
+	return ret;
+}
+
+static size_t rk_iommu_unmap(struct iommu_domain *domain, unsigned long _iova,
+			     size_t size)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	dma_addr_t iova = (dma_addr_t)_iova;
+	phys_addr_t pt_phys;
+	u32 dte;
+	u32 *pte_addr;
+	size_t unmap_size;
+
+	spin_lock_irqsave(&rk_domain->dt_lock, flags);
+
+	/*
+	 * pgsize_bitmap specifies iova sizes that fit in one page table
+	 * (1024 4-KiB pages = 4 MiB).
+	 * So, size will always be 4096 <= size <= 4194304.
+	 * Since iommu_unmap() guarantees that both iova and size will be
+	 * aligned, we will always only be unmapping from a single dte here.
+	 */
+	dte = rk_domain->dt[rk_iova_dte_index(iova)];
+	/* Just return 0 if iova is unmapped */
+	if (!rk_dte_is_pt_valid(dte)) {
+		spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+		return 0;
+	}
+
+	pt_phys = rk_dte_pt_address(dte);
+	pte_addr = (u32 *)phys_to_virt(pt_phys) + rk_iova_pte_index(iova);
+	unmap_size = rk_iommu_unmap_iova(rk_domain, pte_addr, iova, size);
+
+	spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+
+	/* Shootdown iotlb entries for iova range that was just unmapped */
+	rk_iommu_zap_iova(rk_domain, iova, unmap_size);
+
+	return unmap_size;
+}
+
+static int rk_iommu_attach_device(struct iommu_domain *domain,
+				  struct device *dev)
+{
+	struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	int ret;
+	phys_addr_t dte_addr;
+
+	/*
+	 * Allow 'virtual devices' (e.g., drm) to attach to domain.
+	 * Such a device has a NULL archdata.iommu.
+	 */
+	if (!iommu)
+		return 0;
+
+	ret = rk_iommu_enable_stall(iommu);
+	if (ret)
+		return ret;
+
+	ret = rk_iommu_force_reset(iommu);
+	if (ret)
+		return ret;
+
+	iommu->domain = domain;
+
+	ret = devm_request_irq(dev, iommu->irq, rk_iommu_irq,
+			       IRQF_SHARED, dev_name(dev), iommu);
+	if (ret)
+		return ret;
+
+	dte_addr = virt_to_phys(rk_domain->dt);
+	rk_iommu_write(iommu, RK_MMU_DTE_ADDR, dte_addr);
+	rk_iommu_command(iommu, RK_MMU_CMD_ZAP_CACHE);
+	rk_iommu_write(iommu, RK_MMU_INT_MASK, RK_MMU_IRQ_MASK);
+
+	ret = rk_iommu_enable_paging(iommu);
+	if (ret)
+		return ret;
+
+	spin_lock_irqsave(&rk_domain->iommus_lock, flags);
+	list_add_tail(&iommu->node, &rk_domain->iommus);
+	spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
+
+	dev_info(dev, "Attached to iommu domain\n");
+
+	rk_iommu_disable_stall(iommu);
+
+	return 0;
+}
+
+static void rk_iommu_detach_device(struct iommu_domain *domain,
+				   struct device *dev)
+{
+	struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+
+	/* Allow 'virtual devices' (e.g., drm) to detach from domain */
+	if (!iommu)
+		return;
+
+	spin_lock_irqsave(&rk_domain->iommus_lock, flags);
+	list_del_init(&iommu->node);
+	spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
+
+	/* Ignore error while disabling, just keep going */
+	rk_iommu_enable_stall(iommu);
+	rk_iommu_disable_paging(iommu);
+	rk_iommu_write(iommu, RK_MMU_INT_MASK, 0);
+	rk_iommu_write(iommu, RK_MMU_DTE_ADDR, 0);
+	rk_iommu_disable_stall(iommu);
+
+	devm_free_irq(dev, iommu->irq, iommu);
+
+	iommu->domain = NULL;
+
+	dev_info(dev, "Detached from iommu domain\n");
+}
+
+static int rk_iommu_domain_init(struct iommu_domain *domain)
+{
+	struct rk_iommu_domain *rk_domain;
+
+	rk_domain = kzalloc(sizeof(*rk_domain), GFP_KERNEL);
+	if (!rk_domain)
+		return -ENOMEM;
+
+	/*
+	 * rk32xx iommus use a 2 level pagetable.
+	 * Each level1 (dt) and level2 (pt) table has 1024 4-byte entries.
+	 * Allocate one 4 KiB page for each table.
+	 */
+	rk_domain->dt = (u32 *)get_zeroed_page(GFP_KERNEL | GFP_DMA32);
+	if (!rk_domain->dt)
+		goto err_dt;
+
+	rk_table_flush(rk_domain->dt, NUM_DT_ENTRIES);
+
+	spin_lock_init(&rk_domain->iommus_lock);
+	spin_lock_init(&rk_domain->dt_lock);
+	INIT_LIST_HEAD(&rk_domain->iommus);
+
+	domain->priv = rk_domain;
+
+	return 0;
+err_dt:
+	kfree(rk_domain);
+	return -ENOMEM;
+}
+
+static void rk_iommu_domain_destroy(struct iommu_domain *domain)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	int i;
+
+	WARN_ON(!list_empty(&rk_domain->iommus));
+
+	for (i = 0; i < NUM_DT_ENTRIES; i++) {
+		u32 dte = rk_domain->dt[i];
+		if (rk_dte_is_pt_valid(dte)) {
+			phys_addr_t pt_phys = rk_dte_pt_address(dte);
+			u32 *page_table = phys_to_virt(pt_phys);
+			free_page((unsigned long)page_table);
+		}
+	}
+
+	free_page((unsigned long)rk_domain->dt);
+	kfree(domain->priv);
+	domain->priv = NULL;
+}
+
+static const struct iommu_ops rk_iommu_ops = {
+	.domain_init = rk_iommu_domain_init,
+	.domain_destroy = rk_iommu_domain_destroy,
+	.attach_dev = rk_iommu_attach_device,
+	.detach_dev = rk_iommu_detach_device,
+	.map = rk_iommu_map,
+	.unmap = rk_iommu_unmap,
+	.iova_to_phys = rk_iommu_iova_to_phys,
+	.pgsize_bitmap = RK_IOMMU_PGSIZE_BITMAP,
+};
+
+static int rk_iommu_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct rk_iommu *iommu;
+	struct resource *res;
+
+	iommu = devm_kzalloc(dev, sizeof(*iommu), GFP_KERNEL);
+	if (!iommu)
+		return -ENOMEM;
+
+	platform_set_drvdata(pdev, iommu);
+	iommu->dev = dev;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	iommu->base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(iommu->base))
+		return PTR_ERR(iommu->base);
+
+	iommu->irq = platform_get_irq(pdev, 0);
+	if (iommu->irq < 0) {
+		dev_err(dev, "Failed to get IRQ, %d\n", iommu->irq);
+		return -ENXIO;
+	}
+
+	return 0;
+}
+
+static int rk_iommu_remove(struct platform_device *pdev)
+{
+	return 0;
+}
+
+#ifdef CONFIG_OF
+static const struct of_device_id rk_iommu_dt_ids[] = {
+	{ .compatible = "rockchip,iommu" },
+	{ /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, rk_iommu_dt_ids);
+#endif
+
+static struct platform_driver rk_iommu_driver = {
+	.probe = rk_iommu_probe,
+	.remove = rk_iommu_remove,
+	.driver = {
+		   .name = "rk_iommu",
+		   .owner = THIS_MODULE,
+		   .of_match_table = of_match_ptr(rk_iommu_dt_ids),
+	},
+};
+
+static int __init rk_iommu_init(void)
+{
+	int ret;
+
+	ret = bus_set_iommu(&platform_bus_type, &rk_iommu_ops);
+	if (ret)
+		return ret;
+
+	return platform_driver_register(&rk_iommu_driver);
+}
+static void __exit rk_iommu_exit(void)
+{
+	platform_driver_unregister(&rk_iommu_driver);
+}
+
+subsys_initcall(rk_iommu_init);
+module_exit(rk_iommu_exit);
+
+MODULE_DESCRIPTION("IOMMU API for Rockchip");
+MODULE_AUTHOR("Simon Xue <xxm@rock-chips.com> and Daniel Kurtz <djkurtz@chromium.org>");
+MODULE_ALIAS("platform:rockchip-iommu");
+MODULE_LICENSE("GPL v2");
-- 
2.1.0.rc2.206.gedb03e5

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v5 1/3] iommu/rockchip: rk3288 iommu driver
@ 2014-10-24  7:33   ` Daniel Kurtz
  0 siblings, 0 replies; 26+ messages in thread
From: Daniel Kurtz @ 2014-10-24  7:33 UTC (permalink / raw)
  To: linux-arm-kernel

The rk3288 has several iommus.  Each iommu belongs to a single master
device.  There is one device (ISP) that has two slave iommus, but that
case is not yet supported by this driver.

At subsys init, the iommu driver registers itself as the iommu driver for
the platform bus.  The master devices find their slave iommus using the
"iommus" field in their devicetree description.  Since each slave iommu
belongs to exactly one master, there is no additional data needed at probe
to associate a slave with its master.

An iommu device's power domain, clock and irq are all shared with its
master device, and the master device must be careful to attach to the
iommu only after powering and clocking it (and to keep it powered and
clocked until after detaching).  Because there is no guarantee what the
status of the iommu is at probe, and since the driver does not even know
if the device is powered, we delay requesting its irq until the master
device attaches, at which point we have a guarantee that the device is
powered and clocked and we can reset it and program its interrupt mask.

An iommu_domain describes a virtual iova address space.  Each iommu_domain
has a corresponding page table that lists the mappings from iova to
physical address.

For the rk3288 iommu, the page table has two levels:
 The Level 1 "directory_table" has 1024 4-byte dte entries.
 Each dte points to a level 2 "page_table".
 Each level 2 page_table has 1024 4-byte pte entries.
 Each pte points to a 4 KiB page of memory.

An iommu_domain is created when a dma_iommu_mapping is created via
arm_iommu_create_mapping().  Master devices can then attach themselves to
this mapping (or attach the mapping to themselves?) by calling
arm_iommu_attach_device().  This in turn instructs the iommu driver to
write the page table's physical address into the slave iommu's Directory
Table address (MMU_DTE_ADDR) register.

In fact, multiple master devices, each with its own slave iommu device,
can all attach to the same mapping.  The iommus for these devices will
share the same iommu_domain and therefore point to the same page table.
Thus, the iommu domain maintains a list of iommu devices which are
attached.  This driver relies on the iommu core to ensure that all devices
have detached before destroying a domain.
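
The sharing described above can be modelled in plain C as follows.  This is
a simplified user-space sketch, not the driver's implementation: all names
(model_domain, model_iommu, model_attach, ...) are hypothetical, a singly
linked list stands in for the kernel list, a struct field stands in for the
DTE address register, and locking is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one slave iommu. */
struct model_iommu {
	uint32_t dte_addr_reg;    /* stands in for the DTE address register */
	struct model_iommu *next; /* link in the domain's attached list */
};

/* Hypothetical model of a domain: one page table, many iommus. */
struct model_domain {
	uint32_t dt_phys;             /* physical address of the directory table */
	struct model_iommu *attached; /* head of the list of attached iommus */
};

/* Point the iommu at the domain's page table and track it in the list. */
static void model_attach(struct model_domain *d, struct model_iommu *m)
{
	assert(d && m);
	m->dte_addr_reg = d->dt_phys;
	m->next = d->attached;
	d->attached = m;
}

/* Unlink the iommu from the domain and clear its table pointer. */
static void model_detach(struct model_domain *d, struct model_iommu *m)
{
	struct model_iommu **pp;

	for (pp = &d->attached; *pp; pp = &(*pp)->next) {
		if (*pp == m) {
			*pp = m->next;
			m->next = NULL;
			m->dte_addr_reg = 0;
			return;
		}
	}
}

/* Number of iommus attached; should be 0 before a domain is destroyed. */
static size_t model_attached_count(const struct model_domain *d)
{
	const struct model_iommu *m;
	size_t n = 0;

	for (m = d->attached; m; m = m->next)
		n++;
	return n;
}
```

Two iommus attached to the same model_domain end up with the same
dte_addr_reg value, which mirrors how attached iommus share one page table.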

Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Simon Xue <xxm@rock-chips.com>
Reviewed-by: Grant Grundler <grundler@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
---
 drivers/iommu/Kconfig          |  12 +
 drivers/iommu/Makefile         |   1 +
 drivers/iommu/rockchip-iommu.c | 922 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 935 insertions(+)
 create mode 100644 drivers/iommu/rockchip-iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index dd51122..d0a1261 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -152,6 +152,18 @@ config OMAP_IOMMU_DEBUG
 
          Say N unless you know you need this.
 
+config ROCKCHIP_IOMMU
+	bool "Rockchip IOMMU Support"
+	depends on ARCH_ROCKCHIP
+	select IOMMU_API
+	select ARM_DMA_USE_IOMMU
+	help
+	  Support for IOMMUs found on Rockchip rk32xx SoCs.
+	  These IOMMUs allow virtualization of the address space used by most
+	  cores within the multimedia subsystem.
+	  Say Y here if you are using a Rockchip SoC that includes an IOMMU
+	  device.
+
 config TEGRA_IOMMU_GART
 	bool "Tegra GART IOMMU Support"
 	depends on ARCH_TEGRA_2x_SOC
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 16edef7..3e47ef3 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -13,6 +13,7 @@ obj-$(CONFIG_IRQ_REMAP) += intel_irq_remapping.o irq_remapping.o
 obj-$(CONFIG_OMAP_IOMMU) += omap-iommu.o
 obj-$(CONFIG_OMAP_IOMMU) += omap-iommu2.o
 obj-$(CONFIG_OMAP_IOMMU_DEBUG) += omap-iommu-debug.o
+obj-$(CONFIG_ROCKCHIP_IOMMU) += rockchip-iommu.o
 obj-$(CONFIG_TEGRA_IOMMU_GART) += tegra-gart.o
 obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
new file mode 100644
index 0000000..56ffb76
--- /dev/null
+++ b/drivers/iommu/rockchip-iommu.c
@@ -0,0 +1,922 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <asm/cacheflush.h>
+#include <asm/pgtable.h>
+#include <linux/compiler.h>
+#include <linux/delay.h>
+#include <linux/device.h>
+#include <linux/errno.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/jiffies.h>
+#include <linux/list.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+/** MMU register offsets */
+#define RK_MMU_DTE_ADDR		0x00	/* Directory table address */
+#define RK_MMU_STATUS		0x04
+#define RK_MMU_COMMAND		0x08
+#define RK_MMU_PAGE_FAULT_ADDR	0x0C	/* IOVA of last page fault */
+#define RK_MMU_ZAP_ONE_LINE	0x10	/* Shootdown one IOTLB entry */
+#define RK_MMU_INT_RAWSTAT	0x14	/* IRQ status ignoring mask */
+#define RK_MMU_INT_CLEAR	0x18	/* Acknowledge and re-arm irq */
+#define RK_MMU_INT_MASK		0x1C	/* IRQ enable */
+#define RK_MMU_INT_STATUS	0x20	/* IRQ status after masking */
+#define RK_MMU_AUTO_GATING	0x24
+
+#define DTE_ADDR_DUMMY		0xCAFEBABE
+#define FORCE_RESET_TIMEOUT	100	/* ms */
+
+/* RK_MMU_STATUS fields */
+#define RK_MMU_STATUS_PAGING_ENABLED       BIT(0)
+#define RK_MMU_STATUS_PAGE_FAULT_ACTIVE    BIT(1)
+#define RK_MMU_STATUS_STALL_ACTIVE         BIT(2)
+#define RK_MMU_STATUS_IDLE                 BIT(3)
+#define RK_MMU_STATUS_REPLAY_BUFFER_EMPTY  BIT(4)
+#define RK_MMU_STATUS_PAGE_FAULT_IS_WRITE  BIT(5)
+#define RK_MMU_STATUS_STALL_NOT_ACTIVE     BIT(31)
+
+/* RK_MMU_COMMAND command values */
+#define RK_MMU_CMD_ENABLE_PAGING    0  /* Enable memory translation */
+#define RK_MMU_CMD_DISABLE_PAGING   1  /* Disable memory translation */
+#define RK_MMU_CMD_ENABLE_STALL     2  /* Stall paging to allow other cmds */
+#define RK_MMU_CMD_DISABLE_STALL    3  /* Stop stall re-enables paging */
+#define RK_MMU_CMD_ZAP_CACHE        4  /* Shoot down entire IOTLB */
+#define RK_MMU_CMD_PAGE_FAULT_DONE  5  /* Clear page fault */
+#define RK_MMU_CMD_FORCE_RESET      6  /* Reset all registers */
+
+/* RK_MMU_INT_* register fields */
+#define RK_MMU_IRQ_PAGE_FAULT    0x01  /* page fault */
+#define RK_MMU_IRQ_BUS_ERROR     0x02  /* bus read error */
+#define RK_MMU_IRQ_MASK          (RK_MMU_IRQ_PAGE_FAULT | RK_MMU_IRQ_BUS_ERROR)
+
+#define NUM_DT_ENTRIES 1024
+#define NUM_PT_ENTRIES 1024
+
+#define SPAGE_ORDER 12
+#define SPAGE_SIZE (1 << SPAGE_ORDER)
+
+/*
+ * Support mapping any size that fits in one page table:
+ *   4 KiB to 4 MiB
+ */
+#define RK_IOMMU_PGSIZE_BITMAP 0x007ff000
+
+#define IOMMU_REG_POLL_COUNT_FAST 1000
+
+struct rk_iommu_domain {
+	struct list_head iommus;
+	u32 *dt; /* page directory table */
+	spinlock_t iommus_lock; /* lock for iommus list */
+	spinlock_t dt_lock; /* lock for modifying page directory table */
+};
+
+struct rk_iommu {
+	struct device *dev;
+	void __iomem *base;
+	int irq;
+	struct list_head node; /* entry in rk_iommu_domain.iommus */
+	struct iommu_domain *domain; /* domain to which iommu is attached */
+};
+
+static inline void rk_table_flush(u32 *va, unsigned int count)
+{
+	phys_addr_t pa_start = virt_to_phys(va);
+	phys_addr_t pa_end = virt_to_phys(va + count);
+	size_t size = pa_end - pa_start;
+
+	__cpuc_flush_dcache_area(va, size);
+	outer_flush_range(pa_start, pa_end);
+}
+
+/**
+ * Inspired by _wait_for in intel_drv.h
+ * This is NOT safe for use in interrupt context.
+ *
+ * Note that it's important that we check the condition again after having
+ * timed out, since the timeout could be due to preemption or similar and
+ * we've never had a chance to check the condition before the timeout.
+ */
+#define rk_wait_for(COND, MS) ({ \
+	unsigned long timeout__ = jiffies + msecs_to_jiffies(MS) + 1;	\
+	int ret__ = 0;							\
+	while (!(COND)) {						\
+		if (time_after(jiffies, timeout__)) {			\
+			ret__ = (COND) ? 0 : -ETIMEDOUT;		\
+			break;						\
+		}							\
+		usleep_range(50, 100);					\
+	}								\
+	ret__;								\
+})
+
+/*
+ * The Rockchip rk3288 iommu uses a 2-level page table.
+ * The first level is the "Directory Table" (DT).
+ * The DT consists of 1024 4-byte Directory Table Entries (DTEs), each pointing
+ * to a "Page Table".
+ * The second level is the 1024 Page Tables (PT).
+ * Each PT consists of 1024 4-byte Page Table Entries (PTEs), each pointing to
+ * a 4 KB page of physical memory.
+ *
+ * The DT and each PT fits in a single 4 KB page (4-bytes * 1024 entries).
+ * Each iommu device has a MMU_DTE_ADDR register that contains the physical
+ * address of the start of the DT page.
+ *
+ * The structure of the page table is as follows:
+ *
+ *                   DT
+ * MMU_DTE_ADDR -> +-----+
+ *                 |     |
+ *                 +-----+     PT
+ *                 | DTE | -> +-----+
+ *                 +-----+    |     |     Memory
+ *                 |     |    +-----+     Page
+ *                 |     |    | PTE | -> +-----+
+ *                 +-----+    +-----+    |     |
+ *                            |     |    |     |
+ *                            |     |    |     |
+ *                            +-----+    |     |
+ *                                       |     |
+ *                                       |     |
+ *                                       +-----+
+ */
+
+/*
+ * Each DTE has a PT address and a valid bit:
+ * +---------------------+-----------+-+
+ * | PT address          | Reserved  |V|
+ * +---------------------+-----------+-+
+ *  31:12 - PT address (PTs always starts on a 4 KB boundary)
+ *  11: 1 - Reserved
+ *      0 - 1 if PT @ PT address is valid
+ */
+#define RK_DTE_PT_ADDRESS_MASK    0xfffff000
+#define RK_DTE_PT_VALID           BIT(0)
+
+static inline phys_addr_t rk_dte_pt_address(u32 dte)
+{
+	return (phys_addr_t)dte & RK_DTE_PT_ADDRESS_MASK;
+}
+
+static inline bool rk_dte_is_pt_valid(u32 dte)
+{
+	return dte & RK_DTE_PT_VALID;
+}
+
+static u32 rk_mk_dte(u32 *pt)
+{
+	phys_addr_t pt_phys = virt_to_phys(pt);
+	return (pt_phys & RK_DTE_PT_ADDRESS_MASK) | RK_DTE_PT_VALID;
+}
+
+/*
+ * Each PTE has a Page address, some flags and a valid bit:
+ * +---------------------+---+-------+-+
+ * | Page address        |Rsv| Flags |V|
+ * +---------------------+---+-------+-+
+ *  31:12 - Page address (Pages always start on a 4 KB boundary)
+ *  11: 9 - Reserved
+ *   8: 1 - Flags
+ *      8 - Read allocate - allocate cache space on read misses
+ *      7 - Read cache - enable cache & prefetch of data
+ *      6 - Write buffer - enable delaying writes on their way to memory
+ *      5 - Write allocate - allocate cache space on write misses
+ *      4 - Write cache - different writes can be merged together
+ *      3 - Override cache attributes
+ *          if 1, bits 4-8 control cache attributes
+ *          if 0, the system bus defaults are used
+ *      2 - Writable
+ *      1 - Readable
+ *      0 - 1 if Page @ Page address is valid
+ */
+#define RK_PTE_PAGE_ADDRESS_MASK  0xfffff000
+#define RK_PTE_PAGE_FLAGS_MASK    0x000001fe
+#define RK_PTE_PAGE_WRITABLE      BIT(2)
+#define RK_PTE_PAGE_READABLE      BIT(1)
+#define RK_PTE_PAGE_VALID         BIT(0)
+
+static inline phys_addr_t rk_pte_page_address(u32 pte)
+{
+	return (phys_addr_t)pte & RK_PTE_PAGE_ADDRESS_MASK;
+}
+
+static inline bool rk_pte_is_page_valid(u32 pte)
+{
+	return pte & RK_PTE_PAGE_VALID;
+}
+
+/* TODO: set cache flags per prot IOMMU_CACHE */
+static u32 rk_mk_pte(phys_addr_t page, int prot)
+{
+	u32 flags = 0;
+	flags |= (prot & IOMMU_READ) ? RK_PTE_PAGE_READABLE : 0;
+	flags |= (prot & IOMMU_WRITE) ? RK_PTE_PAGE_WRITABLE : 0;
+	page &= RK_PTE_PAGE_ADDRESS_MASK;
+	return page | flags | RK_PTE_PAGE_VALID;
+}
+
+static u32 rk_mk_pte_invalid(u32 pte)
+{
+	return pte & ~RK_PTE_PAGE_VALID;
+}
+
+/*
+ * rk3288 iova (IOMMU Virtual Address) format
+ *  31       22.21       12.11          0
+ * +-----------+-----------+-------------+
+ * | DTE index | PTE index | Page offset |
+ * +-----------+-----------+-------------+
+ *  31:22 - DTE index   - index of DTE in DT
+ *  21:12 - PTE index   - index of PTE in PT @ DTE.pt_address
+ *  11: 0 - Page offset - offset into page @ PTE.page_address
+ */
+#define RK_IOVA_DTE_MASK    0xffc00000
+#define RK_IOVA_DTE_SHIFT   22
+#define RK_IOVA_PTE_MASK    0x003ff000
+#define RK_IOVA_PTE_SHIFT   12
+#define RK_IOVA_PAGE_MASK   0x00000fff
+#define RK_IOVA_PAGE_SHIFT  0
+
+static u32 rk_iova_dte_index(dma_addr_t iova)
+{
+	return (u32)(iova & RK_IOVA_DTE_MASK) >> RK_IOVA_DTE_SHIFT;
+}
+
+static u32 rk_iova_pte_index(dma_addr_t iova)
+{
+	return (u32)(iova & RK_IOVA_PTE_MASK) >> RK_IOVA_PTE_SHIFT;
+}
+
+static u32 rk_iova_page_offset(dma_addr_t iova)
+{
+	return (u32)(iova & RK_IOVA_PAGE_MASK) >> RK_IOVA_PAGE_SHIFT;
+}
+
+static u32 rk_iommu_read(struct rk_iommu *iommu, u32 offset)
+{
+	return readl(iommu->base + offset);
+}
+
+static void rk_iommu_write(struct rk_iommu *iommu, u32 offset, u32 value)
+{
+	writel(value, iommu->base + offset);
+}
+
+static void rk_iommu_command(struct rk_iommu *iommu, u32 command)
+{
+	writel(command, iommu->base + RK_MMU_COMMAND);
+}
+
+static void rk_iommu_zap_lines(struct rk_iommu *iommu, dma_addr_t iova,
+			       size_t size)
+{
+	dma_addr_t iova_end = iova + size;
+	/*
+	 * TODO(djkurtz): Figure out when it is more efficient to shootdown the
+	 * entire iotlb rather than iterate over individual iovas.
+	 */
+	for (; iova < iova_end; iova += SPAGE_SIZE)
+		rk_iommu_write(iommu, RK_MMU_ZAP_ONE_LINE, iova);
+}
+
+static bool rk_iommu_is_stall_active(struct rk_iommu *iommu)
+{
+	return rk_iommu_read(iommu, RK_MMU_STATUS) & RK_MMU_STATUS_STALL_ACTIVE;
+}
+
+static bool rk_iommu_is_paging_enabled(struct rk_iommu *iommu)
+{
+	return rk_iommu_read(iommu, RK_MMU_STATUS) &
+			     RK_MMU_STATUS_PAGING_ENABLED;
+}
+
+static int rk_iommu_enable_stall(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (rk_iommu_is_stall_active(iommu))
+		return 0;
+
+	/* Stall can only be enabled if paging is enabled */
+	if (!rk_iommu_is_paging_enabled(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_ENABLE_STALL);
+
+	ret = rk_wait_for(rk_iommu_is_stall_active(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Enable stall request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_disable_stall(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (!rk_iommu_is_stall_active(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_DISABLE_STALL);
+
+	ret = rk_wait_for(!rk_iommu_is_stall_active(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Disable stall request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_enable_paging(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (rk_iommu_is_paging_enabled(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_ENABLE_PAGING);
+
+	ret = rk_wait_for(rk_iommu_is_paging_enabled(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Enable paging request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_disable_paging(struct rk_iommu *iommu)
+{
+	int ret;
+
+	if (!rk_iommu_is_paging_enabled(iommu))
+		return 0;
+
+	rk_iommu_command(iommu, RK_MMU_CMD_DISABLE_PAGING);
+
+	ret = rk_wait_for(!rk_iommu_is_paging_enabled(iommu), 1);
+	if (ret)
+		dev_err(iommu->dev, "Disable paging request timed out, status: %#08x\n",
+			rk_iommu_read(iommu, RK_MMU_STATUS));
+
+	return ret;
+}
+
+static int rk_iommu_force_reset(struct rk_iommu *iommu)
+{
+	int ret;
+	u32 dte_addr;
+
+	/*
+	 * Check if register DTE_ADDR is working by writing DTE_ADDR_DUMMY
+	 * and verifying that upper 5 nybbles are read back.
+	 */
+	rk_iommu_write(iommu, RK_MMU_DTE_ADDR, DTE_ADDR_DUMMY);
+
+	dte_addr = rk_iommu_read(iommu, RK_MMU_DTE_ADDR);
+	if (dte_addr != (DTE_ADDR_DUMMY & RK_DTE_PT_ADDRESS_MASK)) {
+		dev_err(iommu->dev, "Error during raw reset. MMU_DTE_ADDR is not functioning\n");
+		return -EFAULT;
+	}
+
+	rk_iommu_command(iommu, RK_MMU_CMD_FORCE_RESET);
+
+	ret = rk_wait_for(rk_iommu_read(iommu, RK_MMU_DTE_ADDR) == 0x00000000,
+			  FORCE_RESET_TIMEOUT);
+	if (ret)
+		dev_err(iommu->dev, "FORCE_RESET command timed out\n");
+
+	return ret;
+}
+
+static void log_iova(struct rk_iommu *iommu, dma_addr_t iova)
+{
+	u32 dte_index, pte_index, page_offset;
+	u32 mmu_dte_addr;
+	phys_addr_t mmu_dte_addr_phys, dte_addr_phys;
+	u32 *dte_addr;
+	u32 dte;
+	phys_addr_t pte_addr_phys = 0;
+	u32 *pte_addr = NULL;
+	u32 pte = 0;
+	phys_addr_t page_addr_phys = 0;
+	u32 page_flags = 0;
+
+	dte_index = rk_iova_dte_index(iova);
+	pte_index = rk_iova_pte_index(iova);
+	page_offset = rk_iova_page_offset(iova);
+
+	mmu_dte_addr = rk_iommu_read(iommu, RK_MMU_DTE_ADDR);
+	mmu_dte_addr_phys = (phys_addr_t)mmu_dte_addr;
+
+	dte_addr_phys = mmu_dte_addr_phys + (4 * dte_index);
+	dte_addr = phys_to_virt(dte_addr_phys);
+	dte = *dte_addr;
+
+	if (!rk_dte_is_pt_valid(dte))
+		goto print_it;
+
+	pte_addr_phys = rk_dte_pt_address(dte) + (pte_index * 4);
+	pte_addr = phys_to_virt(pte_addr_phys);
+	pte = *pte_addr;
+
+	if (!rk_pte_is_page_valid(pte))
+		goto print_it;
+
+	page_addr_phys = rk_pte_page_address(pte) + page_offset;
+	page_flags = pte & RK_PTE_PAGE_FLAGS_MASK;
+
+print_it:
+	dev_err(iommu->dev, "iova = %pad: dte_index: 0x%03x pte_index: 0x%03x page_offset: 0x%03x\n",
+		&iova, dte_index, pte_index, page_offset);
+	dev_err(iommu->dev, "mmu_dte_addr: %pa dte@%pa: %#08x valid: %u pte@%pa: %#08x valid: %u page@%pa flags: %#03x\n",
+		&mmu_dte_addr_phys, &dte_addr_phys, dte,
+		rk_dte_is_pt_valid(dte), &pte_addr_phys, pte,
+		rk_pte_is_page_valid(pte), &page_addr_phys, page_flags);
+}
+
+static irqreturn_t rk_iommu_irq(int irq, void *dev_id)
+{
+	struct rk_iommu *iommu = dev_id;
+	u32 status;
+	u32 int_status;
+	dma_addr_t iova;
+
+	int_status = rk_iommu_read(iommu, RK_MMU_INT_STATUS);
+	if (int_status == 0)
+		return IRQ_NONE;
+
+	iova = rk_iommu_read(iommu, RK_MMU_PAGE_FAULT_ADDR);
+
+	if (int_status & RK_MMU_IRQ_PAGE_FAULT) {
+		int flags;
+
+		status = rk_iommu_read(iommu, RK_MMU_STATUS);
+		flags = (status & RK_MMU_STATUS_PAGE_FAULT_IS_WRITE) ?
+				IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
+
+		dev_err(iommu->dev, "Page fault at %pad of type %s\n",
+			&iova,
+			(flags == IOMMU_FAULT_WRITE) ? "write" : "read");
+
+		log_iova(iommu, iova);
+
+		/*
+		 * Report page fault to any installed handlers.
+		 * Ignore the return code, though, since we always zap cache
+		 * and clear the page fault anyway.
+		 */
+		if (iommu->domain)
+			report_iommu_fault(iommu->domain, iommu->dev, iova,
+					   flags);
+		else
+			dev_err(iommu->dev, "Page fault while iommu not attached to domain?\n");
+
+		rk_iommu_command(iommu, RK_MMU_CMD_ZAP_CACHE);
+		rk_iommu_command(iommu, RK_MMU_CMD_PAGE_FAULT_DONE);
+	}
+
+	if (int_status & RK_MMU_IRQ_BUS_ERROR)
+		dev_err(iommu->dev, "BUS_ERROR occurred at %pad\n", &iova);
+
+	if (int_status & ~RK_MMU_IRQ_MASK)
+		dev_err(iommu->dev, "unexpected int_status: %#08x\n",
+			int_status);
+
+	rk_iommu_write(iommu, RK_MMU_INT_CLEAR, int_status);
+
+	return IRQ_HANDLED;
+}
+
+static phys_addr_t rk_iommu_iova_to_phys(struct iommu_domain *domain,
+					 dma_addr_t iova)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	phys_addr_t pt_phys, phys = 0;
+	u32 dte, pte;
+	u32 *page_table;
+
+	spin_lock_irqsave(&rk_domain->dt_lock, flags);
+
+	dte = rk_domain->dt[rk_iova_dte_index(iova)];
+	if (!rk_dte_is_pt_valid(dte))
+		goto out;
+
+	pt_phys = rk_dte_pt_address(dte);
+	page_table = (u32 *)phys_to_virt(pt_phys);
+	pte = page_table[rk_iova_pte_index(iova)];
+	if (!rk_pte_is_page_valid(pte))
+		goto out;
+
+	phys = rk_pte_page_address(pte) + rk_iova_page_offset(iova);
+out:
+	spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+
+	return phys;
+}
+
+static void rk_iommu_zap_iova(struct rk_iommu_domain *rk_domain,
+			      dma_addr_t iova, size_t size)
+{
+	struct list_head *pos;
+	unsigned long flags;
+
+	/* shootdown these iova from all iommus using this domain */
+	spin_lock_irqsave(&rk_domain->iommus_lock, flags);
+	list_for_each(pos, &rk_domain->iommus) {
+		struct rk_iommu *iommu;
+		iommu = list_entry(pos, struct rk_iommu, node);
+		rk_iommu_zap_lines(iommu, iova, size);
+	}
+	spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
+}
+
+static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain,
+				  dma_addr_t iova)
+{
+	u32 *page_table, *dte_addr;
+	u32 dte;
+	phys_addr_t pt_phys;
+
+	assert_spin_locked(&rk_domain->dt_lock);
+
+	dte_addr = &rk_domain->dt[rk_iova_dte_index(iova)];
+	dte = *dte_addr;
+	if (rk_dte_is_pt_valid(dte))
+		goto done;
+
+	page_table = (u32 *)get_zeroed_page(GFP_ATOMIC | GFP_DMA32);
+	if (!page_table)
+		return ERR_PTR(-ENOMEM);
+
+	dte = rk_mk_dte(page_table);
+	*dte_addr = dte;
+
+	rk_table_flush(page_table, NUM_PT_ENTRIES);
+	rk_table_flush(dte_addr, 1);
+
+	/*
+	 * Zap the first iova of newly allocated page table so iommu evicts
+	 * old cached value of new dte from the iotlb.
+	 */
+	rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);
+
+done:
+	pt_phys = rk_dte_pt_address(dte);
+	return (u32 *)phys_to_virt(pt_phys);
+}
+
+static size_t rk_iommu_unmap_iova(struct rk_iommu_domain *rk_domain,
+				  u32 *pte_addr, dma_addr_t iova, size_t size)
+{
+	unsigned int pte_count;
+	unsigned int pte_total = size / SPAGE_SIZE;
+
+	assert_spin_locked(&rk_domain->dt_lock);
+
+	for (pte_count = 0; pte_count < pte_total; pte_count++) {
+		u32 pte = pte_addr[pte_count];
+		if (!rk_pte_is_page_valid(pte))
+			break;
+
+		pte_addr[pte_count] = rk_mk_pte_invalid(pte);
+	}
+
+	rk_table_flush(pte_addr, pte_count);
+
+	return pte_count * SPAGE_SIZE;
+}
+
+static int rk_iommu_map_iova(struct rk_iommu_domain *rk_domain, u32 *pte_addr,
+			     dma_addr_t iova, phys_addr_t paddr, size_t size,
+			     int prot)
+{
+	unsigned int pte_count;
+	unsigned int pte_total = size / SPAGE_SIZE;
+	phys_addr_t page_phys;
+
+	assert_spin_locked(&rk_domain->dt_lock);
+
+	for (pte_count = 0; pte_count < pte_total; pte_count++) {
+		u32 pte = pte_addr[pte_count];
+
+		if (rk_pte_is_page_valid(pte))
+			goto unwind;
+
+		pte_addr[pte_count] = rk_mk_pte(paddr, prot);
+
+		paddr += SPAGE_SIZE;
+	}
+
+	rk_table_flush(pte_addr, pte_count);
+
+	return 0;
+unwind:
+	/* Unmap the range of iovas that we just mapped */
+	rk_iommu_unmap_iova(rk_domain, pte_addr, iova, pte_count * SPAGE_SIZE);
+
+	iova += pte_count * SPAGE_SIZE;
+	page_phys = rk_pte_page_address(pte_addr[pte_count]);
+	pr_err("iova: %pad already mapped to %pa cannot remap to phys: %pa prot:%#x\n",
+	       &iova, &page_phys, &paddr, prot);
+
+	return -EADDRINUSE;
+}
+
+static int rk_iommu_map(struct iommu_domain *domain, unsigned long _iova,
+			phys_addr_t paddr, size_t size, int prot)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	dma_addr_t iova = (dma_addr_t)_iova;
+	u32 *page_table, *pte_addr;
+	int ret;
+
+	spin_lock_irqsave(&rk_domain->dt_lock, flags);
+
+	/*
+	 * pgsize_bitmap specifies iova sizes that fit in one page table
+	 * (1024 4-KiB pages = 4 MiB).
+	 * So, size will always be 4096 <= size <= 4194304.
+	 * Since iommu_map() guarantees that both iova and size will be
+	 * aligned, we will always only be mapping from a single dte here.
+	 */
+	page_table = rk_dte_get_page_table(rk_domain, iova);
+	if (IS_ERR(page_table)) {
+		spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+		return PTR_ERR(page_table);
+	}
+
+	pte_addr = &page_table[rk_iova_pte_index(iova)];
+	ret = rk_iommu_map_iova(rk_domain, pte_addr, iova, paddr, size, prot);
+	spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+
+	return ret;
+}
+
+static size_t rk_iommu_unmap(struct iommu_domain *domain, unsigned long _iova,
+			     size_t size)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	dma_addr_t iova = (dma_addr_t)_iova;
+	phys_addr_t pt_phys;
+	u32 dte;
+	u32 *pte_addr;
+	size_t unmap_size;
+
+	spin_lock_irqsave(&rk_domain->dt_lock, flags);
+
+	/*
+	 * pgsize_bitmap specifies iova sizes that fit in one page table
+	 * (1024 4-KiB pages = 4 MiB).
+	 * So, size will always be 4096 <= size <= 4194304.
+	 * Since iommu_unmap() guarantees that both iova and size will be
+	 * aligned, we will always only be unmapping from a single dte here.
+	 */
+	dte = rk_domain->dt[rk_iova_dte_index(iova)];
+	/* Just return 0 if iova is unmapped */
+	if (!rk_dte_is_pt_valid(dte)) {
+		spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+		return 0;
+	}
+
+	pt_phys = rk_dte_pt_address(dte);
+	pte_addr = (u32 *)phys_to_virt(pt_phys) + rk_iova_pte_index(iova);
+	unmap_size = rk_iommu_unmap_iova(rk_domain, pte_addr, iova, size);
+
+	spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
+
+	/* Shootdown iotlb entries for iova range that was just unmapped */
+	rk_iommu_zap_iova(rk_domain, iova, unmap_size);
+
+	return unmap_size;
+}
+
+static int rk_iommu_attach_device(struct iommu_domain *domain,
+				  struct device *dev)
+{
+	struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+	int ret;
+	phys_addr_t dte_addr;
+
+	/*
+	 * Allow 'virtual devices' (e.g., drm) to attach to domain.
+	 * Such a device has a NULL archdata.iommu.
+	 */
+	if (!iommu)
+		return 0;
+
+	ret = rk_iommu_enable_stall(iommu);
+	if (ret)
+		return ret;
+
+	ret = rk_iommu_force_reset(iommu);
+	if (ret)
+		return ret;
+
+	iommu->domain = domain;
+
+	ret = devm_request_irq(dev, iommu->irq, rk_iommu_irq,
+			       IRQF_SHARED, dev_name(dev), iommu);
+	if (ret)
+		return ret;
+
+	dte_addr = virt_to_phys(rk_domain->dt);
+	rk_iommu_write(iommu, RK_MMU_DTE_ADDR, dte_addr);
+	rk_iommu_command(iommu, RK_MMU_CMD_ZAP_CACHE);
+	rk_iommu_write(iommu, RK_MMU_INT_MASK, RK_MMU_IRQ_MASK);
+
+	ret = rk_iommu_enable_paging(iommu);
+	if (ret)
+		return ret;
+
+	spin_lock_irqsave(&rk_domain->iommus_lock, flags);
+	list_add_tail(&iommu->node, &rk_domain->iommus);
+	spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
+
+	dev_info(dev, "Attached to iommu domain\n");
+
+	rk_iommu_disable_stall(iommu);
+
+	return 0;
+}
+
+static void rk_iommu_detach_device(struct iommu_domain *domain,
+				   struct device *dev)
+{
+	struct rk_iommu *iommu = dev_get_drvdata(dev->archdata.iommu);
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	unsigned long flags;
+
+	/* Allow 'virtual devices' (e.g., drm) to detach from domain */
+	if (!iommu)
+		return;
+
+	spin_lock_irqsave(&rk_domain->iommus_lock, flags);
+	list_del_init(&iommu->node);
+	spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
+
+	/* Ignore error while disabling, just keep going */
+	rk_iommu_enable_stall(iommu);
+	rk_iommu_disable_paging(iommu);
+	rk_iommu_write(iommu, RK_MMU_INT_MASK, 0);
+	rk_iommu_write(iommu, RK_MMU_DTE_ADDR, 0);
+	rk_iommu_disable_stall(iommu);
+
+	devm_free_irq(dev, iommu->irq, iommu);
+
+	iommu->domain = NULL;
+
+	dev_info(dev, "Detached from iommu domain\n");
+}
+
+static int rk_iommu_domain_init(struct iommu_domain *domain)
+{
+	struct rk_iommu_domain *rk_domain;
+
+	rk_domain = kzalloc(sizeof(*rk_domain), GFP_KERNEL);
+	if (!rk_domain)
+		return -ENOMEM;
+
+	/*
+	 * rk32xx iommus use a 2 level pagetable.
+	 * Each level1 (dt) and level2 (pt) table has 1024 4-byte entries.
+	 * Allocate one 4 KiB page for each table.
+	 */
+	rk_domain->dt = (u32 *)get_zeroed_page(GFP_KERNEL | GFP_DMA32);
+	if (!rk_domain->dt)
+		goto err_dt;
+
+	rk_table_flush(rk_domain->dt, NUM_DT_ENTRIES);
+
+	spin_lock_init(&rk_domain->iommus_lock);
+	spin_lock_init(&rk_domain->dt_lock);
+	INIT_LIST_HEAD(&rk_domain->iommus);
+
+	domain->priv = rk_domain;
+
+	return 0;
+err_dt:
+	kfree(rk_domain);
+	return -ENOMEM;
+}
+
+static void rk_iommu_domain_destroy(struct iommu_domain *domain)
+{
+	struct rk_iommu_domain *rk_domain = domain->priv;
+	int i;
+
+	WARN_ON(!list_empty(&rk_domain->iommus));
+
+	for (i = 0; i < NUM_DT_ENTRIES; i++) {
+		u32 dte = rk_domain->dt[i];
+		if (rk_dte_is_pt_valid(dte)) {
+			phys_addr_t pt_phys = rk_dte_pt_address(dte);
+			u32 *page_table = phys_to_virt(pt_phys);
+			free_page((unsigned long)page_table);
+		}
+	}
+
+	free_page((unsigned long)rk_domain->dt);
+	kfree(domain->priv);
+	domain->priv = NULL;
+}
+
+static const struct iommu_ops rk_iommu_ops = {
+	.domain_init = rk_iommu_domain_init,
+	.domain_destroy = rk_iommu_domain_destroy,
+	.attach_dev = rk_iommu_attach_device,
+	.detach_dev = rk_iommu_detach_device,
+	.map = rk_iommu_map,
+	.unmap = rk_iommu_unmap,
+	.iova_to_phys = rk_iommu_iova_to_phys,
+	.pgsize_bitmap = RK_IOMMU_PGSIZE_BITMAP,
+};
+
+static int rk_iommu_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct rk_iommu *iommu;
+	struct resource *res;
+
+	iommu = devm_kzalloc(dev, sizeof(*iommu), GFP_KERNEL);
+	if (!iommu)
+		return -ENOMEM;
+
+	platform_set_drvdata(pdev, iommu);
+	iommu->dev = dev;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	iommu->base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(iommu->base))
+		return PTR_ERR(iommu->base);
+
+	iommu->irq = platform_get_irq(pdev, 0);
+	if (iommu->irq < 0) {
+		dev_err(dev, "Failed to get IRQ, %d\n", iommu->irq);
+		return -ENXIO;
+	}
+
+	return 0;
+}
+
+static int rk_iommu_remove(struct platform_device *pdev)
+{
+	return 0;
+}
+
+#ifdef CONFIG_OF
+static const struct of_device_id rk_iommu_dt_ids[] = {
+	{ .compatible = "rockchip,iommu" },
+	{ /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, rk_iommu_dt_ids);
+#endif
+
+static struct platform_driver rk_iommu_driver = {
+	.probe = rk_iommu_probe,
+	.remove = rk_iommu_remove,
+	.driver = {
+		   .name = "rk_iommu",
+		   .owner = THIS_MODULE,
+		   .of_match_table = of_match_ptr(rk_iommu_dt_ids),
+	},
+};
+
+static int __init rk_iommu_init(void)
+{
+	int ret;
+
+	ret = bus_set_iommu(&platform_bus_type, &rk_iommu_ops);
+	if (ret)
+		return ret;
+
+	return platform_driver_register(&rk_iommu_driver);
+}
+static void __exit rk_iommu_exit(void)
+{
+	platform_driver_unregister(&rk_iommu_driver);
+}
+
+subsys_initcall(rk_iommu_init);
+module_exit(rk_iommu_exit);
+
+MODULE_DESCRIPTION("IOMMU API for Rockchip");
+MODULE_AUTHOR("Simon Xue <xxm@rock-chips.com> and Daniel Kurtz <djkurtz@chromium.org>");
+MODULE_ALIAS("platform:rockchip-iommu");
+MODULE_LICENSE("GPL v2");
-- 
2.1.0.rc2.206.gedb03e5

end of thread, other threads:[~2014-10-27 10:08 UTC | newest]

Thread overview: 26+ messages
     [not found] <1413273762-32489-1-git-send-email-djkurtz@chromium.org>
2014-10-14  8:02 ` [PATCH v5 1/3] iommu/rockchip: rk3288 iommu driver Daniel Kurtz
2014-10-14  8:02   ` Daniel Kurtz
2014-10-14  8:02   ` Daniel Kurtz
2014-10-17  2:22   ` Daniel Kurtz
2014-10-17  2:22     ` Daniel Kurtz
2014-10-17  2:22     ` Daniel Kurtz
2014-10-17  8:36     ` Joerg Roedel
2014-10-17  8:36       ` Joerg Roedel
2014-10-17  8:36       ` Joerg Roedel
2014-10-22 15:12   ` Joerg Roedel
2014-10-22 15:12     ` Joerg Roedel
2014-10-22 15:12     ` Joerg Roedel
2014-10-14  8:02 ` [PATCH v5 2/3] dt-bindings: iommu: Add documentation for rockchip iommu Daniel Kurtz
2014-10-14  8:02   ` Daniel Kurtz
2014-10-14  8:02 ` [PATCH v5 3/3] ARM: dts: rk3288: add VOP iommu nodes Daniel Kurtz
2014-10-14  8:02   ` Daniel Kurtz
2014-10-14  8:02   ` Daniel Kurtz
     [not found] <1414136029-22695-1-git-send-email-djkurtz@chromium.org>
2014-10-24  7:33 ` [PATCH v5 1/3] iommu/rockchip: rk3288 iommu driver Daniel Kurtz
2014-10-24  7:33   ` Daniel Kurtz
2014-10-24  7:33   ` Daniel Kurtz
2014-10-26 20:32   ` Heiko Stübner
2014-10-26 20:32     ` Heiko Stübner
2014-10-26 20:32     ` Heiko Stübner
2014-10-27 10:08     ` Daniel Kurtz
2014-10-27 10:08       ` Daniel Kurtz
2014-10-27 10:08       ` Daniel Kurtz
