* [RFC 00/10] Add NVIDIA Tegra124 IOMMU support
@ 2014-06-26 20:49 ` Thierry Reding
  0 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-26 20:49 UTC (permalink / raw)
  To: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel
  Cc: Olav Haugan, devicetree, Grant Grundler, Rhyland Klein, iommu,
	linux-kernel, Marc Zyngier, Allen Martin, Paul Walmsley,
	linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel

From: Thierry Reding <treding@nvidia.com>

This series adds support for the IOMMU found on Tegra124 SoCs. The SMMU
groups memory clients into SWGROUPs and each SWGROUP can be assigned to
one I/O virtual address space. Translation of virtual addresses can be
enabled per memory client.
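
As a rough illustration of the grouping described above (not code from this
series), the relationship between memory clients, SWGROUPs and address spaces
can be modelled in plain user-space C. All names, table sizes and the
"no ASID" marker below are assumptions made up for the sketch; the real
register layout lives in the driver in patch 4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sizes and names, chosen only for this illustration. */
#define MODEL_NUM_SWGROUPS 4
#define MODEL_NUM_CLIENTS  8
#define MODEL_NO_ASID      (-1)

struct smmu_model {
	int swgroup_asid[MODEL_NUM_SWGROUPS];  /* address space per SWGROUP */
	int client_swgroup[MODEL_NUM_CLIENTS]; /* SWGROUP each client belongs to */
	uint32_t client_enable;                /* one translation-enable bit per client */
};

/* Start with no address space assigned and translation off everywhere. */
static void smmu_model_init(struct smmu_model *smmu)
{
	for (int i = 0; i < MODEL_NUM_SWGROUPS; i++)
		smmu->swgroup_asid[i] = MODEL_NO_ASID;
	for (int i = 0; i < MODEL_NUM_CLIENTS; i++)
		smmu->client_swgroup[i] = 0;
	smmu->client_enable = 0;
}

/*
 * Return the address space used for a client's accesses, or MODEL_NO_ASID
 * if translation is disabled for that client or its SWGROUP has no
 * address space assigned.
 */
static int smmu_model_client_asid(const struct smmu_model *smmu,
				  unsigned int client)
{
	assert(client < MODEL_NUM_CLIENTS);

	if (!(smmu->client_enable & (UINT32_C(1) << client)))
		return MODEL_NO_ASID;

	return smmu->swgroup_asid[smmu->client_swgroup[client]];
}
```

In this model two clients in the same SWGROUP always share an address space,
but only the one whose enable bit is set is actually translated.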

Patch 1 adds an IOMMU device registry. The driver in patch 4 will add
the IOMMU device with this registry, which will in turn be used by the
client drivers to attach to the IOMMU device. Note that the API that is
introduced in this patch may not be sufficient in the long term (e.g.
when multiple master interfaces need to be supported).
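
The registry idea can be sketched as a small user-space model (an
illustration only, not the kernel code from patch 1). It assumes a singly
linked list instead of the kernel's list_head, omits locking, and uses
made-up model_* names; the only value taken from the kernel is the numeric
EPROBE_DEFER code. The point it shows is the ordering: a client that
attaches before its IOMMU has registered gets a deferral and can retry later.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_EPROBE_DEFER 517 /* same value as the kernel's EPROBE_DEFER */

struct model_iommu {
	const void *of_node;      /* stands in for the IOMMU's device tree node */
	int attached;             /* number of masters attached so far */
	struct model_iommu *next;
};

static struct model_iommu *model_iommus; /* the registry */

/* Called by the (hypothetical) IOMMU driver once it has probed. */
static void model_iommu_add(struct model_iommu *iommu)
{
	assert(iommu != NULL);

	iommu->attached = 0;
	iommu->next = model_iommus;
	model_iommus = iommu;
}

/*
 * Called by a client driver for the IOMMU node it references. Returns 0 on
 * success or -MODEL_EPROBE_DEFER if that IOMMU has not registered yet.
 */
static int model_iommu_attach(const void *of_node)
{
	for (struct model_iommu *i = model_iommus; i; i = i->next) {
		if (i->of_node == of_node) {
			i->attached++;
			return 0;
		}
	}

	return -MODEL_EPROBE_DEFER;
}
```

A client probing first sees -MODEL_EPROBE_DEFER; once model_iommu_add() has
run for the matching node, the same call succeeds.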

Patch 2 is v3 of the generic IOMMU device tree binding that has been
discussed previously. Patch 3 defines the device tree binding for the
NVIDIA Tegra124 memory controller (and references the generic IOMMU
binding).

Patch 4 implements a memory controller driver for NVIDIA Tegra124. It
initializes the latency allowance programming to sensible defaults and
registers an IOMMU device. Note that this is still somewhat of a work in
progress: the page tables aren't properly cleaned up yet, and other
features of the memory controller may be useful to implement
subsequently.

Patches 5 through 8 add the device tree node for the memory controller
and enable IOMMU support in the display and SDMMC controllers as
examples.

Patches 9 and 10 add support for IOMMU to the DRM and SDMMC drivers.
SDMMC uses the DMA mapping API, which will make use of ARM's DMA/IOMMU
integration. DRM has special needs (buffers that are mapped can be
scanned out by either display controller) and is not a good fit for the
DMA mapping API, so it uses the IOMMU API directly.

This has been tested using both SDMMC and DRM drivers via the IOMMU. For
DRM, when an IOMMU is detected, shmem is used as the backing store, which
removes the need for CMA. Importing from gk20a via the Nouveau driver
also works, but buffers occasionally have some kind of offset that I
haven't been able to track down yet.

Thierry

Thierry Reding (10):
  iommu: Add IOMMU device registry
  devicetree: Add generic IOMMU device tree bindings
  of: Add NVIDIA Tegra124 memory controller binding
  memory: Add Tegra124 memory controller support
  ARM: tegra: Add memory controller on Tegra124
  ARM: tegra: tegra124: Enable IOMMU for display controllers
  ARM: tegra: tegra124: Enable IOMMU for SDMMC controllers
  ARM: tegra: Select ARM_DMA_USE_IOMMU
  drm/tegra: Add IOMMU support
  mmc: sdhci-tegra: Add IOMMU support

 Documentation/devicetree/bindings/iommu/iommu.txt  |  156 ++
 .../memory-controllers/nvidia,tegra124-mc.txt      |   12 +
 arch/arm/boot/dts/tegra124.dtsi                    |   18 +
 arch/arm/mach-tegra/Kconfig                        |    1 +
 drivers/gpu/drm/tegra/dc.c                         |   21 +
 drivers/gpu/drm/tegra/drm.c                        |   17 +
 drivers/gpu/drm/tegra/drm.h                        |    3 +
 drivers/gpu/drm/tegra/fb.c                         |   16 +-
 drivers/gpu/drm/tegra/gem.c                        |  236 ++-
 drivers/gpu/drm/tegra/gem.h                        |    4 +
 drivers/iommu/iommu.c                              |   93 +
 drivers/memory/Kconfig                             |    9 +
 drivers/memory/Makefile                            |    1 +
 drivers/memory/tegra124-mc.c                       | 1945 ++++++++++++++++++++
 drivers/mmc/host/sdhci-tegra.c                     |    8 +
 include/dt-bindings/memory/tegra124-mc.h           |   30 +
 include/linux/iommu.h                              |   27 +
 17 files changed, 2573 insertions(+), 24 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/iommu/iommu.txt
 create mode 100644 Documentation/devicetree/bindings/memory-controllers/nvidia,tegra124-mc.txt
 create mode 100644 drivers/memory/tegra124-mc.c
 create mode 100644 include/dt-bindings/memory/tegra124-mc.h

-- 
2.0.0

* [RFC 01/10] iommu: Add IOMMU device registry
  2014-06-26 20:49 ` Thierry Reding
  (?)
@ 2014-06-26 20:49     ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-26 20:49 UTC (permalink / raw)
  To: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel
  Cc: Olav Haugan, devicetree, Grant Grundler, Rhyland Klein, iommu,
	linux-kernel, Marc Zyngier, Allen Martin, Paul Walmsley,
	linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel

From: Thierry Reding <treding@nvidia.com>

Add an IOMMU device registry for drivers to register with and implement
a method for users of the IOMMU API to attach to an IOMMU device. This
makes it possible to support deferred probing and gives the IOMMU API a
convenient hook to perform early initialization of a device if necessary.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/iommu/iommu.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/iommu.h | 27 +++++++++++++++
 2 files changed, 120 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 806b55d056b7..5e9e82c73bbf 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -29,8 +29,12 @@
 #include <linux/idr.h>
 #include <linux/notifier.h>
 #include <linux/err.h>
+#include <linux/of.h>
 #include <trace/events/iommu.h>
 
+static DEFINE_MUTEX(iommus_lock);
+static LIST_HEAD(iommus);
+
 static struct kset *iommu_group_kset;
 static struct ida iommu_group_ida;
 static struct mutex iommu_group_mutex;
@@ -1004,3 +1008,92 @@ int iommu_domain_set_attr(struct iommu_domain *domain,
 	return ret;
 }
 EXPORT_SYMBOL_GPL(iommu_domain_set_attr);
+
+int iommu_add(struct iommu *iommu)
+{
+	mutex_lock(&iommus_lock);
+	list_add_tail(&iommu->list, &iommus);
+	mutex_unlock(&iommus_lock);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(iommu_add);
+
+void iommu_remove(struct iommu *iommu)
+{
+	mutex_lock(&iommus_lock);
+	list_del_init(&iommu->list);
+	mutex_unlock(&iommus_lock);
+}
+EXPORT_SYMBOL_GPL(iommu_remove);
+
+static int of_iommu_attach(struct device *dev)
+{
+	struct of_phandle_iter iter;
+	struct iommu *iommu;
+
+	mutex_lock(&iommus_lock);
+
+	of_property_for_each_phandle_with_args(iter, dev->of_node, "iommus",
+					       "#iommu-cells", 0) {
+		bool found = false;
+		int err;
+
+		/* skip disabled IOMMUs */
+		if (!of_device_is_available(iter.out_args.np))
+			continue;
+
+		list_for_each_entry(iommu, &iommus, list) {
+			if (iommu->dev->of_node == iter.out_args.np) {
+				err = iommu->ops->attach(iommu, dev);
+				if (err < 0)
+					dev_err(dev, "failed to attach to IOMMU: %d\n", err);
+
+				found = true;
+			}
+		}
+
+		if (!found) {
+			mutex_unlock(&iommus_lock);
+			return -EPROBE_DEFER;
+		}
+	}
+
+	mutex_unlock(&iommus_lock);
+
+	return 0;
+}
+
+static int of_iommu_detach(struct device *dev)
+{
+	/* TODO: implement */
+	return -ENOSYS;
+}
+
+int iommu_attach(struct device *dev)
+{
+	int err = 0;
+
+	if (IS_ENABLED(CONFIG_OF) && dev->of_node) {
+		err = of_iommu_attach(dev);
+		if (!err)
+			return 0;
+	}
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(iommu_attach);
+
+int iommu_detach(struct device *dev)
+{
+	int err = 0;
+
+	if (IS_ENABLED(CONFIG_OF) && dev->of_node) {
+		err = of_iommu_detach(dev);
+		if (!err)
+			return 0;
+	}
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(iommu_detach);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 284a4683fdc1..ac2ceef194d4 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -43,6 +43,17 @@ struct notifier_block;
 typedef int (*iommu_fault_handler_t)(struct iommu_domain *,
 			struct device *, unsigned long, int, void *);
 
+struct iommu {
+	struct device *dev;
+
+	struct list_head list;
+
+	const struct iommu_ops *ops;
+};
+
+int iommu_add(struct iommu *iommu);
+void iommu_remove(struct iommu *iommu);
+
 struct iommu_domain_geometry {
 	dma_addr_t aperture_start; /* First address that can be mapped    */
 	dma_addr_t aperture_end;   /* Last address that can be mapped     */
@@ -130,6 +141,9 @@ struct iommu_ops {
 	/* Get the numer of window per domain */
 	u32 (*domain_get_windows)(struct iommu_domain *domain);
 
+	int (*attach)(struct iommu *iommu, struct device *dev);
+	int (*detach)(struct iommu *iommu, struct device *dev);
+
 	unsigned long pgsize_bitmap;
 };
 
@@ -192,6 +206,10 @@ extern int iommu_domain_window_enable(struct iommu_domain *domain, u32 wnd_nr,
 				      phys_addr_t offset, u64 size,
 				      int prot);
 extern void iommu_domain_window_disable(struct iommu_domain *domain, u32 wnd_nr);
+
+int iommu_attach(struct device *dev);
+int iommu_detach(struct device *dev);
+
 /**
  * report_iommu_fault() - report about an IOMMU fault to the IOMMU framework
  * @domain: the iommu domain where the fault has happened
@@ -396,6 +414,15 @@ static inline int iommu_domain_set_attr(struct iommu_domain *domain,
 	return -EINVAL;
 }
 
+static inline int iommu_attach(struct device *dev)
+{
+	return 0;
+}
+
+static inline int iommu_detach(struct device *dev)
+{
+	return 0;
+}
 #endif /* CONFIG_IOMMU_API */
 
 #endif /* __LINUX_IOMMU_H */
-- 
2.0.0

* [PATCH v3 02/10] devicetree: Add generic IOMMU device tree bindings
  2014-06-26 20:49 ` Thierry Reding
  (?)
@ 2014-06-26 20:49     ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-26 20:49 UTC (permalink / raw)
  To: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel
  Cc: Olav Haugan, devicetree, Grant Grundler, Rhyland Klein, iommu,
	linux-kernel, Marc Zyngier, Allen Martin, Paul Walmsley,
	linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel

From: Thierry Reding <treding@nvidia.com>

This commit introduces a generic device tree binding for IOMMU devices.
Only a very minimal subset is described here, but it is enough to cover
the requirements of both the Exynos System MMU and Tegra SMMU as
discussed here:

    https://lkml.org/lkml/2014/4/27/346

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
Changes in v3:
- use #iommu-cells instead of #address-cells/#size-cells
- drop optional iommu-names property

Changes in v2:
- add notes about "dma-ranges" property (drop note from commit message)
- document priorities of "iommus" property vs. "dma-ranges" property
- drop #iommu-cells in favour of #address-cells and #size-cells
- remove multiple-master device example
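
To make the v3 specifier layout concrete: each entry in an "iommus" property
is a phandle followed by as many cells as the referenced node's #iommu-cells
declares. The following user-space sketch (an illustration, not kernel code)
walks such a flat cell array. The model_* names, the four-cell limit and the
example phandle-to-cell-count mapping are all assumptions, and the cells are
assumed to be already converted to host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ARGS 4 /* hypothetical upper bound on #iommu-cells */

struct model_iommu_spec {
	uint32_t phandle;
	uint32_t args[MODEL_MAX_ARGS];
	unsigned int num_args;
};

/* Returns the #iommu-cells of the node behind a phandle, negative if unknown. */
typedef int (*model_cells_fn)(uint32_t phandle);

/* Made-up mapping: phandle 1 takes one cell (a master ID), phandle 2 none. */
static int model_example_cells(uint32_t phandle)
{
	switch (phandle) {
	case 1: return 1;
	case 2: return 0;
	default: return -1;
	}
}

/* Returns the number of entries decoded, or -1 if the property is malformed. */
static int model_parse_iommus(const uint32_t *prop, size_t len,
			      model_cells_fn cells_for,
			      struct model_iommu_spec *out, size_t max_out)
{
	size_t pos = 0, count = 0;

	assert(cells_for != NULL && out != NULL);

	while (pos < len) {
		int cells = cells_for(prop[pos]);

		if (cells < 0 || cells > MODEL_MAX_ARGS || count == max_out)
			return -1;
		if (len - pos - 1 < (size_t)cells)
			return -1; /* specifier runs past the end of the property */

		out[count].phandle = prop[pos++];
		out[count].num_args = (unsigned int)cells;
		for (int i = 0; i < cells; i++)
			out[count].args[i] = prop[pos++];
		count++;
	}

	return (int)count;
}
```

With the example mapping, the cell array { 1, 23, 2, 1, 42 } decodes into
three master interfaces: IOMMU 1 with ID 23, IOMMU 2 with no arguments, and
IOMMU 1 with ID 42.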

 Documentation/devicetree/bindings/iommu/iommu.txt | 156 ++++++++++++++++++++++
 1 file changed, 156 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/iommu.txt

diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt b/Documentation/devicetree/bindings/iommu/iommu.txt
new file mode 100644
index 000000000000..f8f03f057156
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/iommu.txt
@@ -0,0 +1,156 @@
+This document describes the generic device tree binding for IOMMUs and their
+master(s).
+
+
+IOMMU device node:
+==================
+
+An IOMMU can provide the following services:
+
+* Remap address space to allow devices to access physical memory ranges that
+  they otherwise wouldn't be capable of accessing.
+
+  Example: 32-bit DMA to 64-bit physical addresses
+
+* Implement scatter-gather at page level granularity so that the device does
+  not have to.
+
+* Provide system protection against "rogue" DMA by forcing all accesses to go
+  through the IOMMU and faulting when encountering accesses to unmapped
+  address regions.
+
+* Provide address space isolation between multiple contexts.
+
+  Example: Virtualization
+
+Device nodes compatible with this binding represent hardware with some of the
+above capabilities.
+
+IOMMUs can be single-master or multiple-master. Single-master IOMMU devices
+typically have a fixed association to the master device, whereas multiple-
+master IOMMU devices can translate accesses from more than one master.
+
+The device tree node of the IOMMU device's parent bus must contain a valid
+"dma-ranges" property that describes how the physical address space of the
+IOMMU maps to memory. An empty "dma-ranges" property means that there is a
+1:1 mapping from IOMMU to memory.
+
+Required properties:
+--------------------
+- #iommu-cells: The number of cells in an IOMMU specifier needed to encode an
+  address.
+
+Typical values for the above include:
+- #iommu-cells = <0>: Single-master IOMMU devices are not configurable and
+  therefore no additional information needs to be encoded in the specifier.
+  This may also apply to multiple-master IOMMU devices that do not allow the
+  association of masters to be configured.
+- #iommu-cells = <1>: Multiple-master IOMMU devices may need to be configured
+  in order to enable translation for a given master. In such cases the single
+  address cell corresponds to the master device's ID.
+- #iommu-cells = <4>: Some IOMMU devices allow the DMA window for masters to
+  be configured. The first cell of the address in this case may contain the
+  master device's ID for example, while the second cell could contain the
+  start of the DMA window for the given device. The length of the DMA window
+  is given by the third and fourth cells.
+
+
+IOMMU master node:
+==================
+
+Devices that access memory through an IOMMU are called masters. A device can
+have multiple master interfaces (to one or more IOMMU devices).
+
+Required properties:
+--------------------
+- iommus: A list of phandle and IOMMU specifier pairs that describe the IOMMU
+  master interfaces of the device. One entry in the list describes one master
+  interface of the device.
+
+When an "iommus" property is specified in a device tree node, the IOMMU will
+be used for address translation. If a "dma-ranges" property exists in the
+device's parent node it will be ignored. An exception to this rule is if the
+referenced IOMMU is disabled, in which case the "dma-ranges" property of the
+parent shall take effect.
+
+
+Notes:
+======
+
+One possible extension to the above is to use an "iommus" property along with
+a "dma-ranges" property in a bus device node (such as PCI host bridges). This
+can be useful to describe how children on the bus relate to the IOMMU if they
+are not explicitly listed in the device tree (e.g. PCI devices). However, the
+requirements of that use-case haven't been fully determined yet. Implementing
+this is therefore not recommended without further discussion and extension of
+this binding.
+
+
+Examples:
+=========
+
+Single-master IOMMU:
+--------------------
+
+	iommu {
+		#iommu-cells = <0>;
+	};
+
+	master {
+		iommus = <&{/iommu}>;
+	};
+
+Multiple-master IOMMU with fixed associations:
+----------------------------------------------
+
+	/* multiple-master IOMMU */
+	iommu {
+		/*
+		 * Masters are statically associated with this IOMMU and
+		 * address translation is always enabled.
+		 */
+		#iommu-cells = <0>;
+	};
+
+	/* static association with IOMMU */
+	master@1 {
+		reg = <1>;
+		iommus = <&{/iommu}>;
+	};
+
+	/* static association with IOMMU */
+	master@2 {
+		reg = <2>;
+		iommus = <&{/iommu}>;
+	};
+
+Multiple-master IOMMU:
+----------------------
+
+	iommu {
+		/* the specifier represents the ID of the master */
+		#iommu-cells = <1>;
+	};
+
+	master {
+		/* device has master ID 42 in the IOMMU */
+		iommus = <&{/iommu} 42>;
+	};
+
+Multiple-master IOMMU with configurable DMA window:
+---------------------------------------------------
+
+	/ {
+		#address-cells = <1>;
+		#size-cells = <1>;
+
+		iommu {
+			/* master ID, address and length of DMA window */
+			#iommu-cells = <4>;
+		};
+
+		master {
+			/* master ID 42, 4 GiB DMA window starting at 0 */
+			iommus = <&{/iommu}  42  0  0x1 0x0>;
+		};
+	};
-- 
2.0.0

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC 03/10] of: Add NVIDIA Tegra124 memory controller binding
  2014-06-26 20:49 ` Thierry Reding
  (?)
@ 2014-06-26 20:49     ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-26 20:49 UTC (permalink / raw)
  To: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel
  Cc: Olav Haugan, devicetree, Grant Grundler, Rhyland Klein, iommu,
	linux-kernel, Marc Zyngier, Allen Martin, Paul Walmsley,
	linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel

From: Thierry Reding <treding@nvidia.com>

The memory controller on NVIDIA Tegra124 exposes various knobs that can
be used to tune the behaviour of the clients attached to it.

In addition, the memory controller implements an SMMU (IOMMU) which can
translate I/O virtual addresses to physical addresses for clients. This
is useful for scatter-gather operation on devices that don't support it
natively and for virtualization or process separation.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
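To illustrate the one-cell specifier: a sketch, in plain C, of how a
driver might use the SWGROUP taken from the specifier to find the memory
clients whose translation needs to be enabled. The three table rows copy
the SMMU register offsets and bit numbers of the first display clients in
patch 4. The SWGROUP numbers, the ex_ names and the idea of OR-ing the
bits into one mask are assumptions made for this example, not something
this binding or the driver defines.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder SWGROUP IDs; the real ones live in the dt-bindings header. */
enum { EX_SWGROUP_DC = 1, EX_SWGROUP_DCB = 2 };

struct ex_client {
	const char *name;
	unsigned int swgroup;
	unsigned int reg;	/* offset of the SMMU enable register */
	unsigned int bit;	/* enable bit within that register */
};

static const struct ex_client ex_clients[] = {
	{ "display0a",  EX_SWGROUP_DC,  0x228, 1 },
	{ "display0ab", EX_SWGROUP_DCB, 0x228, 2 },
	{ "display0b",  EX_SWGROUP_DC,  0x228, 3 },
};

/*
 * Collect the enable bits in register "reg" for all clients that belong
 * to "swgroup" (the value of the single specifier cell).
 */
static uint32_t ex_swgroup_enable_mask(unsigned int swgroup, unsigned int reg)
{
	uint32_t mask = 0;
	size_t i;

	for (i = 0; i < sizeof(ex_clients) / sizeof(ex_clients[0]); i++) {
		if (ex_clients[i].swgroup == swgroup &&
		    ex_clients[i].reg == reg)
			mask |= UINT32_C(1) << ex_clients[i].bit;
	}

	return mask;
}
```

With this table, the DC group selects bits 1 and 3 of register 0x228 and
the DCB group selects bit 2.
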
 .../bindings/memory-controllers/nvidia,tegra124-mc.txt       | 12 ++++++++++++
 1 file changed, 12 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/memory-controllers/nvidia,tegra124-mc.txt

diff --git a/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra124-mc.txt b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra124-mc.txt
new file mode 100644
index 000000000000..4c922e839059
--- /dev/null
+++ b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra124-mc.txt
@@ -0,0 +1,12 @@
+NVIDIA Tegra124 Memory Controller device tree bindings
+======================================================
+
+Required properties:
+- compatible: Should be "nvidia,tegra124-mc"
+- reg: Physical base address and length of the controller's registers.
+- interrupts: The interrupt outputs from the controller.
+- #iommu-cells: Should be 1. The single cell of the IOMMU specifier defines
+  the SWGROUP of the master.
+
+This device implements an IOMMU that complies with the generic IOMMU binding.
+See ../iommu/iommu.txt for details.
-- 
2.0.0

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC 04/10] memory: Add Tegra124 memory controller support
  2014-06-26 20:49 ` Thierry Reding
  (?)
@ 2014-06-26 20:49     ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-26 20:49 UTC (permalink / raw)
  To: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel
  Cc: Olav Haugan, devicetree, Grant Grundler, Rhyland Klein, iommu,
	linux-kernel, Marc Zyngier, Allen Martin, Paul Walmsley,
	linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel

From: Thierry Reding <treding@nvidia.com>

The memory controller on NVIDIA Tegra124 exposes various knobs that can
be used to tune the behaviour of the clients attached to it.

Currently this driver sets up the latency allowance registers to the HW
defaults. Eventually an API should be exported by this driver (via a
custom API or a generic subsystem) to allow clients to register latency
requirements.

This driver also registers an IOMMU (SMMU) that's implemented by the
memory controller.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
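For readers of the client table: a minimal sketch, in plain C, of how a
latency allowance default could be merged into a register value using
the reg/shift/mask/def fields. It assumes a plain read-modify-write of
the field, which is an assumption for illustration and may not match the
driver line for line. The ex_ names are hypothetical.

```c
#include <stdint.h>

/* Same fields as struct latency_allowance in this patch. */
struct ex_latency_allowance {
	unsigned int reg;
	unsigned int shift;
	unsigned int mask;
	unsigned int def;
};

/*
 * Return "old" with the field described by "la" replaced by its default.
 * In a driver, "old" would come from reading the register at la->reg and
 * the result would be written back to the same offset.
 */
static uint32_t ex_la_apply_default(uint32_t old,
				    const struct ex_latency_allowance *la)
{
	uint32_t field = (uint32_t)la->mask << la->shift;

	return (old & ~field) | (((uint32_t)la->def << la->shift) & field);
}
```

Applying the display0a entry (shift 0, default 0xc2) and then the
display0b entry (shift 16, default 0x50), both of which live in register
0x2e8, to a value of 0 gives 0x005000c2. Bits outside the two fields are
left untouched.
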
 drivers/memory/Kconfig                   |    9 +
 drivers/memory/Makefile                  |    1 +
 drivers/memory/tegra124-mc.c             | 1945 ++++++++++++++++++++++++++++++
 include/dt-bindings/memory/tegra124-mc.h |   30 +
 4 files changed, 1985 insertions(+)
 create mode 100644 drivers/memory/tegra124-mc.c
 create mode 100644 include/dt-bindings/memory/tegra124-mc.h

diff --git a/drivers/memory/Kconfig b/drivers/memory/Kconfig
index c59e9c96e86d..d0f0e6781570 100644
--- a/drivers/memory/Kconfig
+++ b/drivers/memory/Kconfig
@@ -61,6 +61,15 @@ config TEGRA30_MC
 	  analysis, especially for IOMMU/SMMU(System Memory Management
 	  Unit) module.
 
+config TEGRA124_MC
+	bool "Tegra124 Memory Controller driver"
+	depends on ARCH_TEGRA
+	select IOMMU_API
+	help
+	  This driver is for the Memory Controller module available on
+	  Tegra124 SoCs. It provides an IOMMU that can be used for I/O
+	  virtual address translation.
+
 config FSL_IFC
 	bool
 	depends on FSL_SOC
diff --git a/drivers/memory/Makefile b/drivers/memory/Makefile
index 71160a2b7313..03143927abab 100644
--- a/drivers/memory/Makefile
+++ b/drivers/memory/Makefile
@@ -11,3 +11,4 @@ obj-$(CONFIG_FSL_IFC)		+= fsl_ifc.o
 obj-$(CONFIG_MVEBU_DEVBUS)	+= mvebu-devbus.o
 obj-$(CONFIG_TEGRA20_MC)	+= tegra20-mc.o
 obj-$(CONFIG_TEGRA30_MC)	+= tegra30-mc.o
+obj-$(CONFIG_TEGRA124_MC)	+= tegra124-mc.o
diff --git a/drivers/memory/tegra124-mc.c b/drivers/memory/tegra124-mc.c
new file mode 100644
index 000000000000..741755b6785d
--- /dev/null
+++ b/drivers/memory/tegra124-mc.c
@@ -0,0 +1,1945 @@
+/*
+ * Copyright (C) 2014 NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+
+#include <dt-bindings/memory/tegra124-mc.h>
+
+#include <asm/cacheflush.h>
+#ifndef CONFIG_ARM64
+#include <asm/dma-iommu.h>
+#endif
+
+#define MC_INTSTATUS 0x000
+#define  MC_INT_DECERR_MTS (1 << 16)
+#define  MC_INT_SECERR_SEC (1 << 13)
+#define  MC_INT_DECERR_VPR (1 << 12)
+#define  MC_INT_INVALID_APB_ASID_UPDATE (1 << 11)
+#define  MC_INT_INVALID_SMMU_PAGE (1 << 10)
+#define  MC_INT_ARBITRATION_EMEM (1 << 9)
+#define  MC_INT_SECURITY_VIOLATION (1 << 8)
+#define  MC_INT_DECERR_EMEM (1 << 6)
+#define MC_INTMASK 0x004
+#define MC_ERR_STATUS 0x08
+#define MC_ERR_ADR 0x0c
+
+struct latency_allowance {
+	unsigned int reg;
+	unsigned int shift;
+	unsigned int mask;
+	unsigned int def;
+};
+
+struct smmu_enable {
+	unsigned int reg;
+	unsigned int bit;
+};
+
+struct tegra_mc_client {
+	unsigned int id;
+	const char *name;
+	unsigned int swgroup;
+
+	struct smmu_enable smmu;
+	struct latency_allowance latency;
+};
+
+static const struct tegra_mc_client tegra124_mc_clients[] = {
+	{
+		.id = 0x01,
+		.name = "display0a",
+		.swgroup = TEGRA_SWGROUP_DC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 1,
+		},
+		.latency = {
+			.reg = 0x2e8,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0xc2,
+		},
+	}, {
+		.id = 0x02,
+		.name = "display0ab",
+		.swgroup = TEGRA_SWGROUP_DCB,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 2,
+		},
+		.latency = {
+			.reg = 0x2f4,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0xc6,
+		},
+	}, {
+		.id = 0x03,
+		.name = "display0b",
+		.swgroup = TEGRA_SWGROUP_DC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 3,
+		},
+		.latency = {
+			.reg = 0x2e8,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x04,
+		.name = "display0bb",
+		.swgroup = TEGRA_SWGROUP_DCB,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 4,
+		},
+		.latency = {
+			.reg = 0x2f4,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x05,
+		.name = "display0c",
+		.swgroup = TEGRA_SWGROUP_DC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 5,
+		},
+		.latency = {
+			.reg = 0x2ec,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x06,
+		.name = "display0cb",
+		.swgroup = TEGRA_SWGROUP_DCB,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 6,
+		},
+		.latency = {
+			.reg = 0x2f8,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x0e,
+		.name = "afir",
+		.swgroup = TEGRA_SWGROUP_AFI,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 14,
+		},
+		.latency = {
+			.reg = 0x2e0,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x13,
+		},
+	}, {
+		.id = 0x0f,
+		.name = "avpcarm7r",
+		.swgroup = TEGRA_SWGROUP_AVPC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 15,
+		},
+		.latency = {
+			.reg = 0x2e4,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x04,
+		},
+	}, {
+		.id = 0x10,
+		.name = "displayhc",
+		.swgroup = TEGRA_SWGROUP_DC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 16,
+		},
+		.latency = {
+			.reg = 0x2f0,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x11,
+		.name = "displayhcb",
+		.swgroup = TEGRA_SWGROUP_DCB,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 17,
+		},
+		.latency = {
+			.reg = 0x2fc,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x15,
+		.name = "hdar",
+		.swgroup = TEGRA_SWGROUP_HDA,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 21,
+		},
+		.latency = {
+			.reg = 0x318,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x24,
+		},
+	}, {
+		.id = 0x16,
+		.name = "host1xdmar",
+		.swgroup = TEGRA_SWGROUP_HC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 22,
+		},
+		.latency = {
+			.reg = 0x310,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x1e,
+		},
+	}, {
+		.id = 0x17,
+		.name = "host1xr",
+		.swgroup = TEGRA_SWGROUP_HC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 23,
+		},
+		.latency = {
+			.reg = 0x310,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x1c,
+		.name = "msencsrd",
+		.swgroup = TEGRA_SWGROUP_MSENC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 28,
+		},
+		.latency = {
+			.reg = 0x328,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x23,
+		},
+	}, {
+		.id = 0x1d,
+		.name = "ppcsahbdmarhdar",
+		.swgroup = TEGRA_SWGROUP_PPCS,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 29,
+		},
+		.latency = {
+			.reg = 0x344,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x49,
+		},
+	}, {
+		.id = 0x1e,
+		.name = "ppcsahbslvr",
+		.swgroup = TEGRA_SWGROUP_PPCS,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 30,
+		},
+		.latency = {
+			.reg = 0x344,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x1a,
+		},
+	}, {
+		.id = 0x1f,
+		.name = "satar",
+		.swgroup = TEGRA_SWGROUP_SATA,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 31,
+		},
+		.latency = {
+			.reg = 0x350,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x65,
+		},
+	}, {
+		.id = 0x22,
+		.name = "vdebsevr",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 2,
+		},
+		.latency = {
+			.reg = 0x354,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x4f,
+		},
+	}, {
+		.id = 0x23,
+		.name = "vdember",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 3,
+		},
+		.latency = {
+			.reg = 0x354,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x3d,
+		},
+	}, {
+		.id = 0x24,
+		.name = "vdemcer",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 4,
+		},
+		.latency = {
+			.reg = 0x358,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x66,
+		},
+	}, {
+		.id = 0x25,
+		.name = "vdetper",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 5,
+		},
+		.latency = {
+			.reg = 0x358,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0xa5,
+		},
+	}, {
+		.id = 0x26,
+		.name = "mpcorelpr",
+		.swgroup = TEGRA_SWGROUP_MPCORELP,
+		.latency = {
+			.reg = 0x324,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x04,
+		},
+	}, {
+		.id = 0x27,
+		.name = "mpcorer",
+		.swgroup = TEGRA_SWGROUP_MPCORE,
+		.latency = {
+			.reg = 0x320,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x04,
+		},
+	}, {
+		.id = 0x2b,
+		.name = "msencswr",
+		.swgroup = TEGRA_SWGROUP_MSENC,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 11,
+		},
+		.latency = {
+			.reg = 0x328,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x31,
+		.name = "afiw",
+		.swgroup = TEGRA_SWGROUP_AFI,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 17,
+		},
+		.latency = {
+			.reg = 0x2e0,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x32,
+		.name = "avpcarm7w",
+		.swgroup = TEGRA_SWGROUP_AVPC,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 18,
+		},
+		.latency = {
+			.reg = 0x2e4,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x35,
+		.name = "hdaw",
+		.swgroup = TEGRA_SWGROUP_HDA,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 21,
+		},
+		.latency = {
+			.reg = 0x318,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x36,
+		.name = "host1xw",
+		.swgroup = TEGRA_SWGROUP_HC,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 22,
+		},
+		.latency = {
+			.reg = 0x314,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x38,
+		.name = "mpcorelpw",
+		.swgroup = TEGRA_SWGROUP_MPCORELP,
+		.latency = {
+			.reg = 0x324,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x39,
+		.name = "mpcorew",
+		.swgroup = TEGRA_SWGROUP_MPCORE,
+		.latency = {
+			.reg = 0x320,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x3b,
+		.name = "ppcsahbdmaw",
+		.swgroup = TEGRA_SWGROUP_PPCS,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 27,
+		},
+		.latency = {
+			.reg = 0x348,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x3c,
+		.name = "ppcsahbslvw",
+		.swgroup = TEGRA_SWGROUP_PPCS,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 28,
+		},
+		.latency = {
+			.reg = 0x348,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x3d,
+		.name = "sataw",
+		.swgroup = TEGRA_SWGROUP_SATA,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 29,
+		},
+		.latency = {
+			.reg = 0x350,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x65,
+		},
+	}, {
+		.id = 0x3e,
+		.name = "vdebsevw",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 30,
+		},
+		.latency = {
+			.reg = 0x35c,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x3f,
+		.name = "vdedbgw",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 31,
+		},
+		.latency = {
+			.reg = 0x35c,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x40,
+		.name = "vdembew",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 0,
+		},
+		.latency = {
+			.reg = 0x360,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x41,
+		.name = "vdetpmw",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 1,
+		},
+		.latency = {
+			.reg = 0x360,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x44,
+		.name = "ispra",
+		.swgroup = TEGRA_SWGROUP_ISP2,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 4,
+		},
+		.latency = {
+			.reg = 0x370,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x18,
+		},
+	}, {
+		.id = 0x46,
+		.name = "ispwa",
+		.swgroup = TEGRA_SWGROUP_ISP2,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 6,
+		},
+		.latency = {
+			.reg = 0x374,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x47,
+		.name = "ispwb",
+		.swgroup = TEGRA_SWGROUP_ISP2,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 7,
+		},
+		.latency = {
+			.reg = 0x374,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x4a,
+		.name = "xusb_hostr",
+		.swgroup = TEGRA_SWGROUP_XUSB_HOST,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 10,
+		},
+		.latency = {
+			.reg = 0x37c,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x39,
+		},
+	}, {
+		.id = 0x4b,
+		.name = "xusb_hostw",
+		.swgroup = TEGRA_SWGROUP_XUSB_HOST,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 11,
+		},
+		.latency = {
+			.reg = 0x37c,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x4c,
+		.name = "xusb_devr",
+		.swgroup = TEGRA_SWGROUP_XUSB_DEV,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 12,
+		},
+		.latency = {
+			.reg = 0x380,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x39,
+		},
+	}, {
+		.id = 0x4d,
+		.name = "xusb_devw",
+		.swgroup = TEGRA_SWGROUP_XUSB_DEV,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 13,
+		},
+		.latency = {
+			.reg = 0x380,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x4e,
+		.name = "isprab",
+		.swgroup = TEGRA_SWGROUP_ISP2B,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 14,
+		},
+		.latency = {
+			.reg = 0x384,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x18,
+		},
+	}, {
+		.id = 0x50,
+		.name = "ispwab",
+		.swgroup = TEGRA_SWGROUP_ISP2B,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 16,
+		},
+		.latency = {
+			.reg = 0x388,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x51,
+		.name = "ispwbb",
+		.swgroup = TEGRA_SWGROUP_ISP2B,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 17,
+		},
+		.latency = {
+			.reg = 0x388,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x54,
+		.name = "tsecsrd",
+		.swgroup = TEGRA_SWGROUP_TSEC,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 20,
+		},
+		.latency = {
+			.reg = 0x390,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x9b,
+		},
+	}, {
+		.id = 0x55,
+		.name = "tsecswr",
+		.swgroup = TEGRA_SWGROUP_TSEC,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 21,
+		},
+		.latency = {
+			.reg = 0x390,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x56,
+		.name = "a9avpscr",
+		.swgroup = TEGRA_SWGROUP_A9AVP,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 22,
+		},
+		.latency = {
+			.reg = 0x3a4,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x04,
+		},
+	}, {
+		.id = 0x57,
+		.name = "a9avpscw",
+		.swgroup = TEGRA_SWGROUP_A9AVP,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 23,
+		},
+		.latency = {
+			.reg = 0x3a4,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x58,
+		.name = "gpusrd",
+		.swgroup = TEGRA_SWGROUP_GPU,
+		.smmu = {
+			/* read-only */
+			.reg = 0x230,
+			.bit = 24,
+		},
+		.latency = {
+			.reg = 0x3c8,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x1a,
+		},
+	}, {
+		.id = 0x59,
+		.name = "gpuswr",
+		.swgroup = TEGRA_SWGROUP_GPU,
+		.smmu = {
+			/* read-only */
+			.reg = 0x230,
+			.bit = 25,
+		},
+		.latency = {
+			.reg = 0x3c8,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x5a,
+		.name = "displayt",
+		.swgroup = TEGRA_SWGROUP_DC,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 26,
+		},
+		.latency = {
+			.reg = 0x2f0,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x60,
+		.name = "sdmmcra",
+		.swgroup = TEGRA_SWGROUP_SDMMC1A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 0,
+		},
+		.latency = {
+			.reg = 0x3b8,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x49,
+		},
+	}, {
+		.id = 0x61,
+		.name = "sdmmcraa",
+		.swgroup = TEGRA_SWGROUP_SDMMC2A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 1,
+		},
+		.latency = {
+			.reg = 0x3bc,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x49,
+		},
+	}, {
+		.id = 0x62,
+		.name = "sdmmcr",
+		.swgroup = TEGRA_SWGROUP_SDMMC3A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 2,
+		},
+		.latency = {
+			.reg = 0x3c0,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x49,
+		},
+	}, {
+		.id = 0x63,
+		.name = "sdmmcrab",
+		.swgroup = TEGRA_SWGROUP_SDMMC4A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 3,
+		},
+		.latency = {
+			.reg = 0x3c4,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x49,
+		},
+	}, {
+		.id = 0x64,
+		.name = "sdmmcwa",
+		.swgroup = TEGRA_SWGROUP_SDMMC1A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 4,
+		},
+		.latency = {
+			.reg = 0x3b8,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x65,
+		.name = "sdmmcwaa",
+		.swgroup = TEGRA_SWGROUP_SDMMC2A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 5,
+		},
+		.latency = {
+			.reg = 0x3bc,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x66,
+		.name = "sdmmcw",
+		.swgroup = TEGRA_SWGROUP_SDMMC3A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 6,
+		},
+		.latency = {
+			.reg = 0x3c0,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x67,
+		.name = "sdmmcwab",
+		.swgroup = TEGRA_SWGROUP_SDMMC4A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 7,
+		},
+		.latency = {
+			.reg = 0x3c4,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x6c,
+		.name = "vicsrd",
+		.swgroup = TEGRA_SWGROUP_VIC,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 12,
+		},
+		.latency = {
+			.reg = 0x394,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x1a,
+		},
+	}, {
+		.id = 0x6d,
+		.name = "vicswr",
+		.swgroup = TEGRA_SWGROUP_VIC,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 13,
+		},
+		.latency = {
+			.reg = 0x394,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x72,
+		.name = "viw",
+		.swgroup = TEGRA_SWGROUP_VI,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 18,
+		},
+		.latency = {
+			.reg = 0x398,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x73,
+		.name = "displayd",
+		.swgroup = TEGRA_SWGROUP_DC,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 19,
+		},
+		.latency = {
+			.reg = 0x3c8,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	},
+};
+
+struct tegra_smmu_swgroup {
+	unsigned int swgroup;
+	unsigned int reg;
+};
+
+static const struct tegra_smmu_swgroup tegra124_swgroups[] = {
+	{ .swgroup = TEGRA_SWGROUP_DC,        .reg = 0x240 },
+	{ .swgroup = TEGRA_SWGROUP_DCB,       .reg = 0x244 },
+	{ .swgroup = TEGRA_SWGROUP_AFI,       .reg = 0x238 },
+	{ .swgroup = TEGRA_SWGROUP_AVPC,      .reg = 0x23c },
+	{ .swgroup = TEGRA_SWGROUP_HDA,       .reg = 0x254 },
+	{ .swgroup = TEGRA_SWGROUP_HC,        .reg = 0x250 },
+	{ .swgroup = TEGRA_SWGROUP_MSENC,     .reg = 0x264 },
+	{ .swgroup = TEGRA_SWGROUP_PPCS,      .reg = 0x270 },
+	{ .swgroup = TEGRA_SWGROUP_SATA,      .reg = 0x274 },
+	{ .swgroup = TEGRA_SWGROUP_VDE,       .reg = 0x27c },
+	{ .swgroup = TEGRA_SWGROUP_ISP2,      .reg = 0x258 },
+	{ .swgroup = TEGRA_SWGROUP_XUSB_HOST, .reg = 0x288 },
+	{ .swgroup = TEGRA_SWGROUP_XUSB_DEV,  .reg = 0x28c },
+	{ .swgroup = TEGRA_SWGROUP_ISP2B,     .reg = 0xaa4 },
+	{ .swgroup = TEGRA_SWGROUP_TSEC,      .reg = 0x294 },
+	{ .swgroup = TEGRA_SWGROUP_A9AVP,     .reg = 0x290 },
+	{ .swgroup = TEGRA_SWGROUP_GPU,       .reg = 0xaa8 },
+	{ .swgroup = TEGRA_SWGROUP_SDMMC1A,   .reg = 0xa94 },
+	{ .swgroup = TEGRA_SWGROUP_SDMMC2A,   .reg = 0xa98 },
+	{ .swgroup = TEGRA_SWGROUP_SDMMC3A,   .reg = 0xa9c },
+	{ .swgroup = TEGRA_SWGROUP_SDMMC4A,   .reg = 0xaa0 },
+	{ .swgroup = TEGRA_SWGROUP_VIC,       .reg = 0x284 },
+	{ .swgroup = TEGRA_SWGROUP_VI,        .reg = 0x280 },
+};
+
+struct tegra_smmu_group_init {
+	unsigned int asid;
+	const char *name;
+
+	const struct of_device_id *matches;
+};
+
+struct tegra_smmu_soc {
+	const struct tegra_smmu_group_init *groups;
+	unsigned int num_groups;
+
+	const struct tegra_mc_client *clients;
+	unsigned int num_clients;
+
+	const struct tegra_smmu_swgroup *swgroups;
+	unsigned int num_swgroups;
+
+	unsigned int num_asids;
+	unsigned int atom_size;
+
+	const struct tegra_smmu_ops *ops;
+};
+
+struct tegra_smmu_ops {
+	void (*flush_dcache)(struct page *page, unsigned long offset,
+			     size_t size);
+};
+
+struct tegra_smmu_master {
+	struct list_head list;
+	struct device *dev;
+};
+
+struct tegra_smmu_group {
+	const char *name;
+	const struct of_device_id *matches;
+	unsigned int asid;
+
+#ifndef CONFIG_ARM64
+	struct dma_iommu_mapping *mapping;
+#endif
+	struct list_head masters;
+};
+
+static const struct of_device_id tegra124_periph_matches[] = {
+	{ .compatible = "nvidia,tegra124-sdhci", },
+	{ }
+};
+
+static const struct tegra_smmu_group_init tegra124_smmu_groups[] = {
+	{ 0, "peripherals", tegra124_periph_matches },
+};
+
+static void tegra_smmu_group_release(void *data)
+{
+	kfree(data);
+}
+
+struct tegra_smmu {
+	void __iomem *regs;
+	struct iommu iommu;
+	struct device *dev;
+
+	const struct tegra_smmu_soc *soc;
+
+	struct iommu_group **groups;
+	unsigned int num_groups;
+
+	unsigned long *asids;
+	struct mutex lock;
+};
+
+struct tegra_smmu_address_space {
+	struct iommu_domain *domain;
+	struct tegra_smmu *smmu;
+	struct page *pd;
+	unsigned id;
+	u32 attr;
+};
+
+static inline void smmu_writel(struct tegra_smmu *smmu, u32 value,
+			       unsigned long offset)
+{
+	writel(value, smmu->regs + offset);
+}
+
+static inline u32 smmu_readl(struct tegra_smmu *smmu, unsigned long offset)
+{
+	return readl(smmu->regs + offset);
+}
+
+#define SMMU_CONFIG 0x010
+#define  SMMU_CONFIG_ENABLE (1 << 0)
+
+#define SMMU_PTB_ASID 0x01c
+#define  SMMU_PTB_ASID_VALUE(x) ((x) & 0x7f)
+
+#define SMMU_PTB_DATA 0x020
+#define  SMMU_PTB_DATA_VALUE(page, attr) (page_to_phys(page) >> 12 | (attr))
+
+#define SMMU_MK_PDE(page, attr) (page_to_phys(page) >> SMMU_PTE_SHIFT | (attr))
+
+#define SMMU_TLB_FLUSH 0x030
+#define  SMMU_TLB_FLUSH_VA_MATCH_ALL     (0 << 0)
+#define  SMMU_TLB_FLUSH_VA_MATCH_SECTION (2 << 0)
+#define  SMMU_TLB_FLUSH_VA_MATCH_GROUP   (3 << 0)
+#define  SMMU_TLB_FLUSH_ASID(x)          (((x) & 0x7f) << 24)
+#define  SMMU_TLB_FLUSH_VA_SECTION(addr) ((((addr) & 0xffc00000) >> 12) | \
+					  SMMU_TLB_FLUSH_VA_MATCH_SECTION)
+#define  SMMU_TLB_FLUSH_VA_GROUP(addr)   ((((addr) & 0xffffc000) >> 12) | \
+					  SMMU_TLB_FLUSH_VA_MATCH_GROUP)
+#define  SMMU_TLB_FLUSH_ASID_MATCH       (1 << 31)
+
+#define SMMU_PTC_FLUSH 0x034
+#define  SMMU_PTC_FLUSH_TYPE_ALL (0 << 0)
+#define  SMMU_PTC_FLUSH_TYPE_ADR (1 << 0)
+
+#define SMMU_PTC_FLUSH_HI 0x9b8
+#define  SMMU_PTC_FLUSH_HI_MASK 0x3
+
+/* per-SWGROUP SMMU_*_ASID register */
+#define SMMU_ASID_ENABLE (1 << 31)
+#define SMMU_ASID_MASK 0x7f
+#define SMMU_ASID_VALUE(x) ((x) & SMMU_ASID_MASK)
+
+/* page table definitions */
+#define SMMU_NUM_PDE 1024
+#define SMMU_NUM_PTE 1024
+
+#define SMMU_SIZE_PD (SMMU_NUM_PDE * 4)
+#define SMMU_SIZE_PT (SMMU_NUM_PTE * 4)
+
+#define SMMU_PDE_SHIFT 22
+#define SMMU_PTE_SHIFT 12
+
+#define SMMU_PFN_MASK 0x000fffff
+
+#define SMMU_PD_READABLE	(1 << 31)
+#define SMMU_PD_WRITABLE	(1 << 30)
+#define SMMU_PD_NONSECURE	(1 << 29)
+
+#define SMMU_PDE_READABLE	(1 << 31)
+#define SMMU_PDE_WRITABLE	(1 << 30)
+#define SMMU_PDE_NONSECURE	(1 << 29)
+#define SMMU_PDE_NEXT		(1 << 28)
+
+#define SMMU_PTE_READABLE	(1 << 31)
+#define SMMU_PTE_WRITABLE	(1 << 30)
+#define SMMU_PTE_NONSECURE	(1 << 29)
+
+#define SMMU_PDE_ATTR		(SMMU_PDE_READABLE | SMMU_PDE_WRITABLE | \
+				 SMMU_PDE_NONSECURE)
+#define SMMU_PTE_ATTR		(SMMU_PTE_READABLE | SMMU_PTE_WRITABLE | \
+				 SMMU_PTE_NONSECURE)
+
+#define SMMU_PDE_VACANT(n)	(((n) << 10) | SMMU_PDE_ATTR)
+#define SMMU_PTE_VACANT(n)	(((n) << 12) | SMMU_PTE_ATTR)
+
+#ifdef CONFIG_ARCH_TEGRA_124_SOC
+static void tegra124_flush_dcache(struct page *page, unsigned long offset,
+				  size_t size)
+{
+	phys_addr_t phys = page_to_phys(page) + offset;
+	void *virt = page_address(page) + offset;
+
+	__cpuc_flush_dcache_area(virt, size);
+	outer_flush_range(phys, phys + size);
+}
+
+static const struct tegra_smmu_ops tegra124_smmu_ops = {
+	.flush_dcache = tegra124_flush_dcache,
+};
+#endif
+
+static void tegra132_flush_dcache(struct page *page, unsigned long offset,
+				  size_t size)
+{
+	/* TODO: implement */
+}
+
+static const struct tegra_smmu_ops tegra132_smmu_ops = {
+	.flush_dcache = tegra132_flush_dcache,
+};
+
+static inline void smmu_flush_ptc(struct tegra_smmu *smmu, struct page *page,
+				  unsigned long offset)
+{
+	phys_addr_t phys = page ? page_to_phys(page) : 0;
+	u32 value;
+
+	if (page) {
+		offset &= ~(smmu->soc->atom_size - 1);
+
+#ifdef CONFIG_PHYS_ADDR_T_64BIT
+		value = (phys >> 32) & SMMU_PTC_FLUSH_HI_MASK;
+#else
+		value = 0;
+#endif
+		smmu_writel(smmu, value, SMMU_PTC_FLUSH_HI);
+
+		value = (phys + offset) | SMMU_PTC_FLUSH_TYPE_ADR;
+	} else {
+		value = SMMU_PTC_FLUSH_TYPE_ALL;
+	}
+
+	smmu_writel(smmu, value, SMMU_PTC_FLUSH);
+}
+
+static inline void smmu_flush_tlb(struct tegra_smmu *smmu)
+{
+	smmu_writel(smmu, SMMU_TLB_FLUSH_VA_MATCH_ALL, SMMU_TLB_FLUSH);
+}
+
+static inline void smmu_flush_tlb_asid(struct tegra_smmu *smmu,
+				       unsigned long asid)
+{
+	u32 value;
+
+	value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
+		SMMU_TLB_FLUSH_VA_MATCH_ALL;
+	smmu_writel(smmu, value, SMMU_TLB_FLUSH);
+}
+
+static inline void smmu_flush_tlb_section(struct tegra_smmu *smmu,
+					  unsigned long asid,
+					  unsigned long iova)
+{
+	u32 value;
+
+	value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
+		SMMU_TLB_FLUSH_VA_SECTION(iova);
+	smmu_writel(smmu, value, SMMU_TLB_FLUSH);
+}
+
+static inline void smmu_flush_tlb_group(struct tegra_smmu *smmu,
+					unsigned long asid,
+					unsigned long iova)
+{
+	u32 value;
+
+	value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
+		SMMU_TLB_FLUSH_VA_GROUP(iova);
+	smmu_writel(smmu, value, SMMU_TLB_FLUSH);
+}
+
+static inline void smmu_flush(struct tegra_smmu *smmu)
+{
+	/* read back a register to make sure preceding writes have completed */
+	smmu_readl(smmu, SMMU_CONFIG);
+}
+
+static inline struct tegra_smmu *to_tegra_smmu(struct iommu *iommu)
+{
+	return container_of(iommu, struct tegra_smmu, iommu);
+}
+
+static struct tegra_smmu *smmu_handle;
+
+static int tegra_smmu_alloc_asid(struct tegra_smmu *smmu, unsigned int *idp)
+{
+	unsigned long id;
+
+	mutex_lock(&smmu->lock);
+
+	id = find_first_zero_bit(smmu->asids, smmu->soc->num_asids);
+	if (id >= smmu->soc->num_asids) {
+		mutex_unlock(&smmu->lock);
+		return -ENOSPC;
+	}
+
+	set_bit(id, smmu->asids);
+	*idp = id;
+
+	mutex_unlock(&smmu->lock);
+	return 0;
+}
+
+static void tegra_smmu_free_asid(struct tegra_smmu *smmu, unsigned int id)
+{
+	mutex_lock(&smmu->lock);
+	clear_bit(id, smmu->asids);
+	mutex_unlock(&smmu->lock);
+}
+
+static int tegra_smmu_domain_init(struct iommu_domain *domain)
+{
+	struct tegra_smmu *smmu = smmu_handle;
+	struct tegra_smmu_address_space *as;
+	uint32_t *pd, value;
+	unsigned int i;
+	int err = 0;
+
+	as = kzalloc(sizeof(*as), GFP_KERNEL);
+	if (!as) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	as->attr = SMMU_PD_READABLE | SMMU_PD_WRITABLE | SMMU_PD_NONSECURE;
+	as->smmu = smmu_handle;
+	as->domain = domain;
+
+	err = tegra_smmu_alloc_asid(smmu, &as->id);
+	if (err < 0) {
+		kfree(as);
+		goto out;
+	}
+
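+	as->pd = alloc_page(GFP_KERNEL | __GFP_DMA);
+	if (!as->pd) {
+		/* release the ASID and address space allocated above */
+		tegra_smmu_free_asid(smmu, as->id);
+		kfree(as);
+		err = -ENOMEM;
+		goto out;
+	}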
+
+	pd = page_address(as->pd);
+	SetPageReserved(as->pd);
+
+	for (i = 0; i < SMMU_NUM_PDE; i++)
+		pd[i] = SMMU_PDE_VACANT(i);
+
+	smmu->soc->ops->flush_dcache(as->pd, 0, SMMU_SIZE_PD);
+	smmu_flush_ptc(smmu, as->pd, 0);
+	smmu_flush_tlb_asid(smmu, as->id);
+
+	smmu_writel(smmu, SMMU_PTB_ASID_VALUE(as->id), SMMU_PTB_ASID);
+	value = SMMU_PTB_DATA_VALUE(as->pd, as->attr);
+	smmu_writel(smmu, value, SMMU_PTB_DATA);
+	smmu_flush(smmu);
+
+	domain->priv = as;
+
+	return 0;
+
+out:
+	return err;
+}
+
+static void tegra_smmu_domain_destroy(struct iommu_domain *domain)
+{
+	struct tegra_smmu_address_space *as = domain->priv;
+
+	/* TODO: free page directory and page tables */
+
+	tegra_smmu_free_asid(as->smmu, as->id);
+	kfree(as);
+}
+
+static const struct tegra_smmu_swgroup *
+tegra_smmu_find_swgroup(struct tegra_smmu *smmu, unsigned int swgroup)
+{
+	const struct tegra_smmu_swgroup *group = NULL;
+	unsigned int i;
+
+	for (i = 0; i < smmu->soc->num_swgroups; i++) {
+		if (smmu->soc->swgroups[i].swgroup == swgroup) {
+			group = &smmu->soc->swgroups[i];
+			break;
+		}
+	}
+
+	return group;
+}
+
+static int tegra_smmu_enable(struct tegra_smmu *smmu, unsigned int swgroup,
+			     unsigned int asid)
+{
+	const struct tegra_smmu_swgroup *group;
+	unsigned int i;
+	u32 value;
+
+	for (i = 0; i < smmu->soc->num_clients; i++) {
+		const struct tegra_mc_client *client = &smmu->soc->clients[i];
+
+		if (client->swgroup != swgroup)
+			continue;
+
+		value = smmu_readl(smmu, client->smmu.reg);
+		value |= BIT(client->smmu.bit);
+		smmu_writel(smmu, value, client->smmu.reg);
+	}
+
+	group = tegra_smmu_find_swgroup(smmu, swgroup);
+	if (group) {
+		value = smmu_readl(smmu, group->reg);
+		value &= ~SMMU_ASID_MASK;
+		value |= SMMU_ASID_VALUE(asid);
+		value |= SMMU_ASID_ENABLE;
+		smmu_writel(smmu, value, group->reg);
+	}
+
+	return 0;
+}
+
+static int tegra_smmu_disable(struct tegra_smmu *smmu, unsigned int swgroup,
+			      unsigned int asid)
+{
+	const struct tegra_smmu_swgroup *group;
+	unsigned int i;
+	u32 value;
+
+	group = tegra_smmu_find_swgroup(smmu, swgroup);
+	if (group) {
+		value = smmu_readl(smmu, group->reg);
+		value &= ~SMMU_ASID_MASK;
+		value |= SMMU_ASID_VALUE(asid);
+		value &= ~SMMU_ASID_ENABLE;
+		smmu_writel(smmu, value, group->reg);
+	}
+
+	for (i = 0; i < smmu->soc->num_clients; i++) {
+		const struct tegra_mc_client *client = &smmu->soc->clients[i];
+
+		if (client->swgroup != swgroup)
+			continue;
+
+		value = smmu_readl(smmu, client->smmu.reg);
+		value &= ~BIT(client->smmu.bit);
+		smmu_writel(smmu, value, client->smmu.reg);
+	}
+
+	return 0;
+}
+
+static int tegra_smmu_attach_dev(struct iommu_domain *domain,
+				 struct device *dev)
+{
+	struct tegra_smmu_address_space *as = domain->priv;
+	struct tegra_smmu *smmu = as->smmu;
+	struct of_phandle_iter entry;
+	int err;
+
+	of_property_for_each_phandle_with_args(entry, dev->of_node, "iommus",
+					       "#iommu-cells", 0) {
+		unsigned int swgroup = entry.out_args.args[0];
+
+		if (entry.out_args.np != smmu->dev->of_node)
+			continue;
+
+		err = tegra_smmu_enable(smmu, swgroup, as->id);
+		if (err < 0)
+			pr_err("failed to enable SWGROUP#%u\n", swgroup);
+	}
+
+	return 0;
+}
+
+static void tegra_smmu_detach_dev(struct iommu_domain *domain,
+				  struct device *dev)
+{
+	struct tegra_smmu_address_space *as = domain->priv;
+	struct tegra_smmu *smmu = as->smmu;
+	struct of_phandle_iter entry;
+	int err;
+
+	of_property_for_each_phandle_with_args(entry, dev->of_node, "iommus",
+					       "#iommu-cells", 0) {
+		unsigned int swgroup;
+
+		if (entry.out_args.np != smmu->dev->of_node)
+			continue;
+
+		swgroup = entry.out_args.args[0];
+
+		err = tegra_smmu_disable(smmu, swgroup, as->id);
+		if (err < 0)
+			pr_err("failed to disable SWGROUP#%u\n", swgroup);
+	}
+}
+
+static u32 *as_get_pte(struct tegra_smmu_address_space *as, dma_addr_t iova,
+		       struct page **pagep)
+{
+	struct tegra_smmu *smmu = smmu_handle;
+	u32 *pd = page_address(as->pd), *pt;
+	u32 pde = (iova >> SMMU_PDE_SHIFT) & 0x3ff;
+	u32 pte = (iova >> SMMU_PTE_SHIFT) & 0x3ff;
+	struct page *page;
+	unsigned int i;
+
+	if (pd[pde] != SMMU_PDE_VACANT(pde)) {
+		page = pfn_to_page(pd[pde] & SMMU_PFN_MASK);
+		pt = page_address(page);
+	} else {
+		page = alloc_page(GFP_KERNEL | __GFP_DMA);
+		if (!page)
+			return NULL;
+
+		pt = page_address(page);
+		SetPageReserved(page);
+
+		for (i = 0; i < SMMU_NUM_PTE; i++)
+			pt[i] = SMMU_PTE_VACANT(i);
+
+		smmu->soc->ops->flush_dcache(page, 0, SMMU_SIZE_PT);
+
+		pd[pde] = SMMU_MK_PDE(page, SMMU_PDE_ATTR | SMMU_PDE_NEXT);
+
+		smmu->soc->ops->flush_dcache(as->pd, pde << 2, 4);
+		smmu_flush_ptc(smmu, as->pd, pde << 2);
+		smmu_flush_tlb_section(smmu, as->id, iova);
+		smmu_flush(smmu);
+	}
+
+	*pagep = page;
+
+	return &pt[pte];
+}
+
+static int tegra_smmu_map(struct iommu_domain *domain, unsigned long iova,
+			  phys_addr_t paddr, size_t size, int prot)
+{
+	struct tegra_smmu_address_space *as = domain->priv;
+	struct tegra_smmu *smmu = smmu_handle;
+	unsigned long offset;
+	struct page *page;
+	u32 *pte;
+
+	pte = as_get_pte(as, iova, &page);
+	if (!pte)
+		return -ENOMEM;
+
+	offset = offset_in_page(pte);
+
+	*pte = __phys_to_pfn(paddr) | SMMU_PTE_ATTR;
+
+	smmu->soc->ops->flush_dcache(page, offset, 4);
+	smmu_flush_ptc(smmu, page, offset);
+	smmu_flush_tlb_group(smmu, as->id, iova);
+	smmu_flush(smmu);
+
+	return 0;
+}
+
+static size_t tegra_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
+			       size_t size)
+{
+	struct tegra_smmu_address_space *as = domain->priv;
+	struct tegra_smmu *smmu = smmu_handle;
+	unsigned long offset;
+	struct page *page;
+	u32 *pte;
+
+	pte = as_get_pte(as, iova, &page);
+	if (!pte)
+		return 0;
+
+	offset = offset_in_page(pte);
+	*pte = 0;
+
+	smmu->soc->ops->flush_dcache(page, offset, 4);
+	smmu_flush_ptc(smmu, page, offset);
+	smmu_flush_tlb_group(smmu, as->id, iova);
+	smmu_flush(smmu);
+
+	return size;
+}
+
+static phys_addr_t tegra_smmu_iova_to_phys(struct iommu_domain *domain,
+					   dma_addr_t iova)
+{
+	struct tegra_smmu_address_space *as = domain->priv;
+	struct page *page;
+	unsigned long pfn;
+	u32 *pte;
+
+	pte = as_get_pte(as, iova, &page);
+	if (!pte)
+		return 0;
+
+	pfn = *pte & SMMU_PFN_MASK;
+
+	return PFN_PHYS(pfn);
+}
+
+static int tegra_smmu_attach(struct iommu *iommu, struct device *dev)
+{
+	struct tegra_smmu *smmu = to_tegra_smmu(iommu);
+	struct tegra_smmu_group *group;
+	unsigned int i;
+
+	for (i = 0; i < smmu->soc->num_groups; i++) {
+		group = iommu_group_get_iommudata(smmu->groups[i]);
+
+		if (of_match_node(group->matches, dev->of_node)) {
+			pr_debug("adding device %s to group %s\n",
+				 dev_name(dev), group->name);
+			iommu_group_add_device(smmu->groups[i], dev);
+			break;
+		}
+	}
+
+	if (i == smmu->soc->num_groups)
+		return 0;
+
+#ifndef CONFIG_ARM64
+	return arm_iommu_attach_device(dev, group->mapping);
+#else
+	return 0;
+#endif
+}
+
+static int tegra_smmu_detach(struct iommu *iommu, struct device *dev)
+{
+	return 0;
+}
+
+static const struct iommu_ops tegra_smmu_ops = {
+	.domain_init = tegra_smmu_domain_init,
+	.domain_destroy = tegra_smmu_domain_destroy,
+	.attach_dev = tegra_smmu_attach_dev,
+	.detach_dev = tegra_smmu_detach_dev,
+	.map = tegra_smmu_map,
+	.unmap = tegra_smmu_unmap,
+	.iova_to_phys = tegra_smmu_iova_to_phys,
+	.attach = tegra_smmu_attach,
+	.detach = tegra_smmu_detach,
+
+	.pgsize_bitmap = SZ_4K,
+};
+
+static struct tegra_smmu *tegra_smmu_probe(struct device *dev,
+					   const struct tegra_smmu_soc *soc,
+					   void __iomem *regs)
+{
+	struct tegra_smmu *smmu;
+	unsigned int i;
+	size_t size;
+	u32 value;
+	int err;
+
+	smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
+	if (!smmu)
+		return ERR_PTR(-ENOMEM);
+
+	size = BITS_TO_LONGS(soc->num_asids) * sizeof(long);
+
+	smmu->asids = devm_kzalloc(dev, size, GFP_KERNEL);
+	if (!smmu->asids)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&smmu->iommu.list);
+	mutex_init(&smmu->lock);
+
+	smmu->iommu.ops = &tegra_smmu_ops;
+	smmu->iommu.dev = dev;
+
+	smmu->regs = regs;
+	smmu->soc = soc;
+	smmu->dev = dev;
+
+	smmu_handle = smmu;
+	bus_set_iommu(&platform_bus_type, &tegra_smmu_ops);
+
+	smmu->num_groups = soc->num_groups;
+
+	smmu->groups = devm_kcalloc(dev, smmu->num_groups, sizeof(*smmu->groups),
+				    GFP_KERNEL);
+	if (!smmu->groups)
+		return ERR_PTR(-ENOMEM);
+
+	for (i = 0; i < smmu->num_groups; i++) {
+		struct tegra_smmu_group *group;
+
+		smmu->groups[i] = iommu_group_alloc();
+		if (IS_ERR(smmu->groups[i]))
+			return ERR_CAST(smmu->groups[i]);
+
+		err = iommu_group_set_name(smmu->groups[i],
+					   soc->groups[i].name);
+		if (err < 0) {
+			iommu_group_put(smmu->groups[i]);
+			return ERR_PTR(err);
+		}
+
+		group = kzalloc(sizeof(*group), GFP_KERNEL);
+		if (!group)
+			return ERR_PTR(-ENOMEM);
+
+		group->matches = soc->groups[i].matches;
+		group->asid = soc->groups[i].asid;
+		group->name = soc->groups[i].name;
+
+		iommu_group_set_iommudata(smmu->groups[i], group,
+					  tegra_smmu_group_release);
+
+#ifndef CONFIG_ARM64
+		group->mapping = arm_iommu_create_mapping(&platform_bus_type,
+							  0, SZ_2G);
+		if (IS_ERR(group->mapping)) {
+			dev_err(dev, "failed to create mapping for group %s: %ld\n",
+				group->name, PTR_ERR(group->mapping));
+			return ERR_CAST(group->mapping);
+		}
+#endif
+	}
+
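+	/* configure the page table cache (register offset 0x018) */
+	value = (1 << 29) | (8 << 24) | 0x3f;
+	smmu_writel(smmu, value, 0x018);
+
+	/* configure the TLB (register offset 0x014) */
+	value = (1 << 29) | (1 << 28) | 0x20;
+	smmu_writel(smmu, value, 0x014);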
+
+	smmu_flush_ptc(smmu, NULL, 0);
+	smmu_flush_tlb(smmu);
+	smmu_writel(smmu, SMMU_CONFIG_ENABLE, SMMU_CONFIG);
+	smmu_flush(smmu);
+
+	err = iommu_add(&smmu->iommu);
+	if (err < 0)
+		return ERR_PTR(err);
+
+	return smmu;
+}
+
+static int tegra_smmu_remove(struct tegra_smmu *smmu)
+{
+	iommu_remove(&smmu->iommu);
+
+	return 0;
+}
+
+#ifdef CONFIG_ARCH_TEGRA_124_SOC
+static const struct tegra_smmu_soc tegra124_smmu_soc = {
+	.groups = tegra124_smmu_groups,
+	.num_groups = ARRAY_SIZE(tegra124_smmu_groups),
+	.clients = tegra124_mc_clients,
+	.num_clients = ARRAY_SIZE(tegra124_mc_clients),
+	.swgroups = tegra124_swgroups,
+	.num_swgroups = ARRAY_SIZE(tegra124_swgroups),
+	.num_asids = 128,
+	.atom_size = 32,
+	.ops = &tegra124_smmu_ops,
+};
+#endif
+
+static const struct tegra_smmu_soc tegra132_smmu_soc = {
+	.groups = tegra124_smmu_groups,
+	.num_groups = ARRAY_SIZE(tegra124_smmu_groups),
+	.clients = tegra124_mc_clients,
+	.num_clients = ARRAY_SIZE(tegra124_mc_clients),
+	.swgroups = tegra124_swgroups,
+	.num_swgroups = ARRAY_SIZE(tegra124_swgroups),
+	.num_asids = 128,
+	.atom_size = 32,
+	.ops = &tegra132_smmu_ops,
+};
+
+struct tegra_mc {
+	struct device *dev;
+	struct tegra_smmu *smmu;
+	void __iomem *regs;
+	int irq;
+
+	const struct tegra_mc_soc *soc;
+};
+
+static inline u32 mc_readl(struct tegra_mc *mc, unsigned long offset)
+{
+	return readl(mc->regs + offset);
+}
+
+static inline void mc_writel(struct tegra_mc *mc, u32 value,
+			     unsigned long offset)
+{
+	writel(value, mc->regs + offset);
+}
+
+struct tegra_mc_soc {
+	const struct tegra_mc_client *clients;
+	unsigned int num_clients;
+
+	const struct tegra_smmu_soc *smmu;
+};
+
+#ifdef CONFIG_ARCH_TEGRA_124_SOC
+static const struct tegra_mc_soc tegra124_mc_soc = {
+	.clients = tegra124_mc_clients,
+	.num_clients = ARRAY_SIZE(tegra124_mc_clients),
+	.smmu = &tegra124_smmu_soc,
+};
+#endif
+
+static const struct tegra_mc_soc tegra132_mc_soc = {
+	.clients = tegra124_mc_clients,
+	.num_clients = ARRAY_SIZE(tegra124_mc_clients),
+	.smmu = &tegra132_smmu_soc,
+};
+
+static const struct of_device_id tegra_mc_of_match[] = {
+#ifdef CONFIG_ARCH_TEGRA_124_SOC
+	{ .compatible = "nvidia,tegra124-mc", .data = &tegra124_mc_soc },
+#endif
+	{ .compatible = "nvidia,tegra132-mc", .data = &tegra132_mc_soc },
+	{ }
+};
+
+static irqreturn_t tegra124_mc_irq(int irq, void *data)
+{
+	struct tegra_mc *mc = data;
+	u32 value, status, mask;
+
+	/* mask all interrupts to avoid flooding */
+	mask = mc_readl(mc, MC_INTMASK);
+	mc_writel(mc, 0, MC_INTMASK);
+
+	status = mc_readl(mc, MC_INTSTATUS);
+	mc_writel(mc, status, MC_INTSTATUS);
+
+	dev_dbg(mc->dev, "INTSTATUS: %08x\n", status);
+
+	if (status & MC_INT_DECERR_MTS)
+		dev_dbg(mc->dev, "  DECERR_MTS\n");
+
+	if (status & MC_INT_SECERR_SEC)
+		dev_dbg(mc->dev, "  SECERR_SEC\n");
+
+	if (status & MC_INT_DECERR_VPR)
+		dev_dbg(mc->dev, "  DECERR_VPR\n");
+
+	if (status & MC_INT_INVALID_APB_ASID_UPDATE)
+		dev_dbg(mc->dev, "  INVALID_APB_ASID_UPDATE\n");
+
+	if (status & MC_INT_INVALID_SMMU_PAGE)
+		dev_dbg(mc->dev, "  INVALID_SMMU_PAGE\n");
+
+	if (status & MC_INT_ARBITRATION_EMEM)
+		dev_dbg(mc->dev, "  ARBITRATION_EMEM\n");
+
+	if (status & MC_INT_SECURITY_VIOLATION)
+		dev_dbg(mc->dev, "  SECURITY_VIOLATION\n");
+
+	if (status & MC_INT_DECERR_EMEM)
+		dev_dbg(mc->dev, "  DECERR_EMEM\n");
+
+	value = mc_readl(mc, MC_ERR_STATUS);
+
+	dev_dbg(mc->dev, "ERR_STATUS: %08x\n", value);
+	dev_dbg(mc->dev, "  type: %x\n", (value >> 28) & 0x7);
+	dev_dbg(mc->dev, "  protection: %x\n", (value >> 25) & 0x7);
+	dev_dbg(mc->dev, "  adr_hi: %x\n", (value >> 20) & 0x3);
+	dev_dbg(mc->dev, "  swap: %x\n", (value >> 18) & 0x1);
+	dev_dbg(mc->dev, "  security: %x\n", (value >> 17) & 0x1);
+	dev_dbg(mc->dev, "  r/w: %x\n", (value >> 16) & 0x1);
+	dev_dbg(mc->dev, "  adr1: %x\n", (value >> 12) & 0x7);
+	dev_dbg(mc->dev, "  client: %x\n", value & 0x7f);
+
+	value = mc_readl(mc, MC_ERR_ADR);
+	dev_dbg(mc->dev, "ERR_ADR: %08x\n", value);
+
+	mc_writel(mc, mask, MC_INTMASK);
+
+	return IRQ_HANDLED;
+}
+
+static int tegra_mc_probe(struct platform_device *pdev)
+{
+	const struct of_device_id *match;
+	struct resource *res;
+	struct tegra_mc *mc;
+	unsigned int i;
+	u32 value;
+	int err;
+
+	match = of_match_node(tegra_mc_of_match, pdev->dev.of_node);
+	if (!match)
+		return -ENODEV;
+
+	mc = devm_kzalloc(&pdev->dev, sizeof(*mc), GFP_KERNEL);
+	if (!mc)
+		return -ENOMEM;
+
+	platform_set_drvdata(pdev, mc);
+	mc->soc = match->data;
+	mc->dev = &pdev->dev;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	mc->regs = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(mc->regs))
+		return PTR_ERR(mc->regs);
+
+	for (i = 0; i < mc->soc->num_clients; i++) {
+		const struct latency_allowance *la = &mc->soc->clients[i].latency;
+		u32 value;
+
+		value = readl(mc->regs + la->reg);
+		value &= ~(la->mask << la->shift);
+		value |= (la->def & la->mask) << la->shift;
+		writel(value, mc->regs + la->reg);
+	}
+
+	mc->smmu = tegra_smmu_probe(&pdev->dev, mc->soc->smmu, mc->regs);
+	if (IS_ERR(mc->smmu)) {
+		dev_err(&pdev->dev, "failed to probe SMMU: %ld\n",
+			PTR_ERR(mc->smmu));
+		return PTR_ERR(mc->smmu);
+	}
+
+	mc->irq = platform_get_irq(pdev, 0);
+	if (mc->irq < 0) {
+		dev_err(&pdev->dev, "interrupt not specified\n");
+		return mc->irq;
+	}
+
+	err = devm_request_irq(&pdev->dev, mc->irq, tegra124_mc_irq,
+			       IRQF_SHARED, dev_name(&pdev->dev), mc);
+	if (err < 0) {
+		dev_err(&pdev->dev, "failed to request IRQ#%u: %d\n", mc->irq,
+			err);
+		return err;
+	}
+
+	value = MC_INT_DECERR_MTS | MC_INT_SECERR_SEC | MC_INT_DECERR_VPR |
+		MC_INT_INVALID_APB_ASID_UPDATE | MC_INT_INVALID_SMMU_PAGE |
+		MC_INT_ARBITRATION_EMEM | MC_INT_SECURITY_VIOLATION |
+		MC_INT_DECERR_EMEM;
+	mc_writel(mc, value, MC_INTMASK);
+
+	return 0;
+}
+
+static int tegra_mc_remove(struct platform_device *pdev)
+{
+	struct tegra_mc *mc = platform_get_drvdata(pdev);
+	int err;
+
+	err = tegra_smmu_remove(mc->smmu);
+	if (err < 0)
+		dev_err(&pdev->dev, "failed to remove SMMU: %d\n", err);
+
+	return 0;
+}
+
+static struct platform_driver tegra_mc_driver = {
+	.driver = {
+		.name = "tegra124-mc",
+		.of_match_table = tegra_mc_of_match,
+	},
+	.probe = tegra_mc_probe,
+	.remove = tegra_mc_remove,
+};
+module_platform_driver(tegra_mc_driver);
+
+MODULE_AUTHOR("Thierry Reding <treding@nvidia.com>");
+MODULE_DESCRIPTION("NVIDIA Tegra124 Memory Controller driver");
+MODULE_LICENSE("GPL v2");
diff --git a/include/dt-bindings/memory/tegra124-mc.h b/include/dt-bindings/memory/tegra124-mc.h
new file mode 100644
index 000000000000..6b1617ce022f
--- /dev/null
+++ b/include/dt-bindings/memory/tegra124-mc.h
@@ -0,0 +1,30 @@
+#ifndef DT_BINDINGS_MEMORY_TEGRA124_MC_H
+#define DT_BINDINGS_MEMORY_TEGRA124_MC_H
+
+#define TEGRA_SWGROUP_DC	0
+#define TEGRA_SWGROUP_DCB	1
+#define TEGRA_SWGROUP_AFI	2
+#define TEGRA_SWGROUP_AVPC	3
+#define TEGRA_SWGROUP_HDA	4
+#define TEGRA_SWGROUP_HC	5
+#define TEGRA_SWGROUP_MSENC	6
+#define TEGRA_SWGROUP_PPCS	7
+#define TEGRA_SWGROUP_SATA	8
+#define TEGRA_SWGROUP_VDE	9
+#define TEGRA_SWGROUP_MPCORELP	10
+#define TEGRA_SWGROUP_MPCORE	11
+#define TEGRA_SWGROUP_ISP2	12
+#define TEGRA_SWGROUP_XUSB_HOST	13
+#define TEGRA_SWGROUP_XUSB_DEV	14
+#define TEGRA_SWGROUP_ISP2B	15
+#define TEGRA_SWGROUP_TSEC	16
+#define TEGRA_SWGROUP_A9AVP	17
+#define TEGRA_SWGROUP_GPU	18
+#define TEGRA_SWGROUP_SDMMC1A	19
+#define TEGRA_SWGROUP_SDMMC2A	20
+#define TEGRA_SWGROUP_SDMMC3A	21
+#define TEGRA_SWGROUP_SDMMC4A	22
+#define TEGRA_SWGROUP_VIC	23
+#define TEGRA_SWGROUP_VI	24
+
+#endif
-- 
2.0.0
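
For readers following the interrupt handler in this patch: the MC_ERR_STATUS
fields it prints can be pulled apart with the same shifts and masks outside
the kernel. The following is only a sketch under stated assumptions. The
struct and the function name are hypothetical and not part of the patch, and
the bit ranges in the comments are inferred from the shift/mask pairs used in
tegra124_mc_irq(), not taken from hardware documentation.

```c
#include <stdint.h>

/* Hypothetical container; the patch prints these fields with dev_dbg(). */
struct mc_err_status {
	unsigned int type;       /* (value >> 28) & 0x7 */
	unsigned int protection; /* (value >> 25) & 0x7 */
	unsigned int adr_hi;     /* (value >> 20) & 0x3 */
	unsigned int swap;       /* (value >> 18) & 0x1 */
	unsigned int security;   /* (value >> 17) & 0x1 */
	unsigned int rw;         /* (value >> 16) & 0x1 */
	unsigned int adr1;       /* (value >> 12) & 0x7 */
	unsigned int client;     /* value & 0x7f */
};

/* Split a raw MC_ERR_STATUS word using the handler's shifts and masks. */
static struct mc_err_status mc_decode_err_status(uint32_t value)
{
	struct mc_err_status s = {
		.type = (value >> 28) & 0x7,
		.protection = (value >> 25) & 0x7,
		.adr_hi = (value >> 20) & 0x3,
		.swap = (value >> 18) & 0x1,
		.security = (value >> 17) & 0x1,
		.rw = (value >> 16) & 0x1,
		.adr1 = (value >> 12) & 0x7,
		.client = value & 0x7f,
	};

	return s;
}
```

The client field matches the .id values in the client table, so a decoded
client of 0x58 would correspond to the "gpusrd" entry.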

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC 04/10] memory: Add Tegra124 memory controller support
@ 2014-06-26 20:49     ` Thierry Reding
  0 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-26 20:49 UTC (permalink / raw)
  To: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel
  Cc: Cho KyongHo, Grant Grundler, Dave Martin, Marc Zyngier,
	Hiroshi Doyu, Olav Haugan, Paul Walmsley, Rhyland Klein,
	Allen Martin, devicetree, iommu, linux-arm-kernel, linux-tegra,
	linux-kernel

From: Thierry Reding <treding@nvidia.com>

The memory controller on NVIDIA Tegra124 exposes various knobs that can
be used to tune the behaviour of the clients attached to it.

Currently this driver sets up the latency allowance registers to the HW
defaults. Eventually an API should be exported by this driver (via a
custom API or a generic subsystem) to allow clients to register latency
requirements.

This driver also registers an IOMMU (SMMU) that's implemented by the
memory controller.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/memory/Kconfig                   |    9 +
 drivers/memory/Makefile                  |    1 +
 drivers/memory/tegra124-mc.c             | 1945 ++++++++++++++++++++++++++++++
 include/dt-bindings/memory/tegra124-mc.h |   30 +
 4 files changed, 1985 insertions(+)
 create mode 100644 drivers/memory/tegra124-mc.c
 create mode 100644 include/dt-bindings/memory/tegra124-mc.h
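
Note on the latency allowance setup mentioned above: the probe path does a
read-modify-write per client, clearing the client's field and writing its
default. A minimal user-space sketch of that field update follows. It assumes
the register value is passed in as a plain integer. la_apply_default() is a
hypothetical helper name that does not exist in the patch; the struct mirrors
struct latency_allowance from the driver.

```c
#include <stdint.h>

/* Mirrors struct latency_allowance from the patch. */
struct latency_allowance {
	unsigned int reg;   /* register offset within the MC aperture */
	unsigned int shift; /* bit position of the field */
	unsigned int mask;  /* unshifted field mask */
	unsigned int def;   /* default value for the field */
};

/*
 * Hypothetical helper: return the old register value with this client's
 * field replaced by its default, leaving all other bits untouched.
 */
static uint32_t la_apply_default(uint32_t old,
				 const struct latency_allowance *la)
{
	uint32_t value = old;

	value &= ~((uint32_t)la->mask << la->shift);
	value |= (uint32_t)(la->def & la->mask) << la->shift;

	return value;
}
```

Two clients can share one register (for example "display0a" at shift 0 and
"display0b" at shift 16 of 0x2e8), which is why the field is masked out
before the default is ORed in.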

diff --git a/drivers/memory/Kconfig b/drivers/memory/Kconfig
index c59e9c96e86d..d0f0e6781570 100644
--- a/drivers/memory/Kconfig
+++ b/drivers/memory/Kconfig
@@ -61,6 +61,15 @@ config TEGRA30_MC
 	  analysis, especially for IOMMU/SMMU(System Memory Management
 	  Unit) module.
 
+config TEGRA124_MC
+	bool "Tegra124 Memory Controller driver"
+	depends on ARCH_TEGRA
+	select IOMMU_API
+	help
+	  This driver is for the Memory Controller module available on
+	  Tegra124 SoCs. It provides an IOMMU that can be used for I/O
+	  virtual address translation.
+
 config FSL_IFC
 	bool
 	depends on FSL_SOC
diff --git a/drivers/memory/Makefile b/drivers/memory/Makefile
index 71160a2b7313..03143927abab 100644
--- a/drivers/memory/Makefile
+++ b/drivers/memory/Makefile
@@ -11,3 +11,4 @@ obj-$(CONFIG_FSL_IFC)		+= fsl_ifc.o
 obj-$(CONFIG_MVEBU_DEVBUS)	+= mvebu-devbus.o
 obj-$(CONFIG_TEGRA20_MC)	+= tegra20-mc.o
 obj-$(CONFIG_TEGRA30_MC)	+= tegra30-mc.o
+obj-$(CONFIG_TEGRA124_MC)	+= tegra124-mc.o
diff --git a/drivers/memory/tegra124-mc.c b/drivers/memory/tegra124-mc.c
new file mode 100644
index 000000000000..741755b6785d
--- /dev/null
+++ b/drivers/memory/tegra124-mc.c
@@ -0,0 +1,1945 @@
+/*
+ * Copyright (C) 2014 NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+
+#include <dt-bindings/memory/tegra124-mc.h>
+
+#include <asm/cacheflush.h>
+#ifndef CONFIG_ARM64
+#include <asm/dma-iommu.h>
+#endif
+
+#define MC_INTSTATUS 0x000
+#define  MC_INT_DECERR_MTS (1 << 16)
+#define  MC_INT_SECERR_SEC (1 << 13)
+#define  MC_INT_DECERR_VPR (1 << 12)
+#define  MC_INT_INVALID_APB_ASID_UPDATE (1 << 11)
+#define  MC_INT_INVALID_SMMU_PAGE (1 << 10)
+#define  MC_INT_ARBITRATION_EMEM (1 << 9)
+#define  MC_INT_SECURITY_VIOLATION (1 << 8)
+#define  MC_INT_DECERR_EMEM (1 << 6)
+#define MC_INTMASK 0x004
+#define MC_ERR_STATUS 0x08
+#define MC_ERR_ADR 0x0c
+
+struct latency_allowance {
+	unsigned int reg;
+	unsigned int shift;
+	unsigned int mask;
+	unsigned int def;
+};
+
+struct smmu_enable {
+	unsigned int reg;
+	unsigned int bit;
+};
+
+struct tegra_mc_client {
+	unsigned int id;
+	const char *name;
+	unsigned int swgroup;
+
+	struct smmu_enable smmu;
+	struct latency_allowance latency;
+};
+
+static const struct tegra_mc_client tegra124_mc_clients[] = {
+	{
+		.id = 0x01,
+		.name = "display0a",
+		.swgroup = TEGRA_SWGROUP_DC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 1,
+		},
+		.latency = {
+			.reg = 0x2e8,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0xc2,
+		},
+	}, {
+		.id = 0x02,
+		.name = "display0ab",
+		.swgroup = TEGRA_SWGROUP_DCB,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 2,
+		},
+		.latency = {
+			.reg = 0x2f4,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0xc6,
+		},
+	}, {
+		.id = 0x03,
+		.name = "display0b",
+		.swgroup = TEGRA_SWGROUP_DC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 3,
+		},
+		.latency = {
+			.reg = 0x2e8,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x04,
+		.name = "display0bb",
+		.swgroup = TEGRA_SWGROUP_DCB,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 4,
+		},
+		.latency = {
+			.reg = 0x2f4,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x05,
+		.name = "display0c",
+		.swgroup = TEGRA_SWGROUP_DC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 5,
+		},
+		.latency = {
+			.reg = 0x2ec,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x06,
+		.name = "display0cb",
+		.swgroup = TEGRA_SWGROUP_DCB,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 6,
+		},
+		.latency = {
+			.reg = 0x2f8,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x0e,
+		.name = "afir",
+		.swgroup = TEGRA_SWGROUP_AFI,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 14,
+		},
+		.latency = {
+			.reg = 0x2e0,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x13,
+		},
+	}, {
+		.id = 0x0f,
+		.name = "avpcarm7r",
+		.swgroup = TEGRA_SWGROUP_AVPC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 15,
+		},
+		.latency = {
+			.reg = 0x2e4,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x04,
+		},
+	}, {
+		.id = 0x10,
+		.name = "displayhc",
+		.swgroup = TEGRA_SWGROUP_DC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 16,
+		},
+		.latency = {
+			.reg = 0x2f0,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x11,
+		.name = "displayhcb",
+		.swgroup = TEGRA_SWGROUP_DCB,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 17,
+		},
+		.latency = {
+			.reg = 0x2fc,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x15,
+		.name = "hdar",
+		.swgroup = TEGRA_SWGROUP_HDA,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 21,
+		},
+		.latency = {
+			.reg = 0x318,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x24,
+		},
+	}, {
+		.id = 0x16,
+		.name = "host1xdmar",
+		.swgroup = TEGRA_SWGROUP_HC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 22,
+		},
+		.latency = {
+			.reg = 0x310,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x1e,
+		},
+	}, {
+		.id = 0x17,
+		.name = "host1xr",
+		.swgroup = TEGRA_SWGROUP_HC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 23,
+		},
+		.latency = {
+			.reg = 0x310,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x1c,
+		.name = "msencsrd",
+		.swgroup = TEGRA_SWGROUP_MSENC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 28,
+		},
+		.latency = {
+			.reg = 0x328,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x23,
+		},
+	}, {
+		.id = 0x1d,
+		.name = "ppcsahbdmarhdar",
+		.swgroup = TEGRA_SWGROUP_PPCS,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 29,
+		},
+		.latency = {
+			.reg = 0x344,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x49,
+		},
+	}, {
+		.id = 0x1e,
+		.name = "ppcsahbslvr",
+		.swgroup = TEGRA_SWGROUP_PPCS,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 30,
+		},
+		.latency = {
+			.reg = 0x344,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x1a,
+		},
+	}, {
+		.id = 0x1f,
+		.name = "satar",
+		.swgroup = TEGRA_SWGROUP_SATA,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 31,
+		},
+		.latency = {
+			.reg = 0x350,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x65,
+		},
+	}, {
+		.id = 0x22,
+		.name = "vdebsevr",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 2,
+		},
+		.latency = {
+			.reg = 0x354,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x4f,
+		},
+	}, {
+		.id = 0x23,
+		.name = "vdember",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 3,
+		},
+		.latency = {
+			.reg = 0x354,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x3d,
+		},
+	}, {
+		.id = 0x24,
+		.name = "vdemcer",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 4,
+		},
+		.latency = {
+			.reg = 0x358,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x66,
+		},
+	}, {
+		.id = 0x25,
+		.name = "vdetper",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 5,
+		},
+		.latency = {
+			.reg = 0x358,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0xa5,
+		},
+	}, {
+		.id = 0x26,
+		.name = "mpcorelpr",
+		.swgroup = TEGRA_SWGROUP_MPCORELP,
+		.latency = {
+			.reg = 0x324,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x04,
+		},
+	}, {
+		.id = 0x27,
+		.name = "mpcorer",
+		.swgroup = TEGRA_SWGROUP_MPCORE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 2,
+		},
+		.latency = {
+			.reg = 0x320,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x04,
+		},
+	}, {
+		.id = 0x2b,
+		.name = "msencswr",
+		.swgroup = TEGRA_SWGROUP_MSENC,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 11,
+		},
+		.latency = {
+			.reg = 0x328,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x31,
+		.name = "afiw",
+		.swgroup = TEGRA_SWGROUP_AFI,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 17,
+		},
+		.latency = {
+			.reg = 0x2e0,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x32,
+		.name = "avpcarm7w",
+		.swgroup = TEGRA_SWGROUP_AVPC,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 18,
+		},
+		.latency = {
+			.reg = 0x2e4,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x35,
+		.name = "hdaw",
+		.swgroup = TEGRA_SWGROUP_HDA,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 21,
+		},
+		.latency = {
+			.reg = 0x318,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x36,
+		.name = "host1xw",
+		.swgroup = TEGRA_SWGROUP_HC,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 22,
+		},
+		.latency = {
+			.reg = 0x314,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x38,
+		.name = "mpcorelpw",
+		.swgroup = TEGRA_SWGROUP_MPCORELP,
+		.latency = {
+			.reg = 0x324,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x39,
+		.name = "mpcorew",
+		.swgroup = TEGRA_SWGROUP_MPCORE,
+		.latency = {
+			.reg = 0x320,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x3b,
+		.name = "ppcsahbdmaw",
+		.swgroup = TEGRA_SWGROUP_PPCS,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 27,
+		},
+		.latency = {
+			.reg = 0x348,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x3c,
+		.name = "ppcsahbslvw",
+		.swgroup = TEGRA_SWGROUP_PPCS,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 28,
+		},
+		.latency = {
+			.reg = 0x348,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x3d,
+		.name = "sataw",
+		.swgroup = TEGRA_SWGROUP_SATA,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 29,
+		},
+		.latency = {
+			.reg = 0x350,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x65,
+		},
+	}, {
+		.id = 0x3e,
+		.name = "vdebsevw",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 30,
+		},
+		.latency = {
+			.reg = 0x35c,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x3f,
+		.name = "vdedbgw",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 31,
+		},
+		.latency = {
+			.reg = 0x35c,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x40,
+		.name = "vdembew",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 0,
+		},
+		.latency = {
+			.reg = 0x360,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x41,
+		.name = "vdetpmw",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 1,
+		},
+		.latency = {
+			.reg = 0x360,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x44,
+		.name = "ispra",
+		.swgroup = TEGRA_SWGROUP_ISP2,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 4,
+		},
+		.latency = {
+			.reg = 0x370,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x18,
+		},
+	}, {
+		.id = 0x46,
+		.name = "ispwa",
+		.swgroup = TEGRA_SWGROUP_ISP2,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 6,
+		},
+		.latency = {
+			.reg = 0x374,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x47,
+		.name = "ispwb",
+		.swgroup = TEGRA_SWGROUP_ISP2,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 7,
+		},
+		.latency = {
+			.reg = 0x374,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x4a,
+		.name = "xusb_hostr",
+		.swgroup = TEGRA_SWGROUP_XUSB_HOST,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 10,
+		},
+		.latency = {
+			.reg = 0x37c,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x39,
+		},
+	}, {
+		.id = 0x4b,
+		.name = "xusb_hostw",
+		.swgroup = TEGRA_SWGROUP_XUSB_HOST,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 11,
+		},
+		.latency = {
+			.reg = 0x37c,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x4c,
+		.name = "xusb_devr",
+		.swgroup = TEGRA_SWGROUP_XUSB_DEV,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 12,
+		},
+		.latency = {
+			.reg = 0x380,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x39,
+		},
+	}, {
+		.id = 0x4d,
+		.name = "xusb_devw",
+		.swgroup = TEGRA_SWGROUP_XUSB_DEV,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 13,
+		},
+		.latency = {
+			.reg = 0x380,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x4e,
+		.name = "isprab",
+		.swgroup = TEGRA_SWGROUP_ISP2B,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 14,
+		},
+		.latency = {
+			.reg = 0x384,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x18,
+		},
+	}, {
+		.id = 0x50,
+		.name = "ispwab",
+		.swgroup = TEGRA_SWGROUP_ISP2B,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 16,
+		},
+		.latency = {
+			.reg = 0x388,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x51,
+		.name = "ispwbb",
+		.swgroup = TEGRA_SWGROUP_ISP2B,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 17,
+		},
+		.latency = {
+			.reg = 0x388,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x54,
+		.name = "tsecsrd",
+		.swgroup = TEGRA_SWGROUP_TSEC,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 20,
+		},
+		.latency = {
+			.reg = 0x390,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x9b,
+		},
+	}, {
+		.id = 0x55,
+		.name = "tsecswr",
+		.swgroup = TEGRA_SWGROUP_TSEC,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 21,
+		},
+		.latency = {
+			.reg = 0x390,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x56,
+		.name = "a9avpscr",
+		.swgroup = TEGRA_SWGROUP_A9AVP,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 22,
+		},
+		.latency = {
+			.reg = 0x3a4,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x04,
+		},
+	}, {
+		.id = 0x57,
+		.name = "a9avpscw",
+		.swgroup = TEGRA_SWGROUP_A9AVP,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 23,
+		},
+		.latency = {
+			.reg = 0x3a4,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x58,
+		.name = "gpusrd",
+		.swgroup = TEGRA_SWGROUP_GPU,
+		.smmu = {
+			/* read-only */
+			.reg = 0x230,
+			.bit = 24,
+		},
+		.latency = {
+			.reg = 0x3c8,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x1a,
+		},
+	}, {
+		.id = 0x59,
+		.name = "gpuswr",
+		.swgroup = TEGRA_SWGROUP_GPU,
+		.smmu = {
+			/* read-only */
+			.reg = 0x230,
+			.bit = 25,
+		},
+		.latency = {
+			.reg = 0x3c8,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x5a,
+		.name = "displayt",
+		.swgroup = TEGRA_SWGROUP_DC,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 26,
+		},
+		.latency = {
+			.reg = 0x2f0,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x60,
+		.name = "sdmmcra",
+		.swgroup = TEGRA_SWGROUP_SDMMC1A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 0,
+		},
+		.latency = {
+			.reg = 0x3b8,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x49,
+		},
+	}, {
+		.id = 0x61,
+		.name = "sdmmcraa",
+		.swgroup = TEGRA_SWGROUP_SDMMC2A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 1,
+		},
+		.latency = {
+			.reg = 0x3bc,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x49,
+		},
+	}, {
+		.id = 0x62,
+		.name = "sdmmcr",
+		.swgroup = TEGRA_SWGROUP_SDMMC3A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 2,
+		},
+		.latency = {
+			.reg = 0x3c0,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x49,
+		},
+	}, {
+		.id = 0x63,
+		.swgroup = TEGRA_SWGROUP_SDMMC4A,
+		.name = "sdmmcrab",
+		.smmu = {
+			.reg = 0x234,
+			.bit = 3,
+		},
+		.latency = {
+			.reg = 0x3c4,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x49,
+		},
+	}, {
+		.id = 0x64,
+		.name = "sdmmcwa",
+		.swgroup = TEGRA_SWGROUP_SDMMC1A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 4,
+		},
+		.latency = {
+			.reg = 0x3b8,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x65,
+		.name = "sdmmcwaa",
+		.swgroup = TEGRA_SWGROUP_SDMMC2A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 5,
+		},
+		.latency = {
+			.reg = 0x3bc,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x66,
+		.name = "sdmmcw",
+		.swgroup = TEGRA_SWGROUP_SDMMC3A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 6,
+		},
+		.latency = {
+			.reg = 0x3c0,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x67,
+		.name = "sdmmcwab",
+		.swgroup = TEGRA_SWGROUP_SDMMC4A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 7,
+		},
+		.latency = {
+			.reg = 0x3c4,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x6c,
+		.name = "vicsrd",
+		.swgroup = TEGRA_SWGROUP_VIC,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 12,
+		},
+		.latency = {
+			.reg = 0x394,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x1a,
+		},
+	}, {
+		.id = 0x6d,
+		.name = "vicswr",
+		.swgroup = TEGRA_SWGROUP_VIC,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 13,
+		},
+		.latency = {
+			.reg = 0x394,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x72,
+		.name = "viw",
+		.swgroup = TEGRA_SWGROUP_VI,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 18,
+		},
+		.latency = {
+			.reg = 0x398,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x73,
+		.name = "displayd",
+		.swgroup = TEGRA_SWGROUP_DC,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 19,
+		},
+		.latency = {
+			.reg = 0x3c8,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	},
+};
+
+struct tegra_smmu_swgroup {
+	unsigned int swgroup;
+	unsigned int reg;
+};
+
+static const struct tegra_smmu_swgroup tegra124_swgroups[] = {
+	{ .swgroup = TEGRA_SWGROUP_DC,        .reg = 0x240 },
+	{ .swgroup = TEGRA_SWGROUP_DCB,       .reg = 0x244 },
+	{ .swgroup = TEGRA_SWGROUP_AFI,       .reg = 0x238 },
+	{ .swgroup = TEGRA_SWGROUP_AVPC,      .reg = 0x23c },
+	{ .swgroup = TEGRA_SWGROUP_HDA,       .reg = 0x254 },
+	{ .swgroup = TEGRA_SWGROUP_HC,        .reg = 0x250 },
+	{ .swgroup = TEGRA_SWGROUP_MSENC,     .reg = 0x264 },
+	{ .swgroup = TEGRA_SWGROUP_PPCS,      .reg = 0x270 },
+	{ .swgroup = TEGRA_SWGROUP_SATA,      .reg = 0x274 },
+	{ .swgroup = TEGRA_SWGROUP_VDE,       .reg = 0x27c },
+	{ .swgroup = TEGRA_SWGROUP_ISP2,      .reg = 0x258 },
+	{ .swgroup = TEGRA_SWGROUP_XUSB_HOST, .reg = 0x288 },
+	{ .swgroup = TEGRA_SWGROUP_XUSB_DEV,  .reg = 0x28c },
+	{ .swgroup = TEGRA_SWGROUP_ISP2B,     .reg = 0xaa4 },
+	{ .swgroup = TEGRA_SWGROUP_TSEC,      .reg = 0x294 },
+	{ .swgroup = TEGRA_SWGROUP_A9AVP,     .reg = 0x290 },
+	{ .swgroup = TEGRA_SWGROUP_GPU,       .reg = 0xaa8 },
+	{ .swgroup = TEGRA_SWGROUP_SDMMC1A,   .reg = 0xa94 },
+	{ .swgroup = TEGRA_SWGROUP_SDMMC2A,   .reg = 0xa98 },
+	{ .swgroup = TEGRA_SWGROUP_SDMMC3A,   .reg = 0xa9c },
+	{ .swgroup = TEGRA_SWGROUP_SDMMC4A,   .reg = 0xaa0 },
+	{ .swgroup = TEGRA_SWGROUP_VIC,       .reg = 0x284 },
+	{ .swgroup = TEGRA_SWGROUP_VI,        .reg = 0x280 },
+};
+
+struct tegra_smmu_group_init {
+	unsigned int asid;
+	const char *name;
+
+	const struct of_device_id *matches;
+};
+
+struct tegra_smmu_soc {
+	const struct tegra_smmu_group_init *groups;
+	unsigned int num_groups;
+
+	const struct tegra_mc_client *clients;
+	unsigned int num_clients;
+
+	const struct tegra_smmu_swgroup *swgroups;
+	unsigned int num_swgroups;
+
+	unsigned int num_asids;
+	unsigned int atom_size;
+
+	const struct tegra_smmu_ops *ops;
+};
+
+struct tegra_smmu_ops {
+	void (*flush_dcache)(struct page *page, unsigned long offset,
+			     size_t size);
+};
+
+struct tegra_smmu_master {
+	struct list_head list;
+	struct device *dev;
+};
+
+struct tegra_smmu_group {
+	const char *name;
+	const struct of_device_id *matches;
+	unsigned int asid;
+
+#ifndef CONFIG_ARM64
+	struct dma_iommu_mapping *mapping;
+#endif
+	struct list_head masters;
+};
+
+static const struct of_device_id tegra124_periph_matches[] = {
+	{ .compatible = "nvidia,tegra124-sdhci", },
+	{ }
+};
+
+static const struct tegra_smmu_group_init tegra124_smmu_groups[] = {
+	{ 0, "peripherals", tegra124_periph_matches },
+};
+
+static void tegra_smmu_group_release(void *data)
+{
+	kfree(data);
+}
+
+struct tegra_smmu {
+	void __iomem *regs;
+	struct iommu iommu;
+	struct device *dev;
+
+	const struct tegra_smmu_soc *soc;
+
+	struct iommu_group **groups;
+	unsigned int num_groups;
+
+	unsigned long *asids;
+	struct mutex lock;
+};
+
+struct tegra_smmu_address_space {
+	struct iommu_domain *domain;
+	struct tegra_smmu *smmu;
+	struct page *pd;
+	unsigned id;
+	u32 attr;
+};
+
+static inline void smmu_writel(struct tegra_smmu *smmu, u32 value,
+			       unsigned long offset)
+{
+	writel(value, smmu->regs + offset);
+}
+
+static inline u32 smmu_readl(struct tegra_smmu *smmu, unsigned long offset)
+{
+	return readl(smmu->regs + offset);
+}
+
+#define SMMU_CONFIG 0x010
+#define  SMMU_CONFIG_ENABLE (1 << 0)
+
+#define SMMU_PTB_ASID 0x01c
+#define  SMMU_PTB_ASID_VALUE(x) ((x) & 0x7f)
+
+#define SMMU_PTB_DATA 0x020
+#define  SMMU_PTB_DATA_VALUE(page, attr) (page_to_phys(page) >> 12 | (attr))
+
+#define SMMU_MK_PDE(page, attr) (page_to_phys(page) >> SMMU_PTE_SHIFT | (attr))
+
+#define SMMU_TLB_FLUSH 0x030
+#define  SMMU_TLB_FLUSH_VA_MATCH_ALL     (0 << 0)
+#define  SMMU_TLB_FLUSH_VA_MATCH_SECTION (2 << 0)
+#define  SMMU_TLB_FLUSH_VA_MATCH_GROUP   (3 << 0)
+#define  SMMU_TLB_FLUSH_ASID(x)          (((x) & 0x7f) << 24)
+#define  SMMU_TLB_FLUSH_VA_SECTION(addr) ((((addr) & 0xffc00000) >> 12) | \
+					  SMMU_TLB_FLUSH_VA_MATCH_SECTION)
+#define  SMMU_TLB_FLUSH_VA_GROUP(addr)   ((((addr) & 0xffffc000) >> 12) | \
+					  SMMU_TLB_FLUSH_VA_MATCH_GROUP)
+#define  SMMU_TLB_FLUSH_ASID_MATCH       (1 << 31)
+
+#define SMMU_PTC_FLUSH 0x034
+#define  SMMU_PTC_FLUSH_TYPE_ALL (0 << 0)
+#define  SMMU_PTC_FLUSH_TYPE_ADR (1 << 0)
+
+#define SMMU_PTC_FLUSH_HI 0x9b8
+#define  SMMU_PTC_FLUSH_HI_MASK 0x3
+
+/* per-SWGROUP SMMU_*_ASID register */
+#define SMMU_ASID_ENABLE (1 << 31)
+#define SMMU_ASID_MASK 0x7f
+#define SMMU_ASID_VALUE(x) ((x) & SMMU_ASID_MASK)
+
+/* page table definitions */
+#define SMMU_NUM_PDE 1024
+#define SMMU_NUM_PTE 1024
+
+#define SMMU_SIZE_PD (SMMU_NUM_PDE * 4)
+#define SMMU_SIZE_PT (SMMU_NUM_PTE * 4)
+
+#define SMMU_PDE_SHIFT 22
+#define SMMU_PTE_SHIFT 12
+
+#define SMMU_PFN_MASK 0x000fffff
+
+#define SMMU_PD_READABLE	(1 << 31)
+#define SMMU_PD_WRITABLE	(1 << 30)
+#define SMMU_PD_NONSECURE	(1 << 29)
+
+#define SMMU_PDE_READABLE	(1 << 31)
+#define SMMU_PDE_WRITABLE	(1 << 30)
+#define SMMU_PDE_NONSECURE	(1 << 29)
+#define SMMU_PDE_NEXT		(1 << 28)
+
+#define SMMU_PTE_READABLE	(1 << 31)
+#define SMMU_PTE_WRITABLE	(1 << 30)
+#define SMMU_PTE_NONSECURE	(1 << 29)
+
+#define SMMU_PDE_ATTR		(SMMU_PDE_READABLE | SMMU_PDE_WRITABLE | \
+				 SMMU_PDE_NONSECURE)
+#define SMMU_PTE_ATTR		(SMMU_PTE_READABLE | SMMU_PTE_WRITABLE | \
+				 SMMU_PTE_NONSECURE)
+
+#define SMMU_PDE_VACANT(n)	(((n) << 10) | SMMU_PDE_ATTR)
+#define SMMU_PTE_VACANT(n)	(((n) << 12) | SMMU_PTE_ATTR)
+
+#ifdef CONFIG_ARCH_TEGRA_124_SOC
+static void tegra124_flush_dcache(struct page *page, unsigned long offset,
+				  size_t size)
+{
+	phys_addr_t phys = page_to_phys(page) + offset;
+	void *virt = page_address(page) + offset;
+
+	__cpuc_flush_dcache_area(virt, size);
+	outer_flush_range(phys, phys + size);
+}
+
+static const struct tegra_smmu_ops tegra124_smmu_ops = {
+	.flush_dcache = tegra124_flush_dcache,
+};
+#endif
+
+static void tegra132_flush_dcache(struct page *page, unsigned long offset,
+				  size_t size)
+{
+	/* TODO: implement */
+}
+
+static const struct tegra_smmu_ops tegra132_smmu_ops = {
+	.flush_dcache = tegra132_flush_dcache,
+};
+
+static inline void smmu_flush_ptc(struct tegra_smmu *smmu, struct page *page,
+				  unsigned long offset)
+{
+	phys_addr_t phys = page ? page_to_phys(page) : 0;
+	u32 value;
+
+	if (page) {
+		offset &= ~(smmu->soc->atom_size - 1);
+
+#ifdef CONFIG_PHYS_ADDR_T_64BIT
+		value = (phys >> 32) & SMMU_PTC_FLUSH_HI_MASK;
+#else
+		value = 0;
+#endif
+		smmu_writel(smmu, value, SMMU_PTC_FLUSH_HI);
+
+		value = (phys + offset) | SMMU_PTC_FLUSH_TYPE_ADR;
+	} else {
+		value = SMMU_PTC_FLUSH_TYPE_ALL;
+	}
+
+	smmu_writel(smmu, value, SMMU_PTC_FLUSH);
+}
+
+static inline void smmu_flush_tlb(struct tegra_smmu *smmu)
+{
+	smmu_writel(smmu, SMMU_TLB_FLUSH_VA_MATCH_ALL, SMMU_TLB_FLUSH);
+}
+
+static inline void smmu_flush_tlb_asid(struct tegra_smmu *smmu,
+				       unsigned long asid)
+{
+	u32 value;
+
+	value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
+		SMMU_TLB_FLUSH_VA_MATCH_ALL;
+	smmu_writel(smmu, value, SMMU_TLB_FLUSH);
+}
+
+static inline void smmu_flush_tlb_section(struct tegra_smmu *smmu,
+					  unsigned long asid,
+					  unsigned long iova)
+{
+	u32 value;
+
+	value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
+		SMMU_TLB_FLUSH_VA_SECTION(iova);
+	smmu_writel(smmu, value, SMMU_TLB_FLUSH);
+}
+
+static inline void smmu_flush_tlb_group(struct tegra_smmu *smmu,
+					unsigned long asid,
+					unsigned long iova)
+{
+	u32 value;
+
+	value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
+		SMMU_TLB_FLUSH_VA_GROUP(iova);
+	smmu_writel(smmu, value, SMMU_TLB_FLUSH);
+}
+
+static inline void smmu_flush(struct tegra_smmu *smmu)
+{
+	smmu_readl(smmu, SMMU_CONFIG);
+}
+
+static inline struct tegra_smmu *to_tegra_smmu(struct iommu *iommu)
+{
+	return container_of(iommu, struct tegra_smmu, iommu);
+}
+
+static struct tegra_smmu *smmu_handle = NULL;
+
+static int tegra_smmu_alloc_asid(struct tegra_smmu *smmu, unsigned int *idp)
+{
+	unsigned long id;
+
+	mutex_lock(&smmu->lock);
+
+	id = find_first_zero_bit(smmu->asids, smmu->soc->num_asids);
+	if (id >= smmu->soc->num_asids) {
+		mutex_unlock(&smmu->lock);
+		return -ENOSPC;
+	}
+
+	set_bit(id, smmu->asids);
+	*idp = id;
+
+	mutex_unlock(&smmu->lock);
+	return 0;
+}
+
+static void tegra_smmu_free_asid(struct tegra_smmu *smmu, unsigned int id)
+{
+	mutex_lock(&smmu->lock);
+	clear_bit(id, smmu->asids);
+	mutex_unlock(&smmu->lock);
+}
+
+struct tegra_smmu_address_space *foo = NULL;
+
+static int tegra_smmu_domain_init(struct iommu_domain *domain)
+{
+	struct tegra_smmu *smmu = smmu_handle;
+	struct tegra_smmu_address_space *as;
+	uint32_t *pd, value;
+	unsigned int i;
+	int err = 0;
+
+	as = kzalloc(sizeof(*as), GFP_KERNEL);
+	if (!as)
+		return -ENOMEM;
+
+	as->attr = SMMU_PD_READABLE | SMMU_PD_WRITABLE | SMMU_PD_NONSECURE;
+	as->smmu = smmu_handle;
+	as->domain = domain;
+
+	err = tegra_smmu_alloc_asid(smmu, &as->id);
+	if (err < 0)
+		goto free_as;
+
+	as->pd = alloc_page(GFP_KERNEL | __GFP_DMA);
+	if (!as->pd) {
+		err = -ENOMEM;
+		goto free_asid;
+	}
+
+	pd = page_address(as->pd);
+	SetPageReserved(as->pd);
+
+	/* mark every page directory entry as vacant */
+	for (i = 0; i < SMMU_NUM_PDE; i++)
+		pd[i] = SMMU_PDE_VACANT(i);
+
+	smmu->soc->ops->flush_dcache(as->pd, 0, SMMU_SIZE_PD);
+	smmu_flush_ptc(smmu, as->pd, 0);
+	smmu_flush_tlb_asid(smmu, as->id);
+
+	smmu_writel(smmu, as->id & 0x7f, SMMU_PTB_ASID);
+	value = SMMU_PTB_DATA_VALUE(as->pd, as->attr);
+	smmu_writel(smmu, value, SMMU_PTB_DATA);
+	smmu_flush(smmu);
+
+	domain->priv = as;
+
+	return 0;
+
+free_asid:
+	tegra_smmu_free_asid(smmu, as->id);
+free_as:
+	kfree(as);
+	return err;
+}
+
+static void tegra_smmu_domain_destroy(struct iommu_domain *domain)
+{
+	struct tegra_smmu_address_space *as = domain->priv;
+
+	/* TODO: free page directory and page tables */
+
+	tegra_smmu_free_asid(as->smmu, as->id);
+	kfree(as);
+}
+
+static const struct tegra_smmu_swgroup *
+tegra_smmu_find_swgroup(struct tegra_smmu *smmu, unsigned int swgroup)
+{
+	const struct tegra_smmu_swgroup *group = NULL;
+	unsigned int i;
+
+	for (i = 0; i < smmu->soc->num_swgroups; i++) {
+		if (smmu->soc->swgroups[i].swgroup == swgroup) {
+			group = &smmu->soc->swgroups[i];
+			break;
+		}
+	}
+
+	return group;
+}
+
+static int tegra_smmu_enable(struct tegra_smmu *smmu, unsigned int swgroup,
+			     unsigned int asid)
+{
+	const struct tegra_smmu_swgroup *group;
+	unsigned int i;
+	u32 value;
+
+	for (i = 0; i < smmu->soc->num_clients; i++) {
+		const struct tegra_mc_client *client = &smmu->soc->clients[i];
+
+		if (client->swgroup != swgroup)
+			continue;
+
+		value = smmu_readl(smmu, client->smmu.reg);
+		value |= BIT(client->smmu.bit);
+		smmu_writel(smmu, value, client->smmu.reg);
+	}
+
+	group = tegra_smmu_find_swgroup(smmu, swgroup);
+	if (group) {
+		value = smmu_readl(smmu, group->reg);
+		value &= ~SMMU_ASID_MASK;
+		value |= SMMU_ASID_VALUE(asid);
+		value |= SMMU_ASID_ENABLE;
+		smmu_writel(smmu, value, group->reg);
+	}
+
+	return 0;
+}
+
+static int tegra_smmu_disable(struct tegra_smmu *smmu, unsigned int swgroup,
+			      unsigned int asid)
+{
+	const struct tegra_smmu_swgroup *group;
+	unsigned int i;
+	u32 value;
+
+	group = tegra_smmu_find_swgroup(smmu, swgroup);
+	if (group) {
+		value = smmu_readl(smmu, group->reg);
+		value &= ~SMMU_ASID_MASK;
+		value |= SMMU_ASID_VALUE(asid);
+		value &= ~SMMU_ASID_ENABLE;
+		smmu_writel(smmu, value, group->reg);
+	}
+
+	for (i = 0; i < smmu->soc->num_clients; i++) {
+		const struct tegra_mc_client *client = &smmu->soc->clients[i];
+
+		if (client->swgroup != swgroup)
+			continue;
+
+		value = smmu_readl(smmu, client->smmu.reg);
+		value &= ~BIT(client->smmu.bit);
+		smmu_writel(smmu, value, client->smmu.reg);
+	}
+
+	return 0;
+}
+
+static int tegra_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
+{
+	struct tegra_smmu_address_space *as = domain->priv;
+	struct tegra_smmu *smmu = as->smmu;
+	struct of_phandle_iter entry;
+	int err;
+
+	of_property_for_each_phandle_with_args(entry, dev->of_node, "iommus",
+					       "#iommu-cells", 0) {
+		unsigned int swgroup = entry.out_args.args[0];
+
+		if (entry.out_args.np != smmu->dev->of_node)
+			continue;
+
+		err = tegra_smmu_enable(smmu, swgroup, as->id);
+		if (err < 0)
+			pr_err("failed to enable SWGROUP#%u\n", swgroup);
+	}
+
+	return 0;
+}
+
+static void tegra_smmu_detach_dev(struct iommu_domain *domain, struct device *dev)
+{
+	struct tegra_smmu_address_space *as = domain->priv;
+	struct tegra_smmu *smmu = as->smmu;
+	struct of_phandle_iter entry;
+	int err;
+
+	of_property_for_each_phandle_with_args(entry, dev->of_node, "iommus",
+					       "#iommu-cells", 0) {
+		unsigned int swgroup;
+
+		if (entry.out_args.np != smmu->dev->of_node)
+			continue;
+
+		swgroup = entry.out_args.args[0];
+
+		err = tegra_smmu_disable(smmu, swgroup, as->id);
+		if (err < 0) {
+			pr_err("failed to disable SWGROUP#%u\n", swgroup);
+		}
+	}
+}
+
+static u32 *as_get_pte(struct tegra_smmu_address_space *as, dma_addr_t iova,
+		       struct page **pagep)
+{
+	struct tegra_smmu *smmu = smmu_handle;
+	u32 *pd = page_address(as->pd), *pt;
+	u32 pde = (iova >> SMMU_PDE_SHIFT) & 0x3ff;
+	u32 pte = (iova >> SMMU_PTE_SHIFT) & 0x3ff;
+	struct page *page;
+	unsigned int i;
+
+	if (pd[pde] != SMMU_PDE_VACANT(pde)) {
+		page = pfn_to_page(pd[pde] & SMMU_PFN_MASK);
+		pt = page_address(page);
+	} else {
+		page = alloc_page(GFP_KERNEL | __GFP_DMA);
+		if (!page)
+			return NULL;
+
+		pt = page_address(page);
+		SetPageReserved(page);
+
+		for (i = 0; i < SMMU_NUM_PTE; i++)
+			pt[i] = SMMU_PTE_VACANT(i);
+
+		smmu->soc->ops->flush_dcache(page, 0, SMMU_SIZE_PT);
+
+		pd[pde] = SMMU_MK_PDE(page, SMMU_PDE_ATTR | SMMU_PDE_NEXT);
+
+		smmu->soc->ops->flush_dcache(as->pd, pde << 2, 4);
+		smmu_flush_ptc(smmu, as->pd, pde << 2);
+		smmu_flush_tlb_section(smmu, as->id, iova);
+		smmu_flush(smmu);
+	}
+
+	*pagep = page;
+
+	return &pt[pte];
+}
+
+static int tegra_smmu_map(struct iommu_domain *domain, unsigned long iova,
+			  phys_addr_t paddr, size_t size, int prot)
+{
+	struct tegra_smmu_address_space *as = domain->priv;
+	struct tegra_smmu *smmu = smmu_handle;
+	unsigned long offset;
+	struct page *page;
+	u32 *pte;
+
+	pte = as_get_pte(as, iova, &page);
+	if (!pte)
+		return -ENOMEM;
+
+	offset = offset_in_page(pte);
+
+	*pte = __phys_to_pfn(paddr) | SMMU_PTE_ATTR;
+
+	smmu->soc->ops->flush_dcache(page, offset, 4);
+	smmu_flush_ptc(smmu, page, offset);
+	smmu_flush_tlb_group(smmu, as->id, iova);
+	smmu_flush(smmu);
+
+	return 0;
+}
+
+static size_t tegra_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
+			       size_t size)
+{
+	struct tegra_smmu_address_space *as = domain->priv;
+	struct tegra_smmu *smmu = smmu_handle;
+	unsigned long offset;
+	struct page *page;
+	u32 *pte;
+
+	pte = as_get_pte(as, iova, &page);
+	if (!pte)
+		return 0;
+
+	offset = offset_in_page(pte);
+	*pte = 0;
+
+	smmu->soc->ops->flush_dcache(page, offset, 4);
+	smmu_flush_ptc(smmu, page, offset);
+	smmu_flush_tlb_group(smmu, as->id, iova);
+	smmu_flush(smmu);
+
+	return size;
+}
+
+static phys_addr_t tegra_smmu_iova_to_phys(struct iommu_domain *domain,
+					   dma_addr_t iova)
+{
+	struct tegra_smmu_address_space *as = domain->priv;
+	struct page *page;
+	u32 *pte;
+
+	pte = as_get_pte(as, iova, &page);
+	if (!pte)
+		return 0;
+
+	return PFN_PHYS(*pte & SMMU_PFN_MASK);
+}
+
+static int tegra_smmu_attach(struct iommu *iommu, struct device *dev)
+{
+	struct tegra_smmu *smmu = to_tegra_smmu(iommu);
+	struct tegra_smmu_group *group;
+	unsigned int i;
+
+	for (i = 0; i < smmu->soc->num_groups; i++) {
+		group = iommu_group_get_iommudata(smmu->groups[i]);
+
+		if (of_match_node(group->matches, dev->of_node)) {
+			pr_debug("adding device %s to group %s\n",
+				 dev_name(dev), group->name);
+			iommu_group_add_device(smmu->groups[i], dev);
+			break;
+		}
+	}
+
+	if (i == smmu->soc->num_groups)
+		return 0;
+
+#ifndef CONFIG_ARM64
+	return arm_iommu_attach_device(dev, group->mapping);
+#else
+	return 0;
+#endif
+}
+
+static int tegra_smmu_detach(struct iommu *iommu, struct device *dev)
+{
+	return 0;
+}
+
+static const struct iommu_ops tegra_smmu_ops = {
+	.domain_init = tegra_smmu_domain_init,
+	.domain_destroy = tegra_smmu_domain_destroy,
+	.attach_dev = tegra_smmu_attach_dev,
+	.detach_dev = tegra_smmu_detach_dev,
+	.map = tegra_smmu_map,
+	.unmap = tegra_smmu_unmap,
+	.iova_to_phys = tegra_smmu_iova_to_phys,
+	.attach = tegra_smmu_attach,
+	.detach = tegra_smmu_detach,
+
+	.pgsize_bitmap = SZ_4K,
+};
+
+static struct tegra_smmu *tegra_smmu_probe(struct device *dev,
+					   const struct tegra_smmu_soc *soc,
+					   void __iomem *regs)
+{
+	struct tegra_smmu *smmu;
+	unsigned int i;
+	size_t size;
+	u32 value;
+	int err;
+
+	smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
+	if (!smmu)
+		return ERR_PTR(-ENOMEM);
+
+	size = BITS_TO_LONGS(soc->num_asids) * sizeof(long);
+
+	smmu->asids = devm_kzalloc(dev, size, GFP_KERNEL);
+	if (!smmu->asids)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&smmu->iommu.list);
+	mutex_init(&smmu->lock);
+
+	smmu->iommu.ops = &tegra_smmu_ops;
+	smmu->iommu.dev = dev;
+
+	smmu->regs = regs;
+	smmu->soc = soc;
+	smmu->dev = dev;
+
+	smmu_handle = smmu;
+	bus_set_iommu(&platform_bus_type, &tegra_smmu_ops);
+
+	smmu->num_groups = soc->num_groups;
+
+	smmu->groups = devm_kcalloc(dev, smmu->num_groups, sizeof(*smmu->groups),
+				    GFP_KERNEL);
+	if (!smmu->groups)
+		return ERR_PTR(-ENOMEM);
+
+	for (i = 0; i < smmu->num_groups; i++) {
+		struct tegra_smmu_group *group;
+
+		smmu->groups[i] = iommu_group_alloc();
+		if (IS_ERR(smmu->groups[i]))
+			return ERR_CAST(smmu->groups[i]);
+
+		err = iommu_group_set_name(smmu->groups[i], soc->groups[i].name);
+		if (err < 0)
+			return ERR_PTR(err);
+
+		group = kzalloc(sizeof(*group), GFP_KERNEL);
+		if (!group)
+			return ERR_PTR(-ENOMEM);
+
+		group->matches = soc->groups[i].matches;
+		group->asid = soc->groups[i].asid;
+		group->name = soc->groups[i].name;
+
+		iommu_group_set_iommudata(smmu->groups[i], group,
+					  tegra_smmu_group_release);
+
+#ifndef CONFIG_ARM64
+		group->mapping = arm_iommu_create_mapping(&platform_bus_type,
+							  0, SZ_2G);
+		if (IS_ERR(group->mapping)) {
+			dev_err(dev, "failed to create mapping for group %s: %ld\n",
+				group->name, PTR_ERR(group->mapping));
+			return ERR_CAST(group->mapping);
+		}
+#endif
+	}
+
+	value = (1 << 29) | (8 << 24) | 0x3f;
+	smmu_writel(smmu, value, 0x18);
+
+	value = (1 << 29) | (1 << 28) | 0x20;
+	smmu_writel(smmu, value, 0x014);
+
+	smmu_flush_ptc(smmu, NULL, 0);
+	smmu_flush_tlb(smmu);
+	smmu_writel(smmu, SMMU_CONFIG_ENABLE, SMMU_CONFIG);
+	smmu_flush(smmu);
+
+	err = iommu_add(&smmu->iommu);
+	if (err < 0)
+		return ERR_PTR(err);
+
+	return smmu;
+}
+
+static int tegra_smmu_remove(struct tegra_smmu *smmu)
+{
+	iommu_remove(&smmu->iommu);
+
+	return 0;
+}
+
+#ifdef CONFIG_ARCH_TEGRA_124_SOC
+static const struct tegra_smmu_soc tegra124_smmu_soc = {
+	.groups = tegra124_smmu_groups,
+	.num_groups = ARRAY_SIZE(tegra124_smmu_groups),
+	.clients = tegra124_mc_clients,
+	.num_clients = ARRAY_SIZE(tegra124_mc_clients),
+	.swgroups = tegra124_swgroups,
+	.num_swgroups = ARRAY_SIZE(tegra124_swgroups),
+	.num_asids = 128,
+	.atom_size = 32,
+	.ops = &tegra124_smmu_ops,
+};
+#endif
+
+static const struct tegra_smmu_soc tegra132_smmu_soc = {
+	.groups = tegra124_smmu_groups,
+	.num_groups = ARRAY_SIZE(tegra124_smmu_groups),
+	.clients = tegra124_mc_clients,
+	.num_clients = ARRAY_SIZE(tegra124_mc_clients),
+	.swgroups = tegra124_swgroups,
+	.num_swgroups = ARRAY_SIZE(tegra124_swgroups),
+	.num_asids = 128,
+	.atom_size = 32,
+	.ops = &tegra132_smmu_ops,
+};
+
+struct tegra_mc {
+	struct device *dev;
+	struct tegra_smmu *smmu;
+	void __iomem *regs;
+	int irq;
+
+	const struct tegra_mc_soc *soc;
+};
+
+static inline u32 mc_readl(struct tegra_mc *mc, unsigned long offset)
+{
+	return readl(mc->regs + offset);
+}
+
+static inline void mc_writel(struct tegra_mc *mc, u32 value, unsigned long offset)
+{
+	writel(value, mc->regs + offset);
+}
+
+struct tegra_mc_soc {
+	const struct tegra_mc_client *clients;
+	unsigned int num_clients;
+
+	const struct tegra_smmu_soc *smmu;
+};
+
+#ifdef CONFIG_ARCH_TEGRA_124_SOC
+static const struct tegra_mc_soc tegra124_mc_soc = {
+	.clients = tegra124_mc_clients,
+	.num_clients = ARRAY_SIZE(tegra124_mc_clients),
+	.smmu = &tegra124_smmu_soc,
+};
+#endif
+
+static const struct tegra_mc_soc tegra132_mc_soc = {
+	.clients = tegra124_mc_clients,
+	.num_clients = ARRAY_SIZE(tegra124_mc_clients),
+	.smmu = &tegra132_smmu_soc,
+};
+
+static const struct of_device_id tegra_mc_of_match[] = {
+#ifdef CONFIG_ARCH_TEGRA_124_SOC
+	{ .compatible = "nvidia,tegra124-mc", .data = &tegra124_mc_soc },
+#endif
+	{ .compatible = "nvidia,tegra132-mc", .data = &tegra132_mc_soc },
+	{ }
+};
+
+static irqreturn_t tegra124_mc_irq(int irq, void *data)
+{
+	struct tegra_mc *mc = data;
+	u32 value, status, mask;
+
+	/* mask all interrupts to avoid flooding */
+	mask = mc_readl(mc, MC_INTMASK);
+	mc_writel(mc, 0, MC_INTMASK);
+
+	status = mc_readl(mc, MC_INTSTATUS);
+	mc_writel(mc, status, MC_INTSTATUS);
+
+	dev_dbg(mc->dev, "INTSTATUS: %08x\n", status);
+
+	if (status & MC_INT_DECERR_MTS)
+		dev_dbg(mc->dev, "  DECERR_MTS\n");
+
+	if (status & MC_INT_SECERR_SEC)
+		dev_dbg(mc->dev, "  SECERR_SEC\n");
+
+	if (status & MC_INT_DECERR_VPR)
+		dev_dbg(mc->dev, "  DECERR_VPR\n");
+
+	if (status & MC_INT_INVALID_APB_ASID_UPDATE)
+		dev_dbg(mc->dev, "  INVALID_APB_ASID_UPDATE\n");
+
+	if (status & MC_INT_INVALID_SMMU_PAGE)
+		dev_dbg(mc->dev, "  INVALID_SMMU_PAGE\n");
+
+	if (status & MC_INT_ARBITRATION_EMEM)
+		dev_dbg(mc->dev, "  ARBITRATION_EMEM\n");
+
+	if (status & MC_INT_SECURITY_VIOLATION)
+		dev_dbg(mc->dev, "  SECURITY_VIOLATION\n");
+
+	if (status & MC_INT_DECERR_EMEM)
+		dev_dbg(mc->dev, "  DECERR_EMEM\n");
+
+	value = mc_readl(mc, MC_ERR_STATUS);
+
+	dev_dbg(mc->dev, "ERR_STATUS: %08x\n", value);
+	dev_dbg(mc->dev, "  type: %x\n", (value >> 28) & 0x7);
+	dev_dbg(mc->dev, "  protection: %x\n", (value >> 25) & 0x7);
+	dev_dbg(mc->dev, "  adr_hi: %x\n", (value >> 20) & 0x3);
+	dev_dbg(mc->dev, "  swap: %x\n", (value >> 18) & 0x1);
+	dev_dbg(mc->dev, "  security: %x\n", (value >> 17) & 0x1);
+	dev_dbg(mc->dev, "  r/w: %x\n", (value >> 16) & 0x1);
+	dev_dbg(mc->dev, "  adr1: %x\n", (value >> 12) & 0x7);
+	dev_dbg(mc->dev, "  client: %x\n", value & 0x7f);
+
+	value = mc_readl(mc, MC_ERR_ADR);
+	dev_dbg(mc->dev, "ERR_ADR: %08x\n", value);
+
+	mc_writel(mc, mask, MC_INTMASK);
+
+	return IRQ_HANDLED;
+}
+
+static int tegra_mc_probe(struct platform_device *pdev)
+{
+	const struct of_device_id *match;
+	struct resource *res;
+	struct tegra_mc *mc;
+	unsigned int i;
+	u32 value;
+	int err;
+
+	match = of_match_node(tegra_mc_of_match, pdev->dev.of_node);
+	if (!match)
+		return -ENODEV;
+
+	mc = devm_kzalloc(&pdev->dev, sizeof(*mc), GFP_KERNEL);
+	if (!mc)
+		return -ENOMEM;
+
+	platform_set_drvdata(pdev, mc);
+	mc->soc = match->data;
+	mc->dev = &pdev->dev;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	mc->regs = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(mc->regs))
+		return PTR_ERR(mc->regs);
+
+	for (i = 0; i < mc->soc->num_clients; i++) {
+		const struct latency_allowance *la = &mc->soc->clients[i].latency;
+		u32 value;
+
+		value = mc_readl(mc, la->reg);
+		value &= ~(la->mask << la->shift);
+		value |= (la->def & la->mask) << la->shift;
+		mc_writel(mc, value, la->reg);
+	}
+
+	mc->smmu = tegra_smmu_probe(&pdev->dev, mc->soc->smmu, mc->regs);
+	if (IS_ERR(mc->smmu)) {
+		dev_err(&pdev->dev, "failed to probe SMMU: %ld\n",
+			PTR_ERR(mc->smmu));
+		return PTR_ERR(mc->smmu);
+	}
+
+	mc->irq = platform_get_irq(pdev, 0);
+	if (mc->irq < 0) {
+		dev_err(&pdev->dev, "interrupt not specified\n");
+		return mc->irq;
+	}
+
+	err = devm_request_irq(&pdev->dev, mc->irq, tegra124_mc_irq,
+			       IRQF_SHARED, dev_name(&pdev->dev), mc);
+	if (err < 0) {
+		dev_err(&pdev->dev, "failed to request IRQ#%d: %d\n", mc->irq,
+			err);
+		return err;
+	}
+
+	value = MC_INT_DECERR_MTS | MC_INT_SECERR_SEC | MC_INT_DECERR_VPR |
+		MC_INT_INVALID_APB_ASID_UPDATE | MC_INT_INVALID_SMMU_PAGE |
+		MC_INT_ARBITRATION_EMEM | MC_INT_SECURITY_VIOLATION |
+		MC_INT_DECERR_EMEM;
+	mc_writel(mc, value, MC_INTMASK);
+
+	return 0;
+}
+
+static int tegra_mc_remove(struct platform_device *pdev)
+{
+	struct tegra_mc *mc = platform_get_drvdata(pdev);
+	int err;
+
+	err = tegra_smmu_remove(mc->smmu);
+	if (err < 0)
+		dev_err(&pdev->dev, "failed to remove SMMU: %d\n", err);
+
+	return 0;
+}
+
+static struct platform_driver tegra_mc_driver = {
+	.driver = {
+		.name = "tegra124-mc",
+		.of_match_table = tegra_mc_of_match,
+	},
+	.probe = tegra_mc_probe,
+	.remove = tegra_mc_remove,
+};
+module_platform_driver(tegra_mc_driver);
+
+MODULE_AUTHOR("Thierry Reding <treding@nvidia.com>");
+MODULE_DESCRIPTION("NVIDIA Tegra124 Memory Controller driver");
+MODULE_LICENSE("GPL v2");
diff --git a/include/dt-bindings/memory/tegra124-mc.h b/include/dt-bindings/memory/tegra124-mc.h
new file mode 100644
index 000000000000..6b1617ce022f
--- /dev/null
+++ b/include/dt-bindings/memory/tegra124-mc.h
@@ -0,0 +1,30 @@
+#ifndef DT_BINDINGS_MEMORY_TEGRA124_MC_H
+#define DT_BINDINGS_MEMORY_TEGRA124_MC_H
+
+#define TEGRA_SWGROUP_DC	0
+#define TEGRA_SWGROUP_DCB	1
+#define TEGRA_SWGROUP_AFI	2
+#define TEGRA_SWGROUP_AVPC	3
+#define TEGRA_SWGROUP_HDA	4
+#define TEGRA_SWGROUP_HC	5
+#define TEGRA_SWGROUP_MSENC	6
+#define TEGRA_SWGROUP_PPCS	7
+#define TEGRA_SWGROUP_SATA	8
+#define TEGRA_SWGROUP_VDE	9
+#define TEGRA_SWGROUP_MPCORELP	10
+#define TEGRA_SWGROUP_MPCORE	11
+#define TEGRA_SWGROUP_ISP2	12
+#define TEGRA_SWGROUP_XUSB_HOST	13
+#define TEGRA_SWGROUP_XUSB_DEV	14
+#define TEGRA_SWGROUP_ISP2B	15
+#define TEGRA_SWGROUP_TSEC	16
+#define TEGRA_SWGROUP_A9AVP	17
+#define TEGRA_SWGROUP_GPU	18
+#define TEGRA_SWGROUP_SDMMC1A	19
+#define TEGRA_SWGROUP_SDMMC2A	20
+#define TEGRA_SWGROUP_SDMMC3A	21
+#define TEGRA_SWGROUP_SDMMC4A	22
+#define TEGRA_SWGROUP_VIC	23
+#define TEGRA_SWGROUP_VI	24
+
+#endif
-- 
2.0.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC 04/10] memory: Add Tegra124 memory controller support
@ 2014-06-26 20:49     ` Thierry Reding
  0 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-26 20:49 UTC (permalink / raw)
  To: linux-arm-kernel

From: Thierry Reding <treding@nvidia.com>

The memory controller on NVIDIA Tegra124 exposes various knobs that can
be used to tune the behaviour of the clients attached to it.

Currently this driver sets up the latency allowance registers to the HW
defaults. Eventually an API should be exported by this driver (via a
custom API or a generic subsystem) to allow clients to register latency
requirements.

This driver also registers an IOMMU (SMMU) that's implemented by the
memory controller.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
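Note on the latency allowance setup mentioned above: each client owns one
field of a shared register, so programming a default is a read-modify-write
on that field only. The following is a minimal user-space sketch of that
step, not code from the patch. The helper name la_apply_default() is
hypothetical; the struct mirrors struct latency_allowance in tegra124-mc.c.

```c
#include <stdint.h>

/* Mirrors struct latency_allowance from tegra124-mc.c. */
struct latency_allowance {
	unsigned int reg;	/* register offset within the MC aperture */
	unsigned int shift;	/* position of the client's field */
	unsigned int mask;	/* field mask, before shifting */
	unsigned int def;	/* default value for the field */
};

/*
 * Hypothetical helper (not part of the patch): given the current register
 * contents, return the contents after programming one client's default
 * latency allowance. Bits outside the client's field are preserved.
 */
static uint32_t la_apply_default(uint32_t value,
				 const struct latency_allowance *la)
{
	value &= ~((uint32_t)la->mask << la->shift);
	value |= (uint32_t)(la->def & la->mask) << la->shift;

	return value;
}
```

With the display0b entry (shift 16, mask 0xff, default 0x50), a register
that currently holds 0x000000c2 ends up as 0x005000c2, so the display0a
field in the low byte is left alone.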
 drivers/memory/Kconfig                   |    9 +
 drivers/memory/Makefile                  |    1 +
 drivers/memory/tegra124-mc.c             | 1945 ++++++++++++++++++++++++++++++
 include/dt-bindings/memory/tegra124-mc.h |   30 +
 4 files changed, 1985 insertions(+)
 create mode 100644 drivers/memory/tegra124-mc.c
 create mode 100644 include/dt-bindings/memory/tegra124-mc.h
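
Note on the SMMU translation: as_get_pte() in this patch splits an I/O
virtual address into a page directory index and a page table index, each
masked with 0x3ff (1024 entries per level). The sketch below shows that
split in isolation. It assumes SMMU_PDE_SHIFT is 22 and SMMU_PTE_SHIFT is
12, which matches the 4 KiB page size advertised in pgsize_bitmap but is
an assumption here; the helper names are hypothetical and not part of the
driver.

```c
#include <stdint.h>

/* Assumed values; the driver's own definitions take precedence. */
#define SMMU_PDE_SHIFT 22
#define SMMU_PTE_SHIFT 12

/* Hypothetical helper: index of the page directory entry for an IOVA. */
static unsigned int iova_pd_index(uint32_t iova)
{
	return (iova >> SMMU_PDE_SHIFT) & 0x3ff;
}

/* Hypothetical helper: index of the page table entry for an IOVA. */
static unsigned int iova_pt_index(uint32_t iova)
{
	return (iova >> SMMU_PTE_SHIFT) & 0x3ff;
}
```

Under these assumptions, IOVA 0x80403000 selects page directory entry
0x201 and page table entry 3; the low 12 bits are the offset within the
page and are not translated.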

diff --git a/drivers/memory/Kconfig b/drivers/memory/Kconfig
index c59e9c96e86d..d0f0e6781570 100644
--- a/drivers/memory/Kconfig
+++ b/drivers/memory/Kconfig
@@ -61,6 +61,15 @@ config TEGRA30_MC
 	  analysis, especially for IOMMU/SMMU(System Memory Management
 	  Unit) module.
 
+config TEGRA124_MC
+	bool "Tegra124 Memory Controller driver"
+	depends on ARCH_TEGRA
+	select IOMMU_API
+	help
+	  This driver is for the Memory Controller module available on
+	  Tegra124 SoCs. It provides an IOMMU that can be used for I/O
+	  virtual address translation.
+
 config FSL_IFC
 	bool
 	depends on FSL_SOC
diff --git a/drivers/memory/Makefile b/drivers/memory/Makefile
index 71160a2b7313..03143927abab 100644
--- a/drivers/memory/Makefile
+++ b/drivers/memory/Makefile
@@ -11,3 +11,4 @@ obj-$(CONFIG_FSL_IFC)		+= fsl_ifc.o
 obj-$(CONFIG_MVEBU_DEVBUS)	+= mvebu-devbus.o
 obj-$(CONFIG_TEGRA20_MC)	+= tegra20-mc.o
 obj-$(CONFIG_TEGRA30_MC)	+= tegra30-mc.o
+obj-$(CONFIG_TEGRA124_MC)	+= tegra124-mc.o
diff --git a/drivers/memory/tegra124-mc.c b/drivers/memory/tegra124-mc.c
new file mode 100644
index 000000000000..741755b6785d
--- /dev/null
+++ b/drivers/memory/tegra124-mc.c
@@ -0,0 +1,1945 @@
+/*
+ * Copyright (C) 2014 NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+
+#include <dt-bindings/memory/tegra124-mc.h>
+
+#include <asm/cacheflush.h>
+#ifndef CONFIG_ARM64
+#include <asm/dma-iommu.h>
+#endif
+
+#define MC_INTSTATUS 0x000
+#define  MC_INT_DECERR_MTS (1 << 16)
+#define  MC_INT_SECERR_SEC (1 << 13)
+#define  MC_INT_DECERR_VPR (1 << 12)
+#define  MC_INT_INVALID_APB_ASID_UPDATE (1 << 11)
+#define  MC_INT_INVALID_SMMU_PAGE (1 << 10)
+#define  MC_INT_ARBITRATION_EMEM (1 << 9)
+#define  MC_INT_SECURITY_VIOLATION (1 << 8)
+#define  MC_INT_DECERR_EMEM (1 << 6)
+#define MC_INTMASK 0x004
+#define MC_ERR_STATUS 0x008
+#define MC_ERR_ADR 0x00c
+
+struct latency_allowance {
+	unsigned int reg;
+	unsigned int shift;
+	unsigned int mask;
+	unsigned int def;
+};
+
+struct smmu_enable {
+	unsigned int reg;
+	unsigned int bit;
+};
+
+struct tegra_mc_client {
+	unsigned int id;
+	const char *name;
+	unsigned int swgroup;
+
+	struct smmu_enable smmu;
+	struct latency_allowance latency;
+};
+
+static const struct tegra_mc_client tegra124_mc_clients[] = {
+	{
+		.id = 0x01,
+		.name = "display0a",
+		.swgroup = TEGRA_SWGROUP_DC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 1,
+		},
+		.latency = {
+			.reg = 0x2e8,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0xc2,
+		},
+	}, {
+		.id = 0x02,
+		.name = "display0ab",
+		.swgroup = TEGRA_SWGROUP_DCB,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 2,
+		},
+		.latency = {
+			.reg = 0x2f4,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0xc6,
+		},
+	}, {
+		.id = 0x03,
+		.name = "display0b",
+		.swgroup = TEGRA_SWGROUP_DC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 3,
+		},
+		.latency = {
+			.reg = 0x2e8,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x04,
+		.name = "display0bb",
+		.swgroup = TEGRA_SWGROUP_DCB,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 4,
+		},
+		.latency = {
+			.reg = 0x2f4,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x05,
+		.name = "display0c",
+		.swgroup = TEGRA_SWGROUP_DC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 5,
+		},
+		.latency = {
+			.reg = 0x2ec,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x06,
+		.name = "display0cb",
+		.swgroup = TEGRA_SWGROUP_DCB,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 6,
+		},
+		.latency = {
+			.reg = 0x2f8,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x0e,
+		.name = "afir",
+		.swgroup = TEGRA_SWGROUP_AFI,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 14,
+		},
+		.latency = {
+			.reg = 0x2e0,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x13,
+		},
+	}, {
+		.id = 0x0f,
+		.name = "avpcarm7r",
+		.swgroup = TEGRA_SWGROUP_AVPC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 15,
+		},
+		.latency = {
+			.reg = 0x2e4,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x04,
+		},
+	}, {
+		.id = 0x10,
+		.name = "displayhc",
+		.swgroup = TEGRA_SWGROUP_DC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 16,
+		},
+		.latency = {
+			.reg = 0x2f0,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x11,
+		.name = "displayhcb",
+		.swgroup = TEGRA_SWGROUP_DCB,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 17,
+		},
+		.latency = {
+			.reg = 0x2fc,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x15,
+		.name = "hdar",
+		.swgroup = TEGRA_SWGROUP_HDA,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 21,
+		},
+		.latency = {
+			.reg = 0x318,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x24,
+		},
+	}, {
+		.id = 0x16,
+		.name = "host1xdmar",
+		.swgroup = TEGRA_SWGROUP_HC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 22,
+		},
+		.latency = {
+			.reg = 0x310,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x1e,
+		},
+	}, {
+		.id = 0x17,
+		.name = "host1xr",
+		.swgroup = TEGRA_SWGROUP_HC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 23,
+		},
+		.latency = {
+			.reg = 0x310,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x1c,
+		.name = "msencsrd",
+		.swgroup = TEGRA_SWGROUP_MSENC,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 28,
+		},
+		.latency = {
+			.reg = 0x328,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x23,
+		},
+	}, {
+		.id = 0x1d,
+		.name = "ppcsahbdmarhdar",
+		.swgroup = TEGRA_SWGROUP_PPCS,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 29,
+		},
+		.latency = {
+			.reg = 0x344,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x49,
+		},
+	}, {
+		.id = 0x1e,
+		.name = "ppcsahbslvr",
+		.swgroup = TEGRA_SWGROUP_PPCS,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 30,
+		},
+		.latency = {
+			.reg = 0x344,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x1a,
+		},
+	}, {
+		.id = 0x1f,
+		.name = "satar",
+		.swgroup = TEGRA_SWGROUP_SATA,
+		.smmu = {
+			.reg = 0x228,
+			.bit = 31,
+		},
+		.latency = {
+			.reg = 0x350,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x65,
+		},
+	}, {
+		.id = 0x22,
+		.name = "vdebsevr",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 2,
+		},
+		.latency = {
+			.reg = 0x354,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x4f,
+		},
+	}, {
+		.id = 0x23,
+		.name = "vdember",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 3,
+		},
+		.latency = {
+			.reg = 0x354,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x3d,
+		},
+	}, {
+		.id = 0x24,
+		.name = "vdemcer",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 4,
+		},
+		.latency = {
+			.reg = 0x358,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x66,
+		},
+	}, {
+		.id = 0x25,
+		.name = "vdetper",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 5,
+		},
+		.latency = {
+			.reg = 0x358,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0xa5,
+		},
+	}, {
+		.id = 0x26,
+		.name = "mpcorelpr",
+		.swgroup = TEGRA_SWGROUP_MPCORELP,
+		.latency = {
+			.reg = 0x324,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x04,
+		},
+	}, {
+		.id = 0x27,
+		.name = "mpcorer",
+		.swgroup = TEGRA_SWGROUP_MPCORE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 2,
+		},
+		.latency = {
+			.reg = 0x320,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x04,
+		},
+	}, {
+		.id = 0x2b,
+		.name = "msencswr",
+		.swgroup = TEGRA_SWGROUP_MSENC,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 11,
+		},
+		.latency = {
+			.reg = 0x328,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x31,
+		.name = "afiw",
+		.swgroup = TEGRA_SWGROUP_AFI,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 17,
+		},
+		.latency = {
+			.reg = 0x2e0,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x32,
+		.name = "avpcarm7w",
+		.swgroup = TEGRA_SWGROUP_AVPC,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 18,
+		},
+		.latency = {
+			.reg = 0x2e4,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x35,
+		.name = "hdaw",
+		.swgroup = TEGRA_SWGROUP_HDA,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 21,
+		},
+		.latency = {
+			.reg = 0x318,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x36,
+		.name = "host1xw",
+		.swgroup = TEGRA_SWGROUP_HC,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 22,
+		},
+		.latency = {
+			.reg = 0x314,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x38,
+		.name = "mpcorelpw",
+		.swgroup = TEGRA_SWGROUP_MPCORELP,
+		.latency = {
+			.reg = 0x324,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x39,
+		.name = "mpcorew",
+		.swgroup = TEGRA_SWGROUP_MPCORE,
+		.latency = {
+			.reg = 0x320,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x3b,
+		.name = "ppcsahbdmaw",
+		.swgroup = TEGRA_SWGROUP_PPCS,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 27,
+		},
+		.latency = {
+			.reg = 0x348,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x3c,
+		.name = "ppcsahbslvw",
+		.swgroup = TEGRA_SWGROUP_PPCS,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 28,
+		},
+		.latency = {
+			.reg = 0x348,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x3d,
+		.name = "sataw",
+		.swgroup = TEGRA_SWGROUP_SATA,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 29,
+		},
+		.latency = {
+			.reg = 0x350,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x65,
+		},
+	}, {
+		.id = 0x3e,
+		.name = "vdebsevw",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 30,
+		},
+		.latency = {
+			.reg = 0x35c,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x3f,
+		.name = "vdedbgw",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x22c,
+			.bit = 31,
+		},
+		.latency = {
+			.reg = 0x35c,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x40,
+		.name = "vdembew",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 0,
+		},
+		.latency = {
+			.reg = 0x360,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x41,
+		.name = "vdetpmw",
+		.swgroup = TEGRA_SWGROUP_VDE,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 1,
+		},
+		.latency = {
+			.reg = 0x360,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x44,
+		.name = "ispra",
+		.swgroup = TEGRA_SWGROUP_ISP2,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 4,
+		},
+		.latency = {
+			.reg = 0x370,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x18,
+		},
+	}, {
+		.id = 0x46,
+		.name = "ispwa",
+		.swgroup = TEGRA_SWGROUP_ISP2,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 6,
+		},
+		.latency = {
+			.reg = 0x374,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x47,
+		.name = "ispwb",
+		.swgroup = TEGRA_SWGROUP_ISP2,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 7,
+		},
+		.latency = {
+			.reg = 0x374,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x4a,
+		.name = "xusb_hostr",
+		.swgroup = TEGRA_SWGROUP_XUSB_HOST,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 10,
+		},
+		.latency = {
+			.reg = 0x37c,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x39,
+		},
+	}, {
+		.id = 0x4b,
+		.name = "xusb_hostw",
+		.swgroup = TEGRA_SWGROUP_XUSB_HOST,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 11,
+		},
+		.latency = {
+			.reg = 0x37c,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x4c,
+		.name = "xusb_devr",
+		.swgroup = TEGRA_SWGROUP_XUSB_DEV,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 12,
+		},
+		.latency = {
+			.reg = 0x380,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x39,
+		},
+	}, {
+		.id = 0x4d,
+		.name = "xusb_devw",
+		.swgroup = TEGRA_SWGROUP_XUSB_DEV,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 13,
+		},
+		.latency = {
+			.reg = 0x380,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x4e,
+		.name = "isprab",
+		.swgroup = TEGRA_SWGROUP_ISP2B,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 14,
+		},
+		.latency = {
+			.reg = 0x384,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x18,
+		},
+	}, {
+		.id = 0x50,
+		.name = "ispwab",
+		.swgroup = TEGRA_SWGROUP_ISP2B,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 16,
+		},
+		.latency = {
+			.reg = 0x388,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x51,
+		.name = "ispwbb",
+		.swgroup = TEGRA_SWGROUP_ISP2B,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 17,
+		},
+		.latency = {
+			.reg = 0x388,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x54,
+		.name = "tsecsrd",
+		.swgroup = TEGRA_SWGROUP_TSEC,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 20,
+		},
+		.latency = {
+			.reg = 0x390,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x9b,
+		},
+	}, {
+		.id = 0x55,
+		.name = "tsecswr",
+		.swgroup = TEGRA_SWGROUP_TSEC,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 21,
+		},
+		.latency = {
+			.reg = 0x390,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x56,
+		.name = "a9avpscr",
+		.swgroup = TEGRA_SWGROUP_A9AVP,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 22,
+		},
+		.latency = {
+			.reg = 0x3a4,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x04,
+		},
+	}, {
+		.id = 0x57,
+		.name = "a9avpscw",
+		.swgroup = TEGRA_SWGROUP_A9AVP,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 23,
+		},
+		.latency = {
+			.reg = 0x3a4,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x58,
+		.name = "gpusrd",
+		.swgroup = TEGRA_SWGROUP_GPU,
+		.smmu = {
+			/* read-only */
+			.reg = 0x230,
+			.bit = 24,
+		},
+		.latency = {
+			.reg = 0x3c8,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x1a,
+		},
+	}, {
+		.id = 0x59,
+		.name = "gpuswr",
+		.swgroup = TEGRA_SWGROUP_GPU,
+		.smmu = {
+			/* read-only */
+			.reg = 0x230,
+			.bit = 25,
+		},
+		.latency = {
+			.reg = 0x3c8,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x5a,
+		.name = "displayt",
+		.swgroup = TEGRA_SWGROUP_DC,
+		.smmu = {
+			.reg = 0x230,
+			.bit = 26,
+		},
+		.latency = {
+			.reg = 0x2f0,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	}, {
+		.id = 0x60,
+		.name = "sdmmcra",
+		.swgroup = TEGRA_SWGROUP_SDMMC1A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 0,
+		},
+		.latency = {
+			.reg = 0x3b8,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x49,
+		},
+	}, {
+		.id = 0x61,
+		.name = "sdmmcraa",
+		.swgroup = TEGRA_SWGROUP_SDMMC2A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 1,
+		},
+		.latency = {
+			.reg = 0x3bc,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x49,
+		},
+	}, {
+		.id = 0x62,
+		.name = "sdmmcr",
+		.swgroup = TEGRA_SWGROUP_SDMMC3A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 2,
+		},
+		.latency = {
+			.reg = 0x3c0,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x49,
+		},
+	}, {
+		.id = 0x63,
+		.name = "sdmmcrab",
+		.swgroup = TEGRA_SWGROUP_SDMMC4A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 3,
+		},
+		.latency = {
+			.reg = 0x3c4,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x49,
+		},
+	}, {
+		.id = 0x64,
+		.name = "sdmmcwa",
+		.swgroup = TEGRA_SWGROUP_SDMMC1A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 4,
+		},
+		.latency = {
+			.reg = 0x3b8,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x65,
+		.name = "sdmmcwaa",
+		.swgroup = TEGRA_SWGROUP_SDMMC2A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 5,
+		},
+		.latency = {
+			.reg = 0x3bc,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x66,
+		.name = "sdmmcw",
+		.swgroup = TEGRA_SWGROUP_SDMMC3A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 6,
+		},
+		.latency = {
+			.reg = 0x3c0,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x67,
+		.name = "sdmmcwab",
+		.swgroup = TEGRA_SWGROUP_SDMMC4A,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 7,
+		},
+		.latency = {
+			.reg = 0x3c4,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x6c,
+		.name = "vicsrd",
+		.swgroup = TEGRA_SWGROUP_VIC,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 12,
+		},
+		.latency = {
+			.reg = 0x394,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x1a,
+		},
+	}, {
+		.id = 0x6d,
+		.name = "vicswr",
+		.swgroup = TEGRA_SWGROUP_VIC,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 13,
+		},
+		.latency = {
+			.reg = 0x394,
+			.shift = 16,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x72,
+		.name = "viw",
+		.swgroup = TEGRA_SWGROUP_VI,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 18,
+		},
+		.latency = {
+			.reg = 0x398,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x80,
+		},
+	}, {
+		.id = 0x73,
+		.name = "displayd",
+		.swgroup = TEGRA_SWGROUP_DC,
+		.smmu = {
+			.reg = 0x234,
+			.bit = 19,
+		},
+		.latency = {
+			.reg = 0x3c8,
+			.shift = 0,
+			.mask = 0xff,
+			.def = 0x50,
+		},
+	},
+};
+
+struct tegra_smmu_swgroup {
+	unsigned int swgroup;
+	unsigned int reg;
+};
+
+static const struct tegra_smmu_swgroup tegra124_swgroups[] = {
+	{ .swgroup = TEGRA_SWGROUP_DC,        .reg = 0x240 },
+	{ .swgroup = TEGRA_SWGROUP_DCB,       .reg = 0x244 },
+	{ .swgroup = TEGRA_SWGROUP_AFI,       .reg = 0x238 },
+	{ .swgroup = TEGRA_SWGROUP_AVPC,      .reg = 0x23c },
+	{ .swgroup = TEGRA_SWGROUP_HDA,       .reg = 0x254 },
+	{ .swgroup = TEGRA_SWGROUP_HC,        .reg = 0x250 },
+	{ .swgroup = TEGRA_SWGROUP_MSENC,     .reg = 0x264 },
+	{ .swgroup = TEGRA_SWGROUP_PPCS,      .reg = 0x270 },
+	{ .swgroup = TEGRA_SWGROUP_SATA,      .reg = 0x274 },
+	{ .swgroup = TEGRA_SWGROUP_VDE,       .reg = 0x27c },
+	{ .swgroup = TEGRA_SWGROUP_ISP2,      .reg = 0x258 },
+	{ .swgroup = TEGRA_SWGROUP_XUSB_HOST, .reg = 0x288 },
+	{ .swgroup = TEGRA_SWGROUP_XUSB_DEV,  .reg = 0x28c },
+	{ .swgroup = TEGRA_SWGROUP_ISP2B,     .reg = 0xaa4 },
+	{ .swgroup = TEGRA_SWGROUP_TSEC,      .reg = 0x294 },
+	{ .swgroup = TEGRA_SWGROUP_A9AVP,     .reg = 0x290 },
+	{ .swgroup = TEGRA_SWGROUP_GPU,       .reg = 0xaa8 },
+	{ .swgroup = TEGRA_SWGROUP_SDMMC1A,   .reg = 0xa94 },
+	{ .swgroup = TEGRA_SWGROUP_SDMMC2A,   .reg = 0xa98 },
+	{ .swgroup = TEGRA_SWGROUP_SDMMC3A,   .reg = 0xa9c },
+	{ .swgroup = TEGRA_SWGROUP_SDMMC4A,   .reg = 0xaa0 },
+	{ .swgroup = TEGRA_SWGROUP_VIC,       .reg = 0x284 },
+	{ .swgroup = TEGRA_SWGROUP_VI,        .reg = 0x280 },
+};
+
+struct tegra_smmu_group_init {
+	unsigned int asid;
+	const char *name;
+
+	const struct of_device_id *matches;
+};
+
+struct tegra_smmu_soc {
+	const struct tegra_smmu_group_init *groups;
+	unsigned int num_groups;
+
+	const struct tegra_mc_client *clients;
+	unsigned int num_clients;
+
+	const struct tegra_smmu_swgroup *swgroups;
+	unsigned int num_swgroups;
+
+	unsigned int num_asids;
+	unsigned int atom_size;
+
+	const struct tegra_smmu_ops *ops;
+};
+
+struct tegra_smmu_ops {
+	void (*flush_dcache)(struct page *page, unsigned long offset,
+			     size_t size);
+};
+
+struct tegra_smmu_master {
+	struct list_head list;
+	struct device *dev;
+};
+
+struct tegra_smmu_group {
+	const char *name;
+	const struct of_device_id *matches;
+	unsigned int asid;
+
+#ifndef CONFIG_ARM64
+	struct dma_iommu_mapping *mapping;
+#endif
+	struct list_head masters;
+};
+
+static const struct of_device_id tegra124_periph_matches[] = {
+	{ .compatible = "nvidia,tegra124-sdhci", },
+	{ }
+};
+
+static const struct tegra_smmu_group_init tegra124_smmu_groups[] = {
+	{ 0, "peripherals", tegra124_periph_matches },
+};
+
+static void tegra_smmu_group_release(void *data)
+{
+	kfree(data);
+}
+
+struct tegra_smmu {
+	void __iomem *regs;
+	struct iommu iommu;
+	struct device *dev;
+
+	const struct tegra_smmu_soc *soc;
+
+	struct iommu_group **groups;
+	unsigned int num_groups;
+
+	unsigned long *asids;
+	struct mutex lock;
+};
+
+struct tegra_smmu_address_space {
+	struct iommu_domain *domain;
+	struct tegra_smmu *smmu;
+	struct page *pd;
+	unsigned id;
+	u32 attr;
+};
+
+static inline void smmu_writel(struct tegra_smmu *smmu, u32 value,
+			       unsigned long offset)
+{
+	writel(value, smmu->regs + offset);
+}
+
+static inline u32 smmu_readl(struct tegra_smmu *smmu, unsigned long offset)
+{
+	return readl(smmu->regs + offset);
+}
+
+#define SMMU_CONFIG 0x010
+#define  SMMU_CONFIG_ENABLE (1 << 0)
+
+#define SMMU_PTB_ASID 0x01c
+#define  SMMU_PTB_ASID_VALUE(x) ((x) & 0x7f)
+
+#define SMMU_PTB_DATA 0x020
+#define  SMMU_PTB_DATA_VALUE(page, attr) (page_to_phys(page) >> 12 | (attr))
+
+#define SMMU_MK_PDE(page, attr) (page_to_phys(page) >> SMMU_PTE_SHIFT | (attr))
+
+#define SMMU_TLB_FLUSH 0x030
+#define  SMMU_TLB_FLUSH_VA_MATCH_ALL     (0 << 0)
+#define  SMMU_TLB_FLUSH_VA_MATCH_SECTION (2 << 0)
+#define  SMMU_TLB_FLUSH_VA_MATCH_GROUP   (3 << 0)
+#define  SMMU_TLB_FLUSH_ASID(x)          (((x) & 0x7f) << 24)
+#define  SMMU_TLB_FLUSH_VA_SECTION(addr) ((((addr) & 0xffc00000) >> 12) | \
+					  SMMU_TLB_FLUSH_VA_MATCH_SECTION)
+#define  SMMU_TLB_FLUSH_VA_GROUP(addr)   ((((addr) & 0xffffc000) >> 12) | \
+					  SMMU_TLB_FLUSH_VA_MATCH_GROUP)
+#define  SMMU_TLB_FLUSH_ASID_MATCH       (1 << 31)
+
+#define SMMU_PTC_FLUSH 0x034
+#define  SMMU_PTC_FLUSH_TYPE_ALL (0 << 0)
+#define  SMMU_PTC_FLUSH_TYPE_ADR (1 << 0)
+
+#define SMMU_PTC_FLUSH_HI 0x9b8
+#define  SMMU_PTC_FLUSH_HI_MASK 0x3
+
+/* per-SWGROUP SMMU_*_ASID register */
+#define SMMU_ASID_ENABLE (1 << 31)
+#define SMMU_ASID_MASK 0x7f
+#define SMMU_ASID_VALUE(x) ((x) & SMMU_ASID_MASK)
+
+/* page table definitions */
+#define SMMU_NUM_PDE 1024
+#define SMMU_NUM_PTE 1024
+
+#define SMMU_SIZE_PD (SMMU_NUM_PDE * 4)
+#define SMMU_SIZE_PT (SMMU_NUM_PTE * 4)
+
+#define SMMU_PDE_SHIFT 22
+#define SMMU_PTE_SHIFT 12
+
+#define SMMU_PFN_MASK 0x000fffff
+
+#define SMMU_PD_READABLE	(1 << 31)
+#define SMMU_PD_WRITABLE	(1 << 30)
+#define SMMU_PD_NONSECURE	(1 << 29)
+
+#define SMMU_PDE_READABLE	(1 << 31)
+#define SMMU_PDE_WRITABLE	(1 << 30)
+#define SMMU_PDE_NONSECURE	(1 << 29)
+#define SMMU_PDE_NEXT		(1 << 28)
+
+#define SMMU_PTE_READABLE	(1 << 31)
+#define SMMU_PTE_WRITABLE	(1 << 30)
+#define SMMU_PTE_NONSECURE	(1 << 29)
+
+#define SMMU_PDE_ATTR		(SMMU_PDE_READABLE | SMMU_PDE_WRITABLE | \
+				 SMMU_PDE_NONSECURE)
+#define SMMU_PTE_ATTR		(SMMU_PTE_READABLE | SMMU_PTE_WRITABLE | \
+				 SMMU_PTE_NONSECURE)
+
+#define SMMU_PDE_VACANT(n)	(((n) << 10) | SMMU_PDE_ATTR)
+#define SMMU_PTE_VACANT(n)	(((n) << 12) | SMMU_PTE_ATTR)
+
+#ifdef CONFIG_ARCH_TEGRA_124_SOC
+static void tegra124_flush_dcache(struct page *page, unsigned long offset,
+				  size_t size)
+{
+	phys_addr_t phys = page_to_phys(page) + offset;
+	void *virt = page_address(page) + offset;
+
+	__cpuc_flush_dcache_area(virt, size);
+	outer_flush_range(phys, phys + size);
+}
+
+static const struct tegra_smmu_ops tegra124_smmu_ops = {
+	.flush_dcache = tegra124_flush_dcache,
+};
+#endif
+
+static void tegra132_flush_dcache(struct page *page, unsigned long offset,
+				  size_t size)
+{
+	/* TODO: implement */
+}
+
+static const struct tegra_smmu_ops tegra132_smmu_ops = {
+	.flush_dcache = tegra132_flush_dcache,
+};
+
+static inline void smmu_flush_ptc(struct tegra_smmu *smmu, struct page *page,
+				  unsigned long offset)
+{
+	phys_addr_t phys = page ? page_to_phys(page) : 0;
+	u32 value;
+
+	if (page) {
+		offset &= ~(smmu->soc->atom_size - 1);
+
+#ifdef CONFIG_PHYS_ADDR_T_64BIT
+		value = (phys >> 32) & SMMU_PTC_FLUSH_HI_MASK;
+#else
+		value = 0;
+#endif
+		smmu_writel(smmu, value, SMMU_PTC_FLUSH_HI);
+
+		value = (phys + offset) | SMMU_PTC_FLUSH_TYPE_ADR;
+	} else {
+		value = SMMU_PTC_FLUSH_TYPE_ALL;
+	}
+
+	smmu_writel(smmu, value, SMMU_PTC_FLUSH);
+}
+
+static inline void smmu_flush_tlb(struct tegra_smmu *smmu)
+{
+	smmu_writel(smmu, SMMU_TLB_FLUSH_VA_MATCH_ALL, SMMU_TLB_FLUSH);
+}
+
+static inline void smmu_flush_tlb_asid(struct tegra_smmu *smmu,
+				       unsigned long asid)
+{
+	u32 value;
+
+	value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
+		SMMU_TLB_FLUSH_VA_MATCH_ALL;
+	smmu_writel(smmu, value, SMMU_TLB_FLUSH);
+}
+
+static inline void smmu_flush_tlb_section(struct tegra_smmu *smmu,
+					  unsigned long asid,
+					  unsigned long iova)
+{
+	u32 value;
+
+	value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
+		SMMU_TLB_FLUSH_VA_SECTION(iova);
+	smmu_writel(smmu, value, SMMU_TLB_FLUSH);
+}
+
+static inline void smmu_flush_tlb_group(struct tegra_smmu *smmu,
+					unsigned long asid,
+					unsigned long iova)
+{
+	u32 value;
+
+	value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
+		SMMU_TLB_FLUSH_VA_GROUP(iova);
+	smmu_writel(smmu, value, SMMU_TLB_FLUSH);
+}
+
+static inline void smmu_flush(struct tegra_smmu *smmu)
+{
+	smmu_readl(smmu, SMMU_CONFIG);
+}
+
+static inline struct tegra_smmu *to_tegra_smmu(struct iommu *iommu)
+{
+	return container_of(iommu, struct tegra_smmu, iommu);
+}
+
+static struct tegra_smmu *smmu_handle = NULL;
+
+static int tegra_smmu_alloc_asid(struct tegra_smmu *smmu, unsigned int *idp)
+{
+	unsigned long id;
+
+	mutex_lock(&smmu->lock);
+
+	id = find_first_zero_bit(smmu->asids, smmu->soc->num_asids);
+	if (id >= smmu->soc->num_asids) {
+		mutex_unlock(&smmu->lock);
+		return -ENOSPC;
+	}
+
+	set_bit(id, smmu->asids);
+	*idp = id;
+
+	mutex_unlock(&smmu->lock);
+	return 0;
+}
+
+static void tegra_smmu_free_asid(struct tegra_smmu *smmu, unsigned int id)
+{
+	mutex_lock(&smmu->lock);
+	clear_bit(id, smmu->asids);
+	mutex_unlock(&smmu->lock);
+}
+
+struct tegra_smmu_address_space *foo = NULL;
+
+static int tegra_smmu_domain_init(struct iommu_domain *domain)
+{
+	struct tegra_smmu *smmu = smmu_handle;
+	struct tegra_smmu_address_space *as;
+	uint32_t *pd, value;
+	unsigned int i;
+	int err = 0;
+
+	as = kzalloc(sizeof(*as), GFP_KERNEL);
+	if (!as) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	as->attr = SMMU_PD_READABLE | SMMU_PD_WRITABLE | SMMU_PD_NONSECURE;
+	as->smmu = smmu_handle;
+	as->domain = domain;
+
+	err = tegra_smmu_alloc_asid(smmu, &as->id);
+	if (err < 0) {
+		kfree(as);
+		goto out;
+	}
+
+	as->pd = alloc_page(GFP_KERNEL | __GFP_DMA);
+	if (!as->pd) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	pd = page_address(as->pd);
+	SetPageReserved(as->pd);
+
+	for (i = 0; i < SMMU_NUM_PDE; i++)
+		pd[i] = SMMU_PDE_VACANT(i);
+
+	smmu->soc->ops->flush_dcache(as->pd, 0, SMMU_SIZE_PD);
+	smmu_flush_ptc(smmu, as->pd, 0);
+	smmu_flush_tlb_asid(smmu, as->id);
+
+	smmu_writel(smmu, SMMU_PTB_ASID_VALUE(as->id), SMMU_PTB_ASID);
+	value = SMMU_PTB_DATA_VALUE(as->pd, as->attr);
+	smmu_writel(smmu, value, SMMU_PTB_DATA);
+	smmu_flush(smmu);
+
+	domain->priv = as;
+
+	return 0;
+
+out:
+	return err;
+}
+
+static void tegra_smmu_domain_destroy(struct iommu_domain *domain)
+{
+	struct tegra_smmu_address_space *as = domain->priv;
+
+	/* TODO: free page directory and page tables */
+
+	tegra_smmu_free_asid(as->smmu, as->id);
+	kfree(as);
+}
+
+static const struct tegra_smmu_swgroup *
+tegra_smmu_find_swgroup(struct tegra_smmu *smmu, unsigned int swgroup)
+{
+	const struct tegra_smmu_swgroup *group = NULL;
+	unsigned int i;
+
+	for (i = 0; i < smmu->soc->num_swgroups; i++) {
+		if (smmu->soc->swgroups[i].swgroup == swgroup) {
+			group = &smmu->soc->swgroups[i];
+			break;
+		}
+	}
+
+	return group;
+}
+
+static int tegra_smmu_enable(struct tegra_smmu *smmu, unsigned int swgroup,
+			     unsigned int asid)
+{
+	const struct tegra_smmu_swgroup *group;
+	unsigned int i;
+	u32 value;
+
+	for (i = 0; i < smmu->soc->num_clients; i++) {
+		const struct tegra_mc_client *client = &smmu->soc->clients[i];
+
+		if (client->swgroup != swgroup)
+			continue;
+
+		value = smmu_readl(smmu, client->smmu.reg);
+		value |= BIT(client->smmu.bit);
+		smmu_writel(smmu, value, client->smmu.reg);
+	}
+
+	group = tegra_smmu_find_swgroup(smmu, swgroup);
+	if (group) {
+		value = smmu_readl(smmu, group->reg);
+		value &= ~SMMU_ASID_MASK;
+		value |= SMMU_ASID_VALUE(asid);
+		value |= SMMU_ASID_ENABLE;
+		smmu_writel(smmu, value, group->reg);
+	}
+
+	return 0;
+}
+
+static int tegra_smmu_disable(struct tegra_smmu *smmu, unsigned int swgroup,
+			      unsigned int asid)
+{
+	const struct tegra_smmu_swgroup *group;
+	unsigned int i;
+	u32 value;
+
+	group = tegra_smmu_find_swgroup(smmu, swgroup);
+	if (group) {
+		value = smmu_readl(smmu, group->reg);
+		value &= ~SMMU_ASID_MASK;
+		value |= SMMU_ASID_VALUE(asid);
+		value &= ~SMMU_ASID_ENABLE;
+		smmu_writel(smmu, value, group->reg);
+	}
+
+	for (i = 0; i < smmu->soc->num_clients; i++) {
+		const struct tegra_mc_client *client = &smmu->soc->clients[i];
+
+		if (client->swgroup != swgroup)
+			continue;
+
+		value = smmu_readl(smmu, client->smmu.reg);
+		value &= ~BIT(client->smmu.bit);
+		smmu_writel(smmu, value, client->smmu.reg);
+	}
+
+	return 0;
+}
+
+static int tegra_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
+{
+	struct tegra_smmu_address_space *as = domain->priv;
+	struct tegra_smmu *smmu = as->smmu;
+	struct of_phandle_iter entry;
+	int err;
+
+	of_property_for_each_phandle_with_args(entry, dev->of_node, "iommus",
+					       "#iommu-cells", 0) {
+		unsigned int swgroup = entry.out_args.args[0];
+
+		if (entry.out_args.np != smmu->dev->of_node)
+			continue;
+
+		err = tegra_smmu_enable(smmu, swgroup, as->id);
+		if (err < 0)
+			pr_err("failed to enable SWGROUP#%u\n", swgroup);
+	}
+
+	return 0;
+}
+
+static void tegra_smmu_detach_dev(struct iommu_domain *domain, struct device *dev)
+{
+	struct tegra_smmu_address_space *as = domain->priv;
+	struct tegra_smmu *smmu = as->smmu;
+	struct of_phandle_iter entry;
+	int err;
+
+	of_property_for_each_phandle_with_args(entry, dev->of_node, "iommus",
+					       "#iommu-cells", 0) {
+		unsigned int swgroup;
+
+		if (entry.out_args.np != smmu->dev->of_node)
+			continue;
+
+		swgroup = entry.out_args.args[0];
+
+		err = tegra_smmu_disable(smmu, swgroup, as->id);
+		if (err < 0) {
+			pr_err("failed to disable SWGROUP#%u\n", swgroup);
+		}
+	}
+}
+
+static u32 *as_get_pte(struct tegra_smmu_address_space *as, dma_addr_t iova,
+		       struct page **pagep)
+{
+	struct tegra_smmu *smmu = smmu_handle;
+	u32 *pd = page_address(as->pd), *pt;
+	u32 pde = (iova >> SMMU_PDE_SHIFT) & 0x3ff;
+	u32 pte = (iova >> SMMU_PTE_SHIFT) & 0x3ff;
+	struct page *page;
+	unsigned int i;
+
+	if (pd[pde] != SMMU_PDE_VACANT(pde)) {
+		page = pfn_to_page(pd[pde] & SMMU_PFN_MASK);
+		pt = page_address(page);
+	} else {
+		page = alloc_page(GFP_KERNEL | __GFP_DMA);
+		if (!page)
+			return NULL;
+
+		pt = page_address(page);
+		SetPageReserved(page);
+
+		for (i = 0; i < SMMU_NUM_PTE; i++)
+			pt[i] = SMMU_PTE_VACANT(i);
+
+		smmu->soc->ops->flush_dcache(page, 0, SMMU_SIZE_PT);
+
+		pd[pde] = SMMU_MK_PDE(page, SMMU_PDE_ATTR | SMMU_PDE_NEXT);
+
+		smmu->soc->ops->flush_dcache(as->pd, pde << 2, 4);
+		smmu_flush_ptc(smmu, as->pd, pde << 2);
+		smmu_flush_tlb_section(smmu, as->id, iova);
+		smmu_flush(smmu);
+	}
+
+	*pagep = page;
+
+	return &pt[pte];
+}
+
+static int tegra_smmu_map(struct iommu_domain *domain, unsigned long iova,
+			  phys_addr_t paddr, size_t size, int prot)
+{
+	struct tegra_smmu_address_space *as = domain->priv;
+	struct tegra_smmu *smmu = smmu_handle;
+	unsigned long offset;
+	struct page *page;
+	u32 *pte;
+
+	pte = as_get_pte(as, iova, &page);
+	if (!pte)
+		return -ENOMEM;
+
+	offset = offset_in_page(pte);
+
+	*pte = __phys_to_pfn(paddr) | SMMU_PTE_ATTR;
+
+	smmu->soc->ops->flush_dcache(page, offset, 4);
+	smmu_flush_ptc(smmu, page, offset);
+	smmu_flush_tlb_group(smmu, as->id, iova);
+	smmu_flush(smmu);
+
+	return 0;
+}
+
+static size_t tegra_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
+			       size_t size)
+{
+	struct tegra_smmu_address_space *as = domain->priv;
+	struct tegra_smmu *smmu = smmu_handle;
+	unsigned long offset;
+	struct page *page;
+	u32 *pte;
+
+	pte = as_get_pte(as, iova, &page);
+	if (!pte)
+		return 0;
+
+	offset = offset_in_page(pte);
+	*pte = 0;
+
+	smmu->soc->ops->flush_dcache(page, offset, 4);
+	smmu_flush_ptc(smmu, page, offset);
+	smmu_flush_tlb_group(smmu, as->id, iova);
+	smmu_flush(smmu);
+
+	return size;
+}
+
+static phys_addr_t tegra_smmu_iova_to_phys(struct iommu_domain *domain,
+					   dma_addr_t iova)
+{
+	struct tegra_smmu_address_space *as = domain->priv;
+	struct page *page;
+	unsigned long pfn;
+	u32 *pte;
+
+	pte = as_get_pte(as, iova, &page);
+	pfn = *pte & SMMU_PFN_MASK;
+
+	return PFN_PHYS(pfn);
+}
+
+static int tegra_smmu_attach(struct iommu *iommu, struct device *dev)
+{
+	struct tegra_smmu *smmu = to_tegra_smmu(iommu);
+	struct tegra_smmu_group *group;
+	unsigned int i;
+
+	for (i = 0; i < smmu->soc->num_groups; i++) {
+		group = iommu_group_get_iommudata(smmu->groups[i]);
+
+		if (of_match_node(group->matches, dev->of_node)) {
+			pr_debug("adding device %s to group %s\n",
+				 dev_name(dev), group->name);
+			iommu_group_add_device(smmu->groups[i], dev);
+			break;
+		}
+	}
+
+	if (i == smmu->soc->num_groups)
+		return 0;
+
+#ifndef CONFIG_ARM64
+	return arm_iommu_attach_device(dev, group->mapping);
+#else
+	return 0;
+#endif
+}
+
+static int tegra_smmu_detach(struct iommu *iommu, struct device *dev)
+{
+	return 0;
+}
+
+static const struct iommu_ops tegra_smmu_ops = {
+	.domain_init = tegra_smmu_domain_init,
+	.domain_destroy = tegra_smmu_domain_destroy,
+	.attach_dev = tegra_smmu_attach_dev,
+	.detach_dev = tegra_smmu_detach_dev,
+	.map = tegra_smmu_map,
+	.unmap = tegra_smmu_unmap,
+	.iova_to_phys = tegra_smmu_iova_to_phys,
+	.attach = tegra_smmu_attach,
+	.detach = tegra_smmu_detach,
+
+	.pgsize_bitmap = SZ_4K,
+};
+
+static struct tegra_smmu *tegra_smmu_probe(struct device *dev,
+					   const struct tegra_smmu_soc *soc,
+					   void __iomem *regs)
+{
+	struct tegra_smmu *smmu;
+	unsigned int i;
+	size_t size;
+	u32 value;
+	int err;
+
+	smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
+	if (!smmu)
+		return ERR_PTR(-ENOMEM);
+
+	size = BITS_TO_LONGS(soc->num_asids) * sizeof(long);
+
+	smmu->asids = devm_kzalloc(dev, size, GFP_KERNEL);
+	if (!smmu->asids)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&smmu->iommu.list);
+	mutex_init(&smmu->lock);
+
+	smmu->iommu.ops = &tegra_smmu_ops;
+	smmu->iommu.dev = dev;
+
+	smmu->regs = regs;
+	smmu->soc = soc;
+	smmu->dev = dev;
+
+	smmu_handle = smmu;
+	bus_set_iommu(&platform_bus_type, &tegra_smmu_ops);
+
+	smmu->num_groups = soc->num_groups;
+
+	smmu->groups = devm_kcalloc(dev, smmu->num_groups, sizeof(*smmu->groups),
+				    GFP_KERNEL);
+	if (!smmu->groups)
+		return ERR_PTR(-ENOMEM);
+
+	for (i = 0; i < smmu->num_groups; i++) {
+		struct tegra_smmu_group *group;
+
+		smmu->groups[i] = iommu_group_alloc();
+		if (IS_ERR(smmu->groups[i]))
+			return ERR_CAST(smmu->groups[i]);
+
+		err = iommu_group_set_name(smmu->groups[i], soc->groups[i].name);
+		if (err < 0) {
+		}
+
+		group = kzalloc(sizeof(*group), GFP_KERNEL);
+		if (!group)
+			return ERR_PTR(-ENOMEM);
+
+		group->matches = soc->groups[i].matches;
+		group->asid = soc->groups[i].asid;
+		group->name = soc->groups[i].name;
+
+		iommu_group_set_iommudata(smmu->groups[i], group,
+					  tegra_smmu_group_release);
+
+#ifndef CONFIG_ARM64
+		group->mapping = arm_iommu_create_mapping(&platform_bus_type,
+							  0, SZ_2G);
+		if (IS_ERR(group->mapping)) {
+			dev_err(dev, "failed to create mapping for group %s: %ld\n",
+				group->name, PTR_ERR(group->mapping));
+			return ERR_CAST(group->mapping);
+		}
+#endif
+	}
+
+	value = (1 << 29) | (8 << 24) | 0x3f;
+	smmu_writel(smmu, value, 0x18);
+
+	value = (1 << 29) | (1 << 28) | 0x20;
+	smmu_writel(smmu, value, 0x014);
+
+	smmu_flush_ptc(smmu, NULL, 0);
+	smmu_flush_tlb(smmu);
+	smmu_writel(smmu, SMMU_CONFIG_ENABLE, SMMU_CONFIG);
+	smmu_flush(smmu);
+
+	err = iommu_add(&smmu->iommu);
+	if (err < 0)
+		return ERR_PTR(err);
+
+	return smmu;
+}
+
+static int tegra_smmu_remove(struct tegra_smmu *smmu)
+{
+	iommu_remove(&smmu->iommu);
+
+	return 0;
+}
+
+#ifdef CONFIG_ARCH_TEGRA_124_SOC
+static const struct tegra_smmu_soc tegra124_smmu_soc = {
+	.groups = tegra124_smmu_groups,
+	.num_groups = ARRAY_SIZE(tegra124_smmu_groups),
+	.clients = tegra124_mc_clients,
+	.num_clients = ARRAY_SIZE(tegra124_mc_clients),
+	.swgroups = tegra124_swgroups,
+	.num_swgroups = ARRAY_SIZE(tegra124_swgroups),
+	.num_asids = 128,
+	.atom_size = 32,
+	.ops = &tegra124_smmu_ops,
+};
+#endif
+
+static const struct tegra_smmu_soc tegra132_smmu_soc = {
+	.groups = tegra124_smmu_groups,
+	.num_groups = ARRAY_SIZE(tegra124_smmu_groups),
+	.clients = tegra124_mc_clients,
+	.num_clients = ARRAY_SIZE(tegra124_mc_clients),
+	.swgroups = tegra124_swgroups,
+	.num_swgroups = ARRAY_SIZE(tegra124_swgroups),
+	.num_asids = 128,
+	.atom_size = 32,
+	.ops = &tegra132_smmu_ops,
+};
+
+struct tegra_mc {
+	struct device *dev;
+	struct tegra_smmu *smmu;
+	void __iomem *regs;
+	int irq;
+
+	const struct tegra_mc_soc *soc;
+};
+
+static inline u32 mc_readl(struct tegra_mc *mc, unsigned long offset)
+{
+	return readl(mc->regs + offset);
+}
+
+static inline void mc_writel(struct tegra_mc *mc, u32 value, unsigned long offset)
+{
+	writel(value, mc->regs + offset);
+}
+
+struct tegra_mc_soc {
+	const struct tegra_mc_client *clients;
+	unsigned int num_clients;
+
+	const struct tegra_smmu_soc *smmu;
+};
+
+#ifdef CONFIG_ARCH_TEGRA_124_SOC
+static const struct tegra_mc_soc tegra124_mc_soc = {
+	.clients = tegra124_mc_clients,
+	.num_clients = ARRAY_SIZE(tegra124_mc_clients),
+	.smmu = &tegra124_smmu_soc,
+};
+#endif
+
+static const struct tegra_mc_soc tegra132_mc_soc = {
+	.clients = tegra124_mc_clients,
+	.num_clients = ARRAY_SIZE(tegra124_mc_clients),
+	.smmu = &tegra132_smmu_soc,
+};
+
+static const struct of_device_id tegra_mc_of_match[] = {
+#ifdef CONFIG_ARCH_TEGRA_124_SOC
+	{ .compatible = "nvidia,tegra124-mc", .data = &tegra124_mc_soc },
+#endif
+	{ .compatible = "nvidia,tegra132-mc", .data = &tegra132_mc_soc },
+	{ }
+};
+
+static irqreturn_t tegra124_mc_irq(int irq, void *data)
+{
+	struct tegra_mc *mc = data;
+	u32 value, status, mask;
+
+	/* mask all interrupts to avoid flooding */
+	mask = mc_readl(mc, MC_INTMASK);
+	mc_writel(mc, 0, MC_INTMASK);
+
+	status = mc_readl(mc, MC_INTSTATUS);
+	mc_writel(mc, status, MC_INTSTATUS);
+
+	dev_dbg(mc->dev, "INTSTATUS: %08x\n", status);
+
+	if (status & MC_INT_DECERR_MTS)
+		dev_dbg(mc->dev, "  DECERR_MTS\n");
+
+	if (status & MC_INT_SECERR_SEC)
+		dev_dbg(mc->dev, "  SECERR_SEC\n");
+
+	if (status & MC_INT_DECERR_VPR)
+		dev_dbg(mc->dev, "  DECERR_VPR\n");
+
+	if (status & MC_INT_INVALID_APB_ASID_UPDATE)
+		dev_dbg(mc->dev, "  INVALID_APB_ASID_UPDATE\n");
+
+	if (status & MC_INT_INVALID_SMMU_PAGE)
+		dev_dbg(mc->dev, "  INVALID_SMMU_PAGE\n");
+
+	if (status & MC_INT_ARBITRATION_EMEM)
+		dev_dbg(mc->dev, "  ARBITRATION_EMEM\n");
+
+	if (status & MC_INT_SECURITY_VIOLATION)
+		dev_dbg(mc->dev, "  SECURITY_VIOLATION\n");
+
+	if (status & MC_INT_DECERR_EMEM)
+		dev_dbg(mc->dev, "  DECERR_EMEM\n");
+
+	value = mc_readl(mc, MC_ERR_STATUS);
+
+	dev_dbg(mc->dev, "ERR_STATUS: %08x\n", value);
+	dev_dbg(mc->dev, "  type: %x\n", (value >> 28) & 0x7);
+	dev_dbg(mc->dev, "  protection: %x\n", (value >> 25) & 0x7);
+	dev_dbg(mc->dev, "  adr_hi: %x\n", (value >> 20) & 0x3);
+	dev_dbg(mc->dev, "  swap: %x\n", (value >> 18) & 0x1);
+	dev_dbg(mc->dev, "  security: %x\n", (value >> 17) & 0x1);
+	dev_dbg(mc->dev, "  r/w: %x\n", (value >> 16) & 0x1);
+	dev_dbg(mc->dev, "  adr1: %x\n", (value >> 12) & 0x7);
+	dev_dbg(mc->dev, "  client: %x\n", value & 0x7f);
+
+	value = mc_readl(mc, MC_ERR_ADR);
+	dev_dbg(mc->dev, "ERR_ADR: %08x\n", value);
+
+	mc_writel(mc, mask, MC_INTMASK);
+
+	return IRQ_HANDLED;
+}
+
+static int tegra_mc_probe(struct platform_device *pdev)
+{
+	const struct of_device_id *match;
+	struct resource *res;
+	struct tegra_mc *mc;
+	unsigned int i;
+	u32 value;
+	int err;
+
+	match = of_match_node(tegra_mc_of_match, pdev->dev.of_node);
+	if (!match)
+		return -ENODEV;
+
+	mc = devm_kzalloc(&pdev->dev, sizeof(*mc), GFP_KERNEL);
+	if (!mc)
+		return -ENOMEM;
+
+	platform_set_drvdata(pdev, mc);
+	mc->soc = match->data;
+	mc->dev = &pdev->dev;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	mc->regs = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(mc->regs))
+		return PTR_ERR(mc->regs);
+
+	for (i = 0; i < mc->soc->num_clients; i++) {
+		const struct latency_allowance *la = &mc->soc->clients[i].latency;
+		u32 value;
+
+		value = readl(mc->regs + la->reg);
+		value &= ~(la->mask << la->shift);
+		value |= (la->def & la->mask) << la->shift;
+		writel(value, mc->regs + la->reg);
+	}
+
+	mc->smmu = tegra_smmu_probe(&pdev->dev, mc->soc->smmu, mc->regs);
+	if (IS_ERR(mc->smmu)) {
+		dev_err(&pdev->dev, "failed to probe SMMU: %ld\n",
+			PTR_ERR(mc->smmu));
+		return PTR_ERR(mc->smmu);
+	}
+
+	mc->irq = platform_get_irq(pdev, 0);
+	if (mc->irq < 0) {
+		dev_err(&pdev->dev, "interrupt not specified\n");
+		return mc->irq;
+	}
+
+	err = devm_request_irq(&pdev->dev, mc->irq, tegra124_mc_irq,
+			       IRQF_SHARED, dev_name(&pdev->dev), mc);
+	if (err < 0) {
+		dev_err(&pdev->dev, "failed to request IRQ#%d: %d\n", mc->irq,
+			err);
+		return err;
+	}
+
+	value = MC_INT_DECERR_MTS | MC_INT_SECERR_SEC | MC_INT_DECERR_VPR |
+		MC_INT_INVALID_APB_ASID_UPDATE | MC_INT_INVALID_SMMU_PAGE |
+		MC_INT_ARBITRATION_EMEM | MC_INT_SECURITY_VIOLATION |
+		MC_INT_DECERR_EMEM;
+	mc_writel(mc, value, MC_INTMASK);
+
+	return 0;
+}
+
+static int tegra_mc_remove(struct platform_device *pdev)
+{
+	struct tegra_mc *mc = platform_get_drvdata(pdev);
+	int err;
+
+	err = tegra_smmu_remove(mc->smmu);
+	if (err < 0)
+		dev_err(&pdev->dev, "failed to remove SMMU: %d\n", err);
+
+	return 0;
+}
+
+static struct platform_driver tegra_mc_driver = {
+	.driver = {
+		.name = "tegra124-mc",
+		.of_match_table = tegra_mc_of_match,
+	},
+	.probe = tegra_mc_probe,
+	.remove = tegra_mc_remove,
+};
+module_platform_driver(tegra_mc_driver);
+
+MODULE_AUTHOR("Thierry Reding <treding@nvidia.com>");
+MODULE_DESCRIPTION("NVIDIA Tegra124 Memory Controller driver");
+MODULE_LICENSE("GPL v2");
diff --git a/include/dt-bindings/memory/tegra124-mc.h b/include/dt-bindings/memory/tegra124-mc.h
new file mode 100644
index 000000000000..6b1617ce022f
--- /dev/null
+++ b/include/dt-bindings/memory/tegra124-mc.h
@@ -0,0 +1,30 @@
+#ifndef DT_BINDINGS_MEMORY_TEGRA124_MC_H
+#define DT_BINDINGS_MEMORY_TEGRA124_MC_H
+
+#define TEGRA_SWGROUP_DC	0
+#define TEGRA_SWGROUP_DCB	1
+#define TEGRA_SWGROUP_AFI	2
+#define TEGRA_SWGROUP_AVPC	3
+#define TEGRA_SWGROUP_HDA	4
+#define TEGRA_SWGROUP_HC	5
+#define TEGRA_SWGROUP_MSENC	6
+#define TEGRA_SWGROUP_PPCS	7
+#define TEGRA_SWGROUP_SATA	8
+#define TEGRA_SWGROUP_VDE	9
+#define TEGRA_SWGROUP_MPCORELP	10
+#define TEGRA_SWGROUP_MPCORE	11
+#define TEGRA_SWGROUP_ISP2	12
+#define TEGRA_SWGROUP_XUSB_HOST	13
+#define TEGRA_SWGROUP_XUSB_DEV	14
+#define TEGRA_SWGROUP_ISP2B	15
+#define TEGRA_SWGROUP_TSEC	16
+#define TEGRA_SWGROUP_A9AVP	17
+#define TEGRA_SWGROUP_GPU	18
+#define TEGRA_SWGROUP_SDMMC1A	19
+#define TEGRA_SWGROUP_SDMMC2A	20
+#define TEGRA_SWGROUP_SDMMC3A	21
+#define TEGRA_SWGROUP_SDMMC4A	22
+#define TEGRA_SWGROUP_VIC	23
+#define TEGRA_SWGROUP_VI	24
+
+#endif
-- 
2.0.0
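
Reader's note (not part of the patch): as_get_pte() above walks a
two-level table with 1024 entries per level, so a 32-bit IOVA splits
into a 10-bit directory index, a 10-bit table index and a 12-bit page
offset. The following is only a rough userspace sketch of that split,
assuming 32-bit IOVAs. The shift values are copied from the driver;
SMMU_INDEX_MASK, SMMU_PAGE_MASK, struct smmu_iova_parts and both
functions are hypothetical names used for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the driver in this patch. */
#define SMMU_PDE_SHIFT 22
#define SMMU_PTE_SHIFT 12

/* Hypothetical names, not in the patch (the driver open-codes 0x3ff). */
#define SMMU_INDEX_MASK 0x3ffu
#define SMMU_PAGE_MASK  0xfffu

struct smmu_iova_parts {
	uint32_t pde;    /* index into the page directory */
	uint32_t pte;    /* index into the page table */
	uint32_t offset; /* byte offset within the 4 KiB page */
};

/* Split an I/O virtual address the same way as_get_pte() indexes it. */
static struct smmu_iova_parts smmu_split_iova(uint32_t iova)
{
	struct smmu_iova_parts p;

	p.pde = (iova >> SMMU_PDE_SHIFT) & SMMU_INDEX_MASK;
	p.pte = (iova >> SMMU_PTE_SHIFT) & SMMU_INDEX_MASK;
	p.offset = iova & SMMU_PAGE_MASK;

	return p;
}

/* Worked example: 0x80403123 is directory entry 0x201, table entry 3. */
static void smmu_split_iova_example(void)
{
	struct smmu_iova_parts p = smmu_split_iova(0x80403123u);

	assert(p.pde == 0x201);
	assert(p.pte == 0x003);
	assert(p.offset == 0x123);
}
```

Each page directory entry therefore covers 4 MiB of I/O virtual
address space, which seems to be why a new page table is followed by a
section flush (smmu_flush_tlb_section) rather than a group flush.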

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC 05/10] ARM: tegra: Add memory controller on Tegra124
  2014-06-26 20:49 ` Thierry Reding
  (?)
@ 2014-06-26 20:49     ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-26 20:49 UTC (permalink / raw)
  To: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel
  Cc: Cho KyongHo, Grant Grundler, Dave Martin, Marc Zyngier,
	Hiroshi Doyu, Olav Haugan, Paul Walmsley, Rhyland Klein,
	Allen Martin, devicetree, iommu, linux-arm-kernel, linux-tegra,
	linux-kernel

From: Thierry Reding <treding@nvidia.com>

Add the memory controller and wire up the interrupt that is used to
report errors. Also add an #iommu-cells property to mark the device
as an IOMMU.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
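Reader's note (not part of the commit): with #iommu-cells = <1>, the
single specifier cell in a client's "iommus" property is read by
tegra_smmu_attach_dev() in patch 4 as the SWGROUP number. Below is a
rough userspace sketch of what the driver then does with that cell,
assuming a three-entry excerpt of tegra124_swgroups. The register
offsets and bit definitions are copied from patch 4; struct
swgroup_reg, find_swgroup(), asid_reg_value() and swgroup_example()
are hypothetical names used for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* SWGROUP IDs, copied from include/dt-bindings/memory/tegra124-mc.h. */
#define TEGRA_SWGROUP_DC  0
#define TEGRA_SWGROUP_DCB 1
#define TEGRA_SWGROUP_AFI 2

/* Per-SWGROUP ASID register layout, copied from the driver. */
#define SMMU_ASID_ENABLE (1u << 31)
#define SMMU_ASID_MASK   0x7fu

struct swgroup_reg {
	unsigned int swgroup; /* value of the "iommus" specifier cell */
	unsigned int reg;     /* offset of the SMMU_*_ASID register */
};

/* Excerpt of tegra124_swgroups; the real table has more entries. */
static const struct swgroup_reg swgroup_regs[] = {
	{ TEGRA_SWGROUP_DC,  0x240 },
	{ TEGRA_SWGROUP_DCB, 0x244 },
	{ TEGRA_SWGROUP_AFI, 0x238 },
};

/* Map a specifier cell to its table entry, or NULL if it is unknown. */
static const struct swgroup_reg *find_swgroup(unsigned int cell)
{
	size_t i;

	for (i = 0; i < sizeof(swgroup_regs) / sizeof(swgroup_regs[0]); i++)
		if (swgroup_regs[i].swgroup == cell)
			return &swgroup_regs[i];

	return NULL;
}

/* New register value that assigns the ASID and enables translation. */
static unsigned int asid_reg_value(unsigned int old, unsigned int asid)
{
	old &= ~SMMU_ASID_MASK;
	old |= asid & SMMU_ASID_MASK;
	old |= SMMU_ASID_ENABLE;

	return old;
}

static void swgroup_example(void)
{
	const struct swgroup_reg *group = find_swgroup(TEGRA_SWGROUP_DCB);

	assert(group != NULL && group->reg == 0x244);
	assert(find_swgroup(99) == NULL);
	assert(asid_reg_value(0x5, 3) == 0x80000003u);
}
```

Given the "mc" label added below, a client node would presumably
reference this controller with something like
iommus = <&mc TEGRA_SWGROUP_DC>; the exact client properties are
defined by the generic IOMMU binding in patch 2, not by this sketch.
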
 arch/arm/boot/dts/tegra124.dtsi | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/arm/boot/dts/tegra124.dtsi b/arch/arm/boot/dts/tegra124.dtsi
index 0bf050696186..efa0f0c519be 100644
--- a/arch/arm/boot/dts/tegra124.dtsi
+++ b/arch/arm/boot/dts/tegra124.dtsi
@@ -560,6 +560,15 @@
 		reset-names = "fuse";
 	};
 
+	mc: memory-controller@0,70019000 {
+		compatible = "nvidia,tegra124-mc";
+		reg = <0x0 0x70019000 0x0 0x1000>;
+
+		interrupts = <GIC_SPI 77 IRQ_TYPE_LEVEL_HIGH>;
+
+		#iommu-cells = <1>;
+	};
+
 	hda@0,70030000 {
 		compatible = "nvidia,tegra124-hda", "nvidia,tegra30-hda";
 		reg = <0x0 0x70030000 0x0 0x10000>;
-- 
2.0.0

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC 05/10] ARM: tegra: Add memory controller on Tegra124
@ 2014-06-26 20:49     ` Thierry Reding
  0 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-26 20:49 UTC (permalink / raw)
  To: linux-arm-kernel

From: Thierry Reding <treding@nvidia.com>

Add the memory controller and wire up the interrupt that is used to
report errors. Also add an #iommu-cells property to make the device
as an IOMMU.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 arch/arm/boot/dts/tegra124.dtsi | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/arm/boot/dts/tegra124.dtsi b/arch/arm/boot/dts/tegra124.dtsi
index 0bf050696186..efa0f0c519be 100644
--- a/arch/arm/boot/dts/tegra124.dtsi
+++ b/arch/arm/boot/dts/tegra124.dtsi
@@ -560,6 +560,15 @@
 		reset-names = "fuse";
 	};
 
+	mc: memory-controller at 0,70019000 {
+		compatible = "nvidia,tegra124-mc";
+		reg = <0x0 0x70019000 0x0 0x1000>;
+
+		interrupts = <GIC_SPI 77 IRQ_TYPE_LEVEL_HIGH>;
+
+		#iommu-cells = <1>;
+	};
+
 	hda at 0,70030000 {
 		compatible = "nvidia,tegra124-hda", "nvidia,tegra30-hda";
 		reg = <0x0 0x70030000 0x0 0x10000>;
-- 
2.0.0

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC 06/10] ARM: tegra: tegra124: Enable IOMMU for display controllers
  2014-06-26 20:49 ` Thierry Reding
  (?)
@ 2014-06-26 20:49     ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-26 20:49 UTC (permalink / raw)
  To: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel
  Cc: Olav Haugan, devicetree, Grant Grundler, Rhyland Klein, iommu,
	linux-kernel, Marc Zyngier, Allen Martin, Paul Walmsley,
	linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel

From: Thierry Reding <treding@nvidia.com>

Add an iommus property to each of the display controllers and encode the
SWGROUP in the specifier.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 arch/arm/boot/dts/tegra124.dtsi | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/tegra124.dtsi b/arch/arm/boot/dts/tegra124.dtsi
index efa0f0c519be..82751d2878c4 100644
--- a/arch/arm/boot/dts/tegra124.dtsi
+++ b/arch/arm/boot/dts/tegra124.dtsi
@@ -3,6 +3,7 @@
 #include <dt-bindings/pinctrl/pinctrl-tegra.h>
 #include <dt-bindings/pinctrl/pinctrl-tegra-xusb.h>
 #include <dt-bindings/interrupt-controller/arm-gic.h>
+#include <dt-bindings/memory/tegra124-mc.h>
 
 #include "skeleton.dtsi"
 
@@ -104,6 +105,8 @@
 			reset-names = "dc";
 
 			nvidia,head = <0>;
+
+			iommus = <&mc TEGRA_SWGROUP_DC>;
 		};
 
 		dc@0,54240000 {
@@ -117,6 +120,8 @@
 			reset-names = "dc";
 
 			nvidia,head = <1>;
+
+			iommus = <&mc TEGRA_SWGROUP_DCB>;
 		};
 
 		hdmi@0,54280000 {
-- 
2.0.0

^ permalink raw reply related	[flat|nested] 133+ messages in thread
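
Note on the SWGROUP specifier used above: the cover letter states that
each SWGROUP can be assigned to one I/O virtual address space. A toy
model of that bookkeeping is sketched below, assuming a small table
indexed by SWGROUP. The enum values, struct swgroup_state and both
helpers are hypothetical; the real SWGROUP IDs come from
dt-bindings/memory/tegra124-mc.h and the real assignment is done by the
memory controller driver in patch 4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative IDs only; the real values come from tegra124-mc.h. */
enum { SWGROUP_DC, SWGROUP_DCB, SWGROUP_COUNT };

struct swgroup_state {
	bool attached;	/* true once bound to an address space */
	uint8_t asid;	/* address space the group translates through */
};

static struct swgroup_state swgroups[SWGROUP_COUNT];

/* Return the address space of a SWGROUP, or -1 if it is not translated. */
static int swgroup_asid(unsigned int swgroup)
{
	if (swgroup >= SWGROUP_COUNT || !swgroups[swgroup].attached)
		return -1;

	return swgroups[swgroup].asid;
}

/* Bind a SWGROUP to one address space; rebinding is refused. */
static int swgroup_attach(unsigned int swgroup, uint8_t asid)
{
	if (swgroup >= SWGROUP_COUNT || swgroups[swgroup].attached)
		return -1;

	swgroups[swgroup].attached = true;
	swgroups[swgroup].asid = asid;

	assert(swgroup_asid(swgroup) == asid);
	return 0;
}
```

In this model both display controllers can be attached to the same
address space, which matches patch 9, where the DRM driver attaches
every display controller to a single domain.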

* [RFC 07/10] ARM: tegra: tegra124: Enable IOMMU for SDMMC controllers
  2014-06-26 20:49 ` Thierry Reding
  (?)
@ 2014-06-26 20:49     ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-26 20:49 UTC (permalink / raw)
  To: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel
  Cc: Olav Haugan, devicetree, Grant Grundler, Rhyland Klein, iommu,
	linux-kernel, Marc Zyngier, Allen Martin, Paul Walmsley,
	linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel

From: Thierry Reding <treding@nvidia.com>

The SDMMC controllers can use the IOMMU to avoid the need for bounce
buffers.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 arch/arm/boot/dts/tegra124.dtsi | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm/boot/dts/tegra124.dtsi b/arch/arm/boot/dts/tegra124.dtsi
index 82751d2878c4..bfffb4c102fb 100644
--- a/arch/arm/boot/dts/tegra124.dtsi
+++ b/arch/arm/boot/dts/tegra124.dtsi
@@ -607,6 +607,7 @@
 		resets = <&tegra_car 14>;
 		reset-names = "sdhci";
 		status = "disabled";
+		iommus = <&mc TEGRA_SWGROUP_SDMMC1A>;
 	};
 
 	sdhci@0,700b0200 {
@@ -618,6 +619,7 @@
 		resets = <&tegra_car 9>;
 		reset-names = "sdhci";
 		status = "disabled";
+		iommus = <&mc TEGRA_SWGROUP_SDMMC2A>;
 	};
 
 	sdhci@0,700b0400 {
@@ -629,6 +631,7 @@
 		resets = <&tegra_car 69>;
 		reset-names = "sdhci";
 		status = "disabled";
+		iommus = <&mc TEGRA_SWGROUP_SDMMC3A>;
 	};
 
 	sdhci@0,700b0600 {
@@ -640,6 +643,7 @@
 		resets = <&tegra_car 15>;
 		reset-names = "sdhci";
 		status = "disabled";
+		iommus = <&mc TEGRA_SWGROUP_SDMMC4A>;
 	};
 
 	ahub@0,70300000 {
-- 
2.0.0

^ permalink raw reply related	[flat|nested] 133+ messages in thread
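
Note on bounce buffers: the commit message above does not say why the
SDMMC controllers would otherwise need them. One common reason is that a
buffer lies outside the range of bus addresses the device can generate.
The sketch below models only that case; needs_bounce() and its "mask"
parameter (modelled on a DMA mask) are hypothetical and are not taken
from the SDHCI driver or from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether a transfer would have to go through a bounce buffer.
 * "mask" is the highest bus address the device can generate.
 */
static bool needs_bounce(uint64_t phys, uint64_t len, uint64_t mask,
			 bool behind_iommu)
{
	assert(len > 0);

	/*
	 * With an IOMMU the device is handed an I/O virtual address,
	 * which can be chosen inside its addressable range.
	 */
	if (behind_iommu)
		return false;

	/* Without one, the last byte of the buffer must be reachable. */
	return phys > mask || len - 1 > mask - phys;
}
```

The comparison is written as len - 1 > mask - phys rather than
phys + len - 1 > mask so that it cannot overflow for large addresses.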

* [RFC 08/10] ARM: tegra: Select ARM_DMA_USE_IOMMU
  2014-06-26 20:49 ` Thierry Reding
  (?)
@ 2014-06-26 20:49     ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-26 20:49 UTC (permalink / raw)
  To: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel
  Cc: Olav Haugan, devicetree, Grant Grundler, Rhyland Klein, iommu,
	linux-kernel, Marc Zyngier, Allen Martin, Paul Walmsley,
	linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel

From: Thierry Reding <treding@nvidia.com>

This enables IOMMU interoperation with the DMA mapping API so that
clients that use the DMA mapping API can seamlessly make use of an
existing IOMMU.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 arch/arm/mach-tegra/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/mach-tegra/Kconfig b/arch/arm/mach-tegra/Kconfig
index a52d96366919..20bc43975bde 100644
--- a/arch/arm/mach-tegra/Kconfig
+++ b/arch/arm/mach-tegra/Kconfig
@@ -2,6 +2,7 @@ menuconfig ARCH_TEGRA
 	bool "NVIDIA Tegra" if ARCH_MULTI_V7
 	select ARCH_REQUIRE_GPIOLIB
 	select ARCH_SUPPORTS_TRUSTED_FOUNDATIONS
+	select ARM_DMA_USE_IOMMU
 	select ARM_GIC
 	select CLKSRC_MMIO
 	select HAVE_ARM_SCU if SMP
-- 
2.0.0

^ permalink raw reply related	[flat|nested] 133+ messages in thread
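
Note on what "seamlessly" means for a client driver: the driver keeps
calling the same mapping function, and only the table of operations
behind it changes. A much-reduced userspace model of that dispatch is
sketched below. struct dma_ops, struct device, dma_map() and the toy
translation are all hypothetical stand-ins and do not reproduce the ARM
DMA mapping implementation.

```c
#include <assert.h>
#include <stdint.h>

struct device;

/* Much-reduced stand-in for a table of DMA mapping operations. */
struct dma_ops {
	uint64_t (*map)(struct device *dev, uint64_t phys);
};

struct device {
	const struct dma_ops *ops;	/* selected once, at device setup */
	uint64_t iova_base;		/* only used by the IOMMU-backed ops */
};

/* No IOMMU: the device is handed the physical address unchanged. */
static uint64_t direct_map(struct device *dev, uint64_t phys)
{
	(void)dev;
	return phys;
}

/* IOMMU: the device is handed an I/O virtual address instead. */
static uint64_t iommu_map_addr(struct device *dev, uint64_t phys)
{
	/* Toy translation: keep the 4 KiB page offset, relocate the page. */
	return dev->iova_base + (phys & 0xfff);
}

static const struct dma_ops direct_ops = { .map = direct_map };
static const struct dma_ops iommu_ops = { .map = iommu_map_addr };

/* What a client calls; it never needs to know which ops are in use. */
static uint64_t dma_map(struct device *dev, uint64_t phys)
{
	assert(dev && dev->ops && dev->ops->map);
	return dev->ops->map(dev, phys);
}
```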

* [RFC 09/10] drm/tegra: Add IOMMU support
  2014-06-26 20:49 ` Thierry Reding
  (?)
@ 2014-06-26 20:49     ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-26 20:49 UTC (permalink / raw)
  To: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel
  Cc: Olav Haugan, devicetree, Grant Grundler, Rhyland Klein, iommu,
	linux-kernel, Marc Zyngier, Allen Martin, Paul Walmsley,
	linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel

From: Thierry Reding <treding@nvidia.com>

When an IOMMU device is available on the platform bus, allocate an IOMMU
domain and attach the display controllers to it. The display controllers
can then scan out non-contiguous buffers by mapping them through the
IOMMU.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/gpu/drm/tegra/dc.c  |  21 ++++
 drivers/gpu/drm/tegra/drm.c |  17 ++++
 drivers/gpu/drm/tegra/drm.h |   3 +
 drivers/gpu/drm/tegra/fb.c  |  16 ++-
 drivers/gpu/drm/tegra/gem.c | 236 +++++++++++++++++++++++++++++++++++++++-----
 drivers/gpu/drm/tegra/gem.h |   4 +
 6 files changed, 273 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index afcca04f5367..0f7452d04811 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -9,6 +9,7 @@
 
 #include <linux/clk.h>
 #include <linux/debugfs.h>
+#include <linux/iommu.h>
 #include <linux/reset.h>
 
 #include "dc.h"
@@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
 {
 	struct drm_device *drm = dev_get_drvdata(client->parent);
 	struct tegra_dc *dc = host1x_client_to_dc(client);
+	struct tegra_drm *tegra = drm->dev_private;
 	int err;
 
+	if (tegra->domain) {
+		err = iommu_attach_device(tegra->domain, dc->dev);
+		if (err < 0) {
+			dev_err(dc->dev, "failed to attach to IOMMU: %d\n",
+				err);
+			return err;
+		}
+	}
+
 	drm_crtc_init(drm, &dc->base, &tegra_crtc_funcs);
 	drm_mode_crtc_set_gamma_size(&dc->base, 256);
 	drm_crtc_helper_add(&dc->base, &tegra_crtc_helper_funcs);
@@ -1318,7 +1329,9 @@ static int tegra_dc_init(struct host1x_client *client)
 
 static int tegra_dc_exit(struct host1x_client *client)
 {
+	struct drm_device *drm = dev_get_drvdata(client->parent);
 	struct tegra_dc *dc = host1x_client_to_dc(client);
+	struct tegra_drm *tegra = drm->dev_private;
 	int err;
 
 	devm_free_irq(dc->dev, dc->irq, dc);
@@ -1335,6 +1348,8 @@ static int tegra_dc_exit(struct host1x_client *client)
 		return err;
 	}
 
+	iommu_detach_device(tegra->domain, dc->dev);
+
 	return 0;
 }
 
@@ -1462,6 +1477,12 @@ static int tegra_dc_probe(struct platform_device *pdev)
 		return -ENXIO;
 	}
 
+	err = iommu_attach(&pdev->dev);
+	if (err < 0) {
+		dev_err(&pdev->dev, "failed to attach to IOMMU: %d\n", err);
+		return err;
+	}
+
 	INIT_LIST_HEAD(&dc->client.list);
 	dc->client.ops = &dc_client_ops;
 	dc->client.dev = &pdev->dev;
diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index 59736bb810cd..1d2bbafad982 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -8,6 +8,7 @@
  */
 
 #include <linux/host1x.h>
+#include <linux/iommu.h>
 
 #include "drm.h"
 #include "gem.h"
@@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
 	if (!tegra)
 		return -ENOMEM;
 
+	if (iommu_present(&platform_bus_type)) {
+		tegra->domain = iommu_domain_alloc(&platform_bus_type);
+		if (!tegra->domain) {
+			kfree(tegra);
+			return -ENOMEM;
+		}
+
+		drm_mm_init(&tegra->mm, 0, SZ_2G);
+	}
+
 	mutex_init(&tegra->clients_lock);
 	INIT_LIST_HEAD(&tegra->clients);
 	drm->dev_private = tegra;
@@ -71,6 +82,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
 static int tegra_drm_unload(struct drm_device *drm)
 {
 	struct host1x_device *device = to_host1x_device(drm->dev);
+	struct tegra_drm *tegra = drm->dev_private;
 	int err;
 
 	drm_kms_helper_poll_fini(drm);
@@ -82,6 +94,11 @@ static int tegra_drm_unload(struct drm_device *drm)
 	if (err < 0)
 		return err;
 
+	if (tegra->domain) {
+		iommu_domain_free(tegra->domain);
+		drm_mm_takedown(&tegra->mm);
+	}
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
index 96d754e7b3eb..a07c796b7edc 100644
--- a/drivers/gpu/drm/tegra/drm.h
+++ b/drivers/gpu/drm/tegra/drm.h
@@ -39,6 +39,9 @@ struct tegra_fbdev {
 struct tegra_drm {
 	struct drm_device *drm;
 
+	struct iommu_domain *domain;
+	struct drm_mm mm;
+
 	struct mutex clients_lock;
 	struct list_head clients;
 
diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c
index 7790d43ad082..21c65dd817c3 100644
--- a/drivers/gpu/drm/tegra/fb.c
+++ b/drivers/gpu/drm/tegra/fb.c
@@ -65,8 +65,12 @@ static void tegra_fb_destroy(struct drm_framebuffer *framebuffer)
 	for (i = 0; i < fb->num_planes; i++) {
 		struct tegra_bo *bo = fb->planes[i];
 
-		if (bo)
+		if (bo) {
+			if (bo->pages && bo->vaddr)
+				vunmap(bo->vaddr);
+
 			drm_gem_object_unreference_unlocked(&bo->gem);
+		}
 	}
 
 	drm_framebuffer_cleanup(framebuffer);
@@ -252,6 +256,16 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
 	offset = info->var.xoffset * bytes_per_pixel +
 		 info->var.yoffset * fb->pitches[0];
 
+	if (bo->pages) {
+		bo->vaddr = vmap(bo->pages, bo->num_pages, VM_MAP,
+				 pgprot_writecombine(PAGE_KERNEL));
+		if (!bo->vaddr) {
+			dev_err(drm->dev, "failed to vmap() framebuffer\n");
+			err = -ENOMEM;
+			goto destroy;
+		}
+	}
+
 	drm->mode_config.fb_base = (resource_size_t)bo->paddr;
 	info->screen_base = (void __iomem *)bo->vaddr + offset;
 	info->screen_size = size;
diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
index c1e4e8b6e5ca..2912e61a2599 100644
--- a/drivers/gpu/drm/tegra/gem.c
+++ b/drivers/gpu/drm/tegra/gem.c
@@ -14,8 +14,10 @@
  */
 
 #include <linux/dma-buf.h>
+#include <linux/iommu.h>
 #include <drm/tegra_drm.h>
 
+#include "drm.h"
 #include "gem.h"
 
 static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo)
@@ -90,14 +92,144 @@ static const struct host1x_bo_ops tegra_bo_ops = {
 	.kunmap = tegra_bo_kunmap,
 };
 
+static int iommu_map_sg(struct iommu_domain *domain, struct sg_table *sgt,
+			dma_addr_t iova, int prot)
+{
+	unsigned long offset = 0;
+	struct scatterlist *sg;
+	unsigned int i, j;
+	int err;
+
+	for_each_sg(sgt->sgl, sg, sgt->nents, i) {
+		dma_addr_t phys = sg_phys(sg);
+		size_t length = sg->offset;
+
+		phys = sg_phys(sg) - sg->offset;
+		length = sg->length + sg->offset;
+
+		err = iommu_map(domain, iova + offset, phys, length, prot);
+		if (err < 0)
+			goto unmap;
+
+		offset += length;
+	}
+
+	return 0;
+
+unmap:
+	offset = 0;
+
+	for_each_sg(sgt->sgl, sg, i, j) {
+		size_t length = sg->length + sg->offset;
+		iommu_unmap(domain, iova + offset, length);
+		offset += length;
+	}
+
+	return err;
+}
+
+static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
+			  dma_addr_t iova)
+{
+	unsigned long offset = 0;
+	struct scatterlist *sg;
+	unsigned int i;
+
+	for_each_sg(sgt->sgl, sg, sgt->nents, i) {
+		dma_addr_t phys = sg_phys(sg);
+		size_t length = sg->offset;
+
+		phys = sg_phys(sg) - sg->offset;
+		length = sg->length + sg->offset;
+
+		iommu_unmap(domain, iova + offset, length);
+		offset += length;
+	}
+
+	return 0;
+}
+
+static int tegra_bo_iommu_map(struct tegra_drm *tegra, struct tegra_bo *bo)
+{
+	int prot = IOMMU_READ | IOMMU_WRITE;
+	int err;
+
+	if (bo->mm)
+		return -EBUSY;
+
+	bo->mm = kzalloc(sizeof(*bo->mm), GFP_KERNEL);
+	if (!bo->mm)
+		return -ENOMEM;
+
+	err = drm_mm_insert_node_generic(&tegra->mm, bo->mm, bo->gem.size,
+					 PAGE_SIZE, 0, 0, 0);
+	if (err < 0) {
+		dev_err(tegra->drm->dev, "out of virtual memory: %d\n", err);
+		return err;
+	}
+
+	bo->paddr = bo->mm->start;
+
+	err = iommu_map_sg(tegra->domain, bo->sgt, bo->paddr, prot);
+	if (err < 0) {
+		dev_err(tegra->drm->dev, "failed to map buffer: %d\n", err);
+		return err;
+	}
+
+	return 0;
+}
+
+static int tegra_bo_iommu_unmap(struct tegra_drm *tegra, struct tegra_bo *bo)
+{
+	if (!bo->mm)
+		return 0;
+
+	iommu_unmap_sg(tegra->domain, bo->sgt, bo->paddr);
+	drm_mm_remove_node(bo->mm);
+
+	kfree(bo->mm);
+	return 0;
+}
+
 static void tegra_bo_destroy(struct drm_device *drm, struct tegra_bo *bo)
 {
-	dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr, bo->paddr);
+	if (!bo->pages)
+		dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr,
+				      bo->paddr);
+	else
+		drm_gem_put_pages(&bo->gem, bo->pages, true, true);
+}
+
+static int tegra_bo_get_pages(struct drm_device *drm, struct tegra_bo *bo,
+			      size_t size)
+{
+	bo->pages = drm_gem_get_pages(&bo->gem, GFP_KERNEL);
+	if (!bo->pages)
+		return -ENOMEM;
+
+	bo->num_pages = size >> PAGE_SHIFT;
+
+	return 0;
+}
+
+static int tegra_bo_alloc(struct drm_device *drm, struct tegra_bo *bo,
+			  size_t size)
+{
+	bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
+					   GFP_KERNEL | __GFP_NOWARN);
+	if (!bo->vaddr) {
+		dev_err(drm->dev, "failed to allocate buffer of size %zu\n",
+			size);
+		return -ENOMEM;
+	}
+
+	return 0;
 }
 
 struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
 				 unsigned long flags)
 {
+	struct tegra_drm *tegra = drm->dev_private;
 	struct tegra_bo *bo;
 	int err;
 
@@ -108,22 +240,33 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
 	host1x_bo_init(&bo->base, &tegra_bo_ops);
 	size = round_up(size, PAGE_SIZE);
 
-	bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
-					   GFP_KERNEL | __GFP_NOWARN);
-	if (!bo->vaddr) {
-		dev_err(drm->dev, "failed to allocate buffer with size %u\n",
-			size);
-		err = -ENOMEM;
-		goto err_dma;
-	}
-
 	err = drm_gem_object_init(drm, &bo->gem, size);
 	if (err)
-		goto err_init;
+		goto free;
 
 	err = drm_gem_create_mmap_offset(&bo->gem);
 	if (err)
-		goto err_mmap;
+		goto release;
+
+	if (tegra->domain) {
+		err = tegra_bo_get_pages(drm, bo, size);
+		if (err < 0)
+			goto release;
+
+		bo->sgt = drm_prime_pages_to_sg(bo->pages, bo->num_pages);
+		if (IS_ERR(bo->sgt)) {
+			err = PTR_ERR(bo->sgt);
+			goto release;
+		}
+
+		err = tegra_bo_iommu_map(tegra, bo);
+		if (err < 0)
+			goto release;
+	} else {
+		err = tegra_bo_alloc(drm, bo, size);
+		if (err < 0)
+			goto release;
+	}
 
 	if (flags & DRM_TEGRA_GEM_CREATE_TILED)
 		bo->tiling.mode = TEGRA_BO_TILING_MODE_TILED;
@@ -133,11 +276,10 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
 
 	return bo;
 
-err_mmap:
+release:
 	drm_gem_object_release(&bo->gem);
-err_init:
 	tegra_bo_destroy(drm, bo);
-err_dma:
+free:
 	kfree(bo);
 
 	return ERR_PTR(err);
@@ -172,6 +314,7 @@ err:
 static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
 					struct dma_buf *buf)
 {
+	struct tegra_drm *tegra = drm->dev_private;
 	struct dma_buf_attachment *attach;
 	struct tegra_bo *bo;
 	ssize_t size;
@@ -211,12 +354,19 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
 		goto detach;
 	}
 
-	if (bo->sgt->nents > 1) {
-		err = -EINVAL;
-		goto detach;
+	if (tegra->domain) {
+		err = tegra_bo_iommu_map(tegra, bo);
+		if (err < 0)
+			goto detach;
+	} else {
+		if (bo->sgt->nents > 1) {
+			err = -EINVAL;
+			goto detach;
+		}
+
+		bo->paddr = sg_dma_address(bo->sgt->sgl);
 	}
 
-	bo->paddr = sg_dma_address(bo->sgt->sgl);
 	bo->gem.import_attach = attach;
 
 	return bo;
@@ -239,8 +389,12 @@ free:
 
 void tegra_bo_free_object(struct drm_gem_object *gem)
 {
+	struct tegra_drm *tegra = gem->dev->dev_private;
 	struct tegra_bo *bo = to_tegra_bo(gem);
 
+	if (tegra->domain)
+		tegra_bo_iommu_unmap(tegra, bo);
+
 	if (gem->import_attach) {
 		dma_buf_unmap_attachment(gem->import_attach, bo->sgt,
 					 DMA_TO_DEVICE);
@@ -301,7 +455,38 @@ int tegra_bo_dumb_map_offset(struct drm_file *file, struct drm_device *drm,
 	return 0;
 }
 
+static int tegra_bo_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+{
+	struct drm_gem_object *gem = vma->vm_private_data;
+	struct tegra_bo *bo = to_tegra_bo(gem);
+	struct page *page;
+	pgoff_t offset;
+	int err;
+
+	if (!bo->pages)
+		return VM_FAULT_SIGBUS;
+
+	offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >> PAGE_SHIFT;
+	page = bo->pages[offset];
+
+	err = vm_insert_page(vma, (unsigned long)vmf->virtual_address, page);
+	switch (err) {
+	case -EAGAIN:
+	case 0:
+	case -ERESTARTSYS:
+	case -EINTR:
+	case -EBUSY:
+		return VM_FAULT_NOPAGE;
+
+	case -ENOMEM:
+		return VM_FAULT_OOM;
+	}
+
+	return VM_FAULT_SIGBUS;
+}
+
 const struct vm_operations_struct tegra_bo_vm_ops = {
+	.fault = tegra_bo_fault,
 	.open = drm_gem_vm_open,
 	.close = drm_gem_vm_close,
 };
@@ -316,13 +501,18 @@ int tegra_drm_mmap(struct file *file, struct vm_area_struct *vma)
 	if (ret)
 		return ret;
 
+	vma->vm_flags |= VM_MIXEDMAP;
+	vma->vm_flags &= ~VM_PFNMAP;
+
 	gem = vma->vm_private_data;
 	bo = to_tegra_bo(gem);
 
-	ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
-			      vma->vm_end - vma->vm_start, vma->vm_page_prot);
-	if (ret)
-		drm_gem_vm_close(vma);
+	if (!bo->pages) {
+		ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
+				      vma->vm_end - vma->vm_start, vma->vm_page_prot);
+		if (ret)
+			drm_gem_vm_close(vma);
+	}
 
 	return ret;
 }
diff --git a/drivers/gpu/drm/tegra/gem.h b/drivers/gpu/drm/tegra/gem.h
index 43a25c853357..c2e3f43e4b3f 100644
--- a/drivers/gpu/drm/tegra/gem.h
+++ b/drivers/gpu/drm/tegra/gem.h
@@ -37,6 +37,10 @@ struct tegra_bo {
 	dma_addr_t paddr;
 	void *vaddr;
 
+	struct drm_mm_node *mm;
+	unsigned long num_pages;
+	struct page **pages;
+
 	struct tegra_bo_tiling tiling;
 };
 
-- 
2.0.0

^ permalink raw reply related	[flat|nested] 133+ messages in thread
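
Note on the scatter-gather mapping in the patch above: iommu_map_sg()
places each segment at iova plus a running offset, and on failure walks
back over the segments already mapped. The userspace sketch below models
that map-then-unwind pattern. struct segment, struct domain and the
three functions are hypothetical stand-ins for the scatterlist and the
IOMMU API, and the sketch assumes page-aligned segments, so it does not
reproduce the sg->offset adjustment made in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MAPPINGS 8

struct segment {
	uint64_t phys;
	size_t length;
};

/* Toy stand-in for an IOMMU domain: a fixed table of live mappings. */
struct domain {
	struct {
		uint64_t iova, phys;
		size_t length;
	} map[MAX_MAPPINGS];
	size_t count;
};

static int domain_map(struct domain *d, uint64_t iova, uint64_t phys,
		      size_t length)
{
	if (d->count == MAX_MAPPINGS)
		return -1;	/* table full: models a mapping failure */

	d->map[d->count].iova = iova;
	d->map[d->count].phys = phys;
	d->map[d->count].length = length;
	d->count++;
	return 0;
}

static void domain_unmap(struct domain *d, uint64_t iova)
{
	size_t i;

	for (i = 0; i < d->count; i++) {
		if (d->map[i].iova == iova) {
			/* Fill the hole with the last live entry. */
			d->map[i] = d->map[d->count - 1];
			d->count--;
			return;
		}
	}
}

/* Map all segments back to back starting at "iova"; all or nothing. */
static int map_segments(struct domain *d, const struct segment *segs,
			size_t nsegs, uint64_t iova)
{
	size_t offset = 0, i, j;

	assert(d != NULL && segs != NULL);

	for (i = 0; i < nsegs; i++) {
		if (domain_map(d, iova + offset, segs[i].phys,
			       segs[i].length) < 0)
			goto unwind;

		offset += segs[i].length;
	}

	return 0;

unwind:
	/* Undo only the first "i" segments, which were mapped successfully. */
	offset = 0;

	for (j = 0; j < i; j++) {
		domain_unmap(d, iova + offset);
		offset += segs[j].length;
	}

	return -1;
}
```

The device then sees one contiguous range starting at iova even though
the segments are scattered in physical memory, which is what lets the
display controllers scan out non-contiguous buffers.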

* [RFC 09/10] drm/tegra: Add IOMMU support
@ 2014-06-26 20:49     ` Thierry Reding
  0 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-26 20:49 UTC (permalink / raw)
  To: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel
  Cc: Cho KyongHo, Grant Grundler, Dave Martin, Marc Zyngier,
	Hiroshi Doyu, Olav Haugan, Paul Walmsley, Rhyland Klein,
	Allen Martin, devicetree, iommu, linux-arm-kernel, linux-tegra,
	linux-kernel

From: Thierry Reding <treding@nvidia.com>

When an IOMMU device is available on the platform bus, allocate an IOMMU
domain and attach the display controllers to it. The display controllers
can then scan out non-contiguous buffers by mapping them through the
IOMMU.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/gpu/drm/tegra/dc.c  |  21 ++++
 drivers/gpu/drm/tegra/drm.c |  17 ++++
 drivers/gpu/drm/tegra/drm.h |   3 +
 drivers/gpu/drm/tegra/fb.c  |  16 ++-
 drivers/gpu/drm/tegra/gem.c | 236 +++++++++++++++++++++++++++++++++++++++-----
 drivers/gpu/drm/tegra/gem.h |   4 +
 6 files changed, 273 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index afcca04f5367..0f7452d04811 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -9,6 +9,7 @@
 
 #include <linux/clk.h>
 #include <linux/debugfs.h>
+#include <linux/iommu.h>
 #include <linux/reset.h>
 
 #include "dc.h"
@@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
 {
 	struct drm_device *drm = dev_get_drvdata(client->parent);
 	struct tegra_dc *dc = host1x_client_to_dc(client);
+	struct tegra_drm *tegra = drm->dev_private;
 	int err;
 
+	if (tegra->domain) {
+		err = iommu_attach_device(tegra->domain, dc->dev);
+		if (err < 0) {
+			dev_err(dc->dev, "failed to attach to IOMMU: %d\n",
+				err);
+			return err;
+		}
+	}
+
 	drm_crtc_init(drm, &dc->base, &tegra_crtc_funcs);
 	drm_mode_crtc_set_gamma_size(&dc->base, 256);
 	drm_crtc_helper_add(&dc->base, &tegra_crtc_helper_funcs);
@@ -1318,7 +1329,9 @@ static int tegra_dc_init(struct host1x_client *client)
 
 static int tegra_dc_exit(struct host1x_client *client)
 {
+	struct drm_device *drm = dev_get_drvdata(client->parent);
 	struct tegra_dc *dc = host1x_client_to_dc(client);
+	struct tegra_drm *tegra = drm->dev_private;
 	int err;
 
 	devm_free_irq(dc->dev, dc->irq, dc);
@@ -1335,6 +1348,8 @@ static int tegra_dc_exit(struct host1x_client *client)
 		return err;
 	}
 
+	if (tegra->domain)
+		iommu_detach_device(tegra->domain, dc->dev);
 	return 0;
 }
 
@@ -1462,6 +1477,12 @@ static int tegra_dc_probe(struct platform_device *pdev)
 		return -ENXIO;
 	}
 
+	err = iommu_attach(&pdev->dev);
+	if (err < 0) {
+		dev_err(&pdev->dev, "failed to attach to IOMMU: %d\n", err);
+		return err;
+	}
+
 	INIT_LIST_HEAD(&dc->client.list);
 	dc->client.ops = &dc_client_ops;
 	dc->client.dev = &pdev->dev;
diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index 59736bb810cd..1d2bbafad982 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -8,6 +8,7 @@
  */
 
 #include <linux/host1x.h>
+#include <linux/iommu.h>
 
 #include "drm.h"
 #include "gem.h"
@@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
 	if (!tegra)
 		return -ENOMEM;
 
+	if (iommu_present(&platform_bus_type)) {
+		tegra->domain = iommu_domain_alloc(&platform_bus_type);
+		if (!tegra->domain) {
+			kfree(tegra);
+			return -ENOMEM;
+		}
+
+		drm_mm_init(&tegra->mm, 0, SZ_2G);
+	}
+
 	mutex_init(&tegra->clients_lock);
 	INIT_LIST_HEAD(&tegra->clients);
 	drm->dev_private = tegra;
@@ -71,6 +82,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
 static int tegra_drm_unload(struct drm_device *drm)
 {
 	struct host1x_device *device = to_host1x_device(drm->dev);
+	struct tegra_drm *tegra = drm->dev_private;
 	int err;
 
 	drm_kms_helper_poll_fini(drm);
@@ -82,6 +94,11 @@ static int tegra_drm_unload(struct drm_device *drm)
 	if (err < 0)
 		return err;
 
+	if (tegra->domain) {
+		iommu_domain_free(tegra->domain);
+		drm_mm_takedown(&tegra->mm);
+	}
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
index 96d754e7b3eb..a07c796b7edc 100644
--- a/drivers/gpu/drm/tegra/drm.h
+++ b/drivers/gpu/drm/tegra/drm.h
@@ -39,6 +39,9 @@ struct tegra_fbdev {
 struct tegra_drm {
 	struct drm_device *drm;
 
+	struct iommu_domain *domain;
+	struct drm_mm mm;
+
 	struct mutex clients_lock;
 	struct list_head clients;
 
diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c
index 7790d43ad082..21c65dd817c3 100644
--- a/drivers/gpu/drm/tegra/fb.c
+++ b/drivers/gpu/drm/tegra/fb.c
@@ -65,8 +65,12 @@ static void tegra_fb_destroy(struct drm_framebuffer *framebuffer)
 	for (i = 0; i < fb->num_planes; i++) {
 		struct tegra_bo *bo = fb->planes[i];
 
-		if (bo)
+		if (bo) {
+			if (bo->pages && bo->vaddr)
+				vunmap(bo->vaddr);
+
 			drm_gem_object_unreference_unlocked(&bo->gem);
+		}
 	}
 
 	drm_framebuffer_cleanup(framebuffer);
@@ -252,6 +256,16 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
 	offset = info->var.xoffset * bytes_per_pixel +
 		 info->var.yoffset * fb->pitches[0];
 
+	if (bo->pages) {
+		bo->vaddr = vmap(bo->pages, bo->num_pages, VM_MAP,
+				 pgprot_writecombine(PAGE_KERNEL));
+		if (!bo->vaddr) {
+			dev_err(drm->dev, "failed to vmap() framebuffer\n");
+			err = -ENOMEM;
+			goto destroy;
+		}
+	}
+
 	drm->mode_config.fb_base = (resource_size_t)bo->paddr;
 	info->screen_base = (void __iomem *)bo->vaddr + offset;
 	info->screen_size = size;
diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
index c1e4e8b6e5ca..2912e61a2599 100644
--- a/drivers/gpu/drm/tegra/gem.c
+++ b/drivers/gpu/drm/tegra/gem.c
@@ -14,8 +14,10 @@
  */
 
 #include <linux/dma-buf.h>
+#include <linux/iommu.h>
 #include <drm/tegra_drm.h>
 
+#include "drm.h"
 #include "gem.h"
 
 static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo)
@@ -90,14 +92,144 @@ static const struct host1x_bo_ops tegra_bo_ops = {
 	.kunmap = tegra_bo_kunmap,
 };
 
+static int iommu_map_sg(struct iommu_domain *domain, struct sg_table *sgt,
+			dma_addr_t iova, int prot)
+{
+	unsigned long offset = 0;
+	struct scatterlist *sg;
+	unsigned int i, j;
+	int err;
+
+	for_each_sg(sgt->sgl, sg, sgt->nents, i) {
+		dma_addr_t phys = sg_phys(sg);
+		size_t length = sg->offset;
+
+		phys = sg_phys(sg) - sg->offset;
+		length = sg->length + sg->offset;
+
+		err = iommu_map(domain, iova + offset, phys, length, prot);
+		if (err < 0)
+			goto unmap;
+
+		offset += length;
+	}
+
+	return 0;
+
+unmap:
+	offset = 0;
+
+	for_each_sg(sgt->sgl, sg, i, j) {
+		size_t length = sg->length + sg->offset;
+		iommu_unmap(domain, iova + offset, length);
+		offset += length;
+	}
+
+	return err;
+}
+
+static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
+			  dma_addr_t iova)
+{
+	unsigned long offset = 0;
+	struct scatterlist *sg;
+	unsigned int i;
+
+	for_each_sg(sgt->sgl, sg, sgt->nents, i) {
+		dma_addr_t phys = sg_phys(sg);
+		size_t length = sg->offset;
+
+		phys = sg_phys(sg) - sg->offset;
+		length = sg->length + sg->offset;
+
+		iommu_unmap(domain, iova + offset, length);
+		offset += length;
+	}
+
+	return 0;
+}
+
+static int tegra_bo_iommu_map(struct tegra_drm *tegra, struct tegra_bo *bo)
+{
+	int prot = IOMMU_READ | IOMMU_WRITE;
+	int err;
+
+	if (bo->mm)
+		return -EBUSY;
+
+	bo->mm = kzalloc(sizeof(*bo->mm), GFP_KERNEL);
+	if (!bo->mm)
+		return -ENOMEM;
+
+	err = drm_mm_insert_node_generic(&tegra->mm, bo->mm, bo->gem.size,
+					 PAGE_SIZE, 0, 0, 0);
+	if (err < 0) {
+		dev_err(tegra->drm->dev, "out of virtual memory: %d\n", err);
+		return err;
+	}
+
+	bo->paddr = bo->mm->start;
+
+	err = iommu_map_sg(tegra->domain, bo->sgt, bo->paddr, prot);
+	if (err < 0) {
+		dev_err(tegra->drm->dev, "failed to map buffer: %d\n", err);
+		return err;
+	}
+
+	return 0;
+}
+
+static int tegra_bo_iommu_unmap(struct tegra_drm *tegra, struct tegra_bo *bo)
+{
+	if (!bo->mm)
+		return 0;
+
+	iommu_unmap_sg(tegra->domain, bo->sgt, bo->paddr);
+	drm_mm_remove_node(bo->mm);
+
+	kfree(bo->mm);
+	return 0;
+}
+
 static void tegra_bo_destroy(struct drm_device *drm, struct tegra_bo *bo)
 {
-	dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr, bo->paddr);
+	if (!bo->pages)
+		dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr,
+				      bo->paddr);
+	else
+		drm_gem_put_pages(&bo->gem, bo->pages, true, true);
+}
+
+static int tegra_bo_get_pages(struct drm_device *drm, struct tegra_bo *bo,
+			      size_t size)
+{
+	bo->pages = drm_gem_get_pages(&bo->gem, GFP_KERNEL);
+	if (!bo->pages)
+		return -ENOMEM;
+
+	bo->num_pages = size >> PAGE_SHIFT;
+
+	return 0;
+}
+
+static int tegra_bo_alloc(struct drm_device *drm, struct tegra_bo *bo,
+			  size_t size)
+{
+	bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
+					   GFP_KERNEL | __GFP_NOWARN);
+	if (!bo->vaddr) {
+		dev_err(drm->dev, "failed to allocate buffer of size %zu\n",
+			size);
+		return -ENOMEM;
+	}
+
+	return 0;
 }
 
 struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
 				 unsigned long flags)
 {
+	struct tegra_drm *tegra = drm->dev_private;
 	struct tegra_bo *bo;
 	int err;
 
@@ -108,22 +240,33 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
 	host1x_bo_init(&bo->base, &tegra_bo_ops);
 	size = round_up(size, PAGE_SIZE);
 
-	bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
-					   GFP_KERNEL | __GFP_NOWARN);
-	if (!bo->vaddr) {
-		dev_err(drm->dev, "failed to allocate buffer with size %u\n",
-			size);
-		err = -ENOMEM;
-		goto err_dma;
-	}
-
 	err = drm_gem_object_init(drm, &bo->gem, size);
 	if (err)
-		goto err_init;
+		goto free;
 
 	err = drm_gem_create_mmap_offset(&bo->gem);
 	if (err)
-		goto err_mmap;
+		goto release;
+
+	if (tegra->domain) {
+		err = tegra_bo_get_pages(drm, bo, size);
+		if (err < 0)
+			goto release;
+
+		bo->sgt = drm_prime_pages_to_sg(bo->pages, bo->num_pages);
+		if (IS_ERR(bo->sgt)) {
+			err = PTR_ERR(bo->sgt);
+			goto release;
+		}
+
+		err = tegra_bo_iommu_map(tegra, bo);
+		if (err < 0)
+			goto release;
+	} else {
+		err = tegra_bo_alloc(drm, bo, size);
+		if (err < 0)
+			goto release;
+	}
 
 	if (flags & DRM_TEGRA_GEM_CREATE_TILED)
 		bo->tiling.mode = TEGRA_BO_TILING_MODE_TILED;
@@ -133,11 +276,10 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
 
 	return bo;
 
-err_mmap:
+release:
 	drm_gem_object_release(&bo->gem);
-err_init:
 	tegra_bo_destroy(drm, bo);
-err_dma:
+free:
 	kfree(bo);
 
 	return ERR_PTR(err);
@@ -172,6 +314,7 @@ err:
 static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
 					struct dma_buf *buf)
 {
+	struct tegra_drm *tegra = drm->dev_private;
 	struct dma_buf_attachment *attach;
 	struct tegra_bo *bo;
 	ssize_t size;
@@ -211,12 +354,19 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
 		goto detach;
 	}
 
-	if (bo->sgt->nents > 1) {
-		err = -EINVAL;
-		goto detach;
+	if (tegra->domain) {
+		err = tegra_bo_iommu_map(tegra, bo);
+		if (err < 0)
+			goto detach;
+	} else {
+		if (bo->sgt->nents > 1) {
+			err = -EINVAL;
+			goto detach;
+		}
+
+		bo->paddr = sg_dma_address(bo->sgt->sgl);
 	}
 
-	bo->paddr = sg_dma_address(bo->sgt->sgl);
 	bo->gem.import_attach = attach;
 
 	return bo;
@@ -239,8 +389,12 @@ free:
 
 void tegra_bo_free_object(struct drm_gem_object *gem)
 {
+	struct tegra_drm *tegra = gem->dev->dev_private;
 	struct tegra_bo *bo = to_tegra_bo(gem);
 
+	if (tegra->domain)
+		tegra_bo_iommu_unmap(tegra, bo);
+
 	if (gem->import_attach) {
 		dma_buf_unmap_attachment(gem->import_attach, bo->sgt,
 					 DMA_TO_DEVICE);
@@ -301,7 +455,38 @@ int tegra_bo_dumb_map_offset(struct drm_file *file, struct drm_device *drm,
 	return 0;
 }
 
+static int tegra_bo_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+{
+	struct drm_gem_object *gem = vma->vm_private_data;
+	struct tegra_bo *bo = to_tegra_bo(gem);
+	struct page *page;
+	pgoff_t offset;
+	int err;
+
+	if (!bo->pages)
+		return VM_FAULT_SIGBUS;
+
+	offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >> PAGE_SHIFT;
+	page = bo->pages[offset];
+
+	err = vm_insert_page(vma, (unsigned long)vmf->virtual_address, page);
+	switch (err) {
+	case -EAGAIN:
+	case 0:
+	case -ERESTARTSYS:
+	case -EINTR:
+	case -EBUSY:
+		return VM_FAULT_NOPAGE;
+
+	case -ENOMEM:
+		return VM_FAULT_OOM;
+	}
+
+	return VM_FAULT_SIGBUS;
+}
+
 const struct vm_operations_struct tegra_bo_vm_ops = {
+	.fault = tegra_bo_fault,
 	.open = drm_gem_vm_open,
 	.close = drm_gem_vm_close,
 };
@@ -316,13 +501,18 @@ int tegra_drm_mmap(struct file *file, struct vm_area_struct *vma)
 	if (ret)
 		return ret;
 
+	vma->vm_flags |= VM_MIXEDMAP;
+	vma->vm_flags &= ~VM_PFNMAP;
+
 	gem = vma->vm_private_data;
 	bo = to_tegra_bo(gem);
 
-	ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
-			      vma->vm_end - vma->vm_start, vma->vm_page_prot);
-	if (ret)
-		drm_gem_vm_close(vma);
+	if (!bo->pages) {
+		ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
+				      vma->vm_end - vma->vm_start, vma->vm_page_prot);
+		if (ret)
+			drm_gem_vm_close(vma);
+	}
 
 	return ret;
 }
diff --git a/drivers/gpu/drm/tegra/gem.h b/drivers/gpu/drm/tegra/gem.h
index 43a25c853357..c2e3f43e4b3f 100644
--- a/drivers/gpu/drm/tegra/gem.h
+++ b/drivers/gpu/drm/tegra/gem.h
@@ -37,6 +37,10 @@ struct tegra_bo {
 	dma_addr_t paddr;
 	void *vaddr;
 
+	struct drm_mm_node *mm;
+	unsigned long num_pages;
+	struct page **pages;
+
 	struct tegra_bo_tiling tiling;
 };
 
-- 
2.0.0
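
For readers following the gem.c changes, the core idea of iommu_map_sg()
above is to place each physically contiguous chunk back to back in one
contiguous I/O virtual range, and to undo the chunks already mapped if a
later one fails. The following is only a userspace sketch of that
pattern, not kernel code: toy_domain, toy_chunk, toy_map(), toy_unmap()
and toy_map_chunks() are hypothetical stand-ins invented for
illustration, and the "page table" is a plain array.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
#define TOY_PAGE_SIZE 4096u
#define TOY_NUM_PAGES 16u

struct toy_domain {
	uint64_t pte[TOY_NUM_PAGES];	/* physical address per IOVA page, 0 = unmapped */
};

struct toy_chunk {
	uint64_t phys;		/* page-aligned, non-zero physical address */
	size_t length;		/* multiple of TOY_PAGE_SIZE */
};

/* Map one physically contiguous chunk at the given I/O virtual address. */
static int toy_map(struct toy_domain *d, uint64_t iova, uint64_t phys,
		   size_t length)
{
	size_t first = iova / TOY_PAGE_SIZE;
	size_t count = length / TOY_PAGE_SIZE;
	size_t i;

	if (iova % TOY_PAGE_SIZE || phys % TOY_PAGE_SIZE ||
	    length % TOY_PAGE_SIZE)
		return -EINVAL;

	if (first > TOY_NUM_PAGES || count > TOY_NUM_PAGES - first)
		return -ENOSPC;	/* range does not fit the toy IOVA space */

	for (i = 0; i < count; i++)
		d->pte[first + i] = phys + i * TOY_PAGE_SIZE;

	return 0;
}

/* Clear the entries covering [iova, iova + length). */
static void toy_unmap(struct toy_domain *d, uint64_t iova, size_t length)
{
	size_t first = iova / TOY_PAGE_SIZE;
	size_t count = length / TOY_PAGE_SIZE;
	size_t i;

	for (i = 0; i < count && first + i < TOY_NUM_PAGES; i++)
		d->pte[first + i] = 0;
}

/* Map all chunks back to back starting at iova; roll back on failure. */
static int toy_map_chunks(struct toy_domain *d, const struct toy_chunk *chunks,
			  size_t num, uint64_t iova)
{
	uint64_t offset = 0;
	size_t i, j;
	int err;

	for (i = 0; i < num; i++) {
		err = toy_map(d, iova + offset, chunks[i].phys,
			      chunks[i].length);
		if (err < 0)
			goto unmap;

		offset += chunks[i].length;
	}

	return 0;

unmap:
	/* only the first i chunks were mapped, so only those are undone */
	offset = 0;

	for (j = 0; j < i; j++) {
		toy_unmap(d, iova + offset, chunks[j].length);
		offset += chunks[j].length;
	}

	return err;
}
```

The rollback walks the same chunks with the same running offset as the
forward pass, which is what keeps the unmap ranges in step with what was
actually mapped.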


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC 10/10] mmc: sdhci-tegra: Add IOMMU support
  2014-06-26 20:49 ` Thierry Reding
  (?)
@ 2014-06-26 20:49     ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-26 20:49 UTC (permalink / raw)
  To: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel
  Cc: Olav Haugan, devicetree, Grant Grundler, Rhyland Klein, iommu,
	linux-kernel, Marc Zyngier, Allen Martin, Paul Walmsley,
	linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel

From: Thierry Reding <treding@nvidia.com>

Attach to the device's master interface of the IOMMU at .probe() time.
IOMMU support becomes available via the DMA mapping API interoperation
code, but this explicit attachment is necessary to ensure proper probe
order.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/mmc/host/sdhci-tegra.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/mmc/host/sdhci-tegra.c b/drivers/mmc/host/sdhci-tegra.c
index 33100d10d176..b884614fa4e6 100644
--- a/drivers/mmc/host/sdhci-tegra.c
+++ b/drivers/mmc/host/sdhci-tegra.c
@@ -15,6 +15,7 @@
 #include <linux/err.h>
 #include <linux/module.h>
 #include <linux/init.h>
+#include <linux/iommu.h>
 #include <linux/platform_device.h>
 #include <linux/clk.h>
 #include <linux/io.h>
@@ -237,6 +238,11 @@ static int sdhci_tegra_probe(struct platform_device *pdev)
 	match = of_match_device(sdhci_tegra_dt_match, &pdev->dev);
 	if (!match)
 		return -EINVAL;
+
+	rc = iommu_attach(&pdev->dev);
+	if (rc < 0)
+		return rc;
+
 	soc_data = match->data;
 
 	host = sdhci_pltfm_init(pdev, soc_data->pdata, 0);
@@ -310,6 +316,8 @@ static int sdhci_tegra_remove(struct platform_device *pdev)
 	clk_disable_unprepare(pltfm_host->clk);
 	clk_put(pltfm_host->clk);
 
+	iommu_detach(&pdev->dev);
+
 	sdhci_pltfm_free(pdev);
 
 	return 0;
-- 
2.0.0
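
The probe-order point in the commit message can be illustrated with a
small userspace sketch. It assumes, as the registry patch in this series
does, that attaching fails with a "defer" error while the referenced
IOMMU has not registered yet, so that the client is probed again later.
Everything here is hypothetical: toy_iommu_register(), toy_client and
toy_client_probe() are invented names, and TOY_EPROBE_DEFER merely
mirrors the numeric value of the kernel's EPROBE_DEFER.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; these are not kernel interfaces. */
#define TOY_EPROBE_DEFER 517	/* same value as the kernel's EPROBE_DEFER */
#define TOY_MAX_IOMMUS 4

static const char *toy_registry[TOY_MAX_IOMMUS];
static size_t toy_registered;

struct toy_client {
	const char *iommu_name;	/* IOMMU this client references, or NULL */
	int attached;
};

/* Called by an IOMMU "driver" once it is ready to serve clients. */
static int toy_iommu_register(const char *name)
{
	if (toy_registered == TOY_MAX_IOMMUS)
		return -1;

	toy_registry[toy_registered++] = name;
	return 0;
}

/*
 * Called from a client's probe routine. If the referenced IOMMU is not
 * registered yet, ask to be probed again later instead of failing hard.
 */
static int toy_client_probe(struct toy_client *client)
{
	size_t i;

	if (!client->iommu_name)
		return 0;	/* nothing to attach to */

	for (i = 0; i < toy_registered; i++) {
		if (strcmp(toy_registry[i], client->iommu_name) == 0) {
			client->attached = 1;
			return 0;
		}
	}

	return -TOY_EPROBE_DEFER;
}
```

In this model a client probed before its IOMMU simply gets the defer
code and succeeds on a later attempt, which is the ordering guarantee
the explicit attachment is meant to provide.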

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* Re: [RFC 01/10] iommu: Add IOMMU device registry
  2014-06-26 20:49     ` Thierry Reding
  (?)
@ 2014-06-27  6:58         ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-27  6:58 UTC (permalink / raw)
  To: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel
  Cc: Olav Haugan, devicetree, Grant Grundler, Rhyland Klein, iommu,
	linux-kernel, Marc Zyngier, Allen Martin, Paul Walmsley,
	linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel


[-- Attachment #1.1: Type: text/plain, Size: 3734 bytes --]

On Thu, Jun 26, 2014 at 10:49:41PM +0200, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> Add an IOMMU device registry for drivers to register with and implement
> a method for users of the IOMMU API to attach to an IOMMU device. This
> allows to support deferred probing and gives the IOMMU API a convenient
> hook to perform early initialization of a device if necessary.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/iommu/iommu.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/iommu.h | 27 +++++++++++++++
>  2 files changed, 120 insertions(+)

I thought that perhaps I should elaborate on this a bit since I have a
few ideas on how the API could be enhanced.

> +static int of_iommu_attach(struct device *dev)
> +{
> +	struct of_phandle_iter iter;
> +	struct iommu *iommu;
> +
> +	mutex_lock(&iommus_lock);
> +
> +	of_property_for_each_phandle_with_args(iter, dev->of_node, "iommus",
> +					       "#iommu-cells", 0) {
> +		bool found = false;
> +		int err;
> +
> +		/* skip disabled IOMMUs */
> +		if (!of_device_is_available(iter.out_args.np))
> +			continue;
> +
> +		list_for_each_entry(iommu, &iommus, list) {
> +			if (iommu->dev->of_node == iter.out_args.np) {
> +				err = iommu->ops->attach(iommu, dev);
> +				if (err < 0) {
> +				}
> +
> +				found = true;
> +			}
> +		}
> +
> +		if (!found) {
> +			mutex_unlock(&iommus_lock);
> +			return -EPROBE_DEFER;
> +		}
> +	}
> +
> +	mutex_unlock(&iommus_lock);
> +
> +	return 0;
> +}
> +
> +static int of_iommu_detach(struct device *dev)
> +{
> +	/* TODO: implement */
> +	return -ENOSYS;
> +}
> +
> +int iommu_attach(struct device *dev)
> +{
> +	int err = 0;
> +
> +	if (IS_ENABLED(CONFIG_OF) && dev->of_node) {
> +		err = of_iommu_attach(dev);
> +		if (!err)
> +			return 0;
> +	}
> +
> +	return err;
> +}
> +EXPORT_SYMBOL_GPL(iommu_attach);

I think it might make sense to introduce an explicit object for an IOMMU
master attachment. Maybe something like:

	struct iommu_master {
		struct iommu *iommu;
		struct device *dev;

		...
	};

iommu_attach() could then return a pointer to that attachment and the
IOMMU user driver could subsequently use that as a handle to access
other parts of the API.

The reason is that if we ever need to support more than a single master
interface (and perhaps even multiple master interfaces on different
IOMMUs) for a single device, then we need a way for the IOMMU user to
differentiate between its master interfaces.
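
As a rough illustration of what I have in mind, here is a minimal
user-space sketch. It only models the handle idea: the stand-in types
carry just the fields used here, and iommu_attach_master(),
iommu_detach_master() and the index field are hypothetical names, not
part of the patch.

```c
#include <stdlib.h>

/* Stand-ins for the kernel types; only the fields used here. */
struct device {
	const char *name;
};

struct iommu {
	struct device *dev;
};

/* Attachment object, following the struct proposed above. */
struct iommu_master {
	struct iommu *iommu;
	struct device *dev;
	unsigned int index;	/* assumed: which master interface of dev */
};

/*
 * Hypothetical variant of iommu_attach(): instead of an error code it
 * returns a handle describing one master interface, or NULL on failure.
 */
static struct iommu_master *iommu_attach_master(struct iommu *iommu,
						struct device *dev,
						unsigned int index)
{
	struct iommu_master *master = malloc(sizeof(*master));

	if (!master)
		return NULL;

	master->iommu = iommu;
	master->dev = dev;
	master->index = index;

	return master;
}

/* Releases a handle obtained from iommu_attach_master(). */
static void iommu_detach_master(struct iommu_master *master)
{
	free(master);
}
```

A device with two master interfaces would call iommu_attach_master()
twice and get two distinct handles back, so later API calls can tell
the interfaces apart even if they sit behind different IOMMUs.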

> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 284a4683fdc1..ac2ceef194d4 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -43,6 +43,17 @@ struct notifier_block;
>  typedef int (*iommu_fault_handler_t)(struct iommu_domain *,
>  			struct device *, unsigned long, int, void *);
>  
> +struct iommu {
> +	struct device *dev;
> +
> +	struct list_head list;
> +
> +	const struct iommu_ops *ops;
> +};

For reasons explained above, I also think that it would be a good idea
to modify the iommu_ops functions to take a struct iommu * as their
first argument. This may become important when one driver needs to
support multiple IOMMU devices. With the current API drivers have to
rely on global variables to track the driver-specific context. As far as
I can tell, only .domain_init(), .add_device(), .remove_device() and
.device_group() would need the extra argument. .domain_init() could set
up a pointer to struct iommu in struct iommu_domain so the functions
dealing with domains could gain access to the IOMMU device via that
pointer.
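
A minimal user-space sketch of that idea, under the assumption that the
driver embeds struct iommu in its private context. The ops table is cut
down to a single callback, and struct example_smmu, domain_to_smmu() and
the iommu back-pointer in struct iommu_domain are hypothetical:

```c
#include <stddef.h>

/* User-space stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct iommu;
struct iommu_domain;

/* Cut-down ops table: the callback gets the IOMMU instance first. */
struct iommu_ops {
	int (*domain_init)(struct iommu *iommu, struct iommu_domain *domain);
};

struct iommu {
	const struct iommu_ops *ops;
};

struct iommu_domain {
	struct iommu *iommu;	/* assumed back-pointer set by .domain_init() */
};

/* Driver-private context embedding the generic object. */
struct example_smmu {
	struct iommu iommu;
	unsigned int num_domains;
};

static int example_smmu_domain_init(struct iommu *iommu,
				    struct iommu_domain *domain)
{
	struct example_smmu *smmu;

	/* Recover the per-instance context from the first argument. */
	smmu = container_of(iommu, struct example_smmu, iommu);

	domain->iommu = iommu;
	smmu->num_domains++;

	return 0;
}

static const struct iommu_ops example_smmu_ops = {
	.domain_init = example_smmu_domain_init,
};

/* Domain-level code reaches the instance without any global variable. */
static struct example_smmu *domain_to_smmu(struct iommu_domain *domain)
{
	return container_of(domain->iommu, struct example_smmu, iommu);
}
```

With two example_smmu instances sharing the same ops table, each domain
ends up pointing at the instance that initialized it.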

Thierry

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 04/10] memory: Add Tegra124 memory controller support
  2014-06-26 20:49     ` Thierry Reding
  (?)
@ 2014-06-27  7:41         ` Joseph Lo
  -1 siblings, 0 replies; 133+ messages in thread
From: Joseph Lo @ 2014-06-27  7:41 UTC (permalink / raw)
  To: Thierry Reding, Rob Herring, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, Stephen Warren, Arnd Bergmann,
	Will Deacon, Joerg Roedel
  Cc: Olav Haugan, devicetree, Grant Grundler, Rhyland Klein, iommu,
	linux-kernel, Marc Zyngier, Allen Martin, Paul Walmsley,
	linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel

Hi Thierry,

On 06/27/2014 04:49 AM, Thierry Reding wrote:
[snip]
> +
> +#define MC_INTSTATUS 0x000
> +#define  MC_INT_DECERR_MTS (1 << 16)
> +#define  MC_INT_SECERR_SEC (1 << 13)
> +#define  MC_INT_DECERR_VPR (1 << 12)
> +#define  MC_INT_INVALID_APB_ASID_UPDATE (1 << 11)
> +#define  MC_INT_INVALID_SMMU_PAGE (1 << 10)
> +#define  MC_INT_ARBITRATION_EMEM (1 << 9)
> +#define  MC_INT_SECURITY_VIOLATION (1 << 8)
> +#define  MC_INT_DECERR_EMEM (1 << 6)
> +#define MC_INTMASK 0x004
> +#define MC_ERR_STATUS 0x08
> +#define MC_ERR_ADR 0x0c
> +
[snip]
> +
> +#define SMMU_PDE_ATTR          (SMMU_PDE_READABLE | SMMU_PDE_WRITABLE | \
> +                                SMMU_PDE_NONSECURE)
> +#define SMMU_PTE_ATTR          (SMMU_PTE_READABLE | SMMU_PTE_WRITABLE | \
> +                                SMMU_PTE_NONSECURE)
> +
> +#define SMMU_PDE_VACANT(n)     (((n) << 10) | SMMU_PDE_ATTR)
> +#define SMMU_PTE_VACANT(n)     (((n) << 12) | SMMU_PTE_ATTR)

There is an ISR to catch invalid SMMU translations. Do you want to
change the identity mapping with read/write attributes that is set up for
the unused SMMU pages?

That would make sure we capture invalid SMMU translations, and it would
help drivers catch issues when using the SMMU.

-joseph

> +static irqreturn_t tegra124_mc_irq(int irq, void *data)
> +{
> +       struct tegra_mc *mc = data;
> +       u32 value, status, mask;
> +
> +       /* mask all interrupts to avoid flooding */
> +       mask = mc_readl(mc, MC_INTMASK);
> +       mc_writel(mc, 0, MC_INTMASK);
> +
> +       status = mc_readl(mc, MC_INTSTATUS);
> +       mc_writel(mc, status, MC_INTSTATUS);
> +
> +       dev_dbg(mc->dev, "INTSTATUS: %08x\n", status);
> +
> +       if (status & MC_INT_DECERR_MTS)
> +               dev_dbg(mc->dev, "  DECERR_MTS\n");
> +
> +       if (status & MC_INT_SECERR_SEC)
> +               dev_dbg(mc->dev, "  SECERR_SEC\n");
> +
> +       if (status & MC_INT_DECERR_VPR)
> +               dev_dbg(mc->dev, "  DECERR_VPR\n");
> +
> +       if (status & MC_INT_INVALID_APB_ASID_UPDATE)
> +               dev_dbg(mc->dev, "  INVALID_APB_ASID_UPDATE\n");
> +
> +       if (status & MC_INT_INVALID_SMMU_PAGE)
> +               dev_dbg(mc->dev, "  INVALID_SMMU_PAGE\n");
> +
> +       if (status & MC_INT_ARBITRATION_EMEM)
> +               dev_dbg(mc->dev, "  ARBITRATION_EMEM\n");
> +
> +       if (status & MC_INT_SECURITY_VIOLATION)
> +               dev_dbg(mc->dev, "  SECURITY_VIOLATION\n");
> +
> +       if (status & MC_INT_DECERR_EMEM)
> +               dev_dbg(mc->dev, "  DECERR_EMEM\n");
> +
> +       value = mc_readl(mc, MC_ERR_STATUS);
> +
> +       dev_dbg(mc->dev, "ERR_STATUS: %08x\n", value);
> +       dev_dbg(mc->dev, "  type: %x\n", (value >> 28) & 0x7);
> +       dev_dbg(mc->dev, "  protection: %x\n", (value >> 25) & 0x7);
> +       dev_dbg(mc->dev, "  adr_hi: %x\n", (value >> 20) & 0x3);
> +       dev_dbg(mc->dev, "  swap: %x\n", (value >> 18) & 0x1);
> +       dev_dbg(mc->dev, "  security: %x\n", (value >> 17) & 0x1);
> +       dev_dbg(mc->dev, "  r/w: %x\n", (value >> 16) & 0x1);
> +       dev_dbg(mc->dev, "  adr1: %x\n", (value >> 12) & 0x7);
> +       dev_dbg(mc->dev, "  client: %x\n", value & 0x7f);
> +
> +       value = mc_readl(mc, MC_ERR_ADR);
> +       dev_dbg(mc->dev, "ERR_ADR: %08x\n", value);
> +
> +       mc_writel(mc, mask, MC_INTMASK);
> +
> +       return IRQ_HANDLED;
> +}
> +
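
The ERR_STATUS decoding in the quoted handler can also be tried outside
the kernel. This is a minimal sketch, assuming the field positions are
exactly the shifts and masks used by the dev_dbg() calls above; struct
mc_err_status and mc_decode_err_status() are made-up names for
illustration.

```c
#include <stdint.h>

/* Fields as extracted by the dev_dbg() calls in the quoted handler. */
struct mc_err_status {
	unsigned int type;
	unsigned int protection;
	unsigned int adr_hi;
	unsigned int swap;
	unsigned int security;
	unsigned int rw;	/* polarity is not stated in the patch */
	unsigned int adr1;
	unsigned int client;
};

/* Split a raw MC_ERR_STATUS value into its fields. */
static struct mc_err_status mc_decode_err_status(uint32_t value)
{
	struct mc_err_status s = {
		.type       = (value >> 28) & 0x7,
		.protection = (value >> 25) & 0x7,
		.adr_hi     = (value >> 20) & 0x3,
		.swap       = (value >> 18) & 0x1,
		.security   = (value >> 17) & 0x1,
		.rw         = (value >> 16) & 0x1,
		.adr1       = (value >> 12) & 0x7,
		.client     = value & 0x7f,
	};

	return s;
}
```

Keeping the decoding in one helper like this would also make it easier
to print a single line per fault instead of one dev_dbg() per field.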

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 04/10] memory: Add Tegra124 memory controller support
  2014-06-27  7:41         ` Joseph Lo
  (?)
@ 2014-06-27  8:17             ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-27  8:17 UTC (permalink / raw)
  To: Joseph Lo
  Cc: Mark Rutland, Will Deacon, Paul Walmsley, Pawel Moll,
	Ian Campbell, Marc Zyngier, Dave Martin, Olav Haugan, devicetree,
	Arnd Bergmann, Stephen Warren, Grant Grundler, Allen Martin,
	Rob Herring, linux-tegra, Cho KyongHo, linux-arm-kernel,
	linux-kernel, iommu, Kumar Gala


[-- Attachment #1.1: Type: text/plain, Size: 1851 bytes --]

On Fri, Jun 27, 2014 at 03:41:20PM +0800, Joseph Lo wrote:
> Hi Thierry,
> 
> On 06/27/2014 04:49 AM, Thierry Reding wrote:
> [snip]
> >+
> >+#define MC_INTSTATUS 0x000
> >+#define  MC_INT_DECERR_MTS (1 << 16)
> >+#define  MC_INT_SECERR_SEC (1 << 13)
> >+#define  MC_INT_DECERR_VPR (1 << 12)
> >+#define  MC_INT_INVALID_APB_ASID_UPDATE (1 << 11)
> >+#define  MC_INT_INVALID_SMMU_PAGE (1 << 10)
> >+#define  MC_INT_ARBITRATION_EMEM (1 << 9)
> >+#define  MC_INT_SECURITY_VIOLATION (1 << 8)
> >+#define  MC_INT_DECERR_EMEM (1 << 6)
> >+#define MC_INTMASK 0x004
> >+#define MC_ERR_STATUS 0x08
> >+#define MC_ERR_ADR 0x0c
> >+
> [snip]
> >+
> >+#define SMMU_PDE_ATTR          (SMMU_PDE_READABLE | SMMU_PDE_WRITABLE | \
> >+                                SMMU_PDE_NONSECURE)
> >+#define SMMU_PTE_ATTR          (SMMU_PTE_READABLE | SMMU_PTE_WRITABLE | \
> >+                                SMMU_PTE_NONSECURE)
> >+
> >+#define SMMU_PDE_VACANT(n)     (((n) << 10) | SMMU_PDE_ATTR)
> >+#define SMMU_PTE_VACANT(n)     (((n) << 12) | SMMU_PTE_ATTR)
> 
> There is an ISR to catch the invalid SMMU translation. Do you want to modify
> the identity mapping with read/write attribute of the unused SMMU pages?

I'm not sure I understand what you mean by "identity mapping". None of
the public documentation seems to describe the exact layout of PDEs or
PTEs, so it's somewhat hard to tell what to set them to when pages are
unmapped.

> This can make sure we capture the invalid SMMU translation. And helps for
> driver to capture issues when using SMMU.

That certainly sounds like a useful thing to have. Like I said this is
an RFC and I'm not even sure if it's acceptable in the current form, so
I wanted to get feedback early on to avoid wasting effort on something
that turns out to be a wild-goose chase.

Thierry

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 04/10] memory: Add Tegra124 memory controller support
  2014-06-27  8:17             ` Thierry Reding
@ 2014-06-27  8:24               ` Hiroshi Doyu
  -1 siblings, 0 replies; 133+ messages in thread
From: Hiroshi Doyu @ 2014-06-27  8:24 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Mark Rutland, Will Deacon, Paul Walmsley, Joseph Lo, Pawel Moll,
	Stephen Warren, Marc Zyngier, Dave Martin, Olav Haugan, devicetree,
	Arnd Bergmann, Ian Campbell, Grant Grundler, Allen Martin,
	Rob Herring, linux-tegra, Cho KyongHo, linux-arm-kernel,
	linux-kernel, iommu


Thierry Reding <thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> On Fri, Jun 27, 2014 at 03:41:20PM +0800, Joseph Lo wrote:
>> Hi Thierry,
>> 
>> On 06/27/2014 04:49 AM, Thierry Reding wrote:
>> [snip]
>> >+
>> >+#define MC_INTSTATUS 0x000
>> >+#define  MC_INT_DECERR_MTS (1 << 16)
>> >+#define  MC_INT_SECERR_SEC (1 << 13)
>> >+#define  MC_INT_DECERR_VPR (1 << 12)
>> >+#define  MC_INT_INVALID_APB_ASID_UPDATE (1 << 11)
>> >+#define  MC_INT_INVALID_SMMU_PAGE (1 << 10)
>> >+#define  MC_INT_ARBITRATION_EMEM (1 << 9)
>> >+#define  MC_INT_SECURITY_VIOLATION (1 << 8)
>> >+#define  MC_INT_DECERR_EMEM (1 << 6)
>> >+#define MC_INTMASK 0x004
>> >+#define MC_ERR_STATUS 0x08
>> >+#define MC_ERR_ADR 0x0c
>> >+
>> [snip]
>> >+
>> >+#define SMMU_PDE_ATTR          (SMMU_PDE_READABLE | SMMU_PDE_WRITABLE | \
>> >+                                SMMU_PDE_NONSECURE)
>> >+#define SMMU_PTE_ATTR          (SMMU_PTE_READABLE | SMMU_PTE_WRITABLE | \
>> >+                                SMMU_PTE_NONSECURE)
>> >+
>> >+#define SMMU_PDE_VACANT(n)     (((n) << 10) | SMMU_PDE_ATTR)
>> >+#define SMMU_PTE_VACANT(n)     (((n) << 12) | SMMU_PTE_ATTR)

They should be set to 0.

The VACANT macros above are legacy support for a special case in which a
device wanted a linear SMMU mapping where iova == phys. They are no
longer needed.
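
To illustrate the difference, here is a minimal userspace sketch. The
entry layout (attribute bits in the top three bits, frame number below
them), the single-level table, and every name in it are assumptions made
for illustration; the thread does not describe the real PDE/PTE layout.
It only shows why an identity-filled "vacant" entry hides a stray access
that a zero entry would turn into a fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical entry layout, for illustration only. */
#define SK_PTE_READABLE  (UINT32_C(1) << 31)
#define SK_PTE_WRITABLE  (UINT32_C(1) << 30)
#define SK_PTE_NONSECURE (UINT32_C(1) << 29)
#define SK_PTE_ATTR      (SK_PTE_READABLE | SK_PTE_WRITABLE | SK_PTE_NONSECURE)
#define SK_PTE_PFN_MASK  (~SK_PTE_ATTR)
#define SK_PAGE_SHIFT    12

/* Legacy-style "vacant" entry: page n points at frame n (iova == phys). */
static uint32_t sk_pte_identity(uint32_t pfn)
{
    return (pfn & SK_PTE_PFN_MASK) | SK_PTE_ATTR;
}

/* Look up iova in a single-level table; return false if it would fault. */
static bool sk_translate(const uint32_t *table, uint32_t iova, uint32_t *phys)
{
    uint32_t pte = table[iova >> SK_PAGE_SHIFT];

    /* A zero entry grants no access, so the access is reported. */
    if (!(pte & (SK_PTE_READABLE | SK_PTE_WRITABLE)))
        return false;

    *phys = ((pte & SK_PTE_PFN_MASK) << SK_PAGE_SHIFT) |
            (iova & ((UINT32_C(1) << SK_PAGE_SHIFT) - 1));
    return true;
}
```

With a zeroed table, sk_translate() reports a fault for any address,
which is the case the ISR is meant to catch. Once an entry is filled with
sk_pte_identity(), the same access silently succeeds.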

>> There is an ISR to catch the invalid SMMU translation. Do you want to modify
>> the identity mapping with read/write attribute of the unused SMMU pages?
>
> I'm not sure I understand what you mean by "identity mapping". None of
> the public documentation seems to describe the exact layout of PDEs or
> PTEs, so it's somewhat hard to tell what to set them to when pages are
> unmapped.
>
>> This can make sure we capture the invalid SMMU translation. And helps for
>> driver to capture issues when using SMMU.
>
> That certainly sounds like a useful thing to have. Like I said this is
> an RFC and I'm not even sure if it's acceptable in the current form, so
> I wanted to get feedback early on to avoid wasting effort on something
> that turns out to be a wild-goose chase.
>
> Thierry

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 10/10] mmc: sdhci-tegra: Add IOMMU support
  2014-06-26 20:49     ` Thierry Reding
  (?)
@ 2014-06-27  9:46         ` Hiroshi DOyu
  -1 siblings, 0 replies; 133+ messages in thread
From: Hiroshi DOyu @ 2014-06-27  9:46 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel,
	Cho KyongHo, Grant Grundler, Dave Martin, Marc Zyngier,
	Hiroshi Doyu, Olav Haugan, Paul Walmsley, Rhyland Klein,
	Allen Martin, devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r


Thierry Reding <thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> From: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>
> Attach to the device's master interface of the IOMMU at .probe() time.
> IOMMU support becomes available via the DMA mapping API interoperation
> code, but this explicit attachment is necessary to ensure proper probe
> order.
>
> Signed-off-by: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> ---
>  drivers/mmc/host/sdhci-tegra.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/mmc/host/sdhci-tegra.c b/drivers/mmc/host/sdhci-tegra.c
> index 33100d10d176..b884614fa4e6 100644
> --- a/drivers/mmc/host/sdhci-tegra.c
> +++ b/drivers/mmc/host/sdhci-tegra.c
> @@ -15,6 +15,7 @@
>  #include <linux/err.h>
>  #include <linux/module.h>
>  #include <linux/init.h>
> +#include <linux/iommu.h>
>  #include <linux/platform_device.h>
>  #include <linux/clk.h>
>  #include <linux/io.h>
> @@ -237,6 +238,11 @@ static int sdhci_tegra_probe(struct platform_device *pdev)
>  	match = of_match_device(sdhci_tegra_dt_match, &pdev->dev);
>  	if (!match)
>  		return -EINVAL;
> +
> +	rc = iommu_attach(&pdev->dev);
> +	if (rc < 0)
> +		return rc;
> +

I had thought that ->probe() should do only minimal H/W probing, so that
any DMA API call could be deferred until after ->probe(), to the point
where the device is actually in use, such as when its device node is
opened. To me this decision (minimal H/W probing) seemed logical, but it
would add a new restriction. One advantage is that we could still keep
every driver free of IOMMU code, as long as it doesn't call the DMA API
in ->probe().

>  	soc_data = match->data;
>  
>  	host = sdhci_pltfm_init(pdev, soc_data->pdata, 0);
> @@ -310,6 +316,8 @@ static int sdhci_tegra_remove(struct platform_device *pdev)
>  	clk_disable_unprepare(pltfm_host->clk);
>  	clk_put(pltfm_host->clk);
>  
> +	iommu_detach(&pdev->dev);
> +
>  	sdhci_pltfm_free(pdev);
>  
>  	return 0;

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 09/10] drm/tegra: Add IOMMU support
  2014-06-26 20:49     ` Thierry Reding
  (?)
@ 2014-06-27  9:46         ` Hiroshi DOyu
  -1 siblings, 0 replies; 133+ messages in thread
From: Hiroshi DOyu @ 2014-06-27  9:46 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Mark Rutland, Will Deacon, Paul Walmsley, Pawel Moll,
	Ian Campbell, Marc Zyngier, Dave Martin, Olav Haugan,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Arnd Bergmann, Stephen Warren,
	Grant Grundler, Allen Martin, Rob Herring,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA, Cho KyongHo,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Kumar Gala


Thierry Reding <thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> From: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>
> When an IOMMU device is available on the platform bus, allocate an IOMMU
> domain and attach the display controllers to it. The display controllers
> can then scan out non-contiguous buffers by mapping them through the
> IOMMU.
>
> Signed-off-by: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> ---
>  drivers/gpu/drm/tegra/dc.c  |  21 ++++
>  drivers/gpu/drm/tegra/drm.c |  17 ++++
>  drivers/gpu/drm/tegra/drm.h |   3 +
>  drivers/gpu/drm/tegra/fb.c  |  16 ++-
>  drivers/gpu/drm/tegra/gem.c | 236 +++++++++++++++++++++++++++++++++++++++-----
>  drivers/gpu/drm/tegra/gem.h |   4 +
>  6 files changed, 273 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
> index afcca04f5367..0f7452d04811 100644
> --- a/drivers/gpu/drm/tegra/dc.c
> +++ b/drivers/gpu/drm/tegra/dc.c
> @@ -9,6 +9,7 @@
>
>  #include <linux/clk.h>
>  #include <linux/debugfs.h>
> +#include <linux/iommu.h>
>  #include <linux/reset.h>
>
>  #include "dc.h"
> @@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
>  {
>         struct drm_device *drm = dev_get_drvdata(client->parent);
>         struct tegra_dc *dc = host1x_client_to_dc(client);
> +       struct tegra_drm *tegra = drm->dev_private;
>         int err;
>
> +       if (tegra->domain) {
> +               err = iommu_attach_device(tegra->domain, dc->dev);

I wanted to keep device drivers iommu-free with the following:

http://patchwork.ozlabs.org/patch/354074/


> +               if (err < 0) {
> +                       dev_err(dc->dev, "failed to attach to IOMMU: %d\n",
> +                               err);
> +                       return err;
> +               }
> +       }
> +
>         drm_crtc_init(drm, &dc->base, &tegra_crtc_funcs);
>         drm_mode_crtc_set_gamma_size(&dc->base, 256);
>         drm_crtc_helper_add(&dc->base, &tegra_crtc_helper_funcs);
> @@ -1318,7 +1329,9 @@ static int tegra_dc_init(struct host1x_client *client)
>
>  static int tegra_dc_exit(struct host1x_client *client)
>  {
> +       struct drm_device *drm = dev_get_drvdata(client->parent);
>         struct tegra_dc *dc = host1x_client_to_dc(client);
> +       struct tegra_drm *tegra = drm->dev_private;
>         int err;
>
>         devm_free_irq(dc->dev, dc->irq, dc);
> @@ -1335,6 +1348,8 @@ static int tegra_dc_exit(struct host1x_client *client)
>                 return err;
>         }
>
> +       iommu_detach_device(tegra->domain, dc->dev);
> +
>         return 0;
>  }
>
> @@ -1462,6 +1477,12 @@ static int tegra_dc_probe(struct platform_device *pdev)
>                 return -ENXIO;
>         }
>
> +       err = iommu_attach(&pdev->dev);
> +       if (err < 0) {
> +               dev_err(&pdev->dev, "failed to attach to IOMMU: %d\n", err);
> +               return err;
> +       }
> +
>         INIT_LIST_HEAD(&dc->client.list);
>         dc->client.ops = &dc_client_ops;
>         dc->client.dev = &pdev->dev;
> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
> index 59736bb810cd..1d2bbafad982 100644
> --- a/drivers/gpu/drm/tegra/drm.c
> +++ b/drivers/gpu/drm/tegra/drm.c
> @@ -8,6 +8,7 @@
>   */
>
>  #include <linux/host1x.h>
> +#include <linux/iommu.h>
>
>  #include "drm.h"
>  #include "gem.h"
> @@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>         if (!tegra)
>                 return -ENOMEM;
>
> +       if (iommu_present(&platform_bus_type)) {
> +               tegra->domain = iommu_domain_alloc(&platform_bus_type);

Can we use "dma_iommu_mapping" instead of domain?

I thought that the DMA API sits on top of the IOMMU API, so it may be
cleaner to use only the DMA API.


> +               if (IS_ERR(tegra->domain)) {
> +                       kfree(tegra);
> +                       return PTR_ERR(tegra->domain);
> +               }
> +
> +               drm_mm_init(&tegra->mm, 0, SZ_2G);
> +       }
> +
>         mutex_init(&tegra->clients_lock);
>         INIT_LIST_HEAD(&tegra->clients);
>         drm->dev_private = tegra;
> @@ -71,6 +82,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>  static int tegra_drm_unload(struct drm_device *drm)
>  {
>         struct host1x_device *device = to_host1x_device(drm->dev);
> +       struct tegra_drm *tegra = drm->dev_private;
>         int err;
>
>         drm_kms_helper_poll_fini(drm);
> @@ -82,6 +94,11 @@ static int tegra_drm_unload(struct drm_device *drm)
>         if (err < 0)
>                 return err;
>
> +       if (tegra->domain) {
> +               iommu_domain_free(tegra->domain);
> +               drm_mm_takedown(&tegra->mm);
> +       }
> +
>         return 0;
>  }
>
> diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
> index 96d754e7b3eb..a07c796b7edc 100644
> --- a/drivers/gpu/drm/tegra/drm.h
> +++ b/drivers/gpu/drm/tegra/drm.h
> @@ -39,6 +39,9 @@ struct tegra_fbdev {
>  struct tegra_drm {
>         struct drm_device *drm;
>
> +       struct iommu_domain *domain;
> +       struct drm_mm mm;
> +
>         struct mutex clients_lock;
>         struct list_head clients;
>
> diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c
> index 7790d43ad082..21c65dd817c3 100644
> --- a/drivers/gpu/drm/tegra/fb.c
> +++ b/drivers/gpu/drm/tegra/fb.c
> @@ -65,8 +65,12 @@ static void tegra_fb_destroy(struct drm_framebuffer *framebuffer)
>         for (i = 0; i < fb->num_planes; i++) {
>                 struct tegra_bo *bo = fb->planes[i];
>
> -               if (bo)
> +               if (bo) {
> +                       if (bo->pages && bo->virt)
> +                               vunmap(bo->virt);
> +
>                         drm_gem_object_unreference_unlocked(&bo->gem);
> +               }
>         }
>
>         drm_framebuffer_cleanup(framebuffer);
> @@ -252,6 +256,16 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
>         offset = info->var.xoffset * bytes_per_pixel +
>                  info->var.yoffset * fb->pitches[0];
>
> +       if (bo->pages) {
> +               bo->vaddr = vmap(bo->pages, bo->num_pages, VM_MAP,
> +                                pgprot_writecombine(PAGE_KERNEL));
> +               if (!bo->vaddr) {
> +                       dev_err(drm->dev, "failed to vmap() framebuffer\n");
> +                       err = -ENOMEM;
> +                       goto destroy;
> +               }
> +       }
> +
>         drm->mode_config.fb_base = (resource_size_t)bo->paddr;
>         info->screen_base = (void __iomem *)bo->vaddr + offset;
>         info->screen_size = size;
> diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
> index c1e4e8b6e5ca..2912e61a2599 100644
> --- a/drivers/gpu/drm/tegra/gem.c
> +++ b/drivers/gpu/drm/tegra/gem.c
> @@ -14,8 +14,10 @@
>   */
>
>  #include <linux/dma-buf.h>
> +#include <linux/iommu.h>
>  #include <drm/tegra_drm.h>
>
> +#include "drm.h"
>  #include "gem.h"
>
>  static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo)
> @@ -90,14 +92,144 @@ static const struct host1x_bo_ops tegra_bo_ops = {
>         .kunmap = tegra_bo_kunmap,
>  };

iommu_map_sg() could be implemented as iommu_ops->map_sg() for better
performance, since iommu_map() needs some page table cache maintenance
operations. Doing those cache operations once for the whole list would
bring some performance benefit.
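
As a rough illustration of the saving, here is a userspace model that
only counts operations. Nothing in it is a kernel API: the structures,
the model_* names, and the assumption that each per-segment map call
ends with one cache maintenance pass are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; none of these names are kernel APIs. */
struct model_seg {
    unsigned long phys;
    size_t len;
};

struct model_domain {
    unsigned int entries_written; /* page table updates performed */
    unsigned int cache_flushes;   /* cache maintenance passes */
};

static void model_write_entries(struct model_domain *d,
                                const struct model_seg *s)
{
    (void)s; /* a real driver would fill PTEs for this segment */
    d->entries_written++;
}

static void model_flush(struct model_domain *d)
{
    d->cache_flushes++;
}

/* Looping over a per-segment map call: every segment pays for a flush. */
static void model_map_looped(struct model_domain *d,
                             const struct model_seg *segs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        model_write_entries(d, &segs[i]);
        model_flush(d);
    }
}

/* A map_sg-style operation: write all entries, then flush once. */
static void model_map_batched(struct model_domain *d,
                              const struct model_seg *segs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        model_write_entries(d, &segs[i]);
    if (n > 0)
        model_flush(d);
}
```

In this model both variants write the same number of entries. The looped
form performs one flush per segment, and the batched form performs one
flush in total.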

> +static int iommu_map_sg(struct iommu_domain *domain, struct sg_table *sgt,
> +                       dma_addr_t iova, int prot)
> +{
> +       unsigned long offset = 0;
> +       struct scatterlist *sg;
> +       unsigned int i, j;
> +       int err;
> +
> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> +               dma_addr_t phys = sg_phys(sg);
> +               size_t length = sg->offset;
> +
> +               phys = sg_phys(sg) - sg->offset;
> +               length = sg->length + sg->offset;
> +
> +               err = iommu_map(domain, iova + offset, phys, length, prot);
> +               if (err < 0)
> +                       goto unmap;
> +
> +               offset += length;
> +       }
> +
> +       return 0;
> +
> +unmap:
> +       offset = 0;
> +
> +       for_each_sg(sgt->sgl, sg, i, j) {
> +               size_t length = sg->length + sg->offset;
> +               iommu_unmap(domain, iova + offset, length);
> +               offset += length;
> +       }
> +
> +       return err;
> +}

I don't think we need unmap_sg(). Couldn't a normal iommu_unmap() on the
whole area do the same in one call?
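
A small userspace model of that idea follows. It assumes, as the quoted
iommu_map_sg() does, that the segments were mapped back to back starting
at iova, so the whole buffer covers one contiguous IOVA range. The page
size, the model_* names, and the structures are hypothetical and not
kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u
#define MODEL_PAGES     16u

struct model_seg {
    size_t offset;
    size_t length;
};

struct model_space {
    bool mapped[MODEL_PAGES];
    unsigned int unmap_calls;
};

/* Clear every page in [iova, iova + len); both are assumed page aligned. */
static void model_unmap(struct model_space *s, size_t iova, size_t len)
{
    size_t first = iova / MODEL_PAGE_SIZE;
    size_t end = (iova + len) / MODEL_PAGE_SIZE;

    for (size_t p = first; p < end && p < MODEL_PAGES; p++)
        s->mapped[p] = false;
    s->unmap_calls++;
}

/* Total size of the mapping, summed the same way the patch maps it. */
static size_t model_total_length(const struct model_seg *segs, size_t n)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += segs[i].length + segs[i].offset;
    return total;
}

/* What the quoted iommu_unmap_sg() does: one unmap call per segment. */
static void model_unmap_per_segment(struct model_space *s,
                                    const struct model_seg *segs,
                                    size_t n, size_t iova)
{
    size_t offset = 0;

    for (size_t i = 0; i < n; i++) {
        size_t length = segs[i].length + segs[i].offset;

        model_unmap(s, iova + offset, length);
        offset += length;
    }
}

/* The suggestion: a single unmap call covering the whole area. */
static void model_unmap_whole(struct model_space *s,
                              const struct model_seg *segs,
                              size_t n, size_t iova)
{
    model_unmap(s, iova, model_total_length(segs, n));
}

static size_t model_count_mapped(const struct model_space *s)
{
    size_t count = 0;

    for (size_t p = 0; p < MODEL_PAGES; p++)
        count += s->mapped[p] ? 1 : 0;
    return count;
}
```

Both variants leave the same pages unmapped in this model. The only
difference is the number of unmap calls, and the caller has to know the
total size rather than walk the scatterlist again.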

> +static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
> +                         dma_addr_t iova)
> +{
> +       unsigned long offset = 0;
> +       struct scatterlist *sg;
> +       unsigned int i;
> +
> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> +               dma_addr_t phys = sg_phys(sg);
> +               size_t length = sg->offset;
> +
> +               phys = sg_phys(sg) - sg->offset;
> +               length = sg->length + sg->offset;
> +
> +               iommu_unmap(domain, iova + offset, length);
> +               offset += length;
> +       }
> +
> +       return 0;
> +}

Can the rest of the IOMMU API usage be replaced with the DMA API too?

^ permalink raw reply	[flat|nested] 133+ messages in thread

* [RFC 09/10] drm/tegra: Add IOMMU support
@ 2014-06-27  9:46         ` Hiroshi DOyu
  0 siblings, 0 replies; 133+ messages in thread
From: Hiroshi DOyu @ 2014-06-27  9:46 UTC (permalink / raw)
  To: linux-arm-kernel


Thierry Reding <thierry.reding@gmail.com> writes:

> From: Thierry Reding <treding@nvidia.com>
>
> When an IOMMU device is available on the platform bus, allocate an IOMMU
> domain and attach the display controllers to it. The display controllers
> can then scan out non-contiguous buffers by mapping them through the
> IOMMU.
>
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/gpu/drm/tegra/dc.c  |  21 ++++
>  drivers/gpu/drm/tegra/drm.c |  17 ++++
>  drivers/gpu/drm/tegra/drm.h |   3 +
>  drivers/gpu/drm/tegra/fb.c  |  16 ++-
>  drivers/gpu/drm/tegra/gem.c | 236 +++++++++++++++++++++++++++++++++++++++-----
>  drivers/gpu/drm/tegra/gem.h |   4 +
>  6 files changed, 273 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
> index afcca04f5367..0f7452d04811 100644
> --- a/drivers/gpu/drm/tegra/dc.c
> +++ b/drivers/gpu/drm/tegra/dc.c
> @@ -9,6 +9,7 @@
>
>  #include <linux/clk.h>
>  #include <linux/debugfs.h>
> +#include <linux/iommu.h>
>  #include <linux/reset.h>
>
>  #include "dc.h"
> @@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
>  {
>         struct drm_device *drm = dev_get_drvdata(client->parent);
>         struct tegra_dc *dc = host1x_client_to_dc(client);
> +       struct tegra_drm *tegra = drm->dev_private;
>         int err;
>
> +       if (tegra->domain) {
> +               err = iommu_attach_device(tegra->domain, dc->dev);

I wanted to keep device drivers iommu-free with the following:

http://patchwork.ozlabs.org/patch/354074/


> +               if (err < 0) {
> +                       dev_err(dc->dev, "failed to attach to IOMMU: %d\n",
> +                               err);
> +                       return err;
> +               }
> +       }
> +
>         drm_crtc_init(drm, &dc->base, &tegra_crtc_funcs);
>         drm_mode_crtc_set_gamma_size(&dc->base, 256);
>         drm_crtc_helper_add(&dc->base, &tegra_crtc_helper_funcs);
> @@ -1318,7 +1329,9 @@ static int tegra_dc_init(struct host1x_client *client)
>
>  static int tegra_dc_exit(struct host1x_client *client)
>  {
> +       struct drm_device *drm = dev_get_drvdata(client->parent);
>         struct tegra_dc *dc = host1x_client_to_dc(client);
> +       struct tegra_drm *tegra = drm->dev_private;
>         int err;
>
>         devm_free_irq(dc->dev, dc->irq, dc);
> @@ -1335,6 +1348,8 @@ static int tegra_dc_exit(struct host1x_client *client)
>                 return err;
>         }
>
> +       iommu_detach_device(tegra->domain, dc->dev);
> +
>         return 0;
>  }
>
> @@ -1462,6 +1477,12 @@ static int tegra_dc_probe(struct platform_device *pdev)
>                 return -ENXIO;
>         }
>
> +       err = iommu_attach(&pdev->dev);
> +       if (err < 0) {
> +               dev_err(&pdev->dev, "failed to attach to IOMMU: %d\n", err);
> +               return err;
> +       }
> +
>         INIT_LIST_HEAD(&dc->client.list);
>         dc->client.ops = &dc_client_ops;
>         dc->client.dev = &pdev->dev;
> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
> index 59736bb810cd..1d2bbafad982 100644
> --- a/drivers/gpu/drm/tegra/drm.c
> +++ b/drivers/gpu/drm/tegra/drm.c
> @@ -8,6 +8,7 @@
>   */
>
>  #include <linux/host1x.h>
> +#include <linux/iommu.h>
>
>  #include "drm.h"
>  #include "gem.h"
> @@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>         if (!tegra)
>                 return -ENOMEM;
>
> +       if (iommu_present(&platform_bus_type)) {
> +               tegra->domain = iommu_domain_alloc(&platform_bus_type);

Can we use "dma_iommu_mapping" instead of a domain?

I thought that the DMA API sits on top of the IOMMU API, so it may be
cleaner to use only the DMA API.


> +               if (IS_ERR(tegra->domain)) {
> +                       kfree(tegra);
> +                       return PTR_ERR(tegra->domain);
> +               }
> +
> +               drm_mm_init(&tegra->mm, 0, SZ_2G);
> +       }
> +
>         mutex_init(&tegra->clients_lock);
>         INIT_LIST_HEAD(&tegra->clients);
>         drm->dev_private = tegra;
> @@ -71,6 +82,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>  static int tegra_drm_unload(struct drm_device *drm)
>  {
>         struct host1x_device *device = to_host1x_device(drm->dev);
> +       struct tegra_drm *tegra = drm->dev_private;
>         int err;
>
>         drm_kms_helper_poll_fini(drm);
> @@ -82,6 +94,11 @@ static int tegra_drm_unload(struct drm_device *drm)
>         if (err < 0)
>                 return err;
>
> +       if (tegra->domain) {
> +               iommu_domain_free(tegra->domain);
> +               drm_mm_takedown(&tegra->mm);
> +       }
> +
>         return 0;
>  }
>
> diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
> index 96d754e7b3eb..a07c796b7edc 100644
> --- a/drivers/gpu/drm/tegra/drm.h
> +++ b/drivers/gpu/drm/tegra/drm.h
> @@ -39,6 +39,9 @@ struct tegra_fbdev {
>  struct tegra_drm {
>         struct drm_device *drm;
>
> +       struct iommu_domain *domain;
> +       struct drm_mm mm;
> +
>         struct mutex clients_lock;
>         struct list_head clients;
>
> diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c
> index 7790d43ad082..21c65dd817c3 100644
> --- a/drivers/gpu/drm/tegra/fb.c
> +++ b/drivers/gpu/drm/tegra/fb.c
> @@ -65,8 +65,12 @@ static void tegra_fb_destroy(struct drm_framebuffer *framebuffer)
>         for (i = 0; i < fb->num_planes; i++) {
>                 struct tegra_bo *bo = fb->planes[i];
>
> -               if (bo)
> +               if (bo) {
> +                       if (bo->pages && bo->virt)
> +                               vunmap(bo->virt);
> +
>                         drm_gem_object_unreference_unlocked(&bo->gem);
> +               }
>         }
>
>         drm_framebuffer_cleanup(framebuffer);
> @@ -252,6 +256,16 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
>         offset = info->var.xoffset * bytes_per_pixel +
>                  info->var.yoffset * fb->pitches[0];
>
> +       if (bo->pages) {
> +               bo->vaddr = vmap(bo->pages, bo->num_pages, VM_MAP,
> +                                pgprot_writecombine(PAGE_KERNEL));
> +               if (!bo->vaddr) {
> +                       dev_err(drm->dev, "failed to vmap() framebuffer\n");
> +                       err = -ENOMEM;
> +                       goto destroy;
> +               }
> +       }
> +
>         drm->mode_config.fb_base = (resource_size_t)bo->paddr;
>         info->screen_base = (void __iomem *)bo->vaddr + offset;
>         info->screen_size = size;
> diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
> index c1e4e8b6e5ca..2912e61a2599 100644
> --- a/drivers/gpu/drm/tegra/gem.c
> +++ b/drivers/gpu/drm/tegra/gem.c
> @@ -14,8 +14,10 @@
>   */
>
>  #include <linux/dma-buf.h>
> +#include <linux/iommu.h>
>  #include <drm/tegra_drm.h>
>
> +#include "drm.h"
>  #include "gem.h"
>
>  static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo)
> @@ -90,14 +92,144 @@ static const struct host1x_bo_ops tegra_bo_ops = {
>         .kunmap = tegra_bo_kunmap,
>  };

iommu_map_sg() could be implemented as iommu_ops->map_sg() for better
performance, since each iommu_map() call needs some page-table cache
operations. Doing those cache operations once for the whole list should
bring some performance benefit.

> +static int iommu_map_sg(struct iommu_domain *domain, struct sg_table *sgt,
> +                       dma_addr_t iova, int prot)
> +{
> +       unsigned long offset = 0;
> +       struct scatterlist *sg;
> +       unsigned int i, j;
> +       int err;
> +
> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> +               dma_addr_t phys = sg_phys(sg);
> +               size_t length = sg->offset;
> +
> +               phys = sg_phys(sg) - sg->offset;
> +               length = sg->length + sg->offset;
> +
> +               err = iommu_map(domain, iova + offset, phys, length, prot);
> +               if (err < 0)
> +                       goto unmap;
> +
> +               offset += length;
> +       }
> +
> +       return 0;
> +
> +unmap:
> +       offset = 0;
> +
> +       for_each_sg(sgt->sgl, sg, i, j) {
> +               size_t length = sg->length + sg->offset;
> +               iommu_unmap(domain, iova + offset, length);
> +               offset += length;
> +       }
> +
> +       return err;
> +}

I don't think we need unmap_sg(). Couldn't a single iommu_unmap() call
for the whole area do the same thing at once?

> +static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
> +                         dma_addr_t iova)
> +{
> +       unsigned long offset = 0;
> +       struct scatterlist *sg;
> +       unsigned int i;
> +
> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> +               dma_addr_t phys = sg_phys(sg);
> +               size_t length = sg->offset;
> +
> +               phys = sg_phys(sg) - sg->offset;
> +               length = sg->length + sg->offset;
> +
> +               iommu_unmap(domain, iova + offset, length);
> +               offset += length;
> +       }
> +
> +       return 0;
> +}

Can the rest of the IOMMU API usage be replaced with the DMA API too?

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 04/10] memory: Add Tegra124 memory controller support
  2014-06-26 20:49     ` Thierry Reding
  (?)
@ 2014-06-27  9:46         ` Hiroshi DOyu
  -1 siblings, 0 replies; 133+ messages in thread
From: Hiroshi DOyu @ 2014-06-27  9:46 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Mark Rutland, Will Deacon, Paul Walmsley, Pawel Moll,
	Ian Campbell, Marc Zyngier, Dave Martin, Olav Haugan,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Arnd Bergmann, Stephen Warren,
	Grant Grundler, Allen Martin, Rob Herring,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA, Cho KyongHo,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Kumar Gala


Thierry Reding <thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> From: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>
> The memory controller on NVIDIA Tegra124 exposes various knobs that can
> be used to tune the behaviour of the clients attached to it.
>
> Currently this driver sets up the latency allowance registers to the HW
> defaults. Eventually an API should be exported by this driver (via a
> custom API or a generic subsystem) to allow clients to register latency
> requirements.
>
> This driver also registers an IOMMU (SMMU) that's implemented by the
> memory controller.
>
> Signed-off-by: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> ---
>  drivers/memory/Kconfig                   |    9 +
>  drivers/memory/Makefile                  |    1 +
>  drivers/memory/tegra124-mc.c             | 1945 ++++++++++++++++++++++++++++++
>  include/dt-bindings/memory/tegra124-mc.h |   30 +
>  4 files changed, 1985 insertions(+)
>  create mode 100644 drivers/memory/tegra124-mc.c
>  create mode 100644 include/dt-bindings/memory/tegra124-mc.h

I would prefer to reuse the existing SMMU driver and keep MC and SMMU
separate, since most of the SMMU code does not differ functionally, and
the new MC features are quite independent of the SMMU.

If it's really convenient to combine MC and SMMU into one driver, we
could move "drivers/iommu/tegra-smmu.c" here first, and add the MC
features on top of it.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 09/10] drm/tegra: Add IOMMU support
  2014-06-27  9:46         ` Hiroshi DOyu
  (?)
@ 2014-06-27 10:54             ` Arnd Bergmann
  -1 siblings, 0 replies; 133+ messages in thread
From: Arnd Bergmann @ 2014-06-27 10:54 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
  Cc: Hiroshi DOyu, Thierry Reding, Mark Rutland, Will Deacon,
	Paul Walmsley, Pawel Moll, Joerg Roedel, Ian Campbell,
	Marc Zyngier, Dave Martin, Olav Haugan,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Stephen Warren,
	Grant Grundler, Allen Martin, Rob Herring,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA, Cho KyongHo,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Kumar Gala

On Friday 27 June 2014 12:46:14 Hiroshi DOyu wrote:
> 
> Thierry Reding <thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
> 
> > From: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> >
> > When an IOMMU device is available on the platform bus, allocate an IOMMU
> > domain and attach the display controllers to it. The display controllers
> > can then scan out non-contiguous buffers by mapping them through the
> > IOMMU.
> >
> > Signed-off-by: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> > ---
> > @@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
> >  {
> >         struct drm_device *drm = dev_get_drvdata(client->parent);
> >         struct tegra_dc *dc = host1x_client_to_dc(client);
> > +       struct tegra_drm *tegra = drm->dev_private;
> >         int err;
> >
> > +       if (tegra->domain) {
> > +               err = iommu_attach_device(tegra->domain, dc->dev);
> 
> I wanted to keep device drivers iommu-free with the following:
> 
> http://patchwork.ozlabs.org/patch/354074/
> 

We definitely need something like your series to make IOMMUs work transparently
on ARM for normal devices, using of_dma_configure() to look up the correct
IOMMU for each device and initialize it.

However, devices that work with multiple IOMMU domains cannot do that
and still need to use the IOMMU API directly. I believe the Tegra DRM code
falls into this category.

	Arnd

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 09/10] drm/tegra: Add IOMMU support
  2014-06-27  9:46         ` Hiroshi DOyu
  (?)
@ 2014-06-27 10:58             ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-27 10:58 UTC (permalink / raw)
  To: Hiroshi DOyu
  Cc: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel,
	Cho KyongHo, Grant Grundler, Dave Martin, Marc Zyngier,
	Olav Haugan, Paul Walmsley, Rhyland Klein, Allen Martin,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

[-- Attachment #1: Type: text/plain, Size: 3740 bytes --]

On Fri, Jun 27, 2014 at 12:46:14PM +0300, Hiroshi DOyu wrote:
> Thierry Reding <thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
[...]
> > diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
[...]
> > +       if (tegra->domain) {
> > +               err = iommu_attach_device(tegra->domain, dc->dev);
> 
> I wanted to keep device drivers iommu-free with the following:
> 
> http://patchwork.ozlabs.org/patch/354074/

That patch only addresses the probe ordering problem that happens if the
user of an IOMMU is probed before the IOMMU. What this patch does is a
whole lot more.

> > diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
> > index 59736bb810cd..1d2bbafad982 100644
> > --- a/drivers/gpu/drm/tegra/drm.c
> > +++ b/drivers/gpu/drm/tegra/drm.c
> > @@ -8,6 +8,7 @@
> >   */
> >
> >  #include <linux/host1x.h>
> > +#include <linux/iommu.h>
> >
> >  #include "drm.h"
> >  #include "gem.h"
> > @@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
> >         if (!tegra)
> >                 return -ENOMEM;
> >
> > +       if (iommu_present(&platform_bus_type)) {
> > +               tegra->domain = iommu_domain_alloc(&platform_bus_type);
> 
> Can we use "dma_iommu_mapping" instead of domain?
> 
> I thought that DMA API is on the top of IOMMU API so that it may be
> cleaner to use only DMA API.

Using the DMA API doesn't work for Tegra DRM because it assumes a 1:1
mapping between a device and an IOMMU domain. For Tegra DRM we have two
devices (two display controllers) that need to be able to access the
same buffers, therefore they need to share one IOMMU domain. This can't
be done using the DMA API.

The DMA API is fine to be used by devices that operate on "private" DMA
buffers (SDMMC, USB, ...).
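
The sharing requirement can be illustrated with a small userspace model.
This is only a sketch: every type and function below (model_domain,
model_device, model_attach(), model_map(), model_translate()) is
hypothetical and does not correspond to kernel code. It shows two devices
attached to one domain resolving the same I/O virtual address to the same
physical address, which is what the two display controllers need.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_MAPPINGS 8

/* Userspace model only: these are not the kernel's types or names. */
struct model_mapping {
	uint64_t iova;
	uint64_t phys;
	size_t len;
};

/* One address space: a flat table of IOVA -> physical ranges. */
struct model_domain {
	struct model_mapping maps[MODEL_MAX_MAPPINGS];
	size_t count;
};

struct model_device {
	const char *name;
	struct model_domain *domain;	/* NULL until attached */
};

/* Several devices may point at the same domain. */
void model_attach(struct model_domain *d, struct model_device *dev)
{
	dev->domain = d;
}

/* Record one mapping in the domain; returns 0 or -1 if the table is full. */
int model_map(struct model_domain *d, uint64_t iova, uint64_t phys, size_t len)
{
	if (d->count == MODEL_MAX_MAPPINGS)
		return -1;

	d->maps[d->count].iova = iova;
	d->maps[d->count].phys = phys;
	d->maps[d->count].len = len;
	d->count++;

	return 0;
}

/* Translate an address as seen by dev; returns 0 on success, -1 on a miss. */
int model_translate(const struct model_device *dev, uint64_t iova,
		    uint64_t *phys)
{
	size_t i;

	if (!dev->domain)
		return -1;

	for (i = 0; i < dev->domain->count; i++) {
		const struct model_mapping *m = &dev->domain->maps[i];

		if (iova >= m->iova && iova - m->iova < m->len) {
			*phys = m->phys + (iova - m->iova);
			return 0;
		}
	}

	return -1;
}
```

A buffer mapped once into the shared domain is visible to every attached
device; with one domain per device the same buffer would have to be mapped
separately for each of them.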

> iommu_map_sg() could be implemented as iommu_ops->map_sg() for the
> better perf since iommu_map() needs some pagetable cache operations. If
> we do those cache operations at once, it would bring some perf benefit.

Yes, I agree that eventually this should be moved into the IOMMU core.
We could add a .map_sg() to IOMMU ops for devices where mapping a whole
sg_table at once would have significant performance benefits and change
this generic implementation to be used by devices that don't implement
.map_sg(). Then the IOMMU core's iommu_map_sg() can call into the driver
directly or fall back to the generic implementation.
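
A minimal userspace sketch of that dispatch follows. It is a model under
stated assumptions, not the IOMMU core: the model_* types, the optional
map_sg callback and the flush counter are all hypothetical and exist only
to show the shape of "use the driver's batched path if present, otherwise
loop over map()".

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: these are not the kernel's types or names. */
struct model_seg {
	uint64_t phys;
	size_t len;
};

struct model_domain;

struct model_ops {
	int (*map)(struct model_domain *d, uint64_t iova, uint64_t phys,
		   size_t len);
	void (*unmap)(struct model_domain *d, uint64_t iova, size_t len);
	/* Optional: NULL when the driver has no batched path. */
	int (*map_sg)(struct model_domain *d, uint64_t iova,
		      const struct model_seg *segs, size_t n);
};

struct model_domain {
	const struct model_ops *ops;
	size_t mapped;		/* bytes currently mapped */
	unsigned int flushes;	/* page-table cache flushes issued */
};

/* Generic fallback: one map() call, and so one flush, per segment. */
static int model_generic_map_sg(struct model_domain *d, uint64_t iova,
				const struct model_seg *segs, size_t n)
{
	size_t offset = 0, i;
	int err;

	for (i = 0; i < n; i++) {
		err = d->ops->map(d, iova + offset, segs[i].phys, segs[i].len);
		if (err < 0) {
			/* Roll back everything mapped so far. */
			if (offset)
				d->ops->unmap(d, iova, offset);
			return err;
		}

		offset += segs[i].len;
	}

	return 0;
}

/* Core entry point: prefer the driver's batched op, else fall back. */
int model_iommu_map_sg(struct model_domain *d, uint64_t iova,
		       const struct model_seg *segs, size_t n)
{
	if (d->ops->map_sg)
		return d->ops->map_sg(d, iova, segs, n);

	return model_generic_map_sg(d, iova, segs, n);
}

static int model_map(struct model_domain *d, uint64_t iova, uint64_t phys,
		     size_t len)
{
	(void)iova;
	(void)phys;
	d->mapped += len;
	d->flushes++;		/* flush after every single mapping */
	return 0;
}

static void model_unmap(struct model_domain *d, uint64_t iova, size_t len)
{
	(void)iova;
	d->mapped -= len;
	d->flushes++;
}

static int model_batched_map_sg(struct model_domain *d, uint64_t iova,
				const struct model_seg *segs, size_t n)
{
	size_t i;

	(void)iova;
	for (i = 0; i < n; i++)
		d->mapped += segs[i].len;

	d->flushes++;		/* one flush for the whole list */
	return 0;
}

/* A driver without a batched path, and one with it. */
const struct model_ops model_simple_ops = {
	.map = model_map,
	.unmap = model_unmap,
};

const struct model_ops model_batched_ops = {
	.map = model_map,
	.unmap = model_unmap,
	.map_sg = model_batched_map_sg,
};
```

In this model a three-segment list costs three flushes through the
fallback and one through the batched op, which is the saving being
discussed; real numbers would depend on the hardware.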

> I think that we don't need unmap_sg(), instead normal iommu_unmap() for
> a whole area could do the same at once?

Yes, I suppose that's true. I'll see if it can be safely dropped. It
might give us the same benefit as the iommu_map_sg() regarding cache
maintenance, though.
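
To make the equivalence concrete, here is a small userspace sketch. It
assumes page-aligned segments laid out back to back in IOVA space, as the
mapping loop in the patch does; the model_* names and the page bitmap are
hypothetical stand-ins for real page tables. Mapping returns the total
size, and a single unmap over that size clears every page.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u
#define MODEL_NUM_PAGES 64u

/* Userspace model only: these are not the kernel's types or names. */
struct model_seg {
	size_t offset;
	size_t length;
};

/* One flag per IOVA page stands in for the page tables. */
struct model_domain {
	unsigned char present[MODEL_NUM_PAGES];
};

/* Mark the pages covering [iova, iova + len) as present or absent. */
static void model_set_range(struct model_domain *d, size_t iova, size_t len,
			    unsigned char value)
{
	size_t first = iova / MODEL_PAGE_SIZE;
	size_t last = (iova + len) / MODEL_PAGE_SIZE;
	size_t p;

	for (p = first; p < last && p < MODEL_NUM_PAGES; p++)
		d->present[p] = value;
}

/*
 * Map the segments back to back starting at iova, one call per segment,
 * and return the total number of bytes covered.
 */
size_t model_map_sg(struct model_domain *d, size_t iova,
		    const struct model_seg *segs, size_t n)
{
	size_t offset = 0, i;

	for (i = 0; i < n; i++) {
		size_t len = segs[i].offset + segs[i].length;

		model_set_range(d, iova + offset, len, 1);
		offset += len;
	}

	return offset;
}

/* A single unmap over the whole contiguous IOVA range. */
void model_unmap(struct model_domain *d, size_t iova, size_t len)
{
	model_set_range(d, iova, len, 0);
}

size_t model_pages_mapped(const struct model_domain *d)
{
	size_t count = 0, p;

	for (p = 0; p < MODEL_NUM_PAGES; p++)
		count += d->present[p];

	return count;
}
```

The caller only has to remember the base IOVA and the total size, so the
sg_table is not needed again at unmap time.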

> > +static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
> > +                         dma_addr_t iova)
> > +{
> > +       unsigned long offset = 0;
> > +       struct scatterlist *sg;
> > +       unsigned int i;
> > +
> > +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> > +               dma_addr_t phys = sg_phys(sg);
> > +               size_t length = sg->offset;
> > +
> > +               phys = sg_phys(sg) - sg->offset;
> > +               length = sg->length + sg->offset;
> > +
> > +               iommu_unmap(domain, iova + offset, length);
> > +               offset += length;
> > +       }
> > +
> > +       return 0;
> > +}
> 
> Can the rest of IOMMU API be replaced with DMA API too?

As I explained above, I don't see how it could be done for this driver.
But I don't think it has to. After all the IOMMU API does exist, so we
shouldn't shy away from using it when appropriate.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 133+ messages in thread

* [RFC 09/10] drm/tegra: Add IOMMU support
@ 2014-06-27 10:58             ` Thierry Reding
  0 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-27 10:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 12:46:14PM +0300, Hiroshi DOyu wrote:
> Thierry Reding <thierry.reding@gmail.com> writes:
[...]
> > diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
[...]
> > +       if (tegra->domain) {
> > +               err = iommu_attach_device(tegra->domain, dc->dev);
> 
> I wanted to keep device drivers iommu-free with the following:
> 
> http://patchwork.ozlabs.org/patch/354074/

That patch only addresses the probe ordering problem that happens if the
user of an IOMMU is probed before the IOMMU. What this patch does is a
whole lot more.

> > diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
> > index 59736bb810cd..1d2bbafad982 100644
> > --- a/drivers/gpu/drm/tegra/drm.c
> > +++ b/drivers/gpu/drm/tegra/drm.c
> > @@ -8,6 +8,7 @@
> >   */
> >
> >  #include <linux/host1x.h>
> > +#include <linux/iommu.h>
> >
> >  #include "drm.h"
> >  #include "gem.h"
> > @@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
> >         if (!tegra)
> >                 return -ENOMEM;
> >
> > +       if (iommu_present(&platform_bus_type)) {
> > +               tegra->domain = iommu_domain_alloc(&platform_bus_type);
> 
> Can we use "dma_iommu_mapping" instead of domain?
> 
> I thought that DMA API is on the top of IOMMU API so that it may be
> cleaner to use only DMA API.

Using the DMA API doesn't work for Tegra DRM because it assumes a 1:1
mapping between a device and an IOMMU domain. For Tegra DRM we have two
devices (two display controllers) that need to be able to access the
same buffers, therefore they need to share one IOMMU domain. This can't
be done using the DMA API.

The DMA API is fine to be used by devices that operate on "private" DMA
buffers (SDMMC, USB, ...).
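
The sharing requirement can be illustrated with a small userspace model.
This is only a sketch under stated assumptions: the model_* types and
functions are hypothetical stand-ins, not the kernel's IOMMU API. It shows
that a buffer mapped once into a shared domain translates identically for
every device attached to that domain.

```c
#include <stddef.h>

#define MODEL_MAX_MAPPINGS 8

/* One IOVA -> physical range. Hypothetical model type, not a kernel struct. */
struct model_mapping {
	unsigned long iova;
	unsigned long phys;
	size_t len;
};

/* An I/O address space; several devices may point at the same one. */
struct model_domain {
	struct model_mapping maps[MODEL_MAX_MAPPINGS];
	size_t count;
};

struct model_device {
	const char *name;
	struct model_domain *domain; /* NULL until attached */
};

/* Attach a device to a domain; many devices can share one domain. */
void model_attach(struct model_domain *dom, struct model_device *dev)
{
	dev->domain = dom;
}

/* Record a mapping in the domain. Returns 0 on success, -1 if full. */
int model_map(struct model_domain *dom, unsigned long iova,
	      unsigned long phys, size_t len)
{
	if (dom->count == MODEL_MAX_MAPPINGS)
		return -1;

	dom->maps[dom->count].iova = iova;
	dom->maps[dom->count].phys = phys;
	dom->maps[dom->count].len = len;
	dom->count++;
	return 0;
}

/* Translate an IOVA as seen by a device. Returns -1 on a "fault". */
int model_translate(const struct model_device *dev, unsigned long iova,
		    unsigned long *phys)
{
	size_t i;

	if (!dev->domain)
		return -1;

	for (i = 0; i < dev->domain->count; i++) {
		const struct model_mapping *m = &dev->domain->maps[i];

		if (iova >= m->iova && iova - m->iova < m->len) {
			*phys = m->phys + (iova - m->iova);
			return 0;
		}
	}
	return -1;
}
```

With two model devices attached to the same model_domain, a single
model_map() call makes the buffer visible to both. With one domain per
device, the same buffer would have to be mapped once per device.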

> iommu_map_sg() could be implemented as iommu_ops->map_sg() for better
> performance, since iommu_map() needs some pagetable cache operations. If
> we do those cache operations all at once, it would bring some perf benefit.

Yes, I agree that eventually this should be moved into the IOMMU core.
We could add a .map_sg() to IOMMU ops for devices where mapping a whole
sg_table at once would have significant performance benefits and change
this generic implementation to be used by devices that don't implement
.map_sg(). Then the IOMMU core's iommu_map_sg() can call into the driver
directly or fall back to the generic implementation.
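
The dispatch described above could look roughly like the following
userspace sketch. All model_* names are hypothetical and only mimic the
shape of the idea; this is not the kernel's struct iommu_ops, and the flush
counter is an assumed stand-in for page-table cache maintenance.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for illustration only; not the kernel's structures. */
struct model_seg {
	unsigned long phys;
	size_t len;
};

struct model_domain;

struct model_ops {
	/* Mandatory: map one physically contiguous range. */
	int (*map)(struct model_domain *d, unsigned long iova,
		   unsigned long phys, size_t len);
	/* Optional: map a whole segment list in one go. */
	int (*map_sg)(struct model_domain *d, unsigned long iova,
		      const struct model_seg *segs, size_t nsegs);
};

struct model_domain {
	const struct model_ops *ops;
	size_t mapped;        /* bytes mapped so far */
	unsigned int flushes; /* modelled cache maintenance operations */
};

/* Generic fallback: one ->map() call (and thus one flush) per segment. */
static int generic_map_sg(struct model_domain *d, unsigned long iova,
			  const struct model_seg *segs, size_t nsegs)
{
	size_t i, offset = 0;

	for (i = 0; i < nsegs; i++) {
		int err = d->ops->map(d, iova + offset, segs[i].phys,
				      segs[i].len);
		if (err)
			return err;
		offset += segs[i].len;
	}
	return 0;
}

/* Core entry point: prefer the driver's ->map_sg(), else fall back. */
int model_iommu_map_sg(struct model_domain *d, unsigned long iova,
		       const struct model_seg *segs, size_t nsegs)
{
	assert(d && d->ops && d->ops->map);

	if (d->ops->map_sg)
		return d->ops->map_sg(d, iova, segs, nsegs);
	return generic_map_sg(d, iova, segs, nsegs);
}

/* Driver A: only ->map(); does cache maintenance after every range. */
static int simple_map(struct model_domain *d, unsigned long iova,
		      unsigned long phys, size_t len)
{
	(void)iova;
	(void)phys;
	d->mapped += len;
	d->flushes++;
	return 0;
}

/* Driver B: handles all segments first, then flushes once. */
static int batched_map_sg(struct model_domain *d, unsigned long iova,
			  const struct model_seg *segs, size_t nsegs)
{
	size_t i;

	(void)iova;
	for (i = 0; i < nsegs; i++)
		d->mapped += segs[i].len;
	d->flushes++;
	return 0;
}

const struct model_ops simple_ops = { .map = simple_map };
const struct model_ops batched_ops = {
	.map = simple_map,
	.map_sg = batched_map_sg,
};
```

In this model a driver without .map_sg pays one flush per segment, while a
driver that provides it pays one flush for the whole list. Callers use the
same entry point in both cases.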

> I think that we don't need unmap_sg(); couldn't a normal iommu_unmap() on
> the whole area do the same at once?

Yes, I suppose that's true. I'll see if it can be safely dropped. It
might give us the same benefit as the iommu_map_sg() regarding cache
maintenance, though.

> > +static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
> > +                         dma_addr_t iova)
> > +{
> > +       unsigned long offset = 0;
> > +       struct scatterlist *sg;
> > +       unsigned int i;
> > +
> > +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> > +               dma_addr_t phys = sg_phys(sg);
> > +               size_t length = sg->offset;
> > +
> > +               phys = sg_phys(sg) - sg->offset;
> > +               length = sg->length + sg->offset;
> > +
> > +               iommu_unmap(domain, iova + offset, length);
> > +               offset += length;
> > +       }
> > +
> > +       return 0;
> > +}
> 
> Can the rest of IOMMU API be replaced with DMA API too?

As I explained above, I don't see how it could be done for this driver.
But I don't think it has to. After all the IOMMU API does exist, so we
shouldn't shy away from using it when appropriate.

Thierry

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 10/10] mmc: sdhci-tegra: Add IOMMU support
  2014-06-27  9:46         ` Hiroshi DOyu
  (?)
@ 2014-06-27 11:01             ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-27 11:01 UTC (permalink / raw)
  To: Hiroshi DOyu
  Cc: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel,
	Cho KyongHo, Grant Grundler, Dave Martin, Marc Zyngier,
	Olav Haugan, Paul Walmsley, Rhyland Klein, Allen Martin,
	devicetree, iommu, linux-arm-kernel, linux-tegra

On Fri, Jun 27, 2014 at 12:46:02PM +0300, Hiroshi DOyu wrote:
> 
> Thierry Reding <thierry.reding@gmail.com> writes:
> 
> > From: Thierry Reding <treding@nvidia.com>
> >
> > Attach to the device's master interface of the IOMMU at .probe() time.
> > IOMMU support becomes available via the DMA mapping API interoperation
> > code, but this explicit attachment is necessary to ensure proper probe
> > order.
> >
> > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > ---
> >  drivers/mmc/host/sdhci-tegra.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/mmc/host/sdhci-tegra.c b/drivers/mmc/host/sdhci-tegra.c
> > index 33100d10d176..b884614fa4e6 100644
> > --- a/drivers/mmc/host/sdhci-tegra.c
> > +++ b/drivers/mmc/host/sdhci-tegra.c
> > @@ -15,6 +15,7 @@
> >  #include <linux/err.h>
> >  #include <linux/module.h>
> >  #include <linux/init.h>
> > +#include <linux/iommu.h>
> >  #include <linux/platform_device.h>
> >  #include <linux/clk.h>
> >  #include <linux/io.h>
> > @@ -237,6 +238,11 @@ static int sdhci_tegra_probe(struct platform_device *pdev)
> >  	match = of_match_device(sdhci_tegra_dt_match, &pdev->dev);
> >  	if (!match)
> >  		return -EINVAL;
> > +
> > +	rc = iommu_attach(&pdev->dev);
> > +	if (rc < 0)
> > +		return rc;
> > +
> 
> I thought that ->probe() should do only minimal H/W probing, so that any
> DMA API calls could be deferred until after ->probe(), to the point where
> the device is actually used, such as when a device node is opened. To me
> this decision (minimal H/W probing) seemed logical, but it would add a
> new restriction. One advantage is that we could still keep all drivers
> without any IOMMU code, as long as they don't call the DMA API in ->probe().

This isn't immediately apparent in this case, but I think that in the
future we may need to have this kind of explicit attachment to an IOMMU
for example once devices start to appear that have multiple master
interfaces (possibly on different IOMMUs). For easy cases like this
SDMMC driver we may be able to get away more easily by hooking this up
within the driver core for example. I'd have to look into how exactly
this would work, though.

Thierry

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 09/10] drm/tegra: Add IOMMU support
  2014-06-27 10:54             ` Arnd Bergmann
@ 2014-06-27 11:03               ` Hiroshi Doyu
  -1 siblings, 0 replies; 133+ messages in thread
From: Hiroshi Doyu @ 2014-06-27 11:03 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Mark Rutland, Will Deacon, Thierry Reding, Paul Walmsley,
	Stephen Warren, Grant Grundler, Dave Martin, Olav Haugan,
	devicetree, Pawel Moll, Ian Campbell,
	Marc Zyngier, Allen Martin, Rob Herring,
	linux-tegra, Cho KyongHo, linux-arm-kernel,
	linux-kernel, iommu, Kumar Gala


Arnd Bergmann <arnd@arndb.de> writes:

> On Friday 27 June 2014 12:46:14 Hiroshi DOyu wrote:
>> 
>> Thierry Reding <thierry.reding@gmail.com> writes:
>> 
>> > From: Thierry Reding <treding@nvidia.com>
>> >
>> > When an IOMMU device is available on the platform bus, allocate an IOMMU
>> > domain and attach the display controllers to it. The display controllers
>> > can then scan out non-contiguous buffers by mapping them through the
>> > IOMMU.
>> >
>> > Signed-off-by: Thierry Reding <treding@nvidia.com>
>> > ---
>> > @@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
>> >  {
>> >         struct drm_device *drm = dev_get_drvdata(client->parent);
>> >         struct tegra_dc *dc = host1x_client_to_dc(client);
>> > +       struct tegra_drm *tegra = drm->dev_private;
>> >         int err;
>> >
>> > +       if (tegra->domain) {
>> > +               err = iommu_attach_device(tegra->domain, dc->dev);
>> 
>> I wanted to keep device drivers iommu-free with the following:
>> 
>> http://patchwork.ozlabs.org/patch/354074/
>> 
>
> We definitely need something like your series to make iommus work transparently
> on ARM for normal devices, using the of_dma_configure() to look up the correct
> iommu per device and initialize it.

OK

> However, any devices that work with multiple iommu domains cannot do that
> and still need to use the iommu API directly. I believe the tegra drm code
> falls into this category.

I think something similar can be said about the "DMA API" vs. the "IOMMU
API". Most traditional devices would be OK with the existing DMA API,
which can be backed by an IOMMU. OTOH, some smart devices may need to dive
into the IOMMU API for more precise control.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 04/10] memory: Add Tegra124 memory controller support
  2014-06-26 20:49     ` Thierry Reding
  (?)
@ 2014-06-27 11:07         ` Arnd Bergmann
  -1 siblings, 0 replies; 133+ messages in thread
From: Arnd Bergmann @ 2014-06-27 11:07 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Will Deacon, Joerg Roedel, Cho KyongHo,
	Grant Grundler, Dave Martin, Marc Zyngier, Hiroshi Doyu,
	Olav Haugan, Paul Walmsley, Rhyland Klein, Allen Martin,
	devicetree, iommu, linux-arm-kernel, linux-tegra, linux-kernel

On Thursday 26 June 2014 22:49:44 Thierry Reding wrote:
> +static const struct tegra_mc_client tegra124_mc_clients[] = {
> +       {
> +               .id = 0x01,
> +               .name = "display0a",
> +               .swgroup = TEGRA_SWGROUP_DC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 1,
> +               },
> +               .latency = {
> +                       .reg = 0x2e8,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0xc2,
> +               },
> +       }, {

This is a rather long table that I assume would need to get duplicated
and modified for each specific SoC. Have you considered putting the
information into DT instead, as auxiliary data in the iommu specifier as
provided by the device?

	Arnd

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 04/10] memory: Add Tegra124 memory controller support
  2014-06-27  9:46         ` Hiroshi DOyu
  (?)
@ 2014-06-27 11:08             ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-27 11:08 UTC (permalink / raw)
  To: Hiroshi DOyu
  Cc: Mark Rutland, Will Deacon, Paul Walmsley, Pawel Moll,
	Ian Campbell, Marc Zyngier, Dave Martin, Olav Haugan,
	devicetree, Arnd Bergmann, Stephen Warren,
	Grant Grundler, Allen Martin, Rob Herring,
	linux-tegra, Cho KyongHo, linux-arm-kernel,
	linux-kernel, iommu, Kumar Gala


On Fri, Jun 27, 2014 at 12:46:38PM +0300, Hiroshi DOyu wrote:
> 
> Thierry Reding <thierry.reding@gmail.com> writes:
> 
> > From: Thierry Reding <treding@nvidia.com>
> >
> > The memory controller on NVIDIA Tegra124 exposes various knobs that can
> > be used to tune the behaviour of the clients attached to it.
> >
> > Currently this driver sets up the latency allowance registers to the HW
> > defaults. Eventually an API should be exported by this driver (via a
> > custom API or a generic subsystem) to allow clients to register latency
> > requirements.
> >
> > This driver also registers an IOMMU (SMMU) that's implemented by the
> > memory controller.
> >
> > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > ---
> >  drivers/memory/Kconfig                   |    9 +
> >  drivers/memory/Makefile                  |    1 +
> >  drivers/memory/tegra124-mc.c             | 1945 ++++++++++++++++++++++++++++++
> >  include/dt-bindings/memory/tegra124-mc.h |   30 +
> >  4 files changed, 1985 insertions(+)
> >  create mode 100644 drivers/memory/tegra124-mc.c
> >  create mode 100644 include/dt-bindings/memory/tegra124-mc.h
> 
> I would prefer to reuse the existing SMMU driver and keep MC and SMMU
> separate, since most of the SMMU code does not differ functionally and
> the new MC features are quite independent of the SMMU.
> 
> If it's really convenient to combine MC and SMMU into one driver, we
> could move "drivers/iommu/tegra-smmu.c" here first, and add the MC
> features on top of it.

I'm not sure if we can do that, since the tegra-smmu driver is
technically used by Tegra30 and Tegra114. We've never really made use of
it, but there are device trees in mainline releases that contain the
separate SMMU node.

Perhaps one of the DT folks can comment on whether it would be possible
to break compatibility with existing DTs in this case, given that the
SMMU on Tegra30 and Tegra114 has never been used.

Either way, I do see advantages in incremental patches, but at the same
time the old driver and architecture were never enabled (and therefore
never tested) upstream and, as shown by the Tegra DRM example, can't cope
with more complex cases. So I'm not completely convinced that an
incremental approach would be the best here.

Thierry

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 04/10] memory: Add Tegra124 memory controller support
  2014-06-27 11:07         ` Arnd Bergmann
  (?)
@ 2014-06-27 11:15           ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-06-27 11:15 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Mark Rutland, Will Deacon, Paul Walmsley, Ian Campbell,
	Marc Zyngier, Dave Martin, Olav Haugan,
	devicetree, Pawel Moll, Stephen Warren,
	Grant Grundler, Allen Martin, Rob Herring,
	linux-tegra, Cho KyongHo, linux-arm-kernel,
	linux-kernel, iommu, Kumar Gala,
	Rhyland Klein


On Fri, Jun 27, 2014 at 01:07:04PM +0200, Arnd Bergmann wrote:
> On Thursday 26 June 2014 22:49:44 Thierry Reding wrote:
> > +static const struct tegra_mc_client tegra124_mc_clients[] = {
> > +       {
> > +               .id = 0x01,
> > +               .name = "display0a",
> > +               .swgroup = TEGRA_SWGROUP_DC,
> > +               .smmu = {
> > +                       .reg = 0x228,
> > +                       .bit = 1,
> > +               },
> > +               .latency = {
> > +                       .reg = 0x2e8,
> > +                       .shift = 0,
> > +                       .mask = 0xff,
> > +                       .def = 0xc2,
> > +               },
> > +       }, {
> 
> This is a rather long table that I assume would need to get duplicated
> and modified for each specific SoC. Have you considered putting the
> information into DT instead, as auxiliary data in the iommu specifier as
> provided by the device?

Most of this data really is register information and I don't think that
belongs in DT. Also since this is fixed for a given SoC and in no way
configurable (well, with the exception of the .def field above) I don't
see any point in parsing this from device tree.

Also only the .smmu substruct is immediately relevant to the IOMMU part
of the driver. The .swgroup field could possibly also be moved into that
substructure since it is only relevant to the IOMMU.

So essentially what this table does is map SWGROUPs (which are provided
in the IOMMU specifier) to the clients and registers that the IOMMU
programming needs. As an analogy it corresponds roughly to the pins and
pingroups tables of pinctrl drivers. Those don't belong in device tree
either.
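
As a rough illustration of that mapping, here is a userspace sketch of a
table lookup keyed by SWGROUP. Only the "display0a" entry (ID 0x01, SMMU
register 0x228, bit 1) comes from the quoted patch. The other two entries,
the model_* names, the enum values and the fake register array are all
hypothetical and exist only to make the example self-contained.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the TEGRA_SWGROUP_* constants. */
enum model_swgroup {
	MODEL_SWGROUP_DC,
	MODEL_SWGROUP_OTHER,
};

struct model_mc_client {
	unsigned int id;
	const char *name;
	enum model_swgroup swgroup;
	struct {
		unsigned int reg; /* byte offset of the enable register */
		unsigned int bit; /* bit that enables translation */
	} smmu;
};

static const struct model_mc_client model_clients[] = {
	/* Taken from the quoted patch. */
	{ 0x01, "display0a", MODEL_SWGROUP_DC, { 0x228, 1 } },
	/* Hypothetical entries, for illustration only. */
	{ 0x02, "hyp-client-b", MODEL_SWGROUP_DC, { 0x228, 2 } },
	{ 0x03, "hyp-client-c", MODEL_SWGROUP_OTHER, { 0x22c, 0 } },
};

/* Fake register file standing in for the memory controller's MMIO space. */
#define MODEL_REG_SPACE 0x400
uint32_t model_regs[MODEL_REG_SPACE / 4];

/*
 * Enable translation for every client that belongs to the given SWGROUP.
 * Returns the number of clients that were touched.
 */
unsigned int model_smmu_enable(enum model_swgroup swgroup)
{
	unsigned int touched = 0;
	size_t i;

	for (i = 0; i < sizeof(model_clients) / sizeof(model_clients[0]); i++) {
		const struct model_mc_client *c = &model_clients[i];

		if (c->swgroup != swgroup)
			continue;

		model_regs[c->smmu.reg / 4] |= (uint32_t)1 << c->smmu.bit;
		touched++;
	}
	return touched;
}
```

The device tree would then only need to carry the SWGROUP in the IOMMU
specifier, while the per-SoC register layout stays in the driver's table.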

Thierry

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 04/10] memory: Add Tegra124 memory controller support
  2014-06-26 20:49     ` Thierry Reding
  (?)
@ 2014-06-27 13:29       ` Mikko Perttunen
  -1 siblings, 0 replies; 133+ messages in thread
From: Mikko Perttunen @ 2014-06-27 13:29 UTC (permalink / raw)
  To: Thierry Reding, Rob Herring, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, Stephen Warren, Arnd Bergmann,
	Will Deacon, Joerg Roedel
  Cc: Cho KyongHo, Grant Grundler, Dave Martin, Marc Zyngier,
	Hiroshi Doyu, Olav Haugan, Paul Walmsley, Rhyland Klein,
	Allen Martin, devicetree, iommu, linux-arm-kernel, linux-tegra,
	linux-kernel

In the future, the EMC driver will also want to read and write quite a
few registers in the MC block: MC_EMEM_*, the latency allowance
registers and a couple of others. Downstream just uses __raw_writel with
values from the EMC tables. A fun thing here is that while the values
are being written, the code cannot do some things, like reading
registers (I believe), without hanging, so calling into the MC driver to
write the changes might not be very nice either. Related to that,
reading from MC_EMEM_ADR_CFG is used as a barrier in the sequence.
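
As a rough sketch of that "write a batch, then read one register back"
pattern (a common MMIO idiom, not the downstream code): the register
file below is a plain array standing in for the MC block, and the helper
names, struct reg_write and all offsets passed in by the caller are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MC_WORDS (0x400 / 4)

/* Stand-in for the memory-mapped MC register block (assumption). */
static volatile uint32_t fake_mc[MC_WORDS];

struct reg_write {
	unsigned int offset;
	uint32_t value;
};

static void mc_writel(uint32_t value, unsigned int offset)
{
	assert(offset % 4 == 0 && offset / 4 < MC_WORDS);
	fake_mc[offset / 4] = value;
}

static uint32_t mc_readl(unsigned int offset)
{
	assert(offset % 4 == 0 && offset / 4 < MC_WORDS);
	return fake_mc[offset / 4];
}

/*
 * Write every entry of `seq`, then read back `barrier_offset`. On real
 * hardware the read-back is what forces the preceding posted writes to
 * complete; here it only returns the stored value.
 */
static uint32_t mc_write_sequence(const struct reg_write *seq, size_t count,
				  unsigned int barrier_offset)
{
	size_t i;

	for (i = 0; i < count; i++)
		mc_writel(seq[i].value, seq[i].offset);

	return mc_readl(barrier_offset);
}
```

The single read at the end is the only register read in the sequence,
which is consistent with the restriction described above, assuming that
particular read is safe at that point.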

On 26/06/14 23:49, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
>
> The memory controller on NVIDIA Tegra124 exposes various knobs that can
> be used to tune the behaviour of the clients attached to it.
>
> Currently this driver sets up the latency allowance registers to the HW
> defaults. Eventually an API should be exported by this driver (via a
> custom API or a generic subsystem) to allow clients to register latency
> requirements.

I cannot see where the downstream latency allowance code reloads
the latency allowance registers after an EMC clock rate change. Strange.
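
For reference, (re)programming one client's latency allowance boils down
to a read-modify-write of the field described by .shift and .mask in the
table. A minimal sketch, assuming the field layout from the patch; the
helper name la_apply() is hypothetical and the function works on a plain
value rather than a real register:

```c
#include <assert.h>
#include <stdint.h>

/* Same fields as struct latency_allowance in the quoted patch. */
struct latency_allowance {
	unsigned int reg, shift, mask, def;
};

/*
 * Return `old` with the client's latency field replaced by `value`.
 * Bits outside the field are preserved, since two clients share each
 * register.
 */
static uint32_t la_apply(uint32_t old, const struct latency_allowance *la,
			 unsigned int value)
{
	assert(la->shift < 32);

	old &= ~((uint32_t)la->mask << la->shift);
	old |= ((uint32_t)value & la->mask) << la->shift;

	return old;
}
```

For example, display0a (shift 0, default 0xc2) and display0b (shift 16,
default 0x50) both live in register 0x2e8, so restoring the defaults
after a rate change would mean applying la_apply() once per client to
the same register value before writing it back.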

>
> This driver also registers an IOMMU (SMMU) that's implemented by the
> memory controller.
>
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>   drivers/memory/Kconfig                   |    9 +
>   drivers/memory/Makefile                  |    1 +
>   drivers/memory/tegra124-mc.c             | 1945 ++++++++++++++++++++++++++++++
>   include/dt-bindings/memory/tegra124-mc.h |   30 +
>   4 files changed, 1985 insertions(+)
>   create mode 100644 drivers/memory/tegra124-mc.c
>   create mode 100644 include/dt-bindings/memory/tegra124-mc.h
>
> diff --git a/drivers/memory/Kconfig b/drivers/memory/Kconfig
> index c59e9c96e86d..d0f0e6781570 100644
> --- a/drivers/memory/Kconfig
> +++ b/drivers/memory/Kconfig
> @@ -61,6 +61,15 @@ config TEGRA30_MC
>            analysis, especially for IOMMU/SMMU(System Memory Management
>            Unit) module.
>
> +config TEGRA124_MC
> +       bool "Tegra124 Memory Controller driver"
> +       depends on ARCH_TEGRA
> +       select IOMMU_API
> +       help
> +         This driver is for the Memory Controller module available on
> +         Tegra124 SoCs. It provides an IOMMU that can be used for I/O
> +         virtual address translation.
> +
>   config FSL_IFC
>          bool
>          depends on FSL_SOC
> diff --git a/drivers/memory/Makefile b/drivers/memory/Makefile
> index 71160a2b7313..03143927abab 100644
> --- a/drivers/memory/Makefile
> +++ b/drivers/memory/Makefile
> @@ -11,3 +11,4 @@ obj-$(CONFIG_FSL_IFC)         += fsl_ifc.o
>   obj-$(CONFIG_MVEBU_DEVBUS)     += mvebu-devbus.o
>   obj-$(CONFIG_TEGRA20_MC)       += tegra20-mc.o
>   obj-$(CONFIG_TEGRA30_MC)       += tegra30-mc.o
> +obj-$(CONFIG_TEGRA124_MC)      += tegra124-mc.o
> diff --git a/drivers/memory/tegra124-mc.c b/drivers/memory/tegra124-mc.c
> new file mode 100644
> index 000000000000..741755b6785d
> --- /dev/null
> +++ b/drivers/memory/tegra124-mc.c
> @@ -0,0 +1,1945 @@
> +/*
> + * Copyright (C) 2014 NVIDIA CORPORATION.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/iommu.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +
> +#include <dt-bindings/memory/tegra124-mc.h>
> +
> +#include <asm/cacheflush.h>
> +#ifndef CONFIG_ARM64
> +#include <asm/dma-iommu.h>
> +#endif
> +
> +#define MC_INTSTATUS 0x000
> +#define  MC_INT_DECERR_MTS (1 << 16)
> +#define  MC_INT_SECERR_SEC (1 << 13)
> +#define  MC_INT_DECERR_VPR (1 << 12)
> +#define  MC_INT_INVALID_APB_ASID_UPDATE (1 << 11)
> +#define  MC_INT_INVALID_SMMU_PAGE (1 << 10)
> +#define  MC_INT_ARBITRATION_EMEM (1 << 9)
> +#define  MC_INT_SECURITY_VIOLATION (1 << 8)
> +#define  MC_INT_DECERR_EMEM (1 << 6)
> +#define MC_INTMASK 0x004
> +#define MC_ERR_STATUS 0x08
> +#define MC_ERR_ADR 0x0c
> +
> +struct latency_allowance {
> +       unsigned int reg;
> +       unsigned int shift;
> +       unsigned int mask;
> +       unsigned int def;
> +};
> +
> +struct smmu_enable {
> +       unsigned int reg;
> +       unsigned int bit;
> +};
> +
> +struct tegra_mc_client {
> +       unsigned int id;
> +       const char *name;
> +       unsigned int swgroup;
> +
> +       struct smmu_enable smmu;
> +       struct latency_allowance latency;
> +};
> +
> +static const struct tegra_mc_client tegra124_mc_clients[] = {
> +       {
> +               .id = 0x01,
> +               .name = "display0a",
> +               .swgroup = TEGRA_SWGROUP_DC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 1,
> +               },
> +               .latency = {
> +                       .reg = 0x2e8,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0xc2,
> +               },
> +       }, {
> +               .id = 0x02,
> +               .name = "display0ab",
> +               .swgroup = TEGRA_SWGROUP_DCB,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 2,
> +               },
> +               .latency = {
> +                       .reg = 0x2f4,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0xc6,
> +               },
> +       }, {
> +               .id = 0x03,
> +               .name = "display0b",
> +               .swgroup = TEGRA_SWGROUP_DC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 3,
> +               },
> +               .latency = {
> +                       .reg = 0x2e8,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x04,
> +               .name = "display0bb",
> +               .swgroup = TEGRA_SWGROUP_DCB,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 4,
> +               },
> +               .latency = {
> +                       .reg = 0x2f4,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x05,
> +               .name = "display0c",
> +               .swgroup = TEGRA_SWGROUP_DC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 5,
> +               },
> +               .latency = {
> +                       .reg = 0x2ec,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x06,
> +               .name = "display0cb",
> +               .swgroup = TEGRA_SWGROUP_DCB,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 6,
> +               },
> +               .latency = {
> +                       .reg = 0x2f8,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x0e,
> +               .name = "afir",
> +               .swgroup = TEGRA_SWGROUP_AFI,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 14,
> +               },
> +               .latency = {
> +                       .reg = 0x2e0,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x13,
> +               },
> +       }, {
> +               .id = 0x0f,
> +               .name = "avpcarm7r",
> +               .swgroup = TEGRA_SWGROUP_AVPC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 15,
> +               },
> +               .latency = {
> +                       .reg = 0x2e4,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x04,
> +               },
> +       }, {
> +               .id = 0x10,
> +               .name = "displayhc",
> +               .swgroup = TEGRA_SWGROUP_DC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 16,
> +               },
> +               .latency = {
> +                       .reg = 0x2f0,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x11,
> +               .name = "displayhcb",
> +               .swgroup = TEGRA_SWGROUP_DCB,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 17,
> +               },
> +               .latency = {
> +                       .reg = 0x2fc,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x15,
> +               .name = "hdar",
> +               .swgroup = TEGRA_SWGROUP_HDA,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 21,
> +               },
> +               .latency = {
> +                       .reg = 0x318,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x24,
> +               },
> +       }, {
> +               .id = 0x16,
> +               .name = "host1xdmar",
> +               .swgroup = TEGRA_SWGROUP_HC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 22,
> +               },
> +               .latency = {
> +                       .reg = 0x310,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x1e,
> +               },
> +       }, {
> +               .id = 0x17,
> +               .name = "host1xr",
> +               .swgroup = TEGRA_SWGROUP_HC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 23,
> +               },
> +               .latency = {
> +                       .reg = 0x310,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x1c,
> +               .name = "msencsrd",
> +               .swgroup = TEGRA_SWGROUP_MSENC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 28,
> +               },
> +               .latency = {
> +                       .reg = 0x328,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x23,
> +               },
> +       }, {
> +               .id = 0x1d,
> +               .name = "ppcsahbdmarhdar",
> +               .swgroup = TEGRA_SWGROUP_PPCS,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 29,
> +               },
> +               .latency = {
> +                       .reg = 0x344,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x49,
> +               },
> +       }, {
> +               .id = 0x1e,
> +               .name = "ppcsahbslvr",
> +               .swgroup = TEGRA_SWGROUP_PPCS,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 30,
> +               },
> +               .latency = {
> +                       .reg = 0x344,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x1a,
> +               },
> +       }, {
> +               .id = 0x1f,
> +               .name = "satar",
> +               .swgroup = TEGRA_SWGROUP_SATA,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 31,
> +               },
> +               .latency = {
> +                       .reg = 0x350,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x65,
> +               },
> +       }, {
> +               .id = 0x22,
> +               .name = "vdebsevr",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 2,
> +               },
> +               .latency = {
> +                       .reg = 0x354,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x4f,
> +               },
> +       }, {
> +               .id = 0x23,
> +               .name = "vdember",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 3,
> +               },
> +               .latency = {
> +                       .reg = 0x354,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x3d,
> +               },
> +       }, {
> +               .id = 0x24,
> +               .name = "vdemcer",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 4,
> +               },
> +               .latency = {
> +                       .reg = 0x358,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x66,
> +               },
> +       }, {
> +               .id = 0x25,
> +               .name = "vdetper",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 5,
> +               },
> +               .latency = {
> +                       .reg = 0x358,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0xa5,
> +               },
> +       }, {
> +               .id = 0x26,
> +               .name = "mpcorelpr",
> +               .swgroup = TEGRA_SWGROUP_MPCORELP,
> +               .latency = {
> +                       .reg = 0x324,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x04,
> +               },
> +       }, {
> +               .id = 0x27,
> +               .name = "mpcorer",
> +               .swgroup = TEGRA_SWGROUP_MPCORE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 2,
> +               },
> +               .latency = {
> +                       .reg = 0x320,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x04,
> +               },
> +       }, {
> +               .id = 0x2b,
> +               .name = "msencswr",
> +               .swgroup = TEGRA_SWGROUP_MSENC,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 11,
> +               },
> +               .latency = {
> +                       .reg = 0x328,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x31,
> +               .name = "afiw",
> +               .swgroup = TEGRA_SWGROUP_AFI,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 17,
> +               },
> +               .latency = {
> +                       .reg = 0x2e0,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x32,
> +               .name = "avpcarm7w",
> +               .swgroup = TEGRA_SWGROUP_AVPC,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 18,
> +               },
> +               .latency = {
> +                       .reg = 0x2e4,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x35,
> +               .name = "hdaw",
> +               .swgroup = TEGRA_SWGROUP_HDA,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 21,
> +               },
> +               .latency = {
> +                       .reg = 0x318,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x36,
> +               .name = "host1xw",
> +               .swgroup = TEGRA_SWGROUP_HC,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 22,
> +               },
> +               .latency = {
> +                       .reg = 0x314,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x38,
> +               .name = "mpcorelpw",
> +               .swgroup = TEGRA_SWGROUP_MPCORELP,
> +               .latency = {
> +                       .reg = 0x324,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x39,
> +               .name = "mpcorew",
> +               .swgroup = TEGRA_SWGROUP_MPCORE,
> +               .latency = {
> +                       .reg = 0x320,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x3b,
> +               .name = "ppcsahbdmaw",
> +               .swgroup = TEGRA_SWGROUP_PPCS,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 27,
> +               },
> +               .latency = {
> +                       .reg = 0x348,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x3c,
> +               .name = "ppcsahbslvw",
> +               .swgroup = TEGRA_SWGROUP_PPCS,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 28,
> +               },
> +               .latency = {
> +                       .reg = 0x348,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x3d,
> +               .name = "sataw",
> +               .swgroup = TEGRA_SWGROUP_SATA,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 29,
> +               },
> +               .latency = {
> +                       .reg = 0x350,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x65,
> +               },
> +       }, {
> +               .id = 0x3e,
> +               .name = "vdebsevw",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 30,
> +               },
> +               .latency = {
> +                       .reg = 0x35c,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x3f,
> +               .name = "vdedbgw",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 31,
> +               },
> +               .latency = {
> +                       .reg = 0x35c,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x40,
> +               .name = "vdembew",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 0,
> +               },
> +               .latency = {
> +                       .reg = 0x360,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x41,
> +               .name = "vdetpmw",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 1,
> +               },
> +               .latency = {
> +                       .reg = 0x360,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x44,
> +               .name = "ispra",
> +               .swgroup = TEGRA_SWGROUP_ISP2,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 4,
> +               },
> +               .latency = {
> +                       .reg = 0x370,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x18,
> +               },
> +       }, {
> +               .id = 0x46,
> +               .name = "ispwa",
> +               .swgroup = TEGRA_SWGROUP_ISP2,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 6,
> +               },
> +               .latency = {
> +                       .reg = 0x374,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x47,
> +               .name = "ispwb",
> +               .swgroup = TEGRA_SWGROUP_ISP2,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 7,
> +               },
> +               .latency = {
> +                       .reg = 0x374,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x4a,
> +               .name = "xusb_hostr",
> +               .swgroup = TEGRA_SWGROUP_XUSB_HOST,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 10,
> +               },
> +               .latency = {
> +                       .reg = 0x37c,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x39,
> +               },
> +       }, {
> +               .id = 0x4b,
> +               .name = "xusb_hostw",
> +               .swgroup = TEGRA_SWGROUP_XUSB_HOST,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 11,
> +               },
> +               .latency = {
> +                       .reg = 0x37c,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x4c,
> +               .name = "xusb_devr",
> +               .swgroup = TEGRA_SWGROUP_XUSB_DEV,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 12,
> +               },
> +               .latency = {
> +                       .reg = 0x380,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x39,
> +               },
> +       }, {
> +               .id = 0x4d,
> +               .name = "xusb_devw",
> +               .swgroup = TEGRA_SWGROUP_XUSB_DEV,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 13,
> +               },
> +               .latency = {
> +                       .reg = 0x380,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x4e,
> +               .name = "isprab",
> +               .swgroup = TEGRA_SWGROUP_ISP2B,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 14,
> +               },
> +               .latency = {
> +                       .reg = 0x384,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x18,
> +               },
> +       }, {
> +               .id = 0x50,
> +               .name = "ispwab",
> +               .swgroup = TEGRA_SWGROUP_ISP2B,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 16,
> +               },
> +               .latency = {
> +                       .reg = 0x388,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x51,
> +               .name = "ispwbb",
> +               .swgroup = TEGRA_SWGROUP_ISP2B,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 17,
> +               },
> +               .latency = {
> +                       .reg = 0x388,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x54,
> +               .name = "tsecsrd",
> +               .swgroup = TEGRA_SWGROUP_TSEC,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 20,
> +               },
> +               .latency = {
> +                       .reg = 0x390,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x9b,
> +               },
> +       }, {
> +               .id = 0x55,
> +               .name = "tsecswr",
> +               .swgroup = TEGRA_SWGROUP_TSEC,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 21,
> +               },
> +               .latency = {
> +                       .reg = 0x390,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x56,
> +               .name = "a9avpscr",
> +               .swgroup = TEGRA_SWGROUP_A9AVP,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 22,
> +               },
> +               .latency = {
> +                       .reg = 0x3a4,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x04,
> +               },
> +       }, {
> +               .id = 0x57,
> +               .name = "a9avpscw",
> +               .swgroup = TEGRA_SWGROUP_A9AVP,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 23,
> +               },
> +               .latency = {
> +                       .reg = 0x3a4,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x58,
> +               .name = "gpusrd",
> +               .swgroup = TEGRA_SWGROUP_GPU,
> +               .smmu = {
> +                       /* read-only */
> +                       .reg = 0x230,
> +                       .bit = 24,
> +               },
> +               .latency = {
> +                       .reg = 0x3c8,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x1a,
> +               },
> +       }, {
> +               .id = 0x59,
> +               .name = "gpuswr",
> +               .swgroup = TEGRA_SWGROUP_GPU,
> +               .smmu = {
> +                       /* read-only */
> +                       .reg = 0x230,
> +                       .bit = 25,
> +               },
> +               .latency = {
> +                       .reg = 0x3c8,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x5a,
> +               .name = "displayt",
> +               .swgroup = TEGRA_SWGROUP_DC,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 26,
> +               },
> +               .latency = {
> +                       .reg = 0x2f0,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x60,
> +               .name = "sdmmcra",
> +               .swgroup = TEGRA_SWGROUP_SDMMC1A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 0,
> +               },
> +               .latency = {
> +                       .reg = 0x3b8,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x49,
> +               },
> +       }, {
> +               .id = 0x61,
> +               .name = "sdmmcraa",
> +               .swgroup = TEGRA_SWGROUP_SDMMC2A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 1,
> +               },
> +               .latency = {
> +                       .reg = 0x3bc,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x49,
> +               },
> +       }, {
> +               .id = 0x62,
> +               .name = "sdmmcr",
> +               .swgroup = TEGRA_SWGROUP_SDMMC3A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 2,
> +               },
> +               .latency = {
> +                       .reg = 0x3c0,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x49,
> +               },
> +       }, {
> +               .id = 0x63,
> +               .name = "sdmmcrab",
> +               .swgroup = TEGRA_SWGROUP_SDMMC4A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 3,
> +               },
> +               .latency = {
> +                       .reg = 0x3c4,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x49,
> +               },
> +       }, {
> +               .id = 0x64,
> +               .name = "sdmmcwa",
> +               .swgroup = TEGRA_SWGROUP_SDMMC1A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 4,
> +               },
> +               .latency = {
> +                       .reg = 0x3b8,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x65,
> +               .name = "sdmmcwaa",
> +               .swgroup = TEGRA_SWGROUP_SDMMC2A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 5,
> +               },
> +               .latency = {
> +                       .reg = 0x3bc,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x66,
> +               .name = "sdmmcw",
> +               .swgroup = TEGRA_SWGROUP_SDMMC3A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 6,
> +               },
> +               .latency = {
> +                       .reg = 0x3c0,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x67,
> +               .name = "sdmmcwab",
> +               .swgroup = TEGRA_SWGROUP_SDMMC4A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 7,
> +               },
> +               .latency = {
> +                       .reg = 0x3c4,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x6c,
> +               .name = "vicsrd",
> +               .swgroup = TEGRA_SWGROUP_VIC,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 12,
> +               },
> +               .latency = {
> +                       .reg = 0x394,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x1a,
> +               },
> +       }, {
> +               .id = 0x6d,
> +               .name = "vicswr",
> +               .swgroup = TEGRA_SWGROUP_VIC,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 13,
> +               },
> +               .latency = {
> +                       .reg = 0x394,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x72,
> +               .name = "viw",
> +               .swgroup = TEGRA_SWGROUP_VI,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 18,
> +               },
> +               .latency = {
> +                       .reg = 0x398,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x73,
> +               .name = "displayd",
> +               .swgroup = TEGRA_SWGROUP_DC,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 19,
> +               },
> +               .latency = {
> +                       .reg = 0x3c8,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       },
> +};
> +
> +struct tegra_smmu_swgroup {
> +       unsigned int swgroup;
> +       unsigned int reg;
> +};
> +
> +static const struct tegra_smmu_swgroup tegra124_swgroups[] = {
> +       { .swgroup = TEGRA_SWGROUP_DC,        .reg = 0x240 },
> +       { .swgroup = TEGRA_SWGROUP_DCB,       .reg = 0x244 },
> +       { .swgroup = TEGRA_SWGROUP_AFI,       .reg = 0x238 },
> +       { .swgroup = TEGRA_SWGROUP_AVPC,      .reg = 0x23c },
> +       { .swgroup = TEGRA_SWGROUP_HDA,       .reg = 0x254 },
> +       { .swgroup = TEGRA_SWGROUP_HC,        .reg = 0x250 },
> +       { .swgroup = TEGRA_SWGROUP_MSENC,     .reg = 0x264 },
> +       { .swgroup = TEGRA_SWGROUP_PPCS,      .reg = 0x270 },
> +       { .swgroup = TEGRA_SWGROUP_SATA,      .reg = 0x274 },
> +       { .swgroup = TEGRA_SWGROUP_VDE,       .reg = 0x27c },
> +       { .swgroup = TEGRA_SWGROUP_ISP2,      .reg = 0x258 },
> +       { .swgroup = TEGRA_SWGROUP_XUSB_HOST, .reg = 0x288 },
> +       { .swgroup = TEGRA_SWGROUP_XUSB_DEV,  .reg = 0x28c },
> +       { .swgroup = TEGRA_SWGROUP_ISP2B,     .reg = 0xaa4 },
> +       { .swgroup = TEGRA_SWGROUP_TSEC,      .reg = 0x294 },
> +       { .swgroup = TEGRA_SWGROUP_A9AVP,     .reg = 0x290 },
> +       { .swgroup = TEGRA_SWGROUP_GPU,       .reg = 0xaa8 },
> +       { .swgroup = TEGRA_SWGROUP_SDMMC1A,   .reg = 0xa94 },
> +       { .swgroup = TEGRA_SWGROUP_SDMMC2A,   .reg = 0xa98 },
> +       { .swgroup = TEGRA_SWGROUP_SDMMC3A,   .reg = 0xa9c },
> +       { .swgroup = TEGRA_SWGROUP_SDMMC4A,   .reg = 0xaa0 },
> +       { .swgroup = TEGRA_SWGROUP_VIC,       .reg = 0x284 },
> +       { .swgroup = TEGRA_SWGROUP_VI,        .reg = 0x280 },
> +};
> +
> +struct tegra_smmu_group_init {
> +       unsigned int asid;
> +       const char *name;
> +
> +       const struct of_device_id *matches;
> +};
> +
> +struct tegra_smmu_soc {
> +       const struct tegra_smmu_group_init *groups;
> +       unsigned int num_groups;
> +
> +       const struct tegra_mc_client *clients;
> +       unsigned int num_clients;
> +
> +       const struct tegra_smmu_swgroup *swgroups;
> +       unsigned int num_swgroups;
> +
> +       unsigned int num_asids;
> +       unsigned int atom_size;
> +
> +       const struct tegra_smmu_ops *ops;
> +};
> +
> +struct tegra_smmu_ops {
> +       void (*flush_dcache)(struct page *page, unsigned long offset,
> +                            size_t size);
> +};
> +
> +struct tegra_smmu_master {
> +       struct list_head list;
> +       struct device *dev;
> +};
> +
> +struct tegra_smmu_group {
> +       const char *name;
> +       const struct of_device_id *matches;
> +       unsigned int asid;
> +
> +#ifndef CONFIG_ARM64
> +       struct dma_iommu_mapping *mapping;
> +#endif
> +       struct list_head masters;
> +};
> +
> +static const struct of_device_id tegra124_periph_matches[] = {
> +       { .compatible = "nvidia,tegra124-sdhci", },
> +       { }
> +};
> +
> +static const struct tegra_smmu_group_init tegra124_smmu_groups[] = {
> +       { 0, "peripherals", tegra124_periph_matches },
> +};
> +
> +static void tegra_smmu_group_release(void *data)
> +{
> +       kfree(data);
> +}
> +
> +struct tegra_smmu {
> +       void __iomem *regs;
> +       struct iommu iommu;
> +       struct device *dev;
> +
> +       const struct tegra_smmu_soc *soc;
> +
> +       struct iommu_group **groups;
> +       unsigned int num_groups;
> +
> +       unsigned long *asids;
> +       struct mutex lock;
> +};
> +
> +struct tegra_smmu_address_space {
> +       struct iommu_domain *domain;
> +       struct tegra_smmu *smmu;
> +       struct page *pd;
> +       unsigned int id;
> +       u32 attr;
> +};
> +
> +static inline void smmu_writel(struct tegra_smmu *smmu, u32 value,
> +                              unsigned long offset)
> +{
> +       writel(value, smmu->regs + offset);
> +}
> +
> +static inline u32 smmu_readl(struct tegra_smmu *smmu, unsigned long offset)
> +{
> +       return readl(smmu->regs + offset);
> +}
> +
> +#define SMMU_CONFIG 0x010
> +#define  SMMU_CONFIG_ENABLE (1 << 0)
> +
> +#define SMMU_PTB_ASID 0x01c
> +#define  SMMU_PTB_ASID_VALUE(x) ((x) & 0x7f)
> +
> +#define SMMU_PTB_DATA 0x020
> +#define  SMMU_PTB_DATA_VALUE(page, attr) (page_to_phys(page) >> 12 | (attr))
> +
> +#define SMMU_MK_PDE(page, attr) (page_to_phys(page) >> SMMU_PTE_SHIFT | (attr))
> +
> +#define SMMU_TLB_FLUSH 0x030
> +#define  SMMU_TLB_FLUSH_VA_MATCH_ALL     (0 << 0)
> +#define  SMMU_TLB_FLUSH_VA_MATCH_SECTION (2 << 0)
> +#define  SMMU_TLB_FLUSH_VA_MATCH_GROUP   (3 << 0)
> +#define  SMMU_TLB_FLUSH_ASID(x)          (((x) & 0x7f) << 24)
> +#define  SMMU_TLB_FLUSH_VA_SECTION(addr) ((((addr) & 0xffc00000) >> 12) | \
> +                                         SMMU_TLB_FLUSH_VA_MATCH_SECTION)
> +#define  SMMU_TLB_FLUSH_VA_GROUP(addr)   ((((addr) & 0xffffc000) >> 12) | \
> +                                         SMMU_TLB_FLUSH_VA_MATCH_GROUP)
> +#define  SMMU_TLB_FLUSH_ASID_MATCH       (1 << 31)
> +
> +#define SMMU_PTC_FLUSH 0x034
> +#define  SMMU_PTC_FLUSH_TYPE_ALL (0 << 0)
> +#define  SMMU_PTC_FLUSH_TYPE_ADR (1 << 0)
> +
> +#define SMMU_PTC_FLUSH_HI 0x9b8
> +#define  SMMU_PTC_FLUSH_HI_MASK 0x3
> +
> +/* per-SWGROUP SMMU_*_ASID register */
> +#define SMMU_ASID_ENABLE (1 << 31)
> +#define SMMU_ASID_MASK 0x7f
> +#define SMMU_ASID_VALUE(x) ((x) & SMMU_ASID_MASK)
> +
> +/* page table definitions */
> +#define SMMU_NUM_PDE 1024
> +#define SMMU_NUM_PTE 1024
> +
> +#define SMMU_SIZE_PD (SMMU_NUM_PDE * 4)
> +#define SMMU_SIZE_PT (SMMU_NUM_PTE * 4)
> +
> +#define SMMU_PDE_SHIFT 22
> +#define SMMU_PTE_SHIFT 12
> +
> +#define SMMU_PFN_MASK 0x000fffff
> +
> +#define SMMU_PD_READABLE       (1 << 31)
> +#define SMMU_PD_WRITABLE       (1 << 30)
> +#define SMMU_PD_NONSECURE      (1 << 29)
> +
> +#define SMMU_PDE_READABLE      (1 << 31)
> +#define SMMU_PDE_WRITABLE      (1 << 30)
> +#define SMMU_PDE_NONSECURE     (1 << 29)
> +#define SMMU_PDE_NEXT          (1 << 28)
> +
> +#define SMMU_PTE_READABLE      (1 << 31)
> +#define SMMU_PTE_WRITABLE      (1 << 30)
> +#define SMMU_PTE_NONSECURE     (1 << 29)
> +
> +#define SMMU_PDE_ATTR          (SMMU_PDE_READABLE | SMMU_PDE_WRITABLE | \
> +                                SMMU_PDE_NONSECURE)
> +#define SMMU_PTE_ATTR          (SMMU_PTE_READABLE | SMMU_PTE_WRITABLE | \
> +                                SMMU_PTE_NONSECURE)
> +
> +#define SMMU_PDE_VACANT(n)     (((n) << 10) | SMMU_PDE_ATTR)
> +#define SMMU_PTE_VACANT(n)     (((n) << 12) | SMMU_PTE_ATTR)
> +
> +#ifdef CONFIG_ARCH_TEGRA_124_SOC
> +static void tegra124_flush_dcache(struct page *page, unsigned long offset,
> +                                 size_t size)
> +{
> +       phys_addr_t phys = page_to_phys(page) + offset;
> +       void *virt = page_address(page) + offset;
> +
> +       __cpuc_flush_dcache_area(virt, size);
> +       outer_flush_range(phys, phys + size);
> +}
> +
> +static const struct tegra_smmu_ops tegra124_smmu_ops = {
> +       .flush_dcache = tegra124_flush_dcache,
> +};
> +#endif
> +
> +static void tegra132_flush_dcache(struct page *page, unsigned long offset,
> +                                 size_t size)
> +{
> +       /* TODO: implement */
> +}
> +
> +static const struct tegra_smmu_ops tegra132_smmu_ops = {
> +       .flush_dcache = tegra132_flush_dcache,
> +};
> +
> +static inline void smmu_flush_ptc(struct tegra_smmu *smmu, struct page *page,
> +                                 unsigned long offset)
> +{
> +       phys_addr_t phys = page ? page_to_phys(page) : 0;
> +       u32 value;
> +
> +       if (page) {
> +               offset &= ~(smmu->soc->atom_size - 1);
> +
> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
> +               value = (phys >> 32) & SMMU_PTC_FLUSH_HI_MASK;
> +#else
> +               value = 0;
> +#endif
> +               smmu_writel(smmu, value, SMMU_PTC_FLUSH_HI);
> +
> +               value = (phys + offset) | SMMU_PTC_FLUSH_TYPE_ADR;
> +       } else {
> +               value = SMMU_PTC_FLUSH_TYPE_ALL;
> +       }
> +
> +       smmu_writel(smmu, value, SMMU_PTC_FLUSH);
> +}
> +
> +static inline void smmu_flush_tlb(struct tegra_smmu *smmu)
> +{
> +       smmu_writel(smmu, SMMU_TLB_FLUSH_VA_MATCH_ALL, SMMU_TLB_FLUSH);
> +}
> +
> +static inline void smmu_flush_tlb_asid(struct tegra_smmu *smmu,
> +                                      unsigned long asid)
> +{
> +       u32 value;
> +
> +       value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
> +               SMMU_TLB_FLUSH_VA_MATCH_ALL;
> +       smmu_writel(smmu, value, SMMU_TLB_FLUSH);
> +}
> +
> +static inline void smmu_flush_tlb_section(struct tegra_smmu *smmu,
> +                                         unsigned long asid,
> +                                         unsigned long iova)
> +{
> +       u32 value;
> +
> +       value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
> +               SMMU_TLB_FLUSH_VA_SECTION(iova);
> +       smmu_writel(smmu, value, SMMU_TLB_FLUSH);
> +}
> +
> +static inline void smmu_flush_tlb_group(struct tegra_smmu *smmu,
> +                                       unsigned long asid,
> +                                       unsigned long iova)
> +{
> +       u32 value;
> +
> +       value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
> +               SMMU_TLB_FLUSH_VA_GROUP(iova);
> +       smmu_writel(smmu, value, SMMU_TLB_FLUSH);
> +}
> +
> +static inline void smmu_flush(struct tegra_smmu *smmu)
> +{
> +       smmu_readl(smmu, SMMU_CONFIG);
> +}
> +
> +static inline struct tegra_smmu *to_tegra_smmu(struct iommu *iommu)
> +{
> +       return container_of(iommu, struct tegra_smmu, iommu);
> +}
> +
> +static struct tegra_smmu *smmu_handle = NULL;
> +
> +static int tegra_smmu_alloc_asid(struct tegra_smmu *smmu, unsigned int *idp)
> +{
> +       unsigned long id;
> +
> +       mutex_lock(&smmu->lock);
> +
> +       id = find_first_zero_bit(smmu->asids, smmu->soc->num_asids);
> +       if (id >= smmu->soc->num_asids) {
> +               mutex_unlock(&smmu->lock);
> +               return -ENOSPC;
> +       }
> +
> +       set_bit(id, smmu->asids);
> +       *idp = id;
> +
> +       mutex_unlock(&smmu->lock);
> +       return 0;
> +}
> +
> +static void tegra_smmu_free_asid(struct tegra_smmu *smmu, unsigned int id)
> +{
> +       mutex_lock(&smmu->lock);
> +       clear_bit(id, smmu->asids);
> +       mutex_unlock(&smmu->lock);
> +}
> +
> +struct tegra_smmu_address_space *foo = NULL;
> +
> +static int tegra_smmu_domain_init(struct iommu_domain *domain)
> +{
> +       struct tegra_smmu *smmu = smmu_handle;
> +       struct tegra_smmu_address_space *as;
> +       uint32_t *pd, value;
> +       unsigned int i;
> +       int err = 0;
> +
> +       as = kzalloc(sizeof(*as), GFP_KERNEL);
> +       if (!as) {
> +               err = -ENOMEM;
> +               goto out;
> +       }
> +
> +       as->attr = SMMU_PD_READABLE | SMMU_PD_WRITABLE | SMMU_PD_NONSECURE;
> +       as->smmu = smmu_handle;
> +       as->domain = domain;
> +
> +       err = tegra_smmu_alloc_asid(smmu, &as->id);
> +       if (err < 0) {
> +               kfree(as);
> +               goto out;
> +       }
> +
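> +       as->pd = alloc_page(GFP_KERNEL | __GFP_DMA);
> +       if (!as->pd) {
> +               /* don't leak the ASID or the address space on failure */
> +               tegra_smmu_free_asid(smmu, as->id);
> +               kfree(as);
> +               err = -ENOMEM;
> +               goto out;
> +       }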
> +
> +       pd = page_address(as->pd);
> +       SetPageReserved(as->pd);
> +
> +       for (i = 0; i < SMMU_NUM_PDE; i++)
> +               pd[i] = SMMU_PDE_VACANT(i);
> +
> +       smmu->soc->ops->flush_dcache(as->pd, 0, SMMU_SIZE_PD);
> +       smmu_flush_ptc(smmu, as->pd, 0);
> +       smmu_flush_tlb_asid(smmu, as->id);
> +
> +       smmu_writel(smmu, SMMU_PTB_ASID_VALUE(as->id), SMMU_PTB_ASID);
> +       value = SMMU_PTB_DATA_VALUE(as->pd, as->attr);
> +       smmu_writel(smmu, value, SMMU_PTB_DATA);
> +       smmu_flush(smmu);
> +
> +       domain->priv = as;
> +
> +       return 0;
> +
> +out:
> +       return err;
> +}
> +
> +static void tegra_smmu_domain_destroy(struct iommu_domain *domain)
> +{
> +       struct tegra_smmu_address_space *as = domain->priv;
> +
> +       /* TODO: free page directory and page tables */
> +
> +       tegra_smmu_free_asid(as->smmu, as->id);
> +       kfree(as);
> +}
> +
> +static const struct tegra_smmu_swgroup *
> +tegra_smmu_find_swgroup(struct tegra_smmu *smmu, unsigned int swgroup)
> +{
> +       const struct tegra_smmu_swgroup *group = NULL;
> +       unsigned int i;
> +
> +       for (i = 0; i < smmu->soc->num_swgroups; i++) {
> +               if (smmu->soc->swgroups[i].swgroup == swgroup) {
> +                       group = &smmu->soc->swgroups[i];
> +                       break;
> +               }
> +       }
> +
> +       return group;
> +}
> +
> +static int tegra_smmu_enable(struct tegra_smmu *smmu, unsigned int swgroup,
> +                            unsigned int asid)
> +{
> +       const struct tegra_smmu_swgroup *group;
> +       unsigned int i;
> +       u32 value;
> +
> +       for (i = 0; i < smmu->soc->num_clients; i++) {
> +               const struct tegra_mc_client *client = &smmu->soc->clients[i];
> +
> +               if (client->swgroup != swgroup)
> +                       continue;
> +
> +               value = smmu_readl(smmu, client->smmu.reg);
> +               value |= BIT(client->smmu.bit);
> +               smmu_writel(smmu, value, client->smmu.reg);
> +       }
> +
> +       group = tegra_smmu_find_swgroup(smmu, swgroup);
> +       if (group) {
> +               value = smmu_readl(smmu, group->reg);
> +               value &= ~SMMU_ASID_MASK;
> +               value |= SMMU_ASID_VALUE(asid);
> +               value |= SMMU_ASID_ENABLE;
> +               smmu_writel(smmu, value, group->reg);
> +       }
> +
> +       return 0;
> +}
> +
> +static int tegra_smmu_disable(struct tegra_smmu *smmu, unsigned int swgroup,
> +                             unsigned int asid)
> +{
> +       const struct tegra_smmu_swgroup *group;
> +       unsigned int i;
> +       u32 value;
> +
> +       group = tegra_smmu_find_swgroup(smmu, swgroup);
> +       if (group) {
> +               value = smmu_readl(smmu, group->reg);
> +               value &= ~SMMU_ASID_MASK;
> +               value |= SMMU_ASID_VALUE(asid);
> +               value &= ~SMMU_ASID_ENABLE;
> +               smmu_writel(smmu, value, group->reg);
> +       }
> +
> +       for (i = 0; i < smmu->soc->num_clients; i++) {
> +               const struct tegra_mc_client *client = &smmu->soc->clients[i];
> +
> +               if (client->swgroup != swgroup)
> +                       continue;
> +
> +               value = smmu_readl(smmu, client->smmu.reg);
> +               value &= ~BIT(client->smmu.bit);
> +               smmu_writel(smmu, value, client->smmu.reg);
> +       }
> +
> +       return 0;
> +}
> +
> +static int tegra_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> +{
> +       struct tegra_smmu_address_space *as = domain->priv;
> +       struct tegra_smmu *smmu = as->smmu;
> +       struct of_phandle_iter entry;
> +       int err;
> +
> +       of_property_for_each_phandle_with_args(entry, dev->of_node, "iommus",
> +                                              "#iommu-cells", 0) {
> +               unsigned int swgroup = entry.out_args.args[0];
> +
> +               if (entry.out_args.np != smmu->dev->of_node)
> +                       continue;
> +
> +               err = tegra_smmu_enable(smmu, swgroup, as->id);
> +               if (err < 0)
> +                       pr_err("failed to enable SWGROUP#%u\n", swgroup);
> +       }
> +
> +       return 0;
> +}
> +
> +static void tegra_smmu_detach_dev(struct iommu_domain *domain, struct device *dev)
> +{
> +       struct tegra_smmu_address_space *as = domain->priv;
> +       struct tegra_smmu *smmu = as->smmu;
> +       struct of_phandle_iter entry;
> +       int err;
> +
> +       of_property_for_each_phandle_with_args(entry, dev->of_node, "iommus",
> +                                              "#iommu-cells", 0) {
> +               unsigned int swgroup;
> +
> +               if (entry.out_args.np != smmu->dev->of_node)
> +                       continue;
> +
> +               swgroup = entry.out_args.args[0];
> +
> +               err = tegra_smmu_disable(smmu, swgroup, as->id);
> +               if (err < 0)
> +                       pr_err("failed to disable SWGROUP#%u\n", swgroup);
> +       }
> +}
> +
> +static u32 *as_get_pte(struct tegra_smmu_address_space *as, dma_addr_t iova,
> +                      struct page **pagep)
> +{
> +       struct tegra_smmu *smmu = smmu_handle;
> +       u32 *pd = page_address(as->pd), *pt;
> +       u32 pde = (iova >> SMMU_PDE_SHIFT) & 0x3ff;
> +       u32 pte = (iova >> SMMU_PTE_SHIFT) & 0x3ff;
> +       struct page *page;
> +       unsigned int i;
> +
> +       if (pd[pde] != SMMU_PDE_VACANT(pde)) {
> +               page = pfn_to_page(pd[pde] & SMMU_PFN_MASK);
> +               pt = page_address(page);
> +       } else {
> +               page = alloc_page(GFP_KERNEL | __GFP_DMA);
> +               if (!page)
> +                       return NULL;
> +
> +               pt = page_address(page);
> +               SetPageReserved(page);
> +
> +               for (i = 0; i < SMMU_NUM_PTE; i++)
> +                       pt[i] = SMMU_PTE_VACANT(i);
> +
> +               smmu->soc->ops->flush_dcache(page, 0, SMMU_SIZE_PT);
> +
> +               pd[pde] = SMMU_MK_PDE(page, SMMU_PDE_ATTR | SMMU_PDE_NEXT);
> +
> +               smmu->soc->ops->flush_dcache(as->pd, pde << 2, 4);
> +               smmu_flush_ptc(smmu, as->pd, pde << 2);
> +               smmu_flush_tlb_section(smmu, as->id, iova);
> +               smmu_flush(smmu);
> +       }
> +
> +       *pagep = page;
> +
> +       return &pt[pte];
> +}
> +
> +static int tegra_smmu_map(struct iommu_domain *domain, unsigned long iova,
> +                         phys_addr_t paddr, size_t size, int prot)
> +{
> +       struct tegra_smmu_address_space *as = domain->priv;
> +       struct tegra_smmu *smmu = smmu_handle;
> +       unsigned long offset;
> +       struct page *page;
> +       u32 *pte;
> +
> +       pte = as_get_pte(as, iova, &page);
> +       if (!pte)
> +               return -ENOMEM;
> +
> +       offset = offset_in_page(pte);
> +
> +       *pte = __phys_to_pfn(paddr) | SMMU_PTE_ATTR;
> +
> +       smmu->soc->ops->flush_dcache(page, offset, 4);
> +       smmu_flush_ptc(smmu, page, offset);
> +       smmu_flush_tlb_group(smmu, as->id, iova);
> +       smmu_flush(smmu);
> +
> +       return 0;
> +}
> +
> +static size_t tegra_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
> +                              size_t size)
> +{
> +       struct tegra_smmu_address_space *as = domain->priv;
> +       struct tegra_smmu *smmu = smmu_handle;
> +       unsigned long offset;
> +       struct page *page;
> +       u32 *pte;
> +
> +       pte = as_get_pte(as, iova, &page);
> +       if (!pte)
> +               return 0;
> +
> +       offset = offset_in_page(pte);
> +       *pte = 0;
> +
> +       smmu->soc->ops->flush_dcache(page, offset, 4);
> +       smmu_flush_ptc(smmu, page, offset);
> +       smmu_flush_tlb_group(smmu, as->id, iova);
> +       smmu_flush(smmu);
> +
> +       return size;
> +}
> +
> +static phys_addr_t tegra_smmu_iova_to_phys(struct iommu_domain *domain,
> +                                          dma_addr_t iova)
> +{
> +       struct tegra_smmu_address_space *as = domain->priv;
> +       struct page *page;
> +       unsigned long pfn;
> +       u32 *pte;
> +
> +       pte = as_get_pte(as, iova, &page);
> +       if (!pte)
> +               return 0;
> +
> +       pfn = *pte & SMMU_PFN_MASK;
> +
> +       return PFN_PHYS(pfn);
> +}
> +
> +static int tegra_smmu_attach(struct iommu *iommu, struct device *dev)
> +{
> +       struct tegra_smmu *smmu = to_tegra_smmu(iommu);
> +       struct tegra_smmu_group *group;
> +       unsigned int i;
> +
> +       for (i = 0; i < smmu->soc->num_groups; i++) {
> +               group = iommu_group_get_iommudata(smmu->groups[i]);
> +
> +               if (of_match_node(group->matches, dev->of_node)) {
> +                       pr_debug("adding device %s to group %s\n",
> +                                dev_name(dev), group->name);
> +                       iommu_group_add_device(smmu->groups[i], dev);
> +                       break;
> +               }
> +       }
> +
> +       if (i == smmu->soc->num_groups)
> +               return 0;
> +
> +#ifndef CONFIG_ARM64
> +       return arm_iommu_attach_device(dev, group->mapping);
> +#else
> +       return 0;
> +#endif
> +}
> +
> +static int tegra_smmu_detach(struct iommu *iommu, struct device *dev)
> +{
> +       return 0;
> +}
> +
> +static const struct iommu_ops tegra_smmu_ops = {
> +       .domain_init = tegra_smmu_domain_init,
> +       .domain_destroy = tegra_smmu_domain_destroy,
> +       .attach_dev = tegra_smmu_attach_dev,
> +       .detach_dev = tegra_smmu_detach_dev,
> +       .map = tegra_smmu_map,
> +       .unmap = tegra_smmu_unmap,
> +       .iova_to_phys = tegra_smmu_iova_to_phys,
> +       .attach = tegra_smmu_attach,
> +       .detach = tegra_smmu_detach,
> +
> +       .pgsize_bitmap = SZ_4K,
> +};
> +
> +static struct tegra_smmu *tegra_smmu_probe(struct device *dev,
> +                                          const struct tegra_smmu_soc *soc,
> +                                          void __iomem *regs)
> +{
> +       struct tegra_smmu *smmu;
> +       unsigned int i;
> +       size_t size;
> +       u32 value;
> +       int err;
> +
> +       smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
> +       if (!smmu)
> +               return ERR_PTR(-ENOMEM);
> +
> +       size = BITS_TO_LONGS(soc->num_asids) * sizeof(long);
> +
> +       smmu->asids = devm_kzalloc(dev, size, GFP_KERNEL);
> +       if (!smmu->asids)
> +               return ERR_PTR(-ENOMEM);
> +
> +       INIT_LIST_HEAD(&smmu->iommu.list);
> +       mutex_init(&smmu->lock);
> +
> +       smmu->iommu.ops = &tegra_smmu_ops;
> +       smmu->iommu.dev = dev;
> +
> +       smmu->regs = regs;
> +       smmu->soc = soc;
> +       smmu->dev = dev;
> +
> +       smmu_handle = smmu;
> +       bus_set_iommu(&platform_bus_type, &tegra_smmu_ops);
> +
> +       smmu->num_groups = soc->num_groups;
> +
> +       smmu->groups = devm_kcalloc(dev, smmu->num_groups, sizeof(*smmu->groups),
> +                                   GFP_KERNEL);
> +       if (!smmu->groups)
> +               return ERR_PTR(-ENOMEM);
> +
> +       for (i = 0; i < smmu->num_groups; i++) {
> +               struct tegra_smmu_group *group;
> +
> +               smmu->groups[i] = iommu_group_alloc();
> +               if (IS_ERR(smmu->groups[i]))
> +                       return ERR_CAST(smmu->groups[i]);
> +
> +               err = iommu_group_set_name(smmu->groups[i], soc->groups[i].name);
> +               if (err < 0)
> +                       dev_warn(dev, "failed to set name of group %s: %d\n",
> +                                soc->groups[i].name, err);
> +
> +               group = kzalloc(sizeof(*group), GFP_KERNEL);
> +               if (!group)
> +                       return ERR_PTR(-ENOMEM);
> +
> +               group->matches = soc->groups[i].matches;
> +               group->asid = soc->groups[i].asid;
> +               group->name = soc->groups[i].name;
> +
> +               iommu_group_set_iommudata(smmu->groups[i], group,
> +                                         tegra_smmu_group_release);
> +
> +#ifndef CONFIG_ARM64
> +               group->mapping = arm_iommu_create_mapping(&platform_bus_type,
> +                                                         0, SZ_2G);
> +               if (IS_ERR(group->mapping)) {
> +                       dev_err(dev, "failed to create mapping for group %s: %ld\n",
> +                               group->name, PTR_ERR(group->mapping));
> +                       return ERR_CAST(group->mapping);
> +               }
> +#endif
> +       }
> +
> +       value = (1 << 29) | (8 << 24) | 0x3f;
> +       smmu_writel(smmu, value, 0x18);
> +
> +       value = (1 << 29) | (1 << 28) | 0x20;
> +       smmu_writel(smmu, value, 0x014);
> +
> +       smmu_flush_ptc(smmu, NULL, 0);
> +       smmu_flush_tlb(smmu);
> +       smmu_writel(smmu, SMMU_CONFIG_ENABLE, SMMU_CONFIG);
> +       smmu_flush(smmu);
> +
> +       err = iommu_add(&smmu->iommu);
> +       if (err < 0)
> +               return ERR_PTR(err);
> +
> +       return smmu;
> +}
> +
> +static int tegra_smmu_remove(struct tegra_smmu *smmu)
> +{
> +       iommu_remove(&smmu->iommu);
> +
> +       return 0;
> +}
> +
> +#ifdef CONFIG_ARCH_TEGRA_124_SOC
> +static const struct tegra_smmu_soc tegra124_smmu_soc = {
> +       .groups = tegra124_smmu_groups,
> +       .num_groups = ARRAY_SIZE(tegra124_smmu_groups),
> +       .clients = tegra124_mc_clients,
> +       .num_clients = ARRAY_SIZE(tegra124_mc_clients),
> +       .swgroups = tegra124_swgroups,
> +       .num_swgroups = ARRAY_SIZE(tegra124_swgroups),
> +       .num_asids = 128,
> +       .atom_size = 32,
> +       .ops = &tegra124_smmu_ops,
> +};
> +#endif
> +
> +static const struct tegra_smmu_soc tegra132_smmu_soc = {
> +       .groups = tegra124_smmu_groups,
> +       .num_groups = ARRAY_SIZE(tegra124_smmu_groups),
> +       .clients = tegra124_mc_clients,
> +       .num_clients = ARRAY_SIZE(tegra124_mc_clients),
> +       .swgroups = tegra124_swgroups,
> +       .num_swgroups = ARRAY_SIZE(tegra124_swgroups),
> +       .num_asids = 128,
> +       .atom_size = 32,
> +       .ops = &tegra132_smmu_ops,
> +};
> +
> +struct tegra_mc {
> +       struct device *dev;
> +       struct tegra_smmu *smmu;
> +       void __iomem *regs;
> +       int irq;
> +
> +       const struct tegra_mc_soc *soc;
> +};
> +
> +static inline u32 mc_readl(struct tegra_mc *mc, unsigned long offset)
> +{
> +       return readl(mc->regs + offset);
> +}
> +
> +static inline void mc_writel(struct tegra_mc *mc, u32 value, unsigned long offset)
> +{
> +       writel(value, mc->regs + offset);
> +}
> +
> +struct tegra_mc_soc {
> +       const struct tegra_mc_client *clients;
> +       unsigned int num_clients;
> +
> +       const struct tegra_smmu_soc *smmu;
> +};
> +
> +#ifdef CONFIG_ARCH_TEGRA_124_SOC
> +static const struct tegra_mc_soc tegra124_mc_soc = {
> +       .clients = tegra124_mc_clients,
> +       .num_clients = ARRAY_SIZE(tegra124_mc_clients),
> +       .smmu = &tegra124_smmu_soc,
> +};
> +#endif
> +
> +static const struct tegra_mc_soc tegra132_mc_soc = {
> +       .clients = tegra124_mc_clients,
> +       .num_clients = ARRAY_SIZE(tegra124_mc_clients),
> +       .smmu = &tegra132_smmu_soc,
> +};
> +
> +static const struct of_device_id tegra_mc_of_match[] = {
> +#ifdef CONFIG_ARCH_TEGRA_124_SOC
> +       { .compatible = "nvidia,tegra124-mc", .data = &tegra124_mc_soc },
> +#endif
> +       { .compatible = "nvidia,tegra132-mc", .data = &tegra132_mc_soc },
> +       { }
> +};
> +
> +static irqreturn_t tegra124_mc_irq(int irq, void *data)
> +{
> +       struct tegra_mc *mc = data;
> +       u32 value, status, mask;
> +
> +       /* mask all interrupts to avoid flooding */
> +       mask = mc_readl(mc, MC_INTMASK);
> +       mc_writel(mc, 0, MC_INTMASK);
> +
> +       status = mc_readl(mc, MC_INTSTATUS);
> +       mc_writel(mc, status, MC_INTSTATUS);
> +
> +       dev_dbg(mc->dev, "INTSTATUS: %08x\n", status);
> +
> +       if (status & MC_INT_DECERR_MTS)
> +               dev_dbg(mc->dev, "  DECERR_MTS\n");
> +
> +       if (status & MC_INT_SECERR_SEC)
> +               dev_dbg(mc->dev, "  SECERR_SEC\n");
> +
> +       if (status & MC_INT_DECERR_VPR)
> +               dev_dbg(mc->dev, "  DECERR_VPR\n");
> +
> +       if (status & MC_INT_INVALID_APB_ASID_UPDATE)
> +               dev_dbg(mc->dev, "  INVALID_APB_ASID_UPDATE\n");
> +
> +       if (status & MC_INT_INVALID_SMMU_PAGE)
> +               dev_dbg(mc->dev, "  INVALID_SMMU_PAGE\n");
> +
> +       if (status & MC_INT_ARBITRATION_EMEM)
> +               dev_dbg(mc->dev, "  ARBITRATION_EMEM\n");
> +
> +       if (status & MC_INT_SECURITY_VIOLATION)
> +               dev_dbg(mc->dev, "  SECURITY_VIOLATION\n");
> +
> +       if (status & MC_INT_DECERR_EMEM)
> +               dev_dbg(mc->dev, "  DECERR_EMEM\n");
> +
> +       value = mc_readl(mc, MC_ERR_STATUS);
> +
> +       dev_dbg(mc->dev, "ERR_STATUS: %08x\n", value);
> +       dev_dbg(mc->dev, "  type: %x\n", (value >> 28) & 0x7);
> +       dev_dbg(mc->dev, "  protection: %x\n", (value >> 25) & 0x7);
> +       dev_dbg(mc->dev, "  adr_hi: %x\n", (value >> 20) & 0x3);
> +       dev_dbg(mc->dev, "  swap: %x\n", (value >> 18) & 0x1);
> +       dev_dbg(mc->dev, "  security: %x\n", (value >> 17) & 0x1);
> +       dev_dbg(mc->dev, "  r/w: %x\n", (value >> 16) & 0x1);
> +       dev_dbg(mc->dev, "  adr1: %x\n", (value >> 12) & 0x7);
> +       dev_dbg(mc->dev, "  client: %x\n", value & 0x7f);
> +
> +       value = mc_readl(mc, MC_ERR_ADR);
> +       dev_dbg(mc->dev, "ERR_ADR: %08x\n", value);
> +
> +       mc_writel(mc, mask, MC_INTMASK);
> +
> +       return IRQ_HANDLED;
> +}
> +
> +static int tegra_mc_probe(struct platform_device *pdev)
> +{
> +       const struct of_device_id *match;
> +       struct resource *res;
> +       struct tegra_mc *mc;
> +       unsigned int i;
> +       u32 value;
> +       int err;
> +
> +       match = of_match_node(tegra_mc_of_match, pdev->dev.of_node);
> +       if (!match)
> +               return -ENODEV;
> +
> +       mc = devm_kzalloc(&pdev->dev, sizeof(*mc), GFP_KERNEL);
> +       if (!mc)
> +               return -ENOMEM;
> +
> +       platform_set_drvdata(pdev, mc);
> +       mc->soc = match->data;
> +       mc->dev = &pdev->dev;
> +
> +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +       mc->regs = devm_ioremap_resource(&pdev->dev, res);
> +       if (IS_ERR(mc->regs))
> +               return PTR_ERR(mc->regs);
> +
> +       for (i = 0; i < mc->soc->num_clients; i++) {
> +               const struct latency_allowance *la = &mc->soc->clients[i].latency;
> +               u32 value;
> +
> +               value = readl(mc->regs + la->reg);
> +               value &= ~(la->mask << la->shift);
> +               value |= (la->def & la->mask) << la->shift;
> +               writel(value, mc->regs + la->reg);
> +       }
> +
> +       mc->smmu = tegra_smmu_probe(&pdev->dev, mc->soc->smmu, mc->regs);
> +       if (IS_ERR(mc->smmu)) {
> +               dev_err(&pdev->dev, "failed to probe SMMU: %ld\n",
> +                       PTR_ERR(mc->smmu));
> +               return PTR_ERR(mc->smmu);
> +       }
> +
> +       mc->irq = platform_get_irq(pdev, 0);
> +       if (mc->irq < 0) {
> +               dev_err(&pdev->dev, "interrupt not specified\n");
> +               return mc->irq;
> +       }
> +
> +       err = devm_request_irq(&pdev->dev, mc->irq, tegra124_mc_irq,
> +                              IRQF_SHARED, dev_name(&pdev->dev), mc);
> +       if (err < 0) {
> +               dev_err(&pdev->dev, "failed to request IRQ#%u: %d\n", mc->irq,
> +                       err);
> +               return err;
> +       }
> +
> +       value = MC_INT_DECERR_MTS | MC_INT_SECERR_SEC | MC_INT_DECERR_VPR |
> +               MC_INT_INVALID_APB_ASID_UPDATE | MC_INT_INVALID_SMMU_PAGE |
> +               MC_INT_ARBITRATION_EMEM | MC_INT_SECURITY_VIOLATION |
> +               MC_INT_DECERR_EMEM;
> +       mc_writel(mc, value, MC_INTMASK);
> +
> +       return 0;
> +}
> +
> +static int tegra_mc_remove(struct platform_device *pdev)
> +{
> +       struct tegra_mc *mc = platform_get_drvdata(pdev);
> +       int err;
> +
> +       err = tegra_smmu_remove(mc->smmu);
> +       if (err < 0)
> +               dev_err(&pdev->dev, "failed to remove SMMU: %d\n", err);
> +
> +       return 0;
> +}
> +
> +static struct platform_driver tegra_mc_driver = {
> +       .driver = {
> +               .name = "tegra124-mc",
> +               .of_match_table = tegra_mc_of_match,
> +       },
> +       .probe = tegra_mc_probe,
> +       .remove = tegra_mc_remove,
> +};
> +module_platform_driver(tegra_mc_driver);
> +
> +MODULE_AUTHOR("Thierry Reding <treding@nvidia.com>");
> +MODULE_DESCRIPTION("NVIDIA Tegra124 Memory Controller driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/include/dt-bindings/memory/tegra124-mc.h b/include/dt-bindings/memory/tegra124-mc.h
> new file mode 100644
> index 000000000000..6b1617ce022f
> --- /dev/null
> +++ b/include/dt-bindings/memory/tegra124-mc.h
> @@ -0,0 +1,30 @@
> +#ifndef DT_BINDINGS_MEMORY_TEGRA124_MC_H
> +#define DT_BINDINGS_MEMORY_TEGRA124_MC_H
> +
> +#define TEGRA_SWGROUP_DC       0
> +#define TEGRA_SWGROUP_DCB      1
> +#define TEGRA_SWGROUP_AFI      2
> +#define TEGRA_SWGROUP_AVPC     3
> +#define TEGRA_SWGROUP_HDA      4
> +#define TEGRA_SWGROUP_HC       5
> +#define TEGRA_SWGROUP_MSENC    6
> +#define TEGRA_SWGROUP_PPCS     7
> +#define TEGRA_SWGROUP_SATA     8
> +#define TEGRA_SWGROUP_VDE      9
> +#define TEGRA_SWGROUP_MPCORELP 10
> +#define TEGRA_SWGROUP_MPCORE   11
> +#define TEGRA_SWGROUP_ISP2     12
> +#define TEGRA_SWGROUP_XUSB_HOST        13
> +#define TEGRA_SWGROUP_XUSB_DEV 14
> +#define TEGRA_SWGROUP_ISP2B    15
> +#define TEGRA_SWGROUP_TSEC     16
> +#define TEGRA_SWGROUP_A9AVP    17
> +#define TEGRA_SWGROUP_GPU      18
> +#define TEGRA_SWGROUP_SDMMC1A  19
> +#define TEGRA_SWGROUP_SDMMC2A  20
> +#define TEGRA_SWGROUP_SDMMC3A  21
> +#define TEGRA_SWGROUP_SDMMC4A  22
> +#define TEGRA_SWGROUP_VIC      23
> +#define TEGRA_SWGROUP_VI       24
> +
> +#endif
> --
> 2.0.0
>

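The MC_ERR_STATUS decoding done by the interrupt handler quoted above can
be sketched standalone as follows. The shifts and masks are the ones from
the dev_dbg() calls in the patch; the struct and function names
(mc_err_status, mc_decode_err_status) are made up for illustration and do
not appear in the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the fields the handler prints. */
struct mc_err_status {
	unsigned int type;
	unsigned int protection;
	unsigned int adr_hi;
	unsigned int swap;
	unsigned int security;
	unsigned int rw;
	unsigned int adr1;
	unsigned int client;
};

/* Split a raw MC_ERR_STATUS value into its fields (layout as in the patch). */
static void mc_decode_err_status(uint32_t value, struct mc_err_status *out)
{
	assert(out != NULL);

	out->type       = (value >> 28) & 0x7;
	out->protection = (value >> 25) & 0x7;
	out->adr_hi     = (value >> 20) & 0x3;
	out->swap       = (value >> 18) & 0x1;
	out->security   = (value >> 17) & 0x1;
	out->rw         = (value >> 16) & 0x1;
	out->adr1       = (value >> 12) & 0x7;
	out->client     = value & 0x7f;
}
```

The client field can then be matched against the .id values in the client
table to name the offending memory client.
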

* Re: [RFC 04/10] memory: Add Tegra124 memory controller support
@ 2014-06-27 13:29       ` Mikko Perttunen
From: Mikko Perttunen @ 2014-06-27 13:29 UTC (permalink / raw)
  To: Thierry Reding, Rob Herring, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, Stephen Warren, Arnd Bergmann,
	Will Deacon, Joerg Roedel
  Cc: Cho KyongHo, Grant Grundler, Dave Martin, Marc Zyngier,
	Hiroshi Doyu, Olav Haugan, Paul Walmsley, Rhyland Klein,
	Allen Martin, devicetree, iommu, linux-arm-kernel, linux-tegra,
	linux-kernel

In the future, the EMC driver will also want to read and write quite a
few registers in the MC block: MC_EMEM_*, the latency allowance
registers and a couple of others. Downstream just uses __raw_writel with
values from the EMC tables. A fun thing here is that at the point where
the values are written, the code cannot do some things, such as reading
registers (I believe), without hanging, so calling into the MC driver to
write the changes might not be very nice either. Related to that, a read
from MC_EMEM_ADR_CFG is used as a barrier in the sequence.
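
A minimal userspace sketch of that sequence, assuming a fake register
array in place of the real MC MMIO window. The names mc_table_entry,
mc_apply_table, fake_raw_writel and fake_readl, the window size and the
offset used for MC_EMEM_ADR_CFG are placeholders for illustration, not
taken from the downstream code; in the kernel the stores would be
__raw_writel() and the final load a read of the mapped register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MC_NUM_REGS		0x400	/* hypothetical window size in bytes */
#define MC_EMEM_ADR_CFG_OFFSET	0x54	/* placeholder offset, an assumption */

/* One (offset, value) pair, as an EMC table might carry for the MC block. */
struct mc_table_entry {
	uint32_t offset;
	uint32_t value;
};

/* Fake register file standing in for the ioremapped MC registers. */
static volatile uint32_t fake_mc_regs[MC_NUM_REGS / 4];

static void fake_raw_writel(uint32_t value, uint32_t offset)
{
	assert(offset % 4 == 0 && offset < MC_NUM_REGS);
	fake_mc_regs[offset / 4] = value;
}

static uint32_t fake_readl(uint32_t offset)
{
	assert(offset % 4 == 0 && offset < MC_NUM_REGS);
	return fake_mc_regs[offset / 4];
}

/*
 * Write every table entry without reading anything back in between, then
 * issue a single read of MC_EMEM_ADR_CFG as the barrier that closes the
 * sequence. The value read is returned so the caller can discard or log it.
 */
static uint32_t mc_apply_table(const struct mc_table_entry *table, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		fake_raw_writel(table[i].value, table[i].offset);

	return fake_readl(MC_EMEM_ADR_CFG_OFFSET);
}
```

The point of the shape is that nothing between the first store and the
final read calls back into another driver.
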

On 26/06/14 23:49, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
>
> The memory controller on NVIDIA Tegra124 exposes various knobs that can
> be used to tune the behaviour of the clients attached to it.
>
> Currently this driver sets up the latency allowance registers to the HW
> defaults. Eventually an API should be exported by this driver (via a
> custom API or a generic subsystem) to allow clients to register latency
> requirements.

I cannot see where the downstream latency allowance code reloads the
latency allowance registers after an EMC clock rate change. Strange.
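
For reference, the patch programs each latency allowance field with a
read-modify-write, and the same update would have to be repeated if the
registers needed reloading. Below is a standalone sketch of that update
on a plain 32-bit value. The struct mirrors struct latency_allowance from
the patch; the helper name la_update is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the fields of struct latency_allowance in the patch. */
struct latency_allowance {
	unsigned int reg;
	unsigned int shift;
	unsigned int mask;
	unsigned int def;
};

/*
 * Return 'value' with the field described by 'la' replaced by la->def,
 * leaving all other bits of the register untouched.
 */
static uint32_t la_update(uint32_t value, const struct latency_allowance *la)
{
	assert(la->shift < 32);

	value &= ~((uint32_t)la->mask << la->shift);
	value |= (uint32_t)(la->def & la->mask) << la->shift;

	return value;
}
```

With the display0a entry (shift 0, def 0xc2) and the display0b entry
(shift 16, def 0x50), both of which live in register 0x2e8, two calls
update the two fields independently.
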

>
> This driver also registers an IOMMU (SMMU) that's implemented by the
> memory controller.
>
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>   drivers/memory/Kconfig                   |    9 +
>   drivers/memory/Makefile                  |    1 +
>   drivers/memory/tegra124-mc.c             | 1945 ++++++++++++++++++++++++++++++
>   include/dt-bindings/memory/tegra124-mc.h |   30 +
>   4 files changed, 1985 insertions(+)
>   create mode 100644 drivers/memory/tegra124-mc.c
>   create mode 100644 include/dt-bindings/memory/tegra124-mc.h
>
> diff --git a/drivers/memory/Kconfig b/drivers/memory/Kconfig
> index c59e9c96e86d..d0f0e6781570 100644
> --- a/drivers/memory/Kconfig
> +++ b/drivers/memory/Kconfig
> @@ -61,6 +61,15 @@ config TEGRA30_MC
>            analysis, especially for IOMMU/SMMU(System Memory Management
>            Unit) module.
>
> +config TEGRA124_MC
> +       bool "Tegra124 Memory Controller driver"
> +       depends on ARCH_TEGRA
> +       select IOMMU_API
> +       help
> +         This driver is for the Memory Controller module available on
> +         Tegra124 SoCs. It provides an IOMMU that can be used for I/O
> +         virtual address translation.
> +
>   config FSL_IFC
>          bool
>          depends on FSL_SOC
> diff --git a/drivers/memory/Makefile b/drivers/memory/Makefile
> index 71160a2b7313..03143927abab 100644
> --- a/drivers/memory/Makefile
> +++ b/drivers/memory/Makefile
> @@ -11,3 +11,4 @@ obj-$(CONFIG_FSL_IFC)         += fsl_ifc.o
>   obj-$(CONFIG_MVEBU_DEVBUS)     += mvebu-devbus.o
>   obj-$(CONFIG_TEGRA20_MC)       += tegra20-mc.o
>   obj-$(CONFIG_TEGRA30_MC)       += tegra30-mc.o
> +obj-$(CONFIG_TEGRA124_MC)      += tegra124-mc.o
> diff --git a/drivers/memory/tegra124-mc.c b/drivers/memory/tegra124-mc.c
> new file mode 100644
> index 000000000000..741755b6785d
> --- /dev/null
> +++ b/drivers/memory/tegra124-mc.c
> @@ -0,0 +1,1945 @@
> +/*
> + * Copyright (C) 2014 NVIDIA CORPORATION.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/iommu.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +
> +#include <dt-bindings/memory/tegra124-mc.h>
> +
> +#include <asm/cacheflush.h>
> +#ifndef CONFIG_ARM64
> +#include <asm/dma-iommu.h>
> +#endif
> +
> +#define MC_INTSTATUS 0x000
> +#define  MC_INT_DECERR_MTS (1 << 16)
> +#define  MC_INT_SECERR_SEC (1 << 13)
> +#define  MC_INT_DECERR_VPR (1 << 12)
> +#define  MC_INT_INVALID_APB_ASID_UPDATE (1 << 11)
> +#define  MC_INT_INVALID_SMMU_PAGE (1 << 10)
> +#define  MC_INT_ARBITRATION_EMEM (1 << 9)
> +#define  MC_INT_SECURITY_VIOLATION (1 << 8)
> +#define  MC_INT_DECERR_EMEM (1 << 6)
> +#define MC_INTMASK 0x004
> +#define MC_ERR_STATUS 0x08
> +#define MC_ERR_ADR 0x0c
> +
> +struct latency_allowance {
> +       unsigned int reg;
> +       unsigned int shift;
> +       unsigned int mask;
> +       unsigned int def;
> +};
> +
> +struct smmu_enable {
> +       unsigned int reg;
> +       unsigned int bit;
> +};
> +
> +struct tegra_mc_client {
> +       unsigned int id;
> +       const char *name;
> +       unsigned int swgroup;
> +
> +       struct smmu_enable smmu;
> +       struct latency_allowance latency;
> +};
> +
> +static const struct tegra_mc_client tegra124_mc_clients[] = {
> +       {
> +               .id = 0x01,
> +               .name = "display0a",
> +               .swgroup = TEGRA_SWGROUP_DC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 1,
> +               },
> +               .latency = {
> +                       .reg = 0x2e8,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0xc2,
> +               },
> +       }, {
> +               .id = 0x02,
> +               .name = "display0ab",
> +               .swgroup = TEGRA_SWGROUP_DCB,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 2,
> +               },
> +               .latency = {
> +                       .reg = 0x2f4,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0xc6,
> +               },
> +       }, {
> +               .id = 0x03,
> +               .name = "display0b",
> +               .swgroup = TEGRA_SWGROUP_DC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 3,
> +               },
> +               .latency = {
> +                       .reg = 0x2e8,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x04,
> +               .name = "display0bb",
> +               .swgroup = TEGRA_SWGROUP_DCB,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 4,
> +               },
> +               .latency = {
> +                       .reg = 0x2f4,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x05,
> +               .name = "display0c",
> +               .swgroup = TEGRA_SWGROUP_DC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 5,
> +               },
> +               .latency = {
> +                       .reg = 0x2ec,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x06,
> +               .name = "display0cb",
> +               .swgroup = TEGRA_SWGROUP_DCB,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 6,
> +               },
> +               .latency = {
> +                       .reg = 0x2f8,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x0e,
> +               .name = "afir",
> +               .swgroup = TEGRA_SWGROUP_AFI,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 14,
> +               },
> +               .latency = {
> +                       .reg = 0x2e0,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x13,
> +               },
> +       }, {
> +               .id = 0x0f,
> +               .name = "avpcarm7r",
> +               .swgroup = TEGRA_SWGROUP_AVPC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 15,
> +               },
> +               .latency = {
> +                       .reg = 0x2e4,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x04,
> +               },
> +       }, {
> +               .id = 0x10,
> +               .name = "displayhc",
> +               .swgroup = TEGRA_SWGROUP_DC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 16,
> +               },
> +               .latency = {
> +                       .reg = 0x2f0,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x11,
> +               .name = "displayhcb",
> +               .swgroup = TEGRA_SWGROUP_DCB,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 17,
> +               },
> +               .latency = {
> +                       .reg = 0x2fc,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x15,
> +               .name = "hdar",
> +               .swgroup = TEGRA_SWGROUP_HDA,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 21,
> +               },
> +               .latency = {
> +                       .reg = 0x318,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x24,
> +               },
> +       }, {
> +               .id = 0x16,
> +               .name = "host1xdmar",
> +               .swgroup = TEGRA_SWGROUP_HC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 22,
> +               },
> +               .latency = {
> +                       .reg = 0x310,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x1e,
> +               },
> +       }, {
> +               .id = 0x17,
> +               .name = "host1xr",
> +               .swgroup = TEGRA_SWGROUP_HC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 23,
> +               },
> +               .latency = {
> +                       .reg = 0x310,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x1c,
> +               .name = "msencsrd",
> +               .swgroup = TEGRA_SWGROUP_MSENC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 28,
> +               },
> +               .latency = {
> +                       .reg = 0x328,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x23,
> +               },
> +       }, {
> +               .id = 0x1d,
> +               .name = "ppcsahbdmarhdar",
> +               .swgroup = TEGRA_SWGROUP_PPCS,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 29,
> +               },
> +               .latency = {
> +                       .reg = 0x344,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x49,
> +               },
> +       }, {
> +               .id = 0x1e,
> +               .name = "ppcsahbslvr",
> +               .swgroup = TEGRA_SWGROUP_PPCS,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 30,
> +               },
> +               .latency = {
> +                       .reg = 0x344,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x1a,
> +               },
> +       }, {
> +               .id = 0x1f,
> +               .name = "satar",
> +               .swgroup = TEGRA_SWGROUP_SATA,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 31,
> +               },
> +               .latency = {
> +                       .reg = 0x350,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x65,
> +               },
> +       }, {
> +               .id = 0x22,
> +               .name = "vdebsevr",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 2,
> +               },
> +               .latency = {
> +                       .reg = 0x354,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x4f,
> +               },
> +       }, {
> +               .id = 0x23,
> +               .name = "vdember",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 3,
> +               },
> +               .latency = {
> +                       .reg = 0x354,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x3d,
> +               },
> +       }, {
> +               .id = 0x24,
> +               .name = "vdemcer",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 4,
> +               },
> +               .latency = {
> +                       .reg = 0x358,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x66,
> +               },
> +       }, {
> +               .id = 0x25,
> +               .name = "vdetper",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 5,
> +               },
> +               .latency = {
> +                       .reg = 0x358,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0xa5,
> +               },
> +       }, {
> +               .id = 0x26,
> +               .name = "mpcorelpr",
> +               .swgroup = TEGRA_SWGROUP_MPCORELP,
> +               .latency = {
> +                       .reg = 0x324,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x04,
> +               },
> +       }, {
> +               .id = 0x27,
> +               .name = "mpcorer",
> +               .swgroup = TEGRA_SWGROUP_MPCORE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 2,
> +               },
> +               .latency = {
> +                       .reg = 0x320,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x04,
> +               },
> +       }, {
> +               .id = 0x2b,
> +               .name = "msencswr",
> +               .swgroup = TEGRA_SWGROUP_MSENC,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 11,
> +               },
> +               .latency = {
> +                       .reg = 0x328,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x31,
> +               .name = "afiw",
> +               .swgroup = TEGRA_SWGROUP_AFI,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 17,
> +               },
> +               .latency = {
> +                       .reg = 0x2e0,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x32,
> +               .name = "avpcarm7w",
> +               .swgroup = TEGRA_SWGROUP_AVPC,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 18,
> +               },
> +               .latency = {
> +                       .reg = 0x2e4,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x35,
> +               .name = "hdaw",
> +               .swgroup = TEGRA_SWGROUP_HDA,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 21,
> +               },
> +               .latency = {
> +                       .reg = 0x318,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x36,
> +               .name = "host1xw",
> +               .swgroup = TEGRA_SWGROUP_HC,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 22,
> +               },
> +               .latency = {
> +                       .reg = 0x314,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x38,
> +               .name = "mpcorelpw",
> +               .swgroup = TEGRA_SWGROUP_MPCORELP,
> +               .latency = {
> +                       .reg = 0x324,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x39,
> +               .name = "mpcorew",
> +               .swgroup = TEGRA_SWGROUP_MPCORE,
> +               .latency = {
> +                       .reg = 0x320,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x3b,
> +               .name = "ppcsahbdmaw",
> +               .swgroup = TEGRA_SWGROUP_PPCS,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 27,
> +               },
> +               .latency = {
> +                       .reg = 0x348,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x3c,
> +               .name = "ppcsahbslvw",
> +               .swgroup = TEGRA_SWGROUP_PPCS,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 28,
> +               },
> +               .latency = {
> +                       .reg = 0x348,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x3d,
> +               .name = "sataw",
> +               .swgroup = TEGRA_SWGROUP_SATA,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 29,
> +               },
> +               .latency = {
> +                       .reg = 0x350,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x65,
> +               },
> +       }, {
> +               .id = 0x3e,
> +               .name = "vdebsevw",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 30,
> +               },
> +               .latency = {
> +                       .reg = 0x35c,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x3f,
> +               .name = "vdedbgw",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 31,
> +               },
> +               .latency = {
> +                       .reg = 0x35c,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x40,
> +               .name = "vdembew",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 0,
> +               },
> +               .latency = {
> +                       .reg = 0x360,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x41,
> +               .name = "vdetpmw",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 1,
> +               },
> +               .latency = {
> +                       .reg = 0x360,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x44,
> +               .name = "ispra",
> +               .swgroup = TEGRA_SWGROUP_ISP2,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 4,
> +               },
> +               .latency = {
> +                       .reg = 0x370,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x18,
> +               },
> +       }, {
> +               .id = 0x46,
> +               .name = "ispwa",
> +               .swgroup = TEGRA_SWGROUP_ISP2,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 6,
> +               },
> +               .latency = {
> +                       .reg = 0x374,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x47,
> +               .name = "ispwb",
> +               .swgroup = TEGRA_SWGROUP_ISP2,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 7,
> +               },
> +               .latency = {
> +                       .reg = 0x374,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x4a,
> +               .name = "xusb_hostr",
> +               .swgroup = TEGRA_SWGROUP_XUSB_HOST,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 10,
> +               },
> +               .latency = {
> +                       .reg = 0x37c,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x39,
> +               },
> +       }, {
> +               .id = 0x4b,
> +               .name = "xusb_hostw",
> +               .swgroup = TEGRA_SWGROUP_XUSB_HOST,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 11,
> +               },
> +               .latency = {
> +                       .reg = 0x37c,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x4c,
> +               .name = "xusb_devr",
> +               .swgroup = TEGRA_SWGROUP_XUSB_DEV,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 12,
> +               },
> +               .latency = {
> +                       .reg = 0x380,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x39,
> +               },
> +       }, {
> +               .id = 0x4d,
> +               .name = "xusb_devw",
> +               .swgroup = TEGRA_SWGROUP_XUSB_DEV,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 13,
> +               },
> +               .latency = {
> +                       .reg = 0x380,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x4e,
> +               .name = "isprab",
> +               .swgroup = TEGRA_SWGROUP_ISP2B,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 14,
> +               },
> +               .latency = {
> +                       .reg = 0x384,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x18,
> +               },
> +       }, {
> +               .id = 0x50,
> +               .name = "ispwab",
> +               .swgroup = TEGRA_SWGROUP_ISP2B,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 16,
> +               },
> +               .latency = {
> +                       .reg = 0x388,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x51,
> +               .name = "ispwbb",
> +               .swgroup = TEGRA_SWGROUP_ISP2B,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 17,
> +               },
> +               .latency = {
> +                       .reg = 0x388,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x54,
> +               .name = "tsecsrd",
> +               .swgroup = TEGRA_SWGROUP_TSEC,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 20,
> +               },
> +               .latency = {
> +                       .reg = 0x390,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x9b,
> +               },
> +       }, {
> +               .id = 0x55,
> +               .name = "tsecswr",
> +               .swgroup = TEGRA_SWGROUP_TSEC,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 21,
> +               },
> +               .latency = {
> +                       .reg = 0x390,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x56,
> +               .name = "a9avpscr",
> +               .swgroup = TEGRA_SWGROUP_A9AVP,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 22,
> +               },
> +               .latency = {
> +                       .reg = 0x3a4,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x04,
> +               },
> +       }, {
> +               .id = 0x57,
> +               .name = "a9avpscw",
> +               .swgroup = TEGRA_SWGROUP_A9AVP,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 23,
> +               },
> +               .latency = {
> +                       .reg = 0x3a4,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x58,
> +               .name = "gpusrd",
> +               .swgroup = TEGRA_SWGROUP_GPU,
> +               .smmu = {
> +                       /* read-only */
> +                       .reg = 0x230,
> +                       .bit = 24,
> +               },
> +               .latency = {
> +                       .reg = 0x3c8,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x1a,
> +               },
> +       }, {
> +               .id = 0x59,
> +               .name = "gpuswr",
> +               .swgroup = TEGRA_SWGROUP_GPU,
> +               .smmu = {
> +                       /* read-only */
> +                       .reg = 0x230,
> +                       .bit = 25,
> +               },
> +               .latency = {
> +                       .reg = 0x3c8,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x5a,
> +               .name = "displayt",
> +               .swgroup = TEGRA_SWGROUP_DC,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 26,
> +               },
> +               .latency = {
> +                       .reg = 0x2f0,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x60,
> +               .name = "sdmmcra",
> +               .swgroup = TEGRA_SWGROUP_SDMMC1A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 0,
> +               },
> +               .latency = {
> +                       .reg = 0x3b8,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x49,
> +               },
> +       }, {
> +               .id = 0x61,
> +               .name = "sdmmcraa",
> +               .swgroup = TEGRA_SWGROUP_SDMMC2A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 1,
> +               },
> +               .latency = {
> +                       .reg = 0x3bc,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x49,
> +               },
> +       }, {
> +               .id = 0x62,
> +               .name = "sdmmcr",
> +               .swgroup = TEGRA_SWGROUP_SDMMC3A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 2,
> +               },
> +               .latency = {
> +                       .reg = 0x3c0,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x49,
> +               },
> +       }, {
> +               .id = 0x63,
> +               .name = "sdmmcrab",
> +               .swgroup = TEGRA_SWGROUP_SDMMC4A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 3,
> +               },
> +               .latency = {
> +                       .reg = 0x3c4,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x49,
> +               },
> +       }, {
> +               .id = 0x64,
> +               .name = "sdmmcwa",
> +               .swgroup = TEGRA_SWGROUP_SDMMC1A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 4,
> +               },
> +               .latency = {
> +                       .reg = 0x3b8,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x65,
> +               .name = "sdmmcwaa",
> +               .swgroup = TEGRA_SWGROUP_SDMMC2A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 5,
> +               },
> +               .latency = {
> +                       .reg = 0x3bc,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x66,
> +               .name = "sdmmcw",
> +               .swgroup = TEGRA_SWGROUP_SDMMC3A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 6,
> +               },
> +               .latency = {
> +                       .reg = 0x3c0,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x67,
> +               .name = "sdmmcwab",
> +               .swgroup = TEGRA_SWGROUP_SDMMC4A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 7,
> +               },
> +               .latency = {
> +                       .reg = 0x3c4,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x6c,
> +               .name = "vicsrd",
> +               .swgroup = TEGRA_SWGROUP_VIC,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 12,
> +               },
> +               .latency = {
> +                       .reg = 0x394,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x1a,
> +               },
> +       }, {
> +               .id = 0x6d,
> +               .name = "vicswr",
> +               .swgroup = TEGRA_SWGROUP_VIC,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 13,
> +               },
> +               .latency = {
> +                       .reg = 0x394,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x72,
> +               .name = "viw",
> +               .swgroup = TEGRA_SWGROUP_VI,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 18,
> +               },
> +               .latency = {
> +                       .reg = 0x398,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x73,
> +               .name = "displayd",
> +               .swgroup = TEGRA_SWGROUP_DC,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 19,
> +               },
> +               .latency = {
> +                       .reg = 0x3c8,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       },
> +};
> +
> +struct tegra_smmu_swgroup {
> +       unsigned int swgroup;
> +       unsigned int reg;
> +};
> +
> +static const struct tegra_smmu_swgroup tegra124_swgroups[] = {
> +       { .swgroup = TEGRA_SWGROUP_DC,        .reg = 0x240 },
> +       { .swgroup = TEGRA_SWGROUP_DCB,       .reg = 0x244 },
> +       { .swgroup = TEGRA_SWGROUP_AFI,       .reg = 0x238 },
> +       { .swgroup = TEGRA_SWGROUP_AVPC,      .reg = 0x23c },
> +       { .swgroup = TEGRA_SWGROUP_HDA,       .reg = 0x254 },
> +       { .swgroup = TEGRA_SWGROUP_HC,        .reg = 0x250 },
> +       { .swgroup = TEGRA_SWGROUP_MSENC,     .reg = 0x264 },
> +       { .swgroup = TEGRA_SWGROUP_PPCS,      .reg = 0x270 },
> +       { .swgroup = TEGRA_SWGROUP_SATA,      .reg = 0x274 },
> +       { .swgroup = TEGRA_SWGROUP_VDE,       .reg = 0x27c },
> +       { .swgroup = TEGRA_SWGROUP_ISP2,      .reg = 0x258 },
> +       { .swgroup = TEGRA_SWGROUP_XUSB_HOST, .reg = 0x288 },
> +       { .swgroup = TEGRA_SWGROUP_XUSB_DEV,  .reg = 0x28c },
> +       { .swgroup = TEGRA_SWGROUP_ISP2B,     .reg = 0xaa4 },
> +       { .swgroup = TEGRA_SWGROUP_TSEC,      .reg = 0x294 },
> +       { .swgroup = TEGRA_SWGROUP_A9AVP,     .reg = 0x290 },
> +       { .swgroup = TEGRA_SWGROUP_GPU,       .reg = 0xaa8 },
> +       { .swgroup = TEGRA_SWGROUP_SDMMC1A,   .reg = 0xa94 },
> +       { .swgroup = TEGRA_SWGROUP_SDMMC2A,   .reg = 0xa98 },
> +       { .swgroup = TEGRA_SWGROUP_SDMMC3A,   .reg = 0xa9c },
> +       { .swgroup = TEGRA_SWGROUP_SDMMC4A,   .reg = 0xaa0 },
> +       { .swgroup = TEGRA_SWGROUP_VIC,       .reg = 0x284 },
> +       { .swgroup = TEGRA_SWGROUP_VI,        .reg = 0x280 },
> +};
> +
> +struct tegra_smmu_group_init {
> +       unsigned int asid;
> +       const char *name;
> +
> +       const struct of_device_id *matches;
> +};
> +
> +struct tegra_smmu_soc {
> +       const struct tegra_smmu_group_init *groups;
> +       unsigned int num_groups;
> +
> +       const struct tegra_mc_client *clients;
> +       unsigned int num_clients;
> +
> +       const struct tegra_smmu_swgroup *swgroups;
> +       unsigned int num_swgroups;
> +
> +       unsigned int num_asids;
> +       unsigned int atom_size;
> +
> +       const struct tegra_smmu_ops *ops;
> +};
> +
> +struct tegra_smmu_ops {
> +       void (*flush_dcache)(struct page *page, unsigned long offset,
> +                            size_t size);
> +};
> +
> +struct tegra_smmu_master {
> +       struct list_head list;
> +       struct device *dev;
> +};
> +
> +struct tegra_smmu_group {
> +       const char *name;
> +       const struct of_device_id *matches;
> +       unsigned int asid;
> +
> +#ifndef CONFIG_ARM64
> +       struct dma_iommu_mapping *mapping;
> +#endif
> +       struct list_head masters;
> +};
> +
> +static const struct of_device_id tegra124_periph_matches[] = {
> +       { .compatible = "nvidia,tegra124-sdhci", },
> +       { }
> +};
> +
> +static const struct tegra_smmu_group_init tegra124_smmu_groups[] = {
> +       { 0, "peripherals", tegra124_periph_matches },
> +};
> +
> +static void tegra_smmu_group_release(void *data)
> +{
> +       kfree(data);
> +}
> +
> +struct tegra_smmu {
> +       void __iomem *regs;
> +       struct iommu iommu;
> +       struct device *dev;
> +
> +       const struct tegra_smmu_soc *soc;
> +
> +       struct iommu_group **groups;
> +       unsigned int num_groups;
> +
> +       unsigned long *asids;
> +       struct mutex lock;
> +};
> +
> +struct tegra_smmu_address_space {
> +       struct iommu_domain *domain;
> +       struct tegra_smmu *smmu;
> +       struct page *pd;
> +       unsigned int id;
> +       u32 attr;
> +};
> +
> +static inline void smmu_writel(struct tegra_smmu *smmu, u32 value,
> +                              unsigned long offset)
> +{
> +       writel(value, smmu->regs + offset);
> +}
> +
> +static inline u32 smmu_readl(struct tegra_smmu *smmu, unsigned long offset)
> +{
> +       return readl(smmu->regs + offset);
> +}
> +
> +#define SMMU_CONFIG 0x010
> +#define  SMMU_CONFIG_ENABLE (1 << 0)
> +
> +#define SMMU_PTB_ASID 0x01c
> +#define  SMMU_PTB_ASID_VALUE(x) ((x) & 0x7f)
> +
> +#define SMMU_PTB_DATA 0x020
> +#define  SMMU_PTB_DATA_VALUE(page, attr) (page_to_phys(page) >> 12 | (attr))
> +
> +#define SMMU_MK_PDE(page, attr) (page_to_phys(page) >> SMMU_PTE_SHIFT | (attr))
> +
> +#define SMMU_TLB_FLUSH 0x030
> +#define  SMMU_TLB_FLUSH_VA_MATCH_ALL     (0 << 0)
> +#define  SMMU_TLB_FLUSH_VA_MATCH_SECTION (2 << 0)
> +#define  SMMU_TLB_FLUSH_VA_MATCH_GROUP   (3 << 0)
> +#define  SMMU_TLB_FLUSH_ASID(x)          (((x) & 0x7f) << 24)
> +#define  SMMU_TLB_FLUSH_VA_SECTION(addr) ((((addr) & 0xffc00000) >> 12) | \
> +                                         SMMU_TLB_FLUSH_VA_MATCH_SECTION)
> +#define  SMMU_TLB_FLUSH_VA_GROUP(addr)   ((((addr) & 0xffffc000) >> 12) | \
> +                                         SMMU_TLB_FLUSH_VA_MATCH_GROUP)
> +#define  SMMU_TLB_FLUSH_ASID_MATCH       (1 << 31)
> +
> +#define SMMU_PTC_FLUSH 0x034
> +#define  SMMU_PTC_FLUSH_TYPE_ALL (0 << 0)
> +#define  SMMU_PTC_FLUSH_TYPE_ADR (1 << 0)
> +
> +#define SMMU_PTC_FLUSH_HI 0x9b8
> +#define  SMMU_PTC_FLUSH_HI_MASK 0x3
> +
> +/* per-SWGROUP SMMU_*_ASID register */
> +#define SMMU_ASID_ENABLE (1 << 31)
> +#define SMMU_ASID_MASK 0x7f
> +#define SMMU_ASID_VALUE(x) ((x) & SMMU_ASID_MASK)
> +
> +/* page table definitions */
> +#define SMMU_NUM_PDE 1024
> +#define SMMU_NUM_PTE 1024
> +
> +#define SMMU_SIZE_PD (SMMU_NUM_PDE * 4)
> +#define SMMU_SIZE_PT (SMMU_NUM_PTE * 4)
> +
> +#define SMMU_PDE_SHIFT 22
> +#define SMMU_PTE_SHIFT 12
> +
> +#define SMMU_PFN_MASK 0x000fffff
> +
> +#define SMMU_PD_READABLE       (1 << 31)
> +#define SMMU_PD_WRITABLE       (1 << 30)
> +#define SMMU_PD_NONSECURE      (1 << 29)
> +
> +#define SMMU_PDE_READABLE      (1 << 31)
> +#define SMMU_PDE_WRITABLE      (1 << 30)
> +#define SMMU_PDE_NONSECURE     (1 << 29)
> +#define SMMU_PDE_NEXT          (1 << 28)
> +
> +#define SMMU_PTE_READABLE      (1 << 31)
> +#define SMMU_PTE_WRITABLE      (1 << 30)
> +#define SMMU_PTE_NONSECURE     (1 << 29)
> +
> +#define SMMU_PDE_ATTR          (SMMU_PDE_READABLE | SMMU_PDE_WRITABLE | \
> +                                SMMU_PDE_NONSECURE)
> +#define SMMU_PTE_ATTR          (SMMU_PTE_READABLE | SMMU_PTE_WRITABLE | \
> +                                SMMU_PTE_NONSECURE)
> +
> +#define SMMU_PDE_VACANT(n)     (((n) << 10) | SMMU_PDE_ATTR)
> +#define SMMU_PTE_VACANT(n)     (((n) << 12) | SMMU_PTE_ATTR)
> +
> +#ifdef CONFIG_ARCH_TEGRA_124_SOC
> +static void tegra124_flush_dcache(struct page *page, unsigned long offset,
> +                                 size_t size)
> +{
> +       phys_addr_t phys = page_to_phys(page) + offset;
> +       void *virt = page_address(page) + offset;
> +
> +       __cpuc_flush_dcache_area(virt, size);
> +       outer_flush_range(phys, phys + size);
> +}
> +
> +static const struct tegra_smmu_ops tegra124_smmu_ops = {
> +       .flush_dcache = tegra124_flush_dcache,
> +};
> +#endif
> +
> +static void tegra132_flush_dcache(struct page *page, unsigned long offset,
> +                                 size_t size)
> +{
> +       /* TODO: implement */
> +}
> +
> +static const struct tegra_smmu_ops tegra132_smmu_ops = {
> +       .flush_dcache = tegra132_flush_dcache,
> +};
> +
> +static inline void smmu_flush_ptc(struct tegra_smmu *smmu, struct page *page,
> +                                 unsigned long offset)
> +{
> +       phys_addr_t phys = page ? page_to_phys(page) : 0;
> +       u32 value;
> +
> +       if (page) {
> +               offset &= ~(smmu->soc->atom_size - 1);
> +
> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
> +               value = (phys >> 32) & SMMU_PTC_FLUSH_HI_MASK;
> +#else
> +               value = 0;
> +#endif
> +               smmu_writel(smmu, value, SMMU_PTC_FLUSH_HI);
> +
> +               value = (phys + offset) | SMMU_PTC_FLUSH_TYPE_ADR;
> +       } else {
> +               value = SMMU_PTC_FLUSH_TYPE_ALL;
> +       }
> +
> +       smmu_writel(smmu, value, SMMU_PTC_FLUSH);
> +}
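
To make the two register writes in smmu_flush_ptc() easier to check, here is a minimal user-space sketch of the values it computes. The constants are copied from the macros quoted above; the helper names ptc_flush_hi() and ptc_flush_lo() are hypothetical, and the atom size is passed in as a parameter because its actual value is not visible in this part of the patch.

```c
#include <stdint.h>

#define PTC_FLUSH_TYPE_ADR 1u   /* SMMU_PTC_FLUSH_TYPE_ADR */
#define PTC_FLUSH_HI_MASK  0x3u /* SMMU_PTC_FLUSH_HI_MASK */

/* Hypothetical helper: value for SMMU_PTC_FLUSH_HI (address bits 33:32). */
static uint32_t ptc_flush_hi(uint64_t phys)
{
	return (uint32_t)(phys >> 32) & PTC_FLUSH_HI_MASK;
}

/*
 * Hypothetical helper: value for SMMU_PTC_FLUSH. The offset is rounded down
 * to the atom size, which is assumed to be a power of two.
 */
static uint32_t ptc_flush_lo(uint64_t phys, uint32_t offset, uint32_t atom_size)
{
	offset &= ~(atom_size - 1);

	return (uint32_t)(phys + offset) | PTC_FLUSH_TYPE_ADR;
}
```

Note that the low write only carries the bottom 32 bits of the address, so on a 64-bit phys_addr_t the high write has to happen first, as the quoted code does.
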
> +
> +static inline void smmu_flush_tlb(struct tegra_smmu *smmu)
> +{
> +       smmu_writel(smmu, SMMU_TLB_FLUSH_VA_MATCH_ALL, SMMU_TLB_FLUSH);
> +}
> +
> +static inline void smmu_flush_tlb_asid(struct tegra_smmu *smmu,
> +                                      unsigned long asid)
> +{
> +       u32 value;
> +
> +       value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
> +               SMMU_TLB_FLUSH_VA_MATCH_ALL;
> +       smmu_writel(smmu, value, SMMU_TLB_FLUSH);
> +}
> +
> +static inline void smmu_flush_tlb_section(struct tegra_smmu *smmu,
> +                                         unsigned long asid,
> +                                         unsigned long iova)
> +{
> +       u32 value;
> +
> +       value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
> +               SMMU_TLB_FLUSH_VA_SECTION(iova);
> +       smmu_writel(smmu, value, SMMU_TLB_FLUSH);
> +}
> +
> +static inline void smmu_flush_tlb_group(struct tegra_smmu *smmu,
> +                                       unsigned long asid,
> +                                       unsigned long iova)
> +{
> +       u32 value;
> +
> +       value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
> +               SMMU_TLB_FLUSH_VA_GROUP(iova);
> +       smmu_writel(smmu, value, SMMU_TLB_FLUSH);
> +}
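
As a sanity check of the SMMU_TLB_FLUSH encodings, the following user-space sketch rebuilds the values that smmu_flush_tlb_section() and smmu_flush_tlb_group() would write. The bit layout is copied from the macros in this patch; the helper names are made up for illustration and are not part of the driver.

```c
#include <stdint.h>

/* Bit layout copied from the SMMU_TLB_FLUSH_* macros quoted above. */
#define TLB_FLUSH_VA_MATCH_SECTION 2u
#define TLB_FLUSH_VA_MATCH_GROUP   3u
#define TLB_FLUSH_ASID(x)          ((((uint32_t)(x)) & 0x7fu) << 24)
#define TLB_FLUSH_ASID_MATCH       (1u << 31)

/* Hypothetical helper: flush value for the 4 MiB section containing iova. */
static uint32_t tlb_flush_section_value(uint32_t asid, uint32_t iova)
{
	return TLB_FLUSH_ASID_MATCH | TLB_FLUSH_ASID(asid) |
	       ((iova & 0xffc00000u) >> 12) | TLB_FLUSH_VA_MATCH_SECTION;
}

/* Hypothetical helper: flush value for the 16 KiB group containing iova. */
static uint32_t tlb_flush_group_value(uint32_t asid, uint32_t iova)
{
	return TLB_FLUSH_ASID_MATCH | TLB_FLUSH_ASID(asid) |
	       ((iova & 0xffffc000u) >> 12) | TLB_FLUSH_VA_MATCH_GROUP;
}
```

Because the address masks clear at least bits 13:12 before the shift, the two low bits of the result are always free for the match type.
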
> +
> +static inline void smmu_flush(struct tegra_smmu *smmu)
> +{
> +       smmu_readl(smmu, SMMU_CONFIG);
> +}
> +
> +static inline struct tegra_smmu *to_tegra_smmu(struct iommu *iommu)
> +{
> +       return container_of(iommu, struct tegra_smmu, iommu);
> +}
> +
> +static struct tegra_smmu *smmu_handle;
> +
> +static int tegra_smmu_alloc_asid(struct tegra_smmu *smmu, unsigned int *idp)
> +{
> +       unsigned long id;
> +
> +       mutex_lock(&smmu->lock);
> +
> +       id = find_first_zero_bit(smmu->asids, smmu->soc->num_asids);
> +       if (id >= smmu->soc->num_asids) {
> +               mutex_unlock(&smmu->lock);
> +               return -ENOSPC;
> +       }
> +
> +       set_bit(id, smmu->asids);
> +       *idp = id;
> +
> +       mutex_unlock(&smmu->lock);
> +       return 0;
> +}
> +
> +static void tegra_smmu_free_asid(struct tegra_smmu *smmu, unsigned int id)
> +{
> +       mutex_lock(&smmu->lock);
> +       clear_bit(id, smmu->asids);
> +       mutex_unlock(&smmu->lock);
> +}
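
For readers less familiar with the kernel bitmap helpers, the allocation scheme above amounts to a first-fit search over a bitmap. Below is a rough user-space sketch of the same idea without the mutex. The names asid_alloc() and asid_free() are hypothetical, and the count of 128 ASIDs is an assumption derived from the 7-bit ASID mask, not taken from the SoC data in this patch.

```c
#include <errno.h>
#include <limits.h>

#define NUM_ASIDS     128u /* assumption: one ASID per value of the 7-bit field */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* One bit per ASID; a set bit means the ASID is in use. */
static unsigned long asid_map[(NUM_ASIDS + BITS_PER_WORD - 1) / BITS_PER_WORD];

/* Hypothetical helper: claim the lowest free ASID, or return -ENOSPC. */
static int asid_alloc(unsigned int *idp)
{
	unsigned int id;

	for (id = 0; id < NUM_ASIDS; id++) {
		unsigned long bit = 1ul << (id % BITS_PER_WORD);

		if (!(asid_map[id / BITS_PER_WORD] & bit)) {
			asid_map[id / BITS_PER_WORD] |= bit;
			*idp = id;
			return 0;
		}
	}

	return -ENOSPC;
}

/* Hypothetical helper: release a previously allocated ASID. */
static void asid_free(unsigned int id)
{
	if (id < NUM_ASIDS)
		asid_map[id / BITS_PER_WORD] &= ~(1ul << (id % BITS_PER_WORD));
}
```

A freed ASID is handed out again by the next allocation, so stale TLB entries for that ASID need to be flushed before it is reused.
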
> +
> +struct tegra_smmu_address_space *foo = NULL;
> +
> +static int tegra_smmu_domain_init(struct iommu_domain *domain)
> +{
> +       struct tegra_smmu *smmu = smmu_handle;
> +       struct tegra_smmu_address_space *as;
> +       uint32_t *pd, value;
> +       unsigned int i;
> +       int err = 0;
> +
> +       as = kzalloc(sizeof(*as), GFP_KERNEL);
> +       if (!as) {
> +               err = -ENOMEM;
> +               goto out;
> +       }
> +
> +       as->attr = SMMU_PD_READABLE | SMMU_PD_WRITABLE | SMMU_PD_NONSECURE;
> +       as->smmu = smmu_handle;
> +       as->domain = domain;
> +
> +       err = tegra_smmu_alloc_asid(smmu, &as->id);
> +       if (err < 0) {
> +               kfree(as);
> +               goto out;
> +       }
> +
> +       as->pd = alloc_page(GFP_KERNEL | __GFP_DMA);
> +       if (!as->pd) {
> +               err = -ENOMEM;
> +               tegra_smmu_free_asid(smmu, as->id);
> +               kfree(as);
> +               goto out;
> +       }
> +
> +       pd = page_address(as->pd);
> +       SetPageReserved(as->pd);
> +
> +       for (i = 0; i < SMMU_NUM_PDE; i++)
> +               pd[i] = SMMU_PDE_VACANT(i);
> +
> +       smmu->soc->ops->flush_dcache(as->pd, 0, SMMU_SIZE_PD);
> +       smmu_flush_ptc(smmu, as->pd, 0);
> +       smmu_flush_tlb_asid(smmu, as->id);
> +
> +       smmu_writel(smmu, SMMU_PTB_ASID_VALUE(as->id), SMMU_PTB_ASID);
> +       value = SMMU_PTB_DATA_VALUE(as->pd, as->attr);
> +       smmu_writel(smmu, value, SMMU_PTB_DATA);
> +       smmu_flush(smmu);
> +
> +       domain->priv = as;
> +
> +       return 0;
> +
> +out:
> +       return err;
> +}
> +
> +static void tegra_smmu_domain_destroy(struct iommu_domain *domain)
> +{
> +       struct tegra_smmu_address_space *as = domain->priv;
> +
> +       /* TODO: free page directory and page tables */
> +
> +       tegra_smmu_free_asid(as->smmu, as->id);
> +       kfree(as);
> +}
> +
> +static const struct tegra_smmu_swgroup *
> +tegra_smmu_find_swgroup(struct tegra_smmu *smmu, unsigned int swgroup)
> +{
> +       const struct tegra_smmu_swgroup *group = NULL;
> +       unsigned int i;
> +
> +       for (i = 0; i < smmu->soc->num_swgroups; i++) {
> +               if (smmu->soc->swgroups[i].swgroup == swgroup) {
> +                       group = &smmu->soc->swgroups[i];
> +                       break;
> +               }
> +       }
> +
> +       return group;
> +}
> +
> +static int tegra_smmu_enable(struct tegra_smmu *smmu, unsigned int swgroup,
> +                            unsigned int asid)
> +{
> +       const struct tegra_smmu_swgroup *group;
> +       unsigned int i;
> +       u32 value;
> +
> +       for (i = 0; i < smmu->soc->num_clients; i++) {
> +               const struct tegra_mc_client *client = &smmu->soc->clients[i];
> +
> +               if (client->swgroup != swgroup)
> +                       continue;
> +
> +               value = smmu_readl(smmu, client->smmu.reg);
> +               value |= BIT(client->smmu.bit);
> +               smmu_writel(smmu, value, client->smmu.reg);
> +       }
> +
> +       group = tegra_smmu_find_swgroup(smmu, swgroup);
> +       if (group) {
> +               value = smmu_readl(smmu, group->reg);
> +               value &= ~SMMU_ASID_MASK;
> +               value |= SMMU_ASID_VALUE(asid);
> +               value |= SMMU_ASID_ENABLE;
> +               smmu_writel(smmu, value, group->reg);
> +       }
> +
> +       return 0;
> +}
> +
> +static int tegra_smmu_disable(struct tegra_smmu *smmu, unsigned int swgroup,
> +                             unsigned int asid)
> +{
> +       const struct tegra_smmu_swgroup *group;
> +       unsigned int i;
> +       u32 value;
> +
> +       group = tegra_smmu_find_swgroup(smmu, swgroup);
> +       if (group) {
> +               value = smmu_readl(smmu, group->reg);
> +               value &= ~SMMU_ASID_MASK;
> +               value |= SMMU_ASID_VALUE(asid);
> +               value &= ~SMMU_ASID_ENABLE;
> +               smmu_writel(smmu, value, group->reg);
> +       }
> +
> +       for (i = 0; i < smmu->soc->num_clients; i++) {
> +               const struct tegra_mc_client *client = &smmu->soc->clients[i];
> +
> +               if (client->swgroup != swgroup)
> +                       continue;
> +
> +               value = smmu_readl(smmu, client->smmu.reg);
> +               value &= ~BIT(client->smmu.bit);
> +               smmu_writel(smmu, value, client->smmu.reg);
> +       }
> +
> +       return 0;
> +}
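
The read-modify-write on the per-SWGROUP ASID register in tegra_smmu_enable() and tegra_smmu_disable() can be summarized with the following user-space sketch. The masks are copied from SMMU_ASID_ENABLE and SMMU_ASID_MASK; the function names are hypothetical and the register is modelled as a plain value rather than an MMIO access.

```c
#include <stdint.h>

#define ASID_ENABLE (1u << 31) /* SMMU_ASID_ENABLE */
#define ASID_MASK   0x7fu      /* SMMU_ASID_MASK */

/* Hypothetical helper: new register value when enabling translation. */
static uint32_t asid_reg_enable(uint32_t old, unsigned int asid)
{
	old &= ~ASID_MASK;
	old |= asid & ASID_MASK;
	old |= ASID_ENABLE;

	return old;
}

/* Hypothetical helper: new register value when disabling translation. */
static uint32_t asid_reg_disable(uint32_t old, unsigned int asid)
{
	old &= ~ASID_MASK;
	old |= asid & ASID_MASK;
	old &= ~ASID_ENABLE;

	return old;
}
```

As in the quoted code, the disable path leaves the ASID field programmed and only clears the enable bit.
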
> +
> +static int tegra_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> +{
> +       struct tegra_smmu_address_space *as = domain->priv;
> +       struct tegra_smmu *smmu = as->smmu;
> +       struct of_phandle_iter entry;
> +       int err;
> +
> +       of_property_for_each_phandle_with_args(entry, dev->of_node, "iommus",
> +                                              "#iommu-cells", 0) {
> +               unsigned int swgroup = entry.out_args.args[0];
> +
> +               if (entry.out_args.np != smmu->dev->of_node)
> +                       continue;
> +
> +               err = tegra_smmu_enable(smmu, swgroup, as->id);
> +               if (err < 0)
> +                       pr_err("failed to enable SWGROUP#%u\n", swgroup);
> +       }
> +
> +       return 0;
> +}
> +
> +static void tegra_smmu_detach_dev(struct iommu_domain *domain, struct device *dev)
> +{
> +       struct tegra_smmu_address_space *as = domain->priv;
> +       struct tegra_smmu *smmu = as->smmu;
> +       struct of_phandle_iter entry;
> +       int err;
> +
> +       of_property_for_each_phandle_with_args(entry, dev->of_node, "iommus",
> +                                              "#iommu-cells", 0) {
> +               unsigned int swgroup;
> +
> +               if (entry.out_args.np != smmu->dev->of_node)
> +                       continue;
> +
> +               swgroup = entry.out_args.args[0];
> +
> +               err = tegra_smmu_disable(smmu, swgroup, as->id);
> +               if (err < 0)
> +                       pr_err("failed to disable SWGROUP#%u\n", swgroup);
> +       }
> +}
> +
> +static u32 *as_get_pte(struct tegra_smmu_address_space *as, dma_addr_t iova,
> +                      struct page **pagep)
> +{
> +       struct tegra_smmu *smmu = smmu_handle;
> +       u32 *pd = page_address(as->pd), *pt;
> +       u32 pde = (iova >> SMMU_PDE_SHIFT) & 0x3ff;
> +       u32 pte = (iova >> SMMU_PTE_SHIFT) & 0x3ff;
> +       struct page *page;
> +       unsigned int i;
> +
> +       if (pd[pde] != SMMU_PDE_VACANT(pde)) {
> +               page = pfn_to_page(pd[pde] & SMMU_PFN_MASK);
> +               pt = page_address(page);
> +       } else {
> +               page = alloc_page(GFP_KERNEL | __GFP_DMA);
> +               if (!page)
> +                       return NULL;
> +
> +               pt = page_address(page);
> +               SetPageReserved(page);
> +
> +               for (i = 0; i < SMMU_NUM_PTE; i++)
> +                       pt[i] = SMMU_PTE_VACANT(i);
> +
> +               smmu->soc->ops->flush_dcache(page, 0, SMMU_SIZE_PT);
> +
> +               pd[pde] = SMMU_MK_PDE(page, SMMU_PDE_ATTR | SMMU_PDE_NEXT);
> +
> +               smmu->soc->ops->flush_dcache(as->pd, pde << 2, 4);
> +               smmu_flush_ptc(smmu, as->pd, pde << 2);
> +               smmu_flush_tlb_section(smmu, as->id, iova);
> +               smmu_flush(smmu);
> +       }
> +
> +       *pagep = page;
> +
> +       return &pt[pte];
> +}
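
The index arithmetic at the top of as_get_pte() splits a 32-bit IOVA into a 10-bit page directory index, a 10-bit page table index and a 12-bit page offset. Here is a small user-space sketch of that split, using the shift values from the patch; the helper names are hypothetical.

```c
#include <stdint.h>

#define PDE_SHIFT 22 /* SMMU_PDE_SHIFT in the patch */
#define PTE_SHIFT 12 /* SMMU_PTE_SHIFT in the patch */

/* Hypothetical helper: index into the 1024-entry page directory. */
static unsigned int iova_pde_index(uint32_t iova)
{
	return (iova >> PDE_SHIFT) & 0x3ff;
}

/* Hypothetical helper: index into the 1024-entry page table. */
static unsigned int iova_pte_index(uint32_t iova)
{
	return (iova >> PTE_SHIFT) & 0x3ff;
}

/* Hypothetical helper: byte offset within the 4 KiB page. */
static unsigned int iova_page_offset(uint32_t iova)
{
	return iova & 0xfff;
}
```

Each page directory entry therefore covers 4 MiB of I/O virtual address space, which matches the section granularity used by the TLB flush.
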
> +
> +static int tegra_smmu_map(struct iommu_domain *domain, unsigned long iova,
> +                         phys_addr_t paddr, size_t size, int prot)
> +{
> +       struct tegra_smmu_address_space *as = domain->priv;
> +       struct tegra_smmu *smmu = smmu_handle;
> +       unsigned long offset;
> +       struct page *page;
> +       u32 *pte;
> +
> +       pte = as_get_pte(as, iova, &page);
> +       if (!pte)
> +               return -ENOMEM;
> +
> +       offset = offset_in_page(pte);
> +
> +       *pte = __phys_to_pfn(paddr) | SMMU_PTE_ATTR;
> +
> +       smmu->soc->ops->flush_dcache(page, offset, 4);
> +       smmu_flush_ptc(smmu, page, offset);
> +       smmu_flush_tlb_group(smmu, as->id, iova);
> +       smmu_flush(smmu);
> +
> +       return 0;
> +}
> +
> +static size_t tegra_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
> +                              size_t size)
> +{
> +       struct tegra_smmu_address_space *as = domain->priv;
> +       struct tegra_smmu *smmu = smmu_handle;
> +       unsigned long offset;
> +       struct page *page;
> +       u32 *pte;
> +
> +       pte = as_get_pte(as, iova, &page);
> +       if (!pte)
> +               return 0;
> +
> +       offset = offset_in_page(pte);
> +       *pte = 0;
> +
> +       smmu->soc->ops->flush_dcache(page, offset, 4);
> +       smmu_flush_ptc(smmu, page, offset);
> +       smmu_flush_tlb_group(smmu, as->id, iova);
> +       smmu_flush(smmu);
> +
> +       return size;
> +}
> +
> +static phys_addr_t tegra_smmu_iova_to_phys(struct iommu_domain *domain,
> +                                          dma_addr_t iova)
> +{
> +       struct tegra_smmu_address_space *as = domain->priv;
> +       struct page *page;
> +       unsigned long pfn;
> +       u32 *pte;
> +
> +       pte = as_get_pte(as, iova, &page);
> +       pfn = *pte & SMMU_PFN_MASK;
> +
> +       return PFN_PHYS(pfn);
> +}
> +
> +static int tegra_smmu_attach(struct iommu *iommu, struct device *dev)
> +{
> +       struct tegra_smmu *smmu = to_tegra_smmu(iommu);
> +       struct tegra_smmu_group *group;
> +       unsigned int i;
> +
> +       for (i = 0; i < smmu->soc->num_groups; i++) {
> +               group = iommu_group_get_iommudata(smmu->groups[i]);
> +
> +               if (of_match_node(group->matches, dev->of_node)) {
> +                       pr_debug("adding device %s to group %s\n",
> +                                dev_name(dev), group->name);
> +                       iommu_group_add_device(smmu->groups[i], dev);
> +                       break;
> +               }
> +       }
> +
> +       if (i == smmu->soc->num_groups)
> +               return 0;
> +
> +#ifndef CONFIG_ARM64
> +       return arm_iommu_attach_device(dev, group->mapping);
> +#else
> +       return 0;
> +#endif
> +}
> +
> +static int tegra_smmu_detach(struct iommu *iommu, struct device *dev)
> +{
> +       return 0;
> +}
> +
> +static const struct iommu_ops tegra_smmu_ops = {
> +       .domain_init = tegra_smmu_domain_init,
> +       .domain_destroy = tegra_smmu_domain_destroy,
> +       .attach_dev = tegra_smmu_attach_dev,
> +       .detach_dev = tegra_smmu_detach_dev,
> +       .map = tegra_smmu_map,
> +       .unmap = tegra_smmu_unmap,
> +       .iova_to_phys = tegra_smmu_iova_to_phys,
> +       .attach = tegra_smmu_attach,
> +       .detach = tegra_smmu_detach,
> +
> +       .pgsize_bitmap = SZ_4K,
> +};
> +
> +static struct tegra_smmu *tegra_smmu_probe(struct device *dev,
> +                                          const struct tegra_smmu_soc *soc,
> +                                          void __iomem *regs)
> +{
> +       struct tegra_smmu *smmu;
> +       unsigned int i;
> +       size_t size;
> +       u32 value;
> +       int err;
> +
> +       smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
> +       if (!smmu)
> +               return ERR_PTR(-ENOMEM);
> +
> +       size = BITS_TO_LONGS(soc->num_asids) * sizeof(long);
> +
> +       smmu->asids = devm_kzalloc(dev, size, GFP_KERNEL);
> +       if (!smmu->asids)
> +               return ERR_PTR(-ENOMEM);
> +
> +       INIT_LIST_HEAD(&smmu->iommu.list);
> +       mutex_init(&smmu->lock);
> +
> +       smmu->iommu.ops = &tegra_smmu_ops;
> +       smmu->iommu.dev = dev;
> +
> +       smmu->regs = regs;
> +       smmu->soc = soc;
> +       smmu->dev = dev;
> +
> +       smmu_handle = smmu;
> +       bus_set_iommu(&platform_bus_type, &tegra_smmu_ops);
> +
> +       smmu->num_groups = soc->num_groups;
> +
> +       smmu->groups = devm_kcalloc(dev, smmu->num_groups, sizeof(*smmu->groups),
> +                                   GFP_KERNEL);
> +       if (!smmu->groups)
> +               return ERR_PTR(-ENOMEM);
> +
> +       for (i = 0; i < smmu->num_groups; i++) {
> +               struct tegra_smmu_group *group;
> +
> +               smmu->groups[i] = iommu_group_alloc();
> +               if (IS_ERR(smmu->groups[i]))
> +                       return ERR_CAST(smmu->groups[i]);
> +
> +               err = iommu_group_set_name(smmu->groups[i], soc->groups[i].name);
> +               if (err < 0) {
> +               }
> +
> +               group = kzalloc(sizeof(*group), GFP_KERNEL);
> +               if (!group)
> +                       return ERR_PTR(-ENOMEM);
> +
> +               group->matches = soc->groups[i].matches;
> +               group->asid = soc->groups[i].asid;
> +               group->name = soc->groups[i].name;
> +
> +               iommu_group_set_iommudata(smmu->groups[i], group,
> +                                         tegra_smmu_group_release);
> +
> +#ifndef CONFIG_ARM64
> +               group->mapping = arm_iommu_create_mapping(&platform_bus_type,
> +                                                         0, SZ_2G);
> +               if (IS_ERR(group->mapping)) {
> +                       dev_err(dev, "failed to create mapping for group %s: %ld\n",
> +                               group->name, PTR_ERR(group->mapping));
> +                       return ERR_CAST(group->mapping);
> +               }
> +#endif
> +       }
> +
> +       value = (1 << 29) | (8 << 24) | 0x3f;
> +       smmu_writel(smmu, value, 0x18);
> +
> +       value = (1 << 29) | (1 << 28) | 0x20;
> +       smmu_writel(smmu, value, 0x014);
> +
> +       smmu_flush_ptc(smmu, NULL, 0);
> +       smmu_flush_tlb(smmu);
> +       smmu_writel(smmu, SMMU_CONFIG_ENABLE, SMMU_CONFIG);
> +       smmu_flush(smmu);
> +
> +       err = iommu_add(&smmu->iommu);
> +       if (err < 0)
> +               return ERR_PTR(err);
> +
> +       return smmu;
> +}
> +
> +static int tegra_smmu_remove(struct tegra_smmu *smmu)
> +{
> +       iommu_remove(&smmu->iommu);
> +
> +       return 0;
> +}
> +
> +#ifdef CONFIG_ARCH_TEGRA_124_SOC
> +static const struct tegra_smmu_soc tegra124_smmu_soc = {
> +       .groups = tegra124_smmu_groups,
> +       .num_groups = ARRAY_SIZE(tegra124_smmu_groups),
> +       .clients = tegra124_mc_clients,
> +       .num_clients = ARRAY_SIZE(tegra124_mc_clients),
> +       .swgroups = tegra124_swgroups,
> +       .num_swgroups = ARRAY_SIZE(tegra124_swgroups),
> +       .num_asids = 128,
> +       .atom_size = 32,
> +       .ops = &tegra124_smmu_ops,
> +};
> +#endif
> +
> +static const struct tegra_smmu_soc tegra132_smmu_soc = {
> +       .groups = tegra124_smmu_groups,
> +       .num_groups = ARRAY_SIZE(tegra124_smmu_groups),
> +       .clients = tegra124_mc_clients,
> +       .num_clients = ARRAY_SIZE(tegra124_mc_clients),
> +       .swgroups = tegra124_swgroups,
> +       .num_swgroups = ARRAY_SIZE(tegra124_swgroups),
> +       .num_asids = 128,
> +       .atom_size = 32,
> +       .ops = &tegra132_smmu_ops,
> +};
> +
> +struct tegra_mc {
> +       struct device *dev;
> +       struct tegra_smmu *smmu;
> +       void __iomem *regs;
> +       int irq;
> +
> +       const struct tegra_mc_soc *soc;
> +};
> +
> +static inline u32 mc_readl(struct tegra_mc *mc, unsigned long offset)
> +{
> +       return readl(mc->regs + offset);
> +}
> +
> +static inline void mc_writel(struct tegra_mc *mc, u32 value, unsigned long offset)
> +{
> +       writel(value, mc->regs + offset);
> +}
> +
> +struct tegra_mc_soc {
> +       const struct tegra_mc_client *clients;
> +       unsigned int num_clients;
> +
> +       const struct tegra_smmu_soc *smmu;
> +};
> +
> +#ifdef CONFIG_ARCH_TEGRA_124_SOC
> +static const struct tegra_mc_soc tegra124_mc_soc = {
> +       .clients = tegra124_mc_clients,
> +       .num_clients = ARRAY_SIZE(tegra124_mc_clients),
> +       .smmu = &tegra124_smmu_soc,
> +};
> +#endif
> +
> +static const struct tegra_mc_soc tegra132_mc_soc = {
> +       .clients = tegra124_mc_clients,
> +       .num_clients = ARRAY_SIZE(tegra124_mc_clients),
> +       .smmu = &tegra132_smmu_soc,
> +};
> +
> +static const struct of_device_id tegra_mc_of_match[] = {
> +#ifdef CONFIG_ARCH_TEGRA_124_SOC
> +       { .compatible = "nvidia,tegra124-mc", .data = &tegra124_mc_soc },
> +#endif
> +       { .compatible = "nvidia,tegra132-mc", .data = &tegra132_mc_soc },
> +       { }
> +};
> +
> +static irqreturn_t tegra124_mc_irq(int irq, void *data)
> +{
> +       struct tegra_mc *mc = data;
> +       u32 value, status, mask;
> +
> +       /* mask all interrupts to avoid flooding */
> +       mask = mc_readl(mc, MC_INTMASK);
> +       mc_writel(mc, 0, MC_INTMASK);
> +
> +       status = mc_readl(mc, MC_INTSTATUS);
> +       mc_writel(mc, status, MC_INTSTATUS);
> +
> +       dev_dbg(mc->dev, "INTSTATUS: %08x\n", status);
> +
> +       if (status & MC_INT_DECERR_MTS)
> +               dev_dbg(mc->dev, "  DECERR_MTS\n");
> +
> +       if (status & MC_INT_SECERR_SEC)
> +               dev_dbg(mc->dev, "  SECERR_SEC\n");
> +
> +       if (status & MC_INT_DECERR_VPR)
> +               dev_dbg(mc->dev, "  DECERR_VPR\n");
> +
> +       if (status & MC_INT_INVALID_APB_ASID_UPDATE)
> +               dev_dbg(mc->dev, "  INVALID_APB_ASID_UPDATE\n");
> +
> +       if (status & MC_INT_INVALID_SMMU_PAGE)
> +               dev_dbg(mc->dev, "  INVALID_SMMU_PAGE\n");
> +
> +       if (status & MC_INT_ARBITRATION_EMEM)
> +               dev_dbg(mc->dev, "  ARBITRATION_EMEM\n");
> +
> +       if (status & MC_INT_SECURITY_VIOLATION)
> +               dev_dbg(mc->dev, "  SECURITY_VIOLATION\n");
> +
> +       if (status & MC_INT_DECERR_EMEM)
> +               dev_dbg(mc->dev, "  DECERR_EMEM\n");
> +
> +       value = mc_readl(mc, MC_ERR_STATUS);
> +
> +       dev_dbg(mc->dev, "ERR_STATUS: %08x\n", value);
> +       dev_dbg(mc->dev, "  type: %x\n", (value >> 28) & 0x7);
> +       dev_dbg(mc->dev, "  protection: %x\n", (value >> 25) & 0x7);
> +       dev_dbg(mc->dev, "  adr_hi: %x\n", (value >> 20) & 0x3);
> +       dev_dbg(mc->dev, "  swap: %x\n", (value >> 18) & 0x1);
> +       dev_dbg(mc->dev, "  security: %x\n", (value >> 17) & 0x1);
> +       dev_dbg(mc->dev, "  r/w: %x\n", (value >> 16) & 0x1);
> +       dev_dbg(mc->dev, "  adr1: %x\n", (value >> 12) & 0x7);
> +       dev_dbg(mc->dev, "  client: %x\n", value & 0x7f);
> +
> +       value = mc_readl(mc, MC_ERR_ADR);
> +       dev_dbg(mc->dev, "ERR_ADR: %08x\n", value);
> +
> +       mc_writel(mc, mask, MC_INTMASK);
> +
> +       return IRQ_HANDLED;
> +}
> +
> +static int tegra_mc_probe(struct platform_device *pdev)
> +{
> +       const struct of_device_id *match;
> +       struct resource *res;
> +       struct tegra_mc *mc;
> +       unsigned int i;
> +       u32 value;
> +       int err;
> +
> +       match = of_match_node(tegra_mc_of_match, pdev->dev.of_node);
> +       if (!match)
> +               return -ENODEV;
> +
> +       mc = devm_kzalloc(&pdev->dev, sizeof(*mc), GFP_KERNEL);
> +       if (!mc)
> +               return -ENOMEM;
> +
> +       platform_set_drvdata(pdev, mc);
> +       mc->soc = match->data;
> +       mc->dev = &pdev->dev;
> +
> +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +       mc->regs = devm_ioremap_resource(&pdev->dev, res);
> +       if (IS_ERR(mc->regs))
> +               return PTR_ERR(mc->regs);
> +
> +       for (i = 0; i < mc->soc->num_clients; i++) {
> +               const struct latency_allowance *la = &mc->soc->clients[i].latency;
> +               u32 value;
> +
> +               value = readl(mc->regs + la->reg);
> +               value &= ~(la->mask << la->shift);
> +               value |= (la->def & la->mask) << la->shift;
> +               writel(value, mc->regs + la->reg);
> +       }
> +
> +       mc->smmu = tegra_smmu_probe(&pdev->dev, mc->soc->smmu, mc->regs);
> +       if (IS_ERR(mc->smmu)) {
> +               dev_err(&pdev->dev, "failed to probe SMMU: %ld\n",
> +                       PTR_ERR(mc->smmu));
> +               return PTR_ERR(mc->smmu);
> +       }
> +
> +       mc->irq = platform_get_irq(pdev, 0);
> +       if (mc->irq < 0) {
> +               dev_err(&pdev->dev, "interrupt not specified\n");
> +               return mc->irq;
> +       }
> +
> +       err = devm_request_irq(&pdev->dev, mc->irq, tegra124_mc_irq,
> +                              IRQF_SHARED, dev_name(&pdev->dev), mc);
> +       if (err < 0) {
> +               dev_err(&pdev->dev, "failed to request IRQ#%u: %d\n", mc->irq,
> +                       err);
> +               return err;
> +       }
> +
> +       value = MC_INT_DECERR_MTS | MC_INT_SECERR_SEC | MC_INT_DECERR_VPR |
> +               MC_INT_INVALID_APB_ASID_UPDATE | MC_INT_INVALID_SMMU_PAGE |
> +               MC_INT_ARBITRATION_EMEM | MC_INT_SECURITY_VIOLATION |
> +               MC_INT_DECERR_EMEM;
> +       mc_writel(mc, value, MC_INTMASK);
> +
> +       return 0;
> +}
> +
> +static int tegra_mc_remove(struct platform_device *pdev)
> +{
> +       struct tegra_mc *mc = platform_get_drvdata(pdev);
> +       int err;
> +
> +       err = tegra_smmu_remove(mc->smmu);
> +       if (err < 0)
> +               dev_err(&pdev->dev, "failed to remove SMMU: %d\n", err);
> +
> +       return 0;
> +}
> +
> +static struct platform_driver tegra_mc_driver = {
> +       .driver = {
> +               .name = "tegra124-mc",
> +               .of_match_table = tegra_mc_of_match,
> +       },
> +       .probe = tegra_mc_probe,
> +       .remove = tegra_mc_remove,
> +};
> +module_platform_driver(tegra_mc_driver);
> +
> +MODULE_AUTHOR("Thierry Reding <treding@nvidia.com>");
> +MODULE_DESCRIPTION("NVIDIA Tegra124 Memory Controller driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/include/dt-bindings/memory/tegra124-mc.h b/include/dt-bindings/memory/tegra124-mc.h
> new file mode 100644
> index 000000000000..6b1617ce022f
> --- /dev/null
> +++ b/include/dt-bindings/memory/tegra124-mc.h
> @@ -0,0 +1,30 @@
> +#ifndef DT_BINDINGS_MEMORY_TEGRA124_MC_H
> +#define DT_BINDINGS_MEMORY_TEGRA124_MC_H
> +
> +#define TEGRA_SWGROUP_DC       0
> +#define TEGRA_SWGROUP_DCB      1
> +#define TEGRA_SWGROUP_AFI      2
> +#define TEGRA_SWGROUP_AVPC     3
> +#define TEGRA_SWGROUP_HDA      4
> +#define TEGRA_SWGROUP_HC       5
> +#define TEGRA_SWGROUP_MSENC    6
> +#define TEGRA_SWGROUP_PPCS     7
> +#define TEGRA_SWGROUP_SATA     8
> +#define TEGRA_SWGROUP_VDE      9
> +#define TEGRA_SWGROUP_MPCORELP 10
> +#define TEGRA_SWGROUP_MPCORE   11
> +#define TEGRA_SWGROUP_ISP2     12
> +#define TEGRA_SWGROUP_XUSB_HOST        13
> +#define TEGRA_SWGROUP_XUSB_DEV 14
> +#define TEGRA_SWGROUP_ISP2B    15
> +#define TEGRA_SWGROUP_TSEC     16
> +#define TEGRA_SWGROUP_A9AVP    17
> +#define TEGRA_SWGROUP_GPU      18
> +#define TEGRA_SWGROUP_SDMMC1A  19
> +#define TEGRA_SWGROUP_SDMMC2A  20
> +#define TEGRA_SWGROUP_SDMMC3A  21
> +#define TEGRA_SWGROUP_SDMMC4A  22
> +#define TEGRA_SWGROUP_VIC      23
> +#define TEGRA_SWGROUP_VI       24
> +
> +#endif
> --
> 2.0.0
>

^ permalink raw reply	[flat|nested] 133+ messages in thread

* [RFC 04/10] memory: Add Tegra124 memory controller support
@ 2014-06-27 13:29       ` Mikko Perttunen
  0 siblings, 0 replies; 133+ messages in thread
From: Mikko Perttunen @ 2014-06-27 13:29 UTC (permalink / raw)
  To: linux-arm-kernel

In the future, the EMC driver will also want to read and write quite a
few registers in the MC block: MC_EMEM_*, the latency allowance
registers and a couple of others. Downstream just uses __raw_writel with
values from the EMC tables. A fun thing here is that at the point where
the values are written, the code cannot do some things, like reading
registers (I believe), without hanging, so calling into the MC driver to
write the changes might not be very nice either. Related to that, a
read from MC_EMEM_ADR_CFG is used as a barrier in the sequence.
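
As a rough illustration of the sequence described above (a burst of raw
writes taken from a table, with no reads in between, followed by a single
read-back that acts as the barrier), here is a minimal user-space sketch.
The register block is simulated with a plain array, and the
MC_EMEM_ADR_CFG offset, struct mc_reg_write, mc_raw_writel(),
mc_raw_readl() and mc_apply_table() are assumptions made up for this
example; none of them come from the patch or from the downstream code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed offset, for illustration only; check the TRM for the real one. */
#define MC_EMEM_ADR_CFG 0x54

/* Hypothetical table entry: one register offset and the value to write. */
struct mc_reg_write {
	uint32_t offset;
	uint32_t value;
};

/* Stand-ins for __raw_writel()/__raw_readl(), backed by a plain array. */
static inline void mc_raw_writel(volatile uint32_t *regs, uint32_t value,
				 uint32_t offset)
{
	regs[offset / 4] = value;
}

static inline uint32_t mc_raw_readl(volatile uint32_t *regs, uint32_t offset)
{
	return regs[offset / 4];
}

/*
 * Write all table entries back to back without reading anything in
 * between, then read MC_EMEM_ADR_CFG once. On real hardware that final
 * read is what would serve as the barrier; here it only returns the
 * simulated register contents.
 */
static uint32_t mc_apply_table(volatile uint32_t *regs,
			       const struct mc_reg_write *table, size_t count)
{
	size_t i;

	assert(regs && table);

	for (i = 0; i < count; i++)
		mc_raw_writel(regs, table[i].value, table[i].offset);

	return mc_raw_readl(regs, MC_EMEM_ADR_CFG);
}
```

Keeping the whole burst in one helper means the caller never has to call
back into another driver while reads are unsafe, which is the concern
raised above.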

On 26/06/14 23:49, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
>
> The memory controller on NVIDIA Tegra124 exposes various knobs that can
> be used to tune the behaviour of the clients attached to it.
>
> Currently this driver sets up the latency allowance registers to the HW
> defaults. Eventually an API should be exported by this driver (via a
> custom API or a generic subsystem) to allow clients to register latency
> requirements.

I cannot see where the downstream latency allowance code reloads the
latency allowance registers after an EMC clock rate change. Strange.
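
For reference, reloading the defaults would amount to repeating the
read-modify-write that the probe function in this patch performs per
client. The sketch below shows that field update in isolation, in user
space, with the register block simulated by an array. struct
latency_allowance and the two sample entries (display0a and display0b,
which share register 0x2e8) are copied from the patch; la_update() and
la_restore_defaults() are hypothetical helpers invented for this
example, not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as struct latency_allowance in the patch. */
struct latency_allowance {
	unsigned int reg;
	unsigned int shift;
	unsigned int mask;
	unsigned int def;
};

/*
 * Hypothetical helper: replace one latency allowance field in a register
 * value and leave every other bit untouched.
 */
static uint32_t la_update(uint32_t value, const struct latency_allowance *la)
{
	assert(la->shift < 32);

	value &= ~((uint32_t)la->mask << la->shift);
	value |= (uint32_t)(la->def & la->mask) << la->shift;

	return value;
}

/*
 * Hypothetical helper: reprogram the default for each listed client, for
 * example after a clock rate change. regs is a simulated register block
 * indexed in 32-bit words.
 */
static void la_restore_defaults(uint32_t *regs,
				const struct latency_allowance *las,
				size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		regs[las[i].reg / 4] = la_update(regs[las[i].reg / 4], &las[i]);
}
```

Because two clients can share one register (shift 0 and shift 16 in
0x2e8 here), each update has to preserve the neighbouring field, which
is why the helper masks before it ORs in the new value.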

>
> This driver also registers an IOMMU (SMMU) that's implemented by the
> memory controller.
>
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>   drivers/memory/Kconfig                   |    9 +
>   drivers/memory/Makefile                  |    1 +
>   drivers/memory/tegra124-mc.c             | 1945 ++++++++++++++++++++++++++++++
>   include/dt-bindings/memory/tegra124-mc.h |   30 +
>   4 files changed, 1985 insertions(+)
>   create mode 100644 drivers/memory/tegra124-mc.c
>   create mode 100644 include/dt-bindings/memory/tegra124-mc.h
>
> diff --git a/drivers/memory/Kconfig b/drivers/memory/Kconfig
> index c59e9c96e86d..d0f0e6781570 100644
> --- a/drivers/memory/Kconfig
> +++ b/drivers/memory/Kconfig
> @@ -61,6 +61,15 @@ config TEGRA30_MC
>            analysis, especially for IOMMU/SMMU(System Memory Management
>            Unit) module.
>
> +config TEGRA124_MC
> +       bool "Tegra124 Memory Controller driver"
> +       depends on ARCH_TEGRA
> +       select IOMMU_API
> +       help
> +         This driver is for the Memory Controller module available on
> +         Tegra124 SoCs. It provides an IOMMU that can be used for I/O
> +         virtual address translation.
> +
>   config FSL_IFC
>          bool
>          depends on FSL_SOC
> diff --git a/drivers/memory/Makefile b/drivers/memory/Makefile
> index 71160a2b7313..03143927abab 100644
> --- a/drivers/memory/Makefile
> +++ b/drivers/memory/Makefile
> @@ -11,3 +11,4 @@ obj-$(CONFIG_FSL_IFC)         += fsl_ifc.o
>   obj-$(CONFIG_MVEBU_DEVBUS)     += mvebu-devbus.o
>   obj-$(CONFIG_TEGRA20_MC)       += tegra20-mc.o
>   obj-$(CONFIG_TEGRA30_MC)       += tegra30-mc.o
> +obj-$(CONFIG_TEGRA124_MC)      += tegra124-mc.o
> diff --git a/drivers/memory/tegra124-mc.c b/drivers/memory/tegra124-mc.c
> new file mode 100644
> index 000000000000..741755b6785d
> --- /dev/null
> +++ b/drivers/memory/tegra124-mc.c
> @@ -0,0 +1,1945 @@
> +/*
> + * Copyright (C) 2014 NVIDIA CORPORATION.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/iommu.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +
> +#include <dt-bindings/memory/tegra124-mc.h>
> +
> +#include <asm/cacheflush.h>
> +#ifndef CONFIG_ARM64
> +#include <asm/dma-iommu.h>
> +#endif
> +
> +#define MC_INTSTATUS 0x000
> +#define  MC_INT_DECERR_MTS (1 << 16)
> +#define  MC_INT_SECERR_SEC (1 << 13)
> +#define  MC_INT_DECERR_VPR (1 << 12)
> +#define  MC_INT_INVALID_APB_ASID_UPDATE (1 << 11)
> +#define  MC_INT_INVALID_SMMU_PAGE (1 << 10)
> +#define  MC_INT_ARBITRATION_EMEM (1 << 9)
> +#define  MC_INT_SECURITY_VIOLATION (1 << 8)
> +#define  MC_INT_DECERR_EMEM (1 << 6)
> +#define MC_INTMASK 0x004
> +#define MC_ERR_STATUS 0x08
> +#define MC_ERR_ADR 0x0c
> +
> +struct latency_allowance {
> +       unsigned int reg;
> +       unsigned int shift;
> +       unsigned int mask;
> +       unsigned int def;
> +};
> +
> +struct smmu_enable {
> +       unsigned int reg;
> +       unsigned int bit;
> +};
> +
> +struct tegra_mc_client {
> +       unsigned int id;
> +       const char *name;
> +       unsigned int swgroup;
> +
> +       struct smmu_enable smmu;
> +       struct latency_allowance latency;
> +};
> +
> +static const struct tegra_mc_client tegra124_mc_clients[] = {
> +       {
> +               .id = 0x01,
> +               .name = "display0a",
> +               .swgroup = TEGRA_SWGROUP_DC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 1,
> +               },
> +               .latency = {
> +                       .reg = 0x2e8,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0xc2,
> +               },
> +       }, {
> +               .id = 0x02,
> +               .name = "display0ab",
> +               .swgroup = TEGRA_SWGROUP_DCB,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 2,
> +               },
> +               .latency = {
> +                       .reg = 0x2f4,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0xc6,
> +               },
> +       }, {
> +               .id = 0x03,
> +               .name = "display0b",
> +               .swgroup = TEGRA_SWGROUP_DC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 3,
> +               },
> +               .latency = {
> +                       .reg = 0x2e8,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x04,
> +               .name = "display0bb",
> +               .swgroup = TEGRA_SWGROUP_DCB,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 4,
> +               },
> +               .latency = {
> +                       .reg = 0x2f4,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x05,
> +               .name = "display0c",
> +               .swgroup = TEGRA_SWGROUP_DC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 5,
> +               },
> +               .latency = {
> +                       .reg = 0x2ec,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x06,
> +               .name = "display0cb",
> +               .swgroup = TEGRA_SWGROUP_DCB,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 6,
> +               },
> +               .latency = {
> +                       .reg = 0x2f8,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x0e,
> +               .name = "afir",
> +               .swgroup = TEGRA_SWGROUP_AFI,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 14,
> +               },
> +               .latency = {
> +                       .reg = 0x2e0,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x13,
> +               },
> +       }, {
> +               .id = 0x0f,
> +               .name = "avpcarm7r",
> +               .swgroup = TEGRA_SWGROUP_AVPC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 15,
> +               },
> +               .latency = {
> +                       .reg = 0x2e4,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x04,
> +               },
> +       }, {
> +               .id = 0x10,
> +               .name = "displayhc",
> +               .swgroup = TEGRA_SWGROUP_DC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 16,
> +               },
> +               .latency = {
> +                       .reg = 0x2f0,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x11,
> +               .name = "displayhcb",
> +               .swgroup = TEGRA_SWGROUP_DCB,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 17,
> +               },
> +               .latency = {
> +                       .reg = 0x2fc,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x15,
> +               .name = "hdar",
> +               .swgroup = TEGRA_SWGROUP_HDA,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 21,
> +               },
> +               .latency = {
> +                       .reg = 0x318,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x24,
> +               },
> +       }, {
> +               .id = 0x16,
> +               .name = "host1xdmar",
> +               .swgroup = TEGRA_SWGROUP_HC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 22,
> +               },
> +               .latency = {
> +                       .reg = 0x310,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x1e,
> +               },
> +       }, {
> +               .id = 0x17,
> +               .name = "host1xr",
> +               .swgroup = TEGRA_SWGROUP_HC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 23,
> +               },
> +               .latency = {
> +                       .reg = 0x310,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x1c,
> +               .name = "msencsrd",
> +               .swgroup = TEGRA_SWGROUP_MSENC,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 28,
> +               },
> +               .latency = {
> +                       .reg = 0x328,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x23,
> +               },
> +       }, {
> +               .id = 0x1d,
> +               .name = "ppcsahbdmarhdar",
> +               .swgroup = TEGRA_SWGROUP_PPCS,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 29,
> +               },
> +               .latency = {
> +                       .reg = 0x344,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x49,
> +               },
> +       }, {
> +               .id = 0x1e,
> +               .name = "ppcsahbslvr",
> +               .swgroup = TEGRA_SWGROUP_PPCS,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 30,
> +               },
> +               .latency = {
> +                       .reg = 0x344,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x1a,
> +               },
> +       }, {
> +               .id = 0x1f,
> +               .name = "satar",
> +               .swgroup = TEGRA_SWGROUP_SATA,
> +               .smmu = {
> +                       .reg = 0x228,
> +                       .bit = 31,
> +               },
> +               .latency = {
> +                       .reg = 0x350,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x65,
> +               },
> +       }, {
> +               .id = 0x22,
> +               .name = "vdebsevr",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 2,
> +               },
> +               .latency = {
> +                       .reg = 0x354,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x4f,
> +               },
> +       }, {
> +               .id = 0x23,
> +               .name = "vdember",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 3,
> +               },
> +               .latency = {
> +                       .reg = 0x354,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x3d,
> +               },
> +       }, {
> +               .id = 0x24,
> +               .name = "vdemcer",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 4,
> +               },
> +               .latency = {
> +                       .reg = 0x358,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x66,
> +               },
> +       }, {
> +               .id = 0x25,
> +               .name = "vdetper",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 5,
> +               },
> +               .latency = {
> +                       .reg = 0x358,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0xa5,
> +               },
> +       }, {
> +               .id = 0x26,
> +               .name = "mpcorelpr",
> +               .swgroup = TEGRA_SWGROUP_MPCORELP,
> +               .latency = {
> +                       .reg = 0x324,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x04,
> +               },
> +       }, {
> +               .id = 0x27,
> +               .name = "mpcorer",
> +               .swgroup = TEGRA_SWGROUP_MPCORE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 2,
> +               },
> +               .latency = {
> +                       .reg = 0x320,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x04,
> +               },
> +       }, {
> +               .id = 0x2b,
> +               .name = "msencswr",
> +               .swgroup = TEGRA_SWGROUP_MSENC,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 11,
> +               },
> +               .latency = {
> +                       .reg = 0x328,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x31,
> +               .name = "afiw",
> +               .swgroup = TEGRA_SWGROUP_AFI,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 17,
> +               },
> +               .latency = {
> +                       .reg = 0x2e0,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x32,
> +               .name = "avpcarm7w",
> +               .swgroup = TEGRA_SWGROUP_AVPC,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 18,
> +               },
> +               .latency = {
> +                       .reg = 0x2e4,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x35,
> +               .name = "hdaw",
> +               .swgroup = TEGRA_SWGROUP_HDA,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 21,
> +               },
> +               .latency = {
> +                       .reg = 0x318,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x36,
> +               .name = "host1xw",
> +               .swgroup = TEGRA_SWGROUP_HC,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 22,
> +               },
> +               .latency = {
> +                       .reg = 0x314,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x38,
> +               .name = "mpcorelpw",
> +               .swgroup = TEGRA_SWGROUP_MPCORELP,
> +               .latency = {
> +                       .reg = 0x324,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x39,
> +               .name = "mpcorew",
> +               .swgroup = TEGRA_SWGROUP_MPCORE,
> +               .latency = {
> +                       .reg = 0x320,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x3b,
> +               .name = "ppcsahbdmaw",
> +               .swgroup = TEGRA_SWGROUP_PPCS,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 27,
> +               },
> +               .latency = {
> +                       .reg = 0x348,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x3c,
> +               .name = "ppcsahbslvw",
> +               .swgroup = TEGRA_SWGROUP_PPCS,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 28,
> +               },
> +               .latency = {
> +                       .reg = 0x348,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x3d,
> +               .name = "sataw",
> +               .swgroup = TEGRA_SWGROUP_SATA,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 29,
> +               },
> +               .latency = {
> +                       .reg = 0x350,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x65,
> +               },
> +       }, {
> +               .id = 0x3e,
> +               .name = "vdebsevw",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 30,
> +               },
> +               .latency = {
> +                       .reg = 0x35c,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x3f,
> +               .name = "vdedbgw",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x22c,
> +                       .bit = 31,
> +               },
> +               .latency = {
> +                       .reg = 0x35c,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x40,
> +               .name = "vdembew",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 0,
> +               },
> +               .latency = {
> +                       .reg = 0x360,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x41,
> +               .name = "vdetpmw",
> +               .swgroup = TEGRA_SWGROUP_VDE,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 1,
> +               },
> +               .latency = {
> +                       .reg = 0x360,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x44,
> +               .name = "ispra",
> +               .swgroup = TEGRA_SWGROUP_ISP2,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 4,
> +               },
> +               .latency = {
> +                       .reg = 0x370,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x18,
> +               },
> +       }, {
> +               .id = 0x46,
> +               .name = "ispwa",
> +               .swgroup = TEGRA_SWGROUP_ISP2,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 6,
> +               },
> +               .latency = {
> +                       .reg = 0x374,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x47,
> +               .name = "ispwb",
> +               .swgroup = TEGRA_SWGROUP_ISP2,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 7,
> +               },
> +               .latency = {
> +                       .reg = 0x374,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x4a,
> +               .name = "xusb_hostr",
> +               .swgroup = TEGRA_SWGROUP_XUSB_HOST,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 10,
> +               },
> +               .latency = {
> +                       .reg = 0x37c,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x39,
> +               },
> +       }, {
> +               .id = 0x4b,
> +               .name = "xusb_hostw",
> +               .swgroup = TEGRA_SWGROUP_XUSB_HOST,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 11,
> +               },
> +               .latency = {
> +                       .reg = 0x37c,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x4c,
> +               .name = "xusb_devr",
> +               .swgroup = TEGRA_SWGROUP_XUSB_DEV,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 12,
> +               },
> +               .latency = {
> +                       .reg = 0x380,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x39,
> +               },
> +       }, {
> +               .id = 0x4d,
> +               .name = "xusb_devw",
> +               .swgroup = TEGRA_SWGROUP_XUSB_DEV,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 13,
> +               },
> +               .latency = {
> +                       .reg = 0x380,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x4e,
> +               .name = "isprab",
> +               .swgroup = TEGRA_SWGROUP_ISP2B,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 14,
> +               },
> +               .latency = {
> +                       .reg = 0x384,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x18,
> +               },
> +       }, {
> +               .id = 0x50,
> +               .name = "ispwab",
> +               .swgroup = TEGRA_SWGROUP_ISP2B,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 16,
> +               },
> +               .latency = {
> +                       .reg = 0x388,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x51,
> +               .name = "ispwbb",
> +               .swgroup = TEGRA_SWGROUP_ISP2B,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 17,
> +               },
> +               .latency = {
> +                       .reg = 0x388,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x54,
> +               .name = "tsecsrd",
> +               .swgroup = TEGRA_SWGROUP_TSEC,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 20,
> +               },
> +               .latency = {
> +                       .reg = 0x390,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x9b,
> +               },
> +       }, {
> +               .id = 0x55,
> +               .name = "tsecswr",
> +               .swgroup = TEGRA_SWGROUP_TSEC,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 21,
> +               },
> +               .latency = {
> +                       .reg = 0x390,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x56,
> +               .name = "a9avpscr",
> +               .swgroup = TEGRA_SWGROUP_A9AVP,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 22,
> +               },
> +               .latency = {
> +                       .reg = 0x3a4,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x04,
> +               },
> +       }, {
> +               .id = 0x57,
> +               .name = "a9avpscw",
> +               .swgroup = TEGRA_SWGROUP_A9AVP,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 23,
> +               },
> +               .latency = {
> +                       .reg = 0x3a4,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x58,
> +               .name = "gpusrd",
> +               .swgroup = TEGRA_SWGROUP_GPU,
> +               .smmu = {
> +                       /* read-only */
> +                       .reg = 0x230,
> +                       .bit = 24,
> +               },
> +               .latency = {
> +                       .reg = 0x3c8,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x1a,
> +               },
> +       }, {
> +               .id = 0x59,
> +               .name = "gpuswr",
> +               .swgroup = TEGRA_SWGROUP_GPU,
> +               .smmu = {
> +                       /* read-only */
> +                       .reg = 0x230,
> +                       .bit = 25,
> +               },
> +               .latency = {
> +                       .reg = 0x3c8,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x5a,
> +               .name = "displayt",
> +               .swgroup = TEGRA_SWGROUP_DC,
> +               .smmu = {
> +                       .reg = 0x230,
> +                       .bit = 26,
> +               },
> +               .latency = {
> +                       .reg = 0x2f0,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       }, {
> +               .id = 0x60,
> +               .name = "sdmmcra",
> +               .swgroup = TEGRA_SWGROUP_SDMMC1A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 0,
> +               },
> +               .latency = {
> +                       .reg = 0x3b8,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x49,
> +               },
> +       }, {
> +               .id = 0x61,
> +               .name = "sdmmcraa",
> +               .swgroup = TEGRA_SWGROUP_SDMMC2A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 1,
> +               },
> +               .latency = {
> +                       .reg = 0x3bc,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x49,
> +               },
> +       }, {
> +               .id = 0x62,
> +               .name = "sdmmcr",
> +               .swgroup = TEGRA_SWGROUP_SDMMC3A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 2,
> +               },
> +               .latency = {
> +                       .reg = 0x3c0,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x49,
> +               },
> +       }, {
> +               .id = 0x63,
> +               .name = "sdmmcrab",
> +               .swgroup = TEGRA_SWGROUP_SDMMC4A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 3,
> +               },
> +               .latency = {
> +                       .reg = 0x3c4,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x49,
> +               },
> +       }, {
> +               .id = 0x64,
> +               .name = "sdmmcwa",
> +               .swgroup = TEGRA_SWGROUP_SDMMC1A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 4,
> +               },
> +               .latency = {
> +                       .reg = 0x3b8,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x65,
> +               .name = "sdmmcwaa",
> +               .swgroup = TEGRA_SWGROUP_SDMMC2A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 5,
> +               },
> +               .latency = {
> +                       .reg = 0x3bc,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x66,
> +               .name = "sdmmcw",
> +               .swgroup = TEGRA_SWGROUP_SDMMC3A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 6,
> +               },
> +               .latency = {
> +                       .reg = 0x3c0,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x67,
> +               .name = "sdmmcwab",
> +               .swgroup = TEGRA_SWGROUP_SDMMC4A,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 7,
> +               },
> +               .latency = {
> +                       .reg = 0x3c4,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x6c,
> +               .name = "vicsrd",
> +               .swgroup = TEGRA_SWGROUP_VIC,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 12,
> +               },
> +               .latency = {
> +                       .reg = 0x394,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x1a,
> +               },
> +       }, {
> +               .id = 0x6d,
> +               .name = "vicswr",
> +               .swgroup = TEGRA_SWGROUP_VIC,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 13,
> +               },
> +               .latency = {
> +                       .reg = 0x394,
> +                       .shift = 16,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x72,
> +               .name = "viw",
> +               .swgroup = TEGRA_SWGROUP_VI,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 18,
> +               },
> +               .latency = {
> +                       .reg = 0x398,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x80,
> +               },
> +       }, {
> +               .id = 0x73,
> +               .name = "displayd",
> +               .swgroup = TEGRA_SWGROUP_DC,
> +               .smmu = {
> +                       .reg = 0x234,
> +                       .bit = 19,
> +               },
> +               .latency = {
> +                       .reg = 0x3c8,
> +                       .shift = 0,
> +                       .mask = 0xff,
> +                       .def = 0x50,
> +               },
> +       },
> +};
> +
> +struct tegra_smmu_swgroup {
> +       unsigned int swgroup;
> +       unsigned int reg;
> +};
> +
> +static const struct tegra_smmu_swgroup tegra124_swgroups[] = {
> +       { .swgroup = TEGRA_SWGROUP_DC,        .reg = 0x240 },
> +       { .swgroup = TEGRA_SWGROUP_DCB,       .reg = 0x244 },
> +       { .swgroup = TEGRA_SWGROUP_AFI,       .reg = 0x238 },
> +       { .swgroup = TEGRA_SWGROUP_AVPC,      .reg = 0x23c },
> +       { .swgroup = TEGRA_SWGROUP_HDA,       .reg = 0x254 },
> +       { .swgroup = TEGRA_SWGROUP_HC,        .reg = 0x250 },
> +       { .swgroup = TEGRA_SWGROUP_MSENC,     .reg = 0x264 },
> +       { .swgroup = TEGRA_SWGROUP_PPCS,      .reg = 0x270 },
> +       { .swgroup = TEGRA_SWGROUP_SATA,      .reg = 0x274 },
> +       { .swgroup = TEGRA_SWGROUP_VDE,       .reg = 0x27c },
> +       { .swgroup = TEGRA_SWGROUP_ISP2,      .reg = 0x258 },
> +       { .swgroup = TEGRA_SWGROUP_XUSB_HOST, .reg = 0x288 },
> +       { .swgroup = TEGRA_SWGROUP_XUSB_DEV,  .reg = 0x28c },
> +       { .swgroup = TEGRA_SWGROUP_ISP2B,     .reg = 0xaa4 },
> +       { .swgroup = TEGRA_SWGROUP_TSEC,      .reg = 0x294 },
> +       { .swgroup = TEGRA_SWGROUP_A9AVP,     .reg = 0x290 },
> +       { .swgroup = TEGRA_SWGROUP_GPU,       .reg = 0xaa8 },
> +       { .swgroup = TEGRA_SWGROUP_SDMMC1A,   .reg = 0xa94 },
> +       { .swgroup = TEGRA_SWGROUP_SDMMC2A,   .reg = 0xa98 },
> +       { .swgroup = TEGRA_SWGROUP_SDMMC3A,   .reg = 0xa9c },
> +       { .swgroup = TEGRA_SWGROUP_SDMMC4A,   .reg = 0xaa0 },
> +       { .swgroup = TEGRA_SWGROUP_VIC,       .reg = 0x284 },
> +       { .swgroup = TEGRA_SWGROUP_VI,        .reg = 0x280 },
> +};
> +
> +struct tegra_smmu_group_init {
> +       unsigned int asid;
> +       const char *name;
> +
> +       const struct of_device_id *matches;
> +};
> +
> +struct tegra_smmu_soc {
> +       const struct tegra_smmu_group_init *groups;
> +       unsigned int num_groups;
> +
> +       const struct tegra_mc_client *clients;
> +       unsigned int num_clients;
> +
> +       const struct tegra_smmu_swgroup *swgroups;
> +       unsigned int num_swgroups;
> +
> +       unsigned int num_asids;
> +       unsigned int atom_size;
> +
> +       const struct tegra_smmu_ops *ops;
> +};
> +
> +struct tegra_smmu_ops {
> +       void (*flush_dcache)(struct page *page, unsigned long offset,
> +                            size_t size);
> +};
> +
> +struct tegra_smmu_master {
> +       struct list_head list;
> +       struct device *dev;
> +};
> +
> +struct tegra_smmu_group {
> +       const char *name;
> +       const struct of_device_id *matches;
> +       unsigned int asid;
> +
> +#ifndef CONFIG_ARM64
> +       struct dma_iommu_mapping *mapping;
> +#endif
> +       struct list_head masters;
> +};
> +
> +static const struct of_device_id tegra124_periph_matches[] = {
> +       { .compatible = "nvidia,tegra124-sdhci", },
> +       { }
> +};
> +
> +static const struct tegra_smmu_group_init tegra124_smmu_groups[] = {
> +       { 0, "peripherals", tegra124_periph_matches },
> +};
> +
> +static void tegra_smmu_group_release(void *data)
> +{
> +       kfree(data);
> +}
> +
> +struct tegra_smmu {
> +       void __iomem *regs;
> +       struct iommu iommu;
> +       struct device *dev;
> +
> +       const struct tegra_smmu_soc *soc;
> +
> +       struct iommu_group **groups;
> +       unsigned int num_groups;
> +
> +       unsigned long *asids;
> +       struct mutex lock;
> +};
> +
> +struct tegra_smmu_address_space {
> +       struct iommu_domain *domain;
> +       struct tegra_smmu *smmu;
> +       struct page *pd;
> +       unsigned id;
> +       u32 attr;
> +};
> +
> +static inline void smmu_writel(struct tegra_smmu *smmu, u32 value,
> +                              unsigned long offset)
> +{
> +       writel(value, smmu->regs + offset);
> +}
> +
> +static inline u32 smmu_readl(struct tegra_smmu *smmu, unsigned long offset)
> +{
> +       return readl(smmu->regs + offset);
> +}
> +
> +#define SMMU_CONFIG 0x010
> +#define  SMMU_CONFIG_ENABLE (1 << 0)
> +
> +#define SMMU_PTB_ASID 0x01c
> +#define  SMMU_PTB_ASID_VALUE(x) ((x) & 0x7f)
> +
> +#define SMMU_PTB_DATA 0x020
> +#define  SMMU_PTB_DATA_VALUE(page, attr) (page_to_phys(page) >> 12 | (attr))
> +
> +#define SMMU_MK_PDE(page, attr) (page_to_phys(page) >> SMMU_PTE_SHIFT | (attr))
> +
> +#define SMMU_TLB_FLUSH 0x030
> +#define  SMMU_TLB_FLUSH_VA_MATCH_ALL     (0 << 0)
> +#define  SMMU_TLB_FLUSH_VA_MATCH_SECTION (2 << 0)
> +#define  SMMU_TLB_FLUSH_VA_MATCH_GROUP   (3 << 0)
> +#define  SMMU_TLB_FLUSH_ASID(x)          (((x) & 0x7f) << 24)
> +#define  SMMU_TLB_FLUSH_VA_SECTION(addr) ((((addr) & 0xffc00000) >> 12) | \
> +                                         SMMU_TLB_FLUSH_VA_MATCH_SECTION)
> +#define  SMMU_TLB_FLUSH_VA_GROUP(addr)   ((((addr) & 0xffffc000) >> 12) | \
> +                                         SMMU_TLB_FLUSH_VA_MATCH_GROUP)
> +#define  SMMU_TLB_FLUSH_ASID_MATCH       (1 << 31)
> +
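
To make the SMMU_TLB_FLUSH encoding above easier to check, here is a minimal standalone sketch of how the section and group flush values are composed. The helper names are hypothetical and not part of the patch, and the constants are restated as unsigned so that the sketch does not shift into the sign bit. The 4 MiB and 16 KiB granularities are read off the 0xffc00000 and 0xffffc000 masks.

```c
#include <stdint.h>

/* Field encodings restated from the quoted SMMU_TLB_FLUSH definitions,
 * using unsigned constants so the sketch stays portable. */
#define TLB_FLUSH_VA_MATCH_SECTION 2u
#define TLB_FLUSH_VA_MATCH_GROUP   3u
#define TLB_FLUSH_ASID(x)          (((uint32_t)(x) & 0x7fu) << 24)
#define TLB_FLUSH_ASID_MATCH       (1u << 31)

/* Value that flushes the 4 MiB section containing iova for one ASID. */
static uint32_t tlb_flush_section_value(unsigned int asid, uint32_t iova)
{
	return TLB_FLUSH_ASID_MATCH | TLB_FLUSH_ASID(asid) |
	       ((iova & 0xffc00000u) >> 12) | TLB_FLUSH_VA_MATCH_SECTION;
}

/* Value that flushes the 16 KiB group containing iova for one ASID. */
static uint32_t tlb_flush_group_value(unsigned int asid, uint32_t iova)
{
	return TLB_FLUSH_ASID_MATCH | TLB_FLUSH_ASID(asid) |
	       ((iova & 0xffffc000u) >> 12) | TLB_FLUSH_VA_MATCH_GROUP;
}
```

For ASID 1 and IOVA 0x12345678 this gives 0x81012002 for the section flush and 0x81012347 for the group flush.
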
> +#define SMMU_PTC_FLUSH 0x034
> +#define  SMMU_PTC_FLUSH_TYPE_ALL (0 << 0)
> +#define  SMMU_PTC_FLUSH_TYPE_ADR (1 << 0)
> +
> +#define SMMU_PTC_FLUSH_HI 0x9b8
> +#define  SMMU_PTC_FLUSH_HI_MASK 0x3
> +
> +/* per-SWGROUP SMMU_*_ASID register */
> +#define SMMU_ASID_ENABLE (1 << 31)
> +#define SMMU_ASID_MASK 0x7f
> +#define SMMU_ASID_VALUE(x) ((x) & SMMU_ASID_MASK)
> +
> +/* page table definitions */
> +#define SMMU_NUM_PDE 1024
> +#define SMMU_NUM_PTE 1024
> +
> +#define SMMU_SIZE_PD (SMMU_NUM_PDE * 4)
> +#define SMMU_SIZE_PT (SMMU_NUM_PTE * 4)
> +
> +#define SMMU_PDE_SHIFT 22
> +#define SMMU_PTE_SHIFT 12
> +
> +#define SMMU_PFN_MASK 0x000fffff
> +
> +#define SMMU_PD_READABLE       (1 << 31)
> +#define SMMU_PD_WRITABLE       (1 << 30)
> +#define SMMU_PD_NONSECURE      (1 << 29)
> +
> +#define SMMU_PDE_READABLE      (1 << 31)
> +#define SMMU_PDE_WRITABLE      (1 << 30)
> +#define SMMU_PDE_NONSECURE     (1 << 29)
> +#define SMMU_PDE_NEXT          (1 << 28)
> +
> +#define SMMU_PTE_READABLE      (1 << 31)
> +#define SMMU_PTE_WRITABLE      (1 << 30)
> +#define SMMU_PTE_NONSECURE     (1 << 29)
> +
> +#define SMMU_PDE_ATTR          (SMMU_PDE_READABLE | SMMU_PDE_WRITABLE | \
> +                                SMMU_PDE_NONSECURE)
> +#define SMMU_PTE_ATTR          (SMMU_PTE_READABLE | SMMU_PTE_WRITABLE | \
> +                                SMMU_PTE_NONSECURE)
> +
> +#define SMMU_PDE_VACANT(n)     (((n) << 10) | SMMU_PDE_ATTR)
> +#define SMMU_PTE_VACANT(n)     (((n) << 12) | SMMU_PTE_ATTR)
> +
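
The shifts and counts above imply a two-level layout: 1024 page directory entries, each covering 4 MiB, and 1024 page table entries per table, each covering 4 KiB. The mapping code is not part of this hunk, so the following is only a sketch of how a 32-bit IOVA would presumably be split under those definitions. The function names are hypothetical.

```c
#include <stdint.h>

#define PDE_SHIFT 22   /* same value as SMMU_PDE_SHIFT in the quoted patch */
#define PTE_SHIFT 12   /* same value as SMMU_PTE_SHIFT */
#define NUM_PTE   1024 /* same value as SMMU_NUM_PTE */

/* Index into the page directory: the top 10 bits of the IOVA. */
static unsigned int iova_pd_index(uint32_t iova)
{
	return iova >> PDE_SHIFT;
}

/* Index into the page table: the next 10 bits. */
static unsigned int iova_pt_index(uint32_t iova)
{
	return (iova >> PTE_SHIFT) & (NUM_PTE - 1);
}

/* Byte offset inside the 4 KiB page: the low 12 bits. */
static unsigned int iova_page_offset(uint32_t iova)
{
	return iova & ((1u << PTE_SHIFT) - 1);
}
```

With IOVA 0x12345678 the three parts are 0x48, 0x345 and 0x678.
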
> +#ifdef CONFIG_ARCH_TEGRA_124_SOC
> +static void tegra124_flush_dcache(struct page *page, unsigned long offset,
> +                                 size_t size)
> +{
> +       phys_addr_t phys = page_to_phys(page) + offset;
> +       void *virt = page_address(page) + offset;
> +
> +       __cpuc_flush_dcache_area(virt, size);
> +       outer_flush_range(phys, phys + size);
> +}
> +
> +static const struct tegra_smmu_ops tegra124_smmu_ops = {
> +       .flush_dcache = tegra124_flush_dcache,
> +};
> +#endif
> +
> +static void tegra132_flush_dcache(struct page *page, unsigned long offset,
> +                                 size_t size)
> +{
> +       /* TODO: implement */
> +}
> +
> +static const struct tegra_smmu_ops tegra132_smmu_ops = {
> +       .flush_dcache = tegra132_flush_dcache,
> +};
> +
> +static inline void smmu_flush_ptc(struct tegra_smmu *smmu, struct page *page,
> +                                 unsigned long offset)
> +{
> +       phys_addr_t phys = page ? page_to_phys(page) : 0;
> +       u32 value;
> +
> +       if (page) {
> +               offset &= ~(smmu->soc->atom_size - 1);
> +
> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
> +               value = (phys >> 32) & SMMU_PTC_FLUSH_HI_MASK;
> +#else
> +               value = 0;
> +#endif
> +               smmu_writel(smmu, value, SMMU_PTC_FLUSH_HI);
> +
> +               value = (phys + offset) | SMMU_PTC_FLUSH_TYPE_ADR;
> +       } else {
> +               value = SMMU_PTC_FLUSH_TYPE_ALL;
> +       }
> +
> +       smmu_writel(smmu, value, SMMU_PTC_FLUSH);
> +}
> +
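
The address arithmetic in smmu_flush_ptc() can be tried in isolation. The sketch below uses hypothetical helper names and assumes that atom_size is a power of two, which the mask in the quoted code relies on but which this hunk does not state. It computes the two register values without touching hardware.

```c
#include <stdint.h>

#define PTC_FLUSH_TYPE_ADR 1u   /* same value as SMMU_PTC_FLUSH_TYPE_ADR */
#define PTC_FLUSH_HI_MASK  0x3u /* same value as SMMU_PTC_FLUSH_HI_MASK */

/* Low 32 bits written to SMMU_PTC_FLUSH for an entry at phys + offset. */
static uint32_t ptc_flush_value(uint64_t phys, unsigned long offset,
				unsigned long atom_size)
{
	offset &= ~(atom_size - 1); /* round down to the start of the atom */

	return (uint32_t)(phys + offset) | PTC_FLUSH_TYPE_ADR;
}

/* Address bits [33:32], written to SMMU_PTC_FLUSH_HI. */
static uint32_t ptc_flush_hi_value(uint64_t phys)
{
	return (uint32_t)(phys >> 32) & PTC_FLUSH_HI_MASK;
}
```

With a 32-byte atom, an entry at offset 0x24 in a page at 0x80001000 yields 0x80001021. With a 64-byte atom the same entry yields 0x80001001.
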
> +static inline void smmu_flush_tlb(struct tegra_smmu *smmu)
> +{
> +       smmu_writel(smmu, SMMU_TLB_FLUSH_VA_MATCH_ALL, SMMU_TLB_FLUSH);
> +}
> +
> +static inline void smmu_flush_tlb_asid(struct tegra_smmu *smmu,
> +                                      unsigned long asid)
> +{
> +       u32 value;
> +
> +       value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
> +               SMMU_TLB_FLUSH_VA_MATCH_ALL;
> +       smmu_writel(smmu, value, SMMU_TLB_FLUSH);
> +}
> +
> +static inline void smmu_flush_tlb_section(struct tegra_smmu *smmu,
> +                                         unsigned long asid,
> +                                         unsigned long iova)
> +{
> +       u32 value;
> +
> +       value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
> +               SMMU_TLB_FLUSH_VA_SECTION(iova);
> +       smmu_writel(smmu, value, SMMU_TLB_FLUSH);
> +}
> +
> +static inline void smmu_flush_tlb_group(struct tegra_smmu *smmu,
> +                                       unsigned long asid,
> +                                       unsigned long iova)
> +{
> +       u32 value;
> +
> +       value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
> +               SMMU_TLB_FLUSH_VA_GROUP(iova);
> +       smmu_writel(smmu, value, SMMU_TLB_FLUSH);
> +}
> +
> +static inline void smmu_flush(struct tegra_smmu *smmu)
> +{
> +       smmu_readl(smmu, SMMU_CONFIG);
> +}
> +
> +static inline struct tegra_smmu *to_tegra_smmu(struct iommu *iommu)
> +{
> +       return container_of(iommu, struct tegra_smmu, iommu);
> +}
> +
> +static struct tegra_smmu *smmu_handle = NULL;
> +
> +static int tegra_smmu_alloc_asid(struct tegra_smmu *smmu, unsigned int *idp)
> +{
> +       unsigned long id;
> +
> +       mutex_lock(&smmu->lock);
> +
> +       id = find_first_zero_bit(smmu->asids, smmu->soc->num_asids);
> +       if (id >= smmu->soc->num_asids) {
> +               mutex_unlock(&smmu->lock);
> +               return -ENOSPC;
> +       }
> +
> +       set_bit(id, smmu->asids);
> +       *idp = id;
> +
> +       mutex_unlock(&smmu->lock);
> +       return 0;
> +}
> +
> +static void tegra_smmu_free_asid(struct tegra_smmu *smmu, unsigned int id)
> +{
> +       mutex_lock(&smmu->lock);
> +       clear_bit(id, smmu->asids);
> +       mutex_unlock(&smmu->lock);
> +}
> +
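
For reference, the ASID bookkeeping above is a first-free-bit allocator over a bitmap. A minimal userspace sketch of the same idea follows, without the mutex and without the kernel's find_first_zero_bit(), set_bit() and clear_bit() helpers. The names asid_alloc() and asid_free() are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Claim the lowest clear bit below nbits; return its index, or -1 if full. */
static int asid_alloc(unsigned long *map, unsigned int nbits)
{
	unsigned int id;

	for (id = 0; id < nbits; id++) {
		unsigned long bit = 1UL << (id % BITS_PER_WORD);

		if (!(map[id / BITS_PER_WORD] & bit)) {
			map[id / BITS_PER_WORD] |= bit;
			return (int)id;
		}
	}

	return -1;
}

/* Release a previously claimed index so that it can be handed out again. */
static void asid_free(unsigned long *map, unsigned int id)
{
	map[id / BITS_PER_WORD] &= ~(1UL << (id % BITS_PER_WORD));
}
```

A freed ID is the first candidate for the next allocation, so any TLB entries tagged with that ASID need to be flushed before it is reused.
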
> +struct tegra_smmu_address_space *foo = NULL;
> +
> +static int tegra_smmu_domain_init(struct iommu_domain *domain)
> +{
> +       struct tegra_smmu *smmu = smmu_handle;
> +       struct tegra_smmu_address_space *as;
> +       uint32_t *pd, value;
> +       unsigned int i;
> +       int err = 0;
> +
> +       as = kzalloc(sizeof(*as), GFP_KERNEL);
> +       if (!as) {
> +               err = -ENOMEM;
> +               goto out;
> +       }
> +
> +       as->attr = SMMU_PD_READABLE | SMMU_PD_WRITABLE | SMMU_PD_NONSECURE;
> +       as->smmu = smmu_handle;
> +       as->domain = domain;
> +
> +       err = tegra_smmu_alloc_asid(smmu, &as->id);
> +       if (err < 0) {
> +               kfree(as);
> +               goto out;
> +       }
> +
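> +       as->pd = alloc_page(GFP_KERNEL | __GFP_DMA);
> +       if (!as->pd) {
> +               /* undo the ASID allocation and free the address space */
> +               tegra_smmu_free_asid(smmu, as->id);
> +               kfree(as);
> +               err = -ENOMEM;
> +               goto out;
> +       }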
> +
> +       pd = page_address(as->pd);
> +       SetPageReserved(as->pd);
> +
> +       for (i = 0; i < SMMU_NUM_PDE; i++)
> +               pd[i] = SMMU_PDE_VACANT(i);
> +
> +       smmu->soc->ops->flush_dcache(as->pd, 0, SMMU_SIZE_PD);
> +       smmu_flush_ptc(smmu, as->pd, 0);
> +       smmu_flush_tlb_asid(smmu, as->id);
> +
> +       smmu_writel(smmu, as->id & 0x7f, SMMU_PTB_ASID);
> +       value = SMMU_PTB_DATA_VALUE(as->pd, as->attr);
> +       smmu_writel(smmu, value, SMMU_PTB_DATA);
> +       smmu_flush(smmu);
> +
> +       domain->priv = as;
> +
> +       return 0;
> +
> +out:
> +       return err;
> +}
> +
> +static void tegra_smmu_domain_destroy(struct iommu_domain *domain)
> +{
> +       struct tegra_smmu_address_space *as = domain->priv;
> +
> +       /* TODO: free page directory and page tables */
> +
> +       tegra_smmu_free_asid(as->smmu, as->id);
> +       kfree(as);
> +}
> +
> +static const struct tegra_smmu_swgroup *
> +tegra_smmu_find_swgroup(struct tegra_smmu *smmu, unsigned int swgroup)
> +{
> +       const struct tegra_smmu_swgroup *group = NULL;
> +       unsigned int i;
> +
> +       for (i = 0; i < smmu->soc->num_swgroups; i++) {
> +               if (smmu->soc->swgroups[i].swgroup == swgroup) {
> +                       group = &smmu->soc->swgroups[i];
> +                       break;
> +               }
> +       }
> +
> +       return group;
> +}
> +
> +static int tegra_smmu_enable(struct tegra_smmu *smmu, unsigned int swgroup,
> +                            unsigned int asid)
> +{
> +       const struct tegra_smmu_swgroup *group;
> +       unsigned int i;
> +       u32 value;
> +
> +       for (i = 0; i < smmu->soc->num_clients; i++) {
> +               const struct tegra_mc_client *client = &smmu->soc->clients[i];
> +
> +               if (client->swgroup != swgroup)
> +                       continue;
> +
> +               value = smmu_readl(smmu, client->smmu.reg);
> +               value |= BIT(client->smmu.bit);
> +               smmu_writel(smmu, value, client->smmu.reg);
> +       }
> +
> +       group = tegra_smmu_find_swgroup(smmu, swgroup);
> +       if (group) {
> +               value = smmu_readl(smmu, group->reg);
> +               value &= ~SMMU_ASID_MASK;
> +               value |= SMMU_ASID_VALUE(asid);
> +               value |= SMMU_ASID_ENABLE;
> +               smmu_writel(smmu, value, group->reg);
> +       }
> +
> +       return 0;
> +}
> +
> +static int tegra_smmu_disable(struct tegra_smmu *smmu, unsigned int swgroup,
> +                             unsigned int asid)
> +{
> +       const struct tegra_smmu_swgroup *group;
> +       unsigned int i;
> +       u32 value;
> +
> +       group = tegra_smmu_find_swgroup(smmu, swgroup);
> +       if (group) {
> +               value = smmu_readl(smmu, group->reg);
> +               value &= ~SMMU_ASID_MASK;
> +               value |= SMMU_ASID_VALUE(asid);
> +               value &= ~SMMU_ASID_ENABLE;
> +               smmu_writel(smmu, value, group->reg);
> +       }
> +
> +       for (i = 0; i < smmu->soc->num_clients; i++) {
> +               const struct tegra_mc_client *client = &smmu->soc->clients[i];
> +
> +               if (client->swgroup != swgroup)
> +                       continue;
> +
> +               value = smmu_readl(smmu, client->smmu.reg);
> +               value &= ~BIT(client->smmu.bit);
> +               smmu_writel(smmu, value, client->smmu.reg);
> +       }
> +
> +       return 0;
> +}
> +
> +static int tegra_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> +{
> +       struct tegra_smmu_address_space *as = domain->priv;
> +       struct tegra_smmu *smmu = as->smmu;
> +       struct of_phandle_iter entry;
> +       int err;
> +
> +       of_property_for_each_phandle_with_args(entry, dev->of_node, "iommus",
> +                                              "#iommu-cells", 0) {
> +               unsigned int swgroup = entry.out_args.args[0];
> +
> +               if (entry.out_args.np != smmu->dev->of_node)
> +                       continue;
> +
> +               err = tegra_smmu_enable(smmu, swgroup, as->id);
> +               if (err < 0)
> +                       pr_err("failed to enable SWGROUP#%u\n", swgroup);
> +       }
> +
> +       return 0;
> +}
> +
> +static void tegra_smmu_detach_dev(struct iommu_domain *domain, struct device *dev)
> +{
> +       struct tegra_smmu_address_space *as = domain->priv;
> +       struct tegra_smmu *smmu = as->smmu;
> +       struct of_phandle_iter entry;
> +       int err;
> +
> +       of_property_for_each_phandle_with_args(entry, dev->of_node, "iommus",
> +                                              "#iommu-cells", 0) {
> +               unsigned int swgroup;
> +
> +               if (entry.out_args.np != smmu->dev->of_node)
> +                       continue;
> +
> +               swgroup = entry.out_args.args[0];
> +
> +               err = tegra_smmu_disable(smmu, swgroup, as->id);
> +               if (err < 0) {
> +                       pr_err("failed to disable SWGROUP#%u\n", swgroup);
> +               }
> +       }
> +}
> +
> +static u32 *as_get_pte(struct tegra_smmu_address_space *as, dma_addr_t iova,
> +                      struct page **pagep)
> +{
> +       struct tegra_smmu *smmu = smmu_handle;
> +       u32 *pd = page_address(as->pd), *pt;
> +       u32 pde = (iova >> SMMU_PDE_SHIFT) & 0x3ff;
> +       u32 pte = (iova >> SMMU_PTE_SHIFT) & 0x3ff;
> +       struct page *page;
> +       unsigned int i;
> +
> +       if (pd[pde] != SMMU_PDE_VACANT(pde)) {
> +               page = pfn_to_page(pd[pde] & SMMU_PFN_MASK);
> +               pt = page_address(page);
> +       } else {
> +               page = alloc_page(GFP_KERNEL | __GFP_DMA);
> +               if (!page)
> +                       return NULL;
> +
> +               pt = page_address(page);
> +               SetPageReserved(page);
> +
> +               for (i = 0; i < SMMU_NUM_PTE; i++)
> +                       pt[i] = SMMU_PTE_VACANT(i);
> +
> +               smmu->soc->ops->flush_dcache(page, 0, SMMU_SIZE_PT);
> +
> +               pd[pde] = SMMU_MK_PDE(page, SMMU_PDE_ATTR | SMMU_PDE_NEXT);
> +
> +               smmu->soc->ops->flush_dcache(as->pd, pde << 2, 4);
> +               smmu_flush_ptc(smmu, as->pd, pde << 2);
> +               smmu_flush_tlb_section(smmu, as->id, iova);
> +               smmu_flush(smmu);
> +       }
> +
> +       *pagep = page;
> +
> +       return &pt[pte];
> +}
> +
> +static int tegra_smmu_map(struct iommu_domain *domain, unsigned long iova,
> +                         phys_addr_t paddr, size_t size, int prot)
> +{
> +       struct tegra_smmu_address_space *as = domain->priv;
> +       struct tegra_smmu *smmu = smmu_handle;
> +       unsigned long offset;
> +       struct page *page;
> +       u32 *pte;
> +
> +       pte = as_get_pte(as, iova, &page);
> +       if (!pte)
> +               return -ENOMEM;
> +
> +       offset = offset_in_page(pte);
> +
> +       *pte = __phys_to_pfn(paddr) | SMMU_PTE_ATTR;
> +
> +       smmu->soc->ops->flush_dcache(page, offset, 4);
> +       smmu_flush_ptc(smmu, page, offset);
> +       smmu_flush_tlb_group(smmu, as->id, iova);
> +       smmu_flush(smmu);
> +
> +       return 0;
> +}
> +
> +static size_t tegra_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
> +                              size_t size)
> +{
> +       struct tegra_smmu_address_space *as = domain->priv;
> +       struct tegra_smmu *smmu = smmu_handle;
> +       unsigned long offset;
> +       struct page *page;
> +       u32 *pte;
> +
> +       pte = as_get_pte(as, iova, &page);
> +       if (!pte)
> +               return 0;
> +
> +       offset = offset_in_page(pte);
> +       *pte = 0;
> +
> +       smmu->soc->ops->flush_dcache(page, offset, 4);
> +       smmu_flush_ptc(smmu, page, offset);
> +       smmu_flush_tlb_group(smmu, as->id, iova);
> +       smmu_flush(smmu);
> +
> +       return size;
> +}
> +
> +static phys_addr_t tegra_smmu_iova_to_phys(struct iommu_domain *domain,
> +                                          dma_addr_t iova)
> +{
> +       struct tegra_smmu_address_space *as = domain->priv;
> +       struct page *page;
> +       unsigned long pfn;
> +       u32 *pte;
> +
> +       pte = as_get_pte(as, iova, &page);
> +       pfn = *pte & SMMU_PFN_MASK;
> +
> +       return PFN_PHYS(pfn);
> +}
> +
> +static int tegra_smmu_attach(struct iommu *iommu, struct device *dev)
> +{
> +       struct tegra_smmu *smmu = to_tegra_smmu(iommu);
> +       struct tegra_smmu_group *group;
> +       unsigned int i;
> +
> +       for (i = 0; i < smmu->soc->num_groups; i++) {
> +               group = iommu_group_get_iommudata(smmu->groups[i]);
> +
> +               if (of_match_node(group->matches, dev->of_node)) {
> +                       pr_debug("adding device %s to group %s\n",
> +                                dev_name(dev), group->name);
> +                       iommu_group_add_device(smmu->groups[i], dev);
> +                       break;
> +               }
> +       }
> +
> +       if (i == smmu->soc->num_groups)
> +               return 0;
> +
> +#ifndef CONFIG_ARM64
> +       return arm_iommu_attach_device(dev, group->mapping);
> +#else
> +       return 0;
> +#endif
> +}
> +
> +static int tegra_smmu_detach(struct iommu *iommu, struct device *dev)
> +{
> +       return 0;
> +}
> +
> +static const struct iommu_ops tegra_smmu_ops = {
> +       .domain_init = tegra_smmu_domain_init,
> +       .domain_destroy = tegra_smmu_domain_destroy,
> +       .attach_dev = tegra_smmu_attach_dev,
> +       .detach_dev = tegra_smmu_detach_dev,
> +       .map = tegra_smmu_map,
> +       .unmap = tegra_smmu_unmap,
> +       .iova_to_phys = tegra_smmu_iova_to_phys,
> +       .attach = tegra_smmu_attach,
> +       .detach = tegra_smmu_detach,
> +
> +       .pgsize_bitmap = SZ_4K,
> +};
> +
> +static struct tegra_smmu *tegra_smmu_probe(struct device *dev,
> +                                          const struct tegra_smmu_soc *soc,
> +                                          void __iomem *regs)
> +{
> +       struct tegra_smmu *smmu;
> +       unsigned int i;
> +       size_t size;
> +       u32 value;
> +       int err;
> +
> +       smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
> +       if (!smmu)
> +               return ERR_PTR(-ENOMEM);
> +
> +       size = BITS_TO_LONGS(soc->num_asids) * sizeof(long);
> +
> +       smmu->asids = devm_kzalloc(dev, size, GFP_KERNEL);
> +       if (!smmu->asids)
> +               return ERR_PTR(-ENOMEM);
> +
> +       INIT_LIST_HEAD(&smmu->iommu.list);
> +       mutex_init(&smmu->lock);
> +
> +       smmu->iommu.ops = &tegra_smmu_ops;
> +       smmu->iommu.dev = dev;
> +
> +       smmu->regs = regs;
> +       smmu->soc = soc;
> +       smmu->dev = dev;
> +
> +       smmu_handle = smmu;
> +       bus_set_iommu(&platform_bus_type, &tegra_smmu_ops);
> +
> +       smmu->num_groups = soc->num_groups;
> +
> +       smmu->groups = devm_kcalloc(dev, smmu->num_groups, sizeof(*smmu->groups),
> +                                   GFP_KERNEL);
> +       if (!smmu->groups)
> +               return ERR_PTR(-ENOMEM);
> +
> +       for (i = 0; i < smmu->num_groups; i++) {
> +               struct tegra_smmu_group *group;
> +
> +               smmu->groups[i] = iommu_group_alloc();
> +               if (IS_ERR(smmu->groups[i]))
> +                       return ERR_CAST(smmu->groups[i]);
> +
> +               err = iommu_group_set_name(smmu->groups[i], soc->groups[i].name);
> +               if (err < 0) {
> +               }
> +
> +               group = kzalloc(sizeof(*group), GFP_KERNEL);
> +               if (!group)
> +                       return ERR_PTR(-ENOMEM);
> +
> +               group->matches = soc->groups[i].matches;
> +               group->asid = soc->groups[i].asid;
> +               group->name = soc->groups[i].name;
> +
> +               iommu_group_set_iommudata(smmu->groups[i], group,
> +                                         tegra_smmu_group_release);
> +
> +#ifndef CONFIG_ARM64
> +               group->mapping = arm_iommu_create_mapping(&platform_bus_type,
> +                                                         0, SZ_2G);
> +               if (IS_ERR(group->mapping)) {
> +                       dev_err(dev, "failed to create mapping for group %s: %ld\n",
> +                               group->name, PTR_ERR(group->mapping));
> +                       return ERR_CAST(group->mapping);
> +               }
> +#endif
> +       }
> +
> +       value = (1 << 29) | (8 << 24) | 0x3f;
> +       smmu_writel(smmu, value, 0x18);
> +
> +       value = (1 << 29) | (1 << 28) | 0x20;
> +       smmu_writel(smmu, value, 0x014);
> +
> +       smmu_flush_ptc(smmu, NULL, 0);
> +       smmu_flush_tlb(smmu);
> +       smmu_writel(smmu, SMMU_CONFIG_ENABLE, SMMU_CONFIG);
> +       smmu_flush(smmu);
> +
> +       err = iommu_add(&smmu->iommu);
> +       if (err < 0)
> +               return ERR_PTR(err);
> +
> +       return smmu;
> +}
> +
> +static int tegra_smmu_remove(struct tegra_smmu *smmu)
> +{
> +       iommu_remove(&smmu->iommu);
> +
> +       return 0;
> +}
> +
> +#ifdef CONFIG_ARCH_TEGRA_124_SOC
> +static const struct tegra_smmu_soc tegra124_smmu_soc = {
> +       .groups = tegra124_smmu_groups,
> +       .num_groups = ARRAY_SIZE(tegra124_smmu_groups),
> +       .clients = tegra124_mc_clients,
> +       .num_clients = ARRAY_SIZE(tegra124_mc_clients),
> +       .swgroups = tegra124_swgroups,
> +       .num_swgroups = ARRAY_SIZE(tegra124_swgroups),
> +       .num_asids = 128,
> +       .atom_size = 32,
> +       .ops = &tegra124_smmu_ops,
> +};
> +#endif
> +
> +static const struct tegra_smmu_soc tegra132_smmu_soc = {
> +       .groups = tegra124_smmu_groups,
> +       .num_groups = ARRAY_SIZE(tegra124_smmu_groups),
> +       .clients = tegra124_mc_clients,
> +       .num_clients = ARRAY_SIZE(tegra124_mc_clients),
> +       .swgroups = tegra124_swgroups,
> +       .num_swgroups = ARRAY_SIZE(tegra124_swgroups),
> +       .num_asids = 128,
> +       .atom_size = 32,
> +       .ops = &tegra132_smmu_ops,
> +};
> +
> +struct tegra_mc {
> +       struct device *dev;
> +       struct tegra_smmu *smmu;
> +       void __iomem *regs;
> +       int irq;
> +
> +       const struct tegra_mc_soc *soc;
> +};
> +
> +static inline u32 mc_readl(struct tegra_mc *mc, unsigned long offset)
> +{
> +       return readl(mc->regs + offset);
> +}
> +
> +static inline void mc_writel(struct tegra_mc *mc, u32 value, unsigned long offset)
> +{
> +       writel(value, mc->regs + offset);
> +}
> +
> +struct tegra_mc_soc {
> +       const struct tegra_mc_client *clients;
> +       unsigned int num_clients;
> +
> +       const struct tegra_smmu_soc *smmu;
> +};
> +
> +#ifdef CONFIG_ARCH_TEGRA_124_SOC
> +static const struct tegra_mc_soc tegra124_mc_soc = {
> +       .clients = tegra124_mc_clients,
> +       .num_clients = ARRAY_SIZE(tegra124_mc_clients),
> +       .smmu = &tegra124_smmu_soc,
> +};
> +#endif
> +
> +static const struct tegra_mc_soc tegra132_mc_soc = {
> +       .clients = tegra124_mc_clients,
> +       .num_clients = ARRAY_SIZE(tegra124_mc_clients),
> +       .smmu = &tegra132_smmu_soc,
> +};
> +
> +static const struct of_device_id tegra_mc_of_match[] = {
> +#ifdef CONFIG_ARCH_TEGRA_124_SOC
> +       { .compatible = "nvidia,tegra124-mc", .data = &tegra124_mc_soc },
> +#endif
> +       { .compatible = "nvidia,tegra132-mc", .data = &tegra132_mc_soc },
> +       { }
> +};
> +
> +static irqreturn_t tegra124_mc_irq(int irq, void *data)
> +{
> +       struct tegra_mc *mc = data;
> +       u32 value, status, mask;
> +
> +       /* mask all interrupts to avoid flooding */
> +       mask = mc_readl(mc, MC_INTMASK);
> +       mc_writel(mc, 0, MC_INTMASK);
> +
> +       status = mc_readl(mc, MC_INTSTATUS);
> +       mc_writel(mc, status, MC_INTSTATUS);
> +
> +       dev_dbg(mc->dev, "INTSTATUS: %08x\n", status);
> +
> +       if (status & MC_INT_DECERR_MTS)
> +               dev_dbg(mc->dev, "  DECERR_MTS\n");
> +
> +       if (status & MC_INT_SECERR_SEC)
> +               dev_dbg(mc->dev, "  SECERR_SEC\n");
> +
> +       if (status & MC_INT_DECERR_VPR)
> +               dev_dbg(mc->dev, "  DECERR_VPR\n");
> +
> +       if (status & MC_INT_INVALID_APB_ASID_UPDATE)
> +               dev_dbg(mc->dev, "  INVALID_APB_ASID_UPDATE\n");
> +
> +       if (status & MC_INT_INVALID_SMMU_PAGE)
> +               dev_dbg(mc->dev, "  INVALID_SMMU_PAGE\n");
> +
> +       if (status & MC_INT_ARBITRATION_EMEM)
> +               dev_dbg(mc->dev, "  ARBITRATION_EMEM\n");
> +
> +       if (status & MC_INT_SECURITY_VIOLATION)
> +               dev_dbg(mc->dev, "  SECURITY_VIOLATION\n");
> +
> +       if (status & MC_INT_DECERR_EMEM)
> +               dev_dbg(mc->dev, "  DECERR_EMEM\n");
> +
> +       value = mc_readl(mc, MC_ERR_STATUS);
> +
> +       dev_dbg(mc->dev, "ERR_STATUS: %08x\n", value);
> +       dev_dbg(mc->dev, "  type: %x\n", (value >> 28) & 0x7);
> +       dev_dbg(mc->dev, "  protection: %x\n", (value >> 25) & 0x7);
> +       dev_dbg(mc->dev, "  adr_hi: %x\n", (value >> 20) & 0x3);
> +       dev_dbg(mc->dev, "  swap: %x\n", (value >> 18) & 0x1);
> +       dev_dbg(mc->dev, "  security: %x\n", (value >> 17) & 0x1);
> +       dev_dbg(mc->dev, "  r/w: %x\n", (value >> 16) & 0x1);
> +       dev_dbg(mc->dev, "  adr1: %x\n", (value >> 12) & 0x7);
> +       dev_dbg(mc->dev, "  client: %x\n", value & 0x7f);
> +
> +       value = mc_readl(mc, MC_ERR_ADR);
> +       dev_dbg(mc->dev, "ERR_ADR: %08x\n", value);
> +
> +       mc_writel(mc, mask, MC_INTMASK);
> +
> +       return IRQ_HANDLED;
> +}
> +
> +static int tegra_mc_probe(struct platform_device *pdev)
> +{
> +       const struct of_device_id *match;
> +       struct resource *res;
> +       struct tegra_mc *mc;
> +       unsigned int i;
> +       u32 value;
> +       int err;
> +
> +       match = of_match_node(tegra_mc_of_match, pdev->dev.of_node);
> +       if (!match)
> +               return -ENODEV;
> +
> +       mc = devm_kzalloc(&pdev->dev, sizeof(*mc), GFP_KERNEL);
> +       if (!mc)
> +               return -ENOMEM;
> +
> +       platform_set_drvdata(pdev, mc);
> +       mc->soc = match->data;
> +       mc->dev = &pdev->dev;
> +
> +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +       mc->regs = devm_ioremap_resource(&pdev->dev, res);
> +       if (IS_ERR(mc->regs))
> +               return PTR_ERR(mc->regs);
> +
> +       for (i = 0; i < mc->soc->num_clients; i++) {
> +               const struct latency_allowance *la = &mc->soc->clients[i].latency;
> +               u32 value;
> +
> +               value = readl(mc->regs + la->reg);
> +               value &= ~(la->mask << la->shift);
> +               value |= (la->def & la->mask) << la->shift;
> +               writel(value, mc->regs + la->reg);
> +       }
> +
> +       mc->smmu = tegra_smmu_probe(&pdev->dev, mc->soc->smmu, mc->regs);
> +       if (IS_ERR(mc->smmu)) {
> +               dev_err(&pdev->dev, "failed to probe SMMU: %ld\n",
> +                       PTR_ERR(mc->smmu));
> +               return PTR_ERR(mc->smmu);
> +       }
> +
> +       mc->irq = platform_get_irq(pdev, 0);
> +       if (mc->irq < 0) {
> +               dev_err(&pdev->dev, "interrupt not specified\n");
> +               return mc->irq;
> +       }
> +
> +       err = devm_request_irq(&pdev->dev, mc->irq, tegra124_mc_irq,
> +                              IRQF_SHARED, dev_name(&pdev->dev), mc);
> +       if (err < 0) {
> +               dev_err(&pdev->dev, "failed to request IRQ#%u: %d\n", mc->irq,
> +                       err);
> +               return err;
> +       }
> +
> +       value = MC_INT_DECERR_MTS | MC_INT_SECERR_SEC | MC_INT_DECERR_VPR |
> +               MC_INT_INVALID_APB_ASID_UPDATE | MC_INT_INVALID_SMMU_PAGE |
> +               MC_INT_ARBITRATION_EMEM | MC_INT_SECURITY_VIOLATION |
> +               MC_INT_DECERR_EMEM;
> +       mc_writel(mc, value, MC_INTMASK);
> +
> +       return 0;
> +}
> +
> +static int tegra_mc_remove(struct platform_device *pdev)
> +{
> +       struct tegra_mc *mc = platform_get_drvdata(pdev);
> +       int err;
> +
> +       err = tegra_smmu_remove(mc->smmu);
> +       if (err < 0)
> +               dev_err(&pdev->dev, "failed to remove SMMU: %d\n", err);
> +
> +       return 0;
> +}
> +
> +static struct platform_driver tegra_mc_driver = {
> +       .driver = {
> +               .name = "tegra124-mc",
> +               .of_match_table = tegra_mc_of_match,
> +       },
> +       .probe = tegra_mc_probe,
> +       .remove = tegra_mc_remove,
> +};
> +module_platform_driver(tegra_mc_driver);
> +
> +MODULE_AUTHOR("Thierry Reding <treding@nvidia.com>");
> +MODULE_DESCRIPTION("NVIDIA Tegra124 Memory Controller driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/include/dt-bindings/memory/tegra124-mc.h b/include/dt-bindings/memory/tegra124-mc.h
> new file mode 100644
> index 000000000000..6b1617ce022f
> --- /dev/null
> +++ b/include/dt-bindings/memory/tegra124-mc.h
> @@ -0,0 +1,30 @@
> +#ifndef DT_BINDINGS_MEMORY_TEGRA124_MC_H
> +#define DT_BINDINGS_MEMORY_TEGRA124_MC_H
> +
> +#define TEGRA_SWGROUP_DC       0
> +#define TEGRA_SWGROUP_DCB      1
> +#define TEGRA_SWGROUP_AFI      2
> +#define TEGRA_SWGROUP_AVPC     3
> +#define TEGRA_SWGROUP_HDA      4
> +#define TEGRA_SWGROUP_HC       5
> +#define TEGRA_SWGROUP_MSENC    6
> +#define TEGRA_SWGROUP_PPCS     7
> +#define TEGRA_SWGROUP_SATA     8
> +#define TEGRA_SWGROUP_VDE      9
> +#define TEGRA_SWGROUP_MPCORELP 10
> +#define TEGRA_SWGROUP_MPCORE   11
> +#define TEGRA_SWGROUP_ISP2     12
> +#define TEGRA_SWGROUP_XUSB_HOST        13
> +#define TEGRA_SWGROUP_XUSB_DEV 14
> +#define TEGRA_SWGROUP_ISP2B    15
> +#define TEGRA_SWGROUP_TSEC     16
> +#define TEGRA_SWGROUP_A9AVP    17
> +#define TEGRA_SWGROUP_GPU      18
> +#define TEGRA_SWGROUP_SDMMC1A  19
> +#define TEGRA_SWGROUP_SDMMC2A  20
> +#define TEGRA_SWGROUP_SDMMC3A  21
> +#define TEGRA_SWGROUP_SDMMC4A  22
> +#define TEGRA_SWGROUP_VIC      23
> +#define TEGRA_SWGROUP_VI       24
> +
> +#endif
> --
> 2.0.0
>
>
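
For readers following as_get_pte() in the quoted patch: the lookup treats
the IOVA as two 10-bit table indices plus a page offset. A minimal
standalone sketch of that split is below. The shift values are an
assumption (SMMU_PDE_SHIFT and SMMU_PTE_SHIFT are defined in a part of the
patch that is not quoted here; 22 and 12 would follow from 4 KiB pages and
1024-entry tables), and the ex_* names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed values, not copied from the patch: 4 KiB pages and 1024
 * entries per table level would give these shifts.
 */
#define EX_PDE_SHIFT	22
#define EX_PTE_SHIFT	12
#define EX_INDEX_MASK	0x3ffu

/* Each level must index exactly 1024 (2^10) entries. */
static_assert(EX_PDE_SHIFT - EX_PTE_SHIFT == 10,
	      "each level indexes 1024 entries");

/* Hypothetical result type for the illustration. */
struct ex_walk {
	uint32_t pde;		/* index into the page directory */
	uint32_t pte;		/* index into the page table */
	uint32_t offset;	/* byte offset inside the 4 KiB page */
};

/* Split a 32-bit IOVA the same way the pde/pte locals are computed. */
static inline struct ex_walk ex_split_iova(uint32_t iova)
{
	struct ex_walk w;

	w.pde = (iova >> EX_PDE_SHIFT) & EX_INDEX_MASK;
	w.pte = (iova >> EX_PTE_SHIFT) & EX_INDEX_MASK;
	w.offset = iova & ((1u << EX_PTE_SHIFT) - 1);

	return w;
}
```

With these assumed shifts, IOVA 0x00403abc would land in page directory
slot 1, page table slot 3, at byte offset 0xabc.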


* Re: [PATCH v3 02/10] devicetree: Add generic IOMMU device tree bindings
  2014-06-26 20:49     ` Thierry Reding
  (?)
@ 2014-06-27 13:55         ` Will Deacon
  -1 siblings, 0 replies; 133+ messages in thread
From: Will Deacon @ 2014-06-27 13:55 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Joerg Roedel, Cho KyongHo,
	Grant Grundler, Dave P Martin, Marc Zyngier, Hiroshi Doyu,
	Olav Haugan, Paul Walmsley, Rhyland Klein, Allen Martin,
	devicetree, iommu, linux-arm-kernel, linux-tegra

Hi Thierry,

On Thu, Jun 26, 2014 at 09:49:42PM +0100, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> This commit introduces a generic device tree binding for IOMMU devices.
> Only a very minimal subset is described here, but it is enough to cover
> the requirements of both the Exynos System MMU and Tegra SMMU as
> discussed here:
> 
>     https://lkml.org/lkml/2014/4/27/346
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>

[...]

> +Required properties:
> +--------------------
> +- #iommu-cells: The number of cells in an IOMMU specifier needed to encode an
> +  address.
> +
> +Typical values for the above include:
> +- #iommu-cells = <0>: Single master IOMMU devices are not configurable and
> +  therefore no additional information needs to be encoded in the specifier.
> +  This may also apply to multiple master IOMMU devices that do not allow the
> +  association of masters to be configured.

A multiple-master capable IOMMU could be built with a single master, but
we'd still need #iommu-cells > 0 here. I appreciate this is just an example,
but the wording sounds like it's enforced.

> +- #iommu-cells = <1>: Multiple master IOMMU devices may need to be configured
> +  in order to enable translation for a given master. In such cases the single
> +  address cell corresponds to the master device's ID.

Again, we will definitely need more than one cell in this case, as I fully
expect multiple StreamIDs for each master (e.g. Qualcomm mentioned on the
list the other day that they have a master emitting 43 unique IDs).

Anyway, the actual binding looks great, I just don't want people to think
they need to do something different because they don't fit your example
use-cases.

> +Multiple-master IOMMU:
> +----------------------
> +
> +	iommu {
> +		/* the specifier represents the ID of the master */
> +		#iommu-cells = <1>;
> +	};
> +
> +	master {
> +		/* device has master ID 42 in the IOMMU */
> +		iommus = <&/iommu 42>;
> +	};
> +
> +Multiple-master IOMMU with configurable DMA window:
> +---------------------------------------------------
> +
> +	/ {
> +		#address-cells = <1>;
> +		#size-cells = <1>;
> +
> +		iommu {
> +			/* master ID, address and length of DMA window */
> +			#iommu-cells = <4>;
> +		};
> +
> +		master {
> +			/* master ID 42, 4 GiB DMA window starting at 0 */
> +			iommus = <&/iommu  42  0  0x1 0x0>;
> +		};
> +	};

Could you also please include an example of a master with multiple IDs?

Will
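
To make the cell counting in the examples above concrete, here is a rough
sketch of how a flat "iommus" property could be split into specifiers,
including one possible encoding where a master lists several IDs as
several specifiers. It is not taken from the binding or from any kernel
API: the ex_* helpers are hypothetical, and the model assumes every entry
points at the same IOMMU, so each entry is one phandle cell followed by
#iommu-cells argument cells.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one entry in a flat "iommus" cell array. */
struct ex_spec {
	uint32_t phandle;	/* cell 0: reference to the IOMMU node */
	const uint32_t *args;	/* the following #iommu-cells cells */
};

/*
 * Number of specifiers in the property, or -1 if the length is not a
 * whole number of (phandle + args) entries.
 */
static inline int ex_count_specs(size_t ncells, unsigned int iommu_cells)
{
	size_t stride = 1 + iommu_cells;

	if (ncells % stride != 0)
		return -1;

	return (int)(ncells / stride);
}

/* Return the index-th specifier; the caller must pass a valid index. */
static inline struct ex_spec ex_get_spec(const uint32_t *prop, size_t ncells,
					 unsigned int iommu_cells,
					 unsigned int index)
{
	size_t stride = 1 + iommu_cells;
	struct ex_spec spec;

	assert((index + 1) * stride <= ncells);

	spec.phandle = prop[index * stride];
	spec.args = &prop[index * stride + 1];

	return spec;
}

/* Combine two 32-bit cells (high, low) into one 64-bit quantity. */
static inline uint64_t ex_cells_to_u64(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}
```

Under that model, a master with IDs 42 and 43 behind a one-cell IOMMU
would be four cells (phandle, 42, phandle, 43), and the 0x1 0x0 pair in
the DMA window example combines to 0x100000000 bytes, i.e. 4 GiB.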


* Re: [RFC 04/10] memory: Add Tegra124 memory controller support
  2014-06-27 11:08             ` Thierry Reding
  (?)
@ 2014-06-27 21:33               ` Stephen Warren
  -1 siblings, 0 replies; 133+ messages in thread
From: Stephen Warren @ 2014-06-27 21:33 UTC (permalink / raw)
  To: Thierry Reding, Hiroshi DOyu
  Cc: Mark Rutland, Olav Haugan, Pawel Moll, Arnd Bergmann,
	Ian Campbell, Grant Grundler, Rhyland Klein, Will Deacon,
	iommu, linux-kernel, Marc Zyngier, Allen Martin,
	Rob Herring, linux-arm-kernel,
	Paul Walmsley, Kumar Gala, linux-tegra,
	Cho KyongHo, Dave Martin, devicetree


On 06/27/2014 05:08 AM, Thierry Reding wrote:
> On Fri, Jun 27, 2014 at 12:46:38PM +0300, Hiroshi DOyu wrote:
>>
>> Thierry Reding <thierry.reding@gmail.com> writes:
>>
>>> From: Thierry Reding <treding@nvidia.com>
>>>
>>> The memory controller on NVIDIA Tegra124 exposes various knobs that can
>>> be used to tune the behaviour of the clients attached to it.
>>>
>>> Currently this driver sets up the latency allowance registers to the HW
>>> defaults. Eventually an API should be exported by this driver (via a
>>> custom API or a generic subsystem) to allow clients to register latency
>>> requirements.
>>>
>>> This driver also registers an IOMMU (SMMU) that's implemented by the
>>> memory controller.
>>>
>>> Signed-off-by: Thierry Reding <treding@nvidia.com>
>>> ---
>>>  drivers/memory/Kconfig                   |    9 +
>>>  drivers/memory/Makefile                  |    1 +
>>>  drivers/memory/tegra124-mc.c             | 1945 ++++++++++++++++++++++++++++++
>>>  include/dt-bindings/memory/tegra124-mc.h |   30 +
>>>  4 files changed, 1985 insertions(+)
>>>  create mode 100644 drivers/memory/tegra124-mc.c
>>>  create mode 100644 include/dt-bindings/memory/tegra124-mc.h
>>
>> I prefer reusing the existing SMMU and keeping MC and SMMU separate,
>> since most of the SMMU code is no different from a functionality POV,
>> and the new MC features are quite independent of the SMMU.
>>
>> If it's really convenient to combine MC and SMMU into one driver, we
>> could move "drivers/iommu/tegra-smmu.c" here first, and add MC features
>> on top of it.
> 
> I'm not sure if we can do that, since the tegra-smmu driver is
> technically used by Tegra30 and Tegra114. We've never really made use of
> it, but there are device trees in mainline releases that contain the
> separate SMMU node.

The existing DT nodes do nothing more than instantiate the driver.
However, IIUC nothing actually uses the driver for any purpose, so if we
simply deleted those nodes or changed them incompatibly, there'd be no
functional difference. Perhaps this is stretching DT ABIness very
slightly, but I think it makes no practical difference.



^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 04/10] memory: Add Tegra124 memory controller support
  2014-06-27 11:15           ` Thierry Reding
  (?)
@ 2014-06-27 21:37             ` Stephen Warren
  -1 siblings, 0 replies; 133+ messages in thread
From: Stephen Warren @ 2014-06-27 21:37 UTC (permalink / raw)
  To: Thierry Reding, Arnd Bergmann
  Cc: Mark Rutland, Olav Haugan, Pawel Moll, Ian Campbell,
	Grant Grundler, Rhyland Klein, Will Deacon,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Marc Zyngier, Allen Martin,
	Rob Herring, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Paul Walmsley, Kumar Gala, linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	Cho KyongHo, Dave Martin, devicetree-u79uwXL29TY76Z2rM5mHXA


On 06/27/2014 05:15 AM, Thierry Reding wrote:
> On Fri, Jun 27, 2014 at 01:07:04PM +0200, Arnd Bergmann wrote:
>> On Thursday 26 June 2014 22:49:44 Thierry Reding wrote:
>>> +static const struct tegra_mc_client tegra124_mc_clients[] = {
>>> +       {
>>> +               .id = 0x01,
>>> +               .name = "display0a",
>>> +               .swgroup = TEGRA_SWGROUP_DC,
>>> +               .smmu = {
>>> +                       .reg = 0x228,
>>> +                       .bit = 1,
>>> +               },
>>> +               .latency = {
>>> +                       .reg = 0x2e8,
>>> +                       .shift = 0,
>>> +                       .mask = 0xff,
>>> +                       .def = 0xc2,
>>> +               },
>>> +       }, {
>>
>> This is a rather long table that I assume would need to get duplicated
>> and modified for each specific SoC. Have you considered putting the information
>> into DT instead, as auxiliary data in the iommu specifier as provided by
>> the device?
> 
> Most of this data really is register information and I don't think that
> belongs in DT.

I agree. I think it's quite inappropriate to put information into DT
that could simply be put into a table in the driver. If the information
is put into DT, you have to define a fixed binding for it, munge the
table and data representation to fit DT's much less flexible (than C
structs/arrays) syntax, write a whole bunch of code to parse it back out
(and probably not do a good job with error-checking), all only to end up
with exactly the same C structs in the driver at the end of the process.
Oh, and if multiple SoCs use the same data values, you have to duplicate
those tables into at least the DTBs if not in the .dts files, whereas
with C you can just point at the same struct.

SoCs come out much less frequently than new boards (perhaps ignoring the
fact that we support a small subset of boards in mainline, so the
frequency isn't too dissimilar there). It makes good sense to put
board-to-board differences in DT, but I see little point in putting
static SoC information into DT.
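
The point about sharing one table between SoCs can be sketched in plain C. This is only an illustration under stated assumptions: the client layout follows the table entry quoted above, but "struct soc_desc" and the two descriptors "soc_a" and "soc_b" are hypothetical names made up for the example and are not part of the patch.

```c
#include <stddef.h>

#define TEGRA_SWGROUP_DC 0

/* Layout mirrors the table entry quoted from the patch. */
struct tegra_mc_client {
	unsigned int id;
	const char *name;
	unsigned int swgroup;
	struct { unsigned int reg, bit; } smmu;
	struct { unsigned int reg, shift, mask, def; } latency;
};

static const struct tegra_mc_client tegra124_mc_clients[] = {
	{
		.id = 0x01,
		.name = "display0a",
		.swgroup = TEGRA_SWGROUP_DC,
		.smmu = { .reg = 0x228, .bit = 1 },
		.latency = { .reg = 0x2e8, .shift = 0, .mask = 0xff, .def = 0xc2 },
	},
};

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical per-SoC descriptor; not taken from the patch. */
struct soc_desc {
	const char *name;
	const struct tegra_mc_client *clients;
	size_t num_clients;
};

/* Both descriptors point at the same table, so nothing is duplicated. */
static const struct soc_desc soc_a = {
	.name = "soc-a",
	.clients = tegra124_mc_clients,
	.num_clients = ARRAY_SIZE(tegra124_mc_clients),
};

static const struct soc_desc soc_b = {
	.name = "soc-b",
	.clients = tegra124_mc_clients,
	.num_clients = ARRAY_SIZE(tegra124_mc_clients),
};
```

With the data in DT instead, each DTB would carry its own copy of the table and the driver would still have to parse it back into something like "struct tegra_mc_client".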



^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH v3 02/10] devicetree: Add generic IOMMU device tree bindings
  2014-06-26 20:49     ` Thierry Reding
  (?)
@ 2014-06-30 22:24         ` Stephen Warren
  -1 siblings, 0 replies; 133+ messages in thread
From: Stephen Warren @ 2014-06-30 22:24 UTC (permalink / raw)
  To: Thierry Reding, Rob Herring, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, Arnd Bergmann, Will Deacon,
	Joerg Roedel
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, Grant Grundler, Rhyland Klein,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Marc Zyngier, Allen Martin,
	Paul Walmsley, linux-tegra-u79uwXL29TY76Z2rM5mHXA, Cho KyongHo,
	Dave Martin, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On 06/26/2014 02:49 PM, Thierry Reding wrote:
> From: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> 
> This commit introduces a generic device tree binding for IOMMU devices.
> Only a very minimal subset is described here, but it is enough to cover
> the requirements of both the Exynos System MMU and Tegra SMMU as
> discussed here:
> 
>     https://lkml.org/lkml/2014/4/27/346

> diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt b/Documentation/devicetree/bindings/iommu/iommu.txt

> +When an "iommus" property is specified in a device tree node, the IOMMU will
> +be used for address translation. If a "dma-ranges" property exists in the
> +device's parent node it will be ignored. An exception to this rule is if the
> +referenced IOMMU is disabled, in which case the "dma-ranges" property of the
> +parent shall take effect.

I wonder how useful that paragraph is. The fact that someone disabled a
particular IOMMU's node doesn't necessarily mean that the HW can
actually do that; an IOMMU might always be active in HW and always
translate accesses by some master. In that case, the fallback to
dma-ranges wouldn't correlate with what the HW actually does. Perhaps
all we need is to add a note to that effect here?
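
For reference, the rule in the quoted binding text, before any such note is added, amounts to roughly the following. This is a sketch only; the enum, the function name and the "untranslated" fallback are assumptions made for illustration and do not appear in the binding.

```c
#include <stdbool.h>

enum dma_translation {
	DMA_VIA_IOMMU,       /* "iommus" present and the IOMMU node is enabled */
	DMA_VIA_DMA_RANGES,  /* the parent's "dma-ranges" takes effect */
	DMA_UNTRANSLATED,    /* neither applies (assumption for this sketch) */
};

static inline enum dma_translation
pick_dma_translation(bool has_iommus, bool iommu_enabled,
		     bool parent_has_dma_ranges)
{
	/* An enabled IOMMU wins; "dma-ranges" in the parent is ignored. */
	if (has_iommus && iommu_enabled)
		return DMA_VIA_IOMMU;

	/* Disabled (or absent) IOMMU: fall back to the parent's "dma-ranges". */
	if (parent_has_dma_ranges)
		return DMA_VIA_DMA_RANGES;

	return DMA_UNTRANSLATED;
}
```

The second branch is the one in question: it is taken purely because the node is marked disabled, whether or not the HW can really bypass translation.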

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 04/10] memory: Add Tegra124 memory controller support
  2014-06-26 20:49     ` Thierry Reding
  (?)
@ 2014-06-30 22:43         ` Stephen Warren
  -1 siblings, 0 replies; 133+ messages in thread
From: Stephen Warren @ 2014-06-30 22:43 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Mark Rutland, Will Deacon, Paul Walmsley, Arnd Bergmann,
	Marc Zyngier, Dave Martin, devicetree-u79uwXL29TY76Z2rM5mHXA,
	Pawel Moll, Ian Campbell, Grant Grundler, Allen Martin,
	Rob Herring, linux-tegra-u79uwXL29TY76Z2rM5mHXA, Cho KyongHo,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Kumar Gala,
	Rhyland Klein

On 06/26/2014 02:49 PM, Thierry Reding wrote:
> From: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> 
> The memory controller on NVIDIA Tegra124 exposes various knobs that can
> be used to tune the behaviour of the clients attached to it.
> 
> Currently this driver sets up the latency allowance registers to the HW
> defaults. Eventually an API should be exported by this driver (via a
> custom API or a generic subsystem) to allow clients to register latency
> requirements.
> 
> This driver also registers an IOMMU (SMMU) that's implemented by the
> memory controller.

> diff --git a/drivers/memory/Kconfig b/drivers/memory/Kconfig

> +config TEGRA124_MC
> +	bool "Tegra124 Memory Controller driver"
> +	depends on ARCH_TEGRA

Does it make sense to default to y for system-level drivers like this?

> diff --git a/drivers/memory/tegra124-mc.c b/drivers/memory/tegra124-mc.c

As a general comment, I wonder why the Tegra124 code/data here is
ifdef'd based on CONFIG_ARCH_TEGRA_124_SOC but the Tegra132 code isn't
ifdef'd at all. I'd assert that the Tegra124 code is small enough it's
hardly worth worrying about ifdefs.

> +static inline void smmu_flush_ptc(struct tegra_smmu *smmu, struct page *page,
> +				  unsigned long offset)
> +{
> +	phys_addr_t phys = page ? page_to_phys(page) : 0;
> +	u32 value;
> +
> +	if (page) {
> +		offset &= ~(smmu->soc->atom_size - 1);
> +
> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
> +		value = (phys >> 32) & SMMU_PTC_FLUSH_HI_MASK;
> +#else
> +		value = 0;
> +#endif

Shouldn't Tegra124 have CONFIG_PHYS_ADDR_T_64BIT defined, such that
there's no need for this ifdef? Certainly Tegra124 {has,can have} RAM
above 4GB physical, for some memory map layouts (i.e. non swiss cheese).
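
Independently of how the option ends up being set, one way to drop the #ifdef is the double-shift idiom the kernel uses for upper_32_bits(). A user-space sketch, where the mask value is an assumption picked for illustration (the real one is in the patch) and the two helper functions are hypothetical:

```c
#include <stdint.h>

/* Assumed value, for illustration only. */
#define SMMU_PTC_FLUSH_HI_MASK 0x3u

/*
 * Shifting by 16 twice stays well defined when the argument is only
 * 32 bits wide, where a single shift by 32 would not be.
 */
#define upper_32_bits(n) ((uint32_t)(((n) >> 16) >> 16))

/* Stand-in for a 64-bit phys_addr_t. */
static inline uint32_t ptc_flush_hi64(uint64_t phys)
{
	return upper_32_bits(phys) & SMMU_PTC_FLUSH_HI_MASK;
}

/* Stand-in for a 32-bit phys_addr_t: the result is always 0. */
static inline uint32_t ptc_flush_hi32(uint32_t phys)
{
	return upper_32_bits(phys) & SMMU_PTC_FLUSH_HI_MASK;
}
```

The same expression then compiles for both widths, and the 32-bit case yields 0 without a preprocessor branch.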

(I assume most of this code matches the existing Tegra30 SMMU driver, so
I didn't look at all of it that closely).

> +static int tegra_smmu_attach(struct iommu *iommu, struct device *dev)
...
> +#ifndef CONFIG_ARM64
> +	return arm_iommu_attach_device(dev, group->mapping);
> +#else
> +	return 0;
> +#endif

Hmm. Why must an SMMU driver for the exact same HW operate differently
depending on the CPU that's attached to the SoC? Surely the requirements
for how IOMMU drivers should work should be the same for all architectures?

> +static int tegra_mc_probe(struct platform_device *pdev)

> +	err = devm_request_irq(&pdev->dev, mc->irq, tegra124_mc_irq,
> +			       IRQF_SHARED, dev_name(&pdev->dev), mc);

I don't see any code in tegra_mc_remove() that guarantees that the IRQ
won't fire between tegra_mc_remove() returning, and the devm cleanup
code running to unhook that IRQ handler.

> diff --git a/include/dt-bindings/memory/tegra124-mc.h b/include/dt-bindings/memory/tegra124-mc.h

This file is part of the DT binding, so should be added in the patch
that adds the binding.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 04/10] memory: Add Tegra124 memory controller support
  2014-06-26 20:49     ` Thierry Reding
  (?)
@ 2014-07-01 12:14         ` Hiroshi Doyu
  -1 siblings, 0 replies; 133+ messages in thread
From: Hiroshi Doyu @ 2014-07-01 12:14 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Mark Rutland, Will Deacon, Paul Walmsley, Pawel Moll,
	Ian Campbell, Marc Zyngier, Dave Martin,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Arnd Bergmann, Stephen Warren,
	Grant Grundler, Allen Martin, Rob Herring,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA, Cho KyongHo,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Kumar Gala,
	Rhyland Klein


Thierry Reding <thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> diff --git a/include/dt-bindings/memory/tegra124-mc.h b/include/dt-bindings/memory/tegra124-mc.h
> new file mode 100644
> index 000000000000..6b1617ce022f
> --- /dev/null
> +++ b/include/dt-bindings/memory/tegra124-mc.h
> @@ -0,0 +1,30 @@
> +#ifndef DT_BINDINGS_MEMORY_TEGRA124_MC_H
> +#define DT_BINDINGS_MEMORY_TEGRA124_MC_H
> +
> +#define TEGRA_SWGROUP_DC       0
> +#define TEGRA_SWGROUP_DCB      1
> +#define TEGRA_SWGROUP_AFI      2
> +#define TEGRA_SWGROUP_AVPC     3
> +#define TEGRA_SWGROUP_HDA      4
> +#define TEGRA_SWGROUP_HC       5
> +#define TEGRA_SWGROUP_MSENC    6
> +#define TEGRA_SWGROUP_PPCS     7
> +#define TEGRA_SWGROUP_SATA     8
> +#define TEGRA_SWGROUP_VDE      9
> +#define TEGRA_SWGROUP_MPCORELP 10
> +#define TEGRA_SWGROUP_MPCORE   11
> +#define TEGRA_SWGROUP_ISP2     12
> +#define TEGRA_SWGROUP_XUSB_HOST        13
> +#define TEGRA_SWGROUP_XUSB_DEV 14
> +#define TEGRA_SWGROUP_ISP2B    15
> +#define TEGRA_SWGROUP_TSEC     16
> +#define TEGRA_SWGROUP_A9AVP    17
> +#define TEGRA_SWGROUP_GPU      18
> +#define TEGRA_SWGROUP_SDMMC1A  19
> +#define TEGRA_SWGROUP_SDMMC2A  20
> +#define TEGRA_SWGROUP_SDMMC3A  21
> +#define TEGRA_SWGROUP_SDMMC4A  22
> +#define TEGRA_SWGROUP_VIC      23
> +#define TEGRA_SWGROUP_VI       24
> +
> +#endif

In the SMMUv8 patch series, I assigned IDs to all of those HWAs that are
unique across Tegra SoC generations, so that the DT can describe which
HWAs are attached to a given SoC. The SMMUv8 driver could then be unified
across Tegra SoCs.
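
As a rough sketch of that idea (not code from the SMMUv8 series): one SoC-independent ID space plus a per-SoC description of which HWAs are present. The enum names, the bitmask representation and the two example SoCs are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical SoC-independent IDs; the names and values are illustrative
 * and are not taken from the SMMUv8 series or from tegra124-mc.h.
 */
enum hwa_id {
	HWA_DC,
	HWA_DCB,
	HWA_AFI,
	HWA_VDE,
	HWA_VIC,
	HWA_COUNT,
};

/* One bit per HWA; in a real binding this would come from the DT. */
struct soc_hwa_mask {
	uint64_t present;
};

#define HWA_BIT(id) (UINT64_C(1) << (id))

/* Returns true if the given HWA is listed as attached on this SoC. */
static bool soc_has_hwa(const struct soc_hwa_mask *soc, enum hwa_id id)
{
	assert(id < HWA_COUNT);
	return (soc->present & HWA_BIT(id)) != 0;
}

/* Two made-up SoC descriptions sharing the same ID space. */
static const struct soc_hwa_mask soc_a = {
	.present = HWA_BIT(HWA_DC) | HWA_BIT(HWA_DCB) | HWA_BIT(HWA_VDE),
};
static const struct soc_hwa_mask soc_b = {
	.present = HWA_BIT(HWA_DC) | HWA_BIT(HWA_AFI) | HWA_BIT(HWA_VIC),
};
```

With a shared ID space, the same lookup works for both descriptions, which is what would let one driver serve several SoCs.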

^ permalink raw reply	[flat|nested] 133+ messages in thread

* RE: [RFC 01/10] iommu: Add IOMMU device registry
  2014-06-27  6:58         ` Thierry Reding
  (?)
@ 2014-07-03 10:37           ` Varun Sethi
  -1 siblings, 0 replies; 133+ messages in thread
From: Varun Sethi @ 2014-07-03 10:37 UTC (permalink / raw)
  To: Thierry Reding, Rob Herring, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, Stephen Warren, Arnd Bergmann,
	Will Deacon, Joerg Roedel
  Cc: Olav Haugan, devicetree-u79uwXL29TY76Z2rM5mHXA, Grant Grundler,
	Rhyland Klein, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Marc Zyngier, Allen Martin,
	Paul Walmsley, linux-tegra-u79uwXL29TY76Z2rM5mHXA, Cho KyongHo,
	Dave Martin, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r



> -----Original Message-----
> From: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org [mailto:iommu-
> bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org] On Behalf Of Thierry Reding
> Sent: Friday, June 27, 2014 12:29 PM
> To: Rob Herring; Pawel Moll; Mark Rutland; Ian Campbell; Kumar Gala;
> Stephen Warren; Arnd Bergmann; Will Deacon; Joerg Roedel
> Cc: Olav Haugan; devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Grant Grundler; Rhyland
> Klein; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org; linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org;
> Marc Zyngier; Allen Martin; Paul Walmsley; linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org;
> Cho KyongHo; Dave Martin; linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org
> Subject: Re: [RFC 01/10] iommu: Add IOMMU device registry
> 
> On Thu, Jun 26, 2014 at 10:49:41PM +0200, Thierry Reding wrote:
> > From: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> >
> > Add an IOMMU device registry for drivers to register with and
> > implement a method for users of the IOMMU API to attach to an IOMMU
> > device. This allows to support deferred probing and gives the IOMMU
> > API a convenient hook to perform early initialization of a device if
> > necessary.
> >
> > Signed-off-by: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> > ---
> >  drivers/iommu/iommu.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++
> >  include/linux/iommu.h | 27 +++++++++++++++
> >  2 files changed, 120 insertions(+)
> 
> I thought that perhaps I should elaborate on this a bit since I have a
> few ideas on how the API could be enhanced.
> 
> > +static int of_iommu_attach(struct device *dev)
> > +{
> > +	struct of_phandle_iter iter;
> > +	struct iommu *iommu;
> > +
> > +	mutex_lock(&iommus_lock);
> > +
> > +	of_property_for_each_phandle_with_args(iter, dev->of_node, "iommus",
> > +					       "#iommu-cells", 0) {
> > +		bool found = false;
> > +		int err;
> > +
> > +		/* skip disabled IOMMUs */
> > +		if (!of_device_is_available(iter.out_args.np))
> > +			continue;
> > +
> > +		list_for_each_entry(iommu, &iommus, list) {
> > +			if (iommu->dev->of_node == iter.out_args.np) {
> > +				err = iommu->ops->attach(iommu, dev);
> > +				if (err < 0) {
> > +				}
> > +
> > +				found = true;
> > +			}
> > +		}
> > +
> > +		if (!found) {
> > +			mutex_unlock(&iommus_lock);
> > +			return -EPROBE_DEFER;
> > +		}
> > +	}
> > +
> > +	mutex_unlock(&iommus_lock);
> > +
> > +	return 0;
> > +}
> > +
> > +static int of_iommu_detach(struct device *dev)
> > +{
> > +	/* TODO: implement */
> > +	return -ENOSYS;
> > +}
> > +
> > +int iommu_attach(struct device *dev)
> > +{
> > +	int err = 0;
> > +
> > +	if (IS_ENABLED(CONFIG_OF) && dev->of_node) {
> > +		err = of_iommu_attach(dev);
> > +		if (!err)
> > +			return 0;
> > +	}
> > +
> > +	return err;
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_attach);
> 
> I think it might make sense to introduce an explicit object for an IOMMU
> master attachment. Maybe something like:
> 
> 	struct iommu_master {
> 		struct iommu *iommu;
> 		struct device *dev;
> 
> 		...
> 	};
> 
> iommu_attach() could then return a pointer to that attachment and the
> IOMMU user driver could subsequently use that as a handle to access other
> parts of the API.
> 
> The reason is that if we ever need to support more than a single master
> interface (and perhaps even multiple master interfaces on different
> IOMMUs) for a single device, then we need a way for the IOMMU user to
> differentiate between its master interfaces.
> 
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > index 284a4683fdc1..ac2ceef194d4 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -43,6 +43,17 @@ struct notifier_block;
> >  typedef int (*iommu_fault_handler_t)(struct iommu_domain *,
> >  			struct device *, unsigned long, int, void *);
> >
> > +struct iommu {
> > +	struct device *dev;
> > +
> > +	struct list_head list;
> > +
> > +	const struct iommu_ops *ops;
> > +};
> 
> For reasons explained above, I also think that it would be a good idea to
> modify the iommu_ops functions to take a struct iommu * as their first
> argument. This may become important when one driver needs to support
> multiple IOMMU devices. With the current API drivers have to rely on
> global variables to track the driver-specific context. As far as I can
> tell, only .domain_init(), .add_device(), .remove_device() and
> .device_group() would need to change. .domain_init() could set up a
> pointer to struct iommu in struct iommu_domain so the functions dealing
> with domains could gain access to the IOMMU device via that pointer.

Would the proposed interface be an alternative to the add_device
interface? Also, how would IOMMU group creation work? With this proposal
we depend on device driver initialization to attach a device to an IOMMU,
whereas add_device allows iommu_groups to be created during bus probing.

Can't the same thing be achieved with the add_device interface, where an
IOMMU driver determines (in add_device) whether the device is attached to
a particular IOMMU? If it is, the driver can create the corresponding
IOMMU group. The IOMMU information can be stored in archdata.
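
A minimal userspace sketch of the add_device flow suggested above. The mock_* types are hypothetical stand-ins rather than the kernel structures, and matching a device to an IOMMU by a plain integer reference is an assumption made to keep the example small.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel structures. */
struct mock_iommu { int id; };
struct mock_group { const struct mock_iommu *iommu; };

struct mock_device {
	int iommu_ref;		/* IOMMU the DT node points at, -1 if none */
	struct mock_group *group;	/* set once a group exists */
	const struct mock_iommu *archdata_iommu;	/* models dev->archdata */
};

#define MOCK_ENODEV 19

/*
 * Models an add_device() callback invoked during bus probing: claim the
 * device only if it references this IOMMU, then record the group and the
 * IOMMU in the per-device data.
 */
static int mock_add_device(const struct mock_iommu *iommu,
			   struct mock_device *dev, struct mock_group *grp)
{
	assert(iommu && dev && grp);

	if (dev->iommu_ref != iommu->id)
		return -MOCK_ENODEV;	/* not behind this IOMMU */

	grp->iommu = iommu;
	dev->group = grp;
	dev->archdata_iommu = iommu;
	return 0;
}

/* Returns 1 if a device referencing dev_ref is claimed by IOMMU iommu_id. */
static int mock_claims(int iommu_id, int dev_ref)
{
	struct mock_iommu iommu = { .id = iommu_id };
	struct mock_group grp = { NULL };
	struct mock_device dev = { .iommu_ref = dev_ref };

	if (mock_add_device(&iommu, &dev, &grp) < 0)
		return 0;
	return dev.group == &grp && dev.archdata_iommu == &iommu;
}
```

Here the group exists as soon as the bus is probed, independent of when the client driver binds.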

-Varun

^ permalink raw reply	[flat|nested] 133+ messages in thread

* RE: [PATCH v3 02/10] devicetree: Add generic IOMMU device tree bindings
  2014-06-26 20:49     ` Thierry Reding
  (?)
@ 2014-07-04  6:42         ` Varun Sethi
  -1 siblings, 0 replies; 133+ messages in thread
From: Varun Sethi @ 2014-07-04  6:42 UTC (permalink / raw)
  To: Thierry Reding, Rob Herring, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, Stephen Warren, Arnd Bergmann,
	Will Deacon, Joerg Roedel
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, Grant Grundler, Rhyland Klein,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Allen Martin, Marc Zyngier,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Paul Walmsley,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA, Cho KyongHo, Dave Martin,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r



> -----Original Message-----
> From: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org [mailto:iommu-
> bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org] On Behalf Of Thierry Reding
> Sent: Friday, June 27, 2014 2:20 AM
> To: Rob Herring; Pawel Moll; Mark Rutland; Ian Campbell; Kumar Gala;
> Stephen Warren; Arnd Bergmann; Will Deacon; Joerg Roedel
> Cc: Olav Haugan; devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Grant Grundler; Rhyland
> Klein; iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org; linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org;
> Marc Zyngier; Allen Martin; Paul Walmsley; linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org;
> Cho KyongHo; Dave Martin; linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org
> Subject: [PATCH v3 02/10] devicetree: Add generic IOMMU device tree
> bindings
> 
> From: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> 
> This commit introduces a generic device tree binding for IOMMU devices.
> Only a very minimal subset is described here, but it is enough to cover
> the requirements of both the Exynos System MMU and Tegra SMMU as
> discussed here:
> 
>     https://lkml.org/lkml/2014/4/27/346
> 
> Signed-off-by: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> ---
> Changes in v3:
> - use #iommu-cells instead of #address-cells/#size-cells
> - drop optional iommu-names property
> 
> Changes in v2:
> - add notes about "dma-ranges" property (drop note from commit message)
> - document priorities of "iommus" property vs. "dma-ranges" property
> - drop #iommu-cells in favour of #address-cells and #size-cells
> - remove multiple-master device example
> 
>  Documentation/devicetree/bindings/iommu/iommu.txt | 156 ++++++++++++++++++++++
>  1 file changed, 156 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/iommu/iommu.txt
> 
> diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt b/Documentation/devicetree/bindings/iommu/iommu.txt
> new file mode 100644
> index 000000000000..f8f03f057156
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/iommu/iommu.txt
> @@ -0,0 +1,156 @@
> +This document describes the generic device tree binding for IOMMUs and
> +their master(s).
> +
> +
> +IOMMU device node:
> +==================
> +
> +An IOMMU can provide the following services:
> +
> +* Remap address space to allow devices to access physical memory ranges that
> +  they otherwise wouldn't be capable of accessing.
> +
> +  Example: 32-bit DMA to 64-bit physical addresses
> +
> +* Implement scatter-gather at page level granularity so that the device does
> +  not have to.
> +
> +* Provide system protection against "rogue" DMA by forcing all accesses to go
> +  through the IOMMU and faulting when encountering accesses to unmapped
> +  address regions.
> +
> +* Provide address space isolation between multiple contexts.
> +
> +  Example: Virtualization
> +
> +Device nodes compatible with this binding represent hardware with some
> +of the above capabilities.
> +
> +IOMMUs can be single-master or multiple-master. Single-master IOMMU
> +devices typically have a fixed association to the master device,
> +whereas multiple-master IOMMU devices can translate accesses from more
> +than one master.
> +
> +The device tree node of the IOMMU device's parent bus must contain a
> +valid "dma-ranges" property that describes how the physical address
> +space of the IOMMU maps to memory. An empty "dma-ranges" property means
> +that there is a 1:1 mapping from IOMMU to memory.
> +
> +Required properties:
> +--------------------
> +- #iommu-cells: The number of cells in an IOMMU specifier needed to encode an
> +  address.
> +
> +Typical values for the above include:
> +- #iommu-cells = <0>: Single master IOMMU devices are not configurable and
> +  therefore no additional information needs to be encoded in the specifier.
> +  This may also apply to multiple master IOMMU devices that do not allow the
> +  association of masters to be configured.
> +- #iommu-cells = <1>: Multiple master IOMMU devices may need to be configured
> +  in order to enable translation for a given master. In such cases the single
> +  address cell corresponds to the master device's ID.
> +- #iommu-cells = <4>: Some IOMMU devices allow the DMA window for masters to
> +  be configured. The first cell of the address in this may contain the master
> +  device's ID for example, while the second cell could contain the start of
> +  the DMA window for the given device. The length of the DMA window is given
> +  by the third and fourth cells.
> +
> +IOMMU master node:
> +==================
> +
> +Devices that access memory through an IOMMU are called masters. A device can
> +have multiple master interfaces (to one or more IOMMU devices).
> +
> +Required properties:
> +--------------------
> +- iommus: A list of phandle and IOMMU specifier pairs that describe the IOMMU
> +  master interfaces of the device. One entry in the list describes one master
> +  interface of the device.
> +
> +When an "iommus" property is specified in a device tree node, the IOMMU
> +will be used for address translation. If a "dma-ranges" property exists
> +in the device's parent node it will be ignored. An exception to this
> +rule is if the referenced IOMMU is disabled, in which case the
> +"dma-ranges" property of the parent shall take effect.
> +
> +
> +Notes:
> +======
> +
> +One possible extension to the above is to use an "iommus" property
> +along with a "dma-ranges" property in a bus device node (such as PCI
> +host bridges). This can be useful to describe how children on the bus
> +relate to the IOMMU if they are not explicitly listed in the device
> +tree (e.g. PCI devices). However, the requirements of that use-case
> +haven't been fully determined yet. Implementing this is therefore not
> +recommended without further discussion and extension of this binding.
> +
> +
> +Examples:
> +=========
> +
> +Single-master IOMMU:
> +--------------------
> +
> +	iommu {
> +		#iommu-cells = <0>;
> +	};
> +
> +	master {
> +		iommus = <&/iommu>;
> +	};
> +
> +Multiple-master IOMMU with fixed associations:
> +----------------------------------------------
> +
> +	/* multiple-master IOMMU */
> +	iommu {
> +		/*
> +		 * Masters are statically associated with this IOMMU and
> +		 * address translation is always enabled.
> +		 */
> +		#iommu-cells = <0>;
> +	};
> +
> +	/* static association with IOMMU */
> +	master@1 {
> +		reg = <1>;
> +		iommus = <&/iommu>;
> +	};
> +
> +	/* static association with IOMMU */
> +	master@2 {
> +		reg = <2>;
> +		iommus = <&/iommu>;
> +	};
> +
> +Multiple-master IOMMU:
> +----------------------
> +
> +	iommu {
> +		/* the specifier represents the ID of the master */
> +		#iommu-cells = <1>;
> +	};
> +
> +	master {
> +		/* device has master ID 42 in the IOMMU */
> +		iommus = <&/iommu 42>;
> +	};
> +

The master node corresponds to the device node, right? And the master ID
would correspond to the Stream ID? We already use the "iommu-parent"
property to link a device to its corresponding IOMMU. We could use that
same property instead of "iommus".
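
For reference, a small userspace sketch of how an "iommus" list is laid out under the quoted binding: each entry is a phandle followed by as many specifier cells as the referenced IOMMU's #iommu-cells. Treating a phandle as an index into a table is a simplification, and all mock_* and demo_* names are hypothetical. In-kernel code would more likely rely on a helper such as of_parse_phandle_with_args().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: a phandle is an index into a table of IOMMUs. */
struct mock_iommu_node {
	unsigned int iommu_cells;	/* value of #iommu-cells */
};

/*
 * Walks a flattened "iommus" property: each entry is one phandle cell
 * followed by that IOMMU's #iommu-cells specifier cells. Returns the
 * number of master interfaces, or -1 if the property is malformed.
 */
static int mock_count_iommus(const uint32_t *cells, size_t ncells,
			     const struct mock_iommu_node *nodes,
			     size_t nnodes)
{
	size_t pos = 0;
	int count = 0;

	assert(cells || ncells == 0);

	while (pos < ncells) {
		uint32_t phandle = cells[pos++];

		if (phandle >= nnodes)
			return -1;	/* unknown IOMMU */
		if (ncells - pos < nodes[phandle].iommu_cells)
			return -1;	/* truncated specifier */
		pos += nodes[phandle].iommu_cells;
		count++;
	}
	return count;
}

/* Two made-up IOMMUs: index 0 takes no specifier, index 1 a master ID. */
static const struct mock_iommu_node demo_nodes[] = { { 0 }, { 1 } };
/* Models: iommus = <&iommu0>, <&iommu1 42>; */
static const uint32_t demo_iommus[] = { 0, 1, 42 };
/* Malformed: the specifier for iommu1 is missing. */
static const uint32_t demo_truncated[] = { 1 };
```

The demo list describes two master interfaces of one device, the second carrying master ID 42 as its single specifier cell.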

-Varun

^ permalink raw reply	[flat|nested] 133+ messages in thread

* RE: [PATCH v3 02/10] devicetree: Add generic IOMMU device tree bindings
@ 2014-07-04  6:42         ` Varun Sethi
  0 siblings, 0 replies; 133+ messages in thread
From: Varun Sethi @ 2014-07-04  6:42 UTC (permalink / raw)
  To: Thierry Reding, Rob Herring, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, Stephen Warren, Arnd Bergmann,
	Will Deacon, Joerg Roedel
  Cc: Olav Haugan, devicetree, Grant Grundler, Rhyland Klein, iommu,
	linux-kernel, Marc Zyngier, Allen Martin, Paul Walmsley,
	linux-tegra, Cho KyongHo, Dave Martin, linux-arm-kernel



> -----Original Message-----
> From: iommu-bounces@lists.linux-foundation.org [mailto:iommu-
> bounces@lists.linux-foundation.org] On Behalf Of Thierry Reding
> Sent: Friday, June 27, 2014 2:20 AM
> To: Rob Herring; Pawel Moll; Mark Rutland; Ian Campbell; Kumar Gala;
> Stephen Warren; Arnd Bergmann; Will Deacon; Joerg Roedel
> Cc: Olav Haugan; devicetree@vger.kernel.org; Grant Grundler; Rhyland
> Klein; iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org;
> Marc Zyngier; Allen Martin; Paul Walmsley; linux-tegra@vger.kernel.org;
> Cho KyongHo; Dave Martin; linux-arm-kernel@lists.infradead.org
> Subject: [PATCH v3 02/10] devicetree: Add generic IOMMU device tree
> bindings
> 
> From: Thierry Reding <treding@nvidia.com>
> 
> This commit introduces a generic device tree binding for IOMMU devices.
> Only a very minimal subset is described here, but it is enough to cover
> the requirements of both the Exynos System MMU and Tegra SMMU as
> discussed here:
> 
>     https://lkml.org/lkml/2014/4/27/346
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
> Changes in v3:
> - use #iommu-cells instead of #address-cells/#size-cells
> - drop optional iommu-names property
> 
> Changes in v2:
> - add notes about "dma-ranges" property (drop note from commit message)
> - document priorities of "iommus" property vs. "dma-ranges" property
> - drop #iommu-cells in favour of #address-cells and #size-cells
> - remove multiple-master device example
> 
>  Documentation/devicetree/bindings/iommu/iommu.txt | 156
> ++++++++++++++++++++++
>  1 file changed, 156 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/iommu/iommu.txt
> 
> diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt b/Documentation/devicetree/bindings/iommu/iommu.txt
> new file mode 100644
> index 000000000000..f8f03f057156
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/iommu/iommu.txt
> @@ -0,0 +1,156 @@
> +This document describes the generic device tree binding for IOMMUs and
> +their master(s).
> +
> +
> +IOMMU device node:
> +==================
> +
> +An IOMMU can provide the following services:
> +
> +* Remap address space to allow devices to access physical memory ranges that
> +  they otherwise wouldn't be capable of accessing.
> +
> +  Example: 32-bit DMA to 64-bit physical addresses
> +
> +* Implement scatter-gather at page level granularity so that the device does
> +  not have to.
> +
> +* Provide system protection against "rogue" DMA by forcing all accesses to go
> +  through the IOMMU and faulting when encountering accesses to unmapped
> +  address regions.
> +
> +* Provide address space isolation between multiple contexts.
> +
> +  Example: Virtualization
> +
> +Device nodes compatible with this binding represent hardware with some
> +of the above capabilities.
> +
> +IOMMUs can be single-master or multiple-master. Single-master IOMMU devices
> +typically have a fixed association to the master device, whereas multiple-
> +master IOMMU devices can translate accesses from more than one master.
> +
> +The device tree node of the IOMMU device's parent bus must contain a valid
> +"dma-ranges" property that describes how the physical address space of the
> +IOMMU maps to memory. An empty "dma-ranges" property means that there is a
> +1:1 mapping from IOMMU to memory.
> +
> +Required properties:
> +--------------------
> +- #iommu-cells: The number of cells in an IOMMU specifier needed to encode an
> +  address.
> +
> +Typical values for the above include:
> +- #iommu-cells = <0>: Single master IOMMU devices are not configurable and
> +  therefore no additional information needs to be encoded in the specifier.
> +  This may also apply to multiple master IOMMU devices that do not allow the
> +  association of masters to be configured.
> +- #iommu-cells = <1>: Multiple master IOMMU devices may need to be configured
> +  in order to enable translation for a given master. In such cases the single
> +  address cell corresponds to the master device's ID.
> +- #iommu-cells = <4>: Some IOMMU devices allow the DMA window for masters to
> +  be configured. The first cell of the address in this may contain the master
> +  device's ID for example, while the second cell could contain the start of
> +  the DMA window for the given device. The length of the DMA window is given
> +  by the third and fourth cells.
> +
> +
> +IOMMU master node:
> +==================
> +
> +Devices that access memory through an IOMMU are called masters. A device can
> +have multiple master interfaces (to one or more IOMMU devices).
> +
> +Required properties:
> +--------------------
> +- iommus: A list of phandle and IOMMU specifier pairs that describe the IOMMU
> +  master interfaces of the device. One entry in the list describes one master
> +  interface of the device.
> +
> +When an "iommus" property is specified in a device tree node, the IOMMU
> +will be used for address translation. If a "dma-ranges" property exists
> +in the device's parent node it will be ignored. An exception to this
> +rule is if the referenced IOMMU is disabled, in which case the
> +"dma-ranges" property of the parent shall take effect.
> +
> +
> +Notes:
> +======
> +
> +One possible extension to the above is to use an "iommus" property
> +along with a "dma-ranges" property in a bus device node (such as PCI
> +host bridges). This can be useful to describe how children on the bus
> +relate to the IOMMU if they are not explicitly listed in the device
> +tree (e.g. PCI devices). However, the requirements of that use-case
> +haven't been fully determined yet. Implementing this is therefore not
> +recommended without further discussion and extension of this binding.
> +
> +
> +Examples:
> +=========
> +
> +Single-master IOMMU:
> +--------------------
> +
> +	iommu {
> +		#iommu-cells = <0>;
> +	};
> +
> +	master {
> +		iommus = <&/iommu>;
> +	};
> +
> +Multiple-master IOMMU with fixed associations:
> +----------------------------------------------
> +
> +	/* multiple-master IOMMU */
> +	iommu {
> +		/*
> +		 * Masters are statically associated with this IOMMU and
> +		 * address translation is always enabled.
> +		 */
> +		#iommu-cells = <0>;
> +	};
> +
> +	/* static association with IOMMU */
> +	master@1 {
> +		reg = <1>;
> +		iommus = <&/iommu>;
> +	};
> +
> +	/* static association with IOMMU */
> +	master@2 {
> +		reg = <2>;
> +		iommus = <&/iommu>;
> +	};
> +
> +Multiple-master IOMMU:
> +----------------------
> +
> +	iommu {
> +		/* the specifier represents the ID of the master */
> +		#iommu-cells = <1>;
> +	};
> +
> +	master {
> +		/* device has master ID 42 in the IOMMU */
> +		iommus = <&/iommu 42>;
> +	};
> +
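As an illustration of the three #iommu-cells layouts in the quoted binding,
here is a minimal C sketch that decodes the specifier cells following the
phandle in one "iommus" entry. It is not part of the patch: struct iommu_spec
and iommu_spec_decode() are hypothetical names, and the 4-cell layout (master
ID, window start, 64-bit length split over the last two cells) is only the
example layout the binding text mentions. That the third cell holds the high
word of the length is an assumption made here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of one "iommus" entry; not a kernel type. */
struct iommu_spec {
	uint32_t master_id;     /* valid when has_id is nonzero */
	uint32_t window_start;  /* valid when has_window is nonzero */
	uint64_t window_length; /* valid when has_window is nonzero */
	int has_id;
	int has_window;
};

/*
 * Decode the specifier cells that follow the phandle in one "iommus" entry.
 * Returns 0 on success, or -1 if the cell count is not one of the layouts
 * listed in the binding text (0, 1 or 4 cells).
 */
int iommu_spec_decode(const uint32_t *cells, size_t ncells,
		      struct iommu_spec *out)
{
	assert(out != NULL);
	assert(ncells == 0 || cells != NULL);

	out->master_id = 0;
	out->window_start = 0;
	out->window_length = 0;
	out->has_id = 0;
	out->has_window = 0;

	switch (ncells) {
	case 0:
		/* Fixed association: nothing is encoded in the specifier. */
		return 0;
	case 1:
		/* The single cell is the master device's ID. */
		out->master_id = cells[0];
		out->has_id = 1;
		return 0;
	case 4:
		/* ID, window start, then length (assumed high word first). */
		out->master_id = cells[0];
		out->has_id = 1;
		out->window_start = cells[1];
		out->window_length = ((uint64_t)cells[2] << 32) | cells[3];
		out->has_window = 1;
		return 0;
	default:
		return -1;
	}
}
```

For the multiple-master example quoted above, the single cell 42 decodes to
master ID 42 with no DMA window.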
The master node corresponds to the device node, right? And the master ID
would correspond to the Stream ID? We already use the "iommu-parent"
property to link a device to its corresponding IOMMU. We could use the
same property instead of "iommus".

-Varun


^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH v3 02/10] devicetree: Add generic IOMMU device tree bindings
  2014-07-04  6:42         ` Varun Sethi
  (?)
@ 2014-07-04  9:05             ` Arnd Bergmann
  -1 siblings, 0 replies; 133+ messages in thread
From: Arnd Bergmann @ 2014-07-04  9:05 UTC (permalink / raw)
  To: Varun Sethi
  Cc: Mark Rutland, Will Deacon, Thierry Reding, Paul Walmsley,
	Stephen Warren, Marc Zyngier, Dave Martin, devicetree, Pawel Moll,
	Ian Campbell, Grant Grundler, Allen Martin, Rob Herring,
	linux-tegra, Cho KyongHo, linux-arm-kernel, linux-kernel, iommu,
	Kumar Gala, Rhyland Klein

On Friday 04 July 2014 06:42:48 Varun Sethi wrote:
> Master node corresponds to the device node, right? Master ID would correspond
> to Stream ID? We are already using "iommu-parent" property to link a device
> to its corresponding IOMMU. We can use the same property instead of using "iommus".

I don't see "iommu-parent" used anywhere, just "fsl,iommu-parent". We can
probably allow "fsl,iommu-parent" as an alias for "iommus" for backwards-
compatibility if that helps you on PowerPC. For ARM, I'd prefer to mandate
that we use just "iommus".
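The fallback described here could look roughly like the following. This is a
self-contained sketch that uses mock types rather than the kernel's OF API:
struct mock_prop, struct mock_node and mock_iommu_phandle() are hypothetical,
and returning 0 for "no IOMMU found" is a convention of this sketch only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a property holding one phandle; not a kernel type. */
struct mock_prop {
	const char *name;
	unsigned int phandle;
};

/* Hypothetical stand-in for a device tree node. */
struct mock_node {
	const struct mock_prop *props;
	size_t nprops;
};

/* Linear search for a property by name; returns NULL if it is absent. */
static const struct mock_prop *mock_find_prop(const struct mock_node *np,
					      const char *name)
{
	for (size_t i = 0; i < np->nprops; i++)
		if (strcmp(np->props[i].name, name) == 0)
			return &np->props[i];
	return NULL;
}

/*
 * Return the IOMMU phandle of a master node. The generic "iommus" property
 * always wins; "fsl,iommu-parent" is consulted only when allow_legacy is
 * nonzero (for example on PowerPC). Returns 0 if neither applies.
 */
unsigned int mock_iommu_phandle(const struct mock_node *np, int allow_legacy)
{
	const struct mock_prop *prop;

	assert(np != NULL);

	prop = mock_find_prop(np, "iommus");
	if (!prop && allow_legacy)
		prop = mock_find_prop(np, "fsl,iommu-parent");

	return prop ? prop->phandle : 0;
}
```

An ARM platform would call this with allow_legacy set to 0, so that only
"iommus" is honoured there.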

	Arnd

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 01/10] iommu: Add IOMMU device registry
  2014-06-26 20:49     ` Thierry Reding
  (?)
@ 2014-07-04 11:05         ` Joerg Roedel
  -1 siblings, 0 replies; 133+ messages in thread
From: Joerg Roedel @ 2014-07-04 11:05 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Mark Rutland, Will Deacon, Paul Walmsley, Pawel Moll,
	Rhyland Klein, Ian Campbell, Marc Zyngier, Dave Martin,
	devicetree, Arnd Bergmann, Stephen Warren, Grant Grundler,
	Allen Martin, Rob Herring, linux-tegra, Cho KyongHo,
	linux-arm-kernel, linux-kernel, iommu, Kumar Gala

On Thu, Jun 26, 2014 at 10:49:41PM +0200, Thierry Reding wrote:
> Add an IOMMU device registry for drivers to register with and implement
> a method for users of the IOMMU API to attach to an IOMMU device. This
> allows to support deferred probing and gives the IOMMU API a convenient
> hook to perform early initialization of a device if necessary.

Can you elaborate on why exactly you need this? The IOMMU-API is
designed to hide any details from the user about the available IOMMUs in
the system and which IOMMU handles which device. This looks like it is
going in a completely different direction from that.


	Joerg

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 01/10] iommu: Add IOMMU device registry
  2014-07-04 11:05         ` Joerg Roedel
  (?)
@ 2014-07-04 13:47             ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-07-04 13:47 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Mark Rutland, Will Deacon, Paul Walmsley, Pawel Moll,
	Rhyland Klein, Ian Campbell, Marc Zyngier, Dave Martin,
	devicetree, Arnd Bergmann, Stephen Warren, Grant Grundler,
	Allen Martin, Rob Herring, linux-tegra, Cho KyongHo,
	linux-arm-kernel, linux-kernel, iommu, Kumar Gala


On Fri, Jul 04, 2014 at 01:05:30PM +0200, Joerg Roedel wrote:
> On Thu, Jun 26, 2014 at 10:49:41PM +0200, Thierry Reding wrote:
> > Add an IOMMU device registry for drivers to register with and implement
> > a method for users of the IOMMU API to attach to an IOMMU device. This
> > allows to support deferred probing and gives the IOMMU API a convenient
> > hook to perform early initialization of a device if necessary.
> 
> Can you elaborate on why exactly you need this? The IOMMU-API is
> designed to hide any details from the user about the available IOMMUs in
> the system and which IOMMU handles which device. This looks like it is
> going in a completely different direction from that.

I need this primarily to properly serialize device probing order.
Without it the IOMMU may be probed later than its clients, in which case
the client drivers will assume that there is no IOMMU (iommu_present()
for the parent bus fails).

There are other ways around this, but I think we'll need to eventually
come up with something like this anyway. Consider for example what
happens when a device has master interfaces on two different IOMMUs. Not
only does the current model of having one and only one IOMMU per struct
bus_type break down, but also IOMMU masters will need a way to specify
which IOMMU they're talking to.
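
To make the ordering problem concrete, here is a small self-contained sketch
of the idea. It is not the API from patch 1: registry_add() and client_probe()
are hypothetical names, the registry is keyed by a device tree phandle purely
for illustration, and EPROBE_DEFER is redefined locally with the value the
kernel uses so that the example builds in userspace.

```c
#include <assert.h>
#include <stddef.h>

/* Same numeric value as the kernel's EPROBE_DEFER; redefined for userspace. */
#define EPROBE_DEFER 517

#define MAX_IOMMUS 8

/* Hypothetical registry of IOMMUs that have finished probing. */
static unsigned int registered[MAX_IOMMUS];
static size_t nregistered;

/* Called by an IOMMU driver at the end of its probe. Returns 0 or -1. */
int registry_add(unsigned int phandle)
{
	assert(phandle != 0);

	if (nregistered == MAX_IOMMUS)
		return -1;

	registered[nregistered++] = phandle;
	return 0;
}

/* Returns nonzero if the IOMMU with this phandle has been registered. */
static int registry_has(unsigned int phandle)
{
	for (size_t i = 0; i < nregistered; i++)
		if (registered[i] == phandle)
			return 1;
	return 0;
}

/*
 * Client probe: instead of concluding that there is no IOMMU, ask to be
 * probed again later if the referenced IOMMU has not registered yet.
 */
int client_probe(unsigned int iommu_phandle)
{
	if (!registry_has(iommu_phandle))
		return -EPROBE_DEFER;

	return 0;
}
```

Because the lookup is per IOMMU rather than per bus, a master with interfaces
on two different IOMMUs would simply perform two lookups.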

Thierry


^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 01/10] iommu: Add IOMMU device registry
  2014-07-04 13:47             ` Thierry Reding
  (?)
@ 2014-07-04 13:49               ` Will Deacon
  -1 siblings, 0 replies; 133+ messages in thread
From: Will Deacon @ 2014-07-04 13:49 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Joerg Roedel, Rob Herring, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, Stephen Warren, Arnd Bergmann,
	Cho KyongHo, Grant Grundler, Dave P Martin, Marc Zyngier,
	Hiroshi Doyu, Olav Haugan, Paul Walmsley, Rhyland Klein,
	Allen Martin, devicetree, iommu, linux-arm-kernel, linux-tegra

On Fri, Jul 04, 2014 at 02:47:10PM +0100, Thierry Reding wrote:
> On Fri, Jul 04, 2014 at 01:05:30PM +0200, Joerg Roedel wrote:
> > On Thu, Jun 26, 2014 at 10:49:41PM +0200, Thierry Reding wrote:
> > > Add an IOMMU device registry for drivers to register with and implement
> > > a method for users of the IOMMU API to attach to an IOMMU device. This
> > > allows to support deferred probing and gives the IOMMU API a convenient
> > > hook to perform early initialization of a device if necessary.
> > 
> > Can you elaborate on why exactly you need this? The IOMMU-API is
> > designed to hide any details from the user about the available IOMMUs in
> > the system and which IOMMU handles which device. This looks like it is
> > > going in a completely different direction from that.
> 
> I need this primarily to properly serialize device probing order.
> Without it the IOMMU may be probed later than its clients, in which case
> the client drivers will assume that there is no IOMMU (iommu_present()
> for the parent bus fails).

I can also vouch for needing *a* solution to this problem. The ARM SMMU (and
I think others) rely on initcall ordering rather than the driver probing
model to ensure the IOMMU is probed before any of its masters.

Will

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 01/10] iommu: Add IOMMU device registry
  2014-07-04 13:49               ` Will Deacon
  (?)
@ 2014-07-06 18:17                   ` Arnd Bergmann
  -1 siblings, 0 replies; 133+ messages in thread
From: Arnd Bergmann @ 2014-07-06 18:17 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, Thierry Reding, Paul Walmsley, Stephen Warren,
	Marc Zyngier, Dave P Martin, devicetree, Pawel Moll, Ian Campbell,
	Grant Grundler, Allen Martin, Rob Herring, linux-tegra,
	Cho KyongHo, linux-arm-kernel, linux-kernel, iommu, Kumar Gala,
	Rhyland Klein

On Friday 04 July 2014, Will Deacon wrote:
> On Fri, Jul 04, 2014 at 02:47:10PM +0100, Thierry Reding wrote:
> > On Fri, Jul 04, 2014 at 01:05:30PM +0200, Joerg Roedel wrote:
> > > On Thu, Jun 26, 2014 at 10:49:41PM +0200, Thierry Reding wrote:
> > > > Add an IOMMU device registry for drivers to register with and implement
> > > > a method for users of the IOMMU API to attach to an IOMMU device. This
> > > > allows supporting deferred probing and gives the IOMMU API a convenient
> > > > hook to perform early initialization of a device if necessary.
> > > 
> > > Can you elaborate on why exactly you need this? The IOMMU-API is
> > > designed to hide any details from the user about the available IOMMUs in
> > > the system and which IOMMU handles which device. This looks like it is
> > > going in a completely different direction from that.
> > 
> > I need this primarily to properly serialize device probing order.
> > Without it the IOMMU may be probed later than its clients, in which case
> > the client drivers will assume that there is no IOMMU (iommu_present()
> > for the parent bus fails).
> 
> I can also vouch for needing a solution to this problem. The ARM SMMU (and
> I think others) rely on initcall ordering rather than the driver probing
> model to ensure the IOMMU is probed before any of its masters.

I think it would be best to attach platform devices to IOMMUs from the
of_dma_configure() we just introduced. That still requires handling
IOMMUs specially though, and I don't know how we should best deal
with that. It would not be too hard to scan for IOMMUs in DT first
and register them all in a way that we can later look them up
by phandle, but that would break down if we ever get nested IOMMUs.

Another possibility might be to register all devices as we do today,
including IOMMU devices, but return -EPROBE_DEFER from
platform_drv_probe() before we call into the driver's probe function
if the IOMMU has not been set up at that point.
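
A minimal userspace sketch of this second option, under stated assumptions:
the sketch_* types and functions are hypothetical stand-ins, not the driver
core's real structures, and only the EPROBE_DEFER value is taken from the
kernel. It shows the check sitting in front of the driver's probe function,
so a client is retried later rather than bound without its IOMMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EPROBE_DEFER 517 /* same numeric value as the kernel's EPROBE_DEFER */

/* Hypothetical stand-ins for the kernel's IOMMU, device and driver types. */
struct sketch_iommu {
	bool ready;                 /* set once the IOMMU driver has probed */
};

struct sketch_device {
	const char *name;
	struct sketch_iommu *iommu; /* NULL if the device is not behind an IOMMU */
	bool bound;                 /* true once a driver probed successfully */
};

struct sketch_driver {
	int (*probe)(struct sketch_device *dev);
};

/*
 * Models the check made before calling into the driver: if the device sits
 * behind an IOMMU that has not been set up yet, request a later retry
 * instead of running the driver's probe function.
 */
static int sketch_drv_probe(struct sketch_driver *drv,
			    struct sketch_device *dev)
{
	int err;

	assert(drv != NULL && dev != NULL);

	if (dev->iommu && !dev->iommu->ready)
		return -EPROBE_DEFER;

	err = drv->probe(dev);
	if (err == 0)
		dev->bound = true;

	return err;
}

/* A trivial driver probe that always succeeds. */
static int sketch_probe_ok(struct sketch_device *dev)
{
	(void)dev;
	return 0;
}
```

With this shape the client driver itself needs no IOMMU-specific code: the
first attempt is deferred while the IOMMU is not ready, and a later attempt
succeeds once it is.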

For PCI devices, we need a different way of dealing with the IOMMUs:
some generic PCI code needs to be added to attach the correct IOMMU
to a newly added PCI device based on how the host bridge is configured.

For now we can probably get away with not worrying about any bus type
other than platform, amba or PCI: we don't use any other DMA-master-capable
bus on ARM, and other architectures can probably rely on
having only a single IOMMU implementation in the system.

	Arnd

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 01/10] iommu: Add IOMMU device registry
  2014-07-06 18:17                   ` Arnd Bergmann
  (?)
@ 2014-07-07 11:42                       ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-07-07 11:42 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Mark Rutland, Will Deacon, Paul Walmsley, Stephen Warren,
	Marc Zyngier, Dave P Martin, devicetree-u79uwXL29TY76Z2rM5mHXA,
	Pawel Moll, Ian Campbell, Grant Grundler, Allen Martin,
	Rob Herring, linux-tegra-u79uwXL29TY76Z2rM5mHXA, Cho KyongHo,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Kumar Gala,
	Rhyland Klein


On Sun, Jul 06, 2014 at 08:17:22PM +0200, Arnd Bergmann wrote:
> On Friday 04 July 2014, Will Deacon wrote:
> > On Fri, Jul 04, 2014 at 02:47:10PM +0100, Thierry Reding wrote:
> > > On Fri, Jul 04, 2014 at 01:05:30PM +0200, Joerg Roedel wrote:
> > > > On Thu, Jun 26, 2014 at 10:49:41PM +0200, Thierry Reding wrote:
> > > > > Add an IOMMU device registry for drivers to register with and implement
> > > > > a method for users of the IOMMU API to attach to an IOMMU device. This
> > > > > allows supporting deferred probing and gives the IOMMU API a convenient
> > > > > hook to perform early initialization of a device if necessary.
> > > > 
> > > > Can you elaborate on why exactly you need this? The IOMMU-API is
> > > > designed to hide any details from the user about the available IOMMUs in
> > > > the system and which IOMMU handles which device. This looks like it is
> > > > going in a completely different direction from that.
> > > 
> > > I need this primarily to properly serialize device probing order.
> > > Without it the IOMMU may be probed later than its clients, in which case
> > > the client drivers will assume that there is no IOMMU (iommu_present()
> > > for the parent bus fails).
> > 
> > I can also vouch for needing a solution to this problem. The ARM SMMU (and
> > I think others) rely on initcall ordering rather than the driver probing
> > model to ensure the IOMMU is probed before any of its masters.
> 
> I think it would be best to attach platform devices to IOMMUs from the
> of_dma_configure() we just introduced. That still requires handling
> IOMMUs specially though, and I don't know how we should best deal
> with that. It would not be too hard to scan for IOMMUs in DT first
> and register them all in a way that we can later look them up
> by phandle, but that would break down if we ever get nested IOMMUs.

But even for nested IOMMUs each will have an associated device node, so
we could scan the tree up front. But given that it only solves the
problem partially I don't think that's a big advantage.

> Another possibility might be to register all devices as we do today,
> including IOMMU devices, but return -EPROBE_DEFER from
> platform_drv_probe() before we call into the driver's probe function
> if the IOMMU has not been set up at that point.

Right, Hiroshi already proposed a patch for that, but it was more or
less NAK'ed because people didn't want to have that functionality in the
device driver core.

> For PCI devices, we need a different way of dealing with the IOMMUs:
> some generic PCI code needs to be added to attach the correct IOMMU
> to a newly added PCI device based on how the host bridge is configured.

I'm curious. Without device tree, how do we find out what IOMMU a device
is connected to? Will it always be an ancestor of the device in the PCI
hierarchy?

> For now we can probably get away with not worrying about any bus type
> other than platform, amba or PCI: we don't use any other DMA-master-capable
> bus on ARM, and other architectures can probably rely on
> having only a single IOMMU implementation in the system.

Neither of the above proposals will work for cases where more than a
single IOMMU exists in the system. Currently we can only register one
IOMMU per bus and if we try to register a second IOMMU it will fail
(bus_set_iommu() returns -EBUSY).
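
The limitation can be illustrated with a small userspace model. This is a
sketch only: the sketch_* names are hypothetical, and the function merely
mirrors the behaviour described above (a bus holds a single ops pointer, and
a second registration is refused with -EBUSY).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct iommu_ops. */
struct sketch_iommu_ops {
	const char *name;
};

/* Hypothetical stand-in for struct bus_type: one ops pointer, no context. */
struct sketch_bus {
	const struct sketch_iommu_ops *iommu_ops;
};

/*
 * Models the one-IOMMU-per-bus rule: the first caller wins, and every later
 * caller gets -EBUSY without changing the registered ops.
 */
static int sketch_bus_set_iommu(struct sketch_bus *bus,
				const struct sketch_iommu_ops *ops)
{
	assert(bus != NULL);

	if (bus->iommu_ops != NULL)
		return -EBUSY;

	bus->iommu_ops = ops;
	return 0;
}
```

Because the bus stores only the ops pointer, there is also nowhere to keep
per-instance state for a second IOMMU even if the registration were allowed.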

Also, struct bus_type has only a pointer to a struct iommu_ops, but no
associated context. Hence my proposal, which I only posted partially
here since it didn't seem immediately relevant. But I guess to better
illustrate how I envisioned this to work, here goes:

The idea was to allow each device to have zero or more master interfaces on
zero or more IOMMUs. That's as general a case as it gets. Now to make this work
we'd need something like this:

	struct iommu_master {
		struct device *dev; /* the master device */
		struct iommu *iommu; /* the IOMMU that dev masters */
		struct list_head list; /* link in a list of all master
					  interfaces of dev */
	};

Then we could store a list in struct device:

	struct device {
		...
		struct list_head iommu_masters;
		...
	};

It was already mentioned in other threads that if a device does indeed
have more than one master interface, then it needs to control access to
them explicitly via the IOMMU API. Since we only have an API to allocate
an IOMMU domain (which automatically forwards calls to the global IOMMU)
we'd need something new, such as:

	master = iommu_get(dev, "foo");

or

	master = iommu_get(dev, 0);

Or whichever variant we prefer. That could return a pointer to a struct
iommu_master, which could then be used to obtain a domain, like so:

	domain = iommu_master_alloc_domain(master);

To make that work, as far as I can tell only very minimal changes would
have to be made to iommu_ops. Most of the functions take a pointer to a
struct iommu_domain anyway, so we could extend it with a reference to the
parent of a domain. For that we'll need a structure that represents the
IOMMU device's context (which is what this patch introduces as struct
iommu).
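
Put together, the pieces above could look roughly like the following
userspace model. It is a sketch under stated assumptions, not the proposed
kernel code: all sketch_* names are hypothetical, a plain singly linked list
stands in for struct list_head, only the index-based lookup variant is
shown, and the domain simply records its parent IOMMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-IOMMU context (what the patch calls struct iommu). */
struct sketch_iommu {
	const char *name;
};

struct sketch_device;

/* One master interface of a device on one IOMMU. */
struct sketch_iommu_master {
	struct sketch_device *dev;        /* the master device */
	struct sketch_iommu *iommu;       /* the IOMMU that dev masters */
	struct sketch_iommu_master *next; /* stands in for struct list_head */
};

struct sketch_device {
	const char *name;
	struct sketch_iommu_master *iommu_masters; /* all master interfaces */
};

/* A domain that remembers which IOMMU it belongs to. */
struct sketch_iommu_domain {
	struct sketch_iommu *iommu;
};

/* Append a master interface to the end of the device's list. */
static void sketch_add_master(struct sketch_device *dev,
			      struct sketch_iommu_master *master,
			      struct sketch_iommu *iommu)
{
	struct sketch_iommu_master **link = &dev->iommu_masters;

	master->dev = dev;
	master->iommu = iommu;
	master->next = NULL;

	while (*link)
		link = &(*link)->next;
	*link = master;
}

/* Look up the index-th master interface, or NULL if there is none. */
static struct sketch_iommu_master *
sketch_iommu_get(struct sketch_device *dev, unsigned int index)
{
	struct sketch_iommu_master *master;

	assert(dev != NULL);

	for (master = dev->iommu_masters; master; master = master->next)
		if (index-- == 0)
			return master;

	return NULL;
}

/* Allocate a domain bound to the IOMMU behind the given master interface. */
static struct sketch_iommu_domain *
sketch_iommu_master_alloc_domain(struct sketch_iommu_master *master)
{
	struct sketch_iommu_domain *domain;

	if (!master)
		return NULL;

	domain = calloc(1, sizeof(*domain));
	if (domain)
		domain->iommu = master->iommu;

	return domain;
}
```

A device with two master interfaces would then ask for interface 1
explicitly and get a domain whose parent is the second IOMMU, while a device
with no IOMMU simply has an empty list.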

The only functions in struct iommu_ops that deal with an IOMMU directly
are .add_device(), .remove_device() and .device_group(), although they
may become obsolete with the new APIs. Currently .add_device() and
.remove_device() are only used to register devices from a bus notifier
and that would be replaced by something more explicit like above. As for
.device_group(), I don't see it being used at all currently.

Now for DMA mapping API integration we could make that use the first (or
only) IOMMU device registered. Perhaps we could even reject using this
layer of integration for multi-master devices, since it would be
difficult to tell whether or not the selected device is the correct one.

We still have the option to handle things mostly transparently with the
above by moving calls to iommu_get() into the core. But we also gain the
flexibility to work with multiple IOMMU contexts explicitly if required.

Thierry

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 09/10] drm/tegra: Add IOMMU support
  2014-06-26 20:49     ` Thierry Reding
  (?)
@ 2014-09-30 18:48         ` Sean Paul
  -1 siblings, 0 replies; 133+ messages in thread
From: Sean Paul @ 2014-09-30 18:48 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel,
	Cho KyongHo, Grant Grundler, Dave Martin, Marc Zyngier,
	Hiroshi Doyu, Olav Haugan, Paul Walmsley, Rhyland Klein,
	Allen Martin, devicetree-u79uwXL29TY76Z2rM5mHXA, Linux IOMMU,
	Linux ARM Kernel, linux-tegra-u79uwXL29TY76Z2rM5mHXA

On Thu, Jun 26, 2014 at 4:49 PM, Thierry Reding
<thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> From: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>
> When an IOMMU device is available on the platform bus, allocate an IOMMU
> domain and attach the display controllers to it. The display controllers
> can then scan out non-contiguous buffers by mapping them through the
> IOMMU.
>

Hi Thierry,
A few comments from Stéphane and myself that came up while we were
reviewing this for our tree.

> Signed-off-by: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> ---
>  drivers/gpu/drm/tegra/dc.c  |  21 ++++
>  drivers/gpu/drm/tegra/drm.c |  17 ++++
>  drivers/gpu/drm/tegra/drm.h |   3 +
>  drivers/gpu/drm/tegra/fb.c  |  16 ++-
>  drivers/gpu/drm/tegra/gem.c | 236 +++++++++++++++++++++++++++++++++++++++-----
>  drivers/gpu/drm/tegra/gem.h |   4 +
>  6 files changed, 273 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
> index afcca04f5367..0f7452d04811 100644
> --- a/drivers/gpu/drm/tegra/dc.c
> +++ b/drivers/gpu/drm/tegra/dc.c
> @@ -9,6 +9,7 @@
>
>  #include <linux/clk.h>
>  #include <linux/debugfs.h>
> +#include <linux/iommu.h>
>  #include <linux/reset.h>
>
>  #include "dc.h"
> @@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
>  {
>         struct drm_device *drm = dev_get_drvdata(client->parent);
>         struct tegra_dc *dc = host1x_client_to_dc(client);
> +       struct tegra_drm *tegra = drm->dev_private;
>         int err;
>
> +       if (tegra->domain) {
> +               err = iommu_attach_device(tegra->domain, dc->dev);
> +               if (err < 0) {
> +                       dev_err(dc->dev, "failed to attach to IOMMU: %d\n",
> +                               err);
> +                       return err;
> +               }

[from Stéphane]

Shouldn't we call iommu_detach_device() in the error paths below?


> +       }
> +
>         drm_crtc_init(drm, &dc->base, &tegra_crtc_funcs);
>         drm_mode_crtc_set_gamma_size(&dc->base, 256);
>         drm_crtc_helper_add(&dc->base, &tegra_crtc_helper_funcs);
> @@ -1318,7 +1329,9 @@ static int tegra_dc_init(struct host1x_client *client)
>
>  static int tegra_dc_exit(struct host1x_client *client)
>  {
> +       struct drm_device *drm = dev_get_drvdata(client->parent);
>         struct tegra_dc *dc = host1x_client_to_dc(client);
> +       struct tegra_drm *tegra = drm->dev_private;
>         int err;
>
>         devm_free_irq(dc->dev, dc->irq, dc);
> @@ -1335,6 +1348,8 @@ static int tegra_dc_exit(struct host1x_client *client)
>                 return err;
>         }
>
> +       iommu_detach_device(tegra->domain, dc->dev);
> +
>         return 0;
>  }
>
> @@ -1462,6 +1477,12 @@ static int tegra_dc_probe(struct platform_device *pdev)
>                 return -ENXIO;
>         }
>
> +       err = iommu_attach(&pdev->dev);
> +       if (err < 0) {
> +               dev_err(&pdev->dev, "failed to attach to IOMMU: %d\n", err);
> +               return err;
> +       }
> +
>         INIT_LIST_HEAD(&dc->client.list);
>         dc->client.ops = &dc_client_ops;
>         dc->client.dev = &pdev->dev;
> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
> index 59736bb810cd..1d2bbafad982 100644
> --- a/drivers/gpu/drm/tegra/drm.c
> +++ b/drivers/gpu/drm/tegra/drm.c
> @@ -8,6 +8,7 @@
>   */
>
>  #include <linux/host1x.h>
> +#include <linux/iommu.h>
>
>  #include "drm.h"
>  #include "gem.h"
> @@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>         if (!tegra)
>                 return -ENOMEM;
>
> +       if (iommu_present(&platform_bus_type)) {
> +               tegra->domain = iommu_domain_alloc(&platform_bus_type);
> +               if (IS_ERR(tegra->domain)) {
> +                       kfree(tegra);
> +                       return PTR_ERR(tegra->domain);
> +               }
> +
> +               drm_mm_init(&tegra->mm, 0, SZ_2G);


[from Stéphane]:

Neither of these is released in the error paths below (iommu_domain_free()
and drm_mm_takedown() are never called).

Also, |tegra| isn't freed there either?
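
A rough user-space sketch of the cleanup order this would imply. All fake_* names are hypothetical stand-ins (for iommu_domain_alloc(), iommu_domain_free(), drm_mm_init() and drm_mm_takedown()), and the fake allocator simply returns NULL on failure, which is an assumption of the sketch rather than a statement about the real API. The sketch also stores the error code before freeing anything, since the quoted hunk reads tegra->domain after kfree(tegra).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

struct fake_domain { int unused; };
struct fake_mm { bool initialized; };

struct fake_tegra {
	struct fake_domain *domain;
	struct fake_mm mm;
};

static int live_objects;	/* outstanding heap objects in this sketch */

/* Hypothetical stand-in for iommu_domain_alloc(); NULL on failure. */
static struct fake_domain *fake_domain_alloc(void)
{
	struct fake_domain *domain = calloc(1, sizeof(*domain));

	if (domain)
		live_objects++;

	return domain;
}

/* Hypothetical stand-in for iommu_domain_free(). */
static void fake_domain_free(struct fake_domain *domain)
{
	free(domain);
	live_objects--;
}

/* Hypothetical stand-ins for drm_mm_init() and drm_mm_takedown(). */
static void fake_mm_init(struct fake_mm *mm)
{
	mm->initialized = true;
}

static void fake_mm_takedown(struct fake_mm *mm)
{
	assert(mm->initialized);
	mm->initialized = false;
}

static int sketch_load(bool have_iommu, bool fail_later,
		       struct fake_tegra **out)
{
	struct fake_tegra *tegra;
	int err;

	tegra = calloc(1, sizeof(*tegra));
	if (!tegra)
		return -ENOMEM;
	live_objects++;

	if (have_iommu) {
		tegra->domain = fake_domain_alloc();
		if (!tegra->domain) {
			err = -ENOMEM;	/* record the error before freeing */
			goto free_tegra;
		}

		fake_mm_init(&tegra->mm);
	}

	/* Any later step in the load path that may fail. */
	err = fail_later ? -ENODEV : 0;
	if (err < 0)
		goto free_domain;

	*out = tegra;
	return 0;

free_domain:
	if (tegra->domain) {
		fake_mm_takedown(&tegra->mm);
		fake_domain_free(tegra->domain);
	}
free_tegra:
	free(tegra);
	live_objects--;
	return err;
}
```

With this shape, a failure after the IOMMU setup releases the address-space manager, the domain and the private structure in reverse order of creation.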



> +       }
> +
>         mutex_init(&tegra->clients_lock);
>         INIT_LIST_HEAD(&tegra->clients);
>         drm->dev_private = tegra;
> @@ -71,6 +82,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>  static int tegra_drm_unload(struct drm_device *drm)
>  {
>         struct host1x_device *device = to_host1x_device(drm->dev);
> +       struct tegra_drm *tegra = drm->dev_private;
>         int err;
>
>         drm_kms_helper_poll_fini(drm);
> @@ -82,6 +94,11 @@ static int tegra_drm_unload(struct drm_device *drm)
>         if (err < 0)
>                 return err;
>
> +       if (tegra->domain) {
> +               iommu_domain_free(tegra->domain);
> +               drm_mm_takedown(&tegra->mm);
> +       }
> +
>         return 0;
>  }
>
> diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
> index 96d754e7b3eb..a07c796b7edc 100644
> --- a/drivers/gpu/drm/tegra/drm.h
> +++ b/drivers/gpu/drm/tegra/drm.h
> @@ -39,6 +39,9 @@ struct tegra_fbdev {
>  struct tegra_drm {
>         struct drm_device *drm;
>
> +       struct iommu_domain *domain;
> +       struct drm_mm mm;
> +
>         struct mutex clients_lock;
>         struct list_head clients;
>
> diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c
> index 7790d43ad082..21c65dd817c3 100644
> --- a/drivers/gpu/drm/tegra/fb.c
> +++ b/drivers/gpu/drm/tegra/fb.c
> @@ -65,8 +65,12 @@ static void tegra_fb_destroy(struct drm_framebuffer *framebuffer)
>         for (i = 0; i < fb->num_planes; i++) {
>                 struct tegra_bo *bo = fb->planes[i];
>
> -               if (bo)
> +               if (bo) {
> +                       if (bo->pages && bo->virt)
> +                               vunmap(bo->virt);
> +
>                         drm_gem_object_unreference_unlocked(&bo->gem);
> +               }
>         }
>
>         drm_framebuffer_cleanup(framebuffer);
> @@ -252,6 +256,16 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
>         offset = info->var.xoffset * bytes_per_pixel +
>                  info->var.yoffset * fb->pitches[0];
>
> +       if (bo->pages) {
> +               bo->vaddr = vmap(bo->pages, bo->num_pages, VM_MAP,
> +                                pgprot_writecombine(PAGE_KERNEL));
> +               if (!bo->vaddr) {
> +                       dev_err(drm->dev, "failed to vmap() framebuffer\n");
> +                       err = -ENOMEM;
> +                       goto destroy;
> +               }
> +       }
> +
>         drm->mode_config.fb_base = (resource_size_t)bo->paddr;
>         info->screen_base = (void __iomem *)bo->vaddr + offset;
>         info->screen_size = size;
> diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
> index c1e4e8b6e5ca..2912e61a2599 100644
> --- a/drivers/gpu/drm/tegra/gem.c
> +++ b/drivers/gpu/drm/tegra/gem.c
> @@ -14,8 +14,10 @@
>   */
>
>  #include <linux/dma-buf.h>
> +#include <linux/iommu.h>
>  #include <drm/tegra_drm.h>
>
> +#include "drm.h"
>  #include "gem.h"
>
>  static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo)
> @@ -90,14 +92,144 @@ static const struct host1x_bo_ops tegra_bo_ops = {
>         .kunmap = tegra_bo_kunmap,
>  };
>
> +static int iommu_map_sg(struct iommu_domain *domain, struct sg_table *sgt,
> +                       dma_addr_t iova, int prot)
> +{
> +       unsigned long offset = 0;
> +       struct scatterlist *sg;
> +       unsigned int i, j;
> +       int err;
> +
> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> +               dma_addr_t phys = sg_phys(sg);
> +               size_t length = sg->offset;
> +
> +               phys = sg_phys(sg) - sg->offset;
> +               length = sg->length + sg->offset;
> +
> +               err = iommu_map(domain, iova + offset, phys, length, prot);
> +               if (err < 0)
> +                       goto unmap;
> +
> +               offset += length;
> +       }
> +
> +       return 0;
> +
> +unmap:
> +       offset = 0;
> +
> +       for_each_sg(sgt->sgl, sg, i, j) {
> +               size_t length = sg->length + sg->offset;
> +               iommu_unmap(domain, iova + offset, length);
> +               offset += length;
> +       }
> +
> +       return err;
> +}
> +
> +static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
> +                         dma_addr_t iova)
> +{
> +       unsigned long offset = 0;
> +       struct scatterlist *sg;
> +       unsigned int i;
> +
> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> +               dma_addr_t phys = sg_phys(sg);
> +               size_t length = sg->offset;
> +
> +               phys = sg_phys(sg) - sg->offset;
> +               length = sg->length + sg->offset;
> +
> +               iommu_unmap(domain, iova + offset, length);
> +               offset += length;
> +       }
> +
> +       return 0;
> +}
> +
> +static int tegra_bo_iommu_map(struct tegra_drm *tegra, struct tegra_bo *bo)
> +{
> +       int prot = IOMMU_READ | IOMMU_WRITE;
> +       int err;
> +
> +       if (bo->mm)
> +               return -EBUSY;
> +
> +       bo->mm = kzalloc(sizeof(*bo->mm), GFP_KERNEL);
> +       if (!bo->mm)
> +               return -ENOMEM;
> +
> +       err = drm_mm_insert_node_generic(&tegra->mm, bo->mm, bo->gem.size,
> +                                        PAGE_SIZE, 0, 0, 0);
> +       if (err < 0) {
> +               dev_err(tegra->drm->dev, "out of virtual memory: %d\n", err);
> +               return err;
> +       }
> +
> +       bo->paddr = bo->mm->start;
> +
> +       err = iommu_map_sg(tegra->domain, bo->sgt, bo->paddr, prot);
> +       if (err < 0) {
> +               dev_err(tegra->drm->dev, "failed to map buffer: %d\n", err);
> +               return err;
> +       }
> +
> +       return 0;
> +}
> +
> +static int tegra_bo_iommu_unmap(struct tegra_drm *tegra, struct tegra_bo *bo)
> +{
> +       if (!bo->mm)
> +               return 0;
> +
> +       iommu_unmap_sg(tegra->domain, bo->sgt, bo->paddr);
> +       drm_mm_remove_node(bo->mm);
> +
> +       kfree(bo->mm);
> +       return 0;
> +}
> +
>  static void tegra_bo_destroy(struct drm_device *drm, struct tegra_bo *bo)
>  {
> -       dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr, bo->paddr);
> +       if (!bo->pages)
> +               dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr,
> +                                     bo->paddr);
> +       else
> +               drm_gem_put_pages(&bo->gem, bo->pages, true, true);
> +}
> +
> +static int tegra_bo_get_pages(struct drm_device *drm, struct tegra_bo *bo,
> +                             size_t size)
> +{
> +       bo->pages = drm_gem_get_pages(&bo->gem, GFP_KERNEL);
> +       if (!bo->pages)
> +               return -ENOMEM;
> +
> +       bo->num_pages = size >> PAGE_SHIFT;
> +
> +       return 0;
> +}
> +
> +static int tegra_bo_alloc(struct drm_device *drm, struct tegra_bo *bo,
> +                         size_t size)
> +{
> +       bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
> +                                          GFP_KERNEL | __GFP_NOWARN);
> +       if (!bo->vaddr) {
> +               dev_err(drm->dev, "failed to allocate buffer of size %zu\n",
> +                       size);
> +               return -ENOMEM;
> +       }
> +
> +       return 0;
>  }
>
>  struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>                                  unsigned long flags)
>  {
> +       struct tegra_drm *tegra = drm->dev_private;
>         struct tegra_bo *bo;
>         int err;
>
> @@ -108,22 +240,33 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>         host1x_bo_init(&bo->base, &tegra_bo_ops);
>         size = round_up(size, PAGE_SIZE);
>
> -       bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
> -                                          GFP_KERNEL | __GFP_NOWARN);
> -       if (!bo->vaddr) {
> -               dev_err(drm->dev, "failed to allocate buffer with size %u\n",
> -                       size);
> -               err = -ENOMEM;
> -               goto err_dma;
> -       }
> -
>         err = drm_gem_object_init(drm, &bo->gem, size);
>         if (err)
> -               goto err_init;
> +               goto free;
>
>         err = drm_gem_create_mmap_offset(&bo->gem);

We need to call drm_gem_free_mmap_offset() if one of the calls below
fails; otherwise we'll try to free the mmap offset on an already
destroyed bo.
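
As an illustration of the label ordering this suggests, here is a small user-space sketch. The fake_* helpers are hypothetical stand-ins for drm_gem_object_init(), drm_gem_create_mmap_offset(), drm_gem_free_mmap_offset() and drm_gem_object_release(), and the sketch makes no assumption about whether the real release helper already drops the offset in a given kernel version.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

struct fake_bo {
	bool gem_initialized;
	bool has_mmap_offset;
};

/* Counts objects that were freed while still holding an mmap offset. */
static int freed_with_offset;

static int fake_gem_init(struct fake_bo *bo)
{
	bo->gem_initialized = true;
	return 0;
}

static void fake_gem_release(struct fake_bo *bo)
{
	assert(bo->gem_initialized);
	bo->gem_initialized = false;
}

static int fake_create_mmap_offset(struct fake_bo *bo)
{
	bo->has_mmap_offset = true;
	return 0;
}

static void fake_free_mmap_offset(struct fake_bo *bo)
{
	bo->has_mmap_offset = false;
}

/* Hypothetical backing-store allocation that can be forced to fail. */
static int fake_alloc_backing(bool fail)
{
	return fail ? -ENOMEM : 0;
}

static void fake_bo_free(struct fake_bo *bo)
{
	if (bo->has_mmap_offset)
		freed_with_offset++;

	free(bo);
}

static struct fake_bo *sketch_bo_create(bool fail_backing, int *errp)
{
	struct fake_bo *bo = calloc(1, sizeof(*bo));
	int err;

	if (!bo) {
		*errp = -ENOMEM;
		return NULL;
	}

	err = fake_gem_init(bo);
	if (err)
		goto free_bo;

	err = fake_create_mmap_offset(bo);
	if (err)
		goto release;

	err = fake_alloc_backing(fail_backing);
	if (err)
		goto free_offset;

	*errp = 0;
	return bo;

free_offset:
	/* Drop the offset before the object it belongs to goes away. */
	fake_free_mmap_offset(bo);
release:
	fake_gem_release(bo);
free_bo:
	fake_bo_free(bo);
	*errp = err;
	return NULL;
}
```

Each label undoes exactly one earlier step, so a failure in the backing allocation never leaves an offset attached to a freed object.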


Sean



>         if (err)
> -               goto err_mmap;
> +               goto release;
> +
> +       if (tegra->domain) {
> +               err = tegra_bo_get_pages(drm, bo, size);
> +               if (err < 0)
> +                       goto release;
> +
> +               bo->sgt = drm_prime_pages_to_sg(bo->pages, bo->num_pages);
> +               if (IS_ERR(bo->sgt)) {
> +                       err = PTR_ERR(bo->sgt);
> +                       goto release;
> +               }
> +
> +               err = tegra_bo_iommu_map(tegra, bo);
> +               if (err < 0)
> +                       goto release;
> +       } else {
> +               err = tegra_bo_alloc(drm, bo, size);
> +               if (err < 0)
> +                       goto release;
> +       }
>
>         if (flags & DRM_TEGRA_GEM_CREATE_TILED)
>                 bo->tiling.mode = TEGRA_BO_TILING_MODE_TILED;
> @@ -133,11 +276,10 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>
>         return bo;
>
> -err_mmap:
> +release:
>         drm_gem_object_release(&bo->gem);
> -err_init:
>         tegra_bo_destroy(drm, bo);
> -err_dma:
> +free:
>         kfree(bo);
>
>         return ERR_PTR(err);
> @@ -172,6 +314,7 @@ err:
>  static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
>                                         struct dma_buf *buf)
>  {
> +       struct tegra_drm *tegra = drm->dev_private;
>         struct dma_buf_attachment *attach;
>         struct tegra_bo *bo;
>         ssize_t size;
> @@ -211,12 +354,19 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
>                 goto detach;
>         }
>
> -       if (bo->sgt->nents > 1) {
> -               err = -EINVAL;
> -               goto detach;
> +       if (tegra->domain) {
> +               err = tegra_bo_iommu_map(tegra, bo);
> +               if (err < 0)
> +                       goto detach;
> +       } else {
> +               if (bo->sgt->nents > 1) {
> +                       err = -EINVAL;
> +                       goto detach;
> +               }
> +
> +               bo->paddr = sg_dma_address(bo->sgt->sgl);
>         }
>
> -       bo->paddr = sg_dma_address(bo->sgt->sgl);
>         bo->gem.import_attach = attach;
>
>         return bo;
> @@ -239,8 +389,12 @@ free:
>
>  void tegra_bo_free_object(struct drm_gem_object *gem)
>  {
> +       struct tegra_drm *tegra = gem->dev->dev_private;
>         struct tegra_bo *bo = to_tegra_bo(gem);
>
> +       if (tegra->domain)
> +               tegra_bo_iommu_unmap(tegra, bo);
> +
>         if (gem->import_attach) {
>                 dma_buf_unmap_attachment(gem->import_attach, bo->sgt,
>                                          DMA_TO_DEVICE);
> @@ -301,7 +455,38 @@ int tegra_bo_dumb_map_offset(struct drm_file *file, struct drm_device *drm,
>         return 0;
>  }
>
> +static int tegra_bo_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> +{
> +       struct drm_gem_object *gem = vma->vm_private_data;
> +       struct tegra_bo *bo = to_tegra_bo(gem);
> +       struct page *page;
> +       pgoff_t offset;
> +       int err;
> +
> +       if (!bo->pages)
> +               return VM_FAULT_SIGBUS;
> +
> +       offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >> PAGE_SHIFT;
> +       page = bo->pages[offset];
> +
> +       err = vm_insert_page(vma, (unsigned long)vmf->virtual_address, page);
> +       switch (err) {
> +       case -EAGAIN:
> +       case 0:
> +       case -ERESTARTSYS:
> +       case -EINTR:
> +       case -EBUSY:
> +               return VM_FAULT_NOPAGE;
> +
> +       case -ENOMEM:
> +               return VM_FAULT_OOM;
> +       }
> +
> +       return VM_FAULT_SIGBUS;
> +}
> +
>  const struct vm_operations_struct tegra_bo_vm_ops = {
> +       .fault = tegra_bo_fault,
>         .open = drm_gem_vm_open,
>         .close = drm_gem_vm_close,
>  };
> @@ -316,13 +501,18 @@ int tegra_drm_mmap(struct file *file, struct vm_area_struct *vma)
>         if (ret)
>                 return ret;
>
> +       vma->vm_flags |= VM_MIXEDMAP;
> +       vma->vm_flags &= ~VM_PFNMAP;
> +
>         gem = vma->vm_private_data;
>         bo = to_tegra_bo(gem);
>
> -       ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
> -                             vma->vm_end - vma->vm_start, vma->vm_page_prot);
> -       if (ret)
> -               drm_gem_vm_close(vma);
> +       if (!bo->pages) {
> +               ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
> +                                     vma->vm_end - vma->vm_start, vma->vm_page_prot);
> +               if (ret)
> +                       drm_gem_vm_close(vma);
> +       }
>
>         return ret;
>  }
> diff --git a/drivers/gpu/drm/tegra/gem.h b/drivers/gpu/drm/tegra/gem.h
> index 43a25c853357..c2e3f43e4b3f 100644
> --- a/drivers/gpu/drm/tegra/gem.h
> +++ b/drivers/gpu/drm/tegra/gem.h
> @@ -37,6 +37,10 @@ struct tegra_bo {
>         dma_addr_t paddr;
>         void *vaddr;
>
> +       struct drm_mm_node *mm;
> +       unsigned long num_pages;
> +       struct page **pages;
> +
>         struct tegra_bo_tiling tiling;
>  };
>
> --
> 2.0.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 09/10] drm/tegra: Add IOMMU support
@ 2014-09-30 18:48         ` Sean Paul
  0 siblings, 0 replies; 133+ messages in thread
From: Sean Paul @ 2014-09-30 18:48 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel,
	Cho KyongHo, Grant Grundler, Dave Martin, Marc Zyngier,
	Hiroshi Doyu, Olav Haugan, Paul Walmsley, Rhyland Klein,
	Allen Martin, devicetree, Linux IOMMU, Linux ARM Kernel,
	linux-tegra, Linux Kernel Mailing List, Stéphane Marchesin

On Thu, Jun 26, 2014 at 4:49 PM, Thierry Reding
<thierry.reding@gmail.com> wrote:
> From: Thierry Reding <treding@nvidia.com>
>
> When an IOMMU device is available on the platform bus, allocate an IOMMU
> domain and attach the display controllers to it. The display controllers
> can then scan out non-contiguous buffers by mapping them through the
> IOMMU.
>

Hi Thierry,
Here are a few comments from Stéphane and me that came up while we were
reviewing this for our tree.

> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/gpu/drm/tegra/dc.c  |  21 ++++
>  drivers/gpu/drm/tegra/drm.c |  17 ++++
>  drivers/gpu/drm/tegra/drm.h |   3 +
>  drivers/gpu/drm/tegra/fb.c  |  16 ++-
>  drivers/gpu/drm/tegra/gem.c | 236 +++++++++++++++++++++++++++++++++++++++-----
>  drivers/gpu/drm/tegra/gem.h |   4 +
>  6 files changed, 273 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
> index afcca04f5367..0f7452d04811 100644
> --- a/drivers/gpu/drm/tegra/dc.c
> +++ b/drivers/gpu/drm/tegra/dc.c
> @@ -9,6 +9,7 @@
>
>  #include <linux/clk.h>
>  #include <linux/debugfs.h>
> +#include <linux/iommu.h>
>  #include <linux/reset.h>
>
>  #include "dc.h"
> @@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
>  {
>         struct drm_device *drm = dev_get_drvdata(client->parent);
>         struct tegra_dc *dc = host1x_client_to_dc(client);
> +       struct tegra_drm *tegra = drm->dev_private;
>         int err;
>
> +       if (tegra->domain) {
> +               err = iommu_attach_device(tegra->domain, dc->dev);
> +               if (err < 0) {
> +                       dev_err(dc->dev, "failed to attach to IOMMU: %d\n",
> +                               err);
> +                       return err;
> +               }

[from Stéphane]

Shouldn't we call iommu_detach_device() in the error paths below?


> +       }
> +
>         drm_crtc_init(drm, &dc->base, &tegra_crtc_funcs);
>         drm_mode_crtc_set_gamma_size(&dc->base, 256);
>         drm_crtc_helper_add(&dc->base, &tegra_crtc_helper_funcs);
> @@ -1318,7 +1329,9 @@ static int tegra_dc_init(struct host1x_client *client)
>
>  static int tegra_dc_exit(struct host1x_client *client)
>  {
> +       struct drm_device *drm = dev_get_drvdata(client->parent);
>         struct tegra_dc *dc = host1x_client_to_dc(client);
> +       struct tegra_drm *tegra = drm->dev_private;
>         int err;
>
>         devm_free_irq(dc->dev, dc->irq, dc);
> @@ -1335,6 +1348,8 @@ static int tegra_dc_exit(struct host1x_client *client)
>                 return err;
>         }
>
> +       iommu_detach_device(tegra->domain, dc->dev);
> +
>         return 0;
>  }
>
> @@ -1462,6 +1477,12 @@ static int tegra_dc_probe(struct platform_device *pdev)
>                 return -ENXIO;
>         }
>
> +       err = iommu_attach(&pdev->dev);
> +       if (err < 0) {
> +               dev_err(&pdev->dev, "failed to attach to IOMMU: %d\n", err);
> +               return err;
> +       }
> +
>         INIT_LIST_HEAD(&dc->client.list);
>         dc->client.ops = &dc_client_ops;
>         dc->client.dev = &pdev->dev;
> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
> index 59736bb810cd..1d2bbafad982 100644
> --- a/drivers/gpu/drm/tegra/drm.c
> +++ b/drivers/gpu/drm/tegra/drm.c
> @@ -8,6 +8,7 @@
>   */
>
>  #include <linux/host1x.h>
> +#include <linux/iommu.h>
>
>  #include "drm.h"
>  #include "gem.h"
> @@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>         if (!tegra)
>                 return -ENOMEM;
>
> +       if (iommu_present(&platform_bus_type)) {
> +               tegra->domain = iommu_domain_alloc(&platform_bus_type);
> +               if (IS_ERR(tegra->domain)) {
> +                       kfree(tegra);
> +                       return PTR_ERR(tegra->domain);
> +               }
> +
> +               drm_mm_init(&tegra->mm, 0, SZ_2G);


[from Stéphane]:

Neither of these is released in the error paths below (iommu_domain_free()
and drm_mm_takedown() are never called).

Also, |tegra| isn't freed there either?



> +       }
> +
>         mutex_init(&tegra->clients_lock);
>         INIT_LIST_HEAD(&tegra->clients);
>         drm->dev_private = tegra;
> @@ -71,6 +82,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>  static int tegra_drm_unload(struct drm_device *drm)
>  {
>         struct host1x_device *device = to_host1x_device(drm->dev);
> +       struct tegra_drm *tegra = drm->dev_private;
>         int err;
>
>         drm_kms_helper_poll_fini(drm);
> @@ -82,6 +94,11 @@ static int tegra_drm_unload(struct drm_device *drm)
>         if (err < 0)
>                 return err;
>
> +       if (tegra->domain) {
> +               iommu_domain_free(tegra->domain);
> +               drm_mm_takedown(&tegra->mm);
> +       }
> +
>         return 0;
>  }
>
> diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
> index 96d754e7b3eb..a07c796b7edc 100644
> --- a/drivers/gpu/drm/tegra/drm.h
> +++ b/drivers/gpu/drm/tegra/drm.h
> @@ -39,6 +39,9 @@ struct tegra_fbdev {
>  struct tegra_drm {
>         struct drm_device *drm;
>
> +       struct iommu_domain *domain;
> +       struct drm_mm mm;
> +
>         struct mutex clients_lock;
>         struct list_head clients;
>
> diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c
> index 7790d43ad082..21c65dd817c3 100644
> --- a/drivers/gpu/drm/tegra/fb.c
> +++ b/drivers/gpu/drm/tegra/fb.c
> @@ -65,8 +65,12 @@ static void tegra_fb_destroy(struct drm_framebuffer *framebuffer)
>         for (i = 0; i < fb->num_planes; i++) {
>                 struct tegra_bo *bo = fb->planes[i];
>
> -               if (bo)
> +               if (bo) {
> +                       if (bo->pages && bo->virt)
> +                               vunmap(bo->virt);
> +
>                         drm_gem_object_unreference_unlocked(&bo->gem);
> +               }
>         }
>
>         drm_framebuffer_cleanup(framebuffer);
> @@ -252,6 +256,16 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
>         offset = info->var.xoffset * bytes_per_pixel +
>                  info->var.yoffset * fb->pitches[0];
>
> +       if (bo->pages) {
> +               bo->vaddr = vmap(bo->pages, bo->num_pages, VM_MAP,
> +                                pgprot_writecombine(PAGE_KERNEL));
> +               if (!bo->vaddr) {
> +                       dev_err(drm->dev, "failed to vmap() framebuffer\n");
> +                       err = -ENOMEM;
> +                       goto destroy;
> +               }
> +       }
> +
>         drm->mode_config.fb_base = (resource_size_t)bo->paddr;
>         info->screen_base = (void __iomem *)bo->vaddr + offset;
>         info->screen_size = size;
> diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
> index c1e4e8b6e5ca..2912e61a2599 100644
> --- a/drivers/gpu/drm/tegra/gem.c
> +++ b/drivers/gpu/drm/tegra/gem.c
> @@ -14,8 +14,10 @@
>   */
>
>  #include <linux/dma-buf.h>
> +#include <linux/iommu.h>
>  #include <drm/tegra_drm.h>
>
> +#include "drm.h"
>  #include "gem.h"
>
>  static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo)
> @@ -90,14 +92,144 @@ static const struct host1x_bo_ops tegra_bo_ops = {
>         .kunmap = tegra_bo_kunmap,
>  };
>
> +static int iommu_map_sg(struct iommu_domain *domain, struct sg_table *sgt,
> +                       dma_addr_t iova, int prot)
> +{
> +       unsigned long offset = 0;
> +       struct scatterlist *sg;
> +       unsigned int i, j;
> +       int err;
> +
> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> +               dma_addr_t phys = sg_phys(sg);
> +               size_t length = sg->offset;
> +
> +               phys = sg_phys(sg) - sg->offset;
> +               length = sg->length + sg->offset;
> +
> +               err = iommu_map(domain, iova + offset, phys, length, prot);
> +               if (err < 0)
> +                       goto unmap;
> +
> +               offset += length;
> +       }
> +
> +       return 0;
> +
> +unmap:
> +       offset = 0;
> +
> +       for_each_sg(sgt->sgl, sg, i, j) {
> +               size_t length = sg->length + sg->offset;
> +               iommu_unmap(domain, iova + offset, length);
> +               offset += length;
> +       }
> +
> +       return err;
> +}
> +
> +static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
> +                         dma_addr_t iova)
> +{
> +       unsigned long offset = 0;
> +       struct scatterlist *sg;
> +       unsigned int i;
> +
> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> +               dma_addr_t phys = sg_phys(sg);
> +               size_t length = sg->offset;
> +
> +               phys = sg_phys(sg) - sg->offset;
> +               length = sg->length + sg->offset;
> +
> +               iommu_unmap(domain, iova + offset, length);
> +               offset += length;
> +       }
> +
> +       return 0;
> +}
> +
> +static int tegra_bo_iommu_map(struct tegra_drm *tegra, struct tegra_bo *bo)
> +{
> +       int prot = IOMMU_READ | IOMMU_WRITE;
> +       int err;
> +
> +       if (bo->mm)
> +               return -EBUSY;
> +
> +       bo->mm = kzalloc(sizeof(*bo->mm), GFP_KERNEL);
> +       if (!bo->mm)
> +               return -ENOMEM;
> +
> +       err = drm_mm_insert_node_generic(&tegra->mm, bo->mm, bo->gem.size,
> +                                        PAGE_SIZE, 0, 0, 0);
> +       if (err < 0) {
> +               dev_err(tegra->drm->dev, "out of virtual memory: %d\n", err);
> +               return err;
> +       }
> +
> +       bo->paddr = bo->mm->start;
> +
> +       err = iommu_map_sg(tegra->domain, bo->sgt, bo->paddr, prot);
> +       if (err < 0) {
> +               dev_err(tegra->drm->dev, "failed to map buffer: %d\n", err);
> +               return err;
> +       }
> +
> +       return 0;
> +}
> +
> +static int tegra_bo_iommu_unmap(struct tegra_drm *tegra, struct tegra_bo *bo)
> +{
> +       if (!bo->mm)
> +               return 0;
> +
> +       iommu_unmap_sg(tegra->domain, bo->sgt, bo->paddr);
> +       drm_mm_remove_node(bo->mm);
> +
> +       kfree(bo->mm);
> +       return 0;
> +}
> +
>  static void tegra_bo_destroy(struct drm_device *drm, struct tegra_bo *bo)
>  {
> -       dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr, bo->paddr);
> +       if (!bo->pages)
> +               dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr,
> +                                     bo->paddr);
> +       else
> +               drm_gem_put_pages(&bo->gem, bo->pages, true, true);
> +}
> +
> +static int tegra_bo_get_pages(struct drm_device *drm, struct tegra_bo *bo,
> +                             size_t size)
> +{
> +       bo->pages = drm_gem_get_pages(&bo->gem, GFP_KERNEL);
> +       if (!bo->pages)
> +               return -ENOMEM;
> +
> +       bo->num_pages = size >> PAGE_SHIFT;
> +
> +       return 0;
> +}
> +
> +static int tegra_bo_alloc(struct drm_device *drm, struct tegra_bo *bo,
> +                         size_t size)
> +{
> +       bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
> +                                          GFP_KERNEL | __GFP_NOWARN);
> +       if (!bo->vaddr) {
> +               dev_err(drm->dev, "failed to allocate buffer of size %zu\n",
> +                       size);
> +               return -ENOMEM;
> +       }
> +
> +       return 0;
>  }
>
>  struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>                                  unsigned long flags)
>  {
> +       struct tegra_drm *tegra = drm->dev_private;
>         struct tegra_bo *bo;
>         int err;
>
> @@ -108,22 +240,33 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>         host1x_bo_init(&bo->base, &tegra_bo_ops);
>         size = round_up(size, PAGE_SIZE);
>
> -       bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
> -                                          GFP_KERNEL | __GFP_NOWARN);
> -       if (!bo->vaddr) {
> -               dev_err(drm->dev, "failed to allocate buffer with size %u\n",
> -                       size);
> -               err = -ENOMEM;
> -               goto err_dma;
> -       }
> -
>         err = drm_gem_object_init(drm, &bo->gem, size);
>         if (err)
> -               goto err_init;
> +               goto free;
>
>         err = drm_gem_create_mmap_offset(&bo->gem);

We need to call drm_gem_free_mmap_offset() if one of the calls below
fails; otherwise we'll try to free the mmap offset on an already
destroyed bo.


Sean



>         if (err)
> -               goto err_mmap;
> +               goto release;
> +
> +       if (tegra->domain) {
> +               err = tegra_bo_get_pages(drm, bo, size);
> +               if (err < 0)
> +                       goto release;
> +
> +               bo->sgt = drm_prime_pages_to_sg(bo->pages, bo->num_pages);
> +               if (IS_ERR(bo->sgt)) {
> +                       err = PTR_ERR(bo->sgt);
> +                       goto release;
> +               }
> +
> +               err = tegra_bo_iommu_map(tegra, bo);
> +               if (err < 0)
> +                       goto release;
> +       } else {
> +               err = tegra_bo_alloc(drm, bo, size);
> +               if (err < 0)
> +                       goto release;
> +       }
>
>         if (flags & DRM_TEGRA_GEM_CREATE_TILED)
>                 bo->tiling.mode = TEGRA_BO_TILING_MODE_TILED;
> @@ -133,11 +276,10 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>
>         return bo;
>
> -err_mmap:
> +release:
>         drm_gem_object_release(&bo->gem);
> -err_init:
>         tegra_bo_destroy(drm, bo);
> -err_dma:
> +free:
>         kfree(bo);
>
>         return ERR_PTR(err);
> @@ -172,6 +314,7 @@ err:
>  static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
>                                         struct dma_buf *buf)
>  {
> +       struct tegra_drm *tegra = drm->dev_private;
>         struct dma_buf_attachment *attach;
>         struct tegra_bo *bo;
>         ssize_t size;
> @@ -211,12 +354,19 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
>                 goto detach;
>         }
>
> -       if (bo->sgt->nents > 1) {
> -               err = -EINVAL;
> -               goto detach;
> +       if (tegra->domain) {
> +               err = tegra_bo_iommu_map(tegra, bo);
> +               if (err < 0)
> +                       goto detach;
> +       } else {
> +               if (bo->sgt->nents > 1) {
> +                       err = -EINVAL;
> +                       goto detach;
> +               }
> +
> +               bo->paddr = sg_dma_address(bo->sgt->sgl);
>         }
>
> -       bo->paddr = sg_dma_address(bo->sgt->sgl);
>         bo->gem.import_attach = attach;
>
>         return bo;
> @@ -239,8 +389,12 @@ free:
>
>  void tegra_bo_free_object(struct drm_gem_object *gem)
>  {
> +       struct tegra_drm *tegra = gem->dev->dev_private;
>         struct tegra_bo *bo = to_tegra_bo(gem);
>
> +       if (tegra->domain)
> +               tegra_bo_iommu_unmap(tegra, bo);
> +
>         if (gem->import_attach) {
>                 dma_buf_unmap_attachment(gem->import_attach, bo->sgt,
>                                          DMA_TO_DEVICE);
> @@ -301,7 +455,38 @@ int tegra_bo_dumb_map_offset(struct drm_file *file, struct drm_device *drm,
>         return 0;
>  }
>
> +static int tegra_bo_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> +{
> +       struct drm_gem_object *gem = vma->vm_private_data;
> +       struct tegra_bo *bo = to_tegra_bo(gem);
> +       struct page *page;
> +       pgoff_t offset;
> +       int err;
> +
> +       if (!bo->pages)
> +               return VM_FAULT_SIGBUS;
> +
> +       offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >> PAGE_SHIFT;
> +       page = bo->pages[offset];
> +
> +       err = vm_insert_page(vma, (unsigned long)vmf->virtual_address, page);
> +       switch (err) {
> +       case -EAGAIN:
> +       case 0:
> +       case -ERESTARTSYS:
> +       case -EINTR:
> +       case -EBUSY:
> +               return VM_FAULT_NOPAGE;
> +
> +       case -ENOMEM:
> +               return VM_FAULT_OOM;
> +       }
> +
> +       return VM_FAULT_SIGBUS;
> +}
> +
>  const struct vm_operations_struct tegra_bo_vm_ops = {
> +       .fault = tegra_bo_fault,
>         .open = drm_gem_vm_open,
>         .close = drm_gem_vm_close,
>  };
> @@ -316,13 +501,18 @@ int tegra_drm_mmap(struct file *file, struct vm_area_struct *vma)
>         if (ret)
>                 return ret;
>
> +       vma->vm_flags |= VM_MIXEDMAP;
> +       vma->vm_flags &= ~VM_PFNMAP;
> +
>         gem = vma->vm_private_data;
>         bo = to_tegra_bo(gem);
>
> -       ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
> -                             vma->vm_end - vma->vm_start, vma->vm_page_prot);
> -       if (ret)
> -               drm_gem_vm_close(vma);
> +       if (!bo->pages) {
> +               ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
> +                                     vma->vm_end - vma->vm_start, vma->vm_page_prot);
> +               if (ret)
> +                       drm_gem_vm_close(vma);
> +       }
>
>         return ret;
>  }
> diff --git a/drivers/gpu/drm/tegra/gem.h b/drivers/gpu/drm/tegra/gem.h
> index 43a25c853357..c2e3f43e4b3f 100644
> --- a/drivers/gpu/drm/tegra/gem.h
> +++ b/drivers/gpu/drm/tegra/gem.h
> @@ -37,6 +37,10 @@ struct tegra_bo {
>         dma_addr_t paddr;
>         void *vaddr;
>
> +       struct drm_mm_node *mm;
> +       unsigned long num_pages;
> +       struct page **pages;
> +
>         struct tegra_bo_tiling tiling;
>  };
>
> --
> 2.0.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 133+ messages in thread

* [RFC 09/10] drm/tegra: Add IOMMU support
@ 2014-09-30 18:48         ` Sean Paul
  0 siblings, 0 replies; 133+ messages in thread
From: Sean Paul @ 2014-09-30 18:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jun 26, 2014 at 4:49 PM, Thierry Reding
<thierry.reding@gmail.com> wrote:
> From: Thierry Reding <treding@nvidia.com>
>
> When an IOMMU device is available on the platform bus, allocate an IOMMU
> domain and attach the display controllers to it. The display controllers
> can then scan out non-contiguous buffers by mapping them through the
> IOMMU.
>

Hi Thierry,
A few comments from Stéphane and myself that came up while we were
reviewing this for our tree.

> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/gpu/drm/tegra/dc.c  |  21 ++++
>  drivers/gpu/drm/tegra/drm.c |  17 ++++
>  drivers/gpu/drm/tegra/drm.h |   3 +
>  drivers/gpu/drm/tegra/fb.c  |  16 ++-
>  drivers/gpu/drm/tegra/gem.c | 236 +++++++++++++++++++++++++++++++++++++++-----
>  drivers/gpu/drm/tegra/gem.h |   4 +
>  6 files changed, 273 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
> index afcca04f5367..0f7452d04811 100644
> --- a/drivers/gpu/drm/tegra/dc.c
> +++ b/drivers/gpu/drm/tegra/dc.c
> @@ -9,6 +9,7 @@
>
>  #include <linux/clk.h>
>  #include <linux/debugfs.h>
> +#include <linux/iommu.h>
>  #include <linux/reset.h>
>
>  #include "dc.h"
> @@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
>  {
>         struct drm_device *drm = dev_get_drvdata(client->parent);
>         struct tegra_dc *dc = host1x_client_to_dc(client);
> +       struct tegra_drm *tegra = drm->dev_private;
>         int err;
>
> +       if (tegra->domain) {
> +               err = iommu_attach_device(tegra->domain, dc->dev);
> +               if (err < 0) {
> +                       dev_err(dc->dev, "failed to attach to IOMMU: %d\n",
> +                               err);
> +                       return err;
> +               }

[from Stéphane]

Shouldn't we call iommu_detach_device() in the error paths below?
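
To make the question concrete, here is a minimal userspace sketch of the unwind we have in mind. Everything named mock_* is a hypothetical stand-in invented for illustration, not the kernel API and not the actual driver code. The only point is that a failure after a successful attach jumps to a label that undoes the attach.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the display controller state. */
struct mock_dc {
	bool attached;		/* true while "attached" to the domain */
	bool fail_later;	/* force a failure in a later init step */
};

/* Stand-in for attaching the device to the IOMMU domain. */
static int mock_attach(struct mock_dc *dc)
{
	dc->attached = true;
	return 0;
}

/* Stand-in for detaching the device again. */
static void mock_detach(struct mock_dc *dc)
{
	dc->attached = false;
}

/* Stand-in for any init step that can fail after the attach. */
static int mock_later_step(struct mock_dc *dc)
{
	return dc->fail_later ? -ENOMEM : 0;
}

static int mock_dc_init(struct mock_dc *dc, bool have_domain)
{
	int err;

	if (have_domain) {
		err = mock_attach(dc);
		if (err < 0)
			return err;
	}

	err = mock_later_step(dc);
	if (err < 0)
		goto detach;

	return 0;

detach:
	/* Undo the attach so a failed init leaves nothing behind. */
	if (have_domain)
		mock_detach(dc);

	return err;
}
```

With a single label, every later failure in the function can share the same cleanup instead of repeating the detach at each return.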


> +       }
> +
>         drm_crtc_init(drm, &dc->base, &tegra_crtc_funcs);
>         drm_mode_crtc_set_gamma_size(&dc->base, 256);
>         drm_crtc_helper_add(&dc->base, &tegra_crtc_helper_funcs);
> @@ -1318,7 +1329,9 @@ static int tegra_dc_init(struct host1x_client *client)
>
>  static int tegra_dc_exit(struct host1x_client *client)
>  {
> +       struct drm_device *drm = dev_get_drvdata(client->parent);
>         struct tegra_dc *dc = host1x_client_to_dc(client);
> +       struct tegra_drm *tegra = drm->dev_private;
>         int err;
>
>         devm_free_irq(dc->dev, dc->irq, dc);
> @@ -1335,6 +1348,8 @@ static int tegra_dc_exit(struct host1x_client *client)
>                 return err;
>         }
>
> +       iommu_detach_device(tegra->domain, dc->dev);
> +
>         return 0;
>  }
>
> @@ -1462,6 +1477,12 @@ static int tegra_dc_probe(struct platform_device *pdev)
>                 return -ENXIO;
>         }
>
> +       err = iommu_attach(&pdev->dev);
> +       if (err < 0) {
> +               dev_err(&pdev->dev, "failed to attach to IOMMU: %d\n", err);
> +               return err;
> +       }
> +
>         INIT_LIST_HEAD(&dc->client.list);
>         dc->client.ops = &dc_client_ops;
>         dc->client.dev = &pdev->dev;
> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
> index 59736bb810cd..1d2bbafad982 100644
> --- a/drivers/gpu/drm/tegra/drm.c
> +++ b/drivers/gpu/drm/tegra/drm.c
> @@ -8,6 +8,7 @@
>   */
>
>  #include <linux/host1x.h>
> +#include <linux/iommu.h>
>
>  #include "drm.h"
>  #include "gem.h"
> @@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>         if (!tegra)
>                 return -ENOMEM;
>
> +       if (iommu_present(&platform_bus_type)) {
> +               tegra->domain = iommu_domain_alloc(&platform_bus_type);
> +               if (IS_ERR(tegra->domain)) {
> +                       kfree(tegra);
> +                       return PTR_ERR(tegra->domain);
> +               }
> +
> +               drm_mm_init(&tegra->mm, 0, SZ_2G);


[from Stéphane]:

None of these are freed in the error paths below (iommu_domain_free()
and drm_mm_takedown() are never called there).

Also, |tegra| isn't freed there either?
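
As a rough illustration of what we mean, the following userspace sketch releases everything in reverse order of acquisition when a later step fails. The mock_* names and the mock_live counter are hypothetical and exist only so that a leak would be observable; this is not the driver code, and the -ENODEV failure is just a placeholder.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct tegra_drm; not the real driver type. */
struct mock_tegra {
	bool has_domain;	/* stands in for the allocated IOMMU domain */
	bool has_mm;		/* stands in for the initialized drm_mm */
};

/* Number of resources currently held, so leaks are observable. */
static int mock_live;

static void mock_teardown(struct mock_tegra *tegra)
{
	/* Release in reverse order of acquisition. */
	if (tegra->has_mm) {
		tegra->has_mm = false;
		mock_live--;
	}

	if (tegra->has_domain) {
		tegra->has_domain = false;
		mock_live--;
	}

	free(tegra);
	mock_live--;
}

/* fail_later simulates any step after the IOMMU setup returning an error. */
static int mock_load(bool iommu_present, bool fail_later,
		     struct mock_tegra **out)
{
	struct mock_tegra *tegra;

	tegra = calloc(1, sizeof(*tegra));
	if (!tegra)
		return -ENOMEM;
	mock_live++;

	if (iommu_present) {
		tegra->has_domain = true;
		mock_live++;
		tegra->has_mm = true;
		mock_live++;
	}

	if (fail_later) {
		/* One exit point undoes everything acquired so far. */
		mock_teardown(tegra);
		return -ENODEV;
	}

	*out = tegra;
	return 0;
}
```

Sharing the teardown between the error path and the unload path keeps the two from drifting apart.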



> +       }
> +
>         mutex_init(&tegra->clients_lock);
>         INIT_LIST_HEAD(&tegra->clients);
>         drm->dev_private = tegra;
> @@ -71,6 +82,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>  static int tegra_drm_unload(struct drm_device *drm)
>  {
>         struct host1x_device *device = to_host1x_device(drm->dev);
> +       struct tegra_drm *tegra = drm->dev_private;
>         int err;
>
>         drm_kms_helper_poll_fini(drm);
> @@ -82,6 +94,11 @@ static int tegra_drm_unload(struct drm_device *drm)
>         if (err < 0)
>                 return err;
>
> +       if (tegra->domain) {
> +               iommu_domain_free(tegra->domain);
> +               drm_mm_takedown(&tegra->mm);
> +       }
> +
>         return 0;
>  }
>
> diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
> index 96d754e7b3eb..a07c796b7edc 100644
> --- a/drivers/gpu/drm/tegra/drm.h
> +++ b/drivers/gpu/drm/tegra/drm.h
> @@ -39,6 +39,9 @@ struct tegra_fbdev {
>  struct tegra_drm {
>         struct drm_device *drm;
>
> +       struct iommu_domain *domain;
> +       struct drm_mm mm;
> +
>         struct mutex clients_lock;
>         struct list_head clients;
>
> diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c
> index 7790d43ad082..21c65dd817c3 100644
> --- a/drivers/gpu/drm/tegra/fb.c
> +++ b/drivers/gpu/drm/tegra/fb.c
> @@ -65,8 +65,12 @@ static void tegra_fb_destroy(struct drm_framebuffer *framebuffer)
>         for (i = 0; i < fb->num_planes; i++) {
>                 struct tegra_bo *bo = fb->planes[i];
>
> -               if (bo)
> +               if (bo) {
> +                       if (bo->pages && bo->virt)
> +                               vunmap(bo->virt);
> +
>                         drm_gem_object_unreference_unlocked(&bo->gem);
> +               }
>         }
>
>         drm_framebuffer_cleanup(framebuffer);
> @@ -252,6 +256,16 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
>         offset = info->var.xoffset * bytes_per_pixel +
>                  info->var.yoffset * fb->pitches[0];
>
> +       if (bo->pages) {
> +               bo->vaddr = vmap(bo->pages, bo->num_pages, VM_MAP,
> +                                pgprot_writecombine(PAGE_KERNEL));
> +               if (!bo->vaddr) {
> +                       dev_err(drm->dev, "failed to vmap() framebuffer\n");
> +                       err = -ENOMEM;
> +                       goto destroy;
> +               }
> +       }
> +
>         drm->mode_config.fb_base = (resource_size_t)bo->paddr;
>         info->screen_base = (void __iomem *)bo->vaddr + offset;
>         info->screen_size = size;
> diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
> index c1e4e8b6e5ca..2912e61a2599 100644
> --- a/drivers/gpu/drm/tegra/gem.c
> +++ b/drivers/gpu/drm/tegra/gem.c
> @@ -14,8 +14,10 @@
>   */
>
>  #include <linux/dma-buf.h>
> +#include <linux/iommu.h>
>  #include <drm/tegra_drm.h>
>
> +#include "drm.h"
>  #include "gem.h"
>
>  static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo)
> @@ -90,14 +92,144 @@ static const struct host1x_bo_ops tegra_bo_ops = {
>         .kunmap = tegra_bo_kunmap,
>  };
>
> +static int iommu_map_sg(struct iommu_domain *domain, struct sg_table *sgt,
> +                       dma_addr_t iova, int prot)
> +{
> +       unsigned long offset = 0;
> +       struct scatterlist *sg;
> +       unsigned int i, j;
> +       int err;
> +
> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> +               dma_addr_t phys = sg_phys(sg);
> +               size_t length = sg->offset;
> +
> +               phys = sg_phys(sg) - sg->offset;
> +               length = sg->length + sg->offset;
> +
> +               err = iommu_map(domain, iova + offset, phys, length, prot);
> +               if (err < 0)
> +                       goto unmap;
> +
> +               offset += length;
> +       }
> +
> +       return 0;
> +
> +unmap:
> +       offset = 0;
> +
> +       for_each_sg(sgt->sgl, sg, i, j) {
> +               size_t length = sg->length + sg->offset;
> +               iommu_unmap(domain, iova + offset, length);
> +               offset += length;
> +       }
> +
> +       return err;
> +}
> +
> +static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
> +                         dma_addr_t iova)
> +{
> +       unsigned long offset = 0;
> +       struct scatterlist *sg;
> +       unsigned int i;
> +
> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> +               dma_addr_t phys = sg_phys(sg);
> +               size_t length = sg->offset;
> +
> +               phys = sg_phys(sg) - sg->offset;
> +               length = sg->length + sg->offset;
> +
> +               iommu_unmap(domain, iova + offset, length);
> +               offset += length;
> +       }
> +
> +       return 0;
> +}
> +
> +static int tegra_bo_iommu_map(struct tegra_drm *tegra, struct tegra_bo *bo)
> +{
> +       int prot = IOMMU_READ | IOMMU_WRITE;
> +       int err;
> +
> +       if (bo->mm)
> +               return -EBUSY;
> +
> +       bo->mm = kzalloc(sizeof(*bo->mm), GFP_KERNEL);
> +       if (!bo->mm)
> +               return -ENOMEM;
> +
> +       err = drm_mm_insert_node_generic(&tegra->mm, bo->mm, bo->gem.size,
> +                                        PAGE_SIZE, 0, 0, 0);
> +       if (err < 0) {
> +               dev_err(tegra->drm->dev, "out of virtual memory: %d\n", err);
> +               return err;
> +       }
> +
> +       bo->paddr = bo->mm->start;
> +
> +       err = iommu_map_sg(tegra->domain, bo->sgt, bo->paddr, prot);
> +       if (err < 0) {
> +               dev_err(tegra->drm->dev, "failed to map buffer: %d\n", err);
> +               return err;
> +       }
> +
> +       return 0;
> +}
> +
> +static int tegra_bo_iommu_unmap(struct tegra_drm *tegra, struct tegra_bo *bo)
> +{
> +       if (!bo->mm)
> +               return 0;
> +
> +       iommu_unmap_sg(tegra->domain, bo->sgt, bo->paddr);
> +       drm_mm_remove_node(bo->mm);
> +
> +       kfree(bo->mm);
> +       return 0;
> +}
> +
>  static void tegra_bo_destroy(struct drm_device *drm, struct tegra_bo *bo)
>  {
> -       dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr, bo->paddr);
> +       if (!bo->pages)
> +               dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr,
> +                                     bo->paddr);
> +       else
> +               drm_gem_put_pages(&bo->gem, bo->pages, true, true);
> +}
> +
> +static int tegra_bo_get_pages(struct drm_device *drm, struct tegra_bo *bo,
> +                             size_t size)
> +{
> +       bo->pages = drm_gem_get_pages(&bo->gem, GFP_KERNEL);
> +       if (!bo->pages)
> +               return -ENOMEM;
> +
> +       bo->num_pages = size >> PAGE_SHIFT;
> +
> +       return 0;
> +}
> +
> +static int tegra_bo_alloc(struct drm_device *drm, struct tegra_bo *bo,
> +                         size_t size)
> +{
> +       bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
> +                                          GFP_KERNEL | __GFP_NOWARN);
> +       if (!bo->vaddr) {
> +               dev_err(drm->dev, "failed to allocate buffer of size %zu\n",
> +                       size);
> +               return -ENOMEM;
> +       }
> +
> +       return 0;
>  }
>
>  struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>                                  unsigned long flags)
>  {
> +       struct tegra_drm *tegra = drm->dev_private;
>         struct tegra_bo *bo;
>         int err;
>
> @@ -108,22 +240,33 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>         host1x_bo_init(&bo->base, &tegra_bo_ops);
>         size = round_up(size, PAGE_SIZE);
>
> -       bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
> -                                          GFP_KERNEL | __GFP_NOWARN);
> -       if (!bo->vaddr) {
> -               dev_err(drm->dev, "failed to allocate buffer with size %u\n",
> -                       size);
> -               err = -ENOMEM;
> -               goto err_dma;
> -       }
> -
>         err = drm_gem_object_init(drm, &bo->gem, size);
>         if (err)
> -               goto err_init;
> +               goto free;
>
>         err = drm_gem_create_mmap_offset(&bo->gem);

We need to call drm_gem_free_mmap_offset if one of the calls below
fails, otherwise we'll try to free the mmap_offset on an already
destroyed bo.
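
A minimal userspace sketch of the ordering, assuming the offset is not already dropped elsewhere on this path. The mock_* names are hypothetical stand-ins, not the DRM API; the sketch only shows the offset being dropped while the object is still valid.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a GEM object; not the real DRM type. */
struct mock_gem {
	bool initialized;	/* set by the object-init stand-in */
	bool has_offset;	/* set by the create-mmap-offset stand-in */
};

/* Stand-in for the backing-storage allocation that may fail. */
static int mock_backing_alloc(bool fail)
{
	return fail ? -ENOMEM : 0;
}

static int mock_bo_create(struct mock_gem *gem, bool fail_alloc)
{
	int err;

	gem->initialized = true;
	gem->has_offset = true;

	err = mock_backing_alloc(fail_alloc);
	if (err < 0)
		goto free_offset;

	return 0;

free_offset:
	/* Drop the offset first, while the object is still valid. */
	gem->has_offset = false;
	gem->initialized = false;
	return err;
}
```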


Sean



>         if (err)
> -               goto err_mmap;
> +               goto release;
> +
> +       if (tegra->domain) {
> +               err = tegra_bo_get_pages(drm, bo, size);
> +               if (err < 0)
> +                       goto release;
> +
> +               bo->sgt = drm_prime_pages_to_sg(bo->pages, bo->num_pages);
> +               if (IS_ERR(bo->sgt)) {
> +                       err = PTR_ERR(bo->sgt);
> +                       goto release;
> +               }
> +
> +               err = tegra_bo_iommu_map(tegra, bo);
> +               if (err < 0)
> +                       goto release;
> +       } else {
> +               err = tegra_bo_alloc(drm, bo, size);
> +               if (err < 0)
> +                       goto release;
> +       }
>
>         if (flags & DRM_TEGRA_GEM_CREATE_TILED)
>                 bo->tiling.mode = TEGRA_BO_TILING_MODE_TILED;
> @@ -133,11 +276,10 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>
>         return bo;
>
> -err_mmap:
> +release:
>         drm_gem_object_release(&bo->gem);
> -err_init:
>         tegra_bo_destroy(drm, bo);
> -err_dma:
> +free:
>         kfree(bo);
>
>         return ERR_PTR(err);
> @@ -172,6 +314,7 @@ err:
>  static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
>                                         struct dma_buf *buf)
>  {
> +       struct tegra_drm *tegra = drm->dev_private;
>         struct dma_buf_attachment *attach;
>         struct tegra_bo *bo;
>         ssize_t size;
> @@ -211,12 +354,19 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
>                 goto detach;
>         }
>
> -       if (bo->sgt->nents > 1) {
> -               err = -EINVAL;
> -               goto detach;
> +       if (tegra->domain) {
> +               err = tegra_bo_iommu_map(tegra, bo);
> +               if (err < 0)
> +                       goto detach;
> +       } else {
> +               if (bo->sgt->nents > 1) {
> +                       err = -EINVAL;
> +                       goto detach;
> +               }
> +
> +               bo->paddr = sg_dma_address(bo->sgt->sgl);
>         }
>
> -       bo->paddr = sg_dma_address(bo->sgt->sgl);
>         bo->gem.import_attach = attach;
>
>         return bo;
> @@ -239,8 +389,12 @@ free:
>
>  void tegra_bo_free_object(struct drm_gem_object *gem)
>  {
> +       struct tegra_drm *tegra = gem->dev->dev_private;
>         struct tegra_bo *bo = to_tegra_bo(gem);
>
> +       if (tegra->domain)
> +               tegra_bo_iommu_unmap(tegra, bo);
> +
>         if (gem->import_attach) {
>                 dma_buf_unmap_attachment(gem->import_attach, bo->sgt,
>                                          DMA_TO_DEVICE);
> @@ -301,7 +455,38 @@ int tegra_bo_dumb_map_offset(struct drm_file *file, struct drm_device *drm,
>         return 0;
>  }
>
> +static int tegra_bo_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> +{
> +       struct drm_gem_object *gem = vma->vm_private_data;
> +       struct tegra_bo *bo = to_tegra_bo(gem);
> +       struct page *page;
> +       pgoff_t offset;
> +       int err;
> +
> +       if (!bo->pages)
> +               return VM_FAULT_SIGBUS;
> +
> +       offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >> PAGE_SHIFT;
> +       page = bo->pages[offset];
> +
> +       err = vm_insert_page(vma, (unsigned long)vmf->virtual_address, page);
> +       switch (err) {
> +       case -EAGAIN:
> +       case 0:
> +       case -ERESTARTSYS:
> +       case -EINTR:
> +       case -EBUSY:
> +               return VM_FAULT_NOPAGE;
> +
> +       case -ENOMEM:
> +               return VM_FAULT_OOM;
> +       }
> +
> +       return VM_FAULT_SIGBUS;
> +}
> +
>  const struct vm_operations_struct tegra_bo_vm_ops = {
> +       .fault = tegra_bo_fault,
>         .open = drm_gem_vm_open,
>         .close = drm_gem_vm_close,
>  };
> @@ -316,13 +501,18 @@ int tegra_drm_mmap(struct file *file, struct vm_area_struct *vma)
>         if (ret)
>                 return ret;
>
> +       vma->vm_flags |= VM_MIXEDMAP;
> +       vma->vm_flags &= ~VM_PFNMAP;
> +
>         gem = vma->vm_private_data;
>         bo = to_tegra_bo(gem);
>
> -       ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
> -                             vma->vm_end - vma->vm_start, vma->vm_page_prot);
> -       if (ret)
> -               drm_gem_vm_close(vma);
> +       if (!bo->pages) {
> +               ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
> +                                     vma->vm_end - vma->vm_start, vma->vm_page_prot);
> +               if (ret)
> +                       drm_gem_vm_close(vma);
> +       }
>
>         return ret;
>  }
> diff --git a/drivers/gpu/drm/tegra/gem.h b/drivers/gpu/drm/tegra/gem.h
> index 43a25c853357..c2e3f43e4b3f 100644
> --- a/drivers/gpu/drm/tegra/gem.h
> +++ b/drivers/gpu/drm/tegra/gem.h
> @@ -37,6 +37,10 @@ struct tegra_bo {
>         dma_addr_t paddr;
>         void *vaddr;
>
> +       struct drm_mm_node *mm;
> +       unsigned long num_pages;
> +       struct page **pages;
> +
>         struct tegra_bo_tiling tiling;
>  };
>
> --
> 2.0.0
>


* Re: [RFC 09/10] drm/tegra: Add IOMMU support
  2014-09-30 18:48         ` Sean Paul
  (?)
@ 2014-10-01 15:54             ` Sean Paul
  -1 siblings, 0 replies; 133+ messages in thread
From: Sean Paul @ 2014-10-01 15:54 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel,
	Cho KyongHo, Grant Grundler, Dave Martin, Marc Zyngier,
	Hiroshi Doyu, Olav Haugan, Paul Walmsley, Rhyland Klein,
	Allen Martin, devicetree-u79uwXL29TY76Z2rM5mHXA, Linux IOMMU,
	Linux ARM Kernel, linux-tegra-u79uwXL29TY76Z2rM5mHXA

On Tue, Sep 30, 2014 at 2:48 PM, Sean Paul <seanpaul-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
> On Thu, Jun 26, 2014 at 4:49 PM, Thierry Reding
> <thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> From: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>>
>> When an IOMMU device is available on the platform bus, allocate an IOMMU
>> domain and attach the display controllers to it. The display controllers
>> can then scan out non-contiguous buffers by mapping them through the
>> IOMMU.
>>
>
> Hi Thierry,
> A few comments from Stéphane and myself that came up while we were
> reviewing this for our tree.
>
>> Signed-off-by: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>> ---
>>  drivers/gpu/drm/tegra/dc.c  |  21 ++++
>>  drivers/gpu/drm/tegra/drm.c |  17 ++++
>>  drivers/gpu/drm/tegra/drm.h |   3 +
>>  drivers/gpu/drm/tegra/fb.c  |  16 ++-
>>  drivers/gpu/drm/tegra/gem.c | 236 +++++++++++++++++++++++++++++++++++++++-----
>>  drivers/gpu/drm/tegra/gem.h |   4 +
>>  6 files changed, 273 insertions(+), 24 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
>> index afcca04f5367..0f7452d04811 100644
>> --- a/drivers/gpu/drm/tegra/dc.c
>> +++ b/drivers/gpu/drm/tegra/dc.c
>> @@ -9,6 +9,7 @@
>>
>>  #include <linux/clk.h>
>>  #include <linux/debugfs.h>
>> +#include <linux/iommu.h>
>>  #include <linux/reset.h>
>>
>>  #include "dc.h"
>> @@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
>>  {
>>         struct drm_device *drm = dev_get_drvdata(client->parent);
>>         struct tegra_dc *dc = host1x_client_to_dc(client);
>> +       struct tegra_drm *tegra = drm->dev_private;
>>         int err;
>>
>> +       if (tegra->domain) {
>> +               err = iommu_attach_device(tegra->domain, dc->dev);
>> +               if (err < 0) {
>> +                       dev_err(dc->dev, "failed to attach to IOMMU: %d\n",
>> +                               err);
>> +                       return err;
>> +               }
>
> [from Stéphane]
>
> shouldn't we call detach in the error paths below?
>
>
>> +       }
>> +
>>         drm_crtc_init(drm, &dc->base, &tegra_crtc_funcs);
>>         drm_mode_crtc_set_gamma_size(&dc->base, 256);
>>         drm_crtc_helper_add(&dc->base, &tegra_crtc_helper_funcs);
>> @@ -1318,7 +1329,9 @@ static int tegra_dc_init(struct host1x_client *client)
>>
>>  static int tegra_dc_exit(struct host1x_client *client)
>>  {
>> +       struct drm_device *drm = dev_get_drvdata(client->parent);
>>         struct tegra_dc *dc = host1x_client_to_dc(client);
>> +       struct tegra_drm *tegra = drm->dev_private;
>>         int err;
>>
>>         devm_free_irq(dc->dev, dc->irq, dc);
>> @@ -1335,6 +1348,8 @@ static int tegra_dc_exit(struct host1x_client *client)
>>                 return err;
>>         }
>>
>> +       iommu_detach_device(tegra->domain, dc->dev);
>> +
>>         return 0;
>>  }
>>
>> @@ -1462,6 +1477,12 @@ static int tegra_dc_probe(struct platform_device *pdev)
>>                 return -ENXIO;
>>         }
>>
>> +       err = iommu_attach(&pdev->dev);
>> +       if (err < 0) {
>> +               dev_err(&pdev->dev, "failed to attach to IOMMU: %d\n", err);
>> +               return err;
>> +       }
>> +
>>         INIT_LIST_HEAD(&dc->client.list);
>>         dc->client.ops = &dc_client_ops;
>>         dc->client.dev = &pdev->dev;
>> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
>> index 59736bb810cd..1d2bbafad982 100644
>> --- a/drivers/gpu/drm/tegra/drm.c
>> +++ b/drivers/gpu/drm/tegra/drm.c
>> @@ -8,6 +8,7 @@
>>   */
>>
>>  #include <linux/host1x.h>
>> +#include <linux/iommu.h>
>>
>>  #include "drm.h"
>>  #include "gem.h"
>> @@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>>         if (!tegra)
>>                 return -ENOMEM;
>>
>> +       if (iommu_present(&platform_bus_type)) {
>> +               tegra->domain = iommu_domain_alloc(&platform_bus_type);
>> +               if (IS_ERR(tegra->domain)) {
>> +                       kfree(tegra);
>> +                       return PTR_ERR(tegra->domain);
>> +               }
>> +
>> +               drm_mm_init(&tegra->mm, 0, SZ_2G);
>
>
> [from Stéphane]:
>
> none of these are freed in the error path below (iommu_domain_free and
> drm_mm_takedown)
>
> also |tegra| isn't freed either?
>
>
>
>> +       }
>> +
>>         mutex_init(&tegra->clients_lock);
>>         INIT_LIST_HEAD(&tegra->clients);
>>         drm->dev_private = tegra;
>> @@ -71,6 +82,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>>  static int tegra_drm_unload(struct drm_device *drm)
>>  {
>>         struct host1x_device *device = to_host1x_device(drm->dev);
>> +       struct tegra_drm *tegra = drm->dev_private;
>>         int err;
>>
>>         drm_kms_helper_poll_fini(drm);
>> @@ -82,6 +94,11 @@ static int tegra_drm_unload(struct drm_device *drm)
>>         if (err < 0)
>>                 return err;
>>
>> +       if (tegra->domain) {
>> +               iommu_domain_free(tegra->domain);
>> +               drm_mm_takedown(&tegra->mm);
>> +       }
>> +
>>         return 0;
>>  }
>>
>> diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
>> index 96d754e7b3eb..a07c796b7edc 100644
>> --- a/drivers/gpu/drm/tegra/drm.h
>> +++ b/drivers/gpu/drm/tegra/drm.h
>> @@ -39,6 +39,9 @@ struct tegra_fbdev {
>>  struct tegra_drm {
>>         struct drm_device *drm;
>>
>> +       struct iommu_domain *domain;
>> +       struct drm_mm mm;
>> +
>>         struct mutex clients_lock;
>>         struct list_head clients;
>>
>> diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c
>> index 7790d43ad082..21c65dd817c3 100644
>> --- a/drivers/gpu/drm/tegra/fb.c
>> +++ b/drivers/gpu/drm/tegra/fb.c
>> @@ -65,8 +65,12 @@ static void tegra_fb_destroy(struct drm_framebuffer *framebuffer)
>>         for (i = 0; i < fb->num_planes; i++) {
>>                 struct tegra_bo *bo = fb->planes[i];
>>
>> -               if (bo)
>> +               if (bo) {
>> +                       if (bo->pages && bo->virt)
>> +                               vunmap(bo->virt);
>> +
>>                         drm_gem_object_unreference_unlocked(&bo->gem);
>> +               }
>>         }
>>
>>         drm_framebuffer_cleanup(framebuffer);
>> @@ -252,6 +256,16 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
>>         offset = info->var.xoffset * bytes_per_pixel +
>>                  info->var.yoffset * fb->pitches[0];
>>
>> +       if (bo->pages) {
>> +               bo->vaddr = vmap(bo->pages, bo->num_pages, VM_MAP,
>> +                                pgprot_writecombine(PAGE_KERNEL));
>> +               if (!bo->vaddr) {
>> +                       dev_err(drm->dev, "failed to vmap() framebuffer\n");
>> +                       err = -ENOMEM;
>> +                       goto destroy;
>> +               }
>> +       }
>> +
>>         drm->mode_config.fb_base = (resource_size_t)bo->paddr;
>>         info->screen_base = (void __iomem *)bo->vaddr + offset;
>>         info->screen_size = size;
>> diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
>> index c1e4e8b6e5ca..2912e61a2599 100644
>> --- a/drivers/gpu/drm/tegra/gem.c
>> +++ b/drivers/gpu/drm/tegra/gem.c
>> @@ -14,8 +14,10 @@
>>   */
>>
>>  #include <linux/dma-buf.h>
>> +#include <linux/iommu.h>
>>  #include <drm/tegra_drm.h>
>>
>> +#include "drm.h"
>>  #include "gem.h"
>>
>>  static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo)
>> @@ -90,14 +92,144 @@ static const struct host1x_bo_ops tegra_bo_ops = {
>>         .kunmap = tegra_bo_kunmap,
>>  };
>>
>> +static int iommu_map_sg(struct iommu_domain *domain, struct sg_table *sgt,
>> +                       dma_addr_t iova, int prot)
>> +{
>> +       unsigned long offset = 0;
>> +       struct scatterlist *sg;
>> +       unsigned int i, j;
>> +       int err;
>> +
>> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
>> +               dma_addr_t phys = sg_phys(sg);
>> +               size_t length = sg->offset;
>> +
>> +               phys = sg_phys(sg) - sg->offset;
>> +               length = sg->length + sg->offset;
>> +
>> +               err = iommu_map(domain, iova + offset, phys, length, prot);
>> +               if (err < 0)
>> +                       goto unmap;
>> +
>> +               offset += length;
>> +       }
>> +
>> +       return 0;
>> +
>> +unmap:
>> +       offset = 0;
>> +
>> +       for_each_sg(sgt->sgl, sg, i, j) {
>> +               size_t length = sg->length + sg->offset;
>> +               iommu_unmap(domain, iova + offset, length);
>> +               offset += length;
>> +       }
>> +
>> +       return err;
>> +}
>> +
>> +static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
>> +                         dma_addr_t iova)
>> +{
>> +       unsigned long offset = 0;
>> +       struct scatterlist *sg;
>> +       unsigned int i;
>> +
>> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
>> +               dma_addr_t phys = sg_phys(sg);
>> +               size_t length = sg->offset;
>> +
>> +               phys = sg_phys(sg) - sg->offset;
>> +               length = sg->length + sg->offset;
>> +
>> +               iommu_unmap(domain, iova + offset, length);
>> +               offset += length;
>> +       }
>> +
>> +       return 0;
>> +}
>> +
>> +static int tegra_bo_iommu_map(struct tegra_drm *tegra, struct tegra_bo *bo)
>> +{
>> +       int prot = IOMMU_READ | IOMMU_WRITE;
>> +       int err;
>> +
>> +       if (bo->mm)
>> +               return -EBUSY;
>> +
>> +       bo->mm = kzalloc(sizeof(*bo->mm), GFP_KERNEL);
>> +       if (!bo->mm)
>> +               return -ENOMEM;
>> +
>> +       err = drm_mm_insert_node_generic(&tegra->mm, bo->mm, bo->gem.size,
>> +                                        PAGE_SIZE, 0, 0, 0);
>> +       if (err < 0) {
>> +               dev_err(tegra->drm->dev, "out of virtual memory: %d\n", err);
>> +               return err;
>> +       }
>> +
>> +       bo->paddr = bo->mm->start;
>> +
>> +       err = iommu_map_sg(tegra->domain, bo->sgt, bo->paddr, prot);
>> +       if (err < 0) {
>> +               dev_err(tegra->drm->dev, "failed to map buffer: %d\n", err);
>> +               return err;
>> +       }
>> +
>> +       return 0;
>> +}
>> +
>> +static int tegra_bo_iommu_unmap(struct tegra_drm *tegra, struct tegra_bo *bo)
>> +{
>> +       if (!bo->mm)
>> +               return 0;
>> +
>> +       iommu_unmap_sg(tegra->domain, bo->sgt, bo->paddr);
>> +       drm_mm_remove_node(bo->mm);
>> +
>> +       kfree(bo->mm);
>> +       return 0;
>> +}
>> +
>>  static void tegra_bo_destroy(struct drm_device *drm, struct tegra_bo *bo)
>>  {
>> -       dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr, bo->paddr);
>> +       if (!bo->pages)
>> +               dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr,
>> +                                     bo->paddr);

One more thing. If tegra_bo_alloc fails, we'll have bo->vaddr == NULL
and bo->paddr == ~0 here, which causes a crash.

I posted https://lkml.org/lkml/2014/9/30/659 to check for the error
condition in the mm code, but it seems like reviewer consensus is to
check for this before calling free.

As such, we'll need to make sure bo->vaddr != NULL before calling
dma_free_writecombine to avoid this situation.

Would you prefer I send a patch up to fix this separately, or would
you like to roll this into your next version?

Sean




>> +       else
>> +               drm_gem_put_pages(&bo->gem, bo->pages, true, true);
>> +}
>> +
>> +static int tegra_bo_get_pages(struct drm_device *drm, struct tegra_bo *bo,
>> +                             size_t size)
>> +{
>> +       bo->pages = drm_gem_get_pages(&bo->gem, GFP_KERNEL);
>> +       if (!bo->pages)
>> +               return -ENOMEM;
>> +
>> +       bo->num_pages = size >> PAGE_SHIFT;
>> +
>> +       return 0;
>> +}
>> +
>> +static int tegra_bo_alloc(struct drm_device *drm, struct tegra_bo *bo,
>> +                         size_t size)
>> +{
>> +       bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
>> +                                          GFP_KERNEL | __GFP_NOWARN);
>> +       if (!bo->vaddr) {
>> +               dev_err(drm->dev, "failed to allocate buffer of size %zu\n",
>> +                       size);
>> +               return -ENOMEM;
>> +       }
>> +
>> +       return 0;
>>  }
>>
>>  struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>>                                  unsigned long flags)
>>  {
>> +       struct tegra_drm *tegra = drm->dev_private;
>>         struct tegra_bo *bo;
>>         int err;
>>
>> @@ -108,22 +240,33 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>>         host1x_bo_init(&bo->base, &tegra_bo_ops);
>>         size = round_up(size, PAGE_SIZE);
>>
>> -       bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
>> -                                          GFP_KERNEL | __GFP_NOWARN);
>> -       if (!bo->vaddr) {
>> -               dev_err(drm->dev, "failed to allocate buffer with size %u\n",
>> -                       size);
>> -               err = -ENOMEM;
>> -               goto err_dma;
>> -       }
>> -
>>         err = drm_gem_object_init(drm, &bo->gem, size);
>>         if (err)
>> -               goto err_init;
>> +               goto free;
>>
>>         err = drm_gem_create_mmap_offset(&bo->gem);
>
> We need to call drm_gem_free_mmap_offset if one of the calls below
> fails, otherwise we'll try to free the mmap_offset on an already
> destroyed bo.
>
>
> Sean
>
>
>
>>         if (err)
>> -               goto err_mmap;
>> +               goto release;
>> +
>> +       if (tegra->domain) {
>> +               err = tegra_bo_get_pages(drm, bo, size);
>> +               if (err < 0)
>> +                       goto release;
>> +
>> +               bo->sgt = drm_prime_pages_to_sg(bo->pages, bo->num_pages);
>> +               if (IS_ERR(bo->sgt)) {
>> +                       err = PTR_ERR(bo->sgt);
>> +                       goto release;
>> +               }
>> +
>> +               err = tegra_bo_iommu_map(tegra, bo);
>> +               if (err < 0)
>> +                       goto release;
>> +       } else {
>> +               err = tegra_bo_alloc(drm, bo, size);
>> +               if (err < 0)
>> +                       goto release;
>> +       }
>>
>>         if (flags & DRM_TEGRA_GEM_CREATE_TILED)
>>                 bo->tiling.mode = TEGRA_BO_TILING_MODE_TILED;
>> @@ -133,11 +276,10 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>>
>>         return bo;
>>
>> -err_mmap:
>> +release:
>>         drm_gem_object_release(&bo->gem);
>> -err_init:
>>         tegra_bo_destroy(drm, bo);
>> -err_dma:
>> +free:
>>         kfree(bo);
>>
>>         return ERR_PTR(err);
>> @@ -172,6 +314,7 @@ err:
>>  static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
>>                                         struct dma_buf *buf)
>>  {
>> +       struct tegra_drm *tegra = drm->dev_private;
>>         struct dma_buf_attachment *attach;
>>         struct tegra_bo *bo;
>>         ssize_t size;
>> @@ -211,12 +354,19 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
>>                 goto detach;
>>         }
>>
>> -       if (bo->sgt->nents > 1) {
>> -               err = -EINVAL;
>> -               goto detach;
>> +       if (tegra->domain) {
>> +               err = tegra_bo_iommu_map(tegra, bo);
>> +               if (err < 0)
>> +                       goto detach;
>> +       } else {
>> +               if (bo->sgt->nents > 1) {
>> +                       err = -EINVAL;
>> +                       goto detach;
>> +               }
>> +
>> +               bo->paddr = sg_dma_address(bo->sgt->sgl);
>>         }
>>
>> -       bo->paddr = sg_dma_address(bo->sgt->sgl);
>>         bo->gem.import_attach = attach;
>>
>>         return bo;
>> @@ -239,8 +389,12 @@ free:
>>
>>  void tegra_bo_free_object(struct drm_gem_object *gem)
>>  {
>> +       struct tegra_drm *tegra = gem->dev->dev_private;
>>         struct tegra_bo *bo = to_tegra_bo(gem);
>>
>> +       if (tegra->domain)
>> +               tegra_bo_iommu_unmap(tegra, bo);
>> +
>>         if (gem->import_attach) {
>>                 dma_buf_unmap_attachment(gem->import_attach, bo->sgt,
>>                                          DMA_TO_DEVICE);
>> @@ -301,7 +455,38 @@ int tegra_bo_dumb_map_offset(struct drm_file *file, struct drm_device *drm,
>>         return 0;
>>  }
>>
>> +static int tegra_bo_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>> +{
>> +       struct drm_gem_object *gem = vma->vm_private_data;
>> +       struct tegra_bo *bo = to_tegra_bo(gem);
>> +       struct page *page;
>> +       pgoff_t offset;
>> +       int err;
>> +
>> +       if (!bo->pages)
>> +               return VM_FAULT_SIGBUS;
>> +
>> +       offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >> PAGE_SHIFT;
>> +       page = bo->pages[offset];
>> +
>> +       err = vm_insert_page(vma, (unsigned long)vmf->virtual_address, page);
>> +       switch (err) {
>> +       case -EAGAIN:
>> +       case 0:
>> +       case -ERESTARTSYS:
>> +       case -EINTR:
>> +       case -EBUSY:
>> +               return VM_FAULT_NOPAGE;
>> +
>> +       case -ENOMEM:
>> +               return VM_FAULT_OOM;
>> +       }
>> +
>> +       return VM_FAULT_SIGBUS;
>> +}
>> +
>>  const struct vm_operations_struct tegra_bo_vm_ops = {
>> +       .fault = tegra_bo_fault,
>>         .open = drm_gem_vm_open,
>>         .close = drm_gem_vm_close,
>>  };
>> @@ -316,13 +501,18 @@ int tegra_drm_mmap(struct file *file, struct vm_area_struct *vma)
>>         if (ret)
>>                 return ret;
>>
>> +       vma->vm_flags |= VM_MIXEDMAP;
>> +       vma->vm_flags &= ~VM_PFNMAP;
>> +
>>         gem = vma->vm_private_data;
>>         bo = to_tegra_bo(gem);
>>
>> -       ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
>> -                             vma->vm_end - vma->vm_start, vma->vm_page_prot);
>> -       if (ret)
>> -               drm_gem_vm_close(vma);
>> +       if (!bo->pages) {
>> +               ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
>> +                                     vma->vm_end - vma->vm_start, vma->vm_page_prot);
>> +               if (ret)
>> +                       drm_gem_vm_close(vma);
>> +       }
>>
>>         return ret;
>>  }
>> diff --git a/drivers/gpu/drm/tegra/gem.h b/drivers/gpu/drm/tegra/gem.h
>> index 43a25c853357..c2e3f43e4b3f 100644
>> --- a/drivers/gpu/drm/tegra/gem.h
>> +++ b/drivers/gpu/drm/tegra/gem.h
>> @@ -37,6 +37,10 @@ struct tegra_bo {
>>         dma_addr_t paddr;
>>         void *vaddr;
>>
>> +       struct drm_mm_node *mm;
>> +       unsigned long num_pages;
>> +       struct page **pages;
>> +
>>         struct tegra_bo_tiling tiling;
>>  };
>>
>> --
>> 2.0.0
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/


* Re: [RFC 09/10] drm/tegra: Add IOMMU support
@ 2014-10-01 15:54             ` Sean Paul
  0 siblings, 0 replies; 133+ messages in thread
From: Sean Paul @ 2014-10-01 15:54 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel,
	Cho KyongHo, Grant Grundler, Dave Martin, Marc Zyngier,
	Hiroshi Doyu, Olav Haugan, Paul Walmsley, Rhyland Klein,
	Allen Martin, devicetree, Linux IOMMU, Linux ARM Kernel,
	linux-tegra, Linux Kernel Mailing List, Stéphane Marchesin

On Tue, Sep 30, 2014 at 2:48 PM, Sean Paul <seanpaul@google.com> wrote:
> On Thu, Jun 26, 2014 at 4:49 PM, Thierry Reding
> <thierry.reding@gmail.com> wrote:
>> From: Thierry Reding <treding@nvidia.com>
>>
>> When an IOMMU device is available on the platform bus, allocate an IOMMU
>> domain and attach the display controllers to it. The display controllers
>> can then scan out non-contiguous buffers by mapping them through the
>> IOMMU.
>>
>
> Hi Thierry,
> A few comments from Stéphane and myself that came up while we were
> reviewing this for our tree.
>
>> Signed-off-by: Thierry Reding <treding@nvidia.com>
>> ---
>>  drivers/gpu/drm/tegra/dc.c  |  21 ++++
>>  drivers/gpu/drm/tegra/drm.c |  17 ++++
>>  drivers/gpu/drm/tegra/drm.h |   3 +
>>  drivers/gpu/drm/tegra/fb.c  |  16 ++-
>>  drivers/gpu/drm/tegra/gem.c | 236 +++++++++++++++++++++++++++++++++++++++-----
>>  drivers/gpu/drm/tegra/gem.h |   4 +
>>  6 files changed, 273 insertions(+), 24 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
>> index afcca04f5367..0f7452d04811 100644
>> --- a/drivers/gpu/drm/tegra/dc.c
>> +++ b/drivers/gpu/drm/tegra/dc.c
>> @@ -9,6 +9,7 @@
>>
>>  #include <linux/clk.h>
>>  #include <linux/debugfs.h>
>> +#include <linux/iommu.h>
>>  #include <linux/reset.h>
>>
>>  #include "dc.h"
>> @@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
>>  {
>>         struct drm_device *drm = dev_get_drvdata(client->parent);
>>         struct tegra_dc *dc = host1x_client_to_dc(client);
>> +       struct tegra_drm *tegra = drm->dev_private;
>>         int err;
>>
>> +       if (tegra->domain) {
>> +               err = iommu_attach_device(tegra->domain, dc->dev);
>> +               if (err < 0) {
>> +                       dev_err(dc->dev, "failed to attach to IOMMU: %d\n",
>> +                               err);
>> +                       return err;
>> +               }
>
> [from Stéphane]
>
> shouldn't we call detach in the error paths below?
>
>
>> +       }
>> +
>>         drm_crtc_init(drm, &dc->base, &tegra_crtc_funcs);
>>         drm_mode_crtc_set_gamma_size(&dc->base, 256);
>>         drm_crtc_helper_add(&dc->base, &tegra_crtc_helper_funcs);
>> @@ -1318,7 +1329,9 @@ static int tegra_dc_init(struct host1x_client *client)
>>
>>  static int tegra_dc_exit(struct host1x_client *client)
>>  {
>> +       struct drm_device *drm = dev_get_drvdata(client->parent);
>>         struct tegra_dc *dc = host1x_client_to_dc(client);
>> +       struct tegra_drm *tegra = drm->dev_private;
>>         int err;
>>
>>         devm_free_irq(dc->dev, dc->irq, dc);
>> @@ -1335,6 +1348,8 @@ static int tegra_dc_exit(struct host1x_client *client)
>>                 return err;
>>         }
>>
>> +       iommu_detach_device(tegra->domain, dc->dev);
>> +
>>         return 0;
>>  }
>>
>> @@ -1462,6 +1477,12 @@ static int tegra_dc_probe(struct platform_device *pdev)
>>                 return -ENXIO;
>>         }
>>
>> +       err = iommu_attach(&pdev->dev);
>> +       if (err < 0) {
>> +               dev_err(&pdev->dev, "failed to attach to IOMMU: %d\n", err);
>> +               return err;
>> +       }
>> +
>>         INIT_LIST_HEAD(&dc->client.list);
>>         dc->client.ops = &dc_client_ops;
>>         dc->client.dev = &pdev->dev;
>> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
>> index 59736bb810cd..1d2bbafad982 100644
>> --- a/drivers/gpu/drm/tegra/drm.c
>> +++ b/drivers/gpu/drm/tegra/drm.c
>> @@ -8,6 +8,7 @@
>>   */
>>
>>  #include <linux/host1x.h>
>> +#include <linux/iommu.h>
>>
>>  #include "drm.h"
>>  #include "gem.h"
>> @@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>>         if (!tegra)
>>                 return -ENOMEM;
>>
>> +       if (iommu_present(&platform_bus_type)) {
>> +               tegra->domain = iommu_domain_alloc(&platform_bus_type);
>> +               if (IS_ERR(tegra->domain)) {
>> +                       kfree(tegra);
>> +                       return PTR_ERR(tegra->domain);
>> +               }
>> +
>> +               drm_mm_init(&tegra->mm, 0, SZ_2G);
>
>
> [from Stéphane]:
>
> none of these are freed in the error path below (iommu_domain_free and
> drm_mm_takedown)
>
> also |tegra| isn't freed either?
>
>
>
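For illustration only, a minimal userspace sketch of the unwinding pattern
the comment above asks for: each setup step gets a matching label, and a
later failure jumps to the label that releases everything acquired so far.
All names here (demo_drm, demo_domain, demo_alloc, demo_free, demo_load,
fail_late) are hypothetical stand-ins rather than kernel APIs, and the
mm_ready flag only models the drm_mm_init()/drm_mm_takedown() pairing.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are kernel types or APIs. */
struct demo_domain { int unused; };

struct demo_drm {
	struct demo_domain *domain;
	int mm_ready;	/* 1 while the address-space allocator is live */
};

static int demo_live_objects;	/* allocations not yet released */

static void *demo_alloc(size_t size)
{
	void *p = calloc(1, size);

	if (p)
		demo_live_objects++;
	return p;
}

static void demo_free(void *p)
{
	if (!p)
		return;
	assert(demo_live_objects > 0);
	demo_live_objects--;
	free(p);
}

/* fail_late simulates a failure in a step after the domain is set up. */
static int demo_load(struct demo_drm **out, int fail_late)
{
	struct demo_drm *drm;
	int err;

	drm = demo_alloc(sizeof(*drm));
	if (!drm)
		return -ENOMEM;

	drm->domain = demo_alloc(sizeof(*drm->domain));
	if (!drm->domain) {
		err = -ENOMEM;	/* read the error before freeing drm */
		goto free_drm;
	}

	drm->mm_ready = 1;	/* models drm_mm_init() */

	if (fail_late) {
		err = -EINVAL;
		goto takedown;
	}

	*out = drm;
	return 0;

takedown:
	drm->mm_ready = 0;	/* models drm_mm_takedown() */
	demo_free(drm->domain);	/* models iommu_domain_free() */
free_drm:
	demo_free(drm);
	return err;
}
```

Calling demo_load(&drm, 1) returns -EINVAL and leaves demo_live_objects at
zero, which is the property the error path in the real function would need.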
>> +       }
>> +
>>         mutex_init(&tegra->clients_lock);
>>         INIT_LIST_HEAD(&tegra->clients);
>>         drm->dev_private = tegra;
>> @@ -71,6 +82,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>>  static int tegra_drm_unload(struct drm_device *drm)
>>  {
>>         struct host1x_device *device = to_host1x_device(drm->dev);
>> +       struct tegra_drm *tegra = drm->dev_private;
>>         int err;
>>
>>         drm_kms_helper_poll_fini(drm);
>> @@ -82,6 +94,11 @@ static int tegra_drm_unload(struct drm_device *drm)
>>         if (err < 0)
>>                 return err;
>>
>> +       if (tegra->domain) {
>> +               iommu_domain_free(tegra->domain);
>> +               drm_mm_takedown(&tegra->mm);
>> +       }
>> +
>>         return 0;
>>  }
>>
>> diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
>> index 96d754e7b3eb..a07c796b7edc 100644
>> --- a/drivers/gpu/drm/tegra/drm.h
>> +++ b/drivers/gpu/drm/tegra/drm.h
>> @@ -39,6 +39,9 @@ struct tegra_fbdev {
>>  struct tegra_drm {
>>         struct drm_device *drm;
>>
>> +       struct iommu_domain *domain;
>> +       struct drm_mm mm;
>> +
>>         struct mutex clients_lock;
>>         struct list_head clients;
>>
>> diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c
>> index 7790d43ad082..21c65dd817c3 100644
>> --- a/drivers/gpu/drm/tegra/fb.c
>> +++ b/drivers/gpu/drm/tegra/fb.c
>> @@ -65,8 +65,12 @@ static void tegra_fb_destroy(struct drm_framebuffer *framebuffer)
>>         for (i = 0; i < fb->num_planes; i++) {
>>                 struct tegra_bo *bo = fb->planes[i];
>>
>> -               if (bo)
>> +               if (bo) {
>> +                       if (bo->pages && bo->virt)
>> +                               vunmap(bo->virt);
>> +
>>                         drm_gem_object_unreference_unlocked(&bo->gem);
>> +               }
>>         }
>>
>>         drm_framebuffer_cleanup(framebuffer);
>> @@ -252,6 +256,16 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
>>         offset = info->var.xoffset * bytes_per_pixel +
>>                  info->var.yoffset * fb->pitches[0];
>>
>> +       if (bo->pages) {
>> +               bo->vaddr = vmap(bo->pages, bo->num_pages, VM_MAP,
>> +                                pgprot_writecombine(PAGE_KERNEL));
>> +               if (!bo->vaddr) {
>> +                       dev_err(drm->dev, "failed to vmap() framebuffer\n");
>> +                       err = -ENOMEM;
>> +                       goto destroy;
>> +               }
>> +       }
>> +
>>         drm->mode_config.fb_base = (resource_size_t)bo->paddr;
>>         info->screen_base = (void __iomem *)bo->vaddr + offset;
>>         info->screen_size = size;
>> diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
>> index c1e4e8b6e5ca..2912e61a2599 100644
>> --- a/drivers/gpu/drm/tegra/gem.c
>> +++ b/drivers/gpu/drm/tegra/gem.c
>> @@ -14,8 +14,10 @@
>>   */
>>
>>  #include <linux/dma-buf.h>
>> +#include <linux/iommu.h>
>>  #include <drm/tegra_drm.h>
>>
>> +#include "drm.h"
>>  #include "gem.h"
>>
>>  static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo)
>> @@ -90,14 +92,144 @@ static const struct host1x_bo_ops tegra_bo_ops = {
>>         .kunmap = tegra_bo_kunmap,
>>  };
>>
>> +static int iommu_map_sg(struct iommu_domain *domain, struct sg_table *sgt,
>> +                       dma_addr_t iova, int prot)
>> +{
>> +       unsigned long offset = 0;
>> +       struct scatterlist *sg;
>> +       unsigned int i, j;
>> +       int err;
>> +
>> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
>> +               dma_addr_t phys = sg_phys(sg);
>> +               size_t length = sg->offset;
>> +
>> +               phys = sg_phys(sg) - sg->offset;
>> +               length = sg->length + sg->offset;
>> +
>> +               err = iommu_map(domain, iova + offset, phys, length, prot);
>> +               if (err < 0)
>> +                       goto unmap;
>> +
>> +               offset += length;
>> +       }
>> +
>> +       return 0;
>> +
>> +unmap:
>> +       offset = 0;
>> +
>> +       for_each_sg(sgt->sgl, sg, i, j) {
>> +               size_t length = sg->length + sg->offset;
>> +               iommu_unmap(domain, iova + offset, length);
>> +               offset += length;
>> +       }
>> +
>> +       return err;
>> +}
>> +
>> +static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
>> +                         dma_addr_t iova)
>> +{
>> +       unsigned long offset = 0;
>> +       struct scatterlist *sg;
>> +       unsigned int i;
>> +
>> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
>> +               dma_addr_t phys = sg_phys(sg);
>> +               size_t length = sg->offset;
>> +
>> +               phys = sg_phys(sg) - sg->offset;
>> +               length = sg->length + sg->offset;
>> +
>> +               iommu_unmap(domain, iova + offset, length);
>> +               offset += length;
>> +       }
>> +
>> +       return 0;
>> +}
>> +
>> +static int tegra_bo_iommu_map(struct tegra_drm *tegra, struct tegra_bo *bo)
>> +{
>> +       int prot = IOMMU_READ | IOMMU_WRITE;
>> +       int err;
>> +
>> +       if (bo->mm)
>> +               return -EBUSY;
>> +
>> +       bo->mm = kzalloc(sizeof(*bo->mm), GFP_KERNEL);
>> +       if (!bo->mm)
>> +               return -ENOMEM;
>> +
>> +       err = drm_mm_insert_node_generic(&tegra->mm, bo->mm, bo->gem.size,
>> +                                        PAGE_SIZE, 0, 0, 0);
>> +       if (err < 0) {
>> +               dev_err(tegra->drm->dev, "out of virtual memory: %d\n", err);
>> +               return err;
>> +       }
>> +
>> +       bo->paddr = bo->mm->start;
>> +
>> +       err = iommu_map_sg(tegra->domain, bo->sgt, bo->paddr, prot);
>> +       if (err < 0) {
>> +               dev_err(tegra->drm->dev, "failed to map buffer: %d\n", err);
>> +               return err;
>> +       }
>> +
>> +       return 0;
>> +}
>> +
>> +static int tegra_bo_iommu_unmap(struct tegra_drm *tegra, struct tegra_bo *bo)
>> +{
>> +       if (!bo->mm)
>> +               return 0;
>> +
>> +       iommu_unmap_sg(tegra->domain, bo->sgt, bo->paddr);
>> +       drm_mm_remove_node(bo->mm);
>> +
>> +       kfree(bo->mm);
>> +       return 0;
>> +}
>> +
>>  static void tegra_bo_destroy(struct drm_device *drm, struct tegra_bo *bo)
>>  {
>> -       dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr, bo->paddr);
>> +       if (!bo->pages)
>> +               dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr,
>> +                                     bo->paddr);

One more thing. If tegra_bo_alloc fails, we'll have bo->vaddr == NULL
and bo->paddr == ~0 here, which causes a crash.

I posted https://lkml.org/lkml/2014/9/30/659 to check for the error
condition in the mm code, but it seems like reviewer consensus is to
check for this before calling free.

As such, we'll need to make sure bo->vaddr != NULL before calling
dma_free_writecombine to avoid this situation.

Would you prefer I send a patch up to fix this separately, or would
you like to roll this into your next version?

Sean
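
For illustration only, a minimal userspace sketch of the guard described
above. The names demo_bo, demo_dma_free, demo_put_pages and demo_bo_destroy
are hypothetical stand-ins, not the driver's code; demo_dma_free() merely
models a free routine that must not be handed a NULL vaddr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct tegra_bo; not the kernel type. */
struct demo_bo {
	void *vaddr;	/* NULL when the contiguous allocation failed */
	void **pages;	/* non-NULL only on the IOMMU (page array) path */
	int freed;	/* counts how often backing storage was released */
};

/* Models dma_free_writecombine(): calling it with a NULL vaddr is a bug. */
static void demo_dma_free(struct demo_bo *bo)
{
	assert(bo->vaddr != NULL);
	free(bo->vaddr);
	bo->vaddr = NULL;
	bo->freed++;
}

/* Models drm_gem_put_pages() for the page-array path. */
static void demo_put_pages(struct demo_bo *bo)
{
	free(bo->pages);
	bo->pages = NULL;
	bo->freed++;
}

/* Only release what was actually allocated. */
static void demo_bo_destroy(struct demo_bo *bo)
{
	if (bo->pages)
		demo_put_pages(bo);
	else if (bo->vaddr)
		demo_dma_free(bo);
}
```

With a zero-initialized demo_bo (the state after a failed allocation),
demo_bo_destroy() returns without calling either free routine.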




>> +       else
>> +               drm_gem_put_pages(&bo->gem, bo->pages, true, true);
>> +}
>> +
>> +static int tegra_bo_get_pages(struct drm_device *drm, struct tegra_bo *bo,
>> +                             size_t size)
>> +{
>> +       bo->pages = drm_gem_get_pages(&bo->gem, GFP_KERNEL);
>> +       if (!bo->pages)
>> +               return -ENOMEM;
>> +
>> +       bo->num_pages = size >> PAGE_SHIFT;
>> +
>> +       return 0;
>> +}
>> +
>> +static int tegra_bo_alloc(struct drm_device *drm, struct tegra_bo *bo,
>> +                         size_t size)
>> +{
>> +       bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
>> +                                          GFP_KERNEL | __GFP_NOWARN);
>> +       if (!bo->vaddr) {
>> +               dev_err(drm->dev, "failed to allocate buffer of size %zu\n",
>> +                       size);
>> +               return -ENOMEM;
>> +       }
>> +
>> +       return 0;
>>  }
>>
>>  struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>>                                  unsigned long flags)
>>  {
>> +       struct tegra_drm *tegra = drm->dev_private;
>>         struct tegra_bo *bo;
>>         int err;
>>
>> @@ -108,22 +240,33 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>>         host1x_bo_init(&bo->base, &tegra_bo_ops);
>>         size = round_up(size, PAGE_SIZE);
>>
>> -       bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
>> -                                          GFP_KERNEL | __GFP_NOWARN);
>> -       if (!bo->vaddr) {
>> -               dev_err(drm->dev, "failed to allocate buffer with size %u\n",
>> -                       size);
>> -               err = -ENOMEM;
>> -               goto err_dma;
>> -       }
>> -
>>         err = drm_gem_object_init(drm, &bo->gem, size);
>>         if (err)
>> -               goto err_init;
>> +               goto free;
>>
>>         err = drm_gem_create_mmap_offset(&bo->gem);
>
> We need to call drm_gem_free_mmap_offset if one of the calls below
> fails, otherwise we'll try to free the mmap_offset on an already
> destroyed bo.
>
>
> Sean
>
>
>
>>         if (err)
>> -               goto err_mmap;
>> +               goto release;
>> +
>> +       if (tegra->domain) {
>> +               err = tegra_bo_get_pages(drm, bo, size);
>> +               if (err < 0)
>> +                       goto release;
>> +
>> +               bo->sgt = drm_prime_pages_to_sg(bo->pages, bo->num_pages);
>> +               if (IS_ERR(bo->sgt)) {
>> +                       err = PTR_ERR(bo->sgt);
>> +                       goto release;
>> +               }
>> +
>> +               err = tegra_bo_iommu_map(tegra, bo);
>> +               if (err < 0)
>> +                       goto release;
>> +       } else {
>> +               err = tegra_bo_alloc(drm, bo, size);
>> +               if (err < 0)
>> +                       goto release;
>> +       }
>>
>>         if (flags & DRM_TEGRA_GEM_CREATE_TILED)
>>                 bo->tiling.mode = TEGRA_BO_TILING_MODE_TILED;
>> @@ -133,11 +276,10 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>>
>>         return bo;
>>
>> -err_mmap:
>> +release:
>>         drm_gem_object_release(&bo->gem);
>> -err_init:
>>         tegra_bo_destroy(drm, bo);
>> -err_dma:
>> +free:
>>         kfree(bo);
>>
>>         return ERR_PTR(err);
>> @@ -172,6 +314,7 @@ err:
>>  static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
>>                                         struct dma_buf *buf)
>>  {
>> +       struct tegra_drm *tegra = drm->dev_private;
>>         struct dma_buf_attachment *attach;
>>         struct tegra_bo *bo;
>>         ssize_t size;
>> @@ -211,12 +354,19 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
>>                 goto detach;
>>         }
>>
>> -       if (bo->sgt->nents > 1) {
>> -               err = -EINVAL;
>> -               goto detach;
>> +       if (tegra->domain) {
>> +               err = tegra_bo_iommu_map(tegra, bo);
>> +               if (err < 0)
>> +                       goto detach;
>> +       } else {
>> +               if (bo->sgt->nents > 1) {
>> +                       err = -EINVAL;
>> +                       goto detach;
>> +               }
>> +
>> +               bo->paddr = sg_dma_address(bo->sgt->sgl);
>>         }
>>
>> -       bo->paddr = sg_dma_address(bo->sgt->sgl);
>>         bo->gem.import_attach = attach;
>>
>>         return bo;
>> @@ -239,8 +389,12 @@ free:
>>
>>  void tegra_bo_free_object(struct drm_gem_object *gem)
>>  {
>> +       struct tegra_drm *tegra = gem->dev->dev_private;
>>         struct tegra_bo *bo = to_tegra_bo(gem);
>>
>> +       if (tegra->domain)
>> +               tegra_bo_iommu_unmap(tegra, bo);
>> +
>>         if (gem->import_attach) {
>>                 dma_buf_unmap_attachment(gem->import_attach, bo->sgt,
>>                                          DMA_TO_DEVICE);
>> @@ -301,7 +455,38 @@ int tegra_bo_dumb_map_offset(struct drm_file *file, struct drm_device *drm,
>>         return 0;
>>  }
>>
>> +static int tegra_bo_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>> +{
>> +       struct drm_gem_object *gem = vma->vm_private_data;
>> +       struct tegra_bo *bo = to_tegra_bo(gem);
>> +       struct page *page;
>> +       pgoff_t offset;
>> +       int err;
>> +
>> +       if (!bo->pages)
>> +               return VM_FAULT_SIGBUS;
>> +
>> +       offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >> PAGE_SHIFT;
>> +       page = bo->pages[offset];
>> +
>> +       err = vm_insert_page(vma, (unsigned long)vmf->virtual_address, page);
>> +       switch (err) {
>> +       case -EAGAIN:
>> +       case 0:
>> +       case -ERESTARTSYS:
>> +       case -EINTR:
>> +       case -EBUSY:
>> +               return VM_FAULT_NOPAGE;
>> +
>> +       case -ENOMEM:
>> +               return VM_FAULT_OOM;
>> +       }
>> +
>> +       return VM_FAULT_SIGBUS;
>> +}
>> +
>>  const struct vm_operations_struct tegra_bo_vm_ops = {
>> +       .fault = tegra_bo_fault,
>>         .open = drm_gem_vm_open,
>>         .close = drm_gem_vm_close,
>>  };
>> @@ -316,13 +501,18 @@ int tegra_drm_mmap(struct file *file, struct vm_area_struct *vma)
>>         if (ret)
>>                 return ret;
>>
>> +       vma->vm_flags |= VM_MIXEDMAP;
>> +       vma->vm_flags &= ~VM_PFNMAP;
>> +
>>         gem = vma->vm_private_data;
>>         bo = to_tegra_bo(gem);
>>
>> -       ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
>> -                             vma->vm_end - vma->vm_start, vma->vm_page_prot);
>> -       if (ret)
>> -               drm_gem_vm_close(vma);
>> +       if (!bo->pages) {
>> +               ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
>> +                                     vma->vm_end - vma->vm_start, vma->vm_page_prot);
>> +               if (ret)
>> +                       drm_gem_vm_close(vma);
>> +       }
>>
>>         return ret;
>>  }
>> diff --git a/drivers/gpu/drm/tegra/gem.h b/drivers/gpu/drm/tegra/gem.h
>> index 43a25c853357..c2e3f43e4b3f 100644
>> --- a/drivers/gpu/drm/tegra/gem.h
>> +++ b/drivers/gpu/drm/tegra/gem.h
>> @@ -37,6 +37,10 @@ struct tegra_bo {
>>         dma_addr_t paddr;
>>         void *vaddr;
>>
>> +       struct drm_mm_node *mm;
>> +       unsigned long num_pages;
>> +       struct page **pages;
>> +
>>         struct tegra_bo_tiling tiling;
>>  };
>>
>> --
>> 2.0.0
>>


* [RFC 09/10] drm/tegra: Add IOMMU support
@ 2014-10-01 15:54             ` Sean Paul
  0 siblings, 0 replies; 133+ messages in thread
From: Sean Paul @ 2014-10-01 15:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Sep 30, 2014 at 2:48 PM, Sean Paul <seanpaul@google.com> wrote:
> On Thu, Jun 26, 2014 at 4:49 PM, Thierry Reding
> <thierry.reding@gmail.com> wrote:
>> From: Thierry Reding <treding@nvidia.com>
>>
>> When an IOMMU device is available on the platform bus, allocate an IOMMU
>> domain and attach the display controllers to it. The display controllers
>> can then scan out non-contiguous buffers by mapping them through the
>> IOMMU.
>>
>
> Hi Thierry,
> A few comments from Stéphane and myself that came up while we were
> reviewing this for our tree.
>
>> Signed-off-by: Thierry Reding <treding@nvidia.com>
>> ---
>>  drivers/gpu/drm/tegra/dc.c  |  21 ++++
>>  drivers/gpu/drm/tegra/drm.c |  17 ++++
>>  drivers/gpu/drm/tegra/drm.h |   3 +
>>  drivers/gpu/drm/tegra/fb.c  |  16 ++-
>>  drivers/gpu/drm/tegra/gem.c | 236 +++++++++++++++++++++++++++++++++++++++-----
>>  drivers/gpu/drm/tegra/gem.h |   4 +
>>  6 files changed, 273 insertions(+), 24 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
>> index afcca04f5367..0f7452d04811 100644
>> --- a/drivers/gpu/drm/tegra/dc.c
>> +++ b/drivers/gpu/drm/tegra/dc.c
>> @@ -9,6 +9,7 @@
>>
>>  #include <linux/clk.h>
>>  #include <linux/debugfs.h>
>> +#include <linux/iommu.h>
>>  #include <linux/reset.h>
>>
>>  #include "dc.h"
>> @@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
>>  {
>>         struct drm_device *drm = dev_get_drvdata(client->parent);
>>         struct tegra_dc *dc = host1x_client_to_dc(client);
>> +       struct tegra_drm *tegra = drm->dev_private;
>>         int err;
>>
>> +       if (tegra->domain) {
>> +               err = iommu_attach_device(tegra->domain, dc->dev);
>> +               if (err < 0) {
>> +                       dev_err(dc->dev, "failed to attach to IOMMU: %d\n",
>> +                               err);
>> +                       return err;
>> +               }
>
> [from Stéphane]
>
> shouldn't we call detach in the error paths below?
>
>
>> +       }
>> +
>>         drm_crtc_init(drm, &dc->base, &tegra_crtc_funcs);
>>         drm_mode_crtc_set_gamma_size(&dc->base, 256);
>>         drm_crtc_helper_add(&dc->base, &tegra_crtc_helper_funcs);
>> @@ -1318,7 +1329,9 @@ static int tegra_dc_init(struct host1x_client *client)
>>
>>  static int tegra_dc_exit(struct host1x_client *client)
>>  {
>> +       struct drm_device *drm = dev_get_drvdata(client->parent);
>>         struct tegra_dc *dc = host1x_client_to_dc(client);
>> +       struct tegra_drm *tegra = drm->dev_private;
>>         int err;
>>
>>         devm_free_irq(dc->dev, dc->irq, dc);
>> @@ -1335,6 +1348,8 @@ static int tegra_dc_exit(struct host1x_client *client)
>>                 return err;
>>         }
>>
>> +       iommu_detach_device(tegra->domain, dc->dev);
>> +
>>         return 0;
>>  }
>>
>> @@ -1462,6 +1477,12 @@ static int tegra_dc_probe(struct platform_device *pdev)
>>                 return -ENXIO;
>>         }
>>
>> +       err = iommu_attach(&pdev->dev);
>> +       if (err < 0) {
>> +               dev_err(&pdev->dev, "failed to attach to IOMMU: %d\n", err);
>> +               return err;
>> +       }
>> +
>>         INIT_LIST_HEAD(&dc->client.list);
>>         dc->client.ops = &dc_client_ops;
>>         dc->client.dev = &pdev->dev;
>> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
>> index 59736bb810cd..1d2bbafad982 100644
>> --- a/drivers/gpu/drm/tegra/drm.c
>> +++ b/drivers/gpu/drm/tegra/drm.c
>> @@ -8,6 +8,7 @@
>>   */
>>
>>  #include <linux/host1x.h>
>> +#include <linux/iommu.h>
>>
>>  #include "drm.h"
>>  #include "gem.h"
>> @@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>>         if (!tegra)
>>                 return -ENOMEM;
>>
>> +       if (iommu_present(&platform_bus_type)) {
>> +               tegra->domain = iommu_domain_alloc(&platform_bus_type);
>> +               if (IS_ERR(tegra->domain)) {
>> +                       kfree(tegra);
>> +                       return PTR_ERR(tegra->domain);
>> +               }
>> +
>> +               drm_mm_init(&tegra->mm, 0, SZ_2G);
>
>
> [from Stéphane]:
>
> none of these are freed in the error path below (iommu_domain_free and
> drm_mm_takedown)
>
> also |tegra| isn't freed either?
>
>
>
>> +       }
>> +
>>         mutex_init(&tegra->clients_lock);
>>         INIT_LIST_HEAD(&tegra->clients);
>>         drm->dev_private = tegra;
>> @@ -71,6 +82,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>>  static int tegra_drm_unload(struct drm_device *drm)
>>  {
>>         struct host1x_device *device = to_host1x_device(drm->dev);
>> +       struct tegra_drm *tegra = drm->dev_private;
>>         int err;
>>
>>         drm_kms_helper_poll_fini(drm);
>> @@ -82,6 +94,11 @@ static int tegra_drm_unload(struct drm_device *drm)
>>         if (err < 0)
>>                 return err;
>>
>> +       if (tegra->domain) {
>> +               iommu_domain_free(tegra->domain);
>> +               drm_mm_takedown(&tegra->mm);
>> +       }
>> +
>>         return 0;
>>  }
>>
>> diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
>> index 96d754e7b3eb..a07c796b7edc 100644
>> --- a/drivers/gpu/drm/tegra/drm.h
>> +++ b/drivers/gpu/drm/tegra/drm.h
>> @@ -39,6 +39,9 @@ struct tegra_fbdev {
>>  struct tegra_drm {
>>         struct drm_device *drm;
>>
>> +       struct iommu_domain *domain;
>> +       struct drm_mm mm;
>> +
>>         struct mutex clients_lock;
>>         struct list_head clients;
>>
>> diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c
>> index 7790d43ad082..21c65dd817c3 100644
>> --- a/drivers/gpu/drm/tegra/fb.c
>> +++ b/drivers/gpu/drm/tegra/fb.c
>> @@ -65,8 +65,12 @@ static void tegra_fb_destroy(struct drm_framebuffer *framebuffer)
>>         for (i = 0; i < fb->num_planes; i++) {
>>                 struct tegra_bo *bo = fb->planes[i];
>>
>> -               if (bo)
>> +               if (bo) {
>> +                       if (bo->pages && bo->virt)
>> +                               vunmap(bo->virt);
>> +
>>                         drm_gem_object_unreference_unlocked(&bo->gem);
>> +               }
>>         }
>>
>>         drm_framebuffer_cleanup(framebuffer);
>> @@ -252,6 +256,16 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
>>         offset = info->var.xoffset * bytes_per_pixel +
>>                  info->var.yoffset * fb->pitches[0];
>>
>> +       if (bo->pages) {
>> +               bo->vaddr = vmap(bo->pages, bo->num_pages, VM_MAP,
>> +                                pgprot_writecombine(PAGE_KERNEL));
>> +               if (!bo->vaddr) {
>> +                       dev_err(drm->dev, "failed to vmap() framebuffer\n");
>> +                       err = -ENOMEM;
>> +                       goto destroy;
>> +               }
>> +       }
>> +
>>         drm->mode_config.fb_base = (resource_size_t)bo->paddr;
>>         info->screen_base = (void __iomem *)bo->vaddr + offset;
>>         info->screen_size = size;
>> diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
>> index c1e4e8b6e5ca..2912e61a2599 100644
>> --- a/drivers/gpu/drm/tegra/gem.c
>> +++ b/drivers/gpu/drm/tegra/gem.c
>> @@ -14,8 +14,10 @@
>>   */
>>
>>  #include <linux/dma-buf.h>
>> +#include <linux/iommu.h>
>>  #include <drm/tegra_drm.h>
>>
>> +#include "drm.h"
>>  #include "gem.h"
>>
>>  static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo)
>> @@ -90,14 +92,144 @@ static const struct host1x_bo_ops tegra_bo_ops = {
>>         .kunmap = tegra_bo_kunmap,
>>  };
>>
>> +static int iommu_map_sg(struct iommu_domain *domain, struct sg_table *sgt,
>> +                       dma_addr_t iova, int prot)
>> +{
>> +       unsigned long offset = 0;
>> +       struct scatterlist *sg;
>> +       unsigned int i, j;
>> +       int err;
>> +
>> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
>> +               dma_addr_t phys = sg_phys(sg);
>> +               size_t length = sg->offset;
>> +
>> +               phys = sg_phys(sg) - sg->offset;
>> +               length = sg->length + sg->offset;
>> +
>> +               err = iommu_map(domain, iova + offset, phys, length, prot);
>> +               if (err < 0)
>> +                       goto unmap;
>> +
>> +               offset += length;
>> +       }
>> +
>> +       return 0;
>> +
>> +unmap:
>> +       offset = 0;
>> +
>> +       for_each_sg(sgt->sgl, sg, i, j) {
>> +               size_t length = sg->length + sg->offset;
>> +               iommu_unmap(domain, iova + offset, length);
>> +               offset += length;
>> +       }
>> +
>> +       return err;
>> +}
>> +
>> +static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
>> +                         dma_addr_t iova)
>> +{
>> +       unsigned long offset = 0;
>> +       struct scatterlist *sg;
>> +       unsigned int i;
>> +
>> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
>> +               dma_addr_t phys = sg_phys(sg);
>> +               size_t length = sg->offset;
>> +
>> +               phys = sg_phys(sg) - sg->offset;
>> +               length = sg->length + sg->offset;
>> +
>> +               iommu_unmap(domain, iova + offset, length);
>> +               offset += length;
>> +       }
>> +
>> +       return 0;
>> +}
>> +
>> +static int tegra_bo_iommu_map(struct tegra_drm *tegra, struct tegra_bo *bo)
>> +{
>> +       int prot = IOMMU_READ | IOMMU_WRITE;
>> +       int err;
>> +
>> +       if (bo->mm)
>> +               return -EBUSY;
>> +
>> +       bo->mm = kzalloc(sizeof(*bo->mm), GFP_KERNEL);
>> +       if (!bo->mm)
>> +               return -ENOMEM;
>> +
>> +       err = drm_mm_insert_node_generic(&tegra->mm, bo->mm, bo->gem.size,
>> +                                        PAGE_SIZE, 0, 0, 0);
>> +       if (err < 0) {
>> +               dev_err(tegra->drm->dev, "out of virtual memory: %d\n", err);
>> +               return err;
>> +       }
>> +
>> +       bo->paddr = bo->mm->start;
>> +
>> +       err = iommu_map_sg(tegra->domain, bo->sgt, bo->paddr, prot);
>> +       if (err < 0) {
>> +               dev_err(tegra->drm->dev, "failed to map buffer: %d\n", err);
>> +               return err;
>> +       }
>> +
>> +       return 0;
>> +}
>> +
>> +static int tegra_bo_iommu_unmap(struct tegra_drm *tegra, struct tegra_bo *bo)
>> +{
>> +       if (!bo->mm)
>> +               return 0;
>> +
>> +       iommu_unmap_sg(tegra->domain, bo->sgt, bo->paddr);
>> +       drm_mm_remove_node(bo->mm);
>> +
>> +       kfree(bo->mm);
>> +       return 0;
>> +}
>> +
>>  static void tegra_bo_destroy(struct drm_device *drm, struct tegra_bo *bo)
>>  {
>> -       dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr, bo->paddr);
>> +       if (!bo->pages)
>> +               dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr,
>> +                                     bo->paddr);

One more thing. If tegra_bo_alloc fails, we'll have bo->vaddr == NULL
and bo->paddr == ~0 here, which causes a crash.

I posted https://lkml.org/lkml/2014/9/30/659 to check for the error
condition in the mm code, but it seems like reviewer consensus is to
check for this before calling free.

As such, we'll need to make sure bo->vaddr != NULL before calling
dma_free_writecombine to avoid this situation.
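
Roughly something like the following, as an untested userspace sketch of
the guard. struct bo_sketch, fake_dma_free() and bo_destroy_sketch() are
stand-ins made up for illustration; the real change would keep calling
dma_free_writecombine() on the existing tegra_bo fields:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the relevant struct tegra_bo fields (hypothetical). */
struct bo_sketch {
	void *vaddr;              /* NULL if the DMA allocation failed */
	unsigned long long paddr; /* left at ~0 in that case */
	void **pages;             /* non-NULL only on the IOMMU path */
};

/* Counts calls so the guard can be observed; not a kernel function. */
static int fake_dma_free_calls;

static void fake_dma_free(void *vaddr, unsigned long long paddr)
{
	(void)vaddr;
	(void)paddr;
	fake_dma_free_calls++;
}

/* Only hand the buffer back if the allocation actually succeeded. */
static void bo_destroy_sketch(struct bo_sketch *bo)
{
	if (bo->pages)
		return; /* the patch calls drm_gem_put_pages() here; omitted */

	if (bo->vaddr)
		fake_dma_free(bo->vaddr, bo->paddr);
}
```

With vaddr == NULL the free is skipped entirely, so the bogus paddr never
reaches the DMA code.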

Would you prefer I send a patch up to fix this separately, or would
you like to roll this into your next version?

Sean




>> +       else
>> +               drm_gem_put_pages(&bo->gem, bo->pages, true, true);
>> +}
>> +
>> +static int tegra_bo_get_pages(struct drm_device *drm, struct tegra_bo *bo,
>> +                             size_t size)
>> +{
>> +       bo->pages = drm_gem_get_pages(&bo->gem, GFP_KERNEL);
>> +       if (!bo->pages)
>> +               return -ENOMEM;
>> +
>> +       bo->num_pages = size >> PAGE_SHIFT;
>> +
>> +       return 0;
>> +}
>> +
>> +static int tegra_bo_alloc(struct drm_device *drm, struct tegra_bo *bo,
>> +                         size_t size)
>> +{
>> +       bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
>> +                                          GFP_KERNEL | __GFP_NOWARN);
>> +       if (!bo->vaddr) {
>> +               dev_err(drm->dev, "failed to allocate buffer of size %zu\n",
>> +                       size);
>> +               return -ENOMEM;
>> +       }
>> +
>> +       return 0;
>>  }
>>
>>  struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>>                                  unsigned long flags)
>>  {
>> +       struct tegra_drm *tegra = drm->dev_private;
>>         struct tegra_bo *bo;
>>         int err;
>>
>> @@ -108,22 +240,33 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>>         host1x_bo_init(&bo->base, &tegra_bo_ops);
>>         size = round_up(size, PAGE_SIZE);
>>
>> -       bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
>> -                                          GFP_KERNEL | __GFP_NOWARN);
>> -       if (!bo->vaddr) {
>> -               dev_err(drm->dev, "failed to allocate buffer with size %u\n",
>> -                       size);
>> -               err = -ENOMEM;
>> -               goto err_dma;
>> -       }
>> -
>>         err = drm_gem_object_init(drm, &bo->gem, size);
>>         if (err)
>> -               goto err_init;
>> +               goto free;
>>
>>         err = drm_gem_create_mmap_offset(&bo->gem);
>
> We need to call drm_gem_free_mmap_offset if one of the calls below
> fails, otherwise we'll try to free the mmap_offset on an already
> destroyed bo.
>
>
> Sean
>
>
>
>>         if (err)
>> -               goto err_mmap;
>> +               goto release;
>> +
>> +       if (tegra->domain) {
>> +               err = tegra_bo_get_pages(drm, bo, size);
>> +               if (err < 0)
>> +                       goto release;
>> +
>> +               bo->sgt = drm_prime_pages_to_sg(bo->pages, bo->num_pages);
>> +               if (IS_ERR(bo->sgt)) {
>> +                       err = PTR_ERR(bo->sgt);
>> +                       goto release;
>> +               }
>> +
>> +               err = tegra_bo_iommu_map(tegra, bo);
>> +               if (err < 0)
>> +                       goto release;
>> +       } else {
>> +               err = tegra_bo_alloc(drm, bo, size);
>> +               if (err < 0)
>> +                       goto release;
>> +       }
>>
>>         if (flags & DRM_TEGRA_GEM_CREATE_TILED)
>>                 bo->tiling.mode = TEGRA_BO_TILING_MODE_TILED;
>> @@ -133,11 +276,10 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>>
>>         return bo;
>>
>> -err_mmap:
>> +release:
>>         drm_gem_object_release(&bo->gem);
>> -err_init:
>>         tegra_bo_destroy(drm, bo);
>> -err_dma:
>> +free:
>>         kfree(bo);
>>
>>         return ERR_PTR(err);
>> @@ -172,6 +314,7 @@ err:
>>  static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
>>                                         struct dma_buf *buf)
>>  {
>> +       struct tegra_drm *tegra = drm->dev_private;
>>         struct dma_buf_attachment *attach;
>>         struct tegra_bo *bo;
>>         ssize_t size;
>> @@ -211,12 +354,19 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
>>                 goto detach;
>>         }
>>
>> -       if (bo->sgt->nents > 1) {
>> -               err = -EINVAL;
>> -               goto detach;
>> +       if (tegra->domain) {
>> +               err = tegra_bo_iommu_map(tegra, bo);
>> +               if (err < 0)
>> +                       goto detach;
>> +       } else {
>> +               if (bo->sgt->nents > 1) {
>> +                       err = -EINVAL;
>> +                       goto detach;
>> +               }
>> +
>> +               bo->paddr = sg_dma_address(bo->sgt->sgl);
>>         }
>>
>> -       bo->paddr = sg_dma_address(bo->sgt->sgl);
>>         bo->gem.import_attach = attach;
>>
>>         return bo;
>> @@ -239,8 +389,12 @@ free:
>>
>>  void tegra_bo_free_object(struct drm_gem_object *gem)
>>  {
>> +       struct tegra_drm *tegra = gem->dev->dev_private;
>>         struct tegra_bo *bo = to_tegra_bo(gem);
>>
>> +       if (tegra->domain)
>> +               tegra_bo_iommu_unmap(tegra, bo);
>> +
>>         if (gem->import_attach) {
>>                 dma_buf_unmap_attachment(gem->import_attach, bo->sgt,
>>                                          DMA_TO_DEVICE);
>> @@ -301,7 +455,38 @@ int tegra_bo_dumb_map_offset(struct drm_file *file, struct drm_device *drm,
>>         return 0;
>>  }
>>
>> +static int tegra_bo_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>> +{
>> +       struct drm_gem_object *gem = vma->vm_private_data;
>> +       struct tegra_bo *bo = to_tegra_bo(gem);
>> +       struct page *page;
>> +       pgoff_t offset;
>> +       int err;
>> +
>> +       if (!bo->pages)
>> +               return VM_FAULT_SIGBUS;
>> +
>> +       offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >> PAGE_SHIFT;
>> +       page = bo->pages[offset];
>> +
>> +       err = vm_insert_page(vma, (unsigned long)vmf->virtual_address, page);
>> +       switch (err) {
>> +       case -EAGAIN:
>> +       case 0:
>> +       case -ERESTARTSYS:
>> +       case -EINTR:
>> +       case -EBUSY:
>> +               return VM_FAULT_NOPAGE;
>> +
>> +       case -ENOMEM:
>> +               return VM_FAULT_OOM;
>> +       }
>> +
>> +       return VM_FAULT_SIGBUS;
>> +}
>> +
>>  const struct vm_operations_struct tegra_bo_vm_ops = {
>> +       .fault = tegra_bo_fault,
>>         .open = drm_gem_vm_open,
>>         .close = drm_gem_vm_close,
>>  };
>> @@ -316,13 +501,18 @@ int tegra_drm_mmap(struct file *file, struct vm_area_struct *vma)
>>         if (ret)
>>                 return ret;
>>
>> +       vma->vm_flags |= VM_MIXEDMAP;
>> +       vma->vm_flags &= ~VM_PFNMAP;
>> +
>>         gem = vma->vm_private_data;
>>         bo = to_tegra_bo(gem);
>>
>> -       ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
>> -                             vma->vm_end - vma->vm_start, vma->vm_page_prot);
>> -       if (ret)
>> -               drm_gem_vm_close(vma);
>> +       if (!bo->pages) {
>> +               ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
>> +                                     vma->vm_end - vma->vm_start, vma->vm_page_prot);
>> +               if (ret)
>> +                       drm_gem_vm_close(vma);
>> +       }
>>
>>         return ret;
>>  }
>> diff --git a/drivers/gpu/drm/tegra/gem.h b/drivers/gpu/drm/tegra/gem.h
>> index 43a25c853357..c2e3f43e4b3f 100644
>> --- a/drivers/gpu/drm/tegra/gem.h
>> +++ b/drivers/gpu/drm/tegra/gem.h
>> @@ -37,6 +37,10 @@ struct tegra_bo {
>>         dma_addr_t paddr;
>>         void *vaddr;
>>
>> +       struct drm_mm_node *mm;
>> +       unsigned long num_pages;
>> +       struct page **pages;
>> +
>>         struct tegra_bo_tiling tiling;
>>  };
>>
>> --
>> 2.0.0
>>

* Re: [RFC 09/10] drm/tegra: Add IOMMU support
  2014-10-01 15:54             ` Sean Paul
  (?)
@ 2014-10-02  8:39                 ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-10-02  8:39 UTC (permalink / raw)
  To: Sean Paul
  Cc: Mark Rutland, Will Deacon, Paul Walmsley, Pawel Moll,
	Ian Campbell, Marc Zyngier, Dave Martin,
	devicetree, Arnd Bergmann, Stephen Warren,
	Grant Grundler, Allen Martin, Rob Herring,
	linux-tegra, Cho KyongHo,
	Linux ARM Kernel, Stéphane Marchesin,
	Linux Kernel Mailing List, Linux IOMMU, Kumar Gala,
	Rhyland Klein


On Wed, Oct 01, 2014 at 11:54:11AM -0400, Sean Paul wrote:
> On Tue, Sep 30, 2014 at 2:48 PM, Sean Paul <seanpaul@google.com> wrote:
> > On Thu, Jun 26, 2014 at 4:49 PM, Thierry Reding
> > <thierry.reding@gmail.com> wrote:
> >> From: Thierry Reding <treding@nvidia.com>
> >>
> >> When an IOMMU device is available on the platform bus, allocate an IOMMU
> >> domain and attach the display controllers to it. The display controllers
> >> can then scan out non-contiguous buffers by mapping them through the
> >> IOMMU.
> >>
> >
> > Hi Thierry,
> > A few comments from Stéphane and myself that came up while we were
> > reviewing this for our tree.
> >
> >> Signed-off-by: Thierry Reding <treding@nvidia.com>
> >> ---
> >>  drivers/gpu/drm/tegra/dc.c  |  21 ++++
> >>  drivers/gpu/drm/tegra/drm.c |  17 ++++
> >>  drivers/gpu/drm/tegra/drm.h |   3 +
> >>  drivers/gpu/drm/tegra/fb.c  |  16 ++-
> >>  drivers/gpu/drm/tegra/gem.c | 236 +++++++++++++++++++++++++++++++++++++++-----
> >>  drivers/gpu/drm/tegra/gem.h |   4 +
> >>  6 files changed, 273 insertions(+), 24 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
> >> index afcca04f5367..0f7452d04811 100644
> >> --- a/drivers/gpu/drm/tegra/dc.c
> >> +++ b/drivers/gpu/drm/tegra/dc.c
> >> @@ -9,6 +9,7 @@
> >>
> >>  #include <linux/clk.h>
> >>  #include <linux/debugfs.h>
> >> +#include <linux/iommu.h>
> >>  #include <linux/reset.h>
> >>
> >>  #include "dc.h"
> >> @@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
> >>  {
> >>         struct drm_device *drm = dev_get_drvdata(client->parent);
> >>         struct tegra_dc *dc = host1x_client_to_dc(client);
> >> +       struct tegra_drm *tegra = drm->dev_private;
> >>         int err;
> >>
> >> +       if (tegra->domain) {
> >> +               err = iommu_attach_device(tegra->domain, dc->dev);
> >> +               if (err < 0) {
> >> +                       dev_err(dc->dev, "failed to attach to IOMMU: %d\n",
> >> +                               err);
> >> +                       return err;
> >> +               }
> >
> > [from Stéphane]
> >
> > shouldn't we call detach in the error paths below?
> >
> >
> >> +       }
> >> +
> >>         drm_crtc_init(drm, &dc->base, &tegra_crtc_funcs);
> >>         drm_mode_crtc_set_gamma_size(&dc->base, 256);
> >>         drm_crtc_helper_add(&dc->base, &tegra_crtc_helper_funcs);
> >> @@ -1318,7 +1329,9 @@ static int tegra_dc_init(struct host1x_client *client)
> >>
> >>  static int tegra_dc_exit(struct host1x_client *client)
> >>  {
> >> +       struct drm_device *drm = dev_get_drvdata(client->parent);
> >>         struct tegra_dc *dc = host1x_client_to_dc(client);
> >> +       struct tegra_drm *tegra = drm->dev_private;
> >>         int err;
> >>
> >>         devm_free_irq(dc->dev, dc->irq, dc);
> >> @@ -1335,6 +1348,8 @@ static int tegra_dc_exit(struct host1x_client *client)
> >>                 return err;
> >>         }
> >>
> >> +       iommu_detach_device(tegra->domain, dc->dev);
> >> +
> >>         return 0;
> >>  }
> >>
> >> @@ -1462,6 +1477,12 @@ static int tegra_dc_probe(struct platform_device *pdev)
> >>                 return -ENXIO;
> >>         }
> >>
> >> +       err = iommu_attach(&pdev->dev);
> >> +       if (err < 0) {
> >> +               dev_err(&pdev->dev, "failed to attach to IOMMU: %d\n", err);
> >> +               return err;
> >> +       }
> >> +
> >>         INIT_LIST_HEAD(&dc->client.list);
> >>         dc->client.ops = &dc_client_ops;
> >>         dc->client.dev = &pdev->dev;
> >> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
> >> index 59736bb810cd..1d2bbafad982 100644
> >> --- a/drivers/gpu/drm/tegra/drm.c
> >> +++ b/drivers/gpu/drm/tegra/drm.c
> >> @@ -8,6 +8,7 @@
> >>   */
> >>
> >>  #include <linux/host1x.h>
> >> +#include <linux/iommu.h>
> >>
> >>  #include "drm.h"
> >>  #include "gem.h"
> >> @@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
> >>         if (!tegra)
> >>                 return -ENOMEM;
> >>
> >> +       if (iommu_present(&platform_bus_type)) {
> >> +               tegra->domain = iommu_domain_alloc(&platform_bus_type);
> >> +               if (IS_ERR(tegra->domain)) {
> >> +                       kfree(tegra);
> >> +                       return PTR_ERR(tegra->domain);
> >> +               }
> >> +
> >> +               drm_mm_init(&tegra->mm, 0, SZ_2G);
> >
> >
> > [from Stéphane]:
> >
> > none of these are freed in the error path below (iommu_domain_free and
> > drm_mm_takedown)
> >
> > also |tegra| isn't freed either?
> >
> >
> >
> >> +       }
> >> +
> >>         mutex_init(&tegra->clients_lock);
> >>         INIT_LIST_HEAD(&tegra->clients);
> >>         drm->dev_private = tegra;
> >> @@ -71,6 +82,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
> >>  static int tegra_drm_unload(struct drm_device *drm)
> >>  {
> >>         struct host1x_device *device = to_host1x_device(drm->dev);
> >> +       struct tegra_drm *tegra = drm->dev_private;
> >>         int err;
> >>
> >>         drm_kms_helper_poll_fini(drm);
> >> @@ -82,6 +94,11 @@ static int tegra_drm_unload(struct drm_device *drm)
> >>         if (err < 0)
> >>                 return err;
> >>
> >> +       if (tegra->domain) {
> >> +               iommu_domain_free(tegra->domain);
> >> +               drm_mm_takedown(&tegra->mm);
> >> +       }
> >> +
> >>         return 0;
> >>  }
> >>
> >> diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
> >> index 96d754e7b3eb..a07c796b7edc 100644
> >> --- a/drivers/gpu/drm/tegra/drm.h
> >> +++ b/drivers/gpu/drm/tegra/drm.h
> >> @@ -39,6 +39,9 @@ struct tegra_fbdev {
> >>  struct tegra_drm {
> >>         struct drm_device *drm;
> >>
> >> +       struct iommu_domain *domain;
> >> +       struct drm_mm mm;
> >> +
> >>         struct mutex clients_lock;
> >>         struct list_head clients;
> >>
> >> diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c
> >> index 7790d43ad082..21c65dd817c3 100644
> >> --- a/drivers/gpu/drm/tegra/fb.c
> >> +++ b/drivers/gpu/drm/tegra/fb.c
> >> @@ -65,8 +65,12 @@ static void tegra_fb_destroy(struct drm_framebuffer *framebuffer)
> >>         for (i = 0; i < fb->num_planes; i++) {
> >>                 struct tegra_bo *bo = fb->planes[i];
> >>
> >> -               if (bo)
> >> +               if (bo) {
> >> +                       if (bo->pages && bo->virt)
> >> +                               vunmap(bo->virt);
> >> +
> >>                         drm_gem_object_unreference_unlocked(&bo->gem);
> >> +               }
> >>         }
> >>
> >>         drm_framebuffer_cleanup(framebuffer);
> >> @@ -252,6 +256,16 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
> >>         offset = info->var.xoffset * bytes_per_pixel +
> >>                  info->var.yoffset * fb->pitches[0];
> >>
> >> +       if (bo->pages) {
> >> +               bo->vaddr = vmap(bo->pages, bo->num_pages, VM_MAP,
> >> +                                pgprot_writecombine(PAGE_KERNEL));
> >> +               if (!bo->vaddr) {
> >> +                       dev_err(drm->dev, "failed to vmap() framebuffer\n");
> >> +                       err = -ENOMEM;
> >> +                       goto destroy;
> >> +               }
> >> +       }
> >> +
> >>         drm->mode_config.fb_base = (resource_size_t)bo->paddr;
> >>         info->screen_base = (void __iomem *)bo->vaddr + offset;
> >>         info->screen_size = size;
> >> diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
> >> index c1e4e8b6e5ca..2912e61a2599 100644
> >> --- a/drivers/gpu/drm/tegra/gem.c
> >> +++ b/drivers/gpu/drm/tegra/gem.c
> >> @@ -14,8 +14,10 @@
> >>   */
> >>
> >>  #include <linux/dma-buf.h>
> >> +#include <linux/iommu.h>
> >>  #include <drm/tegra_drm.h>
> >>
> >> +#include "drm.h"
> >>  #include "gem.h"
> >>
> >>  static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo)
> >> @@ -90,14 +92,144 @@ static const struct host1x_bo_ops tegra_bo_ops = {
> >>         .kunmap = tegra_bo_kunmap,
> >>  };
> >>
> >> +static int iommu_map_sg(struct iommu_domain *domain, struct sg_table *sgt,
> >> +                       dma_addr_t iova, int prot)
> >> +{
> >> +       unsigned long offset = 0;
> >> +       struct scatterlist *sg;
> >> +       unsigned int i, j;
> >> +       int err;
> >> +
> >> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> >> +               dma_addr_t phys = sg_phys(sg);
> >> +               size_t length = sg->offset;
> >> +
> >> +               phys = sg_phys(sg) - sg->offset;
> >> +               length = sg->length + sg->offset;
> >> +
> >> +               err = iommu_map(domain, iova + offset, phys, length, prot);
> >> +               if (err < 0)
> >> +                       goto unmap;
> >> +
> >> +               offset += length;
> >> +       }
> >> +
> >> +       return 0;
> >> +
> >> +unmap:
> >> +       offset = 0;
> >> +
> >> +       for_each_sg(sgt->sgl, sg, i, j) {
> >> +               size_t length = sg->length + sg->offset;
> >> +               iommu_unmap(domain, iova + offset, length);
> >> +               offset += length;
> >> +       }
> >> +
> >> +       return err;
> >> +}
> >> +
> >> +static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
> >> +                         dma_addr_t iova)
> >> +{
> >> +       unsigned long offset = 0;
> >> +       struct scatterlist *sg;
> >> +       unsigned int i;
> >> +
> >> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> >> +               dma_addr_t phys = sg_phys(sg);
> >> +               size_t length = sg->offset;
> >> +
> >> +               phys = sg_phys(sg) - sg->offset;
> >> +               length = sg->length + sg->offset;
> >> +
> >> +               iommu_unmap(domain, iova + offset, length);
> >> +               offset += length;
> >> +       }
> >> +
> >> +       return 0;
> >> +}
> >> +
> >> +static int tegra_bo_iommu_map(struct tegra_drm *tegra, struct tegra_bo *bo)
> >> +{
> >> +       int prot = IOMMU_READ | IOMMU_WRITE;
> >> +       int err;
> >> +
> >> +       if (bo->mm)
> >> +               return -EBUSY;
> >> +
> >> +       bo->mm = kzalloc(sizeof(*bo->mm), GFP_KERNEL);
> >> +       if (!bo->mm)
> >> +               return -ENOMEM;
> >> +
> >> +       err = drm_mm_insert_node_generic(&tegra->mm, bo->mm, bo->gem.size,
> >> +                                        PAGE_SIZE, 0, 0, 0);
> >> +       if (err < 0) {
> >> +               dev_err(tegra->drm->dev, "out of virtual memory: %d\n", err);
> >> +               return err;
> >> +       }
> >> +
> >> +       bo->paddr = bo->mm->start;
> >> +
> >> +       err = iommu_map_sg(tegra->domain, bo->sgt, bo->paddr, prot);
> >> +       if (err < 0) {
> >> +               dev_err(tegra->drm->dev, "failed to map buffer: %d\n", err);
> >> +               return err;
> >> +       }
> >> +
> >> +       return 0;
> >> +}
> >> +
> >> +static int tegra_bo_iommu_unmap(struct tegra_drm *tegra, struct tegra_bo *bo)
> >> +{
> >> +       if (!bo->mm)
> >> +               return 0;
> >> +
> >> +       iommu_unmap_sg(tegra->domain, bo->sgt, bo->paddr);
> >> +       drm_mm_remove_node(bo->mm);
> >> +
> >> +       kfree(bo->mm);
> >> +       return 0;
> >> +}
> >> +
> >>  static void tegra_bo_destroy(struct drm_device *drm, struct tegra_bo *bo)
> >>  {
> >> -       dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr, bo->paddr);
> >> +       if (!bo->pages)
> >> +               dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr,
> >> +                                     bo->paddr);
> 
> One more thing. If tegra_bo_alloc fails, we'll have bo->vaddr == NULL
> and bo->paddr == ~0 here, which causes a crash.
> 
> I posted https://lkml.org/lkml/2014/9/30/659 to check for the error
> condition in the mm code, but it seems like reviewer consensus is to
> check for this before calling free.
> 
> As such, we'll need to make sure bo->vaddr != NULL before calling
> dma_free_writecombine to avoid this situation.
> 
> Would you prefer I send a patch up to fix this separately, or would
> you like to roll this into your next version?

Thanks for pointing all of these out. I'm going to trace the failure
code path anyway since there seem to be a couple of loose ends here and
there, so I'll probably roll in a fix for this anyway.
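
For the tegra_drm_load() path that Stéphane flagged, the loose ends would
presumably be closed with the usual goto-based unwinding. A standalone
userspace sketch of that pattern follows; load_sketch(), struct drm_sketch
and the fail_at knob are made-up stand-ins, with malloc()/free() in place
of iommu_domain_alloc()/iommu_domain_free() and a flag in place of
drm_mm_init()/drm_mm_takedown():

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the tegra_drm fields touched in load(). */
struct drm_sketch {
	void *domain;  /* models tegra->domain */
	int mm_ready;  /* models drm_mm_init() having run on tegra->mm */
};

static int live_objects; /* allocations not yet released */

/*
 * fail_at selects which step fails (0 = none), so each unwind path can
 * be exercised: 1 = domain allocation, 2 = a later init step.
 */
static int load_sketch(int fail_at, struct drm_sketch **out)
{
	struct drm_sketch *tegra;
	int err;

	tegra = calloc(1, sizeof(*tegra));
	if (!tegra)
		return -12; /* -ENOMEM */
	live_objects++;

	tegra->domain = (fail_at == 1) ? NULL : malloc(1);
	if (!tegra->domain) {
		err = -12;
		goto free_tegra;
	}
	live_objects++;

	tegra->mm_ready = 1;

	if (fail_at == 2) {
		err = -22; /* -EINVAL from some later step */
		goto free_domain;
	}

	*out = tegra;
	return 0;

free_domain:
	tegra->mm_ready = 0;  /* drm_mm_takedown() would go here */
	free(tegra->domain);  /* iommu_domain_free() would go here */
	live_objects--;
free_tegra:
	free(tegra);          /* err was saved before freeing */
	live_objects--;
	return err;
}
```

The ordering is the point: err is captured before anything is freed,
whereas the IS_ERR() branch in the patch reads tegra->domain after
kfree(tegra).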

Thierry


* Re: [RFC 09/10] drm/tegra: Add IOMMU support
@ 2014-10-02  8:39                 ` Thierry Reding
  0 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-10-02  8:39 UTC (permalink / raw)
  To: Sean Paul
  Cc: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel,
	Cho KyongHo, Grant Grundler, Dave Martin, Marc Zyngier,
	Hiroshi Doyu, Olav Haugan, Paul Walmsley, Rhyland Klein,
	Allen Martin, devicetree, Linux IOMMU, Linux ARM Kernel,
	linux-tegra, Linux Kernel Mailing List, Stéphane Marchesin

On Wed, Oct 01, 2014 at 11:54:11AM -0400, Sean Paul wrote:
> On Tue, Sep 30, 2014 at 2:48 PM, Sean Paul <seanpaul@google.com> wrote:
> > On Thu, Jun 26, 2014 at 4:49 PM, Thierry Reding
> > <thierry.reding@gmail.com> wrote:
> >> From: Thierry Reding <treding@nvidia.com>
> >>
> >> When an IOMMU device is available on the platform bus, allocate an IOMMU
> >> domain and attach the display controllers to it. The display controllers
> >> can then scan out non-contiguous buffers by mapping them through the
> >> IOMMU.
> >>
> >
> > Hi Thierry,
> > A few comments from Stéphane and myself that came up while we were
> > reviewing this for our tree.
> >
> >> Signed-off-by: Thierry Reding <treding@nvidia.com>
> >> ---
> >>  drivers/gpu/drm/tegra/dc.c  |  21 ++++
> >>  drivers/gpu/drm/tegra/drm.c |  17 ++++
> >>  drivers/gpu/drm/tegra/drm.h |   3 +
> >>  drivers/gpu/drm/tegra/fb.c  |  16 ++-
> >>  drivers/gpu/drm/tegra/gem.c | 236 +++++++++++++++++++++++++++++++++++++++-----
> >>  drivers/gpu/drm/tegra/gem.h |   4 +
> >>  6 files changed, 273 insertions(+), 24 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
> >> index afcca04f5367..0f7452d04811 100644
> >> --- a/drivers/gpu/drm/tegra/dc.c
> >> +++ b/drivers/gpu/drm/tegra/dc.c
> >> @@ -9,6 +9,7 @@
> >>
> >>  #include <linux/clk.h>
> >>  #include <linux/debugfs.h>
> >> +#include <linux/iommu.h>
> >>  #include <linux/reset.h>
> >>
> >>  #include "dc.h"
> >> @@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
> >>  {
> >>         struct drm_device *drm = dev_get_drvdata(client->parent);
> >>         struct tegra_dc *dc = host1x_client_to_dc(client);
> >> +       struct tegra_drm *tegra = drm->dev_private;
> >>         int err;
> >>
> >> +       if (tegra->domain) {
> >> +               err = iommu_attach_device(tegra->domain, dc->dev);
> >> +               if (err < 0) {
> >> +                       dev_err(dc->dev, "failed to attach to IOMMU: %d\n",
> >> +                               err);
> >> +                       return err;
> >> +               }
> >
> > [from Stéphane]
> >
> > shouldn't we call detach in the error paths below?
> >
> >
> >> +       }
> >> +
> >>         drm_crtc_init(drm, &dc->base, &tegra_crtc_funcs);
> >>         drm_mode_crtc_set_gamma_size(&dc->base, 256);
> >>         drm_crtc_helper_add(&dc->base, &tegra_crtc_helper_funcs);
> >> @@ -1318,7 +1329,9 @@ static int tegra_dc_init(struct host1x_client *client)
> >>
> >>  static int tegra_dc_exit(struct host1x_client *client)
> >>  {
> >> +       struct drm_device *drm = dev_get_drvdata(client->parent);
> >>         struct tegra_dc *dc = host1x_client_to_dc(client);
> >> +       struct tegra_drm *tegra = drm->dev_private;
> >>         int err;
> >>
> >>         devm_free_irq(dc->dev, dc->irq, dc);
> >> @@ -1335,6 +1348,8 @@ static int tegra_dc_exit(struct host1x_client *client)
> >>                 return err;
> >>         }
> >>
> >> +       iommu_detach_device(tegra->domain, dc->dev);
> >> +
> >>         return 0;
> >>  }
> >>
> >> @@ -1462,6 +1477,12 @@ static int tegra_dc_probe(struct platform_device *pdev)
> >>                 return -ENXIO;
> >>         }
> >>
> >> +       err = iommu_attach(&pdev->dev);
> >> +       if (err < 0) {
> >> +               dev_err(&pdev->dev, "failed to attach to IOMMU: %d\n", err);
> >> +               return err;
> >> +       }
> >> +
> >>         INIT_LIST_HEAD(&dc->client.list);
> >>         dc->client.ops = &dc_client_ops;
> >>         dc->client.dev = &pdev->dev;
> >> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
> >> index 59736bb810cd..1d2bbafad982 100644
> >> --- a/drivers/gpu/drm/tegra/drm.c
> >> +++ b/drivers/gpu/drm/tegra/drm.c
> >> @@ -8,6 +8,7 @@
> >>   */
> >>
> >>  #include <linux/host1x.h>
> >> +#include <linux/iommu.h>
> >>
> >>  #include "drm.h"
> >>  #include "gem.h"
> >> @@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
> >>         if (!tegra)
> >>                 return -ENOMEM;
> >>
> >> +       if (iommu_present(&platform_bus_type)) {
> >> +               tegra->domain = iommu_domain_alloc(&platform_bus_type);
> >> +               if (IS_ERR(tegra->domain)) {
> >> +                       kfree(tegra);
> >> +                       return PTR_ERR(tegra->domain);
> >> +               }
> >> +
> >> +               drm_mm_init(&tegra->mm, 0, SZ_2G);
> >
> >
> > [from Stéphane]:
> >
> > none of these are freed in the error path below (iommu_domain_free and
> > drm_mm_takedown)
> >
> > also |tegra| isn't freed either?
> >
> >
> >
> >> +       }
> >> +
> >>         mutex_init(&tegra->clients_lock);
> >>         INIT_LIST_HEAD(&tegra->clients);
> >>         drm->dev_private = tegra;
> >> @@ -71,6 +82,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
> >>  static int tegra_drm_unload(struct drm_device *drm)
> >>  {
> >>         struct host1x_device *device = to_host1x_device(drm->dev);
> >> +       struct tegra_drm *tegra = drm->dev_private;
> >>         int err;
> >>
> >>         drm_kms_helper_poll_fini(drm);
> >> @@ -82,6 +94,11 @@ static int tegra_drm_unload(struct drm_device *drm)
> >>         if (err < 0)
> >>                 return err;
> >>
> >> +       if (tegra->domain) {
> >> +               iommu_domain_free(tegra->domain);
> >> +               drm_mm_takedown(&tegra->mm);
> >> +       }
> >> +
> >>         return 0;
> >>  }
> >>
> >> diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
> >> index 96d754e7b3eb..a07c796b7edc 100644
> >> --- a/drivers/gpu/drm/tegra/drm.h
> >> +++ b/drivers/gpu/drm/tegra/drm.h
> >> @@ -39,6 +39,9 @@ struct tegra_fbdev {
> >>  struct tegra_drm {
> >>         struct drm_device *drm;
> >>
> >> +       struct iommu_domain *domain;
> >> +       struct drm_mm mm;
> >> +
> >>         struct mutex clients_lock;
> >>         struct list_head clients;
> >>
> >> diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c
> >> index 7790d43ad082..21c65dd817c3 100644
> >> --- a/drivers/gpu/drm/tegra/fb.c
> >> +++ b/drivers/gpu/drm/tegra/fb.c
> >> @@ -65,8 +65,12 @@ static void tegra_fb_destroy(struct drm_framebuffer *framebuffer)
> >>         for (i = 0; i < fb->num_planes; i++) {
> >>                 struct tegra_bo *bo = fb->planes[i];
> >>
> >> -               if (bo)
> >> +               if (bo) {
> >> +                       if (bo->pages && bo->virt)
> >> +                               vunmap(bo->virt);
> >> +
> >>                         drm_gem_object_unreference_unlocked(&bo->gem);
> >> +               }
> >>         }
> >>
> >>         drm_framebuffer_cleanup(framebuffer);
> >> @@ -252,6 +256,16 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
> >>         offset = info->var.xoffset * bytes_per_pixel +
> >>                  info->var.yoffset * fb->pitches[0];
> >>
> >> +       if (bo->pages) {
> >> +               bo->vaddr = vmap(bo->pages, bo->num_pages, VM_MAP,
> >> +                                pgprot_writecombine(PAGE_KERNEL));
> >> +               if (!bo->vaddr) {
> >> +                       dev_err(drm->dev, "failed to vmap() framebuffer\n");
> >> +                       err = -ENOMEM;
> >> +                       goto destroy;
> >> +               }
> >> +       }
> >> +
> >>         drm->mode_config.fb_base = (resource_size_t)bo->paddr;
> >>         info->screen_base = (void __iomem *)bo->vaddr + offset;
> >>         info->screen_size = size;
> >> diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
> >> index c1e4e8b6e5ca..2912e61a2599 100644
> >> --- a/drivers/gpu/drm/tegra/gem.c
> >> +++ b/drivers/gpu/drm/tegra/gem.c
> >> @@ -14,8 +14,10 @@
> >>   */
> >>
> >>  #include <linux/dma-buf.h>
> >> +#include <linux/iommu.h>
> >>  #include <drm/tegra_drm.h>
> >>
> >> +#include "drm.h"
> >>  #include "gem.h"
> >>
> >>  static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo)
> >> @@ -90,14 +92,144 @@ static const struct host1x_bo_ops tegra_bo_ops = {
> >>         .kunmap = tegra_bo_kunmap,
> >>  };
> >>
> >> +static int iommu_map_sg(struct iommu_domain *domain, struct sg_table *sgt,
> >> +                       dma_addr_t iova, int prot)
> >> +{
> >> +       unsigned long offset = 0;
> >> +       struct scatterlist *sg;
> >> +       unsigned int i, j;
> >> +       int err;
> >> +
> >> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> >> +               dma_addr_t phys = sg_phys(sg);
> >> +               size_t length = sg->offset;
> >> +
> >> +               phys = sg_phys(sg) - sg->offset;
> >> +               length = sg->length + sg->offset;
> >> +
> >> +               err = iommu_map(domain, iova + offset, phys, length, prot);
> >> +               if (err < 0)
> >> +                       goto unmap;
> >> +
> >> +               offset += length;
> >> +       }
> >> +
> >> +       return 0;
> >> +
> >> +unmap:
> >> +       offset = 0;
> >> +
> >> +       for_each_sg(sgt->sgl, sg, i, j) {
> >> +               size_t length = sg->length + sg->offset;
> >> +               iommu_unmap(domain, iova + offset, length);
> >> +               offset += length;
> >> +       }
> >> +
> >> +       return err;
> >> +}
> >> +
> >> +static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
> >> +                         dma_addr_t iova)
> >> +{
> >> +       unsigned long offset = 0;
> >> +       struct scatterlist *sg;
> >> +       unsigned int i;
> >> +
> >> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> >> +               dma_addr_t phys = sg_phys(sg);
> >> +               size_t length = sg->offset;
> >> +
> >> +               phys = sg_phys(sg) - sg->offset;
> >> +               length = sg->length + sg->offset;
> >> +
> >> +               iommu_unmap(domain, iova + offset, length);
> >> +               offset += length;
> >> +       }
> >> +
> >> +       return 0;
> >> +}
> >> +
> >> +static int tegra_bo_iommu_map(struct tegra_drm *tegra, struct tegra_bo *bo)
> >> +{
> >> +       int prot = IOMMU_READ | IOMMU_WRITE;
> >> +       int err;
> >> +
> >> +       if (bo->mm)
> >> +               return -EBUSY;
> >> +
> >> +       bo->mm = kzalloc(sizeof(*bo->mm), GFP_KERNEL);
> >> +       if (!bo->mm)
> >> +               return -ENOMEM;
> >> +
> >> +       err = drm_mm_insert_node_generic(&tegra->mm, bo->mm, bo->gem.size,
> >> +                                        PAGE_SIZE, 0, 0, 0);
> >> +       if (err < 0) {
> >> +               dev_err(tegra->drm->dev, "out of virtual memory: %d\n", err);
> >> +               return err;
> >> +       }
> >> +
> >> +       bo->paddr = bo->mm->start;
> >> +
> >> +       err = iommu_map_sg(tegra->domain, bo->sgt, bo->paddr, prot);
> >> +       if (err < 0) {
> >> +               dev_err(tegra->drm->dev, "failed to map buffer: %d\n", err);
> >> +               return err;
> >> +       }
> >> +
> >> +       return 0;
> >> +}
> >> +
> >> +static int tegra_bo_iommu_unmap(struct tegra_drm *tegra, struct tegra_bo *bo)
> >> +{
> >> +       if (!bo->mm)
> >> +               return 0;
> >> +
> >> +       iommu_unmap_sg(tegra->domain, bo->sgt, bo->paddr);
> >> +       drm_mm_remove_node(bo->mm);
> >> +
> >> +       kfree(bo->mm);
> >> +       return 0;
> >> +}
> >> +
> >>  static void tegra_bo_destroy(struct drm_device *drm, struct tegra_bo *bo)
> >>  {
> >> -       dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr, bo->paddr);
> >> +       if (!bo->pages)
> >> +               dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr,
> >> +                                     bo->paddr);
> 
> One more thing. If tegra_bo_alloc fails, we'll have bo->vaddr == NULL
> and bo->paddr == ~0 here, which causes a crash.
> 
> I posted https://lkml.org/lkml/2014/9/30/659 to check for the error
> condition in the mm code, but it seems like reviewer consensus is to
> check for this before calling free.
> 
> As such, we'll need to make sure bo->vaddr != NULL before calling
> dma_free_writecombine to avoid this situation.
> 
> Would you prefer I send a patch up to fix this separately, or would
> you like to roll this into your next version?

Thanks for pointing all of these out. I'm going to trace the failure
code path anyway, since there seem to be a couple of loose ends here and
there, so I'll probably roll a fix for this into the next version.

Thierry

^ permalink raw reply	[flat|nested] 133+ messages in thread

* [RFC 09/10] drm/tegra: Add IOMMU support
@ 2014-10-02  8:39                 ` Thierry Reding
  0 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-10-02  8:39 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Oct 01, 2014 at 11:54:11AM -0400, Sean Paul wrote:
> On Tue, Sep 30, 2014 at 2:48 PM, Sean Paul <seanpaul@google.com> wrote:
> > On Thu, Jun 26, 2014 at 4:49 PM, Thierry Reding
> > <thierry.reding@gmail.com> wrote:
> >> From: Thierry Reding <treding@nvidia.com>
> >>
> >> When an IOMMU device is available on the platform bus, allocate an IOMMU
> >> domain and attach the display controllers to it. The display controllers
> >> can then scan out non-contiguous buffers by mapping them through the
> >> IOMMU.
> >>
> >
> > Hi Thierry,
> > A few comments from Stéphane and myself that came up while we were
> > reviewing this for our tree.
> >
> >> Signed-off-by: Thierry Reding <treding@nvidia.com>
> >> ---
> >>  drivers/gpu/drm/tegra/dc.c  |  21 ++++
> >>  drivers/gpu/drm/tegra/drm.c |  17 ++++
> >>  drivers/gpu/drm/tegra/drm.h |   3 +
> >>  drivers/gpu/drm/tegra/fb.c  |  16 ++-
> >>  drivers/gpu/drm/tegra/gem.c | 236 +++++++++++++++++++++++++++++++++++++++-----
> >>  drivers/gpu/drm/tegra/gem.h |   4 +
> >>  6 files changed, 273 insertions(+), 24 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
> >> index afcca04f5367..0f7452d04811 100644
> >> --- a/drivers/gpu/drm/tegra/dc.c
> >> +++ b/drivers/gpu/drm/tegra/dc.c
> >> @@ -9,6 +9,7 @@
> >>
> >>  #include <linux/clk.h>
> >>  #include <linux/debugfs.h>
> >> +#include <linux/iommu.h>
> >>  #include <linux/reset.h>
> >>
> >>  #include "dc.h"
> >> @@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
> >>  {
> >>         struct drm_device *drm = dev_get_drvdata(client->parent);
> >>         struct tegra_dc *dc = host1x_client_to_dc(client);
> >> +       struct tegra_drm *tegra = drm->dev_private;
> >>         int err;
> >>
> >> +       if (tegra->domain) {
> >> +               err = iommu_attach_device(tegra->domain, dc->dev);
> >> +               if (err < 0) {
> >> +                       dev_err(dc->dev, "failed to attach to IOMMU: %d\n",
> >> +                               err);
> >> +                       return err;
> >> +               }
> >
> > [from Stéphane]
> >
> > shouldn't we call detach in the error paths below?
> >
> >
> >> +       }
> >> +
> >>         drm_crtc_init(drm, &dc->base, &tegra_crtc_funcs);
> >>         drm_mode_crtc_set_gamma_size(&dc->base, 256);
> >>         drm_crtc_helper_add(&dc->base, &tegra_crtc_helper_funcs);
> >> @@ -1318,7 +1329,9 @@ static int tegra_dc_init(struct host1x_client *client)
> >>
> >>  static int tegra_dc_exit(struct host1x_client *client)
> >>  {
> >> +       struct drm_device *drm = dev_get_drvdata(client->parent);
> >>         struct tegra_dc *dc = host1x_client_to_dc(client);
> >> +       struct tegra_drm *tegra = drm->dev_private;
> >>         int err;
> >>
> >>         devm_free_irq(dc->dev, dc->irq, dc);
> >> @@ -1335,6 +1348,8 @@ static int tegra_dc_exit(struct host1x_client *client)
> >>                 return err;
> >>         }
> >>
> >> +       iommu_detach_device(tegra->domain, dc->dev);
> >> +
> >>         return 0;
> >>  }
> >>
> >> @@ -1462,6 +1477,12 @@ static int tegra_dc_probe(struct platform_device *pdev)
> >>                 return -ENXIO;
> >>         }
> >>
> >> +       err = iommu_attach(&pdev->dev);
> >> +       if (err < 0) {
> >> +               dev_err(&pdev->dev, "failed to attach to IOMMU: %d\n", err);
> >> +               return err;
> >> +       }
> >> +
> >>         INIT_LIST_HEAD(&dc->client.list);
> >>         dc->client.ops = &dc_client_ops;
> >>         dc->client.dev = &pdev->dev;
> >> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
> >> index 59736bb810cd..1d2bbafad982 100644
> >> --- a/drivers/gpu/drm/tegra/drm.c
> >> +++ b/drivers/gpu/drm/tegra/drm.c
> >> @@ -8,6 +8,7 @@
> >>   */
> >>
> >>  #include <linux/host1x.h>
> >> +#include <linux/iommu.h>
> >>
> >>  #include "drm.h"
> >>  #include "gem.h"
> >> @@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
> >>         if (!tegra)
> >>                 return -ENOMEM;
> >>
> >> +       if (iommu_present(&platform_bus_type)) {
> >> +               tegra->domain = iommu_domain_alloc(&platform_bus_type);
> >> +               if (IS_ERR(tegra->domain)) {
> >> +                       kfree(tegra);
> >> +                       return PTR_ERR(tegra->domain);
> >> +               }
> >> +
> >> +               drm_mm_init(&tegra->mm, 0, SZ_2G);
> >
> >
> > [from Stéphane]:
> >
> > none of these are freed in the error path below (iommu_domain_free and
> > drm_mm_takedown)
> >
> > also |tegra| isn't freed either?
> >
> >
> >
> >> +       }
> >> +
> >>         mutex_init(&tegra->clients_lock);
> >>         INIT_LIST_HEAD(&tegra->clients);
> >>         drm->dev_private = tegra;
> >> @@ -71,6 +82,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
> >>  static int tegra_drm_unload(struct drm_device *drm)
> >>  {
> >>         struct host1x_device *device = to_host1x_device(drm->dev);
> >> +       struct tegra_drm *tegra = drm->dev_private;
> >>         int err;
> >>
> >>         drm_kms_helper_poll_fini(drm);
> >> @@ -82,6 +94,11 @@ static int tegra_drm_unload(struct drm_device *drm)
> >>         if (err < 0)
> >>                 return err;
> >>
> >> +       if (tegra->domain) {
> >> +               iommu_domain_free(tegra->domain);
> >> +               drm_mm_takedown(&tegra->mm);
> >> +       }
> >> +
> >>         return 0;
> >>  }
> >>
> >> diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
> >> index 96d754e7b3eb..a07c796b7edc 100644
> >> --- a/drivers/gpu/drm/tegra/drm.h
> >> +++ b/drivers/gpu/drm/tegra/drm.h
> >> @@ -39,6 +39,9 @@ struct tegra_fbdev {
> >>  struct tegra_drm {
> >>         struct drm_device *drm;
> >>
> >> +       struct iommu_domain *domain;
> >> +       struct drm_mm mm;
> >> +
> >>         struct mutex clients_lock;
> >>         struct list_head clients;
> >>
> >> diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c
> >> index 7790d43ad082..21c65dd817c3 100644
> >> --- a/drivers/gpu/drm/tegra/fb.c
> >> +++ b/drivers/gpu/drm/tegra/fb.c
> >> @@ -65,8 +65,12 @@ static void tegra_fb_destroy(struct drm_framebuffer *framebuffer)
> >>         for (i = 0; i < fb->num_planes; i++) {
> >>                 struct tegra_bo *bo = fb->planes[i];
> >>
> >> -               if (bo)
> >> +               if (bo) {
> >> +                       if (bo->pages && bo->virt)
> >> +                               vunmap(bo->virt);
> >> +
> >>                         drm_gem_object_unreference_unlocked(&bo->gem);
> >> +               }
> >>         }
> >>
> >>         drm_framebuffer_cleanup(framebuffer);
> >> @@ -252,6 +256,16 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
> >>         offset = info->var.xoffset * bytes_per_pixel +
> >>                  info->var.yoffset * fb->pitches[0];
> >>
> >> +       if (bo->pages) {
> >> +               bo->vaddr = vmap(bo->pages, bo->num_pages, VM_MAP,
> >> +                                pgprot_writecombine(PAGE_KERNEL));
> >> +               if (!bo->vaddr) {
> >> +                       dev_err(drm->dev, "failed to vmap() framebuffer\n");
> >> +                       err = -ENOMEM;
> >> +                       goto destroy;
> >> +               }
> >> +       }
> >> +
> >>         drm->mode_config.fb_base = (resource_size_t)bo->paddr;
> >>         info->screen_base = (void __iomem *)bo->vaddr + offset;
> >>         info->screen_size = size;
> >> diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
> >> index c1e4e8b6e5ca..2912e61a2599 100644
> >> --- a/drivers/gpu/drm/tegra/gem.c
> >> +++ b/drivers/gpu/drm/tegra/gem.c
> >> @@ -14,8 +14,10 @@
> >>   */
> >>
> >>  #include <linux/dma-buf.h>
> >> +#include <linux/iommu.h>
> >>  #include <drm/tegra_drm.h>
> >>
> >> +#include "drm.h"
> >>  #include "gem.h"
> >>
> >>  static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo)
> >> @@ -90,14 +92,144 @@ static const struct host1x_bo_ops tegra_bo_ops = {
> >>         .kunmap = tegra_bo_kunmap,
> >>  };
> >>
> >> +static int iommu_map_sg(struct iommu_domain *domain, struct sg_table *sgt,
> >> +                       dma_addr_t iova, int prot)
> >> +{
> >> +       unsigned long offset = 0;
> >> +       struct scatterlist *sg;
> >> +       unsigned int i, j;
> >> +       int err;
> >> +
> >> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> >> +               dma_addr_t phys = sg_phys(sg);
> >> +               size_t length = sg->offset;
> >> +
> >> +               phys = sg_phys(sg) - sg->offset;
> >> +               length = sg->length + sg->offset;
> >> +
> >> +               err = iommu_map(domain, iova + offset, phys, length, prot);
> >> +               if (err < 0)
> >> +                       goto unmap;
> >> +
> >> +               offset += length;
> >> +       }
> >> +
> >> +       return 0;
> >> +
> >> +unmap:
> >> +       offset = 0;
> >> +
> >> +       for_each_sg(sgt->sgl, sg, i, j) {
> >> +               size_t length = sg->length + sg->offset;
> >> +               iommu_unmap(domain, iova + offset, length);
> >> +               offset += length;
> >> +       }
> >> +
> >> +       return err;
> >> +}
> >> +
> >> +static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
> >> +                         dma_addr_t iova)
> >> +{
> >> +       unsigned long offset = 0;
> >> +       struct scatterlist *sg;
> >> +       unsigned int i;
> >> +
> >> +       for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> >> +               dma_addr_t phys = sg_phys(sg);
> >> +               size_t length = sg->offset;
> >> +
> >> +               phys = sg_phys(sg) - sg->offset;
> >> +               length = sg->length + sg->offset;
> >> +
> >> +               iommu_unmap(domain, iova + offset, length);
> >> +               offset += length;
> >> +       }
> >> +
> >> +       return 0;
> >> +}
> >> +
> >> +static int tegra_bo_iommu_map(struct tegra_drm *tegra, struct tegra_bo *bo)
> >> +{
> >> +       int prot = IOMMU_READ | IOMMU_WRITE;
> >> +       int err;
> >> +
> >> +       if (bo->mm)
> >> +               return -EBUSY;
> >> +
> >> +       bo->mm = kzalloc(sizeof(*bo->mm), GFP_KERNEL);
> >> +       if (!bo->mm)
> >> +               return -ENOMEM;
> >> +
> >> +       err = drm_mm_insert_node_generic(&tegra->mm, bo->mm, bo->gem.size,
> >> +                                        PAGE_SIZE, 0, 0, 0);
> >> +       if (err < 0) {
> >> +               dev_err(tegra->drm->dev, "out of virtual memory: %d\n", err);
> >> +               return err;
> >> +       }
> >> +
> >> +       bo->paddr = bo->mm->start;
> >> +
> >> +       err = iommu_map_sg(tegra->domain, bo->sgt, bo->paddr, prot);
> >> +       if (err < 0) {
> >> +               dev_err(tegra->drm->dev, "failed to map buffer: %d\n", err);
> >> +               return err;
> >> +       }
> >> +
> >> +       return 0;
> >> +}
> >> +
> >> +static int tegra_bo_iommu_unmap(struct tegra_drm *tegra, struct tegra_bo *bo)
> >> +{
> >> +       if (!bo->mm)
> >> +               return 0;
> >> +
> >> +       iommu_unmap_sg(tegra->domain, bo->sgt, bo->paddr);
> >> +       drm_mm_remove_node(bo->mm);
> >> +
> >> +       kfree(bo->mm);
> >> +       return 0;
> >> +}
> >> +
> >>  static void tegra_bo_destroy(struct drm_device *drm, struct tegra_bo *bo)
> >>  {
> >> -       dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr, bo->paddr);
> >> +       if (!bo->pages)
> >> +               dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr,
> >> +                                     bo->paddr);
> 
> One more thing. If tegra_bo_alloc fails, we'll have bo->vaddr == NULL
> and bo->paddr == ~0 here, which causes a crash.
> 
> I posted https://lkml.org/lkml/2014/9/30/659 to check for the error
> condition in the mm code, but it seems like reviewer consensus is to
> check for this before calling free.
> 
> As such, we'll need to make sure bo->vaddr != NULL before calling
> dma_free_writecombine to avoid this situation.
> 
> Would you prefer I send a patch up to fix this separately, or would
> you like to roll this into your next version?

Thanks for pointing all of these out. I'm going to trace the failure
code path anyway, since there seem to be a couple of loose ends here and
there, so I'll probably roll a fix for this into the next version.

Thierry

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 09/10] drm/tegra: Add IOMMU support
  2014-10-01 15:54             ` Sean Paul
  (?)
@ 2014-11-05  9:50                 ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-11-05  9:50 UTC (permalink / raw)
  To: Sean Paul
  Cc: Mark Rutland, Will Deacon, Paul Walmsley, Pawel Moll,
	Ian Campbell, Marc Zyngier, Dave Martin,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Arnd Bergmann, Stephen Warren,
	Grant Grundler, Allen Martin, Rob Herring,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA, Cho KyongHo,
	Linux ARM Kernel, Stéphane Marchesin,
	Linux Kernel Mailing List, Linux IOMMU, Kumar Gala,
	Rhyland Klein


[-- Attachment #1.1: Type: text/plain, Size: 1359 bytes --]

On Wed, Oct 01, 2014 at 11:54:11AM -0400, Sean Paul wrote:
> On Tue, Sep 30, 2014 at 2:48 PM, Sean Paul <seanpaul-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
> > On Thu, Jun 26, 2014 at 4:49 PM, Thierry Reding <thierry.reding@gmail.com> wrote:
> >> diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
[...]
> >>  static void tegra_bo_destroy(struct drm_device *drm, struct tegra_bo *bo)
> >>  {
> >> -       dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr, bo->paddr);
> >> +       if (!bo->pages)
> >> +               dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr,
> >> +                                     bo->paddr);
> 
> One more thing. If tegra_bo_alloc fails, we'll have bo->vaddr == NULL
> and bo->paddr == ~0 here, which causes a crash.
> 
> I posted https://lkml.org/lkml/2014/9/30/659 to check for the error
> condition in the mm code, but it seems like reviewer consensus is to
> check for this before calling free.
> 
> As such, we'll need to make sure bo->vaddr != NULL before calling
> dma_free_writecombine to avoid this situation.
> 
> Would you prefer I send a patch up to fix this separately, or would
> you like to roll this into your next version?

I've rolled this check into my series because I touch that area of code
anyway.
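
As a rough illustration of the check Sean describes (not the code from
the updated series), the guard could look like the sketch below. It is a
self-contained userspace mock-up: struct bo_sketch, fake_dma_free() and
the call counter are stand-ins invented here, assumed only to mirror the
pages/vaddr/paddr fields of tegra_bo and the dma_free_writecombine()
call in the quoted hunk.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few tegra_bo fields involved. */
struct bo_sketch {
	void **pages;     /* non-NULL when backed by discrete pages (IOMMU case) */
	void *vaddr;      /* NULL if the contiguous allocation never succeeded */
	uintptr_t paddr;  /* ~0 when the allocation failed */
	size_t size;
};

/* Counts calls so the behaviour is observable. */
static int fake_dma_free_calls;

/* Stand-in for dma_free_writecombine(); it only records the call. */
static void fake_dma_free(size_t size, void *vaddr, uintptr_t paddr)
{
	(void)size;
	(void)vaddr;
	(void)paddr;
	fake_dma_free_calls++;
}

/* Release the contiguous buffer only if it was actually allocated. */
static void bo_destroy_sketch(struct bo_sketch *bo)
{
	if (!bo->pages && bo->vaddr)
		fake_dma_free(bo->size, bo->vaddr, bo->paddr);
}
```

With vaddr == NULL and paddr == ~0, as after a failed allocation, the
free call is skipped; a buffer with a valid vaddr is freed as before.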

Thanks for bringing it up.

Thierry

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 09/10] drm/tegra: Add IOMMU support
@ 2014-11-05  9:50                 ` Thierry Reding
  0 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-11-05  9:50 UTC (permalink / raw)
  To: Sean Paul
  Cc: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Stephen Warren, Arnd Bergmann, Will Deacon, Joerg Roedel,
	Cho KyongHo, Grant Grundler, Dave Martin, Marc Zyngier,
	Hiroshi Doyu, Olav Haugan, Paul Walmsley, Rhyland Klein,
	Allen Martin, devicetree, Linux IOMMU, Linux ARM Kernel,
	linux-tegra, Linux Kernel Mailing List, Stéphane Marchesin

[-- Attachment #1: Type: text/plain, Size: 1330 bytes --]

On Wed, Oct 01, 2014 at 11:54:11AM -0400, Sean Paul wrote:
> On Tue, Sep 30, 2014 at 2:48 PM, Sean Paul <seanpaul@google.com> wrote:
> > On Thu, Jun 26, 2014 at 4:49 PM, Thierry Reding <thierry.reding@gmail.com> wrote:
> >> diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
[...]
> >>  static void tegra_bo_destroy(struct drm_device *drm, struct tegra_bo *bo)
> >>  {
> >> -       dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr, bo->paddr);
> >> +       if (!bo->pages)
> >> +               dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr,
> >> +                                     bo->paddr);
> 
> One more thing. If tegra_bo_alloc fails, we'll have bo->vaddr == NULL
> and bo->paddr == ~0 here, which causes a crash.
> 
> I posted https://lkml.org/lkml/2014/9/30/659 to check for the error
> condition in the mm code, but it seems like reviewer consensus is to
> check for this before calling free.
> 
> As such, we'll need to make sure bo->vaddr != NULL before calling
> dma_free_writecombine to avoid this situation.
> 
> Would you prefer I send a patch up to fix this separately, or would
> you like to roll this into your next version?

I've rolled this check into my series because I touch that area of code
anyway.

Thanks for bringing it up.

Thierry

^ permalink raw reply	[flat|nested] 133+ messages in thread

* [RFC 09/10] drm/tegra: Add IOMMU support
@ 2014-11-05  9:50                 ` Thierry Reding
  0 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-11-05  9:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Oct 01, 2014 at 11:54:11AM -0400, Sean Paul wrote:
> On Tue, Sep 30, 2014 at 2:48 PM, Sean Paul <seanpaul@google.com> wrote:
> > On Thu, Jun 26, 2014 at 4:49 PM, Thierry Reding <thierry.reding@gmail.com> wrote:
> >> diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
[...]
> >>  static void tegra_bo_destroy(struct drm_device *drm, struct tegra_bo *bo)
> >>  {
> >> -       dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr, bo->paddr);
> >> +       if (!bo->pages)
> >> +               dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr,
> >> +                                     bo->paddr);
> 
> One more thing. If tegra_bo_alloc fails, we'll have bo->vaddr == NULL
> and bo->paddr == ~0 here, which causes a crash.
> 
> I posted https://lkml.org/lkml/2014/9/30/659 to check for the error
> condition in the mm code, but it seems like reviewer consensus is to
> check for this before calling free.
> 
> As such, we'll need to make sure bo->vaddr != NULL before calling
> dma_free_writecombine to avoid this situation.
> 
> Would you prefer I send a patch up to fix this separately, or would
> you like to roll this into your next version?

I've rolled this check into my series because I touch that area of code
anyway.

Thanks for bringing it up.

Thierry

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC 09/10] drm/tegra: Add IOMMU support
  2014-09-30 18:48         ` Sean Paul
  (?)
@ 2014-11-05 10:26             ` Thierry Reding
  -1 siblings, 0 replies; 133+ messages in thread
From: Thierry Reding @ 2014-11-05 10:26 UTC (permalink / raw)
  To: Sean Paul
  Cc: Mark Rutland, Will Deacon, Paul Walmsley, Pawel Moll,
	Ian Campbell, Marc Zyngier, Dave Martin,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Arnd Bergmann, Stephen Warren,
	Grant Grundler, Allen Martin, Rob Herring,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA, Cho KyongHo,
	Linux ARM Kernel, Stéphane Marchesin,
	Linux Kernel Mailing List, Linux IOMMU, Kumar Gala,
	Rhyland Klein


[-- Attachment #1.1: Type: text/plain, Size: 4666 bytes --]

On Tue, Sep 30, 2014 at 02:48:35PM -0400, Sean Paul wrote:
> On Thu, Jun 26, 2014 at 4:49 PM, Thierry Reding
> <thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > From: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> >
> > When an IOMMU device is available on the platform bus, allocate an IOMMU
> > domain and attach the display controllers to it. The display controllers
> > can then scan out non-contiguous buffers by mapping them through the
> > IOMMU.
> >
> 
> Hi Thierry,
> A few comments from Stéphane and myself that came up while we were
> reviewing this for our tree.

I just realized that I hadn't integrated these comments completely yet,
but I've done so now in my local tree. I'm running a couple of tests to
verify that it's all handled correctly.

> > Signed-off-by: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> > ---
> >  drivers/gpu/drm/tegra/dc.c  |  21 ++++
> >  drivers/gpu/drm/tegra/drm.c |  17 ++++
> >  drivers/gpu/drm/tegra/drm.h |   3 +
> >  drivers/gpu/drm/tegra/fb.c  |  16 ++-
> >  drivers/gpu/drm/tegra/gem.c | 236 +++++++++++++++++++++++++++++++++++++++-----
> >  drivers/gpu/drm/tegra/gem.h |   4 +
> >  6 files changed, 273 insertions(+), 24 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
> > index afcca04f5367..0f7452d04811 100644
> > --- a/drivers/gpu/drm/tegra/dc.c
> > +++ b/drivers/gpu/drm/tegra/dc.c
> > @@ -9,6 +9,7 @@
> >
> >  #include <linux/clk.h>
> >  #include <linux/debugfs.h>
> > +#include <linux/iommu.h>
> >  #include <linux/reset.h>
> >
> >  #include "dc.h"
> > @@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
> >  {
> >         struct drm_device *drm = dev_get_drvdata(client->parent);
> >         struct tegra_dc *dc = host1x_client_to_dc(client);
> > +       struct tegra_drm *tegra = drm->dev_private;
> >         int err;
> >
> > +       if (tegra->domain) {
> > +               err = iommu_attach_device(tegra->domain, dc->dev);
> > +               if (err < 0) {
> > +                       dev_err(dc->dev, "failed to attach to IOMMU: %d\n",
> > +                               err);
> > +                       return err;
> > +               }
> 
> [from Stéphane]
> 
> shouldn't we call detach in the error paths below?

This was mostly rewritten for universal plane support, but I've made
sure that the DC properly detaches from the IOMMU in case of failure
during the code below.
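
For illustration only (the reworked code in the local tree is not shown
in this thread), the attach/detach pairing discussed here could be
sketched roughly as follows. Every name in the sketch (fake_domain,
fake_dev, fake_attach, fake_detach, fake_setup, client_init) is a
hypothetical stand-in and not part of the IOMMU or DRM APIs; it only
shows a later failure undoing an earlier attach.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for an IOMMU domain and a client device. */
struct fake_domain { int attached; };
struct fake_dev { bool fail_init; };

static int fake_attach(struct fake_domain *d, struct fake_dev *dev)
{
	assert(d != NULL && dev != NULL);
	d->attached++;
	return 0;
}

static void fake_detach(struct fake_domain *d, struct fake_dev *dev)
{
	assert(d != NULL && dev != NULL);
	d->attached--;
}

/* Stand-in for the setup steps that follow the attach. */
static int fake_setup(struct fake_dev *dev)
{
	return dev->fail_init ? -1 : 0;
}

/* Attach if a domain exists; undo the attach if a later step fails. */
static int client_init(struct fake_domain *d, struct fake_dev *dev)
{
	int err;

	if (d) {
		err = fake_attach(d, dev);
		if (err < 0)
			return err;
	}

	err = fake_setup(dev);
	if (err < 0)
		goto detach;

	return 0;

detach:
	if (d)
		fake_detach(d, dev);
	return err;
}
```

The detach is guarded by the same condition as the attach, so the error
path is also safe when no domain was allocated.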

> > diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
[...]
> > @@ -8,6 +8,7 @@
> >   */
> >
> >  #include <linux/host1x.h>
> > +#include <linux/iommu.h>
> >
> >  #include "drm.h"
> >  #include "gem.h"
> > @@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
> >         if (!tegra)
> >                 return -ENOMEM;
> >
> > +       if (iommu_present(&platform_bus_type)) {
> > +               tegra->domain = iommu_domain_alloc(&platform_bus_type);
> > +               if (IS_ERR(tegra->domain)) {
> > +                       kfree(tegra);
> > +                       return PTR_ERR(tegra->domain);
> > +               }
> > +
> > +               drm_mm_init(&tegra->mm, 0, SZ_2G);
> 
> 
> [from Stéphane]:
> 
> none of these are freed in the error path below (iommu_domain_free and
> drm_mm_takedown)
> 
> also |tegra| isn't freed either?

None of the resources were actually being cleaned up, but I think I have
it all handled properly now.
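
As a rough sketch of what such cleanup might look like (this is not the
actual fix, which is not posted in this thread), resources can be
released in reverse order of acquisition through goto labels. Note also
that the quoted hunk calls kfree(tegra) before reading tegra->domain for
PTR_ERR(); the sketch records the error code before freeing. All names
below (fake_state, demo_alloc, demo_free, state_load, fail_at) are
hypothetical and only simulate the three steps.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are the DRM or IOMMU APIs. */
struct fake_state {
	void *domain;   /* stands in for the IOMMU domain */
	int mm_ready;   /* stands in for the address-space allocator */
};

static int live_allocs;   /* outstanding allocations, for the demo only */

static void *demo_alloc(size_t n)
{
	void *p = calloc(1, n);

	if (p)
		live_allocs++;
	return p;
}

static void demo_free(void *p)
{
	assert(p != NULL);
	free(p);
	live_allocs--;
}

/* fail_at selects the failing step: 0 = none, 1 = domain, 2 = later step. */
static int state_load(struct fake_state **out, int fail_at)
{
	struct fake_state *s;
	int err;

	s = demo_alloc(sizeof(*s));
	if (!s)
		return -ENOMEM;

	s->domain = (fail_at == 1) ? NULL : demo_alloc(16);
	if (!s->domain) {
		err = -ENOMEM;   /* record the error before freeing s */
		goto free_state;
	}

	s->mm_ready = 1;

	if (fail_at == 2) {
		err = -EINVAL;
		goto takedown;
	}

	*out = s;
	return 0;

takedown:
	s->mm_ready = 0;
	demo_free(s->domain);
free_state:
	demo_free(s);
	return err;
}
```

Each label undoes exactly the steps that had succeeded before the
failure, so no path leaks the state structure or the domain.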

> > @@ -108,22 +240,33 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
> >         host1x_bo_init(&bo->base, &tegra_bo_ops);
> >         size = round_up(size, PAGE_SIZE);
> >
> > -       bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
> > -                                          GFP_KERNEL | __GFP_NOWARN);
> > -       if (!bo->vaddr) {
> > -               dev_err(drm->dev, "failed to allocate buffer with size %u\n",
> > -                       size);
> > -               err = -ENOMEM;
> > -               goto err_dma;
> > -       }
> > -
> >         err = drm_gem_object_init(drm, &bo->gem, size);
> >         if (err)
> > -               goto err_init;
> > +               goto free;
> >
> >         err = drm_gem_create_mmap_offset(&bo->gem);
> 
> We need to call drm_gem_free_mmap_offset if one of the calls below
> fails, otherwise we'll try to free the mmap_offset on an already
> destroyed bo.

drm_gem_object_release() (below) already calls drm_gem_free_mmap_offset()
for us implicitly.
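
The point can be illustrated with a small model, assuming a release
function that also tears down the sub-resource. The names fake_obj,
fake_obj_init, fake_create_offset, fake_obj_release and fake_obj_create
are hypothetical and do not correspond to the DRM GEM functions; the
sketch only shows why a single release call is enough on the error path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object with an optional sub-resource (the "offset"). */
struct fake_obj {
	bool initialized;
	bool has_offset;
};

static int fake_obj_init(struct fake_obj *o)
{
	assert(o != NULL);
	o->initialized = true;
	o->has_offset = false;
	return 0;
}

static int fake_create_offset(struct fake_obj *o)
{
	o->has_offset = true;
	return 0;
}

/* Release tears down the sub-resource too, so callers need one call. */
static void fake_obj_release(struct fake_obj *o)
{
	assert(o != NULL);
	o->has_offset = false;   /* implicit "free offset" step */
	o->initialized = false;
}

/* fail_late simulates a failure after the offset has been created. */
static int fake_obj_create(struct fake_obj *o, bool fail_late)
{
	int err;

	err = fake_obj_init(o);
	if (err < 0)
		return err;

	err = fake_create_offset(o);
	if (err < 0)
		goto release;

	if (fail_late) {
		err = -1;
		goto release;
	}

	return 0;

release:
	fake_obj_release(o);
	return err;
}
```

Because the release step clears the offset itself, an extra explicit
free on the error path would be redundant in this model.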

Thierry

end of thread, other threads:[~2014-11-05 10:27 UTC | newest]

Thread overview: 133+ messages (download: mbox.gz / follow: Atom feed)
2014-06-26 20:49 [RFC 00/10] Add NVIDIA Tegra124 IOMMU support Thierry Reding
2014-06-26 20:49 ` Thierry Reding
2014-06-26 20:49 ` Thierry Reding
     [not found] ` <1403815790-8548-1-git-send-email-thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-06-26 20:49   ` [RFC 01/10] iommu: Add IOMMU device registry Thierry Reding
2014-06-26 20:49     ` Thierry Reding
2014-06-26 20:49     ` Thierry Reding
     [not found]     ` <1403815790-8548-2-git-send-email-thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-06-27  6:58       ` Thierry Reding
2014-06-27  6:58         ` Thierry Reding
2014-06-27  6:58         ` Thierry Reding
2014-07-03 10:37         ` Varun Sethi
2014-07-03 10:37           ` Varun Sethi
2014-07-03 10:37           ` Varun Sethi
2014-07-04 11:05       ` Joerg Roedel
2014-07-04 11:05         ` Joerg Roedel
2014-07-04 11:05         ` Joerg Roedel
     [not found]         ` <20140704110529.GF13434-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2014-07-04 13:47           ` Thierry Reding
2014-07-04 13:47             ` Thierry Reding
2014-07-04 13:47             ` Thierry Reding
2014-07-04 13:49             ` Will Deacon
2014-07-04 13:49               ` Will Deacon
2014-07-04 13:49               ` Will Deacon
     [not found]               ` <20140704134928.GA25714-5wv7dgnIgG8@public.gmane.org>
2014-07-06 18:17                 ` Arnd Bergmann
2014-07-06 18:17                   ` Arnd Bergmann
2014-07-06 18:17                   ` Arnd Bergmann
     [not found]                   ` <201407062017.23049.arnd-r2nGTMty4D4@public.gmane.org>
2014-07-07 11:42                     ` Thierry Reding
2014-07-07 11:42                       ` Thierry Reding
2014-07-07 11:42                       ` Thierry Reding
2014-06-26 20:49   ` [PATCH v3 02/10] devicetree: Add generic IOMMU device tree bindings Thierry Reding
2014-06-26 20:49     ` Thierry Reding
2014-06-26 20:49     ` Thierry Reding
     [not found]     ` <1403815790-8548-3-git-send-email-thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-06-27 13:55       ` Will Deacon
2014-06-27 13:55         ` Will Deacon
2014-06-27 13:55         ` Will Deacon
2014-06-30 22:24       ` Stephen Warren
2014-06-30 22:24         ` Stephen Warren
2014-06-30 22:24         ` Stephen Warren
2014-07-04  6:42       ` Varun Sethi
2014-07-04  6:42         ` Varun Sethi
2014-07-04  6:42         ` Varun Sethi
     [not found]         ` <9ffe3c3871ef4b60a955259bfa0bed6c-AZ66ij2kwaacCcN9WK45f+O6mTEJWrR4XA4E9RH9d+qIuWR1G4zioA@public.gmane.org>
2014-07-04  9:05           ` Arnd Bergmann
2014-07-04  9:05             ` Arnd Bergmann
2014-07-04  9:05             ` Arnd Bergmann
2014-06-26 20:49   ` [RFC 03/10] of: Add NVIDIA Tegra124 memory controller binding Thierry Reding
2014-06-26 20:49     ` Thierry Reding
2014-06-26 20:49     ` Thierry Reding
2014-06-26 20:49   ` [RFC 04/10] memory: Add Tegra124 memory controller support Thierry Reding
2014-06-26 20:49     ` Thierry Reding
2014-06-26 20:49     ` Thierry Reding
     [not found]     ` <1403815790-8548-5-git-send-email-thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-06-27  7:41       ` Joseph Lo
2014-06-27  7:41         ` Joseph Lo
2014-06-27  7:41         ` Joseph Lo
     [not found]         ` <53AD2020.1050802-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2014-06-27  8:17           ` Thierry Reding
2014-06-27  8:17             ` Thierry Reding
2014-06-27  8:17             ` Thierry Reding
2014-06-27  8:24             ` Hiroshi Doyu
2014-06-27  8:24               ` Hiroshi Doyu
2014-06-27  9:46       ` Hiroshi DOyu
2014-06-27  9:46         ` Hiroshi DOyu
2014-06-27  9:46         ` Hiroshi DOyu
     [not found]         ` <20140627124638.7ec150cca163c89727b8953f-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2014-06-27 11:08           ` Thierry Reding
2014-06-27 11:08             ` Thierry Reding
2014-06-27 11:08             ` Thierry Reding
2014-06-27 21:33             ` Stephen Warren
2014-06-27 21:33               ` Stephen Warren
2014-06-27 21:33               ` Stephen Warren
2014-06-27 11:07       ` Arnd Bergmann
2014-06-27 11:07         ` Arnd Bergmann
2014-06-27 11:07         ` Arnd Bergmann
2014-06-27 11:15         ` Thierry Reding
2014-06-27 11:15           ` Thierry Reding
2014-06-27 11:15           ` Thierry Reding
2014-06-27 21:37           ` Stephen Warren
2014-06-27 21:37             ` Stephen Warren
2014-06-27 21:37             ` Stephen Warren
2014-06-30 22:43       ` Stephen Warren
2014-06-30 22:43         ` Stephen Warren
2014-06-30 22:43         ` Stephen Warren
2014-07-01 12:14       ` Hiroshi Doyu
2014-07-01 12:14         ` Hiroshi Doyu
2014-07-01 12:14         ` Hiroshi Doyu
2014-06-27 13:29     ` Mikko Perttunen
2014-06-27 13:29       ` Mikko Perttunen
2014-06-27 13:29       ` Mikko Perttunen
2014-06-26 20:49   ` [RFC 05/10] ARM: tegra: Add memory controller on Tegra124 Thierry Reding
2014-06-26 20:49     ` Thierry Reding
2014-06-26 20:49     ` Thierry Reding
2014-06-26 20:49   ` [RFC 06/10] ARM: tegra: tegra124: Enable IOMMU for display controllers Thierry Reding
2014-06-26 20:49     ` Thierry Reding
2014-06-26 20:49     ` Thierry Reding
2014-06-26 20:49   ` [RFC 07/10] ARM: tegra: tegra124: Enable IOMMU for SDMMC controllers Thierry Reding
2014-06-26 20:49     ` Thierry Reding
2014-06-26 20:49     ` Thierry Reding
2014-06-26 20:49   ` [RFC 08/10] ARM: tegra: Select ARM_DMA_USE_IOMMU Thierry Reding
2014-06-26 20:49     ` Thierry Reding
2014-06-26 20:49     ` Thierry Reding
2014-06-26 20:49   ` [RFC 09/10] drm/tegra: Add IOMMU support Thierry Reding
2014-06-26 20:49     ` Thierry Reding
2014-06-26 20:49     ` Thierry Reding
     [not found]     ` <1403815790-8548-10-git-send-email-thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-06-27  9:46       ` Hiroshi DOyu
2014-06-27  9:46         ` Hiroshi DOyu
2014-06-27  9:46         ` Hiroshi DOyu
     [not found]         ` <20140627124614.050be2e406a4b9a02d9fe97c-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2014-06-27 10:54           ` Arnd Bergmann
2014-06-27 10:54             ` Arnd Bergmann
2014-06-27 10:54             ` Arnd Bergmann
2014-06-27 11:03             ` Hiroshi Doyu
2014-06-27 11:03               ` Hiroshi Doyu
2014-06-27 10:58           ` Thierry Reding
2014-06-27 10:58             ` Thierry Reding
2014-06-27 10:58             ` Thierry Reding
2014-09-30 18:48       ` Sean Paul
2014-09-30 18:48         ` Sean Paul
2014-09-30 18:48         ` Sean Paul
     [not found]         ` <CAOw6vbJy6oy7cibH4f332UM=kS56KUMcnYdUTG0pEYXyQkFDoQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-10-01 15:54           ` Sean Paul
2014-10-01 15:54             ` Sean Paul
2014-10-01 15:54             ` Sean Paul
     [not found]             ` <CAOw6vbLFLrqWYB-4N50G7oucgMD+xd+QtdcMSzX4z7xRiU-vPQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-10-02  8:39               ` Thierry Reding
2014-10-02  8:39                 ` Thierry Reding
2014-10-02  8:39                 ` Thierry Reding
2014-11-05  9:50               ` Thierry Reding
2014-11-05  9:50                 ` Thierry Reding
2014-11-05  9:50                 ` Thierry Reding
2014-11-05 10:26           ` Thierry Reding
2014-11-05 10:26             ` Thierry Reding
2014-11-05 10:26             ` Thierry Reding
2014-06-26 20:49   ` [RFC 10/10] mmc: sdhci-tegra: " Thierry Reding
2014-06-26 20:49     ` Thierry Reding
2014-06-26 20:49     ` Thierry Reding
     [not found]     ` <1403815790-8548-11-git-send-email-thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-06-27  9:46       ` Hiroshi DOyu
2014-06-27  9:46         ` Hiroshi DOyu
2014-06-27  9:46         ` Hiroshi DOyu
     [not found]         ` <20140627124602.53d046dae5d7e269815e56a0-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2014-06-27 11:01           ` Thierry Reding
2014-06-27 11:01             ` Thierry Reding
2014-06-27 11:01             ` Thierry Reding
