linux-kernel.vger.kernel.org archive mirror
* [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization
@ 2018-09-24  0:41 Dmitry Osipenko
  2018-09-24  0:41 ` [PATCH v4 01/20] iommu/tegra: gart: Remove pr_fmt and clean up includes Dmitry Osipenko
                   ` (19 more replies)
  0 siblings, 20 replies; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

Hello,

This patch series integrates the GART (IOMMU) driver with the Memory
Controller driver, which makes it possible to report the name of the
faulting memory client on a GART page fault. The series also performs a
major cleanup of the code and optimizes the performance of the GART driver.

In the earlier v2 iteration Thierry Reding suggested that it is better to
break/change GART's device-tree ABI in order to integrate it with the
Memory Controller without much churn. This series includes patches that
change the device-tree binding such that the updated driver works only
with the new binding, while older kernels aren't broken by the new binding.

Changelog:

v4: In v3 Rob Herring requested that the device-tree binding changes be
    backwards-compatible with older kernels; that is achieved by
    changing the 'compatible' value of the DT node.

    The code-refactoring patches got some more (minor) polish.

    Added new patch "memory: tegra: Use of_device_get_match_data()".

v3: The Memory Controller integration part has been reworked and GART's
    device-tree binding is now changed. Adding Rob Herring to review the
    device-tree changes.

    GART now disallows more than one active domain at a time.

    Fixed "spinlock recursion", "NULL pointer dereference" and "detaching
    of all devices from inactive domains".

    New code-refactoring patches.

    The previously standalone patch "memory: tegra: Don't invoke Tegra30+
    specific memory timing setup on Tegra20" is now included in this
    series because there is a dependency on that patch and it hasn't been
    applied yet.

v2: Addressed review comments from Robin Murphy on v1 by moving the
    device's iommu_fwspec check to gart_iommu_add_device().

    Dropped the "Provide single domain and group for all devices" patch from
    the series for now: after some more consideration it is no longer clear
    that this is what we need, as Robin Murphy also pointed out in his
    review comment. Maybe something like runtime IOMMU usage for devices
    would be a better solution, allowing transparent context switching of
    virtual IOMMU domains to be implemented.

    Some very minor code cleanups, reworded commit messages.

Dmitry Osipenko (20):
  iommu/tegra: gart: Remove pr_fmt and clean up includes
  iommu/tegra: gart: Clean up driver probe errors handling
  iommu/tegra: gart: Ignore devices without IOMMU phandle in DT
  iommu: Introduce iotlb_sync_map callback
  iommu/tegra: gart: Optimize mapping / unmapping performance
  dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc
  ARM: dts: tegra20: Update Memory Controller node to the new binding
  memory: tegra: Don't invoke Tegra30+ specific memory timing setup on
    Tegra20
  memory: tegra: Adapt to Tegra20 device-tree binding changes
  memory: tegra: Read client ID on GART page fault
  memory: tegra: Use of_device_get_match_data()
  iommu/tegra: gart: Integrate with Memory Controller driver
  iommu/tegra: gart: Fix spinlock recursion
  iommu/tegra: gart: Fix NULL pointer dereference
  iommu/tegra: gart: Allow only one active domain at a time
  iommu/tegra: gart: Don't use managed resources
  iommu/tegra: gart: Prepend error/debug messages with "GART:"
  iommu/tegra: gart: Don't detach devices from inactive domains
  iommu/tegra: gart: Simplify clients-tracking code
  iommu/tegra: gart: Perform code refactoring

 .../bindings/iommu/nvidia,tegra20-gart.txt    |  14 -
 .../memory-controllers/nvidia,tegra20-mc.txt  |  27 +-
 arch/arm/boot/dts/tegra20.dtsi                |  15 +-
 drivers/iommu/Kconfig                         |   1 +
 drivers/iommu/iommu.c                         |   8 +-
 drivers/iommu/tegra-gart.c                    | 478 +++++++-----------
 drivers/memory/tegra/mc.c                     |  95 +++-
 drivers/memory/tegra/mc.h                     |   6 -
 include/linux/iommu.h                         |   1 +
 include/soc/tegra/mc.h                        |  29 +-
 10 files changed, 298 insertions(+), 376 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt

-- 
2.19.0


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v4 01/20] iommu/tegra: gart: Remove pr_fmt and clean up includes
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24 10:02   ` Thierry Reding
  2018-09-24  0:41 ` [PATCH v4 02/20] iommu/tegra: gart: Clean up driver probe errors handling Dmitry Osipenko
                   ` (18 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

Remove unneeded header inclusions and sort the headers in alphabetical
order. Remove the pr_fmt macro since there are no pr_*() calls in the code
and it doesn't affect the dev_*() functions.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/iommu/tegra-gart.c | 17 +++++------------
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 7b1361d57a17..6dda7ee1d36c 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -17,21 +17,14 @@
  * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
  */
 
-#define pr_fmt(fmt)	"%s(): " fmt, __func__
-
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/list.h>
 #include <linux/module.h>
-#include <linux/platform_device.h>
-#include <linux/spinlock.h>
+#include <linux/of_device.h>
 #include <linux/slab.h>
+#include <linux/spinlock.h>
 #include <linux/vmalloc.h>
-#include <linux/mm.h>
-#include <linux/list.h>
-#include <linux/device.h>
-#include <linux/io.h>
-#include <linux/iommu.h>
-#include <linux/of.h>
-
-#include <asm/cacheflush.h>
 
 /* bitmap of the page sizes currently supported */
 #define GART_IOMMU_PGSIZES	(SZ_4K)
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v4 02/20] iommu/tegra: gart: Clean up driver probe errors handling
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
  2018-09-24  0:41 ` [PATCH v4 01/20] iommu/tegra: gart: Remove pr_fmt and clean up includes Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24 10:02   ` Thierry Reding
  2018-09-24  0:41 ` [PATCH v4 03/20] iommu/tegra: gart: Ignore devices without IOMMU phandle in DT Dmitry Osipenko
                   ` (17 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

Properly clean up allocated resources on driver probe failure and
remove unneeded checks.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/iommu/tegra-gart.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 6dda7ee1d36c..e9524ed264cf 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -408,9 +408,6 @@ static int tegra_gart_probe(struct platform_device *pdev)
 	struct device *dev = &pdev->dev;
 	int ret;
 
-	if (gart_handle)
-		return -EIO;
-
 	BUILD_BUG_ON(PAGE_SHIFT != GART_PAGE_SHIFT);
 
 	/* the GART memory aperture is required */
@@ -445,8 +442,7 @@ static int tegra_gart_probe(struct platform_device *pdev)
 	ret = iommu_device_register(&gart->iommu);
 	if (ret) {
 		dev_err(dev, "Failed to register IOMMU\n");
-		iommu_device_sysfs_remove(&gart->iommu);
-		return ret;
+		goto remove_sysfs;
 	}
 
 	gart->dev = &pdev->dev;
@@ -460,7 +456,8 @@ static int tegra_gart_probe(struct platform_device *pdev)
 	gart->savedata = vmalloc(array_size(sizeof(u32), gart->page_count));
 	if (!gart->savedata) {
 		dev_err(dev, "failed to allocate context save area\n");
-		return -ENOMEM;
+		ret = -ENOMEM;
+		goto unregister_iommu;
 	}
 
 	platform_set_drvdata(pdev, gart);
@@ -469,6 +466,13 @@ static int tegra_gart_probe(struct platform_device *pdev)
 	gart_handle = gart;
 
 	return 0;
+
+unregister_iommu:
+	iommu_device_unregister(&gart->iommu);
+remove_sysfs:
+	iommu_device_sysfs_remove(&gart->iommu);
+
+	return ret;
 }
 
 static int tegra_gart_remove(struct platform_device *pdev)
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 59+ messages in thread
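
Editorial sketch for patch 02 above: the goto-based unwinding it introduces
can be modelled in plain user-space C. This is only an illustration under
stated assumptions; struct fake_gart, fake_probe() and the fail_step
parameter are hypothetical names that do not exist in the driver. The point
is that each label undoes exactly the steps that succeeded before the
failure, in reverse order of setup.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the probe state; not the driver's struct. */
struct fake_gart {
	bool sysfs_added;
	bool iommu_registered;
	void *savedata;
};

/*
 * Model of a three-step probe. fail_step selects which step fails
 * (1..3), or 0 for success. Teardown runs in reverse order of setup.
 */
static int fake_probe(struct fake_gart *gart, int fail_step)
{
	int ret;

	assert(gart != NULL);

	if (fail_step == 1)
		return -EINVAL;		/* nothing to undo yet */
	gart->sysfs_added = true;

	if (fail_step == 2) {
		ret = -EBUSY;
		goto remove_sysfs;
	}
	gart->iommu_registered = true;

	gart->savedata = fail_step == 3 ? NULL : malloc(64);
	if (!gart->savedata) {
		ret = -ENOMEM;
		goto unregister_iommu;
	}

	return 0;

unregister_iommu:
	gart->iommu_registered = false;
remove_sysfs:
	gart->sysfs_added = false;

	return ret;
}
```

Calling fake_probe() with fail_step set to 3 returns -ENOMEM and leaves both
flags cleared, which mirrors the fix for the vmalloc() failure path that
previously returned without unregistering the IOMMU.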

* [PATCH v4 03/20] iommu/tegra: gart: Ignore devices without IOMMU phandle in DT
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
  2018-09-24  0:41 ` [PATCH v4 01/20] iommu/tegra: gart: Remove pr_fmt and clean up includes Dmitry Osipenko
  2018-09-24  0:41 ` [PATCH v4 02/20] iommu/tegra: gart: Clean up driver probe errors handling Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24 10:05   ` Thierry Reding
  2018-09-24  0:41 ` [PATCH v4 04/20] iommu: Introduce iotlb_sync_map callback Dmitry Osipenko
                   ` (16 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

GART can't handle all devices, hence ignore devices that aren't related
to GART. The IOMMU phandle must be explicitly assigned to devices in the
device tree.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/iommu/tegra-gart.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index e9524ed264cf..f6cf5cd5aaca 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -342,8 +342,12 @@ static bool gart_iommu_capable(enum iommu_cap cap)
 
 static int gart_iommu_add_device(struct device *dev)
 {
-	struct iommu_group *group = iommu_group_get_for_dev(dev);
+	struct iommu_group *group;
 
+	if (!dev->iommu_fwspec)
+		return -ENODEV;
+
+	group = iommu_group_get_for_dev(dev);
 	if (IS_ERR(group))
 		return PTR_ERR(group);
 
@@ -360,6 +364,12 @@ static void gart_iommu_remove_device(struct device *dev)
 	iommu_device_unlink(&gart_handle->iommu, dev);
 }
 
+static int gart_iommu_of_xlate(struct device *dev,
+			       struct of_phandle_args *args)
+{
+	return 0;
+}
+
 static const struct iommu_ops gart_iommu_ops = {
 	.capable	= gart_iommu_capable,
 	.domain_alloc	= gart_iommu_domain_alloc,
@@ -373,6 +383,7 @@ static const struct iommu_ops gart_iommu_ops = {
 	.unmap		= gart_iommu_unmap,
 	.iova_to_phys	= gart_iommu_iova_to_phys,
 	.pgsize_bitmap	= GART_IOMMU_PGSIZES,
+	.of_xlate	= gart_iommu_of_xlate,
 };
 
 static int tegra_gart_suspend(struct device *dev)
@@ -438,6 +449,7 @@ static int tegra_gart_probe(struct platform_device *pdev)
 	}
 
 	iommu_device_set_ops(&gart->iommu, &gart_iommu_ops);
+	iommu_device_set_fwnode(&gart->iommu, dev->fwnode);
 
 	ret = iommu_device_register(&gart->iommu);
 	if (ret) {
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v4 04/20] iommu: Introduce iotlb_sync_map callback
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
                   ` (2 preceding siblings ...)
  2018-09-24  0:41 ` [PATCH v4 03/20] iommu/tegra: gart: Ignore devices without IOMMU phandle in DT Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24 10:06   ` Thierry Reding
  2018-09-24  0:41 ` [PATCH v4 05/20] iommu/tegra: gart: Optimize mapping / unmapping performance Dmitry Osipenko
                   ` (15 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

Introduce the iotlb_sync_map() callback that is invoked at the end of
iommu_map(). This new callback allows IOMMU drivers to avoid syncing
after mapping each contiguous chunk and to sync only once the whole
mapping is completed, optimizing performance of the mapping operation.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/iommu.c | 8 ++++++--
 include/linux/iommu.h | 1 +
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8c15c5980299..8979b16caf61 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1545,13 +1545,14 @@ static size_t iommu_pgsize(struct iommu_domain *domain,
 int iommu_map(struct iommu_domain *domain, unsigned long iova,
 	      phys_addr_t paddr, size_t size, int prot)
 {
+	const struct iommu_ops *ops = domain->ops;
 	unsigned long orig_iova = iova;
 	unsigned int min_pagesz;
 	size_t orig_size = size;
 	phys_addr_t orig_paddr = paddr;
 	int ret = 0;
 
-	if (unlikely(domain->ops->map == NULL ||
+	if (unlikely(ops->map == NULL ||
 		     domain->pgsize_bitmap == 0UL))
 		return -ENODEV;
 
@@ -1580,7 +1581,7 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova,
 		pr_debug("mapping: iova 0x%lx pa %pa pgsize 0x%zx\n",
 			 iova, &paddr, pgsize);
 
-		ret = domain->ops->map(domain, iova, paddr, pgsize, prot);
+		ret = ops->map(domain, iova, paddr, pgsize, prot);
 		if (ret)
 			break;
 
@@ -1589,6 +1590,9 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova,
 		size -= pgsize;
 	}
 
+	if (ops->iotlb_sync_map)
+		ops->iotlb_sync_map(domain);
+
 	/* unroll mapping in case something went wrong */
 	if (ret)
 		iommu_unmap(domain, orig_iova, orig_size - size);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 87994c265bf5..4c488eb69752 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -202,6 +202,7 @@ struct iommu_ops {
 	void (*flush_iotlb_all)(struct iommu_domain *domain);
 	void (*iotlb_range_add)(struct iommu_domain *domain,
 				unsigned long iova, size_t size);
+	void (*iotlb_sync_map)(struct iommu_domain *domain);
 	void (*iotlb_sync)(struct iommu_domain *domain);
 	phys_addr_t (*iova_to_phys)(struct iommu_domain *domain, dma_addr_t iova);
 	int (*add_device)(struct device *dev);
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 59+ messages in thread
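
Editorial sketch for patch 04 above: the effect of the new callback can be
shown with a small user-space model. Everything here (struct fake_ops,
struct fake_domain, fake_iommu_map() and the counters) is a hypothetical
stand-in that only mirrors the shape of the patch; it is not the kernel API.
The model maps a range page by page and calls the optional sync hook once
after the loop, as iommu_map() does after this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; names mirror the patch, not kernel API. */
struct fake_domain;

struct fake_ops {
	int (*map)(struct fake_domain *d, unsigned long iova, size_t pgsize);
	void (*iotlb_sync_map)(struct fake_domain *d);	/* optional hook */
};

struct fake_domain {
	const struct fake_ops *ops;
	unsigned int pte_writes;	/* page entries written */
	unsigned int flushes;		/* sync operations issued */
};

/* Per-page map: writes one entry and deliberately does not flush. */
static int fake_map_page(struct fake_domain *d, unsigned long iova,
			 size_t pgsize)
{
	(void)iova;
	(void)pgsize;
	d->pte_writes++;
	return 0;
}

static void fake_sync(struct fake_domain *d)
{
	d->flushes++;
}

static const struct fake_ops fake_gart_ops = {
	.map		= fake_map_page,
	.iotlb_sync_map	= fake_sync,
};

/* Maps [iova, iova + size) in pgsize chunks, then syncs exactly once. */
static int fake_iommu_map(struct fake_domain *d, unsigned long iova,
			  size_t size, size_t pgsize)
{
	const struct fake_ops *ops = d->ops;
	int ret = 0;

	assert(pgsize != 0 && size % pgsize == 0);

	while (size) {
		ret = ops->map(d, iova, pgsize);
		if (ret)
			break;

		iova += pgsize;
		size -= pgsize;
	}

	if (ops->iotlb_sync_map)
		ops->iotlb_sync_map(d);

	return ret;
}
```

Mapping sixteen 4 KiB pages through fake_iommu_map() performs sixteen entry
writes and a single flush, whereas flushing inside the per-page map would
have issued sixteen flushes.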

* [PATCH v4 05/20] iommu/tegra: gart: Optimize mapping / unmapping performance
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
                   ` (3 preceding siblings ...)
  2018-09-24  0:41 ` [PATCH v4 04/20] iommu: Introduce iotlb_sync_map callback Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24 10:07   ` Thierry Reding
  2018-09-24  0:41 ` [PATCH v4 06/20] dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc Dmitry Osipenko
                   ` (14 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

Currently GART writes one page entry at a time. It is more optimal to
aggregate the writes and flush the BUS buffer at the end; this gives
map/unmap a 10-40% performance boost (depending on the size of the
mapping) in comparison to flushing after each page entry update.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/iommu/tegra-gart.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index f6cf5cd5aaca..86a855c0d031 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -287,7 +287,6 @@ static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
 		}
 	}
 	gart_set_pte(gart, iova, GART_PTE(pfn));
-	FLUSH_GART_REGS(gart);
 	spin_unlock_irqrestore(&gart->pte_lock, flags);
 	return 0;
 }
@@ -304,7 +303,6 @@ static size_t gart_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
 
 	spin_lock_irqsave(&gart->pte_lock, flags);
 	gart_set_pte(gart, iova, 0);
-	FLUSH_GART_REGS(gart);
 	spin_unlock_irqrestore(&gart->pte_lock, flags);
 	return bytes;
 }
@@ -370,6 +368,14 @@ static int gart_iommu_of_xlate(struct device *dev,
 	return 0;
 }
 
+static void gart_iommu_sync(struct iommu_domain *domain)
+{
+	struct gart_domain *gart_domain = to_gart_domain(domain);
+	struct gart_device *gart = gart_domain->gart;
+
+	FLUSH_GART_REGS(gart);
+}
+
 static const struct iommu_ops gart_iommu_ops = {
 	.capable	= gart_iommu_capable,
 	.domain_alloc	= gart_iommu_domain_alloc,
@@ -384,6 +390,8 @@ static const struct iommu_ops gart_iommu_ops = {
 	.iova_to_phys	= gart_iommu_iova_to_phys,
 	.pgsize_bitmap	= GART_IOMMU_PGSIZES,
 	.of_xlate	= gart_iommu_of_xlate,
+	.iotlb_sync_map	= gart_iommu_sync,
+	.iotlb_sync	= gart_iommu_sync,
 };
 
 static int tegra_gart_suspend(struct device *dev)
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v4 06/20] dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
                   ` (4 preceding siblings ...)
  2018-09-24  0:41 ` [PATCH v4 05/20] iommu/tegra: gart: Optimize mapping / unmapping performance Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24  9:55   ` Thierry Reding
  2018-09-27 18:41   ` Rob Herring
  2018-09-24  0:41 ` [PATCH v4 07/20] ARM: dts: tegra20: Update Memory Controller node to the new binding Dmitry Osipenko
                   ` (13 subsequent siblings)
  19 siblings, 2 replies; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

Splitting GART and the Memory Controller, a decision made back in the
day, wasn't a good one. Given that the GART driver has never been used by
anything in the kernel, we decided that it is better to correct the
mistakes of the past and merge the two bindings into a single one. As a
result there is a DT ABI change for the Memory Controller that breaks
neither newer kernels using an older DT nor older kernels using a newer
DT; that is done by changing the 'compatible' of the node to
'tegra20-mc-gart' and adding a newly required clock property. The new clock
property also puts the tegra20-mc binding in line with the bindings of the
later Tegra generations.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 .../bindings/iommu/nvidia,tegra20-gart.txt    | 14 ----------
 .../memory-controllers/nvidia,tegra20-mc.txt  | 27 +++++++++++++------
 2 files changed, 19 insertions(+), 22 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt

diff --git a/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt b/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
deleted file mode 100644
index 099d9362ebc1..000000000000
--- a/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
+++ /dev/null
@@ -1,14 +0,0 @@
-NVIDIA Tegra 20 GART
-
-Required properties:
-- compatible: "nvidia,tegra20-gart"
-- reg: Two pairs of cells specifying the physical address and size of
-  the memory controller registers and the GART aperture respectively.
-
-Example:
-
-	gart {
-		compatible = "nvidia,tegra20-gart";
-		reg = <0x7000f024 0x00000018	/* controller registers */
-		       0x58000000 0x02000000>;	/* GART aperture */
-	};
diff --git a/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
index 7d60a50a4fa1..e55328237df4 100644
--- a/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
+++ b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
@@ -1,26 +1,37 @@
 NVIDIA Tegra20 MC(Memory Controller)
 
 Required properties:
-- compatible : "nvidia,tegra20-mc"
-- reg : Should contain 2 register ranges(address and length); see the
-  example below. Note that the MC registers are interleaved with the
-  GART registers, and hence must be represented as multiple ranges.
+- compatible : "nvidia,tegra20-mc-gart"
+- reg : Should contain 2 register ranges: physical base address and length of
+  the controller's registers and the GART aperture respectively.
+- clocks: Must contain an entry for each entry in clock-names.
+  See ../clocks/clock-bindings.txt for details.
+- clock-names: Must include the following entries:
+  - mc: the module's clock input
 - interrupts : Should contain MC General interrupt.
 - #reset-cells : Should be 1. This cell represents memory client module ID.
   The assignments may be found in header file <dt-bindings/memory/tegra20-mc.h>
   or in the TRM documentation.
+- #iommu-cells: Should be 0. This cell represents the number of cells in an
+  IOMMU specifier needed to encode an address. GART supports only a single
+  address space that is shared by all devices, therefore no additional
+  information needed for the address encoding.
 
 Example:
 	mc: memory-controller@7000f000 {
-		compatible = "nvidia,tegra20-mc";
-		reg = <0x7000f000 0x024
-		       0x7000f03c 0x3c4>;
-		interrupts = <0 77 0x04>;
+		compatible = "nvidia,tegra20-mc-gart";
+		reg = <0x7000f000 0x400		/* controller registers */
+		       0x58000000 0x02000000>;	/* GART aperture */
+		clocks = <&tegra_car TEGRA20_CLK_MC>;
+		clock-names = "mc";
+		interrupts = <GIC_SPI 77 0x04>;
 		#reset-cells = <1>;
+		#iommu-cells = <0>;
 	};
 
 	video-codec@6001a000 {
 		compatible = "nvidia,tegra20-vde";
 		...
 		resets = <&mc TEGRA20_MC_RESET_VDE>;
+		iommus = <&mc>;
 	};
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v4 07/20] ARM: dts: tegra20: Update Memory Controller node to the new binding
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
                   ` (5 preceding siblings ...)
  2018-09-24  0:41 ` [PATCH v4 06/20] dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24  0:41 ` [PATCH v4 08/20] memory: tegra: Don't invoke Tegra30+ specific memory timing setup on Tegra20 Dmitry Osipenko
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

The device-tree binding of the Memory Controller has been changed: GART has
been squashed into the MC, there are new mandatory clock and #iommu-cells
properties, and the compatible has been changed to 'tegra20-mc-gart'.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 arch/arm/boot/dts/tegra20.dtsi | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
index 979f38293fe5..3ebaf38cc598 100644
--- a/arch/arm/boot/dts/tegra20.dtsi
+++ b/arch/arm/boot/dts/tegra20.dtsi
@@ -616,17 +616,14 @@
 	};
 
 	mc: memory-controller@7000f000 {
-		compatible = "nvidia,tegra20-mc";
-		reg = <0x7000f000 0x024
-		       0x7000f03c 0x3c4>;
+		compatible = "nvidia,tegra20-mc-gart";
+		reg = <0x7000f000 0x400		/* controller registers */
+		       0x58000000 0x02000000>;	/* GART aperture */
+		clocks = <&tegra_car TEGRA20_CLK_MC>;
+		clock-names = "mc";
 		interrupts = <GIC_SPI 77 IRQ_TYPE_LEVEL_HIGH>;
 		#reset-cells = <1>;
-	};
-
-	iommu@7000f024 {
-		compatible = "nvidia,tegra20-gart";
-		reg = <0x7000f024 0x00000018	/* controller registers */
-		       0x58000000 0x02000000>;	/* GART aperture */
+		#iommu-cells = <0>;
 	};
 
 	memory-controller@7000f400 {
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v4 08/20] memory: tegra: Don't invoke Tegra30+ specific memory timing setup on Tegra20
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
                   ` (6 preceding siblings ...)
  2018-09-24  0:41 ` [PATCH v4 07/20] ARM: dts: tegra20: Update Memory Controller node to the new binding Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24  0:41 ` [PATCH v4 09/20] memory: tegra: Adapt to Tegra20 device-tree binding changes Dmitry Osipenko
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

This fixes the irrelevant "tegra-mc 7000f000.memory-controller: no memory
timings for RAM code 0 registered" warning message during kernel
boot-up on Tegra20.

Fixes: a8d502fd3348 ("memory: tegra: Squash tegra20-mc into common tegra-mc driver")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
---
 drivers/memory/tegra/mc.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index bd25faf6d13d..e56862495f36 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -664,12 +664,13 @@ static int tegra_mc_probe(struct platform_device *pdev)
 		}
 
 		isr = tegra_mc_irq;
-	}
 
-	err = tegra_mc_setup_timings(mc);
-	if (err < 0) {
-		dev_err(&pdev->dev, "failed to setup timings: %d\n", err);
-		return err;
+		err = tegra_mc_setup_timings(mc);
+		if (err < 0) {
+			dev_err(&pdev->dev, "failed to setup timings: %d\n",
+				err);
+			return err;
+		}
 	}
 
 	mc->irq = platform_get_irq(pdev, 0);
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v4 09/20] memory: tegra: Adapt to Tegra20 device-tree binding changes
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
                   ` (7 preceding siblings ...)
  2018-09-24  0:41 ` [PATCH v4 08/20] memory: tegra: Don't invoke Tegra30+ specific memory timing setup on Tegra20 Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24 10:02   ` Thierry Reding
  2018-09-24  0:41 ` [PATCH v4 10/20] memory: tegra: Read client ID on GART page fault Dmitry Osipenko
                   ` (10 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

The tegra20-mc device-tree binding has been changed: GART has been
squashed into the Memory Controller, the clock property is now mandatory
for Tegra20, and the DT compatible has been changed as well. Adapt the
driver to the DT changes.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/memory/tegra/mc.c | 21 ++++++++-------------
 drivers/memory/tegra/mc.h |  6 ------
 include/soc/tegra/mc.h    |  2 +-
 3 files changed, 9 insertions(+), 20 deletions(-)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index e56862495f36..1b4ceefd82f9 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -51,7 +51,7 @@
 
 static const struct of_device_id tegra_mc_of_match[] = {
 #ifdef CONFIG_ARCH_TEGRA_2x_SOC
-	{ .compatible = "nvidia,tegra20-mc", .data = &tegra20_mc_soc },
+	{ .compatible = "nvidia,tegra20-mc-gart", .data = &tegra20_mc_soc },
 #endif
 #ifdef CONFIG_ARCH_TEGRA_3x_SOC
 	{ .compatible = "nvidia,tegra30-mc", .data = &tegra30_mc_soc },
@@ -638,24 +638,19 @@ static int tegra_mc_probe(struct platform_device *pdev)
 	if (IS_ERR(mc->regs))
 		return PTR_ERR(mc->regs);
 
+	mc->clk = devm_clk_get(&pdev->dev, "mc");
+	if (IS_ERR(mc->clk)) {
+		dev_err(&pdev->dev, "failed to get MC clock: %ld\n",
+			PTR_ERR(mc->clk));
+		return PTR_ERR(mc->clk);
+	}
+
 #ifdef CONFIG_ARCH_TEGRA_2x_SOC
 	if (mc->soc == &tegra20_mc_soc) {
-		res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
-		mc->regs2 = devm_ioremap_resource(&pdev->dev, res);
-		if (IS_ERR(mc->regs2))
-			return PTR_ERR(mc->regs2);
-
 		isr = tegra20_mc_irq;
 	} else
 #endif
 	{
-		mc->clk = devm_clk_get(&pdev->dev, "mc");
-		if (IS_ERR(mc->clk)) {
-			dev_err(&pdev->dev, "failed to get MC clock: %ld\n",
-				PTR_ERR(mc->clk));
-			return PTR_ERR(mc->clk);
-		}
-
 		err = tegra_mc_setup_latency_allowance(mc);
 		if (err < 0) {
 			dev_err(&pdev->dev, "failed to setup latency allowance: %d\n",
diff --git a/drivers/memory/tegra/mc.h b/drivers/memory/tegra/mc.h
index 01065f12ebeb..9856f085e487 100644
--- a/drivers/memory/tegra/mc.h
+++ b/drivers/memory/tegra/mc.h
@@ -26,18 +26,12 @@
 
 static inline u32 mc_readl(struct tegra_mc *mc, unsigned long offset)
 {
-	if (mc->regs2 && offset >= 0x24)
-		return readl(mc->regs2 + offset - 0x3c);
-
 	return readl(mc->regs + offset);
 }
 
 static inline void mc_writel(struct tegra_mc *mc, u32 value,
 			     unsigned long offset)
 {
-	if (mc->regs2 && offset >= 0x24)
-		return writel(value, mc->regs2 + offset - 0x3c);
-
 	writel(value, mc->regs + offset);
 }
 
diff --git a/include/soc/tegra/mc.h b/include/soc/tegra/mc.h
index b43f37fea096..db5bfdf589b4 100644
--- a/include/soc/tegra/mc.h
+++ b/include/soc/tegra/mc.h
@@ -144,7 +144,7 @@ struct tegra_mc_soc {
 struct tegra_mc {
 	struct device *dev;
 	struct tegra_smmu *smmu;
-	void __iomem *regs, *regs2;
+	void __iomem *regs;
 	struct clk *clk;
 	int irq;
 
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v4 10/20] memory: tegra: Read client ID on GART page fault
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
                   ` (8 preceding siblings ...)
  2018-09-24  0:41 ` [PATCH v4 09/20] memory: tegra: Adapt to Tegra20 device-tree binding changes Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24  0:41 ` [PATCH v4 11/20] memory: tegra: Use of_device_get_match_data() Dmitry Osipenko
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

With the device-tree binding changes, the Memory Controller now has access
to the GART registers. Hence it is now possible to read the client ID on a
GART page fault to find out which memory client caused the fault.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/memory/tegra/mc.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index 1b4ceefd82f9..5454ffe5b2e0 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -38,6 +38,7 @@
 
 #define MC_ERR_ADR 0x0c
 
+#define MC_GART_ERROR_REQ		0x30
 #define MC_DECERR_EMEM_OTHERS_STATUS	0x58
 #define MC_SECURITY_VIOLATION_STATUS	0x74
 
@@ -575,8 +576,15 @@ static __maybe_unused irqreturn_t tegra20_mc_irq(int irq, void *data)
 			break;
 
 		case MC_INT_INVALID_GART_PAGE:
-			dev_err_ratelimited(mc->dev, "%s\n", error);
-			continue;
+			reg = MC_GART_ERROR_REQ;
+			value = mc_readl(mc, reg);
+
+			id = (value >> 1) & mc->soc->client_id_mask;
+			desc = error_names[2];
+
+			if (value & BIT(0))
+				direction = "write";
+			break;
 
 		case MC_INT_SECURITY_VIOLATION:
 			reg = MC_SECURITY_VIOLATION_STATUS;
-- 
2.19.0



* [PATCH v4 11/20] memory: tegra: Use of_device_get_match_data()
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
                   ` (9 preceding siblings ...)
  2018-09-24  0:41 ` [PATCH v4 10/20] memory: tegra: Read client ID on GART page fault Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24 10:13   ` Thierry Reding
  2018-09-24  0:41 ` [PATCH v4 12/20] iommu/tegra: gart: Integrate with Memory Controller driver Dmitry Osipenko
                   ` (8 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

There is no need to match the device with the DT node since it was already
matched. Use the of_device_get_match_data() helper to get the match data.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/memory/tegra/mc.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index 5454ffe5b2e0..cdc33f93cf7c 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -11,8 +11,7 @@
 #include <linux/interrupt.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
-#include <linux/of.h>
-#include <linux/platform_device.h>
+#include <linux/of_device.h>
 #include <linux/slab.h>
 #include <linux/sort.h>
 
@@ -619,23 +618,18 @@ static __maybe_unused irqreturn_t tegra20_mc_irq(int irq, void *data)
 
 static int tegra_mc_probe(struct platform_device *pdev)
 {
-	const struct of_device_id *match;
 	struct resource *res;
 	struct tegra_mc *mc;
 	void *isr;
 	int err;
 
-	match = of_match_node(tegra_mc_of_match, pdev->dev.of_node);
-	if (!match)
-		return -ENODEV;
-
 	mc = devm_kzalloc(&pdev->dev, sizeof(*mc), GFP_KERNEL);
 	if (!mc)
 		return -ENOMEM;
 
 	platform_set_drvdata(pdev, mc);
 	spin_lock_init(&mc->lock);
-	mc->soc = match->data;
+	mc->soc = of_device_get_match_data(&pdev->dev);
 	mc->dev = &pdev->dev;
 
 	/* length of MC tick in nanoseconds */
-- 
2.19.0



* [PATCH v4 12/20] iommu/tegra: gart: Integrate with Memory Controller driver
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
                   ` (10 preceding siblings ...)
  2018-09-24  0:41 ` [PATCH v4 11/20] memory: tegra: Use of_device_get_match_data() Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24 10:23   ` Thierry Reding
  2018-09-24  0:41 ` [PATCH v4 13/20] iommu/tegra: gart: Fix spinlock recursion Dmitry Osipenko
                   ` (7 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

The device-tree binding has been changed. There is no separate GART device
anymore; it is squashed into the Memory Controller. Integrate the GART
module with the MC in the same way as it is done for the SMMU of Tegra30+.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
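For illustration only: tegra_gart_probe() now has three possible outcomes,
NULL (SoC has no GART), an error pointer (probe failed) or a valid handle.
The userspace sketch below re-implements the ERR_PTR helpers in simplified
form to show that convention. The example_* names are hypothetical and the
helpers are stand-ins for the ones in <linux/err.h>, not copies of them.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace stand-ins for the <linux/err.h> helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct example_gart {
	int enabled;
};

static struct example_gart example_instance;

/* Hypothetical probe with the same three outcomes as tegra_gart_probe(). */
static struct example_gart *example_gart_probe(int has_smmu, int has_aperture)
{
	if (has_smmu)
		return NULL;		/* SMMU-based SoC: nothing to probe */

	if (!has_aperture)
		return ERR_PTR(-ENXIO);	/* real failure */

	example_instance.enabled = 1;
	return &example_instance;
}

/* A handle is usable only if it is neither NULL nor an error pointer. */
static int example_gart_usable(const struct example_gart *gart)
{
	return gart != NULL && !IS_ERR(gart);
}
```

Note that a plain "if (ptr)" test is also true for an error pointer, so a
caller that keeps the returned value around has to check for both cases
before using it.
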
 drivers/iommu/Kconfig      |  1 +
 drivers/iommu/tegra-gart.c | 98 ++++++++++----------------------------
 drivers/memory/tegra/mc.c  | 41 ++++++++++++++++
 include/soc/tegra/mc.h     | 27 +++++++++++
 4 files changed, 93 insertions(+), 74 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index c60395b7470f..33f97e5f07ca 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -269,6 +269,7 @@ config ROCKCHIP_IOMMU
 config TEGRA_IOMMU_GART
 	bool "Tegra GART IOMMU Support"
 	depends on ARCH_TEGRA_2x_SOC
+	depends on TEGRA_MC
 	select IOMMU_API
 	help
 	  Enables support for remapping discontiguous physical memory
diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 86a855c0d031..1c89b20ba4bb 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -21,11 +21,13 @@
 #include <linux/iommu.h>
 #include <linux/list.h>
 #include <linux/module.h>
-#include <linux/of_device.h>
+#include <linux/platform_device.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
 #include <linux/vmalloc.h>
 
+#include <soc/tegra/mc.h>
+
 /* bitmap of the page sizes currently supported */
 #define GART_IOMMU_PGSIZES	(SZ_4K)
 
@@ -394,9 +396,8 @@ static const struct iommu_ops gart_iommu_ops = {
 	.iotlb_sync	= gart_iommu_sync,
 };
 
-static int tegra_gart_suspend(struct device *dev)
+int tegra_gart_suspend(struct gart_device *gart)
 {
-	struct gart_device *gart = dev_get_drvdata(dev);
 	unsigned long iova;
 	u32 *data = gart->savedata;
 	unsigned long flags;
@@ -408,9 +409,8 @@ static int tegra_gart_suspend(struct device *dev)
 	return 0;
 }
 
-static int tegra_gart_resume(struct device *dev)
+int tegra_gart_resume(struct gart_device *gart)
 {
-	struct gart_device *gart = dev_get_drvdata(dev);
 	unsigned long flags;
 
 	spin_lock_irqsave(&gart->pte_lock, flags);
@@ -419,41 +419,39 @@ static int tegra_gart_resume(struct device *dev)
 	return 0;
 }
 
-static int tegra_gart_probe(struct platform_device *pdev)
+struct gart_device *tegra_gart_probe(struct device *dev,
+				     const struct tegra_smmu_soc *soc,
+				     struct tegra_mc *mc)
 {
 	struct gart_device *gart;
-	struct resource *res, *res_remap;
+	struct resource *res_remap;
 	void __iomem *gart_regs;
-	struct device *dev = &pdev->dev;
 	int ret;
 
 	BUILD_BUG_ON(PAGE_SHIFT != GART_PAGE_SHIFT);
 
+	/* Tegra30+ has an SMMU and no GART */
+	if (soc)
+		return NULL;
+
 	/* the GART memory aperture is required */
-	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	res_remap = platform_get_resource(pdev, IORESOURCE_MEM, 1);
-	if (!res || !res_remap) {
+	res_remap = platform_get_resource(to_platform_device(dev),
+					  IORESOURCE_MEM, 1);
+	if (!res_remap) {
 		dev_err(dev, "GART memory aperture expected\n");
-		return -ENXIO;
+		return ERR_PTR(-ENXIO);
 	}
 
 	gart = devm_kzalloc(dev, sizeof(*gart), GFP_KERNEL);
 	if (!gart) {
 		dev_err(dev, "failed to allocate gart_device\n");
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 	}
 
-	gart_regs = devm_ioremap(dev, res->start, resource_size(res));
-	if (!gart_regs) {
-		dev_err(dev, "failed to remap GART registers\n");
-		return -ENXIO;
-	}
-
-	ret = iommu_device_sysfs_add(&gart->iommu, &pdev->dev, NULL,
-				     dev_name(&pdev->dev));
+	ret = iommu_device_sysfs_add(&gart->iommu, dev, NULL, "gart");
 	if (ret) {
 		dev_err(dev, "Failed to register IOMMU in sysfs\n");
-		return ret;
+		return ERR_PTR(ret);
 	}
 
 	iommu_device_set_ops(&gart->iommu, &gart_iommu_ops);
@@ -465,7 +463,8 @@ static int tegra_gart_probe(struct platform_device *pdev)
 		goto remove_sysfs;
 	}
 
-	gart->dev = &pdev->dev;
+	gart->dev = dev;
+	gart_regs = mc->regs + GART_REG_BASE;
 	spin_lock_init(&gart->pte_lock);
 	spin_lock_init(&gart->client_lock);
 	INIT_LIST_HEAD(&gart->client);
@@ -480,72 +479,23 @@ static int tegra_gart_probe(struct platform_device *pdev)
 		goto unregister_iommu;
 	}
 
-	platform_set_drvdata(pdev, gart);
 	do_gart_setup(gart, NULL);
 
 	gart_handle = gart;
 
-	return 0;
+	return gart;
 
 unregister_iommu:
 	iommu_device_unregister(&gart->iommu);
 remove_sysfs:
 	iommu_device_sysfs_remove(&gart->iommu);
 
-	return ret;
-}
-
-static int tegra_gart_remove(struct platform_device *pdev)
-{
-	struct gart_device *gart = platform_get_drvdata(pdev);
-
-	iommu_device_unregister(&gart->iommu);
-	iommu_device_sysfs_remove(&gart->iommu);
-
-	writel(0, gart->regs + GART_CONFIG);
-	if (gart->savedata)
-		vfree(gart->savedata);
-	gart_handle = NULL;
-	return 0;
-}
-
-static const struct dev_pm_ops tegra_gart_pm_ops = {
-	.suspend	= tegra_gart_suspend,
-	.resume		= tegra_gart_resume,
-};
-
-static const struct of_device_id tegra_gart_of_match[] = {
-	{ .compatible = "nvidia,tegra20-gart", },
-	{ },
-};
-MODULE_DEVICE_TABLE(of, tegra_gart_of_match);
-
-static struct platform_driver tegra_gart_driver = {
-	.probe		= tegra_gart_probe,
-	.remove		= tegra_gart_remove,
-	.driver = {
-		.name	= "tegra-gart",
-		.pm	= &tegra_gart_pm_ops,
-		.of_match_table = tegra_gart_of_match,
-	},
-};
-
-static int tegra_gart_init(void)
-{
-	return platform_driver_register(&tegra_gart_driver);
-}
-
-static void __exit tegra_gart_exit(void)
-{
-	platform_driver_unregister(&tegra_gart_driver);
+	return ERR_PTR(ret);
 }
 
-subsys_initcall(tegra_gart_init);
-module_exit(tegra_gart_exit);
 module_param(gart_debug, bool, 0644);
 
 MODULE_PARM_DESC(gart_debug, "Enable GART debugging");
 MODULE_DESCRIPTION("IOMMU API for GART in Tegra20");
 MODULE_AUTHOR("Hiroshi DOYU <hdoyu@nvidia.com>");
-MODULE_ALIAS("platform:tegra-gart");
 MODULE_LICENSE("GPL v2");
diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index cdc33f93cf7c..cdb6f1069930 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -700,13 +700,54 @@ static int tegra_mc_probe(struct platform_device *pdev)
 				PTR_ERR(mc->smmu));
 	}
 
+	if (IS_ENABLED(CONFIG_TEGRA_IOMMU_GART)) {
+		mc->gart = tegra_gart_probe(&pdev->dev, mc->soc->smmu, mc);
+		if (IS_ERR(mc->gart))
+			dev_err(&pdev->dev, "failed to probe GART: %ld\n",
+				PTR_ERR(mc->gart));
+	}
+
+	return 0;
+}
+
+static int tegra_mc_suspend(struct device *dev)
+{
+	struct tegra_mc *mc = dev_get_drvdata(dev);
+	int err;
+
+	if (mc->gart) {
+		err = tegra_gart_suspend(mc->gart);
+		if (err)
+			return err;
+	}
+
 	return 0;
 }
 
+static int tegra_mc_resume(struct device *dev)
+{
+	struct tegra_mc *mc = dev_get_drvdata(dev);
+	int err;
+
+	if (mc->gart) {
+		err = tegra_gart_resume(mc->gart);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static const struct dev_pm_ops tegra_mc_pm_ops = {
+	.suspend = tegra_mc_suspend,
+	.resume = tegra_mc_resume,
+};
+
 static struct platform_driver tegra_mc_driver = {
 	.driver = {
 		.name = "tegra-mc",
 		.of_match_table = tegra_mc_of_match,
+		.pm = &tegra_mc_pm_ops,
 		.suppress_bind_attrs = true,
 	},
 	.prevent_deferred_probe = true,
diff --git a/include/soc/tegra/mc.h b/include/soc/tegra/mc.h
index db5bfdf589b4..5da42e3fb801 100644
--- a/include/soc/tegra/mc.h
+++ b/include/soc/tegra/mc.h
@@ -77,6 +77,7 @@ struct tegra_smmu_soc {
 
 struct tegra_mc;
 struct tegra_smmu;
+struct gart_device;
 
 #ifdef CONFIG_TEGRA_IOMMU_SMMU
 struct tegra_smmu *tegra_smmu_probe(struct device *dev,
@@ -96,6 +97,31 @@ static inline void tegra_smmu_remove(struct tegra_smmu *smmu)
 }
 #endif
 
+#ifdef CONFIG_TEGRA_IOMMU_GART
+struct gart_device *tegra_gart_probe(struct device *dev,
+				     const struct tegra_smmu_soc *soc,
+				     struct tegra_mc *mc);
+int tegra_gart_suspend(struct gart_device *gart);
+int tegra_gart_resume(struct gart_device *gart);
+#else
+static inline struct gart_device *
+tegra_gart_probe(struct device *dev, const struct tegra_smmu_soc *soc,
+		 struct tegra_mc *mc)
+{
+	return NULL;
+}
+
+static inline int tegra_gart_suspend(struct gart_device *gart)
+{
+	return -ENODEV;
+}
+
+static inline int tegra_gart_resume(struct gart_device *gart)
+{
+	return -ENODEV;
+}
+#endif
+
 struct tegra_mc_reset {
 	const char *name;
 	unsigned long id;
@@ -144,6 +170,7 @@ struct tegra_mc_soc {
 struct tegra_mc {
 	struct device *dev;
 	struct tegra_smmu *smmu;
+	struct gart_device *gart;
 	void __iomem *regs;
 	struct clk *clk;
 	int irq;
-- 
2.19.0



* [PATCH v4 13/20] iommu/tegra: gart: Fix spinlock recursion
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
                   ` (11 preceding siblings ...)
  2018-09-24  0:41 ` [PATCH v4 12/20] iommu/tegra: gart: Integrate with Memory Controller driver Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24 10:49   ` Thierry Reding
  2018-09-24  0:41 ` [PATCH v4 14/20] iommu/tegra: gart: Fix NULL pointer dereference Dmitry Osipenko
                   ` (6 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

Fix a spinlock recursion bug that happens on IOMMU domain destruction if
any of the allocated domains has devices attached to it.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
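For illustration only: the fix splits detaching into a helper that expects
the lock to be held and a wrapper that takes the lock. A minimal userspace
sketch of that pattern follows. All names are hypothetical, and the toy
lock merely records a second acquire instead of deadlocking like a real
non-recursive spinlock would.

```c
#include <stddef.h>

#define MAX_CLIENTS 8

/* Toy non-recursive lock: it only records an acquire while already held. */
struct toy_lock {
	int held;
	int recursed;
};

struct toy_registry {
	struct toy_lock lock;
	int clients[MAX_CLIENTS];
	size_t count;
};

static void toy_lock_acquire(struct toy_lock *lock)
{
	if (lock->held)
		lock->recursed = 1;	/* a real spinlock would hang here */
	lock->held = 1;
}

static void toy_lock_release(struct toy_lock *lock)
{
	lock->held = 0;
}

/* Caller must hold reg->lock. */
static void registry_detach_locked(struct toy_registry *reg, int id)
{
	size_t i;

	for (i = 0; i < reg->count; i++) {
		if (reg->clients[i] == id) {
			/* fill the hole with the last entry */
			reg->clients[i] = reg->clients[--reg->count];
			return;
		}
	}
}

/* Public entry point: takes the lock itself. */
static void registry_detach(struct toy_registry *reg, int id)
{
	toy_lock_acquire(&reg->lock);
	registry_detach_locked(reg, id);
	toy_lock_release(&reg->lock);
}

/* Already holds the lock, so it must use the _locked helper. */
static void registry_free_all(struct toy_registry *reg)
{
	toy_lock_acquire(&reg->lock);
	while (reg->count)
		registry_detach_locked(reg, reg->clients[0]);
	toy_lock_release(&reg->lock);
}
```

Calling registry_detach() from inside registry_free_all() would acquire
the lock twice, which is the situation this patch removes.
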
 drivers/iommu/tegra-gart.c | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 1c89b20ba4bb..e6fe139576c3 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -195,25 +195,33 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain,
 	return err;
 }
 
-static void gart_iommu_detach_dev(struct iommu_domain *domain,
-				  struct device *dev)
+static void __gart_iommu_detach_dev(struct iommu_domain *domain,
+				    struct device *dev)
 {
 	struct gart_domain *gart_domain = to_gart_domain(domain);
 	struct gart_device *gart = gart_domain->gart;
 	struct gart_client *c;
 
-	spin_lock(&gart->client_lock);
-
 	list_for_each_entry(c, &gart->client, list) {
 		if (c->dev == dev) {
 			list_del(&c->list);
 			devm_kfree(gart->dev, c);
 			dev_dbg(gart->dev, "Detached %s\n", dev_name(dev));
-			goto out;
+			return;
 		}
 	}
-	dev_err(gart->dev, "Couldn't find\n");
-out:
+
+	dev_err(gart->dev, "Couldn't find %s to detach\n", dev_name(dev));
+}
+
+static void gart_iommu_detach_dev(struct iommu_domain *domain,
+				  struct device *dev)
+{
+	struct gart_domain *gart_domain = to_gart_domain(domain);
+	struct gart_device *gart = gart_domain->gart;
+
+	spin_lock(&gart->client_lock);
+	__gart_iommu_detach_dev(domain, dev);
 	spin_unlock(&gart->client_lock);
 }
 
@@ -253,7 +261,7 @@ static void gart_iommu_domain_free(struct iommu_domain *domain)
 			struct gart_client *c;
 
 			list_for_each_entry(c, &gart->client, list)
-				gart_iommu_detach_dev(domain, c->dev);
+				__gart_iommu_detach_dev(domain, c->dev);
 		}
 		spin_unlock(&gart->client_lock);
 	}
-- 
2.19.0



* [PATCH v4 14/20] iommu/tegra: gart: Fix NULL pointer dereference
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
                   ` (12 preceding siblings ...)
  2018-09-24  0:41 ` [PATCH v4 13/20] iommu/tegra: gart: Fix spinlock recursion Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24 10:49   ` Thierry Reding
  2018-09-24  0:41 ` [PATCH v4 15/20] iommu/tegra: gart: Allow only one active domain at a time Dmitry Osipenko
                   ` (5 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

Fix a NULL pointer dereference on IOMMU domain destruction that happens
because the clients list is iterated unsafely while its elements are
being deleted during the iteration.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
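For illustration only: the idea behind list_for_each_entry_safe() is to
fetch the successor before the current element is freed. A minimal
userspace sketch with a hand-rolled singly linked list follows. The names
are hypothetical and the kernel's list_head is not used here.

```c
#include <stdlib.h>

struct client {
	int id;
	struct client *next;
};

/* Prepend a client; returns the new head (or the old one on failure). */
static struct client *client_push(struct client *head, int id)
{
	struct client *c = malloc(sizeof(*c));

	if (!c)
		return head;

	c->id = id;
	c->next = head;

	return c;
}

/* Free every client and return how many were freed. */
static unsigned int client_free_all(struct client **head)
{
	struct client *c, *tmp;
	unsigned int freed = 0;

	for (c = *head; c; c = tmp) {
		tmp = c->next;	/* read the successor before c is freed */
		free(c);
		freed++;
	}

	*head = NULL;

	return freed;
}
```

Writing the loop step as "c = c->next" instead would read from memory
that has already been freed, which is the class of bug fixed here.
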
 drivers/iommu/tegra-gart.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index e6fe139576c3..1d45b023adea 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -258,9 +258,9 @@ static void gart_iommu_domain_free(struct iommu_domain *domain)
 	if (gart) {
 		spin_lock(&gart->client_lock);
 		if (!list_empty(&gart->client)) {
-			struct gart_client *c;
+			struct gart_client *c, *tmp;
 
-			list_for_each_entry(c, &gart->client, list)
+			list_for_each_entry_safe(c, tmp, &gart->client, list)
 				__gart_iommu_detach_dev(domain, c->dev);
 		}
 		spin_unlock(&gart->client_lock);
-- 
2.19.0



* [PATCH v4 15/20] iommu/tegra: gart: Allow only one active domain at a time
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
                   ` (13 preceding siblings ...)
  2018-09-24  0:41 ` [PATCH v4 14/20] iommu/tegra: gart: Fix NULL pointer dereference Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24 10:50   ` Thierry Reding
  2018-09-24  0:41 ` [PATCH v4 16/20] iommu/tegra: gart: Don't use managed resources Dmitry Osipenko
                   ` (4 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

GART has a single address space that is shared by all devices, hence only
one domain can be active at a time.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
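For illustration only: the rule enforced by this patch can be modelled in
a few lines of userspace C. This is a sketch under assumptions: the names
are hypothetical, and a plain counter replaces the clients list that the
driver uses at this point of the series to detect the last detach.

```c
#include <errno.h>
#include <stddef.h>

struct toy_gart {
	const void *active_domain;	/* NULL while nothing is attached */
	unsigned int attached;		/* devices attached to that domain */
};

/* Attach one device to @domain; refuse if another domain is active. */
static int toy_attach(struct toy_gart *gart, const void *domain)
{
	if (gart->active_domain && gart->active_domain != domain)
		return -EINVAL;

	gart->active_domain = domain;
	gart->attached++;

	return 0;
}

/* Detach one device; the domain turns inactive with the last device. */
static void toy_detach(struct toy_gart *gart, const void *domain)
{
	if (gart->active_domain != domain || gart->attached == 0)
		return;

	if (--gart->attached == 0)
		gart->active_domain = NULL;
}
```

A second domain can be attached only after every device has been
detached from the first one.
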
 drivers/iommu/tegra-gart.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 1d45b023adea..9f7d3afb686f 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -55,6 +55,7 @@ struct gart_device {
 	spinlock_t		pte_lock;	/* for pagetable */
 	struct list_head	client;
 	spinlock_t		client_lock;	/* for client list */
+	struct iommu_domain	*active_domain;	/* current active domain */
 	struct device		*dev;
 
 	struct iommu_device	iommu;		/* IOMMU Core handle */
@@ -184,6 +185,12 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain,
 			goto fail;
 		}
 	}
+	if (gart->active_domain && gart->active_domain != domain) {
+		dev_err(gart->dev, "Only one domain can be active at a time\n");
+		err = -EINVAL;
+		goto fail;
+	}
+	gart->active_domain = domain;
 	list_add(&client->list, &gart->client);
 	spin_unlock(&gart->client_lock);
 	dev_dbg(gart->dev, "Attached %s\n", dev_name(dev));
@@ -206,6 +213,8 @@ static void __gart_iommu_detach_dev(struct iommu_domain *domain,
 		if (c->dev == dev) {
 			list_del(&c->list);
 			devm_kfree(gart->dev, c);
+			if (list_empty(&gart->client))
+				gart->active_domain = NULL;
 			dev_dbg(gart->dev, "Detached %s\n", dev_name(dev));
 			return;
 		}
-- 
2.19.0



* [PATCH v4 16/20] iommu/tegra: gart: Don't use managed resources
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
                   ` (14 preceding siblings ...)
  2018-09-24  0:41 ` [PATCH v4 15/20] iommu/tegra: gart: Allow only one active domain at a time Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24 10:52   ` Thierry Reding
  2018-09-24  0:41 ` [PATCH v4 17/20] iommu/tegra: gart: Prepend error/debug messages with "GART:" Dmitry Osipenko
                   ` (3 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

GART is a part of the Memory Controller driver, which is always built-in,
hence there is no benefit from the use of managed resources.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
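For illustration only: without managed allocations every error path has
to release what was allocated before it, which is what the new free_gart
label does. A minimal userspace sketch of that goto-based unwinding
follows. The names are hypothetical, and the live_objects counter exists
only so that a leak would be observable.

```c
#include <errno.h>
#include <stdlib.h>

static int live_objects;	/* allocations not yet released */

struct toy_dev {
	int registered;
};

/*
 * Allocate and "register" a device. On failure NULL is returned, *err
 * holds a negative errno and nothing stays allocated.
 */
static struct toy_dev *toy_probe(int fail_register, int *err)
{
	struct toy_dev *dev;
	int ret;

	dev = calloc(1, sizeof(*dev));
	if (!dev) {
		*err = -ENOMEM;
		return NULL;
	}
	live_objects++;

	if (fail_register) {
		ret = -EIO;
		goto free_dev;
	}

	dev->registered = 1;
	*err = 0;

	return dev;

free_dev:
	/* explicit release; a managed allocation would not need this */
	free(dev);
	live_objects--;
	*err = ret;

	return NULL;
}
```
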
 drivers/iommu/tegra-gart.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 9f7d3afb686f..d019ae8ecfc9 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -171,7 +171,7 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain,
 	struct gart_client *client, *c;
 	int err = 0;
 
-	client = devm_kzalloc(gart->dev, sizeof(*c), GFP_KERNEL);
+	client = kzalloc(sizeof(*c), GFP_KERNEL);
 	if (!client)
 		return -ENOMEM;
 	client->dev = dev;
@@ -197,7 +197,7 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain,
 	return 0;
 
 fail:
-	devm_kfree(gart->dev, client);
+	kfree(client);
 	spin_unlock(&gart->client_lock);
 	return err;
 }
@@ -212,7 +212,7 @@ static void __gart_iommu_detach_dev(struct iommu_domain *domain,
 	list_for_each_entry(c, &gart->client, list) {
 		if (c->dev == dev) {
 			list_del(&c->list);
-			devm_kfree(gart->dev, c);
+			kfree(c);
 			if (list_empty(&gart->client))
 				gart->active_domain = NULL;
 			dev_dbg(gart->dev, "Detached %s\n", dev_name(dev));
@@ -459,7 +459,7 @@ struct gart_device *tegra_gart_probe(struct device *dev,
 		return ERR_PTR(-ENXIO);
 	}
 
-	gart = devm_kzalloc(dev, sizeof(*gart), GFP_KERNEL);
+	gart = kzalloc(sizeof(*gart), GFP_KERNEL);
 	if (!gart) {
 		dev_err(dev, "failed to allocate gart_device\n");
 		return ERR_PTR(-ENOMEM);
@@ -468,7 +468,7 @@ struct gart_device *tegra_gart_probe(struct device *dev,
 	ret = iommu_device_sysfs_add(&gart->iommu, dev, NULL, "gart");
 	if (ret) {
 		dev_err(dev, "Failed to register IOMMU in sysfs\n");
-		return ERR_PTR(ret);
+		goto free_gart;
 	}
 
 	iommu_device_set_ops(&gart->iommu, &gart_iommu_ops);
@@ -506,6 +506,8 @@ struct gart_device *tegra_gart_probe(struct device *dev,
 	iommu_device_unregister(&gart->iommu);
 remove_sysfs:
 	iommu_device_sysfs_remove(&gart->iommu);
+free_gart:
+	kfree(gart);
 
 	return ERR_PTR(ret);
 }
-- 
2.19.0



* [PATCH v4 17/20] iommu/tegra: gart: Prepend error/debug messages with "GART:"
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
                   ` (15 preceding siblings ...)
  2018-09-24  0:41 ` [PATCH v4 16/20] iommu/tegra: gart: Don't use managed resources Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24 10:57   ` Thierry Reding
  2018-09-24  0:41 ` [PATCH v4 18/20] iommu/tegra: gart: Don't detach devices from inactive domains Dmitry Osipenko
                   ` (2 subsequent siblings)
  19 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

GART became a part of the Memory Controller, hence the driver's device is
now the Memory Controller and not GART. As a result, all printed messages
are prepended with "tegra-mc 7000f000.memory-controller:", so let's
prepend GART's messages with "GART:" in order to differentiate them from
the MC's.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/iommu/tegra-gart.c | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index d019ae8ecfc9..284cddf90888 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -96,7 +96,7 @@ static inline void gart_set_pte(struct gart_device *gart,
 	writel(offs, gart->regs + GART_ENTRY_ADDR);
 	writel(pte, gart->regs + GART_ENTRY_DATA);
 
-	dev_dbg(gart->dev, "%s %08lx:%08x\n",
+	dev_dbg(gart->dev, "GART: %s %08lx:%08x\n",
 		 pte ? "map" : "unmap", offs, pte & GART_PAGE_MASK);
 }
 
@@ -134,7 +134,7 @@ static void gart_dump_table(struct gart_device *gart)
 
 		pte = gart_read_pte(gart, iova);
 
-		dev_dbg(gart->dev, "%s %08lx:%08lx\n",
+		dev_dbg(gart->dev, "GART: %s %08lx:%08lx\n",
 			(GART_ENTRY_PHYS_ADDR_VALID & pte) ? "v" : " ",
 			iova, pte & GART_PAGE_MASK);
 	}
@@ -179,21 +179,22 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain,
 	spin_lock(&gart->client_lock);
 	list_for_each_entry(c, &gart->client, list) {
 		if (c->dev == dev) {
-			dev_err(gart->dev,
-				"%s is already attached\n", dev_name(dev));
+			dev_err(gart->dev, "GART: %s is already attached\n",
+				dev_name(dev));
 			err = -EINVAL;
 			goto fail;
 		}
 	}
 	if (gart->active_domain && gart->active_domain != domain) {
-		dev_err(gart->dev, "Only one domain can be active at a time\n");
+		dev_err(gart->dev,
+			"GART: Only one domain can be active at a time\n");
 		err = -EINVAL;
 		goto fail;
 	}
 	gart->active_domain = domain;
 	list_add(&client->list, &gart->client);
 	spin_unlock(&gart->client_lock);
-	dev_dbg(gart->dev, "Attached %s\n", dev_name(dev));
+	dev_dbg(gart->dev, "GART: Attached %s\n", dev_name(dev));
 	return 0;
 
 fail:
@@ -215,12 +216,14 @@ static void __gart_iommu_detach_dev(struct iommu_domain *domain,
 			kfree(c);
 			if (list_empty(&gart->client))
 				gart->active_domain = NULL;
-			dev_dbg(gart->dev, "Detached %s\n", dev_name(dev));
+			dev_dbg(gart->dev, "GART: Detached %s\n",
+				dev_name(dev));
 			return;
 		}
 	}
 
-	dev_err(gart->dev, "Couldn't find %s to detach\n", dev_name(dev));
+	dev_err(gart->dev, "GART: Couldn't find %s to detach\n",
+		dev_name(dev));
 }
 
 static void gart_iommu_detach_dev(struct iommu_domain *domain,
@@ -293,7 +296,7 @@ static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
 	spin_lock_irqsave(&gart->pte_lock, flags);
 	pfn = __phys_to_pfn(pa);
 	if (!pfn_valid(pfn)) {
-		dev_err(gart->dev, "Invalid page: %pa\n", &pa);
+		dev_err(gart->dev, "GART: Invalid page: %pa\n", &pa);
 		spin_unlock_irqrestore(&gart->pte_lock, flags);
 		return -EINVAL;
 	}
@@ -301,7 +304,7 @@ static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
 		pte = gart_read_pte(gart, iova);
 		if (pte & GART_ENTRY_PHYS_ADDR_VALID) {
 			spin_unlock_irqrestore(&gart->pte_lock, flags);
-			dev_err(gart->dev, "Page entry is in-use\n");
+			dev_err(gart->dev, "GART: Page entry is in-use\n");
 			return -EBUSY;
 		}
 	}
@@ -344,7 +347,7 @@ static phys_addr_t gart_iommu_iova_to_phys(struct iommu_domain *domain,
 
 	pa = (pte & GART_PAGE_MASK);
 	if (!pfn_valid(__phys_to_pfn(pa))) {
-		dev_err(gart->dev, "No entry for %08llx:%pa\n",
+		dev_err(gart->dev, "GART: No entry for %08llx:%pa\n",
 			 (unsigned long long)iova, &pa);
 		gart_dump_table(gart);
 		return -EINVAL;
@@ -455,19 +458,17 @@ struct gart_device *tegra_gart_probe(struct device *dev,
 	res_remap = platform_get_resource(to_platform_device(dev),
 					  IORESOURCE_MEM, 1);
 	if (!res_remap) {
-		dev_err(dev, "GART memory aperture expected\n");
+		dev_err(dev, "GART: Memory aperture resource unavailable\n");
 		return ERR_PTR(-ENXIO);
 	}
 
 	gart = kzalloc(sizeof(*gart), GFP_KERNEL);
-	if (!gart) {
-		dev_err(dev, "failed to allocate gart_device\n");
+	if (!gart)
 		return ERR_PTR(-ENOMEM);
-	}
 
 	ret = iommu_device_sysfs_add(&gart->iommu, dev, NULL, "gart");
 	if (ret) {
-		dev_err(dev, "Failed to register IOMMU in sysfs\n");
+		dev_err(dev, "GART: Failed to register IOMMU sysfs\n");
 		goto free_gart;
 	}
 
@@ -476,7 +477,7 @@ struct gart_device *tegra_gart_probe(struct device *dev,
 
 	ret = iommu_device_register(&gart->iommu);
 	if (ret) {
-		dev_err(dev, "Failed to register IOMMU\n");
+		dev_err(dev, "GART: Failed to register IOMMU\n");
 		goto remove_sysfs;
 	}
 
@@ -491,7 +492,6 @@ struct gart_device *tegra_gart_probe(struct device *dev,
 
 	gart->savedata = vmalloc(array_size(sizeof(u32), gart->page_count));
 	if (!gart->savedata) {
-		dev_err(dev, "failed to allocate context save area\n");
 		ret = -ENOMEM;
 		goto unregister_iommu;
 	}
-- 
2.19.0



* [PATCH v4 18/20] iommu/tegra: gart: Don't detach devices from inactive domains
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
                   ` (16 preceding siblings ...)
  2018-09-24  0:41 ` [PATCH v4 17/20] iommu/tegra: gart: Prepend error/debug messages with "GART:" Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24 11:00   ` Thierry Reding
  2018-09-24  0:41 ` [PATCH v4 19/20] iommu/tegra: gart: Simplify clients-tracking code Dmitry Osipenko
  2018-09-24  0:41 ` [PATCH v4 20/20] iommu/tegra: gart: Perform code refactoring Dmitry Osipenko
  19 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

There can be an unlimited number of allocated domains, but only one domain
can be active at a time. Hence devices must be detached only from the
active domain.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/iommu/tegra-gart.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 284cddf90888..306e9644a676 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -167,7 +167,7 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain,
 				 struct device *dev)
 {
 	struct gart_domain *gart_domain = to_gart_domain(domain);
-	struct gart_device *gart = gart_domain->gart;
+	struct gart_device *gart = gart_handle;
 	struct gart_client *client, *c;
 	int err = 0;
 
@@ -192,6 +192,7 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain,
 		goto fail;
 	}
 	gart->active_domain = domain;
+	gart_domain->gart = gart;
 	list_add(&client->list, &gart->client);
 	spin_unlock(&gart->client_lock);
 	dev_dbg(gart->dev, "GART: Attached %s\n", dev_name(dev));
@@ -214,8 +215,10 @@ static void __gart_iommu_detach_dev(struct iommu_domain *domain,
 		if (c->dev == dev) {
 			list_del(&c->list);
 			kfree(c);
-			if (list_empty(&gart->client))
+			if (list_empty(&gart->client)) {
 				gart->active_domain = NULL;
+				gart_domain->gart = NULL;
+			}
 			dev_dbg(gart->dev, "GART: Detached %s\n",
 				dev_name(dev));
 			return;
@@ -253,7 +256,6 @@ static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
 	if (!gart_domain)
 		return NULL;
 
-	gart_domain->gart = gart;
 	gart_domain->domain.geometry.aperture_start = gart->iovmm_base;
 	gart_domain->domain.geometry.aperture_end = gart->iovmm_base +
 					gart->page_count * GART_PAGE_SIZE - 1;
-- 
2.19.0



* [PATCH v4 19/20] iommu/tegra: gart: Simplify clients-tracking code
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
                   ` (17 preceding siblings ...)
  2018-09-24  0:41 ` [PATCH v4 18/20] iommu/tegra: gart: Don't detach devices from inactive domains Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24 11:10   ` Thierry Reding
  2018-09-24  0:41 ` [PATCH v4 20/20] iommu/tegra: gart: Perform code refactoring Dmitry Osipenko
  19 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

GART is a simple IOMMU provider that has a single address space. There is
no need to set up a global clients list and manage it for tracking of the
active domain, hence lots of code can be safely removed and replaced with
a simpler alternative.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/iommu/tegra-gart.c | 157 +++++++++----------------------------
 1 file changed, 39 insertions(+), 118 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 306e9644a676..7182445c3b76 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -19,7 +19,6 @@
 
 #include <linux/io.h>
 #include <linux/iommu.h>
-#include <linux/list.h>
 #include <linux/module.h>
 #include <linux/platform_device.h>
 #include <linux/slab.h>
@@ -42,30 +41,20 @@
 #define GART_PAGE_MASK						\
 	(~(GART_PAGE_SIZE - 1) & ~GART_ENTRY_PHYS_ADDR_VALID)
 
-struct gart_client {
-	struct device		*dev;
-	struct list_head	list;
-};
-
 struct gart_device {
 	void __iomem		*regs;
 	u32			*savedata;
 	u32			page_count;	/* total remappable size */
 	dma_addr_t		iovmm_base;	/* offset to vmm_area */
 	spinlock_t		pte_lock;	/* for pagetable */
-	struct list_head	client;
-	spinlock_t		client_lock;	/* for client list */
+	spinlock_t		dom_lock;	/* for active domain */
+	unsigned int		active_devices;	/* number of active devices */
 	struct iommu_domain	*active_domain;	/* current active domain */
 	struct device		*dev;
 
 	struct iommu_device	iommu;		/* IOMMU Core handle */
 };
 
-struct gart_domain {
-	struct iommu_domain domain;		/* generic domain handle */
-	struct gart_device *gart;		/* link to gart device   */
-};
-
 static struct gart_device *gart_handle; /* unique for a system */
 
 static bool gart_debug;
@@ -73,11 +62,6 @@ static bool gart_debug;
 #define GART_PTE(_pfn)						\
 	(GART_ENTRY_PHYS_ADDR_VALID | ((_pfn) << PAGE_SHIFT))
 
-static struct gart_domain *to_gart_domain(struct iommu_domain *dom)
-{
-	return container_of(dom, struct gart_domain, domain);
-}
-
 /*
  * Any interaction between any block on PPSB and a block on APB or AHB
  * must have these read-back to ensure the APB/AHB bus transaction is
@@ -166,128 +150,69 @@ static inline bool gart_iova_range_valid(struct gart_device *gart,
 static int gart_iommu_attach_dev(struct iommu_domain *domain,
 				 struct device *dev)
 {
-	struct gart_domain *gart_domain = to_gart_domain(domain);
 	struct gart_device *gart = gart_handle;
-	struct gart_client *client, *c;
-	int err = 0;
-
-	client = kzalloc(sizeof(*c), GFP_KERNEL);
-	if (!client)
-		return -ENOMEM;
-	client->dev = dev;
-
-	spin_lock(&gart->client_lock);
-	list_for_each_entry(c, &gart->client, list) {
-		if (c->dev == dev) {
-			dev_err(gart->dev, "GART: %s is already attached\n",
-				dev_name(dev));
-			err = -EINVAL;
-			goto fail;
-		}
-	}
-	if (gart->active_domain && gart->active_domain != domain) {
-		dev_err(gart->dev,
-			"GART: Only one domain can be active at a time\n");
-		err = -EINVAL;
-		goto fail;
-	}
-	gart->active_domain = domain;
-	gart_domain->gart = gart;
-	list_add(&client->list, &gart->client);
-	spin_unlock(&gart->client_lock);
-	dev_dbg(gart->dev, "GART: Attached %s\n", dev_name(dev));
-	return 0;
+	int ret = 0;
 
-fail:
-	kfree(client);
-	spin_unlock(&gart->client_lock);
-	return err;
-}
+	spin_lock(&gart->dom_lock);
 
-static void __gart_iommu_detach_dev(struct iommu_domain *domain,
-				    struct device *dev)
-{
-	struct gart_domain *gart_domain = to_gart_domain(domain);
-	struct gart_device *gart = gart_domain->gart;
-	struct gart_client *c;
-
-	list_for_each_entry(c, &gart->client, list) {
-		if (c->dev == dev) {
-			list_del(&c->list);
-			kfree(c);
-			if (list_empty(&gart->client)) {
-				gart->active_domain = NULL;
-				gart_domain->gart = NULL;
-			}
-			dev_dbg(gart->dev, "GART: Detached %s\n",
-				dev_name(dev));
-			return;
-		}
+	if (gart->active_domain && gart->active_domain != domain) {
+		ret = -EBUSY;
+	} else if (dev->archdata.iommu != domain) {
+		dev->archdata.iommu = domain;
+		gart->active_domain = domain;
+		gart->active_devices++;
 	}
 
-	dev_err(gart->dev, "GART: Couldn't find %s to detach\n",
-		dev_name(dev));
+	spin_unlock(&gart->dom_lock);
+
+	return ret;
 }
 
 static void gart_iommu_detach_dev(struct iommu_domain *domain,
 				  struct device *dev)
 {
-	struct gart_domain *gart_domain = to_gart_domain(domain);
-	struct gart_device *gart = gart_domain->gart;
+	struct gart_device *gart = gart_handle;
+
+	spin_lock(&gart->dom_lock);
 
-	spin_lock(&gart->client_lock);
-	__gart_iommu_detach_dev(domain, dev);
-	spin_unlock(&gart->client_lock);
+	if (dev->archdata.iommu == domain) {
+		dev->archdata.iommu = NULL;
+
+		if (--gart->active_devices == 0)
+			gart->active_domain = NULL;
+	}
+
+	spin_unlock(&gart->dom_lock);
 }
 
 static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
 {
-	struct gart_domain *gart_domain;
-	struct gart_device *gart;
+	struct gart_device *gart = gart_handle;
+	struct iommu_domain *domain;
 
 	if (type != IOMMU_DOMAIN_UNMANAGED)
 		return NULL;
 
-	gart = gart_handle;
-	if (!gart)
-		return NULL;
-
-	gart_domain = kzalloc(sizeof(*gart_domain), GFP_KERNEL);
-	if (!gart_domain)
-		return NULL;
-
-	gart_domain->domain.geometry.aperture_start = gart->iovmm_base;
-	gart_domain->domain.geometry.aperture_end = gart->iovmm_base +
+	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
+	if (domain) {
+		domain->geometry.aperture_start = gart->iovmm_base;
+		domain->geometry.aperture_end = gart->iovmm_base +
 					gart->page_count * GART_PAGE_SIZE - 1;
-	gart_domain->domain.geometry.force_aperture = true;
+		domain->geometry.force_aperture = true;
+	}
 
-	return &gart_domain->domain;
+	return domain;
 }
 
 static void gart_iommu_domain_free(struct iommu_domain *domain)
 {
-	struct gart_domain *gart_domain = to_gart_domain(domain);
-	struct gart_device *gart = gart_domain->gart;
-
-	if (gart) {
-		spin_lock(&gart->client_lock);
-		if (!list_empty(&gart->client)) {
-			struct gart_client *c, *tmp;
-
-			list_for_each_entry_safe(c, tmp, &gart->client, list)
-				__gart_iommu_detach_dev(domain, c->dev);
-		}
-		spin_unlock(&gart->client_lock);
-	}
-
-	kfree(gart_domain);
+	kfree(domain);
 }
 
 static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
 			  phys_addr_t pa, size_t bytes, int prot)
 {
-	struct gart_domain *gart_domain = to_gart_domain(domain);
-	struct gart_device *gart = gart_domain->gart;
+	struct gart_device *gart = gart_handle;
 	unsigned long flags;
 	unsigned long pfn;
 	unsigned long pte;
@@ -318,8 +243,7 @@ static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
 static size_t gart_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
 			       size_t bytes)
 {
-	struct gart_domain *gart_domain = to_gart_domain(domain);
-	struct gart_device *gart = gart_domain->gart;
+	struct gart_device *gart = gart_handle;
 	unsigned long flags;
 
 	if (!gart_iova_range_valid(gart, iova, bytes))
@@ -334,8 +258,7 @@ static size_t gart_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
 static phys_addr_t gart_iommu_iova_to_phys(struct iommu_domain *domain,
 					   dma_addr_t iova)
 {
-	struct gart_domain *gart_domain = to_gart_domain(domain);
-	struct gart_device *gart = gart_domain->gart;
+	struct gart_device *gart = gart_handle;
 	unsigned long pte;
 	phys_addr_t pa;
 	unsigned long flags;
@@ -394,8 +317,7 @@ static int gart_iommu_of_xlate(struct device *dev,
 
 static void gart_iommu_sync(struct iommu_domain *domain)
 {
-	struct gart_domain *gart_domain = to_gart_domain(domain);
-	struct gart_device *gart = gart_domain->gart;
+	struct gart_device *gart = gart_handle;
 
 	FLUSH_GART_REGS(gart);
 }
@@ -486,8 +408,7 @@ struct gart_device *tegra_gart_probe(struct device *dev,
 	gart->dev = dev;
 	gart_regs = mc->regs + GART_REG_BASE;
 	spin_lock_init(&gart->pte_lock);
-	spin_lock_init(&gart->client_lock);
-	INIT_LIST_HEAD(&gart->client);
+	spin_lock_init(&gart->dom_lock);
 	gart->regs = gart_regs;
 	gart->iovmm_base = (dma_addr_t)res_remap->start;
 	gart->page_count = (resource_size(res_remap) >> GART_PAGE_SHIFT);
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v4 20/20] iommu/tegra: gart: Perform code refactoring
  2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
                   ` (18 preceding siblings ...)
  2018-09-24  0:41 ` [PATCH v4 19/20] iommu/tegra: gart: Simplify clients-tracking code Dmitry Osipenko
@ 2018-09-24  0:41 ` Dmitry Osipenko
  2018-09-24 11:34   ` Thierry Reding
  19 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24  0:41 UTC (permalink / raw)
  To: Thierry Reding, Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy
  Cc: iommu, devicetree, linux-tegra, linux-kernel

Perform a major code cleanup to make the driver more readable and, as a
result, easier to maintain. I removed some redundant safety checks and
some debug code that isn't actually very useful for debugging, like the
enormous pagetable dump on each fault. The majority of the changes are
code reshuffling, variable/whitespace cleanup and removal of debug
messages that duplicate messages of the IOMMU core.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/iommu/tegra-gart.c | 215 +++++++++++++++----------------------
 1 file changed, 84 insertions(+), 131 deletions(-)
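
Two of the changes below are easy to check outside the kernel: the rewritten
GART_PAGE_MASK and the new range check. The following is a user-space sketch,
assuming 32-bit page entries; the MODEL_* macros and
model_iova_range_invalid() are hypothetical stand-ins, and MODEL_GENMASK32()
is an open-coded equivalent of the kernel's GENMASK() for 32-bit values, not
the kernel macro itself.

```c
/* User-space check of the mask rewrite and the range test; not kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT	12
#define MODEL_PAGE_SIZE		(UINT32_C(1) << MODEL_PAGE_SHIFT)
#define MODEL_ENTRY_VALID	(UINT32_C(1) << 31)

/* Old form: clear the in-page offset bits and the valid bit. */
#define MODEL_PAGE_MASK_OLD	(~(MODEL_PAGE_SIZE - 1) & ~MODEL_ENTRY_VALID)

/* Open-coded 32-bit mask with bits h..l set, mirroring GENMASK(h, l). */
#define MODEL_GENMASK32(h, l) \
	((~UINT32_C(0) << (l)) & (~UINT32_C(0) >> (31 - (h))))
#define MODEL_PAGE_MASK_NEW	MODEL_GENMASK32(30, MODEL_PAGE_SHIFT)

/* Both spellings select bits 30..12 of a page entry. */
static_assert(MODEL_PAGE_MASK_OLD == MODEL_PAGE_MASK_NEW,
	      "mask rewrite must not change the value");

/*
 * Mirrors the shape of gart_iova_range_invalid(): the request must be
 * exactly one GART page and lie fully inside [base, end).
 */
static bool model_iova_range_invalid(uint64_t base, uint64_t end,
				     uint64_t iova, size_t bytes)
{
	return iova < base || bytes != MODEL_PAGE_SIZE || iova + bytes > end;
}
```

Note that the new check is stricter than the old gart_iova_range_valid():
besides the bounds, it rejects any size other than a single GART page.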

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 7182445c3b76..a36d0c568536 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -34,63 +34,56 @@
 #define GART_CONFIG		(0x24 - GART_REG_BASE)
 #define GART_ENTRY_ADDR		(0x28 - GART_REG_BASE)
 #define GART_ENTRY_DATA		(0x2c - GART_REG_BASE)
-#define GART_ENTRY_PHYS_ADDR_VALID	(1 << 31)
+
+#define GART_ENTRY_PHYS_ADDR_VALID	BIT(31)
 
 #define GART_PAGE_SHIFT		12
 #define GART_PAGE_SIZE		(1 << GART_PAGE_SHIFT)
-#define GART_PAGE_MASK						\
-	(~(GART_PAGE_SIZE - 1) & ~GART_ENTRY_PHYS_ADDR_VALID)
+#define GART_PAGE_MASK		GENMASK(30, GART_PAGE_SHIFT)
 
 struct gart_device {
 	void __iomem		*regs;
 	u32			*savedata;
-	u32			page_count;	/* total remappable size */
-	dma_addr_t		iovmm_base;	/* offset to vmm_area */
+	unsigned long		iovmm_base;	/* offset to vmm_area start */
+	unsigned long		iovmm_end;	/* offset to vmm_area end */
 	spinlock_t		pte_lock;	/* for pagetable */
 	spinlock_t		dom_lock;	/* for active domain */
 	unsigned int		active_devices;	/* number of active devices */
 	struct iommu_domain	*active_domain;	/* current active domain */
-	struct device		*dev;
-
 	struct iommu_device	iommu;		/* IOMMU Core handle */
+	struct device		*dev;
 };
 
 static struct gart_device *gart_handle; /* unique for a system */
 
 static bool gart_debug;
 
-#define GART_PTE(_pfn)						\
-	(GART_ENTRY_PHYS_ADDR_VALID | ((_pfn) << PAGE_SHIFT))
-
 /*
  * Any interaction between any block on PPSB and a block on APB or AHB
  * must have these read-back to ensure the APB/AHB bus transaction is
  * complete before initiating activity on the PPSB block.
  */
-#define FLUSH_GART_REGS(gart)	((void)readl((gart)->regs + GART_CONFIG))
+#define FLUSH_GART_REGS(gart)	readl_relaxed((gart)->regs + GART_CONFIG)
 
 #define for_each_gart_pte(gart, iova)					\
 	for (iova = gart->iovmm_base;					\
-	     iova < gart->iovmm_base + GART_PAGE_SIZE * gart->page_count; \
+	     iova < gart->iovmm_end;					\
 	     iova += GART_PAGE_SIZE)
 
 static inline void gart_set_pte(struct gart_device *gart,
-				unsigned long offs, u32 pte)
+				unsigned long iova, phys_addr_t pte)
 {
-	writel(offs, gart->regs + GART_ENTRY_ADDR);
-	writel(pte, gart->regs + GART_ENTRY_DATA);
-
-	dev_dbg(gart->dev, "GART: %s %08lx:%08x\n",
-		 pte ? "map" : "unmap", offs, pte & GART_PAGE_MASK);
+	writel_relaxed(iova, gart->regs + GART_ENTRY_ADDR);
+	writel_relaxed(pte, gart->regs + GART_ENTRY_DATA);
 }
 
 static inline unsigned long gart_read_pte(struct gart_device *gart,
-					  unsigned long offs)
+					  unsigned long iova)
 {
 	unsigned long pte;
 
-	writel(offs, gart->regs + GART_ENTRY_ADDR);
-	pte = readl(gart->regs + GART_ENTRY_DATA);
+	writel_relaxed(iova, gart->regs + GART_ENTRY_ADDR);
+	pte = readl_relaxed(gart->regs + GART_ENTRY_DATA);
 
 	return pte;
 }
@@ -102,49 +95,20 @@ static void do_gart_setup(struct gart_device *gart, const u32 *data)
 	for_each_gart_pte(gart, iova)
 		gart_set_pte(gart, iova, data ? *(data++) : 0);
 
-	writel(1, gart->regs + GART_CONFIG);
+	writel_relaxed(1, gart->regs + GART_CONFIG);
 	FLUSH_GART_REGS(gart);
 }
 
-#ifdef DEBUG
-static void gart_dump_table(struct gart_device *gart)
+static inline bool gart_iova_range_invalid(struct gart_device *gart,
+					   unsigned long iova, size_t bytes)
 {
-	unsigned long iova;
-	unsigned long flags;
-
-	spin_lock_irqsave(&gart->pte_lock, flags);
-	for_each_gart_pte(gart, iova) {
-		unsigned long pte;
-
-		pte = gart_read_pte(gart, iova);
-
-		dev_dbg(gart->dev, "GART: %s %08lx:%08lx\n",
-			(GART_ENTRY_PHYS_ADDR_VALID & pte) ? "v" : " ",
-			iova, pte & GART_PAGE_MASK);
-	}
-	spin_unlock_irqrestore(&gart->pte_lock, flags);
+	return unlikely(iova < gart->iovmm_base || bytes != GART_PAGE_SIZE ||
+			iova + bytes > gart->iovmm_end);
 }
-#else
-static inline void gart_dump_table(struct gart_device *gart)
-{
-}
-#endif
 
-static inline bool gart_iova_range_valid(struct gart_device *gart,
-					 unsigned long iova, size_t bytes)
+static inline bool gart_pte_valid(struct gart_device *gart, unsigned long iova)
 {
-	unsigned long iova_start, iova_end, gart_start, gart_end;
-
-	iova_start = iova;
-	iova_end = iova_start + bytes - 1;
-	gart_start = gart->iovmm_base;
-	gart_end = gart_start + gart->page_count * GART_PAGE_SIZE - 1;
-
-	if (iova_start < gart_start)
-		return false;
-	if (iova_end > gart_end)
-		return false;
-	return true;
+	return !!(gart_read_pte(gart, iova) & GART_ENTRY_PHYS_ADDR_VALID);
 }
 
 static int gart_iommu_attach_dev(struct iommu_domain *domain,
@@ -187,7 +151,6 @@ static void gart_iommu_detach_dev(struct iommu_domain *domain,
 
 static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
 {
-	struct gart_device *gart = gart_handle;
 	struct iommu_domain *domain;
 
 	if (type != IOMMU_DOMAIN_UNMANAGED)
@@ -195,9 +158,8 @@ static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
 
 	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
 	if (domain) {
-		domain->geometry.aperture_start = gart->iovmm_base;
-		domain->geometry.aperture_end = gart->iovmm_base +
-					gart->page_count * GART_PAGE_SIZE - 1;
+		domain->geometry.aperture_start = gart_handle->iovmm_base;
+		domain->geometry.aperture_end = gart_handle->iovmm_end - 1;
 		domain->geometry.force_aperture = true;
 	}
 
@@ -209,34 +171,44 @@ static void gart_iommu_domain_free(struct iommu_domain *domain)
 	kfree(domain);
 }
 
+static int __gart_iommu_map(struct gart_device *gart, unsigned long iova,
+			    phys_addr_t pa)
+{
+	if (unlikely(gart_debug && gart_pte_valid(gart, iova))) {
+		dev_WARN(gart->dev, "GART: Page entry is in-use\n");
+		return -EINVAL;
+	}
+
+	gart_set_pte(gart, iova, GART_ENTRY_PHYS_ADDR_VALID | pa);
+
+	return 0;
+}
+
 static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
 			  phys_addr_t pa, size_t bytes, int prot)
 {
 	struct gart_device *gart = gart_handle;
-	unsigned long flags;
-	unsigned long pfn;
-	unsigned long pte;
+	int ret;
 
-	if (!gart_iova_range_valid(gart, iova, bytes))
+	if (gart_iova_range_invalid(gart, iova, bytes))
 		return -EINVAL;
 
-	spin_lock_irqsave(&gart->pte_lock, flags);
-	pfn = __phys_to_pfn(pa);
-	if (!pfn_valid(pfn)) {
-		dev_err(gart->dev, "GART: Invalid page: %pa\n", &pa);
-		spin_unlock_irqrestore(&gart->pte_lock, flags);
+	spin_lock(&gart->pte_lock);
+	ret = __gart_iommu_map(gart, iova, pa);
+	spin_unlock(&gart->pte_lock);
+
+	return ret;
+}
+
+static int __gart_iommu_unmap(struct gart_device *gart, unsigned long iova)
+{
+	if (unlikely(gart_debug && !gart_pte_valid(gart, iova))) {
+		dev_WARN(gart->dev, "GART: Page entry is invalid\n");
 		return -EINVAL;
 	}
-	if (gart_debug) {
-		pte = gart_read_pte(gart, iova);
-		if (pte & GART_ENTRY_PHYS_ADDR_VALID) {
-			spin_unlock_irqrestore(&gart->pte_lock, flags);
-			dev_err(gart->dev, "GART: Page entry is in-use\n");
-			return -EBUSY;
-		}
-	}
-	gart_set_pte(gart, iova, GART_PTE(pfn));
-	spin_unlock_irqrestore(&gart->pte_lock, flags);
+
+	gart_set_pte(gart, iova, 0);
+
 	return 0;
 }
 
@@ -244,15 +216,16 @@ static size_t gart_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
 			       size_t bytes)
 {
 	struct gart_device *gart = gart_handle;
-	unsigned long flags;
+	int err;
 
-	if (!gart_iova_range_valid(gart, iova, bytes))
+	if (gart_iova_range_invalid(gart, iova, bytes))
 		return 0;
 
-	spin_lock_irqsave(&gart->pte_lock, flags);
-	gart_set_pte(gart, iova, 0);
-	spin_unlock_irqrestore(&gart->pte_lock, flags);
-	return bytes;
+	spin_lock(&gart->pte_lock);
+	err = __gart_iommu_unmap(gart, iova);
+	spin_unlock(&gart->pte_lock);
+
+	return err ? 0 : bytes;
 }
 
 static phys_addr_t gart_iommu_iova_to_phys(struct iommu_domain *domain,
@@ -260,24 +233,15 @@ static phys_addr_t gart_iommu_iova_to_phys(struct iommu_domain *domain,
 {
 	struct gart_device *gart = gart_handle;
 	unsigned long pte;
-	phys_addr_t pa;
-	unsigned long flags;
 
-	if (!gart_iova_range_valid(gart, iova, 0))
+	if (gart_iova_range_invalid(gart, iova, SZ_4K))
 		return -EINVAL;
 
-	spin_lock_irqsave(&gart->pte_lock, flags);
+	spin_lock(&gart->pte_lock);
 	pte = gart_read_pte(gart, iova);
-	spin_unlock_irqrestore(&gart->pte_lock, flags);
+	spin_unlock(&gart->pte_lock);
 
-	pa = (pte & GART_PAGE_MASK);
-	if (!pfn_valid(__phys_to_pfn(pa))) {
-		dev_err(gart->dev, "GART: No entry for %08llx:%pa\n",
-			 (unsigned long long)iova, &pa);
-		gart_dump_table(gart);
-		return -EINVAL;
-	}
-	return pa;
+	return pte & GART_PAGE_MASK;
 }
 
 static bool gart_iommu_capable(enum iommu_cap cap)
@@ -342,24 +306,19 @@ static const struct iommu_ops gart_iommu_ops = {
 
 int tegra_gart_suspend(struct gart_device *gart)
 {
-	unsigned long iova;
 	u32 *data = gart->savedata;
-	unsigned long flags;
+	unsigned long iova;
 
-	spin_lock_irqsave(&gart->pte_lock, flags);
 	for_each_gart_pte(gart, iova)
 		*(data++) = gart_read_pte(gart, iova);
-	spin_unlock_irqrestore(&gart->pte_lock, flags);
+
 	return 0;
 }
 
 int tegra_gart_resume(struct gart_device *gart)
 {
-	unsigned long flags;
-
-	spin_lock_irqsave(&gart->pte_lock, flags);
 	do_gart_setup(gart, gart->savedata);
-	spin_unlock_irqrestore(&gart->pte_lock, flags);
+
 	return 0;
 }
 
@@ -368,8 +327,7 @@ struct gart_device *tegra_gart_probe(struct device *dev,
 				     struct tegra_mc *mc)
 {
 	struct gart_device *gart;
-	struct resource *res_remap;
-	void __iomem *gart_regs;
+	struct resource *res;
 	int ret;
 
 	BUILD_BUG_ON(PAGE_SHIFT != GART_PAGE_SHIFT);
@@ -379,9 +337,8 @@ struct gart_device *tegra_gart_probe(struct device *dev,
 		return NULL;
 
 	/* the GART memory aperture is required */
-	res_remap = platform_get_resource(to_platform_device(dev),
-					  IORESOURCE_MEM, 1);
-	if (!res_remap) {
+	res = platform_get_resource(to_platform_device(dev), IORESOURCE_MEM, 1);
+	if (!res) {
 		dev_err(dev, "GART: Memory aperture resource unavailable\n");
 		return ERR_PTR(-ENXIO);
 	}
@@ -390,39 +347,35 @@ struct gart_device *tegra_gart_probe(struct device *dev,
 	if (!gart)
 		return ERR_PTR(-ENOMEM);
 
+	gart_handle = gart;
+
+	gart->dev = dev;
+	gart->regs = mc->regs + GART_REG_BASE;
+	gart->iovmm_base = res->start;
+	gart->iovmm_end = res->start + resource_size(res);
+	spin_lock_init(&gart->pte_lock);
+	spin_lock_init(&gart->dom_lock);
+
+	do_gart_setup(gart, NULL);
+
 	ret = iommu_device_sysfs_add(&gart->iommu, dev, NULL, "gart");
-	if (ret) {
-		dev_err(dev, "GART: Failed to register IOMMU sysfs\n");
+	if (ret)
 		goto free_gart;
-	}
 
 	iommu_device_set_ops(&gart->iommu, &gart_iommu_ops);
 	iommu_device_set_fwnode(&gart->iommu, dev->fwnode);
 
 	ret = iommu_device_register(&gart->iommu);
-	if (ret) {
-		dev_err(dev, "GART: Failed to register IOMMU\n");
+	if (ret)
 		goto remove_sysfs;
-	}
 
-	gart->dev = dev;
-	gart_regs = mc->regs + GART_REG_BASE;
-	spin_lock_init(&gart->pte_lock);
-	spin_lock_init(&gart->dom_lock);
-	gart->regs = gart_regs;
-	gart->iovmm_base = (dma_addr_t)res_remap->start;
-	gart->page_count = (resource_size(res_remap) >> GART_PAGE_SHIFT);
-
-	gart->savedata = vmalloc(array_size(sizeof(u32), gart->page_count));
+	gart->savedata = vmalloc(resource_size(res) / GART_PAGE_SIZE *
+				 sizeof(u32));
 	if (!gart->savedata) {
 		ret = -ENOMEM;
 		goto unregister_iommu;
 	}
 
-	do_gart_setup(gart, NULL);
-
-	gart_handle = gart;
-
 	return gart;
 
 unregister_iommu:
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 06/20] dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc
  2018-09-24  0:41 ` [PATCH v4 06/20] dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc Dmitry Osipenko
@ 2018-09-24  9:55   ` Thierry Reding
  2018-09-27 18:41     ` Rob Herring
  2018-09-27 18:41   ` Rob Herring
  1 sibling, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2018-09-24  9:55 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3055 bytes --]

On Mon, Sep 24, 2018 at 03:41:39AM +0300, Dmitry Osipenko wrote:
> Splitting GART and Memory Controller was not a good decision when it was
> made back in the day. Given that the GART driver has never been used by
> anything in the kernel, we decided that it would be better to correct the
> mistakes of the past and merge the two bindings into a single one. As a
> result there is a DT ABI change for the Memory Controller that breaks
> neither newer kernels using an older DT nor older kernels using a newer
> DT; this is done by changing the 'compatible' of the node to
> 'tegra20-mc-gart' and adding a newly required clock property. The new clock
> property also puts the tegra20-mc binding in line with the bindings of the
> later Tegra generations.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  .../bindings/iommu/nvidia,tegra20-gart.txt    | 14 ----------
>  .../memory-controllers/nvidia,tegra20-mc.txt  | 27 +++++++++++++------
>  2 files changed, 19 insertions(+), 22 deletions(-)
>  delete mode 100644 Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
> 
> diff --git a/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt b/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
> deleted file mode 100644
> index 099d9362ebc1..000000000000
> --- a/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
> +++ /dev/null
> @@ -1,14 +0,0 @@
> -NVIDIA Tegra 20 GART
> -
> -Required properties:
> -- compatible: "nvidia,tegra20-gart"
> -- reg: Two pairs of cells specifying the physical address and size of
> -  the memory controller registers and the GART aperture respectively.
> -
> -Example:
> -
> -	gart {
> -		compatible = "nvidia,tegra20-gart";
> -		reg = <0x7000f024 0x00000018	/* controller registers */
> -		       0x58000000 0x02000000>;	/* GART aperture */
> -	};
> diff --git a/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
> index 7d60a50a4fa1..e55328237df4 100644
> --- a/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
> +++ b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
> @@ -1,26 +1,37 @@
>  NVIDIA Tegra20 MC(Memory Controller)
>  
>  Required properties:
> -- compatible : "nvidia,tegra20-mc"
> -- reg : Should contain 2 register ranges(address and length); see the
> -  example below. Note that the MC registers are interleaved with the
> -  GART registers, and hence must be represented as multiple ranges.
> +- compatible : "nvidia,tegra20-mc-gart"
> +- reg : Should contain 2 register ranges: physical base address and length of
> +  the controller's registers and the GART aperture respectively.

Couldn't we have achieved the same thing by adding a reg-names property
instead of using a different compatible string? After all we only change
what information the DT provides, but the device is still a "tegra20-mc"
device.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 09/20] memory: tegra: Adapt to Tegra20 device-tree binding changes
  2018-09-24  0:41 ` [PATCH v4 09/20] memory: tegra: Adapt to Tegra20 device-tree binding changes Dmitry Osipenko
@ 2018-09-24 10:02   ` Thierry Reding
  2018-09-24 13:22     ` Dmitry Osipenko
  0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2018-09-24 10:02 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2114 bytes --]

On Mon, Sep 24, 2018 at 03:41:42AM +0300, Dmitry Osipenko wrote:
> The tegra20-mc device-tree binding has been changed, GART has been
> squashed into Memory Controller and now the clock property is mandatory
> for Tegra20, the DT compatible has been changed as well. Adapt driver to
> the DT changes.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/memory/tegra/mc.c | 21 ++++++++-------------
>  drivers/memory/tegra/mc.h |  6 ------
>  include/soc/tegra/mc.h    |  2 +-
>  3 files changed, 9 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
> index e56862495f36..1b4ceefd82f9 100644
> --- a/drivers/memory/tegra/mc.c
> +++ b/drivers/memory/tegra/mc.c
> @@ -51,7 +51,7 @@
>  
>  static const struct of_device_id tegra_mc_of_match[] = {
>  #ifdef CONFIG_ARCH_TEGRA_2x_SOC
> -	{ .compatible = "nvidia,tegra20-mc", .data = &tegra20_mc_soc },
> +	{ .compatible = "nvidia,tegra20-mc-gart", .data = &tegra20_mc_soc },

Technically we now regress because we no longer support the older device
tree bindings. I know that it doesn't really matter because this driver
doesn't really do much interesting yet other than reporting memory
access violations, but if that's enough to warrant a change of the
compatible string, then I think we also need to preserve compatibility
in the code.

That said, I think compatibility would be easier to preserve if we stuck
with the old compatible string and used a "reg-names" property to
specify which version of the binding we're referring to.

For example, we could have:

	memory-controller@7000f000 {
		compatible = "nvidia,tegra20-mc";
		reg = <0x7000f000 0x024
		       0x7000f03c 0x3c4>;
		...
	};

for the old binding and:

	memory-controller@7000f000 {
		compatible = "nvidia,tegra20-mc";
		reg = <0x7000f000 0x00000400>,
		      <0x58000000 0x02000000>;
		reg-names = "mc", "gart";
		...
	};

for the new binding. The driver can then easily check for the existence
of the reg-names property and take the legacy or new code paths.
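
A rough user-space sketch of that check, to make the idea concrete. This is
only a model under assumptions: struct model_node and the model_* helpers are
hypothetical and merely imitate a device-tree node; a real driver would query
the node through the kernel's OF property helpers instead.

```c
/* User-space model of choosing a code path by "reg-names"; not kernel code. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum model_binding {
	MODEL_BINDING_LEGACY,	/* no reg-names: MC registers as two ranges */
	MODEL_BINDING_MC_GART,	/* reg-names = "mc", "gart" */
};

struct model_node {
	const char *const *reg_names;	/* NULL when the property is absent */
	size_t num_reg_names;
};

/* The mere presence of the property selects the new code path. */
static enum model_binding model_detect_binding(const struct model_node *np)
{
	assert(np != NULL);

	return np->reg_names ? MODEL_BINDING_MC_GART : MODEL_BINDING_LEGACY;
}

/* Index of the named "reg" entry, or -1 if the name is not listed. */
static int model_reg_index(const struct model_node *np, const char *name)
{
	size_t i;

	assert(np != NULL);

	for (i = 0; i < np->num_reg_names; i++)
		if (strcmp(np->reg_names[i], name) == 0)
			return (int)i;

	return -1;
}
```

With the second example node above, looking up "gart" would give index 1,
i.e. the aperture range, while the first node has no names at all and falls
back to the legacy handling.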

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 01/20] iommu/tegra: gart: Remove pr_fmt and clean up includes
  2018-09-24  0:41 ` [PATCH v4 01/20] iommu/tegra: gart: Remove pr_fmt and clean up includes Dmitry Osipenko
@ 2018-09-24 10:02   ` Thierry Reding
  0 siblings, 0 replies; 59+ messages in thread
From: Thierry Reding @ 2018-09-24 10:02 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 467 bytes --]

On Mon, Sep 24, 2018 at 03:41:34AM +0300, Dmitry Osipenko wrote:
> Remove unneeded header inclusions and sort the headers in alphabetical
> order. Remove the pr_fmt macro since there is no pr_*() in the code and it
> doesn't affect dev_*() functions.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/iommu/tegra-gart.c | 17 +++++------------
>  1 file changed, 5 insertions(+), 12 deletions(-)

Acked-by: Thierry Reding <treding@nvidia.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 02/20] iommu/tegra: gart: Clean up driver probe errors handling
  2018-09-24  0:41 ` [PATCH v4 02/20] iommu/tegra: gart: Clean up driver probe errors handling Dmitry Osipenko
@ 2018-09-24 10:02   ` Thierry Reding
  0 siblings, 0 replies; 59+ messages in thread
From: Thierry Reding @ 2018-09-24 10:02 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 386 bytes --]

On Mon, Sep 24, 2018 at 03:41:35AM +0300, Dmitry Osipenko wrote:
> Properly clean up allocated resources on driver probe failure and
> remove unneeded checks.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/iommu/tegra-gart.c | 16 ++++++++++------
>  1 file changed, 10 insertions(+), 6 deletions(-)

Acked-by: Thierry Reding <treding@nvidia.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 03/20] iommu/tegra: gart: Ignore devices without IOMMU phandle in DT
  2018-09-24  0:41 ` [PATCH v4 03/20] iommu/tegra: gart: Ignore devices without IOMMU phandle in DT Dmitry Osipenko
@ 2018-09-24 10:05   ` Thierry Reding
  2018-09-24 18:41     ` Dmitry Osipenko
  0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2018-09-24 10:05 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 571 bytes --]

On Mon, Sep 24, 2018 at 03:41:36AM +0300, Dmitry Osipenko wrote:
> GART can't handle all devices, hence ignore devices that aren't related
> to GART. The IOMMU phandle must be explicitly assigned to devices in the
> device tree.

I think technically the GART can indeed handle all devices since it is
just a physical address region that can be used to remap other physical
addresses. That's not to say that doing so would be a good idea. So the
commit message here is slightly confusing, but other than that the idea
is good, so:

Acked-by: Thierry Reding <treding@nvidia.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 04/20] iommu: Introduce iotlb_sync_map callback
  2018-09-24  0:41 ` [PATCH v4 04/20] iommu: Introduce iotlb_sync_map callback Dmitry Osipenko
@ 2018-09-24 10:06   ` Thierry Reding
  0 siblings, 0 replies; 59+ messages in thread
From: Thierry Reding @ 2018-09-24 10:06 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 644 bytes --]

On Mon, Sep 24, 2018 at 03:41:37AM +0300, Dmitry Osipenko wrote:
> Introduce the iotlb_sync_map() callback that is invoked at the end of
> iommu_map(). This new callback allows IOMMU drivers to avoid syncing
> after mapping each contiguous chunk and to sync only when the whole
> mapping is completed, optimizing performance of the mapping operation.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  drivers/iommu/iommu.c | 8 ++++++--
>  include/linux/iommu.h | 1 +
>  2 files changed, 7 insertions(+), 2 deletions(-)

Reviewed-by: Thierry Reding <treding@nvidia.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 05/20] iommu/tegra: gart: Optimize mapping / unmapping performance
  2018-09-24  0:41 ` [PATCH v4 05/20] iommu/tegra: gart: Optimize mapping / unmapping performance Dmitry Osipenko
@ 2018-09-24 10:07   ` Thierry Reding
  0 siblings, 0 replies; 59+ messages in thread
From: Thierry Reding @ 2018-09-24 10:07 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 596 bytes --]

On Mon, Sep 24, 2018 at 03:41:38AM +0300, Dmitry Osipenko wrote:
> Currently GART writes one page entry at a time. It is more efficient to
> aggregate the writes and flush the bus buffer at the end, which gives
> map/unmap a 10-40% performance boost (depending on the size of the
> mapping) compared to flushing after each page entry update.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/iommu/tegra-gart.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)

10-40% sounds really nice, great stuff:

Acked-by: Thierry Reding <treding@nvidia.com>

* Re: [PATCH v4 11/20] memory: tegra: Use of_device_get_match_data()
  2018-09-24  0:41 ` [PATCH v4 11/20] memory: tegra: Use of_device_get_match_data() Dmitry Osipenko
@ 2018-09-24 10:13   ` Thierry Reding
  2018-09-24 18:39     ` Dmitry Osipenko
  0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2018-09-24 10:13 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1364 bytes --]

On Mon, Sep 24, 2018 at 03:41:44AM +0300, Dmitry Osipenko wrote:
> There is no need to match the device with the DT node since it was already
> matched. Use the of_device_get_match_data() helper to get the match data.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/memory/tegra/mc.c | 10 ++--------
>  1 file changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
> index 5454ffe5b2e0..cdc33f93cf7c 100644
> --- a/drivers/memory/tegra/mc.c
> +++ b/drivers/memory/tegra/mc.c
> @@ -11,8 +11,7 @@
>  #include <linux/interrupt.h>
>  #include <linux/kernel.h>
>  #include <linux/module.h>
> -#include <linux/of.h>
> -#include <linux/platform_device.h>

It's better not to remove these two because the code still uses
functions declared in them. If ever we were going to remove code using
linux/of_device.h and then remove the linux/of_device.h include, we'd
break the build and have to reintroduce the includes.

The same would happen if linux/of_device.h were ever to stop including
linux/platform_device.h or linux/of.h. That may sound unlikely, but it
has happened in the past with other includes. It can also happen that
some restructuring takes place in some headers that is not so obvious
and then things can still start falling apart miles away.

Thierry

* Re: [PATCH v4 12/20] iommu/tegra: gart: Integrate with Memory Controller driver
  2018-09-24  0:41 ` [PATCH v4 12/20] iommu/tegra: gart: Integrate with Memory Controller driver Dmitry Osipenko
@ 2018-09-24 10:23   ` Thierry Reding
  2018-09-24 18:22     ` Dmitry Osipenko
  0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2018-09-24 10:23 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 932 bytes --]

On Mon, Sep 24, 2018 at 03:41:45AM +0300, Dmitry Osipenko wrote:
> The device-tree binding has been changed. There is no separate GART device
> anymore, it is squashed into the Memory Controller. Integrate GART module
> with the MC in a way it is done for the SMMU of Tegra30+.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/iommu/Kconfig      |  1 +
>  drivers/iommu/tegra-gart.c | 98 ++++++++++----------------------------
>  drivers/memory/tegra/mc.c  | 41 ++++++++++++++++
>  include/soc/tegra/mc.h     | 27 +++++++++++
>  4 files changed, 93 insertions(+), 74 deletions(-)

I think this could technically have been two patches, but since they'd
have a compile-time dependency either way, they need to be applied in the
correct order. Some coordination between the IOMMU and Tegra trees is
going to have to happen anyway, so we might as well just stick this into a
single patch.

Thierry

* Re: [PATCH v4 13/20] iommu/tegra: gart: Fix spinlock recursion
  2018-09-24  0:41 ` [PATCH v4 13/20] iommu/tegra: gart: Fix spinlock recursion Dmitry Osipenko
@ 2018-09-24 10:49   ` Thierry Reding
  0 siblings, 0 replies; 59+ messages in thread
From: Thierry Reding @ 2018-09-24 10:49 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 430 bytes --]

On Mon, Sep 24, 2018 at 03:41:46AM +0300, Dmitry Osipenko wrote:
> Fix spinlock recursion bug that happens on IOMMU domain destruction if
> any of the allocated domains have devices attached to them.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/iommu/tegra-gart.c | 24 ++++++++++++++++--------
>  1 file changed, 16 insertions(+), 8 deletions(-)

Acked-by: Thierry Reding <treding@nvidia.com>

* Re: [PATCH v4 14/20] iommu/tegra: gart: Fix NULL pointer dereference
  2018-09-24  0:41 ` [PATCH v4 14/20] iommu/tegra: gart: Fix NULL pointer dereference Dmitry Osipenko
@ 2018-09-24 10:49   ` Thierry Reding
  0 siblings, 0 replies; 59+ messages in thread
From: Thierry Reding @ 2018-09-24 10:49 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 457 bytes --]

On Mon, Sep 24, 2018 at 03:41:47AM +0300, Dmitry Osipenko wrote:
> Fix NULL pointer dereference on IOMMU domain destruction that happens
> because clients list is being iterated unsafely and its elements are
> getting deleted during the iteration.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/iommu/tegra-gart.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Acked-by: Thierry Reding <treding@nvidia.com>
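
The bug class named in the commit message can be shown outside the kernel.
The following is a minimal userspace sketch on a hand-rolled singly linked
list; struct client, client_add() and client_remove_all() are hypothetical
names. The point is only that the successor is saved before the current
node is freed, which is the job list_for_each_entry_safe() does for kernel
lists.

```c
#include <assert.h>
#include <stdlib.h>

struct client {
	int id;
	struct client *next;
};

/* Push a new client on the front of the list; 0 on success, -1 on allocation failure. */
static int client_add(struct client **head, int id)
{
	struct client *c;

	assert(head);

	c = malloc(sizeof(*c));
	if (!c)
		return -1;

	c->id = id;
	c->next = *head;
	*head = c;
	return 0;
}

/* Free every client and return how many were removed. */
static unsigned int client_remove_all(struct client **head)
{
	struct client *c, *tmp;
	unsigned int removed = 0;

	assert(head);

	/*
	 * tmp caches the successor before c is freed. Reading c->next in
	 * the loop increment after free(c) would be a use-after-free.
	 */
	for (c = *head; c; c = tmp) {
		tmp = c->next;
		free(c);
		removed++;
	}

	*head = NULL;
	return removed;
}
```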

* Re: [PATCH v4 15/20] iommu/tegra: gart: Allow only one active domain at a time
  2018-09-24  0:41 ` [PATCH v4 15/20] iommu/tegra: gart: Allow only one active domain at a time Dmitry Osipenko
@ 2018-09-24 10:50   ` Thierry Reding
  0 siblings, 0 replies; 59+ messages in thread
From: Thierry Reding @ 2018-09-24 10:50 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 880 bytes --]

On Mon, Sep 24, 2018 at 03:41:48AM +0300, Dmitry Osipenko wrote:
> GART has a single address space that is shared by all devices, hence only
> one domain can be active at a time.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/iommu/tegra-gart.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
> index 1d45b023adea..9f7d3afb686f 100644
> --- a/drivers/iommu/tegra-gart.c
> +++ b/drivers/iommu/tegra-gart.c
> @@ -55,6 +55,7 @@ struct gart_device {
>  	spinlock_t		pte_lock;	/* for pagetable */
>  	struct list_head	client;
>  	spinlock_t		client_lock;	/* for client list */
> +	struct iommu_domain	*active_domain;	/* current active domain */

The active_ prefix seems a little unnecessary here, but either way:

Acked-by: Thierry Reding <treding@nvidia.com>

* Re: [PATCH v4 16/20] iommu/tegra: gart: Don't use managed resources
  2018-09-24  0:41 ` [PATCH v4 16/20] iommu/tegra: gart: Don't use managed resources Dmitry Osipenko
@ 2018-09-24 10:52   ` Thierry Reding
  2018-09-24 18:57     ` Dmitry Osipenko
  0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2018-09-24 10:52 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 524 bytes --]

On Mon, Sep 24, 2018 at 03:41:49AM +0300, Dmitry Osipenko wrote:
> GART is a part of the Memory Controller driver that is always built-in,
> hence there is no benefit from the use of managed resources.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/iommu/tegra-gart.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)

One benefit would be cleanup on probe error. Also, we may eventually
want to make even the memory controller driver buildable as a module.

Thierry

* Re: [PATCH v4 17/20] iommu/tegra: gart: Prepend error/debug messages with "GART:"
  2018-09-24  0:41 ` [PATCH v4 17/20] iommu/tegra: gart: Prepend error/debug messages with "GART:" Dmitry Osipenko
@ 2018-09-24 10:57   ` Thierry Reding
  2018-09-24 18:09     ` Dmitry Osipenko
  0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2018-09-24 10:57 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1056 bytes --]

On Mon, Sep 24, 2018 at 03:41:50AM +0300, Dmitry Osipenko wrote:
> GART became a part of the Memory Controller, hence the driver's device
> is now the Memory Controller and not GART. As a result all printed messages
> are prepended with "tegra-mc 7000f000.memory-controller:", so let's
> prepend GART's messages with "GART:" in order to differentiate them
> from the MC's.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/iommu/tegra-gart.c | 36 ++++++++++++++++++------------------
>  1 file changed, 18 insertions(+), 18 deletions(-)

There's a macro called dev_fmt (similar to pr_fmt) to do this for dev_*
printers. Also I think this would be more readable if the prefix was
"gart: " rather than "GART: ". At least from my personal experience I
get easily distracted by all-caps words in logs, because they usually
indicate something that requires immediate attention. I think it's
better to leave that up to higher level mechanisms, such as the color
keying of messages based on level by tools like dmesg.
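
As a hedged illustration of the dev_fmt suggestion, here is a userspace
model of how the prefix gets glued into every format string. Only dev_fmt
itself is the kernel's name; model_dev_err() is a hypothetical stand-in for
dev_err() built on snprintf(), and "host1x" is just a placeholder client
name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * In a kernel driver this definition would sit above the #include lines so
 * that it overrides the default from linux/device.h.
 */
#define dev_fmt(fmt)	"gart: " fmt

/*
 * Userspace stand-in for dev_err(): prints "<device name>: " followed by the
 * dev_fmt()-wrapped format. Needs at least one argument after the format.
 */
#define model_dev_err(buf, len, name, fmt, ...)				\
	snprintf((buf), (len), "%s: " dev_fmt(fmt), (name), __VA_ARGS__)

/* Returns 1 if the message carries both prefixes in the expected order. */
static int example_prefix_ok(void)
{
	char buf[128];
	int n;

	n = model_dev_err(buf, sizeof(buf),
			  "tegra-mc 7000f000.memory-controller",
			  "%s is already attached\n", "host1x");
	assert(n > 0 && (size_t)n < sizeof(buf));	/* not truncated */

	return strcmp(buf, "tegra-mc 7000f000.memory-controller: gart: "
			   "host1x is already attached\n") == 0;
}
```

With this pattern the individual call sites keep their plain format
strings, and the prefix can be changed in one place.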

Thierry

* Re: [PATCH v4 18/20] iommu/tegra: gart: Don't detach devices from inactive domains
  2018-09-24  0:41 ` [PATCH v4 18/20] iommu/tegra: gart: Don't detach devices from inactive domains Dmitry Osipenko
@ 2018-09-24 11:00   ` Thierry Reding
  2018-09-24 18:05     ` Dmitry Osipenko
  0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2018-09-24 11:00 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 692 bytes --]

On Mon, Sep 24, 2018 at 03:41:51AM +0300, Dmitry Osipenko wrote:
> There can be an unlimited number of allocated domains, but only one domain
> can be active at a time. Hence devices must be detached only from the
> active domain.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/iommu/tegra-gart.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)

Do we have a mechanism of switching out different domains? I don't think
we do, so I'm wondering if perhaps a better solution to this would be to
just refuse to create more than one domain in the first place. That
would also allow us to get rid of the global variable gart_handle.
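
One way to picture "refuse to create more than one domain" is the
userspace sketch below. Everything in it is an assumption made for
illustration: struct model_gart, struct model_domain and both functions are
hypothetical, locking is left out, and how a real domain_alloc() callback
would find its GART instance without a global is deliberately left open.

```c
#include <assert.h>
#include <stdlib.h>

struct model_domain {
	int unused;			/* placeholder payload */
};

struct model_gart {
	struct model_domain *domain;	/* the single allowed domain, or NULL */
};

/* Returns a new domain, or NULL if one already exists or allocation fails. */
static struct model_domain *model_domain_alloc(struct model_gart *gart)
{
	struct model_domain *domain;

	assert(gart);

	if (gart->domain)
		return NULL;		/* a domain already exists: refuse */

	domain = calloc(1, sizeof(*domain));
	gart->domain = domain;		/* stays NULL if calloc() failed */
	return domain;
}

/* Releases the single domain so that a later allocation can succeed again. */
static void model_domain_free(struct model_gart *gart,
			      struct model_domain *domain)
{
	assert(gart && gart->domain == domain);

	gart->domain = NULL;
	free(domain);
}
```

In this model a second allocation fails until the first domain has been
freed, so there is never an inactive domain to detach devices from.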

Thierry

* Re: [PATCH v4 19/20] iommu/tegra: gart: Simplify clients-tracking code
  2018-09-24  0:41 ` [PATCH v4 19/20] iommu/tegra: gart: Simplify clients-tracking code Dmitry Osipenko
@ 2018-09-24 11:10   ` Thierry Reding
  2018-09-24 17:50     ` Dmitry Osipenko
  0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2018-09-24 11:10 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 8092 bytes --]

On Mon, Sep 24, 2018 at 03:41:52AM +0300, Dmitry Osipenko wrote:
> GART is a simple IOMMU provider that has a single address space. There is
> no need to set up a global clients list and manage it for tracking of the
> active domain, hence lots of code can be safely removed and replaced
> with a simpler alternative.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/iommu/tegra-gart.c | 157 +++++++++----------------------------
>  1 file changed, 39 insertions(+), 118 deletions(-)
> 
> diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
> index 306e9644a676..7182445c3b76 100644
> --- a/drivers/iommu/tegra-gart.c
> +++ b/drivers/iommu/tegra-gart.c
> @@ -19,7 +19,6 @@
>  
>  #include <linux/io.h>
>  #include <linux/iommu.h>
> -#include <linux/list.h>
>  #include <linux/module.h>
>  #include <linux/platform_device.h>
>  #include <linux/slab.h>
> @@ -42,30 +41,20 @@
>  #define GART_PAGE_MASK						\
>  	(~(GART_PAGE_SIZE - 1) & ~GART_ENTRY_PHYS_ADDR_VALID)
>  
> -struct gart_client {
> -	struct device		*dev;
> -	struct list_head	list;
> -};
> -
>  struct gart_device {
>  	void __iomem		*regs;
>  	u32			*savedata;
>  	u32			page_count;	/* total remappable size */
>  	dma_addr_t		iovmm_base;	/* offset to vmm_area */
>  	spinlock_t		pte_lock;	/* for pagetable */
> -	struct list_head	client;
> -	spinlock_t		client_lock;	/* for client list */
> +	spinlock_t		dom_lock;	/* for active domain */
> +	unsigned int		active_devices;	/* number of active devices */
>  	struct iommu_domain	*active_domain;	/* current active domain */
>  	struct device		*dev;
>  
>  	struct iommu_device	iommu;		/* IOMMU Core handle */
>  };
>  
> -struct gart_domain {
> -	struct iommu_domain domain;		/* generic domain handle */
> -	struct gart_device *gart;		/* link to gart device   */
> -};
> -
>  static struct gart_device *gart_handle; /* unique for a system */
>  
>  static bool gart_debug;
> @@ -73,11 +62,6 @@ static bool gart_debug;
>  #define GART_PTE(_pfn)						\
>  	(GART_ENTRY_PHYS_ADDR_VALID | ((_pfn) << PAGE_SHIFT))
>  
> -static struct gart_domain *to_gart_domain(struct iommu_domain *dom)
> -{
> -	return container_of(dom, struct gart_domain, domain);
> -}
> -
>  /*
>   * Any interaction between any block on PPSB and a block on APB or AHB
>   * must have these read-back to ensure the APB/AHB bus transaction is
> @@ -166,128 +150,69 @@ static inline bool gart_iova_range_valid(struct gart_device *gart,
>  static int gart_iommu_attach_dev(struct iommu_domain *domain,
>  				 struct device *dev)
>  {
> -	struct gart_domain *gart_domain = to_gart_domain(domain);
>  	struct gart_device *gart = gart_handle;
> -	struct gart_client *client, *c;
> -	int err = 0;
> -
> -	client = kzalloc(sizeof(*c), GFP_KERNEL);
> -	if (!client)
> -		return -ENOMEM;
> -	client->dev = dev;
> -
> -	spin_lock(&gart->client_lock);
> -	list_for_each_entry(c, &gart->client, list) {
> -		if (c->dev == dev) {
> -			dev_err(gart->dev, "GART: %s is already attached\n",
> -				dev_name(dev));
> -			err = -EINVAL;
> -			goto fail;
> -		}
> -	}
> -	if (gart->active_domain && gart->active_domain != domain) {
> -		dev_err(gart->dev,
> -			"GART: Only one domain can be active at a time\n");
> -		err = -EINVAL;
> -		goto fail;
> -	}
> -	gart->active_domain = domain;
> -	gart_domain->gart = gart;
> -	list_add(&client->list, &gart->client);
> -	spin_unlock(&gart->client_lock);
> -	dev_dbg(gart->dev, "GART: Attached %s\n", dev_name(dev));
> -	return 0;
> +	int ret = 0;
>  
> -fail:
> -	kfree(client);
> -	spin_unlock(&gart->client_lock);
> -	return err;
> -}
> +	spin_lock(&gart->dom_lock);
>  
> -static void __gart_iommu_detach_dev(struct iommu_domain *domain,
> -				    struct device *dev)
> -{
> -	struct gart_domain *gart_domain = to_gart_domain(domain);
> -	struct gart_device *gart = gart_domain->gart;
> -	struct gart_client *c;
> -
> -	list_for_each_entry(c, &gart->client, list) {
> -		if (c->dev == dev) {
> -			list_del(&c->list);
> -			kfree(c);
> -			if (list_empty(&gart->client)) {
> -				gart->active_domain = NULL;
> -				gart_domain->gart = NULL;
> -			}
> -			dev_dbg(gart->dev, "GART: Detached %s\n",
> -				dev_name(dev));
> -			return;
> -		}
> +	if (gart->active_domain && gart->active_domain != domain) {
> +		ret = -EBUSY;

This omits the error message and returns -EBUSY instead of -EINVAL. Was
this intended? For what it's worth, I do agree with the changes, it's
just that I think you could've made those in the earlier patch that
introduced them.

But this is all one series and the end result looks fine, so no need to
be that picky.

> +	} else if (dev->archdata.iommu != domain) {
> +		dev->archdata.iommu = domain;
> +		gart->active_domain = domain;
> +		gart->active_devices++;
>  	}
>  
> -	dev_err(gart->dev, "GART: Couldn't find %s to detach\n",
> -		dev_name(dev));
> +	spin_unlock(&gart->dom_lock);
> +
> +	return ret;
>  }
>  
>  static void gart_iommu_detach_dev(struct iommu_domain *domain,
>  				  struct device *dev)
>  {
> -	struct gart_domain *gart_domain = to_gart_domain(domain);
> -	struct gart_device *gart = gart_domain->gart;
> +	struct gart_device *gart = gart_handle;
> +
> +	spin_lock(&gart->dom_lock);
>  
> -	spin_lock(&gart->client_lock);
> -	__gart_iommu_detach_dev(domain, dev);
> -	spin_unlock(&gart->client_lock);
> +	if (dev->archdata.iommu == domain) {
> +		dev->archdata.iommu = NULL;
> +
> +		if (--gart->active_devices == 0)
> +			gart->active_domain = NULL;
> +	}
> +
> +	spin_unlock(&gart->dom_lock);
>  }
>  
>  static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
>  {
> -	struct gart_domain *gart_domain;
> -	struct gart_device *gart;
> +	struct gart_device *gart = gart_handle;
> +	struct iommu_domain *domain;
>  
>  	if (type != IOMMU_DOMAIN_UNMANAGED)
>  		return NULL;
>  
> -	gart = gart_handle;
> -	if (!gart)
> -		return NULL;
> -
> -	gart_domain = kzalloc(sizeof(*gart_domain), GFP_KERNEL);
> -	if (!gart_domain)
> -		return NULL;
> -
> -	gart_domain->domain.geometry.aperture_start = gart->iovmm_base;
> -	gart_domain->domain.geometry.aperture_end = gart->iovmm_base +
> +	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
> +	if (domain) {
> +		domain->geometry.aperture_start = gart->iovmm_base;
> +		domain->geometry.aperture_end = gart->iovmm_base +
>  					gart->page_count * GART_PAGE_SIZE - 1;
> -	gart_domain->domain.geometry.force_aperture = true;
> +		domain->geometry.force_aperture = true;
> +	}
>  
> -	return &gart_domain->domain;
> +	return domain;
>  }
>  
>  static void gart_iommu_domain_free(struct iommu_domain *domain)
>  {
> -	struct gart_domain *gart_domain = to_gart_domain(domain);
> -	struct gart_device *gart = gart_domain->gart;
> -
> -	if (gart) {
> -		spin_lock(&gart->client_lock);
> -		if (!list_empty(&gart->client)) {
> -			struct gart_client *c, *tmp;
> -
> -			list_for_each_entry_safe(c, tmp, &gart->client, list)
> -				__gart_iommu_detach_dev(domain, c->dev);
> -		}
> -		spin_unlock(&gart->client_lock);
> -	}
> -
> -	kfree(gart_domain);
> +	kfree(domain);
>  }

Doesn't this now make it possible to free a potentially active domain?

>  
>  static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
>  			  phys_addr_t pa, size_t bytes, int prot)
>  {
> -	struct gart_domain *gart_domain = to_gart_domain(domain);
> -	struct gart_device *gart = gart_domain->gart;
> +	struct gart_device *gart = gart_handle;

Hmm... this now introduces more uses of the gart_handle that I hoped we
could get rid of. I think we could still keep around struct gart_domain
and just make sure it is unique. The small amounts of casting here seem
mostly harmless to me, especially since they will be nops, so we end up
with just one dereference to get at the struct gart_device. I think the
benefits of not having this global variable around are worth the one
dereference here.
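
The claim that the cast is a nop can be checked in userspace. The sketch
below assumes the wrapped domain is the first member of the containing
structure, as it is in the struct gart_domain that the patch removes. All
type names carry a _model suffix to mark them as simplified stand-ins, and
model_container_of() is a reduced version of the kernel's container_of()
without its type checking.

```c
#include <assert.h>
#include <stddef.h>

struct iommu_domain_model {
	unsigned int type;
};

struct gart_device_model {
	unsigned long iovmm_base;
};

struct gart_domain_model {
	struct iommu_domain_model domain;	/* first member, so offset 0 */
	struct gart_device_model *gart;		/* link to the GART device */
};

#define model_container_of(ptr, type, member)			\
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct gart_domain_model *to_gart_domain(struct iommu_domain_model *dom)
{
	assert(dom);
	/* with offset 0 this subtraction does not change the pointer value */
	return model_container_of(dom, struct gart_domain_model, domain);
}

/* One dereference gets from the generic domain to the device, no global needed. */
static struct gart_device_model *domain_to_gart(struct iommu_domain_model *dom)
{
	return to_gart_domain(dom)->gart;
}
```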

Thierry

* Re: [PATCH v4 20/20] iommu/tegra: gart: Perform code refactoring
  2018-09-24  0:41 ` [PATCH v4 20/20] iommu/tegra: gart: Perform code refactoring Dmitry Osipenko
@ 2018-09-24 11:34   ` Thierry Reding
  2018-09-24 17:11     ` Dmitry Osipenko
  0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2018-09-24 11:34 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 13150 bytes --]

On Mon, Sep 24, 2018 at 03:41:53AM +0300, Dmitry Osipenko wrote:
> Perform a major code cleanup to make it more readable and as a result
> easier to maintain. I removed some redundant safety-checks in the code
> and some debug code that isn't actually very useful for debugging, like
> enormous pagetable dump on each fault. The majority of the changes are
> code reshuffling, variables/whitespaces clean up and removal of debug
> messages that duplicate messages of the IOMMU-core.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/iommu/tegra-gart.c | 215 +++++++++++++++----------------------
>  1 file changed, 84 insertions(+), 131 deletions(-)

While I'm not strongly opposed to this, it's an awful lot of churn for
little to no gain. Yes, this driver may have its weak points in some
areas, but I don't think it's totally unreadable or unmaintainable.
Also, keep in mind that readability is very subjective.

If you set out to rewrite every piece of code in the kernel that you
think is unreadable, I don't think you'll end up a happy person.

> diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
> index 7182445c3b76..a36d0c568536 100644
> --- a/drivers/iommu/tegra-gart.c
> +++ b/drivers/iommu/tegra-gart.c
> @@ -34,63 +34,56 @@
>  #define GART_CONFIG		(0x24 - GART_REG_BASE)
>  #define GART_ENTRY_ADDR		(0x28 - GART_REG_BASE)
>  #define GART_ENTRY_DATA		(0x2c - GART_REG_BASE)
> -#define GART_ENTRY_PHYS_ADDR_VALID	(1 << 31)
> +
> +#define GART_ENTRY_PHYS_ADDR_VALID	BIT(31)
>  
>  #define GART_PAGE_SHIFT		12
>  #define GART_PAGE_SIZE		(1 << GART_PAGE_SHIFT)
> -#define GART_PAGE_MASK						\
> -	(~(GART_PAGE_SIZE - 1) & ~GART_ENTRY_PHYS_ADDR_VALID)
> +#define GART_PAGE_MASK		GENMASK(30, GART_PAGE_SHIFT)
>  
>  struct gart_device {
>  	void __iomem		*regs;
>  	u32			*savedata;
> -	u32			page_count;	/* total remappable size */
> -	dma_addr_t		iovmm_base;	/* offset to vmm_area */
> +	unsigned long		iovmm_base;	/* offset to vmm_area start */
> +	unsigned long		iovmm_end;	/* offset to vmm_area end */
>  	spinlock_t		pte_lock;	/* for pagetable */
>  	spinlock_t		dom_lock;	/* for active domain */
>  	unsigned int		active_devices;	/* number of active devices */
>  	struct iommu_domain	*active_domain;	/* current active domain */
> -	struct device		*dev;
> -
>  	struct iommu_device	iommu;		/* IOMMU Core handle */
> +	struct device		*dev;
>  };
>  
>  static struct gart_device *gart_handle; /* unique for a system */
>  
>  static bool gart_debug;
>  
> -#define GART_PTE(_pfn)						\
> -	(GART_ENTRY_PHYS_ADDR_VALID | ((_pfn) << PAGE_SHIFT))
> -
>  /*
>   * Any interaction between any block on PPSB and a block on APB or AHB
>   * must have these read-back to ensure the APB/AHB bus transaction is
>   * complete before initiating activity on the PPSB block.
>   */
> -#define FLUSH_GART_REGS(gart)	((void)readl((gart)->regs + GART_CONFIG))
> +#define FLUSH_GART_REGS(gart)	readl_relaxed((gart)->regs + GART_CONFIG)
>  
>  #define for_each_gart_pte(gart, iova)					\
>  	for (iova = gart->iovmm_base;					\
> -	     iova < gart->iovmm_base + GART_PAGE_SIZE * gart->page_count; \
> +	     iova < gart->iovmm_end;					\
>  	     iova += GART_PAGE_SIZE)
>  
>  static inline void gart_set_pte(struct gart_device *gart,
> -				unsigned long offs, u32 pte)
> +				unsigned long iova, phys_addr_t pte)

I don't think this makes sense. phys_addr_t can be 64-bit and actually
will be in the majority of multi-platform builds. iova being unsigned
long is borderline, but probably fine since this driver is exclusive to
32-bit builds.

I think it'd be better to make sure elsewhere that only valid, 32-bit
values are passed in here and return an error at a higher level if
that's not the case. Silently casting away the upper 32 bits in the
writel_relaxed() below is suboptimal. Even if we don't care about the
type mismatch because Tegra20 doesn't have LPAE and therefore the upper
32 bits will always be 0 and the cast is in fact safe, I think we should
be explicitly casting at some point, and I think it should be at a
higher level than gart_set_pte().
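
A minimal userspace sketch of that split, under stated assumptions: the
caller validates the address and casts explicitly, and the PTE writer only
ever sees a 32-bit value. model_map(), model_set_pte() and model_entry_data
are hypothetical names. The check shown is stricter than "fits in 32 bits":
it also rejects bit 31 and unaligned addresses, on the assumption that
GART_PAGE_MASK from the quoted patch describes the whole address field.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* modelled as 64-bit, as on many multi-platform builds */

#define GART_ENTRY_PHYS_ADDR_VALID	(UINT32_C(1) << 31)
#define GART_PAGE_MASK			UINT32_C(0x7ffff000)	/* GENMASK(30, 12) */

static uint32_t model_entry_data;	/* stand-in for the GART_ENTRY_DATA register */

/* Low level: takes a ready-made 32-bit PTE, no silent truncation possible. */
static void model_set_pte(uint32_t pte)
{
	/* either an unmap (0) or an entry with the valid bit set */
	assert(pte == 0 || (pte & GART_ENTRY_PHYS_ADDR_VALID));
	model_entry_data = pte;
}

/* Higher level: validate first, then cast explicitly. */
static int model_map(phys_addr_t pa)
{
	/* reject anything that does not fit in PTE bits 30:12 */
	if (pa & ~(phys_addr_t)GART_PAGE_MASK)
		return -EINVAL;

	model_set_pte(GART_ENTRY_PHYS_ADDR_VALID | (uint32_t)pa);
	return 0;
}
```

An address above 4 GiB is refused with -EINVAL here instead of being
written with its upper bits dropped.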

>  {
> -	writel(offs, gart->regs + GART_ENTRY_ADDR);
> -	writel(pte, gart->regs + GART_ENTRY_DATA);
> -
> -	dev_dbg(gart->dev, "GART: %s %08lx:%08x\n",
> -		 pte ? "map" : "unmap", offs, pte & GART_PAGE_MASK);
> +	writel_relaxed(iova, gart->regs + GART_ENTRY_ADDR);
> +	writel_relaxed(pte, gart->regs + GART_ENTRY_DATA);
>  }
>  
>  static inline unsigned long gart_read_pte(struct gart_device *gart,
> -					  unsigned long offs)
> +					  unsigned long iova)
>  {
>  	unsigned long pte;
>  
> -	writel(offs, gart->regs + GART_ENTRY_ADDR);
> -	pte = readl(gart->regs + GART_ENTRY_DATA);
> +	writel_relaxed(iova, gart->regs + GART_ENTRY_ADDR);
> +	pte = readl_relaxed(gart->regs + GART_ENTRY_DATA);
>  
>  	return pte;
>  }
> @@ -102,49 +95,20 @@ static void do_gart_setup(struct gart_device *gart, const u32 *data)
>  	for_each_gart_pte(gart, iova)
>  		gart_set_pte(gart, iova, data ? *(data++) : 0);
>  
> -	writel(1, gart->regs + GART_CONFIG);
> +	writel_relaxed(1, gart->regs + GART_CONFIG);
>  	FLUSH_GART_REGS(gart);
>  }
>  
> -#ifdef DEBUG
> -static void gart_dump_table(struct gart_device *gart)
> +static inline bool gart_iova_range_invalid(struct gart_device *gart,
> +					   unsigned long iova, size_t bytes)
>  {
> -	unsigned long iova;
> -	unsigned long flags;
> -
> -	spin_lock_irqsave(&gart->pte_lock, flags);
> -	for_each_gart_pte(gart, iova) {
> -		unsigned long pte;
> -
> -		pte = gart_read_pte(gart, iova);
> -
> -		dev_dbg(gart->dev, "GART: %s %08lx:%08lx\n",
> -			(GART_ENTRY_PHYS_ADDR_VALID & pte) ? "v" : " ",
> -			iova, pte & GART_PAGE_MASK);
> -	}
> -	spin_unlock_irqrestore(&gart->pte_lock, flags);
> +	return unlikely(iova < gart->iovmm_base || bytes != GART_PAGE_SIZE ||
> +			iova + bytes > gart->iovmm_end);
>  }
> -#else
> -static inline void gart_dump_table(struct gart_device *gart)
> -{
> -}
> -#endif
>  
> -static inline bool gart_iova_range_valid(struct gart_device *gart,
> -					 unsigned long iova, size_t bytes)
> +static inline bool gart_pte_valid(struct gart_device *gart, unsigned long iova)
>  {
> -	unsigned long iova_start, iova_end, gart_start, gart_end;
> -
> -	iova_start = iova;
> -	iova_end = iova_start + bytes - 1;
> -	gart_start = gart->iovmm_base;
> -	gart_end = gart_start + gart->page_count * GART_PAGE_SIZE - 1;
> -
> -	if (iova_start < gart_start)
> -		return false;
> -	if (iova_end > gart_end)
> -		return false;
> -	return true;
> +	return !!(gart_read_pte(gart, iova) & GART_ENTRY_PHYS_ADDR_VALID);
>  }
>  
>  static int gart_iommu_attach_dev(struct iommu_domain *domain,
> @@ -187,7 +151,6 @@ static void gart_iommu_detach_dev(struct iommu_domain *domain,
>  
>  static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
>  {
> -	struct gart_device *gart = gart_handle;
>  	struct iommu_domain *domain;
>  
>  	if (type != IOMMU_DOMAIN_UNMANAGED)
> @@ -195,9 +158,8 @@ static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
>  
>  	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
>  	if (domain) {
> -		domain->geometry.aperture_start = gart->iovmm_base;
> -		domain->geometry.aperture_end = gart->iovmm_base +
> -					gart->page_count * GART_PAGE_SIZE - 1;
> +		domain->geometry.aperture_start = gart_handle->iovmm_base;
> +		domain->geometry.aperture_end = gart_handle->iovmm_end - 1;
>  		domain->geometry.force_aperture = true;
>  	}
>  
> @@ -209,34 +171,44 @@ static void gart_iommu_domain_free(struct iommu_domain *domain)
>  	kfree(domain);
>  }
>  
> +static int __gart_iommu_map(struct gart_device *gart, unsigned long iova,
> +			    phys_addr_t pa)
> +{
> +	if (unlikely(gart_debug && gart_pte_valid(gart, iova))) {
> +		dev_WARN(gart->dev, "GART: Page entry is in-use\n");
> +		return -EINVAL;
> +	}
> +
> +	gart_set_pte(gart, iova, GART_ENTRY_PHYS_ADDR_VALID | pa);
> +
> +	return 0;
> +}
> +
>  static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
>  			  phys_addr_t pa, size_t bytes, int prot)
>  {
>  	struct gart_device *gart = gart_handle;
> -	unsigned long flags;
> -	unsigned long pfn;
> -	unsigned long pte;
> +	int ret;
>  
> -	if (!gart_iova_range_valid(gart, iova, bytes))
> +	if (gart_iova_range_invalid(gart, iova, bytes))
>  		return -EINVAL;
>  
> -	spin_lock_irqsave(&gart->pte_lock, flags);
> -	pfn = __phys_to_pfn(pa);
> -	if (!pfn_valid(pfn)) {
> -		dev_err(gart->dev, "GART: Invalid page: %pa\n", &pa);
> -		spin_unlock_irqrestore(&gart->pte_lock, flags);
> +	spin_lock(&gart->pte_lock);
> +	ret = __gart_iommu_map(gart, iova, pa);
> +	spin_unlock(&gart->pte_lock);
> +
> +	return ret;
> +}
> +
> +static int __gart_iommu_unmap(struct gart_device *gart, unsigned long iova)
> +{
> +	if (unlikely(gart_debug && !gart_pte_valid(gart, iova))) {
> +		dev_WARN(gart->dev, "GART: Page entry is invalid\n");
>  		return -EINVAL;
>  	}
> -	if (gart_debug) {
> -		pte = gart_read_pte(gart, iova);
> -		if (pte & GART_ENTRY_PHYS_ADDR_VALID) {
> -			spin_unlock_irqrestore(&gart->pte_lock, flags);
> -			dev_err(gart->dev, "GART: Page entry is in-use\n");
> -			return -EBUSY;
> -		}
> -	}
> -	gart_set_pte(gart, iova, GART_PTE(pfn));
> -	spin_unlock_irqrestore(&gart->pte_lock, flags);
> +
> +	gart_set_pte(gart, iova, 0);
> +
>  	return 0;
>  }
>  
> @@ -244,15 +216,16 @@ static size_t gart_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
>  			       size_t bytes)
>  {
>  	struct gart_device *gart = gart_handle;
> -	unsigned long flags;
> +	int err;
>  
> -	if (!gart_iova_range_valid(gart, iova, bytes))
> +	if (gart_iova_range_invalid(gart, iova, bytes))
>  		return 0;
>  
> -	spin_lock_irqsave(&gart->pte_lock, flags);
> -	gart_set_pte(gart, iova, 0);
> -	spin_unlock_irqrestore(&gart->pte_lock, flags);
> -	return bytes;
> +	spin_lock(&gart->pte_lock);
> +	err = __gart_iommu_unmap(gart, iova);
> +	spin_unlock(&gart->pte_lock);
> +
> +	return err ? 0 : bytes;
>  }
>  
>  static phys_addr_t gart_iommu_iova_to_phys(struct iommu_domain *domain,
> @@ -260,24 +233,15 @@ static phys_addr_t gart_iommu_iova_to_phys(struct iommu_domain *domain,
>  {
>  	struct gart_device *gart = gart_handle;
>  	unsigned long pte;
> -	phys_addr_t pa;
> -	unsigned long flags;
>  
> -	if (!gart_iova_range_valid(gart, iova, 0))
> +	if (gart_iova_range_invalid(gart, iova, SZ_4K))
>  		return -EINVAL;
>  
> -	spin_lock_irqsave(&gart->pte_lock, flags);
> +	spin_lock(&gart->pte_lock);
>  	pte = gart_read_pte(gart, iova);
> -	spin_unlock_irqrestore(&gart->pte_lock, flags);
> +	spin_unlock(&gart->pte_lock);
>  
> -	pa = (pte & GART_PAGE_MASK);
> -	if (!pfn_valid(__phys_to_pfn(pa))) {
> -		dev_err(gart->dev, "GART: No entry for %08llx:%pa\n",
> -			 (unsigned long long)iova, &pa);
> -		gart_dump_table(gart);
> -		return -EINVAL;
> -	}
> -	return pa;
> +	return pte & GART_PAGE_MASK;
>  }
>  
>  static bool gart_iommu_capable(enum iommu_cap cap)
> @@ -342,24 +306,19 @@ static const struct iommu_ops gart_iommu_ops = {
>  
>  int tegra_gart_suspend(struct gart_device *gart)
>  {
> -	unsigned long iova;
>  	u32 *data = gart->savedata;
> -	unsigned long flags;
> +	unsigned long iova;
>  
> -	spin_lock_irqsave(&gart->pte_lock, flags);
>  	for_each_gart_pte(gart, iova)
>  		*(data++) = gart_read_pte(gart, iova);
> -	spin_unlock_irqrestore(&gart->pte_lock, flags);

Why is it safe to remove the lock here?

> +
>  	return 0;
>  }
>  
>  int tegra_gart_resume(struct gart_device *gart)
>  {
> -	unsigned long flags;
> -
> -	spin_lock_irqsave(&gart->pte_lock, flags);
>  	do_gart_setup(gart, gart->savedata);
> -	spin_unlock_irqrestore(&gart->pte_lock, flags);
> +
>  	return 0;
>  }
>  
> @@ -368,8 +327,7 @@ struct gart_device *tegra_gart_probe(struct device *dev,
>  				     struct tegra_mc *mc)
>  {
>  	struct gart_device *gart;
> -	struct resource *res_remap;
> -	void __iomem *gart_regs;
> +	struct resource *res;
>  	int ret;
>  
>  	BUILD_BUG_ON(PAGE_SHIFT != GART_PAGE_SHIFT);
> @@ -379,9 +337,8 @@ struct gart_device *tegra_gart_probe(struct device *dev,
>  		return NULL;
>  
>  	/* the GART memory aperture is required */
> -	res_remap = platform_get_resource(to_platform_device(dev),
> -					  IORESOURCE_MEM, 1);
> -	if (!res_remap) {
> +	res = platform_get_resource(to_platform_device(dev), IORESOURCE_MEM, 1);
> +	if (!res) {
>  		dev_err(dev, "GART: Memory aperture resource unavailable\n");
>  		return ERR_PTR(-ENXIO);
>  	}
> @@ -390,39 +347,35 @@ struct gart_device *tegra_gart_probe(struct device *dev,
>  	if (!gart)
>  		return ERR_PTR(-ENOMEM);
>  
> +	gart_handle = gart;
> +
> +	gart->dev = dev;
> +	gart->regs = mc->regs + GART_REG_BASE;
> +	gart->iovmm_base = res->start;
> +	gart->iovmm_end = res->start + resource_size(res);

Why not simply:

	gart->iovmm_end = res->end;

?

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 09/20] memory: tegra: Adapt to Tegra20 device-tree binding changes
  2018-09-24 10:02   ` Thierry Reding
@ 2018-09-24 13:22     ` Dmitry Osipenko
  2018-09-25 12:16       ` Dmitry Osipenko
  0 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24 13:22 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

On 9/24/18 1:02 PM, Thierry Reding wrote:
> On Mon, Sep 24, 2018 at 03:41:42AM +0300, Dmitry Osipenko wrote:
>> The tegra20-mc device-tree binding has been changed, GART has been
>> squashed into Memory Controller and now the clock property is mandatory
>> for Tegra20, the DT compatible has been changed as well. Adapt driver to
>> the DT changes.
>>
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
>>   drivers/memory/tegra/mc.c | 21 ++++++++-------------
>>   drivers/memory/tegra/mc.h |  6 ------
>>   include/soc/tegra/mc.h    |  2 +-
>>   3 files changed, 9 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
>> index e56862495f36..1b4ceefd82f9 100644
>> --- a/drivers/memory/tegra/mc.c
>> +++ b/drivers/memory/tegra/mc.c
>> @@ -51,7 +51,7 @@
>>   
>>   static const struct of_device_id tegra_mc_of_match[] = {
>>   #ifdef CONFIG_ARCH_TEGRA_2x_SOC
>> -	{ .compatible = "nvidia,tegra20-mc", .data = &tegra20_mc_soc },
>> +	{ .compatible = "nvidia,tegra20-mc-gart", .data = &tegra20_mc_soc },
> 
> Technically we now regress because we no longer support the older device
> tree bindings. I know that it doesn't really matter because this driver
> doesn't really do much interesting yet other than reporting memory
> access violations, but if that's enough to warrant a change of the
> compatible string, then I think we also need to preserve compatibility
> in the code.
> 
> That said, I think compatibility would be easier to preserve if we stuck
> with the old compatible string and used a "reg-names" property to
> specify which version of the binding we're referring to.
> 
> For example, we could have:
> 
> 	memory-controller@7000f000 {
> 		compatible = "nvidia,tegra20-mc";
> 		reg = <0x7000f000 0x024
> 		       0x7000f03c 0x3c4>;
> 		...
> 	};
> 
> for the old binding and:
> 
> 	memory-controller@7000f000 {
> 		compatible = "nvidia,tegra20-mc";
> 		reg = <0x7000f000 0x00000400>,
> 		      <0x58000000 0x02000000>;
> 		reg-names = "mc", "gart";
> 		...
> 	};
> 
> for the new binding. The driver can then easily check for the existence
> of the reg-names property and take the legacy or new code paths.

There is no problem with keeping newer kernels compatible with the older binding;
it's just not worth the effort. The real problem is keeping older kernels compatible
with the new binding: the older kernels won't care about reg-names and will treat the
GART registers as the second register bank of the Memory Controller. Unfortunately, I
don't see how your suggestion is supposed to help with that problem.
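
To illustrate why changing the 'compatible' value protects older kernels,
here is a simplified user-space model of compatible-string matching. The
tables and the driver_binds() helper are made up for this sketch; this is
not the kernel's OF matching code. An older kernel matches purely on the
string it knows, whatever reg-names says, so it only stays away from the
node if the string itself is different.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical match tables; stand-ins for a driver's of_device_id array. */
const char *const old_kernel_match[] = { "nvidia,tegra20-mc", NULL };
const char *const new_kernel_match[] = { "nvidia,tegra20-mc-gart", NULL };

/* Returns true if the node's compatible string is listed in the table. */
bool driver_binds(const char *const *table, const char *compatible)
{
	for (; *table; table++)
		if (strcmp(*table, compatible) == 0)
			return true;

	return false;
}
```

With the new string, an older kernel simply does not bind to the node, so
it never misreads the GART aperture as a second MC register bank.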

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 20/20] iommu/tegra: gart: Perform code refactoring
  2018-09-24 11:34   ` Thierry Reding
@ 2018-09-24 17:11     ` Dmitry Osipenko
  0 siblings, 0 replies; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24 17:11 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

On 9/24/18 2:34 PM, Thierry Reding wrote:
> On Mon, Sep 24, 2018 at 03:41:53AM +0300, Dmitry Osipenko wrote:
>> Perform a major code cleanup to make it more readable and as a result
>> easier to maintain. I removed some redundant safety-checks in the code
>> and some debug code that isn't actually very useful for debugging, like
>> enormous pagetable dump on each fault. The majority of the changes are
>> code reshuffling, variables/whitespaces clean up and removal of debug
>> messages that duplicate messages of the IOMMU-core.
>>
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
>>  drivers/iommu/tegra-gart.c | 215 +++++++++++++++----------------------
>>  1 file changed, 84 insertions(+), 131 deletions(-)
> 
> While I'm not strongly opposed to this, it's an awful lot of churn for
> little to no gain. Yes, this driver may have its weak points in some
> areas, but I don't think it's totally unreadable or unmaintainable.
> Also, keep in mind that readability is very subjective.

I never said or meant that the code is "totally unreadable or
unmaintainable". The code is okay, but it could be better and I'm trying
to make it so since I'm already touching it.

> If you set out to rewrite every piece of code in the kernel that you
> think is unreadable, I don't think you'll end up a happy person.

You're certainly getting the wrong impression of me. Though I wouldn't
mind if some other Tegra driver finally got a major rewrite ;)

>> diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
>> index 7182445c3b76..a36d0c568536 100644
>> --- a/drivers/iommu/tegra-gart.c
>> +++ b/drivers/iommu/tegra-gart.c
>> @@ -34,63 +34,56 @@
>>  #define GART_CONFIG		(0x24 - GART_REG_BASE)
>>  #define GART_ENTRY_ADDR		(0x28 - GART_REG_BASE)
>>  #define GART_ENTRY_DATA		(0x2c - GART_REG_BASE)
>> -#define GART_ENTRY_PHYS_ADDR_VALID	(1 << 31)
>> +
>> +#define GART_ENTRY_PHYS_ADDR_VALID	BIT(31)
>>  
>>  #define GART_PAGE_SHIFT		12
>>  #define GART_PAGE_SIZE		(1 << GART_PAGE_SHIFT)
>> -#define GART_PAGE_MASK						\
>> -	(~(GART_PAGE_SIZE - 1) & ~GART_ENTRY_PHYS_ADDR_VALID)
>> +#define GART_PAGE_MASK		GENMASK(30, GART_PAGE_SHIFT)
>>  
>>  struct gart_device {
>>  	void __iomem		*regs;
>>  	u32			*savedata;
>> -	u32			page_count;	/* total remappable size */
>> -	dma_addr_t		iovmm_base;	/* offset to vmm_area */
>> +	unsigned long		iovmm_base;	/* offset to vmm_area start */
>> +	unsigned long		iovmm_end;	/* offset to vmm_area end */
>>  	spinlock_t		pte_lock;	/* for pagetable */
>>  	spinlock_t		dom_lock;	/* for active domain */
>>  	unsigned int		active_devices;	/* number of active devices */
>>  	struct iommu_domain	*active_domain;	/* current active domain */
>> -	struct device		*dev;
>> -
>>  	struct iommu_device	iommu;		/* IOMMU Core handle */
>> +	struct device		*dev;
>>  };
>>  
>>  static struct gart_device *gart_handle; /* unique for a system */
>>  
>>  static bool gart_debug;
>>  
>> -#define GART_PTE(_pfn)						\
>> -	(GART_ENTRY_PHYS_ADDR_VALID | ((_pfn) << PAGE_SHIFT))
>> -
>>  /*
>>   * Any interaction between any block on PPSB and a block on APB or AHB
>>   * must have these read-back to ensure the APB/AHB bus transaction is
>>   * complete before initiating activity on the PPSB block.
>>   */
>> -#define FLUSH_GART_REGS(gart)	((void)readl((gart)->regs + GART_CONFIG))
>> +#define FLUSH_GART_REGS(gart)	readl_relaxed((gart)->regs + GART_CONFIG)
>>  
>>  #define for_each_gart_pte(gart, iova)					\
>>  	for (iova = gart->iovmm_base;					\
>> -	     iova < gart->iovmm_base + GART_PAGE_SIZE * gart->page_count; \
>> +	     iova < gart->iovmm_end;					\
>>  	     iova += GART_PAGE_SIZE)
>>  
>>  static inline void gart_set_pte(struct gart_device *gart,
>> -				unsigned long offs, u32 pte)
>> +				unsigned long iova, phys_addr_t pte)
> 
> I don't think this makes sense. phys_addr_t can be 64-bit and actually
> will be in the majority of multi-platform builds. iova being unsigned
> long is borderline, but probably fine since this driver is exclusive to
> 32-bit builds.
> 
> I think it'd be better to make sure elsewhere that only valid, 32-bit
> values are passed in here and return an error at a higher level if
> that's not the case. Silently casting away the upper 32 bits in the
> writel_relaxed() below is suboptimal. Even if we don't care about the
> type mismatch because Tegra20 doesn't have LPAE and therefore the upper
> 32 bits will always be 0 and the cast is in fact safe, I think we should
> be explicitly casting at some point, and I think it should be at a
> higher level than gart_set_pte().

I'll add an explicit cast to ulong, like other drivers do. Thanks for
the suggestion.
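
For reference, a rough user-space sketch of what a check at a higher level
could look like before the value is narrowed to 32 bits. All names ending
in _model are hypothetical and the 64-bit typedef only stands in for a wide
phys_addr_t. The masks mirror GART_ENTRY_PHYS_ADDR_VALID and GART_PAGE_MASK
from the patch; this is not the code that will go into the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for a 64-bit phys_addr_t (assumption for this sketch). */
typedef uint64_t phys_addr_model_t;

#define PTE_VALID_MODEL		(UINT32_C(1) << 31)	/* BIT(31) */
#define PAGE_MASK_MODEL		UINT32_C(0x7ffff000)	/* GENMASK(30, 12) */

/*
 * Builds a 32-bit PTE from a physical address. Fails if the address has
 * bits outside of the PTE's address field, i.e. it is unaligned or too
 * large, instead of silently dropping those bits.
 */
bool gart_make_pte_model(phys_addr_model_t pa, uint32_t *pte)
{
	if (pa & ~(phys_addr_model_t)PAGE_MASK_MODEL)
		return false;

	*pte = PTE_VALID_MODEL | (uint32_t)pa;

	return true;
}
```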

>>  {
>> -	writel(offs, gart->regs + GART_ENTRY_ADDR);
>> -	writel(pte, gart->regs + GART_ENTRY_DATA);
>> -
>> -	dev_dbg(gart->dev, "GART: %s %08lx:%08x\n",
>> -		 pte ? "map" : "unmap", offs, pte & GART_PAGE_MASK);
>> +	writel_relaxed(iova, gart->regs + GART_ENTRY_ADDR);
>> +	writel_relaxed(pte, gart->regs + GART_ENTRY_DATA);
>>  }
>>  
>>  static inline unsigned long gart_read_pte(struct gart_device *gart,
>> -					  unsigned long offs)
>> +					  unsigned long iova)
>>  {
>>  	unsigned long pte;
>>  
>> -	writel(offs, gart->regs + GART_ENTRY_ADDR);
>> -	pte = readl(gart->regs + GART_ENTRY_DATA);
>> +	writel_relaxed(iova, gart->regs + GART_ENTRY_ADDR);
>> +	pte = readl_relaxed(gart->regs + GART_ENTRY_DATA);
>>  
>>  	return pte;
>>  }
>> @@ -102,49 +95,20 @@ static void do_gart_setup(struct gart_device *gart, const u32 *data)
>>  	for_each_gart_pte(gart, iova)
>>  		gart_set_pte(gart, iova, data ? *(data++) : 0);
>>  
>> -	writel(1, gart->regs + GART_CONFIG);
>> +	writel_relaxed(1, gart->regs + GART_CONFIG);
>>  	FLUSH_GART_REGS(gart);
>>  }
>>  
>> -#ifdef DEBUG
>> -static void gart_dump_table(struct gart_device *gart)
>> +static inline bool gart_iova_range_invalid(struct gart_device *gart,
>> +					   unsigned long iova, size_t bytes)
>>  {
>> -	unsigned long iova;
>> -	unsigned long flags;
>> -
>> -	spin_lock_irqsave(&gart->pte_lock, flags);
>> -	for_each_gart_pte(gart, iova) {
>> -		unsigned long pte;
>> -
>> -		pte = gart_read_pte(gart, iova);
>> -
>> -		dev_dbg(gart->dev, "GART: %s %08lx:%08lx\n",
>> -			(GART_ENTRY_PHYS_ADDR_VALID & pte) ? "v" : " ",
>> -			iova, pte & GART_PAGE_MASK);
>> -	}
>> -	spin_unlock_irqrestore(&gart->pte_lock, flags);
>> +	return unlikely(iova < gart->iovmm_base || bytes != GART_PAGE_SIZE ||
>> +			iova + bytes > gart->iovmm_end);
>>  }
>> -#else
>> -static inline void gart_dump_table(struct gart_device *gart)
>> -{
>> -}
>> -#endif
>>  
>> -static inline bool gart_iova_range_valid(struct gart_device *gart,
>> -					 unsigned long iova, size_t bytes)
>> +static inline bool gart_pte_valid(struct gart_device *gart, unsigned long iova)
>>  {
>> -	unsigned long iova_start, iova_end, gart_start, gart_end;
>> -
>> -	iova_start = iova;
>> -	iova_end = iova_start + bytes - 1;
>> -	gart_start = gart->iovmm_base;
>> -	gart_end = gart_start + gart->page_count * GART_PAGE_SIZE - 1;
>> -
>> -	if (iova_start < gart_start)
>> -		return false;
>> -	if (iova_end > gart_end)
>> -		return false;
>> -	return true;
>> +	return !!(gart_read_pte(gart, iova) & GART_ENTRY_PHYS_ADDR_VALID);
>>  }
>>  
>>  static int gart_iommu_attach_dev(struct iommu_domain *domain,
>> @@ -187,7 +151,6 @@ static void gart_iommu_detach_dev(struct iommu_domain *domain,
>>  
>>  static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
>>  {
>> -	struct gart_device *gart = gart_handle;
>>  	struct iommu_domain *domain;
>>  
>>  	if (type != IOMMU_DOMAIN_UNMANAGED)
>> @@ -195,9 +158,8 @@ static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
>>  
>>  	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
>>  	if (domain) {
>> -		domain->geometry.aperture_start = gart->iovmm_base;
>> -		domain->geometry.aperture_end = gart->iovmm_base +
>> -					gart->page_count * GART_PAGE_SIZE - 1;
>> +		domain->geometry.aperture_start = gart_handle->iovmm_base;
>> +		domain->geometry.aperture_end = gart_handle->iovmm_end - 1;
>>  		domain->geometry.force_aperture = true;
>>  	}
>>  
>> @@ -209,34 +171,44 @@ static void gart_iommu_domain_free(struct iommu_domain *domain)
>>  	kfree(domain);
>>  }
>>  
>> +static int __gart_iommu_map(struct gart_device *gart, unsigned long iova,
>> +			    phys_addr_t pa)
>> +{
>> +	if (unlikely(gart_debug && gart_pte_valid(gart, iova))) {
>> +		dev_WARN(gart->dev, "GART: Page entry is in-use\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	gart_set_pte(gart, iova, GART_ENTRY_PHYS_ADDR_VALID | pa);
>> +
>> +	return 0;
>> +}
>> +
>>  static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
>>  			  phys_addr_t pa, size_t bytes, int prot)
>>  {
>>  	struct gart_device *gart = gart_handle;
>> -	unsigned long flags;
>> -	unsigned long pfn;
>> -	unsigned long pte;
>> +	int ret;
>>  
>> -	if (!gart_iova_range_valid(gart, iova, bytes))
>> +	if (gart_iova_range_invalid(gart, iova, bytes))
>>  		return -EINVAL;
>>  
>> -	spin_lock_irqsave(&gart->pte_lock, flags);
>> -	pfn = __phys_to_pfn(pa);
>> -	if (!pfn_valid(pfn)) {
>> -		dev_err(gart->dev, "GART: Invalid page: %pa\n", &pa);
>> -		spin_unlock_irqrestore(&gart->pte_lock, flags);
>> +	spin_lock(&gart->pte_lock);
>> +	ret = __gart_iommu_map(gart, iova, pa);
>> +	spin_unlock(&gart->pte_lock);
>> +
>> +	return ret;
>> +}
>> +
>> +static int __gart_iommu_unmap(struct gart_device *gart, unsigned long iova)
>> +{
>> +	if (unlikely(gart_debug && !gart_pte_valid(gart, iova))) {
>> +		dev_WARN(gart->dev, "GART: Page entry is invalid\n");
>>  		return -EINVAL;
>>  	}
>> -	if (gart_debug) {
>> -		pte = gart_read_pte(gart, iova);
>> -		if (pte & GART_ENTRY_PHYS_ADDR_VALID) {
>> -			spin_unlock_irqrestore(&gart->pte_lock, flags);
>> -			dev_err(gart->dev, "GART: Page entry is in-use\n");
>> -			return -EBUSY;
>> -		}
>> -	}
>> -	gart_set_pte(gart, iova, GART_PTE(pfn));
>> -	spin_unlock_irqrestore(&gart->pte_lock, flags);
>> +
>> +	gart_set_pte(gart, iova, 0);
>> +
>>  	return 0;
>>  }
>>  
>> @@ -244,15 +216,16 @@ static size_t gart_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
>>  			       size_t bytes)
>>  {
>>  	struct gart_device *gart = gart_handle;
>> -	unsigned long flags;
>> +	int err;
>>  
>> -	if (!gart_iova_range_valid(gart, iova, bytes))
>> +	if (gart_iova_range_invalid(gart, iova, bytes))
>>  		return 0;
>>  
>> -	spin_lock_irqsave(&gart->pte_lock, flags);
>> -	gart_set_pte(gart, iova, 0);
>> -	spin_unlock_irqrestore(&gart->pte_lock, flags);
>> -	return bytes;
>> +	spin_lock(&gart->pte_lock);
>> +	err = __gart_iommu_unmap(gart, iova);
>> +	spin_unlock(&gart->pte_lock);
>> +
>> +	return err ? 0 : bytes;
>>  }
>>  
>>  static phys_addr_t gart_iommu_iova_to_phys(struct iommu_domain *domain,
>> @@ -260,24 +233,15 @@ static phys_addr_t gart_iommu_iova_to_phys(struct iommu_domain *domain,
>>  {
>>  	struct gart_device *gart = gart_handle;
>>  	unsigned long pte;
>> -	phys_addr_t pa;
>> -	unsigned long flags;
>>  
>> -	if (!gart_iova_range_valid(gart, iova, 0))
>> +	if (gart_iova_range_invalid(gart, iova, SZ_4K))
>>  		return -EINVAL;
>>  
>> -	spin_lock_irqsave(&gart->pte_lock, flags);
>> +	spin_lock(&gart->pte_lock);
>>  	pte = gart_read_pte(gart, iova);
>> -	spin_unlock_irqrestore(&gart->pte_lock, flags);
>> +	spin_unlock(&gart->pte_lock);
>>  
>> -	pa = (pte & GART_PAGE_MASK);
>> -	if (!pfn_valid(__phys_to_pfn(pa))) {
>> -		dev_err(gart->dev, "GART: No entry for %08llx:%pa\n",
>> -			 (unsigned long long)iova, &pa);
>> -		gart_dump_table(gart);
>> -		return -EINVAL;
>> -	}
>> -	return pa;
>> +	return pte & GART_PAGE_MASK;
>>  }
>>  
>>  static bool gart_iommu_capable(enum iommu_cap cap)
>> @@ -342,24 +306,19 @@ static const struct iommu_ops gart_iommu_ops = {
>>  
>>  int tegra_gart_suspend(struct gart_device *gart)
>>  {
>> -	unsigned long iova;
>>  	u32 *data = gart->savedata;
>> -	unsigned long flags;
>> +	unsigned long iova;
>>  
>> -	spin_lock_irqsave(&gart->pte_lock, flags);
>>  	for_each_gart_pte(gart, iova)
>>  		*(data++) = gart_read_pte(gart, iova);
>> -	spin_unlock_irqrestore(&gart->pte_lock, flags);
> 
> Why is it safe to remove the lock here?

Nothing should access the GART at this point; I can't imagine a legitimate
scenario in which that would happen. Any such access would be a bug, and
the lock wouldn't help with it anyway, so it isn't needed here.

>> +
>>  	return 0;
>>  }
>>  
>>  int tegra_gart_resume(struct gart_device *gart)
>>  {
>> -	unsigned long flags;
>> -
>> -	spin_lock_irqsave(&gart->pte_lock, flags);
>>  	do_gart_setup(gart, gart->savedata);
>> -	spin_unlock_irqrestore(&gart->pte_lock, flags);
>> +
>>  	return 0;
>>  }
>>  
>> @@ -368,8 +327,7 @@ struct gart_device *tegra_gart_probe(struct device *dev,
>>  				     struct tegra_mc *mc)
>>  {
>>  	struct gart_device *gart;
>> -	struct resource *res_remap;
>> -	void __iomem *gart_regs;
>> +	struct resource *res;
>>  	int ret;
>>  
>>  	BUILD_BUG_ON(PAGE_SHIFT != GART_PAGE_SHIFT);
>> @@ -379,9 +337,8 @@ struct gart_device *tegra_gart_probe(struct device *dev,
>>  		return NULL;
>>  
>>  	/* the GART memory aperture is required */
>> -	res_remap = platform_get_resource(to_platform_device(dev),
>> -					  IORESOURCE_MEM, 1);
>> -	if (!res_remap) {
>> +	res = platform_get_resource(to_platform_device(dev), IORESOURCE_MEM, 1);
>> +	if (!res) {
>>  		dev_err(dev, "GART: Memory aperture resource unavailable\n");
>>  		return ERR_PTR(-ENXIO);
>>  	}
>> @@ -390,39 +347,35 @@ struct gart_device *tegra_gart_probe(struct device *dev,
>>  	if (!gart)
>>  		return ERR_PTR(-ENOMEM);
>>  
>> +	gart_handle = gart;
>> +
>> +	gart->dev = dev;
>> +	gart->regs = mc->regs + GART_REG_BASE;
>> +	gart->iovmm_base = res->start;
>> +	gart->iovmm_end = res->start + resource_size(res);
> 
> Why not simply:
> 
> 	gart->iovmm_end = res->end;
> 
> ?

It could be set that way too, thanks. We only need to take into account
that res->end is inclusive (it is the last valid address) while iovmm_end
is used as an exclusive bound, hence it should be:

 	gart->iovmm_end = res->end + 1;
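
A small user-space model of the arithmetic, using the aperture from the
device-tree example (0x58000000, size 0x02000000). The struct and helper
names are stand-ins invented for this sketch; only the "end - start + 1"
formula is taken from the kernel's resource_size().

```c
#include <stdint.h>

/* Simplified stand-in for struct resource: 'end' is the last valid address. */
struct resource_model {
	uint32_t start;
	uint32_t end;
};

/* Same arithmetic as the kernel's resource_size(): end - start + 1. */
uint32_t resource_size_model(const struct resource_model *res)
{
	return res->end - res->start + 1;
}

/* Exclusive end of the aperture, the way iovmm_end is compared in the loop. */
uint32_t aperture_end_model(const struct resource_model *res)
{
	return res->end + 1;
}
```

Both forms give the same exclusive bound, so "res->end + 1" and
"res->start + resource_size(res)" are interchangeable here.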

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 19/20] iommu/tegra: gart: Simplify clients-tracking code
  2018-09-24 11:10   ` Thierry Reding
@ 2018-09-24 17:50     ` Dmitry Osipenko
  2018-09-25 10:09       ` Thierry Reding
  0 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24 17:50 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

On 9/24/18 2:10 PM, Thierry Reding wrote:
> On Mon, Sep 24, 2018 at 03:41:52AM +0300, Dmitry Osipenko wrote:
>> GART is a simple IOMMU provider that has a single address space. There is
>> no need to set up a global clients list and manage it for tracking of the
>> active domain, hence lots of code could be safely removed and replaced
>> with a simpler alternative.
>>
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
>>  drivers/iommu/tegra-gart.c | 157 +++++++++----------------------------
>>  1 file changed, 39 insertions(+), 118 deletions(-)
>>
>> diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
>> index 306e9644a676..7182445c3b76 100644
>> --- a/drivers/iommu/tegra-gart.c
>> +++ b/drivers/iommu/tegra-gart.c
>> @@ -19,7 +19,6 @@
>>  
>>  #include <linux/io.h>
>>  #include <linux/iommu.h>
>> -#include <linux/list.h>
>>  #include <linux/module.h>
>>  #include <linux/platform_device.h>
>>  #include <linux/slab.h>
>> @@ -42,30 +41,20 @@
>>  #define GART_PAGE_MASK						\
>>  	(~(GART_PAGE_SIZE - 1) & ~GART_ENTRY_PHYS_ADDR_VALID)
>>  
>> -struct gart_client {
>> -	struct device		*dev;
>> -	struct list_head	list;
>> -};
>> -
>>  struct gart_device {
>>  	void __iomem		*regs;
>>  	u32			*savedata;
>>  	u32			page_count;	/* total remappable size */
>>  	dma_addr_t		iovmm_base;	/* offset to vmm_area */
>>  	spinlock_t		pte_lock;	/* for pagetable */
>> -	struct list_head	client;
>> -	spinlock_t		client_lock;	/* for client list */
>> +	spinlock_t		dom_lock;	/* for active domain */
>> +	unsigned int		active_devices;	/* number of active devices */
>>  	struct iommu_domain	*active_domain;	/* current active domain */
>>  	struct device		*dev;
>>  
>>  	struct iommu_device	iommu;		/* IOMMU Core handle */
>>  };
>>  
>> -struct gart_domain {
>> -	struct iommu_domain domain;		/* generic domain handle */
>> -	struct gart_device *gart;		/* link to gart device   */
>> -};
>> -
>>  static struct gart_device *gart_handle; /* unique for a system */
>>  
>>  static bool gart_debug;
>> @@ -73,11 +62,6 @@ static bool gart_debug;
>>  #define GART_PTE(_pfn)						\
>>  	(GART_ENTRY_PHYS_ADDR_VALID | ((_pfn) << PAGE_SHIFT))
>>  
>> -static struct gart_domain *to_gart_domain(struct iommu_domain *dom)
>> -{
>> -	return container_of(dom, struct gart_domain, domain);
>> -}
>> -
>>  /*
>>   * Any interaction between any block on PPSB and a block on APB or AHB
>>   * must have these read-back to ensure the APB/AHB bus transaction is
>> @@ -166,128 +150,69 @@ static inline bool gart_iova_range_valid(struct gart_device *gart,
>>  static int gart_iommu_attach_dev(struct iommu_domain *domain,
>>  				 struct device *dev)
>>  {
>> -	struct gart_domain *gart_domain = to_gart_domain(domain);
>>  	struct gart_device *gart = gart_handle;
>> -	struct gart_client *client, *c;
>> -	int err = 0;
>> -
>> -	client = kzalloc(sizeof(*c), GFP_KERNEL);
>> -	if (!client)
>> -		return -ENOMEM;
>> -	client->dev = dev;
>> -
>> -	spin_lock(&gart->client_lock);
>> -	list_for_each_entry(c, &gart->client, list) {
>> -		if (c->dev == dev) {
>> -			dev_err(gart->dev, "GART: %s is already attached\n",
>> -				dev_name(dev));
>> -			err = -EINVAL;
>> -			goto fail;
>> -		}
>> -	}
>> -	if (gart->active_domain && gart->active_domain != domain) {
>> -		dev_err(gart->dev,
>> -			"GART: Only one domain can be active at a time\n");
>> -		err = -EINVAL;
>> -		goto fail;
>> -	}
>> -	gart->active_domain = domain;
>> -	gart_domain->gart = gart;
>> -	list_add(&client->list, &gart->client);
>> -	spin_unlock(&gart->client_lock);
>> -	dev_dbg(gart->dev, "GART: Attached %s\n", dev_name(dev));
>> -	return 0;
>> +	int ret = 0;
>>  
>> -fail:
>> -	kfree(client);
>> -	spin_unlock(&gart->client_lock);
>> -	return err;
>> -}
>> +	spin_lock(&gart->dom_lock);
>>  
>> -static void __gart_iommu_detach_dev(struct iommu_domain *domain,
>> -				    struct device *dev)
>> -{
>> -	struct gart_domain *gart_domain = to_gart_domain(domain);
>> -	struct gart_device *gart = gart_domain->gart;
>> -	struct gart_client *c;
>> -
>> -	list_for_each_entry(c, &gart->client, list) {
>> -		if (c->dev == dev) {
>> -			list_del(&c->list);
>> -			kfree(c);
>> -			if (list_empty(&gart->client)) {
>> -				gart->active_domain = NULL;
>> -				gart_domain->gart = NULL;
>> -			}
>> -			dev_dbg(gart->dev, "GART: Detached %s\n",
>> -				dev_name(dev));
>> -			return;
>> -		}
>> +	if (gart->active_domain && gart->active_domain != domain) {
>> +		ret = -EBUSY;
> 
> This omits the error message and returns -EBUSY instead of -EINVAL. Was
> this intended? For what it's worth, I do agree with the changes, it's
> just that I think you could've made those in the earlier patch that
> introduced them.

The message isn't really needed and EBUSY seems to fit better than EINVAL here.

> But this is all one series and the end result looks fine, so no need to
> be that picky.

Good, thanks.

>> +	} else if (dev->archdata.iommu != domain) {
>> +		dev->archdata.iommu = domain;
>> +		gart->active_domain = domain;
>> +		gart->active_devices++;
>>  	}
>>  
>> -	dev_err(gart->dev, "GART: Couldn't find %s to detach\n",
>> -		dev_name(dev));
>> +	spin_unlock(&gart->dom_lock);
>> +
>> +	return ret;
>>  }
>>  
>>  static void gart_iommu_detach_dev(struct iommu_domain *domain,
>>  				  struct device *dev)
>>  {
>> -	struct gart_domain *gart_domain = to_gart_domain(domain);
>> -	struct gart_device *gart = gart_domain->gart;
>> +	struct gart_device *gart = gart_handle;
>> +
>> +	spin_lock(&gart->dom_lock);
>>  
>> -	spin_lock(&gart->client_lock);
>> -	__gart_iommu_detach_dev(domain, dev);
>> -	spin_unlock(&gart->client_lock);
>> +	if (dev->archdata.iommu == domain) {
>> +		dev->archdata.iommu = NULL;
>> +
>> +		if (--gart->active_devices == 0)
>> +			gart->active_domain = NULL;
>> +	}
>> +
>> +	spin_unlock(&gart->dom_lock);
>>  }
>>  
>>  static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
>>  {
>> -	struct gart_domain *gart_domain;
>> -	struct gart_device *gart;
>> +	struct gart_device *gart = gart_handle;
>> +	struct iommu_domain *domain;
>>  
>>  	if (type != IOMMU_DOMAIN_UNMANAGED)
>>  		return NULL;
>>  
>> -	gart = gart_handle;
>> -	if (!gart)
>> -		return NULL;
>> -
>> -	gart_domain = kzalloc(sizeof(*gart_domain), GFP_KERNEL);
>> -	if (!gart_domain)
>> -		return NULL;
>> -
>> -	gart_domain->domain.geometry.aperture_start = gart->iovmm_base;
>> -	gart_domain->domain.geometry.aperture_end = gart->iovmm_base +
>> +	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
>> +	if (domain) {
>> +		domain->geometry.aperture_start = gart->iovmm_base;
>> +		domain->geometry.aperture_end = gart->iovmm_base +
>>  					gart->page_count * GART_PAGE_SIZE - 1;
>> -	gart_domain->domain.geometry.force_aperture = true;
>> +		domain->geometry.force_aperture = true;
>> +	}
>>  
>> -	return &gart_domain->domain;
>> +	return domain;
>>  }
>>  
>>  static void gart_iommu_domain_free(struct iommu_domain *domain)
>>  {
>> -	struct gart_domain *gart_domain = to_gart_domain(domain);
>> -	struct gart_device *gart = gart_domain->gart;
>> -
>> -	if (gart) {
>> -		spin_lock(&gart->client_lock);
>> -		if (!list_empty(&gart->client)) {
>> -			struct gart_client *c, *tmp;
>> -
>> -			list_for_each_entry_safe(c, tmp, &gart->client, list)
>> -				__gart_iommu_detach_dev(domain, c->dev);
>> -		}
>> -		spin_unlock(&gart->client_lock);
>> -	}
>> -
>> -	kfree(gart_domain);
>> +	kfree(domain);
>>  }
> 
> Doesn't this now make it possible to free a potentially active domain?

Yes, don't do it. I can add a WARN_ON() here, though I think the IOMMU core
should be the one taking care of that.
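
As a sketch of the condition such a WARN_ON() could test, here is a
user-space model of the active-domain bookkeeping from this patch. The
structs and the helper are hypothetical stand-ins; the real check would
also have to be done under dom_lock.

```c
#include <stdbool.h>
#include <stddef.h>

/* Opaque stand-in for struct iommu_domain. */
struct domain_model {
	int unused;
};

/* Only the two fields of struct gart_device that matter for the check. */
struct gart_model {
	unsigned int active_devices;
	const struct domain_model *active_domain;
};

/* True if 'domain' is the active one and still has devices attached. */
bool gart_domain_busy_model(const struct gart_model *gart,
			    const struct domain_model *domain)
{
	return gart->active_domain == domain && gart->active_devices > 0;
}
```

Freeing a domain for which this returns true is the case that would
deserve a warning; freeing an unpopulated domain stays silent.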

>>  
>>  static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
>>  			  phys_addr_t pa, size_t bytes, int prot)
>>  {
>> -	struct gart_domain *gart_domain = to_gart_domain(domain);
>> -	struct gart_device *gart = gart_domain->gart;
>> +	struct gart_device *gart = gart_handle;
> 
> Hmm... this now introduces more uses of the gart_handle that I hoped we
> could get rid of. I think we could still keep around struct gart_domain
> and just make sure it is unique. The small amounts of casting here seem
> mostly harmless to me, especially since they will be nops, so we end up
> with just one dereference to get at the struct gart_device. I think the
> benefits of not having this global variable around are worth the one
> dereference here.

What are the benefits? I don't see anything other than a pedantic oddity.

I've removed gart_domain in the end because it is extra code (and it
consumes resources) without any benefit. Let's keep that part as it is
now. I'll be happy to change that code if you explain why it is worth
it.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 18/20] iommu/tegra: gart: Don't detach devices from inactive domains
  2018-09-24 11:00   ` Thierry Reding
@ 2018-09-24 18:05     ` Dmitry Osipenko
  2018-09-25 10:04       ` Thierry Reding
  0 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24 18:05 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

On 9/24/18 2:00 PM, Thierry Reding wrote:
> On Mon, Sep 24, 2018 at 03:41:51AM +0300, Dmitry Osipenko wrote:
>> There could be an unlimited number of allocated domains, but only one domain
>> can be active at a time. Hence devices must be detached only from the
>> active domain.
>>
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
>>  drivers/iommu/tegra-gart.c | 8 +++++---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> Do we have a mechanism of switching out different domains? I don't think
> we do, so I'm wondering if perhaps a better solution to this would be to
> just refuse to create more than one domain in the first place. That
> would also allow us to get rid of the global variable gart_handle.

That's what was done in v1, but Robin Murphy suggested that it would be
better not to restrict allocation of unpopulated domains. It is
mentioned in the changelog, see the comment on the v2 changes.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 17/20] iommu/tegra: gart: Prepend error/debug messages with "GART:"
  2018-09-24 10:57   ` Thierry Reding
@ 2018-09-24 18:09     ` Dmitry Osipenko
  0 siblings, 0 replies; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24 18:09 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

On 9/24/18 1:57 PM, Thierry Reding wrote:
> On Mon, Sep 24, 2018 at 03:41:50AM +0300, Dmitry Osipenko wrote:
>> GART became a part of Memory Controller, hence now the driver's device
>> is Memory Controller and not GART. As a result all printed messages are
>> prepended with the "tegra-mc 7000f000.memory-controller:", so let's
>> prepend GART's messages with "GART:" in order to differentiate them
>> from the MC.
>>
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
>>  drivers/iommu/tegra-gart.c | 36 ++++++++++++++++++------------------
>>  1 file changed, 18 insertions(+), 18 deletions(-)
> 
> There's a macro called dev_fmt (similar to pr_fmt) to do this for dev_*
> printers. Also I think this would be more readable if the prefix was
> "gart: " rather than "GART: ". At least from my personal experience I
> get easily distracted by all-caps words in logs, because they usually
> indicate something that requires immediate attention. I think it's
> better to leave that up to higher level mechanisms, such as the color
> keying of messages based on level by tools like dmesg.

The dev_fmt macro is a new thing, thank you for pointing it out. I'll try
to switch to dev_fmt and use a lowercase prefix.
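
For illustration, here is a minimal userspace sketch of how a dev_fmt
override would prefix every message. The kernel's dev_*() printers wrap
their format string in dev_fmt(), which defaults to the bare format; a
driver overrides it before its includes. Everything named mock_* below is
a hypothetical stand-in and not part of the kernel API, and the sample
message is made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* In a real driver this define would sit above the #include lines. */
#define dev_fmt(fmt) "gart: " fmt

/* Hypothetical stand-in for struct device; only the name matters here. */
struct mock_device {
	const char *name;
};

static const char *mock_dev_name(const struct mock_device *dev)
{
	assert(dev != NULL);
	return dev->name;
}

/*
 * Userspace model of a dev_*() printer: the device name comes first,
 * then the dev_fmt()-prefixed message. Writes into buf instead of the
 * kernel log so the result can be inspected.
 */
#define mock_dev_err(dev, buf, len, fmt, ...) \
	snprintf(buf, len, "%s: " dev_fmt(fmt), mock_dev_name(dev), ##__VA_ARGS__)

/* A log line belongs to GART if it carries the lowercase prefix. */
static int mock_line_is_from_gart(const char *line)
{
	return strstr(line, ": gart: ") != NULL;
}
```

With this, MC messages keep the plain "tegra-mc 7000f000.memory-controller:"
prefix while GART messages get "gart: " appended to it, without touching
each call site.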

* Re: [PATCH v4 12/20] iommu/tegra: gart: Integrate with Memory Controller driver
  2018-09-24 10:23   ` Thierry Reding
@ 2018-09-24 18:22     ` Dmitry Osipenko
  2018-09-25 10:02       ` Thierry Reding
  0 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24 18:22 UTC (permalink / raw)
  To: Thierry Reding, Joerg Roedel
  Cc: Jonathan Hunter, Rob Herring, Robin Murphy, iommu, devicetree,
	linux-tegra, linux-kernel

On 9/24/18 1:23 PM, Thierry Reding wrote:
> On Mon, Sep 24, 2018 at 03:41:45AM +0300, Dmitry Osipenko wrote:
>> The device-tree binding has been changed. There is no separate GART device
>> anymore, it is squashed into the Memory Controller. Integrate GART module
>> with the MC in a way it is done for the SMMU of Tegra30+.
>>
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
>>  drivers/iommu/Kconfig      |  1 +
>>  drivers/iommu/tegra-gart.c | 98 ++++++++++----------------------------
>>  drivers/memory/tegra/mc.c  | 41 ++++++++++++++++
>>  include/soc/tegra/mc.h     | 27 +++++++++++
>>  4 files changed, 93 insertions(+), 74 deletions(-)
> 
> I think this could technically have been two patches, but since they'd
> have a compile-time dependency either way they need to be applied in the
> correct order, so some coordination between IOMMU and Tegra trees is
> going to have to happen anyway and might as well just stick this into a
> single patch.

I assume that Joerg will take the whole series once it's ready (no?),
hence your ACK will be needed here and in other patches.

* Re: [PATCH v4 11/20] memory: tegra: Use of_device_get_match_data()
  2018-09-24 10:13   ` Thierry Reding
@ 2018-09-24 18:39     ` Dmitry Osipenko
  2018-09-25 10:00       ` Thierry Reding
  0 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24 18:39 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

On 9/24/18 1:13 PM, Thierry Reding wrote:
> On Mon, Sep 24, 2018 at 03:41:44AM +0300, Dmitry Osipenko wrote:
>> There is no need to match device with the DT node since it was already
>> matched, use of_device_get_match_data() helper to get the match-data.
>>
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
>>  drivers/memory/tegra/mc.c | 10 ++--------
>>  1 file changed, 2 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
>> index 5454ffe5b2e0..cdc33f93cf7c 100644
>> --- a/drivers/memory/tegra/mc.c
>> +++ b/drivers/memory/tegra/mc.c
>> @@ -11,8 +11,7 @@
>>  #include <linux/interrupt.h>
>>  #include <linux/kernel.h>
>>  #include <linux/module.h>
>> -#include <linux/of.h>
>> -#include <linux/platform_device.h>
> 
> It's better not to remove these two because the code still uses
> functions declared in them. If ever we were going to remove code using
> linux/of_device.h and then remove the linux/of_device.h include, we'd
> break the build and have to reintroduce the includes.

That doesn't sound like a good argument. You're way too picky here ;)

> The same would happen if linux/of_device.h were ever to stop including
> linux/platform_device.h or linux/of.h. That may sound unlikely, but it
> has happened in the past with other includes. It can also happen that
> some restructuring takes place in some headers that is not so obvious
> and then things can still start falling apart miles away.

Restructuring will be somebody else's problem. I'm not sure that we really
should care about it; I think it is unnecessary. But since you're
insisting...
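
For reference, a rough userspace model of what the helper in this patch
does: it looks the already-bound device up in its driver's match table and
returns the entry's data pointer, so the probe function needs no lookup of
its own. All model_* names and the SoC descriptors are hypothetical; only
the compatible strings come from the bindings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-SoC descriptor, standing in for the MC SoC data. */
struct model_soc {
	const char *name;
};

struct model_of_device_id {
	const char *compatible;
	const void *data;
};

struct model_driver {
	const struct model_of_device_id *of_match_table;
};

struct model_device {
	const char *compatible;		/* from the device-tree node */
	const struct model_driver *driver;
};

static const struct model_soc model_tegra20_soc = { "tegra20" };
static const struct model_soc model_tegra30_soc = { "tegra30" };

static const struct model_of_device_id model_mc_of_match[] = {
	{ "nvidia,tegra20-mc-gart", &model_tegra20_soc },
	{ "nvidia,tegra30-mc", &model_tegra30_soc },
	{ NULL, NULL }	/* sentinel */
};

static const struct model_driver model_mc_driver = { model_mc_of_match };

/* Model of the helper: match via the bound driver's table, return .data. */
static const void *model_get_match_data(const struct model_device *dev)
{
	const struct model_of_device_id *id;

	assert(dev != NULL && dev->driver != NULL);

	for (id = dev->driver->of_match_table; id->compatible; id++)
		if (strcmp(id->compatible, dev->compatible) == 0)
			return id->data;

	return NULL;
}
```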

* Re: [PATCH v4 03/20] iommu/tegra: gart: Ignore devices without IOMMU phandle in DT
  2018-09-24 10:05   ` Thierry Reding
@ 2018-09-24 18:41     ` Dmitry Osipenko
  0 siblings, 0 replies; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24 18:41 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

On 9/24/18 1:05 PM, Thierry Reding wrote:
> On Mon, Sep 24, 2018 at 03:41:36AM +0300, Dmitry Osipenko wrote:
>> GART can't handle all devices, hence ignore devices that aren't related
>> to GART. IOMMU phandle must be explicitly assigned to devices in the device
>> tree.
> 
> I think technically the GART can indeed handle all devices since it is
> just a physical address region that can be used to remap other physical
> addresses. That's not to say that doing so would be a good idea. So the
> commit message here is slightly confusing, but other than that the idea
> is good, so:
> 
> Acked-by: Thierry Reding <treding@nvidia.com>
> 

It shouldn't be able to serve something like SDHCI, though I haven't
tried to verify that.

* Re: [PATCH v4 16/20] iommu/tegra: gart: Don't use managed resources
  2018-09-24 10:52   ` Thierry Reding
@ 2018-09-24 18:57     ` Dmitry Osipenko
  2018-09-25 10:03       ` Thierry Reding
  0 siblings, 1 reply; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-24 18:57 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

On 9/24/18 1:52 PM, Thierry Reding wrote:
> On Mon, Sep 24, 2018 at 03:41:49AM +0300, Dmitry Osipenko wrote:
>> GART is a part of the Memory Controller driver that is always built-in,
>> hence there is no benefit from the use of managed resources.
>>
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
>>  drivers/iommu/tegra-gart.c | 12 +++++++-----
>>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> One benefit would be cleanup on probe error. Also, we may eventually
> want to make even the memory controller driver buildable as a module.

Please, let's keep that patch as is and re-introduce devm once it
becomes really relevant.

* Re: [PATCH v4 11/20] memory: tegra: Use of_device_get_match_data()
  2018-09-24 18:39     ` Dmitry Osipenko
@ 2018-09-25 10:00       ` Thierry Reding
  2018-09-25 13:53         ` Dmitry Osipenko
  0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2018-09-25 10:00 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

On Mon, Sep 24, 2018 at 09:39:43PM +0300, Dmitry Osipenko wrote:
> On 9/24/18 1:13 PM, Thierry Reding wrote:
> > On Mon, Sep 24, 2018 at 03:41:44AM +0300, Dmitry Osipenko wrote:
> >> There is no need to match device with the DT node since it was already
> >> matched, use of_device_get_match_data() helper to get the match-data.
> >>
> >> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >> ---
> >>  drivers/memory/tegra/mc.c | 10 ++--------
> >>  1 file changed, 2 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
> >> index 5454ffe5b2e0..cdc33f93cf7c 100644
> >> --- a/drivers/memory/tegra/mc.c
> >> +++ b/drivers/memory/tegra/mc.c
> >> @@ -11,8 +11,7 @@
> >>  #include <linux/interrupt.h>
> >>  #include <linux/kernel.h>
> >>  #include <linux/module.h>
> >> -#include <linux/of.h>
> >> -#include <linux/platform_device.h>
> > 
> > It's better not to remove these two because the code still uses
> > functions declared in them. If ever we were going to remove code using
> > linux/of_device.h and then remove the linux/of_device.h include, we'd
> > break the build and have to reintroduce the includes.
> 
> That doesn't sound like a good argument. You're way too picky here ;)
> 
> > The same would happen if linux/of_device.h were ever to stop including
> > linux/platform_device.h or linux/of.h. That may sound unlikely, but it
> > has happened in the past with other includes. It can also happen that
> > some restructuring takes place in some headers that is not so obvious
> > and then things can still start falling apart miles away.
> 
> > Restructuring will be somebody else's problem. I'm not sure that we really
> > should care about it; I think it is unnecessary. But since you're
> > insisting...

It's actually a very common argument and I've seen patches in the past
that add includes just for the purpose of making sure the right
definitions get pulled in. This happens quite frequently as a preamble
to some major rework of some header files that would otherwise cause a
lot of breakage.

So I think it's best to be proactive about this and make sure we
explicitly pull in all the necessary headers in the first place,
irrespective of whether or not they may already get pulled in indirectly
by some other headers.

Thierry

* Re: [PATCH v4 12/20] iommu/tegra: gart: Integrate with Memory Controller driver
  2018-09-24 18:22     ` Dmitry Osipenko
@ 2018-09-25 10:02       ` Thierry Reding
  0 siblings, 0 replies; 59+ messages in thread
From: Thierry Reding @ 2018-09-25 10:02 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Joerg Roedel, Jonathan Hunter, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

On Mon, Sep 24, 2018 at 09:22:59PM +0300, Dmitry Osipenko wrote:
> On 9/24/18 1:23 PM, Thierry Reding wrote:
> > On Mon, Sep 24, 2018 at 03:41:45AM +0300, Dmitry Osipenko wrote:
> >> The device-tree binding has been changed. There is no separate GART device
> >> anymore, it is squashed into the Memory Controller. Integrate GART module
> >> with the MC in a way it is done for the SMMU of Tegra30+.
> >>
> >> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >> ---
> >>  drivers/iommu/Kconfig      |  1 +
> >>  drivers/iommu/tegra-gart.c | 98 ++++++++++----------------------------
> >>  drivers/memory/tegra/mc.c  | 41 ++++++++++++++++
> >>  include/soc/tegra/mc.h     | 27 +++++++++++
> >>  4 files changed, 93 insertions(+), 74 deletions(-)
> > 
> > I think this could technically have been two patches, but since they'd
> > have a compile-time dependency either way they need to be applied in the
> > correct order, so some coordination between IOMMU and Tegra trees is
> > going to have to happen anyway and might as well just stick this into a
> > single patch.
> 
> I assume that Joerg will take the whole series once it's ready (no?),
> hence your ACK will be needed here and in other patches.

Yeah, either Joerg takes them all with my Acked-by's or provides his for
the IOMMU bits and I take the batch through the Tegra tree. I don't have
a strong preference.

Thierry

* Re: [PATCH v4 16/20] iommu/tegra: gart: Don't use managed resources
  2018-09-24 18:57     ` Dmitry Osipenko
@ 2018-09-25 10:03       ` Thierry Reding
  2018-09-25 13:41         ` Dmitry Osipenko
  0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2018-09-25 10:03 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

On Mon, Sep 24, 2018 at 09:57:10PM +0300, Dmitry Osipenko wrote:
> On 9/24/18 1:52 PM, Thierry Reding wrote:
> > On Mon, Sep 24, 2018 at 03:41:49AM +0300, Dmitry Osipenko wrote:
> >> GART is a part of the Memory Controller driver that is always built-in,
> >> hence there is no benefit from the use of managed resources.
> >>
> >> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >> ---
> >>  drivers/iommu/tegra-gart.c | 12 +++++++-----
> >>  1 file changed, 7 insertions(+), 5 deletions(-)
> > 
> > One benefit would be cleanup on probe error. Also, we may eventually
> > want to make even the memory controller driver buildable as a module.
> 
> > Please, let's keep that patch as is and re-introduce devm once it
> > becomes really relevant.

Alright, fine with me.

Thierry

* Re: [PATCH v4 18/20] iommu/tegra: gart: Don't detach devices from inactive domains
  2018-09-24 18:05     ` Dmitry Osipenko
@ 2018-09-25 10:04       ` Thierry Reding
  2018-09-25 13:41         ` Dmitry Osipenko
  0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2018-09-25 10:04 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

On Mon, Sep 24, 2018 at 09:05:48PM +0300, Dmitry Osipenko wrote:
> On 9/24/18 2:00 PM, Thierry Reding wrote:
> > On Mon, Sep 24, 2018 at 03:41:51AM +0300, Dmitry Osipenko wrote:
> >> There could be an unlimited number of allocated domains, but only one domain
> >> can be active at a time. Hence devices must be detached only from the
> >> active domain.
> >>
> >> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >> ---
> >>  drivers/iommu/tegra-gart.c | 8 +++++---
> >>  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > Do we have a mechanism of switching out different domains? I don't think
> > we do, so I'm wondering if perhaps a better solution to this would be to
> > just refuse to create more than one domain in the first place. That
> > would also allow us to get rid of the global variable gart_handle.
> 
> > That's what was done in v1; Robin Murphy suggested that it would be
> > better not to restrict allocation of unpopulated domains. It is
> > mentioned in the changelog, see the comment on the v2 changes.

Okay, fine.

Thierry

* Re: [PATCH v4 19/20] iommu/tegra: gart: Simplify clients-tracking code
  2018-09-24 17:50     ` Dmitry Osipenko
@ 2018-09-25 10:09       ` Thierry Reding
  2018-09-25 13:47         ` Dmitry Osipenko
  0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2018-09-25 10:09 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

On Mon, Sep 24, 2018 at 08:50:35PM +0300, Dmitry Osipenko wrote:
> On 9/24/18 2:10 PM, Thierry Reding wrote:
> > On Mon, Sep 24, 2018 at 03:41:52AM +0300, Dmitry Osipenko wrote:
> > >> GART is a simple IOMMU provider that has a single address space. There is
> > >> no need to set up a global clients list and manage it for tracking of the
> > >> active domain, hence lots of code could be safely removed and replaced
> > >> with a simpler alternative.
> >>
> >> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >> ---
> >>  drivers/iommu/tegra-gart.c | 157 +++++++++----------------------------
> >>  1 file changed, 39 insertions(+), 118 deletions(-)
> >>
> >> diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
> >> index 306e9644a676..7182445c3b76 100644
> >> --- a/drivers/iommu/tegra-gart.c
> >> +++ b/drivers/iommu/tegra-gart.c
> >> @@ -19,7 +19,6 @@
> >>  
> >>  #include <linux/io.h>
> >>  #include <linux/iommu.h>
> >> -#include <linux/list.h>
> >>  #include <linux/module.h>
> >>  #include <linux/platform_device.h>
> >>  #include <linux/slab.h>
> >> @@ -42,30 +41,20 @@
> >>  #define GART_PAGE_MASK						\
> >>  	(~(GART_PAGE_SIZE - 1) & ~GART_ENTRY_PHYS_ADDR_VALID)
> >>  
> >> -struct gart_client {
> >> -	struct device		*dev;
> >> -	struct list_head	list;
> >> -};
> >> -
> >>  struct gart_device {
> >>  	void __iomem		*regs;
> >>  	u32			*savedata;
> >>  	u32			page_count;	/* total remappable size */
> >>  	dma_addr_t		iovmm_base;	/* offset to vmm_area */
> >>  	spinlock_t		pte_lock;	/* for pagetable */
> >> -	struct list_head	client;
> >> -	spinlock_t		client_lock;	/* for client list */
> >> +	spinlock_t		dom_lock;	/* for active domain */
> >> +	unsigned int		active_devices;	/* number of active devices */
> >>  	struct iommu_domain	*active_domain;	/* current active domain */
> >>  	struct device		*dev;
> >>  
> >>  	struct iommu_device	iommu;		/* IOMMU Core handle */
> >>  };
> >>  
> >> -struct gart_domain {
> >> -	struct iommu_domain domain;		/* generic domain handle */
> >> -	struct gart_device *gart;		/* link to gart device   */
> >> -};
> >> -
> >>  static struct gart_device *gart_handle; /* unique for a system */
> >>  
> >>  static bool gart_debug;
> >> @@ -73,11 +62,6 @@ static bool gart_debug;
> >>  #define GART_PTE(_pfn)						\
> >>  	(GART_ENTRY_PHYS_ADDR_VALID | ((_pfn) << PAGE_SHIFT))
> >>  
> >> -static struct gart_domain *to_gart_domain(struct iommu_domain *dom)
> >> -{
> >> -	return container_of(dom, struct gart_domain, domain);
> >> -}
> >> -
> >>  /*
> >>   * Any interaction between any block on PPSB and a block on APB or AHB
> >>   * must have these read-back to ensure the APB/AHB bus transaction is
> >> @@ -166,128 +150,69 @@ static inline bool gart_iova_range_valid(struct gart_device *gart,
> >>  static int gart_iommu_attach_dev(struct iommu_domain *domain,
> >>  				 struct device *dev)
> >>  {
> >> -	struct gart_domain *gart_domain = to_gart_domain(domain);
> >>  	struct gart_device *gart = gart_handle;
> >> -	struct gart_client *client, *c;
> >> -	int err = 0;
> >> -
> >> -	client = kzalloc(sizeof(*c), GFP_KERNEL);
> >> -	if (!client)
> >> -		return -ENOMEM;
> >> -	client->dev = dev;
> >> -
> >> -	spin_lock(&gart->client_lock);
> >> -	list_for_each_entry(c, &gart->client, list) {
> >> -		if (c->dev == dev) {
> >> -			dev_err(gart->dev, "GART: %s is already attached\n",
> >> -				dev_name(dev));
> >> -			err = -EINVAL;
> >> -			goto fail;
> >> -		}
> >> -	}
> >> -	if (gart->active_domain && gart->active_domain != domain) {
> >> -		dev_err(gart->dev,
> >> -			"GART: Only one domain can be active at a time\n");
> >> -		err = -EINVAL;
> >> -		goto fail;
> >> -	}
> >> -	gart->active_domain = domain;
> >> -	gart_domain->gart = gart;
> >> -	list_add(&client->list, &gart->client);
> >> -	spin_unlock(&gart->client_lock);
> >> -	dev_dbg(gart->dev, "GART: Attached %s\n", dev_name(dev));
> >> -	return 0;
> >> +	int ret = 0;
> >>  
> >> -fail:
> >> -	kfree(client);
> >> -	spin_unlock(&gart->client_lock);
> >> -	return err;
> >> -}
> >> +	spin_lock(&gart->dom_lock);
> >>  
> >> -static void __gart_iommu_detach_dev(struct iommu_domain *domain,
> >> -				    struct device *dev)
> >> -{
> >> -	struct gart_domain *gart_domain = to_gart_domain(domain);
> >> -	struct gart_device *gart = gart_domain->gart;
> >> -	struct gart_client *c;
> >> -
> >> -	list_for_each_entry(c, &gart->client, list) {
> >> -		if (c->dev == dev) {
> >> -			list_del(&c->list);
> >> -			kfree(c);
> >> -			if (list_empty(&gart->client)) {
> >> -				gart->active_domain = NULL;
> >> -				gart_domain->gart = NULL;
> >> -			}
> >> -			dev_dbg(gart->dev, "GART: Detached %s\n",
> >> -				dev_name(dev));
> >> -			return;
> >> -		}
> >> +	if (gart->active_domain && gart->active_domain != domain) {
> >> +		ret = -EBUSY;
> > 
> > This omits the error message and returns -EBUSY instead of -EINVAL. Was
> > this intended? For what it's worth, I do agree with the changes, it's
> > just that I think you could've made those in the earlier patch that
> > introduced them.
> 
> The message isn't really needed and EBUSY seems to fit better than EINVAL here.
> 
> > But this is all one series and the end result looks fine, so no need to
> > be that picky.
> 
> Good, thanks.
> 
> >> +	} else if (dev->archdata.iommu != domain) {
> >> +		dev->archdata.iommu = domain;
> >> +		gart->active_domain = domain;
> >> +		gart->active_devices++;
> >>  	}
> >>  
> >> -	dev_err(gart->dev, "GART: Couldn't find %s to detach\n",
> >> -		dev_name(dev));
> >> +	spin_unlock(&gart->dom_lock);
> >> +
> >> +	return ret;
> >>  }
> >>  
> >>  static void gart_iommu_detach_dev(struct iommu_domain *domain,
> >>  				  struct device *dev)
> >>  {
> >> -	struct gart_domain *gart_domain = to_gart_domain(domain);
> >> -	struct gart_device *gart = gart_domain->gart;
> >> +	struct gart_device *gart = gart_handle;
> >> +
> >> +	spin_lock(&gart->dom_lock);
> >>  
> >> -	spin_lock(&gart->client_lock);
> >> -	__gart_iommu_detach_dev(domain, dev);
> >> -	spin_unlock(&gart->client_lock);
> >> +	if (dev->archdata.iommu == domain) {
> >> +		dev->archdata.iommu = NULL;
> >> +
> >> +		if (--gart->active_devices == 0)
> >> +			gart->active_domain = NULL;
> >> +	}
> >> +
> >> +	spin_unlock(&gart->dom_lock);
> >>  }
> >>  
> >>  static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
> >>  {
> >> -	struct gart_domain *gart_domain;
> >> -	struct gart_device *gart;
> >> +	struct gart_device *gart = gart_handle;
> >> +	struct iommu_domain *domain;
> >>  
> >>  	if (type != IOMMU_DOMAIN_UNMANAGED)
> >>  		return NULL;
> >>  
> >> -	gart = gart_handle;
> >> -	if (!gart)
> >> -		return NULL;
> >> -
> >> -	gart_domain = kzalloc(sizeof(*gart_domain), GFP_KERNEL);
> >> -	if (!gart_domain)
> >> -		return NULL;
> >> -
> >> -	gart_domain->domain.geometry.aperture_start = gart->iovmm_base;
> >> -	gart_domain->domain.geometry.aperture_end = gart->iovmm_base +
> >> +	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
> >> +	if (domain) {
> >> +		domain->geometry.aperture_start = gart->iovmm_base;
> >> +		domain->geometry.aperture_end = gart->iovmm_base +
> >>  					gart->page_count * GART_PAGE_SIZE - 1;
> >> -	gart_domain->domain.geometry.force_aperture = true;
> >> +		domain->geometry.force_aperture = true;
> >> +	}
> >>  
> >> -	return &gart_domain->domain;
> >> +	return domain;
> >>  }
> >>  
> >>  static void gart_iommu_domain_free(struct iommu_domain *domain)
> >>  {
> >> -	struct gart_domain *gart_domain = to_gart_domain(domain);
> >> -	struct gart_device *gart = gart_domain->gart;
> >> -
> >> -	if (gart) {
> >> -		spin_lock(&gart->client_lock);
> >> -		if (!list_empty(&gart->client)) {
> >> -			struct gart_client *c, *tmp;
> >> -
> >> -			list_for_each_entry_safe(c, tmp, &gart->client, list)
> >> -				__gart_iommu_detach_dev(domain, c->dev);
> >> -		}
> >> -		spin_unlock(&gart->client_lock);
> >> -	}
> >> -
> >> -	kfree(gart_domain);
> >> +	kfree(domain);
> >>  }
> > 
> > Doesn't this now make it possible to free a potentially active domain?
> 
> Yes, don't do it. I can add a WARN_ON() here, though I think the IOMMU core
> should be the one taking care of that.

Yeah, might be good to have the WARN_ON() either here or in the IOMMU
core. Force-detaching is probably a good idea, too, otherwise the users
of the freed domain are just going to crash anyway, right? Maybe
something to discuss more generally with Joerg.

I think in the meantime just having the WARN_ON() here is probably good
enough. It should point out the cases where we do free the domain with
devices still attached, which hopefully don't exist, and we can fix
them.
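
To make the case concrete, here is a small userspace model of the new
attach/detach bookkeeping from the patch, plus the condition a WARN_ON()
in gart_iommu_domain_free() could test. It is only a sketch: the model_*
types are hypothetical stand-ins, locking is left out, and the exact
WARN_ON() condition is an assumption rather than what a later revision
necessarily uses.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_domain {
	int id;
};

/* 'attached' plays the role of dev->archdata.iommu in the patch. */
struct model_device {
	struct model_domain *attached;
};

struct model_gart {
	struct model_domain *active_domain;
	unsigned int active_devices;
};

static int model_attach(struct model_gart *gart, struct model_domain *domain,
			struct model_device *dev)
{
	/* Only one domain can be active at a time. */
	if (gart->active_domain && gart->active_domain != domain)
		return -EBUSY;

	/* Re-attaching the same device must not bump the counter. */
	if (dev->attached != domain) {
		dev->attached = domain;
		gart->active_domain = domain;
		gart->active_devices++;
	}

	return 0;
}

static void model_detach(struct model_gart *gart, struct model_domain *domain,
			 struct model_device *dev)
{
	/* Devices attached to another (or no) domain are ignored. */
	if (dev->attached == domain) {
		assert(gart->active_devices > 0);
		dev->attached = NULL;

		if (--gart->active_devices == 0)
			gart->active_domain = NULL;
	}
}

/* True if freeing 'domain' now would leave devices pointing at it. */
static bool model_free_would_warn(const struct model_gart *gart,
				  const struct model_domain *domain)
{
	return gart->active_domain == domain;
}
```

In this model a domain only stops being "active" once its last device is
detached, so the check fires exactly when a domain is freed with devices
still attached.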

> >>  static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
> >>  			  phys_addr_t pa, size_t bytes, int prot)
> >>  {
> >> -	struct gart_domain *gart_domain = to_gart_domain(domain);
> >> -	struct gart_device *gart = gart_domain->gart;
> >> +	struct gart_device *gart = gart_handle;
> > 
> > Hmm... this now introduces more uses of the gart_handle that I hoped we
> > could get rid of. I think we could still keep around struct gart_domain
> > and just make sure it is unique. The small amounts of casting here seem
> > mostly harmless to me, especially since they will be nops, so we end up
> > with just one dereference to get at the struct gart_device. I think the
> > benefits of not having this global variable around are worth the one
> > dereference here.
> 
> What are the benefits? I don't see anything other than a pedantic oddity.
> 
> I've removed gart_domain in the end because it is extra code (and
> consumes resources) without any benefit. Let's keep that part as it is
> now. I'll be happy to change that code if you explain why it is worth
> it.

I thought I did explain. Anyway, it's always been like this, so no need
to change it as part of this series.
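
For the record, a userspace sketch of the "casts will be nops" point:
with the generic domain as the first member of the wrapper, container_of()
subtracts an offset of zero, so reaching the GART device costs a single
pointer dereference. The *_model types are hypothetical stand-ins for the
structures removed by the patch, and the macro below is a simplified
userspace version of the kernel's container_of().

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace container_of(), without the kernel's type checks. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct iommu_domain_model {
	unsigned long aperture_start;
};

struct gart_device_model {
	unsigned int page_count;
};

struct gart_domain_model {
	struct iommu_domain_model domain;	/* first member: offset 0 */
	struct gart_device_model *gart;		/* link to the GART device */
};

static struct gart_domain_model *to_gart_domain(struct iommu_domain_model *dom)
{
	assert(dom != NULL);
	return container_of(dom, struct gart_domain_model, domain);
}
```

So to_gart_domain(domain)->gart would replace the direct use of
gart_handle in the map/unmap paths.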

Thierry

* Re: [PATCH v4 09/20] memory: tegra: Adapt to Tegra20 device-tree binding changes
  2018-09-24 13:22     ` Dmitry Osipenko
@ 2018-09-25 12:16       ` Dmitry Osipenko
  0 siblings, 0 replies; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-25 12:16 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

On 9/24/18 4:22 PM, Dmitry Osipenko wrote:
> On 9/24/18 1:02 PM, Thierry Reding wrote:
>> On Mon, Sep 24, 2018 at 03:41:42AM +0300, Dmitry Osipenko wrote:
>>> The tegra20-mc device-tree binding has been changed, GART has been
>>> squashed into Memory Controller and now the clock property is mandatory
>>> for Tegra20, the DT compatible has been changed as well. Adapt driver to
>>> the DT changes.
>>>
>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>>> ---
>>>   drivers/memory/tegra/mc.c | 21 ++++++++-------------
>>>   drivers/memory/tegra/mc.h |  6 ------
>>>   include/soc/tegra/mc.h    |  2 +-
>>>   3 files changed, 9 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
>>> index e56862495f36..1b4ceefd82f9 100644
>>> --- a/drivers/memory/tegra/mc.c
>>> +++ b/drivers/memory/tegra/mc.c
>>> @@ -51,7 +51,7 @@
>>>     static const struct of_device_id tegra_mc_of_match[] = {
>>>   #ifdef CONFIG_ARCH_TEGRA_2x_SOC
>>> -    { .compatible = "nvidia,tegra20-mc", .data = &tegra20_mc_soc },
>>> +    { .compatible = "nvidia,tegra20-mc-gart", .data = &tegra20_mc_soc },
>>
>> Technically we now regress because we no longer support the older device
>> tree bindings. I know that it doesn't really matter because this driver
>> doesn't really do much interesting yet other than reporting memory
>> access violations, but if that's enough to warrant a change of the
>> compatible string, then I think we also need to preserve compatibility
>> in the code.
>>
>> That said, I think compatibility would be easier to preserve if we stuck
>> with the old compatible string and used a "reg-names" property to
>> specify which version of the binding we're referring to.
>>
>> For example, we could have:
>>
>>     memory-controller@7000f000 {
>>         compatible = "nvidia,tegra20-mc";
>>         reg = <0x7000f000 0x024
>>                0x7000f03c 0x3c4>;
>>         ...
>>     };
>>
>> for the old binding and:
>>
>>     memory-controller@7000f000 {
>>         compatible = "nvidia,tegra20-mc";
>>         reg = <0x7000f000 0x00000400>,
>>               <0x58000000 0x02000000>;
>>         reg-names = "mc", "gart";
>>         ...
>>     };
>>
>> for the new binding. The driver can then easily check for the existence
>> of the reg-names property and take the legacy or new code paths.
> 
> There is no problem with keeping compatibility for newer kernels with
> the older binding, it's just not worth the effort. The real problem is
> keeping compatibility of older kernels with the new binding: the older
> kernels won't care about the reg-names and will treat the GART registers as
> the second register bank of the Memory Controller. Unfortunately I
> don't see how your suggestion is supposed to help with the problem.

I have another variant. What about dropping the GART registers from the
binding? The range is always fixed and there is no good reason to
artificially change it. I recall that in the past you didn't like the
patch that made the GART's aperture size fixed, saying that some
imaginary person may want to change it via DT. That's still not a very
good argument to me; I can't see a good reason why anyone would want to
change the aperture size.

The new binding will look like this (just like the T30+ binding, only the
#iommu-cells number differs):

     memory-controller@7000f000 {
         compatible = "nvidia,tegra20-mc";
         reg = <0x7000f000 0x00000400>;
         clocks = <&tegra_car TEGRA20_CLK_MC>;
         clock-names = "mc";

         interrupts = <GIC_SPI 77 IRQ_TYPE_LEVEL_HIGH>;

         #reset-cells = <1>;
         #iommu-cells = <0>;
     };

That way older kernels will continue to work with the new binding because
the second register range is absent, and new kernels may keep
supporting the old binding. Though I don't think that keeping support for
the old binding is really worth the churn. Thoughts?

Note that new kernels will require the "mc" clock and hence the old
binding will be rejected because it doesn't have that clock.
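
As a rough sketch of what "the range is always fixed" would mean on the
driver side: the aperture could come from constants instead of a second
reg entry. The base and size below are taken from the "gart" reg range in
the example quoted above; the 4 KiB page size and all names here are
assumptions for illustration only. The aperture end follows the same
formula as gart_iommu_domain_alloc().

```c
#include <assert.h>
#include <stdint.h>

/* Values from the "gart" reg entry in the quoted binding example. */
#define GART_APERTURE_BASE	0x58000000u
#define GART_APERTURE_SIZE	0x02000000u	/* 32 MiB */

/* Assumed 4 KiB GART page size. */
#define GART_PAGE_SHIFT		12
#define GART_PAGE_SIZE		(1u << GART_PAGE_SHIFT)

struct gart_geometry {
	uint32_t aperture_start;
	uint32_t aperture_end;	/* inclusive */
	uint32_t page_count;
};

static struct gart_geometry gart_fixed_geometry(void)
{
	struct gart_geometry g;

	g.page_count = GART_APERTURE_SIZE / GART_PAGE_SIZE;
	g.aperture_start = GART_APERTURE_BASE;
	g.aperture_end = GART_APERTURE_BASE +
			 g.page_count * GART_PAGE_SIZE - 1;

	assert(g.aperture_end > g.aperture_start);

	return g;
}
```

Under these assumptions the aperture covers 8192 pages and ends at
0x59ffffff.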

* Re: [PATCH v4 16/20] iommu/tegra: gart: Don't use managed resources
  2018-09-25 10:03       ` Thierry Reding
@ 2018-09-25 13:41         ` Dmitry Osipenko
  0 siblings, 0 replies; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-25 13:41 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

On 9/25/18 1:03 PM, Thierry Reding wrote:
> On Mon, Sep 24, 2018 at 09:57:10PM +0300, Dmitry Osipenko wrote:
>> On 9/24/18 1:52 PM, Thierry Reding wrote:
>>> On Mon, Sep 24, 2018 at 03:41:49AM +0300, Dmitry Osipenko wrote:
>>>> GART is a part of the Memory Controller driver that is always built-in,
>>>> hence there is no benefit from the use of managed resources.
>>>>
>>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>>>> ---
>>>>  drivers/iommu/tegra-gart.c | 12 +++++++-----
>>>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>>
>>> One benefit would be cleanup on probe error. Also, we may eventually
>>> want to make even the memory controller driver buildable as a module.
>>
>> Please, let's keep that patch as is and re-introduce devm once it
>> becomes really relevant.
> 
> Alright, fine with me.

Thanks, I'm taking that as ACK.

* Re: [PATCH v4 18/20] iommu/tegra: gart: Don't detach devices from inactive domains
  2018-09-25 10:04       ` Thierry Reding
@ 2018-09-25 13:41         ` Dmitry Osipenko
  0 siblings, 0 replies; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-25 13:41 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

On 9/25/18 1:04 PM, Thierry Reding wrote:
> On Mon, Sep 24, 2018 at 09:05:48PM +0300, Dmitry Osipenko wrote:
>> On 9/24/18 2:00 PM, Thierry Reding wrote:
>>> On Mon, Sep 24, 2018 at 03:41:51AM +0300, Dmitry Osipenko wrote:
>>>> There could be an unlimited number of allocated domains, but only one domain
>>>> can be active at a time. Hence devices must be detached only from the
>>>> active domain.
>>>>
>>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>>>> ---
>>>>  drivers/iommu/tegra-gart.c | 8 +++++---
>>>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>>
>>> Do we have a mechanism of switching out different domains? I don't think
>>> we do, so I'm wondering if perhaps a better solution to this would be to
>>> just refuse to create more than one domain in the first place. That
>>> would also allow us to get rid of the global variable gart_handle.
>>
>> That's what was done in v1; Robin Murphy suggested that it would be
>> better not to restrict allocation of unpopulated domains. It is
>> mentioned in the changelog, see the comment on the v2 changes.
> 
> Okay, fine.

Thanks, taking as ACK.
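
The rule from the commit message (any number of allocated domains, at most one active) can be modelled in a few lines. This is a userspace sketch with hypothetical stand-in types. It loosely follows the counter-based tracking of patch 19/20 in this series and leaves out the spinlock that the real driver takes around these updates.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel types. */
struct domain {
	int unused;
};

struct device {
	struct domain *iommu;	/* domain this device is attached to */
};

struct gart {
	struct domain *active_domain;	/* NULL while nothing is attached */
	unsigned int active_devices;	/* devices attached to active_domain */
};

/* Attaching fails only if a different domain is already active. */
static int gart_attach(struct gart *gart, struct domain *domain,
		       struct device *dev)
{
	if (gart->active_domain && gart->active_domain != domain)
		return -EBUSY;

	/* Re-attaching an already attached device must not count twice. */
	if (dev->iommu != domain) {
		dev->iommu = domain;
		gart->active_domain = domain;
		gart->active_devices++;
	}

	return 0;
}

/* Detaching from a domain the device is not attached to does nothing. */
static void gart_detach(struct gart *gart, struct domain *domain,
			struct device *dev)
{
	if (dev->iommu != domain)
		return;

	dev->iommu = NULL;

	/* The last detach makes room for another domain to become active. */
	if (--gart->active_devices == 0)
		gart->active_domain = NULL;
}
```

Because a detach request for an inactive domain never matches dev->iommu, it cannot disturb the devices of the active domain, which is the behaviour the patch title asks for.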


* Re: [PATCH v4 19/20] iommu/tegra: gart: Simplify clients-tracking code
  2018-09-25 10:09       ` Thierry Reding
@ 2018-09-25 13:47         ` Dmitry Osipenko
  0 siblings, 0 replies; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-25 13:47 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

On 9/25/18 1:09 PM, Thierry Reding wrote:
> On Mon, Sep 24, 2018 at 08:50:35PM +0300, Dmitry Osipenko wrote:
>> On 9/24/18 2:10 PM, Thierry Reding wrote:
>>> On Mon, Sep 24, 2018 at 03:41:52AM +0300, Dmitry Osipenko wrote:
>>>> GART is a simple IOMMU provider that has a single address space. There
>>>> is no need to set up a global clients list and manage it for tracking
>>>> the active domain, hence lots of code can be safely removed and
>>>> replaced with a simpler alternative.
>>>>
>>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>>>> ---
>>>>  drivers/iommu/tegra-gart.c | 157 +++++++++----------------------------
>>>>  1 file changed, 39 insertions(+), 118 deletions(-)
>>>>
>>>> diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
>>>> index 306e9644a676..7182445c3b76 100644
>>>> --- a/drivers/iommu/tegra-gart.c
>>>> +++ b/drivers/iommu/tegra-gart.c
>>>> @@ -19,7 +19,6 @@
>>>>  
>>>>  #include <linux/io.h>
>>>>  #include <linux/iommu.h>
>>>> -#include <linux/list.h>
>>>>  #include <linux/module.h>
>>>>  #include <linux/platform_device.h>
>>>>  #include <linux/slab.h>
>>>> @@ -42,30 +41,20 @@
>>>>  #define GART_PAGE_MASK						\
>>>>  	(~(GART_PAGE_SIZE - 1) & ~GART_ENTRY_PHYS_ADDR_VALID)
>>>>  
>>>> -struct gart_client {
>>>> -	struct device		*dev;
>>>> -	struct list_head	list;
>>>> -};
>>>> -
>>>>  struct gart_device {
>>>>  	void __iomem		*regs;
>>>>  	u32			*savedata;
>>>>  	u32			page_count;	/* total remappable size */
>>>>  	dma_addr_t		iovmm_base;	/* offset to vmm_area */
>>>>  	spinlock_t		pte_lock;	/* for pagetable */
>>>> -	struct list_head	client;
>>>> -	spinlock_t		client_lock;	/* for client list */
>>>> +	spinlock_t		dom_lock;	/* for active domain */
>>>> +	unsigned int		active_devices;	/* number of active devices */
>>>>  	struct iommu_domain	*active_domain;	/* current active domain */
>>>>  	struct device		*dev;
>>>>  
>>>>  	struct iommu_device	iommu;		/* IOMMU Core handle */
>>>>  };
>>>>  
>>>> -struct gart_domain {
>>>> -	struct iommu_domain domain;		/* generic domain handle */
>>>> -	struct gart_device *gart;		/* link to gart device   */
>>>> -};
>>>> -
>>>>  static struct gart_device *gart_handle; /* unique for a system */
>>>>  
>>>>  static bool gart_debug;
>>>> @@ -73,11 +62,6 @@ static bool gart_debug;
>>>>  #define GART_PTE(_pfn)						\
>>>>  	(GART_ENTRY_PHYS_ADDR_VALID | ((_pfn) << PAGE_SHIFT))
>>>>  
>>>> -static struct gart_domain *to_gart_domain(struct iommu_domain *dom)
>>>> -{
>>>> -	return container_of(dom, struct gart_domain, domain);
>>>> -}
>>>> -
>>>>  /*
>>>>   * Any interaction between any block on PPSB and a block on APB or AHB
>>>>   * must have these read-back to ensure the APB/AHB bus transaction is
>>>> @@ -166,128 +150,69 @@ static inline bool gart_iova_range_valid(struct gart_device *gart,
>>>>  static int gart_iommu_attach_dev(struct iommu_domain *domain,
>>>>  				 struct device *dev)
>>>>  {
>>>> -	struct gart_domain *gart_domain = to_gart_domain(domain);
>>>>  	struct gart_device *gart = gart_handle;
>>>> -	struct gart_client *client, *c;
>>>> -	int err = 0;
>>>> -
>>>> -	client = kzalloc(sizeof(*c), GFP_KERNEL);
>>>> -	if (!client)
>>>> -		return -ENOMEM;
>>>> -	client->dev = dev;
>>>> -
>>>> -	spin_lock(&gart->client_lock);
>>>> -	list_for_each_entry(c, &gart->client, list) {
>>>> -		if (c->dev == dev) {
>>>> -			dev_err(gart->dev, "GART: %s is already attached\n",
>>>> -				dev_name(dev));
>>>> -			err = -EINVAL;
>>>> -			goto fail;
>>>> -		}
>>>> -	}
>>>> -	if (gart->active_domain && gart->active_domain != domain) {
>>>> -		dev_err(gart->dev,
>>>> -			"GART: Only one domain can be active at a time\n");
>>>> -		err = -EINVAL;
>>>> -		goto fail;
>>>> -	}
>>>> -	gart->active_domain = domain;
>>>> -	gart_domain->gart = gart;
>>>> -	list_add(&client->list, &gart->client);
>>>> -	spin_unlock(&gart->client_lock);
>>>> -	dev_dbg(gart->dev, "GART: Attached %s\n", dev_name(dev));
>>>> -	return 0;
>>>> +	int ret = 0;
>>>>  
>>>> -fail:
>>>> -	kfree(client);
>>>> -	spin_unlock(&gart->client_lock);
>>>> -	return err;
>>>> -}
>>>> +	spin_lock(&gart->dom_lock);
>>>>  
>>>> -static void __gart_iommu_detach_dev(struct iommu_domain *domain,
>>>> -				    struct device *dev)
>>>> -{
>>>> -	struct gart_domain *gart_domain = to_gart_domain(domain);
>>>> -	struct gart_device *gart = gart_domain->gart;
>>>> -	struct gart_client *c;
>>>> -
>>>> -	list_for_each_entry(c, &gart->client, list) {
>>>> -		if (c->dev == dev) {
>>>> -			list_del(&c->list);
>>>> -			kfree(c);
>>>> -			if (list_empty(&gart->client)) {
>>>> -				gart->active_domain = NULL;
>>>> -				gart_domain->gart = NULL;
>>>> -			}
>>>> -			dev_dbg(gart->dev, "GART: Detached %s\n",
>>>> -				dev_name(dev));
>>>> -			return;
>>>> -		}
>>>> +	if (gart->active_domain && gart->active_domain != domain) {
>>>> +		ret = -EBUSY;
>>>
>>> This omits the error message and returns -EBUSY instead of -EINVAL. Was
>>> this intended? For what it's worth, I do agree with the changes, it's
>>> just that I think you could've made those in the earlier patch that
>>> introduced them.
>>
>> The message isn't really needed and EBUSY seems to fit better than EINVAL here.
>>
>>> But this is all one series and the end result looks fine, so no need to
>>> be that picky.
>>
>> Good, thanks.
>>
>>>> +	} else if (dev->archdata.iommu != domain) {
>>>> +		dev->archdata.iommu = domain;
>>>> +		gart->active_domain = domain;
>>>> +		gart->active_devices++;
>>>>  	}
>>>>  
>>>> -	dev_err(gart->dev, "GART: Couldn't find %s to detach\n",
>>>> -		dev_name(dev));
>>>> +	spin_unlock(&gart->dom_lock);
>>>> +
>>>> +	return ret;
>>>>  }
>>>>  
>>>>  static void gart_iommu_detach_dev(struct iommu_domain *domain,
>>>>  				  struct device *dev)
>>>>  {
>>>> -	struct gart_domain *gart_domain = to_gart_domain(domain);
>>>> -	struct gart_device *gart = gart_domain->gart;
>>>> +	struct gart_device *gart = gart_handle;
>>>> +
>>>> +	spin_lock(&gart->dom_lock);
>>>>  
>>>> -	spin_lock(&gart->client_lock);
>>>> -	__gart_iommu_detach_dev(domain, dev);
>>>> -	spin_unlock(&gart->client_lock);
>>>> +	if (dev->archdata.iommu == domain) {
>>>> +		dev->archdata.iommu = NULL;
>>>> +
>>>> +		if (--gart->active_devices == 0)
>>>> +			gart->active_domain = NULL;
>>>> +	}
>>>> +
>>>> +	spin_unlock(&gart->dom_lock);
>>>>  }
>>>>  
>>>>  static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
>>>>  {
>>>> -	struct gart_domain *gart_domain;
>>>> -	struct gart_device *gart;
>>>> +	struct gart_device *gart = gart_handle;
>>>> +	struct iommu_domain *domain;
>>>>  
>>>>  	if (type != IOMMU_DOMAIN_UNMANAGED)
>>>>  		return NULL;
>>>>  
>>>> -	gart = gart_handle;
>>>> -	if (!gart)
>>>> -		return NULL;
>>>> -
>>>> -	gart_domain = kzalloc(sizeof(*gart_domain), GFP_KERNEL);
>>>> -	if (!gart_domain)
>>>> -		return NULL;
>>>> -
>>>> -	gart_domain->domain.geometry.aperture_start = gart->iovmm_base;
>>>> -	gart_domain->domain.geometry.aperture_end = gart->iovmm_base +
>>>> +	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
>>>> +	if (domain) {
>>>> +		domain->geometry.aperture_start = gart->iovmm_base;
>>>> +		domain->geometry.aperture_end = gart->iovmm_base +
>>>>  					gart->page_count * GART_PAGE_SIZE - 1;
>>>> -	gart_domain->domain.geometry.force_aperture = true;
>>>> +		domain->geometry.force_aperture = true;
>>>> +	}
>>>>  
>>>> -	return &gart_domain->domain;
>>>> +	return domain;
>>>>  }
>>>>  
>>>>  static void gart_iommu_domain_free(struct iommu_domain *domain)
>>>>  {
>>>> -	struct gart_domain *gart_domain = to_gart_domain(domain);
>>>> -	struct gart_device *gart = gart_domain->gart;
>>>> -
>>>> -	if (gart) {
>>>> -		spin_lock(&gart->client_lock);
>>>> -		if (!list_empty(&gart->client)) {
>>>> -			struct gart_client *c, *tmp;
>>>> -
>>>> -			list_for_each_entry_safe(c, tmp, &gart->client, list)
>>>> -				__gart_iommu_detach_dev(domain, c->dev);
>>>> -		}
>>>> -		spin_unlock(&gart->client_lock);
>>>> -	}
>>>> -
>>>> -	kfree(gart_domain);
>>>> +	kfree(domain);
>>>>  }
>>>
>>> Doesn't this now make it possible to free a potentially active domain?
>>
>> Yes, don't do it. I can add a WARN_ON() here, though I think the IOMMU
>> core should be the one taking care of that.
> 
> Yeah, might be good to have the WARN_ON() either here or in the IOMMU
> core. Force-detaching is probably a good idea, too, otherwise the users
> of the freed domain are just going to crash anyway, right? Maybe
> something to discuss more generally with Joerg.
> 
> I think in the meantime just having the WARN_ON() here is probably good
> enough. It should point out the cases where we do free the domain with
> devices still attached, which hopefully don't exist, and we can fix
> them.

Ok.

>>>>  static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
>>>>  			  phys_addr_t pa, size_t bytes, int prot)
>>>>  {
>>>> -	struct gart_domain *gart_domain = to_gart_domain(domain);
>>>> -	struct gart_device *gart = gart_domain->gart;
>>>> +	struct gart_device *gart = gart_handle;
>>>
>>> Hmm... this now introduces more uses of the gart_handle that I hoped we
>>> could get rid of. I think we could still keep around struct gart_domain
>>> and just make sure it is unique. The small amounts of casting here seem
>>> mostly harmless to me, especially since they will be nops, so we end up
>>> with just one dereference to get at the struct gart_device. I think the
>>> benefits of not having this global variable around are worth the one
>>> dereference here.
>>
>> What are the benefits? I don't see anything other than a pedantic oddity.
>>
>> I've removed gart_domain in the end because it is extra code (and
>> consumes resources) without any benefit. Let's keep that part as it is
>> now. I'll be happy to change that code if you explain why it is worth
>> it.
> 
> I thought I did explain. Anyway, it's always been like this, so no need
> to change it as part of this series.

Thanks.

You're saying that you want to get rid of the global variable, but
you're not saying why. Using a global variable is more appealing with
the current driver structure.
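
For reference, the alternative discussed above (keeping struct gart_domain and reaching the GART through it instead of through gart_handle) relies on container_of(). A userspace sketch with placeholder struct contents follows. The macro is a simplified equivalent of the kernel one, and domain_to_gart() is a hypothetical helper name.

```c
#include <stddef.h>

/* Simplified userspace equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Placeholder contents; the real kernel structures are much larger. */
struct iommu_domain {
	unsigned int type;
};

struct gart_device {
	unsigned int page_count;
};

struct gart_domain {
	struct iommu_domain domain;	/* first member, so offset is 0 */
	struct gart_device *gart;	/* link to gart device */
};

static struct gart_domain *to_gart_domain(struct iommu_domain *dom)
{
	return container_of(dom, struct gart_domain, domain);
}

/* One dereference replaces the read of a global variable. */
static struct gart_device *domain_to_gart(struct iommu_domain *dom)
{
	return to_gart_domain(dom)->gart;
}
```

Since the generic domain is the first member, the pointer adjustment is zero and the conversion costs nothing at run time. What remains is a single load of the gart pointer, which is the "one dereference" mentioned in the review.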


* Re: [PATCH v4 11/20] memory: tegra: Use of_device_get_match_data()
  2018-09-25 10:00       ` Thierry Reding
@ 2018-09-25 13:53         ` Dmitry Osipenko
  0 siblings, 0 replies; 59+ messages in thread
From: Dmitry Osipenko @ 2018-09-25 13:53 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Jonathan Hunter, Joerg Roedel, Rob Herring, Robin Murphy, iommu,
	devicetree, linux-tegra, linux-kernel

On 9/25/18 1:00 PM, Thierry Reding wrote:
> On Mon, Sep 24, 2018 at 09:39:43PM +0300, Dmitry Osipenko wrote:
>> On 9/24/18 1:13 PM, Thierry Reding wrote:
>>> On Mon, Sep 24, 2018 at 03:41:44AM +0300, Dmitry Osipenko wrote:
>>>> There is no need to match the device with the DT node since it was
>>>> already matched; use the of_device_get_match_data() helper to get the
>>>> match data.
>>>>
>>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>>>> ---
>>>>  drivers/memory/tegra/mc.c | 10 ++--------
>>>>  1 file changed, 2 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
>>>> index 5454ffe5b2e0..cdc33f93cf7c 100644
>>>> --- a/drivers/memory/tegra/mc.c
>>>> +++ b/drivers/memory/tegra/mc.c
>>>> @@ -11,8 +11,7 @@
>>>>  #include <linux/interrupt.h>
>>>>  #include <linux/kernel.h>
>>>>  #include <linux/module.h>
>>>> -#include <linux/of.h>
>>>> -#include <linux/platform_device.h>
>>>
>>> It's better not to remove these two because the code still uses
>>> functions declared in them. If ever we were going to remove code using
>>> linux/of_device.h and then remove the linux/of_device.h include, we'd
>>> break the build and have to reintroduce the includes.
>>
>> That doesn't sound like a good argument. You're way too picky here ;)
>>
>>> The same would happen if linux/of_device.h were ever to stop including
>>> linux/platform_device.h or linux/of.h. That may sound unlikely, but it
>>> has happened in the past with other includes. It can also happen that
>>> some restructuring takes place in some headers that is not so obvious
>>> and then things can still start falling apart miles away.
>>
>> Restructuring will be somebody else's problem. I'm not sure that we
>> really should care about it, I think it is unnecessary. But since you're
>> insisting...
> 
> It's actually a very common argument and I've seen patches in the past
> that add includes just for the purpose of making sure the right
> definitions get pulled in. This happens quite frequently as a preamble
> to some major rework of some header files that would otherwise cause a
> lot of breakage.
> 
> So I think it's best to be proactive about this and make sure we
> explicitly pull in all the necessary headers in the first place,
> irrespective of whether or not they may already get pulled in indirectly
> by some other headers.

Ok
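
The idea behind the helper named in the commit message is that the match table already carries a per-SoC data pointer, so the driver can ask for that pointer directly instead of repeating the match. A userspace model of such a lookup follows. Every name in it (soc_data, match_entry, get_match_data and the "vendor,..." strings) is hypothetical and does not mirror the kernel implementation.

```c
#include <stddef.h>	/* NULL */
#include <string.h>	/* strcmp() */

/* Hypothetical per-SoC configuration. */
struct soc_data {
	unsigned int num_clients;
};

/* Hypothetical match-table entry: compatible string plus driver data. */
struct match_entry {
	const char *compatible;
	const void *data;
};

static const struct soc_data soc_a_data = { .num_clients = 4 };
static const struct soc_data soc_b_data = { .num_clients = 8 };

static const struct match_entry match_table[] = {
	{ "vendor,soc-a-mc", &soc_a_data },
	{ "vendor,soc-b-mc", &soc_b_data },
	{ NULL, NULL },	/* sentinel terminates the table */
};

/* Return the data of the entry that matches, or NULL if none does. */
static const void *get_match_data(const struct match_entry *table,
				  const char *compatible)
{
	for (; table->compatible; table++) {
		if (strcmp(table->compatible, compatible) == 0)
			return table->data;
	}

	return NULL;
}
```

The sketch also follows the include rule argued for above: NULL and strcmp() each get their own header, even though <string.h> alone would happen to provide both.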


* Re: [PATCH v4 06/20] dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc
  2018-09-24  9:55   ` Thierry Reding
@ 2018-09-27 18:41     ` Rob Herring
  0 siblings, 0 replies; 59+ messages in thread
From: Rob Herring @ 2018-09-27 18:41 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Dmitry Osipenko, Jonathan Hunter, Joerg Roedel, Robin Murphy,
	iommu, devicetree, linux-tegra, linux-kernel

On Mon, Sep 24, 2018 at 11:55:30AM +0200, Thierry Reding wrote:
> On Mon, Sep 24, 2018 at 03:41:39AM +0300, Dmitry Osipenko wrote:
> > Splitting the GART and the Memory Controller into separate bindings was
> > not a good decision. Given that the GART driver has never been used by
> > anything in the kernel, we decided that it is better to correct the
> > mistakes of the past and merge the two bindings into a single one. As a
> > result there is a DT ABI change for the Memory Controller that breaks
> > neither newer kernels using an older DT nor older kernels using a newer
> > DT. This is done by changing the 'compatible' of the node to
> > 'tegra20-mc-gart' and adding a newly required clock property. The new
> > clock property also brings the tegra20-mc binding in line with the
> > bindings of the later Tegra generations.
> > 
> > Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> > ---
> >  .../bindings/iommu/nvidia,tegra20-gart.txt    | 14 ----------
> >  .../memory-controllers/nvidia,tegra20-mc.txt  | 27 +++++++++++++------
> >  2 files changed, 19 insertions(+), 22 deletions(-)
> >  delete mode 100644 Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
> > 
> > diff --git a/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt b/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
> > deleted file mode 100644
> > index 099d9362ebc1..000000000000
> > --- a/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
> > +++ /dev/null
> > @@ -1,14 +0,0 @@
> > -NVIDIA Tegra 20 GART
> > -
> > -Required properties:
> > -- compatible: "nvidia,tegra20-gart"
> > -- reg: Two pairs of cells specifying the physical address and size of
> > -  the memory controller registers and the GART aperture respectively.
> > -
> > -Example:
> > -
> > -	gart {
> > -		compatible = "nvidia,tegra20-gart";
> > -		reg = <0x7000f024 0x00000018	/* controller registers */
> > -		       0x58000000 0x02000000>;	/* GART aperture */
> > -	};
> > diff --git a/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
> > index 7d60a50a4fa1..e55328237df4 100644
> > --- a/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
> > +++ b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
> > @@ -1,26 +1,37 @@
> >  NVIDIA Tegra20 MC(Memory Controller)
> >  
> >  Required properties:
> > -- compatible : "nvidia,tegra20-mc"
> > -- reg : Should contain 2 register ranges(address and length); see the
> > -  example below. Note that the MC registers are interleaved with the
> > -  GART registers, and hence must be represented as multiple ranges.
> > +- compatible : "nvidia,tegra20-mc-gart"
> > +- reg : Should contain 2 register ranges: physical base address and length of
> > +  the controller's registers and the GART aperture respectively.
> 
> Couldn't we have achieved the same thing by adding a reg-names property
> instead of using a different compatible string? After all we only change
> what information the DT provides, but the device is still a "tegra20-mc"
> device.

Yes, if we were adding a reg field, but we're changing what the 2 reg 
fields contain, so I think a new compatible is best.

Rob
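
The reasoning can be illustrated with a small model: the two reg entries keep their count but change their meaning, so the compatible string is the only thing that tells a driver which interpretation applies. In the sketch below the two compatible strings are taken from the binding diff, while the enum and both functions are hypothetical and greatly simplify how real drivers match nodes.

```c
#include <string.h>

/* How a driver interprets the two "reg" entries of the MC node. */
enum reg_layout {
	REG_LAYOUT_NONE,		/* unknown node: do not bind */
	REG_LAYOUT_OLD_MC,		/* two interleaved MC register ranges */
	REG_LAYOUT_MC_AND_APERTURE,	/* MC registers, then GART aperture */
};

/* Hypothetical model of an older kernel: it knows only the old string. */
static enum reg_layout old_kernel_layout(const char *compatible)
{
	if (strcmp(compatible, "nvidia,tegra20-mc") == 0)
		return REG_LAYOUT_OLD_MC;

	return REG_LAYOUT_NONE;
}

/* Hypothetical model of the updated driver: it knows only the new string. */
static enum reg_layout new_kernel_layout(const char *compatible)
{
	if (strcmp(compatible, "nvidia,tegra20-mc-gart") == 0)
		return REG_LAYOUT_MC_AND_APERTURE;

	return REG_LAYOUT_NONE;
}
```

In this model an old kernel given a new DT simply does not bind, instead of treating the GART aperture as a second register range. A reg-names property would not give that protection, because a kernel that predates it would ignore it.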


* Re: [PATCH v4 06/20] dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc
  2018-09-24  0:41 ` [PATCH v4 06/20] dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc Dmitry Osipenko
  2018-09-24  9:55   ` Thierry Reding
@ 2018-09-27 18:41   ` Rob Herring
  1 sibling, 0 replies; 59+ messages in thread
From: Rob Herring @ 2018-09-27 18:41 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Thierry Reding, Jonathan Hunter, Joerg Roedel, Robin Murphy,
	iommu, devicetree, linux-tegra, linux-kernel

On Mon, 24 Sep 2018 03:41:39 +0300, Dmitry Osipenko wrote:
> Splitting the GART and the Memory Controller into separate bindings was
> not a good decision. Given that the GART driver has never been used by
> anything in the kernel, we decided that it is better to correct the
> mistakes of the past and merge the two bindings into a single one. As a
> result there is a DT ABI change for the Memory Controller that breaks
> neither newer kernels using an older DT nor older kernels using a newer
> DT. This is done by changing the 'compatible' of the node to
> 'tegra20-mc-gart' and adding a newly required clock property. The new
> clock property also brings the tegra20-mc binding in line with the
> bindings of the later Tegra generations.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  .../bindings/iommu/nvidia,tegra20-gart.txt    | 14 ----------
>  .../memory-controllers/nvidia,tegra20-mc.txt  | 27 +++++++++++++------
>  2 files changed, 19 insertions(+), 22 deletions(-)
>  delete mode 100644 Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
> 

Reviewed-by: Rob Herring <robh@kernel.org>


end of thread, other threads:[~2018-09-27 18:41 UTC | newest]

Thread overview: 59+ messages
2018-09-24  0:41 [PATCH v4 00/20] IOMMU: Tegra GART driver clean up and optimization Dmitry Osipenko
2018-09-24  0:41 ` [PATCH v4 01/20] iommu/tegra: gart: Remove pr_fmt and clean up includes Dmitry Osipenko
2018-09-24 10:02   ` Thierry Reding
2018-09-24  0:41 ` [PATCH v4 02/20] iommu/tegra: gart: Clean up driver probe errors handling Dmitry Osipenko
2018-09-24 10:02   ` Thierry Reding
2018-09-24  0:41 ` [PATCH v4 03/20] iommu/tegra: gart: Ignore devices without IOMMU phandle in DT Dmitry Osipenko
2018-09-24 10:05   ` Thierry Reding
2018-09-24 18:41     ` Dmitry Osipenko
2018-09-24  0:41 ` [PATCH v4 04/20] iommu: Introduce iotlb_sync_map callback Dmitry Osipenko
2018-09-24 10:06   ` Thierry Reding
2018-09-24  0:41 ` [PATCH v4 05/20] iommu/tegra: gart: Optimize mapping / unmapping performance Dmitry Osipenko
2018-09-24 10:07   ` Thierry Reding
2018-09-24  0:41 ` [PATCH v4 06/20] dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc Dmitry Osipenko
2018-09-24  9:55   ` Thierry Reding
2018-09-27 18:41     ` Rob Herring
2018-09-27 18:41   ` Rob Herring
2018-09-24  0:41 ` [PATCH v4 07/20] ARM: dts: tegra20: Update Memory Controller node to the new binding Dmitry Osipenko
2018-09-24  0:41 ` [PATCH v4 08/20] memory: tegra: Don't invoke Tegra30+ specific memory timing setup on Tegra20 Dmitry Osipenko
2018-09-24  0:41 ` [PATCH v4 09/20] memory: tegra: Adapt to Tegra20 device-tree binding changes Dmitry Osipenko
2018-09-24 10:02   ` Thierry Reding
2018-09-24 13:22     ` Dmitry Osipenko
2018-09-25 12:16       ` Dmitry Osipenko
2018-09-24  0:41 ` [PATCH v4 10/20] memory: tegra: Read client ID on GART page fault Dmitry Osipenko
2018-09-24  0:41 ` [PATCH v4 11/20] memory: tegra: Use of_device_get_match_data() Dmitry Osipenko
2018-09-24 10:13   ` Thierry Reding
2018-09-24 18:39     ` Dmitry Osipenko
2018-09-25 10:00       ` Thierry Reding
2018-09-25 13:53         ` Dmitry Osipenko
2018-09-24  0:41 ` [PATCH v4 12/20] iommu/tegra: gart: Integrate with Memory Controller driver Dmitry Osipenko
2018-09-24 10:23   ` Thierry Reding
2018-09-24 18:22     ` Dmitry Osipenko
2018-09-25 10:02       ` Thierry Reding
2018-09-24  0:41 ` [PATCH v4 13/20] iommu/tegra: gart: Fix spinlock recursion Dmitry Osipenko
2018-09-24 10:49   ` Thierry Reding
2018-09-24  0:41 ` [PATCH v4 14/20] iommu/tegra: gart: Fix NULL pointer dereference Dmitry Osipenko
2018-09-24 10:49   ` Thierry Reding
2018-09-24  0:41 ` [PATCH v4 15/20] iommu/tegra: gart: Allow only one active domain at a time Dmitry Osipenko
2018-09-24 10:50   ` Thierry Reding
2018-09-24  0:41 ` [PATCH v4 16/20] iommu/tegra: gart: Don't use managed resources Dmitry Osipenko
2018-09-24 10:52   ` Thierry Reding
2018-09-24 18:57     ` Dmitry Osipenko
2018-09-25 10:03       ` Thierry Reding
2018-09-25 13:41         ` Dmitry Osipenko
2018-09-24  0:41 ` [PATCH v4 17/20] iommu/tegra: gart: Prepend error/debug messages with "GART:" Dmitry Osipenko
2018-09-24 10:57   ` Thierry Reding
2018-09-24 18:09     ` Dmitry Osipenko
2018-09-24  0:41 ` [PATCH v4 18/20] iommu/tegra: gart: Don't detach devices from inactive domains Dmitry Osipenko
2018-09-24 11:00   ` Thierry Reding
2018-09-24 18:05     ` Dmitry Osipenko
2018-09-25 10:04       ` Thierry Reding
2018-09-25 13:41         ` Dmitry Osipenko
2018-09-24  0:41 ` [PATCH v4 19/20] iommu/tegra: gart: Simplify clients-tracking code Dmitry Osipenko
2018-09-24 11:10   ` Thierry Reding
2018-09-24 17:50     ` Dmitry Osipenko
2018-09-25 10:09       ` Thierry Reding
2018-09-25 13:47         ` Dmitry Osipenko
2018-09-24  0:41 ` [PATCH v4 20/20] iommu/tegra: gart: Perform code refactoring Dmitry Osipenko
2018-09-24 11:34   ` Thierry Reding
2018-09-24 17:11     ` Dmitry Osipenko
