All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6] iommu/arm-smmu: misc features, new DT bindings
@ 2014-08-13  0:51 ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13  0:51 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Will Deacon,
	devicetree-u79uwXL29TY76Z2rM5mHXA

This series contains a some enhancements to the ARM SMMU driver. These
are mostly distinct but I'm sending them out in a single series since
they depend on each other for clean application.

The first two patches are related to power-saving features (clocks and
regulators).

The third adds support for doing iova_to_phys through the SMMU
hardware on platforms that support it.

The fourth implements the recently merged generic DT bindings. For
reference, the discussion around the generic DT bindings is here:

    http://lists.linuxfoundation.org/pipermail/iommu/2014-July/009357.html

The fifth and sixth handle some implementation-specific issues,
providing knobs in the device tree and a new domain attribute.

This series is based on on Will's iommu/pci branch.



Mitchel Humpherys (6):
  iommu/arm-smmu: add support for specifying clocks
  iommu/arm-smmu: add support for specifying regulators
  iommu/arm-smmu: add support for iova_to_phys through ATS1PR
  iommu/arm-smmu: implement generic DT bindings
  iommu/arm-smmu: support buggy implementations with invalidate-on-map
  iommu/arm-smmu: add .domain_{set,get}_attr for coherent walk control

 .../devicetree/bindings/iommu/arm,smmu.txt         |  18 +
 drivers/iommu/arm-smmu.c                           | 444 ++++++++++++++++++---
 include/linux/iommu.h                              |   1 +
 3 files changed, 406 insertions(+), 57 deletions(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 0/6] iommu/arm-smmu: misc features, new DT bindings
@ 2014-08-13  0:51 ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13  0:51 UTC (permalink / raw)
  To: linux-arm-kernel

This series contains a some enhancements to the ARM SMMU driver. These
are mostly distinct but I'm sending them out in a single series since
they depend on each other for clean application.

The first two patches are related to power-saving features (clocks and
regulators).

The third adds support for doing iova_to_phys through the SMMU
hardware on platforms that support it.

The fourth implements the recently merged generic DT bindings. For
reference, the discussion around the generic DT bindings is here:

    http://lists.linuxfoundation.org/pipermail/iommu/2014-July/009357.html

The fifth and sixth handle some implementation-specific issues,
providing knobs in the device tree and a new domain attribute.

This series is based on on Will's iommu/pci branch.



Mitchel Humpherys (6):
  iommu/arm-smmu: add support for specifying clocks
  iommu/arm-smmu: add support for specifying regulators
  iommu/arm-smmu: add support for iova_to_phys through ATS1PR
  iommu/arm-smmu: implement generic DT bindings
  iommu/arm-smmu: support buggy implementations with invalidate-on-map
  iommu/arm-smmu: add .domain_{set,get}_attr for coherent walk control

 .../devicetree/bindings/iommu/arm,smmu.txt         |  18 +
 drivers/iommu/arm-smmu.c                           | 444 ++++++++++++++++++---
 include/linux/iommu.h                              |   1 +
 3 files changed, 406 insertions(+), 57 deletions(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
  2014-08-13  0:51 ` Mitchel Humpherys
@ 2014-08-13  0:51     ` Mitchel Humpherys
  -1 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13  0:51 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Will Deacon,
	devicetree-u79uwXL29TY76Z2rM5mHXA

On some platforms with tight power constraints it is polite to only
leave your clocks on for as long as you absolutely need them. Currently
we assume that all clocks necessary for SMMU register access are always
on.

Add some optional device tree properties to specify any clocks that are
necessary for SMMU register access and turn them on and off as needed.

If no clocks are specified in the device tree things continue to work
the way they always have: we assume all necessary clocks are always
turned on.

Signed-off-by: Mitchel Humpherys <mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
---
 .../devicetree/bindings/iommu/arm,smmu.txt         |  11 ++
 drivers/iommu/arm-smmu.c                           | 127 +++++++++++++++++++--
 2 files changed, 129 insertions(+), 9 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index 2d0f7cd867..ceae3fe207 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -48,6 +48,17 @@ conditions.
                   aliases of secure registers have to be used during
                   SMMU configuration.
 
+- clocks        : List of clocks to be used during SMMU register access. See
+                  Documentation/devicetree/bindings/clock/clock-bindings.txt
+                  for information about the format. For each clock specified
+                  here, there must be a corresponding entery in clock-names
+                  (see below).
+
+- clock-names   : List of clock names corresponding to the clocks specified in
+                  the "clocks" property (above). See
+                  Documentation/devicetree/bindings/clock/clock-bindings.txt
+                  for more info.
+
 Example:
 
         smmu {
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 9fd8754db0..e123d75db3 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -399,6 +399,9 @@ struct arm_smmu_device {
 
 	struct list_head		list;
 	struct rb_root			masters;
+
+	int num_clocks;
+	struct clk **clocks;
 };
 
 struct arm_smmu_cfg {
@@ -589,6 +592,31 @@ static void __arm_smmu_free_bitmap(unsigned long *map, int idx)
 	clear_bit(idx, map);
 }
 
+static int arm_smmu_enable_clocks(struct arm_smmu_device *smmu)
+{
+	int i, ret = 0;
+
+	for (i = 0; i < smmu->num_clocks; ++i) {
+		ret = clk_prepare_enable(smmu->clocks[i]);
+		if (ret) {
+			dev_err(smmu->dev, "Couldn't enable clock #%d\n", i);
+			while (i--)
+				clk_disable_unprepare(smmu->clocks[i]);
+			break;
+		}
+	}
+
+	return ret;
+}
+
+static void arm_smmu_disable_clocks(struct arm_smmu_device *smmu)
+{
+	int i;
+
+	for (i = 0; i < smmu->num_clocks; ++i)
+		clk_disable_unprepare(smmu->clocks[i]);
+}
+
 /* Wait for any pending TLB invalidations to complete */
 static void arm_smmu_tlb_sync(struct arm_smmu_device *smmu)
 {
@@ -644,11 +672,15 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	void __iomem *cb_base;
 
+	arm_smmu_enable_clocks(smmu);
+
 	cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
 	fsr = readl_relaxed(cb_base + ARM_SMMU_CB_FSR);
 
-	if (!(fsr & FSR_FAULT))
+	if (!(fsr & FSR_FAULT)) {
+		arm_smmu_disable_clocks(smmu);
 		return IRQ_NONE;
+	}
 
 	if (fsr & FSR_IGN)
 		dev_err_ratelimited(smmu->dev,
@@ -683,6 +715,8 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
 	if (fsr & FSR_SS)
 		writel_relaxed(resume, cb_base + ARM_SMMU_CB_RESUME);
 
+	arm_smmu_disable_clocks(smmu);
+
 	return ret;
 }
 
@@ -692,13 +726,17 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
 	struct arm_smmu_device *smmu = dev;
 	void __iomem *gr0_base = ARM_SMMU_GR0_NS(smmu);
 
+	arm_smmu_enable_clocks(smmu);
+
 	gfsr = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSR);
 	gfsynr0 = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSYNR0);
 	gfsynr1 = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSYNR1);
 	gfsynr2 = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSYNR2);
 
-	if (!gfsr)
+	if (!gfsr) {
+		arm_smmu_disable_clocks(smmu);
 		return IRQ_NONE;
+	}
 
 	dev_err_ratelimited(smmu->dev,
 		"Unexpected global fault, this could be serious\n");
@@ -707,6 +745,7 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
 		gfsr, gfsynr0, gfsynr1, gfsynr2);
 
 	writel(gfsr, gr0_base + ARM_SMMU_GR0_sGFSR);
+	arm_smmu_disable_clocks(smmu);
 	return IRQ_HANDLED;
 }
 
@@ -991,10 +1030,12 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
 	if (!smmu)
 		return;
 
+	arm_smmu_enable_clocks(smmu_domain->smmu);
 	/* Disable the context bank and nuke the TLB before freeing it. */
 	cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
 	writel_relaxed(0, cb_base + ARM_SMMU_CB_SCTLR);
 	arm_smmu_tlb_inv_context(smmu_domain);
+	arm_smmu_disable_clocks(smmu_domain->smmu);
 
 	if (cfg->irptndx != INVALID_IRPTNDX) {
 		irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
@@ -1222,6 +1263,7 @@ static void arm_smmu_domain_remove_master(struct arm_smmu_domain *smmu_domain,
 	 * We *must* clear the S2CR first, because freeing the SMR means
 	 * that it can be re-allocated immediately.
 	 */
+	arm_smmu_enable_clocks(smmu);
 	for (i = 0; i < cfg->num_streamids; ++i) {
 		u32 idx = cfg->smrs ? cfg->smrs[i].idx : cfg->streamids[i];
 
@@ -1230,6 +1272,7 @@ static void arm_smmu_domain_remove_master(struct arm_smmu_domain *smmu_domain,
 	}
 
 	arm_smmu_master_free_smrs(smmu, cfg);
+	arm_smmu_disable_clocks(smmu);
 }
 
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
@@ -1250,6 +1293,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		return -EEXIST;
 	}
 
+	arm_smmu_enable_clocks(smmu);
+
 	/*
 	 * Sanity check the domain. We don't support domains across
 	 * different SMMUs.
@@ -1259,7 +1304,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		/* Now that we have a master, we can finalise the domain */
 		ret = arm_smmu_init_domain_context(domain, smmu);
 		if (IS_ERR_VALUE(ret))
-			return ret;
+			goto disable_clocks;
 
 		dom_smmu = smmu_domain->smmu;
 	}
@@ -1268,17 +1313,22 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		dev_err(dev,
 			"cannot attach to SMMU %s whilst already attached to domain on SMMU %s\n",
 			dev_name(smmu_domain->smmu->dev), dev_name(smmu->dev));
-		return -EINVAL;
+		ret = -EINVAL;
+		goto disable_clocks;
 	}
 
 	/* Looks ok, so add the device to the domain */
 	cfg = find_smmu_master_cfg(dev);
-	if (!cfg)
-		return -ENODEV;
+	if (!cfg) {
+		ret = -ENODEV;
+		goto disable_clocks;
+	}
 
 	ret = arm_smmu_domain_add_master(smmu_domain, cfg);
 	if (!ret)
 		dev->archdata.iommu = domain;
+disable_clocks:
+	arm_smmu_disable_clocks(smmu);
 	return ret;
 }
 
@@ -1544,7 +1594,9 @@ static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 	struct arm_smmu_domain *smmu_domain = domain->priv;
 
 	ret = arm_smmu_handle_mapping(smmu_domain, iova, 0, size, 0);
+	arm_smmu_enable_clocks(smmu_domain->smmu);
 	arm_smmu_tlb_inv_context(smmu_domain);
+	arm_smmu_disable_clocks(smmu_domain->smmu);
 	return ret ? 0 : size;
 }
 
@@ -1792,7 +1844,52 @@ static int arm_smmu_id_size_to_bits(int size)
 	}
 }
 
-static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
+static int arm_smmu_init_clocks(struct arm_smmu_device *smmu)
+{
+	const char *cname;
+	struct property *prop;
+	int i;
+	struct device *dev = smmu->dev;
+
+	smmu->num_clocks = of_property_count_strings(dev->of_node,
+						"clock-names");
+
+	if (!smmu->num_clocks)
+		return 0;
+
+	smmu->clocks = devm_kzalloc(
+		dev, sizeof(*smmu->clocks) * smmu->num_clocks,
+		GFP_KERNEL);
+
+	if (!smmu->clocks) {
+		dev_err(dev,
+			"Failed to allocate memory for clocks\n");
+		return -ENODEV;
+	}
+
+	i = 0;
+	of_property_for_each_string(dev->of_node, "clock-names",
+				prop, cname) {
+		struct clk *c = devm_clk_get(dev, cname);
+		if (IS_ERR(c)) {
+			dev_err(dev, "Couldn't get clock: %s",
+				cname);
+			return -ENODEV;
+		}
+
+		if (clk_get_rate(c) == 0) {
+			long rate = clk_round_rate(c, 1000);
+			clk_set_rate(c, rate);
+		}
+
+		smmu->clocks[i] = c;
+
+		++i;
+	}
+	return 0;
+}
+
+int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 {
 	unsigned long size;
 	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
@@ -2027,10 +2124,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 	}
 	dev_notice(dev, "registered %d master devices\n", i);
 
-	err = arm_smmu_device_cfg_probe(smmu);
+	err = arm_smmu_init_clocks(smmu);
 	if (err)
 		goto out_put_masters;
 
+	arm_smmu_enable_clocks(smmu);
+
+	err = arm_smmu_device_cfg_probe(smmu);
+	if (err)
+		goto out_disable_clocks;
+
 	parse_driver_options(smmu);
 
 	if (smmu->version > 1 &&
@@ -2039,7 +2142,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 			"found only %d context interrupt(s) but %d required\n",
 			smmu->num_context_irqs, smmu->num_context_banks);
 		err = -ENODEV;
-		goto out_put_masters;
+		goto out_disable_clocks;
 	}
 
 	for (i = 0; i < smmu->num_global_irqs; ++i) {
@@ -2061,12 +2164,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 	spin_unlock(&arm_smmu_devices_lock);
 
 	arm_smmu_device_reset(smmu);
+	arm_smmu_disable_clocks(smmu);
 	return 0;
 
 out_free_irqs:
 	while (i--)
 		free_irq(smmu->irqs[i], smmu);
 
+out_disable_clocks:
+	arm_smmu_disable_clocks(smmu);
+
 out_put_masters:
 	for (node = rb_first(&smmu->masters); node; node = rb_next(node)) {
 		struct arm_smmu_master *master
@@ -2110,7 +2217,9 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
 		free_irq(smmu->irqs[i], smmu);
 
 	/* Turn the thing off */
+	arm_smmu_enable_clocks(smmu);
 	writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
+	arm_smmu_disable_clocks(smmu);
 	return 0;
 }
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
@ 2014-08-13  0:51     ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13  0:51 UTC (permalink / raw)
  To: linux-arm-kernel

On some platforms with tight power constraints it is polite to only
leave your clocks on for as long as you absolutely need them. Currently
we assume that all clocks necessary for SMMU register access are always
on.

Add some optional device tree properties to specify any clocks that are
necessary for SMMU register access and turn them on and off as needed.

If no clocks are specified in the device tree things continue to work
the way they always have: we assume all necessary clocks are always
turned on.

Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
---
 .../devicetree/bindings/iommu/arm,smmu.txt         |  11 ++
 drivers/iommu/arm-smmu.c                           | 127 +++++++++++++++++++--
 2 files changed, 129 insertions(+), 9 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index 2d0f7cd867..ceae3fe207 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -48,6 +48,17 @@ conditions.
                   aliases of secure registers have to be used during
                   SMMU configuration.
 
+- clocks        : List of clocks to be used during SMMU register access. See
+                  Documentation/devicetree/bindings/clock/clock-bindings.txt
+                  for information about the format. For each clock specified
+                  here, there must be a corresponding entery in clock-names
+                  (see below).
+
+- clock-names   : List of clock names corresponding to the clocks specified in
+                  the "clocks" property (above). See
+                  Documentation/devicetree/bindings/clock/clock-bindings.txt
+                  for more info.
+
 Example:
 
         smmu {
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 9fd8754db0..e123d75db3 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -399,6 +399,9 @@ struct arm_smmu_device {
 
 	struct list_head		list;
 	struct rb_root			masters;
+
+	int num_clocks;
+	struct clk **clocks;
 };
 
 struct arm_smmu_cfg {
@@ -589,6 +592,31 @@ static void __arm_smmu_free_bitmap(unsigned long *map, int idx)
 	clear_bit(idx, map);
 }
 
+static int arm_smmu_enable_clocks(struct arm_smmu_device *smmu)
+{
+	int i, ret = 0;
+
+	for (i = 0; i < smmu->num_clocks; ++i) {
+		ret = clk_prepare_enable(smmu->clocks[i]);
+		if (ret) {
+			dev_err(smmu->dev, "Couldn't enable clock #%d\n", i);
+			while (i--)
+				clk_disable_unprepare(smmu->clocks[i]);
+			break;
+		}
+	}
+
+	return ret;
+}
+
+static void arm_smmu_disable_clocks(struct arm_smmu_device *smmu)
+{
+	int i;
+
+	for (i = 0; i < smmu->num_clocks; ++i)
+		clk_disable_unprepare(smmu->clocks[i]);
+}
+
 /* Wait for any pending TLB invalidations to complete */
 static void arm_smmu_tlb_sync(struct arm_smmu_device *smmu)
 {
@@ -644,11 +672,15 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	void __iomem *cb_base;
 
+	arm_smmu_enable_clocks(smmu);
+
 	cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
 	fsr = readl_relaxed(cb_base + ARM_SMMU_CB_FSR);
 
-	if (!(fsr & FSR_FAULT))
+	if (!(fsr & FSR_FAULT)) {
+		arm_smmu_disable_clocks(smmu);
 		return IRQ_NONE;
+	}
 
 	if (fsr & FSR_IGN)
 		dev_err_ratelimited(smmu->dev,
@@ -683,6 +715,8 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
 	if (fsr & FSR_SS)
 		writel_relaxed(resume, cb_base + ARM_SMMU_CB_RESUME);
 
+	arm_smmu_disable_clocks(smmu);
+
 	return ret;
 }
 
@@ -692,13 +726,17 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
 	struct arm_smmu_device *smmu = dev;
 	void __iomem *gr0_base = ARM_SMMU_GR0_NS(smmu);
 
+	arm_smmu_enable_clocks(smmu);
+
 	gfsr = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSR);
 	gfsynr0 = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSYNR0);
 	gfsynr1 = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSYNR1);
 	gfsynr2 = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSYNR2);
 
-	if (!gfsr)
+	if (!gfsr) {
+		arm_smmu_disable_clocks(smmu);
 		return IRQ_NONE;
+	}
 
 	dev_err_ratelimited(smmu->dev,
 		"Unexpected global fault, this could be serious\n");
@@ -707,6 +745,7 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
 		gfsr, gfsynr0, gfsynr1, gfsynr2);
 
 	writel(gfsr, gr0_base + ARM_SMMU_GR0_sGFSR);
+	arm_smmu_disable_clocks(smmu);
 	return IRQ_HANDLED;
 }
 
@@ -991,10 +1030,12 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
 	if (!smmu)
 		return;
 
+	arm_smmu_enable_clocks(smmu_domain->smmu);
 	/* Disable the context bank and nuke the TLB before freeing it. */
 	cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
 	writel_relaxed(0, cb_base + ARM_SMMU_CB_SCTLR);
 	arm_smmu_tlb_inv_context(smmu_domain);
+	arm_smmu_disable_clocks(smmu_domain->smmu);
 
 	if (cfg->irptndx != INVALID_IRPTNDX) {
 		irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
@@ -1222,6 +1263,7 @@ static void arm_smmu_domain_remove_master(struct arm_smmu_domain *smmu_domain,
 	 * We *must* clear the S2CR first, because freeing the SMR means
 	 * that it can be re-allocated immediately.
 	 */
+	arm_smmu_enable_clocks(smmu);
 	for (i = 0; i < cfg->num_streamids; ++i) {
 		u32 idx = cfg->smrs ? cfg->smrs[i].idx : cfg->streamids[i];
 
@@ -1230,6 +1272,7 @@ static void arm_smmu_domain_remove_master(struct arm_smmu_domain *smmu_domain,
 	}
 
 	arm_smmu_master_free_smrs(smmu, cfg);
+	arm_smmu_disable_clocks(smmu);
 }
 
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
@@ -1250,6 +1293,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		return -EEXIST;
 	}
 
+	arm_smmu_enable_clocks(smmu);
+
 	/*
 	 * Sanity check the domain. We don't support domains across
 	 * different SMMUs.
@@ -1259,7 +1304,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		/* Now that we have a master, we can finalise the domain */
 		ret = arm_smmu_init_domain_context(domain, smmu);
 		if (IS_ERR_VALUE(ret))
-			return ret;
+			goto disable_clocks;
 
 		dom_smmu = smmu_domain->smmu;
 	}
@@ -1268,17 +1313,22 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		dev_err(dev,
 			"cannot attach to SMMU %s whilst already attached to domain on SMMU %s\n",
 			dev_name(smmu_domain->smmu->dev), dev_name(smmu->dev));
-		return -EINVAL;
+		ret = -EINVAL;
+		goto disable_clocks;
 	}
 
 	/* Looks ok, so add the device to the domain */
 	cfg = find_smmu_master_cfg(dev);
-	if (!cfg)
-		return -ENODEV;
+	if (!cfg) {
+		ret = -ENODEV;
+		goto disable_clocks;
+	}
 
 	ret = arm_smmu_domain_add_master(smmu_domain, cfg);
 	if (!ret)
 		dev->archdata.iommu = domain;
+disable_clocks:
+	arm_smmu_disable_clocks(smmu);
 	return ret;
 }
 
@@ -1544,7 +1594,9 @@ static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 	struct arm_smmu_domain *smmu_domain = domain->priv;
 
 	ret = arm_smmu_handle_mapping(smmu_domain, iova, 0, size, 0);
+	arm_smmu_enable_clocks(smmu_domain->smmu);
 	arm_smmu_tlb_inv_context(smmu_domain);
+	arm_smmu_disable_clocks(smmu_domain->smmu);
 	return ret ? 0 : size;
 }
 
@@ -1792,7 +1844,52 @@ static int arm_smmu_id_size_to_bits(int size)
 	}
 }
 
-static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
+static int arm_smmu_init_clocks(struct arm_smmu_device *smmu)
+{
+	const char *cname;
+	struct property *prop;
+	int i;
+	struct device *dev = smmu->dev;
+
+	smmu->num_clocks = of_property_count_strings(dev->of_node,
+						"clock-names");
+
+	if (!smmu->num_clocks)
+		return 0;
+
+	smmu->clocks = devm_kzalloc(
+		dev, sizeof(*smmu->clocks) * smmu->num_clocks,
+		GFP_KERNEL);
+
+	if (!smmu->clocks) {
+		dev_err(dev,
+			"Failed to allocate memory for clocks\n");
+		return -ENODEV;
+	}
+
+	i = 0;
+	of_property_for_each_string(dev->of_node, "clock-names",
+				prop, cname) {
+		struct clk *c = devm_clk_get(dev, cname);
+		if (IS_ERR(c)) {
+			dev_err(dev, "Couldn't get clock: %s",
+				cname);
+			return -ENODEV;
+		}
+
+		if (clk_get_rate(c) == 0) {
+			long rate = clk_round_rate(c, 1000);
+			clk_set_rate(c, rate);
+		}
+
+		smmu->clocks[i] = c;
+
+		++i;
+	}
+	return 0;
+}
+
+int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 {
 	unsigned long size;
 	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
@@ -2027,10 +2124,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 	}
 	dev_notice(dev, "registered %d master devices\n", i);
 
-	err = arm_smmu_device_cfg_probe(smmu);
+	err = arm_smmu_init_clocks(smmu);
 	if (err)
 		goto out_put_masters;
 
+	arm_smmu_enable_clocks(smmu);
+
+	err = arm_smmu_device_cfg_probe(smmu);
+	if (err)
+		goto out_disable_clocks;
+
 	parse_driver_options(smmu);
 
 	if (smmu->version > 1 &&
@@ -2039,7 +2142,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 			"found only %d context interrupt(s) but %d required\n",
 			smmu->num_context_irqs, smmu->num_context_banks);
 		err = -ENODEV;
-		goto out_put_masters;
+		goto out_disable_clocks;
 	}
 
 	for (i = 0; i < smmu->num_global_irqs; ++i) {
@@ -2061,12 +2164,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 	spin_unlock(&arm_smmu_devices_lock);
 
 	arm_smmu_device_reset(smmu);
+	arm_smmu_disable_clocks(smmu);
 	return 0;
 
 out_free_irqs:
 	while (i--)
 		free_irq(smmu->irqs[i], smmu);
 
+out_disable_clocks:
+	arm_smmu_disable_clocks(smmu);
+
 out_put_masters:
 	for (node = rb_first(&smmu->masters); node; node = rb_next(node)) {
 		struct arm_smmu_master *master
@@ -2110,7 +2217,9 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
 		free_irq(smmu->irqs[i], smmu);
 
 	/* Turn the thing off */
+	arm_smmu_enable_clocks(smmu);
 	writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
+	arm_smmu_disable_clocks(smmu);
 	return 0;
 }
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 2/6] iommu/arm-smmu: add support for specifying regulators
  2014-08-13  0:51 ` Mitchel Humpherys
@ 2014-08-13  0:51     ` Mitchel Humpherys
  -1 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13  0:51 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Will Deacon,
	devicetree-u79uwXL29TY76Z2rM5mHXA

On some power-constrained platforms it's useful to disable power when a
device is not in use. Add support for specifying regulators for SMMUs
and only leave power on as long as the SMMU is in use (attached).

Signed-off-by: Mitchel Humpherys <mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
---
 .../devicetree/bindings/iommu/arm,smmu.txt         |   3 +
 drivers/iommu/arm-smmu.c                           | 102 ++++++++++++++++++---
 2 files changed, 93 insertions(+), 12 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index ceae3fe207..dbc1ddad79 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -59,6 +59,9 @@ conditions.
                   Documentation/devicetree/bindings/clock/clock-bindings.txt
                   for more info.
 
+- vdd-supply    : Phandle of the regulator that should be powered on during
+                  SMMU register access.
+
 Example:
 
         smmu {
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index e123d75db3..7fdc58d8f8 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -402,6 +402,11 @@ struct arm_smmu_device {
 
 	int num_clocks;
 	struct clk **clocks;
+
+	struct regulator *gdsc;
+
+	struct mutex			attach_lock;
+	unsigned int			attach_count;
 };
 
 struct arm_smmu_cfg {
@@ -617,6 +622,22 @@ static void arm_smmu_disable_clocks(struct arm_smmu_device *smmu)
 		clk_disable_unprepare(smmu->clocks[i]);
 }
 
+static int arm_smmu_enable_regulators(struct arm_smmu_device *smmu)
+{
+	if (!smmu->gdsc)
+		return 0;
+
+	return regulator_enable(smmu->gdsc);
+}
+
+static int arm_smmu_disable_regulators(struct arm_smmu_device *smmu)
+{
+	if (!smmu->gdsc)
+		return 0;
+
+	return regulator_disable(smmu->gdsc);
+}
+
 /* Wait for any pending TLB invalidations to complete */
 static void arm_smmu_tlb_sync(struct arm_smmu_device *smmu)
 {
@@ -1275,6 +1296,8 @@ static void arm_smmu_domain_remove_master(struct arm_smmu_domain *smmu_domain,
 	arm_smmu_disable_clocks(smmu);
 }
 
+static void arm_smmu_device_reset(struct arm_smmu_device *smmu);
+
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	int ret;
@@ -1293,7 +1316,15 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		return -EEXIST;
 	}
 
-	arm_smmu_enable_clocks(smmu);
+	mutex_lock(&smmu->attach_lock);
+	if (!smmu->attach_count++) {
+		arm_smmu_enable_regulators(smmu);
+		arm_smmu_enable_clocks(smmu);
+		arm_smmu_device_reset(smmu);
+	} else {
+		arm_smmu_enable_clocks(smmu);
+	}
+	mutex_unlock(&smmu->attach_lock);
 
 	/*
 	 * Sanity check the domain. We don't support domains across
@@ -1304,7 +1335,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		/* Now that we have a master, we can finalise the domain */
 		ret = arm_smmu_init_domain_context(domain, smmu);
 		if (IS_ERR_VALUE(ret))
-			goto disable_clocks;
+			goto err_disable_clocks;
 
 		dom_smmu = smmu_domain->smmu;
 	}
@@ -1314,28 +1345,46 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 			"cannot attach to SMMU %s whilst already attached to domain on SMMU %s\n",
 			dev_name(smmu_domain->smmu->dev), dev_name(smmu->dev));
 		ret = -EINVAL;
-		goto disable_clocks;
+		goto err_disable_clocks;
 	}
 
 	/* Looks ok, so add the device to the domain */
 	cfg = find_smmu_master_cfg(dev);
 	if (!cfg) {
 		ret = -ENODEV;
-		goto disable_clocks;
+		goto err_disable_clocks;
 	}
 
 	ret = arm_smmu_domain_add_master(smmu_domain, cfg);
 	if (!ret)
 		dev->archdata.iommu = domain;
-disable_clocks:
 	arm_smmu_disable_clocks(smmu);
 	return ret;
+
+err_disable_clocks:
+	arm_smmu_disable_clocks(smmu);
+	mutex_lock(&smmu->attach_lock);
+	if (!--smmu->attach_count)
+		arm_smmu_disable_regulators(smmu);
+	mutex_unlock(&smmu->attach_lock);
+	return ret;
+}
+
+static void arm_smmu_power_off(struct arm_smmu_device *smmu)
+{
+	/* Turn the thing off */
+	arm_smmu_enable_clocks(smmu);
+	writel_relaxed(sCR0_CLIENTPD,
+		ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
+	arm_smmu_disable_clocks(smmu);
+	arm_smmu_disable_regulators(smmu);
 }
 
 static void arm_smmu_detach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	struct arm_smmu_domain *smmu_domain = domain->priv;
 	struct arm_smmu_master_cfg *cfg;
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
 
 	cfg = find_smmu_master_cfg(dev);
 	if (!cfg)
@@ -1343,6 +1392,10 @@ static void arm_smmu_detach_dev(struct iommu_domain *domain, struct device *dev)
 
 	dev->archdata.iommu = NULL;
 	arm_smmu_domain_remove_master(smmu_domain, cfg);
+	mutex_lock(&smmu->attach_lock);
+	if (!--smmu->attach_count)
+		arm_smmu_power_off(smmu);
+	mutex_unlock(&smmu->attach_lock);
 }
 
 static bool arm_smmu_pte_is_contiguous_range(unsigned long addr,
@@ -1844,6 +1897,20 @@ static int arm_smmu_id_size_to_bits(int size)
 	}
 }
 
+static int arm_smmu_init_regulators(struct arm_smmu_device *smmu)
+{
+	struct device *dev = smmu->dev;
+
+	if (!of_get_property(dev->of_node, "vdd-supply", NULL))
+		return 0;
+
+	smmu->gdsc = devm_regulator_get(dev, "vdd");
+	if (IS_ERR(smmu->gdsc))
+		return PTR_ERR(smmu->gdsc);
+
+	return 0;
+}
+
 static int arm_smmu_init_clocks(struct arm_smmu_device *smmu)
 {
 	const char *cname;
@@ -2065,6 +2132,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 		return -ENOMEM;
 	}
 	smmu->dev = dev;
+	mutex_init(&smmu->attach_lock);
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	smmu->base = devm_ioremap_resource(dev, res);
@@ -2124,13 +2192,19 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 	}
 	dev_notice(dev, "registered %d master devices\n", i);
 
+	err = arm_smmu_init_regulators(smmu);
+	if (err)
+		goto out_put_masters;
+
 	err = arm_smmu_init_clocks(smmu);
 	if (err)
 		goto out_put_masters;
 
+	arm_smmu_enable_regulators(smmu);
 	arm_smmu_enable_clocks(smmu);
-
 	err = arm_smmu_device_cfg_probe(smmu);
+	arm_smmu_disable_clocks(smmu);
+	arm_smmu_disable_regulators(smmu);
 	if (err)
 		goto out_disable_clocks;
 
@@ -2163,8 +2237,6 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 	list_add(&smmu->list, &arm_smmu_devices);
 	spin_unlock(&arm_smmu_devices_lock);
 
-	arm_smmu_device_reset(smmu);
-	arm_smmu_disable_clocks(smmu);
 	return 0;
 
 out_free_irqs:
@@ -2173,6 +2245,7 @@ out_free_irqs:
 
 out_disable_clocks:
 	arm_smmu_disable_clocks(smmu);
+	arm_smmu_disable_regulators(smmu);
 
 out_put_masters:
 	for (node = rb_first(&smmu->masters); node; node = rb_next(node)) {
@@ -2216,10 +2289,15 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
 	for (i = 0; i < smmu->num_global_irqs; ++i)
 		free_irq(smmu->irqs[i], smmu);
 
-	/* Turn the thing off */
-	arm_smmu_enable_clocks(smmu);
-	writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
-	arm_smmu_disable_clocks(smmu);
+	mutex_lock(&smmu->attach_lock);
+	/*
+	 * If all devices weren't detached for some reason, we're
+	 * still powered on. Power off now.
+	 */
+	if (smmu->attach_count)
+		arm_smmu_power_off(smmu);
+	mutex_unlock(&smmu->attach_lock);
+
 	return 0;
 }
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 2/6] iommu/arm-smmu: add support for specifying regulators
@ 2014-08-13  0:51     ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13  0:51 UTC (permalink / raw)
  To: linux-arm-kernel

On some power-constrained platforms it's useful to disable power when a
device is not in use. Add support for specifying regulators for SMMUs
and only leave power on as long as the SMMU is in use (attached).

Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
---
 .../devicetree/bindings/iommu/arm,smmu.txt         |   3 +
 drivers/iommu/arm-smmu.c                           | 102 ++++++++++++++++++---
 2 files changed, 93 insertions(+), 12 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index ceae3fe207..dbc1ddad79 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -59,6 +59,9 @@ conditions.
                   Documentation/devicetree/bindings/clock/clock-bindings.txt
                   for more info.
 
+- vdd-supply    : Phandle of the regulator that should be powered on during
+                  SMMU register access.
+
 Example:
 
         smmu {
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index e123d75db3..7fdc58d8f8 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -402,6 +402,11 @@ struct arm_smmu_device {
 
 	int num_clocks;
 	struct clk **clocks;
+
+	struct regulator *gdsc;
+
+	struct mutex			attach_lock;
+	unsigned int			attach_count;
 };
 
 struct arm_smmu_cfg {
@@ -617,6 +622,22 @@ static void arm_smmu_disable_clocks(struct arm_smmu_device *smmu)
 		clk_disable_unprepare(smmu->clocks[i]);
 }
 
+static int arm_smmu_enable_regulators(struct arm_smmu_device *smmu)
+{
+	if (!smmu->gdsc)
+		return 0;
+
+	return regulator_enable(smmu->gdsc);
+}
+
+static int arm_smmu_disable_regulators(struct arm_smmu_device *smmu)
+{
+	if (!smmu->gdsc)
+		return 0;
+
+	return regulator_disable(smmu->gdsc);
+}
+
 /* Wait for any pending TLB invalidations to complete */
 static void arm_smmu_tlb_sync(struct arm_smmu_device *smmu)
 {
@@ -1275,6 +1296,8 @@ static void arm_smmu_domain_remove_master(struct arm_smmu_domain *smmu_domain,
 	arm_smmu_disable_clocks(smmu);
 }
 
+static void arm_smmu_device_reset(struct arm_smmu_device *smmu);
+
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	int ret;
@@ -1293,7 +1316,15 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		return -EEXIST;
 	}
 
-	arm_smmu_enable_clocks(smmu);
+	mutex_lock(&smmu->attach_lock);
+	if (!smmu->attach_count++) {
+		arm_smmu_enable_regulators(smmu);
+		arm_smmu_enable_clocks(smmu);
+		arm_smmu_device_reset(smmu);
+	} else {
+		arm_smmu_enable_clocks(smmu);
+	}
+	mutex_unlock(&smmu->attach_lock);
 
 	/*
 	 * Sanity check the domain. We don't support domains across
@@ -1304,7 +1335,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		/* Now that we have a master, we can finalise the domain */
 		ret = arm_smmu_init_domain_context(domain, smmu);
 		if (IS_ERR_VALUE(ret))
-			goto disable_clocks;
+			goto err_disable_clocks;
 
 		dom_smmu = smmu_domain->smmu;
 	}
@@ -1314,28 +1345,46 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 			"cannot attach to SMMU %s whilst already attached to domain on SMMU %s\n",
 			dev_name(smmu_domain->smmu->dev), dev_name(smmu->dev));
 		ret = -EINVAL;
-		goto disable_clocks;
+		goto err_disable_clocks;
 	}
 
 	/* Looks ok, so add the device to the domain */
 	cfg = find_smmu_master_cfg(dev);
 	if (!cfg) {
 		ret = -ENODEV;
-		goto disable_clocks;
+		goto err_disable_clocks;
 	}
 
 	ret = arm_smmu_domain_add_master(smmu_domain, cfg);
 	if (!ret)
 		dev->archdata.iommu = domain;
-disable_clocks:
 	arm_smmu_disable_clocks(smmu);
 	return ret;
+
+err_disable_clocks:
+	arm_smmu_disable_clocks(smmu);
+	mutex_lock(&smmu->attach_lock);
+	if (!--smmu->attach_count)
+		arm_smmu_disable_regulators(smmu);
+	mutex_unlock(&smmu->attach_lock);
+	return ret;
+}
+
+static void arm_smmu_power_off(struct arm_smmu_device *smmu)
+{
+	/* Turn the thing off */
+	arm_smmu_enable_clocks(smmu);
+	writel_relaxed(sCR0_CLIENTPD,
+		ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
+	arm_smmu_disable_clocks(smmu);
+	arm_smmu_disable_regulators(smmu);
 }
 
 static void arm_smmu_detach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	struct arm_smmu_domain *smmu_domain = domain->priv;
 	struct arm_smmu_master_cfg *cfg;
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
 
 	cfg = find_smmu_master_cfg(dev);
 	if (!cfg)
@@ -1343,6 +1392,10 @@ static void arm_smmu_detach_dev(struct iommu_domain *domain, struct device *dev)
 
 	dev->archdata.iommu = NULL;
 	arm_smmu_domain_remove_master(smmu_domain, cfg);
+	mutex_lock(&smmu->attach_lock);
+	if (!--smmu->attach_count)
+		arm_smmu_power_off(smmu);
+	mutex_unlock(&smmu->attach_lock);
 }
 
 static bool arm_smmu_pte_is_contiguous_range(unsigned long addr,
@@ -1844,6 +1897,20 @@ static int arm_smmu_id_size_to_bits(int size)
 	}
 }
 
+static int arm_smmu_init_regulators(struct arm_smmu_device *smmu)
+{
+	struct device *dev = smmu->dev;
+
+	if (!of_get_property(dev->of_node, "vdd-supply", NULL))
+		return 0;
+
+	smmu->gdsc = devm_regulator_get(dev, "vdd");
+	if (IS_ERR(smmu->gdsc))
+		return PTR_ERR(smmu->gdsc);
+
+	return 0;
+}
+
 static int arm_smmu_init_clocks(struct arm_smmu_device *smmu)
 {
 	const char *cname;
@@ -2065,6 +2132,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 		return -ENOMEM;
 	}
 	smmu->dev = dev;
+	mutex_init(&smmu->attach_lock);
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	smmu->base = devm_ioremap_resource(dev, res);
@@ -2124,13 +2192,19 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 	}
 	dev_notice(dev, "registered %d master devices\n", i);
 
+	err = arm_smmu_init_regulators(smmu);
+	if (err)
+		goto out_put_masters;
+
 	err = arm_smmu_init_clocks(smmu);
 	if (err)
 		goto out_put_masters;
 
+	arm_smmu_enable_regulators(smmu);
 	arm_smmu_enable_clocks(smmu);
-
 	err = arm_smmu_device_cfg_probe(smmu);
+	arm_smmu_disable_clocks(smmu);
+	arm_smmu_disable_regulators(smmu);
 	if (err)
 		goto out_disable_clocks;
 
@@ -2163,8 +2237,6 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 	list_add(&smmu->list, &arm_smmu_devices);
 	spin_unlock(&arm_smmu_devices_lock);
 
-	arm_smmu_device_reset(smmu);
-	arm_smmu_disable_clocks(smmu);
 	return 0;
 
 out_free_irqs:
@@ -2173,6 +2245,7 @@ out_free_irqs:
 
 out_disable_clocks:
 	arm_smmu_disable_clocks(smmu);
+	arm_smmu_disable_regulators(smmu);
 
 out_put_masters:
 	for (node = rb_first(&smmu->masters); node; node = rb_next(node)) {
@@ -2216,10 +2289,15 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
 	for (i = 0; i < smmu->num_global_irqs; ++i)
 		free_irq(smmu->irqs[i], smmu);
 
-	/* Turn the thing off */
-	arm_smmu_enable_clocks(smmu);
-	writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
-	arm_smmu_disable_clocks(smmu);
+	mutex_lock(&smmu->attach_lock);
+	/*
+	 * If all devices weren't detached for some reason, we're
+	 * still powered on. Power off now.
+	 */
+	if (smmu->attach_count)
+		arm_smmu_power_off(smmu);
+	mutex_unlock(&smmu->attach_lock);
+
 	return 0;
 }
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 3/6] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
  2014-08-13  0:51 ` Mitchel Humpherys
@ 2014-08-13  0:51     ` Mitchel Humpherys
  -1 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13  0:51 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Will Deacon,
	devicetree-u79uwXL29TY76Z2rM5mHXA

Currently, we provide the iommu_ops.iova_to_phys service by doing a
table walk in software to translate IO virtual addresses to physical
addresses. On SMMUs that support it, it can be useful to ask the SMMU
itself to do the translation. This can be used to warm the TLBs for an
SMMU. It can also be useful for testing and hardware validation.

Since the address translation registers are optional on SMMUv2, only
enable hardware translations when using SMMUv1 or when SMMU_IDR0.S1TS=1
and SMMU_IDR0.ATOSNS=0, as described in the ARM SMMU v1-v2 spec.

Signed-off-by: Mitchel Humpherys <mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
---
 drivers/iommu/arm-smmu.c | 75 +++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 74 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 7fdc58d8f8..63c6707fad 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -246,11 +246,17 @@
 #define ARM_SMMU_CB_TTBR0_HI		0x24
 #define ARM_SMMU_CB_TTBCR		0x30
 #define ARM_SMMU_CB_S1_MAIR0		0x38
+#define ARM_SMMU_CB_PAR_LO		0x50
+#define ARM_SMMU_CB_PAR_HI		0x54
 #define ARM_SMMU_CB_FSR			0x58
 #define ARM_SMMU_CB_FAR_LO		0x60
 #define ARM_SMMU_CB_FAR_HI		0x64
 #define ARM_SMMU_CB_FSYNR0		0x68
 #define ARM_SMMU_CB_S1_TLBIASID		0x610
+#define ARM_SMMU_CB_ATS1PR_LO		0x800
+#define ARM_SMMU_CB_ATS1PR_HI		0x804
+#define ARM_SMMU_CB_ATSR		0x8f0
+#define ATSR_LOOP_TIMEOUT		1000000	/* 1s! */
 
 #define SCTLR_S1_ASIDPNE		(1 << 12)
 #define SCTLR_CFCFG			(1 << 7)
@@ -262,6 +268,10 @@
 #define SCTLR_M				(1 << 0)
 #define SCTLR_EAE_SBOP			(SCTLR_AFE | SCTLR_TRE)
 
+#define CB_PAR_F			(1 << 0)
+
+#define ATSR_ACTIVE			(1 << 0)
+
 #define RESUME_RETRY			(0 << 0)
 #define RESUME_TERMINATE		(1 << 0)
 
@@ -375,6 +385,7 @@ struct arm_smmu_device {
 #define ARM_SMMU_FEAT_TRANS_S1		(1 << 2)
 #define ARM_SMMU_FEAT_TRANS_S2		(1 << 3)
 #define ARM_SMMU_FEAT_TRANS_NESTED	(1 << 4)
+#define ARM_SMMU_FEAT_TRANS_OPS		(1 << 5)
 	u32				features;
 
 #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
@@ -1653,7 +1664,7 @@ static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 	return ret ? 0 : size;
 }
 
-static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
+static phys_addr_t arm_smmu_iova_to_phys_soft(struct iommu_domain *domain,
 					 dma_addr_t iova)
 {
 	pgd_t *pgdp, pgd;
@@ -1686,6 +1697,63 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
 	return __pfn_to_phys(pte_pfn(pte)) | (iova & ~PAGE_MASK);
 }
 
+static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
+					dma_addr_t iova)
+{
+	struct arm_smmu_domain *smmu_domain = domain->priv;
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	struct device *dev = smmu->dev;
+	void __iomem *cb_base;
+	int count = 0;
+	u64 phys;
+
+	arm_smmu_enable_clocks(smmu);
+
+	cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
+
+	if (smmu->version == 1) {
+		u32 reg = iova & 0xFFFFF000;
+		writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO);
+	} else {
+		u64 reg = iova & 0xfffffffffffff000;
+		writeq_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO);
+	}
+
+	mb();
+	while (readl_relaxed(cb_base + ARM_SMMU_CB_ATSR) & ATSR_ACTIVE) {
+		if (++count == ATSR_LOOP_TIMEOUT) {
+			dev_err(dev,
+				"iova to phys timed out on 0x%pa for %s. Falling back to software table walk.\n",
+				&iova, dev_name(dev));
+			arm_smmu_disable_clocks(smmu);
+			return arm_smmu_iova_to_phys_soft(domain, iova);
+		}
+		cpu_relax();
+	}
+
+	phys = readl_relaxed(cb_base + ARM_SMMU_CB_PAR_LO);
+	phys |= ((u64) readl_relaxed(cb_base + ARM_SMMU_CB_PAR_HI)) << 32;
+
+	if (phys & CB_PAR_F) {
+		dev_err(dev, "translation fault on %s!\n", dev_name(dev));
+		dev_err(dev, "PAR = 0x%llx\n", phys);
+	}
+	phys = (phys & 0xFFFFFFF000ULL) | (iova & 0x00000FFF);
+
+	arm_smmu_disable_clocks(smmu);
+	return phys;
+}
+
+static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
+					dma_addr_t iova)
+{
+	struct arm_smmu_domain *smmu_domain = domain->priv;
+	if (smmu_domain->smmu->features & ARM_SMMU_FEAT_TRANS_OPS)
+		return arm_smmu_iova_to_phys_hard(domain, iova);
+	return arm_smmu_iova_to_phys_soft(domain, iova);
+}
+
 static int arm_smmu_domain_has_cap(struct iommu_domain *domain,
 				   unsigned long cap)
 {
@@ -2005,6 +2073,11 @@ int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 		return -ENODEV;
 	}
 
+	if (smmu->version == 1 || (!(id & ID0_ATOSNS) && (id & ID0_S1TS))) {
+		smmu->features |= ARM_SMMU_FEAT_TRANS_OPS;
+		dev_notice(smmu->dev, "\taddress translation ops\n");
+	}
+
 	if (id & ID0_CTTW) {
 		smmu->features |= ARM_SMMU_FEAT_COHERENT_WALK;
 		dev_notice(smmu->dev, "\tcoherent table walk\n");
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 3/6] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
@ 2014-08-13  0:51     ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13  0:51 UTC (permalink / raw)
  To: linux-arm-kernel

Currently, we provide the iommu_ops.iova_to_phys service by doing a
table walk in software to translate IO virtual addresses to physical
addresses. On SMMUs that support it, it can be useful to ask the SMMU
itself to do the translation. This can be used to warm the TLBs for an
SMMU. It can also be useful for testing and hardware validation.

Since the address translation registers are optional on SMMUv2, only
enable hardware translations when using SMMUv1 or when SMMU_IDR0.S1TS=1
and SMMU_IDR0.ATOSNS=0, as described in the ARM SMMU v1-v2 spec.

Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
---
 drivers/iommu/arm-smmu.c | 75 +++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 74 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 7fdc58d8f8..63c6707fad 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -246,11 +246,17 @@
 #define ARM_SMMU_CB_TTBR0_HI		0x24
 #define ARM_SMMU_CB_TTBCR		0x30
 #define ARM_SMMU_CB_S1_MAIR0		0x38
+#define ARM_SMMU_CB_PAR_LO		0x50
+#define ARM_SMMU_CB_PAR_HI		0x54
 #define ARM_SMMU_CB_FSR			0x58
 #define ARM_SMMU_CB_FAR_LO		0x60
 #define ARM_SMMU_CB_FAR_HI		0x64
 #define ARM_SMMU_CB_FSYNR0		0x68
 #define ARM_SMMU_CB_S1_TLBIASID		0x610
+#define ARM_SMMU_CB_ATS1PR_LO		0x800
+#define ARM_SMMU_CB_ATS1PR_HI		0x804
+#define ARM_SMMU_CB_ATSR		0x8f0
+#define ATSR_LOOP_TIMEOUT		1000000	/* 1s! */
 
 #define SCTLR_S1_ASIDPNE		(1 << 12)
 #define SCTLR_CFCFG			(1 << 7)
@@ -262,6 +268,10 @@
 #define SCTLR_M				(1 << 0)
 #define SCTLR_EAE_SBOP			(SCTLR_AFE | SCTLR_TRE)
 
+#define CB_PAR_F			(1 << 0)
+
+#define ATSR_ACTIVE			(1 << 0)
+
 #define RESUME_RETRY			(0 << 0)
 #define RESUME_TERMINATE		(1 << 0)
 
@@ -375,6 +385,7 @@ struct arm_smmu_device {
 #define ARM_SMMU_FEAT_TRANS_S1		(1 << 2)
 #define ARM_SMMU_FEAT_TRANS_S2		(1 << 3)
 #define ARM_SMMU_FEAT_TRANS_NESTED	(1 << 4)
+#define ARM_SMMU_FEAT_TRANS_OPS		(1 << 5)
 	u32				features;
 
 #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
@@ -1653,7 +1664,7 @@ static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 	return ret ? 0 : size;
 }
 
-static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
+static phys_addr_t arm_smmu_iova_to_phys_soft(struct iommu_domain *domain,
 					 dma_addr_t iova)
 {
 	pgd_t *pgdp, pgd;
@@ -1686,6 +1697,63 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
 	return __pfn_to_phys(pte_pfn(pte)) | (iova & ~PAGE_MASK);
 }
 
+static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
+					dma_addr_t iova)
+{
+	struct arm_smmu_domain *smmu_domain = domain->priv;
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	struct device *dev = smmu->dev;
+	void __iomem *cb_base;
+	int count = 0;
+	u64 phys;
+
+	arm_smmu_enable_clocks(smmu);
+
+	cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
+
+	if (smmu->version == 1) {
+		u32 reg = iova & 0xFFFFF000;
+		writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO);
+	} else {
+		u64 reg = iova & 0xfffffffffffff000;
+		writeq_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO);
+	}
+
+	mb();
+	while (readl_relaxed(cb_base + ARM_SMMU_CB_ATSR) & ATSR_ACTIVE) {
+		if (++count == ATSR_LOOP_TIMEOUT) {
+			dev_err(dev,
+				"iova to phys timed out on 0x%pa for %s. Falling back to software table walk.\n",
+				&iova, dev_name(dev));
+			arm_smmu_disable_clocks(smmu);
+			return arm_smmu_iova_to_phys_soft(domain, iova);
+		}
+		cpu_relax();
+	}
+
+	phys = readl_relaxed(cb_base + ARM_SMMU_CB_PAR_LO);
+	phys |= ((u64) readl_relaxed(cb_base + ARM_SMMU_CB_PAR_HI)) << 32;
+
+	if (phys & CB_PAR_F) {
+		dev_err(dev, "translation fault on %s!\n", dev_name(dev));
+		dev_err(dev, "PAR = 0x%llx\n", phys);
+	}
+	phys = (phys & 0xFFFFFFF000ULL) | (iova & 0x00000FFF);
+
+	arm_smmu_disable_clocks(smmu);
+	return phys;
+}
+
+static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
+					dma_addr_t iova)
+{
+	struct arm_smmu_domain *smmu_domain = domain->priv;
+	if (smmu_domain->smmu->features & ARM_SMMU_FEAT_TRANS_OPS)
+		return arm_smmu_iova_to_phys_hard(domain, iova);
+	return arm_smmu_iova_to_phys_soft(domain, iova);
+}
+
 static int arm_smmu_domain_has_cap(struct iommu_domain *domain,
 				   unsigned long cap)
 {
@@ -2005,6 +2073,11 @@ int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 		return -ENODEV;
 	}
 
+	if (smmu->version == 1 || (!(id & ID0_ATOSNS) && (id & ID0_S1TS))) {
+		smmu->features |= ARM_SMMU_FEAT_TRANS_OPS;
+		dev_notice(smmu->dev, "\taddress translation ops\n");
+	}
+
 	if (id & ID0_CTTW) {
 		smmu->features |= ARM_SMMU_FEAT_COHERENT_WALK;
 		dev_notice(smmu->dev, "\tcoherent table walk\n");
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 4/6] iommu/arm-smmu: implement generic DT bindings
  2014-08-13  0:51 ` Mitchel Humpherys
@ 2014-08-13  0:51     ` Mitchel Humpherys
  -1 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13  0:51 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Will Deacon,
	devicetree-u79uwXL29TY76Z2rM5mHXA

Generic IOMMU device tree bindings were recently added in
["devicetree: Add generic IOMMU device tree bindings"]. Implement the
bindings in the ARM SMMU driver.

See Documentation/devicetree/bindings/iommu/iommu.txt for the bindings
themselves.

Signed-off-by: Mitchel Humpherys <mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
---
 drivers/iommu/arm-smmu.c | 87 +++++++++++++++++++++++++++++++++++-------------
 1 file changed, 64 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 63c6707fad..22e25f3172 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -538,25 +538,32 @@ static int insert_smmu_master(struct arm_smmu_device *smmu,
 	return 0;
 }
 
+struct iommus_entry {
+	struct list_head list;
+	struct device_node *node;
+	u16 streamids[MAX_MASTER_STREAMIDS];
+	int num_sids;
+};
+
 static int register_smmu_master(struct arm_smmu_device *smmu,
-				struct device *dev,
-				struct of_phandle_args *masterspec)
+				struct iommus_entry *entry)
 {
 	int i;
 	struct arm_smmu_master *master;
+	struct device *dev = smmu->dev;
 
-	master = find_smmu_master(smmu, masterspec->np);
+	master = find_smmu_master(smmu, entry->node);
 	if (master) {
 		dev_err(dev,
 			"rejecting multiple registrations for master device %s\n",
-			masterspec->np->name);
+			entry->node->name);
 		return -EBUSY;
 	}
 
-	if (masterspec->args_count > MAX_MASTER_STREAMIDS) {
+	if (entry->num_sids > MAX_MASTER_STREAMIDS) {
 		dev_err(dev,
 			"reached maximum number (%d) of stream IDs for master device %s\n",
-			MAX_MASTER_STREAMIDS, masterspec->np->name);
+			MAX_MASTER_STREAMIDS, entry->node->name);
 		return -ENOSPC;
 	}
 
@@ -564,15 +571,58 @@ static int register_smmu_master(struct arm_smmu_device *smmu,
 	if (!master)
 		return -ENOMEM;
 
-	master->of_node			= masterspec->np;
-	master->cfg.num_streamids	= masterspec->args_count;
+	master->of_node			= entry->node;
+	master->cfg.num_streamids	= entry->num_sids;
 
 	for (i = 0; i < master->cfg.num_streamids; ++i)
-		master->cfg.streamids[i] = masterspec->args[i];
+		master->cfg.streamids[i] = entry->streamids[i];
 
 	return insert_smmu_master(smmu, master);
 }
 
+static int arm_smmu_parse_iommus_properties(struct arm_smmu_device *smmu,
+					int *num_masters)
+{
+	struct of_phandle_args iommuspec;
+	struct device_node *dn;
+
+	for_each_node_with_property(dn, "iommus") {
+		int arg_ind = 0;
+		struct iommus_entry *entry, *n;
+		LIST_HEAD(iommus);
+
+		while (!of_parse_phandle_with_args(dn, "iommus", "#iommu-cells",
+							arg_ind, &iommuspec)) {
+			int i;
+
+			list_for_each_entry(entry, &iommus, list)
+				if (entry->node == dn)
+					break;
+			if (&entry->list == &iommus) {
+				entry = devm_kzalloc(smmu->dev, sizeof(*entry),
+						GFP_KERNEL);
+				if (!entry)
+					return -ENOMEM;
+				entry->node = dn;
+				list_add(&entry->list, &iommus);
+			}
+			entry->num_sids = iommuspec.args_count;
+			for (i = 0; i < entry->num_sids; ++i)
+				entry->streamids[i] = iommuspec.args[i];
+			arg_ind++;
+		}
+
+		list_for_each_entry_safe(entry, n, &iommus, list) {
+			register_smmu_master(smmu, entry);
+			(*num_masters)++;
+			list_del(&entry->list);
+			devm_kfree(smmu->dev, entry);
+		}
+	}
+
+	return 0;
+}
+
 static struct arm_smmu_device *find_smmu_for_device(struct device *dev)
 {
 	struct arm_smmu_device *smmu;
@@ -2196,8 +2246,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 	struct arm_smmu_device *smmu;
 	struct device *dev = &pdev->dev;
 	struct rb_node *node;
-	struct of_phandle_args masterspec;
-	int num_irqs, i, err;
+	int num_irqs, i, err, num_masters;
 
 	smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
 	if (!smmu) {
@@ -2251,19 +2300,11 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 
 	i = 0;
 	smmu->masters = RB_ROOT;
-	while (!of_parse_phandle_with_args(dev->of_node, "mmu-masters",
-					   "#stream-id-cells", i,
-					   &masterspec)) {
-		err = register_smmu_master(smmu, dev, &masterspec);
-		if (err) {
-			dev_err(dev, "failed to add master %s\n",
-				masterspec.np->name);
-			goto out_put_masters;
-		}
+	err = arm_smmu_parse_iommus_properties(smmu, &num_masters);
+	if (err)
+		goto out_put_masters;
 
-		i++;
-	}
-	dev_notice(dev, "registered %d master devices\n", i);
+	dev_notice(dev, "registered %d master devices\n", num_masters);
 
 	err = arm_smmu_init_regulators(smmu);
 	if (err)
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 4/6] iommu/arm-smmu: implement generic DT bindings
@ 2014-08-13  0:51     ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13  0:51 UTC (permalink / raw)
  To: linux-arm-kernel

Generic IOMMU device tree bindings were recently added in
["devicetree: Add generic IOMMU device tree bindings"]. Implement the
bindings in the ARM SMMU driver.

See Documentation/devicetree/bindings/iommu/iommu.txt for the bindings
themselves.

Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
---
 drivers/iommu/arm-smmu.c | 87 +++++++++++++++++++++++++++++++++++-------------
 1 file changed, 64 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 63c6707fad..22e25f3172 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -538,25 +538,32 @@ static int insert_smmu_master(struct arm_smmu_device *smmu,
 	return 0;
 }
 
+struct iommus_entry {
+	struct list_head list;
+	struct device_node *node;
+	u16 streamids[MAX_MASTER_STREAMIDS];
+	int num_sids;
+};
+
 static int register_smmu_master(struct arm_smmu_device *smmu,
-				struct device *dev,
-				struct of_phandle_args *masterspec)
+				struct iommus_entry *entry)
 {
 	int i;
 	struct arm_smmu_master *master;
+	struct device *dev = smmu->dev;
 
-	master = find_smmu_master(smmu, masterspec->np);
+	master = find_smmu_master(smmu, entry->node);
 	if (master) {
 		dev_err(dev,
 			"rejecting multiple registrations for master device %s\n",
-			masterspec->np->name);
+			entry->node->name);
 		return -EBUSY;
 	}
 
-	if (masterspec->args_count > MAX_MASTER_STREAMIDS) {
+	if (entry->num_sids > MAX_MASTER_STREAMIDS) {
 		dev_err(dev,
 			"reached maximum number (%d) of stream IDs for master device %s\n",
-			MAX_MASTER_STREAMIDS, masterspec->np->name);
+			MAX_MASTER_STREAMIDS, entry->node->name);
 		return -ENOSPC;
 	}
 
@@ -564,15 +571,58 @@ static int register_smmu_master(struct arm_smmu_device *smmu,
 	if (!master)
 		return -ENOMEM;
 
-	master->of_node			= masterspec->np;
-	master->cfg.num_streamids	= masterspec->args_count;
+	master->of_node			= entry->node;
+	master->cfg.num_streamids	= entry->num_sids;
 
 	for (i = 0; i < master->cfg.num_streamids; ++i)
-		master->cfg.streamids[i] = masterspec->args[i];
+		master->cfg.streamids[i] = entry->streamids[i];
 
 	return insert_smmu_master(smmu, master);
 }
 
+static int arm_smmu_parse_iommus_properties(struct arm_smmu_device *smmu,
+					int *num_masters)
+{
+	struct of_phandle_args iommuspec;
+	struct device_node *dn;
+
+	for_each_node_with_property(dn, "iommus") {
+		int arg_ind = 0;
+		struct iommus_entry *entry, *n;
+		LIST_HEAD(iommus);
+
+		while (!of_parse_phandle_with_args(dn, "iommus", "#iommu-cells",
+							arg_ind, &iommuspec)) {
+			int i;
+
+			list_for_each_entry(entry, &iommus, list)
+				if (entry->node == dn)
+					break;
+			if (&entry->list == &iommus) {
+				entry = devm_kzalloc(smmu->dev, sizeof(*entry),
+						GFP_KERNEL);
+				if (!entry)
+					return -ENOMEM;
+				entry->node = dn;
+				list_add(&entry->list, &iommus);
+			}
+			entry->num_sids = iommuspec.args_count;
+			for (i = 0; i < entry->num_sids; ++i)
+				entry->streamids[i] = iommuspec.args[i];
+			arg_ind++;
+		}
+
+		list_for_each_entry_safe(entry, n, &iommus, list) {
+			register_smmu_master(smmu, entry);
+			(*num_masters)++;
+			list_del(&entry->list);
+			devm_kfree(smmu->dev, entry);
+		}
+	}
+
+	return 0;
+}
+
 static struct arm_smmu_device *find_smmu_for_device(struct device *dev)
 {
 	struct arm_smmu_device *smmu;
@@ -2196,8 +2246,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 	struct arm_smmu_device *smmu;
 	struct device *dev = &pdev->dev;
 	struct rb_node *node;
-	struct of_phandle_args masterspec;
-	int num_irqs, i, err;
+	int num_irqs, i, err, num_masters;
 
 	smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
 	if (!smmu) {
@@ -2251,19 +2300,11 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 
 	i = 0;
 	smmu->masters = RB_ROOT;
-	while (!of_parse_phandle_with_args(dev->of_node, "mmu-masters",
-					   "#stream-id-cells", i,
-					   &masterspec)) {
-		err = register_smmu_master(smmu, dev, &masterspec);
-		if (err) {
-			dev_err(dev, "failed to add master %s\n",
-				masterspec.np->name);
-			goto out_put_masters;
-		}
+	err = arm_smmu_parse_iommus_properties(smmu, &num_masters);
+	if (err)
+		goto out_put_masters;
 
-		i++;
-	}
-	dev_notice(dev, "registered %d master devices\n", i);
+	dev_notice(dev, "registered %d master devices\n", num_masters);
 
 	err = arm_smmu_init_regulators(smmu);
 	if (err)
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 5/6] iommu/arm-smmu: support buggy implementations with invalidate-on-map
  2014-08-13  0:51 ` Mitchel Humpherys
@ 2014-08-13  0:51     ` Mitchel Humpherys
  -1 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13  0:51 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Will Deacon,
	devicetree-u79uwXL29TY76Z2rM5mHXA

Add a workaround for some buggy hardware that requires a TLB invalidate
operation to occur at map time. Activate the feature with the
qcom,smmu-invalidate-on-map boolean DT property.

Signed-off-by: Mitchel Humpherys <mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
---
 Documentation/devicetree/bindings/iommu/arm,smmu.txt |  4 ++++
 drivers/iommu/arm-smmu.c                             | 14 +++++++++++++-
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index dbc1ddad79..aaebeaeda0 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -48,6 +48,10 @@ conditions.
                   aliases of secure registers have to be used during
                   SMMU configuration.
 
+- qcom,smmu-invalidate-on-map : Enable proper handling of buggy
+                  implementations that require a TLB invalidate
+                  operation to occur at map time.
+
 - clocks        : List of clocks to be used during SMMU register access. See
                   Documentation/devicetree/bindings/clock/clock-bindings.txt
                   for information about the format. For each clock specified
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 22e25f3172..73d056668b 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -389,6 +389,7 @@ struct arm_smmu_device {
 	u32				features;
 
 #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
+#define ARM_SMMU_OPT_INVALIDATE_ON_MAP (1 << 1)
 	u32				options;
 	int				version;
 
@@ -455,6 +456,7 @@ struct arm_smmu_option_prop {
 
 static struct arm_smmu_option_prop arm_smmu_options[] = {
 	{ ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
+	{ ARM_SMMU_OPT_INVALIDATE_ON_MAP, "qcom,smmu-invalidate-on-map" },
 	{ 0, NULL},
 };
 
@@ -1693,12 +1695,22 @@ out_unlock:
 static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
 			phys_addr_t paddr, size_t size, int prot)
 {
+	int ret;
 	struct arm_smmu_domain *smmu_domain = domain->priv;
 
 	if (!smmu_domain)
 		return -ENODEV;
 
-	return arm_smmu_handle_mapping(smmu_domain, iova, paddr, size, prot);
+	ret = arm_smmu_handle_mapping(smmu_domain, iova, paddr, size, prot);
+
+	if (!ret &&
+		(smmu_domain->smmu->options & ARM_SMMU_OPT_INVALIDATE_ON_MAP)) {
+		arm_smmu_enable_clocks(smmu_domain->smmu);
+		arm_smmu_tlb_inv_context(smmu_domain);
+		arm_smmu_disable_clocks(smmu_domain->smmu);
+	}
+
+	return ret;
 }
 
 static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 5/6] iommu/arm-smmu: support buggy implementations with invalidate-on-map
@ 2014-08-13  0:51     ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13  0:51 UTC (permalink / raw)
  To: linux-arm-kernel

Add a workaround for some buggy hardware that requires a TLB invalidate
operation to occur at map time. Activate the feature with the
qcom,smmu-invalidate-on-map boolean DT property.

Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
---
 Documentation/devicetree/bindings/iommu/arm,smmu.txt |  4 ++++
 drivers/iommu/arm-smmu.c                             | 14 +++++++++++++-
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index dbc1ddad79..aaebeaeda0 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -48,6 +48,10 @@ conditions.
                   aliases of secure registers have to be used during
                   SMMU configuration.
 
+- qcom,smmu-invalidate-on-map : Enable proper handling of buggy
+                  implementations that require a TLB invalidate
+                  operation to occur at map time.
+
 - clocks        : List of clocks to be used during SMMU register access. See
                   Documentation/devicetree/bindings/clock/clock-bindings.txt
                   for information about the format. For each clock specified
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 22e25f3172..73d056668b 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -389,6 +389,7 @@ struct arm_smmu_device {
 	u32				features;
 
 #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
+#define ARM_SMMU_OPT_INVALIDATE_ON_MAP (1 << 1)
 	u32				options;
 	int				version;
 
@@ -455,6 +456,7 @@ struct arm_smmu_option_prop {
 
 static struct arm_smmu_option_prop arm_smmu_options[] = {
 	{ ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
+	{ ARM_SMMU_OPT_INVALIDATE_ON_MAP, "qcom,smmu-invalidate-on-map" },
 	{ 0, NULL},
 };
 
@@ -1693,12 +1695,22 @@ out_unlock:
 static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
 			phys_addr_t paddr, size_t size, int prot)
 {
+	int ret;
 	struct arm_smmu_domain *smmu_domain = domain->priv;
 
 	if (!smmu_domain)
 		return -ENODEV;
 
-	return arm_smmu_handle_mapping(smmu_domain, iova, paddr, size, prot);
+	ret = arm_smmu_handle_mapping(smmu_domain, iova, paddr, size, prot);
+
+	if (!ret &&
+		(smmu_domain->smmu->options & ARM_SMMU_OPT_INVALIDATE_ON_MAP)) {
+		arm_smmu_enable_clocks(smmu_domain->smmu);
+		arm_smmu_tlb_inv_context(smmu_domain);
+		arm_smmu_disable_clocks(smmu_domain->smmu);
+	}
+
+	return ret;
 }
 
 static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 6/6] iommu/arm-smmu: add .domain_{set, get}_attr for coherent walk control
  2014-08-13  0:51 ` Mitchel Humpherys
@ 2014-08-13  0:51     ` Mitchel Humpherys
  -1 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13  0:51 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Will Deacon,
	devicetree-u79uwXL29TY76Z2rM5mHXA

Under certain conditions coherent hardware translation table walks can
result in degraded performance. Add a new domain attribute to
disable/enable this feature in generic code along with the domain
attribute setter and getter to handle it in the ARM SMMU driver.

Signed-off-by: Mitchel Humpherys <mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
---
 drivers/iommu/arm-smmu.c | 57 +++++++++++++++++++++++++++++++-----------------
 include/linux/iommu.h    |  1 +
 2 files changed, 38 insertions(+), 20 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 73d056668b..11672a8371 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -426,6 +426,7 @@ struct arm_smmu_cfg {
 	u8				irptndx;
 	u32				cbar;
 	pgd_t				*pgd;
+	bool				htw_disable;
 };
 #define INVALID_IRPTNDX			0xff
 
@@ -833,14 +834,17 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
 	return IRQ_HANDLED;
 }
 
-static void arm_smmu_flush_pgtable(struct arm_smmu_device *smmu, void *addr,
-				   size_t size)
+static void arm_smmu_flush_pgtable(struct arm_smmu_domain *smmu_domain,
+				   void *addr, size_t size)
 {
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	unsigned long offset = (unsigned long)addr & ~PAGE_MASK;
 
 
 	/* Ensure new page tables are visible to the hardware walker */
-	if (smmu->features & ARM_SMMU_FEAT_COHERENT_WALK) {
+	if ((smmu->features & ARM_SMMU_FEAT_COHERENT_WALK) &&
+		!cfg->htw_disable) {
 		dsb(ishst);
 	} else {
 		/*
@@ -943,7 +947,7 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain)
 	}
 
 	/* TTBR0 */
-	arm_smmu_flush_pgtable(smmu, cfg->pgd,
+	arm_smmu_flush_pgtable(smmu_domain, cfg->pgd,
 			       PTRS_PER_PGD * sizeof(pgd_t));
 	reg = __pa(cfg->pgd);
 	writel_relaxed(reg, cb_base + ARM_SMMU_CB_TTBR0_LO);
@@ -1468,7 +1472,8 @@ static bool arm_smmu_pte_is_contiguous_range(unsigned long addr,
 		(addr + ARM_SMMU_PTE_CONT_SIZE <= end);
 }
 
-static int arm_smmu_alloc_init_pte(struct arm_smmu_device *smmu, pmd_t *pmd,
+static int arm_smmu_alloc_init_pte(struct arm_smmu_domain *smmu_domain,
+				   pmd_t *pmd,
 				   unsigned long addr, unsigned long end,
 				   unsigned long pfn, int prot, int stage)
 {
@@ -1482,9 +1487,10 @@ static int arm_smmu_alloc_init_pte(struct arm_smmu_device *smmu, pmd_t *pmd,
 		if (!table)
 			return -ENOMEM;
 
-		arm_smmu_flush_pgtable(smmu, page_address(table), PAGE_SIZE);
+		arm_smmu_flush_pgtable(smmu_domain, page_address(table),
+				PAGE_SIZE);
 		pmd_populate(NULL, pmd, table);
-		arm_smmu_flush_pgtable(smmu, pmd, sizeof(*pmd));
+		arm_smmu_flush_pgtable(smmu_domain, pmd, sizeof(*pmd));
 	}
 
 	if (stage == 1) {
@@ -1558,7 +1564,7 @@ static int arm_smmu_alloc_init_pte(struct arm_smmu_device *smmu, pmd_t *pmd,
 				pte_val(*(cont_start + j)) &=
 					~ARM_SMMU_PTE_CONT;
 
-			arm_smmu_flush_pgtable(smmu, cont_start,
+			arm_smmu_flush_pgtable(smmu_domain, cont_start,
 					       sizeof(*pte) *
 					       ARM_SMMU_PTE_CONT_ENTRIES);
 		}
@@ -1568,11 +1574,13 @@ static int arm_smmu_alloc_init_pte(struct arm_smmu_device *smmu, pmd_t *pmd,
 		} while (pte++, pfn++, addr += PAGE_SIZE, --i);
 	} while (addr != end);
 
-	arm_smmu_flush_pgtable(smmu, start, sizeof(*pte) * (pte - start));
+	arm_smmu_flush_pgtable(smmu_domain, start,
+			sizeof(*pte) * (pte - start));
 	return 0;
 }
 
-static int arm_smmu_alloc_init_pmd(struct arm_smmu_device *smmu, pud_t *pud,
+static int arm_smmu_alloc_init_pmd(struct arm_smmu_domain *smmu_domain,
+				   pud_t *pud,
 				   unsigned long addr, unsigned long end,
 				   phys_addr_t phys, int prot, int stage)
 {
@@ -1586,9 +1594,9 @@ static int arm_smmu_alloc_init_pmd(struct arm_smmu_device *smmu, pud_t *pud,
 		if (!pmd)
 			return -ENOMEM;
 
-		arm_smmu_flush_pgtable(smmu, pmd, PAGE_SIZE);
+		arm_smmu_flush_pgtable(smmu_domain, pmd, PAGE_SIZE);
 		pud_populate(NULL, pud, pmd);
-		arm_smmu_flush_pgtable(smmu, pud, sizeof(*pud));
+		arm_smmu_flush_pgtable(smmu_domain, pud, sizeof(*pud));
 
 		pmd += pmd_index(addr);
 	} else
@@ -1597,7 +1605,7 @@ static int arm_smmu_alloc_init_pmd(struct arm_smmu_device *smmu, pud_t *pud,
 
 	do {
 		next = pmd_addr_end(addr, end);
-		ret = arm_smmu_alloc_init_pte(smmu, pmd, addr, next, pfn,
+		ret = arm_smmu_alloc_init_pte(smmu_domain, pmd, addr, next, pfn,
 					      prot, stage);
 		phys += next - addr;
 	} while (pmd++, addr = next, addr < end);
@@ -1605,7 +1613,8 @@ static int arm_smmu_alloc_init_pmd(struct arm_smmu_device *smmu, pud_t *pud,
 	return ret;
 }
 
-static int arm_smmu_alloc_init_pud(struct arm_smmu_device *smmu, pgd_t *pgd,
+static int arm_smmu_alloc_init_pud(struct arm_smmu_domain *smmu_domain,
+				   pgd_t *pgd,
 				   unsigned long addr, unsigned long end,
 				   phys_addr_t phys, int prot, int stage)
 {
@@ -1619,9 +1628,9 @@ static int arm_smmu_alloc_init_pud(struct arm_smmu_device *smmu, pgd_t *pgd,
 		if (!pud)
 			return -ENOMEM;
 
-		arm_smmu_flush_pgtable(smmu, pud, PAGE_SIZE);
+		arm_smmu_flush_pgtable(smmu_domain, pud, PAGE_SIZE);
 		pgd_populate(NULL, pgd, pud);
-		arm_smmu_flush_pgtable(smmu, pgd, sizeof(*pgd));
+		arm_smmu_flush_pgtable(smmu_domain, pgd, sizeof(*pgd));
 
 		pud += pud_index(addr);
 	} else
@@ -1630,8 +1639,8 @@ static int arm_smmu_alloc_init_pud(struct arm_smmu_device *smmu, pgd_t *pgd,
 
 	do {
 		next = pud_addr_end(addr, end);
-		ret = arm_smmu_alloc_init_pmd(smmu, pud, addr, next, phys,
-					      prot, stage);
+		ret = arm_smmu_alloc_init_pmd(smmu_domain, pud, addr, next,
+					      phys, prot, stage);
 		phys += next - addr;
 	} while (pud++, addr = next, addr < end);
 
@@ -1677,8 +1686,8 @@ static int arm_smmu_handle_mapping(struct arm_smmu_domain *smmu_domain,
 	do {
 		unsigned long next = pgd_addr_end(iova, end);
 
-		ret = arm_smmu_alloc_init_pud(smmu, pgd, iova, next, paddr,
-					      prot, stage);
+		ret = arm_smmu_alloc_init_pud(smmu_domain, pgd, iova, next,
+					      paddr, prot, stage);
 		if (ret)
 			goto out_unlock;
 
@@ -1908,11 +1917,15 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 				    enum iommu_attr attr, void *data)
 {
 	struct arm_smmu_domain *smmu_domain = domain->priv;
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
 
 	switch (attr) {
 	case DOMAIN_ATTR_NESTING:
 		*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
 		return 0;
+	case DOMAIN_ATTR_COHERENT_HTW_DISABLE:
+		*((bool *)data) = cfg->htw_disable;
+		return 0;
 	default:
 		return -ENODEV;
 	}
@@ -1922,6 +1935,7 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
 				    enum iommu_attr attr, void *data)
 {
 	struct arm_smmu_domain *smmu_domain = domain->priv;
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
 
 	switch (attr) {
 	case DOMAIN_ATTR_NESTING:
@@ -1933,6 +1947,9 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
 			smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
 
 		return 0;
+	case DOMAIN_ATTR_COHERENT_HTW_DISABLE:
+		cfg->htw_disable = *((bool *)data);
+		return 0;
 	default:
 		return -ENODEV;
 	}
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 0550286df4..8a6449857a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -81,6 +81,7 @@ enum iommu_attr {
 	DOMAIN_ATTR_FSL_PAMU_ENABLE,
 	DOMAIN_ATTR_FSL_PAMUV1,
 	DOMAIN_ATTR_NESTING,	/* two stages of translation */
+	DOMAIN_ATTR_COHERENT_HTW_DISABLE,
 	DOMAIN_ATTR_MAX,
 };
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 6/6] iommu/arm-smmu: add .domain_{set, get}_attr for coherent walk control
@ 2014-08-13  0:51     ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13  0:51 UTC (permalink / raw)
  To: linux-arm-kernel

Under certain conditions coherent hardware translation table walks can
result in degraded performance. Add a new domain attribute to
disable/enable this feature in generic code along with the domain
attribute setter and getter to handle it in the ARM SMMU driver.

Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
---
 drivers/iommu/arm-smmu.c | 57 +++++++++++++++++++++++++++++++-----------------
 include/linux/iommu.h    |  1 +
 2 files changed, 38 insertions(+), 20 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 73d056668b..11672a8371 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -426,6 +426,7 @@ struct arm_smmu_cfg {
 	u8				irptndx;
 	u32				cbar;
 	pgd_t				*pgd;
+	bool				htw_disable;
 };
 #define INVALID_IRPTNDX			0xff
 
@@ -833,14 +834,17 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
 	return IRQ_HANDLED;
 }
 
-static void arm_smmu_flush_pgtable(struct arm_smmu_device *smmu, void *addr,
-				   size_t size)
+static void arm_smmu_flush_pgtable(struct arm_smmu_domain *smmu_domain,
+				   void *addr, size_t size)
 {
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	unsigned long offset = (unsigned long)addr & ~PAGE_MASK;
 
 
 	/* Ensure new page tables are visible to the hardware walker */
-	if (smmu->features & ARM_SMMU_FEAT_COHERENT_WALK) {
+	if ((smmu->features & ARM_SMMU_FEAT_COHERENT_WALK) &&
+		!cfg->htw_disable) {
 		dsb(ishst);
 	} else {
 		/*
@@ -943,7 +947,7 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain)
 	}
 
 	/* TTBR0 */
-	arm_smmu_flush_pgtable(smmu, cfg->pgd,
+	arm_smmu_flush_pgtable(smmu_domain, cfg->pgd,
 			       PTRS_PER_PGD * sizeof(pgd_t));
 	reg = __pa(cfg->pgd);
 	writel_relaxed(reg, cb_base + ARM_SMMU_CB_TTBR0_LO);
@@ -1468,7 +1472,8 @@ static bool arm_smmu_pte_is_contiguous_range(unsigned long addr,
 		(addr + ARM_SMMU_PTE_CONT_SIZE <= end);
 }
 
-static int arm_smmu_alloc_init_pte(struct arm_smmu_device *smmu, pmd_t *pmd,
+static int arm_smmu_alloc_init_pte(struct arm_smmu_domain *smmu_domain,
+				   pmd_t *pmd,
 				   unsigned long addr, unsigned long end,
 				   unsigned long pfn, int prot, int stage)
 {
@@ -1482,9 +1487,10 @@ static int arm_smmu_alloc_init_pte(struct arm_smmu_device *smmu, pmd_t *pmd,
 		if (!table)
 			return -ENOMEM;
 
-		arm_smmu_flush_pgtable(smmu, page_address(table), PAGE_SIZE);
+		arm_smmu_flush_pgtable(smmu_domain, page_address(table),
+				PAGE_SIZE);
 		pmd_populate(NULL, pmd, table);
-		arm_smmu_flush_pgtable(smmu, pmd, sizeof(*pmd));
+		arm_smmu_flush_pgtable(smmu_domain, pmd, sizeof(*pmd));
 	}
 
 	if (stage == 1) {
@@ -1558,7 +1564,7 @@ static int arm_smmu_alloc_init_pte(struct arm_smmu_device *smmu, pmd_t *pmd,
 				pte_val(*(cont_start + j)) &=
 					~ARM_SMMU_PTE_CONT;
 
-			arm_smmu_flush_pgtable(smmu, cont_start,
+			arm_smmu_flush_pgtable(smmu_domain, cont_start,
 					       sizeof(*pte) *
 					       ARM_SMMU_PTE_CONT_ENTRIES);
 		}
@@ -1568,11 +1574,13 @@ static int arm_smmu_alloc_init_pte(struct arm_smmu_device *smmu, pmd_t *pmd,
 		} while (pte++, pfn++, addr += PAGE_SIZE, --i);
 	} while (addr != end);
 
-	arm_smmu_flush_pgtable(smmu, start, sizeof(*pte) * (pte - start));
+	arm_smmu_flush_pgtable(smmu_domain, start,
+			sizeof(*pte) * (pte - start));
 	return 0;
 }
 
-static int arm_smmu_alloc_init_pmd(struct arm_smmu_device *smmu, pud_t *pud,
+static int arm_smmu_alloc_init_pmd(struct arm_smmu_domain *smmu_domain,
+				   pud_t *pud,
 				   unsigned long addr, unsigned long end,
 				   phys_addr_t phys, int prot, int stage)
 {
@@ -1586,9 +1594,9 @@ static int arm_smmu_alloc_init_pmd(struct arm_smmu_device *smmu, pud_t *pud,
 		if (!pmd)
 			return -ENOMEM;
 
-		arm_smmu_flush_pgtable(smmu, pmd, PAGE_SIZE);
+		arm_smmu_flush_pgtable(smmu_domain, pmd, PAGE_SIZE);
 		pud_populate(NULL, pud, pmd);
-		arm_smmu_flush_pgtable(smmu, pud, sizeof(*pud));
+		arm_smmu_flush_pgtable(smmu_domain, pud, sizeof(*pud));
 
 		pmd += pmd_index(addr);
 	} else
@@ -1597,7 +1605,7 @@ static int arm_smmu_alloc_init_pmd(struct arm_smmu_device *smmu, pud_t *pud,
 
 	do {
 		next = pmd_addr_end(addr, end);
-		ret = arm_smmu_alloc_init_pte(smmu, pmd, addr, next, pfn,
+		ret = arm_smmu_alloc_init_pte(smmu_domain, pmd, addr, next, pfn,
 					      prot, stage);
 		phys += next - addr;
 	} while (pmd++, addr = next, addr < end);
@@ -1605,7 +1613,8 @@ static int arm_smmu_alloc_init_pmd(struct arm_smmu_device *smmu, pud_t *pud,
 	return ret;
 }
 
-static int arm_smmu_alloc_init_pud(struct arm_smmu_device *smmu, pgd_t *pgd,
+static int arm_smmu_alloc_init_pud(struct arm_smmu_domain *smmu_domain,
+				   pgd_t *pgd,
 				   unsigned long addr, unsigned long end,
 				   phys_addr_t phys, int prot, int stage)
 {
@@ -1619,9 +1628,9 @@ static int arm_smmu_alloc_init_pud(struct arm_smmu_device *smmu, pgd_t *pgd,
 		if (!pud)
 			return -ENOMEM;
 
-		arm_smmu_flush_pgtable(smmu, pud, PAGE_SIZE);
+		arm_smmu_flush_pgtable(smmu_domain, pud, PAGE_SIZE);
 		pgd_populate(NULL, pgd, pud);
-		arm_smmu_flush_pgtable(smmu, pgd, sizeof(*pgd));
+		arm_smmu_flush_pgtable(smmu_domain, pgd, sizeof(*pgd));
 
 		pud += pud_index(addr);
 	} else
@@ -1630,8 +1639,8 @@ static int arm_smmu_alloc_init_pud(struct arm_smmu_device *smmu, pgd_t *pgd,
 
 	do {
 		next = pud_addr_end(addr, end);
-		ret = arm_smmu_alloc_init_pmd(smmu, pud, addr, next, phys,
-					      prot, stage);
+		ret = arm_smmu_alloc_init_pmd(smmu_domain, pud, addr, next,
+					      phys, prot, stage);
 		phys += next - addr;
 	} while (pud++, addr = next, addr < end);
 
@@ -1677,8 +1686,8 @@ static int arm_smmu_handle_mapping(struct arm_smmu_domain *smmu_domain,
 	do {
 		unsigned long next = pgd_addr_end(iova, end);
 
-		ret = arm_smmu_alloc_init_pud(smmu, pgd, iova, next, paddr,
-					      prot, stage);
+		ret = arm_smmu_alloc_init_pud(smmu_domain, pgd, iova, next,
+					      paddr, prot, stage);
 		if (ret)
 			goto out_unlock;
 
@@ -1908,11 +1917,15 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 				    enum iommu_attr attr, void *data)
 {
 	struct arm_smmu_domain *smmu_domain = domain->priv;
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
 
 	switch (attr) {
 	case DOMAIN_ATTR_NESTING:
 		*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
 		return 0;
+	case DOMAIN_ATTR_COHERENT_HTW_DISABLE:
+		*((bool *)data) = cfg->htw_disable;
+		return 0;
 	default:
 		return -ENODEV;
 	}
@@ -1922,6 +1935,7 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
 				    enum iommu_attr attr, void *data)
 {
 	struct arm_smmu_domain *smmu_domain = domain->priv;
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
 
 	switch (attr) {
 	case DOMAIN_ATTR_NESTING:
@@ -1933,6 +1947,9 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
 			smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
 
 		return 0;
+	case DOMAIN_ATTR_COHERENT_HTW_DISABLE:
+		cfg->htw_disable = *((bool *)data);
+		return 0;
 	default:
 		return -ENODEV;
 	}
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 0550286df4..8a6449857a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -81,6 +81,7 @@ enum iommu_attr {
 	DOMAIN_ATTR_FSL_PAMU_ENABLE,
 	DOMAIN_ATTR_FSL_PAMUV1,
 	DOMAIN_ATTR_NESTING,	/* two stages of translation */
+	DOMAIN_ATTR_COHERENT_HTW_DISABLE,
 	DOMAIN_ATTR_MAX,
 };
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH 4/6] iommu/arm-smmu: implement generic DT bindings
  2014-08-13  0:51     ` Mitchel Humpherys
@ 2014-08-13 16:53         ` Mitchel Humpherys
  -1 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13 16:53 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Will Deacon

On Tue, Aug 12 2014 at 05:51:37 PM, Mitchel Humpherys <mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote:
> Generic IOMMU device tree bindings were recently added in
> ["devicetree: Add generic IOMMU device tree bindings"]. Implement the
> bindings in the ARM SMMU driver.
>
> See Documentation/devicetree/bindings/iommu/iommu.txt for the bindings
> themselves.
>
> Signed-off-by: Mitchel Humpherys <mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
> ---
>  drivers/iommu/arm-smmu.c | 87 +++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 64 insertions(+), 23 deletions(-)

Ah just realized I didn't update the bindings documentation in
arm,smmu.txt. Will update in v2 along with any other feedback.


-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 4/6] iommu/arm-smmu: implement generic DT bindings
@ 2014-08-13 16:53         ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13 16:53 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 12 2014 at 05:51:37 PM, Mitchel Humpherys <mitchelh@codeaurora.org> wrote:
> Generic IOMMU device tree bindings were recently added in
> ["devicetree: Add generic IOMMU device tree bindings"]. Implement the
> bindings in the ARM SMMU driver.
>
> See Documentation/devicetree/bindings/iommu/iommu.txt for the bindings
> themselves.
>
> Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
> ---
>  drivers/iommu/arm-smmu.c | 87 +++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 64 insertions(+), 23 deletions(-)

Ah just realized I didn't update the bindings documentation in
arm,smmu.txt. Will update in v2 along with any other feedback.


-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 0/6] iommu/arm-smmu: misc features, new DT bindings
  2014-08-13  0:51 ` Mitchel Humpherys
@ 2014-08-13 17:22     ` Mitchel Humpherys
  -1 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13 17:22 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Will Deacon

On Tue, Aug 12 2014 at 05:51:33 PM, Mitchel Humpherys <mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote:
> This series is based on on Will's iommu/pci branch.

Incredibly, I also neglected to base this on top of Olav's recent patch
("iommu/arm-smmu: Do not access non-existing SMR registers")! I will do
that in v2 after review feedback.


-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 0/6] iommu/arm-smmu: misc features, new DT bindings
@ 2014-08-13 17:22     ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13 17:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 12 2014 at 05:51:33 PM, Mitchel Humpherys <mitchelh@codeaurora.org> wrote:
> This series is based on on Will's iommu/pci branch.

Incredibly, I also neglected to base this on top of Olav's recent patch
("iommu/arm-smmu: Do not access non-existing SMR registers")! I will do
that in v2 after review feedback.


-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
  2014-08-13  0:51     ` Mitchel Humpherys
@ 2014-08-13 21:07         ` Mitchel Humpherys
  -1 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13 21:07 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Will Deacon

Well hopefully this isn't too Nick Krouse-esque, but I have some
comments on my own patch below. I sat on these for a few days but have
noticed a few things after testing on another platform...

On Tue, Aug 12 2014 at 05:51:34 PM, Mitchel Humpherys <mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote:
> On some platforms with tight power constraints it is polite to only
> leave your clocks on for as long as you absolutely need them. Currently
> we assume that all clocks necessary for SMMU register access are always
> on.
>
> Add some optional device tree properties to specify any clocks that are
> necessary for SMMU register access and turn them on and off as needed.
>
> If no clocks are specified in the device tree things continue to work
> the way they always have: we assume all necessary clocks are always
> turned on.
>
> Signed-off-by: Mitchel Humpherys <mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
> ---
>  .../devicetree/bindings/iommu/arm,smmu.txt         |  11 ++
>  drivers/iommu/arm-smmu.c                           | 127 +++++++++++++++++++--
>  2 files changed, 129 insertions(+), 9 deletions(-)

[...]

> -static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
> +static int arm_smmu_init_clocks(struct arm_smmu_device *smmu)
> +{
> +	const char *cname;
> +	struct property *prop;
> +	int i;
> +	struct device *dev = smmu->dev;
> +
> +	smmu->num_clocks = of_property_count_strings(dev->of_node,
> +						"clock-names");
> +
> +	if (!smmu->num_clocks)
> +		return 0;
> +
> +	smmu->clocks = devm_kzalloc(
> +		dev, sizeof(*smmu->clocks) * smmu->num_clocks,
> +		GFP_KERNEL);
> +
> +	if (!smmu->clocks) {
> +		dev_err(dev,
> +			"Failed to allocate memory for clocks\n");
> +		return -ENODEV;
> +	}
> +
> +	i = 0;
> +	of_property_for_each_string(dev->of_node, "clock-names",
> +				prop, cname) {
> +		struct clk *c = devm_clk_get(dev, cname);
> +		if (IS_ERR(c)) {
> +			dev_err(dev, "Couldn't get clock: %s",
> +				cname);
> +			return -ENODEV;
> +		}
> +
> +		if (clk_get_rate(c) == 0) {
> +			long rate = clk_round_rate(c, 1000);
> +			clk_set_rate(c, rate);
> +		}
> +
> +		smmu->clocks[i] = c;
> +
> +		++i;
> +	}
> +	return 0;
> +}
> +
> +int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)

The `static' was dropped unintentionally here.


-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
@ 2014-08-13 21:07         ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13 21:07 UTC (permalink / raw)
  To: linux-arm-kernel

Well hopefully this isn't too Nick Krouse-esque, but I have some
comments on my own patch below. I sat on these for a few days but have
noticed a few things after testing on another platform...

On Tue, Aug 12 2014 at 05:51:34 PM, Mitchel Humpherys <mitchelh@codeaurora.org> wrote:
> On some platforms with tight power constraints it is polite to only
> leave your clocks on for as long as you absolutely need them. Currently
> we assume that all clocks necessary for SMMU register access are always
> on.
>
> Add some optional device tree properties to specify any clocks that are
> necessary for SMMU register access and turn them on and off as needed.
>
> If no clocks are specified in the device tree things continue to work
> the way they always have: we assume all necessary clocks are always
> turned on.
>
> Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
> ---
>  .../devicetree/bindings/iommu/arm,smmu.txt         |  11 ++
>  drivers/iommu/arm-smmu.c                           | 127 +++++++++++++++++++--
>  2 files changed, 129 insertions(+), 9 deletions(-)

[...]

> -static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
> +static int arm_smmu_init_clocks(struct arm_smmu_device *smmu)
> +{
> +	const char *cname;
> +	struct property *prop;
> +	int i;
> +	struct device *dev = smmu->dev;
> +
> +	smmu->num_clocks = of_property_count_strings(dev->of_node,
> +						"clock-names");
> +
> +	if (!smmu->num_clocks)
> +		return 0;
> +
> +	smmu->clocks = devm_kzalloc(
> +		dev, sizeof(*smmu->clocks) * smmu->num_clocks,
> +		GFP_KERNEL);
> +
> +	if (!smmu->clocks) {
> +		dev_err(dev,
> +			"Failed to allocate memory for clocks\n");
> +		return -ENODEV;
> +	}
> +
> +	i = 0;
> +	of_property_for_each_string(dev->of_node, "clock-names",
> +				prop, cname) {
> +		struct clk *c = devm_clk_get(dev, cname);
> +		if (IS_ERR(c)) {
> +			dev_err(dev, "Couldn't get clock: %s",
> +				cname);
> +			return -ENODEV;
> +		}
> +
> +		if (clk_get_rate(c) == 0) {
> +			long rate = clk_round_rate(c, 1000);
> +			clk_set_rate(c, rate);
> +		}
> +
> +		smmu->clocks[i] = c;
> +
> +		++i;
> +	}
> +	return 0;
> +}
> +
> +int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)

The `static' was dropped unintentionally here.


-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 2/6] iommu/arm-smmu: add support for specifying regulators
  2014-08-13  0:51     ` Mitchel Humpherys
@ 2014-08-13 21:17         ` Mitchel Humpherys
  -1 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13 21:17 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Will Deacon

On Tue, Aug 12 2014 at 05:51:35 PM, Mitchel Humpherys <mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote:
> On some power-constrained platforms it's useful to disable power when a
> device is not in use. Add support for specifying regulators for SMMUs
> and only leave power on as long as the SMMU is in use (attached).
>
> Signed-off-by: Mitchel Humpherys <mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
> ---
>  .../devicetree/bindings/iommu/arm,smmu.txt         |   3 +
>  drivers/iommu/arm-smmu.c                           | 102 ++++++++++++++++++---
>  2 files changed, 93 insertions(+), 12 deletions(-)

[...]

> @@ -2124,13 +2192,19 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>  	}
>  	dev_notice(dev, "registered %d master devices\n", i);
>  
> +	err = arm_smmu_init_regulators(smmu);
> +	if (err)
> +		goto out_put_masters;
> +
>  	err = arm_smmu_init_clocks(smmu);
>  	if (err)
>  		goto out_put_masters;
>  
> +	arm_smmu_enable_regulators(smmu);
>  	arm_smmu_enable_clocks(smmu);
> -
>  	err = arm_smmu_device_cfg_probe(smmu);
> +	arm_smmu_disable_clocks(smmu);
> +	arm_smmu_disable_regulators(smmu);
>  	if (err)
>  		goto out_disable_clocks;

The out_disable_clocks label can go away now that arm_smmu_device_reset
is done in arm_smmu_attach_dev.


>  
> @@ -2163,8 +2237,6 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>  	list_add(&smmu->list, &arm_smmu_devices);
>  	spin_unlock(&arm_smmu_devices_lock);
>  
> -	arm_smmu_device_reset(smmu);
> -	arm_smmu_disable_clocks(smmu);
>  	return 0;
>  
>  out_free_irqs:
> @@ -2173,6 +2245,7 @@ out_free_irqs:
>  
>  out_disable_clocks:
>  	arm_smmu_disable_clocks(smmu);
> +	arm_smmu_disable_regulators(smmu);
>  
>  out_put_masters:
>  	for (node = rb_first(&smmu->masters); node; node = rb_next(node)) {



-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 2/6] iommu/arm-smmu: add support for specifying regulators
@ 2014-08-13 21:17         ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-13 21:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 12 2014 at 05:51:35 PM, Mitchel Humpherys <mitchelh@codeaurora.org> wrote:
> On some power-constrained platforms it's useful to disable power when a
> device is not in use. Add support for specifying regulators for SMMUs
> and only leave power on as long as the SMMU is in use (attached).
>
> Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
> ---
>  .../devicetree/bindings/iommu/arm,smmu.txt         |   3 +
>  drivers/iommu/arm-smmu.c                           | 102 ++++++++++++++++++---
>  2 files changed, 93 insertions(+), 12 deletions(-)

[...]

> @@ -2124,13 +2192,19 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>  	}
>  	dev_notice(dev, "registered %d master devices\n", i);
>  
> +	err = arm_smmu_init_regulators(smmu);
> +	if (err)
> +		goto out_put_masters;
> +
>  	err = arm_smmu_init_clocks(smmu);
>  	if (err)
>  		goto out_put_masters;
>  
> +	arm_smmu_enable_regulators(smmu);
>  	arm_smmu_enable_clocks(smmu);
> -
>  	err = arm_smmu_device_cfg_probe(smmu);
> +	arm_smmu_disable_clocks(smmu);
> +	arm_smmu_disable_regulators(smmu);
>  	if (err)
>  		goto out_disable_clocks;

The out_disable_clocks label can go away now that arm_smmu_device_reset
is done in arm_smmu_attach_dev.


>  
> @@ -2163,8 +2237,6 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>  	list_add(&smmu->list, &arm_smmu_devices);
>  	spin_unlock(&arm_smmu_devices_lock);
>  
> -	arm_smmu_device_reset(smmu);
> -	arm_smmu_disable_clocks(smmu);
>  	return 0;
>  
>  out_free_irqs:
> @@ -2173,6 +2245,7 @@ out_free_irqs:
>  
>  out_disable_clocks:
>  	arm_smmu_disable_clocks(smmu);
> +	arm_smmu_disable_regulators(smmu);
>  
>  out_put_masters:
>  	for (node = rb_first(&smmu->masters); node; node = rb_next(node)) {



-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 0/6] iommu/arm-smmu: misc features, new DT bindings
  2014-08-13 17:22     ` Mitchel Humpherys
@ 2014-08-15 17:25         ` Will Deacon
  -1 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-08-15 17:25 UTC (permalink / raw)
  To: Mitchel Humpherys
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Wed, Aug 13, 2014 at 06:22:32PM +0100, Mitchel Humpherys wrote:
> On Tue, Aug 12 2014 at 05:51:33 PM, Mitchel Humpherys <mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote:
> > This series is based on on Will's iommu/pci branch.
> 
> Incredibly, I also neglected to base this on top of Olav's recent patch
> ("iommu/arm-smmu: Do not access non-existing SMR registers")! I will do
> that in v2 after review feedback.

No probs, this is for review rather than immediate merge so it's not the end
of the world (and the conflict should be rather simple).

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 0/6] iommu/arm-smmu: misc features, new DT bindings
@ 2014-08-15 17:25         ` Will Deacon
  0 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-08-15 17:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 13, 2014 at 06:22:32PM +0100, Mitchel Humpherys wrote:
> On Tue, Aug 12 2014 at 05:51:33 PM, Mitchel Humpherys <mitchelh@codeaurora.org> wrote:
> > This series is based on on Will's iommu/pci branch.
> 
> Incredibly, I also neglected to base this on top of Olav's recent patch
> ("iommu/arm-smmu: Do not access non-existing SMR registers")! I will do
> that in v2 after review feedback.

No probs, this is for review rather than immediate merge so it's not the end
of the world (and the conflict should be rather simple).

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 3/6] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
  2014-08-13  0:51     ` Mitchel Humpherys
@ 2014-08-19 12:44         ` Will Deacon
  -1 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-08-19 12:44 UTC (permalink / raw)
  To: Mitchel Humpherys
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Wed, Aug 13, 2014 at 01:51:36AM +0100, Mitchel Humpherys wrote:
> Currently, we provide the iommu_ops.iova_to_phys service by doing a
> table walk in software to translate IO virtual addresses to physical
> addresses. On SMMUs that support it, it can be useful to ask the SMMU
> itself to do the translation. This can be used to warm the TLBs for an
> SMMU. It can also be useful for testing and hardware validation.

I'm not really sold on the usefulness of this feature. If you want hardware
validation features, I'd rather do something through debugfs, but your
use-case for warming the TLBs is intriguing. Do you have an example use-case
with performance figures?

> Since the address translation registers are optional on SMMUv2, only
> enable hardware translations when using SMMUv1 or when SMMU_IDR0.S1TS=1
> and SMMU_IDR0.ATOSNS=0, as described in the ARM SMMU v1-v2 spec.

[...]

> +static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
> +					dma_addr_t iova)
> +{
> +	struct arm_smmu_domain *smmu_domain = domain->priv;
> +	struct arm_smmu_device *smmu = smmu_domain->smmu;
> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> +	struct device *dev = smmu->dev;
> +	void __iomem *cb_base;
> +	int count = 0;
> +	u64 phys;
> +
> +	arm_smmu_enable_clocks(smmu);
> +
> +	cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
> +
> +	if (smmu->version == 1) {
> +		u32 reg = iova & 0xFFFFF000;
> +		writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO);
> +	} else {
> +		u64 reg = iova & 0xfffffffffffff000;
> +		writeq_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO);

We don't have writeq for arch/arm/.

> +	}
> +
> +	mb();

Why?

> +	while (readl_relaxed(cb_base + ARM_SMMU_CB_ATSR) & ATSR_ACTIVE) {
> +		if (++count == ATSR_LOOP_TIMEOUT) {
> +			dev_err(dev,
> +				"iova to phys timed out on 0x%pa for %s. Falling back to software table walk.\n",
> +				&iova, dev_name(dev));
> +			arm_smmu_disable_clocks(smmu);
> +			return arm_smmu_iova_to_phys_soft(domain, iova);
> +		}
> +		cpu_relax();
> +	}

Do you know what happened to Olav's patches to make this sort of code
generic?

> @@ -2005,6 +2073,11 @@ int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
>  		return -ENODEV;
>  	}
>  
> +	if (smmu->version == 1 || (!(id & ID0_ATOSNS) && (id & ID0_S1TS))) {

Are you sure about this? The v2 spec says that is ATOSNS is clear then S1TS
is also clear.

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 3/6] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
@ 2014-08-19 12:44         ` Will Deacon
  0 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-08-19 12:44 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 13, 2014 at 01:51:36AM +0100, Mitchel Humpherys wrote:
> Currently, we provide the iommu_ops.iova_to_phys service by doing a
> table walk in software to translate IO virtual addresses to physical
> addresses. On SMMUs that support it, it can be useful to ask the SMMU
> itself to do the translation. This can be used to warm the TLBs for an
> SMMU. It can also be useful for testing and hardware validation.

I'm not really sold on the usefulness of this feature. If you want hardware
validation features, I'd rather do something through debugfs, but your
use-case for warming the TLBs is intriguing. Do you have an example use-case
with performance figures?

> Since the address translation registers are optional on SMMUv2, only
> enable hardware translations when using SMMUv1 or when SMMU_IDR0.S1TS=1
> and SMMU_IDR0.ATOSNS=0, as described in the ARM SMMU v1-v2 spec.

[...]

> +static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
> +					dma_addr_t iova)
> +{
> +	struct arm_smmu_domain *smmu_domain = domain->priv;
> +	struct arm_smmu_device *smmu = smmu_domain->smmu;
> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> +	struct device *dev = smmu->dev;
> +	void __iomem *cb_base;
> +	int count = 0;
> +	u64 phys;
> +
> +	arm_smmu_enable_clocks(smmu);
> +
> +	cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
> +
> +	if (smmu->version == 1) {
> +		u32 reg = iova & 0xFFFFF000;
> +		writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO);
> +	} else {
> +		u64 reg = iova & 0xfffffffffffff000;
> +		writeq_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO);

We don't have writeq for arch/arm/.

> +	}
> +
> +	mb();

Why?

> +	while (readl_relaxed(cb_base + ARM_SMMU_CB_ATSR) & ATSR_ACTIVE) {
> +		if (++count == ATSR_LOOP_TIMEOUT) {
> +			dev_err(dev,
> +				"iova to phys timed out on 0x%pa for %s. Falling back to software table walk.\n",
> +				&iova, dev_name(dev));
> +			arm_smmu_disable_clocks(smmu);
> +			return arm_smmu_iova_to_phys_soft(domain, iova);
> +		}
> +		cpu_relax();
> +	}

Do you know what happened to Olav's patches to make this sort of code
generic?

> @@ -2005,6 +2073,11 @@ int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
>  		return -ENODEV;
>  	}
>  
> +	if (smmu->version == 1 || (!(id & ID0_ATOSNS) && (id & ID0_S1TS))) {

Are you sure about this? The v2 spec says that is ATOSNS is clear then S1TS
is also clear.

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 6/6] iommu/arm-smmu: add .domain_{set,get}_attr for coherent walk control
  2014-08-13  0:51     ` Mitchel Humpherys
@ 2014-08-19 12:48         ` Will Deacon
  -1 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-08-19 12:48 UTC (permalink / raw)
  To: Mitchel Humpherys
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Wed, Aug 13, 2014 at 01:51:39AM +0100, Mitchel Humpherys wrote:
> Under certain conditions coherent hardware translation table walks can
> result in degraded performance. Add a new domain attribute to
> disable/enable this feature in generic code along with the domain
> attribute setter and getter to handle it in the ARM SMMU driver.

Again, it would be nice to have some information about these cases and the
performance issues that you are seeing.

> @@ -1908,11 +1917,15 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
>  				    enum iommu_attr attr, void *data)
>  {
>  	struct arm_smmu_domain *smmu_domain = domain->priv;
> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
>  
>  	switch (attr) {
>  	case DOMAIN_ATTR_NESTING:
>  		*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
>  		return 0;
> +	case DOMAIN_ATTR_COHERENT_HTW_DISABLE:
> +		*((bool *)data) = cfg->htw_disable;
> +		return 0;

I think I'd be more comfortable using int instead of bool for this, as it
could well end up in the user ABI if vfio decides to make use of it. While
we're here, let's also add an attributes bitmap to the arm_smmu_domain
instead of having a bool in the arm_smmu_cfg.

>  	default:
>  		return -ENODEV;
>  	}
> @@ -1922,6 +1935,7 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
>  				    enum iommu_attr attr, void *data)
>  {
>  	struct arm_smmu_domain *smmu_domain = domain->priv;
> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
>  
>  	switch (attr) {
>  	case DOMAIN_ATTR_NESTING:
> @@ -1933,6 +1947,9 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
>  			smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
>  
>  		return 0;
> +	case DOMAIN_ATTR_COHERENT_HTW_DISABLE:
> +		cfg->htw_disable = *((bool *)data);
> +		return 0;
>  	default:
>  		return -ENODEV;
>  	}
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 0550286df4..8a6449857a 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -81,6 +81,7 @@ enum iommu_attr {
>  	DOMAIN_ATTR_FSL_PAMU_ENABLE,
>  	DOMAIN_ATTR_FSL_PAMUV1,
>  	DOMAIN_ATTR_NESTING,	/* two stages of translation */
> +	DOMAIN_ATTR_COHERENT_HTW_DISABLE,

I wonder whether we should make this ARM-specific. Can you take a quick look
to see if any of the other IOMMUs can potentially benefit from this?

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 6/6] iommu/arm-smmu: add .domain_{set,get}_attr for coherent walk control
@ 2014-08-19 12:48         ` Will Deacon
  0 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-08-19 12:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 13, 2014 at 01:51:39AM +0100, Mitchel Humpherys wrote:
> Under certain conditions coherent hardware translation table walks can
> result in degraded performance. Add a new domain attribute to
> disable/enable this feature in generic code along with the domain
> attribute setter and getter to handle it in the ARM SMMU driver.

Again, it would be nice to have some information about these cases and the
performance issues that you are seeing.

> @@ -1908,11 +1917,15 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
>  				    enum iommu_attr attr, void *data)
>  {
>  	struct arm_smmu_domain *smmu_domain = domain->priv;
> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
>  
>  	switch (attr) {
>  	case DOMAIN_ATTR_NESTING:
>  		*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
>  		return 0;
> +	case DOMAIN_ATTR_COHERENT_HTW_DISABLE:
> +		*((bool *)data) = cfg->htw_disable;
> +		return 0;

I think I'd be more comfortable using int instead of bool for this, as it
could well end up in the user ABI if vfio decides to make use of it. While
we're here, let's also add an attributes bitmap to the arm_smmu_domain
instead of having a bool in the arm_smmu_cfg.

>  	default:
>  		return -ENODEV;
>  	}
> @@ -1922,6 +1935,7 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
>  				    enum iommu_attr attr, void *data)
>  {
>  	struct arm_smmu_domain *smmu_domain = domain->priv;
> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
>  
>  	switch (attr) {
>  	case DOMAIN_ATTR_NESTING:
> @@ -1933,6 +1947,9 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
>  			smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
>  
>  		return 0;
> +	case DOMAIN_ATTR_COHERENT_HTW_DISABLE:
> +		cfg->htw_disable = *((bool *)data);
> +		return 0;
>  	default:
>  		return -ENODEV;
>  	}
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 0550286df4..8a6449857a 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -81,6 +81,7 @@ enum iommu_attr {
>  	DOMAIN_ATTR_FSL_PAMU_ENABLE,
>  	DOMAIN_ATTR_FSL_PAMUV1,
>  	DOMAIN_ATTR_NESTING,	/* two stages of translation */
> +	DOMAIN_ATTR_COHERENT_HTW_DISABLE,

I wonder whether we should make this ARM-specific. Can you take a quick look
to see if any of the other IOMMUs can potentially benefit from this?

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 4/6] iommu/arm-smmu: implement generic DT bindings
  2014-08-13  0:51     ` Mitchel Humpherys
@ 2014-08-19 12:54         ` Will Deacon
  -1 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-08-19 12:54 UTC (permalink / raw)
  To: Mitchel Humpherys
  Cc: mark.rutland-5wv7dgnIgG8, devicetree-u79uwXL29TY76Z2rM5mHXA,
	arnd-r2nGTMty4D4, robh-DgEjT+Ai2ygdnm+yROfE0A,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	treding-DDmLM1+adcrQT0dZR+AlfA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

[adding Rob, Mark, Arnd, Thierry and Hiroshi]

On Wed, Aug 13, 2014 at 01:51:37AM +0100, Mitchel Humpherys wrote:
> Generic IOMMU device tree bindings were recently added in
> ["devicetree: Add generic IOMMU device tree bindings"]. Implement the
> bindings in the ARM SMMU driver.
> 
> See Documentation/devicetree/bindings/iommu/iommu.txt for the bindings
> themselves.

Could you look at moving the parsing code into a separate file please (maybe
under lib/ ?). That way, other IOMMUs can use the same binding without the
boilerplate having to be rewritten each time.

> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 63c6707fad..22e25f3172 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -538,25 +538,32 @@ static int insert_smmu_master(struct arm_smmu_device *smmu,
>  	return 0;
>  }
>  
> +struct iommus_entry {
> +	struct list_head list;
> +	struct device_node *node;
> +	u16 streamids[MAX_MASTER_STREAMIDS];
> +	int num_sids;
> +};
> +
>  static int register_smmu_master(struct arm_smmu_device *smmu,
> -				struct device *dev,
> -				struct of_phandle_args *masterspec)
> +				struct iommus_entry *entry)
>  {
>  	int i;
>  	struct arm_smmu_master *master;
> +	struct device *dev = smmu->dev;
>  
> -	master = find_smmu_master(smmu, masterspec->np);
> +	master = find_smmu_master(smmu, entry->node);
>  	if (master) {
>  		dev_err(dev,
>  			"rejecting multiple registrations for master device %s\n",
> -			masterspec->np->name);
> +			entry->node->name);
>  		return -EBUSY;
>  	}
>  
> -	if (masterspec->args_count > MAX_MASTER_STREAMIDS) {
> +	if (entry->num_sids > MAX_MASTER_STREAMIDS) {
>  		dev_err(dev,
>  			"reached maximum number (%d) of stream IDs for master device %s\n",
> -			MAX_MASTER_STREAMIDS, masterspec->np->name);
> +			MAX_MASTER_STREAMIDS, entry->node->name);
>  		return -ENOSPC;
>  	}
>  
> @@ -564,15 +571,58 @@ static int register_smmu_master(struct arm_smmu_device *smmu,
>  	if (!master)
>  		return -ENOMEM;
>  
> -	master->of_node			= masterspec->np;
> -	master->cfg.num_streamids	= masterspec->args_count;
> +	master->of_node			= entry->node;
> +	master->cfg.num_streamids	= entry->num_sids;
>  
>  	for (i = 0; i < master->cfg.num_streamids; ++i)
> -		master->cfg.streamids[i] = masterspec->args[i];
> +		master->cfg.streamids[i] = entry->streamids[i];
>  
>  	return insert_smmu_master(smmu, master);
>  }
>  
> +static int arm_smmu_parse_iommus_properties(struct arm_smmu_device *smmu,
> +					int *num_masters)
> +{
> +	struct of_phandle_args iommuspec;
> +	struct device_node *dn;
> +
> +	for_each_node_with_property(dn, "iommus") {
> +		int arg_ind = 0;
> +		struct iommus_entry *entry, *n;
> +		LIST_HEAD(iommus);
> +
> +		while (!of_parse_phandle_with_args(dn, "iommus", "#iommu-cells",
> +							arg_ind, &iommuspec)) {

We need to check that the phandle does indeed point at one of our SMMUs
here, in case we have a system with multiple IOMMU types, all using the
generic binding.

> +			int i;
> +
> +			list_for_each_entry(entry, &iommus, list)
> +				if (entry->node == dn)
> +					break;

Oh, yuck, this is really nasty to parse...

> +			if (&entry->list == &iommus) {

Where is entry initialised the first time around?

> +				entry = devm_kzalloc(smmu->dev, sizeof(*entry),
> +						GFP_KERNEL);
> +				if (!entry)
> +					return -ENOMEM;

You need to of_node_put the guys you've got back from the phandle parsing
code.

> +				entry->node = dn;
> +				list_add(&entry->list, &iommus);
> +			}
> +			entry->num_sids = iommuspec.args_count;
> +			for (i = 0; i < entry->num_sids; ++i)
> +				entry->streamids[i] = iommuspec.args[i];
> +			arg_ind++;

Isn't this defined by #iommu-cells?

> +		}
> +
> +		list_for_each_entry_safe(entry, n, &iommus, list) {

Why the _safe variant? This is a local list, right?

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 4/6] iommu/arm-smmu: implement generic DT bindings
@ 2014-08-19 12:54         ` Will Deacon
  0 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-08-19 12:54 UTC (permalink / raw)
  To: linux-arm-kernel

[adding Rob, Mark, Arnd, Thierry and Hiroshi]

On Wed, Aug 13, 2014 at 01:51:37AM +0100, Mitchel Humpherys wrote:
> Generic IOMMU device tree bindings were recently added in
> ["devicetree: Add generic IOMMU device tree bindings"]. Implement the
> bindings in the ARM SMMU driver.
> 
> See Documentation/devicetree/bindings/iommu/iommu.txt for the bindings
> themselves.

Could you look at moving the parsing code into a separate file please (maybe
under lib/ ?). That way, other IOMMUs can use the same binding without the
boilerplate having to be rewritten each time.

> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 63c6707fad..22e25f3172 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -538,25 +538,32 @@ static int insert_smmu_master(struct arm_smmu_device *smmu,
>  	return 0;
>  }
>  
> +struct iommus_entry {
> +	struct list_head list;
> +	struct device_node *node;
> +	u16 streamids[MAX_MASTER_STREAMIDS];
> +	int num_sids;
> +};
> +
>  static int register_smmu_master(struct arm_smmu_device *smmu,
> -				struct device *dev,
> -				struct of_phandle_args *masterspec)
> +				struct iommus_entry *entry)
>  {
>  	int i;
>  	struct arm_smmu_master *master;
> +	struct device *dev = smmu->dev;
>  
> -	master = find_smmu_master(smmu, masterspec->np);
> +	master = find_smmu_master(smmu, entry->node);
>  	if (master) {
>  		dev_err(dev,
>  			"rejecting multiple registrations for master device %s\n",
> -			masterspec->np->name);
> +			entry->node->name);
>  		return -EBUSY;
>  	}
>  
> -	if (masterspec->args_count > MAX_MASTER_STREAMIDS) {
> +	if (entry->num_sids > MAX_MASTER_STREAMIDS) {
>  		dev_err(dev,
>  			"reached maximum number (%d) of stream IDs for master device %s\n",
> -			MAX_MASTER_STREAMIDS, masterspec->np->name);
> +			MAX_MASTER_STREAMIDS, entry->node->name);
>  		return -ENOSPC;
>  	}
>  
> @@ -564,15 +571,58 @@ static int register_smmu_master(struct arm_smmu_device *smmu,
>  	if (!master)
>  		return -ENOMEM;
>  
> -	master->of_node			= masterspec->np;
> -	master->cfg.num_streamids	= masterspec->args_count;
> +	master->of_node			= entry->node;
> +	master->cfg.num_streamids	= entry->num_sids;
>  
>  	for (i = 0; i < master->cfg.num_streamids; ++i)
> -		master->cfg.streamids[i] = masterspec->args[i];
> +		master->cfg.streamids[i] = entry->streamids[i];
>  
>  	return insert_smmu_master(smmu, master);
>  }
>  
> +static int arm_smmu_parse_iommus_properties(struct arm_smmu_device *smmu,
> +					int *num_masters)
> +{
> +	struct of_phandle_args iommuspec;
> +	struct device_node *dn;
> +
> +	for_each_node_with_property(dn, "iommus") {
> +		int arg_ind = 0;
> +		struct iommus_entry *entry, *n;
> +		LIST_HEAD(iommus);
> +
> +		while (!of_parse_phandle_with_args(dn, "iommus", "#iommu-cells",
> +							arg_ind, &iommuspec)) {

We need to check that the phandle does indeed point at one of our SMMUs
here, in case we have a system with multiple IOMMU types, all using the
generic binding.

> +			int i;
> +
> +			list_for_each_entry(entry, &iommus, list)
> +				if (entry->node == dn)
> +					break;

Oh, yuck, this is really nasty to parse...

> +			if (&entry->list == &iommus) {

Where is entry initialised the first time around?

> +				entry = devm_kzalloc(smmu->dev, sizeof(*entry),
> +						GFP_KERNEL);
> +				if (!entry)
> +					return -ENOMEM;

You need to of_node_put the guys you've got back from the phandle parsing
code.

> +				entry->node = dn;
> +				list_add(&entry->list, &iommus);
> +			}
> +			entry->num_sids = iommuspec.args_count;
> +			for (i = 0; i < entry->num_sids; ++i)
> +				entry->streamids[i] = iommuspec.args[i];
> +			arg_ind++;

Isn't this defined by #iommu-cells?

> +		}
> +
> +		list_for_each_entry_safe(entry, n, &iommus, list) {

Why the _safe variant? This is a local list, right?

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
  2014-08-13  0:51     ` Mitchel Humpherys
@ 2014-08-19 12:58         ` Will Deacon
  -1 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-08-19 12:58 UTC (permalink / raw)
  To: Mitchel Humpherys
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Wed, Aug 13, 2014 at 01:51:34AM +0100, Mitchel Humpherys wrote:
> On some platforms with tight power constraints it is polite to only
> leave your clocks on for as long as you absolutely need them. Currently
> we assume that all clocks necessary for SMMU register access are always
> on.
> 
> Add some optional device tree properties to specify any clocks that are
> necessary for SMMU register access and turn them on and off as needed.
> 
> If no clocks are specified in the device tree things continue to work
> the way they always have: we assume all necessary clocks are always
> turned on.

How does this interact with an SMMU in bypass mode?

[...]

> +static int arm_smmu_enable_clocks(struct arm_smmu_device *smmu)
> +{
> +	int i, ret = 0;
> +
> +	for (i = 0; i < smmu->num_clocks; ++i) {
> +		ret = clk_prepare_enable(smmu->clocks[i]);
> +		if (ret) {
> +			dev_err(smmu->dev, "Couldn't enable clock #%d\n", i);
> +			while (i--)
> +				clk_disable_unprepare(smmu->clocks[i]);
> +			break;
> +		}
> +	}
> +
> +	return ret;
> +}
> +
> +static void arm_smmu_disable_clocks(struct arm_smmu_device *smmu)
> +{
> +	int i;
> +
> +	for (i = 0; i < smmu->num_clocks; ++i)
> +		clk_disable_unprepare(smmu->clocks[i]);
> +}

What stops theses from racing with each other when there are multiple
clocks? I also assume that the clk API ignores calls to clk_enable_prepare
for a clk that's already enabled? I couldn't find that code...

>  /* Wait for any pending TLB invalidations to complete */
>  static void arm_smmu_tlb_sync(struct arm_smmu_device *smmu)
>  {
> @@ -644,11 +672,15 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
>  	struct arm_smmu_device *smmu = smmu_domain->smmu;
>  	void __iomem *cb_base;
>  
> +	arm_smmu_enable_clocks(smmu);

How can I get a context interrupt from an SMMU without its clocks enabled?

[...]

> +int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
>  {
>  	unsigned long size;
>  	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
> @@ -2027,10 +2124,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>  	}
>  	dev_notice(dev, "registered %d master devices\n", i);
>  
> -	err = arm_smmu_device_cfg_probe(smmu);
> +	err = arm_smmu_init_clocks(smmu);
>  	if (err)
>  		goto out_put_masters;
>  
> +	arm_smmu_enable_clocks(smmu);
> +
> +	err = arm_smmu_device_cfg_probe(smmu);
> +	if (err)
> +		goto out_disable_clocks;
> +
>  	parse_driver_options(smmu);
>  
>  	if (smmu->version > 1 &&
> @@ -2039,7 +2142,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>  			"found only %d context interrupt(s) but %d required\n",
>  			smmu->num_context_irqs, smmu->num_context_banks);
>  		err = -ENODEV;
> -		goto out_put_masters;
> +		goto out_disable_clocks;
>  	}
>  
>  	for (i = 0; i < smmu->num_global_irqs; ++i) {
> @@ -2061,12 +2164,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>  	spin_unlock(&arm_smmu_devices_lock);
>  
>  	arm_smmu_device_reset(smmu);
> +	arm_smmu_disable_clocks(smmu);

I wonder if this is really the right thing to do. Rather than the
fine-grained clock enable/disable you have, why don't we just enable in
domain_init and disable in domain_destroy, with refcounting for the clocks?

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
@ 2014-08-19 12:58         ` Will Deacon
  0 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-08-19 12:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 13, 2014 at 01:51:34AM +0100, Mitchel Humpherys wrote:
> On some platforms with tight power constraints it is polite to only
> leave your clocks on for as long as you absolutely need them. Currently
> we assume that all clocks necessary for SMMU register access are always
> on.
> 
> Add some optional device tree properties to specify any clocks that are
> necessary for SMMU register access and turn them on and off as needed.
> 
> If no clocks are specified in the device tree things continue to work
> the way they always have: we assume all necessary clocks are always
> turned on.

How does this interact with an SMMU in bypass mode?

[...]

> +static int arm_smmu_enable_clocks(struct arm_smmu_device *smmu)
> +{
> +	int i, ret = 0;
> +
> +	for (i = 0; i < smmu->num_clocks; ++i) {
> +		ret = clk_prepare_enable(smmu->clocks[i]);
> +		if (ret) {
> +			dev_err(smmu->dev, "Couldn't enable clock #%d\n", i);
> +			while (i--)
> +				clk_disable_unprepare(smmu->clocks[i]);
> +			break;
> +		}
> +	}
> +
> +	return ret;
> +}
> +
> +static void arm_smmu_disable_clocks(struct arm_smmu_device *smmu)
> +{
> +	int i;
> +
> +	for (i = 0; i < smmu->num_clocks; ++i)
> +		clk_disable_unprepare(smmu->clocks[i]);
> +}

What stops theses from racing with each other when there are multiple
clocks? I also assume that the clk API ignores calls to clk_enable_prepare
for a clk that's already enabled? I couldn't find that code...

>  /* Wait for any pending TLB invalidations to complete */
>  static void arm_smmu_tlb_sync(struct arm_smmu_device *smmu)
>  {
> @@ -644,11 +672,15 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
>  	struct arm_smmu_device *smmu = smmu_domain->smmu;
>  	void __iomem *cb_base;
>  
> +	arm_smmu_enable_clocks(smmu);

How can I get a context interrupt from an SMMU without its clocks enabled?

[...]

> +int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
>  {
>  	unsigned long size;
>  	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
> @@ -2027,10 +2124,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>  	}
>  	dev_notice(dev, "registered %d master devices\n", i);
>  
> -	err = arm_smmu_device_cfg_probe(smmu);
> +	err = arm_smmu_init_clocks(smmu);
>  	if (err)
>  		goto out_put_masters;
>  
> +	arm_smmu_enable_clocks(smmu);
> +
> +	err = arm_smmu_device_cfg_probe(smmu);
> +	if (err)
> +		goto out_disable_clocks;
> +
>  	parse_driver_options(smmu);
>  
>  	if (smmu->version > 1 &&
> @@ -2039,7 +2142,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>  			"found only %d context interrupt(s) but %d required\n",
>  			smmu->num_context_irqs, smmu->num_context_banks);
>  		err = -ENODEV;
> -		goto out_put_masters;
> +		goto out_disable_clocks;
>  	}
>  
>  	for (i = 0; i < smmu->num_global_irqs; ++i) {
> @@ -2061,12 +2164,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>  	spin_unlock(&arm_smmu_devices_lock);
>  
>  	arm_smmu_device_reset(smmu);
> +	arm_smmu_disable_clocks(smmu);

I wonder if this is really the right thing to do. Rather than the
fine-grained clock enable/disable you have, why don't we just enable in
domain_init and disable in domain_destroy, with refcounting for the clocks?

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 2/6] iommu/arm-smmu: add support for specifying regulators
  2014-08-13  0:51     ` Mitchel Humpherys
@ 2014-08-19 13:00         ` Will Deacon
  -1 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-08-19 13:00 UTC (permalink / raw)
  To: Mitchel Humpherys
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Wed, Aug 13, 2014 at 01:51:35AM +0100, Mitchel Humpherys wrote:
> On some power-constrained platforms it's useful to disable power when a
> device is not in use. Add support for specifying regulators for SMMUs
> and only leave power on as long as the SMMU is in use (attached).

... and for bypass mode? My comments for clks largely apply here too -- I'd
much rather see this in domain_init/domain_destroy with appropriate
refcounting. It's just too much of a mess having these calls littered around
the driver.

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 2/6] iommu/arm-smmu: add support for specifying regulators
@ 2014-08-19 13:00         ` Will Deacon
  0 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-08-19 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 13, 2014 at 01:51:35AM +0100, Mitchel Humpherys wrote:
> On some power-constrained platforms it's useful to disable power when a
> device is not in use. Add support for specifying regulators for SMMUs
> and only leave power on as long as the SMMU is in use (attached).

... and for bypass mode? My comments for clks largely apply here too -- I'd
much rather see this in domain_init/domain_destroy with appropriate
refcounting. It's just too much of a mess having these calls littered around
the driver.

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 4/6] iommu/arm-smmu: implement generic DT bindings
  2014-08-19 12:54         ` Will Deacon
@ 2014-08-19 15:54             ` Hiroshi Doyu
  -1 siblings, 0 replies; 68+ messages in thread
From: Hiroshi Doyu @ 2014-08-19 15:54 UTC (permalink / raw)
  To: Mitchel Humpherys, Will Deacon
  Cc: mark.rutland-5wv7dgnIgG8, devicetree-u79uwXL29TY76Z2rM5mHXA,
	arnd-r2nGTMty4D4, robh-DgEjT+Ai2ygdnm+yROfE0A,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Thierry Reding,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r


Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> writes:

> [adding Rob, Mark, Arnd, Thierry and Hiroshi]
>
> On Wed, Aug 13, 2014 at 01:51:37AM +0100, Mitchel Humpherys wrote:
>> Generic IOMMU device tree bindings were recently added in
>> ["devicetree: Add generic IOMMU device tree bindings"]. Implement the
>> bindings in the ARM SMMU driver.
>> 
>> See Documentation/devicetree/bindings/iommu/iommu.txt for the bindings
>> themselves.
>
> Could you look at moving the parsing code into a separate file please (maybe
> under lib/ ?). That way, other IOMMUs can use the same binding without the
> boilerplate having to be rewritten each time.
>
>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>> index 63c6707fad..22e25f3172 100644
>> --- a/drivers/iommu/arm-smmu.c
>> +++ b/drivers/iommu/arm-smmu.c
>> @@ -538,25 +538,32 @@ static int insert_smmu_master(struct arm_smmu_device *smmu,
>>  	return 0;
>>  }
>>  
>> +struct iommus_entry {
>> +	struct list_head list;
>> +	struct device_node *node;
>> +	u16 streamids[MAX_MASTER_STREAMIDS];
>> +	int num_sids;
>> +};
>> +
>>  static int register_smmu_master(struct arm_smmu_device *smmu,
>> -				struct device *dev,
>> -				struct of_phandle_args *masterspec)
>> +				struct iommus_entry *entry)
>>  {
>>  	int i;
>>  	struct arm_smmu_master *master;
>> +	struct device *dev = smmu->dev;
>>  
>> -	master = find_smmu_master(smmu, masterspec->np);
>> +	master = find_smmu_master(smmu, entry->node);
>>  	if (master) {
>>  		dev_err(dev,
>>  			"rejecting multiple registrations for master device %s\n",
>> -			masterspec->np->name);
>> +			entry->node->name);
>>  		return -EBUSY;
>>  	}
>>  
>> -	if (masterspec->args_count > MAX_MASTER_STREAMIDS) {
>> +	if (entry->num_sids > MAX_MASTER_STREAMIDS) {
>>  		dev_err(dev,
>>  			"reached maximum number (%d) of stream IDs for master device %s\n",
>> -			MAX_MASTER_STREAMIDS, masterspec->np->name);
>> +			MAX_MASTER_STREAMIDS, entry->node->name);
>>  		return -ENOSPC;
>>  	}
>>  
>> @@ -564,15 +571,58 @@ static int register_smmu_master(struct arm_smmu_device *smmu,
>>  	if (!master)
>>  		return -ENOMEM;
>>  
>> -	master->of_node			= masterspec->np;
>> -	master->cfg.num_streamids	= masterspec->args_count;
>> +	master->of_node			= entry->node;
>> +	master->cfg.num_streamids	= entry->num_sids;
>>  
>>  	for (i = 0; i < master->cfg.num_streamids; ++i)
>> -		master->cfg.streamids[i] = masterspec->args[i];
>> +		master->cfg.streamids[i] = entry->streamids[i];
>>  
>>  	return insert_smmu_master(smmu, master);
>>  }
>>  
>> +static int arm_smmu_parse_iommus_properties(struct arm_smmu_device *smmu,
>> +					int *num_masters)
>> +{
>> +	struct of_phandle_args iommuspec;
>> +	struct device_node *dn;
>> +
>> +	for_each_node_with_property(dn, "iommus") {
>> +		int arg_ind = 0;
>> +		struct iommus_entry *entry, *n;
>> +		LIST_HEAD(iommus);

You may want to use the macro, "of_property_for_each_phandle_with_args()"
to parse "iommus=".

  https://patchwork.ozlabs.org/patch/354073/

>> +
>> +		while (!of_parse_phandle_with_args(dn, "iommus", "#iommu-cells",
>> +							arg_ind, &iommuspec)) {
>
> We need to check that the phandle does indeed point at one of our SMMUs
> here, in case we have a system with multiple IOMMU types, all using the
> generic binding.
>
>> +			int i;
>> +
>> +			list_for_each_entry(entry, &iommus, list)
>> +				if (entry->node == dn)
>> +					break;
>
> Oh, yuck, this is really nasty to parse...
>
>> +			if (&entry->list == &iommus) {
>
> Where is entry initialised the first time around?
>
>> +				entry = devm_kzalloc(smmu->dev, sizeof(*entry),
>> +						GFP_KERNEL);
>> +				if (!entry)
>> +					return -ENOMEM;
>
> You need to of_node_put the guys you've got back from the phandle parsing
> code.
>
>> +				entry->node = dn;
>> +				list_add(&entry->list, &iommus);
>> +			}
>> +			entry->num_sids = iommuspec.args_count;
>> +			for (i = 0; i < entry->num_sids; ++i)
>> +				entry->streamids[i] = iommuspec.args[i];
>> +			arg_ind++;
>
> Isn't this defined by #iommu-cells?
>
>> +		}
>> +
>> +		list_for_each_entry_safe(entry, n, &iommus, list) {
>
> Why the _safe variant? This is a local list, right?
>
> Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 4/6] iommu/arm-smmu: implement generic DT bindings
@ 2014-08-19 15:54             ` Hiroshi Doyu
  0 siblings, 0 replies; 68+ messages in thread
From: Hiroshi Doyu @ 2014-08-19 15:54 UTC (permalink / raw)
  To: linux-arm-kernel


Will Deacon <will.deacon@arm.com> writes:

> [adding Rob, Mark, Arnd, Thierry and Hiroshi]
>
> On Wed, Aug 13, 2014 at 01:51:37AM +0100, Mitchel Humpherys wrote:
>> Generic IOMMU device tree bindings were recently added in
>> ["devicetree: Add generic IOMMU device tree bindings"]. Implement the
>> bindings in the ARM SMMU driver.
>> 
>> See Documentation/devicetree/bindings/iommu/iommu.txt for the bindings
>> themselves.
>
> Could you look at moving the parsing code into a separate file please (maybe
> under lib/ ?). That way, other IOMMUs can use the same binding without the
> boilerplate having to be rewritten each time.
>
>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>> index 63c6707fad..22e25f3172 100644
>> --- a/drivers/iommu/arm-smmu.c
>> +++ b/drivers/iommu/arm-smmu.c
>> @@ -538,25 +538,32 @@ static int insert_smmu_master(struct arm_smmu_device *smmu,
>>  	return 0;
>>  }
>>  
>> +struct iommus_entry {
>> +	struct list_head list;
>> +	struct device_node *node;
>> +	u16 streamids[MAX_MASTER_STREAMIDS];
>> +	int num_sids;
>> +};
>> +
>>  static int register_smmu_master(struct arm_smmu_device *smmu,
>> -				struct device *dev,
>> -				struct of_phandle_args *masterspec)
>> +				struct iommus_entry *entry)
>>  {
>>  	int i;
>>  	struct arm_smmu_master *master;
>> +	struct device *dev = smmu->dev;
>>  
>> -	master = find_smmu_master(smmu, masterspec->np);
>> +	master = find_smmu_master(smmu, entry->node);
>>  	if (master) {
>>  		dev_err(dev,
>>  			"rejecting multiple registrations for master device %s\n",
>> -			masterspec->np->name);
>> +			entry->node->name);
>>  		return -EBUSY;
>>  	}
>>  
>> -	if (masterspec->args_count > MAX_MASTER_STREAMIDS) {
>> +	if (entry->num_sids > MAX_MASTER_STREAMIDS) {
>>  		dev_err(dev,
>>  			"reached maximum number (%d) of stream IDs for master device %s\n",
>> -			MAX_MASTER_STREAMIDS, masterspec->np->name);
>> +			MAX_MASTER_STREAMIDS, entry->node->name);
>>  		return -ENOSPC;
>>  	}
>>  
>> @@ -564,15 +571,58 @@ static int register_smmu_master(struct arm_smmu_device *smmu,
>>  	if (!master)
>>  		return -ENOMEM;
>>  
>> -	master->of_node			= masterspec->np;
>> -	master->cfg.num_streamids	= masterspec->args_count;
>> +	master->of_node			= entry->node;
>> +	master->cfg.num_streamids	= entry->num_sids;
>>  
>>  	for (i = 0; i < master->cfg.num_streamids; ++i)
>> -		master->cfg.streamids[i] = masterspec->args[i];
>> +		master->cfg.streamids[i] = entry->streamids[i];
>>  
>>  	return insert_smmu_master(smmu, master);
>>  }
>>  
>> +static int arm_smmu_parse_iommus_properties(struct arm_smmu_device *smmu,
>> +					int *num_masters)
>> +{
>> +	struct of_phandle_args iommuspec;
>> +	struct device_node *dn;
>> +
>> +	for_each_node_with_property(dn, "iommus") {
>> +		int arg_ind = 0;
>> +		struct iommus_entry *entry, *n;
>> +		LIST_HEAD(iommus);

You may want to use the macro, "of_property_for_each_phandle_with_args()"
to parse "iommus=".

  https://patchwork.ozlabs.org/patch/354073/

>> +
>> +		while (!of_parse_phandle_with_args(dn, "iommus", "#iommu-cells",
>> +							arg_ind, &iommuspec)) {
>
> We need to check that the phandle does indeed point at one of our SMMUs
> here, in case we have a system with multiple IOMMU types, all using the
> generic binding.
>
>> +			int i;
>> +
>> +			list_for_each_entry(entry, &iommus, list)
>> +				if (entry->node == dn)
>> +					break;
>
> Oh, yuck, this is really nasty to parse...
>
>> +			if (&entry->list == &iommus) {
>
> Where is entry initialised the first time around?
>
>> +				entry = devm_kzalloc(smmu->dev, sizeof(*entry),
>> +						GFP_KERNEL);
>> +				if (!entry)
>> +					return -ENOMEM;
>
> You need to of_node_put the guys you've got back from the phandle parsing
> code.
>
>> +				entry->node = dn;
>> +				list_add(&entry->list, &iommus);
>> +			}
>> +			entry->num_sids = iommuspec.args_count;
>> +			for (i = 0; i < entry->num_sids; ++i)
>> +				entry->streamids[i] = iommuspec.args[i];
>> +			arg_ind++;
>
> Isn't this defined by #iommu-cells?
>
>> +		}
>> +
>> +		list_for_each_entry_safe(entry, n, &iommus, list) {
>
> Why the _safe variant? This is a local list, right?
>
> Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 3/6] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
  2014-08-19 12:44         ` Will Deacon
@ 2014-08-19 18:12             ` Mitchel Humpherys
  -1 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-19 18:12 UTC (permalink / raw)
  To: Will Deacon
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Tue, Aug 19 2014 at 05:44:32 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
> On Wed, Aug 13, 2014 at 01:51:36AM +0100, Mitchel Humpherys wrote:
>> Currently, we provide the iommu_ops.iova_to_phys service by doing a
>> table walk in software to translate IO virtual addresses to physical
>> addresses. On SMMUs that support it, it can be useful to ask the SMMU
>> itself to do the translation. This can be used to warm the TLBs for an
>> SMMU. It can also be useful for testing and hardware validation.
>
> I'm not really sold on the usefulness of this feature. If you want hardware
> validation features, I'd rather do something through debugfs, but your
> use-case for warming the TLBs is intriguing. Do you have an example use-case
> with performance figures?

I'm afraid I don't have an example use case or performance numbers at
the moment...

>> Since the address translation registers are optional on SMMUv2, only
>> enable hardware translations when using SMMUv1 or when SMMU_IDR0.S1TS=1
>> and SMMU_IDR0.ATOSNS=0, as described in the ARM SMMU v1-v2 spec.
>
> [...]
>
>> +static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
>> +					dma_addr_t iova)
>> +{
>> +	struct arm_smmu_domain *smmu_domain = domain->priv;
>> +	struct arm_smmu_device *smmu = smmu_domain->smmu;
>> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
>> +	struct device *dev = smmu->dev;
>> +	void __iomem *cb_base;
>> +	int count = 0;
>> +	u64 phys;
>> +
>> +	arm_smmu_enable_clocks(smmu);
>> +
>> +	cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
>> +
>> +	if (smmu->version == 1) {
>> +		u32 reg = iova & 0xFFFFF000;
>> +		writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO);
>> +	} else {
>> +		u64 reg = iova & 0xfffffffffffff000;
>> +		writeq_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO);
>
> We don't have writeq for arch/arm/.

Ah yes looks like this is an MSM-ism that never made it upstream since
it wouldn't be guaranteed to be atomic. I'll make sure to do arm32
compiles on upstream kernels for future patches, sorry!

I guess we could use <asm-generic/io-64-nonatomic-lo-hi.h> but I can
also re-work this to be two separate writel's.

>> +	}
>> +
>> +	mb();
>
> Why?

My thought was that if we start polling ATSR_ACTIVE prematurely (before
the write to ATS1PR actually finishes) all heck could break loose? Not
sure if that's a bogus assumption due to device memory being strongly
ordered?

>> +	while (readl_relaxed(cb_base + ARM_SMMU_CB_ATSR) & ATSR_ACTIVE) {
>> +		if (++count == ATSR_LOOP_TIMEOUT) {
>> +			dev_err(dev,
>> +				"iova to phys timed out on 0x%pa for %s. Falling back to software table walk.\n",
>> +				&iova, dev_name(dev));
>> +			arm_smmu_disable_clocks(smmu);
>> +			return arm_smmu_iova_to_phys_soft(domain, iova);
>> +		}
>> +		cpu_relax();
>> +	}
>
> Do you know what happened to Olav's patches to make this sort of code
> generic?

I assume you're talking about this, right?

    http://lists.infradead.org/pipermail/linux-arm-kernel/2014-June/267943.html

Yeah looks like he never sent an update since it was part of a series
that wasn't going to make it in (the qsmmu driver). I can always bring
that patch (actually Matt Wagantall's patch) in here and rework this to
use that.

>
>> @@ -2005,6 +2073,11 @@ int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
>>  		return -ENODEV;
>>  	}
>>  
>> +	if (smmu->version == 1 || (!(id & ID0_ATOSNS) && (id & ID0_S1TS))) {
>
> Are you sure about this? The v2 spec says that is ATOSNS is clear then S1TS
> is also clear.

I was looking at Section 4.1.1 of ARM IHI 0062C ID091613 which states:

    In SMMUv2, the address translation registers are OPTIONAL. The
    address translation registers are implemented only when both:

        o The SMMU_IDR0.S1TS bit is set to 1.
        o The SMMU_IDR0.ATOSNS bit is set to 0.

I assume you're referring to section 9.6.1 of the same document:

    ATOSNS, bit[26]
    Address Translation Operations Not Supported. The possible values of
    this bit are:

        0 Address translation operations are supported. Stage 1
          translation is not supported, that is, the S1TS bit is set to 0.

        1 Address translation operations are not supported. Stage 1
          translation is supported, that is, the S1TS bit is set to 1.

If that really means that S1TS and ATOSNS always have the same value
then Section 4.1.1 doesn't make any sense. Or am I missing something?



-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 3/6] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
@ 2014-08-19 18:12             ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-19 18:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 19 2014 at 05:44:32 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Wed, Aug 13, 2014 at 01:51:36AM +0100, Mitchel Humpherys wrote:
>> Currently, we provide the iommu_ops.iova_to_phys service by doing a
>> table walk in software to translate IO virtual addresses to physical
>> addresses. On SMMUs that support it, it can be useful to ask the SMMU
>> itself to do the translation. This can be used to warm the TLBs for an
>> SMMU. It can also be useful for testing and hardware validation.
>
> I'm not really sold on the usefulness of this feature. If you want hardware
> validation features, I'd rather do something through debugfs, but your
> use-case for warming the TLBs is intriguing. Do you have an example use-case
> with performance figures?

I'm afraid I don't have an example use case or performance numbers at
the moment...

>> Since the address translation registers are optional on SMMUv2, only
>> enable hardware translations when using SMMUv1 or when SMMU_IDR0.S1TS=1
>> and SMMU_IDR0.ATOSNS=0, as described in the ARM SMMU v1-v2 spec.
>
> [...]
>
>> +static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
>> +					dma_addr_t iova)
>> +{
>> +	struct arm_smmu_domain *smmu_domain = domain->priv;
>> +	struct arm_smmu_device *smmu = smmu_domain->smmu;
>> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
>> +	struct device *dev = smmu->dev;
>> +	void __iomem *cb_base;
>> +	int count = 0;
>> +	u64 phys;
>> +
>> +	arm_smmu_enable_clocks(smmu);
>> +
>> +	cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
>> +
>> +	if (smmu->version == 1) {
>> +		u32 reg = iova & 0xFFFFF000;
>> +		writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO);
>> +	} else {
>> +		u64 reg = iova & 0xfffffffffffff000;
>> +		writeq_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO);
>
> We don't have writeq for arch/arm/.

Ah yes looks like this is an MSM-ism that never made it upstream since
it wouldn't be guaranteed to be atomic. I'll make sure to do arm32
compiles on upstream kernels for future patches, sorry!

I guess we could use <asm-generic/io-64-nonatomic-lo-hi.h> but I can
also re-work this to be two separate writel's.

>> +	}
>> +
>> +	mb();
>
> Why?

My thought was that if we start polling ATSR_ACTIVE prematurely (before
the write to ATS1PR actually finishes) all heck could break loose? Not
sure if that's a bogus assumption due to device memory being strongly
ordered?

>> +	while (readl_relaxed(cb_base + ARM_SMMU_CB_ATSR) & ATSR_ACTIVE) {
>> +		if (++count == ATSR_LOOP_TIMEOUT) {
>> +			dev_err(dev,
>> +				"iova to phys timed out on 0x%pa for %s. Falling back to software table walk.\n",
>> +				&iova, dev_name(dev));
>> +			arm_smmu_disable_clocks(smmu);
>> +			return arm_smmu_iova_to_phys_soft(domain, iova);
>> +		}
>> +		cpu_relax();
>> +	}
>
> Do you know what happened to Olav's patches to make this sort of code
> generic?

I assume you're talking about this, right?

    http://lists.infradead.org/pipermail/linux-arm-kernel/2014-June/267943.html

Yeah looks like he never sent an update since it was part of a series
that wasn't going to make it in (the qsmmu driver). I can always bring
that patch (actually Matt Wagantall's patch) in here and rework this to
use that.

>
>> @@ -2005,6 +2073,11 @@ int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
>>  		return -ENODEV;
>>  	}
>>  
>> +	if (smmu->version == 1 || (!(id & ID0_ATOSNS) && (id & ID0_S1TS))) {
>
> Are you sure about this? The v2 spec says that is ATOSNS is clear then S1TS
> is also clear.

I was looking at Section 4.1.1 of ARM IHI 0062C ID091613 which states:

    In SMMUv2, the address translation registers are OPTIONAL. The
    address translation registers are implemented only when both:

        o The SMMU_IDR0.S1TS bit is set to 1.
        o The SMMU_IDR0.ATOSNS bit is set to 0.

I assume you're referring to section 9.6.1 of the same document:

    ATOSNS, bit[26]
    Address Translation Operations Not Supported. The possible values of
    this bit are:

        0 Address translation operations are supported. Stage 1
          translation is not supported, that is, the S1TS bit is set to 0.

        1 Address translation operations are not supported. Stage 1
          translation is supported, that is, the S1TS bit is set to 1.

If that really means that S1TS and ATOSNS always have the same value
then Section 4.1.1 doesn't make any sense. Or am I missing something?



-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
  2014-08-19 12:58         ` Will Deacon
@ 2014-08-19 19:03             ` Olav Haugan
  -1 siblings, 0 replies; 68+ messages in thread
From: Olav Haugan @ 2014-08-19 19:03 UTC (permalink / raw)
  To: Will Deacon, Mitchel Humpherys
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Hi Will,

On 8/19/2014 5:58 AM, Will Deacon wrote:
> On Wed, Aug 13, 2014 at 01:51:34AM +0100, Mitchel Humpherys wrote:
>> On some platforms with tight power constraints it is polite to only
>> leave your clocks on for as long as you absolutely need them. Currently
>> we assume that all clocks necessary for SMMU register access are always
>> on.
>>
>> Add some optional device tree properties to specify any clocks that are
>> necessary for SMMU register access and turn them on and off as needed.
>>
>> If no clocks are specified in the device tree things continue to work
>> the way they always have: we assume all necessary clocks are always
>> turned on.
> 
> How does this interact with an SMMU in bypass mode?

Do you mean if you have a platform that requires clock and power
management but we leave the SMMU in bypass (i.e. no one calls into the
SMMU driver) how are the clock/power managed?

Clients of the SMMU driver are required to vote for clocks and power
when they know they need to use the SMMU. However, the clock and power
needed to be on for the SMMU to service bus masters aren't necessarily
the same as the ones needed to read/write registers...See below.

> 
> [...]
> 
>> +static int arm_smmu_enable_clocks(struct arm_smmu_device *smmu)
>> +{
>> +	int i, ret = 0;
>> +
>> +	for (i = 0; i < smmu->num_clocks; ++i) {
>> +		ret = clk_prepare_enable(smmu->clocks[i]);
>> +		if (ret) {
>> +			dev_err(smmu->dev, "Couldn't enable clock #%d\n", i);
>> +			while (i--)
>> +				clk_disable_unprepare(smmu->clocks[i]);
>> +			break;
>> +		}
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +static void arm_smmu_disable_clocks(struct arm_smmu_device *smmu)
>> +{
>> +	int i;
>> +
>> +	for (i = 0; i < smmu->num_clocks; ++i)
>> +		clk_disable_unprepare(smmu->clocks[i]);
>> +}
> 
> What stops theses from racing with each other when there are multiple
> clocks? I also assume that the clk API ignores calls to clk_enable_prepare
> for a clk that's already enabled? I couldn't find that code...

All the clock APIs are reference counted yes. Not sure what you mean by
racing with each other? When you call to enable a clock the call does
not return until the clock is already ON (or OFF).

>>  /* Wait for any pending TLB invalidations to complete */
>>  static void arm_smmu_tlb_sync(struct arm_smmu_device *smmu)
>>  {
>> @@ -644,11 +672,15 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
>>  	struct arm_smmu_device *smmu = smmu_domain->smmu;
>>  	void __iomem *cb_base;
>>  
>> +	arm_smmu_enable_clocks(smmu);
> 
> How can I get a context interrupt from an SMMU without its clocks enabled?

Good question. At least in our SoC we usually have at least 1 "core"
clock and 1 "programming" clock. Both clocks are needed to read/write
registers but only 1 of the clocks is needed for the SMMU to service bus
master requests.

> 
> [...]
> 
>> +int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
>>  {
>>  	unsigned long size;
>>  	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
>> @@ -2027,10 +2124,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>>  	}
>>  	dev_notice(dev, "registered %d master devices\n", i);
>>  
>> -	err = arm_smmu_device_cfg_probe(smmu);
>> +	err = arm_smmu_init_clocks(smmu);
>>  	if (err)
>>  		goto out_put_masters;
>>  
>> +	arm_smmu_enable_clocks(smmu);
>> +
>> +	err = arm_smmu_device_cfg_probe(smmu);
>> +	if (err)
>> +		goto out_disable_clocks;
>> +
>>  	parse_driver_options(smmu);
>>  
>>  	if (smmu->version > 1 &&
>> @@ -2039,7 +2142,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>>  			"found only %d context interrupt(s) but %d required\n",
>>  			smmu->num_context_irqs, smmu->num_context_banks);
>>  		err = -ENODEV;
>> -		goto out_put_masters;
>> +		goto out_disable_clocks;
>>  	}
>>  
>>  	for (i = 0; i < smmu->num_global_irqs; ++i) {
>> @@ -2061,12 +2164,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>>  	spin_unlock(&arm_smmu_devices_lock);
>>  
>>  	arm_smmu_device_reset(smmu);
>> +	arm_smmu_disable_clocks(smmu);
> 
> I wonder if this is really the right thing to do. Rather than the
> fine-grained clock enable/disable you have, why don't we just enable in
> domain_init and disable in domain_destroy, with refcounting for the clocks?
> 

So the whole point of all of this is that we try to save power. As Mitch
wrote in the commit text we want to only leave the clock and power on
for as short period of time as possible.

Thanks,

Olav

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
@ 2014-08-19 19:03             ` Olav Haugan
  0 siblings, 0 replies; 68+ messages in thread
From: Olav Haugan @ 2014-08-19 19:03 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will,

On 8/19/2014 5:58 AM, Will Deacon wrote:
> On Wed, Aug 13, 2014 at 01:51:34AM +0100, Mitchel Humpherys wrote:
>> On some platforms with tight power constraints it is polite to only
>> leave your clocks on for as long as you absolutely need them. Currently
>> we assume that all clocks necessary for SMMU register access are always
>> on.
>>
>> Add some optional device tree properties to specify any clocks that are
>> necessary for SMMU register access and turn them on and off as needed.
>>
>> If no clocks are specified in the device tree things continue to work
>> the way they always have: we assume all necessary clocks are always
>> turned on.
> 
> How does this interact with an SMMU in bypass mode?

Do you mean if you have a platform that requires clock and power
management but we leave the SMMU in bypass (i.e. no one calls into the
SMMU driver) how are the clock/power managed?

Clients of the SMMU driver are required to vote for clocks and power
when they know they need to use the SMMU. However, the clock and power
needed to be on for the SMMU to service bus masters aren't necessarily
the same as the ones needed to read/write registers...See below.

> 
> [...]
> 
>> +static int arm_smmu_enable_clocks(struct arm_smmu_device *smmu)
>> +{
>> +	int i, ret = 0;
>> +
>> +	for (i = 0; i < smmu->num_clocks; ++i) {
>> +		ret = clk_prepare_enable(smmu->clocks[i]);
>> +		if (ret) {
>> +			dev_err(smmu->dev, "Couldn't enable clock #%d\n", i);
>> +			while (i--)
>> +				clk_disable_unprepare(smmu->clocks[i]);
>> +			break;
>> +		}
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +static void arm_smmu_disable_clocks(struct arm_smmu_device *smmu)
>> +{
>> +	int i;
>> +
>> +	for (i = 0; i < smmu->num_clocks; ++i)
>> +		clk_disable_unprepare(smmu->clocks[i]);
>> +}
> 
> What stops theses from racing with each other when there are multiple
> clocks? I also assume that the clk API ignores calls to clk_enable_prepare
> for a clk that's already enabled? I couldn't find that code...

All the clock APIs are reference counted yes. Not sure what you mean by
racing with each other? When you call to enable a clock the call does
not return until the clock is already ON (or OFF).

>>  /* Wait for any pending TLB invalidations to complete */
>>  static void arm_smmu_tlb_sync(struct arm_smmu_device *smmu)
>>  {
>> @@ -644,11 +672,15 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
>>  	struct arm_smmu_device *smmu = smmu_domain->smmu;
>>  	void __iomem *cb_base;
>>  
>> +	arm_smmu_enable_clocks(smmu);
> 
> How can I get a context interrupt from an SMMU without its clocks enabled?

Good question. At least in our SoC we usually have at least 1 "core"
clock and 1 "programming" clock. Both clocks are needed to read/write
registers but only 1 of the clocks is needed for the SMMU to service bus
master requests.

> 
> [...]
> 
>> +int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
>>  {
>>  	unsigned long size;
>>  	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
>> @@ -2027,10 +2124,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>>  	}
>>  	dev_notice(dev, "registered %d master devices\n", i);
>>  
>> -	err = arm_smmu_device_cfg_probe(smmu);
>> +	err = arm_smmu_init_clocks(smmu);
>>  	if (err)
>>  		goto out_put_masters;
>>  
>> +	arm_smmu_enable_clocks(smmu);
>> +
>> +	err = arm_smmu_device_cfg_probe(smmu);
>> +	if (err)
>> +		goto out_disable_clocks;
>> +
>>  	parse_driver_options(smmu);
>>  
>>  	if (smmu->version > 1 &&
>> @@ -2039,7 +2142,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>>  			"found only %d context interrupt(s) but %d required\n",
>>  			smmu->num_context_irqs, smmu->num_context_banks);
>>  		err = -ENODEV;
>> -		goto out_put_masters;
>> +		goto out_disable_clocks;
>>  	}
>>  
>>  	for (i = 0; i < smmu->num_global_irqs; ++i) {
>> @@ -2061,12 +2164,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>>  	spin_unlock(&arm_smmu_devices_lock);
>>  
>>  	arm_smmu_device_reset(smmu);
>> +	arm_smmu_disable_clocks(smmu);
> 
> I wonder if this is really the right thing to do. Rather than the
> fine-grained clock enable/disable you have, why don't we just enable in
> domain_init and disable in domain_destroy, with refcounting for the clocks?
> 

So the whole point of all of this is that we try to save power. As Mitch
wrote in the commit text we want to only leave the clock and power on
for as short period of time as possible.

Thanks,

Olav

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 6/6] iommu/arm-smmu: add .domain_{set, get}_attr for coherent walk control
  2014-08-19 12:48         ` Will Deacon
@ 2014-08-19 19:19             ` Mitchel Humpherys
  -1 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-19 19:19 UTC (permalink / raw)
  To: Will Deacon
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Tue, Aug 19 2014 at 05:48:07 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
> On Wed, Aug 13, 2014 at 01:51:39AM +0100, Mitchel Humpherys wrote:
>> Under certain conditions coherent hardware translation table walks can
>> result in degraded performance. Add a new domain attribute to
>> disable/enable this feature in generic code along with the domain
>> attribute setter and getter to handle it in the ARM SMMU driver.
>
> Again, it would be nice to have some information about these cases and the
> performance issues that you are seeing.

Basically, the data I'm basing these statements on is: that's what the
HW folks tell me :). I believe it's specific to our hardware, not ARM
IP. Unfortunately, I don't think I could share the specifics even if I
had them, but I can try to press the issue if you want me to.

>
>> @@ -1908,11 +1917,15 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
>>  				    enum iommu_attr attr, void *data)
>>  {
>>  	struct arm_smmu_domain *smmu_domain = domain->priv;
>> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
>>  
>>  	switch (attr) {
>>  	case DOMAIN_ATTR_NESTING:
>>  		*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
>>  		return 0;
>> +	case DOMAIN_ATTR_COHERENT_HTW_DISABLE:
>> +		*((bool *)data) = cfg->htw_disable;
>> +		return 0;
>
> I think I'd be more comfortable using int instead of bool for this, as it
> could well end up in the user ABI if vfio decides to make use of it. While
> we're here, let's also add an attributes bitmap to the arm_smmu_domain
> instead of having a bool in the arm_smmu_cfg.

Sounds good. I'll make these changes in v2.

>
>>  	default:
>>  		return -ENODEV;
>>  	}
>> @@ -1922,6 +1935,7 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
>>  				    enum iommu_attr attr, void *data)
>>  {
>>  	struct arm_smmu_domain *smmu_domain = domain->priv;
>> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
>>  
>>  	switch (attr) {
>>  	case DOMAIN_ATTR_NESTING:
>> @@ -1933,6 +1947,9 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
>>  			smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
>>  
>>  		return 0;
>> +	case DOMAIN_ATTR_COHERENT_HTW_DISABLE:
>> +		cfg->htw_disable = *((bool *)data);
>> +		return 0;
>>  	default:
>>  		return -ENODEV;
>>  	}
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index 0550286df4..8a6449857a 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -81,6 +81,7 @@ enum iommu_attr {
>>  	DOMAIN_ATTR_FSL_PAMU_ENABLE,
>>  	DOMAIN_ATTR_FSL_PAMUV1,
>>  	DOMAIN_ATTR_NESTING,	/* two stages of translation */
>> +	DOMAIN_ATTR_COHERENT_HTW_DISABLE,
>
> I wonder whether we should make this ARM-specific. Can you take a quick look
> to see if any of the other IOMMUs can potentially benefit from this?

Yeah looks like amd_iommu.c and intel-iommu.c are using
IOMMU_CAP_CACHE_COHERENCY which seems to be the same thing (at least
that's how we're treating it in arm-smmu.c). AMD's doesn't look
configurable but Intel's does, so perhaps they would benefit from this.



-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 6/6] iommu/arm-smmu: add .domain_{set, get}_attr for coherent walk control
@ 2014-08-19 19:19             ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-19 19:19 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 19 2014 at 05:48:07 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Wed, Aug 13, 2014 at 01:51:39AM +0100, Mitchel Humpherys wrote:
>> Under certain conditions coherent hardware translation table walks can
>> result in degraded performance. Add a new domain attribute to
>> disable/enable this feature in generic code along with the domain
>> attribute setter and getter to handle it in the ARM SMMU driver.
>
> Again, it would be nice to have some information about these cases and the
> performance issues that you are seeing.

Basically, the data I'm basing these statements on is: that's what the
HW folks tell me :). I believe it's specific to our hardware, not ARM
IP. Unfortunately, I don't think I could share the specifics even if I
had them, but I can try to press the issue if you want me to.

>
>> @@ -1908,11 +1917,15 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
>>  				    enum iommu_attr attr, void *data)
>>  {
>>  	struct arm_smmu_domain *smmu_domain = domain->priv;
>> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
>>  
>>  	switch (attr) {
>>  	case DOMAIN_ATTR_NESTING:
>>  		*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
>>  		return 0;
>> +	case DOMAIN_ATTR_COHERENT_HTW_DISABLE:
>> +		*((bool *)data) = cfg->htw_disable;
>> +		return 0;
>
> I think I'd be more comfortable using int instead of bool for this, as it
> could well end up in the user ABI if vfio decides to make use of it. While
> we're here, let's also add an attributes bitmap to the arm_smmu_domain
> instead of having a bool in the arm_smmu_cfg.

Sounds good. I'll make these changes in v2.

>
>>  	default:
>>  		return -ENODEV;
>>  	}
>> @@ -1922,6 +1935,7 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
>>  				    enum iommu_attr attr, void *data)
>>  {
>>  	struct arm_smmu_domain *smmu_domain = domain->priv;
>> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
>>  
>>  	switch (attr) {
>>  	case DOMAIN_ATTR_NESTING:
>> @@ -1933,6 +1947,9 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
>>  			smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
>>  
>>  		return 0;
>> +	case DOMAIN_ATTR_COHERENT_HTW_DISABLE:
>> +		cfg->htw_disable = *((bool *)data);
>> +		return 0;
>>  	default:
>>  		return -ENODEV;
>>  	}
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index 0550286df4..8a6449857a 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -81,6 +81,7 @@ enum iommu_attr {
>>  	DOMAIN_ATTR_FSL_PAMU_ENABLE,
>>  	DOMAIN_ATTR_FSL_PAMUV1,
>>  	DOMAIN_ATTR_NESTING,	/* two stages of translation */
>> +	DOMAIN_ATTR_COHERENT_HTW_DISABLE,
>
> I wonder whether we should make this ARM-specific. Can you take a quick look
> to see if any of the other IOMMUs can potentially benefit from this?

Yeah looks like amd_iommu.c and intel-iommu.c are using
IOMMU_CAP_CACHE_COHERENCY which seems to be the same thing (at least
that's how we're treating it in arm-smmu.c). AMD's doesn't look
configurable but Intel's does, so perhaps they would benefit from this.



-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
  2014-08-19 12:58         ` Will Deacon
@ 2014-08-19 19:28             ` Mitchel Humpherys
  -1 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-19 19:28 UTC (permalink / raw)
  To: Will Deacon
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Tue, Aug 19 2014 at 05:58:34 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
> I also assume that the clk API ignores calls to clk_enable_prepare
> for a clk that's already enabled? I couldn't find that code...

That's clk_prepare_enable, not clk_enable_prepare. It's in
<linux/clk.h>.


-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
@ 2014-08-19 19:28             ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-08-19 19:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 19 2014 at 05:58:34 AM, Will Deacon <will.deacon@arm.com> wrote:
> I also assume that the clk API ignores calls to clk_enable_prepare
> for a clk that's already enabled? I couldn't find that code...

That's clk_prepare_enable, not clk_enable_prepare. It's in
<linux/clk.h>.


-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 4/6] iommu/arm-smmu: implement generic DT bindings
  2014-08-19 12:54         ` Will Deacon
@ 2014-08-20  3:18             ` Arnd Bergmann
  -1 siblings, 0 replies; 68+ messages in thread
From: Arnd Bergmann @ 2014-08-20  3:18 UTC (permalink / raw)
  To: Will Deacon
  Cc: mark.rutland-5wv7dgnIgG8, devicetree-u79uwXL29TY76Z2rM5mHXA,
	robh-DgEjT+Ai2ygdnm+yROfE0A,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	treding-DDmLM1+adcrQT0dZR+AlfA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Tuesday 19 August 2014, Will Deacon wrote:
> > +static int arm_smmu_parse_iommus_properties(struct arm_smmu_device *smmu,
> > +                                     int *num_masters)
> > +{
> > +     struct of_phandle_args iommuspec;
> > +     struct device_node *dn;
> > +
> > +     for_each_node_with_property(dn, "iommus") {
> > +             int arg_ind = 0;
> > +             struct iommus_entry *entry, *n;
> > +             LIST_HEAD(iommus);
> > +
> > +             while (!of_parse_phandle_with_args(dn, "iommus", "#iommu-cells",
> > +                                                     arg_ind, &iommuspec)) {
> 
> We need to check that the phandle does indeed point at one of our SMMUs
> here, in case we have a system with multiple IOMMU types, all using the
> generic binding.

Why does the iommu driver do this at all? The of_parse_phandle_with_args()
call should really be part of the iommu common code when we probe the devices in
of_dma_configure() as far as I can tell.

The way I would have expected it to happen is to have some code that
ensures the iommu drivers are instantiated before we do call of_platform_populate,
and then of_dma_configure() looks for whether there is an iommu property or not
and calls into the respective iommu driver for doing the setup if there is.

	Arnd

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 4/6] iommu/arm-smmu: implement generic DT bindings
@ 2014-08-20  3:18             ` Arnd Bergmann
  0 siblings, 0 replies; 68+ messages in thread
From: Arnd Bergmann @ 2014-08-20  3:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 19 August 2014, Will Deacon wrote:
> > +static int arm_smmu_parse_iommus_properties(struct arm_smmu_device *smmu,
> > +                                     int *num_masters)
> > +{
> > +     struct of_phandle_args iommuspec;
> > +     struct device_node *dn;
> > +
> > +     for_each_node_with_property(dn, "iommus") {
> > +             int arg_ind = 0;
> > +             struct iommus_entry *entry, *n;
> > +             LIST_HEAD(iommus);
> > +
> > +             while (!of_parse_phandle_with_args(dn, "iommus", "#iommu-cells",
> > +                                                     arg_ind, &iommuspec)) {
> 
> We need to check that the phandle does indeed point at one of our SMMUs
> here, in case we have a system with multiple IOMMU types, all using the
> generic binding.

Why does the iommu driver do this at all? The of_parse_phandle_with_args()
call should really be part of the iommu common code when we probe the devices in
of_dma_configure() as far as I can tell.

The way I would have expected it to happen is to have some code that
ensures the iommu drivers are instantiated before we do call of_platform_populate,
and then of_dma_configure() looks for whether there is an iommu property or not
and calls into the respective iommu driver for doing the setup if there is.

	Arnd

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 3/6] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
  2014-08-19 18:12             ` Mitchel Humpherys
@ 2014-08-26 13:54                 ` Will Deacon
  -1 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-08-26 13:54 UTC (permalink / raw)
  To: Mitchel Humpherys
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Hi Mitch,

On Tue, Aug 19, 2014 at 07:12:41PM +0100, Mitchel Humpherys wrote:
> On Tue, Aug 19 2014 at 05:44:32 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
> > We don't have writeq for arch/arm/.
> 
> Ah yes looks like this is an MSM-ism that never made it upstream since
> it wouldn't be guaranteed to be atomic. I'll make sure to do arm32
> compiles on upstream kernels for future patches, sorry!
> 
> I guess we could use <asm-generic/io-64-nonatomic-lo-hi.h> but I can
> also re-work this to be two separate writel's.

Yeah, just do two writels.

> >> +	}
> >> +
> >> +	mb();
> >
> > Why?
> 
> My thought was that if we start polling ATSR_ACTIVE prematurely (before
> the write to ATS1PR actually finishes) all heck could break loose? Not
> sure if that's a bogus assumption due to device memory being strongly
> ordered?

I think the device-memory guarantees should be enough. If not, we need a
comment explaining why.

> >> +	while (readl_relaxed(cb_base + ARM_SMMU_CB_ATSR) & ATSR_ACTIVE) {
> >> +		if (++count == ATSR_LOOP_TIMEOUT) {
> >> +			dev_err(dev,
> >> +				"iova to phys timed out on 0x%pa for %s. Falling back to software table walk.\n",
> >> +				&iova, dev_name(dev));
> >> +			arm_smmu_disable_clocks(smmu);
> >> +			return arm_smmu_iova_to_phys_soft(domain, iova);
> >> +		}
> >> +		cpu_relax();
> >> +	}
> >
> > Do you know what happened to Olav's patches to make this sort of code
> > generic?
> 
> I assume you're talking about this, right?
> 
>     http://lists.infradead.org/pipermail/linux-arm-kernel/2014-June/267943.html
> 
> Yeah looks like he never sent an update since it was part of a series
> that wasn't going to make it in (the qsmmu driver). I can always bring
> that patch (actually Matt Wagantall's patch) in here and rework this to
> use that.

Yup, I think it would be useful to revive that as a separate series.

> >
> >> @@ -2005,6 +2073,11 @@ int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
> >>  		return -ENODEV;
> >>  	}
> >>  
> >> +	if (smmu->version == 1 || (!(id & ID0_ATOSNS) && (id & ID0_S1TS))) {
> >
> > Are you sure about this? The v2 spec says that is ATOSNS is clear then S1TS
> > is also clear.
> 
> I was looking at Section 4.1.1 of ARM IHI 0062C ID091613 which states:
> 
>     In SMMUv2, the address translation registers are OPTIONAL. The
>     address translation registers are implemented only when both:
> 
>         o The SMMU_IDR0.S1TS bit is set to 1.
>         o The SMMU_IDR0.ATOSNS bit is set to 0.
> 
> I assume you're referring to section 9.6.1 of the same document:
> 
>     ATOSNS, bit[26]
>     Address Translation Operations Not Supported. The possible values of
>     this bit are:
> 
>         0 Address translation operations are supported. Stage 1
>           translation is not supported, that is, the S1TS bit is set to 0.
> 
>         1 Address translation operations are not supported. Stage 1
>           translation is supported, that is, the S1TS bit is set to 1.
> 
> If that really means that S1TS and ATOSNS always have the same value
> then Section 4.1.1 doesn't make any sense. Or am I missing something?

I'll get this checked, as those two paragraphs don't make an awful lot of
sense together.

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 3/6] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
@ 2014-08-26 13:54                 ` Will Deacon
  0 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-08-26 13:54 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Mitch,

On Tue, Aug 19, 2014 at 07:12:41PM +0100, Mitchel Humpherys wrote:
> On Tue, Aug 19 2014 at 05:44:32 AM, Will Deacon <will.deacon@arm.com> wrote:
> > We don't have writeq for arch/arm/.
> 
> Ah yes looks like this is an MSM-ism that never made it upstream since
> it wouldn't be guaranteed to be atomic. I'll make sure to do arm32
> compiles on upstream kernels for future patches, sorry!
> 
> I guess we could use <asm-generic/io-64-nonatomic-lo-hi.h> but I can
> also re-work this to be two separate writel's.

Yeah, just do two writels.

> >> +	}
> >> +
> >> +	mb();
> >
> > Why?
> 
> My thought was that if we start polling ATSR_ACTIVE prematurely (before
> the write to ATS1PR actually finishes) all heck could break loose? Not
> sure if that's a bogus assumption due to device memory being strongly
> ordered?

I think the device-memory guarantees should be enough. If not, we need a
comment explaining why.

> >> +	while (readl_relaxed(cb_base + ARM_SMMU_CB_ATSR) & ATSR_ACTIVE) {
> >> +		if (++count == ATSR_LOOP_TIMEOUT) {
> >> +			dev_err(dev,
> >> +				"iova to phys timed out on 0x%pa for %s. Falling back to software table walk.\n",
> >> +				&iova, dev_name(dev));
> >> +			arm_smmu_disable_clocks(smmu);
> >> +			return arm_smmu_iova_to_phys_soft(domain, iova);
> >> +		}
> >> +		cpu_relax();
> >> +	}
> >
> > Do you know what happened to Olav's patches to make this sort of code
> > generic?
> 
> I assume you're talking about this, right?
> 
>     http://lists.infradead.org/pipermail/linux-arm-kernel/2014-June/267943.html
> 
> Yeah looks like he never sent an update since it was part of a series
> that wasn't going to make it in (the qsmmu driver). I can always bring
> that patch (actually Matt Wagantall's patch) in here and rework this to
> use that.

Yup, I think it would be useful to revive that as a separate series.

> >
> >> @@ -2005,6 +2073,11 @@ int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
> >>  		return -ENODEV;
> >>  	}
> >>  
> >> +	if (smmu->version == 1 || (!(id & ID0_ATOSNS) && (id & ID0_S1TS))) {
> >
> > Are you sure about this? The v2 spec says that is ATOSNS is clear then S1TS
> > is also clear.
> 
> I was looking at Section 4.1.1 of ARM IHI 0062C ID091613 which states:
> 
>     In SMMUv2, the address translation registers are OPTIONAL. The
>     address translation registers are implemented only when both:
> 
>         o The SMMU_IDR0.S1TS bit is set to 1.
>         o The SMMU_IDR0.ATOSNS bit is set to 0.
> 
> I assume you're referring to section 9.6.1 of the same document:
> 
>     ATOSNS, bit[26]
>     Address Translation Operations Not Supported. The possible values of
>     this bit are:
> 
>         0 Address translation operations are supported. Stage 1
>           translation is not supported, that is, the S1TS bit is set to 0.
> 
>         1 Address translation operations are not supported. Stage 1
>           translation is supported, that is, the S1TS bit is set to 1.
> 
> If that really means that S1TS and ATOSNS always have the same value
> then Section 4.1.1 doesn't make any sense. Or am I missing something?

I'll get this checked, as those two paragraphs don't make an awful lot of
sense together.

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
  2014-08-19 19:03             ` Olav Haugan
@ 2014-08-26 14:27                 ` Will Deacon
  -1 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-08-26 14:27 UTC (permalink / raw)
  To: Olav Haugan
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	mturquette-QSEj5FYQhm4dnm+yROfE0A

[adding Mike]

On Tue, Aug 19, 2014 at 08:03:09PM +0100, Olav Haugan wrote:
> Hi Will,

Hi Olav,

> On 8/19/2014 5:58 AM, Will Deacon wrote:
> > On Wed, Aug 13, 2014 at 01:51:34AM +0100, Mitchel Humpherys wrote:
> >> On some platforms with tight power constraints it is polite to only
> >> leave your clocks on for as long as you absolutely need them. Currently
> >> we assume that all clocks necessary for SMMU register access are always
> >> on.
> >>
> >> Add some optional device tree properties to specify any clocks that are
> >> necessary for SMMU register access and turn them on and off as needed.
> >>
> >> If no clocks are specified in the device tree things continue to work
> >> the way they always have: we assume all necessary clocks are always
> >> turned on.
> > 
> > How does this interact with an SMMU in bypass mode?
> 
> Do you mean if you have a platform that requires clock and power
> management but we leave the SMMU in bypass (i.e. no one calls into the
> SMMU driver) how are the clock/power managed?
> 
> Clients of the SMMU driver are required to vote for clocks and power
> when they know they need to use the SMMU. However, the clock and power
> needed to be on for the SMMU to service bus masters aren't necessarily
> the same as the ones needed to read/write registers...See below.

The case I'm thinking of is where a device masters through the IOMMU, but
doesn't make use of any translations. In this case, its transactions will
bypass the SMMU and I want to ensure that continues to happen, regardless of
the power state of the SMMU.

> >> +static int arm_smmu_enable_clocks(struct arm_smmu_device *smmu)
> >> +{
> >> +	int i, ret = 0;
> >> +
> >> +	for (i = 0; i < smmu->num_clocks; ++i) {
> >> +		ret = clk_prepare_enable(smmu->clocks[i]);
> >> +		if (ret) {
> >> +			dev_err(smmu->dev, "Couldn't enable clock #%d\n", i);
> >> +			while (i--)
> >> +				clk_disable_unprepare(smmu->clocks[i]);
> >> +			break;
> >> +		}
> >> +	}
> >> +
> >> +	return ret;
> >> +}
> >> +
> >> +static void arm_smmu_disable_clocks(struct arm_smmu_device *smmu)
> >> +{
> >> +	int i;
> >> +
> >> +	for (i = 0; i < smmu->num_clocks; ++i)
> >> +		clk_disable_unprepare(smmu->clocks[i]);
> >> +}
> > 
> > What stops theses from racing with each other when there are multiple
> > clocks? I also assume that the clk API ignores calls to clk_enable_prepare
> > for a clk that's already enabled? I couldn't find that code...
> 
> All the clock APIs are reference counted yes. Not sure what you mean by
> racing with each other? When you call to enable a clock the call does
> not return until the clock is already ON (or OFF).

I was thinking of an interrupt handler racing with normal code, but actually
you balance the clk enable/disable in the interrupt handlers. However, it's
not safe to call these clk functions from irq context anyway, since
clk_prepare may sleep.

> >> +int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
> >>  {
> >>  	unsigned long size;
> >>  	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
> >> @@ -2027,10 +2124,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
> >>  	}
> >>  	dev_notice(dev, "registered %d master devices\n", i);
> >>  
> >> -	err = arm_smmu_device_cfg_probe(smmu);
> >> +	err = arm_smmu_init_clocks(smmu);
> >>  	if (err)
> >>  		goto out_put_masters;
> >>  
> >> +	arm_smmu_enable_clocks(smmu);
> >> +
> >> +	err = arm_smmu_device_cfg_probe(smmu);
> >> +	if (err)
> >> +		goto out_disable_clocks;
> >> +
> >>  	parse_driver_options(smmu);
> >>  
> >>  	if (smmu->version > 1 &&
> >> @@ -2039,7 +2142,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
> >>  			"found only %d context interrupt(s) but %d required\n",
> >>  			smmu->num_context_irqs, smmu->num_context_banks);
> >>  		err = -ENODEV;
> >> -		goto out_put_masters;
> >> +		goto out_disable_clocks;
> >>  	}
> >>  
> >>  	for (i = 0; i < smmu->num_global_irqs; ++i) {
> >> @@ -2061,12 +2164,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
> >>  	spin_unlock(&arm_smmu_devices_lock);
> >>  
> >>  	arm_smmu_device_reset(smmu);
> >> +	arm_smmu_disable_clocks(smmu);
> > 
> > I wonder if this is really the right thing to do. Rather than the
> > fine-grained clock enable/disable you have, why don't we just enable in
> > domain_init and disable in domain_destroy, with refcounting for the clocks?
> > 
> 
> So the whole point of all of this is that we try to save power. As Mitch
> wrote in the commit text we want to only leave the clock and power on
> for as short period of time as possible.

Understood, but if the clocks are going up and down like yo-yos, then it's
not obvious that you end up saving any power at all. Have you tried
measuring the power consumption with different granularities for the clocks?

The code you're proposing seems to take the approach of `we're going to
access registers so enable the clocks, access the registers then disable the
clocks', which is simple but may not be particularly effective.

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
@ 2014-08-26 14:27                 ` Will Deacon
  0 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-08-26 14:27 UTC (permalink / raw)
  To: linux-arm-kernel

[adding Mike]

On Tue, Aug 19, 2014 at 08:03:09PM +0100, Olav Haugan wrote:
> Hi Will,

Hi Olav,

> On 8/19/2014 5:58 AM, Will Deacon wrote:
> > On Wed, Aug 13, 2014 at 01:51:34AM +0100, Mitchel Humpherys wrote:
> >> On some platforms with tight power constraints it is polite to only
> >> leave your clocks on for as long as you absolutely need them. Currently
> >> we assume that all clocks necessary for SMMU register access are always
> >> on.
> >>
> >> Add some optional device tree properties to specify any clocks that are
> >> necessary for SMMU register access and turn them on and off as needed.
> >>
> >> If no clocks are specified in the device tree things continue to work
> >> the way they always have: we assume all necessary clocks are always
> >> turned on.
> > 
> > How does this interact with an SMMU in bypass mode?
> 
> Do you mean if you have a platform that requires clock and power
> management but we leave the SMMU in bypass (i.e. no one calls into the
> SMMU driver) how are the clock/power managed?
> 
> Clients of the SMMU driver are required to vote for clocks and power
> when they know they need to use the SMMU. However, the clock and power
> needed to be on for the SMMU to service bus masters aren't necessarily
> the same as the ones needed to read/write registers...See below.

The case I'm thinking of is where a device masters through the IOMMU, but
doesn't make use of any translations. In this case, its transactions will
bypass the SMMU and I want to ensure that continues to happen, regardless of
the power state of the SMMU.

> >> +static int arm_smmu_enable_clocks(struct arm_smmu_device *smmu)
> >> +{
> >> +	int i, ret = 0;
> >> +
> >> +	for (i = 0; i < smmu->num_clocks; ++i) {
> >> +		ret = clk_prepare_enable(smmu->clocks[i]);
> >> +		if (ret) {
> >> +			dev_err(smmu->dev, "Couldn't enable clock #%d\n", i);
> >> +			while (i--)
> >> +				clk_disable_unprepare(smmu->clocks[i]);
> >> +			break;
> >> +		}
> >> +	}
> >> +
> >> +	return ret;
> >> +}
> >> +
> >> +static void arm_smmu_disable_clocks(struct arm_smmu_device *smmu)
> >> +{
> >> +	int i;
> >> +
> >> +	for (i = 0; i < smmu->num_clocks; ++i)
> >> +		clk_disable_unprepare(smmu->clocks[i]);
> >> +}
> > 
> > What stops theses from racing with each other when there are multiple
> > clocks? I also assume that the clk API ignores calls to clk_enable_prepare
> > for a clk that's already enabled? I couldn't find that code...
> 
> All the clock APIs are reference counted yes. Not sure what you mean by
> racing with each other? When you call to enable a clock the call does
> not return until the clock is already ON (or OFF).

I was thinking of an interrupt handler racing with normal code, but actually
you balance the clk enable/disable in the interrupt handlers. However, it's
not safe to call these clk functions from irq context anyway, since
clk_prepare may sleep.

> >> +int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
> >>  {
> >>  	unsigned long size;
> >>  	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
> >> @@ -2027,10 +2124,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
> >>  	}
> >>  	dev_notice(dev, "registered %d master devices\n", i);
> >>  
> >> -	err = arm_smmu_device_cfg_probe(smmu);
> >> +	err = arm_smmu_init_clocks(smmu);
> >>  	if (err)
> >>  		goto out_put_masters;
> >>  
> >> +	arm_smmu_enable_clocks(smmu);
> >> +
> >> +	err = arm_smmu_device_cfg_probe(smmu);
> >> +	if (err)
> >> +		goto out_disable_clocks;
> >> +
> >>  	parse_driver_options(smmu);
> >>  
> >>  	if (smmu->version > 1 &&
> >> @@ -2039,7 +2142,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
> >>  			"found only %d context interrupt(s) but %d required\n",
> >>  			smmu->num_context_irqs, smmu->num_context_banks);
> >>  		err = -ENODEV;
> >> -		goto out_put_masters;
> >> +		goto out_disable_clocks;
> >>  	}
> >>  
> >>  	for (i = 0; i < smmu->num_global_irqs; ++i) {
> >> @@ -2061,12 +2164,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
> >>  	spin_unlock(&arm_smmu_devices_lock);
> >>  
> >>  	arm_smmu_device_reset(smmu);
> >> +	arm_smmu_disable_clocks(smmu);
> > 
> > I wonder if this is really the right thing to do. Rather than the
> > fine-grained clock enable/disable you have, why don't we just enable in
> > domain_init and disable in domain_destroy, with refcounting for the clocks?
> > 
> 
> So the whole point of all of this is that we try to save power. As Mitch
> wrote in the commit text we want to only leave the clock and power on
> for as short period of time as possible.

Understood, but if the clocks are going up and down like yo-yos, then it's
not obvious that you end up saving any power at all. Have you tried
measuring the power consumption with different granularities for the clocks?

The code you're proposing seems to take the approach of `we're going to
access registers so enable the clocks, access the registers then disable the
clocks', which is simple but may not be particularly effective.

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 3/6] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
  2014-08-26 13:54                 ` Will Deacon
@ 2014-09-01 16:15                     ` Will Deacon
  -1 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-09-01 16:15 UTC (permalink / raw)
  To: Mitchel Humpherys
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Tue, Aug 26, 2014 at 02:54:51PM +0100, Will Deacon wrote:
> On Tue, Aug 19, 2014 at 07:12:41PM +0100, Mitchel Humpherys wrote:
> > On Tue, Aug 19 2014 at 05:44:32 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
> > >> @@ -2005,6 +2073,11 @@ int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
> > >>  		return -ENODEV;
> > >>  	}
> > >>  
> > >> +	if (smmu->version == 1 || (!(id & ID0_ATOSNS) && (id & ID0_S1TS))) {
> > >
> > > Are you sure about this? The v2 spec says that is ATOSNS is clear then S1TS
> > > is also clear.
> > 
> > I was looking at Section 4.1.1 of ARM IHI 0062C ID091613 which states:
> > 
> >     In SMMUv2, the address translation registers are OPTIONAL. The
> >     address translation registers are implemented only when both:
> > 
> >         o The SMMU_IDR0.S1TS bit is set to 1.
> >         o The SMMU_IDR0.ATOSNS bit is set to 0.
> > 
> > I assume you're referring to section 9.6.1 of the same document:
> > 
> >     ATOSNS, bit[26]
> >     Address Translation Operations Not Supported. The possible values of
> >     this bit are:
> > 
> >         0 Address translation operations are supported. Stage 1
> >           translation is not supported, that is, the S1TS bit is set to 0.
> > 
> >         1 Address translation operations are not supported. Stage 1
> >           translation is supported, that is, the S1TS bit is set to 1.
> > 
> > If that really means that S1TS and ATOSNS always have the same value
> > then Section 4.1.1 doesn't make any sense. Or am I missing something?
> 
> I'll get this checked, as those two paragraphs don't make an awful lot of
> sense together.

Right, word from above says that ATOS is implemented iff:

  IDR0.S1TS == 1 && IDR0.ATOSNS == 0

Basically, ATOS only works for stage-1 when it's present, so that explains
the dependency. Looking at the piece of diff at the top of this mail, I
think that means your code is correct

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 3/6] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
@ 2014-09-01 16:15                     ` Will Deacon
  0 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-09-01 16:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 26, 2014 at 02:54:51PM +0100, Will Deacon wrote:
> On Tue, Aug 19, 2014 at 07:12:41PM +0100, Mitchel Humpherys wrote:
> > On Tue, Aug 19 2014 at 05:44:32 AM, Will Deacon <will.deacon@arm.com> wrote:
> > >> @@ -2005,6 +2073,11 @@ int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
> > >>  		return -ENODEV;
> > >>  	}
> > >>  
> > >> +	if (smmu->version == 1 || (!(id & ID0_ATOSNS) && (id & ID0_S1TS))) {
> > >
> > > Are you sure about this? The v2 spec says that is ATOSNS is clear then S1TS
> > > is also clear.
> > 
> > I was looking at Section 4.1.1 of ARM IHI 0062C ID091613 which states:
> > 
> >     In SMMUv2, the address translation registers are OPTIONAL. The
> >     address translation registers are implemented only when both:
> > 
> >         o The SMMU_IDR0.S1TS bit is set to 1.
> >         o The SMMU_IDR0.ATOSNS bit is set to 0.
> > 
> > I assume you're referring to section 9.6.1 of the same document:
> > 
> >     ATOSNS, bit[26]
> >     Address Translation Operations Not Supported. The possible values of
> >     this bit are:
> > 
> >         0 Address translation operations are supported. Stage 1
> >           translation is not supported, that is, the S1TS bit is set to 0.
> > 
> >         1 Address translation operations are not supported. Stage 1
> >           translation is supported, that is, the S1TS bit is set to 1.
> > 
> > If that really means that S1TS and ATOSNS always have the same value
> > then Section 4.1.1 doesn't make any sense. Or am I missing something?
> 
> I'll get this checked, as those two paragraphs don't make an awful lot of
> sense together.

Right, word from above says that ATOS is implemented iff:

  IDR0.S1TS == 1 && IDR0.ATOSNS == 0

Basically, ATOS only works for stage-1 when it's present, so that explains
the dependency. Looking at the piece of diff at the top of this mail, I
think that means your code is correct

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
  2014-08-26 14:27                 ` Will Deacon
@ 2014-09-10  1:29                     ` Mitchel Humpherys
  -1 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-09-10  1:29 UTC (permalink / raw)
  To: Will Deacon
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	mturquette-QSEj5FYQhm4dnm+yROfE0A,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Tue, Aug 26 2014 at 07:27:58 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
> [adding Mike]
>
> On Tue, Aug 19, 2014 at 08:03:09PM +0100, Olav Haugan wrote:
>> Hi Will,
>
> Hi Olav,
>
>> On 8/19/2014 5:58 AM, Will Deacon wrote:
>> > On Wed, Aug 13, 2014 at 01:51:34AM +0100, Mitchel Humpherys wrote:
>> >> On some platforms with tight power constraints it is polite to only
>> >> leave your clocks on for as long as you absolutely need them. Currently
>> >> we assume that all clocks necessary for SMMU register access are always
>> >> on.
>> >>
>> >> Add some optional device tree properties to specify any clocks that are
>> >> necessary for SMMU register access and turn them on and off as needed.
>> >>
>> >> If no clocks are specified in the device tree things continue to work
>> >> the way they always have: we assume all necessary clocks are always
>> >> turned on.
>> > 
>> > How does this interact with an SMMU in bypass mode?
>> 
>> Do you mean if you have a platform that requires clock and power
>> management but we leave the SMMU in bypass (i.e. no one calls into the
>> SMMU driver) how are the clock/power managed?
>> 
>> Clients of the SMMU driver are required to vote for clocks and power
>> when they know they need to use the SMMU. However, the clock and power
>> needed to be on for the SMMU to service bus masters aren't necessarily
>> the same as the ones needed to read/write registers...See below.
>
> The case I'm thinking of is where a device masters through the IOMMU, but
> doesn't make use of any translations. In this case, its transactions will
> bypass the SMMU and I want to ensure that continues to happen, regardless of
> the power state of the SMMU.

Then I assume the driver for such a device wouldn't be attaching to (or
detaching from) the IOMMU, so we won't be touching it at all either
way. Or am I missing something?

>
>> >> +static int arm_smmu_enable_clocks(struct arm_smmu_device *smmu)
>> >> +{
>> >> +	int i, ret = 0;
>> >> +
>> >> +	for (i = 0; i < smmu->num_clocks; ++i) {
>> >> +		ret = clk_prepare_enable(smmu->clocks[i]);
>> >> +		if (ret) {
>> >> +			dev_err(smmu->dev, "Couldn't enable clock #%d\n", i);
>> >> +			while (i--)
>> >> +				clk_disable_unprepare(smmu->clocks[i]);
>> >> +			break;
>> >> +		}
>> >> +	}
>> >> +
>> >> +	return ret;
>> >> +}
>> >> +
>> >> +static void arm_smmu_disable_clocks(struct arm_smmu_device *smmu)
>> >> +{
>> >> +	int i;
>> >> +
>> >> +	for (i = 0; i < smmu->num_clocks; ++i)
>> >> +		clk_disable_unprepare(smmu->clocks[i]);
>> >> +}
>> > 
>> > What stops theses from racing with each other when there are multiple
>> > clocks? I also assume that the clk API ignores calls to clk_enable_prepare
>> > for a clk that's already enabled? I couldn't find that code...
>> 
>> All the clock APIs are reference counted yes. Not sure what you mean by
>> racing with each other? When you call to enable a clock the call does
>> not return until the clock is already ON (or OFF).
>
> I was thinking of an interrupt handler racing with normal code, but actually
> you balance the clk enable/disable in the interrupt handlers. However, it's
> not safe to call these clk functions from irq context anyway, since
> clk_prepare may sleep.

Ah yes. You okay with moving to a threaded IRQ?

>
>> >> +int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
>> >>  {
>> >>  	unsigned long size;
>> >>  	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
>> >> @@ -2027,10 +2124,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>> >>  	}
>> >>  	dev_notice(dev, "registered %d master devices\n", i);
>> >>  
>> >> -	err = arm_smmu_device_cfg_probe(smmu);
>> >> +	err = arm_smmu_init_clocks(smmu);
>> >>  	if (err)
>> >>  		goto out_put_masters;
>> >>  
>> >> +	arm_smmu_enable_clocks(smmu);
>> >> +
>> >> +	err = arm_smmu_device_cfg_probe(smmu);
>> >> +	if (err)
>> >> +		goto out_disable_clocks;
>> >> +
>> >>  	parse_driver_options(smmu);
>> >>  
>> >>  	if (smmu->version > 1 &&
>> >> @@ -2039,7 +2142,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>> >>  			"found only %d context interrupt(s) but %d required\n",
>> >>  			smmu->num_context_irqs, smmu->num_context_banks);
>> >>  		err = -ENODEV;
>> >> -		goto out_put_masters;
>> >> +		goto out_disable_clocks;
>> >>  	}
>> >>  
>> >>  	for (i = 0; i < smmu->num_global_irqs; ++i) {
>> >> @@ -2061,12 +2164,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>> >>  	spin_unlock(&arm_smmu_devices_lock);
>> >>  
>> >>  	arm_smmu_device_reset(smmu);
>> >> +	arm_smmu_disable_clocks(smmu);
>> > 
>> > I wonder if this is really the right thing to do. Rather than the
>> > fine-grained clock enable/disable you have, why don't we just enable in
>> > domain_init and disable in domain_destroy, with refcounting for the clocks?
>> > 
>> 
>> So the whole point of all of this is that we try to save power. As Mitch
>> wrote in the commit text we want to only leave the clock and power on
>> for as short period of time as possible.
>
> Understood, but if the clocks are going up and down like yo-yos, then it's
> not obvious that you end up saving any power at all. Have you tried
> measuring the power consumption with different granularities for the
> clocks?

This has been profiled extensively and for some use cases it's a huge
win. Unfortunately we don't have any numbers for public sharing :( but
you can imagine a use case where some multimedia framework maps a bunch
of buffers into an SMMU at the beginning of some interactive user
session and doesn't unmap them until the (human) user decides they are
done. This could be a long time, all the while these clocks could be
off, saving power.

> The code you're proposing seems to take the approach of `we're going to
> access registers so enable the clocks, access the registers then disable the
> clocks', which is simple but may not be particularly effective.

Yes, that's a good summary of the approach here. It has been effective
in saving power for us in the past...


-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
@ 2014-09-10  1:29                     ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-09-10  1:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 26 2014 at 07:27:58 AM, Will Deacon <will.deacon@arm.com> wrote:
> [adding Mike]
>
> On Tue, Aug 19, 2014 at 08:03:09PM +0100, Olav Haugan wrote:
>> Hi Will,
>
> Hi Olav,
>
>> On 8/19/2014 5:58 AM, Will Deacon wrote:
>> > On Wed, Aug 13, 2014 at 01:51:34AM +0100, Mitchel Humpherys wrote:
>> >> On some platforms with tight power constraints it is polite to only
>> >> leave your clocks on for as long as you absolutely need them. Currently
>> >> we assume that all clocks necessary for SMMU register access are always
>> >> on.
>> >>
>> >> Add some optional device tree properties to specify any clocks that are
>> >> necessary for SMMU register access and turn them on and off as needed.
>> >>
>> >> If no clocks are specified in the device tree things continue to work
>> >> the way they always have: we assume all necessary clocks are always
>> >> turned on.
>> > 
>> > How does this interact with an SMMU in bypass mode?
>> 
>> Do you mean if you have a platform that requires clock and power
>> management but we leave the SMMU in bypass (i.e. no one calls into the
>> SMMU driver) how are the clock/power managed?
>> 
>> Clients of the SMMU driver are required to vote for clocks and power
>> when they know they need to use the SMMU. However, the clock and power
>> needed to be on for the SMMU to service bus masters aren't necessarily
>> the same as the ones needed to read/write registers...See below.
>
> The case I'm thinking of is where a device masters through the IOMMU, but
> doesn't make use of any translations. In this case, its transactions will
> bypass the SMMU and I want to ensure that continues to happen, regardless of
> the power state of the SMMU.

Then I assume the driver for such a device wouldn't be attaching to (or
detaching from) the IOMMU, so we won't be touching it at all either
way. Or am I missing something?

>
>> >> +static int arm_smmu_enable_clocks(struct arm_smmu_device *smmu)
>> >> +{
>> >> +	int i, ret = 0;
>> >> +
>> >> +	for (i = 0; i < smmu->num_clocks; ++i) {
>> >> +		ret = clk_prepare_enable(smmu->clocks[i]);
>> >> +		if (ret) {
>> >> +			dev_err(smmu->dev, "Couldn't enable clock #%d\n", i);
>> >> +			while (i--)
>> >> +				clk_disable_unprepare(smmu->clocks[i]);
>> >> +			break;
>> >> +		}
>> >> +	}
>> >> +
>> >> +	return ret;
>> >> +}
>> >> +
>> >> +static void arm_smmu_disable_clocks(struct arm_smmu_device *smmu)
>> >> +{
>> >> +	int i;
>> >> +
>> >> +	for (i = 0; i < smmu->num_clocks; ++i)
>> >> +		clk_disable_unprepare(smmu->clocks[i]);
>> >> +}
>> > 
>> > What stops theses from racing with each other when there are multiple
>> > clocks? I also assume that the clk API ignores calls to clk_enable_prepare
>> > for a clk that's already enabled? I couldn't find that code...
>> 
>> All the clock APIs are reference counted yes. Not sure what you mean by
>> racing with each other? When you call to enable a clock the call does
>> not return until the clock is already ON (or OFF).
>
> I was thinking of an interrupt handler racing with normal code, but actually
> you balance the clk enable/disable in the interrupt handlers. However, it's
> not safe to call these clk functions from irq context anyway, since
> clk_prepare may sleep.

Ah yes. You okay with moving to a threaded IRQ?

>
>> >> +int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
>> >>  {
>> >>  	unsigned long size;
>> >>  	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
>> >> @@ -2027,10 +2124,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>> >>  	}
>> >>  	dev_notice(dev, "registered %d master devices\n", i);
>> >>  
>> >> -	err = arm_smmu_device_cfg_probe(smmu);
>> >> +	err = arm_smmu_init_clocks(smmu);
>> >>  	if (err)
>> >>  		goto out_put_masters;
>> >>  
>> >> +	arm_smmu_enable_clocks(smmu);
>> >> +
>> >> +	err = arm_smmu_device_cfg_probe(smmu);
>> >> +	if (err)
>> >> +		goto out_disable_clocks;
>> >> +
>> >>  	parse_driver_options(smmu);
>> >>  
>> >>  	if (smmu->version > 1 &&
>> >> @@ -2039,7 +2142,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>> >>  			"found only %d context interrupt(s) but %d required\n",
>> >>  			smmu->num_context_irqs, smmu->num_context_banks);
>> >>  		err = -ENODEV;
>> >> -		goto out_put_masters;
>> >> +		goto out_disable_clocks;
>> >>  	}
>> >>  
>> >>  	for (i = 0; i < smmu->num_global_irqs; ++i) {
>> >> @@ -2061,12 +2164,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>> >>  	spin_unlock(&arm_smmu_devices_lock);
>> >>  
>> >>  	arm_smmu_device_reset(smmu);
>> >> +	arm_smmu_disable_clocks(smmu);
>> > 
>> > I wonder if this is really the right thing to do. Rather than the
>> > fine-grained clock enable/disable you have, why don't we just enable in
>> > domain_init and disable in domain_destroy, with refcounting for the clocks?
>> > 
>> 
>> So the whole point of all of this is that we try to save power. As Mitch
>> wrote in the commit text we want to only leave the clock and power on
>> for as short period of time as possible.
>
> Understood, but if the clocks are going up and down like yo-yos, then it's
> not obvious that you end up saving any power at all. Have you tried
> measuring the power consumption with different granularities for the
> clocks?

This has been profiled extensively and for some use cases it's a huge
win. Unfortunately we don't have any numbers for public sharing :( but
you can imagine a use case where some multimedia framework maps a bunch
of buffers into an SMMU at the beginning of some interactive user
session and doesn't unmap them until the (human) user decides they are
done. This could be a long time, all the while these clocks could be
off, saving power.

> The code you're proposing seems to take the approach of `we're going to
> access registers so enable the clocks, access the registers then disable the
> clocks', which is simple but may not be particularly effective.

Yes, that's a good summary of the approach here. It has been effective
in saving power for us in the past...


-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
  2014-09-10  1:29                     ` Mitchel Humpherys
@ 2014-09-10 18:27                         ` Will Deacon
  -1 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-09-10 18:27 UTC (permalink / raw)
  To: Mitchel Humpherys
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	mturquette-QSEj5FYQhm4dnm+yROfE0A,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Wed, Sep 10, 2014 at 02:29:42AM +0100, Mitchel Humpherys wrote:
> On Tue, Aug 26 2014 at 07:27:58 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
> > On Tue, Aug 19, 2014 at 08:03:09PM +0100, Olav Haugan wrote:
> >> Clients of the SMMU driver are required to vote for clocks and power
> >> when they know they need to use the SMMU. However, the clock and power
> >> needed to be on for the SMMU to service bus masters aren't necessarily
> >> the same as the ones needed to read/write registers...See below.
> >
> > The case I'm thinking of is where a device masters through the IOMMU, but
> > doesn't make use of any translations. In this case, its transactions will
> > bypass the SMMU and I want to ensure that continues to happen, regardless of
> > the power state of the SMMU.
> 
> Then I assume the driver for such a device wouldn't be attaching to (or
> detaching from) the IOMMU, so we won't be touching it at all either
> way. Or am I missing something?

As long as its only the register file that gets powered down, then there's
no issue. However, that's making assumptions about what these clocks are
controlling. Is there a way for the driver to know which aspects of the
device are controlled by which clock?

> >> > What stops theses from racing with each other when there are multiple
> >> > clocks? I also assume that the clk API ignores calls to clk_enable_prepare
> >> > for a clk that's already enabled? I couldn't find that code...
> >> 
> >> All the clock APIs are reference counted yes. Not sure what you mean by
> >> racing with each other? When you call to enable a clock the call does
> >> not return until the clock is already ON (or OFF).
> >
> > I was thinking of an interrupt handler racing with normal code, but actually
> > you balance the clk enable/disable in the interrupt handlers. However, it's
> > not safe to call these clk functions from irq context anyway, since
> > clk_prepare may sleep.
> 
> Ah yes. You okay with moving to a threaded IRQ?

A threaded IRQ already makes sense for context interrupts (if anybody has a
platform that can do stalls properly), but it seems a bit weird for the
global fault handler. Is there no way to fix the race instead?

> >> >> @@ -2061,12 +2164,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
> >> >>  	spin_unlock(&arm_smmu_devices_lock);
> >> >>  
> >> >>  	arm_smmu_device_reset(smmu);
> >> >> +	arm_smmu_disable_clocks(smmu);
> >> > 
> >> > I wonder if this is really the right thing to do. Rather than the
> >> > fine-grained clock enable/disable you have, why don't we just enable in
> >> > domain_init and disable in domain_destroy, with refcounting for the clocks?
> >> > 
> >> 
> >> So the whole point of all of this is that we try to save power. As Mitch
> >> wrote in the commit text we want to only leave the clock and power on
> >> for as short period of time as possible.
> >
> > Understood, but if the clocks are going up and down like yo-yos, then it's
> > not obvious that you end up saving any power at all. Have you tried
> > measuring the power consumption with different granularities for the
> > clocks?
> 
> This has been profiled extensively and for some use cases it's a huge
> win. Unfortunately we don't have any numbers for public sharing :( but
> you can imagine a use case where some multimedia framework maps a bunch
> of buffers into an SMMU at the beginning of some interactive user
> session and doesn't unmap them until the (human) user decides they are
> done. This could be a long time, all the while these clocks could be
> off, saving power.

Ok, I can see that. I wonder whether we could wrap all of the iommu_ops
functions with the clock enable/disable code, instead of putting it directly
into the drivers?

> > The code you're proposing seems to take the approach of `we're going to
> > access registers so enable the clocks, access the registers then disable the
> > clocks', which is simple but may not be particularly effective.
> 
> Yes, that's a good summary of the approach here. It has been effective
> in saving power for us in the past...

Mike, do you have any experience with this sort of stuff?

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
@ 2014-09-10 18:27                         ` Will Deacon
  0 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-09-10 18:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Sep 10, 2014 at 02:29:42AM +0100, Mitchel Humpherys wrote:
> On Tue, Aug 26 2014 at 07:27:58 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Tue, Aug 19, 2014 at 08:03:09PM +0100, Olav Haugan wrote:
> >> Clients of the SMMU driver are required to vote for clocks and power
> >> when they know they need to use the SMMU. However, the clock and power
> >> needed to be on for the SMMU to service bus masters aren't necessarily
> >> the same as the ones needed to read/write registers...See below.
> >
> > The case I'm thinking of is where a device masters through the IOMMU, but
> > doesn't make use of any translations. In this case, its transactions will
> > bypass the SMMU and I want to ensure that continues to happen, regardless of
> > the power state of the SMMU.
> 
> Then I assume the driver for such a device wouldn't be attaching to (or
> detaching from) the IOMMU, so we won't be touching it at all either
> way. Or am I missing something?

As long as its only the register file that gets powered down, then there's
no issue. However, that's making assumptions about what these clocks are
controlling. Is there a way for the driver to know which aspects of the
device are controlled by which clock?

> >> > What stops theses from racing with each other when there are multiple
> >> > clocks? I also assume that the clk API ignores calls to clk_enable_prepare
> >> > for a clk that's already enabled? I couldn't find that code...
> >> 
> >> All the clock APIs are reference counted yes. Not sure what you mean by
> >> racing with each other? When you call to enable a clock the call does
> >> not return until the clock is already ON (or OFF).
> >
> > I was thinking of an interrupt handler racing with normal code, but actually
> > you balance the clk enable/disable in the interrupt handlers. However, it's
> > not safe to call these clk functions from irq context anyway, since
> > clk_prepare may sleep.
> 
> Ah yes. You okay with moving to a threaded IRQ?

A threaded IRQ already makes sense for context interrupts (if anybody has a
platform that can do stalls properly), but it seems a bit weird for the
global fault handler. Is there no way to fix the race instead?

> >> >> @@ -2061,12 +2164,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
> >> >>  	spin_unlock(&arm_smmu_devices_lock);
> >> >>  
> >> >>  	arm_smmu_device_reset(smmu);
> >> >> +	arm_smmu_disable_clocks(smmu);
> >> > 
> >> > I wonder if this is really the right thing to do. Rather than the
> >> > fine-grained clock enable/disable you have, why don't we just enable in
> >> > domain_init and disable in domain_destroy, with refcounting for the clocks?
> >> > 
> >> 
> >> So the whole point of all of this is that we try to save power. As Mitch
> >> wrote in the commit text we want to only leave the clock and power on
> >> for as short period of time as possible.
> >
> > Understood, but if the clocks are going up and down like yo-yos, then it's
> > not obvious that you end up saving any power at all. Have you tried
> > measuring the power consumption with different granularities for the
> > clocks?
> 
> This has been profiled extensively and for some use cases it's a huge
> win. Unfortunately we don't have any numbers for public sharing :( but
> you can imagine a use case where some multimedia framework maps a bunch
> of buffers into an SMMU at the beginning of some interactive user
> session and doesn't unmap them until the (human) user decides they are
> done. This could be a long time, all the while these clocks could be
> off, saving power.

Ok, I can see that. I wonder whether we could wrap all of the iommu_ops
functions with the clock enable/disable code, instead of putting it directly
into the drivers?

> > The code you're proposing seems to take the approach of `we're going to
> > access registers so enable the clocks, access the registers then disable the
> > clocks', which is simple but may not be particularly effective.
> 
> Yes, that's a good summary of the approach here. It has been effective
> in saving power for us in the past...

Mike, do you have any experience with this sort of stuff?

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
  2014-09-10 18:27                         ` Will Deacon
@ 2014-09-10 19:09                             ` Mitchel Humpherys
  -1 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-09-10 19:09 UTC (permalink / raw)
  To: Will Deacon
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	mturquette-QSEj5FYQhm4dnm+yROfE0A,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Wed, Sep 10 2014 at 11:27:39 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
> On Wed, Sep 10, 2014 at 02:29:42AM +0100, Mitchel Humpherys wrote:
>> On Tue, Aug 26 2014 at 07:27:58 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
>> > On Tue, Aug 19, 2014 at 08:03:09PM +0100, Olav Haugan wrote:
>> >> Clients of the SMMU driver are required to vote for clocks and power
>> >> when they know they need to use the SMMU. However, the clock and power
>> >> needed to be on for the SMMU to service bus masters aren't necessarily
>> >> the same as the ones needed to read/write registers...See below.
>> >
>> > The case I'm thinking of is where a device masters through the IOMMU, but
>> > doesn't make use of any translations. In this case, its transactions will
>> > bypass the SMMU and I want to ensure that continues to happen, regardless of
>> > the power state of the SMMU.
>> 
>> Then I assume the driver for such a device wouldn't be attaching to (or
>> detaching from) the IOMMU, so we won't be touching it at all either
>> way. Or am I missing something?
>
> As long as its only the register file that gets powered down, then there's
> no issue. However, that's making assumptions about what these clocks are
> controlling. Is there a way for the driver to know which aspects of the
> device are controlled by which clock?

Yes, folks should only be putting "config" clocks here. In our system,
at least, the clocks for configuring the SMMU are different than those
for using it. Maybe I should make a note about what "kinds" of clocks
can be specified here in the bindings (i.e. only those that can be
safely disabled while still allowing translations to occur)?

>
>> >> > What stops theses from racing with each other when there are multiple
>> >> > clocks? I also assume that the clk API ignores calls to clk_enable_prepare
>> >> > for a clk that's already enabled? I couldn't find that code...
>> >> 
>> >> All the clock APIs are reference counted yes. Not sure what you mean by
>> >> racing with each other? When you call to enable a clock the call does
>> >> not return until the clock is already ON (or OFF).
>> >
>> > I was thinking of an interrupt handler racing with normal code, but actually
>> > you balance the clk enable/disable in the interrupt handlers. However, it's
>> > not safe to call these clk functions from irq context anyway, since
>> > clk_prepare may sleep.
>> 
>> Ah yes. You okay with moving to a threaded IRQ?
>
> A threaded IRQ already makes sense for context interrupts (if anybody has a
> platform that can do stalls properly), but it seems a bit weird for the
> global fault handler. Is there no way to fix the race instead?

Are you referring to the scenario where someone might be disabling
clocks at the same time? This isn't a problem since the clocks are
refcounted. I believe the main problem here is calling clk_enable from
atomic context since it might sleep.

For my own edification, why would it be weird to move to a threaded IRQ
here? We're not really doing any important work here (just printing an
informational message) so moving to a threaded IRQ actually seems like
the courteous thing to do...

>
>> >> >> @@ -2061,12 +2164,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>> >> >>  	spin_unlock(&arm_smmu_devices_lock);
>> >> >>  
>> >> >>  	arm_smmu_device_reset(smmu);
>> >> >> +	arm_smmu_disable_clocks(smmu);
>> >> > 
>> >> > I wonder if this is really the right thing to do. Rather than the
>> >> > fine-grained clock enable/disable you have, why don't we just enable in
>> >> > domain_init and disable in domain_destroy, with refcounting for the clocks?
>> >> > 
>> >> 
>> >> So the whole point of all of this is that we try to save power. As Mitch
>> >> wrote in the commit text we want to only leave the clock and power on
>> >> for as short period of time as possible.
>> >
>> > Understood, but if the clocks are going up and down like yo-yos, then it's
>> > not obvious that you end up saving any power at all. Have you tried
>> > measuring the power consumption with different granularities for the
>> > clocks?
>> 
>> This has been profiled extensively and for some use cases it's a huge
>> win. Unfortunately we don't have any numbers for public sharing :( but
>> you can imagine a use case where some multimedia framework maps a bunch
>> of buffers into an SMMU at the beginning of some interactive user
>> session and doesn't unmap them until the (human) user decides they are
>> done. This could be a long time, all the while these clocks could be
>> off, saving power.
>
> Ok, I can see that. I wonder whether we could wrap all of the iommu_ops
> functions with the clock enable/disable code, instead of putting it directly
> into the drivers?

Interesting idea... I'm up for it if the IOMMU core and driver
maintainers are okay with it.

>
>> > The code you're proposing seems to take the approach of `we're going to
>> > access registers so enable the clocks, access the registers then disable the
>> > clocks', which is simple but may not be particularly effective.
>> 
>> Yes, that's a good summary of the approach here. It has been effective
>> in saving power for us in the past...
>
> Mike, do you have any experience with this sort of stuff?
>
> Will



-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
@ 2014-09-10 19:09                             ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-09-10 19:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Sep 10 2014 at 11:27:39 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Wed, Sep 10, 2014 at 02:29:42AM +0100, Mitchel Humpherys wrote:
>> On Tue, Aug 26 2014 at 07:27:58 AM, Will Deacon <will.deacon@arm.com> wrote:
>> > On Tue, Aug 19, 2014 at 08:03:09PM +0100, Olav Haugan wrote:
>> >> Clients of the SMMU driver are required to vote for clocks and power
>> >> when they know they need to use the SMMU. However, the clock and power
>> >> needed to be on for the SMMU to service bus masters aren't necessarily
>> >> the same as the ones needed to read/write registers...See below.
>> >
>> > The case I'm thinking of is where a device masters through the IOMMU, but
>> > doesn't make use of any translations. In this case, its transactions will
>> > bypass the SMMU and I want to ensure that continues to happen, regardless of
>> > the power state of the SMMU.
>> 
>> Then I assume the driver for such a device wouldn't be attaching to (or
>> detaching from) the IOMMU, so we won't be touching it at all either
>> way. Or am I missing something?
>
> As long as its only the register file that gets powered down, then there's
> no issue. However, that's making assumptions about what these clocks are
> controlling. Is there a way for the driver to know which aspects of the
> device are controlled by which clock?

Yes, folks should only be putting "config" clocks here. In our system,
at least, the clocks for configuring the SMMU are different than those
for using it. Maybe I should make a note about what "kinds" of clocks
can be specified here in the bindings (i.e. only those that can be
safely disabled while still allowing translations to occur)?

>
>> >> > What stops theses from racing with each other when there are multiple
>> >> > clocks? I also assume that the clk API ignores calls to clk_enable_prepare
>> >> > for a clk that's already enabled? I couldn't find that code...
>> >> 
>> >> All the clock APIs are reference counted yes. Not sure what you mean by
>> >> racing with each other? When you call to enable a clock the call does
>> >> not return until the clock is already ON (or OFF).
>> >
>> > I was thinking of an interrupt handler racing with normal code, but actually
>> > you balance the clk enable/disable in the interrupt handlers. However, it's
>> > not safe to call these clk functions from irq context anyway, since
>> > clk_prepare may sleep.
>> 
>> Ah yes. You okay with moving to a threaded IRQ?
>
> A threaded IRQ already makes sense for context interrupts (if anybody has a
> platform that can do stalls properly), but it seems a bit weird for the
> global fault handler. Is there no way to fix the race instead?

Are you referring to the scenario where someone might be disabling
clocks at the same time? This isn't a problem since the clocks are
refcounted. I believe the main problem here is calling clk_enable from
atomic context since it might sleep.

For my own edification, why would it be weird to move to a threaded IRQ
here? We're not really doing any important work here (just printing an
informational message) so moving to a threaded IRQ actually seems like
the courteous thing to do...

>
>> >> >> @@ -2061,12 +2164,16 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>> >> >>  	spin_unlock(&arm_smmu_devices_lock);
>> >> >>  
>> >> >>  	arm_smmu_device_reset(smmu);
>> >> >> +	arm_smmu_disable_clocks(smmu);
>> >> > 
>> >> > I wonder if this is really the right thing to do. Rather than the
>> >> > fine-grained clock enable/disable you have, why don't we just enable in
>> >> > domain_init and disable in domain_destroy, with refcounting for the clocks?
>> >> > 
>> >> 
>> >> So the whole point of all of this is that we try to save power. As Mitch
>> >> wrote in the commit text we want to only leave the clock and power on
>> >> for as short period of time as possible.
>> >
>> > Understood, but if the clocks are going up and down like yo-yos, then it's
>> > not obvious that you end up saving any power at all. Have you tried
>> > measuring the power consumption with different granularities for the
>> > clocks?
>> 
>> This has been profiled extensively and for some use cases it's a huge
>> win. Unfortunately we don't have any numbers for public sharing :( but
>> you can imagine a use case where some multimedia framework maps a bunch
>> of buffers into an SMMU at the beginning of some interactive user
>> session and doesn't unmap them until the (human) user decides they are
>> done. This could be a long time, all the while these clocks could be
>> off, saving power.
>
> Ok, I can see that. I wonder whether we could wrap all of the iommu_ops
> functions with the clock enable/disable code, instead of putting it directly
> into the drivers?

Interesting idea... I'm up for it if the IOMMU core and driver
maintainers are okay with it.

>
>> > The code you're proposing seems to take the approach of `we're going to
>> > access registers so enable the clocks, access the registers then disable the
>> > clocks', which is simple but may not be particularly effective.
>> 
>> Yes, that's a good summary of the approach here. It has been effective
>> in saving power for us in the past...
>
> Mike, do you have any experience with this sort of stuff?
>
> Will



-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
  2014-09-10 19:09                             ` Mitchel Humpherys
@ 2014-09-15 18:38                                 ` Mitchel Humpherys
  -1 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-09-15 18:38 UTC (permalink / raw)
  To: Will Deacon
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	mturquette-QSEj5FYQhm4dnm+yROfE0A,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Wed, Sep 10 2014 at 12:09:06 PM, Mitchel Humpherys <mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote:
> On Wed, Sep 10 2014 at 11:27:39 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
>> On Wed, Sep 10, 2014 at 02:29:42AM +0100, Mitchel Humpherys wrote:
>>> On Tue, Aug 26 2014 at 07:27:58 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
>>> > On Tue, Aug 19, 2014 at 08:03:09PM +0100, Olav Haugan wrote:
>>> >> Clients of the SMMU driver are required to vote for clocks and power
>>> >> when they know they need to use the SMMU. However, the clock and power
>>> >> needed to be on for the SMMU to service bus masters aren't necessarily
>>> >> the same as the ones needed to read/write registers...See below.
>>> >
>>> > The case I'm thinking of is where a device masters through the IOMMU, but
>>> > doesn't make use of any translations. In this case, its transactions will
>>> > bypass the SMMU and I want to ensure that continues to happen, regardless of
>>> > the power state of the SMMU.
>>> 
>>> Then I assume the driver for such a device wouldn't be attaching to (or
>>> detaching from) the IOMMU, so we won't be touching it at all either
>>> way. Or am I missing something?
>>
>> As long as its only the register file that gets powered down, then there's
>> no issue. However, that's making assumptions about what these clocks are
>> controlling. Is there a way for the driver to know which aspects of the
>> device are controlled by which clock?
>
> Yes, folks should only be putting "config" clocks here. In our system,
> at least, the clocks for configuring the SMMU are different than those
> for using it. Maybe I should make a note about what "kinds" of clocks
> can be specified here in the bindings (i.e. only those that can be
> safely disabled while still allowing translations to occur)?

Let me amend this statement slightly.  Folks should be putting all
clocks necessary to program SMMU registers here.  On our system, this
actually does include the "core" clocks in addition to the "config"
clocks.  Clients won't vote for "config" clocks since they have no
business programming SMMU registers, so those will get shut down when we
remove our vote for them.  Clients *should* hold their votes for "core"
clocks for as long as they want to use the SMMU.  Also, for the bypass
case, clients should be voting for clocks and power for the SMMU
themselves.

In light of all this I guess there isn't really anything to say in the
DT bindings.


-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
@ 2014-09-15 18:38                                 ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-09-15 18:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Sep 10 2014 at 12:09:06 PM, Mitchel Humpherys <mitchelh@codeaurora.org> wrote:
> On Wed, Sep 10 2014 at 11:27:39 AM, Will Deacon <will.deacon@arm.com> wrote:
>> On Wed, Sep 10, 2014 at 02:29:42AM +0100, Mitchel Humpherys wrote:
>>> On Tue, Aug 26 2014 at 07:27:58 AM, Will Deacon <will.deacon@arm.com> wrote:
>>> > On Tue, Aug 19, 2014 at 08:03:09PM +0100, Olav Haugan wrote:
>>> >> Clients of the SMMU driver are required to vote for clocks and power
>>> >> when they know they need to use the SMMU. However, the clock and power
>>> >> needed to be on for the SMMU to service bus masters aren't necessarily
>>> >> the same as the ones needed to read/write registers...See below.
>>> >
>>> > The case I'm thinking of is where a device masters through the IOMMU, but
>>> > doesn't make use of any translations. In this case, its transactions will
>>> > bypass the SMMU and I want to ensure that continues to happen, regardless of
>>> > the power state of the SMMU.
>>> 
>>> Then I assume the driver for such a device wouldn't be attaching to (or
>>> detaching from) the IOMMU, so we won't be touching it at all either
>>> way. Or am I missing something?
>>
>> As long as its only the register file that gets powered down, then there's
>> no issue. However, that's making assumptions about what these clocks are
>> controlling. Is there a way for the driver to know which aspects of the
>> device are controlled by which clock?
>
> Yes, folks should only be putting "config" clocks here. In our system,
> at least, the clocks for configuring the SMMU are different than those
> for using it. Maybe I should make a note about what "kinds" of clocks
> can be specified here in the bindings (i.e. only those that can be
> safely disabled while still allowing translations to occur)?

Let me amend this statement slightly.  Folks should be putting all
clocks necessary to program SMMU registers here.  On our system, this
actually does include the "core" clocks in addition to the "config"
clocks.  Clients won't vote for "config" clocks since they have no
business programming SMMU registers, so those will get shut down when we
remove our vote for them.  Clients *should* hold their votes for "core"
clocks for as long as they want to use the SMMU.  Also, for the bypass
case, clients should be voting for clocks and power for the SMMU
themselves.

In light of all this I guess there isn't really anything to say in the
DT bindings.


-Mitch

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 5/6] iommu/arm-smmu: support buggy implementations with invalidate-on-map
  2014-08-13  0:51     ` Mitchel Humpherys
@ 2014-11-12 18:26         ` Will Deacon
  -1 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-11-12 18:26 UTC (permalink / raw)
  To: Mitchel Humpherys
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Hi Mitch,

On Wed, Aug 13, 2014 at 01:51:38AM +0100, Mitchel Humpherys wrote:
> Add a workaround for some buggy hardware that requires a TLB invalidate
> operation to occur at map time. Activate the feature with the
> qcom,smmu-invalidate-on-map boolean DT property.

I'm digging up an old thread here, but I've been working on a new page-table
allocator for the SMMU and looked into implementing this workaround for you
in there. When I do the TLBI on map after installing the new PTE, can I just
invalidate the range mapped by that PTE, or does it need to be a full TLBI?

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 5/6] iommu/arm-smmu: support buggy implementations with invalidate-on-map
@ 2014-11-12 18:26         ` Will Deacon
  0 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-11-12 18:26 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Mitch,

On Wed, Aug 13, 2014 at 01:51:38AM +0100, Mitchel Humpherys wrote:
> Add a workaround for some buggy hardware that requires a TLB invalidate
> operation to occur at map time. Activate the feature with the
> qcom,smmu-invalidate-on-map boolean DT property.

I'm digging up an old thread here, but I've been working on a new page-table
allocator for the SMMU and looked into implementing this workaround for you
in there. When I do the TLBI on map after installing the new PTE, can I just
invalidate the range mapped by that PTE, or does it need to be a full TLBI?

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 5/6] iommu/arm-smmu: support buggy implementations with invalidate-on-map
  2014-11-12 18:26         ` Will Deacon
@ 2014-11-12 18:58             ` Mitchel Humpherys
  -1 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-11-12 18:58 UTC (permalink / raw)
  To: Will Deacon
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Wed, Nov 12 2014 at 10:26:43 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
> Hi Mitch,
>
> On Wed, Aug 13, 2014 at 01:51:38AM +0100, Mitchel Humpherys wrote:
>> Add a workaround for some buggy hardware that requires a TLB invalidate
>> operation to occur at map time. Activate the feature with the
>> qcom,smmu-invalidate-on-map boolean DT property.
>
> I'm digging up an old thread here, but I've been working on a new page-table
> allocator for the SMMU and looked into implementing this workaround for you
> in there. When I do the TLBI on map after installing the new PTE, can I just
> invalidate the range mapped by that PTE, or does it need to be a full TLBI?

I'm not totally sure on the history of the hardware errata but I believe
it's just the range mapped by that pte.  We use SMMU_CBn_TLBIVA in the
our smmu driver.

However, let's actually just drop this...  It's looking like the targets
we have that will use the arm-smmu driver thankfully won't need this
workaround.  Thanks for keeping this in mind though :)


-Mitch

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 5/6] iommu/arm-smmu: support buggy implementations with invalidate-on-map
@ 2014-11-12 18:58             ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-11-12 18:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 12 2014 at 10:26:43 AM, Will Deacon <will.deacon@arm.com> wrote:
> Hi Mitch,
>
> On Wed, Aug 13, 2014 at 01:51:38AM +0100, Mitchel Humpherys wrote:
>> Add a workaround for some buggy hardware that requires a TLB invalidate
>> operation to occur at map time. Activate the feature with the
>> qcom,smmu-invalidate-on-map boolean DT property.
>
> I'm digging up an old thread here, but I've been working on a new page-table
> allocator for the SMMU and looked into implementing this workaround for you
> in there. When I do the TLBI on map after installing the new PTE, can I just
> invalidate the range mapped by that PTE, or does it need to be a full TLBI?

I'm not totally sure on the history of the hardware errata but I believe
it's just the range mapped by that pte.  We use SMMU_CBn_TLBIVA in the
our smmu driver.

However, let's actually just drop this...  It's looking like the targets
we have that will use the arm-smmu driver thankfully won't need this
workaround.  Thanks for keeping this in mind though :)


-Mitch

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 5/6] iommu/arm-smmu: support buggy implementations with invalidate-on-map
  2014-11-12 18:58             ` Mitchel Humpherys
@ 2014-11-13  9:48                 ` Will Deacon
  -1 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-11-13  9:48 UTC (permalink / raw)
  To: Mitchel Humpherys
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Wed, Nov 12, 2014 at 06:58:17PM +0000, Mitchel Humpherys wrote:
> On Wed, Nov 12 2014 at 10:26:43 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
> > On Wed, Aug 13, 2014 at 01:51:38AM +0100, Mitchel Humpherys wrote:
> >> Add a workaround for some buggy hardware that requires a TLB invalidate
> >> operation to occur at map time. Activate the feature with the
> >> qcom,smmu-invalidate-on-map boolean DT property.
> >
> > I'm digging up an old thread here, but I've been working on a new page-table
> > allocator for the SMMU and looked into implementing this workaround for you
> > in there. When I do the TLBI on map after installing the new PTE, can I just
> > invalidate the range mapped by that PTE, or does it need to be a full TLBI?
> 
> I'm not totally sure on the history of the hardware errata but I believe
> it's just the range mapped by that pte.  We use SMMU_CBn_TLBIVA in the
> our smmu driver.
> 
> However, let's actually just drop this...  It's looking like the targets
> we have that will use the arm-smmu driver thankfully won't need this
> workaround.  Thanks for keeping this in mind though :)

Ha, damn, then I don't have a user of the shiny new quirks field I added!
I don't think I'll go as far as removing it altogether though...

Cheers,

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 5/6] iommu/arm-smmu: support buggy implementations with invalidate-on-map
@ 2014-11-13  9:48                 ` Will Deacon
  0 siblings, 0 replies; 68+ messages in thread
From: Will Deacon @ 2014-11-13  9:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 12, 2014 at 06:58:17PM +0000, Mitchel Humpherys wrote:
> On Wed, Nov 12 2014 at 10:26:43 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Wed, Aug 13, 2014 at 01:51:38AM +0100, Mitchel Humpherys wrote:
> >> Add a workaround for some buggy hardware that requires a TLB invalidate
> >> operation to occur at map time. Activate the feature with the
> >> qcom,smmu-invalidate-on-map boolean DT property.
> >
> > I'm digging up an old thread here, but I've been working on a new page-table
> > allocator for the SMMU and looked into implementing this workaround for you
> > in there. When I do the TLBI on map after installing the new PTE, can I just
> > invalidate the range mapped by that PTE, or does it need to be a full TLBI?
> 
> I'm not totally sure on the history of the hardware errata but I believe
> it's just the range mapped by that pte.  We use SMMU_CBn_TLBIVA in the
> our smmu driver.
> 
> However, let's actually just drop this...  It's looking like the targets
> we have that will use the arm-smmu driver thankfully won't need this
> workaround.  Thanks for keeping this in mind though :)

Ha, damn, then I don't have a user of the shiny new quirks field I added!
I don't think I'll go as far as removing it altogether though...

Cheers,

Will

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 5/6] iommu/arm-smmu: support buggy implementations with invalidate-on-map
  2014-11-13  9:48                 ` Will Deacon
@ 2014-11-14 23:08                     ` Mitchel Humpherys
  -1 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-11-14 23:08 UTC (permalink / raw)
  To: Will Deacon
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Thu, Nov 13 2014 at 01:48:26 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
> Ha, damn, then I don't have a user of the shiny new quirks field I added!
> I don't think I'll go as far as removing it altogether though...

I'm sure we'll be making liberal use of that field soon enough ;)



-Mitch

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 5/6] iommu/arm-smmu: support buggy implementations with invalidate-on-map
@ 2014-11-14 23:08                     ` Mitchel Humpherys
  0 siblings, 0 replies; 68+ messages in thread
From: Mitchel Humpherys @ 2014-11-14 23:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Nov 13 2014 at 01:48:26 AM, Will Deacon <will.deacon@arm.com> wrote:
> Ha, damn, then I don't have a user of the shiny new quirks field I added!
> I don't think I'll go as far as removing it altogether though...

I'm sure we'll be making liberal use of that field soon enough ;)



-Mitch

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 68+ messages in thread

end of thread, other threads:[~2014-11-14 23:08 UTC | newest]

Thread overview: 68+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-13  0:51 [PATCH 0/6] iommu/arm-smmu: misc features, new DT bindings Mitchel Humpherys
2014-08-13  0:51 ` Mitchel Humpherys
     [not found] ` <1407891099-24641-1-git-send-email-mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2014-08-13  0:51   ` [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks Mitchel Humpherys
2014-08-13  0:51     ` Mitchel Humpherys
     [not found]     ` <1407891099-24641-2-git-send-email-mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2014-08-13 21:07       ` Mitchel Humpherys
2014-08-13 21:07         ` Mitchel Humpherys
2014-08-19 12:58       ` Will Deacon
2014-08-19 12:58         ` Will Deacon
     [not found]         ` <20140819125833.GO23128-5wv7dgnIgG8@public.gmane.org>
2014-08-19 19:03           ` Olav Haugan
2014-08-19 19:03             ` Olav Haugan
     [not found]             ` <53F39F6D.1040205-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2014-08-26 14:27               ` Will Deacon
2014-08-26 14:27                 ` Will Deacon
     [not found]                 ` <20140826142757.GU23445-5wv7dgnIgG8@public.gmane.org>
2014-09-10  1:29                   ` Mitchel Humpherys
2014-09-10  1:29                     ` Mitchel Humpherys
     [not found]                     ` <vnkwa968b6ux.fsf-Yf+dfxj6toJBVvN7MMdr1KRtKmQZhJ7pQQ4Iyu8u01E@public.gmane.org>
2014-09-10 18:27                       ` Will Deacon
2014-09-10 18:27                         ` Will Deacon
     [not found]                         ` <20140910182739.GM1710-5wv7dgnIgG8@public.gmane.org>
2014-09-10 19:09                           ` Mitchel Humpherys
2014-09-10 19:09                             ` Mitchel Humpherys
     [not found]                             ` <vnkwbnqn9tt9.fsf-Yf+dfxj6toJBVvN7MMdr1KRtKmQZhJ7pQQ4Iyu8u01E@public.gmane.org>
2014-09-15 18:38                               ` Mitchel Humpherys
2014-09-15 18:38                                 ` Mitchel Humpherys
2014-08-19 19:28           ` Mitchel Humpherys
2014-08-19 19:28             ` Mitchel Humpherys
2014-08-13  0:51   ` [PATCH 2/6] iommu/arm-smmu: add support for specifying regulators Mitchel Humpherys
2014-08-13  0:51     ` Mitchel Humpherys
     [not found]     ` <1407891099-24641-3-git-send-email-mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2014-08-13 21:17       ` Mitchel Humpherys
2014-08-13 21:17         ` Mitchel Humpherys
2014-08-19 13:00       ` Will Deacon
2014-08-19 13:00         ` Will Deacon
2014-08-13  0:51   ` [PATCH 3/6] iommu/arm-smmu: add support for iova_to_phys through ATS1PR Mitchel Humpherys
2014-08-13  0:51     ` Mitchel Humpherys
     [not found]     ` <1407891099-24641-4-git-send-email-mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2014-08-19 12:44       ` Will Deacon
2014-08-19 12:44         ` Will Deacon
     [not found]         ` <20140819124431.GL23128-5wv7dgnIgG8@public.gmane.org>
2014-08-19 18:12           ` Mitchel Humpherys
2014-08-19 18:12             ` Mitchel Humpherys
     [not found]             ` <vnkwa970qrfq.fsf-Yf+dfxj6toJBVvN7MMdr1KRtKmQZhJ7pQQ4Iyu8u01E@public.gmane.org>
2014-08-26 13:54               ` Will Deacon
2014-08-26 13:54                 ` Will Deacon
     [not found]                 ` <20140826135451.GQ23445-5wv7dgnIgG8@public.gmane.org>
2014-09-01 16:15                   ` Will Deacon
2014-09-01 16:15                     ` Will Deacon
2014-08-13  0:51   ` [PATCH 4/6] iommu/arm-smmu: implement generic DT bindings Mitchel Humpherys
2014-08-13  0:51     ` Mitchel Humpherys
     [not found]     ` <1407891099-24641-5-git-send-email-mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2014-08-13 16:53       ` Mitchel Humpherys
2014-08-13 16:53         ` Mitchel Humpherys
2014-08-19 12:54       ` Will Deacon
2014-08-19 12:54         ` Will Deacon
     [not found]         ` <20140819125449.GN23128-5wv7dgnIgG8@public.gmane.org>
2014-08-19 15:54           ` Hiroshi Doyu
2014-08-19 15:54             ` Hiroshi Doyu
2014-08-20  3:18           ` Arnd Bergmann
2014-08-20  3:18             ` Arnd Bergmann
2014-08-13  0:51   ` [PATCH 5/6] iommu/arm-smmu: support buggy implementations with invalidate-on-map Mitchel Humpherys
2014-08-13  0:51     ` Mitchel Humpherys
     [not found]     ` <1407891099-24641-6-git-send-email-mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2014-11-12 18:26       ` Will Deacon
2014-11-12 18:26         ` Will Deacon
     [not found]         ` <20141112182642.GH26437-5wv7dgnIgG8@public.gmane.org>
2014-11-12 18:58           ` Mitchel Humpherys
2014-11-12 18:58             ` Mitchel Humpherys
     [not found]             ` <vnkwy4rg5jqu.fsf-Yf+dfxj6toJBVvN7MMdr1KRtKmQZhJ7pQQ4Iyu8u01E@public.gmane.org>
2014-11-13  9:48               ` Will Deacon
2014-11-13  9:48                 ` Will Deacon
     [not found]                 ` <20141113094826.GA13350-5wv7dgnIgG8@public.gmane.org>
2014-11-14 23:08                   ` Mitchel Humpherys
2014-11-14 23:08                     ` Mitchel Humpherys
2014-08-13  0:51   ` [PATCH 6/6] iommu/arm-smmu: add .domain_{set, get}_attr for coherent walk control Mitchel Humpherys
2014-08-13  0:51     ` Mitchel Humpherys
     [not found]     ` <1407891099-24641-7-git-send-email-mitchelh-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2014-08-19 12:48       ` [PATCH 6/6] iommu/arm-smmu: add .domain_{set,get}_attr " Will Deacon
2014-08-19 12:48         ` Will Deacon
     [not found]         ` <20140819124807.GM23128-5wv7dgnIgG8@public.gmane.org>
2014-08-19 19:19           ` [PATCH 6/6] iommu/arm-smmu: add .domain_{set, get}_attr " Mitchel Humpherys
2014-08-19 19:19             ` Mitchel Humpherys
2014-08-13 17:22   ` [PATCH 0/6] iommu/arm-smmu: misc features, new DT bindings Mitchel Humpherys
2014-08-13 17:22     ` Mitchel Humpherys
     [not found]     ` <vnkwvbpwl2xz.fsf-Yf+dfxj6toJBVvN7MMdr1KRtKmQZhJ7pQQ4Iyu8u01E@public.gmane.org>
2014-08-15 17:25       ` Will Deacon
2014-08-15 17:25         ` Will Deacon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.